February 19, 2003 14:25 WSPC/148-RMP
00158
Reviews in Mathematical Physics, Vol. 15, No. 1 (2003) 1–78 © World Scientific Publishing Company
SELF ORGANIZATION IN THE LOW TEMPERATURE REGION OF A SPIN GLASS MODEL
MICHEL TALAGRAND
Université Paris VI, Equipe d’Analyse, Institut Mathématique, UMR n° 1074, Boite 186, 4 Place Jussieu, 75230 Paris Cedex 05
Received 31 October 2002
Revised 25 November 2002
We obtain an almost complete description of the structure of the p-spin interaction model down to temperatures that decrease exponentially with p. We prove in particular the spontaneous creation of pure states, and we describe the distribution of their weights. This confirms the picture of “one step of symmetry breaking” predicted by the physicists. Similar results are obtained when a small external field is added, provided one is willing to add a lower order “generic” perturbation to the Hamiltonian.
Keywords: p-spin interaction model; replica-symmetry breaking; Poisson-Dirichlet distribution; pure states; cavity method.
Contents
1. Introduction
2. A Priori Estimates
3. Construction of the Lumps
4. Pure States
5. Orthogonality in the Absence of External Field
6. The Ghirlanda–Guerra Relations and the Poisson–Dirichlet Distribution
7. Conditioning and the Relative Weights
8. Conditioning and the Cavity Method
9. The Perturbed Hamiltonian and the Extended Ghirlanda–Guerra Identities
10. The Model with External Field
References
1. Introduction
The study of the supremum of a family of random variables (r.v.) is obviously a topic of considerable importance. A collection of r.v. is also called a stochastic process. A main use of these is to model phenomena that evolve with time, and a stochastic process is then a collection $(X_t)_{t\in\mathbb{R}}$ of r.v. The use of an index set with such precise features as $\mathbb{R}$ (in particular an order) motivates the consideration of dependent structures where, typically, the correlation
of $X_s$ and $X_t$ decreases as $|s-t|$ increases. A large part of probability theory consists in the study of such situations. In a somewhat different direction, one can consider a stochastic process $(X_t)_{t\in T}$ where $T$ is now an “abstract set”. This point of view is extremely useful in the theory of Gaussian processes, and more generally in probabilistic arguments in analysis (see e.g. [1]). Concerning Gaussian processes, it can be said that for such a process $X_t$ the order of magnitude of $\sup_{t\in T} X_t$ is understood “within a constant multiplicative factor” (through the theory of majorizing measures, see [2]). Due to the variety of possible situations, it seems difficult to obtain a better description in a general setting. In a different but connected order of ideas, when the r.v. $(X_t)_{t\in T}$ are independent, there is a very satisfactory theory of the “extreme values” taken by this family. Theoretical physicists discovered in the 80s a new direction of investigations (although probably they did not quite formulate it in the present terms) [3]. They discovered that very natural, and apparently simple processes display a very rich behavior of their “extreme values”. The present paper is devoted to the study of such a situation. Given an integer $p$, we will consider a family $(H_N(\sigma))_{\sigma\in\Sigma_N}$ of Gaussian r.v., where
\[ \Sigma_N = \{-1, 1\}^N \tag{1.1} \]
such that
\[ E\,H_N(\sigma^1)H_N(\sigma^2) \simeq \frac{N}{2}\Big(\frac{1}{N}\sum_{i\le N}\sigma_i^1\sigma_i^2\Big)^p , \tag{1.2} \]
where of course $\sigma^\ell = (\sigma_i^\ell)_{i\le N}$ and where $\simeq$ means equality within terms of order 1. (See the exact formula (1.4) below.) For N large we want to understand, for a given (but typical) realization of these variables, what are the large values among this realization. The somewhat canonical character of this situation should be apparent. The richness and the depth of the situation are largely due to the choice of the index set $\Sigma_N$. The natural distance on $\Sigma_N$ is the Hamming distance given by
\[ d(\sigma^1,\sigma^2) = \frac{1}{N}\,\mathrm{card}\{i\le N : \sigma_i^1 \ne \sigma_i^2\} \]
and we observe that
\[ \frac{1}{N}\sum_{i\le N}\sigma_i^1\sigma_i^2 = 1 - 2d(\sigma^1,\sigma^2) , \]
so that (1.2) clearly relates the structure of the process $(H_N(\sigma))_{\sigma\in\Sigma_N}$ to the metric structure of $(\Sigma_N, d)$. The “high dimensional” character of the correlation (1.2) sharply contrasts with the “one dimensional” situation of many processes $(X_t)_{t\in\mathbb{R}}$. Condition (1.2) occurs with p = 2 in the famous Sherrington–Kirkpatrick (SK) model [4]. In this model, the energy $H_N(\sigma)$ of a configuration $\sigma\in\Sigma_N$ is given by
\[ -H_N(\sigma) = \frac{1}{\sqrt{N}}\sum_{1\le i<j\le N} g_{ij}\sigma_i\sigma_j , \tag{1.3} \]
where $(g_{ij})$ are independent standard normal r.v. (the minus sign follows the convention of physics). It was completely unexpected that the study of this model (even at the non-rigorous level) should prove very difficult. The predictions made by G. Parisi, the so-called “Parisi solution”, are stunningly beautiful, and indicate that the simple, canonical formula (1.3) creates structures of great intricacy. While investigating the relevance of Parisi’s ideas to other situations, it was discovered [5, 6] that a simpler version of Parisi’s structure should occur if in (1.2), one replaces the “2-spin” interaction by a “p-spin” interaction, i.e. one considers
\[ -H_N(\sigma) = \Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_p} g_{i_1\cdots i_p}\,\sigma_{i_1}\cdots\sigma_{i_p} , \tag{1.4} \]
where the summation is now over all choices of indices $1\le i_1<\cdots<i_p\le N$, and where $g_{i_1\cdots i_p}$ are independent standard normal r.v. (This model was invented earlier by B. Derrida [7].) For p large enough, the existence of these structures will be completely proved in the present paper. The work of [3] is not rigorous (and has no claim to be so). Serious efforts to prove some of the claims there have started rather late, if one keeps in mind how remarkable these claims are. Even the proof of the “high temperature” results (in a sense to be explained below) has turned out to be very much more strenuous than could have been anticipated (see [8, 9] and the references therein). The fact however that genuine “low temperature” results (supposedly more difficult) can now be proved, as will be done here in the situation of (1.2), shows that steps have been taken in the right direction, and one can hope that the topic in general will attract the efforts we believe it deserves. We turn towards the description of our results. Consider Gibbs’ measure on $\Sigma_N$, given by
\[ G_N(\{\sigma\}) = \frac{\exp(-\beta H_N(\sigma))}{Z_N} , \tag{1.5} \]
where $Z_N$ is the normalization factor that ensures that $G_N$ is a probability measure. This is a random probability on $\Sigma_N$, and one would like to understand its typical structure. For convenience, the randomness occurring in (1.4) through the variables $g_{i_1\cdots i_p}$ will be called the disorder. A basic concept is that of replicas. These are simply powers $(\Sigma_N^k, G_N^{\otimes k})$ of the probability space $(\Sigma_N, G_N)$. (Observe that each copy has the same disorder.) Given two configurations (that is, points of $\Sigma_N$) $\sigma^1, \sigma^2$, we write
\[ R_{12} = R(\sigma^1,\sigma^2) = \frac{1}{N}\sum_{i\le N}\sigma_i^1\sigma_i^2 . \tag{1.6} \]
This is called the overlap of $\sigma^1, \sigma^2$. The fruitful idea (although it is mysterious at first sight) is to consider $R_{12} = R(\sigma^1,\sigma^2)$ as a function on $(\Sigma_N^2, G_N^{\otimes 2})$. The “high temperature behavior” then means that this function essentially takes only
one non-random value. More formally, for some number q depending upon β only, we have
\[ \lim_{N\to\infty} E\langle (R_{12}-q)^2\rangle = 0 . \tag{1.7} \]
There, E means expectation with respect to the disorder; the brackets $\langle\cdot\rangle$ mean that each configuration $\sigma^1, \sigma^2$ is averaged with respect to Gibbs’ measure, i.e.
\[ \langle (R_{12}-q)^2\rangle = \iint (R(\sigma^1,\sigma^2)-q)^2\, dG_N(\sigma^1)\, dG_N(\sigma^2) . \]
(This average will often be called thermal average.) It is not difficult to show (using the symmetry between sites) that (1.7) is equivalent to the following two statements:
(i) The spin correlations vanish on average, i.e.
\[ \lim_{N\to\infty} E\big(\langle\sigma_1\sigma_2\rangle - \langle\sigma_1\rangle\langle\sigma_2\rangle\big)^2 = 0 . \tag{1.8} \]
(ii) The quantities $\langle\sigma_i\rangle^2$ “average out”:
\[ \lim_{N\to\infty} E\Big(\frac{1}{N}\sum_{i\le N}\langle\sigma_i\rangle^2 - q\Big)^2 = 0 . \tag{1.9} \]
The main result of this paper is that (in the situation (1.2)), in a suitable range of temperature, the function $(\sigma^1,\sigma^2)\mapsto R_{12} = R(\sigma^1,\sigma^2)$, rather than taking essentially only one non-random value, takes essentially two non-random values. (In the terminology of [3], this situation is called “one step of replica-symmetry breaking”.)
Theorem 1.1 (Informal version). There is a number L such that if $p\ge L$, p is odd, then (except possibly for a few exceptional values) if $\beta\le 2^{p/L}$, the function $R_{12}$ takes (for N large) essentially only two non-random values, 0 and $q_N(\beta)$.
We also know how to treat the case where p is even. This will be discussed later. Simple arguments show that if $\beta > 2\sqrt{\log 2}$, then both 0 and $q_N(\beta)$ are essentially obtained as values of $R_{12}$. The situation underlying Theorem 1.1 can be described more precisely. If $\beta > 2\sqrt{\log 2}$ there exists a (random) partition $(C_\alpha)_{\alpha\ge 1}$ of $\Sigma_N$ such that if two configurations belong to the same set $C_\alpha$, their overlap is (typically) about $q_N(\beta)$, while if they belong to two different sets $C_\alpha$, $C_\gamma$, their overlap is about zero. The sequence of weights $w_\alpha = G_N(C_\alpha)$ is a random sequence with a precisely understood distribution (namely, a Poisson–Dirichlet distribution, as will be explained later). The Gibbs measure thus breaks into an asymptotically infinite sequence of non-trivial pieces. These pieces are as far apart as they can be, and contain no further structure. Their existence is certainly not obvious from (1.4) (and the way they depend upon the randomness remains a mystery). There is a kind of “self-organization”, and one of the most remarkable predictions of [3] is verified.
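The objects just described can be explored numerically on a very small system. The following Python sketch is an illustration only and plays no role in the proofs: it draws one realization of the disorder in (1.4), computes Gibbs' measure (1.5) by brute-force enumeration of $\Sigma_N$, and tabulates the law of the overlap (1.6) under two replicas. The function names, the seed and the system sizes are assumptions made for this example, and at such small N one should not expect to observe the asymptotic two-valued behavior of Theorem 1.1.

```python
import itertools
import math
import random


def sample_couplings(N, p, rng):
    """Draw independent standard normal g_{i1...ip} for all i1 < ... < ip."""
    return {idx: rng.gauss(0.0, 1.0)
            for idx in itertools.combinations(range(N), p)}


def minus_hamiltonian(sigma, couplings, N, p):
    """Evaluate -H_N(sigma) as in (1.4)."""
    scale = math.sqrt(math.factorial(p) / (2 * N ** (p - 1)))
    total = 0.0
    for idx, g in couplings.items():
        total += g * math.prod(sigma[i] for i in idx)
    return scale * total


def gibbs_measure(N, p, beta, seed=0):
    """Return all configurations and their Gibbs weights (1.5)."""
    rng = random.Random(seed)
    couplings = sample_couplings(N, p, rng)
    configs = list(itertools.product((-1, 1), repeat=N))
    exponents = [beta * minus_hamiltonian(s, couplings, N, p) for s in configs]
    shift = max(exponents)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(e - shift) for e in exponents]
    Z = sum(unnormalized)
    return configs, [w / Z for w in unnormalized]


def overlap_law(configs, weights):
    """Law of R_12 under two replicas: maps overlap value -> probability."""
    N = len(configs[0])
    law = {}
    for s1, w1 in zip(configs, weights):
        for s2, w2 in zip(configs, weights):
            r = sum(a * b for a, b in zip(s1, s2)) / N
            law[r] = law.get(r, 0.0) + w1 * w2
    return law
```

For instance, `overlap_law(*gibbs_measure(8, 3, 2.0))` returns a dictionary whose keys are the possible overlaps k/8 and whose values sum to 1.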
What does this tell us about the original question, the large values of a given realization of the r.v. $-H_N(\sigma)$? It turns out that $\max(-H_N(\sigma))$ is of order N, and that such is also the case of $E\langle H_N(\sigma)\rangle$. By general principles (that will be detailed in the proof of the Ghirlanda–Guerra identities) it is typically true that
\[ E\Big\langle\Big(\frac{1}{N}H_N - \frac{1}{N}E\langle H_N\rangle\Big)^2\Big\rangle = o(1) , \tag{1.10} \]
so that, in some sense, Gibbs’ measure looks only at the configurations σ for which
\[ H_N(\sigma) = E\langle H_N\rangle + o(N) . \tag{1.11} \]
A consequence of the structure previously described is that (in a certain sense) for a suitable range of values of β, the configurations satisfying (1.11) do not appear “everywhere” in the configuration space but only in the small, far apart clusters $C_\alpha$. We now turn toward a precise formulation of Theorem 1.1. Throughout the paper, we set
\[ T_N(\beta) = E\langle R_{12}^p\rangle . \tag{1.12} \]
We will prove that if p is large enough, for $1\le\beta\le 2^p$, the system of equations
\[ m = 1 - \frac{T_N(\beta)}{q^p} , \tag{1.13} \]
\[ q = \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X} , \tag{1.14} \]
where
\[ X = \beta\sqrt{\frac{p\,q^{p-1}}{2}}\; g , \tag{1.15} \]
and where g is N(0, 1) (i.e. standard normal) has a unique solution $m_N(\beta)$, $q_N(\beta)$.
Theorem 1.2 (Formal version). There exists a number L such that if $p\ge L$, p odd, for each ε > 0 we have
\[ \lim_{N\to\infty}\int_1^{2^{p/L}} E\big(G_{\beta,N}^{\otimes 2}(D_\varepsilon)\big)\,d\beta = 0 , \tag{1.16} \]
where
\[ D_\varepsilon = \{(\sigma^1,\sigma^2);\ |R_{12}|\ge\varepsilon,\ |R_{12}-q_N(\beta)|\ge\varepsilon\} . \tag{1.17} \]
In (1.16), the notation $G_{\beta,N}$ stresses the fact that Gibbs’ measure depends on the parameter β. The reason why in (1.16) the integral is over $[1, 2^{p/L}]$ is that $q_N(\beta)$ is not defined for β small; but the case $\beta\le 1$ is not interesting, because then
\[ \lim_{N\to\infty} E\langle |R_{12}|\rangle = 0 . \tag{1.18} \]
This is shown in [10] and will be shown again here. (On the other hand (1.18) does not hold for $\beta > 2\sqrt{\log 2}$.)
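To get a feeling for equation (1.14), one can evaluate its right-hand side numerically. The sketch below is a hedged illustration: in the paper m is tied to q through (1.13) and the quantity $T_N(\beta)$, whereas here m is simply treated as a free input, which is an assumption of the example. The Gaussian expectation is approximated by a trapezoidal rule on a truncated interval, and the names `gaussian_average`, `rhs_114` and `iterate_q` are hypothetical helpers.

```python
import math


def gaussian_average(f, half_width=8.0, steps=2000):
    """Approximate E f(g) for g standard normal by the trapezoidal rule."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        g = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(g) * math.exp(-g * g / 2)
    return total * h / math.sqrt(2 * math.pi)


def rhs_114(q, m, beta, p):
    """Right-hand side of (1.14), with X = beta * sqrt(p q^(p-1) / 2) * g."""
    c = beta * math.sqrt(p * q ** (p - 1) / 2)
    numerator = gaussian_average(
        lambda g: math.tanh(c * g) ** 2 * math.cosh(c * g) ** m)
    denominator = gaussian_average(lambda g: math.cosh(c * g) ** m)
    return numerator / denominator


def iterate_q(q0, m, beta, p, iterations=20):
    """Naive fixed-point iteration q <- rhs_114(q); convergence is not guaranteed."""
    q = q0
    for _ in range(iterations):
        q = rhs_114(q, m, beta, p)
    return q
```

Note that q = 0 always satisfies (1.14) for p ≥ 2 (then X = 0), so the iteration should be started away from 0 if one looks for the non-trivial value.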
We will not only consider (1.3), but also the more general case
\[ -H_N(\sigma) = \Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_p} g_{i_1\cdots i_p}\,\sigma_{i_1}\cdots\sigma_{i_p} + h\sum_{i\le N}\sigma_i . \tag{1.19} \]
The last term is the influence of an “external field” that favors the + spins over the − spins. The reason for considering this extra term is that there are special symmetries in the case h = 0. These symmetries make certain specific arguments possible. Conceivably, these arguments do not reflect an understanding of the “general” case. In fact, we can treat the case $h\ne 0$ only by adding an (asymptotically infinitesimal) suitable term to the Hamiltonian (1.19). This will be discussed in detail in Sec. 10.
We now sketch the proof of Theorem 1.1 (and its extension to the case $h\ne 0$). At the same time we describe the organization of the paper. The various sections correspond to the main steps of the proof. The greatest difficulty in the proof of Theorem 1.1 is that the various properties one has to prove are closely connected, and it is rather delicate to find where to start. The most striking feature is probably the self-organization of Gibbs’ measure into the sets $(C_\alpha)$. At the beginning of the proof is the observation (first made in [10]) that candidates for these sets can be constructed if one knows that the overlaps rarely belong to certain intervals. In Sec. 2, we prove that the overlaps are either close to 1 or in the interval [−1/2, 1/2]. The methods are an extension of those of [10] to the case $h\ne 0$, combined with a simple observation (unfortunately overlooked in [10]) that allows us to get estimates in a much larger region of parameters. The proof of the main result of Sec. 2 (Theorem 2.1) is based on the simple idea (going back to B. Derrida) that for large p, “the model should be close to Derrida’s random energy model”. Even after one has found how to give a precise meaning to this statement, its proof requires a number of unrewarding cautious elementary estimates. These tedious considerations are not related to the other (hopefully more appealing) ideas of the paper. Since Sec. 2 is the first cornerstone of the paper, it must be presented first; but understanding the details of the arguments is not required, or even useful, to penetrate the rest of the paper.
In Sec. 3 we use the estimates of Sec. 2 to construct “candidates” for the sets $C_\alpha$. We construct a partition of $\Sigma_N$ in sets $C_\alpha$ that are small (so that $R_{12}$ is close to 1 if $\sigma^1,\sigma^2\in C_\alpha$), and that have the property that if $R_{12}$ is close to one, then this (essentially) must be because $\sigma^1,\sigma^2$ belong to the same set $C_\alpha$. At that stage we know little about the sets $C_\alpha$. In particular we know little about the random sequence $w_\alpha = G_N(C_\alpha)$. The next stage of the proof is to show that the sets $C_\alpha$, or at least those of these sets for which $w_\alpha$ is not too small, are “pure states”. This means that the overlap of two generic configurations in $C_\alpha$ is essentially independent of these configurations (but it might depend upon α). The natural approach to such a result is to introduce a quantity $U_N$ that quantifies this phenomenon and to try to use induction upon N (the so-called cavity method) to prove that $\lim_{N\to\infty} U_N = 0$. The difficulty (which
is the main difficulty in studying the present model), is that it is very hard to use the cavity method unless one knows something about the weights $w_\alpha$ (which is not yet the case at this stage of the proof). Considerable efforts were expended in [10] to attempt the proof by cavity that $\lim_{N\to\infty} U_N = 0$, but succeeded only under the (unproven at the time) assumption that the sequence $(w_\alpha)$ did not behave in a rather pathological way. In Sec. 4, we present a proof that $\lim_{N\to\infty} U_N = 0$. This proof is based on the ideas of the cavity method, but requires no knowledge of the distribution of the weights $w_\alpha$. This proof exemplifies the beauty (and the frustration) of the cavity method. Certain special combinations of ingredients work almost effortlessly, but these special combinations might take a very long time to discover.
Once we have proved that the sets $C_\alpha$ are pure states, we are in a much better position, because we completely understand the internal structure of these sets, and we only have to understand how they relate to each other. Knowing that the sets $C_\alpha$ are pure states implies that if $\sigma^1,\sigma^2$ are two generic configurations, if they belong to the same set $C_\alpha$, then $R_{12}$ is (generically) about $q_\alpha$ (a certain number depending upon α) while if $\sigma^1\in C_\alpha$, $\sigma^2\in C_\gamma$, $\alpha\ne\gamma$, then $R_{12}$ is about $q_{\alpha\gamma}$, a certain number depending upon α, γ. One would like to prove that all the numbers $q_\alpha$ are about equal to a non-random quantity q, while all the $q_{\alpha\gamma}$ are equal to a non-random quantity $q_0$. In Sec. 5, we perform “half” of this work in the case h = 0. We prove that $q_{\alpha\gamma} = 0$ if $\alpha\ne\gamma$, or, equivalently, that if $|R_{12}|\le 1/2$, then $R_{12}\simeq 0$. More formally, we prove that
\[ \lim_{N\to\infty} E\big\langle R_{12}^2\,1_{\{|R_{12}|\le 1/2\}}\big\rangle = 0 . \]
The proof makes essential use of special symmetries that exist only when h = 0. To go beyond these results, it seems necessary to better understand the weight distribution $(w_\alpha)$. This is the purpose of Sec. 6. We explain what the Poisson–Dirichlet distribution is and how it arises from Derrida’s Random Energy Model. We give a proof of a suitable version of the Ghirlanda–Guerra identities, and we explain their deep connections with the Poisson–Dirichlet distribution. In particular we detail the central observation that made this paper possible. If we knew beforehand that for a certain non-random number q, we have $q_\alpha = q$ for each α, the distribution of the sequence of weights would be entirely determined. After explaining this, we establish the basic formulas that, when combined with the method of Sec. 5, allow us (once we know that $q_\alpha = q$) to make precise computations using the cavity method, and in particular, to establish (1.14). The observations of Sec. 6 show also that, if we knew that $|q_\alpha - q|\le\varepsilon$ for each α, then q would approximately satisfy (1.14) (where “approximately” gets better as $\varepsilon\to 0$). Given q, ε, in Sec. 7 we undertake the technical task of giving a meaning to the expression “the part of the system consisting of the sets $C_\alpha$ for which $|q_\alpha - q|\le\varepsilon$”. This is a kind of conditioning argument. This argument involves a number of unpleasant technicalities. The central fact is however that (due to nice combinatorics) a suitable version of the Ghirlanda–Guerra identities allows us to show
that, in many respects, this “part of the system” we consider behaves like the entire system. The gain is that we now know, by construction, that $|q_\alpha - q|\le\varepsilon$. In Sec. 8 we pursue the idea, and develop the cavity method for the “part of the system” as defined in the previous section. When the “part of the system” does not vanish as $N\to\infty$, we show that q nearly satisfies (1.14). Turning the argument around, the solution of (1.14) was the only possible value for the numbers $q_\alpha$.
In Sec. 9 we go back to the study of the case $h\ne 0$. We do not know how to complete this study with the Hamiltonian (1.19). Following the work of Ghirlanda and Guerra we introduce a suitable perturbation term to the Hamiltonian, and we show how this term allows us to prove a very general version of the Ghirlanda–Guerra identities. This perturbation term is “asymptotically infinitesimal” compared to the Hamiltonian (1.19), and although studying the “perturbed Hamiltonian” is definitely a different problem from studying (1.19), these problems are close cousins. We conclude Sec. 9 by proving that this perturbation term forces the distribution of the weights $(w_\alpha)$ to be a certain Poisson–Dirichlet distribution. Having this knowledge a priori removes a large part of the difficulty of the model. In Sec. 10, we show how to adapt the cavity arguments of Secs. 4 and 5 to obtain a suitable extension of Theorem 1.1 when $h\ne 0$.
2. A Priori Estimates
The purpose of this section is to prove Theorem 2.1. In this theorem, the Hamiltonian is given by (1.19). We make the convention, valid throughout the paper, that L denotes a universal constant (that may vary between occurrences), while K denotes a quantity that may depend upon the various parameters, such as p, β, h but that does not depend upon N. The value of K will often vary between occurrences.
Theorem 2.1. There exists a number L with the following property. Assume that $p\ge L$, $h\le 1/L$, $\beta\le 2^{p/L}$. Then we can find a number $0\le q(\beta,h)\le 1/4$ such that
\[ E\,G_N^{\otimes 2}(R_{12}\in U) \le K\exp\Big(-\frac{N}{K}\Big) , \tag{2.1} \]
where
\[ U = \{x\in\mathbb{R};\ |x - q(\beta,h)|\ge 2^{-p/L},\ |1-x|\ge 2^{-p/L}\} , \tag{2.2} \]
when p is odd, while
\[ U = \{x\in\mathbb{R};\ |x - q(\beta,h)|\ge 2^{-p/L},\ \big||x|-1\big|\ge 2^{-p/L}\} \tag{2.3} \]
if p is even.
The meaning of (2.1) is that (before the serious work starts in Sec. 3) we have established the “a priori” information that the overlaps hardly ever belong to certain intervals. The reason for the different formulation when p is even should be obvious: when h = 0, Gibbs’ measure is then invariant by global symmetry around
zero. The case “p even” creates some irritating complications and the reader should assume p odd at first reading. Writing Gibbs’ measure of a set as the ratio of two random quantities, we will estimate the numerator from above, and the denominator from below. Such a method hardly ever gives exact results, but is appropriate here. The denominator will be the square of the quantity
\[ Z_N = Z_N(\beta,h) = \sum_{\sigma}\exp(-\beta H_N(\sigma)) , \tag{2.4} \]
the normalizing factor in (1.5). Given t, we consider
\[ S(t) = \Big\{\sigma\in\Sigma_N;\ \sum_{i\le N}\sigma_i = tN\Big\} . \tag{2.5} \]
Of course S(t) is empty unless tN is an integer between −N and N. We define
\[ Z_N(\beta,h,t) = \sum_{\sigma\in S(t)}\exp(-\beta H_N(\sigma)) . \tag{2.6} \]
The idea is that for $\sigma\in S(t)$, the contribution of the term $\beta h\sum_{i\le N}\sigma_i$ is easy to evaluate (since it is $N\beta ht$) while (in some sense) we will have $Z_N(\beta,h)\simeq\max_t Z_N(\beta,h,t)$. One key reason for the success of this approach is that, even though we do not know how to calculate $Z_N(\beta,h,t)$ for large p, we can approximate it very well in a large region. Let us consider
\[ \psi(t) = \log 2 + \frac{1}{2}\big(L(1+t) + L(1-t)\big) \tag{2.7} \]
where, for a > 0,
\[ L(a) = -a\log a . \tag{2.8} \]
Consider the function $\xi(\beta,h,t)$ given by
\[ \xi(\beta,h,t) = \begin{cases} \psi(t) + \dfrac{\beta^2}{4} + \beta ht & \text{if } \beta\le 2\sqrt{\psi(t)} \\[2mm] \beta\sqrt{\psi(t)} + \beta ht & \text{if } \beta\ge 2\sqrt{\psi(t)} . \end{cases} \tag{2.9} \]
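The definitions (2.7) to (2.9) are easy to transcribe, which is convenient for checking small facts such as the agreement of the two branches of (2.9) at $\beta = 2\sqrt{\psi(t)}$. The following Python sketch is only a transcription of these formulas; the names `ell`, `psi` and `xi` are choices made for the example, and the convention L(0) = 0 (the limit of $-a\log a$) is an assumption added so that t = ±1 can be evaluated.

```python
import math


def ell(a):
    """L(a) = -a log a from (2.8), extended by L(0) = 0."""
    return 0.0 if a == 0 else -a * math.log(a)


def psi(t):
    """psi(t) from (2.7)."""
    return math.log(2) + 0.5 * (ell(1 + t) + ell(1 - t))


def xi(beta, h, t):
    """xi(beta, h, t) from (2.9)."""
    s = psi(t)
    if beta <= 2 * math.sqrt(s):
        return s + beta ** 2 / 4 + beta * h * t
    return beta * math.sqrt(s) + beta * h * t
```

At $\beta = 2\sqrt{\psi(t)}$ both branches equal $2\psi(t) + \beta ht$, so ξ is continuous in β.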
An important step in the proof of Theorem 2.1 is the following lower bound for $Z_N(\beta,h,t)$.
Theorem 2.2. There exists a number L such that, if $p\ge L$, $h\le 1/L$, $\beta\le 2^{p/L}$, $|t|\le 1/4$, then
\[ P\big(Z_N(\beta,h,t)\le\exp N(\xi(\beta,h,t) - 2^{-p/L})\big) \le K\exp\Big(-\frac{N}{K}\Big) . \tag{2.10} \]
This is an accurate result, because, as we will also show, $Z_N(\beta,h,t)$ is hardly ever larger than $\exp N(\xi(\beta,h,t)+\varepsilon)$ if ε > 0. The simple observation (overlooked in [10]) is that such large values of β as in Theorem 2.2 can be reached by using
that the function $\beta\mapsto\xi(\beta,h,t)$ is linear for $\beta\ge 2\sqrt{\psi(t)}$ and by proving (2.10) for $\beta\le 2\sqrt{\psi(t)}$. This is explained in Proposition 2.7 below. The proof of Theorem 2.2 is not very complicated; but the “upper bound” argument needed in Theorem 2.1 will require more struggling. We now collect simple facts. The reason for the occurrence of the function ψ is the following well known estimate.
Lemma 2.3. We have
\[ \frac{1}{L\sqrt{N}}\exp N\psi(t) \le \mathrm{card}\,S(t) \le \exp N\psi(t) . \tag{2.11} \]
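Since card S(t) is the binomial coefficient counting the configurations with N(1+t)/2 spins equal to +1, the estimate (2.11) can be checked directly for moderate N. The Python sketch below does this; the value L = 2 used for the lower bound is an assumption chosen for the illustration (the lemma only asserts that some universal constant works), and the helper names are hypothetical.

```python
import math


def ell(a):
    """L(a) = -a log a, extended by L(0) = 0."""
    return 0.0 if a == 0 else -a * math.log(a)


def psi(t):
    """psi(t) from (2.7)."""
    return math.log(2) + 0.5 * (ell(1 + t) + ell(1 - t))


def log_card_S(N, k):
    """log card S(t) for t = (2k - N)/N, i.e. exactly k spins equal to +1."""
    return math.log(math.comb(N, k))


def check_lemma_2_3(N, L=2.0, tol=1e-9):
    """Check both inequalities of (2.11), in logarithmic form, for every admissible t."""
    for k in range(N + 1):
        t = (2 * k - N) / N
        exact = log_card_S(N, k)
        upper = N * psi(t)
        lower = upper - math.log(L * math.sqrt(N))
        if not (lower - tol <= exact <= upper + tol):
            return False
    return True
```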
It is of course understood here and everywhere that we consider only values of t for which S(t) is not empty. To distinguish between the Hamiltonians (1.4) and (1.19), we will denote by $-H_{N,0}(\sigma)$ the quantity (1.4), so that (1.19) reads
\[ -H_N(\sigma) = -H_{N,0}(\sigma) + h\sum_{i\le N}\sigma_i . \tag{2.12} \]
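Lemma 2.4. We have
\[ \forall\,\sigma,\quad \frac{N}{2} - K \le E\,H_{N,0}^2(\sigma) \le \frac{N}{2} , \tag{2.13} \]
\[ \forall\,\sigma^1,\sigma^2,\quad \big|2E\big(H_{N,0}(\sigma^1)H_{N,0}(\sigma^2)\big) - N R(\sigma^1,\sigma^2)^p\big| \le K . \tag{2.14} \]
Proof. For (2.13), we write
\[ E\,H_{N,0}^2(\sigma) = \frac{p!}{2N^{p-1}}\binom{N}{p} = \frac{N}{2}\Big(1-\frac{1}{N}\Big)\cdots\Big(1-\frac{p-1}{N}\Big) \]
because there are $\binom{N}{p}$ choices for $i_1<\cdots<i_p$. To prove (2.14) we note that
\[ 2E\,H_{N,0}(\sigma^1)H_{N,0}(\sigma^2) = \frac{p!}{N^{p-1}}\sum_{i_1<\cdots<i_p}\sigma^1_{i_1}\cdots\sigma^1_{i_p}\,\sigma^2_{i_1}\cdots\sigma^2_{i_p} = \frac{1}{N^{p-1}}\sum\nolimits_d \sigma^1_{i_1}\sigma^2_{i_1}\cdots\sigma^1_{i_p}\sigma^2_{i_p} \]
where $\sum_d$ means that the summation is over all choices of indices $i_1,\ldots,i_p\le N$ that are all distinct. On the other hand,
\[ N R(\sigma^1,\sigma^2)^p = \frac{1}{N^{p-1}}\sum \sigma^1_{i_1}\sigma^2_{i_1}\cdots\sigma^1_{i_p}\sigma^2_{i_p} , \]
where now the sum is over all indices $i_1,\ldots,i_p\le N$. But $|\sigma_i|\le 1$, and there are at most $KN^{p-1}$ choices of $i_1,\ldots,i_p$ that are not all distinct.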
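The two computations in this proof can be checked by brute force for small N and p. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the explicit constant p(p−1)/2 used in the usage note is one admissible value of K, obtained by counting the p-tuples in which some pair of positions carries the same index (at most $\binom{p}{2}N^{p-1}$ of them).

```python
import itertools
import math


def variance_exact(N, p):
    """E H_{N,0}(sigma)^2 = p! / (2 N^(p-1)) * C(N, p)."""
    return math.factorial(p) / (2 * N ** (p - 1)) * math.comb(N, p)


def variance_product_form(N, p):
    """(N/2) (1 - 1/N) ... (1 - (p-1)/N), the second expression in the proof."""
    out = N / 2
    for k in range(1, p):
        out *= 1 - k / N
    return out


def twice_covariance(s1, s2, p):
    """2 E H_{N,0}(s1) H_{N,0}(s2), computed from the sum over i1 < ... < ip."""
    N = len(s1)
    x = [a * b for a, b in zip(s1, s2)]
    e_p = sum(math.prod(x[i] for i in idx)
              for idx in itertools.combinations(range(N), p))
    return math.factorial(p) / N ** (p - 1) * e_p


def n_times_overlap_power(s1, s2, p):
    """N R(s1, s2)^p."""
    N = len(s1)
    R = sum(a * b for a, b in zip(s1, s2)) / N
    return N * R ** p
```

For any pair of configurations, `abs(twice_covariance(s1, s2, p) - n_times_overlap_power(s1, s2, p))` should then be at most p(p−1)/2, independently of N.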
It is good to note that
\[ E\big(H_{N,0}(\sigma^1) + H_{N,0}(\sigma^2)\big)^2 \le N\big(1 + R(\sigma^1,\sigma^2)^p\big) + K , \tag{2.15} \]
as follows from (2.13), (2.14). It is an amusing exercise to show that the error term K is in fact not needed (see [10]). We set
\[ p_N(\beta,h,t) = \frac{1}{N}E\log Z_N(\beta,h,t) . \tag{2.16} \]
Lemma 2.5. We have
\[ E\,Z_N(\beta,h,t) \le \exp N\Big(\psi(t) + \frac{\beta^2}{4} + \beta th\Big) \tag{2.17} \]
and
\[ p_N(\beta,h,t) \le \xi(\beta,h,t) . \tag{2.18} \]
Proof. For a centered Gaussian r.v. g, we have
\[ E\,e^{g} = e^{Eg^2/2} \tag{2.19} \]
so that (2.17) follows from (2.11), (2.13). As for (2.18), it follows from the next proposition, with M = card S(t) and $\tau^2 = N/2$.
Proposition 2.6. Consider M centered Gaussian r.v. $(g_i)_{i\le M}$ with $Eg_i^2\le\tau^2$ for each $i\le M$ (we do not assume that the $(g_i)_{i\le M}$ are independent). Then
\[ E\log\sum_{i\le M}e^{\beta g_i} \le \frac{\beta^2\tau^2}{2} + \log M \tag{2.20} \]
and if $\beta\ge\sqrt{2\log M}/\tau$ we have
\[ E\log\sum_{i\le M}e^{\beta g_i} \le \beta\tau\sqrt{2\log M} . \tag{2.21} \]
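The two bounds of Proposition 2.6 can be compared with a Monte Carlo estimate in the simplest situation. The sketch below assumes independent Gaussian variables with common variance $\tau^2$, which is only a special case (the proposition requires no independence); the sample sizes, seed and function names are choices made for the example.

```python
import math
import random


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def estimate(beta, M, tau, samples, seed=0):
    """Monte Carlo estimates of E log sum_i exp(beta g_i) and of E max_i g_i
    for M independent N(0, tau^2) variables."""
    rng = random.Random(seed)
    acc_log = 0.0
    acc_max = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, tau) for _ in range(M)]
        acc_log += log_sum_exp([beta * x for x in g])
        acc_max += max(g)
    return acc_log / samples, acc_max / samples


def bound_220(beta, M, tau):
    """Right-hand side of (2.20)."""
    return beta ** 2 * tau ** 2 / 2 + math.log(M)


def bound_221(beta, M, tau):
    """Right-hand side of (2.21), valid for beta >= sqrt(2 log M) / tau."""
    return beta * tau * math.sqrt(2 * math.log(M))
```

The two bounds coincide at the threshold $\beta = \sqrt{2\log M}/\tau$, where both equal $2\log M$; beyond it the linear bound (2.21) is the smaller one.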
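Proof. From Jensen’s inequality and (2.19) we have
\[ f(\beta) := E\log\sum_{i\le M}e^{\beta g_i} \le \log\sum_{i\le M}E\,e^{\beta g_i} \le \log\big(M e^{\beta^2\tau^2/2}\big) \]
and this proves (2.20). Next,
\[ E\log\sum_{i\le M}e^{\beta g_i} \ge \beta\,E\max_{i\le M} g_i , \]
and combining with (2.20) for $\beta = \sqrt{2\log M}/\tau$ we get
\[ E\max_{i\le M} g_i \le \tau\sqrt{2\log M} . \]
Next, we observe that
\[ f'(\beta) = \frac{d}{d\beta}E\log\sum_{i\le M}e^{\beta g_i} = E\,\frac{\sum_{i\le M}g_i e^{\beta g_i}}{\sum_{i\le M}e^{\beta g_i}} \le E\max_{i\le M} g_i \le \tau\sqrt{2\log M} . \tag{2.22} \]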
Since $f(\beta)\le\beta^2\tau^2/2 + \log M$ and $f'(\beta)\le\tau\sqrt{2\log M}$, it follows easily that $f(\beta)\le\beta\tau\sqrt{2\log M}$ for $\beta\ge\sqrt{2\log M}/\tau$.
The following simple fact is at the heart of Theorem 2.2.
Proposition 2.7. We set $\beta_t = 2\sqrt{\psi(t)}$ and we assume that
\[ p_N(\beta_t,h,t) \ge \xi(\beta_t,h,t) - a \tag{2.23} \]
where $a\le\psi(t)$. Then we have
\[ \forall\,\beta\ge\beta_t ,\quad p_N(\beta,h,t) \ge \xi(\beta,h,t) - 2(\beta-\beta_t)\sqrt{a} - a . \tag{2.24} \]
Proof. We fix N, h, t, and we observe that the function $\eta(\beta) = p_N(\beta,h,t)$ is convex (by Hölder’s inequality). Thus, using (2.23),
\[ \forall\,\beta ,\quad \eta(\beta) \ge \xi(\beta_t,h,t) + (\beta-\beta_t)\eta'(\beta_t) - a \tag{2.25} \]
while
\[ \forall\,\beta\ge\beta_t ,\quad \xi(\beta,h,t) = \xi(\beta_t,h,t) + (\beta-\beta_t)\big(ht + \sqrt{\psi(t)}\big) \tag{2.26} \]
and thus
\[ \forall\,\beta\ge\beta_t ,\quad \eta(\beta) \ge \xi(\beta,h,t) - (\beta-\beta_t)\big(ht + \sqrt{\psi(t)} - \eta'(\beta_t)\big) - a . \]
Thus, to prove (2.24), it suffices to prove that
\[ \eta'(\beta_t) \ge ht + \sqrt{\psi(t)} - 2\sqrt{a} . \]
This follows by combining (2.25) and (2.18) for $\beta = \beta_t - 2\sqrt{a}$.
To relate statements about $N^{-1}\log Z_N$ (a r.v.) and $N^{-1}E\log Z_N$ (a number) the following is very precious.
Proposition 2.8. For u > 0, we have
\[ P\Big(\frac{1}{N}\log Z_N(\beta,h,t) - p_N(\beta,h,t) \ge u\Big) \le \exp\Big(-\frac{Nu^2}{\beta^2}\Big) . \tag{2.27} \]
This is a special instance of the “concentration of measure phenomenon for Gaussian processes”, as expressed first in [11]. As a function of the variables $g_{i_1\cdots i_p}$, $\log Z_N(\beta,h,t)$ has a Lipschitz constant $\le\beta\sqrt{N/2}$ by simple estimates. We will use this principle several more times. If we combine Proposition 2.7 with (2.27) we see that to prove Theorem 2.2 it suffices to prove the following.
Proposition 2.9. If $|t|\le 1/4$, $\beta^2\le 4\psi(t)$, then
\[ p_N(\beta,h,t) \ge \xi(\beta,h,t) - 2^{-p/L} - K\frac{\log N}{\sqrt{N}} . \tag{2.28} \]
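Let us explain the structure of the proof. We will use the bound
\[ Z_N(\beta,h,t) \ge e^{N(\beta ht + \beta^2/2)}\,Y \tag{2.29} \]
where
\[ Y = \sum_{\sigma\in S(t)} 1_{\{-H_{N,0}(\sigma)\ge\beta N/2\}} . \]
The “second moment method” (Paley–Zygmund inequality) states that
\[ P\Big(Y\ge\frac{EY}{2}\Big) \ge \frac{1}{4}\frac{(EY)^2}{EY^2} \tag{2.30} \]
and combining with (2.29) this gives
\[ P\Big(\frac{1}{N}\log Z_N(\beta,h,t) \ge \beta th + \frac{\beta^2}{2} + \frac{1}{N}\log\frac{EY}{2}\Big) \ge \frac{1}{4}\frac{(EY)^2}{EY^2} . \]
If we combine with (2.27), we obtain
\[ p_N(\beta,h,t) \ge \beta th + \frac{\beta^2}{2} + \frac{1}{N}\log\frac{EY}{2} - \sqrt{\frac{\beta^2}{N}\log\frac{4EY^2}{(EY)^2}} . \tag{2.31} \]
Now, if we combine (2.11), (2.13), we see that
\[ EY \ge \frac{1}{KN}\exp N\Big(\psi(t) - \frac{\beta^2}{4}\Big) . \tag{2.32} \]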
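The Paley–Zygmund inequality (2.30) holds for any non-negative r.v. with finite second moment, and it can be verified exactly on a toy example. In the sketch below Y is taken to be binomial, which is an assumption made purely for illustration: in the text Y is a sum of dependent indicators, for which no such closed form is available. The function names are hypothetical.

```python
import math


def binomial_pmf(n, q):
    """Probability mass function of a Binomial(n, q) variable, as a list."""
    return [math.comb(n, k) * q ** k * (1 - q) ** (n - k) for k in range(n + 1)]


def paley_zygmund_sides(n, q):
    """Return (P(Y >= EY/2), (EY)^2 / (4 EY^2)) for Y ~ Binomial(n, q)."""
    pmf = binomial_pmf(n, q)
    EY = sum(k * w for k, w in enumerate(pmf))
    EY2 = sum(k * k * w for k, w in enumerate(pmf))
    lhs = sum(w for k, w in enumerate(pmf) if k >= EY / 2)
    rhs = EY ** 2 / (4 * EY2)
    return lhs, rhs
```

The inequality is informative precisely when $EY^2$ is not much larger than $(EY)^2$, which is what Lemma 2.10 below establishes for the variable Y of the text.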
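Thus, to prove Proposition 2.9, it suffices to show the following.
Lemma 2.10. If $|t|\le 1/4$, $\beta^2\le 4\psi(t)$, then
\[ EY^2 \le K\exp N\Big(2\psi(t) - \frac{\beta^2}{2} + 2^{-p/L}\Big) . \tag{2.33} \]
Proof. We have
\[ Y^2 = \sum_{\sigma^1,\sigma^2\in S(t)} 1_{\{-H_{N,0}(\sigma^1)\ge\beta N/2\}}\,1_{\{-H_{N,0}(\sigma^2)\ge\beta N/2\}} \le \sum_{\sigma^1,\sigma^2\in S(t)} 1_{\{-H_{N,0}(\sigma^1)-H_{N,0}(\sigma^2)\ge\beta N\}} , \]
and thus since, for a centered Gaussian r.v. g, we have $P(g\ge u)\le\exp(-u^2/2Eg^2)$, we get
\[ EY^2 \le \sum_{\sigma^1,\sigma^2\in S(t)} P\big(-H_{N,0}(\sigma^1) - H_{N,0}(\sigma^2)\ge\beta N\big) \le \sum_{\sigma^1,\sigma^2\in S(t)}\exp\Big(-\frac{\beta^2N^2}{2E(H_{N,0}(\sigma^1)+H_{N,0}(\sigma^2))^2}\Big) . \]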
Using (2.14),
\[ EY^2 \le K\sum_{\sigma^1,\sigma^2\in S(t)}\exp\Big(-\frac{\beta^2N}{2(1+R_{12}^p)}\Big) . \tag{2.34} \]
We see at this stage that it would help to have information on how many pairs of configurations in S(t) have a given overlap. For further need, we prove a more general result. We recall the notation $L(a) = -a\log a$.
Lemma 2.11. We have
\[ \mathrm{card}\{(\sigma^1,\sigma^2)\in S(t_1)\times S(t_2),\ R_{12} = u\} \le \exp N\psi(t_1,t_2,u) , \tag{2.35} \]
where
\[ \psi(t_1,t_2,u) = 2\log 2 + \frac{1}{4}L(1+u+t_1+t_2) + \frac{1}{4}L(1+u-t_1-t_2) + \frac{1}{4}L(1-u+t_1-t_2) + \frac{1}{4}L(1-u-t_1+t_2) . \tag{2.36} \]
Of course the set of (2.35) is empty should one of the arguments of L in (2.36) be negative. All the arguments are non-negative provided $-1+|t_1+t_2|\le u\le 1-|t_1-t_2|$.
Proof. A couple $(\sigma^1,\sigma^2)$ of configurations is specified by the four sets $A(\varepsilon_1,\varepsilon_2) = \{i\le N;\ \sigma_i^1 = \varepsilon_1,\ \sigma_i^2 = \varepsilon_2\}$ where $\varepsilon_1,\varepsilon_2 = \pm 1$. The relations $\sigma^1\in S(t_1)$, $\sigma^2\in S(t_2)$, $R(\sigma^1,\sigma^2) = u$ mean respectively that
\[ \mathrm{card}\,A_{1,1} + \mathrm{card}\,A_{1,-1} = N\frac{1+t_1}{2} \]
\[ \mathrm{card}\,A_{1,1} + \mathrm{card}\,A_{-1,1} = N\frac{1+t_2}{2} \]
\[ \mathrm{card}\,A_{1,1} + \mathrm{card}\,A_{-1,-1} = N\frac{1+u}{2} \]
which, together with $\sum\mathrm{card}\,A_{\varepsilon_1,\varepsilon_2} = N$, imply that
\[ \mathrm{card}\,A_{\varepsilon_1,\varepsilon_2} = N\,\frac{1+\varepsilon_1t_1+\varepsilon_2t_2+\varepsilon_1\varepsilon_2u}{4} . \]
The possible number of choices of the sets $A_{\varepsilon_1,\varepsilon_2}$ is
\[ \frac{N!}{\prod_{\varepsilon_1,\varepsilon_2}(\mathrm{card}\,A_{\varepsilon_1,\varepsilon_2})!} \]
from which the result follows using Stirling’s formula. We make a few simple observations. Computation from (2.36) yields
\[ \frac{\partial}{\partial u}\psi(t_1,t_2,u) = \frac{1}{2}\log\frac{(1-u)^2-(t_1-t_2)^2}{(1+u)^2-(t_1+t_2)^2} \tag{2.37} \]
which is zero at $u = t_1t_2$. Also, it is obvious from the definition that
\[ \psi(t_1,t_2,u) \le \psi(t_1) + \psi(t_2) \tag{2.38} \]
and there is equality if $u = t_1t_2$. We continue the proof of Lemma 2.10. From (2.34) we obtain that
\[ EY^2 \le K(2N+1)\exp N\max_u\Big(\psi(t,t,u) - \frac{\beta^2}{2(1+u^p)}\Big) . \tag{2.39} \]
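The counting argument behind Lemma 2.11, and the equality case of (2.38), can be checked on small examples. The Python sketch below parametrizes a pair of configurations by the four cardinalities of the sets $A_{\varepsilon_1,\varepsilon_2}$; the function names and the convention that L vanishes on non-positive arguments (which absorbs boundary cases and rounding) are assumptions of the example.

```python
import math


def ell(a):
    """L(a) = -a log a; non-positive arguments are treated as L = 0."""
    return 0.0 if a <= 0 else -a * math.log(a)


def psi(t):
    """psi(t) from (2.7)."""
    return math.log(2) + 0.5 * (ell(1 + t) + ell(1 - t))


def psi2(t1, t2, u):
    """psi(t1, t2, u) from (2.36)."""
    return 2 * math.log(2) + 0.25 * (
        ell(1 + u + t1 + t2) + ell(1 + u - t1 - t2)
        + ell(1 - u + t1 - t2) + ell(1 - u - t1 + t2))


def parameters(a_pp, a_pm, a_mp, a_mm):
    """Recover (N, t1, t2, u) from the cardinalities of A(1,1), A(1,-1), A(-1,1), A(-1,-1)."""
    N = a_pp + a_pm + a_mp + a_mm
    t1 = (a_pp + a_pm - a_mp - a_mm) / N
    t2 = (a_pp - a_pm + a_mp - a_mm) / N
    u = (a_pp - a_pm - a_mp + a_mm) / N
    return N, t1, t2, u


def log_pair_count(a_pp, a_pm, a_mp, a_mm):
    """log of N! / prod (card A)!, the number of pairs with these cardinalities."""
    N = a_pp + a_pm + a_mp + a_mm
    denom = 1
    for a in (a_pp, a_pm, a_mp, a_mm):
        denom *= math.factorial(a)
    return math.log(math.factorial(N) // denom)
```

For any admissible cardinalities, `log_pair_count(...)` should not exceed `N * psi2(t1, t2, u)`, which is the content of (2.35).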
The max is taken over the possible values of u, that is $-1+2t^+\le u\le 1$, where $t^+ = \max(t,0)$. Thus to prove (2.33), it suffices to prove that if $-1+2t^+\le u\le 1$, we have
\[ \psi(t,t,u) \le 2\psi(t) - \beta^2\frac{u^p}{2(1+u^p)} + 2^{-p/L} \tag{2.40} \]
and since $\beta^2\le 4\psi(t)$, it suffices to prove that for each u we have
\[ \psi(t,t,u)(1+u^p) \le 2\psi(t) + 2^{-p/L} . \tag{2.41} \]
Proving this requires tedious elementary estimates. We give these estimates in a sufficiently general form to cover all our future needs.
Lemma 2.12. If $|t_1|, |t_2|\le 1/2$, $u^p\le 2^{-7}$, we have
\[ \psi(t_1,t_2,u)(1+u^p) \le \psi(t_1) + \psi(t_2) - \frac{(u-t_1t_2)^2}{8} + 2^{2-p} . \tag{2.42} \]
Proof. From (2.37) we have
\[ \frac{\partial^2}{\partial u^2}\psi(t_1,t_2,u) = -\frac{1-u}{(1-u)^2-(t_1-t_2)^2} - \frac{1+u}{(1+u)^2-(t_1+t_2)^2} \le -\frac{1}{4}(1-u+1+u) \le -\frac{1}{2} \]
so that, since $\frac{\partial}{\partial u}\psi(t_1,t_2,u)$ is zero for $u = t_1t_2$,
\[ \psi(t_1,t_2,u) \le \psi(t_1,t_2,t_1t_2) - \frac{1}{4}(u-t_1t_2)^2 = \psi(t_1) + \psi(t_2) - \frac{1}{4}(u-t_1t_2)^2 . \tag{2.43} \]
Since $\psi(t_1,t_2,u)\le 2\log 2\le 2$, we get
\[ \psi(t_1,t_2,u)(1+u^p) \le \psi(t_1) + \psi(t_2) - \frac{1}{4}(u-t_1t_2)^2 + \max(0, 2u^p) . \]
This implies (2.42) if $|u|\le 1/2$. But if $|u|\ge 1/2$, and since $u^p\le 2^{-7}$, we have
\[ 2u^p \le \frac{1}{32} \le \frac{1}{8}(u-t_1t_2)^2 . \]
Lemma 2.13. We have (the first L in each product below being a universal constant, the second the function (2.8))
\[ 0\le a, b\le 2 \ \Rightarrow\ |L(a)-L(b)| \le L|a-b|\log\frac{4}{|a-b|} = L\,L\Big(\frac{|a-b|}{4}\Big) \tag{2.44} \]
\[ |\psi(t_1,t_2,u) - \psi(t_1,t_2,v)| \le L\,L\Big(\frac{|u-v|}{4}\Big) \tag{2.45} \]
\[ \psi(t_1,t_2,u) \le \psi(t) + L\,L\Big(\frac{1-u}{4}\Big) + L\,L\Big(\frac{|t_1-t|}{4}\Big) + L\,L\Big(\frac{|t_2-t|}{4}\Big) . \tag{2.46} \]
Proof. The proof of (2.44) is elementary, and (2.44) implies (2.45) using (2.36). To prove (2.46), we use (2.36), the observation that $\psi(t) = \psi(t,t,1)$ and the fact that
\[ L\Big(\frac{|a+b+c|}{4}\Big) \le L\Big(L\Big(\frac{|a|}{4}\Big) + L\Big(\frac{|b|}{4}\Big) + L\Big(\frac{|c|}{4}\Big)\Big) \]
for $|a|, |b|, |c|\le 2$, $|a+b+c|\le 2$.
Lemma 2.14. If $|t_1|\le 1/2$, $|t_2|, |t|\le 1$, $u\ge 0$, $u^p\ge 2^{-7}$, then
\[ \psi(t_1,t_2,u)(1+u^p) \le 2\psi(t) + L\,L\Big(\frac{1-u}{4}\Big) + L\,L\Big(\frac{|t_1-t|}{4}\Big) + L\,L\Big(\frac{|t_2-t|}{4}\Big) - \frac{p}{L}(1-u) . \tag{2.47} \]
Proof. Since $u^p\ge 2^{-7}$ we have $1-u\le L/p$ and thus
\[ u^p = (1-(1-u))^p \le 1 - \frac{p}{L}(1-u) . \]
Since $\psi(t_1,t_2,u)\ge\psi(t_1)\ge 1/L$ as $|t_1|\le 1/2$, we get
\[ \psi(t_1,t_2,u)(1+u^p) \le 2\psi(t_1,t_2,u) - \frac{p}{L}(1-u) \]
and the result follows from (2.46).
When p is even, we will also need the following.
Lemma 2.15. If $|t_1|\le 1/2$, $|t_2|, |t|\le 1$, $u^p\ge 2^{-7}$, $u<0$, then
\[ \psi(t_1,t_2,u)(1+u^p) \le 2\psi(t) + L\,L\Big(\frac{1+u}{4}\Big) + L\,L\Big(\frac{|t-t_1|}{4}\Big) + L\,L\Big(\frac{|t-t_2|}{4}\Big) - \frac{p}{L}(1+u) . \tag{2.48} \]
Proof. It is nearly identical to that of Lemma 2.14.
Proof of (2.41). If $u^p \le 2^{-7}$, this follows from (2.42). If $u^p \ge 2^{-7}$ this follows from (2.47) (resp. (2.48)) if $u \ge 0$ (resp. $u \le 0$), used for $t_1 = t_2 = t$, and from the fact that
\[
L\mathcal{L}\Big(\frac{x}{4}\Big) - \frac{p}{L}x \le 2^{-p/L}. \tag{2.49}
\]
We have proved Theorem 2.2.

Now that we somewhat understand $Z_N(\beta,h,t)$, we turn to the study of $Z_N(\beta,h)$. Quite naturally, we define $t_m = t_m(\beta,h)$ by
\[
\xi(\beta,h,t_m) = \max_t \xi(\beta,h,t). \tag{2.50}
\]
Lemma 2.16. There exists $h_0 > 0$ such that
\[
h \le h_0 \;\Rightarrow\; \forall \beta, \quad 0 \le t_m(\beta,h) \le \frac18. \tag{2.51}
\]
Moreover, if $|t| \le 1$,
\[
\xi(\beta,h,t) \le \xi(\beta,h,t_m) - \frac{(t-t_m)^2}{4}. \tag{2.52}
\]

Proof. Fixing $\beta$, $h$ we have
\[
\xi(\beta,h,t) = \frac{\beta^2}{4} + \psi(t) + t\beta h
\]
if $\beta \le 2\sqrt{\psi(t)}$, while otherwise
\[
\xi(\beta,h,t) = \beta\big(\sqrt{\psi(t)} + th\big),
\]
so that $t_m$ satisfies either
\[
-\psi'(t_m) = \beta h \tag{2.53}
\]
or else
\[
-\frac{\psi'(t_m)}{2\sqrt{\psi(t_m)}} = h. \tag{2.54}
\]
Now
\[
\psi'(t) = -\frac12\log\frac{1+t}{1-t}, \tag{2.55}
\]
and (2.53) means that $t_m = \operatorname{th}\beta h$. The case (2.53) can occur only if $\beta \le 2\sqrt{\log 2}$. Since $\psi(t_m) \le 2\log 2$ the solution of (2.54) goes to zero with $h$. Thus (2.51) should be obvious. Next, from (2.55) we have
\[
\psi''(t) = -\frac{1}{1-t^2} \le -1
\]
and
\[
\big(\sqrt{\psi(t)}\big)'' = \frac{\psi''(t)}{2\sqrt{\psi(t)}} - \frac{\psi'(t)^2}{4\psi(t)^{3/2}} \le -\frac{1}{2\sqrt{\psi(t)}},
\]
so that $(\beta\sqrt{\psi(t)})'' \le -1$ whenever $\beta \ge 2\sqrt{\psi(t)}$. Clearly this implies (2.52).

Given $t_1, t_2, u$ we set
\[
D(\beta,h,t_1,t_2,u) = \sum \exp(-\beta H_N(\sigma^1) - \beta H_N(\sigma^2)), \tag{2.56}
\]
where the summation is over $\sigma^1 \in S(t_1)$, $\sigma^2 \in S(t_2)$, $R_{12} = u$. The reason for considering this quantity is that
\[
G^{\otimes 2}(\{(\sigma^1,\sigma^2);\ R_{12} \in U\}) = \frac{A}{Z_N(\beta,h)^2}, \tag{2.57}
\]
where
\[
A = \sum D(\beta,h,t_1,t_2,u), \tag{2.58}
\]
for a summation over $|t_1|, |t_2|, |u| \le 1$, $u \in U$, $Nt_1, Nt_2, Nu$ integers. We set
\[
\eta(\beta,h,t_1,t_2,u) = \frac1N E\log D(\beta,h,t_1,t_2,u). \tag{2.59}
\]

We now turn to the proof of Theorem 2.1. In this theorem we will have $q(\beta,h) = t_m^2$. For clarity, we consider a parameter $c$, and, when $p$ is odd, we define
\[
U = \{x \in [-1,1];\ |x - t_m^2| \ge c,\ x \le 1-c\}, \tag{2.60}
\]
while, when $p$ is even, we define
\[
U = \{x \in [-1,1];\ |x - t_m^2| \ge c,\ |x| \le 1-c\}. \tag{2.61}
\]
To prove Theorem 2.1 it suffices to prove the following.

Lemma 2.17. Given a number $L_0$, we can find $L_1$ such that if $p \ge L_1$, if $c = 2^{-p/L_1}$, then for all $t_1, t_2$, all $h \le h_0$, and all $\beta$ with $1 \le \beta \le 2^{p/L_1}$ we have
\[
u \in U \;\Rightarrow\; \eta(\beta,h,t_1,t_2,u) \le 2\xi(\beta,h,t_m) - 2^{-p/L_0+1}. \tag{2.62}
\]

To see this, we take for $L_0$ the number $L$ of Theorem 2.2. We bound the sum in (2.58) by $(2N+1)^2$ times its largest term, and we see that from (2.59), (2.62), we have
\[
\frac1N E\log A \le 2\,\frac{\log(2N+1)}{N} + 2\xi(\beta,h,t_m) - 2^{-p/L_0+1}. \tag{2.63}
\]
We use Theorem 2.2 with $t = t_m$ to control from below the denominator of (2.57). Now, mimicking (2.27), we have
\[
P\Big(\frac1N\log A - \frac1N E\log A \ge u\Big) \le \exp\Big(-\frac{Nu^2}{4\beta^2}\Big),
\]
and together with (2.63) this controls the numerator of (2.57).

Proof of Lemma 2.17. First, we observe that
\[
D(\beta,h,t_1,t_2,u) \le Z_N(\beta,h,t_1)Z_N(\beta,h,t_2),
\]
so that, taking logarithms, expectation, and using (2.18) we get
\[
\eta(\beta,h,t_1,t_2,u) \le \xi(\beta,h,t_1) + \xi(\beta,h,t_2).
\]
Thus (2.52) shows that to prove (2.62), we can assume (as we do in the rest of the proof) that
\[
|t_1 - t_m|,\ |t_2 - t_m| \le 4d \tag{2.64}
\]
where $d = 2^{-p/(2L_0)}$.

For $a, b > 0$, we define $F(a,b) = a+b$ if $a \le b$ and $F(a,b) = 2\sqrt{ab}$ if $a \ge b$. We observe that
\[
2\xi(\beta,h,t) = F\Big(\frac{\beta^2}{2},\, 2\psi(t)\Big) + 2\beta h t \tag{2.65}
\]
and that
\[
\eta(\beta,h,t_1,t_2,u) \le F\Big(\frac{\beta^2}{2}(1+u^p),\, \psi(t_1,t_2,u)\Big) + \beta h t_1 + \beta h t_2, \tag{2.66}
\]
a fact that follows from Proposition 2.6, using (2.15), (2.35). Next, we show that if $p \ge L_1$, $c = 2^{-p/L_1}$, $L_1$ large enough, then
\[
u \in U \;\Rightarrow\; (1+u^p)\psi(t_1,t_2,u) \le 2\psi(t_m) - \frac{c^2}{L}. \tag{2.67}
\]
If $u^p \le 2^{-7}$, this follows from (2.42), since
\[
|\psi(t_j) - \psi(t_m)| \le L\mathcal{L}(d) \tag{2.68}
\]
for $j = 1, 2$, by (2.64). To treat the case $u^p \ge 2^{-7}$, we observe that if $x \ge c$, $c = 2^{-p/L_1}$, $L_1$ large enough, then
\[
L\mathcal{L}(d) + L\mathcal{L}\Big(\frac{x}{4}\Big) - \frac{p}{L}x \le -\frac{c}{L} \le -\frac{c^2}{L},
\]
and we use (2.47) (resp. (2.48)) when $u > 0$ (resp. $p$ even and $u < 0$). We now prove (2.62) when
\[
\frac{\beta^2}{2}(1+u^p) \ge \psi(t_1,t_2,u). \tag{2.69}
\]
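By (2.66), we have, using (2.67) in the second line
\[
\begin{aligned}
\eta(\beta,h,t_1,t_2,u) &\le 2\beta\sqrt{\tfrac12(1+u^p)\psi(t_1,t_2,u)} + \beta h(t_1+t_2) \\
&\le 2\beta\sqrt{\psi(t_m) - \frac{c^2}{L}} + \beta h(t_1+t_2) \\
&\le 2\beta\sqrt{\psi(t_m)} + 2\beta h t_m + 2\beta d - \frac{\beta c^2}{L} \\
&\le 2\xi(\beta,h,t_m) + 2\beta d - \frac{\beta c^2}{L}.
\end{aligned} \tag{2.70}
\]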
Since we assume $\beta \ge 1$ (and $c = 2^{-p/L_1}$ where $L_1$ is large enough) this finishes the proof of (2.62) under (2.69). Finally we prove (2.62) when (2.69) fails, i.e.
\[
\frac{\beta^2}{2}(1+u^p) < \psi(t_1,t_2,u). \tag{2.71}
\]
First, we assume $u^p > 0$. Then, from (2.71)
\[
\frac{\beta^2}{2} < \frac{\beta^2}{2}(1+u^p) < \psi(t_1,t_2,u) \le \psi(t_1) + \psi(t_2) \le 2\psi(t_m) + L\mathcal{L}(d), \tag{2.72}
\]
using (2.68). Thus, if $\beta^2 > 4\psi(t_m)$, then
\[
F\Big(\frac{\beta^2}{2},\, 2\psi(t_m)\Big) \ge F(2\psi(t_m), 2\psi(t_m)) = 4\psi(t_m) \ge \frac{\beta^2}{2} + 2\psi(t_m) - L\mathcal{L}(d) \tag{2.73}
\]
by (2.72). If $\beta^2 \le 4\psi(t_m)$, (2.73) remains true since $F(\beta^2/2, 2\psi(t_m)) = \beta^2/2 + 2\psi(t_m)$. Now, under (2.71)
\[
\begin{aligned}
F\Big(\frac{\beta^2}{2}(1+u^p),\, \psi(t_1,t_2,u)\Big) &= \frac{\beta^2}{2} + \frac{\beta^2u^p}{2} + \psi(t_1,t_2,u) \\
&\le \frac{\beta^2}{2} + (1+u^p)\psi(t_1,t_2,u) \\
&\le \frac{\beta^2}{2} + 2\psi(t_m) - \frac{c^2}{L},
\end{aligned} \tag{2.74}
\]
using (2.72) in the second line and (2.67) in the last line. Combining with (2.65), (2.66), (2.73), this proves again (2.62). The much easier case $u < 0$ is left to the reader.
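The role of the function $F$ and of the maximizer $t_m$ of Lemma 2.16 can be illustrated numerically. The sketch below rests on two assumptions that are not spelled out in this section: that $\psi(t)$ is the entropy of a $\pm1$ spin with mean $t$ (consistent with $\psi''(t) = -1/(1-t^2)$), and that $\xi(\beta,h,t) = F(\beta^2/4, \psi(t)) + \beta h t$, the normalization under which the two branches of $\xi$ agree at $\beta = 2\sqrt{\psi(t)}$. The helper names `xi` and `t_max` and the parameter values are hypothetical.

```python
import math


def F(a, b):
    """F(a, b) = a + b if a <= b, and 2*sqrt(a*b) if a >= b."""
    return a + b if a <= b else 2.0 * math.sqrt(a * b)


def psi(t):
    """Assumed entropy form of psi(t); equals log 2 at t = 0."""
    total = 0.0
    for x in ((1 + t) / 2, (1 - t) / 2):
        if x > 0:
            total -= x * math.log(x)
    return total


def xi(beta, h, t):
    """Assumed normalization: beta^2/4 + psi(t) + beta*h*t when
    beta <= 2*sqrt(psi(t)), and beta*(sqrt(psi(t)) + h*t) otherwise."""
    return F(beta ** 2 / 4, psi(t)) + beta * h * t


def t_max(beta, h, steps=20000):
    """Grid approximation of the maximizer t_m of t -> xi(beta, h, t)."""
    grid = [-1 + 2 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: xi(beta, h, t))
```

For instance, with $\beta = 1$ and $h = 0.05$ the maximizer falls in the first branch and `t_max(1.0, 0.05)` should be close to $\operatorname{th}\beta h$, as in (2.53); with $\beta = 3$ the second branch applies, and the quadratic bound (2.52) can be checked on a grid.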
We have proved Theorem 2.1. The following is also worth noting.

Proposition 2.18. If
\[
h \le \frac12, \qquad \beta < 2\sqrt{\psi(t_m)} - 2^{-p/L}, \tag{2.75}
\]
then
\[
E G_N^{\otimes 2}\big(\{(\sigma^1,\sigma^2);\ |R_{12} - t_m^2| \ge 2^{-p/L}\}\big) \le K\exp\Big(-\frac{N}{K}\Big). \tag{2.76}
\]
In particular in this case we essentially have $|R_{12}| \le 1/2$. This is a much simpler situation. It would be simple to prove that we are then in a “high temperature situation”, i.e. that $E\langle (R_{12} - q)^2\rangle \to 0$ for a certain value of $q$ (given by (1.1) below when $m = 1$). We will not do this, and we will simply use Proposition 2.18 to demonstrate that the assumption $\beta \ge 1$ in Theorems 1.1 and 10.1 does not remove anything of real interest.

Proof. We have to show that, under (2.75), if we replace in (2.61) the definition of $U$ by
\[
U = \{x \in [-1,1];\ |x - t_m^2| \ge c\}, \tag{2.77}
\]
then (2.62) still holds under (2.75). There is nothing to change in the proof of the case where $u^p \le 2^{-7}$, so we assume $u^p \ge 2^{-7}$. Using (2.47)–(2.49) we see that
\[
(1+u^p)\psi(t_1,t_2,u) \le 2\psi(t_m) + L\mathcal{L}(d) + 2^{-p/L}. \tag{2.78}
\]
Under (2.69), rather than (2.70) we use (2.78) to write
\[
2\beta\Big(\frac12(1+u^p)\psi(t_1,t_2,u)\Big)^{1/2} \le 2\beta\big(\psi(t_m) + L\mathcal{L}(d) + 2^{-p/L}\big)^{1/2} \le 2\beta\sqrt{\psi(t_m)} + L\sqrt{\mathcal{L}(d)} + L2^{-p/L},
\]
and we use that
\[
2\beta\sqrt{\psi(t_m)} \le \frac{\beta^2}{2} + 2\psi(t_m) - \frac12\big(2\sqrt{\psi(t_m)} - \beta\big)^2,
\]
and (2.75) to conclude. Under (2.71), we use the full strength of (2.71) to write, instead of (2.72), that
\[
\frac{\beta^2}{2}u^p \le \frac{u^p}{1+u^p}\psi(t_1,t_2,u) = \Big(u^p - \frac{u^{2p}}{1+u^p}\Big)\psi(t_1,t_2,u) \tag{2.79}
\]
and in (2.74) we use (2.78), (2.79) to get a bound
\[
\frac{\beta^2}{2} + 2\psi(t_m) - \frac{u^{2p}}{1+u^p}\psi(t_m) + L\mathcal{L}(d) + 2^{-p/L},
\]
and the result since $u^p \ge 2^{-7}$.
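3. Construction of the Lumps

This construction is made possible by a simple result concerning (non-random) probability measures on $\Sigma_N$.

Theorem 3.1. Consider a probability $\mu$ on $\Sigma_N$. Consider $0 < a < 1/28$ and let
\[
\varepsilon = \mu\otimes\mu\Big(\Big\{(\sigma^1,\sigma^2);\ |R_{12}| \in \Big[\frac12,\, 1-2a\Big]\Big\}\Big). \tag{3.1}
\]
Then we can find a partition $(B_\alpha)_{\alpha\ge1}$ of $\Sigma_N$ with the following properties.
\[
\text{Each } B_\alpha \text{ is symmetric}: \quad \sigma \in B_\alpha \Leftrightarrow -\sigma \in B_\alpha, \tag{3.2}
\]
\[
\sigma^1, \sigma^2 \in B_\alpha \;\Rightarrow\; |R_{12}| \ge 1 - 12a, \tag{3.3}
\]
\[
\mu\otimes\mu\Big(\Big\{(\sigma^1,\sigma^2);\ |R_{12}| \ge \frac12\Big\} \setminus \bigcup_\alpha B_\alpha^2\Big) \le 3\varepsilon^{1/3}. \tag{3.4}
\]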
This will be used for $\varepsilon$ very small. The meaning of (3.3) is that the sets $B_\alpha$ are very small; the meaning of (3.4) is that basically the only way that we can have $|R_{12}| \ge 1/2$ is that $\sigma^1, \sigma^2$ belong to the same set $B_\alpha$. There is certainly nothing magic about the value 1/2. To keep the notation manageable, we index the sets $B_\alpha$ by $\alpha \in \mathbb{N}$. Of course at most $2^N$ such sets are not empty. A result very similar to Theorem 3.1 is proved in [10] so we do not give the simple proof here.

Theorem 3.2. There exists a number $L$ with the following property. If $p \ge L$, $h \le 1/L$, $\beta \le 2^{p/L}$, then we can find a partition $(B_\alpha)_{\alpha\ge1}$ of $\Sigma_N$ (depending upon the disorder) such that
\[
\text{Each } B_\alpha \text{ is symmetric}, \tag{3.5}
\]
\[
\sigma^1, \sigma^2 \in B_\alpha \;\Rightarrow\; |R_{12}| \ge 1 - 2^{-p/L}, \tag{3.6}
\]
\[
E G_N^{\otimes 2}\Big(\Big\{|R_{12}| \ge \frac12\Big\} \setminus \bigcup_{\alpha\ge1} B_\alpha^2\Big) \le K\exp\Big(-\frac{N}{K}\Big). \tag{3.7}
\]
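Proof. We use Theorem 3.1 at given disorder and (2.1). By (3.2) we have $B_\alpha = B'_\alpha \cup (-B'_\alpha)$, where
\[
\sigma^1, \sigma^2 \in B'_\alpha \;\Rightarrow\; R_{12} \ge 1 - 2^{-p/L}. \tag{3.8}
\]
We will denote by $(C_\alpha)$ an enumeration of the sets $B'_\alpha$, $-B'_\alpha$; given $\alpha$, there is a unique $\varphi(\alpha)$ such that $C_{\varphi(\alpha)} = -C_\alpha$. We have
\[
\bigcup_{\alpha\ge1} B_\alpha^2 = \Big(\bigcup_{\alpha\ge1} C_\alpha^2\Big) \cup \Big(\bigcup_{\alpha\ge1} C_\alpha \times C_{\varphi(\alpha)}\Big). \tag{3.9}
\]
Now
\[
C_\alpha \times C_{\varphi(\alpha)} \subset \Big\{R_{12} \le -\frac12\Big\}.
\]
When $p$ is odd, Theorem 2.1 shows that
\[
E G_N^{\otimes 2}\Big(\Big\{(\sigma^1,\sigma^2);\ R_{12} \le -\frac12\Big\}\Big) \le K\exp\Big(-\frac{N}{K}\Big),
\]
so that (3.9) shows that in (3.7) we can replace $\bigcup_{\alpha\ge1} B_\alpha^2$ by $\bigcup_{\alpha\ge1} C_\alpha^2$. We conjecture that this is also true when $p$ is even and $h \ne 0$. (It is wrong when $p$ is even and $h = 0$.) Throughout the paper, the sequences $B_\alpha$, $C_\alpha$ will keep the meaning above.

4. Pure States

We consider 3 replicas $\sigma^\ell$, $\ell = 1, 2, 3$ and
\[
U_N(\beta) = U_N(\beta,h) = E\Big\langle \Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2 1_{\{R_{12}\ge3/4\}}\Big\rangle \tag{4.1}
\]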
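where $\sigma^\sim = \sigma^1 - \sigma^2$. The objective of this section is to prove the following:

Theorem 4.1. If $h \le 1/2$, $\beta \le 2^{p/L}$, then
\[
U_N(\beta,h) \le \frac{K}{N}. \tag{4.2}
\]
It is instructive to interpret this result using the lumps constructed in Sec. 3. There, we have constructed a partition $(C_\alpha)_{\alpha\ge1}$ of $\Sigma_N$, with the following properties, where $b = 1 - 2^{-p/L}$.
\[
\text{If } \sigma^1, \sigma^2 \in C_\alpha \text{ then } R_{12} \ge b, \tag{4.3}
\]
\[
E G_N^{\otimes 2}\Big(\Big\{R_{12} \ge \frac12\Big\} \setminus \bigcup_{\alpha\ge1} C_\alpha^2\Big) \le K\exp\Big(-\frac{N}{K}\Big). \tag{4.4}
\]
There exists a one to one map $\varphi$ such that
\[
E G_N^{\otimes 2}\Big(\Big\{R_{12} \le -\frac12\Big\} \setminus \bigcup_{\alpha\ge1} C_\alpha \times C_{\varphi(\alpha)}\Big) \le K\exp\Big(-\frac{N}{K}\Big). \tag{4.5}
\]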
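We recall the notation $w_\alpha = G_N(C_\alpha)$. It will be useful to consider integrals restricted to $C_\alpha$. For a function $f$ on $k$ replicas, we write
\[
\langle f\rangle_\alpha = \frac{1}{w_\alpha^k}\Big\langle f\prod_{\ell\le k} 1_{\{\sigma^\ell\in C_\alpha\}}\Big\rangle = \frac{1}{G_N(C_\alpha)^k}\int_{\sigma^1,\ldots,\sigma^k\in C_\alpha} f(\sigma^1,\ldots,\sigma^k)\,dG_N(\sigma^1)\cdots dG_N(\sigma^k). \tag{4.6}
\]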
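Thus, if we define
\[
I_\alpha = \Big\langle\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2\Big\rangle_\alpha, \tag{4.7}
\]
then
\[
I_\alpha \le \frac{1}{w_\alpha^3}U_N(\beta,h). \tag{4.8}
\]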
We know from Theorem 4.1 that $U_N(\beta,h)$ is vanishingly small. Thus (4.8) shows that unless $w_\alpha$ is itself small, then $I_\alpha$ is small. Now, $I_\alpha$ is simply a symmetrized version of the quantity $\langle(R_{12} - \langle R_{12}\rangle_\alpha)^2\rangle_\alpha$, which, when small, expresses the fact that the overlap of two configurations in $C_\alpha$ is essentially independent of these configurations, which means, as the physicists say, that $C_\alpha$ is in “a pure state”. Thus a consequence of Theorem 4.1 is that “the lumps that are of macroscopic weight are in a pure state”.

A fundamental tool of the present paper is the cavity method (induction upon $N$) that relates an $(N+1)$-spin system with an $N$-spin system. The use of that method is very problematic until one has gathered information about the weight distribution $(w_\alpha)$ (as is demonstrated in [10], Sec. 6.7). The proof of Theorem 4.1 is inspired by the cavity method, but with a number of twists. Rather than developing special notation tailored to the proof of this theorem, we will prove it with the notation we use in the entire paper for the cavity method. We introduce this notation now. It will be used throughout the paper.

For the cavity method it is convenient to think of $h' = \beta h$ as a parameter independent of $\beta$. We also keep this notation throughout the paper. We consider a new i.i.d. standard normal sequence $g_{i_1\cdots i_{p-1}}$ for $1 \le i_1 < \cdots < i_{p-1} \le N$, that is independent of all other sequences considered so far. The basic observation of the cavity method is that if $\varepsilon \in \{-1, 1\}$,
\[
-\beta H_N(\sigma) + \beta\varepsilon\Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_{p-1}} g_{i_1\cdots i_{p-1}}\sigma_{i_1}\cdots\sigma_{i_{p-1}} + \varepsilon h' \tag{4.9}
\]
is the Hamiltonian $-\beta' H_{N+1}(\rho)$ of an $(N+1)$-spin system. To see this, one sets $\rho_i = \sigma_i$ for $i \le N$, $\rho_{N+1} = \varepsilon$, $g_{i_1\cdots i_{p-1}N+1} = g_{i_1\cdots i_{p-1}}$. The value of $\beta'$ is given by
\[
\beta\Big(\frac{p!}{2N^{p-1}}\Big)^{1/2} = \beta'\Big(\frac{p!}{2(N+1)^{p-1}}\Big)^{1/2} \tag{4.10}
\]
i.e.
\[
\beta' = \beta\Big(1 + \frac1N\Big)^{(p-1)/2}. \tag{4.11}
\]
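The parameter $h'$ does not vary (which is the reason for adopting it). To simplify notation, we write
\[
g(\sigma) = \beta\Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_{p-1}} g_{i_1\cdots i_{p-1}}\sigma_{i_1}\cdots\sigma_{i_{p-1}} \tag{4.12}
\]
so that (4.9) becomes
\[
-\beta' H_{N+1}(\rho) = -\beta H_N(\sigma) + \varepsilon g(\sigma) + \varepsilon h'. \tag{4.13}
\]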
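The rescaling (4.11) is exactly what makes the two prefactors in (4.10) agree. A minimal numerical sketch of this bookkeeping follows; the function names `cavity_beta` and `coupling` are hypothetical and the values of $\beta$, $N$, $p$ in any call are purely illustrative.

```python
import math


def cavity_beta(beta, n, p):
    """beta' from (4.11): beta' = beta * (1 + 1/N)^((p - 1)/2)."""
    return beta * (1.0 + 1.0 / n) ** ((p - 1) / 2.0)


def coupling(beta, n, p):
    """Prefactor beta * (p! / (2 N^(p-1)))^(1/2) appearing in (4.10)."""
    return beta * math.sqrt(math.factorial(p) / (2.0 * n ** (p - 1)))
```

With these definitions, `coupling(beta, N, p)` and `coupling(cavity_beta(beta, N, p), N + 1, p)` should coincide up to rounding, which is the content of (4.10).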
Consider now a function $f$ on $\Sigma_{N+1}^k$, and let us denote by $\langle\cdot\rangle'$ integration with respect to (the $k$th power of) Gibbs' measure with Hamiltonian (4.13). Then we have the algebraic identity
\[
\langle f\rangle' = \frac{\langle \mathrm{Av}\, f\,\mathcal{E}_k\rangle}{Z^k}. \tag{4.14}
\]
Here
\[
\mathcal{E}_k = \mathcal{E}_k(\sigma^1,\ldots,\sigma^k) = \exp\sum_{\ell\le k}\varepsilon_\ell\big(g(\sigma^\ell) + h'\big), \tag{4.15}
\]
\[
Z = \langle \mathrm{Av}\,\mathcal{E}\rangle, \tag{4.16}
\]
$\mathcal{E} = \exp\varepsilon(g(\sigma) + h')$, and Av means average over $\varepsilon, \varepsilon_1, \ldots, \varepsilon_k = \pm1$. The quantity $f$ depends upon $\rho^1, \ldots, \rho^k \in \Sigma_{N+1}$, and $\rho^\ell = (\sigma_1^\ell, \ldots, \sigma_N^\ell, \varepsilon_\ell)$. After averaging over $\varepsilon_1, \ldots, \varepsilon_k$, $f\mathcal{E}_k$ depends only upon $\sigma^1, \ldots, \sigma^k$, where $\sigma^\ell = (\sigma_1^\ell, \ldots, \sigma_N^\ell)$, and $\langle \mathrm{Av}\, f\mathcal{E}_k\rangle$ means that in $\mathrm{Av}\, f\mathcal{E}_k$, these variables $\sigma^1, \ldots, \sigma^k$ are averaged for $G_N$. Of course $\langle\cdot\rangle'$ is for the value $\beta'$ related to the value $\beta$ appearing in $\langle\cdot\rangle$ by (4.11). The essential idea of the cavity method is that when computing
\[
E\langle f\rangle' = E\,\frac{\langle \mathrm{Av}\, f\,\mathcal{E}_k\rangle}{Z^k}
\]
the variables $g(\sigma)$ occurring on the right are probabilistically independent of the r.v. occurring in $\langle\cdot\rangle$, so we can first integrate in these. The following lemma will be useful for this purpose. Its proof is nearly identical to that of Lemma 2.4.

Lemma 4.2. We have
\[
\Big|E g(\sigma^1)g(\sigma^2) - \beta^2\frac{p}{2}\Big(\frac{\sigma^1\cdot\sigma^2}{N}\Big)^{p-1}\Big| \le \frac{K}{N}. \tag{4.17}
\]
In particular we have $Eg(\sigma)^2 \simeq \beta^2p/2$. This means that, as $p$ increases, the $(N+1)$-spin system is less and less related to the $N$-spin system. Fortunately, this will be more than compensated by the control of the system through Theorem 2.1, control that improves as $p$ increases.
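Another important feature of the cavity method is that it breaks the symmetry between the sites. To take advantage of this, we write
\[
U_N(\beta,h') = E\Big\langle\frac{\sigma^\sim\cdot\sigma^3}{N}\,\sigma_N^\sim\sigma_N^3\, 1_{\{R_{12}\ge3/4\}}\Big\rangle, \tag{4.18}
\]
where of course $\sigma_N^\sim = \sigma_N^1 - \sigma_N^2$. The notation $U_N(\beta,h')$ means that we now think of $\beta$ and $h' = h\beta$ as independent parameters. Changing $\beta$ into $\beta'$, $N$ into $N+1$, we get from (4.18) that
\[
U_{N+1}(\beta',h') = E\Big\langle\frac{\rho^\sim\cdot\rho^3}{N+1}\,\varepsilon^\sim\varepsilon_3 A'\Big\rangle', \tag{4.19}
\]
where
\[
\rho^\sim = \rho^1 - \rho^2, \quad \varepsilon^\sim = \varepsilon_1 - \varepsilon_2, \quad A' = 1_{\{\rho^1\cdot\rho^2 \ge 3(N+1)/4\}}. \tag{4.20}
\]
The notation here is that of (4.13), so that $\rho^\ell = (\sigma_1^\ell, \ldots, \sigma_N^\ell, \varepsilon_\ell)$, and $\sigma^\ell = (\sigma_1^\ell, \ldots, \sigma_N^\ell)$. From (4.19) we get that
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + E\Big\langle\frac{\sigma^\sim\cdot\sigma^3}{N}\,\varepsilon^\sim\varepsilon_3 A\Big\rangle' \tag{4.21}
\]
where
\[
A = 1_{\{\sigma^1\cdot\sigma^2 \ge 3N/4\}}. \tag{4.22}
\]
The reason for this is that (for $N \ge 10$),
\[
\{A \ne A'\} \subset \Big\{\frac12 \le \frac{\rho^1\cdot\rho^2}{N+1} \le \frac45\Big\} \tag{4.23}
\]
and that integration on this latter set gives an exponentially small contribution by Theorem 2.1. Of course we also have
\[
\Big|\frac{\rho^\sim\cdot\rho^3}{N+1} - \frac{\sigma^\sim\cdot\sigma^3}{N}\Big| \le \frac{2}{N}. \tag{4.24}
\]
Using (4.14), we get from (4.21) that
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + E\,\frac{\langle f\,\mathrm{Av}\,\varepsilon^\sim\varepsilon_3\mathcal{E}_3\rangle}{Z^3} \tag{4.25}
\]
where $f(\rho^1,\rho^2) = A\,\sigma^\sim\cdot\sigma^3/N$. The reason that $f$ is before Av is that it does not depend upon $\varepsilon_1, \varepsilon_2, \varepsilon_3$. We observe the crucial identity
\[
\varepsilon^\sim\exp\sum_{\ell\le2}\varepsilon_\ell\big(g(\sigma^\ell) + h'\big) = \varepsilon^\sim\exp\frac12\varepsilon^\sim\big(g(\sigma^1) - g(\sigma^2)\big) \tag{4.26}
\]
that should be obvious by distinguishing the cases $\varepsilon_2 = \varepsilon_1$ and $\varepsilon_2 = -\varepsilon_1$. Thus, we get from (4.25) that
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + E\,\frac{1}{Z^3}\langle f\,\mathrm{Av}\,\varepsilon^\sim\varepsilon_3\mathcal{E}^\sim\rangle \tag{4.27}
\]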
where
\[
\mathcal{E}^\sim = \exp\Big(\frac12\varepsilon^\sim\big(g(\sigma^1) - g(\sigma^2)\big) + \varepsilon_3 g(\sigma^3) + \varepsilon_3 h'\Big).
\]
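The identity (4.26) involves only the two signs $\varepsilon_1, \varepsilon_2$, so it can be checked by enumerating the four cases. The sketch below does this for arbitrary real values standing in for $g(\sigma^1)$, $g(\sigma^2)$ and $h'$; the function name `check_identity_4_26` is hypothetical.

```python
import itertools
import math


def check_identity_4_26(g1, g2, h_prime):
    """Check (4.26) for all four choices of (eps_1, eps_2) in {-1, 1}^2.

    g1, g2 stand for g(sigma^1), g(sigma^2) and h_prime for h'."""
    for e1, e2 in itertools.product((-1, 1), repeat=2):
        e_tilde = e1 - e2  # this is eps^~; it vanishes when eps_1 = eps_2
        lhs = e_tilde * math.exp(e1 * (g1 + h_prime) + e2 * (g2 + h_prime))
        rhs = e_tilde * math.exp(0.5 * e_tilde * (g1 - g2))
        if not math.isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12):
            return False
    return True
```

When $\varepsilon_1 = \varepsilon_2$ both sides vanish, and when $\varepsilon_2 = -\varepsilon_1$ the field $h'$ cancels in the exponent, which is why the right-hand side no longer depends on $h'$.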
As we will detail below it would be easy to study $U_{N+1}(\beta',h')$ if the r.v. $g(\sigma^1) - g(\sigma^2)$ were independent of all the other r.v. involved. We will use the ubiquitous idea of the present paper: to move from the situation we want to study to a simpler situation along a well chosen path. To do this, we introduce a new Gaussian family $(g(\sigma^1,\sigma^2))$ of r.v. that is independent of all other r.v. already considered, and that has the same joint distribution as the family $(g(\sigma^1) - g(\sigma^2))$ as $\sigma^1, \sigma^2$ range over $\Sigma_N^2$. We set
\[
g_\theta(\sigma^1,\sigma^2) = \cos\theta\, g(\sigma^1,\sigma^2) + \sin\theta\,\big(g(\sigma^1) - g(\sigma^2)\big), \tag{4.28}
\]
\[
\mathcal{E}_\theta = \exp\Big(\frac12\varepsilon^\sim g_\theta(\sigma^1,\sigma^2) + \varepsilon_3 g(\sigma^3) + \varepsilon_3 h'\Big), \tag{4.29}
\]
\[
\psi(\theta) = E\,\frac{\langle f\,\mathrm{Av}\,\varepsilon^\sim\varepsilon_3\mathcal{E}_\theta\rangle}{Z^3}. \tag{4.30}
\]
Thus
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + \psi\Big(\frac{\pi}{2}\Big). \tag{4.31}
\]

Lemma 4.3. We have
\[
\psi(0) = 0. \tag{4.32}
\]
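Proof. Let us denote by $E_0$ expectation in the variables $g(\sigma^1,\sigma^2)$ only. As these variables occur only in $\mathcal{E}_0$, and are independent of all the other variables, we get
\[
\psi(0) = E\,\frac{\langle f\,\mathrm{Av}\,\varepsilon^\sim\varepsilon_3 E_0\mathcal{E}_0\rangle}{Z^3}.
\]
Now
\[
E_0\mathcal{E}_0 = \exp\Big(\frac18(\varepsilon^\sim)^2 E_0 g^2(\sigma^1,\sigma^2) + \varepsilon_3 g(\sigma^3) + \varepsilon_3 h'\Big) \tag{4.33}
\]
so that $\mathrm{Av}\,\varepsilon^\sim\varepsilon_3 E_0\mathcal{E}_0 = 0$.

Combining (4.31), (4.32), and
\[
\psi\Big(\frac{\pi}{2}\Big) = \psi(0) + \int_0^{\pi/2}\psi'(\theta)\,d\theta, \tag{4.34}
\]
we see that
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + \frac{\pi}{2}\sup_\theta|\psi'(\theta)|. \tag{4.35}
\]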
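Lemma 4.4. We have
\[
|\psi'(\theta)| \le \frac{K}{N} + L\beta^2p^2\, E\,\frac{\langle f'\,\mathrm{Av}(\varepsilon^\sim)^2\mathcal{E}_\theta\rangle}{Z^3}, \tag{4.36}
\]
where
\[
f' = A\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2.
\]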
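Proof. We write
\[
g'_\theta = g'_\theta(\sigma^1,\sigma^2) := \frac{d}{d\theta}g_\theta(\sigma^1,\sigma^2) = -\sin\theta\, g(\sigma^1,\sigma^2) + \cos\theta\,\big(g(\sigma^1) - g(\sigma^2)\big) \tag{4.37}
\]
so that
\[
\psi'(\theta) = \frac12 E\,\frac{\langle \mathrm{Av}\, f\,\varepsilon^\sim\cdot\frac12\varepsilon^\sim\cdot 2\,\varepsilon_3\, g'_\theta\,\mathcal{E}_\theta\rangle}{Z^3} = \frac12 E\,\frac{\langle \mathrm{Av}\, f(\varepsilon^\sim)^2\varepsilon_3\, g'_\theta\,\mathcal{E}_\theta\rangle}{Z^3}. \tag{4.38}
\]
To make sense of this formula, we have to integrate by parts, using the formula
\[
E\,gF(g_1,\ldots,g_m) = \sum_{i\le m}E(gg_i)\,E\frac{\partial F}{\partial x_i}(g_1,\ldots,g_m) \tag{4.39}
\]
true for jointly Gaussian r.v. $(g, g_1, \ldots, g_m)$ and well behaved $F$. Observing that
\[
E g_\theta g'_\theta = 0, \tag{4.40}
\]
integration by parts of (4.38) yields, after a straightforward (but tedious) computation
\[
\psi'(\theta) = \mathrm{I} + \mathrm{II}, \tag{4.41}
\]
for
\[
\mathrm{I} = \frac12 E\,\frac{\langle f\,\mathrm{Av}(\varepsilon^\sim)^2 E(g'_\theta g(\sigma^3))\mathcal{E}_\theta\rangle}{Z^3}, \tag{4.42}
\]
\[
\mathrm{II} = -\frac32 E\,\frac{\langle f\,\mathrm{Av}(\varepsilon^\sim)^2\varepsilon_3\varepsilon_4 E(g'_\theta g(\sigma^4))\mathcal{E}_\theta\mathcal{E}'\rangle}{Z^4},
\]
where $\mathcal{E}' = \exp\varepsilon_4(g(\sigma^4) + h')$. Now, using (4.37) and Lemma 4.2, we get
\[
\begin{aligned}
|E g'_\theta g(\sigma^4)| &= |\cos\theta|\,|Eg(\sigma^1)g(\sigma^4) - Eg(\sigma^2)g(\sigma^4)| \\
&\le \frac{K}{N} + \frac{\beta^2p}{2}|R_{14}^{p-1} - R_{24}^{p-1}| \\
&\le \frac{K}{N} + \frac{\beta^2p^2}{2}|R_{14} - R_{24}| = \frac{K}{N} + \frac{\beta^2p^2}{2}\Big|\frac{\sigma^\sim\cdot\sigma^4}{N}\Big|.
\end{aligned} \tag{4.43}
\]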
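Thus
\[
|\mathrm{II}| \le \frac{K}{N} + \beta^2p^2\, E\,\frac{1}{Z^4}\Big\langle A\,\Big|\frac{\sigma^\sim\cdot\sigma^3}{N}\Big|\,\Big|\frac{\sigma^\sim\cdot\sigma^4}{N}\Big|\,\mathrm{Av}(\varepsilon^\sim)^2\mathcal{E}_\theta\mathcal{E}'\Big\rangle.
\]
Writing
\[
\Big|\frac{\sigma^\sim\cdot\sigma^3}{N}\Big|\,\Big|\frac{\sigma^\sim\cdot\sigma^4}{N}\Big| \le \Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2 + \Big(\frac{\sigma^\sim\cdot\sigma^4}{N}\Big)^2,
\]
we obtain
\[
|\mathrm{II}| \le \frac{K}{N} + L\beta^2p^2\, E\,\frac{1}{Z^3}\langle f'\,\mathrm{Av}(\varepsilon^\sim)^2\mathcal{E}_\theta\rangle.
\]
Proceeding in a similar way for I finishes the proof.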
Lemma 4.5. If $\beta \le 2^{p/L}$, we have in fact for $N$ large enough that
\[
|\psi'(\theta)| \le \frac{K}{N} + L\beta^2p^2\, E\Big\langle\Big(\frac{\rho^\sim\cdot\rho^3}{N+1}\Big)^2(\varepsilon^\sim)^2A'\Big\rangle'. \tag{4.44}
\]
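Proof. As in the proof of Lemma 4.3 we can replace $\mathcal{E}_\theta$ by $E_0\mathcal{E}_\theta$ in (4.36). Now
\[
E_0\mathcal{E}_\theta = \exp\Big(\frac18(\varepsilon^\sim)^2\cos^2\theta\, E_0\big(g^2(\sigma^1,\sigma^2)\big) + \frac12\varepsilon^\sim\sin\theta\,\big(g(\sigma^1) - g(\sigma^2)\big) + \varepsilon_3 g(\sigma^3) + \varepsilon_3 h'\Big).
\]
We have, using Lemma 4.2,
\[
E_0\big(g^2(\sigma^1,\sigma^2)\big) = E\big(g(\sigma^1) - g(\sigma^2)\big)^2 \le \frac{K}{N} + \beta^2p\,(1 - R_{12}^{p-1}).
\]
Thus, from (4.36) we get
\[
|\psi'(\theta)| \le \frac{K}{N} + L\beta^2p^2\, E\,\frac{1}{Z^3}\Big\langle f'\,\mathrm{Av}(\varepsilon^\sim)^2\exp\Big(\frac{K}{N} + \beta^2p\,(1 - R_{12}^{p-1})\Big)\mathcal{E}''\Big\rangle \tag{4.45}
\]
where
\[
\mathcal{E}'' = \exp\Big(\frac12\varepsilon^\sim\sin\theta\,\big(g(\sigma^1) - g(\sigma^2)\big) + \varepsilon_3 g(\sigma^3) + \varepsilon_3 h'\Big).
\]
In (4.45), the factor $A$ occurring in $f'$ means that we integrate only over the pairs of configurations $(\sigma^1,\sigma^2)$ for which $R_{12} \ge 1/2$. It follows from Theorem 2.1 that the contribution of the set of pairs $(\sigma^1,\sigma^2)$ for which $1/2 \le R_{12} \le 1 - 2^{-p/L}$ is at most $K\exp(-N/K)$. When $R_{12} \ge 1 - 2^{-p/L}$, we have
\[
\beta^2p\,(1 - R_{12}^{p-1}) \le \beta^2p\,\big(1 - (1 - 2^{-p/L})^{p-1}\big) \le 1 \tag{4.46}
\]
provided $p \ge L$ and $\beta \le 2^{p/L}$. Thus for $N$ large enough we have from (4.36) that
\[
|\psi'(\theta)| \le \frac{K}{N} + L\beta^2p^2\, E\,\frac{\langle f'\,\mathrm{Av}(\varepsilon^\sim)^2\mathcal{E}''\rangle}{Z^3}. \tag{4.47}
\]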
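Now we observe the following relations, where we write $g^\sim = g(\sigma^1) - g(\sigma^2)$
\[
\begin{aligned}
\mathrm{Av}(\varepsilon^\sim)^2\exp\Big(\frac12\varepsilon^\sim\sin\theta\, g^\sim\Big) &= \mathrm{Av}(\varepsilon^\sim)^2\operatorname{ch}\Big(\frac12\varepsilon^\sim\sin\theta\, g^\sim\Big) \\
&\le \mathrm{Av}(\varepsilon^\sim)^2\operatorname{ch}\Big(\frac12\varepsilon^\sim g^\sim\Big) \\
&= \mathrm{Av}(\varepsilon^\sim)^2\exp\Big(\frac12\varepsilon^\sim g^\sim\Big),
\end{aligned} \tag{4.48}
\]
where in the second line we use that $\operatorname{ch}x \le \operatorname{ch}y$ if $|x| \le |y|$. Moreover, from (4.26) we have
\[
(\varepsilon^\sim)^2\exp\Big(\frac12\varepsilon^\sim g^\sim\Big) = (\varepsilon^\sim)^2\exp\Big(\sum_{\ell\le2}\varepsilon_\ell\big(g(\sigma^\ell) + h'\big)\Big).
\]
Combining with (4.48) we get from (4.47) that
\[
|\psi'(\theta)| \le \frac{K}{N} + L\beta^2p^2\, E\,\frac{\langle f'\,\mathrm{Av}(\varepsilon^\sim)^2\mathcal{E}_3\rangle}{Z^3} = \frac{K}{N} + L\beta^2p^2\, E\langle f'(\varepsilon^\sim)^2\rangle'
\]
using (4.14). To finish the proof it remains to show that we can replace $A$ by $A'$, $(\sigma^\sim\cdot\sigma^3/N)^2$ by $(\rho^\sim\cdot\rho^3/(N+1))^2$; we use (4.23), (4.24) for this purpose.

Proof of Theorem 4.1. From (4.35), (4.44) we have for $N$ large enough
\[
U_{N+1}(\beta',h') \le \frac{K}{N} + L\beta^2p^2\, E\Big\langle\Big(\frac{\rho^\sim\cdot\rho^3}{N+1}\Big)^2(\varepsilon^\sim)^2A'\Big\rangle',
\]
and changing back $N+1$ into $N$ and $\beta'$ into $\beta$,
\[
U_N(\beta,h') \le \frac{K}{N} + L\beta^2p^2\, E\Big\langle\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2(\sigma_N^\sim)^2A\Big\rangle.
\]
Using symmetry between sites,
\[
U_N(\beta,h') \le \frac{K}{N} + L\beta^2p^2\, E\Big\langle\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2\frac1N\sum_{i\le N}(\sigma_i^\sim)^2A\Big\rangle. \tag{4.49}
\]
Now we observe that
\[
\sum_{i\le N}(\sigma_i^\sim)^2 = 2N - 2\sum_{i\le N}\sigma_i^1\sigma_i^2 = 2N(1 - R_{12})
\]
so that (4.49) implies
\[
U_N(\beta,h') \le \frac{K}{N} + L\beta^2p^2\, E\Big\langle(1 - R_{12})A\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2\Big\rangle. \tag{4.50}
\]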
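The passage from (4.49) to (4.50) rests on the elementary identity $\sum_{i\le N}(\sigma_i^\sim)^2 = 2N(1 - R_{12})$, valid for any two configurations in $\{-1,1\}^N$. A small sketch checking it on given configurations follows; the function names are hypothetical.

```python
def overlap(s1, s2):
    """R_12 = (1/N) * sum_i s1[i] * s2[i] for two +-1 configurations."""
    return sum(a * b for a, b in zip(s1, s2)) / len(s1)


def squared_difference_sum(s1, s2):
    """sum_i (s1[i] - s2[i])^2, i.e. the sum of the (sigma_i^~)^2."""
    return sum((a - b) ** 2 for a, b in zip(s1, s2))
```

For example, for two configurations of length $N$ that differ in exactly $r$ sites, both `squared_difference_sum` and $2N(1 - R_{12})$ equal $4r$.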
As we used in the proof of Lemma 4.5, the factor $A$ means that we integrate over the pairs of configurations $\sigma^1, \sigma^2$ for which $R_{12} \ge 1/2$. As the contribution of these pairs for which $1/2 \le R_{12} \le 1 - 2^{-p/L}$ is exponentially small from Theorem 2.1, we get
\[
U_N(\beta,h') \le \frac{K}{N} + L\beta^2p^2\,2^{-p/L}\, E\Big\langle A\Big(\frac{\sigma^\sim\cdot\sigma^3}{N}\Big)^2\Big\rangle \le \frac{K}{N} + L\beta^2p^2\,2^{-p/L}\,U_N(\beta,h). \tag{4.51}
\]
For $\beta \le 2^{p/L}$, $p \ge L$, this yields
\[
U_N(\beta,h) \le \frac{K}{N} + \frac12U_N(\beta,h) \tag{4.52}
\]
so that $U_N(\beta,h) \le K/N$.

Since we have proved that the lumps are pure states, this is how we will call them. The “pure states” will refer to the sets $C_\alpha$.

5. Orthogonality in the Absence of External Field

In this section we use the special symmetries that exist only when $h = 0$ to show in this case that the overlap of two configurations that are not in the same pure state is essentially zero (for $p$ odd).

Theorem 5.1. If $h = 0$, $\beta \le 2^{p/L}$, then
\[
E\langle R_{12}^2\,1_{\{|R_{12}|\le3/4\}}\rangle \le \frac{K}{N}. \tag{5.1}
\]
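In a smaller range of $\beta$ this is proved in [10], but it seems useful to give a streamlined proof here. This proof is close in spirit to the proof of Theorem 4.1. We consider the quantity
\[
V_N(\beta) = E\langle R_{12}^2\,1_{\{|R_{12}|\le3/4\}}\rangle. \tag{5.2}
\]
Mimicking the beginning of the proof of Theorem 4.1, we obtain
\[
V_{N+1}(\beta') \le \frac{K}{N} + E\,\frac{\langle f\,\mathrm{Av}\,\varepsilon_1\varepsilon_2\mathcal{E}_2\rangle}{Z^2}, \tag{5.3}
\]
where $Z = \langle\mathrm{Av}\,\mathcal{E}\rangle$, $f = R_{12}\,1_{\{|R_{12}|\le3/4\}}$ and
\[
\mathcal{E}_2 = \exp\Big(\sum_{\ell\le2}\varepsilon_\ell g(\sigma^\ell)\Big); \qquad \mathcal{E} = \exp\varepsilon g(\sigma). \tag{5.4}
\]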
In the spirit of the proof of Theorem 4.1, the idea is to replace in (5.4) the Gaussian process $(g(\sigma))$ by an independent process $(g'(\sigma))$ we understand better and to estimate the error induced by this replacement. Before making a specific choice it is probably clearer to explain the method in greater generality. We introduce a parameter $0 < t < 1$ and we set
\[
g_t(\sigma) = \sqrt{t}\,g(\sigma) + \sqrt{1-t}\,g'(\sigma). \tag{5.5}
\]
We consider the probability measure $\langle\cdot\rangle_t$ on $\Sigma_{N+1}$ such that, for a function $f'$ on $\Sigma_{N+1}$, we have
\[
\langle f'\rangle_t = \frac{\langle\mathrm{Av}\,f'\mathcal{E}_t\rangle}{Z_t}, \tag{5.6}
\]
where $\mathcal{E}_t = \exp\varepsilon g_t(\sigma)$, where Av is the usual average over $\varepsilon = \pm1$ and where $Z_t = \langle\mathrm{Av}\,\mathcal{E}_t\rangle$. Thus $\langle\cdot\rangle_1 = \langle\cdot\rangle'$. We also denote by $\langle\cdot\rangle_t$ the powers of $\langle\cdot\rangle_t$, so that if $f'$ is now a function on $\Sigma_{N+1}^2$,
\[
\langle f'\rangle_t = \frac{\langle\mathrm{Av}\,f'\mathcal{E}_{2,t}\rangle}{Z_t^2} \tag{5.7}
\]
where $\mathcal{E}_{2,t} = \exp\big(\sum_{\ell\le2}\varepsilon_\ell g_t(\sigma^\ell)\big)$. The basis of the method is the following identity.

Lemma 5.2. If we assume
\[
\forall\sigma, \quad Eg^2(\sigma) = E(g'(\sigma)^2), \tag{5.8}
\]
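and if we set
\[
\psi(t) = E\langle f'\rangle_t \tag{5.9}
\]
then
\[
\psi'(t) = E\langle\Delta_{1,2}\varepsilon_1\varepsilon_2f'\rangle_t - 2\sum_{\ell=1,2}E\langle\Delta_{\ell,3}\varepsilon_\ell\varepsilon_3f'\rangle_t + 3E\langle\Delta_{3,4}\varepsilon_3\varepsilon_4f'\rangle_t \tag{5.10}
\]
where
\[
\Delta_{\ell,\ell'} = Eg(\sigma^\ell)g(\sigma^{\ell'}) - Eg'(\sigma^\ell)g'(\sigma^{\ell'}).
\]
In the right-hand side of (5.10), the second term is an integral over 3 replicas and the third term is an integral over 4 replicas.

Proof. We write
\[
g'_t(\sigma) := \frac{d}{dt}g_t(\sigma) = \frac{1}{2\sqrt{t}}g(\sigma) - \frac{1}{2\sqrt{1-t}}g'(\sigma) \tag{5.11}
\]
so that (5.7) yields
\[
\psi'(t) = E\,\frac{1}{Z_t^2}\Big\langle\mathrm{Av}\,f'\sum_{\ell\le2}\varepsilon_\ell g'_t(\sigma^\ell)\mathcal{E}_{2,t}\Big\rangle - 2E\,\frac{1}{Z_t^3}\langle\mathrm{Av}\,f'\mathcal{E}_{2,t}\rangle\langle\mathrm{Av}\,\varepsilon_3g'_t(\sigma^3)\mathcal{E}_t\rangle. \tag{5.12}
\]
We then observe from (5.5) and (5.11), since the processes $(g(\sigma))$ and $(g'(\sigma))$ are independent, that
\[
Eg'_t(\sigma^\ell)g_t(\sigma^{\ell'}) = \frac12\Delta_{\ell,\ell'}.
\]
To obtain (5.10) from (5.12), it suffices to perform (a tedious) integration by parts.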
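In our specific case we consider a process $g'(\sigma)$ such that
\[
\sigma, \sigma' \in B_\alpha \;\Rightarrow\; Eg'(\sigma)g'(\sigma') = Eg(\sigma)g(\sigma'), \tag{5.13}
\]
\[
\sigma \in B_\alpha,\ \sigma' \in B_\gamma,\ \alpha \ne \gamma \;\Rightarrow\; Eg'(\sigma)g'(\sigma') = 0, \tag{5.14}
\]
where the sets $B_\alpha$ are provided by Theorem 3.2. The existence of the process $(g'(\sigma))_{\sigma\in\Sigma_N}$ is obvious by “gluing together independent copies of the process $(g'(\sigma))_{\sigma\in B_\alpha}$”. We then consider the function $\psi(t)$ given by (5.9) where $f' = f\varepsilon_1\varepsilon_2$.

Lemma 5.3. We have $\psi(0) = 0$.

Proof. This is where special symmetries are important. We have
\[
\psi(0) = E\,\frac{\langle f\varepsilon_1\varepsilon_2\mathcal{E}_{2,0}\rangle}{Z_0^2} = \sum_{\alpha,\gamma}EW_{\alpha,\gamma} \tag{5.15}
\]
where
\[
W_{\alpha,\gamma} = E_g\,\frac{1}{Z_0^2}\big\langle f\operatorname{sh}(g'(\sigma^1))\operatorname{sh}(g'(\sigma^2))\,1_{B_\alpha}(\sigma^1)1_{B_\gamma}(\sigma^2)\big\rangle,
\]
and where $E_g$ denotes expectation in the variables $(g'(\sigma))$ only. If $\alpha = \gamma$, $W_{\alpha,\gamma}$ is zero because $A := 1_{\{|R_{12}|\le1/2\}}$ is zero on $B_\alpha^2$. If $\alpha \ne \gamma$, we observe that the process $(g(\sigma))$ given by $g(\sigma) = g'(\sigma)$ if $\sigma \notin B_\gamma$ and $g(\sigma) = -g'(\sigma)$ if $\sigma \in B_\gamma$ has the same law as the process $(g'(\sigma))$. Since $Z_0 = \langle\operatorname{ch}g'(\sigma)\rangle$, replacing $(g'(\sigma))$ by $(g(\sigma))$ amounts to replacing $W_{\alpha,\gamma}$ by $-W_{\alpha,\gamma}$. On the other hand, it does not change $W_{\alpha,\gamma}$ because this quantity depends only on the law of $(g'(\sigma))$. Thus $W_{\alpha,\gamma} = 0$.

Lemma 5.4. We have
\[
|\psi'(t)| \le \frac{K}{N} + L\beta^2p\,2^{-p}\,E\langle R_{12}^2A\rangle_t, \tag{5.16}
\]
where $A = 1_{\{|R_{12}|\le1/2\}}$.

Proof. When $|f'| \le 1$, (5.10) and the symmetry between replicas imply that
\[
|\psi'(t)| \le 6E\langle|\Delta_{1,2}|\rangle_t.
\]
Now, by definition of $g'$
\[
\begin{aligned}
\Delta_{1,2} &= 0 && \text{if } \sigma^1, \sigma^2 \in B_\alpha, \\
\Delta_{1,2} &= Eg(\sigma^1)g(\sigma^2) && \text{if } \sigma^1 \in B_\alpha,\ \sigma^2 \in B_\gamma,\ \alpha \ne \gamma.
\end{aligned} \tag{5.17}
\]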
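Thus, using Lemma 4.2,
\[
|\Delta_{1,2}| \le \frac{K}{N} + \beta^2p\,|R_{12}|^{p-1}A + \beta^2p\,1_D, \tag{5.18}
\]
where
\[
D = \Big\{|R_{12}| \ge \frac12\Big\} \setminus \bigcup_{\alpha\ge1}B_\alpha^2.
\]
We note that $|R_{12}^{p-1}A| \le R_{12}^2\,2^{-p+3}$, so that (5.17), (5.18), (3.12) imply (5.16).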
Lemma 5.5. We have
\[
\Big|\frac{d}{dt}E\langle R_{12}^2A\rangle_t\Big| \le \frac{K}{N} + L\beta^2p\,2^{-p}\,E\langle R_{12}^2A\rangle_t. \tag{5.19}
\]

Proof. We use (5.10) with $f' = AR_{12}^2$ and we repeat the proof of Lemma 5.4 verbatim.
K 2 + LEhR12 Ai0 N
provided β ≤ 2−p/L , and we repeat the proof of Theorem 4.1. Namely, we deduce from (5.16) that |ψ 0 (t)| ≤
K 2 + Lβ 2 p2 2−p EhR12 Ai0 N
and since ψ(1) ≤ ψ(0) + sup|ψ 0 (t)| we deduce that, if β ≤ 2p/L , 2 Ai0 ≤ EhR12 2 Ai0 ≤ so that EhR12
K N
1 K 2 + EhR12 Ai0 , N 2
and thus * 2 +0 K %1 · %3 , A0 ≤ E N +1 N
which means VN +1 (β 0 , h0 ) ≤ K/N . 6. The Ghirlanda Guerra Relations and the Poisson Dirichlet Distribution In this section we recall the fundamental Ghirlanda–Guerra identities. (See also [12] for related ideas.) We explain what is the Poisson–Dirichlet distribution and its deep connections with the Ghirlanda–Guerra identities. We then prove some basic formulas needed to perform precise computations with the cavity method. The Ghirlanda–Guerra relations are based on (1.10) and integration by parts, and it is somewhat mysterious that such a simple argument has such strong consequences. We feel that it is necessary to repeat the proofs given in [13] because
we will handle some of the secondary technical issues in a different manner. In particular we do not want to take subsequences, but we want to prove results for every $N$. We recall that $H_{N,0} = H_{N,0}(\sigma)$ is given by (1.4). The basic fact is that the quantity
\[
I_N = I_N(\beta,h) = E\Big\langle\Big(\frac1NH_{N,0} - E\Big\langle\frac1NH_{N,0}\Big\rangle\Big)^2\Big\rangle = E\Big\langle\Big(\frac1NH_{N,0}\Big)^2\Big\rangle - \Big(E\Big\langle\frac1NH_{N,0}\Big\rangle\Big)^2 \tag{6.1}
\]
is small for most values of $\beta$. It does not seem possible with this method to rule out the existence of exceptional values of $\beta$ for which $I_N$ would be of order 1. (Hence we are obliged to average over $\beta$ in Theorem 1.1.) Our first task is to show that $I_N$ is small “most of the time”. For this purpose, it is more convenient to consider $\beta$ and $h' = \beta h$ as independent parameters. (The notation $h' = \beta h$ will be used throughout much of the rest of the paper.)

Proposition 6.1. For each $h'$, $B_0$, we have
\[
\int_0^{B_0}I_N(\beta,h')\,d\beta \le \frac{K}{\sqrt{N}}. \tag{6.2}
\]
We believe, but cannot prove, that the term $\sqrt{N}$ can be replaced by $N$. (This would give some hope of getting closer to the correct rate of convergence.) We set
\[
p_N(\beta,h') = \frac1NE\log Z_N(\beta,h').
\]
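Proof. We observe that
\[
\frac1N\frac{\partial}{\partial\beta}\log Z_N(\beta,h') = \Big\langle-\frac{H_{N,0}}{N}\Big\rangle, \qquad \frac1N\frac{\partial^2}{\partial\beta^2}\log Z_N(\beta,h') = N\Big(\Big\langle\Big(\frac{H_{N,0}}{N}\Big)^2\Big\rangle - \Big\langle\frac{H_{N,0}}{N}\Big\rangle^2\Big),
\]
so that
\[
\frac{\partial^2}{\partial\beta^2}p_N(\beta,h') = NE\Big(\Big\langle\Big(\frac{H_{N,0}}{N}\Big)^2\Big\rangle - \Big\langle\frac{H_{N,0}}{N}\Big\rangle^2\Big),
\]
and by integration,
\[
\int_0^{B_0}\Big(E\Big\langle\Big(\frac{H_{N,0}}{N}\Big)^2\Big\rangle - E\Big\langle\frac{H_{N,0}}{N}\Big\rangle^2\Big)d\beta \le \frac{K}{N}.
\]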
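It remains to prove that
\[
\int_0^{B_0}\Big(E\Big\langle\frac{H_{N,0}}{N}\Big\rangle^2 - \Big(E\Big\langle\frac{H_{N,0}}{N}\Big\rangle\Big)^2\Big)d\beta \le \frac{K}{\sqrt{N}}. \tag{6.3}
\]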
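This is a consequence of the fact that, since the random convex function $\frac1N\log Z_N(\beta,h')$ has small fluctuations (see Proposition 2.9), so does its derivative. See [10], Proposition 4.3, with $v = \sqrt{N}$.

We now consider a function $f$ on $k$ replicas, that is, $f : \Sigma_N^k \to \mathbb{R}$, and we assume that $f$ does not depend upon the disorder. We can also view $f$ as a function on $\Sigma_N^{k+1}$. We write $R_{1,\ell} = R(\sigma^1,\sigma^\ell)$.

Theorem 6.2 (The Ghirlanda–Guerra identities). If $|f| \le 1$ we have
\[
E\langle R_{1,k+1}^pf\rangle = \frac1kE\langle R_{12}^p\rangle E\langle f\rangle + \frac1k\sum_{2\le\ell\le k}E\langle R_{1,\ell}^pf\rangle + \delta \tag{6.4}
\]
where
\[
|\delta| \le \frac{K}{\beta}I_N(\beta,h')^{1/2} + \frac1N. \tag{6.5}
\]

Proof. Using Cauchy–Schwarz, and since $|f| \le 1$, we have
\[
E\Big\langle\frac{H_{N,0}(\sigma^1)}{N}f\Big\rangle = E\Big\langle\frac{H_{N,0}}{N}\Big\rangle E\langle f\rangle + \delta, \tag{6.6}
\]
where $|\delta| \le I_N(\beta,h')^{1/2}$. We then integrate by parts to see that
\[
\begin{aligned}
E\Big\langle-\frac{H_{N,0}}{N}\Big\rangle &= \frac1N\Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{i_1<\cdots<i_p}Eg_{i_1\cdots i_p}\langle\sigma_{i_1}\cdots\sigma_{i_p}\rangle \\
&= \frac{\beta p!}{2N^p}\sum_{i_1<\cdots<i_p}E\big(\langle(\sigma_{i_1}\cdots\sigma_{i_p})^2\rangle - \langle\sigma_{i_1}\cdots\sigma_{i_p}\rangle^2\big) \\
&= \frac{\beta p!}{2N^p}\sum_{i_1<\cdots<i_p}E\big(1 - \langle\sigma_{i_1}^1\sigma_{i_1}^2\cdots\sigma_{i_p}^1\sigma_{i_p}^2\rangle\big)
\end{aligned} \tag{6.7}
\]
and thus (see the proof of Lemma 2.4)
\[
\Big|E\Big\langle-\frac{H_{N,0}}{N}\Big\rangle - \frac{\beta}{2}\big(1 - E\langle R_{12}^p\rangle\big)\Big| \le \frac{K}{N}. \tag{6.8}
\]
In a similar fashion
\[
\Big|E\Big\langle-\frac{H_{N,0}(\sigma^1)}{N}f\Big\rangle - \frac{\beta}{2}\Big(E\langle f\rangle + \sum_{2\le\ell\le k}E\langle R_{1,\ell}^pf\rangle - kE\langle R_{1,k+1}^pf\rangle\Big)\Big| \le \frac{K}{N}. \tag{6.9}
\]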
The result follows from (6.8) and (6.9).

Use of the relation (6.4) will always produce an error term $\delta$. We do not know that this error term goes to zero as $N \to \infty$ at a given value of $\beta$, but we know that it goes to zero if one averages over $\beta$ in an interval, at $h'$ fixed. Throughout the paper, $\delta$ will denote such an error term, that may be different at each occurrence.

We now turn to the study of the Poisson–Dirichlet distribution $\Lambda_m = PD(m,0)$. This is a distribution on the set of non-increasing sequences $(v_\alpha)$, $v_\alpha \ge 0$, $\sum_{\alpha\ge1}v_\alpha = 1$. It is best defined as follows. Consider a Poisson point process on $\mathbb{R}^+$, of intensity measure $u^{-1-m}du$. A realization of this process creates a family of points in $\mathbb{R}^+$, that can be labeled as $(u_\alpha)_{\alpha\ge1}$, $u_1 \ge u_2 \ge \ldots$. Since $\int_0^1u^{-m}du < \infty$, almost surely the sum $S = \sum_{\alpha\ge1}u_\alpha$ is finite; $\Lambda_m$ is the law of the sequence $(v_\alpha)$, where $v_\alpha = u_\alpha/S$.

The occurrence of this distribution in this topic is probably best understood through Derrida's Random Energy Model (REM). In this model the r.v. $(H_N(\sigma))$ are i.i.d. Gaussian, $EH_N^2(\sigma) = N/2$, so that
\[
EH_N(\sigma^1)H_N(\sigma^2) = \begin{cases}\dfrac{N}{2} & \text{if } \sigma^1 = \sigma^2, \\[1ex] 0 & \text{otherwise}.\end{cases} \tag{6.10}
\]
Comparing with (1.2) we can say that this is “the case $p = \infty$” of the $p$-spin interaction model (and this is how Derrida also invented the $p$-spin interaction model). The REM is a “toy model”, but it is very instructive. The expected number of values of $-H_N(\sigma)$ in an interval of length $dx$ around $x + a_N$, where $a_N = (N\log(2^N/\sqrt{N}))^{1/2}$, is about
\[
\frac{2^N\,dx}{\sqrt{\pi N}}\exp\Big(-\frac{(x+a_N)^2}{N}\Big) \simeq \frac{dx}{\sqrt{\pi}}\exp\big(-2x\sqrt{\log2}\big). \tag{6.11}
\]
The largest values among the numbers $(-H_N(\sigma))$ asymptotically resemble the numbers $x_\alpha + a_N$, where $x_1 \ge \cdots$ are generated by a Poisson point process of intensity measure $\nu$ given by
\[
d\nu(x) = \frac{dx}{\sqrt{\pi}}\exp\big(-2x\sqrt{\log2}\big).
\]
(The simple details are left to the reader.)
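The Poisson point process description of $\Lambda_m$ given above lends itself to simulation. The sketch below is a truncated approximation, not an exact sampler: it uses the standard fact that if $\Gamma_1 < \Gamma_2 < \cdots$ are the arrival times of a rate-one Poisson process, then the points $u_\alpha = (m\Gamma_\alpha)^{-1/m}$ form a Poisson point process of intensity $u^{-1-m}du$, and it keeps only the `n_points` largest points. The function names and the parameter values are hypothetical. The classical identity $E\sum_\alpha v_\alpha^2 = 1 - m$ for $PD(m,0)$ serves as a sanity check.

```python
import random


def poisson_dirichlet_weights(m, n_points, rng):
    """Approximate sample of the weights (v_alpha) of Lambda_m, 0 < m < 1,
    truncated to the n_points largest atoms, in decreasing order."""
    points = []
    arrival = 0.0
    for _ in range(n_points):
        arrival += rng.expovariate(1.0)  # arrival times of a rate-1 Poisson process
        points.append((m * arrival) ** (-1.0 / m))  # decreasing points u_alpha
    total = sum(points)
    return [u / total for u in points]


def mean_sum_of_squares(m, n_points=400, n_samples=2000, seed=0):
    """Monte Carlo estimate of E sum_alpha v_alpha^2 under the truncation."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        weights = poisson_dirichlet_weights(m, n_points, rng)
        acc += sum(v * v for v in weights)
    return acc / n_samples
```

For $m = 1/2$ the estimate returned by `mean_sum_of_squares(0.5)` should be close to $1/2$, up to Monte Carlo and truncation error.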
The largest Boltzmann factors $\exp(-\beta H_N(\sigma))$ resemble the numbers $u_\alpha\exp\beta a_N$, where $u_1 \ge u_2 \ge \cdots$ are generated by a Poisson point process, the intensity measure of which is the image of $\nu$ under the map $x \to \exp\beta x$, so that this image measure has a density of the type $Cx^{-m-1}$, where
\[
m = \frac{2\sqrt{\log2}}{\beta}. \tag{6.12}
\]
This shows that, for $\beta > 2\sqrt{\log2}$, the sequence of the weights $G_N(\{\sigma\})$ (when ranked in decreasing order) has asymptotically a distribution $\Lambda_m$. The distribution $\Lambda_m$ is very well understood (see [14, 15]). Quite interestingly some properties crucial
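in the present work are a simple consequence of the previous description using the REM. First, it should be obvious that if $\beta > 2\sqrt{\log 2}$,
\[ E\Bigl\langle \Bigl( -\frac{H_N}{N} - \sqrt{\log 2} \Bigr)^2 \Bigr\rangle \to 0, \tag{6.13} \]
because only the values of $\sigma$ for which $-H_N/N \sim \sqrt{\log 2}$ are relevant for Gibbs' measure. In particular
\[ \lim_{N \to \infty} E\Bigl\langle -\frac{H_N}{N} \Bigr\rangle = \sqrt{\log 2}. \tag{6.14} \]
We integrate by parts, using now the formula $E g_1 u(g_2) = E g_1 g_2\, E u'(g_2)$ for $g_1 = H_N(\sigma^1)$, $g_2 = H_N(\sigma^2)$ and we get using (6.10) that
\[ E\Bigl\langle -\frac{H_N}{N} \Bigr\rangle = \frac{\beta}{N}\, E\bigl( \langle E H_N^2(\sigma) \rangle - \langle E(H_N(\sigma^1) H_N(\sigma^2)) \rangle \bigr) = \frac{\beta}{2} \bigl( 1 - E\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle \bigr) = \frac{\beta}{2} \Bigl( 1 - E \sum_{\sigma} G_N^2(\{\sigma\}) \Bigr), \tag{6.15} \]
and comparing with (6.13) we get in the limit that
\[ E \sum_{\alpha \ge 1} v_\alpha^2 = 1 - \frac{2\sqrt{\log 2}}{\beta} = 1 - m \tag{6.16} \]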
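(which is of course well known). Consider now a function $f$ on $k$ replicas, $|f| \le 1$. If we use (6.13) to mimic the proof of Theorem 6.2 (integrating now by parts using (6.10), as in (6.15)) we get
\[ E\langle 1_{\{\sigma^1 = \sigma^{k+1}\}} f \rangle = \frac{1}{k} T_N\, E\langle f \rangle + \frac{1}{k} \sum_{2 \le \ell \le k} E\langle 1_{\{\sigma^1 = \sigma^\ell\}} f \rangle + o(1), \tag{6.17} \]
where $T_N = E\langle 1_{\{\sigma^1 = \sigma^2\}} \rangle = E \sum_\sigma G_N^2(\{\sigma\})$. Above and below, $o(1)$ goes to zero as $N \to \infty$.

Consider an integer $n$, and for $s \le n$ consider $k_s$ different replicas $\sigma^{s,\ell}$, $\ell \le k_s$. Consider the function of $k = \sum_{s \le n} k_s$ replicas given by
\[ f = \prod_{s \le n} \prod_{1 \le \ell < k_s} 1_{\{\sigma^{s,\ell} = \sigma^{s,\ell+1}\}}. \]
Thus, if we denote by $(w_\alpha)_{\alpha \ge 1}$ the reordering of the weights $G_N(\{\sigma\})$, we have
\[ \langle f \rangle = \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{k_s}, \]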
and
\[ E\langle f \rangle = S_N(k_1, \ldots, k_n) := E \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{k_s}. \]
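The identity $\langle f \rangle = \prod_{s \le n} \sum_\alpha w_\alpha^{k_s}$ for the chain function $f$ can be checked by brute force on a small example. The sketch below is only an illustration: it replaces the Gibbs measure by an arbitrary probability vector on three configurations (a hypothetical choice), enumerates all $k$-tuples of independent replicas, and compares the average of $f$ with the product formula, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def chain_average(weights, ks):
    """Exact average of f over independent replicas drawn from `weights`.

    f is the indicator that, inside each block s of k_s consecutive
    replicas, all replicas coincide (the "chains" of the text).
    """
    k = sum(ks)
    total = Fraction(0)
    for config in product(range(len(weights)), repeat=k):
        pos, all_chains_constant = 0, True
        for k_s in ks:
            chain = config[pos:pos + k_s]
            if any(a != b for a, b in zip(chain, chain[1:])):
                all_chains_constant = False
                break
            pos += k_s
        if all_chains_constant:
            prob = Fraction(1)
            for idx in config:
                prob *= weights[idx]
            total += prob
    return total


def chain_formula(weights, ks):
    """Product over s of sum over configurations of w^(k_s)."""
    result = Fraction(1)
    for k_s in ks:
        result *= sum(w ** k_s for w in weights)
    return result


if __name__ == "__main__":
    w = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    print(chain_average(w, (2, 3)), chain_formula(w, (2, 3)))
```

The enumeration is exponential in $k$, so this is meant only for very small cases.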
We now want to apply (6.17) to this function $f$, where $\sigma^{1,1}$ plays the role of $\sigma^1$. We call $\sigma^{1,k_1+1}$ the new replica that occurs in the first term of (6.17), and we get
\[ k\, E\langle 1_{\{\sigma^{1,1} = \sigma^{1,k_1+1}\}} f \rangle = T_N\, E\langle f \rangle + \sum_{\rho} E\langle 1_{\{\sigma^{1,1} = \rho\}} f \rangle + o(1). \tag{6.18} \]
In this summation, $\rho$ is one of the replicas $\sigma^{1,\ell}$, $2 \le \ell \le k_1$ or $\sigma^{s,\ell}$, $s \ge 2$, $\ell \le k_s$. To understand what $1_{\{\sigma^{1,1} = \rho\}} f$ is, it might help to think of $\sigma^{s,1}, \ldots, \sigma^{s,k_s}$ as being linked in a chain that forces them to be all equal (otherwise $f = 0$). Multiplying by the term $1_{\{\sigma^{1,1} = \rho\}}$ changes nothing if $\rho = \sigma^{1,\ell}$, $\ell \le k_1$, but if $\rho = \sigma^{s,\ell}$, $s \ge 2$, it merges the first and the $s$th chain. Thus (6.18) reads as
\[ k\, S_N(k_1 + 1, k_2, \ldots, k_n) = (k_1 - 1 + T_N)\, S_N(k_1, k_2, \ldots, k_n) + k_2\, S_N(k_1 + k_2, k_3, \ldots, k_n) + k_3\, S_N(k_1 + k_3, k_2, k_4, \ldots, k_n) + \cdots + k_n\, S_N(k_1 + k_n, k_2, \ldots, k_{n-1}). \tag{6.19} \]
If we denote by $(v_\alpha)_{\alpha \ge 1}$ a sequence with $\Lambda_m$ distribution, and if we set
\[ S^{(m)}(k_1, \ldots, k_n) = E\Bigl( \prod_{s \le n} \sum_{\alpha \ge 1} v_\alpha^{k_s} \Bigr), \tag{6.20} \]
we get in the limit $N \to \infty$, using (6.16), the relation
\[ k\, S^{(m)}(k_1 + 1, k_2, \ldots, k_n) = (k_1 - m)\, S^{(m)}(k_1, \ldots, k_n) + k_2\, S^{(m)}(k_1 + k_2, k_3, \ldots, k_n) + \cdots + k_n\, S^{(m)}(k_1 + k_n, k_2, \ldots, k_{n-1}), \tag{6.21} \]
where, as in (6.19), $k = \sum_{s \le n} k_s$.
These relations hold for each $n$, each $k_1, \ldots, k_n \ge 1$. If one defines $S^{(m)}(1, \ldots, 1) = 1$, they determine by induction the quantities $S^{(m)}(k_1, \ldots, k_n)$. Even though it should certainly be easy to prove these relations using the methods of [15], they had apparently not been formulated previously.

To explain the importance of the relations (6.21) we will prove a simple fact that will guide the work of the next two sections. We assume $p$ odd for definiteness. Suppose that at a given $\beta$, we already know (which we do not yet) that, asymptotically, in the $p$-spin interaction model the overlaps take only two non-random values, $0$ and $q$. We define $m_N(\beta)$ by
\[ E\Bigl\langle -\frac{H_N}{N} \Bigr\rangle = \frac{\beta}{2} \bigl( 1 - q^p (1 - m_N(\beta)) \bigr). \tag{6.22} \]
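The induction defined by the relations (6.21) can be sketched as follows. This is an illustration, not part of the original text. It assumes the normalization with the factor $k = \sum_s k_s$ on the left-hand side, as in (6.19) and (6.23), and it uses the fact, obvious from (6.20), that $S^{(m)}$ is symmetric in its arguments, so that the recursion can always be applied to the largest argument. The function name `pd_moment` is hypothetical. Exact rational arithmetic is used so that the output can be compared with closed forms such as $S^{(m)}(2) = 1 - m$.

```python
from fractions import Fraction
from functools import lru_cache


def _canon(ks):
    """Sort the arguments in decreasing order (S is symmetric in them)."""
    return tuple(sorted(ks, reverse=True))


def pd_moment(ks, m):
    """Compute S^(m)(k_1, ..., k_n) from the recursion (6.21).

    Convention assumed here:
      k * S(k_1 + 1, k_2, ..., k_n)
        = (k_1 - m) * S(k_1, ..., k_n)
          + sum over s >= 2 of k_s * S(k_1 + k_s, the other k_j),
    with k = k_1 + ... + k_n and S(1, ..., 1) = 1.
    """
    m = Fraction(m)

    @lru_cache(maxsize=None)
    def S(key):
        if key[0] == 1:              # largest entry is 1: all entries are 1
            return Fraction(1)
        a, rest = key[0] - 1, key[1:]
        k = a + sum(rest)            # total degree before incrementing k_1
        total = (a - m) * S(_canon((a,) + rest))
        for i, k_s in enumerate(rest):
            merged = (a + k_s,) + rest[:i] + rest[i + 1:]
            total += k_s * S(_canon(merged))
        return total / k

    return S(_canon(ks))


if __name__ == "__main__":
    m = Fraction(1, 3)
    print(pd_moment((2,), m), pd_moment((3,), m), pd_moment((2, 2), m))
```

Each step lowers the total degree $\sum_s k_s$ by one, so the recursion terminates at $S^{(m)}(1, \ldots, 1) = 1$.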
We show now that Theorem 6.2 implies that asymptotically the distribution of the weights $w_\alpha$ of the sets $C_\alpha$ is close to $\Lambda_m$ for $m = m_N(\beta)$. To see this, we consider the function $f$ of replicas $\sigma^{s,\ell}$, $s \le n$, $\ell \le k_s$ given by
\[ f = \prod_{s \le n} \prod_{1 \le \ell < k_s} R^p(\sigma^{s,\ell}, \sigma^{s,\ell+1}), \]
where $R(\sigma, \sigma') = \sigma \cdot \sigma' / N$. By our assumption $R^p(\sigma^{s,\ell}, \sigma^{s,\ell+1})$ takes essentially only the values $q^p$ or $0$, and the former only if $\sigma^{s,\ell}$ and $\sigma^{s,\ell+1}$ belong to the same set $C_\alpha$. Thus
\[ E\langle f \rangle = q^{kp}\, S_N(k_1, \ldots, k_n), \]
where of course now
\[ S_N(k_1, \ldots, k_n) = E \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{k_s}. \]
It now follows from (6.4) that
\[ k\, S_N(k_1 + 1, k_2, \ldots, k_n) = (k_1 - m_N(\beta))\, S_N(k_1, k_2, \ldots, k_n) + k_2\, S_N(k_1 + k_2, k_3, \ldots, k_n) + \cdots + k_n\, S_N(k_1 + k_n, k_2, \ldots, k_{n-1}) + \delta, \tag{6.23} \]
where following our convention the error $\delta$ is small in average over $\beta$. Of course the small errors in (6.23) accumulate when one iterates this relation. Still, comparing (6.21) and (6.23), we see that given $k$, if $\sum_{s \le n} k_s \le k$ then
\[ |S_N(k_1, \ldots, k_n) - S^{(m_N(\beta))}(k_1, \ldots, k_n)| \to 0 \tag{6.24} \]
(at least after averaging over $\beta$).

Next, we observe that the law of a random sequence of weights $(w_\alpha)_{\alpha \ge 1}$ is determined by the quantities $S(k_1, \ldots, k_n)$ defined as in (6.20). This is because the random probability $\mu = \sum_{\alpha \ge 1} w_\alpha \delta_{w_\alpha}$ is determined by its moments $M_k = \int x^k\,d\mu(x) = \sum_{\alpha \ge 1} w_\alpha^{k+1}$, and the joint law of the random variables $(M_k)_{k \ge 1}$ is determined by the expectations $E(M_{k_1} \cdots M_{k_n})$. This, together with simple compactness arguments, shows that many properties of $\Lambda_{m_N(\beta)}$ will be asymptotically true under (6.24) for the distribution of the weights of the pure states. We end this section by several such properties that will be instrumental in the use of the cavity method.

We consider a pair of r.v. $U$, $V \ge 0$ and i.i.d. copies $(U_\alpha, V_\alpha)$ of this pair ($U$ and $V$ need not be independent). We assume that
\[ V \ge 1, \qquad E(U^2) < \infty. \tag{6.25} \]
These are not the minimal conditions, but they will be satisfied in our applications. I am grateful to M. Yor for discussions concerning the organization of the proof of the following.
Proposition 6.3. Assume that the sequence $(v_\alpha)_{\alpha \ge 1}$ of weights is distributed like $\Lambda_m$. Then
\[ E\, \frac{\sum_{\alpha \ge 1} v_\alpha U_\alpha}{\sum_{\alpha \ge 1} v_\alpha V_\alpha} = \frac{E(U V^{m-1})}{E(V^m)}, \tag{6.26} \]
\[ E\, \frac{\sum_{\alpha \ne \beta} v_\alpha v_\beta U_\alpha U_\beta}{(\sum_{\alpha \ge 1} v_\alpha V_\alpha)^2} = m \Bigl( \frac{E(U V^{m-1})}{E(V^m)} \Bigr)^2, \tag{6.27} \]
\[ E\, \frac{\sum_{\alpha \ge 1} v_\alpha^2 U_\alpha^2}{(\sum_{\alpha \ge 1} v_\alpha V_\alpha)^2} = (1 - m)\, \frac{E(U^2 V^{m-2})}{E(V^m)}. \tag{6.28} \]

Proof. First, we recall that if $(x_\alpha)$ is a realization of a Poisson point process of intensity measure $\xi$, then for a well behaved function $f$,
\[ E \exp \sum_{\alpha \ge 1} f(x_\alpha) = \exp \int (e^{f(x)} - 1)\,d\xi(x), \tag{6.29} \]
as follows from the case where $f$ takes only one value other than zero. Consider now points $(u_\alpha)_{\alpha \ge 1}$ on $\mathbb{R}^+$ that arise from a Poisson point process of intensity measure $\mu$ with $d\mu(x) = x^{-1-m}\,dx$. The points $(u_\alpha, U_\alpha, V_\alpha)$ then arise from a Poisson point process on $(\mathbb{R}^+)^3$, of intensity measure $\mu \otimes \nu$, where $\nu$ is the joint law of the couple $(U, V)$ on $(\mathbb{R}^+)^2$. Thus, considering a well-behaved function $h : (\mathbb{R}^+)^3 \to \mathbb{R}^+$, the points $x_\alpha = h(u_\alpha, U_\alpha, V_\alpha)$ arise from a Poisson point process on $\mathbb{R}^+$, of intensity measure $\xi$, the image of $\mu \otimes \nu$ under $h$. Using (6.29) for $f(x) = -x$, this gives
\[ E \exp\Bigl( -\sum_{\alpha \ge 1} h(u_\alpha, U_\alpha, V_\alpha) \Bigr) = \exp \int (e^{-h(x,u,v)} - 1)\, x^{-1-m}\,dx\,d\nu(u, v). \tag{6.30} \]
Consider now $k \ge 1$, and take $h(x, u, v) = s(ux)^k + tvx$ where $s, t$ are parameters. Expressing that the derivatives at $s = 0$ of the two sides of (6.30) coincide, we get the relation
\[ E \sum_{\alpha \ge 1} (U_\alpha u_\alpha)^k \exp\Bigl( -t \sum_{\alpha \ge 1} u_\alpha V_\alpha \Bigr) = \int (ux)^k e^{-tvx} x^{-1-m}\,dx\,d\nu(u, v) \times \exp \int (e^{-tvx} - 1)\, x^{-1-m}\,dx\,d\nu(u, v). \tag{6.31} \]
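We set $x = y/tv$ to see that
\[ \int (ux)^k e^{-tvx} x^{-1-m}\,dx\,d\nu(u, v) = t^{m-k} E(U^k V^{m-k}) \int e^{-y} y^{-m+k-1}\,dy = t^{m-k}\, \Gamma(k - m)\, E(U^k V^{m-k}), \]
and
\[ \int (e^{-tvx} - 1)\, x^{-1-m}\,dx\,d\nu(u, v) = t^m E(V^m) \int (e^{-y} - 1)\, y^{-1-m}\,dy = -\frac{t^m}{m}\, \Gamma(1 - m)\, E(V^m). \]
Thus, (6.31) reads as
\[ E \sum_{\alpha \ge 1} (u_\alpha U_\alpha)^k \exp\Bigl( -t \sum_{\alpha \ge 1} u_\alpha V_\alpha \Bigr) = t^{m-k} E(U^k V^{m-k})\, \Gamma(k - m) \times \exp\Bigl( -E(V^m)\, \frac{t^m}{m}\, \Gamma(1 - m) \Bigr). \]
Taking $k = 1$ and integrating in $t$ from $0$ to $\infty$ yields (6.26). Taking $k = 2$, multiplying by $t$ and integrating in $t$ yields (6.28). Taking again $k = 1$, but expressing now that the second derivatives at $s = 0$ of the two sides of (6.30) coincide, we get by a similar computation
\[ E \Bigl( \frac{\sum_\alpha u_\alpha U_\alpha}{\sum_\alpha u_\alpha V_\alpha} \Bigr)^2 = (1 - m)\, \frac{E(U^2 V^{m-2})}{E(V^m)} + m \Bigl( \frac{E(U V^{m-1})}{E(V^m)} \Bigr)^2, \]
which, combined with (6.28), yields (6.27).

The following expresses in a precise (and usable) way that when a random sequence of weights $(\eta_\alpha)$ has a distribution that resembles $\Lambda_m$, then (6.26) is approximately true.

Proposition 6.4. Given a number $A$ and given $\varepsilon' > 0$, there exists a number $\varepsilon > 0$ and an integer $k$ such that, for each $0 < m < 1$, if we have
\[ \forall\, n,\ \forall\, k_1, \ldots, k_n, \quad \sum_{s \le n} k_s \le k \ \Rightarrow\ \Bigl| E\Bigl( \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s} \Bigr) - S^{(m)}(k_1, \ldots, k_n) \Bigr| \le \varepsilon, \tag{6.32} \]
then, if the r.v. $U, V$ satisfy $V \ge 1$ and $E U^2 \le A$, we have
\[ \Bigl| E\, \frac{\sum_{\alpha \ge 1} \eta_\alpha U_\alpha}{\sum_{\alpha \ge 1} \eta_\alpha V_\alpha} - \frac{E(U V^{m-1})}{E V^m} \Bigr| \le \varepsilon', \tag{6.33} \]
\[ \Bigl| E\, \frac{\sum_{\alpha \ge 1} \eta_\alpha^2 U_\alpha^2}{(\sum_{\alpha \ge 1} \eta_\alpha V_\alpha)^2} - (1 - m)\, \frac{E(U^2 V^{m-2})}{E V^m} \Bigr| \le \varepsilon', \tag{6.34} \]
\[ \Bigl| E\, \frac{\sum_{\alpha \ne \beta} \eta_\alpha \eta_\beta U_\alpha U_\beta}{(\sum_{\alpha \ge 1} \eta_\alpha V_\alpha)^2} - m \Bigl( \frac{E(U V^{m-1})}{E V^m} \Bigr)^2 \Bigr| \le \varepsilon'. \tag{6.35} \]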
Proof. If we fix $m$, $U$, $V$ the result follows from Proposition 6.3 because the conditions (6.32) force the distribution of $(\eta_\alpha)$ to resemble $\Lambda_m$. The only issue could be the uniformity over $m$. But as $m \to 0$, the largest weight becomes close to one (in which case everything is obvious) and as $m \to 1$, it becomes small, in which case everything is also obvious by the law of large numbers.

7. Conditioning and the Relative Weights

In our quest for a proof of Theorem 1.1 we have (roughly speaking and assuming $p$ odd) reached the following stage. If $\sigma^1$, $\sigma^2$ do not belong to the same pure state, then their overlap is zero. If they belong to the pure state $C_\alpha$, then $R(\sigma^1, \sigma^2) \simeq q_\alpha = \langle R(\sigma^1, \sigma^2) \rangle_\alpha$. We would like to show that the numbers $q_\alpha$ do not depend upon $\alpha$ and are not random. The first approach that comes to mind is to try a kind of iteration procedure in the spirit of Sec. 5, but that does not seem to work. Rather, our approach relies upon the following observation. If we knew that $q_\alpha \simeq q$ ($q$ non-random) we would know the distribution of $w_\alpha$. We would then make precise computations with the cavity method and prove that $q$ must satisfy a relation such as (1.14), so that $q$ would be completely determined by $T_N(\beta)$.

To follow this line of attack, we will show that, given a number $q$, it is possible to make sense of "the part of the system consisting of the pure states for which $q_\alpha \simeq q$", and most importantly, that this "partial system" satisfies relations similar to the Ghirlanda–Guerra identities. These relations are proved in a manner similar to the Ghirlanda–Guerra identities, but seem strictly more general. They involve rather interesting combinatorics. The effect of this construction is that we obtain an object similar to the whole system, but where we now know that $q_\alpha$ is always near a given value of $q$. In the next section we will then learn how to use the cavity method to prove that $q$ (nearly) satisfies (1.14).
This will mean that the solution of (1.14) is the only possible value for $q_\alpha$; this will finish the proof of Theorem 1.1.

While, as mentioned, the proof contains an interesting and unexpected combinatorial ingredient, it is obscured by a number of unimportant technical complications. Thus it seems appropriate to first give an informal sketch of this proof. We consider $1/2 \le q \le 1$ and $\varepsilon > 0$. The role of $\varepsilon$ is that we want to consider only those pure states $C_\alpha$ for which $q_\alpha$ is within distance $\varepsilon$ of $q$. Consider the function given by
\[ \psi(x) = 1_{\{|x - q| < \varepsilon\}}, \tag{7.1} \]
and set
\[ V = \langle \psi(R_{12}) \rangle. \tag{7.2} \]
Since $R_{12} \simeq q_\alpha$ on $C_\alpha$, we expect that
\[ V = \sum_{\alpha \ge 1} w_\alpha^2 \psi(q_\alpha) + o(1). \tag{7.3} \]
Here we however run into a technical difficulty. Even though $R_{12} \simeq q_\alpha$ on $C_\alpha$, if $q_\alpha = q + \varepsilon$ or $q_\alpha = q - \varepsilon$, this does not imply that $\langle \psi(R_{12}) \rangle_\alpha = \psi(q_\alpha)$. The values of $\alpha$ for which $q_\alpha$ is very close to $q \pm \varepsilon$ create trouble. It could happen by extraordinary misfortune that $q_\alpha$ is close to $q \pm \varepsilon$ for many values of $\alpha$. But this cannot happen for all values of $\varepsilon$, and the difficulty will be solved by letting $\varepsilon$ vary within a factor 2 and averaging over it. We also observe that (7.2) brings forward the squares of the weights $w_\alpha$ rather than the weights themselves. We will renormalize and study
\[ \eta_\alpha = \frac{w_\alpha^2 \psi(q_\alpha)}{\sum_\gamma w_\gamma^2 \psi(q_\gamma)}. \tag{7.4} \]
The aim of this section is to show that (after we reorder these weights in nonincreasing order) for large $N$, their distribution is close to $\Lambda_{m'}$, where $m' = (1 - T_N/q^p)/2$ is given in (7.10). Here "close" depends upon $\varepsilon$, and the smaller $\varepsilon$ the better.

Now let us come to the combinatorics. We consider an integer $n$, and for $1 \le s \le n$ we consider integers $k_s$. We consider replicas $(\sigma^{s,k})_{k \le 2k_s}$ and the function
\[ f = \prod_{s \le n} \prod_{1 \le k < 2k_s} \psi(R(\sigma^{s,k}, \sigma^{s,k+1})). \tag{7.5} \]
This should look familiar after the work of Sec. 6. The difference is that now we consider only the pure states for which $|q_\alpha - q| < \varepsilon$; the occurrence of $2k_s$ rather than $k_s$ is related to the fact that we deal with the square of the weights. As in (7.3) we expect that
\[ \langle f \rangle = \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \psi(q_\alpha) + o(1) = \prod_{s \le n} \sum_{\alpha \ge 1} (w_\alpha^2 \psi(q_\alpha))^{k_s} + o(1). \tag{7.6} \]
Comparing with (7.3), (7.4), we expect that
\[ \frac{\langle f \rangle}{V^k} = \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s} + o(1), \tag{7.7} \]
where $k = \sum_{s \le n} k_s$. If $V$ is too small, there could be a lot of trouble with the error term in (7.7), but we will ignore this for the moment. We want to obtain induction relations on the quantities
\[ S(k_1, \ldots, k_n) = E \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s}, \]
and for this we mimic the proof of the Ghirlanda–Guerra identities. We expect that
\[ E\, \frac{\langle -H_{N,0}(\sigma^{1,2k_1}) f \rangle}{V^k} = E(\langle -H_{N,0}(\sigma) \rangle)\, E\, \frac{\langle f \rangle}{V^k} + \delta. \tag{7.8} \]
Integration by parts in the left-hand side of (7.8) yields the value
\[ \frac{\beta}{2}\, E\Bigl( \frac{\langle \sum_\rho R^p(\sigma^{1,2k_1}, \rho) f \rangle}{V^k} - 2k\, \frac{\langle f R^p(\sigma^{1,2k_1}, \sigma^1) \psi(R_{12}) \rangle}{V^{k+1}} \Bigr). \tag{7.9} \]
The summation is over $\rho$ one of $\sigma^{s,\ell}$; and in the last bracket, $\sigma^1$, $\sigma^2$ are considered as new replicas. Let us look at the term
\[ E\, \frac{\langle f R^p(\sigma^{1,2k_1}, \sigma^1) \psi(R_{12}) \rangle}{V^{k+1}}. \]
The work of Sec. 6 shows that the case $|R(\sigma^{1,2k_1}, \sigma^1)| \le 1/2$ should give a vanishing contribution, so that we can assume that $\sigma^{1,2k_1}$ and $\sigma^1$ belong to the same pure state $C_\alpha$. But then $R^p(\sigma^{1,2k_1}, \sigma^1) \simeq q_\alpha^p$. Moreover, the term $\psi(R_{12})$ vanishes unless $\sigma^2 \in C_\alpha$ and $|q_\alpha - q| < \varepsilon$. Now $|q_\alpha^p - q^p| \le p\varepsilon$. Thus, within an error that becomes small with $\varepsilon$, considering
\[ E\, \frac{\langle f R^p(\sigma^{1,2k_1}, \sigma^1) \psi(R_{12}) \rangle}{V^{k+1}} \]
amounts to replacing $k_1$ by $k_1 + 1$ in $E(\langle f \rangle / V^k)$ (or, equivalently, to replacing $2k_1$ by $2k_1 + 2$) and to multiplying by $q^p$. The other terms are handled similarly, yielding the appropriate near induction relations on the quantities $S(k_1, \ldots, k_n)$. The factor $2k$ in (7.9) (rather than $k$) implies that the distribution of the weights $(\eta_\alpha)$ will now resemble $\Lambda_{m'}$, where
\[ m' = \frac{1}{2} \Bigl( 1 - \frac{T_N(\beta)}{q^p} \Bigr). \tag{7.10} \]
The factor $1/2$ is related to the fact that $\eta_\alpha$ resembles $w_\alpha^2$, and that if the weights $v_\alpha$ have a distribution $\Lambda_m$, the weights $v'_\alpha = v_\alpha^2 / \sum_\gamma v_\gamma^2$ have distribution $\Lambda_{m/2}$. This is obvious on the representation $v_\alpha = u_\alpha / \sum_{\gamma \ge 1} u_\gamma$ where the points $(u_\alpha)$ are generated by a Poisson point process of intensity measure $C x^{-m-1}\,dx$, because $v'_\alpha = u_\alpha^2 / \sum_{\gamma \ge 1} u_\gamma^2$, and the points $(u_\alpha^2)$ are generated by a Poisson point process of intensity measure $C x^{-m/2-1}\,dx$.

Unfortunately it does not seem possible to make sense of the preceding program unless one somehow controls $V$ from below. The natural idea is thereby to work conditionally upon the condition $V \ge 2^{-\ell}$ (where $\ell$ is a new parameter), i.e. to study
\[ \frac{1}{P(V \ge 2^{-\ell})}\, E\Bigl( 1_{\{V \ge 2^{-\ell}\}}\, \frac{\langle f \rangle}{V^k} \Bigr) \]
rather than $E(\langle f \rangle / V^k)$. This however creates a new difficulty; the term $1_{\{V \ge 2^{-\ell}\}}$ creates a new term when one integrates by parts in (7.8). One has to show that this term is small. Unfortunately this does not seem to be always the case: it could happen in principle that pathologies accumulate precisely around the level $2^{-\ell}$ at which we truncate, and we will have to go around that difficulty by averaging over a certain range of values of $\ell$. (That this works is the result of calculation and is not so intuitive beforehand.)

When going through the proof, one has to remember that there are three types of error terms. Firstly, there are the error terms due to the fact that it is only asymptotically true that $R_{12} \simeq q_\alpha$ on $C_\alpha$. These terms are not dangerous; they vanish as $N \to \infty$. Secondly, there are the error terms due to the "edge effect": values of $q_\alpha$ close to $q \pm \varepsilon$, and similar terms occurring when introducing the truncation $1_{\{V \ge 2^{-\ell}\}}$. These are made as small as one wishes by averaging over $\varepsilon$ and $\ell$ in a suitable way. Finally, there are error terms occurring from the fact that we consider values of $q_\alpha$ that are not exactly $q$. These are typically bounded by $K(p)\varepsilon$.

Rather than the function (7.1), we find it more convenient to consider a function $0 \le \psi \le 1$ and $\varepsilon' > 0$ such that
\[ |x - q| \le \varepsilon \ \Rightarrow\ \psi(x) = 1 \tag{7.11} \]
and
\[ |x - q| \ge \varepsilon + \varepsilon' \ \Rightarrow\ \psi(x) = 0. \tag{7.12} \]
The "edge effect" will occur in the region $\varepsilon \le |x - q| \le \varepsilon + \varepsilon'$. Typically $\varepsilon'$ will be much smaller than $\varepsilon$. We assume that
\[ \varepsilon \le |x - q| \le \varepsilon + \varepsilon' \ \Rightarrow\ \psi(x) = 1 - \frac{|x - q| - \varepsilon}{\varepsilon'}, \]
so that
\[ |\psi(x) - \psi(y)| \le \frac{1}{\varepsilon'} |x - y|. \tag{7.13} \]
We define
\[ V = \langle \psi(R_{12}) \rangle. \tag{7.14} \]

Lemma 7.1. We have
\[ E\Bigl| V - \sum_{\alpha \ge 1} w_\alpha^2 \psi(q_\alpha) \Bigr| \le \frac{\delta}{\varepsilon'}. \tag{7.15} \]
Here and below $\delta$ denotes a quantity depending upon $N$, $\beta$, but independent of all the various parameters ($\varepsilon$, $q$, $\varepsilon'$, $\ell$, ...) of our construction, and which goes to zero as $N \to \infty$ when one averages over $\beta \le 2^{p/L}$. In fact, in (7.15), $\delta$ goes to zero as $N \to \infty$ for each value of $\beta$, and the need to average over $\beta$ will arise only when we will use (6.2).
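The piecewise-linear cutoff $\psi$ used in Lemma 7.1 can be written down explicitly. The following sketch is an illustration only (the function name `make_cutoff` and the numerical values are hypothetical): it builds $\psi$ equal to 1 within distance $\varepsilon$ of $q$, equal to 0 beyond distance $\varepsilon + \varepsilon'$, and linear in between, and it checks the Lipschitz bound (7.13) on a grid.

```python
def make_cutoff(q, eps, eps_prime):
    """Return psi with psi = 1 on |x - q| <= eps, psi = 0 on
    |x - q| >= eps + eps_prime, and linear interpolation in between."""
    def psi(x):
        d = abs(x - q)
        if d <= eps:
            return 1.0
        if d >= eps + eps_prime:
            return 0.0
        return 1.0 - (d - eps) / eps_prime
    return psi


def lipschitz_violations(psi, eps_prime, grid):
    """Count pairs (x, y) of the grid violating |psi(x) - psi(y)| <= |x - y| / eps_prime."""
    count = 0
    for x in grid:
        for y in grid:
            if abs(psi(x) - psi(y)) > abs(x - y) / eps_prime + 1e-9:
                count += 1
    return count


if __name__ == "__main__":
    psi = make_cutoff(q=0.7, eps=0.1, eps_prime=0.01)
    grid = [i / 1000.0 for i in range(1001)]
    print(psi(0.7), psi(0.805), psi(0.9), lipschitz_violations(psi, 0.01, grid))
```

The width $\varepsilon'$ of the transition region controls the Lipschitz constant $1/\varepsilon'$, which is why the error term in (7.15) carries the factor $1/\varepsilon'$.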
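Proof. We have $\psi(R_{12}) \ne 0 \Rightarrow R_{12} \ge q - \varepsilon - \varepsilon' \ge 1/2$. From Theorem 4.1 we have
\[ E\, G_N^{\otimes 2}\Bigl( \Bigl\{ (\sigma^1, \sigma^2);\ R_{12} \ge \frac{1}{2} \Bigr\} \setminus \bigcup_{\alpha \ge 1} C_\alpha^2 \Bigr) \le \exp\Bigl( -\frac{N}{K} \Bigr). \]
Thus, we make only an exponentially small error if we pretend that
\[ V = \sum_{\alpha \ge 1} \langle 1_{C_\alpha^2} \psi(R_{12}) \rangle = \sum_{\alpha \ge 1} w_\alpha^2 \langle \psi(R_{12}) \rangle_\alpha, \]
using the notation (4.6). To improve clarity, we will not mention these exponentially small errors. Using (7.13), we have
\[ |\langle \psi(R_{12}) \rangle_\alpha - \psi(q_\alpha)| \le \langle |\psi(R_{12}) - \psi(q_\alpha)| \rangle_\alpha \le \frac{1}{\varepsilon'} \langle |R_{12} - q_\alpha| \rangle_\alpha. \]
Now, using Jensen's and Cauchy–Schwarz inequalities,
\[ \langle |R_{12} - q_\alpha| \rangle_\alpha = \langle |R_{12} - \langle R_{12} \rangle_\alpha| \rangle_\alpha \le \langle |R_{12} - R_{34}| \rangle_\alpha \le \langle |R_{12} - R_{13}| \rangle_\alpha + \langle |R_{13} - R_{34}| \rangle_\alpha = 2 \langle |R_{12} - R_{13}| \rangle_\alpha = 2 \Bigl\langle \Bigl| \frac{\sigma^1 \cdot (\sigma^2 - \sigma^3)}{N} \Bigr| \Bigr\rangle_\alpha \le 2 I_\alpha^{1/2}, \]
where $I_\alpha$ is defined in (4.7). Thus, since $\sum_{\alpha \ge 1} w_\alpha \le 1$, we have
\[ \Bigl| V - \sum_{\alpha \ge 1} w_\alpha^2 \psi(q_\alpha) \Bigr| \le \frac{2}{\varepsilon'} \sum_{\alpha \ge 1} w_\alpha^2 I_\alpha^{1/2} \le \frac{2}{\varepsilon'} \Bigl( \sum_{\alpha \ge 1} w_\alpha^3 I_\alpha \Bigr)^{1/2} \le \frac{2}{\varepsilon'}\, U_N(\beta)^{1/2}, \tag{7.16} \]
which implies the result by Theorem 4.1.

To express that $V$ is not too small, we consider another smooth function $0 \le \varphi \le 1$, and we assume $\varphi$ is differentiable and
\[ |\varphi'| \le 2, \tag{7.17} \]
\[ x \le 1 \Rightarrow \varphi(x) = 0, \qquad x \ge 2 \Rightarrow \varphi(x) = 1. \tag{7.18} \]
We set
\[ W = \varphi(2^\ell V), \tag{7.19} \]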
where $\ell$ is an integer. Thus, saying that $W$ is not zero is a smooth version of saying that $V$ is larger than $2^{-\ell}$. We recall the notation (7.5).

Lemma 7.2. We have
\[ \langle f \rangle \le V^k. \tag{7.20} \]

Proof. Since $0 \le \psi \le 1$, we have
\[ \prod_{s} \prod_{1 \le k \le 2k_s - 1} \psi(R(\sigma^{s,k}, \sigma^{s,k+1})) \le \prod_{s} \prod_{1 \le k \le k_s} \psi(R(\sigma^{s,2k-1}, \sigma^{s,2k})) \]
because there are more terms on the left-hand side. The $k$ terms on the right-hand side depend upon different replicas, so the thermal average of this quantity is $V^k$.

We consider the function
\[ \varphi_{\ell,k}(x) = \frac{\varphi(2^\ell x)}{x^k}. \tag{7.21} \]

Lemma 7.3. The derivative of $\varphi_{\ell,k}$ is bounded. Moreover
\[ \varphi_{\ell,k}(x) \le 2^{\ell k}. \tag{7.22} \]

Proof. $\varphi \le 1$ and $\varphi(2^\ell x) = 0$ unless $x \ge 2^{-\ell}$.

In order to control the "edge effect" (the region where $\psi \notin \{0, 1\}$), we introduce
\[ \psi^\sim(x) = 1_{\{\varepsilon \le |x - q| \le \varepsilon + \varepsilon'\}} \tag{7.23} \]
so that
\[ 0 < \psi(x) < 1 \ \Rightarrow\ \psi^\sim(x) = 1. \tag{7.24} \]

Lemma 7.4. We have
\[ \Bigl| E\, \frac{\langle f \rangle}{V^k} \varphi(2^\ell V) - E\Bigl( \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s}\, \varphi(2^\ell V) \Bigr) \Bigr| \le K(k, \ell) \Bigl[ E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha) + \frac{\delta}{\varepsilon'} \Bigr]. \tag{7.25} \]
Here of course $K(k, \ell)$ is a number depending only upon $k$, $\ell$. The control of the error terms requires great care, so we will explicitly write all the quantities on which the various constants $K(\cdots)$ depend, except $p$, that is fixed once and for all. We recall that $\delta$ depends only on $N$, $\beta$, but not on $k$, $\ell$, .... We note the different nature of the error terms: the term $\delta/\varepsilon'$ will go to zero as $N \to \infty$ (if one averages over $\beta$). The "edge effect" $E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha)$ will be made small by taking $\varepsilon'$ very small and averaging over $\varepsilon$.
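Proof. We will proceed as in Lemma 7.1. The one difficulty is that one has to perform the various approximations in the right order to avoid potential problems with small denominators. We first note that if $-1 \le a_s, b_s \le 1$, then
\[ \Bigl| \prod_{s \le n} a_s - \prod_{s \le n} b_s \Bigr| \le \sum_{s \le n} |a_s - b_s|. \tag{7.26} \]
From (7.5) we have
\[ \langle f \rangle = \prod_{s \le n} \Bigl\langle \prod_{1 \le k \le 2k_s - 1} \psi(R(\sigma^{s,k}, \sigma^{s,k+1})) \Bigr\rangle = \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \Bigl\langle \prod_{1 \le k \le 2k_s - 1} \psi(R(\sigma^{s,k}, \sigma^{s,k+1})) \Bigr\rangle_\alpha. \]
We set
\[ U_1 = \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \psi(q_\alpha)^{2k_s - 1}. \]
Using (7.26) twice, we see that
\[ |\langle f \rangle - U_1| \le K(k) \sum_{\alpha \ge 1} w_\alpha^2 \langle |\psi(R_{12}) - \psi(q_\alpha)| \rangle_\alpha, \]
so that, as shown in the proof of Lemma 7.1, we have
\[ E|\langle f \rangle - U_1| \le K(k)\, \frac{\delta}{\varepsilon'}. \]
Thus
\[ E(|\langle f \rangle - U_1|\, \varphi_{\ell,k}(V)) \le K(k, \ell)\, \frac{\delta}{\varepsilon'}, \]
i.e.
\[ E\Bigl| \frac{\langle f \rangle}{V^k} \varphi(2^\ell V) - U_1 \varphi_{\ell,k}(V) \Bigr| \le K(k, \ell)\, \frac{\delta}{\varepsilon'}. \]
Next, we set
\[ U_2 = \prod_{s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \psi(q_\alpha)^{k_s}. \]
We note that
\[ \psi(q_\alpha)^{2k_s - 1} \ne \psi(q_\alpha)^{k_s} \ \Rightarrow\ \psi^\sim(q_\alpha) = 1. \]
Using (7.26), we get
\[ |U_2 - U_1| \le n \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha), \tag{7.27} \]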
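and thus
\[ E|(U_1 - U_2)\, \varphi_{\ell,k}(V)| \le K(k, \ell)\, E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha). \tag{7.28} \]
Next, we set
\[ V_1 = \sum_{\alpha \ge 1} w_\alpha^2 \psi(q_\alpha). \]
Combining Lemmas 7.1 and 7.3 we get, since $U_2 \le 1$,
\[ E|U_2 (\varphi_{\ell,k}(V) - \varphi_{\ell,k}(V_1))| \le K(k, \ell)\, \frac{\delta}{\varepsilon'}. \tag{7.29} \]
Now
\[ U_2\, \varphi_{\ell,k}(V_1) = \frac{U_2}{V_1^k}\, \varphi(2^\ell V_1), \]
and, obviously, $U_2 \le V_1^k$. Thus, using again Lemma 7.1,
\[ E\Bigl| \frac{U_2}{V_1^k} (\varphi(2^\ell V_1) - \varphi(2^\ell V)) \Bigr| \le K(k, \ell)\, \frac{\delta}{\varepsilon'}. \tag{7.30} \]
Now
\[ \frac{U_2}{V_1^k} = \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s}, \]
and combining (7.27) to (7.30), this proves (7.25).

We set
\[ W(k_1, \ldots, k_n) = E\Bigl( \prod_{s \le n} \sum_{\alpha \ge 1} \eta_\alpha^{k_s}\, \varphi(2^\ell V) \Bigr), \tag{7.31} \]
\[ S(k_1, \ldots, k_n) = \frac{1}{E\varphi(2^\ell V)}\, W(k_1, \ldots, k_n). \tag{7.32} \]
(The notation does not indicate that these quantities depend upon $\varepsilon$, $\varepsilon'$, $\ell$.) This is where the conditioning argument appears. We replace the basic probability $P$ by a probability $P'$ having a density proportional to $\varphi(2^\ell V)$. The relations among the quantities $S(k_1, \ldots, k_n)$ will follow from the next estimate, where $m'$ is given in (1.10).

Proposition 7.5. Given $k$, there exist for $\ell \ge 1$ numbers $a(\ell, N)$, $K(k, \ell)$ with the following properties:
\[ \forall\, N, \quad \sum_{\ell \ge 1} a(\ell, N) \le 8. \tag{7.33} \]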
X
51
ks ≤ k, then
s≤n
|kW (k1 + 1, k2 , . . . , kn ) − [(k1 − m0 )W (k1 , . . . , kn ) + k2 W (k1 + k2 , k3 , . . . , kn ) + · · · + kn W (k1 + kn , . . . , kn−1 )]| " # X δ wα2 ψ ∼ (qα ) + 0 + E(ϕ(2` V ))K(k)pε . ≤ a(`, N ) + K(k, `) E ε
(7.34)
α≥1
Compared to Lemma 7.4, we get two new error terms. The term a(`, N ) is an “edge effect” produced by ϕ(2` V ). Condition (7.33) shows that it can be made small by averaging over `. The last term will be made small by taking ε small. (Of course parameters will have to be chosen in an appropriate order.) Proof. Since ϕk,` is bounded, we see by the Cauchy–Schwarz inequality and Proposition 4.1 that 1 1,2k1 )f ϕk,` (V ) E − HN,0 (σ N 1 E(hf iϕk,` (V )) + K(k, `)δ =E − HN,0 (σ) N β (1 − TN (β))E(hf iϕk,` (V )) + K(k, `)δ . 2 To integrate by parts the left-hand side we write
(7.35)
=
h− N1 HN,0 (σ 1,2k1 )f i Vk
=
2k h− N1 HN,0 (σ 1,2k1 )f i ZN 2 V )k (ZN
.
In words, the denominators of the brackets on the numerator and the denominator are the same and cancel out. When integrating by parts, we get three terms corresponding respectively to dependence on the disorder of the Boltzmann factors occuring in HN,0 1,2k1 2k (σ )f ZN − N and in 2 V, ZN
and to the dependence on the disorder of ϕ(2` V ) = ϕ(2` hψ(R12 )i). These terms are labeled respectively I, II, III. The term III is the all important error term, so we handle it first. Introducing three new replicas σ 1 , σ 2 , σ 3 , we have III =
2` ϕ0 (2` V ) β E (hf E(σ 1,2k1 , σ 1 )ψ(R12 ) + f E(σ 1,2k1 , σ 2 )ψ(R12 )i k 2 V ! − 2hf ψ(R12 )E(σ 1,2k1 , σ 3 )i) ,
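where
\[ \mathcal{E}(\sigma^1, \sigma^2) = \frac{1}{N}\, E(H_{N,0}(\sigma^1) H_{N,0}(\sigma^2)). \]
We bound crudely $|\mathcal{E}(\sigma^1, \sigma^2)|$ by 1. Since $\psi(R_{12})$ is thermally independent of $f$, we get a bound
\[ |\mathrm{III}| \le 2\beta\, E\Bigl( \frac{2^\ell V \langle f \rangle\, |\varphi'(2^\ell V)|}{V^k} \Bigr) \le 2\beta\, E(2^\ell V\, |\varphi'(2^\ell V)|) \]
since $\langle f \rangle \le V^k$ by Lemma 7.2. Now, it is obvious from the definition of $\varphi$ that $|x \varphi'(x)| \le 4 \cdot 1_{\{1 < x < 2\}}$. Thus we have
\[ |\mathrm{III}| \le \beta\, a(\ell, N), \]
where
\[ a(\ell, N) = 8\, E(1_{\{1 < 2^\ell V \le 2\}}). \]
Since the sets $\{1 < 2^\ell x \le 2\}$ are disjoint as $\ell$ varies, we get (7.33).

Next, we turn to the term II, which is
\[ \mathrm{II} = -\frac{k\beta}{2}\, E\Bigl( \langle (R(\sigma^{1,2k_1}, \sigma^1)^p + R(\sigma^{1,2k_1}, \sigma^2)^p)\, \psi(R_{12}) f \rangle\, \frac{\varphi(2^\ell V)}{V^{k+1}} \Bigr) + K(k, \ell)\delta. \]
The error term comes from the use of Lemma 2.4 to replace $N^{-1} E(H_{N,0}(\sigma^1) H_{N,0}(\sigma^2))$ by $R(\sigma^1, \sigma^2)^p$. It is of the form claimed since $\langle \psi(R_{12}) f \rangle = V \langle f \rangle \le V^{k+1}$ by (7.20). Now,
\[ \mathrm{II} = -k\beta\, E(\langle R(\sigma^{1,2k_1}, \sigma^1)^p\, \psi(R_{12}) f \rangle\, \varphi_{\ell,k+1}(V)) + K(k, \ell)\delta \tag{7.36} \]
by symmetry between replicas. We claim that
\[ |\mathrm{II} + k\beta q^p\, W(k_1 + 1, k_2, \ldots, k_n)| \le \beta K(k, \ell) \Bigl( E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha) + \frac{\delta}{\varepsilon'} \Bigr) + k\beta p\varepsilon\, E\varphi(2^\ell V). \tag{7.37} \]
To see this, writing $R = R(\sigma^{1,2k_1}, \sigma^1)$, we decompose $R^p = R_1 + R_2$, where $R_1 = R^p 1_{\{|R| \le 1/2\}}$. The contribution of $R_1$ to the term II is vanishing by the result of Sec. 5. Thus in the expression of II we can replace $R^p$ by $R_2 = R^p 1_{\{|R| > 1/2\}}$, and if $R_2 \ne 0$, we can pretend that $\sigma^{1,2k_1}$ and $\sigma^1$ must be in the same pure state. Thus $\sigma^{1,s}$, $1 \le s \le 2k_1$, $\sigma^1$, $\sigma^2$ are all in the same pure state. The method of Lemma 7.1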
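shows that within the allowed error term, we can in the expression of II replace the term
\[ \langle R_2\, \psi(R_{12}) f \rangle \tag{7.38} \]
by
\[ \Bigl( \sum_{\alpha \ge 1} w_\alpha^{2k_1 + 2} \psi(q_\alpha)^{2k_1} q_\alpha^p \Bigr) \Bigl( \prod_{2 \le s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \psi(q_\alpha)^{2k_s - 1} \Bigr). \tag{7.39} \]
Next, we want to replace in (7.36) the term (7.38) by
\[ Q = q^p \Bigl( \sum_{\alpha \ge 1} w_\alpha^{2k_1 + 2} \psi(q_\alpha)^{2k_1 + 1} \Bigr) \Bigl( \prod_{2 \le s \le n} \sum_{\alpha \ge 1} w_\alpha^{2k_s} \psi(q_\alpha)^{2k_s - 1} \Bigr), \tag{7.40} \]
that is, we want to replace II by
\[ \mathrm{IV} = -k\beta\, E(Q\, \varphi_{\ell,k+1}(V)). \tag{7.41} \]
We note that
\[ |\psi(x)^{2k_1} x^p - q^p \psi(x)^{2k_1 + 1}| \le p\varepsilon\, \psi(x)^{2k_1 + 1} + 2\psi^\sim(x), \tag{7.42} \]
as is seen by distinguishing the cases
\[ |x - q| \le \varepsilon, \qquad \varepsilon \le |x - q| \le \varepsilon + \varepsilon', \qquad |x - q| \ge \varepsilon + \varepsilon'. \]
The error term when replacing (7.39) by $Q$ is made of two pieces, corresponding to the two terms on the right of (7.42). The contribution of the term $\psi^\sim(x)$ is obviously swallowed by the allowed error term. The contribution of the term $p\varepsilon\, \psi(x)^{2k_1 + 1}$, after a new use of the method of Lemma 7.1, is at most
\[ k\beta p\varepsilon\, E\Bigl( \langle \bar f \rangle\, \frac{\varphi(2^\ell V)}{V^{k+1}} \Bigr) \le k\beta p\varepsilon\, E(\varphi(2^\ell V)) \]
by Lemma 7.2. To prove (7.37) we need yet another use of the method of Lemma 7.1 to see that within the same type of error term we can replace IV by
\[ -k\beta\, E(\langle \bar f \rangle\, \varphi_{\ell,k+1}(V)), \tag{7.43} \]
where $\bar f$ is defined as $f$, but replacing $k_1$ by $k_1 + 1$. Finally, to prove (7.37) we use Lemma 7.4 that relates (7.43) and $W(k_1 + 1, k_2, \ldots, k_n)$.

Having proved (7.37), we turn to the study of the term I, which is
\[ \mathrm{I} = \frac{\beta}{2} \sum D_{s,\ell} + K(k, \ell)\delta, \tag{7.44} \]
where the summation is over all possible choices of $s \le n$, $\ell \le 2k_s$ and where
\[ D_{s,\ell} = E(\langle R^p(\sigma^{1,2k_1}, \sigma^{s,\ell}) f \rangle\, \varphi_{\ell,k}(V)). \]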
We claim that if $1 \le \ell < 2k_1$, we have
\[ |D_{1,\ell} - q^p\, W(k_1, \ldots, k_n)| \le K(k, \ell) \Bigl( E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha) + \frac{\delta}{\varepsilon'} \Bigr) + K(k)\, p\varepsilon\, E(\varphi(2^\ell V)), \tag{7.45} \]
and if $s \ge 2$,
\[ |D_{s,\ell} - q^p\, W(k_1 + k_s, k_2, \ldots, k_{s-1}, k_{s+1}, \ldots, k_n)| \le K(k, \ell) \Bigl( E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha) + \frac{\delta}{\varepsilon'} \Bigr) + K(k)\, p\varepsilon\, E\varphi(2^\ell V). \tag{7.46} \]
The combinatorics (merging of two chains in (7.46)) are explained in Sec. 6 (see (6.19)) and the details are handled as in the case of (7.37), so there is no point in repeating them. If we combine the estimates (7.37), (7.45), (7.46) after integration by parts, and within an error term as in the right-hand side of (7.37), the relation (7.35) yields, after division of both sides by $\beta/2$,
\[ (1 + (2k_1 - 1) q^p)\, W(k_1, \ldots, k_n) + \sum_{2 \le s \le n} 2k_s q^p\, W(k_1 + k_s, k_2, \ldots, k_{s-1}, k_{s+1}, \ldots, k_n) - 2k q^p\, W(k_1 + 1, k_2, \ldots, k_n) \simeq (1 - T_N)\, W(k_1, \ldots, k_n), \]
i.e.
\[ k\, W(k_1 + 1, k_2, \ldots, k_n) \simeq \Bigl( k_1 - \frac{1}{2} \Bigl( 1 - \frac{T_N}{q^p} \Bigr) \Bigr) W(k_1, \ldots, k_n) + \sum_{2 \le s \le n} k_s\, W(k_1 + k_s, \ldots, k_n). \]

The proof of the following is immediate from Proposition 7.5 using induction on $k$. It uses notation (6.8), (6.21).

Theorem 7.6. Given an integer $k$, there exists a number $K(k)$, depending upon $k$ only, with the following property. There exists a sequence $b(\ell, N) \ge 0$, with
\[ \sum_{\ell \ge 1} b(\ell, N) \le K(k) \tag{7.47} \]
and, for each number $\ell_0$, there exists a number $K(k, \ell_0)$, such that, given integers $k_1, \ldots, k_n$ with $\sum_{s \le n} k_s \le k$, then for any $\ell \le \ell_0$,
\[ |S(k_1, \ldots, k_n) - S^{(m')}(k_1, \ldots, k_n)| \le \frac{1}{E\varphi(2^\ell V)} \Bigl[ b(\ell, N) + K(k, \ell_0) \Bigl( E \sum_{\alpha \ge 1} w_\alpha^2 \psi^\sim(q_\alpha) + \frac{\delta}{\varepsilon'} \Bigr) \Bigr] + K(k)\, p\varepsilon. \tag{7.48} \]
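How to make the error term small and to use this information is the goal of Sec. 8.

8. Conditioning and the Cavity Method

In this section we finish the proof of Theorem 1.1. We will prove a statement meaning about the following. Given any number $q$, if $E \sum_{\alpha \ge 1} \{ w_\alpha^2 ;\ |q_\alpha - q| \le \varepsilon \}$ does not go to zero as $N \to \infty$ then $q$ is an approximate solution of (1.14) (where the meaning of "approximate" gets better as $\varepsilon \to 0$). Turning things around, if $q$ is not a solution of (1.14), then for some $\varepsilon > 0$, we have that $E \sum_{\alpha \ge 1} \{ w_\alpha^2 ;\ |q_\alpha - q| \le \varepsilon \} \to 0$. Thus if (1.14) has a unique solution, this unique solution is asymptotically the only possible value for $q_\alpha$.

One essential step in this line of reasoning is to find an argument proving that if $q_\alpha = q$ for each $\alpha$, then $q$ must satisfy (1.14). It is then not difficult (although cumbersome) to reproduce this argument in the setting of Sec. 7. Our first task is to outline this argument. We will stay informal about the details, since these will be fully covered in the case of "conditioning", which is the one we really need. This argument is designed to involve only the squares of the weights $w_\alpha$ (which is what the numbers $\eta_\alpha$ are). Remembering that we assume that $R_{12}$ can essentially be only $0$ or $q$, we write
\[ q \simeq E\, \frac{\langle R_{12}^2 \rangle}{\langle R_{12} \rangle} = E\, \frac{\langle \sigma_N^1 \sigma_N^2 R_{12} \rangle}{\langle R_{12} \rangle}, \tag{8.1} \]
by symmetry upon the sites. Changing $N$ into $N + 1$ and $\beta$ into $\beta'$, and setting
\[ R'_{12} = R'_{12}(\rho^1, \rho^2) = \frac{N}{N + 1} R_{12} + \frac{\sigma_{N+1}^1 \sigma_{N+1}^2}{N + 1}, \tag{8.2} \]
we get
\[ q \simeq E\, \frac{\langle \sigma_{N+1}^1 \sigma_{N+1}^2 R'_{12} \rangle'}{\langle R'_{12} \rangle'}. \tag{8.3} \]
One can expect a limited effect of replacing $R'_{12}$ by $R_{12}$, so that
\[ q \simeq E\, \frac{\langle \sigma_{N+1}^1 \sigma_{N+1}^2 R_{12} \rangle'}{\langle R_{12} \rangle'} \]
and using cavity
\[ q \simeq E\, \frac{\langle R_{12}\, \mathrm{Av}\, \varepsilon_1 \varepsilon_2 \exp( \sum_{\ell \le 2} \varepsilon_\ell g(\sigma^\ell) ) \rangle}{\langle R_{12}\, \mathrm{Av} \exp \sum_{\ell \le 2} \varepsilon_\ell g(\sigma^\ell) \rangle}. \tag{8.4} \]
Remembering that $R_{12} \simeq 0$ if $\sigma^1$, $\sigma^2$ do not belong to the same set $C_\alpha$, we have
\[ q \simeq E\, \frac{\sum_{\alpha \ge 1} w_\alpha^2 \langle \operatorname{sh} g(\sigma) \rangle_\alpha^2}{\sum_{\alpha \ge 1} w_\alpha^2 \langle \operatorname{ch} g(\sigma) \rangle_\alpha^2} = E\, \frac{\sum_{\alpha \ge 1} \eta_\alpha \langle \operatorname{sh} g(\sigma) \rangle_\alpha^2}{\sum_{\alpha \ge 1} \eta_\alpha \langle \operatorname{ch} g(\sigma) \rangle_\alpha^2}. \tag{8.5} \]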
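Next, we use the fact that $R_{12}$ takes essentially values close to $0$ or $q$ to prove that
\[ q \simeq E\, \frac{\sum_{\alpha \ge 1} \eta_\alpha U_\alpha}{\sum_{\alpha \ge 1} \eta_\alpha V_\alpha}, \tag{8.6} \]
where $(U_\alpha, V_\alpha)_{\alpha \ge 1}$ are i.i.d., and the law of the couple $(U_\alpha, V_\alpha)$ is that of the couple $(\operatorname{sh}^2(X), \operatorname{ch}^2(X))$, where $X$ is Gaussian, $E X^2 = \beta p q^{p-1}/2$. This step might not be so intuitive, so let us at least mention that the fact that the couples $(\langle \operatorname{sh}^2 g(\sigma) \rangle_\alpha, \langle \operatorname{ch}^2 g(\sigma) \rangle_\alpha)$ are nearly independent as $\alpha$ varies follows from the fact, proved in Sec. 5, that $g(\sigma)$ is typically nearly independent of $g(\sigma')$ if $\sigma \in C_\alpha$, $\sigma' \in C_{\alpha'}$, $\alpha \ne \alpha'$. Next, as proved in Sec. 6, the sequence $(w_\alpha)$ has nearly distribution $\Lambda_m$ for $m = m_N(\beta)$, so that the sequence $(\eta_\alpha)$ has nearly distribution $\Lambda_{m/2}$. Thus (6.33) yields
\[ q \simeq \frac{E\, \operatorname{th}^2 X \operatorname{ch}^m X}{E \operatorname{ch}^m X}, \]
which determines $q$.

We now repeat the previous argument "under conditioning", taking care of all the details. The delicate step (control of the distribution of the weights $\eta_\alpha$) was performed in Sec. 7. The rest of the proof is not difficult, but it is made cumbersome by the need of averaging to control the error terms of Theorem 7.6. We will use the notation of Sec. 6. The starting point (that corresponds to (8.1)) is as follows. Since $\psi(x) = 0$ if $|x - q| \ge \varepsilon + \varepsilon'$, we have
\[ |q\psi(x) - x\psi(x)| \le (\varepsilon + \varepsilon')\psi(x) \le 2\varepsilon\psi(x). \tag{8.7} \]
Thus we have
\[ |q \langle \psi(R_{12}) \rangle - \langle R_{12} \psi(R_{12}) \rangle| \le 2\varepsilon \langle \psi(R_{12}) \rangle. \]
Dividing by $V = \langle \psi(R_{12}) \rangle$, multiplying by $\varphi(2^\ell V)$ and taking expectation, we get
\[ \Bigl| q\, E\varphi(2^\ell V) - E\Bigl( \langle R_{12} \psi(R_{12}) \rangle\, \frac{\varphi(2^\ell V)}{V} \Bigr) \Bigr| \le 2\varepsilon\, E\varphi(2^\ell V). \tag{8.8} \]
We use (8.8) and symmetry between sites to get
\[ \Bigl| q\, E\varphi(2^\ell V) - E\Bigl( \langle \sigma_N^1 \sigma_N^2 \psi(R_{12}) \rangle\, \frac{\varphi(2^\ell V)}{V} \Bigr) \Bigr| \le 2\varepsilon\, E\varphi(2^\ell V). \tag{8.9} \]
We rewrite (8.9) changing $N$ into $N + 1$ and $\beta$ into $\beta'$. We recall the notation (8.2) and we set $V' = \langle \psi(R'_{12}) \rangle'$, so that (8.9) implies
\[ \Bigl| q\, E\varphi(2^\ell V') - E\Bigl( \langle \sigma_{N+1}^1 \sigma_{N+1}^2 \psi(R'_{12}) \rangle'\, \frac{\varphi(2^\ell V')}{V'} \Bigr) \Bigr| \le 2\varepsilon\, E\varphi(2^\ell V'). \tag{8.10} \]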
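We recall that $\varphi$, $\psi$ are Lipschitz, and that $|R'_{12} - R_{12}| \le 2/N$. Thus, at the cost of adding to the right-hand side of (8.10) an error term $K(\ell)\delta$, we can replace in (8.10) $R'_{12}$ by $R_{12}$ and $V'$ by
\[ V_1 = \langle\psi(R_{12})\rangle'. \tag{8.11} \]
We use the cavity method to write
\[ \langle \sigma^1_{N+1}\sigma^2_{N+1}\psi(R_{12})\rangle' = \frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle \mathrm{ch}\,g(\sigma)\rangle^2}, \tag{8.12} \]
\[ V_1 = \langle\psi(R_{12})\rangle' = \frac{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}{\langle \mathrm{ch}\,g(\sigma)\rangle^2} \tag{8.13} \]
so that (8.10) yields
\[ \Bigl| qE\varphi(2^\ell V_1) - E\Bigl(\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}\,\varphi(2^\ell V_1)\Bigr)\Bigr| \le 2\varepsilon E\varphi(2^\ell V_1) + K(\ell)\delta. \tag{8.14} \]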
At some later stage of the proof we will want to use the results of Sec. 7. These involve a new probability having a density proportional to $\varphi(2^\ell V)$, not $\varphi(2^\ell V_1)$, so in (8.14) we would like to replace $\varphi(2^\ell V_1)$ by $\varphi(2^\ell V)$. The idea to do this is simply that the ratio $V_1/V$ is not too different from 1, so that $\varphi(2^\ell V_1)$ and $\varphi(2^\ell V)$ are equal except for a few values of $\ell$; this should not make much difference. We recall that the dependence of the various constants on $p$ is implicit.

Lemma 8.1. We have
\[ E_g\frac{V_1}{V} \le K; \qquad E_g\frac{V}{V_1} \le 1. \tag{8.15} \]

Proof. By (8.13)
\[ \frac{V_1}{V} \le \frac{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\rangle} \]
so that
\[ E_g\frac{V_1}{V} \le \exp 2\beta^2 p \le K, \]
since $\beta \le 2^{p/2}$. The rest is obvious.

We use again the notation $a(\ell, N) = P(1 < 2^\ell V \le 2)$.

Lemma 8.2. Given $r \ge 1$, we have
\[ E(|\varphi(2^\ell V) - \varphi(2^\ell V_1)|) \le \sum_{\ell-r\le s\le \ell+r} a(s, N) + K2^{-r}. \tag{8.16} \]
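Proof. It should be obvious from the properties of $\varphi$ that
\[ E(|\varphi(2^\ell V) - \varphi(2^\ell V_1)|) \le P(2^{-r} \le 2^\ell V \le 2^r) + P(2^\ell V \le 2^{-r},\ 2^\ell V_1 \ge 1) + P(2^\ell V \ge 2^r,\ 2^\ell V_1 \le 2). \]
Using Lemma 8.1 and Markov's inequality,
\[ P(2^\ell V \le 2^{-r},\ 2^\ell V_1 \ge 1) \le K2^{-r}, \qquad P(2^\ell V \ge 2^{r},\ 2^\ell V_1 \le 2) \le K2^{-r}, \]
and also
\[ P(2^{-r} \le 2^\ell V \le 2^r) \le \sum_{\ell-r\le s\le \ell+r} a(s, N). \]

Combining Lemma 8.2 and (8.14), we have shown that
\[ \Bigl| qE\varphi(2^\ell V) - E\Bigl(\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle}\,\varphi(2^\ell V)\Bigr)\Bigr| \le 2\varepsilon E\varphi(2^\ell V) + K(\ell)\delta + K2^{-r} + b(\ell, N) \tag{8.17} \]
where
\[ \sum_{\ell\ge 1} b(\ell, N) \le K(r). \tag{8.18} \]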
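A hedged remark on why a bound of the form (8.18) is plausible. Assume, as the combination of Lemma 8.2 with (8.14) suggests, that $b(\ell,N)$ is the window sum of $a(s,N)$ over $\ell-r\le s\le \ell+r$. The events $\{1 < 2^s V \le 2\}$ are disjoint as $s$ varies, so $\sum_s a(s,N)\le 1$, and each $s$ belongs to at most $2r+1$ windows. The sketch below checks this counting on an arbitrary probability sequence; the function name and the sample data are illustrative assumptions.

```python
import random

def window_sums(a, r):
    """b[l] = sum of a[s] over l - r <= s <= l + r (indices outside a count as 0)."""
    n = len(a)
    return [sum(a[max(0, l - r):min(n, l + r + 1)]) for l in range(n)]

random.seed(0)
raw = [random.random() for _ in range(50)]
total = sum(raw)
a = [x / total for x in raw]  # nonnegative, sums to 1: stands in for a(s, N)
r = 3
b = window_sums(a, r)
# Each a[s] is counted in at most 2r + 1 windows.
assert sum(b) <= (2 * r + 1) * sum(a) + 1e-12
```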
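The next step is to replace in (8.17) the process $(g(\sigma))$ by a simpler process. In the scheme of proof described early in this section, the step we are going to perform corresponds to going from (8.5) to (8.6). We consider i.i.d. Gaussian variables $(g_\alpha)_{\alpha\ge 1}$, with $Eg_\alpha^2 = \beta^2 p q^{p-1}/2$, and we define the process $g'(\sigma)$ by $g'(\sigma) = g_\alpha$ if $\sigma \in C_\alpha$.

Lemma 8.3. We have
\[ \Bigl| E\Bigl(\Bigl(\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g(\sigma^1)\,\mathrm{sh}\,g(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g(\sigma^1)\,\mathrm{ch}\,g(\sigma^2)\rangle} - \frac{\langle\psi(R_{12})\,\mathrm{sh}\,g'(\sigma^1)\,\mathrm{sh}\,g'(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g'(\sigma^1)\,\mathrm{ch}\,g'(\sigma^2)\rangle}\Bigr)\varphi(2^\ell V)\Bigr)\Bigr| \le \varepsilon K E\varphi(2^\ell V) + K(\ell)\frac{\delta}{\varepsilon'}. \tag{8.19} \]

Proof. We write $g_t(\sigma) = \sqrt{t}\,g(\sigma) + \sqrt{1-t}\,g'(\sigma)$ and
\[ \xi(t) = E_g\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g_t(\sigma^1)\,\mathrm{sh}\,g_t(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g_t(\sigma^1)\,\mathrm{ch}\,g_t(\sigma^2)\rangle}. \tag{8.20} \]
Writing
\[ \Delta_{\ell\ell'} = Eg(\sigma^\ell)g(\sigma^{\ell'}) - Eg'(\sigma^\ell)g'(\sigma^{\ell'}) = 2E\Bigl(\frac{d}{dt}g_t(\sigma^\ell)\Bigr)g_t(\sigma^{\ell'}), \tag{8.21} \]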
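we obtain from (8.20), after integration by parts, as in Lemma 5.2 (and using the fact that $\Delta_{\ell\ell}$ is a constant so that the corresponding terms cancel out)
\[ \xi'(t) = E_g\frac{\langle\psi_{12}\Delta_{12}\rangle_t}{\langle\psi_{12}\rangle_t} - 4E_g\frac{\langle\psi_{12}\psi_{34}\Delta_{13}\varepsilon_2\varepsilon_3\rangle_t}{\langle\psi_{12}\rangle_t^2} + 3E_g\frac{\langle\psi_{12}\psi_{34}\psi_{56}\Delta_{35}\varepsilon_1\varepsilon_2\varepsilon_3\varepsilon_5\rangle_t}{\langle\psi_{12}\rangle_t^3}. \tag{8.22} \]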
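Here, we write $\psi_{\ell\ell'} = \psi(R_{\ell\ell'})$ and we use compact notation as follows. Given $n$ (= 2, 4 or 6), and a function $f$ of $\sigma^1, \ldots, \sigma^n$, $\varepsilon_1, \ldots, \varepsilon_n$, we write $\langle f\rangle_t = \langle \mathrm{Av}\, f\,\mathcal{E}_t\rangle$ for
\[ \mathcal{E}_t = \exp\sum_{\ell\le n}\varepsilon_\ell g_t(\sigma^\ell). \]
Thus, we get
\[ |\xi'(t)| \le E_g\frac{\langle\psi_{12}|\Delta_{12}|\rangle_t}{\langle\psi_{12}\rangle_t} + 7E_g\frac{\langle\psi_{12}\psi_{34}|\Delta_{13}|\rangle_t}{\langle\psi_{12}\rangle_t^2}. \tag{8.23} \]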
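We will explain only how to take care of the last term (the hardest), i.e. to control
\[ E\Bigl(E_g\frac{\langle\psi_{12}\psi_{34}|\Delta_{13}|\rangle_t}{\langle\psi_{12}\rangle_t^2}\,\varphi(2^\ell V)\Bigr). \tag{8.24} \]
Consider a function $\psi^*$, such that $0 \le \psi^* \le 1$, $\psi^*(x) = 0$ if $|x| \le 2\beta p^2\varepsilon$, while $\psi^*(x) = 1$ if $|x| \ge 3\beta p^2\varepsilon$, $\psi^*$ Lipschitz. Then the term (8.24) is at most
\[ 3\varepsilon\beta p^2 E\varphi(2^\ell V) + E\Bigl(\frac{\varphi(2^\ell V)}{\langle\psi_{12}\rangle_t^2}\,E_g\langle\psi_{12}\psi_{34}|\Delta_{13}|\psi^*(\Delta_{13})\rangle_t\Bigr) \le 3\varepsilon\beta p^2 E\varphi(2^\ell V) + K(\ell)E\langle\psi_{12}\psi_{34}|\Delta_{13}|\psi^*(\Delta_{13})\rangle, \tag{8.25} \]
because, since $\mathrm{ch}\,x \ge 1$, we have $\langle\psi_{12}\rangle_t \ge \langle\psi_{12}\rangle = V$. All we have to do is to show that the last term of (8.25) is bounded by $\delta/\varepsilon'$. We recall Lemma 4.2, so that
\[ \Bigl|\Delta_{13} - \frac{\beta p}{2}\bigl(R_{13}^{p-1} - \delta(\sigma^1, \sigma^3)\bigr)\Bigr| \le \frac{K}{N}, \tag{8.26} \]
where $\delta(\sigma, \sigma') = q^{p-1}$ if $(\sigma, \sigma') \in \bigcup_{\alpha\ge 1} C_\alpha^2$, while $\delta(\sigma, \sigma') = 0$ otherwise. Setting
\[ \Delta'_{13} = \frac{\beta p}{2}\bigl(R_{13}^{p-1} - \delta(\sigma^1, \sigma^3)\bigr), \]
we have to show that
\[ E\langle\psi_{12}\psi_{34}|\Delta'_{13}|\psi^*(\Delta'_{13})\rangle \le \frac{\delta}{\varepsilon'}. \tag{8.27} \]
We first show that
\[ E\langle\psi_{12}\psi_{34}|\Delta'_{13}|\psi^*(\Delta'_{13})1_{\{|R_{13}|\le 1/2\}}\rangle \le \frac{\delta}{\varepsilon'}. \tag{8.28} \]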
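Indeed, if $|R(\sigma^1, \sigma^3)| \le 1/2$ then $\delta(\sigma^1, \sigma^3) = 0$, so that the quantity (8.28) is at most
\[ \frac{\beta p}{2}E\langle|R_{13}|^{p-1}1_{\{|R_{13}|\le 1/2\}}\rangle, \tag{8.29} \]
and this goes to zero by Theorem 5.1. Next, the method of Lemma 7.1 shows that
\[ E\langle\psi_{12}\psi_{34}|\Delta'_{13}|\psi^*(\Delta'_{13})1_{\{|R_{13}|\ge 1/2\}}\rangle = E\sum_{\alpha\ge 1} w_\alpha^4\psi(q_\alpha)^2\Bigl|\frac{\beta p}{2}(q_\alpha^{p-1} - q^{p-1})\Bigr|\,\psi^*\Bigl(\frac{\beta p}{2}(q_\alpha^{p-1} - q^{p-1})\Bigr) + \frac{\delta}{\varepsilon'}, \tag{8.30} \]
but the first term on the right-hand side of (8.30) is zero because
\[ \psi(q_\alpha) \ne 0 \Rightarrow |q_\alpha - q| \le \varepsilon + \varepsilon' \le 2\varepsilon \Rightarrow \Bigl|\frac{\beta p}{2}(q_\alpha^{p-1} - q^{p-1})\Bigr| \le \varepsilon\beta p^2 \Rightarrow \psi^*\Bigl(\frac{\beta p}{2}(q_\alpha^{p-1} - q^{p-1})\Bigr) = 0. \]
We have proved (8.19).

The proof of the following mimics that of Lemma 7.4.

Lemma 8.4. We have
\[ \Bigl| E\Bigl(\Bigl(\frac{\langle\psi(R_{12})\,\mathrm{sh}\,g'(\sigma^1)\,\mathrm{sh}\,g'(\sigma^2)\rangle}{\langle\psi(R_{12})\,\mathrm{ch}\,g'(\sigma^1)\,\mathrm{ch}\,g'(\sigma^2)\rangle} - \frac{\sum\eta_\alpha\,\mathrm{sh}^2 g_\alpha}{\sum\eta_\alpha\,\mathrm{ch}^2 g_\alpha}\Bigr)\varphi(2^\ell V)\Bigr)\Bigr| \le K(\ell)\frac{\delta}{\varepsilon'} + E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2\psi^\sim(q_\alpha)\Bigr). \tag{8.31} \]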
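If we combine this with (8.17), (8.19), we have shown the following.

Lemma 8.5. We have
\[ \Bigl| qE\varphi(2^\ell V) - E\Bigl(\frac{\sum\eta_\alpha\,\mathrm{sh}^2 g_\alpha}{\sum\eta_\alpha\,\mathrm{ch}^2 g_\alpha}\,\varphi(2^\ell V)\Bigr)\Bigr| \le \varepsilon K E\varphi(2^\ell V) + K2^{-r} + K(\ell)\frac{\delta}{\varepsilon'} + E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2\psi^\sim(q_\alpha)\Bigr) + b(\ell, N), \tag{8.32} \]
where $b(\ell, N)$ satisfies (8.18).

If we can control the error terms, (8.32) means that
\[ q \simeq \frac{1}{E\varphi(2^\ell V)}\,E\Bigl(\frac{\sum\eta_\alpha\,\mathrm{sh}^2 g_\alpha}{\sum\eta_\alpha\,\mathrm{ch}^2 g_\alpha}\,\varphi(2^\ell V)\Bigr), \]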
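which, when combined with (6.33) and the work in Sec. 7, will yield (1.14). Thus, we turn to the control of the error terms and the averaging arguments. The function $\varphi$ is fixed once and for all. Our "conditioning" construction depends upon the parameters $\varepsilon$, $\varepsilon'$, $\ell$. We write, for a r.v. $Y$,
\[ E_{\varepsilon,\varepsilon',\ell}(Y) = \frac{1}{E\varphi(2^\ell V)}E(Y\varphi(2^\ell V)). \tag{8.33} \]
Considering a parameter $d > 0$, we write
\[ F(\beta, \varepsilon, \varepsilon', \ell, d) = \max\Bigl(\Bigl| q - E_{\varepsilon,\varepsilon',\ell}\Bigl(\frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha}\Bigr)\Bigr| - d,\ 0\Bigr), \tag{8.34} \]
so $F(\beta, \varepsilon, \varepsilon', \ell, d) \ge 0$ and
\[ \Bigl| q - E_{\varepsilon,\varepsilon',\ell}\Bigl(\frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha}\Bigr)\Bigr| \le d + F(\beta, \varepsilon, \varepsilon', \ell, d). \tag{8.35} \]
Considering an integer $\ell_0$, and $\varepsilon_0 > 0$, we write
\[ \mathrm{Av}F(\beta, \varepsilon_0, \varepsilon', \ell_0, d) = \frac{1}{\ell_0}\sum_{\ell_0\le \ell<2\ell_0}\frac{1}{\varepsilon_0}\int_{\varepsilon_0}^{2\varepsilon_0} F(\beta, \varepsilon, \varepsilon', \ell, d)\,d\varepsilon. \]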
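Lemma 8.6. We have
\[ \mathrm{Av}F(\beta, \varepsilon_0, \varepsilon', \ell_0, \varepsilon_0 K)\,P\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}} \ge 2^{-\ell_0+1}\Bigr) \le K(\ell_0)\Bigl(\frac{\varepsilon'}{\varepsilon_0} + \frac{\delta}{\varepsilon'}\Bigr) + K2^{-r} + \frac{K(r)}{\ell_0}. \tag{8.36} \]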
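It is much clearer how to control such an error term than the error term in (8.32). One successively takes $r$ large, $\ell_0$ large, $\varepsilon'$ small.

Proof. We deduce from (8.32) that
\[ E(\varphi(2^\ell V))\,F(\beta, \varepsilon, \varepsilon', \ell, \varepsilon K) \le K2^{-r} + K(\ell)\frac{\delta}{\varepsilon'} + E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2\psi^\sim(q_\alpha)\Bigr) + b(\ell, N). \]
Using Lemma 7.1, we have for $\varepsilon \ge \varepsilon_0$, $\ell \ge \ell_0$,
\[ E\varphi(2^\ell V) \ge -K(\ell)\frac{\delta}{\varepsilon'} + E\varphi\Bigl(2^\ell\sum_{\alpha\ge 1} w_\alpha^2\psi(q_\alpha)\Bigr) \ge -K(\ell)\frac{\delta}{\varepsilon'} + P\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}} \ge 2^{-\ell_0+1}\Bigr), \tag{8.37} \]
since $\psi(x) = 1$ for $|x - q| \le \varepsilon$, $\varphi(x) \ge 1$ for $x \ge 2$.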
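Combining with (8.37) we get
\[ P\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}} \ge 2^{-\ell_0+1}\Bigr)F(\beta, \varepsilon, \varepsilon', \ell, \varepsilon_0 K) \le K(\ell_0)\frac{\delta}{\varepsilon'} + E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2\psi^\sim(q_\alpha)\Bigr) + K2^{-r} + b(\ell, N). \]
We observe that for any $x$ we have
\[ \int_{\varepsilon_0}^{2\varepsilon_0}\psi^\sim(x)\,d\varepsilon \le \varepsilon'. \]
The result follows, using (8.18).

We need a similar averaging argument to be able to use Theorem 7.6. We write
\[ G^{m'}(\beta, \varepsilon, \varepsilon', \ell, k, d) = \sum_{k_1+\cdots+k_n\le k}\max\bigl(|S(k_1, \ldots, k_n) - S^{(m')}(k_1, \ldots, k_n)| - d,\ 0\bigr). \tag{8.38} \]
Here we follow the notation of Sec. 7, and the dependence of the right-hand side on $\varepsilon$, $\varepsilon'$, $\ell$ is through $S(k_1, \ldots, k_n)$. We write
\[ \mathrm{Av}\,G^{m'}(\beta, \varepsilon_0, \varepsilon', \ell_0, k, d) = \frac{1}{\ell_0}\sum_{\ell_0\le \ell<2\ell_0}\frac{1}{\varepsilon_0}\int_{\varepsilon_0}^{2\varepsilon_0} G^{m'}(\beta, \varepsilon, \varepsilon', \ell, k, d)\,d\varepsilon. \]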
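Lemma 8.7. We have
\[ \mathrm{Av}\,G^{m'}(\beta, \varepsilon_0, \varepsilon', \ell_0, k, K(k)\varepsilon_0)\,P\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q_\alpha-q|\le\varepsilon_0\}} \ge 2^{-\ell_0+1}\Bigr) \le \frac{K(k)}{\ell_0} + K(k, \ell_0)\Bigl(\frac{\varepsilon'}{\varepsilon_0} + \frac{\delta}{\varepsilon'}\Bigr). \tag{8.39} \]

Proof. This follows from (7.52) the way (8.36) follows from (8.32).

We consider a Gaussian r.v. $X$ with $EX^2 = \beta^2 p q^{p-1}/2$, and $m = T_N(\beta)/q^p$, $m' = m/2$.

Lemma 8.8. Given $\eta > 0$, we can find $k$, $\varepsilon_0$, $\theta > 0$ depending only upon $\eta$ such that if
\[ \Bigl| q - \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}\Bigr| \ge \eta, \tag{8.40} \]
then, for each $\varepsilon'$, $\ell_0$, we have
\[ \mathrm{Av}\,F(\beta, \varepsilon_0, \varepsilon', \ell_0, K(k)\varepsilon_0) + \mathrm{Av}\,G^{m'}(\beta, \varepsilon_0, \varepsilon', \ell_0, k, K(k)\varepsilon_0) \ge \theta. \tag{8.41} \]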
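Proof. It follows from Proposition 6.4 that we can find a number $k$ and a number $\xi > 0$, depending only upon $\eta$ such that, if the random weights $(\eta_\alpha)_{\alpha\ge 1}$ satisfy
\[ \forall n,\ \forall k_1, \ldots, k_n \ge 1,\quad \sum_{s\le n} k_s \le k \Rightarrow \Bigl| E\Bigl(\prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s}\Bigr) - S^{(m')}(k_1, \ldots, k_n)\Bigr| \le \xi, \tag{8.42} \]
then
\[ \Bigl| E\Bigl(\frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha}\Bigr) - \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}\Bigr| \le \frac{\eta}{2}. \tag{8.43} \]
Let us denote by $c$ the left-hand side of (8.41). If two functions have an average less than $c$, there exists a point where both are at most $2c$. Thus we can find $\varepsilon$ and $\ell$ such that
\[ \Bigl| q - E_{\varepsilon,\varepsilon',\ell}\Bigl(\frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha}\Bigr)\Bigr| \le 2c + K(k)\varepsilon_0, \tag{8.44} \]
\[ \forall k_1, \ldots, k_n,\quad \sum_{s\le n} k_s \le k,\quad |S(k_1, \ldots, k_n) - S^{(m')}(k_1, \ldots, k_n)| \le 2c + K(k)\varepsilon_0. \tag{8.45} \]
Combining (8.44) and (8.40), we have
\[ \Bigl| E_{\varepsilon,\varepsilon',\ell}\Bigl(\frac{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{sh}^2 X_\alpha}{\sum_{\alpha\ge 1}\eta_\alpha\,\mathrm{ch}^2 X_\alpha}\Bigr) - \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}\Bigr| \ge \eta - 2c - K(k)\varepsilon_0. \tag{8.46} \]
Recalling that
\[ S(k_1, \ldots, k_n) = E_{\varepsilon,\varepsilon',\ell}\Bigl(\prod_{s\le n}\sum_{\alpha\ge 1}\eta_\alpha^{k_s}\Bigr), \]
we see from (8.45) and the implication (8.42) $\Rightarrow$ (8.43) that we must have either $2c + K(k)\varepsilon_0 > \xi$ or $\eta - 2c - K(k)\varepsilon_0 < \eta/2$, so that in any case we have
\[ 2c \ge \min\Bigl(\xi, \frac{\eta}{2}\Bigr) - K(k)\varepsilon_0. \]
We conclude the proof by taking $\theta = \min(\xi, \eta/2)/4$ and $\varepsilon_0$ small enough, depending only upon $\eta$.

Corollary 8.9. Given $\eta > 0$, there exists $\varepsilon_0 > 0$, depending only upon $\eta$ such that, if (8.40) holds, then, for each integer $\ell_0$, we have
\[ P\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}} \ge 2^{-\ell_0+1}\Bigr) \le K(\ell_0, \eta)\Bigl(\frac{\varepsilon'}{\varepsilon_0} + \frac{\delta}{\varepsilon'}\Bigr) + K(\eta)2^{-r} + \frac{K(r, \eta)}{\ell_0}. \tag{8.47} \]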
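Proof. Combine the three previous lemmas.

We now define the set
\[ J(\eta, \beta, N) = \Bigl\{ x \in \Bigl[\frac12, 1\Bigr];\ \exists q \in \Bigl[\frac12, 1\Bigr],\ |x - q| \le \eta;\ \Bigl| q - \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}\Bigr| \le \eta \Bigr\}, \tag{8.48} \]
where, as usual, $EX^2 = \beta^2 p q^{p-1}/2$ and $m = T_N(\beta)/q^{p-1}$.

Theorem 8.10. We have
\[ E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2;\ q_\alpha \notin J(\eta, \beta, N)\Bigr) = \delta. \tag{8.49} \]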
What (8.49) means is that the left-hand side of (8.49), at $h'$ fixed, goes to zero if we average over $\beta$ in an interval (while staying in the domain $\beta \le 2^{p/2}$, $(\beta, h)$ in the region of Theorem 2.1).

Proof. We consider $\varepsilon_0 > 0$, depending upon $\eta$ only as provided by Corollary 8.9. We can assume $\varepsilon_0 \le \eta$. Considering $x$ in $[0, 1]$, there is an integer $n$ such that $|x - n\varepsilon_0| \le \varepsilon_0 \le \eta$. If $q = \varepsilon_0 n$ fails (8.40), then $x \in J(\eta, \beta, N)$. Thus if $x \notin J(\eta, \beta, N)$, $q = \varepsilon_0 n$ must satisfy (8.40). Thus
\[ E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2;\ q_\alpha \notin J(\eta, \beta, N)\Bigr) \le \sum E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2;\ |q_\alpha - n\varepsilon_0| \le \varepsilon_0\Bigr) \]
where the sum on the right is over $n \le 1/\varepsilon_0$ such that $q = n\varepsilon_0$ satisfies (8.40). Which terms are in that sum depends upon $N$, $\beta$ (through $T_N(\beta)$). It now suffices to show that for each $q$,
\[ A\,E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}}\Bigr) = \delta \tag{8.50} \]
where $A = 1$ if (8.40) holds and $A = 0$ otherwise. For a r.v. $0 \le X \le 1$, we have
\[ E(X) \le 2^{-\ell_0+1} + P(X \ge 2^{-\ell_0+1}), \]
and combining this with Corollary 8.9 shows that the left-hand side of (8.50) is at most
\[ 2^{-\ell_0+1} + K(\ell_0, \eta)\Bigl(\frac{\varepsilon'}{\varepsilon_0} + \frac{\delta}{\varepsilon'}\Bigr) + K(\eta)2^{-r} + \frac{K(r, \eta)}{\ell_0}. \]
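Thus
\[ \limsup_{N\to\infty}\int_0^{2^{p/L}} A\,E\Bigl(\sum_{\alpha\ge 1} w_\alpha^2 1_{\{|q-q_\alpha|\le\varepsilon_0\}}\Bigr)d\beta \le K2^{-\ell_0+1} + K(\ell_0, \eta)\frac{\varepsilon'}{\varepsilon_0} + K(\eta)2^{-r} + \frac{K(r, \eta)}{\ell_0}. \]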
We let $\varepsilon' \to 0$, then $\ell_0 \to \infty$, then $r \to \infty$ to finish the proof.

To finish the proof of Theorem 1.1, it suffices to prove the following.

Proposition 8.11. If $1 \le \beta \le 2^{p/L}$, $p \ge L$, the system of Eqs. (1.13), (1.14) has a unique solution such that $q \ge 1 - 2^{-p/L}$.

Proof. Considering the function
\[ \Phi(q, m) = \frac{E\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E\,\mathrm{ch}^m X}, \]
we show that the map
\[ (q, m) \mapsto \Bigl(\Phi(q, m),\ 1 - \frac{T_N}{q^p}\Bigr) \]
sends $[1 - 2^{-p/L}, 1] \times [1/L\beta, 1]$ into itself. First, we note that $\Phi(q, m) \le 1$, $1 - T_N/q^p \le 1$. We observe that $T_N \le 1 - 1/L\beta$. This follows from (6.8), using that, by (2.22), we have $|E\langle -H_N/N\rangle| \le \sqrt{\log 2}$. Thus, if $q \ge 1 - 2^{-p/L}$, we have
\[ m = 1 - \frac{T_N}{q^p} \ge 1 - T_N(1 - 2^{-p/L})^{-1} \ge 1 - \Bigl(1 - \frac{1}{L\beta}\Bigr)(1 - 2^{-p/L})^{-1} \ge \frac{1}{L\beta} \tag{8.51} \]
for $\beta \ge 1$, $\beta \le 2^{p/L}$. Now
\[ \Phi(q, m) = 1 - \frac{E\,\mathrm{ch}^{m-2} X}{E\,\mathrm{ch}^m X} \ge 1 - \frac{1}{E\,\mathrm{ch}^m X}, \]
and
\[ E\,\mathrm{ch}^m X \ge 2^{-m}E\exp mX = 2^{-m}\exp\Bigl(\frac{m^2\beta^2 p}{4}q^{p-1}\Bigr) \ge \frac12\exp\frac{p}{L}, \]
using again that $m \ge 1/L\beta$. Thus indeed $\Phi(q, m) \ge 1 - 2\exp(-p/L)$. The function
\[ f(q) = \Phi\Bigl(q,\ 1 - \frac{T_N}{q^p}\Bigr) \tag{8.52} \]
satisfies $f(1 - 2^{-p/L}) > 1 - 2^{-p/L}$, $f(1) < 1$, so in between these values there is a number $q$ with $q = f(q)$. To show that this number is unique we show that $f' < 1$ on the previous interval. This is because the partial derivatives of $\Phi$ with respect to $q$, $m$ are exponentially small in $p$. This follows from (8.51), and elementary considerations. We have finished the proof of Theorem 1.1.

Now that we know that the only possible value of $q_\alpha$ is given by (1.13), (1.14), we can use the argument of Sec. 6 (see (6.27)) to see that the distribution of the sequence $(w_\alpha)_{\alpha\ge 1}$ is about $\Lambda_{m_N(\beta)}$. (When $m_N(\beta) = 1$, i.e. $T_N(\beta) = 0$, this means of course that there are "no macroscopic weights".)

Theorem 1.1 deals only with the case $p$ odd, and we now investigate the case $p$ even. In that case, Gibbs' measure is invariant under the symmetry $\sigma \to -\sigma$. The pure states $C_\alpha$ go by pairs, $C_\alpha$, $C_{\varphi(\alpha)} = -C_\alpha$, and $G_N(C_\alpha) = G_N(-C_\alpha)$. The only change to make in the proof of Theorem 1.1 is that in (8.5) it is no longer true that the terms in the sums are nearly independent; but this is true after one regroups the contributions of $C_\alpha$ and $-C_\alpha$. We then conclude that $q_\alpha$ can essentially only be equal to $q$, so that the overlaps are asymptotically $q$, $0$, or $-q$, the latter being obtained as the overlap of a configuration in $C_\alpha$ and one in $-C_\alpha$.

9. The Perturbed Hamiltonian and the Extended Ghirlanda–Guerra Identities

We would like to have the identities (6.4) when $R^p$ is replaced by any other power of $R$. Following an idea of [13] this is possible if one adds to the Hamiltonian a smaller order term that "contains an $s$-spin interaction for each integer $s > 0$". More precisely, we consider
\[ g_N^{(s)}(\sigma) = \Bigl(\frac{s!}{N^{s-1}}\Bigr)^{1/2}\sum g^{(s)}_{i_1\cdots i_s}\sigma_{i_1}\cdots\sigma_{i_s}, \tag{9.1} \]
where the summation is over $1 \le i_1 < \cdots < i_s \le N$. (Thus $g_N^{(s)}(\sigma) = 0$ if $s > N$.) In (9.1), the r.v. $g^{(s)}_{i_1\cdots i_s}$ are all independent standard normal. Given $\boldsymbol\beta$, we define the "perturbation term of the Hamiltonian" by
\[ -\beta H_N^{\mathrm{per}}(\sigma) = \xi(N)\sum_{s\le N} 2^{-s}\beta_s g_N^{(s)}(\sigma), \tag{9.2} \]
where $-1 \le \beta_s \le 1$, and where $\xi(N) = N^{-1/6}$. The purpose of the factor $2^{-s}$ is to ensure convergence. There is nothing magical about the power $N^{-1/6}$. One could also take $N^{-a}$, $0 < a < 1/4$. The full Hamiltonian is now given by
\[ -\beta H_N^{\mathrm{full}}(\sigma) = -\beta H_N(\sigma) - \beta H_N^{\mathrm{per}}(\sigma) \tag{9.3} \]
and $N^{-1}E\log Z_N$ is now a function $p_N(\beta, h, \boldsymbol\beta)$ where $\boldsymbol\beta = (\beta_1, \beta_2, \ldots)$.
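To get a feel for the normalization in (9.1) and (9.2), note that for a fixed configuration the variance of $g_N^{(s)}(\sigma)$ is $(s!/N^{s-1})\binom{N}{s} = N\prod_{1\le j<s}(1 - j/N) \le N$, so the perturbation (9.2) has variance at most $\xi(N)^2 N\sum_{s\ge 1}4^{-s} = N^{2/3}/3$, of lower order than $N$. The following sketch checks this with exact arithmetic; the function names and the choice $N = 1000$ are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, factorial

def var_g(N, s):
    """Exact variance of g_N^{(s)}(sigma) in (9.1): (s!/N^(s-1)) * C(N, s)."""
    return Fraction(factorial(s) * comb(N, s), N ** (s - 1))

def var_perturbation(N, betas):
    """Variance of xi(N) * sum_s 2^{-s} beta_s g_N^{(s)}(sigma), with xi(N) = N^{-1/6}.

    betas[s - 1] plays the role of beta_s; terms for different s are independent.
    """
    xi_squared = N ** (-1.0 / 3.0)
    return xi_squared * sum(4.0 ** (-s) * b * b * float(var_g(N, s))
                            for s, b in enumerate(betas, start=1))

N = 1000
assert all(var_g(N, s) <= N for s in range(1, 8))
assert var_perturbation(N, [1.0] * 20) <= N ** (2.0 / 3.0) / 3.0
```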
The following proves that the perturbation term is indeed small in some sense.

Lemma 9.1. We have
\[ p_N(\beta, h, 0) \le p_N(\beta, h, \boldsymbol\beta) \le p_N(\beta, h, 0) + \sum_{s\ge 1}\frac{2^{-s}\beta_s^2}{2}\,\xi^2(N). \]
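Proof. The right-hand side follows by Jensen's inequality, integrating in the r.v. $g^{(s)}_{i_1\cdots i_s}$ inside the log rather than outside. The left-hand side is obtained by observing that
\[ p_N(\beta, h, \boldsymbol\beta) - p_N(\beta, h, 0) = \frac{1}{N}E\log\langle\exp(-\beta H_N^{\mathrm{per}}(\sigma))\rangle \]
where the bracket is for the choice of parameters corresponding to $p_N(\beta, h, 0)$. Now
\[ E\log\langle\exp(-\beta H_N^{\mathrm{per}}(\sigma))\rangle \ge E\log\exp(-\beta\langle H_N^{\mathrm{per}}(\sigma)\rangle) = -\beta E\langle H_N^{\mathrm{per}}(\sigma)\rangle = 0, \]
as is seen by integrating first in $g^{(s)}_{i_1\cdots i_s}$ at $g_{i_1\cdots i_p}$ fixed.

Lemma 9.2. If $\beta \le 2^p$ we have
\[ \int_{-1}^{1} E\Bigl\langle\Bigl(\frac{g_N^{(s)}(\sigma)}{N} - E\Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle\Bigr)^2\Bigr\rangle d\beta_s \le K(p)\Bigl(\frac{2^s}{N\xi(N)} + \frac{2^{2s}}{\xi(N)^2\sqrt{N}}\Bigr). \]
It is understood that in all the brackets, the parameters are $\beta$, $h$, $\boldsymbol\beta$. It is in this lemma that the condition $\xi(N)N^{1/4} \to \infty$ arises.

Proof. The proof mimics that of Lemma 6.1. We start with
\[ \frac{\partial p_N}{\partial\beta_s}(\beta, h, \boldsymbol\beta) = 2^{-s}\xi(N)E\Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle, \]
\[ \frac{\partial^2 p_N}{\partial\beta_s^2}(\beta, h, \boldsymbol\beta) = 2^{-2s}N\xi(N)^2 E\Bigl(\Bigl\langle\Bigl(\frac{g_N^{(s)}(\sigma)}{N}\Bigr)^2\Bigr\rangle - \Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle^2\Bigr), \]
so that
\[ \int_{-1}^{1} E\Bigl(\Bigl\langle\Bigl(\frac{g_N^{(s)}(\sigma)}{N}\Bigr)^2\Bigr\rangle - \Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle^2\Bigr)d\beta_s \le \frac{L2^s}{N\xi(N)}. \]
As in the proof of Lemma 6.1, we deduce from Proposition 3.4 of [10] that
\[ \int_{-1}^{1} E\Bigl(\Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle - E\Bigl\langle\frac{g_N^{(s)}(\sigma)}{N}\Bigr\rangle\Bigr)^2 d\beta_s \le \frac{K(p)}{(2^{-s}\xi(N))^2\sqrt{N}}. \]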
Proposition 9.3 (Extended Ghirlanda–Guerra identities). Given a function $f$ on $k$ replicas, $|f| \le 1$, and a continuous function $\xi$, we have
\[ E\langle\xi(R_{1,k+1})f(\sigma^1, \ldots, \sigma^k)\rangle = \frac{1}{k}E\langle\xi(R_{12})\rangle E\langle f\rangle + \frac{1}{k}\sum_{2\le \ell\le k} E\langle\xi(R_{1,\ell})f\rangle + \delta, \tag{9.4} \]
where
\[ \lim_{N\to\infty}\int\delta\,d\boldsymbol\beta = 0, \tag{9.5} \]
for an integral over $-1 \le \beta_s \le 1$ for each $s \ge 1$.

Proof. By approximation one can assume that $\xi$ is a polynomial, and by linearity that it is a power, in which case Lemma 9.2 allows one to prove (9.4) as in Theorem 6.2.
Throughout the rest of the paper, $\delta$ will denote a quantity such as in (9.5). Proposition 9.3 is a statement of amazing power as we will now show. There is nothing to change in the work of Sec. 2 in the case of the full Hamiltonian (as is seen along the lines of Lemma 9.1). Thus we can construct the sets $(C_\alpha)$ as in Sec. 3, and we denote $w_\alpha = G_N(C_\alpha)$ (where the Gibbs' measure now corresponds to the full Hamiltonian). Throughout the rest of the paper, we write
\[ m = m_N(\beta, h, \boldsymbol\beta) = E\langle 1_{\{R_{12}\ge 3/4\}}\rangle. \tag{9.6} \]
We recall the notation $S^{(m)}(k_1, \ldots, k_n)$ of (6.20).

Theorem 9.4. For any integers $n, k_1, \ldots, k_n$ we have
\[ E\prod_{s\le n}\sum_{\alpha\ge 1} w_\alpha^{k_s} - S^{(m)}(k_1, \ldots, k_n) = \delta. \tag{9.7} \]
This fact should be obvious following the method of Sec. 6, i.e. recursive use of (9.4) for a function ξ such that ξ(x) = 0 if x ≤ 1/2, ξ(x) = 1 if x ≥ 3/4. (Let us insist that the argument makes essential use of the fact that we know a priori that R12 essentially never belongs to the interval [1/2, 3/4].) The meaning of Theorem 9.4 is essentially that the weights of the lumps have a Poisson–Dirichlet distribution Λm . As we mentioned earlier, not controlling this distribution was the main obstacle in using the cavity method. Once this obstacle has been passed (almost effortlessly) the problem becomes much easier, as will be shown in Sec. 10. It is of course disturbing that the perturbation term seems to bring information out of nowhere. A possible explanation is that at a certain deep level (not yet understood) this information is “generically present” and that adding the perturbation term eliminates the exceptional “unstable” situations that escape the general rule.
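For readers who want to experiment with the limiting object, here is a rough simulation sketch. It assumes the standard construction of a Poisson–Dirichlet sequence with parameter $0 < m < 1$: take the points of a Poisson process of intensity $x^{-m-1}dx$ on $(0,\infty)$ and normalize them by their sum. Whether this matches the normalization of $\Lambda_m$ in Sec. 6 exactly is an assumption, since that definition is not reproduced here. For this construction the expected sum of squared weights is $1-m$, which is the kind of moment that Theorem 9.4 controls. The function name, the truncation to finitely many points and the sample sizes are illustrative choices.

```python
import numpy as np

def sample_pd_weights(m, n_points, n_samples, rng):
    """Approximate samples of Poisson-Dirichlet weights with parameter 0 < m < 1.

    Up to a constant factor, the points of a Poisson process of intensity
    x^(-m-1) dx are Gamma_k^(-1/m), where Gamma_k are the arrival times of a
    rate-one Poisson process. Keeping the n_points largest is a truncation.
    """
    gaps = rng.exponential(size=(n_samples, n_points))
    arrivals = np.cumsum(gaps, axis=1)
    points = arrivals ** (-1.0 / m)
    return points / points.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
m = 0.5
w = sample_pd_weights(m, n_points=500, n_samples=4000, rng=rng)
mean_sum_squares = float((w ** 2).sum(axis=1).mean())
print(mean_sum_squares)  # Monte Carlo estimate, to be compared with 1 - m
```

The estimate carries both Monte Carlo error and a small truncation bias, so it should only be compared with $1-m$ up to a few percent.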
One clear occurrence of this is when $h = 0$, $p$ is even. Without the perturbation term, the pure states go by symmetric pairs. We will show in the next section that the perturbation term breaks the symmetry.

How do we change the problem by adding the perturbation term to the Hamiltonian? The answer to that question really depends on what we study. If we study the overlap of two configurations, the example of $h = 0$, $p$ even shows that we do change the problem (the overlap takes essentially two values rather than three). On the other hand, if we are not interested in the detailed structure of Gibbs' measure, but only in the asymptotic computation of $p_N$, Lemma 9.1 shows that we have not changed the problem.

10. The Model with External Field

The purpose of this section is to prove Theorem 10.1, which extends Theorem 1.1 to the case $h \ne 0$, provided we accept to add the perturbation term of Sec. 9 to the Hamiltonian. Given two numbers $q_0 \le q_1$, and two independent standard normal r.v. $z$, $g$, we consider
\[ X = \beta\sqrt{\frac{p}{2}}\Bigl(g\sqrt{q_1^{p-1} - q_0^{p-1}} + z\sqrt{q_0^{p-1}}\Bigr) + \beta h. \tag{10.1} \]
We denote by $E_g$ (resp. $E_z$) expectation at $z$ (resp. $g$) fixed. We set
\[ m = m_N = 1 - E(\langle 1_{\{R_{12}\ge 1/2\}}\rangle). \tag{10.2} \]
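Theorem 10.1. There exists a number $L$ with the following property. If $p > L$, $1 \le \beta \le 2^{p/L}$, $h \le 1/L$, then the system of equations
\[ q_0 = E_z\Bigl(\frac{E_g(\mathrm{th}\,X\,\mathrm{ch}^m X)}{E_g\,\mathrm{ch}^m X}\Bigr)^2, \tag{10.3} \]
\[ q_1 = E_z\frac{E_g(\mathrm{th}^2 X\,\mathrm{ch}^m X)}{E_g\,\mathrm{ch}^m X} \tag{10.4} \]
has a unique solution $(q_0, q_1)$. Given $\varepsilon > 0$, we have
\[ \lim_{N\to\infty}\int EG_N^{\otimes 2}(\{|R_{12} - q_0| \ge \varepsilon \text{ and } |R_{12} - q_1| \ge \varepsilon\})\,d\boldsymbol\beta = 0. \tag{10.5} \]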
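The right-hand sides of (10.3) and (10.4) are nested Gaussian expectations and can be evaluated numerically. The sketch below does this by Gauss–Hermite quadrature, treating $m$ as a given number in $(0,1]$ (in the theorem it is defined through (10.2)). The function names, the number of nodes and the sample parameters are illustrative assumptions, and the small value of $p$ used in the example lies outside the regime $p > L$ of the theorem.

```python
import numpy as np

def log_cosh(x):
    """Numerically stable log(cosh(x))."""
    ax = np.abs(x)
    return ax + np.log1p(np.exp(-2.0 * ax)) - np.log(2.0)

def rhs(q0, q1, beta, h, p, m, n_nodes=80):
    """Right-hand sides of (10.3) and (10.4) for X as in (10.1); needs q0 <= q1."""
    x, w = np.polynomial.hermite_e.hermegauss(n_nodes)  # weight exp(-x^2 / 2)
    w = w / w.sum()                                     # normalize to N(0, 1)
    z = x[:, None]                                      # outer variable
    g = x[None, :]                                      # inner variable
    a0, a1 = q0 ** (p - 1), q1 ** (p - 1)
    X = beta * np.sqrt(p / 2.0) * (g * np.sqrt(a1 - a0) + z * np.sqrt(a0)) + beta * h
    log_tilt = m * log_cosh(X)
    log_tilt -= log_tilt.max(axis=1, keepdims=True)     # cancels in the ratios below
    tilt = w[None, :] * np.exp(log_tilt)
    denom = tilt.sum(axis=1)
    e_th = (tilt * np.tanh(X)).sum(axis=1) / denom        # E_g(th X ch^m X) / E_g ch^m X
    e_th2 = (tilt * np.tanh(X) ** 2).sum(axis=1) / denom  # same with th^2
    return float(w @ e_th ** 2), float(w @ e_th2)

r0, r1 = rhs(q0=0.1, q1=0.9, beta=1.0, h=0.1, p=3, m=0.5)
```

By the Cauchy–Schwarz inequality applied to the tilted measure, the first returned value never exceeds the second, in line with the requirement $q_0 \le q_1$.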
In (10.5), $\beta$ is fixed, and the average is over $\boldsymbol\beta$, such that $-1 \le \beta_s \le 1$, for $s \ge 1$. Gibbs' measure in (10.5) refers to the Hamiltonian (9.3). The only reason for the requirement $\beta \ge 1$ is to ensure that there is a solution to (10.3), (10.4). In fact, in the setting of Proposition 2.18 (or more generally when $m \to 1$ as $N \to \infty$) one can interpret (10.3) as meaning
\[ q_0 = E_z\,\mathrm{th}^2\Bigl(\beta\sqrt{\frac{pq_0^{p-1}}{2}}\,z + \beta h\Bigr). \]
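One way to see why (10.3) takes this form at $m = 1$: then $\mathrm{th}\,X\,\mathrm{ch}\,X = \mathrm{sh}\,X$, and for $X = a + bg$ with $g$ standard normal one has $E_g\,\mathrm{sh}\,X = \mathrm{sh}(a)e^{b^2/2}$ and $E_g\,\mathrm{ch}\,X = \mathrm{ch}(a)e^{b^2/2}$, so the ratio in (10.3) equals $\mathrm{th}\,a$. The sketch below checks this numerically and iterates the replica-symmetric equation displayed above. Function names and parameter values are illustrative assumptions, and the plain iteration is not claimed to converge in general.

```python
import numpy as np

X_NODES, WEIGHTS = np.polynomial.hermite_e.hermegauss(80)
WEIGHTS = WEIGHTS / WEIGHTS.sum()  # quadrature weights for a standard normal

def gauss_mean(f):
    """E f(g) for g standard normal, by Gauss-Hermite quadrature."""
    return float(WEIGHTS @ f(X_NODES))

def ratio_m1(a, b):
    """E_g sh(a + b g) / E_g ch(a + b g): the ratio in (10.3) when m = 1."""
    return gauss_mean(lambda g: np.sinh(a + b * g)) / gauss_mean(lambda g: np.cosh(a + b * g))

def rs_map(q0, beta, h, p):
    """Right-hand side of the replica-symmetric equation for q0."""
    c = beta * np.sqrt(p * q0 ** (p - 1) / 2.0)
    return gauss_mean(lambda z: np.tanh(c * z + beta * h) ** 2)

def rs_fixed_point(beta, h, p, q_init=0.0, n_iter=200):
    """Plain fixed-point iteration q <- rs_map(q)."""
    q = q_init
    for _ in range(n_iter):
        q = rs_map(q, beta, h, p)
    return q

q_rs = rs_fixed_point(beta=0.5, h=0.2, p=3)
```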
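In that case, asymptotically, $R_{12}$ takes only the value $q_0$. (This is the so-called replica-symmetric solution.)

Before we start the proof of Theorem 10.1, we need to know that the overlap cannot take values close to $-1$, even if $p$ is even.

Lemma 10.2. If $\beta \le 2^{p/L}$, $h \le 1/L$, we have
\[ E\langle 1_{\{R_{12}\le -1/2\}}\rangle = \delta. \tag{10.6} \]

Proof. With the notation of the discussion following Theorem 3.2, we have, combining (3.7) and (3.9), that
\[ EG_N^{\otimes 2}\Bigl(\Bigl\{R_{12} \le -\frac12\Bigr\}\setminus\bigcup_{\alpha\ge 1} C_\alpha\times C_{\varphi(\alpha)}\Bigr) \le K\exp\Bigl(-\frac{N}{K}\Bigr) \]
and all we need to show is that
\[ EG_N^{\otimes 2}\Bigl(\bigcup_{\alpha\ge 1} C_\alpha\times C_{\varphi(\alpha)}\Bigr) = E\Bigl(\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\Bigr) = \delta. \tag{10.7} \]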
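To do this, we observe that for two continuous functions $\theta$, $\psi$ on $\mathbb{R}$, the extended Ghirlanda–Guerra relations imply that
\[ E\langle\theta(R_{13})\psi(R_{12})\rangle = \frac12 E\langle\theta(R_{12})\rangle E\langle\psi(R_{12})\rangle + \frac12 E\langle\theta(R_{12})\psi(R_{12})\rangle + \delta. \tag{10.8} \]
We take $\psi$ such that $\psi(x) = 1$ if $x \le -3/4$, $\psi(x) = 0$ if $x \ge -1/2$. Thus it is (essentially) true that $\psi(R_{12}) = 1$ if and only if $\sigma^2 \in C_{\varphi(\alpha)}$, where $\alpha$ is such that $\sigma^1 \in C_\alpha$. Taking $\theta = \psi$, we see that (10.8) implies
\[ E\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}^2 = \frac12\Bigl(E\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\Bigr)^2 + \frac12 E\Bigl(\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\Bigr) + \delta. \tag{10.9} \]
Taking $\theta(x) = \psi(-x)$, since $\theta(x)\psi(x) = 0$, we get now from (10.8) that
\[ E\sum_{\alpha\ge 1} w_\alpha^2 w_{\varphi(\alpha)} = \frac12 E\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\,E\sum_{\alpha\ge 1} w_\alpha^2 + \delta. \tag{10.10} \]
Since $\sum_{\alpha\ge 1} w_\alpha^2 w_{\varphi(\alpha)} = \sum_{\alpha\ge 1} w_{\varphi(\alpha)}^2 w_\alpha$, comparing (10.9) and (10.10) gives
\[ E\Bigl(\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)}\Bigr)\Bigl[1 + E\sum_{\alpha\ge 1} w_\alpha w_{\varphi(\alpha)} - E\sum_{\alpha\ge 1} w_\alpha^2\Bigr] = \delta \]
and since $\sum_{\alpha\ge 1} w_\alpha^2 \le 1$, this implies (10.7).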
We now start the main argument of the proof of Theorem 10.1. Considering two numbers $q_0$, $q_1$ to be determined later, we define
\[ q_{12} = q_{12}(\sigma^1, \sigma^2) = \begin{cases} q_1 & \text{if } R_{12} \ge \frac34, \\ q_0 & \text{if } R_{12} < \frac34. \end{cases} \tag{10.11} \]
We will study the quantity
\[ A_N(\beta, h', \boldsymbol\beta) = E\langle(R_{12} - q_{12})^2\rangle, \tag{10.12} \]
where Gibbs' measure is of course for the values $\beta$, $h'$, $\boldsymbol\beta$ of the parameters. Using the symmetry between sites,
\[ A_N(\beta, h, \boldsymbol\beta) \le \frac{4}{N} + E\langle(\sigma^1_{N-1}\sigma^2_{N-1} - q_{12})(\sigma^1_N\sigma^2_N - q_{12})\rangle. \tag{10.13} \]
We will use a technique related to that of Secs. 4 and 5, but to make the proof work it seems required to distinguish two coordinates rather than one. We set
\[ \beta'' = \Bigl(\frac{N+2}{N}\Bigr)^{(p-1)/2}\beta, \tag{10.14} \]
\[ \boldsymbol\beta'' = (\beta_s''), \quad\text{where } \beta_s''\xi(N+2) = \beta_s\xi(N) \text{ for } s \ge 1. \tag{10.15} \]
Lemma 10.3. We have
\[ A_{N+2}(\beta'', h', \boldsymbol\beta'') = E\frac{\langle\mathrm{Av}(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\mathcal{E}\rangle}{\langle\mathrm{Av}\,\mathcal{E}\rangle} + \delta, \tag{10.16} \]
where Av is average over $\eta_1, \eta_2, \varepsilon_1, \varepsilon_2 = \pm 1$,
\[ \mathcal{E} = \exp\Bigl(\sum_{\ell\le 2}\bigl(\eta_\ell(g'(\sigma^\ell) + h') + \varepsilon_\ell(g(\sigma^\ell) + h')\bigr)\Bigr), \tag{10.17} \]
$g(\sigma)$ is given by (4.12) and the process $(g'(\sigma))$ is an independent copy of the process $(g(\sigma))$. In the right-hand side of (10.16), Gibbs' measure is for the value $(\beta, h', \boldsymbol\beta)$ of the parameters.

Proof. This formula is clearly related to (4.15). One uses (10.13) for $N + 2$ rather than $N$, and one makes explicit the contribution of the last two spins. The right-hand side of (10.16) does not however exactly arise from the second term in (10.13). For equality to hold, in $\mathcal{E}$ there would be terms taking into account the perturbation term in the Hamiltonian and there would also be an interaction term between the $(N+1)$th spin $\eta$ and the $(N+2)$th spin $\varepsilon$. These extra terms are obviously of lower order. Also, in an identity, we would have to define $q_{12}$ as in (10.11) but using $R''_{12}$ rather than $R_{12}$, where
\[ R''_{12} = \frac{1}{N+2}(NR_{12} + \eta_1\eta_2 + \varepsilon_1\varepsilon_2). \tag{10.18} \]
But (as we used several times) this makes little difference since $R_{12}$ is essentially never in $[1/2, 1 - 2^{-p/L}]$.

To take advantage of (10.16), we will replace the processes $g(\sigma)$, $g'(\sigma)$ by simpler ones. We consider i.i.d. $N(0, 1)$ r.v. $z$, $g_\alpha$, and we define
\[ \gamma(\sigma) = \beta\sqrt{\frac{p}{2}}\Bigl(\sqrt{q_0^{p-1}}\,z + \sqrt{q_1^{p-1} - q_0^{p-1}}\,g_\alpha\Bigr) \tag{10.19} \]
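for $\sigma \in C_\alpha$. Thus we have
\[ E\gamma(\sigma^1)\gamma(\sigma^2) = \frac{\beta^2 p}{2}q_1^{p-1} = \frac{\beta^2 p}{2}q_{12}^{p-1} \tag{10.20} \]
if $(\sigma^1, \sigma^2) \in \bigcup_{\alpha\ge 1} C_\alpha^2$, while we have
\[ E\gamma(\sigma^1)\gamma(\sigma^2) = \frac{\beta^2 p}{2}q_0^{p-1} = \frac{\beta^2 p}{2}q_{12}^{p-1} \tag{10.21} \]
otherwise. We consider an independent copy $(\gamma'(\sigma))$ of this process. For $0 \le t \le 1$ we define
\[ g_t(\sigma) = \sqrt{t}\,g(\sigma) + \sqrt{1-t}\,\gamma(\sigma) \tag{10.22} \]
and we define $g'_t(\sigma)$ similarly. We define
\[ \mathcal{E}_t = \exp\Bigl(\sum_{\ell\le 2}\bigl(\eta_\ell(g'_t(\sigma^\ell) + h') + \varepsilon_\ell(g_t(\sigma^\ell) + h')\bigr)\Bigr) \tag{10.23} \]
and
\[ \theta(t) = E\frac{\langle\mathrm{Av}(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\mathcal{E}_t\rangle}{\langle\mathrm{Av}\,\mathcal{E}_t\rangle}. \tag{10.24} \]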
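To study $\theta(1)$, we will use the relation
\[ \theta(1) = \theta(0) + \theta'(1) - \frac12\theta''(1) + \int_0^1\frac{t^2}{2}\theta'''(t)\,dt \tag{10.25} \]
so that
\[ \theta(1) \le \theta(0) + \theta'(1) - \frac12\theta''(1) + \max_{0\le t\le 1}|\theta'''(t)|. \tag{10.26} \]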
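The relation (10.25) is Taylor's formula with integral remainder, expanded at $t = 1$ and evaluated at $t = 0$; the bound (10.26) then follows since $\int_0^1 t^2/2\,dt = 1/6 \le 1$. A quick sanity check on an arbitrary polynomial, as a sketch: the test function below is an illustrative choice with no relation to the actual $\theta$ of the proof.

```python
from numpy.polynomial import Polynomial

def taylor_gap(theta):
    """theta(1) - [theta(0) + theta'(1) - theta''(1)/2 + int_0^1 (t^2/2) theta'''(t) dt]."""
    half_t_squared = Polynomial([0.0, 0.0, 0.5])
    antiderivative = (half_t_squared * theta.deriv(3)).integ()
    remainder = antiderivative(1.0) - antiderivative(0.0)
    rhs = theta(0.0) + theta.deriv(1)(1.0) - 0.5 * theta.deriv(2)(1.0) + remainder
    return theta(1.0) - rhs

theta = Polynomial([0.0, -1.0, 0.0, 2.0, 1.0])  # t^4 + 2 t^3 - t, an arbitrary test function
assert abs(taylor_gap(theta)) < 1e-12
```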
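In order to conveniently compute the derivatives of $\theta$, we introduce notation similar to that used in Sec. 5. We write $\rho$ in $\Sigma_{N+2}$ as $\rho = (\sigma_1, \ldots, \sigma_N, \eta, \varepsilon)$, and we define a probability measure $\langle\cdot\rangle_t$ on $\Sigma_{N+2}$ by, for a function $f$ on $\Sigma_{N+2}$,
\[ \langle f\rangle_t = \frac{\langle\mathrm{Av}\,f\exp(\eta(g'_t(\sigma) + h') + \varepsilon(g_t(\sigma) + h'))\rangle}{\langle\mathrm{Av}\exp(\eta(g'_t(\sigma) + h') + \varepsilon(g_t(\sigma) + h'))\rangle}. \]
We will also denote by $\langle\cdot\rangle_t$ the product measure on several replicas, so that (10.24) reads now
\[ \theta(t) = E\langle(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\rangle_t. \tag{10.27} \]
We write
\[ \Delta_{\ell,\ell'} = E(g(\sigma^\ell)g(\sigma^{\ell'}) - \gamma(\sigma^\ell)\gamma(\sigma^{\ell'})) = E(g'(\sigma^\ell)g'(\sigma^{\ell'}) - \gamma'(\sigma^\ell)\gamma'(\sigma^{\ell'})). \]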
Lemma 10.4. If $f$ is a function on $\Sigma^k_{N+2}$, we have
\[ \frac{d}{dt}E\langle f\rangle_t = \sum_{1\le \ell<\ell'\le k} E\langle f(\eta_\ell\eta_{\ell'} + \varepsilon_\ell\varepsilon_{\ell'})\Delta_{\ell,\ell'}\rangle_t - k\sum_{1\le \ell\le k} E\langle f(\eta_\ell\eta_{k+1} + \varepsilon_\ell\varepsilon_{k+1})\Delta_{\ell,k+1}\rangle_t + \frac{k(k+1)}{2}E\langle f(\eta_{k+1}\eta_{k+2} + \varepsilon_{k+1}\varepsilon_{k+2})\Delta_{k+1,k+2}\rangle_t. \tag{10.28} \]
The brackets in the second sum on the right are on $(k+1)$ replicas. The last bracket is on $(k+2)$ replicas. The proof relies on an integration by parts, as in Lemma 5.2.

Our goal in studying (10.25) is to show that the last three terms are a small proportion of $\theta(1)$, so that (10.25) will show that $\theta(1)$ and $\theta(0)$ are of the same order. The appropriate choice of $q_0$ and $q_1$ is relevant only in making $\theta(0)$ small. We first study the last term of (10.25). Trivial bounds and use of Hölder's inequality show that
(10.29)
|θ000 (t)| ≤ 2−p/L Eh(R12 − q12 )2 it + δ .
(10.30)
We claim that
Using Lemma 4.2, and (10.20), (10.21), we see that |∆1,2 | ≤
β2p p K p + |R12 − q12 |. N 2
(10.31)
β 2 p2 K + |R12 − q12 | . N 2
(10.32)
In particular we have |∆1,2 | ≤
On the other hand (and this is the key ingredient of the proof) |∆1,2 | is (almost) / always small, |∆1,2 | ≤ 2−p/L for β ≤ 2p/L . Indeed, if |R12 | ≤ 1/2, then (σ 1 , σ 2 ) ∈ S 2 Cα , and
while if (σ 1 , σ 2 ) ∈
S
p p p − q12 | = |R12 − q0p | ≤ 2−p+1 |R12
(10.33)
Cα2 ,
p p p p − q12 | = |R12 − q1p | ≤ 1 − R12 + 1 − q1p ≤ 2p2−p/L . (10.34) |R12 S / Cα2 , |R12 | ≥ 1/2) their contribution As for the other cases, (i.e. (σ 1 , σ 2 ) ∈ is vanishingly small by Theorem 3.2 and Lemma 10.2. This proves (10.30). To complement (10.30), we prove that
Eh(R12 − q12 )2 it ≤ 2Eh(R12 − q12 )2 i1 + δ .
(10.35)
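To see this, we use (10.28) and the fact that $|\Delta_{\ell,\ell'}| \le 2^{-p/L}$ outside a vanishing set to see that
\[ \Bigl|\frac{d}{dt}E\langle(R_{12} - q_{12})^2\rangle_t\Bigr| \le L2^{-p/L}E\langle(R_{12} - q_{12})^2\rangle_t + \delta, \tag{10.36} \]
and, by integration, this yields (10.35). Combining with (10.30) this shows that
\[ |\theta'''(t)| \le L2^{-p/L}E\langle(R_{12} - q_{12})^2\rangle'' + \delta. \tag{10.37} \]
We now turn to the study of $\theta'(1)$, which we compute through (10.28). It is a sum of several terms, which are all treated the same way. We explain how to handle a term
\[ E\langle f\eta_\ell\eta_{\ell'}\Delta_{\ell,\ell'}\rangle'' \tag{10.38} \]
for $\ell < \ell' \le 4$, where $f = (\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})$. We write
\[ E\langle f\eta_\ell\eta_{\ell'}\Delta_{\ell,\ell'}\rangle'' = E\langle f\Delta_{\ell,\ell'}\rangle'' + E\langle f(\eta_\ell\eta_{\ell'} - 1)\Delta_{\ell,\ell'}\rangle'' := \mathrm{I} + \mathrm{II}. \tag{10.39} \]
To handle I we will appeal to the symmetry between sites to say that
\[ |E\langle f\Delta_{\ell,\ell'}\rangle'' - E\langle(R_{12} - q_{12})^2\Delta_{\ell,\ell'}\rangle''| \le \frac{K}{N}. \tag{10.40} \]
To see this, we write
\[ R_{12} - q_{12} = \frac{1}{N}\sum_{i\le N}(\sigma_i^1\sigma_i^2 - q_{12}), \qquad \Delta_{\ell,\ell'} = \frac{\beta^2 p}{2N^{p-1}}\sum_{1\le i_1<\cdots}\bigl(\sigma^\ell_{i_1}\sigma^{\ell'}_{i_1}\cdots\sigma^\ell_{i_{p-1}}\sigma^{\ell'}_{i_{p-1}} - q_{\ell\ell'}^{p-1}\bigr), \]
we expand both sides of (10.40) and we use the symmetry among sites. Combining (10.40) with the fact that $|\Delta_{\ell,\ell'}| \le 2^{-p/L}$ outside a vanishing set, we get that
\[ |\mathrm{I}| \le \frac{K}{N} + L2^{-p/L}E\langle(R_{12} - q_{12})^2\rangle'' + \delta. \]
To handle the term II, again we use symmetry between sites to see that
\[ |\mathrm{II} - E\langle(R_{12} - q_{12})(\eta_1\eta_2 - q_{12})(\eta_\ell\eta_{\ell'} - 1)\Delta_{\ell,\ell'}\rangle''| \le \frac{K}{N} \]
and since $1 - \eta_\ell\eta_{\ell'} \ge 0$,
\[ |\mathrm{II}| \le \frac{K}{N} + E\langle|R_{12} - q_{12}|(1 - \eta_\ell\eta_{\ell'})|\Delta_{\ell,\ell'}|\rangle''. \]
Now, $2xy \le ax^2 + a^{-1}y^2$ for any $a > 0$, so that
\[ |\mathrm{II}| \le \frac{K}{N} + aE\langle(R_{12} - q_{12})^2\rangle'' + a^{-1}E\langle(1 - \eta_\ell\eta_{\ell'})\Delta_{\ell,\ell'}^2\rangle''. \]
Using symmetry between the sites, we have
\[ E\langle(1 - \eta_\ell\eta_{\ell'})\Delta_{\ell,\ell'}^2\rangle'' \le \frac{K}{N} + E\langle(1 - R_{12})\Delta_{1,2}^2\rangle''. \tag{10.41} \]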
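The nice fact is that (outside a set of vanishing measure)
\[ (1 - R_{12})\Delta_{1,2}^2 \le \frac{K}{N} + \beta^4 p^4 2^{-p/L}(R_{12} - q_{12})^2. \]
This follows from (10.32) if $R_{12} \ge 1 - 2^{-p/L}$, and from (10.31) if $|R_{12}| \le 1/2$, since $|x^p - y^p| \le p2^{-p+1}|x - y|$ for $|x|, |y| \le 1/2$. Optimization over $a$ yields
\[ |\mathrm{II}| \le \delta + L\beta p^2 2^{-p/L}E\langle(R_{12} - q_{12})^2\rangle'' \tag{10.42} \]
so that, combining with (10.41), the same bound holds for $\theta'(1)$. The case of $\theta''(1)$ is handled similarly. If we keep in mind (using symmetry between sites) that $A_{N+2}(\beta'', h', \boldsymbol\beta'')$, $\theta(1)$ and $E\langle(R_{12} - q_{12})^2\rangle''$ differ from each other by at most $K/N$, then (10.26) and the previous estimate show that
\[ A_{N+2}(\beta'', h', \boldsymbol\beta'') \le \delta + \theta(0) \tag{10.43} \]
provided $\beta \le 2^{p/L}$, $p \ge L$.

We turn now to the study of $\theta(0)$. We have
\[ \theta(0) = E\frac{\langle\mathrm{Av}(\eta_1\eta_2 - q_{12})(\varepsilon_1\varepsilon_2 - q_{12})\mathcal{E}_0\rangle}{\langle\mathrm{Av}\,\mathcal{E}_0\rangle}. \]
We write
\[ X_\alpha = \beta\sqrt{\frac{p}{2}}\Bigl(\sqrt{q_0^{p-1}}\,z + \sqrt{q_1^{p-1} - q_0^{p-1}}\,g_\alpha\Bigr) + h' \]
and we denote by $X'_\alpha$ independent copies. We see that
\[ \theta(0) = E\frac{\sum_\alpha w_\alpha^2(\mathrm{sh}^2 X_\alpha - q_1\mathrm{ch}^2 X_\alpha)(\mathrm{sh}^2 X'_\alpha - q_1\mathrm{ch}^2 X'_\alpha)}{(\sum_\alpha w_\alpha\,\mathrm{ch}\,X_\alpha\,\mathrm{ch}\,X'_\alpha)^2} + E\frac{\sum_{\alpha\ne\beta} w_\alpha w_\beta(\mathrm{sh}\,X_\alpha\,\mathrm{sh}\,X_\beta - q_0\,\mathrm{ch}\,X_\alpha\,\mathrm{ch}\,X_\beta)(\mathrm{sh}\,X'_\alpha\,\mathrm{sh}\,X'_\beta - q_0\,\mathrm{ch}\,X'_\alpha\,\mathrm{ch}\,X'_\beta)}{(\sum_\alpha w_\alpha\,\mathrm{ch}\,X_\alpha\,\mathrm{ch}\,X'_\alpha)^2}. \tag{10.44} \]
Conditionally upon the value of $z$, $z'$, the r.v. $X_\alpha$, $X'_\alpha$ are i.i.d. We write
\[ X = \beta\sqrt{\frac{p}{2}}\Bigl(\sqrt{q_0^{p-1}}\,z + \sqrt{q_1^{p-1} - q_0^{p-1}}\,g\Bigr) + h', \]
where $z$, $g$ are independent standard normal, and we denote by $E_g$ expectation in $g$. To compute the right-hand side of (10.44), we use Theorem 9.4 and (6.34), (6.35), conditionally upon $z$, $z'$. For example, with $U_\alpha = \mathrm{sh}\,X_\alpha\,\mathrm{sh}\,X'_\alpha$, $V_\alpha = \mathrm{ch}\,X_\alpha\,\mathrm{ch}\,X'_\alpha$,
\[ E_g\frac{\sum w_\alpha^2 U_\alpha^2}{(\sum w_\alpha V_\alpha)^2} = \delta + (1 - m)\frac{E_g(\mathrm{th}^2 X_\alpha\,\mathrm{th}^2 X'_\alpha\,\mathrm{ch}^m X_\alpha\,\mathrm{ch}^m X'_\alpha)}{E_g\,\mathrm{ch}^m X_\alpha\,\mathrm{ch}^m X'_\alpha} = \delta + (1 - m)\frac{E_g(\mathrm{th}^2 X_\alpha\,\mathrm{ch}^m X_\alpha)}{E_g(\mathrm{ch}^m X_\alpha)}\cdot\frac{E_g(\mathrm{th}^2 X'_\alpha\,\mathrm{ch}^m X'_\alpha)}{E_g(\mathrm{ch}^m X'_\alpha)} \]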
where $E_g$ denotes expectation at $z$, $z'$ fixed and where we use independence in the last line. Thus
\[ E\frac{\sum w_\alpha^2 U_\alpha^2}{(\sum w_\alpha V_\alpha)^2} = \delta + (1 - m)\Bigl(E\frac{E_g(\mathrm{th}^2 X\,\mathrm{ch}^m X)}{E_g\,\mathrm{ch}^m X}\Bigr)^2. \]
In this manner, we find
\[ \theta(0) = \delta + (1 - m)\Bigl(q_1 - E\frac{E_g\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E_g\,\mathrm{ch}^m X}\Bigr)^2 + m\Bigl(q_0 - E\Bigl(\frac{E_g\,\mathrm{th}\,X\,\mathrm{ch}^m X}{E_g\,\mathrm{ch}^m X}\Bigr)^2\Bigr)^2. \tag{10.45} \]
In particular, (10.43) shows that $A_{N+2}(\beta'', h', \boldsymbol\beta'') = \delta$ if $q_0$, $q_1$ satisfy (10.3), (10.4). (It seems worth mentioning that the reason why we do not know how to make the present argument work without the information of Theorem 9.4 is that then we do not know how to prove the existence of numbers $q_0$, $q_1$ such that $\theta(0)$ is small.)

Now we prove the existence of solutions of (10.3), (10.4). The proof resembles the proof of the existence of a solution to (1.13), (1.14) and we explain only the new ingredient. Consider
\[ \Phi(q_0, q_1) = E\frac{E_g\,\mathrm{th}^2 X\,\mathrm{ch}^m X}{E_g\,\mathrm{ch}^m X}, \qquad \Psi(q_0, q_1) = E\Bigl(\frac{E_g\,\mathrm{th}\,X\,\mathrm{ch}^m X}{E_g\,\mathrm{ch}^m X}\Bigr)^2. \]
We show that the map
\[ (q_0, q_1) \mapsto (\Psi(q_0, q_1), \Phi(q_0, q_1)) \tag{10.46} \]
sends the set $[0, 1/2] \times [1 - 2^{-p/L}, 1]$ into itself. What is not so obvious is that $\Psi(q_0, q_1) \le 1/2$, because $h' = h\beta$ is large for large $\beta$. Since $q_0$ appears in $\Psi$, $\Phi$ only through $q_0^{p-1}$, it is obvious that
\[ q_0 \le \frac12 \Rightarrow \Bigl|\frac{\partial\Psi}{\partial q_0}\Bigr|,\ \Bigl|\frac{\partial\Phi}{\partial q_0}\Bigr| \le L2^{-p/L}. \]
So, to prove $\Psi(q_0, q_1) \le 1/2$ it suffices to prove that
\[ f(h') = \frac{E\,\mathrm{th}\,Y\,\mathrm{ch}^m Y}{E\,\mathrm{ch}^m Y} \le \frac14, \]
where $Y = \beta g\sqrt{pq_1^{p-1}/2} + h'$. This is done by showing that
\[ f'(h') \le 2^{-p/L} + Lm \tag{10.47} \]
and using the fact (similar to (8.51)) that $m \le L/\beta$. Next, we observe that

$$\frac{\partial\Phi}{\partial q_1} \le L 2^{-p/L} , \qquad \frac{\partial\Psi}{\partial q_1} \le L\beta^2 p \tag{10.48}$$

so that if $\beta \le 2^{p/L}$, for a suitable choice of $a$, the change of variable $q_1 \to a q_1$ makes the map (10.46) a contraction.

At this point we have almost finished the proof of Theorem 10.1, except that in Eqs. (10.3), (10.4) we have used the value $m = m_N(\beta, h', \beta) = E\langle 1_{\{R_{12} \ge 3/4\}}\rangle$, rather than $m_{N+2}(\beta'', h', \beta'')$. To finish the proof, we need to show

$$|m_{N+2}(\beta'', h', \beta'') - m_N(\beta, h', \beta)| = \delta . \tag{10.49}$$
If we define

$$m_{N,s} = E\langle R_{12}^s \rangle ,$$

it suffices to prove that for each $s$, $|m_{N+2,s}(\beta'', h', \beta'') - m_{N,s}(\beta, h', \beta)| = \delta$. Now

$$\frac{dp_N}{d\beta_s}(\beta, h, \beta) = 2^{-s}\xi(N)\, E\,\frac{g^{(s)}(\sigma)}{N} = 2^{-s}\xi(N)(1 - m_{N,s}) + R \tag{10.50}$$

where $|R| \le K\xi(N)/N$. Also, it is simple to see that

$$p_N(\beta, h', \beta) \le p_{N+2}(\beta'', h', \beta'') \le p_N(\beta, h', \beta) + \frac{K}{N} .$$

Using the argument of (6.3), the derivatives in $\beta_s$ of these two convex functions cannot differ in average by more than $K/\sqrt{N}$, and combining with (10.50) this proves (10.49) and Theorem 10.1.
References

[1] M. Ledoux and M. Talagrand, Probability in Banach Spaces, Springer, 1991.
[2] M. Talagrand, Regularity of Gaussian processes, Acta Math. 159 (1987), 99–149.
[3] M. Mezard, G. Parisi and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific Lecture Notes in Physics, 9, World Scientific, 1987.
[4] D. Sherrington and S. Kirkpatrick, Solvable model of a spin glass, Phys. Rev. Lett. 35 (1975), 1792–1796.
[5] D. Gross and M. Mezard, The simplest spin glass, Nucl. Phys. B240 (1984), 431–452.
[6] E. Gardner, Spin glasses with p-spin interactions, Nucl. Phys. B257 (1985), 747–765.
[7] B. Derrida, Random-energy model: Limit of a family of disordered models, Phys. Rev. Lett. 45(2) (1980), 79–82.
[8] M. Talagrand, Huge Random Structures and Mean Field Models for Spin Glasses, Proceedings of the International Congress of Mathematicians, Documenta Math. (electronic) special volume, 1998.
[9] M. Talagrand, Rigorous Results for Mean Field Spin Glasses Models: A First Course, Saint Flour Summer School in Probability, August 2000, Springer Lecture Notes in Math., to appear.
[10] M. Talagrand, Rigorous low temperature results for the mean field p-spins interaction model, Probab. Theory Relat. Fields 117 (2000), 303–360.
[11] I. Ibragimov, V. Sudakov and B. Tsirelson, Norms of Gaussian sample functions, Proceedings of the Third USSR-Japan Symposium on Probability Theory, Lecture Notes in Math., Vol. 550, Springer Verlag, 1976.
[12] M. Aizenman and P. Contucci, On the stability of the quenched state in mean field spin glasses models, J. Stat. Phys. 92 (1998), 765–783.
[13] S. Ghirlanda and F. Guerra, General properties of overlap distributions in disordered spin systems: Towards Parisi ultrametricity, J. Phys. A31 (1998), 9149–9155.
[14] D. Ruelle, A mathematical reformulation of Derrida’s REM and GREM, Comm. Math. Phys. 108 (1987), 225–239.
[15] J. Pitman and M. Yor, The two parameters Poisson Dirichlet distribution derived from a stable subordinator, Ann. Probab. 25 (1997), 855–900.
[16] B. Derrida, Random energy model: An exactly solvable model of disordered systems, Phys. Rev. B24(5) (1981), 2613–2626.
February 18, 2003 10:23 WSPC/148-RMP
00157
Reviews in Mathematical Physics, Vol. 15, No. 1 (2003) 79–91 © World Scientific Publishing Company
MONOTONICITY OF QUANTUM RELATIVE ENTROPY REVISITED
DÉNES PETZ
Department for Mathematical Analysis, Budapest University of Technology and Economics, H-1521 Budapest XI., Hungary
Received 24 November 2001
Revised 25 May 2002

Dedicated to E. Lieb and H. Araki on the occasion of their 70th birthday

Monotonicity under coarse-graining is a crucial property of the quantum relative entropy. The aim of this paper is to investigate the condition of equality in the monotonicity theorem and in its consequences, such as the strong sub-additivity of von Neumann entropy, the Golden–Thompson trace inequality and the monotonicity of the Holevo quantity. The relation to quantum Markov states is briefly indicated.

Keywords: Quantum states; relative entropy; strong subadditivity; coarse-graining; Uhlmann’s theorem; α-entropy.
1. Introduction

Quantum relative entropy was introduced by Umegaki [24] as a formal generalization of the Kullback–Leibler information (in the setting of finite von Neumann algebras). Its real importance was understood much later and the monograph [14] already deduced most information quantities from the relative entropy.

One of the fundamental results of quantum information theory is the monotonicity of relative entropy under completely positive mappings. After the discussion of some particular cases by Araki [3] and by Lindblad [12], this result was proven by Uhlmann [23] in full generality and nowadays it is referred to as Uhlmann’s theorem. The strong sub-additivity property of entropy can be obtained easily from Uhlmann’s theorem (see [14] about this point and as a general reference as well) and Ruskai discussed the relation of several basic entropy inequalities in detail [21, 22].

The aim of this paper is to investigate the condition of equality in the monotonicity theorem and in its consequences. The motivation comes from the needs of quantum information theory, developed in the setting of matrix algebras in the last ten years [13]; the work [22] has also given some stimulation. The paper is written entirely in a finite dimensional setting but some remarks are made about the possible more general scenario.
2. Uhlmann’s Theorem

Let $H$ be a finite dimensional Hilbert space and $D_i$ be statistical operators on $H$ ($i = 1, 2$). Their relative entropy is defined as

$$S(D_1, D_2) = \begin{cases} \operatorname{Tr} D_1(\log D_1 - \log D_2) & \text{if } \operatorname{supp} D_1 \subset \operatorname{supp} D_2 , \\ +\infty & \text{otherwise.} \end{cases} \tag{1}$$

If $\lambda > 0$ is the smallest eigenvalue of $D_2$, then $S(D_1, D_2)$ is always finite and $S(D_1, D_2) \le \log n - \log\lambda$, where $n$ is the dimension of $H$.

Let $K$ be another finite dimensional Hilbert space. We call a linear mapping $T : B(H) \to B(K)$ a coarse graining if $T$ is trace preserving and 2-positive, that is,

$$\begin{bmatrix} T(A) & T(B) \\ T(C) & T(D) \end{bmatrix} \ge 0 \quad\text{if}\quad \begin{bmatrix} A & B \\ C & D \end{bmatrix} \ge 0 . \tag{2}$$

Such a $T$ sends a statistical operator to a statistical operator and satisfies the Schwarz inequality $T(a^*a) \ge T(a)T(a)^*$. The concept of coarse graining is the quantum version of the Markovian mapping in probability theory. All the important examples are actually completely positive. We work in this more general framework because the proofs require only the Schwarz inequality.

$B(H)$ and $B(K)$ are Hilbert spaces with respect to the Hilbert–Schmidt inner product and the adjoint of $T : B(H) \to B(K)$ is defined by

$$\operatorname{Tr} A\,T(B) = \operatorname{Tr} T^*(A)B \qquad (A \in B(K),\ B \in B(H)) .$$
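For a single qubit, definition (1) can be evaluated in closed form, which gives a quick way to experiment with the bound $S(D_1, D_2) \le \log n - \log\lambda$ and with the monotonicity inequality $S(D_1, D_2) \ge S(T(D_1), T(D_2))$ of Uhlmann’s theorem. The sketch below is an illustration only and rests on assumptions that are not in the text: densities are parametrized by Bloch vectors, $D = (I + r\cdot\sigma)/2$ with $|r| < 1$, and the coarse graining is taken to be the phase damping map $(x, y, z) \mapsto (\eta x, \eta y, z)$ with $0 \le \eta \le 1$, a standard completely positive trace preserving map. All function names are hypothetical.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _log_parts(r):
    """Write log D = c*I + d*(n.sigma) for the qubit density D = (I + r.sigma)/2, |r| < 1."""
    norm = math.sqrt(_dot(r, r))
    lam_plus, lam_minus = (1.0 + norm) / 2.0, (1.0 - norm) / 2.0
    c = 0.5 * (math.log(lam_plus) + math.log(lam_minus))
    d = 0.5 * (math.log(lam_plus) - math.log(lam_minus))
    n = tuple(x / norm for x in r) if norm > 0 else (0.0, 0.0, 1.0)
    return c, d, n

def relative_entropy(r1, r2):
    """S(D1, D2) = Tr D1 (log D1 - log D2) for invertible qubit densities."""
    c1, d1, n1 = _log_parts(r1)
    c2, d2, n2 = _log_parts(r2)
    # Tr D1 (c*I + d*n.sigma) = c + d * (r1 . n)
    return (c1 + d1 * _dot(r1, n1)) - (c2 + d2 * _dot(r1, n2))

def smallest_eigenvalue(r):
    return (1.0 - math.sqrt(_dot(r, r))) / 2.0

def phase_damping(r, eta):
    """Assumed coarse graining: shrink the x and y Bloch components by eta."""
    return (eta * r[0], eta * r[1], r[2])
```

For example, with `r1 = (0.6, 0.0, 0.3)` and `r2 = (0.1, 0.2, -0.4)`, one can compare `relative_entropy(r1, r2)` with `math.log(2) - math.log(smallest_eigenvalue(r2))` and with the value obtained after applying `phase_damping` to both vectors.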
The adjoint of a coarse graining $T$ is 2-positive again and $T^*(I) = I$. It follows that $T^*$ satisfies the Schwarz inequality as well. The following result is known as Uhlmann’s theorem.

Theorem 2.1 ([23, 14]). For a coarse graining $T : B(H) \to B(K)$ the monotonicity

$$S(D_1, D_2) \ge S(T(D_1), T(D_2))$$

holds.

It should be noted that relative entropy was first defined in the setting of von Neumann algebras by Umegaki [24] and extended by Araki [3]. Uhlmann’s monotonicity result is more general than the above statement. To the best knowledge of the author, it is not known if the monotonicity theorem holds without the hypothesis of 2-positivity.

3. The Proof of Uhlmann’s Theorem and Its Analysis

The simplest way to analyse the equality in the monotonicity theorem is to have a close look at the proof of the inequality. Therefore we present a proof which is based on the relative modular operator method. The concept of relative modular operator was developed by Araki in the modular theory of operator algebras [4], however it
could be used very well in finite dimensional settings. For example, Lieb’s concavity theorem gets a natural proof by this method [15].

Let $D_1$ and $D_2$ be density matrices acting on the Hilbert space $H$ and assume that they are invertible. On the Hilbert space $B(H)$ one can define an operator $\Delta$ as

$$\Delta a = D_2 a D_1^{-1} \qquad (a \in B(H)) .$$

This is the so-called relative modular operator and it is the product of two commuting positive operators: $\Delta = LR$, where

$$La = D_2 a \quad\text{and}\quad Ra = a D_1^{-1} \qquad (a \in B(H)) .$$
Since $\log\Delta = \log L + \log R$, we have

$$S(D_1, D_2) = \langle D_1^{1/2}, (\log D_1 - \log D_2) D_1^{1/2} \rangle = -\langle D_1^{1/2}, (\log\Delta) D_1^{1/2} \rangle .$$

The relative entropy $S(D_1, D_2)$ is expressed by the quadratic form of the logarithm of the relative modular operator. This is the fundamental formula that we use (and actually this is nothing else but Araki’s definition of the relative entropy in a general von Neumann algebra [3]).

Let $T$ be a coarse graining as in Theorem 2.1. We assume that $D_1$ and $T(D_1)$ are invertible matrices and set

$$\Delta a = D_2 a D_1^{-1} \quad (a \in B(H)) \qquad\text{and}\qquad \Delta' x = T(D_2)\, x\, T(D_1)^{-1} \quad (x \in B(K)) .$$
$\Delta$ and $\Delta'$ are operators on the spaces $B(H)$ and $B(K)$, respectively. Both become a Hilbert space with the Hilbert–Schmidt inner product. The relative entropies in the theorem are expressed by the resolvent of relative modular operators:

$$S(D_1, D_2) = -\langle D_1^{1/2}, (\log\Delta) D_1^{1/2} \rangle = \int_0^\infty \left( \langle D_1^{1/2}, (\Delta + t)^{-1} D_1^{1/2} \rangle - (1+t)^{-1} \right) dt ,$$

$$S(T(D_1), T(D_2)) = -\langle T(D_1)^{1/2}, (\log\Delta') T(D_1)^{1/2} \rangle = \int_0^\infty \left( \langle T(D_1)^{1/2}, (\Delta' + t)^{-1} T(D_1)^{1/2} \rangle - (1+t)^{-1} \right) dt ,$$

where the identity

$$\log x = \int_0^\infty \left( (1+t)^{-1} - (x+t)^{-1} \right) dt$$

is used. The operator

$$V x\, T(D_1)^{1/2} = T^*(x) D_1^{1/2} \tag{3}$$

is a contraction:

$$\| T^*(x) D_1^{1/2} \|^2 = \operatorname{Tr} D_1 T^*(x^*) T^*(x) \le \operatorname{Tr} D_1 T^*(x^* x) = \operatorname{Tr} T(D_1) x^* x = \| x\, T(D_1)^{1/2} \|^2$$
since the Schwarz inequality is applicable to $T^*$. A similar simple computation gives that

$$V^* \Delta V \le \Delta' . \tag{4}$$

The function $y \mapsto (y + t)^{-1}$ is operator monotone (decreasing) and operator convex, hence

$$(\Delta' + t)^{-1} \le (V^* \Delta V + t)^{-1} \le V^* (\Delta + t)^{-1} V \tag{5}$$

(see [8]). Since $V T(D_1)^{1/2} = D_1^{1/2}$, this implies

$$\langle D_1^{1/2}, (\Delta + t)^{-1} D_1^{1/2} \rangle \ge \langle T(D_1)^{1/2}, (\Delta' + t)^{-1} T(D_1)^{1/2} \rangle . \tag{6}$$
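In the commutative case the objects above become elementary, which makes inequality (6) and the integral representation easy to probe numerically. The following sketch assumes (this is a simplification, not the general setting of the text) that $D_1 = \operatorname{Diag}(p)$ and $D_2 = \operatorname{Diag}(q)$ are diagonal with strictly positive entries and that the coarse graining merges blocks of outcomes. Then $\Delta$ acts on the diagonal matrix unit $e_{ii}$ as multiplication by $q_i/p_i$, so that $\langle D_1^{1/2}, (\Delta + t)^{-1} D_1^{1/2} \rangle = \sum_i p_i^2/(q_i + t p_i)$. The function names and the substitution $t = u/(1-u)$ used for the numerical integration are choices made for this illustration.

```python
import math

def resolvent_form(p, q, t):
    """<D1^{1/2}, (Delta + t)^{-1} D1^{1/2}> for commuting D1 = Diag(p), D2 = Diag(q)."""
    return sum(pi * pi / (qi + t * pi) for pi, qi in zip(p, q))

def coarse_grain(p, blocks):
    """Classical coarse graining: add up the probabilities inside each block of indices."""
    return [sum(p[i] for i in block) for block in blocks]

def relative_entropy(p, q):
    """S(D1, D2) for commuting densities with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def entropy_from_resolvent(p, q, steps=20000):
    """Integrate the resolvent form minus (1+t)^{-1} over t in (0, inf).

    Uses the substitution t = u/(1-u), dt = du/(1-u)^2, and the midpoint rule in u.
    """
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        t = u / (1.0 - u)
        total += (resolvent_form(p, q, t) - 1.0 / (1.0 + t)) / (1.0 - u) ** 2
    return total / steps
```

For instance, with `p = [0.5, 0.2, 0.2, 0.1]`, `q = [0.1, 0.4, 0.3, 0.2]` and `blocks = [[0, 1], [2, 3]]`, the value of `resolvent_form` before coarse graining dominates the value after it for every `t >= 0`, and `entropy_from_resolvent(p, q)` approximates `relative_entropy(p, q)`.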
By integrating this inequality we have the monotonicity theorem from the above integral formulas.

Now we are in the position to analyse the case of equality. If

$$S(D_1, D_2) = S(T(D_1), T(D_2)) ,$$

then

$$\langle T(D_1)^{1/2}, V^* (\Delta + t)^{-1} V T(D_1)^{1/2} \rangle = \langle T(D_1)^{1/2}, (\Delta' + t)^{-1} T(D_1)^{1/2} \rangle \tag{7}$$

for all $t > 0$. This equality together with the operator inequality (5) gives

$$V^* (\Delta + t)^{-1} D_1^{1/2} = (\Delta' + t)^{-1} T(D_1)^{1/2} \tag{8}$$

for all $t > 0$. Differentiating by $t$ we have

$$V^* (\Delta + t)^{-2} D_1^{1/2} = (\Delta' + t)^{-2} T(D_1)^{1/2} \tag{9}$$

and we infer

$$\| V^* (\Delta + t)^{-1} D_1^{1/2} \|^2 = \langle (\Delta' + t)^{-2} T(D_1)^{1/2}, T(D_1)^{1/2} \rangle = \langle V^* (\Delta + t)^{-2} D_1^{1/2}, T(D_1)^{1/2} \rangle = \| (\Delta + t)^{-1} D_1^{1/2} \|^2 .$$

When $\| V^* \xi \| = \| \xi \|$ holds for a contraction $V$, it follows that $V V^* \xi = \xi$. In the light of this remark we arrive at the condition

$$V V^* (\Delta + t)^{-1} D_1^{1/2} = (\Delta + t)^{-1} D_1^{1/2}$$

and

$$V (\Delta' + t)^{-1} T(D_1)^{1/2} = V V^* (\Delta + t)^{-1} D_1^{1/2} = (\Delta + t)^{-1} D_1^{1/2} .$$

By Stone–Weierstrass approximation we have

$$V f(\Delta') T(D_1)^{1/2} = f(\Delta) D_1^{1/2} \tag{10}$$
for continuous functions. In particular for $f(x) = x^{it}$ we have

$$T^*(T(D_2)^{it} T(D_1)^{-it}) = D_2^{it} D_1^{-it} . \tag{11}$$

This condition is necessary and sufficient for the equality.

Theorem 3.1. Let $T : B(H) \to B(K)$ be a 2-positive trace preserving mapping and let $D_1, D_2 \in B(H)$, $T(D_1), T(D_2) \in B(K)$ be invertible density matrices. Then the equality $S(D_1, D_2) = S(T(D_1), T(D_2))$ holds if and only if the following equivalent conditions are satisfied:

(i) $T^*(T(D_1)^{it} T(D_2)^{-it}) = D_1^{it} D_2^{-it}$ for all real $t$.
(ii) $T^*(\log T(D_1) - \log T(D_2)) = \log D_1 - \log D_2$.

The equality implies (11), which is equivalent to Theorem 3.1(i). Differentiating (i) at $t = 0$ we have the second condition, which obviously implies the equality of the relative entropies.

The above proof follows the lines of [17]. The original paper is in the setting of arbitrary von Neumann algebras and hence slightly more technical (due to the unbounded feature of the relative modular operators). Condition (ii) of Theorem 3.1 appears also in the paper [22], in which different methods are used.

Next we recall a property of 2-positive mappings. When $T$ is assumed to be 2-positive, the set

$$A_T := \{ X \in B(H) : T(X^* X) = T(X^*) T(X) \ \text{and}\ T(X X^*) = T(X) T(X^*) \}$$

is a ∗-sub-algebra of $B(H)$ and

$$T(XY) = T(X) T(Y) \quad \text{for all } X \in A_T \text{ and } Y \in B(H) . \tag{12}$$
Corollary 3.1. Let $T : B(H) \to B(K)$ be a 2-positive trace preserving mapping and let $D_1, D_2 \in B(H)$, $T(D_1), T(D_2) \in B(K)$ be invertible density matrices. Assume that $T(D_1)$ and $T(D_2)$ commute. Then the equality $S(D_1, D_2) = S(T(D_1), T(D_2))$ implies that $D_1$ and $D_2$ commute.

Under the hypothesis $u_t := T(D_1)^{it} T(D_2)^{-it}$ and $w_t := D_1^{it} D_2^{-it}$ are unitaries. Since $T^*$ is unital, $u_t \in A_{T^*}$ for every $t \in \mathbb{R}$. We have

$$w_{t+s} = T^*(u_{t+s}) = T^*(u_t u_s) = T^*(u_t) T^*(u_s) = w_t w_s ,$$

which shows that $w_t$ and $w_s$ commute and so do $D_1$ and $D_2$.

4. Consequences and Related Inequalities

4.1. The Golden–Thompson inequality

The Golden–Thompson inequality states that

$$\operatorname{Tr} e^{A+B} \le \operatorname{Tr} e^A e^B$$
holds for self-adjoint matrices $A$ and $B$. It was shown in [18] that this inequality can be reformulated as a particular case of monotonicity when $e^A/\operatorname{Tr} e^A$ is considered as a density matrix and $e^{A+B}/\operatorname{Tr} e^{A+B}$ is the so-called perturbation by $B$. Corollary 5 of the original paper is formulated in the context of von Neumann algebras but the argument was adapted to the finite dimensional case in [19], see also [14, p. 128]. The equality holds in the Golden–Thompson inequality if and only if $AB = BA$.

One of the possible extensions of the Golden–Thompson inequality is the statement that the function

$$p \mapsto \operatorname{Tr} (e^{pB/2} e^{pA} e^{pB/2})^{1/p} \tag{13}$$

is increasing for $p > 0$. The limit at $p = 0$ is $\operatorname{Tr} e^{A+B}$ [5]. It was proved by Friedland and So that the function (13) is strictly monotone or constant [7]. The latter case corresponds to the commutativity of $A$ and $B$.

4.2. A posteriori relative entropy
Let $E_j$ ($1 \le j \le m$) be a partition of unity in $B(H)^+$, that is $\sum_j E_j = I$. (The operators $E_j$ could describe a measurement giving finitely many possible outcomes.) Any density matrix $D_i \in B(H)$ determines a probability distribution

$$\mu_i = (\operatorname{Tr} D_i E_1, \operatorname{Tr} D_i E_2, \ldots, \operatorname{Tr} D_i E_m) .$$

It follows from Uhlmann’s theorem that

$$S(\mu_1, \mu_2) \le S(D_1, D_2) . \tag{14}$$
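Inequality (14) and its equality case can be checked by hand in small examples. The sketch below is a hedged illustration using the diagonal densities and the three-element partition of unity of Example 4.1, with the off-diagonal entries of $E_2$ and $E_3$ chosen so that both matrices are Hermitian. Matrices are stored as nested Python lists, the helper names are hypothetical, and the relative entropy of the densities is computed only for the diagonal case.

```python
import math

def diag(*entries):
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def trace_product(a, b):
    """Tr(ab) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def outcome_distribution(density, povm):
    """mu = (Tr D E_1, ..., Tr D E_m); the traces are real for Hermitian inputs."""
    return [trace_product(density, e).real for e in povm]

def classical_relative_entropy(mu, nu):
    return sum(a * math.log(a / b) for a, b in zip(mu, nu) if a > 0)

def diagonal_relative_entropy(d1, d2):
    """S(D1, D2) when both densities are diagonal in the same basis."""
    n = len(d1)
    return classical_relative_entropy([d1[i][i] for i in range(n)],
                                      [d2[i][i] for i in range(n)])

def example_povm(x, z):
    """E1, E2, E3 as in Example 4.1; positivity needs |z| <= min(x, 1 - x)."""
    zc = z.conjugate()
    e1 = diag(1.0, 0.0, 0.0)
    e2 = [[0, 0, 0], [0, x, z], [0, zc, x]]
    e3 = [[0, 0, 0], [0, 1 - x, -z], [0, -zc, 1 - x]]
    return [e1, e2, e3]
```

Because the densities are diagonal, only the diagonal entries of the $E_j$ enter the outcome probabilities, which is why the value of $z$ does not affect the two distributions.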
We give an example showing that the equality in (14) may appear non-trivially.

Example 4.1. Let $D_2 = \operatorname{Diag}(1/3, 1/3, 1/3)$, $D_1 = \operatorname{Diag}(1 - 2\mu, \mu, \mu)$, $E_1 = \operatorname{Diag}(1, 0, 0)$ and

$$E_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & x & z \\ 0 & \bar z & x \end{bmatrix} , \qquad E_3 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 - x & -z \\ 0 & -\bar z & 1 - x \end{bmatrix} .$$

When $0 < \mu < 1/2$, $0 < x < 1$ and the modulus of the complex number $z$ is small enough, we have a partition of unity and $S(\mu_1, \mu_2) = S(D_1, D_2)$ holds.

First we prove a lemma.

Lemma 4.1. If $D_2$ is an invertible density then the equality in (14) implies that $D_2$ commutes with $D_1, E_1, E_2, \ldots, E_m$.

The linear operator $T$ associates a diagonal matrix $\operatorname{Diag}(\operatorname{Tr} D E_1, \operatorname{Tr} D E_2, \ldots, \operatorname{Tr} D E_m)$
to the density $D$ acting on $H$ and, under the hypothesis, (11) is at our disposal. We have

$$\langle D_2^{1/2}, T^*(T(D_1)^{it} T(D_2)^{-it}) D_2^{1/2} \rangle = \langle D_2^{1/2}, D_1^{it} D_2^{-it} D_2^{1/2} \rangle .$$

Actually we benefit from the analytic continuation and we put $-i/2$ in place of $t$. Hence

$$\sum_{j=1}^m (\operatorname{Tr} E_j D_1)^{1/2} (\operatorname{Tr} E_j D_2)^{1/2} = \operatorname{Tr} D_1^{1/2} D_2^{1/2} . \tag{15}$$

The Schwarz inequality tells us that

$$\operatorname{Tr} D_1^{1/2} D_2^{1/2} = \langle D_1^{1/2}, D_2^{1/2} \rangle = \sum_{j=1}^m \langle D_1^{1/2} E_j^{1/2}, D_2^{1/2} E_j^{1/2} \rangle \le \sum_{j=1}^m \sqrt{\langle D_1^{1/2} E_j^{1/2}, D_1^{1/2} E_j^{1/2} \rangle} \sqrt{\langle D_2^{1/2} E_j^{1/2}, D_2^{1/2} E_j^{1/2} \rangle} = \sum_{j=1}^m (\operatorname{Tr} E_j D_1)^{1/2} (\operatorname{Tr} E_j D_2)^{1/2} .$$

The condition for equality in the Schwarz inequality is well-known: there are some complex numbers $\lambda_j \in \mathbb{C}$ such that

$$D_1^{1/2} E_j^{1/2} = \lambda_j D_2^{1/2} E_j^{1/2} . \tag{16}$$

(Since both sides have positive trace, the $\lambda_j$ are actually positive.) The operators $E_j$ and $E_j^{1/2}$ have the same range, therefore

$$D_1^{1/2} E_j = \lambda_j D_2^{1/2} E_j . \tag{17}$$

Summing over $j$ we obtain

$$D_2^{-1/2} D_1^{1/2} = \sum_{j=1}^m \lambda_j E_j .$$

Here the right hand side is self-adjoint, so $D_2^{-1/2} D_1^{1/2} = D_1^{1/2} D_2^{-1/2}$ and $D_1 D_2 = D_2 D_1$. Now it follows from (16) that $E_j$ commutes with $D_2$.

Next we analyse the equality in (14). If $D_2$ is invertible, then the previous lemma tells us that $D_1$ and $D_2$ are diagonal in an appropriate basis. In this case $S(\mu_1, \mu_2)$ is determined by the diagonal elements of the matrices $E_j$. Let $E(A)$ denote the diagonal matrix whose diagonal coincides with that of $A$. If $E_j$ is a partition of unity, then so is $E(E_j)$. However, given a partition of unity $F_j$ of diagonal matrices, there could be many choices of a partition of unity $E_j$ such that $E(E_j) = F_j$, in general. For the moment we do not want to deal with this ambiguity, and we assume that we have a basis $e_1, e_2, \ldots, e_n$ consisting of common eigenvectors of the operators $D_1, D_2, E(E_1), E(E_2), \ldots, E(E_m)$:

$$D_i e_k = v_k^i e_k \quad\text{and}\quad E(E_j) e_k = w_{kj} e_k \qquad (i = 1, 2,\ j = 1, 2, \ldots, m,\ k = 1, 2, \ldots, n) .$$
The matrix $[w_{kj}]_{kj}$ is (row) stochastic and condition (17) gives

$$\frac{v_k^1}{v_k^2}\, w_{kj} = (\lambda_j)^2 w_{kj} .$$

This means that $w_{kj} \ne 0$ implies that $v_k^1 / v_k^2$ does not depend on $k$. In other words, $D_1 D_2^{-1}$ is constant on the support of any $E_j$. Let $j$ be equivalent with $k$ if the support of $E(E_j)$ intersects the support of $E(E_k)$. We denote by $[j]$ the equivalence class of $j$ and let $J$ be the set of equivalence classes.

$$P_{[j]} := \sum_{k \in [j]} E(E_k)$$

must be a projection and $\{ P_{[j]} : [j] \in J \}$ is a partition of unity. We deduced above that

$$D_1 D_2^{-1} P_{[j]} = \lambda_j^2 P_{[j]} .$$

One cannot say more about the condition for equality. All these extracted conditions hold in the above example and the $E(E_k)$’s do not determine the $E_k$’s; see the freedom for the variable $z$ in the example.

We can summarise our analysis as follows. The case of equality in (14) implies some commutation relation and the whole problem is reduced to the commutative case. It is not necessary that the positive-operator-valued measure $E_j$ should have projection values.

4.3. The Holevo bound
Let $E_j$ ($1 \le j \le m$) be a partition of unity in $B(K)^+$, $\sum_j E_j = I$. We assume that the density matrix $D \in B(H)$ is in the form of a convex combination $D = \sum_i p_i D_i$ of other densities $D_i$. Given a coarse graining $T : B(H) \to B(K)$ we can say that our signal $i$ appears with probability $p_i$, it is encoded by the density matrix $D_i$, after transmission the density $T(D_i)$ appears in the output and the receiver decides that the signal $j$ was sent with the probability $\operatorname{Tr} T(D_i) E_j$. This is the standard scheme of quantum information transmission. Any density matrix $D_i \in B(H)$ determines a probability distribution

$$\mu_i = (\operatorname{Tr} T(D_i) E_1, \operatorname{Tr} T(D_i) E_2, \ldots, \operatorname{Tr} T(D_i) E_m)$$

on the output. The inequality

$$S(\mu) - \sum_i p_i S(\mu_i) \le S(D) - \sum_i p_i S(D_i) \tag{18}$$

(where $\mu := \sum_i p_i \mu_i$ and $D := \sum_i p_i D_i$) is the so-called Holevo bound for the amount of information passing through the communication channel. Note that the Holevo bound appeared before the use of quantum relative entropy and the first proof was more complicated.
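A small qubit computation can make the two sides of (18) concrete. The sketch below is an illustration under assumptions not made in the text: the channel $T$ is taken to be the identity, signal states are given by Bloch vectors $r_i$ with $D_i = (I + r_i\cdot\sigma)/2$, and each measurement operator is written as $E_j = a_j (I + n_j\cdot\sigma)$ with $a_j \ge 0$, $|n_j| \le 1$, $\sum_j a_j = 1$ and $\sum_j a_j n_j = 0$, so that $\operatorname{Tr} D E_j = a_j (1 + r\cdot n_j)$. All function names are hypothetical.

```python
import math

def shannon(probs):
    """Entropy (natural logarithm) of a finite probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def qubit_entropy(r):
    """von Neumann entropy of the qubit density (I + r.sigma)/2."""
    norm = math.sqrt(sum(c * c for c in r))
    return shannon([(1.0 + norm) / 2.0, (1.0 - norm) / 2.0])

def outcome_probs(r, povm):
    """povm is a list of pairs (a_j, n_j) encoding E_j = a_j (I + n_j.sigma)."""
    return [a * (1.0 + sum(ri * ni for ri, ni in zip(r, n))) for a, n in povm]

def mix(weights, vectors):
    """Convex combination of equally long vectors."""
    return [sum(w * v[k] for w, v in zip(weights, vectors))
            for k in range(len(vectors[0]))]

def holevo_sides(priors, bloch_vectors, povm):
    """Return (left-hand side, right-hand side) of (18) with T the identity map."""
    avg_state = mix(priors, bloch_vectors)
    chi = qubit_entropy(avg_state) - sum(
        p * qubit_entropy(r) for p, r in zip(priors, bloch_vectors))
    mus = [outcome_probs(r, povm) for r in bloch_vectors]
    mu = mix(priors, mus)
    info = shannon(mu) - sum(p * shannon(m) for p, m in zip(priors, mus))
    return info, chi
```

For two equally likely pure signals with Bloch vectors `(0, 0, 1)` and `(1, 0, 0)` and the projective measurement `[(0.5, (0, 0, 1)), (0.5, (0, 0, -1))]`, the left-hand side is strictly smaller than the right-hand side; for the commuting signals `(0, 0, 1)` and `(0, 0, -1)` the two sides coincide.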
$\mu_i$ is a coarse-graining of $T(D_i)$, therefore inequality (18) is of the form

$$\sum_i p_i S(R(D_i), R(D)) \le \sum_i p_i S(D_i, D) .$$

On the one hand, this form shows that the bound (18) is a consequence of the monotonicity; on the other hand, we can make an analysis of the equality. Since the states $D_i$ are the codes of the messages to be transmitted, it would be too much to assume that all of them are invertible. However, we may assume that $D$ and $T(D)$ are invertible. Under this hypothesis Lemma 4.1 applies and tells us that the equality in (18) implies that all the operators $T(D)$, $T(D_i)$ and $E_j$ commute.

4.4. α-entropies

The α-divergence of the densities $D_1$ and $D_2$ is

$$S_\alpha(D_1, D_2) = \frac{4}{1 - \alpha^2} \operatorname{Tr} \left( D_1 - D_1^{\frac{1+\alpha}{2}} D_2^{\frac{1-\alpha}{2}} \right) , \tag{19}$$

which is essentially

$$\langle D_2^{1/2}, \Delta^{\frac{1+\alpha}{2}} D_2^{1/2} \rangle$$

up to constants in the notation of Sec. 2. The proof of the monotonicity works for this more general quantity with a small alteration. What we need is

$$\langle D_2^{1/2}, \Delta^\beta D_2^{1/2} \rangle = \frac{\sin \pi\beta}{\pi} \int_0^\infty \left( -t^\beta \langle D_2^{1/2}, (\Delta + t)^{-1} D_2^{1/2} \rangle + t^{\beta - 1} \right) dt$$

for $0 < \beta < 1$. Therefore for $0 < \alpha < 2$ the proof of the above Theorem 3.1 goes through for the α-entropies. The monotonicity holds for the α-entropies; moreover, (i) and (ii) from Theorem 3.1 are necessary and sufficient for the equality. The role of the α-entropies is smaller than that of the relative entropy but they are used for approximation of the relative entropy and for some other purposes (see [9], for example).

5. Strong Subadditivity of Entropy and the Markov Property

The strong subadditivity is a crucial property of the von Neumann entropy; it follows easily from the monotonicity of the relative entropy. (The first proof of this property of entropy was given by Lieb and Ruskai [11] before Uhlmann’s monotonicity theorem.) The strong subadditivity property is related to the composition of three different systems. It is used, for example, in the analysis of the translation invariant states of quantum lattice systems: the proof of the existence of the global entropy density functional is based on the subadditivity, and a monotonicity property of local entropies is obtained by the strong subadditivity [20].

Consider three Hilbert spaces, $H_j$, $j = 1, 2, 3$, and a statistical operator $D_{123}$ on the tensor product $H_1 \otimes H_2 \otimes H_3$. This statistical operator has marginals on all subproducts; let $D_{12}$, $D_2$ and $D_{23}$ be the marginals on $H_1 \otimes H_2$, $H_2$
and H2 ⊗ H3 , respectively. (For example, D12 is determined by the requirement Tr D123 (A12 ⊗ I3 ) = Tr D12 A12 for every operator A12 acting on H1 ⊗ H2 ; D2 and D23 are similarly defined.) The strong subadditivity asserts the following: S(D123 ) + S(D2 ) ≤ S(D12 ) + S(D23 ) .
(20)
In order to prove the strong subadditivity, one can start with the identities S(D123 , tr123 ) = S(D12 , tr12 ) + S(D123 , D12 ⊗ tr3 ) , S(D2 , tr2 ) + S(D23 , D2 ⊗ tr3 ) = S(D23 , tr23 ) , where tr with a subscript denotes the density of the corresponding tracial state, for example tr12 = I12 / dim(H1 ⊗ H2 ). From these equalities we arrive at a new one, S(D123 , tr123 ) + S(D2 , tr2 ) = S(D12 , tr12 ) + S(D23 , tr23 ) + S(D123 , D12 ⊗ tr3 ) − S(D23 , D2 ⊗ tr3 ) . If we know that S(D123 , D12 ⊗ tr3 ) ≥ S(D23 , D2 ⊗ tr3 )
(21)
then the strong subadditivity (20) follows. Define a linear transformation $B(H_1 \otimes H_2 \otimes H_3) \to B(H_2 \otimes H_3)$ as follows:

$$T(A \otimes B \otimes C) := B \otimes C\,(\operatorname{Tr} A) . \tag{22}$$

$T$ is completely positive and trace preserving. On the other hand, $T(D_{123}) = D_{23}$ and $T(D_{12} \otimes \mathrm{tr}_3) = D_2 \otimes \mathrm{tr}_3$. Hence the monotonicity theorem gives (21).

This proof is very transparent and makes the equality case visible. The equality in the strong subadditivity holds if and only if we have equality in (21). Note that $T$ is the partial trace over the first system and

$$T^*(B \otimes C) = I \otimes B \otimes C . \tag{23}$$
Theorem 5.1. Assume that $D_{123}$ is invertible. The equality holds in the strong subadditivity (20) if and only if the following equivalent conditions hold:

(i) $D_{123}^{it} D_{12}^{-it} = D_{23}^{it} D_2^{-it}$ for all real $t$.
(ii) $\log D_{123} - \log D_{12} = \log D_{23} - \log D_2$.

Note that both conditions (i) and (ii) implicitly contain tensor products; all operators should be viewed in the three-fold product. Theorem 3.1 applies due to (23) and this is the proof.

The meaning of conditions (i) and (ii) in Theorem 5.1 is not obvious. The easy choice is

$$\log D_{12} = H_1 + H_2 + H_{12} , \qquad \log D_{23} = H_2 + H_3 + H_{23} , \qquad \log D_2 = H_2$$
for a commutative family of self-adjoint operators $H_1, H_2, H_3, H_{12}, H_{23}$ and to define $\log D_{123}$ by condition (ii) itself. This example lives in an abelian subalgebra of $B(H_1 \otimes H_2 \otimes H_3)$ and a probabilistic representation can be given. $D_{123}$ may be regarded as the joint probability distribution of some random variables $\xi_1$, $\xi_2$ and $\xi_3$. In this language we can rewrite (i) in the form

$$\frac{\operatorname{Prob}(\xi_1 = x_1, \xi_2 = x_2, \xi_3 = x_3)}{\operatorname{Prob}(\xi_1 = x_1, \xi_2 = x_2)} = \frac{\operatorname{Prob}(\xi_2 = x_2, \xi_3 = x_3)}{\operatorname{Prob}(\xi_2 = x_2)} \tag{24}$$

or in terms of conditional probabilities

$$\operatorname{Prob}(\xi_3 = x_3 \mid \xi_1 = x_1, \xi_2 = x_2) = \operatorname{Prob}(\xi_3 = x_3 \mid \xi_2 = x_2) . \tag{25}$$
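The classical content of (20), (24) and (25) can be illustrated with a few lines of Python. The sketch below treats $D_{123}$ as a joint probability distribution stored in a dictionary keyed by triples $(x_1, x_2, x_3)$; the function names and the sample transition matrices are assumptions made for the illustration. It computes the gap $S(D_{12}) + S(D_{23}) - S(D_{123}) - S(D_2)$, which is nonnegative by strong subadditivity and vanishes when the distribution has the Markov form $p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_2)$.

```python
import math
from itertools import product

def entropy(dist):
    """Shannon entropy (natural logarithm) of a distribution given as a dict."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginal distribution on the coordinates listed in keep."""
    out = {}
    for xs, p in joint.items():
        key = tuple(xs[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

def ssa_gap(joint):
    """S(D12) + S(D23) - S(D123) - S(D2) for a classical joint distribution."""
    return (entropy(marginal(joint, (0, 1))) + entropy(marginal(joint, (1, 2)))
            - entropy(joint) - entropy(marginal(joint, (1,))))

def markov_chain(p1, p2_given_1, p3_given_2):
    """Joint distribution p(x1) p(x2|x1) p(x3|x2), which satisfies (25)."""
    joint = {}
    n2, n3 = len(p2_given_1[0]), len(p3_given_2[0])
    for x1, x2, x3 in product(range(len(p1)), range(n2), range(n3)):
        joint[(x1, x2, x3)] = p1[x1] * p2_given_1[x1][x2] * p3_given_2[x2][x3]
    return joint
```

A distribution in which $\xi_3$ copies $\xi_1$ while $\xi_2$ is independent and uniform, for example `{(a, b, a): 0.25 for a in (0, 1) for b in (0, 1)}`, violates (25) and has gap $\log 2$.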
In this form one recognizes the Markov property for the variables $\xi_1$, $\xi_2$ and $\xi_3$; subscripts 1, 2 and 3 stand for “past”, “present” and “future”. It must be well-known that for classical random variables the equality case in the strong subadditivity of the entropy is equivalent to the Markov property. The equality

$$S(D_{123}) - S(D_{12}) = S(D_{23}) - S(D_2) \tag{26}$$

means an equality of entropy increments. Concerning the Markov property, see [2] or [14, pp. 200–203].

Theorem 5.2. Assume that $D_{123}$ is invertible. The equality holds in the strong subadditivity (20) if and only if there exists a completely positive unital mapping $\gamma : B(H_1 \otimes H_2 \otimes H_3) \to B(H_2 \otimes H_3)$ such that

(i) $\operatorname{Tr}(D_{123}\gamma(x)) = \operatorname{Tr}(D_{123} x)$ for all $x$.
(ii) $\gamma | B(H_2) \equiv$ identity.

If $\gamma$ has properties (i) and (ii), then $\gamma^*(D_{23}) = D_{123}$ and $\gamma^*(D_2 \otimes \mathrm{tr}_3) = D_{12} \otimes \mathrm{tr}_3$ for its dual and we have equality in (21).

To prove the converse let

$$E(A \otimes B \otimes C) := B \otimes C\,(\operatorname{Tr} A / \dim H_1) \tag{27}$$

which is completely positive and unital. Set

$$\gamma(\,\cdot\,) := D_{23}^{-1/2} E(D_{123}^{1/2} \cdot D_{123}^{1/2}) D_{23}^{-1/2} . \tag{28}$$

If the equality holds in the strong subadditivity, then property (i) from Theorem 3.1 is at our disposal and it gives $\gamma(x) = x$ for $x \in B(H_2)$.

In a probabilistic interpretation $E$ and $\gamma$ are conditional expectations. $E$ preserves the tracial state and it is a projection of norm one. $\gamma$ leaves the state with density $D_{123}$ invariant, however it is not a projection. (Accardi and Cecchini called this $\gamma$ a generalised conditional expectation [1].)

It is interesting to construct translation invariant states on the infinite tensor product of matrix algebras (that is, quantum spin chain over $\mathbb{Z}$) such that condition (26) holds for all ordered subsystems 1, 2 and 3.
Acknowledgment

The work was supported by the Hungarian OTKA T032662.
References

[1] L. Accardi and C. Cecchini, Conditional expectations in von Neumann algebras and a theorem of Takesaki, J. Funct. Anal. 45 (1982), 245–273.
[2] L. Accardi and A. Frigerio, Markovian cocycles, Proc. R. Ir. Acad. 83 (1983), 251–263.
[3] H. Araki, Relative entropy for states of von Neumann algebras, Publ. RIMS Kyoto Univ. 11 (1976), 809–833.
[4] H. Araki and T. Masuda, Positive cones and Lp-spaces for von Neumann algebras, Publ. RIMS Kyoto Univ. 18 (1982), 339–411.
[5] H. Araki, On an inequality of Lieb and Thirring, Lett. Math. Phys. 19 (1990), 167–170.
[6] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[7] S. Friedland and W. So, On the product of matrix exponentials, Lin. Alg. Appl. 196 (1994), 193–205.
[8] F. Hansen and G. K. Pedersen, Jensen’s inequality for operators and Löwner’s theorem, Math. Ann. 258 (1982), 229–241.
[9] H. Hasegawa and D. Petz, Non-commutative extension of information geometry II, in Quantum Communication, Computing and Measurement, eds. Hirota et al., Plenum Press, New York, 1997.
[10] A. S. Holevo, Information theoretical aspects of quantum measurement, Prob. Inf. Transmission USSR 9 (1973), 31–42.
[11] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum mechanical entropy, J. Math. Phys. 14 (1973), 1938–1941.
[12] G. Lindblad, Completely positive maps and entropy inequalities, Comm. Math. Phys. 40 (1975), 147–151.
[13] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[14] M. Ohya and D. Petz, Quantum Entropy and Its Use, Springer-Verlag, Heidelberg, 1993.
[15] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986), 57–65.
[16] D. Petz, A dual in von Neumann algebras, Quart. J. Math. Oxford 35 (1984), 475–483.
[17] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. Oxford 39 (1988), 97–108.
[18] D. Petz, A variational expression for the relative entropy, Commun. Math. Phys. 114 (1988), 345–348.
[19] D. Petz, A survey of trace inequalities, in Functional Analysis and Operator Theory, Banach Center Publications 30 (Warszawa 1994), pp. 287–298.
[20] D. Petz, Entropy density in quantum statistical mechanics and information theory, in Contributions in Probability, ed. C. Cecchini, Forum, Udine, 1996, pp. 221–226.
[21] M. B. Ruskai, Beyond strong subadditivity? Improved bounds on the contraction of generalized relative entropy, Rev. Math. Phys. 6 (1994), 1147–1161.
[22] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for equality, quant-ph/0205064 (2002).
[23] A. Uhlmann, Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory, Commun. Math. Phys. 54 (1977), 21–32.
[24] H. Umegaki, Conditional expectations in an operator algebra IV (entropy and information), Kodai Math. Sem. Rep. 14 (1962), 59–85.
April 11, 2003 14:43 WSPC/148-RMP
00160
Reviews in Mathematical Physics, Vol. 15, No. 2 (2003) 93–198 © World Scientific Publishing Company
EQUILIBRIUM STATISTICAL MECHANICS OF FERMION LATTICE SYSTEMS
HUZIHIRO ARAKI
Research Institute for Mathematical Sciences, Kyoto University, Kitashirakawa-Oiwakecho, Sakyoku, Kyoto 606-8502, Japan

HAJIME MORIYA
Institute of Particle and Nuclear Studies, High Energy Accelerator Research Organization (KEK), 1-1 Oho, Tsukuba, Ibaraki, 305-0801, Japan

Received 1 July 2002
Revised 30 November 2002

We study equilibrium statistical mechanics of Fermion lattice systems, which require a different treatment compared with spin lattice systems due to the non-commutativity of local algebras for disjoint regions. Our major result is the equivalence of the KMS condition and the variational principle with a minimal assumption for the dynamics and without any explicit assumption on the potential. Its proof applies to spin lattice systems as well, yielding a vast improvement over known results. All formulations are in terms of a C∗-dynamical system for the Fermion (CAR) algebra A with all or a part of the following assumptions:

(I) The interaction is even, namely, the dynamics αt commutes with the even-oddness automorphism Θ. (Automatically satisfied when (IV) is assumed.)
(II) The domain of the generator δα of αt contains the set A◦ of all strictly local elements of A.
(III) The set A◦ is the core of δα.
(IV) The dynamics αt commutes with the lattice translation automorphism group τ of A.

A major technical tool is the conditional expectation from A onto its C∗-subalgebras A(I) for any subset I of the lattice, which induces a system of commuting squares. This technique overcomes the lack of tensor product structures for Fermion systems and even simplifies many known arguments for spin lattice systems. In particular, this tool is used for obtaining the isomorphism between the real vector space of all ∗-derivations with their domain A◦, commuting with Θ, and that of all Θ-even standard potentials which satisfy a specific norm convergence condition for the one point interaction energy. This makes it possible to associate a unique standard potential to every dynamics satisfying (I) and (II). The convergence condition for the potential is a consequence of its definition in terms of the ∗-derivation and not an additional assumption. If translation invariance is imposed on ∗-derivations and potentials, then the isomorphism is kept and the space of translation covariant standard potentials becomes a separable Banach space with respect to the norm of the one point interaction energy.
April 11, 2003 14:43 WSPC/148-RMP
00160
This is a crucial basis for an application of convex analysis to the equivalence proof in the major result. Everything goes in parallel for spin lattice systems without the evenness assumption (I).
Contents
1. Introduction
2. Conditional Expectations
  2.1. Basic properties
  2.2. Geometrical lemma
  2.3. Commuting square
3. Entropy and Relative Entropy
  3.1. Definitions
  3.2. Monotone property
  3.3. Strong subadditivity
4. Fermion Lattice Systems
  4.1. Fermion algebra
  4.2. Product property of the tracial state
  4.3. Conditional expectations for Fermion algebras
  4.4. Commuting squares for Fermion algebras
  4.5. Commutants of subalgebras
5. Dynamics
  5.1. Assumptions
  5.2. Local Hamiltonians
  5.3. Internal energy
  5.4. Potential
  5.5. General potential
6. KMS Condition
  6.1. KMS condition
  6.2. Differential KMS condition
7. Gibbs Condition
  7.1. Inner perturbation
  7.2. Surface energy
  7.3. Gibbs condition
  7.4. Equivalence to KMS condition
  7.5. Product form of the Gibbs condition
8. Translation Invariant Dynamics
  8.1. Translation invariance and covariance
  8.2. Finite range potentials
9. Thermodynamic Limit
  9.1. Surface energy estimate
  9.2. Pressure
  9.3. Mean energy
10. Entropy for Fermion Systems
  10.1. SSA for Fermion systems
  10.2. Mean entropy
  10.3. Entropy inequalities for translation invariant states
11. Variational Principle
  11.1. Extension of even states
  11.2. Variational inequality
  11.3. Variational equality
  11.4. Variational principle
12. Equivalence of Variational Principle and KMS Condition
  12.1. Variational principle from Gibbs condition
  12.2. Some tools of convex analysis
  12.3. Differential KMS condition from variational principle
13. Use of Entropy in the Variational Equality
  13.1. CNT-entropy
  13.2. Variational equality in terms of CNT-entropy
14. Discussion
Appendix: Van Hove Limit
  A.1. Van Hove net
  A.2. Van Hove limit
References
1. Introduction We investigate the equilibrium statistical mechanics of Fermion lattice systems. While equilibrium statistical mechanics of spin lattice systems has been well studied (see e.g. [17], [23] and [40]), there is a crucial difference between spin and Fermion cases. Namely, local algebras for disjoint regions commute elementwise for spin lattice systems, but do not commute for Fermion lattice systems. Due to this difference, the known formulations and proof in the case of spin lattice systems do not necessarily go over to the case of Fermion lattice systems and that is the motivation for this investigation. An example of a Fermion lattice system is the well-studied Hubbard model, to which our results apply. It turned out that, in the matter of the equivalence of the KMS condition and the variational principle (i.e. the minimum free energy) for translation invariant states, we obtain its proof without any explicit assumption on the potential except for the condition that it is the standard potential corresponding to a translation invariant even dynamics, a minimal condition for a proper formulation of the problem. Without any change in the methods of proof, this strong result holds for spin lattice systems as well — a vast improvement over known results for spin lattice systems and a solution of a problem posed by Bratteli and Robinson (Remark after Theorem 6.2.42. [17]). In addition to this major result, we hope that the present work supplies a general mathematical foundation for equilibrium statistical mechanics of Fermion lattice systems, which was lacking so far. There are two distinctive features of our approach. One feature is the central role of the time derivative (i.e. the generator of the dynamics). On one hand, this enables us to deal with all types of potentials without any explicit conditions on their long range or many body behavior, as long as the first time derivative of strictly localized operators can be defined. 
On the other hand, the existence of the dynamics for a given potential is separated from the problems treated here and we can bypass that existence problem via Assumption (III) below.
Another feature is the use of conditional expectations instead of the tensor product structure traditionally used for spin lattice systems. They provide not only a substitute tool (for the tensor product structure), which is applicable to both spin and Fermion lattice systems, but also a method of estimates which does not use the norm of individual potentials, for which we do not impose any explicit condition.

The main subject of our paper is the characterization of equilibrium states in terms of the KMS condition and the variational principle, which have an entirely different appearance but are shown to be equivalent. They refer to canonical ensembles in the infinite volume limit. However, they also refer to grand canonical ensembles if the dynamics is modified by gauge transformations with respect to Fermion numbers [11]. Namely, in the language of potentials, we may add a one-body potential, which consists of the particle number operator(s) times c-number chemical potential(s); the canonical ensemble for the so-modified potential is then the grand canonical ensemble for the original potential. Thus the grand canonical ensemble can be studied as a canonical ensemble for a modified potential, which is within the scope of our theory.

For the sake of notational simplicity, our presentation is for the case of one Fermion at each lattice site. Our results and proofs hold without any essential change for the more general case where a finite number of Fermions and finite spins coexist at each lattice site. The even-oddness in that case refers to the total Fermion number. For example, for the Hubbard model, there are two Fermions at each lattice site, representing the two components of a spin 1/2 Fermion.
Our starting point is a C∗-dynamical system (A, αt), where A is the C∗-algebra of Fermion creation and annihilation operators on lattice sites of Zν with local subalgebras A(I) for finite subsets I ⊂ Zν and αt is a given strongly continuous one-parameter group of ∗-automorphisms of A. Since the normal starting point in statistical mechanics is a potential, a digression on our formulation and strategy starting from a given dynamics may be appropriate at this point. The KMS condition, which is formulated in terms of the dynamics, is one of two main components of our equivalence result. On the other hand, the variational principle, which is formulated in terms of the potential, is the other main component. Therefore both dynamics and potential are indispensable for our main results and their mutual relation is of utmost importance. The key equation for that relation is the following formula. For any operator A localized in a finite subset I of the lattice, its time derivative is given by
\[ \frac{d}{dt}\,\alpha_t(A) = \alpha_t\bigl(i[H(I), A]\bigr) , \]
where H(I) is described as a sum of potentials Φ(J), based on a finite subset J of the lattice, the sum being over all J except those J for which Φ(J) commutes with any A localized in I, thus H(I) depending on I. The problem of construction of αt from a given class of potentials is not a straightforward task and has been studied by many people. As a result, a large
number of results are known for quantum spin lattice systems (see e.g. [17]) and most of them can be applied to Fermion lattice systems. There are also some specific analyses for Fermion lattice systems (see e.g. [29]). In parallel, the equivalence of the KMS condition and the variational principle for translation invariant states has been proved for a wide class of potentials for quantum spin lattice systems. The same proof also works for Fermion lattice systems in most cases; for example this is the case for finite range potentials (see e.g. p. 113 of [30]). While these results cover a wide range of explicit models, it seems difficult to decide exactly which class of potentials determine a dynamics and to show the equivalence in question in most general cases (which is not explicitly known) from the potential point of view. In the present work, we do not intend to make any contribution to the problem of either construction of a dynamics from a potential, or giving a complete criterion for potentials, which give rise to a unique dynamics. (Thus we do not directly contribute to the study of explicit models.) On the contrary, we avoid these difficult problems by assuming that the dynamics is already given (since this is needed in any case for the KMS condition) and prove the equivalence result in question under minimal (general) assumptions on the dynamics, explained immediately below. Note that we do not make any explicit assumptions about the existence of a potential for a given dynamics nor about its property (such as the absolute convergence of the sum defining H(I) in terms of the potential). For any given dynamics, for which all finitely localized operators have the time derivative at t = 0 (Assumption (II) below) and which is lattice translation invariant (Assumption (IV) below), we show the existence of a corresponding potential, of which H(I) is a sum (as in usual formulation) convergent in a well-defined sense. 
We now explain our assumptions and interconnection of dynamics with potentials in more detail. The following two assumptions make it possible to associate a potential to any given dynamics satisfying them. (I) The dynamics is even. In other words, αt Θ = Θ αt for any t ∈ R, where Θ is an involutive automorphism of A, multiplying −1 on all creation and annihilation operators. (II) The domain D(δα ) of the generator δα of αt includes A◦ , the union of all A(I) for all finite subsets I of the lattice. It should be noted that Assumption (I) follows from Assumption (IV) below. (See Proposition 8.1.) We denote by ∆(A◦ ) the set of all ∗-derivations with A◦ as their domain and their values in A, commuting with Θ (on A◦ ). Then the generator δα of our αt , when restricted to A◦ , belongs to ∆(A◦ ). It is shown that ∆(A◦ ) is in one-to-one correspondence with the set P of standard even potentials, which are functionals Φ(I) of all finite subsets I of the lattice
with values in the self-adjoint Θ-even part of the local algebra A(I), satisfying our standardness condition and a topological convergence condition (Theorem 5.13). The topological convergence condition ((Φ-e) in Definition 5.10) is required in order that the potential is associated with a ∗-derivation on A◦ and refers to the convergence of the interaction energy operator for every finite subset I
\[ H(I) = \sum_{K} \{\Phi(K);\ K \cap I \neq \emptyset\} , \]
where a finite sum is first taken over K contained in a finite subset J and the limit of J tending to the whole lattice is to converge in the norm topology of A. (If this condition is satisfied for every one-point set I = {n} (n ∈ Zν), then it is satisfied for all finite subsets I.) Note the difference from conventional topological conditions, such as summability of ‖Φ(I)‖ over all I containing a point n, which are assumed for the sake of mathematical convenience. For Φ ∈ P, internal energy U(I) and surface energy W(I) are also given in terms of Φ by the conventional formulae for every finite I. The connection of the derivation δ and the corresponding potential Φ is given by
\[ \delta A = i[H(I), A] \qquad (A \in A(I)) . \]
Due to the Θ-evenness assumption (I), the replacement of H(I) by H(K) with K ⊃ I gives the same δ on A(I), a necessary condition for consistency. The standardness ((Φ-d) in Definition 5.10) is formulated in terms of conditional expectations and picks up a unique potential for each δ ∈ ∆(A◦ ). Without the standardness condition, there are many different potentials (called equivalent potentials) which yield exactly the same δ through the above formulae. Through the one-to-one correspondence between δ(∈ ∆(A◦ )) and Φ(∈ P), any dynamics αt satisfying our standing assumptions (I) and (II) is associated with a unique standard potential Φ ∈ P. This is a crucial point of our formulation, leading to our major result. When we want to derive a statement involving αt from a condition involving the potential Φ, we need the following assumption, guaranteeing the unique determination of αt from the given Φ: (III) A◦ is the core of the generator δα of the dynamics αt . For the discussion of variational principle, we need the translation invariance assumption for the dynamics: (IV) αt τk = τk αt , where τk , k ∈ Zν , is the automorphism group of A representing the lattice translations. The above Assumptions (I)–(IV) are the only assumptions needed for our theory below. On the other hand, if a potential Φ (say, in the class P) is first given for any model, it is a hard problem in general to show that the corresponding derivation
δΦ ∈ ∆(A◦) is given by some dynamics satisfying Assumptions (II) and (III), or equivalently that the closure of δΦ is a generator of a dynamics (i.e. it can be exponentiated to a one-parameter group of automorphisms of A).

We now present our main theorem after the explanation about the variational principle and its ingredients. The set Pτ of all translation covariant potentials in P forms a Banach space (Proposition 8.8) with respect to the norm
\[ \|\Phi\| \equiv \|H(\{n\})\| , \]
which is independent of the lattice point n. The finite range potentials are shown to be dense in Pτ with respect to this norm and to imply separability of Pτ (Theorem 8.12 and Corollary 8.13). In terms of this norm, we obtain the energy estimate
\[ \|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I| , \]
where |I| is the cardinality of I (Lemma 8.6). Then the conventional estimate for W(I) follows. These estimates are used to show the existence of the thermodynamic functionals, such as pressure P(Φ) and mean energy eΦ(ω). All these estimates are carried out by the technique of conditional expectations without using the norm of the individual Φ(I).

For any state ω of A, its local entropy
\[ S_{A(I)}(\omega) = S(\omega|_{A(I)}) \]
is given as usual by the von Neumann entropy S(·). Due to the non-commutativity of local algebras for disjoint regions, not all known properties of entropy for spin lattice systems hold for our Fermion case [33]. However, the strong subadditivity of entropy (SSA) for Fermion systems holds. Then the existence of the mean entropy s(ω) for any translation invariant state ω for Fermion lattice systems follows by a known method of spin lattice systems.

The variational principle refers to the following equation for a translation invariant state ϕ of A for a given translation covariant potential Φ (∈ Pτ) and β ∈ R:
\[ P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) . \tag{1.1} \]
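A finite-dimensional analogue may help to fix the sign conventions in (1.1). The sketch below checks the classical Gibbs variational principle for a single Hermitian matrix h: log Tr exp(−βh) is an upper bound for S(ρ) − β Tr(ρh) over density matrices ρ, with equality at the Gibbs state. The matrix h, the dimension n and all function names are illustrative assumptions and are not taken from the paper, whose P, s and eΦ are densities per lattice site in the thermodynamic limit.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """A random Hermitian matrix standing in for a finite-volume Hamiltonian."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (x + x.conj().T) / 2

def random_density_matrix(n):
    """A random full-rank density matrix (positive, trace one)."""
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = x @ x.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    """von Neumann entropy -Tr(rho log rho)."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

def functional(rho, h, beta):
    """The quantity S(rho) - beta * Tr(rho h) that is maximized."""
    return entropy(rho) - beta * np.trace(rho @ h).real

n, beta = 4, 0.7                      # illustrative values
h = random_hermitian(n)
evals, evecs = np.linalg.eigh(h)
weights = np.exp(-beta * evals)

pressure = float(np.log(weights.sum()))                    # log Tr exp(-beta h)
gibbs = (evecs * (weights / weights.sum())) @ evecs.conj().T

# Equality at the Gibbs state, inequality for arbitrary states.
gap_gibbs = pressure - functional(gibbs, h, beta)
gaps_random = [pressure - functional(random_density_matrix(n), h, beta)
               for _ in range(200)]
print(gap_gibbs, min(gaps_random))
```

In this toy setting the gap equals the relative entropy between the trial state and the Gibbs state, which is why it is non-negative and vanishes only at the Gibbs state.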
Our major result can be formulated as the following two theorems. Theorem A. Under Assumptions (II) and (IV) for the dynamics αt , any translation invariant state, which satisfies the KMS condition for αt at the inverse temperature β, is a solution of Eq. (1.1), where Φ is the unique standard potential corresponding to αt . Theorem B. Under Assumptions (II), (III) and (IV) for the dynamics αt , any solution ϕ of (1.1) satisfies the KMS condition for αt at β. Remark. These two theorems hold also for spin lattice systems. We now present an over-all picture of the proof of our main results above. The proof of Theorem A and Theorem B will be carried out through the following steps:
(1) KMS condition ⇒ Gibbs condition.
(2) Gibbs condition ⇒ Variational principle.
(3) Variational principle ⇒ dKMS condition on A◦.
(4) dKMS condition on A◦ ⇒ dKMS condition on D(δα).
(5) dKMS condition on D(δα) ⇒ KMS condition.
Assumptions (I) and (II) are used throughout (1)–(5). Assumption (IV) is used for the formulation of the variational principle and necessarily for (2) and (3). It is also used to derive Assumption (I), which is not included in the premise of Theorems A and B. Assumption (III) is used only for (4). The differential KMS (abbreviated as dKMS) condition in (4) and (5) refers to a known condition, which is entirely described in terms of the generator δα of αt and without use of αt (Definition 6.3). This condition on the full domain D(δα ) of the generator δα of αt is known to be equivalent to the KMS condition (which is Step (5)). The differential KMS condition for our purpose is the condition for the restriction of δα to A◦ . Thus we need to show Step (4) using the additional assumption (III) on αt . For Steps (1) and (2), we follow the proof for spin lattice systems in principle. However, the Gibbs condition for Fermion lattice systems requires a careful definition. We define the Gibbs condition for a state ϕ as the requirement that the local algebra A(I) is in the centralizer of the perturbed functional ϕβH(I) , which is obtained from ϕ by a perturbation βH(I), for each finite subset I of the lattice (Definition 7.1 and Lemma 7.2). When A(I) and A(Ic ) commute (as in the case of spin lattice systems), this condition reduces to the product type characterization which was introduced and called the Gibbs condition by Araki and Ion for quantum spin lattice systems [5]. With our definition of the Gibbs condition, we have been able to prove Steps (1) and (2). The product type characterization mentioned above is the condition that ϕβH(I) is the product of the tracial state of A(I) and its restriction to the complement algebra A(Ic ). In the present case of Fermion lattice systems, we show that a Gibbs state satisfies this condition if and only if it is an even state of A (Proposition 7.7). The same kind of formulation and result are valid for a perturbation βW (I). 
For Step (3) as well as for the proof of the variational equality
\[ P(\beta\Phi) = \sup_{\omega \in A^{*\,\tau}_{+,1}} \{ s(\omega) - \beta e_\Phi(\omega) \} , \tag{1.2} \]
which is crucial for the variational principle, we need a product state of local Gibbs states. For this purpose, we have a technical result about the existence of a joint extension from states of local algebras for disjoint subsets of the lattice to a state of the algebra for their union, which holds if all the individual states, with at most one exception, are even (Theorem 11.2).
As an aside, the converse of Step (1) is shown under Assumptions (I), (II) and (III) (Theorem 7.6).

A major tool of our analysis is the C∗-algebra conditional expectation $E_I : A \to A(I)$ with respect to the unique tracial state τ of A. Its existence is shown not only for finite subsets but for all subsets I of the lattice (Theorem 4.7). Based on the product property of τ for subalgebras A(I) and A(J) for disjoint I and J, we obtain the following commuting square of C∗-subalgebras (Theorem 4.13) for Fermion systems. (It holds trivially for spin systems.)
\[
\begin{array}{ccc}
A(I \cup J) & \xrightarrow{\;E_I\;} & A(I) \\
\Big\downarrow {\scriptstyle E_J} & & \Big\downarrow {\scriptstyle E_{I\cap J}} \\
A(J) & \xrightarrow{\;E_{I\cap J}\;} & A(I \cap J) .
\end{array}
\]
This serves as a replacement for the tensor-product structure in traditional arguments for spin lattice systems. As by-products, we obtain a few useful results on the CAR algebra: the even-odd automorphism Θ is shown to be outer for any infinite CAR algebra (Corollary 4.20), and formulae for commutants of A(I) and A(I)+ in A for finite and infinite I are obtained (Theorem 4.17 and Theorem 4.19).

Some more results contained in this paper are as follows. We show the validity of the variational equality (1.2) when the Connes–Narnhofer–Thirring entropy $h_\omega(\tau)$ with respect to the group of lattice translation automorphisms τ is used in place of the mean entropy s(ω) (Theorem 13.2). Note that our system (A, τ), where τ denotes the group of lattice translation automorphisms, does not belong to the class of C∗-systems considered in [34], being a non-abelian system.

We define general potentials as those which satisfy all conditions for those in P except for the standardness. They include all potentials satisfying the following condition:
\[ \sum_{I \ni n} \|\Phi(I)\| < \infty \tag{1.3} \]
for every lattice point n. For each general potential, the corresponding H(I) and δ are defined, and there is a unique standard potential in P with the same δ as a given general potential, as described earlier. Restricting our attention to those general potentials satisfying (1.3) (a condition which is introduced also in some discussions of spin lattice systems), we are able to show by a straightforward argument that the set of solutions of the variational principle for a general translation covariant potential satisfying (1.3) coincides with that for the equivalent standard potential (which is automatically translation covariant) (Remark 1 to Proposition 14.1), although the pressure and the mean energy may be different between the two potentials.
2. Conditional Expectations

2.1. Basic properties

The following proposition is well-known (see, e.g. Proposition 2.36, Chapter V [43]).

Proposition 2.1. Let M be a von Neumann algebra with a faithful normal tracial state τ and N be its von Neumann subalgebra. Then there exists a unique conditional expectation
\[ E^M_N : a \in M \mapsto E^M_N(a) \in N \]
satisfying
\[ \tau(ab) = \tau(E^M_N(a)\,b) \tag{2.1} \]
for any b ∈ N.

Remark. A conditional expectation $E^M_N$ is linear, positive, unital, and satisfies
\[ E^M_N(ab) = E^M_N(a)\,b , \qquad E^M_N(ba) = b\,E^M_N(a) , \tag{2.2} \]
for any a ∈ M and b ∈ N, and
\[ \|E^M_N\| = 1 . \tag{2.3} \]
We shall obtain a C∗-version of this proposition for the Fermion algebra in Sec. 4, where M and N are C∗-algebras with a unique tracial state τ. The main step of its proof is the existence of $E^M_N(a) \in N$ for every a ∈ M satisfying (2.1). Once it is established, the map $E^M_N$ is a conditional expectation by a standard argument, which we formulate for the sake of completeness as follows.

Lemma 2.2. Let M be a unital C∗-algebra with a faithful tracial state τ and N be its subalgebra containing the identity of M. Suppose that for every a ∈ M there exists an element $E^M_N(a)$ of N satisfying (2.1). Then the map $E^M_N$ from M to N is the unique conditional expectation from M to N with respect to τ, possessing the following properties:
(1) $E^M_N$ is a linear, positive and unital map from M onto N.
(2) For any a ∈ M and b ∈ N,
\[ E^M_N(ab) = E^M_N(a)\,b , \qquad E^M_N(ba) = b\,E^M_N(a) . \]
(3) $E^M_N$ is a projection of norm 1.

Proof. First we prove the uniqueness of $E^M_N(a) \in N$ satisfying (2.1) for a given a ∈ M. Let a′ and a″ in N satisfy (2.1), namely,
\[ \tau(ab) = \tau(a'b) = \tau(a''b) \]
for all b ∈ N. Then
\[ \tau(b(a' - a'')) = 0 . \]
By taking b = (a′ − a″)∗ and using the faithfulness of τ, we obtain a′ − a″ = 0, hence the uniqueness of $E^M_N(a) \in N$ for each a ∈ M.

Except for the positivity, (1) and (2) can be shown in the same pattern as follows. Let a = c1 a1 + c2 a2 where a1, a2 ∈ M and c1, c2 ∈ C. Then for any b ∈ N,
\[
\begin{aligned}
\tau(ab) &= c_1\tau(a_1 b) + c_2\tau(a_2 b) = c_1\tau(E^M_N(a_1)b) + c_2\tau(E^M_N(a_2)b) \\
&= \tau(\{c_1 E^M_N(a_1) + c_2 E^M_N(a_2)\}b) .
\end{aligned}
\]
Since $c_1 E^M_N(a_1) + c_2 E^M_N(a_2) \in N$, the uniqueness already shown implies
\[ c_1 E^M_N(a_1) + c_2 E^M_N(a_2) = E^M_N(a) . \]
Therefore, $E^M_N$ is linear. In the same way, for any a ∈ M and b ∈ N,
\[ \tau(abb') = \tau(E^M_N(a)bb') \]
holds for all b′ ∈ N and hence
\[ E^M_N(ab) = E^M_N(a)\,b . \]
Also
\[ \tau(bab') = \tau(ab'b) = \tau(E^M_N(a)b'b) = \tau(bE^M_N(a)b') \]
implies
\[ E^M_N(ba) = b\,E^M_N(a) . \]
If a ∈ N, then the identity $\tau(ab) = \tau(E^M_N(a)b)$ with b ∈ N and the uniqueness result imply
\[ E^M_N(a) = a . \]
Therefore $E^M_N$ is a map onto N. By taking a = 1 (∈ N), we have
\[ E^M_N(1) = 1 . \]
Hence $E^M_N$ is unital. Since $E^M_N(a) \in N$ for any a ∈ M, we have $E^M_N(E^M_N(a)) = E^M_N(a)$. Therefore $E^M_N$ is a projection.

To show the positivity of the map $E^M_N$, we consider the GNS triplet for the tracial state $\tau_N$ of N (which is the restriction of τ to N) consisting of a Hilbert space $H^N_\tau$, a representation $\pi^N_\tau$ of N on $H^N_\tau$ and a unit vector $\Omega^N_\tau \in H^N_\tau$, giving rise to the state $\tau_N(A) = \tau(A) = (\Omega^N_\tau, \pi^N_\tau(A)\Omega^N_\tau)$ for A ∈ N. If a ∈ M and a ≥ 0, then for b ∈ N
\[
\begin{aligned}
(\pi^N_\tau(b)\Omega^N_\tau,\ \pi^N_\tau(E^M_N(a))\,\pi^N_\tau(b)\Omega^N_\tau) &= \tau_N(b^* E^M_N(a)\, b) \\
&= \tau_N(E^M_N(a)\, bb^*) = \tau(abb^*) = \tau(b^*ab) \ge 0 .
\end{aligned}
\]
Since $\pi^N_\tau(b)\Omega^N_\tau$, b ∈ N, is dense in $H^N_\tau$, we obtain
\[ \pi^N_\tau(E^M_N(a)) \ge 0 . \]
Since $\pi^N_\tau$ is faithful,
\[ E^M_N(a) \ge 0 , \]
and the positivity of $E^M_N$ is shown. For any a ∈ M, the faithfulness of $\pi^N_\tau$ implies
\[
\begin{aligned}
\|E^M_N(a)\| &= \|\pi^N_\tau(E^M_N(a))\| \\
&= \sup_{b_1, b_2 \in N} \{ |(\pi^N_\tau(b_1)\Omega^N_\tau,\ \pi^N_\tau(E^M_N(a))\,\pi^N_\tau(b_2)\Omega^N_\tau)| ;\ \|\pi^N_\tau(b_1)\Omega^N_\tau\| \le 1,\ \|\pi^N_\tau(b_2)\Omega^N_\tau\| \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |\tau(b_1^* E^M_N(a)\, b_2)| ;\ \tau(b_1^* b_1) \le 1,\ \tau(b_2^* b_2) \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |\tau(b_1^* a\, b_2)| ;\ \tau(b_1^* b_1) \le 1,\ \tau(b_2^* b_2) \le 1 \} \\
&= \sup_{b_1, b_2 \in N} \{ |(\pi^M_\tau(b_1)\Omega^M_\tau,\ \pi^M_\tau(a)\,\pi^M_\tau(b_2)\Omega^M_\tau)| ;\ \|\pi^M_\tau(b_1)\Omega^M_\tau\| \le 1,\ \|\pi^M_\tau(b_2)\Omega^M_\tau\| \le 1 \} \\
&\le \|\pi^M_\tau(a)\| = \|a\| ,
\end{aligned}
\tag{2.4}
\]
where we have used the cyclicity of $\Omega^N_\tau$ for $\pi^N_\tau(N)$ in $H^N_\tau$ for the second equality,
\[ \tau(b_1^* E^M_N(a)\, b_2) = \tau(E^M_N(a)\, b_2 b_1^*) = \tau(a\, b_2 b_1^*) = \tau(b_1^* a\, b_2) \]
for the fourth equality, and the same computation backwards replacing N by M for the fifth equality. Due to $E^M_N(1) = 1$ and (2.4), we have
\[ \|E^M_N\| = 1 . \]
We have completed the proof.

2.2. Geometrical lemma

Let us consider finite type I factors (i.e. full matrix algebras) M and N such that M ⊃ N. We have the isomorphisms M ≃ N ⊗ N1, N ≃ N ⊗ 1, and $\tau = \tau_N \otimes \tau_{N_1}$, where N1 ≡ M ∩ N′ is a finite type I factor. A conditional expectation satisfying (2.1) is given by the slice map:
\[ E^M_N(b b_1) = \tau(b_1)\, b \qquad (b \in N,\ b_1 \in N_1) . \tag{2.5} \]
We give this $E^M_N$ a geometrical picture which we find useful. We introduce the following inner product on M:
\[ \langle a, b \rangle \equiv \tau(a^* b) \qquad (a, b \in M) . \]
M is then a (finite-dimensional) Hilbert space with this inner product. Let $P^M_N$ be the orthogonal projection onto the subspace N of M. We show that $P^M_N$ is the same as $E^M_N$ as a map M → N.

Lemma 2.3. With the notation above,
\[ P^M_N a = E^M_N(a) \tag{2.6} \]
for any a ∈ M.

Proof. Any a ∈ M can be decomposed as $a = P^M_N a + a'$ where a′ ∈ N⊥. For any b ∈ N, we have b∗ ∈ N and hence
\[
\begin{aligned}
\tau(ab) &= \langle b^*, a \rangle = \langle b^*, P^M_N a \rangle + \langle b^*, a' \rangle \\
&= \langle b^*, P^M_N a \rangle = \tau((P^M_N a)\, b) .
\end{aligned}
\]
Since $P^M_N a \in N$, it follows from Proposition 2.1 that
\[ P^M_N a = E^M_N(a) . \]
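The slice map (2.5) and Lemma 2.3 can be checked numerically in a small matrix example. The sketch below takes M = M_{d1} ⊗ M_{d2} and N = M_{d1} ⊗ 1, realizes the conditional expectation as a normalized partial trace, and verifies (2.1), (2.2), the orthogonality of a − E(a) to N, and the norm bound behind (2.3). The sizes d1, d2 and all function names are illustrative assumptions, not notation from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d1, d2 = 2, 3   # M = M_{d1} (x) M_{d2},  N = M_{d1} (x) 1  (illustrative sizes)

def tau(a):
    """Tracial state: trace normalized so that tau(1) = 1."""
    return np.trace(a) / a.shape[0]

def cond_exp(a):
    """Slice map (id (x) tau) onto N = M_{d1} (x) 1, as in (2.5)."""
    a4 = a.reshape(d1, d2, d1, d2)              # indices (i, k, j, l)
    reduced = np.einsum('ikjk->ij', a4) / d2    # normalized partial trace
    return np.kron(reduced, np.eye(d2))

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

a = random_matrix(d1 * d2)                       # generic element of M
b = np.kron(random_matrix(d1), np.eye(d2))       # generic element of N

# Defining property (2.1): tau(ab) = tau(E(a) b).
defect_trace = abs(tau(a @ b) - tau(cond_exp(a) @ b))

# Bimodule property (2.2): E(ab) = E(a) b.
defect_bimodule = np.linalg.norm(cond_exp(a @ b) - cond_exp(a) @ b)

# Lemma 2.3: a - E(a) is orthogonal to N for <x, y> = tau(x* y).
residual = a - cond_exp(a)
units = [np.kron(np.outer(np.eye(d1)[i], np.eye(d1)[j]), np.eye(d2))
         for i in range(d1) for j in range(d1)]
defect_orth = max(abs(tau(u.conj().T @ residual)) for u in units)

# Norm bound behind (2.3): ||E(a)|| <= ||a|| in the operator norm.
norm_ratio = np.linalg.norm(cond_exp(a), 2) / np.linalg.norm(a, 2)
print(defect_trace, defect_bimodule, defect_orth, norm_ratio)
```

The three defects should vanish up to rounding, and the ratio should not exceed 1.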
2.3. Commuting square

We introduce the following equivalent conditions for a commuting square. (See e.g. [21].)

Proposition 2.4. Let M, N1, N2 and P be finite type I factors satisfying
\[ M \supset N_1 \supset P , \qquad M \supset N_2 \supset P . \]
Then the following conditions are equivalent:
(1) $E^M_{N_1}|_{N_2} = E^{N_2}_P$
(2) $E^M_{N_2}|_{N_1} = E^{N_1}_P$
(3) $E^M_{N_1} E^M_{N_2} = E^M_{N_2} E^M_{N_1}$ and $P = N_1 \cap N_2$
(4) $E^M_{N_1} E^M_{N_2} = E^M_P$
(5) $E^M_{N_2} E^M_{N_1} = E^M_P$.
Proof. (1) ⇔ (4): Assume (1). Let a ∈ M and b ∈ P. By the assumption, we have
\[ E^M_{N_1}(E^M_{N_2}(a)) = E^{N_2}_P(E^M_{N_2}(a)) \in P \]
due to $E^M_{N_2}(a) \in N_2$. On the other hand,
\[
\begin{aligned}
\tau(E^M_{N_1}(E^M_{N_2}(a))\, b) &= \tau(E^M_{N_2}(a)\, b) && (\text{due to } b \in (P \subset) N_1) \\
&= \tau(ab) && (\text{due to } b \in (P \subset) N_2) .
\end{aligned}
\]
Hence $E^M_{N_1}(E^M_{N_2}(a)) = E^M_P(a)$ and so $E^M_{N_1} E^M_{N_2} = E^M_P$. The converse is obvious: for a ∈ N2, (4) implies
\[ E^{N_2}_P(a) = E^M_P(a) = E^M_{N_1} E^M_{N_2}(a) = E^M_{N_1}(a) \]
and hence (1).

(2) ⇔ (5): Exactly the same proof as above, with N1 and N2 interchanged.

(4) ⇔ (3): Assume (4). By Lemma 2.3, (4) implies
\[ P^M_{N_1} P^M_{N_2} = P^M_P . \]
Taking adjoints, we obtain
\[ P^M_{N_2} P^M_{N_1} = P^M_P . \]
This implies
\[ E^M_{N_2} E^M_{N_1} = E^M_P = E^M_{N_1} E^M_{N_2} , \]
the last equality being due to (4). Due to N1 ⊃ P and N2 ⊃ P, we have P ⊂ N1 ∩ N2. If b ∈ N1 ∩ N2, then
\[ b = E^M_{N_1} E^M_{N_2}(b) = E^M_P(b) \in P \]
by (4). Hence P = N1 ∩ N2. This completes the proof of (4) ⇒ (3).

Assume (3). For any a ∈ M, (3) implies $E^M_{N_1}(E^M_{N_2}(a)) = E^M_{N_2}(E^M_{N_1}(a)) \in N_1 \cap N_2 = P$ because the range of $E^M_{N_1}$ is N1 and the range of $E^M_{N_2}$ is N2. For any b ∈ P and a ∈ M,
\[ \tau(E^M_{N_1}(E^M_{N_2}(a))\, b) = \tau(E^M_{N_2}(a)\, b) = \tau(ab) . \]
Hence $E^M_{N_1}(E^M_{N_2}(a)) = E^M_P(a)$. This implies (4).

(5) ⇔ (3): Exactly the same proof as above, with N1 and N2 interchanged.
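Conditions (3) to (5) of Proposition 2.4 can be illustrated in the tensor-product (spin) setting, where the commuting square property is elementary. The sketch below takes M = M_2 ⊗ M_2 ⊗ M_2, with N1 living on sites {0, 1}, N2 on sites {1, 2} and P on site {1}, and checks that the two conditional expectations commute and compose to the one onto P. The site labels and the helper cond_exp are illustrative assumptions; the Fermion case treated later in the paper is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sites = 3   # M = M_2 (x) M_2 (x) M_2, an illustrative spin-type example

def cond_exp(a, keep):
    """tau-preserving conditional expectation onto the subalgebra of
    matrices acting only on the sites in `keep` (identity elsewhere)."""
    t = a.reshape((2,) * (2 * n_sites))   # row axes 0..n-1, column axes n..2n-1
    for s in range(n_sites):
        if s not in keep:
            # normalized partial trace over site s ...
            traced = np.trace(t, axis1=s, axis2=n_sites + s) / 2
            # ... then put the identity back on that site
            t = np.moveaxis(np.multiply.outer(traced, np.eye(2)),
                            [-2, -1], [s, n_sites + s])
    return t.reshape(2 ** n_sites, 2 ** n_sites)

dim = 2 ** n_sites
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

N1, N2, P = {0, 1}, {1, 2}, {1}
e1_e2 = cond_exp(cond_exp(a, N2), N1)   # E_{N1} E_{N2} (a)
e2_e1 = cond_exp(cond_exp(a, N1), N2)   # E_{N2} E_{N1} (a)
e_p = cond_exp(a, P)                    # E_P (a)

print(np.linalg.norm(e1_e2 - e_p), np.linalg.norm(e2_e1 - e_p))
```

Both printed norms should be zero up to rounding, in agreement with conditions (4) and (5).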
3. Entropy and Relative Entropy

3.1. Definitions

We introduce some definitions and related lemmas needed for formulation of the main result of this section.

Lemma 3.1. Let M be a finite type I factor.
(i) Let ϕ be a positive linear functional on M. Then there exists a unique $\hat\rho_\varphi \in M_+$ (called adjusted density matrix) satisfying
\[ \varphi(a) = \tau(\hat\rho_\varphi\, a) \]
for all a ∈ M.
(ii) Let N be a subfactor of M and $\varphi_N$ be the restriction of ϕ to N. Then
\[ \hat\rho_{\varphi_N} = E^M_N(\hat\rho_\varphi) . \]
Proof. (i) is well-known.
(ii) For b ∈ N, $\varphi_N(b) = \varphi(b) = \tau(\hat\rho_\varphi b) = \tau(E^M_N(\hat\rho_\varphi)\, b)$. Since $E^M_N(\hat\rho_\varphi) \in N_+$, we have
\[ \hat\rho_{\varphi_N} = E^M_N(\hat\rho_\varphi) . \]

Remark. The above definition of density matrix is given in terms of the tracial state in contrast to the standard definition using the matrix trace Tr. Hence we use the word ‘adjusted’.

Definition 3.2. Let $\hat\rho_\varphi$ be the adjusted density matrix of a positive linear functional ϕ of a finite type I factor. Then
\[ \hat S(\varphi) \equiv -\varphi(\log \hat\rho_\varphi) \]
is called the adjusted entropy of ϕ.

Remark. The adjusted density matrix and the adjusted entropy for a type I_n factor M with the dimension Tr(1) = n are related to the usual ones by the following relations:
\[ \hat\rho_\varphi = n\rho_\varphi , \qquad \hat S(\varphi) = S(\varphi) - \varphi(1) \log n . \tag{3.1} \]
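Relation (3.1) is easy to confirm numerically for a state (ϕ(1) = 1) on a full matrix algebra. The sketch below builds a random density matrix ρ with respect to Tr, forms the adjusted density matrix nρ with respect to the tracial state, and compares the two entropies. The dimension n and the helper names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4   # M is the algebra of n x n matrices (illustrative size)

def tau(a):
    """Tracial state Tr(.)/n; real part taken since the arguments below have real trace."""
    return np.trace(a).real / a.shape[0]

def log_psd(m):
    """Logarithm of a positive definite Hermitian matrix via its eigen-decomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.conj().T

x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rho = x @ x.conj().T
rho = rho / np.trace(rho).real     # usual density matrix:    phi(a) = Tr(rho a)
rho_hat = n * rho                  # adjusted density matrix: phi(a) = tau(rho_hat a)

entropy = -np.trace(rho @ log_psd(rho)).real            # S(phi)
adjusted_entropy = -tau(rho_hat @ log_psd(rho_hat))     # -phi(log rho_hat)

print(adjusted_entropy, entropy - np.log(n))
```

The two printed numbers should agree, since log(nρ) = log ρ + (log n)·1 for a state.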
The range of the values of entropy is given by the following well-known lemma.

Lemma 3.3. If M is a type I_n factor and ϕ is a state of M, then
\[ 0 \le S(\varphi) \le \log n . \tag{3.2} \]
The equality S(ϕ) = 0 holds if and only if ϕ is a pure state of M. The equality S(ϕ) = log n holds if and only if ϕ is the tracial state τ of M.

Definition 3.4. The relative entropy of ϱ and σ in M+ as well as that of positive linear functionals ϕ and ψ are defined by
\[ S(\sigma, \varrho) = \tau(\varrho(\log \varrho - \log \sigma)) , \tag{3.3} \]
\[ S(\psi, \varphi) = \varphi(\log \hat\rho_\varphi - \log \hat\rho_\psi) \ (= \tau(\hat\rho_\varphi \log \hat\rho_\varphi - \hat\rho_\varphi \log \hat\rho_\psi)) . \tag{3.4} \]

Remark. S(ψ, ϕ) remains the same if $\hat\rho_\varphi$ and $\hat\rho_\psi$ are replaced by the density matrices $\rho_\varphi$ and $\rho_\psi$ with respect to Tr. The right-hand sides of (3.3) and (3.4) are well-defined when ϱ, σ, $\hat\rho_\varphi$ and $\hat\rho_\psi$ are regular. Otherwise, one may define them as the limit of regular cases, for example by taking the limit ε → 0 for (1 − ε)ϕ + ετ, (1 − ε)ψ + ετ for (3.4), and similarly for (3.3). The value of S(ψ, ϕ) is real or +∞ for positive linear functionals ϕ and ψ.

The following lemma is also well-known.

Lemma 3.5. Let ϕ and ψ be states. Then S(ψ, ϕ) is non-negative. It vanishes if and only if ϕ = ψ.
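The remark after Definition 3.4 and Lemma 3.5 can be tested on random faithful states of a small matrix algebra. The sketch below evaluates (3.4) with adjusted density matrices and with ordinary density matrices with respect to Tr, and checks that the two agree and are non-negative. The dimension, the sample size and the function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3   # M is the algebra of n x n matrices (illustrative size)

def tau(a):
    """Tracial state Tr(.)/n, real part."""
    return np.trace(a).real / a.shape[0]

def log_psd(m):
    """Logarithm of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.conj().T

def random_density_matrix():
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = x @ x.conj().T
    return rho / np.trace(rho).real

def relative_entropy_adjusted(rho_psi, rho_phi):
    """S(psi, phi) as in (3.4), using adjusted density matrices n * rho."""
    hat_psi, hat_phi = n * rho_psi, n * rho_phi
    return tau(hat_phi @ (log_psd(hat_phi) - log_psd(hat_psi)))

def relative_entropy_trace(rho_psi, rho_phi):
    """The same quantity written with Tr and the usual density matrices."""
    return np.trace(rho_phi @ (log_psd(rho_phi) - log_psd(rho_psi))).real

pairs = [(random_density_matrix(), random_density_matrix()) for _ in range(50)]
values = [relative_entropy_adjusted(p, q) for p, q in pairs]
mismatch = max(abs(relative_entropy_adjusted(p, q) - relative_entropy_trace(p, q))
               for p, q in pairs)
self_value = relative_entropy_adjusted(pairs[0][0], pairs[0][0])
print(min(values), mismatch, self_value)
```

Note the argument order: as in the paper's convention, the second argument ϕ is the state in which the expectation is taken.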
Remark. We note that there are different notations for the relative entropy and that we adopt that of Araki [8] and Kosaki [25]. In comparison with our notation, the order of two states is reversed in that of Umegaki [45], while both the order of states and the sign are reversed in that of Bratteli and Robinson [17].

3.2. Monotone property

Under any conditional expectation E and under restriction to any subalgebra, the relative entropy is known to be non-increasing:
\[ S(\psi \circ E, \varphi \circ E) \le S(\psi, \varphi) , \tag{3.5} \]
\[ S(\psi_N, \varphi_N) \le S(\psi, \varphi) . \tag{3.6} \]
(For example, (3.6) is Theorem 4.1(iv) of [25]. (3.5) follows from Theorem 4.1(v) of [25], because E is a Schwarz map [44].)

When we want to exhibit the dependence of entropy on M more explicitly, we use the notation $S_M$ and $\hat S_M$ instead of S and $\hat S$. The relation between the entropy and the relative entropy for a state ϕ is given by
\[ \hat S(\varphi) = -S(\tau, \varphi) = S(\varphi) - S(\tau) . \]
Note that S(τ) = log n for a type I_n factor M.

We identify M with N ⊗ (M ∩ N′) and use the notation $\varphi_N \otimes \tau_{M \cap N'}$. We also identify A ∈ N ⊂ M with A ⊗ 1 ∈ N ⊗ (M ∩ N′).

Lemma 3.6. Let M ⊃ N be finite type I factors, and ϕ be a state on M. Then
\[ \hat S_N(\varphi_N) - \hat S_M(\varphi) = S_M(\varphi_N \otimes \tau_{M \cap N'}, \varphi) = S_M(\varphi \circ E^M_N, \varphi) . \tag{3.7} \]
Proof. If ϕ is a faithful state, we show the above identity by a straight-forward calculation. If ϕ is not faithful, we add ε · τ to (1 − ε)ϕ and then take the limit ε → 0. Remark. Sˆ in the above Lemma cannot be replaced by S. 3.3. Strong subadditivity If the system under consideration enjoys the commuting square property with respect to a tracial state, the strong subadditivity property for the adjusted entropy Sˆ holds (see Theorem 12 in [35]). Theorem 3.7. Let M, N1 , N2 and P be finite type I factors satisfying one of the equivalent conditions of Proposition 2.4. Let ψ be a state on M. Then ˆ P) ≤ 0 . ˆ N2 ) + S(ψ ˆ ˆ N1 ) − S(ψ S(ψ) − S(ψ
Proof. By (3.7) and (3.5),
\[ \hat S_{N_2}(\psi_{N_2}) - \hat S_M(\psi) = S_M(\psi \circ E^M_{N_2}, \psi) \ge S_M(\psi \circ E^M_{N_2} \circ E^M_{N_1}, \psi \circ E^M_{N_1}). \tag{3.8} \]
By the assumption, $E^M_{N_2} E^M_{N_1} = E^M_{N_1} E^M_{N_2} = E^M_P$. Hence,
\begin{align*}
S_M(\psi \circ E^M_{N_2} \circ E^M_{N_1}, \psi \circ E^M_{N_1}) &= S_M(\psi_P \otimes \tau_{M \cap P'}, \psi_{N_1} \otimes \tau_{M \cap N_1'}) \\
&= S_{N_1}(\psi_P \otimes \tau_{N_1 \cap P'}, \psi_{N_1}) = \hat S_P(\psi_P) - \hat S_{N_1}(\psi_{N_1}),
\end{align*}
where the second equality is due to $\tau_{M \cap P'} = \tau_{N_1 \cap P'} \otimes \tau_{M \cap N_1'}$ and the last equality due to (3.7). Therefore we obtain (3.8).

4. Fermion Lattice Systems

4.1. Fermion algebra

We introduce Fermion lattice systems where there exists one spinless Fermion at each lattice site and they interact with each other. The restriction to spinless particles (i.e. one degree of freedom for each site) is just a matter of simplification of notation. All results and their proofs in the present work go over to the case of an arbitrary (constant) finite number of degrees of freedom at each lattice site without any essential alteration. The lattice we consider is the $\nu$-dimensional lattice $\mathbb{Z}^\nu$ ($\nu \in \mathbb{N}$, an arbitrary positive integer).

Definition 4.1. The Fermion C*-algebra $\mathcal{A}$ is a unital C*-algebra satisfying the following conditions and generated by the elements in (1-1):

(1-1) For each lattice site $i \in \mathbb{Z}^\nu$, there are elements $a_i$ and $a_i^*$ of $\mathcal{A}$ called annihilation and creation operators, respectively, where $a_i^*$ is the adjoint of $a_i$.

(1-2) The CAR (canonical anticommutation relations) are satisfied for any $i, j \in \mathbb{Z}^\nu$:
\[ \{a_i^*, a_j\} = \delta_{i,j} 1, \qquad \{a_i^*, a_j^*\} = \{a_i, a_j\} = 0. \tag{4.1} \]
Here $\{A, B\} = AB + BA$ (anticommutator), $\delta_{i,j} = 1$ for $i = j$, and $\delta_{i,j} = 0$ for $i \ne j$.

(1-3) Let $\mathcal{A}_\circ$ be the $*$-algebra generated by all $a_i$ and $a_i^*$ ($i \in \mathbb{Z}^\nu$), namely the (algebraic) linear span of their monomials $A_1 \cdots A_n$ where $A_k$ is $a_{i_k}$ or $a_{i_k}^*$, $i_k \in \mathbb{Z}^\nu$.

(2) For each subset $I$ of $\mathbb{Z}^\nu$, the C*-subalgebra of $\mathcal{A}$ generated by $a_i$, $a_i^*$, $i \in I$, is denoted by $\mathcal{A}(I)$. If the cardinality $|I|$ of the set $I$ is finite, then $\mathcal{A}(I)$ is referred to as a local algebra or more specifically the local algebra for $I$. For the empty set $\emptyset$, we define $\mathcal{A}(\emptyset) = \mathbb{C}1$.

Remark 1. $\mathcal{A}_\circ$ is dense in $\mathcal{A}$.

Remark 2. For finite $I$, $\mathcal{A}(I)$ is known to be isomorphic to the tensor product of $|I|$ copies of the full $2 \times 2$ matrix algebra $M_2(\mathbb{C})$ and hence isomorphic to $M_{2^{|I|}}(\mathbb{C})$.
Then
\[ \mathcal{A}_\circ = \bigcup_{|I| < \infty} \mathcal{A}(I) \]
has the unique C*-norm. $\mathcal{A}$ together with its individual elements $\{a_i, a_i^* \mid i \in \mathbb{Z}^\nu\}$ is uniquely defined up to isomorphism and is isomorphic to the UHF-algebra $\overline{\bigotimes}_{i \in \mathbb{Z}^\nu} M_2(\mathbb{C})$, where the bar denotes the norm completion. $\mathcal{A}$ has the unique tracial state $\tau$ as the extension of the unique tracial state of $\mathcal{A}(I)$, $|I| < \infty$.

Remark 3. Since the $a_i^*$'s and $a_i$'s anticommute among different indices, $a_i^*$ and $a_i$ with a specific $i$ can be brought together at any spot in a monomial, with a possible sign change (without changing the ordering among themselves), and this can be done for each $i$. Therefore, the monomials of the form
\[ A_{i_1} \cdots A_{i_k} \tag{4.2} \]
together with 1 have a dense linear span in $\mathcal{A}(I)$, where the indices $i_1, \ldots, i_k \in I$ are distinct and $A_{i_\alpha}$ is one of $a_{i_\alpha}^*$, $a_{i_\alpha}$, $a_{i_\alpha}^* a_{i_\alpha}$, $a_{i_\alpha} a_{i_\alpha}^*$.

Definition 4.2. $\Theta^I$ denotes a unique automorphism of $\mathcal{A}$ satisfying
\begin{align*}
\Theta^I(a_i) &= -a_i, & \Theta^I(a_i^*) &= -a_i^* & (i \in I), \\
\Theta^I(a_i) &= a_i, & \Theta^I(a_i^*) &= a_i^* & (i \notin I). \tag{4.3}
\end{align*}
In particular, we denote $\Theta = \Theta^{\mathbb{Z}^\nu}$. The even and odd parts of $\mathcal{A}$ are defined as
\[ \mathcal{A}_+ \equiv \{a \in \mathcal{A} \mid \Theta(a) = a\}, \qquad \mathcal{A}_- \equiv \{a \in \mathcal{A} \mid \Theta(a) = -a\}. \tag{4.4} \]

Remark 1. Such $\Theta$ exists and is unique because (4.3) preserves CAR. It obviously satisfies $\Theta^2 = \mathrm{id}$.

Remark 2. For any $a \in \mathcal{A}(I)$,
\[ a = a_+ + a_-, \tag{4.5} \]
\[ a_\pm \equiv \tfrac{1}{2}(a \pm \Theta(a)) \tag{4.6} \]
gives the (unique) splitting of $a$ into a sum of $a_+ \in \mathcal{A}(I)_+$ and $a_- \in \mathcal{A}(I)_-$, where the even and odd parts of $\mathcal{A}(I)$ are denoted by $\mathcal{A}(I)_+$ and $\mathcal{A}(I)_-$.

Remark 3. For any $a \in \mathcal{A}_-$, we have
\[ \tau(a) = \tau(\Theta(a)) = -\tau(a) = 0. \tag{4.7} \]
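The relations above can be checked numerically on a finite lattice. The sketch below assumes the standard Jordan-Wigner matrices as a concrete representation of $\mathcal{A}(I)$ for $I = \{0, \ldots, n-1\}$ (the text does not fix a representation), and every helper name is hypothetical. It checks the CAR (4.1), that $\Theta$ flips the sign of each $a_i$, and that $\tau$ vanishes on the odd elements $a_i$ as in (4.7). Matrices are plain lists of rows with integer entries, so the comparisons are exact.

```python
def kron(x, y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[xv * yv for xv in xr for yv in yr] for xr in x for yr in y]

def matmul(x, y):
    return [[sum(xr[k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
            for xr in x]

def add(x, y):
    return [[u + v for u, v in zip(xr, yr)] for xr, yr in zip(x, y)]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def identity(d):
    return [[int(r == c) for c in range(d)] for r in range(d)]

ONE = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # single-site parity matrix
LOWER = [[0, 1], [0, 0]]     # single-site annihilation matrix

def annihilator(j, n):
    """Jordan-Wigner matrix of a_j on sites 0..n-1 (real, so adjoint = transpose)."""
    out = [[1]]
    for site in range(n):
        out = kron(out, Z if site < j else LOWER if site == j else ONE)
    return out

def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))

def theta(x, n):
    """Parity automorphism Theta, implemented by conjugation with Z x ... x Z."""
    parity = [[1]]
    for _ in range(n):
        parity = kron(parity, Z)
    return matmul(parity, matmul(x, parity))

def tau(x):
    """Normalized trace, i.e. the tracial state of a full matrix algebra."""
    return sum(x[r][r] for r in range(len(x))) / len(x)

def check_car(n):
    """Return True if (4.1), Theta(a_i) = -a_i and tau(a_i) = 0 hold on n sites."""
    a = [annihilator(j, n) for j in range(n)]
    one, zero = identity(2 ** n), scale(0, identity(2 ** n))
    for i in range(n):
        for j in range(n):
            if anticommutator(transpose(a[i]), a[j]) != scale(int(i == j), one):
                return False
            if anticommutator(a[i], a[j]) != zero:
                return False
        if theta(a[i], n) != scale(-1, a[i]) or tau(a[i]) != 0:
            return False
    return True
```

Calling `check_car(3)` runs all checks on three sites; the choice of three sites is arbitrary and only keeps the matrices small.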
Lemma 4.3. Let $I$ and $J$ be mutually disjoint and $a_\sigma \in \mathcal{A}(I)_\sigma$, $b_{\sigma'} \in \mathcal{A}(J)_{\sigma'}$, where each of $\sigma$ and $\sigma'$ is $+$ or $-$. Then
\[ a_\sigma b_{\sigma'} = \epsilon(\sigma, \sigma')\, b_{\sigma'} a_\sigma, \tag{4.8} \]
where
\[ \epsilon(\sigma, \sigma') = \begin{cases} -1 & \text{if } \sigma = \sigma' = -, \\ +1 & \text{otherwise.} \end{cases} \]
Proof. Since $\mathcal{A}(I)$ is generated by $a_i$ and $a_i^*$, $i \in I$, polynomials $p$ of $a_i$ and $a_i^*$, $i \in I$, are dense in $\mathcal{A}(I)$. For any $\varepsilon > 0$ and a given $a_\sigma$, $\sigma = +$ or $-$, there exists a polynomial $p$, i.e. a linear combination of monomials of $a_i$ and $a_i^*$, $i \in I$, satisfying $\|a_\sigma - p\| < \varepsilon$. Since $E_\sigma \equiv \tfrac{1}{2}(\mathrm{id} + \sigma\Theta)$ satisfies $E_\sigma a_\sigma = a_\sigma$ and $\|E_\sigma\| = 1$, we have $\|E_\sigma(a_\sigma - p)\| = \|a_\sigma - p_\sigma\| < \varepsilon$ where $p_\sigma = E_\sigma p$. Since $E_\sigma$ selects even or odd monomials (annihilating the others) according as $\sigma$ is $+$ or $-$, $p_\sigma$ is a linear combination of even or odd monomials of $a_i$ and $a_i^*$, $i \in I$. Similarly there exists a linear combination $q_{\sigma'}$ of even or odd monomials of $a_j$ and $a_j^*$, $j \in J$, satisfying $\|b_{\sigma'} - q_{\sigma'}\| < \varepsilon$. Since the graded commutation relation (4.8) holds for $p_\sigma$ and $q_{\sigma'}$, it holds for $a_\sigma$ and $b_{\sigma'}$.

Definition 4.4. (1) For each $k \in \mathbb{Z}^\nu$, $\tau_k$ denotes a unique automorphism of $\mathcal{A}$ satisfying
\[ \tau_k(a_i^*) = a_{i+k}^*, \qquad \tau_k(a_i) = a_{i+k} \qquad (i \in \mathbb{Z}^\nu). \tag{4.9} \]
(2) For a state $\varphi$ of $\mathcal{A}$, the adjoint action of $\tau_k$ is defined by
\[ (\tau_k^* \varphi)(A) = \varphi(\tau_k(A)) \qquad (A \in \mathcal{A}). \tag{4.10} \]
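Remark. The automorphism $\tau_k$ represents the lattice translation by the amount $k \in \mathbb{Z}^\nu$. The map $k \in \mathbb{Z}^\nu \mapsto \tau_k$ is a group of automorphisms:
\[ \tau_k \tau_l = \tau_{k+l} \qquad (k, l \in \mathbb{Z}^\nu). \]
The subalgebras transform covariantly under this group:
\[ \tau_k(\mathcal{A}(I)) = \mathcal{A}(I + k), \tag{4.11} \]
where $I + k = \{i + k;\ i \in I\}$ for any subset $I$ of $\mathbb{Z}^\nu$ and any $k \in \mathbb{Z}^\nu$.

Definition 4.5. The sets of all states and all positive linear functionals of $\mathcal{A}$ are denoted by $\mathcal{A}^*_{+,1}$ and $\mathcal{A}^*_+$; the sets of all $\Theta$ invariant and all $\tau$ invariant ones by $\mathcal{A}^{*\Theta}_{+,1}$, $\mathcal{A}^{*\Theta}_+$ and $\mathcal{A}^{*\tau}_{+,1}$, $\mathcal{A}^{*\tau}_+$, respectively. For any subset $I$ of $\mathbb{Z}^\nu$, the set of all states of $\mathcal{A}(I)$ is denoted by $\mathcal{A}(I)^*_{+,1}$; the set of all $\Theta$ invariant ones by $\mathcal{A}(I)^{*\Theta}_{+,1}$.

Remark 1. Any translation invariant state is automatically even (see, e.g. Example 5.2.21 of [17]):
\[ \mathcal{A}^{*\tau}_{+,1} \subset \mathcal{A}^{*\Theta}_{+,1}. \tag{4.12} \]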
Remark 2. For each subset $I$ of $\mathbb{Z}^\nu$, we can consider the set of all states $\{\mathcal{A}(I)_+\}^*_{+,1}$ on the even subalgebra $\mathcal{A}(I)_+$. There exists an obvious one-to-one correspondence between $\mathcal{A}(I)^{*\Theta}_{+,1}$ and $\{\mathcal{A}(I)_+\}^*_{+,1}$ due to (4.7) by the restriction and the unique $\Theta$ invariant extension.
4.2. Product property of the tracial state

The following proposition provides a basis for the present section.

Proposition 4.6. If $J_1$ and $J_2$ are disjoint, then
\[ \tau(ab) = \tau(a)\tau(b) \tag{4.13} \]
for arbitrary $a \in \mathcal{A}(J_1)$ and $b \in \mathcal{A}(J_2)$.

Proof. It is enough to prove the formula when $a$ and $b$ are monomials of the form (4.2). Let $a = A_i a'$, where $i \in J_1$, $a' \in \mathcal{A}(J_1 \setminus \{i\})$ is a monomial of the form (4.2) and $A_i$ is one of $a_i^*$, $a_i$, $a_i^* a_i$, $a_i a_i^*$. We will now show
\[ \tau(ab) = \tau(A_i)\tau(a'b). \tag{4.14} \]
If $a'b$ is a $\Theta$-odd monomial, then $\tau(a'b) = 0$ by (4.7). If $A_i$ is $\Theta$-even, then $ab$ is odd and $\tau(ab) = 0$, implying (4.14). If $A_i$ is odd, then $A_i(a'b) = -(a'b)A_i$. Hence $\tau(ab) = \tau(A_i(a'b)) = -\tau((a'b)A_i) = -\tau(A_i(a'b)) = 0$, where the third equality is due to the tracial property of $\tau$. So (4.14) holds in either case.

If $a'b$ is even and $A_i$ is odd, then $\tau(A_i) = 0$ because $A_i$ is odd and $\tau(ab) = 0$ because $ab = A_i(a'b)$ is odd. Again (4.14) holds.

Finally, if $a'b$ is even and $A_i = a_i^* a_i$, then $a_i^*$ commutes with $a'b$ due to CAR and hence
\begin{align*}
\tau(ab) &= \tau((a_i^* a_i)(a'b)) = \tau(a_i (a'b) a_i^*) \\
&= \tau(a_i a_i^* (a'b)) \qquad (\text{due to } [a_i^*, a'b] = 0) \\
&= \tfrac{1}{2}\tau((a_i^* a_i + a_i a_i^*)(a'b)) = \tfrac{1}{2}\tau(a'b).
\end{align*}
The same formula for $a'b = 1$ yields $\tau(A_i) = \tfrac{1}{2}$ and hence
\[ \tau(ab) = \tau(A_i)\tau(a'b). \]
If $a'b$ is even and $A_i = a_i a_i^*$, the above formula holds in the same way. We have now proved (4.14) for all cases.

Let $a$ be now given by (4.2). By using (4.14) for $i_1, i_2, \ldots, i_k$ successively, we obtain
\[ \tau(ab) = \tau(A_{i_1}) \cdots \tau(A_{i_k})\tau(b). \]
The same equality for $b = 1$ yields
\[ \tau(a) = \tau(A_{i_1}) \cdots \tau(A_{i_k}). \]
Hence we have
\[ \tau(ab) = \tau(a)\tau(b). \]
This completes the proof.

We may say that the tracial state $\tau$ is a 'product' state although $\mathcal{A}(J_1)$ and $\mathcal{A}(J_2)$ do not commute. We will show in the next subsections that this product property of the tracial state implies the commuting square property for the conditional expectations.

4.3. Conditional expectations for Fermion algebras

We prove the C*-algebraic version of Proposition 2.1 for the Fermion algebra $\mathcal{A}$ and its subalgebras. We note that $\mathcal{A}(I)$ is not a von Neumann algebra unless $I$ is a finite subset of $\mathbb{Z}^\nu$. Hence Proposition 2.1 is not directly applicable to the Fermion algebra.

Theorem 4.7. For any subset $I$ of $\mathbb{Z}^\nu$, there exists a conditional expectation
\[ E_I : a \in \mathcal{A} \mapsto E_I(a) \in \mathcal{A}(I) \tag{4.15} \]
uniquely determined by $E_I(a) \in \mathcal{A}(I)$ and
\[ \tau(ab) = \tau(E_I(a)b) \qquad (b \in \mathcal{A}(I)). \tag{4.16} \]
For any second subset $J$ of $\mathbb{Z}^\nu$,
\[ E_I(a) \in \mathcal{A}(I \cap J) \tag{4.17} \]
for any $a \in \mathcal{A}(J)$, and
\[ E_I E_J = E_J E_I = E_{I \cap J}. \tag{4.18} \]
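Proof. The C*-subalgebra of $\mathcal{A}$ generated by $\mathcal{A}(I)$ and $\mathcal{A}(I^c)_+$ is isomorphic to their tensor product and will be denoted as $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$. Let
\[ E_I^{(1)} \equiv \tfrac{1}{2}\bigl(\mathrm{id} + \Theta^{I^c}\bigr). \]
It maps $\mathcal{A}$ onto $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$. Since
\[ \tau(\Theta^{I^c}(a)b) = \tau(\Theta^{I^c}(ab)) = \tau(ab) \tag{4.19} \]
for all $a \in \mathcal{A}$ and $b \in \mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$, $E_I^{(1)}$ satisfies (4.16).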
Since $\tau$ is a product state for the tensor product $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$, there exists a conditional expectation $E_I^{(2)}$ from $\mathcal{A}(I) \otimes \mathcal{A}(I^c)_+$ onto $\mathcal{A}(I)$ satisfying (4.16), characterized by $E_I^{(2)}(cd) = \tau(d)c$ for $c \in \mathcal{A}(I)$ and $d \in \mathcal{A}(I^c)_+$ and called a slice map. Therefore
\[ E_I = E_I^{(2)} E_I^{(1)} \tag{4.20} \]
is a map from $\mathcal{A}$ onto $\mathcal{A}(I)$ satisfying (4.16). By Lemma 2.2, it is a unique conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(I)$ satisfying (4.16).

To show (4.17), note that $\mathcal{A}(J)$ is generated by $\mathcal{A}(J \cap I)$ and $\mathcal{A}(J \cap I^c)$, namely, the linear span of products $ab$ with $a \in \mathcal{A}(J \cap I)$ and $b \in \mathcal{A}(J \cap I^c)$ is dense in $\mathcal{A}(J)$. Due to the linearity of $E_I$ and $\|E_I\| = 1$, it is enough to show (4.17) for such products. We have $E_I^{(1)}(b) \in \mathcal{A}(I^c)_+$ and hence
\[ E_I(ab) = E_I^{(2)}\bigl(a E_I^{(1)}(b)\bigr) = a\,\tau\bigl(E_I^{(1)}(b)\bigr) = a\,\tau(b) \in \mathcal{A}(J \cap I), \]
which proves (4.17).

For any $a \in \mathcal{A}$, $E_J(a) \in \mathcal{A}(J)$ and hence $E_I(E_J(a)) \in \mathcal{A}(I \cap J)$. For $b \in \mathcal{A}(I \cap J)$, (4.16) implies
\[ \tau(E_I(E_J(a))b) = \tau(E_J(a)b) = \tau(ab), \]
where the first equality is due to $b \in \mathcal{A}(I)$, while the second equality is due to $b \in \mathcal{A}(J)$. This equality and $E_I(E_J(a)) \in \mathcal{A}(I \cap J)$ imply
\[ E_{I \cap J}(a) = E_I(E_J(a)) \]
by the uniqueness result. By interchanging $I$ and $J$, we obtain
\[ E_I E_J = E_J E_I = E_{I \cap J}, \]
which proves the last statement (4.18).

Remark 1. For spin lattice systems, the conditional expectation $E_I$ can be obtained simply as a slice map with respect to the tracial state $\tau$. When spins and Fermions coexist at each lattice site, $E_I$ can be obtained in exactly the same way as in Theorem 4.7 (by including spin operators in the even part $\mathcal{A}(I)_+$), provided that the degree of freedom at each lattice site is finite (i.e. $\mathcal{A}(I)$ is a finite factor of type I for any finite $I$). In all these cases, the results of our paper are valid as they are proved by the use of the conditional expectations $E_I$.

Remark 2. Theorem 4.7 can be shown by a more elementary (lengthy) method by giving $E_I$ explicitly for a finite $I$ and then giving $E_J$ for an infinite $J$ as a limit of $E_{I_n}$ for an increasing sequence of finite subsets $I_n$ of $\mathbb{Z}^\nu$ tending to $J$. The proof presented above was suggested by a referee.

Corollary 4.8. For each subset $I$ of $\mathbb{Z}^\nu$,
\[ E_I \Theta = \Theta E_I. \tag{4.21} \]
Proof. For any $a \in \mathcal{A}$ and $b \in \mathcal{A}(I)$,
\begin{align*}
\tau(E_I(\Theta(a))b) &= \tau(\Theta(a)b) = \tau(\Theta\{\Theta(a)b\}) = \tau(a\Theta(b)) \\
&= \tau(E_I(a)\Theta(b)) = \tau(\Theta\{E_I(a)\Theta(b)\}) = \tau(\Theta(E_I(a))b).
\end{align*}
Since $\mathcal{A}(I)$ is invariant under $\Theta$ as a set, we have $\Theta(E_I(a)) = E_I(\Theta(a))$ due to the uniqueness of $E_I$ in the preceding theorem.

We now show a continuous dependence of $E_I$ on the subsets $I$ of $\mathbb{Z}^\nu$. We use the following notation for various limits of subsets of $\mathbb{Z}^\nu$. If $\{I_\alpha\}$ is a monotone (not necessarily strictly) increasing or decreasing net of subsets converging to a subset $I$ of $\mathbb{Z}^\nu$, we write $I_\alpha \nearrow I$ or $I_\alpha \searrow I$. For these cases, $I = \cup_\alpha I_\alpha$ or $I = \cap_\alpha I_\alpha$, respectively. We use $I_\alpha \to I$ for the standard convergence of a net $I_\alpha$ to $I$ (i.e. $\limsup_\alpha I_\alpha = \liminf_\alpha I_\alpha = I$). By $J \nearrow \mathbb{Z}^\nu$ (which is written without any index), we mean a net of all finite subsets tending to $\mathbb{Z}^\nu$ with the set inclusion as its partial ordering. (In the same way, we use $J \nearrow I$.) In this case, $J$ itself serves as the net index and it is a monotone increasing net. Later in Secs. 9 and 10, we use a more restrictive notion of a van Hove net $\{I_\alpha\}$ tending to $\mathbb{Z}^\nu$ or to '$\infty$' (see Appendix for a detailed explanation).

Lemma 4.9. Let $\{I_\alpha\}$ be an increasing net of (finite or infinite) subsets of $I$ such that their union is $I$. For any $a \in \mathcal{A}$,
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a). \tag{4.22} \]
As a special case $I = \mathbb{Z}^\nu$,
\[ \lim_{I_\alpha \nearrow \mathbb{Z}^\nu} E_{I_\alpha}(a) = a. \tag{4.23} \]

Proof. Since polynomials of $a_i$ and $a_i^*$, $i \in I$, are dense in $\mathcal{A}(I)$, there exist a finite subset $J_n$ of $I$ and $a_n \in \mathcal{A}(J_n)$ such that
\[ \|E_I(a) - a_n\| < \frac{1}{n}. \]
Because $J_n$ is a finite subset of $I$ and $\cup_\alpha I_\alpha = I$, there exists a finite number of $I_\alpha$, say, $I_{\alpha(1)}, \ldots, I_{\alpha(k)}$, such that $\cup_{l=1}^k I_{\alpha(l)} \supset J_n$. Since $I_\alpha$ is a net, there exists an index $\alpha_n > \alpha(1), \ldots, \alpha(k)$. Since $I_\alpha$ is increasing, $I_{\alpha_n} \supset I_{\alpha(1)} \cup \cdots \cup I_{\alpha(k)} \supset J_n$. For any $\alpha \ge \alpha_n$, $I_\alpha \supset J_n$ and so $E_{I_\alpha}(a_n) = a_n$. Hence by $I \supset I_\alpha$, we have
\[ \|E_{I_\alpha}(a) - a_n\| = \|E_{I_\alpha}(E_I(a) - a_n)\| \le \|E_I(a) - a_n\| < \frac{1}{n} \]
due to $\|E_{I_\alpha}\| \le 1$. Thus
\[ \|E_{I_\alpha}(a) - E_I(a)\| \le \|E_{I_\alpha}(a) - a_n\| + \|E_I(a) - a_n\| < \frac{2}{n} \]
for all $\alpha \ge \alpha_n$, which proves the assertion (4.22).
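Lemma 4.10. Let $\{I_\alpha\}$ be a decreasing net of (finite or infinite) subsets of $\mathbb{Z}^\nu$ such that their intersection is $I$. For any $a \in \mathcal{A}$,
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a). \tag{4.24} \]

Proof. Let $L_k$ be a monotone increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. For any $\varepsilon > 0$, there exists $k_\varepsilon$ such that $\|a - E_{L_k}(a)\| < \varepsilon$ for all $k \ge k_\varepsilon$ by Lemma 4.9. Hence
\[ \|E_I(a) - E_{I \cap L_k}(a)\| = \|E_I(a - E_{L_k}(a))\| < \varepsilon, \tag{4.25} \]
\[ \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_k}(a)\| = \|E_{I_\alpha}(a - E_{L_k}(a))\| < \varepsilon \tag{4.26} \]
for all $k \ge k_\varepsilon$ and all $\alpha$ due to $\|E_I\| \le 1$ and $\|E_{I_\alpha}\| \le 1$. Since $I_\alpha \searrow I$, we have $(I_\alpha \cap L_k) \searrow (I \cap L_k)$. Since $L_{k_\varepsilon}$ is a finite set, there exists $\alpha_\varepsilon$ such that $I_\alpha \cap L_{k_\varepsilon} = I \cap L_{k_\varepsilon}$ and hence $E_{I_\alpha \cap L_{k_\varepsilon}} = E_{I \cap L_{k_\varepsilon}}$ for all $\alpha \ge \alpha_\varepsilon$. Therefore, we obtain
\begin{align*}
\|E_{I_\alpha}(a) - E_I(a)\| &\le \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_{k_\varepsilon}}(a)\| + \|E_{I_\alpha \cap L_{k_\varepsilon}}(a) - E_I(a)\| \\
&= \|E_{I_\alpha}(a) - E_{I_\alpha \cap L_{k_\varepsilon}}(a)\| + \|E_{I \cap L_{k_\varepsilon}}(a) - E_I(a)\| < 2\varepsilon
\end{align*}
for all $\alpha \ge \alpha_\varepsilon$, where the first term is estimated by (4.26), and the second by (4.25). Hence we obtain
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a). \]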
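Theorem 4.11. If a net $\{I_\alpha\}$ converges to $I$, then
\[ \lim_\alpha E_{I_\alpha}(a) = E_I(a) \tag{4.27} \]
for all $a \in \mathcal{A}$.

Proof. By definition, $I_\alpha \to I$ means
\[ I = \cap_\beta (\cup_{\alpha \ge \beta} I_\alpha) = \cup_\beta (\cap_{\alpha \ge \beta} I_\alpha). \]
Set
\[ J^\beta \equiv \cup_{\alpha \ge \beta} I_\alpha, \qquad J_\beta \equiv \cap_{\alpha \ge \beta} I_\alpha. \]
Then $J^\beta \searrow I$ and $J_\beta \nearrow I$. By Lemmas 4.9 and 4.10, there exists a $\beta_\varepsilon$ for any given $\varepsilon > 0$ such that for all $\beta \ge \beta_\varepsilon$,
\[ \|E_{J^\beta}(a) - E_I(a)\| < \varepsilon, \qquad \|E_{J_\beta}(a) - E_I(a)\| < \varepsilon. \]
Hence
\[ \|E_{J^\beta}(a) - E_{J_\beta}(a)\| < 2\varepsilon. \]
Since $J^\beta \supset I_\beta \supset J_\beta$, we have $E_{I_\beta} E_{J^\beta} = E_{I_\beta}$, $E_{I_\beta} E_{J_\beta} = E_{J_\beta}$ and
\[ \|E_{I_\beta}(a) - E_{J_\beta}(a)\| = \|E_{I_\beta}(E_{J^\beta}(a) - E_{J_\beta}(a))\| < 2\varepsilon. \]
Therefore,
\[ \|E_{I_\beta}(a) - E_I(a)\| < 3\varepsilon \]
for all $\beta \ge \beta_\varepsilon$. This proves (4.27).

The following corollary follows immediately from the results obtained in this subsection.

Corollary 4.12. For any countable family $\{I_n\}$ of subsets of $\mathbb{Z}^\nu$,
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \mathcal{A}(\cap_{n=1}^\infty I_n). \tag{4.28} \]

Proof. Let $J_n \equiv \cap_{k=1}^n I_k$ and $I \equiv \cap_{n=1}^\infty I_n$. Then $J_n \searrow I$. By (4.18), $E_{J_{n-1}} E_{I_n} = E_{J_n}$ and hence $E_{J_n} = \prod_{k=1}^n E_{I_k}$. On one hand, $J_n \subset I_k$ for $k = 1, \ldots, n$, and hence $\mathcal{A}(J_n) \subset \cap_{k=1}^n \mathcal{A}(I_k)$. On the other hand, $a \in \cap_{k=1}^n \mathcal{A}(I_k)$ satisfies $E_{I_k}(a) = a$ for all $k = 1, \ldots, n$ and hence $E_{J_n}(a) = a \in \mathcal{A}(J_n)$. Therefore
\[ \mathcal{A}(J_n) = \cap_{k=1}^n \mathcal{A}(I_k). \]
Since $J_n \supset I$, we have $\mathcal{A}(J_n) \supset \mathcal{A}(I)$ and hence
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \cap_{n=1}^\infty \mathcal{A}(J_n) \supset \mathcal{A}(I). \]
For $a \in \cap_{n=1}^\infty \mathcal{A}(J_n)$, $E_{J_n}(a) = a$ for any $n$. Since $\lim_n E_{J_n}(a) = E_I(a)$ by Lemma 4.10, we have $a = E_I(a) \in \mathcal{A}(I)$. Now we obtain the desired conclusion
\[ \cap_{n=1}^\infty \mathcal{A}(I_n) = \mathcal{A}(I). \]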
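The conditional expectations of this subsection can be explored on a small finite lattice. The sketch below again assumes Jordan-Wigner matrices on sites $0, \ldots, n-1$ and hypothetical helper names. It does not follow the construction in the proof of Theorem 4.7; instead it computes $E_I$ as the orthogonal projection onto $\mathcal{A}(I)$ for the inner product $\tau(x^* y)$, using products of $1$, $a_j + a_j^*$, $a_j - a_j^*$ and $(a_j + a_j^*)(a_j - a_j^*)$ as a $\tau$-orthonormal basis. By the uniqueness statement around (4.16) this projection should coincide with $E_I$ on the finite lattice, and exact rational arithmetic makes (4.16), (4.18) and (4.21) testable without tolerances.

```python
from fractions import Fraction

def kron(x, y):
    return [[xv * yv for xv in xr for yv in yr] for xr in x for yr in y]

def matmul(x, y):
    return [[sum(xr[k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
            for xr in x]

def add(x, y):
    return [[u + v for u, v in zip(xr, yr)] for xr, yr in zip(x, y)]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def identity(d):
    return [[int(r == c) for c in range(d)] for r in range(d)]

ONE, Z, LOWER = [[1, 0], [0, 1]], [[1, 0], [0, -1]], [[0, 1], [0, 0]]

def annihilator(j, n):
    """Jordan-Wigner matrix of a_j on sites 0..n-1 (real entries)."""
    out = [[1]]
    for site in range(n):
        out = kron(out, Z if site < j else LOWER if site == j else ONE)
    return out

def tau(x):
    """Normalized trace as an exact fraction."""
    return Fraction(sum(x[r][r] for r in range(len(x))), len(x))

def theta(x, n):
    """Parity automorphism via conjugation with Z x ... x Z."""
    parity = [[1]]
    for _ in range(n):
        parity = kron(parity, Z)
    return matmul(parity, matmul(x, parity))

def monomial_basis(sites, n):
    """tau-orthonormal basis of A(sites), built site by site."""
    basis = [identity(2 ** n)]
    for j in sorted(sites):
        a = annihilator(j, n)
        plus = add(a, transpose(a))                  # a_j + a_j^*
        minus = add(a, scale(-1, transpose(a)))      # a_j - a_j^*
        local = [identity(2 ** n), plus, minus, matmul(plus, minus)]
        basis = [matmul(m, s) for m in basis for s in local]
    return basis

def cond_exp(sites, n, x):
    """E_I(x) as the projection sum_m tau(m^T x) m over the basis of A(I)."""
    out = scale(0, identity(2 ** n))
    for m in monomial_basis(sites, n):
        out = add(out, scale(tau(matmul(transpose(m), x)), m))
    return out
```

For example, with `n = 3`, `I = {0, 1}` and `J = {1, 2}`, comparing `cond_exp(I, n, cond_exp(J, n, x))` with `cond_exp({1}, n, x)` for an arbitrary 8 by 8 integer matrix `x` is a direct test of (4.18).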
4.4. Commuting squares for Fermion algebras

In the following theorem, we show that any two subsets $I$ and $J$ of $\mathbb{Z}^\nu$ are associated with a commuting square of the conditional expectations with respect to the tracial state $\tau$. For $K \subset L \subset \mathbb{Z}^\nu$, denote the restriction of $E_K$ to $\mathcal{A}(L)$ by $E^L_K$. Then it is a conditional expectation from $\mathcal{A}(L)$ to $\mathcal{A}(K)$ with respect to the tracial state.

Theorem 4.13. For any subsets $I$ and $J$ of $\mathbb{Z}^\nu$, the following subalgebras of $\mathcal{A}$ form a commuting square:
\[
\begin{array}{ccccc}
 & & \mathcal{A}(I) & & \\
 & \nearrow & & \searrow & \\
\mathcal{A}(I \cup J) & & & & \mathcal{A}(I \cap J) \\
 & \searrow & & \nearrow & \\
 & & \mathcal{A}(J) & &
\end{array}
\]
Here the arrow from $\mathcal{A}(L)$ to $\mathcal{A}(K)$ represents the conditional expectation $E^L_K$.
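Proof. It follows from (4.18) that
\[ E^I_{I \cap J} E^{I \cup J}_I = E^{I \cup J}_{I \cap J} = E^J_{I \cap J} E^{I \cup J}_J, \]
which shows the assertion.

4.5. Commutants of subalgebras

We are going to determine the commutants of subalgebras of $\mathcal{A}$.

Lemma 4.14. For a finite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c) + v_I \mathcal{A}(I^c), \tag{4.29} \]
where $v_I$ is a self-adjoint unitary in $\mathcal{A}(I)_+$ given by
\[ v_I \equiv \prod_{i \in I} v_i, \qquad v_i \equiv a_i^* a_i - a_i a_i^* \tag{4.30} \]
and implementing $\Theta^I$ on $\mathcal{A}$.

Proof. By CAR,
\[ a_i^* v_i = -a_i^*, \qquad a_i v_i = a_i, \qquad v_i a_i^* = a_i^*, \qquad v_i a_i = -a_i. \]
Thus $v_i$ anticommutes with $a_i$ and $a_i^*$. If $j \ne i$, $v_i$ commutes with $a_j$ and $a_j^*$ due to $v_i \in \mathcal{A}(\{i\})_+$. Therefore for any $a \in \mathcal{A}(I)$, we have
\[ (\mathrm{Ad}\, v_I)a \equiv v_I a v_I^* = \Theta(a), \tag{4.31} \]
or equivalently,
\[ v_I a = \Theta(a) v_I. \tag{4.32} \]
For any $a \in \mathcal{A}(I^c)$,
\[ v_I a = a v_I. \tag{4.33} \]
Due to $v_I^* = v_I$ and $v_I^2 = 1$, $v_I$ is a self-adjoint unitary implementing $\Theta^I$ on $\mathcal{A}$.

Since $v_I \in \mathcal{A}(I)_+$ implements $\Theta^I$, $(\mathcal{A}(I)_+)'$ is contained in the fixed point subalgebra $\mathcal{A}^{\Theta^I}$. In terms of $E^{(1)}_{I^c} = \tfrac{1}{2}(\mathrm{id} + \Theta^I)$, we have
\[ (\mathcal{A}(I)_+)' \subset \mathcal{A}^{\Theta^I} = E^{(1)}_{I^c}(\mathcal{A}) = \mathcal{A}(I)_+ \otimes \mathcal{A}(I^c). \]
Since $\mathcal{A}(I^c)$ is in $(\mathcal{A}(I)_+)'$, we have
\[ (\mathcal{A}(I)_+)' = Z(\mathcal{A}(I)_+) \otimes \mathcal{A}(I^c), \tag{4.34} \]
where $Z(\mathcal{A}(I)_+)$ is the center of $\mathcal{A}(I)_+$. Since $\mathcal{A}(I)_+ = \{v_I\}' \cap \mathcal{A}(I)$, $v_I$ is a self-adjoint unitary in $\mathcal{A}(I)$ and $\mathcal{A}(I)$ is a full matrix algebra for a finite $I$, we have
\[ Z(\mathcal{A}(I)_+) = \mathbb{C}1 + \mathbb{C}v_I. \tag{4.35} \]
By (4.34) and (4.35), we obtain (4.29).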
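The algebraic facts about $v_I$ used in this proof are easy to confirm in a concrete representation. The following sketch assumes Jordan-Wigner matrices on three sites and the region $I = \{0, 2\}$; both choices and all helper names are illustrative only. It checks that $v_I$ is a self-adjoint unitary, that conjugation by $v_I$ flips the sign of $a_j$ for $j \in I$, and that $v_I$ commutes with $a_j$ for $j \notin I$, in line with (4.31) and (4.33).

```python
def kron(x, y):
    return [[xv * yv for xv in xr for yv in yr] for xr in x for yr in y]

def matmul(x, y):
    return [[sum(xr[k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
            for xr in x]

def add(x, y):
    return [[u + v for u, v in zip(xr, yr)] for xr, yr in zip(x, y)]

def scale(c, x):
    return [[c * v for v in row] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def identity(d):
    return [[int(r == c) for c in range(d)] for r in range(d)]

ONE, Z, LOWER = [[1, 0], [0, 1]], [[1, 0], [0, -1]], [[0, 1], [0, 0]]

def annihilator(j, n):
    """Jordan-Wigner matrix of a_j on sites 0..n-1 (real entries)."""
    out = [[1]]
    for site in range(n):
        out = kron(out, Z if site < j else LOWER if site == j else ONE)
    return out

def v_site(j, n):
    """v_j = a_j^* a_j - a_j a_j^* as in (4.30)."""
    a = annihilator(j, n)
    a_star = transpose(a)
    return add(matmul(a_star, a), scale(-1, matmul(a, a_star)))

def v_region(sites, n):
    """v_I as the product of the commuting factors v_j, j in I."""
    out = identity(2 ** n)
    for j in sorted(sites):
        out = matmul(out, v_site(j, n))
    return out

def implements_theta_on_region(sites, n):
    """Return True if v_I behaves as stated in Lemma 4.14 on n sites."""
    v = v_region(sites, n)
    if transpose(v) != v or matmul(v, v) != identity(2 ** n):
        return False
    for j in range(n):
        a = annihilator(j, n)
        expected = scale(-1, a) if j in sites else a
        if matmul(v, matmul(a, v)) != expected:
            return False
    return True
```

Calling `implements_theta_on_region({0, 2}, 3)` exercises both cases, since site 1 lies outside the region.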
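Lemma 4.15. For a finite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_-. \tag{4.36} \]

Proof. By Lemma 4.14 and $\mathcal{A}(I)' \subset (\mathcal{A}(I)_+)'$, any element $a \in \mathcal{A}(I)' \cap \mathcal{A}$ is of the form
\[ a = a_1 + v_I a_2, \qquad a_1, a_2 \in \mathcal{A}(I^c). \]
Take any unitary $u \in \mathcal{A}(I)_-$ (e.g. $u = a_i + a_i^*$, $i \in I$). Then we have
\begin{align*}
a &= \tfrac{1}{2}(a + u a u^*) = \tfrac{1}{2}(a_1 + u a_1 u^*) + \tfrac{1}{2} v_I (a_2 - u a_2 u^*) \\
&= (a_1)_+ + v_I (a_2)_-
\end{align*}
due to $u v_I = -v_I u$, where
\[ (a_1)_+ = \tfrac{1}{2}(a_1 + \Theta(a_1)) \in \mathcal{A}(I^c)_+, \qquad (a_2)_- = \tfrac{1}{2}(a_2 - \Theta(a_2)) \in \mathcal{A}(I^c)_-. \]
Hence
\[ \mathcal{A}(I)' \cap \mathcal{A} \subset \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_-. \]
The inverse inclusion follows from (4.32) and Lemma 4.3. Hence (4.36) holds.

Lemma 4.16. For an infinite $I$,
\[ \mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+. \tag{4.37} \]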
Proof. It is clear that elements of $\mathcal{A}(I^c)_+$ and $\mathcal{A}(I)$ commute. Hence it is enough to prove
\[ \mathcal{A}(I)' \cap \mathcal{A} \subset \mathcal{A}(I^c)_+. \]
Let $a \in \mathcal{A}(I)' \cap \mathcal{A}$. Then
\[ a_\pm = \tfrac{1}{2}(a \pm \Theta(a)) \in \mathcal{A}(I)' \cap \mathcal{A} \]
because $\Theta(\mathcal{A}(I)) = \mathcal{A}(I)$. For any finite subset $K$ of $I$, $a_\pm \in (\mathcal{A}(K)')_\pm$. Hence by Lemma 4.15, $a_+ \in \mathcal{A}(K^c)_+$. Consider an increasing sequence of finite subsets $K_n \nearrow I$. We apply Corollary 4.12 to $(K_n)^c \searrow I^c$, and obtain
\[ a_+ \in \cap_{n=1}^\infty \mathcal{A}((K_n)^c)_+ = \mathcal{A}(I^c)_+. \tag{4.38} \]
We now prove $a_- = 0$, which yields the desired conclusion due to $a = a_+ + a_-$ and (4.38). For a monotone increasing sequence of finite subsets $L_n$ of $\mathbb{Z}^\nu$ such that $L_n \nearrow \mathbb{Z}^\nu$, we have $\lim_n E_{L_n}(a_-) = a_-$ and hence there exists $n_\varepsilon$ for any given $\varepsilon > 0$ such that
\[ \|E_{L_n}(a_-) - a_-\| < \varepsilon \tag{4.39} \]
for $n \ge n_\varepsilon$. For any $k$, we set $K_k \equiv I \cap L_k$ ($\subset I$). Then $a_- \in \mathcal{A}(K_k)'$ and by Lemma 4.15 we have
\[ a_- = v_{K_k} b_k \]
for some $b_k \in \mathcal{A}((K_k)^c)_-$. For any $i \in K_k$,
\[ E_{\{i\}^c}(a_-) = \tau(v_i)\, v_{K_k \setminus \{i\}}\, b_k = 0. \tag{4.40} \]
Now take an $n_0 \ge n_\varepsilon$. Since $K_k \nearrow I$ and $I$ is an infinite set while any $L_{n_0}$ is a finite set, there exists a number $k$ such that $K_k$ contains a point $i$ of $\mathbb{Z}^\nu$ such that $i \notin L_{n_0}$. Then $L_{n_0} \subset \{i\}^c$. It follows from (4.40) that
\[ E_{L_{n_0}}(a_-) = E_{L_{n_0}} E_{\{i\}^c}(a_-) = 0. \]
This and (4.39) imply
\[ \|a_-\| < \varepsilon. \]
Since $\varepsilon$ is arbitrary, we obtain $a_- = 0$.

Combining Lemma 4.15 and Lemma 4.16, we obtain

Theorem 4.17. (1) For a finite $I$, $\mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+ + v_I \mathcal{A}(I^c)_-$, where $v_I$ is given by (4.30).
(2) For an infinite $I$, $\mathcal{A}(I)' \cap \mathcal{A} = \mathcal{A}(I^c)_+$.

As a preparation for the remaining case (the commutant of $\mathcal{A}(I)_+$ for infinite $I$), we present the following technical lemma for the sake of completeness. We define
\[ u^{(i)}_{11} \equiv a_i^* a_i, \qquad u^{(i)}_{12} \equiv a_i^*, \qquad u^{(i)}_{21} \equiv a_i, \qquad u^{(i)}_{22} \equiv a_i a_i^*. \tag{4.41} \]
Lemma 4.18. Let $I = (i_1, \ldots, i_{|I|})$ be a finite subset of $\mathbb{Z}^\nu$. Put
\[ u'^{(i_j)}_{\alpha\alpha} \equiv u^{(i_j)}_{\alpha\alpha} \ \text{ for } \alpha = 1, 2, \qquad u'^{(i_j)}_{\alpha\beta} \equiv u^{(i_j)}_{\alpha\beta}\, v_{\{i_1, \ldots, i_{j-1}\}} \ \text{ for } \alpha \ne \beta. \tag{4.42} \]
Define
\[ u_{kl} \equiv \prod_{j=1}^{|I|} u'^{(i_j)}_{k_j l_j}, \tag{4.43} \]
where $k_n$ and $l_n$ are either 1 or 2, $k = (k_1, \ldots, k_{|I|})$ and $l = (l_1, \ldots, l_{|I|})$. Then the following holds.

(1) The set of all $u_{kl}$ forms a self-adjoint system of matrix units of $\mathcal{A}(I)$.

(2) Let $\sigma(k, l)$ be the number of $n$ such that $k_n \ne l_n$. Then
\[ \Theta(u_{kl}) = (-1)^{\sigma(k,l)} u_{kl}. \tag{4.44} \]

(3) Any $a \in \mathcal{A}$ has a unique expansion
\[ a = \sum_{k,l} u_{kl} a_{kl} \tag{4.45} \]
with $a_{kl} \in \mathcal{A}(I^c)$ and $a_{kl}$ is uniquely given by
\[ a_{kl} = 2^{|I|} E_{I^c}(u_{lk} a). \tag{4.46} \]
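Proof. (1) By using (4.1) for the case of $i = j$, $\{u^{(i)}_{\alpha\beta}\}_{\alpha\beta}$ ($\alpha, \beta = 1, 2$) satisfies the relations
\[ (u^{(i)}_{\alpha\beta})^* = u^{(i)}_{\beta\alpha}, \qquad u^{(i)}_{\alpha\beta} u^{(i)}_{\alpha'\beta'} = \delta_{\beta\alpha'} u^{(i)}_{\alpha\beta'}, \qquad \sum_\alpha u^{(i)}_{\alpha\alpha} = 1 \tag{4.47} \]
for a self-adjoint system of matrix units. Since $v_{\{i_1, \ldots, i_{j-1}\}}$ is a self-adjoint unitary commuting with $a_{i_j}$ and $a_{i_j}^*$, the same computation shows that $\{u'^{(i_j)}_{\alpha\beta}\}_{\alpha\beta}$ ($\alpha, \beta = 1, 2$) satisfies the same relations.

Since $v_{\{i_1, \ldots, i_{j-1}\}}$ anticommutes with $a_{i_k}$ and $a_{i_k}^*$ for $k < j$ and commutes with them for $k \ge j$, the $\{u'^{(i_j)}_{\alpha\beta}\}_{\alpha\beta}$ commute with each other for different $j$. Since they generate all $\mathcal{A}(\{i_k\})$ recursively for $k = 1, \ldots, n$, they form a self-adjoint system of matrix units of $\mathcal{A}(I)$.

(2) $\Theta(u^{(i)}_{\alpha\alpha}) = u^{(i)}_{\alpha\alpha}$, $\Theta(u^{(i)}_{\alpha\beta}) = -u^{(i)}_{\alpha\beta}$ for $\alpha \ne \beta$, and $\Theta(v_{\{i_1, \ldots, i_{j-1}\}}) = v_{\{i_1, \ldots, i_{j-1}\}}$ imply (4.44).

(3) For a full matrix algebra $\mathcal{A}(I)$ contained in a C*-algebra $\mathcal{A}$, the following expansion of any $a \in \mathcal{A}$ in terms of a self-adjoint system of matrix units $\{u_{kl}\}$ of $\mathcal{A}(I)$ is well-known:
\[ a = \sum_{k,l} u_{kl} b_{kl}, \qquad b_{kl} = \sum_m u_{mk}\, a\, u_{lm} \in \mathcal{A}(I)'. \tag{4.48} \]
By Lemma 4.15, there are $b_{kl1}$ and $b_{kl2}$ in $\mathcal{A}(I^c)$ satisfying
\[ b_{kl} = b_{kl1} + v_I b_{kl2}. \tag{4.49} \]
By direct computation, $u_{kl} v_I = \pm u_{kl}$ where the sign depends on $k$ and $l$. Thus we have the expansion (4.45) with $a_{kl} = b_{kl1} \pm b_{kl2} \in \mathcal{A}(I^c)$. The coefficient $a_{kl} \in \mathcal{A}(I^c)$ is uniquely determined by the following computation and given by (4.46):
\begin{align*}
E_{I^c}(u_{lk} a) &= E_{I^c}\Bigl(\sum_{l'} u_{ll'} a_{kl'}\Bigr) \\
&= \sum_{l'} E_{I^c}(u_{ll'}) a_{kl'} = \sum_{l'} \tau(u_{ll'}) a_{kl'} = 2^{-|I|} a_{kl}.
\end{align*}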
Here we have used the following relation:
\[ \tau(u_{kl}) = \tau(u_{km} u_{ml}) = \tau(u_{ml} u_{km}) = \delta_{kl}\tau(u_{mm}) = \delta_{kl} 2^{-|I|} \tau\Bigl(\sum_m u_{mm}\Bigr) = 2^{-|I|}\delta_{kl}. \]
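Theorem 4.19. (1) For a finite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c) + v_I \mathcal{A}(I^c), \tag{4.50} \]
where $v_I$ is given by (4.30).
(2) For an infinite $I$,
\[ (\mathcal{A}(I)_+)' \cap \mathcal{A} = \mathcal{A}(I^c). \tag{4.51} \]

Proof. (1) is given by Lemma 4.14. To prove (2), we consider an infinite $I$. Clearly $(\mathcal{A}(I)_+)' \cap \mathcal{A} \supset \mathcal{A}(I^c)$ due to (4.8). Hence it is enough to prove that any $b \in (\mathcal{A}(I)_+)' \cap \mathcal{A}$ belongs to $\mathcal{A}(I^c)$.

Let $\{L_n\}$ be an increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. Set $I_n \equiv L_n \cap I$. Then $I_n \nearrow I$. For any $\varepsilon > 0$, there exist a positive integer $l_\varepsilon$ and an element $b_\varepsilon$ of $\mathcal{A}(L_{l_\varepsilon})$ satisfying
\[ \|b - b_\varepsilon\| < \varepsilon. \]
For any $n$, $b \in (\mathcal{A}(I_n)_+)'$ due to $I_n \subset I$ and $b \in (\mathcal{A}(I)_+)'$. The conclusion of (1) implies
\[ b = b_{0n} + v_{I_n} b_{1n}, \tag{4.52} \]
where $b_{0n}, b_{1n} \in \mathcal{A}((I_n)^c)$. Since $I_n \nearrow I$ and $I$ is infinite, there exists an $n_\varepsilon$ such that $I_{n_\varepsilon}$ contains a point $i$ which does not belong to $L_{l_\varepsilon}$. Then $i \in I_n$ for all $n \ge n_\varepsilon$. Due to $b_\varepsilon \in \mathcal{A}(L_{l_\varepsilon})$ and $\{i\}^c \supset L_{l_\varepsilon}$,
\[ E_{\{i\}^c}(b_\varepsilon) = b_\varepsilon. \tag{4.53} \]
Since $b_{0n}, b_{1n} \in \mathcal{A}((I_n)^c) \subset \mathcal{A}(\{i\}^c)$ for all $n \ge n_\varepsilon$, we have
\[ E_{\{i\}^c}(b_{0n}) = b_{0n}, \qquad E_{\{i\}^c}(v_{I_n} b_{1n}) = \tau(v_i)\, v_{I_n \setminus \{i\}}\, b_{1n} = 0. \tag{4.54} \]
This implies
\[ E_{\{i\}^c}(b) = E_{\{i\}^c}(b_{0n}) + E_{\{i\}^c}(v_{I_n} b_{1n}) = b_{0n}. \tag{4.55} \]
It follows from (4.53) and (4.55) that
\[ \|b_\varepsilon - b_{0n}\| = \|E_{\{i\}^c}(b_\varepsilon) - E_{\{i\}^c}(b)\| \le \|b_\varepsilon - b\| < \varepsilon. \]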
Therefore,
\[ \|b - b_{0n}\| \le \|b - b_\varepsilon\| + \|b_\varepsilon - b_{0n}\| < 2\varepsilon \tag{4.56} \]
for all $n \ge n_\varepsilon$. Hence
\[ b = \lim_n b_{0n}. \]
For any fixed $m \in \mathbb{N}$, $b_{0n} \in \mathcal{A}((I_n)^c) \subset \mathcal{A}((I_m)^c)$ for all $n \ge m$ due to $I_n \supset I_m$. Thus $b \in \mathcal{A}((I_m)^c)$ for any $m$. By Corollary 4.12,
\[ b \in \cap_m \mathcal{A}((I_m)^c) = \mathcal{A}(\cap_m (I_m)^c) = \mathcal{A}((\cup_m I_m)^c) = \mathcal{A}(I^c). \]

As a by-product, we obtain the following.

Corollary 4.20. For any infinite $I$, the restriction of $\Theta$ to $\mathcal{A}(I)$ is outer.

Proof. We denote the restriction of $\Theta$ by the same letter. For any infinite subsets $I$ and $J$, $(\mathcal{A}(I), \Theta)$ is isomorphic to $(\mathcal{A}(J), \Theta)$ as a pair of a C*-algebra and its automorphism through any bijective map between $I$ and $J$. Therefore it is enough to show the assertion for a proper infinite subset $I$ of $\mathbb{Z}^\nu$. Suppose that $u$ is a unitary element in $\mathcal{A}(I)$ such that
\[ u^* a u = \Theta(a) \]
for all $a \in \mathcal{A}(I)$. Substituting $u$ into $a$, we have $\Theta(u) = u$. Let $b \in \mathcal{A}(I^c)_-$ and $b \ne 0$. Then $ub \in \mathcal{A}_-$. By (4.8),
\[ ba = \Theta(a)b \]
for all $a \in \mathcal{A}(I)$. Hence $ub \in \mathcal{A}(I)'$. Therefore $ub \in (\mathcal{A}(I)')_-$, which implies $ub = 0$ by Lemma 4.16. This implies
\[ b = u^*(ub) = 0, \]
a contradiction.

5. Dynamics

5.1. Assumptions

We consider a one-parameter group of $*$-automorphisms $\alpha_t$ of the Fermion algebra $\mathcal{A}$. Throughout this work, $\alpha_t$ is assumed to be strongly continuous, that is, $t \in \mathbb{R} \mapsto \alpha_t(A) \in \mathcal{A}$ is norm continuous for each $A \in \mathcal{A}$. In order to associate a potential to the dynamics $\alpha_t$ (see Sec. 5.4 for details), we need the following two assumptions on $\alpha_t$ and its generator $\delta_\alpha$ with the domain $D(\delta_\alpha)$:

(I) $\alpha_t \Theta = \Theta \alpha_t$ for all $t \in \mathbb{R}$.
(II) $\mathcal{A}_\circ$ is in the domain of $\delta_\alpha$, namely, $\mathcal{A}_\circ \subset D(\delta_\alpha)$.
The assumption (I) of $\Theta$-even dynamics comes from two sources. On the physical side, the generator of the time translation $\alpha_t$ should be $i = \sqrt{-1}$ times the commutator with the energy operator, which is a physical observable and hence $\Theta$-even. On the technical side, the potential to be introduced below has to commute with a fixed local element of $\mathcal{A}$ when the support region of the potential is far away, in order that the expression for the action of the generator on that local element converges and makes sense.

For $\alpha_t$ to be uniquely specified by the associated potential to be introduced in Sec. 5.4, we need the following assumption:

(III) $\mathcal{A}_\circ$ is the core of $\delta_\alpha$, namely, if $\delta$ denotes the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$, its closure $\bar\delta$ is $\delta_\alpha$.

The assumption (III) will be used to derive a conclusion involving $\alpha_t$ such as the KMS condition from other conditions involving the associated potential such as the Gibbs condition and the variational principle.

Later, when we discuss translation invariant equilibrium states, we will add the assumption of translation invariance:

(IV) $\alpha_t \tau_k = \tau_k \alpha_t$ for any $t \in \mathbb{R}$, $k \in \mathbb{Z}^\nu$.

Later in Proposition 8.1, it will be shown that Assumption (IV) implies Assumption (I).

By Assumptions (I) and (II), the restriction $\delta$ of $\delta_\alpha$ to $\mathcal{A}_\circ$ satisfies
\[ \delta\Theta(A) = \Theta(\delta A) \tag{5.1} \]
for any $A \in \mathcal{A}_\circ$. In the rest of this section, we deal with an arbitrary $*$-derivation $\delta$ with the domain $\mathcal{A}_\circ$ commuting with $\Theta$ (Eq. (5.1)) irrespective of whether it comes from a dynamics $\alpha_t$ or not. Of course, we can use the results about such a general $\delta$ for the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$.

5.2. Local Hamiltonians

Since $\mathcal{A}(I)$ is a finite type I factor for each finite subset $I$ of $\mathbb{Z}^\nu$, there exists a self-adjoint element $H'_I \in \mathcal{A}$ satisfying
\[ \delta A = i[H'_I, A] \tag{5.2} \]
for any $A \in \mathcal{A}(I)$, where $\delta$ is any $*$-derivation with its domain $\mathcal{A}_\circ$ and values in $\mathcal{A}$ (i.e. $\delta$ is a linear map from $\mathcal{A}_\circ$ into $\mathcal{A}$ satisfying $\delta(AB) = (\delta A)B + A(\delta B)$ and $\delta(A^*) = (\delta A)^*$). Although this is well-known (see e.g. [38]), we include its proof for the sake of completeness.

Lemma 5.1. Let $\{u_{ij}\}$ be a self-adjoint system of matrix units of $\mathcal{A}(I)$. Define
\[ h_{ij} \equiv \sum_l u_{li}\,\delta u_{jl} - \delta_{ij} 2^{-|I|} \sum_l \sum_m u_{lm}\,\delta u_{ml}. \]
Then $h_{ij} \in \mathcal{A}(I)'$. Define
\[ iH \equiv \sum_{i,j} u_{ij} h_{ij}. \]
It satisfies $H^* = H$ and
\[ [iH, A] = \delta A \]
for $A \in \mathcal{A}(I)$. Furthermore,
\[ E_{I^c}(H) = 0. \tag{5.3} \]
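Proof. (1) We first prove $h_{ij} \in \mathcal{A}(I)'$. If $i \ne j$,
\begin{align*}
[h_{ij}, u_{\alpha\beta}] &= \sum_l u_{li}(\delta u_{jl})u_{\alpha\beta} - u_{\alpha i}\,\delta u_{j\beta} \\
&= \sum_l u_{li}\bigl(\delta(u_{jl}u_{\alpha\beta}) - u_{jl}\,\delta u_{\alpha\beta}\bigr) - u_{\alpha i}\,\delta u_{j\beta} \\
&= u_{\alpha i}\,\delta u_{j\beta} - u_{\alpha i}\,\delta u_{j\beta} = 0.
\end{align*}
If $i = j$,
\begin{align*}
[h_{ii}, u_{\alpha\beta}] &= \sum_l u_{li}(\delta u_{il})u_{\alpha\beta} - u_{\alpha i}\,\delta u_{i\beta} - \sum_l \sum_m 2^{-|I|}\bigl\{u_{lm}(\delta u_{ml})u_{\alpha\beta} - u_{\alpha\beta}u_{lm}\,\delta u_{ml}\bigr\} \\
&= \sum_l u_{li}\bigl(\delta(u_{il}u_{\alpha\beta}) - u_{il}\,\delta u_{\alpha\beta}\bigr) - u_{\alpha i}\,\delta u_{i\beta} \\
&\quad - \sum_l \sum_m 2^{-|I|}\bigl\{u_{lm}\bigl(\delta(u_{ml}u_{\alpha\beta}) - u_{ml}\,\delta u_{\alpha\beta}\bigr) - u_{\alpha\beta}u_{lm}\,\delta u_{ml}\bigr\} \\
&= u_{\alpha i}\,\delta u_{i\beta} - \delta u_{\alpha\beta} - u_{\alpha i}\,\delta u_{i\beta} - \sum_m 2^{-|I|}u_{\alpha m}\,\delta u_{m\beta} + 2^{-|I|}(2^{|I|}1)\,\delta u_{\alpha\beta} + 2^{-|I|}\sum_m u_{\alpha m}\,\delta u_{m\beta} \\
&= 0.
\end{align*}

(2) We prove $[iH, u_{\alpha\beta}] = \delta u_{\alpha\beta}$, which yields $[iH, A] = \delta A$ for any $A \in \mathcal{A}(I)$ by linearity.
\begin{align*}
[iH, u_{\alpha\beta}] &= \sum_{i,j}[u_{ij}, u_{\alpha\beta}]h_{ij} = \sum_i u_{i\beta}h_{i\alpha} - \sum_j u_{\alpha j}h_{\beta j} \\
&= \sum_i u_{ii}\,\delta u_{\alpha\beta} - \sum_m 2^{-|I|}u_{\alpha m}\,\delta u_{m\beta} - u_{\alpha\beta}\sum_j \delta u_{jj} + \sum_m 2^{-|I|}u_{\alpha m}\,\delta u_{m\beta} \\
&= \delta u_{\alpha\beta} - u_{\alpha\beta}\,\delta\Bigl(\sum_j u_{jj}\Bigr) \\
&= \delta u_{\alpha\beta} - u_{\alpha\beta}\,\delta 1 \\
&= \delta u_{\alpha\beta},
\end{align*}
where we have used $h_{ij} \in \mathcal{A}(I)'$ for the first equality.

(3) Next we prove $H^* = H$ or $iH + (iH)^* = 0$. By using $u_{ij}^* = u_{ji}$ and $(\delta a)^* = \delta a^*$, we obtain
\[ iH + (iH)^* = \sum_{i,j} u_{ij}(h_{ij} + h_{ji}^*), \]
\begin{align*}
h_{ij} + h_{ji}^* &= \sum_l \bigl\{u_{li}\,\delta u_{jl} + (\delta u_{li})u_{jl}\bigr\} - \delta_{ij}2^{-|I|}\sum_l \sum_m \bigl\{u_{lm}\,\delta u_{ml} + (\delta u_{lm})u_{ml}\bigr\} \\
&= \sum_l \delta(u_{li}u_{jl}) - \delta_{ij}2^{-|I|}\sum_l \sum_m \delta(u_{lm}u_{ml}) \\
&= \delta_{ij}\,\delta\Bigl(\sum_l u_{ll}\Bigr) - \delta_{ij}\,\delta\Bigl(\sum_l u_{ll}\Bigr) = 0.
\end{align*}
Hence $iH + (iH)^* = 0$.

(4) We prove the last statement. Note that $\tau(u_{ij}) = 2^{-|I|}\delta_{ij}$. Hence
\[ iE_{I^c}(H) = 2^{-|I|}\sum_i h_{ii} = 2^{-|I|}\Bigl\{\sum_i \sum_l u_{li}\,\delta u_{il} - \sum_l \sum_m u_{lm}\,\delta u_{ml}\Bigr\} = 0. \]

We denote this $H$ by $H'_I$.

Lemma 5.2. If $\delta$ is a $*$-derivation with domain $\mathcal{A}_\circ$ and values in $\mathcal{A}$ commuting with $\Theta$, then there exists a self-adjoint element $H(I) \in \mathcal{A}_+$ satisfying
\[ \delta A = i[H(I), A] \]
for all $A \in \mathcal{A}(I)$ and
\[ E_{I^c}(H(I)) = 0. \]

Proof. Due to the commutativity of $\delta$ and $\Theta$ and $\Theta^2 = 1$, we have
\[ \delta A = \Theta(\delta\Theta(A)) = \Theta(i[H'_I, \Theta(A)]) = i[\Theta(H'_I), A] \]
for any $A \in \mathcal{A}(I)$. Set
\[ H(I) \equiv (H'_I)_+ = \tfrac{1}{2}\bigl(H'_I + \Theta(H'_I)\bigr) \quad (\in \mathcal{A}_+). \tag{5.4} \]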
Then we have $H(I)^* = H(I)$ and
\[ \delta A = i[H(I), A] \qquad (A \in \mathcal{A}(I)). \]
Since $E_{I^c}(H'_I) = 0$, it follows from (5.4) and (4.21) that
\[ E_{I^c}(H(I)) = 0. \]

The local Hamiltonian operator $H(I)$ obtained in the above lemma has the following properties:

(H-i) $H(I)^* = H(I) \in \mathcal{A}$.
(H-ii) $\Theta(H(I)) = H(I)$ (i.e. $H(I) \in \mathcal{A}_+$).
(H-iii) $\delta A = i[H(I), A]$ ($A \in \mathcal{A}(I)$).
(H-iv) $E_{I^c}(H(I)) = 0$.

Remark. The property (H-iv) implies
\[ \tau(H(I)) = \tau(E_{I^c}(H(I))) = 0. \tag{5.5} \]

Lemma 5.3. $H(I)$ satisfying (H-ii)–(H-iv) is uniquely determined by $\delta$.

Proof. If $H(I)$ and $H(I)'$ satisfy (H-ii)–(H-iv), then $\Delta = H(I) - H(I)'$ satisfies $[\Delta, A] = 0$ for all $A \in \mathcal{A}(I)$ due to (H-iii). By Lemma 4.15 and (H-ii) for $\Delta$, $\Delta \in \mathcal{A}(I)' \cap \mathcal{A}_+ = \mathcal{A}(I^c)_+$. Hence (H-iv) implies
\[ \Delta = E_{I^c}(\Delta) = E_{I^c}(H(I)) - E_{I^c}(H(I)') = 0. \]
Therefore $H(I)$ satisfying (H-ii)–(H-iv) is unique.

We call $H(I)$ the standard Hamiltonian for the region $I$.

Remark. For the empty set $\emptyset$, $H(\emptyset) = 0$ by (H-iv). Under the conditions (H-ii)–(H-iv), the property $H(I)^* = H(I)$ of (H-i) and the property $(\delta A)^* = \delta A^*$ ($A \in \mathcal{A}(I)$) for $\delta$ are equivalent for the following reason. If $H(I)^* = H(I)$, then $(\delta A)^* = \delta A^*$ immediately follows from (H-iii). If $(\delta A)^* = \delta A^*$, then $H(I)^*$ satisfies (H-iii) along with (H-ii) and (H-iv). Hence $H(I)^* = H(I)$ by the uniqueness result Lemma 5.3.

Lemma 5.4. If $I \subset J$ is a pair of finite subsets, then
\[ H(I) = H(J) - E_{I^c}(H(J)). \tag{5.6} \]

Proof. $H(J)$ satisfies (H-ii) and (H-iii) for the region $I$ ($\subset J$). Furthermore, $E_{I^c}(H(J)) \in \mathcal{A}(I^c)_+$ due to (H-ii) for $H(J)$ and hence it commutes with $A \in \mathcal{A}(I)$. Therefore $H(J) - E_{I^c}(H(J))$ satisfies (H-ii)–(H-iv) for the region $I$. By the uniqueness (Lemma 5.3), we obtain $H(I) = H(J) - E_{I^c}(H(J))$.
We give the number (H-v) to the condition above: (H-v) H(I) = H(J) − EIc (H(J)) for any finite subsets I ⊂ J of Zν . The proof above has shown that (H-v) is derived from (H-ii)–(H-iv). So far we have derived the properties (H-i), (H-ii), (H-iv) and (H-v) for the family {H(I)} from its definition in terms of δ through the relation (H-iii). In the converse direction, any family of an element H(I) ∈ A for each finite subset I of Zν defines a derivation δ on A◦ by (H-iii). This definition requires a consistency: if A ∈ A(I) and A ∈ A(J), we have a definition of δ(A) by H(I) and H(J). The proof that they are the same is given as follows. First we note that A ∈ A(I) ∩ A(J) = A(I ∩ J). Thus it is enough to show [H(I), A] = [H(K), A]
(5.7)
for any K ⊂ I and A ∈ A(K), because, using this identity for the pair I ⊃ K = I ∩ J and J ⊃ K; we obtain [H(I), A] = [H(J), A] for any A ∈ A(I ∩ J). Since EKc (H(I)) is Θ-even by (H-ii) and (4.21), EKc (H(I)) is in A(Kc )+ and commutes with A ∈ A(K). By (H-v), H(K) = H(I) − EKc (H(I)) which leads to the consistency equation (5.7). δ defined by (H-iii) is a ∗-derivation with domain A◦ due to (H-i), and commutes with Θ by (H-ii). We have not used (H-iv) in this argument, but have imposed it on H(I) to obtain the uniqueness of H(I) for a given δ. Namely, by Lemmas 5.2 and 5.3, the correspondence of δ and H(I) is bijective, for which the condition (H-iv) is used. Summarizing the argument so far, we have obtained Theorem 5.7 stated below after introduction of two definitions. Definition 5.5. The real vector space of all ∗-derivations with their definition domain A◦ and commuting with Θ (on A◦ ) is denoted by ∆(A◦ ). Remark. Under Assumptions (I) and (II), the restriction δ of the generator δα of αt belongs to ∆(A◦ ). Definition 5.6. The real vector space of functions H(I) of finite subsets I satisfying the following four conditions is denoted by H and its element H is called a local Hamiltonian. (H-i) H(I)∗ = H(I) ∈ A, (H-ii) Θ(H(I)) = H(I) (i.e. H(I) ∈ A+ ) (H-iv) EIc (H(I)) = 0, (H-v) H(I) = H(J) − EIc (H(J)) for any finite subsets I ⊂ J of Zν .
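Theorem 5.7. The following relation between $H \in \mathcal{H}$ and $\delta \in \Delta(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{H}$ to $\Delta(\mathcal{A}_\circ)$.
(H-iii) $\delta A = i[H(I), A]$ ($A \in \mathcal{A}(I)$).

Remark. The value $\delta A$ of the derivation $\delta \in \Delta(\mathcal{A}_\circ)$ for $A \in \mathcal{A}_\circ$ is in general not in $\mathcal{A}_\circ$.

5.3. Internal energy

For a finite subset $I$ of $\mathbb{Z}^\nu$, set
$$U(I) \equiv E_I(H(I)) \quad (\in \mathcal{A}(I)) \tag{5.8}$$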
and call it the internal energy for the region $I$. Due to $H(\emptyset) = 0$, $U(\emptyset) = 0$. Due to the property (5.5),
$$E_I E_{I^c}(H(J)) = \tau(H(J)) = 0.$$
By (H-v), we obtain for $I \subset J$
$$U(I) = E_I H(I) = E_I\bigl(H(J) - E_{I^c}(H(J))\bigr) = E_I H(J) = E_I E_J H(J) = E_I U(J). \tag{5.9}$$
Furthermore, for any finite subset $I$ and any subset $J$ of $\mathbb{Z}^\nu$, we have
$$E_J(U(I)) = E_J E_I(U(I)) = E_{J \cap I}(U(I)) = U(I \cap J), \tag{5.10}$$
where the last equality is due to (5.9). Due to (5.5),
$$\tau(U(I)) = \tau(E_I(H(I))) = \tau(H(I)) = 0. \tag{5.11}$$
Let us denote
$$H_J(I) \equiv E_J(H(I)). \tag{5.12}$$

Lemma 5.8. (1) For any pair of finite subsets $I$ and $J$,
$$H_J(I) = U(J) - U(I^c \cap J). \tag{5.13}$$
(2) For any finite subset $I$,
$$H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \bigl(U(J) - U(I^c \cap J)\bigr). \tag{5.14}$$

Proof. (1): By applying (H-v) for pairs $I \supset I \cap J$ and $J \supset I \cap J$, we obtain
$$H(I \cap J) = H(I) - E_{(I \cap J)^c}(H(I)), \qquad H(I \cap J) = H(J) - E_{(I \cap J)^c}(H(J)).$$
Therefore,
$$H(I) = H(J) - E_{(I \cap J)^c}(H(J) - H(I)).$$
By applying $E_J$ to this equation, we obtain
$$H_J(I) = U(J) - E_J E_{(I \cap J)^c}(H(J) - H(I)).$$
Since
$$J \cap (I \cap J)^c = J \cap (I^c \cup J^c) = (J \cap I^c) \cup (J \cap J^c) = J \cap I^c,$$
we obtain
$$E_J E_{(I \cap J)^c} = E_{J \cap (I \cap J)^c} = E_{J \cap I^c} = E_J E_{I^c} = E_{I^c} E_J.$$
Since $E_{I^c}(H(I)) = 0$ by (H-iv), we have
$$E_J E_{(I \cap J)^c}(H(J) - H(I)) = E_{I^c} E_J(H(J)) = E_{I^c}(U(J)).$$
Thus
$$H_J(I) = U(J) - E_{I^c}(U(J)).$$
By this and (5.10), we arrive at (5.13).

(2): By (4.23), we have
$$H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} H_J(I). \tag{5.15}$$
This and (5.13) imply the desired (5.14).

5.4. Potential

We introduce the potential $\{\Phi(I)\}$ in terms of $\{H(I)\}$ and derive its characterizing properties. As a consequence, we establish the one-to-one correspondence between $\{\Phi(I)\}$ and $\{H(I)\}$.

Lemma 5.9. For a given $\{H(I)\} \in \mathcal{H}$ and the corresponding $\{U(I)\}$, there exists one and only one family $\{\Phi(I) \in \mathcal{A};\ \text{finite } I \subset \mathbb{Z}^\nu\}$ satisfying the following conditions:
(1) $\Phi(I) \in \mathcal{A}(I)$.
(2) $\Phi(I)^* = \Phi(I)$, $\Theta(\Phi(I)) = \Phi(I)$, $\Phi(\emptyset) = 0$.
(3) $E_J(\Phi(I)) = 0$ if $J \subset I$ and $J \neq I$.
(4) $U(I) = \sum_{K \subset I} \Phi(K)$.
(5) $H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}$.

Proof. We show this lemma in several steps.

Step 1. Existence of $\Phi$ satisfying (1) and (4) for all finite $I$.
The following expression for $\Phi(I)$ in terms of $U(K)$, $K \subset I$, satisfies (1) and (4) for all $I$, and hence the existence follows:
$$\Phi(I) = \sum_{K \subset I} (-1)^{|I| - |K|} U(K). \tag{5.16}$$
In fact, substituting this expression into $\sum_{J \subset I} \Phi(J)$, we obtain
$$\sum_{J \subset I} \sum_{K \subset J} (-1)^{|J| - |K|} U(K) = \sum_{K \subset I} \alpha(K) U(K),$$
$$\alpha(K) = \sum_{J:\, K \subset J \subset I} (-1)^{|J| - |K|} = \sum_{m = |K|}^{|I|} (-1)^{m - |K|} \beta_m, \tag{5.17}$$
where $\beta_m$ is the number of distinct $J$ satisfying $K \subset J \subset I$, $|J| = m$. This is the number of ways of choosing $m - |K|$ elements (for $J \setminus K$) out of $I \setminus K$, which is $\binom{|I| - |K|}{m - |K|}$. Putting $l = m - |K|$, $n = |I| - |K|$, we obtain
$$\alpha(K) = \sum_{l=0}^{n} (-1)^l \binom{n}{l} = (1 - 1)^n = 0$$
for all $K \neq I$ (then $n \geq 1$), while we have $\alpha(I) = 1$. Hence (4) is satisfied by $\Phi(I)$ given as (5.16) for all $I$.

Step 2. Uniqueness of $\Phi$ satisfying (4).
The relation (4) implies
$$\Phi(I) = U(I) - \sum_{K \subset I,\, K \neq I} \Phi(K), \tag{5.18}$$
which obviously determines $\Phi(I)$ uniquely for a given $\{U(I)\}$ by mathematical induction on $|I| = m$, starting from $\Phi(\emptyset) = U(\emptyset) = 0$.

Step 3. Property (2).
We have already obtained $\Phi(\emptyset) = 0$. Since $U(I)^* = U(I)$ and $\Theta(U(I)) = U(I)$, $\Phi(I)$ defined by (5.16) as a real linear combination of $U(K)$, $K \subset I$, satisfies (2).

Step 4. Property (3).
We note that (3) is equivalent to the following condition:
$$E_J(\Phi(I)) = 0 \quad \text{for } J \not\supset I, \tag{5.19}$$
because $E_J(\Phi(I)) = E_J E_I(\Phi(I)) = E_{J \cap I}(\Phi(I))$ by Theorem 4.7, $J \cap I \subset I$, and $J \cap I \neq I$ if and only if $J \not\supset I$. On the other hand, $E_J(\Phi(I)) = \Phi(I)$ if $J \supset I$ due to $\Phi(I) \in \mathcal{A}(I) \subset \mathcal{A}(J)$.

We now prove (3) by mathematical induction on $|I| = m$. For $m = 1$, the only $J$ satisfying $J \subset I$ and $J \neq I$ is $J = \emptyset$, for which $\Phi(J) = 0$. Then $\Phi(I) = U(I)$ and $E_J(\Phi(I)) = \tau(\Phi(I))\mathbf{1} = \tau(U(I))\mathbf{1} = 0$ due to (5.11). Suppose (3) holds for $|I| < m$. We consider $I$ with $|I| = m$. We apply $E_J$ (for $J \subset I$, $J \neq I$) on both sides of (5.18). All $K$ in the summation on the right-hand side satisfy $|K| < m$ due to $K \subset I$ and $K \neq I$. Hence the inductive assumption is applicable to $\Phi(K)$ on the right-hand side. If $K \not\subset J$, we
have $E_J(\Phi(K)) = 0$ by (5.19). If $K \subset J$, we have $E_J(\Phi(K)) = \Phi(K)$. Therefore, by using $E_J U(I) = U(J)$ (due to $J \subset I$), we obtain
$$E_J \Phi(I) = E_J U(I) - \sum_{K \subset I,\, K \neq I} E_J \Phi(K) = U(J) - \sum_{K \subset J} \Phi(K) = 0.$$
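This proves (3).

Step 5. Property (5).
For a finite subset $J$ and $I \subset J$, $H_J(I)$ is written in terms of $\Phi$ by (5.13) and (4) as
$$H_J(I) \,(= E_J(H(I))) = \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}. \tag{5.20}$$
Due to (5.15), $\Phi$ satisfies (5).

We collect useful formulae for $U$ and $H$ in terms of $\Phi$ which have been obtained above:
$$U(I) = \sum_{K \subset I} \Phi(K), \tag{5.21}$$
$$H_J(I) = \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}, \tag{5.22}$$
$$H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\} \quad \Bigl(= \lim_{J \nearrow \mathbb{Z}^\nu} H_J(I)\Bigr). \tag{5.23}$$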
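The subset bookkeeping behind (5.16), property (4) and formulas (5.13) and (5.20) can be checked numerically on a small index set. The following is only a minimal sketch and not part of the paper's argument: real numbers stand in for the operators U(K) and Φ(K), so only the commutative combinatorics over subsets is tested, and the names `subsets`, `potential_from_energy` and the random sample values are hypothetical.

```python
from itertools import combinations
import random

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def potential_from_energy(U, I):
    """Formula (5.16): Phi(I) = sum over K subset of I of (-1)^(|I|-|K|) U(K)."""
    return sum((-1) ** (len(I) - len(K)) * U[K] for K in subsets(I))

random.seed(0)
J = frozenset({1, 2, 3, 4})
# Scalar stand-ins for U(K); U(empty set) = 0 as in the text.
U = {K: (random.uniform(-1, 1) if K else 0.0) for K in subsets(J)}
Phi = {K: potential_from_energy(U, K) for K in subsets(J)}

# Property (4): U(I) is recovered as the sum of Phi(K) over K subset of I.
recovered = {I: sum(Phi[K] for K in subsets(I)) for I in subsets(J)}
max_error = max(abs(recovered[I] - U[I]) for I in subsets(J))

# (5.13) and (5.20): U(J) - U(I^c ∩ J) equals the sum of Phi(K)
# over K subset of J with K ∩ I nonempty.
I = frozenset({1, 2})
lhs = U[J] - U[J - I]
rhs = sum(Phi[K] for K in subsets(J) if K & I)
print(max_error, abs(lhs - rhs))
```

Both printed numbers should be zero up to floating-point rounding; the first reflects the vanishing of α(K) for K ≠ I in (5.17).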
Definition 5.10. A function $\Phi$ of finite subsets $I$ of $\mathbb{Z}^\nu$ with the value $\Phi(I)$ in $\mathcal{A}$ is called a standard potential if it satisfies the following conditions:
(Φ-a) $\Phi(I) \in \mathcal{A}(I)$, $\Phi(\emptyset) = 0$.
(Φ-b) $\Phi(I)^* = \Phi(I)$.
(Φ-c) $\Theta(\Phi(I)) = \Phi(I)$.
(Φ-d) $E_J(\Phi(I)) = 0$ if $J \subset I$ and $J \neq I$.
(Φ-e) For each fixed finite subset $I$ of $\mathbb{Z}^\nu$, the net
$$H_J(I) = \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}$$
is a Cauchy net in the norm topology of $\mathcal{A}$ for $J \nearrow \mathbb{Z}^\nu$. The index set for the net is the set of all finite subsets $J$ of $\mathbb{Z}^\nu$, partially ordered by the set inclusion.

Remark. (Φ-d) is equivalent to the following condition:
(Φ-d)′ $E_J(\Phi(I)) = 0$ unless $I \subset J$,
because $E_J(\Phi(I)) = E_J E_I(\Phi(I)) = E_{J \cap I}(\Phi(I))$.

Definition 5.11. The real vector space of all standard potentials is denoted by $\mathcal{P}$.
Remark. $\mathcal{P}$ is a real vector space as a function space, where the linear operation is defined by
$$(c\Phi + d\Psi)(I) = c\Phi(I) + d\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}. \tag{5.24}$$
We show the one-to-one correspondence of $\Phi \in \mathcal{P}$ and $H \in \mathcal{H}$.

Theorem 5.12. The equations (5.22) and (5.23) for $\Phi \in \mathcal{P}$ and $H \in \mathcal{H}$ give a bijective, real linear map from $\mathcal{P}$ to $\mathcal{H}$.

Proof. First note that (4) of Lemma 5.9 is satisfied for $U(I) = E_I(H(I))$ due to (Φ-d), if (5.22) and (5.23) are satisfied. By Lemma 5.9, there exists a unique $\Phi \in \mathcal{P}$ satisfying (5.22) and (5.23) for any given $H \in \mathcal{H}$. The map is evidently linear. The only remaining task is to prove the properties (H-i), (H-ii), (H-iv) and (H-v) for the $H(I)$ given by (5.22) and (5.23), on the basis of (Φ-a)–(Φ-e). (H-i), (H-ii) and (H-iv) follow from (Φ-b), (Φ-c) and (Φ-d)′, respectively. To show (H-v), let $L$ be a finite subset containing $J \supset I$. Then
$$H_L(J) - H_L(I) = \sum_K \{\Phi(K);\ K \cap J \neq \emptyset,\ K \cap I = \emptyset,\ K \subset L\}$$
$$= E_{I^c}\Bigl(\sum_K \{\Phi(K);\ K \cap J \neq \emptyset,\ K \subset L\}\Bigr) = E_{I^c}(H_L(J))$$
due to (5.22), (Φ-a) and (Φ-d)′. By taking the limit $L \nearrow \mathbb{Z}^\nu$, we obtain
$$H(J) - H(I) = E_{I^c}(H(J)),$$
where the convergence is due to (Φ-e) and $\|E_{I^c}\| = 1$.

Remark. We will use later the real linearity of the above map:
$$H_{c\Phi + d\Psi}(I) = cH_\Phi(I) + dH_\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}, \tag{5.25}$$
$$U_{c\Phi + d\Psi}(I) = cU_\Phi(I) + dU_\Psi(I), \qquad c, d \in \mathbb{R}, \quad \Phi, \Psi \in \mathcal{P}, \tag{5.26}$$
where $H_\Phi(I)$ and $U_\Phi(I)$ denote $H(I)$ and $U(I)$ corresponding to $\Phi \in \mathcal{P}$.

Theorem 5.13. The following relation between $\Phi \in \mathcal{P}$ and $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{P}$ to $\Delta(\mathcal{A}_\circ)$:
$$\delta_\Phi A = i[H(I), A] \quad (A \in \mathcal{A}(I)), \tag{5.27}$$
$$H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \subset J\}. \tag{5.28}$$

Proof. This is a consequence of Theorem 5.7 and Theorem 5.12.
Remark 1. The technique using the conditional expectations for associating a unique standard potential with a given $*$-derivation has been developed for quantum spin lattice systems by one of the authors [12]. The corresponding formalism for classical lattice systems is developed in [13]. Also see [23], where $E_I$ for the quantum spin case is called a partial trace.

Remark 2. We note that $\mathcal{P}$ is a Fréchet space with respect to a countable family of seminorms $\{\|H(\{i\})\|\}$, $i \in \mathbb{Z}^\nu$.

5.5. General potential

If the function
$$\Phi : I \in \{\text{finite subsets of } \mathbb{Z}^\nu\} \mapsto \Phi(I) \tag{5.29}$$
satisfies (Φ-a), (Φ-b), (Φ-c) and (Φ-e), we call it a general potential. By (Φ-e), we define $H(I)$ by (5.23) and (5.22). Then, for any finite subsets $K \supset I$,
$$H(K) - H(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_L \{\Phi(L);\ L \cap K \neq \emptyset,\ L \cap I = \emptyset,\ L \subset J\} \tag{5.30}$$
due to (Φ-e). Therefore, we can define $\delta_\Phi$ with the domain $\mathcal{A}_\circ$ by
$$\delta_\Phi A = i[H(I), A] \quad \text{for } A \in \mathcal{A}(I), \tag{5.31}$$
which is a consistent definition due to (5.30) by essentially the same argument as the one leading to (5.7). The properties (Φ-a), (Φ-b), (Φ-c), and (Φ-e) imply that $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$.

Two general potentials $\Phi$ and $\Phi'$ are said to be equivalent if $\delta_\Phi = \delta_{\Phi'}$. It follows from Theorem 5.13 that there is a unique standard potential which is equivalent to any given general potential defined above. The equivalence is discussed, e.g., in [23] and [40] with the name of physical equivalence. We will consider the consequence of equivalence for a specific class of general potentials in Sec. 14.

6. KMS Condition

6.1. KMS condition

We recall the definition of the KMS condition for a given dynamics $\alpha_t$ of $\mathcal{A}$ (see e.g. [17]).

Definition 6.1. A state $\varphi$ of $\mathcal{A}$ is called an $\alpha_t$-KMS state at the inverse temperature $\beta \in \mathbb{R}$ or $(\alpha_t, \beta)$-KMS state (or more simply KMS state) if it satisfies one of the following two equivalent conditions:

(A) Let $D_\beta$ be the strip region
$$D_\beta = \{z \in \mathbb{C};\ 0 \leq \operatorname{Im} z \leq \beta\} \quad \text{if } \beta \geq 0,$$
$$D_\beta = \{z \in \mathbb{C};\ \beta \leq \operatorname{Im} z \leq 0\} \quad \text{if } \beta < 0,$$
in the complex plane $\mathbb{C}$ and $\overset{\circ}{D}_\beta$ be its interior.
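For every $A$ and $B$ in $\mathcal{A}$, there exists a function $F(z)$ of $z \in D_\beta$ (depending on $A$ and $B$) such that
(1) $F(z)$ is analytic in $\overset{\circ}{D}_\beta$,
(2) $F(z)$ is continuous and bounded on $D_\beta$,
(3) for all real $t \in \mathbb{R}$,
$$F(t) = \varphi(A\alpha_t(B)), \qquad F(t + i\beta) = \varphi(\alpha_t(B)A).$$

(B) Let $\mathcal{A}_{\mathrm{ent}}$ be the set of all $B \in \mathcal{A}$ for which $\alpha_t(B)$ has an analytic extension to an $\mathcal{A}$-valued entire function $\alpha_z(B)$ as a function of $z \in \mathbb{C}$. For $A \in \mathcal{A}$ and $B \in \mathcal{A}_{\mathrm{ent}}$,
$$\varphi(A\alpha_{i\beta}(B)) = \varphi(BA).$$

Remark. In (A), the condition (1) is empty if $\beta = 0$. The boundedness in (2) can be omitted (see e.g. Proposition 5.3.7 in [17]). $\mathcal{A}_{\mathrm{ent}}$ is known to be dense in $\mathcal{A}$.

For a state $\varphi$ on $\mathcal{A}$, let $\{\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi\}$ denote its GNS triplet, namely, $\pi_\varphi$ is a (GNS) representation of $\mathcal{A}$ on the Hilbert space $\mathcal{H}_\varphi$, and $\Omega_\varphi$ is a cyclic unit vector in $\mathcal{H}_\varphi$, representing $\varphi$ as the vector state. If $\varphi$ is an $(\alpha_t, \beta)$-KMS state, then $\Omega_\varphi$ is separating for the generated von Neumann algebra $\mathcal{M} \equiv \pi_\varphi(\mathcal{A})''$. Let $\Delta_\varphi$ and $\sigma_t^\varphi$ be the modular operator and modular automorphisms for $\Omega_\varphi$ and $\varphi$, respectively [42]. The KMS condition implies that
$$\sigma_t^\varphi(\pi_\varphi(A)) = \pi_\varphi(\alpha_{-\beta t}(A)), \qquad A \in \mathcal{A}. \tag{6.1}$$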
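Condition (B) of Definition 6.1 can be illustrated in a finite-dimensional toy setting, which is an assumption made here for illustration and is not the lattice algebra of the paper: take a diagonal Hamiltonian H on a three-level system, the dynamics α_t(B) = exp(itH) B exp(−itH) (matching the convention δA = i[H, A]), and the state φ(A) = Tr(exp(−βH) A) / Tr(exp(−βH)). The sketch below, with hypothetical names and sample matrices, checks φ(A α_{iβ}(B)) = φ(BA) numerically.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def alpha(z, b, energies):
    """alpha_z(B) = exp(izH) B exp(-izH) for H = diag(energies), z complex."""
    n = len(b)
    return [[cmath.exp(1j * z * (energies[j] - energies[k])) * b[j][k]
             for k in range(n)] for j in range(n)]

def gibbs_state(a, energies, beta):
    """phi(A) = Tr(exp(-beta H) A) / Tr(exp(-beta H)) for diagonal H."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * a[j][j] for j, w in enumerate(weights)) / sum(weights)

energies = [0.3, -1.1, 0.7]   # hypothetical spectrum of H
beta = 0.8
A = [[1 + 2j, 0.5, -1j], [2.0, -0.3j, 1.0], [0.7, 1 - 1j, 3.0]]
B = [[0.2, 1j, 1.5], [-1.0, 2 + 1j, 0.4j], [1j, -2.0, 0.9]]

# Condition (B): phi(A alpha_{i beta}(B)) = phi(B A).
lhs = gibbs_state(matmul(A, alpha(1j * beta, B, energies)), energies, beta)
rhs = gibbs_state(matmul(B, A), energies, beta)
kms_defect = abs(lhs - rhs)
print(kms_defect)
```

In this finite setting every element is entire for α_t, so the restriction to $\mathcal{A}_{\mathrm{ent}}$ plays no role; the printed defect should vanish up to rounding.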
It is a result of Takesaki [42] that the KMS condition of a one-parameter automorphism group of a von Neumann algebra with respect to a cyclic vector implies the separating property of the vector, and the modular automorphism group of the von Neumann algebra with respect to the cyclic and separating vector is characterized by the KMS condition at $\beta = -1$ with respect to the state given by that vector.

For the sake of brevity in stating an assumption later, we use the following terminology.

Definition 6.2. A state $\varphi$ is said to be modular if $\Omega_\varphi$ is separating for $\pi_\varphi(\mathcal{A})''$.

6.2. Differential KMS condition

It is convenient to introduce the following condition in terms of the generator $\delta_\alpha$ of the dynamics $\alpha_t$, equivalent to the KMS condition with respect to $\alpha_t$.

Definition 6.3. Let $\delta$ be a $*$-derivation of $\mathcal{A}$ with its domain $D(\delta)$. A state $\varphi$ is said to satisfy the differential $(\delta, \beta)$-KMS condition (or briefly, $(\delta, \beta)$-dKMS condition) if the following two conditions are satisfied:
(C-1) $\varphi(A^*\delta A)$ is pure imaginary for all $A \in D(\delta)$.
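(C-2) $-i\beta\varphi(A^*\delta A) \geq S(\varphi(AA^*), \varphi(A^*A))$ for all $A \in D(\delta)$, where the function $S(x, y)$ is given for $x \geq 0$, $y \geq 0$ by:
$$S(x, y) = y\log y - y\log x \quad \text{if } x > 0,\ y > 0,$$
$$S(x, y) = +\infty \quad \text{if } x = 0,\ y > 0,$$
$$S(x, y) = 0 \quad \text{if } x \geq 0,\ y = 0.$$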
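The three-case definition of S(x, y) translates directly into a small function. The following is a minimal sketch for experimentation only; the function name `S`, the grid of sample points and the tolerance are choices made here, not taken from the paper. It also spot-checks midpoint convexity on points with x > 0, y > 0, the property stated in Lemma 6.5.

```python
import math

def S(x, y):
    """The function S(x, y) of condition (C-2), for x >= 0 and y >= 0."""
    if x < 0 or y < 0:
        raise ValueError("S is defined only for x >= 0, y >= 0")
    if y == 0:
        return 0.0            # case x >= 0, y = 0
    if x == 0:
        return math.inf       # case x = 0, y > 0
    return y * math.log(y) - y * math.log(x)

# Spot-check midpoint convexity on a small grid of points with x > 0, y > 0.
grid = [(0.5 * i, 0.5 * j) for i in range(1, 7) for j in range(1, 7)]
convex_ok = all(
    S((x1 + x2) / 2, (y1 + y2) / 2) <= (S(x1, y1) + S(x2, y2)) / 2 + 1e-12
    for (x1, y1) in grid for (x2, y2) in grid
)
print(S(1.0, 1.0), S(0.0, 1.0), S(2.0, 0.0), convex_ok)
```

A grid check of this kind is of course no proof; it only makes the convexity claim easy to probe numerically.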
We use the following known result (see e.g., Theorem 5.3.15 in [17]).

Theorem 6.4. Let $\delta_\alpha$ be a generator of $\alpha_t$, namely, $e^{t\delta_\alpha} = \alpha_t$. Then the $(\delta_\alpha, \beta)$-dKMS condition and the $(\alpha_t, \beta)$-KMS condition are equivalent.

Remark. The function $S(x, y)$ is the relative entropy for linear functionals of a one-dimensional $*$-algebra. The order of the arguments $x, y$ in our notation is opposite to that of the definition in [45]. (Both the order of the arguments and the sign are opposite to those in [17].) Our definition here is in accordance with our definition of the relative entropy previously given.

Lemma 6.5. $S(x, y)$ is convex and lower semi-continuous in $x, y$.

Proof. A convenient expression for $S(x, y)$ is
$$S(x, y) = \sup_n \sup_{s(t)} \Bigl\{ y\log n - \int_{1/n}^{\infty} \bigl( y\,s(t)^2 + t^{-1} x \{1 - s(t)\}^2 \bigr) \frac{dt}{t} \Bigr\}, \tag{6.2}$$
where $s(t)$ varies over the linear span of characteristic functions of finite intervals in $[0, +\infty)$. The equality is immediate for $x = 0$, $y > 0$ as well as for $x \geq 0$, $y = 0$. For $x > 0$, $y > 0$, (6.2) follows from the following identities for $\lambda = x/y$:
$$y(\log y - \log x) = \sup_n \Bigl\{ -y\log\Bigl(\frac{1}{n} + \frac{x}{y}\Bigr) \Bigr\} = \sup_n \Bigl\{ y\log n - \int_{1/n}^{\infty} y\,\frac{\lambda}{t + \lambda}\,\frac{dt}{t} \Bigr\},$$
$$-y\,\frac{\lambda}{t + \lambda} = \sup_{s \in \mathbb{R}} \{ -(y s^2 + x t^{-1}(1 - s)^2) \}.$$
From the expression above, $S(x, y)$ is seen to be convex and lower semi-continuous in $(x, y)$ because it is a supremum of homogeneous linear functions of $(x, y)$. (The variational expression (6.2) for general von Neumann algebras is established by Kosaki [25]. This expression indicates manifestly some basic properties of relative entropy for the general case.)

Lemma 6.6. The conditions (C-1) and (C-2) are stable under the simultaneous limit of $A$ and $\delta A$ in norm topology and $\varphi$ in the weak$^*$ topology as well as under the convex combination of states $\varphi$.
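Proof. Let $A_n, A \in D(\delta)$, $\|A_n - A\| \to 0$, $\|\delta A_n - \delta A\| \to 0$, and $|\varphi_n(B) - \varphi(B)| \to 0$ for every $B \in \mathcal{A}$. Then
$$|\varphi_n(A_n^*\delta A_n) - \varphi(A^*\delta A)| \leq |\varphi_n(A_n^*\delta A_n - A^*\delta A)| + |\varphi_n(A^*\delta A) - \varphi(A^*\delta A)|,$$
which converges to 0 as $n \to \infty$. Therefore, the condition (C-1) holds for $\varphi$ and $A$ if it holds for $\varphi_n$ and $A_n$. Similarly,
$$\varphi_n(A_nA_n^*) \to \varphi(AA^*), \qquad \varphi_n(A_n^*A_n) \to \varphi(A^*A),$$
as $n \to \infty$. By the lower semi-continuity of $S(x, y)$ in $(x, y)$, we then obtain
$$S(\varphi(AA^*), \varphi(A^*A)) \leq \liminf_n S(\varphi_n(A_nA_n^*), \varphi_n(A_n^*A_n)).$$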
Hence we obtain the condition (C-2) for $\varphi$ and $A$ if it holds for $\varphi_n$ and $A_n$. Since $\varphi(A^*\delta A)$ is affine in $\varphi$ while $S(\varphi(AA^*), \varphi(A^*A))$ is convex in $\varphi$, the conditions (C-1) and (C-2) are stable under the convex combination of $\varphi$.

Corollary 6.7. Let $\alpha_t$ be a one-parameter group of $*$-automorphisms of $\mathcal{A}$ satisfying the conditions (II) and (III). Let $\delta_\alpha$ be the generator of $\alpha_t$. Then a state $\varphi$ is an $(\alpha_t, \beta)$-KMS state if and only if it is a $(\delta, \beta)$-dKMS state, where $\delta$ denotes the restriction of $\delta_\alpha$ to $\mathcal{A}_\circ$.

Proof. The restriction $\delta$ of $\delta_\alpha$ to $\mathcal{A}_\circ$ makes sense due to the assumption (II). By Theorem 6.4, it suffices to prove that the dKMS condition for $\delta$ implies the same for $\delta_\alpha$. By Assumption (III), there exists a sequence $A_n \in \mathcal{A}_\circ$ for any given $A \in D(\delta_\alpha)$ such that $\|A_n - A\| \to 0$, $\|\delta A_n - \delta_\alpha A\| \to 0$. Hence the conditions (C-1) and (C-2) for $\delta$ imply the same for $\delta_\alpha$ due to Lemma 6.6.

7. Gibbs Condition

In this section, we define the Gibbs condition. We first recall the notion of perturbation of dynamics and states.

7.1. Inner perturbation

Consider a given dynamics $\alpha_t$ of $\mathcal{A}$ with its generator $\delta$ on the domain $D(\delta)$. For each $h = h^* \in \mathcal{A}$, there exists the unique perturbed dynamics $\alpha_t^h$ of $\mathcal{A}$ with its generator $\delta^h$ given by
$$\delta^h(A) \equiv \delta(A) + i[h, A] \quad (A \in D(\delta)) \tag{7.1}$$
on the same domain as the generator $\delta$ of $\alpha_t$. This $\alpha_t^h(A)$ is explicitly given by
$$\alpha_t^h(A) = u_t^h \alpha_t(A)(u_t^h)^*, \tag{7.2}$$
where
$$u_t^h \equiv 1 + \sum_{m=1}^{\infty} i^m \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{m-1}} dt_m\, \alpha_{t_m}(h) \cdots \alpha_{t_1}(h). \tag{7.3}$$
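This is unitary and satisfies the following cocycle equation:
$$u_s^h \alpha_s(u_t^h) = u_{s+t}^h.$$
The same statements hold for a von Neumann algebra $\mathcal{M}$ and its one-parameter group of $*$-automorphisms $\alpha_t$; the $t$-continuity of $\alpha_t$ for each fixed $x \in \mathcal{M}$ in the strong operator topology of $\mathcal{M}$ is to be assumed.

Let $\Omega$ be a cyclic and separating vector for $\mathcal{M}$. Let $\Delta_\Omega$ be the modular operator for $\Omega$ and $\sigma_t^\omega$ be the corresponding modular automorphism group
$$\sigma_t^\omega(x) = \Delta_\Omega^{it} x \Delta_\Omega^{-it},$$
where $\omega$ indicates the positive linear functional
$$\omega(x) = (\Omega, x\Omega) \quad (x \in \mathcal{M}).$$
For $h = h^* \in \mathcal{M}$, the perturbed vector $\Omega^h$ is given by
$$\Omega^h \equiv \sum_{m=0}^{\infty} \int_0^{1/2} dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{m-1}} dt_m\, \Delta_\varphi^{t_m}\pi_\varphi(h)\Delta_\varphi^{t_{m-1}-t_m}\pi_\varphi(h) \cdots \Delta_\varphi^{t_1-t_2}\pi_\varphi(h)\Omega$$
$$= \operatorname{Exp}_r\Bigl(\int_0^{1/2};\ \Delta_\varphi^{t}\pi_\varphi(h)\Delta_\varphi^{-t}\,dt\Bigr)\Omega, \tag{7.4}$$
where the sum is known to converge absolutely ([2]). The notation $\operatorname{Exp}_r$ is taken from [3]. The positive linear functional $\omega^h$ on $\mathcal{M}$ is defined by
$$\omega^h(x) \equiv (\Omega^h, x\Omega^h) \quad (x \in \mathcal{M}). \tag{7.5}$$
The vector $\Omega^h$ defined above is cyclic and separating for $\mathcal{M}$. Its modular automorphism group $\sigma_t^{\omega^h}$ of $\mathcal{M}$ coincides with $(\sigma_t^\omega)^h$, i.e. the perturbed dynamics of $(\sigma_t^\omega, \mathcal{M})$ by $h$. $\Omega^h$ is in the natural positive cone of $(\Omega, \mathcal{M})$ (see e.g. [43] and [17]) for any self-adjoint element $h \in \mathcal{M}$ and satisfies
$$(\Omega^{h_1})^{h_2} = \Omega^{h_1 + h_2} \tag{7.6}$$
for any self-adjoint elements $h_1, h_2 \in \mathcal{M}$. We have
$$(\omega^{h_1})^{h_2} = \omega^{h_1 + h_2}, \qquad \sigma_t^{\{\omega^{(h_1 + h_2)}\}} \bigl(= (\sigma_t^\omega)^{(h_1 + h_2)}\bigr) = \{(\sigma_t^\omega)^{h_1}\}^{h_2}, \tag{7.7}$$
where $\{(\sigma_t^\omega)^{h_1}\}^{h_2}$ indicates the dynamics which is given by the successive perturbations first by $h_1$ and then by $h_2$. We denote the normalization of $\omega^h$ by $[\omega^h]$:
$$[\omega^h] = \omega^h(1)^{-1}\omega^h = \omega^{(h - \{\log \omega^h(1)\}1)}. \tag{7.8}$$
We use the following estimates (Theorem 2 of [4]) and a formula (e.g. (3.5) of [7] and Theorem 3.10 of [9]) later:
$$\|\Omega^h\| \leq \exp\bigl(\tfrac{1}{2}\|h\|\bigr), \qquad \log \omega^h(1) \leq \|h\|, \tag{7.9}$$
$$S(\varphi^h, \varphi) = -\varphi(h). \tag{7.10}$$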
7.2. Surface energy

Let us consider $\Phi \in \mathcal{P}$. For any finite subset $I$ of $\mathbb{Z}^\nu$, we define
$$W(I) \equiv H(I) - U(I). \tag{7.11}$$
By (5.21), (5.22) and (5.23), the expression for $W(I)$ in terms of the potential is given as follows:
$$W(I) = \sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \cap I^c \neq \emptyset\} \tag{7.12}$$
$$= \lim_{J \nearrow \mathbb{Z}^\nu} \Bigl(\sum_K \{\Phi(K);\ K \cap I \neq \emptyset,\ K \cap I^c \neq \emptyset,\ K \subset J\}\Bigr).$$
$W(I)$ is the sum of all (interaction) potentials between the inside and the outside of $I$ by definition, and will be called the surface energy.

7.3. Gibbs condition

We are now in a position to introduce our Gibbs condition for a state $\varphi$ of $\mathcal{A}$ for a given $\delta \in \Delta(\mathcal{A}_\circ)$. We use the following notation in its definition below. As in Sec. 6.1, $\{\mathcal{H}_\varphi, \pi_\varphi, \Omega_\varphi\}$ is the GNS triplet for $\varphi$. The normal extension of $\varphi$ to the weak closure $\mathcal{M}\,(= \pi_\varphi(\mathcal{A})'')$ is denoted by the same letter $\varphi$:
$$\varphi(x) = (\Omega_\varphi, x\Omega_\varphi) \quad (x \in \mathcal{M}), \qquad \varphi(\pi_\varphi(a)) = \varphi(a) \quad (a \in \mathcal{A}).$$
Let $\Phi(I)$, $H(I)$, $U(I)$ and $W(I)$ be those uniquely associated with $\delta$. The following operators will be used for perturbations of dynamics and states:
$$\hat{h} = \pi_\varphi(\beta H(I)), \qquad \hat{u} = \pi_\varphi(\beta U(I)), \qquad \hat{w} = \pi_\varphi(\beta W(I)). \tag{7.13}$$

Definition 7.1. For $\delta \in \Delta(\mathcal{A}_\circ)$, a state $\varphi$ of $\mathcal{A}$ is said to satisfy the $(\delta, \beta)$-Gibbs condition, or alternatively the $(\Phi, \beta)$-Gibbs condition, if the following two conditions are satisfied.
(D-1) $\varphi$ is a modular state. (See Definition 6.2.)
(D-2) For each finite subset $I$ of $\mathbb{Z}^\nu$, $\sigma_t^{\varphi^{\hat{w}}}$ satisfies
$$\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = \pi_\varphi(e^{-i\beta U(I)t} A e^{i\beta U(I)t})$$
for all $A \in \mathcal{A}(I)$.

The condition (D-2) is equivalent to the following condition (D-2)′ as shown in the subsequent Lemma and hence we may define the $(\delta, \beta)$-Gibbs condition by (D-1) and (D-2)′.
(D-2)′ For each finite subset $I$ of $\mathbb{Z}^\nu$ and $A \in \mathcal{A}(I)$, $\pi_\varphi(A)$ is $\sigma_t^{\varphi^{\hat{h}}}$-invariant, namely, $\pi_\varphi(\mathcal{A}(I))$ is in the centralizer of the positive linear functional $\varphi^{\hat{h}}$.
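Lemma 7.2. The conditions (D-2) and (D-2)′ are equivalent.

Proof. First assume (D-2). Since $\hat{h} = \hat{w} + \hat{u}$, we have $\varphi^{\hat{h}} = (\varphi^{\hat{w}})^{\hat{u}}$ and hence
$$\sigma_t^{\varphi^{\hat{h}}} = \{(\sigma_t^\varphi)^{\hat{w}}\}^{\hat{u}} = (\sigma_t^{\varphi^{\hat{w}}})^{\hat{u}}.$$
Since $e^{-i\beta U(I)t}\, U(I)\, e^{i\beta U(I)t} = U(I)$, $\pi_\varphi(U(I))$ is invariant under $\sigma_t^{\varphi^{\hat{w}}}$ by (D-2). Then the unitary cocycle bridging $\sigma_t^{\varphi^{\hat{h}}}$ and $\sigma_t^{\varphi^{\hat{w}}}$ becomes $e^{i\hat{u}t}$. Hence
$$\sigma_t^{\varphi^{\hat{h}}} = \operatorname{Ad}(e^{i\hat{u}t}) \circ \sigma_t^{\varphi^{\hat{w}}}.$$
Therefore, for $\pi_\varphi(A)$, $A \in \mathcal{A}(I)$, we have
$$\sigma_t^{\varphi^{\hat{h}}}(\pi_\varphi(A)) = e^{i\hat{u}t}\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A))e^{-i\hat{u}t} = \pi_\varphi\bigl(\operatorname{Ad}(e^{i\beta U(I)t}) \circ \operatorname{Ad}(e^{-i\beta U(I)t})(A)\bigr) = \pi_\varphi(A).$$
Thus (D-2)′ is satisfied.

We show the converse. Assume (D-2)′. Since $\hat{w} = \hat{h} - \hat{u}$, $\sigma_t^{\varphi^{\hat{w}}}$ is the perturbed dynamics of $\sigma_t^{\varphi^{\hat{h}}}$ by $-\hat{u}$. Since $\hat{u} = \pi_\varphi(\beta U(I))$ with $U(I) \in \mathcal{A}(I)$ is $\sigma_t^{\varphi^{\hat{h}}}$-invariant (being in the centralizer), the corresponding unitary cocycle is $e^{-i\hat{u}t}$. Hence, for $\pi_\varphi(A)$, $A \in \mathcal{A}(I)$, we have
$$\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = e^{-i\hat{u}t}\sigma_t^{\varphi^{\hat{h}}}(\pi_\varphi(A))e^{+i\hat{u}t} = e^{-i\beta\pi_\varphi(U(I))t}\pi_\varphi(A)e^{i\beta\pi_\varphi(U(I))t} = \pi_\varphi(e^{-i\beta U(I)t} A e^{i\beta U(I)t}),$$
and (D-2) is derived.

We introduce the local Gibbs state.

Definition 7.3. For finite $I$, the local Gibbs state of $\mathcal{A}(I)$ (or local Gibbs state for $I$) with respect to $(\delta, \beta)$ is given by
$$\varphi_I^c(A) \equiv \frac{\tau(e^{-\beta U(I)}A)}{\tau(e^{-\beta U(I)})}, \qquad A \in \mathcal{A}(I). \tag{7.14}$$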
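Formula (7.14) can be evaluated by hand for a single lattice site, where $\mathcal{A}(I)$ is the algebra of 2×2 matrices. The sketch below is an illustration under assumptions made here and not taken from the paper: the internal energy is chosen as U = ε(a*a − 1/2), which is self-adjoint, even and satisfies τ(U) = 0 as required by (5.11), and the names `trace_state`, `local_gibbs` and the values of ε and β are hypothetical. The resulting occupation number is compared with the Fermi function 1/(exp(βε) + 1).

```python
import math

def trace_state(m):
    """Normalized trace tau on 2x2 matrices."""
    return (m[0][0] + m[1][1]) / 2

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def local_gibbs(a, u_diag, beta):
    """Formula (7.14) for diagonal U: tau(exp(-beta U) A) / tau(exp(-beta U))."""
    boltz = [[math.exp(-beta * u_diag[0]), 0.0],
             [0.0, math.exp(-beta * u_diag[1])]]
    return trace_state(matmul(boltz, a)) / trace_state(boltz)

eps, beta = 1.3, 0.7                      # hypothetical one-site energy and inverse temperature
# Basis: index 0 = empty site, index 1 = occupied site.
number = [[0.0, 0.0], [0.0, 1.0]]         # a* a
u_diag = [-eps / 2, eps / 2]              # U = eps (a* a - 1/2), so tau(U) = 0
occupation = local_gibbs(number, u_diag, beta)
fermi = 1.0 / (math.exp(beta * eps) + 1.0)
print(occupation, fermi)
```

At β = 0 the same function reduces to the tracial state, giving τ(a*a) = 1/2.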
Corollary 7.4. If $\varphi$ satisfies the $(\delta, \beta)$-Gibbs condition, then the restriction of $\varphi^{\hat{h}}$ to $\mathcal{A}(I)$ is $\varphi^{\hat{h}}(1)$ times the tracial state $\tau$ and that of $\varphi^{\hat{w}}$ is $\varphi^{\hat{w}}(1)$ times the local Gibbs state $\varphi_I^c$ given by (7.14).

Proof. Since $\varphi^{\hat{h}}$ has the tracial property for $\mathcal{A}(I)$ by (D-2)′, its restriction to $\mathcal{A}(I)$ must be $\varphi^{\hat{h}}(1)$ times the unique tracial state $\tau$.
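Since the inner automorphism group
$$\alpha_t^I \equiv \operatorname{Ad}(e^{-i\beta U(I)t}) \tag{7.15}$$
leaves $\mathcal{A}(I)$ invariant and has the same action on $\mathcal{A}(I)$ as the modular automorphism of $\varphi^{\hat{w}}|_{\mathcal{A}(I)}$ (the restriction of $\varphi^{\hat{w}}$ to $\mathcal{A}(I)$), $\varphi^{\hat{w}}|_{\mathcal{A}(I)}$ satisfies the $(\alpha_t^I, -1)$-KMS condition and hence must be $\varphi^{\hat{w}}(1)$ times the unique KMS state given by the local Gibbs state $\varphi_I^c$.

7.4. Equivalence to KMS condition

Theorem 7.5. Let $\alpha_t$ be dynamics of $\mathcal{A}$ satisfying conditions (I) and (II) and $\delta$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_\circ$. Then any $(\alpha_t, \beta)$-KMS state $\varphi$ of $\mathcal{A}$ satisfies the $(\delta, \beta)$-Gibbs condition.

Proof. As already indicated, it is known that the KMS condition implies (D-1). It remains to show (D-2). We have
$$(d/ds)\bigl(\sigma_s^{\varphi^{\hat{w}}}(x) - \sigma_s^{\varphi}(x)\bigr)\big|_{s=0} = i[\hat{w}, x]$$
for $x \in \mathcal{M}$. By the group property of the automorphisms,
$$(d/dt)\sigma_t^{\varphi^{\hat{w}}}(x) = \sigma_t^{\varphi^{\hat{w}}}\{(d/ds)\sigma_s^{\varphi^{\hat{w}}}(x)|_{s=0}\}$$
for $x$ in the domain of the generator of $\sigma_t^{\varphi^{\hat{w}}}$. For the same $x$, we have
$$(d/dt)\sigma_t^{\varphi^{\hat{w}}}(x) = \sigma_t^{\varphi^{\hat{w}}}\{(d/ds)\sigma_s^{\varphi}(x)|_{s=0} + i[\hat{w}, x]\}.$$
The KMS condition implies that
$$\sigma_s^{\varphi}(\pi_\varphi(A)) = \pi_\varphi(\alpha_{-\beta s}(A)), \qquad A \in \mathcal{A}.$$
Therefore, if $A \in \mathcal{A}$ is in the domain of the generator of $\alpha_t$, we have
$$(d/dt)\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = \sigma_t^{\varphi^{\hat{w}}}\{(d/ds)(\pi_\varphi\{\alpha_{-\beta s}(A)\})|_{s=0}\} + \sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi\{[i\beta W(I), A]\}).$$
Now we take $A \in \mathcal{A}(I)$. By (H-iii),
$$(d/dt)\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = \sigma_t^{\varphi^{\hat{w}}}(-i\beta\pi_\varphi\{[H(I), A]\}) + \sigma_t^{\varphi^{\hat{w}}}(i\beta\pi_\varphi\{[W(I), A]\}) = -i\beta\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi\{[U(I), A]\}).$$
For $A \in \mathcal{A}(I)$, $e^{i\beta U(I)t} A e^{-i\beta U(I)t} \in \mathcal{A}(I)$, and we have
$$(d/dt)\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi\{e^{i\beta U(I)t} A e^{-i\beta U(I)t}\})$$
$$= \sigma_t^{\varphi^{\hat{w}}}\{(d/ds)\sigma_s^{\varphi^{\hat{w}}}(\pi_\varphi\{e^{i\beta U(I)(t+s)} A e^{-i\beta U(I)(t+s)}\})|_{s=0}\}$$
$$= \sigma_t^{\varphi^{\hat{w}}}\bigl(-i\beta\pi_\varphi\{[U(I), e^{i\beta U(I)t} A e^{-i\beta U(I)t}]\} + \pi_\varphi\{d/ds(e^{i\beta U(I)(t+s)} A e^{-i\beta U(I)(t+s)})|_{s=0}\}\bigr) = 0.$$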
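This implies that
$$\sigma_t^{\varphi^{\hat{w}}}(\pi_\varphi\{e^{i\beta U(I)t} A e^{-i\beta U(I)t}\})$$
is a constant function of $t$ and hence equals its value at $t = 0$, which is $\pi_\varphi(A)$. Thus
$$\sigma_{-t}^{\varphi^{\hat{w}}}(\pi_\varphi(A)) = \pi_\varphi(e^{i\beta U(I)t} A e^{-i\beta U(I)t})$$
and (D-2) is shown.

To show the converse, we need the assumption (III) for the dynamics $\alpha_t$.

Theorem 7.6. Let $\alpha_t$ be a dynamics of $\mathcal{A}$ satisfying the conditions (I), (II) and (III). Let $\delta$ be the restriction of its generator $\delta_\alpha$ to $\mathcal{A}_\circ$. Then any $(\delta, \beta)$-Gibbs state $\varphi$ of $\mathcal{A}$ satisfies the $(\alpha_t, \beta)$-KMS condition.

Proof. We use (D-2)′. It says that
$$(d/dt)\sigma_t^{\varphi^{\hat{h}}}(\pi_\varphi(A)) = 0$$
for all $A \in \mathcal{A}(I)$. By the group property of the automorphism,
$$(d/dt)\sigma_t^{\varphi}(x) = \sigma_t^{\varphi}\{(d/ds)\sigma_s^{\varphi}(x)|_{s=0}\}.$$
For any $A \in \mathcal{A}_\circ$, there exists a finite subset $I$ such that $A \in \mathcal{A}(I)$. Since $\varphi = (\varphi^{\hat{h}})^{-\hat{h}}$, we have
$$(d/dt)\sigma_t^{\varphi}(\pi_\varphi(A)) = \sigma_t^{\varphi}\{(d/ds)\sigma_s^{\varphi^{\hat{h}}}(\pi_\varphi(A))|_{s=0} - [i\hat{h}, \pi_\varphi(A)]\}$$
$$= \sigma_t^{\varphi}(-i\beta\pi_\varphi([H(I), A])) = -\beta\sigma_t^{\varphi}(\pi_\varphi(\delta A)). \tag{7.16}$$
We note that for any $A \in \mathcal{A}$
$$\sigma_t^{\varphi}(\pi_\varphi(A)) = \Delta_\varphi^{it}\pi_\varphi(A)\Delta_\varphi^{-it}, \qquad \Delta_\varphi\Omega_\varphi = \Omega_\varphi.$$
By applying (7.16) on $\Omega_\varphi$ and setting $t = 0$, we conclude that $\pi_\varphi(A)\Omega_\varphi$ is in the domain of $\log\Delta_\varphi$ and
$$i(\log\Delta_\varphi)\pi_\varphi(A)\Omega_\varphi = -\beta\pi_\varphi(\delta(A))\Omega_\varphi \tag{7.17}$$
for all $A \in \mathcal{A}_\circ$.

By Assumption (III), for every $A \in D(\delta_\alpha)$, there exists a sequence $\{A_n\}$, $A_n \in \mathcal{A}_\circ$, such that $\{A_n\}$ and $\{\delta A_n\,(= \delta_\alpha A_n)\}$ converge to $A$ and $\delta_\alpha A\,(= \bar{\delta}A)$, respectively, in the norm topology of $\mathcal{A}$. Since $\log\Delta_\varphi$ is a (self-adjoint) closed operator, $\pi_\varphi(A)\Omega_\varphi$ must be in the domain of $\log\Delta_\varphi$ and (7.17) holds for any $A \in D(\delta_\alpha)$. For $A \in D(\delta_\alpha)$ and $t \in \mathbb{R}$, we set
$$\xi_t \equiv \sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\})\Omega_\varphi = \Delta_\varphi^{it}\pi_\varphi(\alpha_{\beta t}(A))\Omega_\varphi.$$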
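For $A \in D(\delta_\alpha)$, $\alpha_t(A)$ is in $D(\delta_\alpha)$ for any $t \in \mathbb{R}$. Therefore, we can substitute $\alpha_{\beta t}(A)$ into $A$ of (7.17) and obtain
$$(d/dt)\xi_t = \Delta_\varphi^{it}\{(d/ds)\Delta_\varphi^{is}\pi_\varphi\{\alpha_{\beta t}(A)\}\Omega_\varphi|_{s=0}\} + \Delta_\varphi^{it}\bigl((d/dt)\pi_\varphi\{\alpha_{\beta t}(A)\}\Omega_\varphi\bigr)$$
$$= \Delta_\varphi^{it}\{-\beta\pi_\varphi\{\delta(\alpha_{\beta t}(A))\}\Omega_\varphi + \pi_\varphi\{\beta\delta(\alpha_{\beta t}(A))\}\Omega_\varphi\} = 0.$$
Therefore, we have $\xi_t = \xi_0$ and
$$\sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\})\Omega_\varphi = \pi_\varphi(A)\Omega_\varphi.$$
Since $\Omega_\varphi$ is separating for $\mathcal{M}_\varphi$, we obtain
$$\sigma_t^{\varphi}(\pi_\varphi\{\alpha_{\beta t}(A)\}) = \pi_\varphi(A).$$
This implies
$$\pi_\varphi\{\alpha_{\beta t}(A)\} = \sigma_{-t}^{\varphi}(\pi_\varphi(A)).$$
Since $D(\delta_\alpha)\,(\supset \mathcal{A}_\circ)$ is norm dense in $\mathcal{A}$, we have
$$\pi_\varphi\{\alpha_{-\beta t}(A)\} = \sigma_t^{\varphi}(\pi_\varphi(A))$$
for every $A \in \mathcal{A}$. Since $\varphi$ satisfies the $(\sigma_t^{\varphi}, -1)$-KMS condition as a state of $\mathcal{M}_\varphi$, we obtain the $(\alpha_t, \beta)$-KMS condition for $\varphi$.

7.5. Product form of the Gibbs condition

In the case of quantum spin lattice systems, for any region $I \subset \mathbb{Z}^\nu$, $\mathcal{A} = \mathcal{A}(I) \otimes \mathcal{A}(I^c)$. In this situation, the Gibbs condition implies that $\varphi^{\hat{w}}\,(= \varphi^{\pi_\varphi(\beta W(I))})$ is a product of the local Gibbs state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$, or equivalently $\varphi^{\hat{h}}\,(= \varphi^{\pi_\varphi(\beta H(I))})$ is a product of the tracial state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$ for any finite region $I$ [5].

However, this product property for $\varphi^{\hat{w}}$ and $\varphi^{\hat{h}}$ for the present Fermion case does not seem to be automatic in general. We show that such a product property holds if and only if the Gibbs state $\varphi$ is $\Theta$-even, where the product property refers to the validity of the formula
$$\psi(AB) = \psi(A)\psi(B)/\psi(1), \qquad A \in \mathcal{A}(I),\ B \in \mathcal{A}(I^c) \tag{7.18}$$
for $\psi = \varphi^{\hat{h}}$ and for $\psi = \varphi^{\hat{w}}$.

Proposition 7.7. Assume the conditions (I) and (II) for the dynamics. Let $I$ be a non-empty finite subset of $\mathbb{Z}^\nu$. If $\varphi$ satisfies the Gibbs condition, then $\varphi^{\pi_\varphi(\beta W(I))}$ has the product property (7.18) if and only if $\varphi$ is $\Theta$-even. The same is true for $\varphi^{\pi_\varphi(\beta H(I))}$.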
Proof. First assume that $\varphi$ is even. It follows from the Gibbs condition that $\mathcal{A}(I)$ is in the centralizer of $\varphi^{\hat{h}}$ and the restriction of $\varphi^{\hat{h}}$ to $\mathcal{A}(I)$ is tracial. We will show
$$\varphi^{\hat{h}}([A_1, A_2]B) = 0 \tag{7.19}$$
for any $A_1, A_2 \in \mathcal{A}(I)$ and any $B \in \mathcal{A}(I^c)$. It is enough to show this for all combinations of even and odd $A_1$, $A_2$ and $B$ because the general case follows from these cases by linearity.

Since $A_1$ and $A_2$ are in the centralizer of $\varphi^{\hat{h}}$, we have
$$\varphi^{\hat{h}}(A_1A_2B) = \varphi^{\hat{h}}(A_2BA_1), \qquad \varphi^{\hat{h}}(A_2A_1B) = \varphi^{\hat{h}}(A_1BA_2).$$
If one or more of $A_1$, $A_2$, $B$ is even, then $BA_1 = A_1B$ or $BA_2 = A_2B$ holds. Hence (7.19) follows for this case.

The remaining case is when $A_1$, $A_2$, $B$ are all odd. We now show that $\varphi^{\hat{h}}$ is even so that (7.19) holds in this case. Since $\varphi$ is assumed to be even at this part of the proof, $\Theta$ leaves $\varphi$ invariant and hence there exists an involutive unitary $U_\Theta$ on the GNS representation space $\mathcal{H}_\varphi$ of $\varphi$, satisfying
$$U_\Theta\pi_\varphi(A)U_\Theta^* = \pi_\varphi(\Theta(A)) \quad (A \in \mathcal{A}), \tag{7.20}$$
$$U_\Theta\Omega_\varphi = \Omega_\varphi. \tag{7.21}$$
Since $H(I)$ is even by assumption, it follows from the commutativity of $U_\Theta$ with $\Delta_\varphi$ [42] and the above equations (7.20), (7.21) that the perturbed vector $\Omega_\varphi^{\hat{h}}$ is $U_\Theta$-invariant. Therefore $\varphi^{\hat{h}}$ is even, since it is the vector functional by $\Omega_\varphi^{\hat{h}}$. Hence $\varphi^{\hat{h}}$ vanishes on every odd element and (7.19) is satisfied if $A_1$, $A_2$ and $B$ are all odd. Now (7.19) is proved for all the cases.

Since $\mathcal{A}(I)$ is a $2^{|I|} \times 2^{|I|}$ full matrix algebra, any element $A \in \mathcal{A}(I)$ can be written as
$$A = \tau(A) + \sum_j [A_{j1}, A_{j2}]$$
for some $A_{j1}, A_{j2} \in \mathcal{A}(I)$. Hence (7.19) implies
$$\varphi^{\hat{h}}(AB) = \tau(A)\varphi^{\hat{h}}(B) \tag{7.22}$$
for any $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$. This means that $\varphi^{\hat{h}}$ has the form of the product of $\tau$ of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I^c)$.

Since $U(I)$ is in the centralizer of $\varphi^{\hat{h}}$, we have
$$\varphi^{\hat{w}} = \{\varphi^{\hat{h}}\}^{-\hat{u}} = \varphi^{\hat{h}} \cdot e^{-\hat{u}}.$$
Hence, for any $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$,
$$\varphi^{\hat{w}}(AB) = \tau(e^{-\hat{u}})\varphi_I^c(A)\varphi^{\hat{h}}(B).$$
By setting $A = 1$, we have
$$\varphi^{\hat{w}}(B) = \tau(e^{-\hat{u}})\varphi^{\hat{h}}(B).$$
Therefore,
$$\varphi^{\hat{w}}(AB) = \varphi_I^c(A)\varphi^{\hat{w}}(B). \tag{7.23}$$
Hence we have the desired product property of $\varphi^{\hat{w}}$.

We now prove the converse, starting from the assumption that $\varphi^{\hat{h}}$ has a product form (7.18). We note that
$$\tau(a_ia_i^*) = \tau(a_i^*a_i) = \tfrac{1}{2}\tau(a_ia_i^* + a_i^*a_i) = \tfrac{1}{2}\tau(1) = \tfrac{1}{2}$$
due to CAR. On the other hand, $a_i$ anticommutes with any odd element $B$ in $\mathcal{A}(I^c)$ and hence
$$\varphi^{\hat{h}}(a_ia_i^*B) = \varphi^{\hat{h}}(a_i^*Ba_i) = -\varphi^{\hat{h}}(a_i^*a_iB), \tag{7.24}$$
where the first equality follows because $a_i$ is in the centralizer of $\varphi^{\hat{h}}$ due to the Gibbs condition. By the product form assumption,
$$\varphi^{\hat{h}}(AB) = \varphi^{\hat{h}}(A)\varphi^{\hat{h}}(B)/\varphi^{\hat{h}}(1)$$
for $A \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I^c)$. Since $A$ is in the centralizer, $\varphi^{\hat{h}}(A)/\varphi^{\hat{h}}(1) = \tau(A)$ for the unique tracial state $\tau$ of $\mathcal{A}(I)$. Hence
$$\varphi^{\hat{h}}(a_ia_i^*B) = \tau(a_ia_i^*)\varphi^{\hat{h}}(B) = \tfrac{1}{2}\varphi^{\hat{h}}(B), \qquad \varphi^{\hat{h}}(a_i^*a_iB) = \tau(a_i^*a_i)\varphi^{\hat{h}}(B) = \tfrac{1}{2}\varphi^{\hat{h}}(B). \tag{7.25}$$
From (7.24) and (7.25), we obtain
$$\varphi^{\hat{h}}(B) = 0 \tag{7.26}$$
for any $B \in \mathcal{A}(I^c)_-$. Since $\mathcal{A}_- = \mathcal{A}(I)_+\mathcal{A}(I^c)_- + \mathcal{A}(I)_-\mathcal{A}(I^c)_+$ for a finite $I$, $\varphi^{\hat{h}}$ vanishes on odd elements of $\mathcal{A}$. We conclude that $\varphi^{\hat{h}}$ is even. This implies that $\varphi$ is also even by the same argument as in the first part of this proof due to $\varphi = \{\varphi^{\hat{h}}\}^{-\hat{h}}$.

Remark. By the above Proposition, we have already shown that if a Gibbs state $\varphi$ satisfies the condition that $\varphi^{\pi_\varphi(\beta W(I))}$ has the product property (7.18) for the pair $(\mathcal{A}(I), \mathcal{A}(I^c))$ for one non-empty finite $I$, then $\varphi$ has this product property for every finite subset $I$.

In connection with Proposition 7.7, if $\mathcal{A}(I^c)$ is replaced by the commutant algebra $\mathcal{A}(I)'$ in the product property (7.18), then $\varphi^{\hat{w}}$ is a product of the local Gibbs state of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I)'$ for every finite region $I$ irrespective of
whether $\varphi$ is even or not, as is shown in the following corollary. This situation is much the same as in quantum spin lattice systems.
Corollary 7.8. Assume the conditions (I) and (II) for the dynamics. Let $\varphi$ be a modular state. The state $\varphi$ satisfies the Gibbs condition if and only if the perturbed functional $\varphi^{\hat w}$ is a product of the local Gibbs state $\varphi^c_I$ of $\mathcal{A}(I)$ and its restriction to $\mathcal{A}(I)'$ for every finite $I$.
Proof. For a finite $I$, $\mathcal{A}(I)$ is a full matrix algebra and hence $\mathcal{A}$ is an (algebraic) tensor product of $\mathcal{A}(I)$ and $\mathcal{A}(I)'$. If $\varphi^{\hat w}$ has the product property described above, then the GNS representation of $\mathcal{A}$ associated with $\varphi^{\hat w}$ is the tensor product of those for $(\mathcal{A}(I), \varphi^c_I)$ and $(\mathcal{A}(I)', \psi)$, where $\psi = \varphi^{\hat w}|_{\mathcal{A}(I)'}$. Therefore the product of the modular automorphisms for these two pairs satisfies the KMS condition (with $\beta = -1$) for $(\mathcal{A}, \varphi^{\hat w})$ and must be the modular operator for $(\mathcal{A}, \varphi^{\hat w})$. In particular, the restriction of the modular automorphisms of $(\mathcal{A}, \varphi^{\hat w})$ to $\mathcal{A}(I)$ coincides with the modular automorphisms $\alpha^I_t$ $(= \mathrm{Ad}(e^{-i\beta U(I)t}))$ for $(\mathcal{A}(I), \varphi^c_I)$. Hence the Gibbs condition is satisfied.
Conversely, assume that the Gibbs condition is satisfied for $\varphi$. By the elementwise commutativity of $\mathcal{A}(I)$ and $\mathcal{A}(I)'$, we can show (7.19) in Proposition 7.7 directly in this case for any $A_1, A_2 \in \mathcal{A}(I)$ and $B \in \mathcal{A}(I)'$, skipping the previous discussion about even and odd elements. The arguments showing (7.22) and (7.23) are still valid after we replace $\mathcal{A}(I^c)$ by $\mathcal{A}(I)'$.

8. Translation Invariant Dynamics

8.1. Translation invariance and covariance

From now on, we need the following assumption for the dynamics $\alpha_t$ for the most part of our theory.
(IV) $\alpha_t \tau_k = \tau_k \alpha_t$ for all $t \in \mathbb{R}$ and $k \in \mathbb{Z}^\nu$.
If (IV) holds, $\alpha_t$ is said to be translation invariant. This assumption implies our earlier assumption (I) due to the following Proposition, which we owe to a referee.
Proposition 8.1. Any automorphism $\alpha$ commuting with the lattice translations $\tau_k$, $k \in \mathbb{Z}^\nu$, must commute with $\Theta$.
For its proof, we need the following Lemma.
Lemma 8.2. An element $x \in \mathcal{A}$ is $\Theta$-even if and only if the following asymptotically central property holds:
\[ \lim_{k \to \infty} \| [\tau_k(x), y] \| = 0 \quad \text{for all } y \in \mathcal{A} . \tag{8.1} \]
Proof. If $x \in (\mathcal{A}_\circ)_+$ and $y \in \mathcal{A}_\circ$, then $[\tau_k(x), y] = 0$ for sufficiently large $k$. By the density of $(\mathcal{A}_\circ)_+$ in $\mathcal{A}_+$ and of $\mathcal{A}_\circ$ in $\mathcal{A}$, we obtain (8.1) for $x \in \mathcal{A}_+$ and $y \in \mathcal{A}$.
In the converse direction, consider a general $x \in \mathcal{A}$ and define $x_\pm = \frac{1}{2}(x \pm \Theta(x)) \in \mathcal{A}_\pm$. Due to the validity of (8.1) for $x_+$, which has just been shown, we have
\[ \lim_{k \to \infty} \| [\tau_k(x), y] \| = \lim_{k \to \infty} \| [\tau_k(x_-), y] \| . \]
Take a unitary $y \in \mathcal{A}_-$ (e.g., $a_i + a_i^*$). Then
\[ \| [\tau_k(x_-), y] \| = 2 \| \tau_k(x_-) y \| = 2 \| x_- \| . \]
Hence (8.1) for $x$ implies $x_- = 0$, namely $x \in \mathcal{A}_+$.
Proof of Proposition 8.1. Due to $\tau_k \alpha = \alpha \tau_k$, we have
\[ \| [\tau_k(\alpha(x)), \alpha(y)] \| = \| \alpha\{[\tau_k(x), y]\} \| = \| [\tau_k(x), y] \| . \]
Hence $\alpha(x) \in \mathcal{A}_+$ if and only if $x \in \mathcal{A}_+$ by Lemma 8.2. Let
\[ E_+ \equiv \frac{1}{2}(\mathrm{id} + \Theta) . \tag{8.2} \]
It is the conditional expectation from $\mathcal{A}$ onto $\mathcal{A}_+$, characterized by $E_+(x) \in \mathcal{A}_+$ for all $x \in \mathcal{A}$ and $\tau(xy) = \tau(E_+(x)y)$ for all $x \in \mathcal{A}$ and $y \in \mathcal{A}_+$. Then $\alpha(\alpha^{-1}(y)) = y \in \mathcal{A}_+$ implies $\alpha^{-1}(y) \in \mathcal{A}_+$ and
\[ \begin{aligned} \tau(E_+(\alpha(x))y) &= \tau(\alpha(x)y) = \tau(\alpha(x\alpha^{-1}(y))) = \tau(x\alpha^{-1}(y)) \\ &= \tau(E_+(x)\alpha^{-1}(y)) = \tau(\alpha^{-1}\{\alpha(E_+(x))y\}) = \tau(\alpha(E_+(x))y) , \end{aligned} \]
where we have used $\alpha^{-1}(y) \in \mathcal{A}_+$ in the fourth equality. Since $E_+(\alpha(x)) \in \mathcal{A}_+$ and $\alpha(E_+(x)) \in \mathcal{A}_+$ (due to $E_+(x) \in \mathcal{A}_+$), we have $E_+(\alpha(x)) = \alpha(E_+(x))$. Therefore $E_+ \alpha = \alpha E_+$ and $\alpha$ commutes with $\Theta$.
Remark. A referee pointed out the following approach (which we have not adopted). Under assumption (IV), any $\alpha_t|_{\mathcal{A}_+}$-KMS state of $\mathcal{A}_+$ has a unique even extension to an $\alpha_t$-KMS state of $\mathcal{A}$ (e.g. by [11]). This allows one to reduce the analysis of KMS states to the case of an asymptotically abelian system due to (8.1).
The dynamics $\alpha_t$ is translation invariant if and only if its generator $\delta_\alpha$ commutes with every $\tau_k$ ($k \in \mathbb{Z}^\nu$). (This statement includes the $\tau_k$-invariance of the domain of the generator.) The corresponding standard potential (which exists under the assumptions (I) and (II)) satisfies the following translation covariance condition:
($\Phi$-f) $\tau_k \Phi(I) = \Phi(I + k)$ for all finite subsets $I$ of $\mathbb{Z}^\nu$ and all $k \in \mathbb{Z}^\nu$.
Such a potential will be said to be translation covariant. We consider the set $\mathcal{P}_\tau$ of all translation covariant potentials in $\mathcal{P}$. Namely, $\mathcal{P}_\tau$ is defined to be the set of all $\Phi$ satisfying all conditions of Definition 5.10, i.e. ($\Phi$-a,b,c,d,e), and the translation covariance ($\Phi$-f).
We make $\mathcal{P}_\tau$ a real vector space as a function space on the set of finite subsets of $\mathbb{Z}^\nu$ by the linear operation given in (5.24). In the same way, we define $\mathcal{H}_\tau$ to be the subspace of $\mathcal{H}$ such that each element $H$ satisfies the following translation covariance condition:
(H-vi) $\tau_k(H(I)) = H(I + k)$ for all $k \in \mathbb{Z}^\nu$.
We denote the set of all translation invariant derivations in $\Delta(\mathcal{A}_\circ)$ by $\Delta_\tau(\mathcal{A}_\circ)$. Namely, $\Delta_\tau(\mathcal{A}_\circ)$ is the set of all $*$-derivations with $\mathcal{A}_\circ$ as their domain, commuting with $\Theta$ and also with $\tau$. The following corollaries follow directly from Theorems 5.7, 5.12 and 5.13.
Corollary 8.3. The relation (H-iii) (as given in Sec. 5.2) between $H \in \mathcal{H}_\tau$ and $\delta \in \Delta_\tau(\mathcal{A}_\circ)$ gives a bijective, real linear map from $\mathcal{H}_\tau$ to $\Delta_\tau(\mathcal{A}_\circ)$.
Corollary 8.4. The equations (5.22) and (5.23) for $\Phi \in \mathcal{P}_\tau$ and $H \in \mathcal{H}_\tau$ give a bijective, real linear map from $\mathcal{P}_\tau$ to $\mathcal{H}_\tau$.
Corollary 8.5. The equations (5.27) and (5.28) between $\Phi \in \mathcal{P}_\tau$ and $\delta_\Phi \in \Delta_\tau(\mathcal{A}_\circ)$ give a bijective, real linear map from $\mathcal{P}_\tau$ to $\Delta_\tau(\mathcal{A}_\circ)$.
For $\Phi \in \mathcal{P}_\tau$, we define
\[ \|\Phi\| \equiv \|H(\{n\})\| , \]
which is independent of $n \in \mathbb{Z}^\nu$ due to the translation covariance of $\Phi$. It defines a norm on $\mathcal{P}_\tau$. We show that this norm makes $\mathcal{P}_\tau$ a Banach space, after giving the following energy estimates.
Lemma 8.6. For $\Phi \in \mathcal{P}_\tau$, the following estimate holds:
\[ \|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I| . \tag{8.3} \]
In particular, if $\|\Phi\| = 0$, then $H = U = \Phi = 0$ (as functions of finite subsets $I$ of $\mathbb{Z}^\nu$).
Proof. For $I = \emptyset$, both sides of the above inequalities are 0. For $I = \{n_1, \ldots, n_{|I|}\}$, we obtain
\[ \begin{aligned} H(I) &= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\ &= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{i=1}^{|I|} \sum_K \{\Phi(K);\ K \ni n_i,\ K \not\ni n_1, \ldots, n_{i-1},\ K \subset J\} \\ &= \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{i=1}^{|I|} E_{\{n_1, \ldots, n_{i-1}\}^c} \sum_K \{\Phi(K);\ K \ni n_i,\ K \subset J\} \\ &= \sum_{i=1}^{|I|} E_{\{n_1, \ldots, n_{i-1}\}^c} H(\{n_i\}) , \end{aligned} \]
where the third equality comes from the identities
\[ E_{\{n_1, \ldots, n_{i-1}\}^c} \Phi(K) = \begin{cases} 0 & \text{if } \{n_1, \ldots, n_{i-1}\} \cap K \ne \emptyset, \text{ i.e. } \{n_1, \ldots, n_{i-1}\}^c \not\supset K , \\ \Phi(K) & \text{if } n_1, \ldots, n_{i-1} \notin K, \text{ i.e. } \{n_1, \ldots, n_{i-1}\}^c \supset K , \end{cases} \]
and the interchange of $\lim_{J \nearrow \mathbb{Z}^\nu}$ and $E_{\{n_1, \ldots, n_{i-1}\}^c}$ in the fourth equality is allowed due to $\|E_{\{n_1, \ldots, n_{i-1}\}^c}\| = 1$. The following estimate follows:
\[ \|H(I)\| \le \sum_{i=1}^{|I|} \| E_{\{n_1, \ldots, n_{i-1}\}^c} H(\{n_i\}) \| \le \sum_{i=1}^{|I|} \|H(\{n_i\})\| = |I| \cdot \|\Phi\| . \tag{8.4} \]
Since $U(I) = E_I(H(I))$ and $\|E_I\| = 1$, we obtain
\[ \|U(I)\| \le \|H(I)\| \le \|\Phi\| \cdot |I| . \]
If $\|\Phi\| = 0$, then $H(I) = U(I) = 0$ for all $I$ by this estimate and hence $\Phi(I) = 0$ by (5.16).
The following estimate will be used later.
Lemma 8.7. For disjoint finite subsets $I$ and $J$ of $\mathbb{Z}^\nu$,
\[ \|U(I \cup J) - U(I)\| \le \|\Phi\| \cdot |J| . \tag{8.5} \]
Proof. Due to $I \cap J = \emptyset$,
\[ U(I \cup J) - U(I) = \sum_K \{\Phi(K);\ K \cap J \ne \emptyset,\ K \subset I \cup J\} . \]
Therefore, we have
\[ U(I \cup J) - U(I) = E_{I \cup J} H(J) , \]
because $H(J)$ is the sum of $\Phi(K)$ for all $K$ satisfying $K \cap J \ne \emptyset$, and $E_{I \cup J}$ annihilates all $\Phi(K)$ for which $K$ is not contained in $I \cup J$ while it retains $\Phi(K)$ unchanged if $K$ is contained in $I \cup J$. Hence
\[ \|U(I \cup J) - U(I)\| = \|E_{I \cup J} H(J)\| \le \|H(J)\| \le \|\Phi\| \cdot |J| . \]
Proposition 8.8. $\mathcal{P}_\tau$ is a real Banach space with respect to the norm $\|\Phi\| = \|H(\{n\})\|$.
Proof. $\mathcal{P}_\tau$ is a normed space with respect to $\|\Phi\|$, because
\[ \begin{aligned} \|\Phi_1 + \Phi_2\| &= \|H_{\Phi_1 + \Phi_2}(\{n\})\| = \|H_{\Phi_1}(\{n\}) + H_{\Phi_2}(\{n\})\| \\ &\le \|H_{\Phi_1}(\{n\})\| + \|H_{\Phi_2}(\{n\})\| = \|\Phi_1\| + \|\Phi_2\| , \\ \|c\Phi\| &= \|c H_\Phi(\{n\})\| = |c|\, \|H_\Phi(\{n\})\| = |c|\, \|\Phi\| , \end{aligned} \]
for $\Phi_1, \Phi_2, \Phi \in \mathcal{P}_\tau$ and $c \in \mathbb{R}$, due to the linear dependence of $H_\Phi$ on $\Phi$, and because $\|\Phi\| = 0$ implies $\Phi(I) = 0$ for all $I$ due to Lemma 8.6 and (5.16).
We now show its completeness. Suppose $\{\Phi_n\}$ is a Cauchy sequence in $\mathcal{P}_\tau$ with respect to the norm $\|\cdot\|$. Let us denote the corresponding $H(I)$ and $U(I)$ for $\Phi_n$ by $H_n(I)$ and $U_n(I)$, respectively. The linear dependence of $H(I)$ on $\Phi$ and Lemma 8.6 imply that $\{H_n(I)\}$ is a Cauchy sequence in $\mathcal{A}$ with respect to the C$^*$-norm. Since $\mathcal{A}$ is a C$^*$-algebra, $\{H_n(I)\}$ has a unique limit in $\mathcal{A}$, which will be denoted by $H_\infty(I)$. Since $U(I) = E_I(H(I))$ with $\|E_I\| = 1$, $\{U_n(I)\}$ is also a Cauchy sequence in $\mathcal{A}$, has a unique limit $U_\infty(I)$, and $U_\infty(I) = E_I(H_\infty(I))$. For each finite subset $I$ of $\mathbb{Z}^\nu$, $\{\Phi_n(I)\}$ also converges to the potential $\Phi_\infty(I)$ for $U_\infty(I)$ in the C$^*$-norm, because $\Phi(I)$ is a finite linear combination of $U(J)$, $J \subset I$, due to (5.16), and $\{U_n(J)\}$ converges to $U_\infty(J)$ in the C$^*$-norm for every such $J$. For any finite subsets $I$, $J$ of $\mathbb{Z}^\nu$, we obtain
\[ \begin{aligned} \sum_K \{\Phi_\infty(K);\ K \cap I \ne \emptyset,\ K \subset J\} &= \sum_K \lim_n \{\Phi_n(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\ &= \lim_n \sum_K \{\Phi_n(K);\ K \cap I \ne \emptyset,\ K \subset J\} \\ &= \lim_n E_J(H_n(I)) = E_J(\lim_n H_n(I)) \\ &= E_J(H_\infty(I)) , \end{aligned} \]
where the third equality is due to (5.20). Hence, by (4.23) we have
\[ \lim_{J \nearrow \mathbb{Z}^\nu} \Big( \sum_K \{\Phi_\infty(K);\ K \cap I \ne \emptyset,\ K \subset J\} \Big) = \lim_{J \nearrow \mathbb{Z}^\nu} E_J(H_\infty(I)) = H_\infty(I) . \]
Thus $\Phi_\infty$ satisfies the condition ($\Phi$-e) in the definition of $\mathcal{P}_\tau$. The other conditions ($\Phi$-a), ($\Phi$-b), ($\Phi$-c), ($\Phi$-d), and ($\Phi$-f) are satisfied since each $\Phi_n$ satisfies them and $\lim_n \Phi_n(I) = \Phi_\infty(I)$ for every finite subset $I$ of $\mathbb{Z}^\nu$. In conclusion, we have $\Phi_\infty \in \mathcal{P}_\tau$. Finally, we have
\[ \lim_n \|\Phi_n - \Phi_\infty\| = \lim_n \|H_n(\{0\}) - H_\infty(\{0\})\| = 0 . \]
We have now shown the completeness of $\mathcal{P}_\tau$.
8.2. Finite range potentials

Definition 8.9. (1) A potential $\Phi \in \mathcal{P}_\tau$ is said to be of finite range if there exists an $r \ge 0$ such that
\[ \Phi(I) = 0 \quad \text{whenever} \quad \mathrm{diam}(I) = \max\{|i - j|;\ i, j \in I\} > r . \tag{8.6} \]
The infimum of such $r$ is called the range of $\Phi$.
(2) The subspace of $\mathcal{P}$ consisting of all potentials $\Phi \in \mathcal{P}$ of finite range is denoted by $\mathcal{P}^f$. Furthermore, we denote
\[ \mathcal{P}^f_\tau \equiv \mathcal{P}^f \cap \mathcal{P}_\tau . \tag{8.7} \]
For $a \in \mathbb{N}$, $C_a$ denotes the following cube in $\mathbb{Z}^\nu$:
\[ C_a \equiv \{x \in \mathbb{Z}^\nu;\ 0 \le x_i \le a - 1,\ i = 1, \ldots, \nu\} . \tag{8.8} \]
We introduce the following averaged conditional expectation:
\[ E_a \equiv \frac{1}{|C_a|} \sum_{i \in C_a} E_{C_a - i} , \tag{8.9} \]
where $|C_a| = a^\nu$ is the number of lattice points in $C_a$, called the volume of $C_a$. (The sum in the above equation is over all translates of $C_a$ which contain the origin $0 \in \mathbb{Z}^\nu$.)
For any finite subset $I \subset \mathbb{Z}^\nu$, $l(a, I)$ denotes the number of translates of $C_a$ containing $I$. By definition, for any $m \in \mathbb{Z}^\nu$,
\[ l(a, I) = l(a, I + m) . \tag{8.10} \]
We need the following lemma in this subsection and later.
Lemma 8.10. For a finite $I$,
\[ \lim_{a \to \infty} \frac{l(a, I)}{|C_a|} = 1 . \tag{8.11} \]
Proof. Let $d \in \mathbb{N}$ be fixed such that there exists a translate $C_d + k$ ($k \in \mathbb{Z}^\nu$) of $C_d$ containing $I$. For $a > d$, a translate of $C_a$ contains $I$ if it contains $C_d + k$. Hence $l(a, I)$ is at least the number of translates of $C_a$ which contain $C_d$, which is $(a - d + 1)^\nu$. Hence
\[ 1 \ge \frac{l(a, I)}{|C_a|} \ge \frac{(a - d + 1)^\nu}{|C_a|} = \Big( 1 - \frac{d - 1}{a} \Big)^\nu \to 1 \quad (a \to \infty) . \]
This shows (8.11).
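The counting in Lemma 8.10 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the sample set `I_example` are illustrative assumptions. It uses the observation that a cube contains $I$ exactly when it contains the bounding box of $I$, so the number of admissible shifts factorizes over the coordinate axes, and it compares the result with a brute-force enumeration and with the lower bound $(1 - (d-1)/a)^\nu$ from the proof.

```python
from itertools import product


def count_translates(a, points):
    """l(a, I): number of shifts m in Z^nu with I contained in C_a + m."""
    nu = len(points[0])
    count = 1
    for axis in range(nu):
        lo = min(p[axis] for p in points)
        hi = max(p[axis] for p in points)
        # Along this axis we need hi - (a - 1) <= m <= lo.
        count *= max(0, a - (hi - lo))
    return count


def count_translates_brute(a, points):
    """Same count by explicit enumeration of shifts in a generous window."""
    nu = len(points[0])
    windows = []
    for axis in range(nu):
        lo = min(p[axis] for p in points)
        hi = max(p[axis] for p in points)
        windows.append(range(hi - a, lo + 2))
    total = 0
    for m in product(*windows):
        if all(0 <= p[k] - m[k] <= a - 1 for p in points for k in range(nu)):
            total += 1
    return total


def ratio_and_bound(a, points):
    """Return l(a, I)/|C_a| and the lower bound (1 - (d - 1)/a)^nu."""
    nu = len(points[0])
    # d: side of the smallest cube containing I (hypothetical helper value).
    d = 1 + max(
        max(p[k] for p in points) - min(p[k] for p in points) for k in range(nu)
    )
    ratio = count_translates(a, points) / a ** nu
    bound = max(0.0, 1 - (d - 1) / a) ** nu
    return ratio, bound


I_example = [(0, 0), (2, 1), (1, 3)]  # illustrative finite subset of Z^2
for a in (10, 100, 1000):
    print(a, ratio_and_bound(a, I_example))
```

For growing `a` both the ratio and the bound approach 1, in line with (8.11).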
In order to prove that the subspace $\mathcal{P}^f_\tau$ is dense in $\mathcal{P}_\tau$, we need the following Lemma.
Lemma 8.11. For any $A \in \mathcal{A}$,
\[ \lim_{a \to \infty} E_a(A) = A . \tag{8.12} \]
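Proof. Since $\mathcal{A}_\circ$ is dense in $\mathcal{A}$, there exists $A_\varepsilon \in \mathcal{A}_\circ$ for any $\varepsilon > 0$ such that
\[ \|A_\varepsilon - A\| < \varepsilon . \tag{8.13} \]
Let $A_\varepsilon \in \mathcal{A}(I_\varepsilon)$ for a finite $I_\varepsilon$. Then there exists a sufficiently large positive integer $b$ such that a translate of $C_b$, say $C_b - k$, contains both 0 (the origin of $\mathbb{Z}^\nu$) and $I_\varepsilon$. If a translate $C_a - i$ of $C_a$ contains $C_b - k$, then $E_{C_a - i}(A_\varepsilon) = A_\varepsilon$ because $C_a - i \supset C_b - k \supset I_\varepsilon$ and $A_\varepsilon \in \mathcal{A}(I_\varepsilon)$. Such $i$ belongs to $C_a$ due to $0 \in C_b - k \subset C_a - i$. The number of translates $C_a - i$ of $C_a$ which contain $C_b - k$ is equal to $l(a, C_b)$ (the number of translates of $C_a$ which contain $C_b$). Therefore, we obtain
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| = \Big\| \Big( 1 - \frac{l(a, C_b)}{|C_a|} \Big) A_\varepsilon - \frac{1}{|C_a|} \sum \{E_{C_a - i}(A_\varepsilon);\ i \in C_a,\ C_a - i \not\supset C_b - k\} \Big\| . \]
Hence, by using $\|E_{C_a - i}(A_\varepsilon)\| \le \|A_\varepsilon\|$ due to $\|E_{C_a - i}\| = 1$, we obtain
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| \le \Big\{ \Big( 1 - \frac{l(a, C_b)}{|C_a|} \Big) + \frac{1}{|C_a|} \{|C_a| - l(a, C_b)\} \Big\} \|A_\varepsilon\| = 2 \Big( 1 - \frac{l(a, C_b)}{|C_a|} \Big) \|A_\varepsilon\| . \]
By Lemma 8.10,
\[ \lim_{a \to \infty} \frac{l(a, C_b)}{|C_a|} = 1 . \]
Hence, there exists $n_\varepsilon \in \mathbb{N}$ such that for $a \ge n_\varepsilon$,
\[ \|A_\varepsilon - E_a(A_\varepsilon)\| < \varepsilon . \tag{8.14} \]
Hence, for $a \ge n_\varepsilon$,
\[ \|A - E_a(A)\| \le \|A - A_\varepsilon\| + \|A_\varepsilon - E_a(A_\varepsilon)\| + \|E_a(A_\varepsilon - A)\| < 3\varepsilon \]
by (8.13), (8.14) and $\|E_a\| = 1$.
Theorem 8.12. $\mathcal{P}^f_\tau$ is dense in $\mathcal{P}_\tau$.
Proof. Let $\Phi \in \mathcal{P}_\tau$. For any finite $I \subset \mathbb{Z}^\nu$ containing the origin 0 of $\mathbb{Z}^\nu$,
\[ E_a(\Phi(I)) = \frac{l(a, I)}{|C_a|} \Phi(I) , \tag{8.15} \]
because $E_{C_a - i}(\Phi(I)) = \Phi(I)$ if $C_a - i$ contains $I$, while $E_{C_a - i}(\Phi(I)) = 0$ if $C_a - i$ does not contain $I$, due to ($\Phi$-d). Note that all translates of $C_a$ which contain $I$ appear in the sum (8.9) since $I$ is assumed to contain 0. We now consider the following potential:
\[ \Phi_a(I) = \frac{l(a, I)}{|C_a|} \Phi(I) . \tag{8.16} \]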
Due to $\Phi \in \mathcal{P}_\tau$, ($\Phi$-a), ($\Phi$-b), ($\Phi$-c) and ($\Phi$-d) for $\Phi_a$ follow automatically. Since $\Phi \in \mathcal{P}_\tau$ is translation covariant and $l(a, I)$ is invariant under translation of $I$ by (8.10), $\Phi_a$ satisfies the translation covariance ($\Phi$-f). $\Phi_a$ is of finite range because there are no translates of $C_a$ containing $I$ if $\mathrm{diam}(I) > \sqrt{\nu}\,(a - 1)$, and hence $l(a, I) = 0$ for such $I$ and $a$ $(\in \mathbb{N})$. Hence ($\Phi$-e) is automatically satisfied. Therefore we conclude that $\Phi_a \in \mathcal{P}^f_\tau$. We compute
\[ E_a(H_\Phi(\{0\})) = \frac{1}{|C_a|} \sum_{i \in C_a} \sum_{J \ni 0} E_{C_a - i}(\Phi(J)) = \sum_{J \ni 0} \frac{l(a, J)}{|C_a|} \Phi(J) = H_{\Phi_a}(\{0\}) , \]
where we have used $E_{C_a - i}(\Phi(J)) = \Phi(J)$ for $C_a - i \supset J$ and $E_{C_a - i}(\Phi(J)) = 0$ for $C_a - i \not\supset J$ due to ($\Phi$-d). (Note that if a translate $C_a - i$ contains $J$, then $i \in C_a$ due to $0 \in J$, and hence the number of $i \in C_a$ for which $C_a - i \supset J$ is $l(a, J)$.) By Lemma 8.11, we obtain
\[ \lim_{a \to \infty} \|\Phi - \Phi_a\| = \lim_{a \to \infty} \|H_\Phi(\{0\}) - H_{\Phi_a}(\{0\})\| = \lim_{a \to \infty} \|H_\Phi(\{0\}) - E_a(H_\Phi(\{0\}))\| = 0 . \]
This completes the proof.
Corollary 8.13. $\mathcal{P}_\tau$ is a separable Banach space.
Proof. For each $n \in \mathbb{N}$, the set of all $\Phi \in \mathcal{P}^f_\tau$ with range not exceeding $n$ is a finite dimensional subspace of $\mathcal{P}_\tau$, because such $\Phi$ is determined by $\Phi(I)$ for a finite number of $I$ containing the origin and satisfying $\mathrm{diam}(I) \le n$, and so it has a countable dense subset. Taking the union over $n \in \mathbb{N}$, we have a countable dense subset of $\mathcal{P}^f_\tau$. By Theorem 8.12, the same countable subset is dense in $\mathcal{P}_\tau$. We have now shown that $\mathcal{P}_\tau$ is separable.

9. Thermodynamic Limit

The van Hove limits of the densities (volume averages) of extensive quantities are usually called thermodynamic limits. We now provide their existence theorems. The same proof as in the case of spin lattice systems (see e.g. [17], [23] and [40]) is applicable to the present Fermion lattice case. We, however, present a slightly simplified proof by using methods different from those of the known proof. First we derive a surface energy estimate, which is crucial in the arguments of the present section.
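9.1. Surface energy estimate

Lemma 9.1. For $\Phi \in \mathcal{P}_\tau$,
\[ \mathrm{v.H.}\lim_{I \to \infty} \frac{\|W(I)\|}{|I|} = 0 . \tag{9.1} \]
Proof. Let $\{I_\alpha\}$ be an arbitrary van Hove net of $\mathbb{Z}^\nu$. For $n \in \mathbb{Z}^\nu$ and a finite subset $I$ of $\mathbb{Z}^\nu$, let
\[ \begin{aligned} W_n(I) &\equiv \lim_{J \nearrow \mathbb{Z}^\nu} \sum_K \{\Phi(K);\ K \ni n,\ K \cap I^c \ne \emptyset,\ K \subset J\} \\ &= \lim_{J \nearrow \mathbb{Z}^\nu} \big( H_J(\{n\}) - E_I\{H_J(\{n\})\} \big) \\ &= H(\{n\}) - E_I\{H(\{n\})\} . \end{aligned} \]
Let $B_r^{\mathbb{Z}^\nu}(n)$ be the intersection of $B_r(n)$ (the ball with center $n$ and radius $r$) and $\mathbb{Z}^\nu$. If $n \in I$ and $n \notin \mathrm{surf}_r(I)$, then $B_r^{\mathbb{Z}^\nu}(n) \subset I$ and hence
\[ E_I\big( H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) \big) = H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) . \]
Therefore,
\[ W_n(I) = H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) - E_I\{ H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) \} . \]
From this, we obtain
\[ \|W_n(I)\| \le 2 \| H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) \| . \]
By (5.23), for given $\varepsilon > 0$, we can take sufficiently large $r > 0$ (hence sufficiently large $B_r^{\mathbb{Z}^\nu}(0)$) satisfying
\[ \| H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\}) \| < \frac{\varepsilon}{4} . \]
By the translation covariance assumption on $\Phi$, we have
\[ \| H(\{n\}) - H_{B_r^{\mathbb{Z}^\nu}(n)}(\{n\}) \| = \| \tau_n\{ H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\}) \} \| = \| H(\{0\}) - H_{B_r^{\mathbb{Z}^\nu}(0)}(\{0\}) \| < \frac{\varepsilon}{4} . \]
Hence
\[ \|W_n(I)\| \le \frac{\varepsilon}{2} \tag{9.2} \]
if $n \in I$ and $n \notin \mathrm{surf}_r(I)$. For $I = \{n_1, \ldots, n_{|I|}\}$, we have
\[ W(I) = \sum_{i=1}^{|I|} E_{\{n_1, \ldots, n_{i-1}\}^c} W_{n_i}(I) \tag{9.3} \]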
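and hence
\[ \|W(I)\| \le \sum_{i=1}^{|I|} \|W_{n_i}(I)\| . \tag{9.4} \]
For $n = n_i \notin \mathrm{surf}_r(I)$, we use the estimate (9.2) for $\|W_n(I)\|$. For $n = n_i \in \mathrm{surf}_r(I)$, we use
\[ \|W_n(I)\| = \|H(\{n\}) - E_I(H(\{n\}))\| \le 2\|H(\{n\})\| = 2\|\Phi\| . \]
Then
\[ \|W(I)\| \le \frac{\varepsilon}{2} \cdot |I| + 2\|\Phi\| \cdot |\mathrm{surf}_r(I)| . \tag{9.5} \]
Since $\{I_\alpha\}$ is a van Hove net, there exists $\alpha_\varepsilon$ such that, for $\alpha \ge \alpha_\varepsilon$,
\[ \frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} < \frac{\varepsilon}{4\|\Phi\|} . \]
For such $\alpha$, we obtain
\[ \frac{\|W(I_\alpha)\|}{|I_\alpha|} < \varepsilon , \]
which completes the proof.
Lemma 9.2. Let $\{I_\alpha\}$ be a van Hove net of $\mathbb{Z}^\nu$. For each $I_\alpha$ and $a \in \mathbb{N}$, take a set of mutually disjoint $n_a^-(I_\alpha)$ translates $D_i^{(a,\alpha)}$ of $C_a$ which are all packed in $I_\alpha$. For any $\varepsilon > 0$, take an $a_0 \in \mathbb{N}$ such that $\|W(C_a)\| < |C_a|\, \varepsilon/2$ for all $a > a_0$. For any such $a$, there exists an $\alpha_0(a)$ such that, for $\alpha > \alpha_0(a)$,
\[ \Big\| H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big\| < n_a^-(I_\alpha)\, |C_a|\, \varepsilon , \tag{9.6} \]
\[ \Big\| U(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big\| < n_a^-(I_\alpha)\, |C_a|\, \varepsilon , \tag{9.7} \]
and
\[ 1 \ge \frac{n_a^-(I_\alpha)\, |C_a|}{|I_\alpha|} \ge 1 - \frac{\varepsilon}{\|\Phi\|} . \tag{9.8} \]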
Proof. Before we start the proof, we note that the existence of $a_0$ is guaranteed by Lemma 9.1. Let us set
\[ D^{(a,\alpha)} \equiv \bigcup_{i=1}^{n_a^-(I_\alpha)} D_i^{(a,\alpha)} , \qquad D'^{(a,\alpha)} \equiv I_\alpha \setminus D^{(a,\alpha)} . \]
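Obviously
\[ |D'^{(a,\alpha)}| \le \big( n_a^+(I_\alpha) - n_a^-(I_\alpha) \big) |C_a| , \]
and
\[ n_a^+(I_\alpha)\, |C_a| \ge |I_\alpha| \ge n_a^-(I_\alpha)\, |C_a| . \]
From this, we obtain
\[ 1 \ge \frac{|I_\alpha|}{n_a^+(I_\alpha)\, |C_a|} \ge \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)} , \qquad 1 \ge \frac{n_a^-(I_\alpha)\, |C_a|}{|I_\alpha|} \ge \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)} . \tag{9.9} \]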
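The inequalities (9.9) can be illustrated for a box-shaped region. The sketch below rests on an assumption that is not restated in this section: $n_a^-(I)$ is taken to be the number of cubes of the partition of $\mathbb{Z}^\nu$ into translates of $C_a$ (by $a\mathbb{Z}^\nu$) contained in $I$, and $n_a^+(I)$ the number of such cubes meeting $I$. For a box with sides `sides` anchored at the origin, these are products of floors and ceilings. The function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def packing_counts(sides, a):
    """(n_minus, n_plus) for the box [0, s_1) x ... x [0, s_nu) and cube side a."""
    n_minus = prod(s // a for s in sides)      # cubes fully inside the box
    n_plus = prod(-(-s // a) for s in sides)   # cubes meeting the box (ceiling)
    return n_minus, n_plus


def check_9_9(sides, a):
    """Check both chains of (9.9) exactly; also return n_minus / n_plus."""
    n_minus, n_plus = packing_counts(sides, a)
    volume = prod(sides)
    cube = a ** len(sides)
    ratio = Fraction(n_minus, n_plus)
    first = 1 >= Fraction(volume, n_plus * cube) >= ratio
    second = 1 >= Fraction(n_minus * cube, volume) >= ratio
    return first and second, ratio


print(packing_counts((50, 37), 5))
print(check_9_9((50, 37), 5))
print(check_9_9((5003, 3702), 5))
```

As the box grows with `a` fixed, the ratio $n_a^-/n_a^+$ tends to 1, which is the property of van Hove nets used in (9.11).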
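On the other hand,
\[ H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) = \sum_{i=1}^{n_a^-(I_\alpha)} E_{\{D_1^{(a,\alpha)} \cup \cdots \cup D_{i-1}^{(a,\alpha)}\}^c} \big( W(D_i^{(a,\alpha)}) \big) + E_{\{D^{(a,\alpha)}\}^c} \big( H(D'^{(a,\alpha)}) \big) . \]
Therefore,
\[ \begin{aligned} \Big\| H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big\| &\le \sum_{i=1}^{n_a^-(I_\alpha)} \|W(D_i^{(a,\alpha)})\| + \|H(D'^{(a,\alpha)})\| \\ &\le n_a^-(I_\alpha)\, |C_a| \cdot \frac{\varepsilon}{2} + \|\Phi\|\, |D'^{(a,\alpha)}| , \end{aligned} \tag{9.10} \]
where in the second inequality the assumption $\|W(C_a)\| < |C_a|\, \varepsilon/2$ together with the translation covariance of $\Phi$ is used for $\|W(D_i^{(a,\alpha)})\|$, and Lemma 8.6 is used for $\|H(D'^{(a,\alpha)})\|$. Due to condition (1) for the van Hove limit, there exists $\alpha_0(a)$ for given $\varepsilon_1 > 0$ such that, for $\alpha \ge \alpha_0(a)$,
\[ 0 \le 1 - \frac{n_a^-(I_\alpha)}{n_a^+(I_\alpha)} < \varepsilon_1 . \tag{9.11} \]
If $\varepsilon_1 < 1$, then
\[ n_a^+(I_\alpha) < \frac{1}{1 - \varepsilon_1}\, n_a^-(I_\alpha) , \qquad |D'^{(a,\alpha)}| \le n_a^+(I_\alpha)\, \varepsilon_1\, |C_a| < \frac{\varepsilon_1}{1 - \varepsilon_1}\, n_a^-(I_\alpha)\, |C_a| . \]
Now we choose $\varepsilon_1$ which satisfies
\[ \frac{2\varepsilon_1}{1 - \varepsilon_1} \|\Phi\| < \varepsilon \quad \text{and} \quad (0 <)\ \varepsilon_1 < 1 . \tag{9.12} \]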
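Then from (9.10) and (9.12), we have
\[ \begin{aligned} \Big\| H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big\| &\le n_a^-(I_\alpha)\, |C_a| \cdot \frac{\varepsilon}{2} + \|\Phi\| \frac{\varepsilon_1}{1 - \varepsilon_1}\, n_a^-(I_\alpha)\, |C_a| \\ &= n_a^-(I_\alpha)\, |C_a| \Big( \frac{\varepsilon}{2} + \|\Phi\| \frac{\varepsilon_1}{1 - \varepsilon_1} \Big) \\ &< n_a^-(I_\alpha)\, |C_a|\, \varepsilon . \end{aligned} \]
We also have
\[ \Big\| U(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big\| = \Big\| E_{I_\alpha} \Big( H(I_\alpha) - \sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)}) \Big) \Big\| < n_a^-(I_\alpha)\, |C_a|\, \varepsilon . \]
Due to (9.12),
\[ \varepsilon_1 < \frac{\varepsilon}{\|\Phi\|} . \]
By (9.9), (9.11) and this inequality, we obtain
\[ 1 \ge \frac{n_a^-(I_\alpha)\, |C_a|}{|I_\alpha|} \ge 1 - \frac{\varepsilon}{\|\Phi\|} . \]

9.2. Pressure

Theorem 9.3. Assume $\Phi \in \mathcal{P}_\tau$.
(1) The following limit exists:
\[ p(\Phi) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \tau(e^{-H(I)}) = \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \tau(e^{-U(I)}) . \tag{9.13} \]
(2) $p(\Phi)$ is a convex functional of $\Phi$ satisfying the following continuity property:
\[ |p(\Phi) - p(\Psi)| \le \|\Phi - \Psi\| . \tag{9.14} \]
Proof. We first prove (1) in four steps.
Step 1. We need the following well-known matrix inequality:
\[ |\log \tau(e^{-A}) - \log \tau(e^{-B})| \le \|A - B\| , \tag{9.15} \]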
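for $A, B \in \mathcal{A}_\circ$. This follows from the following computation:
\[ \begin{aligned} |\log \tau(e^{-A}) - \log \tau(e^{-B})| &= \Big| \int_0^1 \frac{d}{d\lambda} \{\log \tau(e^{-\lambda A - (1 - \lambda)B})\}\, d\lambda \Big| \\ &= \Big| \int_0^1 \frac{\tau(e^{-\lambda A - (1 - \lambda)B} \cdot (B - A))}{\tau(e^{-\lambda A - (1 - \lambda)B})}\, d\lambda \Big| \le \|A - B\| , \end{aligned} \]
where we have used the fact that $\tau(e^c x)/\tau(e^c)$ for $c = c^* \in \mathcal{A}_\circ$ is a state as a function of $x \in \mathcal{A}$ and hence bounded by $\|x\|$. Setting $B = 0$ and noting $\log \tau(e^{-B}) = 0$ for $B = 0$, we have
\[ |\log \tau(e^{-A})| \le \|A\| . \tag{9.16} \]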
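The inequalities (9.15) and (9.16) can be spot-checked on small matrices. The sketch below is only an illustration under stated assumptions: it restricts to real symmetric $2 \times 2$ matrices, for which eigenvalues have a closed form, and takes $\tau$ to be the normalized trace $\mathrm{Tr}/2$. The helper names and the sample matrices are hypothetical, and the two matrices are chosen not to commute.

```python
import math


def eigenvalues_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = m
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius


def log_tau_exp_minus(m):
    """log tau(e^{-m}) with tau the normalized trace Tr/2."""
    lam1, lam2 = eigenvalues_sym2(m)
    return math.log((math.exp(-lam1) + math.exp(-lam2)) / 2.0)


def op_norm(m):
    """Operator norm of a real symmetric 2x2 matrix."""
    return max(abs(lam) for lam in eigenvalues_sym2(m))


def subtract(m1, m2):
    """Entrywise difference m1 - m2."""
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


A = [[1.0, 0.5], [0.5, -0.3]]
B = [[0.2, -0.4], [-0.4, 0.7]]

lhs = abs(log_tau_exp_minus(A) - log_tau_exp_minus(B))
rhs = op_norm(subtract(A, B))
print(lhs, rhs)                                   # (9.15): lhs <= rhs
print(abs(log_tau_exp_minus(A)), op_norm(A))      # (9.16)
```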
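Step 2. We use the notation in the preceding Lemma. Because the $U(D_i^{(a,\alpha)})$ with distinct $i$ mutually commute due to the disjointness of the $D_i^{(a,\alpha)}$, (5.21), (4.8) and ($\Phi$-c), we have
\[ \begin{aligned} \log \tau\Big( e^{-\sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)})} \Big) &= \log \tau\Big( \prod_{i=1}^{n_a^-(I_\alpha)} e^{-U(D_i^{(a,\alpha)})} \Big) = \log \prod_{i=1}^{n_a^-(I_\alpha)} \tau\big( e^{-U(D_i^{(a,\alpha)})} \big) \\ &= \sum_{i=1}^{n_a^-(I_\alpha)} \log \tau\big( e^{-U(D_i^{(a,\alpha)})} \big) = n_a^-(I_\alpha) \log \tau(e^{-U(C_a)}) , \end{aligned} \tag{9.17} \]
where the second equality is due to the product property (4.13) of the tracial state, and the last equality follows from the translation covariance ($\Phi$-f). By (9.16), (8.3) and (9.8), we have
\[ \Big| \frac{n_a^-(I_\alpha)}{|I_\alpha|} \log \tau(e^{-U(C_a)}) - \frac{1}{|C_a|} \log \tau(e^{-U(C_a)}) \Big| < \varepsilon . \tag{9.18} \]
Step 3. By (9.17), (9.18), (9.15), (9.6) and (9.8),
\[ \begin{aligned} &\Big| \frac{1}{|I_\alpha|} \log \tau(e^{-H(I_\alpha)}) - \frac{1}{|C_a|} \log \tau(e^{-U(C_a)}) \Big| \\ &\quad = \Big| \frac{1}{|I_\alpha|} \log \tau(e^{-H(I_\alpha)}) - \frac{1}{|I_\alpha|} \log \tau\Big( e^{-\sum_{i=1}^{n_a^-(I_\alpha)} U(D_i^{(a,\alpha)})} \Big) + \Big( \frac{n_a^-(I_\alpha)}{|I_\alpha|} - \frac{1}{|C_a|} \Big) \log \tau(e^{-U(C_a)}) \Big| < 2\varepsilon \end{aligned} \]
for any $\alpha > \alpha_0(a)$. Hence for any $\alpha, \beta > \alpha_0(a)$, we have
\[ \Big| \frac{1}{|I_\alpha|} \log \tau(e^{-H(I_\alpha)}) - \frac{1}{|I_\beta|} \log \tau(e^{-H(I_\beta)}) \Big| < 4\varepsilon . \tag{9.19} \]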
Therefore, $\frac{1}{|I_\alpha|} \log \tau(e^{-H(I_\alpha)})$ is a Cauchy net in $\mathbb{R}$ and has the (van Hove) limit.
Step 4. Due to
\[ \mathrm{v.H.}\lim_{I \to \infty} \frac{\|H(I) - U(I)\|}{|I|} = \mathrm{v.H.}\lim_{I \to \infty} \frac{\|W(I)\|}{|I|} = 0 \]
and
\[ |\log \tau(e^{-H(I)}) - \log \tau(e^{-U(I)})| \le \|H(I) - U(I)\| , \]
the convergence of $\frac{1}{|I_\alpha|} \log \tau(e^{-H(I_\alpha)})$ implies that of $\frac{1}{|I_\alpha|} \log \tau(e^{-U(I_\alpha)})$ to the same value.
Now we prove (2). Since $H_\Phi(I)$ is linear in $\Phi$, we have the convexity of $\log \tau(e^{-H_\Phi(I)})$ in $\Phi$ due to the well-known convexity of the function $\lambda \mapsto \log \tau(e^{A + \lambda B})$ for $A = A^*$ and $B = B^*$. Hence the convexity of $p(\Phi)$ follows. By (9.15), the linearity of $H_\Phi(I)$ in $\Phi$ and (8.3), we obtain
\[ \Big| \frac{1}{|I|} \log \tau(e^{-H_\Phi(I)}) - \frac{1}{|I|} \log \tau(e^{-H_\Psi(I)}) \Big| \le \frac{1}{|I|} \|H_\Phi(I) - H_\Psi(I)\| = \frac{1}{|I|} \|H_{\Phi - \Psi}(I)\| \le \|\Phi - \Psi\| \]
for any finite $I$. Hence (9.14) follows.
The pressure functional $P(\Phi)$ of $\Phi \in \mathcal{P}_\tau$ is conventionally defined by using the matrix trace, in contrast to $p(\Phi)$ in the preceding theorem, which is defined in terms of the tracial state:
\[ P(\Phi) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-H(I)}) = \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-U(I)}) , \tag{9.20} \]
where $\mathrm{Tr}_I$ denotes the matrix trace on $\mathcal{A}(I)$ and hence $\mathrm{Tr}_I = 2^{|I|} \tau$. Therefore, for any $\Phi \in \mathcal{P}_\tau$,
\[ P(\Phi) = p(\Phi) + \log 2 . \tag{9.21} \]
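The shift by $\log 2$ in (9.21) can be seen in a finite-volume toy computation. The sketch below assumes a hypothetical on-site energy $U = \varepsilon \sum_i (n_i - \tfrac{1}{2})$ on `n_sites` sites, diagonal in the occupation-number basis, so that both traces reduce to sums over occupation configurations; `eps`, `n_sites` and the function name are illustrative choices, not quantities from the text. For this choice one has $\tau(e^{-U}) = \cosh(\varepsilon/2)^{|I|}$ in closed form.

```python
import math
from itertools import product


def log_traces(n_sites, eps):
    """Return (log Tr e^{-U}, log tau(e^{-U})) for U = eps * sum_i (n_i - 1/2)."""
    weights = [
        math.exp(-eps * (sum(occ) - n_sites / 2.0))
        for occ in product((0, 1), repeat=n_sites)
    ]
    log_tr = math.log(sum(weights))                    # matrix trace
    log_tau = math.log(sum(weights) / 2 ** n_sites)    # tracial state Tr / 2^n
    return log_tr, log_tau


n_sites, eps = 6, 0.8
log_tr, log_tau = log_traces(n_sites, eps)
P_finite = log_tr / n_sites   # finite-volume analogue of P
p_finite = log_tau / n_sites  # finite-volume analogue of p

print(P_finite - p_finite, math.log(2))
print(p_finite, math.log(math.cosh(eps / 2)))
```

In this example the per-site value satisfies $|p| \le |\varepsilon|/2$, consistent with the bound (9.16) applied site by site.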
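The preceding theorem immediately gives the following.
Corollary 9.4. Assume $\Phi \in \mathcal{P}_\tau$.
(1) The following limit exists:
\[ P(\Phi) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-H(I)}) = \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-U(I)}) . \tag{9.22} \]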
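(2) $P(\Phi)$ is a convex functional of $\Phi$ satisfying the following continuity property:
\[ |P(\Phi) - P(\Psi)| \le \|\Phi - \Psi\| . \tag{9.23} \]
Remark. We have
\[ p(0) = 0 , \qquad |p(\Phi)| \le \|\Phi\| , \tag{9.24} \]
which do not hold for $P(\Phi)$.

9.3. Mean energy

Theorem 9.5. For $\Phi \in \mathcal{P}_\tau$ and a translation invariant state $\omega$ of $\mathcal{A}$, the following limit exists:
\[ e_\Phi(\omega) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|}\, \omega(H(I)) = \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|}\, \omega(U(I)) . \tag{9.25} \]
The mean energy $e_\Phi(\omega)$ so obtained is linear in $\Phi$, affine in $\omega$, bounded by $\|\Phi\|$, and weak$^*$ continuous in $\omega$:
\[ e_{c\Phi + d\Psi}(\omega) = c\, e_\Phi(\omega) + d\, e_\Psi(\omega) \quad (c, d \in \mathbb{R}) , \tag{9.26} \]
\[ e_\Phi(\lambda \omega_1 + (1 - \lambda)\omega_2) = \lambda e_\Phi(\omega_1) + (1 - \lambda) e_\Phi(\omega_2) \quad (0 \le \lambda \le 1) , \tag{9.27} \]
\[ |e_\Phi(\omega)| \le \|\Phi\| , \tag{9.28} \]
\[ \lim_\gamma e_\Phi(\omega_\gamma) = e_\Phi(\omega) , \tag{9.29} \]
where $\Phi$ and $\Psi$ are in $\mathcal{P}_\tau$, $\omega$, $\omega_1$, $\omega_2$ and $\omega_\gamma$ are in $\mathcal{A}^{*,\tau}_{+,1}$, and $\{\omega_\gamma\}$ is a net converging to $\omega$ in the weak$^*$ topology.
Proof. By the argument leading to (9.19) in Theorem 9.3, there exist $a \in \mathbb{N}$ and $\alpha_0(a)$ for any given $\varepsilon > 0$ such that for all $\alpha > \alpha_0(a)$,
\[ \Big| \frac{1}{|I_\alpha|}\, \omega(H(I_\alpha)) - \frac{1}{|C_a|}\, \omega(U(C_a)) \Big| < 2\varepsilon , \tag{9.30} \]
where we can take the same $a \in \mathbb{N}$ and $\alpha_0(a)$ uniformly in $\omega \in \mathcal{A}^*_{+,1}$. This estimate implies that $\{\frac{1}{|I_\alpha|}\, \omega(H(I_\alpha))\}_\alpha$ is a Cauchy net in $\mathbb{R}$ and hence converges. Since $\omega(H(I))$ is linear in $\Phi$ and affine in $\omega$, so is $e_\Phi(\omega)$. Due to (8.3), we obtain $|e_\Phi(\omega)| \le \|\Phi\|$.
Finally we show the continuity in $\omega$. Let $\{\omega_\gamma\}_\gamma$ be a net of states converging to $\omega$ in the weak$^*$ topology. For any $\varepsilon > 0$, we fix $a \in \mathbb{N}$ satisfying (9.30) for all $\alpha > \alpha_0(a)$ and for all states $\omega$. From the weak$^*$ convergence of $\{\omega_\gamma\}_\gamma$ to $\omega$, there exists $\gamma_\varepsilon$ such that for all $\gamma \ge \gamma_\varepsilon$,
\[ \frac{1}{|C_a|}\, |\omega(U(C_a)) - \omega_\gamma(U(C_a))| < \varepsilon . \]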
Thus we have
\[ \Big| \frac{1}{|I_\alpha|}\, \omega(H(I_\alpha)) - \frac{1}{|I_\alpha|}\, \omega_\gamma(H(I_\alpha)) \Big| < 5\varepsilon \]
for all $\alpha > \alpha_0(a)$. By taking the van Hove limit, we obtain
\[ |e_\Phi(\omega) - e_\Phi(\omega_\gamma)| \le 5\varepsilon \]
for all $\gamma \ge \gamma_\varepsilon$. Hence $e_\Phi(\omega)$ is continuous in $\omega$ relative to the weak$^*$ topology.

10. Entropy for Fermion Systems

10.1. SSA for Fermion systems

We first show the SSA property of entropy for the Fermion case, which is a consequence of the results on the conditional expectations in Secs. 3 and 4.
Theorem 10.1. For finite subsets $I$ and $J$ of $\mathbb{Z}^\nu$, the strong subadditivity (SSA) of $\hat S$ holds for any state $\psi$ of $\mathcal{A}$:
\[ \hat S(\psi_{I \cup J}) - \hat S(\psi_I) - \hat S(\psi_J) + \hat S(\psi_{I \cap J}) \le 0 , \tag{10.1} \]
where $\psi_K$ denotes the restriction of $\psi$ to $\mathcal{A}(K)$. $\hat S$ in this inequality can be replaced by $S$:
\[ S(\psi_{I \cup J}) - S(\psi_I) - S(\psi_J) + S(\psi_{I \cap J}) \le 0 . \tag{10.2} \]
Proof. The SSA of $\hat S$ follows from Theorem 3.7 and Theorem 4.13. By (3.1) and
\[ \log 2^{|I \cup J|} - \log 2^{|I|} - \log 2^{|J|} + \log 2^{|I \cap J|} = 0 , \]
the SSA of $\hat S$ implies that of $S$.
Remark 1. The strong subadditivity can be rewritten as
\[ S(\psi_{123}) - S(\psi_{13}) - S(\psi_{23}) + S(\psi_3) \le 0 \tag{10.3} \]
for any disjoint subsets $I_1$, $I_2$ and $I_3$ of $\mathbb{Z}^\nu$, where $\psi_{123}$ denotes the restriction of $\psi$ to $\mathcal{A}(I_1 \cup I_2 \cup I_3)$, and so on.
Remark 2. The SSA for Fermion systems above does not seem to follow from that for tensor product systems ([27, 28]) in any obvious way.
Remark 3. Note that the SSA for Fermion systems holds whether the state $\psi$ is $\Theta$-even or not. For two disjoint finite regions $I$ and $J$, the so-called triangle inequality of entropy
\[ |S(\psi_I) - S(\psi_J)| \le S(\psi_{I \cup J}) \]
is known to hold for quantum spin lattice systems [1]. However, it can fail for Fermion lattice systems when $\psi$ breaks $\Theta$-evenness (see a concrete example in [33]).
The following is a special case of Theorem 10.1 when $I \cap J = \emptyset$.
Corollary 10.2. For disjoint finite subsets $I$ and $J$, the following subadditivity holds:
\[ \hat S(\psi_{I \cup J}) \le \hat S(\psi_I) + \hat S(\psi_J) , \tag{10.4} \]
\[ S(\psi_{I \cup J}) \le S(\psi_I) + S(\psi_J) . \tag{10.5} \]
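A simple special case of (10.5) can be checked by hand or numerically. The sketch below assumes that the restriction of $\psi$ to $\mathcal{A}(I \cup J)$, with $I$ and $J$ single sites, has a density matrix diagonal in the occupation-number basis; for such a state the von Neumann entropy reduces to the Shannon entropy of the occupation probabilities and the restrictions correspond to marginals. The joint distribution `joint` is a hypothetical example, and this classical case does not probe the genuinely Fermionic features discussed in Remark 3.

```python
import math


def entropy(probs):
    """Shannon entropy (natural logarithm) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def marginal(joint, site):
    """Marginal distribution of the occupation number at the given site."""
    out = {}
    for occ, p in joint.items():
        out[occ[site]] = out.get(occ[site], 0.0) + p
    return out


# Hypothetical correlated occupation probabilities for two sites.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

s_joint = entropy(joint.values())
s_first = entropy(marginal(joint, 0).values())
s_second = entropy(marginal(joint, 1).values())

print(s_joint, s_first + s_second)  # subadditivity: first <= second
```

Each single-site entropy is at most $\log 2$, so the joint entropy here is bounded by $2 \log 2$.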
10.2. Mean entropy

We now show the existence of mean entropy (von Neumann entropy density) for translation invariant states of $\mathcal{A}$. For $s = (s_1, \ldots, s_\nu) \in \mathbb{N}^\nu$, we define $R_s$ as the following box region with edges of length $s_i - 1$ containing $s_i$ points of $\mathbb{Z}^\nu$ and with the volume $|R_s| = \prod_{i=1}^\nu s_i$:
\[ R_s \equiv \{x \in \mathbb{Z}^\nu;\ 0 \le x_i \le s_i - 1,\ i = 1, \ldots, \nu\} . \tag{10.6} \]
Theorem 10.3. Let $\omega$ be a translation invariant state. The van Hove limit
\[ s(\omega) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} S(\omega_I) \tag{10.7} \]
exists and is given as the following infimum:
\[ s(\omega) = \inf_{s \in \mathbb{N}^\nu} \frac{1}{|R_s|} S(\omega_{R_s}) . \tag{10.8} \]
The mean entropy functional
\[ \omega \mapsto s(\omega) \in [0, \log 2] \tag{10.9} \]
defined on the set $\mathcal{A}^{*,\tau}_{+,1}$ of translation invariant states is affine and upper semicontinuous with respect to the weak$^*$ topology.
Proof. The SSA property of von Neumann entropy proved in Theorem 10.1 is sufficient for the same proof of this Theorem as in the case of quantum spin lattice systems. (See e.g. Proposition 6.2.38 of [17].)
The following results about Lipschitz continuity of bounded affine functions on a state space and, in particular, of entropy density are known.
Proposition 10.4. A bounded affine function $f$ on $\mathcal{A}^{*,\tau}_{+,1}$ satisfies
\[ |f(\omega_1) - f(\omega_2)| \le (M/2) \|\omega_1 - \omega_2\| \]
for any $\omega_1, \omega_2 \in \mathcal{A}^{*,\tau}_{+,1}$, where
\[ M \equiv \sup\{ |f(\omega_1) - f(\omega_2)|;\ \omega_1, \omega_2 \in \mathcal{A}^{*,\tau}_{+,1} \} . \tag{10.10} \]
Corollary 10.5. The mean entropy $s(\omega)$ satisfies
\[ |s(\omega_1) - s(\omega_2)| \le \frac{1}{2} (\log 2) \|\omega_1 - \omega_2\| \tag{10.11} \]
for any $\omega_1, \omega_2 \in \mathcal{A}^{*,\tau}_{+,1}$.
Proposition 10.4 is the first equation on p. 108 of [23] and Corollary 10.5 is Corollary IV.4.3 on the same page of [23]. The inequality (10.11) without the factor $\frac{1}{2}$ is obtained in [20]. The coefficient $\frac{1}{2} \log 2$ is best possible, the equality being reached by $\omega_1 = \tau$ and any pure translation invariant state $\omega_2$ with vanishing mean entropy $s(\omega_2) = 0$, in which case $\|\omega_1 - \omega_2\| = 2$ because $\pi_\tau$ (type II) and $\pi_{\omega_2}$ (type I) are disjoint. An example of such an $\omega_2$ is given by Theorem 11.2 as a 'product state extension' of $\Theta$-even pure states $\varphi_i$ of $\mathcal{A}(\{i\})$ ($i \in \mathbb{Z}^\nu$) satisfying the covariance condition $\tau_k^* \varphi_i = \varphi_{i+k}$ for all $k \in \mathbb{Z}^\nu$.
We define mean entropy $\hat s(\omega)$ for $\omega \in \mathcal{A}^{*,\tau}_{+,1}$ by using the trace $\tau$ instead of the matrix trace $\mathrm{Tr}_I$ for each finite $I$:
\[ \hat s(\omega) \equiv \mathrm{v.H.}\lim_{I \to \infty} \frac{1}{|I|} \hat S(\omega_I) . \tag{10.12} \]
It is obviously related to $s(\omega)$ by
\[ s(\omega) = \hat s(\omega) + \log 2 \tag{10.13} \]
for any $\omega \in \mathcal{A}^{*,\tau}_{+,1}$.

10.3. Entropy inequalities for translation invariant states

In addition to Theorem 10.3, the SSA property of von Neumann entropy plays an essential role in the derivation of some basic entropy inequalities for the present Fermion lattice systems in the same way as for quantum spin lattice systems. The following two consequences are about monotone properties of entropy as a function on the set of box regions of the lattice; the first one is a monotone decreasing property of the finite-volume entropy density and the second one is a monotone increasing property of the entropy.
Theorem 10.6. Let $\omega$ be a translation invariant state on $\mathcal{A}$ and let $R_s$ and $R_{s'}$ be finite boxes of $\mathbb{Z}^\nu$ such that $R_s \subset R_{s'}$. Then
\[ \frac{1}{|R_s|} S(\omega_{R_s}) \ge \frac{1}{|R_{s'}|} S(\omega_{R_{s'}}) , \tag{10.14} \]
\[ S(\omega_{R_s}) \le S(\omega_{R_{s'}}) . \tag{10.15} \]
This theorem follows from [24], where (10.14) and (10.15) are derived from the following properties without any other input.
• Positivity and finiteness of the entropy of every local region.
• Strong subadditivity.
• Shift invariance.
In [16], sufficient conditions are given for a sequence of regions of more general shape than boxes which guarantee a monotone decreasing property of the form (10.14) for any translation invariant state $\omega$. This result also applies to our Fermion lattice systems.

11. Variational Principle

We first prove the existence of a (unique) product state extension of given states in any (finite or infinite) number of mutually disjoint regions under the condition that all given states except for at most one are $\Theta$-even. This result is a crucial tool to overcome possible difficulties which originate in the non-commutativity of Fermion systems in connection with the proof of the variational equality in this section and in the equivalence proof of the variational principle with the KMS condition in the next section.

11.1. Extension of even states

For each $I$, $\mathcal{A}(I)$ is invariant under $\Theta$ and hence the restriction of $\Theta$ to $\mathcal{A}(I)$ is an automorphism of $\mathcal{A}(I)$ and will be denoted by the same symbol $\Theta$. We need the following lemma.
Lemma 11.1. Let $I$ be a finite subset of $\mathbb{Z}^\nu$. Let $\varphi$ be a state of $\mathcal{A}(I)$ and $\varrho \in \mathcal{A}(I)$ be its adjusted density matrix:
\[ \varphi(A) = \tau(\varrho A) = \tau(A \varrho) \quad (A \in \mathcal{A}(I)) . \]
Then $\varphi$ is an even state if and only if $\varrho$ is $\Theta$-even.
Proof. Since the tracial state $\tau$ is invariant under any automorphism, we obtain
\[ \varphi(A) = \varphi(\Theta(A)) = \tau(\varrho \Theta(A)) = \tau(\Theta\{\varrho \Theta(A)\}) = \tau(\Theta(\varrho) A) \]
if $\varphi$ is even. By the uniqueness of the density matrix, we have $\Theta(\varrho) = \varrho$. By the same computation, $\varphi(\Theta(A)) = \varphi(A)$ for every $A \in \mathcal{A}(I)$ if $\Theta(\varrho) = \varrho$.
Theorem 11.2. Let $\{I_i\}$ be a (finite or infinite) family of mutually disjoint subsets of $\mathbb{Z}^\nu$ and $\varphi_i$ be a state of $\mathcal{A}(I_i)$ for each $i$. Let $I = \bigcup_i I_i$. Then there exists a state $\varphi$ of $\mathcal{A}(I)$ satisfying
\[ \varphi(A_{i_1} \cdots A_{i_n}) = \prod_{j=1}^n \varphi_{i_j}(A_{i_j}) \tag{11.1} \]
for any set $(i_1, \ldots, i_n)$ of distinct indices and for any $A_{i_j} \in \mathcal{A}(I_{i_j})$ if all states $\varphi_i$ except for at most one are $\Theta$-even. When such $\varphi$ exists, it is unique.
Proof. (Case 1) A finite family of finite subsets $\{I_i\}$, $i = 1, \ldots, n$.
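For each $i$, let $\varrho_i$ be the density matrix of $\varphi_i$:
$$\varphi_i(A) = \tau(\varrho_i A) = \tau(A\varrho_i), \quad (A \in \mathcal{A}(I_i)), \qquad \varrho_i \in \mathcal{A}(I_i), \quad \varrho_i \ge 0, \quad \tau(\varrho_i) = 1.$$
If $\varphi_i$ is $\Theta$-even, then $\varrho_i$ is $\Theta$-even, namely, $\varrho_i \in \mathcal{A}(I_i)_+$. If all states $\varphi_i$ except for one are even, all $\varrho_i$ except for one belong to $\mathcal{A}(I_i)_+$. Thus each $\varrho_i$ commutes with any $\varrho_j$. The product
$$\varrho = \varrho_n \cdots \varrho_1 \tag{11.2}$$
is a product of mutually commuting non-negative hermitian operators and hence it is positive. Define
$$\varphi(A) \equiv \tau(\varrho A), \qquad A \in \mathcal{A}(I). \tag{11.3}$$
By the product property of $\tau$ (4.13), we have
$$\begin{aligned}
\varphi(A_1 \cdots A_n) &= \tau(\varrho A_1 \cdots A_n) = \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1} A_n \varrho_n) \\
&= \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1})\,\tau(A_n \varrho_n) \\
&= \tau(\varrho_{n-1} \cdots \varrho_1 A_1 \cdots A_{n-1})\,\varphi_n(A_n).
\end{aligned}$$
Using this recursively, we obtain
$$\varphi(A_1 \cdots A_n) = \prod_{i=1}^{n} \varphi_i(A_i).$$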
This also shows $\varphi(1) = 1$. Hence the existence is proved for Case 1. Since the monomials of the form (4.2) with all indices in $I$ are total in $\mathcal{A}(I)$, the uniqueness of a state $\varphi$ of $\mathcal{A}(I)$ satisfying the product property (11.1) follows.

(Case 2) A general family $\{I_i\}$.

Let $\{L_k\}$ be an increasing sequence of finite subsets of $\mathbb{Z}^\nu$ such that their union is $\mathbb{Z}^\nu$. Set $I_i^k \equiv I_i \cap L_k$ and $I^k \equiv I \cap L_k$ for each $k$. For each $k$, only a finite number (which will be denoted by $n(k)$) of $I_i^k$ are non-empty and all of them are finite subsets of $\mathbb{Z}^\nu$. Note that the restriction of an even state $\varphi_i$ to $\mathcal{A}(I_i^k)$ is even. Hence we can apply the result for Case 1 to $\{I_i^k\}$. We obtain a unique product state $\varphi^k$ of $\mathcal{A}(I^k)$ satisfying
$$\varphi^k(A_{i_1} \cdots A_{i_{n(k)}}) = \prod_{j=1}^{n(k)} \varphi^k_{i_j}(A_{i_j}), \qquad A_{i_j} \in \mathcal{A}(I^k_{i_j}). \tag{11.4}$$
By the uniqueness already proved, the restriction of $\varphi^k$ to $\mathcal{A}(I^l)$ for $l < k$ coincides with $\varphi^l$. There exists a state $\varphi_\circ$ of the $*$-algebra $\bigcup_k \mathcal{A}(I^k)$ defined by
$$\varphi_\circ(A) = \varphi^k(A)$$
for $A \in \mathcal{A}(I^k)$. Since $\bigcup_k I^k = I$, $\bigcup_k \mathcal{A}(I^k)$ is dense in $\mathcal{A}(I)$. Then there exists a unique continuous extension $\varphi$ of $\varphi_\circ$ to $\mathcal{A}(I)$ and $\varphi$ is a state of $\mathcal{A}(I)$.

Take an arbitrary index $n$. Let
$$A = A_1 \cdots A_n, \qquad A_i \in \mathcal{A}(I_i).$$
Set $A_i^k \equiv E_{L_k}(A_i) \in \mathcal{A}(I_i^k)$. Since $L_k \nearrow \mathbb{Z}^\nu$,
$$A_i = \lim_k A_i^k, \qquad A = \lim_k (A_1^k \cdots A_n^k).$$
Hence
$$\varphi(A) = \lim_k \varphi(A_1^k \cdots A_n^k) = \lim_k \varphi^k(A_1^k \cdots A_n^k) = \lim_k \prod_{i=1}^{n} \varphi_i(A_i^k) = \prod_{i=1}^{n} \varphi_i(A_i).$$
Thus $\varphi$ satisfies the product property (11.1). The uniqueness of $\varphi$ is proved in the same way as Case 1.

Remark 1. This result is given in Theorem 5.4 of Power's Thesis [36].

Remark 2. The unique product state extension $\varphi$ is even if and only if all $\varphi_i$ are even.

Remark 3. The condition that all $\varphi_i$ except for at most one are $\Theta$-even can be shown to be necessary for the existence of the product state extension $\varphi$ satisfying (11.1) [14].

Lemma 11.3. Let $\{I_i\}$ be a finite family of mutually disjoint finite subsets of $\mathbb{Z}^\nu$. Let $\varphi_i$ be a state of $\mathcal{A}(I_i)$ for each $i$ and all $\varphi_i$ be $\Theta$-even with at most one exception. Let $\varphi$ be their product state extension given by Theorem 11.2. Then
$$S(\varphi) = \sum_i S(\varphi_i), \qquad \hat{S}(\varphi) = \sum_i \hat{S}(\varphi_i). \tag{11.5}$$

Proof. This follows from the computation using the density matrix (11.2):
$$\hat{S}(\varphi) = -\varphi(\log \varrho) = -\sum_i \varphi(\log \varrho_i) = -\sum_i \varphi_i(\log \varrho_i) = \sum_i \hat{S}(\varphi_i). \tag{11.6}$$
Here the mutual commutativity of the $\varrho_i$ is used. Due to $|I| = \sum_i |I_i|$, we can replace $\hat{S}$ by $S$.
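To see why Theorem 11.2 asks that all but at most one of the component states be even (the necessity noted in Remark 3), the following sketch treats two disjoint regions. It assumes only the graded commutation relation that odd elements localized in disjoint regions anticommute; the symbols $A_1$, $A_2$ are chosen here for illustration, and the complete argument is the one cited in [14].

```latex
% Two disjoint regions I_1, I_2.
% A_j = A_j^* \in \mathcal{A}(I_j) is odd, i.e. \Theta(A_j) = -A_j  (j = 1, 2).
\begin{align*}
A_1 A_2 &= -A_2 A_1
  && \text{(odd elements in disjoint regions anticommute)} \\
(A_1 A_2)^* &= A_2 A_1 = -A_1 A_2
  && \Longrightarrow\ \varphi(A_1 A_2) \in i\mathbb{R}
     \ \text{for every state } \varphi \\
\varphi(A_1 A_2) &= \varphi_1(A_1)\,\varphi_2(A_2) \in \mathbb{R}
  && \text{(product property (11.1) with } A_j = A_j^*) \\
&\Longrightarrow\ \varphi_1(A_1)\,\varphi_2(A_2) = 0 .
\end{align*}
```

So if $\varphi_1$ is not even, that is, $\varphi_1(A_1) \neq 0$ for some self-adjoint odd $A_1$, then $\varphi_2$ must vanish on every self-adjoint odd element of $\mathcal{A}(I_2)$, hence on all odd elements, which means that $\varphi_2$ is even.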
11.2. Variational inequality

We have already quoted the positivity of relative entropy:
$$S(\psi, \varphi) = \tau(\hat{\rho}_\varphi \log \hat{\rho}_\varphi - \hat{\rho}_\varphi \log \hat{\rho}_\psi) \ge 0, \tag{11.7}$$
where the equality holds if and only if $\varphi = \psi$. Recall our notation (7.14) for the local Gibbs state $\varphi^c_I$ of $\mathcal{A}(I)$ with respect to $(\Phi, \beta)$. Let $\omega$ be a state of $\mathcal{A}$. Substituting $\psi = \varphi^c_I$ and $\varphi = \omega_I$ into (11.7), we obtain
$$S(\varphi^c_I, \omega_I) = -\hat{S}(\omega_I) + \beta\omega(U(I)) + \log \tau(e^{-\beta U(I)}) \ge 0. \tag{11.8}$$
Now we assume that $\omega$ is translation invariant. By dividing the above inequality by $|I|$ and then taking the van Hove limit $I \to \infty$, we obtain the following variational inequality
$$p(\beta\Phi) \ge \hat{s}(\omega) - \beta e_\Phi(\omega), \tag{11.9}$$
where $\hat{s}(\omega)$ is given by (10.12). Equivalently, we have
$$P(\beta\Phi) \ge s(\omega) - \beta e_\Phi(\omega). \tag{11.10}$$
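As a consistency check on the finite-volume inequality behind (11.10), consider a single site carrying one Fermion mode. The sketch below assumes a hypothetical internal energy $U = \varepsilon\, a^* a$ (the constant $\varepsilon \in \mathbb{R}$ is chosen only for illustration) and restricts to even states, which are diagonal in the occupation basis with occupation probability $p$. The entropy is taken with the matrix trace normalization, as for $S$ rather than $\hat{S}$.

```latex
\begin{align*}
F(p) &\equiv S - \beta\,\omega(U)
      = -p\log p - (1-p)\log(1-p) - \beta\varepsilon\, p ,
      \qquad 0 \le p \le 1, \\
F'(p) &= \log\frac{1-p}{p} - \beta\varepsilon = 0
      \iff p^{*} = \frac{e^{-\beta\varepsilon}}{Z},
      \qquad Z \equiv \mathrm{Tr}\, e^{-\beta U} = 1 + e^{-\beta\varepsilon}, \\
\log p^{*} &= -\beta\varepsilon - \log Z, \qquad \log(1-p^{*}) = -\log Z, \\
F(p^{*}) &= p^{*}(\beta\varepsilon + \log Z) + (1-p^{*})\log Z - \beta\varepsilon\, p^{*}
      = \log Z .
\end{align*}
```

Since $F$ is concave, $p^{*}$ is its maximizer, so $\sup_p F(p) = \log \mathrm{Tr}\, e^{-\beta U}$ with the supremum attained exactly at the Gibbs occupation. This is the one-site instance of the statement that $S(\varphi^c_I, \omega_I) \ge 0$ with equality if and only if $\omega_I = \varphi^c_I$.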
11.3. Variational equality

The variational inequality in the preceding subsection is now strengthened to the following variational equality.

Theorem 11.4. Let $\Phi \in \mathcal{P}_\tau$. Then
$$P(\beta\Phi) = \sup_{\omega \in \mathcal{A}^{*\tau}_{+,1}} \{ s(\omega) - \beta e_\Phi(\omega) \}, \tag{11.11}$$
where $P(\beta\Phi)$, $s(\omega)$ and $e_\Phi$ denote the pressure, mean entropy and mean energy, respectively, and $\mathcal{A}^{*\tau}_{+,1}$ denotes the set of all translation invariant states of $\mathcal{A}$.

Proof. The proof below will be carried out in the same way as for classical or quantum lattice systems ([37] or e.g. Theorem III.4.5 in [40]), with the help of the product state extension provided by Theorem 11.2.

By the variational inequality (11.10), we only have to find a sequence $\{\rho_n\}$ of translation invariant states of $\mathcal{A}$ satisfying
$$\{ s(\rho_n) - \beta e_\Phi(\rho_n) \} \to P(\beta\Phi) \qquad (n \to \infty). \tag{11.12}$$
For this purpose, we interrupt the proof and show the following lemma about mean entropy and mean energy of periodic states. It corresponds to Theorem 10.3 and Theorem 9.5 for translation invariant states.

Lemma 11.5. Let $a \in \mathbb{N}$, $\omega$ be an $a\mathbb{Z}^\nu$-invariant state and $\Phi \in \mathcal{P}_\tau$.

(1) The mean entropy
$$s(\omega) = \lim_{n\to\infty} \frac{S(\omega|_{\mathcal{A}(C_{na})})}{|C_{na}|} \tag{11.13}$$
exists. It is affine, weak$*$ upper semicontinuous in $\omega$ and translation invariant:
$$s(\omega) = s(\tau_k^*(\omega)), \qquad (k \in \mathbb{Z}^\nu). \tag{11.14}$$

(2) The mean energy
$$e_\Phi(\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} \tag{11.15}$$
exists. It is linear in $\Phi$, bounded by $\|\Phi\|$, affine and weak$*$ continuous in $\omega$, and translation invariant:
$$e_\Phi(\omega) = e_\Phi(\tau_k^*(\omega)), \qquad (k \in \mathbb{Z}^\nu). \tag{11.16}$$
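Proof. We introduce a new lattice system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$ where the total algebra $\mathcal{A}^a$ is equal to $\mathcal{A}$ and its local algebra is
$$\mathcal{A}^a(I) \equiv \mathcal{A}\bigl(\textstyle\bigcup_{m \in I} (C_a + am)\bigr)$$
for each finite subset $I$ of $\mathbb{Z}^\nu$. For this new system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$, we assign its local Hamiltonian
$$H^a(I) \equiv H\bigl(\textstyle\bigcup_{m \in I} (C_a + am)\bigr)$$
to each finite $I$, where $H(\cdot)$ denotes a local Hamiltonian of the original system $(\mathcal{A}, \{\mathcal{A}(I)\})$.

If $\omega$ is an $a\mathbb{Z}^\nu$-invariant state of the system $(\mathcal{A}, \{\mathcal{A}(I)\})$, then it goes over to a translation invariant state of the new system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$. We denote mean entropy and mean energy of $\omega$ for the system $(\mathcal{A}^a, \{\mathcal{A}^a(I)\})$ by $s^a(\omega)$ and $e^a_\Phi(\omega)$, which are shown to exist by Theorems 10.3 and 9.5. Because of the scale change, we have
$$s(\omega) = \lim_{n\to\infty} \frac{S(\omega_{C_{na}})}{|C_{na}|} = |C_a|^{-1} s^a(\omega), \tag{11.17}$$
$$e_\Phi(\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} = |C_a|^{-1} e^a_\Phi(\omega). \tag{11.18}$$
Hence those properties of mean entropy and mean energy of translation invariant states given in Theorems 10.3 and 9.5 go over to those for periodic states.

Now we show (11.14) for any $a\mathbb{Z}^\nu$-invariant state $\omega$ and any $k \in \mathbb{Z}^\nu$. Due to the $a\mathbb{Z}^\nu$-invariance of $\omega$, we only have to show the assertion for any $k \in C_a$. For any $n \in \mathbb{N}$, we have
$$S(\tau_k^*\omega|_{\mathcal{A}(C_{na})}) = S(\omega|_{\mathcal{A}(C_{na}+k)}), \tag{11.19}$$
which is to be compared with $S(\omega|_{\mathcal{A}(C_{na})})$. Since $k \in C_a$, we have
$$C_{(n-1)a} + a(1, \ldots, 1) \subset C_{na} + k \subset C_{(n+1)a}.$$
By (3.2), (10.5), and the periodicity of $\omega$,
$$\begin{aligned}
S(\omega_{\mathcal{A}(C_{na}+k)}) &\le S(\omega_{\mathcal{A}(C_{(n-1)a})}) + \{|C_{na}| - |C_{(n-1)a}|\} \log 2, \\
S(\omega_{\mathcal{A}(C_{na}+k)}) &\ge S(\omega_{\mathcal{A}(C_{(n+1)a})}) - \{|C_{(n+1)a}| - |C_{na}|\} \log 2.
\end{aligned} \tag{11.20}$$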
Due to
$$\lim_{n\to\infty} \frac{|C_{na}|}{|C_{(n-1)a}|} = 1, \qquad \lim_{n\to\infty} \frac{|C_{na}|}{|C_{(n+1)a}|} = 1, \tag{11.21}$$
and (11.19), we obtain
$$s(\tau_k^*\omega) = \lim_{n\to\infty} \frac{S(\omega_{\mathcal{A}(C_{na}+k)})}{|C_{na}|} = \lim_{n\to\infty} \frac{S(\omega_{\mathcal{A}(C_{na})})}{|C_{na}|} = s(\omega),$$
which is the desired equality (11.14).

It remains to show (11.16). Applying the inequality (8.5) to the pair $I = (C_{(n-1)a} + a(1, \ldots, 1))$, $J = (C_{na} + k) \setminus \{C_{(n-1)a} + a(1, \ldots, 1)\}$ and to the pair $I = (C_{(n-1)a} + a(1, \ldots, 1))$, $J = C_{na} \setminus \{C_{(n-1)a} + a(1, \ldots, 1)\}$, we obtain
$$\|U(C_{na}) - U(C_{na}+k)\| \le \|U(C_{na}) - U(I)\| + \|U(I) - U(C_{na}+k)\| \le 2\|\Phi\| \{|C_{na}| - |C_{(n-1)a}|\},$$
where $I = (C_{(n-1)a} + a(1, \ldots, 1))$. Hence due to (11.21) and the periodicity of $\omega$,
$$e_\Phi(\tau_k^*\omega) = \lim_{n\to\infty} \frac{\omega(U(C_{na}+k))}{|C_{na}|} = \lim_{n\to\infty} \frac{\omega(U(C_{na}))}{|C_{na}|} = e_\Phi(\omega),$$
which is the desired equality (11.16).

Now we resume the proof of Theorem 11.4.

Proof of Theorem 11.4 (continued). Due to $\Theta$-evenness of the internal energy $U(I)$ for every finite $I \subset \mathbb{Z}^\nu$, we have
$$\varphi^c_I \in \mathcal{A}(I)^{*\Theta}_{+,1}. \tag{11.22}$$
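Let $a \in \mathbb{N}$. For distinct $m \in \mathbb{Z}^\nu$, $\{C_a + am\}$ are mutually disjoint and their union for all $m \in \mathbb{Z}^\nu$ is $\mathbb{Z}^\nu$.

We apply Theorem 11.2 to the local Gibbs states $\varphi^c_{C_a+am} \in \mathcal{A}(C_a + am)^{*\Theta}_{+,1}$, $m \in \mathbb{Z}^\nu$, and obtain an even product state of $\mathcal{A}$, which we denote by $\varphi^c_a$.

For any $k \in \mathbb{Z}^\nu$, $\tau^*_{ak}\varphi^c_a = \varphi^c_a$ by the uniqueness of the product state with the same component states. Thus $\varphi^c_a$ is an $a\mathbb{Z}^\nu$-invariant state.

By using $\varphi^c_a$ we construct an averaged state $\widehat{\varphi^c_a}$ which is translation invariant as follows.
$$\widehat{\varphi^c_a} \equiv \sum_{m \in C_a} \frac{\tau_m^* \varphi^c_a}{|C_a|} \in \mathcal{A}^{*\tau}_{+,1}. \tag{11.23}$$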
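We now show (11.12) by taking $\rho_n = \widehat{\varphi^c_n}$. By affine dependence of $s$ and $e_\Phi$ on the space of periodic states in Lemma 11.5,
$$s(\widehat{\varphi^c_a}) = |C_a|^{-1} \sum_{m \in C_a} s(\tau_m^* \varphi^c_a), \qquad e_\Phi(\widehat{\varphi^c_a}) = |C_a|^{-1} \sum_{m \in C_a} e_\Phi(\tau_m^* \varphi^c_a).$$
Due to (11.14) and (11.16), they imply
$$s(\widehat{\varphi^c_a}) = s(\varphi^c_a), \tag{11.24}$$
$$e_\Phi(\widehat{\varphi^c_a}) = e_\Phi(\varphi^c_a). \tag{11.25}$$
By (11.24), we have
$$s(\widehat{\varphi^c_a}) = s(\varphi^c_a) = \frac{1}{|C_a|} S(\varphi^c_{C_a}) = \frac{1}{|C_a|} \bigl\{ \log \mathrm{Tr}_{C_a}(e^{-\beta U(C_a)}) + \beta \varphi^c_a(U(C_a)) \bigr\}, \tag{11.26}$$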
where the last equality is given by the substitution of an explicit form of the density matrix of the local Gibbs state $\varphi^c_{C_a}$ in Definition 7.3.

In order to show (11.12), we now compare $e_\Phi(\varphi^c_a)$ with $\frac{1}{|C_a|}\varphi^c_a(U(C_a))$ in (11.26). Let $k \in \mathbb{N}$ and consider the following division of $C_{ka}$ as a disjoint union of translates of $C_a$:
$$C_{ka} = \bigcup_{m \in C_k} (C_a + am). \tag{11.27}$$
We give the lexicographic ordering for elements in $C_k$ and set
$$C^m_{ka} \equiv \bigcup_{m' < m} (C_a + am')$$
for $m \in C_k$. For any $k \in \mathbb{N}$,
$$U(C_{ka}) - \sum_{m \in C_k} U(C_a + am) = \sum_{m \in C_k} E_{C_{ka} \setminus C^m_{ka}}\bigl(W(C_a + am)\bigr).$$
By $\|E\| \le 1$ and the translation covariance ($\Phi$-f) of the potential $\Phi$, we obtain
$$\frac{1}{|C_{ka}|} \Bigl\| U(C_{ka}) - \sum_{m \in C_k} U(C_a + am) \Bigr\| \le \frac{1}{|C_{ka}|} \bigl( |C_k| \cdot \|W(C_a)\| \bigr) = \frac{\|W(C_a)\|}{|C_a|}. \tag{11.28}$$
Therefore, by (9.1), there exists $a_0 \in \mathbb{N}$ for any $\varepsilon > 0$ such that for all $a > a_0$
$$\frac{1}{|C_{ka}|} \Bigl\| U(C_{ka}) - \sum_{m \in C_k} U(C_a + am) \Bigr\| < \varepsilon. \tag{11.29}$$
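Note that the above $a_0$ can be taken independent of $k \in \mathbb{N}$. For any $a \in \mathbb{N}$,
$$\varphi^c_a(U(C_a + am)) = \varphi^c_a(U(C_a))$$
for any $m \in \mathbb{Z}^\nu$, due to the $a\mathbb{Z}^\nu$-invariance of $\varphi^c_a$. Therefore, we obtain
$$\left| \frac{1}{|C_{ka}|} \varphi^c_a(U(C_{ka})) - \frac{1}{|C_a|} \varphi^c_a(U(C_a)) \right| < \varepsilon$$
for $a > a_0$. By taking the limit $k \to \infty$, we have
$$\left| e_\Phi(\varphi^c_a) - \frac{1}{|C_a|} \varphi^c_a(U(C_a)) \right| \le \varepsilon.$$
From this estimate, (11.25) and (11.26), it follows that
$$\left| s(\widehat{\varphi^c_a}) - \beta e_\Phi(\widehat{\varphi^c_a}) - \frac{1}{|C_a|} \log \mathrm{Tr}_{C_a}(e^{-\beta U(C_a)}) \right| \le |\beta| \varepsilon$$
for all $a > a_0$. This proves (11.12) for $\rho_n = \widehat{\varphi^c_n}$ in view of (9.22).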
11.4. Variational principle

Definition 11.6. Any translation invariant state $\varphi$ satisfying
$$P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) \tag{11.30}$$
(namely, maximizing the functional $s - \beta e_\Phi$) is called a solution of the $(\Phi, \beta)$-variational principle (or a translation invariant equilibrium state for $\Phi$ at the inverse temperature $\beta$). The set of all solutions of the $(\Phi, \beta)$-variational principle is denoted by $\Lambda_{\beta\Phi}$:
$$\Lambda_{\beta\Phi} \equiv \{ \varphi;\ \varphi \in \mathcal{A}^{*\tau}_{+,1},\ P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) \}. \tag{11.31}$$

Remark 1. Since $\beta e_\Phi(\varphi) = e_{\beta\Phi}(\varphi)$, the condition $\varphi \in \Lambda_{\beta\Phi}$ is equivalent to the condition that $\varphi$ is a solution of the $(\beta\Phi, 1)$-variational principle, and hence $\Lambda_{\beta\Phi}$ is a consistent notation.

Remark 2. In the usual physical convention, the functional $s - \beta e_\Phi$ is $-\beta$ times the free energy functional.

Theorem 11.7. For any $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, there exists a solution $\varphi\,(\in \mathcal{A}^{*\tau}_{+,1})$ of the $(\Phi, \beta)$-variational principle, namely,
$$\Lambda_{\beta\Phi} \neq \emptyset.$$

Proof. $\{\widehat{\varphi^c_a}\}$ in the proof of Theorem 11.4 has an accumulation point in $\mathcal{A}^{*\tau}_{+,1}$ by the weak$*$-compactness of $\mathcal{A}^{*\tau}_{+,1}$. Let $\varphi$ be any such accumulation point. By the proof of Theorem 11.4, the weak$*$ continuity of $e_\Phi$ and the weak$*$ upper semicontinuity of $s$ in $\omega$, the state $\varphi$ satisfies
$$P(\beta\Phi) = \lim_{a\to\infty} \bigl( s(\widehat{\varphi^c_a}) - \beta e_\Phi(\widehat{\varphi^c_a}) \bigr) \le s(\varphi) - \beta e_\Phi(\varphi). \tag{11.32}$$
By (11.10), we obtain (11.30).
Our Fermion algebra $\mathcal{A}$ is not asymptotically abelian with respect to the lattice translations, but if $\omega$ is a translation invariant state of $\mathcal{A}$, it is well known that the pair $(\mathcal{A}, \omega)$ is $\mathbb{Z}^\nu$-abelian and that $\omega$ is automatically even (see, for example, Example 5.2.21 in [17]). From this consideration and Theorem 11.4, we obtain the following result, which corresponds to Theorem 6.2.44 in [17] in the case of quantum spin lattice systems, by the same argument as for that theorem. For a convex set $K$, we denote the set of extremal points of $K$ by $E(K)$.

Proposition 11.8. For $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, $\Lambda_{\beta\Phi}$ is a simplex with $E(\Lambda_{\beta\Phi}) \subset E(\mathcal{A}^{*\tau}_{+,1})$ and the unique barycentric decomposition of each $\varphi$ in $\Lambda_{\beta\Phi}$ coincides with its unique ergodic decomposition.

12. Equivalence of Variational Principle and KMS Condition

Among the 5 steps for establishing the equivalence stated in the title (which are described in Sec. 1), Step (1) "KMS condition ⇒ Gibbs condition" is obtained in Theorem 7.5 in Sec. 7.4, Step (4) "dKMS condition on $\mathcal{A}_\circ$ ⇒ dKMS condition on $D(\delta_\alpha)$" is obtained in Corollary 6.7, and Step (5) "dKMS condition on $D(\delta_\alpha)$ ⇒ KMS condition" is stated in Theorem 6.4.

In this section, we complete the remaining two steps of the proof by showing Step (2) "Gibbs condition ⇒ Variational principle" in Sec. 12.1 and Step (3) "Variational principle ⇒ dKMS condition on $\mathcal{A}_\circ$" in Sec. 12.3. As a preparation for the latter, some tools of convex analysis are gathered in Sec. 12.2.

12.1. Variational principle from Gibbs condition

Proposition 12.1. For $\Phi \in \mathcal{P}_\tau$, each translation invariant state $\varphi$ satisfying the $(\Phi, \beta)$-Gibbs condition is a solution of the $(\Phi, \beta)$-variational principle.

Proof. We follow the method of proof in [6]. The Gibbs condition for $\varphi$ implies
$$[\varphi^{\beta W(I)}]|_{\mathcal{A}(I)} = \varphi^c_I \tag{12.1}$$
for every finite subset $I$, where $\varphi^c_I$ is given by (7.14), and $[\varphi^{\beta W(I)}]$ denotes the normalization of $\varphi^{\beta W(I)}$ given by (7.8). By (11.8) with $\omega$ replaced by $\varphi$, we have
$$\begin{aligned}
S(\varphi^c_I, \varphi_I) &= -\hat{S}(\varphi_I) + \beta\varphi(U(I)) + \log \tau(e^{-\beta U(I)}) \\
&= -S(\varphi_I) + \beta\varphi(U(I)) + \log \mathrm{Tr}_I(e^{-\beta U(I)}).
\end{aligned} \tag{12.2}$$
Since relative entropy is nonnegative and is monotone nonincreasing under restriction of states, it follows that
$$0 \le S(\varphi^c_I, \varphi_I) \le S([\varphi^{\beta W(I)}], \varphi).$$
By (7.8), (7.10) and (7.9), we have
$$S([\varphi^{\beta W(I)}], \varphi) = \log(\varphi^{\beta W(I)}(1)) - \varphi(\beta W(I)) \le 2\|\beta W(I)\|.$$
From these estimates and (12.2), it follows that
$$0 \le S(\varphi^c_I, \varphi_I) = -S(\varphi_I) + \beta\varphi(U(I)) + \log \mathrm{Tr}_I(e^{-\beta U(I)}) \le 2\|\beta W(I)\|.$$
(Up to this point, the assumption of translation invariance of $\varphi$ is irrelevant.) We now divide the above inequality by $|I|$ and take the van Hove limit $I \to \infty$. Then by the translation invariance of $\varphi$ and (9.1), we obtain
$$s(\varphi) - \beta e_\Phi(\varphi) = P(\beta\Phi),$$
which completes the proof.

Combining this proposition with Theorem 7.5, we immediately obtain the following.

Corollary 12.2. Let $\alpha_t$ be a dynamics of $\mathcal{A}$ satisfying the Assumptions (II) and (IV) in Sec. 5 and $\Phi$ be the (translation covariant) standard potential uniquely corresponding to this $\alpha_t$. If $\varphi$ is a translation invariant $(\alpha_t, \beta)$-KMS state of $\mathcal{A}$, then $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle.

We have now completed the proof of Theorem A.

12.2. Some tools of convex analysis

We use the pressure functional $\Phi \in \mathcal{P}_\tau \mapsto P(\Phi) \in \mathbb{R}$, which is a norm continuous convex function on the Banach space $\mathcal{P}_\tau$ due to Corollary 9.4.

A continuous linear functional $\alpha \in \mathcal{P}_\tau^*$ (the dual of $\mathcal{P}_\tau$) is called a tangent of the functional $P$ at $\Phi \in \mathcal{P}_\tau$ if it satisfies
$$P(\Phi + \Psi) \ge P(\Phi) + \alpha(\Psi) \tag{12.3}$$
for all $\Psi \in \mathcal{P}_\tau$.

Proposition 12.3. For any solution $\varphi$ of the $(\Phi, 1)$-variational principle, define
$$\alpha_\varphi(\Psi) \equiv -e_\Psi(\varphi) \tag{12.4}$$
for all $\Psi \in \mathcal{P}_\tau$. Then $\alpha_\varphi$ is a tangent of $P$ at $\Phi$.

Proof. By linear dependence (9.26) of $e_\Psi$ on $\Psi$, $\alpha_\varphi$ is a linear functional on $\mathcal{P}_\tau$. Due to $|e_\Psi(\varphi)| \le \|\Psi\|$ given by (9.28), we have $\alpha_\varphi \in \mathcal{P}_\tau^*$. Due to the variational inequality (11.10),
$$P(\Phi + \Psi) \ge s(\varphi) - e_{\Phi+\Psi}(\varphi) = s(\varphi) - e_\Phi(\varphi) - e_\Psi(\varphi) = P(\Phi) + \alpha_\varphi(\Psi)$$
for all $\Psi \in \mathcal{P}_\tau$, where the last equality is due to the assumption that $\varphi$ is a solution of the $(\Phi, 1)$-variational principle.

(We will establish the bijectivity between solutions of the $(\Phi, \beta)$-variational principle and tangents of $P$ at $\beta\Phi$ through (12.4) in Theorem 12.10.)

Since $P(\Phi + k\Psi)$ is a convex continuous function of $k \in \mathbb{R}$ for any fixed $\Phi, \Psi \in \mathcal{P}_\tau$, there exist its right and left derivatives at $k = 0$,
$$(D^\pm_\Psi P)(\Phi) = \lim_{k \to \pm 0} \frac{P(\Phi + k\Psi) - P(\Phi)}{k}.$$
By the convexity of $P$,
$$(D^+_\Psi P)(\Phi) \ge (D^-_\Psi P)(\Phi).$$
If and only if they coincide, $P(\Phi + k\Psi)$ is differentiable at $k = 0$. Then we define
$$(D_\Psi P)(\Phi) = (D^+_\Psi P)(\Phi) = (D^-_\Psi P)(\Phi). \tag{12.5}$$
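A one-variable toy example may help fix the meaning of tangents and one-sided derivatives. The functions $f$ and $g$ below are illustrative stand-ins for $k \mapsto P(\Phi + k\Psi)$; they are assumptions made for this sketch and are not derived from any potential considered in this paper.

```latex
\begin{gather*}
f(k) = |k| : \qquad D^{+}f(0) = 1, \quad D^{-}f(0) = -1, \\
f(k) \ge f(0) + \alpha k \ \text{ for all } k \in \mathbb{R}
   \iff \alpha \in [-1, 1] ; \\[4pt]
g(k) = \log\bigl(e^{k} + e^{-k}\bigr) : \qquad
   D^{+}g(0) = D^{-}g(0) = \tanh(0) = 0, \\
g(k) \ge g(0) + \alpha k \ \text{ for all } k \in \mathbb{R}
   \iff \alpha = 0 .
\end{gather*}
```

For $f$ the tangents at $0$ fill the whole interval $[D^{-}f(0), D^{+}f(0)]$, whereas $g$ has exactly one tangent at $0$. The second situation is the one-dimensional analogue of a point of differentiability in the sense of (12.5).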
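The derivatives $(D^\pm_\Psi P)(\Phi)$ and hence $(D_\Psi P)(\Phi)$ (when it exists) satisfy
$$\begin{aligned}
|(D^\pm_{\Psi_1} P)(\Phi) - (D^\pm_{\Psi_2} P)(\Phi)| &\le \|\Psi_1 - \Psi_2\|, \\
|(D_{\Psi_1} P)(\Phi) - (D_{\Psi_2} P)(\Phi)| &\le \|\Psi_1 - \Psi_2\|,
\end{aligned} \tag{12.6}$$
as is shown by the following computation in the limit $k \to \pm 0$:
$$\left| \frac{\{P(\Phi + k\Psi_1) - P(\Phi)\} - \{P(\Phi + k\Psi_2) - P(\Phi)\}}{k} \right| = \left| \frac{P(\Phi + k\Psi_1) - P(\Phi + k\Psi_2)}{k} \right| \le |k|^{-1} \|k(\Psi_1 - \Psi_2)\| = \|\Psi_1 - \Psi_2\|,$$
where (9.23) is used for the inequality.

If (12.5) holds for all $\Psi$, then $P$ is said to be differentiable at $\Phi$. Let $\mathcal{P}^1_\tau$ be the set of all $\Phi \in \mathcal{P}_\tau$ where $P$ is differentiable.

Proposition 12.4. If $\Phi \in \mathcal{P}^1_\tau$,
$$\alpha_\Phi(\Psi) = (D_\Psi P)(\Phi), \qquad (\Psi \in \mathcal{P}_\tau), \tag{12.7}$$
defines an $\alpha_\Phi \in \mathcal{P}_\tau^*$ which is the unique tangent of $P$ at $\Phi$. Then any solution $\varphi$ of the $(\Phi, 1)$-variational principle satisfies
$$\alpha_\Phi(\Psi) = \alpha_\varphi(\Psi) \tag{12.8}$$
for all $\Psi \in \mathcal{P}_\tau$, where $\alpha_\varphi$ is given by (12.4).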
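Proof. By Theorem 11.7, there is a solution $\varphi$ of the $(\Phi, 1)$-variational principle and, by Proposition 12.3, $\alpha_\varphi$ is a tangent of $P$ at $\Phi$.

Let $\alpha'$ be any tangent of $P$ at $\Phi \in \mathcal{P}^1_\tau$. We have for $k > 0$
$$P(\Phi + k\Psi) \ge P(\Phi) + k\alpha'(\Psi), \qquad P(\Phi - k\Psi) \ge P(\Phi) - k\alpha'(\Psi).$$
Hence
$$(D^+_\Psi P)(\Phi) = \lim_{k \to +0} \frac{P(\Phi + k\Psi) - P(\Phi)}{k} \ge \alpha'(\Psi),$$
$$(D^-_\Psi P)(\Phi) = \lim_{k \to +0} \frac{P(\Phi - k\Psi) - P(\Phi)}{(-k)} \le \alpha'(\Psi).$$
By (12.5) for $\Phi \in \mathcal{P}^1_\tau$, we obtain
$$\alpha'(\Psi) = (D_\Psi P)(\Phi).$$
Then $\alpha'$ is unique and (12.8) holds.

Lemma 12.5. For each $A \in \mathcal{A}_\circ$ such that $A = A^* = \Theta(A)$, there exists $\Psi_A \in \mathcal{P}^f_\tau$ such that
$$e_{\Psi_A}(\varphi) = \varphi(A) - \tau(A) \tag{12.9}$$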
for all translation invariant states $\varphi$.

Proof. Let $A = A^* = \Theta(A) \in \mathcal{A}(I)$ for some finite $I$ and
$$A_1 \equiv A - \tau(A)\mathbf{1} \quad (\in \mathcal{A}(I)).$$
Since $E_{I^c}(A_1) = \tau(A_1)\mathbf{1} = 0$, there exists a unique decomposition
$$A_1 = \sum_{J \subset I,\ J \neq \emptyset} A(J), \qquad A(J) \in \mathcal{A}(J), \tag{12.10}$$
$$E_K(A(J)) = 0 \quad \text{for } K \not\supset J. \tag{12.11}$$
To show these formulae, let
$$A(J) = \sum_{K \subset J} (-1)^{|J| - |K|} E_K(A_1) \tag{12.12}$$
for all non-empty $J \subset I$, a formula in parallel with (5.16). Then
$$E_J(A_1) = \sum_{K \subset J,\ K \neq \emptyset} A(K) \tag{12.13}$$
for $J \subset I$ by exactly the same computation as Step 1 of the proof of Lemma 5.9. (When $J = \emptyset$, the right-hand side is interpreted as $0$ and $E_\emptyset(A_1) = 0$.) We have
$$A(J)^* = A(J) = \Theta(A(J)) \in \mathcal{A}(J), \tag{12.14}$$
because $A(J)$ is a real linear combination of $E_K(A_1)$, $K \subset J$, and all $E_K(A_1)$ satisfy the same equation.

We note that Step 4 of Lemma 5.9 uses only the following properties of $U(K)$:
$$U(\emptyset) = 0, \qquad \tau(U(K)) = 0, \qquad E_K(U(J)) = U(K),$$
for $K \subset J \subset I$, and that all of them are satisfied also by $E_K(A_1)$. Therefore, (12.11) follows from the same argument as Step 4 of Lemma 5.9.

We now construct $\Psi_J \in \mathcal{P}^f_\tau$ for each $A(J)$ in (12.10) such that
$$e_{\Psi_J}(\varphi) = \varphi(A(J)) \tag{12.15}$$
for all translation invariant states $\varphi$. Then by linear dependence of $e_\Psi$ on $\Psi \in \mathcal{P}_\tau$, we obtain for $\Psi = \sum_{J \subset I} \Psi_J$ the desired relation (12.9):
$$e_\Psi(\varphi) = \sum_{J \subset I} e_{\Psi_J}(\varphi) = \sum_{J \subset I} \varphi(A(J)) = \varphi(A_1) = \varphi(A) - \tau(A).$$
We define a potential $\Psi_J$ for each $J \subset I$, $J \neq \emptyset$ by
$$\Psi_J(J + m) = \tau_m(A(J)), \quad (m \in \mathbb{Z}^\nu), \qquad \Psi_J(K) = 0 \ \text{ if } K \text{ is not a translate of } J. \tag{12.16}$$
Due to the properties (12.14) and (12.11), $\Psi_J$ belongs to $\mathcal{P}^f_\tau$. We compute
$$\frac{1}{|C_a|} \varphi(U_{\Psi_J}(C_a)) = \frac{1}{|C_a|} \varphi\Bigl( \sum \{ \Psi_J(J + m);\ J + m \subset C_a \} \Bigr) = \frac{N_a}{|C_a|} \varphi(A(J)),$$
where $N_a$ is the number of $m$ such that $J + m \subset C_a$.

We now show that $\frac{N_a}{|C_a|} \to 1$ as $a \to \infty$. Since $J + m \subset C_a$ is equivalent to $J \subset C_a - m$, $N_a$ is the same as $l(a, J)$ (the number of translates of $C_a$ containing $J$). By (8.11),
$$\lim_{a\to\infty} \frac{N_a}{|C_a|} = \lim_{a\to\infty} \frac{l(a, J)}{|C_a|} = 1.$$
Hence
$$e_{\Psi_J}(\varphi) = \lim_{a\to\infty} \frac{1}{|C_a|} \varphi(U_{\Psi_J}(C_a)) = \varphi(A(J)).$$
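The inversion formulae (12.12) and (12.13) can be checked by hand for a two-point region. The sketch below takes $I = \{x, y\}$, writes $E_x \equiv E_{\{x\}}$ and $E_y \equiv E_{\{y\}}$, and assumes the property $E_J E_K = E_{J \cap K}$ of the conditional expectations together with $E_\emptyset(A_1) = \tau(A_1)\mathbf{1} = 0$. The labels $x$, $y$ are introduced only for this illustration.

```latex
\begin{align*}
A(\{x\}) &= E_x(A_1) - E_\emptyset(A_1) = E_x(A_1), \qquad
A(\{y\}) = E_y(A_1), \\
A(\{x,y\}) &= E_{\{x,y\}}(A_1) - E_x(A_1) - E_y(A_1) + E_\emptyset(A_1)
            = A_1 - E_x(A_1) - E_y(A_1), \\
\sum_{\emptyset \neq J \subset I} A(J) &= A_1
   && \text{(this is (12.10))}, \\
E_x\bigl(A(\{x,y\})\bigr) &= E_x(A_1) - E_x(A_1) - E_{\{x\}\cap\{y\}}(A_1)
   = -E_\emptyset(A_1) = 0
   && \text{(an instance of (12.11))}, \\
E_y\bigl(A(\{x\})\bigr) &= E_{\{y\}\cap\{x\}}(A_1) = E_\emptyset(A_1) = 0
   && \text{(another instance of (12.11))}.
\end{align*}
```

In this small case every term of the decomposition is annihilated by the conditional expectation onto any region that does not contain its support, which is the property used to show that $\Psi_J$ is a standard potential.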
Corollary 12.6. If $\varphi_1$ and $\varphi_2$ are distinct solutions of the $(\Phi, 1)$-variational principle for $\Phi \in \mathcal{P}_\tau$, then the corresponding tangents of $P$ at $\Phi$ are distinct, that is, $\alpha_{\varphi_1} \neq \alpha_{\varphi_2}$.

Proof. If $\varphi_1 \neq \varphi_2$, there exists an $A \in \mathcal{A}_\circ$ such that $\varphi_1(A) \neq \varphi_2(A)$. Let $A_\pm = \frac{1}{2}(A \pm \Theta(A))$. Then $A = A_+ + A_-$. Since $\varphi_1$ and $\varphi_2$ are translation invariant, both of them are $\Theta$-even, and hence $\varphi_1(A_-) = \varphi_2(A_-) = 0$. Thus $\varphi_1(A_+) \neq \varphi_2(A_+)$. So we may assume that $\Theta(A) = A$. Let $A_1 = \frac{1}{2}(A + A^*)$, $A_2 = \frac{1}{2i}(A - A^*)$, so that $A = A_1 + iA_2$. Then either $\varphi_1(A_1) \neq \varphi_2(A_1)$ or $\varphi_1(A_2) \neq \varphi_2(A_2)$. Since $A_1^* = A_1$ and $A_2^* = A_2$, we may assume $A = A^* = \Theta(A)$. Let $\Psi_A \in \mathcal{P}^f_\tau$ be given as in Lemma 12.5 for this $A \in \mathcal{A}_\circ$. Then
$$\alpha_{\varphi_1}(\Psi_A) = -e_{\Psi_A}(\varphi_1) = -\varphi_1(A) + \tau(A) \neq -\varphi_2(A) + \tau(A) = -e_{\Psi_A}(\varphi_2) = \alpha_{\varphi_2}(\Psi_A).$$
Hence $\alpha_{\varphi_1} \neq \alpha_{\varphi_2}$.

Corollary 12.7. For $\Phi \in \mathcal{P}^1_\tau$, a solution of the $(\Phi, 1)$-variational principle is unique.

Proof. This follows from Proposition 12.4 and Corollary 12.6.

We will use the following result in the proof of Theorem 12.11.

Theorem 12.8. (1) The set $\mathcal{P}^1_\tau$ of points of unique tangent of $P$ is residual (an intersection of a countable number of dense open sets) and dense in $\mathcal{P}_\tau$.

(2) For any $\Phi \in \mathcal{P}_\tau$, any tangent of $P$ at $\Phi$ is contained in the weak$*$ closed convex hull of the set $\Gamma(\Phi)$ which is defined by
$$\Gamma(\Phi) \equiv \{ \alpha \in \mathcal{P}_\tau^*;\ \text{there exists a net } \Phi_\gamma \in \mathcal{P}^1_\tau \text{ such that } \|\Phi_\gamma - \Phi\| \to 0 \text{ and } \alpha_{\Phi_\gamma} \to \alpha \text{ in the weak}{*} \text{ topology of } \mathcal{P}_\tau^* \}, \tag{12.17}$$
where $\alpha_{\Phi_\gamma}$ is the unique tangent of $P$ at $\Phi_\gamma$.

Proof. (1) is Mazur's theorem [31]. (2) is Theorem 1 of [26] where the function $f$ is to be set $f(\Psi) = P(\Phi + \Psi)$ for our purpose. The proof in [26] is by the Hahn–Banach theorem. (Separability of $\mathcal{P}_\tau$ given by Corollary 8.13 is needed for both (1) and (2).)

We now show a bijective correspondence between solutions of the $(\Phi, \beta)$-variational principle and tangents of $P$ at $\beta\Phi$. We first prove a lemma about stability of solutions of the variational principle under the limiting procedure in (12.17).

Lemma 12.9. Let $\{\Phi_\gamma\}$ be a net in $\mathcal{P}_\tau$ and $\{\varphi_\gamma\}$ be a net consisting of a solution $\varphi_\gamma$ of the $(\Phi_\gamma, \beta_\gamma)$-variational principle for each index $\gamma$ such that
$$\|\Phi_\gamma - \Phi\| \to 0 \ (\Phi \in \mathcal{P}_\tau), \qquad \beta_\gamma \to \beta \in \mathbb{R}, \qquad \varphi_\gamma \to \varphi \in \mathcal{A}^{*\tau}_{+,1} \ \text{in the weak}{*} \text{ topology of } \mathcal{A}^{*\tau}_{+,1}.$$
Then $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle.
Proof. By the norm continuity (9.23) of $P$, the weak$*$ upper semicontinuity of $s$ (Theorem 10.3) and the continuous dependence of $e_\Phi(\varphi)$ on $\Phi$ in the norm topology (uniformly in $\varphi$) and on $\varphi$ in the weak$*$ topology (Theorem 9.5), we have
$$P(\beta\Phi) = \lim_\gamma P(\beta_\gamma \Phi_\gamma), \qquad s(\varphi) \ge \limsup_\gamma s(\varphi_\gamma), \qquad e_\Phi(\varphi) = \lim_\gamma e_{\Phi_\gamma}(\varphi_\gamma).$$
Since $\varphi_\gamma$ is a solution of the $(\Phi_\gamma, \beta_\gamma)$-variational principle, we have
$$P(\beta_\gamma \Phi_\gamma) = s(\varphi_\gamma) - \beta_\gamma e_{\Phi_\gamma}(\varphi_\gamma).$$
Hence
$$P(\beta\Phi) \le s(\varphi) - \beta e_\Phi(\varphi).$$
By the variational inequality (11.10), we have
$$P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi).$$

Theorem 12.10. For any $\Phi \in \mathcal{P}_\tau$ and $\beta \in \mathbb{R}$, there exists a bijective affine map $\varphi \mapsto \alpha_\varphi$ from the set $\Lambda_{\beta\Phi}$ to the set of all tangents of the functional $P$ at $\beta\Phi$, given by
$$\alpha_\varphi(\Psi) = -e_\Psi(\varphi), \qquad \Psi \in \mathcal{P}_\tau. \tag{12.18}$$

Proof. By Remark 1 after Definition 11.6, all solutions of the $(\Phi, \beta)$- and $(\beta\Phi, 1)$-variational principle coincide. Furthermore, if $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle, then
$$P(\beta\Phi + \Psi) \ge s(\varphi) - e_{\beta\Phi + \Psi}(\varphi) = s(\varphi) - \beta e_\Phi(\varphi) - e_\Psi(\varphi) = P(\beta\Phi) + \alpha_\varphi(\Psi).$$
Namely $\alpha_\varphi$ is a tangent of $P$ at $\beta\Phi$, exactly the same statement as for a solution $\varphi$ of the $(\beta\Phi, 1)$-variational principle. Therefore, it is enough to prove the case of $\beta = 1$.

The map $\varphi \mapsto \alpha_\varphi$ is an affine map from the set of all solutions of the $(\Phi, 1)$-variational principle into the set of all tangents of $P$ at $\Phi$. The map is injective by Corollary 12.6.

To show the surjectivity of the map, let $\alpha$ be a tangent of $P$ at $\Phi$. By Theorem 12.8, there exists a net $\Phi_\gamma \in \mathcal{P}^1_\tau$ such that $\|\Phi_\gamma - \Phi\| \to 0$, and $\alpha_{\Phi_\gamma} \to \alpha$ in the weak$*$ topology of $\mathcal{P}_\tau^*$, where $\alpha_{\Phi_\gamma}$ is the unique tangent of $P$ at $\Phi_\gamma$. By Theorem 11.7, there exists a solution $\varphi_\gamma$ of the $(\Phi_\gamma, 1)$-variational principle. By Proposition 12.3, $\alpha_{\varphi_\gamma}$ is a tangent of $P$ at $\Phi_\gamma$ and hence must coincide with the unique tangent $\alpha_{\Phi_\gamma}$. Due to the weak$*$ compactness of $\mathcal{A}^{*\tau}_{+,1}$, there exists a subnet $\{\varphi_{\gamma(\mu)}\}_\mu$ which converges to some $\varphi \in \mathcal{A}^{*\tau}_{+,1}$. By Lemma 12.9 and by $\|\Phi_{\gamma(\mu)} - \Phi\| \to 0$, $\varphi$ must be a solution of the $(\Phi, 1)$-variational principle. Furthermore, for any $\Psi \in \mathcal{P}_\tau$, we have
$$\alpha_\varphi(\Psi) = -e_\Psi(\varphi) = -\lim_\mu e_\Psi(\varphi_{\gamma(\mu)}) = \lim_\mu \alpha_{\Phi_{\gamma(\mu)}}(\Psi) = \alpha(\Psi).$$
Hence $\alpha = \alpha_\varphi$ and the map $\varphi \mapsto \alpha_\varphi$ is surjective.

12.3. Differential KMS condition from variational principle

In this subsection, we give a proof for Step 3.

Theorem 12.11. Let $\Phi \in \mathcal{P}_\tau$ and $\varphi$ be a translation invariant state. If $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle, then $\varphi$ is a $(\delta_\Phi, \beta)$-dKMS state, where $\delta_\Phi \in \Delta(\mathcal{A}_\circ)$ corresponds to $\Phi$ by the bijective linear map of Corollary 8.5.

Remark. We note that this theorem holds for any $\Phi \in \mathcal{P}_\tau$ without any further assumption on $\Phi$ and we do not need $\alpha_t$. Note that the domain $D(\delta_\Phi)$ is $\mathcal{A}_\circ$ by definition.

First we present some estimate needed in the proof of this theorem in the form of the following lemma.

Lemma 12.12. Let $I$ and $J$ be finite subsets of $\mathbb{Z}^\nu$. If $A \in \mathcal{A}(J)$, then
$$\|[U(I), A]\| \le 2\|\Phi\| \cdot \|A\| \cdot |I \cap J|. \tag{12.19}$$

Proof. Let $I_0$ be the complement of $I \cap J$ in $I$. Then $I_0 \cap J = \emptyset$ and hence $U(I_0)$ commutes with $A\,(\in \mathcal{A}(J))$ due to $U(I_0) \in \mathcal{A}(I_0)_+ \subset \mathcal{A}(J)'$. Since $I_0$ and $I \cap J$ are disjoint and have the union $I$, the following computation proves (12.19):
$$\|[U(I), A]\| = \|[U(I) - U(I_0), A]\| \le 2\|U(I) - U(I_0)\|\,\|A\| \le 2\|\Phi\| \cdot \|A\| \cdot |I \cap J|,$$
where the last inequality is due to (8.5).

Proof of Theorem 12.11. We note that the $(\Phi, \beta)$-variational principle and the $(\beta\Phi, 1)$-variational principle are the same and the $(\delta_\Phi, \beta)$-dKMS condition and the $(\delta_{\beta\Phi}, 1)$-dKMS condition are the same. By taking $\beta\Phi$ as a new $\Phi$, we only have to prove the case $\beta = 1$.
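Let $\widehat{\varphi^c_a}$ be the translation invariant state defined by (11.23) in the proof of Theorem 11.4. Let $\varphi$ be any accumulation point of $\{\widehat{\varphi^c_a}\}_{a \in \mathbb{N}}$. Then this $\varphi$ is a solution of the $(\Phi, 1)$-variational principle as shown in Theorem 11.7.

For the moment, let us assume $\Phi \in \mathcal{P}^1_\tau$ (the set of $\Phi \in \mathcal{P}_\tau$ where $P$ is differentiable, defined in Sec. 12.2). Due to the assumption $\Phi \in \mathcal{P}^1_\tau$, any accumulation point of $\{\widehat{\varphi^c_a}\}_{a \in \mathbb{N}}$ coincides with the unique solution $\varphi$ of the $(\Phi, 1)$-variational principle, and hence
$$\lim_{a\to\infty} \widehat{\varphi^c_a} = \varphi. \tag{12.20}$$
We now prove that the above $\varphi$ satisfies the conditions (C-1) and (C-2) of Definition 6.3 for each $A \in \mathcal{A}_\circ$ by using (12.20).

Let $A \in \mathcal{A}(I)$ for a finite subset $I$ of $\mathbb{Z}^\nu$. Suppose $C_a - k \supset I$ ($a \in \mathbb{N}$, $k \in \mathbb{Z}^\nu$). Since $\tau_k^*\varphi^c_a$ is the $(\mathrm{Ad}\, e^{itU(C_a - k)}, 1)$-KMS state on $\mathcal{A}(C_a - k)$, we have
$$\mathrm{Re}(\tau_k^*\varphi^c_a)(A^*[iU(C_a - k), A]) = 0, \tag{12.21}$$
$$\mathrm{Im}(\tau_k^*\varphi^c_a)(A^*[iU(C_a - k), A]) \ge S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)). \tag{12.22}$$
Our strategy of the proof is to replace $\tau_k^*\varphi^c_a$ and $[iU(C_a - k), A]$ by $\varphi$ and $\delta_\Phi(A)$, respectively, by using an approximation argument.

By (4.23) for $J \nearrow \mathbb{Z}^\nu$, there exists a finite subset $J_\varepsilon$ of $\mathbb{Z}^\nu$ for any given $\varepsilon > 0$ such that
$$\|H(I) - E_J(H(I))\| < \varepsilon \tag{12.23}$$
for all $J \supset J_\varepsilon$. Let $b$ be sufficiently large so that there exists a translate $C_b - l_0$ of $C_b$ containing both $I$ and $J_\varepsilon$.

We will use the following convenient expression for $\widehat{\varphi^c_a}\,(\in \mathcal{A}^{*\tau}_{+,1})$ which is equivalent to (11.23):
$$\widehat{\varphi^c_a} = \tau_l^* \widehat{\varphi^c_a} = \sum_{m \in C_a} \frac{\tau^*_{l+m}\varphi^c_a}{|C_a|} = \sum_{m \in (C_a + l)} \frac{\tau^*_m \varphi^c_a}{|C_a|} \tag{12.24}$$
for any $l \in \mathbb{Z}^\nu$. We will take $l = l_0$.

We divide $C_a + l_0$ into the following two disjoint subsets when $a > b$:
$$C^1 \equiv C_{a-b} + l_0, \qquad C^2 \equiv (C_a + l_0) \setminus C^1. \tag{12.25}$$
Then
$$C_a - k \supset C_b - l_0 \supset I \cup J_\varepsilon \tag{12.26}$$
if $k \in C^1$, while
$$\frac{|C^2|}{|C_a|} = 1 - \frac{|C_{a-b}|}{|C_a|} \to 0 \tag{12.27}$$
as $a \to \infty$.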
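For $k \in C^1$, $A\,(\in \mathcal{A}(I))$ belongs to $\mathcal{A}(C_a - k)$ due to $I \subset C_a - k$. By using the general property of the conditional expectation, we have
$$i[U(C_a - k), A] = iE_{C_a - k}([H(C_a - k), A]) = iE_{C_a - k}([H(I), A]) = i[E_{C_a - k}(H(I)), A].$$
By (12.23) for $J = C_a - k\,(\supset J_\varepsilon)$, this implies
$$\|i[H(I), A] - i[U(C_a - k), A]\| < 2\varepsilon\|A\|.$$
Noting that $\delta_\Phi(A) = i[H(I), A]$, we have
$$\|\delta_\Phi(A) - i[U(C_a - k), A]\| < 2\varepsilon\|A\|. \tag{12.28}$$
It follows from (12.21) and (12.28) that
$$|\mathrm{Re}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A))| < 2\varepsilon\|A\|^2 \tag{12.29}$$
for $k \in C^1$. For $k \in C^2$, we use the following obvious estimate.
$$|\mathrm{Re}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A))| \le \|A^*\delta_\Phi(A)\|. \tag{12.30}$$
Substituting (12.29) and (12.30) into (12.24), we obtain
$$\begin{aligned}
|\mathrm{Re}\, \widehat{\varphi^c_a}(A^*\delta_\Phi(A))| &\le \left| \mathrm{Re} \Bigl( \sum_{k \in C^1} \frac{1}{|C_a|} \tau_k^*\varphi^c_a \Bigr)(A^*\delta_\Phi(A)) \right| + \left| \mathrm{Re} \Bigl( \sum_{k \in C^2} \frac{1}{|C_a|} \tau_k^*\varphi^c_a \Bigr)(A^*\delta_\Phi(A)) \right| \\
&\le 2\varepsilon\|A\|^2 + \frac{|C^2|}{|C_a|} \|A^*\delta_\Phi(A)\|.
\end{aligned}$$
Taking the limit $a \to \infty$ and using (12.27), we obtain
$$|\mathrm{Re}\, \varphi(A^*\delta_\Phi(A))| \le 2\varepsilon\|A\|^2.$$
Due to arbitrariness of $\varepsilon > 0$, we obtain
$$|\mathrm{Re}\, \varphi(A^*\delta_\Phi(A))| = 0.$$
Hence the condition (C-1) holds.

By (12.22) and (12.28), we have the following inequality for $k \in C^1$,
$$\mathrm{Im}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A)) \ge S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) - 2\varepsilon\|A\|^2.$$
For $k \in C^2$, we use simply the following estimate.
$$\mathrm{Im}(\tau_k^*\varphi^c_a)(A^*\delta_\Phi(A)) \ge -\|A^*\delta_\Phi(A)\|. \tag{12.31}$$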
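From these inequalities, we obtain
$$\begin{aligned}
\mathrm{Im}\, \widehat{\varphi^c_a}(A^*\delta_\Phi(A)) &= \mathrm{Im} \Bigl( \sum_{k \in C^1} \frac{1}{|C_a|} \tau_k^*\varphi^c_a \Bigr)(A^*\delta_\Phi(A)) + \mathrm{Im} \Bigl( \sum_{k \in C^2} \frac{1}{|C_a|} \tau_k^*\varphi^c_a \Bigr)(A^*\delta_\Phi(A)) \\
&\ge \frac{1}{|C_a|} \sum_{k \in C^1} S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) - 2\frac{|C^1|}{|C_a|}\varepsilon\|A\|^2 - \frac{|C^2|}{|C_a|}\|A^*\delta_\Phi(A)\|.
\end{aligned} \tag{12.32}$$
Due to the estimate (12.27), the last term tends to $0$ as $a \to \infty$, while the second last term tends to $-2\varepsilon\|A\|^2$ as $a \to \infty$. Due to the convexity of $S(\cdot, \cdot)$ in two variables, the first term on the right-hand side has the following lower bound:
$$\frac{1}{|C_a|} \sum_{k \in C^1} S(\tau_k^*\varphi^c_a(AA^*), \tau_k^*\varphi^c_a(A^*A)) \ge \frac{|C^1|}{|C_a|} S\bigl(\widehat{\varphi^c_a}{}'(AA^*), \widehat{\varphi^c_a}{}'(A^*A)\bigr), \tag{12.33}$$
where $\widehat{\varphi^c_a}{}'$ is a state of $\mathcal{A}$ defined by
$$\widehat{\varphi^c_a}{}'(B) \equiv \frac{1}{|C^1|} \sum_{k \in C^1} \tau_k^*\varphi^c_a(B), \qquad B \in \mathcal{A}.$$
The difference of the states $\widehat{\varphi^c_a}{}'$ and $\widehat{\varphi^c_a}$ can be estimated as
$$\widehat{\varphi^c_a}{}' - \widehat{\varphi^c_a} = \Bigl( \frac{1}{|C^1|} - \frac{1}{|C_a|} \Bigr) \sum_{k \in C^1} \tau_k^*\varphi^c_a - \frac{1}{|C_a|} \sum_{k \in C^2} \tau_k^*\varphi^c_a = \frac{|C^2|}{|C_a|} \widehat{\varphi^c_a}{}' - \frac{1}{|C_a|} \sum_{k \in C^2} \tau_k^*\varphi^c_a.$$
Hence
$$\|\widehat{\varphi^c_a}{}' - \widehat{\varphi^c_a}\| \le 2\frac{|C^2|}{|C_a|},$$
which tends to $0$ as $a \to \infty$ by (12.27). We note
$$\lim_a \widehat{\varphi^c_a}{}'(AA^*) = \lim_a \widehat{\varphi^c_a}(AA^*) = \varphi(AA^*), \qquad \lim_a \widehat{\varphi^c_a}{}'(A^*A) = \lim_a \widehat{\varphi^c_a}(A^*A) = \varphi(A^*A).$$
By the lower semi-continuity of $S(\cdot, \cdot)$, we obtain
$$\liminf_a S\bigl(\widehat{\varphi^c_a}{}'(AA^*), \widehat{\varphi^c_a}{}'(A^*A)\bigr) \ge S(\varphi(AA^*), \varphi(A^*A)). \tag{12.34}$$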
Combining the estimates (12.32), (12.33), (12.34) as well as (12.27), we obtain the following inequality in the limit $a \to \infty$.
$$\mathrm{Im}\, \varphi(A^*\delta_\Phi(A)) \ge S(\varphi(AA^*), \varphi(A^*A)) - 2\varepsilon\|A\|^2.$$
Due to arbitrariness of $\varepsilon$, we have
$$\mathrm{Im}\, \varphi(A^*\delta_\Phi(A)) \ge S(\varphi(AA^*), \varphi(A^*A))$$
for $A \in \mathcal{A}_\circ$. Hence the condition (C-2) holds.

Thus, we have shown that $\varphi$ satisfies the $(\delta_\Phi, 1)$-dKMS condition if $\varphi$ is the (unique) solution of the $(\Phi, 1)$-variational principle when $\Phi \in \mathcal{P}^1_\tau$.

For general $\Phi \in \mathcal{P}_\tau$, we will use the standard argument of convex analysis in the same way as [26], or Theorem 6.2.42 in [17]. By Theorem 12.8, any solution of the $(\Phi, 1)$-variational principle can be obtained by successive use of the following procedures, starting with the unique solution $\varphi_\alpha$ of the $(\Phi_\alpha, 1)$-variational principle for $\Phi_\alpha \in \mathcal{P}^1_\tau$.

(1) Weak$*$ limits of any converging nets $\varphi_\alpha$ such that $\|\Phi_\alpha - \Phi\| \to 0$.
(2) Convex combinations of limits obtained in (1).
(3) Weak$*$ limits of a converging net of states obtained in (2).

By Lemma 6.6, the conditions (C-1) and (C-2) are stable under these procedures. As we have already shown these conditions for $\varphi_\alpha$ when $\Phi_\alpha$ belongs to $\mathcal{P}^1_\tau$, the same holds for any $\Phi \in \mathcal{P}_\tau$.

We have now shown Theorem B.

13. Use of Other Entropy in the Variational Equality

We now consider the possibility of replacing the mean entropy $s(\omega)$ in Theorem 11.4 by another entropy. We take up the CNT entropy $h_\omega(\tau)$ with respect to the lattice translation automorphism group $\tau$ as one example. But readers will find that any other entropy will do if it has those basic properties of CNT entropy which are used in the proof of Theorem 13.2. Note that it is not known so far whether CNT entropy is equal to the mean entropy or not, either in some general context or in the present case.

13.1. CNT-entropy

The CNT-entropy is introduced by Connes–Narnhofer–Thirring [19] for a single automorphism and its invariant state, and is extended by Hudetz [22] to the multi-dimensional case of the group $\mathbb{Z}^\nu$ generated by a finite number ($= \nu$) of commuting automorphisms. We will use the latter extended version for the group of lattice translation automorphisms $\tau_m$ ($m \in \mathbb{Z}^\nu$).
April 11, 2003 14:43 WSPC/148-RMP
184
00160
H. Araki & H. Moriya
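For a positive integer $k$, we consider a finite decomposition of a state $\omega$ in the state space $\mathcal{A}^*_{+,1}$:
\[
\omega = \sum_{i(1),i(2),\dots,i(k)} \omega_{i(1)i(2)\cdots i(k)} , \tag{13.1}
\]
where each $i(l)$ runs over a finite subset of $\mathbb{N}$, $l = 1,\dots,k$, and $\omega_{i(1)i(2)\cdots i(k)}$ is a nonzero positive linear functional of $\mathcal{A}$. For each fixed $l$ and $i(l)$, let
\[
\omega^l_{i(l)} \equiv \sum_{\substack{i(1),i(2),\dots,i(k) \\ i(l):\,\text{fixed}}} \omega_{i(1)i(2)\cdots i(k)} , \qquad
\hat\omega^l_{i(l)} \equiv \frac{\omega^l_{i(l)}}{\omega^l_{i(l)}(\mathbf{1})} . \tag{13.2}
\]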
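Let $\eta(x) \equiv -x\log x$ for $x > 0$ and $\eta(0) = 0$. For finite dimensional subalgebras $\mathcal{A}_1, \mathcal{A}_2, \dots, \mathcal{A}_k$ of $\mathcal{A}$, the so-called algebraic entropy $H_\omega(\mathcal{A}_1, \mathcal{A}_2, \dots, \mathcal{A}_k)$ is defined by
\[
H_\omega(\mathcal{A}_1, \dots, \mathcal{A}_k) \equiv \sup \Biggl[ \sum_{i(1),\dots,i(k)} \eta\bigl(\omega_{i(1)i(2)\cdots i(k)}(\mathbf{1})\bigr)
- \sum_{l=1}^{k} \sum_{i(l)} \eta\bigl(\omega^l_{i(l)}(\mathbf{1})\bigr)
+ \sum_{l=1}^{k} S(\omega|_{\mathcal{A}_l})
- \sum_{l=1}^{k} \sum_{i(l)} \omega^l_{i(l)}(\mathbf{1})\, S(\hat\omega^l_{i(l)}|_{\mathcal{A}_l}) \Biggr] , \tag{13.3}
\]
where the supremum is taken over all finite decompositions (13.1) of $\omega$ with a fixed $k$.

If $\omega$ is $\tau$-invariant, the following limit (denoted by $h_{\omega,\tau}(N)$) is known to exist (as the infimum over $a$) for any finite dimensional subalgebra $N \subset \mathcal{A}$:
\[
h_{\omega,\tau}(N) \equiv \lim_{a\to\infty} \frac{1}{|C_a|}\, H_\omega\bigl(N, \dots, \tau_k(N), \dots, \tau_{(a-1,\dots,a-1)}(N)\bigr) ,
\]
where there are $|C_a|$ arguments for $H_\omega(\cdots)$ and each of them is $\tau_k(N)$, $k \in C_a$.

Let $N_1 \subset N_2 \subset \cdots \subset N_n \subset \cdots$ be an increasing sequence of finite dimensional algebras such that the norm closure of $\bigcup_n N_n$ is equal to $\mathcal{A}$. By a Kolmogorov–Sinai type theorem (Corollary V.4 in [19]), the CNT-entropy $h_\omega(\tau)$ is given by
\[
h_\omega(\tau) = \lim_{n\to\infty} h_{\omega,\tau}(N_n) . \tag{13.4}
\]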
13.2. Variational equality in terms of CNT-entropy

Let $J_1, J_2, \dots, J_k$ be disjoint finite subsets of $\mathbb{Z}^\nu$ with their union $J$. From Lemma VIII.1 in [19] it follows that
\[
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \dots, \mathcal{A}(J_k)) \le S(\omega_J) . \tag{13.5}
\]
April 11, 2003 14:43 WSPC/148-RMP
00160
Equilibrium Statistical Mechanics of Fermion Lattice Systems
185
When $\omega$ is an even 'product state', the equality holds as follows (the following simple proof is due to a referee).

Lemma 13.1. Let $J_1, J_2, \dots, J_k$ be disjoint finite subsets with their union $J$. Let $\omega$ be a $\Theta$-even state of $\mathcal{A}$. Assume that $\omega$ has the following product property:
\[
\omega(A_1 A_2 \cdots A_k B) = \omega(A_1)\,\omega(A_2) \cdots \omega(A_k)\,\omega(B) , \tag{13.6}
\]
where $A_j$ is an arbitrary element in $\mathcal{A}(J_j)$ ($j = 1, \dots, k$) and $B$ is an arbitrary element in $\mathcal{A}(J^c)$. Then
\[
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \dots, \mathcal{A}(J_k)) = S(\omega_J) = \sum_{l=1}^{k} S(\omega_{J_l}) , \tag{13.7}
\]
and
\[
H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \dots, \mathcal{A}(J_k)) = \sum_{l=1}^{k} H_\omega(\mathcal{A}(J_l)) . \tag{13.8}
\]
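Proof. We define
\[
E_{J_i}(+) \equiv \frac{1}{2}\,(\mathrm{id} + \Theta^{J_i}) .
\]
Then $E_{J_1,\dots,J_k}(+) \equiv E_{J_1}(+) \cdots E_{J_k}(+)$ is the conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+ \otimes \mathcal{A}(J^c)$. Since $\omega$ is a product state for the tensor product $(\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+) \otimes \mathcal{A}(J^c)$, there exists an $\omega$-preserving conditional expectation $E'_\omega$ from $(\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+) \otimes \mathcal{A}(J^c)$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+$. Hence
\[
E^{\omega}_{J_1,\dots,J_k}(+) \equiv E'_\omega\, E_{J_1,\dots,J_k}(+)
\]
is an $\omega$-preserving conditional expectation from $\mathcal{A}$ onto $\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+$. Hence
\[
H_\omega(\mathcal{A}(J_1)_+, \dots, \mathcal{A}(J_k)_+)
= H_{\omega|_{\mathcal{A}(J_1)_+ \otimes \cdots \otimes \mathcal{A}(J_k)_+}}(\mathcal{A}(J_1)_+, \dots, \mathcal{A}(J_k)_+)
= \sum_{l=1}^{k} S(\omega|_{\mathcal{A}(J_l)_+}) = S(\omega_J) .
\]
On the other hand,
\[
H_\omega(\mathcal{A}(J_1)_+, \mathcal{A}(J_2)_+, \dots, \mathcal{A}(J_k)_+) \le H_\omega(\mathcal{A}(J_1), \mathcal{A}(J_2), \dots, \mathcal{A}(J_k)) \le S(\omega_J) .
\]

We are now in a position to give the main theorem of this subsection.

Theorem 13.2. Assume the same conditions on $\Phi$ as in Theorem 11.4. Then
\[
P(\beta\Phi) = \sup_{\omega \in \mathcal{A}^{*\,\tau}_{+,1}} \bigl[ h_\omega(\tau) - \beta e_\Phi(\omega) \bigr] , \tag{13.9}
\]
where $h_\omega(\tau)$ is the CNT-entropy of $\omega$ with respect to the lattice translation $\tau$.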
Proof. Based on Lemma 13.1, the proof goes in the same way as in the case of quantum lattice systems [32]. The basic properties of the CNT-entropy which we use in the proof are as follows.

(i) Covariance under an automorphism of $\mathcal{A}$ (the adjoint action on states and conjugacy action on the shift).
(ii) Scaling property under the scaling of the automorphism group.
(iii) Concave dependence on states.

Due to (13.5), we have
\[
h_\omega(\tau) \le s(\omega) , \tag{13.10}
\]
for any translation invariant state $\omega$. Hence the variational inequality (11.10) obviously holds when $s(\omega)$ is replaced by $h_\omega(\tau)$. Due to Lemma 13.1 and the product property of $\varphi^c_a$, the translation invariant state $\widehat{\varphi^c_a}$ defined in (11.23) will play an identical role as in the proof of Theorem 11.4. Therefore the sequence
\[
\bigl\{ h_{\widehat{\varphi^c_a}}(\tau) - e_\Phi(\widehat{\varphi^c_a}) \bigr\}
\]
tends to the supremum value $P(\Phi)$ of the variational inequality as $a \to \infty$. Hence the theorem follows.

Remark. (iii) is a general property of the CNT-entropy (see e.g. [41]) and is enough for the proof. But in the situation of the above proof, the affinity holds due to the specific nature of the states to be considered.

The preceding result is the variational equality. We are then interested in the variational principle.

Proposition 13.3. Suppose that a translation invariant state $\varphi$ satisfies
\[
P(\beta\Phi) = h_\varphi(\tau) - \beta e_\Phi(\varphi) . \tag{13.11}
\]
Then $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle and
\[
h_\varphi(\tau) = s(\varphi) . \tag{13.12}
\]

Proof. By (13.5), we have
\[
s(\varphi) - \beta e_\Phi(\varphi) \ge h_\varphi(\tau) - \beta e_\Phi(\varphi) = P(\beta\Phi) .
\]
By the variational inequality (11.10), we have
\[
s(\varphi) - \beta e_\Phi(\varphi) = P(\beta\Phi) . \tag{13.13}
\]
Therefore $\varphi$ is a solution of the $(\Phi, \beta)$-variational principle. From (13.11) and (13.13), we obtain (13.12).

Remark 1. We have no result about the existence theorem for a solution of the variational principle (13.11) in terms of the CNT-entropy for a general $\Phi \in P_\tau$
(like Theorem 11.7) nor about the stability of solutions of such a variational principle (like Lemma 12.9); the obstacle in applying the usual method is the absence of any result about weak∗ upper semicontinuity of $h_\omega(\tau)$ in $\omega$. In this sense, Proposition 13.3 is a superficial result, and Theorem 11.4 is short of 'the variational principle' in terms of the CNT-entropy. See also the discussion in Sec. 4 of [32].

Remark 2. Although we have used the CNT entropy throughout this section, other entropies such as $ht_\omega(\sigma)$ defined by Choda [18] can be substituted for $h_\omega(\tau)$, yielding similar results.

14. Discussion

The following are some of the remaining problems about equilibrium statistical mechanics of Fermion lattice systems which are not covered in this paper.

1. Dynamics which does not commute with Θ

Obviously, there is an inner one-parameter group of ∗-automorphisms which does not commute with $\Theta$. Examples of outer dynamics not commuting with $\Theta$ can be constructed in the following way (suggested by one of the referees).

Let $\{I_i\}_{i=1,2,\dots}$ be a partition of the lattice $\mathbb{Z}^\nu$ into mutually disjoint finite subsets $I_i$ and let $J_j \equiv \bigcup_{i \le j} I_i$. Choose a self-adjoint $b_i$ in $\mathcal{A}(I_i)_-$ for each $i$ and set $\Phi(J_i) \equiv v_{J_{i-1}} b_i$, where $v_J$ is given by (4.30). By Theorem 4.17(1), they mutually commute and $\Phi(J_i) \in \mathcal{A}(J_{i-1})'$ for each $i$. Hence $\alpha^{(i)}_t \equiv \operatorname{Ad} e^{it\Phi(J_i)}$, $i = 1, 2, \dots$, are mutually commuting dynamics of $\mathcal{A}$, $\alpha^{(i)}_t$ leaving elements of $\mathcal{A}(J_{i-1})$ invariant. Hence $\alpha_t \equiv \prod_{i=1}^{\infty} \alpha^{(i)}_t$ gives a dynamics of $\mathcal{A}$ satisfying $\Theta\alpha_t = \alpha_{-t}\Theta$. (Namely, its generator anticommutes with $\Theta$.) The corresponding potential is given by $\Phi(I) = 0$ if $I \ne J_i$ for any $i$ and $\Phi(I) = \Phi(J_i)$ if $I = J_i$. This potential satisfies the standardness condition (Φ-d) if each $b_i$ satisfies it for the set $I_i$. By looking at the behavior of
\[
U^*_{n,N}\, \alpha_t(U_{n,N}) = e^{-2it \sum_{i=0}^{N} \Phi(J_{n+i})} \quad \text{for } U_{n,N} \equiv \prod_{i=0}^{N} v_{I_{n+i}}
\]
as $n \to \infty$, the dynamics is seen to be outer unless $\sum_i \Phi(J_i)$ is convergent.
2. Broken Θ-invariance of equilibrium states

In connection with the Gibbs condition, we have shown in Sec. 7.7 that the state perturbed either by the surface energy or by the local interaction energy satisfies the product property if and only if the equilibrium state is Θ-invariant. However, we do not know an example of an equilibrium state which is not Θ-invariant. Existence or non-existence of such a state seems to be an important question. It seems to be closely related to the next problem 3. Note that any translation invariant state is Θ-invariant. So we need broken translation invariance of an equilibrium state for its broken Θ-invariance.

3. Local Thermodynamical Stability (LTS)

In parallel with the case of quantum spin lattice systems, one can formulate the local stability condition ([10], [39]) for our Fermion lattice system. However, there seem to be two choices of the outside system for a local algebra $\mathcal{A}(I)$ ($I$ finite).

(1) The commutant $\mathcal{A}(I)'$.
(2) $\mathcal{A}(I^c)$.

For the choice (1), all arguments in the case of quantum spin lattice systems seem to go through for the Fermion lattice system, leading to the equivalence of LTS with the KMS condition under our basic Assumptions (I), (II) and (III). On the other hand, (2) seems to be the physically correct choice, although we do not have an equivalence proof for (2) so far. In this connection, the problem 2 is crucial. If every equilibrium state is Θ-invariant, then the choice (2) also seems to give the LTS which is equivalent to the KMS condition under our basic assumptions. A paper on this problem is forthcoming [15].

4. Downstairs Equivalence

We may say that the dynamics $\alpha_t$ is working upstairs while its generator is working downstairs. In particular, our arena for the downstairs activity is $\mathcal{A}_\circ$. The stair going upstairs seems to be not wide open. On the other hand, there seems to be a lot more room downstairs. There, we have established the one-to-one correspondence between (Θ-invariant) derivations on $\mathcal{A}_\circ$ and standard potentials. We have shown that the solution of the variational principle (described in terms of a translation covariant potential) satisfies the dKMS condition on $\mathcal{A}_\circ$ (described in terms of the corresponding derivation). How about the converse? There is also the problem of the equivalence of the LTS condition (in terms of a potential) and the dKMS condition on $\mathcal{A}_\circ$ (in terms of the corresponding derivation), where the translation invariance is not needed. Some aspects of this problem will also be included in the forthcoming paper [15].

5. Equivalent Potentials

We have introduced the notion of general potentials and equivalence among them in Sec. 5.5. Our theory is developed only for the unique standard potential among each equivalence class. Natural questions about general potentials arise.
Does the existence of the limits defining the pressure $P(\beta\Phi)$ and the mean energy $e_\Phi(\varphi)$ hold also for translation covariant general potentials $\Phi$? Assuming the existence, are $P(\beta\Phi)$ and $e_\Phi(\varphi)$ the same as those for the unique standard potential $\Phi_s$ equivalent to $\Phi$? If they are different, how about the solution of their variational principle? We give a partial answer to these questions.

Proposition 14.1. Let $\Phi$ be a translation covariant potential (which satisfies (Φ-a,b,c,e,f) by definition) fulfilling the following additional condition: the surface energy
\[
W_\Phi(I) = \lim_{J \nearrow \mathbb{Z}^\nu} \sum_{K} \{ \Phi(K);\ K \cap I \ne \emptyset,\ K \cap I^c \ne \emptyset,\ K \subset J \} , \tag{14.1}
\]
satisfies
\[
\operatorname*{v.H.\,lim}_{I \to \infty} \frac{\|W_\Phi(I)\|}{|I|} = 0 . \tag{14.2}
\]
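Let $\Phi_s$ be the standard potential (in $P_\tau$) which is equivalent to $\Phi$. Then both van Hove limits defining $P(\beta\Phi)$ and $e_\Phi(\omega)$ for all $\omega \in \mathcal{A}^{*\,\tau}_{+,1}$ exist if and only if
\[
C_\Phi \equiv \operatorname*{v.H.\,lim}_{I \to \infty} \frac{\tau(H_\Phi(I))}{|I|} \tag{14.3}
\]
exists. If this is the case, then the following relations hold:
\[
P(\beta\Phi) = \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-\beta H(I)})
= \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-\beta U(I)})
= P(\beta\Phi_s) - \beta C_\Phi , \tag{14.4}
\]
\[
e_\Phi(\omega) = \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \omega(H(I))
= \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \omega(U(I))
= e_{\Phi_s}(\omega) + C_\Phi . \tag{14.5}
\]
Furthermore, the $(\Phi, \beta)$- and $(\Phi_s, \beta)$-variational principles give the same set of solutions.

Remark. If $\tau(\Phi(I)) = 0$ for all $I$, then (14.3) exists and $C_\Phi = 0$. Hence $P(\beta\Phi) = P(\beta\Phi_s)$ and $e_\Phi(\omega) = e_{\Phi_s}(\omega)$. This can be achieved for any general potential $\Phi$ by changing it to $\Phi_1 = \Phi - \Phi_0$, where $\Phi_0$ is a scalar-valued potential given by $\Phi_0(I) = \tau(\Phi(I))\mathbf{1}$.

Proof. Since $\Phi$ and $\Phi_s$ are equivalent, we have $H_\Phi(I) - H_{\Phi_s}(I) \in \mathcal{A}(I)'$. Since $H_\Phi(I) - H_{\Phi_s}(I)$ is Θ-even by (Φ-c) for $\Phi$ and $\Phi_s$, we have $H_\Phi(I) - H_{\Phi_s}(I) \in \mathcal{A}(I^c)_+$. Hence,
\[
\begin{aligned}
U_\Phi(I) - U_{\Phi_s}(I) &= E_I\bigl(U_\Phi(I) - U_{\Phi_s}(I)\bigr) \\
&= E_I\bigl(H_\Phi(I) - H_{\Phi_s}(I)\bigr) - E_I\bigl(W_\Phi(I) - W_{\Phi_s}(I)\bigr) \\
&= \tau\bigl(H_\Phi(I) - H_{\Phi_s}(I)\bigr) - E_I\bigl(W_\Phi(I) - W_{\Phi_s}(I)\bigr) ,
\end{aligned}
\]
due to (14.6). By $\tau(H_{\Phi_s}(I)) = 0$ and $E_I(W_{\Phi_s}(I)) = 0$ due to (Φ-d), we have
\[
U_\Phi(I) - U_{\Phi_s}(I) = \tau(H_\Phi(I)) - E_I(W_\Phi(I)) . \tag{14.6}
\]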
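By (14.2), we have
\[
\operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \bigl\| U_\Phi(I) - U_{\Phi_s}(I) - \tau(H_\Phi(I)) \bigr\| = 0 .
\]
Also by (14.2),
\[
\operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \bigl\| H_\Phi(I) - U_\Phi(I) \bigr\| = 0 .
\]
Hence (14.5) follows:
\[
\begin{aligned}
\operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \omega(H_\Phi(I))
&= \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \omega(U_\Phi(I)) \\
&= \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \omega(U_{\Phi_s}(I)) + \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \tau(H_\Phi(I)) \\
&= e_{\Phi_s}(\omega) + \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \tau(H_\Phi(I)) .
\end{aligned}
\]
We also have
\[
\begin{aligned}
\operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-\beta H(I)})
&= \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|} \log \mathrm{Tr}_I(e^{-\beta U(I)}) \\
&= P(\beta\Phi_s) - \beta \operatorname*{v.H.\,lim}_{I \to \infty} \frac{1}{|I|}\, \tau(H_\Phi(I)) ,
\end{aligned}
\]
which shows (14.4).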
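The mechanism behind (14.4) and (14.5) can be checked numerically in the simplest situation, where the two local Hamiltonians differ exactly by a scalar $C|I|$. The following is a minimal sketch under that assumption; the spectrum, the state weights and the helper names `pressure` and `mean_energy` are hypothetical and serve only as an illustration.

```python
import math

def pressure(energies, beta, volume):
    """(1/|I|) log Tr exp(-beta H) for a Hamiltonian with the given eigenvalues."""
    lowest = min(energies)
    # Log-sum-exp, shifted by the lowest level for numerical stability.
    log_z = -beta * lowest + math.log(
        sum(math.exp(-beta * (e - lowest)) for e in energies)
    )
    return log_z / volume

def mean_energy(energies, weights, volume):
    """(1/|I|) omega(H) for a state that is diagonal in the energy eigenbasis."""
    return sum(w * e for w, e in zip(weights, energies)) / volume

beta, volume, c = 0.7, 3, 0.25
# Hypothetical spectrum standing in for U_{Phi_s}(I) on a region with |I| = 3.
levels = [-1.5, -0.5, -0.5, 0.0, 0.0, 0.5, 0.5, 1.5]
# Adding the scalar c * |I| plays the role of tau(H_Phi(I)).
shifted = [e + c * volume for e in levels]
# Hypothetical diagonal state (probability weights summing to 1).
weights = [0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

p_s, p = pressure(levels, beta, volume), pressure(shifted, beta, volume)
e_s, e = mean_energy(levels, weights, volume), mean_energy(shifted, weights, volume)
print(p - p_s, -beta * c)   # the pressure shifts by -beta * c
print(e - e_s, c)           # the mean energy shifts by +c
```

In this toy setting the combination $P + \beta e$ is unchanged by the shift, which is the reason the two variational principles single out the same states.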
Remark. Suppose that $\Phi$ satisfies (Φ-a), (Φ-b), (Φ-c), (Φ-f) and
\[
\sum_{I \ni 0} \|\Phi(I)\| < \infty . \tag{14.7}
\]
Then it satisfies (Φ-e) automatically and is a general potential. Furthermore, (14.2) is known to be satisfied (the same proof as Lemma 9.1 holds except for the estimates (9.2), (9.3), (9.4) and (9.5), which follow from the absolute convergence of (14.7) due to (7.12)) and
\[
C_\Phi = \operatorname*{v.H.\,lim}_{I \to \infty} \frac{\tau(H_\Phi(I))}{|I|} = \operatorname*{v.H.\,lim}_{I \to \infty} \frac{\tau(U_\Phi(I))}{|I|} = e_\Phi(\tau) \tag{14.8}
\]
is known to converge. (The same proof as Theorem 9.5 holds except for a modification of the proof of some estimates for Lemma 9.2 on the basis of the absolute convergence of (14.7). See also e.g. Proposition 6.2.39 of [17].) Therefore (14.4) and (14.5) hold and the solutions of the $(\Phi, \beta)$- and $(\Phi_s, \beta)$-variational principles coincide.

Appendix: Van Hove Limit

For the sake of mathematical precision, we present some digression about the van Hove limit.
A.1. Van Hove net

We introduce two mutually equivalent types of conditions for the van Hove limit. First we start with our notation about the shapes of regions of $\mathbb{Z}^\nu$, which will be used hereafter. Recall that $C_a$ is a cube of size $a$ given by (8.8). For a finite subset $I$ of $\mathbb{Z}^\nu$ and $a \in \mathbb{N}$, let $n^+_a(I)$ be the smallest number of translates of $C_a$ whose union covers $I$, while $n^-_a(I)$ is the largest number of mutually disjoint translates of $C_a$ that can be packed in $I$. Let $B_r(n)$ be a closed ball in $\mathbb{R}^\nu$ ($\supset \mathbb{Z}^\nu$) with the center $n \in \mathbb{Z}^\nu$ and the radius $r \in \mathbb{R}$. Denote the surface of $I$ with a thickness $r$ ($> 0$) by
\[
\mathrm{surf}_r(I) \equiv \{ n \in I;\ B_r(n) \cap I^c \ne \emptyset \} . \tag{A.1}
\]
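The definition (A.1) is easy to evaluate on small examples. The sketch below computes the surface of thickness $r$ for a cube and prints the ratio of surface to volume for growing cubes. It assumes, as an illustration only, that the cube $C_a$ of (8.8) is the set of lattice points with all coordinates in $\{0, \dots, a-1\}$; the function names are hypothetical.

```python
from itertools import product
from math import ceil

def cube(a, nu):
    """Assumed form of C_a: points of Z^nu with all coordinates in {0, ..., a-1}."""
    return set(product(range(a), repeat=nu))

def surf(region, r, nu):
    """Points n of `region` whose closed ball B_r(n) meets the complement of `region`."""
    reach = ceil(r)
    # Lattice offsets lying in the closed Euclidean ball of radius r.
    offsets = [d for d in product(range(-reach, reach + 1), repeat=nu)
               if sum(x * x for x in d) <= r * r]
    return {n for n in region
            if any(tuple(x + y for x, y in zip(n, d)) not in region
                   for d in offsets)}

def surface_ratio(a, r, nu=2):
    """|surf_r(C_a)| / |C_a| for the assumed cube."""
    region = cube(a, nu)
    return len(surf(region, r, nu)) / len(region)

for a in (5, 10, 20, 40):
    print(a, surface_ratio(a, r=1))
```

For fixed $r$ the printed ratios decrease as $a$ grows, which is the behavior required of a van Hove net in terms of the surface.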
In what follows, we consider a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$ where the set of indices $\alpha$ is a directed set. Its partial ordering need not have any relation with the set inclusion partial ordering of $I_\alpha$.

Lemma A.1. For a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$, the following two conditions are equivalent:

(1) For any $a \in \mathbb{N}$,
\[
\lim_\alpha \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = 1 . \tag{A.2}
\]
(2) For any $r > 0$,
\[
\lim_\alpha \frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} = 0 . \tag{A.3}
\]
Proof. (1) → (2): Let $\varepsilon > 0$ and $r > 0$ be given. Let $a \in \mathbb{N}$ be sufficiently large so that $a \ge 2r + 1$ and
\[
\varepsilon_1 \equiv 1 - \frac{[a - 2r]^\nu}{a^\nu} < \frac{\varepsilon}{2} ,
\]
where $[b]$ indicates the maximal integer not exceeding $b$. By the condition (1), there exists an index $\alpha_0$ of the net $\{I_\alpha\}$ such that, for $\alpha \ge \alpha_0$,
\[
\varepsilon_2 \equiv 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} < \frac{\varepsilon}{2} .
\]
Let $D_1, \dots, D_N$, with $N = n^-_a(I_\alpha)$, be mutually disjoint translates of $C_a$ contained in $I_\alpha$. For each $i = 1, \dots, N$, let $D'_i$ be a translate of $C_{[a-2r]}$ placed in $D_i$ with a distance larger than $r$ from the complement of $D_i$ in $\mathbb{Z}^\nu$; such a translate exists. Then
\[
\frac{|D'_i|}{|D_i|} = \frac{[a - 2r]^\nu}{a^\nu} = 1 - \varepsilon_1 .
\]
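Let $D$ be the union of $D_1, \dots, D_N$ and $D'$ be the union of $D'_1, \dots, D'_N$. Then
\[
\frac{|D \setminus D'|}{|D|} = 1 - \frac{|D'|}{|D|} = 1 - (1 - \varepsilon_1) = \varepsilon_1 .
\]
Since $n^+_a(I_\alpha)$ translates of $C_a$ cover $I_\alpha$, we have
\[
|I_\alpha| \le n^+_a(I_\alpha)\,|C_a| = n^+_a(I_\alpha)\, a^\nu .
\]
Hence
\[
\frac{|I_\alpha \setminus D|}{|I_\alpha|} = 1 - \frac{|D|}{|I_\alpha|} \le 1 - \frac{|D|}{n^+_a(I_\alpha)\, a^\nu} = 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = \varepsilon_2 .
\]
Due to $I_\alpha \supset D$,
\[
\frac{|D \setminus D'|}{|I_\alpha|} \le \frac{|D \setminus D'|}{|D|} = \varepsilon_1 .
\]
By construction, the distance between $D'_i$ and the complement of $D_i$ (in $\mathbb{Z}^\nu$) is larger than $r$, and hence the distance between $D'_i$ and the complement of $I_\alpha$ is larger than $r$. Thus,
\[
\mathrm{surf}_r(I_\alpha) \subset I_\alpha \setminus D' = (D \setminus D') \cup (I_\alpha \setminus D) .
\]
For $\alpha \ge \alpha_0$, we obtain
\[
\frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} \le \varepsilon_1 + \varepsilon_2 < \varepsilon .
\]
Now (1) → (2) is proved.

(2) → (1): Let $\varepsilon > 0$ and $a \in \mathbb{N}$ be given. Take $r > \sqrt{\nu}\, a$. Let $\alpha_0$ be an index of the net $\{I_\alpha\}$ such that, for $\alpha \ge \alpha_0$,
\[
\frac{|\mathrm{surf}_r(I_\alpha)|}{|I_\alpha|} < a^{-\nu} \varepsilon .
\]
The translates $C_a + an$ of $C_a$ are disjoint for distinct $n \in \mathbb{Z}^\nu$ and their union over $n \in \mathbb{Z}^\nu$ is $\mathbb{Z}^\nu$. Let $O_\alpha$ be the union of all those $C_a + an$ contained in $I_\alpha$ and $N_1$ be their number. Let $O'_\alpha$ be the union of all those $C_a + an$ which have nonempty intersections with both $I_\alpha$ and $(I_\alpha)^c$, and $N_2$ be their number. From the construction, the following estimates follow:
\[
N_1 \le n^-_a(I_\alpha) \le n^+_a(I_\alpha) \le N_1 + N_2 .
\]
Furthermore, since each $C_a + an$ in $O'_\alpha$ contains a point in $I_\alpha$ as well as a point in $(I_\alpha)^c$, and the distance of any two points in it is at most $\sqrt{\nu}\, a < r$, it has a non-empty intersection with $I_\alpha$, which is contained in $\mathrm{surf}_r(I_\alpha)$. Therefore,
\[
|\mathrm{surf}_r(I_\alpha)| \ge N_2 = (N_1 + N_2) - N_1 \ge n^+_a(I_\alpha) - n^-_a(I_\alpha) .
\]
We have also
\[
|I_\alpha| \le n^+_a(I_\alpha)\,|C_a| = n^+_a(I_\alpha)\, a^\nu .
\]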
Combining the above estimates, we obtain for $\alpha \ge \alpha_0$
\[
0 \le 1 - \frac{n^-_a(I_\alpha)}{n^+_a(I_\alpha)} = \frac{n^+_a(I_\alpha) - n^-_a(I_\alpha)}{n^+_a(I_\alpha)} \le \frac{|\mathrm{surf}_r(I_\alpha)|\, a^\nu}{|I_\alpha|} < \varepsilon .
\]
Hence, (2) → (1) is now proved.

Definition A.2. If a net of finite subsets $\{I_\alpha\}$ satisfies the above condition (1) (or equivalently (2)), then it is said to be a van Hove net (in $\mathbb{Z}^\nu$).

We introduce the third condition on a net of finite subsets $I_\alpha$ of $\mathbb{Z}^\nu$:

(3) For any finite subset $I$ of $\mathbb{Z}^\nu$, there exists an index $\alpha_\circ$ such that $I_\alpha \supset I$ for all $\alpha \ge \alpha_\circ$.

Definition A.3. If a net $\{I_\alpha\}$ (in $\mathbb{Z}^\nu$) satisfies the conditions (1) (or equivalently (2)) and (3), then it is said to be a van Hove net tending to $\mathbb{Z}^\nu$.

Remark. The condition (1) (or equivalently (2)) does not imply the condition (3). $\{C_n\}_{n \in \mathbb{N}}$ of (8.8) is obviously a van Hove sequence. But it does not cover the whole $\mathbb{Z}^\nu$. Hence it is not a van Hove sequence tending to $\mathbb{Z}^\nu$.

Lemma A.4. For any van Hove net and for any van Hove net tending to $\mathbb{Z}^\nu$, the directed set cannot have a maximal element.

Proof. Let $\{I_\alpha\}_{\alpha \in A}$ be a van Hove net where $A$ is a directed set of indices. We show that for any $\alpha_\circ \in A$, there exists $\alpha' \in A$ satisfying $\alpha' \ge \alpha_\circ$, $\alpha' \ne \alpha_\circ$. In fact, for a given $\alpha_\circ$, there exist $a(\alpha_\circ) \in \mathbb{N}$ and $n \in \mathbb{Z}^\nu$ such that $I_{\alpha_\circ} \subset C_{a(\alpha_\circ)} - n$, and hence
\[
n^-_{a(\alpha_\circ)}(I_{\alpha_\circ}) = 0 .
\]
On the other hand, for the above $a(\alpha_\circ) \in \mathbb{N}$ there exists $\alpha_1$ such that
\[
1 - \frac{n^-_{a(\alpha_\circ)}(I_\alpha)}{n^+_{a(\alpha_\circ)}(I_\alpha)} < \frac{1}{2}
\]
for all $\alpha \ge \alpha_1$, since $\{I_\alpha\}$ ($\alpha \in A$) is a van Hove net. For any $\alpha' \in A$ satisfying both $\alpha' \ge \alpha_1$ and $\alpha' \ge \alpha_\circ$, we have $n^-_{a(\alpha_\circ)}(I_{\alpha'}) \ne 0$ due to $\alpha' \ge \alpha_1$, and hence $\alpha' \ne \alpha_\circ$. We have shown the existence of a desired $\alpha'$.

A van Hove net tending to $\mathbb{Z}^\nu$ is a special case of a van Hove net. Hence the assertion for this case obviously follows.
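The grid counting used in the step (2) → (1) of the proof of Lemma A.1 can be tried out directly. The sketch below counts the grid cubes $C_a + an$ that lie inside a region ($N_1$) and those that straddle its boundary ($N_2$); by the estimate $N_1 \le n^-_a \le n^+_a \le N_1 + N_2$, the quotient $N_1/(N_1+N_2)$ is a lower bound for $n^-_a/n^+_a$. It assumes, for illustration only, that $C_a$ is the set of lattice points with coordinates in $\{0, \dots, a-1\}$; the function names are hypothetical.

```python
from itertools import product

def cube(side, nu):
    """Assumed form of a cube: points of Z^nu with coordinates in {0, ..., side-1}."""
    return set(product(range(side), repeat=nu))

def grid_counts(region, a, nu):
    """Return (N1, N2) for the grid of translates C_a + a*n, n in Z^nu.

    N1 counts grid cubes contained in `region`; N2 counts grid cubes that
    meet both `region` and its complement.
    """
    points_per_cell = {}
    for p in region:
        # Floor division gives the index n of the grid cube containing p.
        key = tuple(x // a for x in p)
        points_per_cell[key] = points_per_cell.get(key, 0) + 1
    n1 = sum(1 for count in points_per_cell.values() if count == a ** nu)
    n2 = len(points_per_cell) - n1
    return n1, n2

a, nu = 3, 2
for side in (10, 31, 100):
    n1, n2 = grid_counts(cube(side, nu), a, nu)
    print(side, n1, n2, n1 / (n1 + n2))
```

For growing cubes and fixed $a$ the printed lower bound approaches 1, in line with condition (1) of Lemma A.1.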
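A.2. Van Hove limit

Let $f(I)$ be an $\mathbb{R}$-valued function of finite subsets $I$ of $\mathbb{Z}^\nu$. We first show the following lemma, which asserts the independence of the limit on the choice of van Hove net (van Hove net tending to $\mathbb{Z}^\nu$) when $f(I_\alpha)$ has a limit for any van Hove net (van Hove net tending to $\mathbb{Z}^\nu$) $\{I_\alpha\}$.

Lemma A.5. If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$, then its limit is independent of such a net. If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$ tending to $\mathbb{Z}^\nu$, then its limit is independent of such a net.

Proof. Let $\{I^1_\alpha\}_{\alpha \in A}$ and $\{I^2_\beta\}_{\beta \in B}$ be two van Hove nets where $A$ and $B$ are directed sets of indices. We introduce a new index set
\[
C \equiv \{ (\alpha, \beta, i);\ \alpha \in A,\ \beta \in B,\ i = 1, 2 \}
\]
with the partial ordering $(\alpha, \beta, i) \ge (\alpha', \beta', i')$ either if $\alpha > \alpha'$ and $\beta > \beta'$ or if $\alpha = \alpha'$, $\beta = \beta'$ and $i \ge i'$. For any $(\alpha_1, \beta_1, i_1) \in C$ and $(\alpha_2, \beta_2, i_2) \in C$, there exist $\alpha \in A$ and $\beta \in B$ such that $\alpha > \alpha_1$, $\alpha > \alpha_2$, $\beta > \beta_1$, $\beta > \beta_2$, because $A$ and $B$ are directed sets without maximal elements due to Lemma A.4. Hence $(\alpha, \beta, 2)$ ($\in C$) obviously satisfies
\[
(\alpha, \beta, 2) > (\alpha_1, \beta_1, i_1) , \qquad (\alpha, \beta, 2) > (\alpha_2, \beta_2, i_2) .
\]
So $C$ is a directed set. Let
\[
I_{(\alpha,\beta,i)} =
\begin{cases}
I^1_\alpha & \text{if } i = 1 , \\
I^2_\beta & \text{if } i = 2 .
\end{cases}
\]
Since $\{I^1_\alpha\}$ and $\{I^2_\beta\}$ are van Hove nets, there exist $\alpha_\circ \in A$ and $\beta_\circ \in B$ for any $d > 0$ and $\varepsilon > 0$ such that
\[
\frac{|\mathrm{surf}_d(I^1_\alpha)|}{|I^1_\alpha|} < \varepsilon \ \text{ if } \alpha \ge \alpha_\circ , \qquad
\frac{|\mathrm{surf}_d(I^2_\beta)|}{|I^2_\beta|} < \varepsilon \ \text{ if } \beta \ge \beta_\circ .
\]
Set $\gamma_\circ \equiv (\alpha_\circ, \beta_\circ, 1)$. For any $\gamma = (\alpha, \beta, i) \ge \gamma_\circ$, we have obviously $\alpha \ge \alpha_\circ$ and $\beta \ge \beta_\circ$ by the definition of the ordering. Hence,
\[
\frac{|\mathrm{surf}_d(I_\gamma)|}{|I_\gamma|} \le \max\left\{ \frac{|\mathrm{surf}_d(I^1_\alpha)|}{|I^1_\alpha|} ,\ \frac{|\mathrm{surf}_d(I^2_\beta)|}{|I^2_\beta|} \right\} < \varepsilon .
\]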
Thus $\{I_\gamma\}_{\gamma \in C}$ is also a van Hove net. If $\{I^1_\alpha\}$ and $\{I^2_\beta\}$ are van Hove nets tending to $\mathbb{Z}^\nu$, then $\{I_\gamma\}$ is also a van Hove net tending to $\mathbb{Z}^\nu$ by its definition.

Since $\{I_\gamma\}_{\gamma \in C}$ is a van Hove net (van Hove net tending to $\mathbb{Z}^\nu$), $f$ has the following limit by the assumption on $f$:
\[
f_\infty = \lim_\gamma \{ f(I_\gamma),\ \gamma \in C \} .
\]
Thus for any $\varepsilon$, there exists a $\gamma_\circ = (\alpha_\circ, \beta_\circ, 1)$ or $\gamma_\circ = (\alpha_\circ, \beta_\circ, 2)$ such that
\[
|f_\infty - f(I_\gamma)| < \varepsilon
\]
for $\gamma \ge \gamma_\circ$. This inequality holds especially for $\gamma = (\alpha, \beta, 1) \ge \gamma_\circ$ with $\alpha > \alpha_\circ$ and $\beta > \beta_\circ$. For this $\gamma$, $I_\gamma = I^1_\alpha$, and hence $f(I_\gamma) = f(I^1_\alpha)$. Thus we have
\[
|f_\infty - f(I^1_\alpha)| < \varepsilon
\]
for $\alpha > \alpha_\circ$. Therefore, we obtain
\[
f_\infty = \lim_\alpha f(I^1_\alpha) .
\]
Similarly,
\[
f_\infty = \lim_\beta f(I^2_\beta) .
\]
Now we have shown that the limit is the same for $\{I^1_\alpha\}_{\alpha \in A}$ and $\{I^2_\beta\}_{\beta \in B}$. Hence the independence of the limit on the choice of the net follows.

Definition A.6. If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$, then $f(I)$ is said to have the van Hove limit for large $I$, and its limit is denoted by
\[
\operatorname*{v.H.\,lim}_{I \to \infty} f(I) . \tag{A.4}
\]
If $f(I_\alpha)$ has a limit for any van Hove net $\{I_\alpha\}$ tending to $\mathbb{Z}^\nu$, then $f(I)$ is said to have the van Hove limit for $I$ tending to $\mathbb{Z}^\nu$, and its limit is denoted by
\[
\operatorname*{v.H.\,lim}_{I \to \mathbb{Z}^\nu} f(I) . \tag{A.5}
\]
In general, the first condition is stronger than the second. If $f(I)$ is translation invariant, however, the existence of the two limits is equivalent, as shown below.

Lemma A.7. If $f(I)$ is translation invariant in the sense that $f(I + n) = f(I)$ for any finite subset $I$ of $\mathbb{Z}^\nu$ and $n \in \mathbb{Z}^\nu$, then $f(I)$ has the van Hove limit for large $I$ if and only if $f$ has the van Hove limit for $I$ tending to $\mathbb{Z}^\nu$.

Proof. The only if part is obvious. Let $\{I_\alpha\}_{\alpha \in A}$ be an arbitrary van Hove net. Let $a(\alpha)$ be the largest integer $a$ such that a translate of $C_a$ is contained in $I_\alpha$. Let $C_{a(\alpha)} + n \subset I_\alpha$ and hence $C_{a(\alpha)} \subset I_\alpha - n$. Now we shift an approximate center of
$C_{a(\alpha)}$ to the origin of $\mathbb{Z}^\nu$ and simultaneously shift $I_\alpha - n$ by the same amount. More precisely, $I_\alpha - n$ is shifted to
\[
I'_\alpha \equiv I_\alpha - n - \left[ \frac{a(\alpha) - 1}{2} \right] (1, \dots, 1) .
\]
Obviously,
\[
\frac{|\mathrm{surf}_d(I_\alpha)|}{|I_\alpha|} = \frac{|\mathrm{surf}_d(I'_\alpha)|}{|I'_\alpha|}
\]
for all $d > 0$ and $\alpha \in A$. We show that this $\{I'_\alpha\}$ ($\alpha \in A$) is tending to $\mathbb{Z}^\nu$. Let $I$ be a finite subset of $\mathbb{Z}^\nu$. For a sufficiently large integer $a$,
\[
I \subset C_a - \left[ \frac{a - 1}{2} \right] .
\]
For this $a$, there exists $\alpha_1$ such that $n^-_a(I_\alpha) > 0$ for $\alpha \ge \alpha_1$. Then $a(\alpha) \ge a$ and
\[
I'_\alpha \supset C_{a(\alpha)} - \left[ \frac{a(\alpha) - 1}{2} \right] \supset C_a - \left[ \frac{a - 1}{2} \right] \supset I
\]
for $\alpha \ge \alpha_1$. Thus $\{I'_\alpha\}$ ($\alpha \in A$) is a van Hove net tending to $\mathbb{Z}^\nu$. Since $f$ is translation invariant,
\[
f(I_\alpha) = f(I'_\alpha) .
\]
By the assumption that $f$ has the van Hove limit tending to $\mathbb{Z}^\nu$, $\lim_\alpha f(I'_\alpha)$ exists, and hence $\lim_\alpha f(I_\alpha)$ exists.

References

[1] H. Araki and E. H. Lieb, Entropy inequalities, Comm. Math. Phys. 18 (1970), 160–170.
[2] H. Araki, Relative hamiltonian for faithful normal states of a von Neumann algebra, Publ. RIMS, Kyoto Univ. 7 (1973), 165–209.
[3] H. Araki, Expansional in Banach algebra, Ann. Sci. École Norm. Sup. Sér. 4 6 (1973), 67–84.
[4] H. Araki, Golden–Thompson and Peierls–Bogoliubov inequalities for a general von Neumann algebra, Comm. Math. Phys. 34 (1973), 167–178.
[5] H. Araki and P. D. F. Ion, On the equivalence of KMS and Gibbs conditions for states of quantum lattice systems, Comm. Math. Phys. 35 (1974), 1–12.
[6] H. Araki, On the equivalence of the KMS condition and the variational principle for quantum lattice systems, Comm. Math. Phys. 38 (1974), 1–10.
[7] H. Araki, Relative entropy and its application, in Colloques Internationaux du C.N.R.S. No. 248 Les Methodes Mathematiques de la Theorie Quantique des Champs, eds. F. Guerra, D. W. Robinson and R. Stora, CNRS, Paris, 1976.
[8] H. Araki, Relative entropy of states of von Neumann algebras, Publ. RIMS, Kyoto Univ. 11 (1976), 809–833.
[9] H. Araki, Relative entropy of states of von Neumann algebras II, Publ. RIMS, Kyoto Univ. 13 (1977), 173–192.
[10] H. Araki and G. L. Sewell, KMS conditions and local thermodynamical stability of quantum lattice systems, Comm. Math. Phys. 52 (1977), 103–109.
[11] H. Araki, D. Kastler, M. Takesaki and R. Haag, Extension of KMS states and chemical potentials, Comm. Math. Phys. 53 (1977), 97–134.
[12] H. Araki, On KMS states of a C∗ dynamical system, Lecture Notes in Math. 650, Springer-Verlag, 1978.
[13] H. Araki, Toukeirikigaku no suuri, Iwanami (Japanese), 1994.
[14] H. Araki and H. Moriya, Joint extension of states of subsystems for a CAR system, to appear in Comm. Math. Phys.
[15] H. Araki and H. Moriya, Local thermodynamical stability of Fermion lattice systems, Lett. Math. Phys. 62 (2002), 33–45.
[16] B. Baumgartner, A partial ordering of sets, making mean entropy monotone, J. Phys. A: Math. Gen. 35 (2002), 3163–3182.
[17] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, 2nd edition, Springer-Verlag, 1996.
[18] M. Choda, A C∗-Dynamical Entropy and Applications to Canonical Endomorphisms, J. Funct. Anal. 173 (2000), 453–480.
[19] A. Connes, H. Narnhofer and W. Thirring, Dynamical Entropy of C∗ Algebras and von Neumann Algebras, Comm. Math. Phys. 112 (1987), 691–719.
[20] M. Fannes, A continuity property of the entropy density for spin lattice systems, Comm. Math. Phys. 31 (1973), 291–294.
[21] F. M. Goodman, P. de la Harpe and V. F. R. Jones, Coxeter Graphs and Towers of Algebras, Springer-Verlag, 1989.
[22] T. Hudetz, Spacetime Dynamical Entropy of Quantum Systems, Lett. Math. Phys. 16 (1988), 151–161.
[23] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton University Press, 1979.
[24] A. R. Kay and B. S. Kay, Monotonicity with volume of entropy and of mean entropy for translationally invariant systems as consequences of strong subadditivity, J. Phys. A: Math. Gen. 34 (2001), 365–382.
[25] H. Kosaki, Relative entropy of states: a variational expression, J. Operator Theory 16 (1986), 335–348.
[26] O. E. Lanford III and D. W. Robinson, Statistical mechanics of quantum spin systems III, Comm. Math. Phys. 9 (1968), 327–338.
[27] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 14 (1973), 1938–1941.
[28] E. H. Lieb and M. B. Ruskai, A fundamental property of quantum-mechanical entropy, Phys. Rev. Lett. 30 (1973), 434–436.
[29] T. Matsui, Ground states of fermions on lattices, Comm. Math. Phys. 182 (1996), 723–751.
[30] T. Matsui, Quantum statistical mechanics and Feller semigroup, Quantum Probability Communication 10 (1998), 101–124.
[31] S. Mazur, Über konvexe Mengen in linearen normierten Räumen, Studia Math. 4 (1933), 70–84.
[32] H. Moriya, Variational principle and the dynamical entropy of space translation, Rev. Math. Phys. 11 (1999), 1315–1328.
[33] H. Moriya, Some aspects of quantum entanglement for CAR systems, Lett. Math. Phys. 60 (2002), 109–121.
[34] S. Neshveyev and E. Størmer, The variational principle for a class of asymptotically abelian C∗-algebras, Comm. Math. Phys. 215 (2000), 177–196.
[35] D. Petz, On certain properties of the relative entropy of states of operator algebras, Math. Z. 206 (1991), 351–361.
[36] R. T. Powers, Representations of the canonical anticommutation relations, Thesis, Princeton University, 1967.
[37] D. Ruelle, A variational formulation of equilibrium statistical mechanics and the Gibbs phase rule, Comm. Math. Phys. 5 (1967), 324–329.
[38] S. Sakai, On one-parameter subgroups of ∗-automorphisms on operator algebras and the corresponding unbounded derivations, Am. J. Math. 98 (1976), 427–440.
[39] G. L. Sewell, KMS conditions and local thermodynamical stability of quantum lattice systems II, Comm. Math. Phys. 55 (1977), 53–61.
[40] B. Simon, The Statistical Mechanics of Lattice Gases, Princeton University Press, 1993.
[41] E. Størmer, A survey of noncommutative dynamical entropy, Oslo preprint, Dep. of Mathematics 18 (2000).
[42] M. Takesaki, Tomita's Theory of Modular Hilbert-Algebras and its Application, Lecture Notes in Math. 128, Springer-Verlag, 1970.
[43] M. Takesaki, Theory of Operator Algebras I, Springer-Verlag, 1979.
[44] J. Tomiyama, On the projection of norm one in W∗-algebras, Proc. Japan. Acad. 33 (1957), 609–612.
[45] H. Umegaki, Conditional expectation in an operator algebra IV (entropy and information), Kodai Math. Sem. Rep. 14 (1962), 59–85.
April 11, 2003 14:51 WSPC/148-RMP
00159
Reviews in Mathematical Physics Vol. 15, No. 2 (2003) 199–215 c World Scientific Publishing Company
ON THE GEOMETRY OF THE CHARACTERISTIC CLASS OF A STAR PRODUCT ON A SYMPLECTIC MANIFOLD∗
PIERRE BIELIAVSKY
Université Libre de Bruxelles, Brussels, Belgium
[email protected]

PHILIPPE BONNEAU
Université de Bourgogne, Dijon, France
[email protected]
Received 7 May 2002
Revised 11 October 2002

The characteristic class of a star product on a symplectic manifold appears as the class of a deformation of a given symplectic connection, as described by Fedosov. In contrast, one usually thinks of the characteristic class of a star product as the class of a deformation of the Poisson structure (as in Kontsevich's work). In this paper, we present, in the symplectic framework, a natural procedure for constructing a star product by directly quantizing a deformation of the symplectic structure. Basically, in Fedosov's recursive formula for the star product with zero characteristic class, we replace the symplectic structure by one of its formal deformations in the parameter $\hbar$. We then show that every equivalence class of star products contains such an element. Moreover, within a given class, equivalences between such star products are realized by formal one-parameter families of diffeomorphisms, as produced by Moser's argument.

Keywords: Deformation quantization; characteristic class of star products; reduction.
∗ Research supported by the Communauté française de Belgique, through an Action de Recherche Concertée de la Direction de la Recherche Scientifique.

1. Introduction

Inspired by the pioneering work of Weyl [16, 17], Wigner [18] and Moyal [10], a rigorous description of quantum mechanics as a deformation of classical mechanics has been given in [1, 2]. These are the foundational papers of what is now called "deformation quantization". A fundamental problem is the construction, for a given smooth manifold $N$, of a formal associative product on $C^\infty(N)[[t]]$ that is a deformation of the natural pointwise product, i.e. a product $\star$ such that $f \star g = f.g + \sum_{n \ge 1} t^n P_n(f, g)$ where $f, g \in C^\infty(N)$ and the $P_n$'s are bidifferential operators. Such a product is called a "star product". If it exists then it is straightforward to see that $N$ is a Poisson manifold. So a natural question arises: does there exist a star product on every Poisson manifold? An affirmative answer has been given in [9].

Two star products $\star_1$ and $\star_2$ on a Poisson manifold $N$ are called equivalent if there exists a formal series $T = \mathrm{id} + \sum_{k \ge 1} t^k T_k$ of differential operators $\{T_k : C^\infty(N) \to C^\infty(N)\}$ such that $T(f \star_2 g) = Tf \star_1 Tg$. In the general case of Poisson manifolds, a classifying space for equivalence classes of star products is described in [9]. For the particular case of symplectic manifolds, this has been known for quite a while [7, 8]: equivalence classes of star products are in one-to-one correspondence with sequences of elements of de Rham's $H^2(N)$. The sequence $\Omega \in H^2(N)[[t]]$ associated to the equivalence class of a given star product is called the characteristic class of the star product.

The characteristic class of a star product on a symplectic manifold appears as the class of a deformation of a given symplectic connection, as described by Fedosov [7, 8]. In contrast, one usually thinks of the characteristic class of a star product as the class of a deformation of the Poisson structure [9]. In this paper, we present, in the symplectic framework, a natural procedure for constructing a star product by directly quantizing a deformation of the symplectic structure. Basically, in Fedosov's recursive formula for the star product with zero characteristic class, we replace the symplectic structure by one of its formal deformations in the parameter $\hbar$. We then show that every equivalence class of star products contains such an element. Moreover, within a given class, equivalences between such star products are realized by formal one-parameter families of diffeomorphisms, as produced by Moser's argument.
More precisely, let $(M, \omega)$ be a compact symplectic manifold. Let $\{\Omega(t)\}_{t\in ]-\varepsilon,\varepsilon[}$ be a smooth path of symplectic structures on $M$ such that $\Omega(0) = \omega$. The pair $(M, \{\Omega(t)\})$ defines a regular Poisson structure $\hat\Omega$ on $\hat M = M \times ]-\varepsilon, \varepsilon[$ whose symplectic leaves are $\{(M \times \{t\}, \Omega(t))\}$. Applying Fedosov's method to $(\hat M, \hat\Omega)$, one obtains a tangential star product $\hat\star$ on $(\hat M, \hat\Omega)$ with zero characteristic class. The "infinite jet at 0 of $\hat\star$ in $t = \hbar$" then defines a star product $\star$ on $(M, \omega)$ to which is associated the de Rham class $[\Omega^\hbar]_{\text{de Rham}}$, where $\Omega^\hbar$ denotes the infinite jet at 0 of $\{\Omega(t)\}$ in $t = \hbar$. If $\{\Omega'(t)\}$ is such that $[\Omega'(t)]_{\text{de Rham}} = [\Omega(t)]_{\text{de Rham}}$ $\forall t$, then an equivalence between the corresponding star products is realized as the infinite jet of a family of diffeomorphisms $\{\varphi_t\}$ (whose existence is guaranteed by Moser's argument) such that $\varphi_t^* \Omega'(t) = \Omega(t)$.

This work is motivated by the question of obtaining a quantum analogue of Kirwan's map when considering the problem of commutation between Marsden–Weinstein reduction and deformation quantization. However, this point is not investigated in the present article.

2. Fedosov Construction on Regular Poisson Manifolds

We present Fedosov star products on regular Poisson manifolds [7, 8] by means of a partial connection defined (only) on the characteristic distribution of the Poisson
structure. By this we avoid considering Poisson affine connections (cf. Lemma 2.8). This small point excepted, there is essentially nothing new in the present section, but it sets the notation and presents Fedosov's construction in a completely intrinsic way.

2.1. Linear Weyl algebra

Let $(V, \omega)$ be a real symplectic vector space and consider the associated Heisenberg Lie algebra $H$ over the dual space $V^*$. That is, $H = V^* \oplus \mathbb{R}\hbar$, where $\hbar$ is central and where the Lie bracket of two elements $y, y' \in V^*$ is defined by
\[ [y, y'] = y'({}^\sharp y)\,\hbar , \]
the map $V^* \xrightarrow{\sharp} V$ being the isomorphism induced by $\omega$. Denote by $S(H)$ (resp. $U(H)$) the symmetric (resp. the universal enveloping) algebra of $H$ and consider the complete symmetrization map $\varphi : S(H) \to U(H)$ given by the Poincaré–Birkhoff–Witt theorem. The symmetric product on $S(H)$ will be denoted by $\bullet$, while $\star$ will denote the product on $S(H)$ transported via $\varphi$ from the universal product on $U(H)$.

Lemma 2.1. There exists one and only one grading $S(H) =: \bigoplus_{r\geq 0} S^{(r)}(H)$ on $S(H)$ such that:
(i) $S^r(V^*) \subset S^{(r)}(H)$,
(ii) $S^{(r)}(H) \star S^{(s)}(H) \subset S^{(r+s)}(H)$,
where $S^r(V^*)$ denotes the $r$th symmetric power of $V^*$. This grading is compatible with the symmetric product $\bullet$ as well.

One then defines the linear Weyl algebra $W(H)$ as the direct product $W(H) := \prod_{r=0}^{\infty} S^{(r)}(H)$ endowed with the extended product $\star$. Note that the symmetric product $\bullet$ extends to $W(H)$ as well. The center $ZW(H)$ of $(W(H), \star)$ is canonically isomorphic to the space of power series $\mathbb{R}[[\hbar]]$. By using the symplectic structure, one gets an identification between the Lie algebra $sp(V, \omega)$ and the second symmetric power $S^2(V^*)$:
\[ sp(V, \omega) \to S^2(V^*) \subset W(H), \qquad A \mapsto A . \]

Lemma 2.2. For all $a \in W(H)$ and $A \in sp(V, \omega)$, one has
\[ [A, a] = 2\hbar A(a) , \]
where $[\,,\,]$ denotes the Lie bracket on $W(H)$ induced by the associative product $\star$.

Proof. Both $\mathrm{ad}(A)$ and $\hbar A$ are derivations of $(W(H), \star)$. It is therefore sufficient to verify the formula on generators.

The isomorphism $V \xrightarrow{\flat} V^*$ defines an injection $V \xrightarrow{\mu} W(H)$ which we call the linear moment. Observe that, viewed as an element of $W(H) \otimes V^*$, $\mu$ is fixed under the action of the symplectic group $Sp(V, \omega)$.
Both products $\star$ and $\bullet$ extend naturally to the space $W(H) \otimes \Lambda^\bullet(V^*)$ of multilinear forms on $V$ valued in $W(H)$. We define the total degree $t$ of an element $a \otimes \omega$, $a \in S^{(r)}(H)$, $\omega \in \Lambda^p(V^*)$, by $t = p + r$. With respect to this degree on $W(H) \otimes \Lambda^\bullet(V^*)$, the extended multiplications, again denoted by $\star$ and $\bullet$, are graded. The bracket $[\,,\,]$ mentioned in Lemma 2.2 therefore extends to $W(H) \otimes \Lambda^\bullet(V^*)$ as well, and $(W(H) \otimes \Lambda^\bullet(V^*), [\,,\,])$ is a graded Lie algebra.

To an element $a \otimes x \in W(H) \otimes \Lambda^p(V)$, one can associate the operator
\[ i_{a\otimes x} : W(H) \otimes \Lambda^\bullet(V^*) \to W(H) \otimes \Lambda^{\bullet-p}(V^*) , \]
defined by $i_{a\otimes x}(b \otimes \omega) := a \bullet b \otimes i_x\omega$, where $i_x\omega$ denotes the usual interior product. Using the universal property, one gets a map
\[ (W(H) \otimes V) \times W(H) \otimes \Lambda^\bullet(V^*) \to W(H) \otimes \Lambda^{\bullet-p}(V^*), \qquad (X, s) \mapsto i_X s . \]
In the case where $p$ is odd, since $i_X$ acts "symmetrically" on the "Weyl part" and "anti-symmetrically" on the "form part", one has $i_X^2 = 0$. In the same way, if $Y \subset W(H)$ is a subspace such that $[Y, Y] \subset ZW(H)$ (e.g. $Y = S^{(1)}(H)$), to any element $U \in Y \otimes \Lambda^p(V^*)$ one can associate the operator
\[ \mathrm{ad}(U) : W(H) \otimes \Lambda^\bullet(V^*) \to W(H) \otimes \Lambda^{\bullet+p}(V^*) . \]
Using the Jacobi identity on the "Weyl part", one observes that, if $p$ is odd, one has $\mathrm{ad}(U)^2 = 0$.

Definition 2.3. Using the duality
\[ S^{(1)}(H) \otimes V^* \to S^{(1)}(H) \otimes V, \qquad U \mapsto {}^\sharp U , \]
one defines the cohomology (resp. homology) operator $\delta$ (resp. $\delta^*$) by
\[ \hbar\delta := \mathrm{ad}(\mu), \qquad \delta^* := i_{{}^\sharp\mu} , \]
where the linear moment $\mu$ is viewed as an element of $S^{(1)}(H) \otimes V^*$. For a form $a \in S^\bullet(V^*) \otimes \Lambda^\bullet(V^*) \subset W(H) \otimes \Lambda^\bullet(V^*)$ with total degree $t$, we set
\[ \delta^{-1} a := \begin{cases} \frac{1}{t}\,\delta^* a & \text{if } t > 0 , \\ 0 & \text{if } t = 0 . \end{cases} \]
One extends this definition $\mathbb{C}[[\hbar]]$-linearly to the whole $W(H) \otimes \Lambda^\bullet(V^*)$.

Lemma 2.4. ("Hodge decomposition")
\[ \delta\delta^{-1} + \delta^{-1}\delta = \mathrm{Id} - \mathrm{pr}_0 , \]
where $\mathrm{pr}_0$ is the canonical projection $W(H) \otimes \Lambda^\bullet(V^*) \xrightarrow{\mathrm{pr}_0} ZW(H)$.
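The identity of Lemma 2.4 can be checked by machine on the polynomial part $S^\bullet(V^*) \otimes \Lambda^\bullet(V^*)$. The sketch below is an illustration under stated assumptions: it fixes coordinates $y^i$ on $V$, takes $\delta a = dx^i \wedge \partial a/\partial y^i$ and $\delta^* a = y^i\, i(\partial_i) a$, and ignores $\hbar$; sign and normalization conventions may differ from the paper's, and the encoding and function names are hypothetical.

```python
from fractions import Fraction

# A polynomial form is a dict {(exps, form): coeff}: `exps` is the tuple of
# exponents of y^1..y^d and `form` the increasing tuple of indices i of the
# dx^i factors.  (Illustrative encoding.)

def add_term(out, key, c):
    """Add c to the coefficient of `key`, dropping zero coefficients."""
    v = out.get(key, 0) + c
    if v == 0:
        out.pop(key, None)
    else:
        out[key] = v

def plus(a, b, sign=1):
    """Return a + sign * b."""
    out = dict(a)
    for key, c in b.items():
        add_term(out, key, sign * c)
    return out

def delta(a):
    """delta a = sum_i dx^i wedge (partial a / partial y^i)."""
    out = {}
    for (exps, form), c in a.items():
        for i, e in enumerate(exps):
            if e == 0 or i in form:
                continue
            # sign from moving dx^i to its sorted position
            sign = (-1) ** sum(1 for j in form if j < i)
            new_exps = exps[:i] + (e - 1,) + exps[i + 1:]
            new_form = tuple(sorted(form + (i,)))
            add_term(out, (new_exps, new_form), sign * e * c)
    return out

def delta_star(a):
    """delta* a = sum_i y^i * (interior product of a with d/dx^i)."""
    out = {}
    for (exps, form), c in a.items():
        for pos, i in enumerate(form):
            new_exps = exps[:i] + (exps[i] + 1,) + exps[i + 1:]
            new_form = form[:pos] + form[pos + 1:]
            add_term(out, (new_exps, new_form), (-1) ** pos * c)
    return out

def delta_inv(a):
    """(1/t) delta* on terms of total degree t > 0, and 0 on degree 0."""
    out = {}
    for (exps, form), c in a.items():
        t = sum(exps) + len(form)
        if t > 0:
            for key, v in delta_star({(exps, form): Fraction(c)}).items():
                add_term(out, key, v / t)
    return out

def pr0(a):
    """Projection onto the part of total degree 0."""
    return {k: c for k, c in a.items() if sum(k[0]) == 0 and len(k[1]) == 0}

# Sample element in dimension 2: 3 y1^2 y2 dx1 + 5 - 2 y1 dx1^dx2 + 7 y2^3
a = {((2, 1), (0,)): 3, ((0, 0), ()): 5, ((1, 0), (0, 1)): -2, ((0, 3), ()): 7}
lhs = plus(delta(delta_inv(a)), delta_inv(delta(a)))
rhs = plus(a, pr0(a), sign=-1)
```

Both $\delta$ and $\delta^*$ preserve the total degree $t = p + r$, which is why dividing by $t$ commutes with them and the anticommutator $\delta\delta^* + \delta^*\delta$ reduces to multiplication by $t$.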
Proof. We observe that $\delta$ and $\delta^*$ are anti-derivations of degree $\pm 1$ of $(W(H) \otimes \Lambda^\bullet(V^*), \bullet)$. Their anti-commutator being a derivation of degree 0, it is therefore sufficient to check the formula on generators.

Observe that $\delta$ is an anti-derivation of degree $+1$ of $(W(H) \otimes \Lambda^\bullet(V^*), \star)$.

2.2. The Weyl bundle

Let $(N, \Lambda)$ be a regular Poisson manifold. The Poisson bivector $\Lambda$ induces a short sequence of vector bundles over $N$:
\[ 0 \to \mathrm{rad}(\Lambda) \to T^*(N) \xrightarrow{\iota^*} D^* \to 0 , \]
where $D \xrightarrow{\iota} T(N)$ denotes the characteristic distribution associated to $\Lambda$ [15], and where $\mathrm{rad}(\Lambda)$ is the radical of $\Lambda$ in $T^*(N)$. One therefore gets a non-degenerate foliated 2-form $\omega^D \in \Omega^2(D)$, dual to the canonical one on the quotient $T^*(N)/\mathrm{rad}(\Lambda) = D^*$. Fix a $\mathrm{rank}(D)$-dimensional symplectic vector space $(V, \omega)$ and, for all $x \in N$, define
\[ P_x = \{ b \in \mathrm{Hom}_{\mathbb{R}}(V, D_x) \mid b^*\omega^D_x = \omega \} . \]
Then $P = \bigcup_{x\in N} P_x$ is naturally endowed with a structure of $Sp(V, \omega)$-principal bundle over $N$ (analogous to the symplectic frames in the symplectic case, except that here one does not have a $G$-structure in general).

Definition 2.5. The Weyl bundle is the associated bundle
\[ W = P \times_{Sp(V,\omega)} W(H) , \]
where $W(H)$ is the vector space underlying the linear Weyl algebra defined from the data of $(V, \omega)$.

The space of $p$-forms with values in the sections of $W$ is denoted by $\Omega^p(W)$; it is canonically isomorphic to the space of sections of the associated bundle $P \times_{Sp(V,\omega)} (W(H) \otimes \Lambda^p(V^*))$. The $Sp(V, \omega)$-invariance, at the linear level, of both products $\star$ and $\bullet$ on $W(H) \otimes \Lambda^\bullet(V^*)$ provides graded products, again denoted by $\star$ and $\bullet$, on $\Omega^\bullet(W)$. In the same way, the operators $\delta$ and $\delta^{-1}$ on $W(H) \otimes \Lambda^\bullet(V^*)$ define operators on sections:
\[ \Omega^\bullet(W) \;\underset{\delta^{-1}}{\overset{\delta}{\rightleftarrows}}\; \Omega^{\bullet+1}(W) , \]
leading to a Hodge decomposition of sections as in Lemma 2.4. Note that, the bundle $ZW = P \times_{Sp(V,\omega)} ZW(H)$ being trivial, its space of sections is isomorphic to $C^\infty(N)[[\hbar]]$.

Remark 2.6. Observe that, as a vector bundle, $W$ is defined as soon as the distribution $D$ is given (cf. Lemma 2.1). The full data of the Poisson tensor $\Lambda$ is only needed to define the algebra structure on its space of sections.
2.3. Fedosov–Moyal star products

Definition 2.7. A foliated connection is a linear map
\[ \nabla : D \otimes D \to D, \qquad u \otimes v \mapsto \nabla_u v , \]
verifying ($f \in C^\infty(N)$)
(i) $\nabla_{fu} v = f \nabla_u v$,
(ii) $\nabla_u (fv) = f \nabla_u v + (L_{\iota(u)} f)\, v$.
A foliated connection is said to be symplectic if
(iii) $\nabla_u v - \nabla_v u - [u, v] = 0$,
(iv) $\nabla\omega = 0$.

Lemma 2.8. On a regular Poisson manifold, a symplectic foliated connection always exists.

Proof. Choose any linear connection $\nabla^0$ in the vector bundle $D \to N$. Since $D$ is an involutive tangent distribution, the torsion $T^0$ of the connection is well defined as a section of $D^* \otimes \mathrm{End}(D)$. One then obtains a "torsion-free" connection $\nabla^1 = \nabla^0 - \frac{1}{2} T^0$ in $D$. Now, the formula
\[ \omega^D(S(u, v), w) = \frac{1}{3}\left( (\nabla^1_u \omega^D)(v, w) + (\nabla^1_v \omega^D)(u, w) \right) \]
defines a tensor $S$, section of $D^* \otimes D^* \otimes D$, such that $\nabla = \nabla^1 + S$ is as desired.

Now, fix such a foliated symplectic connection $\nabla$ in $D$ and consider its associated covariant exterior derivative
\[ \Omega^p(W) \xrightarrow{\partial} \Omega^{p+1}(W) , \]
defined by
\[ \partial s(u_1, \ldots, u_{p+1}) = \sum_{i=1}^{p+1} (-1)^{i-1} (\nabla_{u_i} s)(u_1, \ldots, \hat{u}_i, \ldots, u_{p+1}) . \]
Lemma 2.2 then provides a 2-form $R \in \Omega^2(D \otimes D) \subset \Omega^2(W)$ defined by the formula
\[ 2\hbar\,\partial^2 = \mathrm{ad}(R) . \]
Inductively on the degree, one sees [8, Theorem 5.2.2] that the equation
\[ R + 2\hbar(\partial\gamma - \delta\gamma + \gamma^2) = 0 \]
has a unique solution $\gamma \in \Omega^1(W)$ such that $\delta^{-1}\gamma = 0$. This implies that the graded derivation
\[ D = \partial - \delta + \mathrm{ad}(\gamma) \]
of $(\Omega^\bullet(W), \star)$ is flat, i.e. $D^2 = 0$. One then proves, again inductively, that the projection
\[ W_D \xrightarrow{\mathrm{pr}_0} C^\infty(N)[[\hbar]] , \]
where $W_D$ is the kernel of $D$ restricted to the sections of $W$, is a linear isomorphism. The space of flat sections $W_D$ being a subalgebra of the sections of $W$ with respect to the product $\star$ ($D$ is a derivation), the above linear isomorphism yields a star product on $C^\infty(N)$ called the Fedosov–Moyal star product on $(N, \Lambda)$.

Remark 2.9. The Fedosov–Moyal star product constructed above is tangential. That means that it restricts well to the leaves or, to say it more technically, that $f \star g = f.g$ for $f, g$ leafwise constant functions. Indeed, the only differential operators used in the construction are generated by the sections of $D$, which are the vector fields on $N$ that vanish on leafwise constant functions.

3. The Main Construction

3.1. A particular Poisson manifold

Notations

Let $(M, \omega)$ be a compact symplectic manifold. Let $\Omega \in C^\infty(]-\varepsilon, \varepsilon[, \Omega^2(M))$ be a smooth path of symplectic structures on $M$ such that $\Omega(0) = \omega$. The smooth family $\{\Omega(t)\}_{t\in ]-\varepsilon,\varepsilon[}$ then canonically defines on $\hat M := M \times ]-\varepsilon, \varepsilon[$ a Poisson structure $\hat\Omega$ whose symplectic leaves are $\{(M \times \{t\}, \Omega(t))\}$. We will denote by $D \subset T\hat M$ the characteristic distribution of the Poisson structure $\hat\Omega$ (i.e. $D_{(x,t)} = T_{(x,t)}(M \times \{t\})$).

The spaces $C^\infty(\hat M)[[\hbar]]$ (resp. $C^\infty(M)[[\hbar]]$) of power series in $\hbar$ with values in the algebra of smooth functions on $\hat M$ (resp. $M$) are $\mathbb{R}[[\hbar]]$-algebras. The quotient $\mathbb{R}[[\hbar]]$-algebra $C^\infty(M)[[\hbar]]/\hbar^{n+1}C^\infty(M)[[\hbar]]$ will be denoted by $C^\infty(M)[[\hbar]]_n$. It will often be identified with the space of polynomials in $\hbar$ of degree at most $n$ with values in $C^\infty(M)$.

We will consider the natural inclusion $i : C^\infty(M)[[\hbar]]_n \hookrightarrow C^\infty(\hat M)[[\hbar]]$ defined by $i(f)(x, t) = f(x)$, $\forall t \in ]-\varepsilon, \varepsilon[$. We will often denote $i(f)$ by $\hat f$.

By $DO_D(\hat M)$ we will denote the algebra of tangential (with respect to the distribution $D$) differential operators on $\hat M$, i.e. the set of all differential operators on $\hat M$ vanishing on leafwise constant functions. By $DO(M)$ we will denote the algebra of differential operators on $M$. As above we can consider the $\mathbb{R}[[\hbar]]$-algebras $DO_D(\hat M)[[\hbar]]$ and $DO(M)[[\hbar]]/\hbar^{n+1}DO(M)[[\hbar]]$ (abbreviated by $DO(M)[[\hbar]]_n$). When dealing with bidifferential operators, we will use the prefix "biDO".

3.2. Taylor expansions

We have $C^\infty(\hat M) \simeq C^\infty(]-\varepsilon, \varepsilon[, C^\infty(M))$, seeing every element $a \in C^\infty(\hat M)$ as a function of one variable with values in a Fréchet space. We can therefore consider [4] its Taylor expansion of order $n$ at 0:
\[ a(t) = \sum_{k=0}^{n} \frac{1}{k!}\, t^k a^{(k)}(0) + t^n R_n(a)(t) \quad \text{with } R_n(a)(t) \to 0 \text{ as } t \to 0 . \]
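We define the $\mathbb{R}$-linear map $j_n^\hbar : C^\infty(\hat M) \to C^\infty(M)[[\hbar]]_n$ by
\[ j_n^\hbar a = \sum_{k=0}^{n} \hbar^k \frac{1}{k!}\, a^{(k)}(0) . \]
It is extended to $C^\infty(\hat M)[[\hbar]]$ in the following way:
\[ C^\infty(\hat M)[[\hbar]] \to C^\infty(M)[[\hbar]]_n, \qquad a = \sum_{l\geq 0} \hbar^l a_l \;\mapsto\; j_n^\hbar a = \sum_{l=0}^{n} \hbar^l j_{n-l}^\hbar a_l = \sum_{0\leq k+l\leq n} \hbar^{k+l} \frac{1}{k!}\, a_l^{(k)}(0) . \]
One then has

Lemma 3.1.
(1) $j_n^\hbar a = j_n^\hbar(a \bmod \hbar^{n+1})$.
(2) $j_n^\hbar$ is an $\mathbb{R}[[\hbar]]$-algebra homomorphism.

We now extend the map $j_n^\hbar$ to $DO_D(\hat M)[[\hbar]]$ in the natural way.

Definition 3.2.
(1) For $\Phi \in DO_D(\hat M)[[\hbar]]$, we define the operator $j_n^\hbar\Phi : C^\infty(M)[[\hbar]] \to C^\infty(M)[[\hbar]]_n$ by
\[ j_n^\hbar\Phi\,.\,f = j_n^\hbar(\Phi.\hat f), \quad \forall f \in C^\infty(M)[[\hbar]] . \]
(2) Similarly, for $B \in biDO_D(\hat M)[[\hbar]]$, we set
\[ j_n^\hbar B\,.\,(f, g) = j_n^\hbar(B.(\hat f, \hat g)), \quad \forall f, g \in C^\infty(M)[[\hbar]] . \]

Lemma 3.3.
(1) One has $j_n^\hbar\Phi \in DO(M)[[\hbar]]_n$ and $j_n^\hbar B \in biDO(M)[[\hbar]]_n$.
(2) For all $a, b \in C^\infty(\hat M)[[\hbar]]$ one has
\[ j_n^\hbar(\Phi.a) = j_n^\hbar\Phi\,.\,j_n^\hbar a \quad\text{and}\quad j_n^\hbar(B.(a, b)) = j_n^\hbar B\,.\,(j_n^\hbar a, j_n^\hbar b) . \]

Proof. We will show that $j_n^\hbar\Phi$ and $j_n^\hbar B$ are local, hence differential by Peetre's theorem [5, 12, 13]. Let $f \in C^\infty(M)$ and let $U$ be an open set in $M$ such that $f|_U \equiv 0$. Since $\hat f(x, t) = 0$ $\forall (x, t) \in U \times ]-\varepsilon, \varepsilon[$ and $\Phi$ is differential, one has $(\Phi.\hat f)|_{U\times ]-\varepsilon,\varepsilon[} \equiv 0$. Hence
\[ ((j_n^\hbar\Phi).f)|_U = (j_n^\hbar(\Phi.\hat f))|_U = \sum_{0\leq k+l\leq n} \frac{\hbar^{k+l}}{l!} \left( (\Phi.\hat f)|_{U\times ]-\varepsilon,\varepsilon[} \right)^{(l)}(0) = 0 . \]
The bidifferential case follows in the same way. This proves the first part of the lemma. The second one follows from simple computations.

Remark 3.4. Lemma 3.3 implies that $j_n^\hbar$, defined as a map from $DO_D(\hat M)[[\hbar]]$ to $DO(M)[[\hbar]]_n$, is an $\mathbb{R}[[\hbar]]$-algebra homomorphism for the composition product on both algebras.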
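The homomorphism property of Lemma 3.1 can be illustrated at a fixed point $x \in M$, where an element of $C^\infty(\hat M)[[\hbar]]$ that is polynomial in $t$ reduces to a polynomial in $(\hbar, t)$ with scalar coefficients. The following sketch is an illustration only: the encoding and the function names are assumptions, and restricting to polynomials in $t$ sidesteps the Taylor remainder. In this setting $j_n^\hbar$ amounts to substituting $t = \hbar$ and truncating at order $n$.

```python
from fractions import Fraction

# An element a = sum_l hbar^l a_l(t), evaluated at a fixed point x and
# polynomial in t, is stored as {(l, k): coeff} for coeff * hbar^l * t^k.

def jet(a, n):
    """j_n^hbar a: coefficient list [c_0, ..., c_n] of powers of hbar.
    Since (1/k!) d^k/dt^k (t^k) at t = 0 equals 1, the monomial
    hbar^l t^k contributes its coefficient to hbar^(k + l)."""
    out = [Fraction(0)] * (n + 1)
    for (l, k), c in a.items():
        if l + k <= n:
            out[l + k] += c
    return out

def mul_series(a, b):
    """Product in the algebra of polynomials in (hbar, t)."""
    out = {}
    for (l1, k1), c1 in a.items():
        for (l2, k2), c2 in b.items():
            key = (l1 + l2, k1 + k2)
            out[key] = out.get(key, 0) + c1 * c2
    return out

def mul_truncated(u, v, n):
    """Product of two elements of the quotient by hbar^(n+1)."""
    out = [Fraction(0)] * (n + 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            if i + j <= n:
                out[i + j] += ui * vj
    return out

a = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 2): -1, (0, 3): 5}
b = {(0, 0): 2, (0, 2): 1, (2, 1): 4}
n = 3
jet_of_product = jet(mul_series(a, b), n)
product_of_jets = mul_truncated(jet(a, n), jet(b, n), n)
```

The two lists agree, which is item (2) of Lemma 3.1 in this scalar toy setting; discarding all terms of `a` with `l > n` before calling `jet` leaves the result unchanged, which is item (1).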
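3.3. Induced star-products

Let now $\hat\star$ be any tangential star product on $(\hat M, \hat\Omega)$; for instance, consider the Moyal–Fedosov star product defined in Sec. 2.

Definition 3.5.
(1) We define $\star_n$ to be the map from $C^\infty(M)[[\hbar]]_n \times C^\infty(M)[[\hbar]]_n$ to $C^\infty(M)[[\hbar]]_n$ given by
\[ f \star_n g = j_n^\hbar(\hat f\,\hat\star\,\hat g) . \]
Equivalently (by Lemma 3.3), seeing $\hat\star$ as an element of $biDO_D(\hat M)[[\hbar]]$, one has $\star_n = j_n^\hbar\hat\star$.
(2) We define $\star$ to be the operation from $C^\infty(M)[[\hbar]] \times C^\infty(M)[[\hbar]]$ to $C^\infty(M)[[\hbar]]$ given by
\[ f \star g \bmod \hbar^{n+1} = (f \bmod \hbar^{n+1}) \star_n (g \bmod \hbar^{n+1}) \]
for all $n$ in $\mathbb{N}$.

Lemma 3.6.
(1) $\star_n$ is an associative product on the $\mathbb{R}[[\hbar]]$-algebra $C^\infty(M)[[\hbar]]_n$.
(2) $\star$ is a star-product on $M$, called the star product induced on $M$ by $\hat\star$.

Proof. For $f, g, h \in C^\infty(M)[[\hbar]]_n$, one has $(\hat f\,\hat\star\,\hat g)\,\hat\star\,\hat h = \hat f\,\hat\star\,(\hat g\,\hat\star\,\hat h)$. Therefore, $j_n^\hbar((\hat f\,\hat\star\,\hat g)\,\hat\star\,\hat h) = j_n^\hbar(\hat f\,\hat\star\,(\hat g\,\hat\star\,\hat h))$ if and only if
\begin{align*}
& j_n^\hbar(\hat\star.(\hat\star.(\hat f, \hat g), \hat h)) = j_n^\hbar(\hat\star.(\hat f, \hat\star.(\hat g, \hat h))) && \text{(reformulation)} \\
\Leftrightarrow\ & (j_n^\hbar\hat\star).(j_n^\hbar(\hat\star.(\hat f, \hat g)), j_n^\hbar\hat h) = (j_n^\hbar\hat\star).(j_n^\hbar\hat f, j_n^\hbar(\hat\star.(\hat g, \hat h))) && \text{(by Lemma 3.3)} \\
\Leftrightarrow\ & (j_n^\hbar\hat\star).((j_n^\hbar\hat\star).(f, g), h) = (j_n^\hbar\hat\star).(f, (j_n^\hbar\hat\star).(g, h)) && \text{(by Lemma 3.3)} \\
\Leftrightarrow\ & (f \star_n g) \star_n h = f \star_n (g \star_n h) && \text{(by Definition 3.5).}
\end{align*}
This proves item (1), in what is a classical way of showing that a star-product is associative.

Corollary 3.7. If $\hat\star_1$ and $\hat\star_2$ are tangentially equivalent tangential star products on $(\hat M, \hat\Omega)$, then the induced star products $\star_1$ and $\star_2$ on $(M, \omega)$ are equivalent.

Proof. The hypothesis implies that there exists an equivalence $\Phi \in DO_D(\hat M)[[\hbar]]$ such that $\Phi.(a\,\hat\star_1\,b) = \Phi.a\,\hat\star_2\,\Phi.b$ for all $a, b \in C^\infty(\hat M)[[\hbar]]$. We then check, as in the proof of Lemma 3.6, that the operator $\Psi \bmod \hbar^{n+1} := j_n^\hbar\Phi$, $n \in \mathbb{N}$, defines an equivalence between $\star_1$ and $\star_2$.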
4. Characteristic Classes

Let $\Omega^\hbar = \sum_{k\geq 0} \hbar^k\omega^k \in Z^2(M)[[\hbar]]$ be a formal power series of closed 2-forms on $M$. A refinement of the classical Borel lemma (see the appendix) yields

Lemma 4.1. Let $\Omega_i^\hbar \in Z^2(M)[[\hbar]]$ ($i = 1, 2$). Assume that $[\Omega_1^\hbar] = [\Omega_2^\hbar]$ in $H^2(M)[[\hbar]]$ or, equivalently, that there exists $\nu^\hbar \in \Omega^1(M)[[\hbar]]$ such that $\Omega_2^\hbar - \Omega_1^\hbar = d\nu^\hbar$. Then there exist smooth functions $\Omega_i \in C^\infty(]-\varepsilon, \varepsilon[, \Omega^2(M))$ and $\nu \in C^\infty(]-\varepsilon, \varepsilon[, \Omega^1(M))$ such that
(i) $\frac{1}{k!}\frac{d^k}{dt^k}\Omega_i|_{t=0} = \omega_i^k$;
(ii) $\forall t$, $\Omega_i(t)$ is symplectic;
(iii) $\forall t$, $\Omega_2(t) - \Omega_1(t) = d(\nu(t))$ or, equivalently, $[\Omega_1(t)] = [\Omega_2(t)]$.

Definition 4.2. Let us fix a connection $\nabla^0$ in the vector bundle $D \to \hat M = M \times ]-\varepsilon, \varepsilon[$. Let $\Omega^\hbar \in \Omega^2(M)[[\hbar]]$ be a series of closed 2-forms on $M$ such that $\Omega^\hbar \bmod \hbar = \omega$. Let $\Omega \in C^\infty(]-\varepsilon, \varepsilon[, \Omega^2(M))$ be a smooth family of symplectic structures on $M$ admitting $\Omega^\hbar$ as $\infty$-jet (cf. Lemma 4.1). Let $\nabla$ be the symplectic foliated connection on $\hat M$ obtained from the data of $\nabla^0$ and $\Omega$ (cf. Sec. 2). Let $\hat\star$ be the Moyal–Fedosov star product on $(\hat M, \hat\Omega)$ associated to $\nabla$. The star product $\star_{\Omega^\hbar}$ on $(M, \omega)$ induced by $\hat\star$ will be called the star product associated to the series $\Omega^\hbar$.

Proposition 4.3. Let $\Omega_i^\hbar$ ($i = 1, 2$) be two series of closed 2-forms on $M$ such that $\Omega_i^\hbar \bmod \hbar = \omega$. Denote by $\star_i$ ($i = 1, 2$) the associated star products on $(M, \omega)$. Then $\star_1$ and $\star_2$ are equivalent star products if and only if $[\Omega_1^\hbar] = [\Omega_2^\hbar]$ in $H^2_{\text{de Rham}}[[\hbar]]$.

The proof of Proposition 4.3 is postponed to the end of this section.

Definition 4.4. A diffeomorphism $\hat\varphi : \hat M \to \hat M$ preserves the foliation if
(i) $\hat\varphi(M_t) \subset M_t$ $\forall t$ and
(ii) $\hat\varphi|_{M_0} = \mathrm{id}_{M_0}$.

We first adapt Moser's lemma to our parametric situation.

Lemma 4.5. Let $\{\Omega_i(t)\}_{t\in ]-\varepsilon,\varepsilon[}$ ($i = 1, 2$) be two smooth families of symplectic structures on $M$ such that $\Omega_1(0) = \Omega_2(0) = \omega$. Assume that, for all $t \in ]-\varepsilon, \varepsilon[$, they have the same de Rham class: $[\Omega_1(t)] = [\Omega_2(t)]$ in $H^2(M)$. Then there exists a Poisson diffeomorphism $\hat\varphi : (\hat M, \hat\Omega_2) \to (\hat M, \hat\Omega_1)$ which preserves the foliation.

Proof. By Hodge theory one has that $\Omega_1(t) - \Omega_2(t) = d\nu^t$, where $\nu^t \in \Omega^1(M)$ is smooth in $t$. Set $\omega_s^t = \Omega_2(t) + s\,d\nu^t$, $s \in [0, 1]$. The form $\omega_s^0 = \omega$ is symplectic on $M$ for all $s \in [0, 1]$; hence, by compactness, one can choose $\varepsilon > 0$ such that $\omega_s^t$ is symplectic for all $t \in ]-\varepsilon, \varepsilon[$ and $s \in [0, 1]$.
Consider $N = M \times [0, 1]$ endowed with the natural foliation $F = \{M \times \{s\}\}$. Define the following smooth families of 2-forms on $N$:
\[ (\tilde\omega_t)_{(x,s)} := (\omega_s^t)_x \quad\text{and}\quad (\omega_t)_{(x,s)} := (\tilde\omega_t)_{(x,s)} - (\nu^t)_x \wedge ds . \]
Then $d_N(\omega_t) = d_N(\tilde\omega_t) - d_M(\nu^t) \wedge ds = 0$ for all $t$. Moreover, $\mathrm{rad}_{T_{(x,s)}(N)}(\omega_t)$ is not entirely contained in $T(F)$; hence one can find a smooth family of vector fields of the form $X_t = \frac{\partial}{\partial s} + Y_t$ ($Y_t \in \Gamma(T(F))$) generating the smooth family of smooth distributions $\mathrm{rad}(\omega_t)$. One has therefore
\[ L_{X_t}\omega_t = d(i_{X_t}\omega_t) + i_{X_t}d\omega_t = 0 . \]
Denoting by $\{\varphi^u_{X_t}\}$ the flow of $X_t$, one has
\[ (\varphi^u_{X_t})^*\omega_t = \omega_t \quad\text{and}\quad \varphi^u_{X_t}(M \times \{s\}) = M \times \{s + u\} . \]
One then gets a smooth family $\{\varphi_t\}$ of diffeomorphisms of $M$, defined by $\varphi^1_{X_t} \circ i_0 = i_1 \circ \varphi_t$, such that $\varphi_t^*(\Omega_1(t)) = \Omega_2(t)$ ($i_s : M \to N$ denotes the natural inclusion $i_s(x) = (x, s)$). Shrinking $\varepsilon$ once more if necessary, one gets the desired Poisson map by setting $\hat\varphi(x, t) = (\varphi_t(x), t)$. Observe that $X_0 = \partial_s$, hence $\varphi_0 = \mathrm{id}_M$.

Lemma 4.6. Let $\hat\star_i$ ($i = 1, 2$) be tangential star products on $\hat M$. Suppose there exists a diffeomorphism $\hat\varphi : \hat M \to \hat M$ preserving the foliation such that $\hat\star_1^{\hat\varphi} = \hat\star_2 \bmod (\hbar^n)$. Then $\star_1$ and $\star_2$ are equivalent star products up to order $n$.

Proof. The right action of the diffeomorphism group, $C^\infty(\hat M) \times \mathrm{Diff}(\hat M) \xrightarrow{\rho} C^\infty(\hat M)$, $\rho(\hat\varphi)u = \hat\varphi^* u$, yields a map
\[ \rho_n^\hbar : \mathrm{Diff}(\hat M) \to \mathrm{Hom}_{\mathbb{R}}(C^\infty(M), C^\infty(M)[[\hbar]]_n) : \quad \rho_n^\hbar(\hat\varphi) f = j_n^\hbar(\hat\varphi^*\hat f) . \]
Definition 4.4 implies that if $\hat\varphi$ preserves the foliation, then $\rho_n^\hbar(\hat\varphi) \in DO(M)[[\hbar]]_n$ and $\rho_0^\hbar(\hat\varphi) = \mathrm{id}$. Therefore an argument similar to the one used for Lemma 3.6 yields the conclusion.

Corollary 4.7. With the notation of Proposition 4.3, if $\Omega_1^\hbar$ and $\Omega_2^\hbar$ are cohomologous in $H^2(M)[[\hbar]]$, then the star products $\star_1$ and $\star_2$ are equivalent.

Proof. The first $N$ cochains of a Fedosov star product are entirely determined by the first $N$ terms of its Weyl curvature. Therefore, the above lemmas imply that $\star_1$ and $\star_2$ are equivalent up to any order. It is then classical that they are equivalent [1].
Proof of Proposition 4.3. We first consider a particular case. Let $\alpha^\hbar = \alpha_o + \hbar\alpha_1 + \cdots \in Z^2(M)[[\hbar]]$ be a sequence of closed 2-forms on $M$. Set $\Omega'^\hbar = \Omega^\hbar + \hbar^k\alpha^\hbar$. Denote by $\Omega_t$, $\alpha_t$ and $\Omega'_t = \Omega_t + t^k\alpha_t$ respectively the smooth functions associated to the series $\Omega^\hbar$, $\alpha^\hbar$ and $\Omega'^\hbar$ as in Lemma 4.1. The functions $\Omega_t$ and $\Omega'_t$ define two different Poisson structures on $\hat M$. We denote by $\Lambda^t$ and $\Lambda'^t$ (resp. $\omega^t$ and $\omega'^t$) the corresponding bivector fields (resp. $D$-2-forms). One has
\[ \omega'^t = \omega^t + t^k\alpha^t \quad\text{and}\quad \Lambda'^t = \Lambda^t - t^k\,{}^\sharp\alpha_o + t^{k+1}\lambda , \tag{4.1} \]
where we denote again by $\alpha^t$ the $D$-2-form corresponding to $\alpha_t$, where $\sharp$ is the musical isomorphism between $D^*$ and $D$ induced by $\Omega_t$, and where $\lambda \in C^\infty(]-\varepsilon, \varepsilon[, \Gamma\wedge^2 D)$. Let $\nabla$ be the symplectic foliated connection on $(\hat M, \Lambda^t)$ obtained from the data of $\nabla^0$ (cf. Definition 4.2). Let $\star'$ be the star-product on $M$ induced by the Moyal–Fedosov star-product $\hat\star'$ on $(\hat M, \Lambda'^t)$. We now define a specific foliated symplectic connection $\nabla'$ adapted to $\omega'^t$. Let us look for $\nabla'$ of the form $\nabla + S$, where $S$ is a symmetric 2-$D$-tensor field. We set
\[ \omega'^t(\nabla'_u v, w) = \omega'^t(\nabla_u v, w) + \frac{1}{3}(\nabla_u\omega'^t)(v, w) + \frac{1}{3}(\nabla_v\omega'^t)(u, w) . \]
This leads to
\[ (\omega^t + t^k\alpha^t)(S(u, v), w) = \frac{t^k}{3}\left[ (\nabla_u\alpha^t)(v, w) + (\nabla_v\alpha^t)(u, w) \right] \]
as $\nabla\omega^t = 0$. By construction $\omega^t + t^k\alpha^t$ is invertible, so $S(u, v)$ is completely determined and of the form $S(u, v) = t^k s(u, v)$. We thus have
\[ \nabla' = \nabla + t^k s . \tag{4.2} \]
Let now $\circ_t$ (resp. $\circ'_t$) be the associative product on the sections of the Weyl bundle $W$ over $\hat M$ determined by the data of $\Lambda$ (resp. $\Lambda'$) (cf. Sec. 2 and Remark 2.6). By construction, we then get, $\forall u, v \in W$,
\[ \frac{d^l}{dt^l}(u \circ_t v - u \circ'_t v)(0) = 0 \quad \forall l \leq k - 1 . \tag{4.3} \]
Similarly, for the Moyal–Fedosov star products $\hat\star$ and $\hat\star'$ associated to $(\Omega, \nabla)$ and $(\Omega', \nabla')$, Eqs. (4.2) and (4.3) yield
\[ \frac{d^l}{dt^l}(a\,\hat\star\,b - a\,\hat\star'\,b)(0) = 0 \quad \forall l \leq k - 1 . \tag{4.4} \]
Now let us see what happens for $\star$ and $\star'$. Let $f, g \in C^\infty(M)$ and write $\hat\star = \sum_{i\geq 0}\hbar^i\hat C_i$ and $\hat\star' = \sum_{i\geq 0}\hbar^i\hat C'_i$. Setting $u^{(l)} := \frac{d^l}{dt^l}u$, we have
\begin{align*}
f \star g - f \star' g &= \sum_{j\geq 0} \frac{\hbar^j}{j!}\,(f\,\hat\star\,g - f\,\hat\star'\,g)^{(j)}(0) \\
&= \sum_{j\geq k} \frac{\hbar^j}{j!}\,(f\,\hat\star\,g - f\,\hat\star'\,g)^{(j)}(0) \quad \text{(cf. Eq. (4.4))} \\
&= \sum_{j\geq k} \frac{\hbar^j}{j!} \Big( \sum_{i\geq 0} \hbar^i (\hat C_i(f, g) - \hat C'_i(f, g)) \Big)^{(j)}(0) \\
&= \sum_{j\geq k,\, i\geq 0} \frac{\hbar^{i+j}}{j!}\,(\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
&= \sum_{m\geq k} \hbar^m \sum_{m=i+j,\, j\geq k,\, i\geq 0} \frac{1}{j!}\,(\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
&= \frac{\hbar^k}{k!}(fg - gf) + \frac{\hbar^{k+1}}{(k+1)!}(fg - gf) + \frac{\hbar^{k+1}}{k!}(\hat C_1(f, g) - \hat C'_1(f, g))^{(k)}(0) \\
&\quad + \sum_{m\geq k+2} \hbar^m \sum_{m=i+j,\, j\geq k,\, i\geq 0} \frac{1}{j!}\,(\hat C_i(f, g) - \hat C'_i(f, g))^{(j)}(0) \\
&= \frac{\hbar^{k+1}}{k!}\,(\Lambda^t(df, dg) - \Lambda'^t(df, dg))^{(k)}(0) + o(\hbar^{k+1}) \\
&= \frac{\hbar^{k+1}}{k!}\,(\Lambda^t(df, dg) - \Lambda^t(df, dg) + t^k\,{}^\sharp\alpha_o(df, dg) - t^{k+1}\lambda(df, dg))^{(k)}(0) + o(\hbar^{k+1}) \\
&= \hbar^{k+1}\,{}^\sharp\alpha_o(df, dg) + o(\hbar^{k+1}) .
\end{align*}
Then, setting $\star = \sum_{i\geq 0}\hbar^i C_i$ and $\star' = \sum_{i\geq 0}\hbar^i C'_i$, we have
\[ C'_i = C_i, \quad i = 0, \ldots, k \quad\text{and}\quad C_{k+1} = C'_{k+1} + {}^\sharp\alpha_o . \tag{4.5} \]
Let us pass to the general case. Suppose that $[\Omega_1^\hbar] \neq [\Omega_2^\hbar]$. We denote by $k$ the smallest integer such that $[\omega_1^k] \neq [\omega_2^k]$. Let us consider
\[ \Omega_3^\hbar = \hbar\omega_1^1 + \hbar^2\omega_1^2 + \cdots + \hbar^{k-1}\omega_1^{k-1} + \hbar^k\omega_2^k + \hbar^{k+1}\omega_2^{k+1} + \cdots . \]
We have $[\Omega_3^\hbar] = [\Omega_2^\hbar]$ and $\Omega_1^\hbar = \Omega_3^\hbar + \hbar^k(\omega_1^k - \omega_2^k) + \hbar^{k+1}\cdots$. Denoting by $\star_i$ the product associated with $\Omega_i^\hbar$, we know that $\star_2$ and $\star_3$ are equivalent. What has been done previously implies $\star_1 = \star_3 \bmod \hbar^{k+1}$ and $C^{(1)}_{k+1} = C^{(3)}_{k+1} \pm {}^\sharp\alpha_o$ with $\alpha_o = \omega_1^k - \omega_2^k$. But in this case, we know that $\star_1 \sim \star_3 \bmod \hbar^{k+2}$ if and only if $\alpha_o$ is exact [3]. Since $\omega_1^k - \omega_2^k$ is not exact by hypothesis, $\star_1 \not\sim \star_3$ and thus $\star_1 \not\sim \star_2$.

Remark 4.8. An alternative construction would be to directly consider formal deformations of the Weyl algebra bundle and related structures based on the preliminary data of a formal deformation of the symplectic structure (as opposed to a smooth deformation as considered here). This direction allows one to treat the case of non-compact symplectic manifolds as well. Indeed, one could either observe that the completeness of the vector field occurring in the proof of Lemma 4.5 is not necessary in the formal category, or design a formal version of Moser's argument.
In the present article we chose to remain in the smooth category to underline the link with the classical Moser's lemma.

5. Appendix: Borel's Lemma

We did not find in the literature a suitable reference for a "Borel's lemma" applying in our framework; we therefore establish such a result here. For a classical statement of this lemma see [11].

Proposition 5.1 (Borel's Lemma). Let $E$ be a Fréchet space and $\{\alpha_n \in E \mid n \in \mathbb{N}\}$ be a sequence in $E$. Then there exists $f \in C^\infty(\mathbb{R}, E)$ such that $f^{(n)}(0) = \alpha_n$.

Proof. Let $\varphi \in C^\infty(\mathbb{R}, \mathbb{R})$ be nonnegative, such that $\varphi(t) = 1$ for $|t| \leq \frac{1}{2}$ and $\varphi(t) = 0$ for $|t| \geq 1$, and define $f_n \in C^\infty(\mathbb{R}, E)$ by $f_n(t) = \frac{\alpha_n}{n!}\,t^n\varphi(\lambda_n t)$, where the numbers $\{\lambda_n\}$ ($\lambda_n \geq 1$) will be defined later. As $E$ is a Fréchet space, there exists a nondecreasing countable basis $\{p_0, \ldots, p_r, \ldots\}$ of continuous seminorms on $E$ [14].

Lemma 5.2. For all $n \in \mathbb{N}$ one can find $\lambda_n \in \mathbb{R}$ such that
\[ \sup_{t\in\mathbb{R}} p_{n-1}(f_n^{(k)}(t)) \leq \frac{1}{2^n} \quad \forall k \in \mathbb{N} \text{ s.t. } 0 \leq k \leq n - 1 . \]
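Proof. (Lemma 5.2) Let us define $K_n = p_{n-1}(\alpha_n)$ and $M_n = \sum_{j=1}^{n} \sup_{t\in\mathbb{R}} |\varphi^{(j)}(t)|$. We have
\begin{align*}
p_{n-1}(f_n^{(k)}(t)) &= p_{n-1}\Big( \sum_{p=0}^{k} \binom{k}{p} \frac{n!}{(n-p)!} \frac{\alpha_n}{n!}\, t^{n-p}\lambda_n^{k-p}\varphi^{(k-p)}(\lambda_n t) \Big) \\
&\leq \sum_{p=0}^{k} \binom{k}{p} \frac{p_{n-1}(\alpha_n)}{(n-p)!}\, |t|^{n-p}\lambda_n^{k-p}\,|\varphi^{(k-p)}(\lambda_n t)| \\
&\leq \sum_{p=0}^{k} \binom{k}{p} \frac{p_{n-1}(\alpha_n)}{(n-p)!}\, |t|^{n-p}\lambda_n^{n-p}\,|\varphi^{(k-p)}(\lambda_n t)|\, \frac{1}{\lambda_n^{n-k}} \\
&\leq \frac{K_n M_n}{\lambda_n^{n-k}} \sum_{p=0}^{k} \binom{k}{p} \frac{1}{(n-p)!} \leq \frac{K_n M_n}{\lambda_n} \sum_{p=0}^{k} \binom{k}{p} \frac{1}{(n-p)!} \\
&\leq \frac{K_n M_n}{\lambda_n} \sum_{p=0}^{n-1} \binom{n-1}{p} \frac{1}{(n-p)!} ,
\end{align*}
with the appropriate justifications for the inequalities: on the support of $\varphi^{(k-p)}(\lambda_n t)$ we have $|\lambda_n t| \leq 1$ and, as $n - k \geq 1$, we have $\lambda_n^{n-k} \geq \lambda_n$ if $\lambda_n \geq 1$, and $\binom{k}{p} \leq \binom{n-1}{p}$. Thus a choice of the $\lambda_n$'s such that
\[ \lambda_n \geq \max\Big\{ 1,\ 2^n K_n M_n \sum_{p=0}^{n-1} \binom{n-1}{p} \frac{1}{(n-p)!} \Big\} \]
yields the assertion.

Lemma 5.3. For the preceding choice of the $\lambda_n$'s, $\sum_{n\geq 0} f_n$ is convergent in $C^\infty(\mathbb{R}, E)$.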
Proof. (Lemma 5.3) For the details about the topology of $C^\infty(\mathbb{R}, E)$ we refer to [14, Chap. 20]. The set $\{P_{r,k,m} \mid (r, k, m) \in \mathbb{N}^3\}$ with $P_{r,k,m}(g) := \sup_{t\in[-m,m]} p_r(g^{(k)}(t))$ forms a basis of seminorms for the (Fréchet) topology of $C^\infty(\mathbb{R}, E)$. If we show that, $\forall (r, k, m) \in \mathbb{N}^3$, $\sum_{n\geq 0} P_{r,k,m}(f_n)$ converges as a real series, the lemma is proved. As the $f_n$'s are compactly supported, it is sufficient to prove that, $\forall (k, r) \in \mathbb{N}^2$, $\sum_{n\geq 0} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t))$ converges as a real series. So let us fix $(k, r) \in \mathbb{N}^2$ and define $s = \max\{k, r\}$. We have
\[ \sum_{n\geq 0} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) = \sum_{n=0}^{s} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) + \sum_{n\geq s+1} \sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) . \]
As $f_n \in C^\infty(\mathbb{R}, E)$ for each $n$, there is no problem for the first (finite) sum. For the second one we have $\sup_{t\in\mathbb{R}} p_r(f_n^{(k)}(t)) \leq \sup_{t\in\mathbb{R}} p_{n-1}(f_n^{(k)}(t)) \leq \frac{1}{2^n}$ since, first, $r \leq s \leq n - 1$ and the countable basis of seminorms is nondecreasing and, secondly, $k \leq s \leq n - 1$ and we can apply Lemma 5.2. Thus it is convergent.

So, by the two lemmas, we have constructed $f = \sum_{n=0}^{\infty} f_n \in C^\infty(\mathbb{R}, E)$. Let us finally see that this function fulfills the desired property. We have
\[ f^{(k)}(t) = \sum_{n=0}^{k} \Big( \frac{\alpha_n}{n!}\, t^n\varphi(\lambda_n t) \Big)^{(k)} + \sum_{n=k+1}^{\infty} \sum_{p=0}^{k} \binom{k}{p} \frac{n!}{(n-p)!} \frac{\alpha_n}{n!}\, t^{n-p}\lambda_n^{k-p}\varphi^{(k-p)}(\lambda_n t) . \]
In the second sum, we have $n - p \geq n - k \geq 1$. Thus it vanishes for $t = 0$. In the first sum, if $n \leq k - 1$, $\varphi$ is differentiated at least once. As $\varphi^{(j)}(0) = 0$ for $j \geq 1$, it vanishes for $t = 0$. Therefore, we have
\begin{align*}
f^{(k)}(0) &= \frac{\alpha_k}{k!}\,(t^k\varphi(\lambda_k t))^{(k)}(0) \\
&= \frac{\alpha_k}{k!} \sum_{p=0}^{k} \binom{k}{p} \frac{k!}{(k-p)!}\, t^{k-p}\lambda_k^{k-p}\varphi^{(k-p)}(\lambda_k t)\Big|_{t=0} .
\end{align*}
For $p \leq k - 1$, we have $k - p \geq 1$ and the corresponding term vanishes. Hence $f^{(k)}(0) = \frac{\alpha_k}{k!}\,k!\,\varphi(0) = \alpha_k$.
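To see the shape of the construction numerically, the following sketch evaluates a truncation of $f = \sum_n \frac{\alpha_n}{n!}\,t^n\varphi(\lambda_n t)$ in the scalar case $E = \mathbb{R}$ with $\alpha_n = n!$, whose formal Taylor series $\sum_n t^n$ diverges for $|t| \geq 1$. Everything here is an assumption made for illustration: the particular cutoff $\varphi$, the placeholder scaling $\lambda_n = 2^n$ (not the bound required by Lemma 5.2) and the function names. Finite differences at 0 are only used as a rough check of the first derivatives.

```python
import math

def g(x):
    """Smooth function vanishing for x <= 0: exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def cutoff(t):
    """A smooth nonnegative phi with phi = 1 on |t| <= 1/2 and
    phi = 0 on |t| >= 1 (one standard choice, assumed here)."""
    s = abs(t)
    num = g(1.0 - s)
    return num / (num + g(s - 0.5))

def borel_sum(alpha, lam, t, n_terms):
    """Partial sum of alpha(n)/n! * t^n * phi(lam(n) * t) for n < n_terms."""
    total = 0.0
    for n in range(n_terms):
        total += alpha(n) / math.factorial(n) * t ** n * cutoff(lam(n) * t)
    return total

def alpha(n):
    """Prescribed derivatives alpha_n = n! (illustrative data)."""
    return math.factorial(n)

def lam(n):
    """Placeholder scaling lambda_n = 2^n; NOT the bound of Lemma 5.2."""
    return 2.0 ** n

def f(t):
    return borel_sum(alpha, lam, t, 60)

h = 1e-3
value_at_zero = f(0.0)                                    # alpha_0 = 1
first_derivative = (f(h) - f(-h)) / (2 * h)               # close to alpha_1 = 1
second_derivative = (f(h) - 2 * f(0.0) + f(-h)) / h ** 2  # close to alpha_2 = 2
```

For any fixed $t \neq 0$ only the finitely many terms with $\lambda_n |t| < 1$ contribute, which is the mechanism the cutoffs $\varphi(\lambda_n t)$ provide in the proof.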
Corollary 5.4. Let $N$ be a smooth manifold and $\{\alpha_n \in \Omega^q(N) \mid n \in \mathbb{N}\}$ be a sequence in $\Omega^q(N)$. Then there exists $f \in C^\infty(\mathbb{R}, \Omega^q(N))$ such that $f^{(n)}(0) = \alpha_n$.

Proof. As $\Omega^q(N)$ is a Fréchet space [6], it is a straightforward application of the preceding proposition.

Corollary 5.5. Let $(\alpha_n^1)_{n\in\mathbb{N}}$, $(\alpha_n^2)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ be sequences of forms on a smooth manifold $N$. Then there exist smooth functions $f^1$, $f^2$ and $f$ corresponding respectively to $(\alpha_n^1)_{n\in\mathbb{N}}$, $(\alpha_n^2)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ as in Corollary 5.4 such that
(1) If $d\alpha_n^1 = 0$, $\forall n \in \mathbb{N}$, then $d(f^1(t)) = 0$, $\forall t \in \mathbb{R}$.
(2) If $\alpha_n^2 - \alpha_n^1 = d\nu_n$ $\forall n \in \mathbb{N}$, then $f^2(t) - f^1(t) = d(f(t))$, $\forall t \in \mathbb{R}$.

Proof. (1) We have $f_n^1(t) = \frac{\alpha_n^1}{n!}\,t^n\varphi(\lambda_n t)$, hence $d(f_n^1(t)) = 0$, $\forall t \in \mathbb{R}$, and for each $t$, $f^1(t) = \sum_{n=0}^{\infty} f_n^1(t)$ converges in $\Gamma^1(N, \wedge^q T^*N)$.
(2) Let $\lambda_n^1$, $\lambda_n^2$ and $\lambda_n$ be three real sequences defining smooth functions $\tilde f^1$, $\tilde f^2$ and $\tilde f$ corresponding respectively to $(\alpha_n^1)_{n\in\mathbb{N}}$, $(\alpha_n^2)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$ as in the proof of Proposition 5.1. Consider the sequence $\mu_n = \max\{\lambda_n^1, \lambda_n^2, \lambda_n\}$. Replacing $\lambda_n^1$, $\lambda_n^2$ and $\lambda_n$ by $\mu_n$ we get new functions $f^1$, $f^2$ and $f$, again corresponding respectively to $(\alpha_n^1)_{n\in\mathbb{N}}$, $(\alpha_n^2)_{n\in\mathbb{N}}$ and $(\nu_n)_{n\in\mathbb{N}}$, such that $f_n^2 - f_n^1 = df_n$ $\forall n \in \mathbb{N}$. Since for $t$ fixed the series converges in $\Gamma^0(N, \wedge^q T^*N)$, we obtain the result.

Acknowledgment

We warmly thank the referee for several improvements of the manuscript and interesting suggestions.

References

[1] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization. I. Deformations of symplectic structures, Ann. Phys. 111(1) (1978), 61–110.
[2] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization. II. Physical applications, Ann. Phys. 111(1) (1978), 111–151.
[3] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quantum Grav. 14(1A) (1997), A93–A107.
[4] N. Bourbaki, Éléments de mathématique. Variétés différentielles et analytiques. Fascicule de résultats (Paragraphes 1 à 15), Masson, 1983.
[5] M. Cahen, S. Gutt and M. De Wilde, Local cohomology of the algebra of $C^\infty$ functions on a connected manifold, Lett. Math. Phys. 4(3) (1990), 157–167.
[6] J. Dieudonné, Éléments d'analyse, Tome III, Chap. XVI et XVII, Gauthier-Villars, 1970.
[7] B. V. Fedosov, A simple geometrical construction of deformation quantization, J. Diff. Geom. 40(2) (1994), 213–238.
[8] B. V. Fedosov, Deformation quantization and index theory, Akademie Verlag, Berlin, 1996.
[9] M. Kontsevich, Deformation quantization of Poisson manifolds I. (preprint math.QA/9709040) 1997.
[10] J. E. Moyal, Quantum mechanics as a statistical theory, Proc. Camb. Philos. Soc. 45 (1949), 99–124.
[11] R. Narasimhan, Analysis on real and complex manifolds (Advanced Studies in Pure Mathematics, Vol. 1), Paris: Masson et Cie, Editeur; Amsterdam: North-Holland Publishing Company, 1968.
[12] J. Peetre, Une caracterisation abstraite des operateurs differentiels, Math. Scand. 7 (1959), 211–218.
[13] J. Peetre, Rectification a l'article "Une caracterisation abstraite des operateurs differentiels", Math. Scand. 8 (1960), 116–120.
[14] F. Trèves, Topological vector spaces, distributions and kernels, Academic Press, 1967.
[15] I. Vaisman, Lectures on the geometry of Poisson manifolds, Birkhäuser Verlag, Basel, 1994.
[16] H. Weyl, Gruppentheorie und quantenmechanik, Z. Physik, 1927.
[17] H. Weyl, The theory of groups and quantum mechanics, Dover, 1931.
[18] E. P. Wigner, On the quantum correction for thermodynamic equilibrium, Phys. Rev., II. Ser. 40 (1932), 749–759.
May 26, 2003 12:17 WSPC/148-RMP
00163
Reviews in Mathematical Physics Vol. 15, No. 3 (2003) 217–243 c World Scientific Publishing Company
DECOHERENCE INDUCED TRANSITION FROM QUANTUM TO CLASSICAL DYNAMICS
PH. BLANCHARD and R. OLKIEWICZ∗
Physics Faculty and BiBoS, University of Bielefeld, 33615 Bielefeld, Germany
∗Institute of Theoretical Physics, University of Wroclaw, 50-204 Wroclaw, Poland
Received 28 May 2002
Revised 19 November 2002

A framework for a general discussion of environmentally induced classical properties, like superselection rules, privileged basis and classical behavior, in quantum systems with both finite and infinite number of degrees of freedom is proposed. A number of examples showing that classical properties do not have to be postulated as an independent ingredient are given. In particular, it is shown that infinite open quantum systems in some cases may behave like simple classical dynamical systems.

Keywords: Quantum open systems; decoherence; dynamical semigroups; superselection rules; classical behavior.
1. Introduction

Quantum mechanics is usually thought of as a generalization of classical mechanics in which commutation relations are imposed on dynamical variables. This might suggest that one needs a full deterministic theory first, and should then apply to it a recipe called quantization to get a more fundamental theory. Such a procedure has great heuristic value and has been used in many concrete cases. However, there is no fundamental reason for such a way of reasoning. Why can quantum theory not be completely formulated without regard to an underlying classical picture, all the more since some observables seem to possess a genuine quantum character without classical counterparts? Therefore, it is much more natural to consider quantum systems as primary objects and try to derive classical properties, like superselection operators, pointer states, and, at the extreme, the emergence of classical dynamical systems, from quantum theory.

The origin of the deterministic laws that govern the classical domain of our everyday experience has attracted much attention in recent years. For example, the question in which asymptotic regime non-relativistic quantum mechanics reduces to its ancestor, i.e. Hamiltonian mechanics, was addressed in [19, 20]. It was shown there that for very many bosons with weak two-body interactions there is a class of states for which the time evolution of expectation values of certain operators in these
states is approximately described by a nonlinear Hartree equation. The problem of the circumstances under which such an equation reduces to the Newtonian mechanics of point particles was also discussed. A program of deriving irreversible transport equations for macroscopic quantum systems was also carried out. For example, in [17] the time evolution of a spinless quantum particle moving in a Gaussian random environment was discussed. It was shown there that in the weak coupling limit the Wigner distribution of a wave function converges globally in time to a solution of the linear Boltzmann equation. The connection between the reversible dynamics of classical macroscopic observables of infinite mean-field quantum systems and a Hamiltonian flow on a generalized phase space was described in [10, 44, 45]. As was shown in [27], the collective dynamical behavior of a system consisting of infinitely many two-level atoms leads to a flow on the classical phase space of the atoms, and this results in a periodic time dependence of the asymptotic states. Finally, the classical $\hbar \to 0$ limit for quantum mechanical correlation functions of systems with both finitely and infinitely many degrees of freedom was discussed in [26].

A different point of view was taken in a seminal paper by Gell-Mann and Hartle [21]. They gave a thorough analysis of the role of decoherence in the derivation of phenomenological classical equations of motion. Various forms of decoherence (weak, strong) and realistic mechanisms for the emergence of various degrees of classicality were also presented. Since quantum interferences are damped in the presence of an environment, one may hope that the classical $\hbar \to 0$ limit for quantum dissipative dynamics may exist for arbitrarily large times. Such a problem was discussed in [25].
In this work we adopt a different point of view and follow the idea of environmentally induced decoherence, whose potential impact on the behavior of quantum open systems was briefly described by Zeh: “All quasi-classical phenomena, even those representing reversible mechanics, are based on de facto irreversible decoherence” [48]. The main objective of the present paper is to provide an algebraic framework which will enable a general discussion of the environmentally induced decoherence and, as a consequence, the appearance of classical properties in quantum systems with both finite and infinite number of particles. It is worth noting that our approach is dynamical and so it constitutes an alternative way to the classical limit. A number of examples showing that classical concepts do not have to be presumed as an independent fundamental ingredient are also discussed.

2. Mathematical Description of Quantum Systems

In order to describe a quantum system we apply the idea of Segal [40], and Haag and Kastler [24] that all relevant information about the system is contained in a certain abstract noncommutative algebra $\mathcal{A}$. Thus, as a primary object of the mathematical formalism of quantum theory, we take the algebra generated by bounded
observables of the system equipped with a norm topology, a so called $C^*$-algebra. It is believed that such an algebra reflects intrinsic properties of the corresponding quantum system. In this view Hilbert spaces play only a secondary role and they appear as representation spaces of the algebra. In general, $C^*$-algebras admit an uncountable number of unitarily inequivalent representations, most of which presumably have no physical interpretation. Only quantum systems with a finite number of degrees of freedom, due to the Stone–von Neumann uniqueness theorem, possess one (up to unitary equivalence) irreducible representation, the so called Schrödinger representation. However, in order to enter the traditional framework of quantum theory, which postulates that with any physical system one can associate a definite Hilbert space with physical properties of the system expressed in terms involving only mathematical objects related to this Hilbert space, one has to select a subset of admissible states, which through the GNS construction would lead to physically meaningful structures. Let us discuss this point more precisely.

Suppose that $\phi$ is a faithful state on a $C^*$-algebra $\mathcal{A}$ of a quantum system and let $\pi_\phi : \mathcal{A} \to B(\mathcal{H})$ be the corresponding (faithful) GNS representation [11]. Let $\mathcal{M}$ be the von Neumann algebra generated by $\pi_\phi(\mathcal{A})$, that is $\mathcal{M} = \pi_\phi(\mathcal{A})''$, the bicommutant of $\pi_\phi(\mathcal{A})$. $\mathcal{M}$ is sometimes called the algebra of contextual operators [38]. We argue now that if $\mathcal{M}$ is to describe a system with pure quantum character it has to be a factor, i.e. an algebra with a trivial center. By the pure quantum character we mean the following property of the system: For any two distinct orthogonal pure states $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{H}$, their superpositions should be physically distinguishable from the corresponding statistical mixtures.
In other words, there should exist at least one Hermitian operator $A$ in $\mathcal{M}$ such that the following expectation values are different, $\langle \psi, A\psi \rangle \neq \operatorname{tr}(\rho A)$, where $\psi = z_1\psi_1 + z_2\psi_2$, $|z_1|^2 + |z_2|^2 = 1$, $z_1 z_2 \neq 0$, and $\rho = |z_1|^2 |\psi_1\rangle\langle\psi_1| + |z_2|^2 |\psi_2\rangle\langle\psi_2|$. However, if the center of $\mathcal{M}$ is nontrivial we may take a central projection $E$ (different from zero and the identity) and choose $|\psi_1\rangle \in E\mathcal{H}$ and $|\psi_2\rangle \in E^\perp\mathcal{H}$, where $E^\perp = 1 - E$. Then, for any $A = A^* \in \mathcal{M}$, the cross terms $\langle \psi_1, A\psi_2 \rangle$ vanish because $A$ commutes with $E$, so $\langle \psi, A\psi \rangle$ coincides with $\operatorname{tr}(\rho A)$, and the pure state $\psi$ and the statistical mixture $\rho$ would be physically indistinguishable. Therefore, in the following we assume that $\phi$ is a factor state. It is worth noting that a discussion concerning the coherent superposition of states and a complete classification of the coherence classes of states in factors were presented in [39].

Another constraint on the algebra $\mathcal{M}$ is expressed in the so called Dirac's requirement. It states that there should exist at least one complete set of mutually compatible observables. Expressing this in the algebraic language we say that the commutant of $\mathcal{M}$ is Abelian or, equivalently, that there exists a maximal (in $B(\mathcal{H})$) Abelian algebra $\mathcal{C}$ contained in $\mathcal{M}$. The following observation is clear.
Theorem 1. The postulate about pure quantum character of the system together with Dirac's requirement is true if and only if $\mathcal{M} = B(\mathcal{H})$, i.e. $\mathcal{M}$ is a type I factor.

It follows that Dirac's requirement is an additional condition, which specifies the type of factor representation of the algebra $\mathcal{A}$ and leads to the framework of standard quantum theory. Since we want to consider quantum systems in the thermodynamic limit, which are known to be represented by other types of factors, we drop Dirac's condition, keeping only the postulate about pure quantum character of the system. Physical observables are Hermitian operators from the algebra $\mathcal{M}$, or, more generally, self-adjoint operators affiliated to $\mathcal{M}$. Generalizing the notion of a density matrix representing a mixture of states, we say that statistical states of the system are represented by positive, normal and normalized functionals on $\mathcal{M}$. The set of statistical states we denote by $D$. Hence $\phi \in D$ iff $\phi(A) \geq 0$ whenever $A \geq 0$, $\phi(1) = 1$, where $1$ is the identity operator, and $\phi$ is continuous in the σ-weak topology on $\mathcal{M}$ (see, for example, [11] for the definition of these terms). The linear space generated by $D$ is called the predual space of $\mathcal{M}$ and denoted by $\mathcal{M}_*$. The connection between a Hermitian operator $A$ representing an observable and experimentally measured values of this observable, when the system is described by a statistical state $\phi$, is the following. Suppose $A = \int \lambda\, dE(\lambda)$ is the spectral decomposition of $A$. The probability that the measured value is in an interval $[a, b]$ is given by $\phi(E[a, b])$, and so the expectation value of $A$ in the state $\phi$ equals $\langle A \rangle = \int \lambda\, d\phi(E(\lambda))$. Let us observe that $d\phi(E(\lambda))$ is a probability measure on $\sigma(A)$, the spectrum of $A$.

Let us now consider the dynamics of a quantum system. If a system is closed (conservative), then the time development of any observable is given by a continuous symmetry transformation, i.e.
$A \to A(t) = \alpha_t(A)$, where $\alpha_t$ is a σ-weakly continuous one parameter group of $*$-automorphisms of $\mathcal{M}$. If there exists an energy observable $H$ for the system, then the automorphisms $\alpha_t$ are inner, given by $\alpha_t(A) = e^{\frac{i}{\hbar}tH} A e^{-\frac{i}{\hbar}tH}$. However, if a system interacts with an environment, then its evolution becomes irreversible. In fact, although the whole system evolves unitarily according to the total Hamiltonian $H = H_S + H_E + H_I$, where the three parts represent respectively the system, environment and interaction Hamiltonians, the evolution of a system observable $A$ is given by

$$T_t(A) = P_E\big(e^{\frac{i}{\hbar}tH}(A \otimes 1_E)e^{-\frac{i}{\hbar}tH}\big), \tag{1}$$

where $1_E$ is the identity operator in the Hilbert space of the environment, and $P_E$ denotes the conditional expectation onto the algebra $\mathcal{M}$ with respect to a reference state $\phi_E$ of the environment. Equivalently, we may define $T_t$ as the adjoint map to the operator $T_t^* : \mathcal{M}_* \to \mathcal{M}_*$ given by

$$T_t^*(\phi) = \operatorname{Tr}_E\big(e^{-\frac{i}{\hbar}tH}(\phi \otimes \phi_E)e^{\frac{i}{\hbar}tH}\big), \tag{2}$$

where $\operatorname{Tr}_E$ denotes the partial trace with respect to the environmental variables. $T_t$, being the composition of $*$-automorphisms and a conditional expectation, is a
family of maps (superoperators) which in general satisfies a complicated integro-differential equation describing an irreversible dynamics. For this reason we consider only the forward evolution, i.e. assume that $t \geq 0$. Nevertheless, some important properties of $T_t$ may be explicitly derived.

(a) For any observable $A \in \mathcal{M}$, the function $t \to T_t(A)$ is σ-weakly continuous.
(b) For all $t \geq 0$, the superoperators $T_t$ are completely positive, normal and unital. Moreover, $T_t$ are contractive in the operator norm, i.e. $\|T_t A\|_\infty \leq \|A\|_\infty$.

In the case when $T_t \circ T_s = T_{t+s}$, i.e. when the memory effect can be neglected, the family $\{T_t\}$ is called a quantum Markov semigroup. A general discussion of the limiting procedures, like the weak coupling limit and the singular coupling limit, which lead to the Markovian approximation, can be found in [1]. Let us point out, however, that many physical models possess an additional property, namely that there exists a faithful and normal state preserved by the evolution, a so called equilibrium state. Generalizing this concept we assume that:

(c) There is a faithful, normal and semifinite weight $\omega_0$ on $\mathcal{M}$ such that $\omega_0 \circ T_t = \omega_0$ for all $t \geq 0$.

Roughly speaking, the passage from a state to a weight in the noncommutative framework corresponds to the replacement of a compact space with a probability measure by a locally compact space with a σ-finite measure. For a broad discussion of weights see, for example, [41].

Summing up this section: An open system with pure quantum character is described by a factor $\mathcal{M}$ acting on a separable Hilbert space of the system. The evolution of observables of the system is given by a family of superoperators $\{T_t\}_{t \geq 0}$ on $\mathcal{M}$ which satisfy conditions (a)–(c). Having described the framework for quantum systems we now turn to the Hilbert space description of classical dynamical systems.

3. Koopman’s Formalism for Classical Systems

Everybody agrees that concepts of classical and quantum physics are opposite in many aspects.
Therefore, in order to demonstrate how quanta become classical, it is necessary to express them in one mathematical framework. Since, as was shown in the previous section, a natural language for quantum systems is that of von Neumann algebras, we now reformulate the concept of a classical dynamical system in a similar way. The idea of using the same algebraic formalism for the description of both quantum and classical mechanics was suggested in [2]. The use of the Koopman formalism together with the reverse procedure is essential for a rigorous analysis of the decoherence induced classical dynamical systems, see Definition 9 and Example 7. Suppose that $M$ is a configuration space of a classical system. We assume that $M$ is a locally compact metric space. A continuous evolution of the system is given by a (continuous) flow on $M$, i.e. a continuous mapping $g : \mathbb{R} \times M \to M$ such that $g_t : M \to M$ is a homeomorphism for all $t \in \mathbb{R}$, and $t \to g_t$ is a group
homomorphism. The map $t \to g_t(x)$ is called a trajectory of a point $x \in M$. From the very definition, all trajectories are continuous. We assume also that there exists a σ-finite Borel measure $\mu_0$ on $M$, finite on compact sets, and such that $\mu_0(g_t^{-1}(B)) = \mu_0(B)$ for all $t \in \mathbb{R}$ and all $\mu_0$-finite Borel subsets $B \subset M$. In addition, we assume that $\int f\, d\mu_0 > 0$ whenever $f \geq 0$ and $f \neq 0$. The triple $(M, g_t, \mu_0)$ is called a (classical) topological dynamical system. The following result is clear.

Proposition 2. Suppose that $g_t$ is a flow on $M$. Then $\gamma_t : C_0(M) \to C_0(M)$, $\gamma_t(f)(x) = f(g_t x)$, is a strongly continuous one parameter group of $*$-automorphisms of $C_0(M)$, where $C_0(M)$ is the $C^*$-algebra of continuous functions on $M$ vanishing at infinity.

It follows that a dynamical system may be equivalently described by the triple $(C_0(M), \gamma_t, \phi_0)$, where $\phi_0$ is a $\gamma_t$-invariant weight on $C_0(M)$ determined by the measure $\mu_0$. If $M$ is compact and therefore $\mu_0$ is finite, we always assume that $\mu_0$ is a probability measure, which implies that $\phi_0$ is a state on $C(M)$. So far we have gone half of the way. What we really need is a Hilbert space representation of the system. Suppose $\mathcal{H} = L^2(M, \mu_0)$. There is a natural representation of the algebra $C_0(M)$ in $\mathcal{H}$ given by $\pi(f)\psi(x) = f(x)\psi(x)$. Let us define $\mathcal{A} = \pi(C_0(M))''$. Then $\mathcal{A}$ is the von Neumann algebra $L^\infty(M, \mu_0)$ of essentially bounded functions on $M$, acting in the Hilbert space $\mathcal{H}$. Moreover, $\gamma_t$ extends uniquely to a σ-weakly continuous group of $*$-automorphisms of $\mathcal{A}$, and $\mu_0$ determines a $\gamma_t$-invariant, faithful, normal and semifinite weight $\phi_0$ on $\mathcal{A}$. We call the triple $(\mathcal{A}, \gamma_t, \phi_0)$ a Hilbert space representation of the dynamical system $(M, g_t, \mu_0)$.

Let us now discuss the reverse procedure. Suppose we start with a triple $(\mathcal{A}, \gamma_t, \phi_0)$, where $\mathcal{A}$ is a commutative von Neumann algebra, $\gamma_t$ is a σ-weakly continuous group of $*$-automorphisms of $\mathcal{A}$, and $\phi_0$ is a $\gamma_t$-invariant, faithful, normal and semifinite weight on $\mathcal{A}$.
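For orientation, a commutative triple of this kind can be sketched numerically. The block below is only an illustration under simplifying assumptions that are not part of the text: the circle is replaced by N equally spaced grid points, the flow is sampled at rotation angles that are multiples of 2π/N (so that each g_k is a cyclic shift), and the invariant measure is the normalized counting measure. The names `N`, `koopman` and `phi0` are hypothetical.

```python
import numpy as np

N = 64  # hypothetical number of grid points on the circle

def koopman(k, f):
    """Koopman map (gamma_k f)(x_j) = f(g_k x_j), with g_k x_j = x_{(j+k) mod N}."""
    return np.roll(f, -k)

def phi0(f):
    """Invariant state: integration against the normalized counting measure."""
    return f.mean()

x = 2 * np.pi * np.arange(N) / N
f = np.cos(x) + 0.3 * np.sin(3 * x)
g = np.exp(np.sin(x))

# Group property: gamma_5 gamma_7 = gamma_12
group_ok = np.allclose(koopman(5, koopman(7, f)), koopman(12, f))
# Multiplicativity: gamma_k(f g) = gamma_k(f) gamma_k(g)
mult_ok = np.allclose(koopman(5, f * g), koopman(5, f) * koopman(5, g))
# Invariance of the state: phi0(gamma_k f) = phi0(f)
inv_ok = np.isclose(phi0(koopman(5, f)), phi0(f))
# Unitarity on L^2: the L^2 norm is preserved
unit_ok = np.isclose(phi0(koopman(5, f) ** 2), phi0(f ** 2))

print(group_ok, mult_ok, inv_ok, unit_ok)
```

In this sketch the underlying space is put in by hand. The discussion that follows addresses the opposite question of recovering a space from the algebra, the automorphism group and the invariant weight alone.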
The problem of how to determine the underlying topological space is not a trivial one, since there are essentially two non-isomorphic examples of $\mathcal{A}$, namely the algebra of bounded sequences over a discrete set and the algebra of essentially bounded functions on the unit interval with respect to the Lebesgue measure. So if $L^\infty([0, 1], dx)$, $L^\infty(S^3, \mu_0)$, where $S^3$ is a three-dimensional sphere and $\mu_0$ is a normalized rotationally invariant measure, and $L^\infty(\mathbb{R}^n, dx^n)$ are all isomorphic, how can we choose an appropriate space? To answer this question we propose the following reduction procedure. Let us start with an arbitrary representation of $\mathcal{A}$, say $\mathcal{A} = L^\infty(\Omega, \mu)$, $\Omega$ being a locally compact space, arising in a particular model in a natural way. It is obvious that the $C^*$-algebra $C_b(\Omega)$ of continuous and bounded functions on $\Omega$ is contained in $\mathcal{A}$. Let $\mathcal{A}_0$ be the maximal $C^*$-subalgebra in $C_b(\Omega)$ such that $\gamma_t : \mathcal{A}_0 \to \mathcal{A}_0$ and $\gamma_t$ is strongly continuous on $\mathcal{A}_0$. Let us comment on this point. Let $X = (\mathcal{A}_i)_{i \in I}$ be the family of all unital $C^*$-subalgebras $\mathcal{A}_i \subset C_b(\Omega)$ such that $\gamma_t : \mathcal{A}_i \to \mathcal{A}_i$, and $\gamma_t$ is strongly continuous on it. Then $(X, \subset)$ is a non-empty ordered set. Suppose that $(\mathcal{A}_j)_{j \in J}$ is a linearly ordered set (chain) of elements from $X$. Then $\mathcal{A}_J = \overline{\bigcup_{j \in J} \mathcal{A}_j} \in X$,
where the closure is taken in the sup-norm of $C_b(\Omega)$. In fact, $\mathcal{A}_J$ is a $C^*$-subalgebra of $C_b(\Omega)$. It is also clear that $\gamma_t : \bigcup_{j \in J} \mathcal{A}_j \to \bigcup_{j \in J} \mathcal{A}_j$ and is strongly continuous on it. Because $\gamma_t$ is contractive in the sup-norm, $\gamma_t : \mathcal{A}_J \to \mathcal{A}_J$. The strong continuity on $\mathcal{A}_J$ follows from the standard $\varepsilon/3$ argument. Moreover, for any $j \in J$, $\mathcal{A}_j \subset \mathcal{A}_J$. Hence, by the Kuratowski–Zorn lemma, there exists a maximal element in $X$. By the Gelfand construction [11], $\mathcal{A}_0$ is isomorphic with $C(M)$, where $M$ is a compact Hausdorff space, the spectrum of $\mathcal{A}_0$. In order to avoid pathological situations we assume that the topology on $M$ is metrizable, i.e. given by a metric on $M$. This property would be ensured if we additionally assumed that the spectra of all $\mathcal{A}_i \in X$ are metrizable. Because $\mathcal{A}_J$ is the direct limit of unital commutative $C^*$-algebras, its spectrum is the inverse limit of the spectra of $\mathcal{A}_j$, $j \in J$, and hence would be metrizable. Thus $M$ would also be metrizable. If $\phi_0$ is a state, then we choose $M$ as the space of the system. Next we define a probability Borel measure $\mu_0$ on $M$ by the formula

$$\int_M \hat{f}(x)\, d\mu_0(x) = \phi_0(f),$$
where $\hat{f} \in C(M)$ is associated with $f \in \mathcal{A}_0$ by the Gelfand isomorphism. The corresponding group of automorphisms of $C(M)$ we denote by $\hat{\gamma}_t$. It is worth pointing out that $\hat{\gamma}_t$ is implemented by a strongly continuous group of unitary operators defined on the Hilbert space $L^2(M, \mu_0)$. Let us recall that $M$ is the space of multiplicative states (characters) on $\mathcal{A}_0$ with $x(f) = \hat{f}(x)$. Therefore, for any $x \in M$ and $t \in \mathbb{R}$ we may define a new point $g_t x \in M$ by the formula $(g_t x)(f) := x(\gamma_t f)$, $f \in \mathcal{A}_0$. Hence, the group $\hat{\gamma}_t$ is induced by a flow, i.e. $\hat{\gamma}_t \hat{f}(x) = \hat{f}(g_t x)$. We show now that $g_t$ is continuous. Suppose it is not. Then there is a sequence $x_n \to x$ such that $g_t(x_n)$ does not converge to $g_t x$. It means that there exist $\varepsilon > 0$ and a subsequence $\{x_{n_m}\}$ such that $d(g_t(x_{n_m}), g_t x) > \varepsilon$, where $d$ is the metric on $M$. Let $\hat{f}_0$ be a continuous function such that $\hat{f}_0(g_t x) = 1$ and $\operatorname{supp} \hat{f}_0 \subset K(g_t x, \varepsilon)$, the ball of radius $\varepsilon$ centered at $g_t x$. Because $\hat{\gamma}_t \hat{f}_0$ is also continuous, $\hat{\gamma}_t \hat{f}_0(x_{n_m}) \to 1$. However, $\hat{\gamma}_t \hat{f}_0(x_{n_m}) = 0$ for all natural $m$, a contradiction. Hence $g_t$ is continuous. Since $(g_t)^{-1} = g_{-t}$, it is a homeomorphism of $M$. By the strong continuity of $\hat{\gamma}_t$ we conclude that the flow $g$ is continuous. Because, by definition, the measure $\mu_0$ is $g_t$-invariant, we have obtained in this way a topological dynamical system $(M, g_t, \mu_0)$.

Suppose now that $\phi_0$ is a weight. Let $\mathcal{A}_{00}$ be the $C^*$-algebra generated by the following set

$$C = \{f \in \mathcal{A}_0 : f \geq 0 \text{ and } \phi_0(f) < \infty\}.$$

It is clear that $\gamma_t : C \to C$ and so $\gamma_t : \mathcal{A}_{00} \to \mathcal{A}_{00}$. Because $\mathcal{A}_{00}$ does not possess the identity, it is isomorphic with $C_0(M)$, where $M$ is a locally compact Hausdorff space. Assuming again that $M$ is metrizable (if the spectrum of $\mathcal{A}_0$ is metrizable, then this property is automatically satisfied; because the spectrum of $\mathcal{A}_{00} + \mathbb{C} \cdot 1$,
being the image of a continuous and closed mapping of the spectrum of $\mathcal{A}_0$ [16], is metrizable, so is the spectrum of $\mathcal{A}_{00}$) we obtain, by using similar arguments as in the previous case, a topological dynamical system $(M, g_t, \mu_0)$, with $M$ being locally compact and $\mu_0$ being σ-finite.

Summing up: It is the dynamics and the invariant state or weight which determine the underlying space for an abstract commutative dynamical system. It is worth noting, however, that such a reduction procedure is a “minimal” one since we aimed at getting topological dynamical systems. In some cases it may be convenient to impose additional conditions on $\mathcal{A}_0$ or $\mathcal{A}_{00}$. For example, to obtain a smooth dynamical system one has to require that the group $\gamma_t$ preserves a subspace of smooth functions.

4. Decoherence in Action

In recent years decoherence has been widely discussed and accepted as the mechanism responsible for the appearance of classicality in quantum measurements and the absence, in the real world, of Schrödinger-cat-like states [6, 22, 28, 47, 49, 50]. The basic idea behind it is that classicality is an emergent property induced in quantum systems by unavoidable and practically irreversible interaction with their environment. It is marked by the dynamical suppression of quantum interferences and so the transformation of the vast majority of pure states of the system to statistical mixtures. It should be pointed out, however, that classicality in quantum systems may also be introduced in another way. For example, it was shown in [3] that broken symmetries in infinite systems give rise to classical observables based on a system of imprimitivity. A different approach to the description of classical states and the classical observables associated with them, based on the algebraic theory of superselection sectors, was proposed in [30].
It was also shown there that nonautomorphic time evolution leads to the transition between different folia (equivalence classes of pure states) in the way required to find a mixed state after the measurement. A loss of phase coherence as the consequence of the coupling with an environment has been established both in the Markovian regime [43, 46] and for a system with a non-Markovian evolution (decoherence through emission of Bremsstrahlung) [13]. This idea has also been experimentally verified. For example, Brune et al. [14] created a mesoscopic superposition of quantum states involving radiation fields with classically distinct phases and observed its progressive decoherence to a statistical mixture through two-atom correlation measurements. Moreover, Schrödinger-cat-like states were also created in an ion trap experiment using a single beryllium ion and a combination of static and oscillating electric fields, and their decoherence was observed [33].

In spite of the progress in the theoretical and experimental understanding of decoherence, the models studied so far do not answer the question concerning its nature satisfactorily. Dynamical diagonalization of pure states with respect to a preferred basis explains essentially the measurement results but it is only an
example of possible scenarios. Other possibilities include: Environmentally induced superselection rules of discrete and continuous types, and completely classical behavior of the quantum system. Let us now discuss this issue in a detailed way. Because the evolution introduced in Sec. 2 is so general that it embraces also a unitary evolution, we first distinguish the case of a nontrivial coupling between a system and its environment.

Definition 3A. Environmentally induced decoherence is said to take place in the system, if there exists at least one projection $P \in \mathcal{M}$ such that $T_t(P)$ is not a projection for some instant $t > 0$.

The above definition excludes only automorphic evolutions. For the discussion on emergence of classical properties we find it more useful to strengthen it in the following way.

Definition 3B. We say that environmentally induced decoherence takes place in the system, if there are two Banach $*$-invariant subspaces $\mathcal{M}_1$ and $\mathcal{M}_2$ in $\mathcal{M}$ such that:

(i) $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ with $\mathcal{M}_2 \neq 0$. Moreover, both $\mathcal{M}_1$ and $\mathcal{M}_2$ are $T_t$-invariant.
(ii) $\mathcal{M}_1$ represents a decoherence free part of the system. It is a von Neumann algebra (the image of a conditional expectation of $\mathcal{M}$) generated by all projections $P$ in $\mathcal{M}$ such that $T_t(P)$ remains a projection for all $t > 0$. We additionally assume that for any projection $P \in \mathcal{M}_1$ and any $t > 0$ there exists a projection $Q \in \mathcal{M}_1$ such that $T_t(Q) = P$.
(iii) $\mathcal{M}_2$ represents those observables of the system which, after some time, are not detectable by measurements, i.e. all their expectation values vanish with time. More precisely,

$$\lim_{t \to \infty} \phi(T_t B) = 0 \tag{3}$$
for all $\phi \in D$ and any $B = B^* \in \mathcal{M}_2$.

If the process of decoherence is efficient, and usually it is, then Hermitian operators from $\mathcal{M}_1$ are those which can be detected in practice. It should be noted that by property (ii) of Definition 3B, $\mathcal{M}_1$ is determined in a unique way. Moreover, the evolution restricted to this subalgebra has a nice automorphic property.

Theorem 4. For any $t \geq 0$, $T_t|_{\mathcal{M}_1}$ is a $*$-automorphism.

Proof. See the Appendix.

The above properties justify the following name.

Definition 5. $\mathcal{M}_1$ is called the algebra of effective observables.

Using the decomposition from Definition 3B we now discuss the dynamical appearance of classical properties in the quantum system.
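Definitions 3A and 3B and Theorem 4 can be illustrated on the simplest possible case. The sketch below assumes a single qubit with pure dephasing in a fixed basis, a standard toy model chosen here for illustration and not taken from the text; the rate `GAMMA` and the helper names are hypothetical. In this model the decoherence free part consists of the diagonal matrices and the decaying part of the off-diagonal ones.

```python
import numpy as np

GAMMA = 0.5  # hypothetical dephasing rate

def T(t, X, gamma=GAMMA):
    """Heisenberg-picture pure dephasing: keep the diagonal, damp the off-diagonal."""
    X = np.asarray(X, dtype=complex)
    diag = np.diag(np.diag(X))
    return diag + np.exp(-gamma * t) * (X - diag)

def is_projection(P, tol=1e-10):
    """True if P is self-adjoint and idempotent up to the tolerance."""
    return bool(np.allclose(P, P.conj().T, atol=tol)
                and np.allclose(P @ P, P, atol=tol))

P_diag = np.array([[1, 0], [0, 0]], dtype=complex)        # diagonal projection
P_plus = 0.5 * np.array([[1, 1], [1, 1]], dtype=complex)  # projection onto (|0> + |1>)/sqrt(2)

# A diagonal projection stays a projection; P_plus does not (Definition 3A).
print(is_projection(T(1.0, P_diag)), is_projection(T(1.0, P_plus)))

# An off-diagonal observable has vanishing expectation values, cf. Eq. (3).
B = np.array([[0, 1], [1, 0]], dtype=complex)
rho = P_plus  # density matrix of the state (|0> + |1>)/sqrt(2)
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, np.trace(rho @ T(t, B)).real)
```

Restricted to diagonal matrices, `T` is the identity map, which is trivially a $*$-automorphism, in line with Theorem 4. In this particular model `T` also satisfies the semigroup property, which the general framework does not require.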
Definition 6. If $\mathcal{M}_1$ is noncommutative with $Z(\mathcal{M}_1) \neq \mathbb{C} \cdot 1$, where $Z(\mathcal{M}_1)$ denotes the center of $\mathcal{M}_1$, and $T_t \circ T_s(A) = T_{t+s}(A)$ for all $A \in \mathcal{M}_1$, then we speak of environmentally induced superselection rules in the system.

In such a case we may define $T_{-t} := (T_t)^{-1}$ and obtain in this way a one parameter group of $*$-automorphisms on the algebra $\mathcal{M}_1$. Hence the system dynamically loses its pure quantum character and behaves like a conservative one, however, with nontrivial superselection operators. If there are minimal projections in $Z(\mathcal{M}_1)$, which are not minimal in $\mathcal{M}_1$, then the superselection rules are of discrete type. In such a case $\mathcal{M}_1 = \sum_i P_i \mathcal{M} P_i$ and the evolution preserves each superselection sector. If $Z(\mathcal{M}_1)$ does not possess any minimal projections the induced superselection rules are continuous.

Definition 7. We say that environment induces a classical structure in the system, if $\mathcal{M}_1$ is a commutative algebra greater than $\mathbb{C} \cdot 1$. If $\mathcal{M}_1 = \mathbb{C} \cdot 1$, then we say that the system is ergodic.

Ergodic systems possess the property of return to equilibrium in the following sense [32]. The decomposition of any observable $A \in \mathcal{M}$ is now given by $A = \phi_0(A) 1 + A_2$, where $\phi_0 \in D$ and $A_2 \in \mathcal{M}_2$. Hence, for any $\phi \in D$,

$$(T_t^* \phi)(A) = \phi(T_t A) = \phi(\phi_0(A) 1 + T_t(A_2)) \to \phi_0(A), \tag{4}$$

when $t \to \infty$.

Definition 8. Suppose that environment induces a classical structure. The classical structure is said to be discrete, if $\mathcal{M}_1$ contains minimal projections $P_1, P_2, \ldots$.

Since minimal projections (there are always countably many of them) are necessarily orthogonal and they sum up to the identity operator, it follows that any observable $A \in \mathcal{M}_1$ may be written as $A = \sum_i a_i P_i$, where $a_i \in \mathbb{R}$. Because the evolution restricted to $\mathcal{M}_1$ is trivial (since $T_t(P_i) = P_i$ for all $t \geq 0$ and all indices $i$), this case corresponds to a dynamical selection of the so called privileged basis. Let us emphasize, however, that in general $P_i$ may not be one-dimensional, and so they represent generalized rays.

Definition 9. Suppose that environment induces a classical structure. The classical structure is said to represent a classical dynamical system, if $T_t|_{\mathcal{M}_1}$ is a semigroup and $(\mathcal{M}_1, T_t, \omega_0)$ is isomorphic with $(L^\infty(M), \hat{T}_t, \mu_0)$, where $M$ is a locally compact space, $\hat{T}_t$ is a one parameter group of $*$-automorphisms on $L^\infty(M)$ induced by a continuous flow $g_t$ on $M$, and $\mu_0$ is a $\hat{T}_t$-invariant σ-finite Borel measure on $M$.

Let us notice that, due to the semigroup property, the restriction of $T_t$ to $\mathcal{M}_1$ extends to negative times by the formula $T_{-t} := (T_t)^{-1}$ (the existence of $(T_t)^{-1}$ on $\mathcal{M}_1$ is guaranteed by Theorem 4). The procedure of retrieving the space $M$ from an abstract commutative von Neumann algebra $\mathcal{M}_1$ was discussed in Sec. 3. By the isomorphism of two dynamical systems we mean a map $\lambda : \mathcal{M}_1 \to L^\infty(M)$
which is a $*$-isomorphism intertwining $T_t$ and $\hat{T}_t$, i.e. $\hat{T}_t \circ \lambda = \lambda \circ T_t$, and such that $\mu_0(\Lambda) = \omega_0(\lambda^{-1} \chi_\Lambda)$, where $\chi_\Lambda$ is the characteristic function of a Borel subset $\Lambda \subset M$. It follows that the above definition describes a process of dynamical de-quantization of a quantum system,

$$(\mathcal{M}, \{T_t\}_{t \geq 0}, \phi_0) \to (\mathcal{M}_1 = \mathcal{A}, \{T_t\}_{t \in \mathbb{R}}, \phi_0) \to (\mathcal{A}_0, \{T_t\}_{t \in \mathbb{R}}, \phi_0) \to (C(M), \{\hat{T}_t\}_{t \in \mathbb{R}}, \hat{\phi}_0) \to (M, g_t, \mu_0),$$

if $\phi_0$ is a state. The first arrow represents the process of decoherence, the second the reduction procedure, the third corresponds to the Gelfand isomorphism, and the last one represents the passage from a statistical description to an individual one expressed in terms of trajectories. A similar scheme holds true also if $\phi_0$ is a weight (we just replace $\mathcal{A}_0$ by $\mathcal{A}_{00}$ and $C(M)$ by $C_0(M)$). In the next section we present a number of examples showing how these definitions work in practice.

5. Examples

We start with the following theorem for quantum systems which additionally satisfy Dirac's requirement.

Theorem 10. Suppose $\mathcal{M}$ is a type I factor, i.e. $\mathcal{M} = B(\mathcal{H}_S)$, where $\mathcal{H}_S$ is a separable (finite or infinite dimensional) Hilbert space. Let the evolution of the system be given by a family of maps $\{T_t\}_{t \geq 0}$ which fulfils the conditions (a)–(c) from Sec. 2 with $\omega_0 = \operatorname{Tr}$, the standard trace. If $\{T_t\}$ satisfies the semigroup property $T_t \circ T_s = T_{t+s}$, and if there exists a faithful density matrix $\rho_0$ subinvariant with respect to $T_t$, i.e. $\operatorname{Tr} \rho_0 T_t(A) \leq \operatorname{Tr} \rho_0 A$ for all $A \geq 0$, then the decomposition $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ from Definition 3B always exists. Moreover, the effective part of any observable $A \in \mathcal{M}$ is given by a Tr-compatible conditional expectation from $\mathcal{M}$ onto $\mathcal{M}_1$, the automorphic evolution of the algebra $\mathcal{M}_1$ is a Hamiltonian one, and the limit in equation (3) is uniform on bounded sets of $\mathcal{M}_2$.

Remark. If $\dim \mathcal{H}_S = n$, then $\rho_0 = \frac{1}{n} 1$ is obviously a $T_t$-invariant faithful density matrix, so the last assumption of the theorem may be omitted.

Proof. See the Appendix.

Example 1. The Araki–Zurek model: Superselection rules. We follow a mathematical description of the model [4, 49] as given by Kupsch [29]. Suppose the total Hamiltonian

$$H = H_S \otimes 1_E + 1_S \otimes H_E + A \otimes B, \tag{5}$$

defined on a Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_E$, satisfies the following assumptions:
• $[H_E, B] = 0$,
• $B = B^*$ has an absolutely continuous spectrum,
PN • A = n=1 λn Pn , λn ∈ R, and Pn are mutually orthogonal projections summing up to the identity operator, PN L • HS = n=1 Pn HS Pn , i.e. HS = Hn , where Hn is a self-adjoint operator in the Hilbert space Pn HS , • ωE is an arbitrary statistical state of the environment represented by a density matrix ρ. Because all three terms in Eq. (5) commute (we say that two self-adjoint operators commute when their spectral measures commute) so eitH = eitHS ⊗1E eit1S ⊗HE eitHI . In order to simplify notation we have put ~ = 1. The Hamiltonian (5) is just the generator of the above one parameter group of unitary operators. Let P E be the conditional expectation from B(HS ) ⊗ B(HE ) onto B(HS ) with respect to the state ωE . Then, for any X ∈ B(HS ), Tt (X) = PE [eitH X ⊗ 1E e−itH ] = eitHS PE [eitHI X ⊗ 1E e−itHI ]e−itHS . Because eitHI =
N X
n=1
so Tt (X) =
N X
Pn ⊗ eitλn B
χn,m (t)eitHS Pn XPm e−itHS ,
(6)
n,m=1
where χn,m (t) =
Z
eit(λn −λm )s d Tr(ρE(s)) ,
and dE(s) is the spectral measure of B. Since this measure is absolutely continuous so d Tr(ρE(s)) is a probability measure absolutely continuous with respect to the Lebesgue measure. Hence, by the Riemann–Lebesgue lemma, χn,m ∈ C0 (R). Because there are finitely many λn so minn6=m |λn − λm | = δ > 0. It implies that for any > 0 there exists t0 > 0 such that |χn,m (t)| < for all n 6= m and all t > t0 . It is clear now that M1 = Pˆ (B(HS )) :=
N X
Pn B(HS )Pn .
n=1
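The decay of $\chi_{n,m}(t)$ can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the measure $d\,\mathrm{Tr}(\rho E(s))$ is the uniform distribution on $[0, 1]$ (the text only requires absolute continuity); the function names `chi` and `decoherence_time` are hypothetical. Under this assumption $|\chi(t)| \le 2/(|\Delta| t)$ for a gap $\Delta = \lambda_n - \lambda_m$, so an explicit $t_0$ can be read off.

```python
import cmath

def chi(delta, t):
    """chi(t) = integral_0^1 exp(i*t*delta*s) ds for the ASSUMED uniform density."""
    x = t * delta
    if x == 0:
        return 1.0 + 0.0j
    return (cmath.exp(1j * x) - 1.0) / (1j * x)

def decoherence_time(eps, delta_min, n_sectors):
    """A time t0 such that |chi(t)| < eps / (N (N - 1)) for all t > t0 and all
    gaps |lambda_n - lambda_m| >= delta_min, using |chi(t)| <= 2 / (delta * t)."""
    return 2.0 * n_sectors * (n_sectors - 1) / (eps * delta_min)

if __name__ == "__main__":
    t0 = decoherence_time(eps=0.01, delta_min=0.5, n_sectors=3)
    print(t0, abs(chi(0.5, 2 * t0)))
```

For a different absolutely continuous density the rate of decay changes, but the existence of a finite $t_0$ is what the Riemann–Lebesgue argument guarantees.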
It means that $\mathcal{M}_1$ describes what is called a quantum system with superselection rules. In each superselection sector $P_n \mathcal{H}_S$ the evolution is given by the Hamiltonian $H_n$. Finally, we show that all expectation values of observables from $\mathcal{M}_2$ decrease to zero uniformly on bounded sets. Since in general $\{T_t\}$ is not a semigroup, we
cannot apply Theorem 10. However, this may be done explicitly. Suppose $B \in \mathcal{M}_2$, i.e. $B = (\mathrm{id} - \hat{P})B$, and $\|B\|_\infty \le 1$. Then for any $\rho_S \in \mathcal{D}$,
\[ |\mathrm{Tr}\,\rho_S T_t(B)| \le \|T_{t*}\rho_S - \hat{P}(T_{t*}\rho_S)\|_1 \, \|B\|_\infty \le \Big\| \sum_{n \ne m} P_n \rho_S P_m \chi_{n,m}(t) \Big\|_1 , \]
where $\|\cdot\|_1$ is the trace norm. Suppose that $\epsilon > 0$ is given. Let us take $t_0 > 0$ such that for all $t > t_0$, $|\chi_{n,m}(t)| < \frac{\epsilon}{N(N-1)}$. Then
\[ \Big\| \sum_{n \ne m} P_n \rho_S P_m \chi_{n,m}(t) \Big\|_1 \le \sum_{n<m} \| P_n \rho_S P_m \chi_{n,m}(t) + P_m \rho_S P_n \chi_{m,n}(t) \|_1 = \sum_{n<m} |\chi_{n,m}(t)| \cdot \| P_n \rho_S P_m + P_m \rho_S P_n \|_1 . \]
Because $\| P_n \rho_S P_m + P_m \rho_S P_n \|_1 \le \|\rho_S\|_1 + \|\hat{P}(\rho_S)\|_1 = 2$, we get
\[ \Big\| \sum_{n \ne m} P_n \rho_S P_m \chi_{n,m}(t) \Big\|_1 < 2 \cdot \frac{\epsilon}{N(N-1)} \cdot \frac{N(N-1)}{2} = \epsilon . \]
This implies that
\[ \lim_{t\to\infty} \|T_{t*}\rho_S - \hat{P}(T_{t*}\rho_S)\|_1 = 0 \]
and hence
\[ \lim_{t\to\infty} |\mathrm{Tr}\,\rho_S T_t(B)| = 0 , \]
uniformly in $B$ provided it belongs to the unit ball in $\mathcal{M}_2$.

Example 2. The Araki–Zurek model: Privileged basis.

A general discussion of the selection of mutually exclusive quantum states, the so-called pointer states, is given for example in [36, 50]. We keep the form of the total Hamiltonian from Eq. (5) but we further specify its ingredients. Suppose that $\mathcal{H}_E = L^2(\mathbb{R}, da)$ and
• $[H_E, B] = 0$,
• $B = \hat{p}$, the momentum operator on $\mathcal{H}_E$,
• $A = \sum_{n=1}^{\infty} \lambda_n P_n$, $\lambda_n \in \mathbb{R}$, $\lambda_n \ne \lambda_m$ if $n \ne m$, and $P_n$ are mutually orthogonal one-dimensional projections summing up to the identity operator,
• $[H_S, A] = 0$, i.e. $H_S = \sum_n \gamma_n P_n$, $\gamma_n \in \mathbb{R}$,
• $\omega_E = |\psi_E\rangle\langle\psi_E|$, where $\psi_E(a) = \frac{1}{\sqrt{2\pi}} \int \frac{e^{iap}}{\sqrt{\pi(1+p^2)}} \, dp$.
Then the evolution of an observable $X \in B(\mathcal{H}_S)$ is given by
\[ T_t(X) = \sum_{n,m=1}^{\infty} \chi_{n,m}(t)\, e^{it(\gamma_n - \gamma_m)} P_n X P_m , \tag{7} \]
where
\[ \chi_{n,m}(t) = \frac{1}{\pi} \int e^{it(\lambda_n - \lambda_m)p} \, \frac{dp}{1 + p^2} = e^{-|\lambda_n - \lambda_m| t} . \]
Because now $\chi_{n,m}(t)\chi_{n,m}(s) = \chi_{n,m}(t+s)$, $T_t$ is a quantum Markov semigroup. It is also clear that it preserves Tr. Let $\{b_n\}$ be a sequence of positive numbers which sum up to 1. Then $\rho_0 = \sum_n b_n P_n$ is a faithful $T_t$-invariant density matrix. Hence all assumptions of Theorem 10 are satisfied and we may conclude that the decomposition from Definition 3B holds true with
\[ \mathcal{M}_1 = \sum_{n=1}^{\infty} P_n B(\mathcal{H}_S) P_n \equiv l^\infty(\mathbb{N}) . \]
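The closed form $\chi_{n,m}(t) = e^{-|\lambda_n - \lambda_m| t}$ and the resulting semigroup property can be checked numerically. The sketch below is only an illustration: it approximates the integral by a trapezoid rule on a truncated interval (the cutoff and step size are arbitrary choices, and the names `chi_numeric` and `chi_exact` are hypothetical), so agreement is expected only up to a small truncation error.

```python
import math

def chi_numeric(delta, t, cutoff=200.0, step=0.01):
    """Trapezoid approximation of (1/pi) * integral exp(i t delta p) / (1 + p^2) dp
    over [-cutoff, cutoff]. The imaginary part vanishes by symmetry, so only
    the cosine part is integrated."""
    n = int(round(2 * cutoff / step))
    total = 0.0
    for k in range(n + 1):
        p = -cutoff + k * step
        w = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += w * math.cos(t * delta * p) / (1.0 + p * p)
    return total * step / math.pi

def chi_exact(delta, t):
    """Closed form stated in Example 2."""
    return math.exp(-abs(delta) * t)

if __name__ == "__main__":
    print(chi_numeric(1.0, 1.0), chi_exact(1.0, 1.0))
    # semigroup property chi(t) * chi(s) = chi(t + s)
    print(chi_exact(2.0, 0.3) * chi_exact(2.0, 0.7), chi_exact(2.0, 1.0))
```

The multiplicativity in $t$ is exactly what distinguishes this example from Example 1, where $\chi_{n,m}$ need not be an exponential.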
The evolution $T_t$ restricted to $\mathcal{M}_1$ is trivial. Let us point out that due to Theorem 10, all expectation values of observables belonging to $\mathcal{M}_2$ tend to zero, and the limit is uniform on bounded sets of $\mathcal{M}_2$. It is worth noting that this result has been obtained without assuming that there is a minimal gap between distinct eigenvalues $\lambda_n$ of the operator $A$.

Example 3. Ergodic system.

Let us consider two finite-dimensional systems A and B, each represented by the algebra of $n \times n$ matrices, which are coupled to a measuring apparatus. Suppose that during the measurement we check if the system AB is in a state $P_A$ and in an arbitrary state of the system B, and vice versa. Generalizing the von Neumann projection postulate to the case of a continuous family of non-commuting projections we obtain the following dissipative Markov generator on $M_{n^2 \times n^2}$ [7],
\[ L_D(X) = \kappa \Big[ \int_{\mathbb{CP}^{n-1}} d\mu(\mathbf{n}) \, (P_A(\mathbf{n}) \otimes \mathbf{1}_B) X (P_A(\mathbf{n}) \otimes \mathbf{1}_B) + \int_{\mathbb{CP}^{n-1}} d\mu(\mathbf{n}) \, (\mathbf{1}_A \otimes P_B(\mathbf{n})) X (\mathbf{1}_A \otimes P_B(\mathbf{n})) - 2X \Big] , \]
where $\kappa > 0$, $\mathbf{n} \in \mathbb{CP}^{n-1}$, the complex projective space, and $\mathbf{n} \to P_A(\mathbf{n})$ is a tautological map which assigns to a point $\mathbf{n} \in \mathbb{CP}^{n-1}$ the corresponding one-dimensional projection in the system A. By $\mu(\mathbf{n})$ we denote the unique $U(n)$-invariant measure on $\mathbb{CP}^{n-1}$ normalized in such a way that
\[ \int_{\mathbb{CP}^{n-1}} d\mu(\mathbf{n}) \, P_{A(B)}(\mathbf{n}) = \mathbf{1}_{A(B)} . \]
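For $n = 2$ the averaging over projections that appears in $L_D$ can be explored with a finite sum. The sketch below is an illustration under a stated assumption: the $U(2)$-invariant measure on $\mathbb{CP}^1$ is replaced by four projectors whose Bloch vectors form a regular tetrahedron, each with weight $1/2$ so that the weighted projectors sum to the identity as in the normalization above. For a single $2 \times 2$ system the sum then reproduces $(X + \mathrm{Tr}(X)\mathbf{1})/3$. All function names are hypothetical, and the tetrahedral substitution is our choice, not part of the paper.

```python
import math

# Bloch vectors of a regular tetrahedron: an ASSUMED finite stand-in
# for the U(2)-invariant measure on CP^1 (not taken from the paper).
S2, S23 = math.sqrt(2.0), math.sqrt(2.0 / 3.0)
BLOCH = [(0.0, 0.0, 1.0),
         (2 * S2 / 3, 0.0, -1 / 3),
         (-S2 / 3, S23, -1 / 3),
         (-S2 / 3, -S23, -1 / 3)]

def projector(r):
    """P = (1 + r . sigma) / 2 as a 2x2 complex matrix."""
    x, y, z = r
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def averaged(x):
    """Sum_k w P_k X P_k with weight w = 1/2, so that Sum_k w P_k = 1."""
    out = [[0j, 0j], [0j, 0j]]
    for r in BLOCH:
        p = projector(r)
        pxp = matmul(matmul(p, x), p)
        for i in range(2):
            for j in range(2):
                out[i][j] += 0.5 * pxp[i][j]
    return out

def expected(x):
    """(X + Tr(X) 1) / (n + 1) with n = 2."""
    tr = x[0][0] + x[1][1]
    return [[(x[i][j] + (tr if i == j else 0)) / 3 for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    x = [[1 + 2j, 0.5 - 1j], [3.0 + 0j, -0.7j]]
    print(averaged(x))
    print(expected(x))
```

Applied to each tensor factor, this kind of identity expresses the integrals in terms of $X$ and a partial trace, which is consistent with $L_D$ reducing to a combination of conditional expectations and $X$ for a suitable $\kappa$.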
By direct calculations we obtain that, for an appropriately chosen value of the coupling constant $\kappa$ (depending on dimension $n$), $L_D(X) = \mathrm{tr}_A X + \mathrm{tr}_B X - 2X$, where $\mathrm{tr}_A$ ($\mathrm{tr}_B$) denotes the conditional expectation from $M_{n^2 \times n^2}$ onto $\mathbf{1}_A \otimes M_{n \times n}$ ($M_{n \times n} \otimes \mathbf{1}_B$) respectively. In other words, $\mathrm{tr}_A(X) = \frac{1}{n} \mathbf{1}_A \otimes \mathrm{Tr}_A(X)$, where $\mathrm{Tr}_A$ is the partial trace taken over the system A. Let us further assume that there is no interaction between systems A and B. Then the Hamiltonian $H$ of the system AB equals $H = H_A \otimes \mathbf{1}_B + \mathbf{1}_A \otimes H_B$. In such a case the superoperator $L(X) = i[H, X] + L_D(X)$ generates a quantum Markov semigroup
\[ T_t(X) = e^{itH} \big( e^{-2t} X + e^{-t}(1 - e^{-t})(\mathrm{tr}_A X + \mathrm{tr}_B X) + (1 - e^{-t})^2 \, \mathrm{tr}_{AB} X \big) e^{-itH} , \tag{8} \]
where $\mathrm{tr}_{AB}$ is the conditional expectation from $M_{n^2 \times n^2}$ onto the trivial subalgebra $\mathbb{C} \cdot \mathbf{1}_{AB}$. If we take $\rho_0 = \frac{1}{n^2} \mathbf{1}_{AB}$, then the semigroup $T_t$ satisfies all assumptions of Theorem 10. Hence $M_{n^2 \times n^2} = \mathcal{M}_1 \oplus \mathcal{M}_2$, with $\mathcal{M}_1 = \mathbb{C} \cdot \mathbf{1}_{AB}$. It is easy to check that all statistical states of the system AB evolve to the completely mixed state $\rho_0$.

Example 4. Quantum stochastic process.

Quantum stochastic processes were introduced by Davies [15] to describe rigorously certain continuous measurement processes. They can be constructed from two infinitesimal generators. The first is the generator $Z$ of a strongly continuous semigroup on a Hilbert space $\mathcal{H}_S$, and the second is a stochastic kernel $J$, describing how the measuring apparatus interacts with the system. Let us recall that a stochastic kernel is a measure defined on the σ-algebra of Borel sets in some locally compact space and with values in the space of bounded positive linear operators on $\mathrm{Tr}(\mathcal{H}_S)$, the Banach space of trace class operators on $\mathcal{H}_S$. In this example we take the Poincaré disc $D_1 = \{\zeta \in \mathbb{C} : |\zeta| < 1\}$ as the underlying topological space, and define $Z = iH_S - \frac{\kappa}{2} \mathbf{1}_S$, where $H_S$ is the Hamiltonian of the system and $\kappa > 0$ is the coupling constant. For $E \subset D_1$ and $\rho \in \mathrm{Tr}(\mathcal{H}_S)$ the stochastic kernel is defined by
\[ \mathrm{Tr}[J(E, \rho)A] = \kappa \int_E d\mu(\zeta) \, \mathrm{Tr}(e_\zeta \rho e_\zeta A) , \]
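where $A \in B(\mathcal{H}_S)$, $e_\zeta = |\zeta\rangle\langle\zeta|$ with $|\zeta\rangle$ being a SU(1, 1) coherent state, i.e. a holomorphic function on $D_1$ [37], $|\zeta\rangle(z) = (1 - |\zeta|^2)(1 - z\zeta)^{-2}$ and
\[ d\mu(\zeta) = \frac{1}{\pi} \frac{d\zeta \, d\bar{\zeta}}{(1 - |\zeta|^2)^2} \]
is a SU(1, 1) invariant measure on $D_1$. It should be pointed out that $|\zeta\rangle$ are coherent states in the Hilbert space $\mathcal{H}_S = L^2(D_1, \frac{dz \, d\bar{z}}{\pi})$, which is the space of a unitary irreducible representation $\pi$ of the group SU(1, 1) given by
\[ \pi(g) f(z) = (\beta z + \bar{\alpha}) \, f\Big( \frac{\alpha z + \bar{\beta}}{\beta z + \bar{\alpha}} \Big) , \]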
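where
\[ g = \begin{pmatrix} \alpha & \beta \\ \bar{\beta} & \bar{\alpha} \end{pmatrix} \]
with $|\alpha|^2 - |\beta|^2 = 1$. In order to define a quantum stochastic process, $Z$ and $J$ have to satisfy the following relation
\[ \mathrm{Tr}[J(D_1, e_\psi)] = -2 \, \mathrm{Re}\langle\psi, Z\psi\rangle , \qquad e_\psi = |\psi\rangle\langle\psi| , \]
for all normalized vectors $\psi \in D(Z)$, the domain of $Z$. It is straightforward to check that
\[ \mathrm{Tr}[J(D_1, e_\psi)] = \kappa \int_{D_1} d\mu(\zeta) \, \mathrm{Tr}(e_\zeta e_\psi e_\zeta) = \kappa = -2 \, \mathrm{Re}\langle\psi, Z\psi\rangle . \]
As was shown in [8], the generator of the semigroup $T_t$ associated with the process is given by
\[ L(X) = i[H_S, X] + \kappa \int_{D_1} d\mu(\zeta) \, e_\zeta X e_\zeta - \kappa X . \tag{9} \]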
From Eq. (9) it is clear that $T_t$ satisfies all but the last assumption of Theorem 10. However, although the decomposition $B(\mathcal{H}_S) = \mathcal{M}_1 \oplus \mathcal{M}_2$ does not hold in this case, $T_t$ describes a very efficient decoherence in the quantum system in the spirit of Definition 3A. In fact, if $H_S$ is the operator closure of $(d\pi(h), D_G)$, where $h \in su(1, 1)$, the Lie algebra of the group SU(1, 1), and $D_G$ is the Gårding domain, then
\[ \lim_{t\to\infty} \|T_t A\|_\infty = 0 , \]
for all $A \in \mathcal{K}(\mathcal{H}_S)$, the space of compact operators on $\mathcal{H}_S$ (see Theorem 3.10 in [9]). It follows that $\mathcal{K}(\mathcal{H}_S) \subset \mathcal{M}_2$, and so all pure states of the system instantaneously deteriorate to statistical states. Let us notice that the pre-adjoint semigroup $T_{t*}$ is asymptotically stable, i.e.
\[ \lim_{t\to\infty} \|T_{t*}\rho_1 - T_{t*}\rho_2\|_1 = 0 \]
for all density matrices $\rho_1$ and $\rho_2$. Hence, the set of $T_t$-invariant, or even subinvariant, density matrices is empty. Moreover, since $\|T_{t*}\rho\|_2 \to 0$, where $\|\cdot\|_2$ is the Hilbert–Schmidt norm, and $\|T_{t*}\rho\|_1 = 1$ for all $t \ge 0$, for $dz\,d\bar{z}$-almost all $z$ and $dz\,d\bar{z}$-almost all $z' \ne z$ we have $\rho_t(z, z') \to 0$, where $\rho_t(z, z')$ stands for the integral kernel of the density matrix $T_{t*}\rho$.

So far we have restricted the discussion on the emergence of classical properties to quantum open systems associated with factors of type $I_n$ and $I_\infty$. A generic feature of such factors is that they possess minimal projections. Hence, the only possible classical structure induced in such factors is a discrete one, and so the dynamics restricted to it has to be trivial. Let us now turn to infinite quantum spin systems whose GNS representation with respect to a normalized trace tr is known
to be a hyperfinite factor of type $II_1$ [18]. Such a factor is continuous, i.e. there are no minimal projections in it. We start with the following general theorem.

Theorem 11. Suppose $\mathcal{M}$ is a type $II_1$ factor. Let its evolution be given by a family of maps $\{T_t\}$ satisfying the conditions (a)–(c) from Sec. 2 with $\omega_0 = \mathrm{tr}$, the normalized trace on $\mathcal{M}$. If $\{T_t\}$ is a semigroup, then the decomposition $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ always exists. Moreover, the effective part of any observable in $\mathcal{M}$ is given by a tr-compatible conditional expectation from $\mathcal{M}$ onto $\mathcal{M}_1$.

Remark. Notice that $\mathrm{tr}(\mathbf{1}_S) = 1$, so the identity operator is a $T_t$-invariant faithful statistical state in this case.

Proof. See the Appendix.

Example 5. Apparatus with continuous readings.

Suppose an apparatus, a semi-infinite linear array of spin-$\frac{1}{2}$ particles, fixed at positions $k = 1, 2, 3, \ldots$, interacts with a quantum particle moving along the x-axis. Then, the algebra $\mathcal{M}$ of the measuring device is a hyperfinite factor of type $II_1$, and the algebra of the system is $B(\mathcal{H}_S)$, where $\mathcal{H}_S = L^2(\mathbb{R}, dx)$. More precisely, $\mathcal{M} = \pi(\bigotimes_1^\infty M_{2\times 2})''$, where $\pi$ is the GNS representation with respect to the normalized trace on the Glimm algebra $\bigotimes_1^\infty M_{2\times 2}$. The evolution of the joint system is determined by a Hamiltonian
\[ H = H_A \otimes \mathbf{1}_S + \mathbf{1}_A \otimes H_S + A \otimes B , \tag{10} \]
where $H_A$, $H_S$, $A$ and $B$ are assumed to satisfy the following conditions:
• $[H_A, A] = 0$,
• $A = \pi\big(\sum_{n=1}^{\infty} (\frac{1}{2})^n \sigma_n^3\big)$, where $\sigma_n^3$ is the third Pauli matrix located at position $n$, and so $A \in \mathcal{M}$,
• $H_S = -\frac{1}{2m}\Delta$, the kinetic energy operator,
• $B = \hat{p}$, the momentum operator,
• $\omega_S = |\psi\rangle\langle\psi|$, where $\psi(x) = \frac{1}{\sqrt{2\pi}} \int \frac{e^{ixp}}{\sqrt{\pi(1+p^2)}} \, dp$.

Let $P_S$ denote the conditional expectation from $\mathcal{M} \otimes B(\mathcal{H}_S)$ onto $\mathcal{M}$ with respect to the state $\omega_S$. Then, for any $X \in \mathcal{M}$,
\[ T_t(X) = e^{itH_A} P_S[e^{itA \otimes B} X \otimes \mathbf{1}_S \, e^{-itA \otimes B}] e^{-itH_A} . \tag{11} \]
By Theorem 12 in [31], $T_t$ satisfies all assumptions of Theorem 11. Hence, $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$ with $\mathcal{M}_1$ being the von Neumann algebra generated by spectral projections of all $\sigma_n^3$, $n \in \mathbb{N}$. It is easy to check that $\mathcal{M}_1 = L^\infty(C, \mu)$, where $C$ is the Cantor set and $\mu$ is a continuous, regular, Borel, probability measure on $C$, see [31] for more details. It is worth pointing out that $\mathcal{M}_1$ is unitarily isomorphic with $L^\infty([0, 1], dx)$, and the trace state tr corresponds to the Lebesgue integral on $[0, 1]$.
Because $H_A$ commutes with all spectral projections of the operator $A$, the evolution restricted to $\mathcal{M}_1$ is trivial. Hence, this case may be considered as a continuous analog of the selection of pointer states from Example 2.

Example 6. Continuous superselection rules.

This example is a slight modification of the previous one. Suppose that in addition to the apparatus and system introduced in Example 5, there is another quantum system represented by a factor algebra $\mathcal{N}$ acting in a Hilbert space $\mathcal{H}_q$. As Hamiltonian of the total system we take
\[ H = H_q \otimes \mathbf{1}_A \otimes \mathbf{1}_S + \mathbf{1}_q \otimes H_A \otimes \mathbf{1}_S + \mathbf{1}_q \otimes \mathbf{1}_A \otimes H_S + \mathbf{1}_q \otimes A \otimes B , \tag{12} \]
where $H_q$ is a self-adjoint operator affiliated to $\mathcal{N}$. Since $H$ was formed by adding a free evolution of the system $\mathcal{N}$ to the Hamiltonian (10), $\mathcal{N} \otimes \mathcal{M} = (\mathcal{N} \otimes \mathcal{M}_1) \oplus (\mathcal{N} \otimes \mathcal{M}_2)$ with $\mathcal{N} \otimes \mathcal{M}_1$ being the subalgebra of effective observables. Its center $\mathcal{Z} = \mathbf{1}_q \otimes \mathcal{M}_1$ is a continuous commutative algebra. The evolution restricted to $\mathcal{N} \otimes \mathcal{M}_1$ is Hamiltonian and preserves each superselection sector of $\int_{\mathcal{Z}}^{\oplus} \mathcal{N} \, d\mu$. This model may be easily generalized to the case when the Hamiltonian $H_q$ depends on $x \in C$, and so is different in each superselection sector.

Example 7. Classical dynamical system.

Suppose that a quantum system is a semi-infinite linear array of spin-$\frac{1}{2}$ particles fixed at positions $k \in \mathbb{N}$. The algebra $\mathcal{M}$ of the system is a hyperfinite factor of type $II_1$ defined as $\mathcal{M} = \pi(\bigotimes_1^\infty M_{2\times 2})'' \subset B(\mathcal{H}_S)$, where $\pi$ is the GNS representation with respect to the normalized trace on the Glimm algebra. The free evolution of the system is given by a σ-weakly continuous one-parameter group of automorphisms $\alpha_t : \mathcal{M} \to \mathcal{M}$ constructed in the following way. Suppose $U(\frac{k}{2^n})$, $k = 0, 1, \ldots, 2^n - 1$, is a representation of a cyclic group $\{\frac{k}{2^n}\}$, with addition modulo 1, in the space $\mathbb{C}^{2^n}$, such that
\[ U\Big(\frac{1}{2^n}\Big)(z_1, \ldots, z_{2^n}) = (z_{2^n}, z_1, \ldots, z_{2^n - 1}) . \]
Since it is a restriction of the standard unitary representation of the permutation group $S_{2^n}$, the $U(\frac{k}{2^n})$ are unitary matrices in $M_{2^n \times 2^n}$. Because there is an embedding of $M_{2^n \times 2^n}$ into $\bigotimes_1^\infty M_{2\times 2}$, they may be considered as operators in the Glimm algebra. Hence, they induce a discrete group of unitary automorphisms of $\mathcal{M}$ by the formula
\[ \alpha_{k/2^n}(X) = \pi\Big(U\Big(\frac{k}{2^n}\Big)\Big) \, X \, \pi\Big(U\Big(\frac{k}{2^n}\Big)\Big)^* . \]
Because $n$ was arbitrary, we obtain in this way a homomorphism $d \to \alpha_d$, where $d$ is a dyadic number, i.e. $d = \frac{k}{2^n}$ for some $n \in \mathbb{N}$ and some $k = 0, 1, \ldots, 2^n - 1$. By Theorem 13 in [31], this homomorphism extends to the whole set of real numbers
yielding a group of unitary (but not inner) automorphisms $\alpha_t(X) = e^{itH} X e^{-itH}$, where $H$ is a self-adjoint operator on $\mathcal{H}_S$. It is clear from the construction that $\alpha_m = \mathrm{id}$ for any integer $m$.

The reservoir is chosen to consist of phonons of an infinitely extended harmonic crystal. The Hilbert space $\mathcal{H}$ representing pure states of a single phonon is given by $\mathcal{H} = L^2(\mathbb{R}^3, dk)$. It follows that the Hilbert space of the environment is the symmetric Fock space $\mathcal{F}$ over $\mathcal{H}$, and its algebra $\mathcal{M}_E$ is a von Neumann algebra generated by Weyl operators $W(f) = e^{i\phi(f)}$, $\phi(f) = \frac{1}{\sqrt{2}}(a^*(f) + a(f))$, where $a^*(f)$ denotes the creation operator of the one-particle state $f \in \mathcal{H}$, and $a(f) = (a^*(f))^*$ [12]. Because the Fock representation is irreducible, $\mathcal{M}_E = B(\mathcal{F})$. The reference state of the reservoir $\omega_E$ is taken to be a pure state $\omega_E = |\tilde{f}\rangle\langle\tilde{f}|$, where $|\tilde{f}\rangle$ is a coherent state in the Fock space, i.e.
\[ |\tilde{f}\rangle = e^{-\frac{1}{2}\|f\|^2} \sum_{n=0}^{\infty} \frac{[a^*(f)]^n}{n!} \, \Omega , \]
where $f \in \mathcal{H}$, and $\Omega$ is the vacuum state. Such a state represents a state of phonons associated with a classical acoustic wave. The free evolution of the reservoir is determined by the Hamiltonian $H_E = \int dk \, \omega(k) a^*(k) a(k)$, where $\omega(k)$ is the dispersion function.

Suppose now that these two systems interact. The coupling between the matter and the boson field is given by an interaction Hamiltonian $H_I$, a self-adjoint operator on $\mathcal{H}_S \otimes \mathcal{F}$. To derive an explicit form of $H_I$ we use the formula (I.20) in [5], in which we put $G(k) = A \frac{g(k)}{\sqrt{2}}$, $g \ne 0 \in \mathcal{H}$, and $A$ is the same as in Example 5. Hence, $H_I = A \otimes \phi(g)$. For simplicity we put the coupling constant equal to one. It should be pointed out that, due to the form of $A$, $H_I$ is a straightforward generalization of the interaction term of the spin-boson model. Because $H_I$ commutes neither with $H$ nor with $H_E$, in order to determine the reduced dynamics of the system we do not follow a general strategy as in the previous cases. Instead we use a simplified procedure: first we calculate the reduced dynamics of $H_I$ only, and next add to it the automorphic evolution $\alpha_t$. Hence, for any $X \in \mathcal{M}$,
\[ T_t(X) = \alpha_t\big(P_E[e^{itH_I} X \otimes \mathbf{1}_E \, e^{-itH_I}]\big) , \tag{13} \]
where $P_E$ is the conditional expectation onto the algebra $\mathcal{M}$ with respect to the state $\omega_E$. In order to calculate the explicit form of the superoperators $T_t$ we suppose that $X \in \pi(M_{2^n \times 2^n})$. Then
\[ e^{itH_I} X \otimes \mathbf{1}_E \, e^{-itH_I} = \sum_{\substack{i_1,\ldots,i_n \\ j_1,\ldots,j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n} \otimes e^{it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})\phi(g)} , \]
where $P_{i_1 \cdots i_n} = P_{i_1} \otimes \cdots \otimes P_{i_n}$, $i_k \in \{0, 1\}$, and $P_{i_k}$ are spectral projections of $\pi(\sigma_k^3)$. The parameters $a_{i_1 \cdots i_n}$ are given by
\[ a_{i_1 \cdots i_n} = \sum_{k=1}^{n} (-1)^{i_k} \frac{1}{2^k} . \]
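The parameters $a_{i_1 \cdots i_n}$ can be enumerated directly for small $n$. The following sketch (function names are hypothetical) lists them for all bit strings and checks that distinct strings give distinct values, which is what makes every off-diagonal block $P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n}$ carry a nontrivial phase difference.

```python
from itertools import product

def levels(n):
    """All values a_{i_1...i_n} = sum_k (-1)**i_k / 2**k with i_k in {0, 1},
    keyed by the bit string (i_1, ..., i_n)."""
    return {bits: sum((-1) ** i / 2 ** k for k, i in enumerate(bits, start=1))
            for bits in product((0, 1), repeat=n)}

def minimal_gap(n):
    """Smallest distance between two distinct values a_{i_1...i_n}."""
    values = sorted(levels(n).values())
    return min(b - a for a, b in zip(values, values[1:]))

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, len(set(levels(n).values())), minimal_gap(n))
```

For fixed $n$ the values are equally spaced, so the minimal gap shrinks as $n$ grows; this is why the decay of off-diagonal terms is not uniform in $n$.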
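Hence,
\[ P_E[e^{itH_I} X \otimes \mathbf{1}_E \, e^{-itH_I}] = \sum_{\substack{i_1,\ldots,i_n \\ j_1,\ldots,j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n} \, \langle \tilde{f}, W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g) \tilde{f} \rangle . \]
Because
\[ W(-i\sqrt{2} f)\Omega = e^{-\frac{1}{2}\|f\|^2} e^{a^*(f)} \Omega = |\tilde{f}\rangle , \]
we have
\[ \langle \tilde{f}, W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g) \tilde{f} \rangle = \langle \Omega, W(i\sqrt{2} f) W(t(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})g) W(-i\sqrt{2} f) \Omega \rangle = e^{-\frac{1}{4} t^2 (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 \|g\|^2 + it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})\sqrt{2}\, \mathrm{Re}\langle f, g \rangle} . \]
Hence (with $f \to \sqrt{2} f$),
\[ T_t(X) = \alpha_t\Big( \sum_{\substack{i_1,\ldots,i_n \\ j_1,\ldots,j_n}} P_{i_1 \cdots i_n} X P_{j_1 \cdots j_n} \, e^{-\frac{1}{4} t^2 (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 \|g\|^2 + it(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n}) \mathrm{Re}\langle f, g \rangle} \Big) . \tag{14} \]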
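The scalar factor multiplying each block in Eq. (14) is easy to tabulate. The sketch below (the function name and argument names are hypothetical) evaluates it for a given difference $a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n}$; it illustrates that the modulus is Gaussian in $t$ and that the factor is not multiplicative in $t$.

```python
import cmath

def coherence_factor(t, da, g_norm, re_fg):
    """Factor multiplying P_i X P_j in Eq. (14).

    da     : the difference a_i - a_j
    g_norm : the norm ||g||
    re_fg  : the real number Re<f, g>
    """
    return cmath.exp(-0.25 * (t * da * g_norm) ** 2 + 1j * t * da * re_fg)

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0, 4.0):
        print(t, abs(coherence_factor(t, da=1.0, g_norm=1.0, re_fg=0.3)))
    # compare f(1) * f(1) with f(2): they differ, so no semigroup law
    f1 = coherence_factor(1.0, 1.0, 1.0, 0.0)
    print(abs(f1 * f1), abs(coherence_factor(2.0, 1.0, 1.0, 0.0)))
```

For $da = 0$ the factor equals 1 for all $t$, so diagonal blocks are untouched by the reservoir.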
Because $a_{i_1 \cdots i_n} \ne a_{j_1 \cdots j_n}$ if $i_k \ne j_k$ for at least one $k$, the off-diagonal terms vanish when $t \to \infty$. Let us point out that in this case $T_t$ is not a semigroup. However, the subalgebra of effective observables may be determined in the same way as in Example 5, yielding the same result, i.e. $\mathcal{M}_1 = L^\infty(C, \mu)$, where $C$ is the Cantor set. If $X \in \mathcal{M}_1$, then $T_t(X) = \alpha_t(X)$. In this way we have obtained an abstract commutative dynamical system $(\mathcal{M}_1, \alpha_t, \mathrm{tr})$. In the final step we apply to it the reduction procedure to determine what classical system it represents. Suppose $h$ is a continuous function on $C$. Let us recall that each point $c \in C$ is represented by an infinite sequence $(i_1, i_2, \ldots)$, $i_k = 0, 1$, as $c = \sum_k \frac{2 i_k}{3^k}$. Let
\[ c_1 = (i_1, i_2, \ldots, i_n, 1, 0, 0, \ldots) , \qquad c_2 = (i_1, i_2, \ldots, i_n, 0, 1, 1, \ldots) . \]
Points $c_1$ and $c_2$ are such that there are no points in $C$ between them.

Lemma 12. Suppose $t \to \alpha_t(h)$ is strongly continuous. Then $h(c_1) = h(c_2)$. Moreover, $h(0) = h(1)$.

Proof. See the Appendix.

Let $\mathcal{A}_0$ be a $C^*$-algebra of continuous functions on $C$ such that $\alpha_t$ is strongly continuous on it. Suppose $S^1 = \{e^{ia}, a \in \mathbb{R}\}$, and let $\lambda : C \to S^1$ be given by
\[ \lambda(i_1, i_2, \ldots) = \exp\Big( 2\pi i \sum_{k=1}^{\infty} \frac{i_k}{2^k} \Big) . \]
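The map $\lambda$ identifies exactly the pairs of points singled out in Lemma 12. A small numerical sketch, assuming the infinite sequences are truncated to finite prefixes (so the identification holds only up to a truncation error) and using hypothetical function names:

```python
import cmath
import math

def circle_point(bits):
    """lambda(i_1, i_2, ...) = exp(2 pi i sum_k i_k / 2**k) for a finite
    prefix of the sequence; omitted entries are treated as zeros."""
    angle = sum(i / 2 ** k for k, i in enumerate(bits, start=1))
    return cmath.exp(2j * math.pi * angle)

def gap_endpoints(prefix, depth):
    """Truncations of c1 = (prefix, 1, 0, 0, ...) and c2 = (prefix, 0, 1, 1, ...)
    keeping `depth` entries after the first differing one."""
    c1 = list(prefix) + [1] + [0] * depth
    c2 = list(prefix) + [0] + [1] * depth
    return c1, c2

if __name__ == "__main__":
    c1, c2 = gap_endpoints((1, 0, 1), depth=40)
    print(abs(circle_point(c1) - circle_point(c2)))  # tiny: same point of S^1
    # the end points 0 = (0, 0, ...) and 1 = (1, 1, ...) of C are also glued
    print(abs(circle_point([0] * 50) - circle_point([1] * 50)))
```

Both distances shrink to zero as the truncation depth grows, in line with the conditions $h(c_1) = h(c_2)$ and $h(0) = h(1)$ of Lemma 12.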
Because $\lambda$ is surjective, $\hat{\lambda} : C(S^1) \to C(C)$, $\hat{\lambda}(f)(c) = f(\lambda(c))$, $c \in C$, is an embedding. By Lemma 12, $\mathrm{im}\,\hat{\lambda} = \mathcal{A}_0$. Hence $M = S^1$, and the group of automorphisms $\hat{\alpha}_t : C(S^1) \to C(S^1)$ is induced by a uniform flow $g_t(e^{ia}) = e^{i(2\pi t + a)}$. Thus, we may conclude that $(S^1, g_t, \mu_0)$, where $\mu_0$ is the normalized Lebesgue measure on $S^1$, is the induced classical system.

Example 8. Ergodic spin system.

Suppose again that a quantum system is a semi-infinite linear array of spin-$\frac{1}{2}$ particles fixed at positions $k \in \mathbb{N}$. The quasi-local algebra $\mathcal{A}$ is the norm closure of the algebra $\mathcal{A}_0 = \bigcup_n \mathcal{A}_n$ of local observables. Here, by $\mathcal{A}_n$ we denote the local algebra associated with the set $\Lambda_n = \{1, 2, \ldots, n\}$. It is clear that $\mathcal{A}_n = \bigotimes_{k=1}^{n} \mathcal{A}^{(k)}$, where $\mathcal{A}^{(k)}$ is isomorphic with the algebra of $2 \times 2$ matrices. $\mathcal{M} = \pi(\mathcal{A})'' \subset B(\mathcal{H}_S)$, as in the previous example. Suppose that the system interacts with its environment represented by the algebra $B(\mathcal{H}_E)$. The evolution of the joint system is determined by a Hamiltonian
\[ H = H_S \otimes \mathbf{1}_E + \mathbf{1}_S \otimes H_E + A \otimes B , \tag{15} \]
where $H_S$ and $A$ are given by
• $H_S = \pi\big(\prod_{k=1}^{\infty} (1 + b_k \sigma_k^1)\big)$, $\sigma_k^1 \in \mathcal{A}^{(k)}$ is the first Pauli matrix, $b_k > 0$, and $\sum_{k=1}^{\infty} b_k < \infty$,
• $A = \pi\big(\sum_{k=1}^{\infty} \frac{1}{2^k} \sigma_k^3\big)$, as in Example 5.

Because $\|H_S\|_\infty = \prod_{k=1}^{\infty} (1 + b_k) < \infty$, both $H_S$ and $A$ are bounded and belong to $\pi(\mathcal{A})$. We do not specify the form of the operators $H_E$ and $B$. Instead, we assume that the so-called singular coupling limit [1] may be applied for derivation of the reduced dynamics of the system. Hence, the Markovian master equation for $x \in \mathcal{M}$ reads
\[ \dot{x} = L(x) = i[H, x] + L_D(x) , \]
where $H = H_S + \alpha A^2$ and
\[ L_D(x) = \gamma A x A - \frac{\gamma}{2} \{x, A^2\} . \]
The coefficients $\alpha \in \mathbb{R}$ and $\gamma > 0$ are given by the formula
\[ \int_0^\infty \mathrm{Tr}(\rho_E e^{itH_E} B e^{-itH_E} B) \, dt = \frac{\gamma}{2} + i\alpha , \]
where $\rho_E$ is a density matrix of the environment. It is clear that the semigroup $T_t = e^{tL}$ on $\mathcal{M}$ preserves the trace tr and therefore satisfies the assumptions of Theorem 11. Hence $\mathcal{M} = \mathcal{M}_1 \oplus \mathcal{M}_2$.

Theorem 13. The system $(\mathcal{M}, T_t)$ is ergodic, i.e. $\mathcal{M}_1 = \mathbb{C} \cdot \mathbf{1}$.

Proof. See the Appendix.
The above results are purely mathematical and invite the following question: what is the relation between the physical implications and the proposed mathematical procedure of de-quantization? To establish such a close connection it is essential to consider more examples, especially those with nontrivial evolutions on the decoherence-free part of a system. However, Example 7, although a bit contrived, shows directly that in principle infinite quantum systems may behave like simple classical dynamical systems. It means that, when we neglect terms which deteriorate to zero, the rest of the system may be described by a set of classical parameters which evolve according to the laws of classical physics. And from a mathematical point of view there is potentially a full range of classical systems emerging in this way. Since for any compact metric space $M$ there is a surjective map $\lambda : C \to M$, the $C^*$-algebra $C(C)$ contains all algebras $C(M)$ as subalgebras. The open problem is how to construct physically plausible dynamics of the infinite fermion system so that $C(M)$ would be selected as $\mathcal{A}_0$.

Appendix

Proof of Theorem 4. Let $t > 0$ be fixed. We show that $T_t : \mathcal{M}_1 \to \mathcal{M}_1$ is a $*$-automorphism. Let $P(\mathcal{M}_1)$ denote the set of all projections in $\mathcal{M}_1$.

Step 1. Suppose $P, Q \in P(\mathcal{M}_1)$ and $PQ = 0$. Then $T_t(P + Q)$ is a projection and so $T_t(P)T_t(Q) + T_t(Q)T_t(P) = 0$. Since both $T_t(P)$ and $T_t(Q)$ are projections, $T_t(P)T_t(Q) = 0$. It means that $T_t$ maps orthogonal projections to orthogonal ones.

Step 2. Suppose that $A = A^* \in \mathcal{M}_1$. Let $A = \int \lambda \, dE(\lambda)$ be its spectral decomposition. By Step 1, $dT_tE(\lambda)$ is another spectral measure. Because $T_t$ is normal, $T_t(A) = \int \lambda \, dT_tE(\lambda)$ and hence $T_t(A^2) = (T_tA)^2$. It follows that $T_t$ is a Jordan homomorphism.

Step 3. Suppose that $A \in \mathcal{M}_1$. Then, by Step 2,
\[ T_t(A^*A) + T_t(AA^*) = T_t(A)^*T_t(A) + T_t(A)T_t(A)^* . \]
Because, by condition (i) of Sec. 2, $T_t$ is completely positive and preserves the identity operator, it satisfies the Schwarz inequality. Hence $T_t(A^*A) \ge T_t(A)^*T_t(A)$ and $T_t(AA^*) \ge T_t(A)T_t(A)^*$, which implies that $T_t(A^*A) = T_t(A)^*T_t(A)$.

Step 4. Suppose that $\phi \in \mathcal{D}$. A sesquilinear form $b_\phi$ on $\mathcal{M}_1$ given by
\[ b_\phi(A, B) = \phi\big(T_t(A^*B) - T_t(A^*)T_t(B)\big) \]
is positive. By Step 3, $b_\phi(A, A) = 0$. Hence also $b_\phi(A, B) = 0$. Because the state $\phi$ was arbitrary, $T_t(A^*B) = T_t(A^*)T_t(B)$. In this way we prove that $T_t$ is a homomorphism.

Step 5. Next we show that $T_t : P(\mathcal{M}_1) \to P(\mathcal{M}_1)$ is bijective. By property (ii) of Definition 3B, it is onto. Suppose that there are $P_1, P_2 \in P(\mathcal{M}_1)$ such that $T_t(P_1) = T_t(P_2)$. Then
\[ T_t((P_1 - P_2)^2) = (T_t(P_1) - T_t(P_2))^2 = 0 \]
and so, by property (iii) of Sec. 2, $\omega_0((P_1 - P_2)^2) = 0$. Because $\omega_0$ is faithful, $P_1 = P_2$.

Step 6. Since $T_t$ is normal, $\ker T_t$ is a von Neumann algebra. As such it is generated by projections. By Step 5, $T_t$ transforms non-zero projections to non-zero ones. Hence $\ker T_t = 0$. The range of $T_t$ is also a von Neumann algebra. Since it contains $P(\mathcal{M}_1)$, it coincides with $\mathcal{M}_1$. Hence $T_t$ is a $*$-automorphism of $\mathcal{M}_1$.

Proof of Theorem 10. Because $T_t : B(\mathcal{H}_S) \to B(\mathcal{H}_S)$ are normal and contractive in the operator norm, there exists a pre-adjoint semigroup $T_{t*} : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$, which is contractive in the trace norm $\|\cdot\|_1$. By $\mathrm{Tr}(\mathcal{H}_S)$ we denote the Banach space of trace class operators on $\mathcal{H}_S$, the predual space of $B(\mathcal{H}_S)$. The set of density matrices $\mathcal{D}$ consists of those $\rho \in \mathrm{Tr}(\mathcal{H}_S)$ which are positive and normalized. The cone of positive and normal functionals we denote by $\mathrm{Tr}(\mathcal{H}_S)_+$. Since $T_t$ is Tr-invariant, $T_t : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$, and is bounded in the trace norm. Thus, the adjoint map $\hat{T}_t = (T_t|_{\mathrm{Tr}(\mathcal{H}_S)})^* : B(\mathcal{H}_S) \to B(\mathcal{H}_S)$ is bounded in the operator norm. Moreover, for any $\rho \in \mathcal{D}$,
\[ \mathrm{Tr}(\rho \hat{T}_t(\mathbf{1})) = \mathrm{Tr}(T_t\rho) = 1 , \]
which implies that $\hat{T}_t$ is unital. Hence it satisfies the Schwarz inequality. Using this property we show now that $\hat{T}_t$ is in fact contractive in the operator norm. Suppose it is not. Then there exists a minimal constant $C > 1$ such that $\|\hat{T}_tA\|_\infty \le C\|A\|_\infty$ for all $A \in B(\mathcal{H}_S)$. Let us take $v \in \mathcal{H}_S$ with $\|v\| = 1$. Then
\[ \|\hat{T}_t(A)v\|^2 \le \langle v, \hat{T}_t(A^*A)v \rangle = \|(\hat{T}_t(A^*A))^{1/2}v\|^2 \le \|(\hat{T}_t(A^*A))^{1/2}\|_\infty^2 = \|\hat{T}_t(A^*A)\|_\infty \le C\|A\|_\infty^2 . \]
Hence $\|\hat{T}_t(A)\|_\infty \le C^{1/2}\|A\|_\infty$, a contradiction since the constant $C$ was assumed to be minimal. The contractivity of $\hat{T}_t$ implies that $T_t|_{\mathrm{Tr}(\mathcal{H}_S)}$ must also be contractive. Next we show that $T_{t*}$ is also contractive in the operator norm. Suppose $\phi \in \mathrm{Tr}(\mathcal{H}_S)$. Because $\mathrm{Tr}(\mathcal{H}_S) \subset \mathcal{K}(\mathcal{H}_S)$, the Banach space (and $C^*$-algebra in fact) of compact operators on $\mathcal{H}_S$, and $\mathcal{K}(\mathcal{H}_S)^* = \mathrm{Tr}(\mathcal{H}_S)$, there exists $\psi \in \mathrm{Tr}(\mathcal{H}_S)$ with $\|\psi\|_1 = 1$ such that $\|T_{t*}\phi\|_\infty = |\mathrm{Tr}(T_{t*}\phi)\psi|$. Hence
\[ \|T_{t*}\phi\|_\infty = |\mathrm{Tr}\,\phi(T_t\psi)| \le \|\phi\|_\infty \|T_t\psi\|_1 \le \|\phi\|_\infty . \]
Summing up: for all $t \ge 0$, $T_{t*} : \mathrm{Tr}(\mathcal{H}_S) \to \mathrm{Tr}(\mathcal{H}_S)$ is completely positive and contractive in both the trace and operator norm.

Next we consider topological properties of the semigroup $\{T_{t*}\}$. Since, by assumption, it possesses a faithful and $T_{t*}$-subinvariant density matrix $\rho_0$, it is relatively compact in the weak operator topology [23]. It means that the set $K_\phi = \{T_{t*}\phi\}_{t \ge 0}$ is relatively compact for any $\phi \in \mathrm{Tr}(\mathcal{H}_S)$ in the weak topology on $\mathrm{Tr}(\mathcal{H}_S)$. Suppose that $\phi \ge 0$. By the Eberlein–Šmulian theorem $K_\phi$ is weakly sequentially compact. Let $\{\phi_n\}$ be an arbitrary sequence in $K_\phi$. Then there exists a subsequence $\{\phi_{m_n}\}$ such that $\text{w-lim}\, \phi_{m_n} = \psi$, where $\psi \in \mathrm{Tr}(\mathcal{H}_S)_+$. However,
$\phi_{m_n} \in \mathrm{Tr}(\mathcal{H}_S)_+$, so, by Corollary 5.11 in [42], $\lim_{n\to\infty} \|\phi_{m_n} - \psi\|_1 = 0$. This implies that $K_\phi$ is sequentially compact, and so it is relatively compact (in the trace norm topology). Suppose now that $\phi \in \mathrm{Tr}(\mathcal{H}_S)$. Then $\phi = \phi_1 - \phi_2 + i\phi_3 - i\phi_4$, where all $\phi_j \in \mathrm{Tr}(\mathcal{H}_S)_+$. Each set $K_j = \{T_{t*}\phi_j\}_{t \ge 0}$ is relatively compact. Because the function $f(\psi_1, \psi_2, \psi_3, \psi_4) = \psi_1 - \psi_2 + i\psi_3 - i\psi_4$, $\psi_j \in \mathrm{Tr}(\mathcal{H}_S)$, is norm continuous, the set $f(\times K_j)$ is compact in $\mathrm{Tr}(\mathcal{H}_S)$. However, for all $t \ge 0$, $T_{t*}\phi \in f(\times K_j)$, which implies that $K_\phi$ is compact. Hence, the semigroup $\{T_{t*}\}$ is relatively compact in the strong operator topology.

We are now in a position to apply Theorem 24 from [34]. It states that $\mathrm{Tr}(\mathcal{H}_S)$ decomposes into an isometric and a sweeping part, $\mathrm{Tr}(\mathcal{H}_S) = \mathrm{Tr}(\mathcal{H}_S)_{iso} \oplus \mathrm{Tr}(\mathcal{H}_S)_s$, such that $T_{t*}(\phi_1) = U_t\phi_1U_t^*$, $\phi_1 \in \mathrm{Tr}(\mathcal{H}_S)_{iso}$, where $U_t$ is a strongly continuous group of unitary operators, and $\lim_{t\to\infty} \|T_{t*}\phi_2\|_1 = 0$ for all $\phi_2 \in \mathrm{Tr}(\mathcal{H}_S)_s$. Moreover, there exists a Tr-compatible projection $\hat{P}$ (a linear, contractive and completely positive superoperator which satisfies $\hat{P}^2 = \hat{P}$) on $\mathrm{Tr}(\mathcal{H}_S)$ such that its range is equal to $\mathrm{Tr}(\mathcal{H}_S)_{iso}$.

In the final step we translate these results to the operator algebra framework. By Theorem 4.1 in [35], the dual projection $\hat{P}^*$ is a Tr-compatible conditional expectation on $B(\mathcal{H}_S)$. Hence $B(\mathcal{H}_S) = \mathcal{M}_1 \oplus \mathcal{M}_2$, where $\mathcal{M}_1$ is the range of $\hat{P}^*$, and $\mathcal{M}_2$ is the range of $(\mathrm{id} - \hat{P}^*)$. Moreover, $\mathcal{M}_1$ is a von Neumann algebra and the evolution on it is given by $T_t(A_1) = U_t^*A_1U_t$. What remains to be proven is the uniform decrease to zero of all expectation values of observables belonging to $\mathcal{M}_2$. To this end suppose that $A_2 \in \mathcal{M}_2$ with $\|A_2\|_\infty \le 1$ and $\rho \in \mathcal{D}$. Then, by Theorem 24 in [34],
\[ \lim_{t\to\infty} |\mathrm{Tr}\,\rho T_t(A_2)| = \lim_{t\to\infty} |\mathrm{Tr}(T_{t*}\rho)(\mathrm{id} - \hat{P}^*)A_2| = \lim_{t\to\infty} |\mathrm{Tr}\,(\mathrm{id} - \hat{P})(T_{t*}\rho)A_2| \le \|A_2\|_\infty \lim_{t\to\infty} \|T_{t*}\rho - \hat{P}(T_{t*}\rho)\|_1 = 0 , \]
and the limit is uniform in $A_2$ provided it belongs to the unit ball of $\mathcal{M}_2$.

Proof of Theorem 11. Since $\mathrm{tr} \circ T_t = \mathrm{tr}$, $T_t$ is bounded in the trace norm. Hence, it may be extended to a map $T_t : L^1(\mathcal{M}) \to L^1(\mathcal{M})$. However, $L^1(\mathcal{M}) = \mathcal{M}_*$, so the adjoint map $T_t^* : \mathcal{M} \to \mathcal{M}$ is bounded and unital. Because it is also completely positive, it satisfies the Schwarz inequality. Using the same argument as in the proof of Theorem 10, we conclude that $T_t^*$ is contractive in the operator norm, which further implies the contractivity of $T_t$ in the trace norm. Hence, the assertion follows from Theorem 7 and Corollary 9 in [31].

Proof of Lemma 12. Suppose on the contrary that $h(c_1) \ne h(c_2)$, where
\[ c_1 = (i_1, i_2, \ldots, i_n, 1, 0, 0, \ldots) , \qquad c_2 = (i_1, i_2, \ldots, i_n, 0, 1, 1, \ldots) . \]
Let us take a sequence of dyadic numbers $d_m = \frac{1}{2^m}$. Then, for $m > n + 1$,
\[ (\alpha_{d_m}h)(c_2) = h(U(d_m)c_2) = h(c_m) , \]
where $c_m = (i_1, i_2, \ldots, i_n, 1, 0, \ldots, i_m = 0, 1, 1, \ldots)$. It is clear that $c_m \to c_1$. Because $h$ is continuous, for $\epsilon = \frac{1}{2}|h(c_1) - h(c_2)|$ there exists $N$ such that for all $m > N$, $|h(c_m) - h(c_1)| < \epsilon$. Hence
\[ \|\alpha_{d_m}(h) - h\|_{\sup} \ge |(\alpha_{d_m}h)(c_2) - h(c_2)| = |h(c_m) - h(c_2)| > \epsilon , \]
a contradiction. The condition $h(0) = h(1)$ may be shown in the same way.

Proof of Theorem 13. Let us observe that $T_t : \pi(\mathcal{A}_n) \to \pi(\mathcal{A}_n)$ for any $n \in \mathbb{N}$. Let us first show that the algebra of effective observables for $T_t|_{\pi(\mathcal{A}_n)}$ consists only of operators proportional to the identity operator. Using the isomorphism of $\pi(\mathcal{A}_n)$ with the algebra $M_{2^n \times 2^n}$, we reduce this problem to determining this algebra for $T_t^{(n)} = e^{tL_n}$, where $L_n(x) = i[H_n, x] + L_n^D(x)$ and $x \in M_{2^n \times 2^n}$. Here
\[ H_n = \bigotimes_{k=1}^{n} (\mathbf{1}_{2\times 2} + b_k\sigma^1) + \alpha A_n^2 , \qquad L_n^D(x) = \gamma A_n x A_n - \frac{\gamma}{2}\{x, A_n^2\} , \]
and
\[ A_n = \sum_{i_1,\ldots,i_n=1}^{2} a_{i_1 \cdots i_n} P_{i_1 \cdots i_n} , \qquad a_{i_1 \cdots i_n} = \sum_{k=1}^{n} (-1)^{i_k - 1} \frac{1}{2^k} , \qquad P_{i_1 \cdots i_n} = P_{i_1} \otimes \cdots \otimes P_{i_n} . \]
$P_1$ and $P_2$ are the spectral projections of $\sigma^3$, i.e. $\sigma^3 = P_1 - P_2$. It is clear that if $x = z\mathbf{1}_{2^n \times 2^n}$, $z \in \mathbb{C}$, then $T_t^{(n)}x = T_t^{(n)*}x = x$ for all $t \ge 0$. Conversely, suppose that $T_t^{(n)*}T_t^{(n)}x = T_t^{(n)}T_t^{(n)*}x = x$ for some $x \in M_{2^n \times 2^n}$. Then, by calculating the first and second time derivative at $t = 0$ of the above equation, we obtain that $L_n^D(x) = 0$ and $L_n^D([H_n, x]) = 0$. Because
\[ (L_n^D(x))_{i_1 \cdots i_n, j_1 \cdots j_n} = -\frac{\gamma}{2} x_{i_1 \cdots i_n, j_1 \cdots j_n} (a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2 , \]
$L_n^D(x) = 0$ if and only if $x$ is diagonal, i.e.
\[ x = \sum_{i_1,\ldots,i_n=1}^{2} x_{i_1 \cdots i_n} P_{i_1 \cdots i_n} . \]
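The entrywise formula for $L_n^D$ used above follows from $A_n$ being diagonal in the basis of the $P_{i_1 \cdots i_n}$. A minimal sketch checking it numerically, assuming $A = \mathrm{diag}(a)$ is stored as a list of its diagonal entries (function names are hypothetical):

```python
def dissipator(x, a, gamma):
    """L_D(x) = gamma * (A x A - {x, A^2} / 2) for a DIAGONAL A = diag(a).
    Because A is diagonal, (A x A)_ij = a_i x_ij a_j and
    ({x, A^2})_ij = x_ij a_j**2 + a_i**2 x_ij."""
    d = len(a)
    return [[gamma * (a[i] * x[i][j] * a[j]
                      - 0.5 * (x[i][j] * a[j] ** 2 + a[i] ** 2 * x[i][j]))
             for j in range(d)] for i in range(d)]

def entrywise_formula(x, a, gamma):
    """-(gamma / 2) * x_ij * (a_i - a_j)**2, the expression used in the proof."""
    d = len(a)
    return [[-0.5 * gamma * x[i][j] * (a[i] - a[j]) ** 2 for j in range(d)]
            for i in range(d)]

if __name__ == "__main__":
    a = [0.75, 0.25, -0.25, -0.75]  # the four values a_{i1 i2} for n = 2
    x = [[complex(i + 1, j - 2) for j in range(4)] for i in range(4)]
    print(dissipator(x, a, 1.3)[0][1], entrywise_formula(x, a, 1.3)[0][1])
```

Since all $a_{i_1 \cdots i_n}$ are distinct, the factor $(a_{i_1 \cdots i_n} - a_{j_1 \cdots j_n})^2$ vanishes only on the diagonal, which is the step used to conclude that $x$ is diagonal.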
However, $[H_n, x]$ is diagonal as well, so $P_{i_1 \cdots i_n}[H_n, x]P_{j_1 \cdots j_n} = 0$
for all $(i_1 \cdots i_n) \ne (j_1 \cdots j_n)$. Because
\[ P_{i_1 \cdots i_n}[H_n, x]P_{j_1 \cdots j_n} = (x_{i_1 \cdots i_n} - x_{j_1 \cdots j_n}) P_{i_1 \cdots i_n} H_n P_{j_1 \cdots j_n} \]
and
\[ (H_n)_{i_1 \cdots i_n, j_1 \cdots j_n} = \prod_{k=1}^{n} (\mathbf{1}_{2\times 2} + b_k\sigma^1)_{i_k j_k} = \prod_{k=1}^{n} (\delta_{i_k j_k} + b_k(\sigma^1)_{i_k j_k}) \ne 0 , \]
for all $(i_1 \cdots i_n) \ne (j_1 \cdots j_n)$ we have $x_{i_1 \cdots i_n} = x_{j_1 \cdots j_n}$. Hence $x = z\mathbf{1}_{2^n \times 2^n}$. Suppose now that $y \in \pi(\mathcal{A}_n)$ and $\mathrm{tr}\, y = 0$. Then, for any $x \in \pi(\mathcal{A}_n)$,
\[ \lim_{t\to\infty} \mathrm{tr}(xT_ty) = 0 . \]
Since $\pi(\mathcal{A}_n)$ is finite dimensional, all topologies coincide on it. Hence $\|T_ty\|_2 \to 0$ when $t \to \infty$.

Finally, we show that $\mathcal{M}_1 = \mathbb{C} \cdot \mathbf{1}$. Suppose on the contrary that $x \in \mathcal{M}_1$ and $x \ne z\mathbf{1}$. Then $y = x - (\mathrm{tr}\, x)\mathbf{1} \in \mathcal{M}_1$ and $y \ne 0$. Hence, we may assume that $\|y\|_2 = 1$. Let $(x_n)$ be a sequence such that $x_n \in \pi(\mathcal{A}_n)$ and $x_n \to x$ in $L^2(\mathcal{M})$. Then $y_n = x_n - (\mathrm{tr}\, x_n)\mathbf{1} \in \pi(\mathcal{A}_n)$ and $y_n \to y$. Hence, there exists $n_0 \in \mathbb{N}$ such that $\|y - y_{n_0}\|_2 < \frac{1}{4}$. On the other hand, since $\mathrm{tr}\, y_{n_0} = 0$, there exists $t_0 > 0$ such that $\|T_{t_0}y_{n_0}\|_2 < \frac{1}{4}$. Thus
\[ 1 = \|T_{t_0}y\|_2 \le \|T_{t_0}(y - y_{n_0})\|_2 + \|T_{t_0}y_{n_0}\|_2 < \frac{1}{2} , \]
a contradiction. Hence, $\mathcal{M}_1 = \mathbb{C} \cdot \mathbf{1}$.

Acknowledgments

We thank the Referee for calling our attention to a number of references. One of the authors (R.O.) would like to thank the A. von Humboldt Foundation for financial support.

References

[1] R. Alicki and K. Lendi, Quantum Dynamical Semigroups and Applications, LNP 286, Springer, Berlin, 1987.
[2] A. Amann, Fortschr. Phys. 34 (1986), 167.
[3] A. Amann, Helv. Phys. Acta 60 (1987), 384.
[4] H. Araki, Progr. Theory Phys. 64 (1980), 719.
[5] V. Bach, J. Fröhlich and I. M. Segal, J. Math. Phys. 41 (2000), 3985.
[6] Ph. Blanchard et al. (Eds.), Decoherence: Theoretical, Experimental and Conceptual Problems, LNP 538, Springer, Berlin, 2000.
[7] Ph. Blanchard, L. Jakobczyk and R. Olkiewicz, Phys. Lett. A 280 (2001), 7.
[8] Ph. Blanchard and R. Olkiewicz, Phys. Lett. A 273 (2000), 223.
[9] Ph. Blanchard and R. Olkiewicz, J. Stat. Phys. 94 (1999), 933.
[10] P. Bona, J. Math. Phys. 29 (1988), 2223.
[11] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics I, Springer, New York, 1979.
[12] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics II, Springer, New York, 1981.
May 26, 2003 12:17 WSPC/148-RMP
00163
Decoherence Induced Transition from Quantum
[13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] [47] [48] [49] [50]
243
H.-P. Breuer and F. Petruccione, Phys. Rev. A 63 (2001), 032102. M. E. Brune et al., Phys. Rev. Lett. 77 (1996), 4887. E. B. Davies, Comm. Math. Phys. 15 (1969), 277. D. A. Edwards, Quart. J. Math. 53 (2002), 19. L. Erd¨ os and H.-T. Yau, Comm. Pure Appl. Math. 53 (2000), 667. D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Clarendon Press, Oxford, 1998. J. Fr¨ ohlich, T. Tsai and H. Yau, Geom. Funct. Anal. Special Volume GAFA 2000, 57–78. J. Fr¨ ohlich, T. Tsai and H. Yau, Comm. Math. Phys. 225 (2002), 223. M. Gell-Mann and J. B. Hartle, Phys. Rev. D 47 (1993), 3345. D. Giulini et al. (Eds.), Decoherence and the Appearance of a Classical World in Quantum Theory, Springer, Berlin, 1996. U. Groh, in One-parameter Semigroups of Positive Operators, LNM 1184, R. Nagel (Ed.), Springer, Berlin, 1986, pp. 369–426. R. Haag and D. Kastler, J. Math. Phys. 7 (1964), 848. Z. Haba, Lett. Math. Phys. 44 (1998), 121. K. Hepp, Comm. Math. Phys. 35 (1974), 265. R. Honegger and A. Rieckers, Publ. RIMS Kyoto 30 (1994), 111. E. Joos and H. D. Zeh, Z. Phys. B 59 (1985), 223. J. Kupsch, in: [6], pp. 125–136. N. P. Landsman, Int. J. Mod. Phys. A 6 (1991), 5349. P. Lugiewicz and R. Olkiewicz, J. Phys. A 35 (2002), 6695. W. A. Majewski, J. Stat. Phys. 55 (1989), 417. C. Monroe et al., Sci. 272 (1996), 1131. R. Olkiewicz, Comm. Math. Phys. 208 (1999), 245. R. Olkiewicz, Ann. Phys. 286 (2000), 10. R. Omnes, Phys. Rev. A 65 (2002), 052119. A. M. Perelomov, Comm. Math. Phys. 26 (1972), 222. H. Primas, in: [6], pp. 161–178. G. A. Raggio and A. Rieckers, Int. J. Theor. Phys. 22 (1983), 267. I. E. Segal, Bull. Am. Math. Soc. 61 (1947), 69. V. S. Sunder, An Invitation to von Neumann Algebras, Springer, New York, 1987. M. Takesaki, Theory of Operator Algebras I, Springer, New York, 1979. J. Twamley, Phys. Rev. D 48 (1993), 5730. T. Unnerstall, J. Math. Phys. 31 (1990), 680. T. Unnerstall, Comm. Math. Phys. 130 (1990), 237. W. G. Unruh and W. H. Zurek, Phys. Rev. 
D 40 (1989), 1071. H. D. Zeh, Found. Phys. 1 (1970), 69. H. D. Zeh, The Physical Basis of The Direction of Time, 4th edn., Springer, Berlin, 2001. W. H. Zurek, Phys. Rev. D 26 (1982), 1862. W. H. Zurek, Progr. Theory Phys. 89 (1993), 281.
May 26, 2003 16:37 WSPC/148-RMP
00162
Reviews in Mathematical Physics, Vol. 15, No. 3 (2003) 245–270
© World Scientific Publishing Company
NON-RELATIVISTIC LIMIT OF A DIRAC–MAXWELL OPERATOR IN RELATIVISTIC QUANTUM ELECTRODYNAMICS
ASAO ARAI
Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan
Received 5 June 2002
Revised 9 January 2003
The non-relativistic (scaling) limit of a particle-field Hamiltonian $H$, called a Dirac–Maxwell operator, in relativistic quantum electrodynamics is considered. It is proven that the non-relativistic limit of $H$ yields a self-adjoint extension of the Pauli–Fierz Hamiltonian with spin 1/2 in non-relativistic quantum electrodynamics. This is done by establishing in an abstract framework a general limit theorem on a family of self-adjoint operators partially formed out of strongly anticommuting self-adjoint operators and then by applying it to $H$.
Keywords: Quantum electrodynamics; Dirac operator; Dirac–Maxwell operator; Pauli–Fierz Hamiltonian; non-relativistic limit; scaling limit; Fock space; strongly anticommuting self-adjoint operators.
1. Introduction
In a previous paper [3], the author analyzed fundamental properties of a particle-field Hamiltonian $H$ in relativistic quantum electrodynamics (QED), namely, the Hamiltonian of a Dirac particle (a relativistic charged particle with spin 1/2) interacting with the quantum radiation field. For convenience, we call this particle-field Hamiltonian a Dirac–Maxwell operator. In this paper, we consider the non-relativistic (scaling) limit of $H$. We prove that the non-relativistic limit of $H$ yields a self-adjoint extension of the Pauli–Fierz Hamiltonian with spin 1/2 in non-relativistic QED. This establishes a mathematically rigorous connection of relativistic QED to non-relativistic QED, which has not been proven so far. The Dirac–Maxwell operator $H$ is of the form $H = H_D + H_{\mathrm{rad}} + H_I$, where $H_D$ is a Dirac operator describing the Dirac particle system only, $H_{\mathrm{rad}}$ is the free Hamiltonian of the quantum radiation field (a quantum version of the Maxwell Hamiltonian in the Coulomb gauge) and $H_I$ is the interaction term between the Dirac particle
and the quantum radiation field. As for the Dirac operator $H_D$, the non-relativistic limit has already been investigated and is well understood ([11, Chapter 6] and references therein). We extend the methods used in the case of the Dirac operator $H_D$ to the case of $H$. This can be done in an abstract framework. We remark that the non-relativistic limit theory of $H_D$ is included in the theory of scaling limits on strongly anticommuting self-adjoint operators [2]. In view of this structure, we further develop the theory of scaling limits on strongly anticommuting self-adjoint operators in such a way that it can be applied to the non-relativistic limit of $H$. This is an outline of the method of the present paper.
The present paper is organized as follows. In Sec. 2 we describe the Dirac–Maxwell operator and the Pauli–Fierz Hamiltonian with spin 1/2. In Sec. 3 we state the main results of the present paper. Section 4 is devoted to an abstract analysis of a family of self-adjoint operators partially formed out of strongly anticommuting self-adjoint operators. We prove a limit theorem and a resolvent formula. These results are generalizations of previously known ones ([2], [11, Chapter 6]). In the last section, applying the general limit theorem established in Sec. 4, we prove the main results. In Appendix A we present a method to find a self-adjoint extension $\tilde{S}$ of a Hermitian operator $S$ defined as a finite sum of self-adjoint operators bounded from below. The self-adjoint extension $\tilde{S}$ may be different from the Friedrichs extension and the one defined as a form sum if $S$ is symmetric, but not essentially self-adjoint. The method here has an advantage in that $\tilde{S}$ can be approximated by a family $\{S(\kappa)\}_{\kappa>0}$ of self-adjoint operators (as $\kappa \to \infty$) which are defined by "cutting off" $S$ and may be tractable. We apply this abstract method to the construction of a self-adjoint extension of the Pauli–Fierz Hamiltonian without spin (Appendix B) and that with spin 1/2 (Sec. 3.3).
2. The Dirac–Maxwell Operator and the Pauli–Fierz Hamiltonian
For a linear operator $T$ on a Hilbert space, we denote its domain by $D(T)$, and its adjoint by $T^*$ (provided that $T$ is densely defined). For two objects $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$ such that the products $a_j b_j$ ($j = 1, 2, 3$) and their sum can be defined, we set $a \cdot b := \sum_{j=1}^{3} a_j b_j$. We use the physical unit system in which $c$ (the speed of light) $= 1$ and $\hbar = 1$ ($\hbar := h/(2\pi)$; $h$ is the Planck constant).
2.1. The Dirac operator
Let $D_j$ ($j = 1, 2, 3$) be the generalized partial differential operator in the variable $x_j$, the $j$th component of $x = (x_1, x_2, x_3) \in \mathbb{R}^3$, and $\nabla := (D_1, D_2, D_3)$. We denote the mass and the charge of the Dirac particle by $m > 0$ and $q \in \mathbb{R} \setminus \{0\}$ respectively. We consider the situation where the Dirac particle is in a potential $V$ which is a Hermitian-matrix-valued Borel measurable function on $\mathbb{R}^3$. Then the Hamiltonian of the Dirac particle is given by the Dirac operator
$$H_D := \alpha \cdot (-i\nabla) + m\beta + V \tag{2.1}$$
acting in the Hilbert space
$$\mathcal{H}_D := \oplus^4 L^2(\mathbb{R}^3) \tag{2.2}$$
with domain $D(H_D) := [\oplus^4 H^1(\mathbb{R}^3)] \cap D(V)$ ($H^1(\mathbb{R}^3)$ is the Sobolev space of order 1) and $\alpha := (\alpha_1, \alpha_2, \alpha_3)$, where $\alpha_j$ ($j = 1, 2, 3$) and $\beta$ are $4 \times 4$ Hermitian matrices satisfying the anticommutation relations
$$\{\alpha_j, \alpha_k\} = 2\delta_{jk}, \quad j, k = 1, 2, 3, \tag{2.3}$$
$$\{\alpha_j, \beta\} = 0, \quad \beta^2 = I_4, \quad j = 1, 2, 3, \tag{2.4}$$
$\{A, B\} := AB + BA$, $\delta_{jk}$ is the Kronecker delta and $I_4$ is the $4 \times 4$ identity matrix. We assume the following:
Hypothesis (A). Each matrix element of $V$ is almost everywhere (a.e.) finite with respect to the three-dimensional Lebesgue measure $dx$ and the subspace $\cap_{j=1}^{3} [D(D_j) \cap D(V)]$ is dense in $\mathcal{H}_D$.
Under this hypothesis, $H_D$ is a symmetric operator. Detailed analysis of the Dirac operator is given in [11].
Example 2.1. A typical example for $V$ is
$$V_{\mathrm{em}} := \phi I_4 - q\alpha \cdot A^{\mathrm{ex}}$$
with $\phi : \mathbb{R}^3 \to \mathbb{R}$ an external scalar potential and $A^{\mathrm{ex}} := (A_1^{\mathrm{ex}}, A_2^{\mathrm{ex}}, A_3^{\mathrm{ex}}) : \mathbb{R}^3 \to \mathbb{R}^3$ an external vector potential, where $A_j^{\mathrm{ex}}$ and $\phi$ are in the set
$$L^2_{\mathrm{loc}}(\mathbb{R}^3) := \Bigl\{ f : \mathbb{R}^3 \to \mathbb{C};\ \text{Borel measurable} \Bigm| \int_{|x| \le R} |f(x)|^2\, dx < \infty,\ \forall R > 0 \Bigr\}.$$
Then $D(V_{\mathrm{em}}) \supset \oplus^4 C_0^\infty(\mathbb{R}^3)$, where $C_0^\infty(\mathbb{R}^3)$ is the set of $C^\infty$-functions on $\mathbb{R}^3$ with compact support. Hence $\cap_{j=1}^{3} [D(D_j) \cap D(V_{\mathrm{em}})]$ is dense. Thus $V_{\mathrm{em}}$ obeys Hypothesis (A).
2.2. The quantum radiation field
The Hilbert space of one-photon states in momentum representation is given by
$$\mathcal{H}_{\mathrm{ph}} := L^2(\mathbb{R}^3) \oplus L^2(\mathbb{R}^3), \tag{2.5}$$
where $\mathbb{R}^3 := \{k = (k_1, k_2, k_3) \mid k_j \in \mathbb{R},\ j = 1, 2, 3\}$ physically means the momentum space of photons. Then a Hilbert space for the quantum radiation field in the Coulomb gauge is given by
$$\mathcal{F}_{\mathrm{rad}} := \oplus_{n=0}^{\infty} \otimes_s^n \mathcal{H}_{\mathrm{ph}}, \tag{2.6}$$
the Boson Fock space over $\mathcal{H}_{\mathrm{ph}}$, where $\otimes_s^n \mathcal{H}_{\mathrm{ph}}$ denotes the $n$-fold symmetric tensor product of $\mathcal{H}_{\mathrm{ph}}$ and $\otimes_s^0 \mathcal{H}_{\mathrm{ph}} := \mathbb{C}$. For basic facts on the theory of the Boson Fock space, we refer the reader to [9, §X.7].
We denote by $a(F)$ ($F \in \mathcal{H}_{\mathrm{ph}}$) the annihilation operator with test vector $F$ on $\mathcal{F}_{\mathrm{rad}}$; its adjoint is given by
$$(a(F)^* \Psi)^{(n)} = \sqrt{n}\, S_n (F \otimes \Psi^{(n-1)}), \quad n \ge 0, \quad \Psi = \{\Psi^{(n)}\}_{n=0}^{\infty} \in D(a(F)^*),$$
where $S_n$ is the symmetrization operator on $\otimes^n \mathcal{H}_{\mathrm{ph}}$ and $\Psi^{(-1)} := 0$. For each $f \in L^2(\mathbb{R}^3)$, we define
$$a^{(1)}(f) := a(f, 0), \quad a^{(2)}(f) := a(0, f). \tag{2.7}$$
The mapping $f \mapsto a^{(r)}(f^*)$ restricted to $\mathcal{S}(\mathbb{R}^3)$ (the Schwartz space of rapidly decreasing $C^\infty$-functions on $\mathbb{R}^3$) defines an operator-valued distribution ($f^*$ denotes the complex conjugate of $f$). We denote its symbolical kernel by $a^{(r)}(k)$: $a^{(r)}(f) = \int a^{(r)}(k) f(k)^*\, dk$.
We take a nonnegative Borel measurable function $\omega$ on $\mathbb{R}^3$ to denote the one free photon energy. We assume that, for a.e. $k \in \mathbb{R}^3$ with respect to the Lebesgue measure on $\mathbb{R}^3$, $0 < \omega(k) < \infty$. Then the function $\omega$ defines uniquely a multiplication operator on $\mathcal{H}_{\mathrm{ph}}$ which is nonnegative, self-adjoint and injective. We denote it by the same symbol $\omega$. The free Hamiltonian of the quantum radiation field is then defined by
$$H_{\mathrm{rad}} := d\Gamma(\omega), \tag{2.8}$$
the second quantization of $\omega$ ([8, p. 302, Example 2] and [9, §X.7]). The operator $H_{\mathrm{rad}}$ is a nonnegative self-adjoint operator. The symbolical expression of $H_{\mathrm{rad}}$ is
$$H_{\mathrm{rad}} = \sum_{r=1}^{2} \int \omega(k)\, a^{(r)}(k)^* a^{(r)}(k)\, dk.$$
Remark 2.1. Usually $\omega$ is taken to be of the form $\omega_{\mathrm{phys}}(k) := |k|$, $k \in \mathbb{R}^3$, but, in this paper, for mathematical generality, we do not restrict ourselves to this case.
There exist $\mathbb{R}^3$-valued Borel measurable functions $e^{(r)}$ ($r = 1, 2$) on $\mathbb{R}^3$ such that, for a.e. $k$,
$$e^{(r)}(k) \cdot e^{(s)}(k) = \delta_{rs}, \quad e^{(r)}(k) \cdot k = 0, \quad r, s = 1, 2. \tag{2.9}$$
These vector-valued functions $e^{(r)}$ are called the polarization vectors of a photon. The time-zero quantum radiation field is given by $A(x) := (A_1(x), A_2(x), A_3(x))$ with
$$A_j(x) := \sum_{r=1}^{2} \int dk\, \frac{e_j^{(r)}(k)}{\sqrt{2(2\pi)^3 \omega(k)}} \bigl\{ a^{(r)}(k)^* e^{-ik \cdot x} + a^{(r)}(k) e^{ik \cdot x} \bigr\}, \quad j = 1, 2, 3, \tag{2.10}$$
in the sense of operator-valued distribution. Let $\varrho$ be a real tempered distribution on $\mathbb{R}^3$ such that
$$\frac{\hat{\varrho}}{\sqrt{\omega}},\ \frac{\hat{\varrho}}{\omega} \in L^2(\mathbb{R}^3), \tag{2.11}$$
where $\hat{\varrho}$ denotes the Fourier transform of $\varrho$. The quantum radiation field $A^{\varrho} := (A_1^{\varrho}, A_2^{\varrho}, A_3^{\varrho})$ with momentum cut-off $\hat{\varrho}$ is defined by
$$A_j^{\varrho}(x) := \sum_{r=1}^{2} \int dk\, \frac{e_j^{(r)}(k)}{\sqrt{2\omega(k)}} \bigl\{ a^{(r)}(k)^* e^{-ik \cdot x} \hat{\varrho}(k)^* + a^{(r)}(k) e^{ik \cdot x} \hat{\varrho}(k) \bigr\}. \tag{2.12}$$
Symbolically
$$A_j^{\varrho}(x) = \int A_j(x - y) \varrho(y)\, dy. \tag{2.13}$$
2.3. The Dirac–Maxwell operator
The Hilbert space of state vectors for the coupled system of the Dirac particle and the quantum radiation field is taken to be
$$\mathcal{F} := \mathcal{H}_D \otimes \mathcal{F}_{\mathrm{rad}}. \tag{2.14}$$
This Hilbert space can be identified as
$$\mathcal{F} = L^2(\mathbb{R}^3; \oplus^4 \mathcal{F}_{\mathrm{rad}}) = \int_{\mathbb{R}^3}^{\oplus} \oplus^4 \mathcal{F}_{\mathrm{rad}}\, dx, \tag{2.15}$$
the Hilbert space of $\oplus^4 \mathcal{F}_{\mathrm{rad}}$-valued Lebesgue square integrable functions on $\mathbb{R}^3$ (the constant fibre direct integral with base space $(\mathbb{R}^3, dx)$ and fibre $\oplus^4 \mathcal{F}_{\mathrm{rad}}$ [10, §XIII.6]). We freely use this identification. The total Hamiltonian of the coupled system (a particle-field Hamiltonian) is defined by
$$H := H_D + H_{\mathrm{rad}} - q\alpha \cdot A^{\varrho} = \alpha \cdot (-i\nabla - qA^{\varrho}) + m\beta + V + H_{\mathrm{rad}}. \tag{2.16}$$
We call $H$ a Dirac–Maxwell operator. The (essential) self-adjointness of $H$ is discussed in [3].
2.4. The Pauli–Fierz Hamiltonian with spin 1/2
A Hamiltonian which describes a quantum system of non-relativistic charged particles interacting with the quantum radiation field is called a Pauli–Fierz Hamiltonian [7]. Here we consider a non-relativistic charged particle with mass $m$, charge $q$ and spin 1/2. Suppose that the particle is in an external electromagnetic potential $(A^{\mathrm{ex}}, \phi)$, where $A^{\mathrm{ex}} := (A_1^{\mathrm{ex}}, A_2^{\mathrm{ex}}, A_3^{\mathrm{ex}}) : \mathbb{R}^3 \to \mathbb{R}^3$ and $\phi : \mathbb{R}^3 \to \mathbb{R}$ are Borel measurable and a.e. finite with respect to $dx$. Let
$$\sigma_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 := \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.17}$$
the Pauli spin matrices, and set
$$\sigma := (\sigma_1, \sigma_2, \sigma_3). \tag{2.18}$$
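As a quick numerical sanity check of (2.17), the following sketch (plain Python, not part of the original paper) verifies the relations $\sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk} I_2$ and, for a numerical vector $v \in \mathbb{R}^3$, the identity $(\sigma \cdot v)^2 = |v|^2 I_2$. The helper names `matmul`, `add`, `scale`, `close`, `pauli_relations_hold` and `sigma_dot` are illustrative choices made here. The second identity is checked only for commuting (numerical) components and says nothing about operators with non-commuting components.

```python
# Pauli matrices (2.17) as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    """True if all entries of a and b agree up to tol."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def pauli_relations_hold():
    """Check sigma_j sigma_k + sigma_k sigma_j = 2 delta_jk I_2."""
    for j in range(3):
        for k in range(3):
            anti = add(matmul(SIGMA[j], SIGMA[k]), matmul(SIGMA[k], SIGMA[j]))
            if not close(anti, scale(2 if j == k else 0, I2)):
                return False
    return True


def sigma_dot(v):
    """sigma . v for a numerical vector v = (v1, v2, v3)."""
    out = [[0, 0], [0, 0]]
    for vj, sj in zip(v, SIGMA):
        out = add(out, scale(vj, sj))
    return out
```

For example, `matmul(sigma_dot(v), sigma_dot(v))` can be compared with `scale(sum(x * x for x in v), I2)` using `close`.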
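Then the Pauli–Fierz Hamiltonian of this quantum system is defined by
$$H_{\mathrm{PF}} := \frac{\{\sigma \cdot (-i\nabla - qA^{\varrho} - qA^{\mathrm{ex}})\}^2}{2m} + \phi + H_{\mathrm{rad}} \tag{2.19}$$
acting in the Hilbert space
$$\mathcal{F}_{\mathrm{PF}} := L^2(\mathbb{R}^3; \mathbb{C}^2) \otimes \mathcal{F}_{\mathrm{rad}} = L^2(\mathbb{R}^3; \oplus^2 \mathcal{F}_{\mathrm{rad}}) = \int_{\mathbb{R}^3}^{\oplus} \oplus^2 \mathcal{F}_{\mathrm{rad}}\, dx. \tag{2.20}$$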
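For the Pauli–Fierz Hamiltonian without spin, see Appendix B.
3. Main Result
3.1. A Dirac operator coupled to the quantum radiation field
We use the following representation of $\alpha_j$ and $\beta$ [11, p. 3]:
$$\alpha_j := \begin{pmatrix} 0 & \sigma_j \\ \sigma_j & 0 \end{pmatrix}, \quad \beta := \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}, \tag{3.1}$$
where $I_2$ is the $2 \times 2$ identity matrix. Hence the eigenspaces $\mathcal{H}_D^{\pm}$ of $\beta$ with eigenvalue $\pm 1$ take the forms respectively
$$\mathcal{H}_D^{+} = \left\{ \begin{pmatrix} f \\ g \\ 0 \\ 0 \end{pmatrix} \Bigm| f, g \in L^2(\mathbb{R}^3) \right\}, \quad \mathcal{H}_D^{-} = \left\{ \begin{pmatrix} 0 \\ 0 \\ f \\ g \end{pmatrix} \Bigm| f, g \in L^2(\mathbb{R}^3) \right\} \tag{3.2}$$
and we have
$$\mathcal{H}_D = \mathcal{H}_D^{+} \oplus \mathcal{H}_D^{-}. \tag{3.3}$$
Let $P_{\pm}$ be the orthogonal projections onto $\mathcal{H}_D^{\pm}$. Then we have
$$V = V_0 + V_1 \tag{3.4}$$
with
$$V_0 = P_+ V P_+ + P_- V P_-, \quad V_1 = P_+ V P_- + P_- V P_+. \tag{3.5}$$
Note that
$$[V_0, \beta] = 0, \quad \{V_1, \beta\} = 0,$$
where $[A, B] := AB - BA$. In operator-matrix form relative to the orthogonal decomposition (3.3), we have
$$V_0 = \begin{pmatrix} U_+ & 0 \\ 0 & U_- \end{pmatrix}, \quad V_1 = \begin{pmatrix} 0 & W^* \\ W & 0 \end{pmatrix}, \tag{3.6}$$
where $U_{\pm}$ are $2 \times 2$ Hermitian matrix-valued functions on $\mathbb{R}^3$ and $W$ is a $2 \times 2$ complex matrix-valued function on $\mathbb{R}^3$. Let
$$\slashed{D}(V_1) := \alpha \cdot (-i\nabla - qA^{\varrho}) + V_1. \tag{3.7}$$
Then, recalling that $A_j^{\varrho}$ is $H_{\mathrm{rad}}^{1/2}$-bounded [3] by (2.11), we see that $\slashed{D}(V_1)$ is densely defined and symmetric with $D(\slashed{D}(V_1)) \supset (\cap_{j=1}^{3} [D(D_j) \cap D(V)]) \otimes_{\mathrm{alg}} D(H_{\mathrm{rad}}^{1/2})$, where $\otimes_{\mathrm{alg}}$ means algebraic tensor product.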
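The algebra behind (3.1)–(3.6) can be checked numerically at the level of constant $4 \times 4$ matrices. The sketch below (plain Python, not from the paper) builds $\alpha_j$ and $\beta$ in the representation (3.1), checks (2.3) and (2.4), and splits a sample Hermitian matrix $V$ as in (3.5). It assumes the formula $P_{\pm} = (I_4 \pm \beta)/2$ for the projections, which is not written in the text but follows from $\beta^2 = I_4$; all function names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))


SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]   # alpha_j as in (3.1)
BETA = block(I2, Z2, Z2, scale(-1, I2))        # beta as in (3.1)
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Assumption: P_+- = (I_4 +- beta)/2 are the spectral projections of beta.
P_PLUS = scale(0.5, add(I4, BETA))
P_MINUS = scale(0.5, add(I4, scale(-1, BETA)))


def clifford_relations_hold():
    """Check (2.3) and (2.4) for the matrices above."""
    zero = scale(0, I4)
    for j in range(3):
        for k in range(3):
            expected = scale(2 if j == k else 0, I4)
            if not close(anticommutator(ALPHA[j], ALPHA[k]), expected):
                return False
        if not close(anticommutator(ALPHA[j], BETA), zero):
            return False
    return close(matmul(BETA, BETA), I4)


def split_potential(v):
    """Return (V0, V1) as in (3.5) for a constant 4x4 matrix v."""
    v0 = add(matmul(matmul(P_PLUS, v), P_PLUS),
             matmul(matmul(P_MINUS, v), P_MINUS))
    v1 = add(matmul(matmul(P_PLUS, v), P_MINUS),
             matmul(matmul(P_MINUS, v), P_PLUS))
    return v0, v1
```

For a Hermitian sample matrix `v`, `split_potential(v)` returns a block-diagonal part commuting with `BETA` and an off-diagonal part anticommuting with it, in line with the note after (3.5).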
By (3.3), we have the following orthogonal decomposition of $\mathcal{F}$:
$$\mathcal{F} = \mathcal{F}_+ \oplus \mathcal{F}_-, \tag{3.8}$$
where
$$\mathcal{F}_{\pm} := \mathcal{H}_D^{\pm} \otimes \mathcal{F}_{\mathrm{rad}} \cong \mathcal{F}_{\mathrm{PF}}. \tag{3.9}$$
Relative to this orthogonal decomposition, we can write
$$\slashed{D}(V_1) = \begin{pmatrix} 0 & D_{W^*} \\ D_W & 0 \end{pmatrix}, \tag{3.10}$$
where
$$D_W := \sigma \cdot (-i\nabla - qA^{\varrho}) + W, \tag{3.11}$$
$$D_{W^*} := \sigma \cdot (-i\nabla - qA^{\varrho}) + W^* \tag{3.12}$$
acting in $\mathcal{F}_{\mathrm{PF}}$. For a closable linear operator $T$ on a Hilbert space, we denote its closure by $\bar{T}$ unless otherwise stated. Note that $D_W$ is densely defined as an operator on $\mathcal{F}_{\mathrm{PF}}$ and $(D_W)^* \supset D_{W^*}$. Hence $(D_W)^*$ is densely defined. Thus $D_W$ is closable. Based on this fact, we can define
$$\tilde{\slashed{D}}(V_1) := \begin{pmatrix} 0 & (\bar{D}_W)^* \\ \bar{D}_W & 0 \end{pmatrix}. \tag{3.13}$$
Lemma 3.1. Under Hypothesis (A), $\tilde{\slashed{D}}(V_1)$ is a self-adjoint extension of $\slashed{D}(V_1)$.
Proof. The self-adjointness of $\tilde{\slashed{D}}(V_1)$ follows from a general theorem (e.g. [11, p. 142, Lemma 5.3]). It is obvious that $\tilde{\slashed{D}}(V_1)|[D(D_W) \oplus D(D_{W^*})] = \slashed{D}(V_1)$, where, for a linear operator $T$ and a subspace $D \subset D(T)$, $T|D$ denotes the restriction of $T$ to $D$. Hence $\tilde{\slashed{D}}(V_1)$ is a self-adjoint extension of $\slashed{D}(V_1)$.
Remark 3.1. The operator
$$\hat{\slashed{D}}(V_1) := \begin{pmatrix} 0 & \bar{D}_{W^*} \\ (\bar{D}_{W^*})^* & 0 \end{pmatrix} \tag{3.14}$$
is also a self-adjoint extension of $\slashed{D}(V_1)$. But, for simplicity, we consider here only $\tilde{\slashed{D}}(V_1)$. Discussions on $\tilde{\slashed{D}}(V_1)$ presented below apply also to $\hat{\slashed{D}}(V_1)$ with suitable modifications.
3.2. A scaled Dirac–Maxwell operator
For a self-adjoint operator $A$, we denote the spectrum and the spectral measure of $A$ by $\sigma(A)$ and $E_A(\cdot)$ respectively. In the case where $A$ is bounded from below, we set
$$E_0(A) := \inf \sigma(A), \quad A_0 := A - E_0(A) \ge 0.$$
252
00162
A. Arai
Let Λ : (0, ∞) → (0, ∞) be a nondecreasing function such that Λ(κ) → ∞ as κ → ∞ and A be a self-adjoint operator on a Hilbert space. Then, for each κ > 0, we define A(κ) by EA0 ([0, Λ(κ)])A0 EA0 ([0, Λ(κ)]) + E0 (A) if A is bounded from below and E0 (A) < 0 A(κ) := E ([0, Λ(κ)])AE|A| ([0, Λ(κ)]) if A is nonnegative or A |A| is not bounded from below .
(3.15)
Then A(κ) is a bounded self-adjoint operator with kA(κ) k ≤ Λ(κ) .
(3.16)
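In finite dimensions the cut-off (3.15) acts on each eigenvalue separately, which makes it easy to illustrate. The sketch below (plain Python, not from the paper) takes the eigenvalues of a Hermitian matrix $A$ and a value `lam` standing for $\Lambda(\kappa)$, and returns the eigenvalues of $A^{(\kappa)}$ in the same eigenbasis. The function name `cutoff_eigenvalues` is hypothetical. Since a finite matrix is always bounded from below, the "not bounded from below" branch of (3.15) never arises here.

```python
def cutoff_eigenvalues(eigs, lam):
    """Eigenvalues of A^(kappa) from (3.15), given those of A and lam = Lambda(kappa)."""
    e0 = min(eigs)  # E_0(A) = inf sigma(A)
    if e0 < 0:
        # First case of (3.15): cut off A_0 = A - E_0(A) >= 0 on [0, lam],
        # then add E_0(A) back.
        return [((x - e0) if (x - e0) <= lam else 0.0) + e0 for x in eigs]
    # Second case of (3.15) for nonnegative A: keep the part of A where |A| <= lam.
    return [x if abs(x) <= lam else 0.0 for x in eigs]
```

Once `lam` exceeds the spread of the spectrum the cut-off leaves the eigenvalues unchanged, which is the finite-dimensional shadow of the convergence $A^{(\kappa)}\psi \to A\psi$.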
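Proposition 3.2. The following hold:
(i) For all $\psi \in D(A)$, $\text{s-}\lim_{\kappa \to \infty} A^{(\kappa)} \psi = A\psi$, where s-lim means strong limit.
(ii) For all $z \in \mathbb{C} \setminus \mathbb{R}$, $\text{s-}\lim_{\kappa \to \infty} (A^{(\kappa)} - z)^{-1} = (A - z)^{-1}$.
Proof. Part (i) follows from the functional calculus of $A$. Part (ii) follows from (i) and a general convergence theorem [8, p. 292, Theorem VIII.25(a)].
With this preliminary, we define for $\kappa > 0$ a scaled Dirac–Maxwell operator
$$H(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_{0,\kappa} + H_{\mathrm{rad}}^{(\kappa)}, \tag{3.17}$$
where
$$V_{0,\kappa} := \begin{pmatrix} U_+^{(\kappa)} & 0 \\ 0 & U_-^{(\kappa)} \end{pmatrix}. \tag{3.18}$$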
Some remarks may be in order on this definition. The parameter $\kappa$ in $H(\kappa)$ means the speed of light concerning the Dirac particle only. The speed of light related to the external potential $V = V_0 + V_1$ and the quantum radiation field $A^{\varrho}$ is absorbed in them respectively. The third term $-\kappa^2 m$ on the right hand side of (3.17) is a subtraction of the rest energy of the Dirac particle. Hence taking the scaling limit $\kappa \to \infty$ in $H(\kappa)$ in a suitable sense corresponds in fact to a partial non-relativistic limit of the quantum system under consideration. If one considers the non-relativistic limit in a way similar to the usual Dirac operator $H_D$, then one may define
$$\hat{H}(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_0 + H_{\mathrm{rad}} \tag{3.19}$$
as a scaled Dirac–Maxwell operator, where no cut-offs on $V_0$ and $H_{\mathrm{rad}}$ are made. In this form, however, we find that, besides the (essential) self-adjointness problem of $\hat{H}(\kappa)$, the methods used for the usual Dirac type operators ([11, Chapter 6] or those in [2]) do not seem to work. This is because of the existence of the operator $H_{\mathrm{rad}}$ in $\hat{H}(\kappa)$, which is singular as a perturbation of $H_0(\kappa) := \kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m + V_0$ (if one would try to apply the methods on scaling limits discussed in the cited literature, then one would have to treat $H_{\mathrm{rad}}$ as a perturbation of $H_0(\kappa)$). To avoid this difficulty, we replace $H_{\mathrm{rad}}$ in $\hat{H}(\kappa)$ by a bounded self-adjoint operator which is obtained by cutting off $H_{\mathrm{rad}}$. This is one of the basic ideas of the present paper. We apply the same idea to $V_0$, which also may be singular as a perturbation of $\kappa \tilde{\slashed{D}}(V_1) + \kappa^2 m\beta - \kappa^2 m$. In this way we arrive at Definition (3.17) of a scaled Dirac–Maxwell operator.
Lemma 3.3. Under Hypothesis (A), $H(\kappa)$ is self-adjoint with $D(H(\kappa)) = D(\tilde{\slashed{D}}(V_1))$.
Proof. The operator $\kappa^2 m\beta - \kappa^2 m + V_{0,\kappa} + H_{\mathrm{rad}}^{(\kappa)}$ is a bounded self-adjoint operator. Hence, by the Kato–Rellich theorem, the assertion follows.
3.3. Self-adjoint extension of the Pauli–Fierz Hamiltonian
Essential self-adjointness of the Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ given by (2.19) and its generalizations is discussed in [4, 5]. These papers show that, under additional conditions on $\hat{\varrho}$, $\omega$, $A^{\mathrm{ex}}$ and $\phi$, the Pauli–Fierz Hamiltonians are essentially self-adjoint. In the present paper, we do not intend to discuss the essential self-adjointness problem of the Pauli–Fierz type Hamiltonians. Instead, we define a self-adjoint extension of $H_{\mathrm{PF}}$, which may not have been known before. We define
$$H_{\mathrm{PF}}(\kappa; W, U_+) := \frac{(\bar{D}_W)^* \bar{D}_W}{2m} + U_+^{(\kappa)} + H_{\mathrm{rad}}^{(\kappa)}, \quad \kappa > 0, \tag{3.20}$$
acting in $\mathcal{F}_{\mathrm{PF}}$.
Lemma 3.4. Under Hypothesis (A), $H_{\mathrm{PF}}(\kappa; W, U_+)$ is self-adjoint and bounded from below.
Proof. By von Neumann's theorem (e.g. [9, p. 180, Theorem X.25]), the operator $(2m)^{-1} (\bar{D}_W)^* \bar{D}_W$ is self-adjoint and nonnegative. The operator $U_+^{(\kappa)} + H_{\mathrm{rad}}^{(\kappa)}$ is bounded and self-adjoint. Hence, by the Kato–Rellich theorem, $H_{\mathrm{PF}}(\kappa; W, U_+)$ is self-adjoint and bounded from below.
A generalization of the Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ is defined by
$$H_{\mathrm{PF}}(W, U_+) := \frac{D_{W^*} D_W}{2m} + U_+ + H_{\mathrm{rad}} \tag{3.21}$$
acting in $\mathcal{F}_{\mathrm{PF}}$. We formulate an additional condition:
Hypothesis (B). The function $U_+$ is bounded from below. In this case we set $u_0 := E_0(U_+)$.
Remark 3.2. Under Hypothesis (A), $D(H_{\mathrm{PF}}(W, U_+))$ is not necessarily dense in $\mathcal{F}_{\mathrm{PF}}$, but $D(\bar{D}_W) \cap D(U_+) \cap D(H_{\mathrm{rad}})$ is dense in $\mathcal{F}_{\mathrm{PF}}$. Hence $D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\mathrm{rad}}^{1/2})$ is also dense in $\mathcal{F}_{\mathrm{PF}}$. Therefore we can define a densely defined symmetric form $s_{\mathrm{PF}}$ as follows:
$$D(s_{\mathrm{PF}}) := D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\mathrm{rad}}^{1/2}) \quad \text{(form domain)}, \tag{3.22}$$
$$s_{\mathrm{PF}}(\Psi, \Phi) := \frac{1}{2m} (\bar{D}_W \Psi, \bar{D}_W \Phi) + (\Psi, U_+ \Phi) + (H_{\mathrm{rad}}^{1/2} \Psi, H_{\mathrm{rad}}^{1/2} \Phi), \tag{3.23}$$
$$\Psi, \Phi \in D(s_{\mathrm{PF}}). \tag{3.24}$$
Assume Hypothesis (B) in addition to Hypothesis (A). Then it is easy to see that $s_{\mathrm{PF}}$ is closed. Let $H_{\mathrm{PF}}^{(f)}$ be the self-adjoint operator associated with $s_{\mathrm{PF}}$. Then $H_{\mathrm{PF}}^{(f)} \ge u_0$ and $H_{\mathrm{PF}}^{(f)}$ is a self-adjoint extension of $H_{\mathrm{PF}}(W, U_+)$.
Theorem 3.5. Under Hypotheses (A) and (B), there exists a self-adjoint extension $\tilde{H}_{\mathrm{PF}}(W, U_+)$ of $H_{\mathrm{PF}}(W, U_+)$ which has the following properties:
(i) $\tilde{H}_{\mathrm{PF}}(W, U_+) \ge u_0$.
(ii) $D(|\tilde{H}_{\mathrm{PF}}(W, U_+)|^{1/2}) \subset D(\bar{D}_W) \cap D(|U_+|^{1/2}) \cap D(H_{\mathrm{rad}}^{1/2})$.
(iii) For all $z \in (\mathbb{C} \setminus \mathbb{R}) \cup \{\xi \in \mathbb{R} \mid \xi < u_0\}$,
$$\text{s-}\lim_{\kappa \to \infty} (H_{\mathrm{PF}}(\kappa; W, U_+) - z)^{-1} = (\tilde{H}_{\mathrm{PF}}(W, U_+) - z)^{-1},$$
where s-lim means strong limit.
(iv) For all $\xi < u_0$ and $\Psi \in D(|\tilde{H}_{\mathrm{PF}}(W, U_+)|^{1/2})$,
$$\text{s-}\lim_{\kappa \to \infty} (H_{\mathrm{PF}}(\kappa; W, U_+) - \xi)^{1/2} \Psi = (\tilde{H}_{\mathrm{PF}}(W, U_+) - \xi)^{1/2} \Psi.$$
Proof. We need only apply Theorem A.1 in Appendix A to the following case:
$$\mathcal{H} = \mathcal{F}_{\mathrm{PF}}, \quad N = 2, \quad A = \frac{(\bar{D}_W)^* \bar{D}_W}{2m}, \quad B_1 = U_+, \quad B_2 = H_{\mathrm{rad}}, \quad L = \Lambda.$$
Remark 3.3. As for conditions on $\hat{\varrho}$ and $\omega$ for Theorem 3.5 to hold, we only need condition (2.11); no additional condition is necessary.
Remark 3.4. In the same manner as in Theorem 3.5, we can define a self-adjoint extension of the Pauli–Fierz Hamiltonian without spin (see Appendix B).
Remark 3.5. Under Hypotheses (A) and (B), and assuming that $D(H_{\mathrm{PF}}(W, U_+))$ is dense, $H_{\mathrm{PF}}(W, U_+)$ is a symmetric operator bounded from below. Hence it has the Friedrichs extension $\hat{H}_{\mathrm{PF}}(W, U_+)$. But it is not clear whether, in the case where $H_{\mathrm{PF}}(W, U_+)$ is not essentially self-adjoint, $\tilde{H}_{\mathrm{PF}}(W, U_+) = \hat{H}_{\mathrm{PF}}(W, U_+)$ or $\tilde{H}_{\mathrm{PF}}(W, U_+) = H_{\mathrm{PF}}^{(f)}$ (Remark 3.2) or both of them do not hold.
3.4. Main theorems
We now state the main results on the non-relativistic limit of $H(\kappa)$.
Theorem 3.6. Let Hypotheses (A) and (B) be satisfied. Suppose that
$$\lim_{\kappa \to \infty} \frac{\Lambda(\kappa)^2}{\kappa} = 0. \tag{3.25}$$
Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (H(\kappa) - z)^{-1} = \begin{pmatrix} (\tilde{H}_{\mathrm{PF}}(W, U_+) - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.26}$$
In the case where $U_+$ is not necessarily bounded from below, we have the following.
Theorem 3.7. Let Hypothesis (A) and (3.25) be satisfied. Suppose that $H_{\mathrm{PF}}(W, U_+)$ is essentially self-adjoint. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (H(\kappa) - z)^{-1} = \begin{pmatrix} (H_{\mathrm{PF}}(W, U_+) - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.27}$$
Remark 3.6. Under additional conditions on $\varrho$, $\omega$, $W$ and $U_+$, one can prove that $H_{\mathrm{PF}}(W, U_+)$ is essentially self-adjoint for all values of the coupling constant $q$ [4, 5].
We now apply Theorems 3.6 and 3.7 to the case where $V = V_{\mathrm{em}} = \phi I_4 - q\alpha \cdot A^{\mathrm{ex}}$ (Example 2.1), i.e. the case where $W = -q\sigma \cdot A^{\mathrm{ex}}$ and $U_{\pm} = \phi I_2$. We assume the following.
Hypothesis (C)
(C.1) The subspace $\cap_{j=1}^{3} [D(D_j) \cap D(A_j^{\mathrm{ex}}) \cap D(\phi)]$ is dense in $L^2(\mathbb{R}^3)$.
(C.2) $\phi$ is bounded from below. In this case we set $\phi_0 := \inf \sigma(\phi)$.
Under Hypothesis (C), we have a self-adjoint operator
$$\tilde{H}_{\mathrm{PF}} := \tilde{H}_{\mathrm{PF}}(-q\sigma \cdot A^{\mathrm{ex}}, \phi), \tag{3.28}$$
which is a self-adjoint extension of the original Pauli–Fierz Hamiltonian $H_{\mathrm{PF}}$ given by (2.19). Let
$$H_{\mathrm{DM}}(\kappa) := \kappa \slashed{D}(-q\alpha \cdot A^{\mathrm{ex}}) + \kappa^2 m\beta - \kappa^2 m + \phi^{(\kappa)} + H_{\mathrm{rad}}^{(\kappa)}; \tag{3.29}$$
then $H_{\mathrm{DM}}(\kappa)$ is the Dirac–Maxwell operator $H(\kappa)$ with $V_1 = -q\alpha \cdot A^{\mathrm{ex}}$ and $V_0 = \phi I_4$. Theorems 3.6 and 3.7 immediately yield the following results on the non-relativistic limit of $H_{\mathrm{DM}}(\kappa)$.
Corollary 3.8. Let Hypothesis (C) and (3.25) be satisfied. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (H_{\mathrm{DM}}(\kappa) - z)^{-1} = \begin{pmatrix} (\tilde{H}_{\mathrm{PF}} - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.30}$$
Corollary 3.9. Assume (C.1) and (3.25). Suppose that $H_{\mathrm{PF}}$ is essentially self-adjoint. Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (H_{\mathrm{DM}}(\kappa) - z)^{-1} = \begin{pmatrix} (\bar{H}_{\mathrm{PF}} - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{3.31}$$
Thus a mathematically rigorous connection of relativistic QED to non-relativistic QED is established.
4. Limit Theorem on Strongly Anticommuting Self-Adjoint Operators
In this section we prove a limit theorem concerning strongly anticommuting self-adjoint operators. For a review of the fundamental abstract theory of strongly anticommuting self-adjoint operators, see [1].
Definition 4.1. Let $A$ and $B$ be self-adjoint operators on a Hilbert space $\mathcal{H}$.
(i) We say that $A$ and $B$ strongly commute if their spectral measures $E_A$ and $E_B$ commute (i.e. for all Borel sets $J, K \subset \mathbb{R}$, $E_A(J) E_B(K) = E_B(K) E_A(J)$).
(ii) We say that $A$ and $B$ strongly anticommute if, for all $\psi \in D(A)$ and $t \in \mathbb{R}$, $e^{-itB} \psi \in D(A)$ and $A e^{-itB} \psi = e^{itB} A \psi$ (i.e. $e^{itB} A \subset A e^{-itB}$).
Let $A \neq 0$ and $B$ be strongly anticommuting self-adjoint operators on a Hilbert space $\mathcal{H}$. We assume that $B$ is injective. For each $\kappa > 0$, we define
$$T_0(\kappa) := \kappa A + \kappa^2 (B - |B|). \tag{4.1}$$
The operator $\kappa A + \kappa^2 B$ is an abstract form of Dirac-type operators and $-\kappa^2 |B|$ is a "renormalization" term. It is shown that $T_0(\kappa)$ is essentially self-adjoint (Lemma 3.1 in [2]). We consider a perturbation of $T_0(\kappa)$. Let $C(\kappa)$ ($\kappa > 0$) be a symmetric operator on $\mathcal{H}$ and
$$T(\kappa) := T_0(\kappa) + C(\kappa). \tag{4.2}$$
The main purpose of this section is to consider the limit $\kappa \to \infty$ of $T(\kappa)$ in the strong resolvent sense under a general condition for $C(\kappa)$. A basic assumption for $C(\kappa)$ is as follows:
Hypothesis (I). $D(T_0(\kappa)) \subset D(C(\kappa))$ and $T(\kappa)$ is self-adjoint with $D(T(\kappa)) = D(T_0(\kappa))$.
To state the main result we need some preliminaries. Let $B = U_B |B|$ be the polar decomposition. Then $U_B$ is self-adjoint and unitary and $\sigma(U_B) = \{\pm 1\}$, where, for a linear operator $T$, $\sigma(T)$ denotes the spectrum of $T$ (see p. 141 in [2]). The operators
$$P_{\pm}^{B} := \frac{1}{2} (I \pm U_B), \tag{4.3}$$
00162
Non-Relativistic Limit of a Dirac–Maxwell Operator
257
are respectively the orthogonal projections onto the eigenspaces H± := ker(UB ∓ I)
(4.4)
of UB with eigenvalues ±1 and we have the orthogonal decomposition H = H+ ⊕ H− .
(4.5)
It is known that A and |B| strongly commute (Lemma 2.2(v) in [2]). Hence the product spectral measure E := EA ⊗ E|B| of A and |B| can be defined with spectral representations Z Z µdE(λ, µ) . λdE(λ, µ) , |B| = A= R2
R2
With the spectral measure E, we can define a nonnegative self-adjoint operator Z 1 λ2 K0 := dE(λ, µ) ≥ 0 . (4.6) 2 R2 µ Note that K0 =
A2 |B|−1 2
on D(A2 |B|−1 ) ∩ D(|B|−1 A2 ) .
(4.7)
It is shown that K0 is reduced by H± (see Lemma 2.4 in [2]). We denote K0,± the reduced part of K0 to H± respectively. Thus we have K0,+ 0 K0 = , (4.8) 0 K0,− where the operator-matrix representation is relative to the orthogonal decomposition (4.5): I 0 0 0 B B P+ = , P− = . (4.9) 0 0 0 I We define K(κ) := K0 + P+B C(κ)P+B .
(4.10)
Hypothesis (II). Let κ0 > 0 be a constant. (II.1) For all κ ≥ κ0 , C(κ) is reduced by H± so that it has the operator-matrix representation C+ (κ) 0 C(κ) = , (4.11) 0 C− (κ) where C± (κ) are the reduced parts of C(κ) to H± respectively. 1/2 (II.2) For all κ ≥ κ0 , D(K0 ) ⊂ D(C(κ)) and there exist nonnegative constants a(κ) and b(κ) such that 1/2
kC(κ)f k ≤ a(κ)kK0 f k + b(κ)kf k ,
1/2
f ∈ D(K0 ) .
(4.12)
Lemma 4.2. Let Hypothesis (II) be satisfied and let
$$K_+(\kappa) := K_{0,+} + C_+(\kappa). \tag{4.13}$$
Then, for all $\kappa \ge \kappa_0$, $K(\kappa)$ is self-adjoint with $D(K(\kappa)) = D(K_0)$ and bounded from below. Moreover, $K(\kappa)$ is reduced by $\mathcal{H}_{\pm}$ with
$$K(\kappa) = K_+(\kappa) \oplus K_{0,-} = \begin{pmatrix} K_+(\kappa) & 0 \\ 0 & K_{0,-} \end{pmatrix}. \tag{4.14}$$
Proof. By (II.2), $D(K_0) \subset D(C(\kappa)) \subset D(P_+^{B} C(\kappa) P_+^{B})$. Hence $D(K(\kappa)) = D(K_0)$. Let $f \in D(K_0)$. Then we have for all $\varepsilon > 0$,
$$\|K_0^{1/2} f\|^2 \le \|f\| \|K_0 f\| \le \varepsilon^2 \|K_0 f\|^2 + \frac{\|f\|^2}{4\varepsilon^2}.$$
Hence
$$\|K_0^{1/2} f\| \le \varepsilon \|K_0 f\| + \frac{\|f\|}{2\varepsilon}. \tag{4.15}$$
This estimate and (4.12) imply
$$\|C(\kappa) f\| \le a(\kappa) \varepsilon \|K_0 f\| + \left( \frac{a(\kappa)}{2\varepsilon} + b(\kappa) \right) \|f\|. \tag{4.16}$$
By the reducibility of $C(\kappa)$ by $\mathcal{H}_{\pm}$, we have $\|P_+^{B} C(\kappa) P_+^{B} f\| \le \|C(\kappa) f\|$. Since $\varepsilon > 0$ is arbitrary, it follows from the Kato–Rellich theorem that $K(\kappa)$ is self-adjoint and bounded from below. The last assertion is easy to prove.
Hypothesis (III). Under Hypothesis (II) (so that, by Lemma 4.2, for all $\kappa \ge \kappa_0$, $K_+(\kappa)$ is self-adjoint), there exists a self-adjoint operator $K_+$ on $\mathcal{H}_+$ such that, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (K_+(\kappa) - z)^{-1} = (K_+ - z)^{-1}. \tag{4.17}$$
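The main result of this section is the following:
Theorem 4.3. Assume Hypotheses (I)–(III). Suppose that
$$\lim_{\kappa \to \infty} \frac{a(\kappa)^3}{\kappa} = 0, \quad \lim_{\kappa \to \infty} \frac{b(\kappa)^2}{\kappa} = 0, \quad \lim_{\kappa \to \infty} \frac{a(\kappa)^2 b(\kappa)}{\kappa} = 0 \tag{4.18}$$
and
$$M := \inf \sigma(|B|) > 0. \tag{4.19}$$
Then, for all $z \in \mathbb{C} \setminus \mathbb{R}$,
$$\text{s-}\lim_{\kappa \to \infty} (T(\kappa) - z)^{-1} = \begin{pmatrix} (K_+ - z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}. \tag{4.20}$$
We prove Theorem 4.3 by a series of lemmas.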
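A finite-dimensional toy case may help to see what (4.20) asserts. Take $\mathcal{H} = \mathbb{C}^2$, $A = a\sigma_1$, $B = b\sigma_3$ with $b > 0$, and $C(\kappa) = 0$; these choices are made here for illustration and are not taken from the paper. Then $|B| = bI$, $U_B = \sigma_3$, $\mathcal{H}_+$ is spanned by the first basis vector, $K_0 = a^2/(2b)$ times the identity, and $T(\kappa) = T_0(\kappa)$ is the matrix with rows $(0, \kappa a)$ and $(\kappa a, -2\kappa^2 b)$. The sketch below (plain Python, hypothetical function names) computes the resolvent of this matrix in closed form and compares it with the right-hand side of (4.20).

```python
def resolvent_T0(a, b, kappa, z):
    """(T0(kappa) - z)^{-1} for A = a*sigma_1, B = b*sigma_3 with b > 0."""
    # T0(kappa) - z = [[-z, kappa*a], [kappa*a, -2*kappa^2*b - z]]
    m11 = -z
    m12 = kappa * a
    m22 = -2 * kappa ** 2 * b - z
    det = m11 * m22 - m12 * m12
    # Inverse of a symmetric 2x2 matrix via the adjugate formula.
    return [[m22 / det, -m12 / det], [-m12 / det, m11 / det]]


def limit_resolvent(a, b, z):
    """Right-hand side of (4.20) in the toy case: K_+ = a^2/(2b) on H_+."""
    k_plus = a * a / (2 * b)
    return [[1 / (k_plus - z), 0], [0, 0]]


def max_entry_error(a, b, kappa, z):
    """Largest entrywise distance between the resolvent and its claimed limit."""
    r = resolvent_T0(a, b, kappa, z)
    lim = limit_resolvent(a, b, z)
    return max(abs(r[i][j] - lim[i][j]) for i in range(2) for j in range(2))
```

In this example the off-diagonal entries of the resolvent decay like $1/\kappa$ and the lower-right entry like $1/\kappa^2$, so `max_entry_error` shrinks as `kappa` grows.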
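In what follows, we assume (4.19). Then $|B|^{-1}$ is bounded with
$$\||B|^{-1}\| \le \frac{1}{M}. \tag{4.21}$$
For $z \in \mathbb{C} \setminus \mathbb{R}$, we define
$$K(\kappa, z) := K(\kappa) - z - \frac{z^2}{2\kappa^2} |B|^{-1} \tag{4.22}$$
and set
$$d(\kappa, z) := \frac{|z|^2}{2\kappa^2 M |\mathrm{Im}\, z|} > 0. \tag{4.23}$$
Lemma 4.4. Assume Hypothesis (II) and (4.19). Let $z \in \mathbb{C} \setminus \mathbb{R}$, $\kappa \ge \kappa_0$ and
$$L(\kappa, z) := 1 - \frac{z^2}{2\kappa^2} |B|^{-1} (K(\kappa) - z)^{-1}. \tag{4.24}$$
Let
$$d(\kappa, z) < 1. \tag{4.25}$$
Then the following statements hold:
(i) $L(\kappa, z)$ is bijective with
$$L(\kappa, z)^{-1} = \sum_{n=0}^{\infty} \left( \frac{z^2}{2\kappa^2} \right)^{n} \bigl( |B|^{-1} (K(\kappa) - z)^{-1} \bigr)^{n} \tag{4.26}$$
in operator norm topology and
$$\|L(\kappa, z)^{-1}\| \le \frac{1}{1 - d(\kappa, z)}. \tag{4.27}$$
(ii) $K(\kappa, z)$ is bijective and
$$K(\kappa, z)^{-1} = (K(\kappa) - z)^{-1} L(\kappa, z)^{-1} \tag{4.28}$$
$$= \sum_{n=0}^{\infty} \left( \frac{z^2}{2\kappa^2} \right)^{n} (K(\kappa) - z)^{-1} \bigl( |B|^{-1} (K(\kappa) - z)^{-1} \bigr)^{n} \tag{4.29}$$
in operator norm topology with
$$\|K(\kappa, z)^{-1}\| \le r(\kappa, z), \tag{4.30}$$
where
$$r(\kappa, z) := \frac{1}{|\mathrm{Im}\, z| (1 - d(\kappa, z))}.$$
Proof. (i) We have by (4.21)
$$\left\| \frac{z^2}{2\kappa^2} |B|^{-1} (K(\kappa) - z)^{-1} \right\| \le d(\kappa, z) < 1. \tag{4.31}$$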
Hence the bijectivity of $L(\kappa, z)$ follows with the Neumann expansion (4.26). Inequality (4.27) follows from the general fact that, for all bounded linear operators $T$ with $\|T\| < 1$, $\|(1 - T)^{-1}\| \le (1 - \|T\|)^{-1}$.
(ii) We have $K(\kappa, z) = L(\kappa, z)(K(\kappa) - z)$, which implies that $K(\kappa, z)$ is bijective with (4.28). Expansion (4.29) follows from (4.28) and (4.26). Using (4.27) and (4.28), we obtain (4.30).
The following fact is an important key to the analysis here.
Theorem 4.5. Assume Hypotheses (I), (II) and (4.19). Let $z \in \mathbb{C} \setminus \mathbb{R}$ and $d(\kappa, z) < 1$ with $\kappa \ge \kappa_0$. Then the operator $1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1}$ is bijective and
$$(T(\kappa) - z)^{-1} = \left( P_+^{B} + \frac{1}{2\kappa^2} (\kappa A + z) |B|^{-1} \right) K(\kappa, z)^{-1} \times \left( 1 + \frac{C(\kappa)}{2\kappa^2} (\kappa A + z) |B|^{-1} K(\kappa, z)^{-1} \right)^{-1}. \tag{4.32}$$
Proof. Informal (heuristic) manipulations to obtain (4.32) are similar to the case of an abstract Dirac operator [11, p. 180, Theorem 6.4] or to a case previously discussed by the present author [2, p. 155, Theorem 4.3]. But, for completeness (since the assumption here is slightly different from those in [2, 11]), we give an outline of the proof. Introducing an operator
$$W(\kappa, z) := 1 + C(\kappa) (T_0(\kappa) - z)^{-1},$$
which is well-defined by Hypothesis (I), we have
$$T(\kappa) - z = W(\kappa, z) (T_0(\kappa) - z).$$
This implies that $W(\kappa, z)$ is bijective and
$$(T(\kappa) - z)^{-1} = (T_0(\kappa) - z)^{-1} W(\kappa, z)^{-1}.$$
On the other hand, we have
$$(T_0(\kappa) - z)^{-1} = \frac{1}{2\kappa^2} (S_0(\kappa) + z) |B|^{-1} K_0(\kappa, z)^{-1}, \tag{4.33}$$
where
$$S_0(\kappa) := \kappa A + \kappa^2 (B + |B|), \quad K_0(\kappa, z) := K_0 - z - \frac{z^2}{2\kappa^2} |B|^{-1} = K(\kappa, z) - P_+^{B} C(\kappa) P_+^{B},$$
see [2, (3.17) and (3.18)]. Hence
$$(T(\kappa) - z)^{-1} = \frac{1}{2\kappa^2} (S_0(\kappa) + z) |B|^{-1} K_0(\kappa, z)^{-1} W(\kappa, z)^{-1}. \tag{4.34}$$
Let
$$X(\kappa, z) := 1 + P_+^{B} C(\kappa) P_+^{B} K_0(\kappa, z)^{-1}.$$
May 26, 2003 16:37 WSPC/148-RMP
00162
Non-Relativistic Limit of a Dirac–Maxwell Operator
261
Using (4.33), we have C(κ) (κA + z)|B|−1 K0 (κ, z)−1 , 2κ2 where we have used that B + |B| = 2P+B |B| and C(κ)P+B = P+B C(κ)P+B . Note that W (κ, z) = X(κ, z) +
K(κ, z) = X(κ, z)K0 (κ, z) . This implies that X(κ, z) is bijective with X(κ, z)−1 = K0 (κ, z)K(κ, z)−1 . Hence we obtain
C(κ) −1 −1 −1 X(κ, z) (κA + z)|B| K (κ, z) X(κ, z) 0 2κ2 C(κ) −1 −1 X(κ, z) , (κA + z)|B| K(κ, z) = 1+ 2κ2
W (κ, z) =
1+
which implies that
Y (κ, z) := 1 +
C(κ) (κA + z)|B|−1 K(κ, z)−1 2κ2
is also bijective with W (κ, z)−1 = X(κ, z)−1 Y (κ, z)−1 = K0 (κ, z)K(κ, z)−1 Y (κ, z)−1 . Putting this equation into (4.34), we obtain (4.32). Lemma 4.6. Assume Hypothesis (II) and (4.19). Let ε > 0. Then, for all f ∈ D(K0 ), εa(κ) 1 a(κ) kC(κ)|B|−1 f k ≤ kK0 f k + + b(κ) kf k . (4.35) M M 2ε Proof. We see by functional calculus that, for all f ∈ D(K0 ), |B|−1 f ∈ D(K0 ) and K0 |B|−1 f = |B|−1 K0 f . Using this fact, (4.16) and (4.21), we obtain (4.35). Lemma 4.7. Assume (4.19). Then D(K0 ) ⊂ D(A|B|−1 ) and kA|B|−1 f k ≤ εkK0 f k +
1 kf k , εM
f ∈ D(K0 ) ,
where ε > 0 is arbitrary. Proof. Let g ∈ D := D(A2 |B|−1 ) ∩ D(|B|−1 A2 ), we have kA|B|−1 gk2 = 2(|B|−1 g, K0 g) ≤
2kgk 1 kK0 gk ≤ ε2 kK0 gk2 + 2 2 kgk2 , M ε M
where ε > 0 is arbitrary. Hence kA|B|−1 gk ≤ εkK0 gk +
1 kgk . εM
(4.36)
Since $\mathcal{D}$ is a core of $K_0$ (p. 143, Lemma 2.4 in [2]) and $|B|^{-1}$ is bounded, the assertion follows from a limiting argument.

Lemma 4.8. Assume Hypothesis (II) and (4.19). Then $D(K_0)\subset D(C(\kappa)A|B|^{-1})$ and
$$\|C(\kappa)A|B|^{-1}f\| \le \left(\frac{\sqrt{2}\,a(\kappa)}{\sqrt{M}}+\varepsilon b(\kappa)\right)\|K_0f\|+\frac{b(\kappa)}{\varepsilon M}\|f\|, \qquad f\in D(K_0), \qquad (4.37)$$
where $\varepsilon>0$ is arbitrary.

Proof. Let $f\in D(K_0)$. Then it follows from the functional calculus on the product spectral measure $E$ and (4.12) that $f\in D(K_0^{1/2}A|B|^{-1})\subset D(C(\kappa)A|B|^{-1})$ and
$$\|C(\kappa)A|B|^{-1}f\| \le a(\kappa)\|K_0^{1/2}A|B|^{-1}f\|+b(\kappa)\|A|B|^{-1}f\| = a(\kappa)\|\sqrt{2}\,|B|^{-1/2}K_0f\|+b(\kappa)\|A|B|^{-1}f\| \le \frac{\sqrt{2}\,a(\kappa)}{\sqrt{M}}\|K_0f\|+b(\kappa)\|A|B|^{-1}f\|.$$
This estimate and (4.36) give (4.37).

Lemma 4.9. Assume Hypothesis (II) and (4.19). Let $\delta>0$ be a constant such that $a(\kappa)\delta<1$. Then, for all $f\in D(K_0)$ and $\kappa\ge\kappa_0$,
$$\|K_0f\| \le \frac{1}{1-a(\kappa)\delta}\|K(\kappa,z)f\|+\frac{1}{1-a(\kappa)\delta}\left(|z|+\frac{|z|^2}{2\kappa^2M}+\frac{a(\kappa)}{2\delta}+b(\kappa)\right)\|f\|. \qquad (4.38)$$

Proof. Using (4.16), we have
$$\|K_0f\| \le \|K(\kappa)f\|+\|C(\kappa)P_+^Bf\| \le \|K(\kappa)f\|+a(\kappa)\delta\|K_0f\|+\left(\frac{a(\kappa)}{2\delta}+b(\kappa)\right)\|f\|,$$
where $\delta>0$ is arbitrary. Taking $\delta>0$ such that $a(\kappa)\delta<1$, we obtain
$$\|K_0f\| \le \frac{1}{1-a(\kappa)\delta}\|K(\kappa)f\|+\frac{1}{1-a(\kappa)\delta}\left(\frac{a(\kappa)}{2\delta}+b(\kappa)\right)\|f\|.$$
On the other hand, we have
$$\|K(\kappa)f\| \le \|K(\kappa,z)f\|+\left(|z|+\frac{|z|^2}{2\kappa^2M}\right)\|f\|.$$
Thus (4.38) follows.
(4.39)
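The rearrangement in the proof of Lemma 4.9 is the usual absorption of a small multiple of the left-hand side. As a sketch, with the shorthand $x$, $y$, $\theta$, $c$ introduced here only for illustration (these symbols do not appear in the text), it reads as follows; the first implication uses that $x$ is finite, which holds because $f\in D(K_0)$.

```latex
% Shorthand (assumed, for illustration only):
%   x = \|K_0 f\|,  y = \|K(\kappa) f\|,  \theta = a(\kappa)\delta \in [0,1),
%   c = \bigl(a(\kappa)/(2\delta) + b(\kappa)\bigr)\|f\|.
\begin{align*}
x \le y + \theta x + c
  \;&\Longrightarrow\; (1-\theta)\,x \le y + c
  \;\Longrightarrow\; x \le \frac{1}{1-\theta}\,y + \frac{1}{1-\theta}\,c , \\
y \le \|K(\kappa,z)f\| + \Bigl(|z| + \frac{|z|^{2}}{2\kappa^{2}M}\Bigr)\|f\|
  \;&\Longrightarrow\;
  x \le \frac{1}{1-\theta}\,\|K(\kappa,z)f\|
      + \frac{1}{1-\theta}\Bigl(|z| + \frac{|z|^{2}}{2\kappa^{2}M}
      + \frac{a(\kappa)}{2\delta} + b(\kappa)\Bigr)\|f\| .
\end{align*}
```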
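Lemma 4.10. Assume Hypothesis (II), (4.19) and (4.25). Let $\delta>0$ be a constant such that $a(\kappa)\delta<1$ and $\varepsilon>0$. Let
$$G_1(\kappa,z,\varepsilon,\delta) := \frac{\varepsilon a(\kappa)}{M(1-a(\kappa)\delta)}\left[1+r(\kappa,z)\left(|z|+\frac{|z|^2}{2\kappa^2M}+\frac{a(\kappa)}{2\delta}+b(\kappa)\right)\right]+r(\kappa,z)\left(\frac{a(\kappa)}{2\varepsilon M}+\frac{b(\kappa)}{M}\right). \qquad (4.40)$$
Then $C(\kappa)|B|^{-1}K(\kappa,z)^{-1}$ is bounded with
$$\|C(\kappa)|B|^{-1}K(\kappa,z)^{-1}\| \le G_1(\kappa,z,\varepsilon,\delta). \qquad (4.41)$$

Proof. This follows from Lemma 4.6 and Lemma 4.9.

Lemma 4.11. Assume Hypothesis (II), (4.19) and (4.25). Let $\delta>0$ be a constant such that $a(\kappa)\delta<1$ and $\varepsilon>0$. Let
$$G_2(\kappa,z,\varepsilon,\delta) := \frac{1}{1-a(\kappa)\delta}\left(\frac{\sqrt{2}\,a(\kappa)}{\sqrt{M}}+\varepsilon b(\kappa)\right)\left[1+r(\kappa,z)\left(|z|+\frac{|z|^2}{2\kappa^2M}+\frac{a(\kappa)}{2\delta}+b(\kappa)\right)\right]+\frac{r(\kappa,z)b(\kappa)}{\varepsilon M}. \qquad (4.42)$$
Then $C(\kappa)A|B|^{-1}K(\kappa,z)^{-1}$ is bounded with
$$\|C(\kappa)A|B|^{-1}K(\kappa,z)^{-1}\| \le G_2(\kappa,z,\varepsilon,\delta). \qquad (4.43)$$

Proof. This follows from Lemma 4.8 and Lemma 4.9.

Lemma 4.12. Assume Hypotheses (II), (III) and (4.19). Then
$$\operatorname*{s-lim}_{\kappa\to\infty}P_+^BK(\kappa,z)^{-1} = \begin{pmatrix}(K_+-z)^{-1} & 0\\ 0 & 0\end{pmatrix}. \qquad (4.44)$$

Proof. Let $K := K_+\oplus K_{0,+}$. By Lemma 4.4, we have
$$K(\kappa,z)^{-1} = (K(\kappa)-z)^{-1}+(K(\kappa)-z)^{-1}V(\kappa)$$
with
$$V(\kappa) := \sum_{n=1}^{\infty}\left(\frac{z^2}{2\kappa^2}\right)^{n}\left(|B|^{-1}(K(\kappa)-z)^{-1}\right)^{n}.$$
Hence
$$K(\kappa,z)^{-1}-(K-z)^{-1} = (K(\kappa)-z)^{-1}-(K-z)^{-1}+(K(\kappa)-z)^{-1}V(\kappa).$$
It is easy to see that $\|V(\kappa)\|\to0$ as $\kappa\to\infty$. By Hypothesis (III), we have
$$\operatorname*{s-lim}_{\kappa\to\infty}(K(\kappa)-z)^{-1} = (K-z)^{-1}.$$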
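Hence
$$\operatorname*{s-lim}_{\kappa\to\infty}K(\kappa,z)^{-1} = (K-z)^{-1},$$
which implies that
$$\operatorname*{s-lim}_{\kappa\to\infty}P_+^BK(\kappa,z)^{-1} = P_+^B(K-z)^{-1} = \begin{pmatrix}(K_+-z)^{-1} & 0\\ 0 & 0\end{pmatrix}.$$
Thus (4.44) holds.

Proof of Theorem 4.3. By Lemmas 4.10 and 4.11, we have
$$\left\|\frac{C(\kappa)}{2\kappa^2}(\kappa A+z)|B|^{-1}K(\kappa,z)^{-1}\right\| \le \frac{G_2(\kappa,z,\varepsilon,\delta)}{2\kappa}+\frac{|z|}{2\kappa^2}G_1(\kappa,z,\varepsilon,\delta).$$
Let $0<\alpha<1$ be fixed and set $\delta = \alpha/a(\kappa)$ so that $a(\kappa)\delta = \alpha<1$. Let $\kappa_1>0$ be a constant such that $d(\kappa_1,z)<1$ and $\kappa_1\ge\max\{\kappa_0,1\}$. Let $\kappa\ge\kappa_1$. Then
$$G_1(\kappa,z,\varepsilon,\delta) \le C_1[a(\kappa)+a(\kappa)^3+a(\kappa)b(\kappa)+b(\kappa)],$$
$$G_2(\kappa,z,\varepsilon,\delta) \le C_2[a(\kappa)+a(\kappa)^3+a(\kappa)b(\kappa)+b(\kappa)+b(\kappa)a(\kappa)^2+b(\kappa)^2],$$
where $C_1$ and $C_2$ are constants independent of $\kappa\ge\kappa_1$. Hence, under condition (4.18), we have
$$\lim_{\kappa\to\infty}\frac{G_1(\kappa,z,\varepsilon,\delta)}{\kappa^2} = 0, \qquad \lim_{\kappa\to\infty}\frac{G_2(\kappa,z,\varepsilon,\delta)}{\kappa} = 0.$$
Hence
$$\lim_{\kappa\to\infty}\left\|\frac{C(\kappa)}{2\kappa^2}(\kappa A+z)|B|^{-1}K(\kappa,z)^{-1}\right\| = 0,$$
which implies that
$$\lim_{\kappa\to\infty}\left(1+\frac{C(\kappa)}{2\kappa^2}(\kappa A+z)|B|^{-1}K(\kappa,z)^{-1}\right)^{-1} = 1 \qquad (4.45)$$
in operator-norm topology. By Lemmas 4.7 and 4.9, we have
$$\|A|B|^{-1}K(\kappa,z)^{-1}\| \le \frac{\varepsilon}{1-a(\kappa)\delta}+\frac{r(\kappa,z)\varepsilon}{1-a(\kappa)\delta}\left(|z|+\frac{|z|^2}{2\kappa^2M}+\frac{a(\kappa)}{2\delta}+b(\kappa)\right)+\frac{r(\kappa,z)}{\varepsilon M}.$$
Hence, in the same way as above, we can show that
$$\lim_{\kappa\to\infty}\frac{1}{2\kappa^2}(\kappa A+z)|B|^{-1}K(\kappa,z)^{-1} = 0$$
in operator-norm topology. These facts together with Theorem 4.5 and Lemma 4.12 imply (4.20).

Remark 4.5. Higher order corrections to the limiting formula (4.20) can be computed by using Theorem 4.5 and (4.29).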
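As an illustration of Remark 4.5, the first correction to $K(\kappa,z)^{-1}$ can be read off from (4.29). The sketch below writes $R := (K(\kappa)-z)^{-1}$ (a shorthand not used in the text) and assumes, as the form of $r(\kappa,z)$ suggests, that $\|R\|\le 1/|\mathrm{Im}\,z|$. The remainder is bounded with (4.31).

```latex
% R := (K(\kappa)-z)^{-1} is shorthand introduced for this sketch.
\begin{align*}
K(\kappa,z)^{-1}
 &= R + \frac{z^{2}}{2\kappa^{2}}\,R\,|B|^{-1}R
    + \sum_{n=2}^{\infty}\Bigl(\frac{z^{2}}{2\kappa^{2}}\Bigr)^{n} R\,\bigl(|B|^{-1}R\bigr)^{n}, \\
\Bigl\|\sum_{n=2}^{\infty}\Bigl(\frac{z^{2}}{2\kappa^{2}}\Bigr)^{n} R\,\bigl(|B|^{-1}R\bigr)^{n}\Bigr\|
 &\le \|R\|\sum_{n=2}^{\infty} d(\kappa,z)^{n}
  = \|R\|\,\frac{d(\kappa,z)^{2}}{1-d(\kappa,z)}
  \le r(\kappa,z)\,d(\kappa,z)^{2} .
\end{align*}
```

How fast this remainder vanishes as $\kappa\to\infty$ depends on how $d(\kappa,z)$ depends on $\kappa$, which is fixed earlier in Sec. 4.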
5. Proof of the Main Theorems

5.1. Proof of Theorem 3.6

We apply Theorem 4.3. For this purpose, we first prove the following lemma.

Lemma 5.1. The self-adjoint operator $\tilde{\slashed{D}}(V_1)$ strongly anticommutes with $m\beta$.

Proof. We have for all $t\in\mathbb{R}$
$$e^{-itm\beta} = \begin{pmatrix}e^{-itm}I_2 & 0\\ 0 & e^{itm}I_2\end{pmatrix}.$$
This implies that, for all $\Psi\in D(\tilde{\slashed{D}}(V_1)) = D(\bar{D}_W)\oplus D((\bar{D}_W)^*)$, $e^{-itm\beta}\Psi\in D(\tilde{\slashed{D}}(V_1))$ and $\tilde{\slashed{D}}(V_1)e^{-itm\beta}\Psi = e^{itm\beta}\tilde{\slashed{D}}(V_1)\Psi$. Hence $\tilde{\slashed{D}}(V_1)$ strongly anticommutes with $m\beta$.

Let
$$A = \tilde{\slashed{D}}(V_1), \qquad B = m\beta, \qquad C(\kappa) = V_{0,\kappa}+H_{\mathrm{rad}}^{(\kappa)}.$$
Then $|B| = m$ and we can write
$$H(\kappa) = \kappa A+\kappa^2(B-|B|)+C(\kappa).$$
By Lemma 5.1, $A$ and $B$ strongly anticommute. Hence $H(\kappa)$ is of the form $T(\kappa)$ in Sec. 4. We need only to check that $T(\kappa) = H(\kappa)$ satisfies the assumption of Theorem 4.3. Since $C(\kappa)$ is bounded, Hypothesis (I) holds. In the present case we have $P_\pm^B = P_\pm$ and $C(\kappa)$ is reduced by $\mathcal{F}_\pm$ with
$$C(\kappa) = \begin{pmatrix}U_+^{(\kappa)}+H_{\mathrm{rad}}^{(\kappa)} & 0\\ 0 & U_-^{(\kappa)}+H_{\mathrm{rad}}^{(\kappa)}\end{pmatrix}. \qquad (5.1)$$
Hence Hypothesis (II.1) holds. In the present case we have
$$K_0 = \frac{\tilde{\slashed{D}}(V_1)^2}{2m} = \begin{pmatrix}\dfrac{(\bar{D}_W)^*\bar{D}_W}{2m} & 0\\ 0 & \dfrac{\bar{D}_W(\bar{D}_W)^*}{2m}\end{pmatrix}. \qquad (5.2)$$
By (3.16), $\|C(\kappa)\Psi\|\le2\Lambda(\kappa)\|\Psi\|$ for all $\Psi\in\mathcal{F}$. Hence Hypothesis (II.2) holds with
$$a(\kappa) = 0, \qquad b(\kappa) = 2\Lambda(\kappa). \qquad (5.3)$$
By (5.1) and (5.2), we have
$$K_+(\kappa) = H_{\mathrm{PF}}(\kappa;W,U_+).$$
By Theorem 3.5, Hypothesis (III) holds with $K_+ = \tilde{H}_{\mathrm{PF}}(W,U_+)$. By (5.3) and (3.25), (4.18) holds. Thus the assumption of Theorem 4.3 is satisfied. Hence we can apply Theorem 4.3 to obtain (3.26).
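The anticommutation in Lemma 5.1 and the form of $H(\kappa)$ can be checked at the level of $2\times2$ block matrices. The sketch below assumes the off-diagonal block form of $A$ suggested by the domain decomposition in the proof of Lemma 5.1, writes $D$ for $\bar{D}_W$, suppresses the factors $I_2$, takes $m>0$, and ignores domain questions.

```latex
% Assumed block forms with respect to the decomposition reduced by P_+ and P_-:
\[
A = \begin{pmatrix} 0 & D^{*} \\ D & 0 \end{pmatrix}, \qquad
e^{-itm\beta} = \begin{pmatrix} e^{-itm} & 0 \\ 0 & e^{itm} \end{pmatrix}.
\]
\begin{align*}
A\,e^{-itm\beta}
 &= \begin{pmatrix} 0 & e^{itm}D^{*} \\ e^{-itm}D & 0 \end{pmatrix}
  = \begin{pmatrix} e^{itm} & 0 \\ 0 & e^{-itm} \end{pmatrix}
    \begin{pmatrix} 0 & D^{*} \\ D & 0 \end{pmatrix}
  = e^{itm\beta}A, \\
B - |B| &= m\beta - m
  = \begin{pmatrix} 0 & 0 \\ 0 & -2m \end{pmatrix}, \qquad
A^{2} = \begin{pmatrix} D^{*}D & 0 \\ 0 & DD^{*} \end{pmatrix}.
\end{align*}
```

The last identity is the block form behind (5.2), and the middle one shows that the term $\kappa^2(B-|B|)$ acts only on the lower component.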
5.2. Proof of Theorem 3.7

Hypotheses (I) and (II) hold in this case too. But it is not immediately obvious whether Hypothesis (III) holds, since, in this case, we cannot use Theorem 3.5. We note that
$$\lim_{\kappa\to\infty}H_{\mathrm{PF}}(\kappa;W,U_+)\Psi = H_{\mathrm{PF}}(W,U_+)\Psi, \qquad \Psi\in D(H_{\mathrm{PF}}(W,U_+)).$$
By the assumption on the essential self-adjointness of $H_{\mathrm{PF}}(W,U_+)$, we can apply a general convergence theorem [8, p. 292, Theorem VIII.25(a)] to conclude that, for all $z\in\mathbb{C}\setminus\mathbb{R}$,
$$\operatorname*{s-lim}_{\kappa\to\infty}(H_{\mathrm{PF}}(\kappa;W,U_+)-z)^{-1} = (H_{\mathrm{PF}}(W,U_+)-z)^{-1}.$$
Hence Hypothesis (III) holds with $K_+ = H_{\mathrm{PF}}(W,U_+)$. Then, in the same way as in the proof of Theorem 3.6, we obtain Theorem 3.7.

Appendix A. A Class of Self-Adjoint Extensions of Hermitian Operators

We say that a linear operator $S$ on a Hilbert space $\mathcal{H}$ is Hermitian if $(\psi,S\phi) = (S\psi,\phi)$ for all $\psi,\phi\in D(S)$. In this definition, we do not assume the denseness of $D(S)$. A densely defined Hermitian operator is called a symmetric operator. In this appendix we present a class of self-adjoint extensions of Hermitian operators. To the author's best knowledge, this class is new.

Let $\mathcal{H}$ be a complex Hilbert space. Let $A$ be a nonnegative self-adjoint operator on $\mathcal{H}$ and $B_j$ ($j = 1,2,\ldots,N$, $N\in\mathbb{N}$) be self-adjoint operators bounded from below with $B_j\ge b_j$ ($b_j\in\mathbb{R}$ is a constant) such that $\cap_{j=1}^{N}[D(A^{1/2})\cap D(|B_j|^{1/2})]$ is dense in $\mathcal{H}$. Let
$$c_0 := \sum_{j=1}^{N}b_j.$$
Then the operator
$$S := A+\sum_{j=1}^{N}B_j$$
is Hermitian and bounded from below with $S\ge c_0$.

Remark A.1. If $S$ is densely defined (i.e. $D(S) = \cap_{j=1}^{N}[D(A)\cap D(B_j)]$ is dense), then $S$ is a symmetric operator bounded from below and hence $S$ has a self-adjoint extension $S_{\mathrm{F}}$, called the Friedrichs extension (e.g. [9, p. 177, Theorem X.23]).

Remark A.2. The operator $S$ has another type of self-adjoint extension $S_{\mathrm{f}}$ which is given by the form sum $S_{\mathrm{f}} := A\dot{+}B_1\dot{+}\cdots\dot{+}B_N$, i.e. the self-adjoint operator
associated with the densely defined symmetric closed form $s_0$ given by
$$D(s_0) := \cap_{j=1}^{N}[D(A^{1/2})\cap D(|B_j|^{1/2})] \quad\text{(form domain)},$$
$$s_0(\psi,\phi) := (A^{1/2}\psi,A^{1/2}\phi)+\sum_{j=1}^{N}(\hat{B}_j^{1/2}\psi,\hat{B}_j^{1/2}\phi)+c_0(\psi,\phi), \qquad \psi,\phi\in D(s_0),$$
where
$$\hat{B}_j := B_j-b_j$$
and $(\cdot,\cdot)$ denotes the inner product of $\mathcal{H}$.

Here we want to construct a self-adjoint extension of $S$ which may be different from $S_{\mathrm{F}}$ and $S_{\mathrm{f}}$ if $S$ is symmetric, but not essentially self-adjoint. For this purpose we first introduce an approximate or a "cut-off" version of $S$.

Remark A.3. If each $B_j$ is bounded, then, by the Kato–Rellich theorem, $S$ is self-adjoint. Thus the arguments below are nontrivial only if $A$ and at least one of $B_j$ ($j = 1,\ldots,N$) are unbounded.

Let $L:(0,\infty)\to(0,\infty)$ be a nondecreasing function such that $L(\kappa)\to\infty$ as $\kappa\to\infty$ and
$$\hat{B}_j(\kappa) := E_{\hat{B}_j}([0,L(\kappa)])\,\hat{B}_j\,E_{\hat{B}_j}([0,L(\kappa)]), \qquad \kappa>0,$$
where $E_{\hat{B}_j}$ is the spectral measure of $\hat{B}_j$. It is easy to see that each $\hat{B}_j(\kappa)$ is a nonnegative bounded self-adjoint operator with $\|\hat{B}_j(\kappa)\|\le L(\kappa)$. Let
$$S(\kappa) := A+\sum_{j=1}^{N}\hat{B}_j(\kappa)+c_0.$$
Then, by the Kato–Rellich theorem, $S(\kappa)$ is self-adjoint with $S(\kappa)\ge c_0$. Moreover, for all $\psi\in\cap_{j=1}^{N}[D(A)\cap D(B_j)]$, we have
$$\operatorname*{s-lim}_{\kappa\to\infty}S(\kappa)\psi = S\psi.$$
In this sense $S(\kappa)$ may be regarded as an approximate version of $S$.

Theorem A.1. Let $A$, $B_j$, $S$ and $S(\kappa)$ be as above. Then there exists a unique self-adjoint extension $\tilde{S}$ of $S$ such that the following properties hold:

(i) $\tilde{S}\ge c_0$.
(ii) $D(|\tilde{S}|^{1/2})\subset\cap_{j=1}^{N}[D(A^{1/2})\cap D(\hat{B}_j^{1/2})]$.
(iii) For all $z\in(\mathbb{C}\setminus\mathbb{R})\cup\{\xi\in\mathbb{R}\mid\xi<c_0\}$,
$$\operatorname*{s-lim}_{\kappa\to\infty}(S(\kappa)-z)^{-1} = (\tilde{S}-z)^{-1}.$$
(iv) For all $\xi<c_0$ and $\psi\in D(|\tilde{S}|^{1/2})$,
$$\operatorname*{s-lim}_{\kappa\to\infty}(S(\kappa)-\xi)^{1/2}\psi = (\tilde{S}-\xi)^{1/2}\psi.$$
Proof. For each $\kappa>0$, we define a symmetric form $s_\kappa$ with form domain $D(s_\kappa) = D(A^{1/2})$ by
$$s_\kappa(\psi,\phi) := (A^{1/2}\psi,A^{1/2}\phi)+\sum_{j=1}^{N}(\psi,\hat{B}_j(\kappa)\phi)+c_0(\psi,\phi), \qquad \psi,\phi\in D(A^{1/2}).$$
This is the densely defined closed symmetric form associated with the self-adjoint operator $S(\kappa)$. Since $(\psi,\hat{B}_j(\kappa)\psi)$ is nondecreasing in $\kappa$ for all $\psi\in\mathcal{H}$ with
$$0\le(\phi,\hat{B}_j(\kappa)\phi)\le(\hat{B}_j^{1/2}\phi,\hat{B}_j^{1/2}\phi), \qquad \phi\in D(\hat{B}_j^{1/2}),$$
it follows that, for all $\kappa,\kappa'>0$ with $\kappa<\kappa'$,
$$c_0\le s_\kappa\le s_{\kappa'}\le s_0.$$
Hence we can apply a general convergence theorem on nondecreasing symmetric forms ([6, p. 461, Theorem 3.13]) to conclude that there exists a self-adjoint operator $\tilde{S}$ on $\mathcal{H}$ such that (i), (iii) and (iv) hold with $s_\kappa\le s$, where $s$ is the symmetric form associated with $\tilde{S}$, so that $D(|\tilde{S}|^{1/2})\subset D(A^{1/2})$.

To show that $\tilde{S}$ is a self-adjoint extension of $S$, let $\psi\in D(S) = \cap_{j=1}^{N}[D(A)\cap D(B_j)]$ and $\phi\in D(\tilde{S}) = D(\tilde{S}-c_0+1)$. Then
$$(\psi,(\tilde{S}-c_0+1)\phi) = ((S(\kappa)-c_0+1)\psi,(S(\kappa)-c_0+1)^{-1}(\tilde{S}-c_0+1)\phi).$$
Note that $\operatorname{s-lim}_{\kappa\to\infty}(S(\kappa)-c_0+1)\psi = (S-c_0+1)\psi$ and, by property (iii),
$$\operatorname*{s-lim}_{\kappa\to\infty}(S(\kappa)-c_0+1)^{-1} = (\tilde{S}-c_0+1)^{-1}.$$
Hence
$$(\psi,(\tilde{S}-c_0+1)\phi) = ((S-c_0+1)\psi,(\tilde{S}-c_0+1)^{-1}(\tilde{S}-c_0+1)\phi) = ((S-c_0+1)\psi,\phi),$$
which implies that $\psi\in D(\tilde{S}-c_0+1) = D(\tilde{S})$ and $(\tilde{S}-c_0+1)\psi = (S-c_0+1)\psi$, i.e. $\tilde{S}\psi = S\psi$. Thus $\tilde{S}$ is a self-adjoint extension of $S$.

We next prove (ii). It follows from the inequality $s_\kappa\le s$ as shown above and the nondecreasingness of $s_\kappa$ in $\kappa$ that $D(s)\subset D(s_\kappa) = D(A^{1/2})$ and that, for all $\psi\in D(s) = D(|\tilde{S}|^{1/2})$, $\lim_{\kappa\to\infty}s_\kappa(\psi,\psi)$ exists. This implies that $\lim_{\kappa\to\infty}(\hat{B}_j(\kappa)^{1/2}\psi,\hat{B}_j(\kappa)^{1/2}\psi)$ exists ($j = 1,\ldots,N$). By using the spectral representation for $(\hat{B}_j(\kappa)^{1/2}\psi,\hat{B}_j(\kappa)^{1/2}\psi)$ and the monotone convergence theorem, we see that $\psi\in D(\hat{B}_j^{1/2})$, $j = 1,\ldots,N$. Thus part (ii) follows.

The uniqueness of $\tilde{S}$ follows from property (iii).

Remark A.4. The self-adjoint extension $\tilde{S}$ may depend on the choice of the function $L$. Unfortunately we have been unable to make clear whether $S_\# = \tilde{S}$ or not ($\# = \mathrm{F},\mathrm{f}$) in the case where $S$ is symmetric, but not essentially self-adjoint.
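The monotone convergence step in the proof of part (ii) of Theorem A.1 can be spelled out in the spectral representation. The following is a sketch; $E_j$ abbreviates the spectral measure of $\hat{B}_j$, and the only facts used are $\hat{B}_j\ge0$ and that $L(\kappa)$ increases to $\infty$.

```latex
% E_j is shorthand for the spectral measure of \hat{B}_j \ge 0.
\begin{align*}
\bigl(\hat{B}_j(\kappa)^{1/2}\psi,\hat{B}_j(\kappa)^{1/2}\psi\bigr)
 &= \int_{[0,L(\kappa)]}\lambda\,d\|E_j(\lambda)\psi\|^{2}, \\
\lim_{\kappa\to\infty}\int_{[0,L(\kappa)]}\lambda\,d\|E_j(\lambda)\psi\|^{2}
 &= \int_{[0,\infty)}\lambda\,d\|E_j(\lambda)\psi\|^{2}
 \quad\text{(monotone convergence, since } L(\kappa)\uparrow\infty\text{)}, \\
\int_{[0,\infty)}\lambda\,d\|E_j(\lambda)\psi\|^{2} < \infty
 &\iff \psi\in D(\hat{B}_j^{1/2}) .
\end{align*}
```

So the existence of a finite limit on the left-hand side is exactly the statement $\psi\in D(\hat{B}_j^{1/2})$.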
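Appendix B. Self-Adjoint Extension of the Pauli–Fierz Hamiltonian Without Spin

Let $A^{\mathrm{ex}}$ and $\phi$ be as in Example 2.1 in Sec. 2 and
$$P_j := -iD_j-qA_j^{\varrho}-qA_j^{\mathrm{ex}}.$$
We set $P = (P_1,P_2,P_3)$. Then the Pauli–Fierz Hamiltonian without spin is given by
$$h_{\mathrm{PF}} := \frac{P^2}{2m}+\phi+H_{\mathrm{rad}}$$
acting in the Hilbert space $L^2(\mathbb{R}^3)\otimes\mathcal{F}_{\mathrm{rad}} = L^2(\mathbb{R}^3;\mathcal{F}_{\mathrm{rad}}) = \int_{\mathbb{R}^3}^{\oplus}\mathcal{F}_{\mathrm{rad}}\,dx$. It is easy to see that $h_{\mathrm{PF}}$ is Hermitian.

We assume Hypothesis (C) in Sec. 3. Then each $P_j$ is symmetric. Hence we can define a nonnegative self-adjoint operator $K_{\mathrm{PF}}^{(\mathrm{f})}$ as the form sum
$$K_{\mathrm{PF}}^{(\mathrm{f})} := \frac{1}{2m}\{(\bar{P}_1)^*\bar{P}_1\dot{+}(\bar{P}_2)^*\bar{P}_2\dot{+}(\bar{P}_3)^*\bar{P}_3\},$$
which is a self-adjoint extension of $K_{\mathrm{PF},0} := (2m)^{-1}P^2$. Hence $K_{\mathrm{PF},0}$ has a self-adjoint extension which is nonnegative.

Let $K_{\mathrm{PF}}$ be any self-adjoint extension of $K_{\mathrm{PF},0}$ such that $K_{\mathrm{PF}}\ge0$ and $D(K_{\mathrm{PF}}^{1/2})\cap D(|\phi|^{1/2})\cap D(H_{\mathrm{rad}}^{1/2})$ is dense. Then we define
$$h_{\mathrm{PF}}(\kappa) := K_{\mathrm{PF}}+H_{\mathrm{rad}}^{(\kappa)}+\phi^{(\kappa)},$$
where
$$H_{\mathrm{rad}}^{(\kappa)} := E_{H_{\mathrm{rad}}}([0,L(\kappa)])\,H_{\mathrm{rad}}\,E_{H_{\mathrm{rad}}}([0,L(\kappa)]), \qquad \phi^{(\kappa)} := (\phi-\phi_0)\chi_{[0,L(\kappa)]}(\phi-\phi_0)+\phi_0,$$
where $\chi_{[0,L(\kappa)]}$ is the characteristic function of the interval $[0,L(\kappa)]$. Since $H_{\mathrm{rad}}^{(\kappa)}+\phi^{(\kappa)}$ is bounded and symmetric, $h_{\mathrm{PF}}(\kappa)$ is self-adjoint and bounded from below with $h_{\mathrm{PF}}(\kappa)\ge\phi_0$.

Theorem B.1. Assume Hypothesis (C) in Sec. 3. Then there exists a unique self-adjoint extension $\tilde{h}_{\mathrm{PF}}$ of $h_{\mathrm{PF}}$ such that the following properties hold:

(i) $\tilde{h}_{\mathrm{PF}}\ge\phi_0$.
(ii) $D(|\tilde{h}_{\mathrm{PF}}|^{1/2})\subset D(K_{\mathrm{PF}}^{1/2})\cap D(|\phi|^{1/2})\cap D(H_{\mathrm{rad}}^{1/2})$.
(iii) For all $z\in(\mathbb{C}\setminus\mathbb{R})\cup\{\xi\in\mathbb{R}\mid\xi<\phi_0\}$,
$$\operatorname*{s-lim}_{\kappa\to\infty}(h_{\mathrm{PF}}(\kappa)-z)^{-1} = (\tilde{h}_{\mathrm{PF}}-z)^{-1}.$$
(iv) For all $\xi<\phi_0$ and $\Psi\in D(|\tilde{h}_{\mathrm{PF}}|^{1/2})$,
$$\operatorname*{s-lim}_{\kappa\to\infty}(h_{\mathrm{PF}}(\kappa)-\xi)^{1/2}\Psi = (\tilde{h}_{\mathrm{PF}}-\xi)^{1/2}\Psi.$$

Proof. We only need to apply Theorem A.1 to the following case: $\mathcal{H} = L^2(\mathbb{R}^3;\mathcal{F}_{\mathrm{rad}})$, $A = K_{\mathrm{PF}}$, $N = 2$, $B_1 = \phi$, $B_2 = H_{\mathrm{rad}}$.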
Acknowledgment

This work was supported by the Grant-in-Aid No. 13440039 for Scientific Research from the JSPS.

References

[1] A. Arai, Analysis on anticommuting self-adjoint operators, Adv. Stud. Pure Math. 23 (1994), 1–15.
[2] A. Arai, Scaling limit of anticommuting self-adjoint operators and applications to Dirac operators, Integr. Equat. Oper. Theory 21 (1995), 139–173.
[3] A. Arai, A particle-field Hamiltonian in relativistic quantum electrodynamics, J. Math. Phys. 41 (2000), 4271–4283.
[4] F. Hiroshima, Essential self-adjointness of translation-invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000), 585–613.
[5] F. Hiroshima, Self-adjointness of the Pauli-Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002), 171–201.
[6] T. Kato, Perturbation Theory for Linear Operators, 2nd Edition, Springer, Berlin Heidelberg New York, 1976.
[7] W. Pauli and M. Fierz, Zur Theorie der Emission langwelliger Lichtquanten, Nuovo Cimento 15 (1938), 167–188.
[8] M. Reed and B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis, Academic Press, New York, 1972.
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics II: Fourier Analysis, Self-adjointness, Academic Press, New York, 1975.
[10] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators, Academic Press, New York, 1978.
[11] B. Thaller, The Dirac Equation, Springer-Verlag, Berlin, Heidelberg, 1992.
May 26, 2003 16:50 WSPC/148-RMP
00166
Reviews in Mathematical Physics Vol. 15, No. 3 (2003) 271–312 c World Scientific Publishing Company
LOCALIZATION OF THE NUMBER OF PHOTONS OF GROUND STATES IN NONRELATIVISTIC QED
FUMIO HIROSHIMA Department of Mathematics and Physics, Setsunan University 572-8508, Osaka, Japan
[email protected]

Received 6 November 2002
Revised 23 January 2003

A one-electron system minimally coupled to a quantized radiation field is considered. It is assumed that the quantized radiation field is massless, and no infrared cutoff is imposed. The Hamiltonian, $H$, of this system is defined as a self-adjoint operator acting on $L^2(\mathbb{R}^3)\otimes\mathcal{F}\cong L^2(\mathbb{R}^3;\mathcal{F})$, where $\mathcal{F}$ is the Boson Fock space over $L^2(\mathbb{R}^3\times\{1,2\})$. It is shown that the ground state, $\psi_g$, of $H$ belongs to $\cap_{k=1}^{\infty}D(1\otimes N^k)$, where $N$ denotes the number operator of $\mathcal{F}$. Moreover, it is shown that for almost every electron position variable $x\in\mathbb{R}^3$ and for arbitrary $k\ge0$, $\|(1\otimes N^{k/2})\psi_g(x)\|_{\mathcal{F}}\le D_ke^{-\delta|x|^{m+1}}$ with some constants $m\ge0$, $D_k>0$, and $\delta>0$ independent of $k$. In particular $\psi_g\in\cap_{k=1}^{\infty}D(e^{\beta|x|^{m+1}}\otimes N^k)$ for $0<\beta<\delta/2$ is obtained.

Keywords: Pauli–Fierz model; ground states; number operators; pull-through formula.
1. Introduction

1.1. The Pauli–Fierz Hamiltonian

In this paper one spinless electron minimally coupled to a massless quantized radiation field is considered. It is the so-called Pauli–Fierz model of the nonrelativistic QED. The Hilbert space of state vectors of the system is given by
$$\mathcal{H} = L^2(\mathbb{R}^3)\otimes\mathcal{F},$$
where $\mathcal{F}$ denotes the Boson Fock space defined by
$$\mathcal{F} = \bigoplus_{n=0}^{\infty}\left[\otimes_s^nL^2(\mathbb{R}^3\times\{1,2\})\right],$$
where $\otimes_s^nL^2(\mathbb{R}^3\times\{1,2\})$, $n\ge1$, denotes the $n$-fold symmetric tensor product of $L^2(\mathbb{R}^3\times\{1,2\})$ and $\otimes_s^0L^2(\mathbb{R}^3\times\{1,2\}) = \mathbb{C}$. The Fock vacuum $\Omega$ is defined by $\Omega = \{1,0,0,\ldots\}$. Let
$$\mathcal{F}_0 = \left\{\bigoplus_{n=0}^{\infty}\Psi^{(n)}\in\mathcal{F}\ \middle|\ \Psi^{(n)} = 0\ \text{for}\ n\ge m\ \text{with some}\ m\right\}.$$
For each $\{k,j\}\in\mathbb{R}^3\times\{1,2\}$, the annihilation operator $a(k,j)$ is defined by, for $\Psi = \oplus_{n=0}^{\infty}\Psi^{(n)}\in\mathcal{F}_0$,
$$(a(k,j)\Psi)^{(n)}(k_1,j_1,\ldots,k_n,j_n) = \sqrt{n+1}\,\Psi^{(n+1)}(k,j,k_1,j_1,\ldots,k_n,j_n).$$
The creation operator $a^*(k,j)$ is given by $a^*(k,j) = (a(k,j)\!\upharpoonright_{\mathcal{F}_0})^*$. They satisfy the canonical commutation relations on $\mathcal{F}_0$:
$$[a(k,j),a^*(k',j')] = \delta(k-k')\delta_{jj'}, \qquad [a(k,j),a(k',j')] = 0, \qquad [a^*(k,j),a^*(k',j')] = 0.$$
The closed extensions of $a(k,j)$ and $a^*(k,j)$ are denoted by the same symbols respectively. The annihilation and creation operators smeared by $f\in L^2(\mathbb{R}^3)$ are formally written as
$$a^\sharp(f,j) = \int a^\sharp(k,j)f(k)\,dk, \qquad a^\sharp = a\ \text{or}\ a^*,$$
and act as
$$(a(f,j)\Psi)^{(n)} = \sqrt{n+1}\int f(k)\Psi^{(n+1)}(k,j,k_1,j_1,\ldots,k_n,j_n)\,dk,$$
$$(a^*(f,j)\Psi)^{(n)} = \frac{1}{\sqrt{n}}\sum_{j_l=j}f(k_l)\Psi^{(n-1)}(k_1,j_1,\ldots,\widehat{k_l,j_l},\ldots,k_n,j_n),$$
where $\sum_{j_l=j}$ denotes to sum up $j_l$ such that $j_l = j$, and $\hat{X}$ means neglecting $X$. We work with the unit $\hbar = 1 = c$. The dispersion relation is given by
$$\omega(k) = |k|.$$
Then the free Hamiltonian $H_f$ of $\mathcal{F}$ is formally written as
$$H_f = \sum_{j=1,2}\int\omega(k)a^*(k,j)a(k,j)\,dk,$$
and acts as
$$(H_f\Psi)^{(n)}(k_1,j_1,\ldots,k_n,j_n) = \sum_{j=1}^{n}\omega(k_j)\Psi^{(n)}(k_1,j_1,\ldots,k_n,j_n), \quad n\ge1, \qquad (H_f\Psi)^{(0)} = 0,$$
with the domain
$$D(H_f) = \left\{\Psi = \oplus_{n=0}^{\infty}\Psi^{(n)}\ \middle|\ \sum_{n=0}^{\infty}\|(H_f\Psi)^{(n)}\|^2_{\otimes^nL^2(\mathbb{R}^3\times\{1,2\})}<\infty\right\}.$$
Since $H_f$ is essentially self-adjoint and nonnegative, we denote the self-adjoint extension of $H_f$ by the same symbol $H_f$. Under the identification
$$\mathcal{H}\cong\int_{\mathbb{R}^3}^{\oplus}\mathcal{F}\,dx,$$
the quantized radiation field $A$ with a form factor $\varphi$ is given by the constant fiber direct integral
$$A = \int_{\mathbb{R}^3}^{\oplus}A(x)\,dx,$$
where $A(x)$ is the operator acting on $\mathcal{F}$ defined by
$$A(x) = \frac{1}{\sqrt{2}}\sum_{j=1,2}\int\frac{e(k,j)}{\sqrt{\omega(k)}}\{a^*(k,j)e^{-ik\cdot x}\hat{\varphi}(-k)+a(k,j)e^{ik\cdot x}\hat{\varphi}(k)\}\,dk.$$
Here $\hat{\varphi}$ denotes the Fourier transform of $\varphi$ and $e(k,j)$, $j = 1,2$, are polarization vectors such that $(e(k,1),e(k,2),k/|k|)$ forms a right-handed system, i.e. $k\cdot e(k,j) = 0$, $e(k,j)\cdot e(k,j') = \delta_{jj'}$, and $e(k,1)\times e(k,2) = k/|k|$ for almost every $k\in\mathbb{R}^3$. We fix polarization vectors throughout this paper. The decoupled Hamiltonian is given by
$$H_0 = H_p\otimes1+1\otimes H_f.$$
Here
$$H_p = \frac{1}{2}p^2+V$$
denotes a particle Hamiltonian, where $p = (-i\nabla_{x_1},-i\nabla_{x_2},-i\nabla_{x_3})$ and $x = (x_1,x_2,x_3)$ are the momentum operator and its conjugate position operator in $L^2(\mathbb{R}^3)$, respectively, and $V:\mathbb{R}^3\to\mathbb{R}$ is an external potential. We are prepared to define the total Hamiltonian, $H$, of this system, which is given by the minimal coupling to $H_0$, i.e. we replace $p\otimes1$ with $p\otimes1-eA$:
$$H = \frac{1}{2}(p\otimes1-eA)^2+V\otimes1+1\otimes H_f,$$
where $e$ denotes the charge of an electron.

1.2. Assumptions on V and fundamental facts

We give assumptions on external potentials. We say $V\in K_3$ (the three-dimensional Kato class [23]) if and only if
$$\lim_{\epsilon\downarrow0}\sup_{x\in\mathbb{R}^3}\int_{|x-y|<\epsilon}\frac{|V(y)|}{|x-y|}\,dy = 0,$$
and $V\in K_3^{\mathrm{loc}}$ if and only if $1_RV\in K_3$ for all $R\ge0$, where
$$1_R(x) = \begin{cases}1, & |x|<R,\\ 0, & |x|\ge R.\end{cases}$$
Let us define classes $K$ and $V_{\exp}$ as follows.

Definition 1.1. (1) We say $V\in K$ if and only if $V = V_+-V_-$ such that $V_\pm\ge0$, $V_+\in K_3^{\mathrm{loc}}$ and $V_-\in K_3$. (2) We say $V\in V_{\exp}$ if and only if $V = Z+W$ such that $\inf Z>-\infty$, $Z\in L^1_{\mathrm{loc}}(\mathbb{R}^3)$, $W<0$, and $W\in L^p(\mathbb{R}^3)$ for some $p>3/2$.

For $V\in K$, a functional integral representation of $e^{-t(-\frac{1}{2}\Delta+V)}$ by means of the Wiener measure on $C([0,\infty);\mathbb{R}^3)$ is obtained. See e.g. [23]. For $V\in K\cap V_{\exp}$, using this functional integral representation, it can be proven that a ground state, $f_p$, of $-\frac{1}{2}\Delta+V$ decays exponentially, i.e.
$$|f_p(x)|\le c_1e^{-c_2|x|^{c_3}} \qquad (1.1)$$
for almost every $x\in\mathbb{R}^3$ with some positive constants $c_1,c_2,c_3$. Similar estimates are available to the Pauli–Fierz Hamiltonian $H$ with $V\in K\cap V_{\exp}$. See Proposition 1.5. Furthermore we need to define class $V(m)$, $m = 0,1,2,\ldots$, to estimate constant $c_3$ in (1.1) precisely.

Definition 1.2. Suppose that $V = Z+W\in V_{\exp}\cap K$, where the decomposition $Z+W$ is that of the definition of $V_{\exp}$. (1) We say $V\in V(m)$, $m\ge1$, if and only if $Z(x)\ge\gamma|x|^{2m}$ for $x\notin O$ with a certain compact set $O$ and with some $\gamma>0$. (2) We say $V\in V(0)$ if and only if $\liminf_{|x|\to\infty}Z(x)>\inf\sigma(H)$, where $\sigma(H)$ denotes the spectrum of $H$.

A physically reasonable example of $V$ is the Coulomb potential $\frac{-eZ}{4\pi|x|}$, where $Z>0$ denotes the charge of a nucleus. Actually we see the following proposition.

Proposition 1.3. Assume that
$$\int_{\mathbb{R}^3}\frac{|\hat{\varphi}(k)|^2}{\omega(k)}\,dk<\frac{Z^2}{2(4\pi)^2}.$$
Then
$$-\frac{eZ}{4\pi|x|}\in V(0)$$
for all $e>0$.

Proof. It is known that $-1/|x|\in K_3\cap V_{\exp}$. Then we shall show $\inf\sigma(H)<0$. Let $V = -eZ/(4\pi|x|)$ and $f$ be a normalized ground state of $H_p = -\frac{1}{2}\Delta+V$, $H_pf = -E_0f$, where
$$E_0 = \frac{e^2Z^2}{2(4\pi)^2}.$$
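Then we have
$$\inf\sigma(H)\le(f\otimes\Omega,Hf\otimes\Omega)_{\mathcal{H}} = (f,H_pf)_{L^2(\mathbb{R}^3)}+\frac{e^2}{2}(f\otimes\Omega,A^2f\otimes\Omega)_{\mathcal{H}}$$
$$= -E_0+\frac{e^2}{2}\sum_{\mu=1,2,3}\int_{\mathbb{R}^3}\left(1-\frac{k_\mu^2}{|k|^2}\right)\frac{|\hat{\varphi}(k)|^2}{\omega(k)}\,dk = -\frac{e^2}{2}\left(\frac{Z^2}{(4\pi)^2}-2\int_{\mathbb{R}^3}\frac{|\hat{\varphi}(k)|^2}{\omega(k)}\,dk\right)<0.$$
Thus the proposition follows.

We introduce Hypothesis $H_m$, $m = 0,1,2,\ldots$.

Hypothesis $H_m$
(1) $D(\Delta)\subset D(V)$ and there exist $0\le a<1$ and $0\le b$ such that for $f\in D(\Delta)$,
$$\|Vf\|_{L^2(\mathbb{R}^3)}\le a\|\Delta f\|_{L^2(\mathbb{R}^3)}+b\|f\|_{L^2(\mathbb{R}^3)}.$$
(2) $\hat{\varphi}(-k) = \hat{\varphi}(k)$, and $\hat{\varphi}/\omega,\ \sqrt{\omega}\hat{\varphi}\in L^2(\mathbb{R}^3)$.
(3) $\inf\sigma_{\mathrm{ess}}(H_p)-\inf\sigma(H_p)>0$, where $\sigma(H_p)$ (resp. $\sigma_{\mathrm{ess}}(H_p)$) denotes the spectrum (resp. essential spectrum) of $H_p$.
(4) $V\in V(m)$.

Proposition 1.4. We assume (1) and (2) of $H_m$. Then for arbitrary $e\in\mathbb{R}$, $H$ is self-adjoint on $D(\Delta\otimes1)\cap D(1\otimes H_f)$ and bounded from below; moreover, it is essentially self-adjoint on any core of $-\Delta\otimes1+1\otimes H_f$.

Proof. See [14, 15].

The number operator of $\mathcal{F}$ is defined by
$$N = \sum_{j=1,2}\int a^*(k,j)a(k,j)\,dk.$$
The operator $N^k$, $k\ge0$, acts as, for $\Psi = \oplus_{n=0}^{\infty}\Psi^{(n)}$,
$$(N^k\Psi)^{(n)} = n^k\Psi^{(n)}$$
with the domain
$$D(N^k) = \left\{\Psi = \oplus_{n=0}^{\infty}\Psi^{(n)}\ \middle|\ \sum_{n=0}^{\infty}n^{2k}\|\Psi^{(n)}\|^2_{\otimes^nL^2(\mathbb{R}^3\times\{1,2\})}<\infty\right\}.$$
We give a remark on notations. We can identify $\mathcal{H}$ with the set of $\mathcal{F}$-valued $L^2$ functions on $\mathbb{R}^3$, i.e.
$$\mathcal{H}\cong L^2(\mathbb{R}^3;\mathcal{F}). \qquad (1.2)$$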
Under this identification, $\Psi\in\mathcal{H}$ can be regarded as a vector in $L^2(\mathbb{R}^3;\mathcal{F})$. Namely for almost every $x\in\mathbb{R}^3$, $\Psi(x)\in\mathcal{F}$. We use identification (1.2) without notice in what follows. The following proposition is well known.

Proposition 1.5. Suppose $H_m$. Then there exists $e_0\le\infty$ such that for all $|e|\le e_0$, (i) $H$ has a ground state $\psi_g$, (ii) it is unique, (iii) $\|(1\otimes N^{1/2})\psi_g\|_{\mathcal{H}}<\infty$, (iv) $\|\psi_g(x)\|_{\mathcal{F}}\le De^{-\delta|x|^{m+1}}$ for almost every $x\in\mathbb{R}^3$ with some constants $D>0$ and $\delta>0$.

Proof. See [5, 10] for (i) and (iii), [13] for (ii) and [16] for (iv).

Remark 1.6. It is not clear from Proposition 1.5 that $\psi_g\in D(e^{\delta|x|^{m+1}}\otimes N^{1/2})$. See Corollary 1.11.

The condition
$$I = \int_{\mathbb{R}^3}\frac{|\hat{\varphi}(k)|^2}{\omega(k)^3}\,dk<\infty \qquad (1.3)$$
is called the infrared cutoff condition. (1.3) is not assumed in Proposition 1.5. For suitable external potentials, $e_0 = \infty$ is available in Proposition 1.5. This is established in [10]. In the case where $\inf\sigma_{\mathrm{ess}}(H_p)-\inf\sigma(H_p) = 0$, examples for $H$ to have a ground state are investigated in [17, 19]. It is unknown, however, whether such a ground state decays in $x$ exponentially or not. When the electron includes spin, $H$ has a twofold degenerate ground state for sufficiently small $|e|$, which is shown in [18].

1.3. Localization of the number of bosons and infrared singularities for a linear coupling model

The Nelson Hamiltonian [22] describes a linear coupling between a nonrelativistic particle and a scalar quantum field with a form factor $\varphi$. Let $\mathcal{H}_N = L^2(\mathbb{R}^3)\otimes\mathcal{F}_N$, where $\mathcal{F}_N = \bigoplus_{n=0}^{\infty}[\otimes_s^nL^2(\mathbb{R}^3)]$. The Nelson Hamiltonian is defined as a self-adjoint operator acting in the Hilbert space $\mathcal{H}_N$, which is given by
$$H_N = H_p\otimes1+1\otimes H_f^N+g\phi,$$
where $g$ denotes a coupling constant, $H_f^N = \int\omega(k)a^*(k)a(k)\,dk$ is the free Hamiltonian in $\mathcal{F}_N$, and under identification $\mathcal{H}_N\cong\int_{\mathbb{R}^3}^{\oplus}\mathcal{F}_N\,dx$, $\phi$ is defined by $\phi = \int_{\mathbb{R}^3}^{\oplus}\phi(x)\,dx$ with
$$\phi(x) = \frac{1}{\sqrt{2}}\int\left\{a^*(k)e^{-ikx}\frac{\hat{\varphi}(-k)}{\sqrt{\omega(k)}}+a(k)e^{ikx}\frac{\hat{\varphi}(k)}{\sqrt{\omega(k)}}\right\}dk.$$
It has been established in [2, 4, 9, 25] that the Nelson Hamiltonian has the unique ground state, $\psi_g^N$, under the condition $I<\infty$. Let us denote the number operator of $\mathcal{F}_N$ by the same symbol $N$ as that of $\mathcal{F}$. In [6] it has been proven that $\psi_g^N$ decays superexponentially, i.e.
$$\|e^{+\beta(1\otimes N)}\psi_g^N\|_{\mathcal{H}_N}<\infty \qquad (1.4)$$
for arbitrary $\beta>0$. This kind of result has been obtained in [11, Sec. 3] and [24] for relativistic polaron models, and [26, Sec. 8] for spin-boson models. Moreover in [6] we see that
$$\lim_{I\to\infty}\|(1\otimes N^{1/2})\psi_g^N\|_{\mathcal{H}_N} = \infty. \qquad (1.5)$$
Actually in the infrared divergence case,
$$I = \infty, \qquad (1.6)$$
it is shown in [20] that the Nelson Hamiltonian with some confining external potentials has no ground states in $\mathcal{H}_N$. Then we have to take a non-Fock representation to investigate a ground state with (1.6). See [1, 3, 21] for details. That is to say, as the infrared cutoff is removed, the number of bosons of $\psi_g^N$ diverges and the ground state disappears. A method to show (1.4) and (1.5) is based on a path integral representation of $(\psi_g^N,e^{+\beta(1\otimes N)}\psi_g^N)_{\mathcal{H}_N}$. Precisely it can be shown that in the case $I<\infty$ there exists a probability measure $\mu$ on $C(\mathbb{R};\mathbb{R}^3)$ such that for arbitrary $\beta>0$,
$$(\psi_g^N,e^{+\beta(1\otimes N)}\psi_g^N)_{\mathcal{H}_N} = \int_{C(\mathbb{R};\mathbb{R}^3)}e^{-(g^2/2)(1-e^{+\beta})\int_{-\infty}^{0}ds\int_{0}^{\infty}dt\,W(q_s-q_t,s-t)}\,\mu(dq), \qquad (1.7)$$
where $(q_t)_{-\infty<t<\infty}\in C(\mathbb{R};\mathbb{R}^3)$ and
$$W(X,T) = \int_{\mathbb{R}^3}e^{-|T|\omega(k)}e^{ik\cdot X}\frac{|\hat{\varphi}(k)|^2}{\omega(k)}\,dk. \qquad (1.8)$$
Note that the double integral $\int_{-T}^{0}ds\int_{0}^{T}dt\,W(q_s-q_t,s-t)$ is estimated uniformly in path and $T$ as
$$\left|\int_{-T}^{0}ds\int_{0}^{T}dt\,W(q_s-q_t,s-t)\right|\le I. \qquad (1.9)$$
This uniform bound is a core of the proof of identity (1.7).
1.4. The main theorems

In contrast to the Nelson Hamiltonian, for the Pauli–Fierz Hamiltonian, as is seen in Proposition 1.5, it is shown that the ground state, $\psi_g$, exists and $\|(1\otimes N^{1/2})\psi_g\|_{\mathcal{H}}<\infty$ even in the case $I = \infty$. We may say that the infrared singularity for the Pauli–Fierz Hamiltonian is not so singular in comparison with the Nelson Hamiltonian, and one may expect that
$$\|e^{+\beta(1\otimes N)}\psi_g\|_{\mathcal{H}}<\infty \qquad (1.10)$$
holds for some $\beta>0$ under $I = \infty$. Unfortunately, however, we cannot show (1.10), since the similar path integral method as the Nelson Hamiltonian is not available on account of the appearance of the so-called double stochastic integral ([13]) instead of $\int_{-\infty}^{0}ds\int_{0}^{\infty}dt\,W(q_s-q_t,s-t)$ in (1.7). The double stochastic integral is formally written as
$$\sum_{\mu,\nu=1,2,3}\int_{-\infty}^{0}dq_{\mu,s}\int_{0}^{\infty}dq_{\nu,t}\,W_{\mu\nu}(q_s-q_t,s-t), \qquad (1.11)$$
where $(q_s)_{-\infty<s<\infty} = (q_{1,s},q_{2,s},q_{3,s})_{-\infty<s<\infty}\in C(\mathbb{R},\mathbb{R}^3)$ and
$$W_{\mu\nu}(X,T) = \int_{\mathbb{R}^3}\left(\delta_{\mu\nu}-\frac{k_\mu k_\nu}{|k|^2}\right)e^{-|T|\omega(k)}e^{ik\cdot X}\frac{|\hat{\varphi}(k)|^2}{\omega(k)}\,dk.$$
Actually we cannot estimate (1.11) uniformly in path such as (1.9). Therefore we are not concerned here with (1.10). In place of this we will show the following theorems.

Theorem 1.7. Assume $H_m$. Then $\psi_g\in\bigcap_{k=1}^{\infty}D(1\otimes N^{k/2})$.

Remark 1.8. Theorem 1.7 automatically follows if one assumes that photons have artificial positive mass, $\nu$, i.e. $\omega(k) = \sqrt{|k|^2+\nu^2}$.

Theorem 1.9. Assume $H_m$. Then for a fixed $k\ge0$ there exist positive constants $D_k$, and $\delta$ independent of $k$, such that
$$\|(1\otimes N^{k/2})\psi_g(x)\|_{\mathcal{F}}\le D_ke^{-\delta|x|^{m+1}} \qquad (1.12)$$
for almost every $x\in\mathbb{R}^3$.

Remark 1.10. We do not assume $I<\infty$ in Theorems 1.7 and 1.9.

From Theorems 1.7 and 1.9 the following corollary is immediate.

Corollary 1.11. Assume $H_m$. Then $\psi_g\in\bigcap_{k=0}^{\infty}D(e^{\beta|x|^{m+1}}\otimes N^{k/2})$ for $\beta<\delta/2$.

Proof. Since $\psi_g\in D(e^{2\beta|x|^{m+1}}\otimes1)\cap D(1\otimes N^{k/2})$ for all $k\ge0$, the corollary follows from the fact that $D(e^{2\beta|x|^{m+1}}\otimes1)\cap D(1\otimes N^k)\subset D(e^{\beta|x|^{m+1}}\otimes N^{k/2})$.
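The domain inclusion quoted in the proof of Corollary 1.11 is, at a formal level, an instance of the Cauchy–Schwarz inequality for commuting nonnegative operators. The following is a sketch only: $X$ and $Y$ are shorthand introduced here, and $\Psi$ is assumed to lie in a domain on which every expression is defined.

```latex
% Shorthand (assumed): X = e^{\beta|x|^{m+1}} \otimes 1,  Y = 1 \otimes N^{k/2};
% X and Y are nonnegative, self-adjoint and commute.
\begin{align*}
\|XY\Psi\|^{2} = (X^{2}\Psi,\,Y^{2}\Psi)
 \le \|X^{2}\Psi\|\,\|Y^{2}\Psi\|
 = \|(e^{2\beta|x|^{m+1}}\otimes 1)\Psi\|\;\|(1\otimes N^{k})\Psi\| .
\end{align*}
```

This is why the exponent $2\beta$ and the power $N^k$ appear on the larger side of the inclusion, and why the corollary is restricted to $\beta<\delta/2$.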
1.5. Outline of proofs of the main theorems

For notational convenience, in the following we mostly omit the tensor notation $\otimes$, e.g. we express as $H_f$ for $1\otimes H_f$, $a^\sharp(k,j)$ for $1\otimes a^\sharp(k,j)$, $\Delta$ for $\Delta\otimes1$, $|x|$ for $|x|\otimes1$, etc., and set $k = (k,j)\in\mathbb{R}^3\times\{1,2\}$ and
$$\int\!\!\sum\cdots dk_1\cdots dk_n = \sum_{j_1,\ldots,j_n=1,2}\int\cdots dk_1\cdots dk_n.$$
The strategy of this paper is as follows. We check in Lemma 3.2 that
$$\int\!\!\sum\|a(k_1)\cdots a(k_l)\Psi\|_{\mathcal{H}}^2\,dk_1\cdots dk_l<\infty, \qquad l = 1,2,\ldots,k, \qquad (1.13)$$
if and only if
$$\Psi\in D(N^{k/2}).$$
Thus in order to prove Theorem 1.7 it is enough to show that $\psi_g\in D(a(k_1)\cdots a(k_l))$ for almost every $(k_1,\ldots,k_l)\in\mathbb{R}^{3l}$, and
$$\int\!\!\sum\|a(k_1)\cdots a(k_l)\psi_g\|_{\mathcal{H}}^2\,dk_1\cdots dk_l<\infty \qquad (1.14)$$
holds for all $l\ge0$. One subtlety to show (1.14) is that we do not assume $I<\infty$. Bach–Fröhlich–Sigal [5] proved (1.14) for $l = 1$. We extend it to $l\ge1$. To see (1.14) for all $l$ we make a detour through the modified annihilation operator defined by
$$b(k,j) = a(k,j)-i\frac{e}{\sqrt{2}}\frac{e^{-ik\cdot x}\hat{\varphi}(k)}{\sqrt{\omega(k)}}(x\cdot e(k,j)).$$
For some $\Psi\in\mathcal{H}$ we establish in Lemma 3.6 that
$$\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal{H}}\le\sum_{l=0}^{n}\ \sum_{\{p_1,\ldots,p_l\}\subset\{1,\ldots,n\}}\ \prod_{j=1}^{l}\frac{|e\hat{\varphi}(k_{p_j})|}{\sqrt{2\omega(k_{p_j})}}\,\|b(k_1)\cdots\widehat{b(k_{p_1})}\cdots\widehat{b(k_{p_l})}\cdots b(k_n)|x|^l\Psi\|_{\mathcal{H}}, \qquad (1.15)$$
where $\widehat{\ \cdot\ }$ means neglecting the term below, and $\sum_{\{p_1,\ldots,p_l\}\subset\{1,2,\ldots,n\}}$ denotes to sum up all the combinations to choose $l$ numbers from $\{1,2,\ldots,n\}$. In Lemma 3.7 we show that there exist constants $c_k^{n,l}$ such that
$$\int\!\!\sum\|b(k_1)\cdots b(k_n)|x|^m\Psi\|_{\mathcal{H}}^2\,dk_1\cdots dk_n\le\sum_{l=0}^{n}\sum_{k=1}^{n-l}c_k^{n,l}\|N^{k/2}|x|^{m+l}\Psi\|_{\mathcal{H}}^2. \qquad (1.16)$$
Combining (1.15) and (1.16), we see in Lemma 3.8 that Z X ka(k1 ) · · · (kn )Ψk2H dk1 · · · dkn ≤2
n
(Z X
kb(k1 ) · · · b(kn )Ψk2H dk1
· · · dkn +
with some constants dnl , where Rn,m (Ψ) =
n−l n X X l=0 k=1
n X l=1
dnl Rn−l,l (Ψ)
)
(1.17)
k/2 cn,l |x|m+l Ψk2H . k kN
Furthermore if ψg ∈ D(N k/2 ) then we see that N k/2 ψg = e−tH etE N k/2 ψg + etE [N k/2 , e−tH ]ψg , where E = inf σ(H) . Using this identity we show in Lemma 2.12 that if ψg ∈ D(N k/2 ) then for all l ≥ 0, |x|l ψg ∈ D(N k/2 ) .
(1.18)
Under these preparations we prove Theorem 1.7 by induction. Let us assume that
$$\psi_g\in D(N^{(n-1)/2}). \tag{1.19}$$
Hence
$$\int\sum\|a(k_1)\cdots a(k_l)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_l<\infty,\qquad l=1,2,\dots,n-1. \tag{1.20}$$
Then we see that by (1.18),
$$\sum_{l=1}^{n}d^n_l R_{n-l,l}(\psi_g)<\infty.$$
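Moreover by using the pull-through formula (2.14) we prove in Lemma 3.4 that
$$\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}\le\sum_{p=1}^{n}\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\psi_g\big\|_{\mathcal H}+\sum_{p=1}^{n}\sum_{q<p}\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\psi_g\big\|_{\mathcal H} \tag{1.21}$$
with $\delta_1\in L^2(\mathbb R^3)$, $\delta_2\in L^2(\mathbb R^3\times\mathbb R^3)$.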
May 26, 2003 16:50 WSPC/148-RMP
00166
Localization of the Number of Photons of Ground States
281
By (1.16), (1.18) and assumption (1.19), we show that
$$\int\sum\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty.$$
Hence by (1.17) we have
$$\int\sum\|a(k_1)\cdots a(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\Big\{\int\sum\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n+\sum_{l=1}^{n}d^n_lR_{n-l,l}(\psi_g)\Big\}<\infty,$$
which implies, together with (1.20), that $\psi_g\in D(N^{n/2})$. Since $\psi_g\in D(N^{1/2})$ is known, we obtain
$$\psi_g\in\bigcap_{k=1}^{\infty}D(N^{k/2}).$$
This paper is organized as follows. In Sec. 2 we establish (1.21) by means of the pull-through formula. In Sec. 3 we give a proof of the main theorems. In Sec. 4 we show (1.18) by virtue of a functional integral representation.

2. Pull-Through Formula and Exponential Decay

2.1. Fundamental facts

Let $T$ be an operator. We set
$$C^\infty(T)=\bigcap_{k=1}^{\infty}D(T^k).$$
Lemma 2.1. We have $\psi_g\in C^\infty(|x|)\cap D(\Delta)\cap C^\infty(H_f)$.

Proof. By Proposition 1.4, $D(H)=D(\Delta)\cap D(H_f)$. Then $\psi_g\in D(\Delta)$. By Proposition 1.5(2) it holds that $\psi_g\in C^\infty(|x|)$. It is obtained in [8] that $H_f^l(H-i)^{-l}$ is bounded for all $l\ge0$. Recall that $E=\inf\sigma(H)$. Then it follows that for arbitrary $l\ge0$,
$$\|H_f^l\psi_g\|_{\mathcal H}=\|H_f^l(H-i)^{-l}(E-i)^l\psi_g\|_{\mathcal H}\le|(E-i)^l|\,\|H_f^l(H-i)^{-l}\|\,\|\psi_g\|_{\mathcal H}.$$
Then $\psi_g\in D(H_f^l)$ for all $l\ge0$. Thus the lemma follows.

Let
$$\mathcal F_\omega=\mathcal L\{a^*(f_1,j_1)\cdots a^*(f_n,j_n)\Omega,\ \Omega\mid f_j\in C_0^\infty(\mathbb R^3),\ j=1,\dots,n,\ n=1,2,\dots\},$$
where $\mathcal L\{\cdots\}$ denotes the set of finite linear sums of $\{\cdots\}$. We define
$$\mathcal D=C^\infty(|x|)\cap C^\infty(H_f),$$
and
$$\mathcal C=C_0^\infty(\mathbb R^3)\,\hat\otimes\,\mathcal F_\omega,$$
where $\hat\otimes$ denotes the algebraic tensor product.

Lemma 2.2. Let $m\ge0$ and $n\ge0$. Then $(H_f+1)^n+|x|^m$ is self-adjoint on $D((H_f+1)^n)\cap D(|x|^m)$ and essentially self-adjoint on $\mathcal C$.

Proof. The self-adjointness is trivial. Since $C_0^\infty(\mathbb R^3)$ and $\mathcal F_\omega$ are sets of analytic vectors of $|x|^m$ and $(H_f+1)^n$ respectively, $C_0^\infty(\mathbb R^3)$ and $\mathcal F_\omega$ are cores of $|x|^m$ and $(H_f+1)^n$ respectively. Hence $\mathcal C=C_0^\infty(\mathbb R^3)\,\hat\otimes\,\mathcal F_\omega$ is a core of $(H_f+1)^n+|x|^m$.

Remark 2.3. Let $p,q\ge0$. From Lemma 2.2 it follows that for $\Psi\in\mathcal D\subset D((H_f+1)^p+|x|^q)$, there exists a sequence $\{\Psi_m\}\subset\mathcal C$ such that $\Psi_m\to\Psi$ and $((H_f+1)^p+|x|^q)\Psi_m\to((H_f+1)^p+|x|^q)\Psi$ strongly as $m\to\infty$.

Let $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$, and $\Psi\in\mathcal C$. Then it is proven in Appendix C that
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n\le\epsilon(f_1,\dots,f_n)\,\|(H_f+1)^{n/2}\Psi\|_{\mathcal H} \tag{2.1}$$
with some constant $\epsilon(f_1,\dots,f_n)$ independent of $\Psi$. Let $A$ and $B$ be operators. We say $f\in D(AB)$ if $f\in D(B)$ and $Bf\in D(A)$.

Lemma 2.4. Let $\Psi\in\mathcal D$. Then there exists $\mathcal M_{\mathcal D}(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero such that
$$\Psi\in D(a(k_1)\cdots a(k_n)) \tag{2.2}$$
and
$$a(k_1)\cdots a(k_n)\Psi\in\mathcal D \tag{2.3}$$
for $(k_1,\dots,k_n)\notin\mathcal M_{\mathcal D}(\Psi)$. Moreover assume that $\{\Psi_m\}\subset\mathcal C$ satisfies $\Psi_m\to\Psi$ and $(H_f+1)^{n/2}\Psi_m\to(H_f+1)^{n/2}\Psi$ strongly as $m\to\infty$. Then there exist a subsequence $\{m'\}\subset\{m\}$ and $\mathcal M_{\mathcal D}(\Psi,\{\Psi_m\},\{m'\})\subset\mathbb R^{3n}$ with Lebesgue measure zero such that for $(k_1,\dots,k_n)\notin\mathcal M_{\mathcal D}(\Psi,\{\Psi_m\},\{m'\})$, (2.2) and (2.3) are valid and
$$\operatorname*{s-lim}_{m'\to\infty}a(k_1)\cdots a(k_n)\Psi_{m'}=a(k_1)\cdots a(k_n)\Psi.$$

Proof. See Appendix A.

Lemma 2.5. The operator $|x|$ leaves $\mathcal D$ invariant.
Proof. Let $\Psi\in\mathcal D$. It is clear that $|x|\Psi\in C^\infty(|x|)$. We choose a sequence $\{\Psi_m\}\subset\mathcal C$ such that $\Psi_m\to\Psi$ and $((H_f+1)^{2n}+|x|^2)\Psi_m\to((H_f+1)^{2n}+|x|^2)\Psi$ strongly as $m\to\infty$. In particular,
$$|x|\Psi_m\to|x|\Psi \tag{2.4}$$
strongly as $m\to\infty$. $H_f^n|x|\Psi_m$ is well defined and it is obtained that
$$\|H_f^n|x|\Psi_m\|_{\mathcal H}^2\le\|H_f^{2n}\Psi_m\|_{\mathcal H}\,\||x|^2\Psi_m\|_{\mathcal H}\le\|((H_f+1)^{2n}+|x|^2)\Psi_m\|_{\mathcal H}^2.$$
Then $H_f^n|x|\Psi_m$ converges strongly as $m\to\infty$. Since $H_f^n$ is closed, by (2.4) we have $|x|\Psi\in D(H_f^n)$. Here $n$ is arbitrary, hence $|x|\Psi\in C^\infty(H_f)$. The proof is complete.

Let
$$\beta(k)=\frac{e}{\sqrt2}\,\frac{e^{-ik\cdot x}}{\sqrt{\omega(k)}}\,e(k)\hat\varphi(k)$$
and
$$b(k)=e^{ix\cdot A}a(k)e^{-ix\cdot A}=a(k)-ix\cdot\beta(k).$$
For simplicity we set $-ix\cdot\beta(k_j)=\theta_j$. Then $b(k_j)=a(k_j)+\theta_j$.

Lemma 2.6. Let $\Psi\in\mathcal C$ and $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$. Then there exists a constant $\epsilon'(f_1,\dots,f_n)$ independent of $\Psi$ such that
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|b(k_1)\cdots b(k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n\le\epsilon'(f_1,\dots,f_n)\,\|((H_f+1)^n+|x|^{2n})\Psi\|_{\mathcal H}. \tag{2.5}$$

Proof. Since $[\theta_j,a(k)]=0$ on $\mathcal C$, we have
$$b(k_1)\cdots b(k_n)\Psi=(a(k_1)+\theta_1)\cdots(a(k_n)+\theta_n)\Psi=\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\theta_{p_1}\cdots\theta_{p_l}\,a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)\Psi.$$
Hence by (2.1),
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|b(k_1)\cdots b(k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n\le\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big(\prod_{i=1}^{l}\int\frac{|e\hat\varphi(k)|}{\sqrt{2\omega(k)}}|f_{p_i}(k)|\,dk\Big)\,\epsilon(f_1,\dots,\widehat{f_{p_1}},\dots,\widehat{f_{p_l}},\dots,f_n)\,\|(H_f+1)^{(n-l)/2}|x|^l\Psi\|_{\mathcal H}.$$
We have $\|(H_f+1)^{(n-l)/2}|x|^l\Psi\|_{\mathcal H}\le c_{nl}\|((H_f+1)^n+|x|^{2n})\Psi\|_{\mathcal H}$ with some constant $c_{nl}$. Thus (2.5) follows.

Lemma 2.7. Let $\Psi\in\mathcal D$. Then there exists $\mathcal N_{\mathcal D}(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero such that
$$\Psi\in D(b(k_1)\cdots b(k_n)) \tag{2.6}$$
and
$$b(k_1)\cdots b(k_n)\Psi\in\mathcal D \tag{2.7}$$
for $(k_1,\dots,k_n)\notin\mathcal N_{\mathcal D}(\Psi)$. Moreover assume that $\{\Psi_m\}\subset\mathcal C$ satisfies $\Psi_m\to\Psi$ and $((H_f+1)^n+|x|^{2n})\Psi_m\to((H_f+1)^n+|x|^{2n})\Psi$ strongly as $m\to\infty$. Then there exist a subsequence $\{m'\}\subset\{m\}$ and $\mathcal N_{\mathcal D}(\Psi,\{\Psi_m\},\{m'\})\subset\mathbb R^{3n}$ with Lebesgue measure zero such that for $(k_1,\dots,k_n)\notin\mathcal N_{\mathcal D}(\Psi,\{\Psi_m\},\{m'\})$, (2.6) and (2.7) are valid and
$$\operatorname*{s-lim}_{m'\to\infty}b(k_1)\cdots b(k_n)\Psi_{m'}=b(k_1)\cdots b(k_n)\Psi.$$
Proof. See Appendix A.

2.2. Pull-through formula

Lemma 2.8. We have
$$\mathcal C\subset D(Hb(k_1)\cdots b(k_n))\cap D(b(k_1)\cdots b(k_n)H) \tag{2.8}$$
for all $(k_1,\dots,k_n)\in\mathbb R^{3n}$, and for $\Psi\in\mathcal C$,
$$[H,b(k_1)\cdots b(k_n)]\Psi=R_0\Psi+R_1\Psi+R_2\Psi.$$
Here
$$R_0=R_0(k_1,\dots,k_n)=-\sum_{p=1}^{n}\omega(k_p)\,b(k_1)\cdots b(k_n),$$
$$R_1=R_1(k_1,\dots,k_n)=\sum_{p=1}^{n}\vartheta_1(k_p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n),$$
$$R_2=R_2(k_1,\dots,k_n)=\sum_{p=1}^{n}\sum_{q<p}\vartheta_2(k_p,k_q)\,b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n),$$
and
$$\vartheta_1(k)=\vartheta_1(k,x,p)=\frac i2\big\{(x\cdot\beta(k))\,k\cdot(p-eA)+k\cdot(p-eA)\,(x\cdot\beta(k))\big\}-i\omega(k)(x\cdot\beta(k)),$$
$$\vartheta_2(k,k')=\vartheta_2(k,k',x)=(x\cdot\beta(k))(x\cdot\beta(k'))(k\cdot k').$$

Proof. (2.8) is trivial. On $\mathcal C$ we have
$$[H,b(k)]=-\omega(k)b(k)+\vartheta_1(k). \tag{2.9}$$
Moreover
$$[b(k'),\vartheta_1(k)]=\vartheta_2(k,k'). \tag{2.10}$$
By (2.9) and (2.10) we have
$$\begin{aligned}[H,b(k_1)\cdots b(k_n)]\Psi&=\sum_{p=1}^{n}b(k_1)\cdots\{-\omega(k_p)b(k_p)+\vartheta_1(k_p)\}\cdots b(k_n)\Psi\\&=-\sum_{p=1}^{n}\omega(k_p)\,b(k_1)\cdots b(k_n)\Psi+\sum_{p=1}^{n}\vartheta_1(k_p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)\Psi\\&\quad+\sum_{p=1}^{n}\sum_{q<p}\vartheta_2(k_p,k_q)\,b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)\Psi.\end{aligned}$$
The lemma follows.
$\overline B$ denotes the closure of $B$. We simply set $R_1=\overline{R_1\lceil_{\mathcal C}}$.

Lemma 2.9. Let $\Psi\in\mathcal D\cap D(\Delta)$. Then there exists $\mathcal N(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero such that for $(k_1,\dots,k_n)\notin\mathcal N(\Psi)$,
$$\Psi\in D(R_0(k_1,\dots,k_n))\cap D(R_1(k_1,\dots,k_n))\cap D(R_2(k_1,\dots,k_n)).$$

Proof. By Lemma 2.7, $\Psi\in D(b(k_1)\cdots b(k_n))$ and $b(k_1)\cdots b(k_n)\Psi\in\mathcal D$ for $(k_1,\dots,k_n)\notin\mathcal N_{\mathcal D}(\Psi)$. Thus $b(k_1)\cdots b(k_n)\Psi\in D(\sum_{p=1}^{n}\omega(k_p))\cap D(\vartheta_2(k_p,k_q))$, which implies that $\Psi\in D(R_0(k_1,\dots,k_n))\cap D(R_2(k_1,\dots,k_n))$. Next we shall prove $D(R_1(k_1,\dots,k_n))\ni\Psi$. Simply we set $K_n=(H_f+1)^n+|x|^{2n}$. We have on $\mathcal C$,
$$R_1=\sum_{p=1}^{n}ix\cdot\beta(k_p)(k_p\cdot p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)+\sum_{p=1}^{n}R_x(k_p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n),$$
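where
$$R_x(k_p)=(-ie)(x\cdot\beta(k_p))(k_p\cdot A)-\frac i2\big(i\beta(k_p)\cdot k_p+x\cdot\beta(k_p)|k_p|^2\big)-i\omega(k_p)(x\cdot\beta(k_p)).$$
It follows that for $\Phi\in\mathcal C$,
$$\int\sum\Big\|\prod_{j=1}^{n}f_j(k_j)\,R_x(k_p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)\Phi\Big\|_{\mathcal H}\,dk_1\cdots dk_n\le c_1\|K_{m_1}\Phi\|_{\mathcal H}$$
with some constants $c_1$ and $m_1$, and
$$\begin{aligned}ix\cdot\beta(k_p)(k_p\cdot p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)\Phi&=ix\cdot\beta(k_p)\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(k_p\cdot p)\Phi\\&\quad+\sum_{q\ne p}R_x(k_p,k_q)\,b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)\Phi,\end{aligned} \tag{2.11}$$
where
$$R_x(k_p,k_q)=ix\cdot\beta(k_p)\big(-k_p\cdot\beta(k_q)+ix\cdot\beta(k_q)(k_p\cdot k_q)\big).$$
The second term of (2.11) is estimated as
$$\int\sum\Big\|\prod_{j=1}^{n}f_j(k_j)\sum_{q\ne p}R_x(k_p,k_q)\,b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)\Phi\Big\|_{\mathcal H}\,dk_1\cdots dk_n\le c_2\|K_{m_2}\Phi\|_{\mathcal H}$$
with some constants $c_2$ and $m_2$. By (2.5) the first term of (2.11) is estimated as
$$\int\sum\Big\|\prod_{j=1}^{n}f_j(k_j)\,(x\cdot\beta(k_p))\,b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(k_p\cdot p)\Phi\Big\|_{\mathcal H}\,dk_1\cdots dk_n\le\epsilon'(f_1,\dots,\widehat{f_p},\dots,f_n)\int\frac{|e|}{\sqrt2}\frac{|\hat\varphi(k_p)|}{\sqrt{\omega(k_p)}}|f_p(k_p)|\,\|K_{n-1}|x|(k_p\cdot p)\Phi\|_{\mathcal H}\,dk_p.$$
Let $Q=K_{n-1}$. Note that
$$\|Q|x|(k_p\cdot p)\Phi\|_{\mathcal H}^2=(|x|^2Q^2\Phi,(k_p\cdot p)^2\Phi)_{\mathcal H}+(\Phi,[(k_p\cdot p),Q^2|x|^2](k_p\cdot p)\Phi)_{\mathcal H}.$$
Since $[(k_p\cdot p),|x|]=-i(k_p\cdot x)/|x|$, we have $[(k_p\cdot p),Q^2|x|^2]=k_p\cdot P$, where
$$P=2(H_f+1)^{n-1}(-i)\frac{x}{|x|}+(-i)x(|x|+1)^{2n-3}+(-i)\frac{x}{|x|}(|x|+1)^{2n-2}.$$
Then
$$\|Q|x|(k_p\cdot p)\Phi\|_{\mathcal H}^2\le|k_p|^2\big(\||x|^2Q^2\Phi\|_{\mathcal H}\|\Delta\Phi\|_{\mathcal H}+\|P\Phi\|_{\mathcal H}\||p|\Phi\|_{\mathcal H}\big).$$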
Hence
$$\|Q|x|(k_p\cdot p)\Phi\|_{\mathcal H}\le|k_p|\big(c_3\|K_{m_3}\Phi\|_{\mathcal H}+c'\|\Delta\Phi\|_{\mathcal H}\big)$$
follows with some constants $c_3$, $c'$ and $m_3$. Thus for $\Phi\in\mathcal C$,
$$\int\sum\Big\|\prod_{j=1}^{n}f_j(k_j)\,R_1(k_1,\dots,k_n)\Phi\Big\|_{\mathcal H}\,dk_1\cdots dk_n\le c\|K_m\Phi\|_{\mathcal H}+c'\|\Delta\Phi\|_{\mathcal H} \tag{2.12}$$
follows with some constants $c$ and $m$. Set $K=-\Delta+K_m=-\Delta+|x|^{2m}+(H_f+1)^m$. Then $K$ is self-adjoint on $D(-\Delta+|x|^{2m})\cap D((H_f+1)^m)$ and essentially self-adjoint on $\mathcal C$. Then for $\Psi\in\mathcal D\cap D(\Delta)$ there exists a sequence $\{\Psi_l\}\subset\mathcal C$ such that $\Psi_l\to\Psi$ and $K\Psi_l\to K\Psi$ strongly as $l\to\infty$. By (2.12) it follows that
$$\int\sum\Big\|\prod_{j=1}^{n}f_j(k_j)\,R_1(k_1,\dots,k_n)\Psi_l\Big\|_{\mathcal H}\,dk_1\cdots dk_n\le c\|K_m\Psi_l\|_{\mathcal H}+c'\|\Delta\Psi_l\|_{\mathcal H}.$$
Then there exist $\mathcal N_{\mathcal D}(\Psi)'\subset\mathbb R^{3n}$ with Lebesgue measure zero and a subsequence $\{l'\}\subset\{l\}$ such that $R_1(k_1,\dots,k_n)\Psi_{l'}$ strongly converges as $l'\to\infty$ for $(k_1,\dots,k_n)\notin\mathcal N_{\mathcal D}(\Psi)'$. Then $\Psi\in D(R_1(k_1,\dots,k_n))$ for $(k_1,\dots,k_n)\notin\mathcal N_{\mathcal D}(\Psi)'$. Set $\mathcal N(\Psi)=\mathcal N_{\mathcal D}(\Psi)\cup\mathcal N_{\mathcal D}(\Psi)'$. We get the desired results.

The following lemma is a variant of the pull-through formula.

Lemma 2.10. For $(k_1,\dots,k_n)\notin\mathcal N(\psi_g)$, the following (1), (2) and (3) hold:

(1) $\psi_g\in D(b(k_1)\cdots b(k_n))\cap D(R_0)\cap D(R_1)\cap D(R_2)$.
(2) $b(k_1)\cdots b(k_n)\psi_g\in D(H)$.

(3)
$$\Big(H-E+\sum_{p=1}^{n}\omega(k_p)\Big)b(k_1)\cdots b(k_n)\psi_g=R_1\psi_g+R_2\psi_g. \tag{2.13}$$
In particular it follows that for $(k_1,\dots,k_n)\notin\mathcal N(\psi_g)$ and $(k_1,\dots,k_n)\ne(0,\dots,0)$,
$$b(k_1)\cdots b(k_n)\psi_g=\Big(H-E+\sum_{p=1}^{n}\omega(k_p)\Big)^{-1}(R_1\psi_g+R_2\psi_g). \tag{2.14}$$

Proof. Note that $\psi_g\in\mathcal D\cap D(\Delta)=C^\infty(|x|)\cap C^\infty(H_f)\cap D(\Delta)$. Then (1) follows from Lemma 2.9. Since $\mathcal C$ is a core of $H$, we have $\phi_m\in\mathcal C$ such that $\phi_m\to\psi_g$ and $H\phi_m\to H\psi_g=E\psi_g$ strongly as $m\to\infty$. Then we have for $\phi\in\mathcal C$,
$$(H\phi,b(k_1)\cdots b(k_n)\phi_m)_{\mathcal H}=\sum_{j=0,1,2}(\phi,R_j\phi_m)_{\mathcal H}+(\phi,b(k_1)\cdots b(k_n)H\phi_m)_{\mathcal H}.$$
It follows that
$$\lim_{m\to\infty}(H\phi,b(k_1)\cdots b(k_n)\phi_m)_{\mathcal H}=\lim_{m\to\infty}(b^*(k_n)\cdots b^*(k_1)H\phi,\phi_m)_{\mathcal H}=(b^*(k_n)\cdots b^*(k_1)H\phi,\psi_g)_{\mathcal H}=(H\phi,b(k_1)\cdots b(k_n)\psi_g)_{\mathcal H},$$
$$\lim_{m\to\infty}(\phi,R_j\phi_m)_{\mathcal H}=\lim_{m\to\infty}(R_j^*\phi,\phi_m)_{\mathcal H}=(R_j^*\phi,\psi_g)_{\mathcal H}=(\phi,\overline{R_j}\psi_g)_{\mathcal H},$$
and
$$\lim_{m\to\infty}(\phi,b(k_1)\cdots b(k_n)H\phi_m)_{\mathcal H}=\lim_{m\to\infty}(b^*(k_n)\cdots b^*(k_1)\phi,H\phi_m)_{\mathcal H}=(b^*(k_n)\cdots b^*(k_1)\phi,E\psi_g)_{\mathcal H}=E(\phi,b(k_1)\cdots b(k_n)\psi_g)_{\mathcal H}.$$
Hence
$$(H\phi,b(k_1)\cdots b(k_n)\psi_g)_{\mathcal H}=\sum_{j=0,1,2}(\phi,\overline{R_j}\psi_g)_{\mathcal H}+E(\phi,b(k_1)\cdots b(k_n)\psi_g)_{\mathcal H}.$$
Then $b(k_1)\cdots b(k_n)\psi_g\in D(H)$ and we have
$$Hb(k_1)\cdots b(k_n)\psi_g=\sum_{j=0,1,2}\overline{R_j}\psi_g+Eb(k_1)\cdots b(k_n)\psi_g.$$
Note that $\overline{R_0}\psi_g=R_0\psi_g$ and $\overline{R_2}\psi_g=R_2\psi_g$. Then (2.13) follows.

2.3. Exponential decay of $N^{k/2}\psi_g$

Lemma 2.11. Suppose that $\psi_g\in D(N^{k/2})$. Then there exist positive constants $D_k$, and $\delta$ independent of $k$, such that
$$\|N^{k/2}\psi_g(x)\|_{\mathcal F}\le D_ke^{-\delta|x|^{m+1}}$$
for almost every $x\in\mathbb R^3$. In particular $N^{k/2}\psi_g\in D(e^{\delta|x|^{m+1}})$.

The proof of Lemma 2.11 is based on a functional integral representation of $e^{-tH}$. Essential ingredients of the proof have been obtained in [14]. The proof is, however, long and complicated, so we move it to Appendix B.

Lemma 2.12. Suppose that $\psi_g\in D(N^{k/2})$. Then $|x|^l\psi_g\in D(N^{k/2})$ for all $l\ge0$.

Proof. This lemma follows immediately from Lemma 2.11 and the following fundamental lemma.

Lemma 2.13. Let $\mathcal K$ be a Hilbert space, and $A$ and $B$ self-adjoint operators such that $[e^{-itA},e^{-isB}]=0$ for $s,t\in\mathbb R$. Suppose that $\phi\in D(A)\cap D(B)$ and $A\phi\in D(B)$. Then $B\phi\in D(A)$ with $AB\phi=BA\phi$.
Proof. It follows that $t^{-1}(e^{-itA}-1)e^{-isB}\phi=t^{-1}e^{-isB}(e^{-itA}-1)\phi$. Take $t\to0$ on both sides. Then it follows that $e^{-isB}\phi\in D(A)$ with $Ae^{-isB}\phi=e^{-isB}A\phi$. From this identity we have $s^{-1}A(e^{-isB}-1)\phi=s^{-1}(e^{-isB}-1)A\phi$. Take $s\to0$ on both sides. Since $A$ is closed and $A\phi\in D(B)$ by assumption, we see that $B\phi\in D(A)$ and $AB\phi=BA\phi$.

Proof of Lemma 2.12. In Lemma 2.13 we set $\mathcal K=\mathcal H$, $A=N^{k/2}$ and $B=|x|^l$. Since $\psi_g\in D(N^{k/2})\cap D(|x|^l)$ and $N^{k/2}\psi_g\in D(|x|^l)$ by Lemma 2.11, the lemma follows.

3. Proof of the Main Theorems

Lemma 3.1. The following statements are equivalent.

(1) $\Psi\in D(a(k_1)\cdots a(k_n))$ for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$ and
$$\int\sum\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty. \tag{3.1}$$

(2) $\Psi\in D\big(\prod_{j=1}^{n}(N-j+1)^{1/2}\big)$.

Moreover if (1) or (2) is satisfied, then it holds that
$$\int\sum\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n=\Big\|\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big\|_{\mathcal H}^2.$$
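The identity in Lemma 3.1 can be checked numerically in a much simpler setting. The following is a minimal sketch, assuming a single bosonic mode truncated at finitely many particles (so the integral over $k_1,\dots,k_n$ and the polarization sum disappear); the function names and the coefficient vector are illustrative choices, not objects from the paper.

```python
import math

def annihilate(c):
    """Apply a single-mode annihilation operator to Fock coefficients c[0..M].

    (a c)[l] = sqrt(l + 1) * c[l + 1], the one-mode analogue of the factor
    sqrt(l + 1) in the action of a(k) on the (l + 1)-particle sector.
    """
    return [math.sqrt(l + 1) * c[l + 1] for l in range(len(c) - 1)]

def norm_sq(c):
    """Squared l^2 norm of a real coefficient vector."""
    return sum(x * x for x in c)

def falling_factorial(k, n):
    """k (k - 1) ... (k - n + 1); this vanishes for 0 <= k < n."""
    out = 1
    for j in range(n):
        out *= (k - j)
    return out

def lhs(c, n):
    """|| a^n c ||^2, the single-mode analogue of the integral in (3.1)."""
    for _ in range(n):
        c = annihilate(c)
    return norm_sq(c)

def rhs(c, n):
    """|| prod_{j=1}^n (N - j + 1)^{1/2} c ||^2, computed sector by sector."""
    return sum(falling_factorial(k, n) * c[k] ** 2 for k in range(len(c)))

# Arbitrary illustrative vector (hypothetical data, not from the paper).
coeffs = [1.0 / (k + 1) ** 2 for k in range(12)]
for n in range(1, 5):
    assert math.isclose(lhs(coeffs, n), rhs(coeffs, n), rel_tol=1e-12)
```

In this toy model the sectors with fewer than $n$ particles contribute nothing to either side, which mirrors the case distinction between $k\le n-1$ and $k\ge n$ used in the proof.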
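Proof. We prove (1) ⇒ (2). We identify $\mathcal H$ as
$$\mathcal H\cong\bigoplus_{n=0}^{\infty}\mathcal H_n, \tag{3.2}$$
where
$$\mathcal H_n=L^2\big(\mathbb R^3\times(\mathbb R^3\times\{1,2\})^{\times n}_{\rm sym}\big).$$
We note that
$$\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big)^{(k)}=\begin{cases}0,&k=0,1,\dots,n-1,\\ \sqrt{k(k-1)\cdots(k-n+1)}\,\Psi^{(k)},&k\ge n.\end{cases}$$
Define $\Psi_p=\oplus_{m=0}^{\infty}\Psi_p^{(m)}\in\mathcal H$ by
$$\Psi_p^{(m)}=\begin{cases}\Psi^{(m)},&m\le p,\\ 0,&m>p.\end{cases} \tag{3.3}$$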
By the definition of $a(k)$ we have
$$(a(k_1)\cdots a(k_n)\Psi_p)^{(l)}(x,k_1',\dots,k_l')=\sqrt{l+1}\sqrt{l+2}\cdots\sqrt{l+n}\;\Psi_p^{(l+n)}(x,k_1,\dots,k_n,k_1',\dots,k_l').$$
Then
$$\begin{aligned}\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2&=\sum_{l=0}^{\infty}(l+1)(l+2)\cdots(l+n)\int\sum\|\Psi_p^{(l+n)}(\cdot,k_1,\dots,k_n,k_1',\dots,k_l')\|_{L^2(\mathbb R^3)}^2\,dk_1'\cdots dk_l'\\&=\sum_{l=0}^{p-n}(l+1)(l+2)\cdots(l+n)\int\sum\|\Psi^{(l+n)}(\cdot,k_1,\dots,k_n,k_1',\dots,k_l')\|_{L^2(\mathbb R^3)}^2\,dk_1'\cdots dk_l'\\&=\sum_{l=0}^{p-n}\|(a(k_1)\cdots a(k_n)\Psi)^{(l)}\|_{\mathcal H_l}^2.\end{aligned} \tag{3.4}$$
Hence
$$\begin{aligned}\int\sum\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2\,dk_1\cdots dk_n&=\sum_{l=0}^{p-n}(l+1)(l+2)\cdots(l+n)\int\sum\|\Psi^{(l+n)}(\cdot,k_1,\dots,k_n,k_1',\dots,k_l')\|_{L^2(\mathbb R^3)}^2\,dk_1'\cdots dk_l'\,dk_1\cdots dk_n\\&=\sum_{k=n}^{p}k(k-1)(k-2)\cdots(k-n+1)\int\sum\|\Psi^{(k)}(\cdot,k_1,\dots,k_k)\|_{L^2(\mathbb R^3)}^2\,dk_1\cdots dk_k\\&=\sum_{k=0}^{p}\Big\|\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big)^{(k)}\Big\|_{\mathcal H_k}^2.\end{aligned} \tag{3.5}$$
By (1) we see that
$$\lim_{p\to\infty}\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2=\lim_{p\to\infty}\sum_{k=0}^{p-n}\|(a(k_1)\cdots a(k_n)\Psi)^{(k)}\|_{\mathcal H_k}^2=\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2 \tag{3.6}$$
for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$, and
$$\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\in L^1(\mathbb R^{3n}).$$
Thus the Lebesgue dominated convergence theorem yields that
$$\lim_{p\to\infty}\int\sum\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty.$$
Then from (3.5), it follows that
$$\Big\|\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big\|_{\mathcal H}^2=\lim_{p\to\infty}\sum_{k=0}^{p}\Big\|\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big)^{(k)}\Big\|_{\mathcal H_k}^2<\infty.$$
Thus (2) follows. We prove (2) ⇒ (1). By (3.5) and (2) we see that
$$\lim_{p\to\infty}\int\sum\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty,$$
and by (3.4), $\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2$ is increasing in $p$. Then we have by the Lebesgue monotone convergence theorem,
$$\lim_{p\to\infty}\int\sum\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2\,dk_1\cdots dk_n=\int\sum\lim_{p\to\infty}\|a(k_1)\cdots a(k_n)\Psi_p\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty. \tag{3.7}$$
Then (1) follows from (3.6).

Lemma 3.2. The following statements are equivalent.

(1) $\Psi\in D(a(k_1)\cdots a(k_n))$ for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$ and
$$\int\sum\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty$$
for $n=1,2,\dots,k$.

(2) $\Psi\in D(N^{k/2})$.
Proof. By Lemma 3.1, it is enough to show that
$$\bigcap_{n=1}^{k}D\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Big)=D(N^{k/2}).$$
Assume that
$$\Psi\in\bigcap_{n=1}^{k}D\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Big). \tag{3.8}$$
Let $\Psi_p$ be defined by (3.3). Let $W_n=\prod_{j=1}^{n}(N-j+1)$. For example $N=W_1$, $N^2=W_2+W_1$, $N^3=W_3+3W_2+W_1$, $N^4=W_4+6W_3+7W_2+W_1$, etc. One can inductively see that there exist constants $a_j$, $j=1,\dots,k$, such that on $\mathcal F_0$,
$$N^k=\sum_{j=1}^{k}a_jW_j.$$
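The coefficients in this expansion can be generated mechanically. The sketch below assumes, as a standard combinatorial fact rather than a statement from the paper, that the $a_j$ are the Stirling numbers of the second kind; it then checks $N^k=\sum_j a_jW_j$ on each particle-number sector, where $N$ acts as multiplication by an integer $m$. The helper names are illustrative.

```python
def stirling2(k, j):
    """Stirling number of the second kind S(k, j) via the standard recurrence."""
    if k == j:
        return 1
    if j == 0 or j > k:
        return 0
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)

def falling_factorial(m, j):
    """W_j on the m-particle sector: m (m - 1) ... (m - j + 1)."""
    out = 1
    for i in range(j):
        out *= (m - i)
    return out

def coefficients(k):
    """Return [a_1, ..., a_k] such that N^k = sum_j a_j W_j."""
    return [stirling2(k, j) for j in range(1, k + 1)]

for k in range(1, 7):
    a = coefficients(k)
    for m in range(0, 20):  # m is the eigenvalue of N on the m-particle sector
        assert m ** k == sum(a[j - 1] * falling_factorial(m, j)
                             for j in range(1, k + 1))

print(coefficients(4))  # [1, 7, 6, 1], i.e. N^4 = W_4 + 6 W_3 + 7 W_2 + W_1
```

All coefficients produced this way are positive, which is what makes each term on the right hand side of the norm identity that follows individually controlled.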
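Then it follows that
$$\|N^{k/2}\Psi_p\|_{\mathcal H}^2=a_1\|W_1^{1/2}\Psi_p\|_{\mathcal H}^2+a_2\|W_2^{1/2}\Psi_p\|_{\mathcal H}^2+\cdots+a_k\|W_k^{1/2}\Psi_p\|_{\mathcal H}^2. \tag{3.9}$$
As $p\to\infty$, from (3.8) it follows that the right hand side of (3.9) converges to
$$a_1\|W_1^{1/2}\Psi\|_{\mathcal H}^2+a_2\|W_2^{1/2}\Psi\|_{\mathcal H}^2+\cdots+a_k\|W_k^{1/2}\Psi\|_{\mathcal H}^2.$$
Since
$$\|N^{k/2}\Psi\|_{\mathcal H}^2=\lim_{p\to\infty}\sum_{m=0}^{p}\|(N^{k/2}\Psi)^{(m)}\|_{\mathcal H_m}^2=\lim_{p\to\infty}\|N^{k/2}\Psi_p\|_{\mathcal H}^2<\infty,$$
$\Psi\in D(N^{k/2})$ follows. Then
$$\bigcap_{n=1}^{k}D\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Big)\subset D(N^{k/2}). \tag{3.10}$$
Next we assume that $\Psi\in D(N^{k/2})$. Note that
$$\Psi\in\bigcap_{l=1}^{k}D(N^{l/2}).$$
It is seen that there exist constants $b^n_l$, $l=1,\dots,n$, such that
$$\Big\|\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi_p\Big\|^2=\Big(\Psi_p,\prod_{j=1}^{n}(N-j+1)\Psi_p\Big)=(\Psi_p,N(N-1)(N-2)\cdots(N-n+1)\Psi_p)\le\sum_{l=1}^{n}b^n_l\|N^{l/2}\Psi_p\|^2. \tag{3.11}$$
Take $p\to\infty$ on both sides above. Then the right hand side of (3.11) converges to
$$\sum_{l=1}^{n}b^n_l\|N^{l/2}\Psi\|^2.$$
Hence
$$\Big\|\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big\|_{\mathcal H}^2=\lim_{p\to\infty}\sum_{m=0}^{p}\Big\|\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi\Big)^{(m)}\Big\|_{\mathcal H_m}^2=\lim_{p\to\infty}\Big\|\prod_{j=1}^{n}(N-j+1)^{1/2}\Psi_p\Big\|_{\mathcal H}^2<\infty.$$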
Thus $\Psi\in\bigcap_{n=1}^{k}D\big(\prod_{j=1}^{n}(N-j+1)^{1/2}\big)$. We obtain
$$\bigcap_{n=1}^{k}D\Big(\prod_{j=1}^{n}(N-j+1)^{1/2}\Big)\supset D(N^{k/2}). \tag{3.12}$$
The lemma follows from (3.10) and (3.12).

We set
$$R_\omega=R_\omega(k_1,\dots,k_n)=\Big(H-E+\sum_{p=1}^{n}\omega(k_p)\Big)^{-1}.$$

Lemma 3.3. There exist $\delta_1(\cdot)\in L^2(\mathbb R^3)$ and $\delta_2(\cdot,\cdot)\in L^2(\mathbb R^3\times\mathbb R^3)$ such that for $\Psi\in\mathcal D$,
$$\|R_\omega\vartheta_1(k_q)\Psi\|_{\mathcal H}\le\delta_1(k_q)\|(|x|+1)\Psi\|_{\mathcal H}, \tag{3.13}$$
and
$$\|R_\omega\vartheta_2(k_q,k_p)\Psi\|_{\mathcal H}\le\delta_2(k_q,k_p)\||x|^2\Psi\|_{\mathcal H}. \tag{3.14}$$
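Proof. By the closed graph theorem there exists a constant $C$ such that $\|(-\Delta+H_f)\Psi\|_{\mathcal H}\le C\|(H+1)\Psi\|_{\mathcal H}$. First we shall prove that $R_\omega(p\cdot k_q)$ and $R_\omega(A\cdot k_q)$ are bounded with
$$\|R_\omega(p\cdot k_q)\|\le c_1(k_q) \tag{3.15}$$
and
$$\|R_\omega(A\cdot k_q)\|\le c_2(k_q), \tag{3.16}$$
where $c_1(k_q)=\sqrt{(|k_q|+|1+E|)C}$ and $c_2(k_q)=\sqrt2\,(c_1(k_q)+1)\big(2\|\hat\varphi/\omega\|+\|\hat\varphi/\sqrt\omega\|\big)$. Let $\Psi\in\mathcal C$. Since
$$\|(p\cdot k_q)\Psi\|_{\mathcal H}^2\le|k_q|^2(\Psi,C(H+1)\Psi)\le|k_q|^2C\big\{\|(H-E)^{1/2}\Psi\|_{\mathcal H}^2+|1+E|\|\Psi\|_{\mathcal H}^2\big\},$$
we see that
$$\|(p\cdot k_q)R_\omega\Psi\|_{\mathcal H}^2\le C|k_q|\|\Psi\|_{\mathcal H}^2+C|1+E|\|\Psi\|_{\mathcal H}^2.$$
Thus (3.15) follows. Note that
$$\|a(f,j)\Psi\|_{\mathcal H}\le\|f/\sqrt\omega\|\,\|H_f^{1/2}\Psi\|_{\mathcal F},$$
and
$$\|a^*(f,j)\Psi\|_{\mathcal F}\le\|f/\sqrt\omega\|\,\|H_f^{1/2}\Psi\|_{\mathcal F}+\|f\|\,\|\Psi\|_{\mathcal F}.$$
Since
$$\|(A\cdot k_q)\Psi\|_{\mathcal H}\le\sqrt2\,|k_q|\big(2\|\hat\varphi/\omega\|+\|\hat\varphi/\sqrt\omega\|\big)\big(\|H_f^{1/2}\Psi\|_{\mathcal H}+\|\Psi\|_{\mathcal H}\big)$$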
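and
$$\|H_f^{1/2}\Psi\|_{\mathcal H}^2\le C(\Psi,(H+1)\Psi)_{\mathcal H}\le C\|(H-E)^{1/2}\Psi\|_{\mathcal H}^2+C|1+E|\|\Psi\|_{\mathcal H}^2,$$
we have
$$|k_q|^2\|H_f^{1/2}R_\omega\Psi\|_{\mathcal H}^2\le C|k_q|\|\Psi\|_{\mathcal H}^2+C|1+E|\|\Psi\|_{\mathcal H}^2.$$
Hence
$$\begin{aligned}\|(A\cdot k_q)R_\omega\Psi\|_{\mathcal H}&\le\sqrt2\big(2\|\hat\varphi/\omega\|+\|\hat\varphi/\sqrt\omega\|\big)\big(|k_q|\|H_f^{1/2}R_\omega\Psi\|_{\mathcal H}+|k_q|\|R_\omega\Psi\|_{\mathcal H}\big)\\&\le\sqrt2\Big\{\sqrt{(|k_q|+|1+E|)C}+1\Big\}\big(2\|\hat\varphi/\omega\|+\|\hat\varphi/\sqrt\omega\|\big)\|\Psi\|_{\mathcal H}.\end{aligned}$$
Thus (3.16) follows. We have on $\mathcal C$,
$$\vartheta_1(k)=i(p-eA)\cdot k\,(x\cdot\beta(k))+\frac i2\big(i\beta(k)\cdot k+x\cdot\beta(k)|k|^2\big)-i\omega(k)(x\cdot\beta(k)).$$
Then by (3.15) and (3.16) we have for $\Psi\in\mathcal C$,
$$\|R_\omega\,i(p-eA)\cdot k_p\,(x\cdot\beta(k_p))\Psi\|_{\mathcal H}\le\big(c_1(k_p)+|e|c_2(k_p)\big)\frac{|e|}{\sqrt2}\frac{|\hat\varphi(k_p)|}{\sqrt{\omega(k_p)}}\||x|\Psi\|_{\mathcal H},$$
$$\|R_\omega(-i\omega(k_p)(x\cdot\beta(k_p)))\Psi\|_{\mathcal H}\le\frac{|e|}{\sqrt2}\sqrt{\omega(k_p)}\,|\hat\varphi(k_p)|\,\||x|\Psi\|_{\mathcal H},$$
and
$$\Big\|R_\omega\frac i2\big(i\beta(k_p)\cdot k_p+x\cdot\beta(k_p)|k_p|^2\big)\Psi\Big\|_{\mathcal H}\le\frac12\frac{|e|}{\sqrt2}\Big(\frac{|\hat\varphi(k_p)|}{\omega(k_p)}\|\Psi\|_{\mathcal H}+|\hat\varphi(k_p)|\,\||x|\Psi\|_{\mathcal H}\Big).$$
Since $\|\hat\varphi\|<\infty$, $\|\sqrt\omega\hat\varphi\|<\infty$ and $\|\hat\varphi/\omega\|<\infty$, (3.13) follows for $\Psi\in\mathcal C$. By a limiting argument it can be extended to $\Psi\in\mathcal D$. (3.14) is rather easier than (3.13). We have for $\Psi\in\mathcal C$,
$$\|R_\omega\vartheta_2(k_p,k_q)\Psi\|_{\mathcal H}\le\frac{e^2}{2}\sqrt{\omega(k_p)\omega(k_q)}\,|\hat\varphi(k_q)\hat\varphi(k_p)|\,\||x|^2\Psi\|_{\mathcal H}.$$
Thus the lemma follows from a limiting argument and $\|\sqrt\omega\hat\varphi\|<\infty$.

From Lemma 3.3 the next lemma immediately follows.

Lemma 3.4. For almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$ it follows that
$$\psi_g\in D(b(k_1)\cdots b(k_n))\cap\bigcap_{p}\Big[D\big(b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\big)\Big]\cap\bigcap_{q<p}\Big[D\big(b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\big)\Big] \tag{3.17}$$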
and
$$\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}\le\sum_{p=1}^{n}\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\psi_g\big\|_{\mathcal H}+\sum_{p=1}^{n}\sum_{q<p}\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\psi_g\big\|_{\mathcal H}.$$

Proof. Note that for $(k_1,\dots,k_n)\notin\mathcal N(\psi_g)$ and $(k_1,\dots,k_n)\ne(0,\dots,0)$,
$$b(k_1)\cdots b(k_n)\psi_g=R_\omega(k_1,\dots,k_n)R_1(k_1,\dots,k_n)\psi_g+R_\omega(k_1,\dots,k_n)R_2(k_1,\dots,k_n)\psi_g.$$
Let $\Psi\in\mathcal C$ and $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$. It is obtained that
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|R_\omega(k_1,\dots,k_n)R_1(k_1,\dots,k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n\le\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\sum_{p=1}^{n}\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\Psi\big\|_{\mathcal H}\,dk_1\cdots dk_n. \tag{3.18}$$
Similarly we obtain that
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|R_\omega(k_1,\dots,k_n)R_2(k_1,\dots,k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n\le\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\sum_{p=1}^{n}\sum_{q<p}\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\Psi\big\|_{\mathcal H}\,dk_1\cdots dk_n. \tag{3.19}$$
We choose a sequence $\{\Psi_m\}\subset\mathcal C$ such that $\Psi_m\to\psi_g$ and $((H_f+1)^K+|x|^{2K})\Psi_m\to((H_f+1)^K+|x|^{2K})\psi_g$ strongly as $m\to\infty$ for sufficiently large $K$. Note that $|x|^j\Psi_m\to|x|^j\psi_g$ and $((H_f+1)^n+|x|^{2n})|x|^j\Psi_m\to((H_f+1)^n+|x|^{2n})|x|^j\psi_g$ strongly as $m\to\infty$ for $j=1,2$, since $K$ is sufficiently large. By Lemma 2.7 there exists a subsequence $\{m'\}\subset\{m\}$ such that for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$, (3.17) follows and
$$b(k_1)\cdots b(k_n)\Psi_{m'}\to b(k_1)\cdots b(k_n)\psi_g, \tag{3.20}$$
$$b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\Psi_{m'}\to b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\psi_g, \tag{3.21}$$
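and
$$b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\Psi_{m'}\to b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\psi_g \tag{3.22}$$
strongly as $m'\to\infty$. Moreover
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\Psi_{m'}\big\|_{\mathcal H}\,dk_1\cdots dk_n\le\int\delta_1(k_p)|f_p(k_p)|\,dk_p\;\epsilon'(f_1,\dots,\widehat{f_p},\dots,f_n)\,\|((H_f+1)^{n-1}+|x|^{2n-2})(|x|+1)\Psi_{m'}\|_{\mathcal H}$$
and
$$\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\Psi_{m'}\big\|_{\mathcal H}\,dk_1\cdots dk_n\le\int\delta_2(k_p,k_q)|f_p(k_p)f_q(k_q)|\,dk_p\,dk_q\;\epsilon'(f_1,\dots,\widehat{f_p},\dots,\widehat{f_q},\dots,f_n)\,\|((H_f+1)^{n-2}+|x|^{2n-4})|x|^2\Psi_{m'}\|_{\mathcal H}.$$
Then we have by (3.18) and (3.19),
$$\begin{aligned}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|b(k_1)\cdots b(k_n)\Psi_{m'}\|_{\mathcal H}\,dk_1\cdots dk_n&\le\sum_{p=1}^{n}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\Psi_{m'}\big\|_{\mathcal H}\,dk_1\cdots dk_n\\&\quad+\sum_{p=1}^{n}\sum_{q<p}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\Psi_{m'}\big\|_{\mathcal H}\,dk_1\cdots dk_n\\&\le C\|((H_f+1)^K+|x|^{2K})\Psi_{m'}\|_{\mathcal H}\le C'\|((H_f+1)^K+|x|^{2K})\psi_g\|_{\mathcal H}\end{aligned}$$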
with some constants $C$ and $C'$. Thus by the Lebesgue dominated convergence theorem and (3.20), (3.21) and (3.22), we have
$$\begin{aligned}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}\,dk_1\cdots dk_n&\le\sum_{p=1}^{n}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_1(k_p)\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\psi_g\big\|_{\mathcal H}\,dk_1\cdots dk_n\\&\quad+\sum_{p=1}^{n}\sum_{q<p}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\delta_2(k_p,k_q)\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\psi_g\big\|_{\mathcal H}\,dk_1\cdots dk_n.\end{aligned}$$
Since $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$, are arbitrary, the lemma follows.

Lemma 3.5. Let $\Psi\in\mathcal D$. Then for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$ it follows that
$$\Psi\in D(b(k_1)\cdots b(k_n))\cap\bigcap_{l=0}^{n}\ \bigcap_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big[D\big(a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^l\big)\Big]$$
and
$$\|b(k_1)\cdots b(k_n)\Psi\|_{\mathcal H}\le\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\ \prod_{j=1}^{l}\frac{|e\hat\varphi(k_{p_j})|}{\sqrt{2\omega(k_{p_j})}}\ \big\|a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^l\Psi\big\|_{\mathcal H}. \tag{3.23}$$

Proof. Take a sequence $\{\Psi_m\}\subset\mathcal C$ such that $\Psi_m\to\Psi$ and $(H_f^K+|x|^{2K}+1)\Psi_m\to(H_f^K+|x|^{2K}+1)\Psi$ strongly as $m\to\infty$ for sufficiently large $K$. (3.23) is valid for $\Psi$ replaced by $\Psi_m$, since
$$b(k_1)\cdots b(k_n)\Psi_m=(a(k_1)+\theta_1)\cdots(a(k_n)+\theta_n)\Psi_m=\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\theta_{p_1}\cdots\theta_{p_l}\,a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)\Psi_m.$$
Note that $|x|^l\Psi_m\to|x|^l\Psi$ and $((H_f+1)^n+|x|^{2n})|x|^l\Psi_m\to((H_f+1)^n+|x|^{2n})|x|^l\Psi$ strongly as $m\to\infty$, since $K$ is sufficiently large. By Lemmas 2.4 and 2.7, there exists a subsequence $\{m'\}\subset\{m\}$ such that for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$,
$$b(k_1)\cdots b(k_n)\Psi_{m'}\to b(k_1)\cdots b(k_n)\Psi$$
and
$$a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^l\Psi_{m'}\to a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^l\Psi.$$
Thus the proof is complete.

Lemma 3.6. Let $\Psi\in\mathcal D$. Then for almost every $(k_1,\dots,k_n)\in\mathbb R^{3n}$,
$$\Psi\in D(a(k_1)\cdots a(k_n))\cap\bigcap_{l=0}^{n}\ \bigcap_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big[D\big(b(k_1)\cdots\widehat{b(k_{p_1})}\cdots\widehat{b(k_{p_l})}\cdots b(k_n)|x|^l\big)\Big]$$
and
$$\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}\le\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\ \prod_{j=1}^{l}\frac{|e\hat\varphi(k_{p_j})|}{\sqrt{2\omega(k_{p_j})}}\ \big\|b(k_1)\cdots\widehat{b(k_{p_1})}\cdots\widehat{b(k_{p_l})}\cdots b(k_n)|x|^l\Psi\big\|_{\mathcal H}.$$

Proof. Note that $a(k_1)\cdots a(k_n)=(b(k_1)-\theta_1)\cdots(b(k_n)-\theta_n)$. The lemma is proven in a similar way to Lemma 3.5.

Lemma 3.7. Suppose that $|x|^z\Psi\in D(N^{n/2})\cap\mathcal D$ for $z=m,m+1,\dots,m+n$. Then there exist constants $c^{n,l}_k$ such that
$$\int\sum\|b(k_1)\cdots b(k_n)|x|^m\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n\le\sum_{l=0}^{n}\sum_{k=1}^{n-l}c^{n,l}_k\,\|N^{k/2}|x|^{m+l}\Psi\|_{\mathcal H}^2. \tag{3.24}$$

Proof. We have by Lemma 3.5 and $\big|\sum_{j=1}^{N}x_j\big|^2\le N\sum_{j=1}^{N}x_j^2$,
$$\int\sum\|b(k_1)\cdots b(k_n)|x|^m\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big\|\frac{e\hat\varphi}{\sqrt{2\omega}}\Big\|^{2l}\int\sum\big\|a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^{m+l}\Psi\big\|_{\mathcal H}^2\,dk_1\cdots\widehat{dk_{p_1}}\cdots\widehat{dk_{p_l}}\cdots dk_n.$$
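By the assumption it follows that $|x|^{m+l}\Psi\in D(N^{n/2})$. Thus we see that
$$\int\sum\big\|a(k_1)\cdots\widehat{a(k_{p_1})}\cdots\widehat{a(k_{p_l})}\cdots a(k_n)|x|^{m+l}\Psi\big\|_{\mathcal H}^2\,dk_1\cdots\widehat{dk_{p_1}}\cdots\widehat{dk_{p_l}}\cdots dk_n=\Big\|\prod_{j=1}^{n-l}(N-j+1)^{1/2}|x|^{m+l}\Psi\Big\|_{\mathcal H}^2\le\sum_{k=1}^{n-l}a^{n,l}_k\,\|N^{k/2}|x|^{m+l}\Psi\|_{\mathcal H}^2$$
with some constants $a^{n,l}_k$. Then
$$\int\sum\|b(k_1)\cdots b(k_n)|x|^m\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big(\Big\|\frac{e\hat\varphi}{\sqrt{2\omega}}\Big\|^2\Big)^{l}\sum_{k=1}^{n-l}a^{n,l}_k\,\|N^{k/2}|x|^{m+l}\Psi\|_{\mathcal H}^2.$$
Hence we conclude (3.24).

We denote the right hand side of (3.24) by $R_{n,m}(\Psi)$, i.e.
$$R_{n,m}(\Psi)=\sum_{l=0}^{n}\sum_{k=1}^{n-l}c^{n,l}_k\,\|N^{k/2}|x|^{m+l}\Psi\|_{\mathcal H}^2.$$

Lemma 3.8. Let $\Psi\in\mathcal D$. Then there exist constants $d^n_l$ such that
$$\int\sum\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\Big\{\int\sum\|b(k_1)\cdots b(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n+\sum_{l=1}^{n}d^n_lR_{n-l,l}(\Psi)\Big\}.$$

Proof. We have by Lemma 3.6,
$$\int\sum\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\sum_{l=0}^{n}\ \sum_{\{p_1,\dots,p_l\}\subset\{1,\dots,n\}}\Big(\Big\|\frac{e\hat\varphi}{\sqrt{2\omega}}\Big\|^2\Big)^{l}\int\sum\big\|b(k_1)\cdots\widehat{b(k_{p_1})}\cdots\widehat{b(k_{p_l})}\cdots b(k_n)|x|^l\Psi\big\|_{\mathcal H}^2\,dk_1\cdots\widehat{dk_{p_1}}\cdots\widehat{dk_{p_l}}\cdots dk_n. \tag{3.25}$$
The term with $l=0$ in (3.25) is just $\int\sum\|b(k_1)\cdots b(k_n)\Psi\|_{\mathcal H}^2\,dk_1\cdots dk_n$. The lemma follows from Lemma 3.7.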
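The combinatorics behind Lemmas 3.5 to 3.8 is the expansion of a product of $n$ binomials into $2^n$ terms indexed by subsets $\{p_1,\dots,p_l\}$, followed by the elementary bound $|\sum_{j=1}^{N}x_j|^2\le N\sum_{j=1}^{N}x_j^2$ with $N=2^n$. The sketch below illustrates this with commuting rational numbers standing in for the operators $a(k_j)$ and $\theta_j$; this substitution is an assumption made purely for illustration, and the sample values are arbitrary.

```python
from fractions import Fraction
from itertools import combinations

def expand_product(a, theta):
    """Terms of prod_j (a_j + theta_j), one per subset {p_1, ..., p_l}.

    For each subset, theta_p replaces a_p for p in the subset, as in the
    sum over {p_1, ..., p_l} in the text. Commuting scalars stand in for
    the operators here.
    """
    n = len(a)
    terms = []
    for l in range(n + 1):
        for subset in combinations(range(n), l):
            term = Fraction(1)
            for j in range(n):
                term *= theta[j] if j in subset else a[j]
            terms.append(term)
    return terms

# Arbitrary illustrative values (hypothetical data).
a = [Fraction(3, 2), Fraction(-1, 3), Fraction(5, 7), Fraction(2)]
theta = [Fraction(1, 4), Fraction(2, 5), Fraction(-3, 8), Fraction(1, 9)]

terms = expand_product(a, theta)
product = Fraction(1)
for aj, tj in zip(a, theta):
    product *= aj + tj

assert len(terms) == 2 ** len(a)   # one term per subset
assert sum(terms) == product       # the expansion is exact
# Squaring a sum of 2^n terms costs at most a factor 2^n:
assert product ** 2 <= 2 ** len(a) * sum(t * t for t in terms)
```

The last assertion is the scalar counterpart of the prefactor $2^n$ appearing in the proofs of Lemmas 3.7 and 3.8.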
Proof of Theorem 1.7. We prove the theorem by induction. It is known that $\psi_g\in D(N^{1/2})$. Suppose that $\psi_g\in D(N^{(n-1)/2})$. Then by Lemma 3.2,
$$\int\sum\|a(k_1)\cdots a(k_l)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_l<\infty,\qquad l=1,2,\dots,n-1, \tag{3.26}$$
and by Lemma 2.12,
$$\|N^{l/2}|x|^m\psi_g\|_{\mathcal H}<\infty \tag{3.27}$$
follows for all $m\ge0$ and $l\le n-1$. By Lemma 3.8,
$$\int\sum\|a(k_1)\cdots a(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n\le2^n\Big\{\int\sum\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n+\sum_{l=1}^{n}d^n_lR_{n-l,l}(\psi_g)\Big\}.$$
By (3.27) we see that $R_{n-l,l}(\psi_g)<\infty$. From Lemma 3.4 it follows that
$$\begin{aligned}\int\sum\|b(k_1)\cdots b(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n&\le\delta_1\sum_{p=1}^{n}\int\sum\big\|b(k_1)\cdots\widehat{b(k_p)}\cdots b(k_n)(|x|+1)\psi_g\big\|_{\mathcal H}^2\,dk_1\cdots\widehat{dk_p}\cdots dk_n\\&\quad+\delta_2\sum_{p=1}^{n}\sum_{q<p}\int\sum\big\|b(k_1)\cdots\widehat{b(k_q)}\cdots\widehat{b(k_p)}\cdots b(k_n)|x|^2\psi_g\big\|_{\mathcal H}^2\,dk_1\cdots\widehat{dk_q}\cdots\widehat{dk_p}\cdots dk_n,\end{aligned} \tag{3.28}$$
where $\delta_1=\int\delta_1(k)^2\,dk$ and $\delta_2=\int\delta_2(k,k')^2\,dk\,dk'$. Then the right hand side of (3.28) is finite by Lemma 3.7. Hence
$$\int\sum\|a(k_1)\cdots a(k_n)\psi_g\|_{\mathcal H}^2\,dk_1\cdots dk_n<\infty$$
follows, which implies, together with (3.26), that $\psi_g\in D(N^{n/2})$ by Lemma 3.2. Thus the theorem follows.

Proof of Theorem 1.9. This follows from Theorem 1.7 and Lemma 2.11.
Appendix A.

Lemma A.1. Let $\Psi\in D(H_f^{n/2})$. Then there exists $\mathcal M(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero such that
$$\Psi\in D(a(k_1)\cdots a(k_n)) \tag{A.1}$$
for $(k_1,\dots,k_n)\notin\mathcal M(\Psi)$. Moreover assume that $\{\Psi_m\}\subset\mathcal C$ satisfies $\Psi_m\to\Psi$ and $(H_f+1)^{n/2}\Psi_m\to(H_f+1)^{n/2}\Psi$ strongly as $m\to\infty$. Then there exist a subsequence $\{m'\}\subset\{m\}$ and $\mathcal M(\Psi,\{\Psi_m\},\{m'\})\subset\mathbb R^{3n}$ with Lebesgue measure zero such that (A.1) follows and
$$\operatorname*{s-lim}_{m'\to\infty}a(k_1)\cdots a(k_n)\Psi_{m'}=a(k_1)\cdots a(k_n)\Psi$$
for $(k_1,\dots,k_n)\notin\mathcal M(\Psi,\{\Psi_m\},\{m'\})$.

Proof. We fix a sequence $\{\Psi_m\}$. The lemma is proven inductively. Note that
$$\|(H_f+1)^p\Psi\|\le\|(H_f+1)^q\Psi\| \tag{A.2}$$
for $p\le q$. By (2.1) we see that
$$\int\sum|f_1(k_1)|\,\|a(k_1)\Psi_m\|_{\mathcal H}\,dk_1\le\epsilon(f_1)\|(H_f+1)^{1/2}\Psi_m\|_{\mathcal H} \tag{A.3}$$
for arbitrary $f_1\in C_0^\infty(\mathbb R^3\setminus\{0\})$. The right hand side of (A.3) converges as $m\to\infty$ by (A.2). Then the left hand side of (A.3) is a Cauchy sequence. Then there exist $\mathcal N_1(\Psi)\subset\mathbb R^3$ with Lebesgue measure zero and a subsequence $\{m_1\}\subset\{m\}$ such that $a(k_1)\Psi_{m_1}$ converges strongly as $m_1\to\infty$ for $k_1\notin\mathcal N_1(\Psi)$. Since $a(k_1)$ is closed, it follows that for $k_1\notin\mathcal N_1(\Psi)$, $\Psi\in D(a(k_1))$ and
$$\operatorname*{s-lim}_{m_1\to\infty}a(k_1)\Psi_{m_1}=a(k_1)\Psi.$$
For $\Psi_{m_1}$ we have by (2.1),
$$\int\sum|f_1(k_1)f_2(k_2)|\,\|a(k_1)a(k_2)\Psi_{m_1}\|_{\mathcal H}\,dk_1\,dk_2\le\epsilon(f_1,f_2)\|(H_f+1)\Psi_{m_1}\|_{\mathcal H}$$
for arbitrary $f_1,f_2\in C_0^\infty(\mathbb R^3\setminus\{0\})$. Then we also see that there exist $\mathcal N_2(\Psi)\subset\mathbb R^3\times\mathbb R^3$ with Lebesgue measure zero and a subsequence $\{m_2\}\subset\{m_1\}$ such that $a(k_1)a(k_2)\Psi_{m_2}$ converges strongly as $m_2\to\infty$ for $(k_1,k_2)\notin\mathcal N_2(\Psi)$. Since $a(k_2)\Psi_{m_2}\to a(k_2)\Psi$ strongly as $m_2\to\infty$ for $k_2\notin\mathcal N_1(\Psi)$ and $a(k_1)$ is closed, we see that for $(k_1,k_2)\notin\mathcal N_2(\Psi)\cup[\mathbb R^3\times\mathcal N_1(\Psi)]$, $a(k_2)\Psi\in D(a(k_1))$ and
$$\operatorname*{s-lim}_{m_2\to\infty}a(k_1)a(k_2)\Psi_{m_2}=a(k_1)a(k_2)\Psi.$$
Repeating this procedure we see that there exist subsets $\mathcal N_j(\Psi)\subset\mathbb R^{3j}$, $j=1,\dots,n$, with Lebesgue measure zero and subsequences $\{m_n\}\subset\{m_{n-1}\}\subset\cdots\subset\{m\}$ such that for $(k_1,\dots,k_n)\notin\mathcal N_n(\Psi)$, $a(k_1)\cdots a(k_n)\Psi_{m_n}$ converges strongly as $m_n\to\infty$, and $a(k_2)\cdots a(k_n)\Psi_{m_n}\to a(k_2)\cdots a(k_n)\Psi$ strongly as $m_n\to\infty$ for $(k_2,\dots,k_n)\notin\mathcal N_{n-1}(\Psi)\cup[\mathbb R^3\times\mathcal N_{n-2}(\Psi)]\cup\cdots\cup[\mathbb R^{3(n-2)}\times\mathcal N_1(\Psi)]$. Let
$$\mathcal M(\Psi,\{\Psi_m\},\{m'\})=\mathcal N_n(\Psi)\cup[\mathbb R^3\times\mathcal N_{n-1}(\Psi)]\cup\cdots\cup[\mathbb R^{3(n-1)}\times\mathcal N_1(\Psi)].$$
Since $a(k_1)$ is closed, we see that for $(k_1,\dots,k_n)\notin\mathcal M(\Psi,\{\Psi_m\},\{m'\})$, $a(k_2)\cdots a(k_n)\Psi\in D(a(k_1))$ and
$$\operatorname*{s-lim}_{m_n\to\infty}a(k_1)\cdots a(k_n)\Psi_{m_n}=a(k_1)\cdots a(k_n)\Psi.$$
Thus the proof is complete.

We define ${\rm ad}^l_A(B)$ by ${\rm ad}^0_A(B)=B$ and ${\rm ad}^l_A(B)=[A,{\rm ad}^{l-1}_A(B)]$. Note that on $\mathcal F_\omega$,
$$[H_f^p,a(k_1)\cdots a(k_n)]=\sum_{l=1}^{p}\binom{p}{l}{\rm ad}^l_{H_f}(a(k_1)\cdots a(k_n))\,H_f^{p-l},$$
$${\rm ad}^l_{H_f}(a(k_1)\cdots a(k_n))=\sum_{p_1=0}^{l}\ \sum_{p_2=0}^{l-p_1}\cdots\sum_{p_n=0}^{l-\sum_{i=1}^{n-1}p_i}\binom{l}{p_1}\binom{l-p_1}{p_2}\cdots\binom{l-\sum_{i=1}^{n-1}p_i}{p_n}\,{\rm ad}^{p_1}_{H_f}(a(k_1))\,{\rm ad}^{p_2}_{H_f}(a(k_2))\cdots{\rm ad}^{p_n}_{H_f}(a(k_n)),$$
and
$${\rm ad}^p_{H_f}(a(k))=(-1)^p\omega(k)^pa(k).$$
Hence we have
$$[H_f^p,a(k_1)\cdots a(k_n)]=\sum_{l=1}^{p}\binom{p}{l}\sum_{p_1=0}^{l}\ \sum_{p_2=0}^{l-p_1}\cdots\sum_{p_n=0}^{l-\sum_{i=1}^{n-1}p_i}\binom{l}{p_1}\binom{l-p_1}{p_2}\cdots\binom{l-\sum_{i=1}^{n-1}p_i}{p_n}(-1)^l\omega(k_1)^{p_1}\omega(k_2)^{p_2}\cdots\omega(k_n)^{p_n}\,a(k_1)\cdots a(k_n)\,H_f^{p-l}. \tag{A.4}$$

Lemma A.2. Let $\Psi\in C^\infty(H_f)$. Then there exists $\mathcal M_\infty(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero such that for $(k_1,\dots,k_n)\notin\mathcal M_\infty(\Psi)$,
$$\Psi\in D(a(k_1)\cdots a(k_n))$$
and
$$a(k_1)\cdots a(k_n)\Psi\in C^\infty(H_f).$$

Proof. Let $p\ge0$. Let $\{\Psi_m\}\subset\mathcal C$ be such that $\Psi_m\to\Psi$ and $(H_f+1)^q\Psi_m\to(H_f+1)^q\Psi$ strongly as $m\to\infty$ for $q=(n/2)+p$. In particular $(H_f+1)^{n/2}\Psi_m\to(H_f+1)^{n/2}\Psi$ strongly as $m\to\infty$. By Lemma A.1, there exists a subsequence $\{m'\}\subset\{m\}$ such that for $(k_1,\dots,k_n)\notin\mathcal M(\Psi,\{\Psi_m\},\{m'\})$,
$$\Psi\in D(a(k_1)\cdots a(k_n))$$
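and
$$\operatorname*{s-lim}_{m'\to\infty}a(k_1)\cdots a(k_n)\Psi_{m'}=a(k_1)\cdots a(k_n)\Psi. \tag{A.5}$$
We reset $m'$ as $m$. By (A.4), for $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$,
$$\begin{aligned}&\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|H_f^pa(k_1)\cdots a(k_n)\Psi_m\|_{\mathcal H}\,dk_1\cdots dk_n\\&\quad\le\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\Big|\,\|a(k_1)\cdots a(k_n)H_f^p\Psi_m\|_{\mathcal H}\,dk_1\cdots dk_n\\&\qquad+\sum_{l=1}^{p}\binom{p}{l}\sum_{p_1=0}^{l}\ \sum_{p_2=0}^{l-p_1}\cdots\sum_{p_n=0}^{l-\sum_{i=1}^{n-1}p_i}\binom{l}{p_1}\binom{l-p_1}{p_2}\cdots\binom{l-\sum_{i=1}^{n-1}p_i}{p_n}\int\sum\Big|\prod_{j=1}^{n}f_j(k_j)\omega(k_j)^{p_j}\Big|\,\|a(k_1)\cdots a(k_n)H_f^{p-l}\Psi_m\|_{\mathcal H}\,dk_1\cdots dk_n\\&\quad\le\epsilon(f_1,\dots,f_n)\|(H_f+1)^{n/2}H_f^p\Psi_m\|_{\mathcal H}\\&\qquad+\sum_{l=1}^{p}\binom{p}{l}\sum_{p_1=0}^{l}\ \sum_{p_2=0}^{l-p_1}\cdots\sum_{p_n=0}^{l-\sum_{i=1}^{n-1}p_i}\binom{l}{p_1}\binom{l-p_1}{p_2}\cdots\binom{l-\sum_{i=1}^{n-1}p_i}{p_n}\,\epsilon(\omega^{p_1}f_1,\dots,\omega^{p_n}f_n)\,\|(H_f+1)^{n/2}H_f^{p-l}\Psi_m\|_{\mathcal H}\\&\quad\le C\|(H_f+1)^{(n/2)+p}\Psi_m\|_{\mathcal H}\end{aligned} \tag{A.6}$$
with some constant $C$. The right hand side of (A.6) converges as $m\to\infty$. Since $f_j\in C_0^\infty(\mathbb R^3\setminus\{0\})$, $j=1,\dots,n$, are arbitrary, there exist $\mathcal N_p(\Psi)\subset\mathbb R^{3n}$ with Lebesgue measure zero and a subsequence $\{m'\}\subset\{m\}$ such that $H_f^pa(k_1)\cdots a(k_n)\Psi_{m'}$ strongly converges as $m'\to\infty$ for $(k_1,\dots,k_n)\notin\mathcal N_p(\Psi)$. Since $H_f^p$ is closed, we obtain by (A.5) that
$$a(k_1)\cdots a(k_n)\Psi\in D(H_f^p)$$
for $(k_1,\dots,k_n)\notin\Omega_p=\mathcal N_p\cup\mathcal M(\Psi,\{\Psi_m\},\{m'\})$. Define
$$\mathcal M_\infty(\Psi)=\bigcup_{p}\Omega_p.$$
Then it follows that $a(k_1)\cdots a(k_n)\Psi\in C^\infty(H_f)$ for $(k_1,\dots,k_n)\notin\mathcal M_\infty$.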
May 26, 2003 16:50 WSPC/148-RMP
304
00166
F. Hiroshima
Proof of Lemma 2.4. Let $\{\Psi_m\}\subset C$ be such that $\Psi_m\to\Psi$ and $((H_f+1)^{n/2}+|x|^{2p})\Psi_m\to((H_f+1)^{n/2}+|x|^{2p})\Psi$ strongly as $m\to\infty$. From Lemma A.2, it follows that for $(k_1,\dots,k_n)\notin M_\infty(\Psi)$,
\[ a(k_1)\cdots a(k_n)\Psi\in C^\infty(H_f), \]
and from Lemma A.1,
\[ \mathrm{s}\text{-}\lim_{m'\to\infty} a(k_1)\cdots a(k_n)\Psi_{m'} = a(k_1)\cdots a(k_n)\Psi \tag{A.7} \]
with some subsequence $\{m'\}$ for $(k_1,\dots,k_n)\notin M(\Psi,\{\Psi_m\},\{m'\})$. We reset $m'$ as $m$. Let $f_j\in C_0^\infty(\mathbb{R}^3\setminus\{0\})$, $j=1,\dots,n$. Since $[|x|^p, a(k_1)\cdots a(k_n)]\Psi_m = 0$, we have
\[ \Bigl(\sum\!\!\!\!\!\!\int \Bigl|\prod_{j=1}^{n} f_j(k_j)\Bigr|\,\||x|^p a(k_1)\cdots a(k_n)\Psi_m\|_{\mathcal H}\,dk_1\cdots dk_n\Bigr)^2 \le (f_1,\dots,f_n)^2\,\||x|^{2p}\Psi\|_{\mathcal H}\,\|(H_f+1)\Psi_m\|_{\mathcal H} \le (f_1,\dots,f_n)^2\,\|((H_f+1)+|x|^{2p})\Psi_m\|_{\mathcal H}^2 . \]
Since the right hand side converges as $m\to\infty$, there exist $N_p(\Psi)'\subset\mathbb{R}^{3n}$ with the Lebesgue measure zero and a subsequence $\{m'\}$ such that $|x|^p a(k_1)\cdots a(k_n)\Psi_{m'}$ strongly converges as $m'\to\infty$ for $(k_1,\dots,k_n)\notin N_p(\Psi)'$. Since $|x|^p$ is closed and by (A.7), $a(k_1)\cdots a(k_n)\Psi\in D(|x|^p)$ follows for $(k_1,\dots,k_n)\notin\Omega'_p = N_p(\Psi)'\cup M(\Psi,\{\Psi_m\},\{m'\})$. Then for $(k_1,\dots,k_n)\notin\bigcup_p\Omega'_p$,
\[ a(k_1)\cdots a(k_n)\Psi\in C^\infty(|x|). \]
Let
\[ M_D(\Psi,\{\Psi_m\},\{m'\}) = M_\infty(\Psi)\cup\Bigl[\bigcup_p\Omega'_p\Bigr]. \]
Then the lemma follows.
Proof of Lemma 2.6. Applying (2.5) instead of (2.1), we can show the lemma in a similar way to Lemmas A.1, A.2 and 2.4.

Appendix B.

In this section we prove Lemma 2.11. In [14] we proved that $e^{-tH}$ maps $D(N^{k/2})$ into itself in the case $V=0$. We extend this result to some nonzero potentials $V$. We see that if $\psi_g\in D(N^{k/2})$ then the identity
\[ N^{k/2}\psi_g = e^{-tH}e^{tE}N^{k/2}\psi_g + e^{tE}[N^{k/2}, e^{-tH}]\psi_g \tag{B.1} \]
is well defined. Using (B.1) we shall prove that $\|N^{k/2}\psi_g(x)\|_{\mathcal F}$ decays exponentially. To see this we prepare some probabilistic notation.
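It is known that there exist a probability space $(Q,\mu)$ and Gaussian random variables $(\phi(f),\ f\in\oplus^3 L^2_{\mathrm{real}}(\mathbb{R}^3))$ such that
\[ \int_Q \phi(f)\phi(g)\,\mu(d\phi) = \frac{1}{2}\sum_{\mu,\nu=1,2,3}\int_{\mathbb{R}^3}\Bigl(\delta_{\mu\nu}-\frac{k_\mu k_\nu}{|k|^2}\Bigr)\hat f_\mu(k)\hat g_\nu(k)\,dk . \]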
For a general f ∈ ⊕3 L2 (R3 ), we set φ(f ) = φ(
The free Hamiltonian in $L^2(Q)$ corresponding to $H_f$ in $\mathcal F$ is denoted by $\tilde H_f$. To have a functional integral representation of $e^{-t\tilde H_f}$, we go through another probability space $(Q_0,\nu_0)$ and Gaussian random variables $(\phi_0(f),\ f\in\oplus^3 L^2_{\mathrm{real}}(\mathbb{R}^4))$ such that
\[ \int_{Q_0}\phi_0(f)\phi_0(g)\,\nu_0(d\phi_0) = \frac{1}{2}\sum_{\mu,\nu=1,2,3}\int_{\mathbb{R}^4}\Bigl(\delta_{\mu\nu}-\frac{k_\mu k_\nu}{|k|^2}\Bigr)\hat f_\mu(k,k_0)\hat g_\nu(k,k_0)\,dk\,dk_0 . \]
Here $\phi_0(f)$ is also extended to $f\in\oplus^3 L^2(\mathbb{R}^4)$ in the same way as $\phi(f)$. Let $j_t : L^2(\mathbb{R}^3)\to L^2(\mathbb{R}^4)$ be the isometry defined by
\[ \widehat{j_t f}(k,k_0) = \frac{e^{-itk_0}}{\sqrt{\pi}}\sqrt{\frac{\omega(k)}{\omega(k)^2+|k_0|^2}}\,\hat f(k) \]
and $J_t : L^2(Q)\to L^2(Q_0)$ by
\[ J_t\,{:}\phi(f_1)\cdots\phi(f_n){:} = {:}\phi_0([\oplus^3 j_t]f_1)\cdots\phi_0([\oplus^3 j_t]f_n){:}\,,\qquad J_t 1 = 1 . \]
Here ${:}X{:}$ denotes the Wick power of $X$, inductively defined by
\[ {:}\phi_*(f){:} = \phi_*(f), \]
\[ {:}\phi_*(f)\phi_*(f_1)\cdots\phi_*(f_n){:} = \phi_*(f)\,{:}\phi_*(f_1)\cdots\phi_*(f_n){:} - \sum_{j=1}^{n}(\phi_*(f_j),\phi_*(f))_{L^2(Q_*)}\,{:}\phi_*(f_1)\cdots\widehat{\phi_*(f_j)}\cdots\phi_*(f_n){:}\,, \]
where $Q_* = Q, Q_0$ and $\phi_* = \phi, \phi_0$. Then $J_t$ can be extended to an isometry and $J_t^*J_s = e^{-|t-s|\tilde H_f}$ follows for $t,s\in\mathbb{R}$. We identify $\mathcal H = L^2(\mathbb{R}^3)\otimes\mathcal F$ with $L^2(\mathbb{R}^3;L^2(Q))$. Under this identification, $\Psi\in\mathcal H$ can be regarded as an $L^2(Q)$-valued $L^2$-function on $\mathbb{R}^3$, i.e. $\Psi(x)\in L^2(Q)$ for almost every $x\in\mathbb{R}^3$. In [14, Lemma 4.9] and [12] we established that
\[ (e^{-tH}\Psi)(x) = E^Q_x\bigl(e^{-\int_0^t V(X_s)\,ds}\,\mathcal J_t\Psi(X_t)\bigr) \]
for almost every $x\in\mathbb{R}^3$. Here $(X_t)_{t\ge 0} = (X_{1,t},X_{2,t},X_{3,t})_{t\ge 0}\in C([0,\infty);\mathbb{R}^3)$ denotes an $\mathbb{R}^3$-valued continuous path, $E^Q_x$ an $L^2(Q)$-valued expectation value with respect to the Wiener measure $P_x$ on $C([0,\infty);\mathbb{R}^3)$ with $P_x(X_0=x)=1$, and
\[ \mathcal J_t = \mathcal J_t(x,X_\cdot) : L^2(Q)\to L^2(Q) \]
is given by $\mathcal J_t = J_0^*\,e^{-ie\phi_0(K(x,X_\cdot))}J_t$, where $K(x,X_\cdot)$ is a $\oplus^3 L^2(\mathbb{R}^4)$-valued stochastic integral defined by
\[ K = K(x,X_\cdot) = \bigoplus_{\mu=1,2,3}\int_0^t j_s\lambda(\cdot-X_s)\,dX_{\mu,s} . \]
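Let $N$ and $N_0$ be the number operators in $L^2(Q)$ and $L^2(Q_0)$, respectively. Note that $J_tN = N_0J_t$ on a dense domain. The expectation value with respect to $P_x$ is denoted by $E_x$. We show a fundamental inequality.

Lemma B.1. Let $\xi = \xi(x,X_\cdot) = \|K(x,X_\cdot)\|_{\oplus^3L^2(\mathbb{R}^4)}$. Then, for all $m\ge 0$,
\[ E_x(\xi^{2m}) \le \frac{3(2m)!}{2^m}\,t^{m-1}E_x\int_0^t\|j_s\lambda(\cdot-X_s)\|^{2m}_{L^2(\mathbb{R}^4)}\,ds = \frac{3(2m)!}{2^m}\,t^m\,\|\hat\varphi/\sqrt{\omega}\|^{2m} . \tag{B.2} \]
In particular $\sup_{x\in\mathbb{R}^3}E_x(\xi^{2m})<\infty$.

Proof. See [14, Theorem 4.6].

Lemma B.2. For each $(x,X_\cdot)\in\mathbb{R}^3\times C([0,\infty);\mathbb{R}^3)$ and $\Psi\in D(N^{k/2})$,
\[ \|[N^{k/2},\mathcal J_t(x,X_\cdot)]\Psi\|_{L^2(Q)} \le P_k(\xi)\,\|(N+1)^{k/2}\Psi\|_{L^2(Q)} \]
with some polynomial $P_k(\cdot)$.

Proof. Note that for each $(x,X_\cdot)$, $\mathcal J_t = \mathcal J_t(x,X_\cdot) = J_0^*e^{-ie\phi_0(K)}J_t$ maps $D(N^{k/2})$ into itself. We have
\[ [N^{k/2},\mathcal J_t]\Psi = J_0^*e^{-ie\phi_0(K)}\bigl[e^{ie\phi_0(K)}N_0^{k/2}e^{-ie\phi_0(K)} - N_0^{k/2}\bigr]J_t\Psi \]
\[ = J_0^*e^{-ie\phi_0(K)}\Bigl(\Bigl(N_0 - e\phi_0'(K) + \frac{e^2}{2}\xi^2\Bigr)^{k/2} - N_0^{k/2}\Bigr)J_t\Psi \]
\[ = -\mathcal J_tN^{k/2}\Psi + J_0^*e^{-ie\phi_0(K)}\Bigl(N_0 - e\phi_0'(K) + \frac{e^2}{2}\xi^2\Bigr)^{k/2}J_t\Psi , \]
where $\phi_0'(K) = i[N_0,\phi_0(K)]$. We see that $\|\mathcal J_tN^{k/2}\Psi\|_{L^2(Q)}\le\|N^{k/2}\Psi\|_{L^2(Q)}$. Note that
\[ \|\phi_0(K)\Psi\| \le \sqrt{2}\,\xi\,\|(N_0+1)^{1/2}\Psi\| . \]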
Then it is obtained that
\[ \Bigl\|\Bigl(N_0 - e\phi_0'(K) + \frac{e^2}{2}\xi^2\Bigr)^{k/2}J_t\Psi\Bigr\|_{L^2(Q)} \le R_k(\xi)\,\|(N+1)^{k/2}\Psi\|_{L^2(Q)} \]
with some polynomial $R_k(\cdot)$. Then
\[ \|[N^{k/2},\mathcal J_t]\Psi\|_{L^2(Q)} \le R_k(\xi)\|(N+1)^{k/2}\Psi\|_{L^2(Q)} + \|N^{k/2}\Psi\|_{L^2(Q)} \le (R_k(\xi)+1)\|(N+1)^{k/2}\Psi\|_{L^2(Q)} . \]
Thus the proof is complete.

Proposition B.3. Let $1\le p\le\infty$ and $a\ge 0$. Then there exists a constant $c_p = c_p(a)$ such that
\[ \sup_{x\in\mathbb{R}^3}\bigl|E_x\bigl(e^{-a\int_0^tV(X_s)\,ds}f(X_t)\bigr)\bigr| \le c_p\|f\|_{L^p(\mathbb{R}^3)} . \tag{B.3} \]

Proof. See [23, Theorem B.1.1].

Lemma B.4. We see that $e^{-tH}$ maps $D(N^{k/2})$ into itself.

Proof. Let $\Phi,\Psi\in D(N^{k/2})$. We have
\[ (N^{k/2}\Phi, e^{-tH}\Psi)_{\mathcal H} = \int\bigl((N^{k/2}\Phi)(x),\,E^Q_x\bigl(e^{-\int_0^tV(X_s)\,ds}\mathcal J_t\Psi(X_t)\bigr)\bigr)_{L^2(Q)}\,dx . \]
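Then
\[ (N^{k/2}\Phi, e^{-tH}\Psi)_{\mathcal H} = \int E_x\bigl\{(N^{k/2}\Phi(x),\mathcal J_t\Psi(X_t))_{L^2(Q)}\,e^{-\int_0^tV(X_s)\,ds}\bigr\}\,dx \]
\[ = \int E_x\bigl\{(\Phi(x),\mathcal J_tN^{k/2}\Psi(X_t))_{L^2(Q)}\,e^{-\int_0^tV(X_s)\,ds}\bigr\}\,dx + \int E_x\bigl\{(\Phi(x),[N^{k/2},\mathcal J_t]\Psi(X_t))_{L^2(Q)}\,e^{-\int_0^tV(X_s)\,ds}\bigr\}\,dx . \]
Hence we have by Lemma B.2,
\[ |(N^{k/2}\Phi, e^{-tH}\Psi)_{\mathcal H}| \le \int E_x\bigl(e^{-\int_0^tV(X_s)\,ds}\|\Phi(x)\|_{L^2(Q)}\|N^{k/2}\Psi(X_t)\|_{L^2(Q)}\bigr)\,dx \tag{B.4} \]
\[ \quad + \int E_x\bigl(P_k(\xi)\,e^{-\int_0^tV(X_s)\,ds}\|\Phi(x)\|_{L^2(Q)}\|(N+1)^{k/2}\Psi(X_t)\|_{L^2(Q)}\bigr)\,dx . \tag{B.5} \]
The first term (B.4) is estimated as
\[ \text{(B.4)} = \bigl(\|\Phi(\cdot)\|_{L^2(Q)},\,e^{-tH_p}\|N^{k/2}\Psi(\cdot)\|_{L^2(Q)}\bigr)_{L^2(\mathbb{R}^3)} \le e^{-tE_p}\|\Phi\|_{\mathcal H}\|N^{k/2}\Psi\|_{\mathcal H}, \]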
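where $E_p = \inf\sigma(H_p)$. The second term (B.5) is estimated as
\[ \text{(B.5)} \le \int\|\Phi(x)\|_{L^2(Q)}\bigl(E_xP_k(\xi)^2e^{-2\int_0^tV(X_s)\,ds}\bigr)^{1/2}\bigl(E_x\|(N+1)^{k/2}\Psi(X_t)\|^2_{L^2(Q)}\bigr)^{1/2}\,dx \]
\[ \le \int\|\Phi(x)\|_{L^2(Q)}\bigl(E_xP_k(\xi)^4\bigr)^{1/4}\bigl(E_xe^{-4\int_0^tV(X_s)\,ds}\bigr)^{1/4}\bigl(E_x\|(N+1)^{k/2}\Psi(X_t)\|^2_{L^2(Q)}\bigr)^{1/2}\,dx . \]
By Lemma B.1 we have
\[ \theta = \sup_{x\in\mathbb{R}^3}\bigl(E_xP_k(\xi)^4\bigr)^{1/4} < \infty, \]
and by (B.3),
\[ \eta = \sup_{x\in\mathbb{R}^3}\bigl(E_xe^{-4\int_0^tV(X_s)\,ds}\bigr)^{1/4} < \infty . \]
Then we have
\[ \text{(B.5)} \le \theta\eta\int\|\Phi(x)\|_{L^2(Q)}\bigl(E_x\|(N+1)^{k/2}\Psi(X_t)\|^2_{L^2(Q)}\bigr)^{1/2}\,dx \le \theta\eta\Bigl(\int\|\Phi(x)\|^2_{L^2(Q)}\,dx\Bigr)^{1/2}\Bigl(\int E_x\|(N+1)^{k/2}\Psi(X_t)\|^2_{L^2(Q)}\,dx\Bigr)^{1/2} = \theta\eta\,\|\Phi\|_{\mathcal H}\|(N+1)^{k/2}\Psi\|_{\mathcal H} . \]
Thus we conclude that
\[ |(N^{k/2}\Phi, e^{-tH}\Psi)_{\mathcal H}| \le \|\Phi\|_{\mathcal H}\bigl(e^{-tE_p}\|N^{k/2}\Psi\|_{\mathcal H} + \theta\eta\,\|(N+1)^{k/2}\Psi\|_{\mathcal H}\bigr). \]
Since $N^{k/2}$ is self-adjoint and the right hand side is a constant multiple of $\|\Phi\|_{\mathcal H}$ for every $\Phi\in D(N^{k/2})$, this implies that $e^{-tH}\Psi\in D(N^{k/2})$.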
Lemma B.5. Assume that $\psi_g\in D(N^{k/2})$. Then $\sup_{x\in\mathbb{R}^3}\|N^{k/2}\psi_g(x)\|_{L^2(Q)}<\infty$.

Proof. By Lemma B.4, the identity $N^{k/2}\psi_g = e^{tE}e^{-tH}N^{k/2}\psi_g + e^{tE}[N^{k/2},e^{-tH}]\psi_g$ is well defined, and we obtain that
\[ N^{k/2}\psi_g(x) = e^{tE}E^Q_x\bigl(e^{-\int_0^tV(X_s)\,ds}\mathcal J_tN^{k/2}\psi_g(X_t)\bigr) + e^{tE}E^Q_x\bigl(e^{-\int_0^tV(X_s)\,ds}[N^{k/2},\mathcal J_t]\psi_g(X_t)\bigr) \]
for almost every $x\in\mathbb{R}^3$, where $\mathcal J_t = \mathcal J_t(x,X_\cdot)$. We see that by Lemma B.2,
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le e^{tE}E_x\bigl(e^{-\int_0^tV(X_s)\,ds}\|N^{k/2}\psi_g(X_t)\|_{L^2(Q)}\bigr) \tag{B.6} \]
\[ \quad + e^{tE}E_x\bigl(e^{-\int_0^tV(X_s)\,ds}P_k(\xi)\|(N+1)^{k/2}\psi_g(X_t)\|_{L^2(Q)}\bigr). \tag{B.7} \]
By (B.3) it is obtained that
\[ \sup_{x\in\mathbb{R}^3}\text{(B.6)} < \infty . \tag{B.8} \]
(B.7) is estimated as
\[ \text{(B.7)} \le \bigl(E_xP_k(\xi)^2\bigr)^{1/2}\bigl(E_xe^{-2\int_0^tV(X_s)\,ds}\|(N+1)^{k/2}\psi_g(X_t)\|^2_{L^2(Q)}\bigr)^{1/2} . \]
By (B.3) we obtain that
\[ \sup_{x\in\mathbb{R}^3}E_x\bigl(e^{-2\int_0^tV(X_s)\,ds}\|(N+1)^{k/2}\psi_g(X_t)\|^2_{L^2(Q)}\bigr) < \infty, \]
and by Lemma B.1, $\sup_{x\in\mathbb{R}^3}E_x(P_k(\xi)^2)<\infty$. Hence
\[ \sup_{x\in\mathbb{R}^3}\text{(B.7)} < \infty . \tag{B.9} \]
Thus the lemma follows from (B.8) and (B.9).

Proof of Lemma 2.11. By Lemma B.5, it is enough to prove the lemma for sufficiently large $|x|$. Set $\theta = \sup_{x\in\mathbb{R}^3}\|(N+1)^{k/2}\psi_g(x)\|_{L^2(Q)}<\infty$. We have by (B.6) and (B.7) for almost every $x\in\mathbb{R}^3$
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le E_x\bigl(e^{-\int_0^tV(X_s)\,ds}(1+P_k)\bigr)e^{tE}\theta \le \bigl\{E_x\bigl((1+P_k)^2\bigr)\bigr\}^{1/2}\bigl(E_xe^{-2\int_0^tV(X_s)\,ds}\bigr)^{1/2}e^{tE}\theta . \]
By (B.2) we have
\[ E_x\bigl((1+P_k(\xi))^2\bigr) \le Q_k(t), \]
where $Q_k$ is some polynomial of the same degree as $P_k$. Then we have
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le \theta\,Q_k(t)\,e^{tE}E_x\bigl(e^{-2\int_0^tV(X_s)\,ds}\bigr). \]
Here $t$ is arbitrary. Take $t = t(x) = |x|^{1-m}$. Then by [7] we see that there exist positive constants $D$ and $\delta$ such that for sufficiently large $|x|$,
\[ e^{t(x)E}E_x\bigl(e^{-2\int_0^{t(x)}V(X_s)\,ds}\bigr) \le De^{-\delta|x|^{m+1}} . \]
In the case of $m\ge 1$, it is trivial that $Q_k(t(x))\le\theta'$ with some constant $\theta'$ independent of $x$. Hence
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le \theta\theta'De^{-\delta|x|^{m+1}} \]
follows for sufficiently large $|x|$. Thus the lemma follows for $m\ge 1$. In the case of $m=0$, we see that
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le \theta\,Q_k(|x|)\,De^{-\delta|x|}, \]
and hence
\[ \|N^{k/2}\psi_g(x)\|_{L^2(Q)} \le \theta D'e^{-\delta'|x|} \]
follows for $\delta'<\delta$ with some constant $D'$ for sufficiently large $|x|$. The proof of the lemma is complete.
Appendix C.

Proof of (2.1). By the Schwarz inequality we have
\[ \sum\!\!\!\!\!\!\int\Bigl|\prod_{j=1}^{n}f_j(k_j)\Bigr|\,\|a(k_1)\cdots a(k_n)\Psi\|_{\mathcal H}\,dk_1\cdots dk_n \le \Bigl(\sum\!\!\!\!\!\!\int\Bigl|\prod_{j=1}^{n}f_j(k_j)\Bigr|\,\|a(k_1)\cdots a(k_n)\Psi\|^2_{\mathcal H}\,dk_1\cdots dk_n\Bigr)^{1/2}\Bigl(\sum\!\!\!\!\!\!\int\Bigl|\prod_{j=1}^{n}f_j(k_j)\Bigr|\,dk_1\cdots dk_n\Bigr)^{1/2}. \]
We shall estimate the right hand side above. Let $\tilde\Psi = a(k_2)\cdots a(k_n)\Psi$. Then
\[ \tilde\Psi^{(m)}(x,k'_1,\dots,k'_m) = \sqrt{(m+1)\cdots(m+n-1)}\;\Psi^{(m+n-1)}(x,k_2,\dots,k_n,k'_1,\dots,k'_m). \]
From this we see that
\[ \sum\!\!\!\!\!\!\int|f_1(k_1)|\,\|a(k_1)\tilde\Psi\|^2_{\mathcal H}\,dk_1 = \sum_{m=0}^{\infty}\sum\!\!\!\!\!\!\int\bigl(|f_1(k'_1)|+\cdots+|f_1(k'_m)|\bigr)\,\|\tilde\Psi^{(m)}(\cdot,k'_1,\dots,k'_m)\|^2_{L^2(\mathbb{R}^3)}\,dk'_1\cdots dk'_m \]
\[ = \sum_{m=0}^{\infty}\sum\!\!\!\!\!\!\int\bigl(|f_1(k'_1)|+\cdots+|f_1(k'_m)|\bigr)(m+1)\cdots(m+n-1)\,\|\Psi^{(m+n-1)}(\cdot,k_2,\dots,k_n,k'_1,\dots,k'_m)\|^2_{L^2(\mathbb{R}^3)}\,dk'_1\cdots dk'_m . \]
Let $\hat\omega(k) = \omega(k)+1$. Since $f_1\in C_0^\infty(\mathbb{R}^3\setminus\{0\})$, we see that
\[ \bigl(|f_1(k'_1)|+\cdots+|f_1(k'_m)|\bigr)\,\|\Psi^{(m+n-1)}(\cdot,k_2,\dots,k_n,k'_1,\dots,k'_m)\|^2_{L^2(\mathbb{R}^3)} \le c(f_1)\bigl(\hat\omega(k_2)+\cdots+\hat\omega(k_n)+\hat\omega(k'_1)+\cdots+\hat\omega(k'_m)\bigr)\,\|\Psi^{(m+n-1)}(\cdot,k_2,\dots,k_n,k'_1,\dots,k'_m)\|^2_{L^2(\mathbb{R}^3)} \]
with some constant $c(f_1)$ independent of $\Psi$, $n$ and $m$. Note that
\[ \{a(k_2)\cdots a(k_n)(H_f+1)^{1/2}\Psi\}^{(m)}(x,k'_1,\dots,k'_m) = \sqrt{(m+1)\cdots(m+n-1)}\,\{(H_f+1)^{1/2}\Psi\}^{(m+n-1)}(x,k_2,\dots,k_n,k'_1,\dots,k'_m) \]
\[ = \sqrt{(m+1)\cdots(m+n-1)}\,\sqrt{\hat\omega(k_2)+\cdots+\hat\omega(k_n)+\hat\omega(k'_1)+\cdots+\hat\omega(k'_m)}\;\Psi^{(m+n-1)}(x,k_2,\dots,k_n,k'_1,\dots,k'_m). \]
Then we obtain that
\[ \sum\!\!\!\!\!\!\int|f_1(k_1)|\,\|a(k_1)\tilde\Psi\|^2_{\mathcal H}\,dk_1 \le c(f_1)\sum_{m=0}^{\infty}\sum\!\!\!\!\!\!\int(m+1)\cdots(m+n-1)\bigl(\hat\omega(k_2)+\cdots+\hat\omega(k_n)+\hat\omega(k'_1)+\cdots+\hat\omega(k'_m)\bigr)\,\|\Psi^{(m+n-1)}(\cdot,k_2,\dots,k_n,k'_1,\dots,k'_m)\|^2_{L^2(\mathbb{R}^3)}\,dk'_1\cdots dk'_m \]
\[ = c(f_1)\,\|a(k_2)\cdots a(k_n)(H_f+1)^{1/2}\Psi\|^2_{\mathcal H} . \]
Repeating this procedure we have
\[ \sum\!\!\!\!\!\!\int\Bigl|\prod_{j=1}^{n}f_j(k_j)\Bigr|\,\|a(k_1)\cdots a(k_n)\Psi\|^2_{\mathcal H}\,dk_1\cdots dk_n \le c(f_1,\dots,f_n)\,\|(H_f+1)^{n/2}\Psi\|^2_{\mathcal H} \]
with some constant $c(f_1,\dots,f_n)$. Thus (2.1) follows.

Acknowledgment

I thank M. Griesemer for pointing out an error in the first manuscript. This work is in part supported by Grant-in-Aid 13740106 for Encouragement of Young Scientists from the Ministry of Education, Science, Sports, and Culture.

References

[1] A. Arai, Ground state of the massless Nelson model without infrared cutoff in a non-Fock representation, Rev. Math. Phys. 13 (2001), 1075–1094.
[2] A. Arai and M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151 (1997), 455–503.
[3] A. Arai, M. Hirokawa and F. Hiroshima, On the absence of eigenvectors of Hamiltonians in a class of massless quantum field models without infrared cutoff, J. Funct. Anal. 168 (1999), 470–497.
[4] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998), 299–395.
[5] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Commun. Math. Phys. 207 (1999), 249–290.
[6] V. Betz, F. Hiroshima, J. Lőrinczi, R. A. Minlos and H. Spohn, Gibbs measure associated with particle-field system, Rev. Math. Phys. 14 (2002), 173–198.
[7] R. Carmona, Pointwise bounds for Schrödinger operators, Comm. Math. Phys. 62 (1978), 97–106.
[8] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic electromagnetic fields in a mode of quantum-mechanical matter interacting with the quantum radiation field, Adv. Math. 164 (2001), 349–398.
[9] C. Gérard, On the existence of ground states for massless Pauli-Fierz Hamiltonians, Ann. Henri Poincaré 1 (2000), 443–459.
[10] M. Griesemer, E. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145 (2001), 557–595.
[11] L. Gross, The relativistic Polaron without cutoffs, Comm. Math. Phys. 31 (1973), 25–73.
[12] F. Hiroshima, Functional integral representation of a model in quantum electrodynamics, Rev. Math. Phys. 9 (1997), 489–530.
[13] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics I, J. Math. Phys. 40 (1999), 6209–6222, II, J. Math. Phys. 41 (2000), 661–674.
[14] F. Hiroshima, Essential self-adjointness of translation-invariant quantum field models for arbitrary coupling constants, Comm. Math. Phys. 211 (2000), 585–613.
[15] F. Hiroshima, Self-adjointness of the Pauli-Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3 (2002), 171–201.
[16] F. Hiroshima, Analysis of ground states of atoms interacting with a quantized radiation fields, to be published in Int. J. Mod. Phys. B.
[17] F. Hiroshima and H. Spohn, Enhanced binding through coupling to a quantum field, Ann. Henri Poincaré 2 (2001), 1159–1187.
[18] F. Hiroshima and H. Spohn, Ground state degeneracy of the Pauli-Fierz model with spin, Adv. Theor. Math. Phys. 5 (2001), 1091–1104.
[19] C. Hainzl, V. Vougalter and S. A. Vugalter, Enhanced binding in non-relativistic QED, Comm. Math. Phys. 233 (2003), 13–26.
[20] J. Lőrinczi, R. A. Minlos and H. Spohn, The infrared behaviour in Nelson’s model of a quantum particle coupled to a massless scalar field, Ann. Henri Poincaré 3 (2001), 1–28.
[21] J. Lőrinczi, R. A. Minlos and H. Spohn, Infrared regular representation of the three dimensional massless Nelson model, Lett. Math. Phys. 59 (2002), 189–198.
[22] E. Nelson, Interaction of nonrelativistic particles with a quantized scalar field, J. Math. Phys. 5 (1964), 1190–1197.
[23] B. Simon, Schrödinger semigroups, Bull. Amer. Math. Soc. 7 (1982), 447–526. J. Funct. Anal. 32 (1979), 97–101.
[24] A. Sloan, The polaron without cutoffs in two space dimensions, J. Math. Phys. 15 (1974), 190–201.
[25] H. Spohn, Ground state of quantum particle coupled to a scalar boson field, Lett. Math. Phys. 44 (1998), 9–16.
[26] H. Spohn, Ground state(s) of the spin-boson Hamiltonian, Comm. Math. Phys. 123 (1989), 277–304.
June 19, 2003 15:50 WSPC/148-RMP
00161
Reviews in Mathematical Physics Vol. 15, No. 4 (2003) 313–338 © World Scientific Publishing Company
THE GENERALIZED CCR: REPRESENTATIONS AND ENVELOPING C*-ALGEBRA

CHE SOONG KIM
Department of Industrial Engineering, Sangji University, Wonju, Korea 220-702

DANIIL P. PROSKURIN and ALEKSANDER M. IKSANOV
Cybernetics Faculty, Kiev T. Shevchenko National University, Ukraine

ZAKHAR A. KABLUCHKO
Department of Mathematics, Goettingen University, Germany

Received 4 April 2002
Revised 9 January 2003

A review of the representation theory of deformations of the CCR is presented. The faithfulness of the Fock representation of the q-CCR, the twisted CCR and the quon CCR is discussed. A more general deformation of the CCR is presented. The $K_0$ and $K_1$ groups of the twisted CCR algebra are calculated.

Keywords: Fock representation; deformed commutation relations; universal bounded representation.

Mathematics Subject Classifications (2000): 46L55, 46L65, 81S05, 81T05.
0. Introduction and Preliminaries

In recent years, owing to numerous applications in mathematical physics, non-commutative probability and the theory of non-commutative symmetric domains, interest in ∗-algebras generated by deformations of the CCR, in their Fock representations and in their enveloping C*(W*)-algebras has been growing. In this paper we review some recent results on representation theory, the Fock representation and the structure of enveloping C*-algebras for classes of deformations such as the q-CCR, the generalized quon commutation relations and the twisted CCR. We also present a more general class of deformed CCR (GCCR) that contains the quonic relations and the twisted CCR as particular cases. Since all the deformations considered are either examples of Wick ∗-algebras or their quotients by the largest quadratic ideal, we also give an exposition of the general theory, recall the construction of the Fock representation of hermitian Wick algebras and provide some up-to-date results concerning the existence (i.e. positivity) of the Fock representation and its properties. Note that the detailed study of Wick algebras was begun in the pioneering paper [14] and continued in [23, 17, 18]. Keeping this in mind, we find it possible to combine the main results of those papers in our review. Finally we would like to stress that the numerous physical motivations and properties of the objects under consideration will be left aside. Rather, we focus on the algebraic structure and the representation theory.

Let $A$ be a ∗-algebra. Recall that a bounded representation of $A$ is a ∗-homomorphism $\pi : A\to B(H)$, where $B(H)$ is the ∗-algebra of bounded linear operators acting on the Hilbert space $H$. Suppose that $A$ is finitely generated, i.e.
\[ A = \mathbb{C}\langle a_i, a_i^* \mid P_j(a_1,\dots,a_d,a_1^*,\dots,a_d^*) = 0\rangle,\qquad i=1,\dots,d,\quad j=1,\dots,k, \]
where the $P_j(\cdot)$ are non-commutative polynomials. In order to define a representation of $A$ one has to construct a family of linear operators $\{A_i,\ i=1,\dots,d\}$ satisfying the same relations as the generators of $A$. In this paper the term “unbounded representation of $A$” means a family of unbounded operators $\{A_i,\ i=1,\dots,d\}$ satisfying the basic relations of the algebra. Note that in each particular case one has to give a precise meaning to polynomial relations between unbounded operators.

Recall that the C*-algebra $\mathbf{A}$ generated by $A$, or the universal bounded representation of $A$, is the C*-algebra having the following universal property: there exists a homomorphism
\[ \varphi : A\to\mathbf{A} \]
such that any homomorphism $\pi : A\to B$, $B$ being a C*-algebra, determines a unique $\tilde\pi : \mathbf{A}\to B$ that satisfies the equality $\tilde\pi\varphi = \pi$. Note that the universal bounded representation exists if and only if the set of bounded representations of $A$ is not empty and the bounded representations of $A$ are uniformly bounded, i.e. for any bounded representation $\pi(\cdot)$ and any $x\in A$ one has $\|\pi(x)\|\le C_x$, where the constant $C_x$ does not depend on $\pi(\cdot)$. In that case $\mathbf{A}$ is the completion of $A/\mathrm{Rad}(A)$ by the norm
\[ \|x\| = \sup_{\pi}\|\pi(x)\|, \]
where $\mathrm{Rad}(A)$ is the two-sided ideal of elements which are zero in every bounded representation, and we identify $x\in A$ with its class in $A/\mathrm{Rad}(A)$.

For any one-to-one mapping $f:\mathbb{R}\to\mathbb{R}$ we denote by $f^n$ the $n$th iterate of $f$, by $f^{-n}$ the $n$th iterate of $f^{-1}$ for $n\in\mathbb{N}$, and $f^0 := \mathrm{id}$.

1. One-dimensional Deformation

Let us first consider the one-dimensional case. The well-known deformation of the CCR which interpolates between the canonical commutation and anti-commutation relations is the so-called q-CCR introduced and studied by L. C. Biedenharn and A. J. Macfarlane, see [2, 24],
\[ a^*a = 1 + q\,aa^*,\qquad q\in(-1,1). \]
The limit case $q=1$ gives the bosonic relation, and $q=-1$ the fermionic one. In the third special case $q=0$ one has $a^*a = 1$, the relation defining an isometry. Recall that the CCR has a unique irreducible ∗-representation $\pi_F$, and in this representation the operators $\pi_F(a)$, $\pi_F(a^*)$ are unbounded (von Neumann's theorem, see for example [1]). In contrast, all irreducible representations of the CAR are either two-dimensional or one-dimensional and hence bounded. The representation theory for $q=0$ follows from the Wold decomposition of an isometry (see [5]): any isometry $T$ acting on a Hilbert space $H$ is the orthogonal sum $T = (S\otimes 1)\oplus U$, corresponding to the decomposition $H = l^2(\mathbb{N})\otimes K\oplus H_1$, where $S$ is the unilateral shift on $l^2(\mathbb{N})$, $Se_n = e_{n+1}$, $n\in\mathbb{N}$, and $U$ is a unitary operator on $H_1$. Hence the irreducible representations of the q-CCR with $q=0$ are:
(1) infinite-dimensional: $\pi_F(a) = S$;
(2) one-dimensional: $\pi_\varphi(a) = e^{2\pi i\varphi}$, $\varphi\in[0,1)$.
Moreover, the enveloping C*-algebra $\mathbf{A}_0 = C^*\langle t,t^* \mid t^*t = 1\rangle$ is isomorphic to $C^*(S,S^*) = \mathcal{T}(C(\mathbb{T}))$, i.e. to the algebra of Toeplitz operators (see [3, 4]). In other words, the unique infinite-dimensional irreducible representation of $\mathbf{A}_0$ is faithful (i.e. it has a trivial kernel).

Let us describe the bounded irreducible ∗-representations of the q-CCR for arbitrary $q\in(-1,1)$. Let $\pi$ be an irreducible representation of the q-CCR acting on a Hilbert space $H$. Consider the polar decomposition $\pi(a^*) := A^* = T^*C$, where $T$ is an isometry, $C\ge 0$ and $\ker T^* = \ker C$. Obviously, to define $A$ it is sufficient to describe the operators $T$ and $C$, as was done, for example, in [27]:

Theorem 1. Assume that operators $A$, $A^*$ acting on a Hilbert space $H$ define a bounded irreducible representation of the q-CCR. Then
(1) $H = l^2(\mathbb{N})$, $T = S$, $C^2e_n = \frac{1-q^{n-1}}{1-q}\,e_n$;
(2) $H = \mathbb{C}$, $T = e^{2\pi i\varphi}$, $C^2 = \frac{1}{1-q}$.

It is easily seen that in either case
\[ C^2 = \sum_{i=1}^{\infty}q^{i-1}T^iT^{*i}. \]
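The two descriptions of $C^2$ in case (1) of Theorem 1 can be compared numerically. The following is only a minimal sketch, not part of the cited results: it assumes $T = S$ with $\mathbb{N} = \{1,2,\dots\}$, so that $S^iS^{*i}e_n = e_n$ for $n>i$ and $0$ otherwise, and the function names and the sample value of $q$ are hypothetical choices made for illustration.

```python
from fractions import Fraction

def c_squared(n, q):
    """Eigenvalue of C^2 on e_n (n = 1, 2, ...) as stated in Theorem 1(1)."""
    return (1 - q ** (n - 1)) / (1 - q)

def c_squared_from_sum(n, q):
    """Eigenvalue of sum_i q^(i-1) S^i S^{*i} on e_n.

    S^i S^{*i} e_n = e_n when n > i and 0 otherwise, so only i = 1..n-1 contribute.
    """
    return sum(q ** (i - 1) for i in range(1, n))

q = Fraction(1, 3)  # hypothetical sample value in (-1, 1)
for n in range(1, 8):
    # The closed form agrees with the series.
    assert c_squared(n, q) == c_squared_from_sum(n, q)
    # With A = C S one has A*A e_n = c_{n+1} e_n and A A* e_n = c_n e_n,
    # so a*a = 1 + q a a* reduces to the recursion c_{n+1} = 1 + q c_n.
    assert c_squared(n + 1, q) == 1 + q * c_squared(n, q)
```

Exact rational arithmetic is used so that the equalities are checked without rounding; the eigenvalue on $e_1$ is $0$, in agreement with $\ker C = \ker T^*$.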
Motivated by the above is the following statement (see [15]).

Proposition 1. Let $\mathbf{A}_q$ be the universal bounded representation of the q-CCR, faithfully realized by Hilbert space operators, and let $a^* = t^*c$ be the polar decomposition. Then $t$ is a pure isometry, $t\in\mathbf{A}_q$ and
\[ a = \Bigl(\sum_{i=1}^{\infty}q^{i-1}t^it^{*i}\Bigr)^{1/2}t . \]
As a corollary to this result we have the well-known isomorphism $\mathbf{A}_q\simeq\mathbf{A}_0$. Recall that $\mathbf{A}_0$ has a unique largest ideal $K$ isomorphic to the algebra of compact operators and $\mathbf{A}_0/K\simeq C(\mathbb{T})$. Namely, $K$ is the kernel of any one-dimensional representation of $\mathbf{A}_q$ and is generated by $1-tt^*$, see, for example, [15] and references therein.

Remark 1. If $-1<q\le 0$ then any representation of the q-CCR is bounded. For $0<q<1$ one has a collection of unbounded representations $\pi_x(\cdot)$ of the q-CCR, defined on $l^2(\mathbb{Z})$ by a fixed $y>\frac{1}{1-q}$ and any $x\in[1+qy,\,y)$ as follows:
\[ C^2e_n = f^n(x)e_n,\qquad Ue_n = e_{n+1},\qquad n\in\mathbb{Z},\qquad f(t) = 1+qt,\qquad \pi_x(a) = CU . \]
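The weights $f^n(x)$ of Remark 1 can be tabulated directly. The sketch below is an illustration under stated assumptions: the values of $q$, $y$ and $x$ are hypothetical samples, the helper name `f_iter` is ours, and the closed form $f^n(x) = \frac{1}{1-q} + q^n\bigl(x - \frac{1}{1-q}\bigr)$ is an elementary consequence of $f(t) = 1+qt$ rather than a formula quoted from the text.

```python
from fractions import Fraction

def f_iter(x, q, n):
    """n-th iterate of f(t) = 1 + q t; negative n uses f^{-1}(t) = (t - 1) / q."""
    for _ in range(abs(n)):
        x = 1 + q * x if n > 0 else (x - 1) / q
    return x

q = Fraction(1, 2)       # hypothetical sample value with 0 < q < 1
fixed = 1 / (1 - q)      # fixed point of f, here 2
y = Fraction(5)          # any y > 1/(1 - q)
x = Fraction(4)          # any x in [1 + q*y, y) = [7/2, 5)

weights = {n: f_iter(x, q, n) for n in range(-6, 7)}
for n in range(-6, 6):
    # a*a e_n = c_{n+1} e_n and a a* e_n = c_n e_n for a = C U,
    # so the q-CCR holds iff c_{n+1} = 1 + q c_n.
    assert weights[n + 1] == 1 + q * weights[n]
for n, c in weights.items():
    # Closed form of the iterate; all weights stay above the fixed point.
    assert c == fixed + q ** n * (x - fixed)
    assert c > fixed
```

For these sample values the weight on $e_{-6}$ is already $130$, which illustrates why $C$, and hence $\pi_x(a)$, is unbounded: the weights grow like $q^{-|n|}$ as $n\to-\infty$.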
Recall that the Fock representation of the q-CCR is the unique irreducible representation that has a vacuum vector $\Omega$ such that $a^*\Omega = 0$. “Vacuum” means that the closed linear span of $a^n\Omega$, $n\in\mathbb{Z}_+$, coincides with the representation space. Since in the polar decomposition of $A^*$ we have $\ker T^* = \ker A^*$, the Fock representation of $\mathbf{A}_0$ is the Fock representation of $\mathbf{A}_q$. This implies that the Fock representation of the C*-algebra generated by the one-dimensional q-CCR is faithful. Let us summarize the results listed above (see [15]):

Proposition 2. The C*-algebra $\mathbf{A}_q$ generated by the q-CCR is isomorphic to $\mathbf{A}_0$ for any $q\in(-1,1)$; the Fock representation is the unique infinite-dimensional irreducible representation of $\mathbf{A}_q$, and it is faithful; $\mathbf{A}_q$ has a unique largest two-sided ideal $K$, and $\mathbf{A}_q/K\simeq C(\mathbb{T})$.

2. Higher-Dimensional Deformations

The aim of this section is:
(1) to present several types of higher-dimensional generalizations of the q-CCR,
(2) to recall the general construction of the Fock representation of a Hermitian Wick algebra and its properties,
(3) to discuss the faithfulness of the Fock representation on the algebraic level, i.e. for the ∗-algebras generated by deformed CCR.

Of special interest (for us) are:

• The higher-dimensional q-CCR introduced independently by O. Greenberg, D. Fivel, M. Bozejko and R. Speicher (see [7, 13, 11]),
\[ a_i^*a_j = \delta_{ij}1 + q\,a_ja_i^*,\qquad i,j=1,\dots,d,\qquad q\in(-1,1). \]
Let us denote by $A_q$ the ∗-algebra defined by these relations. We will point out below that there exists the universal bounded representation of $A_q$ (notation $\mathbf{A}_q^{(d)} = \mathbf{A}_q$).

• The quon commutation relations due to W. Marcinek, see [17, 18],
\[ a_i^*a_i = 1 + \alpha_ia_ia_i^*,\qquad -1<\alpha_i<1, \]
\[ a_i^*a_j = \lambda_{ij}a_ja_i^*,\qquad \lambda_{ij} = e^{2\pi i\theta_{ij}},\qquad i<j; \]
\[ a_ja_i = \lambda_{ij}a_ia_j,\qquad \theta_{ij} = -\theta_{ji},\qquad i\ne j . \]
Put $\alpha = (\alpha_1,\dots,\alpha_d)$, $\Theta = (\theta_{ij})$. We denote by $A_{\alpha,\Theta}$ the quonic ∗-algebra and by $\mathbf{A}_{\alpha,\Theta}$ its enveloping C*-algebra.

• The twisted CCR (TCCR) of W. Pusz and S. L. Woronowicz (see [28]),
\[ a_i^*a_i = 1 + \mu^2a_ia_i^* - (1-\mu^2)\sum_{k<i}a_ka_k^*; \]
\[ a_i^*a_j = \mu\,a_ja_i^*,\quad i\ne j;\qquad a_ja_i = \mu\,a_ia_j,\quad i<j;\qquad 0<\mu<1,\qquad i,j=1,\dots,d . \]
Below $B_\mu$ denotes the ∗-algebra and $\mathbf{B}_\mu$ the C*-algebra generated by the TCCR.

2.1. Wick ∗-algebras

To unify the examples presented above, P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner proposed the notion of a ∗-algebra allowing Wick ordering (see [14]).

Definition 1. Let $I = \{1,2,\dots,d\}$ and let $T_{ij}^{kl}\in\mathbb{C}$, $i,j,k,l\in I$, be such that $T_{ij}^{kl} = \overline{T_{ji}^{lk}}$. The Wick algebra with the set of coefficients $\{T_{ij}^{kl}\}$, denoted by $W(T)$, is the ∗-algebra defined by generators $a_i$, $a_i^*$, $i\in I$, which satisfy the basic relations
\[ a_i^*a_j = \delta_{ij}1 + \sum_{k,l=1}^{d}T_{ij}^{kl}a_la_k^* . \]
The monomials $a_{i_1}\cdots a_{i_n}a_{j_1}^*\cdots a_{j_m}^*$ are called Wick ordered monomials, and the family $\{1,\ a_{i_1}\cdots a_{i_n}a_{j_1}^*\cdots a_{j_m}^*,\ n,m\in\mathbb{Z}_+\}$ constitutes a linear basis of $W(T)$ (see [14]); here we use the convention that $a_{i_1}\cdots a_{i_n} := 1$ if $n=0$, and analogously for the product of the $a_i^*$. Note that in $W(T)$ only relations between $a_i^*$ and $a_j$, $i,j=1,\dots,d$, are postulated, so the subalgebra of $W(T)$ generated by the $a_i$, $i=1,\dots,d$, is free, and we identify it with the full tensor algebra $\mathcal{T}(\mathcal{H}) = \mathbb{C}\Omega\oplus\bigoplus_{n\ge 1}\mathcal{H}^{\otimes n}$, where $\mathcal{H} = \langle e_1,\dots,e_d\rangle$ is a complex finite-dimensional Hilbert space. Consider the full tensor algebra over $\mathcal{H}$, $\mathcal{H}^*$, denoted by $\mathcal{T}(\mathcal{H},\mathcal{H}^*)$. Then, obviously,
\[ W(T)\simeq\mathcal{T}(\mathcal{H},\mathcal{H}^*)\Big/\Bigl\langle e_i^*\otimes e_j - \delta_{ij}1 - \sum T_{ij}^{kl}e_l\otimes e_k^*\Bigr\rangle . \]
If $A$ is one of the ∗-algebras generated by the deformed CCR presented above, we denote by $WA$ its Wick analogue. In fact,

• $A_q\simeq WA_q$,
• $A_{\alpha,\Theta}\simeq WA_{\alpha,\Theta}/\langle a_ja_i - \lambda_{ij}a_ia_j,\ i\ne j\rangle$,
• $B_\mu\simeq WB_\mu/\langle a_ja_i - \mu a_ia_j,\ i<j\rangle$.
Following [14], we can construct the Fock representation of $W(T)$ on $\mathcal{T}(\mathcal{H})$. As in the one-dimensional case it is determined completely by the action of the generators on the vacuum vector $\Omega := 1$:
\[ \pi_F(a_i)\,e_{i_1}\otimes\cdots\otimes e_{i_n} = e_i\otimes e_{i_1}\otimes\cdots\otimes e_{i_n},\quad n\in\mathbb{N}\cup\{0\},\qquad \pi_F(a_i^*)1 = 0, \]
and the action of $\pi_F(a_i^*)$ on $\mathcal{H}^{\otimes n}$, $n\ge 1$, is defined inductively using the basic relations.

Example 1. For $T_{ij}^{kl} = 0$, $i,j,k,l=1,\dots,d$, the algebra $W(0)$ is the well-known Cuntz–Toeplitz algebra, see [8], generated by the orthogonal isometries $a_i^*a_j = \delta_{ij}1$, $i,j=1,\dots,d$, and
\[ \pi_F(a_i^*)\,e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_n} = \langle e_i,e_{i_1}\rangle\,e_{i_2}\otimes\cdots\otimes e_{i_n}, \]
i.e. $\pi_F(a_i)$, $\pi_F(a_i^*)$ are realized as the classical creation and annihilation operators on the full Fock space. If we equip $\mathcal{T}(\mathcal{H})$ with the standard inner product, then $\pi_F$ becomes a ∗-representation of $W(0)$. This is certainly not the case in general.

To define the action of $\pi_F(a_i^*)$ on $\mathcal{H}^{\otimes n}$ explicitly, the authors of [14] introduced the operators
\[ T:\mathcal{H}\otimes\mathcal{H}\to\mathcal{H}\otimes\mathcal{H},\qquad Te_k\otimes e_l = \sum_{i,j}T_{ik}^{lj}\,e_i\otimes e_j,\qquad T = T^*; \]
\[ T_i:\mathcal{H}^{\otimes n}\to\mathcal{H}^{\otimes n},\qquad T_i = \underbrace{1\otimes\cdots\otimes 1}_{i-1}\otimes T\otimes\underbrace{1\otimes\cdots\otimes 1}_{n-i-1},\qquad i=1,\dots,n-1 . \]
Then
\[ \pi_F(a_i^*)X = \mu_0(a_i^*)R_nX,\qquad R_n = 1 + T_1 + T_1T_2 + \cdots + T_1\cdots T_{n-1},\qquad X\in\mathcal{H}^{\otimes n}, \]
and $\mu_0(a_i^*)\,e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_n} = \langle e_i,e_{i_1}\rangle\,e_{i_2}\otimes\cdots\otimes e_{i_n}$, $n\ge 1$, $\mu_0(a_i^*)1 = 0$. The next step is to define a deformed inner product $\langle\cdot,\cdot\rangle_0$ on $\mathcal{T}(\mathcal{H})$ to make $\pi_F$ a ∗-representation of $W(T)$ (see [14]).

Theorem 2. There exists a unique Hermitian sesquilinear form $\langle\cdot,\cdot\rangle_0$ on $\mathcal{T}(\mathcal{H})$ such that $\langle\pi_F(a_i^*)X,Y\rangle_0 = \langle X,\pi_F(a_i)Y\rangle_0$ for any $i=1,\dots,d$ and $X,Y\in\mathcal{T}(\mathcal{H})$. In fact $\langle X,Y\rangle_0 = \langle X,P_nY\rangle$ for $X,Y\in\mathcal{H}^{\otimes n}$, and $\langle X,Y\rangle_0 = 0$ for $X\in\mathcal{H}^{\otimes n}$, $Y\in\mathcal{H}^{\otimes m}$, $n\ne m$, where $P_0 = 1$, $P_1 = 1$, $P_2 = 1+T$, $P_n = (1\otimes P_{n-1})R_n$, $n\ge 3$.

When $P_n\ge 0$, $n\ge 2$, the Fock representation extends to a ∗-representation on the Hilbert space which is the completion of $\mathcal{T}(\mathcal{H})/\bigoplus_n\ker P_n$ in the norm induced by $\langle\cdot,\cdot\rangle_0$. At the moment there are several sufficient conditions on the operator $T$ for the positivity of the family $\{P_n,\ n\ge 2\}$.

Theorem 3. If $T\ge 0$ or $\|T\|\le\sqrt{2}-1$, then $P_n>0$ for any $n\ge 2$.
See [14] for the proof. When the operator $T$ satisfies the Yang–Baxter equation $T_1T_2T_1 = T_2T_1T_2$, the operators $P_n$ can be realized as
\[ P_n = \sum_{\sigma\in S_n}\varphi(\sigma), \]
where, if $\sigma = \sigma_{i_1}\cdots\sigma_{i_k}$ is a reduced decomposition (i.e. one of minimal length) into the generators $\sigma_i = (i,\,i+1)$, $i=1,\dots,n-1$, we define $\varphi(\sigma) = T_{i_1}\cdots T_{i_k}$, see [7]. Using the properties of $S_n$ as a Coxeter group, M. Bozejko and R. Speicher proved the following result, see [7].

Theorem 4. If the operator $T$ is a contraction, i.e. $\|T\|\le 1$, and satisfies the Yang–Baxter equation, then $P_n\ge 0$, $n\ge 2$. In the case $\|T\|<1$, the operators $P_n$, $n\ge 2$, are strictly positive.

A more precise version of the previous theorem is useful for the study of faithfulness of the Fock representation, see [23].

Theorem 5. If the conditions of the previous statement are in force, then
\[ \ker P_n = \sum_{k=0}^{n-2}\mathcal{H}^{\otimes k}\otimes\ker(1+T)\otimes\mathcal{H}^{\otimes(n-k-2)} . \]
In particular, for $-1<T\le 1$ one has strict positivity of $P_n$, $n\ge 2$.

Motivated by the identification of the subalgebra of $W(T)$ generated by $\{a_i,\ i=1,\dots,d\}$ with $\mathcal{T}(\mathcal{H})$, the following question naturally arises: does an element $X\in\mathcal{T}(\mathcal{H})$ with $\langle X,X\rangle_0 = 0$ determine the zero operator $\pi_F(X)$? A partial answer is given as follows (see [23]).

Theorem 6. Let the Fock representation $\pi_F$ of the Wick algebra $W(T)$ be positive and let $I$ be the two-sided ideal in $W(T)$ generated by $\bigoplus_{n\ge 2}\ker P_n$. Then $\ker\pi_F\subset I$. If additionally $T$ satisfies the Yang–Baxter equation, then $\ker\pi_F = I$.

For a Yang–Baxter contraction $T$ we immediately have (see [23])

Corollary 1. The kernel of the Fock representation of $W(T)$, where $T$ satisfies the Yang–Baxter condition and $\|T\|\le 1$, is generated as a ∗-ideal by $\ker(1+T)$. In particular, this implies that $\pi_F$ is a faithful representation of $W(T)/I$, where $I$ is the two-sided ideal generated by $\ker(1+T)$.

Let us return to the deformed CCR ∗-algebras and their Wick analogues.

• As was noted above, $A_q = WA_q$; the operator $T$ is defined by
\[ Te_i\otimes e_j = q\,e_j\otimes e_i,\qquad i,j=1,\dots,d . \]
It satisfies the Yang–Baxter equation, $\|T\| = |q|<1$ and $\ker(1+T) = \{0\}$. So the Fock representation of the ∗-algebra $A_q$ is strictly positive and faithful.

• For $WA_{\alpha,\Theta}$ and $WB_\mu$ we have
\[ Te_i\otimes e_j = \lambda_{ij}\,e_j\otimes e_i,\quad i\ne j,\qquad Te_i\otimes e_i = \alpha_i\,e_i\otimes e_i,\qquad \mathrm{Ker}(1+T) = \langle a_ja_i - \lambda_{ij}a_ia_j\rangle \]
and
\[ Te_i\otimes e_i = \mu^2e_i\otimes e_i,\qquad Te_i\otimes e_j = \mu\,e_j\otimes e_i,\quad i<j,\qquad Te_i\otimes e_j = -(1-\mu^2)\,e_i\otimes e_j + \mu\,e_j\otimes e_i,\quad i>j, \]
\[ \mathrm{Ker}(1+T) = \langle a_ja_i - \mu a_ia_j,\ i<j\rangle . \]
Hence for $A$ equal to either $A_{\alpha,\Theta}$ or $B_\mu$ we have $A = WA/\langle\ker(1+T)\rangle$. In both cases $\|T\| = 1$ and $T$ is a Yang–Baxter operator. Therefore, the Fock representations of the quon commutation relations and the twisted CCR are faithful, as for the q-CCR, but act on the symmetrized Fock space with (non-orthogonal) basis $e_{i_1}\otimes e_{i_2}\otimes\cdots\otimes e_{i_n}$, $n\in\mathbb{Z}_+$, $i_1\le i_2\le\cdots\le i_n$.

2.2. GCCR

Let us consider a natural generalization of the CCR contained in the class of Wick algebras. Namely, consider the ∗-algebra $A_{\alpha,\beta}$ generated by the family $\{a_i^*, a_i,\ i=1,\dots,d\}$ satisfying relations of the form
\[ a_i^*a_i = 1 + \sum_{j=1}^{d}\alpha_{ij}a_ja_j^*,\qquad 0<\alpha_{ij}<1, \tag{1} \]
\[ a_i^*a_j = \beta_{ij}a_ja_i^*,\qquad \beta_{ij} = \overline{\beta_{ji}},\qquad j\ne i . \]
In the paper [26] the algebras from this family having the largest quadratic Wick ideal were studied. The notion of a Wick ideal was introduced in [14] to describe the relations between the $a_i$, $i = 1, \dots, d$, compatible with the basic relations of the Wick algebra. For another approach to this problem, see [17, 18].

Definition 2. A two-sided ideal $J \subset T(H)$ is called a Wick ideal if $a_i^* J \subset J + J\,T(H^*)$ for any $i = 1, \dots, d$. If $J$ is generated by elements contained in $H^{\otimes n}$, it is called a homogeneous Wick ideal of degree $n$.

The quadratic Wick ideals (quadratic ideals, for short) are of special interest since the CCR, CAR, quon CCR and TCCR are exactly the quotients of their Wick analogues by the largest quadratic ideal (see [14, 29]). Moreover, if the Fock inner
The Generalized CCR: Representations and Enveloping C ∗ -Algebra
product is positive, the largest quadratic ideal is contained in the kernel of the Fock representation, see [14]. As shown in [26], the algebras from the class (1) having the largest quadratic Wick ideal are defined by the following relations (here we consider the quotient by the quadratic ideal):
$$
\begin{aligned}
a_i^* a_i &= 1 + \alpha_i\, a_i a_i^* - \sum_{j<i} (1-\alpha_j)\, a_j a_j^* , && 0 < \alpha_k < 1 ,\ k = 1, \dots, d ; \\
a_i^* a_j &= \lambda_{ij} \alpha_i^{1/2}\, a_j a_i^* , \quad a_j a_i = \lambda_{ij} \alpha_i^{1/2}\, a_i a_j , && i<j ,\ k_i \ge j ; \\
a_i^* a_j &= \lambda_{ij}\, a_j a_i^* , \quad a_j a_i = \lambda_{ij}\, a_i a_j , && i<j ,\ k_i < j ,
\end{aligned} \tag{2}
$$
where $\lambda_{ij} = e^{2\pi i \theta_{ij}} \in \mathbb{C}$, $\theta_{ij} = -\theta_{ji}$, $i \ne j$, and the vector $k = (k_1, k_2, \dots, k_{d-1}) \in \mathbb{N}^{d-1}$ has the property that $d \ge k_i \ge i$ and, if $j < i$ and $i \le k_j$, then $k_i \le k_j$. We call these relations the general commutation relations, or GCCR.

Example 2. For $k = (d, \dots, d)$, $\alpha_i = \mu^2$, $i = 1, \dots, d$, and $\lambda_{ij} = 1$, $i \ne j$, we have the TCCR.

Example 3. If we put $k = (1, 2, \dots, d-1)$, we get
$$a_i^* a_i = 1 + \alpha_i\, a_i a_i^* , \qquad a_i^* a_j = \lambda_{ij}\, a_j a_i^* , \qquad a_j a_i = \lambda_{ij}\, a_i a_j , \quad i<j ,$$
i.e. the quon CCR.

An interesting detail is that in the bounded representations of the GCCR the relations coming from the generators of the quadratic ideal hold automatically when the Wick relations are satisfied (see [14, 26]). In the following we denote by $A_{k,\alpha,\Theta}$ the $*$-algebra and by $\mathcal{A}_{k,\alpha,\Theta}$ the $C^*$-algebra generated by the GCCR with fixed $k$, $\{\alpha_i\}$ and $\Theta$.

3. Representations

We start with the description of the representations of the quon commutation relations and the TCCR. The representations of the TCCR, including unbounded ones, were studied in the pioneering paper [28]. The representations of the quon relations were studied by many authors (see [17, 18, 29, 25]). In this paper we describe the irreducible representations of $A_\mu$ and $A_{\alpha,\Theta}$ in the manner presented in the book [27], where induction arguments were applied to reduce the problem to the one-dimensional case. To deal with unbounded representations, one first has to give a precise meaning to the term “unbounded operators satisfying some commutation relations”. We do this according to the approach proposed by V. Ostrovskyi and Yu. Samoilenko (see [27]). Evidently, any representation of a $*$-algebra generated by elements $\{a_i, a_i^*, i = 1, \dots, d\}$ is determined by the family of operators $\{A_i, A_i^*, i = 1, \dots, d\}$ satisfying
the same relations as the generators of the algebra. Below we will identify the representations of the deformed CCR with such families of operators. To give the definition of unbounded representations of the deformed CCR, let us first rewrite the relations in the so-called “dynamical form” (see [27]). Let $\{A_i, A_i^*, i = 1, \dots, d\}$ be a bounded representation of the deformed commutation relations and let $A_i^* = S_i^* C_i$ be the polar decomposition. Then the relations between $A_i$, $A_i^*$, $i = 1, \dots, d$, can be rewritten in terms of $C_i^2$ and $S_j$, $i, j = 1, \dots, d$:
$$(C_1^2, \dots, C_d^2)\, S_i = S_i\, F_i(C_1^2, \dots, C_d^2) , \qquad C_i C_j = C_j C_i , \tag{3}$$
where

• For the quon relations
$$F_i(x_1, \dots, x_d) = (x_1, \dots, x_{i-1}, 1 + \alpha_i x_i, x_{i+1}, \dots, x_d) , \qquad S_i^* S_j = \lambda_{ij} S_j S_i^* , \quad S_j S_i = \lambda_{ij} S_i S_j , \quad i \ne j .$$

• For the TCCR
$$F_i(x_1, \dots, x_d) = (\mu^2 x_1, \dots, \mu^2 x_{i-1}, 1 + \mu^2 x_i, x_{i+1}, \dots, x_d) , \qquad S_i^* S_j = S_j S_i^* , \quad S_j S_i = S_i S_j , \quad i \ne j .$$
The name “dynamical” means that we have the action of the $F_i$ on the spectrum of the commutative family of positive operators $C = \{C_i^2, i = 1, \dots, d\}$.

Definition 3. Let the family $C = \{C_i^2, i = 1, \dots, d\}$ commute on a dense invariant domain of analytic vectors. We say that the operators of $C$ and the partial isometries $\{S_i, i = 1, \dots, d\}$ satisfy the relations (3) if for any Borel set $\Delta \subset \mathbb{R}^d$ and any $i = 1, \dots, d$,
$$E_C(\Delta)\, S_i = S_i\, E_C(F_i^{-1}(\Delta)) ,$$
where $E_C(\cdot)$ is the joint resolution of the identity of the family $C$.

Definition 4. Let the families $C$ and $\{S_i\}$ be as defined above, where the $F_i$, $i = 1, \dots, d$, correspond either to the quon relations or to the TCCR, and let $\ker S_i^* = \ker C_i$. Then we say that the family $A_i = C_i S_i$, $i = 1, \dots, d$, is an unbounded representation of the quon relations (respectively, of the TCCR).

3.1. Representations of quon algebra

As for the one-dimensional q-CCR, the spectrum of $C_i^2 = A_i A_i^*$ in any irreducible representation is concentrated on a positive orbit of the mapping $f_i(x) = 1 + \alpha_i x$. The three possible types of orbits for $0 < \alpha_i < 1$ are:

(1) Fock orbit $O_F = \left\{ \frac{1-\alpha_i^n}{1-\alpha_i} ,\ n \in \mathbb{Z}_+ \right\}$;
(2) Stationary point $\left\{ \frac{1}{1-\alpha_i} \right\}$;
(3) Unbounded series $O_x = \{ f_i^n(x) ,\ n \in \mathbb{Z} \}$, defined by a fixed $y_i > \frac{1}{1-\alpha_i}$ and any $x \in [1 + \alpha_i y_i , y_i)$.
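The three orbit types of the map $f(x) = 1 + \alpha x$ can be illustrated numerically. The following is only a sketch for a single index with $0 < \alpha < 1$; the function names and the sample values $\alpha = 0.5$, $y = 4$, $x = 3.5$ are assumptions made for the illustration and do not come from the text.

```python
def f(alpha, x):
    """The affine map f(x) = 1 + alpha * x from Section 3.1."""
    return 1.0 + alpha * x

def fock_orbit(alpha, length):
    """First `length` points f^n(0), n = 0, 1, ..., of the Fock orbit."""
    points, x = [], 0.0
    for _ in range(length):
        points.append(x)
        x = f(alpha, x)
    return points

def stationary_point(alpha):
    """Fixed point of f, i.e. the solution of x = 1 + alpha * x."""
    return 1.0 / (1.0 - alpha)

def unbounded_orbit(alpha, x, steps):
    """Points f^n(x) for n = -steps, ..., steps (hypothetical helper).

    The inverse map is f^{-1}(t) = (t - 1) / alpha, which needs alpha != 0.
    """
    forward, backward = [x], []
    t = x
    for _ in range(steps):
        t = f(alpha, t)
        forward.append(t)
    t = x
    for _ in range(steps):
        t = (t - 1.0) / alpha
        backward.append(t)
    return backward[::-1] + forward

alpha = 0.5
orbit = fock_orbit(alpha, 6)
# The Fock orbit agrees with the closed form (1 - alpha^n) / (1 - alpha).
closed_form = [(1 - alpha ** n) / (1 - alpha) for n in range(6)]
# For y = 4 the interval [1 + alpha*y, y) = [3, 4) meets each unbounded orbit once.
series = unbounded_orbit(alpha, 3.5, 2)
```

In this example the Fock orbit increases towards the stationary point $1/(1-\alpha) = 2$, while the two-sided orbit through $x = 3.5$ stays above it and grows without bound in the backward direction, which is why the corresponding representations are unbounded.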
If $-1 < \alpha_i \le 0$, then only the orbits of the first two types are admissible. To describe the irreducible representations of $A_{\alpha,\Theta}$ we decompose the family of operators $\{C_i^2, i = 1, \dots, d\}$ into subfamilies according to the type of the orbits describing their spectrum. Let us consider a partition $\{1, \dots, d\} = \Phi_1 \cup \Phi_2 \cup \Phi_3$ and suppose that $\sigma(C_i^2)$ is the Fock orbit if $i \in \Phi_1$, the stationary point if $i \in \Phi_2$, and the unbounded orbit corresponding to some $x_i \in [1 + \alpha_i y_i, y_i)$ if $i \in \Phi_3$. Obviously, if $J = \{i \mid \alpha_i \le 0\}$, then the intersection of $J$ with $\Phi_3$ should be empty, and in the following we consider only partitions with this property. Given a partition $\Phi_1 \cup \Phi_2 \cup \Phi_3$, to fix the spectrum of every $C_i^2$, $i = 1, \dots, d$, one has to pick some $x_i \in [1 + \alpha_i y_i, y_i)$ for every $i \in \Phi_3$. Furthermore, it can be shown (see [27]) that if $\sigma(C_i^2) \ne \left\{ \frac{1}{1-\alpha_i} \right\}$, then all eigenvalues of $C_i^2$ have equal multiplicities and the eigenspaces are invariant with respect to the action of $S_j$, $S_j^*$, $j \ne i$. The operator $S_i$ acts on the family of eigenspaces of $C_i^2$ as a shift (bilateral or unilateral, according to the type of the orbit). These properties, together with the commutation rules $S_i^* S_j = \lambda_{ij} S_j S_i^*$, $S_i S_j = \lambda_{ji} S_j S_i$, $i \ne j$, allow one to describe completely the operators $C_i^2$, $i = 1, \dots, d$, and $\{S_i, i \notin \Phi_2\}$. However, as will be stated below, the family $\{S_i, i \in \Phi_2\}$ is determined only up to an irreducible representation of the non-commutative torus with $|\Phi_2|$ generators. To give the precise picture we need to introduce some auxiliary operators:
$$
\begin{aligned}
d(\lambda) &: l_2(\mathbb{N}) \to l_2(\mathbb{N}) , & d(\lambda) e_n &= \lambda^{n-1} e_n , & n &\in \mathbb{N} , \\
d(f) &: l_2(\mathbb{N}) \to l_2(\mathbb{N}) , & d(f) e_n &= f^{n-1}(0)\, e_n , & n &\in \mathbb{N} , \\
D(f, z) &: l_2(\mathbb{Z}) \to l_2(\mathbb{Z}) , & D(f, z) e_n &= f^{n}(z)\, e_n , & n &\in \mathbb{Z} , \\
U &: l_2(\mathbb{Z}) \to l_2(\mathbb{Z}) , & U e_n &= e_{n+1} , & n &\in \mathbb{Z} , \\
S &: l_2(\mathbb{N}) \to l_2(\mathbb{N}) , & S e_n &= e_{n+1} , & n &\in \mathbb{N} .
\end{aligned}
$$
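For a fixed partition $\{1, 2, \dots, d\} = \Phi_1 \cup \Phi_2 \cup \Phi_3$, consider an irreducible family of unitaries $\{U_i, i \in \Phi_2\}$ satisfying $U_i U_j = \lambda_{ji} U_j U_i$, $i \ne j$, $i, j \in \Phi_2$, acting on some Hilbert space $K$ (i.e. an irreducible representation of the non-commutative torus). Put $\tau_i := [1 + \alpha_i y_i, y_i)$ for some fixed $y_i > \frac{1}{1-\alpha_i}$, $i = 1, \dots, d$, and pick $x_i \in \tau_i$, $i \in \Phi_3$. Let us construct the operators $A_i = C_i S_i$, $i = 1, \dots, d$, where
$$C_i^2 = \bigotimes_{j<i,\, j \notin \Phi_2} 1 \otimes d(f_i) \otimes \bigotimes_{j>i,\, j \notin \Phi_2} 1 \otimes 1 , \qquad S_i = \bigotimes_{j<i,\, j \notin \Phi_2} U_{ij} \otimes S \otimes \bigotimes_{j>i,\, j \notin \Phi_2} 1 \otimes 1 , \qquad i \in \Phi_1 ,$$
$$C_i^2 = \frac{1}{1-\alpha_i} \bigotimes_{j \notin \Phi_2} 1 \otimes 1 , \qquad S_i = \bigotimes_{j \notin \Phi_2} U_{ij} \otimes U_i , \qquad i \in \Phi_2 ,$$
$$C_i^2 = \bigotimes_{j<i,\, j \notin \Phi_2} 1 \otimes D(f_i, x_i) \otimes \bigotimes_{j>i,\, j \notin \Phi_2} 1 \otimes 1 , \qquad S_i = \bigotimes_{j<i,\, j \notin \Phi_2} U_{ij} \otimes U \otimes \bigotimes_{j>i,\, j \notin \Phi_2} 1 \otimes 1 , \qquad i \in \Phi_3 ,$$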
and $U_{ij} = d(\lambda_{ji})$, $j \in \Phi_1$, $j \ne i$, and $U_{ij} = D(\lambda_{ji})$, $j \in \Phi_3$, $j \ne i$. Summing up, we state the following theorem.

Theorem 7. The family constructed above determines an irreducible representation of the quon commutation relations. The family $\{A_i^{(1)}\}$ corresponding to $\Phi_i^{(1)}$, $i = 1, 2, 3$, $\{x_i^{(1)}, i \in \Phi_3^{(1)}\}$ and $\{U_i^{(1)}, i \in \Phi_2^{(1)}\}$, and the family $\{A_i^{(2)}\}$ corresponding to $\Phi_i^{(2)}$, $i = 1, 2, 3$, $\{x_i^{(2)}, i \in \Phi_3^{(2)}\}$ and $\{U_i^{(2)}, i \in \Phi_2^{(2)}\}$, are unitarily equivalent iff $\Phi_i^{(1)} = \Phi_i^{(2)}$, $i = 1, 2, 3$; $x_i^{(1)} = x_i^{(2)}$, $i \in \Phi_3$; and the family $\{U_i^{(1)}, i \in \Phi_2\}$ is unitarily equivalent to the family $\{U_i^{(2)}, i \in \Phi_2\}$. Any irreducible representation of $A_{\alpha,\Theta}$ has the form described above.

Remark 2.
(1) In the Appendix the irreducible families of unitaries $U_i$, $i \in \Phi_2$, are discussed when the $\lambda_{ij}$, $i \ne j$, are roots of unity.
(2) The representation defined above is bounded iff $\Phi_3 = \emptyset$, and in any bounded representation $\|A_i\|^2 = \frac{1}{1-\alpha_i}$ if $0 < \alpha_i < 1$ and $\|A_i\| = 1$ if $-1 < \alpha_i \le 0$. In particular, if $-1 < \alpha_i \le 0$ for every $i = 1, \dots, d$, then any representation of $A_{\alpha,\Theta}$ is bounded.
(3) The Fock representation corresponds to the case $\Phi_2 = \Phi_3 = \emptyset$. Indeed, in this case the representation space is $\bigotimes_{i=1}^{d} l_2(\mathbb{N})$ and for $\Omega = \bigotimes_{i=1}^{d} e_1$ one has $A_i^* \Omega = S_i^* C_i \Omega = 0$, $i = 1, \dots, d$.

3.2. Representations of TCCR

Recall that the $*$-representations of the TCCR were originally classified by W. Pusz and S. L. Woronowicz (see [28]). However, it is more convenient for us to present the representations of the TCCR in a form similar to that of the quon CCR. In contrast with the quon commutation relations, where the mappings $F_i$ are diagonal, for the TCCR one has “triangular” ones. Therefore the spectrum of $C_i^2$ has a more complicated structure. The irreducible representations of the TCCR can be divided into three types according to the structure of the sets $\sigma(A_i A_i^* = C_i^2)$, $i = 1, \dots, d$.
To the first type belongs the Fock representation only. Here we have $\sigma(C_1^2) = \left\{ \frac{1-\mu^{2n}}{1-\mu^2} ,\ n \in \mathbb{Z}_+ \right\}$ and $\sigma(C_i^2) = \left\{ \mu^{2m} \frac{1-\mu^{2n}}{1-\mu^2} ,\ m, n \in \mathbb{Z}_+ \right\}$. Put $f(t) = 1 + \mu^2 t$. Then in the Fock representation the operators $A_j$, $j = 1, \dots, d$, are defined as follows:
$$A_j = \bigotimes_{k<j} d(\mu) \otimes (d(f))^{1/2} S \otimes \bigotimes_{k>j} 1 .$$
To the second type belong the irreducible representations having the first $i-1$ generators the same as in the Fock representation, but now the spectrum of $C_i^2$ is equal to $\{\mu^{2n} (1-\mu^2)^{-1} ,\ n \in \mathbb{Z}_+\}$ if $i > 1$, or $\sigma(C_1^2) = \left\{ \frac{1}{1-\mu^2} \right\}$ if $i = 1$. We say that such representations correspond to the $i$th stationary point. To describe the operators $A_j$, $j > i$, where $i$ is a stationary point, we introduce some more notation. First, for a fixed $y > 0$ define $\tau_1 = [\mu^2 y, y)$ and for any $z \in \tau_1$ construct
$$f_z(t) = -(1-\mu^2) z + \mu^2 t .$$
Then construct some auxiliary operators $\hat d(\mu), \hat S, \hat d(h) : l_2(\mathbb{Z}_-) \to l_2(\mathbb{Z}_-)$ and $D(\mu) : l_2(\mathbb{Z}) \to l_2(\mathbb{Z})$:
$$\hat S e_{-n} = e_{-n+1} , \ n \ge 1 , \quad \hat S e_0 = 0 , \qquad \hat d(\mu) e_{-n} = \mu^{-n} e_{-n} , \quad \hat d(h) e_{-n} = h^{-n}(0)\, e_{-n} , \ n \in \mathbb{Z}_+ , \qquad D(\mu) e_n = \mu^n e_n , \ n \in \mathbb{Z} ,$$
where $h(\cdot)$ is any one-to-one mapping on $\mathbb{R}$. Finally, note that if $i < d$ there exist “degenerate” representations of the TCCR where $A_k = 0$ for $i < k \le l$, with any choice of $l$, $i < l \le d$. So, to describe the representations corresponding to the $i$th stationary point, $i \in \{1, \dots, d\}$, we additionally pick $l = l(i)$, $i \le l \le d$, $l(d) := d$, to fix the number of trivial generators, and $z \in \tau_1$. In the following, by $j < k \le s$ where $s \le j$ we mean that $k \in \emptyset$. Consider first the case $i < d$ and $l < d$. Under these conditions one has $\sigma(C_k^2) = \{0\}$, $i < k \le l$, $\sigma(C_{l+1}^2) = \{\mu^{2n} z ,\ n \in \mathbb{Z}\}$ and $\sigma(C_k^2) = \{\mu^{2n} f_z^{-m}(0) ,\ n \in \mathbb{Z},\ m \in \mathbb{Z}_+\}$, $l + 1 < k \le d$. To be more precise,
$$
\begin{aligned}
A_j &= \bigotimes_{k<j} d(\mu) \otimes (d(f))^{1/2} S \otimes \bigotimes_{k>j} 1 , && j < i , \\
A_i &= \bigotimes_{k<i} d(\mu) \otimes (1-\mu^2)^{-1/2}\, U \otimes \bigotimes_{k>l+1} 1 , \\
A_k &= 0 , && i + 1 \le k \le l , \\
A_{l+1} &= \bigotimes_{k<i} d(\mu) \otimes z^{1/2} D(\mu) U \otimes \bigotimes_{k>l+1} 1 , \\
A_j &= \bigotimes_{k<i} d(\mu) \otimes D(\mu) \otimes \bigotimes_{l+1<k<j} \hat d(\mu) \otimes (\hat d(f_z))^{1/2} \hat S \otimes \bigotimes_{k>j} 1 , && j > l + 1 .
\end{aligned}
$$
For $l = d$ or $i = d$ we have
$$A_j = \bigotimes_{k<j} d(\mu) \otimes (d(f))^{1/2} S \otimes \bigotimes_{k>j} 1 , \quad j < i , \qquad A_i = e^{i\psi} (1-\mu^2)^{-1/2} \bigotimes_{k<i} d(\mu) , \quad \psi \in [0, 2\pi) ,$$
$A_j = 0$, $i < j \le d$. Note that the second type representations are parametrized by the tuples $(i, l(i), z, \psi)$, where $i = 1, \dots, d$, $i \le l(i) \le d$, $z \in \tau_1$, $\psi \in [0, 2\pi)$ and, by definition, we have the identifications $(i, l, z, \psi_1) \simeq (i, l, z, \psi_2)$ for $l < d$ and $(i, d, z_1, \psi) \simeq (i, d, z_2, \psi)$ for $l = d$.

Finally, consider the representations corresponding to the unbounded positive orbits of the mapping $f(\cdot)$. Fix some $x > \frac{1}{1-\mu^2}$ and pick any $w \in \tau_2 := [1 + \mu^2 x, x)$. The third type representations are characterized by the condition that, for a fixed $i = 1, \dots, d$, the operators $A_j$, $j < i$, are the same as in the Fock representation and the spectrum of $C_i^2$ equals $\{\mu^{2n} f^m(w) ,\ n \in \mathbb{Z}_+,\ m \in \mathbb{Z}\}$. Let us denote by $g_w(\cdot)$ the linear function
$$g_w(t) = 1 - (1-\mu^2) w + \mu^2 t ;$$
then the spectrum of $C_j^2$, $i < j \le d$, is equal to $\{\mu^{2n} g_w^{-m}(0) ,\ n \in \mathbb{Z},\ m \in \mathbb{Z}_+\}$. Thus, the operators of the third type representation determined by $i$, $1 \le i \le d$, and $w \in \tau_2$ are the following:
$$
\begin{aligned}
A_j &= \bigotimes_{k<j} d(\mu) \otimes (d(f))^{1/2} S \otimes \bigotimes_{k>j} 1 , && j < i , \\
A_i &= \bigotimes_{k<i} d(\mu) \otimes (D(f, w))^{1/2} U \otimes \bigotimes_{k>i} 1 , \\
A_j &= \bigotimes_{k<i} d(\mu) \otimes D(\mu) \otimes \bigotimes_{i<k<j} \hat d(\mu) \otimes (\hat d(g_w))^{1/2} \hat S \otimes \bigotimes_{k>j} 1 , && i < j \le d .
\end{aligned}
$$
The representations of the third type are parametrized by the ordered pairs $(i, w)$, where $i = 1, \dots, d$, $w \in \tau_2$. So, we have the following result.

Theorem 8. Any irreducible representation of the TCCR is unitarily equivalent to one of those defined above. The representations corresponding to different types, or to different values of the parameters inside the same type, are inequivalent.

It was pointed out in [14] that any bounded irreducible representation of the TCCR is coherent, i.e. has a vector $\Omega$ such that $A_i^* \Omega = \mu_i \Omega$ for any $i = 1, \dots, d$.
Indeed, an irreducible representation of the TCCR is bounded if either it is the Fock representation or it corresponds to the $i$th stationary point for some $i = 1, \dots, d$ with $l(i) = d$. In the latter case $A_j^* \Omega = 0$, $j \ne i$, and $A_i^* \Omega = e^{i\psi} (1-\mu^2)^{-1/2} \Omega$.

The bounded representations of the GCCR are classified in [26] in a manner similar to that of the TCCR and the quon relations. Since the formulas involved in that classification are rather complicated, we omit them here. We note only that the bounded representations of the GCCR are uniformly bounded. Namely, for any bounded irreducible representation of the GCCR one has
$$\|\pi(a_i a_i^*)\| \le \frac{1}{1-\alpha_i} , \quad i = 1, \dots, d .$$
In particular, this fact implies the existence of the universal bounded representation for the GCCR, and hence for the TCCR and the quon relations as well. The Fock representation of the GCCR is also bounded and, as stated in the next section, the Fock representation is the universal bounded representation of the GCCR. Equivalently, the $C^*$-algebra generated by the GCCR coincides with the $C^*$-algebra generated by the operators of the Fock representation.

3.3. Representations of q-CCR

Via the stability of $A_q$ at $q = 0$ (see the next section), one could classify the representations of the q-CCR as representations of the Cuntz–Toeplitz algebra $A_0$. In fact, any non-Fock irreducible representation of $A_0$ factors through the Cuntz algebra $O_d$. The representations of $O_d$ could be the theme of a separate review (see [21, 22, 19, 20]). However, some natural classes of irreducible representations of the q-CCR can be described without reference to the stability theorem. Namely, following [16], one can generalize the notion of the Fock representation in the obvious way.

Definition 5. Let $\varphi \in \mathbb{C}^d$. The coherent representation $\pi_\varphi(\cdot)$ associated with $\varphi$ is the cyclic representation with vacuum vector $\Omega$ such that $\pi_\varphi(a_i^*) \Omega = \varphi_i \Omega$ for any $i = 1, \dots, d$.

For a given $\varphi \in \mathbb{C}^d$, we can construct the linear functional $\omega_\varphi$ on $A_q$ defined by the rule
$$\omega_\varphi(a_i X) = \varphi_i\, \omega(X) , \quad \forall\, X \in A_q ,$$
where $\omega$ is the Fock state on $A_q$, i.e. $\omega(X)$ is the constant term in the decomposition of $X$ into a linear combination of Wick ordered monomials. When $\omega_\varphi$ is positive, the GNS representation corresponding to $\omega_\varphi$ is exactly the coherent representation $\pi_\varphi$.

Theorem 9. The functional $\omega_\varphi$ is positive iff $\|\varphi\| \le (1-q)^{-1/2}$. For any $\varphi$ with $\|\varphi\| < (1-q)^{-1/2}$, the representation $\pi_\varphi$ is unitarily equivalent to the Fock one. The representations corresponding to different $\varphi$ with $\|\varphi\| = (1-q)^{-1/2}$ are called peripheral and are pairwise inequivalent. In the positive case the state $\omega_\varphi$ is pure, hence the representation $\pi_\varphi$ is irreducible.
It was also shown in [16] that any peripheral coherent representation is bounded, hence it can be extended to a representation of the $C^*$-algebra $A_q = A_q^{(d)}$ generated by the q-CCR. Let us return for a moment to the q-CCR with one generator. In this case the peripheral coherent representations coincide with the one-dimensional series, and hence the unique proper ideal $K$ in the $C^*$-algebra $A_q^{(1)} \simeq \mathcal{T}(C(\mathbb{T}))$ generated by the one-dimensional q-CCR is annihilated by any peripheral representation. The analogous fact is valid in the general situation (see [16]).

Proposition 3. Let $\pi_\varphi$ be a peripheral coherent representation of $A_q$. Then for any proper two-sided ideal $J \subset A_q$ one has $\pi_\varphi(J) = \{0\}$.

4. Stability

In this section we present the higher-dimensional analogues of Proposition 2 for each type of deformed CCR. We start with the higher-dimensional q-CCR. It was proved by P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner in [15] that for $|q| < \sqrt{2} - 1$ the universal bounded representation $A_q$ of the q-CCR is isomorphic to the Cuntz–Toeplitz algebra, i.e. $A_0$. Recall that the Cuntz–Toeplitz algebra (see [9]) is $C^*$-generated by a family of isometries $v_i$, $i = 1, \dots, d$, having orthogonal ranges, i.e.
$$v_i^* v_j = \delta_{ij} 1 , \quad i, j = 1, \dots, d .$$
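One concrete model of isometries with orthogonal ranges is given by "creation" operators on the full Fock space, whose basis is indexed by finite words in the letters $1, \dots, d$. The sketch below is only an illustration under that assumption: vectors are finitely supported dictionaries from words (tuples) to coefficients, and the helper names `v`, `v_star` and `defect` are hypothetical, not notation from the text.

```python
def v(i, vec):
    """Isometry v_i: prepend the letter i to every basis word."""
    return {(i,) + word: coeff for word, coeff in vec.items()}

def v_star(i, vec):
    """Adjoint v_i^*: strip a leading letter i and annihilate all other words."""
    return {word[1:]: coeff for word, coeff in vec.items() if word and word[0] == i}

def defect(d, vec):
    """Apply 1 - sum_i v_i v_i^* to a vector (letters are 1, ..., d)."""
    out = dict(vec)
    for i in range(1, d + 1):
        for word, coeff in v(i, v_star(i, vec)).items():
            out[word] = out.get(word, 0) - coeff
    # Drop entries that cancelled exactly.
    return {word: coeff for word, coeff in out.items() if coeff != 0}

x = {(): 2, (1, 2): 3, (2,): -1}
same_index = v_star(1, v(1, x))    # v_1^* v_1 x
mixed_index = v_star(1, v(2, x))   # v_1^* v_2 x
vacuum_part = defect(2, x)         # only the empty-word component survives
```

In this model $v_i^* v_j$ acts as $\delta_{ij}$ on every finitely supported vector, and $1 - \sum_i v_i v_i^*$ is the projection onto the vacuum (the empty word).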
The unique two-sided closed ideal $K \subset A_0$ generated by the projection $p = 1 - \sum_{i=1}^{d} v_i v_i^*$ is isomorphic to the $C^*$-algebra of compact operators, and $A_0/K \simeq O_d$. Here $O_d$ is the Cuntz algebra, which is an example of a simple infinite $C^*$-algebra (see [9]).

Theorem 10. For $|q| < \sqrt{2} - 1$ the $C^*$-algebra $A_q$ is isomorphic to $A_0$. The isomorphism identifies $a_i \in A_q$ with $\rho v_i \in A_0$, $i = 1, \dots, d$, where $\rho$ is the positive element of $A_0$ uniquely determined by $(1 - \sum_i v_i v_i^*) \rho = 0$ and the condition that $\{v_i \rho, i = 1, \dots, d\}$ satisfy the q-CCR.

The “stability” interval for $q$ was extended by K. Dykema and A. Nica (see [12]) for the $C^*$-algebra generated by the Fock representation of the q-CCR. However, even in this case the conjecture that the claim of Theorem 10 is true for any $q \in (-1, 1)$ has not been proved yet. Since any irreducible representation of $A_0$ either is the Fock representation or factors through $O_d$, one can conclude that the Fock representation of $A_0$ is faithful (see [9, 14]). From the construction of the isomorphism $A_q \simeq A_0$ it is easy to see that $a_i^* \Omega = 0$, $i = 1, \dots, d$, iff $v_i^* \Omega = 0$, $i = 1, \dots, d$. So the Fock representation of $A_q$ coincides with that of $A_0$, and as in the one-dimensional case one has the following.

Proposition 4. The Fock representation of $A_q$ is faithful for $|q| < \sqrt{2} - 1$.
Now let us discuss the similar problems for the $C^*$-algebra $A_{\alpha,\Theta}$ generated by the quon CCR. A direct application of Proposition 2 shows that $A_{\alpha,\Theta} \simeq A_{0,\Theta}$ for any values of $\alpha_i$, $-1 < \alpha_i < 1$. Indeed, suppose that $A_{\alpha,\Theta}$ is faithfully realized as an algebra of operators on some Hilbert space $H$. Consider the polar decompositions $a_i = c_i s_i$; then Proposition 2 implies that the correspondence $s_i \to t_i$, where $\{t_i, i = 1, \dots, d\}$ are the isometries generating $A_{0,\Theta}$, can be extended to an isomorphism of the $C^*$-algebras $A_{\alpha,\Theta}$ and $A_{0,\Theta}$. Note that $A_{0,\Theta}$ is the $C^*$-algebra generated by the isometries $t_i$, $i = 1, \dots, d$, satisfying the relations
$$t_i^* t_j = \lambda_{ij}\, t_j t_i^* , \qquad t_i t_j = \lambda_{ji}\, t_j t_i , \qquad i \ne j .$$
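For $d = 2$ these relations can be checked directly on the Fock realization used in the Appendix, $S_1 = S \otimes 1$, $S_2 = d(\lambda) \otimes S$ with $\lambda = \lambda_{12}$. The following sketch assumes that realization and represents finitely supported vectors of $l_2(\mathbb{N}) \otimes l_2(\mathbb{N})$ as dictionaries keyed by index pairs $(m, n)$ with $m, n \ge 1$; the helper names and the sample value $\theta_{12} = 0.3$ are illustrative assumptions.

```python
import cmath

def apply(op, vec):
    """Apply an operator given by its action on basis vectors e_m (x) e_n."""
    out = {}
    for (m, n), coeff in vec.items():
        image = op(m, n)
        if image is not None:
            key, factor = image
            out[key] = out.get(key, 0) + factor * coeff
    return out

def make_generators(lam):
    """Return S1, S1^*, S2, S2^* for S1 = S (x) 1 and S2 = d(lam) (x) S."""
    def s1(m, n):
        return (m + 1, n), 1
    def s1_star(m, n):
        return ((m - 1, n), 1) if m >= 2 else None
    def s2(m, n):
        return (m, n + 1), lam ** (m - 1)
    def s2_star(m, n):
        return ((m, n - 1), lam.conjugate() ** (m - 1)) if n >= 2 else None
    return s1, s1_star, s2, s2_star

def scale(z, vec):
    return {key: z * coeff for key, coeff in vec.items()}

def close(u, w, tol=1e-12):
    """Compare two finitely supported vectors up to rounding error."""
    return all(abs(u.get(k, 0) - w.get(k, 0)) < tol for k in set(u) | set(w))

lam12 = cmath.exp(2j * cmath.pi * 0.3)   # sample value, theta_12 = 0.3
lam21 = lam12.conjugate()                # since theta_21 = -theta_12
S1, S1s, S2, S2s = make_generators(lam12)
x = {(1, 1): 1.0, (2, 3): 2 - 1j, (4, 1): 0.5}

rel_star = close(apply(S1s, apply(S2, x)), scale(lam12, apply(S2, apply(S1s, x))))
rel_prod = close(apply(S1, apply(S2, x)), scale(lam21, apply(S2, apply(S1, x))))
```

Both flags evaluate to `True` for this test vector, in agreement with $t_1^* t_2 = \lambda_{12} t_2 t_1^*$ and $t_1 t_2 = \lambda_{21} t_2 t_1$; the check is a numerical illustration on one vector, not a proof.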
To continue the analogy with the higher-dimensional q-CCR, let us show that for almost all values of the parameters $\theta_{ij}$, $i \ne j$, the $C^*$-algebra $A_{0,\Theta}$ is an extension of a simple $C^*$-algebra by the unique largest two-sided ideal. First, we recall some results from the theory of non-commutative tori (see [42, 35, 31]).

Definition 6. Let $\Theta = (\theta_{ij})_{i,j=1}^{d}$ be a real antisymmetric matrix. The non-commutative torus $A_\Theta$ is the $C^*$-algebra generated by unitary elements $U_i$, $i = 1, \dots, d$, satisfying the commutation relations
$$U_i U_j = e^{2\pi i \theta_{ji}}\, U_j U_i , \quad i \ne j .$$
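When $d = 2$ and $\theta_{21} = p/n$ is rational, the defining relation can be realized by $n \times n$ matrices of the kind recalled in the Appendix, $U e_k = z_1 \lambda^{k-1} e_k$, $V e_l = e_{l+1}$, $V e_n = z_2 e_1$. The sketch below builds these matrices with plain Python lists (indices shifted to start at 0) and checks $UV = \lambda VU$; the function names and the sample parameters are assumptions made for the illustration.

```python
import cmath

def clock_and_shift(n, p, z1=1.0, z2=1.0):
    """Return (lam, U, V) with lam = exp(2*pi*i*p/n).

    Matrices are stored so that M[r][c] is the coefficient of e_r in M e_c.
    """
    lam = cmath.exp(2j * cmath.pi * p / n)
    U = [[z1 * lam ** c if r == c else 0.0 for c in range(n)] for r in range(n)]
    V = [[0.0] * n for _ in range(n)]
    for c in range(n - 1):
        V[c + 1][c] = 1.0      # V e_c = e_{c+1}
    V[0][n - 1] = z2           # V e_{n-1} = z2 e_0
    return lam, U, V

def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

lam, U, V = clock_and_shift(3, 1, z1=1j, z2=-1.0)
UV, VU = matmul(U, V), matmul(V, U)
relation_holds = close(UV, [[lam * entry for entry in row] for row in VU])
```

With $U_1 = U$ and $U_2 = V$ this gives $U_1 U_2 = e^{2\pi i p/n} U_2 U_1$. In the same example $U^n$ and $V^n$ are the scalars $z_1^n$ and $z_2$, which is one way to see that such a representation cannot occur for an irrational torus.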
The non-commutative torus is called irrational if for some fixed $i$ the family $\{1, \theta_{ij}, j \ne i\}$ is linearly independent over $\mathbb{Q}$; in this case we say that the matrix $\Theta$ is irrational. Some properties of the irrational tori are specified in the following theorem, see [42, 31].

Theorem 11. The irrational non-commutative torus $A_\Theta$ is a simple $C^*$-algebra with the unique tracial state $\tau$ defined by
$$\tau(X) = \int_{\lambda \in \mathbb{T}^d} \rho_\lambda(X)\, d\lambda ,$$
where $\mathbb{T}^d$ is the $d$-dimensional torus and $\rho_\lambda$ is the unique automorphism of $A_\Theta$ defined by $\rho_\lambda(U_i) = \lambda_i U_i$, $i = 1, \dots, d$.

The simplicity of the irrational torus implies the following result (see [29]).

Theorem 12. Let $A_{0,\Theta}$ correspond to an irrational matrix $\Theta$. Consider the ideal $M$ generated by the projections $1 - t_i t_i^*$, $i = 1, \dots, d$. Then $M$ is the unique largest two-sided ideal in $A_{0,\Theta}$ and $A_{0,\Theta}/M \simeq A_\Theta$. The tracial state on $A_\Theta$ can be lifted to $A_{0,\Theta}$ and the lift, say $\tilde\tau$, is the unique tracial state on $A_{0,\Theta}$. Then $M = \{b \mid \tilde\tau(b^* b) = 0\}$.

However, the structure of $M$ is not as clear as for the q-CCR. In the Appendix we describe $M$ for the particular case $d = 2$. Suppose we have an isomorphism $\varphi : A_{0,\Theta_1} \to A_{0,\Theta_2}$, where $\Theta_i$, $i = 1, 2$, are irrational matrices. Since $\varphi$ maps $M_1$ onto $M_2$, it induces an isomorphism of the corresponding irrational tori. The classification of the irrational tori is an open problem, see [42, 35, 34, 36, 37, 38].
Finally, we examine the faithfulness of the Fock representation of $A_{\alpha,\Theta}$. First, note that, as for the q-CCR, the Fock representation of $A_{\alpha,\Theta}$ coincides with that of $A_{0,\Theta}$. We call the matrix $\Theta$ rational if $\theta_{ij} \in \mathbb{Q}$, $i, j = 1, \dots, d$.

Proposition 5. The Fock representation of $A_{0,\Theta}$ corresponding either to an irrational or to a rational $\Theta$ is faithful.

The proof of this statement for irrational $\Theta$ can be found in [29]. The rational case is considered in the Appendix.

The situation for the $C^*$-algebra $B_\mu$ generated by the TCCR is the clearest.

Theorem 13. For any $\mu \in (0, 1)$ one has the isomorphism $B_\mu \simeq B_0$, where the $C^*$-algebra $B_0$ is generated by partial isometries $\{s_i, i = 1, \dots, d\}$ satisfying the relations
$$s_i^* s_j = \delta_{ij} \Bigl( 1 - \sum_{k<i} s_k s_k^* \Bigr) , \quad i, j = 1, \dots, d , \qquad s_j s_i = 0 , \quad i < j .$$
The Fock representation of $B_\mu$ is faithful and coincides with that of $B_0$ (see [30] for the proof).

As an example of an application of this result we compute the $K_0$ and $K_1$ groups of $B_\mu$. Namely, we show that $K_0(B_\mu) = \mathbb{Z}$ and $K_1(B_\mu) = \{0\}$ (see the Appendix). To obtain the stability theorem for the GCCR one has to combine the results for the TCCR and the quon relations.

Theorem 14. For any $\alpha = \{\alpha_i, 0 \le \alpha_i < 1, i = 1, \dots, d\}$, $k$ and $\Theta$, we have the isomorphism $A_{\alpha,k,\Theta} \simeq A_{0,k,\Theta}$, where the $C^*$-algebra $A_{0,k,\Theta}$ is generated by a family of partial isometries $\{s_i, i = 1, \dots, d\}$ satisfying the following commutation relations:
$$
\begin{aligned}
s_i^* s_i &= 1 - \sum_{j<i} s_j s_j^* , \\
s_i^* s_j &= 0 , \quad s_j s_i = 0 , && i<j ,\ k_i \ge j , \\
s_i^* s_j &= \lambda_{ij}\, s_j s_i^* , \quad s_j s_i = \lambda_{ij}\, s_i s_j , && i<j ,\ k_i < j .
\end{aligned}
$$

An interesting detail is that if $k = (d, d, \dots, d)$, the $C^*$-algebra $A_{0,k,\Theta}$ does not depend on $\Theta$ and is isomorphic to the $C^*$-algebra generated by the TCCR with $\mu = 0$, i.e. to $B_0$. As for the quon relations, when the matrix $\Theta$ is rational or irrational, the Fock representation of $A_{0,k,\Theta}$ is faithful. The proof of this statement may be obtained by an elaboration of the proofs for the quon relations and the TCCR.
Appendix

Fock representation of $A_{0,\Theta}$. The rational case

In this section we prove the faithfulness of the Fock representation of $A_{\alpha,\Theta} \simeq A_{0,\Theta}$ corresponding to a rational matrix $\Theta$. Further, we clarify the structure of the ideal $M \subset A_{0,\Theta}$ when $d = 2$. Since the Fock representation of $A_{\alpha,\Theta}$ coincides with that of $A_{0,\Theta}$, it suffices to prove that any irreducible representation $\pi(\cdot)$ of $A_{0,\Theta}$ can be represented as the composition $\pi = \psi \pi_F$, where $\pi_F$ is the Fock representation and $\psi : F_\Theta \to C_\pi$ is a homomorphism from the $C^*$-algebra generated by the operators of $\pi_F$ to the $C^*$-algebra generated by $\pi$. Let us first recall that the operators of the Fock representation of $A_{0,\Theta}$ have the following form:
$$\pi_F(s_i) = S_i = \bigotimes_{1 \le j < i} d(\lambda_{ji}) \otimes S \otimes \bigotimes_{i < j \le d} 1 .$$
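The irreducible representation of $A_{0,\Theta}$ that corresponds to a subset $\Phi \subset \{1, 2, \dots, d\}$ and an irreducible representation $\{U_i, i \in \Phi\}$ of the rational torus $A_{\tilde\Theta}$, where $\tilde\Theta = (\theta_{ij})_{i,j \in \Phi}$, is defined by the operators
$$S_i = \bigotimes_{1 \le j < i,\, j \notin \Phi} d(\lambda_{ji}) \otimes S \otimes \bigotimes_{i < j \le d,\, j \notin \Phi} 1 \otimes 1 , \quad i \notin \Phi , \qquad S_i = \bigotimes_{j \notin \Phi} d(\lambda_{ji}) \otimes U_i , \quad i \in \Phi .$$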
The representations that correspond to $\Phi = \{1, \dots, d\}$ will be called the unitary representations of $A_{0,\Theta}$. Note that in the rational case the non-commutative tori $A_{\tilde\Theta}$ are not simple. More precisely, they are isomorphic to tensor products of several two-dimensional rational non-commutative tori and a $C^*$-algebra of continuous functions on a torus of an appropriate dimension, see [32]. Namely, consider a family of unitaries $\{U_i, i = 1, \dots, d\}$ which satisfies the commutation relations
$$U_i U_j = \lambda_{ji}\, U_j U_i , \qquad \lambda_{ji} = e^{2\pi i \theta_{ji}} , \qquad \theta_{ij} = -\theta_{ji} , \qquad i \ne j ,$$
and $\lambda_{ij} = \lambda^{n_{ij}}$, where $\lambda$ is a primitive $n$th root of unity. Clearly $M := (n_{ij})$ is an antisymmetric matrix with integer entries. Now let us consider the new family of generators $\tilde U_i$, $i = 1, \dots, d$, where $\tilde U_j = U_j$, $j \ne l$, and $\tilde U_l = U_l U_i^k$, for a fixed $k \in \mathbb{Z}$ and $i \ne l$. It is easy to verify that the family $\{\tilde U_j\}$ satisfies the relations corresponding to the matrix $\tilde M = S^t M S$, where $S$ is the elementary matrix obtained from the identity matrix by adding the $i$th column multiplied by $k$ to the $l$th column. By using such transformations one can reduce the antisymmetric matrix
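$M \in M_d(\mathbb{Z})$ to the form
$$
\tilde M =
\begin{pmatrix}
0 & -n_1 & & & & & & \\
n_1 & 0 & & & & & & \\
 & & \ddots & & & & & \\
 & & & 0 & -n_k & & & \\
 & & & n_k & 0 & & & \\
 & & & & & 0 & & \\
 & & & & & & \ddots & \\
 & & & & & & & 0
\end{pmatrix} .
$$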
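The congruence $M \mapsto S^t M S$ by an elementary matrix amounts to one column operation followed by the matching row operation, and it keeps the matrix antisymmetric with integer entries. The sketch below works through a small case; the function name `congruence`, the 0-based indices and the sample $3 \times 3$ matrix are assumptions chosen for the illustration.

```python
def congruence(M, i, l, k):
    """Return S^T M S for S = I + k * E_{il} (0-based indices, i != l).

    M S adds k times column i to column l; multiplying by S^T on the left
    then adds k times row i to row l.
    """
    d = len(M)
    A = [row[:] for row in M]
    for r in range(d):
        A[r][l] += k * A[r][i]
    for c in range(d):
        A[l][c] += k * A[i][c]
    return A

def is_antisymmetric(M):
    d = len(M)
    return all(M[r][c] == -M[c][r] for r in range(d) for c in range(d))

M = [[0, 2, 4],
     [-2, 0, 6],
     [-4, -6, 0]]

step1 = congruence(M, 1, 2, -2)     # clears the (0, 2) and (2, 0) entries
step2 = congruence(step1, 0, 2, 3)  # clears the (1, 2) and (2, 1) entries
```

After the two steps the matrix consists of a single $2 \times 2$ block with entries $\pm 2$ and a zero row and column, i.e. it has the block shape displayed above up to the sign convention for the block.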
Evidently, this means that the non-commutative torus $A_\Theta$ is isomorphic to $A_{\theta_1} \otimes \cdots \otimes A_{\theta_k} \otimes C(\mathbb{T}^{d-2k})$, where $A_{\theta_j}$ is the $C^*$-algebra generated by unitary elements $u$, $v$ satisfying the relation $uv = e^{2\pi i \theta_j} vu$, $e^{2\pi i \theta_j} = \lambda^{n_j}$, $j = 1, \dots, k$, and $\mathbb{T}^{d-2k}$ is the $(d-2k)$-dimensional torus formed by the remaining commuting generators; see [32] for more details. As an important corollary we note that any irreducible representation of a rational non-commutative torus is finite-dimensional. Indeed, any irreducible representation decomposes into the tensor product of irreducible representations of $A_{\theta_j}$, $j = 1, \dots, k$, and an irreducible representation of $C(\mathbb{T}^{d-2k})$. The latter is one-dimensional. Thus the problem is reduced to the description of the irreducible representations of $A_\theta$, where $\lambda = e^{2\pi i \theta}$ is a root of unity. If $\lambda^n = 1$, then any irreducible representation of $A_\theta$ is $n$-dimensional, i.e. $A_\theta$ is a homogeneous $C^*$-algebra, see [41], and
$$U e_k = z_1 \lambda^{k-1} e_k , \quad k = 1, \dots, n , \qquad V e_l = e_{l+1} , \quad l < n , \qquad V e_n = z_2 e_1 ,$$
where $e_1, \dots, e_n$ is an orthonormal basis of $\mathbb{C}^n$ and $z_i \in \mathbb{C}$, $|z_i| = 1$, $i = 1, 2$. The representations that correspond to different tuples $(z_1, z_2)$ are not equivalent (see, for example, [41]). Now we are ready to prove the following proposition.

Proposition A.1. Let $\Theta$ be a rational antisymmetric matrix. For any irreducible representation $\pi$ of $A_{0,\Theta}$ there exists a homomorphism $\varphi : F_\Theta \to C_\pi$ such that $\pi = \varphi \pi_F$.

Proof. (1) Let us first prove this statement in the case when $\pi(s_i) = U_i$ is unitary for every $i = 1, \dots, d$. Let $\lambda \in \mathbb{C}$, $\lambda^n = 1$, be such that $e^{2\pi i \theta_{jk}} = \lambda^{n_{jk}} =: \lambda_{jk}$, $j \ne k$, for some natural numbers $n_{jk}$. Consider the group
$$G = \langle z, u_1, \dots, u_d \mid z^n = u_i^n = e ,\ u_j u_k = z^{n_{kj}} u_k u_j ,\ z u_i = u_i z \rangle .$$
Since $\pi$ is irreducible and $U_i^n$ commutes with every $U_j$, $j = 1, \dots, d$, we have $U_i^n = \mu_i 1$ for some complex $\mu_i$, $|\mu_i| = 1$. Then the operators $\hat U_i = \tilde\mu_i^{-1} U_i$, $\tilde\mu_i^n = \mu_i$, determine an irreducible representation of $G$, also denoted by $\pi$, such that $\pi(z) = \lambda 1$. Let us construct a special finite-dimensional representation $\tilde\pi$ of the group $G$ acting on the space $\bigotimes_{i=1}^{d} \mathbb{C}^n$ with basis $\{e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_d} ,\ 1 \le i_j \le n\}$, which is an analogue of the regular representation of $G$:
$$\tilde\pi(u_1) = U \otimes \bigotimes_{1 < j \le d} 1 , \qquad \tilde\pi(u_k) = \bigotimes_{j<k} D(\lambda_{jk}) \otimes U \otimes \bigotimes_{j=k+1}^{d} 1 , \quad 1 < k \le d , \qquad \tilde\pi(z) = \lambda 1 ,$$
where $U e_j = e_{j+1}$, $j < n$, $U e_n = e_1$ and $D(\lambda_{jk}) e_i = \lambda_{jk}^{i-1} e_i$, $i = 1, \dots, n$. We show that $\tilde\pi$ contains $\pi$ as a direct summand. To do this one has to calculate the character of $\tilde\pi$. It is easy to see that any element $g \in G$ can be uniquely represented in the form
$$g = z^k u_1^{k_1} \cdots u_d^{k_d} , \qquad 0 \le k, k_i \le n-1 , \quad i = 1, \dots, d .$$
Then
$$\tilde\pi(g)\, e_{i_1} \otimes \cdots \otimes e_{i_d} = \mu(g, i)\, e_{i_1 + k_1} \otimes \cdots \otimes e_{i_d + k_d} ,$$
where $\mu(g, i) \in \mathbb{C}$ is some coefficient and the addition is in $\mathbb{Z}_n$. Then one can deduce that
$$\chi_{\tilde\pi}(z^k) = n^d \lambda^k , \qquad \chi_{\tilde\pi}(g) = 0 \ \text{otherwise} .$$
Hence, if $\chi_\pi$ is the character of $\pi$, we have
$$\langle \chi_{\tilde\pi}, \chi_\pi \rangle = \frac{1}{n^{d+1}} \sum_{i=0}^{n-1} n^d m = m > 0 ,$$
where $m$ is the dimension of the representation $\pi$, i.e. $\pi$ is a direct summand of $\tilde\pi$ with multiplicity $m$. It is evident that we have a homomorphism $\psi$ from the $C^*$-algebra $C_{\tilde\pi}$ to $C_\pi$ given by $\psi(\tilde\pi(u_i)) = \hat U_i$. To complete the proof it remains to construct a homomorphism $\nu : F_\Theta \to C_{\tilde\pi}$ such that $\nu(\pi_F(s_i)) = \tilde\mu_i^{-1} \tilde\pi(u_i)$. Consider the unitary operator
$$V = d(\tilde\mu_1) \otimes d(\tilde\mu_2) \otimes \cdots \otimes d(\tilde\mu_d) .$$
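Then $V^* \pi_F(s_i) V = \tilde\mu_i^{-1} \pi_F(s_i)$, $i = 1, \dots, d$. Hence it is sufficient to construct a homomorphism $\tilde\nu : F_\Theta \to C_{\tilde\pi}$ such that $\tilde\nu(\pi_F(s_i)) = \tilde\pi(u_i)$, $i = 1, \dots, d$; then $\nu = \tilde\nu\, \mathrm{Ad}(V)$. Recall that
$$\pi_F(s_i) = \bigotimes_{k<i} d(\lambda_{ki}) \otimes S \otimes \bigotimes_{k>i} 1 , \quad i = 1, \dots, d .$$
Let us show that there exists a homomorphism $\varphi_i : C^*(S, d(\lambda_{ij}), j > i) := A_i \to C^*(U, D(\lambda_{ij}), j > i) := B_i$ defined by $S \to U$ and $d(\lambda_{ij}) \to D(\lambda_{ij})$. To do this we decompose
$$l_2(\mathbb{N}) = \bigoplus_{k=0}^{n-1} H_k \simeq \mathbb{C}^n \otimes l_2(\mathbb{N}) ,$$
where $H_k = \langle e_{nm+k} ,\ m \in \mathbb{N} \rangle \simeq l_2(\mathbb{N})$. In view of this decomposition,
$$
S =
\begin{pmatrix}
0 & 0 & \cdots & 0 & S \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{pmatrix}
$$
and $d(\lambda_{ij}) = D(\lambda_{ij}) \otimes 1$. Then we can identify $A_i$ with a $C^*$-subalgebra of $M_n(\mathbb{C}) \otimes C^*(S) \simeq M_n(C^*(S))$ and construct the homomorphism $\eta : M_n(C^*(S)) \to M_n(\mathbb{C})$ given by $\eta(S) = 1$. Let $\varphi_i$ be the restriction of this homomorphism to the $C^*$-subalgebra $A_i \subset M_n(C^*(S))$. Evidently,
$$\varphi_i(S) = U , \qquad \varphi_i(d(\lambda_{ij})) = D(\lambda_{ij}) , \quad j > i ,$$
and $\varphi_i(A_i) = B_i$. To complete the proof, note that $F_\Theta$ and $C_{\tilde\pi}$ are $C^*$-subalgebras of $\bigotimes_{i=1}^{d} A_i$ and $\bigotimes_{i=1}^{d} B_i$, respectively, and $\tilde\nu : F_\Theta \to C_{\tilde\pi}$ is the restriction of
$$\bigotimes_{i=1}^{d} \varphi_i : \bigotimes_{i=1}^{d} A_i \to \bigotimes_{i=1}^{d} B_i .$$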
(2) Assume now that $\pi$ corresponds to $\Phi \ne \{1, \dots, d\}$. We use induction on $d$. Suppose that the assertion is true for algebras with $d - 1$ generators. Assume that $\pi$ is an irreducible representation of $A_{0,\Theta}$ and $\ker(\pi(s_1^*)) \ne \{0\}$. Let us denote the $C^*$-algebra generated by the operators of $\pi$ by $C_\pi$. Then one can deduce that
$$\pi(s_1) = S \otimes 1 , \qquad \pi(s_j) = d(\lambda_{1j}) \otimes \tilde\pi(s_j) , \quad j \ge 2 ,$$
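where $\tilde\pi$ is an irreducible representation of $A_{0,\tilde\Theta}$, $\tilde\Theta = (\theta_{ij})_{i,j=2}^d$. The $C^*$-algebra generated by the operators of $\tilde\pi$ will be denoted by $C_{\tilde\pi}$. Analogously
\[ \pi_F(s_1) = S \otimes 1 , \quad \pi_F(s_j) = d(\lambda_{1j}) \otimes \tilde\pi_F(s_j) , \quad j \ge 2 , \]
where $\tilde\pi_F$ is the Fock representation of $A_{0,\tilde\Theta}$. Let $F_{\tilde\Theta}$ be the $C^*$-algebra generated by the operators of the Fock representation of $A_{\tilde\Theta}$. By the induction assumption we have a homomorphism
\[ \varphi_{\tilde\pi} : F_{\tilde\Theta} \to C_{\tilde\pi} , \quad \tilde\pi = \varphi_{\tilde\pi}\tilde\pi_F . \]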
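Let $D = C^*(s, d(\lambda_{1j}),\ j \ge 2)$. Construct $D \otimes F_{\tilde\Theta}$ and $D \otimes C_{\tilde\pi}$. The $C^*$-crossnorms are uniquely defined since all the algebras are nuclear. Evidently $F_\Theta$ and $C_\pi$ are $C^*$-subalgebras of these algebras. By the properties of the tensor product we have a homomorphism
\[ \mathrm{id} \otimes \varphi_{\tilde\pi} : D \otimes F_{\tilde\Theta} \to D \otimes C_{\tilde\pi} . \]
Denote by $\varphi_\pi$ the restriction of this homomorphism to $F_\Theta$. It is easily seen that $\varphi_\pi(\pi_F(s_i)) = \pi(s_i)$, $i = 1, \dots, d$, hence $\varphi_\pi : F_\Theta \to C_\pi$ and $\pi = \varphi_\pi \pi_F$.

In the following Proposition we clarify the structure of the ideal $M$ in the case $d = 2$.

Proposition A.2. The sequence $0 \to K \to M \to K \otimes (C(\mathbb{T}) \oplus C(\mathbb{T})) \to 0$ is exact.

Proof. We work with the Fock realization, i.e. with
\[ S_1 = S \otimes 1 , \quad S_2 = d(\lambda) \otimes S , \quad \lambda = \lambda_{12} . \]
Let
\[ P^1_{ijkl} = S_1^i (1 - S_1 S_1^*) S_1^{*j} S_2^k S_2^{*l} , \quad P^2_{ijkl} = S_2^i (1 - S_2 S_2^*) S_2^{*j} S_1^k S_1^{*l} , \]
and let $M_1$ and $M_2$ be the ideals generated by the sets $\{P^1_{ijkl}\}$ and $\{P^2_{ijkl}\}$ respectively. Then, since
\[ P^1_{ijkl} = S^i (1 - SS^*) S^{*j} d(\lambda)^{k-l} \otimes S^k S^{*l} , \]
one has $M_1 \simeq K \otimes T$ ($T$ is the Toeplitz algebra). To prove that $M_2 \simeq K \otimes T$ one has to change the basis in $l_2(\mathbb{N}) \otimes l_2(\mathbb{N})$ so that in the new basis $S_1 = d(\lambda) \otimes S$ and $S_2 = S \otimes 1$. Further, $M_1 \cap M_2 = M_1 M_2 = K(l_2(\mathbb{N}) \otimes l_2(\mathbb{N}))$ since $P^1_{i_1 j_1 k_1 l_1} \cdot P^2_{i_2 j_2 k_2 l_2} \in K(l_2(\mathbb{N}) \otimes l_2(\mathbb{N}))$. Then
\[ M/K \simeq M_1/K \oplus M_2/K \simeq (K \otimes T / K \otimes K) \oplus (K \otimes T / K \otimes K) . \]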
But the sequence $0 \to K \to T \to C(\mathbb{T}) \to 0$ is exact and $K$ is nuclear, so the sequence $0 \to K \otimes K \to K \otimes T \to K \otimes C(\mathbb{T}) \to 0$ is exact and
\[ M/K \simeq K \otimes (C(\mathbb{T}) \oplus C(\mathbb{T})) . \]

K-theory for TCCR

We use the stability of $B_\mu$ at $\mu = 0$ and the faithfulness of the Fock representation of $B_\mu$ to compute the K-groups for the TCCR. Namely we consider the Fock realization of $B_0 \simeq B_\mu$. In the Fock representation the generators of $B_0$ have the following form
\[ s_i = \bigotimes_{j<i} (1 - SS^*) \otimes S \otimes \bigotimes_{j>i} 1 , \quad i = 1, \dots, d . \]
Proposition A.3. $K_0(B_\mu) = \mathbb{Z}$, $K_1(B_\mu) = \{0\}$.

Proof. As was noted above, we can identify $B_0$ with $C^*(s_i, s_i^*,\ i = 1, \dots, d)$ where
\[ s_i = \bigotimes_{j<i} (1 - ss^*) \otimes s \otimes \bigotimes_{j>i} 1 , \quad i = 1, \dots, d . \]
Let us consider the case $d = 2$. Let $\tilde T_0$ be the ideal generated by the element $(1 - ss^*) \otimes (1 - s)$. It is easy to see that $\tilde T_0 \simeq K \otimes T_0$, where $T_0$ is the ideal in the Toeplitz algebra $T$ generated by the element $1 - s$. It was shown by J. Cuntz (see [9]) that $K_i(T_0) = \{0\}$. Further, $B_0/\tilde T_0 \simeq T$, i.e. one has the following short exact sequence
\[ 0 \to \tilde T_0 \to B_0 \to T \to 0 . \]
Since $K_0(T) \simeq \mathbb{Z}$ and $K_1(T) = \{0\}$, the corresponding six-term exact sequence becomes
\[ \begin{array}{ccccc} 0 & \longrightarrow & K_0(B_0) & \longrightarrow & \mathbb{Z} \\ \uparrow & & & & \downarrow \\ 0 & \longleftarrow & K_1(B_0) & \longleftarrow & 0 \end{array} \]
In the general case we consider the ideal $\hat T_0$ generated by the element $\bigotimes_{i=1}^{d-1} (1 - ss^*) \otimes (1 - s)$. Then
\[ \hat T_0 \simeq \bigotimes_{i=1}^{d-1} K \otimes T_0 \simeq K \otimes T_0 \]
and $B_0(d)/\hat T_0 \simeq B_0(d - 1)$. Applying again the six-term sequence corresponding to $0 \to K \otimes T_0 \to B_0(d) \to B_0(d - 1) \to 0$ and induction on $d$ we get $K_0(B_0(d)) \simeq \mathbb{Z}$ and $K_1(B_0(d)) = \{0\}$.
Remark A.1. In fact via the stability of Aα,k,Θ the result of the previous theorem is true for the C ∗ -algebra generated by the GCCR corresponding to k = (d, d, . . . , d).
References
[1] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Springer Verlag, Berlin, Heidelberg, New York, 1981.
[2] L. C. Biedenharn, The quantum group SUq(2) and q-analogue of the boson operators, J. Phys. A. 22 (1989), L873–L878.
[3] L. A. Coburn, The C*-algebra generated by an isometry, I, Bull. Am. Math. Soc. 73 (1967), 722–726.
[4] L. A. Coburn, The C*-algebra generated by an isometry, II, Trans. Am. Math. Soc. 137 (1969), 211–217.
[5] P. R. Halmos, A Hilbert Space Problem Book, D. Van Nostrand Co. (1967).
[6] M. Bożejko and R. Speicher, An example of a generalized Brownian motion, Comm. Math. Phys. 137 (1991), 519–531.
[7] M. Bożejko and R. Speicher, Completely positive maps on Coxeter groups, deformed commutation relations, and operator spaces, Math. Ann. 300 (1994), 97–120.
[8] J. Cuntz, Simple C*-algebras generated by isometries, Comm. Math. Phys. 57 (1977), 173–185.
[9] J. Cuntz, K-theory and C*-algebras, Proc. Conf. on K-Theory (Bielefeld 1982), Springer Lecture Notes in Math. 1046, 55–79.
[10] C. Daskaloyannis, Generalized deformed oscillator and nonlinear algebras, J. Phys. A. 24 (1991), L789–L794.
[11] D. I. Fivel, Interpolation between Fermi and Bose statistics using generalized commutators, Phys. Rev. Lett. 65 (1990), 3361–3364.
[12] K. Dykema and A. Nica, On the Fock representation of the q-commutation relations, J. Reine Angew. Math. 440 (1993), 201–212.
[13] O. W. Greenberg, Particles with small violations of Fermi or Bose statistics, Phys. Rev. D. 43 (1991), 4111–4120.
[14] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, Positive representations of general commutation relations allowing Wick ordering, J. Funct. Anal. 134 (1995), 33–99.
[15] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, q-canonical commutation relations and stability of the Cuntz algebra, Pacific J. Math. 163, 1 (1994), 131–151.
[16] P. E. T. Jørgensen, L. M. Schmitt and R. F. Werner, Coherent states of the q-canonical commutation relations, funct-an/9303002.
[17] W. Marcinek, On commutation relations for quons, Rep. Math. Phys. 41 (1998), 155–172.
[18] W. Marcinek and R. Ralowski, On Wick algebras with braid relations, J. Math. Phys. 36 (1995), 2803–2820.
[19] O. Bratteli and P. E. T. Jørgensen, Iterated function systems and permutation representations of the Cuntz algebra, Mem. Am. Math. Soc. 139 (1999), 663.
[20] O. Bratteli, P. E. T. Jørgensen and V. L. Ostrovskyĭ, Representation Theory and Numerical AF-invariants: The representations and centralizers of certain states on O_d, math.OA/9907036.
[21] O. Bratteli, P. E. T. Jørgensen, A. Kishimoto and R. F. Werner, Pure states on O_d, J. Oper. Theory 43 (2000), 97–143.
[22] P. E. T. Jørgensen, Representations of Cuntz algebras, loop groups and wavelets, XIIIth International Congress on Mathematical Physics (London, 2000) (A. Fokas, A. Grigoryan, T. Kibble and B. Zegarlinski, eds.), International Press, Boston, 2001, pp. 327–332.
[23] P. E. T. Jørgensen, D. P. Proskurin and Yu. S. Samoĭlenko, The kernel of Fock representation of Wick algebras with braided operator of coefficients, Pacific J. Math. 198 (2001), 109–122.
[24] A. J. Macfarlane, On q-analogues of the quantum harmonic oscillator and the quantum group SU(2)q, J. Phys. A. 22 (1989), 4581–4588.
[25] V. Mazorchuk and L. Turowska, ∗-Representations of twisted generalized Weyl constructions, Algebras and Rep. Theory 5, 2 (2002), 163–186.
[26] V. L. Ostrovskyĭ and D. P. Proskurin, Operator relations, dynamical systems, and representations of a class of Wick algebras, Oper. Theory Adv. Appl., Birkhäuser Verlag 118 (2000), 335–345.
[27] V. Ostrovskyĭ and Yu. Samoĭlenko, Introduction to the Theory of Representations of Finitely Presented ∗-Algebras. I. Representations by bounded operators, The Gordon and Breach Publishing Group, London (1999).
[28] W. Pusz and S. L. Woronowicz, Twisted second quantization, Rep. Math. Phys. 27 (1989), 251–263.
[29] D. Proskurin, Stability of a special class of q_ij-CCR and extensions of higher-dimensional noncommutative tori, Lett. Math. Phys. 52, 2 (2000), 165–175.
[30] D. Proskurin and Yu. Samoĭlenko, Stability of the C*-algebra associated with the twisted CCR, Algebras and Rep. Theory 5 (2002), 433–444.
[31] J. Slawny, On factor representations and the C*-algebra of canonical commutation relations, Comm. Math. Phys. 24 (1972), 151–170.
[32] B. Brenken, A classification of some noncommutative tori, Rocky Mountain J. Math. 20, 2 (1990), 389–397.
[33] L. Vaksman, Lectures on q-analogues of Cartan domains and associated Harish-Chandra modules, math.QA/0109198.
[34] G. A. Elliott, On the classification of C*-algebras of real rank zero, J. Reine Angew. Math. 443 (1993), 179–219.
[35] G. A. Elliott and D. E. Evans, The structure of the irrational rotation C*-algebra, Ann. Math. 138 (1993), 477–501.
[36] G. A. Elliott and G. Gong, On inductive limits of matrix algebras over the two-torus, Am. J. Math. 118 (1996), 263–290.
[37] G. A. Elliott and Q. Lin, Cut-down method in the inductive limit decomposition of noncommutative tori, J. London Math. Soc. 54 (1996), 121–134.
[38] G. A. Elliott and M. Rørdam, The automorphism group of the irrational rotation algebra, Comm. Math. Phys. 155 (1993), 3–26.
[39] M. Pimsner and D. Voiculescu, Imbedding of the irrational C*-algebra into AF algebras, J. Oper. Theory 4 (1980), 201–210.
[40] M. Pimsner and D. Voiculescu, Exact sequence for K-groups and Ext groups of certain cross-product C*-algebras, J. Oper. Theory 4 (1980), 93–118.
[41] S. Disney and I. Raeburn, Homogeneous C*-algebras whose spectra are tori, J. Austral. Math. Soc. Ser. A 38, 1 (1985), 9–39.
[42] M. A. Rieffel, C*-algebras associated with irrational rotations, Pacific J. Math. 93 (1981), 415–429.
[43] M. A. Rieffel, Projective modules over higher-dimensional non-commutative tori, Canad. J. Math. 40 (1988), 257–338.
June 19, 2003 16:13 WSPC/148-RMP
00165
Reviews in Mathematical Physics, Vol. 15, No. 4 (2003) 339–386 © World Scientific Publishing Company
EXPONENTIALLY SMALL SPLITTING AND ARNOLD DIFFUSION FOR MULTIPLE TIME SCALE SYSTEMS
MICHELA PROCESI S.I.S.S.A. Functional Analysis Sector, 34014 Trieste [email protected] Received 8 May 2002 Revised 10 February 2003
We consider the class of Hamiltonians:
\[ \frac{1}{2}\sum_{j=1}^{n-1} I_j^2 + \frac{1}{2}\varepsilon I_n^2 + \frac{p^2}{2} + \varepsilon[(\cos q - 1) - b^2(\cos 2q - 1)] + \varepsilon\mu f(q)\sum_{i=1}^{n}\sin(\psi_i) , \]
where $0 \le b < \frac{1}{2}$, and the perturbing function $f(q)$ is a rational function of $e^{iq}$. We prove upper and lower bounds on the splitting for this class of systems, in regions of the phase space characterized by one fast frequency. Finally, using an appropriate Normal Form theorem, we prove the existence of chains of heteroclinic intersections.

Keywords: Homoclinic splitting; Arnold diffusion; whiskered tori; perturbation theory; diagrammatic expansion.
Contents
1. Presentation of the Model and Main Theorems
2. Perturbative Construction of the Homoclinic Trajectories
2.1 Whisker calculus, the "primitive" $\Im^t$
2.2 The recursive equations
3. Proofs of the Theorems
3.1 The formal linear equation
3.2 Lower bounds on the Melnikov term
3.3 Heteroclinic intersection for systems with one fast frequency
4. Tree Representation
4.1 Definitions of trees
4.2 Admissible trees
4.3 Values of trees
4.4 Tree identities
4.4.1 Mark adding functions
4.4.2 Fruit adding functions
4.4.3 Changing the first node
4.5 Upper bounds on the values of trees
A. Appendix
A.1 Proof of Proposition 4.16
A.2 Normal form theorem
A.3 Proof of Lemma 4.23
References
1. Presentation of the Model and Main Theorems

The general setting of this paper is the problem of homoclinic splitting and Arnol'd diffusion in a priori stable systems with three or more relevant time scales. The general strategy is the one proposed in [1] and [2], and in particular the application to a priori stable systems proposed in [3] and further developed in [4]. More precisely, we consider a class of close to integrable Hamiltonian systems with n degrees of freedom for which one can prove the existence of (n−1)-dimensional unstable KAM tori together with their stable and unstable manifolds. We use a perturbative diagrammatic construction (proposed and developed in [3], [4] and [5]) to prove upper bounds on the angles of intersection of the stable and unstable manifolds of a KAM torus (homoclinic splitting). Such bounds are generally exponentially small in the perturbation parameter and depend on the chosen torus and in particular on the number of fast degrees of freedom. For systems with one fast degree of freedom we also prove lower bounds on the homoclinic splitting through the mechanism of Melnikov dominance. Finally, for such systems we prove the existence of "long" chains of heteroclinic intersections; namely we produce a list of unstable KAM tori $T_1, \dots, T_h$ such that $T_1$, $T_h$ are at distances of order one in the action variables and the unstable manifold of each $T_i$ intersects the stable manifold of $T_{i+1}$. This paper is a generalization of the results of [4], [5], [6]; therefore in proving our claims we will rely heavily on intermediate results proved in the latter papers, which we will not prove again.

Consider the class of Hamiltonians
\[ \frac{1}{2}\sum_{j=1}^{n-1}\tilde I_j^2 + \frac{1}{2}\varepsilon\tilde I_n^2 + \frac{\tilde p^2}{2} + \varepsilon\Big[(\cos\tilde q - 1) - \frac{1-c^2}{4}(\cos 2\tilde q - 1)\Big] + \varepsilon\mu f(\tilde q)\sum_{i=1}^{n}\sin\tilde\psi_i , \qquad (1.1) \]
where the pairs $\tilde I \in \mathbb{R}^n$, $\tilde\psi \in \mathbb{T}^n$ and $\tilde p \in \mathbb{R}$, $\tilde q \in \mathbb{T}$ are conjugate action-angle coordinates, $0 < c \le 1$, $f(\tilde q)$ is odd and analytic on the torus and $\mu$, $\varepsilon$ are small parameters. We will consider them independent and then show that Arnold diffusion can be proved for $\mu \le \varepsilon^P$, for an appropriate $P$. This class of Hamiltonians is a model for a nearly integrable system close to a simple resonance where the dependence on the hyperbolic variables is not through the standard pendulum, but still maintains various qualitative properties of the pendulum. Namely we have a "generalized pendulum",
\[ \frac{\tilde p^2}{2} + \varepsilon\Big[(\cos\tilde q - 1) - \frac{1-c^2}{4}(\cos 2\tilde q - 1)\Big] , \]
which has an unstable fixed point at $\tilde p = \tilde q = 0$ with Lyapunov exponent $\lambda = c\sqrt{\varepsilon}$.
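Generally one rescales the time and action variables so that the Lyapunov exponent is one:
\[ I(t) = \frac{1}{c\sqrt{\varepsilon}}\,\tilde I\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) , \quad \psi(t) = \tilde\psi\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) , \quad p(t) = \frac{1}{c\sqrt{\varepsilon}}\,\tilde p\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) , \quad q(t) = \tilde q\Big(\frac{t}{c\sqrt{\varepsilon}}\Big) . \qquad (1.2) \]
This rescaling transforms Hamiltonian (1.1) into
\[ \frac{(I, A(\varepsilon)I)}{2} + \frac{p^2}{2} + \frac{1}{c^2}\Big[(\cos q - 1) - \frac{1-c^2}{4}(\cos 2q - 1)\Big] + \mu f(q)\sum_{i=1}^{n}\sin(\psi_i) \qquad (1.3) \]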
where $A(\varepsilon)$ is the diagonal matrix with eigenvalues $a_i = 1$ for $i = 1, \dots, n-1$ and $a_n = \varepsilon$. So from now on we will work with Hamiltonian (1.3) and return to Hamiltonian (1.1) only to prove the existence of heteroclinic chains. The system (1.3) is integrable for $\mu = 0$. It represents a list of n uncoupled rotators and a generalized pendulum (depending on the parameter c). We will denote the frequency of the rotators (which determines the initial data $I(0)$) by $\omega$ so that
\[ I(t) = I(0) = A^{-1}\omega , \quad \psi(t) = \psi(0) + \omega t . \]
The initial data are chosen in an appropriate domain (physically interesting in the variables $\tilde I$) so that there are at least three characteristic orders of magnitude for the frequencies of the unperturbed system.

Definition 1.1. In frequency space we first consider the ellipsoid
\[ \Sigma := \Big\{ x \in \mathbb{R}^n : \sum_{i=1}^{n} x_i^2/a_i = 2E \Big\} \]
where E is an order one constant, $E \sim O_\varepsilon(1)$.^a For notational convenience we split the frequency $\omega$ in two vectorial components: $\omega = (\frac{\omega_1}{\sqrt{\varepsilon}}, \varepsilon^\alpha\omega_2)$ with $\omega_1 \in \mathbb{R}^m$, $\omega_2 \in \mathbb{R}^{n-m}$, and $0 \le \alpha \le \frac{1}{2}$. Finally, given two suitable order one constants $R, r \sim O_\varepsilon(1)$, we consider the region
\[ \Omega \equiv \{ \omega \in \mathbb{R}^n : \sqrt{\varepsilon}\,\omega \in \Sigma ,\ r < |\omega_{1,i}| < R \text{ and } r < |\omega_2| < R ,\ \varepsilon^\alpha|\omega_{2,i}| \ge \sqrt{\varepsilon} ,\ \varepsilon^\alpha|\omega_{2,n-m}| \sim \sqrt{\varepsilon} \} . \]
We have chosen the generalized pendulum so that its dynamics on the separatrix is particularly simple,^b namely
\[ q(t) = 2\,\mathrm{arccotg}\Big(\frac{1}{c}\sinh(\pm t)\Big) , \quad e^{iq(t)} = \frac{\sinh(\pm t) + ic}{\sinh(\pm t) - ic} . \qquad (1.4) \]
^a Now and in the following we will say $a(\varepsilon) \sim O_\varepsilon(f(\varepsilon))$ if $\lim_{\varepsilon\to 0^+} a(\varepsilon)/f(\varepsilon) = L \ne 0$.
^b The motion on the separatrix can be easily obtained by direct computation; the main feature is that the motion on the separatrix is such that $e^{iq(t)}$ is a rational function of $e^t$. Here we are considering the simplest class of examples, which contains the standard pendulum $c = 1$.
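The separatrix formula (1.4) can be checked numerically against the pendulum part of Hamiltonian (1.3). The sketch below is only an illustration: the function names are hypothetical, the branch is fixed by $q(0) = \pi$, and $\dot q$ is obtained by differentiating the arccotangent by hand. It evaluates the energy $p^2/2 + V(q)$ along $q(t)$, which should vanish on the separatrix, and compares $e^{iq(t)}$ with the rational expression in $\sinh t$.

```python
import cmath
import math


def potential(q, c):
    """Pendulum potential of Hamiltonian (1.3) at mu = 0."""
    return ((math.cos(q) - 1.0)
            - 0.25 * (1.0 - c * c) * (math.cos(2.0 * q) - 1.0)) / (c * c)


def q_sep(t, c):
    """q(t) = 2 arccot(sinh(t)/c), on the branch with q(0) = pi."""
    return 2.0 * math.atan2(c, math.sinh(t))


def p_sep(t, c):
    """Time derivative of q_sep, computed analytically."""
    return -2.0 * c * math.cosh(t) / (c * c + math.sinh(t) ** 2)


def energy(t, c):
    """Pendulum energy along the curve; zero on the separatrix."""
    return 0.5 * p_sep(t, c) ** 2 + potential(q_sep(t, c), c)


def exp_iq_rational(t, c):
    """Right-hand side of the second formula in (1.4)."""
    s = math.sinh(t)
    return (s + 1j * c) / (s - 1j * c)


def exp_iq_direct(t, c):
    """exp(i q(t)) evaluated from q_sep."""
    return cmath.exp(1j * q_sep(t, c))
```

For $c = 1$ the function `p_sep` reduces to $-2/\cosh t$, the familiar standard-pendulum separatrix momentum.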
There are at least three characteristic time scales $O_\varepsilon(\varepsilon^{-1/2})$, $O_\varepsilon(\varepsilon^\alpha)$, $O_\varepsilon(\sqrt{\varepsilon})$ (coming from the degenerate variable $I_n$) and 1, which is the Lyapunov exponent of the unperturbed pendulum. We will call $\psi_1, \dots, \psi_m$ the fast variables and we will sometimes denote them as $\psi_F \in \mathbb{T}^m$. Conversely we will call $\psi_{m+1}, \dots, \psi_n$ the slow variables $\psi_S \in \mathbb{T}^{n-m}$. The perturbing function is a trigonometric polynomial of degree one in the rotators $\psi$ and a rational function^c in $e^{iq}$. We have decoupled the dependence on $\psi$ and $q$ only to simplify the computations. For each $\omega \in \mathbb{R}^n$ the unperturbed system has an unstable fixed torus,
\[ p(t) = q(t) = 0 , \quad I(t) = I(0) = A^{-1}\omega , \quad \psi(t) = \psi(0) + \omega t . \]
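The stable and unstable manifolds of such tori coincide and can be expressed as graphs on the angles.

Definition 1.2. Given any $\gamma \in \mathbb{R}$, $\varepsilon < \gamma \le O(\varepsilon^{1/2})$ and a fixed $\tau > n - 1$, we define the set
\[ \Omega_\gamma \equiv \Big\{ \omega \in \Omega : |\omega\cdot l| > \frac{\gamma}{|l|^\tau} ,\ \forall\, l \in \mathbb{Z}^n/\{0\} \Big\} \]
of $\gamma, \tau$ Diophantine vectors in $\Omega$. Now we consider
\[ \Omega^*_\gamma \equiv \Omega_\gamma \times \Big[-\frac{1}{2}, \frac{1}{2}\Big] \]
and for all $(\omega, \rho) \in \Omega^*_\gamma$ we set $\omega_\rho = (1 + \rho)\omega$.

For all $(\omega, \rho) \in \Omega^*_\gamma$ and for all $l \in \mathbb{Z}^n/\{0\}$, $|\omega_\rho\cdot l| > \frac{\gamma}{2|l|^\tau}$; $\omega \in \Omega_\gamma$ implies that $\omega_1$ and $\omega_2$ are Diophantine as well; we will call $\tau_F$ and $\tau_S$ their exponents. KAM-like theorems (see [2], [5]) imply that there exists $\mu_0(\varepsilon, \gamma) \sim \varepsilon^2$ such that if $|\mu| \le \mu_0$ and if $(\omega, \rho) \in \Omega^*_\gamma$, there exists one and only one n-dimensional $H_\mu$-invariant unstable torus $T_\mu(\omega, \rho)$ whose Hamiltonian flow is analytically conjugated to the flow $\mathbb{T}^n \ni \vartheta \to \vartheta + \omega_\rho t$. Moreover one can parameterize the stable and unstable manifolds of $T_\mu(\omega, \rho)$ by functions $I^\pm(\omega, \varphi, q, \mu)$, analytic in the last three arguments, with $\varphi, q \in \mathbb{T}^n \times [-\frac{3}{2}\pi, \frac{3}{2}\pi]$. Namely, given
\[ z^\pm(\omega, \varphi, q, \mu) = (I^\pm(\omega, \varphi, q, \mu), p^\pm(\omega, \varphi, q, \mu), \varphi, q) , \]
where the pendulum action is derived by energy conservation, the trajectory^d
\[ z(\omega, \varphi, q, \mu, t) = \begin{cases} \Phi^t_H z^+(\omega, \varphi, q, \mu) & \text{if } t > 0 \\ \Phi^t_H z^-(\omega, \varphi, q, \mu) & \text{if } t < 0 \end{cases} \]
tends exponentially to a quasi-periodic function of frequency $\omega$.
^c Actually it is sufficient that the singularity of $f(\psi(t), q(t))$ which is nearest to the real axis is polar and isolated.
^d $\Phi^t_H$ is the evolution at time t of the Hamiltonian flow (1.3).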
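The Diophantine condition of Definition 1.2 quantifies over all nonzero integer vectors, so it cannot be verified by finite enumeration; a truncated scan can only falsify it or give partial evidence. The sketch below is an illustration under stated assumptions: the function name `is_diophantine_up_to` and the cutoff `max_norm` are hypothetical, and $|l|$ is taken to be the $\ell^1$ norm, which the text does not specify.

```python
import itertools


def is_diophantine_up_to(omega, gamma, tau, max_norm):
    """Check |omega . l| > gamma / |l|^tau for all integer l != 0
    with l1-norm |l| <= max_norm (a finite truncation of Definition 1.2)."""
    n = len(omega)
    for l in itertools.product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(component) for component in l)
        if norm == 0 or norm > max_norm:
            continue
        small_divisor = abs(sum(w * component for w, component in zip(omega, l)))
        if small_divisor <= gamma / norm ** tau:
            return False
    return True
```

With the golden mean as frequency ratio the scan finds no violation for $\gamma = 0.1$, $\tau = 1.5$, whereas a resonant vector such as $(1, 2)$ fails at $l = (2, -1)$.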
Remark 1.3. We have introduced the variable $\rho$ in order to fix the energy of the perturbed system;^e namely, given a list of $\omega_i \in \Omega_\gamma$ one can find $\rho(\omega_i, \mu)$ such that all the corresponding whiskered tori are on the same energy surface, see for instance [5].

Definition 1.4. We will study the difference between the stable and unstable manifolds on a hyperplane transverse to the flow (a Poincaré section); we choose the hyperplane $q = \pi$ and consequently drop the dependence on $q$. We call
\[ G^0_j(\varphi, \omega) = \frac{1}{2} a_j (I_j(\varphi, \omega, 0^-) - I_j(\varphi, \omega, 0^+)) \]
the splitting vector and prove that $G^0_j(\varphi = 0, \omega) = 0$. A measure of the transversality is $\Delta^0_{ij} = \partial_{\varphi_j} G^0_i(\varphi)|_{\varphi=0}$, called the splitting matrix.

We will prove the following theorems:

Theorem 1. The splitting matrix $\Delta^0$ satisfies the formal power series relation^f:
\[ \Delta^0 \sim A D^0 B \]
where A, B are close to identity matrices and $D^0$ is the "holomorphic part" of the splitting matrix; namely its entries are expressed as integrals over $\mathbb{R}$ of analytic functions. Moreover the formal power series involved are all asymptotic.^g

This statement was posed as a conjecture in [7] Paragraph 3.

Corollary 1.5. The preceding Theorem implies that Hamiltonian (1.3), in regions of the action variables corresponding to $m \ne 0$ fast time scales, has exponentially small upper bounds on the determinant of the splitting matrix:
\[ |\det \Delta^0| \le C e^{-c/\varepsilon^b} , \quad \text{with } b = \frac{1}{2m} , \]
provided that $\mu < \varepsilon^{1 + 2\frac{n}{m}}$.

Theorem 2. Consider Hamiltonian (1.3) in regions of the action variables corresponding to $m = 1$ fast variables and for perturbing functions $f(q)$ such that the pole of $f(q(t))$ closest to the real axis, say $\bar t$, is such that $|\mathrm{Im}\,\bar t| = d \le \arcsin c$. Setting $\mu \le \varepsilon^P$ with $P = p/2 + 8 + 4n$, where p is the degree of the pole of $f(q(t))$ in $\bar t$, we prove that
\[ C_1 \varepsilon^{-p_1} e^{-\frac{d|\omega_1|}{\sqrt{\varepsilon}}} \le |\det \Delta^0| \le C_2 \varepsilon^{-p_2} e^{-\frac{d|\omega_1|}{\sqrt{\varepsilon}}} \]
where $C_1, C_2, p_1, p_2$ are appropriate order one constants.
^e The final goal is to find heteroclinic intersections on the fixed energy surface, and so "Arnold diffusion", but in the following sections we will discuss only homoclinic intersections and so we will drop the parameter $\rho$.
^f We denote formal power series identities with the symbol $A \sim B$.
^g A formal power series $\sum_n \mu^n a_n(\varepsilon)$ is asymptotic if for all $q > 0$ there exists $Q > 0$ such that for all $n \le \varepsilon^{-q}$, $a_n(\varepsilon) \le \varepsilon^{-Qn}$.
Corollary 1.6. Under the conditions of Theorem 2 the Hamiltonian (1.1) has heteroclinic chains, namely a set of $N \ge 1$ trajectories $z^1(t), \dots, z^N(t)$ together with $N + 1$ different minimal sets^h $T_0, \dots, T_N$ such that for all $1 \le i \le N$
\[ \lim_{t\to-\infty} \mathrm{dist}(z^i(t), T_{i-1}) = 0 = \lim_{t\to\infty} \mathrm{dist}(z^i(t), T_i) . \]
Moreover one can construct such chains between tori $T(\omega^a; \mu)$, $T(\omega^b; \mu)$ such that $\omega^a, \omega^b \in \bar\Omega \subset \Omega_\gamma$ and
\[ |\varepsilon^{-1/2}(\omega^a_n - \omega^b_n)| \sim O_\varepsilon(1) . \]
The techniques used for proving the Theorems are those proposed in [3] and developed in [4] for partially isochronous three time scale systems with three degrees of freedom. In this paper, particular attention is given to the formalization of the tree expansions and of the "Dyson equation" and relative cancellations proposed in [4]. This enables us to extend Theorem 1 to systems with n degrees of freedom and at least two time scales; moreover the proof is definitely simplified and quite compact. In this article we have considered completely anisochronous systems only to fix an example; generalizing to partially (or totally, thus recovering the results of [8]) isochronous systems is completely trivial. Theorem 1 and hence Corollary 1.5 are purely formal, relying only on general features of the perturbation series for the homoclinic trajectory; indeed they can be proved for very general systems, as we will show in a forthcoming paper. Moreover we have generalized the class of perturbing functions and the "pendulum" (the literature considers only trigonometric polynomials and the standard pendulum); the latter generalizations are quite technical but nevertheless non-trivial and interesting, we think, as the techniques we propose are easily generalizable and give a clear picture of the limits of proving Arnold diffusion via Melnikov dominance.
2. Perturbative Construction of the Homoclinic Trajectories

One can use perturbation theory to find the (analytic for $\mu \le \mu_0$) trajectories on the S/U manifolds of Hamiltonian^i (1.3)
\[ z(\varphi, \omega, t) = \sum_k (\mu)^k z^k(\varphi, \omega, t) . \]
^h A closed subset of the phase space is called minimal (with respect to a Hamiltonian flow $\Phi^t_h$) if it is non-empty, invariant for $\Phi^t_h$ and contains a dense orbit. In our case the minimal sets will be unstable tori $T(I)$ with $\omega(I)$ Diophantine.
^i Notice that the apex k on the functions $I$, $\psi$ represents the order in the expansion in $\mu$, NOT an exponent. To avoid confusion, when we need to exponentiate we always set the argument in parentheses.
Namely we insert the expansion in $\mu$ in the Hamilton equations of system (1.3),
\[ \begin{aligned} \dot\psi_j &= a_j I_j , & \dot I_j &= -(\mu)\cos\psi_j\, f(q) , \\ \dot q &= p , & \dot p &= \frac{1}{c^2}\sin q\,(1 - (1 - c^2)\cos q) - (\mu)\sum_{i=1}^{n}\sin\psi_i\,\frac{df}{dq}(q) , \end{aligned} \qquad (2.1) \]
and find initial data $I(\omega, \varphi, \mu, 0^\pm)$ (and consequently $p(\omega, \varphi, \mu, 0^\pm)$) such that the solution of (2.1) tends exponentially to a quasi-periodic function of frequency $\omega$. Inserting in the Hamilton equations the convergent power series representation:
\[ \begin{aligned} I(t, \varphi, \mu) &= \sum_{k=0}^{\infty} (\mu)^k I^k(t, \varphi) , & \psi(t, \varphi, \mu) &= \sum_{k=0}^{\infty} (\mu)^k \psi^k(t, \varphi) , \\ p(t, \varphi, \mu) &= \sum_{k=0}^{\infty} (\mu)^k p^k(t, \varphi) , & q(t, \varphi, \mu) &= q^0(t) + \sum_{k=1}^{\infty} (\mu)^k \psi_0^k(t, \varphi) \end{aligned} \]
we obtain, for $k > 0$, the hierarchy of linear non-homogeneous equations,^j
\[ \begin{aligned} \dot\psi_j^k &= a_j I_j^k , \quad \dot I_j^k = F_j^k(\{\psi_i^h\}_{i=0,\dots,n;\ h<k}) , \quad \text{for } j = 1, \dots, n ; \\ \dot\psi_0^k &= p^k , \quad \dot p^k = \frac{1}{c^2}\big(\cos(q^0(t)) - (1 - c^2)\cos(2q^0(t))\big)\psi_0^k + F_0^k(\{\psi_i^h\}_{i=0,\dots,n;\ h<k}) , \end{aligned} \qquad (2.2) \]
where the functions $F_i^k$ are defined as follows. Set $[\,\cdot\,]_k = \frac{1}{k!}\frac{d^k}{d\mu^k}(\,\cdot\,)|_{\mu=0}$; we have
\[ F_j^k(t) = -\Big[\partial_{\psi_j} f^1\Big(\sum_{h=1}^{k-1} (\mu)^h \vec\psi^h(t)\Big)\Big]_{k-1} - \delta_{j0}\Big[\partial_{\psi_0} f^0\Big(\sum_{h=1}^{k-1} (\mu)^h \psi_0^h(t)\Big)\Big]_k , \quad j = 0, \dots, n \]
where $\vec\psi^h(t)$ is the vector $\psi_0^h(t), \dots, \psi_n^h(t)$,
\[ f^1(\vec\psi) = \sum_{i=1}^{n} \sin\psi_i\, f(\psi_0) , \quad f^0(\psi_0) = \frac{1}{c^2}\Big[(\cos\psi_0 - 1) + \frac{1 - c^2}{2}\sin^2\psi_0\Big] , \]
finally $\delta_{ji}$ denotes the Kronecker delta. For $k = 0$ we obtain the unperturbed homoclinic trajectory:
\[ z^0(t) = (A^{-1}\omega, p^0(t), \varphi + \omega t, q^0(t)) , \]
where $(q^0(t), p^0(t))$ is the lower branch of the pendulum separatrix starting at $q = \pi$ written in Eq. (1.4).
^j When it is not strictly necessary we will omit the prefixed initial data of the angles $\varphi = \psi_1(0), \dots, \psi_n(0)$; $\psi_0(0) = \pi$.
For $k > 0$ we have a linear non-homogeneous ODE that we can solve by variation of constants. The fundamental solution of the linearized pendulum equation is given by
\[ W(t) = \begin{pmatrix} \dot w_0 & \dot x_0^0 \\ w_0 & x_0^0 \end{pmatrix} , \quad w_0 = \frac{1}{2}\sigma(t)x_0^1 , \quad \text{where } \sigma(t) = \mathrm{sign}(t) , \]
\[ x_0^0 = \frac{c^2\cosh(t)}{c^2 + \sinh(t)^2} , \quad x_0^1 = \frac{\sigma(t)x_0^0}{2c^4}\big(2(-3 + 4c^2)\,t + \sinh(2t) + 4(-1 + c^2)^2\tanh(t)\big) . \qquad (2.3) \]
It is easily seen (see [3] or [5]) that one can choose an appropriate "primitive" in the right hand side of the first column of Eqs. (2.2) so that the solutions are exponentially quasi-periodic.

2.1. Whisker calculus, the "primitive" $\Im^t$

Let us first define the function spaces on which we work. All the definitions and statements of this subsection and of the following one are proposed and explained in detail in [3]; we are simply reformulating them to suit our needs.

Definition 2.1. (i) $H$ is the vector space (on $\mathbb{C}$) generated by monomials of the form
\[ m = \sigma(t)^a \frac{|t|^j}{j!} x^h e^{i(\varphi + \omega t)\cdot\nu} , \qquad (2.4) \]
where $x = e^{-|t|}$, $a = 0, 1$, $h \in \mathbb{Z}$, $\nu \in \mathbb{Z}^n$, $j \in \mathbb{N}$, $\sigma(t) = \mathrm{sign}(t)$.

(ii) Given two positive constants b and d, $H(b, d)$ is the subset of functions $f(t)$ analytic on the real axis in $t \ne 0$ that admit, separately for $t > 0$ and $t < 0$, a (unique) representation,
\[ f(t) = \sum_{j=0}^{k} \frac{|t|^j}{j!} M_j^{\sigma(t)}(x, \varphi + \omega t) , \qquad (2.5) \]
with $M_j^{\sigma(t)}(x, \varphi)$ trigonometric polynomials in $\varphi$ and the function $M_k^{\sigma(t)}$ not identically zero.
The Fourier coefficients $M_{j\nu}^{\sigma(t)}(x)$ are all holomorphic in the x-plane in a region $\{0 < |x| < e^{-b}\} \cup \{|\arg x| < d\}$ and have possible polar singularities at $x = 0$. k is called the t degree of f. In Fig. 1 we have represented a possible domain of analyticity for the $M_{\nu j}$.

Fig. 1. (Figure not reproduced; labels $e^b$ and $d$.)

Notice that $H$ is contained in all the spaces $H(b, d)$; moreover if $|t| > b$, $f(t)$ can be represented as an absolutely convergent series of monomials of the type m, separately for $t > b$ and $t < -b$. One can easily check that the functional that acts on monomials m of the form (2.4) as
\[ \Im^t(m) = \begin{cases} -\sigma^{a+1}\displaystyle\sum_{p=0}^{j} \frac{|t|^{j-p}}{(j-p)!\,(h - i\sigma\omega\cdot\nu)^{p+1}}\, x^h e^{i(\psi + \omega t)\cdot\nu} & \text{if } |h| + |\nu| \ne 0 \\[2ex] -\dfrac{\sigma^{a+1}|t|^{j+1}}{(j+1)!} & \text{if } |h| + |\nu| = 0 \end{cases} \qquad (2.6) \]
is a primitive of m. We can extend $\Im^t$, with $|t| > b$, to a primitive on functions $f \in H(b, d)$ by expanding f in the monomials m (we obtain absolutely convergent series) and applying (2.6). Then if $|t| \le b$ we set
\[ \Im^t \equiv \Im^{2\sigma(t)b} + \int_{2\sigma(t)b}^{t} , \qquad (2.7) \]
obviously the choice of 2b is arbitrary and this is still the same primitive of f. In $H(b, d)$ we can extend $\Im^t$ to complex values of t such that $t \in C(b, d)$ where
\[ C(b, d) := \{t \in \mathbb{C} : |\mathrm{Im}\, t| \le d,\ |\mathrm{Re}\, t| \le b\} \cup \{t \in \mathbb{C} : |\mathrm{Im}\, t| \le 2\pi,\ |\mathrm{Re}\, t| > b\} \]
is the domain in Fig. 1 in the t variables. An equivalent (and quite useful) definition of $\Im^t$ is
\[ \Im^t f = \oint \frac{du}{2i\pi u} \int_{\sigma(t)\infty + is}^{t} e^{-\sigma(\tau)u\tau} f(\tau)\,d\tau , \qquad (2.8) \]
where $\sigma(t) = \mathrm{sign}(\mathrm{Re}\, t)$, $t = t_1 + is$, with $t_1, s \in \mathbb{R}$ and the integral is performed on the line $\mathrm{Im}\,\tau = s$; finally the integrals in u have to be considered to be the analytic continuation on u from u positive and large. This definition is clearly compatible with the formal definition given above and one easily sees that $H(b, d)$ is closed under the application of $\Im^t$.

Definition 2.2. $H_0(b, d)$ is the subspace of $H(b, d)$ of functions that can be extended to analytic functions in $C(b, d)$.
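Returning to the primitive (2.6): its first case can be checked numerically by differentiating it and comparing with the monomial (2.4). The sketch below is a limited illustration under stated assumptions: it treats a single rotator ($n = 1$, so $\omega$, $\nu$, $\varphi$ are scalars), covers only the case $|h| + |\nu| \ne 0$, works at real $t \ne 0$, and uses hypothetical function names.

```python
import cmath
import math


def sign(t):
    return 1.0 if t > 0 else -1.0


def oscillating_factor(t, h, nu, phi, omega):
    """x^h * exp(i (phi + omega t) nu) with x = exp(-|t|)."""
    return cmath.exp(-h * abs(t) + 1j * (phi + omega * t) * nu)


def monomial(t, a, j, h, nu, phi, omega):
    """The monomial m of Eq. (2.4) for one rotator."""
    return (sign(t) ** a * abs(t) ** j / math.factorial(j)
            * oscillating_factor(t, h, nu, phi, omega))


def primitive(t, a, j, h, nu, phi, omega):
    """First case of Eq. (2.6); requires |h| + |nu| != 0 and t != 0."""
    s = sign(t)
    b = h - 1j * s * omega * nu
    total = sum(abs(t) ** (j - p) / (math.factorial(j - p) * b ** (p + 1))
                for p in range(j + 1))
    return -(s ** (a + 1)) * total * oscillating_factor(t, h, nu, phi, omega)


def derivative_defect(t, a, j, h, nu, phi, omega, step=1e-5):
    """|d/dt primitive - monomial|, using a central difference."""
    args = (a, j, h, nu, phi, omega)
    slope = (primitive(t + step, *args) - primitive(t - step, *args)) / (2.0 * step)
    return abs(slope - monomial(t, *args))
```

The defect stays at the level of the finite-difference error on both sides of $t = 0$, which is the sense in which (2.6) defines a primitive separately for $t > 0$ and $t < 0$.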
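Notice that f is in $H_0(b, d)$ if it is in $H(b, d)$ and $f(t)$ is analytic at $t = 0$.

Remark 2.3. If $f \in H_0(b, d)$ then generally $\Im f \notin H_0(b, d)$ and has a discontinuity at $t = 0$. For instance if $f \in L^1$ is positive, then
\[ \Im(f) := (\Im^{0^-} - \Im^{0^+})f = \int_{-\infty}^{\infty} f \ne 0 . \]
We can construct operators which preserve $H_0(b, d)$; let $\Im = \Im^{0^-} - \Im^{0^+}$ and
\[ \Im^t_+ = \begin{cases} \Im^t & \text{if } t \ge 0 \\ \Im^t - \Im & \text{if } t < 0 , \end{cases} \qquad \Im^t_- = \begin{cases} \Im^t & \text{if } t \le 0 \\ \Im^t + \Im & \text{if } t > 0 . \end{cases} \]
The operator
\[ \frac{1}{2}\sum_{\rho=\pm1} \Im^t_\rho = \Im^t - \frac{1}{2}\sigma(t)\Im \]
preserves the analyticity.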
Now let us cite two important properties of $H_0(b, d)$, proved in [3].

Lemma 2.4. In $H_0(b, d)$ we have the following shift of contour formulas: for all $f \in H_0(b, d)$ and for all $d > s \in \mathbb{R}$,
(i) $\Im f(\tau) = \Im f(\tau + is)$,
(ii) $\displaystyle\sum_{\rho=\pm1} \Im^{t+is}_\rho f(\tau) = \sum_{\rho=\pm1} \oint \frac{dR}{2i\pi R} \int_{\rho\infty}^{t} e^{-R\sigma(\tau)(\tau + is)} f(\tau + is)\,d\tau$.

2.2. The recursive equations

One can easily verify that $f^1(\psi_0(t), q_0(t))$ and $f^0(q_0(t))$ are in $H_0(a, d)$ (and bounded at infinity) for some "optimal" values a, d corresponding respectively to the maximal distance from the imaginary axis and the minimal distance from the real axis of the poles of such functions. One can prove by induction, see [3] or [5] for the details, that the solutions of Eqs. (2.2) tend to quasi-periodic functions provided that the initial data are chosen to be:
\[ I_j^k(\varphi, \omega, 0^\pm) = \sum_k \mu^k \Im^{0^\pm} F_j^k , \quad p(\varphi, \omega, 0^\pm) = \sum_k \mu^k \Im^{0^\pm} x_0^0 F_0^k . \]
Moreover one can prove that $F_j^k(\varphi, \omega, t)$ has no constant component. Consequently it is convenient to express the trajectories in terms of the "primitives" $\Im^t$ in the form ($a_0 = 1$):
\[ (\mu)^k \psi_j^k(\varphi, t) = (\mu)^k a_j Q_j^t F_j^k + x_j^0 G_j^{1k} + x_j^1 G_j^{0k} \]
where $x_j^0 = 1$, $x_j^1 = |t|$ for $j \ne 0$ while the $x_0^i$ are defined in Eq. (2.3),
\[ Q_j^t[f] = \frac{1}{2}(\Im^t_+ + \Im^t_-)\big[(x_j^0(t)\sigma(\tau)x_j^1(\tau) - x_j^1(t)\sigma(t)x_j^0(\tau))f(\tau)\big] , \quad G_j^{ik} = (\mu)^k \frac{1}{2} a_j \Im x_j^i F_j^k . \]
For the proofs of these assertions see [3] or [5].
Notice that by our definitions,
\[ I_j(\varphi, 0^-) - I_j(\varphi, 0^+) = 2a_j^{-1}\sum_k G_j^{0k} \equiv 2a_j^{-1}G_j^0 , \quad 2G_0^0 = p(\varphi, 0^-) - p(\varphi, 0^+) . \]
We define the formal power series
\[ \sum_k G_j^{lk}(\varphi) \equiv G_j^l(\varphi) , \quad j = 0, \dots, n , \quad l = 0, 1 . \]
Notice that by the KAM theorem the $G_j^0$ are convergent series.

Remark 2.5. (i) We will often use formal power series and in particular formal power series identities, namely identities which hold only at each order k in the series expansion in $\mu$; we will mark such identities with the symbol $A \sim B$. In Sec. 4.5 we will prove that the formal power series we use are "asymptotic". As a definition of asymptotic power series we will assume that a formal power series $\sum_n \mu^n a_n(\varepsilon)$ is asymptotic if for all $q > 0$ there exists $Q > 0$ such that, for all $n \le \varepsilon^{-q}$, $a_n(\varepsilon) \le \varepsilon^{-Qn}$. This implies that we can control the first $\varepsilon^{-q}$ terms provided that $\mu < \varepsilon^Q$.
(ii) It should be stressed that we do not need to prove convergence for all the asymptotic power series involved in a given identity to obtain information on those series which are known to be convergent (by the KAM theorem).

The following Proposition contains some important properties of the operators $Q_j$, all proved in [3].

Proposition 2.6 (Chierchia). (i) The operators $Q_j$ are "symmetric" on $H(a, d)$:
\[ \Im(f\,Q_j g) = \Im(g\,Q_j f) . \]
(ii) $H_0(a, d)$ is closed under the application of $Q_j^t$.
(iii) The operators $Q_j$ preserve parities and if $f \in H_0(a, d)$ is odd then $\Im f = 0$.
(iv) If $F, G \in H(a, d)$ are such that the projection on polynomials, $\pi_P F\cdot G$, has no constant component, then
\[ \Im^{0^\sigma} G(\tau)\partial_\tau F(\tau) = F(0^\sigma)G(0^\sigma) - \Im^{0^\sigma} F(\tau)\partial_\tau G(\tau) . \]
Proposition 2.6(iii) immediately implies the following (again proved in [3])

Corollary 2.7. For all $k \in \mathbb{N}$, $j = 0, \dots, n$, $i = 0, 1$, the function $G_j^{ik}(\varphi)$ is zero for $\varphi = 0$. In particular the splitting vector is zero for $\varphi = 0$ and the system has a homoclinic point.

Proof. We proceed by induction; by Proposition 2.6(iii) $G_j^{i1}(\varphi = 0) = 0$ as it is the integral of an odd analytic function. Consequently $\psi_j^1(\varphi = 0, t)$ is both odd and in $H_0(a, d)$. Now we suppose that $G_j^{ih}(\varphi = 0) = 0$ and $\psi_j^h(\varphi = 0, t)$ is odd and in $H_0(a, d)$ for all $h < k$ and $j = 0, \dots, n$. The function $F_j^k$ is an odd analytic function of the angles $\psi_i$ ($\partial_{\psi_j} f^\delta$) computed at $\psi = \sum_h$
3. Proofs of the Theorems

We define the formal power series:
\[ \Delta^a_{i,j} = \partial_{\varphi_i} G^a_j, \quad \text{for } j = 1, \dots, n, \qquad \delta^a_i = \partial_{\varphi_i} G^a_0, \quad \text{for } a = 0, 1. \]
Notice that such series are known a priori to be convergent only for $a = 0$.

Lemma 3.1. The stable and unstable manifolds are on the same energy surface so that
\[ \sum_{j=1}^{n} G_j^0(\varphi)\big(I_j(\varphi, 0^+) + I_j(\varphi, 0^-)\big) = -G_0^0(\varphi)\big(p(\varphi, 0^+) + p(\varphi, 0^-)\big); \tag{3.1} \]
this relation implies that at the homoclinic point $\varphi = 0$,
\[ \vec{I}(\varphi = 0, 0^+)\,\Delta^0 = -\delta^0\, p(\varphi = 0, 0^+). \]

Proof. Equation (3.1) is simply the energy conservation at time $t = 0$:
\[ (I(\varphi, 0^+), A I(\varphi, 0^+)) + p^2(\varphi, 0^+) = (I(\varphi, 0^-), A I(\varphi, 0^-)) + p^2(\varphi, 0^-); \]
the potential part of the Hamiltonian cancels as the perturbation depends only on the angles. Finally we differentiate in $\varphi$ and compute at the homoclinic point, where $G_j^0 = 0$ by Corollary 2.7.

3.1. The formal linear equation

In the recursive construction of $\psi_j$, and consequently of $G_j^i$, we have distinguished three "blocks":
\[ (0)\ \ x_j^0 G_j^{0k}, \qquad (1)\ \ x_j^1 G_j^{1k}, \qquad (2)\ \ (\mu)^k a_j Q_j^t(F_j^k); \tag{3.2} \]
as the $G_j^{ih}$ can be brought out of the integral we can say that $\psi_j^k$ and $G_j^{0k}$ ($j = 1, \dots, n$) are polynomials in the $G_l^{rh}$ with $l = 0, \dots, n$, $h = 1, \dots, k-1$, $r = 0, 1$. This can be seen as a formal power series identity:
\[ G_j^0(\varphi) \sim J_j^0(\varphi) + \sum_{r=0,1}\Big( \sum_{l=1}^{n} N_{jl}^{[r]}(\varphi)\, G_l^r(\varphi) + n_j^{[r]} G_0^r \Big) + \text{quadratic terms} + \cdots, \qquad [r] = |r - 1|. \]
Following [4] we differentiate this relation in the parameter $\varphi$ and evaluate it on the homoclinic point where $G_j^i \sim 0$; this leads to a linear formal identity for $\Delta^0$:
\[ \Delta^0 \sim D^0 + N^1 \Delta^0 + N^0 \Delta^1 + n^1 \delta^0 + n^0 \delta^1, \tag{3.3} \]
where $D^0_{ij} = \partial_{\varphi_j} J_i^0|_{\varphi = 0}$. Notice that we do not have an explicit expression for the matrices $N^i$ and $n^i$ although we have a recursive algorithm for the coefficients of the series expansion; we will use trees to find such explicit expressions. We can notice however that $D^0$
is the holomorphic part of the splitting matrix, namely it is obtained by using only the holomorphic block (2) of (3.2) in the construction of the homoclinic trajectory. We insert the energy conservation relations in Eq. (3.3):
\[ \Big( 1 - N^1 + \frac{1}{p(\varphi = 0, 0^+)}\, n^1 \vec{I}(\varphi = 0, 0^+) \Big) \Delta^0 \sim D^0 + N^0 \Delta^1 + n^0 \delta^1. \tag{3.4} \]
The tree representation of the trajectories leads to the following Propositions, all proved in the next Sections:

Proposition 3.2. The following formal power series relations hold:
(i) $D^0 \sim N^0$,
(ii) $n^0 \sim \frac{c}{2} D^0 \omega$,
(iii) $D^0$ is the Hessian of a function $S^0$ at the homoclinic point: $D^0_{ij} = \partial_{\varphi_i}\partial_{\varphi_j} S^0(\varphi)|_{\varphi = 0}$.

This Proposition generalizes to Hamiltonian (1.3) similar results of [4]. Relations (i) and (ii) inserted in (3.4) directly imply that:
\[ \Big( 1 - N^1 + \frac{1}{p(\varphi = 0, 0^+)}\, n^1 \vec{I}(\varphi = 0, 0^+) \Big) \Delta^0 \sim D^0 \Big( 1 + \Delta^1 + \frac{c}{2}\, \omega \delta^1 \Big). \tag{3.5} \]
Proposition 3.3. One can use the tree representation to find appropriate bounds on the order $k$ terms of the series expansion of the formal power series of Eq. (3.5). If we denote by $M^k$ the order $k$ term of the $\mu$ expansion of a formal power series $M$, we have:
\[ \max\big(|N^{1k}|, |p^k(\varphi = 0, 0^+)|, |I_j^k(\varphi = 0, 0^+)|, |\delta^{1k}|, |\Delta^{1k}|\big) \le (k!)^{c_1} (C\varepsilon^{-1})^k. \]
Moreover the Fourier coefficients of the function $S^{0k}$:
\[ S^{0k}(\varphi) = \sum_{\nu \in \mathbb{Z}^n :\, |\nu| \le k} e^{i\varphi\cdot\nu}\, \hat{S}^{0k}(\nu), \]
respect the inequalities
\[ |\hat{S}^{0k}(\nu)| \le \begin{cases} (k!)^{c_1} \big(C\varepsilon^{-\frac{p+7}{2}}\big)^k e^{-|\omega\cdot\nu| d} \\ (k!)^{c_1} C^k \varepsilon^{-k} e^{-|\omega\cdot\nu| c} \end{cases}, \]
where $C$, $c < d$ are appropriate order one constants, $c_1 = 4\tau + 4$ ($\tau$ is the Diophantine exponent of $\omega$) and finally $p$ is the degree of the pole of $f(q^0(t))$ nearest to the real axis.$^k$

$^k$ The proof of this proposition only requires that $f(q^0(t))$ has a pole-like singularity, namely that:
\[ \max_{t \in C(a+2,\, d - \sqrt{\varepsilon})} |F_\nu^k(e^t)| \le C\, k!\, \varepsilon^{\frac{p+k}{2}}. \]

Proof of Theorem 1. Proposition 3.3 implies that the formal series of relation (3.5) are asymptotic for $N < \varepsilon^{-q}$ with $|\mu| < \mu_0 = \bar{C}\varepsilon^{(4\tau+4)q+1}$. Then the formal
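relation (3.5) is an equality for the truncated series $M^{\le N}$ (let us call $A$ the formal matrix on the left of $\Delta^0$ and $B$ the one on the right of $D^0$):
\[ A^{\le N} \Delta^{0 \le N} = D^{0 \le N} B^{\le N} + o\Big( \Big(\frac{\mu}{\mu_0}\Big)^{N} \Big). \]
Both $A^{\le N}$ and $B^{\le N}$ are close to identity and so have order one determinants. This proves Theorem 1 and consequently the conjecture posed in [7], namely that the leading order of the splitting determinant is given by its analytic part $\det D^0$.

Proof of Corollary 1.5. Let us now set $\bar{m} = \tau_F + 1$, where $\tau_F$ is the Diophantine exponent of $\omega_1$: $|\omega_1 \cdot \nu_F| \ge \gamma_F |\nu_F|^{-\bar{m}+1}$, $\bar{m} > m$ ($m$ is the number of fast degrees of freedom). We choose $N = C_1 \varepsilon^{-\frac{1}{2\bar{m}}}$ (where $C_1 \le (\gamma_F/|\omega_2|)^{\frac{1}{2\bar{m}}}$ if $\alpha = 0$) so that we can remove the absolute value in $e^{-c|\omega\cdot\nu|}$ and for all frequencies $\nu$ such that $\nu_F \ne 0$:
\[ |\hat{S}^{0k}(\nu)| \le (k!)^{c_1} \varepsilon^{-k} \exp\Big( -\frac{c\,\gamma_F}{\sqrt{\varepsilon}\,(N)^{\bar{m}-1}} + c|\omega_2||\nu| \Big); \]
we can sum on the frequencies $\nu : \nu_F \ne 0$ in
\[ D^{0k}_{ij} = \sum_{|\nu| \le k} \nu_i \nu_j \hat{S}^{0k}(\nu) \]
with $\varphi_i$ or $\varphi_j$ fast:
\[ D^{0k}_{ij} \le (k!)^{c_1} k^3 \varepsilon^{-k} e^{-\tilde{c}\varepsilon^{-1/2\bar{m}}} \sum_{0 \le l \le k} l^{\,n-m} e^{c|\omega_2| l} \le (k!)^{c_1} (\tilde{C}\varepsilon^{-1})^k e^{-\tilde{c}\varepsilon^{-1/2\bar{m}}}. \]
So we can sum the asymptotic series $D^0$ for $k \le N$ and $\mu < \varepsilon^{1+2(\tau+1)/\bar{m}}$,
\[ \min\Big( |\det D^{0 \le N}|,\ \Big(\frac{\mu}{\mu_0}\Big)^{N} \Big) \le C e^{-\tilde{c}\varepsilon^{-1/2\bar{m}}}. \]
Finally we can take any $\bar{m} > m$ and $\tau > n - 1$, so we choose $\bar{m}^{-1} = m^{-1} - (\log(\varepsilon^{-1}))^{-\frac{1}{2}}$ (similarly $\tau + 1 = n + (\log(\varepsilon^{-1}))^{-\frac{1}{2}}$) so that $\varepsilon^{-1/2\bar{m}} = e^{-1}\varepsilon^{-1/2m}$ and $\varepsilon^{1+2(\tau+1)/\bar{m}} \ge C\varepsilon^{1+2n/m}$ for some order one $C$.

If we have only one fast variable we can give better bounds on $\det D^0$, namely we use
\[ |\hat{S}^{0k}(\nu)| \le (k!)^{c_1} \big(C\varepsilon^{-\frac{p+7}{2}}\big)^k e^{-|\omega\cdot\nu| d} \]
and the fact that for one fast frequency
\[ |\omega\cdot\nu| \ge \frac{|\omega_1|}{\sqrt{\varepsilon}} |\nu_1| - \varepsilon^{\alpha} |\omega_2||\nu_S|, \]
provided that $\nu_1 \ne 0$ and $N \le c\varepsilon^{-\frac{1}{2}}$ (with $c < |\omega_1|/|\omega_2|$ if $\alpha = 0$), so by summing up the formal power series we have that:
\[ D^{0 \le N}_{ij} = D^{01}_{ij} + \sum_{k=2}^{N} D^{0k}_{ij}, \]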
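where if $|\mu| \le \tilde{C}\varepsilon^{\frac{p+7}{2}+2n}$:
\[ \sum_{k=2}^{N} D^{0k}_{ij} \le \sum_{k=2}^{N} \big(\mu\tilde{C}\varepsilon^{-\frac{p+7}{2}-2n}\big)^k \sum_{0 < \nu_1} \Big[ e^{-\frac{|\omega_1| d \nu_1}{\sqrt{\varepsilon}}} \Big] \le \big(\mu\tilde{C}\varepsilon^{-\frac{p+7}{2}-2n}\big)^2 \Big[ e^{-\frac{|\omega_1| d}{\sqrt{\varepsilon}}} \Big] \tag{3.6} \]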
(the term in square brackets appears only if $i = 1$ or $j = 1$). So to prove Theorem 2 we have to show that for $\mu \le \varepsilon^{P}$ the first order $D^{01}_{ij}$ dominates.

3.2. Lower bounds on the Melnikov term

The first order of $D^0$ is
\[ D^{01}_{ij} = -\delta_{ij}\, \mathrm{Im} \int_{-\infty}^{\infty} e^{i\omega_j t} \big( f(q_0(t)) - f(0) \big), \]
namely the integral of an even, analytic, exponentially decreasing function. If $j \ne 1$ we bound this integral with an order one constant.

Lemma 3.4. The singularities of $F(t) = f(q^0(t))$ come in groups of eight (in $|\mathrm{Im}\, t| \le \pi$); namely if $t_0$ is a singularity so are
\[ -t_0, \quad \pm\bar{t}_0, \quad \pm t_0 + i\pi, \quad \pm\bar{t}_0 + i\pi. \]
The residues of $f(q_0(t))$ at such points are related; in particular if the Laurent series of $F$ at $t_i$ is
\[ \sum_{k \ge -p} g_k(t_i)(t - t_i)^k, \]
then $g_k(t_i) = -(-1)^k \bar{g}_k(-\bar{t}_i)$.

Proof. We are simply using the fact that $f(q)$ is real and odd and that $z = e^{iq(t)}$ has two preimages $t$ and $-t + i\pi$. This implies in particular that $F(t) = -F(-t) = \bar{F}(t)$.$^l$

In the assumptions of Theorem 2 we have imposed that there is one$^m$ couple $(t_0, -\bar{t}_0)$ of poles closest to the imaginary axis coming from $f(q_0(t))$ rather than $f'(q_0(t))$. Then $f(0) = 0$ as $f$ is odd, and by definition $f(q(t))$ has two poles on the line $|\mathrm{Im}\, t| = d$, so if $\omega_1 > 0$ we shift the integration to a line $\mathrm{Im}\, t = l > d$, not $\varepsilon$-close to any singularity (if $\omega_1 < 0$ we shift to $\mathrm{Im}\, t = -l < -d$).

$^l$ The symbol $\bar{f}(z) := \overline{f(\bar{z})}$.
$^m$ Naturally we could deal with any finite number of poles with this property.
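\[ \mathrm{Im} \int_{-\infty}^{\infty} e^{i\frac{\omega_1}{\sqrt{\varepsilon}} t} f(q_0(t)) \ge 2\pi \Big| \mathrm{Re}\Big[ \mathrm{Res}\big( e^{i\frac{\omega_1}{\sqrt{\varepsilon}} t} f(q_0(t)),\, t_0 \big) + \mathrm{Res}\big( e^{i\frac{\omega_1}{\sqrt{\varepsilon}} t} f(q_0(t)),\, -\bar{t}_0 \big) \Big] \Big| - e^{-\frac{|\omega_1|}{\sqrt{\varepsilon}} l}\, \mathrm{Im} \int_{-\infty}^{\infty} e^{i\frac{\omega_1}{\sqrt{\varepsilon}} t} \big( f(q_0(t + il)) - f(0) \big), \tag{3.7} \]
the last integral is again the integral of a bounded, $\varepsilon$ independent function so we bound it by an order one constant. The residue at the poles can be computed:
\[ \sum_{k=1,p} \frac{(i\omega_1)^{k-1}}{(k-1)!\, \varepsilon^{(k-1)/2}} \big( g_k(t_0) - (-1)^k \bar{g}_k(t_0) \big), \]
which is real and generally greater than
\[ C e^{-\frac{|\omega_1|}{\sqrt{\varepsilon}} d}\, \varepsilon^{-\frac{p-1}{2}}. \tag{3.8} \]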
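The grouping of singularities stated in Lemma 3.4 can be made concrete with a small sketch. The function name `singularity_group` and the sample point are illustrative assumptions, not part of the paper; the code only enumerates the eight points listed in the lemma for a generic $t_0$ and does not check that they are singularities of any particular $f(q_0(t))$.

```python
import math


def singularity_group(t0):
    """Return the eight points that Lemma 3.4 associates with a singularity t0.

    The list is t0, -t0, +-conj(t0), +-t0 + i*pi, +-conj(t0) + i*pi.
    (Hypothetical helper, written only to illustrate the statement.)
    """
    tb = t0.conjugate()
    shift = 1j * math.pi
    return [
        t0, -t0,                 # the point and its opposite
        tb, -tb,                 # complex conjugates
        t0 + shift, -t0 + shift, # shifted by i*pi
        tb + shift, -tb + shift,
    ]


# A generic sample point (assumption: chosen off both axes so nothing coincides).
points = singularity_group(0.3 + 0.7j)
```

For a point on the real or imaginary axis some of the eight entries coincide, so the group degenerates; the sketch makes no attempt to reduce points modulo $2\pi i$.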
Proof of Theorem 2. We choose $|\mu| \le \varepsilon^{p/2+8+4n}$ so that (3.8) dominates over (3.6).

3.3. Heteroclinic intersection for systems with one fast frequency

In the following we will consider systems with one fast frequency and in the a priori stable variables of Hamiltonian (1.1). We can fix $\mu = \varepsilon^{P}$ and ensure Melnikov dominance, as discussed in the previous sections. This means that we have lower and upper bounds on the splitting determinant (and on the eigenvalues of the splitting matrix) of the type:
\[ a\,\varepsilon^{p} e^{-c\varepsilon^{-\frac{1}{2}}} \le \det \Delta^0(\omega) \le b\,\varepsilon^{-p} e^{-c\varepsilon^{-\frac{1}{2}}}. \]
The coefficients $p, a, b, c$ depend on the perturbing function $f$. We consider the function:
\[ F(\varphi, \omega_0, \omega) = \tilde{I}_\mu^-(\varphi, \omega, \rho(\omega)) - \tilde{I}_\mu^+(\varphi, \omega_0, \rho(\omega_0)) \equiv c\sqrt{\varepsilon}\big( I_\mu^-(\varphi, \omega, \rho(\omega)) - I_\mu^+(\varphi, \omega_0, \rho(\omega_0)) \big), \]
where $\omega, \omega_0 \in \Omega_\gamma$. Notice that
\[ F(0, \omega_0, \omega_0) = 0, \qquad \det \frac{\partial F}{\partial \varphi}(0, \omega_0, \omega_0) = 2^n \varepsilon^{n/2} \det \Delta^0(\omega_0). \]
Hence from the implicit function theorem there exists a function $\varphi(\omega, \omega_0, \varepsilon)$ for which
\[ F_\mu(\varphi(\omega, \omega_0, \varepsilon), \omega, \omega_0) \equiv 0, \]
provided $|\omega - \omega_0|$ is small enough. For fixed $\omega_0$, standard computations (see [5]) show that the smallness condition is
\[ |\omega - \omega_0| \le C\varepsilon^{-2p} e^{-2c\varepsilon^{-\frac{1}{2}}}. \]
To prove the existence of heteroclinic intersections, we have to prove the existence of a chain of KAM tori at distances of order $B = O_\varepsilon(e^{-C\varepsilon^{-\frac{1}{2}}})$ for some $C > 2c$, namely we have to adapt to our anisotropic setting (one fast and many slow time scales) the classical techniques discussed in detail in [2] or [5].

Proposition 3.5. There exists a list of Diophantine frequencies $\omega_1, \dots, \omega_h \in \Omega_\gamma$ such that:
\[ \text{(i)}\ \ \sqrt{\varepsilon}\,|\omega_i - \omega_{i+1}| \le e^{-C_1 \varepsilon^{-\frac{1}{2}}}, \qquad \text{(ii)}\ \ \varepsilon^{-\frac{1}{2}} |\Pi_n(\omega_1 - \omega_h)| \sim O_\varepsilon(1), \tag{3.9} \]
where $\Pi_n$ is the projection on the $n$th component. To each of the frequencies $\omega_i$ is associated a preserved unstable invariant torus of Hamiltonian (1.1), $T(\omega_i, \rho_i)$ (with $\rho_i \in [-\frac{1}{2}, \frac{1}{2}]$), of frequency $\sqrt{\varepsilon}\rho_i\omega_i$. The scaling factor $\rho_i$ is chosen so that all the invariant tori are on the same energy surface, as explained in Remark 1.3.

To prove the Proposition we proceed in two steps:
(1) Define an appropriate set $\bar{\Omega}$ of Diophantine frequencies respecting condition (3.9).
(2) Prove the existence of unstable KAM tori of frequency $\sqrt{\varepsilon}\rho\omega$ for $\rho \in [-\frac{1}{2}, \frac{1}{2}]$ and $\omega \in \bar{\Omega}$. We will only sketch the proof of this second point.

Definition 3.6. Given an order one $C_1 > 2c$, set $A_1 = e^{-C_1 \varepsilon^{-\frac{1}{2}}}$ and consider the set:
\[ \bar{\Omega} := \left\{ \omega \in \Omega :\ \begin{array}{ll} \text{(a)}\ \ \sqrt{\varepsilon}\,|\omega\cdot l| \ge \dfrac{A_1}{|l|^{\tau}} & \forall\, l \in \mathbb{Z}^n/\{0\} : l_1 \ne 0 \\[2mm] \text{(b)}\ \ \sqrt{\varepsilon}\,|\omega\cdot l| \ge \dfrac{\varepsilon^2}{|l|^{\tau}} & \forall\, l \in \mathbb{Z}^n/\{0\} : l_1 = 0 \end{array} \right\}. \]
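The two conditions of Definition 3.6 can be probed numerically on a finite box of integer vectors. The following is only a sketch under stated assumptions: the name `satisfies_conditions`, the cutoff `l_max`, the choice of $|l|$ as the sum of absolute values and all numerical values are illustrative and not taken from the paper, and a finite check can only refute membership in $\bar{\Omega}$, never prove it.

```python
from itertools import product
from math import exp, sqrt


def satisfies_conditions(omega, eps, A1, tau, l_max):
    """Check conditions (a) and (b) of Definition 3.6 for 0 < |l| <= l_max.

    |l| is taken to be the sum of absolute values (an assumption).
    Returns False as soon as one integer vector l violates its condition.
    """
    n = len(omega)
    for l in product(range(-l_max, l_max + 1), repeat=n):
        norm = sum(abs(c) for c in l)
        if norm == 0 or norm > l_max:
            continue
        small_divisor = sqrt(eps) * abs(sum(w * c for w, c in zip(omega, l)))
        # (a) applies when the fast component l_1 is non-zero, (b) otherwise.
        threshold = A1 if l[0] != 0 else eps ** 2
        if small_divisor < threshold / norm ** tau:
            return False
    return True


eps = 0.01
C1 = 3.0                      # hypothetical order one constant
A1 = exp(-C1 / sqrt(eps))     # A_1 = exp(-C_1 * eps^(-1/2))

irrational_ok = satisfies_conditions((1.0, sqrt(2.0)), eps, A1, tau=2, l_max=5)
resonant_ok = satisfies_conditions((1.0, 1.0), eps, A1, tau=2, l_max=5)
```

The second frequency vector is resonant, since $l = (1, -1)$ gives $\omega \cdot l = 0$, so it is rejected; the first one passes the truncated test, which illustrates how much weaker the exponentially small threshold $A_1$ of condition (a) is than the threshold $\varepsilon^2$ of condition (b).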
As there is only one fast time scale, the condition $\omega \in \Omega$ can be given only on the slow variables, while the fast variable is obtained by "energy conservation" $\omega \in \Sigma$ ($\Sigma$ is the ellipsoid of Definition 1.1); namely we consider a function $F : \mathbb{R}^{n-1} \to \Sigma$:
\[ F(x) := \Big( \sqrt{2E - \sum_{i=2}^{n-1} x_i^2 - \varepsilon^{-1} x_n^2},\ x_2, \dots, x_n \Big), \]
so that given $\beta = \frac{1}{2} + a$ ($\frac{1}{2} \le \beta \le 1$) and $R, r, R_1, r_1, r_2$ appropriate order one constants$^n$ and defining:
\[ \tilde{\Omega} := \{ \tilde{\omega} \in \mathbb{R}^n : \tilde{\omega}\varepsilon^{-\frac{1}{2}} \in \Omega \}, \]
we have $\tilde{\Omega} = F(B(R, r) \cap M)$

$^n$ This condition automatically implies $\bar{r} \le \sqrt{\varepsilon}\omega_1 \le \bar{R}$; notice that we are not using the same notation as in (1.1), here $\omega_i$ is always the $i$th component of $\omega$.
where $B(R, r) \subset \mathbb{R}^{n-1}$ is the spherical shell$^o$ of radiuses $\varepsilon^\beta R$, $\varepsilon^\beta r$ and
\[ M := \{ \omega \in \mathbb{R}^{n-1} : \varepsilon r_1 \le \omega_n \le \varepsilon R_1,\ \omega_i > r_2\varepsilon^\beta,\ i = 2, \dots, n-1 \}. \]
As we always deal with $\tilde{\omega} = \sqrt{\varepsilon}\omega$ we will omit the tilde, rescaling all the relations. The Jacobian of $F$ in $B(R, r) \cap M$ is bounded from above and below by order one constants so that given a measurable set$^p$ $S \subset \Omega$, $\mathrm{meas}(F^{-1}(S)) \sim \mathrm{meas}(S)$.

Condition (b) naturally defines subsets of $B(R, r) \cap M$. Moreover we can project the set respecting condition (a) on the subspace of the slow variables. Call this set $\bar{\Omega}_4 \subset B(R, r) \cap M$.

Let us call $S(x)$ the $(n-2)$-dimensional sphere centered in the origin and of radius $\varepsilon^\beta x$. We take $2r < R$ and consider $\bar{R}$ so that
\[ R_1/2 < \bar{R} < R_1, \qquad \frac{r}{R} > \frac{r_1}{\bar{R}}. \tag{3.10} \]

Definition 3.7. Consider the sets
\[ S_2 := \{ \omega \in S(R) : \varepsilon(\bar{R} + (R_1 - \bar{R})/4) \le \omega_n \le \varepsilon(R_1 - (R_1 - \bar{R})/4),\ \omega_i \ge r_2\varepsilon^\beta,\ \forall\, i \ne n \}, \]
\[ S_3 := \{ \omega \in S(R) : \varepsilon\bar{R} \le \omega_n \le \varepsilon R_1,\ \omega_i \ge r_2\varepsilon^\beta,\ \forall\, i \ne n \}. \]
$M \cap S(R) \supset S_3 \supset S_2$, and the sets all have measure of order $\varepsilon^{(n-3)\beta+1}$.

Given a set $X \subset S(R)$, its cone $C(X)$ is the set of half-lines stemming from the origin and reaching points of $X$. We consider truncated cones $T(X) := C(X) \cap B(R, r)$ and, for any $r < a < b < R$, $T_{a,b}(X) = T(X) \cap B(b, a)$. Notice that by (3.10) if $X \subset S_3$, then $T(X) \subset M \cap B(R, r)$.

Remark 3.8. Recall that given a measurable set $X \subset S(R)$, the cone of $X$ is measurable and $\mathrm{meas}\, T(X) \sim \varepsilon^\beta\, \mathrm{meas}(X)$, $\mathrm{meas}\, T_{a,b}(X) \sim \varepsilon^\beta (b - a)\, \mathrm{meas}(X)$.

Definition 3.9. Given $A_2 = e^{-C_2\varepsilon^{-\frac{1}{2}}}$ with $2c < C_2 < C_1$ and for all $s \in \mathbb{R}$, $1 < s < 4R/r$, we consider the sets:
\[ \bar{\Omega}_2(s) = \Big\{ \omega \in B(R, r) : |\omega\cdot l| \ge \frac{s\varepsilon^2}{|l|^\tau},\ \forall\, l \in \mathbb{Z}^{n-1}/\{0\},\ |l| \le A_2^{-1} \Big\}, \]
\[ \bar{\Omega}_3(s) = \Big\{ \omega \in B(R, r) : |\omega\cdot l| \ge \frac{s\varepsilon^2}{|l|^\tau}\ \ \forall\, l \in \mathbb{Z}^{n-1}/\{0\} \Big\}. \]

Remark 3.10. Standard measure theoretic arguments imply that the sets $(\bar{\Omega}_i(s) \cap S(R))^C \cap S(R)$ all have measure of order $\varepsilon^{(n-3)\beta+2}$; this implies as well that $(\bar{\Omega}_i(s) \cap S_2)^C \cap S_2$ has measure of the same order, and the same holds for intersections with $S_3$ and for $(\bar{\Omega}_2(s) \cap \bar{\Omega}_3(s) \cap S_2)^C \cap S_2$. We will repeatedly use such relations.

$^o$ We call spherical shell of radiuses $b$, $a$ the $(n-1)$-dimensional domain $\{ x \in \mathbb{R}^{n-1} : a \le |x| \le b \}$.
$^p$ The symbol $\sim$ means that the two measures are of the same order in $\varepsilon$.
Lemma 3.11. (i) Given a point $\omega \in \bar{\Omega}_2(2R/r) \cap S_2$, the whole solid ball $B_\rho(\omega)$ of center $\omega$ and radius $\rho = \varepsilon^2 A_2^{1+\tau}$ is contained in $\bar{\Omega}_2(R/r)$ and its intersection with $S(R)$ is contained in $S_3$.
(ii) The whole truncated cone $T(\bar{\Omega}_2(R/r) \cap S_3)$ is in $\bar{\Omega}_2(1)$, and the same holds for $\bar{\Omega}_3$.

Proof. (i) First notice that any $(n-2)$-dimensional "ball" $B_\rho(x) \cap S(R) \subset S_3$ if $x \in S_2$. Now consider $\omega \in \bar{\Omega}_2(2R/r) \cap S_2$ and a vector $x \in \mathbb{R}^{n-1}$ on the unit sphere:
\[ |(\omega + \rho x)\cdot l| \ge \big| |\omega\cdot l| - |l|\rho \big| \ge |\omega\cdot l| \Big( 1 - \rho\frac{|l|}{|\omega\cdot l|} \Big); \]
as
\[ \frac{|l|}{|\omega\cdot l|} \le \frac{r|l|^{\tau+1}}{2R\varepsilon^2} \]
and $|l| \le A_2^{-1}$, setting $\rho = \varepsilon^2 A_2^{1+\tau}$ we have $0 < \rho\frac{|l|}{|\omega\cdot l|} < \frac{1}{2}$.
(ii) Given a point $x \in \bar{\Omega}_3(R/r) \cap S(R)$ (or $x \in \bar{\Omega}_2(R/r) \cap S(R)$), then $rx/R \in S(r)$. Moreover for $r/R \le t \le 1$:
\[ |tx\cdot l| = t|x\cdot l| \ge \frac{r}{R}\,\frac{R\varepsilon^2}{r|l|^\tau} = \frac{\varepsilon^2}{|l|^\tau}. \]

Lemma 3.12. The set $\bar{\Omega}_2(R/r) \cap S(R)$ is a union of a finite number of disjoint convex domains. Each domain is contained in an $(n-2)$-dimensional "ball" of radius $C_3\varepsilon^\beta A_2$ for an appropriately fixed order one $C_3$.

Proof.
\[ (\bar{\Omega}_2(R/r) \cap S(R)) \equiv S(R) \cap \bigcap_{\substack{l \in \mathbb{Z}^{n-1} \\ |l| \le A_2^{-1}}} \Big( \Big\{ x \in \mathbb{R}^{n-1} : (x\cdot l) > \frac{R\varepsilon^2}{r|l|^\tau} \Big\} \cup \Big\{ x \in \mathbb{R}^{n-1} : (x\cdot l) < -\frac{R\varepsilon^2}{r|l|^\tau} \Big\} \Big); \]
now the intersection of sets such that each connected component is convex has the same property. Suppose, by contradiction, that there are points $x_1, x_2 \in \bar{\Omega}_2(R/r) \cap S(R)$ such that the arc $\overset{\frown}{x_1 x_2}$ is all in $\bar{\Omega}_2(R/r) \cap S(R)$ and has length greater than $2R^{-1}\sqrt{n}\,\varepsilon^\beta A_2$. Let $\langle x_1, x_2 \rangle$ be the plane generated by the vectors $x_1, x_2$, and on it consider the sector $S$ of unit vectors orthogonal to $\overset{\frown}{x_1 x_2}$; this sector has angle $\vartheta = 2\sqrt{n}A_2$. The product space of $\langle x_1, x_2 \rangle^\perp$ with the sector $S$ is a multi-cylinder in which there cannot be integer vectors $l \in \mathbb{Z}^{n-1}$ with $|l| \le A_2^{-1}$.

Now we consider the intersection of the multi-cylinder with the sphere $|x| = A_2^{-1} - \sqrt{n}$; on $\langle x_1, x_2 \rangle$ it is an arc of length greater than $2\sqrt{n}$, so that a ball of radius $\sqrt{n}$ is contained in the multi-cylinder. Now in each ball of radius $\sqrt{n}$ there is at least one integer vector. Namely let $x$ be the center of the ball; then $[x]$ (integer part of each component) is an integer vector and $|x - [x]|_\infty \le 1$.

Let $N$ be the number of connected domains of $\bar{\Omega}_2(R/r) \cap S(R)$ contained in $S_3$. Each domain contains an $(n-2)$-dimensional "ball" of radius $\rho = \varepsilon^2 A_2^{1+\tau}$, so that
\[ N \le A_2^{-(n-2)(\tau+1)}\varepsilon^{\beta(n-2)-2n+5}. \]
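The lattice-point step used in the proof of Lemma 3.12 (every ball of radius $\sqrt{n}$ contains an integer vector, obtained by taking the integer part of each component of the center) can be checked on an example. This is a minimal sketch; the function name `integer_point_near` and the sample center are illustrative assumptions.

```python
import math


def integer_point_near(center):
    """Return the componentwise integer part of `center` and its Euclidean
    distance from `center`.

    Each component moves by less than 1, so the distance is below sqrt(n),
    where n = len(center).
    """
    lattice = [math.floor(c) for c in center]
    distance = math.sqrt(sum((c - k) ** 2 for c, k in zip(center, lattice)))
    return lattice, distance


center = [2.7, -1.2, 5.5]          # hypothetical center of a ball in R^3
lattice, distance = integer_point_near(center)
```

Since every component changes by less than one, the Euclidean distance is bounded by $\sqrt{n}$, which is the radius required in the proof.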
Let us now consider the Cantor set $\bar{\Omega}_3(R/r) \cap S_3$; by Remark 3.10 we have that $(\bar{\Omega}_3(R/r) \cap S_3)^C \cap S_3$ has measure of order $\varepsilon^{(n-3)\beta+2}$. This implies that $\bar{\Omega}_3(R/r) \cap S_3 \cap \bar{\Omega}_2(R/r)$ is not empty and the measure of $(\bar{\Omega}_3(R/r) \cap S_3 \cap \bar{\Omega}_2(R/r))^C \cap S_3$ is of order $\varepsilon^{(n-3)\beta+2}$.

Lemma 3.13. There exists a connected domain $D$ of $\bar{\Omega}_2(R/r) \cap S_3$ such that
\[ \mathrm{meas}(D \cap \bar{\Omega}_3(R/r)) \ge A_2^{(n-2)(\tau+1)+1}. \]

Proof. Suppose the assertion to be false; then calling $D_i$, $i = 1, \dots, N$, the connected domains:
\[ \mathrm{meas}\, S_3 \sim \mathrm{meas}(\bar{\Omega}_2(R/r) \cap S_3 \cap \bar{\Omega}_3(R/r)) = \sum_{i=1}^{N} \mathrm{meas}(D_i \cap \bar{\Omega}_3) \le A_2^{(n-2)(\tau+1)+1} N, \]
which is absurd.

Then we can use Lemma 3.11(ii) and consider the truncated cone $T(D) \subset \bar{\Omega}_2(1)$; $P = T(D) \cap \bar{\Omega}_3(1)$ has measure of order $A_2^{(1+\tau)(n-2)+1}\varepsilon^\beta$; namely by Lemma 3.13 the Cantor set $P$ contains all radial segments having an endpoint in $D \cap \bar{\Omega}_3(R/r)$ and the other on $S(r)$.

Consider an $(n-1)$-dimensional ball of radius $\rho \sim \varepsilon^\beta A_2$ centered on a point $x \in D$ and which contains $D$ (such a ball exists by Lemma 3.12). Given $h = \big[\frac{2(R-r)}{3\rho R}\big]$, consider the points $x_i = t_i x$ with $t_i = 1 - \frac{3}{2} i\rho$, $h \ge i \in \mathbb{N}_0$, and let us cover $T(D)$ with a finite number of balls $B_i$ of radius $\rho$ and centered on the points $x_i$. Setting $\rho = 2C_3\varepsilon^\beta A_2$ we have that $B_i \cap B_j$ is empty if $|i - j| > 1$ and each $B_i \cap B_{i+1}$ contains a truncated cone $T_{a_i,b_i}(D)$ with $b_i - a_i \ge \rho/4$. We consider the sets $P_i = T_{a_i,b_i}(D) \cap \bar{\Omega}_3(1)$; by Lemma 3.13 each $P_i$ has measure of order $\varepsilon^\beta A_2^{(1+\tau)(n-2)+2}$.

Now we consider the Cantor set $\bar{\Omega}_4$ whose complementary set in $M \cap B(R, r)$ has measure of order $\varepsilon^{(n-2)\beta+1}A_1$. Its intersection with $P_i$ has measure of order $\varepsilon^\beta A_2^{(1+\tau)(n-2)+2}$, provided that $A_1 < A_2^{(\tau+1)(n-2)+3}$. Consider a list $\omega_i \in P_i \cap \bar{\Omega}_4$; for each $i$ we have that $\omega_i, \omega_{i+1} \in B_{i+1}$, so the list respects condition 3.9(i); moreover
\[ \min_{y \in B_0} y_n \ge \bar{R} - 2C\varepsilon^\beta A_2 \qquad \text{and} \qquad \max_{y \in B_h} y_n \le \frac{r}{R}R_1 + 2C\varepsilon^\beta A_2 \]
for some order one $C$, so the list respects condition 3.9(ii).

In the Appendix A.2 we have proved, generalizing similar results of [9], that there exists a symplectic transformation, well defined in a region $W$ of the phase space $(\tilde{I}, \psi)$, which sends Hamiltonian (1.1) in the local normal form:
\[ \frac{1}{2}(J, AJ) + \sqrt{\varepsilon}\,G_1(PQ, \sqrt{\varepsilon}) + \mu g_1(\phi_S, J, P, Q) + \alpha f_1(\phi, J, P, Q), \tag{3.11} \]
where $\alpha = O_\varepsilon(e^{-C\varepsilon^{-\frac{1}{2}}})$ for any order one $C$. $W$ is of order one in the actions both in the fast direction $J_1$ and in the degenerate one $J_n$, namely there exist
points $w_1, w_2 \in W$ such that $|\Pi_{J_n}(w_1 - w_2)| = O_\varepsilon(1)$. We can then prove a KAM theorem for the Hamiltonian (3.11) for $\mu < \varepsilon^4$ with the frequencies $\omega$ in $\bar{\Omega}$ by choosing $(A_1)^2 \gg \alpha$.

Roughly speaking, KAM theorems are proved by performing an infinite sequence of symplectic transformations defined in a set of nested domains whose intersection is not trivial. Each approximation step reduces the order of the perturbation quadratically and is well defined provided an appropriate smallness condition is verified. Such a condition is of the type $\mu\gamma^{-2} \ll 1$, where $\mu$ is the small parameter and $\gamma$ is the Diophantine constant of the frequency $\omega$ of the preserved torus. To apply this scheme to Hamiltonian (3.11), we first perform a finite number of approximation steps on the slow variables with $J_1$ as a parameter; the small denominators involved are $|\omega_S \cdot l|$, on which we have the stronger Diophantine condition, so that the approximation scheme works provided that $\mu\varepsilon^{-4} \ll 1$. Eventually we will reduce the $\mu$ perturbation to order $\alpha$ and then continue with the classical KAM scheme on all the variables; now the smallness condition is $\alpha A_1^{-2} \ll 1$. This completes the proof of Proposition 3.5.

4. Tree Representation

4.1. Definitions of trees

We briefly review the tree representation of the homoclinic trajectory. The definitions contained in this Subsection are all adapted from [10].

Definition 4.1. A graph $G$ consists of two sets $V(G)$ (vertices), $E(G)$ (edges) such that $E(G)$ is a subset of the unordered pairs of distinct elements of $V(G)$. We will always consider finite graphs, i.e. graphs such that $N(G) = |V(G)|$ is finite. Two vertices $i, j \in V(G)$ are said to be adjacent if $(i, j) \in E(G)$. It is customary to write $n \in G$ in place of $n \in V(G)$ and $(i, j) \in G$ in place of $(i, j) \in E(G)$. Two graphs $G_1, G_2$ are equal if and only if they have the same vertex set and the same edge set.

Definition 4.2. A path joining the vertices $i, j \in G$ is a subset $P_{ij}$ of $E(G)$ of the form
\[ P_{ij} := \{ (i, v_1), (v_1, v_2), \dots, (v_k, j) \}. \]
A graph $G$ is connected and without loops if for all $i, j \in G$ there exists one and only one path that connects them. Such graphs are called trees. Their vertices are called nodes and their edges are called branches. A tree $T$ such that the set $V(T) = \{1, 2, \dots, N(T)\}$ is called a numbered tree.

Definition 4.3. A labeled tree is a tree $A$ plus a label $L_A(v) \ge 0$, which is generally a set of functions $f_A^i(v)$ defined on the nodes. When possible we will omit the subscript $A$ in the functions $f^i$.
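The characterization of trees in Definition 4.2 (exactly one path between any two vertices) can be tested on a finite graph. The sketch below uses the standard equivalent criterion for finite graphs, connectedness together with $|E| = |V| - 1$; the function name `is_tree` and the sample graphs are illustrative assumptions, and edges are assumed to be listed once each as pairs of distinct vertices.

```python
def is_tree(vertices, edges):
    """Return True if the finite graph (vertices, edges) is a tree.

    Uses the criterion: connected and |E| = |V| - 1, which for finite
    graphs is equivalent to the existence of exactly one path between
    any two vertices.
    """
    vertices = set(vertices)
    if not vertices or len(edges) != len(vertices) - 1:
        return False
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search from an arbitrary vertex to test connectedness.
    seen, stack = set(), [next(iter(vertices))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(adjacency[v] - seen)
    return seen == vertices


path_graph = is_tree([1, 2, 3], [(1, 2), (2, 3)])
triangle = is_tree([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
triangle_plus_isolated = is_tree([1, 2, 3, 4], [(1, 2), (2, 3), (3, 1)])
```

The last example has the right number of edges but is neither connected nor loop-free, which is why both parts of the criterion are needed.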
Fig. 2.
Definition 4.4. Two labeled trees $X, Y$ are isomorphic if there is a bijection, say $h$, from $V(X)$ to $V(Y)$ such that for all $a \in V(X)$, $L_X(a) \equiv L_Y(h(a))$. Moreover $(a, b) \in E(X)$ if and only if $(h(a), h(b)) \in E(Y)$. We say that $h$ is an isomorphism from $X$ to $Y$. Notice that since $h$ is a bijection, $h^{-1}$ is well defined and is an isomorphism from $Y$ to $X$. We will call symmetries or automorphisms of $X$ the isomorphisms from $X$ to $X$.

It is often convenient and more compact to represent a tree by a diagram, with points for the nodes and lines for the branches, as in Fig. 2. In these diagrams the positions of the points and lines do not matter; the only information conveyed is which pairs of nodes are joined by a branch. This means that the two diagrams in Fig. 2 are equal by definition. Strictly speaking these diagrams do not define graphs, since the set $V$ is not specified. However, if the diagram has $N$ points, we may assign distinct natural numbers $1, 2, \dots, N$ to the points (which we still call nodes), so obtaining a labeled numbered tree. Then it is easily seen that the two trees in Fig. 2 are isomorphic.

Definition 4.5. We will call diagrams the equivalence classes of labeled trees via the relation $A \cong B$ if and only if $A$ and $B$ are isomorphic. An obvious consequence of this definition is that $L_A(v)$ and $N(A)$ are well defined on the equivalence classes. We can choose a representative $A'$ of the equivalence class $A$ by giving a numbering $1, 2, \dots, N(A)$ to the nodes of $A$.

Remark 4.6. Given an equivalence class of labeled trees $A$ and a numbering $A'$, the group of automorphisms of $A'$ can be identified with a subgroup of the group of permutations on $N(A)$ elements, $S_{N(A)}$; we denote such subgroup by $S(A')$. $S(A')$ is the subgroup of the permutations $\sigma \in S_{N(A)}$ which fix both $E(A')$ and the labels $L(A')$. Namely$^q$ $\sigma \in S(A) \to \sigma E = E$ and $L(n) = L(\sigma(n))$ for all $n \le N(A)$.

Given two isomorphic trees $A'$ and $A''$, representatives of $A$, let $h$ be a bijection such that $E(A') = hE(A'')$. The groups $S(A')$ and $S(A'') = h^{-1}S(A')h$ are isomorphic. We will improperly call the equivalence classes via this relation the symmetry group $S(A)$ of the diagram $A$.

$^q$ With standard abuse of notation we denote by $\sigma E(A')$ the function such that $\sigma(a, b) = (\sigma a, \sigma b)$ for all $(a, b)$ in $E(A')$.
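The symmetry group of Remark 4.6 can be computed by brute force for a small numbered tree, by keeping the permutations of the nodes that fix both the edge set and the labels. The following is a minimal sketch: the function names and the six-node edge list are assumptions made for illustration (the edge list was chosen to be compatible with the symmetries listed for the numbered tree of Fig. 3, but the figure itself is not reproduced here), and all labels are taken equal.

```python
from itertools import permutations


def automorphisms(nodes, edges, labels=None):
    """Return all permutations sigma (as dicts) with sigma E = E and
    L(n) = L(sigma(n)) for every node n, as in Remark 4.6."""
    edge_set = {frozenset(e) for e in edges}
    labels = labels or {v: 0 for v in nodes}
    found = []
    for image in permutations(nodes):
        sigma = dict(zip(nodes, image))
        if any(labels[v] != labels[sigma[v]] for v in nodes):
            continue
        if {frozenset((sigma[a], sigma[b])) for a, b in edges} == edge_set:
            found.append(sigma)
    return found


def orbits(nodes, autos):
    """Orbit of v = set of images of v under the whole symmetry group."""
    return {frozenset(sigma[v] for sigma in autos) for v in nodes}


# Hypothetical numbered tree: nodes 5 and 6 are adjacent,
# 1 and 4 hang from 5, 2 and 3 hang from 6.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(5, 1), (5, 4), (5, 6), (6, 2), (6, 3)]

autos = automorphisms(nodes, edges)
node_orbits = orbits(nodes, autos)
```

For this edge list the search returns eight permutations, including the transposition $(1, 4)$ and the four-cycle $(5, 6)(1, 2, 4, 3)$, and the nodes split into the two orbits $\{1, 2, 3, 4\}$ and $\{5, 6\}$. The brute-force search over all $N!$ permutations is only practical for very small trees.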
Using standard notation (see for instance [11]) we denote by $a := (i_1, i_2, \dots, i_m)$ with $\mathbb{N} \ni i_j \le N(A)$ the permutation such that $a(i_h) = i_{h+1}$, $a(i_m) = i_1$, and $a(n) = n$ for all $\mathbb{N} \ni n \le N(A)$ such that $n \notin \{i_1, i_2, \dots, i_m\}$. Moreover we denote by $ab$ the composition of $a$ and $b$. As an example, consider in Fig. 3 the numbered tree $A$ ($N(A) = 6$); its symmetries are the identity and $a := (1, 4)$; $b := (2, 3)$; $c \equiv a \circ b$; $d \equiv (5, 6)(1, 2)(4, 3)$; $e := (5, 6)(1, 3)(2, 4)$; $f := (5, 6)(1, 2, 4, 3)$; $g := f \circ a$. Clearly any other numbering on $A$ would give an isomorphic symmetry group.

Fig. 3.

Definition 4.7. Given a tree $A$ and a node $v \in A$, we define its orbit:
\[ [v] := \{ w \in A : w = g(v) \text{ for some } g \in S(A) \}, \]
i.e. the list of nodes obtained by applying the whole group $S(A)$ to $v$; notice that this is an equivalence relation (a proof of this statement is in [10]). In the example of Fig. 3 there are two orbits, which in the chosen numbering are:
\[ [1] \equiv \{1, 2, 3, 4\} \quad \text{and} \quad [5] \equiv \{5, 6\}. \]

Remark 4.8. The orbits are well defined on the equivalence classes of labeled trees; it should be clear, for instance, that the nodes signed in black in the diagram of Fig. 4 are an orbit.

Fig. 4.

Definition 4.9. A rooted labeled tree is a labeled tree $A$ plus one of its nodes called the first node ($v_A$ or $v_0$); this gives a partial ordering to the tree, namely we say that $i > j$ if $P_{v_0 j} \subset P_{v_0 i}$. Moreover choosing a first node induces a natural
ordering on the couples of nodes representing the branches, namely $(a, b) \in E(A)$ implies that $a < b$. We recall some definitions on rooted trees:
(a) the level of $v$, $l(v)$, is the cardinality of $P_{v_0 v}$;
(b) the nodes subsequent to $v$, $s(v)$, are the nodes adjacent to $v$ and of higher level; the node preceding $v$ is the only node adjacent to $v$ and of lower level;
(c) given a node $v$ of $A$, we call $A^{\ge v}$ the rooted tree (with first node $v$) of the nodes $w \ge v$; we call $A \backslash v$ the remaining part of the tree $A$.

An isomorphism between rooted trees $(A, v_A)$, $(B, v_B)$ is an isomorphism between $A$ and $B$ which sends $v_A$ in $v_B$. The symmetries of a rooted labeled tree $(A, v_A)$, which we denote again by $S(A, v_A)$, are the subgroup of the symmetries of the corresponding unrooted tree that fix the first node $v_A$. As done for trees, we can represent the equivalence classes of rooted trees with diagrams, representing by convention the first node on the left and all the nodes of the same level aligned vertically (it should be obvious that the definitions $v > w$, $A \backslash v$ and $A^{\ge v}$ are well posed on the equivalence classes).

4.2. Admissible trees

Definition 4.10. We consider rooted labeled trees such that some nodes are distinguished by having a different set of labels.$^r$ An admissible tree is a symbol:
\[ A,\ \{v_A\},\ \{v_1, \dots, v_m\},\ \{w_1, \dots, w_h\} \]
such that $A$ is a tree, all the $v_i$, $w_j$ and $v_A$ are nodes of $A$, the $v_i$ are all end-nodes,
\[ \{v_i\}_{i=1}^{m} \cap \{w_j\}_{j=1}^{h} = \emptyset \]
and the $v_i$ are all different. We call $\{v_i\}_{i=1}^{m} \equiv \mathcal{F}(A)$ the fruits of $A$, $\{w_j\}_{j=1}^{h} \equiv \mathcal{M}(A)$ the marked nodes$^s$ of $A$ and the set
\[ \mathring{A} : \{ v \notin \mathcal{F}(A) \} \]
the free nodes of $A$. Finally $s_0(v)$ are the free nodes in $s(v)$. The labels are distributed in the following way:
(a) For each node $v \ne v_A$, one angle label $j_v \in \{0, \dots, n\}$ (remember that we are considering a system with $n + 1$ degrees of freedom).
(b) For each node $v$, one order label $\delta_v = 0, 1$ if $v \in \mathring{A}$ and $\delta_v \in \mathbb{N}$ otherwise.
(c) For each node $v \in \mathcal{M}(A)$, one angle-marking $J = 0, \dots, n$ and one function-marking $h(t) \in H$.
(d) For each node $v \in \mathcal{F}(A)$, one type label $i = 0, 1$.

$^r$ The dynamical meaning of the labels will be clear when we define the "value" of a tree.
$^s$ A node $v$ can appear many times in $\mathcal{M}(A)$; we will say it carries more than one marking.
Fig. 5. Examples of trees in $\mathcal{A}^5$ and in $(\mathcal{T}_0^0)^5$ (see Definition 4.13).
We set a grammar on the so defined labeled rooted trees, namely:
\[ \delta_v = 0 \to \{ j_v = J_v = 0,\ |s(v)| \ge 2,\ j_{v'} = 0\ \ \forall\, v' \in s(v) \}. \]
To draw the diagrams without writing down the labels we give a color to each $j = 1, \dots, n$ (which forces $\delta = 1$) and two different colors for the couples of labels $j = 0$, $\delta = 1$ and $j = 0$, $\delta = 0$. In all the pictures we will set $n = 1$ and choose the colors gray, black and white, see Fig. 5. The fruits $\mathcal{F}(A)$ will be represented as "bigger" end-nodes colored with the color corresponding to their angle label and with their order and type written on a side. The marked nodes will be distinguished by a box of the color corresponding to their angle-marking and with their function-marking written on a side. If the function marking is $h(t) = 1$ we will omit the function marking. By convention the first node is set on the left, and the nodes of the same level are aligned vertically.

Definition 4.11. (1) We will call fruitless trees the (labeled rooted) trees $A$ such that $\mathcal{F}(A)$ is empty. We will say that a fruit $v$ stems from $w$ if $v \in s(w)$.
(2) We will call $\mathcal{T}$ the set of equivalence classes (as in Definition 4.5) of admissible trees, $\mathring{\mathcal{T}}$ the subset of $\mathcal{T}$ of trees with at least a free node and $\mathcal{A}$ the subset of $\mathring{\mathcal{T}}$ of "fruitless" trees. Finally we will call $\overset{m}{\mathcal{A}}$ the subset of $\mathcal{A}$ of fruitless trees with no marking.
(3) We will call $F_j^{ik}$ the "tree" composed of one fruit of order $k$, angle $j$ and type $i$; clearly
\[ \mathcal{T} \equiv \mathring{\mathcal{T}} \bigcup_{\substack{i = 0, 1 \\ j = 0, \dots, n \\ k > 0}} F_j^{ik}. \]

Notational Convention 1. Using standard notation we represent the equivalence classes by $[A]$ where $A$ is an admissible tree. Moreover given a tree $A$ we will write $A \in \mathcal{T}$ if it is a representative of an equivalence class in $\mathcal{T}$.

Definition 4.12. The order of a tree $A \in \mathcal{T}$ is:
\[ o(A) = \sum_{v \in A} \delta_v. \]
The order of a node $v$ of $A$ is $o(v) = o(A^{\ge v})$.
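The two notions of order in Definition 4.12 can be illustrated on a small rooted tree stored as a map from each node to its subsequent nodes $s(v)$. This is a minimal sketch; the names `subtree_nodes` and `order`, the node names and the order labels are hypothetical, chosen only so that the $\delta = 0$ node has two subsequent nodes as the grammar requires.

```python
def subtree_nodes(children, v):
    """Nodes w >= v, i.e. v together with everything that follows it."""
    collected, stack = [], [v]
    while stack:
        u = stack.pop()
        collected.append(u)
        stack.extend(children.get(u, []))
    return collected


def order(children, delta, v):
    """o(v) = o(A^{>= v}): sum of the order labels delta over the nodes w >= v.

    Called on the first node, it returns the order o(A) of the whole tree.
    """
    return sum(delta[u] for u in subtree_nodes(children, v))


# Hypothetical rooted tree: first node v0, with subsequent nodes a and b;
# a has two subsequent end-nodes c and d; b plays the role of a fruit.
children = {"v0": ["a", "b"], "a": ["c", "d"]}
delta = {"v0": 1, "a": 0, "b": 2, "c": 1, "d": 1}

tree_order = order(children, delta, "v0")
order_of_a = order(children, delta, "a")
```

Here the whole tree has order $1 + 0 + 2 + 1 + 1 = 5$, while the node `a` has order $2$, the sum of the labels on the subtree that starts from it.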
Given a tree $A \in \mathring{\mathcal{T}}$ and one of its nodes $v$ we call $A^{\ge v}$ the tree composed of the nodes greater or equal to $v$; if $A^{\ge v}$ is not a fruit then it is not admissible as it carries a label $j$ in the first node. In such case, we conventionally set $A^{\ge v} \in \mathring{\mathcal{T}}$ by setting a mark $J(v) = j_v$, $h(v, t) = 1$ on $v$ and subsequently "forgetting" the label $j_v$. It is easily seen that $o(A) > 0$ for all $A \in \mathcal{T}$ and that
\[ \mathcal{T}^k \equiv \{ A \in \mathcal{T} \text{ such that } o(A) = k \} \]
is a finite set; clearly the same is true in $\mathring{\mathcal{T}}$ and in $\mathcal{A}$.

Notational Convention 2. In all our sets an apex $k$ means we consider the subset of trees of order $k$.

We list here the subsets of $\mathring{\mathcal{T}}$ and $\mathcal{A}$ that we will need in the following sections.

Definition 4.13. (a) $\mathcal{A}_j^a$ ($\mathcal{T}_j^a$) with $j = 0, \dots, n$, $a = 0, 1$, is the subset of $\mathcal{A}$ ($\mathring{\mathcal{T}}$) such that $\mathcal{M}(A) \equiv \{v_A\}$ and $J(v_A) = j$, $h(v_A, t) = x_j^a(t)$.
(b) $\mathcal{A}_{ij}^{ab}$ ($\mathcal{T}_{ij}^{ab}$), with $i, j = 0, \dots, n$, $a, b = 0, 1$, is the subset of $\mathcal{A}$ ($\mathring{\mathcal{T}}$) such that $\mathcal{M}(A) \equiv \{v_A, v\}$ for some $v \in A$; moreover $J(v_A) = i$, $h(v_A, t) = x_i^a$, $J(v) = j$, $h(v, t) = x_j^b$.

Given a set $S$ one can consider a vector space on $\mathbb{Q}$ generated by formal linear combinations of the elements of the set; we represent it by $\mathcal{V}(S)$.

Definition 4.14. $\mathcal{V}(S)$ is the vector space of linear combinations of elements of $S$ with rational coefficients:
\[ [A] \in S \to [A] \in \mathcal{V}(S), \qquad [A], [B] \in \mathcal{V}(S) \to q_1[A] + q_2[B] \in \mathcal{V}(S), \quad \forall\, q_1, q_2 \in \mathbb{Q}. \]

We construct $\mathcal{V}(S)$ for the sets in Definition 4.13; we obtain infinite dimensional vector spaces that can be expressed as direct sum of finite dimensional spaces generated by the sets $S^k$ (we call these spaces $\mathcal{V}^k(S)$).$^t$

$^t$ We are using the fact that the sets are disjoint union of the corresponding "fixed order" sets $S^k$.

Definition 4.15. In particular, we will be interested in the following vectors:
\[ f^k = \frac{1}{k} \sum_{\substack{A \in (\overset{m}{\mathcal{A}})^k \\ \delta_{v_A} = 1}} \frac{A}{|S(A)|}, \qquad \Lambda_i^{ak} = \sum_{A \in (\mathcal{T}_i^a)^k} \frac{A}{|S(A)|}, \]
\[ f_i^{ak} = \sum_{A \in (\mathcal{A}_i^a)^k} \frac{A}{|S(A)|}, \qquad f_{ij}^{abk} = \sum_{A \in \mathcal{A}_{ij}^{abk}} \frac{A}{|S(A)|}, \]
where the sum over $A \in S^k$ means choosing one representative $A$ for each equivalence class (diagram) of the set $S^k$. Clearly the vectors are determined only up to isomorphisms. The same vectors without the apex $k$ will represent the formal series:$^{u}$ $V = \sum_{k=1}^{\infty} V^k$.

4.3. Values of trees

We link the vectors defined in Definition 4.15 to the dynamics by defining an appropriate tree “value” $V(A)$, where $A \in T$. This definition can be extended to diagrams provided that $V(A) = V(B)$ if $A$ and $B$ are isomorphic; moreover we can uniquely extend $V$ to a linear function on $V(T)$. The presentation is very schematic, as these definitions can be found in [3] and following papers; let us only write the $F_j^k$ explicitly (using well known formulas on the derivatives of composite functions), with $e_j$ the vectors of the canonical basis:
\[ F_j^k = -\sum_{\delta=0,1}\ \sum_{\vec m \in \mathbb{N}_0^n} \bigl(\nabla^{\vec m + e_j} f^{\delta}(t)\bigr) \sum_{\{p_j^h\}_{\vec m,k-\delta}}\ \prod_{j=0}^{n}\prod_{h=1}^{k-1} \frac{1}{p_j^h!}\,(\psi_j^h)^{p_j^h}, \]
where $\{p_j^h\}_{\vec m,k}$ is a list of numbers in $\mathbb{N}_0 \equiv \mathbb{N} \cup \{0\}$ which respect the relations
\[ \sum_{h} p_j^h = m_j, \qquad \sum_{j,h} h\,p_j^h = k ; \]
finally we define
\[ \nabla^{\vec m} f(t) = \Bigl[\prod_{j=0}^{n} \partial_{\psi_j}^{m_j} f(\psi)\Bigr]_{\psi_i=\varphi_i+\omega_i t,\ \psi_0=q_0(t)} . \]
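So we define
\[ V_\varphi(A) = \prod_{v>v_0} \bigl(\Im_{\tau_w^+} + \Im_{\tau_w^-}\bigr)\,\Psi_\varphi(A), \]
where
\[ \Psi_\varphi(A) = \prod_{v\in \mathring{A},\ v>v_0} w_{j_v}(\tau_w,\tau_v) \prod_{v\in\mathring{A}} \Bigl[-\tfrac12\, a_{j_v}\mu^{\delta_v}\nabla^{\sum_{j=0}^{n} m_v(j)e_j} f^{\delta_v} \prod_{\beta\in M(v)} h_\beta(v,\tau_v) \prod_{\alpha\in F(v)} x_{j_\alpha}^{[i_\alpha]}\Bigr] \times \prod_{\alpha\in F(A)} G_{j(\alpha)}^{o(\alpha),i(\alpha)} . \]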
$F(v)$ are the fruits stemming from $v$, $M(v)$ is the list of markings of the node $v$, $w$ is the node preceding $v$, and finally $m_v(j)$ is the number of elements in $\{v, s_0(v), F(v), M(v)\}$ having angle label (or angle marking) equal to $j$. We write $s_0(v), F(v)$ instead of $s(v)$ to stress that the fruits are not considered proper nodes. Notice that $\Psi_\varphi(A)$ contains the kernels of the integral operators $Q_j$, so that $V$ is obtained by “integrating” over the times $\tau_v$, $v > v_0$; clearly the integrations must be performed in the correct order, starting from the end-nodes. The following proposition

u Remember that the apex $k$ is NOT an exponent.
June 19, 2003 16:13 WSPC/148-RMP
366
00165
M. Procesi
is standard (it is proved in [3] for numbered trees instead of equivalence classes); we sketch the proof in the Appendix.

Proposition 4.16. The value of the splitting vectors $G_j^{ik}(\varphi)$ is
\[ G_j^{ik}(\varphi) = \Im V_\varphi(\Lambda_j^{ik}) . \]
The value of the homoclinic trajectory $\psi_j^k$ is
\[ (\mu)^k \psi_j^k(t,\varphi) = \bigl(\Im_{t^+} + \Im_{t^-}\bigr)\, w_j(t,\tau_0)\, V_\varphi(\Lambda_j^{ik}) + \sum_{a=0,1} x_j^{[a]}\, G_j^{ak} . \]
Definition 4.17 (Equivalent trees). We are mainly interested in the splitting vectors and splitting matrix, so we will consider two trees to be equal if they have the same value in the computation of the $G_j^a$:
\[ A \cong B \quad\text{iff}\quad \Im V_\varphi(A) = \Im V_\varphi(B) \quad \forall\, \varphi \in \mathbb{T}^n ; \]
such an identity may hold only for some initial data $\bar\varphi$, in which case we write $(A \cong B)_{\bar\varphi}$.

4.4. Tree identities

4.4.1. Mark adding functions

We can define linear functions$^{v}$ on $V(T)$; for instance we can add markings to a tree. Given $A \in \overset{0}{T}$, the symbol $h(v,t)\partial_l^v A$ represents the application of an angle-marking $J(v) = l$ and a function-marking $h(v,t)$ in the node $v$; formally
\[ \bigl(A, \{v_A\}, \{v_i\}_{i=1}^m, \{w_j\}_{j=1}^h\bigr) \to \bigl(A, \{v_A\}, \{v_i\}_{i=1}^m, \{\{w_j\}_{j=1}^h \cup \{v\}\}\bigr) ; \]
notice that given two nodes $v, w$ in the same orbit $[v]$, $\partial_l^v A$ is isomorphic to $\partial_l^w A$. We can define the linear function:
\[ M_j(h(t))[A] := \sum_{v\in\mathring{A}} h(v,t)\,\partial_j^v A . \tag{4.1} \]
Particularly interesting mark adding functions are $M_j^b \equiv M_j(x_j^b(t))$.

Lemma 4.18. The vector $f_{ij}^{ab}$ is obtained from $f_i^a$ by the mark adding function:
\[ M_j^b[f_i^a] = f_{ij}^{ab} . \]

v We always define functions $F$ on trees. Then one should verify that $F(A)$ and $F(B)$ are isomorphic if $A$, $B$ are so. This implies that one can uniquely extend the functions to the vector spaces by linearity.
Proof. We need to show that
\[ \sum_{A\in A_i^a}\ \sum_{v\in \mathring{A}} \frac{1}{|S(A)|}\,\partial_j^v A = \sum_{A\in A_i^a}\ \sum_{[v]\in \mathring{A}} \frac{m[v]}{|S(A)|}\,\partial_j^{[v]} A = \sum_{B\in A_{ij}^{ab}} \frac{B}{|S(B)|} ; \]
in the second equality $m[v]$ is the cardinality of the orbit of $v$ and the sum over $[v]$ means we choose one representative from each equivalence class; similarly the symbol $\partial_j^{[v]}$ is the application of the angle marking $j$ to one of the nodes of the orbit $[v]$. We are simply grouping the isomorphic trees $\partial_j^w A$ with $w \in [v]$ and choosing a representative of the equivalence class. Given each tree $B \in A_{ij}^{ab}$ there is one and only one pair $A \in A_i^a$, $[v] \in A$ such that $\partial_j^{[v]} A = B$ (there is a common representative). The symmetry group of $B$ fixes both the marked nodes, so $|S(A)| = m[v]\,|S(B)|$ by the Lagrange theorem.$^{w}$
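The counting step $|S(A)| = m[v]\,|S(B)|$ can be checked on a toy case. The sketch below is only an illustration under simplifying assumptions: we take a hypothetical tree consisting of a first node with three identical end-nodes, so that its symmetry group is the full permutation group of the end-nodes, and we mark one end-node.

```python
from itertools import permutations

# Hypothetical tree A: a first node with three indistinguishable end-nodes 0, 1, 2.
# Its symmetry group S(A) permutes the end-nodes; g[i] is the image of end-node i.
end_nodes = (0, 1, 2)
S_A = list(permutations(end_nodes))

v = 0                                    # the end-node that receives the marking
orbit = {g[v] for g in S_A}              # the orbit [v]; m[v] is its cardinality
S_B = [g for g in S_A if g[v] == v]      # symmetries of the marked tree B fix v

print(len(S_A), len(orbit), len(S_B))    # prints "6 3 2"
```

Here $|S(A)| = 6$, $m[v] = 3$ and $|S(B)| = 2$, in agreement with the orbit counting used in the proof.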
Lemma 4.19. The function $M_j^0$ with $j = 1, \dots, n$ is a function on the values of trees. Given a fruitless tree $A \in A$, the mark-adding function $M_j^0$ with $j = 1, \dots, n$ acts as the derivative with respect to the angle $\varphi_j$:
\[ \partial_{\varphi_j} \Im V_\varphi(A) = \Im V_\varphi(M_j^0[A]) . \]

Proof. Adding an angle marking $j$ to the node $v$ is equivalent to adding $e_j$ to $m_v$ in $\nabla^m f^\delta$, so we add a derivative in $\psi_j$ to the function $f^{\delta_v}(\psi)$, which is to be evaluated at $\psi_j = \varphi_j + \omega_j \tau_v$, $\psi_0 = q_0(t)$. If $j \neq 0$ this is equivalent to applying a $\varphi_j$ derivative to the node $v$. As the dependence on $\varphi$ comes only from the functions $f^1$, we have proved our assertion.

4.4.2. Fruit adding functions

Remark 4.20. Notice that, by our definition of equivalent trees, adding a fruit of order $k$, type $i$ and angle $j$ in the free node $v$ of a tree $A \in \overset{0}{T}$ is equivalent to adding a mark $x_j^{[i]}(t)\partial_j^v$ to the node $v$ and multiplying by the $\varphi$ dependent function $G_j^{ik}$.

As we have seen in Eq. (3.4), the only contributions to $\Delta_{ij}^{0k}$ come from the parts of $G_i^{0k}$ which are at most linear in the $G_l^{mh}$ with $l = 0, \dots, n$; $h < k$, $m = 0, 1$. In tree representation we can say that the only contribution comes from trees with one fruit. So to find the matrices $N^a$ and $n^a$ ($a = 0, 1$) we have to understand how to pass from fruitless trees to trees with one fruit. First of all let us notice that the fruitless contribution to $\Lambda_j^0$ is clearly $f_j^0$, so that
\[ D_{ij}^0 = \Im V_{\varphi=0}\, f_{ij}^{00} . \]

w We refer to the Lagrange theorem, which states that the order of a group $G$ acting on a set $V$ is the order of the orbit of a point $v \in V$ times the order of the subgroup of $G$ which fixes $v$.
Now we can add a fruit $F_j^{ik}$ to the node $v$ of a tree $A \in \overset{0}{T}$ by adding a node $y$ labeled $(i, k, j)$ to the list $F(A)$ and setting $y \in s(v)$; given a tree $A$ we apply this function to each node $v \in A$ and then sum over the nodes $v$. By Remark 4.20 this is equivalent to applying the function $G_j^{ik}(\varphi) M_j^{[i]}$, where $[i] = |i - 1|$, to $A$.

Proof of Proposition 3.2(i). If $j \neq 0$ we can obtain each tree with one fruit by adding the fruit to a node of a fruitless tree as described above, so that by Lemma 4.18,
\[ N_{ij}^{ak} = \Im V_{\varphi=0}(f_{ij}^{0a}) \]
and consequently $N^0 = D^0$.

If $j = 0$ we have trees with one fruit attached to nodes with $\delta_v = 0$, so that detaching the fruit we do not obtain an acceptable tree (the node has only one successive free node). We construct such trees from fruitless ones by using a different function: given a tree $A \in A$ and a node $v \in A$, $v \neq v_A$ and $j_v = 0$, we attach the node $y$ of the tree in Fig. 6 to $v$ and $w$ (by convention the node preceding $v$). Formally we set
\[ l_i(A,v) = E(A) \setminus (w,v) \cup (w,y) \cup (y,v) ; \]
then $G_0^{[i]h}\, l_i(A,v)$ is a tree with one fruit, stemming from $y$ ($\delta_y = 0$), and $y$ has only one successive free node. We apply $l_i(A,v)$ to the nodes of $A$, and set $l_i(A,v) = 0$ if $v = v_A$ or if $j_v \neq 0$:
\[ L_i(A) = \sum_{v\in A} l_i(A,v) ; \]
Fig. 6.

Fig. 7. The fruit adding functions.
notice that this is NOT well defined as a function $A \to A$. However $G_0^{ih}\, l^{[i]}(A)$ is well defined and maps $A \to \overset{0}{T}$, and so we can define the “value” of $L_i(A)$; in the next Subsection we will prove that $L_i(A)$ is equivalent to an acceptable tree.

Lemma 4.21. Calling $\overset{0}{T}{}^{1F}$ the set of trees with one fruit and
\[ \Lambda_j^{0(1F)} = \sum_{A\in \overset{0}{T}{}_j^{1F}} \frac{A}{|S(A)|} , \]
we have that:
\[ \Lambda_j^{0(1F)} = \sum_{l=0,1}\ \sum_{i=0}^{n} G_i^{l}\bigl(M_i^{[l]}(f_j^0) + \delta_{i0}\, L^{[l]}(f_j^0)\bigr) , \]
and consequently
\[ n_j^0 = \Im V_{\varphi=0}\bigl(f_{j0}^{00} + L^0(f_j^0)\bigr) . \]

Proof. Consider a tree $B$ with one fruit, of angle $i$, order $k$ and type $l$, attached to a node $v'$. If such a node has more than one successive node or $\delta_v \neq 0$, then $B$ can be obtained by applying $G_i^{[l]h} x_i^l \partial_i^v$ to a tree $A \in A_j^0$. If the node has $\delta_v = 0$ and only one successive node, then there exists one and only one pair $A, v$ with $A \in A_j^0$, $v$ a node of $A$, such that $G_0^{[l]h}\, l_l(A,[v]) = B$ (as usual the symbol $[v]$ means choosing one representative for the equivalence class). The symmetry group of $B$ fixes both the first node and the fruit (and so consequently all the path joining the fruit to the first node), so if we divide by $G_i^{lk}$ we obtain a tree with two marked nodes whose symmetry group again fixes the first node and the node $v'$ where the fruit was attached; if $v'$ has only one successive free node, say $v$, then that is fixed as well. This proves the proposition since, given $A \in A_j^0$,
\[ L_a(A) = \sum_{[v]\in A} m[v]\, l(A,[v]) \quad\text{and moreover}\quad |S(A)| = m[v]\,|S(l(A,[v]))| . \]
Finally, $n_j^0$ is the linear term in $G_0^1$ in the expansion of $G_j^0$, so it is given by trees with one fruit of angle $j = 0$ and type $l = 0$.

Remark 4.22. As $f^\delta(t) = F(\psi_i(0) + \tilde\omega_i t, \psi_0(t))$ and $\dot\psi_0(t) = -\frac{2}{c}\, x_0^0(t)$, we have that:
\[ \partial_{\tau_v}\nabla^{\vec m} f^\delta(\tau_v) = \sum_{j=1,\dots,n} \omega_j \nabla^{\vec m+e_j} f^\delta(\tau_v) - \frac{2}{c}\, x_0^0\, \nabla^{\vec m+e_0} f^\delta(\tau_v) . \]
For notational convenience we define a symbol$^{x}$ $\partial_t^y A$, where $A$ is a fruitless tree and $y$ is one of its nodes, by setting

x We could define $\partial_t^v(A)$ to be a special marked tree.
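\[ \Psi_\varphi(\partial_t^y A) = \prod_{v\in A,\ v>v_0} w_{j_v}(\tau_w,\tau_v) \prod_{v\in A}\ \prod_{\beta\in M(v)} h_\beta(v,\tau_v) \times \prod_{v\in A,\ v\neq y} \Bigl[-\tfrac12\, a_{j_v}\mu^{\delta_v}\nabla^{\sum_{j=0}^{n} m_v(j)e_j} f^{\delta_v}\Bigr]\ \partial_{\tau_y}\bigl(\nabla^{\sum_{j=0}^{n} m_y(j)e_j} f^{\delta_y}\bigr) . \]
This definition implies that$^{y}$
\[ \sum_{v\in A} \partial_t^v A \cong \sum_{j=1,\dots,n} \omega_j M_j^0(A) - \frac{2}{c}\, M_0^0(A) . \]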
Lemma 4.23. Given an odd function $G \in H_0$ the following relation holds:
\[ \partial_t Q_j^t(G) = Q_j^t\Bigl(\partial_\tau G(\tau) + \delta_{j0}\,\frac{2}{c}\, x_0^0(\tau)\,\partial_0^3 f^0(\tau)\, Q_0^\tau(G)\Bigr) . \]
The proof of this Lemma (proposed in [4]) is straightforward but quite long; we report it in the Appendix.

Lemma 4.24. Given a tree $A \in A_i^0$, $i = 1, \dots, n$, then
\[ V_{\varphi=0}\Bigl(\sum_{v\in A} \partial_t^v A - l_0(A,v)\Bigr) = \partial_t V_{\varphi=0}(A) . \]
Proof. We drop the $\varphi = 0$ in $V$ for notational convenience. The assertion is trivially true for trees with only one node, so we prove it by induction on the order of the trees. Let us define $A_j^h$ as the set of fruitless trees of order $h$ with only one marking, placed on the first node, $J_{v_0} = j$ and $h(v_0, t) = 1$; for $j \neq 0$, $A_j^h \cong A_j^{0h}$. Suppose Lemma 4.24 holds for all trees in $A_j^h$, $h < k$, for $j = 0, \dots, n$; then for$^{z}$ $A \in A_i^{0k}$,
\[ \partial_{\tau_0} V(A) = -\frac12\Bigl(\partial_{\tau_0}\nabla^{\vec m(v_0)} f^{\delta_{v_0}}\Bigr) \prod_{v\in S(v_0)} Q_{j_v}[V(A^{\geq v})] + \sum_{v\in S(v_0)} V(A/v)\,\partial_{\tau_0}\bigl[Q_{j_v} V(A^{\geq v})\bigr] . \]
Now we set $V(A^{\geq v}) = F$ (which is odd when $\varphi = 0$) and apply Lemma 4.23 to $F \in H_0$:
\[ \partial_{\tau_0} Q_{j_v}(F) = Q_{j_v}(\partial_{\tau_v} F) + \delta_{j0}\,\frac{2}{c}\, Q_0\bigl(x_0^0(\tau_y)\,\partial_0^3 f^0(\tau_y)\, Q_0(F)\bigr) ; \]
clearly $\partial_{\tau_v} F = \partial_{\tau_v}[V(A^{\geq v})]$ and
\[ \delta_{j0}\, Q_0\bigl(x_0^0(\tau_y)\,\partial_0^3 f^0(\tau_y)\, Q_0(F)\bigr) = -V(l_0(A,v)) . \]

y Remember that $A \cong B$ means that $\Im V(A) = \Im V(B)$.
z We recall that $V(A) = V(A/v)\, Q_{j_v} V(A^{\geq v})$.
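So we obtain
\[ \partial_{\tau_0} V(A) = V(\partial_t^{v_0} A) - \frac{2}{c}\sum_{v\in S(v_0)} V(l_0(A,v)) + \sum_{v\in S(v_0)} V(A/v)\bigl[Q_{j_v}\partial_{\tau_v} V(A^{\geq v})\bigr] ; \]
by definition $A^{\geq v} \in A_j^h$ for some $j, h$, so we are considering trees of lower order, for which the assertion is true by the inductive hypothesis.

Proof of Proposition 3.2(ii). By Lemma 4.21 we must show that
\[ n_i^0 = \Im V_{\varphi=0}\bigl(f_{i0}^{00} + L^0(f_i^0)\bigr) = \Im V_{\varphi=0}\Bigl(\frac{c}{2}\sum_{j=1}^{n} f_{ij}^{00}\,\omega_j\Bigr) . \tag{4.2} \]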
Now for $j \neq 0$, $\Im\,\partial_t V_{\varphi=0}(f_j^0) = 0$, as the integrand has no constant component. So we can use Lemma 4.24 and Remark 4.22 to obtain Eq. (4.2).

4.4.3. Changing the first node

Another way of manipulating trees is to change the first node (which is distinguishable as it does not have the label $j$). Generally one can obtain various trees in $\overset{0}{T}$ by simply changing the uncolored node (for example one can shift the angle labels down along a path joining any node $v$ to the uncolored one $v_A$). However, not all the trees obtained in such a way are in $T$.

Definition 4.25. Given a tree $A \in \overset{0}{T}$, let $v_A$ be the first node and $v$ a free node; the change of first node $P(A,v) : \overset{0}{T} \to \overset{0}{T}$ is defined as follows. Let $v_A = v_0, v_1, \dots, v_m = v$ be the nodes of the path $P_{v_A,v}$. $P(A,v)$ is obtained from $(A, \{v_A\}, \{v_i\}_{i=1}^m, \{w_j\}_{j=1}^h)$ by shifting only the $j$ labels of the nodes of $P_{v_A,v}$ in the direction of $v_A$. This automatically implies that $v$ is left $j$-uncolored and is the first node of $P(A,v)$. If we obtain a tree not in $T$ we set $P(A,v) = 0$. $P : V(T) \to V(T)$ is the linear function such that, for all $A \in T$, $P(A) = \sum_{v\in \mathring{A}} P(A,v)$.

Lemma 4.26. $P(A,v) = 0$ if and only if $\delta_{v_A} = 0$, $|s(v_A)| = 2$. This means that the possibility of applying the change of first node does not depend on the chosen $v \neq v_A$.

Proof. Consider the trees $A$ and $P(A,v)$ and the nodes $v_A = v_0, v_1, \dots, v_m = v$ of the path $P_{v_A,v}$. For each $i = 0, \dots, m-1$, $v_i$ precedes $v_{i+1}$ in $A$ and follows it in $P(A,v)$. So for each node $w \neq v_A, v$, the number of following nodes $|s(w)|$ is the same in $A$ and $P(A,v)$; $|s(v_A)|$ decreases by one and $|s(v)|$ consequently increases by one. This implies that all trees $A$ with $\delta_{v_A} = 0$ and $|s(v_A)| = 2$ have $P(A,v) = 0$ for all $v$. Moreover if $v_i$ has $\delta = 0$, then it has $j = 0$, as do all the nodes (including $v_{i+1}$) following it. This means that in $P(A,v)$ it will still have $\delta = j = 0$ and the same $|s(v_i)| \geq 2$; moreover $v_{i-1}$, which follows $v_i$ in $P(A,v)$, has $j = 0$.

We will call $\overset{r}{T}$ the trees whose first node can be changed.
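The counting argument in the proof of Lemma 4.26 can be illustrated with a small sketch. Here a tree is stored as a hypothetical dictionary mapping each node to its preceding node (the labels $\delta$, $j$ and the markings are ignored), and `change_first_node` is an illustrative helper that only reverses the orientation of the path from the first node to $v$.

```python
def following_counts(parent):
    """Number of following nodes |s(w)| for each node w of a rooted tree."""
    counts = {v: 0 for v in parent}
    for v, p in parent.items():
        if p is not None:
            counts[p] += 1
    return counts


def change_first_node(parent, v):
    """Make v the first node by reversing the path from the old first node to v."""
    new_parent = dict(parent)
    prev, node = None, v
    while node is not None:
        nxt = parent[node]          # node preceding `node` in the original tree
        new_parent[node] = prev     # in the new tree the orientation is reversed
        prev, node = node, nxt
    return new_parent


# Hypothetical tree: vA -> {v1, u}, v1 -> {v, x}, v -> {y}
A = {"vA": None, "v1": "vA", "u": "vA", "v": "v1", "x": "v1", "y": "v"}
B = change_first_node(A, "v")

before, after = following_counts(A), following_counts(B)
print(before)
print(after)
```

In this example the inner path node `v1` keeps the same number of following nodes, the old first node `vA` loses one and the new first node `v` gains one, as stated in the proof.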
Fig. 8. An example of trees that are equivalent by changing the first node.
Lemma 4.27. By Proposition 2.6(a), we have:
\[ \forall A \in \overset{r}{T},\ \forall v \in A:\quad P(A,v) - A \in \ker \Im V_\varphi , \tag{4.3} \]
\[ \forall A \in \overset{r}{T}_{(j,f)(i,h)}:\quad P_1(A) - A \in \ker \Im V_\varphi . \]

Proof. Notice that given a tree $A$ and one of its nodes $v$, if $w \in P(v_A, v)$ then $P(A,v) = P(P(A,w), v)$, so we only need to prove the assertion for $v \in s(v_A)$. Given $A \in \overset{r}{T}$ and $v \in s(v_A)$ such that $j_v = j$, we compare $\Im V(A)$ and $\Im V(B)$ with $B = P(A,v)$, so $B$ has first node $v$ (no label $j_v$) and a node $v_A$ in $s(v)$ with $j_{v_A} = j$. We have
\[ \Im V(A) = -\frac12\, a_{j_{v_A}} (\mu)^{\delta_{v_A}}\, \Im\, \nabla^{\sum_j m_{v_A}(j)e_j} f^{\delta_{v_A}} \prod_{w\in s(v_A),\ w\neq v} Q_{j_w}[V(A^{\geq w})] \times Q_j\Bigl[(-\mu)^{\delta_v}\nabla^{\sum_j m_v(j)e_j} f^{\delta_v} \prod_{w_1\in s(v)} V(A^{\geq w_1})\Bigr] , \]
which by the symmetry of $Q_j$ is equal to
\[ \Im\, \nabla^{\sum_j m_v(j)e_j} (-\mu)^{\delta_v} f^{\delta_v} \prod_{w_1\in s(v)} V(A^{\geq w_1})\ Q_j\Bigl[(-\mu)^{\delta_{v_A}}\nabla^{\sum_j m_{v_A}(j)e_j} f^{\delta_{v_A}} \prod_{w\in s(v_A),\ w\neq v} Q_{j_w}[V(A^{\geq w})]\Bigr] . \]
This is the value of $B$; namely, both in $A$ and in $B$, $m_v(i)$ with $i \neq j$ is the number of elements in $(s(v), M(v), F(v))$ having label $i$, and $m_v(j) - 1$ is the number of elements in $(s(v), M(v), F(v))$ having label $j$.

Lemma 4.28. For each $i = 1, \dots, n$, we have $f_i^0 \cong M_i^0(f)$.

Proof. The proof of this statement is in [7]; we report it here for completeness. By Lemma 4.27 we have that for $A$ in $A_j^k$,
\[ A \cong \frac{1}{k}\sum_{[v]:\ \delta_v=1} m[v]\, P(A,v) , \]
so
\[ \sum_{A\in A_j^k} \frac{A}{|S(A)|} \cong \frac{1}{k}\sum_{A\in A_j^k}\ \sum_{[v]:\ \delta_v=1} \frac{m[v]}{|S(A)|}\, P(A,v) , \]
now there exists one and only one pair $B \in \overset{m}{A}$, $[v_A] \in B$ such that $\delta_{v_B} = 1$ and $\partial_j^{v_A} B = P(A, v_B)$. Finally, by the Lagrange Theorem, $(m[v_B])^{-1}|S(A)| = (m[v_A])^{-1}|S(B)|$. This completes the proof of Proposition 3.2.

4.5. Upper bounds on the values of trees

Given a fruitless tree $A \in \overset{m}{A}$ of order $k$ (so with at most $2k - 1$ nodes), its value through $\Im V_\varphi^1$ is of the form:
\[ \Bigl(-\frac12\Bigr)^{N(A)} \prod_{v\geq v_0} a_{j_v}\ \Im \prod_{v>v_0}\bigl(\Im_{\tau_w^+}+\Im_{\tau_w^-}\bigr)\,(\mu)^{\delta_{v_0}}\nabla^{\sum_j m_{v_0}(j)e_j} f^{\delta_{v_0}} \times \prod_{v>v_0} (\mu)^{\delta_v}\nabla^{\sum_j m_v(j)e_j} f^{\delta_v}\, w(\tau_w,\tau_v) . \tag{a} \]
We expand $f^1$ in Fourier series in the rotator angles,
\[ f^1(\psi,q) = \sum_{|\nu|=1} e^{i\nu\cdot\psi} f_\nu(q) , \]
so that each node has one more label $\nu_v \in \mathbb{Z}^n$. We will denote by $A(\nu)$ a tree $A$ with labels $\nu_v$ such that
\[ \sum_{v\in A} \nu_v = \nu . \]
In each node $v$ with $\delta = 1$ we have as a factor the function $d^{n_v} f_{\nu_v}(q(t))$, where $n_v = m_v(0)$. The functions $f_\nu(q)$ and $q(t)$ are such that $f_\nu(q(t)) = F_\nu(e^t) \in H_0(a,d)$. Naturally, by our analyticity assumptions, $f_\nu(q(t))$ is bounded for $|t| \to \infty$ in $|\mathrm{Im}\, t| < 2\Pi$. We are considering rational functions $F_\nu(e^t)$; let us call $t_\nu^i$ their (finite number of) poles in $|\mathrm{Im}\, t| \leq \Pi$ (all with $\mathrm{Im}\, t \neq 0$), then
\[ d = \min_{\nu,i} |\mathrm{Im}(t_\nu^i)| ; \qquad a = \max_{\nu,i} |\mathrm{Re}(t_\nu^i)| . \tag{4.4} \]
Moreover the following proposition holds.

Lemma 4.29. The functions $\partial_0^k f_\nu(q(t)) = F_\nu^k(e^t)$ respect the bound:
\[ \max_{t\in C(a+2,\,d-\sqrt{\varepsilon})} |F_\nu^k(e^t)| \leq C\,k!\,\varepsilon^{-\frac{p+k}{2}} . \tag{4.5} \]

Proof. We can use Cauchy estimates on $\partial_0^k f_\nu(q)$ provided that the images in the $q$ variables of $C(a+2, d-\sqrt{\varepsilon})$ and of $C(a+1, d-\frac12\sqrt{\varepsilon})$ via the function $q_0(t)^{-1}$ have distance of the order of $c\sqrt{\varepsilon}$ for some order one $c$. This can be verified by direct computation or proved using simple geometric arguments.
P Having fixed ν = v νv , in integral (a) we shift the integration to R + iσ(ων )d0 √ where d0 < d (we will then fix d0 = d − ε to obtain optimal estimates and d0 = c ≤ d/2 to obtain simply exponentially small estimates), ων = ω · ν and σ(x) is the sign of x. As the functions are all analytic in |Im(t)| ≤ d0 the integral (a) is unchanged. Notice that in integral (a) we cannot choose the sign of the shift in the single node integrals and so we need to work in the (symmetric) domains C(a, d0 ) to guarantee the indifference of extending in the lower or upper half-plane. To simplify the notation we set 0
σ(ων ) = + and define E(d0 , ν) = e−|ων |d . If A has k nodes with δ = 1, let {νv }kν be the lists of k vectors νv ∈ Zn such that m P νv = ν. The value of A(ν) (tree A ∈ A with total frequency ν) in integral (a) is: N (A) I Z ∞ X Y dRv0 1 mv (s) (iν ) − dτv0 e−σ(τv0 )Rv0 eiν·ϕ E(d0 , ν) vs 2iπRv 2 −∞ 0 k {νv }ν
s=1,...,n δv =1 ,v≥v0
× [dnv0 fνδvv0 (q(τv0 + id0 ))]eiωv τv0
Y I
v>v0
0
dRv 2iπRv
× e−σ(τv )Rv (τv +id ) wjv (τw + id0 , τv + id0 )
Y
Z
τw
dτv + −∞
Z
τw
dτv ∞
[dnv fνδv (q(τv + id0 ))]eiωv τv ;
(a)
v≥v0
naturally fν0 = 0 for all non-zero ν. As usual w is the node preceding v, mv (s) is the number of nodes in the list v, s(v) with label j = s, n(v) the number of those with label j = 0 and ωv = ωνv . The residues in R are introduced by using the Definition 2.8. The factors s (iνvs )mv come only from nodes with δv = 1 and their product is bounded by 1. Now we want estimates on the integrals that depend only on the order k; we start by splitting the sums in monomials. (1) Split wj (τw + id, τv + id) into 6 terms if j = 0 or 2 terms if j 6= 0: so we obtain 63k−1 terms. Each of this terms is of the form 0
0
h −l 0 τvh x−l v y(xv )τw xw y (xw ) ,
where xv = e−|τv | , 0 ≤ h, h0 , l0 , l ≤ 1 and both y(x), y 0 (x) are analytic in |x| ≤ 1 (we will call this the limited x dependent part of the Wronskian). R τw Rτ (2) Separate −∞ dτv + ∞w dτv , and =dτv0 in integral (a). We get other 2k terms like ! Z τw |s(v)|+2 Y Y I dRv dτv e−σ(τv )Rv (τv +id) eiωv τv (τv )hv xlv yvj (xv ) , 2iπRv ρv ∞ j=1 v≥v0
where 0 ≤ lv , hv ≤ |s(v)| + 1. Notice that ρv is not the sign of τv but an extra label. The functions yvj are chosen in the following way:
(i) One of the yvj is either coming from ∂0nv f 0 , (i.e. it is in the list cos(mq(τv +id)), sin(mq(τv + id)) with m = 1, 2) or is one of the Fνkv . (ii) One is the limited xv dependent part of a term from the Wronskian at the node v. (iii) For each node v 0 following v there is one function yvj which is the xv dependent part of a term coming from the Wronskian w(τv , τv0 ). Notice that the functions y are by definition in H(a, d) and respect condition 4.29. Rτ 0 R0 R0 (3) Given a node v ∈ s(v0 ) split the integral ρvv∞ dτv as ρv ∞ dτv − ρv ∞ dτv + 0 R τv0 2k+1 terms). We consider ρv0 ∞ dτv and proceed recursively for all nodes (other 3 Rτ first the contributions from the term with ρvw ∞ dτv for all nodes (the others will 0 be expressed as products of the same kind of integrals). Set ρv0 = −1, we want to estimate: ! Z τw |s(v)|+2 Y Y I dRv Rv (τv +id) iωv τv hv −lv v dτv e e (τv ) xv yj (τv ) . I− (A) = 2iπRv −∞ j=1 v≥v0
(4.6) R −a0 R 0 Finally we split the first integral −∞ = −∞ + −a0 , where a0 > 0 is suitably Q P large (a0 = a + 2 log 2). We set yjv (τv ) = r=0 yjv,r xr and C{rv } = v yjvrv . The integral is X Y ∂ nv Y Z τw a0 Im = Res C{rv } dτv (eRv (τv +id)+Ev τv eiωv τv xrvv ) (4.7) hv ∂E −∞ v v v {r } R0
v
with τw0 = −a0 . Starting from the end-nodes we now perform the integrals in dτv then the derivatives in Ev and finally the residues in Rv , we do this first for all the end-nodes and then proceed to the inner nodes hierarchically. Lemma 4.30. Integral (4.7) produces the bounds " !# Y |s(v)|+2 Y X v,h −m 2τ +2 k a0 h C1 |yj ||x0 | Im ≤ ε (m!) ; v
j=1
h
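$x_0 = e^{-a_0}$, $m$ is the number of nodes ($\leq 2k-1$), $|s(v)|$ the number of nodes following $v$ and $C_1$ is some order one constant. Finally $\tau$ is the Diophantine exponent of $\omega/\sqrt{\varepsilon}$:
\[ |\omega\cdot n| > \varepsilon^{\frac12}\,\gamma\,|n|^{-\tau} \quad \text{for some } \gamma = O_\varepsilon(1) . \]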
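To make the Diophantine condition concrete, the sketch below tests it numerically up to a finite cutoff. The function name, the use of the $\ell^1$ norm for $|n|$, and the sample frequency vector $\sqrt{\varepsilon}\,(1, \text{golden mean})$ with $\gamma = 0.38$, $\tau = 1$ are assumptions chosen for illustration; a finite check of course cannot prove the condition for all $n$.

```python
import math
from itertools import product


def satisfies_diophantine(omega, eps, gamma, tau, cutoff):
    """Check |omega . n| >= sqrt(eps) * gamma * |n|^(-tau) for 0 < |n| <= cutoff.

    |n| is taken to be the l1 norm of the integer vector n (an assumption).
    """
    rng = range(-cutoff, cutoff + 1)
    for n in product(rng, repeat=len(omega)):
        norm = sum(abs(c) for c in n)
        if norm == 0 or norm > cutoff:
            continue
        dot = sum(w * c for w, c in zip(omega, n))
        if abs(dot) < math.sqrt(eps) * gamma * norm ** (-tau):
            return False
    return True


eps = 1e-4
golden = (1 + math.sqrt(5)) / 2
omega_good = (math.sqrt(eps) * 1.0, math.sqrt(eps) * golden)   # omega / sqrt(eps) = (1, golden mean)
omega_bad = (math.sqrt(eps) * 1.0, math.sqrt(eps) * 1.5)       # resonant: n = (3, -2) gives omega . n = 0

print(satisfies_diophantine(omega_good, eps, 0.38, 1.0, 30))
print(satisfies_diophantine(omega_bad, eps, 0.38, 1.0, 30))
```

The golden mean vector passes the test for every $n$ up to the cutoff, while the rationally dependent vector fails at $n = (3, -2)$.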
If we choose $a_0 > a$ the series are all convergent (by the analyticity of the $y_j$'s in $x_0$). We choose $x_0 = \frac{e^{-a}}{8}$ and estimate the coefficients of the Taylor series in the ball $|x| \leq e^{-a-2}$:
\[ \sum_{k=0}^{\infty} |y_j^{v,k}|\, x_0^k \leq 4 \max_{|x|\leq 2x_0} \bigl(y_j^v(x)\bigr) . \]
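Proof of Lemma 4.30. This is taken from [3]. The integral
\[ \int_{-\infty}^{t} x^K e^{iA\tau} e^{B\tau}\, d\tau = \frac{x^K e^{(iA+B)t}}{K + B + iA} , \]
so the $E_v$ derivatives in the end-node $v$ give $2^{h_v}$ terms of the form:
\[ h_1^v!\ \frac{x_w^{r_v}\, e^{i d R_v}\, e^{(i\omega_v+R_v)\tau_w}\, (\tau_w)^{h_2^v}}{r_v + R_v + i\omega_v}, \qquad h_1^v + h_2^v = h_v . \tag{4.8} \]
The residue of $R_v^{-1}$ times (4.8) is (4.8) if $|r_v| + |\omega_v| \neq 0$ and
\[ \frac{h_2^v!\,(\tau_w)^{h_1^v}\,(\tau_w + id)^{h_2^v+1}}{(h_2^v+1)!} \qquad \text{if } |r_v| + |\omega_v| = 0 . \]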
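The elementary integral at the beginning of this proof can be checked numerically. The sketch below assumes $x = e^{\tau}$ for $\tau \leq 0$ (consistent with $x_v = e^{-|\tau_v|}$) and $K + \mathrm{Re}\,B > 0$, so that the integrand is $e^{(K+B+iA)\tau}$; the parameter values and the trapezoid rule with a truncated lower limit are illustrative choices.

```python
import cmath


def closed_form(A, B, K, t):
    """x^K e^{(iA+B)t} / (K + B + iA) with x = e^t (assumed, valid for t <= 0)."""
    lam = K + B + 1j * A
    return cmath.exp(lam * t) / lam


def trapezoid(A, B, K, t, lower=-40.0, steps=100_000):
    """Trapezoid approximation of the integral of e^{(K+B+iA) tau} over [lower, t]."""
    lam = K + B + 1j * A
    h = (t - lower) / steps
    total = 0.5 * (cmath.exp(lam * lower) + cmath.exp(lam * t))
    for i in range(1, steps):
        total += cmath.exp(lam * (lower + i * h))
    return total * h


exact = closed_form(2.0, 0.5, 1.0, -1.0)
approx = trapezoid(2.0, 0.5, 1.0, -1.0)
print(abs(exact - approx))
```

The truncation at `lower = -40` is harmless here because the integrand decays like $e^{(K+B)\tau}$ as $\tau \to -\infty$.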
Developing the binomial we obtain other 2hv +1 terms, all of the type ˜
Ghv +1 m!x ¯ rwv eiωv τw (τw )hv . The constant G is the maximum between one (rv 6= 0), (min|ν|≤N |ω · ν|)−1 or ( Π2 ) (we use that d < Π2 ). After integrating all the end-nodes following a node w we can P integrate in dτw a sum of 22 v∈s(w) hv +1 terms of the type ¯
ˆ
¯ r˜w eiΩv τw (τw )h Gh h!x w P P P ¯+ˆ where r˜v = v∈s(w) rv , Ωv = v∈s(w) ωv and h h ≤ v∈s(w) hv + 1. We have proved that the integrals derivatives and residues correspond to calculating the integrands in (4.7) at the limiting point a0 , ignoring the oscillating factors eiΩa0 , substituting the Taylor coefficients with their moduli and multiplying by a factor bounded by 26k−3 (k!)4
max (|ω · ν|)−2τ (2k−1) ≤ C k (k!)4τ +4 .
0<|ν|
Rt We now consider the “left out part” −a0 dτv0 (we will set t = 0 in integral (a)). Let v1 be a node of level one. Rτ We break the integral =τv0 dτv1 as =−a0 dτv1 + −av00 dτv1 . If we choose the first term and m1 is the number of nodes of A≥v1 , the integral on A≥v1 can be Rbounded t a0 by Im and we are left with the problem of bounding the “left out part” −a0 dτv0 1 on the remaining subtree A/v1 . We repeat the procedure hierarchically and we end up with 2m terms of the form: Y Z τw a0 a0 dτv V(ϑ) I m1 · · · I mp v∈ϑ
−a0
P
where the subtree ϑ has m ˜ nodes and m+ ˜ mj = m. We bound the last integral by the maximum of the integrand. Let us now examine the 3m −1 integrals left aside in the analysis of item 3. Starting from the end-nodes we cut off all the subtrees ϑ that contribute a definite integral =0ρ . Such integrals are of the type Iρ (ϑi ) that we have already considered. We are left with an integral again of the type Iρ0 (ϑ0 ) where ϑ0 is the tree deprived of the ϑi . The total number of nodes of the ϑi i = 0, . . . , h is m.
Now we only have to compute the maxima of the $|y_j^v(x)|$, that is, the maxima of the moduli of the terms from the derivatives of $f^0$, from the Wronskian and from all the $F_\nu^k$ in the regions $C(a+2, d')$. To bound the functions $F_\nu^k(e^t)$ we use the fact that $d' = d - \sqrt{\varepsilon}$ and Proposition 4.29(ii). As we are not interested in optimality,$^{aa}$ we will estimate the maximum of a pole of order $k$ by $\varepsilon^{-\frac{k}{2}}$.

Lemma 4.31. The $|y_j^v(\tau_v)|$ not coming from $f^1$ contribute at most a factor $\varepsilon^{-k-2k_0+1}$, where $k_0$ is the number of nodes with $\delta_v = 0$.

Proof. There are $k_0 \leq k - 1$ nodes with $\delta_v = 0$ carrying at most a double pole. Then each of the $k + k_0 - 1$ nodes $v \neq v_0$ carries a summand of
\[ \max_{t\in C(d-\sqrt{\varepsilon},\,a+2)} (|x_j^0|)\ \max_{t\in C(a+2,\,d-\sqrt{\varepsilon})} (|x_j^1|) \]
from the Wronskian. So it is another double pole.

The functions $F_\nu^n$ appear exactly $k$ times. Moreover $n = \sum_{i=1}^{k} n_{v_i}$, so we count each node with $\delta_v = 1$ plus all its successive nodes. As each node with $\delta_v = 0$ has $s(v) \geq 2$,
\[ \sum_{i=1}^{k} n_{v_i} \leq \sum_{v} n_v - 3k_0 = 2k - k_0 - 1 . \]
We can bound the maxima of the $F_\nu^n$ in $C(a+2, d-\sqrt{\varepsilon})$ using Lemma 4.29, so we have a factor $\sqrt{\varepsilon}^{\,-(p+2)k+k_0}$.

Finally we notice that $E(d-\sqrt{\varepsilon}, \nu) \sim E(d, \nu)$ and we sum over all the trees of order $k$ using the known bound:
\[ \sum_{A\in(\overset{m}{A})^k} \frac{\prod_{v\in A,\ \delta_v=1} n(v)!}{|S(A)|} \leq (4n)^k . \]
This proves Proposition 3.3(ii). To prove Proposition 3.3(i) we set $d' = 0$, so we do not have any divergent contribution from the integrals in $(-a+2, a+2)$. Moreover we can add fruits and markings by simply using the mark adding functions; see [3] for full details.

aa Notice that if the perturbing function is not a trigonometric polynomial then we do not approach simultaneously both the singularities of $f^0$ and $f^1$.
Appendix A.

A.1. Proof of Proposition 4.16

We give values to trees recursively; namely, given a tree with fruits $A \in T_j$ we define its value as:
\[ V(A) = -\frac12\, \mu^{\delta_{v_0}} a_j\, \nabla^{\vec m+e_j} f^{\delta_{v_0}}(t) \prod_{v\in s(v_0)} W(A^{\geq v}) , \]
where
\[ W(A) = \bigl(\Im_{t^+} + \Im_{t^-}\bigr)\, w_j(t,\tau)\, V(A) \]
if $A$ is not a fruit, and
\[ W(F_j^{ak}) = x_j^{[a]}(t)\, \Im\, x_j^a\, V(\Lambda_j^k) \]
otherwise. Finally we set $V(\Lambda_j^1) = -\frac12\, \mu\, a_j \nabla^{e_j} f^1(t)$. This is clearly the same function $V$ we defined in Subsec. 4.3. We consider a multi-linear function $\Gamma_j^\delta$ on trees $T_j$, defined as follows: $\Gamma_j^\delta(A_1, \dots, A_n)$ attaches the first nodes of the trees $A_i$ to the tree $\alpha_j^\delta$ with one marked node, $\delta_v = \delta$, $J_v = j$. By convention, if we have $n$ copies of the tree $A$ in the list $\{A_i\}$ we will write $\Gamma_j^\delta(A^n)$. Remembering that
\[ F_j^k = -\sum_{\delta=0,1}\ \sum_{\{p_j^h\}_{\vec m,k-\delta}} \nabla^{\vec m+e_j} f^\delta(t) \prod_{j=0}^{n}\prod_{h=1}^{k-1} \frac{(\psi_j^h)^{p_j^h}}{p_j^h!} . \]
X 1 aj 2
X
δ=0,1 {ph ~ j }m,k−δ
µδ ¯ 1 )p10 , . . . , (Λ ¯ k−1 )pkn −1 ) , Γδ ((Λ 0 n P {phj } j
¯ k = Λk + Λ j j
X
a=0,1
Fjak ,
Q ¯ k ) = ψ k . Let us prove where given a list {ai }, P {ai } = i ai !. By definition, W(Λ j j P k −1 k that Λj = A∈T k |S(A)| A. As in both definitions, Λj is a sum over all trees in j
$T_j^k$. This is equivalent to showing that in the two expressions each tree $A$ has the same coefficient. We proceed by induction, as the statement is trivially true for $\Lambda_j^1$. Given a tree $A \in A_j^k$, let $v_1, \dots, v_m$ be its level one nodes and $A_1, \dots, A_m$ its level one subtrees; we need that
\[ \frac{1}{|S(A)|} = \frac{N(A_1, \dots, A_m)}{P\{p_i^h(A)\}} \prod_{i=1}^{m} \frac{1}{|S(A_i)|} , \]
where $\{p_i^h(A)\}$ is the number of trees $\{A_j\}$ in $A_i^h$ and $N(A_1, \dots, A_m)$ is the number of ways in which one can choose one summand from each $f_0^1, \dots, f_n^{k-1}$ and obtain the unordered list $(A_1, \dots, A_m)$. Now if $m[v_i]$ is the cardinality of the orbit of $v_i$ (so there are $m[v_1]$ subtrees equal to $A_1$, and so on),
\[ N(A_1, \dots, A_m) = \frac{P\{p_i^h(A)\}}{\prod_{[v]} m[v]!} \qquad\text{and}\qquad \prod_{i=1}^{m} \frac{1}{|S(A_i)|} = \prod_{[v]\in s(v_0)} \frac{1}{|S(A^{\geq [v]})|^{m[v]}} . \]
This proves the assertion by the Lagrange Theorem:
\[ |S(A)| = \prod_{[v]\in s(v_0)} m[v]!\ |S(A^{\geq v})|^{m[v]} . \]
A.2. Normal form theorem

We perform a symplectic change of variables that brings Hamiltonian (1.1) into local “normal form”. We will use the standard notations (see [2], [4] or [12], [13]) and the existence of the fast time scale. For systems with one fast time scale, this provides a symplectic change of variables defined in a region $W$ such that $\Pi_I W = O_\varepsilon(1)$, which sends the perturbing terms depending on the fast angle to order $e^{-\frac{1}{\varepsilon^{B}}}$ for some $B(n) < 1$. This will be the basis for proving Arnold diffusion for systems with one fast variable. For completeness we state the theorem for $m$ fast variables. The first step is to set the pendulum in local hyperbolic normal form (see [2]); we obtain the local Hamiltonian:
\[ \frac12 (I, AI) + \sqrt{\varepsilon}\, G(pq, \sqrt{\varepsilon}) + \mu f(p, q, \psi) , \tag{A.1} \]
where the function $G(J, \sqrt{\varepsilon})$ is analytic for $|J| < \tilde k_0^2 \sim \sqrt{\varepsilon}$ and will be written as a Taylor series: $G(J) = \sum_{k\geq 1} J^k G_k$. The perturbing term $f(p, q, \psi)$ is a trigonometric polynomial of degree $N$ in the rotator angles and an analytic function of $p, q \leq k_0$. So we consider the domain:
\[ W(k_0, s_0) \equiv W_0 := \{ |p|, |q| \leq k_0,\ I \in V_0(\varepsilon) \subset \mathbb{C}^n,\ \psi \in \mathbb{T}^n \times (-is_0, is_0) \} , \]
where $V_0(\varepsilon)$ is some $n$-rectangle such that $\Pi_{I_j} V_0(\varepsilon) = O\bigl(\frac{\sqrt{\varepsilon}\,\omega_j}{a_j}\bigr)$.

We write $f$ in Taylor series: $f(p, q, \psi) = \sum f_{\nu,k,h}\, p^k q^h e^{i\nu\cdot\psi}$. For all $s < s_0$, $k < k_0$, we use the weighted norm:
\[ |f|_{k,s} \equiv |f|_{W(k,s)} = \sum e^{s|\nu|}\, |f_{\nu,l,h}|\, k^{2(l+h)} e^{i\nu\cdot\psi} . \]

Definition A.1. Given a sub-lattice $\Lambda \subset \mathbb{Z}^n$ and a point set $D \subset V_0(\varepsilon)$, we say that $D$ is $K$-$\beta$ non-resonant modulo $\Lambda$ if for all $I \in D$,
\[ |\omega(I)\cdot\nu| \geq \beta \quad \forall\, \nu \notin \Lambda \text{ with } |\nu| \leq K . \]
If $\Lambda_0$ is the lattice generated by the $N$ frequencies ($\nu_i \in \mathbb{Z}^n$) of $f$, we set $\Lambda \subset \Lambda_0$ to be the sub-lattice orthogonal to the fast components. We choose a point set $D$ in the following manner: let $P$ be the set of vectors $\omega$ such that$^{bb}$ $\omega\varepsilon^{-\frac12} \in \Omega$ and such that $|\omega_1\cdot\nu_F| \geq \frac{\gamma}{|\nu_F|^{\tau_F}}$ for an order one $\gamma$. Given $r_0 \in \mathbb{R}^+$, the domain $D(r_0)$ is a thickening of $P$ such that for all $I \in D(r_0)$ there exists $\omega \in P$ such that
\[ |AI - \omega| \leq \varepsilon^{\alpha+\frac12}\, r_0 \]
for $r_0 < R$; in the following we will set $b = \frac12 + \alpha$.

bb Recall that we are now working on Hamiltonian (1.1), so that all quantities must be appropriately rescaled by $\sqrt{\varepsilon}$.
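Lemma A.2. $D_0 \equiv D(r_0)$ is $\beta$-$K$ non-resonant modulo $\Lambda$ with
\[ K = \Bigl(\frac{\gamma}{4R}\,\varepsilon^{-b}\Bigr)^{\frac{1}{1+\tau_F}}, \qquad \beta = (\gamma)^{\frac{1}{1+\tau_F}}\, (4R\varepsilon^{b})^{\frac{\tau_F}{1+\tau_F}} . \]

Proof. Given $I \in D(r_0)$, $\omega(I) = AI$ is $\varepsilon^b r_0$-close to an $\omega \in P$, so
\[ |\omega(I)\cdot\nu| \geq |\omega_1\cdot\nu_F| - \bigl(\varepsilon^b|\omega_2||\nu| + \varepsilon^b r_0|\nu|\bigr) \]
with $r < |\omega_2| < R$. Thus we set
\[ \varepsilon^b|\omega_2|\,\gamma^{-1}|\nu|^{\tau_F+1},\quad \varepsilon^b r_0\,\gamma^{-1}|\nu|^{\tau_F+1} < \frac14 . \]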
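As a numerical sanity check of Lemma A.2, the sketch below evaluates $K$ and $\beta$, reading the exponents in the statement as $K = (\gamma\varepsilon^{-b}/(4R))^{1/(1+\tau_F)}$ and $\beta = \gamma^{1/(1+\tau_F)}(4R\varepsilon^{b})^{\tau_F/(1+\tau_F)}$ (this reading is an assumption), and verifies the relation $\beta = \gamma K^{-\tau_F}$ that it implies. The function name and the sample values are hypothetical.

```python
import math


def nonresonance_parameters(gamma, R, eps, b, tau_F):
    """Return (K, beta) under the assumed reading of Lemma A.2."""
    K = (gamma * eps ** (-b) / (4 * R)) ** (1 / (1 + tau_F))
    beta = gamma ** (1 / (1 + tau_F)) * (4 * R * eps ** b) ** (tau_F / (1 + tau_F))
    return K, beta


gamma, R, eps, b, tau_F = 1.0, 2.0, 1e-4, 1.0, 1.0   # illustrative values only
K, beta = nonresonance_parameters(gamma, R, eps, b, tau_F)
print(K, beta)

# With this reading, beta is the Diophantine lower bound gamma / |nu|^tau_F at |nu| = K.
print(math.isclose(beta, gamma / K ** tau_F))
```

With these values $K = \sqrt{1250}$ and $\beta = \sqrt{8\cdot 10^{-4}}$, so a smaller $\varepsilon$ gives a larger cutoff $K$ and a smaller $\beta$.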
We construct an analytic symplectic transformation (µ-close to identity) of the form: Id + µS(I 0 , p0 , ψ, q) = Id +
X
µl
|ν|≤lN
1
X
(l)
Sν,k,h (p0 )k q h eiν·ψ ,
ν6=Λ
that brings the Hamiltonian A.1 in the normal formcc √ √ K (I 0 , AI 0 ) + εG1 (pq, ε) + µg1 (ψS0 , I 0 , p0 , q 0 , ε, µ) + µ N f1 (ψ 0 , I 0 , ε, µ) in a suitable domain D0 (r1 ) × Tns1 × Bk21 , where D0 (r) = D(r) ∩ {I : ∃ω ∈ P such that |an In − ωn | ≤ r0 ε} . The Hamilton–Jacobi equations are √ √ 1 µAI 0 · Sψ + µ2 |ASψ |2 + εG(qp0 + µqSq )) = εG1 (p0 q + p0 Sp0 , µ) 2 + µg1 (ψS + µSI 0 , I 0 , p0 , q + µSp0 , ε, µ) − µf (p0 + Sq , q, ψ) + o(µK )
(A.2)
and we assume that we can find some domain D 0 (r) × Tns × Bk2 such that the functions in A.2 are evaluated inside their domain of analyticity. We will call ΠΛ the natural projection on functions NOT depending on the fast angles: ΠΛ f (ψ, p, q) = g(ψS , p, q) and ΠJ the natural projection on functions depending only on J = pq: X X F = Fν,k,h pk q h eiν·Φ ΠJ F = F0,h,h (pq)h .
We are looking for a symplectic transformation such that (ΠΛ )S = 0. We will solve the Hamilton–Jacobi equations recursively and determine the functions G1 (J, µ) = cc The
separation between the integrable G1 and the non-integrable g1 is kept only because we will eventually set up a KAM scheme for the slow variables, so we need to estimate the size of the integrable part.
µi G1 (J; i) and µg1 (ψS , I, p, q, µ) = leads todd P
i≥0
G1 (J, 0) = G(J) ,
1 G1 (J, 1) = √ ΠJ f , ε
P
i≥1
381
µi g1 (ψS , I, p, q; i). The first order
g1 (ψ1S , I 0 , p0 , q 0 , 1) = (ΠΛ − ΠJ )f ,
fν,k,h √ . · ν] + (k − h) εGJ (p0 q) √ The term i[I 0 · ν] + (k − h) εGJ (0) = D(ν, k, h) is the “small denominator” that in our case (i.e. up to order K N ) admits the lower bound D(ν, k, h) ≥ β provided 0 0 that I ∈ D (r0 ). The higher order terms are determined recursively; we set µS
Sν,k,h = −
i[I 0
l
P∞
m
0
0
the remaining resonant terms are in µg1 = m=1 µ g1 (ψs , I , p , q; m): √ 1 2
(l)
the terms of order µl such that ν 6= Λ fix the value of Sν,k,h . We expand the Taylor series only in this expression. The symbol {ki }rk means the set of vectors in Nr such Pr Pr that i=1 ki = k, while {νi }rν is the set of r vectors in Zn such that i=1 νi = ν. " X 1 (m) 1 (l−m) (l) (ν (1) , Aν (2) ) S S Sν,k,h = − D(ν, k, h) (1) (2) 2 ν (1) ,k1 ,h1 ν (2) ,k−k1 ,h−h1 ν
+
l √ X ε
+ν
X
=ν
r≥2 {ki }rk ,{hi }rh , {li }rl ,{νi }rν
+
X
X
r≥1 {ki }rk+r ,{hi }rh , {li }rl−1 ,{νi }rν1
dd Notice
1 r (l ) ∂ G(p0 q)Πri=1 Sνii,ki ,hi hi r! J
1 r (l ) ∂ 0 f (p0 , q, Ψ)Πri=1 ki Sνii,ki ,hi r! p
that the pendulum and rotator terms cannot cancel each other, this is a consequence of the locality of our analysis.
M. Procesi l−2 l−m X √ X − ε
X
m=0 r≥2 {ki }r ,{hi }r , k h {li }rl−m ,{νi }rν
−
l−1 X
X
X
1 r (l ) ∂ G1 (p0 q; m)Πri=1 Sνii,ki ,hi ki r! J
l−m X l−m−r X
X
X
s s r m=1 ka +kb =k, la +lb =l−m, r≥0 s≥0 {ki }r ka +r ,{hi }ha {kj }kb ,{hj }hb , r+s≥1 {li }r ,{νi }r s s ha +hb =h νa +νb =ν {li }l ,{νi }ν νa la b
×
b
1 r s (l ) (l ) ∂ ∂ g1 (ψS , I 0 , p0 , q; m)Πri=1 ki (Sνii,ki ,hi Πsj=1 ∇I Sνji,kj ,hj r!s! q ψS
(A.3)
To avoid proliferation of symbols we set $\max(|f|_0, |G|_0) = E_0$ and choose $r_0 > 1$ so that $r_0\varepsilon^{b} \ge r_0\varepsilon \equiv \lambda_0 > k_0^2$. Finally we call $b_j = b$ if $j = 1, \dots, n-1$ and $b_n = 1$.

Proposition A.3. Consider the nested domains $D_l \equiv D'(r_l) \times \mathbb{T}^n_{s_l} \times B^2_{k_l}$, where $r_l = \frac12 r_0 e^{-l\xi}$, $s_l = s_0(1 + l\xi)$ and $k_l = \frac12 k_0 e^{-l\xi}$; the following bounds hold (ee):
\[ |S^{(l)}_{\nu,k,h}|_l \le C_1 (l-1)!\,B^{l-1}, \quad |G_1(J, l)|_l \le C_2 (l-1)!\,B^{l-1}, \quad |g_1(\psi_S, I', p', q; l)|_l \le C_3 (l-1)!\,B^{l-1}, \]
with $C_1 = E_0/\beta$, $C_2 = C_3 = E_0$ and $B = c\,E_0^2/(\beta^2 k_0^4 \xi^2)$ for some small enough order one $c$. Moreover the so defined transformation is a biholomorphism $D_K \to D_0$ provided that $\xi = s_0/(4K)$, $\mu B K < 1$. Thus the system can be written in normal form for
\[ \mu < \frac{\beta^2 k_0^4 \xi^2}{K^3} \tag{A.4} \]
in the domain $D(r) \times \mathbb{T}^n_s \times B^2_k$, with $r = \frac12 r_0 e^{-s_0/4}$, $k = \frac12 k_0 e^{-s_0/4}$ and $s = s_0/4$.

(ee) By $|f|_l$ we mean $|f|_{D_l}$.

Remark A.4. Notice that for systems with one fast time scale, the domain P coincides with the whole $W(k, s_0/2)$, as all one-dimensional vectors of norm one are Diophantine with order one $\gamma$. Moreover in this case $\beta = O(1)$ as well, so if we choose $K = c/\sqrt{\varepsilon}$, the bound on $\mu$ is $\mu \le \varepsilon^{5/2}$.

Remark A.5. Notice that if we choose $K = O_\varepsilon(1)$, we can perform some steps of the normal form theory for $\mu < \varepsilon$.

Proof. We proceed by induction, using the analyticity assumptions on $G$ and $f$. We assume that the desired bounds hold for all $l < m$ and that $G_1(J, l)$ and $g_1(\psi_S, I', p', q, l)$ are analytic in $D_{m-1}$. This implies that the transformations
\[ I = I' + \mu S^{<m}_{\psi}, \quad \psi' = \psi + \mu S^{<m}_{I'}, \quad p = p' + \mu S^{<m}_{q}, \quad q' = q + \mu S^{<m}_{p'} \]
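are well defined and map $D_m \to D_0$ if
\[ \max\bigl(|\mu S^{<m}_{q}|_m, |\mu S^{<m}_{p'}|_m\bigr) \le \tfrac14 k_m, \quad |\mu S^{<m}_{I'}|_s \le \tfrac14 s_0, \quad |\mu S^{<m}_{\psi_j}|_m \le \tfrac14 r_m \varepsilon^{b_j}, \quad |\mu S^{<m}_{\psi, I'}|_m < 1 . \]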
Substituting the bounds in these inequalities (and using Cauchy estimates for the derivatives) we obtain the constraint $\mu \max\bigl(\frac{8C_1}{k_0^2 \xi}, \frac{8C_1}{\lambda_0 \xi^2}\bigr) < 1$, provided that $\mu K B \le \frac12$. Having verified the analyticity of the transformation up to order $m$, we use analytic bounds on $G$, $G_1$ and $g_1$ and the assumed bounds on the lower orders to bound $G_1(J; m)$, $S^{(m)}$ and $g_1(\psi_S, I', p', q; m)$. We repeatedly use the inequality
\[ \sum_{\{k_i \ge 1\}_{i=1}^{a}:\ \sum_i k_i = k}\ \prod_{i=1}^{a} (k_i - 1)! \le (k-1)! \]
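The combinatorial inequality above can be checked by brute force for small values. The following is a minimal sketch in Python; the function names `composition_sum` and `inequality_holds` are illustrative and do not come from the paper, and the check covers only the finite range that is enumerated.

```python
from itertools import product
from math import factorial


def composition_sum(k, a):
    """Sum of prod_i (k_i - 1)! over all (k_1, ..., k_a) with k_i >= 1 and sum k_i = k."""
    total = 0
    for ks in product(range(1, k + 1), repeat=a):
        if sum(ks) == k:
            term = 1
            for ki in ks:
                term *= factorial(ki - 1)
            total += term
    return total


def inequality_holds(k_max):
    """Check composition_sum(k, a) <= (k - 1)! for all 1 <= a <= k <= k_max."""
    return all(
        composition_sum(k, a) <= factorial(k - 1)
        for k in range(1, k_max + 1)
        for a in range(1, k + 1)
    )
```

For instance, for k = 4 and a = 2 the compositions (1,3), (2,2), (3,1) contribute 2 + 1 + 2 = 5, which is below 3! = 6.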
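Let us first consider $S^{(m)}$; it is composed of five sums. In each we substitute the Cauchy estimates and the bounds coming from the inductive hypothesis.

(1) The sum of quadratic terms is bounded by $(k-1)!\,B^{k-1}\,\dfrac{C_1^2}{s_0^2 \xi^2 \beta B}$.

(2) The terms due to $G$ are bounded by
\[ \frac{\sqrt{\varepsilon} E_0}{\beta}\,(m-1)!\,B^{m} \sum_{r \ge 2} \Bigl(\frac{4C_1}{k_0^2 \xi B}\Bigr)^{r} \le \frac{8\sqrt{\varepsilon} E_0 C_1^2}{k_0^4 \xi^2 \beta B}\,(m-1)!\,B^{m-1}, \]
provided that $\frac{4C_1}{k_0^2 \xi B} < \frac12$.

(3) The terms due to $f$ are bounded by
\[ \frac{E_0}{\beta}\,(m-1)!\,B^{m-1} \sum_{r \ge 1} \Bigl(\frac{2C_1}{k_0^2 \xi B}\Bigr)^{r} \le \frac{4E_0 C_1}{k_0^2 \xi \beta B}\,(m-1)!\,B^{m-1}, \]
provided that $\frac{2C_1}{k_0^2 \xi B} < \frac12$.

(4) The terms due to $G_1$ have the same bound as (2) if we fix $C_2 = E_0$.

(5) If we fix $C_3 = E_0$ as well, the terms due to $g_1$ are bounded by
\[ \frac{E_0}{\beta}\,(m-1)!\,B^{m-1} \sum_{r \ge 0}\ \sum_{s \ge 0,\ r+s \ge 1} \Bigl(\frac{2C_1}{k_0^2 \xi B}\Bigr)^{r} \Bigl(\frac{2C_1}{\lambda_0 \xi B}\Bigr)^{s} \le \frac{4C_1 E_0}{\beta k_0^2 \xi B}\,(m-1)!\,B^{m-1}, \]
provided that $\frac{2C_1}{\lambda_0 \xi B} \le \frac{2C_1}{k_0^2 \xi B} < \frac12$.

These five bounds must all be set $< \frac15 C_1$. It is easily seen that, as $b \le 1$ and $\lambda_0 \ge k_0^2$, all the desired bounds are implied by $\max\bigl(\frac{8C_1}{\lambda_0 \xi^2}, \frac{8\sqrt{\varepsilon} E_0 C_1}{k_0^4 \xi^2 \beta B}\bigr) \le \frac15$. Now we discuss the bounds on $G_1$ and $g_1$. There are always the same five terms times a factor $\beta/\sqrt{\varepsilon}$ for $G_1$ and $\beta$ for $g_1$. So all the bounds are verified if $\frac{E_0 C_1}{k_0^4 \xi^2 \beta B} \le c \ll 1$. We fix $C_1 = E_0/\beta$, as this comes from the first order, and $B = c\,\frac{E_0^2}{k_0^4 \xi^2 \beta^2}$.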
A.3. Proof of Lemma 4.23 Proof. We define an operator Oj [g] := Qj (G) +
1 X i [i] x =(xj G) 2 i=0,1 j
j (G) so that ψjk = Oj (Fjk ). We consider the vector V = ∂tO . By the definition of (Oj (G)) Oj , it is a solution of V˙ = Lj V + G where Lj is the 2 × 2 matrix: 0 Lj = δj0 g(q 0 (t))
We derive with respect to t:
1 , 0
g(ψ0 ) = −∂ψ2 0 f 0 (ψ0 ) .
˙ . V¨ = Lj V˙ + (L˙ j )V + G) The first line of the solution V˙ is ˙ ∂t (Oj (G)) = Oj (−δj0 q˙0 (t)∂03 f 0 (t)O0 (G) + G) plus the first component of a solution of the homogeneous equation t → W (t)X that we determine via the initial data. Ojt (F ) is zero for t = 0, and the initial datum is determined by the boundedness condition ∂t (Oj (G))|t=0 = =0 (x0j G), so 2 δj0 x00 (t)∂03 f 0 (t)O0 (G) + G˙ + x0j (t)=0 (x0j G) ∂t (Oj (G)) = Oj c and as G is odd we can substitute Qj (G) = Oj (G). σ(t)x1 (t) 0 Next we notice that the vectors W i = xx˙ 00 (t) , σ(t)x˙ 01 (t) are solutions of the 0 (t) 0 ˙ = L0 W . So we apply the time derivative and obtainff system W x˙ i0 =
2 Q0 (xi0 x00 ∂03 f 0 (t)) + δi1 σ(t)x00 . c
(A.5)
The last term is added to have the right behavior in t = 0 (dt σ(t)x10 |0 = 1). 2 δj0 x00 (t)∂03 f 0 (t)O0 (G) + G˙ + x0j (t)=0 (x0j G) Oj c 2 = Qj δj0 x00 (t)∂03 f 0 (t)Q0 (G) + G˙ c 1 X i [i] 2 0 3 0 ˙ δj0 x0 (t)∂0 f (t)Q0 (G) + G + x0j (t)=0 (x0j G) . (A.6) xj =xj + 2 i c ff We
are using the fact that O0 (σ(τ )F ) = σ(t)O0 (F ).
The last two sums cancel each other via relation A.5 and Proposition 2.6(i) and (iv), for j = 0, and using the fact that if j 6= 0, then x˙ 0j = 0 and x˙ 1j = σ(t): 1 X i [i] 2 xj =xj δj0 x00 (t)∂03 f 0 (t)Q0 (G) + G˙ + x0j (t)=0 (x0j G) 2 i c =
1X i 2 [i] δj0 xj (t)x00 (t)∂03 f 0 (t) xj =GQ0 2 i c 1 [i] − =(x˙ j G) + x0j (t)=0 (x0j G) . 2
(A.7)
References
[1] V. I. Arnold, Instability of dynamical systems with several degrees of freedom, Sov. Math. Dokl. 5 (1964), pp. 581–5.
[2] L. Chierchia and G. Gallavotti, Drift and diffusion in phase space, Annales de l'IHP, Section Physique Théorique 60 (1994), pp. 1–144; see also the Erratum 68 (1998), 135.
[3] G. Gallavotti, Twistless KAM tori, quasi flat homoclinic intersections and other cancellations in the perturbation series of certain completely integrable Hamiltonian systems. A Review, Rev. Math. Phys. 6 (1994), 343–411.
[4] G. Gallavotti, G. Gentile and V. Mastropietro, Separatrix splitting for systems with three time scales, Comm. Math. Phys. 202 (1999), 197–236.
[5] L. Chierchia, Arnold instability for nearly-integrable analytic Hamiltonian systems, Proc. Workshop "Variational and Local Methods in the Study of Hamiltonian Systems", A. Ambrosetti, G. F. Dell'Antonio eds. Trieste, 1994.
[6] G. Gallavotti, G. Gentile and V. Mastropietro, Melnikov approximation dominance. Some examples, Rev. Math. Phys. 11 (1999), 451–461.
[7] G. Gallavotti, Reminiscences on Science at the I.H.E.S. A Problem on Homoclinic Theory and a Brief Review, Les relations entre la mathematique et la physique theorique, pp. 99–117; I.H.E.S. 1998.
[8] M. Berti and P. Bolle, A functional analysis approach to Arnold diffusion, Annales de l'IHP, Section Analyse Non-Lineaire 19, 4 (2002), pp. 795–811.
[9] G. Gallavotti, G. Gentile and V. Mastropietro, Hamilton–Jacobi equation and existence of heteroclinic chains in three time scales systems, Nonlinearity 13 (2000), 323–340.
[10] C. Godsil and G. Royle, Algebraic Graph Theory, Springer-Verlag (Graduate texts in mathematics, 207), Berlin, Heidelberg, New York, 2001.
[11] S. Lang, Algebra, Addison Wesley Publishing Co. 1993.
[12] G. Benettin and G. Gallavotti, Stability of Motions Near Resonances in Quasi-Integrable Systems, J. Statist. Phys. 44 (1986), n. 3–4, pp. 293–338.
[13] J. Pöschel, Nekhoroshev estimates for quasi-convex Hamiltonian systems, Math. Z. 213 (1993), 187–216.
[14] V. I. Arnold, Small denominators and problems of stability of motion in classical and celestial mechanics, Russ. Math. Surveys 18:6 (1963), pp. 85–191.
[15] M. Berti and P. Bolle, Fast Arnold diffusion in systems with three time scales, Discrete and Continuous Dynamical Systems, Series A 8, 3 (2002), pp. 795–811.
[16] U. Bessi, An approach to Arnold diffusion through the calculus of variations, Nonlinear Analysis T. M. A. 26 (1996), pp. 1115–35.
[17] U. Bessi, L. Chierchia and E. Valdinoci, Upper bounds on Arnold diffusion time via Mather theory, J. Math. Pures Appl. 80-1 (2001), pp. 105–129.
[18] B. Bollobas, Graph Theory, Springer-Verlag (Graduate texts in mathematics, 63), Berlin, Heidelberg, New York, 1979.
[19] L. Chierchia and E. Valdinoci, A note on the construction of Hamiltonian trajectories along heteroclinic chains, Forum Math. 12, 2 (2000), 247–254.
[20] V. I. Arnold, Dynamical Systems 3, Encyclopedia of Math. Sci. 3 (1963).
[21] A. Delshams, V. G. Gelfreich, V. G. Jorba and T. M. Seara, Exponentially small splitting of separatrices under fast quasi-periodic forcing, Comm. Math. Phys. 150 (1997), 35–71.
[22] G. Gallavotti, G. Gentile and V. Mastropietro, Lindstedt Series and Hamilton–Jacobi Equation for Hyperbolic Tori in Three Time Scales Problems, J. Math. Phys. 40 (1999), n. 12, pp. 6430–6472.
[23] G. Gentile, Whiskered tori with prefixed frequencies and Lyapunov spectrum, Dynamic Stability Systems 10, 3 (1995), 269–308.
[24] V. Gelfreich, Melnikov method and exponentially small splitting of separatrices, Physica D 101 (1997), pp. 227–248.
[25] P. Holmes and J. Marsden, Melnikov method and Arnold diffusion for perturbations of integrable Hamiltonian systems, J. Math. Phys. 23 (1982), pp. 669–675.
[26] P. Lochak, J. P. Marco and D. Sauzin, On the Splitting of Invariant Manifolds in Multidimensional Hamiltonian Systems, Memoirs of the A.M.S. n. 775, Providence R. I. 2003.
[27] H. Poincaré, Les Méthodes Nouvelles de la Mécanique Céleste, Vol. I-III Paris, 1892, 93, 99.
[28] J. P. Serre, Trees, Springer Verlag, 1980.
[29] E. Valdinoci, Families of whiskered tori for a priori stable/unstable Hamiltonian systems and construction of unstable orbits, Math. Phys. Electr. 6 (2000).
June 19, 2003 16:22 WSPC/148-RMP
00168
Reviews in Mathematical Physics Vol. 15, No. 4 (2003) 387–423 © World Scientific Publishing Company
ENHANCED BINDING IN A GENERAL CLASS OF QUANTUM FIELD MODELS
ASAO ARAI∗ and HIROYUKI KAWANO†
Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan
∗[email protected]
†[email protected]
Received 22 November 2002
Revised 24 February 2003

We consider, in an abstract form, a system of "quantum particles" coupled to a Bose field. It is shown that, under suitable hypotheses, the composed system can have a ground state even if the uncoupled particle system has no ground state.

Keywords: Non-relativistic quantum field theory; particle-field interaction; Pauli–Fierz model; ground state; enhanced binding; Fock space.
1. Introduction

In a quantum system with a coupling parameter λ ∈ R, it may occur that the Hamiltonian of the system has a ground state for a nonzero λ even if it has no ground state at zero coupling, λ = 0. If such a phenomenon occurs, we call it enhanced binding in the quantum system under consideration. A typical example is a quantum mechanical system whose Hamiltonian is given by the Schrödinger operator $H_S(\lambda) := -\Delta + \lambda V$ on $L^2(\mathbb{R}^d)$, where ∆ is the d-dimensional generalized Laplacian and $V : \mathbb{R}^d \to \mathbb{R}$ is a potential. Indeed, it is well known that $H_S(0)$ has no ground state, but, for a general class of V, $H_S(\lambda)$ with λ ≠ 0 has a ground state (e.g. [17]). As a next stage, it is interesting to investigate whether enhanced binding occurs in a quantum system of particles coupled to a quantum field.

Recently the study of enhanced binding in nonrelativistic quantum electrodynamics (QED) was initiated by Hiroshima and Spohn [14] in the case of the Pauli–Fierz model in the dipole approximation, and then by Hainzl, Vougalter and Vugalter [11] in the case of the Pauli–Fierz model without the dipole approximation. In [14] it is shown that, under suitable conditions, enhanced binding occurs for large coupling constants. On the other hand, in [11], enhanced binding is shown to occur for small coupling constants and for a class of potentials. The results and methods in [11] have been extended to the Pauli–Fierz model with spin [8, 9] (see also [10]).
In this paper we consider enhanced binding in an abstract model of "quantum particles" coupled to a multi-component Bose field. We prove that, under suitable hypotheses, enhanced binding occurs in this model. This suggests that enhanced binding in quantum particle-field interaction systems is a general phenomenon, although it may depend on the type of interactions.

The present paper is organized as follows. In Sec. 2 we describe the model considered and state the main results. The model is essentially the same as that discussed in the previous papers [4–6] except that the Bose field is a multi-component one. In Sec. 3 we prove the self-adjointness of the total Hamiltonian of the model, where we present a method different from the one used in [4, 5]. In considering the problem of enhanced binding in the model, we distinguish two cases: the case where the Bose field is massive and the one where the Bose field is massless but without infrared singularity. We first prove the existence of enhanced binding in the massive case. This is done in Sec. 4. Section 5 is devoted to the proof of the existence of enhanced binding in the massless case. In the last section we apply these general results to the Pauli–Fierz type model without the $A^2$-term in the dipole approximation. In particular, we show that, if the regime of momenta of photons interacting with the quantum particle becomes sufficiently large with an infrared cutoff fixed, then the model has a ground state at least for the coupling constant in some bounded open interval even if the unperturbed particle Hamiltonian has no ground state.

The present paper has two appendices. In Appendix A, we formulate, in an abstract form, the weak differentiability of Heisenberg type operators. In Appendix B, we establish, in an abstract framework, a theorem on the existence of a ground state of a self-adjoint operator and a limit theorem of ground states. Each theorem clarifies a general structure underlying methods used in proofs of existence of ground states in nonrelativistic QED [7, 12, 13]. These theorems may also be interesting in their own right in the spectral theory of self-adjoint operators.

2. Definition of the Model and the Main Results

We consider, in an abstract form, a model of a quantum system S coupled to a multi-component Bose field. We denote the Hilbert space of the system S by $\mathcal{H}$, which is taken to be an arbitrary separable complex Hilbert space. In concrete realizations, S may be a system of quantum particles or a quantum field system.

In general we denote the inner product and the norm of a Hilbert space $\mathcal{X}$ by $\langle\cdot,\cdot\rangle_{\mathcal{X}}$ and $\|\cdot\|_{\mathcal{X}}$ respectively, where we use the convention that the inner product is antilinear (respectively linear) in the first (respectively second) variable. If there is no danger of confusion, then we omit the subscript $\mathcal{X}$ in $\langle\cdot,\cdot\rangle_{\mathcal{X}}$ and $\|\cdot\|_{\mathcal{X}}$. For a linear operator T on a Hilbert space, we denote its domain by D(T). For a subspace D ⊂ D(T), T|D denotes the restriction of T to D. If T is densely defined, then the adjoint of T is denoted $T^*$. For linear operators S and T on a Hilbert space, D(S + T) := D(S) ∩ D(T) unless otherwise stated. For a self-adjoint operator S on a Hilbert space, we denote its spectrum (respectively essential spectrum) by σ(S) (respectively $\sigma_{\rm ess}(S)$) and its spectral measure
by $E_S(\cdot)$. If S is bounded from below, then we set
\[ E_0(S) := \inf \sigma(S), \tag{2.1} \]
the ground state energy of S. We say that S has a ground state if $E_0(S)$ is an eigenvalue of S; in this case, each nonzero vector in $\ker(S - E_0(S))$ is called a ground state of S.

To describe the Bose field, one uses the Boson Fock space over a separable complex Hilbert space $\mathcal{X}$:
\[ \mathcal{F}_b(\mathcal{X}) := \bigoplus_{n=0}^{\infty} \otimes_s^n \mathcal{X} = \Bigl\{ \psi = \{\psi^{(n)}\}_{n=0}^{\infty} \,\Big|\, n \ge 0,\ \psi^{(n)} \in \otimes_s^n \mathcal{X},\ \sum_{n=0}^{\infty} \|\psi^{(n)}\|^2 < \infty \Bigr\}, \tag{2.2} \]
where $\otimes_s^n \mathcal{X}$ denotes the n-fold symmetric tensor product of $\mathcal{X}$ with $\otimes_s^0 \mathcal{X} := \mathbb{C}$ (the set of complex numbers).

As is well known [16, §X.7], one of the basic objects on $\mathcal{F}_b(\mathcal{X})$ is the annihilation operator $a(f)$ ($f \in \mathcal{X}$), which is a densely defined closed linear operator on $\mathcal{F}_b(\mathcal{X})$ such that, for all $\psi = \{\psi^{(n)}\}_{n=0}^{\infty} \in D(a(f)^*)$, $(a(f)^*\psi)^{(0)} = 0$ and
\[ (a(f)^*\psi)^{(n)} = \sqrt{n}\, S_n(f \otimes \psi^{(n-1)}), \quad n \ge 1, \]
where $S_n$ is the symmetrization operator on $\otimes^n \mathcal{X}$ ($S_n^* = S_n$, $S_n^2 = S_n$, $\otimes_s^n \mathcal{X} = S_n(\otimes^n \mathcal{X})$). The adjoint $a(f)^*$, called the creation operator, and the annihilation operator $a(g)$ ($g \in \mathcal{X}$) obey the canonical commutation relations
\[ [a(f), a(g)^*] = \langle f, g\rangle, \quad [a(f), a(g)] = 0, \quad [a(f)^*, a(g)^*] = 0 \tag{2.3} \]
for all $f, g \in \mathcal{X}$ on the dense subspace
\[ \mathcal{F}_0(\mathcal{X}) := \{\psi \in \mathcal{F}_b(\mathcal{X}) \mid \text{there exists a number } n_0 \text{ such that } \psi^{(n)} = 0 \text{ for all } n \ge n_0\}, \tag{2.4} \]
where $[X, Y] := XY - YX$. Let
\[ \phi(f) := \frac{a(f) + a(f)^*}{\sqrt{2}}, \quad f \in \mathcal{X}, \tag{2.5} \]
which is called the Segal field operator. It is shown that $\phi(f)$ is essentially self-adjoint on $\mathcal{F}_0(\mathcal{X})$ [16, §X.7]. We denote its closure by the same symbol $\phi(f)$. It follows from (2.3) that, for all $f, g \in \mathcal{X}$,
\[ [\phi(f), \phi(g)] = i\,\Im\langle f, g\rangle \tag{2.6} \]
on $\mathcal{F}_0(\mathcal{X})$. Moreover we have
\[ e^{i\phi(f)} e^{i\phi(g)} = e^{-i\Im\langle f, g\rangle} e^{i\phi(g)} e^{i\phi(f)}, \quad f, g \in \mathcal{X}, \tag{2.7} \]
which are called the Weyl relations of $\{\phi(f) \mid f \in \mathcal{X}\}$ [16, §X.7].
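As an illustration of (2.3) and (2.6), the relations can be checked numerically in the simplest case of a one-dimensional one-particle space, where $f$ and $g$ are complex numbers and $\langle f, g\rangle = \bar{f} g$. The sketch below is only a finite-dimensional approximation: the cutoff `dim` and the helper names `annihilation`, `segal` and `commutator` are assumptions of the example, not objects from the paper, and the relations are compared only on the states below the cutoff, since truncation spoils them on the top state.

```python
import numpy as np


def annihilation(dim):
    """One-mode annihilation operator on span{|0>, ..., |dim-1>} (truncated)."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


def segal(f, dim):
    """phi(f) = (a(f) + a(f)^*) / sqrt(2), with a(f) = conj(f) * a (antilinear in f)."""
    af = np.conj(f) * annihilation(dim)
    return (af + af.conj().T) / np.sqrt(2)


def commutator(x, y):
    return x @ y - y @ x


dim = 30                       # Fock-space cutoff (assumption of this sketch)
f, g = 0.7 + 0.2j, -0.3 + 1.1j
a = annihilation(dim)
low = slice(0, dim - 1)        # states unaffected by the truncation

# (2.3): [a(f), a(g)^*] = <f, g> on the low-lying block
ccr = commutator(np.conj(f) * a, (np.conj(g) * a).conj().T)
ccr_error = np.abs(ccr[low, low] - np.conj(f) * g * np.eye(dim - 1)).max()

# (2.6): [phi(f), phi(g)] = i Im <f, g> on the low-lying block
field = commutator(segal(f, dim), segal(g, dim))
field_error = np.abs(
    field[low, low] - 1j * np.imag(np.conj(f) * g) * np.eye(dim - 1)
).max()
```

Both error quantities are expected to be at the level of floating-point rounding.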
For every self-adjoint operator S on $\mathcal{X}$, one can define a self-adjoint operator dΓ(S), called the second quantization of S ([15, p. 302], [16, §X.7]), by
\[ d\Gamma(S) := \bigoplus_{n=0}^{\infty} S^{(n)}, \tag{2.8} \]
with $S^{(0)} = 0$ and $S^{(n)}$ the closure of
\[ \Bigl( \sum_{j=1}^{n} I \otimes \cdots \otimes \overset{j\text{th}}{S} \otimes \cdots \otimes I \Bigr) \Big|_{\otimes^n_{\rm alg} D(S)}, \]
where I denotes the identity and $\otimes^n_{\rm alg}$ the algebraic tensor product. If S is nonnegative, then so is dΓ(S).

We assume that the Bose field is an N-component quantum field over $\mathbb{R}^d$ (d, N ∈ N). Hence the one-boson Hilbert space is taken to be
\[ \mathcal{W} := \bigoplus^{N} L^2(\mathbb{R}^d) \tag{2.9} \]
(the N-fold direct sum of $L^2(\mathbb{R}^d)$) and the Hilbert space of the Bose field is taken to be $\mathcal{F}_b(\mathcal{W})$.

Let ω be a Borel measurable function on $\mathbb{R}^d$ such that 0 < ω(k) < ∞ for a.e. $k \in \mathbb{R}^d$ with respect to (w.r.t.) the Lebesgue measure on $\mathbb{R}^d$. Then ω defines a multiplication operator on $\mathcal{W}$, which is nonnegative, injective and self-adjoint. We denote it by the same symbol ω ($\omega f := (\omega f_1, \dots, \omega f_N)$, $f = (f_1, \dots, f_N) \in \mathcal{W}$ with $f_i \in D(\omega)$, i = 1, ..., N). The function ω represents a dispersion relation of one free boson associated with the Bose field under consideration. The free Hamiltonian of the Bose field is defined by
\[ H_b := d\Gamma(\omega) \tag{2.10} \]
acting on $\mathcal{F}_b(\mathcal{W})$. The Hilbert space of the coupled system of S and the Bose field is given by the tensor product
\[ \mathcal{F} := \mathcal{H} \otimes \mathcal{F}_b(\mathcal{W}). \tag{2.11} \]
Let A be a self-adjoint operator on $\mathcal{H}$, which denotes physically the Hamiltonian of the quantum system S, and let $B_j$ (j = 1, ..., J, J ∈ N) be a self-adjoint operator on $\mathcal{H}$ such that $\bigcap_{j=1}^{J} D(B_j)$ is dense in $\mathcal{H}$. Let $g_j \in \mathcal{W}$, j = 1, ..., J. As a total Hamiltonian of the coupled system, we take the following operator:
\[ H(\lambda) := A \otimes I + I \otimes H_b + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_j), \tag{2.12} \]
where λ ∈ R is a constant parameter denoting the coupling constant of the system S and the Bose field system. The Hamiltonian H(λ) gives a unification of Hamiltonians of some particle-field interaction models (cf. [4, 5]).

In the previous papers [4, 5], the existence of a ground state of H(λ) with N = 1 is discussed under the assumption that A has a ground state (hence H(0) has a
ground state). In the present paper, we consider the problem of enhanced binding on the model, i.e. the problem whether or not H(λ) with λ ≠ 0 has a ground state even if A has no ground state (hence H(0) has no ground state). We show that, under suitable hypotheses, the problem is solved affirmatively. For results on the problem of enhanced binding on the Pauli–Fierz model in nonrelativistic QED, see [8, 9, 10, 11, 14]. The method taken in the present paper is similar to that used in [14], but we do not need such scalings as done in [14], at least on the level of a general theory.

We now formulate basic hypotheses. To do this, we first recall an important notion on commutativity of self-adjoint operators:

Definition 2.1. We say that two self-adjoint operators T and S on a Hilbert space strongly commute (or T strongly commutes with S) if their spectral measures commute, i.e. for all Borel sets $I_1, I_2 \subset \mathbb{R}$, $E_T(I_1)E_S(I_2) = E_S(I_2)E_T(I_1)$. A family of self-adjoint operators $\{S_j\}_{j=1}^{n}$ on a Hilbert space is said to be strongly commuting if $S_j$ strongly commutes with $S_l$ for all j, l = 1, ..., n with j ≠ l.

In what follows, we assume that A is of the form
\[ A = A_0 + A_1 \tag{2.13} \]
with $A_0$ a nonnegative self-adjoint operator and $A_1$ a symmetric operator.

Hypothesis I. $g_j, g_j/\omega^{3/2} \in \mathcal{W}$ (j = 1, ..., J) and $\langle g_j(k), g_l(k)\rangle_{\mathbb{C}^N} \in \mathbb{R}$, a.e. k, j, l = 1, ..., J.

Hypothesis II. The operator $A_1$ is $A_0$-bounded, i.e. $D(A_0) \subset D(A_1)$ and there exist constants a, b ≥ 0 such that, for all $u \in D(A_0)$,
\[ \|A_1 u\| \le a\|A_0 u\| + b\|u\| . \tag{2.14} \]

Hypothesis III. The operator $A_0$ strongly commutes with each $B_j$ (j = 1, ..., J) and
\[ D(A_0) \subset \bigcap_{j,l=1}^{J} D(B_j B_l) . \tag{2.15} \]
Moreover, there exist constants $c_j, d_j \ge 0$ such that, for all $u \in D(A_0^{1/2})$,
\[ \|B_j u\| \le c_j \|A_0^{1/2} u\| + d_j \|u\| \quad (j = 1, \dots, J) . \tag{2.16} \]

Hypothesis IV. The set $\{B_j\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators.

Hypothesis V. $D(A_0) \subset \bigcap_{j=1}^{J} D(B_j A_1) \cap D(A_1 B_j)$ and $[B_j, A_1]|_{D(A_0)}$ is bounded (j = 1, ..., J). We denote the operator norm of $[B_j, A_1]$ by $\|[B_j, A_1]\|$.

We introduce an operator
\[ R_B := \frac12 \sum_{j,l=1}^{J} \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle_{\mathcal{W}} B_j B_l \tag{2.17} \]
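and define
\[ A(\lambda) := A - \lambda^2 R_B . \tag{2.18} \]
Under Hypotheses I–III, we have $D(A(\lambda)) = D(A_0)$. Let
\[ \Lambda := \{\lambda \in \mathbb{R} \setminus \{0\} \mid A(\lambda) \text{ is self-adjoint and bounded from below}\} . \tag{2.19} \]

Hypothesis VI. Λ ≠ ∅.

Remark 2.1. Assume Hypotheses I–III and suppose that
\[ a + \frac{\lambda^2}{2} \sum_{j,l=1}^{J} \Bigl| \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle \Bigr| c_j c_l < 1 . \tag{2.20} \]
Then Hypothesis VI holds. This is proved as follows. By Hypothesis III, we can show that
\[ \|B_j B_l u\| \le c_j c_l \|A_0 u\| + (c_j d_l + c_l d_j)\|A_0^{1/2} u\| + d_j d_l \|u\| , \quad u \in D(A_0) . \tag{2.21} \]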
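The step from Hypothesis III to (2.21) is not spelled out in the text. One way to obtain it, sketched here under the assumption that the strong commutativity of $A_0$ and $B_l$ gives $B_l u \in D(A_0^{1/2})$ and $A_0^{1/2} B_l u = B_l A_0^{1/2} u$ for $u \in D(A_0)$, is to apply (2.16) twice:

```latex
\begin{align*}
\|B_j B_l u\|
  &\le c_j \|A_0^{1/2} B_l u\| + d_j \|B_l u\|
     && \text{(2.16) applied to } B_l u \\
  &= c_j \|B_l A_0^{1/2} u\| + d_j \|B_l u\|
     && \text{strong commutativity} \\
  &\le c_j \bigl( c_l \|A_0 u\| + d_l \|A_0^{1/2} u\| \bigr)
     + d_j \bigl( c_l \|A_0^{1/2} u\| + d_l \|u\| \bigr)
     && \text{(2.16) applied to } A_0^{1/2} u \text{ and } u \\
  &= c_j c_l \|A_0 u\| + (c_j d_l + c_l d_j) \|A_0^{1/2} u\| + d_j d_l \|u\| .
\end{align*}
```

The last line is the right-hand side of (2.21).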
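Hence
\[ \|(A_1 - \lambda^2 R_B)u\| \le \Bigl( a + \frac{\lambda^2}{2} \sum_{j,l=1}^{J} \Bigl| \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle \Bigr| c_j c_l \Bigr) \|A_0 u\| + \lambda^2 \Bigl( \sum_{j,l=1}^{J} \Bigl| \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle \Bigr| c_j d_l \Bigr) \|A_0^{1/2} u\| + \Bigl( b + \frac{\lambda^2}{2} \sum_{j,l=1}^{J} \Bigl| \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle \Bigr| d_j d_l \Bigr) \|u\| . \]
Since $A_0^{1/2}$ is infinitesimally small w.r.t. $A_0$, it follows under condition (2.20) that $A_1 - \lambda^2 R_B$ is relatively bounded w.r.t. $A_0$ with relative bound smaller than 1. Hence, by the Kato–Rellich theorem, $A(\lambda) = A_0 + A_1 - \lambda^2 R_B$ is self-adjoint with $D(A(\lambda)) = D(A_0)$ and bounded from below.

A result on the self-adjointness of H(λ) is given by the following theorem.

Theorem 2.1. Assume Hypotheses I–VI. Then for all λ ∈ Λ, H(λ) is self-adjoint with $D(H(\lambda)) = D(A_0 \otimes I) \cap D(I \otimes H_b)$ and bounded from below.

This theorem is proved in Sec. 3. To establish an existence theorem of a ground state of H(λ) without the assumption that A has a ground state, we need additional conditions.

Hypothesis VII. The function ω is continuous on $\mathbb{R}^d$ with
\[ \lim_{|k| \to \infty} \omega(k) = \infty \]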
and there exist constants γ > 0 and C > 0 such that
\[ |\omega(k) - \omega(k')| \le C|k - k'|^{\gamma}(1 + \omega(k) + \omega(k')) , \quad k, k' \in \mathbb{R}^d . \]
In general the existence of a ground state of H(λ) may depend on whether
\[ m := \operatorname*{ess.inf}_{k \in \mathbb{R}^d} \omega(k) \tag{2.22} \]
is positive or zero [6], where ess.inf means essential infimum. We say that the Bose field under consideration is massive (respectively massless) if m > 0 (respectively m = 0). We first establish a theorem on the existence of a ground state of H(λ) in the massive case. For s ≥ 0, we introduce a constant $c_s(g)$ by
\[ c_s(g) := \sqrt{2} \sum_{j=1}^{J} \|[B_j, A_1]\| \, \Bigl\| \frac{g_j}{\omega^{s}} \Bigr\| \tag{2.23} \]
provided that $g_j/\omega^s \in \mathcal{W}$. We set
\[ \Sigma_\lambda := \inf \sigma_{\rm ess}(A(\lambda)) . \tag{2.24} \]

Remark 2.2. If m > 0, then the condition $g_j \in \mathcal{W}$ implies that $g_j/\omega^s \in \mathcal{W}$ for all s > 0. Hence, in this case, Hypothesis I is replaced by the condition that $g_j \in \mathcal{W}$ (j = 1, ..., J) and $\langle g_j(k), g_l(k)\rangle_{\mathbb{C}^N} \in \mathbb{R}$, a.e. k, j, l = 1, ..., J.

Theorem 2.2. (Enhanced binding in the massive case). Consider the case m > 0. Assume Hypotheses I–VII. Suppose that λ ∈ Λ and
\[ \Sigma_\lambda - E_0(A(\lambda)) > m + \frac12 \lambda^2 c_{3/2}(g)^2 + |\lambda| c_1(g) . \tag{2.25} \]
Then H(λ) has purely discrete spectrum in the interval $[E_0(H(\lambda)), E_0(H(\lambda)) + m)$. In particular, H(λ) has a ground state.

Remark 2.3. Condition (2.25) implies that $E_0(A(\lambda))$ is a discrete eigenvalue of A(λ) and hence A(λ) has a finite number of ground states. A new point of Theorem 2.2 is that the existence of a ground state of A is not assumed.

Theorem 2.2 is proved in Sec. 4.

Theorem 2.3. (Enhanced binding in the massless case). Consider the case m = 0. Assume Hypotheses I–VII with $g_j/\omega^2 \in \mathcal{W}$ (j = 1, ..., J) in addition. Suppose that
\[ \Sigma_\lambda - E_0(A(\lambda)) > \frac12 \lambda^2 c_{3/2}(g)^2 + |\lambda| c_1(g) \tag{2.26} \]
and
\[ \frac{\lambda^2 c_1(g)^2}{[\Sigma_\lambda - E_0(H(\lambda))]^2} + \Bigl( \frac{2\lambda^2 c_1(g)^2}{[\Sigma_\lambda - E_0(H(\lambda))]^2} + 1 \Bigr) \frac{\lambda^2 c_2(g)^2}{2} < 1 . \tag{2.27} \]
Then H(λ) has a ground state.

We prove this theorem in Sec. 5.
3. Self-Adjointness of the Total Hamiltonian

Generally speaking, in considering the enhanced binding problem of a quantum field model, it would be desirable to establish the self-adjointness of the Hamiltonian of the model for a wider range of the coupling constant. For this purpose the method used in [4], which employs the Kato–Rellich theorem, is not useful. Here we take another approach, which is used in [1]: we prove Theorem 2.1 by making a suitable unitary transformation of H(λ). We need some lemmas.

Lemma 3.1. Let $\mathcal{X}$ and $\mathcal{Y}$ be Hilbert spaces. Let $\{X_j\}_{j=1}^{J}$ (respectively $\{Y_j\}_{j=1}^{J}$) be a family of strongly commuting self-adjoint operators on $\mathcal{X}$ (respectively $\mathcal{Y}$). Then $\{X_j \otimes Y_j\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators on $\mathcal{X} \otimes \mathcal{Y}$. Moreover, if W (respectively U) is a self-adjoint operator on $\mathcal{X}$ (respectively $\mathcal{Y}$) strongly commuting with each $X_j$ (respectively $Y_j$), then $W \otimes I$ (respectively $I \otimes U$) strongly commutes with each $X_j \otimes Y_j$.

Proof. It is a well known fact that each $X_j \otimes Y_j$ is self-adjoint on $\mathcal{X} \otimes \mathcal{Y}$ (e.g. [15, §VIII.10]). Moreover there exists a two-dimensional spectral measure $E_j$ such that, for all Borel sets $I_1, I_2 \subset \mathbb{R}$, $E_j(I_1 \times I_2) = E_{X_j}(I_1) \otimes E_{Y_j}(I_2)$ and
\[ X_j \otimes I = \int_{\mathbb{R}^2} x \, dE_j(x, y), \quad I \otimes Y_j = \int_{\mathbb{R}^2} y \, dE_j(x, y), \quad f(X_j \otimes Y_j) = \int_{\mathbb{R}^2} f(xy) \, dE_j(x, y) . \]
It follows that, for all Borel sets $K \subset \mathbb{R}$, $E_{X_j \otimes Y_j}(K) = E_j(\{(x, y) \in \mathbb{R}^2 \mid xy \in K\})$. On the other hand, the strong commutativity of the $X_j$'s and that of the $Y_j$'s imply that $\{E_j(\cdot)\}_{j=1}^{J}$ is a family of commuting orthogonal projections. Hence, for all Borel sets $I_1, I_2 \subset \mathbb{R}$ and j, l = 1, ..., J, $E_{X_j \otimes Y_j}(I_1)$ commutes with $E_{X_l \otimes Y_l}(I_2)$. Let W and U be as above. Then $E_W(I_1)E_{X_j}(I_2) = E_{X_j}(I_2)E_W(I_1)$, which implies the commutativity of $E_{W \otimes I}$ and $E_j$. Hence $E_{W \otimes I}$ commutes with $E_{X_j \otimes Y_j}$. Thus $W \otimes I$ strongly commutes with $X_j \otimes Y_j$. Similarly one can show that $I \otimes U$ strongly commutes with $X_j \otimes Y_j$.

The following fact is well known or easy to prove (e.g. [3, Lemma 2.33]).

Lemma 3.2. Let $\{S_j\}_{j=1}^{J}$ be a family of strongly commuting self-adjoint operators on a Hilbert space. Then $S := \sum_{j=1}^{J} S_j$ is essentially self-adjoint and, for all t ∈ R,
\[ e^{it\bar{S}} = \prod_{j=1}^{J} e^{itS_j} , \tag{3.1} \]
where $\bar{S}$ denotes the closure of S and the order of the factors on the right hand side (r.h.s.) is arbitrary.

Lemma 3.3. Let $\mathcal{X}$ be a separable complex Hilbert space and $h_j \in \mathcal{X}$ (j = 1, ..., J) such that $\langle h_j, h_l\rangle_{\mathcal{X}} \in \mathbb{R}$, j, l = 1, ..., J. Then $\{\phi(h_j)\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators on $\mathcal{F}_b(\mathcal{X})$.
Proof. By the present assumption, $\Im\langle h_j, h_l\rangle = 0$. Hence, by the Weyl relations (2.7), $e^{it\phi(h_j)}$ commutes with $e^{is\phi(h_l)}$ for all s, t ∈ R and j, l = 1, ..., J. Hence, by a general criterion [15, Theorem VIII.13], $\phi(h_j)$ strongly commutes with $\phi(h_l)$ (j, l = 1, ..., J).

Let
\[ T_j := B_j \otimes \phi\Bigl( i \frac{g_j}{\omega} \Bigr) . \tag{3.2} \]
Then the operator
\[ T := \sum_{j=1}^{J} T_j \tag{3.3} \]
is a symmetric operator with $D(T) \supset \bigl( \bigcap_{j=1}^{J} D(B_j) \bigr) \otimes_{\rm alg} \mathcal{F}_0(\mathcal{W})$. We denote the closure of T by the same symbol T.

Lemma 3.4. Assume Hypotheses I and IV. Then:
(i) $\{T_j\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators.
(ii) T is essentially self-adjoint on $\bigcap_{j=1}^{J} D(T_j)$ and, for all s ∈ R,
\[ e^{isT} = \prod_{j=1}^{J} e^{isT_j} , \tag{3.4} \]
where the order of the factors on the r.h.s. is arbitrary.

Proof. (i) By Lemmas 3.1, 3.3 and Hypothesis IV, $\{T_j\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators. (ii) By part (i) and Lemma 3.2, T is essentially self-adjoint and (3.4) holds.

Lemma 3.5. Assume Hypotheses I, III and IV. Then T strongly commutes with $A_0 \otimes I$.

Proof. By Lemma 3.1 and Hypothesis III, $A_0 \otimes I$ strongly commutes with each $T_j$, which implies that, for all s, t ∈ R, $e^{itA_0 \otimes I} e^{isT_j} = e^{isT_j} e^{itA_0 \otimes I}$. By this equation and (3.4), $e^{itA_0 \otimes I} e^{isT} = e^{isT} e^{itA_0 \otimes I}$. Hence, by a general criterion [15, Theorem VIII.13], T strongly commutes with $A_0 \otimes I$.

The following fact is well known (e.g. [3, p. 516, Lemma 12-5]).

Lemma 3.6. Let $\mathcal{X}$ be a Hilbert space and S be a nonnegative, injective self-adjoint operator on $\mathcal{X}$. Let g ∈ D(S). Then
\[ e^{i\phi(ig)} D(d\Gamma(S)) = D(d\Gamma(S)) \tag{3.5} \]
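and
\[ e^{i\phi(ig)} \, d\Gamma(S) \, e^{-i\phi(ig)} = d\Gamma(S) + \phi(Sg) + \frac12 \langle g, Sg\rangle . \tag{3.6} \]
Suppose that Hypotheses I and IV hold. Then, by Lemma 3.4, we can define a unitary operator
\[ U(\lambda) := e^{-i\lambda T} . \tag{3.7} \]
We set
\[ L := A_0 \otimes I + I \otimes H_b , \tag{3.8} \]
which is nonnegative. We introduce an operator
\[ \tilde{H}(\lambda) := A(\lambda) \otimes I + I \otimes H_b + \delta A_1(\lambda) , \tag{3.9} \]
where
\[ \delta A_1(\lambda) := U(\lambda)(A_1 \otimes I)U(\lambda)^{-1} - A_1 \otimes I . \tag{3.10} \]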
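The shift formula (3.6) of Lemma 3.6, which drives the unitary transformation U(λ), can be illustrated numerically for a single boson mode. The sketch below assumes a one-dimensional one-particle space with S equal to multiplication by a number `omega > 0` and a real `g`, so that dΓ(S) is `omega` times the number operator; the cutoff `dim` and all variable names are assumptions of the example. Because of the truncation, the two sides are compared only on the lowest states.

```python
import numpy as np

dim = 80                  # Fock-space cutoff (assumption of this sketch)
omega, g = 1.3, 0.4       # one-mode S = omega > 0 and a real g

a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # truncated annihilation operator
number = a.T @ a                               # number operator, so dGamma(S) = omega * number


def segal(f):
    """phi(f) = (a(f) + a(f)^*) / sqrt(2), with a(f) = conj(f) * a (antilinear in f)."""
    af = np.conj(f) * a
    return (af + af.conj().T) / np.sqrt(2)


# W = exp(i * phi(ig)), computed from the spectral decomposition of the Hermitian matrix phi(ig)
eigvals, eigvecs = np.linalg.eigh(segal(1j * g))
W = (eigvecs * np.exp(1j * eigvals)) @ eigvecs.conj().T

lhs = W @ (omega * number) @ W.conj().T
rhs = omega * number + segal(omega * g) + 0.5 * omega * g**2 * np.eye(dim)

low = slice(0, 10)        # compare only on states far below the cutoff
shift_error = np.abs(lhs[low, low] - rhs[low, low]).max()
```

In this one-mode setting $e^{i\phi(ig)}$ is a displacement operator, which is why conjugating the number operator produces a term linear in the field plus a constant.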
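Lemma 3.7. Assume Hypotheses I–IV. Then, for all λ ∈ R,
\[ U(\lambda) D(L) = D(L) \tag{3.11} \]
and, for all Ψ ∈ D(L),
\[ U(\lambda) H(\lambda) U(\lambda)^{-1} \Psi = \tilde{H}(\lambda) \Psi . \tag{3.12} \]

Proof. By Hypothesis IV, there exists a J-dimensional spectral measure E such that, for all Borel sets $I_j \subset \mathbb{R}$ (j = 1, ..., J), $E(I_1 \times \cdots \times I_J) = E_{B_1}(I_1) \cdots E_{B_J}(I_J)$ and $B_j = \int_{\mathbb{R}^J} \xi_j \, dE(\xi)$ ($\xi = (\xi_1, \dots, \xi_J) \in \mathbb{R}^J$). Let $u, v \in D(A_0)$ and $\psi, \varphi \in D(H_b)$. Set Ψ = u ⊗ ψ and Φ = v ⊗ ϕ. Then Ψ, Φ ∈ D(L) and
\[ \langle I \otimes H_b \Psi, U(\lambda)\Phi\rangle = \int_{\mathbb{R}^J} \langle H_b \psi, e^{-i\phi(iG_\xi)} \varphi\rangle \, d\langle u, E(\xi) v\rangle , \]
where $G_\xi = \lambda \sum_{j=1}^{J} \xi_j g_j/\omega \in \mathcal{W}$. By Lemma 3.6, we have
\[ \langle H_b \psi, e^{-i\phi(iG_\xi)} \varphi\rangle = \Bigl\langle \psi, e^{-i\phi(iG_\xi)} \Bigl( H_b + \phi(\omega G_\xi) + \frac12 \langle G_\xi, \omega G_\xi\rangle \Bigr) \varphi \Bigr\rangle . \]
Hence
\[ \langle I \otimes H_b \Psi, U(\lambda)\Phi\rangle = \Bigl\langle \Psi, U(\lambda) \Bigl( I \otimes H_b + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_j) + \lambda^2 R_B \otimes I \Bigr) \Phi \Bigr\rangle . \tag{3.13} \]
This extends to all $\Psi, \Phi \in D(A_0) \otimes_{\rm alg} D(H_b)$. Using the well known estimates
\[ \|a(f)\psi\| \le \Bigl\| \frac{f}{\sqrt{\omega}} \Bigr\| \, \|H_b^{1/2}\psi\| , \tag{3.14} \]
\[ \|a(f)^*\psi\| \le \Bigl\| \frac{f}{\sqrt{\omega}} \Bigr\| \, \|H_b^{1/2}\psi\| + \|f\| \|\psi\| , \quad \psi \in D(H_b^{1/2}),\ f \in D(\omega^{-1/2}) , \tag{3.15} \]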
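and Hypothesis III, we can show that
\[ \|B_j \otimes \phi(g_j)\Psi\| \le C\|(L + 1)\Psi\| , \quad \Psi \in D(L) , \tag{3.16} \]
where C > 0 is a constant. By (2.21), $R_B$ is $A_0$-bounded. Hence
\[ \Bigl\| \Bigl( I \otimes H_b + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_j) + \lambda^2 R_B \otimes I \Bigr) \Xi \Bigr\| \le C' \|(L + 1)\Xi\| , \quad \Xi \in D(L) , \tag{3.17} \]
where C' > 0 is a constant. By this estimate and the fact that $D(A_0) \otimes_{\rm alg} D(H_b)$ is a core for L, (3.13) extends (via a limiting argument) to all Φ ∈ D(L). Moreover, since $D(A_0) \otimes_{\rm alg} D(H_b)$ is a core for $I \otimes H_b$, (3.13) extends also to all $\Psi \in D(I \otimes H_b)$. Thus, for all Φ ∈ D(L), the vector U(λ)Φ is in $D(I \otimes H_b)$ and
\[ I \otimes H_b \, U(\lambda)\Phi = U(\lambda) \Bigl( I \otimes H_b + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_j) + \lambda^2 R_B \otimes I \Bigr) \Phi . \tag{3.18} \]
The strong commutativity of $A_0 \otimes I$ and T (Lemma 3.5) implies that $U(\lambda) D(A_0 \otimes I) \subset D(A_0 \otimes I)$. Thus $U(\lambda) D(L) \subset D(L)$. Since U(λ) is unitary, it follows that $D(L) \subset U(\lambda)^{-1} D(L) = U(-\lambda) D(L)$. Since λ ∈ R is arbitrary, we obtain (3.11). Equation (3.18) implies (3.12).

In view of Lemma 3.7, we first prove the self-adjointness of $\tilde{H}(\lambda)$ (Theorem 3.1 below). We denote by $[B_j, A_1]$ the closure of $[B_j, A_1]|_{D(A_0)}$ which, by Hypothesis V, is bounded with $D([B_j, A_1]) = \mathcal{H}$.

Lemma 3.8. Assume Hypotheses I and V. Then $D(I \otimes H_b^{1/2}) \subset D([B_j, A_1] \otimes \phi(i g_j/\omega))$ and, for all $\Psi \in D(I \otimes H_b^{1/2})$,
\[ \|[B_j, A_1] \otimes \phi(i g_j/\omega)\Psi\| \le \|[B_j, A_1]\| \Bigl( \sqrt{2} \Bigl\| \frac{g_j}{\omega^{3/2}} \Bigr\| \, \|I \otimes H_b^{1/2}\Psi\| + \frac{1}{\sqrt{2}} \Bigl\| \frac{g_j}{\omega} \Bigr\| \, \|\Psi\| \Bigr) . \]

Proof. By (3.14) and (3.15), we have
\[ \|\phi(f)\psi\| \le \sqrt{2} \Bigl\| \frac{f}{\sqrt{\omega}} \Bigr\| \, \|H_b^{1/2}\psi\| + \frac{1}{\sqrt{2}} \|f\| \|\psi\| , \quad f \in D(\omega^{-1/2}),\ \psi \in D(H_b^{1/2}) . \tag{3.19} \]
Hence, for all $\Psi \in D(A_0) \otimes_{\rm alg} D(H_b^{1/2})$,
\[ \|[B_j, A_1] \otimes \phi(i g_j/\omega)\Psi\| \le \|[B_j, A_1]\| \, \|I \otimes \phi(i g_j/\omega)\Psi\| \le \|[B_j, A_1]\| \bigl( \sqrt{2} \|g_j/\omega^{3/2}\| \, \|I \otimes H_b^{1/2}\Psi\| + \|g_j/\omega\| \|\Psi\| / \sqrt{2} \bigr) . \tag{3.20} \]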
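The field estimate for $\|\phi(f)\psi\|$ used in this proof follows from (3.14) and (3.15) by the triangle inequality applied to the definition (2.5). Spelled out as a short worked step (a sketch of the computation, for $f \in D(\omega^{-1/2})$ and $\psi \in D(H_b^{1/2})$):

```latex
\begin{align*}
\|\phi(f)\psi\|
  &\le \frac{1}{\sqrt{2}} \bigl( \|a(f)\psi\| + \|a(f)^*\psi\| \bigr) \\
  &\le \frac{1}{\sqrt{2}} \Bigl( 2 \Bigl\| \frac{f}{\sqrt{\omega}} \Bigr\| \, \|H_b^{1/2}\psi\|
       + \|f\| \, \|\psi\| \Bigr) \\
  &= \sqrt{2} \, \Bigl\| \frac{f}{\sqrt{\omega}} \Bigr\| \, \|H_b^{1/2}\psi\|
       + \frac{1}{\sqrt{2}} \|f\| \, \|\psi\| .
\end{align*}
```

Taking $f = i g_j/\omega$ gives $\|f/\sqrt{\omega}\| = \|g_j/\omega^{3/2}\|$ and $\|f\| = \|g_j/\omega\|$, which are the two norms appearing in Lemma 3.8.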
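Hence (3.19) holds for all Ψ ∈ $D(A_0) \otimes_{\mathrm{alg}} D(H_b^{1/2})$. Since $D(A_0) \otimes_{\mathrm{alg}} D(H_b^{1/2})$ is a core for $I \otimes H_b^{1/2}$, (3.19) extends to all Ψ ∈ $D(I \otimes H_b^{1/2})$, showing $D(I \otimes H_b^{1/2}) \subset D([B_j, A_1] \otimes \phi(ig_j/\omega))$ as well.
We set
$$Y := \sum_{j=1}^{J} [B_j, A_1] \otimes \phi\Bigl( i \frac{g_j}{\omega} \Bigr) . \tag{3.21}$$
By Lemma 3.8, we have $D(I \otimes H_b) \subset D(Y)$.
Lemma 3.9. Assume Hypotheses I–III and Hypothesis V. Then, for all Ψ, Φ ∈ D(L),
$$\langle T\Psi, A_1 \otimes I \Phi \rangle - \langle A_1 \otimes I \Psi, T\Phi \rangle = \langle \Psi, Y\Phi \rangle . \tag{3.22}$$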
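Proof. It is easy to see that, for all Ψ, Φ ∈ $D(A_0) \otimes_{\mathrm{alg}} D(H_b)$, (3.22) holds. By (3.16), T is L-bounded. By Hypothesis II, one can show that $A_1 \otimes I$ is L-bounded. By Lemma 3.8, Y is also L-bounded. Since $D(A_0) \otimes_{\mathrm{alg}} D(H_b)$ is a core for L, (3.22) extends to all Ψ, Φ ∈ D(L).
Lemma 3.10. Assume Hypotheses I–V. Then, $D(L) \subset D(\delta A_1(\lambda))$ and, for all Ψ ∈ D(L),
$$\|\delta A_1(\lambda)\Psi\| \le |\lambda| \Bigl( c_{3/2}(g) \|I \otimes H_b^{1/2}\Psi\| + \frac{1}{2} c_1(g) \|\Psi\| \Bigr) , \tag{3.23}$$
where $c_s(g)$ is defined by (2.23).
Proof. Let $A_1(\lambda) = U(\lambda) A_1 \otimes I \, U(\lambda)^{-1}$ and Ψ, Φ ∈ D(L). Then, applying Proposition A.1 in Appendix A with H = −λT, S = $A_1 \otimes I$ and K = $A_0 \otimes I$, we see that the function t ↦ $\langle \Phi, A_1(t\lambda)\Psi \rangle$ (t ∈ R) is differentiable and
$$\frac{d}{dt} \langle \Phi, A_1(t\lambda)\Psi \rangle = -i\lambda \bigl\{ \langle T U(t\lambda)^{-1}\Phi, A_1 \otimes I \, U(t\lambda)^{-1}\Psi \rangle - \langle A_1 \otimes I \, U(t\lambda)^{-1}\Phi, T U(t\lambda)^{-1}\Psi \rangle \bigr\} ,$$
which, together with Lemma 3.9, yields that
$$\frac{d}{dt} \langle \Phi, A_1(t\lambda)\Psi \rangle = -i\lambda \langle U(t\lambda)^{-1}\Phi, Y U(t\lambda)^{-1}\Psi \rangle .$$
Integrating this equation from t = 0 to t = 1, we obtain
$$\langle \Phi, \delta A_1(\lambda)\Psi \rangle = -i\lambda \int_0^1 \langle U(t\lambda)^{-1}\Phi, Y U(t\lambda)^{-1}\Psi \rangle \, dt .$$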
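Hence
$$|\langle \Phi, \delta A_1(\lambda)\Psi \rangle| \le |\lambda| \int_0^1 \|\Phi\| \, \|Y U(t\lambda)^{-1}\Psi\| \, dt ,$$
which implies that
$$\|\delta A_1(\lambda)\Psi\| \le |\lambda| \int_0^1 \|Y U(t\lambda)^{-1}\Psi\| \, dt .$$
We have
$$\|Y U(t\lambda)^{-1}\Psi\| \le \sum_{j=1}^{J} \|[B_j, A_1]\| \, \|I \otimes \phi(ig_j/\omega) U(t\lambda)^{-1}\Psi\| = \sum_{j=1}^{J} \|[B_j, A_1]\| \, \|I \otimes \phi(ig_j/\omega)\Psi\| ,$$
where we have used the strong commutativity of $I \otimes \phi(ig_j/\omega)$ and $U(t\lambda)^{-1}$. Hence
$$\|\delta A_1(\lambda)\Psi\| \le |\lambda| \sum_{j=1}^{J} \|[B_j, A_1]\| \, \|I \otimes \phi(ig_j/\omega)\Psi\| . \tag{3.24}$$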
Using this estimate and (3.20), we obtain (3.23).
Let
$$L(\lambda) := A(\lambda) \otimes I + I \otimes H_b . \tag{3.25}$$
Theorem 3.1. Assume Hypotheses I–VI. Then, for all λ ∈ Λ, $\tilde{H}(\lambda)$ is self-adjoint with $D(\tilde{H}(\lambda)) = D(L)$ and bounded from below. Moreover, every core of L(λ) is a core of $\tilde{H}(\lambda)$.
Proof. We can write
$$\tilde{H}(\lambda) = L(\lambda) + \delta A_1(\lambda) . \tag{3.26}$$
Let λ ∈ Λ. Then, by the definition of Λ, L(λ) is self-adjoint and bounded from below. It is easy to see that $I \otimes H_b^{1/2}$ is infinitesimally small with respect to (w.r.t.) $I \otimes H_b$. Since $A(\lambda) \otimes I$ is bounded from below, it follows that $I \otimes H_b^{1/2}$ is infinitesimally small w.r.t. L(λ). Therefore, by Lemma 3.10, $\delta A_1(\lambda)$ is infinitesimally small w.r.t. L(λ). Thus, by the Kato–Rellich theorem (e.g. [16, p. 162, Theorem X.12]), $\tilde{H}(\lambda)$ is self-adjoint on $D(L(\lambda)) = D(A_0 \otimes I) \cap D(I \otimes H_b)$, bounded from below and every core of L(λ) is a core of $\tilde{H}(\lambda)$.
Proof of Theorem 2.1. We have D(L(λ)) = D(L). By Lemma 3.7 and Theorem 3.1, H(λ) is self-adjoint on D(L) and bounded from below.
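The infinitesimal smallness of $I \otimes H_b^{1/2}$ w.r.t. $I \otimes H_b$ used in the proof of Theorem 3.1 can be checked by a short estimate. The following is a sketch of one standard argument, not taken from the original text; the parameter $\varepsilon > 0$ is introduced here only for illustration.

```latex
% Sketch. Assumptions: \Psi \in D(I \otimes H_b), \varepsilon > 0 arbitrary
% (\varepsilon is an auxiliary parameter, not a symbol of the paper).
% Step 1: Schwarz inequality for the nonnegative operator I \otimes H_b.
\[
  \|I \otimes H_b^{1/2}\Psi\|^2
  = \langle \Psi, I \otimes H_b \Psi \rangle
  \le \|\Psi\| \, \|I \otimes H_b \Psi\| .
\]
% Step 2: for a, b \ge 0 one has ab \le (\varepsilon a + b/(2\varepsilon))^2.
\[
  \|I \otimes H_b^{1/2}\Psi\|
  \le \varepsilon \, \|I \otimes H_b \Psi\| + \frac{1}{2\varepsilon} \, \|\Psi\| .
\]
% Step 3: insert Step 2 into the bound (3.23) of Lemma 3.10.
\[
  \|\delta A_1(\lambda)\Psi\|
  \le \varepsilon \, |\lambda| \, c_{3/2}(g) \, \|I \otimes H_b \Psi\|
  + |\lambda| \Bigl( \frac{c_{3/2}(g)}{2\varepsilon} + \frac{c_1(g)}{2} \Bigr) \|\Psi\| .
\]
```

Since $\varepsilon$ can be taken arbitrarily small, the relative bound of $\delta A_1(\lambda)$ w.r.t. $I \otimes H_b$ is zero; the passage from $I \otimes H_b$ to L(λ) then uses the lower boundedness of $A(\lambda) \otimes I$, as stated in the proof.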
Corollary 3.1. Assume Hypotheses I–VI. Then, for all λ ∈ Λ,
$$E_0(A(\lambda)) - \frac{\lambda^2}{4} c_{3/2}(g)^2 - \frac{|\lambda|}{2} c_1(g) \le E_0(H(\lambda)) \le E_0(A(\lambda)) + \frac{1}{2} |\lambda| c_1(g) . \tag{3.27}$$
Proof. By Lemma 3.7 and Theorem 2.1, we have
$$E_0(\tilde{H}(\lambda)) = E_0(H(\lambda)) . \tag{3.28}$$
Hence we need only to prove (3.27) with $E_0(H(\lambda))$ replaced by $E_0(\tilde{H}(\lambda))$. Let Ψ ∈ D(L) with ‖Ψ‖ = 1. Then, using (3.23), we have
$$\begin{aligned} \langle \Psi, \tilde{H}(\lambda)\Psi \rangle &\ge \langle \Psi, A(\lambda) \otimes I \Psi \rangle + \langle \Psi, I \otimes H_b \Psi \rangle - \|\delta A_1(\lambda)\Psi\| \\ &\ge E_0(A(\lambda)) + \|I \otimes H_b^{1/2}\Psi\|^2 - |\lambda| c_{3/2}(g) \|I \otimes H_b^{1/2}\Psi\| - \frac{|\lambda|}{2} c_1(g) \\ &\ge E_0(A(\lambda)) - \frac{\lambda^2 c_{3/2}(g)^2}{4} - \frac{|\lambda|}{2} c_1(g) , \end{aligned}$$
which, combined with the variational principle, yields the first inequality in (3.27). Let Ω ∈ $\mathcal{F}_b(\mathcal{W})$ be the Fock vacuum: $\Omega^{(0)} = 1$, $\Omega^{(n)} = 0$, n ≥ 1. Then we have
$$H_b \Omega = 0 . \tag{3.29}$$
Hence, for all u ∈ $D(A_0)$ with ‖u‖ = 1, we have by the variational principle
$$E_0(\tilde{H}(\lambda)) \le \langle u, A(\lambda)u \rangle + \langle u \otimes \Omega, \delta A_1(\lambda) u \otimes \Omega \rangle .$$
By (3.23) and the Schwarz inequality, we have
$$\langle u \otimes \Omega, \delta A_1(\lambda) u \otimes \Omega \rangle \le \frac{1}{2} |\lambda| c_1(g) .$$
Hence
$$E_0(\tilde{H}(\lambda)) \le \langle u, A(\lambda)u \rangle + \frac{1}{2} |\lambda| c_1(g) .$$
Applying the variational principle again, we obtain (3.27) with $E_0(H(\lambda))$ replaced by $E_0(\tilde{H}(\lambda))$.
4. Existence of a Ground State in the Massive Case
In the present case, the methods used in [4, 5] cannot be applied directly to prove Theorem 2.2, because the existence of a ground state of A is not assumed. Thus we need a new idea. We note Lemma 3.7, which tells us that H(λ) has a ground state if and only if $\tilde{H}(\lambda)$ does. Hence one may prove the existence of a ground state of H(λ) by proving that of $\tilde{H}(\lambda)$. We use this structure. Throughout this section we assume Hypotheses I–VII.
For a parameter V > 0, we define a lattice
$$\Gamma_V := \frac{2\pi \mathbb{Z}^d}{V} = \Bigl\{ k = (k_1, \dots, k_d) \Bigm| k_j = \frac{2\pi n_j}{V}, \ n_j \in \mathbb{Z}, \ j = 1, \dots, d \Bigr\} .$$
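We denote by $\ell^2(\Gamma_V)$ the Hilbert space of square summable sequences indexed by $\Gamma_V$:
$$\ell^2(\Gamma_V) := \Bigl\{ f : \Gamma_V \to \mathbb{C} \Bigm| \sum_{k \in \Gamma_V} |f(k)|^2 < \infty \Bigr\} .$$
Each element f in $\ell^2(\Gamma_V)$ can be identified with a piecewise constant function in $L^2(\mathbb{R}^d)$ which is a constant on each cube
$$C(k, V) := \Bigl[ k_1 - \frac{\pi}{V}, k_1 + \frac{\pi}{V} \Bigr) \times \cdots \times \Bigl[ k_d - \frac{\pi}{V}, k_d + \frac{\pi}{V} \Bigr) \subset \mathbb{R}^d$$
centered about a lattice point k ∈ $\Gamma_V$. In this identification, $\ell^2(\Gamma_V)$ is a closed subspace of $L^2(\mathbb{R}^d)$. Then, putting
$$\mathcal{W}_V := \bigoplus^{N} \ell^2(\Gamma_V) , \tag{4.1}$$
we have a natural orthogonal decomposition $\mathcal{W} = \mathcal{W}_V \oplus \mathcal{W}_V^{\perp}$. Hence
$$\mathcal{F}_b(\mathcal{W}) = \mathcal{F}_b(\mathcal{W}_V) \otimes \mathcal{F}_b(\mathcal{W}_V^{\perp}) . \tag{4.2}$$
We define
$$\omega_V(k) = \omega(k_V) , \quad k \in \mathbb{R}^d ,$$
with $k_V$ a lattice point close to k:
$$k_V \in \Gamma_V , \quad |k_j - (k_V)_j| \le \frac{\pi}{V} , \quad j = 1, \dots, d , \quad k \in C(k_V, V) .$$
Let
$$H_{b,V} := d\Gamma(\omega_V) ,$$
the second quantization of $\omega_V$.
For technical reasons, we assume the following as a preliminary hypothesis:
Hypothesis VIII. Each $g_j : \mathbb{R}^d \to \mathbb{C}^N$ is continuous with $g_j, g_j/\omega^{3/2} \in \mathcal{W}$ and $\langle g_j(k), g_l(k) \rangle_{\mathbb{C}^N} \in \mathbb{R}$, j, l = 1, . . . , J, k ∈ $\mathbb{R}^d$.
Let C, γ be the constants in Hypothesis VII. In what follows we assume that m > 0 (m is defined by (2.22)) and
$$C_V := C d^{\gamma/2} \Bigl( \frac{\pi}{V} \Bigr)^{\gamma} \Bigl( 1 + \frac{1}{2m} \Bigr) < 1 . \tag{4.3}$$
Condition (4.3) is equivalent to $V > V_0$, where $V_0$ is the constant defined by
$$C_{V_0} = 1 . \tag{4.4}$$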
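For a constant K > 0, we define a function $g_{j,K,V} : \mathbb{R}^d \to \mathbb{C}^N$ (j = 1, . . . , J) by
$$g_{j,K,V}(k) := \sum_{\ell \in \Gamma_V, \ |\ell_i| \le K, \ i = 1, \dots, d} g_j(\ell) \chi_{C(\ell, V)}(k) , \quad k \in \mathbb{R}^d , \tag{4.5}$$
where $\chi_S$ denotes the characteristic function of the set S. We introduce a lattice approximation version for H(λ):
$$H_{K,V}(\lambda) := A \otimes I + I \otimes H_{b,V} + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_{j,K,V}) . \tag{4.6}$$
As in the case of $T_j$, one can show that $\{B_j \otimes \phi(ig_{j,K,V}/\omega_V)\}_{j=1}^{J}$ is a family of strongly commuting self-adjoint operators. Hence
$$T_{K,V} := \sum_{j=1}^{J} B_j \otimes \phi\Bigl( i \frac{g_{j,K,V}}{\omega_V} \Bigr) \tag{4.7}$$
is self-adjoint. We set
$$U_{K,V}(\lambda) := e^{-i\lambda T_{K,V}} . \tag{4.8}$$
Let
$$A_{K,V}(\lambda) := A - \frac{\lambda^2}{2} \sum_{j,l=1}^{J} \Bigl\langle \frac{g_{j,K,V}}{\sqrt{\omega_V}}, \frac{g_{l,K,V}}{\sqrt{\omega_V}} \Bigr\rangle B_j B_l , \tag{4.9}$$
$$L_{K,V}(\lambda) := A_{K,V}(\lambda) \otimes I + I \otimes H_{b,V} , \tag{4.10}$$
$$\delta A_{1,K,V}(\lambda) := U_{K,V}(\lambda) A_1 \otimes I \, U_{K,V}(\lambda)^{-1} - A_1 \otimes I \tag{4.11}$$
and
$$\tilde{H}_{K,V}(\lambda) := L_{K,V}(\lambda) + \delta A_{1,K,V}(\lambda) . \tag{4.12}$$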
Lemma 4.1. For all λ ∈ R,
$$U_{K,V}(\lambda) D(L) = D(L) \tag{4.13}$$
and, for all Ψ ∈ D(L),
$$U_{K,V}(\lambda) H_{K,V}(\lambda) U_{K,V}(\lambda)^{-1} \Psi = \tilde{H}_{K,V}(\lambda) \Psi . \tag{4.14}$$
Proof. By [4, Lemma 3.1], we have
$$D(H_{b,V}) = D(H_b) , \tag{4.15}$$
which implies that
$$D(A \otimes I + I \otimes H_{b,V}) = D(L) . \tag{4.16}$$
Then, in a way similar to the proof of Lemma 3.7, one can prove (4.13) and (4.14).
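Lemma 4.2. For all sufficiently large V, $D(L) \subset D(\delta A_{1,K,V}(\lambda))$ and, for all Ψ ∈ D(L),
$$\|\delta A_{1,K,V}(\lambda)\Psi\| \le |\lambda| \Bigl( c_{3/2,K,V}(g) \|I \otimes H_{b,V}^{1/2}\Psi\| + \frac{1}{2} c_{1,K,V}(g) \|\Psi\| \Bigr) , \tag{4.17}$$
where $c_{s,K,V}(g)$ is the constant $c_s(g)$ with $g_j$ (respectively ω) replaced by $g_{j,K,V}$ (respectively $\omega_V$).
Proof. Similar to the proof of Lemma 3.10.
Let
$$g_{j,K}(k) := \chi_{[-K,K]}(k_1) \cdots \chi_{[-K,K]}(k_d) \, g_j(k) , \quad k \in \mathbb{R}^d , \ j = 1, \dots, J . \tag{4.18}$$
Then the following hold [4, Lemma 3.2]:
$$\lim_{V \to \infty} \|g_{j,K,V} - g_{j,K}\| = 0 , \tag{4.19}$$
$$\lim_{K \to \infty} \|g_{j,K} - g_j\| = 0 , \tag{4.20}$$
$$\lim_{V \to \infty} \Bigl\| \frac{g_{j,K,V}}{\sqrt{\omega_V}} - \frac{g_{j,K}}{\sqrt{\omega}} \Bigr\| = 0 , \tag{4.21}$$
$$\lim_{K \to \infty} \Bigl\| \frac{g_{j,K}}{\sqrt{\omega}} - \frac{g_j}{\sqrt{\omega}} \Bigr\| = 0 . \tag{4.22}$$
Let
$$A_K(\lambda) := A - \frac{\lambda^2}{2} \sum_{j,l=1}^{J} \Bigl\langle \frac{g_{j,K}}{\sqrt{\omega}}, \frac{g_{l,K}}{\sqrt{\omega}} \Bigr\rangle B_j B_l . \tag{4.23}$$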
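Lemma 4.3. Let λ ∈ Λ. Then, there exists a constant $K_0(\lambda) > 0$ such that, for all $K > K_0(\lambda)$, $A_K(\lambda)$ is self-adjoint with $D(A_K(\lambda)) = D(A_0)$ and bounded from below.
Proof. We write
$$A_K(\lambda) = A(\lambda) + \lambda^2 D_K , \tag{4.24}$$
where
$$D_K := \frac{1}{2} \sum_{j,l=1}^{J} c_{j,l}(K) B_j B_l \tag{4.25}$$
with
$$c_{j,l}(K) := \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle - \Bigl\langle \frac{g_{j,K}}{\sqrt{\omega}}, \frac{g_{l,K}}{\sqrt{\omega}} \Bigr\rangle .$$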
By (2.21),
$$\|B_j B_l u\| \le (c_j c_l + \varepsilon) \|A_0 u\| + \Bigl( \frac{(c_j d_l + c_l d_j)^2}{4\varepsilon} + d_j d_l \Bigr) \|u\| , \quad u \in D(A_0) ,$$
where ε > 0 is arbitrary. Since A(λ) is self-adjoint with $D(A(\lambda)) = D(A_0)$, it follows from the closed graph theorem that there exist constants ν(λ) > 0 and µ(λ) > 0 such that, for all u ∈ $D(A_0)$,
$$\|A_0 u\| \le \nu(\lambda) \|A(\lambda)u\| + \mu(\lambda) \|u\| . \tag{4.26}$$
Hence, for all u ∈ $D(A_0)$,
$$\|D_K u\| \le \alpha_K(\lambda) \|A(\lambda)u\| + \beta_K(\lambda) \|u\| , \tag{4.27}$$
where
$$\alpha_K(\lambda) := \frac{\nu(\lambda)}{2} \sum_{j,l=1}^{J} |c_{j,l}(K)| (c_j c_l + \varepsilon) ,$$
$$\beta_K(\lambda) := \frac{\mu(\lambda)}{2} \sum_{j,l=1}^{J} |c_{j,l}(K)| \Bigl( \frac{(c_j d_l + c_l d_j)^2}{4\varepsilon} + d_j d_l \Bigr) .$$
By (4.22), $\lim_{K \to \infty} \alpha_K(\lambda) = 0$. Hence there exists a constant $K_0(\lambda) > 0$ such that, for all $K > K_0(\lambda)$, $\lambda^2 \alpha_K(\lambda) < 1$. Then, by the Kato–Rellich theorem, $A_K(\lambda)$ is self-adjoint with $D(A_K(\lambda)) = D(A(\lambda)) = D(A_0)$ and bounded from below.
In what follows we assume that $K > K_0(\lambda)$ (λ ∈ Λ).
Lemma 4.4. Let λ ∈ Λ. Then, for all sufficiently large V, $A_{K,V}(\lambda)$ is self-adjoint with $D(A_{K,V}(\lambda)) = D(A_0)$ and bounded from below.
Proof. We write
$$A_{K,V}(\lambda) = A_K(\lambda) + \lambda^2 D_{K,V} , \tag{4.28}$$
where
$$D_{K,V} := \frac{1}{2} \sum_{j,l=1}^{J} c_{j,l}(K, V) B_j B_l$$
with
$$c_{j,l}(K, V) := \Bigl\langle \frac{g_{j,K}}{\sqrt{\omega}}, \frac{g_{l,K}}{\sqrt{\omega}} \Bigr\rangle - \Bigl\langle \frac{g_{j,K,V}}{\sqrt{\omega_V}}, \frac{g_{l,K,V}}{\sqrt{\omega_V}} \Bigr\rangle .$$
In the same way as in the proof of Lemma 4.3 [use (4.21)], we can show that $\lambda^2 D_{K,V}$ is relatively bounded w.r.t. $A_K(\lambda)$ with relative bound smaller than 1 for all sufficiently large V. Thus, by the Kato–Rellich theorem, the assertion holds.
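Lemma 4.5. Let λ ∈ Λ. Then, for all sufficiently large V > 0, $\tilde{H}_{K,V}(\lambda)$ and $H_{K,V}(\lambda)$ are self-adjoint with $D(\tilde{H}_{K,V}(\lambda)) = D(H_{K,V}(\lambda)) = D(A_0 \otimes I) \cap D(I \otimes H_b)$ and bounded from below.
Proof. Similar to the proof of Theorem 2.1.
The following fact is well-known:
Lemma 4.6. The operator $H_{b,V}$ is reduced by $\mathcal{F}_b(\mathcal{W}_V)$ and its reduced part is equal to the second quantization of $\omega|\mathcal{W}_V$ in $\mathcal{F}_b(\mathcal{W}_V)$.
Let
$$\mathcal{F}_V := \mathcal{H} \otimes \mathcal{F}_b(\mathcal{W}_V) . \tag{4.29}$$
Then we have the orthogonal decomposition
$$\mathcal{F} = \mathcal{F}_V \oplus \mathcal{F}_V^{\perp} , \tag{4.30}$$
where
$$\mathcal{F}_V^{\perp} = \bigoplus_{n=1}^{\infty} \mathcal{F}_V \otimes \Bigl[ \bigotimes_s^{n} (\mathcal{W}_V^{\perp}) \Bigr] . \tag{4.31}$$
Lemma 4.7. The operator $\tilde{H}_{K,V}(\lambda)$ is reduced by $\mathcal{F}_V$ and
$$\tilde{H}_{K,V}(\lambda) | \mathcal{F}_V^{\perp} \ge E_0(H_{K,V}(\lambda)) + m . \tag{4.32}$$
Proof. It is easy to see that $g_{j,K,V}/\omega_V \in \mathcal{W}_V$. Hence, under the identifications (4.29) and (4.30), we have $B_j \otimes \phi(ig_{j,K,V}/\omega_V) = [B_j \otimes \phi(ig_{j,K,V}/\omega_V)] \oplus 0$. It follows that $T_{K,V}$ is reduced by $\mathcal{F}_V$ and so is $U_{K,V}(\lambda)$, which implies that $\delta A_{1,K,V}(\lambda)$ is reduced by $\mathcal{F}_V$. By this fact and Lemma 4.6, $\tilde{H}_{K,V}(\lambda)$ is reduced by $\mathcal{F}_V$. Then a method similar to the proof of [4, Lemma 3.10] yields (4.32).
Let
$$\Sigma_{\lambda,K} := \inf \sigma_{\mathrm{ess}}(A_K(\lambda)) .$$
Lemma 4.8. There exists a constant $\varepsilon_K > 0$ such that $\lim_{K \to \infty} \varepsilon_K = 0$ and
$$\Sigma_\lambda \le \Sigma_{\lambda,K} + \varepsilon_K . \tag{4.33}$$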
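Proof. By (4.27) and a general theorem [16, p. 168, Theorem X.18], one can show that
$$(1 - a_K) A(\lambda) - [\mu - E_0(A(\lambda))] a_K \le A_K(\lambda) \le (1 + a_K) A(\lambda) + [\mu - E_0(A(\lambda))] a_K \tag{4.34}$$
with
$$a_K := \lambda^2 \Bigl[ \alpha_K(\lambda) \Bigl( 1 + \frac{|E_0(A(\lambda))|}{\mu} \Bigr) + \frac{\beta_K(\lambda)}{\mu} \Bigr] ,$$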
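where µ > 0 is arbitrary. For all sufficiently large K, we have $1 - a_K > 0$. We fix such a K. Then it follows from the first inequality in (4.34) and the min–max principle [17, p. 76, Theorem XIII.1] that $(1 - a_K)\Sigma_\lambda \le \Sigma_{\lambda,K} + [\mu - E_0(A(\lambda))] a_K$, which implies that $\Sigma_\lambda \le \Sigma_{\lambda,K} + \varepsilon_K$ with $\varepsilon_K := a_K [\Sigma_\lambda + \mu - E_0(A(\lambda))]$. We have $\lim_{K \to \infty} \varepsilon_K = 0$.
Let
$$\Sigma_{\lambda,K,V} := \inf \sigma_{\mathrm{ess}}(A_{K,V}(\lambda)) .$$
Lemma 4.9. There exists a constant $\eta_{K,V} > 0$ such that $\lim_{V \to \infty} \eta_{K,V} = 0$ and
$$\Sigma_{\lambda,K} \le \Sigma_{\lambda,K,V} + \eta_{K,V} . \tag{4.35}$$
Proof. Similar to the proof of Lemma 4.8.
Lemma 4.10.
$$\lim_{K \to \infty} E_0(A_K(\lambda)) = E_0(A(\lambda)) , \tag{4.36}$$
$$\lim_{V \to \infty} E_0(A_{K,V}(\lambda)) = E_0(A_K(\lambda)) . \tag{4.37}$$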
Proof. By (4.34) and the variational principle, $(1 - a_K) E_0(A(\lambda)) - \mu a_K \le E_0(A_K(\lambda)) \le (1 + a_K) E_0(A(\lambda)) + \mu a_K$, which implies (4.36). Similarly one can prove (4.37).
Lemma 4.11. Suppose that the same hypothesis as in Theorem 2.2 and Hypothesis VIII hold. Then, for all sufficiently large K and V, $H_{K,V}(\lambda)$ has purely discrete spectrum in $[E_0(H_{K,V}(\lambda)), E_0(H_{K,V}(\lambda)) + m)$.
Proof. By Lemma 4.1, we need only to show that $\tilde{H}_{K,V}(\lambda)$ has purely discrete spectrum in $[E_0(H_{K,V}(\lambda)), E_0(H_{K,V}(\lambda)) + m)$ [note that $E_0(h_{K,V}) = E_0(\tilde{H}_{K,V}(\lambda)) = E_0(H_{K,V}(\lambda))$]. By Lemma 4.7, it is sufficient to show that the reduced part $h_{K,V} := \tilde{H}_{K,V}(\lambda) | \mathcal{F}_V$ has such a property. By Lemma 4.2, we have
$$|\langle \Psi, \delta A_{1,K,V}(\lambda)\Psi \rangle| \le \varepsilon \langle \Psi, I \otimes H_{b,V} \Psi \rangle + b_\varepsilon \|\Psi\|^2 , \quad \Psi \in D(I \otimes H_{b,V}) , \tag{4.38}$$
where ε > 0 is arbitrary and
$$b_\varepsilon := \frac{\lambda^2 c_{3/2,K,V}(g)^2}{4\varepsilon} + \frac{1}{2} |\lambda| c_{1,K,V}(g) . \tag{4.39}$$
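The passage from the operator bound (4.17) to the form bound (4.38) with the constant (4.39) is an elementary step. The following sketch spells it out under the assumption that Ψ lies in D(L); the abbreviation $x$ is introduced here only for illustration.

```latex
% Sketch. Notation (assumption, for illustration only):
%   x := \|I \otimes H_{b,V}^{1/2}\Psi\|, so that x^2 = <\Psi, I \otimes H_{b,V} \Psi>.
% Step 1: Schwarz inequality followed by (4.17).
\[
  |\langle \Psi, \delta A_{1,K,V}(\lambda)\Psi \rangle|
  \le \|\Psi\| \, \|\delta A_{1,K,V}(\lambda)\Psi\|
  \le |\lambda| \, c_{3/2,K,V}(g) \, \|\Psi\| \, x
  + \frac{1}{2} |\lambda| \, c_{1,K,V}(g) \, \|\Psi\|^2 .
\]
% Step 2: for every \varepsilon > 0,
%   2 (\sqrt{\varepsilon} x) (|\lambda| c \|\Psi\| / (2\sqrt{\varepsilon}))
%   \le \varepsilon x^2 + \lambda^2 c^2 \|\Psi\|^2 / (4\varepsilon).
\[
  |\lambda| \, c_{3/2,K,V}(g) \, \|\Psi\| \, x
  \le \varepsilon x^2 + \frac{\lambda^2 c_{3/2,K,V}(g)^2}{4\varepsilon} \, \|\Psi\|^2 .
\]
% Step 3: combine Steps 1 and 2.
\[
  |\langle \Psi, \delta A_{1,K,V}(\lambda)\Psi \rangle|
  \le \varepsilon \, \langle \Psi, I \otimes H_{b,V} \Psi \rangle
  + \Bigl( \frac{\lambda^2 c_{3/2,K,V}(g)^2}{4\varepsilon}
  + \frac{1}{2} |\lambda| \, c_{1,K,V}(g) \Bigr) \|\Psi\|^2 .
\]
```

The bracket in the last line is exactly $b_\varepsilon$. For ε close to 1, $2b_\varepsilon$ approaches $\frac{1}{2}\lambda^2 c_{3/2,K,V}(g)^2 + |\lambda| c_{1,K,V}(g)$, which is the combination of constants appearing in (5.3) and (6.6).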
By condition (2.25) and Lemmas 4.8–4.10, we have
$$\Sigma_{\lambda,K,V} - E_0(A_{K,V}(\lambda)) > m + 2b_\varepsilon \tag{4.40}$$
if ε < 1 is sufficiently close to 1 and K and V are sufficiently large. Note that the spectrum of $h_0 := H_{b,V} | \mathcal{F}_b(\mathcal{W}_V)$ is purely discrete with $\mathrm{Ran}(E_{h_0}([0, s]))$ being finite dimensional for all s > 0. Hence we can apply Theorem B.1 with Remark B.1 in Appendix B to conclude that $h_{K,V}$ has purely discrete spectrum in $[E_0(h_{K,V}), E_0(h_{K,V}) + m)$.
Proof of Theorem 2.2. Let
$$H_K(\lambda) := A \otimes I + I \otimes H_b + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_{j,K}) .$$
Then, in the same way as in [4, Lemma 3.5], one can show that $H_{K,V}(\lambda)$ converges to $H_K(\lambda)$ in the norm resolvent sense as V → ∞. Hence, by Lemma 4.11 and an application of [4, Lemma 3.12], we conclude that $H_K(\lambda)$ has purely discrete spectrum in $[E_0(H_K(\lambda)), E_0(H_K(\lambda)) + m)$. In the same way as in [4, Lemma 3.11], one can show that $H_K(\lambda)$ converges to H(λ) in the norm resolvent sense as K → ∞. Hence, by the preceding result and [4, Lemma 3.12] again, we see that H(λ) has purely discrete spectrum in $[E_0(H(\lambda)), E_0(H(\lambda)) + m)$.
Finally we consider the case where each $g_j$ is not necessarily continuous. In this case we can take a sequence of continuous functions $\{g_j^{(n)}\}_{n=1}^{\infty}$ with $g_j^{(n)} \in \mathcal{W}$ such that $\|g_j^{(n)} - g_j\| \to 0$ (n → ∞). Let $H_n$ be the operator H(λ) with $g_j$ replaced by $g_j^{(n)}$ (j = 1, . . . , J). Then one can show that $H_n$ converges to H(λ) in the norm resolvent sense as n → ∞. By the result of the last paragraph, $H_n$ has purely discrete spectrum in $[E_0(H_n), E_0(H_n) + m)$. Hence, by [4, Lemma 3.12] once again, we see that H(λ) has purely discrete spectrum in $[E_0(H(\lambda)), E_0(H(\lambda)) + m)$.
5. Existence of a Ground State in the Massless Case
This section is devoted to the proof of Theorem 2.3. Throughout the section, all the hypotheses of Theorem 2.3 are assumed. For each constant M > 0, we define $\omega_M : \mathbb{R}^d \to [M, \infty)$ by $\omega_M(k) := \omega(k) + M$, so that $\inf_{k \in \mathbb{R}^d} \omega_M(k) = M > 0$. We set
$$g_j^{(M)}(k) := \frac{\omega_M(k)}{\omega(k)} g_j(k) , \quad k \in \mathbb{R}^d ,$$
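which is in $\mathcal{W}$. We introduce a “regularized” version of the Hamiltonian H(λ):
$$H_M := A \otimes I + I \otimes H_{b,M} + \lambda \sum_{j=1}^{J} B_j \otimes \phi(g_j^{(M)}) ,$$
where
$$H_{b,M} := d\Gamma(\omega_M) .$$
Let
$$A_M := A - \lambda^2 R_B^{(M)}$$
with
$$R_B^{(M)} := \frac{1}{2} \sum_{j,l=1}^{J} \Bigl\langle \frac{g_j^{(M)}}{\sqrt{\omega_M}}, \frac{g_l^{(M)}}{\sqrt{\omega_M}} \Bigr\rangle B_j B_l$$
and
$$\tilde{H}_M := A_M \otimes I + I \otimes H_{b,M} + \delta A_1(\lambda) .$$
Then, by Lemma 3.7,
$$U(\lambda) H_M U(\lambda)^{-1} = \tilde{H}_M .$$
By applying the Lebesgue dominated convergence theorem, one can show that
$$\lim_{M \to 0} \Bigl\| \frac{g_j^{(M)}}{\omega_M^{s}} - \frac{g_j}{\omega^{s}} \Bigr\| = 0 \tag{5.1}$$
for all s ≥ 0 such that $g_j/\omega^{s+1} \in \mathcal{W}$. We write
$$A_M = A(\lambda) + W_M$$
with
$$W_M := \lambda^2 (R_B - R_B^{(M)}) .$$
We put
$$c_M := \lambda^2 \sum_{j,l=1}^{J} \Bigl| \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle - \Bigl\langle \frac{g_j^{(M)}}{\sqrt{\omega_M}}, \frac{g_l^{(M)}}{\sqrt{\omega_M}} \Bigr\rangle \Bigr| .$$
Then we can show that
$$\|W_M u\| \le c_M (a \|A(\lambda)u\| + b \|u\|) , \quad u \in D(A_0) ,$$
where a and b are constants independent of M (cf. the proof of Lemma 4.3). In the same way as in Lemma 4.10, one can show that
$$\lim_{M \to 0} E_0(A_M) = E_0(A(\lambda)) . \tag{5.2}$$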
By this fact, (5.1) and (2.26), we can take M > 0 (sufficiently small) satisfying
$$\Sigma_\lambda - E_0(A_M) > M + \frac{1}{2} \lambda^2 c_{3/2}^{(M)}(g)^2 + |\lambda| c_1^{(M)}(g) , \tag{5.3}$$
where $c_s^{(M)}(g)$ is the $c_s(g)$ with ω and $g_j$ replaced by $\omega_M$ and $g_j^{(M)}$ respectively. It follows from Theorem 2.2 that $H_M$ has a ground state and so does $\tilde{H}_M$. We denote a normalized ground state of $\tilde{H}_M$ by $\Psi_M$.
Lemma 5.1. For all f ∈ $\mathcal{W}$ with ωf ∈ $\mathcal{W}$, $I \otimes a(f)\Psi_M \in D(\tilde{H}_M)$ and
˜ M − E 0 (H ˜ M ))I ⊗ a(f )ΨM (H ) ( J X gj λ −1 √ f, ΨM . U (λ)[Bj , A1 ]U (λ) = − a(ωM f ) − ω 2 j=1
Proof. Similar to the proof of [4, Lemma 4.1] except that, in the present case, one uses an easily proven formula
$$U(\lambda) \, I \otimes a(f) \, U(\lambda)^{-1} = I \otimes a(f) - \frac{\lambda}{\sqrt{2}} \sum_{j=1}^{J} \Bigl\langle f, \frac{g_j}{\omega} \Bigr\rangle B_j \otimes I$$
on $D(A_0 \otimes I) \cap D(I \otimes H_{b,M})$.
We set
$$N_b := d\Gamma(I) ,$$
the number operator on $\mathcal{F}_b(\mathcal{W})$. Then we have for all f ∈ $\mathcal{W}$ and ψ ∈ $D(N_b^{1/2})$,
$$\|a(f)\psi\| \le \|f\| \, \|N_b^{1/2}\psi\| , \quad \|a(f)^*\psi\| \le \|f\| \, \|N_b^{1/2}\psi\| + \|f\| \, \|\psi\| ,$$
which implies that
$$\|\phi(f)\psi\| \le \sqrt{2} \, \|f\| \, \|N_b^{1/2}\psi\| + \frac{1}{\sqrt{2}} \|f\| \, \|\psi\| . \tag{5.4}$$
Now we can apply Theorem B.2 in Appendix B with (K, B, C) = ($A(\lambda)$, $H_b$, $\delta A_1(\lambda)$) so that H = $\tilde{H}(\lambda)$. By (2.25), Hypothesis (B.1) holds. Hypothesis (B.2) is satisfied with m = M, $K_m = A_M$, $B_m = H_{b,M}$, $\psi_0 = \Omega$ (the Fock vacuum) and $\mathcal{D} = \mathcal{F}_0(\mathcal{W}) \cap D(H_b)$. We denote by $P_\Omega$ the orthogonal projection onto the one-dimensional subspace {αΩ | α ∈ C} and set $P_\Omega^{\perp} := I - P_\Omega$. By (3.23) and the fact that $H_{b,M}\Omega = 0$ and $H_b\Omega = 0$, we have for all u ∈ $D(A_0)$ with ‖u‖ = 1,
$$\langle u \otimes \Omega, \tilde{H}_M u \otimes \Omega \rangle \le \langle u, A_M u \rangle + \frac{|\lambda|}{2} c_1(g) ,$$
which implies that $E_0(\tilde{H}_M) \le E_0(A_M) + |\lambda| c_1(g)/2$. By (5.2), for every η > 0, there exists a constant $M_0 > 0$ such that $|E_0(A_M) - E_0(A(\lambda))| < \eta$ for all $M < M_0$. By this fact and (2.26), we have
$$E_0(\tilde{H}_M) \le \eta + E_0(A(\lambda)) + \frac{|\lambda|}{2} c_1(g) < \Sigma_\lambda$$
for all $M < M_0$, where we take η sufficiently small. Thus, if we show that
$$\frac{1}{(\Sigma_\lambda - E_0(\tilde{H}_M))^2} \|\delta A_1(\lambda)\Psi_M\|^2 + \|I \otimes P_\Omega^{\perp} \Psi_M\|^2 < \delta \tag{5.5}$$
for some δ < 1, then $\tilde{H}(\lambda)$ has a ground state and so does H(λ). Let us prove (5.5). Using estimate (5.4) and (3.24), we have
$$\|\delta A_1(\lambda)\Psi_M\| \le |\lambda| c_1(g) \|N_b^{1/2}\Psi_M\| + \frac{|\lambda| c_1(g)}{2} .$$
It is well known or easy to see that $N_b \ge P_\Omega^{\perp}$. Hence
$$\|I \otimes P_\Omega^{\perp} \Psi_M\|^2 \le \|N_b^{1/2}\Psi_M\|^2 .$$
Therefore, if
$$\Bigl( \frac{2|\lambda|^2 c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde{H}_M)]^2} + 1 \Bigr) \|N_b^{1/2}\Psi_M\|^2 + \frac{|\lambda|^2 c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde{H}_M)]^2} < \delta , \tag{5.6}$$
then (5.5) follows.
To estimate $\|N_b^{1/2}\Psi_M\|$, we follow the method given in the proof of [4, Lemma 4.3]. Indeed, by Lemma 5.1, we can show that
$$\|N_b^{1/2}\Psi_M\| \le \frac{|\lambda|}{\sqrt{2}} \sum_{j=1}^{J} \Bigl\| \frac{g_j}{\omega \omega_M} \Bigr\| \, \|[B_j, A_1]\| .$$
Hence, if
$$\Bigl( \frac{2|\lambda|^2 c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde{H}_M)]^2} + 1 \Bigr) \frac{\lambda^2}{2} \Bigl( \sum_{j=1}^{J} \Bigl\| \frac{g_j}{\omega \omega_M} \Bigr\| \, \|[B_j, A_1]\| \Bigr)^2 + \frac{|\lambda|^2 c_1(g)^2}{[\Sigma_\lambda - E_0(\tilde{H}_M)]^2} < \delta , \tag{5.7}$$
then (5.6) follows.
In the same way as in [4, Lemma 4.11], we can show that
$$\lim_{M \to 0} E_0(H_M) = E_0(H(\lambda)) .$$
We have $E_0(H_M) = E_0(\tilde{H}_M)$. Hence condition (2.27) implies (5.7) with M > 0 sufficiently small. This completes the proof of Theorem 2.3.
6. The Pauli–Fierz Type Model
In this section, we apply Theorem 2.3 to a model of the Pauli–Fierz type in nonrelativistic QED. Namely we consider the case where the system S is a system of n nonrelativistic quantum particles moving in $\mathbb{R}^d$ under the influence of a potential $V : \mathbb{R}^{dn} \to \mathbb{R}$ (d, n ∈ N). We set ν := nd. We assume for simplicity that
$$\nu \ge 3 , \quad V \in C_0^{\infty}(\mathbb{R}^\nu) , \quad V_- := \min\{V, 0\} \ne 0 .$$
The Hilbert space of the particle system is taken to be
$$\mathcal{H} = L^2(\mathbb{R}^\nu) .$$
Hence the Hamiltonian of the particle system is
$$H_p := -\Delta + \alpha V$$
acting in $L^2(\mathbb{R}^\nu)$, where ∆ is the generalized Laplacian on $L^2(\mathbb{R}^\nu)$ and α > 0 is a parameter. We write $x = (x_1, \dots, x_\nu) \in \mathbb{R}^\nu$ and define $p_j := -iD_j$ with $D_j$ being the generalized partial differential operator in the variable $x_j$. By the Cwikel–Lieb–Rosenbljum bound [17, Theorem XIII.12], $H_p$ has no ground state for all sufficiently small α.
Let $g_j : \mathbb{R}^d \to \mathbb{R}^N$ (j = 1, . . . , ν) be such that
$$g_j , \ \frac{g_j}{\omega^2} \in \mathcal{W} . \tag{6.1}$$
We take as the total Hamiltonian of the composed system
$$H_{PF}(\lambda) := H_p \otimes I + I \otimes H_b + \lambda \sum_{j=1}^{\nu} p_j \otimes \phi(g_j) .$$
This model is a concrete realization of the abstract model H(λ) with the following choice:
$$A_0 = -\Delta , \quad A_1 = \alpha V , \quad J = \nu , \quad B_j = p_j .$$
It is straightforward to see that Hypotheses I–V hold with $[B_j, A_1]|D(A_0) = -i\alpha D_j V$. Suppose that, for all $\xi = (\xi_1, \dots, \xi_\nu) \in \mathbb{R}^\nu$,
$$\frac{1}{2} \sum_{j,l=1}^{\nu} \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle \xi_j \xi_l = G(g) \xi^2 \tag{6.2}$$
with G(g) > 0 a constant independent of ξ. This condition is satisfied in the original Pauli–Fierz model without $A^2$-term in the dipole approximation [2] (see Example 6.2 below). Condition (6.2) implies that
$$\frac{1}{2} \sum_{j,l=1}^{\nu} \Bigl\langle \frac{g_j}{\sqrt{\omega}}, \frac{g_l}{\sqrt{\omega}} \Bigr\rangle p_j p_l = -G(g) \Delta .$$
Hence, in the present case, A(λ) takes the following form:
$$A(\lambda) = -(1 - \lambda^2 G(g)) \Delta + \alpha V .$$
Therefore, in the present case,
$$\Lambda = (-\lambda(g), 0) \cup (0, \lambda(g)) \ne \emptyset ,$$
where
$$\lambda(g) := \frac{1}{\sqrt{G(g)}} .$$
Thus Hypothesis VI holds. Also we have
$$c_s(g) = \sqrt{2} \, |\alpha| \sum_{j=1}^{\nu} \|D_j V\|_\infty \Bigl\| \frac{g_j}{\omega^s} \Bigr\| , \tag{6.3}$$
where $\|F\|_\infty := \sup_{x \in \mathbb{R}^\nu} |F(x)|$ ($F : \mathbb{R}^\nu \to \mathbb{C}$). We set
$$V_0 := \inf_{x \in \mathbb{R}^\nu} V(x) < 0 .$$
We first consider the massive case.
Theorem 6.1. Consider the case m > 0. Let ν ≥ 3 and Hypothesis VII be satisfied. Suppose that
$$\alpha |V_0| > \frac{1}{2} \lambda(g)^2 c_{3/2}(g)^2 + \lambda(g) c_1(g) \tag{6.4}$$
and the constant m satisfies
$$\alpha |V_0| > m + \frac{1}{2} \lambda(g)^2 c_{3/2}(g)^2 + \lambda(g) c_1(g) . \tag{6.5}$$
Then there exists a constant δ such that, for all |λ| ∈ (λ(g) − δ, λ(g)), $H_{PF}(\lambda)$ has purely discrete spectrum in the interval $[E_0(H_{PF}(\lambda)), E_0(H_{PF}(\lambda)) + m)$. In particular, $H_{PF}(\lambda)$ has a ground state.
Proof. Let 0 < |λ| < λ(g). By [17, Theorem XIII.15],
$$\Sigma_\lambda = \inf \sigma_{\mathrm{ess}}(A(\lambda)) = 0 .$$
Therefore, by Theorem 2.2, we need only to show
$$-E_0(A(\lambda)) > m + \frac{1}{2} \lambda^2 c_{3/2}(g)^2 + |\lambda| c_1(g) \tag{6.6}$$
for all |λ| sufficiently close to λ(g). We can take a constant ε > 0 such that $E := V_0 + \varepsilon < 0$. Then $D_E := \{x \in \mathbb{R}^\nu \mid V(x) < E\}$ is a nonempty bounded open set. Let $d_E := \inf_{u \in C_0^\infty(D_E), \|u\| = 1} \langle u, -\Delta u \rangle$. Then, by the strict positivity of the Dirichlet Laplacian in a bounded open set, $d_E > 0$. By the variational principle and the fact that $\langle u, Vu \rangle \le E\|u\|^2$, $u \in C_0^\infty(D_E)$,
$$E_0(A(\lambda)) \le (1 - \lambda^2 G(g)) d_E + \alpha E . \tag{6.7}$$
Note that $-\alpha E = \alpha |V_0| - \alpha\varepsilon$. Hence, by (6.5), for all sufficiently small ε > 0 and |λ| sufficiently close to λ(g),
$$-\alpha E - (1 - \lambda^2 G(g)) d_E > m + \frac{1}{2} \lambda(g)^2 c_{3/2}(g)^2 + \lambda(g) c_1(g) . \tag{6.8}$$
Hence
$$\begin{aligned} -E_0(A(\lambda)) &> m + \frac{1}{2} \lambda(g)^2 c_{3/2}(g)^2 + \lambda(g) c_1(g) \\ &> m + \frac{1}{2} \lambda^2 c_{3/2}(g)^2 + |\lambda| c_1(g) . \end{aligned}$$
Thus (6.6) holds.
We next consider the massless case.
Theorem 6.2. Consider the case m = 0. Let ν ≥ 3. Assume Hypothesis VII and (6.4). Suppose in addition that
$$\frac{4\lambda(g)^2 c_1(g)^2}{\alpha^2 V_0^2} + \frac{1}{2} \Bigl( \frac{8\lambda(g)^2 c_1(g)^2}{\alpha^2 V_0^2} + 1 \Bigr) \lambda(g)^2 c_2(g)^2 < 1 . \tag{6.9}$$
Then there exists a constant δ such that, for all |λ| ∈ (λ(g) − δ, λ(g)), $H_{PF}(\lambda)$ has a ground state.
Proof. Let 0 < |λ| < λ(g). By Theorem 2.3 and by the proof of Theorem 6.1, we need only to check that
$$\frac{\lambda^2 c_1(g)^2}{E_0(H_{PF}(\lambda))^2} + \frac{1}{2} \Bigl( \frac{2\lambda^2 c_1(g)^2}{E_0(H_{PF}(\lambda))^2} + 1 \Bigr) \lambda^2 c_2(g)^2 < 1 \tag{6.10}$$
for all |λ| sufficiently close to λ(g). By (6.7) and Corollary 3.1, we have
$$\begin{aligned} E_0(H_{PF}(\lambda)) &\le (1 - \lambda^2 G(g)) d_E + \alpha E + \frac{\lambda(g)}{2} c_1(g) \\ &\le (1 - \lambda^2 G(g)) d_E - \frac{\alpha}{2} |V_0| + \alpha\varepsilon , \end{aligned} \tag{6.11}$$
where, in the last step, we have used (6.4). For all |λ| sufficiently close to λ(g) and sufficiently small ε, the right hand side of (6.11) is negative. Hence, if we show that
$$\frac{\lambda^2 c_1(g)^2}{\bigl[ \frac{\alpha |V_0|}{2} - \alpha\varepsilon - (1 - \lambda^2 G(g)) d_E \bigr]^2} + \frac{1}{2} \Bigl( \frac{2\lambda^2 c_1(g)^2}{\bigl[ \frac{\alpha |V_0|}{2} - \alpha\varepsilon - (1 - \lambda^2 G(g)) d_E \bigr]^2} + 1 \Bigr) \lambda^2 c_2(g)^2 < 1 \tag{6.12}$$
for all |λ| sufficiently close to λ(g) and sufficiently small ε, then (6.10) follows. It is easy to see that (6.9) implies (6.12) for all |λ| sufficiently close to λ(g) and sufficiently small ε > 0.
Remark 6.1. Theorems 6.1 and 6.2 give only sufficient conditions for $H_{PF}(\lambda)$ to have a ground state with |λ| in an “intermediate” region. Suppose that $H_p$ has no ground state. Then it would be an interesting problem to investigate if there is a constant $\lambda_0 > 0$ such that, for all |λ| ∈ (0, $\lambda_0$), $H_{PF}(\lambda)$ has no ground state. Unfortunately we have been unable to give an answer to this problem.
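Example 6.1. Assume Hypothesis VII. Let κ > 0 and $H_{PF}^{\kappa}(\lambda)$ be the $H_{PF}(\lambda)$ with ω replaced by κω. Then conditions (6.4) and (6.9) take the following forms respectively:
$$\alpha |V_0| > \frac{1}{2\kappa^2} \lambda(g)^2 c_{3/2}(g)^2 + \frac{1}{\sqrt{\kappa}} \lambda(g) c_1(g) , \tag{6.13}$$
$$\frac{4\lambda(g)^2 c_1(g)^2}{\kappa \alpha^2 V_0^2} + \frac{1}{2\kappa^3} \Bigl( \frac{8\lambda(g)^2 c_1(g)^2}{\kappa \alpha^2 V_0^2} + 1 \Bigr) \lambda(g)^2 c_2(g)^2 < 1 . \tag{6.14}$$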
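The powers of κ in (6.13) and (6.14) follow from how each constant scales under ω → κω. The sketch below traces this scaling; the subscript κ on $G_\kappa$, $\lambda_\kappa$ and $c_{s,\kappa}$ is notation introduced here for illustration only, with the unsubscripted symbols denoting the values for the original ω and with $g_j$ left unchanged.

```latex
% Sketch. Assumption: only \omega is replaced by \kappa\omega; g_j, V, \alpha are fixed.
% Subscript \kappa marks the rescaled quantities (illustrative notation).
% From (6.2): each inner product <g_j/\sqrt{\omega}, g_l/\sqrt{\omega}> gains 1/\kappa.
\[
  G_\kappa(g) = \frac{G(g)}{\kappa} , \qquad
  \lambda_\kappa(g) = \frac{1}{\sqrt{G_\kappa(g)}} = \sqrt{\kappa} \, \lambda(g) .
\]
% From (6.3): \|g_j/(\kappa\omega)^s\| = \kappa^{-s} \|g_j/\omega^s\|.
\[
  c_{s,\kappa}(g) = \kappa^{-s} c_s(g) .
\]
% Products entering (6.4) and (6.9):
\[
  \lambda_\kappa c_{1,\kappa} = \kappa^{-1/2} \lambda(g) c_1(g) , \qquad
  \lambda_\kappa^2 c_{3/2,\kappa}^2 = \kappa^{-2} \lambda(g)^2 c_{3/2}(g)^2 ,
\]
\[
  \lambda_\kappa^2 c_{1,\kappa}^2 = \kappa^{-1} \lambda(g)^2 c_1(g)^2 , \qquad
  \lambda_\kappa^2 c_{2,\kappa}^2 = \kappa^{-3} \lambda(g)^2 c_2(g)^2 .
\]
```

Substituting these products into (6.4) and (6.9) reproduces (6.13) and (6.14). All four products tend to zero as κ → ∞, while the admissible coupling range $|\lambda| < \sqrt{\kappa} \, \lambda(g)$ grows.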
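For a given α|V0|, these inequalities are satisfied if κ is sufficiently large. Thus $H_{PF}^{\kappa}(\lambda)$ has a ground state for all sufficiently large κ and $|\lambda| < \sqrt{\kappa} \, \lambda(g)$ sufficiently close to $\sqrt{\kappa} \, \lambda(g)$. This result is somewhat analogous to the results by Hiroshima and Spohn [14, Lemma 3.3, Theorem 3.4], except that the regime of the coupling constant is different.
Example 6.2. Consider the original Pauli–Fierz model with one nonrelativistic particle in $\mathbb{R}^3$ so that n = 1, d = 3 and N = 2. We take ω(k) = |k|, k ∈ $\mathbb{R}^3$, and the momentum cutoff function $g_j : \mathbb{R}^3 \to \mathbb{R}^2$ (j = 1, 2, 3) as
$$g_j(k) = \Bigl( \frac{\chi_{[\sigma,L]}(|k|)}{\sqrt{(2\pi)^3 |k|}} e_j^{(1)}(k), \ \frac{\chi_{[\sigma,L]}(|k|)}{\sqrt{(2\pi)^3 |k|}} e_j^{(2)}(k) \Bigr) ,$$
where $\chi_{[\sigma,L]}$ is the characteristic function of the interval [σ, L] (σ > 0 is an infrared cutoff and L > σ is an ultraviolet cutoff) and $e^{(r)} = (e_1^{(r)}, e_2^{(r)}, e_3^{(r)}) : \mathbb{R}^3 \to \mathbb{R}^3$ (r = 1, 2) is Borel measurable such that
$$\langle e^{(s)}(k), e^{(r)}(k) \rangle = \delta_{sr} , \quad \langle e^{(r)}(k), k \rangle = 0 , \quad r, s = 1, 2, \ \text{a.e. } k \in \mathbb{R}^3 .$$
By the identity
$$\sum_{r=1}^{2} e_j^{(r)}(k) e_l^{(r)}(k) = \delta_{jl} - \frac{k_j k_l}{|k|^2} , \quad \text{a.e. } k$$
and the easily proven fact that
$$\int_{\mathbb{R}^3} f(|k|) k_j k_l \, dk = \delta_{jl} \frac{1}{3} \int_{\mathbb{R}^3} f(|k|) k^2 \, dk$$
for all $f : [0, \infty) \to \mathbb{C}$ such that $\int_{\mathbb{R}^3} f(|k|) k^2 \, dk < \infty$, we can show that
$$\Bigl\langle \frac{g_j}{\omega^s}, \frac{g_l}{\omega^s} \Bigr\rangle = \delta_{jl} \frac{2}{3(2\pi)^3} \int_{\mathbb{R}^3} \frac{\chi_{[\sigma,L]}(|k|)}{|k|^{2s+1}} \, dk = \begin{cases} \delta_{jl} \dfrac{8\pi}{3(2\pi)^3} \log \dfrac{L}{\sigma} ; & s = 1 , \\[2ex] \delta_{jl} \dfrac{8\pi}{3(2\pi)^3} \dfrac{1}{2(1-s)} \bigl( L^{2(1-s)} - \sigma^{2(1-s)} \bigr) ; & s \ne 1 . \end{cases}$$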
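The two cases in the last formula come from a radial integral. The following short computation, written out here as a check and using polar coordinates r = |k|, shows where the factor 8π and the exponent 2(1 − s) originate.

```latex
% Sketch: polar coordinates in R^3, dk = 4\pi r^2 dr for radial integrands.
\[
  \int_{\mathbb{R}^3} \frac{\chi_{[\sigma,L]}(|k|)}{|k|^{2s+1}} \, dk
  = 4\pi \int_{\sigma}^{L} r^{2 - (2s+1)} \, dr
  = 4\pi \int_{\sigma}^{L} r^{1-2s} \, dr .
\]
% Case s = 1: the integrand is 1/r.
\[
  4\pi \int_{\sigma}^{L} \frac{dr}{r} = 4\pi \log \frac{L}{\sigma} .
\]
% Case s \ne 1: the antiderivative of r^{1-2s} is r^{2(1-s)} / (2(1-s)).
\[
  4\pi \int_{\sigma}^{L} r^{1-2s} \, dr
  = \frac{4\pi}{2(1-s)} \bigl( L^{2(1-s)} - \sigma^{2(1-s)} \bigr) .
\]
% Multiplying by the prefactor 2 / (3 (2\pi)^3) gives 8\pi / (3 (2\pi)^3) in both cases.
```

For s = 3/2 and s = 2 the bracket is negative and so is the factor 1/(2(1 − s)), so the result is positive: one obtains 1/σ − 1/L and (1/2)(1/σ² − 1/L²) respectively, which are the expressions under the square roots in $c_{3/2}(g)$ and $c_2(g)$.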
Hence, in the present example, we have
$$G(g) = \frac{4\pi}{(2\pi)^3} (L - \sigma) , \quad \lambda(g) = \sqrt{\frac{(2\pi)^3}{4\pi}} \, \frac{1}{\sqrt{L - \sigma}} ,$$
$$c_1(g) = \sqrt{2} \, |\alpha| \Bigl( \sum_{j=1}^{3} \|D_j V\|_\infty \Bigr) \sqrt{\frac{8\pi}{3(2\pi)^3}} \sqrt{\log \frac{L}{\sigma}} ,$$
$$c_{3/2}(g) = \sqrt{2} \, |\alpha| \Bigl( \sum_{j=1}^{3} \|D_j V\|_\infty \Bigr) \sqrt{\frac{8\pi}{3(2\pi)^3}} \sqrt{\frac{1}{\sigma} - \frac{1}{L}} ,$$
$$c_2(g) = \sqrt{2} \, |\alpha| \Bigl( \sum_{j=1}^{3} \|D_j V\|_\infty \Bigr) \sqrt{\frac{8\pi}{3(2\pi)^3}} \, \frac{1}{\sqrt{2}} \sqrt{\frac{1}{\sigma^2} - \frac{1}{L^2}} .$$
From these formulas, we see that
$$\lambda(g) c_1(g) \sim \mathrm{const.} \sqrt{\frac{\log L}{L}} , \quad \lambda(g) c_{3/2}(g) \sim \mathrm{const.} \frac{1}{\sqrt{L}} , \quad \lambda(g) c_2(g) \sim \mathrm{const.} \frac{1}{\sqrt{L}}$$
as L → ∞, where “const.” denotes a constant independent of L sufficiently large. Hence, for all sufficiently large L, all the assumptions of Theorem 6.2 are satisfied. Thus, in the present example, $H_{PF}(\lambda)$ has a ground state for all sufficiently large L and |λ| < λ(g) sufficiently close to λ(g). A possible physical picture of this result is that the coupling of nonrelativistic quantum particles to photons with larger momenta increases the possibility for $H_{PF}(\lambda)$ to have a ground state.
Appendix A. Weak Differentiability of a Heisenberg Operator
Let $\mathcal{X}$ be a Hilbert space. Let H be a self-adjoint operator and S a symmetric operator on $\mathcal{X}$. Then the Heisenberg operator (“time evolution”) of S with respect to H is defined by
$$S(t) := e^{itH} S e^{-itH} , \quad t \in \mathbb{R} . \tag{A.1}$$
Proposition A.1. Suppose that there exists a self-adjoint operator K on $\mathcal{X}$ such that the following (K.1) and (K.2) hold:
(K.1) K strongly commutes with H.
(K.2) D(K) ⊂ D(S) and there exist constants a, b ≥ 0 such that
$$\|S\psi\| \le a \|K\psi\| + b \|\psi\| , \quad \psi \in D(K) .$$
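Then, for all ψ, φ ∈ D(K) ∩ D(H), the function t ↦ $\langle \psi, S(t)\phi \rangle$ (t ∈ R) is differentiable and
$$\frac{d}{dt} \langle \psi, S(t)\phi \rangle = i \bigl\{ \langle H e^{-itH}\psi, S e^{-itH}\phi \rangle - \langle S e^{-itH}\psi, H e^{-itH}\phi \rangle \bigr\} . \tag{A.2}$$
Proof. It follows from the strong commutativity of K with H and the two-variable functional calculus that $e^{itH} D(K) \cap D(H) = D(K) \cap D(H)$ for all t ∈ R. Let $f(t) := \langle \psi, S(t)\phi \rangle$, $F_\varepsilon := (e^{-i\varepsilon H} - 1)/\varepsilon$ and $G_\varepsilon := e^{-i\varepsilon H} - 1$ with ε ∈ R \ {0}. Then
$$\frac{f(t + \varepsilon) - f(t)}{\varepsilon} = \langle F_\varepsilon e^{-itH}\psi, S e^{-itH} G_\varepsilon \phi \rangle + \langle F_\varepsilon e^{-itH}\psi, S e^{-itH}\phi \rangle + \langle S e^{-itH}\psi, e^{-itH} F_\varepsilon \phi \rangle .$$
The first term on the right hand side is estimated as follows:
$$\begin{aligned} |\langle F_\varepsilon e^{-itH}\psi, S e^{-itH} G_\varepsilon \phi \rangle| &\le \|F_\varepsilon \psi\| \bigl( a \|K e^{-itH} G_\varepsilon \phi\| + b \|G_\varepsilon \phi\| \bigr) \\ &= \|F_\varepsilon \psi\| \bigl( a \|G_\varepsilon K\phi\| + b \|G_\varepsilon \phi\| \bigr) , \end{aligned}$$
where, in the last step, we have used the strong commutativity of K and H. Note that $F_\varepsilon \psi \to -iH\psi$ and $G_\varepsilon \phi \to 0$ strongly as ε → 0. Hence
$$\lim_{\varepsilon \to 0} \langle F_\varepsilon e^{-itH}\psi, S e^{-itH} G_\varepsilon \phi \rangle = 0$$
and (A.2) follows.
Appendix B. Ground States of Self-Adjoint Operators
In this section we establish two abstract theorems on existence of ground states of a self-adjoint operator on an abstract Hilbert space. They reveal general structures of methods used in previous papers [7, 12, 13] to prove the existence of ground states of particle-field interaction models.
B.1. Existence of a ground state of a self-adjoint operator
For a self-adjoint operator S on a Hilbert space $\mathcal{X}$, we denote the form domain of S by Q(S):
$$Q(S) := \Bigl\{ \psi \in \mathcal{X} \Bigm| \int_{\mathbb{R}} |\lambda| \, d\|E_S(\lambda)\psi\|^2 < \infty \Bigr\} = D(|S|^{1/2}) , \tag{B.1}$$
where $E_S(\cdot)$ denotes the spectral measure of S (see Sec. 2 for notations). For ψ, φ ∈ Q(S), we define $\langle \psi, S\phi \rangle$ by
$$\langle \psi, S\phi \rangle := \int_{\mathbb{R}} \lambda \, d\langle \psi, E_S(\lambda)\phi \rangle .$$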
For symmetric operators A, B and a subspace $\mathcal{D}$ ⊂ D(A) ∩ D(B), we mean by “A ≤ B on $\mathcal{D}$” that $\langle \psi, A\psi \rangle \le \langle \psi, B\psi \rangle$ for all ψ ∈ $\mathcal{D}$.
Let $\mathcal{H}$ and $\mathcal{K}$ be separable Hilbert spaces. Let A and B be self-adjoint operators on $\mathcal{H}$ and $\mathcal{K}$ respectively. We assume the following:
Hypothesis A. The operator A is bounded from below and B is nonnegative with $E_0(B) = 0$.
We set
$$T_0 := A \otimes I + I \otimes B , \tag{B.2}$$
which is self-adjoint and bounded from below by $E_0(A)$. For a sesquilinear form Z on a Hilbert space, we denote its form domain by Q(Z). Let Z be a symmetric sesquilinear form on $\mathcal{H} \otimes \mathcal{K}$ obeying the following conditions:
(i) Q(Z) ⊃ Q(I ⊗ B);
(ii) There exist constants a ∈ [0, 1) and b ≥ 0 such that, for all ψ ∈ Q(I ⊗ B), $|Z(\psi, \psi)| \le a \langle \psi, I \otimes B \psi \rangle + b \|\psi\|^2$.
Lemma B.1. Assume Hypothesis A and let Z be as above. Then there exists a unique self-adjoint operator T on $\mathcal{H} \otimes \mathcal{K}$ such that $Q(T) = Q(T_0)$ and
$$\langle \psi, T\phi \rangle = \langle \psi, T_0 \phi \rangle + Z(\psi, \phi) , \quad \psi, \phi \in Q(T_0) .$$
T is bounded from below by $E_0(A) - b$ and every domain of essential self-adjointness for $T_0$ is a form core for T.
Proof. Let $\hat{A} := A - E_0(A)$, which is nonnegative. By the present assumption, we have for all ψ ∈ $Q(T_0)$, $|Z(\psi, \psi)| \le a \langle \psi, (\hat{A} \otimes I + I \otimes B)\psi \rangle + b \|\psi\|^2$. Note that $\hat{A} \otimes I + I \otimes B \ge 0$. Hence we can apply the KLMN theorem [16, Theorem X.17] to conclude that there exists a unique self-adjoint operator T′ on $\mathcal{H} \otimes \mathcal{K}$ such that $Q(T') = Q(\hat{A} \otimes I + I \otimes B) = Q(T_0)$ and $T' = \hat{A} \otimes I + I \otimes B + Z$ in the sense of sesquilinear form on $Q(T_0)$ with T′ ≥ −b. Then the operator T defined by $T := T' + E_0(A)$ is the desired one.
Lemma B.2. Under the same hypothesis as in Lemma B.1,
$$|E_0(T) - E_0(A)| \le b . \tag{B.3}$$
Proof. By the variational principle and Lemma B.1, we have
$$E_0(T) \ge E_0(A) - b . \tag{B.4}$$
On the other hand, for all f ∈ D(A) and g ∈ D(B) with ‖f‖ = 1 and ‖g‖ = 1, we have $E_0(T) \le \langle f, Af \rangle + (1 + a) \langle g, Bg \rangle + b$, which, together with the variational principle, implies that $E_0(T) \le E_0(A) + b$, where we have used the condition $E_0(B) = 0$. Hence (B.3) follows.
We set
$$\Sigma := \inf \sigma_{\mathrm{ess}}(A) \tag{B.5}$$
and, for s > 0,
$$\beta(s) := \frac{E_0(T) - E_0(A) + b + s}{1 - a} . \tag{B.6}$$
By (B.3), we have
$$\beta(s) \ge \frac{s}{1 - a} > 0 . \tag{B.7}$$
Theorem B.1. Assume Hypothesis A and let Z be as above. Suppose that
$$\Sigma - E_0(T) > b \tag{B.8}$$
and, for some $s_0 > 0$, $\mathrm{Ran}(E_B([0, \beta(s_0)]))$ is finite dimensional. Let m be a constant such that
$$\Sigma - E_0(T) > m + b , \tag{B.9}$$
$$0 < m < s_0 . \tag{B.10}$$
Then T has purely discrete spectrum in the interval $[E_0(T), E_0(T) + m)$. In particular, T has a ground state.
Remark B.1. By Lemma B.2, condition (B.8) is satisfied if $\Sigma - E_0(A) > 2b$.
Proof. Let
$$\hat{A} := A - E_0(A) , \quad T_m := T - E_0(T) - m .$$
Then we have on $D(T_0)$,
$$T_m = \hat{A} \otimes I + I \otimes B + Z + E_0(A) - E_0(T) - m \ge \hat{A} \otimes I + I \otimes (1 - a)B - \alpha_0 ,$$
where
$$\alpha_0 := E_0(T) - E_0(A) + b + m \ge m . \tag{B.11}$$
Since, by (B.9), we have $\alpha_0 < \Sigma - E_0(A)$, one can take a constant $\delta > 0$ such that $\alpha_0 \le \delta < \Sigma - E_0(A)$. Let $P_\delta := E_{\hat A}([0, \delta])$ and $P_\delta^\perp := I - P_\delta = E_{\hat A}((\delta, \infty))$, so that $P_\delta + P_\delta^\perp = I$. Then we have
$$\hat A \otimes I + I \otimes (1 - a)B - \alpha_0 = P_\delta \hat A \otimes I + P_\delta^\perp \hat A \otimes I + P_\delta \otimes (1 - a)B + P_\delta^\perp \otimes (1 - a)B - \alpha_0 P_\delta \otimes I - \alpha_0 P_\delta^\perp \otimes I .$$
Note that
$$P_\delta \hat A \otimes I \ge 0 , \qquad P_\delta^\perp \hat A \otimes I \ge \delta P_\delta^\perp \otimes I , \qquad P_\delta^\perp \otimes (1 - a)B \ge 0 .$$
Hence
$$\hat A \otimes I + I \otimes (1 - a)B - \alpha_0 \ge (\delta - \alpha_0) P_\delta^\perp \otimes I + P_\delta \otimes [(1 - a)B - \alpha_0] \ge P_\delta \otimes [(1 - a)B - \alpha_0] ,$$
where we have used the condition $\delta \ge \alpha_0$. Hence we have on $D(T_0)$,
$$T_m \ge P_\delta \otimes [(1 - a)B - \alpha_0] \ge P_\delta \otimes [(1 - a)B - \alpha_0]_- ,$$
where $[(1 - a)B - \alpha_0]_-$ means the negative part of $(1 - a)B - \alpha_0$. Let $J_m := E_{T_m}([-m, 0))$.

(i) The case where $\mathrm{Ran}(J_m)$ is finite dimensional. In this case $T_m$ has purely discrete spectrum in $[-m, 0)$. This means that the spectrum of $T$ in $[E_0(T), E_0(T) + m)$ is purely discrete. In particular $T$ has a ground state.

(ii) The case where $\mathrm{Ran}(J_m)$ is infinite dimensional. Note that
$$[(1 - a)B - \alpha_0]_- = E_B([0, \beta(m)))\,[(1 - a)B - \alpha_0]\,E_B([0, \beta(m))) .$$
By condition (B.10) and the present assumption, $\mathrm{Ran}(E_B([0, \beta(m))))$ is finite dimensional. Hence $[(1 - a)B - \alpha_0]_-$ is trace class. Therefore $P_\delta \otimes [(1 - a)B - \alpha_0]_-$ is trace class. Let $\{\psi_n\}_{n=1}^\infty$ be a complete orthonormal system of $\mathrm{Ran}(J_m)$. Then, for all $N \in \mathbb{N}$,
$$0 \ge \sum_{n=1}^N \langle \psi_n, T_m \psi_n\rangle \ge \sum_{n=1}^N \langle \psi_n, P_\delta \otimes [(1 - a)B - \alpha_0]_- \psi_n\rangle \ge \mathrm{Tr}\{P_\delta \otimes [(1 - a)B - \alpha_0]_-\} .$$
Hence $\sum_{n=1}^N \langle \psi_n, T_m \psi_n\rangle$ is convergent as $N \to \infty$, which implies that $J_m T_m J_m$ is trace class and hence compact. Thus $T_m$ has purely discrete spectrum in $[-m, 0)$, which implies that $T$ has purely discrete spectrum in $[E_0(T), E_0(T) + m)$. In particular $T$ has a ground state.
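The identification of the negative part used in case (ii) rests on a short computation that the proof leaves implicit. The following sketch spells it out using only (B.6), (B.10) and (B.11); the symbol $\lambda$ for a spectral parameter of $B$ is introduced here for illustration, and $0 \le a < 1$ is assumed, as the bound (B.7) suggests.

```latex
% Sketch (assumptions: 0 <= a < 1; \lambda >= 0 denotes a spectral value of B).
\begin{align*}
\beta(m) &= \frac{E_0(T) - E_0(A) + b + m}{1 - a} = \frac{\alpha_0}{1 - a}
  && \text{by (B.6) with } s = m \text{ and (B.11)}, \\
(1 - a)\lambda - \alpha_0 < 0
  &\iff \lambda < \frac{\alpha_0}{1 - a} = \beta(m) .
\end{align*}
% By the functional calculus for B, the negative part of (1-a)B - \alpha_0 is
% therefore supported on the spectral subspace of [0, \beta(m)):
\[
  [(1 - a)B - \alpha_0]_-
  = E_B([0, \beta(m)))\,[(1 - a)B - \alpha_0]\,E_B([0, \beta(m))) .
\]
% Since 0 < m < s_0 by (B.10), \beta(m) < \beta(s_0), and hence
\[
  \mathrm{Ran}\,E_B([0, \beta(m))) \subset \mathrm{Ran}\,E_B([0, \beta(s_0)]) ,
\]
% which is finite dimensional by the hypothesis of Theorem B.1.
```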
B.2. A limit theorem on ground states

Let $K$ be a self-adjoint operator on $\mathcal{H}$ bounded from below and $B$ be a nonnegative self-adjoint operator on $\mathcal{K}$ with $E_0(B) = 0$. Let $C$ be a symmetric operator on $\mathcal{H} \otimes \mathcal{K}$ with $D(K \otimes I) \cap D(I \otimes B) \subset D(C)$ such that
$$H := K \otimes I + I \otimes B + C \tag{B.12}$$
is self-adjoint and bounded from below. Let
$$\Sigma := \inf \sigma_{\mathrm{ess}}(K) . \tag{B.13}$$

Hypothesis (B.1). $\Sigma > E_0(K)$.

Hypothesis (B.2). There are a family $\{K_m\}_{m \in (0, m_0]}$ of symmetric operators on $\mathcal{H}$ with $D(K) \subset \bigcap_{m \in (0, m_0]} D(K_m)$ and a family $\{B_m\}_{m \in (0, m_0]}$ of nonnegative self-adjoint operators on $\mathcal{K}$ with $E_0(B_m) = 0$ such that the following hold:

(i) There exists a constant $c_m > 0$ such that, for all $u \in D(K)$, $\|(K - K_m)u\| \le c_m(\|Ku\| + \|u\|)$ and $\lim_{m \to 0} c_m = 0$.

(ii) There exists a nonzero vector $\psi_0$ such that, for all $m \in (0, m_0]$, $B_m \psi_0 = 0$. We denote the orthogonal projection onto the one-dimensional subspace $\{\alpha \psi_0 \mid \alpha \in \mathbb{C}\}$ by $P_0$.

(iii) For each $m \in (0, m_0]$, $D(K_m \otimes I) \cap D(I \otimes B_m) \subset D(C)$ and the operator
$$H_m := K_m \otimes I + I \otimes B_m + C \tag{B.14}$$
is self-adjoint and bounded from below.

(iv) There exists a dense subspace $\mathcal{D} \subset [\bigcap_{m \in (0, m_0]} D(B_m)] \cap D(B)$ such that, for all $\psi \in \mathcal{D}$, $\lim_{m \to 0} B_m \psi = B\psi$ and $D(K) \otimes_{\mathrm{alg}} \mathcal{D}$ is a core of $H$, where $\otimes_{\mathrm{alg}}$ means algebraic tensor product.

For an orthogonal projection $P$ on a Hilbert space, we set $P^\perp := I - P$.

Theorem B.2. Assume Hypotheses (B.1) and (B.2). Suppose that
$$\inf_{m \in (0, m_0]} E_0(H_m) > -\infty , \qquad \sup_{m \in (0, m_0]} E_0(H_m) < \Sigma . \tag{B.15}$$
Suppose that, for all $m \in (0, m_0]$, $H_m$ has a ground state $\Psi_m$ with $\|\Psi_m\| = 1$ and there exists a constant $\delta < 1$ independent of $m \in (0, m_0]$ such that
$$\frac{1}{(\Sigma - E_0(H_m))^2}\|C\Psi_m\|^2 + \|I \otimes P_0^\perp \Psi_m\|^2 < \delta . \tag{B.16}$$
Then there exists a subsequence $\{\Psi_{m_j}\}_{j=1}^\infty$ with
$$m_1 > m_2 > \cdots > m_j > m_{j+1} > \cdots , \qquad \lim_{j \to \infty} m_j = 0 ,$$
such that the weak limit $\Psi_0 := \text{w-}\lim_{j \to \infty} \Psi_{m_j}$ is a ground state of $H$.
Proof. We divide the proof of this theorem into two steps.

(1) By Hypothesis (B.2)-(i), there exists a positive constant $\varepsilon_0 < m_0$ such that, for all $0 < m < \varepsilon_0$, $c_m < 1$. Then, by the Kato–Rellich theorem, $K_m = K + (K_m - K)$ is self-adjoint with $D(K_m) = D(K)$ and bounded from below. We can take a constant $\xi$ such that $\max\{\sup_{m \in (0, m_0]} E_0(H_m), E_0(K)\} < \xi < \Sigma$ and
$$\frac{1}{(\xi - E_0(H_m))^2}\|C\Psi_m\|^2 + \|I \otimes P_0^\perp \Psi_m\|^2 \le \delta . \tag{B.17}$$
Let $P_K := E_K([E_0(K), \xi])$. Then, by Hypothesis (B.1), $\dim \mathrm{Ran}\, P_K < \infty$. Let $K_m(\beta) := K + \beta L_m$ with $L_m := (K_m - K)/c_m$ and $\beta \in \mathbb{C}$. Since $L_m$ is relatively bounded with respect to $K$, it follows from [17, p. 16, Lemma] that $K_m(\beta)$ is an analytic family of type (A) near $\beta = 0$. Hence it is an analytic family in the sense of Kato [17, p. 17, Theorem XII.9] and $K_m(\beta)$ is self-adjoint for real $\beta$ with $|\beta|$ sufficiently small. We define
$$Q_m(\beta) := E_{K_m(\beta)}([E_0(K), \xi]) \quad \text{and} \quad Q_m := E_{K_m}([E_0(K), \xi]) .$$
Then $Q_m(\beta)$ is analytic near $\beta = 0$. In particular,
$$\lim_{\beta \to 0} \|Q_m(\beta) - P_K\| = 0$$
and hence $\dim \mathrm{Ran}\, Q_m(\beta) = \dim \mathrm{Ran}\, P_K < \infty$ for all sufficiently small $|\beta|$. Note that $K_m(c_m) = K_m$. Therefore, for every $\varepsilon > 0$, there exists a constant $\eta_0 > 0$ such that, for all $m \in (0, \eta_0)$,
$$\|Q_m - P_K\| < \varepsilon \tag{B.18}$$
and $\dim \mathrm{Ran}\, Q_m = \dim \mathrm{Ran}\, P_K$.

(2) By the weak compactness of the unit ball of a Hilbert space and condition (B.15), there exists a subsequence $\{\Psi_{m_j}\}_{j=1}^\infty$ ($m_1 > m_2 > \cdots > m_j > m_{j+1} > \cdots$, $\lim_{j \to \infty} m_j = 0$) such that the weak limit $\Psi_0 := \text{w-}\lim_{j \to \infty} \Psi_{m_j}$ and $E_0 := \lim_{j \to \infty} E_0(H_{m_j})$ exist. By Hypothesis (B.2)-(iii), we have $\lim_{m \to 0} H_m \Psi = H\Psi$ for all $\Psi \in D(K) \otimes_{\mathrm{alg}} \mathcal{D}$. Hence, by an application of [4, Lemma 4.9], if we show that $\Psi_0 \ne 0$, then we can conclude that $\Psi_0$ is a ground state with $E_0 = E_0(H)$. We have $\dim \mathrm{Ran}\, P_0 = 1$. Hence, to show that $\Psi_0 \ne 0$, we need only prove
$$\langle \Psi_m, P_K \otimes P_0 \Psi_m\rangle \ge 1 - \delta' \tag{B.19}$$
with a constant $\delta' < 1$ independent of $m$. Then, passing to the subsequence $\{\Psi_{m_j}\}_j$ and taking the limit $j \to \infty$, we obtain $\langle \Psi_0, P_K \otimes P_0 \Psi_0\rangle \ge 1 - \delta' > 0$, which implies that $\Psi_0 \ne 0$.
To prove (B.19), we first prove
$$\langle \Psi_m, Q_m \otimes P_0 \Psi_m\rangle \ge 1 - \delta . \tag{B.20}$$
Then, by (B.18), we obtain (B.19) for all $m < \eta_0$ with $\delta' = \delta + \varepsilon < 1$ and hence the proof is completed. Now we note that (B.20) is equivalent to
$$\langle \Psi_m, (Q_m^\perp \otimes P_0 + I \otimes P_0^\perp)\Psi_m\rangle \le \delta . \tag{B.21}$$
This is seen by using the identity $1 = \langle \Psi_m, (Q_m + Q_m^\perp) \otimes (P_0 + P_0^\perp)\Psi_m\rangle$. We prove (B.21). We have
$$(Q_m^\perp \otimes P_0)H_m = Q_m^\perp K_m \otimes P_0 + (Q_m^\perp \otimes P_0)C .$$
Hence
$$0 = (Q_m^\perp \otimes P_0)(H_m - E_0(H_m))\Psi_m = (Q_m^\perp(K_m - E_0(H_m)) \otimes P_0)\Psi_m + (Q_m^\perp \otimes P_0)C\Psi_m ,$$
which implies that
$$\langle \Psi_m, (Q_m^\perp \otimes P_0)C\Psi_m\rangle = -\langle \Psi_m, Q_m^\perp(K_m - E_0(H_m)) \otimes P_0 \Psi_m\rangle \le -(\xi - E_0(H_m))\langle \Psi_m, Q_m^\perp \otimes P_0 \Psi_m\rangle .$$
Hence
$$\langle \Psi_m, Q_m^\perp \otimes P_0 \Psi_m\rangle \le -\frac{1}{\xi - E_0(H_m)}\langle \Psi_m, (Q_m^\perp \otimes P_0)C\Psi_m\rangle \le \frac{1}{\xi - E_0(H_m)}\|Q_m^\perp \otimes P_0 \Psi_m\|\,\|C\Psi_m\| .$$
Hence
$$\langle \Psi_m, Q_m^\perp \otimes P_0 \Psi_m\rangle \le \frac{1}{(\xi - E_0(H_m))^2}\|C\Psi_m\|^2 ,$$
which, together with (B.17), yields (B.21).

Acknowledgments

This work was completed during the stay of the first author (A. A.) at the Erwin Schrödinger International Institute for Mathematical Physics (ESI) in the autumn of 2002. A. A. would like to thank Professor H. Grosse for giving him an opportunity to participate in the ESI program "Noncommutative Geometry and Quantum Field Theory" and for his warm hospitality. A. A. also acknowledges the support given by the ESI. This work was also supported in part by the Grant-In-Aid No.13440039 for scientific research from the Japan Society for Promotion of Science.
References

[1] A. Arai, Self-adjointness and spectrum of Hamiltonians in nonrelativistic quantum electrodynamics, J. Math. Phys. 22 (1981), 534–537.
[2] A. Arai, An asymptotic analysis and its application to the nonrelativistic limit of the Pauli–Fierz and a spin-boson model, J. Math. Phys. 31 (1990), 2653–2663.
[3] A. Arai, Fock Spaces and Quantum Fields, Nippon-Hyouronsha, Tokyo, 2000 (in Japanese).
[4] A. Arai and M. Hirokawa, On the existence and uniqueness of ground states of a generalized spin-boson model, J. Funct. Anal. 151 (1997), 455–503.
[5] A. Arai and M. Hirokawa, Ground states of a general class of quantum field Hamiltonians, Rev. Math. Phys. 8 (2000), 1085–1135.
[6] A. Arai, M. Hirokawa and F. Hiroshima, On the absence of eigenvectors of Hamiltonians in a class of massless quantum field models without infrared cutoff, J. Funct. Anal. 168 (1999), 470–497.
[7] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998), 299–395.
[8] I. Catto and C. Hainzl, Self-energy of one electron in nonrelativistic QED, math-ph/0207036, 2002.
[9] T. Chen, V. Vougalter and S. A. Vugalter, The increase of binding energy and enhanced binding in nonrelativistic QED, J. Math. Phys. 44 (2003), 1961–1970.
[10] C. Hainzl, One nonrelativistic particle coupled to a photon field, Ann. Henri Poincaré 4 (2003), 217–237.
[11] C. Hainzl, V. Vougalter and S. A. Vugalter, Enhanced binding in nonrelativistic QED, Commun. Math. Phys. 233 (2003), 13–26.
[12] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics I, J. Math. Phys. 40 (1999), 6209–6222.
[13] F. Hiroshima, Analysis of ground states of atoms interacting with a quantized radiation field, preprint, 2002.
[14] F. Hiroshima and H. Spohn, Enhanced binding through coupling to a quantum field, Ann. Henri Poincaré 2 (2001), 1159–1187.
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. I, Academic Press, New York, 1972.
[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. II, Academic Press, New York, 1975.
[17] M. Reed and B. Simon, Methods of Modern Mathematical Physics Vol. IV, Academic Press, New York, 1978.
July 14, 2003 10:1 WSPC/148-RMP
00164
Reviews in Mathematical Physics Vol. 15, No. 5 (2003) 425–445 © World Scientific Publishing Company
TRACES FOR STAR PRODUCTS ON THE DUAL OF A LIE ALGEBRA
PIERRE BIELIAVSKY∗ and SIMONE GUTT†
Département de Mathématique, Université Libre de Bruxelles, Campus Plaine, C. P. 218, Boulevard du Triomphe, B-1050 Bruxelles, Belgique
∗[email protected]
†[email protected]

MARTIN BORDEMANN
Laboratoire de Mathématiques, Université de Haute-Alsace Mulhouse, 4, Rue des Frères Lumière, F.68093 Mulhouse, France
[email protected]

STEFAN WALDMANN
Fakultät für Mathematik und Physik, Albert-Ludwigs-Universität Freiburg, Physikalisches Institut, Hermann Herder Straße 3, D 79104 Freiburg, Germany
[email protected]

Received 20 March 2002
Revised 29 January 2003

In this paper, we describe all traces for the BCH star-product on the dual of a Lie algebra. First we show by an elementary argument that the BCH as well as the Kontsevich star-product are strongly closed if and only if the Lie algebra is unimodular. In a next step we show that the traces of the BCH star-product are given by the ad-invariant functionals. Particular examples are the integrations over coadjoint orbits. We show that for a compact Lie group and a regular orbit one can even achieve that this integration becomes a positive trace functional. In this case we explicitly describe the corresponding GNS representation. Finally we discuss how invariant deformations on a group can be used to induce deformations of spaces on which the group acts.

Keywords: Deformation quantization; closed star-product; trace functionals.
1. Introduction

Trace functionals play an important role in deformation quantization [4] (for recent reviews on deformation quantization we refer to [20, 25, 37, 41]; existence and classification results can be found in [5, 21, 29, 32, 33, 43]). Physically, traces correspond to states of thermodynamical equilibrium characterized by the KMS condition at infinite temperature [3, 11]. Note however, that
for a reasonable physical interpretation one has to impose an additional positivity condition on the traces [12, 40].

On the mathematical side traces are one half of the index theorem, namely the part of cyclic cohomology. The other half comes from the K-theory part. Having a trace functional $\mathrm{tr} : \mathcal{A} \to C$ of an associative algebra $\mathcal{A}$ over some commutative ring $C$ and having a projection $P = P^2 \in M_n(\mathcal{A})$ representing an element $[P] \in K_0(\mathcal{A})$, the value $\mathrm{tr}(P) \in C$ does not depend on $P$ but only on its class $[P]$. This is just the usual natural pairing of cyclic cohomology with K-theory, see e.g. [17, Chap. III.3], and the value $\mathrm{ind}([P]) = \mathrm{tr}(P)$ is called the index of $[P]$ with respect to the chosen trace.

In the case of deformation quantization the situation is as follows. The starting point is a star-product $\star$ for a Poisson manifold $(M, \pi)$ whence the algebra of interest is $\mathcal{A} = (C^\infty(M)[[\nu]], \star)$ viewed as an algebra over $\mathbb{C}[[\nu]]$. Then a trace is a $\mathbb{C}[[\nu]]$-linear functional $\mathrm{tr} : C^\infty(M)[[\nu]] \to \mathbb{C}[[\nu]]$ such that
$$\mathrm{tr}(f \star g) = \mathrm{tr}(g \star f) , \tag{1.1}$$
whenever one function has compact support. For the K-theory part of the index theorem one knows that K-theory is stable under deformation, see e.g. [36]: any projection $P_0$ of the undeformed algebra $M_n(C^\infty(M))$ can be deformed into a projection
$$P = \frac{1}{2} + \left(P_0 - \frac{1}{2}\right) \star \frac{1}{\sqrt[\star]{1 + 4(P_0 \star P_0 - P_0)}} \tag{1.2}$$
with respect to $\star$, see [21, Eq. (6.1.4)]. Moreover, this deformation is unique up to equivalence of projections and any projection of the deformed algebra arises this way. It follows that $\mathrm{ind}([P])$ only depends on $[P_0] \in K_0(C^\infty(M))$, which is the isomorphism class of the vector bundle defined by $P_0$, see also [13] for a more detailed discussion.

Now let $\tilde\star$ be an equivalent star product with equivalence transformation $T(f \star g) = Tf \mathbin{\tilde\star} Tg$. Then clearly $\widetilde{\mathrm{tr}} = \mathrm{tr} \circ T^{-1}$ defines a trace functional with respect to $\tilde\star$. From (1.2) we see that $\mathrm{ind}([P]) = \widetilde{\mathrm{ind}}([\tilde P])$ where $\widetilde{\mathrm{ind}}$ is the index with respect to the trace $\widetilde{\mathrm{tr}}$ and $\tilde\star$. Thus the index transforms well under equivalences of star products provided one uses the 'correct' corresponding trace. It happens that in the symplectic case there is only one trace up to normalization [32]. So suppose that $M$ is compact and that for each star product $\star$ we have chosen a trace $\mathrm{tr}_\star$ normalized such that $\mathrm{tr}_\star(1) = c$ where $c$ does not depend on $\star$. Then $T1 = 1$ implies $\mathrm{tr}_{\tilde\star} = \mathrm{tr}_\star \circ T^{-1}$ and thus the index does not depend on the choice of $\star$ but only on the equivalence class $[\star]$. This simple reasoning already explains the structure of Fedosov's index formula [21, Theorem 6.1.6], see also the algebraic index theorem of Nest and Tsygan [32]. Nevertheless we would like to mention that the computation of $\mathrm{ind}([P])$ in geometrical terms is a quite non-trivial task.
For a formulation of the index theorem in the general Poisson case we refer to [39]. Here the situation is far more non-trivial as in general there is no longer a unique trace. In [22] it is shown that integration over $M$ with respect to some smooth density $\Omega$ is a trace for Kontsevich's star product provided the Poisson tensor is $\Omega$-divergence free. However, there are many more traces, typically involving integrations over the symplectic leaves.

An elementary proof that in the symplectic case one has a unique trace is presented in [27]. This approach uses the canonical way of normalization of the trace, introduced by Karabegov [28] using local $\nu$-Euler derivations, see [26], and the elementary proof of the uniqueness up to scaling of a trace as given in [11]: here one uses the fact that in the whole algebraic dual of $C^\infty(M)$ there is only one Poisson trace
$$\tau_0(\{f, g\}) = 0 , \tag{1.3}$$
namely the integration with respect to the Liouville measure.

In this article we shall now consider the simplest case of a Poisson manifold: the dual of a Lie algebra. Here we shall determine all the traces for the BCH star product on $\mathfrak{g}^*$ by very elementary arguments.

The paper is organized as follows. In Sec. 2 we recall the construction of various star products on the dual $\mathfrak{g}^*$ of a Lie algebra as well as their relation to star products on $T^*G$ where $G$ is a Lie group with Lie algebra $\mathfrak{g}$. Then we prove the strong closedness of homogeneous star products on $\mathfrak{g}^*$ by elementary computations in Sec. 3 and in Sec. 4 we show that any ad-invariant functional is a trace for the BCH star product. In Sec. 5 we prove the positivity of a trace $\tau_O$ associated to a regular orbit $O \subseteq \mathfrak{g}^*$ for compact $G$ by a BRST construction of a star product on $O$. Section 6 contains a characterization of the GNS representation obtained from the positive trace $\tau_O$. Finally, Sec. 7 is devoted to a construction of trace functionals by a group action using a 'universal deformation' on the group, inspired by techniques developed in [6, 23].

2. Star Products on $\mathfrak{g}^*$ and $T^*G$

In this section we shall recall the construction of several star products on the dual $\mathfrak{g}^*$ of a Lie algebra $\mathfrak{g}$ and on $T^*G$ where $G$ is a Lie group with Lie algebra $\mathfrak{g}$. First we shall establish some notation. By $e_1, \ldots, e_n$ we denote a basis of $\mathfrak{g}$ with dual basis $e^1, \ldots, e^n \in \mathfrak{g}^*$. Such a basis gives rise to linear coordinates $x = x^i e_i$ on $\mathfrak{g}$ and $\xi = \xi_i e^i$ on $\mathfrak{g}^*$. Here and in the following we shall use Einstein's summation convention. With a capital letter $X$ we shall denote the left-invariant vector field $X \in \Gamma^\infty(TG)$ corresponding to $x \in \mathfrak{g}$, i.e. $X_e = x$. A vector $x \in \mathfrak{g}$ determines a linear function $\hat x \in \mathrm{Pol}^1(\mathfrak{g}^*)$ by $\hat x(\xi) = \xi(x)$. Analogously, $X \in \Gamma^\infty(TG)$ determines a function $\hat X \in \mathrm{Pol}^1(T^*G)$, linear in the fibers, by setting $\hat X(\alpha_g) = \alpha_g(X_g)$, where $\alpha_g \in T_g^*G$ and $g \in G$. We shall use the same symbol $\hat{\ }$ for the corresponding graded algebra isomorphism
between the symmetric algebra $\bigvee^\bullet \mathfrak{g}$ of $\mathfrak{g}$ and all polynomials $\mathrm{Pol}^\bullet(\mathfrak{g}^*)$ on $\mathfrak{g}^*$. Similarly we have a graded algebra isomorphism between $\Gamma^\infty(\bigvee^\bullet TG)$ and $\mathrm{Pol}^\bullet(T^*G)$.

By use of left-invariant vector fields and one-forms, $TG$ and $T^*G$ trivialize canonically. This yields $TG \cong G \times \mathfrak{g}$ and $T^*G \cong G \times \mathfrak{g}^*$. The corresponding projections are denoted by
$$G \xleftarrow{\ \pi\ } G \times \mathfrak{g}^* \xrightarrow{\ \varrho\ } \mathfrak{g}^* , \tag{2.1}$$
whence in particular $\hat X = \varrho^* \hat x$ for a left-invariant vector field $X$. More generally, $\mathrm{Pol}^\bullet(T^*G)^G = \varrho^* \mathrm{Pol}^\bullet(\mathfrak{g}^*)$. For the symplectic Poisson bracket on $T^*G$ we use the sign convention such that the map $\hat{\ } : \Gamma^\infty(TG) \to \mathrm{Pol}^1(T^*G)$ becomes an isomorphism of Lie algebras (and not an anti-isomorphism as in [8]). Then the canonical linear Poisson bracket on $\mathfrak{g}^*$ can be obtained by the observation that left-invariant functions on $T^*G$ (with respect to the lifted action) are a Poisson sub-algebra which is in linear bijection with $C^\infty(\mathfrak{g}^*)$ via $\varrho^*$. Thus it is meaningful to require $\varrho^*$ to be a morphism of Poisson algebras. In the global coordinates $\xi_1, \ldots, \xi_n$ the resulting Poisson bracket on $\mathfrak{g}^*$ reads as
$$\{f, g\} = \xi_k c^k_{ij} \frac{\partial f}{\partial \xi_i} \frac{\partial g}{\partial \xi_j} , \tag{2.2}$$
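where $c^k_{ij} = e^k([e_i, e_j])$ are the structure constants of $\mathfrak{g}$ and $f, g \in C^\infty(\mathfrak{g}^*)$.

The first star-product on $\mathfrak{g}^*$ is essentially given by the Baker–Campbell–Hausdorff series of $\mathfrak{g}$. One uses the total symmetrization map $\sigma_\nu : \mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu] \to U(\mathfrak{g})[\nu]$ into the universal enveloping algebra of $\mathfrak{g}$, defined by
$$\sigma_\nu(\hat x_1 \cdots \hat x_k) = \frac{\nu^k}{k!} \sum_{\tau \in S_k} x_{\tau(1)} \cdot \cdots \cdot x_{\tau(k)} , \tag{2.3}$$
where we have built in the formal parameter $\nu$ already at this stage. Then
$$\sigma_\nu(f \star_{\mathrm{BCH}} g) = \sigma_\nu(f) \cdot \sigma_\nu(g) \tag{2.4}$$
yields indeed a deformed product $\star_{\mathrm{BCH}}$ for $f, g \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$, which turns out to extend to a differential star-product for $C^\infty(\mathfrak{g}^*)[[\nu]]$, see [24] for a detailed discussion. Here we shall just mention a few properties of $\star_{\mathrm{BCH}}$. First, $\star_{\mathrm{BCH}}$ is strongly $\mathfrak{g}$-invariant, i.e. for $f \in C^\infty(\mathfrak{g}^*)[[\nu]]$ and $x \in \mathfrak{g}$ we have
$$\hat x \star_{\mathrm{BCH}} f - f \star_{\mathrm{BCH}} \hat x = \nu\{\hat x, f\} . \tag{2.5}$$
Moreover, $\star_{\mathrm{BCH}}$ is homogeneous: this means that the operator $H = \nu \frac{\partial}{\partial \nu} + L_E$, where $E = \xi_i \frac{\partial}{\partial \xi_i}$ is the Euler vector field, is a derivation of $\star_{\mathrm{BCH}}$, i.e.
$$H(f \star_{\mathrm{BCH}} g) = Hf \star_{\mathrm{BCH}} g + f \star_{\mathrm{BCH}} Hg \tag{2.6}$$
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\nu]]$. It follows immediately that $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$ is a 'convergent' sub-algebra generated by the constant and linear polynomials. The relation to the BCH series can be seen as follows: consider the exponential functions $e_x(\xi) := e^{\xi(x)}$. Then for all $x, y \in \mathfrak{g}$ one has
$$e_x \star_{\mathrm{BCH}} e_y = e_{\frac{1}{\nu} H(\nu x, \nu y)} , \tag{2.7}$$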
where $H(\cdot, \cdot)$ is the BCH series of $\mathfrak{g}$. Since bidifferential operators on $\mathfrak{g}^*$ are already determined by their values on the exponential functions $e_x$, $x \in \mathfrak{g}$, the star-product $\star_{\mathrm{BCH}}$ is already determined by (2.7). For a more detailed analysis and proofs of the above statements we refer to [8, 24].

The other star-product we shall mention is the Kontsevich star-product $\star_K$ for $\mathfrak{g}^*$. His general construction of a star-product for arbitrary Poisson structures on $\mathbb{R}^n$ simplifies drastically in the case of a linear Poisson structure (2.2). We shall not enter the general construction but refer to [1, 2, 19, 29, 30] for more details and just mention a few properties of $\star_K$. First, $\star_K$ is $\mathfrak{g}$-covariant, i.e. one has
$$\hat x \star_K \hat y - \hat y \star_K \hat x = \nu\{\hat x, \hat y\} = \nu\,\widehat{[x, y]} \tag{2.8}$$
for all $x, y \in \mathfrak{g}$. This is a weaker compatibility with the (classical) $\mathfrak{g}$-action than (2.5). Moreover, $\star_K$ is homogeneous, too, but in general $\star_K$ and $\star_{\mathrm{BCH}}$ do not coincide but are only equivalent, see [19].

Let us now recall how the star-product $\star_{\mathrm{BCH}}$ on $\mathfrak{g}^*$ is related to star-products on $T^*G$. The main idea is to make the Poisson morphism $\varrho^*$ into an algebra morphism of star-product algebras. This requirement does not determine the star-product on $T^*G$ completely and the remaining freedom (essentially the choice of an 'ordering prescription' between functions depending only on $G$ and on $\mathfrak{g}^*$, respectively) can be used to impose further properties. In [24] a star-product $\star_G$ of Weyl-type was constructed by inserting additional derivatives in $G$-direction into the bidifferential operators of $\star_{\mathrm{BCH}}$. In [8] a star product $\star_S$ of standard-ordered type was obtained by a (standard-ordered) Fedosov construction using the lift of the half-commutator connection on $G$ to a symplectic connection on $T^*G$. The star-product $\star_S$ can also be understood as the resulting composition law of symbols from the standard-ordered symbol and differential operator calculus induced by the half-commutator connection. A further 'Weyl-symmetrization' yields a star-product $\star_W$ of Weyl-type which does not coincide in general with the original Fedosov star-product $\star_F$ built out of the half-commutator connection directly. However, it was shown in [8, Sec. 8] that $\star_W$ coincides with $\star_G$. Moreover, the pull-back $\varrho^*$ is indeed an algebra morphism for both star-products $\star_G$ and $\star_S$, i.e. one has
$$\varrho^* f \star_{G/S} \varrho^* g = \varrho^*(f \star_{\mathrm{BCH}} g) \tag{2.9}$$
for all $f, g \in C^\infty(\mathfrak{g}^*)[[\nu]]$. All the star products $\star_G$, $\star_S$, and $\star_F$ are homogeneous in the sense of star-products on cotangent bundles whence it follows that they are all strongly closed: integration over $T^*G$ with respect to the Liouville form defines a trace on the functions with compact support, see [9, Sec. 8].

3. Strong Closedness of $\star_{\mathrm{BCH}}$ and $\star_K$

We shall now discuss an elementary proof of the fact that $\star_{\mathrm{BCH}}$ as well as $\star_K$ are strongly closed with respect to the constant volume form $d^n\xi$ on $\mathfrak{g}^*$ if and only if the Lie algebra $\mathfrak{g}$ is unimodular, i.e. $\mathrm{tr}\,\mathrm{ad}(x) = 0$ for all $x \in \mathfrak{g}$, or, equivalently,
$c^i_{ij} = 0$. The unimodularity of $\mathfrak{g}$ is easily seen to be necessary since it is exactly the condition that the integration is a Poisson trace, see also [42, Sec. 4] for the Poisson case and [22] for a different and more general proof for Kontsevich's star product on $\mathbb{R}^n$.

Before we discuss the general case let us consider the case where $G$ is compact. In this case $\mathfrak{g}$ is known to be in particular unimodular.

Proposition 3.1. Let $G$ be compact. Then $\star_{\mathrm{BCH}}$ is strongly closed.

Proof. Let $f, g \in C_0^\infty(\mathfrak{g}^*)$. Since $G$ is compact, $\varrho^* f, \varrho^* g \in C_0^\infty(T^*G)$ and thus the strong closedness of $\star_G$ and (2.9) implies
$$0 = \int_{T^*G} (\varrho^* f \star_G \varrho^* g - \varrho^* g \star_G \varrho^* f)\,\Omega = \mathrm{vol}(G) \int_{\mathfrak{g}^*} (f \star_{\mathrm{BCH}} g - g \star_{\mathrm{BCH}} f)\,d^n\xi ,$$
where $\Omega$ is the (suitably normalized) Liouville measure on $T^*G$.

Clearly the above proof relies on the compactness of $G$, otherwise the integration would not be defined. As an amusing observation we remark that one can also use the above proposition to obtain the well-known fact that compact Lie groups have unimodular Lie algebras.

For the general unimodular case we use a different argument which is essentially the same as for homogeneous star-products on a cotangent bundle [9, Sec. 8], see also [10, 34] for more details on star products on cotangent bundles and their traces. A differential operator $D$ on $\mathfrak{g}^*$ is called homogeneous of degree $r \in \mathbb{Z}$ if $[L_E, D] = rD$, where $L_E$ is the Lie derivative with respect to the Euler vector field.

Lemma 3.2. Let $D$ be a homogeneous differential operator of degree $-r$ with $r \ge 1$. Then for all $f \in C_0^\infty(\mathfrak{g}^*)$ one has
$$\int_{\mathfrak{g}^*} Df\,d^n\xi = 0 . \tag{3.1}$$

From here we can follow [9] almost literally: if $f \in \mathrm{Pol}^k(\mathfrak{g}^*)$ and $g \in C_0^\infty(\mathfrak{g}^*)$ then for every homogeneous star product $\star$ on $\mathfrak{g}^*$ one has
$$\int_{\mathfrak{g}^*} f \star g\,d^n\xi = \sum_{r=0}^k \nu^r \int_{\mathfrak{g}^*} C_r(f, g)\,d^n\xi , \tag{3.2}$$
where $C_r$ is the $r$th bidifferential operator of $\star$. This follows from Lemma 3.2 since $C_r(f, \cdot)$ is homogeneous of degree $k - r$. The analogous statement holds for the integral over $g \star f$. From this we conclude the following lemma:

Lemma 3.3. Let $\star$ be a homogeneous star-product for $\mathfrak{g}^*$, $f \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)$, and $g \in C_0^\infty(\mathfrak{g}^*)$. Then
$$\int_{\mathfrak{g}^*} (f \star g - g \star f)\,d^n\xi = 0 \tag{3.3}$$
if and only if $\mathfrak{g}$ is unimodular.
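The link between unimodularity and the Poisson trace property used here can be made explicit by one integration by parts. The following is a sketch under the conventions of (2.2), assuming that at least one of $f$, $g$ has compact support so that boundary terms vanish; the two example algebras at the end are illustrations chosen here, not taken from the text.

```latex
% Sketch: integrate the bracket (2.2) by parts in \xi_i.
\begin{align*}
\int_{\mathfrak{g}^*} \{f, g\}\, d^n\xi
  &= \int_{\mathfrak{g}^*} \xi_k c^k_{ij}\,
     \frac{\partial f}{\partial \xi_i}\frac{\partial g}{\partial \xi_j}\, d^n\xi \\
  &= -\int_{\mathfrak{g}^*} f \left( c^i_{ij}\frac{\partial g}{\partial \xi_j}
     + \xi_k c^k_{ij}\frac{\partial^2 g}{\partial \xi_i \partial \xi_j} \right) d^n\xi \\
  &= -\,c^i_{ij} \int_{\mathfrak{g}^*} f\,\frac{\partial g}{\partial \xi_j}\, d^n\xi
   = \operatorname{tr}\operatorname{ad}(e_j)
     \int_{\mathfrak{g}^*} f\,\frac{\partial g}{\partial \xi_j}\, d^n\xi .
\end{align*}
% The second-derivative term drops out because c^k_{ij} is antisymmetric in (i,j),
% and tr ad(e_j) = c^i_{ji} = -c^i_{ij}. The integral vanishes for all admissible
% f, g exactly when tr ad(e_j) = 0 for every j.
%
% Illustrations (assumed examples):
%   so(3):  c^k_{ij} = \epsilon_{ijk}, so c^i_{ij} = \epsilon_{iji} = 0 (unimodular).
%   [e_1, e_2] = e_2:  ad(e_1) e_2 = e_2, so tr ad(e_1) = 1 \neq 0 (not unimodular).
```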
Proof. The proof is done by induction on the polynomial degree $k$ of $f$. For $k = 0$ the statement (3.3) is true by (3.2). For $k = 1$ we obtain (3.3) by (3.2) if and only if the integral vanishes on Poisson brackets, i.e. if and only if $\mathfrak{g}$ is unimodular. For $k \ge 2$ we can write $f$ as a $\star$-polynomial in at most linear polynomials since these polynomials generate $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$ by the homogeneity of $\star$. Then we can use the cases $k = 0, 1$ to prove (3.3).

Having the trace property for polynomials and compactly supported functions, we only have to use a density argument, i.e. the Stone–Weierstraß theorem, to conclude the trace property in general:

Theorem 3.4. Let $\star$ be a homogeneous star-product for $\mathfrak{g}^*$. Then the integration over $\mathfrak{g}^*$ with respect to the constant volume $d^n\xi$ is a trace if and only if $\mathfrak{g}$ is unimodular.

Since $\star_{\mathrm{BCH}}$ as well as $\star_K$ are homogeneous this theorem proves in an elementary way that they are strongly closed in the sense of [18].

4. Trace Properties of $\mathfrak{g}$-Invariant Functionals

Quite contrary to the symplectic case it turns out that in the Poisson case traces are no longer unique in general. Before we give an elementary proof in the case of $\mathfrak{g}^*$ we shall make a few comments on the general situation.

As we have seen already before, the trace functionals are typically not defined on the whole algebra but on a certain subspace, as e.g. the functions with compact support. On the other hand, the property of being a trace only becomes interesting if this subspace is not only a sub-algebra but even an ideal. This motivates the following terminology: for an associative algebra $\mathcal{A}$ we call a functional $\tau$ defined on $\mathcal{J} \subseteq \mathcal{A}$ a trace on $\mathcal{J}$ if $\mathcal{J}$ is a two-sided ideal and for all $A \in \mathcal{A}$ and $B \in \mathcal{J}$ one has $\tau([A, B]) = 0$. Similarly we define a Poisson trace on a Poisson ideal of a Poisson algebra. With this notation the traces which are given by integrations are traces on the ideals $C_0^\infty(\mathfrak{g}^*)$ and $C_0^\infty(\mathfrak{g}^*)[[\nu]]$, respectively.
However, there will be some interesting traces with a slightly different domain. If we want to integrate over a sub-manifold $\iota : N \hookrightarrow M$ then the following space becomes important. Here and in the following we shall only consider the case where $\iota$ is an embedding. We define
$$C_N^\infty(M) := \{f \in C^\infty(M) \mid \iota(N) \cap \operatorname{supp} f \text{ is compact}\} . \tag{4.1}$$
If $N$ is a closed embedded sub-manifold then $C_0^\infty(M) \subseteq C_N^\infty(M)$. Moreover, the locality of a star-product ensures that $C_N^\infty(M)[[\nu]]$ is a two-sided ideal of $C^\infty(M)[[\nu]]$.

Taking such a subspace as example we consider more generally domains of the form $\mathcal{D}[[\nu]]$ where $\mathcal{D} \subseteq C^\infty(M)$. In this case $\mathcal{D}$ is necessarily a Poisson ideal which follows immediately from the ideal properties of $\mathcal{D}[[\nu]]$. Moreover, if $\tau : \mathcal{D}[[\nu]] \to \mathbb{R}[[\nu]]$ is a trace for a local star-product $\star$ on $M$ with domain $\mathcal{D}[[\nu]]$
then $\tau = \sum_{r=0}^\infty \nu^r \tau_r$ with linear functionals $\tau_r : \mathcal{D} \to \mathbb{R}$. For the following we shall assume that all $\tau_r$ have some reasonable continuity property, e.g. with respect to the locally convex topology of smooth functions. This requirement seems to be reasonable as long as we are dealing with star-products having at least continuous cochains in every order of $\nu$.

Now let us come back to the case of $\mathfrak{g}^*$ with the star product $\star_{\mathrm{BCH}}$. As a first observation we remark that the strong $\mathfrak{g}$-invariance of $\star_{\mathrm{BCH}}$ implies that for a two-sided ideal $\mathcal{D}[[\nu]]$ the space $\mathcal{D}$ is $\mathfrak{g}$-invariant. Moreover, we have the following theorem:

Theorem 4.1. Let $\mathcal{D} \subseteq C^\infty(\mathfrak{g}^*)$ be a subspace such that $\mathcal{D}[[\nu]]$ is a two-sided ideal with respect to $\star_{\mathrm{BCH}}$ and let $\tau = \sum_{r=0}^\infty \nu^r \tau_r$ be an $\mathbb{R}[[\nu]]$-linear functional on $\mathcal{D}[[\nu]]$ with the following continuity property: for a given $f \in C^\infty(\mathfrak{g}^*)$ and $g \in \mathcal{D}$ and a sequence $p_n \in \mathrm{Pol}^\bullet(\mathfrak{g}^*)$ such that $p_n \to f$ in the locally convex topology of smooth functions we have $\tau_r([p_n, g]_{\star_{\mathrm{BCH}}}) \to \tau_r([f, g]_{\star_{\mathrm{BCH}}})$ (in each order of $\nu$). Then $\tau$ is a $\star_{\mathrm{BCH}}$-trace on $\mathcal{D}[[\nu]]$ if and only if $\tau$ is a Poisson trace on $\mathcal{D}$, which is the case if and only if $\tau$ is $\mathfrak{g}$-invariant.

Proof. The continuity ensures that $\mathfrak{g}$-invariance coincides with the property of being a Poisson trace. Now let $\tau_0$ be a Poisson trace and let $g \in \mathcal{D}$. Then for all $x \in \mathfrak{g}$ we have $\tau_0([\hat x, g]) = \nu\tau_0(\{\hat x, g\}) = 0$ by the strong invariance of $\star_{\mathrm{BCH}}$. But since $\mathrm{Pol}^1(\mathfrak{g}^*)[\nu]$ together with the constants generates $\mathrm{Pol}^\bullet(\mathfrak{g}^*)[\nu]$ we have $\tau_0([p, g]) = 0$ for every polynomial $p$. Together with the fact that the polynomials are dense in $C^\infty(\mathfrak{g}^*)$ and $\tau_0$ has the above continuity it follows that $\tau_0$ is a $\star_{\mathrm{BCH}}$-trace. Now if $\tau$ is a $\star_{\mathrm{BCH}}$-trace then $\tau_0$ is a Poisson trace and hence a $\star_{\mathrm{BCH}}$-trace itself. Thus $\tau - \tau_0$ is still a $\star_{\mathrm{BCH}}$-trace and a simple induction proves the theorem.

The somewhat technical continuity property needed above turns out to be rather mild. In the main example it is trivially fulfilled:

Example 4.2.
(i) Let $\iota : O \hookrightarrow \mathfrak{g}^*$ be a not necessarily closed but embedded coadjoint orbit and consider $\mathcal{D} = C_O^\infty(\mathfrak{g}^*)$. Then the integration with respect to the Liouville measure $\Omega_O$ on $O$,
$$\tau_O(f) := \int_O \iota^* f\,\Omega_O , \tag{4.2}$$
is a $\star_{\mathrm{BCH}}$-trace on $C_O^\infty(\mathfrak{g}^*)[[\nu]]$.
(ii) If in addition $\Delta$ is a $\mathfrak{g}$-invariant differential operator on $\mathfrak{g}^*$ then $\tau_O^\Delta$, defined by
$$\tau_O^\Delta(f) := \tau_O(\Delta f) = \int_O \iota^*(\Delta f)\,\Omega_O , \tag{4.3}$$
is still a trace on $C_O^\infty(\mathfrak{g}^*)[[\nu]]$.
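As an illustration of Example 4.2(i), the $\mathfrak{g}$-invariance of $\tau_O$ can be checked by hand for $\mathfrak{g} = \mathfrak{so}(3)$. This is a sketch under assumptions made here: $\mathfrak{g}^*$ is identified with $\mathbb{R}^3$, the structure constants are $c^k_{ij} = \epsilon_{ijk}$, the orbit is the sphere $S^2_r$ of radius $r > 0$, and $dA$ denotes its area measure.

```latex
% Sketch for g = so(3) (identification g^* = R^3 and c^k_{ij} = \epsilon_{ijk} assumed).
\begin{align*}
\{\xi_i, \xi_j\} &= \epsilon_{ijk}\,\xi_k ,
  \qquad O = S^2_r = \{\xi \in \mathbb{R}^3 : |\xi| = r\}, \\
\{\hat x, f\}(\xi) &= \xi_k\,\epsilon_{ijk}\,x^i\,\frac{\partial f}{\partial \xi_j}
   = (\xi \times x) \cdot \nabla f(\xi) .
\end{align*}
% The vector field \xi \mapsto \xi \times x generates rotations R_t about the axis x.
% These preserve S^2_r and its area measure, and the Liouville measure \Omega_O,
% being rotation invariant, is a constant multiple of dA. Hence
\[
  \tau_O(\{\hat x, f\})
  = \mathrm{const} \cdot \frac{d}{dt}\Big|_{t=0} \int_{S^2_r} (f \circ R_t)\, dA
  = 0 ,
\]
% so \tau_O is g-invariant, and Theorem 4.1 then gives the trace property.
```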
5. Positivity of Traces If one replaces the formal parameter ν by a new formal parameter λ such that ¯ = λ, then it is well-known ν = iλ and if one treats λ as a real quantity, i.e. λ ∞ ∗ that the complex conjugation of functions in C (g )[[λ]] becomes a ∗ -involution for ?BCH . One has f ?BCH g = g¯ ?BCH f¯
(5.1)
for all f , g ∈ C ∞ (g)[[λ]]. Such a star product is also called a Hermitian star-product, see e.g. [14] for a detailed discussion. Thus one enters the realm of ∗ -algebras over ordered rings, see [12, 15]. In particular one can ask whether the traces for ? BCH are positive linear functionals, i.e. satisfy τ (f¯ ?BCH f ) ≥ 0 in the sense of formal power series, if the corresponding classical functional τ0 comes from a positive Borel measure on g. In general a classically positive linear functional is no longer positive for a deformed product, see e.g. [12, Sec. 2] for a simple example and [14]. But sometimes one can deform the functional as well in order to make it positive again: in the case of star-products on symplectic manifolds this is always possible [14, Proposition 5.1]. Such deformations are called positive deformations. In our case we are faced with the question whether we can deform the traces τO such that on one hand they are still traces and on the other hand they are positive. One strategy could be the following: First prove that the trace can be deformed into a positive functional perhaps loosing the trace property. Secondly average over the group in order to obtain a g-invariant functional and hence a trace. This would require a compact group. However, we shall follow another idea giving some additional insight in the problem. Nevertheless we shall first ask the following question as a general problem in deformation quantization of Poisson manifolds: Question 5.1. Is every Hermitian star-product on a Poisson manifold a positive deformation? We shall now consider the following more particular case. We assume the group G to be compact and ι : O ,→ g∗ to be a regular coadjoint orbit. Then we want to find a positive trace for ?BCH with zeroth order given by τO as in (4.2). The construction is based on the following theorem which is of independent interest: Theorem 5.2. Let G be compact and let ι : O ,→ g∗ be a regular coadjoint orbit. 
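Then there exists a star-product $\star_{\mathcal O}$ on the symplectic manifold $\mathcal O$ and a series of $\mathfrak g$-invariant differential operators $S = \mathrm{id} + \sum_{r=1}^\infty \lambda^r S_r$ on $\mathfrak g^*$ such that the deformed restriction map
\[ \boldsymbol\iota^* = \iota^* \circ S : C^\infty(\mathfrak g^*)[[\lambda]] \to C^\infty(\mathcal O)[[\lambda]] \tag{5.2} \]
becomes a real surjective homomorphism of star-products, i.e.
\[ \boldsymbol\iota^* f \star_{\mathcal O} \boldsymbol\iota^* g = \boldsymbol\iota^*(f \star_{\mathrm{BCH}} g) \quad\text{and}\quad \overline{\boldsymbol\iota^* f} = \boldsymbol\iota^* \bar f \tag{5.3} \]
for all $f, g \in C^\infty(\mathfrak g^*)[[\lambda]]$. Hence $\star_{\mathcal O}$ becomes a Hermitian deformation.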
One can view this theorem as a certain 'deformed tangentiality property' of the star product $\star_{\mathrm{BCH}}$: although $\star_{\mathrm{BCH}}$ is not tangential, i.e. it does not restrict to all orbits, for a particular orbit one can arrange that it restricts by deforming the restriction map, see [16] for a more detailed discussion. From this theorem and [12, Lemma 2] we immediately obtain a positive trace deforming $\tau_{\mathcal O}$:

Corollary 5.3. Let $G$ be compact and $\iota : \mathcal O \hookrightarrow \mathfrak g^*$ a regular orbit with deformed restriction map $\boldsymbol\iota^*$ as in (5.2). Then the functional
\[ \boldsymbol\tau_{\mathcal O}(f) := \int_{\mathcal O} \boldsymbol\iota^* f \, \Omega_{\mathcal O} \tag{5.4} \]
is a positive trace with classical limit $\tau_{\mathcal O}$. In particular, $\star_{\mathcal O}$ is strongly closed.

Thus it remains to prove Theorem 5.2. We shall use the arguments here from phase space reduction of star-products via the BRST formalism as discussed in detail in [7]. In order to make this article self-contained we shall recall the basic steps of [7] adapted to the case of Poisson manifolds.

Proof of Theorem 5.2. Since $\mathcal O$ is assumed to be a regular orbit there are real-valued Casimir polynomials $J_1, \dots, J_k \in \mathrm{Pol}^\bullet(\mathfrak g^*)$ such that $\mathcal O$ can be written as level surface $\mathcal O = J^{-1}(\{0\})$ for the map $J = (J_1, \dots, J_k) : \mathfrak g^* \to \mathbb R^k$, where $0$ is a regular value. Since the components of $J$ commute with respect to the Poisson bracket this can be viewed as a moment map $J : \mathfrak g^* \to \mathfrak t^*$ where $\mathfrak t^*$ is the dual of the $k$-dimensional Abelian Lie algebra. Moreover, the $J$'s are in the Poisson center whence the corresponding torus action is trivial. Since the differential operators $S_r$ will only be needed near $\mathcal O$ it will be sufficient to construct them in a tubular neighbourhood around $\mathcal O$. In fact, a globalization beyond is also easily obtained, see [7, Lemma 6]. As $0$ is a regular value of $J$ we can use $J$ for the transversal coordinates and find a $G$-invariant tubular neighbourhood $U$ of $\mathcal O$. On $U$ we can define the following maps: First we need a prolongation map $\mathrm{prol} : C^\infty(\mathcal O) \hookrightarrow C^\infty(U)$ given by
\[ (\mathrm{prol}\,\phi)(o, \mu) = \phi(o), \tag{5.5} \]
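where $o \in \mathcal O$ and $\mu \in \mathfrak t^*$ is the transversal coordinate in $U$. Next we consider $\bigwedge^\bullet(\mathfrak t) \otimes C^\infty(\mathfrak g^*)$ and define the Koszul coboundary operator $\partial$ by the (left-)insertion of $J$, i.e. $\partial(t \otimes f) = \sum_l i(e^l) t \otimes J_l f$, where $J = \sum_l e^l J_l$. Clearly $\partial$ is $G$-invariant with respect to the $G$ action $g^*(t \otimes f) = t \otimes g^* f$ and $\partial^2 = 0$. We shall denote the homogeneous components of $\partial$ by $\partial_l : \bigwedge^l(\mathfrak t) \otimes C^\infty(\mathfrak g^*) \to \bigwedge^{l-1}(\mathfrak t) \otimes C^\infty(\mathfrak g^*)$ for $l \ge 1$. In the case $l = 0$ we set $\partial_0 = \iota^*$ and clearly $\iota^* \partial_1 = 0$. Finally, we define the chain homotopy $h$ on $\bigwedge^\bullet(\mathfrak t) \otimes C^\infty(U)$ by
\[ h(t \otimes f)(o, \mu) = \sum_{l=1}^{k} e_l \wedge t \otimes \int_0^1 \frac{\partial f}{\partial \mu_l}(o, s\mu)\, s^k \, ds, \tag{5.6} \]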
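and denote the corresponding homogeneous components by $h_l$. For convenience we set $h_{-1} = \mathrm{prol}$. Then $h$ is obviously $G$-invariant and it is indeed a chain homotopy for $\partial$, i.e. for all $l = 0, \dots, k$ we have
\[ h_{l-1} \partial_l + \partial_{l+1} h_l = \mathrm{id}_{\bigwedge^l(\mathfrak t) \otimes C^\infty(U)}. \tag{5.7} \]
Moreover, one has the obvious identities
\[ \iota^*\, \mathrm{prol} = \mathrm{id}_{C^\infty(\mathcal O)} \quad\text{and}\quad h_0\, \mathrm{prol} = 0. \tag{5.8} \]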
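As a sanity check, the case $l = 0$ of the homotopy identity (5.7) can be worked out by hand. The sketch below rests on two reading assumptions that are not stated in this form in the text: the tubular coordinates are chosen such that $J_l(o, \mu) = \mu_l$, and the power of $s$ in (5.6) is the form degree of $t$, which is $0$ for a function $f$.

```latex
\begin{align*}
(h_{-1}\partial_0 f)(o,\mu)
  &= (\mathrm{prol}\,\iota^* f)(o,\mu) = f(o,0), \\
(\partial_1 h_0 f)(o,\mu)
  &= \sum_{l=1}^{k} \mu_l \int_0^1 \frac{\partial f}{\partial\mu_l}(o,s\mu)\,ds
   = \int_0^1 \frac{d}{ds}\, f(o,s\mu)\,ds
   = f(o,\mu) - f(o,0), \\
\bigl((h_{-1}\partial_0 + \partial_1 h_0) f\bigr)(o,\mu)
  &= f(o,\mu).
\end{align*}
```

Under these assumptions the degree-zero case of (5.7) is just the fundamental theorem of calculus along the ray $s \mapsto (o, s\mu)$.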
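In a next step we quantize the above chain complex and its homotopy. The first easy observation is that the star-product $\star_{\mathrm{BCH}}$ is strongly $\mathfrak t$-invariant, i.e. the components of $J$ are in the center of $\star_{\mathrm{BCH}}$, too. Thus we can define a deformed Koszul operator $\boldsymbol\partial$ on the space $(\bigwedge^\bullet(\mathfrak t) \otimes C^\infty(\mathfrak g^*))[[\lambda]]$ by
\[ \boldsymbol\partial(t \otimes f) = \sum_l i(e^l) t \otimes f \star_{\mathrm{BCH}} J_l. \tag{5.9} \]
Then we still have $\boldsymbol\partial^2 = 0$ as well as $\overline{\boldsymbol\partial(t \otimes f)} = \boldsymbol\partial(\overline{t \otimes f})$ since the $J_l$ commute and are real. Moreover, $\boldsymbol\partial$ is still $G$-invariant. In a next step one constructs the deformations of $h$ and $\iota^*$ as follows. We define $\boldsymbol h_{-1} = \mathrm{prol}$ without deformation and set
\[ \boldsymbol\partial_0 := \boldsymbol\iota^* := \iota^* \bigl(\mathrm{id} - (\partial_1 - \boldsymbol\partial_1) h_0\bigr)^{-1} \quad\text{and}\quad \boldsymbol h_l := h_l \bigl(h_{l-1} \boldsymbol\partial_l + \boldsymbol\partial_{l+1} h_l\bigr)^{-1}. \tag{5.10} \]
Clearly the inverse operators used here exist as formal power series thanks to (5.7). The proof of the following lemma is completely analogous to the proofs of [7, Propositions 25 and 26]. The $G$-invariance is obvious.

Lemma 5.4. The operators $\boldsymbol\iota^*$ and $\boldsymbol h$ are $G$-invariant and fulfill the relations
\[ \boldsymbol h_{l-1} \boldsymbol\partial_l + \boldsymbol\partial_{l+1} \boldsymbol h_l = \mathrm{id}_{\bigwedge^l(\mathfrak t) \otimes C^\infty(U)[[\lambda]]} \tag{5.11} \]
as well as
\[ \boldsymbol\iota^* \boldsymbol\partial_1 = 0 \quad\text{and}\quad \boldsymbol\iota^*\, \mathrm{prol} = \mathrm{id}_{C^\infty(\mathcal O)[[\lambda]]}. \tag{5.12} \]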
Having the deformed restriction map and the chain homotopy, it is quite easy to characterize the ideal generated by the 'constraints' $J$:

Lemma 5.5. Let $I(J)$ be the (automatically two-sided) ideal generated by $J_1, \dots, J_k$. Then the map $\boldsymbol\iota^* : C^\infty(U)[[\lambda]] \to C^\infty(\mathcal O)[[\lambda]]$ is surjective and
\[ \ker \boldsymbol\iota^* = \operatorname{im} \boldsymbol\partial_1 = I(J). \tag{5.13} \]
Thus we can simply define $\star_{\mathcal O}$ by (5.3) which gives a well-defined star-product on the quotient. It is an easy computation that the first order commutator of $\star_{\mathcal O}$ gives indeed the desired Poisson bracket. Moreover, since the $J$'s are real the ideal generated by them is automatically a $^*$-ideal. Since $h_0$ as well as $\partial$ and $\boldsymbol\partial$ are real operators, it follows that $\boldsymbol\iota^*$ is real, too.

It remains to show that $\boldsymbol\iota^*$ can be written by use of a series of differential operators $S_r$. This is not obvious as we used the non-local homotopy $h_0$ in order to
define $\boldsymbol\iota^*$. However, one can show the existence of the $S_r$ in the same manner as in [7, Lemma 27]. Note that this is not even necessary for Corollary 5.3.

Note that in the above construction one does not need the 'full' machinery of the BRST reduction but only the Koszul part. The reason is that in this case the coadjoint orbit plays the role of the 'constraint surface' and the reduced phase space at once.

Remark 5.6. It seems that the above statement is not the most general one can obtain: There are certainly more general orbits and also non-compact groups where one can find such deformed restriction maps. We leave this as an open question for future projects.

6. GNS Representation of the Positive Traces

Throughout this section we shall assume that $G$ is compact and $\iota : \mathcal O \hookrightarrow \mathfrak g^*$ is a regular orbit. Then we shall investigate the GNS representation of the positive trace $\boldsymbol\tau_{\mathcal O}$ as constructed in the last section.

Let us briefly recall the basic steps of the GNS construction, see [12]. Given a $^*$-algebra $\mathcal A$ over $\mathbb C[[\lambda]]$ with a positive linear functional $\omega : \mathcal A \to \mathbb C[[\lambda]]$ one finds that $\mathcal J_\omega = \{A \in \mathcal A \mid \omega(A^* A) = 0\}$ is a left ideal of $\mathcal A$, the so-called Gel'fand ideal of $\omega$. Then $\mathcal H_\omega := \mathcal A / \mathcal J_\omega$ becomes a pre-Hilbert space over $\mathbb C[[\lambda]]$ via $\langle \psi_A, \psi_B \rangle_\omega := \omega(A^* B)$, where $\psi_A \in \mathcal H_\omega$ denotes the equivalence class of $A$. Finally, the left representation $\pi_\omega(A)\psi_B = \psi_{AB}$ of $\mathcal A$ on $\mathcal H_\omega$ turns out to be a $^*$-representation, i.e. one has $\langle \psi_B, \pi_\omega(A)\psi_C \rangle_\omega = \langle \pi_\omega(A^*)\psi_B, \psi_C \rangle_\omega$.

According to Theorem 5.2 we have in our case a surjective $^*$-homomorphism
\[ \boldsymbol\iota^* : C^\infty(\mathfrak g^*)[[\lambda]] \to C^\infty(\mathcal O)[[\lambda]] \tag{6.1} \]
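and a positive linear functional $\boldsymbol\tau_{\mathcal O}$ which is the pull back of a positive linear functional on $C^\infty(\mathcal O)[[\lambda]]$ under $\boldsymbol\iota^*$, namely the trace $\mathrm{tr}_{\mathcal O}$ on $\mathcal O$. Thus we can use the functoriality properties of the GNS construction, see [9, Proposition 5.1 and Corollary 5.2], in order to relate the GNS construction for $\boldsymbol\tau_{\mathcal O}$ with the one for $\mathrm{tr}_{\mathcal O}$, which is well known, see [40, Sec. 5] and [11, Lemma 4.3]. Since $\mathrm{tr}_{\mathcal O}$ is a faithful functional the GNS representation of $C^\infty(\mathcal O)[[\lambda]]$ with respect to $\mathrm{tr}_{\mathcal O}$ is simply given by left multiplication $L$ with respect to $\star_{\mathcal O}$, where $\mathcal H_{\mathrm{tr}_{\mathcal O}} = C^\infty(\mathcal O)[[\lambda]]$. Thus we arrive at the following theorem which can also be checked directly:

Theorem 6.1. Let $G$ be compact, $\iota : \mathcal O \hookrightarrow \mathfrak g^*$ a regular orbit, and $\boldsymbol\tau_{\mathcal O} = \mathrm{tr}_{\mathcal O} \circ \boldsymbol\iota^*$ the positive trace as in (5.4).
(i) $\operatorname{supp} \boldsymbol\tau_{\mathcal O} = \iota(\mathcal O)$.
(ii) The Gel'fand ideal $\mathcal J_{\boldsymbol\tau_{\mathcal O}}$ of $\boldsymbol\tau_{\mathcal O}$ coincides with $\ker \boldsymbol\iota^*$.
(iii) The GNS pre-Hilbert space $\mathcal H_{\boldsymbol\tau_{\mathcal O}}$ is unitarily isomorphic to $C^\infty(\mathcal O)[[\lambda]]$ endowed with the inner product $\langle \phi, \chi \rangle_{\mathcal O} := \mathrm{tr}_{\mathcal O}(\bar\phi \star_{\mathcal O} \chi)$ via
\[ U : \mathcal H_{\boldsymbol\tau_{\mathcal O}} \ni \psi_f \mapsto \boldsymbol\iota^* f \in C^\infty(\mathcal O)[[\lambda]] \quad\text{with inverse}\quad U^{-1} : \phi \mapsto \psi_{\mathrm{prol}\,\phi}. \tag{6.2} \]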
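(iv) For the GNS representation $\pi_{\boldsymbol\tau_{\mathcal O}}$ one obtains
\[ \pi_{\mathcal O}(f)\phi := U \pi_{\boldsymbol\tau_{\mathcal O}}(f) U^{-1} \phi = \boldsymbol\iota^*(f \star_{\mathrm{BCH}} \mathrm{prol}\,\phi) = L_{\boldsymbol\iota^* f}\, \phi. \tag{6.3} \]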
Since the group $G$ acts on $\mathcal O$ and since all relevant maps are $G$-invariant/equivariant we arrive at the following $G$-invariance of the representation. This can be checked directly, or it follows again from [9, Proposition 5.1 and Corollary 5.2].

Lemma 6.2. The GNS representation $\pi_{\mathcal O}$ is $G$-equivariant in the sense that
\[ \pi_{\mathcal O}(g^* f)\, g^* \phi = g^*(\pi_{\mathcal O}(f)\phi) \tag{6.4} \]
for all $\phi \in C^\infty(\mathcal O)[[\lambda]]$, $f \in C^\infty(\mathfrak g^*)[[\lambda]]$ and $g \in G$. Moreover, the $G$-representation on $C^\infty(\mathcal O)[[\lambda]]$ is unitary.

Let us finally mention a few properties of the commutant of $\pi_{\mathcal O}$ and the 'baby-version' of the Tomita–Takesaki theory arising from this representation. The following statements follow almost directly from the considerations in [40, Sec. 7]. We consider the anti-linear map
\[ J : \phi \mapsto \bar\phi, \tag{6.5} \]
where $\phi \in C^\infty(\mathcal O)[[\lambda]]$, which is clearly anti-unitary with respect to the inner product $\langle \cdot, \cdot \rangle_{\mathcal O}$ and involutive. This map plays the role of the modular conjugation. The modular operator $\Delta$ is just the identity map since in our case the linear functional is a trace, i.e. a KMS functional for inverse temperature $\beta = 0$. Then we can characterize the commutant of the representation $\pi_{\mathcal O}$ as follows:

Proposition 6.3. For $f \in C^\infty(\mathfrak g^*)[[\lambda]]$ we denote by $R_{\boldsymbol\iota^* f}$ the right multiplication with $\boldsymbol\iota^* f$ with respect to the star-product $\star_{\mathcal O}$. Then the map
\[ \pi_{\mathcal O}(f) = L_{\boldsymbol\iota^* f} \mapsto J L_{\boldsymbol\iota^* f} J = R_{\boldsymbol\iota^* \bar f} \tag{6.6} \]
is an anti-linear bijection onto the commutant $\pi_{\mathcal O}'$ of $\pi_{\mathcal O}$.
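The mechanism behind Proposition 6.3 can be illustrated in a finite-dimensional toy model, sketched below. It is only an analogue and not the construction of the paper: the star-product algebra is replaced by the $2 \times 2$ complex matrices, the trace by the normalized matrix trace, and complex conjugation by the matrix adjoint. All helper names are hypothetical.

```python
# Toy analogue (assumption: 2x2 matrices with the normalized trace stand in
# for the star-product algebra with its faithful positive trace).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def tr(a):
    # normalized trace, a faithful positive trace on the matrix algebra
    return sum(a[i][i] for i in range(len(a))) / len(a)

def inner(a, b):
    # GNS inner product <psi_A, psi_B> = tr(A^* B)
    return tr(matmul(adjoint(a), b))

def left(a):
    # left multiplication L_A : phi -> A phi (the GNS representation)
    return lambda phi: matmul(a, phi)

def right(a):
    # right multiplication R_A : phi -> phi A
    return lambda phi: matmul(phi, a)

def J(phi):
    # analogue of the modular conjugation: phi -> phi^*
    return adjoint(phi)

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

A = [[1 + 2j, 0.5], [-1j, 3]]
B = [[0, 1 - 1j], [2, 0.25j]]
C = [[1j, 1], [0.5, -2 + 1j]]

# *-representation property: <B, L_A C> = <L_{A^*} B, C>
star_rep_ok = abs(inner(B, left(A)(C)) - inner(left(adjoint(A))(B), C)) < 1e-12
# analogue of (6.6): J L_A J = R_{A^*}
conj_ok = close(J(left(A)(J(C))), right(adjoint(A))(C))
# right multiplications commute with left multiplications
commute_ok = close(left(A)(right(B)(C)), right(B)(left(A)(C)))
```

In this toy setting the three flags record the $^*$-representation property, the relation $J L_A J = R_{A^*}$, and the fact that right multiplications lie in the commutant of the left multiplications.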
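Note that in this particularly simple case the modular one-parameter group $U_t$ is just the identity $U_t = \mathrm{id}_{C^\infty(\mathcal O)[[\lambda]]}$, since we have a trace. More generally, one could also consider KMS functionals of the form $f \mapsto \boldsymbol\tau_{\mathcal O}(\mathrm{Exp}(-\beta H) \star_{\mathrm{BCH}} f)$ where $H \in C^\infty(\mathfrak g^*)[[\lambda]]$, $\mathrm{Exp}$ denotes the star exponential with respect to $\star_{\mathrm{BCH}}$, and $\beta \in \mathbb R$ is the 'inverse temperature'.

From the above proposition we immediately have the following result on the relation between the $\mathfrak g$-representations on $C^\infty(\mathcal O)[[\lambda]]$ arising from the GNS construction.

Lemma 6.4. For $x, y \in \mathfrak g$ we have
\[ \pi_{\mathcal O}(\hat x)\pi_{\mathcal O}(\hat y) - \pi_{\mathcal O}(\hat y)\pi_{\mathcal O}(\hat x) = i\lambda\, \pi_{\mathcal O}(\widehat{[x, y]}) \tag{6.7} \]
\[ R_{\boldsymbol\iota^* \hat x} R_{\boldsymbol\iota^* \hat y} - R_{\boldsymbol\iota^* \hat y} R_{\boldsymbol\iota^* \hat x} = -i\lambda\, R_{\boldsymbol\iota^* \widehat{[x, y]}} \tag{6.8} \]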
and
\[ \pi_{\mathcal O}(\hat x) - R_{\boldsymbol\iota^* \hat x} = i\lambda\, \mathcal L_{x_{\mathcal O}}, \tag{6.9} \]
where $\mathcal L_{x_{\mathcal O}}$ denotes the Lie derivative in direction of the fundamental vector field of $x$.

7. Traces for Deformations via Group Actions

Let us now describe a quite general mechanism for constructing deformations and traces via group actions. We first consider the algebraic part of the construction. Let $G$ be a group and denote the right translations by $R_g : h \mapsto hg$, where $g, h \in G$. The left translations are denoted by $L_g$, respectively. Moreover, let $\mathcal A_G \subseteq \mathrm{Fun}(G)$ be a sub-algebra of the complex-valued functions on $G$, closed under complex conjugation. We require $R_g^* \mathcal A_G \subseteq \mathcal A_G$ for all $g \in G$. Then an associative formal deformation $(\mathcal A_G[[\lambda]], \star_G)$ of $\mathcal A_G$ is called a (right) universal deformation if it is right-invariant, i.e.
\[ R_g^*(f_1 \star_G f_2) = R_g^* f_1 \star_G R_g^* f_2 \tag{7.1} \]
for all $g \in G$ and $f_1, f_2 \in \mathcal A_G[[\lambda]]$. Thus the right translations act as automorphisms of $\star_G$. In the sequel we shall always assume that $1 \in \mathcal A_G$ and $1 \star_G f = f = f \star_G 1$.

Remark 7.1. If $G$ is a Lie group and $\mathcal A_G$ are all smooth functions on $G$ then the existence of a right-invariant deformation gives quite strong conditions on $G$. However, in typical examples one may only deform a smaller class of functions. For instance the data of a $G$-invariant star product on a homogeneous symplectic space $G \xrightarrow{\pi} H \backslash G$ determines a right deformation of $\mathcal A_G := \pi^* C^\infty(H \backslash G)$. In the extreme case where $H = \{e\}$, the pair $(\mathcal A_G, \star_G)$ becomes a star product algebra $(C^\infty(G)[[\lambda]], \star_\lambda)$. The Poisson structure on $G$ associated to the first order term of $\star_\lambda$ is then right-invariant. Its characteristic distribution (generated by Hamiltonian vector fields), being integrable and right-invariant, determines a Lie subalgebra $\mathcal S$ of $\mathfrak g = \mathrm{Lie}(G)$ endowed with a non-degenerate Chevalley 2-cocycle $\Omega$ with respect to the trivial representation of $\mathcal S$ on $\mathbb R$. This type of Lie algebras $(\mathcal S, \Omega)$ (or rather their associated Lie groups) has been studied by Lichnerowicz et al. When unimodular such a Lie algebra is solvable [31].

Now consider a set $X$ with a left action $\tau : G \times X \to X$ of $G$. For abbreviation we shall sometimes write $g.x$ instead of $\tau(g, x)$. We shall use the universal deformation $\star_G$ in order to induce a deformation of a certain sub-algebra of $\mathrm{Fun}(X)$. First we define $\alpha_x : \mathrm{Fun}(X) \to \mathrm{Fun}(G)$ by
\[ (\alpha_x f)(g) = (\tau_g^* f)(x) \tag{7.2} \]
for $x \in X$ and $g \in G$. Having specified $\mathcal A_G$ we define the space
\[ \mathcal A_X = \{ f \in \mathrm{Fun}(X) \mid \alpha_x f \in \mathcal A_G \text{ for all } x \in X \}, \tag{7.3} \]
which is clearly a sub-algebra of $\mathrm{Fun}(X)$ stable under complex conjugation. Let us remark that $\mathcal A_X$ contains at least those functions on $X$ which are constant along the orbits of $\tau$. Indeed, let $f \in \mathrm{Fun}(X)$ satisfy $f(g.x) = f(x)$ for all $x \in X$ and $g \in G$. Then $(\alpha_x f)(g) = f(g.x) = f(x)$ is constant (not depending on $g$).

The deformation $\star_G$ induces canonically an associative deformation $\star_X$ of $\mathcal A_X$, thereby justifying the name 'universal deformation'. Indeed, define
\[ (f_1 \star_X f_2)(x) = (\alpha_x f_1 \star_G \alpha_x f_2)(e), \tag{7.4} \]
where $e \in G$ denotes the unit element. Then we have the following proposition:

Proposition 7.2. Let $(\mathcal A_G[[\lambda]], \star_G)$ be a universal deformation and $(\mathcal A_X[[\lambda]], \star_X)$ as above.
(i) Then $(\mathcal A_X[[\lambda]], \star_X)$ is an associative formal deformation of $\mathcal A_X$ which is Hermitian if $\star_G$ is Hermitian. Moreover, $\alpha_x : (\mathcal A_X[[\lambda]], \star_X) \to (\mathcal A_G[[\lambda]], \star_G)$ is a homomorphism of associative algebras.
(ii) If $f_1$ is constant on some orbit $G.x_0$ then
\[ (f_1 \star_X f_2)(g.x_0) = f_1(g.x_0) f_2(g.x_0) = (f_2 \star_X f_1)(g.x_0) \tag{7.5} \]
for all functions $f_2 \in \mathcal A_X[[\lambda]]$. In particular, the $\star_X$-product with a function, which is constant along all orbits, is the undeformed product. Thus $\star_X$ is 'tangential' to the orbits in a very strong sense.

Proof. Let us first recall a few basic properties of $\alpha_x$, $\tau$, $R$, and $L$. The following relations are straightforward computations:
\[ R_g^* \alpha_x = \alpha_{g.x} \quad\text{and}\quad L_g^* \alpha_x = \alpha_x \tau_g^*. \tag{7.6} \]
Using the right invariance of $\star_G$ and the above rules we find the following relation
\[ \alpha_x(f_1 \star_X f_2) = \alpha_x f_1 \star_G \alpha_x f_2 \tag{7.7} \]
for $f_1, f_2 \in \mathcal A_X[[\lambda]]$. This implies on the one hand that $\mathcal A_X[[\lambda]]$ is indeed closed under the multiplication law $\star_X$. On the other hand it follows that $\alpha_x$ is a homomorphism. With (7.7) the associativity of $\star_X$ is a straightforward computation. Finally, if $\star_G$ is Hermitian then $\star_X$ is Hermitian, too, since all involved maps are real, i.e. commute with complex conjugation. For the second part one computes
\[ (f_1 \star_X f_2)(g.x_0) = (\alpha_{x_0} f_1 \star_G \alpha_{x_0} f_2)(g). \tag{7.8} \]
Now $\alpha_{x_0} f_1$ is constant whence the $\star_G$-product is the pointwise product. Thus the claim easily follows. If this holds even for all orbits and not just for $G.x_0$ then the $\star_X$-product with $f_1$ is the pointwise product globally.

Remark 7.3. From (7.5) we conclude that, heuristically speaking, the deformation $\star_X$ becomes more non-trivial the larger the orbits of $\tau$ are.
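The two relations (7.6) used in the proof above can be checked mechanically in a finite toy model. The sketch below assumes, purely for illustration, the symmetric group $S_3$ acting on $X = \{0, 1, 2\}$ by permutations; the function names are hypothetical and only mirror the notation $\alpha_x$, $\tau_g^*$, $R_g^*$, $L_g^*$ of the text.

```python
from itertools import permutations

# Toy model (assumption): G = S_3 acting on X = {0, 1, 2} by permutations.
G = list(permutations(range(3)))   # g is a tuple with g[x] = g.x
X = range(3)

def mul(g, h):
    # group product chosen such that (g h).x = g.(h.x), i.e. a left action
    return tuple(g[h[x]] for x in X)

def act(g, x):
    return g[x]

def alpha(x, f):
    # (alpha_x f)(g) = f(g.x), a function on G, cf. (7.2)
    return lambda g: f(act(g, x))

def pull_tau(g, f):
    # (tau_g^* f)(x) = f(g.x)
    return lambda x: f(act(g, x))

def pull_R(g, F):
    # (R_g^* F)(h) = F(h g), pull-back by the right translation
    return lambda h: F(mul(h, g))

def pull_L(g, F):
    # (L_g^* F)(h) = F(g h), pull-back by the left translation
    return lambda h: F(mul(g, h))

values = [2.0, -1.0, 5.0]          # an arbitrary function on X

def f(x):
    return values[x]

# R_g^* alpha_x = alpha_{g.x}
right_ok = all(pull_R(g, alpha(x, f))(h) == alpha(act(g, x), f)(h)
               for g in G for h in G for x in X)
# L_g^* alpha_x = alpha_x tau_g^*
left_ok = all(pull_L(g, alpha(x, f))(h) == alpha(x, pull_tau(g, f))(h)
              for g in G for h in G for x in X)
```

Both flags compare the two sides of (7.6) on every triple $(g, h, x)$, so the check is exhaustive for this small example.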
Remark 7.4. Given a right universal deformation $(\mathcal A_G, \star_R)$, one gets a left universal deformation $(\mathcal A_G, \star_L)$ via the formula
\[ a \star_L b = \iota^*(\iota^* a \star_R \iota^* b) \tag{7.9} \]
provided $\mathcal A_G$ is a bi-invariant subspace. Here $\iota : G \to G$ denotes the inversion map $g \mapsto g^{-1}$. Starting with a left invariant deformation $(\mathcal A_G, \star_G)$ of $G$ and an action $\tau : G \times X \to X$, the associated deformation of $\mathcal A_X$ is then defined by the formula
\[ (f_1 \star f_2)(x) = (\iota^* \alpha_x f_1 \star_G \iota^* \alpha_x f_2)(e). \tag{7.10} \]
In some interesting cases, in particular in the Abelian case, the universal deformation $\star_G$ is also left invariant, i.e. the left translations $L_g^*$ act as automorphisms of $\star_G$, too. In this situation the induced deformation $\star_X$ is invariant under $\tau_g^*$:

Lemma 7.5. Let $\mathcal A_G$ be in addition left invariant and let $\star_G$ be a bi-invariant universal deformation. Then $\mathcal A_X$ is invariant under $\tau_g^*$ for all $g \in G$ and
\[ \tau_g^*(f_1 \star_X f_2) = \tau_g^* f_1 \star_X \tau_g^* f_2. \tag{7.11} \]
Proof. This is a straightforward computation using only the definitions and (7.6).

Our main interest in the universal deformations comes from the following simple observation:

Theorem 7.6. Let $(\mathcal A_G[[\lambda]], \star_G)$ be a right universal deformation and let $\mathrm{tr}_G : \mathcal A_G[[\lambda]] \to \mathbb C[[\lambda]]$ be a trace with respect to $\star_G$. Let $\Phi : \mathrm{Fun}(X)[[\lambda]] \to \mathbb C[[\lambda]]$ be an arbitrary $\mathbb C[[\lambda]]$-linear functional. Then $\mathrm{tr}_\Phi : \mathcal A_X[[\lambda]] \to \mathbb C[[\lambda]]$ defined by
\[ \mathrm{tr}_\Phi(f) = \Phi(x \mapsto \mathrm{tr}_G(\alpha_x f)) \tag{7.12} \]
is a trace with respect to $\star_X$.

Proof. This follows directly from the homomorphism property of $\alpha_x$ and the trace property of $\mathrm{tr}_G$.

In particular the trace $\mathrm{tr}_G$ combined with the evaluation functional at some point $x \in X$,
\[ \mathrm{tr}_x : f \mapsto \mathrm{tr}_G(\alpha_x f), \tag{7.13} \]
yields a trace for $\star_X$. Thus the only difficult task is to find traces for $\star_G$.

As a last remark we shall discuss the positivity of the traces $\mathrm{tr}_\Phi$. We assume that $\mathrm{tr}_G$ is a positive trace whence $\mathrm{tr}_G(\bar f \star_G f) \ge 0$ in the sense of formal power series for all $f \in \mathcal A_G[[\lambda]]$.

Lemma 7.7. Assume $\mathrm{tr}_G$ is a positive trace and $\Phi$ takes non-negative values on non-negative valued functions on $X$. Then $\mathrm{tr}_\Phi$ is positive. In particular $\mathrm{tr}_x$ is always positive.
Remark 7.8. The above construction has the big advantage that it can be transferred to the framework of topological deformations instead of formal deformations. This has indeed been done by Rieffel [35] in a $C^*$-algebraic framework for actions of $\mathbb R^d$. For a class of non-abelian groups this has been done in [6].

Let us finally mention two examples. The first one is the well-known example of the Weyl–Moyal product for $\mathbb R^{2n}$ and the second is obtained as the asymptotic version of [6] for rank one Iwasawa subgroups of $SU(1, n)$.

Example 7.9. Let $\star_{\mathrm{Weyl}}$ be the Weyl–Moyal star product on $\mathbb R^{2n}$, explicitly given by
\[ f \star_{\mathrm{Weyl}} g = \mu \circ e^{\frac{\lambda}{2i} \sum_k (\partial_{q^k} \otimes \partial_{p_k} - \partial_{p_k} \otimes \partial_{q^k})} f \otimes g, \tag{7.14} \]
where $\mu(f \otimes g) = fg$ is the pointwise product and $q^1, \dots, p_n$ are the canonical Darboux coordinates on $\mathbb R^{2n}$. Clearly $\star_{\mathrm{Weyl}}$ is invariant under translations whence it is a bi-invariant universal deformation of $C^\infty(\mathbb R^{2n})[[\lambda]]$. Moreover, it is well known that $\star_{\mathrm{Weyl}}$ is strongly closed, whence the integration with respect to the Liouville measure provides a trace, which is positive. Thus one can apply the above general results to this situation.

Example 7.10. This example is the asymptotic version of [6]. The groups we consider are Iwasawa subgroups $G = AN$ of $SU(1, n)$, where $SU(1, n) = ANK$ is an Iwasawa decomposition. One has the obvious $G$-equivariant diffeomorphism $G \to SU(1, n)/K$ (here $K = U(n)$). The group $G$ therefore inherits a left-invariant symplectic (Kähler) structure coming from the one on the rank one Hermitian symmetric space $SU(1, n)/U(n)$. The symplectic group may then be described as follows. As a manifold, one has
\[ G = \mathbb R \times \mathbb R^{2n} \times \mathbb R. \tag{7.15} \]
In these coordinates the group multiplication law reads
\[ L_{(a,x,z)}(a', x', z') = \left( a + a',\; e^{-a'} x + x',\; e^{-2a'} z + z' + \tfrac12\, \Omega(x, x')\, e^{-a'} \right), \tag{7.16} \]
where $\Omega$ is a constant symplectic structure on the vector space $\mathbb R^{2n}$. The 2-form
\[ \omega = \Omega + da \wedge dz \tag{7.17} \]
then defines a left-invariant symplectic structure on $G$. The universal deformation $\star_{BM}$ we are looking for is a star product for this symplectic structure. Since on $\mathbb R^{2n+2}$ all symplectic star products are equivalent, it will be sufficient to describe $\star_{BM}$ by means of an equivalence transformation $T = \mathrm{id} + \sum_{r=1}^\infty \lambda^r T_r$ relating $\star_{BM}$ and $\star_{\mathrm{Weyl}}$. In [6] an explicit integral formula for $T$ has been given, which is defined on the Schwartz space $\mathcal S(\mathbb R^{2n+2})$. It allows for an asymptotic expansion in $\hbar$ and gives indeed the desired equivalence transformation $T$. Then $\star_{BM}$ defined by
\[ f \star_{BM} g = T^{-1}(T f \star_{\mathrm{Weyl}} T g) \tag{7.18} \]
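is a left-invariant universal deformation of $G$ and again we can use this to apply the above results on universal deformations. Moreover, since $\star_{\mathrm{Weyl}}$ is strongly closed, the functional
\[ \mathrm{tr}^G(f) := \int_G T(f)\, \omega^{n+1} \tag{7.19} \]
defines a trace functional for $\star_{BM}$ on $C_0^\infty(G)[[\lambda]]$. This is again positive since $T$ is real, i.e. $\overline{T f} = T \bar f$.

In what follows we give a precise description of the star product $\star_{BM}$ in the two dimensional case, i.e. on the group $ax + b$. The higher dimensional case is similar but more intricate. The non-formal deformed product in the $ax + b$ case is obtained by transforming Weyl's product on $(\mathbb R^2, da \wedge d\ell)$ under the equivalence
\[ T = F^{-1} \circ \phi_\hbar^* \circ F, \tag{7.20} \]
where
\[ F u(a, \alpha) = \int e^{-i\alpha\ell}\, u(a, \ell)\, d\ell \quad\text{with}\quad u \in \mathcal S(\mathbb R^2) \tag{7.21} \]
is the partial Fourier transform in the second variable and where $\phi_\hbar : \mathbb R^2 \to \mathbb R^2$ is the one-parameter family of diffeomorphisms given by
\[ \phi_\hbar(a, \alpha) = \left( a, \frac{1}{\hbar} \sinh(\alpha\hbar) \right) \quad (\hbar \in \mathbb R). \tag{7.22} \]
One has
\[ T u(a, \ell) = c \int e^{i\alpha\ell}\, e^{-\frac{i}{\hbar} \sinh(\alpha\hbar) q}\, u(a, q)\, dq\, d\alpha = c \int e^{i\alpha(\ell - q)}\, e^{-i\psi_\hbar(\alpha) q}\, u(a, q)\, dq\, d\alpha \tag{7.23} \]
with
\[ \psi_\hbar(\alpha) = \sum_{k \ge 1} \frac{\hbar^{2k} \alpha^{2k+1}}{(2k+1)!}. \tag{7.24} \]
Setting $p = \hbar\alpha$, one gets
\[ T u(a, \ell) = \frac{c}{\hbar} \int e^{\frac{i}{\hbar} p(\ell - q)}\, e^{-\frac{i}{\hbar} \psi_1(p) q}\, u(a, q)\, dq\, dp, \tag{7.25} \]
which precisely coincides with
\[ \left( \mathrm{id} \otimes \mathrm{Op}_{\hbar,1}\bigl( e^{-\frac{i}{\hbar} \psi_1(p) q} \bigr) \right) u(a, \ell), \tag{7.26} \]
where $\mathrm{Op}_{\hbar,1} f(p, q)$ denotes the anti-normally ordered quantization of the function $f(q, p)$. Recall that the $\kappa$-ordered pseudodifferential quantization rule on $(\mathbb R^2, dq \wedge dp)$ is defined (at the level of test functions) by $\mathrm{Op}_{\hbar,\kappa} : \mathcal D(\mathbb R^2) \to \mathrm{End}(L^2(\mathbb R))$ with
\[ \mathrm{Op}_{\hbar,\kappa}(f)\varphi(q) = \frac{c}{\hbar} \int e^{\frac{i}{\hbar} p(q - \xi)}\, f(\kappa\xi + (1 - \kappa)q, p)\, \varphi(\xi)\, d\xi\, dp \quad (\kappa \in [0, 1]). \tag{7.27} \]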
The explicit asymptotic expansion formula for $\mathrm{Op}_{\hbar,\kappa}(f)$ is well known, see e.g. [38, Sec. 1.2, p. 231 and Eq. (58), p. 258]. It yields an expression for the equivalence $T$ at the formal level which we write, with natural delicacy, as
\[ T = \mathrm{id} \otimes \exp\left( \frac{i}{\lambda}\, \psi_1\!\left( \frac{\lambda}{i} \partial_\ell \right) . \ell \right), \tag{7.28} \]
where the operator $T_{(\ell)} := \exp\bigl( \frac{i}{\lambda} \psi_1( \frac{\lambda}{i} \partial_\ell ) . \ell \bigr)$ is to be understood as anti-normally ordered ($\kappa = 1$). Observe the reality of the equivalence, which may be directly checked using the fact that the function $\psi_1$ is odd. Moreover, for every right-invariant vector field $X$ on $G = ax + b$, one checks [6] that $T \circ X \circ T^{-1}$ is an inner derivation of the Moyal–Weyl product $\star_{\mathrm{Weyl}}$. In other words, the star product $\star_{BM}$ is left-invariant on $G$.

Acknowledgments

We would like to thank the organizers of the Warwick workshop on quantisation for their excellent working conditions; many ideas of the paper were developed during this workshop. We would also like to thank the referee for his useful suggestions.

References

[1] D. Arnal and N. Ben Amar, Kontsevich's wheels and invariant polynomial functions on the dual of Lie algebras, Lett. Math. Phys. 52 (2000), 291–300.
[2] D. Arnal, N. Ben Amar and M. Masmoudi, Cohomology of good graphs and Kontsevich linear star products, Lett. Math. Phys. 48 (1999), 291–306.
[3] H. Basart, M. Flato, A. Lichnerowicz and D. Sternheimer, Deformation theory applied to quantization and statistical mechanics, Lett. Math. Phys. 8 (1984), 483–394.
[4] F. Bayen, M. Flato, C. Frønsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Ann. Phys. 111 (1978), 61–151.
[5] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quantum Grav. 14 (1997), A93–A107.
[6] P. Bieliavsky and M. Massar, Strict deformation quantizations for actions of a class of symplectic Lie Groups, Prog. Theo. Phys. Suppl. 144 (2001), 1–21.
[7] M. Bordemann, H.-C. Herbig and S. Waldmann, BRST cohomology and phase space reduction in deformation quantization, Commun. Math. Phys.
210 (2000), 107–144.
[8] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles I: Weyl and Standard ordering with differential operator representation, Commun. Math. Phys. 198 (1998), 363–396.
[9] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles II: GNS representations, the WKB expansion, traces, and applications, J. Geom. Phys. 29 (1999), 199–234.
[10] M. Bordemann, N. Neumaier, M. J. Pflaum and S. Waldmann, On representations of star product algebras over cotangent spaces on Hermitian line bundles, Preprint math.QA/9811055 (November 1998), J. Funct. Anal. 199 (2003), 1–47.
[11] M. Bordemann, H. Römer and S. Waldmann, A remark on formal KMS states in deformation quantization, Lett. Math. Phys. 45 (1998), 49–61.
[12] M. Bordemann and S. Waldmann, Formal GNS construction and states in deformation quantization, Commun. Math. Phys. 195 (1998), 549–583.
[13] H. Bursztyn and S. Waldmann, Deformation quantization of Hermitian vector bundles, Lett. Math. Phys. 53 (2000), 349–365.
[14] H. Bursztyn and S. Waldmann, On Positive Deformations of $^*$-Algebras. In: G. Dito, D. Sternheimer, (eds.): Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies no. 22, 69–80. Kluwer Academic Publishers, Dordrecht, Boston, London, 2000.
[15] H. Bursztyn and S. Waldmann, Algebraic Rieffel induction, formal Morita equivalence and applications to deformation quantization, J. Geom. Phys. 37 (2001), 307–364.
[16] M. Cahen, S. Gutt and J. Rawnsley, On tangential star products for the coadjoint Poisson structure, Commun. Math. Phys. 180 (1996), 99–108.
[17] A. Connes, Noncommutative Geometry, Academic Press, San Diego, New York, London, 1994.
[18] A. Connes, M. Flato and D. Sternheimer, Closed star products and cyclic cohomology, Lett. Math. Phys. 24 (1992), 1–12.
[19] G. Dito, Kontsevich star product on the dual of a Lie algebra, Lett. Math. Phys. 48 (1999), 307–322.
[20] G. Dito and D. Sternheimer, Deformation Quantization: Genesis, Developments and Metamorphoses. To appear in the Proceedings of the meeting between mathematicians and theoretical physicists, Strasbourg, 2001. IRMA Lectures in Math. Theoret. Phys., Vol. 1, Walter De Gruyter, Berlin 2002, pp. 9–54.
[21] B. V. Fedosov, Deformation Quantization and Index Theory, Akademie Verlag, Berlin, 1996.
[22] G. Felder and B. Shoikhet, Deformation quantization with traces, Lett. Math. Phys. 53 (2000), 75–86.
[23] A. Giaquinto and J. J. Zhang, Bialgebra actions, twists, and universal deformation formulas, J. Pure Appl. Algebra 128(2) (1998), 133–152.
[24] S. Gutt, An explicit $^*$-product on the cotangent bundle of a Lie group, Lett. Math. Phys. 7 (1983), 249–258.
[25] S. Gutt, Variations on deformation quantization. In: G. Dito, D. Sternheimer, (eds.): Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies no. 21, 217–254. Kluwer Academic Publishers, Dordrecht, Boston, London, 2000.
[26] S. Gutt and J. Rawnsley, Equivalence of star products on a symplectic manifold; an introduction to Deligne's Čech cohomology classes, J. Geom. Phys. 29 (1999), 347–392.
[27] S. Gutt and J. Rawnsley, Traces for star products on symplectic manifolds, J. Geom. Phys. 42 (2002), 12–18.
[28] A. V. Karabegov, On the canonical normalization of a trace density of deformation quantization, Lett. Math. Phys. 45 (1998), 217–228.
[29] M. Kontsevich, Deformation Quantization of Poisson Manifolds, I. Preprint q-alg/9709040 (September 1997).
[30] M. Kontsevich, Operads and motives in deformation quantization, Lett. Math. Phys. 48 (1999), 35–72.
[31] A. Lichnerowicz and A. Medina, Groupes a structures symplectiques ou kaehleriennes invariantes, C. R. Acad. Sci., Paris, Ser. I 306, No. 3 (1988), 133–138.
[32] R. Nest and B. Tsygan, Algebraic index theorem, Commun. Math. Phys. 172 (1995), 223–262.
[33] H. Omori, Y. Maeda and A. Yoshioka, Weyl manifolds and deformation quantization, Adv. Math. 85 (1991), 224–255.
[34] M. J. Pflaum, A deformation-theoretical approach to Weyl quantization on Riemannian manifolds, Lett. Math. Phys. 45 (1998), 277–294.
[35] M. A. Rieffel, Deformation quantization for actions of $\mathbb R^d$, Mem. Am. Math. Soc. 106(506) (1993).
[36] J. Rosenberg, Rigidity of K-theory under deformation quantization. Preprint q-alg/9607021 (July 1996).
[37] D. Sternheimer, Deformation Quantization: Twenty Years After. In: J. Rembieliński, (ed.): Particles, Fields, and Gravitation, AIP Press, New York, 1998.
[38] E. M. Stein, Harmonic Analysis: Real-Variable Methods, Orthogonality, & Oscillatory Integrals, Princeton Mathematical Series, Princeton University Press (1993).
[39] D. Tamarkin and B. Tsygan, Cyclic formality and index theorems, Lett. Math. Phys. 56 (2001), 85–97.
[40] S. Waldmann, Locality in GNS representations of deformation quantization, Commun. Math. Phys. 210 (2000), 467–495.
[41] A. Weinstein, Deformation quantization, Séminaire Bourbaki 46ème année 789 (1994).
[42] A. Weinstein, The modular automorphism group of a Poisson manifold, J. Geom. Phys. 23 (1997), 379–394.
[43] A. Weinstein and P. Xu, Hochschild cohomology and characteristic classes for star-products. In: A. Khovanskij, A. Varchenko, V. Vassiliev, (eds.): Geometry of differential equations. Dedicated to V. I. Arnold on the occasion of his 60th birthday, 177–194, American Mathematical Society, Providence, 1998.
July 14, 2003 10:12 WSPC/148-RMP
00167
Reviews in Mathematical Physics Vol. 15, No. 5 (2003) 447–489 © World Scientific Publishing Company
PERTURBATION THEORY OF $W^*$-DYNAMICS, LIOUVILLEANS AND KMS-STATES
J. DEREZIŃSKI
Department of Mathematical Methods in Physics, Warsaw University, Hoża 74, 00-682, Warszawa, Poland

V. JAKŠIĆ
Department of Mathematics and Statistics, McGill University, 805 Sherbrooke Street West, Montreal, QC, H3A 2K6, Canada

C.-A. PILLET
PHYMAT, Université de Toulon, B.P. 132, F-83957 La Garde Cedex, France
CPT-CNRS Luminy, Case 907, F-13288 Marseille Cedex 9, France

Received 26 March 2002
Revised 14 February 2003

Given a $W^*$-algebra $\mathfrak M$ with a $W^*$-dynamics $\tau$, we prove the existence of the perturbed $W^*$-dynamics for a large class of unbounded perturbations. We compute its Liouvillean. If $\tau$ has a $\beta$-KMS state, and the perturbation satisfies some mild assumptions related to the Golden–Thompson inequality, we prove the existence of a $\beta$-KMS state for the perturbed $W^*$-dynamics. These results extend the well known constructions due to Araki valid for bounded perturbations.

Keywords: $W^*$-algebra; $W^*$-dynamics; perturbation theory; KMS states; Liouvilleans.
1. Introduction

1.1. $W^*$-dynamics and KMS states

Let $\mathfrak M$ be a $W^*$-algebra equipped with a $W^*$-dynamics (a 1-parameter pointwise $\sigma$-weakly continuous group of $*$-automorphisms) $\mathbb R \ni t \mapsto \tau^t$. The pair $(\mathfrak M, \tau)$ is often called a $W^*$-dynamical system. Let $Q$ be a self-adjoint element of $\mathfrak M$. A well known convergent power series expansion, that can be traced back at least to Schwinger and Dyson, can be used to define the perturbed $W^*$-dynamics which we denote by $\mathbb R \ni t \mapsto \tau_Q^t$. The difference of the generators of $\tau_Q$ and $\tau$ equals $i[Q, \cdot]$; in fact, the $W^*$-dynamics $\tau_Q$ is uniquely characterized by this property.

Suppose in addition that $\beta > 0$ and that $\tau$ possesses a $\beta$-KMS state $\omega$. Araki proved that in this case the dynamics $\tau_Q$ also possesses a canonical $\beta$-KMS state $\omega_Q$. More precisely, if $\omega(A) = (\Omega | A\Omega)$, where $\Omega$ is the vector representative of
the state $\omega$ in the standard positive cone, and $L$ is the so-called Liouvillean of $\tau$, then the vector $\Omega_Q := e^{-\beta(L+Q)/2}\Omega$ is well defined and the state $\omega_Q(A) := (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2$ is β-KMS for the W*-dynamics $\tau_Q$.

The above two constructions play an important role in applications of operator algebras to quantum statistical physics. Whereas the construction of the perturbed W*-dynamics $\tau_Q$ is relatively easy and not very surprising, the construction of the perturbed KMS state $\omega_Q$ is more subtle and has far-reaching physical importance. Both constructions, however, have one technical weakness which restricts the range of their applications: the perturbation $Q$ is assumed to be bounded. In many physical applications the operator $Q$ is unbounded and is only affiliated to $\mathfrak{M}$.

In this paper we extend the construction of the perturbed W*-dynamics $\tau_Q$ and the $(\tau_Q, \beta)$-KMS state $\omega_Q$ to a large class of unbounded perturbations $Q$ affiliated to $\mathfrak{M}$. An application of these results is discussed in [1] and concerns spectral and ergodic theory of Pauli–Fierz systems.

The proof of the first result (the construction of $\tau_Q$) is again relatively simple and does not involve much more than an application of the Trotter product formula. The proof of the second result (the construction of $\omega_Q$) is more involved. Its main idea is the use of the so-called Golden–Thompson inequality. The Golden–Thompson inequality in its original form says that if $A$ and $B$ are self-adjoint matrices, then
\[ \operatorname{Tr} e^{A+B} \le \operatorname{Tr} e^{A} e^{B} . \]
Translated into the language of W*-algebras and KMS states, the Golden–Thompson inequality can be put into the form
\[ \|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| . \tag{1.1} \]
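For orientation, the passage from the matrix inequality to (1.1) can be followed by hand in finite dimensions. The sketch below is only an illustration under assumptions that are not made in the text: $\mathfrak{M}$ is the full matrix algebra $M_n(\mathbb{C})$ acting by left multiplication on the space of $n \times n$ matrices with the Hilbert–Schmidt scalar product, $H$ is a hypothetical self-adjoint matrix generating the dynamics, and $\omega$ is the corresponding Gibbs state.

```latex
% Finite-dimensional illustration (assumed setting, not from the text):
% scalar product (X|Y) = \operatorname{Tr} X^*Y,  Z = \operatorname{Tr} e^{-\beta H},
% \tau^t(A) = e^{itH} A e^{-itH},  Q = Q^* \in M_n(\mathbb{C}).
\begin{align*}
\Omega &= Z^{-1/2} e^{-\beta H/2}, \qquad L X = HX - XH, \\
e^{-\beta(L+Q)/2} X &= e^{-\beta(H+Q)/2}\, X\, e^{\beta H/2}, \\
\Omega_Q &= e^{-\beta(L+Q)/2}\,\Omega = Z^{-1/2} e^{-\beta(H+Q)/2}, \\
\|\Omega_Q\|^2 &= Z^{-1} \operatorname{Tr} e^{-\beta(H+Q)}, \qquad
\|e^{-\beta Q/2}\Omega\|^2 = Z^{-1} \operatorname{Tr} e^{-\beta H} e^{-\beta Q} .
\end{align*}
% Hence, in this setting, (1.1) is the matrix inequality
% \operatorname{Tr} e^{A+B} \le \operatorname{Tr} e^{A} e^{B}  with  A = -\beta H,  B = -\beta Q.
```

The second line uses that left multiplication by $H + Q$ commutes with right multiplication by $H$, so the exponential of their difference factorizes.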
In our approach, the Golden–Thompson inequality is used to control the perturbed KMS-states and gives an upper bound, which combined with a weak convergence argument enables us to construct ΩQ for a large class of unbounded Q. In the literature there exists a different approach to the construction of the perturbed KMS states for unbounded perturbations, which is restricted to perturbations bounded from below. One of its versions has been developed by Sakai [2]; another version (applicable to generalized positive operators which may not have a dense domain) is due to Donald [3] (his method is also discussed in monograph [4]). The Sakai–Donald theory does not cover perturbations which are unbounded from both sides, and in particular is not applicable to Pauli–Fierz systems. The W ∗ -algebraic form (1.1) of the Golden–Thompson inequality was first proven by Araki [5]. A different proof, based on an application of Uhlmann’s monotonicity theorem for the relative entropy [6], was given in [3]. 1.2. Liouvilleans The term Liouvillean has become quite popular in the recent literature on algebraic quantum statistical physics. The meaning of this term can vary depending on the
Perturbation Theory of W ∗ -Dynamics, Liouvilleans and KMS-States
author. Therefore, we would like to devote some space to a discussion of possible meanings of the term Liouvillean in the context of W*-dynamical systems.

Let $(\mathfrak{M}, \tau)$ be a W*-dynamical system. It is often important to construct a representation of $\mathfrak{M}$ equipped with a unitary implementation of the W*-dynamics $\tau$. There are two natural approaches to such a construction.

The first approach presupposes that $\tau$ has an invariant normal state $\omega$. In the corresponding GNS representation this state is represented by a cyclic vector $\Omega$. Then it is easy to see that there exists a unique self-adjoint operator $L$ such that
\[ \tau^t(A) = e^{itL} A e^{-itL}, \qquad L\Omega = 0 . \]
The operator $L$ defined this way can be called the Ω-Liouvillean of $\tau$.

In the second approach one chooses a standard representation of $\mathfrak{M}$ on a Hilbert space $\mathcal{H}$. One of the objects that go together with the standard representation is the positive cone $\mathcal{H}^+$. A general theory of standard representations implies that there exists a unique self-adjoint operator $L$ such that
\[ \tau^t(A) = e^{itL} A e^{-itL}, \qquad e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+ . \]
The operator $L$ defined in this way can be called the standard Liouvillean of $\tau$, or simply the Liouvillean of $\tau$.

The two setups overlap if the invariant state $\omega$ is faithful and $\Omega \in \mathcal{H}^+$. In this case the Ω-Liouvillean of $\tau$ coincides with the standard Liouvillean of $\tau$. This fact is important for applications of W*-algebras to quantum statistical physics. If one is interested in the case of equilibrium, then the first approach to the Liouvillean suffices. In nonequilibrium situations one needs the second approach.

The (standard) Liouvillean encodes in a particularly convenient way the properties of the dynamics. This has been demonstrated in many places in the recent literature [1, 7–10]. The Liouvillean is also one of the main technical tools of our paper.

If $L$ is the Liouvillean for the W*-dynamics $\tau$, then one may ask what the Liouvillean for $\tau_Q$ is. If $Q$ is bounded, then the answer is $L_Q = L + Q - JQJ$, where $J$ is the modular conjugation. We will establish the same result for unbounded $Q$ under some mild technical assumptions.

1.3. Organization of the paper

We start our paper with a concise review of some aspects of the theory of W*-algebras. The choice of topics is motivated by some recent applications of W*-algebras to quantum statistical mechanics [1, 7–11]. Among other things, we will discuss the two possible definitions of the Liouvillean. For most of the proofs in Sec. 2 the reader is referred to the literature, especially [12, 13].

In Sec. 3 we describe the perturbation theory of W*-dynamics and Liouvilleans. We describe in particular the case of unbounded perturbations, which goes beyond what we could find in the literature.
To make our paper more accessible, we have included in Sec. 4 the proof of Uhlmann's monotonicity theorem [6] and Donald's proof of the Golden–Thompson inequality [3]. A somewhat different presentation of this topic can be found in [4].

Section 5 contains the perturbation theory of KMS states. The subject naturally splits into three levels. The most restrictive level concerns analytic perturbations. In this case the proofs are essentially algebraic and relatively simple. The next level concerns bounded $Q$. This is the case considered by Araki [14], see also [13, 15–17]. Finally, we develop perturbation theory for a class of unbounded $Q$. In all the cases we prove a number of properties of $\Omega_Q$, including the Peierls–Bogoliubov and the Golden–Thompson inequalities. We stress that the Golden–Thompson inequality is at the same time an important ingredient of our proof of the existence of $\Omega_Q$. We also prove a number of estimates that can be used to compare the vectors $\Omega$ and $\Omega_Q$. Some of these estimates appear to be new.

We have attempted to make the paper reasonably self-contained so that it can serve as a brief introduction to some recent works on algebraic quantum statistical physics. Our presentation is in some respects complementary to the presentation in the standard literature such as [4, 12, 13]. In particular, we tried to emphasize the use of the standard representation and the Liouvillean.

In Appendix B we give a concise description of the Pauli–Fierz systems at positive densities. The material of this appendix is based on [1]. We include this material at the request of the referee, to briefly explain the main physical motivation and application of the results of our paper.

2. General Facts about W*-Algebras

In this section we recall some basic definitions and facts about W*-algebras which will play a role in our paper. For additional information and proofs we refer the reader to [12, 13, 18–20].
There are two approaches to the theory of W ∗ -algebras: the concrete and the abstract approach. In the concrete approach one starts with the notion of a concrete W ∗ -algebra (called also a von Neumann algebra), defined as a ∗-algebra of bounded operators on a Hilbert space which equals its double commutant. This is in fact the original definition that dates back to the works of von Neumann. In the abstract approach, due to Sakai [18], one defines an abstract W ∗ -algebra as a C ∗ -algebra that possesses a predual. These approaches are essentially equivalent: every abstract W ∗ -algebra can be represented as a concrete W ∗ -algebra and every concrete W ∗ -algebra is an abstract W ∗ -algebra. The concrete approach is historically the first and is used in most monographs, e.g. [12, 13, 19]. The abstract approach has been developed in [18]. In some respects the abstract approach is more difficult from the pedagogical point of view — many basic properties of W ∗ -algebras are more difficult to show starting from Sakai’s definition than starting from von Neumann’s definition. Nevertheless, one
can argue that Sakai's approach is conceptually superior: it helps to distinguish the notions that are intrinsic from the notions that are representation dependent. In our presentation we will stress the abstract approach.

2.1. Abstract W*-algebras

If $X$ is a Banach space, then a Banach space $Y$ is called a predual of $X$ iff $X$ is isomorphic to the dual of $Y$. $\mathfrak{M}$ is an (abstract) W*-algebra if it is a C*-algebra which possesses a predual. It can be shown that every W*-algebra $\mathfrak{M}$ possesses a unique predual (up to isomorphism). It will be denoted by $\mathfrak{M}_*$. Elements of $\mathfrak{M}_*$ will be called normal functionals on $\mathfrak{M}$.

The topology on $\mathfrak{M}$ generated by the seminorms $|\omega(A)|$, $\omega \in \mathfrak{M}_*$, is called the σ-weak topology. The topology on $\mathfrak{M}$ generated by the seminorms $|\omega(A^*A)|^{1/2}$, $\omega \in \mathfrak{M}_*$, is called the σ-strong topology.

$\mathfrak{M}_*^+$ denotes the set of positive elements of $\mathfrak{M}_*$. Elements of $\mathfrak{M}_*^+$ satisfying $\omega(1) = 1$ are called normal states. The set of normal states is denoted $\mathfrak{M}_*^{+,1}$.

Let $\omega \in \mathfrak{M}_*^+$ and let $\mathfrak{N}$ be a W*-subalgebra of $\mathfrak{M}$. The support of $\omega$ with respect to $\mathfrak{N}$ is defined as
\[ s_\omega^{\mathfrak{N}} := \inf\{P \in \mathfrak{N} : P \text{ is an orthogonal projection and } \omega(1 - P) = 0\} . \]
In particular, the support with respect to $\mathfrak{M}$ will be called just the support of $\omega$ and denoted $s_\omega$. The support of $\omega$ with respect to the center of $\mathfrak{M}$ will be called the central support of $\omega$ and denoted $z_\omega$.

$\omega \in \mathfrak{M}_*^+$ is called faithful iff $s_\omega = 1$. A W*-algebra is called σ-finite if it possesses a faithful state.

Let $\mathfrak{M}$, $\mathfrak{N}$ be W*-algebras and $\pi : \mathfrak{M} \to \mathfrak{N}$ a homomorphism. We say that $\pi$ is normal iff $\pi$ is σ-weakly continuous.

2.2. Concrete W*-algebras

Let $\mathcal{H}$ be a Hilbert space. $(\Psi|\Phi)$ will denote the scalar product of the vectors $\Psi, \Phi \in \mathcal{H}$. We adopt the "physicist's convention", so our scalar product is antilinear with respect to the first argument.

If $\mathcal{C} \subset \mathcal{B}(\mathcal{H})$, then the commutant of $\mathcal{C}$ will be denoted by $\mathcal{C}'$. We will say that $\mathfrak{M}$ is a concrete W*-algebra (or a von Neumann algebra) iff $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ for some Hilbert space $\mathcal{H}$ and $\mathfrak{M}'' = \mathfrak{M}$.
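A minimal finite-dimensional instance of the double commutant condition may help fix the notation. The example below is an assumption made for illustration only (it does not appear in the text): the Hilbert space is a tensor product of two finite-dimensional spaces and the algebra acts on the first factor.

```latex
% Illustration (assumed example): \mathcal{H} = \mathbb{C}^n \otimes \mathbb{C}^m.
\begin{align*}
\mathfrak{M}   &= \{A \otimes 1 : A \in M_n(\mathbb{C})\}, \\
\mathfrak{M}'  &= \{1 \otimes B : B \in M_m(\mathbb{C})\}, \\
\mathfrak{M}'' &= \{A \otimes 1 : A \in M_n(\mathbb{C})\} = \mathfrak{M} .
\end{align*}
```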
A concrete W ∗ -algebra in B(H) is a W ∗ -algebra inside B(H) containing the identity of B(H). Every abstract W ∗ -algebra is ∗-isomorphic to a concrete W ∗ -algebra. Let M be an abstract W ∗ -algebra and π : M → B(H) a representation. Then π(M) is a concrete W ∗ -algebra iff π is unital and normal. Given an injective unital normal representation π : M → B(H), we will often identify M with π(M).
2.3. Concrete affiliations

In the following two subsections we recall the concept of operators affiliated to a W*-algebra. This concept is well known in the case of concrete W*-algebras, see e.g. [12].

Let $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ be a concrete W*-algebra. Let $A$ be a closed densely defined operator on $\mathcal{H}$ and $D(A)$ its domain. We say that $A$ is affiliated to $\mathfrak{M}$ iff for all $A' \in \mathfrak{M}'$, $A'D(A) \subset D(A)$ and $AA' = A'A$ on $D(A)$. Let $\mathfrak{M}^{(\eta)}$ be the set of operators affiliated to $\mathfrak{M}$.

Theorem 2.1. (1) If $A$ is self-adjoint on $\mathcal{H}$, then $A$ is affiliated to $\mathfrak{M}$ iff all bounded Borel functions of $A$ belong to $\mathfrak{M}$.
(2) If $A$ is a closed operator, then $A$ is affiliated to $\mathfrak{M}$ iff $A(1 + A^*A)^{-1/2} \in \mathfrak{M}$.

2.4. Abstract affiliations

The concept of affiliation can be introduced for abstract W*-algebras in a fashion independent of representations. Our definition of an operator affiliated to an abstract W*-algebra is directly inspired by the definition of affiliation in the context of C*-algebras, due originally to Baaj and Julg [21] and elaborated by Woronowicz [22]. We are grateful to S. L. Woronowicz for a discussion of this issue.

Let $\mathfrak{M}$ be an abstract W*-algebra. In this subsection we will consider linear operators acting on $\mathfrak{M}$. The domain of an operator $A$ on $\mathfrak{M}$ will be denoted by $\mathrm{Dom}(A)$. (We reserve the notation $D(A)$ to denote the domain of an operator $A$ acting on a Hilbert space.)

Let $A$ be a linear mapping acting on $\mathfrak{M}$. We say that $A$ is affiliated to $\mathfrak{M}$ and write $A \in \mathfrak{M}^{\eta}$, iff there exists $B \in \mathfrak{M}$ such that $\|B\| \le 1$, $(1 - BB^*)\mathfrak{M}$ is σ-weakly dense in $\mathfrak{M}$ and, for any $C, D \in \mathfrak{M}$,
\[ C \in \mathrm{Dom}(A) \text{ and } AC = D \iff BC = (1 - BB^*)^{1/2} D . \]
If such $B$ exists, then it is unique. We set $z(A) := B$. In [22], $z(A)$ is called the z-transform of $A$. One can show that if $A \in \mathfrak{M}^{\eta}$, then $\mathrm{Dom}(A)$ is σ-weakly dense and $A$ is closed, both in the norm topology and in the σ-weak topology.

Note that every $A \in \mathfrak{M}$ may be identified with a linear map on $\mathfrak{M}$ with $\mathrm{Dom}(A) = \mathfrak{M}$ (given by $A(C) = AC$) and thus it is an element of $\mathfrak{M}^{\eta}$. The z-transform of $A \in \mathfrak{M}$ equals
\[ z(A) = (1 + AA^*)^{-1/2} A . \]
The following theorem describes the relationship between abstract and concrete affiliations. It shows that in the case of an injective normal representation we can identify abstract and concrete affiliated operators.

Theorem 2.2. Let $\pi : \mathfrak{M} \to \mathcal{B}(\mathcal{H})$ be a normal representation preserving the identity. Then there exists a unique extension of $\pi$ to a surjective map $\pi : \mathfrak{M}^{\eta} \to \pi(\mathfrak{M})^{(\eta)}$ satisfying
\[ (1 + \pi(A)\pi(A)^*)^{-1/2}\, \pi(A) = \pi(z(A)) . \]
If $\pi$ is injective on $\mathfrak{M}$, then its extension to $\mathfrak{M}^{\eta}$ is injective as well.

2.5. Vector representatives of states

Let $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ be a concrete W*-algebra and $\Omega$ a vector in $\mathcal{H}$. Then
\[ \omega_\Omega(A) := (\Omega|A\Omega), \qquad A \in \mathfrak{M}, \]
defines a normal positive functional on $\mathfrak{M}$. We say that $\Omega$ is a vector representative of $\omega_\Omega$. $\omega_\Omega$ is a state iff $\Omega$ is normalized.

The support and the central support of $\omega_\Omega$ are also called the support and the central support of $\Omega$ and denoted $s_\Omega$ and $z_\Omega$ respectively. We thus have
\[ s_{\omega_\Omega} = s_\Omega, \qquad z_{\omega_\Omega} = z_\Omega . \]
The support of $\Omega$ with respect to the W*-algebra $\mathfrak{M}'$ will be denoted $s'_\Omega$. One shows that
\[ \operatorname{Ran} s_\Omega = (\mathfrak{M}'\Omega)^{\mathrm{cl}}, \qquad \operatorname{Ran} s'_\Omega = (\mathfrak{M}\Omega)^{\mathrm{cl}}, \]
where cl stands for the closure. A vector $\Omega \in \mathcal{H}$ is called cyclic if $s'_\Omega = 1$. A vector $\Omega$ is called separating if $s_\Omega = 1$, or equivalently, if it is a vector representative of a faithful state.

The following construction, named after Gelfand, Naimark and Segal, associates to every normal state a normal representation equipped with a cyclic vector.

Theorem 2.3 (The GNS construction). Let $\omega$ be a normal state. Then there exist a (unique up to unitary equivalence) Hilbert space $\mathcal{H}$, a normal unital representation $\pi : \mathfrak{M} \to \mathcal{B}(\mathcal{H})$ and a cyclic vector $\Omega \in \mathcal{H}$, such that $\omega(A) = (\Omega|\pi(A)\Omega)$. The representation $\pi$ is injective on $z_\omega\mathfrak{M}$ and zero on $(1 - z_\omega)\mathfrak{M}$.

2.6. Automorphisms of W*-algebras

Let $\mathrm{Aut}(\mathfrak{M})$ denote the group of ∗-automorphisms of a W*-algebra $\mathfrak{M}$. We equip $\mathrm{Aut}(\mathfrak{M})$ with the following topology: if $\rho_\alpha$ is a net in $\mathrm{Aut}(\mathfrak{M})$ and $\rho \in \mathrm{Aut}(\mathfrak{M})$, then $\rho_\alpha \to \rho$ iff for all $A \in \mathfrak{M}$, $\rho_\alpha(A) \to \rho(A)$ σ-weakly. This topology is called the pointwise σ-weak topology.

A one-parameter pointwise σ-weakly continuous group $\mathbb{R} \ni t \mapsto \tau^t \in \mathrm{Aut}(\mathfrak{M})$ is called a W*-dynamics on $\mathfrak{M}$. The pair $(\mathfrak{M}, \tau)$ is called a W*-dynamical system.

Let $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ be a concrete W*-algebra and $\rho \in \mathrm{Aut}(\mathfrak{M})$. We say that $\rho$ is implemented by $U \in \mathcal{U}(\mathcal{H})$, where $\mathcal{U}(\mathcal{H})$ denotes the set of unitary operators on $\mathcal{H}$, iff
\[ \rho(A) = UAU^* . \tag{2.1} \]
Let $t \mapsto \tau^t$ be a W*-dynamics on $\mathfrak{M}$ and $t \mapsto U(t) \in \mathcal{U}(\mathcal{H})$ a strongly continuous group. We say that $\tau^t$ is implemented by $U(t)$ iff
\[ \tau^t(A) = U(t)AU(t)^* . \tag{2.2} \]
In general, neither ∗-automorphisms nor W*-dynamics need be implementable. If they are, the implementation is not unique. In the next subsections we will describe two situations where there exist distinguished implementations.

2.7. Automorphisms with a fixed invariant state

Let $\omega \in \mathfrak{M}_*^+$ and $\rho \in \mathrm{Aut}(\mathfrak{M})$. We define $\rho^*\omega \in \mathfrak{M}_*^+$ by $\rho^*\omega(A) = \omega(\rho(A))$. We say that $\omega$ is ρ-invariant if $\omega = \rho^*\omega$. The automorphisms that leave $\omega$ invariant form a group denoted $\mathrm{Aut}_\omega(\mathfrak{M})$.

If $\rho \in \mathrm{Aut}_\omega(\mathfrak{M})$, then $\rho(z_\omega) = z_\omega$ and $\rho(s_\omega) = s_\omega$. Thus $\rho$ maps $z_\omega\mathfrak{M}$ and $(1 - z_\omega)\mathfrak{M}$ into themselves, and without loss of generality we may assume that $z_\omega = 1$. By passing to the GNS representation we may assume that $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ and that $\Omega$ is a cyclic vector representative of $\omega$.
Proposition 2.1. There exists a unique representation $\mathrm{Aut}_\omega(\mathfrak{M}) \ni \rho \mapsto U^\Omega(\rho) \in \mathcal{U}(\mathcal{H})$ such that
\[ U^\Omega(\rho)\Omega = \Omega, \qquad U^\Omega(\rho)AU^\Omega(\rho)^* = \rho(A) . \]
It is continuous if we equip $\mathrm{Aut}_\omega(\mathfrak{M})$ with the pointwise σ-weak topology and $\mathcal{U}(\mathcal{H})$ with the strong operator topology.

Proof. One just sets
\[ U^\Omega(\rho)A\Omega = \rho(A)\Omega, \qquad A \in \mathfrak{M} . \]

$U^\Omega(\rho)$ will be called the Ω-implementation of $\rho$.

Suppose now that $t \mapsto \tau^t$ is a W*-dynamics that leaves $\omega$ invariant. Then, by Proposition 2.1, $\tau$ is implemented by a strongly continuous unitary group $\mathbb{R} \ni t \mapsto U^\Omega(\tau^t) \in \mathcal{U}(\mathcal{H})$. The self-adjoint generator of $U^\Omega(\tau^t)$ will be denoted $L^\Omega$ and called the Ω-Liouvillean of $\tau^t$. (Thus $U^\Omega(\tau^t) = e^{itL^\Omega}$.) The following fact is a corollary of Proposition 2.1:

Proposition 2.2. The operator $L^\Omega$ is the unique self-adjoint operator such that
\[ L^\Omega\Omega = 0, \qquad e^{itL^\Omega}Ae^{-itL^\Omega} = \tau^t(A), \qquad A \in \mathfrak{M} . \]
2.8. The Tomita–Takesaki theory

Let $\omega$ be a faithful state on $\mathfrak{M}$. By passing to the GNS representation we may assume that $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ and that $\omega$ has a vector representative $\Omega$ which is cyclic and separating. The following theorem summarizes the results of the well known Tomita–Takesaki theory.

Theorem 2.4. (1) Define the operator $S_\Omega$ with the domain $\mathfrak{M}\Omega$ by $S_\Omega A\Omega = A^*\Omega$. Then $S_\Omega$ is antilinear, closable, and has a zero kernel and cokernel. Its closure will be denoted also $S_\Omega$. Let $S_\Omega = J\Delta_\Omega^{1/2}$ be its polar decomposition;
(2) $J$ is an antiunitary involution;
(3) $\Delta_\Omega$ is a positive operator satisfying $J\Delta_\Omega J = \Delta_\Omega^{-1}$ and $\Delta_\Omega\Omega = \Omega$;
(4) The map
\[ \tau_\omega^t(A) := \Delta_\Omega^{-it} A \Delta_\Omega^{it} \in \mathfrak{M}, \qquad A \in \mathfrak{M}, \]
is a W*-dynamics on $\mathfrak{M}$ and $-\log\Delta_\Omega$ is its Ω-Liouvillean.

The W*-dynamics $\mathbb{R} \ni t \mapsto \tau_\omega^{-t}$ is called the modular dynamics and $\Delta_\Omega$ is called the modular operator.

2.9. Standard form

One of the central notions of the theory of W*-algebras is the so-called standard form. It was introduced by Haagerup [23], following the work of Araki [24] and Connes [25].

A W*-algebra in a standard form is a quadruple $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$, where $\mathcal{H}$ is a Hilbert space, $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ is a concrete W*-algebra, $J$ is an antiunitary involution on $\mathcal{H}$ (that is, $J$ is antilinear, $J^2 = 1$, $J^* = J$) and $\mathcal{H}^+$ is a self-dual cone in $\mathcal{H}$ such that:
(1) $J\mathfrak{M}J = \mathfrak{M}'$;
(2) $JAJ = A^*$ for $A$ in the center of $\mathfrak{M}$;
(3) $J\Psi = \Psi$ for $\Psi \in \mathcal{H}^+$;
(4) $AJA\mathcal{H}^+ \subset \mathcal{H}^+$ for $A \in \mathfrak{M}$.
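The four conditions above can be checked directly in the simplest finite-dimensional case. The following sketch is an illustration under assumptions not made in the text: the algebra is $M_n(\mathbb{C})$ acting by left multiplication on the space of $n \times n$ matrices with the Hilbert–Schmidt scalar product, the conjugation is the matrix adjoint, and the cone consists of the positive semidefinite matrices (which is a self-dual cone for this scalar product).

```latex
% Assumed finite-dimensional example (not taken from the text):
% \mathcal{H} = M_n(\mathbb{C}) with (X|Y) = \operatorname{Tr} X^*Y,
% \mathfrak{M} = M_n(\mathbb{C}) acting by left multiplication, X \mapsto AX.
\begin{align*}
JX &= X^*, \qquad \mathcal{H}^+ = \{X : X \ge 0\}, \\
JAJ\,X &= (AX^*)^* = XA^*
   && \text{(right multiplication, so } J\mathfrak{M}J = \mathfrak{M}'\text{)}, \\
JAJ &= A^*
   && \text{for } A = c\,1 \text{ (the center is } \mathbb{C}1\text{)}, \\
J\Psi &= \Psi^* = \Psi
   && \text{for } \Psi \ge 0, \\
AJA\,\Psi &= A(A\Psi)^* = A\Psi A^* \ge 0
   && \text{for } \Psi \ge 0 .
\end{align*}
```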
If $\mathfrak{M}$ is an abstract W*-algebra, then we will say that $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ is its standard representation if $\pi : \mathfrak{M} \to \mathcal{B}(\mathcal{H})$ is an injective unital representation and $(\pi(\mathfrak{M}), \mathcal{H}, J, \mathcal{H}^+)$ is a standard form.

Theorem 2.5. Let $\mathfrak{M}$ be a W*-algebra with a faithful state $\omega$. Let $\pi : \mathfrak{M} \to \mathcal{B}(\mathcal{H})$ be the corresponding GNS representation with the cyclic vector $\Omega$. Let $J$ be the modular conjugation obtained by the Tomita–Takesaki theory and
\[ \mathcal{H}^+ := \{\pi(A)J\pi(A)\Omega : A \in \mathfrak{M}\}^{\mathrm{cl}} . \]
Then $\mathcal{H}^+$ is a self-dual cone and $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ is a standard representation of $\mathfrak{M}$. If $(\pi, \mathcal{H}, J_1, \mathcal{H}_1^+)$ is another standard representation of $\mathfrak{M}$ and $\Omega \in \mathcal{H}_1^+$, then $\mathcal{H}_1^+ = \mathcal{H}^+$ and $J_1 = J$.

Theorem 2.6. Every W*-algebra $\mathfrak{M}$ possesses a standard representation. Moreover, if $(\pi_1, \mathcal{H}_1, J_1, \mathcal{H}_1^+)$ and $(\pi_2, \mathcal{H}_2, J_2, \mathcal{H}_2^+)$ are two standard representations of $\mathfrak{M}$, then there exists a unique unitary operator $W' : \mathcal{H}_1 \to \mathcal{H}_2$ such that
\[ W'\pi_1(A) = \pi_2(A)W', \qquad W'\mathcal{H}_1^+ = \mathcal{H}_2^+ . \]
We then automatically have $W'J_1 = J_2W'$.

If $\mathfrak{M}$ is σ-finite, then Theorem 2.6 is proven e.g. in [12]. In this case the existence part follows from Theorem 2.5. If $\mathfrak{M}$ is not σ-finite, the theorem is proven using weights instead of states. The details can be found in [20, 23].

2.10. States and automorphisms in the standard representation

In this subsection we fix a W*-algebra in the standard form $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$.

Theorem 2.7. (1) $\mathcal{H}^+ \ni \Omega \mapsto \omega_\Omega \in \mathfrak{M}_*^+$ is a bijection. Its inverse will be denoted $\mathfrak{M}_*^+ \ni \omega \mapsto \Omega_\omega \in \mathcal{H}^+$.
(2) If $\Psi, \Phi \in \mathcal{H}^+$, then
\[ \|\Psi - \Phi\|^2 \le \|\omega_\Psi - \omega_\Phi\| \le \|\Psi - \Phi\|\,\|\Psi + \Phi\| . \]
(3) If $\Omega \in \mathcal{H}^+$, then $\Omega$ is cyclic ⇔ $\Omega$ is separating ⇔ $\omega_\Omega$ is faithful.
(4) For $\Omega \in \mathcal{H}^+$, $s'_\Omega = Js_\Omega J$.

The vector $\Omega_\omega \in \mathcal{H}^+$ will be called the standard vector representative of $\omega$.

A unitary operator $U$ on $\mathcal{H}$ is called a standard unitary operator iff
(1) $U\mathcal{H}^+ = \mathcal{H}^+$,
(2) $U\mathfrak{M}U^* = \mathfrak{M}$.

Theorem 2.8. (1) If $U$ is a standard unitary operator, then $JU = UJ$ and $U\mathfrak{M}'U^* = \mathfrak{M}'$.
(2) There exists a unique unitary representation
\[ \mathrm{Aut}(\mathfrak{M}) \ni \rho \mapsto U(\rho) \in \mathcal{U}(\mathcal{H}) \tag{2.3} \]
satisfying the following conditions:
(a) $U(\rho)AU(\rho)^* = \rho(A)$, $A \in \mathfrak{M}$;
(b) $U(\rho)\mathcal{H}^+ \subset \mathcal{H}^+$.
(3) The image of (2.3) is the group of all standard unitary operators.
(4) (2.3) is continuous if $\mathrm{Aut}(\mathfrak{M})$ is equipped with the pointwise σ-weak topology and $\mathcal{U}(\mathcal{H})$ with the strong operator topology.
(5) $U(\rho)\Omega_\omega = \Omega_{(\rho^{-1})^*\omega}$ for all $\omega \in \mathfrak{M}_*^+$.

$U(\rho)$ will be called the standard implementation of $\rho$.

Suppose that $t \mapsto \tau^t$ is a W*-dynamics on $\mathfrak{M}$ and let $U(\tau^t)$ be as in Theorem 2.8. Then there exists a unique self-adjoint $L$ such that $U(\tau^t) = e^{itL}$. The operator $L$ will be called the standard Liouvillean of the W*-dynamics $\tau$, or simply the Liouvillean of $\tau$.

Theorem 2.9. The Liouvillean of $\tau$ is the unique self-adjoint operator $L$ satisfying
\[ e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+, \qquad e^{itL}Ae^{-itL} = \tau^t(A), \qquad A \in \mathfrak{M}, \]
for all $t \in \mathbb{R}$.

The final result we wish to mention follows easily from Theorems 2.7 and 2.8. It has been a key tool in recent investigations of invariant states of a certain class of W*-dynamical systems called Pauli–Fierz systems [1, 7–10].

Theorem 2.10. Let $\tau$ be a W*-dynamics and $L$ the corresponding Liouvillean. Then
\[ \{\omega_\Phi : \Phi \in \mathcal{H}^+ \cap \operatorname{Ker} L\} = \{\omega \in \mathfrak{M}_*^+ : \omega \text{ is } \tau^t\text{-invariant}\} . \]
Consequently,
(1) $\dim \operatorname{Ker} L = 0$ ⇔ there are no normal τ-invariant states.
(2) $\dim \operatorname{Ker} L = 1$ ⇔ there exists exactly one normal τ-invariant state.

We will not make use of this result in our paper.

2.11. Comparison

In some circumstances the setups of Subsecs. 2.7 and 2.10 overlap. Recall that in Subsec. 2.7 we have a W*-algebra $\mathfrak{M}$ with a faithful state $\omega$. We can assume that $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ and that $\omega$ has a cyclic vector representative $\Omega$. By Theorem 2.5, we can construct $J$ and $\mathcal{H}^+$ so that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and $\Omega \in \mathcal{H}^+$.

Proposition 2.3. Let $\rho \in \mathrm{Aut}_\omega(\mathfrak{M})$. Suppose that $U \in \mathcal{U}(\mathcal{H})$ implements $\rho$, that is $\rho(A) = UAU^*$, $A \in \mathfrak{M}$. Then the following conditions are equivalent:
(1) $U\Omega = \Omega$ ($U = U^\Omega(\rho)$ is the Ω-implementation of $\rho$);
(2) $U\mathcal{H}^+ = \mathcal{H}^+$ ($U = U(\rho)$ is the standard implementation of $\rho$).
Proof. We know from Proposition 2.1 that the Ω-implementation of $\rho$ exists and is unique. We also know from Theorem 2.8 that the standard implementation of $\rho$ exists and is unique. Hence, it is sufficient to show the implication in one direction.

(2) ⇒ (1). The vector $U\Omega$ determines the state $\rho^*\omega = \omega$. Hence the vectors $U\Omega$, $\Omega$ belong to the cone $\mathcal{H}^+$ and determine the same state. This implies $U\Omega = \Omega$.

As a corollary, if the invariant state $\omega$ is faithful, then the concepts of the Ω-Liouvillean and the standard Liouvillean coincide.

Proposition 2.4. Let $t \mapsto \tau^t$ be a W*-dynamics on $\mathfrak{M}$ that leaves invariant a faithful state $\omega$. Suppose that $L$ is a self-adjoint operator such that $\tau^t(A) = e^{itL}Ae^{-itL}$. Then the following conditions are equivalent:
(1) $L\Omega = 0$ ($L = L^\Omega$ is the Ω-Liouvillean of $\tau$);
(2) For $t \in \mathbb{R}$, $e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+$ ($L$ is the standard Liouvillean of $\tau$).

2.12. KMS states

In this subsection we recall basic properties of KMS states. Let $(\mathfrak{M}, \tau^t)$ be a W*-dynamical system.

Definition 2.1. Let $\beta > 0$. $\omega \in \mathfrak{M}_*^{+,1}$ is called a $(\tau, \beta)$-KMS state if for any $A, B \in \mathfrak{M}$ there exists a function $F_{A,B}(z)$, analytic in the strip $\{z : 0 < \operatorname{Im} z < \beta\}$, continuous on its closure, and satisfying the KMS boundary conditions for $t \in \mathbb{R}$:
\[ F_{A,B}(t) = \omega(A\tau^t(B)), \qquad F_{A,B}(t + i\beta) = \omega(\tau^t(B)A) . \]

Theorem 2.11. Let $\omega$ be a $(\tau, \beta)$-KMS state and $\beta > 0$. Then
(1) $\omega$ is τ-invariant.
(2) $s_\omega = z_\omega$. (In particular, $\omega$ is faithful on $z_\omega\mathfrak{M}$.)
(3) If $B \in z_\omega\mathfrak{Z}$, where $\mathfrak{Z}$ is the center of $\mathfrak{M}$, then $\tau^t(B) = B$.
(4) Let $\tau_\omega$ be the dynamics on $z_\omega\mathfrak{M}$ generated by $\omega$. Then $\tau^t|_{z_\omega\mathfrak{M}} = \tau_\omega^{\beta t}$.

Theorem 2.12. Let $\omega$ be a faithful state on $\mathfrak{M}$ and $\tau_\omega$ the corresponding dynamics. Then $\omega$ is a $(\tau_\omega, 1)$-KMS state.

Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a standard form. We say that $\Omega$ is a standard $(\tau, \beta)$-KMS vector iff it is a standard vector representative of a $(\tau, \beta)$-KMS state. Suppose that $L$ is the Liouvillean of $\tau$. The following theorem gives a criterion for the KMS property expressed in terms of Hilbert spaces.

Theorem 2.13. Let $\Omega \in \mathcal{H}^+$ be a unit vector. Then
(1) $\Omega$ is a standard $(\tau, \beta)$-KMS vector iff $\mathfrak{M}\Omega \subset D(e^{-\beta L/2})$ and
\[ e^{-\beta L/2}A\Omega = JA^*\Omega, \qquad A \in \mathfrak{M} . \]
(2) If in addition $\Omega$ is cyclic and $\Delta_\Omega$ is the corresponding modular operator, then $\Delta_\Omega = e^{-\beta L}$.

2.13. Convergence

It is often convenient to reduce the study of W*-dynamics and normal states to the study of corresponding Liouvilleans and standard vector representatives. In this subsection we apply this point of view to the convergence properties of W*-dynamics, invariant states and KMS states.

Theorem 2.14. Assume that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a W*-algebra in the standard form.
(1) Suppose that $\tau_n$ is a sequence of W*-dynamics with Liouvilleans $L_n$, $L$ is a self-adjoint operator, and $L_n \to L$ in the strong resolvent sense. Then $\tau^t(A) := e^{itL}Ae^{-itL}$ is a W*-dynamics on $\mathfrak{M}$ and $L$ is its Liouvillean.
(2) Assume in addition that $\omega_n \in \mathfrak{M}_*^+$ are $\tau_n$-invariant and $\Omega_n$ are their standard vector representatives. Suppose also that $\text{w-}\lim_n \Omega_n = \Omega$. Then $\Omega \in \mathcal{H}^+$ and the functional $\omega_\Omega$ is τ-invariant.
(3) Assume in addition that $\omega_n$ are $(\tau_n, \beta)$-KMS states and that $\Omega \ne 0$. Then $\omega_{\Omega/\|\Omega\|}$ is a $(\tau, \beta)$-KMS state.

Proof. (1) Let $A \in \mathfrak{M}$. We have $\text{s-}\lim_{n\to\infty} e^{\pm itL_n} = e^{\pm itL}$, hence
\[ \text{s-}\lim_{n\to\infty} e^{itL_n}Ae^{-itL_n} = e^{itL}Ae^{-itL} \in \mathfrak{M} . \]
Therefore $\tau$ is a W*-dynamics. Since $\mathcal{H}^+$ is closed and $e^{itL_n}$ preserve $\mathcal{H}^+$, $e^{itL}$ preserves $\mathcal{H}^+$. Hence $L$ is the Liouvillean of $\tau$.

(2) Since $\mathcal{H}^+$ is weakly closed, $\Omega \in \mathcal{H}^+$. Moreover, since $\Omega_n \in D(L_n)$ and $L_n\Omega_n = 0$, by Proposition A.4, $\Omega \in D(L)$ and $L\Omega = 0$.

(3) Let $A \in \mathfrak{M}$. $\Omega_n$ are $(\tau_n, \beta)$-KMS vectors, hence $\exp(-\beta L_n/2)A\Omega_n = JA^*\Omega_n$. Since $\exp(-\beta L_n/2) \to \exp(-\beta L/2)$ in the strong resolvent sense, $JA^*\Omega_n \to JA^*\Omega$ weakly, and $A\Omega_n \to A\Omega$ weakly, it follows from Proposition A.4 that $A\Omega \in D(e^{-\beta L/2})$ and
\[ e^{-\beta L/2}A\Omega = JA^*\Omega . \tag{2.4} \]
Hence $\Omega/\|\Omega\|$ is a $(\tau, \beta)$-KMS vector.
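The KMS criterion of Theorem 2.13, which reappears as (2.4), can be verified by hand for a Gibbs state on matrices. The sketch below is only an illustration under assumptions that are not part of the text: $\mathfrak{M} = M_n(\mathbb{C})$ acts by left multiplication on the $n \times n$ matrices with the Hilbert–Schmidt scalar product, $JX = X^*$, the cone consists of the positive semidefinite matrices, and $H$ is a hypothetical self-adjoint matrix with $\tau^t(A) = e^{itH}Ae^{-itH}$.

```latex
% Assumed setting (not from the text):
% \Omega = Z^{-1/2} e^{-\beta H/2},  Z = \operatorname{Tr} e^{-\beta H},
% L X = HX - XH,  J X = X^*.
\begin{align*}
e^{-\beta L/2} X &= e^{-\beta H/2}\, X\, e^{\beta H/2}, \\
e^{-\beta L/2} A\Omega
  &= Z^{-1/2}\, e^{-\beta H/2} A\, e^{-\beta H/2} e^{\beta H/2}
   = Z^{-1/2}\, e^{-\beta H/2} A, \\
J A^* \Omega
  &= Z^{-1/2} \bigl(A^* e^{-\beta H/2}\bigr)^*
   = Z^{-1/2}\, e^{-\beta H/2} A .
\end{align*}
% In the same setting \Delta_\Omega X = e^{-\beta H} X e^{\beta H} = e^{-\beta L} X,
% in agreement with part (2) of Theorem 2.13.
```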
2.14. Analytic elements

Let $(\mathfrak{M}, \tau)$ be a W*-dynamical system. An element $A \in \mathfrak{M}$ is called τ-analytic if there exists a strip $I(r) = \{z : |\operatorname{Im} z| < r\}$ and a function $f : I(r) \to \mathfrak{M}$ such that:
(1) $f(t) = \tau^t(A)$ for $t \in \mathbb{R}$;
(2) $I(r) \ni z \mapsto \phi(f(z))$ is analytic for all $\phi \in \mathfrak{M}_*$.

Under these conditions we write $f(z) = \tau^z(A)$. A standard argument based on the uniform boundedness theorem shows that $f(z)$ is actually analytic in the norm of $\mathfrak{M}$. If $r = \infty$, then we say that $A$ is τ-entire.

For $A \in \mathfrak{M}$ and $n \in \mathbb{N}$ let
\[ A_n = \Bigl(\frac{n}{\pi}\Bigr)^{1/2} \int_{\mathbb{R}} e^{-nt^2} \tau^t(A)\, dt . \]

Theorem 2.15. $A_n$ is τ-entire and $A_n \to A$ in the σ-strong topology.

Thus the τ-entire elements form a σ-strongly dense subspace of $\mathfrak{M}$. This subspace is denoted by $\mathfrak{M}_\tau$. For additional discussion of analytic elements we refer the reader to [12].

3. The Perturbation Theory of W*-Dynamics

In this section, given a W*-dynamics $\tau$ and a perturbation $Q$, we construct a perturbed W*-dynamics $\tau_Q$. We also construct the so-called Araki–Dyson expansionals $E_Q^\tau(t)$ which intertwine these two dynamics. We describe these objects in three cases: for analytic perturbations, for bounded perturbations, and for a large class of unbounded perturbations. The constructions in the first two cases are well known, see [13, 26].

3.1. Bounded perturbations

Let $(\mathfrak{M}, \tau)$ be a W*-dynamical system and $Q$ a self-adjoint element of $\mathfrak{M}$. The following formula defines the W*-dynamics $\tau_Q$ on $\mathfrak{M}$:
\[ \tau_Q^t(A) = \sum_{n \ge 0} i^n \int_{0 \le t_n \le \cdots \le t_1 \le t} [\tau^{t_n}(Q), [\ldots, [\tau^{t_1}(Q), \tau^t(A)] \cdots]]\, dt_1 \cdots dt_n . \tag{3.1} \]
If $\delta$ is the generator of $\tau$, then the generator of $\tau_Q^t$ has the same domain as $\delta$ and equals
\[ \delta_Q(A) = \delta(A) + i[Q, A] . \]
Let $E_Q^\tau(t)$ be a one-parameter family of elements of $\mathfrak{M}$ given by
\[ E_Q^\tau(t) = \sum_{n \ge 0} i^n \int_{0 \le t_n \le \cdots \le t_1 \le t} \tau^{t_n}(Q) \cdots \tau^{t_1}(Q)\, dt_1 \cdots dt_n . \tag{3.2} \]
We will call $E_Q^\tau(t)$ the Araki–Dyson expansionals. Whenever there is no danger of confusion we will write $E_Q(t)$ for $E_Q^\tau(t)$. We remark that the integrals in (3.1) and (3.2) converge in the σ-weak topology and define a norm-convergent series of bounded operators.

The expansions (3.1) and (3.2) played an important role in the works of Schwinger, Tomonaga and Dyson on QED. The operators $E_Q^\tau(t)$ are closely related to the so-called Connes cocycles [25]. Let us list some properties of Araki–Dyson expansionals:

Theorem 3.1. Let $t, t_1, t_2 \in \mathbb{R}$. Then
(1) $E_Q(t)$ are unitary elements of $\mathfrak{M}$;
(2) $\tau_Q^t(A) = E_Q^\tau(t)\tau^t(A)E_Q^\tau(t)^{-1}$;
(3) $E_Q(t)^{-1} = E_Q(t)^* = \tau^t(E_Q(-t))$;
(4) $E_Q(t_1 + t_2) = E_Q(t_1)\tau^{t_1}(E_Q(t_2))$.

Assume in addition that $\mathfrak{M}$ is a concrete W*-algebra in $\mathcal{B}(\mathcal{H})$ and that $L$ is a self-adjoint operator on $\mathcal{H}$ such that $\tau^t(A) = e^{itL}Ae^{-itL}$ for $A \in \mathfrak{M}$. Then
(5) $\tau_Q^t(A) = e^{it(L+Q)}Ae^{-it(L+Q)}$ for $A \in \mathfrak{M}$;
(6) $E_Q(t) = e^{it(L+Q)}e^{-itL}$.
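In the concrete setting of items (5) and (6), the cocycle relation (4) and the second equality in (3) reduce to one-line computations. The following derivation is a sketch that assumes the representation (6) of $E_Q(t)$ and the implementation $\tau^t(A) = e^{itL}Ae^{-itL}$; it is not a substitute for the proof in the general (abstract) case.

```latex
% Sketch, assuming E_Q(t) = e^{it(L+Q)} e^{-itL} and \tau^t(A) = e^{itL} A e^{-itL}.
\begin{align*}
E_Q(t_1)\,\tau^{t_1}\bigl(E_Q(t_2)\bigr)
  &= e^{it_1(L+Q)} e^{-it_1 L}\; e^{it_1 L}\, e^{it_2(L+Q)} e^{-it_2 L}\, e^{-it_1 L} \\
  &= e^{i(t_1+t_2)(L+Q)}\, e^{-i(t_1+t_2)L} = E_Q(t_1+t_2), \\
\tau^{t}\bigl(E_Q(-t)\bigr)
  &= e^{itL}\, e^{-it(L+Q)} e^{itL}\, e^{-itL}
   = e^{itL} e^{-it(L+Q)} = E_Q(t)^{-1} .
\end{align*}
```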
3.2. Analytic perturbations

In this subsection we assume that $Q$ is τ-entire. Then $\tau_Q$ extends to $\mathbb{C}$ by the formula
\[ \tau_Q^z(A) = \sum_{n \ge 0} (iz)^n \int_{0 \le s_n \le \cdots \le s_1 \le 1} [\tau^{s_n z}(Q), [\ldots, [\tau^{s_1 z}(Q), \tau^z(A)] \cdots]]\, ds_1 \cdots ds_n , \tag{3.3} \]
valid for $A \in \mathfrak{M}_\tau$. Thus $\mathfrak{M}_\tau = \mathfrak{M}_{\tau_Q}$. For τ-analytic $Q$, the Araki–Dyson expansionals can be defined for all complex $z$ by
\[ E_Q^\tau(z) = \sum_{n \ge 0} (iz)^n \int_{0 \le s_n \le \cdots \le s_1 \le 1} \tau^{s_n z}(Q) \cdots \tau^{s_1 z}(Q)\, ds_1 \cdots ds_n . \tag{3.4} \]
The series (3.3) and (3.4) converge in norm uniformly for $z$ in compact sets and define analytic functions with values in $\mathfrak{M}$.

Theorem 3.2. Let $z, z_1, z_2 \in \mathbb{C}$. Then
(1) $E_Q(z) \in \mathfrak{M}_\tau$;
(2) $\tau_Q^z(A) = E_Q^\tau(z)\tau^z(A)E_Q^\tau(z)^{-1}$;
(3) $E_Q(z)^{-1} = E_Q(\bar z)^* = \tau^z(E_Q(-z))$;
(4) $E_Q(z_1 + z_2) = E_Q(z_1)\tau^{z_1}(E_Q(z_2))$.
Assume in addition that $\mathfrak{M}$ is a concrete W*-algebra in $\mathcal{B}(\mathcal{H})$ and that $L$ is a self-adjoint operator on $\mathcal{H}$ such that $\tau^t(A) = e^{itL}Ae^{-itL}$ for $A \in \mathfrak{M}$. Then
(5) $\tau_Q^z(A)e^{iz(L+Q)} = e^{iz(L+Q)}A$ for $A \in \mathfrak{M}_\tau$;
(6) $E_Q(z)e^{izL} = e^{iz(L+Q)}$.

3.3. Unbounded perturbations

In this subsection we consider a concrete W*-algebra $\mathfrak{M} \subset \mathcal{B}(\mathcal{H})$ with a W*-dynamics $\tau$ implemented by a self-adjoint operator $L$ and assume that $Q$ is a self-adjoint operator affiliated to $\mathfrak{M}$. We formulate the following assumption on $Q$:

Assumption 3.1. $L + Q$ is essentially self-adjoint on $D(L) \cap D(Q)$.

Theorem 3.3. Suppose that Assumption 3.1 holds and let
\[ \tau_Q^t(A) = e^{it(L+Q)}Ae^{-it(L+Q)} . \tag{3.5} \]
Then
(1) $\tau_Q$ is a W*-dynamics on $\mathfrak{M}$;
(2) If $Q$ is bounded, then $\tau_Q$ defined by (3.5) coincides with $\tau_Q$ defined by (3.1).

Proof. Let $A \in \mathfrak{M}$. The Trotter product formula (Theorem A.1) yields that
\[ \tau_Q^t(A) = \text{s-}\lim_{n\to\infty} (e^{itL/n}e^{itQ/n})^n A (e^{-itQ/n}e^{-itL/n})^n . \]
Since $\exp(\pm itQ/n) \in \mathfrak{M}$, $\tau_Q^t(A) \in \mathfrak{M}$. Therefore, $\tau_Q$ is a W*-dynamics and (1) is proven. (2) follows from Theorem 3.1 (5).

Under Assumption 3.1 we set
\[ E_Q^\tau(t) := e^{it(L+Q)}e^{-itL} . \tag{3.6} \]
Again, for simplicity we will often write $E_Q(t)$ for $E_Q^\tau(t)$. By the Trotter product formula
\[ E_Q(t) = \text{s-}\lim_{n\to\infty} \exp(itQ/n)\exp(it\tau^{t/n}(Q)/n) \cdots \exp(it\tau^{t(n-1)/n}(Q)/n) , \]
hence $E_Q(t) \in \mathfrak{M}$.

Theorem 3.4. Suppose that Assumption 3.1 holds. Then all the statements of Theorem 3.1 hold.

3.4. Perturbations of Liouvilleans

We continue with the setup of the previous subsection. In addition, we suppose that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and that $L$ is the Liouvillean of $\tau$. Define
\[ L_Q := L + Q - JQJ . \tag{3.7} \]
We introduce an additional hypothesis:

Assumption 3.2. The operator $L_Q$ is essentially self-adjoint on $D(L) \cap D(Q) \cap D(JQJ)$.

The main result of this section is:

Theorem 3.5. Assume that Assumptions 3.1 and 3.2 hold. Then $L_Q$ is the Liouvillean for $\tau_Q$.

Proof. We have to show that for $t \in \mathbb{R}$:
(1) $\tau_Q^t(A) = e^{itL_Q} A e^{-itL_Q}$, $A \in \mathfrak{M}$;
(2) $e^{itL_Q}\mathcal{H}^+ \subset \mathcal{H}^+$.

Clearly,
\[ e^{itJQJ} = J e^{-itQ} J \in \mathfrak{M}'. \tag{3.8} \]
By definition, $D(L+Q) \supset D(L) \cap D(Q)$. Therefore, $D(L+Q) \cap D(JQJ) \supset D(L) \cap D(Q) \cap D(JQJ)$. Hence, by Assumption 3.2, $L_Q$ is essentially self-adjoint on $D(L+Q) \cap D(JQJ)$, and we can use the Trotter formula (Theorem A.1) to write
\[ e^{itL_Q} = \operatorname*{s-lim}_{n\to\infty} \bigl(e^{it(L+Q)/n} e^{-itJQJ/n}\bigr)^n . \]
Therefore, for all $A \in \mathfrak{M}$,
\begin{align*}
\tau_Q^t(A) &= e^{it(L+Q)} A e^{-it(L+Q)} \\
&= \operatorname*{s-lim}_{n\to\infty} \bigl(e^{it(L+Q)/n} e^{-itJQJ/n}\bigr)^n A \bigl(e^{itJQJ/n} e^{-it(L+Q)/n}\bigr)^n \\
&= e^{itL_Q} A e^{-itL_Q}. \tag{3.9}
\end{align*}
This yields (1). To establish (2), note that since $e^{itQ}$ and $e^{itJQJ}$ commute,
\[ e^{it(Q-JQJ)} = e^{itQ} J e^{itQ} J . \]
Hence $e^{it(Q-JQJ)}\mathcal{H}^+ \subset \mathcal{H}^+$. Moreover, $e^{itL}\mathcal{H}^+ \subset \mathcal{H}^+$. By definition, $D(Q) \cap D(JQJ) \subset D(Q - JQJ)$. Therefore, $D(L) \cap D(Q-JQJ) \supset D(L) \cap D(Q) \cap D(JQJ)$. Hence $L_Q$ is essentially self-adjoint on $D(L) \cap D(Q-JQJ)$ and it follows from Theorem A.1 that
\[ e^{itL_Q} = \operatorname*{s-lim}_{n\to\infty} \bigl(e^{itL/n} e^{it(Q-JQJ)/n}\bigr)^n . \]
This and the fact that $\mathcal{H}^+$ is a closed set imply (2).
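The identity (3.8) used in the proof above rests only on the antilinearity of $J$ and on $J^2 = 1$. As a sketch, assuming for illustration that $Q$ is bounded so that the exponential series converges in norm (for unbounded $Q$ the same identity follows from the functional calculus), it can be checked term by term:

```latex
% Sketch under the assumption that Q is bounded; J is antilinear with J^2 = 1,
% so c J X J = J (\bar{c} X) J for every complex number c.
\begin{align*}
(JQJ)^k &= J Q^k J, \qquad k = 0, 1, 2, \dots, \\
e^{itJQJ} &= \sum_{k=0}^{\infty} \frac{(it)^k}{k!}\, J Q^k J
           = J \Bigl( \sum_{k=0}^{\infty} \frac{(-it)^k}{k!}\, Q^k \Bigr) J
           = J e^{-itQ} J , \qquad t \in \mathbb{R}.
\end{align*}
```

Since $e^{-itQ} \in \mathfrak{M}$ and $J\mathfrak{M}J = \mathfrak{M}'$ in a standard form, the right-hand side lies in $\mathfrak{M}'$.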
July 14, 2003 10:12 WSPC/148-RMP
464
00167
J. Dereziński, V. Jakšić & C.-A. Pillet
The following formulas are sometimes useful:

Theorem 3.6. (1) Assume that Assumptions 3.1 and 3.2 hold. Then for $t \in \mathbb{R}$,
\[ E_Q(t) = e^{itL_Q} e^{-it(L-JQJ)}, \qquad e^{itL_Q} = J E_Q(t) J e^{itL} E_Q(-t)^{-1}. \]
(2) Assume that $Q$ is $\tau$-analytic. Then for $z \in \mathbb{C}$,
\[ E_Q(z) = e^{izL_Q} e^{-iz(L-JQJ)}, \qquad e^{izL_Q} = J E_Q(\bar z) J e^{izL} E_Q(-z)^{-1}. \]

4. Relative Modular Theory and Relative Entropy

Relative modular theory and relative entropy are among the main tools used in our paper. We devote this section to a concise introduction to this subject. Our presentation follows partly [3, 4, 6, 27, 28].

4.1. Relative modular operator

Let $\mathfrak{M} \subset B(\mathcal{H})$ be a $W^*$-algebra. Let $\Phi, \Psi \in \mathcal{H}$. Following Araki [28], we define the operator $S_{\Phi,\Psi}$ on the domain $\mathfrak{M}\Psi + (1 - s'_\Psi)\mathcal{H}$ by
\[ S_{\Phi,\Psi}(A\Psi + \Theta) = s_\Psi A^* \Phi , \]
where $A \in \mathfrak{M}$ and $\Theta \in (1 - s'_\Psi)\mathcal{H} = (\mathfrak{M}\Psi)^\perp$. It is easy to check that $S_{\Phi,\Psi}$ is a well defined antilinear closable operator. Its closure will be denoted by the same symbol. It is useful to note that
\[ \mathfrak{M}\Psi = \{A\Psi : A \in \mathfrak{M},\ A s_\Psi = A\} , \]
and that for $A \in \mathfrak{M}$ satisfying $A s_\Psi = A$ and $\Theta$ as above we have
\[ S_{\Phi,\Psi}(A\Psi + \Theta) = A^* \Phi . \tag{4.1} \]
The positive operator
\[ \Delta_{\Phi,\Psi} = S_{\Phi,\Psi}^* S_{\Phi,\Psi} \]
will be called the relative modular operator. The following facts are proven in [28]:

Theorem 4.1.
(1) $\operatorname{Ker} \Delta_{\Phi,\Psi} = \operatorname{Ker} s'_\Psi s_\Phi$;
(2) $\Delta_{\lambda\Phi,\mu\Psi} = \frac{\lambda^2}{\mu^2} \Delta_{\Phi,\Psi}$, $\lambda, \mu \in \mathbb{R}$;
(3) if $B$ belongs to the center of $\mathfrak{M}$, then $B$ commutes with $\Delta_{\Phi,\Psi}$.
In the remaining part of the theorem we assume that $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ is a standard form and $\Phi, \Psi \in \mathcal{H}^+$. Then
(4) $S_{\Phi,\Psi} = J \Delta_{\Phi,\Psi}^{1/2}$;
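(5) $\Delta_{\Phi,\Psi}^{1/2} \Psi = \Delta_{\Phi,\Psi}^{1/2} s_\Phi \Psi = s'_\Psi \Phi$;
(6) $J \Delta_{\Psi,\Phi} J \Delta_{\Phi,\Psi} = \Delta_{\Phi,\Psi} J \Delta_{\Psi,\Phi} J = s'_\Psi s_\Phi$.

The following convergence property of relative modular operators will be useful.

Theorem 4.2. Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a standard form. Suppose that $\Psi_n, \Phi_n \in \mathcal{H}^+$, that $\Delta_{\Phi_n,\Psi_n} \to M$ in the strong resolvent sense, and that $\operatorname*{w-lim}_n \Psi_n = \Psi$, $\operatorname*{s-lim}_n s_{\Psi_n} = s_\Psi$ and $\operatorname*{w-lim}_n \Phi_n = \Phi$. Then $M = \Delta_{\Phi,\Psi}$.

Proof. For $A \in \mathfrak{M}$,
\[ \Delta_{\Phi_n,\Psi_n}^{1/2} A\Psi_n = J s_{\Psi_n} A^* \Phi_n . \]
Note that $A\Psi_n \to A\Psi$ weakly and $J s_{\Psi_n} A^* \Phi_n \to J s_\Psi A^* \Phi$ weakly. Hence, by Proposition A.4 and the remark after it, $A\Psi \in D(M)$ and
\[ M A\Psi = J s_\Psi A^* \Phi . \]
Now let $\Theta \in (1 - s'_\Psi)\mathcal{H}$ and $\Theta_n := (1 - s'_{\Psi_n})\Theta$. Since $s'_{\Psi_n} \to s'_\Psi$ strongly, $\Theta_n \to \Theta$ strongly. Since $\Delta_{\Phi_n,\Psi_n}\Theta_n = 0$, $\Theta \in D(M)$ and $M\Theta = 0$. This yields $M = \Delta_{\Phi,\Psi}$.

4.2. Relative entropy

Let $\mathfrak{M}$ be a $W^*$-algebra. The relative entropy of two functionals $\psi, \phi \in \mathfrak{M}_*^+$, denoted $\mathrm{Ent}(\psi|\phi)$, is defined as follows. Choose a standard form $(\pi, \mathcal{H}, J, \mathcal{H}^+)$ of $\mathfrak{M}$ and let $\Psi, \Phi$ be the standard vector representatives of $\psi, \phi$. Then
\[ \mathrm{Ent}(\psi|\phi) = \begin{cases} (\Psi| \log \Delta_{\Phi,\Psi} \Psi) & \text{if } s_\psi \le s_\phi , \\ -\infty & \text{otherwise}. \end{cases} \]
The relative entropy was introduced by Araki in the fundamental papers [27, 28]. In the above definition we used the sign and ordering convention of [13]. The relative entropy is discussed in detail in the monograph [4]. We will need the following well-known facts [3, 4, 27, 28].

Theorem 4.3.
(1)
\[ \mathrm{Ent}(\psi|\phi) = \lim_{t\downarrow 0} t^{-1}\bigl(\|\Delta_{\Phi,\Psi}^{t/2}\Psi\|^2 - \|\Psi\|^2\bigr) ; \tag{4.2} \]
(2) for $\mu, \lambda \in \mathbb{R}_+$, $\mathrm{Ent}(\lambda\psi|\mu\phi) = \lambda\,\mathrm{Ent}(\psi|\phi) + \lambda\psi(1)(\log\mu - \log\lambda)$;
(3) $\mathrm{Ent}(\psi|\phi) \le \psi(1)(\log\phi(s_\psi) - \log\psi(1))$; in particular, if $\phi(s_\psi) = \psi(1)$ then $\mathrm{Ent}(\psi|\phi) \le 0$;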
(4) if $Q$ is a self-adjoint element in the center of $\mathfrak{M}$ and $\psi(1) = 1$, then $\mathrm{Ent}(\psi|\phi) + \psi(Q) \le \log\phi(e^Q)$.

Proof. (1) Assume first that $s_\Psi \le s_\Phi$. Then the statement follows from the spectral theorem, the monotone convergence theorem and the fact that
\[ \lim_{t\downarrow 0} t^{-1}(x^t - 1) = \log x , \]
decreasingly on $]0, \infty[$. If $s_\Phi \Psi \ne \Psi$, then $\Psi = \Psi_1 + \Psi_2$, where $\Psi_1 \ne 0$, $\Psi_1 \perp \Psi_2$ and $\Psi_1 \in \operatorname{Ker}\Delta_{\Phi,\Psi}$, and one easily shows that the limit in (4.2) is $-\infty$.

The scaling property of Theorem 4.1 yields (2).

We first prove part (3) under the assumption $\phi(s_\psi) = \psi(1) = 1$. Using
\[ \log x \le x - 1 , \qquad x > 0 , \tag{4.3} \]
we get $\log\Delta_{\Phi,\Psi} \le \Delta_{\Phi,\Psi} - 1$. Thus
\[ \mathrm{Ent}(\psi|\phi) \le \|\Delta_{\Phi,\Psi}^{1/2}\Psi\|^2 - \|\Psi\|^2 = \phi(s_\psi) - \psi(1) = 0 . \]
(We used $\Delta_{\Phi,\Psi}^{1/2}\Psi = s'_\Psi\Phi = J s_\Psi \Phi$.) To extend (3) to arbitrary $\phi, \psi$, use (2).

To prove (4), note that since $e^Q$ commutes with $\Delta_{\Phi,\Psi}$,
\[ \log\Delta_{\Phi,\Psi} + Q - \log\phi(e^Q s_\psi) = \log\bigl(\Delta_{\Phi,\Psi} e^Q/\phi(e^Q s_\psi)\bigr) . \]
The inequality (4.3) yields
\[ \log\bigl(\Delta_{\Phi,\Psi} e^Q/\phi(e^Q s_\psi)\bigr) \le \Delta_{\Phi,\Psi} e^Q/\phi(e^Q s_\psi) - 1 . \]
Hence
\begin{align*}
\mathrm{Ent}(\psi|\phi) + \psi(Q) - \log\phi(e^Q s_\psi)
&\le \|\Delta_{\Phi,\Psi}^{1/2} e^{Q/2}\Psi\|^2/\phi(e^Q s_\psi) - 1 \\
&= \|e^{Q/2} s'_\psi \Phi\|^2/\phi(e^Q s_\psi) - 1 = 0 ,
\end{align*}
where we used $\|e^{Q/2} s'_\psi\Phi\| = \|e^{Q/2} J s_\psi J\Phi\| = \|J e^{Q/2} s_\psi \Phi\| = \|e^{Q/2} s_\psi\Phi\|$.

4.3. Uhlmann's monotonicity theorem

In this subsection we prove a relative entropy inequality due to Uhlmann [6]. Our proof follows the steps of an argument in [4] and is based on an interpolation theorem for self-adjoint operators (Theorem A.2 in the appendix). A different proof can be found in [29].

Let $\mathfrak{M}_1$ and $\mathfrak{M}_2$ be $W^*$-algebras. A map $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ is called a Schwarz map iff $\gamma(1) = 1$ and $\gamma(A^*A) \ge \gamma(A)^*\gamma(A)$.
Theorem 4.4 (Uhlmann's monotonicity theorem). Let $\psi_i, \phi_i$ be normal states on $\mathfrak{M}_i$, $i = 1, 2$, and let $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ be a Schwarz map such that
\[ \psi_2 \circ \gamma = \psi_1 , \tag{4.4} \]
\[ \phi_2 \circ \gamma = \phi_1 . \tag{4.5} \]
Then
\[ \mathrm{Ent}(\psi_2|\phi_2) \le \mathrm{Ent}(\psi_1|\phi_1) . \]

The following inequality is a consequence of Uhlmann's theorem:

Corollary 4.1. Let $\mathfrak{N} \subset \mathfrak{M}$ be $W^*$-algebras with common identity and $\psi, \phi \in \mathfrak{M}_*^{+,1}$. Then
\[ \mathrm{Ent}(\psi|\phi) \le \mathrm{Ent}(\psi|_{\mathfrak{N}} \,|\, \phi|_{\mathfrak{N}}) . \]

Proof. The inclusion map $\gamma : \mathfrak{N} \to \mathfrak{M}$ is Schwarz and satisfies the conditions of Theorem 4.4 with respect to $\psi, \phi$ and the restricted states $\psi|_{\mathfrak{N}}, \phi|_{\mathfrak{N}}$.

To prove Uhlmann's theorem it is convenient to work in the standard representation and to translate the problem into the language of operators on Hilbert spaces. Hence we assume that $\mathfrak{M}_i \subset B(\mathcal{H}_i)$ and that $(\mathfrak{M}_i, \mathcal{H}_i, J_i, \mathcal{H}_i^+)$ is a standard form. Let $\gamma : \mathfrak{M}_1 \to \mathfrak{M}_2$ be a Schwarz map. Let $\psi_i \in \mathfrak{M}_{i,*}^+$ satisfy (4.4) and let $\Psi_i$ be the standard vector representatives of $\psi_i$. Set
\[ D_1 := \mathfrak{M}_1\Psi_1 + (\mathfrak{M}_1\Psi_1)^\perp . \]
We define a linear map $T : D_1 \to \mathcal{H}_2$ by
\[ T(A\Psi_1 + \Theta_1) := \gamma(A)\Psi_2 \]
for $A \in \mathfrak{M}_1$ and $\Theta_1 \in (\mathfrak{M}_1\Psi_1)^\perp$. Since $\gamma(1) = 1$, $T\Psi_1 = \Psi_2$.

Lemma 4.1. The map $T$ is well defined and extends to a contraction from $\mathcal{H}_1$ to $\mathcal{H}_2$.

Proof.
\[ \|\gamma(A)\Psi_2\|^2 = \psi_2(\gamma(A)^*\gamma(A)) \le \psi_2(\gamma(A^*A)) = \psi_1(A^*A) = \|A\Psi_1\|^2 . \tag{4.6} \]
Hence if $(A - B)\Psi_1 = 0$, then $(\gamma(A) - \gamma(B))\Psi_2 = 0$. Therefore, $T$ is well defined. By (4.6), $T$ is a contraction.

Let $\Phi_i$ be the standard vector representative of $\phi_i$. The main step of the proof of Theorem 4.4 is the following interpolation estimate for the relative modular operator:
Lemma 4.2. For $0 \le t \le 1$,
\[ \|\Delta_{\Phi_2,\Psi_2}^{t/2}\Psi_2\| \le \|\Delta_{\Phi_1,\Psi_1}^{t/2}\Psi_1\| . \]

Proof. The space $D_1$, defined above, is a core for $\Delta_{\Phi_1,\Psi_1}^{1/2}$. Let $A \in \mathfrak{M}_1$ with $A = A s_{\Psi_1}$. For $\Omega_1 = A\Psi_1 + \Theta_1 \in D_1$ we get
\begin{align*}
\Delta_{\Phi_2,\Psi_2}^{1/2} T\Omega_1 &= \Delta_{\Phi_2,\Psi_2}^{1/2}\gamma(A)\Psi_2 = J s_{\Psi_2}\gamma(A)^*\Phi_2 , \\
\Delta_{\Phi_1,\Psi_1}^{1/2} \Omega_1 &= \Delta_{\Phi_1,\Psi_1}^{1/2} A\Psi_1 = J A^*\Phi_1 .
\end{align*}
By (4.5),
\[ \|J s_{\Psi_2}\gamma(A)^*\Phi_2\|^2 \le \phi_2(\gamma(A)\gamma(A)^*) \le \phi_2(\gamma(AA^*)) = \phi_1(AA^*) = \|JA^*\Phi_1\|^2 . \]
Hence
\[ \|\Delta_{\Phi_2,\Psi_2}^{1/2} T\Omega_1\| \le \|\Delta_{\Phi_1,\Psi_1}^{1/2}\Omega_1\| . \]
By Lemma 4.1, $T$ is a contraction. Hence, by Theorem A.2, for $t \in [0, 1]$,
\[ \|\Delta_{\Phi_2,\Psi_2}^{t/2} T\Omega_1\| \le \|\Delta_{\Phi_1,\Psi_1}^{t/2}\Omega_1\| . \]
Setting $\Omega_1 = \Psi_1$ we derive the statement.

Proof of Theorem 4.4. Using Theorem 4.3 (1), Lemma 4.2 and $1 = \|\Psi_1\|^2 = \|\Psi_2\|^2$, we obtain
\begin{align*}
\mathrm{Ent}(\psi_2|\phi_2) &= \lim_{t\downarrow 0} t^{-1}\bigl(\|\Delta_{\Phi_2,\Psi_2}^{t/2}\Psi_2\|^2 - \|\Psi_2\|^2\bigr) \\
&\le \lim_{t\downarrow 0} t^{-1}\bigl(\|\Delta_{\Phi_1,\Psi_1}^{t/2}\Psi_1\|^2 - \|\Psi_1\|^2\bigr) \\
&= \mathrm{Ent}(\psi_1|\phi_1) .
\end{align*}

5. Perturbation Theory of KMS States

Let $\beta > 0$. In this section, given a $(\tau, \beta)$-KMS state $\omega$ and a perturbation $Q$, we describe the construction of the perturbed $\beta$-KMS state $\omega_Q$. We also prove various properties of this state, including the Peierls–Bogoliubov and the Golden–Thompson inequalities. The Golden–Thompson inequality plays an important role in our construction.

The construction is performed on three levels: for analytic perturbations, bounded perturbations and a class of unbounded perturbations. Although the results on the first two levels are well known, the method of proof on the second level (bounded perturbations) is new. The results concerning unbounded perturbations are new and they are the main results of our paper.
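5.1. Bounded perturbations

Let $(\mathfrak{M}, \mathcal{H}, J, \mathcal{H}^+)$ be a $W^*$-algebra in the standard form. Let $\tau$ be the $W^*$-dynamics on $\mathfrak{M}$ with the standard Liouvillean $L$. Let $\omega$ be a faithful $(\tau, \beta)$-KMS state with the standard vector representative $\Omega$. Let $Q \in \mathfrak{M}$ be self-adjoint and $\tau_Q$ the perturbed $W^*$-dynamics defined by (3.1). By Theorem 3.5, $L_Q = L + Q - JQJ$ is the standard Liouvillean of $\tau_Q$. The following two theorems summarize the (bounded) perturbation theory of KMS states developed by Araki.

Theorem 5.1.
(1) $\Omega \in D(e^{-\beta(L+Q)/2})$. Set
\[ \Omega_Q := e^{-\beta(L+Q)/2}\Omega , \qquad \omega_Q(A) = (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2 . \]
(2) $\Omega_Q \in \mathcal{H}^+$.
(3) $\Omega_Q$ is a cyclic and separating vector for $\mathfrak{M}$.
(4) The state $\omega_Q$ is a $(\tau_Q, \beta)$-KMS state.
(5) $\log\Delta_{\Omega_Q} = -\beta L_Q$.
(6) For all self-adjoint $Q_1, Q_2 \in \mathfrak{M}$,
\[ (\Omega_{Q_1})_{Q_2} = \Omega_{Q_1+Q_2} , \qquad (\omega_{Q_1})_{Q_2} = \omega_{Q_1+Q_2} . \]
(7) $\log\Delta_{\Omega_Q,\Omega} = \log\Delta_\Omega - \beta Q$.
(8) $\log\Delta_{\Omega,\Omega_Q} = \log\Delta_{\Omega_Q} + \beta Q$.
(9) $\mathrm{Ent}(\omega|\omega_Q) + \beta\omega(Q) = -\log\|\Omega_Q\|^2$.
(10) $\mathrm{Ent}(\omega_Q|\omega) - \beta\omega_Q(Q) = \log\|\Omega_Q\|^2$.
(11) The Peierls–Bogoliubov inequality holds: $e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\|$.
(12) The Golden–Thompson inequality holds: $\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\|$.
(13) Assume that $Q_n \in \mathfrak{M}$ are self-adjoint and $Q_n \to Q$ strongly. Then $\Omega_{Q_n} \to \Omega_Q$ and $\omega_{Q_n} \to \omega_Q$ in norm.

Theorem 5.2. Let
\[ T_{\beta,n} = \{(\beta_1, \dots, \beta_n) \in \mathbb{R}^n : \beta_i \ge 0,\ i = 1, \dots, n,\ \beta_1 + \cdots + \beta_n \le \beta/2\} . \]
Then $\Omega \in D(e^{-\beta_1 L}Q \cdots e^{-\beta_n L}Q)$ for $(\beta_1, \dots, \beta_n) \in T_{\beta,n}$, the function
\[ T_{\beta,n} \ni (\beta_1, \dots, \beta_n) \mapsto e^{-\beta_1 L}Q \cdots e^{-\beta_n L}Q\,\Omega \]
is norm continuous,
\[ \sup_{(\beta_1,\dots,\beta_n)\in T_{\beta,n}} \|e^{-\beta_1 L}Q \cdots e^{-\beta_n L}Q\,\Omega\| \le \|Q\|^n , \tag{5.1} \]
and
\[ \Omega_Q = \sum_{n=0}^{\infty} (-1)^n \int\cdots\int_{T_{\beta,n}} e^{-\beta_1 L}Q \cdots e^{-\beta_n L}Q\,\Omega \; d\beta_1 \cdots d\beta_n . \tag{5.2} \]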
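To make the structure of the expansion (5.2) concrete, its lowest-order terms can be written out explicitly. The following is only an unpacking of the formula for $n = 0, 1, 2$, with the usual convention that the $n = 0$ term is $\Omega$ itself; nothing beyond Theorem 5.2 is assumed.

```latex
% Lowest-order terms of the expansion (5.2); requires amsmath for \substack.
\begin{align*}
\Omega_Q = \Omega
 &- \int_0^{\beta/2} e^{-\beta_1 L} Q\,\Omega \; d\beta_1 \\
 &+ \int_{\substack{\beta_1,\beta_2 \ge 0 \\ \beta_1+\beta_2 \le \beta/2}}
     e^{-\beta_1 L} Q\, e^{-\beta_2 L} Q\,\Omega \; d\beta_1\, d\beta_2
 \;-\; \cdots
\end{align*}
```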
We have separated Theorem 5.2 from the other results of Araki's theory for several reasons. Theorem 5.2 contains the main idea of Araki's original proof of Theorem 5.1. In fact, his proof was centered around the expansion (5.2). Our methods are in a certain sense orthogonal to Araki's and we do not need Theorem 5.2 to prove Theorem 5.1. The expansion (5.2) provides additional information about $\Omega_Q$ which, strictly speaking, cannot be derived by our methods alone. Hence, for bounded perturbations our method yields a slightly weaker result than Araki's method. On the other hand, our method is simpler and easily extends to a large class of unbounded perturbations $Q$.

Both Araki's method and ours start with analytic perturbations. In this case, the proofs of Theorems 5.1 and 5.2 are essentially algebraic and relatively easy. For a general bounded $Q$ one picks a sequence of analytic $Q_n$ with $Q_n \to Q$ and uses various limit arguments to establish the theorems. The key difference between the two methods concerns these limit arguments: we use weak limits while Araki uses strong limits. The use of weak limits leads to some technical simplifications and the method naturally extends to unbounded perturbations.

Finally, we mention some additional estimates which can be used to compare $\Omega$ with $\Omega_Q$.

Theorem 5.3.
(1) $\|\Omega_Q - \Omega\| \le e^{\beta\|Q\|/2} - 1$.
(2) $\beta(\Omega|Q\Omega)/2 \ge \|\Omega\|^2 - (\Omega|\Omega_Q) \ge \beta(\Omega|Q\Omega_Q)/2 \ge (\Omega|\Omega_Q) - \|\Omega_Q\|^2 \ge \beta(\Omega_Q|Q\Omega_Q)/2$.
(3) $\beta(\Omega|Q\Omega) \ge \|\Omega\|^2 - \|\Omega_Q\|^2 \ge \beta(\Omega_Q|Q\Omega_Q)$.
(4) $\|\Omega_Q - \Omega\|^2 \le \beta(\Omega|Q\Omega)/2 - \beta(\Omega_Q|Q\Omega_Q)/2$.
(5) $\|\Omega_Q - \Omega\| \le \beta f(\|Q\Omega\|, \|Q\Omega_Q\|)/2$, where, for $x, y > 0$, we set
\[ f(x, y) := \begin{cases} \dfrac{x - y}{\log x - \log y} , & x \ne y ; \\[1ex] x , & x = y . \end{cases} \]

The estimate (1) follows immediately from (5.1) and is of course well-known. The estimates (2)–(5) appear to be new.
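The remark that estimate (1) follows from (5.1) can be spelled out as follows. The sketch uses the expansion (5.2), whose $n = 0$ term is $\Omega$, together with the standard fact that the simplex $T_{\beta,n}$ has volume $(\beta/2)^n/n!$.

```latex
% Estimate (1) of Theorem 5.3 from (5.1) and (5.2), using vol(T_{beta,n}) = (beta/2)^n / n!.
\begin{align*}
\|\Omega_Q - \Omega\|
 &\le \sum_{n=1}^{\infty} \int\cdots\int_{T_{\beta,n}}
      \| e^{-\beta_1 L} Q \cdots e^{-\beta_n L} Q\,\Omega \| \; d\beta_1 \cdots d\beta_n \\
 &\le \sum_{n=1}^{\infty} \|Q\|^n \, \frac{(\beta/2)^n}{n!}
  = e^{\beta \|Q\|/2} - 1 .
\end{align*}
```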
5.2. Analytic perturbations: proofs
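In this section we prove Theorem 5.1 for analytic self-adjoint perturbations $Q \in \mathfrak{M}_\tau$. The proofs are based on algebraic arguments and are relatively easy.

Proof of Theorem 5.1 in the analytic case. (1) For $t$ real,
\[ E_Q(t)\Omega = e^{it(L+Q)} e^{-itL}\Omega = e^{it(L+Q)}\Omega . \]
Since $E_Q(t)$ has an analytic continuation to an entire function $z \mapsto E_Q(z)$, $\Omega \in D(e^{iz(L+Q)})$ for all $z \in \mathbb{C}$ and $E_Q(z)\Omega = e^{iz(L+Q)}\Omega$. In particular,
\[ \Omega_Q = E_Q(i\beta/2)\Omega . \tag{5.3} \]

(2) We have
\[ E_Q(i\beta/2) = E_Q(i\beta/4)\,\tau^{i\beta/4}(E_Q(i\beta/4)) = E_Q(i\beta/4)\,\tau^{i\beta/2}(E_Q(i\beta/4)^*) . \]
Hence, by (5.3),
\[ \Omega_Q = E_Q(i\beta/4) e^{-\beta L/2} E_Q(i\beta/4)^*\Omega = E_Q(i\beta/4) J E_Q(i\beta/4)\Omega . \]
Therefore, $\Omega_Q \in \mathcal{H}^+$.

(3) Since $E_Q(i\beta/2)$ is an invertible element of $\mathfrak{M}$, $\Omega_Q$ is obviously a cyclic and separating vector for $\mathfrak{M}$.

(4) Theorem 3.6 yields
\[ e^{-\beta L_Q/2} = J E_Q(-i\beta/2) J e^{-\beta L/2} E_Q(-i\beta/2)^{-1} , \]
and $\mathfrak{M}\Omega_Q = \mathfrak{M}\Omega \subset D(e^{-\beta L_Q/2})$. Moreover, for $A \in \mathfrak{M}$,
\begin{align*}
e^{-\beta L_Q/2} A\Omega_Q
&= J E_Q(-i\beta/2) J e^{-\beta L/2} E_Q(-i\beta/2)^{-1} A E_Q(i\beta/2)\Omega \\
&= J E_Q(-i\beta/2) E_Q(i\beta/2)^* A^* \bigl(E_Q(-i\beta/2)^{-1}\bigr)^*\Omega \\
&= J E_Q(-i\beta/2) E_Q(-i\beta/2)^{-1} A^* E_Q(i\beta/2)\Omega \\
&= J A^*\Omega_Q .
\end{align*}

(5) By Theorem 3.5, we know that $L_Q := L + Q - JQJ$ is the Liouvillean of $\tau_Q$. By Theorem 2.13 we know that $\Delta_{\Omega_Q} = e^{-\beta L_Q}$.

(6) follows from
\[ E_{Q_2}^{\tau_{Q_1}}(i\beta/2)\, E_{Q_1}^{\tau}(i\beta/2) = E_{Q_1+Q_2}^{\tau}(i\beta/2) , \]
which is an immediate consequence of Theorem 3.1 (6), where $L + Q_1$ is to be used for $L$ in the expression for $E_{Q_2}^{\tau_{Q_1}}(t)$.

(7) The relation
\[ S_\Omega E_Q(i\beta/2)^* A\Omega = A^*\Omega_Q = S_{\Omega_Q,\Omega} A\Omega \]
implies that $S_{\Omega_Q,\Omega} = S_\Omega E_Q(i\beta/2)^*$. Hence
\begin{align*}
\Delta_{\Omega_Q,\Omega} &= S_{\Omega_Q,\Omega}^* S_{\Omega_Q,\Omega} \\
&= E_Q(i\beta/2)\Delta_\Omega E_Q(i\beta/2)^* \\
&= \bigl(E_Q(i\beta/2) e^{-\beta L/2}\bigr)\bigl(e^{-\beta L/2} E_Q(i\beta/2)^*\bigr) \\
&= e^{-\beta(L+Q)} ,
\end{align*}
where we used $\Delta_\Omega = e^{-\beta L}$.

(8) follows from (7) if we note that, by (6), $(\Omega_Q)_{-Q} = \Omega$.

(9) Set $\tilde Q := Q + \beta^{-1}\log\|\Omega_Q\|^2$. Then $\omega_Q = \omega_{\tilde Q}$ and $\Omega_{\tilde Q} := \Omega_Q/\|\Omega_Q\|$. Using (7) we get
\[ \log\Delta_{\Omega_{\tilde Q},\Omega} = \log\Delta_\Omega - \beta\tilde Q , \]
which implies
\[ \mathrm{Ent}(\omega|\omega_Q) = -\beta\omega(\tilde Q) . \]

(10) Similarly, using (8) we get
\[ \log\Delta_{\Omega,\Omega_{\tilde Q}} = \log\Delta_{\Omega_{\tilde Q}} + \beta\tilde Q , \]
which implies
\[ \mathrm{Ent}(\omega_Q|\omega) = \beta\omega_Q(\tilde Q) . \]

(11) Since $\mathrm{Ent}(\omega|\omega_Q) \le 0$, (9) yields that
\[ e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\| . \]
This is the Peierls–Bogoliubov inequality.

(12) Let $\mathfrak{N}$ be the Abelian von Neumann subalgebra of $\mathfrak{M}$ generated by $Q$. Then
\begin{align*}
\log\|\Omega_Q\|^2 &= \mathrm{Ent}(\omega_Q|\omega) - \beta\omega_Q(Q) \\
&\le \mathrm{Ent}(\omega_Q|_{\mathfrak{N}} \,|\, \omega|_{\mathfrak{N}}) - \beta\omega_Q(Q) \\
&\le \log\omega(e^{-\beta Q}) = \log\|e^{-\beta Q/2}\Omega\|^2 , \tag{5.4}
\end{align*}
and so
\[ \|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| . \]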
This is the Golden–Thompson inequality. In the first step of (5.4) we used (10), in the second, Uhlmann's estimate of Corollary 4.1, and in the third, Theorem 4.3 (4) with $Q$ replaced by $-\beta Q$.

(13) is a general fact which has the same proof for analytic and bounded perturbations. Its proof is given in the next section.

We remark that the Golden–Thompson inequality was first proven by Araki [5]. The proof described in (12) is due to Donald [3].

5.3. Bounded perturbations: proofs
In this subsection we prove Theorem 5.1. We assume that $Q$ is an arbitrary self-adjoint element of $\mathfrak{M}$. By Theorem 2.15, we can find a sequence $Q_n$ of self-adjoint $\tau$-analytic elements such that $Q_n \to Q$ $\sigma$-strongly. This implies that $Q_n \to Q$ strongly and the following lemma holds:

Lemma 5.1.
(1) $L + Q_n \to L + Q$ in the strong resolvent sense.
(2) $L_{Q_n} \to L_Q$ in the strong resolvent sense.

Proof of Theorem 5.1. (1) Clearly, $\lim_n e^{-\beta Q_n/2}\Omega = e^{-\beta Q/2}\Omega$. Hence there exists $C$ such that for all $n$,
\[ \|e^{-\beta Q_n/2}\Omega\| \le C . \]
By the Golden–Thompson inequality for analytic perturbations,
\[ \|\Omega_{Q_n}\| \le \|e^{-\beta Q_n/2}\Omega\| . \]
Hence $\|\Omega_{Q_n}\| \le C$. Now by Proposition A.4, $\Omega \in D(e^{-\beta(L+Q)/2})$ and
\[ \operatorname*{w-lim}_{n\to\infty} e^{-\beta(L+Q_n)/2}\Omega = e^{-\beta(L+Q)/2}\Omega . \]

(2) follows from the analytic case of (2) and the fact that $\mathcal{H}^+$ is weakly closed.

(3) Let $P := 1 - s_{\Omega_Q}$. Clearly, $P \in \mathfrak{M}$, $\tau_Q^t(P) = P$ and $P\Omega_Q = 0$. Set
\[ \Omega(z) = e^{-z(L+Q)}\Omega . \]
By Proposition A.1, the vector-valued function $\Omega(z)$ is analytic inside the strip $0 < \mathrm{Re}\, z < \beta/2$ and norm continuous on its closure. Moreover, $\Omega(\beta/2) = \Omega_Q$ and
\begin{align*}
e^{it(L+Q)} P\,\Omega(it + \beta/2) &= e^{it(L+Q)} P e^{-it(L+Q)}\Omega(\beta/2) \\
&= \tau_Q^t(P)\Omega_Q \\
&= P\Omega_Q = 0 .
\end{align*}
Thus, for all real $t$, $P\Omega(it + \beta/2) = 0$. This implies that $P\Omega(z) = 0$ for all $z$ in the strip $0 \le \mathrm{Re}\, z \le \beta/2$. In particular, $P\Omega(0) = P\Omega = 0$. Since $\Omega$ is a separating vector for $\mathfrak{M}$, $P = 0$. Hence $s_{\Omega_Q} = 1$ and $\Omega_Q$ is a separating vector for $\mathfrak{M}$. Since $\Omega_Q$ is separating, (2) and Theorem 2.7 (3) imply that $\Omega_Q$ is also cyclic.
(4) follows from the analytic case of (4) and Theorem 2.14.

(5), (7) and (8) follow from their analytic versions and Theorem 4.2.

(6) Let now $Q_1, Q_2$ be two self-adjoint elements and $Q_{1,n}, Q_{2,n}$ the sequences of the corresponding analytic approximations. Then, by the analytic case of (6),
\[ (\Omega_{Q_{1,m}})_{Q_{2,n}} = \Omega_{Q_{1,m}+Q_{2,n}} . \]
As $n \to \infty$, $(\Omega_{Q_{1,m}})_{Q_{2,n}} \to (\Omega_{Q_{1,m}})_{Q_2}$ weakly and $\Omega_{Q_{1,m}+Q_{2,n}} \to \Omega_{Q_{1,m}+Q_2}$ weakly, and so
\[ (\Omega_{Q_{1,m}})_{Q_2} = \Omega_{Q_{1,m}+Q_2} . \tag{5.5} \]
By the arguments of the proof of (1), as $m \to \infty$, $\Omega_{Q_{1,m}+Q_2} \to \Omega_{Q_1+Q_2}$ weakly. Moreover,
\[ (\Omega_{Q_{1,m}})_{Q_2} = e^{-\beta(L+Q_{1,m}-JQ_{1,m}J+Q_2)/2}\,\Omega_{Q_{1,m}} , \]
$\Omega_{Q_{1,m}} \to \Omega_{Q_1}$ weakly and $L + Q_{1,m} - JQ_{1,m}J + Q_2 \to L + Q_1 - JQ_1J + Q_2$ in the strong resolvent sense. Hence by Proposition A.4, $\Omega_{Q_1} \in D(e^{-\beta(L+Q_1-JQ_1J+Q_2)/2})$ and
\[ (\Omega_{Q_1})_{Q_2} = e^{-\beta(L+Q_1-JQ_1J+Q_2)/2}\,\Omega_{Q_1} = \Omega_{Q_1+Q_2} . \]

(9) and (10) follow from (7) and (8) precisely as in the analytic case.

(11) (The Peierls–Bogoliubov inequality) follows from (9) just as in the analytic case.

(12) $\lim_n e^{-\beta Q_n/2}\Omega = e^{-\beta Q/2}\Omega$ implies
\[ \lim_{n\to\infty} \|e^{-\beta Q_n/2}\Omega\| = \|e^{-\beta Q/2}\Omega\| . \tag{5.6} \]
Moreover, $\operatorname*{w-lim}_n \Omega_{Q_n} = \Omega_Q$ implies
\[ \|\Omega_Q\| \le \liminf_{n\to\infty} \|\Omega_{Q_n}\| . \tag{5.7} \]
By the Golden–Thompson inequality for analytic perturbations,
\[ \|\Omega_{Q_n}\| \le \|e^{-\beta Q_n/2}\Omega\| . \tag{5.8} \]
Now (5.6), (5.7) and (5.8) imply the Golden–Thompson inequality:
\[ \|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\| . \tag{5.9} \]

(13) Let $Q_n \in \mathfrak{M}$ be an arbitrary sequence of self-adjoint elements which converges strongly to $Q$. The proof of (1) yields that $\Omega_{Q_n} \to \Omega_Q$ weakly. Using first the chain rule and then the Golden–Thompson inequality we get
\[ \|\Omega_{Q_n}\| = \|(\Omega_Q)_{Q_n-Q}\| \le \|e^{-\beta(Q_n-Q)/2}\Omega_Q\| . \]
Hence, $\limsup_n \|\Omega_{Q_n}\| \le \|\Omega_Q\|$. Combining this estimate with (5.7) we get $\|\Omega_{Q_n}\| \to \|\Omega_Q\|$, and so $\Omega_{Q_n} \to \Omega_Q$ in norm. By Theorem 2.7, this implies that $\omega_{Q_n} \to \omega_Q$ in norm.
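The last step of (13) uses the elementary Hilbert space fact that weak convergence together with convergence of the norms implies norm convergence. A short verification is sketched below for general vectors $\Psi_n \to \Psi$ weakly with $\|\Psi_n\| \to \|\Psi\|$; the names $\Psi_n$, $\Psi$ are placeholders introduced here for $\Omega_{Q_n}$, $\Omega_Q$.

```latex
% Weak convergence plus convergence of norms implies norm convergence.
\begin{align*}
\|\Psi_n - \Psi\|^2
 &= \|\Psi_n\|^2 - 2\,\mathrm{Re}\,(\Psi|\Psi_n) + \|\Psi\|^2 \\
 &\longrightarrow \|\Psi\|^2 - 2\|\Psi\|^2 + \|\Psi\|^2 = 0
 \qquad (n \to \infty).
\end{align*}
```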
5.4. Perturbative expansion of $\Omega_Q$ and the estimates

In this subsection we prove Theorems 5.2 and 5.3. The proof of Theorem 5.2 is based on the following technical result of Araki.

Theorem 5.4. (1) Set
\[ S_{\beta,n} := \{(z_1, \dots, z_n) : \mathrm{Im}\, z_i \ge 0,\ i = 1, \dots, n,\ \mathrm{Im}\, z_1 + \cdots + \mathrm{Im}\, z_n \le \beta/2\} . \]
Then for $(z_1, \dots, z_n) \in S_{\beta,n}$, $\Omega$ belongs to $D(e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1)$, the function
\[ S_{\beta,n} \ni (z_n, \dots, z_1) \mapsto e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega \tag{5.10} \]
is norm continuous on $S_{\beta,n}$, analytic on its interior, and
\[ \sup_{(z_1,\dots,z_n)\in S_{\beta,n}} \|e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega\| \le \|Q_n\| \cdots \|Q_1\| . \tag{5.11} \]
(2) Let $Q_{i,m} \to Q_i$ strongly and $Q_{i,m}^* \to Q_i^*$ strongly as $m \to \infty$. Then
\[ \lim_{m\to\infty} e^{iz_nL}Q_{n,m} \cdots e^{iz_1L}Q_{1,m}\Omega = e^{iz_nL}Q_n \cdots e^{iz_1L}Q_1\Omega , \tag{5.12} \]
uniformly for $(z_1, \dots, z_n)$ in compact subsets of $S_{\beta,n}$.

Proof. The proof proceeds by induction with respect to $n$. For $n = 1$, the statement follows from Proposition A.1 and the KMS condition (Theorem 2.13). Suppose that the statement is true for $n - 1$. Set
\begin{align*}
\Omega(z_1, \dots, z_{n-1}) &:= Q_n e^{iz_{n-1}L}Q_{n-1} \cdots e^{iz_1L}Q_1\Omega , \\
\Omega^*(z_1, \dots, z_{n-1}) &:= J Q_1^* e^{-i\bar z_1L}Q_2^* \cdots e^{-i\bar z_{n-1}L}Q_n^*\Omega .
\end{align*}
Consider $\Phi \in D(e^{-\beta L/2})$ and the function
\[ F(z_1, \dots, z_{n-1}) := (\Phi|\Omega^*(z_1, \dots, z_{n-1})) . \]
By the induction assumption, the function $F$ is continuous on $S_{\beta,n-1}$, analytic on its interior, and satisfies the estimate
\[ |F(z_1, \dots, z_{n-1})| \le \|\Phi\|\,\|Q_1\| \cdots \|Q_n\| , \tag{5.13} \]
which gives the estimate (5.11) for $z_n = 0$. The function
\[ G(z_1, \dots, z_{n-1}) := \bigl(e^{(i\bar z_1 + \cdots + i\bar z_{n-1} - \beta/2)L}\Phi \,\big|\, \Omega(z_1, \dots, z_{n-1})\bigr) \tag{5.14} \]
is also analytic and continuous on the same domain. (Here we used the induction assumption, the assumption $\Phi \in D(e^{-\beta L/2})$ and Proposition A.1.) For $z_1, \dots, z_{n-1} \in \mathbb{R}$, set $s_2 := z_1$, $s_3 := z_2 + z_1$, ..., $s_n := z_{n-1} + \cdots + z_1$. Then
\begin{align*}
F(z_1, \dots, z_{n-1}) &= (\Phi|J Q_1^*\tau^{-s_2}(Q_2^*) \cdots \tau^{-s_n}(Q_n^*)\Omega) \\
&= (\Phi|e^{-\beta L/2}\tau^{-s_n}(Q_n) \cdots \tau^{-s_2}(Q_2)Q_1\Omega) \\
&= G(z_1, \dots, z_{n-1}) ,
\end{align*}
and by the edge of the wedge theorem, the functions $F$ and $G$ coincide on their whole domains. Thus, by (5.13),
\[ |G(z_1, \dots, z_{n-1})| \le \|\Phi\|\,\|Q_1\| \cdots \|Q_n\| . \tag{5.15} \]
For $z_n = i\beta/2 - z_1 - \cdots - z_{n-1}$ and $(z_1, \dots, z_{n-1}) \in S_{\beta,n-1}$, this implies that
\[ \Omega(z_1, \dots, z_{n-1}) \in D(e^{iz_nL}) , \]
and
\[ \Omega^*(z_1, \dots, z_{n-1}) = e^{iz_nL}\Omega(z_1, \dots, z_{n-1}) . \tag{5.16} \]
(5.15) also gives the estimate (5.11) for $z_n = i\beta/2 - z_1 - \cdots - z_{n-1}$. The estimate (5.11) for $0 \le \mathrm{Im}\, z_n \le \beta/2 - \mathrm{Im}\, z_1 - \cdots - \mathrm{Im}\, z_{n-1}$ follows from (5.13), (5.15) and Proposition A.1.

By Proposition A.1 and Hartogs' theorem of holomorphy, $(e^{i\bar z_nL}\Phi|\Omega(z_1, \dots, z_n))$ is analytic on the interior of $S_{\beta,n}$, for $\Phi \in D(e^{-\beta L/2})$. Using the estimate (5.11) we see that it is analytic for all $\Phi$. Hence we can conclude that the function (5.10) is weakly analytic. Since weak analyticity is equivalent to norm analyticity, we have proven all the statements of (1) except that (5.10) is norm continuous on the whole $S_{\beta,n}$.

Next we turn to the proof of (2) for $n$. Set
\begin{align*}
\Omega_m(z_1, \dots, z_{n-1}) &:= Q_{n,m} e^{iz_{n-1}L}Q_{n-1,m} \cdots e^{iz_1L}Q_{1,m}\Omega , \\
\Omega_m^*(z_1, \dots, z_{n-1}) &:= J Q_{1,m}^* e^{-i\bar z_1L}Q_{2,m}^* \cdots e^{-i\bar z_{n-1}L}Q_{n,m}^*\Omega .
\end{align*}
By the uniform boundedness principle, independently of $m$, we have
\[ \|Q_{i,m}\| \le C , \qquad i = 1, \dots, n . \tag{5.17} \]
Now
\begin{align*}
&\|\Omega_m^*(z_1, \dots, z_{n-1}) - \Omega^*(z_1, \dots, z_{n-1})\| \\
&\quad\le \|Q_{1,m}\|\,\|e^{-i\bar z_1L}Q_{2,m}^* \cdots e^{-i\bar z_{n-1}L}Q_{n,m}^*\Omega - e^{-i\bar z_1L}Q_2^* \cdots e^{-i\bar z_{n-1}L}Q_n^*\Omega\| \\
&\qquad + \|(Q_{1,m}^* - Q_1^*)e^{-i\bar z_1L}Q_2^* \cdots e^{-i\bar z_{n-1}L}Q_n^*\Omega\| .
\end{align*}
The first term on the right goes to zero uniformly on compact subsets of $S_{\beta,n-1}$ by the induction assumption and (5.17) for $i = 1$. The second term on the right goes to zero uniformly on compact subsets of $S_{\beta,n-1}$ by the induction assumption, Lemma A.1 and the strong convergence $Q_{1,m}^* \to Q_1^*$.

By the proof of (1) (see the identity (5.16)), we have for $(z_1, \dots, z_{n-1}) \in S_{\beta,n-1}$,
\[ \Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1}) \in D(e^{(-iz_1 - \cdots - iz_{n-1} - \beta/2)L}) , \]
\[ \Omega^*(z_1, \dots, z_{n-1}) - \Omega_m^*(z_1, \dots, z_{n-1}) = e^{(-iz_1 - \cdots - iz_{n-1} - \beta/2)L}\bigl(\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1})\bigr) . \]
Hence,
\[ \lim_{m\to\infty} \|e^{(-iz_1 - \cdots - iz_{n-1} - \beta/2)L}(\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1}))\| = 0 \]
uniformly on compact subsets of $S_{\beta,n-1}$. By the induction assumption,
\[ \lim_{m\to\infty} \|\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1})\| = 0 \]
uniformly on compact subsets of $S_{\beta,n-1}$. Hence, by Proposition A.1,
\[ \lim_{m\to\infty} \|e^{iz_nL}(\Omega(z_1, \dots, z_{n-1}) - \Omega_m(z_1, \dots, z_{n-1}))\| = 0 \]
uniformly for $0 \le \mathrm{Im}\, z_n \le \beta/2 - \mathrm{Im}\, z_1 - \cdots - \mathrm{Im}\, z_{n-1}$ and $(z_1, \dots, z_{n-1})$ in compact subsets of $S_{\beta,n-1}$. In particular, the convergence is uniform on compact subsets of $S_{\beta,n}$. This ends the proof of (2) for $n$.

It remains to prove the norm continuity part of (1). Let $Q_{i,m} \in \mathfrak{M}_\tau$ be such that $Q_{i,m} \to Q_i$ strongly and $Q_{i,m}^* \to Q_i^*$ strongly as $m \to \infty$. The function
\[ \mathbb{C}^n \ni (z_1, \dots, z_n) \mapsto e^{iz_nL}Q_{n,m} \cdots e^{iz_1L}Q_{1,m}\Omega \]
is entire analytic and, in particular, norm continuous. By the uniform convergence on compact subsets of $S_{\beta,n}$, proven in (2), and the local compactness of $S_{\beta,n}$ we conclude that (5.10) is norm continuous on $S_{\beta,n}$.

Proof of Theorem 5.2. Let $Q_n \in \mathfrak{M}_\tau$ be such that $Q_n \to Q$ strongly. Since $\Omega_{Q_n} = E_{Q_n}(i\beta/2)\Omega$, the expansion (3.4) yields that Theorem 5.2 holds for $Q_n$. Moreover,
\begin{align*}
\Omega_Q &= \operatorname*{w-lim}_{n\to\infty} \Omega_{Q_n} \\
&= \operatorname*{w-lim}_{n\to\infty} \sum_{m=0}^{\infty} (-1)^m \int\cdots\int_{T_{\beta,m}} e^{-\beta_1L}Q_n \cdots e^{-\beta_mL}Q_n\Omega \; d\beta_1 \cdots d\beta_m \\
&= \sum_{m=0}^{\infty} (-1)^m \int\cdots\int_{T_{\beta,m}} e^{-\beta_1L}Q \cdots e^{-\beta_mL}Q\,\Omega \; d\beta_1 \cdots d\beta_m .
\end{align*}
The first identity follows from Theorem 5.1 (recall the proof of (1) or use (13)), the second is obvious, and the third follows from Theorem 5.4.

Proof of Theorem 5.3. Theorem 5.2 yields (1). By Theorem 5.1 (13) it suffices to prove (2)–(5) for $Q \in \mathfrak{M}_\tau$.

(2)–(3). Our proof is motivated by [2]. By Theorem 3.2, $\Omega \in D(e^{-z(L+Q)})$ for all $z$ and $E_Q(iz)\Omega = e^{-z(L+Q)}\Omega$ is an entire vector-valued function. Set
\[ f(z) := (\Omega|e^{-z(L+Q)}\Omega) = (\Omega|E_Q(iz)\Omega) . \]
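Then $f$ is an entire function, $f''(x) \ge 0$ for $x \in \mathbb{R}$, and
\begin{align*}
f(0) &= \|\Omega\|^2 = 1 , \qquad f(\beta/2) = (\Omega|\Omega_Q) , \qquad f(\beta) = \|\Omega_Q\|^2 , \\
f'(0) &= -(\Omega|(L+Q)\Omega) = -(\Omega|Q\Omega) , \\
f'(\beta/2) &= -(\Omega|(L+Q)\Omega_Q) = -(\Omega|JQJ\Omega_Q) = -(\Omega|Q\Omega_Q) , \\
f'(\beta) &= -(\Omega_Q|(L+Q)\Omega_Q) = -(\Omega_Q|JQJ\Omega_Q) = -(\Omega_Q|Q\Omega_Q)
\end{align*}
(we used $L\Omega = 0$ and $(L + Q - JQJ)\Omega_Q = 0$). These relations combined with the mean-value theorem yield (2)–(3).

(4) follows easily from (2).

To prove (5), consider the function
\[ F(z) := \tau_Q^z(Q)E_Q(z)\Omega . \]
Since $\tau_Q^z(Q)$ and $E_Q(z)$ are uniformly bounded on the strip $0 \le \mathrm{Im}\, z \le \beta/2$, $F(z)$ is also bounded on this strip. Moreover,
\[ \|F(z)\| \le \begin{cases} \|Q\Omega\| & \text{if } \mathrm{Im}\, z = 0 ; \\ \|\tau_Q^{i\beta/2}(Q)\Omega_Q\| & \text{if } \mathrm{Im}\, z = \beta/2 . \end{cases} \]
Since $\tau_Q^{i\beta/2}(Q)\Omega_Q = e^{-\beta L_Q/2}Q\Omega_Q = JQ\Omega_Q$,
\[ \|F(z)\| \le \|Q\Omega_Q\| \quad \text{if } \mathrm{Im}\, z = \beta/2 . \]
Hence, by the three-line theorem, for $0 \le t \le \beta/2$,
\[ \|F(it)\| \le \|Q\Omega\|^{1-2t/\beta}\,\|Q\Omega_Q\|^{2t/\beta} . \]
Since
\[ \Omega_Q - \Omega = -\int_0^{\beta/2} \tau_Q^{it}(Q)E_Q(it)\Omega \, dt , \]
we derive
\begin{align*}
\|\Omega_Q - \Omega\| &\le \int_0^{\beta/2} \|F(it)\| \, dt \\
&\le \frac{\beta}{2}\int_0^1 \|Q\Omega\|^{1-s}\,\|Q\Omega_Q\|^{s} \, ds = \beta f(\|Q\Omega\|, \|Q\Omega_Q\|)/2 .
\end{align*}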
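The final equality in the proof of (5) is an elementary integral identifying $f$ as a logarithmic mean. As a check, write $a := \|Q\Omega\|$ and $b := \|Q\Omega_Q\|$ (shorthand introduced here only for this computation) and assume $a, b > 0$ with $a \ne b$:

```latex
% The s-integral in the proof of Theorem 5.3 (5) equals the logarithmic mean f(a,b).
\begin{align*}
\int_0^1 a^{1-s} b^{s}\, ds
 = a \int_0^1 e^{s(\log b - \log a)}\, ds
 = a\, \frac{b/a - 1}{\log b - \log a}
 = \frac{b - a}{\log b - \log a}
 = f(a,b).
\end{align*}
```

For $a = b$ the integrand is constant and the integral equals $a = f(a,a)$, in agreement with the definition of $f$ in Theorem 5.3.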
5.5. Unbounded perturbations

This subsection contains our main results. It extends the construction of KMS states to a large class of unbounded perturbations. Let $Q$ be a self-adjoint operator affiliated to $\mathfrak{M}$ satisfying Assumptions 3.1 and 3.2. Let $\tau_Q$ be the dynamics defined as in Subsec. 3.3. Recall that by Theorem 3.5 its Liouvillean equals
\[ L_Q = L + Q - JQJ . \]
In order to construct the perturbed KMS state we will need an additional assumption:

Assumption 5.1. $\|e^{-\beta Q/2}\Omega\| < \infty$.

Theorem 5.5. Assume that Assumptions 3.1, 3.2 and 5.1 hold. Then
(1) $\Omega \in D(e^{-\beta(L+Q)/2})$. Set
\[ \Omega_Q := e^{-\beta(L+Q)/2}\Omega , \qquad \omega_Q(A) := (\Omega_Q|A\Omega_Q)/\|\Omega_Q\|^2 . \]
(2) $\Omega_Q \in \mathcal{H}^+$.
(3) $\Omega_Q$ is cyclic and separating.
(4) $\omega_Q$ is a $(\tau_Q, \beta)$-KMS state.
(5) $\log\Delta_{\Omega_Q} = -\beta L_Q$.
(6) $\log\Delta_{\Omega_Q,\Omega} = -\beta L - \beta Q$.
(7) $\mathrm{Ent}(\omega|\omega_Q) = -\beta\omega(Q) - \log\|\Omega_Q\|^2$.
(8) The Peierls–Bogoliubov inequality holds: $e^{-\beta(\Omega|Q\Omega)/2} \le \|\Omega_Q\|$.
(9) The Golden–Thompson inequality holds: $\|\Omega_Q\| \le \|e^{-\beta Q/2}\Omega\|$.
(10) For any $0 \le \lambda \le 1$, $\lambda Q$ satisfies the assumptions of the theorem, hence $\Omega_{\lambda Q}$ is well defined. Moreover, $\lim_{\lambda\downarrow 0}\|\Omega_{\lambda Q} - \Omega\| = 0$.

Remark. The formula for the relative entropy in (7) requires a comment. Because of Assumption 5.1, $\omega(Q_-)$ is finite, where $Q_- = 1_{]-\infty,0]}(Q)Q$. Therefore, $\omega(Q)$ is a finite number or $+\infty$.

Set
\[ Q_n := 1_{[-n,n]}(Q)Q , \]
where $1_{[-n,n]}(Q)$ is the spectral projection of $Q$ on the interval $[-n, n]$.

Theorem 5.6.
(1) $L + Q_n \to L + Q$ in the strong resolvent sense.
(2) $L_{Q_n} \to L_Q$ in the strong resolvent sense.

Proof. We prove only (2) (the proof of (1) is similar). Let $D_0 = D(L) \cap D(Q) \cap D(JQJ)$. By Assumption 3.2, $L_Q$ is essentially self-adjoint on $D_0$. Moreover, $L_{Q_n}\Psi \to L_Q\Psi$ for $\Psi \in D_0$. Hence the statement follows from Proposition A.3.

Proof of Theorem 5.5. Given the approximating sequence $Q_n$ defined above and Theorem 5.6, the parts (1)–(9) follow from Theorem 5.1 in the same way as the analogous parts of Theorem 5.1 followed from the analytic case of Theorem 5.1.
The only part requiring a separate argument is (10). To prove it, note that $L + \lambda Q \to L$ in the strong resolvent sense as $\lambda \downarrow 0$. This implies that $\Omega_{\lambda Q} \to \Omega$ weakly as $\lambda \downarrow 0$ and
\[ \|\Omega\| \le \liminf_{\lambda\downarrow 0} \|\Omega_{\lambda Q}\| \le \limsup_{\lambda\downarrow 0} \|\Omega_{\lambda Q}\| \le \lim_{\lambda\downarrow 0} \|e^{-\beta\lambda Q/2}\Omega\| = \|\Omega\| . \]
Hence, $\|\Omega_{\lambda Q}\| \to \|\Omega\|$ as $\lambda \downarrow 0$, and this implies that $\Omega_{\lambda Q} \to \Omega$ in norm as $\lambda \downarrow 0$.

5.6. Perturbations of Liouvilleans revisited

In Theorem 3.5 we have shown that $L_Q$ is the Liouvillean of $\tau_Q$ by invoking Theorem 2.9 and checking that
\[ \tau_Q^t(A) = e^{itL_Q} A e^{-itL_Q} , \qquad e^{itL_Q}\mathcal{H}^+ \subset \mathcal{H}^+ . \tag{5.18} \]
Under the conditions of Theorem 5.5 (recall Proposition 2.4), the second relation in (5.18) is equivalent to
\[ L_Q\Omega_Q = 0 . \tag{5.19} \]
In this section we give an elementary direct proof of (5.19). This verifies that $L_Q$ is the Liouvillean of $\tau_Q$ without resort to Theorem 2.9. We consider only the case of analytic perturbations $Q \in \mathfrak{M}_\tau$. The extension to bounded $Q$ and to unbounded $Q$ satisfying the conditions of Theorem 5.5 is immediate using the strong resolvent convergence of Liouvilleans and the weak convergence of $\beta$-KMS vectors.

First, the relation
\[ e^{it(L+Q)}\Omega_Q = E_Q(t + i\beta/2)\Omega \]
and analytic continuation yield that $\Omega_Q \in D(\exp(iz(L+Q)))$ for all $z$, and so $\Omega_Q \in D(L+Q) = D(L_Q)$. Since $e^{itL}\mathfrak{M}'e^{-itL} = \mathfrak{M}'$, $JQJ \in \mathfrak{M}'$, and $e^{itL}J = Je^{itL}$, the Trotter product formula yields
\[ e^{it(L+Q)}JQJe^{-it(L+Q)} = e^{itL}JQJe^{-itL} = Je^{itL}Qe^{-itL}J . \]
By analytic continuation, the relation
\[ (e^{\beta(L+Q)/2}\Phi|JQJe^{-\beta(L+Q)/2}\Omega) = (\Phi|J\tau^{i\beta/2}(Q)J\Omega) \]
holds for all $\Phi$ in a dense domain $\tilde D = \bigcup_{r>0} \operatorname{Ran} 1_{[-r,r]}(L+Q)$. Using
\[ J\tau^{i\beta/2}(Q)J\Omega = J\Delta^{1/2}Q\Omega = Q\Omega , \]
we derive
\[ (e^{\beta(L+Q)/2}\Phi|JQJ\Omega_Q) = (\Phi|Q\Omega) . \]
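This relation yields
\begin{align*}
(e^{\beta(L+Q)/2}\Phi|(L + Q - JQJ)\Omega_Q) &= (e^{\beta(L+Q)/2}\Phi|(L+Q)\Omega_Q) - (\Phi|Q\Omega) \\
&= (\Phi|(L+Q)\Omega) - (\Phi|Q\Omega) \\
&= (\Phi|L\Omega) = 0 .
\end{align*}
Since $e^{\beta(L+Q)/2}\tilde D = \tilde D$ is dense in $\mathcal{H}$, $L_Q\Omega_Q = 0$.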
Appendix A. Technical Facts

In this appendix we collect some technical facts which have been used throughout the paper.

A.1. Operators and resolvent convergence

First, we recall the Trotter product formula (see [30], Theorem VIII.31).

Theorem A.1. If $A$ and $B$ are self-adjoint operators and $A + B$ is essentially self-adjoint on $D(A) \cap D(B)$, then
\[ \operatorname*{s-lim}_{n\to\infty} (e^{itA/n}e^{itB/n})^n = e^{it(A+B)} . \]

The next proposition follows easily from the spectral theorem and the three-line theorem (see also Lemma 4 in [5]).

Proposition A.1. Let $H$ be a self-adjoint operator and $\Omega \in D(e^{\delta H})$ for some $\delta > 0$. Then the vector-valued function $e^{zH}\Omega$ is analytic inside the strip $0 < \mathrm{Re}\, z < \delta$, norm continuous on its closure and
\[ \|e^{zH}\Omega\| \le \|e^{\delta H}\Omega\|^{\mathrm{Re}\, z/\delta}\,\|\Omega\|^{1-\mathrm{Re}\, z/\delta} . \]

Lemma A.1. Let $Z$ be a compact metric space and $Z \ni z \mapsto \Omega(z) \in \mathcal{H}$ a norm continuous function. Let $A_n$ be bounded operators and assume that $A_n \to A$ strongly. Then
\[ \lim_{n\to\infty} \|(A_n - A)\Omega(z)\| = 0 \]
uniformly on $Z$.

Proof. Note first that $\{\Omega(z) : z \in Z\}$ is a compact subset of $\mathcal{H}$ and that by the uniform boundedness principle $C := \sup_n \|A_n\| < \infty$. Let $\varepsilon > 0$ be given. Then there exists a finite dimensional projection $P$ such that $\sup_{z\in Z}\|(1 - P)\Omega(z)\| < \varepsilon$. Since
\[ \|(A_n - A)\Omega(z)\| \le \|(A_n - A)P\| \sup_{z\in Z}\|\Omega(z)\| + \sup_n \|A_n - A\| \sup_{z\in Z}\|(1 - P)\Omega(z)\| , \]
we derive $\limsup_n \sup_{z\in Z}\|(A_n - A)\Omega(z)\| < 2C\varepsilon$.
The following properties of the strong convergence of functions of self-adjoint operators are proven e.g. in [30]:

Proposition A.2. Suppose that $H_n, H$ are self-adjoint operators. Then the following conditions are equivalent:
(1) Let $z_0 \notin \bigl(\bigcup_{n=1}^{\infty}\sigma(H_n)\bigr)^{\mathrm{cl}}$ (for instance, $\mathrm{Im}\, z_0 \ne 0$). Then
\[ \operatorname*{s-lim}_{n\to\infty} (z_0 - H_n)^{-1} = (z_0 - H)^{-1} . \]
(2) If $f$ is a bounded continuous function on $\bigl(\bigcup_{n=1}^{\infty}\sigma(H_n)\bigr)^{\mathrm{cl}}$, then $f(H_n) \to f(H)$ strongly.

Note that (1) in the above proposition holds for any choice of $z_0$ if it holds for one choice of $z_0$. If the conditions of the above proposition are satisfied we say that $H_n \to H$ in the strong resolvent sense.

Proposition A.3. Suppose that $H_n, H$ are self-adjoint operators, $H$ is essentially self-adjoint on $D$ and $\lim_n H_n\Psi = H\Psi$ for $\Psi \in D$. Then $H_n \to H$ in the strong resolvent sense.

Proof. Let $\mathrm{Im}\, z \ne 0$. Then $(z - H)D =: D_1$ is dense in $\mathcal{H}$. For $\Psi \in D_1$,
\[ (z - H)^{-1}\Psi - (z - H_n)^{-1}\Psi = (z - H_n)^{-1}(H - H_n)(z - H)^{-1}\Psi \to 0 . \]

The following proposition plays an important role in several arguments in our paper.

Proposition A.4. Suppose that $H_n, H$ are self-adjoint operators and $H_n \to H$ in the strong resolvent sense. Suppose that $\Omega_n, \Omega \in \mathcal{H}$ are such that $\Omega_n \to \Omega$ weakly and $\|H_n\Omega_n\| \le C$. Then $\Omega \in D(H)$, $\operatorname*{w-lim}_n H_n\Omega_n$ exists and $H\Omega = \operatorname*{w-lim}_n H_n\Omega_n$.

Remark. By the uniform boundedness principle, the condition $\|H_n\Omega_n\| \le C$ can be replaced by the existence of $\operatorname*{w-lim}_{n\to\infty} H_n\Omega_n$.

Proof. Since the ball of radius $C$ in a Hilbert space is weakly sequentially compact, one can find a weakly convergent subsequence $H_{n_k}\Omega_{n_k}$. Set $\Psi = \operatorname*{w-lim}_{k\to\infty} H_{n_k}\Omega_{n_k}$. Let $D := \bigcup_{r>0}\operatorname{Ran} 1_{[-r,r]}(H)$. Let $\Phi \in D$ and $f \in C_0^{\infty}(\mathbb{R})$ be such that $f(H)\Phi = \Phi$. Then
\begin{align*}
\Phi &= f(H)\Phi = \lim_{n\to\infty} f(H_n)\Phi , \\
H\Phi &= f(H)H\Phi = \lim_{n\to\infty} f(H_n)H_n\Phi ,
\end{align*}
and
\[ (H\Phi|\Omega) = \lim_{k\to\infty} (H_{n_k} f(H_{n_k})\Phi | \Omega_{n_k}) = \lim_{k\to\infty} (f(H_{n_k})\Phi | H_{n_k}\Omega_{n_k}) = (\Phi|\Psi) . \tag{A.1} \]
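Since $\mathcal{D}$ is a core for $H$, $\Omega \in D(H)$ and $H\Omega = \Psi$. Now assume that $\text{w-}\lim_{n\to\infty} H_n\Omega_n$ does not exist. Then there exist $\Phi \in \mathcal{H}$ and a subsequence $H_{n_k}\Omega_{n_k}$ such that
\[ |(\Phi|H_{n_k}\Omega_{n_k}) - (\Phi|H\Omega)| \ge \epsilon > 0 . \tag{A.2} \]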
Using again the weak sequential compactness of the ball of radius $C$ and passing to a subsubsequence we may assume that $\text{w-}\lim_{k\to\infty} H_{n_k}\Omega_{n_k}$ exists. Repeating the arguments of (A.1), we see that $\text{w-}\lim_{k\to\infty} H_{n_k}\Omega_{n_k} = H\Omega$. This contradicts (A.2).

A.2. An interpolation theorem

Various versions of the following interpolation theorem for linear operators can be found throughout the literature, see e.g. [4] (where a different proof is outlined) and [31].

Theorem A.2. Let $\mathcal{H}_1$, $\mathcal{H}_2$ be Hilbert spaces and let $H_i$ be a positive (possibly unbounded) operator on $\mathcal{H}_i$. Let $\mathcal{D}_1$ be a core of $H_1$. Let $T \in B(\mathcal{H}_1, \mathcal{H}_2)$ with $\|T\| = c_0$ be such that:
(a) $T\mathcal{D}_1 \subset D(H_2)$.
(b) For $\Psi \in \mathcal{D}_1$, $\|H_2 T\Psi\| \le c_1 \|H_1\Psi\|$.
Then, for any $0 \le \lambda \le 1$, $T D(H_1^\lambda) \subset D(H_2^\lambda)$ and for $\Psi \in D(H_1^\lambda)$,
\[ \|H_2^\lambda T\Psi\| \le c_0^{1-\lambda} c_1^{\lambda} \|H_1^\lambda \Psi\| . \tag{A.3} \]
Proof. Clearly, we may assume that $c_0 = c_1 = 1$. First let us show that $T D(H_1) \subset D(H_2)$ and
\[ \|H_2 T\Psi\| \le \|H_1\Psi\| , \qquad \Psi \in D(H_1) . \tag{A.4} \]
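Let $\Psi \in D(H_1)$. Then there exist $\Psi_n \in \mathcal{D}_1$ such that $\Psi_n \to \Psi$ and $H_1\Psi_n \to H_1\Psi$. Now
\[ \|H_2(T\Psi_n - T\Psi_m)\| \le \|H_1(\Psi_n - \Psi_m)\| . \]
Thus $H_2 T\Psi_n$ is Cauchy, hence convergent. $T\Psi_n$ is obviously convergent. $H_2$ is closed. Hence $T\Psi \in D(H_2)$. (A.4) follows by passing to the limit in $\|H_2 T\Psi_n\| \le \|H_1\Psi_n\|$.

Let $\Phi \in D(H_2)$, $\Omega \in \mathcal{H}_1$ and $\epsilon > 0$. For $0 \le \mathrm{Re}\, z \le 1$ set
\[ F(z) := (H_2^{\bar z}\Phi | T(H_1 + \epsilon)^{-z}\Omega) . \]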
$F(z)$ is a continuous function in the strip $0 \le \mathrm{Re}\, z \le 1$, analytic in its interior, and
\[ |F(z)| \le \|(H_2 + 1)\Phi\|\, \epsilon^{-1} \|\Omega\| . \]
For $\mathrm{Re}\, z = 0$,
\[ |F(z)| \le \|\Phi\| \|\Omega\| . \]
For $\mathrm{Re}\, z = 1$, $(H_1 + \epsilon)^{-z}\Omega \in D(H_1)$, and
\[ |F(z)| \le \|\Phi\| \|H_2 T(H_1 + \epsilon)^{-z}\Omega\| \le \|\Phi\| \|H_1(H_1 + \epsilon)^{-z}\Omega\| \le \|\Phi\| \|\Omega\| . \]
These estimates and the three-line theorem yield that for $0 \le \lambda \le 1$,
\[ |F(\lambda)| \le \|\Phi\| \|\Omega\| . \]
Therefore, for $\Omega \in \mathcal{H}_1$,
\[ \|H_2^\lambda T(H_1 + \epsilon)^{-\lambda}\Omega\| \le \|\Omega\| , \]
and for $\Psi \in D(H_1^\lambda)$,
\[ \|H_2^\lambda T\Psi\| = \lim_{\epsilon\downarrow 0} \|H_2^\lambda T(H_1 + \epsilon)^{-\lambda}(H_1 + \epsilon)^{\lambda}\Psi\| \le \lim_{\epsilon\downarrow 0} \|(H_1 + \epsilon)^{\lambda}\Psi\| = \|H_1^\lambda\Psi\| . \]

Appendix B. Pauli–Fierz Systems

B.1. Introduction

A large part of the motivation for the formalism and the results of our paper comes from quantum statistical physics. A detailed description of their application to Pauli–Fierz systems (a certain class of physically motivated $W^*$-dynamical systems) can be found in [1]. In this appendix we briefly describe these applications.

Pauli–Fierz systems describe a small quantum system (an atom or a molecule) interacting with a large bosonic reservoir. They arise as an approximation to nonrelativistic QED (see e.g. [11, 32]), and they have been widely used in the physics literature as a basic paradigm of an open quantum system [33, 34].

We are interested in the case where the radiation density of the bosonic reservoir is not zero (in particular, the reservoir is not at zero temperature). For example, the radiation density can be given by the Planck law at the inverse temperature $\beta < \infty$, see (B.3) below. This corresponds to the case of bosons in thermal equilibrium. We are also interested in situations outside thermal equilibrium. For example, the reservoir may consist of several subreservoirs at distinct temperatures. $W^*$-dynamical systems provide a natural framework to describe Pauli–Fierz systems with nonzero radiation density, as will be sketched below.
B.2. Bose gas at density ρ. Araki–Woods algebras
If $\mathcal{Z}$ is a Hilbert space, then we will write $\Gamma_s(\mathcal{Z})$ for the bosonic Fock space over the 1-particle space $\mathcal{Z}$. $\Omega$ will denote the vacuum vector. For definiteness, we will consider the Bose gas with the 1-particle space $L^2(\mathbb{R}^d)$. Assume that $\mathbb{R}^d \ni \xi \mapsto \rho(\xi)$ is a nonnegative real measurable function describing the density of bosons with the momentum $\xi \in \mathbb{R}^d$. To describe the Bose gas at density $\rho$ one uses a special von Neumann algebra first described by Araki and Woods in [35]. It can be defined by its representation in the Hilbert space
\[ \mathcal{H}^{AW} := \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)) . \]
We will write $a_l(\xi)$, $a_l^*(\xi)$, $a_r(\xi)$, $a_r^*(\xi)$ for the creation and annihilation operators corresponding to the left and right $L^2(\mathbb{R}^d)$ respectively. We define the left/right Araki–Woods creation and annihilation operators
\[ a^*_{\rho,l}(\xi) := \sqrt{1 + \rho(\xi)}\, a_l^*(\xi) + \sqrt{\rho(\xi)}\, a_r(\xi) , \]
\[ a_{\rho,l}(\xi) := \sqrt{1 + \rho(\xi)}\, a_l(\xi) + \sqrt{\rho(\xi)}\, a_r^*(\xi) , \]
\[ a^*_{\rho,r}(\xi) := \sqrt{\rho(\xi)}\, a_l(\xi) + \sqrt{1 + \rho(\xi)}\, a_r^*(\xi) , \]
\[ a_{\rho,r}(\xi) := \sqrt{\rho(\xi)}\, a_l^*(\xi) + \sqrt{1 + \rho(\xi)}\, a_r(\xi) . \]
The left Araki–Woods algebra is denoted by $\mathfrak{M}^{AW}_{\rho,l}$ and defined as the $W^*$-algebra generated by the operators
\[ \exp\left( i \int \big( f(\xi)\, a^*_{\rho,l}(\xi) + \bar f(\xi)\, a_{\rho,l}(\xi) \big)\, d\xi \right) . \]
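Let $J^{AW} := \Gamma(\epsilon)$, where $\epsilon$ is an antilinear involution on $L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)$ given by
\[ \epsilon(f_1, \bar f_2) := (f_2, \bar f_1) , \]
and $\Gamma$ is the second quantization functor, and let $\mathcal{H}^{AW,+}_{\rho}$ be the closure of the cone in $\mathcal{H}^{AW}$ generated by
\[ A J^{AW} A\, \Omega , \qquad A \in \mathfrak{M}^{AW}_{\rho,l} . \]
Then $(\mathfrak{M}^{AW}_{\rho,l}, \mathcal{H}^{AW}, J^{AW}, \mathcal{H}^{AW,+}_{\rho})$ is a von Neumann algebra in a standard form. It describes the Bose gas at density $\rho$.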
B.3. Araki–Woods algebra coupled to a type I factor

We denote by $\mathcal{K}$ the Hilbert space of the small quantum system. For simplicity, we assume that $\dim \mathcal{K} < \infty$. We would like to describe the $W^*$-algebra of the joint system consisting of the small system with the algebra of observables $B(\mathcal{K})$ and the Bose gas at density $\rho$. One way to define this algebra is to identify it with the von Neumann algebra
\[ \mathfrak{M}_\rho := B(\mathcal{K}) \otimes \mathfrak{M}^{AW}_{\rho,l} \]
acting on the Hilbert space $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$. The identity representation of this algebra on $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$ will be called the semi-standard representation of $\mathfrak{M}_\rho$.

It is easy to describe the standard representation of $\mathfrak{M}_\rho$. Let $\bar{\mathcal{K}}$ be the Hilbert space complex conjugate to $\mathcal{K}$ (see e.g. Sec. 4.6 in [1]). The standard representation acts on the space
\[ \mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d)) \tag{B.1} \]
and is given by
\[ \pi(A \otimes B) := A \otimes 1_{\bar{\mathcal{K}}} \otimes B \]
for $A \in B(\mathcal{K})$, $B \in \mathfrak{M}^{AW}_{\rho,l}$. The modular conjugation is given by
\[ J\, \Psi_1 \otimes \bar\Psi_2 \otimes \Phi := \Psi_2 \otimes \bar\Psi_1 \otimes J^{AW}\Phi . \]
Note that it is useful to consider the two representations of $\mathfrak{M}_\rho$, the semi-standard and the standard one, in parallel. The semi-standard representation is simpler whereas the standard representation has special mathematical properties.

B.4. Pauli–Fierz $W^*$-dynamical systems

Suppose that $K$ is a self-adjoint operator on $\mathcal{K}$ describing the Hamiltonian of the small system. Let $|\xi|$ be the energy of the boson of momentum $\xi$. Let $\mathbb{R}^d \ni \xi \mapsto v(\xi) \in B(\mathcal{K})$ describe the coupling of the small system to the Bose gas. We assume that the Bose gas is at the density $\rho$. Let $\lambda \in \mathbb{R}$. We introduce the following operators on $\mathcal{K} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$:
\[ L^{\mathrm{semi}}_{\mathrm{fr}} := K \otimes 1 + 1 \otimes \int |\xi| \big( a_l^*(\xi) a_l(\xi) - a_r^*(\xi) a_r(\xi) \big)\, d\xi , \]
\[ Q^{\mathrm{semi}}_{\rho} := \int \big( v(\xi) \otimes a^*_{\rho,l}(\xi) + v^*(\xi) \otimes a_{\rho,l}(\xi) \big)\, d\xi . \]
The operator $L^{\mathrm{semi}}_{\mathrm{fr}}$ will be called the free semi-Liouvillean. The full semi-Liouvillean for the density $\rho$ equals
\[ L^{\mathrm{semi}}_{\rho} := L^{\mathrm{semi}}_{\mathrm{fr}} + \lambda Q^{\mathrm{semi}}_{\rho} . \tag{B.2} \]
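For $A \in \mathfrak{M}_\rho$ we set
\[ \tau^t_{\mathrm{fr}}(A) := e^{itL^{\mathrm{semi}}_{\mathrm{fr}}} A\, e^{-itL^{\mathrm{semi}}_{\mathrm{fr}}} , \]
\[ \tau^t_{\rho}(A) := e^{itL^{\mathrm{semi}}_{\rho}} A\, e^{-itL^{\mathrm{semi}}_{\rho}} . \]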
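We also introduce the following operators on $\mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$:
\[ L_{\mathrm{fr}} = K \otimes 1 \otimes 1 - 1 \otimes \bar K \otimes 1 + 1 \otimes 1 \otimes \int |\xi| \big( a_l^*(\xi) a_l(\xi) - a_r^*(\xi) a_r(\xi) \big)\, d\xi , \]
\[ Q_\rho = \int \big( v(\xi) \otimes 1 \otimes a^*_{\rho,l}(\xi) + v^*(\xi) \otimes 1 \otimes a_{\rho,l}(\xi) \big)\, d\xi . \]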
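It is easy to see that
\[ J Q_\rho J = \int \big( 1 \otimes \bar v(\xi) \otimes a^*_{\rho,r}(\xi) + 1 \otimes \bar v^*(\xi) \otimes a_{\rho,r}(\xi) \big)\, d\xi . \]
Set
\[ L_\rho := L_{\mathrm{fr}} + \lambda Q_\rho - \lambda J Q_\rho J . \]
We denote by $l^2(\mathcal{K})$ the vector space $B(\mathcal{K})$ equipped with the inner product $(A|B) = \mathrm{Tr}(A^*B)$ (recall that $\dim \mathcal{K} < \infty$). The following theorem describes the case of the free dynamics.

Theorem B.1. (1) For any $t$, $\tau^t_{\mathrm{fr}}$ preserves the algebra $\mathfrak{M}_\rho$ and $(\mathfrak{M}_\rho, \tau_{\mathrm{fr}})$ is a $W^*$-dynamical system.
(2) $L_{\mathrm{fr}}$ is the Liouvillean for the dynamics $\tau_{\mathrm{fr}}$.
(3) Let $\beta > 0$,
\[ \rho(\xi) = (e^{\beta|\xi|} - 1)^{-1} , \tag{B.3} \]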
and $\Psi_{\mathrm{fr}} := e^{-\beta K/2} \otimes \Omega$. Using the natural identification of $l^2(\mathcal{K})$ with $\mathcal{K} \otimes \bar{\mathcal{K}}$, $\Psi_{\mathrm{fr}}$ can be understood as an element of the Hilbert space $\mathcal{K} \otimes \bar{\mathcal{K}} \otimes \Gamma_s(L^2(\mathbb{R}^d) \oplus L^2(\mathbb{R}^d))$. Then $\Psi_{\mathrm{fr}}$ is a $\beta$-KMS vector for $\tau_{\mathrm{fr}}$.

The results of our paper are the main technical input in the proof of the following theorem, which describes the interacting dynamics:

Theorem B.2. (1) Assume that
\[ \int (1 + |\xi|^2)(1 + \rho(\xi)) \|v(\xi)\|^2\, d\xi < \infty . \tag{B.4} \]
Then for any $t$, $\tau^t_\rho$ preserves the algebra $\mathfrak{M}_\rho$ and $(\mathfrak{M}_\rho, \tau_\rho)$ is a $W^*$-dynamical system.
(2) $L_\rho$ is the Liouvillean for the dynamics $\tau_\rho$.
(3) Assume that (B.3) holds and that
\[ \int (|\xi|^{-1} + |\xi|^2) \|v(\xi)\|^2\, d\xi < \infty . \]
Then (B.4) holds, and there exists a $\beta$-KMS vector for $\tau_\rho$.

The $W^*$-dynamical system $(\mathfrak{M}_\rho, \tau_\rho)$ is called the Pauli–Fierz system at density $\rho$. It is canonically defined given $\mathcal{K}$, $K$, $v$ and $\rho$.

The proof of Theorem B.2 is given in [1]. To prove (1) we check that $Q^{\mathrm{semi}}_\rho$ is affiliated to $\mathfrak{M}_\rho$ and that $L^{\mathrm{semi}}_{\mathrm{fr}} + \lambda Q^{\mathrm{semi}}_\rho$ is essentially self-adjoint on $D(L^{\mathrm{semi}}_{\mathrm{fr}}) \cap D(Q^{\mathrm{semi}}_\rho)$. Then we apply Theorem 3.3. To prove (2), in a similar way we apply Theorem 3.5. Finally, to show (3) we use Theorem 5.5. The details can be found in [1].

We finish with several remarks. The perturbation $Q^{\mathrm{semi}}_\rho$ is unbounded from above and below, and the existing results in the literature [2, 3, 14] are not applicable to Pauli–Fierz systems.
The first result about existence of KMS-states for Pauli–Fierz systems goes back to [36] where the spin-boson system was considered. A result similar to Theorem B.2 was proven in [7] under a more restrictive infrared condition. Theorem B.2 covers the physical infrared regime of nonrelativistic QED (often called the ohmic case in the context of Pauli–Fierz systems, see e.g. [1, 11, 33, 34]).

Acknowledgments

The research of the first author was partly supported by the Postdoctoral Training Program HPRN-CT-2002-00277 and by the grant SPUB127 financed by Komitet Badań Naukowych. A part of this work was done during a visit of the first author at the Aarhus University supported by MaPhySto funded by the Danish National Research Foundation and during a visit to University of Montreal. The research of the second author was partly supported by NSERC. We wish to thank the referees for the careful reading of the manuscript and for numerous helpful remarks.

References

[1] J. Dereziński and V. Jakšić, Return to equilibrium for Pauli–Fierz systems, preprint, Maphysto, Aarhus University, 2001, to appear in Ann. H. Poincaré.
[2] S. Sakai, Perturbations of KMS states in C*-dynamical systems (Generalization of the absence theorem of phase transition to continuous quantum systems), Contemp. Math. 62 (1987), 187.
[3] M. J. Donald, Relative Hamiltonians which are not bounded from above, J. Func. Anal. 91 (1990), 143.
[4] M. Ohya and D. Petz, Quantum Entropy and its Use, Springer-Verlag, Berlin, 1993.
[5] H. Araki, Golden–Thompson and Peierls–Bogoliubov inequalities for a general von Neumann algebra, Comm. Math. Phys. 34 (1973), 167.
[6] A. Uhlmann, Relative entropy and the Wigner–Yanase–Dyson–Lieb concavity in an interpolation theory, Comm. Math. Phys. 54 (1977), 123.
[7] V. Bach, J. Fröhlich and I. Sigal, Return to equilibrium, J. Math. Phys. 41 (2000), 3985.
[8] V. Jakšić and C.-A. Pillet, On a model for quantum friction III. Ergodic properties of the spin-boson system, Comm. Math. Phys. 178 (1996), 627.
[9] V. Jakšić and C.-A. Pillet, Nonequilibrium steady states for finite quantum systems coupled to thermal reservoirs, to appear in Comm. Math. Phys.
[10] M. Merkli, Positive commutators in nonequilibrium quantum statistical mechanics, preprint.
[11] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Func. Anal. 180 (2001), 243.
[12] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 1, 2nd edition, Springer-Verlag, Berlin, 1987.
[13] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, 2nd edition, Springer-Verlag, Berlin, 1996.
[14] H. Araki, Relative Hamiltonian for a faithful normal states of a von Neumann algebra, Pub. R.I.M.S. Kyoto Univ. 9 (1973), 165.
[15] A. Klein and L. Landau, Stochastic processes associated with KMS states, J. Func. Anal. 42 (1981), 368.
[16] S. Sakai, Operator Algebras in Dynamical Systems. The Theory of Unbounded Derivations in C*-algebras. Encyclopedia of Mathematics and its Applications 41, Cambridge University Press, Cambridge, 1991.
[17] B. Simon, Statistical Mechanics of Lattice Gases, Volume 1, Princeton University Press, Princeton, 1991.
[18] S. Sakai, C*-algebras and W*-algebras, Springer-Verlag, Berlin, 1971.
[19] S. Stratila and L. Zsido, Lectures on von Neumann algebras, Abacus Press, Tunbridge Wells, 1979.
[20] S. Stratila, Modular theory in operator algebras, Abacus Press, Tunbridge Wells, 1981.
[21] S. Baaj and P. Julg, Théorie bivariante de Kasparov et opérateurs non bornés dans les C*-modules hilbertiens, C. R. Acad. Sci. Paris, Série I 296 (1983), 875.
[22] S. L. Woronowicz, Unbounded elements affiliated with C*-algebras and non-compact quantum groups, Comm. Math. Phys. 136 (1991), 399.
[23] U. Haagerup, The standard form of von Neumann algebras, Math. Scand. 37 (1975), 271.
[24] H. Araki, Positive cone, Radon–Nikodym theorems, relative Hamiltonian and the Gibbs condition in statistical mechanics. An application of the Tomita–Takesaki theory, in C*-algebras and their Applications to Statistical Mechanics and Quantum Field Theory, ed. D. Kastler, Amsterdam, North-Holland, 1976.
[25] A. Connes, Une classification des facteurs de type III, Ann. Sci. Ecole Norm. Sup. 6 (1973), 133.
[26] H. Araki, Expansionals in Banach algebras, Ann. Sci. Ecole Norm. Sup. 6 (1973), 67.
[27] H. Araki, Relative entropy of states of von Neumann algebras, Pub. R.I.M.S. Kyoto Univ. 11 (1976), 809.
[28] H. Araki, Relative entropy of states of von Neumann algebras II, Pub. R.I.M.S. Kyoto Univ. 13 (1977), 173.
[29] W. Pusz and S. L. Woronowicz, Form convex functions and the WYDL and other inequalities, Lett. Math. Phys. 2 (1978), 505.
[30] M. Reed and B. Simon, Methods of Modern Mathematical Physics I. Functional Analysis, Academic Press, San Diego, 1973.
[31] M. Reed and B. Simon, Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness, Academic Press, San Diego, 1975.
[32] G. A. Raggio and S. H. Zivi, Semiclassical description of N-level systems interacting with radiation fields, in "Quantum Probability II, Heidelberg 1984", Lecture Notes in Math., Vol. 1136, Springer-Verlag, Berlin-New York, 1985.
[33] A. J. Leggett, S. Chakravarty, A. T. Dorsey, M. P. A. Fisher, A. Garg and W. Zwerger, Dynamics of the dissipative two-state system, Rev. Mod. Phys. 59 (1987), 1.
[34] U. Weiss, Quantum Dissipative Systems, Series in Modern Condensed Matter Physics Vol. 2 (second enlarged edition), World Scientific, Singapore, 1999.
[35] H. Araki and E. J. Woods, Representations of the canonical commutation relations describing a nonrelativistic infinite free Bose gas, J. Math. Phys. 4 (1963), 637.
[36] M. Fannes, B. Nachtergaele and A. Verbeure, The equilibrium states of the spin-boson model, Comm. Math. Phys. 114 (1988), 537.
July 14, 2003 11:19 WSPC/148-RMP
00169
Reviews in Mathematical Physics Vol. 15, No. 5 (2003) 491–558 © World Scientific Publishing Company
PERTURBATIVE RENORMALIZATION BY FLOW EQUATIONS
VOLKHARD F. MÜLLER
Fachbereich Physik, Universität Kaiserslautern, D-67653 Kaiserslautern, Germany
[email protected]
Received 9 September 2002
Revised 6 March 2003

In this article a self-contained exposition of proving perturbative renormalizability of a quantum field theory based on an adaption of Wilson's differential renormalization group equation to perturbation theory is given. The topics treated include the spontaneously broken SU(2) Yang–Mills theory. Although mainly a coherent but selective review, the article contains also some simplifications and extensions with respect to the literature.

Keywords: Renormalization group.
Contents
1. Introduction
2. The Method
2.1. Properties of Gaussian measures
2.2. The flow equation
2.3. Proof of perturbative renormalizability
2.4. Insertion of a composite field
2.5. Finite temperature field theory
2.6. Elementary estimates
3. The Quantum Action Principle
3.1. Field equation
3.2. Variation of a coupling constant
3.3. Flow equations for proper vertex functions
4. Spontaneously Broken SU(2) Yang–Mills Theory
4.1. The classical action
4.2. Flow equations: Renormalizability without Slavnov–Taylor identities
4.3. Violated Slavnov–Taylor identities
4.4. Restoration of the Slavnov–Taylor identities
Appendix A. The Relevant Part of Γ
Appendix B. The Relevant Part of the BRS-Insertions
Acknowledgements
References
1. Introduction

Dyson's pioneering work [1, 2] opened the era of a systematic perturbative renormalization theory long ago, and in the late sixties of the last century the rigorous BPHZ-version [3–5] was accomplished. In place of the momentum space subtractions of BPHZ to circumvent UV-divergences various intermediate regularization schemes were invented: Pauli–Villars regularization [6], analytical regularization [7], dimensional regularization [8–11]. These different methods, each with its proper merits, are equivalent up to finite counterterms, for a review see e.g. [12]. All of them are based on the analysis of multiple integrals corresponding to individual Feynman diagrams, the combinatorial complexity of which rapidly grows with increasing order of the perturbative expansion. Zimmermann's famous forest formula [13] provides the clue to disentangle overlapping divergences, organizing the order of subintegrations to be followed. The BPHZ-renormalization, originally developed in the case of massive theories, was extended by Lowenstein [14] to cover also zero mass particles. From the point of view of elementary particle physics, renormalization theory culminated in the work of 't Hooft and Veltman [15–17], demonstrating the renormalizability of non-Abelian gauge theories.

At the time of these achievements, Wilson's view [18–20] of renormalization as a continuous evolution of effective actions (a primarily non-perturbative notion) began to pervade the whole area of quantum field theory and soon proved its fertility. In the domain of rigorous mathematical analysis beyond formal perturbation expansion, the renormalizable UV-asymptotically free Gross–Neveu model in two space-time dimensions has been constructed [21, 22] by decomposing in the functional integral the full momentum range into a union of discrete, disjoint "slices" and integrating successively the corresponding quantum fluctuations, thereby generating a sequence of effective actions.
This slicing can be seen as the equivalent of introducing block-spins in lattices of Statistical Mechanics. Rigorous nonperturbative analysis of the renormalization flow is the general subject of the lecture notes [23, 24] and of the monograph [25] which, among other topics, also treats the problem of summing the formal perturbation series. In these lecture notes and in the monograph references to the original work on non-perturbative renormalization can be found.

In the realm of perturbative renormalization Wilson's ideas have proved beneficial, too. Gallavotti and Nicolò [26, 27] split the propagator of a free scalar field in disjoint momentum slices, i.e. decomposed the field into a sum of independent (generalized) random variables, and developed a tree expansion to perturbative renormalization. Here, due to the slicing, the degrees of freedom are again integrated in finite steps. This method has been applied in the monograph [28] to present a proof of renormalizability of QED which only involves gauge-invariant counterterms. Polchinski [29] realized that considering renormalization in terms of relevant and irrelevant operators, as Wilson did, is also effective in perturbation theory, and he gave in the case of the $\Phi^4_4$-theory an inductive self-contained proof of perturbative renormalization, based on Wilson's renormalization (semi-) group differential
equation. His method avoids completely the combinatoric complexity of generating Feynman diagrams and the following cumbersome analysis of Feynman integrals with their overlapping divergences. It rather treats an n-point Green function of a given perturbative order as a whole. Due to this fact, his method is particularly transparent. Polchinski's approach has proved very stimulating in various directions:

(i) In mathematical physics it has been extended to present new proofs of general results in perturbatively renormalized quantum field theory, which are simpler than those achieved before: renormalization of the nonlinear σ-model [30], a rigorous version of Polchinski's argument, together with physical renormalization conditions [31], renormalization of composite operators and Zimmermann identities [32], Wilson's operator product expansion [33], Symanzik's improved actions [34, 35], large order bounds [36], renormalization of massless $\Phi^4_4$-theory [37], renormalization of QED [38, 39], decoupling theorems [40], renormalization of spontaneously broken Yang–Mills theory [42], temperature independent renormalization of finite temperature field theories [43]. The monograph [44] contains a clear and detailed introduction to Polchinski's method formulated with Wick-ordered field products [34], and, in addition, the application of a similar renormalization flow to the Fermi surface problem of condensed matter physics.

(ii) In the domain of theoretical physics there is a vast amount of contributions with diverse applications of Polchinski's approach. Flow equations for vertex functions have been introduced [45, 46] and also employed to investigate perturbative renormalizability of gauge, chiral and supersymmetric theories [47–53]. In these articles also several explicit one-loop calculations are performed. An effective quantum action principle has been formulated for the renormalization flow breaking gauge invariance [54].
There are also interesting attempts to combine a gauge invariant regularization with a flow equation [55]. Besides aims within perturbation theory, there have been many activities to use truncated versions of flow equations as appropriate nonperturbative approximations in strong interaction physics, more in accord with Wilson’s original goal. In the physically distinguished case of (non-Abelian) gauge theories [56–62] the effective action is restricted in a local approximation to its relevant part for all values of the flowing scale. As a consequence, the flow equation for the effective action reduces to a system of r ordinary differential equations, r being the number of relevant coefficients appearing. This system is integrated from the UV-scale downward. In these nonperturbative approaches the problem arises to reconcile the truncation with the gauge symmetry. This problem is discussed also in [54]. In a very different field of interest the question of the nonperturbative renormalizability of Quantum Einstein Gravity has been investigated, based on truncated flow equations, [63, 64]. These authors restrict the average effective action to the Hilbert action together with a small number of additional local terms. The flow of the coupling coefficients is studied numerically and the
existence of a non-Gaussian fixed point in the ultraviolet found. This result is then interpreted to support the conjecture, that Quantum Einstein Gravity is “asymptotically safe” in Weinberg’s sense. We like to point out that physical applications of flow equations are reviewed in [65], containing an extensive list of references. The present article is intended to provide a self-contained exposition of perturbative renormalization based on Polchinski’s inductive method, employing the differential renormalization group equation of Wilson. Therefore, emphasis is laid on a coherent presentation of the topics considered. A comprehensive overview of the literature on the subject will not be pursued. The quantum field theories considered are treated in their Euclidean formulation on d = 4 dimensional (Euclidean) spacetime by means of functional integration. Accordingly, their correlation functions are called Schwinger functions to distinguish them from the Green functions on Minkowski space. In the intermediate steps of the derivations always regularized functionals are used, the controlled removal (within perturbation theory) of this regularization being our main concern. We avoid any manipulation of unregularized “path integrals”. The plan of this article is as follows. In Sec. 2 Polchinski’s method to prove perturbative renormalizability is elaborated treating the nonsymmetric Φ4 -theory in detail. Besides the system of Schwinger functions of this theory, the Schwinger functions with one composite field (operator) inserted are also dealt with. The presentation is mainly based on [31, 32] and on some simplifications [66]. Moreover, considering the theory at finite temperature, its temperature independent renormalizability is reviewed, following closely [43]. In Sec. 3 two simple cases of the quantum action principle are demonstrated, again treating the nonsymmetric Φ4 -theory: the field equation and the variation of a coupling constant. 
These applications of the method seem not to have been treated in the literature. Hereafter, somewhat disconnected, flow equations for proper vertex functions are dealt with, [41, 45, 46]. Section 4 is devoted to the proof of renormalizability of the physically most important spontaneously broken Yang–Mills theory. Because of the necessity to implement nonlinear field variations, this problem can be regarded as a further instance of the quantum action principle. The presentation follows the line of [42]. Due to a modified cutoff function for the flow, however, one can refrain from introducing irrelevant terms into the bare interaction, thereby simplifying the treatment. In addition, the restoration of the irrelevant part of the Slavnov–Taylor identities is also dealt with.

2. The Method

2.1. Properties of Gaussian measures

Our point of departure is a Gaussian probability measure $d\mu$ on the space $C(\Omega)$ of continuous real-valued functions on a d-dimensional torus $\Omega$. Such a function we
identify with a periodic function on $\mathbb{R}^d$, i.e. $\phi(x) = \phi(x + nl)$, where $x \in \mathbb{R}^d$, $n \in \mathbb{Z}^d$, $l = (l_1, \ldots, l_d) \in \mathbb{R}^d_+$ and $nl = n_1 l_1 + \cdots + n_d l_d$. A Gaussian measure with mean zero is uniquely defined by its covariance $C(x, y)$,
\[ \int d\mu_C(\phi)\, \phi(x)\phi(y) = C(x, y) = C(y, x) . \tag{2.1} \]
The covariance is a positive non-degenerate bilinear form on $C^\infty(\Omega) \times C^\infty(\Omega)$; we assume it to be translation invariant, too,
\[ C(x, y) = C(x - y) . \]
Moreover, the function $C(x)$ is assumed to have a given number $N \in \mathbb{N}$ of derivatives continuous everywhere on $\Omega$. We list some properties of this Gaussian measure employed in the sequel, proofs can be found e.g. in [67].

• Using the notation
\[ \langle \phi, J \rangle = \int_\Omega dx\, \phi(x) J(x) , \qquad \langle J, CJ \rangle = \int_\Omega dx \int_\Omega dy\, J(x) C(x - y) J(y) , \]
where $J \in C^\infty(\Omega)$ is a test function, the generating functional of the correlation functions is given explicitly as
\[ \int d\mu_C(\phi)\, e^{\langle \phi, J \rangle} = e^{\frac{1}{2} \langle J, CJ \rangle} . \tag{2.2} \]

• The translation of the Gaussian measure by a function $\varphi \in C^\infty(\Omega)$ results in
\[ d\mu_C(\phi - \varphi) = e^{-\frac{1}{2} \langle \varphi, C^{-1}\varphi \rangle}\, d\mu_C(\phi)\, e^{\langle \phi, C^{-1}\varphi \rangle} . \tag{2.3} \]

• Let $A(\phi)$ denote a polynomial formed of local powers of the field, $\phi(x)^n$, $n \in \mathbb{N}$, and of its derivatives $(\partial_\mu \phi(x))^m$, $m \in \mathbb{N}$, $2m < N$, at various points $x$. If the covariance $C$ is the sum of two covariances, $C = C_1 + C_2$, then
\[ \int d\mu_C(\phi)\, A(\phi) = \int d\mu_{C_1}(\phi_1) \int d\mu_{C_2}(\phi_2)\, A(\phi_1 + \phi_2) . \tag{2.4} \]

• Integration by parts of a function $A(\phi)$ as considered in (2.4) yields
\[ \int d\mu_C(\phi)\, \phi(x) A(\phi) = \int d\mu_C(\phi) \int_\Omega dy\, C(x - y) \frac{\delta}{\delta\phi(y)} A(\phi) . \tag{2.5} \]

• Finally, let the covariance of the Gaussian measure depend differentiably on a parameter,
\[ C(x - y) = C_t(x - y) , \qquad \dot C_t \equiv \frac{d}{dt} C_t(x - y) . \]
Given again a function $A(\phi)$ as in (2.4), then
\[ \frac{d}{dt} \int d\mu_{C_t}(\phi)\, A(\phi) = \frac{1}{2} \int d\mu_{C_t}(\phi) \left\langle \frac{\delta}{\delta\phi}, \dot C_t \frac{\delta}{\delta\phi} \right\rangle A(\phi) . \tag{2.6} \]
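The integration-by-parts formula (2.5) and the flow identity (2.6) can be checked numerically in the simplest possible setting. The following is a minimal sketch, assuming a single real Gaussian variable with variance $c$ in place of the field (so the covariance is just a number and $t = c$, $\dot C_t = 1$); the test polynomial $A(\phi) = \phi^3 + \phi^4$, the helper `gaussian_mean` and the use of Gauss–Hermite quadrature are illustrative choices, not taken from the article.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gaussian_mean(func, c, order=40):
    """E[func(phi)] for a centred Gaussian variable phi with variance c."""
    # hermegauss integrates against exp(-x^2 / 2); the total weight is sqrt(2*pi).
    nodes, weights = hermegauss(order)
    return float(np.sum(weights * func(np.sqrt(c) * nodes)) / np.sqrt(2.0 * np.pi))


# Hypothetical test polynomial A(phi) = phi^3 + phi^4 and its derivatives.
def A(phi):
    return phi**3 + phi**4


def dA(phi):
    return 3.0 * phi**2 + 4.0 * phi**3


def d2A(phi):
    return 6.0 * phi + 12.0 * phi**2


c = 0.7  # variance; plays the role of C_t with t = c, so dC/dt = 1

# One-variable analogue of (2.5): E[phi A(phi)] = c E[A'(phi)]
lhs_parts = gaussian_mean(lambda p: p * A(p), c)
rhs_parts = c * gaussian_mean(dA, c)

# One-variable analogue of (2.6): d/dc E[A(phi)] = (1/2) E[A''(phi)]
h = 1e-4
lhs_flow = (gaussian_mean(A, c + h) - gaussian_mean(A, c - h)) / (2.0 * h)
rhs_flow = 0.5 * gaussian_mean(d2A, c)

print("integration by parts:", lhs_parts, rhs_parts)
print("flow identity:       ", lhs_flow, rhs_flow)
```

In this toy case both sides of the first identity equal $3c^2$ and both sides of the second equal $6c$, which is the one-dimensional content of Wick's theorem.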
As an example of the class of covariances considered, we present already here the particular covariance which will be mainly used in the flow equations envisaged. The torus $\Omega$ has volume $|\Omega| = l_1 l_2 \cdots l_d$ and a point $x \in \Omega$ has coordinates $-\frac{1}{2} l_i \le x_i < \frac{1}{2} l_i$, $i = 1, \ldots, d$. Hence, the dual Fourier variables (momentum vectors) $k$ form a discrete set:
\[ k = k(n) = \left( \frac{2\pi n_1}{l_1}, \ldots, \frac{2\pi n_d}{l_d} \right) , \qquad n \in \mathbb{Z}^d . \]
Let $m$, $\Lambda_0$ be positive constants, $0 < m \ll \Lambda_0$, and let the nonnegative parameter $\Lambda$ satisfy $0 \le \Lambda \le \Lambda_0$. We define the covariance
\[ C^{\Lambda,\Lambda_0}(x - y) = \frac{1}{|\Omega|} \sum_{n \in \mathbb{Z}^d} \frac{e^{ik(x-y)}}{k^2 + m^2} \left( e^{-\frac{k^2 + m^2}{\Lambda_0^2}} - e^{-\frac{k^2 + m^2}{\Lambda^2}} \right) . \tag{2.7} \]
This covariance obviously has the well-defined infinite volume limit $\Omega \to \mathbb{R}^d$, with $x, y, k \in \mathbb{R}^d$,
\[ C^{\Lambda,\Lambda_0}(x - y) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} dk\, \frac{e^{ik(x-y)}}{k^2 + m^2} \left( e^{-\frac{k^2 + m^2}{\Lambda_0^2}} - e^{-\frac{k^2 + m^2}{\Lambda^2}} \right) . \tag{2.8} \]
Abusing slightly the notation we did not choose a different symbol for the limit. Later on, however, the case referred to will be clearly stated. Choosing the values $\Lambda_0 = \infty$, $\Lambda = 0$ the covariances (2.7) and (2.8) become the Euclidean propagator of a free real scalar field with mass $m$ on $\Omega$ and $\mathbb{R}^d$, respectively. A finite value of $\Lambda_0$ generates a UV-cutoff thus regularizing the covariances: they now satisfy the regularity condition assumed for all $N$. This property is kept when introducing the additional term governed by the "flowing" parameter $0 \le \Lambda \le \Lambda_0$. Its role is to interpolate differentiably between a vanishing covariance at $\Lambda = \Lambda_0$, corresponding to a δ-measure on the function space, and the free UV-regularized covariance at $\Lambda = 0$. As a consequence we remark that the Gaussian measure with covariance (2.7) is supported with probability one on (the nuclear space) $C^\infty(\Omega)$, [68, 69].

Clearly, a modification of the Euclidean propagator showing these properties can be accomplished with a large variety of cutoff functions. In (2.7) and (2.8) a factor of the form
\[ R^{\Lambda,\Lambda_0}(k) = \sigma_{\Lambda_0}(k^2) - \sigma_{\Lambda}(k^2) \tag{2.9} \]
has been introduced, with the particular function
\[ \sigma_\Lambda(k^2) = e^{-\frac{k^2 + m^2}{\Lambda^2}} . \tag{2.10} \]
We observe that regularization and interpolation are caused by any positive function $\sigma_\Lambda(k^2)$ satisfying:
(i) For fixed $\Lambda$ it decreases as a function of $k^2$, vanishing rapidly for $k^2 > \Lambda^2$.
(ii) For fixed $k^2$ it increases with $\Lambda$ from the value zero at $\Lambda = 0$ to the value one at $\Lambda = \infty$.
Later on, our particular choice will prove advantageous.
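The interpolating role of the cutoff factor (2.9) with the choice (2.10) can be made concrete with a few lines of code. This is only an illustrative sketch: the function names `sigma`, `cutoff` and `regularized_propagator`, the default mass $m = 1$ and the sample momenta are assumptions made here, and $\sigma_0$ is set to its limiting value zero by hand.

```python
import math


def sigma(lam, k2, m=1.0):
    """sigma_Lambda(k^2) = exp(-(k^2 + m^2) / Lambda^2), cf. (2.10); sigma_0 := 0."""
    if lam == 0.0:
        return 0.0  # limiting value for Lambda -> 0
    return math.exp(-(k2 + m * m) / (lam * lam))


def cutoff(lam, lam0, k2, m=1.0):
    """R^{Lambda,Lambda_0}(k) = sigma_{Lambda_0}(k^2) - sigma_Lambda(k^2), cf. (2.9)."""
    return sigma(lam0, k2, m) - sigma(lam, k2, m)


def regularized_propagator(lam, lam0, k2, m=1.0):
    """Momentum-space integrand of (2.8) without the plane-wave factor."""
    return cutoff(lam, lam0, k2, m) / (k2 + m * m)


k2 = 4.0
lam0 = 10.0

# At Lambda = Lambda_0 the covariance vanishes (delta-measure on field space).
print(cutoff(lam0, lam0, k2))

# At Lambda = 0 only the UV regulator sigma_{Lambda_0} remains.
print(cutoff(0.0, lam0, k2), sigma(lam0, k2))

# Removing the UV cutoff at Lambda = 0 approaches the free propagator 1 / (k^2 + m^2).
print(regularized_propagator(0.0, 1.0e6, k2), 1.0 / (k2 + 1.0))
```

Lowering $\Lambda$ from $\Lambda_0$ to $0$ thus switches the momentum modes on gradually, which is the property (ii) listed above read in the opposite direction.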
2.2. The flow equation

Perturbative renormalizability of a quantum field theory is based on locality of its action and on power counting. The qualification "perturbative" means expansion of the theory's Green (or Schwinger) functions as formal power series in the loop parameter$^{a}$ $\hbar$, and treating them order-by-order, i.e. disregarding questions of convergence. The notions of locality and power counting can be introduced looking at the classical precursor of the quantum field theory to be constructed. There, a local action in $d$ space-time dimensions is the space-time integral of a Lagrangian (density), having the form of a polynomial in the fields entering the theory and their derivatives. The propagators are determined by the free part, which is bilinear in the fields. For a scalar field and a spin-$\frac{1}{2}$-field this free part is of second and first order in the derivatives, respectively. Defining the canonical (mass) dimension of the corresponding fields to be $\frac{1}{2}(d - 2)$ and $\frac{1}{2}(d - 1)$, respectively, and attributing the mass dimension 1 to each partial derivative, the free part of the Lagrangian has the dimension $d$, the action thus is dimensionless. Vector fields, especially (non-Abelian) gauge fields, pose particular problems to be considered later. Still looking at the classical theory, local interaction terms in the Lagrangian involve by definition more than two fields. Their respective coupling constant has a mass dimension, derived from the mass dimension of the interaction term, the coupling constant of an interaction term of mass dimension $d$ being dimensionless. Any local term entering the Lagrangian is called a relevant operator [18–20] if it has a mass dimension $\le d$, but irrelevant, if its mass dimension is greater than $d$.
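The power-counting rules just described amount to simple arithmetic, which the following sketch spells out for a scalar field. The function names are hypothetical and the classification covers only monomials built from a given number of scalar fields and derivatives; it does not account for symmetry restrictions or for terms that are total derivatives.

```python
from fractions import Fraction


def field_dimension(d, spin_half=False):
    """Canonical mass dimension: (d - 2)/2 for a scalar, (d - 1)/2 for a spin-1/2 field."""
    return Fraction(d - 1, 2) if spin_half else Fraction(d - 2, 2)


def operator_dimension(d, n_fields, n_derivatives=0):
    """Mass dimension of a local monomial with n_fields scalar fields and n_derivatives derivatives."""
    return n_fields * field_dimension(d) + n_derivatives


def coupling_dimension(d, n_fields, n_derivatives=0):
    """Mass dimension of the coupling constant multiplying such a monomial in the Lagrangian."""
    return d - operator_dimension(d, n_fields, n_derivatives)


def is_relevant(d, n_fields, n_derivatives=0):
    """Relevant in the sense used here: mass dimension <= d."""
    return operator_dimension(d, n_fields, n_derivatives) <= d


d = 4
for n_fields, n_derivatives, label in [
    (1, 0, "phi"),
    (2, 0, "phi^2"),
    (2, 2, "(d phi)^2"),
    (3, 0, "phi^3"),
    (4, 0, "phi^4"),
    (6, 0, "phi^6"),
]:
    print(
        label,
        "dimension", operator_dimension(d, n_fields, n_derivatives),
        "coupling dimension", coupling_dimension(d, n_fields, n_derivatives),
        "relevant" if is_relevant(d, n_fields, n_derivatives) else "irrelevant",
    )
```

For $d = 4$ the cubic and quartic couplings come out with mass dimension one and zero, and the kinetic term has dimension $d$ in any dimension, in line with the statement that the free part of the Lagrangian has dimension $d$.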
In the physically distinguished case $d = 4$ the central result of perturbative renormalization theory is that UV-finite Green (or Schwinger) functions can be obtained in any order, if the interaction terms have mass dimensions $\le 4$, by prescribing a finite number of renormalization conditions. This number equals the number of relevant operators forming the full classical Lagrangian.

We consider the quantum field theory of a real scalar field $\phi$ with mass $m$ on four-dimensional Euclidean space-time within the framework of functional integration. The emerging vacuum effects require a finite space-time volume. Therefore we start with a real-valued field $\phi \in C^1(\Omega)$ on a four-dimensional torus $\Omega$. Its bare interaction, labeled by a UV-cutoff $\Lambda_0 \in \mathbb{R}_+$, is chosen as

\[
L^{\Lambda_0,\Lambda_0}(\phi) = \int_\Omega dx \left( \frac{f}{3!}\,\phi^3(x) + \frac{g}{4!}\,\phi^4(x) \right)
+ \int_\Omega dx \left( v(\Lambda_0)\phi(x) + \frac12 a(\Lambda_0)\phi^2(x) + \frac12 z(\Lambda_0)(\partial_\mu\phi)^2(x) + \frac{1}{3!} b(\Lambda_0)\phi^3(x) + \frac{1}{4!} c(\Lambda_0)\phi^4(x) \right). \tag{2.11}
\]

^a If one considers Feynman diagrams, the power of $\hbar$ counts the number of loops formed by such a diagram.
V. F. Müller
The first integral has classical roots: its integrand is formed of the field’s self-interaction with real coupling constants $f$ and $g$ having mass dimension equal to one and zero, respectively.^b The second integral contains the related counterterms, determined according to the following rule. The canonical mass dimension of the field $\phi$ is equal to one. All local terms of mass dimension $\le 4$ that can be formed of the field and of its derivatives while respecting the (Euclidean) $O(4)$-symmetry have to appear as counterterms in the integrand of the bare interaction. This symmetry is not violated by the intermediate UV-regularization procedure and can thus be maintained. In contradistinction to the coupling constants $f, g$ the five coefficients $v(\Lambda_0), a(\Lambda_0), z(\Lambda_0), b(\Lambda_0), c(\Lambda_0)$ of the counterterms cannot be chosen freely but have to depend on the UV-cutoff $\Lambda_0$. This dependence is dictated by the aim that after functional integration the UV-regularization can be removed, i.e. the limit $\Lambda_0 \to \infty$ can be performed keeping the physical content of the theory finite. As a consequence, however, the coefficients stated above turn out to diverge with $\Lambda_0 \to \infty$. If we restrict the bare interaction (2.11) to the case $f = 0$, $v(\Lambda_0) = b(\Lambda_0) = 0$, it is also invariant under the mirror transformation $\phi(x) \to -\phi(x)$, implying an additional symmetry of the theory.

The regularized quantum field theory on finite volume is defined by the generating functional of its Schwinger functions

\[
Z^{\Lambda,\Lambda_0}(J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi) + \frac{1}{\hbar}\langle \phi, J\rangle} \tag{2.12}
\]
with a real source $J \in C^\infty(\Omega)$, bare interaction (2.11) and a Gaussian measure $d\mu_{\Lambda,\Lambda_0}$ with mean zero and covariance $\hbar C^{\Lambda,\Lambda_0}$, (2.7). The positive parameter $\hbar$ has been introduced with regard to a systematic loop expansion considered later. For fixed $\Omega$ and $\Lambda_0$, and assuming $g + c(\Lambda_0) > 0$, $z(\Lambda_0) \ge 0$ in the bare interaction (2.11), the functional integral (2.12) is well-defined. As a functional on $C^\infty(\Omega)$, the support of the Gaussian measure $d\mu_{\Lambda,\Lambda_0}(\phi)$, the bare interaction is continuous in any Sobolev norm of order $n \ge 1$, and, furthermore, bounded below, $L^{\Lambda_0,\Lambda_0}(\phi) > \kappa$. Hence, with $\Lambda_0$ fixed, we have the uniform bound for $0 \le \Lambda \le \Lambda_0$,

\[
|Z^{\Lambda,\Lambda_0}(J)| < e^{-\kappa} \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{\frac{1}{\hbar}\langle \phi, J\rangle} \le e^{-\kappa + \frac{1}{2\hbar}\langle J,\, C^{0,\Lambda_0} J\rangle}. \tag{2.13}
\]

From (2.12) one obtains the generating functional $W^{\Lambda,\Lambda_0}(J)$ of the truncated Schwinger functions^c

\[
e^{\frac{1}{\hbar} W^{\Lambda,\Lambda_0}(J)} = \frac{Z^{\Lambda,\Lambda_0}(J)}{Z^{\Lambda,\Lambda_0}(0)} \tag{2.14}
\]

which provides the $n$-point functions, $n \in \mathbb{N}$, upon functional derivation:

\[
W_n^{\Lambda,\Lambda_0}(x_1,\dots,x_n) = \frac{\delta^n}{\delta J(x_1)\cdots\delta J(x_n)}\, W^{\Lambda,\Lambda_0}(J)\Big|_{J=0}. \tag{2.15}
\]

^b Stability requires $g$ to be positive, but this property is not felt in a perturbative treatment.
^c In a representation of these functions in terms of (Feynman) diagrams only connected diagrams appear.
Besides the UV-regularization determined by the cutoff $\Lambda_0$, imperative to have a well-defined functional integral (2.12), an additional flowing cutoff $\Lambda$ has been built in, suppressing smaller momenta. It is a merely technical device, introduced by Polchinski [29] and inspired by Wilson’s view of renormalization [18–20]. Decreasing $\Lambda$ from its maximal value $\Lambda = \Lambda_0$ to its physical value $\Lambda = 0$ gradually takes into account the momentum domain, starting at high momenta. In mathematical terms, the parameter $\Lambda$ interpolates continuously between a $\delta$-measure (i.e. absence of quantum effects) at $\Lambda = \Lambda_0$ and the Gaussian measure $d\mu_{0,\Lambda_0}$ on a UV-regularized field at $\Lambda = 0$. Of course, as stressed after Eq. (2.10), such an interpolation can also be realized by cutoff functions other than the one in (2.7) used here. In order to make use of the flow parameter $\Lambda$ it is advantageous to consider the (free propagator-) amputated truncated Schwinger functions with generating functional $L^{\Lambda,\Lambda_0}(\varphi)$, $\varphi \in C^\infty(\Omega)$, defined as

\[
e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}, \tag{2.16}
\]

\[
L^{\Lambda,\Lambda_0}(0) = 0. \tag{2.17}
\]
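The constant $I^{\Lambda,\Lambda_0}$ is the vacuum part of the theory. Translating on the r.h.s. in (2.16) the source function $\varphi$ to the measure and using (2.3) leads to

\[
e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = e^{-\frac12 \langle \varphi,\, (\hbar C^{\Lambda,\Lambda_0})^{-1} \varphi\rangle}\, Z^{\Lambda,\Lambda_0}\big((C^{\Lambda,\Lambda_0})^{-1}\varphi\big) \tag{2.18}
\]

relating the generating functionals $Z$ and $L$. From this, together with the definition (2.14), finally follows

\[
L^{\Lambda,\Lambda_0}(\varphi) = \frac12 \langle \varphi,\, (C^{\Lambda,\Lambda_0})^{-1}\varphi\rangle - W^{\Lambda,\Lambda_0}\big((C^{\Lambda,\Lambda_0})^{-1}\varphi\big). \tag{2.19}
\]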
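Denoting by $\dot C^{\Lambda,\Lambda_0}$ the derivative of the covariance $C^{\Lambda,\Lambda_0}$ with respect to the flow parameter $\Lambda$ we observe, with $\Lambda_0$ kept fixed,

\[
\begin{aligned}
\frac{d}{d\Lambda} \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}
&= \frac{\hbar}{2} \int d\mu_{\Lambda,\Lambda_0}(\phi) \left\langle \frac{\delta}{\delta\phi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\phi} \right\rangle e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)} \\
&= \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}
\end{aligned}
\]

where in the first step (2.6) has been used, whereas the second step follows from the integrand’s particular dependence on the field $\phi$. Hence, because of Eq. (2.16) we obtain the differential equation

\[
\frac{d}{d\Lambda}\, e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)}. \tag{2.20}
\]

The reader notices that the relation (2.6) has been used in the case of a nonpolynomial function. Therefore this extension has to be understood in terms of a formal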
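power series expansion, i.e. disregarding the question of convergence. Upon explicit differentiation in (2.20) follows the Wilson flow equation

\[
\frac{d}{d\Lambda}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right) = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle L^{\Lambda,\Lambda_0}(\varphi) - \frac12 \left\langle \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi),\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi) \right\rangle. \tag{2.21}
\]

The form of Eq. (2.20) strongly resembles the heat equation. Defining the functional Laplace operator

\[
\Delta_{\Lambda,\Lambda_0} = \frac12 \left\langle \frac{\delta}{\delta\varphi},\, C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle, \tag{2.22}
\]

the unique solution of the differential equation (2.20), already given in the form (2.16), can also be written as

\[
e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = e^{\hbar \Delta_{\Lambda,\Lambda_0}}\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\varphi)}. \tag{2.23}
\]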
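The heat-equation structure of (2.20) can be illustrated in a toy setting. The sketch below is not the field theory of the text: it assumes a "zero-dimensional" model in which the field is a single real number, the covariance $C$ is a positive number used directly as flow variable, and the bare interaction is $g\phi^4/4!$ without counterterms. All function names are hypothetical. In this setting the Gaussian average $F(C,\varphi) = \int d\mu_C(\phi)\, e^{-L_0(\phi+\varphi)/\hbar}$ should satisfy $\partial_C F = \tfrac{\hbar}{2}\,\partial_\varphi^2 F$, which is checked by finite differences.

```python
import math

HBAR = 1.0
G = 1.0  # quartic coupling of the toy interaction (an assumption of this sketch)


def bare_interaction(phi):
    """Toy bare interaction L0(phi) = g * phi^4 / 4!, without counterterms."""
    return G * phi ** 4 / 24.0


def gaussian_average(c, source, n_points=4001, z_max=10.0):
    """F(c, source) = integral of exp(-L0(phi + source)/hbar) over a centred
    Gaussian measure of variance hbar*c, using the trapezoidal rule in the
    rescaled variable z = phi / sqrt(hbar*c)."""
    sigma = math.sqrt(HBAR * c)
    h = 2.0 * z_max / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        gauss = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * gauss * math.exp(-bare_interaction(sigma * z + source) / HBAR)
    return total * h


def heat_equation_residual(c, source, dc=1e-4, ds=5e-3):
    """dF/dc - (hbar/2) d^2F/dsource^2, both by central differences."""
    d_c = (gaussian_average(c + dc, source) - gaussian_average(c - dc, source)) / (2.0 * dc)
    d2_s = (gaussian_average(c, source + ds) - 2.0 * gaussian_average(c, source)
            + gaussian_average(c, source - ds)) / ds ** 2
    return d_c - 0.5 * HBAR * d2_s


if __name__ == "__main__":
    print(heat_equation_residual(0.5, 0.3))
```

The residual is zero up to discretization error. In the text the flow variable is $\Lambda$ rather than $C$, so the chain rule contributes the factor $\dot C^{\Lambda,\Lambda_0}$ appearing in (2.20).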
Since $\Delta_{\Lambda,\Lambda_0}$ commutes with its derivative $\dot\Delta_{\Lambda,\Lambda_0}$ with respect to $\Lambda$, the r.h.s. of (2.23) satisfies the differential equation. Moreover, the initial condition holds because of $\Delta_{\Lambda_0,\Lambda_0} = 0$ and $I^{\Lambda_0,\Lambda_0} = 0$.

At this point, several remarks concerning the mathematical aspect of the steps performed are in order:

(i) Our aim with these preparatory steps is to generate the system of flow equations satisfied by the regularized Schwinger functions of the theory, when considered in the perturbative sense of formal power series. This system then is taken as the starting point for a proof of perturbative renormalizability. The UV-regularized finite-volume generating functional (2.12), or one of its direct descendants (2.14) and (2.16), acts as the basic “root”. Expanding in their respective integrands the exponential function in a power series would provide the standard perturbation expansion in terms of (regularized) Feynman integrals. Bearing in mind our stated goal, we could already view the steps performed in the restricted sense of formal power series.

(ii) We mention that in [44], to begin on safe ground, the generating functional of the theory has first been formulated on a finite space-time lattice in order to derive the (perturbative) flow equation (implying a finite-dimensional Gaussian integral), and the limit to continuous infinite space-time has been taken afterwards.

(iii) Rigorous analysis beyond perturbation theory of flow equations of the Wilson type (2.21) is the subject dealt with in [23], using convergent expansion techniques. Such techniques are developed in the monograph [25].

The flow equation (2.21) for the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ encodes a system of flow equations for the corresponding $n$-point functions, $n \in \mathbb{N}$, and for the vacuum part $I^{\Lambda,\Lambda_0}$. The flow of the latter is determined considering Eq. (2.21) at $\varphi = 0$.
From the translation invariance of the theory it follows that the 2-point function is a distribution depending on the difference variable $x - y$ only (suppressing momentarily the superscript $\Lambda, \Lambda_0$),

\[
\frac{\delta}{\delta\varphi(x)} \frac{\delta}{\delta\varphi(y)} L(\varphi)\Big|_{\varphi=0} =: L_2(x-y),
\]

and thus

\[
\left\langle \frac{\delta}{\delta\varphi},\, \dot C \frac{\delta}{\delta\varphi} \right\rangle L(\varphi)\Big|_{\varphi=0} = \int_\Omega dx \int_\Omega dy\, \dot C(x-y) L_2(x-y) = |\Omega| \int_\Omega dz\, \dot C(z) L_2(z).
\]

Because of the emerging dependence on the volume $|\Omega|$ the flow equation of the vacuum part cannot be treated in the infinite volume limit. However, due to the covariance (2.8) which corresponds to a massive particle and thus decays exponentially, the flow equation for the $n$-point functions can and in the sequel will be treated in this limit. Hence, at least one functional derivative has to act on the flow equation (2.21). Due to the translation invariance of the theory it is convenient to consider the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$, $\varphi \in \mathcal{S}(\mathbb{R}^4)$, in terms of the Fourier transformed source field $\hat\varphi$; the conventions used are

\[
\varphi(x) = \int_p e^{ipx}\, \hat\varphi(p), \qquad \int_p := \int_{\mathbb{R}^4} \frac{d^4p}{(2\pi)^4}, \tag{2.24}
\]

implying for the functional derivative $\delta_{\varphi(x)} := \delta/\delta\varphi(x)$ the transformation

\[
\delta_{\varphi(x)} = (2\pi)^4 \int_p e^{-ipx}\, \delta_{\hat\varphi(p)}.
\]

From the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ the correlation functions are obtained by functional derivation, $n \in \mathbb{N}$,

\[
(2\pi)^{4(n-1)}\, \delta_{\hat\varphi(p_n)} \cdots \delta_{\hat\varphi(p_1)} L^{\Lambda,\Lambda_0}(\varphi)\Big|_{\varphi=0} = \delta(p_1 + \cdots + p_n)\, L_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n). \tag{2.25}
\]
The amputated truncated $n$-point function $L_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$ is a totally symmetric function of the momenta $p_1,\dots,p_n$ and, moreover, due to the $\delta$-function, $p_n := -p_1 - \cdots - p_{n-1}$. (In the case where the bare interaction (2.11) shows the mirror symmetry $L^{\Lambda_0,\Lambda_0}(-\phi) = L^{\Lambda_0,\Lambda_0}(\phi)$ all $n$-point functions with $n$ odd vanish.) Observing the definition (2.25) we obtain from (2.21) the system of flow equations for the $n$-point functions, $n \in \mathbb{N}$,

\[
\begin{aligned}
\partial_\Lambda L_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n) ={}& \frac{\hbar}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot L_{n+2}^{\Lambda,\Lambda_0}(k, p_1,\dots,p_n, -k) \\
&- \frac12 \sum_{r=0}^{n} \sum_{i_1 < \cdots < i_r} L_{r+1}^{\Lambda,\Lambda_0}(p_{i_1},\dots,p_{i_r}, p)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(p) \cdot L_{n-r+1}^{\Lambda,\Lambda_0}(-p, p_{j_1},\dots,p_{j_{n-r}}),
\end{aligned}
\]

\[
p_1 + \cdots + p_n = 0, \qquad -p = p_{i_1} + \cdots + p_{i_r}. \tag{2.26}
\]
In the quadratic term a given set of momenta $(p_{i_1},\dots,p_{i_r})$, $i_1 < \cdots < i_r$, determines (uniquely) the corresponding set $(p_{j_1},\dots,p_{j_{n-r}})$, $j_1 < \cdots < j_{n-r}$, such that the union of this pair is the set of momenta $(p_1,\dots,p_n)$. Furthermore, the Fourier transform of the covariance (2.8),

\[
\hat C^{\Lambda,\Lambda_0}(k) = \frac{1}{k^2+m^2} \left( e^{-\frac{k^2+m^2}{\Lambda_0^2}} - e^{-\frac{k^2+m^2}{\Lambda^2}} \right), \tag{2.27}
\]

is written with a slight abuse of notation omitting the “hat”. In the sequel we shall write the quadratic term appearing in (2.26) more compactly as

\[
-\frac12 \sum_{r=0}^{n} \sum_{i_1 < \cdots < i_r} \cdots \;=\; -\frac12 \sum_{n_1,n_2}{}^{\!\prime}\, \Big[ L_{n_1+1}^{\Lambda,\Lambda_0}(p_1,\dots,p_{n_1}, p)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(p) \cdot L_{n_2+1}^{\Lambda,\Lambda_0}(-p, p_{n_1+1},\dots,p_n) \Big]_{r\,\mathrm{sym}},
\]

\[
p := -p_1 - \cdots - p_{n_1} = p_{n_1+1} + \cdots + p_n, \tag{2.28}
\]
where the prime on top of the summation symbol imposes the restriction to $n_1 + n_2 = n$. Moreover, the symbol “r sym” means summation over those permutations of the momenta $p_1,\dots,p_n$ which do not leave invariant the (unordered) subsets $(p_1,\dots,p_{n_1})$ and $(p_{n_1+1},\dots,p_n)$, and, in addition, produce mutually different pairs of (unordered) image subsets. The system of flow equations (2.26) will be treated perturbatively employing a loop expansion of the $n$-point functions as formal power series, $n \in \mathbb{N}$,

\[
L_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \sum_{l=0}^{\infty} \hbar^l\, L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n). \tag{2.29}
\]
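The only explicit input to the flow equations (2.26) is the covariance (2.27). As a small sanity check, the following sketch evaluates (2.27) as a function of $k^2$ and compares a finite-difference $\Lambda$-derivative with the closed form $\partial_\Lambda C^{\Lambda,\Lambda_0}(k) = -2\Lambda^{-3} e^{-(k^2+m^2)/\Lambda^2}$, which follows by differentiating (2.27) directly. The function names and the numerical parameters are assumptions made for illustration.

```python
import math


def covariance(k2, lam, lam0, m):
    """Regularized propagator (2.27) as a function of k^2, for lam > 0."""
    a = k2 + m * m
    return (math.exp(-a / lam0 ** 2) - math.exp(-a / lam ** 2)) / a


def d_lambda_covariance(k2, lam, m):
    """Closed form of the Lambda-derivative of (2.27).
    It does not depend on the UV cutoff lam0."""
    a = k2 + m * m
    return -2.0 / lam ** 3 * math.exp(-a / lam ** 2)


def d_lambda_covariance_fd(k2, lam, lam0, m, h=1e-5):
    """Central finite-difference approximation of the same derivative."""
    return (covariance(k2, lam + h, lam0, m)
            - covariance(k2, lam - h, lam0, m)) / (2.0 * h)


def weak_bound_ratio(k2, lam, m):
    """|d_lambda C| * (lam + m)^3, which should stay bounded uniformly in lam
    and k^2 if a bound of the form (lam + m)^(-3) * const holds."""
    return abs(d_lambda_covariance(k2, lam, m)) * (lam + m) ** 3


if __name__ == "__main__":
    print(d_lambda_covariance(2.0, 1.5, 1.0),
          d_lambda_covariance_fd(2.0, 1.5, 10.0, 1.0))
```

The closed form makes two facts used in the flow equations visible at a glance: the derivative vanishes rapidly for $k^2 \gg \Lambda^2$, and it is independent of $\Lambda_0$.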
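Since also flow equations for momentum derivatives of $n$-point functions have to be considered, we introduce the shorthand notation

\[
w = (w_{1,1},\dots,w_{n-1,4}), \quad w_{i,\mu} \in \mathbb{N}_0, \quad |w| = \sum_{i,\mu} w_{i,\mu},
\]

\[
\partial^w := \prod_{i=1}^{n-1} \prod_{\mu=1}^{4} \left( \frac{\partial}{\partial p_{i,\mu}} \right)^{w_{i,\mu}}, \qquad w! = \prod_{i=1}^{n-1} \prod_{\mu=1}^{4} w_{i,\mu}!\,. \tag{2.30}
\]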
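From (2.26) then follows the system of flow equations, $n \in \mathbb{N}$, $l \in \mathbb{N}_0$:

\[
\begin{aligned}
\partial_\Lambda \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) ={}& \frac12 \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \partial^w L_{l-1,n+2}^{\Lambda,\Lambda_0}(k, p_1,\dots,p_n, -k) \\
&- \frac12 \sum_{n_1,n_2}{}^{\!\prime} \sum_{l_1,l_2}{}^{\!\prime} \sum_{w_1,w_2,w_3}{}^{\!\!\!\prime}\; c_{\{w_i\}} \Big[ \partial^{w_1} L_{l_1,n_1+1}^{\Lambda,\Lambda_0}(p_1,\dots,p_{n_1}, p) \\
&\qquad \cdot\, \partial^{w_3} \partial_\Lambda C^{\Lambda,\Lambda_0}(p) \cdot \partial^{w_2} L_{l_2,n_2+1}^{\Lambda,\Lambda_0}(-p, p_{n_1+1},\dots,p_n) \Big]_{r\,\mathrm{sym}},
\end{aligned}
\]

\[
p = -p_1 - \cdots - p_{n_1} = p_{n_1+1} + \cdots + p_n. \tag{2.31}
\]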
One should not overlook that the residual symmetrization (r sym) acts on the momentum $p$, too. The primes restrict the summations to $n_1 + n_2 = n$, $l_1 + l_2 = l$, $w_1 + w_2 + w_3 = w$, respectively. Moreover, the combinatorial factor $c_{\{w_i\}} = w!\,(w_1!\,w_2!\,w_3!)^{-1}$ comes from Leibniz’s rule. In the loop order $l = 0$, obviously, the first term on the r.h.s. is absent.

2.3. Proof of perturbative renormalizability

Perturbative renormalizability of the regularized field theory (2.16) amounts to the following: For given coupling constants $f, g$ in the bare interaction (2.11) the coefficients $v(\Lambda_0), a(\Lambda_0), z(\Lambda_0), b(\Lambda_0), c(\Lambda_0)$ of the counterterms can be adjusted within a loop expansion of the theory, i.e.

\[
v(\Lambda_0) = \sum_{l=1}^{\infty} \hbar^l v_l(\Lambda_0), \ \dots, \ c(\Lambda_0) = \cdots \tag{2.32}
\]

in such a way that all (infinite volume) $n$-point functions (2.29) in every loop order $l$ have finite limits

\[
\lim_{\Lambda_0 \to \infty} \lim_{\Lambda \to 0} L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n), \qquad n \in \mathbb{N},\ l \in \mathbb{N}_0. \tag{2.33}
\]
These limits emerge directly in the tree order $l = 0$, of course. The counterterms are adjusted by requiring that the corresponding $n$-point functions at the physical value $\Lambda = 0$ of the flow parameter have prescribed values for a chosen set of momenta. Since the theory is massive it is convenient to prescribe these renormalization conditions at vanishing momenta. Hence, taking into account the Euclidean symmetry we require for all $l \in \mathbb{N}$:

\[
L_{l,1}^{0,\Lambda_0} = v_l^R, \tag{2.34}
\]
\[
L_{l,2}^{0,\Lambda_0}(p,-p) = a_l^R + z_l^R\, p^2 + O((p^2)^2), \tag{2.35}
\]
\[
L_{l,3}^{0,\Lambda_0}(p_1,p_2,p_3) = b_l^R + O(p_1^2, p_2^2, p_3^2), \tag{2.36}
\]
\[
L_{l,4}^{0,\Lambda_0}(0,0,0,0) = c_l^R. \tag{2.37}
\]
In each loop order $l \in \mathbb{N}$ these five real renormalization constants $v_l^R, \dots, c_l^R$ can be chosen freely, not depending^d on $\Lambda_0$. Together with the corresponding constants of the tree order $l = 0$ they fix the relevant part of the theory completely. A particular (simple) choice would be to set $v_l^R = a_l^R = z_l^R = b_l^R = c_l^R = 0$.

^d A weak dependence of these constants on $\Lambda_0$ with finite limits when $\Lambda_0 \to \infty$ could be permitted.

The tree order has to be treated first. It is fully determined by the classical part appearing in the bare interaction (2.11). This classical interaction acts as initial condition at $\Lambda = \Lambda_0$ when integrating the flow equations (2.31) for $l = 0$ downwards to smaller values of $\Lambda$, ascending successively in the number of fields $n$. The classical interaction contains no terms linear or quadratic in the fields. To bring the system of flow equations to bear, however, at first the crucial properties, $0 \le \Lambda \le \Lambda_0$,

\[
L_{0,1}^{\Lambda,\Lambda_0} = 0, \qquad L_{0,2}^{\Lambda,\Lambda_0}(p,-p) = 0 \tag{2.38}
\]

have to be inferred directly from the representation (2.16). From this and with the initial condition for $n = 3$ follows from (2.31),

\[
L_{0,3}^{\Lambda,\Lambda_0}(p_1,p_2,p_3) = f, \tag{2.39}
\]

and then for $n = 4$:

\[
L_{0,4}^{\Lambda,\Lambda_0}(p_1,p_2,p_3,p_4) = g - f^2 \left( C^{\Lambda,\Lambda_0}(p_1+p_2) + C^{\Lambda,\Lambda_0}(p_1+p_3) + C^{\Lambda,\Lambda_0}(p_1+p_4) \right). \tag{2.40}
\]
Ascending further in the number of fields yields the whole tree order. (For $n > 4$ all initial conditions at $\Lambda = \Lambda_0$ are equal to zero.) The first step in proving renormalizability is to establish the

Proposition 2.1 (Boundedness). For all $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $w$ from (2.30) and for $0 \le \Lambda \le \Lambda_0$ holds

\[
|\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le (\Lambda+m)^{4-n-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda+m}{m} \right) \mathcal{P}_2\!\left( \frac{|p_i|}{\Lambda+m} \right), \tag{2.41}
\]
where $\mathcal{P}$ denotes polynomials having nonnegative coefficients. These coefficients, as well as the degree of the polynomials, depend on $l, n, w$ but not on $\{p_i\}, \Lambda, \Lambda_0$. For $l = 0$ all polynomials $\mathcal{P}_1$ reduce to positive constants.

Remark. In the following the symbol $\mathcal{P}$ always denotes a polynomial of this type, possibly a different one each time it appears.

Proof. Using again the shorthand (2.30) in the case of one momentum, the covariance (2.27) satisfies the bounds, $0 \le \Lambda$,

\[
|\partial^w \partial_\Lambda C^{\Lambda,\Lambda_0}(k)| \le \Lambda^{-3-|w|}\, \mathcal{P}\!\left( \frac{|k|}{\Lambda} \right) e^{-\frac{k^2+m^2}{\Lambda^2}}. \tag{2.42}
\]

Here, the polynomials $\mathcal{P}$ are of respective degree $|w|$ (and obviously do not depend on $n, l$). A weaker version, used too, is

\[
|\partial^w \partial_\Lambda C^{\Lambda,\Lambda_0}(k)| \le (\Lambda+m)^{-3-|w|}\, \mathcal{P}\!\left( \frac{|k|}{\Lambda+m} \right). \tag{2.43}
\]

We first consider the tree order $l = 0$. Due to (2.38), (2.39) the bounds (2.41) evidently hold for $n \le 3$. From (2.40) and the very crude bound $|C^{\Lambda,\Lambda_0}(k)| < 2m^{-2}$ follows the claim for ($n = 4$, $w = 0$). Now in all remaining cases we have $n + |w| > 4$.
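Due to the crucial properties (2.38) they can be treated successively ascending in $n$, and for given $n$ the various $w$ dealt with in arbitrary order, by integrating the respective flow equation (2.31) downwards from the initial point $\Lambda = \Lambda_0$. In each such case the initial condition is equal to zero. Then bounds already established together with (2.43) yield

\[
\begin{aligned}
|\partial^w L_{0,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| &\le \int_\Lambda^{\Lambda_0} d\lambda\, |\partial_\lambda \partial^w L_{0,n}^{\lambda,\Lambda_0}(p_1,\dots,p_n)| \\
&\le \mathcal{P}\!\left( \frac{|p_i|}{\Lambda+m} \right) \int_\Lambda^{\Lambda_0} d\lambda\, (\lambda+m)^{4-n-|w|-1} \\
&\le \mathcal{P}\!\left( \frac{|p_i|}{\Lambda+m} \right) \frac{1}{n+|w|-4}\, (\Lambda+m)^{4-n-|w|}.
\end{aligned} \tag{2.44}
\]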
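The last inequality in (2.44) rests on an elementary integral: with $s = n + |w| - 4 > 0$, the integral of $(\lambda+m)^{-s-1}$ over $[\Lambda, \Lambda_0]$ equals $\big((\Lambda+m)^{-s} - (\Lambda_0+m)^{-s}\big)/s$, which is at most $(\Lambda+m)^{-s}/s$ for every $\Lambda_0$. A short numerical sketch, with hypothetical function names, confirms this:

```python
def power_integral_exact(lam, lam0, m, s):
    """Integral of (x + m)^(-s-1) over [lam, lam0], for s = n + |w| - 4 > 0."""
    return ((lam + m) ** (-s) - (lam0 + m) ** (-s)) / s


def power_integral_bound(lam, m, s):
    """Right-hand side used in (2.44): (lam + m)^(-s) / s, uniform in lam0."""
    return (lam + m) ** (-s) / s


def power_integral_midpoint(lam, lam0, m, s, steps=20000):
    """Midpoint-rule approximation of the same integral, as an independent check."""
    h = (lam0 - lam) / steps
    return h * sum((lam + (i + 0.5) * h + m) ** (-s - 1) for i in range(steps))


if __name__ == "__main__":
    for s in (1, 2, 3):
        print(s,
              power_integral_midpoint(0.5, 20.0, 1.0, s),
              power_integral_exact(0.5, 20.0, 1.0, s),
              power_integral_bound(0.5, 1.0, s))
```

Because the bound does not involve $\Lambda_0$, the tree-order estimate is uniform in the UV cutoff, which is the property needed in (2.41).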
Thus the assertion (2.41) is shown for the tree order. Given the bounds for $l = 0$, those of the higher loop orders can be generated inductively by successive integration of the system of flow equations (2.31):

(i) Ascending in the loop order $l$,
(ii) for fixed $l$ ascending in $n$,
(iii) for fixed $l, n$ descending with $w$ down to $w = 0$.

We observe that in the inductive order adopted the terms on the r.h.s. of a flow equation are always prior to that on the l.h.s.: the linear term has lower loop order, and in the quadratic term, because of the key properties (2.38), terms of the same loop order contribute only with a smaller value of $n$. To comply with the growth properties of the bounds (2.41) the integrations are performed as follows:

(A1) If $n + |w| > 4$, the bound decreases with increasing $\Lambda$. Hence, the flow equation is integrated from the initial point $\Lambda = \Lambda_0$ downwards to smaller values of $\Lambda$ with the initial condition

\[
\partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n) = 0, \qquad n + |w| > 4, \tag{2.45}
\]

as a consequence of the bare interaction (2.11) chosen.

(A2) In the cases $n + |w| \le 4$ the bounds increase with increasing $\Lambda$. Therefore, the corresponding flow equations are integrated for a prescribed set of momenta (the renormalization point) with the physical value $\Lambda = 0$ as initial point. The respective initial values can be freely chosen order by order, but in accordance with the (Euclidean) symmetry of the theory. As already stated before we choose vanishing momenta as renormalization point, together with the renormalization conditions (2.34)–(2.37) as initial values. Thus, for $n + |w| \le 4$,

\[
\partial^w L_{l,n}^{\Lambda,\Lambda_0}(0,\dots,0) = \partial^w L_{l,n}^{0,\Lambda_0}(0,\dots,0) + \int_0^\Lambda d\lambda\, \partial_\lambda \partial^w L_{l,n}^{\lambda,\Lambda_0}(0,\dots,0). \tag{2.46}
\]
(For $n = 1$ there is no momentum dependence and $w = 0$.) Once a bound has been obtained at the renormalization point, it is extended to general momenta using the Taylor formula

\[
f(p) = f(0) + \sum_{i=1}^{n} p_i \int_0^1 dt\, (\partial_i f)(tp) \tag{2.47}
\]

for a differentiable function on $\mathbb{R}^n$. Applying this formula, the bound of the integrand (due to the derivative) yields an additional factor $(\Lambda+m)^{-1}$ which combines with the momentum factor in front to give a new momentum bound of the type considered. To generate inductively the assertion (2.41) we use it in bounding the r.h.s. of the flow equation (2.31), together with the bounds (2.42) and (2.43) in the linear and in the quadratic term, respectively,

\[
\begin{aligned}
|\partial_\Lambda \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le{}& \int_k \frac{e^{-\frac{k^2+m^2}{\Lambda^2}}}{\Lambda^3}\, (\Lambda+m)^{4-n-2-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda+m}{m} \right) \mathcal{P}_2\!\left( \frac{|k|}{\Lambda+m}, \frac{|p_i|}{\Lambda+m} \right) \\
&+ (\Lambda+m)^{4-n-|w|-1}\, \mathcal{P}_3\!\left( \log\frac{\Lambda+m}{m} \right) \mathcal{P}_4\!\left( \frac{|p_i|}{\Lambda+m} \right).
\end{aligned}
\]

The second term on the r.h.s. results from combining a sum of such terms into a single one with new polynomials. In the first term the $k$-integration is performed substituting $k \to \Lambda k$. The result, easily majorized and combined with the second term, yields the bound

\[
|\partial_\Lambda \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le (\Lambda+m)^{4-n-|w|-1}\, \mathcal{P}_5\!\left( \log\frac{\Lambda+m}{m} \right) \mathcal{P}_6\!\left( \frac{|p_i|}{\Lambda+m} \right). \tag{2.48}
\]
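The claim that every term on the r.h.s. of (2.31) is prior in the induction can be checked by enumeration. The sketch below lists, for a target pair (loop order $l$, number of fields $n$), the pairs appearing in the linear and quadratic terms, discards products containing a factor that vanishes by (2.38), and tests the lexicographic order (first $l$, then $n$). It ignores the derivative index $w$, so it covers only steps (i) and (ii) of the scheme; the function names are hypothetical.

```python
def vanishes_at_tree_level(l, n):
    """Key properties (2.38): the tree-level 1- and 2-point functions vanish."""
    return l == 0 and n <= 2


def rhs_terms(l, n):
    """Pairs (loop order, number of fields) of the functions on the r.h.s. of
    (2.31) that can contribute to the flow of the function labelled (l, n)."""
    terms = []
    if l >= 1:
        terms.append((l - 1, n + 2))  # linear term
    for n1 in range(0, n + 1):
        n2 = n - n1
        for l1 in range(0, l + 1):
            l2 = l - l1
            first, second = (l1, n1 + 1), (l2, n2 + 1)
            if vanishes_at_tree_level(*first) or vanishes_at_tree_level(*second):
                continue  # the product is zero and does not contribute
            terms.extend([first, second])
    return terms


def is_prior(term, target):
    """Lexicographic order: lower loop order, or equal loop order and fewer fields."""
    return term < target  # tuple comparison is lexicographic


def induction_is_consistent(max_loops, max_fields):
    """True if all contributing r.h.s. terms are prior to their l.h.s. target."""
    return all(is_prior(term, (l, n))
               for l in range(0, max_loops + 1)
               for n in range(1, max_fields + 1)
               for term in rhs_terms(l, n))


if __name__ == "__main__":
    print(rhs_terms(0, 4), rhs_terms(1, 1), induction_is_consistent(3, 7))
```

Dropping the filter based on (2.38) makes the check fail, since then a same-loop-order function with $n + 1$ fields would appear on the r.h.s.; this is why the text calls these properties crucial.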
(a1) Following the order of the induction stated before, the (irrelevant) cases $n + |w| > 4$ have always to be considered first (for fixed $l, n$). In these cases the bound (2.48) is integrated downwards, observing (2.45), similarly as in the tree order, (2.44). In place of the pure power behavior, however, we now have

\[
\int_\Lambda^{\Lambda_0} d\lambda\, (\lambda+m)^{4-n-|w|-1}\, \mathcal{P}\!\left( \log\frac{\lambda+m}{m} \right) < (\Lambda+m)^{4-n-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda+m}{m} \right)
\]

with a new polynomial on the r.h.s., see the end of this section, Sec. 2.6. Thus, the assertion is established in the cases $n + |w| > 4$.

(a2) In the cases $n + |w| \le 4$ the claim (2.41) has to be deduced from the respective integrated flow equation (2.46) at the renormalization point followed by an extension to general momenta by way of (2.47), proceeding in the order of induction. That is, one starts with the particular (momentum independent) case $n = 1$ and continues successively with the cases ($n = 2$, $|w| = 2$), ($n = 2$, $|w| = 1$), and so on. Converting in the obvious way Eq. (2.46) into an inequality for absolute values, a bound on the integral is gained using the bound (2.48) at vanishing momenta:

\[
\left| \int_0^\Lambda d\lambda\, \partial_\lambda \partial^w L_{l,n}^{\lambda,\Lambda_0}(0,\dots,0) \right| \le \int_0^\Lambda d\lambda\, (\lambda+m)^{4-n-|w|-1}\, \mathcal{P}\!\left( \log\frac{\lambda+m}{m} \right) \le (\Lambda+m)^{4-n-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda+m}{m} \right),
\]

where $\mathcal{P}_1$ is a new polynomial, see Sec. 2.6. Hence, the assertion (2.41) is established at the renormalization point. In each case extension to general momenta via (2.47) is guaranteed by bounds established before. This concludes the proof of Proposition 2.1.

The boundedness due to Proposition 2.1 would still allow an oscillatory dependence on $\Lambda_0$. Such an (implausible) behaviour is excluded by the

Proposition 2.2 (Convergence). For all $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $w$ from (2.30) and for $0 \le \Lambda \le \Lambda_0$ holds

\[
|\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\, \mathcal{P}_3\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_4\!\left( \frac{|p_i|}{\Lambda+m} \right). \tag{2.49}
\]
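Since we need this proposition for large values of $\Lambda_0$ only, we then obviously can write

\[
|\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0)^2} \left( \log\frac{\Lambda_0}{m} \right)^{\nu} \mathcal{P}_4\!\left( \frac{|p_i|}{\Lambda+m} \right) \tag{2.50}
\]

with a positive integer $\nu$ depending on $l, n, w$. Integration of these bounds with respect to $\Lambda_0$ finally shows that for fixed $\Lambda$ all $L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$ converge to finite limits with $\Lambda_0 \to \infty$. In particular, one obtains for all $\Lambda_0' > \Lambda_0$:

\[
|L_{l,n}^{0,\Lambda_0}(p_1,\dots,p_n) - L_{l,n}^{0,\Lambda_0'}(p_1,\dots,p_n)| < \frac{m^{5-n}}{\Lambda_0} \left( \log\frac{\Lambda_0}{m} \right)^{\nu} \mathcal{P}_5\!\left( \frac{|p_i|}{m} \right).
\]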
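The integration with respect to $\Lambda_0$ behind the last inequality can be made explicit. Repeated integration by parts gives, for integer $\nu \ge 0$ and $x_0 > m$, $\int_{x_0}^{\infty} dx\, x^{-2} \log^{\nu}(x/m) = x_0^{-1} \sum_{j=0}^{\nu} \frac{\nu!}{(\nu-j)!} \log^{\nu-j}(x_0/m)$, i.e. again $1/x_0$ times a polynomial of degree $\nu$ in the logarithm. The following sketch (hypothetical function names) compares this closed form with a numerical quadrature; it is meant only as a plausibility check of the mechanism, not as the estimate used in the text.

```python
import math


def tail_integral_closed_form(x0, m, nu):
    """Integral over [x0, infinity) of x^(-2) * log(x/m)^nu, obtained by
    repeated integration by parts (nu a non-negative integer, x0 > m)."""
    t = math.log(x0 / m)
    total = sum(math.factorial(nu) // math.factorial(nu - j) * t ** (nu - j)
                for j in range(nu + 1))
    return total / x0


def tail_integral_numeric(x0, m, nu, span=60.0, steps=60000):
    """Simpson's rule after substituting x = m * exp(u): the integrand becomes
    exp(-u) * u^nu / m on [log(x0/m), log(x0/m) + span]; the remainder beyond
    the upper limit is exponentially small."""
    t = math.log(x0 / m)
    h = span / steps

    def integrand(u):
        return math.exp(-u) * u ** nu / m

    acc = integrand(t) + integrand(t + span)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * integrand(t + i * h)
    return acc * h / 3.0


if __name__ == "__main__":
    for x0 in (10.0, 100.0, 1000.0):
        print(x0, tail_integral_closed_form(x0, 1.0, 3),
              tail_integral_numeric(x0, 1.0, 3))
```

Since the result tends to zero as $x_0 \to \infty$, differences of the $n$-point functions at two large cutoffs become arbitrarily small.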
Thus, due to the Cauchy criterion, finite limits (2.33) exist, i.e. perturbative renormalizability of the theory considered is demonstrated.

Proof of Proposition 2.2. We integrate the system of flow equations (2.31) according to the induction scheme employed before and differentiate the individual $n$-point functions with respect to $\Lambda_0$. The r.h.s. of (2.31) will be denoted by the shorthand $\partial^w R_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$. Due to (2.38)–(2.40) the cases ($l = 0$, $n + |w| \le 4$) evidently satisfy the claim (2.49).

(b1) $n + |w| > 4$: In these cases, because of the initial condition (2.45), we have

\[
-\partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \int_\Lambda^{\Lambda_0} d\lambda\, \partial^w R_{l,n}^{\lambda,\Lambda_0}(p_1,\dots,p_n)
\]

and hence

\[
-\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \partial^w R_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n) + \int_\Lambda^{\Lambda_0} d\lambda\, \partial_{\Lambda_0} \partial^w R_{l,n}^{\lambda,\Lambda_0}(p_1,\dots,p_n). \tag{2.51}
\]
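To the first term on the r.h.s. only the quadratic part of (2.31) contributes, cf. (2.45). It is bounded using Proposition 2.1 and the bound (2.43):

\[
\begin{aligned}
|\partial^w R_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)| &\le (\Lambda_0+m)^{3-n-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_2\!\left( \frac{|p_i|}{\Lambda_0+m} \right) \\
&\le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\, \mathcal{P}_1\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_2\!\left( \frac{|p_i|}{\Lambda+m} \right),
\end{aligned} \tag{2.52}
\]

valid for $0 \le \Lambda \le \Lambda_0$, since $n + |w| > 4$. The integrand of the second term on the r.h.s. of (2.51) is the derivative with respect to $\Lambda_0$ of the r.h.s. of (2.31). Observing

\[
\partial_{\Lambda_0} \partial_\Lambda C^{\Lambda,\Lambda_0}(k) = 0, \tag{2.53}
\]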
we bound the $\Lambda_0$-derivative considered using Proposition 2.1 and, in accord with the induction hypothesis, Proposition 2.2 together with the bounds (2.42) and (2.43), which are employed in the linear and in the quadratic part, respectively. Proceeding then similarly as in deducing (2.48) yields

\[
|\partial_{\Lambda_0} \partial^w R_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n)| \le \frac{(\Lambda+m)^{5-n-|w|-1}}{(\Lambda_0+m)^2}\, \mathcal{P}_7\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_8\!\left( \frac{|p_i|}{\Lambda+m} \right). \tag{2.54}
\]

From this follows upon integration, with the bound on the momenta majorized, a bound on the second term on the r.h.s. of (2.51) that has the form (2.52). Therefore, the assertion (2.49) is deduced if $n + |w| > 4$.

(b2) $n + |w| \le 4$: Here, the respective flow equations integrated at the renormalization point (2.46) are differentiated with respect to $\Lambda_0$. They imply, observing that the initial conditions, i.e. the renormalization constants (2.34)–(2.37), do not depend on $\Lambda_0$, the bound

\[
|\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(0,\dots,0)| \le \int_0^\Lambda d\lambda\, |\partial_{\Lambda_0} \partial^w R_{l,n}^{\lambda,\Lambda_0}(0,\dots,0)|, \tag{2.55}
\]

where on the r.h.s. the shorthand introduced before has been used. In deducing the bound (2.54) no restriction on $n, w$ entered. Therefore, we can use it in (2.55) also and obtain upon integration

\[
|\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda,\Lambda_0}(0,\dots,0)| \le \frac{(\Lambda+m)^{5-n-|w|}}{(\Lambda_0+m)^2}\, \mathcal{P}_3\!\left( \log\frac{\Lambda_0+m}{m} \right). \tag{2.56}
\]
Extension of these bounds to general momenta is again achieved via the Taylor formula (2.47) as in the proof of Proposition 2.1. Thus, Proposition 2.2 is proven.

Remark. Renormalizability is a consequence of Proposition 2.2 at the value $\Lambda = 0$. From this point of view Proposition 2.1 is of preparatory, technical nature. The bounds established in both propositions are not optimal but sufficient. Their virtue is to allow a concise and complete proof of renormalizability. These bounds can be refined in various ways. We mention that Kopper and Meunier [70], by sharpening the induction hypothesis with respect to momentum derivatives of $n$-point functions, obtained optimal bounds on the momentum behaviour related to Weinberg’s theorem [71].

With Propositions 2.1 and 2.2 established, it is physically important to notice that they even remain valid when the original bare interaction (2.11) is extended by appropriately chosen irrelevant terms: It is sufficient to replace the condition (2.45) by requiring for $n + |w| > 4$:

\[
\begin{aligned}
|\partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)| &\le (\Lambda_0+m)^{4-n-|w|}\, \mathcal{P}_1\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_2\!\left( \frac{|p_i|}{\Lambda_0+m} \right), \\
|\partial_{\Lambda_0} \partial^w L_{l,n}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_n)| &\le (\Lambda_0+m)^{3-n-|w|}\, \mathcal{P}_3\!\left( \log\frac{\Lambda_0+m}{m} \right) \mathcal{P}_4\!\left( \frac{|p_i|}{\Lambda_0+m} \right);
\end{aligned} \tag{2.57}
\]
evidently, we can also write $\Lambda_0$ instead of $\Lambda_0 + m$ everywhere. One first observes that these bounds agree with Propositions 2.1 and 2.2 considered at $\Lambda = \Lambda_0$. Moreover, as bounds on the initial conditions to be added in (2.44), (a1) and (2.51), respectively, they can be absorbed in the corresponding bounds on the integrals appearing.

2.4. Insertion of a composite field

Besides the system of $n$-point functions dealt with up to now, $n$-point functions with one or more additional composite fields inserted are of considerable physical interest. In particular, the generators of symmetry transformations of a theory appear generally in the form of composite fields. But there are further instances where inserted composite fields (sometimes also called inserted operators) occur. In the sequel we treat the perturbative renormalization of one composite field inserted. Since, by definition, a composite field depends nonlinearly on the basic field (or fields) of the theory considered, new divergences have to be circumvented and hence additional renormalization conditions are required.

As before, we examine the quantum field theory of a real scalar field $\phi(x)$ with mass $m$ in four-dimensional Euclidean space-time. Then, a composite field $Q(x)$ is a local polynomial formed in general of the field $\phi(x)$ and of its space-time derivatives. It is determined by its classical version $Q_{\mathrm{class}}(x)$. If we restrict ourselves to achieving a renormalized theory with one insertion, $Q(x)$ has to be chosen as follows: Let $Q_{\mathrm{class}}(x)$ be a monomial having the canonical mass dimension $D$, then

\[
Q(x) = Q_{\mathrm{class}}(x) + Q_{\mathrm{c.t.}}(x), \tag{2.58}
\]
where $Q_{\mathrm{c.t.}}(x)$ is a polynomial which is formed of all local terms of canonical mass dimension $\le D$. This polynomial $Q_{\mathrm{c.t.}}(x)$ acts as counterterm. If $Q_{\mathrm{class}}(x)$ shows a symmetry not violated in the intermediate process of regularization, this symmetry can be imposed on $Q_{\mathrm{c.t.}}(x)$, too. Since the regularization (2.8) keeps the Euclidean symmetry, the counterterms $Q_{\mathrm{c.t.}}(x)$ can be restricted to those showing the same tensor type as $Q_{\mathrm{class}}(x)$. We illustrate the notion introduced with the example of a scalar composite field having $D = 3$:

\[
Q_{\mathrm{class}}(x) = \frac{1}{3!}\, \phi(x)^3, \tag{2.59}
\]

\[
Q_{\mathrm{c.t.}}(x) = \frac{1}{3!}\, r_1(\Lambda_0)\phi(x)^3 - r_2(\Lambda_0)\Delta\phi(x) + \frac{1}{2!}\, r_3(\Lambda_0)\phi(x)^2 + r_4(\Lambda_0)\phi(x) + r_5(\Lambda_0), \tag{2.60}
\]
with coefficients $r_i(\Lambda_0) = O(\hbar)$, $i = 1,\dots,5$. Our aim here is to show the renormalizability of the theory considered in Sec. 2.3 with one insertion of a scalar composite field $Q(x)$ of mass dimension $D$. It will turn out that we can essentially proceed as before, taking minor modifications into account. Hence, we can refrain from repeating definitions and arguments already introduced. In place of the bare interaction (2.11) one starts with a modified one:

\[
\tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi) + \tilde I^{\Lambda_0,\Lambda_0}(\varrho) = L^{\Lambda_0,\Lambda_0}(\phi) + \int dx\, \varrho(x) Q(x), \qquad \tilde L^{\Lambda_0,\Lambda_0}(\varrho;0) = 0, \tag{2.61}
\]

where the composite field (2.58), coupled to an external source $\varrho \in C^\infty(\Omega)$, has been added.^e Then, as in (2.12), the generating functional of regularized Schwinger functions with insertions $Q$ is obtained upon functional integration:

\[
\tilde Z^{\Lambda,\Lambda_0}(\varrho;J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar}\left( \tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi) + \tilde I^{\Lambda_0,\Lambda_0}(\varrho) \right) + \frac{1}{\hbar}\langle \phi, J\rangle}. \tag{2.62}
\]

Moreover, passing similarly as before to regularized amputated truncated Schwinger functions with insertions, the Eqs. (2.16) and (2.17) are replaced by

\[
e^{-\frac{1}{\hbar}\left( \tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi) + \tilde I^{\Lambda,\Lambda_0}(\varrho) \right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar}\left( \tilde L^{\Lambda_0,\Lambda_0}(\varrho;\phi+\varphi) + \tilde I^{\Lambda_0,\Lambda_0}(\varrho) \right)}, \tag{2.63}
\]

\[
\tilde L^{\Lambda,\Lambda_0}(\varrho;0) = 0. \tag{2.64}
\]

^e $\tilde I^{\Lambda_0,\Lambda_0}(\varrho)$ is the field independent part that possibly has to enter the modified bare action, as e.g. in (2.60).
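In view of the implicit notation (2.58) we stress that the shift of the field to $\phi + \varphi$ involves the field dependent insertion $Q(x)$ in (2.61), too. In exactly the same way as (2.18) was obtained, we find the relation between the generating functionals $\tilde L$ and $\tilde Z$:

\[
e^{-\frac{1}{\hbar}\left( \tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi) + \tilde I^{\Lambda,\Lambda_0}(\varrho) \right)} = e^{-\frac12 \langle \varphi,\, (\hbar C^{\Lambda,\Lambda_0})^{-1} \varphi\rangle}\, \tilde Z^{\Lambda,\Lambda_0}\big(\varrho; (C^{\Lambda,\Lambda_0})^{-1}\varphi\big). \tag{2.65}
\]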
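Since (2.63) and (2.16) have the same form we obtain the flow equation of the functional $\tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)$ by substituting in the flow equation (2.21):

\[
L^{\Lambda,\Lambda_0}(\varphi) \to \tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi), \qquad I^{\Lambda,\Lambda_0} \to \tilde I^{\Lambda,\Lambda_0}(\varrho).
\]

As a consequence the generating functional of the amputated truncated Schwinger functions with one insertion $Q$,

\[
L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) := \frac{\delta}{\delta\varrho(x)}\, \tilde L^{\Lambda,\Lambda_0}(\varrho;\varphi)\Big|_{\varrho(x)=0}, \tag{2.66}
\]

then satisfies the flow equation

\[
\frac{d}{d\Lambda}\left( L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) + I_{(1)}^{\Lambda,\Lambda_0}(x) \right) = \frac{\hbar}{2} \left\langle \frac{\delta}{\delta\varphi},\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \right\rangle L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) - \left\langle \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi),\, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi) \right\rangle, \tag{2.67}
\]

involving the vacuum part with one insertion

\[
I_{(1)}^{\Lambda,\Lambda_0}(x) := \frac{\delta}{\delta\varrho(x)}\, \tilde I^{\Lambda,\Lambda_0}(\varrho)\Big|_{\varrho(x)=0}. \tag{2.68}
\]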
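In deriving the r.h.s. of (2.67) use of the symmetry $\dot C^{\Lambda,\Lambda_0}(x-y) = \dot C^{\Lambda,\Lambda_0}(y-x)$ has been made. We note that the functional $L_{(1)}$ satisfies a linear equation. Because of the insertion the full flow equation (2.67) can be studied in the infinite volume limit $\Omega \to \mathbb{R}^4$, $\varphi \in \mathcal{S}(\mathbb{R}^4)$. The Fourier transform with respect to the insertion is defined as

\[
\hat L_{(1)}^{\Lambda,\Lambda_0}(q;\varphi) := \int dx\, e^{iqx}\, L_{(1)}^{\Lambda,\Lambda_0}(x;\varphi), \tag{2.69}
\]

\[
\hat I_{(1)}^{\Lambda,\Lambda_0}(q) := \int dx\, e^{iqx}\, I_{(1)}^{\Lambda,\Lambda_0}(x) = (2\pi)^4\, i^{\Lambda,\Lambda_0}\, \delta(q). \tag{2.70}
\]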
Furthermore, the generating functional is decomposed, observing the conventions (2.24), n ∈ N, ˆ Λ,Λ0 (2π)4(n−1) δϕ(p ˆ n ) · · · δϕ(p ˆ 1 ) L(1) (q; ϕ)|ϕ=0 0 = δ(q + p1 + · · · + pn ) LΛ,Λ (1) n (q; p1 , . . . , pn ) .
(2.71)
The amputated truncated n-point function with one insertion carrying the momentum q, 0 LΛ,Λ (1) n (q; p1 , . . . , pn ) ,
is at fixed $q$ totally symmetric in the momenta $p_1,\dots,p_n$. Furthermore, the sum of all momenta has to vanish because of the $\delta$-constraint in (2.71). From (2.67) and by proceeding exactly as before from (2.25) to (2.31), after a loop expansion of the $n$-point functions, $n \in \mathbb{N}$, and of the vacuum part $i^{\Lambda,\Lambda_0}$,
\[ L^{\Lambda,\Lambda_0}_{(1)\,n}(q;p_1,\dots,p_n) = \sum_{l=0}^{\infty} \hbar^l\, L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n), \tag{2.72} \]
we arrive at the system of flow equations with one insertion:
\[ \partial_\Lambda i^{\Lambda,\Lambda_0}_l = \frac12 \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot L^{\Lambda,\Lambda_0}_{(1)\,l-1,2}(0;k,-k) - {\sum_{l_1,l_2}}' L^{\Lambda,\Lambda_0}_{l_1,1}(0)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(0) \cdot L^{\Lambda,\Lambda_0}_{(1)\,l_2,1}(0;0), \tag{2.73} \]
\[ \begin{aligned} \partial_\Lambda \partial^w L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n) ={}& \frac12 \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \partial^w L^{\Lambda,\Lambda_0}_{(1)\,l-1,n+2}(q;k,-k,p_1,\dots,p_n) \\ & - {\sum_{n_1,n_2}}' {\sum_{l_1,l_2}}' {\sum_{w_1,w_2,w_3}}' c_{\{w_i\}} \Big[ \partial^{w_1} L^{\Lambda,\Lambda_0}_{l_1,n_1+1}(p_1,\dots,p_{n_1},p) \cdot \partial^{w_3}\partial_\Lambda C^{\Lambda,\Lambda_0}(p) \\ & \qquad \cdot\, \partial^{w_2} L^{\Lambda,\Lambda_0}_{(1)\,l_2,n_2+1}(q;-p,p_{n_1+1},\dots,p_n) \Big]_{rsym}, \end{aligned} \tag{2.74} \]
\[ p = -p_1 - \cdots - p_{n_1} = q + p_{n_1+1} + \cdots + p_n . \]
The notation used above has been introduced in (2.28)–(2.31). Furthermore, the derivatives $\partial^w$ can be restricted to the momenta $p_1,\dots,p_n$, because of $q = -p_1 - \cdots - p_n$. The vacuum part does not act back on the functional $L_{(1)}$. We therefore disregard its flow (2.73) and just state that in each loop order the bare parameter $i_l^{\Lambda_0,\Lambda_0}$ is determined by a renormalization constant $i_l^{0,\Lambda_0}$ prescribed at $\Lambda = 0$. The task is to show that finite limits
\[ \lim_{\Lambda_0\to\infty}\, \lim_{\Lambda\to 0}\, L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n), \qquad n \in \mathbb{N},\ l \in \mathbb{N}_0, \tag{2.75} \]
can be obtained, given the $n$-point functions without insertion which satisfy the Propositions 2.1 and 2.2. This can be achieved following closely the corresponding steps performed before without insertion in Sec. 2.3; the demonstration here can thus be presented in a concise way. Due to (2.69)–(2.70), (2.66), (2.61), the bare functional is given by
\[ \hat L^{\Lambda_0,\Lambda_0}_{(1)}(q;\varphi) + \hat I^{\Lambda_0,\Lambda_0}_{(1)}(q) := \int dx\, e^{iqx}\, Q(x). \tag{2.76} \]
The tree order is completely determined by the classical part of the insertion (2.58). This classical part $Q_{\rm class}$ yields the initial condition in integrating the system of flow equations (2.74) for $l = 0$ and general momenta from the initial point $\Lambda = \Lambda_0$
downwards to smaller values of $\Lambda$, ascending successively in $n$. It is not necessary to prescribe the order in which the derivatives $\partial^w$ are treated. Since $Q_{\rm class}$ is assumed to be a monomial of canonical mass dimension $D$ and containing $n_0$ field factors, $2 \le n_0 \le D$, the key properties (2.38) imply for all $w$:
\[ \partial^w L^{\Lambda,\Lambda_0}_{(1)\,0,1}(q;p) = 0, \qquad \partial_\Lambda \partial^w L^{\Lambda,\Lambda_0}_{(1)\,0,2}(q;p_1,p_2) = 0. \tag{2.77} \]
The nonvanishing tree order starts at $n = n_0$ with
\[ \partial^w L^{\Lambda,\Lambda_0}_{(1)\,0,n_0}(q;\{p_i\}) = \partial^w L^{\Lambda_0,\Lambda_0}_{(1)\,0,n_0}(q;\{p_i\}), \tag{2.78} \]
i.e. the initial condition. Of course the limits (2.75) exist in the tree order (since no integrations occur). In view of the mass dimension $D$ of the insertion, the bounds
\[ \big|\partial^w L^{\Lambda,\Lambda_0}_{(1)\,0,n}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|}\, \mathcal{P}\Big(\frac{|p_i|}{\Lambda+m}\Big) \tag{2.79} \]
are established for later use. If $n_0 < D$, apart from (2.78) there are other relevant instances $n + |w| \le D$; they should be evaluated explicitly without using the bounds in the flow equation. (It is instructive to compare the examples $Q_{\rm class} = \phi\Delta\phi$ and $Q_{\rm class} = \phi^4$, both having $D = 4$.) The irrelevant cases $n + |w| > D$ in (2.79) are established similarly as (2.44).
For $l > 0$ the coefficients of the counterterms inherent in (2.76), extracted via (2.71) and (2.72), have to depend on $\Lambda_0$,
\[ \partial^w L^{\Lambda_0,\Lambda_0}_{(1)\,l,n}(0;0,\dots,0) = r_{l,n,w}(\Lambda_0), \qquad n + |w| \le D,\ l > 0. \tag{2.80} \]
They are determined by prescribing related renormalization conditions for vanishing flow parameter $\Lambda$ at a renormalization point which is again chosen at vanishing momenta. Hence, order-by-order, the real constants, $l > 0$,
\[ \partial^w L^{0,\Lambda_0}_{(1)\,l,n}(0;0,\dots,0) =: r^R_{l,n,w}, \qquad n + |w| \le D, \tag{2.81} \]
can be freely chosen, provided they respect the symmetry of $L^{\Lambda,\Lambda_0}_{(1)}$.
Proposition 2.3 (Boundedness). Let $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $w$ from (2.30) and $0 \le \Lambda \le \Lambda_0$, then
\[ \big|\partial^w L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|}\, \mathcal{P}_1\Big(\log\frac{\Lambda+m}{m}\Big)\, \mathcal{P}_2\Big(\frac{|p_i|}{\Lambda+m}\Big). \tag{2.82} \]
The symbol $\mathcal{P}$ denotes polynomials with nonnegative coefficients which depend on $l, n, w$ but not on $\{p_i\}, \Lambda, \Lambda_0$. For $l = 0$ all polynomials $\mathcal{P}_1$ reduce to positive constants.
Proof. Due to (2.79), the assertion (2.82) is already shown in the tree order. Given the set of $n$-point functions without insertions satisfying Proposition 2.1, one proceeds for $l > 0$ inductively as in the proof of the latter proposition, i.e. (i) ascending
in $l$, (ii) at fixed $l$ ascending in $n$, (iii) at fixed $l, n$ descending in $w$. Inspecting the flow equations (2.74), it is easily seen that for any given $l, n, w$ on the l.h.s. the contributions to the r.h.s., because of the key properties (2.38), always precede those on the l.h.s. in the order of induction adopted. Imitating the steps leading to (2.48) provides the bound for $l, n \in \mathbb{N}$,
\[ \big|\partial_\Lambda \partial^w L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n)\big| \le (\Lambda+m)^{D-n-|w|-1}\, \mathcal{P}_5\Big(\log\frac{\Lambda+m}{m}\Big)\, \mathcal{P}_6\Big(\frac{|p_i|}{\Lambda+m}\Big). \tag{2.83} \]
(a1) In the cases $n + |w| > D$ this bound is integrated downwards from the initial point $\Lambda = \Lambda_0$ with vanishing initial conditions, see (2.76). From this (2.82) follows easily for $n + |w| > D$.
(a2) If $n + |w| \le D$, however, the respective flow equation (2.74) has to be integrated upwards at the renormalization point, as in (2.46), employing now the renormalization conditions (2.81). The bound (2.83) then implies the claim (2.82) at vanishing momenta. Again as in Sec. 2.3, extension to general momenta is accomplished by appealing to the Taylor formula (2.47). Thus, Proposition 2.3 is proven.
In complete analogy with the steps taken in Sec. 2.3, the proposition just proven prepares the decisive
Proposition 2.4 (Convergence). Let $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $w$ from (2.30), $0 \le \Lambda \le \Lambda_0$, and $\Lambda_0$ sufficiently large, then
\[ \big|\partial_{\Lambda_0} \partial^w L^{\Lambda,\Lambda_0}_{(1)\,l,n}(q;p_1,\dots,p_n)\big| \le \frac{(\Lambda+m)^{D+1-n-|w|}}{(\Lambda_0)^2}\, \Big(\log\frac{\Lambda_0}{m}\Big)^{\nu}\, \mathcal{P}_4\Big(\frac{|p_i|}{\Lambda+m}\Big) \tag{2.84} \]
with a positive integer $\nu$ depending on $l, n, w$ only.
Proof. One first verifies the assertion directly in the tree order for the relevant cases $n + |w| \le D$. From there, the further course of the proof is just a replica of the proof given for Proposition 2.2. As a consequence, in each place where the exponent $4 + 1 - n - |w|$ appears there, this exponent is changed here into $D + 1 - n - |w|$, thus proving the assertion (2.84).
Finally, integration of the bound (2.84) of Proposition 2.4 demonstrates that the renormalized regularized $n$-point functions with one insertion have finite limits (2.75).
2.5. Finite temperature field theory
There are essentially two formulations of quantum fields at finite temperature: a real-time approach to treat dynamical effects, and an imaginary-time approach to
describe equilibrium properties [72]. In this section the problem of renormalization in a temperature independent way is considered. Such a renormalization is required when studying the $T$-dependence of observables, since then the relation between bare and renormalized coupling constants must not depend on the temperature. Our aim here is to show within the imaginary-time formalism that a quantum field theory renormalized at $T = 0$ also stays renormalized at any $T > 0$. For the sake of a succinct presentation the symmetric $\Phi^4$-theory is treated and the generalization to the nonsymmetric theory is stated at the end.
The first steps to be taken do not differ from the zero temperature case: Starting from a finite domain, given by a 4-dimensional torus $\Omega$, and the Gaussian measure with the regularized covariance (2.7), we obtain Wilson’s flow equation (2.21). Here, the bare interaction (2.11) is restricted to the symmetric theory, as already mentioned, i.e. putting
\[ f = v(\Lambda_0) = b(\Lambda_0) \equiv 0. \tag{2.85} \]
Disregarding as before the flow of the vacuum part $I^{\Lambda,\Lambda_0}$, we imagine at least one functional derivative acting on the flow equation (2.21). Then we can pass to the spatial infinite volume limit, but keeping the periodicity in the imaginary time $x_4$ and choosing the period equal to the inverse temperature: $l_4 = \beta \equiv 1/T$. Hence, in this limit the space-time domain is $\mathbb{R}^3 \times S^1$ and the theory shows the reduced symmetry $O(3) \times Z_2$, as compared to the $O(4)$-symmetry at $T = 0$. Correspondingly, the dual Fourier variables (momentum vectors) are
\[ \underline{p} := (\vec p, \underline{p}_4), \qquad \vec p \in \mathbb{R}^3, \qquad \underline{p}_4 = 2\pi n T, \qquad n \in \mathbb{Z}, \tag{2.86} \]
and hence we define
\[ \int_{\underline{p}} := T \sum_{n\in\mathbb{Z}} \int_{\mathbb{R}^3} \frac{d^3p}{(2\pi)^3}. \tag{2.87} \]
In the sequel we underline a symbol denoting a quantity at finite temperature or write the $T$-dependence explicitly. In place of (2.24) the Fourier transform takes the form
\[ \varphi(x) = \int_{\underline{p}} e^{i\underline{p}x}\, \hat\varphi(\underline{p}), \qquad \hat\varphi(\underline{p}) = \int_0^\beta dx_4 \int_{\mathbb{R}^3} d^3x\, e^{-i\underline{p}x}\, \varphi(x), \tag{2.88} \]
implying for a functional derivative:
\[ \delta_{\varphi(x)} = \frac{(2\pi)^3}{T} \int_{\underline{p}} e^{-i\underline{p}x}\, \delta_{\hat\varphi(\underline{p})}, \qquad \delta_{\hat\varphi(\underline{p})} = \frac{T}{(2\pi)^3} \int_0^\beta dx_4 \int_{\mathbb{R}^3} d^3x\, e^{i\underline{p}x}\, \delta_{\varphi(x)}. \tag{2.89} \]
Furthermore, the regularized covariance (2.27) is restricted to momenta (2.86),
\[ C^{\Lambda,\Lambda_0}(\underline{p}) = \frac{1}{\underline{p}^2+m^2} \Big( e^{-\frac{\underline{p}^2+m^2}{\Lambda_0^2}} - e^{-\frac{\underline{p}^2+m^2}{\Lambda^2}} \Big). \tag{2.90} \]
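As a small numerical illustration of (2.86) and (2.90), the following Python sketch evaluates the regularized covariance on a Matsubara momentum and compares a finite-difference $\Lambda$-derivative with the expression $-\frac{2}{\Lambda^3}e^{-(\underline{p}^2+m^2)/\Lambda^2}$ obtained by differentiating (2.90) by hand. The function names, the default mass `m = 1.0` and all numerical values are assumptions made for this illustration only; they do not appear in the text.

```python
import math


def matsubara_p2(pvec2, n, T):
    """Squared momentum p^2 = |p_vec|^2 + (2*pi*n*T)^2 for a momentum of the form (2.86).

    pvec2 is the squared spatial momentum, n the (integer) Matsubara index.
    """
    return pvec2 + (2.0 * math.pi * n * T) ** 2


def cov(p2, lam, lam0, m=1.0):
    """Regularized covariance (2.90), written as a function of p^2."""
    a = p2 + m * m
    return (math.exp(-a / lam0**2) - math.exp(-a / lam**2)) / a


def dcov_dlam(p2, lam, m=1.0):
    """Lambda-derivative of (2.90), computed by hand: -(2/Lambda^3) exp(-(p^2+m^2)/Lambda^2)."""
    a = p2 + m * m
    return -2.0 / lam**3 * math.exp(-a / lam**2)


if __name__ == "__main__":
    p2 = matsubara_p2(pvec2=0.3, n=2, T=0.1)
    lam, lam0, h = 1.5, 50.0, 1e-6
    finite_diff = (cov(p2, lam + h, lam0) - cov(p2, lam - h, lam0)) / (2.0 * h)
    print("finite difference:", finite_diff)
    print("analytic         :", dcov_dlam(p2, lam))
    # Removing both cutoffs (Lambda -> 0, Lambda_0 -> infinity) recovers 1/(p^2+m^2).
    print("cutoffs removed  :", cov(p2, 1e-3, 1e6), "vs", 1.0 / (p2 + 1.0))
```

The derivative is independent of $\Lambda_0$ and decays like a Gaussian in the momentum, which is the property that makes the loop integrals in the flow equations converge.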
Denoting by $L^{\Lambda,\Lambda_0}(\varphi;T)$ the generating functional of the amputated truncated Schwinger functions at finite temperature $T$, we define the $n$-point functions, $n \in \mathbb{N}$, similar to (2.25) as (see footnote f)
\[ \Big(\frac{(2\pi)^3}{T}\Big)^{n-1} \delta_{\hat\varphi(\underline{p}_1)} \cdots \delta_{\hat\varphi(\underline{p}_n)}\, L^{\Lambda,\Lambda_0}(\varphi;T)\Big|_{\varphi\equiv 0} = \delta(\vec p_1 + \cdots + \vec p_n)\, \delta_{0,(\underline{p}_1+\cdots+\underline{p}_n)_4}\, L^{\Lambda,\Lambda_0}_n(\underline{p}_1,\dots,\underline{p}_n;T). \tag{2.91} \]
These $n$-point functions, after a respective loop expansion in complete analogy to (2.29), then satisfy a system of flow equations obtained from (2.31) by replacing every momentum vector appearing by its underlined analogue, and moreover, restricting the momentum derivatives $\partial^w$ to spatial momentum components. Employing this system of flow equations, we could prove renormalizability of the theory at finite temperature proceeding similarly as in the case of zero temperature. However, because of the reduced space-time symmetry, the renormalization conditions (2.34)–(2.37) for $l \ge 1$ would have to be extended by an additional constant:
\[ L^{0,\Lambda_0}_{l,2}(\underline{p},-\underline{p};T) = a^R_l(T) + z^{R,1}_l(T)\, \vec p^{\,2} + z^{R,2}_l(T)\, \underline{p}_4^2 + O(\underline{p}^4), \tag{2.92} \]
\[ L^{0,\Lambda_0}_{l,4}(0,0,0,0;T) = c^R_l(T). \tag{2.93} \]
The constants for $n = 1, 3$ are set equal to zero in the symmetric theory (2.85). Only at $T = 0$, the emerging $O(4)$-symmetry implies the equality $z^{R,1}_l(0) = z^{R,2}_l(0)$.
Our aim is to prove renormalizability in a temperature independent way, i.e. with counterterms that do not depend on the temperature. In this case, the renormalization constants $a^R_l(T)$, $z^{R,1}_l(T)$, $z^{R,2}_l(T)$, $c^R_l(T)$ cannot be prescribed arbitrarily, since they are related dynamically to the three renormalization constants at $T = 0$. Therefore, we follow a different course and study the respective difference of an $n$-point function at $T > 0$ and at $T = 0$, $n \in \mathbb{N}$, with momenta $\{\underline{p}\} \equiv (\underline{p}_1,\dots,\underline{p}_n)$ of the form (2.86):
\[ D^{\Lambda,\Lambda_0}_{l,n}(\{\underline{p}\}) := L^{\Lambda,\Lambda_0}_{l,n}(\{\underline{p}\};T) - L^{\Lambda,\Lambda_0}_{l,n}(\{\underline{p}\}). \tag{2.94} \]
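These functions are well-defined. From the system of flow equations (2.31) and from its analogue at finite temperature follows the system of flow equations satisfied by the difference functions (2.94), with $l, n \in \mathbb{N}$:
\[ \begin{aligned} \partial_\Lambda D^{\Lambda,\Lambda_0}_{l,n}(\{\underline{p}\}) ={}& \frac12 \int_{\underline{k}} \partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{k}) \cdot D^{\Lambda,\Lambda_0}_{l-1,n+2}(\underline{k},-\underline{k},\{\underline{p}\}) \\ & + \frac12 \Big( \int_{\underline{k}} \partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{k}) \cdot L^{\Lambda,\Lambda_0}_{l-1,n+2}(\underline{k},-\underline{k},\{\underline{p}\}) - \int_{k} \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot L^{\Lambda,\Lambda_0}_{l-1,n+2}(k,-k,\{\underline{p}\}) \Big) \\ & - \frac12 {\sum_{n_1,n_2}}' {\sum_{l_1,l_2}}' \Big[ L^{\Lambda,\Lambda_0}_{l_1,n_1+1}(\underline{p}_1,\dots,\underline{p}_{n_1},\underline{p};T)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{p}) \cdot D^{\Lambda,\Lambda_0}_{l_2,n_2+1}(-\underline{p},\underline{p}_{n_1+1},\dots,\underline{p}_n) \Big]_{rsym} \\ & - \frac12 {\sum_{n_1,n_2}}' {\sum_{l_1,l_2}}' \Big[ D^{\Lambda,\Lambda_0}_{l_1,n_1+1}(\underline{p}_1,\dots,\underline{p}_{n_1},\underline{p})\, \partial_\Lambda C^{\Lambda,\Lambda_0}(\underline{p}) \cdot L^{\Lambda,\Lambda_0}_{l_2,n_2+1}(-\underline{p},\underline{p}_{n_1+1},\dots,\underline{p}_n) \Big]_{rsym}, \end{aligned} \tag{2.95} \]
\[ \underline{p} = -\underline{p}_1 - \cdots - \underline{p}_{n_1} = \underline{p}_{n_1+1} + \cdots + \underline{p}_n . \]
Footnote f: In the symmetric theory the $n$-point functions with odd $n$ vanish.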
In the (tree) order $l = 0$ we infer directly
\[ D^{\Lambda,\Lambda_0}_{0,n}(\underline{p}_1,\dots,\underline{p}_n) = 0, \qquad n \in \mathbb{N}, \tag{2.96} \]
since on the set of momenta considered the $n$-point function at $T = 0$ is equal to the $n$-point function at $T > 0$ in this order. The assertion of a temperature independent renormalization now requires the bare difference functions to vanish for $l \ge 1$:
\[ D^{\Lambda_0,\Lambda_0}_{l,n}(\underline{p}_1,\dots,\underline{p}_n) = 0, \qquad l, n \in \mathbb{N}. \tag{2.97} \]
Given the bounds (2.41) and (2.49) satisfied by the $n$-point functions at zero temperature, then follows the
Theorem. For $l, n \in \mathbb{N}$ and for $0 \le \Lambda \le \Lambda_0$ holds,
\[ \big|D^{\Lambda,\Lambda_0}_{l,n}(\underline{p}_1,\dots,\underline{p}_n)\big| \le (\Lambda+m)^{-s-n}\, \mathcal{P}_1\Big(\log\frac{\Lambda+m}{m}\Big)\, \mathcal{P}_2\Big(\Big\{\frac{|\underline{p}_i|}{\Lambda+m}\Big\}\Big), \tag{2.98} \]
\[ \big|\partial_{\Lambda_0} D^{\Lambda,\Lambda_0}_{l,n}(\underline{p}_1,\dots,\underline{p}_n)\big| \le \frac{(\Lambda+m)^{-s-n}}{(\Lambda_0)^2}\, \mathcal{P}_3\Big(\log\frac{\Lambda_0}{m}\Big)\, \mathcal{P}_4\Big(\Big\{\frac{|\underline{p}_i|}{\Lambda+m}\Big\}\Big). \tag{2.99} \]
The polynomials $\mathcal{P}$ have positive coefficients, which depend on $l, n, s, m$ and (smoothly) on $T$, but not on $\{\underline{p}\}, \Lambda, \Lambda_0$. The positive integer $s$ may be chosen arbitrarily.
The $n$-point functions at finite temperature $T$, $L^{\Lambda,\Lambda_0}_{l,n}(\underline{p}_1,\dots,\underline{p}_n;T)$, when renormalized with the same counterterms as the zero temperature functions, (2.97), satisfy the bounds (2.41) and (2.49) restricted to the case $w = 0$ and to momenta (2.86). The coefficients in the polynomials $\mathcal{P}$ may now depend also (smoothly) on $T$.
For the proof we refer to [43], pp. 396–399, and just indicate that the system of flow equations (2.95) is integrated inductively from the initial point $\Lambda = \Lambda_0$ downwards, observing (2.97). The difference of the two terms not involving any function $D^{\Lambda,\Lambda_0}_{l',n'}$, which appears in (2.95), however, is not accessible by induction. It is bounded separately, matching the sharp bound on $\Lambda$ asserted, by use of the Euler–MacLaurin formula, see e.g. [73].
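Why the difference between a Matsubara sum and the corresponding zero temperature integral can decay faster than any power of $\Lambda$ may be seen in a one-dimensional toy example. The Python sketch below is not the estimate of [43]; it replaces the integrand of (2.95) by the single Gaussian $e^{-(k_4^2+m^2)/\Lambda^2}$, which mimics the $k_4$-dependence of $\partial_\Lambda C^{\Lambda,\Lambda_0}$, and compares $T\sum_n$ over $k_4 = 2\pi n T$ with $\int dk_4/(2\pi)$. All function names and parameter values are assumptions made for this illustration.

```python
import math


def zero_temperature_integral(lam, m=1.0):
    """Closed form of the Gaussian integral  int dk/(2*pi) exp(-(k^2+m^2)/lam^2)."""
    return lam / (2.0 * math.sqrt(math.pi)) * math.exp(-(m / lam) ** 2)


def matsubara_sum(lam, T, m=1.0):
    """T * sum_n exp(-((2*pi*n*T)^2+m^2)/lam^2), truncated where |k_4| > 10*lam.

    Terms beyond the truncation are smaller than exp(-100) relative to the n=0 term.
    """
    nmax = int(10.0 * lam / (2.0 * math.pi * T)) + 10
    total = 0.0
    for n in range(-nmax, nmax + 1):
        k4 = 2.0 * math.pi * n * T
        total += math.exp(-(k4 * k4 + m * m) / lam**2)
    return T * total


def relative_difference(lam, T, m=1.0):
    """(finite-T sum - zero-T integral) / zero-T integral for the toy integrand."""
    z = zero_temperature_integral(lam, m)
    return (matsubara_sum(lam, T, m) - z) / z


if __name__ == "__main__":
    T = 1.0
    for lam in (1.0, 2.0, 4.0, 8.0):
        print(f"Lambda/T = {lam / T:4.1f}   relative difference = {relative_difference(lam, T):.3e}")
```

For this Gaussian the Poisson summation formula gives the relative difference in closed form, $2\sum_{k\ge1} e^{-k^2\Lambda^2/(4T^2)}$, so the printed values fall off faster than any inverse power of $\Lambda/T$. This is in line with the freedom to choose the integer $s$ arbitrarily in (2.98), although the actual proof handles the full momentum dependence.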
Due to the theorem, the $n$-point functions of the theory at $T > 0$, renormalized at zero temperature, satisfy the bound (2.49) for $w = 0$ and momenta (2.86). Hence, they have finite limits
\[ \lim_{\Lambda_0\to\infty} L^{0,\Lambda_0}_{l,n}(\underline{p}_1,\dots,\underline{p}_n;T), \qquad l, n \in \mathbb{N}, \]
upon removing the UV-cutoff $\Lambda_0$.
As already indicated before, a finite theory at given temperature $T_0 > 0$ could also be generated by imposing renormalization conditions at this temperature. The price to be paid (in the symmetric theory considered) is, in each loop order, the four constants $a^R_l(T_0)$, $z^{R,1}_l(T_0)$, $z^{R,2}_l(T_0)$, $c^R_l(T_0)$ of (2.92) and (2.93), instead of the three constants $a^R_l$, $z^R_l$, $c^R_l$ at zero temperature. However, an arbitrary choice of $z^{R,1}_l(T_0)$, $z^{R,2}_l(T_0)$ would not correspond to a theory at zero temperature, which shows the $O(4)$-symmetry of Euclidean space-time. Starting from an $O(4)$-invariant theory at zero temperature, the functional
\[ L^{\Lambda,\Lambda_0}(\varphi;T) - L^{\Lambda,\Lambda_0}(\varphi) \tag{2.100} \]
with initial condition (2.97) has been proven to satisfy the bound (2.99). Hence, the function $D^{0,\Lambda_0}_{l,2}(\underline{p},-\underline{p})$, converging for all $l$ with $\Lambda_0 \to \infty$ to a finite limit, produces a dynamical relation between the renormalization constants $z^{R,1}_l(T_0)$ and $z^{R,2}_l(T_0)$, i.e. fixing one of them determines the other. Thus, a renormalization does not depend on temperature if this relation is satisfied. It becomes manifest in the equality $z^1_l(\Lambda_0) = z^2_l(\Lambda_0)$ of the corresponding bare parameters.
Concluding, we remark that the proof can be easily extended to the nonsymmetric $\Phi^4$-theory. In this case, the $n$-point functions with $n$ odd no longer vanish, since the $Z_2$-symmetry is now lacking. Hence, the bare interaction will be of the general form (2.11). Correspondingly, the theory at zero temperature is renormalized by the conditions (2.34)–(2.37), involving five renormalization constants. Proceeding inductively as before, considering now odd and even values of $n$, establishes the theorem for the nonsymmetric theory, too.
2.6. Elementary estimates
Here, rather obvious estimates on some elementary integrals are listed, which we used repeatedly in generating inductive bounds on Schwinger functions.
(a1) In the irrelevant cases, the integrals have the form
\[ \int_a^b dx\, x^{-r-1} (\log x)^s, \qquad \text{with } 1 \le a \le b \text{ and } r \in \mathbb{N},\ s \in \mathbb{N}_0 . \]
Defining correspondingly the function
\[ f_{r,s}(x) := \frac{1}{r}\, x^{-r} \Big( (\log x)^s + \frac{s}{r} (\log x)^{s-1} + \frac{s(s-1)}{r^2} (\log x)^{s-2} + \cdots + \frac{1\cdot 2 \cdots s}{r^s} \Big), \]
we observe $f'_{r,s}(x) = -x^{-r-1} (\log x)^s < 0$ and $f_{r,s}(x) > 0$ for $x > 1$, hence
\[ \int_a^b dx\, x^{-r-1} (\log x)^s = f_{r,s}(a) - f_{r,s}(b) < f_{r,s}(a). \]
(a2) The integrals to be bounded in the relevant cases have the form
\[ \int_1^b dx\, x^{r-1} (\log x)^s, \qquad \text{with } 1 \le b \text{ and } r, s \in \mathbb{N}_0 . \]
If $r = 0$, we just integrate. For $r > 0$, defining
\[ g_{r,s}(x) := \frac{1}{r}\, x^{r} \Big( (\log x)^s - \frac{s}{r} (\log x)^{s-1} + \frac{s(s-1)}{r^2} (\log x)^{s-2} + \cdots + (-)^s\, \frac{1\cdot 2 \cdots s}{r^s} \Big), \]
we notice $g'_{r,s}(x) = x^{r-1} (\log x)^s$ and hence
\[ \int_1^b dx\, x^{r-1} (\log x)^s = g_{r,s}(b) - g_{r,s}(1) \le \frac{1\cdot 2 \cdots s}{r^s} + |g_{r,s}(b)| . \]
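The two antiderivatives of (a1) and (a2) can be checked numerically. The Python sketch below evaluates $f_{r,s}$ and $g_{r,s}$ from their finite sums and compares them with a composite Simpson quadrature of the integrands. The helper names `f_rs`, `g_rs` and `simpson`, the sample parameters, and the choice of quadrature rule are assumptions made for this illustration.

```python
import math


def f_rs(x, r, s):
    """f_{r,s}(x) = (1/r) x^{-r} * sum_k [s!/(s-k)!] r^{-k} (log x)^{s-k}, as in (a1)."""
    lg = math.log(x)
    total = sum(
        (math.factorial(s) // math.factorial(s - k)) / r**k * lg ** (s - k)
        for k in range(s + 1)
    )
    return x ** (-r) * total / r


def g_rs(x, r, s):
    """g_{r,s}(x) = (1/r) x^{r} * sum_k (-1)^k [s!/(s-k)!] r^{-k} (log x)^{s-k}, as in (a2)."""
    lg = math.log(x)
    total = sum(
        (-1) ** k * (math.factorial(s) // math.factorial(s - k)) / r**k * lg ** (s - k)
        for k in range(s + 1)
    )
    return x**r * total / r


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    acc = func(a) + func(b)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * func(a + i * h)
    return acc * h / 3.0


if __name__ == "__main__":
    r, s = 2, 3
    a, b = 1.5, 4.0
    irrelevant = simpson(lambda x: x ** (-r - 1) * math.log(x) ** s, a, b)
    print("(a1) quadrature :", irrelevant)
    print("(a1) f(a) - f(b):", f_rs(a, r, s) - f_rs(b, r, s))
    relevant = simpson(lambda x: x ** (r - 1) * math.log(x) ** s, 1.0, b)
    print("(a2) quadrature :", relevant)
    print("(a2) g(b) - g(1):", g_rs(b, r, s) - g_rs(1.0, r, s))
```

The sums are finite because $s$ is a nonnegative integer; each differentiation of $(\log x)^{s-k}$ is cancelled by the next term of the sum, which is how the closed forms arise.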
3. The Quantum Action Principle
The Green functions of a relativistic quantum field theory depend on the adjustable parameters of this theory and are in general related according to the inherent symmetries of the theory. Clearly, all types of Green functions, whether truncated, amputated, or one-particle-irreducible, show these properties. The quantum action principle deals with the variation of Green functions caused by diverse operations performed: (i) applying the differential operator appearing in the (classical) field equation, (ii) (nonlinear) variations of the fields, (iii) variation of an adjustable parameter of the theory. The quantum action principle relates each of these different operations on Green functions to the insertion of a corresponding composite field into the Green functions: as a local operator in the first two cases, whereas integrated over space-time in the third. Moreover, in general the local operation has a precursor within classical field theory (e.g. the field equation, the Noether theorem). Then the local composite field to be inserted in the case of a quantum field theory is a sum formed of its classical precursor and of assigned local counterterms, whose canonical mass dimensions are equal to or smaller than the canonical mass dimension ascribed to the term of classical descent. The quantum action principle has been established first by Lam [74, 75] and Lowenstein [76] using the BPHZ-formulation of perturbation theory. This principle is extensively used in the method of algebraic renormalization [77].
Our aim is to demonstrate the parts (i) and (iii) of the quantum action principle by means of flow equations in the case of the scalar field theory. The particularly interesting part (ii) is deferred to a later section, where nonlinear BRST-transformations have to be implemented in showing the renormalizability of a non-Abelian gauge theory.
3.1. Field equation
We consider again the quantum field theory of a real scalar field on four-dimensional Euclidean space-time, which has been treated in the preceding sections. To derive a field equation, we act on the generating functional of its regularized Schwinger functions (2.12) as follows:
\[ \hbar \int dy\, (C^{\Lambda,\Lambda_0})^{-1}(x-y)\, \frac{\delta}{\delta J(y)}\, Z^{\Lambda,\Lambda_0}(J) = \int dy\, (C^{\Lambda,\Lambda_0})^{-1}(x-y) \int d\mu_{\Lambda,\Lambda_0}(\phi)\, \phi(y)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi) + \frac{1}{\hbar}\langle\phi,J\rangle}. \]
In presence of the regularization the inverse of the regularized covariance (2.8) replaces the differential operator $-\Delta + m^2$. Integration by parts (2.5) on the r.h.s. and recalling that the covariance of the Gaussian measure $d\mu_{\Lambda,\Lambda_0}(\phi)$ is $\hbar C^{\Lambda,\Lambda_0}$ yields the field equation of the regularized generating functional (2.12),
\[ \Big( J(x) - \hbar \int dy\, (C^{\Lambda,\Lambda_0})^{-1}(x-y)\, \frac{\delta}{\delta J(y)} \Big) Z^{\Lambda,\Lambda_0}(J) = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, Q(x)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi) + \frac{1}{\hbar}\langle\phi,J\rangle}. \tag{3.1} \]
On the r.h.s. the inserted composite field $Q(x)$ is given by
\[ Q(x) = \frac{\delta}{\delta\phi(x)}\, L^{\Lambda_0,\Lambda_0}(\phi). \tag{3.2} \]
If we employ the generating functional of regularized Schwinger functions with insertions (2.61) and (2.62) we can rewrite the field equation (3.1) in the form
\[ \Big( J(x) - \hbar \int dy\, (C^{\Lambda,\Lambda_0})^{-1}(x-y)\, \frac{\delta}{\delta J(y)} \Big) Z^{\Lambda,\Lambda_0}(J) = -\hbar\, \frac{\delta}{\delta\varrho(x)}\, \tilde Z^{\Lambda,\Lambda_0}(\varrho;J)\Big|_{\varrho(x)=0}. \tag{3.3} \]
Taking into account the relations (2.18) and (2.65) on the l.h.s. and on the r.h.s. of this equation, respectively, provides the field equation for the generating functional of the amputated truncated Schwinger functions,
\[ \frac{\delta}{\delta\varphi(x)}\, L^{\Lambda,\Lambda_0}(\varphi) = L^{\Lambda,\Lambda_0}_{(1)}(x;\varphi) + I^{\Lambda,\Lambda_0}_{(1)}(x). \tag{3.4} \]
Hence, in momentum space we have
\[ (2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{\Lambda,\Lambda_0}(\varphi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi) + \hat I^{\Lambda,\Lambda_0}_{(1)}(q), \tag{3.5} \]
using the conventions (2.24) and (2.69).
Our goal is to show within perturbation theory, i.e. in a formal loop expansion, that the field equation (3.5) remains valid taking the limit $\Lambda = 0$, $\Lambda_0 \to \infty$. To this end we proceed as follows:
(α) In Sec. 2.3 it has been shown that the generating functional $L^{\Lambda,\Lambda_0}(\varphi)$ of the theory considered (perturbatively) converges to a finite limit with $\Lambda_0 \to \infty$. The limit theory is determined by the choice of renormalization conditions (2.34)–(2.37) at the renormalization point (chosen at vanishing momenta).
(β) The generating functional $L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ with insertion of one composite field of mass dimension $D$ and momentum $q$ has been shown in Sec. 2.4 to have a finite limit with $\Lambda_0 \to \infty$, too, provided the counterterms of the insertion (2.58) are introduced at first as indeterminate functions of $\Lambda_0$ and then determined by a choice of renormalization conditions at the renormalization point. In the case entering the field equation, however, the dependence on $\Lambda_0$ of the insertion (3.2) is already given by (2.11), the counterterms of the theory without insertion. In order to maintain in an intermediate stage the freedom in choosing renormalization conditions, we use instead of (3.2) indeterminate counterterms to begin with:
\[ Q(x) = \frac{f}{2!}\, \phi^2(x) + \frac{g}{3!}\, \phi^3(x) + v_1(\Lambda_0) + a_1(\Lambda_0)\, \phi(x) - z_1(\Lambda_0)\, \Delta\phi(x) + \frac{1}{2!}\, b_1(\Lambda_0)\, \phi^2(x) + \frac{1}{3!}\, c_1(\Lambda_0)\, \phi^3(x). \tag{3.6} \]
Then, as has been demonstrated in Sec. 2.4, any choice of admissible renormalization conditions leads to a finite limit of the generating functional $L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ in sending $\Lambda_0 \to \infty$, and thereby a related dependence of the coefficients $v_1(\Lambda_0), \dots, c_1(\Lambda_0)$ on $\Lambda_0$ arises.
(γ) We define the functional
\[ \hat D^{\Lambda,\Lambda_0}(q;\varphi) := \hat L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi) + \hat I^{\Lambda,\Lambda_0}_{(1)}(q) - (2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{\Lambda,\Lambda_0}(\varphi). \tag{3.7} \]
If this functional can be forced to vanish at $\Lambda = 0$ and for all sufficiently large $\Lambda_0$, by an appropriate fixed choice of renormalization conditions in (β), then (3.5) converges to a finite renormalized field equation for ($\Lambda = 0$, $\Lambda_0 \to \infty$).
The functional (3.7) obeys the linear flow equation
\[ \frac{d}{d\Lambda}\, \hat D^{\Lambda,\Lambda_0}(q;\varphi) = \frac{\hbar}{2} \Big\langle \frac{\delta}{\delta\varphi}, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \Big\rangle \hat D^{\Lambda,\Lambda_0}(q;\varphi) - \Big\langle \frac{\delta}{\delta\varphi} L^{\Lambda,\Lambda_0}(\varphi), \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \hat D^{\Lambda,\Lambda_0}(q;\varphi) \Big\rangle, \tag{3.8} \]
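which follows directly from the flow equations (2.21) and (2.67), performing a functional derivative of the first and Fourier transforming the latter. To make use of it we decompose
\[ (2\pi)^{-4}\, \hat D^{\Lambda,\Lambda_0}(q;0) = \delta(q)\, D^{\Lambda,\Lambda_0}_0, \qquad (2\pi)^{4(n-1)}\, \delta_{\hat\varphi(p_n)} \cdots \delta_{\hat\varphi(p_1)}\, \hat D^{\Lambda,\Lambda_0}(q;\varphi)\Big|_{\varphi=0} = \delta(q + p_1 + \cdots + p_n)\, D^{\Lambda,\Lambda_0}_n(q;p_1,\dots,p_n). \tag{3.9} \]
From (2.25), (2.69) and (2.70) then results
\[ D^{\Lambda,\Lambda_0}_0 = i^{\Lambda,\Lambda_0} - L^{\Lambda,\Lambda_0}_1(0), \tag{3.10} \]
\[ D^{\Lambda,\Lambda_0}_n(q;p_1,\dots,p_n) = L^{\Lambda,\Lambda_0}_{(1)\,n}(q;p_1,\dots,p_n) - L^{\Lambda,\Lambda_0}_{n+1}(q,p_1,\dots,p_n). \tag{3.11} \]
We notice that the flow equations (3.8) and (2.67) have the same form. Thus, after a loop expansion, the strict analogue of the system of flow equations (2.73) and (2.74) is obtained, $l \in \mathbb{N}_0$, $n \in \mathbb{N}$:
\[ \partial_\Lambda D^{\Lambda,\Lambda_0}_{l,0} = \frac12 \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot D^{\Lambda,\Lambda_0}_{l-1,2}(0;k,-k) - {\sum_{l_1,l_2}}' L^{\Lambda,\Lambda_0}_{l_1,1}(0)\, \partial_\Lambda C^{\Lambda,\Lambda_0}(0) \cdot D^{\Lambda,\Lambda_0}_{l_2,1}(0;0), \tag{3.12} \]
\[ \begin{aligned} \partial_\Lambda \partial^w D^{\Lambda,\Lambda_0}_{l,n}(q;p_1,\dots,p_n) ={}& \frac12 \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \partial^w D^{\Lambda,\Lambda_0}_{l-1,n+2}(q;k,-k,p_1,\dots,p_n) \\ & - {\sum_{n_1,n_2}}' {\sum_{l_1,l_2}}' {\sum_{w_1,w_2,w_3}}' c_{\{w_i\}} \Big[ \partial^{w_1} L^{\Lambda,\Lambda_0}_{l_1,n_1+1}(p_1,\dots,p_{n_1},p) \cdot \partial^{w_3}\partial_\Lambda C^{\Lambda,\Lambda_0}(p) \\ & \qquad \cdot\, \partial^{w_2} D^{\Lambda,\Lambda_0}_{l_2,n_2+1}(q;-p,p_{n_1+1},\dots,p_n) \Big]_{rsym}, \end{aligned} \tag{3.13} \]
\[ p = -p_1 - \cdots - p_{n_1} = q + p_{n_1+1} + \cdots + p_n . \]
We first treat the tree order. From (2.11), (2.76) follows $\hat D^{\Lambda_0,\Lambda_0}(q;\varphi)|_{l=0} = 0$. Integrating the flow equations with $l = 0$ and general momenta from the initial point $\Lambda = \Lambda_0$ downwards to smaller values of $\Lambda$, ascending successively in $n$, we find due to the properties (2.38) for $n \in \mathbb{N}$, $0 \le \Lambda \le \Lambda_0$,
\[ D^{\Lambda,\Lambda_0}_{0,0} = 0, \qquad D^{\Lambda,\Lambda_0}_{0,n}(q;p_1,\dots,p_n) = 0. \tag{3.14} \]
The extension to all loop orders $l$ is achieved by the
Proposition 3.1. For all $l \in \mathbb{N}$ and $n + |w| \le 3$ let
\[ D^{0,\Lambda_0}_{l,0} = 0, \qquad \partial^w D^{0,\Lambda_0}_{l,n}(0;0,\dots,0) = 0, \tag{3.15} \]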
then for $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $|w| \le 3$, and $0 \le \Lambda \le \Lambda_0$:
\[ D^{\Lambda,\Lambda_0}_{l,0} = 0, \qquad \partial^w D^{\Lambda,\Lambda_0}_{l,n}(q;p_1,\dots,p_n) = 0. \tag{3.16} \]
Proof. In the order $l = 0$ the assertion is already established because of (3.14). We now assume (3.16) to hold for all orders smaller than a fixed order $l$. As a consequence, on the respective r.h.s. of the flow equations (3.12) and (3.13) the first term vanishes and in the second term only the pair ($l_1 = 0$, $l_2 = l$) has to be taken into account. Looking first at the vacuum part, we observe that the r.h.s. of (3.12) vanishes due to (2.38). Thus, integration from the initial point $\Lambda = 0$ yields $D^{\Lambda,\Lambda_0}_{l,0} = 0$. To demonstrate the assertion for general $n$ we proceed inductively: ascending in $n$, and for fixed $n$ descending with $w$ from $|w| = 3$. Thus, for each $n$ the irrelevant cases $n + |w| > 3$ always precede the relevant ones, $n + |w| \le 3$, if present at all. Since
\[ \partial^w D^{\Lambda_0,\Lambda_0}_{l,n}(q;p_1,\dots,p_n) = 0, \qquad n + |w| > 3, \tag{3.17} \]
the flow equations (3.13) of these cases are integrated from the initial point $\Lambda = \Lambda_0$ downwards. On the other hand, the respective flow equation of the cases $n + |w| \le 3$ is first integrated at zero momentum from the initial point $\Lambda = 0$ with vanishing initial condition (3.15), and in a now familiar second step the result is extended to general momenta via the Taylor formula (2.47). Following the inductive order stated one notices that for each pair $(n, w)$ occurring, the r.h.s. of (3.13) vanishes due to the key properties (2.38) and preceding instances of (3.16). Hence, (3.16) also holds for all $n$ in the order $l$ and the proposition is proven.
From (α), (β) we know that letting $\Lambda_0 \to \infty$, each term on the r.h.s. of Eq. (3.7) converges to a finite limit. Hence, if the renormalization conditions chosen for $L^{\Lambda,\Lambda_0}_{(1)}(q;\varphi)$ are inferred from those of $L^{\Lambda,\Lambda_0}(\varphi)$ to satisfy (3.15), the l.h.s. of (3.7) vanishes for $0 \le \Lambda \le \Lambda_0$. Thus, the field equation (3.5) remains valid after removing the cutoffs $\Lambda, \Lambda_0$, written suggestively as
\[ (2\pi)^4\, \frac{\delta}{\delta\hat\varphi(q)}\, L^{0,\infty}(\varphi) = \hat L^{0,\infty}_{(1)}(q;\varphi) + \hat I^{0,\infty}_{(1)}(q); \tag{3.18} \]
in the realm of a formal loop expansion, of course. Considering the relations (3.16) at $\Lambda = \Lambda_0$ reveals that the counterterms entering the insertion (3.6) have to be chosen identical to those of the bare interaction (2.11), $l \in \mathbb{N}$:
\[ v_{1,l}(\Lambda_0) = v_l(\Lambda_0), \ \dots, \ c_{1,l}(\Lambda_0) = c_l(\Lambda_0). \tag{3.19} \]
3.2. Variation of a coupling constant
The renormalized amputated truncated Schwinger functions (2.33) depend on the coupling constants $f$ and $g$, which can be freely chosen in the bare interaction (2.11). Our aim is to find a representation for the derivative of these Schwinger
3.2. Variation of a coupling constant The renormalized amputated truncated Schwinger functions (2.33) depend on the coupling constants f and g, which can be freely chosen in the bare interaction (2.11). Our aim is to find a representation for the derivative of these Schwinger
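functions with respect to $f$ or $g$. To this end we start from the defining Eq. (2.16) of the regularized generating functional. Denoting by $\kappa$ either $f$ or $g$, and defining
\[ W_\kappa(\phi) := \frac{\partial}{\partial\kappa}\, L^{\Lambda_0,\Lambda_0}(\phi) =: \int_\Omega dx\, Q_\kappa(x), \tag{3.20} \]
where the integrand $Q_\kappa(x)$ is a composite field and $W_\kappa(\phi)$ the space-time integral of it, we obtain from differentiating (2.16):
\[ \partial_\kappa \big( L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0} \big) \cdot e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}\, W_\kappa(\phi+\varphi). \tag{3.21} \]
On the other hand, the functional derivative of Eq. (2.63) with respect to $\varrho(x)$ at $\varrho(x) = 0$ yields, observing the shift $\phi \to \phi + \varphi$ to be performed in (2.61) and employing the notations (2.66), (2.68):
\[ \big( L^{\Lambda,\Lambda_0}_{(1)}(x;\varphi) + I^{\Lambda,\Lambda_0}_{(1)}(x) \big)\, e^{-\frac{1}{\hbar}\left(L^{\Lambda,\Lambda_0}(\varphi) + I^{\Lambda,\Lambda_0}\right)} = \int d\mu_{\Lambda,\Lambda_0}(\phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\phi+\varphi)}\, Q(x)\Big|_{\phi\to\phi+\varphi}. \tag{3.22} \]
In writing this equation we have already taken account of the identities
\[ \tilde L^{\Lambda,\Lambda_0}(0;\varphi) = L^{\Lambda,\Lambda_0}(\varphi), \qquad \tilde I^{\Lambda,\Lambda_0}(0) = I^{\Lambda,\Lambda_0}. \]
Choosing in (3.22) the particular composite field $Q(x) = Q_\kappa(x)$ introduced in (3.20), and integrating over the finite space-time $\Omega$, implies by comparison with (3.21),
\[ \partial_\kappa L^{\Lambda,\Lambda_0}(\varphi) = \int_\Omega dx\, L^{\Lambda,\Lambda_0}_{(1)}(x;\varphi). \]
We can now pass to the infinite volume limit $\Omega \to \mathbb{R}^4$, $\varphi \in \mathcal{S}(\mathbb{R}^4)$. Hence,
\[ \partial_\kappa L^{\Lambda,\Lambda_0}(\varphi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi), \tag{3.23} \]
with the Fourier transform (2.69) at vanishing momentum. In the sequel, $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ is always understood as the generating functional with the insertion (3.20). The task posed is to produce a finite limit of the Eq. (3.23) upon removing the cutoffs, i.e. letting $\Lambda = 0$, $\Lambda_0 \to \infty$. A finite limit of $L^{\Lambda,\Lambda_0}(\varphi)$ has been established in Sec. 2.3. Furthermore, the insertion appearing in (3.23) is a particular instance of the insertion of a composite field $Q(x)$ dealt with in Sec. 2.4. The composite field $Q_\kappa(x)$ involved here follows from (3.20) and (2.11) as
\[ \begin{aligned} Q_\kappa(x) ={}& \frac{1}{3!}\, \delta_{\kappa f}\, \phi^3(x) + \frac{1}{4!}\, \delta_{\kappa g}\, \phi^4(x) + v_\kappa(\Lambda_0)\, \phi(x) + \frac12\, a_\kappa(\Lambda_0)\, \phi^2(x) + \frac12\, z_\kappa(\Lambda_0)\, (\partial_\mu\phi)^2(x) \\ & + \frac{1}{3!}\, b_\kappa(\Lambda_0)\, \phi^3(x) + \frac{1}{4!}\, c_\kappa(\Lambda_0)\, \phi^4(x), \end{aligned} \tag{3.24} \]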
where $\delta_{\kappa f}$ is the Kronecker symbol: $\delta_{\kappa f} = 1$, if $\kappa = f$, and $\delta_{\kappa f} = 0$, if $\kappa \ne f$. One should note that this composite field in both cases $\kappa = f$ or $\kappa = g$ has the canonical mass dimension $D = 4$, in contrast to its classical part. The coefficients of the counterterms appearing in (3.24) are the coefficients entering the bare interaction (2.11) differentiated with respect to $\kappa$. However, since in the process of renormalization the counterterms are determined by the renormalization conditions chosen, we at first treat the counterterms in (3.24) as free functions of $\Lambda_0$, which are then determined by the renormalization conditions prescribed in the case of the insertion. We do not assume the renormalization conditions (2.34)–(2.37) of $L^{\Lambda,\Lambda_0}(\varphi)$ to depend on $f$ and $g$, hence, their derivative with respect to $\kappa$ vanishes. Requiring (3.23) to be valid at the renormalization point for all values of $\Lambda_0$ then implies that in all loop orders $l \ge 1$ an $n$-point function of $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ and its momentum derivatives vanish for $\Lambda = 0$ at zero momenta, if $n + |w| \le 4$. The renormalization conditions fixed, we know from Proposition 2.4 that the functional $\hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi)$ has a finite limit $\Lambda = 0$, $\Lambda_0 \to \infty$. To control the renormalization of Eq. (3.23) we define
\[ D^{\Lambda,\Lambda_0}(\varphi) := \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\varphi) - \partial_\kappa L^{\Lambda,\Lambda_0}(\varphi). \tag{3.25} \]
This functional satisfies, as easily seen, a linear flow equation of the form (3.8); hence, after decomposition, a system of flow equations of the form (3.13) results. (Here, no vacuum part appears.) Comparing the bare interaction (2.11) with the insertion (3.24) we observe that $D^{\Lambda_0,\Lambda_0}$ vanishes in the tree order,
\[ D^{\Lambda_0,\Lambda_0}\big|_{l=0} = 0, \]
and its irrelevant part vanishes for $l > 0$,
\[ \partial^w D^{\Lambda_0,\Lambda_0}_{l,n}(p_1,\dots,p_n) = 0, \qquad n + |w| > 4. \]
Given these initial conditions we have the
Proposition 3.2. Assume for all $l \in \mathbb{N}$, $n + |w| \le 4$:
\[ \partial^w D^{0,\Lambda_0}_{l,n}(0,\dots,0) = 0, \tag{3.26} \]
then follows for $l \in \mathbb{N}_0$, $n \in \mathbb{N}$, $|w| \le 4$, and $0 \le \Lambda \le \Lambda_0$:
\[ \partial^w D^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_n) = 0. \tag{3.27} \]
The proof by induction proceeds exactly as the proof of Proposition 3.1 and is omitted. Proposition 3.2 implies that Eq. (3.23) has a finite limit for $\Lambda = 0$, $\Lambda_0 \to \infty$,
\[ \partial_\kappa L^{0,\infty}(\varphi) = \hat L^{0,\infty}_{(1)}(0;\varphi), \tag{3.28} \]
again to be read in terms of a formal loop expansion. Furthermore, from (3.27) at $\Lambda = \Lambda_0$ follows the relation of the counterterms
\[ v_{\kappa,l}(\Lambda_0) = \partial_\kappa v_l(\Lambda_0), \ \dots, \ c_{\kappa,l}(\Lambda_0) = \partial_\kappa c_l(\Lambda_0). \tag{3.29} \]
3.3. Flow equations for proper vertex functions

In perturbative renormalization based on the analysis of Feynman integrals, (proper) vertex functions form the building blocks. They are represented by one-particle-irreducible (1PI) Feynman diagrams, see e.g. [78]. Although their generating functional has no representation as a functional integral, flow equations for vertex functions can be derived [45, 46]. Our goal in this section is to deduce, in the case of the symmetric $\Phi^4$-theory, from Wilson's differential flow equation (2.21) for the L-functional the system of flow equations satisfied by the regularized n-point vertex functions, n ∈ N. After that, an inductive proof of renormalizability based on them is outlined. We start from the regularized generating functional $W^{\Lambda,\Lambda_0}(J)$ of the truncated Schwinger functions, (2.14), decomposed as
$$ W^{\Lambda,\Lambda_0}(J) = \sum_{n=1}^{\infty} \frac{1}{(2n)!} \int dx_1 \cdots \int dx_{2n}\, W^{\Lambda,\Lambda_0}_{2n}(x_1,\dots,x_{2n})\, J(x_1)\cdots J(x_{2n}) , \tag{3.30} $$
according to (2.15). Due to the symmetry φ → −φ of the theory, all n-point functions with n odd vanish identically. Defining the "classical field",
$$ \varphi(x) := \frac{\delta W^{\Lambda,\Lambda_0}(J)}{\delta J(x)} , \tag{3.31} $$
we then notice that $\varphi(x)|_{J\equiv 0} = 0$, and, moreover, that ϕ depends on the flow parameter Λ (and on Λ0). Since the 2-point function $W_2^{\Lambda,\Lambda_0}$ is different from zero, (3.31) can be inverted iteratively as a formal series in ϕ(x) to yield the source J(x) in the form
$$ J(x) = \mathcal{J}(\varphi(x)) \equiv \sum_{n=0}^{\infty} \frac{1}{(2n+1)!} \int dx_1 \cdots \int dx_{2n+1}\, F_{2n+2}(x,x_1,\dots,x_{2n+1})\, \varphi(x_1)\cdots\varphi(x_{2n+1}) , \tag{3.32} $$
where
$$ \int dy\, F_2(x,y)\, W_2^{\Lambda,\Lambda_0}(y,z) = \delta(x-z) . $$
The generating functional $\Gamma^{\Lambda,\Lambda_0}(\varphi)$ of the regularized vertex functions results from the Legendre transformation
$$ \Gamma^{\Lambda,\Lambda_0}(\varphi) := \Big[ -W^{\Lambda,\Lambda_0}(J) + \int dy\, J(y)\varphi(y) \Big]_{J=\mathcal{J}(\varphi)} \tag{3.33} $$
implying, due to (3.31),
$$ \frac{\delta \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)} = J(x) . \tag{3.34} $$
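The Legendre transformation (3.33) and its consequence (3.34) can be illustrated in a zero-dimensional toy setting, in which functionals reduce to ordinary functions of one real variable. The sketch below is not part of the original argument: the quartic function W, its coefficients `c` and `w4`, and all helper names are hypothetical choices made for illustration. It inverts the analogue of (3.31) by Newton iteration and provides finite-difference derivatives with which one can check that the derivative of Γ reproduces the source J, and that the second derivatives of W and Γ are inverse to each other.

```python
# Zero-dimensional toy of the Legendre transformation (3.33).
# Assumption: W is an illustrative even quartic, not taken from the paper.

def W(J, c=0.5, w4=-0.3):
    """Toy even generating function W(J) = c J^2 / 2 + w4 J^4 / 24."""
    return 0.5 * c * J**2 + w4 * J**4 / 24.0

def dW(J, c=0.5, w4=-0.3):
    """phi = dW/dJ, the analogue of the classical field (3.31)."""
    return c * J + w4 * J**3 / 6.0

def d2W(J, c=0.5, w4=-0.3):
    """Second derivative of W, the analogue of the full 2-point function."""
    return c + 0.5 * w4 * J**2

def source_of(phi, tol=1e-14, max_iter=100):
    """Invert phi = dW/dJ for J by Newton iteration (analogue of (3.32))."""
    J = phi / d2W(0.0)  # start from the linear term of the inversion
    for _ in range(max_iter):
        step = (dW(J) - phi) / d2W(J)
        J -= step
        if abs(step) < tol:
            break
    return J

def Gamma(phi):
    """Gamma(phi) = -W(J) + J * phi evaluated at J = J(phi), cf. (3.33)."""
    J = source_of(phi)
    return -W(J) + J * phi

def derivative(f, x, h=1e-4):
    """Central finite-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-3):
    """Central finite-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2
```

For a small field value such as `phi = 0.2`, `derivative(Gamma, phi)` agrees with `source_of(phi)` up to discretization error, and `second_derivative(Gamma, phi) * d2W(source_of(phi))` is close to one. The Newton inversion is only meant for small fields, where the toy W is invertible.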
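The functional $\Gamma^{\Lambda,\Lambda_0}(\varphi)$ is even under ϕ → −ϕ and vanishes at ϕ = 0. Finally, performing the functional derivation of (3.34) with respect to J(y) and using again (3.31) provides the crucial functional relation
$$ \int dz\, \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(y)\,\delta J(z)} \cdot \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(z)\,\delta\varphi(x)} = \delta(y-x) . \tag{3.35} $$
As an immediate consequence follows
$$ \int dz\, W_2^{\Lambda,\Lambda_0}(y,z)\, \Gamma_2^{\Lambda,\Lambda_0}(z,x) = \delta(y-x) , $$
considering (3.35) at ϕ = 0, and thus also at J = 0. In order to obtain the relation between the functionals $L^{\Lambda,\Lambda_0}(\varphi)$ and $\Gamma^{\Lambda,\Lambda_0}(\varphi)$, we write (2.19) in the form
$$ W^{\Lambda,\Lambda_0}(J) = -L^{\Lambda,\Lambda_0}(\varphi) + \frac{1}{2} \langle J, C^{\Lambda,\Lambda_0} J \rangle , \tag{3.36} $$
$$ \varphi(x) = \int dy\, C^{\Lambda,\Lambda_0}(x-y)\, J(y) . \tag{3.37} $$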
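Deriving (3.36) twice with respect to J as required in (3.35), one obtains, after operating on this equation with $(C^{\Lambda,\Lambda_0})^{-1}$,
$$ (C^{\Lambda,\Lambda_0})^{-1}(y-x) = -\int dz \int du\, \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(y)\,\delta\varphi(u)}\, C^{\Lambda,\Lambda_0}(u-z)\, \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(z)\,\delta\varphi(x)} + \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(y)\,\delta\varphi(x)} . \tag{3.38} $$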
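From this analogue of (3.35) follow the relations between the respective n-point functions of $\Gamma^{\Lambda,\Lambda_0}(\varphi)$ and $L^{\Lambda,\Lambda_0}(\varphi)$ upon repeated functional derivation with respect to ϕ, employing the chain rule together with the relation
$$ \varphi(x) = \int dy\, C^{\Lambda,\Lambda_0}(x-y)\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(y)} , \tag{3.39} $$
due to (3.37) and (3.34). With our conventions (2.24), (2.27) for the Fourier transformation, the Eqs. (3.38) and (3.39) appear in momentum space as
$$ (2\pi)^{-4}\, \frac{\delta(p+q)}{C^{\Lambda,\Lambda_0}(p)} = -(2\pi)^{8} \int_k \frac{\delta^2 L^{\Lambda,\Lambda_0}}{\delta\hat\varphi(p)\,\delta\hat\varphi(k)}\, C^{\Lambda,\Lambda_0}(k)\, \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}}{\delta\hat\varphi(-k)\,\delta\hat\varphi(q)} + \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}}{\delta\hat\varphi(p)\,\delta\hat\varphi(q)} , \tag{3.40} $$
$$ \hat\varphi(q) = (2\pi)^4\, C^{\Lambda,\Lambda_0}(q)\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}}{\delta\hat\varphi(-q)} . \tag{3.41} $$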
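In the tree order, $L^{\Lambda,\Lambda_0}(\varphi)$ contains no 2-point function, (2.38). Hence, setting in (3.40) ϕ = ϕ = 0 yields
$$ \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\hat\varphi(p)\,\delta\hat\varphi(q)} \bigg|^{l=0}_{\varphi\equiv 0} = (2\pi)^{-4}\, \frac{\delta(p+q)}{C^{\Lambda,\Lambda_0}(p)} . \tag{3.42} $$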
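To deduce the flow equation for the vertex functional, we derive Eq. (3.33) with respect to the flow parameter Λ,
$$ (\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\varphi) + \int dy\, \frac{\delta \Gamma^{\Lambda,\Lambda_0}}{\delta\varphi(y)}\, \partial_\Lambda \varphi(y) = -\partial_\Lambda W^{\Lambda,\Lambda_0}(J) + \int dy\, J(y)\, \partial_\Lambda \varphi(y) . $$
Hence, because of (3.34),
$$ (\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\varphi) + \partial_\Lambda W^{\Lambda,\Lambda_0}(J) = 0 . \tag{3.43} $$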
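Substituting $W^{\Lambda,\Lambda_0}$ by $L^{\Lambda,\Lambda_0}$ according to (3.36) then yields
$$ (\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\varphi) - (\partial_\Lambda L^{\Lambda,\Lambda_0})(\varphi) - \int dz \int dy\, \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi(y)}\, \dot C^{\Lambda,\Lambda_0}(y-z)\, J(z) + \frac{1}{2} \langle J, \dot C^{\Lambda,\Lambda_0} J \rangle = 0 , \tag{3.44} $$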
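$\dot C^{\Lambda,\Lambda_0}$ denoting the derivative of the covariance $C^{\Lambda,\Lambda_0}$ with respect to Λ. There is an alternative way [41] to arrive at Eq. (3.44), starting from (3.33) but treating ϕ as Λ-independent and thus J to depend on Λ according to (3.31). Deriving (3.33) with respect to Λ then reads
$$ \partial_\Lambda \Gamma^{\Lambda,\Lambda_0}(\varphi) = \Big[ -(\partial_\Lambda W^{\Lambda,\Lambda_0})(J) - \int dy\, \frac{\delta W^{\Lambda,\Lambda_0}}{\delta J(y)}\, \partial_\Lambda J(y) + \int dy\, \partial_\Lambda J(y)\, \varphi(y) \Big]_{J=\mathcal{J}(\varphi)} = -(\partial_\Lambda W^{\Lambda,\Lambda_0})(J)\big|_{J=\mathcal{J}(\varphi)} . \tag{3.45} $$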
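The substitution of $W^{\Lambda,\Lambda_0}$ by $L^{\Lambda,\Lambda_0}$ due to (3.36) and (3.37) again provides (3.44). Given (3.44), the flow equation (2.21) with its vacuum part subtracted can be taken into account, leading to
$$ (\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\varphi) + \frac{1}{2} \Big\langle \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi} - J ,\ \dot C^{\Lambda,\Lambda_0} \Big( \frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi} - J \Big) \Big\rangle = \frac{\hbar}{2} \Big\{ \Big\langle \frac{\delta}{\delta\varphi}, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \Big\rangle L^{\Lambda,\Lambda_0}(\varphi) - \Big\langle \frac{\delta}{\delta\varphi}, \dot C^{\Lambda,\Lambda_0} \frac{\delta}{\delta\varphi} \Big\rangle L^{\Lambda,\Lambda_0}(\varphi) \Big|_{\varphi\equiv 0} \Big\} . $$
In the second term on the l.h.s. we use the relation
$$ \int dx\, (C^{\Lambda,\Lambda_0})^{-1}(z-x)\, \varphi(x) = -\frac{\delta L^{\Lambda,\Lambda_0}}{\delta\varphi(z)} + J(z) , \tag{3.46} $$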
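resulting from (3.31) together with (3.36), and note $(\partial_\Lambda C) C^{-1} + C \partial_\Lambda C^{-1} = 0$. Hence, the flow equation for the vertex functional turns out as
$$ (\partial_\Lambda \Gamma^{\Lambda,\Lambda_0})(\varphi) - \frac{1}{2} \langle \varphi, (\partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1}) \varphi \rangle = \frac{\hbar}{2} \Big\langle \frac{\delta}{\delta\varphi}, (\partial_\Lambda C^{\Lambda,\Lambda_0}) \frac{\delta}{\delta\varphi} \Big\rangle \tilde\Gamma^{\Lambda,\Lambda_0}(\varphi) , \tag{3.47} $$
with the r.h.s. defined by
$$ \frac{\delta^2 \tilde\Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)\,\delta\varphi(y)} := \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)\,\delta\varphi(y)} \bigg|_{\varphi = C^{\Lambda,\Lambda_0} \mathcal{J}(\varphi)} - \frac{\delta^2 L^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi(x)\,\delta\varphi(y)} \bigg|_{\varphi=0} . \tag{3.48} $$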
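Looking at this definition of the functional $\tilde\Gamma^{\Lambda,\Lambda_0}(\varphi)$ we notice that its 2-point function vanishes, and furthermore, that its higher n-point functions, n = 4, 6, 8, . . . , emerge from the first term on the r.h.s. These latter are recursively determined by the functional Eqs. (3.38) or (3.40) (which could also be obtained via (3.46) and (3.34)), by performing successively two, four, six, . . . functional derivations with respect to ϕ. The r.h.s. of the flow equation (3.47) can also be given another form, expressing (3.48) first in terms of the functional $W^{\Lambda,\Lambda_0}(J)$ by way of (3.36), (3.37) and then using the functional relation (3.35),
$$ \frac{\hbar}{2} \Big\langle \frac{\delta}{\delta\varphi}, (\partial_\Lambda C^{\Lambda,\Lambda_0}) \frac{\delta}{\delta\varphi} \Big\rangle \tilde\Gamma^{\Lambda,\Lambda_0}(\varphi) = \frac{\hbar}{2} \int dx \int dy\, \partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1}(y-x) \left( \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(x)\,\delta J(y)} - \frac{\delta^2 W^{\Lambda,\Lambda_0}(J)}{\delta J(x)\,\delta J(y)} \bigg|_{J=0} \right) $$
$$ = \frac{\hbar}{2} \int dx \int dy\, \partial_\Lambda (C^{\Lambda,\Lambda_0})^{-1}(y-x) \cdot \left( \Big( \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi\,\delta\varphi} \Big)^{-1}(x,y) - \Big( \frac{\delta^2 \Gamma^{\Lambda,\Lambda_0}(\varphi)}{\delta\varphi\,\delta\varphi} \Big)^{-1}(x,y) \bigg|_{\varphi=0} \right) . \tag{3.49} $$
This form is (also) met in the literature [54, 59–62], and the flow equation (3.47) is called there "exact renormalization group". Similar to (2.25) regarding the functional $L^{\Lambda,\Lambda_0}(\varphi)$, we define the n-point functions, n ∈ 2N, of the functional $\Gamma^{\Lambda,\Lambda_0}(\varphi)$ in momentum space as
$$ (2\pi)^{4(n-1)}\, \frac{\delta}{\delta\hat\varphi(p_1)} \cdots \frac{\delta}{\delta\hat\varphi(p_n)}\, \Gamma^{\Lambda,\Lambda_0}(\varphi) \Big|_{\varphi\equiv 0} = \delta(p_1 + \cdots + p_n)\, \Gamma_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n) , \tag{3.50} $$
and analogously in the case of the functional $\tilde\Gamma^{\Lambda,\Lambda_0}(\varphi)$. Performing in addition a respective loop expansion, n ∈ 2N,
$$ \Gamma_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \sum_{l=0}^{\infty} \hbar^l\, \Gamma_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) , \tag{3.51} $$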
and for the functions $\tilde\Gamma_n^{\Lambda,\Lambda_0}(p_1,\dots,p_n)$ alike, the flow equation (3.47) is finally converted into the system of flow equations, satisfied by the n-point functions, n ∈ 2N, l ∈ N,
$$ \partial_\Lambda \Gamma_{l,n}^{\Lambda,\Lambda_0}(p_1,\dots,p_n) = \frac{1}{2} \int_k \partial_\Lambda C^{\Lambda,\Lambda_0}(k) \cdot \tilde\Gamma_{l-1,n+2}^{\Lambda,\Lambda_0}(k,-k,p_1,\dots,p_n) . \tag{3.52} $$
In contrast to the system of flow equations (2.31) satisfied by the amputated truncated Schwinger functions, here the r.h.s. is in total of lower loop order, but there is no closed form for it. As explained before, it has to be determined recursively via (3.40), treated in a loop expansion and using (3.42). It then emerges in the form
$$ \tilde\Gamma_{l,n+2}^{\Lambda,\Lambda_0}(k,-k,p_1,\dots,p_n) = \Gamma_{l,n+2}^{\Lambda,\Lambda_0}(k,-k,p_1,\dots,p_n) - \sum_{r\ge 2}\ {\sum_{\{n_i\},\{l_i\}}}^{\!\!\prime}\ \sigma\, \Gamma_{l_1,n_1+1}^{\Lambda,\Lambda_0}(k,\dots)\, C^{\Lambda,\Lambda_0}\, \Gamma_{l_2,n_2+2}^{\Lambda,\Lambda_0} \cdots \Gamma_{l_{r-1},n_{r-1}+2}^{\Lambda,\Lambda_0}\, C^{\Lambda,\Lambda_0}\, \Gamma_{l_r,n_r+1}^{\Lambda,\Lambda_0}(-k,\dots) . \tag{3.53} $$
The prime restricts summation to l1 + l2 + · · · + lr = l and n1 + n2 + · · · + nr = n + 2; in addition, 2-point functions in the tree order are excluded as factors. The momentum assignment has been suppressed; it goes without saying that the sum inherits from the l.h.s. the complete symmetry in the momenta p1, . . . , pn. Moreover, there is a sign factor σ depending on {ni} and {li}. The form of (3.53) is easily understood when represented by Feynman diagrams: To the first term (on the r.h.s.) correspond 1PI-diagrams, whereas to the sum correspond chains of 1PI-diagrams, minimally connected by single lines and thus not of 1PI-type. These chains are closed to 1PI-diagrams by the contraction involved in the flow equation. The system of flow equations (3.52) can alternatively be employed to prove the renormalizability of the theory considered. In the tree order, only the 2-point function (3.42) and the 4-point function $\Gamma_{0,4}^{\Lambda,\Lambda_0}(p_1,\dots,p_4) = g$ are different from zero. The latter is easily obtained via (3.40)–(3.42) from (2.40), observing f = 0 there. In each loop order l ≥ 1, the three counterterms
$$ \Gamma_{l,2}^{\Lambda_0,\Lambda_0}(p,-p) = a_l(\Lambda_0) + z_l(\Lambda_0)\, p^2 , \qquad \Gamma_{l,4}^{\Lambda_0,\Lambda_0}(p_1,\dots,p_4) = c_l(\Lambda_0) \tag{3.54} $$
form the respective bare action, determined in the end by the renormalization conditions, l ≥ 1,
$$ \Gamma_{l,2}^{0,\Lambda_0}(p,-p) = a_l^R + z_l^R\, p^2 + O((p^2)^2) , \qquad \Gamma_{l,4}^{0,\Lambda_0}(0,0,0,0) = c_l^R . \tag{3.55} $$
The renormalization constants $a_l^R$, $z_l^R$, $c_l^R$ can be freely chosen. To prove renormalizability, we also have to make use of momentum derivatives of the flow equations (3.52), i.e. acting on them with $\partial^w$, (2.30). Then, the proof by induction follows step-by-step the proof given in Sec. 2.3 considering amputated truncated Schwinger
functions. It will therefore not be repeated. As a result, the analogue of the Propositions 2.1 and 2.2 is established, where the function $L_{l,n}^{\Lambda,\Lambda_0}$ appearing there is now replaced by the function $\Gamma_{l,n}^{\Lambda,\Lambda_0}$.
4. Spontaneously Broken SU(2) Yang–Mills Theory

Attempting to prove renormalizability of a non-Abelian gauge theory via flow equations, following the path taken before in the case of a scalar field theory, one finds oneself confronted with a serious obstacle to be surmounted. By their definition the Schwinger functions of a non-Abelian gauge theory are not gauge invariant individually; the local gauge invariance of the theory, however, compels them to satisfy the system of Slavnov–Taylor identities [79, 80]. These identities are inevitably violated if one employs a momentum cutoff in the intermediate regularization procedure. Moreover, the Slavnov–Taylor identities are generated by nonlinear transformations of the fields (the BRS-transformations [81, 82]) which, being composite fields, have to be renormalized, too. In the sequel we follow the general line of [42], hereafter referred to as "I", incorporating the simplifications due to a more appropriate cutoff function. For the sake of readability, however, a coherent detailed argumentation is kept up. As concerns a number of intermediate proofs and technical derivations, we refer to the original article. After presenting in Sec. 4.1 the classical action of the theory considered, as a first step we disregard in Sec. 4.2 the Slavnov–Taylor identities and establish for an arbitrary set of renormalization conditions at a physical renormalization point a finite UV-behavior of the Schwinger functions without and with the insertion of one BRS-variation or of the gauge symmetry violation. The procedure is essentially the same as in the case of the scalar theory. Having thus established a family of finite theories, in Sec. 4.3 the violation of the Slavnov–Taylor identities of the amputated truncated Schwinger functions at the physical value Λ = 0 for the flow parameter and fixed Λ0 is worked out, as well as the BRS-variation of the bare action.
Moreover, the violated Slavnov–Taylor identities of the vertex functions at Λ = 0, Λ0 fixed, are deduced. Flow equations for the vertex functions, however, are not used. Finally, in Sec. 4.4, the Slavnov–Taylor identities are accomplished by a proper choice of physical renormalization conditions. To this end, first termwise equivalence relations between the violated Slavnov–Taylor identities of the vertex functions and the BRS-variation of the bare action are established. Choosing freely 9 physical renormalization conditions, we then determine a related bare action such that the relevant part of its BRS-variation vanishes. Hence, due to the equivalence relations, the relevant part of the violated Slavnov–Taylor identities vanishes, too. From these 53 equations, the remaining 37 + 7 − 9 renormalization conditions are obtained, determining a UV-finite theory. It satisfies the Slavnov–Taylor identities, since the irrelevant part of the violation has a vanishing UV-limit, too.
4.1. The classical action

We begin collecting some basic properties of the classical Euclidean SU(2) Yang–Mills–Higgs model on four-dimensional Euclidean space-time, following closely the monograph of Faddeev and Slavnov [83]. This model involves the real Yang–Mills field $\{A^a_\mu\}_{a=1,2,3}$ and the complex scalar doublet $\{\phi_\alpha\}_{\alpha=1,2}$, assumed to be smooth functions which fall off rapidly. The classical action has the form
$$ S_{inv} = \int dx \Big\{ \frac{1}{4} F^a_{\mu\nu} F^a_{\mu\nu} + \frac{1}{2} (\nabla_\mu \phi)^* \nabla_\mu \phi + \lambda (\phi^*\phi - \rho^2)^2 \Big\} , \tag{4.1} $$
with the curvature tensor
$$ F^a_{\mu\nu}(x) = \partial_\mu A^a_\nu(x) - \partial_\nu A^a_\mu(x) + g\, \epsilon^{abc} A^b_\mu(x) A^c_\nu(x) \tag{4.2} $$
and the covariant derivative
$$ \nabla_\mu = \partial_\mu + g\, \frac{1}{2i}\, \sigma^a A^a_\mu(x) \tag{4.3} $$
acting on the SU(2)-spinor φ. The parameters g, λ, ρ are real positive, $\epsilon^{abc}$ is totally skew symmetric, $\epsilon^{123} = +1$, and $\{\sigma^a\}_{a=1,2,3}$ are the standard Pauli matrices. For simplicity the wave function normalizations of the fields are chosen equal to one. The action (4.1) is invariant under local gauge transformations of the fields
$$ \frac{1}{2i}\, \sigma^a A^a_\mu(x) \to u(x)\, \frac{1}{2i}\, \sigma^a A^a_\mu(x)\, u^*(x) + g^{-1} u(x)\, \partial_\mu u^*(x) , \qquad \phi(x) \to u(x)\phi(x) , \tag{4.4} $$
with $u : R^4 \to SU(2)$, smooth. A stable ground state of the action (4.1) implies spontaneous symmetry breaking, taken into account by reparametrizing the complex scalar doublet as
$$ \phi(x) = \begin{pmatrix} B^2(x) + i B^1(x) \\ \rho + h(x) - i B^3(x) \end{pmatrix} \tag{4.5} $$
where $\{B^a(x)\}_{a=1,2,3}$ is a real triplet and h(x) the real Higgs field. Moreover, in place of the parameters ρ, λ the masses
$$ m = \frac{1}{2}\, g\rho , \qquad M = (8\lambda\rho^2)^{\frac{1}{2}} \tag{4.6} $$
are used. Since we aim at a quantized theory, pure gauge degrees of freedom have to be eliminated. We choose the 't Hooft gauge fixing^g
$$ S_{g.f.} = \frac{1}{2\alpha} \int dx\, (\partial_\mu A^a_\mu - \alpha m B^a)^2 , \tag{4.7} $$
with α ∈ R+ , implying complete spontaneous symmetry breaking. With regard to functional integration this condition is implemented by introducing anticommuting
g The general α-gauge [83] would lead to mixed propagators; in the Lorentz gauge the fields $\{B^a\}$ would be massless.
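Faddeev–Popov ghost and antighost fields $\{c^a\}_{a=1,2,3}$ and $\{\bar c^a\}_{a=1,2,3}$, respectively, and forming with these six independent scalar fields the additional interaction term
$$ S_{gh} = -\int dx\, \bar c^a \Big\{ (-\partial_\mu \partial_\mu + \alpha m^2)\, \delta^{ab} + \frac{1}{2}\, \alpha g m\, h\, \delta^{ab} + \frac{1}{2}\, \alpha g m\, \epsilon^{acb} B^c - g\, \partial_\mu\, \epsilon^{acb} A^c_\mu \Big\} c^b . \tag{4.8} $$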
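Hence, the total "classical action" is
$$ S_{BRS} = S_{inv} + S_{g.f.} + S_{gh} , \tag{4.9} $$
which we decompose as
$$ S_{BRS} = \int dx\, \{ L_{quad}(x) + L_{int}(x) \} \tag{4.10} $$
into its quadratic part, with $\Delta \equiv \partial_\mu \partial_\mu$,
$$ L_{quad} = \frac{1}{4} (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 + \frac{1}{2\alpha} (\partial_\mu A^a_\mu)^2 + \frac{1}{2} m^2 A^a_\mu A^a_\mu + \frac{1}{2} h (-\Delta + M^2) h + \frac{1}{2} B^a (-\Delta + \alpha m^2) B^a - \bar c^a (-\Delta + \alpha m^2) c^a , \tag{4.11} $$
and into its interaction part:
$$ L_{int} = g\, \epsilon^{abc} (\partial_\mu A^a_\nu) A^b_\mu A^c_\nu + \frac{1}{4} g^2 (\epsilon^{abc} A^b_\mu A^c_\nu)^2 + \frac{1}{2} g \{ (\partial_\mu h) A^a_\mu B^a - h A^a_\mu \partial_\mu B^a - \epsilon^{abc} A^a_\mu (\partial_\mu B^b) B^c \} $$
$$ + \frac{1}{8} g A^a_\mu A^a_\mu \{ 4 m h + g (h^2 + B^a B^a) \} + \frac{1}{4} g\, \frac{M^2}{m}\, h (h^2 + B^a B^a) + \frac{1}{32} g^2 \Big( \frac{M}{m} \Big)^2 (h^2 + B^a B^a)^2 $$
$$ - \frac{1}{2} \alpha g m\, \bar c^a \{ h \delta^{ab} + \epsilon^{acb} B^c \} c^b - g\, \epsilon^{acb} (\partial_\mu \bar c^a) A^c_\mu c^b . \tag{4.12} $$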
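The purely scalar couplings in (4.11) and (4.12) can be cross-checked numerically against the potential term of (4.1). The sketch below is an illustrative check, not part of the paper: it inserts the parametrization (4.5) into λ(φ*φ − ρ²)², uses the masses (4.6), and compares with the Higgs mass term of (4.11) plus the two scalar self-couplings of (4.12). The function names and the numerical test values are assumptions chosen for the example.

```python
import math

def higgs_potential(h, B, lam, rho):
    """lambda * (phi^* phi - rho^2)^2 for the doublet parametrized as in (4.5)."""
    B1, B2, B3 = B
    # |B2 + i B1|^2 + |rho + h - i B3|^2
    phi_sq = (B2**2 + B1**2) + ((rho + h)**2 + B3**2)
    return lam * (phi_sq - rho**2) ** 2

def expanded_potential(h, B, g, m, M):
    """Mass term of h from (4.11) plus the scalar self-couplings of (4.12)."""
    s = h * h + sum(b * b for b in B)  # h^2 + B^a B^a
    return (0.5 * M**2 * h * h
            + 0.25 * g * (M**2 / m) * h * s
            + (g * g / 32.0) * (M / m) ** 2 * s * s)

# Arbitrary positive test values (illustrative only).
g, lam, rho = 0.65, 0.13, 2.4
m = 0.5 * g * rho                    # (4.6)
M = math.sqrt(8.0 * lam * rho**2)    # (4.6)
h, B = 0.37, (0.2, -0.5, 0.9)

lhs = higgs_potential(h, B, lam, rho)
rhs = expanded_potential(h, B, g, m, M)
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating-point rounding, reflecting the identity φ*φ − ρ² = 2ρh + h² + BᵃBᵃ.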
In (4.11) we recognize the important properties that all fields are massive and that no coupling term $A^a_\mu \partial_\mu B^a$ appears. The classical action $S_{BRS}$, (4.10), shows the following symmetries:

(i) Euclidean invariance: $S_{BRS}$ is an O(4)-scalar.

(ii) Rigid SO(3)-isosymmetry: The fields $\{A^a_\mu\}$, $\{B^a\}$, $\{c^a\}$, $\{\bar c^a\}$ are isovectors and h an isoscalar; $S_{BRS}$ is invariant under spacetime independent SO(3)-transformations.
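(iii) BRS-invariance: Introducing the classical composite fields
$$ \psi^a_\mu(x) = \{ \partial_\mu \delta^{ab} + g\, \epsilon^{arb} A^r_\mu(x) \}\, c^b(x) , $$
$$ \psi(x) = -\frac{1}{2}\, g\, B^a(x)\, c^a(x) , $$
$$ \psi^a(x) = \Big\{ \Big( m + \frac{1}{2}\, g\, h(x) \Big) \delta^{ab} + \frac{1}{2}\, g\, \epsilon^{arb} B^r(x) \Big\}\, c^b(x) , $$
$$ \Omega^a(x) = \frac{1}{2}\, g\, \epsilon^{apq} c^p(x)\, c^q(x) , \tag{4.13} $$
the BRS-transformations of the basic fields are defined as
$$ A^a_\mu(x) \to A^a_\mu(x) - \psi^a_\mu(x)\, \epsilon , $$
$$ h(x) \to h(x) - \psi(x)\, \epsilon , $$
$$ B^a(x) \to B^a(x) - \psi^a(x)\, \epsilon , $$
$$ c^a(x) \to c^a(x) - \Omega^a(x)\, \epsilon , $$
$$ \bar c^a(x) \to \bar c^a(x) - \frac{1}{\alpha} \big( \partial_\nu A^a_\nu(x) - \alpha m B^a(x) \big)\, \epsilon . \tag{4.14} $$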
In these transformations ε is a spacetime independent Grassmann element that commutes with the fields $\{A^a_\mu, h, B^a\}$ but anticommutes with the (anti-)ghosts $\{c^a, \bar c^a\}$. To show the BRS-invariance of the total classical action (4.9) one first observes that the composite classical fields (4.13) are themselves invariant under the BRS-transformations (4.14). Moreover, we can write (4.8) in the form
$$ S_{gh} = -\int dx\, \bar c^a \{ -\partial_\mu \psi^a_\mu + \alpha m \psi^a \} . \tag{4.15} $$
Using these properties the BRS-invariance of the classical action (4.9) follows upon direct verification. It is convenient to add to the classical action (4.9) source terms both for the fields and the composite fields introduced, defining the extended action
$$ S_c = S_{BRS} + \int dx\, \{ \gamma^a_\mu \psi^a_\mu + \gamma \psi + \gamma^a \psi^a + \omega^a \Omega^a \} - \int dx\, \{ j^a_\mu A^a_\mu + s h + b^a B^a + \bar\eta^a c^a + \bar c^a \eta^a \} . \tag{4.16} $$
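The sources $\gamma^a_\mu$, γ, $\gamma^a$ all have canonical dimension 2, ghost number −1 and are Grassmann elements, whereas $\omega^a$ has canonical dimension 2 and ghost number −2; the sources $\eta^a$ and $\bar\eta^a$ have ghost number +1 and −1, respectively, and are Grassmann elements. Employing the BRS-operator D, defined by
$$ D = \int dx \Big\{ j^a_\mu \frac{\delta}{\delta\gamma^a_\mu} + s \frac{\delta}{\delta\gamma} + b^a \frac{\delta}{\delta\gamma^a} + \bar\eta^a \frac{\delta}{\delta\omega^a} + \eta^a \Big( \frac{1}{\alpha}\, \partial_\nu \frac{\delta}{\delta j^a_\nu} - m \frac{\delta}{\delta b^a} \Big) \Big\} , \tag{4.17} $$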
the BRS-transformation of the extended action $S_c$, (4.16), can be written as
$$ S_c \to S_c + D S_c\, \epsilon . \tag{4.18} $$
Of course, ε also anticommutes with the sources of Grassmannian type.

4.2. Flow equations: Renormalizability without Slavnov–Taylor identities

In view of the various fields present, it is convenient to introduce a short collective notation for the fields and sources. We denote:

(i) the bosonic fields and the corresponding sources, respectively, by
$$ \varphi_\tau = (A^a_\mu, h, B^a) , \qquad J_\tau = (j^a_\mu, s, b^a) , \tag{4.19} $$
(ii) all fields and their respective sources by
$$ \Phi = (\varphi_\tau, c^a, \bar c^a) , \qquad K = (J_\tau, \bar\eta^a, \eta^a) , \tag{4.20} $$
(iii) the insertions and their sources
$$ \psi_\tau = (\psi^a_\mu, \psi, \psi^a) , \qquad \gamma_\tau = (\gamma^a_\mu, \gamma, \gamma^a) , \qquad \xi = (\gamma_\tau, \omega^a) . \tag{4.21} $$
The quadratic part of $S_{BRS}$, (4.10), defines the inverses of the various unregularized free propagators. We start from the theory defined on finite volume, as described in Sec. 2.1. With the notation introduced there, we have
$$ \int dx\, L_{quad}(x) \equiv Q(\Phi) = \frac{1}{2} \langle A^a_\mu, (C^{-1})_{\mu\nu} A^a_\nu \rangle + \frac{1}{2} \langle h, C^{-1} h \rangle + \frac{1}{2} \langle B^a, S^{-1} B^a \rangle - \langle \bar c^a, S^{-1} c^a \rangle , \tag{4.22} $$
where the Fourier transforms of these free propagators (compare (2.7) with Λ = 0, Λ0 = ∞) turn out to be
$$ C(k) = \frac{1}{k^2 + M^2} , \qquad S(k) = \frac{1}{k^2 + \alpha m^2} , \qquad C_{\mu\nu}(k) = \frac{1}{k^2 + m^2} \Big( \delta_{\mu\nu} - (1-\alpha) \frac{k_\mu k_\nu}{k^2 + \alpha m^2} \Big) . \tag{4.23} $$
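As a quick consistency check of the vector propagator in (4.23), one can verify numerically that it inverts the momentum-space kernel of the gauge-field part of (4.11). In the sketch below, that kernel is taken to be $(k^2 + m^2)\delta_{\mu\nu} - (1 - 1/\alpha) k_\mu k_\nu$, which is our reading of the three $A$-dependent terms of (4.11) after Fourier transformation; the function names and the sample values of k, m and α are illustrative assumptions.

```python
def inverse_propagator(k, m, alpha):
    """Kernel (C^{-1})_{mu nu}(k) = (k^2 + m^2) delta_{mu nu} - (1 - 1/alpha) k_mu k_nu."""
    k2 = sum(x * x for x in k)
    return [[(k2 + m * m) * (mu == nu) - (1.0 - 1.0 / alpha) * k[mu] * k[nu]
             for nu in range(4)] for mu in range(4)]

def propagator(k, m, alpha):
    """Vector propagator C_{mu nu}(k) as given in (4.23)."""
    k2 = sum(x * x for x in k)
    return [[((mu == nu) - (1.0 - alpha) * k[mu] * k[nu] / (k2 + alpha * m * m))
             / (k2 + m * m) for nu in range(4)] for mu in range(4)]

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

# Arbitrary Euclidean momentum and parameters (illustrative only).
k = [0.3, -1.2, 0.7, 2.0]
m, alpha = 0.8, 2.5

product = matmul(inverse_propagator(k, m, alpha), propagator(k, m, alpha))
max_deviation = max(abs(product[i][j] - (i == j))
                    for i in range(4) for j in range(4))
print(max_deviation)
```

The printed deviation from the unit matrix should be of the order of machine precision. For α = 1 the longitudinal term drops out and the propagator reduces to $\delta_{\mu\nu}/(k^2+m^2)$.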
Again the notation has been abused omitting the "hat". Furthermore, we shall use C(k) as a collective symbol for these propagators. A Gaussian product measure, the covariances of which are a regularized version of the propagators (4.23), forms the basis to quantize the theory by functional integration. Although gauge symmetry is violated by any momentum cutoff, one should try to reduce the bothersome consequences as far as possible. Instead of the simple form (2.10) we choose the cutoff function
$$ \sigma_\Lambda(k^2) = \exp\Big( -\frac{(k^2+m^2)(k^2+\alpha m^2)(k^2+M^2)}{\Lambda^6} \Big) \Big\{ 1 + \frac{(1+\alpha)\, m^2 M^2 + \alpha m^4}{\Lambda^6}\, k^2 \Big\} . \tag{4.24} $$
It is positive, invertible and analytic as the former, but satisfies in addition
$$ \frac{d}{dk^2}\, \sigma_\Lambda(k^2) \Big|_{k^2=0} = 0 . \tag{4.25} $$
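Property (4.25) can be checked numerically. The sketch below implements the cutoff function (4.24) as read from the text and estimates its derivative with respect to k² at k² = 0 by a central difference, which is legitimate because the function is analytic in k² and can be evaluated at slightly negative arguments. For comparison it also differentiates the bare exponential without the polynomial prefactor; this comparison function is our own illustration and is not claimed to coincide with the form (2.10). Parameter values and function names are assumptions.

```python
import math

def sigma(k2, lam, m, M, alpha):
    """Cutoff function sigma_Lambda(k^2) of (4.24); lam stands for Lambda."""
    poly = (k2 + m * m) * (k2 + alpha * m * m) * (k2 + M * M)
    slope = ((1.0 + alpha) * m * m * M * M + alpha * m**4) / lam**6
    return math.exp(-poly / lam**6) * (1.0 + slope * k2)

def bare_exponential(k2, lam, m, M, alpha):
    """Same exponential without the prefactor (illustrative comparison only)."""
    poly = (k2 + m * m) * (k2 + alpha * m * m) * (k2 + M * M)
    return math.exp(-poly / lam**6)

def derivative_at_zero(f, lam, m, M, alpha, h=1e-5):
    """Central-difference estimate of d f / d k^2 at k^2 = 0."""
    return (f(h, lam, m, M, alpha) - f(-h, lam, m, M, alpha)) / (2.0 * h)

# Arbitrary positive parameters (illustrative only).
lam, m, M, alpha = 2.0, 1.0, 1.5, 2.0

d_sigma = derivative_at_zero(sigma, lam, m, M, alpha)
d_bare = derivative_at_zero(bare_exponential, lam, m, M, alpha)
print(d_sigma, d_bare)
```

With these values the first number vanishes up to discretization and rounding error, while the second is clearly nonzero and negative: the linear prefactor in (4.24) is exactly what compensates the first-order term of the exponential.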
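This property is the raison d'être for the particular choice (4.24), compared with I (30). Employing this cutoff function we define the regularized propagators, 0 ≤ Λ ≤ Λ0 < ∞,
$$ C^{\Lambda,\Lambda_0}(k) \equiv C(k)\, \sigma_{\Lambda,\Lambda_0}(k^2) := C(k)\, \frac{\sigma_{\Lambda_0}(k^2) - \sigma_\Lambda(k^2)}{\sigma_{\Lambda_0}(0)} . \tag{4.26} $$
They satisfy the bounds, valid for |w| ≤ 4,
$$ \Big| \Big( \prod_{i=1}^{|w|} \frac{\partial}{\partial k_{\mu_i}} \Big) \frac{\partial}{\partial\Lambda}\, C^{\Lambda,\Lambda_0}(k) \Big| \le \begin{cases} c_{|w|}\, \exp\Big( -\dfrac{(k^2+m^2)(k^2+\alpha m^2)(k^2+M^2)}{2 m^6} \Big) & \text{for } 0 \le \Lambda \le m , \\[2ex] \Lambda^{-3-|w|}\, P_{|w|}\Big( \dfrac{|k|}{\Lambda} \Big) \exp\Big( -\dfrac{(k^2+m^2)(k^2+\alpha m^2)(k^2+M^2)}{\Lambda^6} \Big) & \text{for } \Lambda > m \end{cases} \tag{4.27} $$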
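with polynomials $P_{|w|}$ having positive coefficients. These coefficients, as well as the constants $c_{|w|}$, only depend on α, m, M, |w|. The bounds (4.27), only valid in the cases |w| ≤ 4, are sufficient for our purpose and have the same form as the bounds (2.42). Writing
$$ \langle \Phi, K \rangle := \int dx \Big( \sum_\tau \varphi_\tau(x) J_\tau(x) + \bar c^a(x) \eta^a(x) + \bar\eta^a(x) c^a(x) \Big) , \tag{4.28} $$
the characteristic functional of the Gaussian product measure with covariances $\hbar C^{\Lambda,\Lambda_0}$, (4.26), (4.23), is given by
$$ \int d\mu_{\Lambda,\Lambda_0}(\Phi)\, e^{\frac{1}{\hbar} \langle \Phi, K \rangle} = e^{\frac{1}{\hbar} P(K)} , \tag{4.29} $$
$$ P(K) = \frac{1}{2} \langle j^a_\mu, C^{\Lambda,\Lambda_0}_{\mu\nu} j^a_\nu \rangle + \frac{1}{2} \langle s, C^{\Lambda,\Lambda_0} s \rangle + \frac{1}{2} \langle b^a, S^{\Lambda,\Lambda_0} b^a \rangle - \langle \bar\eta^a, S^{\Lambda,\Lambda_0} \eta^a \rangle . \tag{4.30} $$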
The free propagators (4.23) reveal the mass dimensions of the corresponding quantum fields: each of the fields has a mass dimension equal to one, attributing equal values to the ghost and antighost field. To promote the classical model to a quantum field theory we consider the generating functional $L^{\Lambda,\Lambda_0}(\Phi)$ of the amputated truncated Schwinger functions. It unfolds according to the integrated flow equation, cf. (2.23),
$$ e^{-\frac{1}{\hbar} ( L^{\Lambda,\Lambda_0}(\Phi) + I^{\Lambda,\Lambda_0} )} = e^{\hbar \Delta_{\Lambda,\Lambda_0}}\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0}(\Phi)} \tag{4.31} $$
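from the bare functional $L^{\Lambda_0,\Lambda_0}(\Phi)$, which forms its initial value at Λ = Λ0. The functional Laplace operator appearing has the form
$$ \Delta_{\Lambda,\Lambda_0} = \frac{1}{2} \Big\langle \frac{\delta}{\delta A^a_\mu}, C^{\Lambda,\Lambda_0}_{\mu\nu} \frac{\delta}{\delta A^a_\nu} \Big\rangle + \frac{1}{2} \Big\langle \frac{\delta}{\delta h}, C^{\Lambda,\Lambda_0} \frac{\delta}{\delta h} \Big\rangle + \frac{1}{2} \Big\langle \frac{\delta}{\delta B^a}, S^{\Lambda,\Lambda_0} \frac{\delta}{\delta B^a} \Big\rangle + \Big\langle \frac{\delta}{\delta c^a}, S^{\Lambda,\Lambda_0} \frac{\delta}{\delta \bar c^a} \Big\rangle . \tag{4.32} $$
Since the local gauge symmetry is violated by the regularization, the bare functional
$$ L^{\Lambda_0,\Lambda_0}(\Phi) = \int dx\, L_{int}(x) + L^{\Lambda_0,\Lambda_0}_{c.t.}(\Phi) \tag{4.33} $$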
has at first to be chosen sufficiently general in order to allow the restoration of the Slavnov–Taylor identities at the end. Therefore, we add as counterterms to the given interaction part (4.12) of classical origin all local terms of mass dimension ≤ 4 which are permitted by the unbroken global symmetries, i.e. Euclidean O(4)-invariance and SO(3)-isosymmetry. There are 37 such terms, by definition all of order O(ħ). The bare functional is presented in Appendix A. We remark that no irrelevant terms I, (107) and (108) are introduced in the bare interaction, now unnecessary due to (4.25). The decomposition of the generating functional $L^{\Lambda,\Lambda_0}(\Phi)$ is written employing a multiindex n, the components of which denote the number of each source field species appearing:
$$ n = (n_A, n_h, n_B, n_{\bar c}, n_c) , \qquad |n| = n_A + n_h + n_B + n_{\bar c} + n_c . \tag{4.34} $$
Moreover, we consider the functional within a formal loop expansion, hence
$$ L^{\Lambda,\Lambda_0}(\Phi) = \sum_{|n|=3}^{\infty} L^{\Lambda,\Lambda_0}_{l=0,n}(\Phi) + \sum_{l=1}^{\infty} \hbar^l \sum_{|n|=1}^{\infty} L^{\Lambda,\Lambda_0}_{l,n}(\Phi) . \tag{4.35} $$
Disregarding the vacuum part, we can study the flow of the n-point functions in the infinite volume limit $\Omega \to R^4$, $\Phi \in S(R^4)$. With our conventions (2.24) of the Fourier transformation, the momentum representation of the n-point function with multiindex n, (4.34), at loop order l is obtained as an |n|-fold functional derivative:
$$ (2\pi)^{4(|n|-1)}\, \delta^n_{\hat\Phi(p)}\, L^{\Lambda,\Lambda_0}_l \big|_{\Phi=0} = \delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{l,n}(p_1,\dots,p_{|n|}) . \tag{4.36} $$
To avoid clumsiness, the notation does not reveal how the momenta are assigned to the multiindex n, and in addition, it suppresses the O(4)- and SO(3)-tensor structure. From the definition (4.36) of the n-point function follows that it is completely symmetric (antisymmetric) upon permuting the variables belonging to each of the bosonic (fermionic) species occurring. Proceeding exactly as in the case of the scalar field, the flow equation (4.31) is converted into a system of flow equations relating the n-point functions. It looks like (2.31), where n is now a multiindex and the residual symmetrization has to be extended to a corresponding antisymmetrization in case of the (anti)ghost fields, cf. I (37). The system is integrated in the familiar
way. At first the tree order l = 0 has to be gained, fully determined by the classical descendant (4.12) appearing in the initial condition (4.33) at Λ = Λ0. Given the tree order l = 0, the inductive integration ascends in the loop order l, for fixed l ascends in |n|, and for fixed l, n descends in w from |w| = 4 to w = 0, with initial conditions as follows:

(A1) For |n| + |w| > 4 at Λ = Λ0,
$$ \partial^w L^{\Lambda_0,\Lambda_0}_{l,n}(p_1,\dots,p_{|n|}) = 0 , \tag{4.37} $$
due to the choice of the bare functional (4.33).

(A2) For the (relevant) cases |n| + |w| ≤ 4, renormalization conditions at the physical value Λ = 0 and a chosen renormalization point are freely prescribed order-by-order, subject only to the unbroken O(4)- and SO(3)-symmetries. These conditions determine the 37 local counterterms entering the bare functional (4.33). For simplicity we choose, as before, vanishing momenta as renormalization point.

Repeating exactly the steps that in the case of the scalar field led to the Propositions 2.1 and 2.2, one establishes in the present case analogous bounds, just reading now n as a multiindex (cf. I, Propositions 1 and 2). As a consequence of these bounds a finite theory results in the limit Λ0 → ∞; however, it is not yet the gauge theory looked for! The problem to be solved is to select renormalization conditions (A2) such that the n-point functions in the limit Λ = 0, Λ0 → ∞ satisfy the Slavnov–Taylor identities. As worked out in the next section, establishing the Slavnov–Taylor identities necessitates considering Schwinger functions with a composite field inserted. There will appear two kinds of such insertions: the composite BRS-fields forming local insertions, and a space-time integrated insertion describing the intermediate violation of the Slavnov–Taylor identities. The classical composite BRS-fields (4.13) all have mass dimension 2 and transform as vector-isovector, scalar-isoscalar, scalar-isovector and scalar-isovector, respectively. Moreover, the first three have ghost number 1, whereas the last has ghost number 2. Thus, adding counterterms according to the rules formulated in Sec. 2.4, we introduce the bare composite fields
$$ \psi^a_\mu(x) = R^0_1\, \partial_\mu c^a(x) + R^0_2\, g\, \epsilon^{arb} A^r_\mu(x)\, c^b(x) , $$
$$ \psi(x) = -R^0_3\, \frac{g}{2}\, B^a(x)\, c^a(x) , $$
$$ \psi^a(x) = R^0_4\, m\, c^a(x) + R^0_5\, \frac{g}{2}\, h(x)\, c^a(x) + R^0_6\, \frac{g}{2}\, \epsilon^{arb} B^r(x)\, c^b(x) , $$
$$ \Omega^a(x) = R^0_7\, \frac{g}{2}\, \epsilon^{apq} c^p(x)\, c^q(x) , \tag{4.38} $$
keeping the notation introduced for the classical terms and using it henceforth exclusively according to (4.38). We set
$$ R^0_i = 1 + O(\hbar) , \tag{4.39} $$
thus viewing the counterterms again as formal power series in ħ; the tree order $\hbar^0$ provides the classical terms (4.13). The reader notices that there is no insertion attributed to the linear variation of the antighost field. It will be seen that the Slavnov–Taylor identities can be established generating this variation by functional derivation with respect to the sources of the fields involved. We shall have to deal with Schwinger functions with one insertion. Similarly as in Sec. 2.4, the bare interaction (4.33) is modified adding the composite fields (4.38) coupled to corresponding sources, introduced in (4.16),
$$ \tilde L^{\Lambda_0,\Lambda_0} := L^{\Lambda_0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}(\xi) , \tag{4.40} $$
$$ L^{\Lambda_0,\Lambda_0}(\xi) = \int dx\, \{ \gamma^a_\mu(x) \psi^a_\mu(x) + \gamma(x) \psi(x) + \gamma^a(x) \psi^a(x) + \omega^a(x) \Omega^a(x) \} . \tag{4.41} $$
Then, from the corresponding generating functional of the regularized amputated truncated Schwinger functions with one insertion ψ(x),
$$ L^{\Lambda,\Lambda_0}_\gamma(x;\Phi) := \frac{\delta}{\delta\gamma(x)}\, \tilde L^{\Lambda,\Lambda_0} \Big|_{\xi=0} , \qquad \hat L^{\Lambda,\Lambda_0}_\gamma(q;\Phi) = \int dx\, e^{iqx}\, L^{\Lambda,\Lambda_0}_\gamma(x;\Phi) , \tag{4.42} $$
with analogous expressions for the other insertions, after a loop expansion, follows for the n-point functions with one insertion ψ,
$$ \delta(q + p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{\gamma;l,n}(q;p_1,\dots,p_{|n|}) := (2\pi)^{4(|n|-1)}\, \delta^n_{\hat\Phi(p)}\, \hat L^{\Lambda,\Lambda_0}_{\gamma;l}(q;\Phi) \big|_{\Phi=0} , \tag{4.43} $$
a system of flow equations (cf. I (46)–(51)). From each of these systems the renormalizability of the amputated truncated Schwinger functions with one insertion can be deduced inductively in the familiar way. We denote by ξ any of the labels $\gamma^a_\mu$, γ, $\gamma^a$, $\omega^a$. First, the tree order l = 0 is obtained from its initial condition at Λ = Λ0. For l ≥ 1 the initial conditions are:

(B1) If |n| + |w| > 2 at Λ = Λ0,
$$ \partial^w L^{\Lambda_0,\Lambda_0}_{\xi;l,n}(q;p_1,\dots,p_{|n|}) = 0 . \tag{4.44} $$

(B2) If |n| + |w| ≤ 2 at Λ = 0 and at vanishing momenta (the renormalization point) the initial condition can be fixed freely in each loop order, provided the Euclidean symmetry and the isosymmetry are respected.

In total, there are 7 such renormalization conditions which then determine the 7 parameters $R^0_i$ entering the bare insertions (4.38). Given the bounds of the case without insertion one deduces inductively the analogues of the Propositions 2.3 and 2.4, with n now a multiindex and D = 2 (cf. I, Prop. 3). Hence, we
have boundedness and convergence of the amputated truncated Schwinger functions with the insertion of one BRS-variation. The intermediate violation of the Slavnov–Taylor identities, as will be derived in the following section, leads to a bare space-time integrated insertion of the form
$$ L_1^{\Lambda_0,\Lambda_0}(\Phi) = \int dx\, N(x) , \tag{4.45} $$
$$ N(x) = Q(x) + Q'(x; (\Lambda_0)^{-1}) . \tag{4.46} $$
Here Q(x) is a local polynomial in the fields and their derivatives, having canonical mass dimension D = 5, whereas $Q'(x; (\Lambda_0)^{-1})$ is nonpolynomial in the field derivatives but with powers of $(\Lambda_0)^{-1}$ as coefficients such that it becomes irrelevant. The individual terms composing N(x) involve at most five fields and have ghost number equal to one. We have to control $L_1^{\Lambda,\Lambda_0}(\Phi)$, the L-functional with one (bare) insertion (4.45). Hence, in analogy to the local case, cf. (2.61), a modified bare action
$$ L^{\Lambda_0,\Lambda_0}(\Phi) + \chi L_1^{\Lambda_0,\Lambda_0}(\Phi) \tag{4.47} $$
is introduced as initial condition in the (integrated form of the) flow equation
$$ e^{-\frac{1}{\hbar} ( L_\chi^{\Lambda,\Lambda_0} + I^{\Lambda,\Lambda_0} )} := e^{\hbar \Delta_{\Lambda,\Lambda_0}}\, e^{-\frac{1}{\hbar} ( L^{\Lambda_0,\Lambda_0} + \chi L_1^{\Lambda_0,\Lambda_0} )} . \tag{4.48} $$
Herefrom results the generating functional of the (regularized) amputated truncated Schwinger functions with one insertion (4.45) as
$$ L_1^{\Lambda,\Lambda_0}(\Phi) = \frac{\partial}{\partial\chi}\, L_\chi^{\Lambda,\Lambda_0}(\Phi) \Big|_{\chi=0} . \tag{4.49} $$
It satisfies a linear differential flow equation which is easily obtained relating it to the case of a bare local insertion, cf. (2.61)–(2.69),
$$ \int dx\, \varrho(x) N(x) $$
and observing
$$ \frac{\partial}{\partial\chi}\, L_\chi^{\Lambda,\Lambda_0}(\Phi) \Big|_{\chi=0} = \int dx\, \frac{\delta}{\delta\varrho(x)}\, \tilde L^{\Lambda,\Lambda_0}(\varrho;\Phi) \Big|_{\varrho=0} = \int dx\, L^{\Lambda,\Lambda_0}_{(1)}(x;\Phi) = \hat L^{\Lambda,\Lambda_0}_{(1)}(0;\Phi) . $$
Hence, the differential flow equation satisfied by the functional $L_1^{\Lambda,\Lambda_0}(\Phi)$ is the space-time integrated analogue of (2.67). Performing a loop expansion, the amputated truncated n-point functions with one insertion (4.45), n a multiindex (4.34),
$$ \delta(p_1 + \cdots + p_{|n|})\, L^{\Lambda,\Lambda_0}_{1;l,n}(p_1,\dots,p_{|n|}) := (2\pi)^{4(|n|-1)}\, \delta^n_{\hat\Phi(p)}\, L^{\Lambda,\Lambda_0}_{1;l}(\Phi) \big|_{\Phi=0} \tag{4.50} $$
then satisfy a system of flow equations similar to the case of the local BRS-insertions, letting there the momentum take the value q = 0. As a consequence, we
obtain analogous bounds, but observing in the present case the dimension D = 5 (cf. I, Prop. 3). The irrelevant part appearing in the bare insertion (4.45) and (4.46) satisfies the required bounds to be admitted, cf. (2.57).

4.3. Violated Slavnov–Taylor identities

The Schwinger functions of the spontaneously broken Yang–Mills theory should be uniquely determined by its free physical parameters g, λ, m and the gauge fixing parameter α, once the normalization of the fields has been fixed. This uniqueness, as well as the physical gauge invariance, is accomplished by requiring the Schwinger functions to satisfy the Slavnov–Taylor identities. These identities, however, are inevitably violated by the intermediate regularization in momentum space. Our ultimate goal is to show that by a proper choice of the renormalization conditions the Slavnov–Taylor identities emerge upon removing the regularization. To this end we first examine the violation of the Slavnov–Taylor identities produced by the UV-cutoff Λ0. Our starting point is the generating functional of the regularized Schwinger functions, here considered at the physical value Λ = 0 of the flow parameter,
$$ Z^{0,\Lambda_0}(K) = \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0} + \frac{1}{\hbar} \langle \Phi, K \rangle} . $$
The Gaussian measure $d\mu_{0,\Lambda_0}(\Phi)$ corresponds to the quadratic form $\frac{1}{\hbar} Q^{0,\Lambda_0}(\Phi)$, cf. (4.22), (4.26), to wit:
$$ Q^{0,\Lambda_0}(\Phi) = \frac{1}{2} \langle A^a_\mu, (C^{0,\Lambda_0})^{-1}_{\mu\nu} A^a_\nu \rangle + \frac{1}{2} \langle h, (C^{0,\Lambda_0})^{-1} h \rangle + \frac{1}{2} \langle B^a, (S^{0,\Lambda_0})^{-1} B^a \rangle - \langle \bar c^a, (S^{0,\Lambda_0})^{-1} c^a \rangle . \tag{4.51} $$
Defining regularized BRS-variations (4.14), (4.38) of the fields by
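\[ \delta_{\mathrm{BRS}}\varphi_\tau(x) = -(\sigma_{0,\Lambda_0}\psi_\tau)(x) , \quad \delta_{\mathrm{BRS}}c^a(x) = -(\sigma_{0,\Lambda_0}\Omega^a)(x) , \quad \delta_{\mathrm{BRS}}\bar c^a(x) = -\Big(\sigma_{0,\Lambda_0}\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a\Big)\Big)(x) , \tag{4.52} \]

the BRS-variation of the Gaussian measure follows as

\[ d\mu_{0,\Lambda_0}(\Phi) \mapsto d\mu_{0,\Lambda_0}(\Phi)\Big(1 - \frac{1}{\hbar}\,\delta_{\mathrm{BRS}}Q^{0,\Lambda_0}(\Phi)\Big) . \tag{4.53} \]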
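Written more explicitly,

\[ \delta_{\mathrm{BRS}}Q^{0,\Lambda_0}(\Phi) = -\sum_\tau \langle\varphi_\tau, (C_\tau^{0,\Lambda_0})^{-1}\sigma_{0,\Lambda_0}\psi_\tau\rangle + \langle\bar c^a, (S^{0,\Lambda_0})^{-1}\sigma_{0,\Lambda_0}\Omega^a\rangle - \Big\langle \frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a , \sigma_{0,\Lambda_0}(S^{0,\Lambda_0})^{-1}c^a\Big\rangle , \tag{4.54} \]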
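it reveals that $\sigma_{0,\Lambda_0}$ just cancels its inverse appearing in the inverted propagators and, as a consequence, the BRS-variation of the Gaussian measure has mass dimension D = 5. The essential reason for using regularized BRS-variations (4.52) is to assure this property. From the requirement that the regularized generating functional $Z^{0,\Lambda_0}(K)$ be invariant under the BRS-variations (4.52) result the violated Slavnov–Taylor identities$^{h}$

\[ 0 \overset{!}{=} \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar} L^{\Lambda_0,\Lambda_0} + \frac{1}{\hbar}\langle\Phi,K\rangle}\Big(\delta_{\mathrm{BRS}}\langle\Phi,K\rangle - \delta_{\mathrm{BRS}}\big(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}\big)\Big) . \tag{4.55} \]

This equation can be rewritten, introducing modified generating functionals:

(i) With the modified bare interaction (4.40) we define

\[ \tilde Z^{0,\Lambda_0}(K,\xi) := \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar}\tilde L^{\Lambda_0,\Lambda_0} + \frac{1}{\hbar}\langle\Phi,K\rangle} , \tag{4.56} \]

in combination with a regularized version of the BRS-operator (4.17),

\[ \mathcal{D}_{\Lambda_0} = \sum_\tau \Big\langle J_\tau , \sigma_{0,\Lambda_0}\frac{\delta}{\delta\gamma_\tau}\Big\rangle + \Big\langle \bar\eta^a , \sigma_{0,\Lambda_0}\frac{\delta}{\delta\omega^a}\Big\rangle + \Big\langle \frac{1}{\alpha}\partial_\nu\frac{\delta}{\delta j^a_\nu} - m\frac{\delta}{\delta b^a} , \sigma_{0,\Lambda_0}\eta^a\Big\rangle . \tag{4.57} \]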
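(ii) In addition, we treat the BRS-variation of the bare action,

\[ L_1^{\Lambda_0,\Lambda_0} := -\delta_{\mathrm{BRS}}\big(Q^{0,\Lambda_0} + L^{\Lambda_0,\Lambda_0}\big) , \tag{4.58} \]

as a space-time integrated insertion with ghost number 1. Because of the regularizing factor $\sigma_{0,\Lambda_0}$, cf. (4.52), the integrand is not a polynomial in the fields and their derivatives. With $\chi \in \mathbb{R}$, we then define

\[ Z_\chi^{0,\Lambda_0}(K) := \int d\mu_{0,\Lambda_0}(\Phi)\, e^{-\frac{1}{\hbar}\big(L^{\Lambda_0,\Lambda_0} + \chi L_1^{\Lambda_0,\Lambda_0}\big) + \frac{1}{\hbar}\langle\Phi,K\rangle} . \tag{4.59} \]

Due to these definitions, the violated Slavnov–Taylor identities (4.55) can be written in the form

\[ \mathcal{D}_{\Lambda_0}\tilde Z^{0,\Lambda_0}(K,\xi)\big|_{\xi=0} = \frac{d}{d\chi} Z_\chi^{0,\Lambda_0}(K)\big|_{\chi=0} . \tag{4.60} \]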
h As long as the vacuum part is involved, one has to stay in finite volume.

From the modified functionals (4.56) and (4.59) follow, cf. (2.65), the generating functionals of the corresponding amputated truncated Schwinger functions

\[ \tilde Z^{0,\Lambda_0}(K,\xi) = e^{\frac{1}{\hbar}P(K)}\, e^{-\frac{1}{\hbar}\big(\tilde L^{0,\Lambda_0}(\varphi_\tau,c,\bar c;\xi) + I^{0,\Lambda_0}\big)} , \tag{4.61} \]

\[ Z_\chi^{0,\Lambda_0}(K) = e^{\frac{1}{\hbar}P(K)}\, e^{-\frac{1}{\hbar}\big(L_\chi^{0,\Lambda_0}(\varphi_\tau,c,\bar c) + I^{0,\Lambda_0}\big)} , \tag{4.62} \]

where the variables of the Z- and the L-functional are related as
\[ \varphi_\tau(x) = \int dy\, C_\tau^{0,\Lambda_0}(x-y) J_\tau(y) , \quad c^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\eta^a(y) , \quad \bar c^a(x) = -\int dy\, S^{0,\Lambda_0}(x-y)\bar\eta^a(y) . \tag{4.63} \]
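Furthermore, P(K), (4.30), has to be taken here at Λ = 0. We observe that the vacuum part $I^{0,\Lambda_0}$ present without insertions appears, since both insertions have positive ghost number. To have a less cumbersome notation in the rest of this section, we abbreviate

\[ L \equiv L^{0,\Lambda_0} = \tilde L^{0,\Lambda_0}\big|_{\xi=0} = L_\chi^{0,\Lambda_0}\big|_{\chi=0} , \qquad L^0 \equiv L^{\Lambda_0,\Lambda_0} , \]
\[ L_1 \equiv L_1^{0,\Lambda_0} := \frac{d}{d\chi} L_\chi^{0,\Lambda_0}\big|_{\chi=0} , \qquad L_1^0 \equiv L_1^{\Lambda_0,\Lambda_0} , \]
\[ L_\gamma \equiv L_\gamma^{0,\Lambda_0}(x;\Phi) , \qquad L_\gamma^0 \equiv L_\gamma^{\Lambda_0,\Lambda_0}(x;\Phi) = \frac{\delta}{\delta\gamma(x)}\tilde L^{\Lambda_0,\Lambda_0}(\xi)\big|_{\xi=0} , \tag{4.64} \]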
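see (4.40)–(4.42). Moreover, we denote the inverted unregularized propagators by

\[ D_\tau \equiv \Big( (-\Delta + m^2)\delta_{\mu\nu} - \frac{1-\alpha}{\alpha}\partial_\mu\partial_\nu ,\; -\Delta + M^2 ,\; -\Delta + \alpha m^2 \Big) , \qquad -\Delta + \alpha m^2 \equiv D . \tag{4.65} \]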
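From (4.60) we derive via (4.61)–(4.63), employing the previous abbreviations, the violated Slavnov–Taylor identities of the amputated truncated Schwinger functions:

\[ \Big\langle c^a , D\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a\Big)\Big\rangle - \Big\langle c^a , \sigma_{0,\Lambda_0}\Big(\partial_\nu\frac{\delta L}{\delta A^a_\nu} - m\frac{\delta L}{\delta B^a}\Big)\Big\rangle + \sum_\tau \langle\varphi_\tau , D_\tau L_{\gamma_\tau}\rangle - \langle\bar c^a , D L_{\omega^a}\rangle = L_1 . \tag{4.66} \]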
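As will turn out, we also need the explicit form of $L_1^0$, (4.58), i.e. the BRS-variation of the bare action. From its definition (4.58) follows directly, using (4.40) and (4.41), avoiding the detour in I, Eqs. (91)–(98),

\[ L_1^0 = \Big\langle c^a , D\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a\Big)\Big\rangle - \Big\langle \frac{\delta L^0}{\delta\bar c^a} , \sigma_{0,\Lambda_0}\Big(\frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a\Big)\Big\rangle + \sum_\tau \langle\varphi_\tau , D_\tau L^0_{\gamma_\tau}\rangle - \langle\bar c^a , D L^0_{\omega^a}\rangle + \sum_\tau \Big\langle \frac{\delta L^0}{\delta\varphi_\tau} , \sigma_{0,\Lambda_0} L^0_{\gamma_\tau}\Big\rangle - \Big\langle \frac{\delta L^0}{\delta c^a} , \sigma_{0,\Lambda_0} L^0_{\omega^a}\Big\rangle . \tag{4.67} \]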
Moreover, to restore the Slavnov–Taylor identities we shall also rely on proper vertex functions. Therefore, the violated form in terms of these functions is derived here as well. In the following, all functionals appearing should carry the superscript $0,\Lambda_0$, which is omitted, cf. (4.64). Considering the generating functional of the truncated Schwinger functions

\[ e^{\frac{1}{\hbar}\tilde W(K,\xi)} = \frac{\tilde Z(K,\xi)}{\tilde Z(0,0)} , \tag{4.68} \]
it follows from (4.60), together with (4.61) and (4.62) and using notation defined in (4.64), that

\[ \mathcal{D}_{\Lambda_0}\tilde W(K,\xi)\big|_{\xi=0} = -L_1(\varphi_\tau, c^a, \bar c^a) , \tag{4.69} \]

with arguments according to (4.63). Because of the inherent symmetries, the functional L, and hence also W, contain only one 1-point function, which we force to vanish by the renormalization condition

\[ \frac{\delta L}{\delta h(x)}\Big|_{\Phi=0} \overset{!}{=} 0 , \quad\to\quad \frac{\delta\tilde L}{\delta h(x)}\Big|_{\Phi=0} = 0 . \tag{4.70} \]

A Legendre transformation yields the (modified) generating functional of the proper vertex functions,

\[ \tilde\Gamma(\varphi_\tau, c^a, \bar c^a; \xi) + \tilde W(J_\tau, \eta^a, \bar\eta^a; \xi) = \int dx \Big( \sum_\tau \varphi_\tau J_\tau + \bar\eta^a c^a + \bar c^a\eta^a \Big) , \tag{4.71} \]
with variables related by
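\[ \varphi_\tau(x) = \frac{\delta\tilde W}{\delta J_\tau(x)} , \qquad J_\tau(x) = \frac{\delta\tilde\Gamma}{\delta\varphi_\tau(x)} , \]
\[ c^a(x) = \frac{\delta\tilde W}{\delta\bar\eta^a(x)} , \qquad \bar\eta^a(x) = -\frac{\delta\tilde\Gamma}{\delta c^a(x)} , \tag{4.72} \]
\[ \bar c^a(x) = -\frac{\delta\tilde W}{\delta\eta^a(x)} , \qquad \eta^a(x) = \frac{\delta\tilde\Gamma}{\delta\bar c^a(x)} . \]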
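Since $\tilde W$ does not contain 1-point functions, because of (4.70), but begins with 2-point functions, the equations on the left in (4.72) imply that the variables $\varphi_\tau, c^a, \bar c^a$ of $\tilde\Gamma$ vanish if the variables $J_\tau, \eta^a, \bar\eta^a$ are equal to zero. Inverting these equations provides $J_\tau, \eta^a, \bar\eta^a$ as respective functions of $\varphi_\tau, c^a, \bar c^a$, to be used in the definition (4.71) of $\tilde\Gamma$. It follows that there is no 1-point proper vertex function, i.e.

\[ \frac{\delta\Gamma}{\delta h(x)}\Big|_{\Phi=0} = 0 . \tag{4.73} \]

From the functional derivation of (4.71) with respect to the source γ(x) at fixed Φ,

\[ \frac{\delta\tilde\Gamma}{\delta\gamma(x)}\Big|_{\Phi} + \frac{\delta\tilde W}{\delta\gamma(x)}\Big|_{K} + \int dy \Big( \sum_\tau \frac{\delta\tilde W}{\delta J_\tau(y)}\frac{\delta J_\tau(y)}{\delta\gamma(x)} + \frac{\delta\bar\eta^a(y)}{\delta\gamma(x)}\frac{\delta\tilde W}{\delta\bar\eta^a(y)} + \frac{\delta\eta^a(y)}{\delta\gamma(x)}\frac{\delta\tilde W}{\delta\eta^a(y)} \Big) \]
\[ = \int dy \Big( \sum_\tau \varphi_\tau(y)\frac{\delta J_\tau(y)}{\delta\gamma(x)} + \frac{\delta\bar\eta^a(y)}{\delta\gamma(x)} c^a(y) - \frac{\delta\eta^a(y)}{\delta\gamma(x)} \bar c^a(y) \Big) , \]

we infer, because of (4.72),

\[ \frac{\delta\tilde\Gamma}{\delta\gamma(x)}\Big|_{\Phi} = -\frac{\delta\tilde W}{\delta\gamma(x)}\Big|_{K} , \tag{4.74} \]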
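and similar relations for the derivatives with respect to the sources $\gamma^a_\mu$, $\gamma^a$ and $\omega^a$. These relations are employed at ξ = 0. Using a notation in accord with (4.64),

\[ \Gamma \equiv \tilde\Gamma^{0,\Lambda_0}\big|_{\xi=0} , \qquad \Gamma_{\gamma_\tau}(x) \equiv \frac{\delta\tilde\Gamma^{0,\Lambda_0}}{\delta\gamma_\tau(x)}\Big|_{\xi=0} , \tag{4.75} \]

the violated Slavnov–Taylor identities for proper vertex functions emerge from (4.69) via (4.72), (4.74) as

\[ \sum_\tau \Big\langle \frac{\delta\Gamma}{\delta\varphi_\tau} , \sigma_{0,\Lambda_0}\Gamma_{\gamma_\tau}\Big\rangle - \Big\langle \frac{\delta\Gamma}{\delta c^a} , \sigma_{0,\Lambda_0}\Gamma_{\omega^a}\Big\rangle - \Big\langle \frac{1}{\alpha}\partial_\nu A^a_\nu - mB^a , \sigma_{0,\Lambda_0}\frac{\delta\Gamma}{\delta\bar c^a}\Big\rangle = \Gamma_1(\varphi_\tau, c^a, \bar c^a) , \tag{4.76} \]

with

\[ \Gamma_1(\varphi_\tau, c^a, \bar c^a) = L_1(\varphi_\tau, c^a, \bar c^a) . \tag{4.77} \]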
In (4.77) the variables are related, suppressing the superscript $0,\Lambda_0$ of the propagators, as

\[ \varphi_\tau(x) = \int dy\, C_\tau(x-y)\frac{\delta\Gamma}{\delta\varphi_\tau(y)} , \quad c^a(x) = -\int dy\, S(x-y)\frac{\delta\Gamma}{\delta\bar c^a(y)} , \quad \bar c^a(x) = \int dy\, \frac{\delta\Gamma}{\delta c^a(y)} S(y-x) . \tag{4.78} \]

Comparing (4.67) with (4.76) we observe that $L_1^0$ and $\Gamma_1$ have the same form! The apparently additional terms in $L_1^0$ result from the quadratic part of the classical action, which by definition is excluded from the bare interaction $L^0$, but is contained in Γ. The relevant part of $\Gamma_1$, and hence of $L_1^0$, is listed in I, Appendix C, (I–XXIX), which consists of 53 different local parts. Due to our new cutoff function, however, the following simplifications occur here: (i) All conditions whose numbering carries a superscript zero are deleted, since no irrelevant bare terms I, (107) and (108) have been introduced, (ii) $\dot\sigma = 0$, cf. (4.25), (iii) The symbols $\dot\Sigma(0)$ have to be read as $\dot\Sigma(0)$, cf. our Appendix A. In the case of $\Gamma_1$ there appear contributions from irrelevant terms of Γ, indistinctly denoted by “irr”. They have to be deleted if one reads the list as the relevant part of $L_1^0$.

4.4. Restoration of the Slavnov–Taylor identities

The systems of amputated truncated Schwinger functions and of proper vertex functions are equivalent formulations of the theory. To analyze the Slavnov–Taylor identities, however, turns out to be simpler in the case of the proper vertex functions. Starting from the violated Slavnov–Taylor identities (4.76), the restoration is accomplished if a set of the 37 + 7 relevant parameters of the theory and the BRS-variations can be given, such that the limit

\[ \lim_{\Lambda_0\to\infty} \Gamma_1^{0,\Lambda_0}(\varphi_\tau, c^a, \bar c^a) = 0 \tag{4.79} \]
results. To achieve this we have also to resort to the functionals $L_1$, (4.66), and $L_1^0$, (4.67), derived as well as $\Gamma_1$ in the foregoing section. This triplet of functionals is invoked to uncover linear dependences in the relevant parts of Γ and $L^0$. To this end we shall establish termwise equivalence relations between the relevant parts of $\Gamma_1$ and $L_1^0$. In these relations the functional $L_1$ acts as a connecting link. In the sequel we keep the notation introduced in (4.64), (4.77) for the various functionals and use in addition the shorthand

\[ \partial^w\delta^n_\Phi F\big|_0 \tag{4.80} \]

where Φ should always be read as the Fourier transformed field $\hat\Phi(p)$, to denote the |n|-fold field derivative of the functional F (which is a L- or a Γ-functional) corresponding to the multiindex n of fields$^{i}$ $\hat\Phi$, evaluated at $\hat\Phi = 0$, then followed by removing the momentum δ-function and afterwards performing the momentum derivative $\partial^w$. Setting furthermore in (4.80) all momenta equal to zero is written as

\[ \partial^w\delta^n_\Phi F\big|_{0,0} , \tag{4.81} \]

and finally, in a loop expansion of (4.81), the coefficient of loop order l is denoted by

\[ \partial^w\delta^n_\Phi F\big|_{0,0,l} . \tag{4.82} \]

In accord with (4.70), (4.73) we require for the possible 1-point function in each loop order l ≥ 1 the renormalization conditions

\[ \kappa_l := \delta_h\Gamma\big|_{0,l} \overset{!}{=} 0 , \qquad \delta_h L\big|_{0,l} \overset{!}{=} 0 . \tag{4.83} \]

(The 1-point functions are constants.) We first present two Lemmata invoked later to establish the limit (4.79). For the multiindices n, w a somewhat hybrid convention is used: $n' \subset n$ means strict inclusion in the set-theoretic sense, and correspondingly for (n, w).

i In the case of a Γ-functional the fields $\underline{\hat\Phi}$ appear.

Lemma 1. Let l, n, w be given and assume (4.83). If
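(i) for $l' < l$ and $(n', w') \subseteq (n, w)$, (ii) for $l' = l$ and $(n', w') \subset (n, w)$:

\[ \partial^{w'}\delta^{n'}_\Phi\Gamma_1\big|_{0,0,l'} = 0 \]

is valid, then

\[ \partial^w\delta^n_\Phi\Gamma_1\big|_{0,0,l} = 0 \;\Leftrightarrow\; \partial^w\delta^n_\Phi L_1\big|_{0,0,l} = 0 . \tag{4.84} \]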
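Lemma 2. Let l, n, w with |n| + |w| ≤ 5 be given and (4.83) satisfied. Assuming

(i) for $l' < l$, $(n', w')$ with $|n'| + |w'| \le 5$:

\[ \partial^{w'}\delta^{n'}_\Phi\Gamma_1\big|_{0,0,l'} = 0 , \qquad \partial^{w'}\delta^{n'}_\Phi L_1^0\big|_{0,0,l'} = 0 , \]

(ii) for $l' = l$, $n' \subset n$ and $(n', w') \subset (n, w)$:

\[ \partial^{w'}\delta^{n'}_\Phi L_1\big|_{0,0,l} = 0 , \qquad \partial^{w'}\delta^{n'}_\Phi L_1^0\big|_{0,0,l} = 0 , \]

then follows the equality

\[ \partial^w\delta^n_\Phi L_1\big|_{0,0,l} = \partial^w\delta^n_\Phi L_1^0\big|_{0,0,l} . \tag{4.85} \]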
Proofs are given in I, p. 501 and pp. 503–504, respectively. After these preparations we turn to the proof of (4.79). Our first goal is to determine renormalization conditions for the functional Γ in such a way that the relevant part of $\Gamma_1$ vanishes, proceeding inductively in the loop order l. The Lemmata indicate that we should ascend for given l in the number |n| of fields, and for given l, n ascend in |w|. Requiring the relevant part of $\Gamma_1$ to vanish amounts to the 53 conditions listed in Appendix C of I; observe, however, the qualifications stated at the end of Sec. 4.3. The loop order l = 0 of these conditions$^{j}$ is satisfied by the (classical) parameters displayed in our Appendices A and B, i.e.

\[ \Gamma_{1,l=0}\big|_{\mathrm{rel}} = 0 , \tag{4.86} \]

and hence we also have

\[ L^0_{1,l=0}\big|_{\mathrm{rel}} = 0 . \tag{4.87} \]
Induction hypothesis. Given $l \in \mathbb{N}$, we assume for all loop orders $l' < l$ the functional Γ (or equivalently the L-functional) to be renormalized according to A2) stated after Eq. (4.37) such that

\[ \partial^w\delta^n_\Phi\Gamma_1\big|_{0,0,l'} = 0 , \qquad \partial^w\delta^n_\Phi L_1^0\big|_{0,0,l'} = 0 \tag{4.88} \]

is satisfied for all (n, w) with |n| + |w| ≤ 5.

Theorem. The induction hypothesis holds in the loop order l.

j There are no contributions “irr” in the order l = 0.
Proof. The 37 + 7 relevant parameters appearing in the theory and in the BRS-transformations are defined in Appendix A and B, respectively. In the loop order l we fix a priori 26 of them as follows, suppressing an index l:

(A) Besides $\kappa_l = 0$, (4.83), already fixed before, we choose freely in $\Gamma^{0,\Lambda_0}(\Phi)$:

\[ \Sigma_{\mathrm{trans}} ,\; \Sigma_{\mathrm{long}} ,\; \dot\Sigma^{BB} ,\; \dot\Sigma^{\bar cc} ,\; \Sigma^{AB} ,\; F^{BBh} ,\; F^{AAA} ,\; R_3 . \tag{4.89} \]

This means that the normalizations of all fields except the Higgs field,$^{k}$ the two couplings $F^{AAA}$ and $F^{BBh}$ and one global normalization for the BRS-transformations are freely chosen.

Moreover, we restrict the bare functional $\tilde L^{\Lambda_0,\Lambda_0}$, (4.40), by requiring (!):

(B) the bare parameters of the BRS-insertions (4.38) to obey

\[ R_6^0 = R_7^0 = R_2^0 , \qquad R_5^0 = \frac{(R_2^0)^2}{R_3^0} , \tag{4.90} \]

(C) the 11 $r^0$-terms listed in Appendix A which have no correspondence in the loop order l = 0 to vanish, i.e.

\[ r_{20}^{hBA} = \cdots = r_0^{\bar c\bar ccc} = 0 , \tag{4.91} \]

(D) and finally the additional relations

\[ F_0^{BBA} = -\frac{R_3^0}{2R_2^0} F_{10}^{hBA} , \qquad F_0^{\bar ccB} = -\frac{R_3^0}{R_2^0} F_0^{\bar cch} , \qquad F_0^{AAhh} = \frac{R_5^0}{R_3^0} F_{10}^{AABB} . \tag{4.92} \]
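As a plausibility check, not part of the proof, one may verify that the restrictions (B) and (D) are already satisfied by the tree-order values quoted in Appendices A and B, where all $R_i = 1$ and the parameters r vanish. The following Python sketch does this with exact rational arithmetic. The dictionary keys and function names are illustrative labels introduced here (the overbar on the ghost field is dropped in the key names), and the numerical values of g, m, α are arbitrary samples.

```python
from fractions import Fraction


def tree_order_parameters(g, m, alpha):
    """Tree-order (l = 0) values as quoted in Appendices A and B.

    Only the parameters entering relations (B) and (D) are listed.
    Key names are hypothetical labels chosen for this sketch.
    """
    params = {
        "F_BBA": -g / 4,              # F^{BBA}    = -g/4
        "F1_hBA": g / 2,              # F_1^{hBA}  = g/2
        "F_ccB": alpha * g * m / 2,   # F^{ccB}    = alpha g m / 2
        "F_cch": -alpha * g * m / 2,  # F^{cch}    = -alpha g m / 2
        "F_AAhh": g**2 / 8,           # F^{AAhh}   = g^2 / 8
        "F1_AABB": g**2 / 8,          # F_1^{AABB} = g^2 / 8
    }
    # Appendix B: R_i = 1 + r_i with r_i = O(hbar), hence R_i = 1 at tree order.
    params.update({f"R{i}": Fraction(1) for i in range(1, 8)})
    return params


def check_relations(p):
    """Evaluate relations (4.90) and (4.92) on a parameter set."""
    return {
        "(B) R6 = R7 = R2": p["R6"] == p["R7"] == p["R2"],
        "(B) R5 = R2^2 / R3": p["R5"] == p["R2"] ** 2 / p["R3"],
        "(D) F_BBA": p["F_BBA"] == -p["R3"] / (2 * p["R2"]) * p["F1_hBA"],
        "(D) F_ccB": p["F_ccB"] == -p["R3"] / p["R2"] * p["F_cch"],
        "(D) F_AAhh": p["F_AAhh"] == p["R5"] / p["R3"] * p["F1_AABB"],
    }


# Arbitrary sample values; any nonzero rationals would do.
sample = tree_order_parameters(g=Fraction(3, 5), m=Fraction(7, 2), alpha=Fraction(2, 3))
results = check_relations(sample)
for name, holds in results.items():
    print(name, holds)
```

That the relations hold at tree order is consistent with the statement that the loop order l = 0 is satisfied by the classical parameters; the check says nothing about higher loop orders, which are the subject of the proof.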
Since by the conditions (B)–(D) we have fixed 17 relevant parameters on the “wrong side” $\Lambda = \Lambda_0$, these restrictions have to be shown to provide a finite limit theory upon removing the cutoff $\Lambda_0$. The values of the bare parameters

\[ \kappa^0 ,\; \Sigma^0_{\mathrm{trans}} ,\; \Sigma^0_{\mathrm{long}} ,\; \dot\Sigma_0^{BB} ,\; \dot\Sigma_0^{\bar cc} ,\; \Sigma_0^{AB} ,\; F_0^{BBh} ,\; F_0^{AAA} ,\; R_3^0 \tag{4.93} \]

follow from the renormalization conditions chosen in (4.89). The remaining relevant parameters of $\tilde L^0 \equiv \tilde L^{\Lambda_0,\Lambda_0}$ are now determined by requiring $L_1^0|_{\mathrm{rel}}$ to vanish, taking into account the relations (B)–(D) already introduced. We list these parameters determined successively, writing in brackets the number of the particular equation of Appendix C of I from which the parameter is determined. This way each bare parameter different from (4.93) is fixed in terms of the free parameters (4.93), directly or via parameters already determined before in proceeding:

\[ R_1^0\,(I_b^0),\; R_4^0\,(II_b^0),\; R_2^0\,(III_b^0) \to R_6^0, R_7^0, R_5^0 \text{, due to (B)},\; F_{10}^{\bar ccA}\,(III_a^0) , \]
\[ F_0^{BBA}\,(V^0) \to F_{10}^{hBA} \text{, due to (D)},\; F_0^{\bar ccB}\,(IV_a^0) \to F_0^{\bar cch} \text{, due to (D)} , \]
\[ \Sigma_0^{\bar cc}\,(VIII_a^0),\; \delta m_0^2\,(I_a^0),\; \Sigma_0^{BB}\,(II_a^0),\; \Sigma_0^{hh}\,(VII_a^0),\; F_0^{AAh}\,(VI_b^0),\; \dot\Sigma_0^{hh}\,(VII_b^0) . \]

k The field h emerges together with the field $B^a$ from the complex scalar doublet φ, (4.5).
There are 4 linearly dependent equations: Denoting by {X} the content of the bracket {···} appearing in equation X, and using also (B), (D) we find

\[ -\alpha^{-1}\{VIII_b^0\} = R_1^0\big(\{III_a^0\} + \{III_b^0\}\big) + gR_2^0\{I_b^0\} , \tag{4.94} \]
\[ R_1^0\{IV_b^0\} = 2mR_4^0\{V^0\} - gR_2^0\{II_b^0\} + m\{VIII_b^0\} , \tag{4.95} \]
\[ \{VI_a^0\} = \frac{R_2^0}{R_3^0}\{IV_a^0\} , \tag{4.96} \]
\[ \{VI_b^0\} + \{VII_c^0\} = \frac{R_2^0}{R_3^0}\{V^0\} . \tag{4.97} \]
Moreover, $VII_d^0$, $VIII_c^0$ are satisfied because of (C). At this stage all contributions to $L_1^0|_{\mathrm{rel}}$ involving 2 or 3 fields do vanish. We then continue as follows:

\[ F_0^{BBBB}\,(X^0),\; F_0^{BBhh}\,(XX^0),\; F_0^{hhhh}\,(XIX^0),\; F_0^{hhh}\,(IX^0) , \]
\[ F_{10}^{AABB}\,(XIII_2^0) \to F_0^{AAhh} \text{, due to (D)},\; F_{10}^{AAAA}\,(XIV_c^0) . \]

Now all bare parameters entering $\tilde L^0$ are fixed. The remaining equations have to be fulfilled identically. Indeed, using (B), (C):

\[ mR_4^0\{XV_{1a}^0\} = R_1^0\{XIII_2^0\} - \frac{g}{2}R_3^0\{VI_b^0\} , \tag{4.98} \]
\[ R_3^0\{XVII_a^0\} = R_5^0\{XV_{1a}^0\} , \qquad \{XIV_a^0\} = -\{XIV_c^0\} . \tag{4.99} \]

Finally, one easily sees that the remaining 26 equations are satisfied because of (B), (C), (D). Thus we have achieved

\[ L_1^0\big|_{\mathrm{rel}} \equiv L_1^{\Lambda_0,\Lambda_0}\big|_{\mathrm{rel}} = 0 . \tag{4.100} \]
From this property, we now determine $\Gamma_1|_{\mathrm{rel}}$, based on the induction hypothesis. The relevant parts of $L_1^0$, $L_1$, $\Gamma_1$ are formed by terms that fulfil |n| + |w| ≤ 5, i.e. involve at most |n| = 5 fields. Ascending for given n with |w|, one treats first the cases |n| = 2: From $\partial^w L^0_{1;n} = 0$, (4.100), one infers via Lemma 2 that $\partial^w L_{1;n} = 0$, and hereupon via Lemma 1 that $\partial^w\Gamma_{1;n} = 0$ holds, too. The cases |n| = 2 established, treating this way successively the cases |n| = 3, 4, 5, we arrive at

\[ \Gamma_1\big|_{\mathrm{rel}} \equiv \Gamma_1^{0,\Lambda_0}\big|_{\mathrm{rel}} = 0 . \tag{4.101} \]

Hence, the 53 equations listed in Appendix C of I are fulfilled. Given the freely chosen renormalization conditions (4.89) and κ = 0 in the loop order l, the remaining 37 + 7 − 9 (dependent) ones in this loop order are extracted one after the other from correspondingly chosen equations. We now list these renormalization constants in the order they are obtained, together with the label of the respective determining equation written in brackets. In this succession each determining equation then only contains parameters already determined before. Furthermore,
one easily infers in each case from the equation considered, that the renormalization condition imposed in such a way stays finite in the limit $\Lambda_0 \to \infty$. We obtain

\[ R_1\,(I_b),\; R_4\,(II_b),\; (-\alpha\delta m^2 + \Sigma^{\bar cc})\,(I_a),\; (\Sigma^{BB} - \Sigma^{\bar cc})\,(II_a),\; R_2\,(III_b) , \]
\[ (F_1^{\bar ccA} - r_2^{\bar ccA})\,(III_a),\; F^{BBA}\,(IV_b),\; R_6\,(V),\; r_2^{hBA}\,(VII_d),\; R_7\,(VIII_b) , \]
\[ r_2^{\bar ccA}\,(VIII_c),\; F^{\bar ccB}\,(IV_a),\; \Sigma^{\bar cc}\,(VIII_a),\; F_1^{hBA}\,(XV_{2a} - XV_{2b}) , \]
\[ F^{AAh}\,(VI_b),\; F^{\bar cch}\,(VI_a),\; R_5\,(VII_c),\; \Sigma^{hh}\,(VII_a),\; \dot\Sigma^{hh}\,(VII_b) . \]

(Now all equations I–VIII have been used, and the difference $XV_{2a} - XV_{2b}$, to determine all $R_i$ and all terms with |n| = 2, 3 apart from $F^{hhh}$.)

\[ F_1^{AAAA}\,(XIV_c),\; r_1^{AA\bar cc}\,(XIV_b),\; r_2^{AA\bar cc}\,(XIV_e),\; r_2^{AAAA}\,(XIV_a),\; r_2^{AABB}\,(XIII_1) , \]
\[ F_1^{AABB}\,(XV_{1a}),\; F^{AAhh}\,(XVII_a),\; r_1^{BB\bar cc}\,(XV_{1b}),\; r_2^{BB\bar cc}\,(XV_{2c}),\; r^{hB\bar cc}\,(XI) , \]
\[ F^{BBBB}\,(X),\; r^{hh\bar cc}\,(XVII_b),\; r^{\bar c\bar ccc}\,(XVIII_a),\; F^{BBhh}\,(XX) , \]
\[ F^{hhhh}\,(XIX),\; F^{hhh}\,(IX) . \]

Thus all 37 + 7 relevant parameters are fixed in the loop order l and have finite limits in removing the cutoff $\Lambda_0$. This completes the proof.

In view of the Theorem and Lemma 1, the restoration of the Slavnov–Taylor identities is finally accomplished, if the irrelevant parts of $\Gamma_1^{0,\Lambda_0}$ or of $L_1^{0,\Lambda_0}$ are shown to vanish, too, sending the UV-cutoff $\Lambda_0$ to infinity. This behavior follows from the

Proposition. Let $l \in \mathbb{N}_0$, |w| ≤ 4, n a multiindex and $0 \le \Lambda \le \Lambda_0$, then

\[ \big|\partial^w L^{\Lambda,\Lambda_0}_{1;l,n}(p_1,\dots,p_{|n|})\big| \le \frac{(\Lambda+m)^{5+1-|n|-|w|}}{\Lambda_0}\Big(\log\frac{\Lambda_0}{m}\Big)^{\nu} P\Big(\frac{|p_i|}{\Lambda+m}\Big) , \tag{4.102} \]

with nonnegative integers ν and polynomials P as before.

The proof is given in [84]. We just recall that all irrelevant terms occurring in the BRS-variation $L_1^{\Lambda_0,\Lambda_0}$, (4.67), of the bare interaction result from momentum derivatives of the cutoff function $\sigma_{0,\Lambda_0}(k^2)$, (4.26), and have |n| ≤ 5. The Proposition now directly implies vanishing limits

\[ \lim_{\Lambda_0\to\infty} L^{0,\Lambda_0}_{1;l,n}(p_1,\dots,p_{|n|}) = 0 \tag{4.103} \]
for all sets of fields n, (4.34), and in all loop orders l. Thus, the Slavnov–Taylor identities are restored upon removing the UV-cutoff Λ0 . This completes the proof of perturbative renormalizability of the spontaneously broken Yang–Mills theory based on flow equations.
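As a brief worked step, offered only as a sketch: assume the bound of the Proposition in the form $|\partial^w L^{\Lambda,\Lambda_0}_{1;l,n}| \le \Lambda_0^{-1}(\Lambda+m)^{6-|n|-|w|}\,(\log(\Lambda_0/m))^{\nu}\,P(|p_i|/(\Lambda+m))$, which is our reading of (4.102). Setting Λ = 0 and w = 0, at fixed momenta and fixed mass m > 0, one then has

```latex
\[
\bigl| L^{0,\Lambda_0}_{1;l,n}(p_1,\dots,p_{|n|}) \bigr|
\;\le\; \frac{m^{\,6-|n|}}{\Lambda_0}\,
\Bigl(\log\frac{\Lambda_0}{m}\Bigr)^{\nu}\,
P\Bigl(\frac{|p_i|}{m}\Bigr)
\;\xrightarrow[\Lambda_0\to\infty]{}\; 0 ,
\]
```

because $x^{-1}(\log x)^{\nu} \to 0$ as $x \to \infty$ for every fixed nonnegative integer ν, while the remaining factors do not depend on $\Lambda_0$. The inverse power of $\Lambda_0$ in the bound is thus what drives the limit (4.103); the logarithms cannot compensate it.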
In retrospect, one notes that we used only flow equations for L-functionals, i.e. for $L^{\Lambda,\Lambda_0}$, $\hat L_\xi^{\Lambda,\Lambda_0}(q)$, $L_1^{\Lambda,\Lambda_0}$. There are flow equations for Γ-functionals, too, albeit of a more implicit form. The only reason we passed in an intermediate stage to vertex functions was to facilitate extracting the relevant parameters of the theory upon analyzing the relevant part of the violated Slavnov–Taylor identities. This task is greatly reduced dealing with the vertex functions, indeed.

Appendix A. The Relevant Part of Γ

The bare functional $L^{\Lambda_0,\Lambda_0}$ and the relevant part of the generating functional $\Gamma^{\Lambda,\Lambda_0}$ for the proper vertex functions have the same general form. We present the latter and give the tree order of both explicitly. The cutoff symbols Λ and $\Lambda_0$ are suppressed. We write

\[ \Gamma(A, h, B, \bar c, c) = \sum_{|n|=1}^{4} \Gamma_{|n|} + \Gamma_{(|n|>4)} , \]

|n| counting the number of fields, and extract its relevant part, i.e. its local field content with mass dimension not greater than four. Generally we will not underline the field variable symbols in the Appendices, though of course all arguments in the Γ-functional should appear underlined. The modification to obtain the bare functional $L^{\Lambda_0,\Lambda_0}$ is stated at the end.

(1) One-point function:

\[ \Gamma_1 = \kappa\,\hat h(0) . \]

(2) Two-point functions:

\[ \Gamma_2 = \int_p \Big\{ \frac12 A^a_\mu(p)A^a_\nu(-p)\Gamma^{AA}_{\mu\nu}(p) + \frac12 h(p)h(-p)\Gamma^{hh}(p) + \frac12 B^a(p)B^a(-p)\Gamma^{BB}(p) - \bar c^a(p)c^a(-p)\Gamma^{\bar cc}(p) + A^a_\mu(p)B^a(-p)\Gamma^{AB}_\mu(p) \Big\} , \]

\[ \Gamma^{AA}_{\mu\nu}(p) = \delta_{\mu\nu}(m^2 + \delta m^2) + (p^2\delta_{\mu\nu} - p_\mu p_\nu)(1 + \Sigma_{\mathrm{trans}}(p^2)) + \frac{1}{\alpha}p_\mu p_\nu(1 + \Sigma_{\mathrm{long}}(p^2)) , \]
\[ \Gamma^{hh}(p) = p^2 + M^2 + \Sigma^{hh}(p^2) , \qquad \Gamma^{BB}(p) = p^2 + \alpha m^2 + \Sigma^{BB}(p^2) , \]
\[ \Gamma^{\bar cc}(p) = p^2 + \alpha m^2 + \Sigma^{\bar cc}(p^2) , \qquad \Gamma^{AB}_\mu(p) = ip_\mu\Sigma^{AB}(p^2) . \]
Besides the unregularized tree order explicitly stated, there emerge 10 relevant parameters from the various self-energies:

\[ \delta m^2 ,\; \Sigma_{\mathrm{trans}}(0) ,\; \Sigma_{\mathrm{long}}(0) ,\; \Sigma^{hh}(0) ,\; \dot\Sigma^{hh}(0) ,\; \Sigma^{BB}(0) ,\; \dot\Sigma^{BB}(0) ,\; \Sigma^{\bar cc}(0) ,\; \dot\Sigma^{\bar cc}(0) ,\; \Sigma^{AB}(0) \]

where the notation $\dot\Sigma(0) \equiv (\partial_{p^2}\Sigma)(0)$ has been used. We note that in transforming the regularized L-functional into the corresponding Γ-functional the inverse of the regularized propagators (4.26) become the 2-point functions of the latter in the tree order l = 0. The factor $(\sigma_{\Lambda,\Lambda_0}(p^2))^{-1}$ thus appearing, however, does not contribute to the relevant part due to the property (4.25).

(3) Three-point functions: Only the relevant part is given explicitly: $r \in O(\hbar)$ denotes a relevant parameter which vanishes in the tree order, otherwise a relevant parameter is denoted by F. Moreover, we indicate an irrelevant part by a symbol $O_n$, $n \in \mathbb{N}$, indicating that this part vanishes as an nth power of the momentum in the limit when all momenta tend to zero homogeneously.

\[ \Gamma_3 = \int_p\int_q \Big\{ \epsilon^{rst}A^r_\mu(p)A^s_\nu(q)A^t_\lambda(-p-q)\Gamma^{AAA}_{\mu\nu\lambda}(p,q) + A^r_\mu(p)A^r_\nu(q)h(-p-q)\Gamma^{AAh}_{\mu\nu}(p,q) \]
\[ \quad + \epsilon^{rst}B^r(p)B^s(q)A^t_\mu(-p-q)\Gamma^{BBA}_\mu(p,q) + h(p)B^r(q)A^r_\mu(-p-q)\Gamma^{hBA}_\mu(p,q) \]
\[ \quad + \epsilon^{rst}\bar c^r(p)c^s(q)A^t_\mu(-p-q)\Gamma^{\bar ccA}_\mu(p,q) + B^r(p)B^r(q)h(-p-q)\Gamma^{BBh}(p,q) \]
\[ \quad + h(p)h(q)h(-p-q)\Gamma^{hhh}(p,q) + \bar c^r(p)c^r(q)h(-p-q)\Gamma^{\bar cch}(p,q) + \epsilon^{rst}\bar c^r(p)c^s(q)B^t(-p-q)\Gamma^{\bar ccB}(p,q) \Big\} , \]

\[ \Gamma^{AAA}_{\mu\nu\lambda}(p,q) = \delta_{\mu\nu}\, i(p-q)_\lambda F^{AAA} + O_3 , \qquad F^{AAA} = -\frac12 g + r^{AAA} , \]
\[ \Gamma^{AAh}_{\mu\nu}(p,q) = \delta_{\mu\nu}F^{AAh} + O_2 , \qquad F^{AAh} = \frac12 mg + r^{AAh} , \]
\[ \Gamma^{BBA}_\mu(p,q) = i(p-q)_\mu F^{BBA} + O_3 , \qquad F^{BBA} = -\frac14 g + r^{BBA} , \]
\[ \Gamma^{hBA}_\mu(p,q) = i(p-q)_\mu F_1^{hBA} + i(p+q)_\mu r_2^{hBA} + O_3 , \qquad F_1^{hBA} = \frac12 g + r_1^{hBA} , \]
\[ \Gamma^{\bar ccA}_\mu(p,q) = ip_\mu F_1^{\bar ccA} + iq_\mu r_2^{\bar ccA} + O_3 , \qquad F_1^{\bar ccA} = g + r_1^{\bar ccA} , \]
\[ \Gamma^{BBh}(p,q) = F^{BBh} + O_2 , \qquad F^{BBh} = \frac14 g\frac{M^2}{m} + r^{BBh} , \]
\[ \Gamma^{hhh}(p,q) = F^{hhh} + O_2 , \qquad F^{hhh} = \frac14 g\frac{M^2}{m} + r^{hhh} , \]
\[ \Gamma^{\bar cch}(p,q) = F^{\bar cch} + O_2 , \qquad F^{\bar cch} = -\frac12\alpha gm + r^{\bar cch} , \]
\[ \Gamma^{\bar ccB}(p,q) = F^{\bar ccB} + O_2 , \qquad F^{\bar ccB} = \frac12\alpha gm + r^{\bar ccB} . \]

The 3-point functions AAB and BBB have no relevant local content.
(4) Four-point functions: With parameters r and F defined as before

\[ \Gamma_4\big|_{\mathrm{rel}} = \int_k\int_p\int_q \Big\{ \epsilon^{abc}\epsilon^{ars}A^b_\mu(k)A^c_\nu(p)A^r_\mu(q)A^s_\nu(-k-p-q)F_1^{AAAA} + A^r_\mu(k)A^r_\mu(p)A^s_\nu(q)A^s_\nu(-k-p-q)\,r_2^{AAAA} \]
\[ \quad + A^a_\mu(k)A^b_\mu(p)\bar c^r(q)c^s(-k-p-q)\big(\delta^{ab}\delta^{rs}r_1^{AA\bar cc} + \delta^{ar}\delta^{bs}r_2^{AA\bar cc}\big) \]
\[ \quad + A^a_\mu(k)A^b_\mu(p)B^r(q)B^s(-k-p-q)\big(\delta^{ab}\delta^{rs}F_1^{AABB} + \delta^{ar}\delta^{bs}r_2^{AABB}\big) \]
\[ \quad + B^a(k)B^b(p)\bar c^r(q)c^s(-k-p-q)\big(\delta^{ab}\delta^{rs}r_1^{BB\bar cc} + \delta^{ar}\delta^{bs}r_2^{BB\bar cc}\big) \]
\[ \quad + h(k)h(p)h(q)h(-k-p-q)F^{hhhh} + B^r(k)B^r(p)h(q)h(-k-p-q)F^{BBhh} \]
\[ \quad + B^r(k)B^r(p)B^s(q)B^s(-k-p-q)F^{BBBB} + A^r_\mu(k)A^r_\mu(p)h(q)h(-k-p-q)F^{AAhh} \]
\[ \quad + h(k)h(p)\bar c^r(q)c^r(-k-p-q)\,r^{hh\bar cc} + \bar c^a(k)c^a(p)\bar c^r(q)c^r(-k-p-q)\,r^{\bar c\bar ccc} + \epsilon^{rst}h(k)B^r(p)\bar c^s(q)c^t(-k-p-q)\,r^{hB\bar cc} \Big\} , \]

\[ F_1^{AAAA} = \frac14 g^2 + r_1^{AAAA} , \qquad F_1^{AABB} = \frac18 g^2 + r_1^{AABB} , \]
\[ F^{hhhh} = \frac{1}{32}g^2\Big(\frac{M}{m}\Big)^2 + r^{hhhh} , \qquad F^{BBhh} = \frac{1}{16}g^2\Big(\frac{M}{m}\Big)^2 + r^{BBhh} , \]
\[ F^{BBBB} = \frac{1}{32}g^2\Big(\frac{M}{m}\Big)^2 + r^{BBBB} , \qquad F^{AAhh} = \frac18 g^2 + r^{AAhh} . \]

Hence, in total Γ involves 1 + 10 + 11 + 15 = 37 relevant parameters. After deleting in the two-point functions the contributions of the order l = 0, i.e. keeping only the 10 parameters which appear in the various self-energies, we have the form of the bare functional $L^{\Lambda_0,\Lambda_0}$, and its order l = 0 also given explicitly.
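The parameter bookkeeping used in the proof of Sec. 4.4 can be cross-checked mechanically. The sketch below lists the 37 parameters of this Appendix together with the 7 parameters $R_i$ of the BRS-insertions, and partitions them into the classes used in the proof: the 9 free conditions (A), the 3 + 11 + 3 bare parameters restricted by (B), (C), (D), and the 18 bare parameters fixed by equations of Appendix C of I. The string labels are hypothetical names introduced for this sketch (`d` marks a momentum derivative, ghost overbars are dropped), and reading the 11 terms of (C) as exactly the parameters named r in the three- and four-point functions is our interpretation of the text.

```python
# Hypothetical labels for the relevant parameters of Appendices A and B.
one_point = ["kappa"]
two_point = ["delta_m2", "Sigma_trans", "Sigma_long", "Sigma_hh", "dSigma_hh",
             "Sigma_BB", "dSigma_BB", "Sigma_cc", "dSigma_cc", "Sigma_AB"]
three_point = ["F_AAA", "F_AAh", "F_BBA", "F1_hBA", "r2_hBA", "F1_ccA",
               "r2_ccA", "F_BBh", "F_hhh", "F_cch", "F_ccB"]
four_point = ["F1_AAAA", "r2_AAAA", "r1_AAcc", "r2_AAcc", "F1_AABB", "r2_AABB",
              "r1_BBcc", "r2_BBcc", "F_hhhh", "F_BBhh", "F_BBBB", "F_AAhh",
              "r_hhcc", "r_cccc", "r_hBcc"]
brs = [f"R{i}" for i in range(1, 8)]

gamma_params = one_point + two_point + three_point + four_point
all_params = gamma_params + brs

# (A): kappa = 0 plus the freely chosen conditions (4.89).
free_A = ["kappa", "Sigma_trans", "Sigma_long", "dSigma_BB", "dSigma_cc",
          "Sigma_AB", "F_BBh", "F_AAA", "R3"]
# (B): R6 = R7 = R2 and R5 = R2^2/R3 express three insertions via R2, R3.
fixed_B = ["R6", "R7", "R5"]
# (C): parameters without tree-order counterpart (assumed: those named r).
fixed_C = [p for p in three_point + four_point if p.startswith("r")]
# (D): the three couplings expressed through others by (4.92).
fixed_D = ["F1_hBA", "F_cch", "F_AAhh"]
# Bare parameters determined successively from equations of Appendix C of I.
from_sti = ["R1", "R4", "R2", "F1_ccA", "F_BBA", "F_ccB", "Sigma_cc",
            "delta_m2", "Sigma_BB", "Sigma_hh", "F_AAh", "dSigma_hh",
            "F_BBBB", "F_BBhh", "F_hhhh", "F_hhh", "F1_AABB", "F1_AAAA"]

classes = [free_A, fixed_B, fixed_C, fixed_D, from_sti]
union = [p for cls in classes for p in cls]

print("relevant parameters of Gamma:", len(gamma_params))
print("including BRS-insertions:", len(all_params))
print("fixed a priori (A)-(D):", sum(len(c) for c in classes[:4]))
print("fixed on the bare side by (B)-(D):", sum(len(c) for c in classes[1:4]))
print("dependent renormalized ones:", len(all_params) - len(free_A))
print("partition is exact:", sorted(union) == sorted(all_params))
```

The counts reproduce the numbers quoted in the text: 37 and 37 + 7, the 26 parameters fixed a priori, the 17 fixed by (B)–(D), and the 37 + 7 − 9 dependent renormalization constants.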
Appendix B. The Relevant Part of the BRS-Insertions

We also have to consider the vertex functions (4.74) and (4.75) with one operator insertion, generated by the BRS-variations. These insertions have mass dimension D = 2. Performing the Fourier-transform

\[ \hat\Gamma_\gamma(q) = \int dx\, e^{iqx}\,\Gamma_\gamma^{0,\Lambda_0}(x) \]

and similarly in the other cases, we list the respective relevant part of these four vertex functions with one insertion, suppressing the superscript $0,\Lambda_0$:

\[ \hat\Gamma_{\gamma^a_\mu}(q)\big|_{\mathrm{rel}} = -iq_\mu c^a(-q)R_1 + \epsilon^{arb}\int_k A^r_\mu(k)c^b(-q-k)\,gR_2 , \]
\[ \hat\Gamma_\gamma(q)\big|_{\mathrm{rel}} = \int_k B^r(k)c^r(-q-k)\Big(-\frac12 gR_3\Big) , \]
\[ \hat\Gamma_{\gamma^a}(q)\big|_{\mathrm{rel}} = mc^a(-q)R_4 + \int_k h(k)c^a(-q-k)\,\frac12 gR_5 + \epsilon^{arb}\int_k B^r(k)c^b(-q-k)\,\frac12 gR_6 , \]
\[ \hat\Gamma_{\omega^a}(q)\big|_{\mathrm{rel}} = \epsilon^{ars}\int_k c^r(k)c^s(-q-k)\,\frac12 gR_7 . \]

There appear 7 relevant parameters:

\[ R_i = 1 + r_i , \qquad r_i = O(\hbar) , \qquad i = 1, \dots, 7 . \]

All the other 2-point functions, and the higher ones, of course, are of irrelevant type.

Acknowledgments

The author is much indebted to Christoph Kopper for his encouragement and gratefully acknowledges his careful reading of an earlier version of the manuscript and his numerous valuable suggestions. Thanks are also due to the anonymous referees for constructive proposals.

References

[1] F. J. Dyson, The Radiation theories of Tomonaga, Schwinger, and Feynman, Phys. Rev. 75 (1949), 486–502.
[2] F. J. Dyson, The S matrix in quantum electrodynamics, Phys. Rev. 75 (1949), 1736–1755.
[3] N. N. Bogoliubov and O. S. Parasiuk, Über die Multiplikation der Kausalfunktionen in der Quantentheorie der Felder, Acta Math. 97 (1957), 227–266.
[4] K. Hepp, Théorie de la Renormalisation, Lecture Notes in Physics, Springer, Berlin, 1969.
[5] W. Zimmermann, Local Operator Products and Renormalization in Quantum Field Theory, in Lectures on Elementary Particles and Quantum Field Theory, 1970 Brandeis University Summer Institute in Theoretical Physics, Vol. 1, M.I.T. Press, Cambridge, 1970.
[6] W. Pauli and F. Villars, On invariant regularization in relativistic quantum theory, Rev. Mod. Phys. 21 (1949), 434–444.
[7] E. Speer, Generalized Feynman Amplitudes, Princeton University, Princeton, 1969.
[8] G. ’t Hooft and M. Veltman, Regularization and renormalization of Gauge fields, Nucl. Phys. B44 (1972), 189–213.
[9] C. G. Bollini and J. J. Giambiagi, Lowest order “divergent” Graphs in ν-dimensional space, Phys. Lett. 40B (1972), 566–568.
[10] P. Breitenlohner and D. Maison, Dimensional Renormalization and the Action Principle, Comm. Math. Phys. 52 (1977), 11–38.
[11] P. Breitenlohner and D. Maison, Dimensionally renormalized Green’s functions for theories with massless particles. I, II, Comm. Math. Phys. 52 (1977), 39–54, 55–75.
[12] A. S. Wightman and G. Velo (eds.), Renormalization Theory, Erice 1975, Reidel, 1976.
[13] W. Zimmermann, Convergence of Bogoliubov’s method of renormalization in momentum space, Comm. Math. Phys. 15 (1969), 208–234.
[14] J. H. Lowenstein, Convergence theorems for renormalized Feynman integrals with zero-mass propagators, Comm. Math. Phys. 47 (1976), 53–68.
[15] G. ’t Hooft, Renormalization of massless Yang–Mills fields, Nucl. Phys. B33 (1971), 173–199.
[16] G. ’t Hooft, Renormalizable Lagrangians for massive Yang–Mills fields, Nucl. Phys. B35 (1971), 167–188.
[17] G. ’t Hooft and M. Veltman, Combinatorics of Gauge fields, Nucl. Phys. B50 (1972), 318–353.
[18] K. G. Wilson, Renormalization group and critical phenomena. I. renormalization group and the Kadanoff scaling picture, Phys. Rev. B4 (1971), 3174–3183.
[19] K. G. Wilson, Renormalization group and critical phenomena. II. phase-space-cell analysis of critical behaviour, Phys. Rev. B4 (1971), 3184–3205.
[20] K. G. Wilson and J. Kogut, The Renormalization Group and the ε Expansion, Phys. Rep. 12C (1974), 75–199.
[21] K. Gawedzki and A. Kupiainen, Gross-Neveu model through convergent perturbation expansions, Comm. Math. Phys. 102 (1985), 1–30.
[22] J. Feldman, J. Magnen, V. Rivasseau and R. Sénéor, A renormalizable field theory: the massive Gross-Neveu model in two dimensions, Comm. Math. Phys. 103 (1986), 67–103.
[23] D. C. Brydges, Functional Integrals and their Applications, Lausanne lectures 1992, mp-arc 93–24.
[24] V. Rivasseau, Constructive Renormalization Theory, arXiv: math-ph/9902023.
[25] V. Rivasseau, From Perturbative to Constructive Renormalization, Princeton University Press, 1991.
[26] G. Gallavotti and F. Nicolo, Renormalization theory in four-dimensional scalar fields I, II, Comm. Math. Phys. 100 (1985), 545–590; 101 (1986), 247–282.
[27] G. Gallavotti, Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods, Rev. Mod. Phys. 57 (1985), 471–562.
[28] J. S. Feldman, T. R. Hurd, L. Rosen and J. D. Wright, QED: A proof of renormalizability, Springer, Lecture Notes in Physics 312, 1988.
[29] J. Polchinski, Renormalization And Effective Lagrangians, Nucl. Phys. B231 (1984), 269–295.
[30] P. K. Mitter and T. R. Ramadas, The two-dimensional O(N) nonlinear σ-model: renormalisation and effective actions, Comm. Math. Phys. 122 (1989), 575–596.
[31] G. Keller, Ch. Kopper and M. Salmhofer, Perturbative renormalization and effective Lagrangians in $\Phi^4_4$, Helv. Phys. Acta 65 (1992), 32–52.
[32] G. Keller and Ch. Kopper, Perturbative renormalization of composite operators via flow equations I, Comm. Math. Phys. 148 (1992), 445–467.
[33] G. Keller and Ch. Kopper, Perturbative renormalization of composite operators via flow equations II: Short distance expansion, Comm. Math. Phys. 153 (1993), 245–276.
[34] Ch. Wieczerkowski, Symanzik’s improved actions from the viewpoint of the renormalization group, Comm. Math. Phys. 120 (1988), 148–176.
[35] G. Keller, The perturbative construction of Symanzik’s improved action for $\Phi^4_4$ and QED$_4$, Helv. Phys. Acta 66 (1993), 453–470.
[36] G. Keller, Local Borel summability of Euclidean $\Phi^4_4$: A simple proof via differential flow equations, Comm. Math. Phys. 161 (1994), 311–323.
[37] G. Keller and Ch. Kopper, Perturbative renormalization of massless $\Phi^4_4$ with flow equations, Comm. Math. Phys. 161 (1994), 515–532.
[38] G. Keller and Ch. Kopper, Perturbative renormalization of QED via flow equations, Phys. Lett. B273 (1991), 323–332.
[39] G. Keller and Ch. Kopper, Renormalizability proof for QED based on flow equations, Comm. Math. Phys. 176 (1996), 193–226.
[40] C. Kim, A renormalization group flow approach to decoupling and irrelevant operators, Ann. Phys. (N.Y.) 243 (1995), 117–143.
[41] G. Keller, Ch. Kopper and C. Schophaus, Perturbative renormalization with flow equations in Minkowski space, Helv. Phys. Acta 70 (1997), 247–274.
[42] Ch. Kopper and V. F. Müller, Renormalization proof for spontaneously broken Yang–Mills theory with flow equations, Comm. Math. Phys. 209 (2000), 477–516.
[43] Ch. Kopper, V. F. Müller and Th. Reisz, Temperature independent renormalization of finite temperature field theory, Ann. Henri Poincaré 2 (2001), 387–402.
[44] M. Salmhofer, Renormalization, An Introduction, Springer, 1999.
[45] M. Bonini, M. D’Attanasio and G. Marchesini, Perturbative renormalization and infrared finiteness in the Wilson renormalization group: the massless scalar case, Nucl. Phys. B409 (1993), 441–464.
[46] Ch. Wetterich, Exact evolution equation for the effective potential, Phys. Lett. B301 (1993), 90–94.
[47] M. Bonini, M. D’Attanasio and G. Marchesini, Ward identities and Wilson renormalization group in QED, Nucl. Phys. B418 (1994), 81–112.
[48] M. Bonini, M. D’Attanasio and G. Marchesini, Renormalization group flow for SU(2) Yang–Mills theory and gauge invariance, Nucl. Phys. B421 (1994), 429–455.
[49] M. Bonini, M. D’Attanasio and G. Marchesini, BRS symmetry for Yang–Mills theory and exact renormalization group, Nucl. Phys. B437 (1994), 163–186.
[50] L. Girardello and A. Zaffaroni, Exact renormalization group equation and decoupling in quantum field theory, Nucl. Phys. B424 (1994), 219–238.
[51] C. Becchi, On the construction of renormalized gauge theories using renormalization group techniques, arXiv: hep-th/9607188.
[52] M. Bonini and F. Vian, Chiral gauge theories and anomalies in the Wilson renormalization group approach, Nucl. Phys. B511 (1998), 479–494.
[53] M. Bonini and F. Vian, Wilson renormalization group for supersymmetric gauge theories and gauge anomalies, Nucl. Phys. B532 (1998), 473–497.
[54] M. D’Attanasio and T. R. Morris, Gauge invariance, the quantum action principle, and the renormalization group, Phys. Lett. B378 (1996), 213–221.
[55] S. Arnone, Y. A. Kubyshin, T. R. Morris and J. F. Tighe, A gauge invariant regulator for the ERG, Int. J. Mod. Phys. A16 (2001), 1989.
[56] U. Ellwanger, Flow equations and BRS invariance for Yang–Mills theories, Phys. Lett. B335 (1994), 364–370. [57] U. Ellwanger, M. Hirsch and A. Weber, Flow equations for the relevant part of the pure Yang–Mills action, Z. Phys. C69 (1996), 687–697. [58] U. Ellwanger, Confinement, monopoles and Wilsonian effective action, Nucl. Phys. B531 (1998), 593–612. [59] M. Reuter and C. Wetterich, Effective average action for gauge theories and exact evolution equations, Nucl. Phys. B417 (1994), 181–214. [60] M. Reuter and C. Wetterich, Exact evolution equation for scalar electrodynamics, Nucl. Phys. B427 (1994), 291-324. [61] M. Reuter and C. Wetterich, Gluon condensation in nonperturbative flow equations, Phys. Rev. D56 (1997), 7893–7916. [62] H. Gies, Running coupling in Yang–Mills theory — a flow equation study, Phys. Rev. D66 (2002), 025006. [63] O. Lauscher and M. Reuter, Towards nonperturbative renormalizability of quantum Einstein gravity, Int. J. Mod. Phys. A17 (2002), 993–1002. [64] O. Lauscher and M. Reuter, Is quantum Einstein gravity nonperturbatively renormalizable? Class. Quant. Grav. 19 (2002), 483–492. [65] C. Bagnuls and C. Bervillier, Exact renormalizsation group equations. An introductory review, Phys. Rep. 348 (2001), 91–157. [66] Ch. Kopper, Renormierungstheorie mit Flussgleichungen, Shaker Verlag, Aachen, 1998. [67] J. Glimm and A. Jaffe, Quantum Physics, Sec. ed., Springer, New York, 1987, Chap. 9.1. [68] T. Hida, Stationary Stochastic Processes, Princeton, 1970, Theo. 4.2. [69] Y. Yamasaki, Measures on Infinite Dimensional Spaces, World Scientific, Singapore, 1985, Part A, Theo. 17.1. [70] Ch. Kopper and F. Meunier, Large momentum bounds from flow equations, Ann. Henri Poincar´e 3 (2002), 435–449. [71] S. Weinberg, High energy behaviour in quantum field theory, Phys. Rev. 118 (1960), 838–849. [72] N. P. Landsman and C. van Weert, Real- and imaginary-time field theory at finite temperature and density, Phys. Rep. 145 (1987), 141–249. [73] N. 
Bourbaki, Fonctions d’une variable r´eelle, Editions Hermann, Paris, 1976, Chap. 6. [74] Y. M. P. Lam, Perturbation Lagrangian theory for scalar fields — Ward–Takahashi identity and current algebra, Phys. Rev. D6 (1972), 2145–2161. [75] Y. M. P. Lam, Equivalence theorem on Bogoliubov–Parasiuk–Hepp–Zimmermannrenormalized Lagrangian field theories, Phys. Rev. D7 (1973), 2943–2949. [76] J. H. Lowenstein, Differential vertex operations in Lagrangian field theory, Comm. Math. Phys. 24 (1971), 1–21. [77] O. Piguet and S. P. Sorella, Algebraic Renormalization, Springer, Berlin, 1995. [78] J. Zinn–Justin, Quantum Field Theory and Critical Phenomena, Sec. ed., Clarendon Press, Oxford, 1993. [79] A. A. Slavnov, Math. Theor. Phys. 10 (1972), 99. [80] J. C. Taylor, Ward identities and charge renormalization of the Yang–Mills field, Nucl. Phys. B33 (1971), 436–444. [81] C. Becchi, A. Rouet and R. Stora, Renormalization of the Abelian Higgs–Kibble Model, Comm. Math. Phys. 42 (1975), 127–162. [82] C. Becchi, A. Rouet and R. Stora, Renormalization of gauge theories, Ann. Phys. (N.Y.) 98 (1976), 287–321.
[83] L. D. Faddeev and A. A. Slavnov, Gauge Fields: Introduction to Quantum Theory, Benjamin, Reading MA, 1980. [84] V. F. Müller, Perturbative renormalization by flow equations, arXiv hep-th/0208211, Proposition 4.4.
September 1, 2003 11:49 WSPC/148-RMP
00172
Reviews in Mathematical Physics Vol. 15, No. 6 (2003) 559–628 c World Scientific Publishing Company
ON THE MODULI OF A QUANTIZED ELASTICA IN P AND KdV FLOWS: STUDY OF HYPERELLIPTIC CURVES AS AN EXTENSION OF EULER’S PERSPECTIVE OF ELASTICA I
SHIGEKI MATSUTANI∗ and YOSHIHIRO ÔNISHI†
∗8-21-1 Higashi-Linkan, Sagamihara, 228-0811 Japan
†Faculty of Humanities and Social Sciences, Iwate University, Ueda, Morioka, Iwate, 020-8550 Japan
Received 25 June 1999
Revised 4 May 2003
Quantization requires the evaluation of all states of a quantized object, not only of the states that are stationary with respect to its energy. In this paper we investigate the moduli $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ of a quantized elastica, a quantized loop with an energy functional associated with the Schwarz derivative, on a Riemann sphere $\mathbb{P}$. We prove that its moduli space decomposes into a set of equivalence classes determined by flows obeying the Korteweg-de Vries (KdV) hierarchy, which conserve the energy. Since a flow obeying the KdV hierarchy has a natural topology, it induces a topology on the moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Using this topology, $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ is classified. Studies of a loop space in the category of topological spaces Top are well established, and its cohomological properties are well known. As the moduli space of a quantized elastica can be regarded as a loop space in the category of differential geometry DGeom, we also prove the existence of a functor between a triangle category related to a loop space in Top and that in DGeom, using the induced topology. As Euler investigated the elliptic integrals and their moduli by observing the shape of a classical elastica on $\mathbb{C}$, this paper is devoted to relations between hyperelliptic curves and a quantized elastica on $\mathbb{P}$ as an extension of Euler's perspective of elastica.
Keywords: Statistical mechanics; elastica; polymer; loop space; hyperelliptic function.
1. Introduction
The history of investigations of the elastica was opened by James Bernoulli in 1691, according to Truesdell's inquiry [1, 2]. He named the shape of a thin non-stretching elastic rod elastica and proposed the elastica problem: what shape does an elastica take for a given boundary condition? It should further be noted that he also proposed the lemniscate problem and discovered an elliptic integral corresponding to the lemniscate function through his investigation of the elastica. He considered a smooth curve with arc-length parameter in a plane $\mathbb{C}$,
\[ \tilde\gamma : [0, l] \hookrightarrow \mathbb{C}, \qquad (s \mapsto \tilde\gamma(s)). \]
Following his studies, his nephew Daniel Bernoulli discovered that the elastica obeys a minimal principle: the shape of the elastica is realized as a stationary point of an energy functional, nowadays called the Euler–Bernoulli functional [1–3],
\[ E[\tilde\gamma] = \int k^2\, ds, \]
where $k$ is the curvature of the curve $\tilde\gamma$ in $\mathbb{C}$, $k = -\sqrt{-1}\,\partial_s^2\tilde\gamma/\partial_s\tilde\gamma$, $\partial_s := d/ds$, and $s$ is the arc-length of the curve with respect to the induced metric in $\mathbb{C}$. (It should be noted that this functional differs from that of a "string" in the literature of string theory in elementary particle physics: although an elastica is a model of the string of a chord instrument, e.g. the guitar, the "string" of string theory cannot be realized in the classical mechanical regime.) Since the curvature is expressed as $k = \partial_s\phi$, where $\phi$ is the tangential angle, and the energy is given by $E = \int |\partial_s\phi|^2\, ds$, the elastica problem can be interpreted as the oldest problem of a harmonic map into the target space U(1): if we write $\partial_s\phi\, ds = g^{-1}dg$ for a U(1)-valued function $g$ over $\mathbb{C}$, then the Hodge-star dual is $*g^{-1}dg = \partial_s\phi$ and $E = \int \langle g^{-1}dg \wedge *g^{-1}dg\rangle$.
The elastica problem is to investigate the moduli $\tilde{\mathcal{M}}_{\mathrm{elas,cls}}$,
\[ \tilde{\mathcal{M}}_{\mathrm{elas,cls}} := \{\tilde\gamma : [0,1] \hookrightarrow \mathbb{C} \mid \delta E[\tilde\gamma]/\delta\tilde\gamma = 0\}/\sim. \]
Here "$\sim$" means modulo Euclidean moves in $\mathbb{C}$ and dilatation. We sometimes call this space the moduli space of the classical elasticas. The classification of the moduli space $\tilde{\mathcal{M}}_{\mathrm{elas,cls}}$ was essentially done by Euler in 1744 by means of numerical computations [4]. The moduli space $\tilde{\mathcal{M}}_{\mathrm{elas,cls}}$ is classified by the moduli of the elliptic curves [1–3, 5]. It is noted that before Euler referred to Fagnano's paper on his discovery of an algebraic property of the lemniscate function (an elliptic function of a special modulus) on December 31, 1751, the elliptic integrals for more general moduli had been investigated in the study of this classical harmonic map problem. (It is known that Jacobi regarded that day as the birthday of the elliptic function. Thus we think of the elastica as a kind of movement of the fetus of algebraic curves.) We also emphasize that, from the beginning, the harmonic map problem (classical field theory in physics) has been closely related to algebraic varieties.
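The Euler–Bernoulli functional $E = \int k^2\, ds$ with $k = \partial_s\phi$ can be illustrated numerically. The following is only a minimal sketch, not the authors' method: it assumes a closed curve sampled as a polygon of complex numbers and approximates the curvature at each vertex by the turning angle of the tangent divided by the local arc length. The names `euler_bernoulli_energy` and `circle` are hypothetical helpers introduced for this illustration.

```python
import math

def euler_bernoulli_energy(points):
    """Discrete estimate of E = integral of k^2 ds for a closed polygon.

    points: list of complex numbers, the vertices of a closed curve in C.
    At each vertex, k is approximated by (turning angle) / (local length),
    so the vertex contributes (turning angle)^2 / (local length).
    """
    n = len(points)
    edges = [points[(i + 1) % n] - points[i] for i in range(n)]
    energy = 0.0
    for i in range(n):
        e_in, e_out = edges[i - 1], edges[i]
        # conj(e_in) * e_out has argument phi_out - phi_in (tangential angle jump).
        prod = e_in.conjugate() * e_out
        turn = math.atan2(prod.imag, prod.real)
        ds = 0.5 * (abs(e_in) + abs(e_out))
        energy += turn * turn / ds
    return energy

def circle(radius, n):
    """n equally spaced sample points on a circle of the given radius."""
    return [radius * complex(math.cos(2 * math.pi * j / n),
                             math.sin(2 * math.pi * j / n)) for j in range(n)]

if __name__ == "__main__":
    for r in (1.0, 2.0):
        # For a circle, k = 1/r and the length is 2*pi*r, so E = 2*pi/r.
        print(r, euler_bernoulli_energy(circle(r, 2000)), 2 * math.pi / r)
```

For a circle of radius $r$ the exact value is $2\pi/r$, which also shows that the functional is not invariant under dilatation; this is why the moduli above are taken modulo dilatation as well as Euclidean moves.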
Recently Mumford investigated this elastica problem from the viewpoint of applied mathematics and gave simple and deep expressions for the shape of the elastica, which show the depth, importance and beauty of this problem [6]. In particular, for a closed elastica, Euler showed that its moduli space,
\[ \mathcal{M}_{\mathrm{elas,cls}} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \delta E[\gamma]/\delta\gamma = 0\}/\sim, \]
consists of two disjoint points: the corresponding moduli $\tau$ of the elliptic curves are the two points $\tau = 0$ and $\tau = 0.70946\ldots$ [1–4]. Recently the loop space has become one of the most studied objects in mathematics, and there have been many efforts to investigate it [7–11 and references therein]. Further, it is well known that soliton equations are closely related to loop spaces, loop
groups and loop algebras [8, 11]. However, these studies are sometimes too abstract to be related to physical problems, except for problems in elementary particle physics; for example, the embedding space is often a group manifold, e.g. U(N). Further, little attention is paid to the energy functional in these studies. On the other hand, our object of concern is a non-stretching elastica, which is related, as a physical model, to a large polymer such as deoxyribonucleic acid (DNA) [12–15]. An elastica has an energy functional, as described above. Thus our problem basically differs from the arguments on an ordinary loop space in [8, 10, 11], with the exception of [7, 9], though it is closely related to them. One of the authors (S.M.) considered the quantization of a closed elastica (precisely speaking, the statistical mechanics of elasticas) [13]. He defined the moduli space of the closed quantized elastica, which consists of isometric immersions of $S^1$ into $\mathbb{C}$ modulo Euclidean motion and dilatation,
\[ \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \text{isometric immersion}\}/\sim. \]
He investigated the partition function from a physical point of view, which has not been mathematically justified: $Z : \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}} \times \mathbb{R}_{>0} \to \mathbb{R}$, with
\[ Z[\beta] = \int_{\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}} D\gamma\, \exp(-\beta E[\gamma]), \]
where $\beta \in \mathbb{R}_{>0} := \{x \in \mathbb{R} \mid x > 0\}$ and $D\gamma$ is the Feynman measure. For the quantization of an elastica, we need more information on the moduli space of curves than that around its stationary points. To evaluate this map $Z$, he classified the moduli space of a quantized closed elastica $\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$ and attempted to redefine the Feynman measure by replacing it with a series of Riemann integrals over $\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$. His quantization is somewhat novel for an elastica. He proved, at a physical level of rigor, that the moduli space of the quantized elastica is given as a subspace of the moduli space of the modified Korteweg-de Vries (MKdV) equation [13]. Here we should emphasize that it is very surprising that a physical system is completely described by a soliton equation, as mentioned in [13]. Even in physical phenomena which are known to be represented by soliton equations, such as shallow water waves, plasma waves, charge density waves and so on, the higher soliton solutions are, in general, outside the region of validity of the approximation; of course, one- or two-soliton solutions do represent these phenomena well. On the other hand, in the quantized elastica problem the function space is completely expressed by the MKdV hierarchy, even though problems in polymer physics are, in general, too complex to be solved exactly [15]. In this paper we will rewrite the physical theorem in [13] from a mathematical point of view and extend it. Pedit gave a lecture on a loop space over a Riemann sphere $\mathbb{P}$ at Tokyo Metropolitan University in 1998 [16]. There he showed that the
loop space is related to the Korteweg-de Vries (KdV) flow by considering a loop in $\mathbb{C}^2\setminus\{0\}$. As his treatment is given in the framework of pure mathematics, we will follow the expressions of Pedit and deal with the KdV flow instead of the MKdV flow here. Due to the Miura map (a Riccati-type differential equation), the MKdV flow and the KdV flow can be regarded as different aspects of the same object, so this choice is not significant. Mathematical investigations of the KdV flow lead us to our main results, Theorems 3.4, 4.2 and 7.4. As we will show later, our investigation of a quantized elastica leads us to study the hyperelliptic curves and their moduli space, just as Euler encountered the elliptic integrals and studied the moduli of the elliptic functions by observing the shape of a classical elastica on $\mathbb{C}$. One of the purposes of this study is to understand the hyperelliptic functions and their moduli by investigating a quantized elastica in $\mathbb{P}$ as an extension of Euler's perspective of elastica. After we submitted the first version of this paper, these works progressed [17–20]. Hence in this revised version we have also rewritten the related parts. The contents of this paper are as follows. Section 2 gives an expression of a real curve immersed in a Riemann sphere $\mathbb{P}$ according to the lecture of Pedit [16]. Using his expressions, we define the moduli space of a real smooth curve immersed in $\mathbb{P}$ and an energy functional of the curve whose integrand is the Schwarz derivative along the curve. When we regard $\mathbb{P}$ as a complex plane with the point at infinity, $\mathbb{C} \cup \{\infty\}$, the energy functional is identified with the Euler–Bernoulli energy functional around the origin $\{0\}$ of $\mathbb{C} \cup \{\infty\}$, and the curve with this energy reduces to a quantized elastica, which was studied by one of the authors [13]. Thus we continue to refer to such a curve in $\mathbb{P}$ as a "quantized elastica in $\mathbb{P}$".
In order to consider a quantum effect, we need knowledge of a set of curves with different energies, rather than of only a stationary point of the energy functional, even though we are dealing with a single elastica. Thus in this paper we will call the moduli $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ defined in Definitions 2.10 and 2.12 the "moduli of a quantized elastica" rather than the moduli of loops. In Sec. 2 we will give an equivalence, in a certain sense, between a loop space over $\mathbb{C}^2\setminus\{0\}$ and one over $\mathbb{P}$. Further, following MacLaughlin and Beylinski [21, 7], we will introduce a natural topology of the loop space, which is induced from the topology of the base space. In Sec. 3 we introduce infinite-dimensional parameters $t = (t_1, t_2, t_3, \ldots)$ which deform a given curve and define a flow obeying the KdV hierarchy along $t$, which is called the KdVH flow. There we give our first main result, Theorem 3.4. Since the energy functional of a curve turns out to be a first integral with respect to the parameter $t$, we prove that the KdVH flow can be used to classify the moduli $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ of a quantized elastica in $\mathbb{P}$. In other words, the moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ is decomposed into a set of equivalence classes with respect to the KdVH flow. As Sec. 3 gives the differential-geometric and dynamical properties of the quantized elastica, we will attempt to express the theorem in the language of differential geometers. Remark 3.10 is a key to the study in Sec. 3.
Preliminary considerations lead to the fact that the moduli space of a quantized elastica $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ is a subspace of the moduli space of the KdVH flow $\mathcal{M}_{\mathrm{KdV}}$, as shown in Proposition 4.29. The system of the KdV hierarchy has a natural topology, which essentially determines the algebraic properties of the KdV hierarchy [11, 22–25]. Using the results of these studies of the KdV hierarchy, we give a finer classification of the moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ in Theorem 4.2 and Proposition 4.33, which is our second main theorem. There a dense subspace of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ is decomposed into subspaces characterized by a natural number. As defined below Lemma 4.1, we encounter a finite type of the KdVH flow, which corresponds to the finite-type solutions of the KdV equation and is related to a hyperelliptic curve. The natural number is related to the genus of the hyperelliptic curve. In order to state our second main results, Theorem 4.2 and Proposition 4.33, Sec. 4 reviews the algebro-geometric properties of the KdV hierarchy based upon the so-called Sato–Mulase theory [24–26]. As the completion of the set of finite-type solutions is equal to $\mathcal{M}_{\mathrm{KdV}}$, we concentrate our attention on the finite-type solutions of the KdV flow and consider $\mathcal{M}_{\mathrm{KdV}}$ algebro-geometrically. As Sato–Mulase theory belongs to algebraic analysis and is based upon the formal power series ring, we replace the base ring of smooth functions by the ring of formal power series. There we find that a commutative differential ring is connected with the geometry of a commutative ring, i.e. a hyperelliptic curve. Using the inclusion $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \subset \mathcal{M}_{\mathrm{KdV}}$, we will introduce the relative topology in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ induced from the topology of $\mathcal{M}_{\mathrm{KdV}}$. In Sec. 5 we will show another algorithm for the explicit computation of solutions of the KdV flow. There we will reconsider the KdV equation in the framework of the inverse scattering method and comment on the meaning of Theorem 4.2 again at Proposition 5.20. In other words, we will rewrite our second result more analytically.
So readers can skip this section except for Example 5.21. There we will also review Krichever's construction of algebro-geometric solutions [27–28] and Baker's original method, given about one hundred years ago [17, 29]. Using it, we show that there is an injection from the moduli space $\mathcal{M}_{\mathrm{hyp}}$ of hyperelliptic curves to the moduli space $\mathcal{M}_{\mathrm{KdV}}$ of the KdV equation up to an ambiguity; this correspondence enables us to determine the functional forms of hyperelliptic $\wp$ functions as solutions of the KdV equations for any algebraically given hyperelliptic curves, including degenerate curves. Section 6 is a digression, in which we will review a result on a loop space over $S^2$ in the category of topological spaces Top, whose morphisms are continuous maps, following the arguments in the textbook of Bott and Tu [30]. Studies of a loop space in Top are well established, and its cohomological properties are well known. On the other hand, the moduli space of a quantized elastica in $\mathbb{P}$ can be regarded as a loop space in the category of differential geometry DGeom. Thus, by loosening the properties in DGeom and regarding them as those in Top, it is expected that the moduli of a quantized elastica $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ in $\mathbb{P}$ are topologically related to those of a loop space in Top. Thus in Sec. 6 we will review a loop space in Top and show its cohomological properties.
In Sec. 7 we will discuss the topological properties of the moduli of a quantized elastica $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and give our third main theorem. As loop spaces in both Top and DGeom are not finite-dimensional spaces when we regard them as manifolds in an appropriate sense, it is not known whether de Rham's theorem is applicable to them. However, it is expected that cohomological sequences in both categories should correspond to each other. In other words, it is important to argue the existence of a functor between triangle categories related to them, i.e. a quasi-isomorphism. Precisely speaking, though the closedness condition and the reality condition in the moduli $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ make its topological properties difficult to treat, we will tune the low-dimensional parts of the chain complex of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and consider a complex of quotient spaces $C\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Then we will show the existence of a functor between the triangle categories for loop spaces in both Top and DGeom as our third main theorem, Theorem 7.4. The existence of the functor means that our theory in DGeom is justified in topological investigation. We believe that this result is meaningful for investigations of the loop space. Section 8 gives remarks and comments on our results. First we will comment on sequences of homotopy of loop spaces in both Top and DGeom. Next, we will indicate a possibility of computing the partition function of a quantized elastica in $\mathbb{C}$. Even in the quantized system, we will show that the orbit space is meaningful, whereas it is well known that in a noncommutative space the concepts of orbit and geometry are sometimes meaningless [58]. So we will comment on this fact. Further, we will remark on the relations between our system and the Painlevé equation of the first kind [13, 31], and between our system and conformal field theory. Finally, we will comment on our results from the point of view of recent progress on the Dirac operator related to immersed objects, based upon [12–14, 33, 34].
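We will also mention the possibility of a higher-dimensional version of our considerations there.
Notations
$\mathbb{R}$ and $\mathbb{C}$ are the real and complex number fields, respectively. $\mathbb{R}_{\ge 0}$ is the set of non-negative real numbers. $\mathbb{Z}$ is the set of integers and $\mathbb{N}$ is the set of natural numbers 1, 2, 3, .... $\mathbb{Z}_{\ge 0}$ is the set of non-negative integers. $C^\infty(A, B)$ means the set of $B$-valued smooth functions over $A$. $\mathbb{R}[x_1, \ldots, x_n]$ is the set of polynomials in $x_1, \ldots, x_n$ with $\mathbb{R}$-valued coefficients and $\mathbb{R}[[x_1, \ldots, x_n]]$ is the set of formal power series in $x_1, \ldots, x_n$ with $\mathbb{R}$-valued coefficients. Other important quantities are listed as follows (symbol : meaning : where introduced).
$\mathcal{M}^{\mathbb{P}}$ : Moduli of loops in $\mathbb{P}$ : Definition 2.4
$\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ : Moduli of loops in $\mathbb{C}^2\setminus\{0\}$, $\varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}} \to \mathcal{M}^{\mathbb{P}}$ : Definition 2.4, Remark 2.5
$\{\gamma, s\}_{\mathrm{SD}}$ : Schwarz derivative : Definition 2.6
$\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Moduli of quantized elastica in $\mathbb{P}$ : Definition 2.10
$\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$ : Moduli of quantized elastica in $\mathbb{C}$ : Definition 2.10
$\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ : $\varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} \to \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Definition 2.10, Remark 2.11
$\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ : $\pi^{\mathbb{P}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Definition 2.12
$\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$ : $\pi^{\mathbb{C}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$ : Definition 2.12
$\mathfrak{M}_{\mathrm{elas}}$ : $\pi_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}_{\mathrm{elas}}$ : Definition 2.12
$E[\gamma]$ : Energy of elastica in $\mathbb{P}$ : Definition 2.18
$\mathcal{D}_s$ : Differential ring over $C^\infty(S^1, \mathbb{C})$ : Definition 3.1
$\mathcal{E}_s$ : Micro-differential ring to $\mathcal{D}_s$ : Definition 3.1
$V^\infty$ : $S^1 \times (\prod_{n=1}^{\infty} \mathbb{R})$ : Definition 3.2
$\bar\phi_{\partial_s u, t}$, $\bar\varphi_{\partial_s u, t}$ : The KdVH flow : Definition 3.2, Proposition 3.11
$\Omega$ : Recursion differential operator : Definition 3.2, Lemma 3.6
$\sim_{\mathrm{KdVHf}}$ : Equivalence relation related to the KdVH flow : Definition 3.2
$\phi_{A,t}$ : A flow for $A$ : Definition 3.7
$\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}/\sim_{\mathrm{KdVHf}}$ : Quotient by the KdVH flow, with projection $\pi_{\mathrm{elas}}$ from $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Definition 3.18
$\mathcal{M}^{\mathbb{P}}_{\mathrm{elas,finite}}$, $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas},g}$ : Finite type of the KdV flow and finite $g$-type flow : Theorem 4.2
$\mathcal{D}_f$ : Differential ring over $\mathbb{C}[[t_1]]$ : Definition 4.4
$\mathcal{E}_f$ : Micro-differential ring to $\mathcal{D}_f$ : Definition 4.4
$\mathcal{E}_f$ : Micro-differential ring with coefficient $\mathbb{C}$ : Definition 4.4
$\mathcal{W}_f$, $\mathcal{W}_c$ : Subsets of $\mathcal{E}_f$ and $\mathcal{E}_c$ : Definition 4.4
$L$ : Subset of $\mathcal{E}_f$ : Lemma 4.8
$A_f$, $A_c$ : Commutative subrings of $\mathcal{E}_f$ and $\mathcal{E}_c$ : Lemma 4.8
$\mathcal{A}_c$ : Set of the commutative subrings in $\mathcal{E}_c$ : Definition 4.12
$\mathcal{D}_t$, $\mathcal{E}_t$, $\mathcal{W}_t$ : Differential rings over $\mathbb{C}[[t_1, t_2, \ldots]]$ : Definition 4.19
$\mathcal{M}_{\mathrm{KdV}}$, $\mathcal{M}^{\infty}_{\mathrm{KdV}}$ : Moduli of the KdV hierarchy : Definition 4.20, Proposition 4.32
$\mathcal{M}_{\mathrm{KdV},g}$ : $\pi^{g}_{\mathrm{KdV}} : \mathcal{M}_{\mathrm{KdV}} \to \mathcal{M}_{\mathrm{KdV},g}$ : Above Proposition 4.27
$F_g\mathcal{M}_{\mathrm{KdV}}$, $\mathcal{M}^{g}_{\mathrm{KdV}}$ : Filter of moduli of the KdV hierarchy : Definition 4.26
$\mathcal{E}_f$ : Micro-differential ring with coefficient $\mathbb{C}$ : Definition 4.4
$\mathcal{W}^{g}_f$, $\mathcal{W}^{0,1}_f$ : Gauge freedom : Lemma 4.28
$F_g\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Filter of moduli of quantized elastica : Proposition 4.29
$\mathcal{M}_{\mathrm{hyp},g}$ : Moduli of hyperelliptic curves of genus $g$ : Proposition 5.4
$P(X)$ : Path space over $X$ in Top : Proposition 6.2
$\Omega X$ : Loop space over $X$ in Top : Proposition 6.2
$D\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, $C\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ : Complex related to quantized elastica : Proposition 7.1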
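2. A Loop in P
In this section we give an expression of a real curve immersed in a Riemann sphere $\mathbb{P}$, following that of Pedit [16]. His expression is based upon the oldest theory of a complex curve embedded in a complex plane $\mathbb{C}$ or an upper half plane $\mathbb{H}$, which was found at the end of the nineteenth century and studied by Klein, Schwarz, Fuchs, Poincaré and others [35]. Using the expression, we will define the moduli spaces $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ ($\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$) and $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ ($\mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$) of smooth curves in $\mathbb{P}$ ($\mathbb{C}^2\setminus\{0\}$) in Definitions 2.10 and 2.12, and an energy functional of a curve in Definition 2.18, whose integrand is the Schwarz derivative along the curve. As mentioned in the Introduction, we will call $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ a moduli space of a quantized elastica. Let us consider a smooth immersion of a circle into a two-dimensional complex space without the origin,
\[ \psi : S^1 (:= \mathbb{R}/\mathbb{Z}) \hookrightarrow \mathbb{C}^2\setminus\{0\}, \qquad \left( s \mapsto \psi(s) = \begin{pmatrix} \psi_1(s) \\ \psi_2(s) \end{pmatrix} \right). \]
Using this map and the natural projection $\varpi$ of $\mathbb{C}^2\setminus\{0\}$ to the complex projective space (Riemann sphere) $\mathbb{P}$, we can define the immersion of a loop in $\mathbb{P}$:
Definition 2.1. We define an immersion $\gamma : S^1 \hookrightarrow \mathbb{P}$ by the commutative diagram, as $\gamma = \varpi \circ \psi$,
\[ \begin{array}{ccc} & & \mathbb{C}^2\setminus\{0\} \\ & \overset{\psi}{\nearrow} & \big\downarrow \varpi \\ S^1 & \xrightarrow{\ \gamma\ } & \mathbb{P} \end{array} \]
For a chart around $\psi_2 \neq 0$, $s \mapsto \gamma(s) = \dfrac{\psi_1}{\psi_2}(s)$.
Definition 2.2. (1) The special linear map $\mathrm{SL}_2(\mathbb{C}) : \mathbb{C}^2\setminus\{0\} \to \mathbb{C}^2\setminus\{0\}$,
\[ m \in \mathrm{SL}_2(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, a, b, c, d \in \mathbb{C},\ ad - bc = 1 \right\}, \]
acts on $\mathbb{P}$ through the Möbius transformation as a symmetry group of $\mathbb{P}$: $g_m : \varpi \circ \psi \mapsto \varpi \circ m\psi$ for $m \in \mathrm{SL}_2(\mathbb{C})$, and for a point $\gamma \in \mathbb{P}$,
\[ g_m : \gamma \mapsto \frac{a\gamma + b}{c\gamma + d}, \qquad \text{for } m = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
Let $\mathrm{PSL}_2(\mathbb{C})$ denote this group including the group action.
(2) Let $\Gamma_0(\mathbb{C})$ denote the subgroup which is characterized by the vanishing of the (2,1)-component,
\[ \Gamma_0(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{C}) \right\}, \]
and let $E_0(\mathbb{C})$ denote its action on $\mathbb{C} \cup \{\infty\}$ through the Möbius transformation.
(3) Let $\Gamma_1(\mathbb{C})$ denote the other subgroup, which is characterized by
\[ \Gamma_1(\mathbb{C}) := \left\{ \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{C}) \,\middle|\, |a| = 1 \right\}, \]
and let $E_1(\mathbb{C})$ denote its action on $\mathbb{C} \cup \{\infty\}$ through the Möbius transformation.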
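Remark 2.3. $\mathrm{PSL}_2(\mathbb{C})$ has the following properties:
(1) Translation, rotation and global dilatation: for $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \in \Gamma_0(\mathbb{C})$ (so that $d = 1/a$),
\[ z \to \frac{az + b}{d} = a^2 z + ab. \]
If we restrict the action to $\Gamma_1(\mathbb{C})$, it generates a Euclidean motion induced from $\mathbb{C} = \mathbb{P}\setminus\{\infty\}$.
(2) Coordinate transformation from the chart around 0 to the chart around $\infty$: $z \to -1/z$.
In Definition 2.10 we give the definitions of the moduli spaces of a quantized elastica in $\mathbb{P}$, which are our main objects in this article. However, as Proposition 2.8 is correct for a more complicated system, we first give provisional moduli spaces of loops.
Definition 2.4. We define the moduli spaces of loops as sets as follows:
(1) $\mathcal{M}^{\mathbb{P}} := \{\gamma : S^1 \hookrightarrow \mathbb{P} \mid \gamma \text{ is a smooth immersion}\}/\mathrm{PSL}_2(\mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}} := \{\psi : S^1 \hookrightarrow \mathbb{C}^2\setminus\{0\} \mid \psi \text{ is a smooth immersion},\ \det(\psi(s), \partial_s\psi(s)) = 1\}/\mathrm{SL}_2(\mathbb{C})$.
Here $\partial_s := d/ds$.
Remark 2.5. (1) Let $[\gamma]$ denote an element of $\mathcal{M}^{\mathbb{P}}$ for a representative element $\gamma \in \mathbb{P}$, and let $[\psi]$ denote an element of $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ for a representative element $\psi \in \mathbb{C}^2\setminus\{0\}$.
(2) For a free loop space $\mathcal{M}$ over a base space $M$,
\[ \mathcal{M} := \{\delta : S^1 \to M \mid \delta \text{ is a smooth immersion}\}, \]
we can define an evaluation map ev from $S^1 \times \mathcal{M}$ to $M$ by $\mathrm{ev}(s, \delta) = \delta(s)$ [7]. For $\mathcal{M}^{\circ}$ ($\circ$ is $\mathbb{P}$ or $\mathbb{C}^2\setminus\{0\}$), we have an evaluation map whose image is somewhat more complicated.
(3) For loops $\psi_1$ and $\psi_2$ in $\mathbb{C}^2\setminus\{0\}$ such that $[\psi_1] = [\psi_2] \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$, we obviously obtain $[\varpi\psi_1] = [\varpi\psi_2]$ in $\mathcal{M}^{\mathbb{P}}$. Thus we also use the notation $\varpi$ for the map
\[ \varpi : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}} \to \mathcal{M}^{\mathbb{P}}. \]
Definition 2.6 (Schwarz derivative [35]). $\{\gamma(s), s\}_{\mathrm{SD}}$ is called the Schwarz derivative; it is defined for a smooth map $\gamma : S^1 \to \mathbb{P}$ equipped with a parameter $s \in S^1$ by
\[ \{\gamma(s), s\}_{\mathrm{SD}} := \partial_s\!\left( \frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)} \right) - \frac{1}{2}\left( \frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)} \right)^{2}. \]
We write it as $\{\gamma, s\}_{\mathrm{SD}}$ or $\{\gamma, s\}_{\mathrm{SD}}(s)$ for brevity.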
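By elementary computations, the Schwarz derivative is also expressed as
\[ \{\gamma, s\}_{\mathrm{SD}} = \frac{\partial_s^3\gamma(s)}{\partial_s\gamma(s)} - \frac{3}{2}\left( \frac{\partial_s^2\gamma(s)}{\partial_s\gamma(s)} \right)^{2}. \]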
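The two expressions of the Schwarz derivative can be checked pointwise from the first three derivatives of $\gamma$. The sketch below is an illustration under stated assumptions rather than part of the original argument: the test curve $\gamma(s) = e^{\sqrt{-1}s}$, the sample matrix entries, and the helper names `schwarzian`, `schwarzian_two_step`, `mobius_jet` and `unit_circle_jet` are all hypothetical choices made here. It also evaluates the Schwarz derivative after a Möbius transformation with $ad - bc = 1$, whose derivatives are obtained by the chain rule.

```python
import cmath

def schwarzian(d1, d2, d3):
    """{gamma, s}_SD from the first three s-derivatives of gamma at a point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def schwarzian_two_step(d1, d2, d3):
    """Same quantity via the defining form d/ds(g''/g') - (1/2)(g''/g')^2."""
    ratio = d2 / d1
    d_ratio = d3 / d1 - ratio ** 2   # s-derivative of g''/g'
    return d_ratio - 0.5 * ratio ** 2

def mobius_jet(g, d1, d2, d3, a, b, c, d):
    """Value and first three derivatives of w = (a*g + b)/(c*g + d)."""
    u = c * g + d
    det = a * d - b * c
    w = (a * g + b) / u
    w1 = det * d1 / u**2
    w2 = det * (d2 / u**2 - 2 * c * d1**2 / u**3)
    w3 = det * (d3 / u**2 - 6 * c * d1 * d2 / u**3 + 6 * c**2 * d1**3 / u**4)
    return w, w1, w2, w3

def unit_circle_jet(s):
    """gamma = exp(i s) together with its first three derivatives."""
    g = cmath.exp(1j * s)
    return g, 1j * g, -g, -1j * g

if __name__ == "__main__":
    g, d1, d2, d3 = unit_circle_jet(0.3)
    print(schwarzian(d1, d2, d3), schwarzian_two_step(d1, d2, d3))
    # Sample matrix with a*d - b*c = (2+1j)*0.5 - 0.5*1j = 1.
    w, w1, w2, w3 = mobius_jet(g, d1, d2, d3, 2 + 1j, 0.5, 1j, 0.5)
    print(schwarzian(w1, w2, w3))
```

For the unit circle parameterized by arc length the value is $1/2$ at every point, and the value computed after the Möbius transformation agrees with it up to rounding.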
Straightforward computations give the following lemma.
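Lemma 2.7 ([35]). (1) Under the action of $g \in \mathrm{PSL}_2(\mathbb{C})$, the Schwarz derivative $\{\gamma, s\}_{\mathrm{SD}}$ is invariant:
\[ \{\gamma, s\}_{\mathrm{SD}} = \{g(\gamma), s\}_{\mathrm{SD}}. \]
(2) For a diffeomorphism $s' \in \mathrm{Diff}^+(S^1)$,
\[ \{\gamma, s\}_{\mathrm{SD}} = (\partial_s s')^2 \left( \{\gamma, s'\}_{\mathrm{SD}} - \{s, s'\}_{\mathrm{SD}} \right), \]
and for the U(1) action on $S^1$, i.e. $s' = s + \alpha$,
\[ \{\gamma(s), s\}_{\mathrm{SD}} = \{\gamma(s' - \alpha), s'\}_{\mathrm{SD}}. \]
Definition/Proposition 2.8 [35]. There is a natural one-to-one correspondence between $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ and $\mathcal{M}^{\mathbb{P}}$ with the following properties.
(1) If $[\gamma]$ is an element of $\mathcal{M}^{\mathbb{P}}$, there exists a unique lifted curve $[\psi]$ as an inverse of the map $\varpi$ ($\varpi[\psi] = [\gamma]$). Let the correspondence be denoted by $\tilde\sigma$, i.e. $\tilde\sigma : \mathcal{M}^{\mathbb{P}} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$, ($[\psi] = \tilde\sigma([\gamma])$). Then we have $\varpi \circ \tilde\sigma([\gamma]) = [\gamma]$ and $\tilde\sigma \circ \varpi([\psi]) = [\psi]$.
(2) For a map $\gamma : S^1 \to \mathbb{P}$ representing a point $\gamma(s) \in \mathbb{P}$, there is a curve $\psi$ in $\mathbb{C}^2\setminus\{0\}$ given as a solution of the differential equation
\[ \left( -\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{\mathrm{SD}}(s) \right)\psi(s) = 0, \]
so that $\psi$ defines an element $[\psi] \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$ and $[\varpi\psi] = [\gamma]$. This algorithm is a realization of $\tilde\sigma$.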
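As a concrete illustration of (2), one may take the unit circle $\gamma(s) = e^{\sqrt{-1}s}$. The following worked check is a sketch under assumptions made only for this example: $s$ is treated as a local parameter (periodicity and the choice of branch of the square root are ignored), and the determinant is read with the sign convention $\det(\psi, \partial_s\psi) = \psi_1\partial_s\psi_2 - \psi_2\partial_s\psi_1$.

```latex
\begin{align*}
\gamma(s) &= e^{\sqrt{-1}\,s}, \qquad
\partial_s\gamma = \sqrt{-1}\,e^{\sqrt{-1}\,s}, \qquad
\{\gamma, s\}_{\mathrm{SD}} = -1 - \tfrac{3}{2}\bigl(\sqrt{-1}\bigr)^{2} = \tfrac{1}{2},\\
\psi_2 &= \frac{\sqrt{-1}}{\sqrt{\partial_s\gamma}}
        = \sqrt{-1}\,e^{-\sqrt{-1}\,\pi/4}\,e^{-\sqrt{-1}\,s/2}, \qquad
\psi_1 = \gamma\,\psi_2
        = \sqrt{-1}\,e^{-\sqrt{-1}\,\pi/4}\,e^{\sqrt{-1}\,s/2},\\
-\partial_s^2\psi_a - \tfrac{1}{2}\{\gamma, s\}_{\mathrm{SD}}\,\psi_a
       &= \tfrac{1}{4}\psi_a - \tfrac{1}{4}\psi_a = 0 \qquad (a = 1, 2),\\
\psi_1\,\partial_s\psi_2 - \psi_2\,\partial_s\psi_1
       &= -\sqrt{-1}\,\psi_1\psi_2 = -\sqrt{-1}\cdot\sqrt{-1} = 1, \qquad
\psi_1/\psi_2 = \gamma .
\end{align*}
```

Here $\psi$ solves the differential equation of (2), has unit determinant, and projects back to $\gamma$.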
Proof. In this proof we deal only with representative elements $\gamma$ and $\psi$ of $\mathcal{M}^{\mathbb{P}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}$. First we check the well-definedness of $\tilde\sigma$ in (2). Without loss of generality, we use the chart $\psi_2 \neq 0$ for a loop $\psi := \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \in \mathbb{C}^2\setminus\{0\}$. Noting Remark 2.5(3), well-definedness means that the lift of the loop $\gamma(S^1) := \varpi\psi(S^1)$ is $\psi$, uniquely up to $\mathrm{SL}_2(\mathbb{C})$. By differentiating $\det(\psi, \partial_s\psi) = 1$ in $s$, we get $(\partial_s^2\psi_2)/\psi_2 = (\partial_s^2\psi_1)/\psi_1$. After a straightforward computation, for $\gamma = \psi_1/\psi_2$, we obtain the relation $(\partial_s^2\psi_2)/\psi_2 = -\{\gamma, s\}_{\mathrm{SD}}/2$. Up to $\mathrm{SL}_2(\mathbb{C})$, $\psi$ is identified with a solution of (2). Hence well-definedness is asserted. Further, the existence of a solution of the equation in (2) is guaranteed by a special solution,
\[ \psi = \begin{pmatrix} \sqrt{-1}\,\gamma/\sqrt{\partial_s\gamma} \\ \sqrt{-1}/\sqrt{\partial_s\gamma} \end{pmatrix}, \]
whose $\det(\psi, \partial_s\psi)$ is unity. The Wronskian property $\det(\psi, \partial_s\psi) = 1$ and the uniqueness of the solutions of a second-order differential equation confirm the uniqueness of the solution of (2) up to $\mathrm{SL}_2(\mathbb{C})$. Further, by the construction of the solutions, we consider the effect of $\mathrm{Diff}^+(S^1)$: for $s'(s)$, the Schwarz derivative changes as in Lemma 2.7 and, with $\partial_{s'}$ given by the chain rule, $\psi(s') := \psi(s)/\sqrt{\partial_{s'} s}$. Then $\psi_1(s') : \psi_2(s') = \gamma(s')$, and $\gamma(S^1)$ and $\psi(S^1)$ do not depend on the parameterization. Thus (1) and (2) are completely proved.
Remark 2.9 (Poincaré and Schwarz [35, 36]). By the analytic continuation of $s \in S^1$, $\gamma$ can be complexified to $\gamma_c$. If $\gamma_c$ is also embedded in $\mathbb{C}$, $\gamma_c^{-1}$ is an automorphic function. (In general, even though $\gamma$ is immersed or embedded in $\mathbb{P}$, $\gamma_c$ cannot be immersed in $\mathbb{P}$.) For example, in the case $s = \wp(\xi)$ ($\xi = \wp^{-1}(s) \in X_1$) for
$s \in \mathbb{P}$, $\xi \in \mathbb{C}$, $\{\xi, s\}_{\mathrm{SD}}$ is a meromorphic function of $s$, where $\wp(\xi)$ is the Weierstrass elliptic function and $X_1 = \mathbb{C}/\Lambda$ is an elliptic curve. These studies are due to Klein, Riemann, Poincaré, Schwarz and others. In this article we will not restrict ourselves to meromorphic functions. We will consider transcendental functions of $s$, because our problem is related to a physical problem, the elastica problem, just as the catenary, another physical curve, is also given by a transcendental function.
In this article we are concerned with a loop with the energy functional of Definition 2.18. However, the integrand $\{\gamma, s\}_{\mathrm{SD}}$ in the energy integral depends upon the parameterization of $S^1$, i.e. on $\mathrm{Diff}^+(S^1)$, by Lemma 2.7. Hence we must fix the parameterization of the loop in order to treat a loop with the energy functional. Even in $\mathbb{P}$ we can locally define a metric, because its tangent space $T\mathbb{P}$ is isomorphic to $\mathbb{C}$, but the action of $\mathrm{PSL}_2(\mathbb{C})$ prevents the metric from becoming global. Hence we restrict ourselves to the action of the subgroup $\Gamma_0(\mathbb{C})$ instead of $\mathrm{PSL}_2(\mathbb{C})$. Let us introduce our main objects in this article.
Definition 2.10. We define the moduli spaces of loops, which are called moduli of a quantized elastica or moduli spaces of a quantized elastica, as follows:
(1) $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} := \{\gamma : S^1 \hookrightarrow \mathbb{P} \mid \text{smooth immersion},\ |\partial_s\gamma(s)| = 1\}/E_0(\mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}} := \{\gamma : S^1 \hookrightarrow \mathbb{C} \mid \text{smooth immersion},\ |\partial_s\gamma(s)| = 1\}/E_0(\mathbb{C})$.
(3) $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} := \{\psi : S^1 \hookrightarrow \mathbb{C}^2\setminus\{0\} \mid \text{smooth immersion},\ \det(\psi(s), \partial_s\psi(s)) = 1,\ |\psi_a(s)| = 1\ (a = 1 \text{ or } 2)\}/\Gamma_0(\mathbb{C})$.
Remark 2.11. (1) The condition |∂s γ(s)| = 1 means that we will treat only loops with the arc-length parameter s in C or P = C ∪ {∞} equipped with the standard flat metric hereafter. We call the condition |∂s γ(s)| = 1 reality condition or arclength condition. C2 \{0} by [γ] and [ψ] for loops (2) We continue to express the elements in MPelas , Melas γ ∈ P and ψ ∈ C2 \{0} satisfying appropriate conditions respectively for a while. C2 \{0} (3) Further similar to Remark 2.5(3), we can define the map from Melas to MPelas by $ noting that the reality condition |∂s γ(s)| = 1 means |ψa (s)| = 1 (a = 1 or 2) under the condition det(ψ(s), ∂s ψ(s)) = 1 since ∂s γ(s) = det(ψ(s), ∂s ψ(s))/ψ2 (s)2 for ψ2 (s) 6= 0. (4) We H can find a representative element by tuning the dilation of E0 (C). By letting |dγ| = 2π for a curve with finite length in C ∪ {∞} we have a natural isomorphism, C,2π C 1 Melas ≈ Melas := γ : S ,→ C ∪ {∞} smooth immersion, |∂s γ(s)| = 1,
Here |dγ| = |∂s γ(s)|ds.
I
|dγ| = 2π
E1 (C) .
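(5) Using (1), we have a decomposition according to whether the length is finite or not, i.e.
\[ \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \approx \mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}} \coprod \mathcal{M}^{\infty}_{\mathrm{elas}} . \]
This picture is also supported by considering a smooth loop in a two-sphere $S^2$ and its stereographic projection. For $\mathcal{M}^{\infty}_{\mathrm{elas}}$, where $\oint |d\gamma| = \infty$, we have another representative element,
\[ \mathcal{M}^{\infty}_{\mathrm{elas}} \approx \mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}} := \Bigl\{\gamma : S^1 \hookrightarrow \mathbb{C} \cup \{\infty\} \Bigm| \text{smooth immersion},\ |\partial_s\gamma(s)| = 1,\ \sup_{s \in S^1} |\partial_s \log \partial_s\gamma(s)| = 1 \Bigr\}\Big/ E_1(\mathbb{C}) . \]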
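Using these equivalences and such a representative element, we can introduce a scale in our system.

Next let us introduce smaller moduli spaces for later convenience. Even under the reality condition $|\partial_s\gamma(s)| = 1$, there is a freedom to choose the origin of the loop, which is described by $\mathrm{Isom}(S^1) = \mathrm{U}(1)$. Thus let us define smaller sets, together with projections onto them from the moduli spaces, e.g. $\pi^{\mathbb{P}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$.

Definition 2.12 (Moduli of a quantized elastica). We define moduli spaces of loops, which are also called moduli of a quantized elastica, or moduli spaces of a quantized elastica, as follows:
(1) $\pi^{\mathbb{P}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}} := \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$.
(2) $\pi^{\mathbb{C}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} := \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$.
(3) $\pi^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} := \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$.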
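In physics, we are concerned only with the shape of an elastica, so the quotient $\mathfrak{M}^{\circ}_{\mathrm{elas}} = \mathcal{M}^{\circ}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$ is more important than $\mathcal{M}^{\circ}_{\mathrm{elas}}$. Further we remark here that we have a natural isomorphism $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}} \approx \mathcal{M}^{\mathbb{P}}/\mathrm{Diff}^+(S^1)$ as a connection between $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{P}}$.

Next we give our correspondence between $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ based on Proposition 2.8. We let $\mathfrak{M}^{\mathbb{C},2\pi}_{\mathrm{elas}} = \mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}/\mathrm{Isom}(S^1)$ and have $\mathfrak{M}^{\mathbb{C},2\pi}_{\mathrm{elas}} \approx \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$.

Definition/Proposition 2.13. (1) There is a natural one-to-one and continuous correspondence between $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ with the following properties.
(1-1) If $[\gamma]$ is an element of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, there exists a unique lifted curve $[\psi]$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ as an inverse of the map $\varpi$ ($\varpi[\psi] = [\gamma]$). Let the correspondence be denoted by
\[ \sigma : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}, \qquad ([\psi] = \sigma([\gamma])) . \]
Then we have $\varpi \circ \sigma([\gamma]) = [\gamma]$ and $\sigma \circ \varpi([\psi]) = [\psi]$.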
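(1-2) For a curve $\gamma(s) \in \mathbb{P}$ representing $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, there is a curve $\psi$ in $\mathbb{C}^2\setminus\{0\}$ given as a solution of the differential equation,
\[ \Bigl(-\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{\mathrm{SD}}(s)\Bigr)\psi(s) = 0 , \]
so that $\psi$ uniquely defines an element $[\psi]$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ and $\varpi[\psi] = [\gamma]$. This algorithm is a realization of $\sigma$.
(2) There is a natural one-to-one correspondence between $\mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ and $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ induced from $\sigma$ and $\varpi$.
(3) $\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ ($\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$ and $\mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$) are connected by
\[ \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}} = \varpi\,\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}, \qquad \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} = \varpi\,\mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}} . \]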
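Proof. (1) and (2) are proved essentially as in Proposition 2.8, once we check the compatibility between $|\partial_s\gamma(s)| = 1$ and $|\psi_2(s)| = 1$, and the continuity of the map. By Remark 2.11(3), the condition $|\psi_2(s)| = 1$ in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ essentially means the reality condition $|\partial_s\gamma(s)| = 1$ of the map $\gamma$ on the chart around $\psi_2 \neq 0$. Let us consider the continuity. For a loop in $\mathbb{P}$, $\gamma'(s) := \gamma(s) + \epsilon v(s)$ with a small number $\epsilon$ and an element $v$ of $C^\infty(S^1, \mathbb{C})$, the Schwarz derivative changes to first order in $\epsilon$ as
\[ \{\gamma', s\}_{\mathrm{SD}} = \{\gamma, s\}_{\mathrm{SD}} + \epsilon\left[ \frac{\partial_s^3 v}{\partial_s\gamma} - \frac{\partial_s^3\gamma}{(\partial_s\gamma)^2}\,\partial_s v - 3\,\frac{\partial_s^2\gamma}{\partial_s\gamma}\left( \frac{\partial_s^2 v}{\partial_s\gamma} - \frac{\partial_s^2\gamma}{(\partial_s\gamma)^2}\,\partial_s v \right) \right] + O(\epsilon^2) . \]
From the proof of Proposition 2.8, a solution $\psi'$ of the corresponding differential equation is periodic and thus is a loop in $\mathbb{C}^2\setminus\{0\}$. For sufficiently small $\epsilon$, the second term becomes small enough. Then, using perturbation theory, we have $\psi' = \psi + \epsilon\eta$ as a solution of
\[ \partial_s^2\psi' + \frac{1}{2}\{\gamma', s\}_{\mathrm{SD}}\,\psi' = 0 . \]
We note that $\{\gamma', s\}_{\mathrm{SD}}$ is also invariant under $\mathrm{PSL}_2(\mathbb{C})$. For the condition $|\psi_2| = 1$, we replace the parameter $s$ by $s'$ using the fact established in the proof of Proposition 2.8. On the other hand, for $\psi' = \psi + \epsilon\eta'$ we can find $v' \in C^\infty(S^1, \mathbb{C})$ such that $\gamma' := \psi_1'/\psi_2' = \psi_1/\psi_2 + \epsilon v'$. Hence both maps are continuous.

Here we consider the natural neighborhood in the moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.

Corollary 2.14. (1) There is a continuous injective map from $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ to the function space $C^\infty(S^1, \mathbb{C})$.
(2) $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ has a natural topology generated by neighborhoods in the function space $C^\infty(S^1, \mathbb{C})$.

Proof. (2) is obvious if (1) is proved. For a given $u \in C^\infty(S^1, \mathbb{C})$, the function $\psi \in C^\infty([0, 2\pi), \mathbb{C}^2)$ satisfying
\[ (-\partial_s^2 - u)\psi = 0 \]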
is uniquely determined up to $\mathrm{SL}_2(\mathbb{C})$. In general, even though $u$ is periodic and a function over $S^1$, $\psi$ is not periodic, by the Floquet theorem [49]. However, if $\psi$ is periodic for some $u$, then by letting $|\psi_2| = 1$, $\gamma \in \mathbb{P}$ is uniquely determined by $\gamma = \psi_1/\psi_2$ up to $E_0(\mathbb{C})$. We note that for such $\gamma$ and $u$, $u$ is given as $u = \{\gamma, s\}_{\mathrm{SD}}/2$, and $\{\gamma, s\}_{\mathrm{SD}}/2$ is invariant under the action of $E_1(\mathbb{C})$. (Under the action of $\mathrm{Diff}^+(S^1)$ as a reparameterization of the coordinate $s$, $\{\gamma, s\}_{\mathrm{SD}}/2$ is not invariant and changes its value. However, we have considered only the arc-length parameterization $s$ with $|\partial_s\gamma(s)| = 1$.) Hence if $\psi$ is periodic for some $u$, then by letting $|\psi_2| = 1$ it determines $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. The continuity is obvious from the previous proposition.

As the injective map in Corollary 2.14 is continuous, the above neighborhood can be geometrically interpreted as a neighborhood of $\gamma$.

Remark 2.15. It is known that the free loop space can be made a metric space and has a natural topology if the base space is a Riemannian manifold [7, 21]. Since an element of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ with finite length $\oint |d\gamma(s)| < \infty$ can be represented by a loop of length $2\pi$ whose center of gravity lies at the origin of $\mathbb{C}$, it is not difficult to treat the quotient by $E_0$ or $E_1$. Consider the image $C$ of the map $\gamma : S^1 \to \mathbb{P}$; $C := \gamma(S^1)$. For such a loop $C \subset \mathbb{P}$, which represents a point $[C] \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, there is a normal bundle characterized by the exact sequence of the tangent bundles of $C$ and $\mathbb{P}$,
\[ 0 \to TC \to T\mathbb{P}|_C \to N_C \to 0 . \]
Any element of $T\mathbb{P}|_C$ is decomposed according to $T\mathbb{P}|_C \equiv N_C \oplus TC$. For a smooth section $v \in C^\infty(C, T\mathbb{P}|_C)$ of $T\mathbb{P}|_C$ over $C$, and for an infinitesimal real parameter $\epsilon$, we have $C + \epsilon v$ as a loop in $\mathbb{P}$. Here $+$ means the natural addition in the local chart $\mathbb{C}$ of $\mathbb{P}$ in the sense of Euclidean geometry.
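Of course, it is important to check whether such a loop is a different point in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ or not, but if it is, we can find an infinitesimal path from $[\gamma]$ to $[\gamma + \epsilon v_\gamma]$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ by letting $v_\gamma \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$, $v_\gamma := v \circ \gamma$. The condition $|\partial_s(\gamma(s) + \epsilon v_\gamma(s))| = 1$ is not difficult to treat by reparameterizing $s$ by $s'$ in a primitive sense. Further, even in the case $[\gamma] = [\gamma + \epsilon v_\gamma]$, we can regard it as a trivial path. If $[\gamma]$ and $[\gamma + \epsilon v_\gamma]$ are different points for an infinitesimally small $\epsilon$, we can regard such $[v_\gamma]$ as an element in a set of smooth sections of the tangent bundle of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}) \equiv C^\infty(\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}, T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$.

We show that there exist such different points. From Corollary 2.14, $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$ is not the empty set. Let us find an element in $C^\infty(T_{[\gamma]}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$ for each $[\gamma] \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. As the fiber of $T_p\mathbb{P}$ is isomorphic to $\mathbb{C}$, we define a norm of $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$ by the sup-norm. (In this article, our argument does not strongly depend upon the norm in $C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$.) As it is difficult to define a length in a scaleless space, we consider an element of $\mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}$ rather than of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, in view of Remark 2.11. Consider $[\gamma] \in \mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}$ for $\gamma : S^1 \to \mathbb{P}$ satisfying the reality condition $|\partial_s\gamma(s)| = 1$ and $\oint |d\gamma| = 2\pi$, and $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$. Suppose that $\gamma(S^1) + \epsilon v(S^1)$ preserves the local and total length of $\gamma(S^1)$ for sufficiently small $\epsilon$,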
i.e. $|\partial_s(\gamma(S^1) + \epsilon v(S^1))| = 1$ and $\oint |d(\gamma + \epsilon v)| = 2\pi$ are satisfied. We call such a deformation isometric. Then we regard $[\gamma + \epsilon v]$ as an element of $\mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}$. If the vector field $v$ ($\notin \{\text{Euclidean moves}\}$) is an isometric deformation, $[\gamma + \epsilon v]$ is a different point from $[\gamma]$ in $\mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}$, and thus they are different points in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Then we can naturally define a neighborhood around a loop with finite length in our moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.

Similarly, for an element of $\mathcal{M}^{\infty}_{\mathrm{elas}}$, we can define a neighborhood. For an element $[\gamma]$ of $\mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}}$, we define its tangent space and velocity in $C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$. We can constrain the velocity to an isometric path, which locally preserves the length. However, since $\mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}}$ is defined by the sup-norm in Remark 2.11, for an element $[\gamma] \in \mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}}$ and an isometric path for $v \in C^\infty(S^1, T\mathbb{P}|_{\gamma(S^1)})$, $[\gamma + \epsilon v]$ generally does not belong to $\mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}}$ even for sufficiently small $\epsilon$. However, $[\gamma + \epsilon v]$ is in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $[\gamma + \epsilon v]$ again gives a point of $\mathcal{M}^{\infty}_{\mathrm{elas}}$. Hence the path is well-defined by a local argument. Accordingly we can naturally define a neighborhood in our moduli space $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.

Further, as we also define neighborhoods in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ using Proposition 2.13, we define a topology in our moduli spaces $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$. As the topology comes from that of the loop space, we call it the topology of loop space [7, 21].

As the topology of loop space is generated by $C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$, we consider in detail an infinitesimal deformation parameterized by $t \in [0, \epsilon]$ for a sufficiently small $\epsilon$.

Remark 2.16. In view of the arguments in Remark 2.15, we wish to find a one parameter family $[\gamma_t]$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ such that $\partial_t[\gamma_t]$ belongs to $C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$.
(1) First, we consider an isometric deformation, which locally preserves the arc-length, of a one parameter family of loops immersed in $\mathbb{P}$: $\gamma_\circ : S^1 \times [0, \epsilon] \to \mathbb{P}$ ($\gamma_t(s) := \gamma(s, t) \in \mathbb{P}$) satisfying
\[ [\partial_t, \partial_s]\gamma_t(s) = 0 . \]
Here $\partial_s := \partial/\partial s$ and $\partial_t := \partial/\partial t$. We call this condition the isometric condition. Then if $|\partial_s\gamma_{t=0}(s)| = 1$ for $s \in S^1$, we have $|\partial_s\gamma_t(s)| = 1$ for $(s, t) \in S^1 \times [0, \epsilon]$. For $(s, [\gamma_t]) \in S^1 \times \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, $\partial_s$ acts only on $S^1$ whereas $\partial_t$ acts only on $[\gamma_t] \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Of course the relation $[\partial_t, \partial_s](s, [\gamma_t]) = 0$ holds trivially. On the other hand, for the evaluation map $\mathrm{ev}(s, [\gamma_t])$ as in Remark 2.5(2), the action of $[\partial_t, \partial_s]$ is not trivial. However, by dealing only with isometric deformations, we can avoid the noncommutativity between a deformation and the evaluation map.
(2) Let us consider a one parameter family of loops immersed in $\mathbb{P}$, $\gamma_\circ : S^1 \times [0, \epsilon] \to \mathbb{P}$, given by a differential equation whose right hand side depends upon $\gamma_t$ itself. First assume that the differential equation is $\partial_t\gamma_t = f(\gamma_t)$ for a given functional $f$. In this case, the deformation depends upon the affine coordinate $\gamma_t$ in $\mathbb{P}$ and it is not invariant under the action of $E_1$. Further we note that $\partial_s\gamma_t(s)$ is the tangential vector of the loop $\gamma_t(S^1)$ and $\phi := -\sqrt{-1}\,\log\partial_s\gamma_t(s)$ denotes its
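tangential angle if $|\partial_s\gamma_t(s)| = 1$. If the deformation $\partial_t\gamma_t$ is governed by a function of $\partial_s\gamma_t(s)$ itself, the deformation must depend upon the angle of $C$ in $\mathbb{P}$ and on a Euclidean move. Hence such deformations cannot be deformations in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, and a deformation in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ does not involve $\gamma(s)$ and $\phi$.
(3) From Proposition 2.8(1), $u_t(s) \equiv u(s, t) := \{\gamma_t(s), s\}_{\mathrm{SD}}/2$ is a function on $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Further, $u_t(s)$ depends only on $\partial_s\log(\partial_s\gamma_t(s))$ and $\partial_s^2\log(\partial_s\gamma_t(s))$, by Definition 2.6. We may consider the deformation of an element $\gamma_t$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ through an equation,
\[ \partial_t u_t = f(u_t, \partial_s u_t, \partial_s^2 u_t, \ldots, A) , \]
for an appropriate functional $f$ and function $A \in C^\infty(S^1 \times [0, \epsilon], \mathbb{C})$; the function $A$ must be invariant under the action of $E_0(\mathbb{C})$. If $u_t$ is determined at a time $t$, $\gamma_t$ can be reconstructed by Proposition 2.13. If neither $\gamma_t(s)$ nor $\partial_s\gamma_t(s)$ itself appears in the right hand side, the deformation is invariant under the action of $E_1(\mathbb{C})$ for an appropriate $A$.

By the above consideration, we can consider an infinitesimal deformation in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ given by $\partial_t u = f(u, \partial_s u, \partial_s^2 u, \ldots, A)$ and $[\partial_t, \partial_s]\gamma(s, t) = 0$.

As we have prepared the tools, it is not difficult to deal with the quotient spaces $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathfrak{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$. From here on, let $\gamma$ itself denote an element of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, instead of $[\gamma]$, for a loop $\gamma(S^1) \in \mathbb{P}$ satisfying the conditions. Similarly we write $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ for the sake of simplicity. Further we consider a flow in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.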
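Remark 2.17. Let us consider the situation in which, for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and its neighborhood $U_\gamma$ in terms of the loop topology, we can find another point $\gamma' \in U_\gamma$ such that $\gamma' = \gamma + \epsilon v$ for a sufficiently small $\epsilon$ and some velocity $v \in C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$. Suppose that by sequentially finding such points, we construct a curve $\gamma_t$, $t \in [0, 1]$, in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ connecting a starting point $\gamma_0 = \gamma$ and a terminal point $\gamma_1 = \gamma''$ for some $\gamma'' \in U_\gamma$. Then we may write the velocity at $\gamma_t$ as $\partial_t\gamma_t$. In this way, for each point $\gamma$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, we can define an immersion of $[0, 1]$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ for a smooth section $v \in C^\infty(T_\gamma\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$, if it is well-defined. We call such an immersion a flow in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.

Further, for each point $\gamma_t$ in a flow, $t \in [0, 1]$, with $v_t \in C^\infty(T_{\gamma_t}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$, let us assume that we can choose another element $v_t' \in C^\infty(T_{\gamma_t}\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}})$ and find a point $\gamma_{t,\epsilon}$ in the neighborhood of $\gamma_t$ such that $\gamma_{t,\epsilon} = \gamma_t + \epsilon v_t'$ for a sufficiently small parameter $\epsilon$. Then we can consider a duplex flow $\gamma_{t,t'}$ for $(t, t') \in [0, 1]^2$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. Similarly we can deal with an immersion of $[0, 1]^m$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. In this case, we call the immersion $\gamma_t$, $t \in [0, 1]^m$, a multiple flow. Further, in certain cases, the immersion of $[0, 1]^m$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ can be extended to an immersion of $\mathbb{R}^m$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, where $m$ is a positive integer or infinity. Similarly we can deal with flows in $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$. We define the KdV and KdVH flows as extensions of a $[0, 1]^m$ immersion to an $\mathbb{R}^m$ immersion in Sec. 3.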
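Definition 2.18 (Energy of a quantized elastica). We introduce an energy functional of $\gamma \in \mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}} \approx \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$, called the Euler–Bernoulli energy functional, by
\[ E[\gamma] := \frac{1}{2\pi}\int_{S^1} \{\gamma(s), s\}_{\mathrm{SD}}\,ds . \]

Lemma 2.19. For $\gamma \in \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$, the energy $E[\gamma]$ is non-negative real.

Proof. The Schwarz derivative can be expressed as
\[ \{\gamma, s\}_{\mathrm{SD}} = \partial_s^2\log(\partial_s\gamma) - \frac{1}{2}\bigl(\partial_s\log(\partial_s\gamma)\bigr)^2 . \]
By the reality condition $|\partial_s\gamma(s)| = 1$ in Definition 2.10, we let $\partial_s\gamma = \exp(\sqrt{-1}\phi)$, where $\phi$ is a real smooth function over $S^1$, $\phi(0) = \phi(2\pi)$. Hence
\[ \int_{S^1} \{\gamma, s\}_{\mathrm{SD}}\,ds = \frac{1}{2}\int_{S^1} ds\,(\partial_s\phi)^2 , \]
which is real and non-negative.

Remark 2.20 ([13]). (1) By Lemma 2.7, the integrand in the energy $E$ is invariant under the action of $\mathrm{PSL}_2(\mathbb{C})$. However, the diffeomorphisms of $S^1$, $\mathrm{Diff}^+(S^1)$, change the energy. Hence we cannot find a well-defined energy over $\mathcal{M}^{\mathbb{P}}$. Further, for $\gamma \in \pi^{\mathbb{P}}_{\mathrm{elas}}\mathcal{M}^{\infty,\mathrm{cvtr}}_{\mathrm{elas}}$, we can also consider a correspondence
\[ \int_{S^1} \{\gamma, s\}_{\mathrm{SD}}\,ds , \]
by giving up the dilatation symmetry. As we wish to neglect the problem for $\pi^{\mathbb{P}}_{\mathrm{elas}}\mathcal{M}^{\infty}_{\mathrm{elas}}$, we restrict ourselves to $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$. In other words, we consider the energy functional only for $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$.
(2) We regard the energy functional as a section of a line bundle over $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$,
\[ \begin{array}{ccc} \mathbb{R} & \longrightarrow & \mathrm{Energy}(\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}) \\ & & \downarrow \\ & & \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} \end{array} \]
(3) As mentioned in the Introduction, for $\gamma \in \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$, this energy functional $E = \int_{S^1} \{\gamma, s\}_{\mathrm{SD}}\,ds$ is identified with $\int_{S^1} (\partial_s^2\gamma/\partial_s\gamma)^2\,ds$; thus we call it the Euler–Bernoulli energy functional. The stationary points of $E$ in $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ in the sense of (1) were investigated by Euler [1–4]. Even though we do not touch the problem in this paper, we are implicitly considering the partition function of an elastica as a problem of physics [13],
\[ Z = \int_{\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}} D\gamma\,\exp\Bigl(-\beta\int_{S^1} \{\gamma, s\}_{\mathrm{SD}}\,ds\Bigr) . \]
In order to understand this partition function (which is not yet mathematically well-defined), we must investigate the moduli space of curves $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} \subset \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$, and we do so in this paper.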
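As a quick check of Lemma 2.19, the following worked computation evaluates the energy of the round unit circle. The parameterization $\gamma(s) = e^{\sqrt{-1}s}$ is an illustrative choice and is not taken from the text; it satisfies the reality condition $|\partial_s\gamma| = 1$ and has length $2\pi$.

```latex
\begin{align*}
\gamma(s) &= e^{\sqrt{-1}\,s}, \qquad
\partial_s\gamma = \sqrt{-1}\,e^{\sqrt{-1}\,s} = e^{\sqrt{-1}\,\phi(s)}, \qquad
\phi(s) = s + \tfrac{\pi}{2},\\
\{\gamma, s\}_{\mathrm{SD}}
  &= \partial_s^2\log(\partial_s\gamma) - \tfrac{1}{2}\bigl(\partial_s\log\partial_s\gamma\bigr)^2
   = 0 - \tfrac{1}{2}\bigl(\sqrt{-1}\bigr)^2 = \tfrac{1}{2},\\
E[\gamma] &= \frac{1}{2\pi}\int_{S^1}\tfrac{1}{2}\,ds = \tfrac{1}{2}
   = \frac{1}{2\pi}\cdot\frac{1}{2}\int_{S^1}(\partial_s\phi)^2\,ds .
\end{align*}
```

In this example $\phi$ increases by $2\pi$ around the loop; the computation only uses that $\partial_s\phi$ is periodic, so that the total derivative $\partial_s^2\log(\partial_s\gamma)$ integrates to zero.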
3. KdVH Flow

Our studies are based upon the discovery of Goldstein and Petrich [37, 38] on the MKdV flow for a loop in $\mathbb{C}$ and that of Langer and Perline [9] on the nonlinear Schrödinger flow for a loop in $\mathbb{R}^3$. Using their results, one of the authors studied the moduli of loops in $\mathbb{C}$ [13] and loops in $\mathbb{R}^3$ [14]. Our purpose is to give the mathematical implications of these works [13, 14] using results of Pedit [16].

In this section, we give our main Theorem 3.4 and its proof, which concern a relation between the moduli of a quantized elastica in $\mathbb{P}$ and the KdV flow. In order to express the system of the KdV equation, we introduce the differential algebra and its division algebra before the main arguments of this section.

Definition 3.1 ([24]). (1) The differential ring $\mathcal{D}_s$ is defined by
\[ \mathcal{D}_s := \Bigl\{ \sum_{k=0}^{N} a_k(s)\partial_s^k \Bigm| N < \infty,\ a_k(s) \in C^\infty(S^1, \mathbb{C}),\ s \in S^1 \Bigr\} . \]
(2) The degree of a differential operator $D \in \mathcal{D}_s$ is denoted by $\deg D$,
\[ \deg : \mathcal{D}_s \to \mathbb{Z}_{\geq 0} , \]
where $\mathbb{Z}_{\geq 0}$ is the set of non-negative integers.
(3) The micro-differential ring $\mathcal{E}_s$ to $\mathcal{D}_s$ is defined by
\[ \mathcal{E}_s := \Bigl\{ \sum_{k=-\infty}^{N} a_k(s)\partial_s^k \Bigm| N < \infty,\ a_k(s) \in C^\infty(S^1, \mathbb{C}),\ s \in S^1 \Bigr\} , \]
where $\deg : \mathcal{E}_s \to \mathbb{Z}$ and the product is defined by the extended Leibniz rule,
\[ \partial_s^n a = \sum_{r=0}^{\infty} \binom{n}{r} (\partial_s^r a)\,\partial_s^{n-r}, \qquad \binom{n}{r} := \frac{1}{r!}\,n(n-1)\cdots(n-r+1) . \]
(4) The projections $+$ and $-$ are defined by
\[ + : \mathcal{E}_s \to \mathcal{D}_s, \quad (L \mapsto L_+), \qquad - : \mathcal{E}_s \to \mathcal{E}_s\setminus\mathcal{D}_s, \quad (L \mapsto L_-,\ L = L_+ + L_-) . \]
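To illustrate the extended Leibniz rule of Definition 3.1(3), the following short computation treats the case $n = -1$, which is the case needed for the operator $\partial_s^{-1}$ appearing later in $\Omega$. The function $a$ stands for an arbitrary smooth coefficient; the example is a sketch added for illustration and is not part of the original text.

```latex
\begin{align*}
\binom{-1}{r} &= \frac{(-1)(-2)\cdots(-r)}{r!} = (-1)^r,\\
\partial_s^{-1} a &= \sum_{r=0}^{\infty} (-1)^r (\partial_s^r a)\,\partial_s^{-1-r}
  = a\,\partial_s^{-1} - (\partial_s a)\,\partial_s^{-2} + (\partial_s^2 a)\,\partial_s^{-3} - \cdots,\\
\partial_s\bigl(\partial_s^{-1} a\bigr)
  &= \bigl[(\partial_s a)\,\partial_s^{-1} + a\bigr]
   - \bigl[(\partial_s^2 a)\,\partial_s^{-2} + (\partial_s a)\,\partial_s^{-1}\bigr]
   + \bigl[(\partial_s^3 a)\,\partial_s^{-3} + (\partial_s^2 a)\,\partial_s^{-2}\bigr] - \cdots = a .
\end{align*}
```

The last line uses the ordinary Leibniz rule $\partial_s \circ f = f\,\partial_s + (\partial_s f)$ term by term; the sum telescopes, which confirms that the series is a left inverse of $\partial_s$ composed with $a$.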
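Hereafter we denote a map from $S^1$ to $\mathbb{P}$ and the corresponding elements of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ by the same $\gamma$. Noting Remark 2.17, let us define the KdV and KdVH flows, which satisfy the isometric condition as in Proposition 3.11.

Definition 3.2 (KdV flow and KdVH flow). (1) The KdV flow is defined as the immersion
\[ \gamma_\circ : \mathbb{R} \hookrightarrow \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \quad\text{and}\quad \psi_\circ : \mathbb{R} \hookrightarrow \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}, \qquad (t \mapsto (\gamma_t, \psi_t)) , \]
which satisfies the following properties:
(1-1) $\gamma_t(s) = \varpi \circ \psi_t(s)$, for each $t \in \mathbb{R}$.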
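(1-2) $u(s, t) := \{\gamma_t(s), s\}_{\mathrm{SD}}/2$ obeys the KdV equation,
\[ \partial_t u + 6u\,\partial_s u + \partial_s^3 u = 0 . \]
If for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and its corresponding point $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ there is one of the KdV flows such that $\gamma_t = \gamma$ and $\psi_t = \psi$ for some $t \in \mathbb{R}$, we say that $\gamma$ or $\psi$ belongs to the KdV flow $\gamma_\circ$ or $\psi_\circ$.
(2) Let us introduce a formal infinite dimensional parameter space,
\[ V^\infty := S^1 \times \Bigl(\prod_{n=1}^{\infty} \mathbb{R}\Bigr), \qquad t = (t_1, t_2, t_3, \ldots) \in V^\infty, \quad t_1 \in S^1 . \]
Then the KdVH flow is defined as the immersion
\[ (\gamma_\circ, \psi_\circ) \equiv \phi_{\partial_s u, t} : V^\infty \hookrightarrow \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \times \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}, \qquad (t \mapsto (\gamma_t, \psi_t)) , \]
which satisfies the following conditions:
(2-1) $\gamma_t(s) = \varpi \circ \psi_t(s)$,
(2-2) $\phi_{\partial_s u, t}$ is given by
\[ \gamma(s, t) \to \gamma(s, t + \delta t) = \exp\Bigl(\sum_{n=1} \delta t_n\,\partial_{t_n}\Bigr)\gamma(s, t) , \]
where each $t_n$ deformation obeys the $n$th KdV equation ($n \geq 1$),
\[ \partial_{t_n} u = -\Omega^{n-1}\partial_s u , \]
and $\Omega$ is the micro-differential operator
\[ \Omega = (\partial_s^2 + 2u + 2\,\partial_s u\,\partial_s^{-1}) \in \mathcal{E}_s . \]
If for a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and its corresponding point $\psi \in \mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ there is one of the KdVH flows such that $\gamma_t = \gamma$ and $\psi_t = \psi$ for some $t \in V^\infty$, we say that $\gamma$ or $\psi$ belongs to the KdVH flow $\gamma_\circ$ or $\psi_\circ$.
(3) We define a relation,
\[ \gamma \underset{\mathrm{KdVHf}}{\sim} \gamma' , \]
for two points $\gamma, \gamma' \in \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ if $\gamma$ and $\gamma'$ are on an orbit of the projection $\pi^{\mathbb{P}}_{\mathrm{elas}} \circ \phi_{\partial_s u, t}$ of the KdVH flow, i.e. all points in the fibers $(\pi^{\mathbb{P}}_{\mathrm{elas}})^{-1}\gamma$ and $(\pi^{\mathbb{P}}_{\mathrm{elas}})^{-1}\gamma'$ belong to the same KdVH flow.

For convenience, from now on let $\gamma_t$ ($\psi_t$) denote the KdV flow or KdVH flow instead of $\gamma_\circ$ ($\psi_\circ$).

3.3. (1) Though the well-definedness of the above definition is investigated later, these flows satisfy the isometric condition as in Proposition 3.11.
(2) We note that the space $V^\infty$ has an algebro-analytic structure induced from the equations,
\[ \partial_{t_{n+1}} u = \Omega\,\partial_{t_n} u . \]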
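(3) The $n = 2$ KdVH flow obeying $\partial_{t_2} u = -\Omega\,\partial_s u$ is identified with the KdV flow in (1) of Definition 3.2.

Our first main theorem is as follows:

Theorem 3.4. (1) The relation $\underset{\mathrm{KdVHf}}{\sim}$ in Definition 3.2 is an equivalence relation; for arbitrary $\gamma$ in $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ there is one of the KdVH flows to which $\gamma$ belongs, and for $\gamma \underset{\mathrm{KdVHf}}{\sim} \gamma'$ and $\gamma' \underset{\mathrm{KdVHf}}{\sim} \gamma''$, we have the relation $\gamma \underset{\mathrm{KdVHf}}{\sim} \gamma''$. By this relation, we can define an equivalence class
\[ \mathcal{C}[\gamma] := \{\gamma' \in \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}} \mid \gamma' \underset{\mathrm{KdVHf}}{\sim} \gamma\}, \qquad \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}} = \coprod_{\gamma} \mathcal{C}[\gamma] . \]
Similarly we can define
\[ \mathcal{C}_{\mathbb{C}}[\gamma] := \{\gamma' \in \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} \mid \gamma' \underset{\mathrm{KdVHf}}{\sim} \gamma\}, \qquad \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} = \coprod_{\gamma} \mathcal{C}_{\mathbb{C}}[\gamma] . \]
(2) The KdVH flow conserves the energy $E$. In other words, for the subspace of $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$,
\[ \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}, E} := \{\gamma \in \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} \mid E[\gamma] - E = 0\} , \]
and a curve $\gamma \in \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$, the following relation holds:
\[ \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}, E[\gamma]} \supset \mathcal{C}_{\mathbb{C}}[\gamma] . \]
(3) The moduli space of a quantized elastica $\mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}}$ is decomposed as
\[ \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}} = \coprod_{E} \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}, E}, \qquad \mathfrak{M}^{\mathbb{C}}_{\mathrm{elas}, E} = \coprod_{\gamma,\ E[\gamma] = E} \mathcal{C}_{\mathbb{C}}[\gamma] . \]

Noting Remark 2.16, we investigate the moduli spaces $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ by considering flows on them, and prove our theorem. Here we describe the strategy of the proof of the theorem.

3.5. We plan to investigate $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ by dealing with a group which is generated by a Lie algebra associated with $T\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. By the correspondence between $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$ in Proposition 2.8, we can identify $\gamma(s)$ with $(\psi_1, \psi_2)(s)$. We first deal with a wider class of flows $\phi_{A,t}$ in Lemma 3.6, which is characterized by a smooth function $A$ over $S^1 \times [0, 1]$. In Lemma 3.9, we find that an arbitrary flow $\phi_{A,t}$ preserves the energy of the elastica in Definition 2.18 to first order. Following the argument in Remark 3.10, we choose the special $A = \partial_s\{\gamma, s\}_{\mathrm{SD}}$, and then the flow is identified with the KdV flow in Proposition 3.11. As shown in Propositions 3.15, 3.16, and 3.17, we use the regularity properties of the KdV hierarchy to prove the theorem.

Noting Remark 2.16, we have the following lemma.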
Lemma 3.6 (Goldstein and Petrich, Pedit [37, 38, 16]). Let us consider a flow on $[0, \epsilon]$ for a real number $\epsilon > 0$:
\[ [0, \epsilon] \to \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}, \qquad (t \mapsto \gamma_t) , \]
i.e. it is realized by an isometric deformation,
\[ [\partial_s, \partial_t]\gamma_t(s) = 0 . \]
(1) Every isometric deformation $\gamma_t(s)$ locally obeys the equation of motion,
\[ \partial_t u = -\Omega A(s, t) , \]
where $u = \{\gamma, s\}_{\mathrm{SD}}/2$ and $A(s, t)$ is an appropriate smooth function of $(s, t) \in S^1 \times [0, \epsilon]$.
(2) For the function $A(s, t)$, there exists a smooth function $B(s, t)$ such that $A(s, t) = -\partial_s B(s, t)/2$, and this equation of motion is locally rewritten as
\[ \partial_t u = \frac{1}{2}\tilde\Omega B(s, t) , \]
where $\tilde\Omega := \Omega\,\partial_s$, i.e. $\tilde\Omega = (\partial_s^3 + 2u\,\partial_s + 2\,\partial_s u)$.

Proof. Using the one-to-one correspondence between $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ and $\mathcal{M}^{\mathbb{C}^2\setminus\{0\}}_{\mathrm{elas}}$, we lift the flow $\gamma_t$ to $\psi_t := \sigma\gamma_t$. In this proof, we consider representative elements of the image of the evaluation map, $\gamma_t(s)$ and $\psi_t(s)$. By the linear independence given by $\det(\psi_t(s), \partial_s\psi_t(s)) = 1$, we can express the deformation in terms of $\psi_t(s)$ and $\partial_s\psi_t(s)$;
\[ \partial_t\psi_t(s) = (A(s, t) + B(s, t)\partial_s)\psi_t(s) , \]
where $A(s, t)$ and $B(s, t)$ are smooth functions of $(s, t)$. However, from $\partial_t\det(\psi_t(s), \partial_s\psi_t(s)) = 0$, we have the constraint
\[ \partial_s B(s, t) = -2A(s, t) , \tag{3.1} \]
using $[\partial_s, \partial_t]\psi_t(s) = 0$. Noting $u(s, t) = -(\partial_s^2\psi_2(s))/\psi_2(s)$, a straightforward computation of $\partial_t u(s, t)$ gives the equation in (1). Conversely, if the equation is satisfied, we can reduce it to $[\partial_t, \partial_s]\gamma_t(s) = 0$. Similarly we obtain (2).

Let us introduce another formal infinite dimensional parameter space, $t = (t_1, t_2, t_3, \ldots) \in [0, \epsilon]^\infty$, and a formal multiple flow $\phi_{A,t}$ with infinite dimensional parameters, which is locally defined.

Definition 3.7. For $t \in [0, \epsilon]^\infty$ with a sufficiently small parameter $\epsilon$, we define an infinitesimal multiple flow,
\[ \phi_{A,t} : [0, \epsilon]^\infty \to \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}, \qquad (t \mapsto \gamma_t) , \]
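induced from the formal variation for a sufficiently small $\delta t$ ($\delta t_i < \epsilon < N\delta t_i$ for a small natural number $N$) and the image of the evaluation map $\gamma(s, t) := \gamma_t(s)$,
\[ \gamma(s, t) \mapsto \gamma(s, t + \delta t) = \exp\Bigl(\sum_{n=1} \delta t_n\,\partial_{t_n}\Bigr)\gamma(s, t) := \Bigl(1 + \sum_{n=1} \delta t_n\,\partial_{t_n}\Bigr)\gamma(s, t) + O(\delta t^2) , \]
with the local relations
\[ [\partial_s, \partial_{t_n}]\gamma(s, t) = 0 \quad (n \geq 1), \qquad \partial_{t_n} u = -\Omega^{n-1} A(s, t) \quad (n \geq 1) , \]
where $u(s, t) = \{\gamma_t(s), s\}_{\mathrm{SD}}/2$, and $A(s, t)$ and $B(s, t)$ are appropriate smooth functions over $S^1 \times [0, \epsilon]^\infty$ such that $2A = -\partial_s B$.

Remark 3.8. (1) In terms of the definition of the exponential function to the base $e$,
\[ \exp(O) = \lim_{N' \to \infty}\Bigl(1 + \frac{O}{N'}\Bigr)^{N'} , \]
the development in $\delta t_n$ generates $[0, \epsilon]^\infty$. By tuning $N'$ compatibly with $N$ in Definition 3.7, we can define the exponential action on $\gamma(s, t)$.
(2) If $\Omega^{n-1}A(s, t)$ vanishes for $n > M$ for some natural number $M$, the deformation is finite dimensional. Then the flow $\phi_{A,t}$ is well-defined for a sufficiently small $\epsilon$.
(3) In general, the above flow $\phi_{A,t}$ is a formal one and its well-definedness is not guaranteed. However, if it is well-defined, it gives an isometric deformation of a curve $\gamma(s) \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$. In fact, by the relation $\partial_{t_{n+1}} = \Omega\,\partial_{t_n}$, we have the flow
\[ \partial_{t_n}\psi_t(s) = (A_n + B_n\partial_s)\psi_t(s) , \]
where $A_2 = A = -\partial_s B/2$, $B_2 = B$, $A_1 = \Omega^{-1}A$, and
\[ A_n = \Omega A_{n-1}, \qquad \partial_s B_n = \tilde\Omega B_{n-1}, \qquad (n \geq 2) , \]
with $\tilde\Omega = \Omega\,\partial_s$. Then the above relation $\partial_{t_n} u = -\Omega^{n-1}A(s, t)$ turns out to be the standard type of flow for $A_n$ in Lemma 3.6.

Lemma 3.9. For $\gamma \in \mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$ and $A \in C^\infty(S^1 \times [0, \epsilon]^\infty, \mathbb{C})$, the infinitesimal flow $\phi_{A,t}$ preserves the energy functional modulo $(\delta t)^2$, provided that $\phi_{A,t}$ is well-defined:
\[ \frac{1}{2\pi}\int_{S^1} \{\gamma_t, s\}_{\mathrm{SD}}\,ds = \frac{1}{2\pi}\int_{S^1} \{\gamma_{t+\delta t}, s\}_{\mathrm{SD}}\,ds + O((\delta t)^2) . \]

Proof. Noting Remark 3.8 and using Eq. (3.1) in the proof of Lemma 3.6, we have the relations
\[ \partial_s B_n = -2A_n = -2\Omega A_{n-1}, \qquad \partial_s B_n = \tilde\Omega B_{n-1} . \]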
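Applying these relations to the right hand side of the lemma (with $u(s, t) := \{\gamma_t, s\}_{\mathrm{SD}}/2$), we obtain
\begin{align*}
\int_{S^1} u(s, t + \delta t)\,ds &= \int_{S^1} u(s, t)\,ds + \sum_{n=1} \delta t_n \int_{S^1} \partial_{t_n} u(s, t)\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s, t)\,ds - \sum_{n=2} \delta t_n \int_{S^1} \Omega A_n\,ds + \frac{1}{2}\,\delta t_1 \int_{S^1} \partial_s B\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s, t)\,ds + \frac{1}{2}\sum_{n} \delta t_n \int_{S^1} \partial_s B_{n+1}(s, t)\,ds + O((\delta t)^2) \\
&= \int_{S^1} u(s, t)\,ds + O((\delta t)^2) ,
\end{align*}
since the integral of a total derivative over $S^1$ vanishes. This completes the proof of the lemma.

Remark 3.10. (1) This flow $\phi_{A,t}$ could be regarded as an infinitesimal action of a diffeomorphism of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, which forms an (infinite dimensional) Lie group $G_A$ if it can be well-defined.
(2) We can regard $S^1$ as a Riemannian manifold with the metric $ds^2$. Then $\partial_s$ is the Killing vector and $\exp(\sqrt{-1}s)$ is a geodesic flow. They are a generator and an element of the group $\mathrm{Isom}(S^1) = \mathrm{U}(1)$ respectively;
\[ \mathrm{U}(1) : S^1 \to S^1, \qquad \exp(\sqrt{-1}s) \mapsto \exp(\sqrt{-1}(s + s_0)) ; \]
each $g_0 \in \mathrm{U}(1)$ gives a natural automorphism of $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.
(3) Since there is the natural projection $\pi^{\mathbb{P}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$, the $\mathrm{U}(1)$ action on $\gamma \in \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ must be trivial, $g_0\gamma = \gamma$ for $g_0 \in \mathrm{U}(1)$, and we have the relation $g_0 \circ \pi^{\mathbb{P}}_{\mathrm{elas}} = \pi^{\mathbb{P}}_{\mathrm{elas}} \circ g_0$. This implies that the immersion of the loop $S^1$ is consistent with the $\mathrm{U}(1)$ action.
(4) For a curve $\gamma(s) \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, we can locally express the $\mathrm{U}(1)$ action as
\[ (\partial_s - \partial_{s_0})\{\gamma, s\}_{\mathrm{SD}}(s, s_0) = 0, \qquad (\partial_s - \partial_{s_0})\gamma(s, s_0) = 0 . \]
These equations faithfully represent the $\mathfrak{u}(1)$ symmetry, or translation, $\gamma(s) \to \gamma(s - s_0)$ in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.
(5) By the above remarks, if it exists, $G_A$ should include $G_0 = \mathrm{U}(1)$ as a normal subgroup. Accordingly it is natural that $A$ in Definition 3.7 starts with the internal symmetry: $A = \partial_s u$ and $\partial_{t_1} u = \partial_s u$ for $u = \{\gamma, s\}_{\mathrm{SD}}$.
(6) When we consider the multiple flow generated by $\phi_{\partial_s u, t}$ ($A = \partial_s u$), we deal with the variation
\[ \gamma(s, t) \to \gamma(s, t + \delta t) = \exp\Bigl(\sum_{n} \delta t_n\,\partial_{t_n}\Bigr)\gamma(s, t) , \]
which obeys
\[ \partial_{t_n} u = -\Omega^{n-1}\partial_s u . \]
By Definition 3.2, these flows are locally identified with the KdVH flow.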
(7) By Remark 2.16, this multiple flow is locally well-defined in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$.
(8) Physically speaking, in the above arguments we are implicitly investigating the partition function of an "elastic" curve in $\mathbb{P}$. We require that the partition function naturally include classical shapes which have the above trivial translation symmetry as the Goldstone bosons or the Jacobi fields [39]. This requirement makes the group structure acting on $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ (if it exists) contain this trivial symmetry [13].

We summarize the above results as a proposition.

Proposition 3.11. (1) The multiple flow $\phi_{\partial_s u, t}$ contains a subflow $\phi_{\partial_s u, t_1}$ generated by
\[ (\partial_s - \partial_{s_0})\{\gamma, s\}_{\mathrm{SD}} = 0 . \]
The domain $t_1 \in [0, \epsilon]$ is extended to $S^1$ and is consistent with the projection $\pi^{\mathbb{P}}_{\mathrm{elas}} : \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}} \to \mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$, i.e. there exists $\varphi_{\partial_s u, t}$ such that $\pi^{\mathbb{P}}_{\mathrm{elas}} \circ \phi_{\partial_s u, t} = \varphi_{\partial_s u, t} \circ \pi^{\mathbb{P}}_{\mathrm{elas}}$.
(2) By choosing $A = \partial_s u$ for $u = \{\gamma, s\}_{\mathrm{SD}}/2$, the flow $\phi_{\partial_s u, t}$ defined in Definition 3.7 is well-defined as a flow in $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$, and its domain can be extended from $[0, \epsilon]^\infty$ to $V^\infty$.
(3) The flow $\phi_{\partial_s u, t}$ of Definition 3.7 is identified with the KdVH flow $\phi_{\partial_s u, t}$ of Definition 3.2 by extending $[0, \epsilon]^\infty$ to $V^\infty$.
(4) There exists a flow $\varphi_{\partial_s u, t}$ in $\mathfrak{M}^{\mathbb{P}}_{\mathrm{elas}}$ such that $\pi^{\mathbb{P}}_{\mathrm{elas}} \circ \phi_{\partial_s u, t} = \varphi_{\partial_s u, t} \circ \pi^{\mathbb{P}}_{\mathrm{elas}}$; we also call it the KdVH flow.
(5) For the KdVH flow, we have algebraic relations among the multi-times $t_n$, namely $\partial_{t_{n+1}} u = \Omega\,\partial_{t_n} u$.
(6) The KdVH flow preserves the decomposition in Remark 2.11.
(7) The restriction of the KdVH flow to $\mathcal{M}^{\mathbb{C}}_{\mathrm{elas}}$ preserves the energy functional exactly.

Proof. (1) is obvious from Remark 3.10. If (2) is satisfied, (3), (4) and (5) follow naturally from Remark 3.10. Since the KdVH flow consists of isometric deformations, (6) is obvious. (2) and (7) will be established by Propositions 3.15 and 3.16.

First we note that (7) should be compared with Lemma 3.9. Next we also note that in order to prove (2), we should check (i) the local well-definedness of the KdVH flow and (ii) the extension of the domain to $V^\infty$.

If the well-definedness of the KdVH flow is guaranteed, we can find the neighborhood of a point $\gamma \in \mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ by applying the KdVH flow to $\gamma$ as its initial state, because the KdVH flow consists of isometric deformations. We can consider the process in $\mathcal{M}^{\mathbb{C},2\pi}_{\mathrm{elas}}$ as mentioned in Remark 2.15.

Here we introduce some terminology of dynamical systems, apart from the notation of our main subject [40].

Definition 3.12 ([7, 40]). We consider a manifold $M$ equipped with a closed real 2-form $\omega$. We use the notation: $i_Y v$ is the interior product of a vector field $Y$ and a differential form $v$.
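(1) A vector field $Y$ is called symplectic if $i_Y\omega$ is closed.
(2) A vector field $Y$ is called a Hamiltonian vector field if there exists a function $f$ such that $i_Y\omega = df$.

Corresponding to Definition 3.12, we define quantities for the KdV flow in Definition 3.13 and give Proposition 3.15 by regarding $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$ as an (infinite dimensional) manifold [40].

Definition 3.13 ([40]). (1) For our KdVH flow, we define a 2-form $\omega$ for vectors $Y_1$ and $Y_2$ over $\mathcal{M}^{\mathbb{P}}_{\mathrm{elas}}$,
\[ \omega(Y_1, Y_2) := \frac{1}{2}\int_{S^1}\int_0^s \bigl(Y_2(s)Y_1(s') - Y_1(s)Y_2(s')\bigr)\,ds'\,ds . \]
(2) We define the quantities $X_n$ and $h_n$ and the variation $\bar\delta/\bar\delta u$ for the KdVH flow: $h_0 = u/2$, $X_0 = 0$ and
\[ X_n(u) := \Omega^{n-1}\partial_s u, \qquad X_n(u) = \partial_s\frac{\bar\delta h_n}{\bar\delta u} , \]
where
\[ \frac{\bar\delta h_n}{\bar\delta u} = \frac{\delta h_n}{\delta u} - \partial_s\frac{\delta h_n}{\delta(\partial_s u)} + \partial_s^2\frac{\delta h_n}{\delta(\partial_s^2 u)} - \partial_s^3\frac{\delta h_n}{\delta(\partial_s^3 u)} + \cdots . \]

The existence of such $h_n$ will be guaranteed in Proposition 4.18. Noting $\tilde\Omega = \Omega\,\partial_s$, from the definition we have the recursion relation
\[ \partial_s\frac{\bar\delta h_n}{\bar\delta u} = \tilde\Omega\,\frac{\bar\delta h_{n-1}}{\bar\delta u} , \]
if $h_n$ exists. In Proposition 4.18, we show the existence of a set of functionals
\[ \bar h_n = \frac{2^{2n}}{2n - 1}\,\mathrm{res}\,L^{(2n-1)/2} , \]
satisfying $\partial_s\bar h_n = \Omega\,\partial_s\bar h_{n-1}$ with $\bar h_1 = u/2$. Here $L = \partial_s^2 + u$ and "res" means the coefficient of $\partial_s^{-1}$ in the notation of Sec. 4. In other words, $\partial_s\bar h_n = \Omega^n\partial_s\bar h_1 = \Omega^n\partial_s u/2$. Further, from the definition, we have [22]
\[ \int \frac{\bar\delta\bar h_n}{\bar\delta u}\,ds \equiv 2(2n - 3)\int \bar h_{n-1}\,ds , \]
since $\int \frac{\bar\delta\bar h_n}{\bar\delta u} \equiv \int \frac{\delta\bar h_n}{\delta u}$ by periodicity and
\begin{align*}
\frac{\bar\delta}{\bar\delta u}\int \mathrm{res}(L^{r/2})\,ds
&= \int \mathrm{res}\Bigl(\sum_{i=0}^{r-1} (L^{1/2})^i\,\frac{\bar\delta L^{1/2}}{\bar\delta u}\,(L^{1/2})^{r-i-1}\Bigr)ds \\
&= r\int \mathrm{res}\Bigl(L^{(r-1)/2}\,\frac{\bar\delta L^{1/2}}{\bar\delta u}\Bigr)ds
 = \frac{r}{2}\int \mathrm{res}\Bigl(L^{(r-2)/2}\,\frac{\bar\delta L}{\bar\delta u}\Bigr)ds \\
&= \frac{r}{2}\int \mathrm{res}(L^{(r-2)/2})\,ds .
\end{align*}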
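Let $h_n \equiv 2n\,\bar h_{n-1}/(2n+1)$ modulo periodic functions and $X_n = \partial_s\bar h_n$ with $\bar h_0 = 0$. Hence Definition 3.13 is guaranteed by Proposition 4.18. Here we give the vector fields $X_n$ and quantities $h_n$ explicitly:

Example 3.14 (KdVH flow).

n = 0:  $X_0(u) = 0$,  $h_0 = \frac{1}{2}u$
n = 1:  $X_1(u) = \partial_s u$,  $h_1 = \frac{1}{2}u^2$
n = 2:  $X_2(u) = \partial_s(3u^2 + \partial_s^2 u)$,  $h_2 = u^3 + \frac{1}{2}(\partial_s u)^2$
n = 3:  $X_3(u) = \partial_s(10u^3 + 5(\partial_s u)^2 + 10u\,\partial_s^2 u + \partial_s^4 u)$,  $h_3 = \frac{5}{2}u^4 + 10u(\partial_s u)^3 + (\partial_s^2 u)^2$.

The corresponding equations are:

n = 1:  $\partial_{t_1}u + \partial_s u = 0$,
n = 2:  $\partial_{t_2}u + 6u\,\partial_s u + \partial_s^3 u = 0$,
n = 3:  $\partial_{t_3}u + 30u^2\partial_s u + 20\,\partial_s u\,\partial_s^2 u + 10u\,\partial_s^3 u + \partial_s^5 u = 0$.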
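The vector fields of Example 3.14 can be checked against the recursion $X_n = \Omega X_{n-1}$ with a few lines of code. The following is a minimal sketch, assuming that Ω has the form $\partial_s^2 + 2u + 2\partial_s u\,\partial_s^{-1}$ given for $\Omega_1$ in Proposition 4.18 (with $\partial_s u\,\partial_s^{-1}$ read as an operator composition) and that $\partial_s^{-1}$ applied to $X_n = \partial_s G_n$ returns the potential $G_n$ with zero integration constant. The representation of differential polynomials and all function names are illustrative choices, not part of the original text.

```python
from collections import defaultdict

# A differential polynomial in u is a dict {monomial: coefficient}; a monomial is a
# sorted tuple of derivative orders, e.g. (0, 0, 1) stands for u*u*u_s.

def clean(p):
    """Drop monomials whose coefficient is zero."""
    return {m: c for m, c in p.items() if c != 0}

def add(*polys):
    r = defaultdict(int)
    for p in polys:
        for m, c in p.items():
            r[m] += c
    return clean(r)

def scale(k, p):
    return clean({m: k * c for m, c in p.items()})

def mul(p, q):
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[tuple(sorted(m1 + m2))] += c1 * c2
    return clean(r)

def D(p):
    """Total derivative d/ds, computed with the Leibniz rule."""
    r = defaultdict(int)
    for m, c in p.items():
        for i in range(len(m)):
            r[tuple(sorted(m[:i] + (m[i] + 1,) + m[i + 1:]))] += c
    return clean(r)

u = {(0,): 1}

def du(k):
    """k-th derivative of u."""
    return {(k,): 1}

def omega_on_derivative(G):
    """Apply the assumed Omega = D^2 + 2u + 2 D u D^{-1} to X = D(G), taking D^{-1}X := G."""
    X = D(G)
    return add(D(D(X)), scale(2, mul(u, X)), scale(2, D(mul(u, G))))

# Potentials G_n with X_n = D(G_n), read off from Example 3.14.
G1 = u
G2 = add(scale(3, mul(u, u)), du(2))
G3 = add(scale(10, mul(u, mul(u, u))), scale(5, mul(du(1), du(1))),
         scale(10, mul(u, du(2))), du(4))

X2_from_recursion = omega_on_derivative(G1)
X3_from_recursion = omega_on_derivative(G2)
```

Under these assumptions, `X2_from_recursion` agrees with `D(G2)` and `X3_from_recursion` with `D(G3)`, which reproduces the n = 2 and n = 3 entries of the example.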
Proposition 3.15 ([40]). (1) ω is a cocycle 2-form.
(2) The KdVH flow has Hamiltonian structures with the Hamiltonians
\[ H_n := \int_{S^1} h_n\,ds \quad (n \ge 0), \]
with involutive relations for the Poisson bracket $\{H_n, H_m\} := \omega(X_n, X_m)$,
\[ \{H_n, H_m\} = 0 \quad \text{for all } n, m . \]
(3) The nth KdV flow has infinitely many conserved quantities $H_m$ (n ∈ Z≥0).
(4) We have the relation
\[ [\partial_{t_n}, \partial_{t_m}]u = 0 \quad \text{for all } n, m . \]
(5) For an arbitrary curve γ, the nth (n ≥ 1) KdV flow is uniquely determined.

Proof. We prove these following the arguments in [40]. First we show that $i_X\omega$ is exact: for all n > 0, we have the relation
\[ i_{X_n}\omega(v) = \omega(X_n(u), v) = \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,v = (dH_n)(v), \quad \text{for } n \ge 1 . \]
Hence $X_n(u)$ is a Hamiltonian vector field by Definition 3.12(2). Our system is a Hamiltonian system and the nth KdV equation is given by
\[ u_{t_n} = X_n(u) . \]
Next we will show that the KdVH flow is involutive. As the time $t_m$ development of $H_n$ is given by
\[ \partial_{t_m}H_n = \int\frac{\bar\delta h_n}{\bar\delta u}\,\partial_{t_m}u = \{H_n, H_m\}, \]
the involution relations are important. From Definition 3.12, we have the relations, for n ≥ 1,
\[ X_n = \partial_s\frac{\bar\delta h_n}{\bar\delta u} = \Omega\frac{\bar\delta h_{n-1}}{\bar\delta u} . \]
Since, in terms of ω in Definition 3.13(1), the Poisson brackets between the $H_n$'s are given by $\{H_n, H_m\} = \omega(X_n, X_m)$, we obtain the following relation for n, m > 0:
\begin{align*}
\{H_n, H_m\} &= \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,X_m(u) = \int_{S^1} ds\,\frac{\bar\delta h_n}{\bar\delta u}\,\Omega\frac{\bar\delta h_{m-1}}{\bar\delta u} \\
&= \int_{S^1} ds\,\Omega\frac{\bar\delta h_{n-1}}{\bar\delta u}\,\frac{\bar\delta h_m}{\bar\delta u} = \{H_{n+1}, H_{m-1}\} .
\end{align*}
Using this relation and noting $\{H_n, H_m\} = -\{H_m, H_n\}$, we prove the involutive relation. When both n and m are even or both n and m are odd,
\[ \{H_n, H_m\} = \{H_{(n+m)/2}, H_{(n+m)/2}\} = 0 . \]
On the other hand, when n is odd and m is even,
\[ \{H_n, H_m\} = \{H_{(n+m-1)/2}, H_{(n+m-1)/2+1}\} = \{H_{(n+m-1)/2+1}, H_{(n+m-1)/2}\} = 0 . \]
Hence the $H_n$'s are involutive and the KdVH flow has infinitely many conserved quantities. We can express the relation $\{H_n, H_m\} = 0$ by using a vector representation for n, m > 0,
\[ [X_n, X_m] = 0 . \]
In the solution of the KdV hierarchy, we can identify $\partial_{t_n}$ with $X_n$ itself: $\partial_{t_n} \equiv X_n$. Hence we obtain (4).

Further, (5) can be proved as follows. For a given curve γ, we uniquely have the data $u, \partial_s u, \partial_s^2 u, \ldots$. The KdV equations are given by
\[ \partial_t u = f(u, \partial_s u, \partial_s^2 u, \ldots) . \]
Hence for an arbitrary curve γ ∈ MPelas, the KdVH flow is uniquely determined by the KdV hierarchy. Due to the integrability, the “time” development of γ is stably determined.

Since the KdVH flow is a Hamiltonian system with infinitely many time parameters, we can find a group element g ∈ G such that $\gamma_{t+t'} = g_{t'}\gamma_t$. The multiplication is given by $g_{t'}g_t = g_{t'+t}$; $g_0$ is the unit and $g_{-t}$ is the inverse of $g_t$. Further, Proposition 3.15(4) means that $[\partial_{t_1}, \partial_{t_n}]u = 0$ and that the projection πelas : MPelas → MPelas is consistent with the KdV flow.
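The index bookkeeping in the involution argument of the proof of Proposition 3.15 can be made explicit. The sketch below is only an illustration of that step: it repeatedly applies the shift $\{H_n, H_m\} = \{H_{n+1}, H_{m-1}\}$ (or the same identity read in the other direction) until the two indices differ by at most one. The function name and return convention are hypothetical.

```python
def shift_to_core(n, m):
    """Shift indices toward each other until |n - m| <= 1.

    Each step uses {H_n, H_m} = {H_{n+1}, H_{m-1}} (or its mirror image),
    so the bracket keeps its value while n + m stays fixed.
    Returns the final index pair and the number of shifts applied.
    """
    steps = 0
    while abs(n - m) > 1:
        if n < m:
            n, m = n + 1, m - 1
        else:
            n, m = n - 1, m + 1
        steps += 1
    return n, m, steps

same_parity = shift_to_core(2, 6)      # ends on a diagonal pair (k, k)
opposite_parity = shift_to_core(3, 6)  # ends on an adjacent pair (k, k + 1)
```

A diagonal pair vanishes by antisymmetry. For an adjacent pair, one further shift exchanges the two indices, so the bracket equals its own negative and vanishes as well, which is the case distinction made in the proof.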
Further, as solving the KdV hierarchy is an initial value problem of first order in time, for an arbitrary γ ∈ MPelas we can find the KdVH flow to which γ belongs as an initial state. We give a proposition as a summary of the above arguments.

Proposition 3.16. (1) There is an Abelian group $G := \{\exp(\sum_n t_n\partial_{t_n}) \mid t_n \in V^\infty\}$ acting on the moduli spaces MPelas and MPelas, whose orbits are identified with the KdVH flow.
(2) There is a fixed normal subgroup $G_0$ of G, $G_0 = \{g_{t_1} \mid t_1 \in \mathbb{R}\} \approx U(1)$; $G_0$ acts trivially upon MPelas: $\gamma = g_{t_1}\gamma$ for γ ∈ MPelas and $g_{t_1} \in G_0$.
(3) The group $G/G_0$ acts on MPelas.

Hence Proposition 3.11(2) is proved. We can express the equivalence class in MPelas by the group action in the following proposition.

Proposition 3.17. (1) Fixing γ ∈ MPelas, the group $G/G_0$, whose elements are given as $g_{t_2,t_3,\ldots}$, acts transitively upon C[γ]: for an arbitrary γ' ∈ C[γ], we can find an element $g_{t_2,t_3,\ldots}$ of the group $G/G_0$ such that $\gamma = g_{t_2,t_3,\ldots}\gamma'$.
(2) For an arbitrary γ ∈ MPelas, there exists the KdVH flow: MPelas can be decomposed,
MPelas = ∐ C[γ] .
(3) For γ ∈ MCelas, the energy functional E[γ] is exactly conserved for the KdVH flow.
Proof. (1) and (2) are obvious from the properties of groups. (3) holds because the energy E[γ] of the loop γ given by Definition 2.18 is identified with the conserved quantity $H_0$.

Hence Proposition 3.11(7) is proved from Proposition 3.17(3). By Propositions 3.11, 3.15, 3.16 and 3.17, we have completely proved our main Theorem 3.2. Having the classification of MPelas, we use it to investigate the moduli space MPelas in the rest of this paper, because our purpose is to gain some knowledge of the moduli space MPelas. For later convenience, we introduce a quotient space. Due to Theorem 3.2 and Proposition 3.17, MPelas has natural projections induced by the equivalence relation ∼KdVHf, i.e. πKdVHf : C[γ] ↦ (γ), where (γ) is a representative element of C[γ].

Definition 3.18. (1) We define a quotient space of the moduli space by
M Pelas := πKdVHf MPelas := MPelas / ∼KdVHf .
(2) The natural projection is denoted by πelas : MPelas → MPelas.
Remark 3.19. We will comment on Proposition 3.15(4), [∂s , ∂tn ] = 0 for n > 0 .
As the KdVH flow is very regular, we can regard C[γ] × S¹ ∈ MPelas as a manifold. Accordingly, the $\partial_{t_n}$ are regarded as vector fields. We will use them as generators of a cohomology in Sec. 7.
(1) It means that the local length ds is preserved under the KdVH flow.
(2) It can be interpreted as the Frobenius integrability condition.
(3) It is known as the compatibility condition, or zero curvature condition, in soliton physics.

4. Algebro-Geometric Properties of the KdV Flow I: Algebraic Properties

As we proved Theorem 3.4, in this section we use the relation between the moduli of a quantized elastica MPelas and the KdV flow in order to give a finer classification, which is based on the study of finite type flows in MPelas and MPelas. However, as this classification comes from the algebraic investigation of the KdV flow, we should replace the base function space in the category of smooth functions by that of formal power series in order to explain this classification, though this requires some subtle treatment. This section is devoted to the investigation of a commutative differential ring, as given by Mulase in [26], by Burchnall and Chaundy about seventy years ago [41–43], and by Mumford in [44]. Our argument basically follows the arguments of Mulase for the Schottky problem [26] and of Sato [24, 25]. Following their theories, we consider a part of the moduli of a quantized elastica using formal power series. Since this part is dense in MPelas, as mentioned in Theorem 4.2, the replacement of the base function field is not so critical.

Although the investigation of γ as a real one-dimensional curve is our main subject, in this section and the next we deal with a hyperelliptic curve as a complex one-dimensional curve in the context of algebraic geometry. Thus readers should not confuse the term “curve” in the categories of differential geometry and algebraic geometry. We basically refer to a complex algebraic curve as an algebraic curve, hyperelliptic curve or elliptic curve, whereas we call a real curve just a curve. Let us start this section with the following lemma.

Lemma 4.1. If there is a natural number N such that $\partial_{t_N}u$ is an eigenvector of the operator Ω with an eigenvalue k ∈ C, i.e.
\[ k\,\partial_{t_N}u = \Omega\,\partial_{t_N}u , \]
then $\partial_{t_m}$ is a scalar multiple of $\partial_{t_N}$ for all m ≥ N. Further, by introducing $t'_n$ (n > N) and setting $\partial_{t'_n} := \partial_{t_n} - k^{n-N}\partial_{t_N}$, the relation becomes $\partial_{t'_n}u \equiv 0$.

Proof. This follows easily from Definition 3.2 and Proposition 3.11(5).
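The mechanism behind Lemma 4.1 can be spelled out in one line. The following is a sketch under the assumption that the nth flow acts on u by $\partial_{t_n}u = X_n(u) = \Omega^{n-1}\partial_s u$, as in Definition 3.13(2) and the proof of Proposition 3.15, and that Ω commutes with multiplication by the constant k:

```latex
\begin{align*}
\partial_{t_n}u
  &= \Omega^{\,n-1}\partial_s u
   = \Omega^{\,n-N}\bigl(\Omega^{\,N-1}\partial_s u\bigr)
   = \Omega^{\,n-N}\,\partial_{t_N}u
   = k^{\,n-N}\,\partial_{t_N}u , \qquad n \ge N, \\
\partial_{t'_n}u
  &= \partial_{t_n}u - k^{\,n-N}\,\partial_{t_N}u = 0 .
\end{align*}
```

In this reading, every higher time direction acts on u as a multiple of $\partial_{t_N}$, so only finitely many directions remain independent.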
Lemma 4.1 means that some orbits in the infinite dimensional vector space $V^\infty$ are essentially reduced to orbits in a finite N dimensional vector space. Let us call such a flow a finite flow or finite N-type flow. Here we give our second main theorem:

Theorem 4.2. (1) We denote the set of finite type flows by MPelas, finite and the set of finite g-type flows by MPelas g. The moduli space of the elastica has the decomposition
MPelas, finite := ∐_{g<∞} MPelas g .
(2) MPelas, finite is dense in MPelas with respect to the canonical topology determined by the KdVH flow.
In order to prove Theorem 4.2, we prepare some facts about the KdV equation. A more concrete statement appears in Proposition 4.33. Before that, we recall the result of Whitney for the quasi-analytic system [45], which is easily proved by the Weierstrass preparation theorem.

Proposition 4.3 ([45]). For a presheaf $C^\infty(\mathbb{R})$ of smooth functions over R and a presheaf $F(\mathbb{R})$ of C-valued formal power series over R, we have a surjective presheaf morphism
\[ \eta : C^\infty(\mathbb{R}) \to F(\mathbb{R}) , \]
i.e. for a germ $f \in \Gamma_p(C^\infty(\mathbb{R}))$ and $t_1 \in \mathbb{R}$ around $t_1^0$,
\[ \eta(f) = \sum_{i}^{\infty}\left.\frac{d^i}{ds^i}f\right|_{t_1 = t_1^0}(t_1 - t_1^0)^i . \]

The map η is not injective, e.g. due to the function f with $f(s) = 0$ at $s = s_0$ and $f(s) = \exp(-1/(s - s_0)^2)$ at all other points. Since η is a local correspondence, the map η can be applied to the presheaves of $C^\infty$ functions and formal power series over $S^1$, i.e. $\eta : C^\infty(S^1) \to F(S^1)$. On the other hand, for an arbitrary element of $C^\infty(S^1)$ we can find a sequence in the presheaf of formal power series $F(S^1)$ which converges to the element, using the same Weierstrass preparation theorem. By using these properties, we replace the base ring $C^\infty(S^1, \mathbb{R})$ with the formal power series in this section.

In order to express the system of the KdV equation, we describe the differential algebra and its division algebra over a commutative ring R. As we show in Definition 5.15 and Proposition 5.18, the hyperelliptic ℘ function obeys the KdV equation and has a singularity of second order. Hence we might have to deal with the Laurent expansion ring $\mathbb{C}[[t_1]][t_1^{-1}]$ as R. However, as we are concerned with those KdVH flows which are finite and real valued, we deal only with a finite “real” valued part of ℘ and avoid the singular points. In other words, we employ the formal series ring $\mathbb{C}[[t_1]]$ as the ring R.
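The non-injectivity of η noted above rests on the fact that $\exp(-1/(s - s_0)^2)$ vanishes at $s_0$ faster than any power of $s - s_0$. The following is a purely numerical illustration, not a proof; the function names, the choice $s_0 = 0$ and the sample point 0.05 are assumptions made for the sketch.

```python
import math

def flat(s, s0=0.0):
    """The function f with f(s0) = 0 and f(s) = exp(-1/(s - s0)^2) otherwise."""
    if s == s0:
        return 0.0
    return math.exp(-1.0 / (s - s0) ** 2)

def ratio(h, k):
    """f(h) / h^k near s0 = 0; small values for every k suggest a vanishing k-th Taylor coefficient."""
    return flat(h) / h ** k

# Ratios for k = 0, ..., 5 at a point close to s0 = 0.
ratios = [ratio(0.05, k) for k in range(6)]
```

All entries of `ratios` are far below machine-scale quantities of order one, while `flat(1.0)` is of order one. This is consistent with f having the same image under η as the zero function at $s_0$ although f is not identically zero.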
In this section, $t_1$ is dealt with as a generic parameter but can be regarded as a real (complex) number $t_1 \in \mathbb{R}$ (C). After considering periodicity, we regard it as a point of $S^1 = \mathbb{R}/\mathbb{Z}$ later. For convenience, let $\partial_1 := \partial/\partial t_1$. Here we assume that all algebras and subalgebras in this article have a unit by definition.

Definition 4.4 ([23–26]). (1) The differential ring $D^f$ is defined by
\[ D^f := \Bigl\{\sum_{k \ge 0}^{N} a_k\partial_1^k \Bigm| N < \infty,\ a_k \in \mathbb{C}[[t_1]]\Bigr\} . \]
(2) Let us identify commutative subalgebras $B_1$ and $B_2$ in $D^f$ if there exists an invertible element $r \in \mathbb{C}[[t_1]]^\times$ such that $B_1 = rB_2r^{-1}$. We define the standard representation $B^s$ in the equivalence class $[B_1]$ by tuning $r \in \mathbb{C}[[t_1]]^\times$ such that it contains
\[ \partial_1^n + b_{n-2}\partial_1^{n-2} + \cdots + b_0 \in B^s . \]
(3) The degree of a differential operator and the projections + and − are defined in the same way as in the case of $E^s$ in Definition 3.1.
(4) The micro-differential ring $E^f$ to $D^f$ is defined by
\[ E^f := \Bigl\{\sum_{k=-\infty}^{N} a_k\partial_1^k \Bigm| N < \infty,\ a_k \in \mathbb{C}[[t_1]]\Bigr\} . \]
(5) The constant coefficient subring of $E^f$ is defined by
\[ E^c := \Bigl\{\sum_{k=-\infty}^{N} a_k\partial_1^k \Bigm| N < \infty,\ a_k \in \mathbb{C}\Bigr\} . \]
(6) An invertible set $W^f$ in $D^f$ is defined by
\[ W^f := \Bigl\{W \in E^f \Bigm| W = 1 + \sum_{i=1}^{\infty} a_i\partial_1^{-i},\ a_k \in \mathbb{C}[[t_1]]\Bigr\}, \qquad W^c := W^f \cap E^c . \]
For the readers who are not familiar with valuation, we will review it.
Definition 4.5 ([46]). Let K be a topological field. We call a topological space E a left linear topological space if E satisfies:
(1) E is a K-linear space.
(2) The map from K × E to E, (λ, x) ↦ λx, and the addition x + y ∈ E are continuous.

Definition 4.6 ([46]). (1) Let K be a field. A valuation of K with values in Z is a map val : K → Z such that, for all x, y ∈ K, x, y ≠ 0,
\[ \mathrm{val}(xy) = \mathrm{val}(x) + \mathrm{val}(y), \qquad \mathrm{val}(x + y) \ge \min(\mathrm{val}(x), \mathrm{val}(y)), \]
and val(0) = ∞.
(2) The set R := {x ∈ K | val(x) ≥ 0} is a local subring, called the valuation ring.
(3) The set m := {x ∈ K | val(x) > 0} is called the local ideal of R.
(4) Let the metric of K be $|x| := e^{-\mathrm{val}(x)}$ for x ∈ K, which is called the non-Archimedean metric.

For example, the valuation of a commutative ring C[x] is given by its degree, i.e. for f(x) ∈ C[x], $\mathrm{val}(f) = \deg_x(f)$. For a more general commutative ring, we can find a local parameter by localization at a prime ideal, and its valuation is given by the degree in the local parameter. The valuation ring is a linear topological space due to the non-Archimedean metric [46]. Similarly, we have the following proposition [24], which is naturally obtained.

Proposition 4.7 ([24]). (1) When we define $E^f_m := \{D \in E^f \mid \deg D \le m\}$, $E^f$ has the filter
\[ E^f = \cup_m E^f_m, \qquad \{0\} = \cap_m E^f_m, \qquad E^f_m \subset E^f_{m+1} . \]
(2) $E^f$ is a linear topological space with respect to this filter.
(3) $E^f$ is an infinite dimensional algebra given by formal power series whose elements converge in the filter topology.
(4) In $E^f$, we can define valuations in C[[t]] and $E^f$: for $P = \sum_{i=-\infty}^{\infty} a_i\partial_1^i \in E^f$,
\[ \mathrm{val}(a) := \max\{m \in \mathbb{N} \mid \partial_1^m a \ne 0\}, \qquad \mathrm{val}(P) := \inf\{\mathrm{val}(a_i) - i\} . \]
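The two valuation axioms of Definition 4.6 and the non-Archimedean metric can be illustrated on power series with finitely many nonzero coefficients. The sketch below assumes the textbook order-of-vanishing valuation on C[[t]] (the index of the first nonzero coefficient), which is a standard example and is not claimed to coincide with the val of Proposition 4.7(4); all names are illustrative.

```python
import math

def val(series):
    """Order of vanishing: index of the first nonzero coefficient, infinity for the zero series."""
    for i, c in enumerate(series):
        if c != 0:
            return i
    return math.inf

def add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def mul(a, b):
    """Exact product of two nonempty coefficient lists (Cauchy convolution)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def metric(series):
    """|x| := exp(-val(x)), with |0| = 0."""
    return math.exp(-val(series))

a = [0, 0, 3, 1]   # 3 t^2 + t^3, val = 2
b = [0, 5]         # 5 t,         val = 1
c = [0, 0, -3]     # -3 t^2,      cancels the leading term of a
```

Here `val(mul(a, b))` equals `val(a) + val(b)`, and `val(add(a, c))` is strictly larger than `min(val(a), val(c))`, which shows that the inequality in the second axiom can be strict when leading terms cancel. In terms of `metric`, this is the strong triangle inequality.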
Formally, Proposition 4.7 is obvious from the definitions, but rigorous arguments are needed to justify it mathematically; these are given in [11, 25]. The differential operators appearing in soliton theory and in the following arguments converge in this topology.

Lemma 4.8 ([24–26]). (1) The adjoint map for $W \in W^f$, $\mathrm{Ad}(W) : E^f \to E^f$ ($\mathrm{Ad}(W)P = WPW^{-1}$), defines an automorphism of $E^f$. $E^f_m$ is invariant under Ad(W), i.e. $\mathrm{Ad}(W)E^f_m = E^f_m$.
(2) For an operator $\tilde L \in \mathcal{L}$, where
\[ \mathcal{L} := \Bigl\{D \in E^f \Bigm| D = \partial_s + \sum_{i=1}^{\infty} a_i\partial_s^{-i}\Bigr\}, \]
we can find a unique $W \in W^f$ modulo $W^c$ such that
\[ \mathrm{Ad}(W)\tilde L = \partial_s , \]
and this relation induces the isomorphism $W^f/W^c \approx \mathcal{L}$.
(3) For every standard commutative subalgebra $A^f \subset D^f$, there is a C-subalgebra $A^c \in E^c$ which is C-isomorphic to $A^f$ and satisfies
\[ A^c \cap E^c_- = \{0\} . \]
Proof. (1) is trivial. (2) Let us find $W = \sum_{i=0}^{\infty} w_i\partial_1^{-i}$ such that $\tilde L = W\partial_1W^{-1}$. Noting $\tilde L = \partial_1 + \tilde L_-$, the relation is reduced to $[\partial_1, W] = -\tilde L_-W$, i.e.
\[ \partial_1 w_{k-1} = -\sum_{i+j+r=k,\ i \ge 2}\binom{1-i}{r}u_i\,\partial_1^r w_j . \]
Then we can recursively determine the $w_i$ starting from small indices, since C[[t]] has indefinite integrals. When we find such $W_1$ and $W_2$, i.e. $\tilde LW_a - W_a\partial_1 = 0$, then $W_1\partial_1W_1^{-1}W_2 - W_1W_1^{-1}W_2\partial_1 = 0$, or $[\partial_1, W_1^{-1}W_2] = 0$. Hence $W_1^{-1}W_2 \in W^c$.
(3) Let us take a monic element of $A^f$ of the form
\[ L_n = \partial_1^n + b_{n-2}\partial_1^{n-2} + \cdots + b_0 . \]
Then we have $\tilde L = L_n \in \mathcal{L}$. Let $S \in W^f$ be such that $\tilde L = S\partial_1S^{-1}$. Then $A^c := S^{-1}A^fS$. For an arbitrary $P \in A^f$, $S^{-1}PS$ belongs to $E^c$, i.e. $[\partial_1, S^{-1}PS] = 0$, because $[\partial_1, S^{-1}PS] = S^{-1}[S\partial_1S^{-1}, P]S = S^{-1}[L, P]S = 0$ due to the assumption $[P, L] = 0$. Further, the inner automorphism preserves the order of the operator.

Although we will not prove it here, it is known that if we define a left $E^f$-module
\[ V^f := E^f/E^f t_1 , \]
the homomorphism from $E^f$ to the endomorphisms of $V^f$ is injective. In other words, the endomorphism is faithful if it is regarded as a representation. $V^f$ has a valuational topology val and becomes a graded module, on which the valuation and graded topologies are identified. Further, we have a natural $E^c$-module isomorphism [24],
\[ V^f \approx E^c . \]
Further, we consider an embedding of a submodule Vf0 into $V^f$ with a zero-index map for a certain index, which can be regarded as a Grassmannian manifold in a certain sense. We note that the above isomorphism is not an isomorphism of $E^f$-modules. In this article, we characterize such an embedding by a finite subset F of the natural numbers, which can be regarded as the Weierstrass gap at the infinite point of a corresponding algebraic curve. Further, we should note that the adjoint map Ad is the key to the Sato theory, and in this section we sometimes call it a gauge transformation.

Definition 4.9 ([26]). A C-subalgebra $A^c$ in $E^c$ is called a rank one subalgebra if it has a C-linear basis whose indices correspond to all integers except a finite subset F, i.e.
\[ \mathbb{N} - F = N_{A^c} := \{n \in \mathbb{N} \mid \exists\,P \in A^c \text{ such that } \mathrm{ord}(P) = n\} \]
and $A^c \cap E^c_- = \{0\}$.
As $A^c$ is a C-algebra, there is a monic element $P_n$ in $A^c$ of order n ∈ N − F, with $P_0 := 1$. Then $\{P_n \mid n \in \mathbb{N} - F\}$ forms a C-linear basis of $A^c$. In other words, an arbitrary $P \in A^c$ can be represented by a C-linear combination of the monic elements $P_n$. In fact, if the order of P is m, there exists c ∈ C such that the order of $P - cP_m \in A^c$ is less than m. Such a recursion process gives us the representation.

Lemma 4.10 ([26]). Let $A^c \ne \mathbb{C}$ be a rank one subalgebra, and let P and Q be elements of $A^c$ whose orders are coprime.
(1) $\dim_{\mathbb{C}}(A^c/\mathbb{C}[P, Q]) < +\infty$.
(2) $A^c$ is a finite C[P]-module. There is a nontrivial polynomial f(x, y) ∈ C[x, y] such that f(P, Q) = 0.
(3) The transcendence degree of $A^c$ over C is one.
(4) By regarding $A^c$ as a graded module with respect to the degree of differential operators,
\[ A^c_{(n)} := \{P \in A^c \mid \mathrm{ord}(P) \le n\}, \]
\[ A^c_n := A^c_{(n)} \oplus A^c_{(n-1)}\cdot I \oplus A^c_{(n-2)}\cdot I^2 \oplus \cdots \oplus A^c_{(0)}\cdot I^n, \qquad \mathrm{gr}\,A^c = \oplus_{n=0}^{\infty} A^c_n , \]
we regard $\mathrm{Proj}(\mathrm{gr}\,A^c)$ as an algebraic curve C. Here I is the identity of $A^c$.
(5) Let $H^1(A^c) = E^c/A^c \oplus E^c_-$. We have
\[ H^1(C, O_C^\times) = H^1(A^c) , \]
where O is the sheaf of holomorphic functions on C of (4) and $O^\times$ is a multiplicative subset of O.

Proof. Let GCD(m, n) denote the greatest common divisor of two non-negative integers m and n. Since the rank of $A^c$ is one, we have the relation
\[ 1 = \min\{\mathrm{GCD}(\mathrm{ord}(P'), \mathrm{ord}(Q')) \mid P', Q' \in A^c\} \]
and the orders of P and Q are coprime. Hence $\mathbb{C}[P, Q] \subset A^c$. As $N_{\mathbb{C}[P,Q]} = \mathbb{N}$ and C[P, Q] is a C-linear vector space, $N_{A^c} - N_{\mathbb{C}[P,Q]}$ must be a finite set. Hence $\dim_{\mathbb{C}}(A^c/\mathbb{C}[P, Q])$ must be finite. On the other hand, since $\mathbb{N} - \{\mathrm{ord}\{P^m, Q^n\} \mid m, n \in \mathbb{Z}_{\ge 0}\}$ must be a finite set, P and Q satisfy an algebraic relation f(P, Q) = 0. Further, the proofs of (4) and (5) are due to the theory of ordinary commutative rings [46].

We note that F in Definition 4.9 is related to the Weierstrass gap at the infinity point of the algebraic curve C. From this point on, we concentrate our attention only on the operator $L = \partial_1^2 + u$, which is related to the KdV equation:
\[ \mathcal{L}_2 := \{D \in E^f \mid D = \partial_1^2 + u,\ u \in \mathbb{C}[[t_1]]\} . \]
We give its related operators as examples.
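Example 4.11.
\begin{align*}
L^{1/2} &= \partial_1 + \frac{1}{2}u\,\partial_1^{-1} - \frac{1}{4}(\partial_1 u)\partial_1^{-2} + \frac{1}{8}\bigl((\partial_1^2 u) - u^2\bigr)\partial_1^{-3} + \frac{1}{16}\bigl(6u(\partial_1 u) - \partial_1^3 u\bigr)\partial_1^{-4} \\
&\quad - \frac{1}{32}\bigl(-2u^3 + 14u(\partial_1^2 u) + 11(\partial_1 u)^2 - (\partial_1^3 u)\bigr)\partial_1^{-5} + \cdots , \\
4L^{3/2} &= 4\partial_1^3 + 3\partial_1 u + 3u\partial_1 + \Bigl(\frac{1}{2}\partial_1^2 u + \frac{3}{2}u^2\Bigr)\partial_1^{-1} + \cdots , \\
16L^{5/2} &= 16\partial_1^5 + 40u\,\partial_1^3 + 60(\partial_1 u)\partial_1^2 + 50(\partial_1^2 u)\partial_1 + 30u^2\partial_1 + 15(\partial_1^3 u) + 30u(\partial_1 u) \\
&\quad + 5\Bigl(u^3 + \frac{1}{2}(\partial_1 u)^2 + \partial_1 f(u, \partial_1 u, \ldots)\Bigr)\partial_1^{-1} + \cdots .
\end{align*}
Here $\partial_1 f(u, \partial_1 u, \ldots)$ is a functional of $u, \partial_1 u, \ldots$.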
Let us fix the operator $P = \partial_1^2$ of $A^c$ in Lemma 4.10, because we only consider $L = W\partial_1^2W^{-1}$. From elementary number theory, for an odd number m and an integer n (> m), we find $a, b \in \mathbb{Z}_{\ge 0}$ such that
\[ n = am + 2b \quad (a, b \in \mathbb{Z}_{\ge 0}) . \tag{4.1} \]
When we fix $A^c$ as a rank one subalgebra, the partner Q of $P \equiv \partial_1^2$ in Lemma 4.10 is an operator whose order is given by an odd number 2g + 1. Thus F in Definition 4.9 is given by a sequence of smaller odd numbers, {1, 3, 5, 7, 9, . . . , 2g − 1}. Let us introduce a set of such subrings $A^c$ in $E^c$.

Definition 4.12.
\[ \mathcal{A}^c := \{A^c \mid A^c \text{ is a rank one subalgebra},\ \exists\,W \in W^f \text{ such that } WA^cW^{-1} \in D^f \text{ is a commutative subalgebra},\ \mathbb{N} - N_{A^c} \subset \{1, 3, \ldots, 2g-1\},\ g < \infty\} . \]
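The gap set {1, 3, . . . , 2g − 1} in Definition 4.12 is what the decomposition (4.1) produces when the orders 2 and 2g + 1 generate all attainable orders. The following sketch enumerates the numbers of the form a(2g + 1) + 2b up to a bound and lists the ones that are missed; the function names and the bound 4g + 2 are choices made for the illustration.

```python
def attainable_orders(g, limit):
    """All n <= limit of the form n = a*m + 2*b with m = 2g + 1 and a, b >= 0, as in Eq. (4.1)."""
    m = 2 * g + 1
    return {a * m + 2 * b
            for a in range(limit // m + 1)
            for b in range(limit // 2 + 1)
            if a * m + 2 * b <= limit}

def gaps(g):
    """Non-negative integers up to 4g + 2 that are not attainable."""
    limit = 4 * g + 2
    return sorted(set(range(limit + 1)) - attainable_orders(g, limit))

gaps_genus_3 = gaps(3)
```

For g = 3 the missing orders are 1, 3 and 5, and for g = 0 nothing is missing, in line with the remark that the case g = 0 gives an ordinary polynomial ring.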
Similarly, for the case g = 0, $A^c \equiv W^{-1}\mathbb{C}[L, [L^{1/2}]_+]W$. Since $[L^{1/2}]_+ \equiv \partial_1$, $[L, \partial_1] = 0$ and thus u must be a constant in C. In other words, Q must be $\partial_1$ and $\mathbb{C}[\partial_1^2, \partial_1] \equiv \mathbb{C}[\partial_1]$. For the case g = 0, it becomes an ordinary polynomial ring.

4.13. We recall that an algebraic curve with a morphism to P of order two is called a hyperelliptic curve. A hyperelliptic curve $C_g$ of genus g (g ≥ 1), including the case of an elliptic curve, is given by the homogeneous equation
\[ Y^2Z^{2g-1} = h_g(X, W) := \lambda_0Z^{2g+1} + \lambda_1XZ^{2g} + \lambda_2X^2Z^{2g-1} + \cdots + \lambda_{2g+1}X^{2g+1} , \]
where $\lambda_{2g+1} \equiv 1$ and the $\lambda_j$ are complex values.

Lemma 4.14. Let $L = \partial_1^2 + u \equiv W\partial_1^2W^{-1} \in \mathcal{L}_2$ for $W \in W^f$.
(1) $L^{n/2} = W\partial_1^nW^{-1}$.
(2) $[L^{2n}_+, L] \equiv [L^{2n}, L] \equiv 0$.
(3) The set of differential operators in $D^f$ which commute with the given operator L is itself a commutative subalgebra of $D^f$.
(4) $A^c \in \mathcal{A}^c$ is a $\mathbb{C}[\partial_1^2]$-module and, by considering
\[ \mathcal{A}^c = \sum A^c , \]
$\mathcal{A}^c$ is also a $\mathbb{C}[\partial_1^2]$-module.
(5) For an arbitrary $A^c \in \mathcal{A}^c$, we can find $Q_g \in A^c$ which satisfies an affine equation,
\[ Q_g^2 = h_g(\partial_1^2, 1) , \]
so that there is a $W \in W^f/W^c$ such that $W\partial_1^2W^{-1} = L$ and $WQW^{-1}$ are commutative in $D^f$. Further, we have found a hyperelliptic curve $C = \mathrm{Proj}(\mathrm{gr}\,A^c)$ and
\[ H^1(C, O_C) = H^1(A^c) , \]
which are generated by $\langle\partial_1, \partial_1^3, \ldots, \partial_1^{2g-1}\rangle$.

Proof. (1) and (2) can be shown by direct computation. For (3), we consider a commutative differential ring in $D^f$ such that $B := \{P \in D^f \mid [P, L] = 0\}$. Since $L^{1/2} = W\partial_1W^{-1}$, $[\partial_1, W^{-1}PW] = W^{-1}[L^{1/2}, P]W = 0$ by assumption. Hence $W^{-1}PW$ is an element of $E^c$, and thus we can find $A^c \in \mathcal{A}^c$ such that $B = W^{-1}A^cW$. Hence B is a commutative ring. Next, (4) is trivial. From the definition, Lemma 4.10 and Eq. (4.1), we reach (5).

Next we consider a filter structure in $\mathcal{A}^c$ and its completion with respect to the filtration.

Proposition 4.15. Let us define a filter,
\[ F_g\mathcal{A}^c := \{A^c \in \mathcal{A}^c \mid \mathbb{N} - N_{A^c} \subset \{1, 3, \ldots, 2g-1\}\} . \]
This satisfies the following relations:
(1) $F_g\mathcal{A}^c \subset F_{g+1}\mathcal{A}^c$, $F_n\mathcal{A}^c \equiv 0$ for n < 0.
(2) By letting $\mathcal{A}^c_g := F_g\mathcal{A}^c/F_{g-1}\mathcal{A}^c$, there is a large gauge transformation between $A^c_1, A^c_2 \in \mathcal{A}^c_g$, i.e. there exists $W \in W^f$ such that $A^c_1 = WA^c_2W^{-1}$.
(3) The direct limit of the filtration gives
\[ \mathcal{A}^c := \varinjlim F_g\mathcal{A}^c = \{A^c \in E^c \mid \exists\,W \in W^f \text{ such that } WA^cW^{-1} \in D^f \text{ is a subalgebra},\ \mathbb{N} - N_{A^c} \subset 2\mathbb{N} - 1\} . \]

Proof. (1) and (3) are obvious. (2) is due to the proof of Proposition 4.16(2).

For each element $A^c \in \mathcal{A}^c_g$, we consider the correspondence in Lemma 4.10(4), i.e. $\mathrm{Proj}(\mathrm{gr}\,A^c_g)$. It turns out that $\mathcal{A}^c_g$ is isomorphic to the set of hyperelliptic curves of genus g.
Proposition 4.16. (1) The set $A^f$ of commutative subrings in $D^f$ inherits the above filtration of $\mathcal{A}^c$.
(2) For any elements $L_1$ and $L_2$ in $\mathcal{L}_2$, there is a gauge transformation $W \in W^t/W^c$ such that $L_1 = WL_2W^{-1}$.

Proof. (1) is trivial. (2) There is an element $W_a \in W^t/W^c$ such that $L_a = W_a\partial_1^2W_a^{-1}$ for a = 1, 2. Hence $L_2 = W_2W_1^{-1}L_1W_1W_2^{-1}$.

Having described the tools and their properties for the differential ring over $\mathbb{C}[[t_1]]$, we will extend its base ring to $\mathbb{C}[[t_1, t_2, \ldots]]$. However, before giving the extension in Definition 4.19, we digress and show a connection between Ω in Definition 3.2 and L in $\mathcal{L}_2$. Following the arguments in [22], we first prepare a lemma.

Lemma 4.17 ([22]). The “resolvent” operator for $L = \partial_1^2 + u$,
\[ T^{(\pm)} := \frac{1}{2z^2}\Bigl[\sum_{r=-\infty}^{\infty}(\pm z)^rL^{-r/2}\Bigr]_- , \]
has the following properties:
(1) $(T^{(+)} + T^{(-)}) = (L - z^2)^{-1}$.
(2) $[T^{(\pm)}(L - z^2)]_- = [(L - z^2)T^{(\pm)}]_- = 0$.
(3) When we define a map for $X \in E^s$, called the Adler map [22],
\[ h(X) := [(L - z^2)X]_+(L - z^2) - (L - z^2)[X(L - z^2)]_+ , \]
we have the relation $h(T^{(\pm)}) = 0$.
(4) $T^{(\pm)}$ has a formal expansion,
\[ T^{(\pm)} = \sum_{r=1}^{\infty}S_r^{(\pm)}\partial_1^{-r} . \]

Proof. (1) is trivial. (2) is given by the relation
\begin{align*}
[(L - z^2)T^{(\pm)}]_- &= \frac{1}{2}\Bigl[(L - z^2)\sum_{r=-\infty}^{\infty}(\pm z)^rL^{-r/2}\Bigr]_- \\
&= \frac{1}{2}\Bigl[\sum_{r=-\infty}^{\infty}\bigl((\pm z)^rL - z^2(\pm z)^r\bigr)L^{-r/2}\Bigr]_- .
\end{align*}
It is clear that it vanishes. (3) is proved due to the property of the Adler map,
\[ h(X) \equiv -[(L - z^2)X]_-(L - z^2) + (L - z^2)[X(L - z^2)]_- . \]
(4) is obvious from the definition of the resolvent.

By means of this lemma, we give the connection.

Proposition 4.18 ([22]). (1) $[2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = \Omega_1^n\partial_1u$, where $\Omega_1 := \partial_1^2 + 2u + 2\partial_1u\partial_1^{-1}$.
(2) By letting $\bar h_n = \operatorname{res}\frac{2^{2n}}{2n+1}L^{(2n-1)/2}$ (“res” means the coefficient of $\partial_1^{-1}$), we have
\[ \partial_1\bar h_n = \Omega_1\partial_1\bar h_{n-1} . \]

Proof. Due to the condition $h(T^{(\pm)}) = 0$, we can determine the first two coefficients $S_1$ and $S_2$ as
\[ \partial_s^3S_1^{(\pm)} + 2(\partial_su)S_1^{(\pm)} + 4(u + z^2)\partial_sS_1^{(\pm)} = 0, \qquad \partial_s^2S_2^{(\pm)} = -\frac{1}{2}\partial_sS_1^{(\pm)} . \]
Let us consider the following operator,
\[ (T^{(+)} - T^{(-)}) = \Bigl[\sum_{r=-\infty}^{\infty}z^{2r+1}L^{-(2r+1)/2}\Bigr]_- . \]
The left hand side in the relation
\[ [L^{(2r+1)/2}]_+L - L[L^{(2r+1)/2}]_+ = L[L^{(2r+1)/2}]_- - [L^{(2r+1)/2}]_-L \]
appears as a coefficient of $z^{2r-1}$ in the series $h(T^{(+)} - T^{(-)})$ with respect to z. Thus we are concerned with $S_r := S_r^{(+)} - S_r^{(-)}$, which must have the expansion
\[ S_r = \sum_{i=-\infty}^{\infty}S_r^{(i)}z^{2i+1} . \]
Comparing the coefficients of $z^{2r-1}$, we obtain
\[ 4\partial_1S_1^{(i+1)} = (\partial_s^3 + 2(\partial_su) + 4u\partial_1)S_1^{(i)} = \Omega_1\partial_1S_1^{(i)} . \]
We have the relation
\[ [[L^{(2r+1)/2}]_+, L] = \frac{1}{4}\partial_1S_1^{2r+1} = \frac{1}{4^r}\Omega_1^r\partial_1S_1^{(1)} , \]
with $S_1^1 = -u/2$. Then we identify $h_n$ with $S_n$ by tuning its coefficient.

Having finished the digression, we extend $\mathbb{C}[[t_1]]$ to $\mathbb{C}[[t_1, t_2, \ldots]]$. In the extension of the valuation over $\mathbb{C}[[t_1]]$ to that of $\mathbb{C}[[t_1, t_2, \ldots]]$, let the degree of $t_i^n$ be (2i − 1)n.
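Definition 4.19 ([24, 26]). (1) The differential ring $D^t$, and its related set and ring are defined by
\[ D^t := \Bigl\{\sum_{k \ge 0}^{N} a_k\partial_1^k \Bigm| N < \infty,\ a_k \in \mathbb{C}[[t_1, t_2, \ldots]]\Bigr\}, \]
\[ E^t := \Bigl\{\sum_{k=-\infty}^{N} a_k\partial_1^k \Bigm| N < \infty,\ a_k \in \mathbb{C}[[t_1, t_2, \ldots]]\Bigr\}, \qquad E^t = D^t + E^t_- , \]
\[ \mathcal{L}^t_2 := \{D \in E^t \mid D = \partial_1^2 + u,\ u \in \mathbb{C}[[t_1, t_2, t_3, \ldots]]\} . \]
(2) By letting $\mathrm{val}(t_i^n) := (2i - 1)n$, we extend the valuation to $D^t$ and $E^t$; these are also called valuations of $D^t$ and $E^t$.
(3) $W^t := \{W \in E^t \mid W = 1 + \sum_{i=1}^{\infty} w_i\partial_1^{-i}\}$.
(4) $\hat D^t := \{P = \sum_{i=0}^{\infty} a_i\partial_1^i \in D^t \mid \exists\,N \in \mathbb{N},\ \mathrm{val}(a_i) > i - N \text{ for } \forall\,i \gg 0\}$,
$\hat E^t := \{P = \sum_{i=-\infty}^{\infty} a_i\partial_1^i \in E^t \mid \exists\,N \in \mathbb{N},\ \mathrm{val}(a_i) > i - N \text{ for } \forall\,i \gg 0\}$.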
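The weighting $\mathrm{val}(t_i^n) := (2i - 1)n$ in Definition 4.19(2) can be read as a weighted degree on monomials in the time variables. The following is a small sketch; the encoding of a monomial as a dictionary from the index i to the exponent n, and the additive extension to products, are assumptions made for the illustration.

```python
def weighted_degree(monomial):
    """Weighted degree of prod_i t_i^{n_i}, with t_i carrying weight 2i - 1.

    `monomial` maps the index i (starting at 1) to the exponent n_i.
    """
    return sum((2 * i - 1) * n for i, n in monomial.items())

cube_of_t1 = weighted_degree({1: 3})       # t_1^3
single_t2 = weighted_degree({2: 1})        # t_2
mixed = weighted_degree({1: 2, 3: 1})      # t_1^2 t_3
```

With this weighting, $t_1^3$ and $t_2$ have the same degree. This matches the way $t_i$ is paired with $\partial_1^{2i-1}$ in the exponent $t_1\partial_1 + t_2\partial_1^3 + t_3\partial_1^5 + \cdots$ appearing in Proposition 4.23.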
We note that $D^t$, $E^t$ and so on have natural embeddings of $D^f$, $E^f$ and so on, e.g. $D^f \ni P(t_1) \mapsto P(t_1, 0, 0, \ldots) \in D^t$. Using the embeddings, we regard $D^f$ as a subring of $D^t$ as follows.

Definition 4.20. (1) The moduli space of the KdV equations is defined by
MKdV := {u ∈ C[[t1, t2, . . .]] | $\partial_{t_n}u - \Omega_1^{n-1}\partial_1u = 0$ for ∀ n} ,  MKdV := MKdV /(t1) ,
where $\Omega_1 := \partial_1^2 + 2u + 2\partial_1u\partial_1^{-1}$.
(2) If $\tilde L \in \mathcal{L}^t := \{D \in E^t \mid D = \partial_s + \sum_{i=1}^{\infty} a_i\partial_s^{-i}\}$ and $P \in D^t$ satisfy $[P, \tilde L] \in E^t_-$, the equation $[\partial_y - P, \tilde L] = 0$ is called the Lax equation and $(P, \tilde L)$ a Lax pair. Here y is an element of the vector space generated by $t_1, t_2, \ldots$.

Proposition 4.21 ([26]). Let $L := \partial_1^2 - u \in \mathcal{L}^t_2$.
(1) $[\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$ is the Lax equation.
(2) For an arbitrary $P \in D^t$ of the Lax pair (P, L), P can be expressed by
\[ P = \sum_{j=1}^{n} c_j[L^{j/2}]_+ , \]
where $c_j \in \mathbb{C}[[t_2, t_3, \ldots]]$.
(3) u satisfies $[\partial_y - P, L^{1/2}] = 0$ if and only if $[\partial_y - P, L] = 0$.
(4) The equation $[\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$ gives the nth KdV equation $\partial_{t_n}u - \Omega_1^{n-1}\partial_1u = 0$, and thus we have a bijection
MKdV ≈ {L ∈ $\mathcal{L}^t_2$ | $[\partial_{t_n} - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$, n > 1} .
Here ≈ is given by the correspondence between u and $L = \partial_1^2 + u$.
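The first two Lax equations can be worked out by hand. The following is a sketch, assuming the sign convention $L = \partial_1^2 + u$ and the positive parts $[L^{1/2}]_+ = \partial_1$ and $4[L^{3/2}]_+ = 4\partial_1^3 + 3\partial_1u + 3u\partial_1 = 4\partial_1^3 + 6u\partial_1 + 3u'$ read off from Example 4.11, where products such as $\partial_1u$ are operator compositions and $u' = (\partial_1u)$:

```latex
\begin{align*}
n = 1:\quad & [\partial_1, \partial_1^2 + u] = u' ,
  \qquad\text{so } [\partial_{t_1} - \partial_1, L] = 0 \text{ reads } \partial_{t_1}u = \partial_1 u . \\[4pt]
n = 2:\quad & [4\partial_1^3, u] = 4u''' + 12u''\partial_1 + 12u'\partial_1^2 , \\
 & [6u\partial_1, \partial_1^2 + u] = -6u''\partial_1 - 12u'\partial_1^2 + 6uu' , \\
 & [3u', \partial_1^2 + u] = -3u''' - 6u''\partial_1 , \\
 & \Rightarrow\ [4\partial_1^3 + 6u\partial_1 + 3u', L] = u''' + 6uu'
   = (\partial_1^2 + 2u + 2\partial_1u\partial_1^{-1})\,\partial_1 u = \Omega_1\partial_1 u .
\end{align*}
```

All terms of order one and two cancel, so the commutator is a multiplication operator, and the n = 2 Lax equation reduces to $\partial_{t_2}u - \Omega_1\partial_1u = 0$, in agreement with Proposition 4.21(4).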
Proof. First we consider (3). Let $L = W\partial_1^2W^{-1}$. $[\partial_y - P, W\partial_1W^{-1}] = 0$ gives $[W(\partial_y - P)W^{-1}, \partial_1] = 0$, and then we obtain $[W(\partial_y - P)W^{-1}, \partial_1^2] = 0$ and $[\partial_y - P, L] = 0$. For an operator $Q \in A^c$, $[Q, \partial_1^2] = 0$ means $(\partial_1^2Q) + 2(\partial_1Q)\partial_1 = 0$, i.e. $(\partial_1^2Q) = 0$ and $(\partial_1Q) = 0$. Hence $[W(\partial_y - P)W^{-1}, \partial_1^2] = 0$ also means $[\partial_y - P, L^{1/2}] = 0$. (1) It is known that $[[L^{j/2}]_+, L^{1/2}] \in E^t_-$. Due to (3), (1) is proved. Next we consider (2). $[L^{j/2}]_+$ is a monic operator. Hence if the order of P is n, there exists c ∈ C such that the order of $P - c[L^{n/2}]_+ \in D^t$ is n − 1. By induction, we have the result in (2). Proposition 4.18(1) leads us to (4).

Here we translate the relations into geometrical language. Due to Proposition 4.21(4), we also denote the right hand side there by MKdV.

Lemma 4.22 ([24–26]). Let $L := \partial_1^2 - u = W^{-1}\partial_1^2W \in \mathcal{L}^t_2$,
\begin{align*}
dL &:= \partial_1L\,dt_1 + \partial_2L\,dt_2 + \partial_3L\,dt_3 + \cdots , \\
dW &:= \partial_1W\,dt_1 + \partial_2W\,dt_2 + \partial_3W\,dt_3 + \cdots , \\
dZ &:= 2L^{1/2}dt_1 + 4L^{3/2}dt_2 + 8L^{5/2}dt_3 + \cdots , \\
dZ_+ &:= 2[L^{1/2}]_+dt_1 + 4[L^{3/2}]_+dt_2 + 8[L^{5/2}]_+dt_3 + \cdots , \qquad Z = Z_+ + Z_- .
\end{align*}
(1) The Lax equation becomes
\[ dL = [Z_+, L], \qquad dL = -[Z_-, L] . \]
(2) $dZ_+ = \frac{1}{2}[Z_+, Z_+]$, $dZ_- = -\frac{1}{2}[Z_-, Z_-]$.
(3) $dL = [dW \cdot W^{-1}, L]$.
(4) $W^{-1}dW - Z_+ \in D^cdt_1 + D^cdt_2 + D^cdt_3 + \cdots$ or, by using the gauge freedom,
\[ dW = Z_+W, \qquad dW = -Z_-W . \]

Proof. (1) is trivial. (2) Noting $d^2L \equiv 0$, $[L, dZ_+ - Z_+Z_+] \equiv 0$, and then we obtain (2). (3) From $d(WW^{-1}) \equiv 0$, $dW^{-1} = -W^{-1}dW\,W^{-1}$. Hence $dL = d(W\partial_1^2W^{-1})$ becomes the right hand side. (4) Using (2) and (3), $[dW\,W^{-1} - Z_+, L] = 0$, and we obtain $[W^{-1}dW - W^{-1}Z_+W, \partial_1] = 0$. It implies (4).

Here we note that the conditions $dZ_+ = [Z_+, Z_+]$ and so on are the Frobenius integrability conditions. Due to these conditions, the orbit as a dynamical system can be uniquely determined. In conclusion, we have the following proposition on the orbit of the KdV equations. As its proof is somewhat complicated, we give only the result.

Proposition 4.23 ([26]). For $L(0) = S(0)\partial^2S(0)^{-1}$,
\[ U(t) = \exp(t_1\partial_1 + t_2\partial_1^3 + t_3\partial_1^5 + \cdots)S(0)^{-1} \in \hat E^t , \]
$U(t) = S(t)^{-1}Y$ for S(t) ∈ G and $Y \in \hat D^t$, we have the time development
\[ L(t) = S(t)\partial_1^2S(t)^{-1} . \]
Definition 4.24 ([26]). (1) For $L \in \mathcal{L}^t_2$, if the map
\[ T_0(\mathrm{R}t,n) \ni \frac{\partial}{\partial y} \mapsto \left.\frac{\partial L^{1/2}}{\partial y}\right|_{y=0} \in E^t_- \]
is injective, we say that Rt,n is effective. Here we write Rt,n for the orbit space generated by $t_1, t_2, \ldots, t_n$ and $T_0(\mathrm{R}t,n)$ for its tangent space at the origin 0 ∈ Rt,n.
(2) If, for $L \in \mathcal{L}^t_2$, Rt,n is not effective for n > g but is effective for n ≤ g, we say that $L = \partial_1^2 + u$, or u, is a finite g type solution of the KdV equation.

Lemma 4.25 ([23, 24, 26]). (1) If there is a natural number N such that $\partial_{t_N}u$ is an eigenvector of the operator Ω with an eigenvalue k ∈ C, i.e.
\[ k\,\partial_{t_N}u = \Omega\,\partial_{t_N}u , \]
then $\partial_{t_m}$ is a scalar multiple of $\partial_{t_N}$ for m ≥ N. If not, we say that $t_m$ is effective.
(2) For the finite g solution of L and for n > g, we have the commutation relation
\[ [2^{2(g+1)}[L^{(2g+1)/2}]_+, L] \equiv 0 , \]
by constructing $t_n$ in terms of a linear combination in $\mathbb{C}\langle t_1, t_2, \ldots, t_g\rangle$.
(3) Let $L \in \mathcal{L}^t_2$ be such that $[2^{2(g+1)}[L^{(2g+1)/2}]_+, L] \equiv 0$ and
\[ [\partial_{t_j} - 2^{2(j-1)}[L^{(2j-1)/2}]_+, L] = 0 \quad \text{for } j < g \]
is effective. Then we have a commutative subring $A^t := \mathbb{C}[L, [L^{(2g+1)/2}]_+] \subset D^t$ such that, for $W \in W^t$, $A^c = W^{-1}A^tW \in \mathcal{A}^c$, and an isomorphism of C-vector spaces,
\[ H^1(A^c) \approx \mathbb{C}\langle dt_1, dt_2, \ldots, dt_g\rangle . \]

Proof. (1) is essentially the same as Lemma 4.1 and (2) is obvious from (1). So we concentrate our attention on (3). The integrability conditions reduce the conditions in $D^t$ to those in $D^f$ as an initial state. Due to Lemma 4.14(5), (3) is proved.

Definition 4.26. (1) The filter with respect to the effective differential equations is defined by
Fg MKdV := {L ∈ $\mathcal{L}^t_2$ | $[\partial_n - 2^{2(n-1)}[L^{(2n-1)/2}]_+, L] = 0$ is not effective for n > g} = {L ∈ $\mathcal{L}^t_2$ | $[[L^{(2g-1)/2}]_+, L] \equiv 0$}
and
Fg MKdV ⊂ Fg+1 MKdV ,  Fn MKdV = ∅ for n < 0 .
(2) A set of finite g type solutions of the KdV equation is denoted by
MKdVg := Fg MKdV \ Fg−1 MKdV .
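Due to Lemma 4.25(3), $F_g M_{\mathrm{KdV}}$ corresponds to $F_g \mathcal{A}_c$, and the correspondence becomes a bijection by considering their appropriate quotient spaces. As the system of the KdV equations is a dynamical system, there is a g-dimensional orbit in each solution space in $M_{\mathrm{KdV}\, g}$ when its periodicity is neglected. We can regard it as a fiber bundle,
\[ \begin{array}{ccc} \text{orbit} & \longrightarrow & M_{\mathrm{KdV}\, g} \\ & & \big\downarrow {\scriptstyle \pi^g_{\mathrm{KdV}}} \\ & & M_{\mathrm{KdV}\, g}. \end{array} \]
For each orbit space $(\pi^g_{\mathrm{KdV}})^{-1}(p)$ at a point $p$ in $M_{\mathrm{KdV}\, g}$, there is a commutative ring $A_c \in \mathcal{A}_c^g$ such that
\[ T^* (\pi^g_{\mathrm{KdV}})^{-1}(p) \approx H^1(A_c). \]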
For later convenience, we also define a space $M_{\mathrm{KdV\, finite}} := \coprod_g M_{\mathrm{KdV}\, g}$.

Next we consider $M_{\mathrm{KdV}}$ itself. $(F_g, \mathcal{A}_c)$ has a direct limit due to Proposition 4.15. Let us consider the set of subrings in $D_t$,
\[ B := \{L \in \mathcal{L}_t^2 \mid \exists\, W \in W_t,\ \exists\, A_t \in D_t \text{ and } \exists\, A_c \in \mathcal{A}_c \text{ such that } A_t = W A_c W^{-1} \text{ and } L = W \partial_1^2 W^{-1}\}. \]
Since solving the KdV equations is an initial value problem, for an arbitrary initial state $u \in \mathbb{C}[[t_1]]$ we can find the time development obeying the KdV equations. Thus we have $B \subset M_{\mathrm{KdV}}$. On the other hand, from the definition, we can find $\mathbb{C}[\partial_1^2] \in \mathcal{A}_c$, which gives $N_{\mathbb{C}[\partial_1^2]} = 2\mathbb{N}$. Further, for an arbitrary $L \in M_{\mathrm{KdV}}$, there is a gauge transformation $W \in W_t$ such that $W^{-1} L W = \partial_1^2$ due to Lemma 4.8(2). Hence $B \supset M_{\mathrm{KdV}}$ and then $B \equiv M_{\mathrm{KdV}}$. Such a consideration is justified by the direct limit and graded topology of $D_t$ or $E_c$. Thus $M_{\mathrm{KdV}}$ naturally has the topology induced from the linear topology of the micro-differential operators in Proposition 4.7 and the filter of $\mathbb{C}[\partial_s^2]$-modules in Proposition 4.15, even though $M_{\mathrm{KdV}}$ itself is not a vector space.

Proposition 4.27. (1) $M_{\mathrm{KdV}}$ is a filter space. (2) The set of finite g type solutions of the KdV equation is denoted by $M_{\mathrm{KdV}\, g} := F_g M_{\mathrm{KdV}} \setminus F_{g-1} M_{\mathrm{KdV}}$ and the set of finite type solutions of the KdV equation is denoted by $M_{\mathrm{KdV\, finite}}$. Then we have the decomposition
\[ M_{\mathrm{KdV\, finite}} = \coprod_{g < \infty} M_{\mathrm{KdV}\, g}. \]
(3) In the sense of Proposition 4.15(3), it converges:
\[ M_{\mathrm{KdV}} = \bigcup_{g=1}^{\infty} F_g M_{\mathrm{KdV}}. \]
For a point of $M_{\mathrm{KdV}\, g}$, there are non-trivial (effective) differential equations
\[ [\partial_n - 2^{2(n-1)} L_+^{(2n-1)/2}, L] = 0, \quad (n = 1, \ldots, g). \]
For the orbit as a dynamical system, we find a natural volume form $\langle dt_1, dt_2, \ldots, dt_g \rangle_{\mathbb{C}}$.

Lemma 4.28. (1) For $L_1, L_2 \in M_{\mathrm{KdV}\, g}$, there is $W \in W_t$ such that $W L_1 W^{-1} = L_2$. (2) For a subset of $W^f$ ($g > 0$),
\[ W^f_g := \{W \in W^f \mid W L W^{-1} \in M_{\mathrm{KdV}\, g}, \text{ for } L \in M_{\mathrm{KdV}\, g}\}, \]
the projection $\pi^g_{\mathrm{KdV}}$ along the orbit space in the quotient space of $M_{\mathrm{KdV}\, g}$ by the action of $W^f_g$ is given by
\[ \pi^g_{\mathrm{KdV}}(M_{\mathrm{KdV}\, g} / W^f_g) \sim \mathrm{pt}. \]
(3) For $W^f_{0,1} := \{W \in W^f \mid W L W^{-1} \in M_{\mathrm{KdV}\, 0} \cup M_{\mathrm{KdV}\, 1}, \text{ for } L \in M_{\mathrm{KdV}\, 0} \cup M_{\mathrm{KdV}\, 1}\}$, the following relation holds:
\[ \pi_{\mathrm{KdV}}(M_{\mathrm{KdV}\, 0} \cup M_{\mathrm{KdV}\, 1} / W^f_{0,1}) \sim \mathrm{pt}. \]

Proof. (1) is essentially the same as Proposition 4.16(2). The action of $W^f_g$ on $M_{\mathrm{KdV}\, g}$ is transitive due to (1), and thus (2) is obtained.

Now let us come back to the elastica problem. Firstly, we note that the elastica problem is defined over real functions. Hence we should restrict the above result to a real analytic problem. In other words, we choose a natural complex structure $J$ ($J^2 = -1$) in the orbit space $\langle dt_1, dt_2, \ldots, dt_g \rangle_{\mathbb{C}}$ and constrain it to $\langle dt_1, dt_2, \ldots, dt_g \rangle_{\mathbb{R}}$, using the fact that a finite g-type flow is a finite g-type solution of the KdV equation. Further, the orbit satisfies the reality condition $|\partial_1 \gamma| = 1$, which characterizes a certain type of hyperelliptic curves.

Secondly, we should notice the difference between the categories of the previous chapter and this chapter. However, as the g-type flow $u$ is expressed by meromorphic functions over a hyperelliptic curve of genus g, elements of the finite real flow exist in the category of formal power series. Hence the investigation of the finite flow does not depend on the difference. Further, the arc-length $s$ corresponds to $t_1$ in the above argument, but we consider only closed curves. Hence, firstly, $t_1$ must be an element of $S^1 = \mathbb{R}/\mathbb{Z}$. Even though $u(s) \equiv \{\gamma, s\}_{SD}$ is periodic, $\gamma$ is not in general. We should restrict the solution space of the KdV equation so that $\gamma(0) = \gamma(2\pi)$ or $\gamma(0) = \gamma(\infty)$.

We define a projective structure in $M^P_{\mathrm{elas}\, g}$ by $\pi^g_{\mathrm{elas}} : M^P_{\mathrm{elas}\, g} \to M^P_{\mathrm{elas}\, g}$ so that for a point $p$ in $M^P_{\mathrm{elas}\, g}$, $(\pi^g_{\mathrm{elas}})^{-1}(p)$ is the real number orbit, and let $M^P_{\mathrm{elas\, finite}} = \coprod_g M^P_{\mathrm{elas}\, g}$ as in Definition 3.18. We summarize these results in the following proposition.
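Proposition 4.29. There are natural injections
\[ i_{\mathrm{KdV}} : M^P_{\mathrm{elas\, finite}} \hookrightarrow M_{\mathrm{KdV\, finite}}, \qquad \iota_{\mathrm{KdV}} : M^P_{\mathrm{elas\, finite}} \hookrightarrow M_{\mathrm{KdV\, finite}}, \]
which satisfy (1) $\iota_{\mathrm{KdV}} \circ \pi_{\mathrm{elas}} = \pi_{\mathrm{KdV}} \circ i_{\mathrm{KdV}}$, (2) $M_{\mathrm{KdV}} \setminus M^P_{\mathrm{elas}} \neq \emptyset$.

Using the above results, there is a filtration in $M^P_{\mathrm{elas}}$ such that $F_g M^P_{\mathrm{elas}} := M^P_{\mathrm{elas}} \cap F_g M_{\mathrm{KdV}}$, which satisfies
\[ F_g M^P_{\mathrm{elas}} \subset F_{g+1} M^P_{\mathrm{elas}}, \qquad M^P_{\mathrm{elas}\, g} = F_g M^P_{\mathrm{elas}} / F_{g-1} M^P_{\mathrm{elas}}. \]
We have written just $M^P_{\mathrm{elas}}$ as $i_{\mathrm{KdV}}(M^P_{\mathrm{elas}})$ and $M^P_{\mathrm{elas}}$ as $\iota_{\mathrm{KdV}}(M^P_{\mathrm{elas}})$ for brevity.

Next we consider the real orbits, or the "time" development, of each finite g type flow in $M^P_{\mathrm{elas}}$ (instead of $M_{\mathrm{KdV}}$). Let us recall the fact that the rational points in $[0, 1)$, i.e. $\mathbb{Q}/\mathbb{Z}$, have measure zero in $[0, 1)$. Further, it is known that for a torus $\mathbb{C}/(\mathbb{Z} + \sqrt{-1}\mathbb{Z})$, a real straight line (orbit) starting from the origin with an angle $\theta$ does not return to the origin if $\theta \notin \tan^{-1}(\mathbb{Q}/\mathbb{Z})$. Similarly, in general, the real "time" development of the finite g type solution is not periodic in "time" $t_i$ ($i > 1$) in the g-dimensional torus $J_g$; such solutions are called quasi-periodic solutions. Hence we conclude that such an orbit is homeomorphic to $\mathbb{R}^{g-1}$ in this sense and show the following proposition.

Proposition 4.30. For each $\mathrm{pt} \in M^P_{\mathrm{elas}\, g}$, we have a restricted action of $W^f_g$, and thus the following results are satisfied:
(1) $\pi^g_{\mathrm{KdV}}|_{M^P_{\mathrm{elas}\, g}}\big(M^P_{\mathrm{elas}\, g} / [W^f_g|_{M^P_{\mathrm{elas}\, g}}]\big) = \mathrm{pt}$.
(2) $M^P_{\mathrm{elas}\, g} / [W^f_g|_{M^P_{\mathrm{elas}\, g}}] \approx S^1 \times \mathbb{R}^{g-1}$, $M^P_{\mathrm{elas}\, g} / [W^f_g|_{M^P_{\mathrm{elas}\, g}}] \approx \mathbb{R}^{g-1}$, for $g > 1$.
(3) $M^P_{\mathrm{elas}\, 0} \cup M^P_{\mathrm{elas}\, 1} / W^f_{0,1}|_{M^P_{\mathrm{elas}\, 0} \cup M^P_{\mathrm{elas}\, 1}} \approx S^1$, $\big(M^P_{\mathrm{elas}\, 0} \cup M^P_{\mathrm{elas}\, 1} / W^f_{0,1}|_{M^P_{\mathrm{elas}\, 0} \cup M^P_{\mathrm{elas}\, 1}}\big) / \mathrm{Isom}(S^1) \approx \mathrm{pt}$.

We now recover the base ring of smooth functions. In other words, we show that the completion in Proposition 4.27 can be extended to $E_s$, because the convergence is determined only by the topology of the order of the differential operator, as shown in the following lemma.

Lemma 4.31. (1) When we define $E_s^m := \{D \in E_s \mid \deg D \le m\}$, $E_s$ has a filter topology,
\[ E_s = \bigcup_m E_s^m, \qquad \{0\} = \bigcap_m E_s^m, \qquad E_s^m \subset E_s^{m+1}. \]
(2) $E_s$ is a linear topological space with respect to this filter topology.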
Due to Lemma 4.31 and the note below Proposition 4.3, we have the following proposition.

Proposition 4.32. Let us define the moduli space of the KdV equations over the ring of smooth functions:
\[ M^\infty_{\mathrm{KdV}} := \{u \in C^\infty(V) \mid \partial_{t_n} u - \Omega_1^{n-1} u = 0 \text{ for } \forall\, n\}, \qquad M^\infty_{\mathrm{KdV}} := M^\infty_{\mathrm{KdV}} / (t_1). \]
Then (1) $M_{\mathrm{KdV}}$ is dense in $M^\infty_{\mathrm{KdV}}$. (2) $M_{\mathrm{KdV\, finite}}$ is a subset of $M^\infty_{\mathrm{KdV}}$.

Proof. Due to the Weierstrass preparation theorem, for an arbitrary germ in $C^\infty(\mathbb{R})$, there is a sequence in $F(\mathbb{R})$ converging to it. Integrability due to Proposition 3.15 asserts that the difference does not grow under the time development. Hence (1) is proved. (2) is obvious.

Hence we have the final statement in this section.

Proposition 4.33. $M^P_{\mathrm{elas}} \subset M^\infty_{\mathrm{KdV}}$ has the filter topology induced from $M^\infty_{\mathrm{KdV}}$.
(1) There is a natural decomposition,
\[ M^P_{\mathrm{elas,\, finite}} = \coprod_{g < \infty} M^P_{\mathrm{elas}\, g}. \]
(2) With respect to the induced topology and in the sense of Propositions 4.27(3) and 4.32(1), we have
\[ M^P_{\mathrm{elas}} = \bigcup_{g=1}^{\infty} F_g M^P_{\mathrm{elas}\, g}. \]
5. Algebro-Geometric Properties of the KdV Flow II: Analytic Expressions

In this section, we give a more concrete argument. For example, we give another proof of Theorem 4.2 and Proposition 4.33 at Proposition 5.22. This section is based on the inverse scattering method, Krichever's scheme and Baker's approach.

The study of the KdV equation has a long history. Many researchers have contributed to it, e.g. Miura, Gardner, Greene, Kruskal, Lax, and so on [22, 47]. Owing to their studies, we present here, without proofs, another aspect of the KdV equation, which is called the inverse scattering method.

Proposition 5.1 ([22, 27, 28, 47]). (1) For a solution $u$ of the nth KdV equation, there is a complex valued smooth function $\psi_{\bar{x}}$ over $\mathbb{R} \times \{t_n\}$, where $\mathbb{R}$ is a universal covering of $S^1$ ($S^1 = \mathbb{R}/2\pi\mathbb{Z}$) and $\{t_n\} \subset \mathbb{R}$,
\[ \psi_{\bar{x}} \in C^\infty(\mathbb{R} \times \{t_n\}, \mathbb{C}), \]
as an eigenvector of the eigenvalue problem over $\mathbb{R}$ ($S^1 = \mathbb{R}/2\pi\mathbb{Z}$),
\[ L = \partial_1^2 + u, \qquad -L\psi_{\bar{x}} = \bar{x}\psi_{\bar{x}}, \]
and as a solution of
\[ (\partial_{t_n} - 2^{2(n-1)} L_+^{(2n-1)/2})\psi_{\bar{x}} = 0. \]
The deformation of $u$ with respect to $t_n$ preserves the eigenvalue $\bar{x}$ if and only if $u$ is a solution of the nth KdV equation. Here the extension of the domain of $u$ from $S^1$ to $\mathbb{R}$ is naturally defined by $u(t_1) = u(t_1 + 2\pi)$. These equations are equivalent to the Lax equations in Definition 4.20.

Remark 5.2. (1) The eigenvalue problem $-L\psi_{\bar{x}} = \bar{x}\psi_{\bar{x}}$ can be regarded as a quantization of the "classical" equation
\[ \left(-\partial_1^2 - \frac{1}{2}\{\gamma, s\}_{SD}(t_1)\right)\psi(t_1) = 0. \]
Indeed, $(\partial_\tau - L)\Psi = 0$ ($\Psi = \exp(\tau\bar{x})\psi$) appears when we quantize $\psi(t_1)$ by means of the path integration [39, 48]. (2) For finite type solutions of the KdV hierarchy, the Lax equations and the compatibility condition are essentially reduced to finite relations. Due to Lemma 4.1 and Definition 4.24, the equations with respect to $t_m$, $m > N$, are trivial for an N-type solution.

For a while, we assume that $u$ is real. Let $\mathrm{Spect}(-L)$ denote the set of $\bar{x}$. Due to the hermitian property of $-L$, $\mathrm{Spect}(-L)$ is a subset of the real numbers bounded from below. The function $\psi_{\bar{x}}(t_1)$ is regarded as a section of a line bundle over $\mathrm{Spect}(-L)$. For bases $y_0$ and $y_1$ of the solution space of $-L\psi_{\bar{x}} = \bar{x}\psi_{\bar{x}}$ ($\psi_{\bar{x}} = a y_0 + b y_1$, for $a, b \in \mathbb{C}$),
\[ y_0(0, \bar{x}) = 1, \quad y_1(0, \bar{x}) = 0, \quad \partial_1 y_0(0, \bar{x}) = 0, \quad \partial_1 y_1(0, \bar{x}) = 1, \]
we have the monodromy matrix defined as
\[ M(\bar{x}) := \begin{pmatrix} y_0(\pi, \bar{x}) & y_1(\pi, \bar{x}) \\ \partial_1 y_0(\pi, \bar{x}) & \partial_1 y_1(\pi, \bar{x}) \end{pmatrix}, \]
whose determinant is unity. If the eigenvalue $\rho$ of this matrix is on the unit circle in $\mathbb{C}$ ($|\rho| = 1$), the solution $\psi_{\bar{x}}$ is called stable and exists as a global section of the line bundle over $s \in \mathbb{R}$. Otherwise, it is called unstable, which means that there is no global section over $s \in \mathbb{R}$ even though we can find local solutions of $-L\psi_{\bar{x}} = \bar{x}\psi_{\bar{x}}$. We sometimes refer to the unstable state as a "gap state" or "forbidden state". Whether a state is stable or unstable is determined by the characteristic equation,
\[ \rho^2 - \Delta_u \rho + 1 = 0, \]
where $\Delta_u := \mathrm{tr}\, M$. If its discriminant $\Delta_u^2 - 4$ is non-positive, the corresponding $\bar{x}$ is stable. Since $\Delta_u^2 - 4$ is an analytic function over $\mathrm{Spect}(-L) - \{\infty\}$ and has ordered zero points $\bar{x}_1, \bar{x}_2, \ldots$, it has the infinite product expression
\[ (\Delta_u^2 - 4) = c \prod_{j=0}^{\infty} (\bar{x} - \bar{x}_j), \]
where $c$ is a constant in $\bar{x}$. This fact holds even when $u$ is complex valued, and thus we return to general $u$ here.

Proposition 5.3 ([49]). For $-L\psi_{\bar{x}} = \bar{x}\psi_{\bar{x}}$ with smooth $u(t_1)$ over $\mathbb{R}$, the discriminant $\Delta$ is characterized by infinitely many $\bar{x}_j$ and can be rewritten as
\[ (\Delta_u^2 - 4) = \prod_{j=0,\ \text{single zeros}} (\bar{x} - \bar{x}_j)\; h(\bar{x})^2, \]
where $h(\bar{x}) = \sqrt{c} \prod_{j \ge 0,\ \text{double zeros}}^{\infty} (\bar{x} - \bar{x}_j)$ is the part of double zeros.
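The stability criterion above can be explored numerically. The following is a minimal sketch, not the authors' method: it integrates $-\psi'' - u\psi = \bar{x}\psi$ over one period with a classical Runge-Kutta step, builds the monodromy matrix from the two normalized solutions $y_0, y_1$, and tests $\Delta_u^2 - 4 \le 0$. The function names (`monodromy`, `discriminant`, `is_stable`), the step count, and the choice of period $\pi$ (read off from the definition of $M(\bar{x})$) are assumptions made for illustration.

```python
import math

def monodromy(u, energy, period=math.pi, steps=2000):
    """Monodromy matrix of -psi'' - u(t) psi = energy * psi over one period.

    Returns [[y0, y1], [y0', y1']] at t = period, where y0(0)=1, y0'(0)=0
    and y1(0)=0, y1'(0)=1. Integration uses the classical RK4 scheme.
    """
    def rhs(t, y, dy):
        # First-order form: (y, y')' = (y', -(u + energy) y)
        return dy, -(u(t) + energy) * y

    h = period / steps
    columns = []
    for y, dy in ((1.0, 0.0), (0.0, 1.0)):
        t = 0.0
        for _ in range(steps):
            k1y, k1d = rhs(t, y, dy)
            k2y, k2d = rhs(t + h / 2, y + h / 2 * k1y, dy + h / 2 * k1d)
            k3y, k3d = rhs(t + h / 2, y + h / 2 * k2y, dy + h / 2 * k2d)
            k4y, k4d = rhs(t + h, y + h * k3y, dy + h * k3d)
            y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
            dy += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
            t += h
        columns.append((y, dy))
    (y0, dy0), (y1, dy1) = columns
    return [[y0, y1], [dy0, dy1]]

def discriminant(u, energy, **kwargs):
    """Delta_u = tr M for the given potential and spectral parameter."""
    m = monodromy(u, energy, **kwargs)
    return m[0][0] + m[1][1]

def is_stable(u, energy, **kwargs):
    """True when Delta_u^2 - 4 <= 0, i.e. both eigenvalues rho have |rho| = 1."""
    return discriminant(u, energy, **kwargs) ** 2 - 4.0 <= 0.0
```

For $u \equiv 0$ and $\bar{x} = k^2 > 0$ the two solutions are $\cos(kt)$ and $\sin(kt)/k$, so $\Delta_u = 2\cos(k\pi)$ and every positive $\bar{x}$ is stable, while negative $\bar{x}$ gives $\Delta_u = 2\cosh(\pi\sqrt{-\bar{x}}) > 2$. Values of $\bar{x}$ exactly at a band edge ($\Delta_u = \pm 2$) are sensitive to rounding and should be tested with a tolerance.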
For large $\bar{x}$, $-L$ asymptotically behaves like $-\partial_1^2$ for bounded $u$, and thus the asymptotic behavior of $\Delta$ can be investigated. Since the ground state corresponds to a single zero of $\Delta^2 - 4$ and every other gap has two single zeros of $\Delta^2 - 4$, the number of single zeros of $\Delta^2 - 4$ must be odd. Here we consider the case of finitely many single zero points, $2g + 1$ in number:
\[ \frac{\Delta_u^2 - 4}{h(\bar{x})^2} = \prod_{j=1}^{2g+1} (\bar{x} - \bar{x}_j). \]
We refer to such a case as a finite-gap state. It should be noted that $\psi_{\bar{x}}$ has a natural involution $\pi : \mathrm{Spect}(-L) \to \mathrm{Spect}(-L)$ ($\pi : \bar{y} \to -\bar{y}$, $\pi : \infty = \infty$), where $\bar{y} = \sqrt{\Delta_u^2 - 4}/h(\bar{x})$. Due to analyticity, we can extend $\mathrm{Spect}(-L)$ to the complex domain. As in the $u \equiv 0$ case, where $\mathrm{Spect}(-\partial_1^2)$ is complexified to $P$ (even though we need more precise arguments), the energy spectrum $\mathrm{Spect}(-L)$ is, in general, reduced to a hyperelliptic curve $C_g$ due to its two-fold property. In fact, for $\bar{y} = \sqrt{\Delta_u^2 - 4}/h(\bar{x})$, this relation means a hyperelliptic curve defined in 4.13.

In this section, we fix a hyperelliptic curve $C_g$ with genus g given by an affine curve,
\[ \bar{y}^2 = h_g(\bar{x}, 1) = (\bar{x} - c_1) \cdots (\bar{x} - c_{2g+1}). \]
In other words, we deal with a commutative ring $\mathbb{C}[\bar{x}, \bar{y}]/(\bar{y}^2 - h_g(\bar{x}, 1)) \cup \{\infty\}$. We should note that for a hyperelliptic curve $C_g$, there exists a differential operator $-L$ with $u$ such that its spectrum $\mathrm{Spect}(-L)$ gives the hyperelliptic curve isomorphic to $C_g$.

Proposition 5.4. Let the moduli space of hyperelliptic curves of genus g be denoted by $M_{\mathrm{hyp},\, g}$. Then $M_{\mathrm{hyp},\, g}$ is a $(2g - 1)$ dimensional space.
Proof. A point in the moduli space $M_{\mathrm{hyp},\, g}$ is characterized by the $2g + 1$ zero points of $h_g(x, 1)$ in the above definition and the point $\infty$. However, in these variables, there are several symmetries which express the same compact Riemann surface. The first is the translational symmetry $c_j \to c_j + \alpha_0$, $\alpha_0 \in \mathbb{C}$. The second is the dilatation $c_j \to c_j \alpha_1$, $\alpha_1 \in \mathbb{C}$. The third is $(\bar{x}, \bar{y}) \to (1/\bar{x}, \bar{y}\,\Pi_j c_j / \bar{x}^{(2g+1)/2})$, which induces $c_j \to 1/c_j$. Hence the remaining number of degrees of freedom is $2g - 1$.

Remark 5.5. We will mention $M_{\mathrm{hyp},\, g}$ here. We consider a smooth curve in $M_{\mathrm{hyp},\, g}$ which is not degenerate: $c_i \neq c_j$ if $i \neq j$, and all of the $c_j$ are finite values in $\mathbb{C}$. Let us find the largest distance $|c_j - c_k|$ among the pairs $(c_j, c_k)$ in $\{c_j\}$; no $|c_j - c_k|$ vanishes because the curve is not degenerate. Let us rename this pair as $(c_1, c_{2g+1})$ and define
\[ (\alpha_1, \ldots, \alpha_{2g-1}) := \big((c_2 - c_1)/(c_{2g+1} - c_1), \ldots, (c_{2g} - c_1)/(c_{2g+1} - c_1)\big) \in \mathbb{C}^{2g-1}. \]
Since $1 - \alpha_j = (c_{j+1} - c_{2g+1})/(c_1 - c_{2g+1})$ and $|c_1 - c_{2g+1}|$ is the largest distance, each $\alpha_j$ is constrained by $|\alpha_j| \le 1$ and $|1 - \alpha_j| \le 1$. Next we order the $\alpha$'s by the following rule:
• if $\mathrm{Re}(\alpha_i) < \mathrm{Re}(\alpha_j)$, then $i < j$,
• if $\mathrm{Re}(\alpha_i) = \mathrm{Re}(\alpha_j)$ and $\mathrm{Im}(\alpha_i) < \mathrm{Im}(\alpha_j)$, then $i < j$.
Hyperelliptic curves of genus g are determined as two-fold coverings of $P^1$ ramified at $0, 1, \infty$ and $2g - 1$ additional points in the above order. However, it is difficult to deal with the deformation from non-degenerate hyperelliptic curves to degenerate curves [50–52].

Definition 5.6 ([28, 51, 27, 52, 53]). (1) Let
\[ H_1(C_g, \mathbb{Z}) = \bigoplus_{j=1}^{g} \mathbb{Z}\alpha_j \oplus \bigoplus_{j=1}^{g} \mathbb{Z}\beta_j \]
denote the homology of an algebraic curve $C_g$. (2) We introduce the period matrix of the curve $C_g$ in terms of the normalized one-forms of the first kind $\hat{\omega}_i$ over $C_g$:
\[ 1 = \left[\int_{\alpha_j} \hat{\omega}_i\right], \qquad T = \left[\int_{\beta_j} \hat{\omega}_i\right], \qquad \hat{\Omega} = \begin{bmatrix} 1 \\ T \end{bmatrix}. \]
(3) For fixed $T$, we define the theta function $\theta : \mathbb{C}^g \to \mathbb{C}$ by
\[ \theta(z) := \theta(z \mid T) := \sum_{n \in \mathbb{Z}^g} \exp\left[2\pi i\left(\frac{1}{2}\,{}^t n T n + {}^t n z\right)\right]. \]
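Because the series in Definition 5.6(3) converges rapidly when the imaginary part of $T$ is positive definite, it can be evaluated by direct truncation. The sketch below is illustrative only: the function names `theta` and `quasi_period_factor`, the cutoff, and the sample period matrix are assumptions, not data from the text. The second function encodes the factor obtained by shifting the summation index $n \to n - e_k$ in the series, namely $\theta(z + \tau_k) = \exp(-2\pi i z_k - \pi i \tau_{kk})\,\theta(z)$ with $\tau_k$ the k-th column of $T$.

```python
import cmath
import itertools

def theta(z, T, cutoff=8):
    """Truncated Riemann theta series: sum over n in {-cutoff..cutoff}^g of
    exp(2 pi i (n.T.n / 2 + n.z)). T is a symmetric g x g matrix (nested lists)
    whose imaginary part is assumed positive definite."""
    g = len(z)
    total = 0j
    for n in itertools.product(range(-cutoff, cutoff + 1), repeat=g):
        quad = sum(n[i] * T[i][j] * n[j] for i in range(g) for j in range(g))
        lin = sum(n[i] * z[i] for i in range(g))
        total += cmath.exp(2j * cmath.pi * (0.5 * quad + lin))
    return total

def quasi_period_factor(z, T, k):
    """Multiplier picked up by theta under z -> z + (k-th column of T)."""
    return cmath.exp(-2j * cmath.pi * z[k] - 1j * cmath.pi * T[k][k])

# Hypothetical genus-2 data, chosen only so that Im(T) is positive definite.
T_SAMPLE = [[1j, 0.3], [0.3, 1.2j]]
Z_SAMPLE = [0.1 + 0.2j, -0.3 + 0.1j]
```

With these sample values, `theta([Z_SAMPLE[0] + 1, Z_SAMPLE[1]], T_SAMPLE)` agrees with `theta(Z_SAMPLE, T_SAMPLE)` to rounding error, and shifting by a column of `T_SAMPLE` reproduces `quasi_period_factor` times the original value. The cost grows as `(2 * cutoff + 1) ** g`, so this direct sum is only practical for small genus.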
Proposition 5.7 ([27, 28, 44, 51 53]). (1) By defining the Abel map for the gth symmetric product of the curve $C_g$,
\[ \hat{w} : \mathrm{Sym}^g(C_g) \to \mathbb{C}^g, \qquad \hat{w}_k(Q) := \sum_{i=1}^{g} \int_{\infty}^{Q_i} \hat{\omega}_k, \]
the Jacobi variety $\hat{J}_g$ is realized as a complex torus,
\[ \hat{J}_g = \mathbb{C}^g / \hat{\Lambda}. \]
Here $\hat{\Lambda}$ is a lattice generated by $\hat{\Omega}$. For the Abelian group of the divisors of line bundles over a hyperelliptic curve $C_g$, which is called the Picard group $\mathrm{Pic}^0(X)$, the Abel theorem is expressed by $\mathrm{Pic}^0(X) \approx \mathbb{C}^g / \hat{\Lambda}$.
(2) The theta function has the monodromy properties
\[ \theta(z + e_k) = \theta(z), \qquad \theta(z + \tau_k) = e^{-2\pi i z_k - \pi i \tau_{kk}}\, \theta(z). \]
(3) The Riemann theorem states that
\[ \theta\left(\hat{w}(Q) - \sum_{i=1}^{g} \hat{w}(P_g) + K\right) \not\equiv 0 \]
if and only if the $P_g$'s are general points on $C_g$, where $K$ is a constant called the Riemann constant.

As $M^P_{\mathrm{elas},\, g}$ and $M_{\mathrm{KdV},\, g}$ have the natural projections, we introduce the universal family of hyperelliptic curves of $M_{\mathrm{hyp},\, g}$ induced from $\pi_{\mathrm{hyp}} : J_g \mapsto \mathrm{pt} \in M_{\mathrm{hyp},\, g}$.

Proposition 5.8 (Krichever, Mulase [26, 27, 44]). (1) A finite g type solution of the KdV equation is given by a meromorphic function over the Jacobi variety $J_g$ of a hyperelliptic curve $C_g$. (2) There is a natural bijection between the moduli space of hyperelliptic curves $M_{\mathrm{hyp},\, g}$ and $M_{\mathrm{KdV},\, g}$,
\[ M_{\mathrm{hyp},\, g} \approx M_{\mathrm{KdV},\, g}. \]

As (2) comes from the previous section, we mention the idea of (1) as follows [11, 27]. Krichever started with $\psi_x$, a solution of $(-\partial_1^2 - u + x^2)\psi_x = 0$, which is called the Baker–Akhiezer function. His approach is very natural in soliton theory and can be generalized from the case of the KdV hierarchy, which is related to hyperelliptic curves, to that of the KP hierarchy, which is related to more general compact Riemann surfaces.

Lemma 5.9 ([27]). (1) For a solution of the KdV equation whose $\mathrm{Spect}(-L)$ is associated with the hyperelliptic curve $C_g$, we parameterize the eigenvalue as $-x^2$ for $L\psi_x = x^2\psi_x$. Then $1/x$ is a local parameter at $\infty$ of $C_g$. (2) $\psi_x$ is meromorphic on $C_g - \infty$, and at the point $\infty$ it has an essential singularity,
\[ \psi_x = e^{sx}\psi_W, \qquad \psi_W := \left(1 + \sum_{i=1}^{\infty} a_i(t_1)\, x^{-i}\right). \]
Here this expansion gives the recursive relation $-2\partial_1 a_i = -L a_{i-1}$ with $a_0 = 1$.
Proof. (1) For sufficiently large $|x|$, this equation can be approximated by $(-\partial_1^2 + x^2)\psi_x \sim 0$. Thus we can regard $\psi_x \sim \exp(sx)$. In other words, for a local coordinate $z = 1/x$ around $\infty \in \mathrm{Spect}(-L)$, $\psi_x \sim \exp(-s/z)(1 + O(z))$; here $1/x^2 = 1/\bar{x}$ is a local coordinate around $\infty \in \mathrm{Spect}(-L)$. (2) can be obtained by straightforward computations.

Using Lemma 5.9, we follow Krichever's construction of the finite g type solution. As we gave the Jacobi varieties and theta functions of the hyperelliptic curve $C_g$ in 4.13 and Proposition 5.7, we introduce a normalized Abelian differential of the second kind, $\hat{\eta}_{P,i}$,
\[ \hat{\eta}_{P,n} = d\left(\frac{1}{t^{n-1}} + O(1)\right), \]
around $P$ using a local parameter $t$ ($t(P) = 0$), with the normalization
\[ \int_{\alpha_j} \hat{\eta}_{P,n} = 0, \quad \text{for } j = 1, \ldots, g. \]
As we have prepared to express the Baker–Akhiezer function, we consider the deformation equation
\[ (\partial_{t_n} - 2^{2(n-1)} L_+^{(2n-1)/2})\psi_x = 0. \]
Since $z = 1/x$ is a local parameter around $\infty$ and, around there, $L_+^{(2n-1)/2} \sim \partial_1^{(2n-1)}$, we introduce
\[ \hat{\eta}_{\infty,n} = d(x^{2n-1} + O(1)), \]
and consider the function
\[ E(t, Q) = \exp\left(\sum_{\alpha,j} 2^{2(n-1)}\, t_{\alpha,j} \int^{Q} \eta_{P_\alpha,i}\right). \]
Around $\infty$,
\[ E(t, Q) \sim \exp\left(\sum_{n=1}^{\infty} 2^{2(n-1)}\, t_n\, x^{2n-1}\right) \]
and $\partial_{t_n} E(t, Q) \sim 2^{2(n-1)} x^{2n-1} E(t, Q)$. Due to Lemmas 4.8 and 5.9 and by letting $L = W(s, \partial_1)\partial_1^2 W(s, \partial_1)^{-1}$ in the sense of Lemma 4.14, we obtain the relations $\psi_x = W(s, \partial_1)E(t, Q) + O(\frac{1}{x})$ and
\[ L^{n/2}\, W(s, \partial_1)E(t, Q) = W(s, \partial_1)\,\partial_1^n E(t, Q) + O\left(\frac{1}{x}\right). \]
From the Lax equations in Proposition 5.1, $\psi_x$ is expressed by $\psi_x/E = (\psi_x/E)(x t_1, 4x^3 t_2, 8x^5 t_3, \ldots) + O(\frac{1}{x})$. On the other hand, even though $E(t, Q)$ satisfies the dispersion relation around $\infty$ and has no monodromy around the $\alpha_j$'s, it has monodromy around $\beta_j$,
\[ \exp(2\pi i U_j) := \exp\left(\sum_{j,\alpha} 2^{2(j-1)}\, t_j\, H^i_{\alpha,j}\right), \]
where
\[ H^i_{\alpha,j} = \frac{1}{2\pi i} \int_{\beta_i} d\hat{\eta}_{P_\alpha,j}. \]
Noting the monodromy of the theta function in Proposition 5.7, we can find a single-valued function over $C_g$, which is known as the Baker–Akhiezer function:
\[ \psi_x = \frac{\theta\left(w(Q) + \sum_{\alpha,j} 2^{2(j-1)}\, t_{\alpha,j} H_{\alpha,j} - \sum_{i=1}^{g} w(P_g) + K\right)}{\theta\left(w(Q) - \sum_{i=1}^{g} w(P_g) + K\right)}\, E(t, Q). \]
This is a solution of the Lax equations in Proposition 5.1. We can find a finite type solution of the KdV equation by using the zero mode, using Proposition 2.8. $\psi_x$ is determined by an analysis of the functions over $C_g$ related to the Jacobi variety $\hat{J}_g$. As the map from the $C_g$'s to the Jacobi variety $\hat{J}_g$ is known as the Abel map, finding the inverse map from functions over $\hat{J}_g$ to functions over the $C_g$'s is known as the Jacobi inversion problem. Krichever's scheme should be regarded as the Jacobi inversion method and can be applied even to a generalized Jacobi variety. It shows the existence of an injection from $M_{\mathrm{hyp}}$ to $M_{\mathrm{KdV}}$.

Remark 5.10. For a finite type solution $u$ of the KdV hierarchy, we have the hyperelliptic curve $C_g$ as the spectrum of $-L$ associated with $u$. Then the above arguments give the following results: (1) The orbit of an equation in the KdV hierarchy is realized as a straight line in the Jacobi variety $J_g$ of $C_g$. (2) Any finite g solution $u$ is given as a solution of the Jacobi inversion problem of the Jacobi variety $J_g$.

As long as we deal only with hyperelliptic curves and the KdV hierarchy, we can give more concrete arguments based upon Baker's original argument [29, 36].

Definition 5.11 ([29, 36, 54]). We introduce the family of differential forms:
(1) $\omega_1 = \dfrac{d\bar{x}}{2\bar{y}}, \quad \omega_2 = \dfrac{\bar{x}\, d\bar{x}}{2\bar{y}}, \quad \ldots, \quad \omega_g = \dfrac{\bar{x}^{g-1}\, d\bar{x}}{2\bar{y}}$.
(2) $\eta_j = \dfrac{1}{2\bar{y}} \displaystyle\sum_{k=j}^{2g-j} (k + 1 - j)\, \lambda_{k+1+j}\, \bar{x}^k\, d\bar{x}, \quad (j = 1, \ldots, g)$.

Lemma 5.12 ([29, 36, 54]). (1) The $\omega$'s are the basis of the holomorphic function valued cohomology of the hyperelliptic curve $C_g$, which give the unnormalized periods:
\[ \Omega' = \left[\int_{\alpha_j} \omega_i\right], \qquad \Omega'' = \left[\int_{\beta_j} \omega_i\right], \qquad \Omega = \begin{bmatrix} \Omega' \\ \Omega'' \end{bmatrix}. \]
(2) They are related to the normalized ones:
\[ {}^t[\hat{\omega}_1 \cdots \hat{\omega}_g] := \Omega'^{-1}\; {}^t[\omega_1 \cdots \omega_g], \qquad T := \Omega'^{-1}\Omega''. \]
(3) The $\eta$'s are the unnormalized one-forms of the second kind over $C_g$, and the complete hyperelliptic integrals of the second kind are given as
\[ H' := \left[\int_{\alpha_j} \eta_i\right], \qquad H'' := \left[\int_{\beta_j} \eta_i\right]. \]
Here the contours in the integrals are, for example, given on p. 3.83 in [53].

Proof. We check the holomorphicity of the forms in (1) and (3). A zero point of $\bar{y}$, or a root $c_j$ of $f(\bar{x}) = 0$, corresponds to a point $(c_j, 0)$ of the curve $C_g$. We use a local coordinate $z^2 := (\bar{x} - c_j)$ and obtain $\bar{x}^m d\bar{x}/(2\bar{y}) \sim (z^2 + c_j)^m dz + \cdots$. On the other hand, around the point $\infty$, let us choose the local coordinate $1/x$ with $1/x^2 = 1/\bar{x}$; then $\bar{x}^m d\bar{x}/(2\bar{y}) \sim (1/x)^{2g-2m+2} dx + \cdots$. Hence $\omega$ is holomorphic all over the curve $C_g$, while $\eta$ is holomorphic except at the point $\infty$.

Definition 5.13. (1) The unnormalized Jacobi variety $J_g$ is defined as a complex torus,
\[ J_g = \mathbb{C}^g / \Lambda, \]
where $\Lambda$ is a lattice generated by $\Omega$. (2) We define the theta function by
\[ \theta\begin{bmatrix} a \\ b \end{bmatrix}(z) = \theta\begin{bmatrix} a \\ b \end{bmatrix}(z; T) = \sum_{n \in \mathbb{Z}^g} \exp\left[2\pi i\left\{\frac{1}{2}\,{}^t(n + a)T(n + a) + {}^t(n + a)(z + b)\right\}\right]. \]

Proposition 5.14. The Riemann constant of the hyperelliptic curve $C_g$ is given as
\[ K = \sum_{j=1}^{g} \int_{\infty}^{A_j} \hat{\omega} = \delta' + \delta'' T, \]
where $\delta' = {}^t\left[\frac{g}{2}\ \frac{g-1}{2}\ \cdots\ \frac{1}{2}\right]$ and $\delta'' = {}^t\left[\frac{1}{2}\ \cdots\ \frac{1}{2}\right]$.

Proof. This proof is on p. 3.82 in [53].

Using the Abel map $\mathrm{Sym}^g(C_g) \to \mathbb{C}^g$, we define the coordinates in $\mathbb{C}^g$,
\[ t_j := \sum_{i=1}^{g} \int^{(\bar{y}_i, \bar{x}_i)} \omega_j. \]
Here we note that $t_j$ behaves as $(1/x)^{2(g-j)+1}$ around the point $\infty$ if we use the parameter $x^2 = \bar{x}$.

Definition 5.15 (℘-function, Baker [29, 36]). (1) Using the coordinate $t_j$, the σ-function, which is a holomorphic function over $\mathbb{C}^g$, is defined by
\[ \sigma(t) := \sigma(t; C_g) := \exp\left(-\frac{1}{2}\,{}^t t\, H' \Omega'^{-1} t\right) \vartheta\begin{bmatrix} \delta'' \\ \delta' \end{bmatrix}(\Omega'^{-1} t; T). \]
(2) In terms of the σ-function, the hyperelliptic ℘-function over the hyperelliptic curve $C_g$ is defined by
\[ \wp_{ij}(t) := -\frac{\partial^2}{\partial t_i \partial t_j} \log \sigma(t) = \frac{\sigma_i(t)\sigma_j(t) - \sigma_{ij}(t)\sigma(t)}{\sigma(t)^2}. \]
As the σ-function is an entire function over $\mathbb{C}^g$ and has simple zeros on a $(g - 1)$-dimensional subvariety of $\mathbb{C}^g$ called the theta divisor, the hyperelliptic $\wp_{ij}$ has second-order singularities and is a function on $J_g$.

Remark 5.16. It is worth noting that, from Definition 5.15, the hyperelliptic ℘-function can be concretely computed for a given hyperelliptic curve $C_g$. The summation in the definition of the θ function converges rapidly owing to $T$, and the other ingredients are integrals of elementary functions. Further, it is known that $\wp_{gi}$ is an elementary symmetric function, i.e. for $F(\bar{x}) = (\bar{x} - \bar{x}_1)(\bar{x} - \bar{x}_2) \cdots (\bar{x} - \bar{x}_g)$ [36, 54],
\[ F(\bar{x}) = \bar{x}^g - \sum_{i=1}^{g} \wp_{gi}\, \bar{x}^{i-1}. \]
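The relation between $F(\bar{x})$ and the $\wp_{gi}$ can be made concrete by expanding the product and reading off coefficients. The sketch below is only an illustration of that bookkeeping; the helper names `monic_from_roots` and `wp_g_from_roots` are hypothetical, and the sample roots are arbitrary numbers rather than points on any particular curve.

```python
def monic_from_roots(roots):
    """Coefficients [a_0, ..., a_g] (lowest degree first) of prod_k (x - roots[k])."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                     # contribution of x * (current poly)
        scaled = [-r * c for c in coeffs] + [0]    # contribution of -r * (current poly)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs

def wp_g_from_roots(roots):
    """Values wp_{g,i} for i = 1..g, defined by F(x) = x^g - sum_i wp_{g,i} x^(i-1)."""
    coeffs = monic_from_roots(roots)
    g = len(roots)
    return {i: -coeffs[i - 1] for i in range(1, g + 1)}
```

For roots 1, 2, 3 the product is $x^3 - 6x^2 + 11x - 6$, so the sketch returns $\wp_{3,3} = 6$, $\wp_{3,2} = -11$, $\wp_{3,1} = 6$; up to sign these are the elementary symmetric functions of the roots.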
Accordingly, by a numerical approach, we can compute a value of the hyperelliptic ℘ function, just as Euler determined a value of the elliptic integral to know the shape of a classical elastica by a numerical method [1–4]. This approach was discovered by Baker about one hundred years ago [17–20, 29, 36]. We emphasize that it completely differs from Krichever's approach based upon the Baker–Akhiezer theorem explained in Sec. 4. Krichever's arguments might not give us practical algorithms to fix the parameters of general hyperelliptic functions, except for solutions expressed by elliptic or hyperbolic functions. (Owing to its abstractness, it is a good strategy for constructing soliton theory.) On the other hand, Baker's original method determines concrete functional forms of the corresponding ℘ functions for any algebraically given hyperelliptic curve (even for degenerate curves in $M_{\mathrm{hyp},\, g}$). We can expand the ℘-function around a general point and know its parameter dependence. Since Baker's construction in [29] seems, as far as we know, to have faded from recent researchers' memory, we believe that this review of Baker's work is meaningful. We believe that it is very useful for analysis in physics.

In [29] Baker found that the ℘-functions obey the following differential equations, which contain the KdV hierarchy.

Example 5.17 (Genus = 3 [29]). Let us write $\wp_{ijk} := \partial\wp_{ij}(t)/\partial t_k$ and $\wp_{ijkl} := \partial^2\wp_{ij}(t)/\partial t_k \partial t_l$. The hyperelliptic ℘-function obeys the relations
(1) $\wp_{3333} - 6\wp_{33}^2 = 2\lambda_5\lambda_7 + 4\lambda_6\wp_{33} + 4\lambda_7\wp_{32}$,
(2) $\wp_{3332} - 6\wp_{33}\wp_{32} = 4\lambda_6\wp_{32} + 2\lambda_7(3\wp_{31} - \wp_{22})$,
(3) $\wp_{3331} - 6\wp_{31}\wp_{33} = 4\lambda_6\wp_{31} - 2\lambda_7\wp_{21}$,
(4) $\wp_{3322} - 4\wp_{32}^2 - 2\wp_{33}\wp_{22} = 2\lambda_5\wp_{32} + 4\lambda_6\wp_{31} - 2\lambda_7\wp_{21}$,
(5) $\wp_{3321} - 2\wp_{33}\wp_{21} - 4\wp_{32}\wp_{31} = 2\lambda_5\wp_{31}$,
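(6) $\wp_{3311} - 4\wp_{31}^2 - 2\wp_{33}\wp_{11} = 2\Delta_\wp$,
(7) $\wp_{3222} - 6\wp_{32}\wp_{22} = -4\lambda_2\lambda_7 - 2\lambda_3\wp_{33} + 4\lambda_4\wp_{32} + 4\lambda_5\wp_{31} - 6\lambda_7\wp_{11}$,
(8) $\wp_{3221} - 4\wp_{32}\wp_{21} - 2\wp_{31}\wp_{22} = -2\lambda_1\lambda_7 + 4\lambda_4\wp_{31} - 2\Delta_\wp$,
(9) $\wp_{3211} - 4\wp_{31}\wp_{21} - 2\wp_{32}\wp_{11} = -4\lambda_0\lambda_7 + 2\lambda_3\wp_{31}$,
(10) $\wp_{3111} - 6\wp_{31}\wp_{11} = 4\lambda_0\wp_{33} - 2\lambda_1\wp_{32} + 4\lambda_2\wp_{31}$,
(11) $\wp_{2222} - 6\wp_{22}^2 = -8\lambda_2\lambda_6 + 2\lambda_3\lambda_5 - 6\lambda_1\lambda_7 - 12\lambda_2\wp_{33} + 4\lambda_3\wp_{32} + 4\lambda_4\wp_{22} + 4\lambda_5\wp_{21} - 12\lambda_6\wp_{11} + 12\Delta_\wp$,
(12) $\wp_{2221} - 6\wp_{22}\wp_{21} = -4\lambda_1\lambda_6 - 8\lambda_0\lambda_7 - 6\lambda_1\wp_{33} + 4\lambda_3\wp_{31} + 4\lambda_4\wp_{21} - 2\lambda_5\wp_{11}$,
(13) $\wp_{2211} - 4\wp_{21}^2 - 2\wp_{22}\wp_{11} = -8\lambda_0\lambda_6 - 8\lambda_0\wp_{33} - 2\lambda_1\wp_{32} + 4\lambda_2\wp_{31} + 2\lambda_3\wp_{21}$,
(14) $\wp_{2111} - 6\wp_{21}\wp_{11} = -2\lambda_0\lambda_5 - 8\lambda_0\wp_{32} + 2\lambda_1(3\wp_{31} - \wp_{22}) + 4\lambda_2\wp_{21}$,
(15) $\wp_{1111} - 6\wp_{11}^2 = -4\lambda_0\lambda_4 + 2\lambda_1\lambda_3 + 4\lambda_0(4\wp_{31} - 3\wp_{22}) + 4\lambda_1\wp_{21} + 4\lambda_2\wp_{11}$,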
where ∆℘ = ℘32 ℘21 − ℘31 ℘22 + ℘231 − ℘33 ℘11 . Proposition 5.18. For u = −2(℘gg − λ2g /3) and 3 tg−1 tg−2 tg−1 u(s, t2 , t3 ) = u tg , 2 , 4 + 4 2 2 2 λ2g
obeys the first and the second KdV equations.
Proof. Let us consider g = 3 case. If we regarded as u = −2(℘33 − λ6 /3), it is obvious that (1) in Example 3.14 becomes the KdV equation noting λ7 = 1. By 2 setting 2∂t3 × (2) + ∂t2 × (1) and ∂t3 = 16∂t1 + 16λ 3 ∂t2 , we obtain the second KdV equation. From arguments of Baker [29, 36], even for g > 3 the relations (1) and (2) maintain for g case. Remark 5.19. (1) By above arguments, for given hyperelliptic curve y¯2 = f (¯ x), we can construct a solution of the first and second KdV equations. Further the compatibility of Lax system gives more general argument for the other equations in the KdV hierarchy. Then it implies that we explicitly showed the existence of an injective map Mhyp, g → MKdV, g . This correspondence is valid even for degenerate curves. (2) Our development of the quantized elastica after submitting this article is in [17–20]. In [17–20], we showed more explicit function forms of quantized elastica over C. (3) After submitting this article, we knew the works of Buchstaber, Enolskii, Leykin and related people [reference in [54] as an extension of parts of Baker’s studies. Now let us give another proof of Theorem 4.2(2) and Proposition 4.33(2). Proposition 5.20. Propositions 4.2(2), 4,27(3), and 4.33(2) can be regarded as an approximation theory based upon the Weierstrass preparation theorem.
Proof. Let us recall the moduli space of the KdV equations whose base ring consists of smooth functions, defined in Proposition 4.32,
\[ M^\infty_{\mathrm{KdV}} = \{u \in C^\infty(V) \mid \partial_{t_n} u - \Omega_1^{n-1} u = 0 \text{ for } \forall\, n\}, \qquad M^\infty_{\mathrm{KdV}} = M^\infty_{\mathrm{KdV}} / (t_1). \]
For an arbitrary $u \in M^\infty_{\mathrm{KdV}}$, there is a unique spectrum $\mathrm{Spect}(L := -\partial_s^2 - u)$ up to its orbits, obtained by solving the eigenvalue equation $(-\partial_s^2 - u)\psi_x = \bar{x}\psi_x$. We assume that the spectrum does not have finitely many gaps, $\{(c_1, c_2), (c_3, c_4), \ldots, (c_{2g-1}, c_{2g}), \ldots\}$, and then the corresponding characteristic equation becomes the transcendental equation $\bar{y}^2 = f(x)$, where $f(x)$ is a transcendental function with zeros $(c_j)_{j=0,1,\ldots}$. Since $u$ is a smooth function over $S^1$, $|u|$ is bounded from above. Hence around $\infty$ of $\mathrm{Spect}(L)$, $L \sim -\partial_s^2$ and $\mathrm{Spect}(L)$ at $\infty$ is patched by the affine space $\mathbb{C}$; the width of the gaps converges to zero as $\bar{x} \to \infty$. Thus we can approximate $\mathrm{Spect}(L)$ by the finite-gap spectrum $\mathrm{Spect}(L_g) := \{(c_1, c_2), (c_3, c_4), \ldots, (c_{2g-1}, c_{2g}), (c_{2g+1}, \infty)\}$. The approximated potential $u_g$ is given by the ℘ function of the hyperelliptic curve $\bar{y}^2 = h_g(\bar{x}, 1)$ whose zero points are $(c_j)_{j=1,2,\ldots,2g+1}$. By using the Weierstrass preparation theorem and taking an appropriate g, we can approximate $f(\bar{x})$ by $h_g(\bar{x}, 1)$ to any desired accuracy. Hence, up to the KdVH flow, $u_g$ approaches $u$ as g approaches $\infty$, by construction. (For an arbitrary finite g, $u_g$ is unique up to its orbits.) Thus for an arbitrary $u$ in $M^\infty_{\mathrm{KdV}}$, there is a sequence of points $u_g$ belonging to $F_g M_{\mathrm{KdV}}$ such that $u_g \to u$ as $g \to \infty$ up to orbit. (We note that the finite type solutions do not depend upon the base ring, $C^\infty$ or formal power series.) Hence we have
\[ M^\infty_{\mathrm{KdV}} = \bigcup_g F_g M_{\mathrm{KdV}}. \]
Since $M^P_{\mathrm{elas}}$ is a subset of $M^\infty_{\mathrm{KdV}}$ and, for an arbitrary curve $\gamma \in M^P_{\mathrm{elas}}$, $u := \{\gamma, s\}_{SD}$ has a unique value, the above statement is valid.
Example 5.21 ([1, 2, 4, 13]). An element $\gamma$ in $M^P_{\mathrm{elas}}$ must satisfy $\gamma(s + L) = \gamma(s)$ in $P$ and the reality condition $|\partial_s \gamma| = 1$. Even though the hyperelliptic function ℘ is a meromorphic function over $J_g$, we can find a trajectory, or real line, in $J_g$ which avoids the singularities and satisfies the reality and closedness conditions. In other words, we find $M^P_{\mathrm{elas}}$ as a subset of $M_{\mathrm{KdV}}$. We give examples of $\gamma \in P$ in terms of the local chart around the origin.
(1) genus g = 0 case: a circle with radius 1.
(2) genus g = 1 cases: $M^P_{\mathrm{elas}\, 1}$ consists of two points:
(2-1) Jacobi elliptic modulus $l = 1$ case,
\[ \gamma(s) = s - \frac{2}{\alpha}\left(\tanh(\alpha s) - \sqrt{-1}\,\mathrm{sech}(\alpha s)\right). \]
(2-2) Jacobi elliptic modulus $l = 0.908911\cdots$, which gives the figure-eight loop in the complex plane $\mathbb{C}$ [4].

Here we note that in [6], Mumford gave a simple and deep expression of the shape of the elastica, which shows the depth, importance and beauty of this problem. There he showed how the reality condition $|\partial_s \gamma| = 1$ restricts the moduli of elliptic curves.
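The reality condition $|\partial_s \gamma| = 1$ for the modulus $l = 1$ case can be checked numerically. The sketch below assumes the loop-soliton form $\gamma(s) = s - (2/\alpha)(\tanh(\alpha s) - \sqrt{-1}\,\mathrm{sech}(\alpha s))$; this form, the function names `loop_soliton` and `speed`, and the finite-difference step are assumptions of the example, not statements taken from the references.

```python
import math

def loop_soliton(s, alpha=1.0):
    """Assumed genus-one (modulus 1) curve in the complex plane, as a complex number."""
    return s - (2.0 / alpha) * (math.tanh(alpha * s) - 1j / math.cosh(alpha * s))

def speed(s, alpha=1.0, h=1e-5):
    """|d gamma / d s| estimated by a central finite difference with step h."""
    return abs((loop_soliton(s + h, alpha) - loop_soliton(s - h, alpha)) / (2.0 * h))
```

Analytically, $\partial_s \gamma = 1 - 2\,\mathrm{sech}^2(\alpha s) - 2\sqrt{-1}\,\mathrm{sech}(\alpha s)\tanh(\alpha s)$, whose squared modulus is $(1 - 2\,\mathrm{sech}^2)^2 + 4\,\mathrm{sech}^2(1 - \mathrm{sech}^2) = 1$, so `speed` should return 1 up to discretization error for every $s$ and every $\alpha > 0$.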
6. Cohomology of a Loop Space

As we mentioned in the Introduction, in this section we digress from our analysis of the moduli of a quantized elastica and review arguments on a loop space over $S^2$ in the category of topological spaces Top, whose morphisms are continuous maps (an isomorphism is a homeomorphism, a monomorphism is an injective continuous map, and so on). Studies of a loop space in Top are well established, and its cohomological properties are well known, as in the textbook of Bott and Tu [30]. We can recognize the moduli space of a quantized elastica in $P$ as a loop space in the category of differential geometry DGeom. When we replace smooth functions with continuous functions and $P$ with $S^2$, respectively, it is expected that the moduli space of a quantized elastica in $P$ is related to that in Top. In this section, we review a loop space in Top and show its cohomological properties.

Definition 6.1 ([30]). Let $E$ and $X$ be topological spaces, and let $X$ have a good cover $U$. A map $\pi : E \to X$ is called a fibering if it satisfies the covering homotopy property: given a map $f : Y \to E$ from an arbitrary topological space $Y$ into $E$ and a homotopy $\bar{f}_t$ of $\bar{f} = \pi \circ f$ in $X$ ($Y \times [0, 1] \to X$, $\bar{f}_0 := \bar{f}$), there is a homotopy $f_t$ of $f$ which covers $\bar{f}_t$ ($Y \times [0, 1] \to E$ such that $\bar{f}_t := \pi \circ f_t$).

Definition 6.2 ([30]). (1) The path space of $X$ is defined to be the space $P(X)$ consisting of all the paths in $X$ with initial point $*$:
\[ P(X) := \{\text{maps } \mu : [0, 1] \to X \mid \mu(0) = * \in X\}. \]
(2) The loop space over $X$ with a fixed point is defined by
\[ \Omega X = \{\mu : [0, 1] \to X \mid \mu(0) = \mu(1) = * \in X\}. \]

In the category of topological spaces Top, $P$ and $S^2$ are identified by a homeomorphism as its morphism. Thus we give properties of the loop space over $S^2$ in Top as follows.

Theorem 6.3 ([30]). (1) $P(S^2)$ is a fibering whose fiber is $\Omega(S^2)$:
\[ \begin{array}{ccc} \Omega(S^2) & \longrightarrow & P(S^2) \\ & & \big\downarrow \\ & & S^2 \end{array} \]
(2) Its cohomology is torsionless and given by
\[ H^q(\Omega S^2, \mathbb{Z}) = \mathbb{Z} \quad \text{for } q \in \mathbb{Z}_{\ge 0} \]
as a module, and its algebraic structure is given by
\[ H^*(\Omega S^2, \mathbb{Z}) = E(x) \otimes_{\mathbb{Z}} \mathbb{Z}_\gamma(e), \]
September 1, 2003 11:49 WSPC/148-RMP
00172
On the Moduli of a Quantized Elastica in P and KdV Flows
615
where x and e are generators of $H^1(\Omega S^2, \mathbb{Z})$ and $H^2(\Omega S^2, \mathbb{Z})$ respectively (dim x = 1 and dim e = 2). Here E(x) is the exterior algebra $\mathbb{Z}[x]/(x^2)$ and $\mathbb{Z}_\gamma(e)$ is the divided polynomial algebra whose basis is $(1, e, e^2/2, e^3/3!, \ldots)$. In other words, the generator of $H^{2k+1}(\Omega S^2, \mathbb{Z})$ is $x \cdot e^k/k!$ and that of $H^{2k}(\Omega S^2, \mathbb{Z})$ is $e^k/k!$.

In order to prove Theorem 6.3, we recall without proof two well-known results in algebraic topology and triangulated categories [30].

Proposition 6.4 ([30]). Given a double complex $K = \oplus_{q,p \ge 0} K^{p,q}$, there is a spectral sequence $\{E_r, d_r\}$ converging to the total cohomology $H_D(K)$ such that each $E_r$ has a bigrading with $d_r : E_r^{p,q} \to E_r^{p+r,q-r+1}$ and
$$E_1^{p,q} = H_d^{p,q}(K), \qquad E_2^{p,q} = H_\delta^{p,q} H_d(K),$$
where d and δ are the differentials $d : K^{p,q} \to K^{p+1,q}$ and $\delta : K^{p,q} \to K^{p,q+1}$, and $D = d + (-1)^p \delta$.

We will consider the double complex for a fibering $\pi : E \to M$,
$$K^{p,q} := C^p(\pi^{-1} U, \Omega^q).$$
Here U is a good cover of M and $\Omega^q$ denotes the q-forms along the fiber.

Proposition 6.5 (Leray-Hirsch theorem [30]). Let $\pi : E \to X$ be a fibering with fiber F over a simply connected topological space X which has a good cover. Then
$$E_2^{p,q} = H^p(X, H^q(F, A)),$$
where A is a commutative ring. If $H^q(F, A)$ is a finitely generated A-module, then
$$E_2 = H^*(X; A) \otimes H^*(F; A).$$

Proof of Theorem 6.3 [30]. Since P(X) is contractible,
$$H^q(P(X)) = \begin{cases} \mathbb{Z} & \text{for } q = 0, \\ 0 & \text{otherwise,} \end{cases}$$
and since the spectral sequence must converge to $H^*(P(X))$, the differential $d_2$ on $E_2$ must be an isomorphism except in dimension 0.
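$$E_2:\quad \begin{array}{c|ccccc}
\vdots & \vdots & \vdots & \vdots & \vdots & \\
4 & \mathbb{Z} & 0 & \mathbb{Z} & 0 & \cdots \\
3 & \mathbb{Z} & 0 & \mathbb{Z} & 0 & \cdots \\
2 & \mathbb{Z} & 0 & \mathbb{Z} & 0 & \cdots \\
1 & \mathbb{Z} & 0 & \mathbb{Z} & 0 & \cdots \\
0 & \mathbb{Z} & 0 & \mathbb{Z} & 0 & \cdots \\ \hline
 & 0 & 1 & 2 & 3 & \cdots
\end{array}$$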
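Next we consider the algebraic properties. From Proposition 6.5, $E_2$ is the tensor product $H^*(\Omega S^2) \otimes H^*(S^2)$. Let v be a two-form of $S^2$. Then if $H^1(\Omega S^2)$ is denoted by $\mathbb{Z}x$, $E_2^{0,1}$ is expressed by $\mathbb{Z}x \otimes 1$. The differential $d_2$ in $E_2$, which is an isomorphism, acts on $x \otimes 1$ as $d_2(x \otimes 1) = 1 \otimes v$. Since
$$d_2(x^2 \otimes 1) = (d_2 x \otimes 1) \cdot (x \otimes 1) - (x \otimes 1) \cdot (d_2 x \otimes 1) = (1 \otimes v)(x \otimes 1) - (x \otimes 1)(1 \otimes v) = 0,$$
we have $x^2 = 0$ because $d_2$ is an isomorphism. Thus $d_2^{-1}(x \otimes v)$ is expressed by another generator e in $H^2(\Omega S^2)$, which is algebraically independent of x: $d_2(e \otimes 1) = x \otimes v$. Since $d_2(ex \otimes 1) = e \otimes v$, ex is a generator in dimension 3. Similarly $d_2(e^2 \otimes 1) = 2ex \otimes v$ means that $e^2/2$ is a generator in dimension 4. In other words, we have the following table:
$$E_2:\quad \begin{array}{c|ccccc}
\vdots & \vdots & \vdots & \vdots & \vdots & \\
4 & e^2/2 \otimes 1 & 0 & e^2/2 \otimes v & 0 & \cdots \\
3 & ex \otimes 1 & 0 & ex \otimes v & 0 & \cdots \\
2 & e \otimes 1 & 0 & e \otimes v & 0 & \cdots \\
1 & x \otimes 1 & 0 & x \otimes v & 0 & \cdots \\
0 & 1 & 0 & 1 \otimes v & 0 & \cdots \\ \hline
 & 0 & 1 & 2 & 3 & \cdots
\end{array}$$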
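The Leibniz computation used in this proof can be checked mechanically. The following is a minimal sketch, assuming only the two rules $d_2(x \otimes 1) = 1 \otimes v$ and $d_2(e \otimes 1) = x \otimes v$ from the proof; the dictionary representation and the names `d2_fiber` and `gamma` are illustrative choices, not notation from the text. It confirms that $d_2$ carries the divided powers $e^k/k!$ and $x\,e^k/k!$ onto one another with coefficient 1, which is why these, rather than $e^k$, are the integral generators.

```python
from fractions import Fraction
from math import factorial

# An element of Q[e] ⊗ E(x) is a dict {(k, a): coeff}, standing for
# sum of coeff * e^k * x^a with a in {0, 1} (x is odd, so x * x = 0).

def d2_fiber(elem):
    """Return w such that d2(elem ⊗ 1) = w ⊗ v.

    Assumes d2(x ⊗ 1) = 1 ⊗ v and d2(e ⊗ 1) = x ⊗ v, extended by Leibniz.
    """
    out = {}
    for (k, a), c in elem.items():
        if a == 0 and k > 0:
            # d2(e^k) = k e^(k-1) x ⊗ v
            key = (k - 1, 1)
            out[key] = out.get(key, 0) + c * k
        elif a == 1:
            # d2(e^k x) = e^k ⊗ v; the other Leibniz term contains x * x = 0
            key = (k, 0)
            out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def gamma(k, a=0):
    """Divided power e^k / k!, multiplied by x when a == 1."""
    return {(k, a): Fraction(1, factorial(k))}

for k in range(1, 6):
    # degree 2k maps onto the generator in degree 2k - 1
    assert d2_fiber(gamma(k)) == gamma(k - 1, 1)
    # degree 2k + 1 maps onto the generator in degree 2k
    assert d2_fiber(gamma(k, 1)) == gamma(k)
```

By contrast, `d2_fiber({(2, 0): 1})` returns `{(1, 1): 2}`, matching $d_2(e^2 \otimes 1) = 2ex \otimes v$ above.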
Hence Theorem 6.3 is proved.

Remark 6.6. We have shown the result for the loop space defined in Definition 6.2. There are also several studies of another loop space,
$$\{\gamma : S^1 \hookrightarrow S^2 \mid \text{smooth immersion}\},$$
and of its cohomology, which differs from the result in Theorem 6.3 [7]. Such a loop has a freedom in the choice of the starting point of $S^1$ in $S^2$. In this article, however, we are concerned with a loop space with a fixed point, as mentioned in Remark 2.15 and Definition 2.10. Accordingly we have mentioned only that result.

7. Topological Properties of Moduli $M^P_{\mathrm{elas}}$

Having reviewed in the previous section the cohomological properties of a loop space in Top, in this section we discuss its relation to our loop space in DGeom, that is, to the moduli space of a quantized elastica. We believe that such considerations are important for the quantization of an elastica and for the statistical mechanics of polymer physics [12, 13, 32, 55]. The loop spaces in both Top and DGeom are infinite dimensional spaces when we regard them as manifolds in an appropriate sense. Even though it is not known whether de Rham's theorem is applicable to such an infinite dimensional manifold, it is expected that the cohomological sequences should correspond to each other. More precisely, as we will show later, the closed condition and the reality condition $|\partial_s \gamma| = 1$ in the moduli space $M^P_{\mathrm{elas}}$ make its topological properties difficult. Thus we must tune the 0-dimension of the cohomology related to $M^P_{\mathrm{elas}}$.
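Then we will reach our third main result, Theorem 7.4, which implies that the cohomology of $M_{\mathrm{KdV}}$ reproduces Theorem 6.3 with $\mathbb{R}$ coefficients.

Since the loop space in Top is given with a fixed point, there is no translation freedom for the loop in $S^2$, which corresponds to our situation of the quotient of $E^0(C)$ in Definition 2.2. Further there is no freedom to change the origin of the loop in Top. Hence we must compare $\Omega S^2$ with MPelas rather than MPelas. Further $M^P_{\mathrm{elas},0} \approx M^P_{\mathrm{elas},1}/W_{t_1} \approx \mathrm{pt}$, which should be regarded as the same class because both are zero dimensional. The natural sequence $\mathcal{F}M^P_{\mathrm{elas}}$ might be
$$\mathcal{F}M^P_{\mathrm{elas}} : \emptyset \to F_1 M^P_{\mathrm{elas}} \hookrightarrow F_2 M^P_{\mathrm{elas}} \hookrightarrow \cdots \hookrightarrow F_{g-1} M^P_{\mathrm{elas}} \hookrightarrow F_g M^P_{\mathrm{elas}} \hookrightarrow F_{g+1} M^P_{\mathrm{elas}} \hookrightarrow \cdots.$$
Noting $M^P_{\mathrm{elas},g} := F_g M^P_{\mathrm{elas}}/F_{g+1} M^P_{\mathrm{elas}}$, and as we are concerned only with its topological properties, let us consider the related complex of vector spaces,
$$\mathcal{G}M^P_{\mathrm{elas}} : \emptyset \xrightarrow{\delta} F_1 M^P_{\mathrm{elas}}/W_{t_{0,1}} \xrightarrow{\delta} M^P_{\mathrm{elas},2}/W_{t_2} \xrightarrow{\delta} \cdots \xrightarrow{\delta} M^P_{\mathrm{elas},g-1}/W_{t_{g-1}} \xrightarrow{\delta} M^P_{\mathrm{elas},g}/W_{t_g} \xrightarrow{\delta} M^P_{\mathrm{elas},g+1}/W_{t_{g+1}} \xrightarrow{\delta} \cdots,$$
with the trivial map δ = 0, so that $\delta^2 = 0$. As each $M^P_{\mathrm{elas},g}/W_{t_g}$ is a finite dimensional vector space $\mathbb{R}^{g-1}$ thanks to Proposition 4.30, we have the de Rham complex $\mathcal{D}M^P_{\mathrm{elas},g}$ (g > 1),
$$\mathcal{D}M^P_{\mathrm{elas},g} : 0 \to \Omega^0(M^P_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \Omega^1(M^P_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \Omega^2(M^P_{\mathrm{elas},g}/W_{t_g}) \xrightarrow{d} \cdots,$$
and $\mathcal{D}M^P_{\mathrm{elas},1}$,
$$\mathcal{D}M^P_{\mathrm{elas},1} : 0 \to \Omega^0(F_1 M^P_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \Omega^1(F_1 M^P_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \Omega^2(F_1 M^P_{\mathrm{elas}}/W_{t_{0,1}}) \xrightarrow{d} \cdots,$$
where $\Omega^p(M)$ is the set of p-forms over M.

Proposition 7.1. Let us consider a double complex $\mathcal{C}M^P_{\mathrm{elas}}$ with the differential $D = d + (-1)^g \delta$,
$$0 \to \mathcal{D}M^P_{\mathrm{elas},1} \to \mathcal{D}M^P_{\mathrm{elas},2} \to \cdots \to \mathcal{D}M^P_{\mathrm{elas},g-1} \to \mathcal{D}M^P_{\mathrm{elas},g} \to \mathcal{D}M^P_{\mathrm{elas},g+1} \to \cdots.$$
Then its cohomology,
$$H^p(\mathcal{C}M^P_{\mathrm{elas}}) := \oplus_g H^{p-g+1}(\mathcal{D}M^P_{\mathrm{elas},g}),$$
is given by $H^0(\mathcal{C}M^P_{\mathrm{elas}}) := \mathbb{R}$ and
$$H^p(\mathcal{C}M^P_{\mathrm{elas}}) = \mathbb{R}\, dt_2 \wedge dt_3 \wedge \cdots \wedge dt_{p+1}, \qquad p > 0.$$

Proof. First we note
$$M^P_{\mathrm{elas},g}/W_t \approx \mathbb{R}^{g-1} \ \text{ for } g \ge 1, \qquad \mathcal{C}M^P_{\mathrm{elas},1}/W_t \approx \mathrm{pt}.$$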
Since we have for n ≥ 0 [30],
$$H^p(\mathbb{R}^n) = \mathbb{R} \quad \text{for } p = 0.$$
Due to Poincaré duality, we have
$$H^p(\mathbb{R}^n) = H_c^{n-p}(\mathbb{R}^n),$$
where $H_c^p$ denotes the cohomology with compact supports [30]. The generator is expressed by
$$dt_2 \wedge dt_3 \wedge \cdots \wedge dt_g,$$
multiplied by a function with compact support.

First, from Proposition 3.11(5), let us interpret $\Omega : \partial_{t_n} \mapsto \partial_{t_{n+1}}$ as an endomorphism of the tangent space $T_* J_g$ of the Jacobi variety of a hyperelliptic curve related to a point γ in $M^P_{\mathrm{elas}}$. Since the Jacobi variety is a quotient space of $\mathbb{C}^g$, its tangent space (and also its cotangent space) can be identified with $\mathbb{C}^g$: $T^* J_g \approx T_* J_g \approx \mathbb{C}^g$. Of course, we are concerned only with its real part $\mathbb{R}^g$. Then using the canonical duality in the real part $\mathbb{R}^g$,
$$\langle \partial_{t_n}, dt_m \rangle = \delta_{n,m},$$
we can introduce endomorphisms $\Omega^{-1*}$ and $\Omega^*$ of $M^P_{\mathrm{elas}}$,
$$\Omega^* : dt_n \mapsto dt_{n-1} = \Omega^* dt_n, \qquad \Omega^{-1*} : dt_n \mapsto dt_{n+1} = \Omega^{-1*} dt_n,$$
where $\langle \Omega \partial_{t_n}, dt_m \rangle = \langle \partial_{t_n}, \Omega^* dt_m \rangle$.
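The adjointness relation just stated is a statement about index shifts, and it can be checked on a truncated range of indices. The sketch below assumes vectors and covectors are stored as dictionaries from the index n to a coefficient; the cutoff `G` and the function names are hypothetical and serve only this illustration.

```python
G = 6  # hypothetical cutoff: only indices 1..G are tested

def pairing(vec, form):
    """Canonical pairing <d/dt_n, dt_m> = delta_{n,m}, extended bilinearly."""
    return sum(c * form.get(n, 0) for n, c in vec.items())

def omega(vec):
    """Omega: d/dt_n -> d/dt_{n+1} on vectors {n: coeff}."""
    return {n + 1: c for n, c in vec.items()}

def omega_star(form):
    """Omega^*: dt_n -> dt_{n-1} on covectors {n: coeff}."""
    return {n - 1: c for n, c in form.items()}

def omega_inv_star(form):
    """Omega^{-1*}: dt_n -> dt_{n+1} on covectors {n: coeff}."""
    return {n + 1: c for n, c in form.items()}

for n in range(1, G):
    for m in range(2, G + 1):
        d_n, dt_m = {n: 1}, {m: 1}
        # <Omega d/dt_n, dt_m> = <d/dt_n, Omega^* dt_m>
        assert pairing(omega(d_n), dt_m) == pairing(d_n, omega_star(dt_m))
        # Omega^{-1*} undoes Omega^*
        assert omega_inv_star(omega_star(dt_m)) == dt_m
```

Both sides of the relation reduce to $\delta_{n+1,m}$, which is all the check exercises.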
Definition 7.2. Let us define an endomorphism of MPelas by, := dt2 Ω−1∗ , where Ω−1∗ is regarded as a right action operator, q = dt2 Ω−1∗ (∧q−1 ) for q > 1 and Ω−1∗ · 1 := 1. Then we have the properties of as follows. Lemma 7.3. (1) We have the relation q · 1 = dt2 ∧ dt3 ∧ · · · ∧ dtq+1 . (2) can be realized by ˜, X ˜ := σ k , 0 := dt2 , k := dtk+2 ∧ (dtk+1 i∂tk+1 ) (k > 0) , k=0
where σ is a permutation operator
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & \cdots & q-1 & q \\ q & q-1 & q-2 & \cdots & 2 & 1 \end{pmatrix}$$
and i∂tk is an inner product ∗ operator ; i∂tk · dtl = h∂tk , dtl i = δlk . (3) There is a ring isomorphism, ϕ0 : R ⊗R E(x) ⊗R Zγ (e) → R[[2 , dt2 ]] by ϕ0 : (e, x) → (2 , dt2 ) ,
where the product in R[[2 , dt2 ]] is defined by ∗ dt2 = dt2 ∗ := · dt2 ,
∗ = 2 ,
dt2 ∗ dt2 = dt2 ∧ dt2 = 0 .
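Item (1) of Lemma 7.3 can be illustrated on small cases. The sketch below rests on one reading of Definition 7.2, stated here as an assumption: the operator first applies $\Omega^{-1*}$ to every factor of a form (shifting each $dt_n$ to $dt_{n+1}$ and fixing 1) and then wedges with $dt_2$ from the left. The operator is called `eps` in the code purely for illustration, and forms are stored as dictionaries from sorted index tuples to coefficients.

```python
def wedge_left(i, form):
    """Compute dt_i ∧ form for a form stored as {sorted index tuple: coeff}."""
    out = {}
    for idx, c in form.items():
        if i in idx:
            continue  # dt_i ∧ dt_i = 0
        pos = sum(1 for j in idx if j < i)  # factors dt_i must move past
        new = idx[:pos] + (i,) + idx[pos:]
        out[new] = out.get(new, 0) + (-1) ** pos * c
    return out

def shift_up(form):
    """Omega^{-1*} on every factor: dt_n -> dt_{n+1}; the constant 1 is fixed."""
    return {tuple(j + 1 for j in idx): c for idx, c in form.items()}

def eps(form):
    """Assumed reading of Definition 7.2: shift all factors, then wedge with dt_2."""
    return wedge_left(2, shift_up(form))

ONE = {(): 1}  # the constant form 1

def eps_power(q):
    """Apply eps q times to 1."""
    form = ONE
    for _ in range(q):
        form = eps(form)
    return form

for q in range(1, 6):
    # q-th power applied to 1 gives dt_2 ∧ dt_3 ∧ ... ∧ dt_{q+1}
    assert eps_power(q) == {tuple(range(2, q + 2)): 1}
```

Under this reading the sign is always +1, because after the shift every existing factor has index at least 3 and $dt_2$ never has to pass another factor.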
Proof. (1) For example 2 · 1 = dt2 Ω−1∗ (∧dt2 Ω−1∗ ) · 1 = dt2 ∧ dt3 and this can be extended to general case. (2) Noting 2k = 0, (k ≥ 0), straightforward computations gives the results. (3) Noting Theorem 6.3, it is obvious. Here we will note that : Hcg (Rg ) → Hcg (Rg+1 ) generates the sequence Rg ,→ Rg+1 , and thus m could be regarded as a generator of the filter topology of MPelas and MKdV . Thus it means that we can evaluate the moduli space of a quantized elastica MPelas using the induced topology and as in Proposition 4.26. Finally we reach our third main theorem. Theorem 7.4. By setting e = 2 , x = dt2 , the cohomology H q (CMPelas ), is a ring isomorphic to H q (ΩS 2 , R), φ : H ∗ (CMPelas ) → ˜ H ∗ (ΩS 2 , R) . Remark 7.5. (1) The closed condition γ(s + L) = γ(s) for some L and the reality condition |∂s γ| = 1 are too strong. For example due to the condition, CMPelas 0 and CMPelas 1 consist only of disjoint points as mentioned in Examples 5.21 and [13]. Thus if we assign real vector bases each point, these cohomology might be H p (CMPelas 0 ) = Rδ0,p and H p (CMPelas 1 ) = R ⊕ Rδ0,p . These phenomena come from a “elasticity” in the category DGeom but we wish to consider the topological properties of the loop space in DGeom. Thus we have replaced MPelas with CMPelas by loosing strongness of the condition and make its topology weak; it implies a replacement to fewer open sets. This replacement comes from modulo computations in the gauge transformation by Wt in the KdV equations, using the natural immersion iKdV : MPelas → MKdV . However for a sufficiently large g case, the closed condition and the reality condition might not have serious effects. Then the quotient by the gauge transformations can be also guaranteed by the fact that each moduli space of compact Riemannian surface of genus g is simply connected [56]. Accordingly we consider that the replacement is not worse. 
The isomorphism φ could be regarded as a functor between the triangle categories of the loop spaces in DGeom and Top and a quasi-isomorphism between CMPelas and ΩS 2 [7]. (These objects in the triangle categories are vector spaces given by n and (xa , em ) respectively. The morphisms are multiplications as their ring structures.) (2) From the definition, m can be regarded as a map from H q (CMPelas ) to q+m H (CMPelas ). We should regard that this map comes from the properties of vertex operator, which change the genus of curves [57] and m · 1 is interpreted as a topological base of CMPelas .
(3) The operator induces the complexes $\mathcal{F}M^P_{\mathrm{elas}}$ and $\mathcal{G}M^P_{\mathrm{elas}}$. This essentially exhibits the topology of Sato theory, because in Sato theory [23–25] the existence of the gauge transformation $W_t$ is a key factor. Theorem 7.4 means that its topology is as strong as that of a loop space in Top. It implies that the topology of Sato theory is too weak to express the fine structure of the moduli space, as Harris and Morrison pointed out in [50, pp. 44–45]; they stated that the geometrical approach in [26] does not influence the study of the moduli space of algebraic curves, including $M_{\mathrm{hyp}}$ [50]. In fact, as mentioned in [50, 52], the moduli space $M_{\mathrm{hyp}}$ is, in general, very complicated, whereas our approach is not so difficult. Accordingly we wish to obtain a stronger topology to express the moduli space. We hope that the day will come when the studies of quantized elastica are connected with those of $M_{\mathrm{hyp}}$, as Euler did for the case of genus one [1, 2, 4].

(4) As we will comment in Remark 8.9, the correspondence between loop spaces in DGeom and Top can be extended to higher dimensional loop spaces by considering a recent result on a quantized elastica in $\mathbb{R}^3$ [14].

8. Discussion

8.1. Although we have a correspondence between the homological properties of $\Omega S^2$ in Top and those of $M^P_{\mathrm{elas},g}$ in DGeom, the correspondence of homotopy groups between them remains an open problem, e.g.
$$\pi_{q-1}(\Omega S^2) = \pi_q(S^2) \quad (q \ge 2),$$
$$\pi_{q-1}(\Omega S^2) \times \mathbb{Q} = \begin{cases} \mathbb{Q} & \text{for } q = 1, 2, \\ 0 & \text{otherwise.} \end{cases}$$

8.2 [13]. We will consider $\gamma \in M^C_{\mathrm{elas}}$ in this remark. By defining
$$v = \frac{\partial_s^2 \gamma}{2\sqrt{-1}\,\partial_s \gamma},$$
this problem is related to the quantization of an elastica in $\mathbb{C}$,
$$Z[\beta] = \int_{M^C_{\mathrm{elas}}} D\gamma \exp\left(-\beta \int_{S^1} v^2\, ds\right).$$
For β > 0, the domain of $E = \int_{S^1} v^2$ can be extended to the ∞-point, and we will define
$$M^C_{\mathrm{elas}} = \{\gamma : S^1 \to \mathbb{C} \mid \gamma \text{ is continuous},\ |\partial_s \gamma(s)| = 1\}/\sim.$$
In other words, as we assign the energy of a γ with wild shape to the ∞-point of E, it does not contribute to the partition function Z. Then we can regard the partition function as
$$Z : M^C_{\mathrm{elas}} \times \mathbb{R}_{\ge 0} \to \mathbb{R}.$$
The integral region in Z is recognized as $M^C_{\mathrm{elas}}$. Due to our Theorem 3.4, we have a natural projection operator $\Pi_E$:
$$\Pi_E : M^C_{\mathrm{elas}} \to M^C_{\mathrm{elas},E}, \qquad \Pi_E^2 = \Pi_E.$$
We have a spectral decomposition,
$$1_{M^C_{\mathrm{elas}}} = \int dE\, \Pi_E.$$
Hence the partition function becomes
$$Z[\beta] = \int dE\, \mathrm{Vol}(M^C_{\mathrm{elas},E})\, e^{-\beta E},$$
where $\mathrm{Vol}(M^C_{\mathrm{elas},E})$ means the volume of $M^C_{\mathrm{elas},E}$. Here we comment on the question of why we can use the concept of the orbits of a "kinematic" system even though, in noncommutative algebra, one sometimes encounters cases where the concept of orbit makes no sense, e.g. the Kronecker foliation [58]. Even in the quantized problem, we can continue to use the concept of orbit and commutative geometry, even though the dimension of the orbit space need not be finite. Let us extend the domain of β from $\mathbb{R}_{\ge 0}$ to $\mathbb{R}_{\ge 0} + \infty$. Note that as the inverse image,
$$M^C_{\mathrm{elas},\mathrm{cls}} = Z^{-1}(Z(\infty)),$$
the classical moduli space of the harmonic map of the elastica, depending upon the boundary condition, is naturally immersed in our moduli space $M^C_{\mathrm{elas}}$. In other words, our analysis naturally contains Euler's perspective of the classical elastica [1–4].

8.3. Due to the projection operator, we can define an order in the moduli space $M^P_{\mathrm{elas}}$. Noting that the energy E is real in $M^C_{\mathrm{elas}}$, let
$$M^C_{\mathrm{elas},<E} := \coprod_{E' < E} M^C_{\mathrm{elas},E'}.$$
For $E_1 < E_2$, we have
$$M^C_{\mathrm{elas},<E_1} \subset M^C_{\mathrm{elas},<E_2}.$$
Then the moduli space MC elas is an ordered space. 8.4. The operator in Lemma 7.3 can be regarded as a creation operator in the quantum field theory. The vacuum state is regarded as 1. We can define the dual space of V ∞ ; hem , en i = δnm where en = dtn and em = ∂tm . Further by noting modulo 2 , we can reconstruct CMPelas in Lemma 4.31. On the other hand, we can introduce the micro-differential operator em (m ∈ Z) as the base of CMPelas as in the Definition 3.1 and Proposition 7.1. Then as the dual of CMPelas , we can define em (m < 0) and the vacuum of this field operator in the quantum field theory has affine structure as physicists think.
8.5. In the differential operator ring $D_s$, the integral $\int_{S^1} \partial_s u = 0$ means that, since the integral is a linear map, its kernel belongs to $D_s/\partial_s D_s$. Using Definition 3.7 and Proposition 3.11, let us define
$$h := \sum_j h_j\, dt_j, \qquad \delta := \sum_j dt_j\, \partial_{t_j}, \qquad a = u\, ds.$$
We have the transformation in $(D_s/\partial_s D_s)$:
$$\delta a = \tilde\Omega h, \qquad \tilde\Omega := ds\, \partial_s \frac{\delta}{\delta u}, \qquad \delta * h = 0.$$
This relation is called the Becchi–Rouet–Stora (BRS) relation [13, 59].

8.6. We will introduce a dilatation flow $\partial_t \psi_x = t\, \partial_s \psi_x$. The intersection between this flow and the KdV flow is governed by the Painlevé equation of the first kind,
$$s = 3u^2 + \partial_s^2 u.$$
This statement can be proved as follows. The KdV flow in Remark 3.8 is given by $B_1 = u$, while this flow has $B_1 = t$. Hence u = t and the KdV flow becomes
$$\partial_t u = 1 = \partial_s(3u^2 + \partial_s^2 u),$$
and we obtain the Painlevé equation of the first kind [13, 31].

8.7. Since the Schwarz derivative u is invariant under $\mathrm{PSL}_2(\mathbb{C})$ and $\mathrm{PSL}_2(\mathbb{C})$ acts transitively upon P, we can regard $M^P_{\mathrm{elas}}$ as
$$\Omega \mathrm{SL}_2(\mathbb{C}) := \{\gamma : S^1 \hookrightarrow \mathrm{PSL}_2(\mathbb{C}) \mid \gamma(0) = 1\}.$$
Because $\gamma(s) = g_s \gamma(0)$ for $g \in \mathrm{PSL}_2(\mathbb{C})$, we have the condition g(0) = g(2π). As Witten pointed out, for a loop space we can naturally construct its tangent space as a loop space of the tangent space of the target space [60]. In other words, we can naturally define a loop algebra $\Omega\, \mathrm{sl}_2(\mathbb{C})$. In the loop algebra, we have only the condition $g^{-1}dg(0) = g^{-1}dg(2\pi)$ using $g \in \mathrm{SL}_2(\mathbb{C})$, which is a weaker condition than g(0) = g(2π). Since there is a smooth map from $S^1$ to $S^1$ as $\mathrm{Diff}(S^1)$, we obtain an expression of the loop algebra,
$$\mathrm{Diff}(S^1) \otimes \mathrm{sl}_2(\mathbb{C}) \oplus \mathbb{C},$$
which acts upon $M^\infty_{\mathrm{KdV}}$ in Proposition 4.32 with the weaker condition. The KdV flow has a bi-Hamiltonian structure and a 2-cocycle
$$\omega_\Omega(X, Y) := \omega(\Omega X, Y) + \omega(X, \Omega Y).$$
Using the ordinary functional derivative (Gâteaux derivative $\delta u(y)/\delta u(x) = \delta(x - y)$), we can write down the (second) Poisson relation,
$$\{u(s), u(s')\} = \Omega\, \delta(s - s'),$$
where δ(s) is the Dirac δ-function. Let
$$l_n := \frac{1}{2\pi} \int ds\, u\kappa\, e^{isn}$$
denote its Fourier component. Then it obeys the semi-classical Virasoro algebra,
$$\{l_n, l_m\} = (n - m)\, l_{n+m} + n(n^2 - 1)\, \delta_{n+m,0},$$
where the second term corresponds to the unit central charge. We thus have the Virasoro algebra. Using the topological relation $\mathbb{C}^* \sim S^1$, the problem of conformal field theory is reduced to that of the loop algebra. Thus our relation can also be interpreted in the regime of conformal field theory, and our problem is related to two dimensional quantum gravity [50].

8.8. It is known that for $H_0 := \int u\, ds$, the second Poisson structure of $H_0$ reproduces the KdV equation; when the second Poisson bracket is defined as
$$\{X, Y\}_\Omega = \omega_\Omega(X, Y),$$
$\partial_t u = \{u, H_0\}_\Omega$ is $\partial_t u + 6u\, \partial_s u + \partial_s^3 u = 0$. If we use the Hamiltonian $H_n$ of the higher KdV flows as the energy functional of the system, we have another decomposition,
$$M^{P,(n)}_{\mathrm{elas}} = \coprod_E M^{P,(n)}_{\mathrm{elas},E}, \qquad M^{P,(n)}_{\mathrm{elas},E} := \{\gamma_t \in M^P_{\mathrm{elas}} \mid H_n - E = 0\}.$$
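As a sanity check on the sign convention of the KdV equation $\partial_t u + 6u\,\partial_s u + \partial_s^3 u = 0$ quoted in 8.8, one can test a known solution numerically. The sketch below is only an illustration under stated assumptions: the one-soliton profile $u = (c/2)\,\mathrm{sech}^2(\sqrt{c}\,(s - ct)/2)$ is the standard textbook solution for this normalisation and is not taken from the present article, and the derivatives are approximated by central differences with a hypothetical step `h`.

```python
import math

def soliton(s, t, c=1.0):
    """Standard one-soliton profile for u_t + 6 u u_s + u_sss = 0 (assumed)."""
    z = 0.5 * math.sqrt(c) * (s - c * t)
    return 0.5 * c / math.cosh(z) ** 2

def kdv_residual(u, s, t, h=1e-2):
    """Central-difference estimate of u_t + 6 u u_s + u_sss at the point (s, t)."""
    u_t = (u(s, t + h) - u(s, t - h)) / (2 * h)
    u_s = (u(s + h, t) - u(s - h, t)) / (2 * h)
    u_sss = (u(s + 2 * h, t) - 2 * u(s + h, t)
             + 2 * u(s - h, t) - u(s - 2 * h, t)) / (2 * h ** 3)
    return u_t + 6 * u(s, t) * u_s + u_sss

# The residual vanishes up to discretisation error for the travelling soliton.
for s, t in [(0.3, 0.1), (-1.0, 0.5), (2.0, -0.4)]:
    assert abs(kdv_residual(soliton, s, t)) < 1e-3

# A frozen (time-independent) profile is not a solution.
frozen = lambda s, t: soliton(s, 0.0)
assert abs(kdv_residual(frozen, 0.3, 0.1)) > 1e-2
```

The second assertion shows that the check is not vacuous: dropping the time dependence leaves a residual of order $c\,\partial_s u$.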
The space is determined by the n(> 1)th KdV hierarchy.

8.9 [14, 30]. According to the results in [7], we have the relation
$$H^q(\Omega S^n, \mathbb{Z}) = \mathbb{Z} \quad \text{for } q = 0 \text{ modulo } n - 1.$$
As we mentioned in Remark 7.5, it is expected that the moduli space of a quantized elastica in $S^n$ has similar cohomological properties. In fact, one of the authors calculated the quantized elastica in $\mathbb{R}^n$ and obtained the same structure of the moduli space of a quantized elastica in $\mathbb{R}^n$ [14].

8.10. We wish to know the volume of each $M^C_{\mathrm{elas},E}$. However this problem is not easy. In fact, as pointed out in [50], soliton theory might not yield any information on the structure of $M^C_{\mathrm{elas},E}$. In other words, our Theorem 7.4 means that the filter topology in soliton theory is too weak and is equivalent to the topological properties of the loop
space. It might have no effect on the study of the geometrical features of the moduli space of hyperelliptic curves. Thus we believe that we must go beyond the ordinary soliton theories to another theoretical world for the study of the moduli space of a quantized elastica, as Euler investigated the elliptic functions by studying the shape of classical elasticas [1–5].

8.11. First we note the relations for P, $\mathbb{C}$ and the upper half complex plane $\mathbb{H}$:
$$\begin{array}{lll}
P: & \mathrm{PSL}_2(\mathbb{C}): & \dfrac{a\gamma + b}{c\gamma + d} \\[2ex]
\mathbb{C}: & & a\gamma + b \\[2ex]
\mathbb{H}: & \mathrm{PSL}_2(\mathbb{R}): & \dfrac{a\gamma + b}{c\gamma + d}
\end{array}$$
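The link between loops on P and the KdV flow rests on the invariance of the Schwarz derivative under the $\mathrm{PSL}_2(\mathbb{C})$ action listed above (see 8.7). This invariance can be verified exactly on truncated power series. The sketch below assumes the usual definition $\{\gamma, s\} = \gamma'''/\gamma' - \tfrac{3}{2}(\gamma''/\gamma')^2$; the truncation order `N`, the sample coefficients of γ and the matrix entries are arbitrary choices made here, and rational entries are used so that the arithmetic is exact.

```python
from fractions import Fraction

N = 8  # number of power-series coefficients kept (hypothetical cutoff)

def mul(a, b):
    """Product of two truncated power series."""
    out = [Fraction(0)] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out

def inv(a):
    """Reciprocal of a truncated power series with a[0] != 0."""
    out = [Fraction(0)] * N
    out[0] = 1 / Fraction(a[0])
    for n in range(1, N):
        out[n] = -out[0] * sum(a[k] * out[n - k] for k in range(1, n + 1))
    return out

def deriv(a):
    """Derivative; the last coefficient is padding and carries no information."""
    return [Fraction(k) * a[k] for k in range(1, N)] + [Fraction(0)]

def schwarzian(g):
    """Series of g'''/g' - (3/2) (g''/g')^2; requires g'(0) != 0."""
    g1 = deriv(g)
    g2 = deriv(g1)
    g3 = deriv(g2)
    r = inv(g1)
    q2 = mul(g2, r)
    return [x - Fraction(3, 2) * y for x, y in zip(mul(g3, r), mul(q2, q2))]

def mobius(g, a, b, c, d):
    """Series of (a g + b) / (c g + d); requires c g(0) + d != 0."""
    num = [Fraction(a) * x for x in g]
    den = [Fraction(c) * x for x in g]
    num[0] += b
    den[0] += d
    return mul(num, inv(den))

# Arbitrary sample loop germ with gamma'(0) = 1.
gamma = [Fraction(x) for x in (0, 1, 2, -1, Fraction(1, 3), 5, 0, 7)]
image = mobius(gamma, 2, 1, 1, 3)  # ad - bc = 5

# Three derivatives are taken, so only the first N - 3 coefficients are reliable.
assert schwarzian(gamma)[:N - 3] == schwarzian(image)[:N - 3]
```

Because the identity holds for formal power series, the comparison is exact rather than approximate; only the last three coefficients are discarded, since each differentiation loses one order of the truncation.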
We showed that loops on P are related to the KdV flow and that loops on $\mathbb{C}$ are related to the MKdV flow. Next we should consider loops on $\mathbb{H}$.

8.12. One of the solutions of
$$\left(-\partial_s^2 - \frac{1}{2}\{\gamma, s\}_{SD}\right)\psi = 0$$
is given by $1/\sqrt{\partial_s \gamma}$. The coordinate transformation for $\mathrm{Diff}(S^1)$ leads us to redefine ψ as the invariant form $\sqrt{ds/d\gamma}$. This reminds us of the prime form and of the Dirac field, which has half weight, the same as the theta function [17, 19, 53]. In fact, for a curve in $\mathbb{C} \subset P$, there is a natural topology of γ induced from the distance in $\mathbb{C}$, which is given by the Frenet–Serret relation:
$$\begin{pmatrix} \partial_s & k/2 \\ -k/2 & \partial_s \end{pmatrix} \begin{pmatrix} 1/\sqrt{\partial_s \gamma} \\ i/\sqrt{\partial_s \gamma} \end{pmatrix} = 0.$$
This operator is regarded as the Dirac operator. The Dirac operator could be regarded as a translator from the category of analysis to the category of geometry. Hence, as we are dealing with the topology of the Dirac operator, we might have a stronger topology of the curve. We can extend this structure to a conformal surface in $\mathbb{R}^3$ as the generalized Weierstrass relation [33, 34, 55, 61]. We note that this Dirac operator (and the Schrödinger operator in Proposition 2.8) defined upon the loop space differs from the Dirac operator of Witten in [60], because Witten's is related to conformal field theory and the ordinary string, which is determined by intrinsic properties, whereas ours is related to the extrinsic Polyakov string [33, 34, 55].

8.13. As we noticed in 8.2, the partition function Z can be expressed by
$$Z = \int dE\, \mathrm{Vol}(M^C_{\mathrm{elas},E})\, e^{-\beta E},$$
where $\mathrm{Vol}(M^C_{\mathrm{elas},E})$ is formally represented by
$$\mathrm{Vol}(M^C_{\mathrm{elas},E}) = \sum_g \int_{M^C_{\mathrm{elas},E,g}} d\,\mathrm{vol}(J) \int_J dt_2\, dt_3 \cdots dt_g,$$
where $d\,\mathrm{vol}(J)$ is the volume form around a point J in $M^C_{\mathrm{elas},E,g}$ and $M^C_{\mathrm{elas},E,g} := M^C_{\mathrm{elas},E} \cap M^C_{\mathrm{elas},g}$. Then we leave the integral over $t_2$ undone in the above expression and obtain the partition function depending on the time $t_2$,
$$Z[t_2] = \int dE \sum_g \int_{M^C_{\mathrm{elas},E,g}} d\,\mathrm{vol}(J) \int_J dt_3 \cdots dt_g\, e^{-\beta E}.$$
Similarly we obtain $Z[t_2, t_3, \ldots, t_g]$, which is a generating function [39]. Then we can expect that it might obey the KdV equation or a related equation. This situation might be related to Witten's conjecture and Kontsevich's theorem [50].

Acknowledgments

One of us (S.M.) would like to thank Prof. F. Pedit and Prof. K. Tamano for critical discussions and for drawing his attention to this problem. It is acknowledged that Prof. K. Tamano has taught him algebraic topology and differential geometry based upon [30] and [7] over this decade and critically read this manuscript. He also thanks Prof. S. Saito, Prof. T. Tokihiro, W. Kawase and H. Mitsuhashi for helpful discussions and comments in the early stage of this study. Prof. K. Sogo privately suggested to him, before this study started, that soliton equations should be expressed in a projective space, and thus this study is one of the answers to his suggestions. He also thanks Prof. A. Kholodenko for telling him of the reference [7] and for much encouragement and discussion by e-mail, and Prof. B. L. Konopelchenko for kind letters encouraging his work. He is also grateful to Prof. Y. Ohnita, Prof. M. Guest, Dr. R. Aiyama and Prof. K. Akutagawa for inviting him to their seminars and for critical discussions, and especially to Prof. M. Guest for sending him the reference [10]. Further we thank Prof. J. McKay for his interest in this article; his kind comments encouraged us to revise the manuscript. Finally we would like to express our sincere thanks to the referee for appropriate suggestions, which improved this article.

References

[1] C. Truesdell, The influence of elasticity on analysis: The classic heritage, Bull. Amer. Math. Soc. 9 (1983), 293–310.
[2] C. Truesdell, Leonhardi Euleri Opera Omnia ser. Secunda XI; The Rational Mechanics of flexible or elastic bodies 1638–1788, Birkhäuser Verlag, Berlin, 1960.
[3] A. E. H. Love, A Treatise on the Mathematical Theory of Elasticity, Cambridge University Press, Cambridge, 1927.
[4] L. Euler, Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, Lausanne, 1744.
[5] A. Weil, Number Theory: an approach through history; From Hammurapi to Legendre, Birkhäuser, Cambridge, 1983.
[6] D. Mumford, "Elastica and computer vision" in Algebraic Geometry and its Applications, ed. C. Bajaj, Springer-Verlag, Berlin, 1993, pp. 507–518.
[7] J.-L. Brylinski, Loop Spaces Characteristic Classes and Geometric Quantization, Birkhäuser, Boston, 1992.
[8] M. A. Guest, Harmonic Maps, Loop Groups, and Integrable Systems (London Math. Soc. Student Text 38), Cambridge University Press, Cambridge, 1997.
[9] J. Langer and R. Perline, Poisson geometry of the filament equation, J. Nonlinear Sci. 1 (1991), 71–91.
[10] G. Segal, Topological Methods in Quantum Field Theory, eds. W. Nahm et al., World Scientific, Singapore, 1990, pp. 96–106.
[11] G. Segal and G. Wilson, Loop groups and equations of KdV type, IHES 61 (1985), 5–65.
[12] S. Matsutani, Geometrical construction of the Hirota bilinear form of the modified Korteweg-de Vries equation on a thin elastic rod: Bosonic classical theory, Int. J. Mod. Phys. A 22 (1995), 3109–3123.
[13] S. Matsutani, Statistical mechanics of elastica on plane: origin of MKdV hierarchy, J. Phys. A 31 (1998), 2705–2725.
[14] S. Matsutani, Statistical mechanics of elastica in R3, J. Geom. Phys. 29 (1999), 243–259.
[15] A. L. Kholodenko and T. A. Vilgis, Some geometrical and topological problems in polymer physics, Phys. Rep. 298 (1998), 251–370.
[16] F. Pedit, KdV flows on the Riemann sphere, a talk at the meeting on "Study on Integrability in Differential Geometry", Lecture at Tokyo Metropolitan University, Jan. 8–10, 1998.
[17] S. Matsutani, Closed loop solitons and sigma functions: Classical and quantized elasticas with Genera one and two, J. Geom. Phys. 39 (2001), 50–61.
[18] S. Matsutani, Hyperelliptic solutions of KdV and KP equations: Reevaluation of Baker's study on hyperelliptic sigma functions, J. Phys. A 34 (2001), 4721–4732.
[19] S. Matsutani, Hyperelliptic loop solitons with Genus g: Investigations of a quantized elastica, J. Geom. Phys. 43 (2002), 146–162.
[20] S. Matsutani, Explicit hyperelliptic solutions of modified Korteweg-de Vries equation: Essentials of Miura transformation, J. Phys. A. Math & Gen. 35 (2002), 4321–4333.
[21] C. MacLaughlin, Orientation and string structures on loop spaces, Pac. J. Math 155:1 (1992), 143–156.
[22] L. A. Dickey, Soliton Equations and Hamiltonian Systems, World Scientific, Singapore, 1991.
[23] M. Sato, D-Modules and nonlinear system, Adv. Stud. Pure Math. 19 (1989), 417–434.
[24] M. Sato and M. Noumi, Soliton Equation and Universal Grassmannian Manifold (in Japanese), Sophia University, Tokyo, 1984.
[25] M. Sato and Y. Sato, Soliton equations as dynamical systems on infinite dimensional Grassmann manifold, Nonlinear Partial Differential Equations in Applied Science, eds. H. Fujita, P. D. Lax and G. Strang, Kinokuniya/North-Holland, Tokyo, 1983.
[26] M. Mulase, Cohomological structure in soliton equations and Jacobian varieties, J. Diff. Geom. (1984), 403–430.
[27] I. M. Krichever, Methods of algebraic geometry in the theory of nonlinear equations, Russian Math. Surveys 32 (1977), 185–213.
[28] E. D. Belokolos, A. I. Bobenko, V. Z. Enol'skii, A. R. Its and V. B. Matveev, Algebro-Geometric Approach to Nonlinear Integrable Equations, Springer, New York, 1994.
[29] H. F. Baker, On a system of differential equations leading to periodic functions, Acta Math. 27 (1903), 135–156.
[30] R. Bott and L. W. Tu, Differential Forms in Algebraic Topology, Springer, New York, 1982.
[31] E. L. Ince, Ordinary Differential Equations, Dover, New York, 1956.
[32] S. Matsutani, On density of state of quantized Willmore surface: A way to a quantized extrinsic string in R3, J. Phys. A 31 (1998), 3595–3606.
[33] S. Matsutani, Dirac operator of a conformal surface immersed in R4: Further generalized Weierstrass relation, Rev. Math. Phys. 12 (2000), 431–444.
[34] S. Matsutani, Immersion anomaly of Dirac operator on surface in R3, Rev. Math. Phys. 11 (1999), 171–186.
[35] H. Poincaré, Papers on Fuchsian functions, J. Stillwell, Springer, 1985.
[36] H. F. Baker, Abelian Functions, Cambridge University Press, Cambridge, 1897.
[37] R. E. Goldstein and D. M. Petrich, The Korteweg-de Vries hierarchy as dynamics of closed curves in the plane, Phys. Rev. Lett. 67 (1991), 3203–3206.
[38] R. E. Goldstein and D. M. Petrich, Solitons, Euler's equation, and vortex patch dynamics, Phys. Rev. Lett. 67 (1992), 555–558.
[39] P. Ramond, Field Theory, A Modern Primer, Benjamin/Cummings, Massachusetts, 1981.
[40] R. Abraham and J. E. Marsden, Foundations of Mechanics, 2nd ed., Addison-Wesley, Reading, 1985.
[41] J. L. Burchnall and T. W. Chaundy, Commutative Ordinary Differential Operators, Proc. Royal Society London (A) 118 (1928), 557–583.
[42] J. L. Burchnall and T. W. Chaundy, Commutative Ordinary Differential Operators II, Proc. Royal Society London (A) 134 (1931), 471–485.
[43] H. F. Baker, Note on the foregoing paper "Commutative Ordinary Differential Operators" by J. L. Burchnall and T. W. Chaundy, Proc. Royal Society London (A) 118 (1928), 584–593.
[44] D. Mumford, An algebro-geometric construction of commuting operators and of solutions to the Toda lattice equation, Korteweg-de Vries equation and related nonlinear equation, Intl. Symp. on Algebraic Geometry (1977), Kyoto, 115–153.
[45] H. Whitney, Analytic extensions of differentiable functions defined in closed sets, Trans. Amer. Math. Soc. 36 (1934), 63–89.
[46] R. Hartshorne, Algebraic Geometry, Springer, Berlin, 1977.
[47] P. G. Drazin and R. S. Johnson, Solitons: An introduction, Cambridge University Press, Cambridge, 1989.
[48] S. Matsutani, The physical realization of the Jimbo-Miwa theory of the modified Korteweg-de Vries equation on a thin elastic rod: Fermionic theory, Int. J. Mod. Phys. A 10 (1995), 3091–3107.
[49] H. P. McKean and P. van Moerbeke, The spectrum of Hill's equation, Inventiones Math. 30 (1975), 217–274.
[50] J. Harris and I. Morrison, Moduli of Curves, Springer, New York, 1998.
[51] S. Iitaka, K. Ueno and Y. Namikawa, Spirits of Descartes and Algebraic Geometry (in Japanese), Nihon-Hyouron-Sha, Tokyo, 1980.
[52] D. Mumford, Curves and Their Jacobians, University of Michigan, Michigan, 1975.
[53] D. Mumford, Tata Lectures on Theta, Vol. II, Birkhäuser, Boston, 1983–1984.
[54] V. H. Buchstaber, V. Z. Enolskii and D. V. Leykin, Klein Function, Hyperelliptic Jacobians and applications, Rev. Math. & Math. Phys. 10 (1997), 3–120.
[55] B. G. Konopelchenko and G. Landolfi, Generalized Weierstrass representation for surface in multidimensional Riemann spaces, math.DG/9804144 (1998).
[56] C. Maclachlan, Modulus space is simply-connected, Proc. A.M.S. 29 (1971), 85–86.
[57] E. Date, M. Jimbo, M. Kashiwara and T. Miwa, Nonlinear Integrable Systems: Classical Theory and Quantum Theory, eds. M. Jimbo and T. Miwa, World Scientific, Singapore, 1983.
[58] A. Connes, Noncommutative Geometry, Academic Press, Singapore, 1994.
[59] J. M. Leinaas and K. Olaussen, Ghosts and geometry, Phys. Lett. 108B (1982), 199–202.
[60] E. Witten, Elliptic Curves and Modular Forms in Algebraic Topology, Proceedings Princeton 1986, ed. P. S. Landweber, Springer, Berlin, 1986.
[61] B. G. Konopelchenko, Induced surfaces and their integrable dynamics, Studies in Appl. Math. 96 (1996), 9–51.
September 1, 2003 10:14 WSPC/148-RMP
00170
Reviews in Mathematical Physics Vol. 15, No. 6 (2003) 629–641 c World Scientific Publishing Company
ENTANGLEMENT BREAKING CHANNELS
MICHAEL HORODECKI
Institute of Theoretical Physics and Astrophysics, University of Gdańsk, 80-952 Gdańsk, Poland

PETER W. SHOR
AT&T Labs Research, Florham Park, New Jersey 07922 USA

MARY BETH RUSKAI
Department of Mathematics, Tufts University, Medford, Massachusetts 02155 USA

Received 2 February 2003
Revised 30 May 2003
This paper studies the class of stochastic maps, or channels, for which $(I \otimes \Phi)(\Gamma)$ is always separable (even for entangled Γ). Such maps are called entanglement breaking, and can always be written in the form $\Phi(\rho) = \sum_k R_k \operatorname{Tr} F_k \rho$ where each $R_k$ is a density matrix and $F_k > 0$. If, in addition, Φ is trace-preserving, the $\{F_k\}$ must form a positive operator valued measure (POVM). Some special classes of these maps are considered and other characterizations given. Since the set of entanglement-breaking trace-preserving maps is convex, it can be characterized by its extreme points. The only extreme points of the set of completely positive trace preserving maps which are also entanglement breaking are those known as classical-quantum or CQ. However, for d ≥ 3, the set of entanglement breaking maps has additional extreme points which are not extreme CQ maps.

Keywords: Quantum channels; entanglement breaking maps; completely positive maps; CQ channels; separable states; extreme points.
1. Introduction
A quantum channel is represented by a stochastic map, i.e. a map which is both completely positive and trace-preserving. We will refer to these as CPT maps. In this paper we consider the special class of quantum channels which can be simulated by a classical channel in the following sense: The sender makes a measurement on the input state $\rho$, and sends the outcome $k$ via a classical channel to the receiver who
then prepares an agreed upon state $R_k$. Such channels can be written in the form
$$\Phi(\rho) = \sum_k R_k \,\mathrm{Tr}\, F_k \rho \tag{1}$$
where each $R_k$ is a density matrix and the $\{F_k\}$ form a positive operator valued measure (POVM). We call this the “Holevo form” because it was introduced by Holevo in [6]. It is also natural to consider the class of channels which break entanglement.
Definition 1. A stochastic map $\Phi$ is called entanglement breaking if $(I \otimes \Phi)(\Gamma)$ is always separable, i.e. any entangled density matrix $\Gamma$ is mapped to a separable one.
It is not hard to see that, as shown in the next section, a map is entanglement breaking if and only if it can be written in the form
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \,\langle\phi_k, \rho\,\phi_k\rangle \tag{2}$$
in which case it is necessarily completely positive. Furthermore, $\Phi$ is trace-preserving if and only if $\sum_k |\phi_k\rangle\langle\phi_k| = I$, in which case (2) is a special case of (1). One can show that the converse also holds, so that we have the following result.
Theorem 2. A channel can be written in the form (1) using positive semi-definite operators $F_k$ if and only if it is entanglement breaking. Such a map is also trace-preserving if and only if the $\{F_k\}$ form a POVM or, equivalently, $\sum_k |\phi_k\rangle\langle\phi_k| = I$.
The rather straightforward proof will be given in the next section together with some additional equivalences. We will refer to stochastic maps which are both entanglement-breaking and trace-preserving as EBT. Of course there are stochastic maps which are not of the form (1). In particular, conjugation with a unitary matrix is not EBT. Channels which break entanglement are particularly noisy in some sense, e.g. a qubit map is EBT if the image of the Bloch sphere collapses to a plane or a line. In the opposite direction, we will show that a channel in $d$ dimensions is not EBT if it can be written using fewer than $d$ Kraus operators.
Theorem 3. The set of EBT maps is convex.
Although this follows easily from the definition of entanglement breaking, it may be instructive to also show directly that the set of maps of the form (1) is convex. Let $\Phi$ and $\tilde\Phi$ denote such maps with density matrices $\{R_j\}_{j=1\cdots m}$ and $\{\tilde R_k\}_{k=1\cdots n}$ and POVM's $\{E_j\}_{j=1\cdots m}$ and $\{\tilde E_k\}_{k=1\cdots n}$ respectively. For any $\alpha \in [0, 1]$ the map
$$[\alpha\Phi + (1-\alpha)\tilde\Phi](\rho) = \sum_j R_j \,\mathrm{Tr}(\alpha E_j \rho) + \sum_k \tilde R_k \,\mathrm{Tr}[(1-\alpha)\tilde E_k \rho]$$
has the form (1) since $\{\alpha E_1, \alpha E_2, \ldots, \alpha E_m, (1-\alpha)\tilde E_1, \ldots, (1-\alpha)\tilde E_n\}$ is also a POVM. Note that we have used implicitly the idea of generating a new POVM as the convex combination of two POVM's. In this sense, the set of POVM's is also convex, and one might expect that the extreme points of the set of entanglement-breaking maps are precisely those with an extreme POVM and pure $R_k$. However, this is false; at the end of Sec. 3 of [18], the trine POVM is used to give an example of a qubit channel which is not extreme, despite the fact that the POVM is.
Certain subclasses of EBT maps are particularly important. Holevo called a channel
• classical-quantum (CQ) if each $F_k = |k\rangle\langle k|$ in the POVM is a one-dimensional projection. In this case, (1) reduces to $\Phi(\rho) = \sum_k R_k \langle k, \rho\, k\rangle$.
• quantum-classical (QC) if each density matrix $R_k = |k\rangle\langle k|$ is a one-dimensional projection and $\sum_k R_k = I$.
If a CQ map has the property that each density matrix $R_k = |\psi_k\rangle\langle\psi_k|$ is a pure state, we will call it an extreme CQ map. Note that the pure states $|\psi_k\rangle$ need not be orthonormal, or even linearly independent. We will see in Sec. 3 that extreme CQ maps are always extreme points of the set of EBT maps, but they are only extreme points for the set of CPT maps if all pairs $\langle\psi_j, \psi_k\rangle$ are nonzero.
When all $R_k = R$ are identical, then $\Phi$ is the maximally noisy map $\Phi(\rho) = R$ for all $\rho$. Because it maps all density matrices to the same $R$, its image is a single “point” in the set of density matrices and its capacity is zero. A point channel is extreme if and only if its image $R$ is a pure state. A point channel is a special case of a CQ map; however, because all $R_k = R$ the sum in (1) can be reduced to a single term with $E_1 = I$. For $d > 2$, one can also consider those CQ maps for which some $R_k$ are identical; then the POVM can be written as a projective measurement, and the image is a polyhedron.
It is useful to have Kraus operator representations of EBT maps. For $\Phi$ of the form (1), let $A_{kmn} = \sqrt{R_k}\,|m\rangle\langle n|\,\sqrt{F_k}$ where $\{|m\rangle\}$ and $\{|n\rangle\}$ are orthonormal bases. Then one easily verifies that
$$\sum_{kmn} A_{kmn}\,\rho\, A_{kmn}^\dagger = \sum_k R_k \,\mathrm{Tr}\, F_k \rho . \tag{3}$$
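The identity (3) can be checked numerically. The following is only a minimal sketch, assuming NumPy and a randomly generated Holevo-form channel; the helper names (`psd_sqrt`, `random_density`, `random_povm`, `kraus_operators`) are hypothetical and introduced purely for illustration.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a positive semi-definite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(M)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def random_density(d, rng):
    """Random density matrix (illustrative construction, not from the paper)."""
    X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    M = X @ X.conj().T
    return M / np.trace(M).real

def random_povm(d, n, rng):
    """n positive operators rescaled so that they sum to the identity."""
    ops = [random_density(d, rng) for _ in range(n)]
    s_inv_half = np.linalg.inv(psd_sqrt(sum(ops)))
    return [s_inv_half @ F @ s_inv_half for F in ops]

def kraus_operators(R, F):
    """A_{kmn} = sqrt(R_k) |m><n| sqrt(F_k) in the standard basis."""
    d = R[0].shape[0]
    basis = np.eye(d)
    ops = []
    for Rk, Fk in zip(R, F):
        sR, sF = psd_sqrt(Rk), psd_sqrt(Fk)
        for m in range(d):
            for n in range(d):
                ops.append(sR @ np.outer(basis[m], basis[n]) @ sF)
    return ops

rng = np.random.default_rng(0)
d, N = 3, 4
R = [random_density(d, rng) for _ in range(N)]
F = random_povm(d, N, rng)
A = kraus_operators(R, F)
rho = random_density(d, rng)

# Left and right sides of (3)
lhs = sum(a @ rho @ a.conj().T for a in A)
rhs = sum(Rk * np.trace(Fk @ rho) for Rk, Fk in zip(R, F))
kraus_error = np.max(np.abs(lhs - rhs))

# Trace preservation: sum_k A_k^dagger A_k should equal I when {F_k} is a POVM
completeness_error = np.max(np.abs(sum(a.conj().T @ a for a in A) - np.eye(d)))
```

Both `kraus_error` and `completeness_error` should vanish up to floating-point rounding; the second check reflects the fact that each $R_k$ has unit trace and the $F_k$ sum to the identity.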
For CQ and QC maps these operators reduce to $A_{km} = \sqrt{R_k}\,|m\rangle\langle k|$ and $A_{kn} = |k\rangle\langle n|\,\sqrt{F_k}$ respectively. Moreover, if all density matrices are pure states $R_k = |\psi_k\rangle\langle\psi_k|$, then one can achieve a further reduction to $A_k = |\psi_k\rangle\langle k|$ in the case of CQ maps.
Holevo [6] showed that for EBT maps the Holevo capacity (i.e. the capacity of a quantum channel used for classical communication with product inputs) is additive. This result was extended by King [13] to additivity of the capacity of channels of the form $\Phi \otimes \Omega$ where $\Phi$ is CQ or QC and $\Omega$ is completely arbitrary. Shor [20] then proved the additivity of minimal entropy and Holevo capacity when $\Phi$ is EBT and
$\Omega$ arbitrary. Quite recently, King [14] showed that the maximal $p$-norms of EBT channels are multiplicative, and used this to give another proof of Shor's additivity results for minimal entropy and Holevo capacity. In a related development, Vidal, Dür and Cirac [22] used Shor's techniques to prove additivity of the entanglement of formation for a class of mixed states associated with EBT maps.
As it is important to understand the differences between those channels which break entanglement and those which preserve it, we seek other characterizations of these channels, describe their extreme points, and examine their properties. Results for qubits are given in a related paper [18] which follows. Some analysis of entanglement breaking channels was also independently presented by Verstraete and Verschelde [21].
2. Equivalent Conditions
In this section, we establish a number of equivalent characterizations of EBT maps, some of which were already discussed in the previous section.
Theorem 4. The following are equivalent
(A) $\Phi$ has the Holevo form (1) with $F_k$ positive semi-definite.
(B) $\Phi$ is entanglement breaking.
(C) $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable for $|\beta\rangle = d^{-1/2} \sum_j |j\rangle \otimes |j\rangle$ a maximally entangled state.
(D) $\Phi$ can be written in operator sum form using only Kraus operators of rank one.
(E) $\Upsilon \circ \Phi$ is completely positive for all positivity preserving maps $\Upsilon$.
(F) $\Phi \circ \Upsilon$ is completely positive for all positivity preserving maps $\Upsilon$.
A corresponding equivalence holds for CPT and EBT maps with the additional conditions that $\{F_k\}$ is a POVM, the Kraus operators $A_k$ satisfy $\sum_k A_k^\dagger A_k = I$, and $\Upsilon$ is trace-preserving.
To prove this result, we will make use of the correspondence [2, 12] between maps and states given by $\Phi \leftrightarrow (I \otimes \Phi)(|\beta\rangle\langle\beta|)$. (Also see [1] in this context.)
Proof. To show that (A) ⇒ (B) note that when $\Phi$ has the form (1),
$$(I \otimes \Phi)(\Gamma) = \sum_k R_k \otimes T_2\big(\sqrt{E_k}\, \Gamma \sqrt{E_k}\big) = \sum_k \gamma_k\, R_k \otimes Q_k$$
where $T_2$ denotes the partial trace, $\gamma_k = \mathrm{Tr}\, E_k \Gamma$ and $Q_k = \frac{1}{\gamma_k} T_2\big(\sqrt{E_k}\, \Gamma \sqrt{E_k}\big)$. Thus, for arbitrary $\Gamma$, $(I \otimes \Phi)(\Gamma)$ is separable.
The implication (B) ⇒ (C) is trivial. To see that (C) ⇒ (A), observe that since $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable, one can find normalized vectors $|v_n\rangle$ and $|w_n\rangle$
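for which
$$(I \otimes \Phi)(|\beta\rangle\langle\beta|) \equiv \frac{1}{d} \sum_{jk} |j\rangle\langle k| \otimes \Phi(|j\rangle\langle k|) \tag{4}$$
$$= \sum_n p_n\, |v_n\rangle\langle v_n| \otimes |w_n\rangle\langle w_n| . \tag{5}$$
Now let $\Omega$ be the map
$$\Omega(\rho) = d \sum_n |w_n\rangle\langle w_n| \,\mathrm{Tr}\big(\rho\, p_n |v_n\rangle\langle v_n|\big) . \tag{6}$$
Then one easily verifies that
$$(I \otimes \Omega)(|\beta\rangle\langle\beta|) = \sum_{jkn} |j\rangle\langle k| \otimes |w_n\rangle\langle w_n| \, p_n \langle j, v_n\rangle\langle v_n, k\rangle = \sum_n p_n\, |v_n\rangle\langle v_n| \otimes |w_n\rangle\langle w_n|$$
where we have used $|v_n\rangle = \sum_j |j\rangle\langle j, v_n\rangle$. Since a map $\Phi$ is uniquely determined by its action on the basis $|j\rangle\langle k|$, and hence by the action of $(I \otimes \Phi)$ on $|\beta\rangle\langle\beta|$, we can conclude that $\Phi = \Omega$.
For trace-preserving maps, we also need to verify that $\{d\, p_n |v_n\rangle\langle v_n|\}$ is a POVM. Taking the partial trace of (5), and using the fact that $\Phi$ is trace-preserving yields
$$T_2[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = \frac{1}{d} \sum_{jk} |j\rangle\langle k| \otimes \mathrm{Tr}(|j\rangle\langle k|) = \frac{1}{d} I = \sum_n p_n\, |v_n\rangle\langle v_n|$$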
which is the desired result. Moreover, we have also shown that (C) ⇒ (D).
To show that (D) ⇒ (A), suppose that $\Phi(\rho) = \sum_k A_k \rho A_k^\dagger$ with $A_k = |w_k\rangle\langle u_k|$ (taking $|w_k\rangle$ normalized). Then the map $\Phi$ can be written in the form (1) with $R_k = |w_k\rangle\langle w_k|$ and $F_k = |u_k\rangle\langle u_k|$. Moreover, when $\sum_k A_k^\dagger A_k = I$, then $\sum_k |u_k\rangle\langle u_k| = I$ so that $F_k = |u_k\rangle\langle u_k|$ defines a POVM.
The equivalence of (E) and (B) follows easily from the fact that a density matrix $\Gamma$ is separable if and only if $(I \otimes \Omega)(\Gamma) > 0$ for all positivity preserving maps $\Omega$ [7]. To see that this is equivalent to (F), it suffices to observe that $\Omega$ is positivity preserving if and only if its adjoint $\hat\Omega$ is and that $\widehat{\Phi \circ \Upsilon} = \hat\Upsilon \circ \hat\Phi$, where the adjoint is taken with respect to the Hilbert-Schmidt inner product so that $\mathrm{Tr}\,[\Omega(A)]^\dagger B = \mathrm{Tr}\, A^\dagger\, \hat\Omega(B)$.
It may be interesting to recall that $\Upsilon$ is trace-preserving if and only if $\hat\Upsilon$ is unital so that the adjoint of a positivity and trace preserving map preserves POVM's. Thus, when $\Phi$ has the form (1), the map $\Phi \circ \Upsilon$ is achieved by replacing $E_k$ by $\hat\Upsilon(E_k)$.
Conditions (E) and (F) could be weakened slightly since it would suffice to check either for all $\Upsilon$ in some set of entanglement witnesses for the space on which $\Phi$ acts. However, one does not expect to be able to weaken them beyond this. Indeed, [5] and [9] contain examples of channels which preserve PPT entanglement,
but break other types, i.e. the channel output $(I \otimes \Phi)(\Gamma)$ is entangled, yet the partial transpose $(I \otimes T)$ acting on it always yields a positive semi-definite state $(I \otimes T \circ \Phi)(\Gamma) \geq 0$. Alternatively, one could also consider maps which are not EBT, but break particular types of entanglement.
3. Extreme Points
We now give some results about the extreme points of the convex set of EBT maps. In this section we will use some additional results from Choi [2] who observed that $\Phi$ is completely positive if and only if $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is positive semi-definite. When $\Phi$ is written in the operator sum form
$$\Phi(\rho) = \sum_k A_k \rho A_k^\dagger \tag{7}$$
the Kraus operators $A_k$ can be chosen as the eigenvectors of $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ with strictly positive (i.e. nonzero) eigenvalue. (See Leung [16] for a nice exposition.) Choi [2] also showed that $\Phi$ is extreme in the set of CPT maps if and only if the set $\{A_j^\dagger A_k\}$ is linearly independent. Since both (7) and this linear independence are preserved when $A_i \mapsto \sum_j u_{ij} A_j$, a sufficient condition for $\Phi$ to be an extreme EBT map is that $\{A_j^\dagger A_k\}$ is linearly independent for some set of operators $\{A_k\}$ satisfying (7). Note that the condition that $\Phi$ is also trace-preserving becomes $\sum_k A_k^\dagger A_k = I$.
Recall that an extreme CQ map is one which can be written in the form
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \,\langle e_k, \rho\, e_k\rangle \tag{8}$$
with the vectors $\{e_k\}$ orthonormal. We can summarize our results as follows.
Theorem 5.
(A) If $\Phi$ is an extreme CQ map, then $\Phi$ is an extreme point in the set of EBT maps.
(B) If $\Phi$ is an extreme CQ map, then $\Phi$ is an extreme point in the set of CPT maps if and only if $\langle\psi_j, \psi_k\rangle \neq 0 \ \forall\, j, k$ when it is written in the form (8).
(C) If $\Phi$ is both in the set of EBT maps and an extreme point of the CPT maps, then $\Phi$ is an extreme CQ map.
(D) When $d = 2$, the extreme points of the set of EBT maps are precisely the extreme CQ maps. When $d \geq 3$ there are extreme EBT maps which are not CQ.
Proof. To prove (A) we assume that $\Phi = a\Phi_1 + (1 - a)\Phi_2$ with $\Phi_1, \Phi_2 \neq \Phi$, $0 < a < 1$ and $\Phi_1, \Phi_2$ both EBT. Both $\Phi_1, \Phi_2$ can be written in the form (2). By combining these, one finds one can write
$$\Phi(\rho) = \sum_j t_j\, |\phi_j\rangle\langle\phi_j| \,\langle f_j, \rho\, f_j\rangle \tag{9}$$
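with $\Phi_1, \Phi_2$ having the same form, but different $t_j \geq 0$. By assumption, $\Phi$ can be written in the form (8) with $|e_k\rangle$ orthonormal so that
$$\Phi(|e_k\rangle\langle e_k|) = |\psi_k\rangle\langle\psi_k| = \sum_j t_j\, |\langle e_k, f_j\rangle|^2\, |\phi_j\rangle\langle\phi_j| . \tag{10}$$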
Since all $t_j \geq 0$, the rank one projection $|\psi_k\rangle\langle\psi_k|$ is a linear combination with non-negative coefficients of the projections $|\phi_j\rangle\langle\phi_j|$. This is possible only if those projections $|\phi_j\rangle\langle\phi_j|$ which have nonzero coefficients in (10) are identical to the projection $|\psi_k\rangle\langle\psi_k|$. Hence, we can conclude that every projection $|\phi_j\rangle\langle\phi_j|$ in (9) is equal to one of the projections $|\psi_k\rangle\langle\psi_k|$ in (8). Let us now relabel the projections $|\psi_{k'}\rangle\langle\psi_{k'}|$ so that they are all distinct and let $E_{k'} = \sum_{i \in k'} |e_i\rangle\langle e_i|$ where the sum is taken over those $e_i$ for which the associated projection in (8) is $|\psi_{k'}\rangle\langle\psi_{k'}|$. Then $\{E_{k'}\}$ gives a partition of $I$ into mutually orthogonal projections, i.e. a von Neumann measurement, and we can write (dropping the primes for simplicity)
$$\Phi(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \,\mathrm{Tr}\, E_k \rho . \tag{11}$$
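We can also write
$$\Phi_1(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \,\mathrm{Tr}\, F_k \rho \tag{12}$$
$$\Phi_2(\rho) = \sum_k |\psi_k\rangle\langle\psi_k| \,\mathrm{Tr}\, G_k \rho \tag{13}$$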
with $\{F_k\}$ and $\{G_k\}$ each a POVM. Since the $|\psi_{k'}\rangle\langle\psi_{k'}|$ were chosen to be distinct and the $E_{k'}$ mutually orthogonal, it follows that $\Phi = a\Phi_1 + (1 - a)\Phi_2$ if and only if $E_k = aF_k + (1 - a)G_k$. Since $0 \leq F_k, G_k \leq I$, this is possible only if $F_k = G_k = E_k$. But then we have shown that $\Phi_1 = \Phi_2 = \Phi$, which proves part (A).
To prove (B) note that the Kraus operators can be chosen as $A_k = |\psi_k\rangle\langle e_k|$. Thus, $A_j^\dagger A_k = \langle\psi_j, \psi_k\rangle\, |e_j\rangle\langle e_k|$ which yields a linearly independent set if and only if none of the $\psi_j$ are mutually orthogonal. But this is precisely Choi's condition for the map to be extreme in the set of all CPT maps.
The proof of part (C) requires Lemma 8 which is of interest in its own right. The proof of (D) when $d = 2$ is given in the following paper [18] on qubit EBT maps, while the counter-example establishing (D) for $d \geq 3$ is given below.
Remark. Recall that a QC map can be written in the form
$$\Phi(\rho) = \sum_k |e_k\rangle\langle e_k| \,\mathrm{Tr}\, \rho F_k \tag{14}$$
with the vectors $\{e_k\}$ orthonormal. Such maps can never be extreme in the set of CPT maps; their Kraus operators always include a subset of the form $A_k = |e_k\rangle\langle v_k| G_k$ which cannot satisfy Choi's linear independence condition due to the orthogonality of the $\{e_k\}$. In the case of qubits, QC maps are not even extreme
in EBT, unless they are also CQ. However, for $d = 4$, one can have extreme EBT maps which are QC but not CQ.
Example. Let $\{g_k\}$ be orthonormal and consider the POVM consisting of a “trine” on $\mathrm{span}\{g_1, g_2\}$ and the projection onto $\mathrm{span}\{g_3, g_4\}$, i.e.
$$E_1 = \tfrac{2}{3} |g_1\rangle\langle g_1| , \quad E_2 = \tfrac{2}{3} |g_+\rangle\langle g_+| , \quad E_3 = \tfrac{2}{3} |g_-\rangle\langle g_-| , \quad E_4 = |g_3\rangle\langle g_3| + |g_4\rangle\langle g_4|$$
where $|g_\pm\rangle = \tfrac{1}{2} |g_1\rangle \pm \tfrac{\sqrt{3}}{2} |g_2\rangle$. Then $\Phi(\rho) = \sum_{k=1}^{4} |e_k\rangle\langle e_k| \,\mathrm{Tr}\, \rho E_k$ is an extreme EBT map, which is QC, but not CQ.
To see that $\Phi$ is extreme it suffices to observe that it is essentially the direct sum of maps $\Phi_A \oplus \Phi_B$ where $\Phi_A : \mathbb{C}^2 \mapsto \mathbb{C}^3$ with $\Phi_A(\rho) = \sum_{k=1}^{3} |e_k\rangle\langle e_k| \,\mathrm{Tr}\, \rho E_k$ and $\Phi_B : \mathbb{C}^2 \mapsto \mathbb{C}^1$ with $\Phi_B(\rho) = |e_4\rangle\langle e_4|$ for all $\rho$. $\Phi_A$ is extreme because it is the adjoint of an extreme CQ map, and $\Phi_B$ is the only CPT map from $\mathbb{C}^2$ to $\mathbb{C}^1$. We used the fact that the proof of part (A) of Theorem 5 extends easily to maps from $\mathbb{C}^{d'}$ to $\mathbb{C}^{d}$ with $d' < d$.
A map which is both CQ and QC projects a density matrix $\rho$ onto its diagonal in a fixed orthonormal basis. One can generalize this to CPT maps which take a density matrix to its projection onto a block-diagonal one. Such maps have the form $\Phi(\rho) = \sum_k E_k \rho E_k$ where $E_k$ are the projections in a von Neumann measurement; they are not EBT when at least one of the projections has rank $> 1$. The map in the example above is a generalization of CQ in the sense that it is the composition of a block diagonal projection together with an EBT map, and thus could be regarded as “block CQ”. In a similar spirit, one might regard an extreme CQ map for which the $\psi_k$ can be split into two mutually orthogonal subsets as “block QC”. With respect to CPT, maps which are both block QC and block CQ could be considered as generalizations of the quasi-extreme points introduced in [19] for stochastic maps on $\mathbb{C}^2$.
We now give some results about the number of Kraus operators associated with EBT maps.
Theorem 6. If a CPT map $\Phi$ can be written with fewer than $d$ Kraus operators, then it is not EBT.
Proof.
This follows from the fact [2] that $\Phi$ can always be written using at most $r \equiv \mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)]$ Kraus operators. However, it was shown in [11] that if $r < d$, then $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is not separable and, hence, $\Phi$ does not break the entanglement of the state $|\beta\rangle\langle\beta|$.
Alternatively, one could observe that if $r < d$, then at least one eigenvalue of $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is greater than $1/d$, while its left reduced density matrix has all eigenvalues equal to $1/d$ (since $\Phi$ is CPT). However, in Ref. 8 it was shown that if a state is separable, then its maximal eigenvalue must not exceed the maximal eigenvalue of either subsystem.
Lemma 7. If $\Phi$ is a CPT map for which $\mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = d$, then $\Phi$ is EBT if and only if $T \circ \Phi$ is completely positive.
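Theorem 6 and Lemma 7 can be illustrated with two small examples. The sketch below is only an illustration under stated assumptions: it uses NumPy, the function names are hypothetical, and the two channels (conjugation with a random unitary, and the CQ map that keeps only the diagonal in a fixed basis) are chosen here for convenience rather than taken from the text. It computes the rank of $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ and the smallest eigenvalue of its partial transpose.

```python
import numpy as np

def choi_state(kraus, d):
    """(I (x) Phi)(|beta><beta|) for Phi(rho) = sum_k A_k rho A_k^dagger."""
    beta = np.eye(d).reshape(d * d) / np.sqrt(d)  # d^{-1/2} sum_j |j>(x)|j>
    out = np.zeros((d * d, d * d), dtype=complex)
    for A in kraus:
        v = np.kron(np.eye(d), A) @ beta
        out += np.outer(v, v.conj())
    return out

def partial_transpose(M, d):
    """Transpose on the second tensor factor of a d^2 x d^2 matrix."""
    return M.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)

def rank_and_min_pt_eigenvalue(kraus, d):
    C = choi_state(kraus, d)
    rank = np.linalg.matrix_rank(C, tol=1e-10)
    min_pt = np.linalg.eigvalsh(partial_transpose(C, d)).min()
    return rank, min_pt

d = 3
rng = np.random.default_rng(1)

# Conjugation with a unitary: a single Kraus operator, so r = 1 < d.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
rank_u, min_pt_u = rank_and_min_pt_eigenvalue([Q], d)

# CQ map rho -> sum_k |k><k| <k, rho k>: Kraus operators |k><k|, so r = d.
dephasing = [np.outer(e, e) for e in np.eye(d)]
rank_cq, min_pt_cq = rank_and_min_pt_eigenvalue(dephasing, d)
```

For the unitary conjugation the rank is 1 and the partial transpose has a negative eigenvalue, so the state is entangled, in line with Theorem 6. For the diagonal CQ map the rank is exactly $d$ and the partial transpose is positive semi-definite, which is the situation covered by Lemma 7.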
Lemma 7 follows immediately from a (non-trivial) result in [10] which implies that a $d^2 \times d^2$ density matrix of rank $d$ is separable if and only if it has positive partial transpose.
The following lemma is of some interest since one can find examples [4] of separable matrices of rank $d$ whose decomposition into product pure states requires more than $d$ products. The additional hypothesis that the reduced density matrix $\rho_A = \mathrm{Tr}_B\, \rho$ also has rank $d$ is crucial. The lemma was first proven in [10]. Here we present a simpler proof.
Lemma 8. Let $\rho$ be a density matrix on $\mathcal{H}_A \otimes \mathcal{H}_B$. If $\rho$ is separable, $\rho$ has rank $d$, and $\rho_A = \mathrm{Tr}_B\, \rho$ has rank $d$, then $\rho$ can be written as a convex combination of products of pure states using at most $d$ products.
Proof. Since $\rho$ is separable it can be written in the form
$$\rho = \sum_{i=1}^{k} \lambda_i\, |a_i\rangle\langle a_i| \otimes |b_i\rangle\langle b_i| . \tag{15}$$
Assume that $k > d$ and that $\rho$ cannot be written in the form (15) using fewer than $k$ products. Since $\rho_A$ has exactly rank $d$, there is no loss of generality in assuming that the vectors above have been chosen so that $|a_1\rangle, |a_2\rangle, \ldots, |a_d\rangle$ are linearly independent. Moreover, since $\rho$ has rank $d < k$, the first $d + 1$ vectors $|a_i\rangle \otimes |b_i\rangle$ must be linearly dependent so that one can find $\alpha_j$ such that
$$\sum_{j=1}^{d+1} \alpha_j\, |a_j\rangle \otimes |b_j\rangle = 0 . \tag{16}$$
Now let $\{|e_k\rangle\}$ be an orthonormal basis for $\mathcal{H}_B$. Then
$$\sum_{j=1}^{d+1} \alpha_j \langle e_k, b_j\rangle\, |a_j\rangle = 0 \quad \forall\, k . \tag{17}$$
Since the first $d$ vectors $|a_j\rangle$ are linearly independent, there is a vector $x$ in $\mathbb{C}^{d+1}$ such that $\sum_j v_j |a_j\rangle = 0$ if and only if $v$ is a multiple of $x$. Applying this to the coefficients in (17) one finds that there are numbers $\nu_k$ such that $\alpha_j \langle e_k, b_j\rangle = \nu_k x_j$. Let $|\nu\rangle$ be the vector $\sum_k \nu_k |e_k\rangle$. Then $\alpha_j |b_j\rangle = x_j |\nu\rangle$. Since $|b_j\rangle$ was chosen to have norm 1, it follows that when $\alpha_j \neq 0$, $|x_j / \alpha_j| = 1$ and $|b_j\rangle = e^{i\theta_j} |\nu\rangle$. Thus, one can rewrite (15) as
$$\rho = \sum_{j : \alpha_j = 0} \lambda_j\, |a_j\rangle\langle a_j| \otimes |b_j\rangle\langle b_j| + \sum_{j : \alpha_j \neq 0} \lambda_j\, |a_j\rangle\langle a_j| \otimes |\nu\rangle\langle\nu| . \tag{18}$$
Suppose that $t$ of the $\alpha_j$ are nonzero. Since the vectors $\{|a_j\rangle : \alpha_j \neq 0\}$ are linearly dependent, the density matrix $\tilde\rho_A = \sum_{j : \alpha_j \neq 0} \lambda_j |a_j\rangle\langle a_j|$ has rank strictly $< t$ and can be rewritten in the form $\tilde\rho_A = \sum_{k=1}^{t'} \lambda'_k |a'_k\rangle\langle a'_k|$ using only $t' < t$ vectors. Substituting this in (18) gives $\rho$ as a convex combination of products using strictly fewer than $k$ products, contradicting the assumption that (15) used the minimum number.
Proof of (C). If $\Phi$ can be written with fewer than $d$ Kraus operators, it is not entanglement breaking; and if it requires more than $d$ Kraus operators, it is not extreme. Hence we can assume that $\mathrm{rank}[(I \otimes \Phi)(|\beta\rangle\langle\beta|)] = d$. The result then follows from Lemma 8.
We now show that, for $d = 3$, the set of entanglement breaking maps has extreme points which are not CQ. Moreover, unlike the $d = 4$ example considered earlier, there is no decomposition into orthogonal blocks associated with this map.
Counterexample. Let $|0\rangle, |1\rangle, |2\rangle$ be an orthonormal basis for $\mathbb{C}^3$ and consider the following four vectors corresponding to the vertices of a tetrahedron
$$|v_0\rangle = \tfrac{1}{\sqrt{3}} (+|0\rangle + |1\rangle + |2\rangle)$$
$$|v_1\rangle = \tfrac{1}{\sqrt{3}} (+|0\rangle - |1\rangle - |2\rangle)$$
$$|v_2\rangle = \tfrac{1}{\sqrt{3}} (-|0\rangle + |1\rangle - |2\rangle)$$
$$|v_3\rangle = \tfrac{1}{\sqrt{3}} (-|0\rangle - |1\rangle + |2\rangle)$$
and let
$$\Phi(\rho) = \frac{3}{4} \sum_{i=0}^{3} |v_i\rangle\langle v_i| \,\mathrm{Tr}\, \rho\, |v_i\rangle\langle v_i| . \tag{19}$$
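Two properties of the map (19) are easy to confirm numerically: the operators $\frac{3}{4}|v_i\rangle\langle v_i|$ sum to the identity, so $\Phi$ is trace-preserving, and the image of each $|w_{ij}\rangle\langle w_{ij}|$ is the equal mixture of $|v_i\rangle\langle v_i|$ and $|v_j\rangle\langle v_j|$, which is the relation (21) used in the argument that follows. The snippet is a small sketch assuming NumPy; the function names are illustrative, and the vectors $w_{ij}$ are obtained here from a cross product, which is one convenient way to get a unit vector orthogonal to the two remaining $v_k$.

```python
import numpy as np
from itertools import combinations

# Rows are the tetrahedron vectors |v_0>, ..., |v_3> of the counterexample.
signs = np.array([[+1, +1, +1],
                  [+1, -1, -1],
                  [-1, +1, -1],
                  [-1, -1, +1]], dtype=float)
v = signs / np.sqrt(3)

def phi(rho):
    """The map (19); the vectors are real, so <v|rho|v> = v @ rho @ v."""
    return 0.75 * sum(np.outer(vi, vi) * (vi @ rho @ vi) for vi in v)

# Trace preservation: (3/4) sum_i |v_i><v_i| should be the identity.
povm_sum = 0.75 * sum(np.outer(vi, vi) for vi in v)

def w(i, j):
    """Unit vector orthogonal to the two v_k with k not in {i, j}."""
    k, l = [m for m in range(4) if m not in (i, j)]
    x = np.cross(v[k], v[l])
    return x / np.linalg.norm(x)

# Largest deviation from (21) over the six pairs i < j.
max_dev = 0.0
for i, j in combinations(range(4), 2):
    wij = w(i, j)
    out = phi(np.outer(wij, wij))
    target = 0.5 * (np.outer(v[i], v[i]) + np.outer(v[j], v[j]))
    max_dev = max(max_dev, np.max(np.abs(out - target)))
```

Up to sign, `w(0, 1)` is the vector $\frac{1}{\sqrt{2}}(|1\rangle + |2\rangle)$ quoted in the text.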
We now show that $\Phi$ is an extreme point for the set of entanglement-breaking maps. To see this, first recall that any entanglement breaking map $\Psi$ can be written as
$$\Psi(\rho) = \sum_i \alpha_i\, |y_i\rangle\langle y_i| \,\mathrm{Tr}\, \rho\, |z_i\rangle\langle z_i| . \tag{20}$$
Let $\Psi$ be one of the entanglement breaking maps whose convex combination is $\Phi$, and let $|y\rangle$ and $|z\rangle$ be $|y_i\rangle$ and $|z_i\rangle$ for some fixed $i$ in the above expression for $\Psi$. Now, consider the six vectors $|w_{ij}\rangle$ for $i < j$, where these are defined so that $\langle w_{ij} | v_k\rangle = 0$ for $k \neq i, j$. For example, $|w_{01}\rangle = \tfrac{1}{\sqrt{2}} (|1\rangle + |2\rangle)$. Then,
$$\Phi(|w_{ij}\rangle\langle w_{ij}|) = \frac{1}{2} \big(|v_i\rangle\langle v_i| + |v_j\rangle\langle v_j|\big) \tag{21}$$
so for input $|w_{ij}\rangle\langle w_{ij}|$, the output has rank 2 and is orthogonal to $w_{kl}$, where $i, j, k, l$ are all distinct. We thus have that for $|y\rangle$ and $|z\rangle$,
$$\langle w_{ij} | y\rangle = 0 \quad \text{or} \quad \langle w_{kl} | z\rangle = 0 \tag{22}$$
where $\{i, j, k, l\}$ is any permutation of $\{0, 1, 2, 3\}$, as above.
Now, consider $|y\rangle$. Suppose it is orthogonal to two of $w_{01}$, $w_{02}$, and $w_{12}$. Then, we must have $|y\rangle = |v_3\rangle$. This means that $|y\rangle$ is not orthogonal to $w_{23}$, $w_{13}$ and
$w_{03}$, which implies in turn that $|z\rangle$ is orthogonal to $w_{01}$, $w_{02}$ and $w_{12}$, showing that $|z\rangle = |v_3\rangle$ as well.
The other case is when $y$ is not orthogonal to at least two of the above three vectors $w_{01}$, $w_{02}$, $w_{12}$; we can assume by symmetry that these two are $w_{01}$ and $w_{02}$. Then $z$ is orthogonal to $w_{23}$ and $w_{13}$, showing that $|z\rangle = |v_0\rangle$. By the same reasoning as in the last paragraph, we now have that $|y\rangle = |v_0\rangle$ as well.
Thus, all the $y_i$ and $z_i$ in the above expression for $\Psi$ must be one of the four vectors $v_j$. It follows easily from this that $\Psi = \Phi$. Moreover, we have shown that the Holevo form for $\Phi$ is essentially unique. Hence $\Phi$ cannot be written in the form required for it to be a CQ map.
Note that $\Phi$ is not extreme in the set of CPT maps. In fact, it can be represented as a convex combination of CPT maps in several ways. For example, it can be written as the convex combination of the identity map, with weight $\tfrac{1}{3}$, and the average of the three CP maps that first project the state into one of the three planes $\{|0\rangle, |1\rangle\}$, $\{|0\rangle, |2\rangle\}$, $\{|1\rangle, |2\rangle\}$, and then apply the $\sigma_x$ operator for that plane interchanging the two basis states, with weight $\tfrac{2}{3}$. It can also be written as a convex combination of the identity and the four maps corresponding to conjugation with a unitary map which reflects across the plane orthogonal to one of the vectors $|v_j\rangle$.
4. Representations in Bases
Let $G_0 = d^{-1/2} I$ and let $G_1 \cdots G_{d^2-1}$ be a basis for the subspace of self-adjoint $d \times d$ matrices with trace zero which is orthonormal in the sense $\mathrm{Tr}\, G_j^* G_k = \delta_{jk}$. Then $\{G_k\}$, $k = 0, 1 \cdots d^2 - 1$ is an orthonormal basis for the subspace of self-adjoint $d \times d$ matrices and every density matrix can be written in the form
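$$\rho = \frac{1}{d} I + \sum_{j=1}^{d^2-1} w_j G_j = \sum_{j=0}^{d^2-1} w_j G_j \tag{23}$$
with $w_j = \mathrm{Tr}\, \rho G_j$ so that $w_0 = d^{-1/2}$. It then follows that
$$\sum_{j=0}^{d^2-1} w_j^2 = \mathrm{Tr}\, \rho^2 \leq \mathrm{Tr}\, \rho = 1 \quad \text{and} \quad \sum_{j=1}^{d^2-1} w_j^2 \leq \frac{d-1}{d} .$$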
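As a concrete check of the expansion (23) and the two bounds above, the following sketch builds one possible orthonormal self-adjoint basis and expands a random density matrix in it. The construction (symmetric, antisymmetric and diagonal traceless matrices, in the style of generalized Gell-Mann matrices) is an assumption made here for illustration; the text does not fix a particular basis, and the function name `hermitian_basis` is hypothetical.

```python
import numpy as np

def hermitian_basis(d):
    """G_0 = I/sqrt(d) followed by d^2 - 1 traceless Hermitian matrices,
    orthonormal with respect to Tr(G_j^* G_k)."""
    basis = [np.eye(d, dtype=complex) / np.sqrt(d)]
    for j in range(d):
        for k in range(j + 1, d):
            S = np.zeros((d, d), dtype=complex)
            S[j, k] = S[k, j] = 1 / np.sqrt(2)
            A = np.zeros((d, d), dtype=complex)
            A[j, k] = -1j / np.sqrt(2)
            A[k, j] = 1j / np.sqrt(2)
            basis += [S, A]
    for l in range(1, d):
        D = np.zeros((d, d), dtype=complex)
        D[np.arange(l), np.arange(l)] = 1.0
        D[l, l] = -l
        basis.append(D / np.sqrt(l * (l + 1)))
    return basis

d = 3
G = hermitian_basis(d)

# Orthonormality: the Gram matrix Tr(G_j^* G_k) should be the identity.
gram = np.array([[np.trace(Gj.conj().T @ Gk) for Gk in G] for Gj in G])

# A random density matrix and its coefficients w_j = Tr(rho G_j).
rng = np.random.default_rng(2)
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = X @ X.conj().T
rho /= np.trace(rho).real
w = np.array([np.trace(rho @ Gj).real for Gj in G])

rho_rebuilt = sum(wj * Gj for wj, Gj in zip(w, G))
purity = np.trace(rho @ rho).real
```

Here `w[0]` equals $d^{-1/2}$, `np.sum(w**2)` reproduces $\mathrm{Tr}\,\rho^2$, and the sum over $j \geq 1$ stays below $(d-1)/d$.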
Then any linear (and hence stochastic) map $\Phi$ on the self-adjoint $d \times d$ matrices can be represented as a $d^2 \times d^2$ matrix $\mathbf{T}$ with elements $t_{jk} = \mathrm{Tr}\, G_j \Phi(G_k)$. Now let $\Phi$ be a Holevo channel with density matrices $R_k = \sum_j w_j^k G_j$ and POVM $F_k = \sum_n u_n^k G_n$ ($k = 1 \cdots N$) and write $\rho = \sum_i x_i G_i$. Then it is straightforward to verify that $t_{jn} = \sum_k w_j^k u_n^k$. Thus, $\mathbf{T} = W^T U$ where $W$ and $U$ are the $d^2 \times N$ matrices with elements $w_{jk} = w_j^k$ and $u_{nk} = u_n^k$ respectively. The condition that $\{F_k\}$ is a POVM is precisely that the first row of $\mathbf{T}$ is $(1, 0, \ldots, 0)$.
Such representations have been studied in more detail for qubits using the Pauli matrices for $G_k$. Recently, several generalizations have been considered for
$d = 3$ [15] and higher [3, 17]. Another natural choice of basis has $G_{jk} = |j\rangle\langle k|$ for some orthonormal basis $|j\rangle$. In this case some modifications are needed since $I = \sum_k G_{kk}$. For $j < k$, one could also replace $G_{jk}$, $G_{kj}$ by $2^{-1/2}(G_{jk} \pm G_{kj})$ which act like $\sigma_x$ and $i\sigma_y$ for the two-dimensional subspace $\mathrm{span}\{|j\rangle, |k\rangle\}$. Unfortunately, when $d > 2$, the requirement that $R_k$ and $F_k$ are positive semi-definite does not seem easily related to a condition between $u_0$ and $\sum_{j=1}^{d^2-1} u_j^2$ in any of these bases. Hence, such representations seem most useful for qubits, as discussed in [18].
For a CQ or QC channel, $W$ and $U$ are $d^2 \times d$ which implies $\mathrm{rank}(\mathbf{T}) \leq d$. Hence the image of a QC or CQ channel lies in a subspace of dim $\leq d - 1$. This raises the question of whether or not a stochastic map for which the image of the set of density matrices lies in a subspace of sufficiently small dimension is always entanglement breaking. (This is true for qubits for which all planar maps are EBT.)
For a basis in which a necessary condition for positive semi-definiteness is $\sum_{i=1}^{d^2-1} |x_i|^2 \leq x_0^2$, one can show that EBT implies $\sum_{j=1}^{d^2-1} |t_{jj}| \leq 1$. For details, see Ref. 18.
In general, a matrix $\mathbf{T}$ can be written as a product in many ways. We have shown that $\mathbf{T}$ represents an entanglement-breaking map if it can be decomposed into a product $\mathbf{T} = W^T U$ whose elements $W$, $U$ have very special properties. There is also a correspondence between the matrix $\mathbf{T}$ which represents $\Phi$ in a basis in the usual sense and the matrix $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$. It would seem that the requirement that $(I \otimes \Phi)(|\beta\rangle\langle\beta|)$ is separable is related to the product decomposition of $\mathbf{T}$; however, we have not analyzed this. It may be more amenable to the filtering approach advocated by Verstraete and Verschelde [21].
Acknowledgment
Part of this work was done while the authors participated in the program on Quantum Computation at the Mathematical Sciences Research Institute at Berkeley in November, 2002. The work of M.H. is supported by EC, grant EQUIP (IST-1999-11053), RESQ (IST-2001-37559) and QUPRODIS (IST-2001-38877). The work of M.B.R. was partially supported by the National Security Agency (NSA) and Advanced Research and Development Activity (ARDA) under Army Research Office (ARO) contract numbers DAAG55-98-1-0374 and DAAD19-02-1-0065, and by the National Science Foundation under Grant number DMS-0074566.
References
[1] C. H. Bennett, D. P. DiVincenzo, J. Smolin and W. K. Wootters, Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (1996), 3824–3851, quant-ph/9604024.
[2] M.-D. Choi, Completely positive linear maps on complex matrices, Lin. Alg. Appl. 10 (1975), 285–290.
[3] J. Cortese, The Holevo–Schumacher–Westmoreland channel capacity for a class of qudit unital channels, quant-ph/0211093.
[4] D. P. DiVincenzo, B. M. Terhal and A. V. Thapliyal, Optimal decompositions of barely separable states, J. Mod. Optics 47 (2000), 377–385, quant-ph/9904005.
[5] D. P. DiVincenzo, P. W. Shor, J. A. Smolin, B. M. Terhal and A. V. Thapliyal, Evidence for bound entangled states with negative partial transpose, Phys. Rev. A 61, 062312 (2000), quant-ph/9910026.
[6] A. S. Holevo, Coding theorems for quantum channels, Russian Math. Surveys 53 (1999), 1295–1331, quant-ph/9809023.
[7] M. Horodecki, P. Horodecki and R. Horodecki, Separability of mixed states: necessary and sufficient conditions, Phys. Lett. A223 (1996), 1–8.
[8] M. Horodecki and P. Horodecki, Reduction criterion of separability and limits for a class of protocols of entanglement distillation, Phys. Rev. A 59 (1999), 4206–4216, quant-ph/9708015.
[9] M. Horodecki, P. Horodecki and R. Horodecki, Binding entanglement channels, J. Mod. Opt. 47 (2000), 347–354, quant-ph/9904092.
[10] P. Horodecki, M. Lewenstein, G. Vidal and I. Cirac, Operational criterion and constructive checks for the separability of low rank density matrices, Phys. Rev. A 62, 032310 (2000), quant-ph/0002089.
[11] P. Horodecki, J. Smolin, B. Terhal and A. Thapliyal, Rank two bound entangled states do not exist, J. Theor. Comp. Sci. 292 (2003), 589–596, ArXiv.org preprint quant-ph/9910122.
[12] A. Jamiolkowski, Linear transformations which preserve trace and positive semidefiniteness of operators, Rep. Math. Phys. 3 (1972), 275–278.
[13] C. King, Maximization of capacity and $l_p$ norms for some product channels, J. Math. Phys. 43 (2002), 1247–1260.
[14] C. King, Maximal p-norms of entanglement breaking channels, Quant. Information and Computation 3 (2003), 186–190, quant-ph/0212057.
[15] C. King, Capacity of the depolarizing channel, Lecture in workshop on Quantum Information and Cryptography at Mathematical Sciences Research Institute (November, 2002). http://www.msri.org/publications/ln/msri/2002/quantumcrypto/king/1/index.html
[16] D. Leung, Choi’s proof as a recipe for quantum process tomography, J. Math. Phys. 44 (2003), 528–533, quant-ph/0201119.
[17] A. O. Pittenger and M. H. Rubin, Separability and Fourier representations of density matrices, Phys. Rev. A 62, 032313 (2000), quant-ph/0001014.
[18] M. B. Ruskai, Qubit entanglement breaking maps, quant-ph/0302032, Rev. Math. Phys. 15 (2003), 643–662.
[19] M. B. Ruskai, S. Szarek and W. Werner, An analysis of completely positive trace-preserving maps on $M_2$, Lin. Alg. Appl. 347 (2002), 159–187, quant-ph/0101003.
[20] P. W. Shor, Additivity of the classical capacity of entanglement-breaking quantum channels, J. Math. Phys. 43 (2002), 4334–4340, quant-ph/0201149.
[21] F. Verstraete and H. S. Verschelde, On one-qubit channels, ArXiv.org preprint quant-ph/0202124, version 1.
[22] G. Vidal, W. Dür and J. I. Cirac, Entanglement cost of bipartite mixed states, Phys. Rev. Lett. 89, 027901 (2002), quant-ph/0112131.
September 1, 2003 12:19 WSPC/148-RMP
00171
Reviews in Mathematical Physics, Vol. 15, No. 6 (2003) 643–662 © World Scientific Publishing Company
QUBIT ENTANGLEMENT BREAKING CHANNELS
MARY BETH RUSKAI∗ Department of Mathematics Tufts University, Medford, Massachusetts 02155 [email protected]
Received 2 February 2003
Revised 30 May 2003
This paper continues the study of stochastic maps, or channels, for which $(I \otimes \Phi)(\Gamma)$ is always separable in the case of qubits. We give a detailed description of entanglement-breaking qubit channels, and show that such maps are precisely the convex hull of those known as classical-quantum channels. We also review the complete positivity conditions in a canonical parameterization and show how they lead to entanglement-breaking conditions.
Keywords: Quantum channels; entanglement breaking maps; completely positive maps; CQ channels; separable states; extreme points; qubit complete positivity conditions.
1. Introduction
The preceding paper [11] studied the class of stochastic maps which break entanglement. For a given map $\Phi$ this means that $(I \otimes \Phi)(\Gamma)$ is separable for any density matrix $\Gamma$ on a tensor product space. It was observed that a map is entanglement breaking if and only if it can be written in one of the following equivalent forms
$$\Phi(\rho) = \sum_k R_k \,\mathrm{Tr}\, F_k \rho \tag{1}$$
$$= \sum_k |\psi_k\rangle\langle\psi_k| \,\langle\phi_k, \rho\, \phi_k\rangle \tag{2}$$
where each $R_k$ is a density matrix and $F_k$ a positive semi-definite operator. The map $\Phi$ is also trace-preserving if and only if $\sum_k F_k = \sum_k |\phi_k\rangle\langle\phi_k| = I$, in which case the set $\{F_k\}$ forms a POVM. Henceforth we will only consider trace-preserving maps and use the abbreviations CPT for those which are also completely positive and EBT for those which are also entanglement breaking. An EBT map is called classical-quantum (CQ) if each $F_k = |k\rangle\langle k|$ is a one-dimensional projection; it is quantum-classical (QC) if each density matrix $R_k = |k\rangle\langle k|$ is a one-dimensional projection.
∗ Partially supported by the National Security Agency (NSA) and Advanced Research and Development Activity (ARDA) under Army Research Office (ARO) contract numbers DAAG55-98-1-0374 and DAAD19-02-1-0065, and by the National Science Foundation under Grant number DMS-0074566.
Maps which break entanglement can always be simulated using a classical channel; thus, one is primarily interested in those which preserve entanglement. Nevertheless, it is important to understand the distinction. In this paper we restrict attention to EBT maps on qubits, for which one can obtain a number of results which do not hold for general EBT maps.
The main new result, which does not hold in higher dimensions, is that every qubit EBT map can be written as a convex combination of maps in the subclass of CQ maps defined above. Before proving this result in Sec. 6, we review parameterizations and complete positivity conditions for qubit maps. We also give a number of more specialized results which use the canonical parameterization and/or the fact that positivity of the partial transpose suffices to test entanglement for states on pairs of qubits.
Recall that any CPT map $\Phi$ on qubits can be represented by a matrix in the canonical basis of $\{I, \sigma_1, \sigma_2, \sigma_3\}$. When $\rho = \tfrac{1}{2}[I + v \cdot \sigma]$, then $\Phi(\rho) = \tfrac{1}{2}[I + (t + T v) \cdot \sigma]$ where $t$ is the vector with elements $t_k = t_{0k}$, $k = 1, 2, 3$ and $T$ is a $3 \times 3$ matrix, i.e. $\mathbf{T} = \begin{pmatrix} 1 & 0 \\ t & T \end{pmatrix}$. Moreover, it was shown in [14] that we can assume without loss of generality (i.e. after suitable change of bases) that $T$ is diagonal so that $\mathbf{T}$ has the canonical form
$$\mathbf{T} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ t_1 & \lambda_1 & 0 & 0 \\ t_2 & 0 & \lambda_2 & 0 \\ t_3 & 0 & 0 & \lambda_3 \end{pmatrix} . \tag{3}$$
The conditions for complete positivity in this representation were obtained in [16] and are summarized in Sec. 4.
In the case of qubits, Theorem 4 of [11] can be extended to give several other equivalent characterizations.
Theorem 1. For trace-preserving qubit maps, the following are equivalent (A) (B) (C) (D)
Φ has the Holevo form (1) with {Fk } a POVM. Φ is entanglement breaking. Φ ◦ T is completely positive, where T (ρ) = ρT is the transpose. Φ has the “sign-change” property that changing any λk → −λk in the canonical form (3) yields another completely positive map. (E) Φ is in the convex hull of CQ maps.
Conditions (C) through (E) are special to qubits. Conditions (C) and (D) use the fact [4, 8, 10, 15] that the PPT (positive partial transpose) condition for separability is also sufficient in the case of qubits.
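As an illustration of how conditions (B), (C) and (D) can be compared numerically, the following is a minimal sketch, not part of the original argument. It assumes the canonical parameterization $(t, \lambda)$ of (3), builds the unnormalized Choi matrix $\sum_{ij} E_{ij} \otimes \Phi(E_{ij})$, and uses the PPT test on that matrix as the entanglement-breaking criterion. All function names (`apply_channel`, `choi`, `is_eb_ppt`, `is_eb_sign_change`) are hypothetical helpers introduced only for this example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def apply_channel(t, lam, A):
    """Linear extension of Phi(rho) = (1/2)[I + (t + diag(lam) v).sigma] to any 2x2 A."""
    tr = np.trace(A)
    out = tr * I2
    for tk, lk, sk in zip(t, lam, PAULIS):
        out = out + (tk * tr + lk * np.trace(sk @ A)) * sk
    return out / 2

def choi(t, lam):
    """Unnormalized Choi matrix sum_ij E_ij (x) Phi(E_ij)."""
    J = np.zeros((4, 4), dtype=complex)
    for i in range(2):
        for j in range(2):
            E = np.zeros((2, 2), dtype=complex)
            E[i, j] = 1
            J += np.kron(E, apply_channel(t, lam, E))
    return J

def is_psd(M, tol=1e-10):
    """Positive semi-definiteness up to a small numerical tolerance."""
    return bool(np.linalg.eigvalsh((M + M.conj().T) / 2).min() >= -tol)

def is_cp(t, lam):
    return is_psd(choi(t, lam))

def partial_transpose(J):
    """Transpose the second (output) tensor factor of a 4x4 matrix."""
    return J.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)

def is_eb_ppt(t, lam):
    """Completely positive and Choi matrix has positive partial transpose."""
    J = choi(t, lam)
    return is_psd(J) and is_psd(partial_transpose(J))

def is_eb_sign_change(t, lam):
    """Condition (D): the map and each single sign flip of lambda_k are CP."""
    lam = np.asarray(lam, dtype=float)
    if not is_cp(t, lam):
        return False
    for k in range(3):
        flipped = lam.copy()
        flipped[k] = -flipped[k]
        if not is_cp(t, flipped):
            return False
    return True
```

For instance, with $t = 0$ and $\lambda = (p, p, p)$ both tests accept $p = 0.3$ and reject $p = 0.5$, although the latter map is completely positive.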
2. Characterizations

In this section, we prove Theorem 1 and provide some results using the canonical parameters. This gives another characterization of qubit EBT maps in the special case of CPT maps which are also unital.

The equivalence (A) ⇔ (B) was proved in [11] where it was also shown that both are equivalent to the condition that $\Upsilon \circ \Phi$ is CPT for all $\Upsilon$ in a set of entanglement witnesses and that $\Phi \circ \Upsilon$ is CPT if and only if $\Upsilon \circ \Phi$ is. In the case of qubits, it is well known that it suffices to let $\Upsilon$ be the transpose, which proves the equivalence with (C). Furthermore, changing $\Phi \to \Phi \circ T$ is equivalent to changing $\lambda_2 \to -\lambda_2$ in the representation (3), and is unitarily equivalent (via conjugation with a Pauli matrix) to changing the sign of any other $\lambda_k$, which yields (C) ⇔ (D). That (E) ⇒ (A) follows immediately from the facts that CQ maps are a special type of entanglement-breaking maps and the set of entanglement-breaking maps is convex by Theorem 2 of [11]. The proof that (D) ⇒ (E) will be given in Sec. 6.

The proof that (B) ⇒ (A) given in [11] relied on the fact that there is a one-to-one correspondence [5, 10, 12] (but not a unitary equivalence) between maps $\Phi$ and states

\[
\Gamma_\Phi = (I \otimes \Phi)(|\beta\rangle\langle\beta|) \tag{4}
\]

where $|\beta\rangle = \tfrac{1}{\sqrt{2}} (|00\rangle + |11\rangle)$ is one of the maximally entangled Bell states. Moreover, a map is EBT if and only if $\Gamma_\Phi$ is separable since that was shown to be equivalent to writing it in the form (2). One could then apply the reduction criterion for separability [2, 9, 10] to $\Gamma_\Phi$. This criterion states that a necessary condition for separability of $\rho$ is that $\langle \beta, \rho\, \beta \rangle \le \tfrac{1}{d}$ for all maximally entangled states $|\beta\rangle$. In the case of qubits, this criterion is equivalent to the PPT condition, and hence sufficient, and equivalent to $\rho \le \tfrac12 I$, which gives the following result.

Theorem 2. A qubit CPT map is EBT if and only if $\Gamma_\Phi \le \tfrac12 I$ with $\Gamma_\Phi$ as in (4).

We now consider entanglement breaking conditions which involve only the parameters $\lambda_k$.

Theorem 3. If $\Phi$ is an entanglement breaking qubit map written in the form (3), then $\sum_j |\lambda_j| \le 1$.

Proof. It is shown in [1, 16] that a necessary condition for complete positivity is

\[
(\lambda_1 \pm \lambda_2)^2 \le (1 \pm \lambda_3)^2. \tag{5}
\]

When combined with the sign change condition (D), this yields the requirement $|\lambda_1| + |\lambda_2| \le 1 - |\lambda_3|$.

For unital qubit channels, the condition in Theorem 3 is also sufficient for entanglement breaking. For unital maps $t = 0$ and, as observed in [1, 14, 16], the conditions in (5) are also sufficient for complete positivity. Since $\sum_j |\lambda_j| \le 1$ implies that (5) holds for any choice of sign in $\lambda_k = \pm|\lambda_k|$, it follows that any unital CPT map satisfying this condition is also EBT.

Theorem 4. A unital qubit channel is entanglement breaking if and only if $\sum_j |\lambda_j| \le 1$ [after reduction to the form (3)].
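The argument leading to Theorem 4 can be checked numerically. The sketch below is illustrative only: it assumes a unital map already in the diagonal form (3) with $t = 0$, uses condition (5) in the equivalent form $|\lambda_1 \pm \lambda_2| \le 1 \pm \lambda_3$ as the complete positivity test, and compares the sign-change condition (D) with the bound $\sum_j |\lambda_j| \le 1$. The function names are hypothetical.

```python
import numpy as np

def unital_cp(lam, tol=1e-12):
    """Condition (5) for t = 0, written as |l1 +/- l2| <= 1 +/- l3."""
    l1, l2, l3 = lam
    return bool(abs(l1 + l2) <= 1 + l3 + tol and abs(l1 - l2) <= 1 - l3 + tol)

def unital_eb_sign_change(lam):
    """Condition (D): the map and every single sign flip of lambda_k are CP."""
    lam = np.asarray(lam, dtype=float)
    if not unital_cp(lam):
        return False
    for k in range(3):
        flips = np.ones(3)
        flips[k] = -1.0
        if not unital_cp(lam * flips):
            return False
    return True

def unital_eb_sum(lam, tol=1e-12):
    """Theorem 4: sum_j |lambda_j| <= 1."""
    return bool(np.sum(np.abs(lam)) <= 1 + tol)
```

On randomly drawn points of the cube $[-1, 1]^3$ the two tests are expected to agree, reflecting the fact that the sign-change condition cuts the tetrahedron of unital CPT maps down to the octahedron $\sum_j |\lambda_j| \le 1$.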
Moreover, as will be discussed in Sec. 5, the extreme points of the set of unital entanglement breaking maps are those for which two $\lambda_k = 0$. Hence these channels are in the convex hull of CQ maps.

For non-unital maps these conditions need not be sufficient. Consider the so-called amplitude damping channel for which $\lambda_1 = \alpha$, $\lambda_2 = \alpha$, $\lambda_3 = \alpha^2$, $t_1 = t_2 = 0$, and $t_3 = 1 - \alpha^2$. For this map equality holds in the necessary and sufficient conditions

\[
(\lambda_1 \pm \lambda_2)^2 \le (1 \pm \lambda_3)^2 - t_3^2. \tag{6}
\]

Since the inequalities would be violated if the sign of one $\lambda_k$ is changed, the amplitude damping maps are never entanglement breaking except for the limiting case $\alpha = 0$. Thus there are maps for which $\sum_j |\lambda_j| = 2\alpha + \alpha^2$ can be made arbitrarily small (by taking $\alpha \to 0$), but which are not entanglement breaking.

3. A Product Representation

We begin by considering the representation of maps in the basis $\{I, \sigma_1, \sigma_2, \sigma_3\}$. Let $\Phi$ have the form (1) and write $R_k = \tfrac12 [I + w^k \cdot \sigma]$ and $F_k = \tfrac12 [u^k_0 I + u^k \cdot \sigma]$. Let $W, U$ be the $n \times 4$ matrices whose rows are $(1, w^k_1, w^k_2, w^k_3)$ and $(u^k_0, u^k_1, u^k_2, u^k_3)$ respectively, i.e. $w_{kj} = w^k_j$, $u_{kj} = u^k_j$, $j = 0, \ldots, 3$. Let $\mathbf{T}$ be the matrix $W^T U$. Note that the requirement that $\{F_k\}$ is a POVM is precisely that the first row of $\mathbf{T}$ is $(1, 0, 0, 0)$. The matrix $\mathbf{T} = W^T U$ is the representative of $\Phi$ in the form (3) (albeit not necessarily diagonal). We can summarize this discussion in the following theorem.

Theorem 5. A qubit channel is entanglement breaking if and only if it can be represented in the form (3) with $\mathbf{T} = W^T U$ where $W$ and $U$ are $n \times 4$ matrices as above, i.e. the rows satisfy $\big( \sum_{j=1}^3 (u^k_j)^2 \big)^{1/2} \le u^k_0$ and $\big( \sum_{j=1}^3 (w^k_j)^2 \big)^{1/2} \le w^k_0 = 1$ for all $k$.

We can use this representation to give alternate proofs of two results of the previous section. To show that (A) ⇔ (D) observe that changing the sign of the $j$th column of $U$ ($j = 1, 2, 3$) is equivalent to replacing $F_k$ by the POVM with $u^k_j \to -u^k_j$. The effect on $\mathbf{T}$ is simply to multiply the $j$th column by $-1$. The critical property about qubits is that the condition $F_k > 0$ is equivalent to $\big( \sum_j |u^k_j|^2 \big)^{1/2} \le u^k_0$, which is unaffected by the replacement $u^k_j \to -u^k_j$.

Next, we give an alternate proof of Theorem 3 which is of interest because it may be extendable to higher dimensions.
Proof. Let $W, U$ be as in Sec. 3. Then

\begin{align*}
\sum_{j=1}^3 |\lambda_j| &= \sum_{j=1}^3 \Big| \sum_{k=1}^n w^k_j u^k_j \Big| \le \sum_{j=1}^3 \sum_{k=1}^n |w^k_j u^k_j| = \sum_{k=1}^n \sum_{j=1}^3 |w^k_j u^k_j| \\
&\le \sum_{k=1}^n \Big( \sum_{j=1}^3 |w^k_j|^2 \Big)^{1/2} \Big( \sum_{j=1}^3 |u^k_j|^2 \Big)^{1/2} \le \sum_{k=1}^n 1 \cdot u^k_0 = 1
\end{align*}

where we have used the fact that $|w^k| \le 1$ and $|u^k| \le u^k_0$. That $\sum_k u^k_0 = 1$ is a consequence of the fact that the $\{F_k\}$ form a POVM.
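The product representation and the bound just proved can be illustrated with a small numerical sketch. It is not taken from the paper's argument and rests on two labeled assumptions: the rows of $U$ are taken to be the coefficients of $F_k$ in the basis $\{I, \sigma_1, \sigma_2, \sigma_3\}$ (so that $\sum_k u^k_0 = 1$, as used in the proof), and the $|\lambda_j|$ obtained after reduction to canonical form are identified with the singular values of the lower-right $3 \times 3$ block of $\mathbf{T}$. The POVM is the trine $E_k = \tfrac13 [I + w_k \cdot \sigma]$ that appears as an example at the end of this section, paired with $R_k = \tfrac12 [I + \sigma_k]$.

```python
import numpy as np

def product_matrix(W, U):
    """T = W^T U for n x 4 matrices W (rows (1, w^k)) and U (rows (u^k_0, u^k))."""
    return np.asarray(W, dtype=float).T @ np.asarray(U, dtype=float)

s = np.sqrt(3) / 2
trine_dirs = np.array([[1.0, 0.0, 0.0], [-0.5, 0.0, s], [-0.5, 0.0, -s]])

# Assumed convention: F_k = u^k_0 I + u^k . sigma, here with u^k_0 = 1/3, u^k = w_k / 3.
U = np.hstack([np.full((3, 1), 1 / 3), trine_dirs / 3])
# R_k = (1/2)[I + sigma_k], i.e. Bloch vectors e_1, e_2, e_3.
W = np.hstack([np.ones((3, 1)), np.eye(3)])

T = product_matrix(W, U)
# Singular values of the 3x3 block play the role of |lambda_j| (assumption stated above).
singular_values = np.linalg.svd(T[1:, 1:], compute_uv=False)
lambda_sum = float(singular_values.sum())
```

In this example the first row of `T` is $(1, 0, 0, 0)$, the third column vanishes, and `lambda_sum` stays below 1, in line with Theorem 3.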
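We now consider the decomposition $\mathbf{T} = W^T U$ for the special cases of CQ, QC and point channels. If $\Phi$ is a CQ channel, we can assume without loss of generality that $U = \tfrac12 \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}$. Now write $W = \begin{pmatrix} 1 & w^1 \\ 1 & w^2 \end{pmatrix}$. Then

\[
\mathbf{T} = W^T U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ \tfrac12 (w^1 + w^2) & 0 & 0 & \tfrac12 (w^1 - w^2) \end{pmatrix} \tag{7}
\]

where the second row of blocks stands for three rows, with $\tfrac12 (w^1 \pm w^2)$ written as column vectors.

By acting on the left with a unitary matrix of the form $\begin{pmatrix} 1 & 0 \\ 0 & \pm R \end{pmatrix}$ where $R$ is a rotation whose third row is a multiple of $w^1 - w^2$, this can be reduced to the form (3) with $\lambda_1 = \lambda_2 = 0$, $|\lambda_3| = \tfrac12 |w^1 - w^2|$, and $t = R w^1 - (0, 0, \lambda_3)^T$ [since $\tfrac12 (w^1 + w^2) = w^1 - \tfrac12 (w^1 - w^2)$]. Indeed, it suffices to choose

\[
W = \begin{pmatrix} 1 & t_1 & t_2 & t_3 + \lambda_3 \\ 1 & t_1 & t_2 & t_3 - \lambda_3 \end{pmatrix}. \tag{8}
\]

Note that the requirement $|t| \le 1$ only implies $t_1^2 + t_2^2 + (t_3 + \lambda_3)^2 \le 1$; however, the requirement $|w^k| \le 1$ implies that $t_1^2 + t_2^2 + (t_3 \pm \lambda_3)^2 \le 1$ must hold with both signs and this is equivalent to the stronger condition

\[
t_1^2 + t_2^2 + (|t_3| + |\lambda_3|)^2 \le 1 \tag{9}
\]

which is necessary and sufficient for a CPT map to reduce the Bloch sphere to a line.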
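The choice (8) can be verified directly. The following sketch, with hypothetical function names, multiplies out $W^T U$ for the CQ choice of $U$ and checks that requiring both rows of $W$ to be Bloch vectors of states is the same as condition (9).

```python
import numpy as np

# U for a CQ channel, as in the text: rows (1/2)(1, 0, 0, +1) and (1/2)(1, 0, 0, -1).
U_CQ = 0.5 * np.array([[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, -1.0]])

def cq_matrix(t, lam3):
    """T = W^T U with W chosen as in (8)."""
    t1, t2, t3 = t
    W = np.array([[1.0, t1, t2, t3 + lam3], [1.0, t1, t2, t3 - lam3]])
    return W.T @ U_CQ

def rows_are_states(t, lam3):
    """Both Bloch vectors (t1, t2, t3 +/- lam3) lie in the unit ball."""
    t1, t2, t3 = t
    return all(t1**2 + t2**2 + (t3 + s * lam3) ** 2 <= 1 for s in (1, -1))

def condition_9(t, lam3):
    """Condition (9): t1^2 + t2^2 + (|t3| + |lam3|)^2 <= 1."""
    t1, t2, t3 = t
    return t1**2 + t2**2 + (abs(t3) + abs(lam3)) ** 2 <= 1
```

For example, `cq_matrix((0.2, 0.1, 0.3), 0.4)` has first column $(1, 0.2, 0.1, 0.3)$ and last column $(0, 0, 0, 0.4)$, i.e. the canonical form (3) with $\lambda_1 = \lambda_2 = 0$.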
If $\Phi$ is a QC channel, we can assume without loss of generality that

\[
W = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}
\quad \text{and} \quad
U = \begin{pmatrix} u_0 & u_1 & u_2 & u_3 \\ 1 - u_0 & -u_1 & -u_2 & -u_3 \end{pmatrix},
\]

from which one easily finds that the second and third rows of $\mathbf{T} = W^T U$ are identically zero and the fourth row is $(2u_0 - 1,\ 2u_1,\ 2u_2,\ 2u_3)$. One then easily verifies that multiplication on the right by a matrix as above with $R$ a rotation whose third column is a multiple of $(u_1, u_2, u_3)$ reduces $\mathbf{T} = W^T U$ to the canonical form (3) with $\lambda_1 = \lambda_2 = 0$, $\lambda_3 = 2\sqrt{u_1^2 + u_2^2 + u_3^2} = 2|u| \le \min\{2u_0, 2(1 - u_0)\} \le 1$, and $t_3 = 2u_0 - 1$. (Note that $t_3 + \lambda_3 \le |2u_0 - 1| + \min\{2u_0, 2(1 - u_0)\} \le 1$ with equality if and only if the image reaches the Bloch sphere.)

It is interesting to note that for qubit channels, every QC channel is unitarily equivalent to a CQ channel. Indeed, a channel which, after reduction to canonical form, has nonzero elements $\lambda_3$ and $t_3$ with $|\lambda_3| + |t_3| \le 1$ and $|t_3| < 1$ can be written as either a QC channel with

\[
W = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}, \quad
U = \frac12 \begin{pmatrix} 1 + t_3 & 0 & 0 & \lambda_3 \\ 1 - t_3 & 0 & 0 & -\lambda_3 \end{pmatrix}
\]

or as a CQ channel with

\[
W = \begin{pmatrix} 1 & 0 & 0 & t_3 + \lambda_3 \\ 1 & 0 & 0 & t_3 - \lambda_3 \end{pmatrix}, \quad
U = \frac12 \begin{pmatrix} 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & -1 \end{pmatrix}.
\]
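That the QC and CQ forms above describe the same channel can be confirmed by multiplying out $W^T U$ in both cases. The short sketch below does this; the function names are hypothetical and the matrices are those displayed in the text.

```python
import numpy as np

def qc_form(t3, lam3):
    """(W, U) for the QC representation: R_k are the projections onto |0>, |1>."""
    W = np.array([[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, -1.0]])
    U = 0.5 * np.array([[1 + t3, 0.0, 0.0, lam3], [1 - t3, 0.0, 0.0, -lam3]])
    return W, U

def cq_form(t3, lam3):
    """(W, U) for the CQ representation: F_k are the projections onto |0>, |1>."""
    W = np.array([[1.0, 0.0, 0.0, t3 + lam3], [1.0, 0.0, 0.0, t3 - lam3]])
    U = 0.5 * np.array([[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, -1.0]])
    return W, U

def channel_matrix(W, U):
    """The 4 x 4 representative T = W^T U."""
    return W.T @ U
```

Both `channel_matrix(*qc_form(t3, lam3))` and `channel_matrix(*cq_form(t3, lam3))` reduce to the matrix with first row $(1, 0, 0, 0)$, fourth row $(t_3, 0, 0, \lambda_3)$ and zeros elsewhere.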
For point channels $W = (1\ \ t_1\ \ t_2\ \ t_3)$ and $U = \tfrac12 (1\ \ 0\ \ 0\ \ 0)$.

We conclude this section with an example of a map of the form (1) with an extreme POVM, for which the corresponding map $\Phi$ is not extreme. Let $E_k = \tfrac13 [I + w_k \cdot \sigma]$ with $w_1 = (1, 0, 0)$, $w_2 = (-\tfrac12, 0, \tfrac{\sqrt3}{2})$, $w_3 = (-\tfrac12, 0, -\tfrac{\sqrt3}{2})$. Then, irrespective of the choice of $R_k$, the third column of $\mathbf{T} = W^T U$ is identically zero, which implies that, after reduction to canonical form, one of the parameters $\lambda_k = 0$. However, it is easy to find density matrices, e.g. $R_k = \tfrac12 [I + \sigma_k]$, for which the resulting map $\Phi$ is not CQ or point. But by Theorem 10, $\Phi$ is a convex combination of CQ maps and hence, not extreme.

4. Complete Positivity Conditions Revisited

Not only is the set of CPT maps convex, in a fixed basis corresponding to the canonical form (3) the set of $\lambda_k$ corresponding to any fixed choice of $t = (t_1, t_2, t_3)$ is also a convex set which we denote $\Lambda_t$. We will also be interested in the convex subset $\Lambda_{t,\lambda_3}$ of the $\lambda_1$–$\lambda_2$ plane for fixed $t, \lambda_3$, and in the convex set $\Xi_{t_3,\lambda_3}$ of points $(t_1, t_2, \lambda_1, \lambda_2)$ corresponding to fixed $t_3, \lambda_3$.

Although stated somewhat differently, the following result was proved in [16].
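Theorem 6. Let $t$ and $\lambda_3$ be fixed with $|t_3| + |\lambda_3| < 1$. Then the convex set $\Lambda_{t,\lambda_3}$ consists of the points $(\lambda_1, \lambda_2)$ for which $I - R_\Phi R_\Phi^\dagger$ (or, equivalently $I - R_\Phi^\dagger R_\Phi$) is positive semi-definite, where

\[
R_\Phi = \begin{pmatrix}
\dfrac{t_1 + i t_2}{(1 + t_3 + \lambda_3)^{1/2} (1 - t_3 - \lambda_3)^{1/2}} &
\dfrac{\lambda_1 + \lambda_2}{(1 + t_3 + \lambda_3)^{1/2} (1 - t_3 + \lambda_3)^{1/2}} \\[2ex]
\dfrac{\lambda_1 - \lambda_2}{(1 + t_3 - \lambda_3)^{1/2} (1 - t_3 - \lambda_3)^{1/2}} &
\dfrac{t_1 + i t_2}{(1 + t_3 - \lambda_3)^{1/2} (1 - t_3 + \lambda_3)^{1/2}}
\end{pmatrix}. \tag{10}
\]

Similarly, $\Xi_{t_3,\lambda_3}$ also consists of the points $(t_1, t_2, \lambda_1, \lambda_2)$ for which $I - R_\Phi R_\Phi^\dagger \ge 0$. Moreover, the extreme points of $\Lambda_{t_3,\lambda_3}$ are those for which $R_\Phi^\dagger R_\Phi = I$.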
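Theorem 6 lends itself to a direct numerical check. The sketch below is illustrative and makes its assumptions explicit: `r_phi` implements (10) as written, while `adjoint_block_matrix` builds the $4 \times 4$ block matrix with blocks $\widehat\Phi(E_{jk})$ for the adjoint map, using formulas derived here from the canonical form (3) (including an overall factor $\tfrac12$) rather than quoted from the text. Both function names are hypothetical.

```python
import numpy as np

def r_phi(t, lam):
    """The matrix R_Phi of (10); requires |t3| + |lam3| < 1."""
    t1, t2, t3 = t
    l1, l2, l3 = lam
    tau = t1 + 1j * t2
    cpp, cmm = 1 + l3 + t3, 1 - l3 - t3
    cpm, cmp_ = 1 + l3 - t3, 1 - l3 + t3
    return np.array([
        [tau / np.sqrt(cpp * cmm), (l1 + l2) / np.sqrt(cpp * cpm)],
        [(l1 - l2) / np.sqrt(cmp_ * cmm), tau / np.sqrt(cmp_ * cpm)],
    ])

def is_contraction(R, tol=1e-9):
    """Largest singular value at most 1, i.e. I - R R^dagger >= 0."""
    return bool(np.linalg.norm(R, 2) <= 1 + tol)

def adjoint_block_matrix(t, lam):
    """Block matrix [[A, C], [C^dagger, B]] with A, B, C the images of E11, E22, E12
    under the adjoint map (derived for this sketch from the canonical form)."""
    t1, t2, t3 = t
    l1, l2, l3 = lam
    tau = t1 + 1j * t2
    A = 0.5 * np.diag([1 + t3 + l3, 1 + t3 - l3]).astype(complex)
    B = 0.5 * np.diag([1 - t3 - l3, 1 - t3 + l3]).astype(complex)
    C = 0.5 * np.array([[tau, l1 + l2], [l1 - l2, tau]])
    return np.block([[A, C], [C.conj().T, B]])

def is_psd(M, tol=1e-9):
    return bool(np.linalg.eigvalsh((M + M.conj().T) / 2).min() >= -tol)
```

For $t = 0$ and $\lambda = (p, p, p)$ the matrix `r_phi` has the single nonzero entry $2p/(1+p)$, so it is a contraction for $p = 0.5$ but not for $p = -0.5$, matching the known range $-\tfrac13 \le p \le 1$ for complete positivity.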
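Although this result is stated in a form in which $t_3$ and $\lambda_3$ play a special role and does not appear to be symmetric with respect to interchange of indices, the conditions which result are, in fact, invariant under permutations of 1, 2, 3.

Theorem 6 follows from Choi's theorem [5] that $\Phi$ is completely positive if and only if $\Gamma_\Phi$, given by (4), is positive semi-definite. As noted in [16], this implies that it can be written in the form

\[
\Gamma_\Phi = \begin{pmatrix}
\Phi(E_{11}) & \sqrt{\Phi(E_{11})}\, R_\Phi \sqrt{\Phi(E_{22})} \\
\sqrt{\Phi(E_{22})}\, R_\Phi^\dagger \sqrt{\Phi(E_{11})} & \Phi(E_{22})
\end{pmatrix} \tag{11}
\]

where $R_\Phi$ is a contraction. (Note, however, that the expression for $R_\Phi$ given in (10) was obtained by applying this result to the adjoint $\widehat\Phi$, i.e. to $(I \otimes \widehat\Phi)(|\beta\rangle\langle\beta|)$.)

Conversely, given a CPT map $\Phi$ and any contraction $U$ on $\mathbb{C}^2$, one can define a $4 \times 4$ matrix in block form,

\[
M = \begin{pmatrix}
\widehat\Phi(E_{11}) & \sqrt{\widehat\Phi(E_{11})}\, U \sqrt{\widehat\Phi(E_{22})} \\
\sqrt{\widehat\Phi(E_{22})}\, U^\dagger \sqrt{\widehat\Phi(E_{11})} & \widehat\Phi(E_{22})
\end{pmatrix}. \tag{12}
\]

It then follows that there is another CPT map which (with a slight abuse of notation) we denote $\Phi_U$ for which $(I \otimes \widehat\Phi_U)(|\beta\rangle\langle\beta|) = M$. However, (12) need not, in general, correspond to a map $\Phi_U$ which has the canonical form (3) since that requires $\widehat\Phi_U(E_{12}) = \sqrt{\widehat\Phi(E_{11})}\, U \sqrt{\widehat\Phi(E_{22})} = (t_1 + i t_2) I + \lambda_1 \sigma_x + i \lambda_2 \sigma_y$. For $U$ an arbitrary unitary or contraction, we can only conclude that

\begin{align*}
\widehat\Phi_U(\sigma_x) &= \sqrt{\widehat\Phi(E_{11})}\, U \sqrt{\widehat\Phi(E_{22})} + \sqrt{\widehat\Phi(E_{22})}\, U^\dagger \sqrt{\widehat\Phi(E_{11})} \equiv \sum_{k=0}^3 t_{1k} \sigma_k \\
\widehat\Phi_U(\sigma_y) &= -i \Big[ \sqrt{\widehat\Phi(E_{11})}\, U \sqrt{\widehat\Phi(E_{22})} - \sqrt{\widehat\Phi(E_{22})}\, U^\dagger \sqrt{\widehat\Phi(E_{11})} \Big] \equiv \sum_{k=0}^3 t_{2k} \sigma_k
\end{align*}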
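so that the map $\Phi_U$ corresponds to a matrix of the form

\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ t_{10} & t_{11} & t_{12} & t_{13} \\ t_{20} & t_{21} & t_{22} & t_{23} \\ t_3 & 0 & 0 & \lambda_3 \end{pmatrix}
\]

with $t_{jk}$ real.

In order to study the general case of nonzero $t_k$, it is convenient to rewrite (10) in the following form (using notation similar to that introduced in [13]).

\[
R_\Phi = \begin{pmatrix}
\dfrac{\tau}{\sqrt{c_{++} c_{--}}} & \dfrac{\lambda_+}{\sqrt{c_{++} c_{+-}}} \\[2ex]
\dfrac{\lambda_-}{\sqrt{c_{--} c_{-+}}} & \dfrac{\tau}{\sqrt{c_{+-} c_{-+}}}
\end{pmatrix} \tag{13}
\]

where $\lambda_\pm = \lambda_1 \pm \lambda_2$, $\tau = t_1 + i t_2$, and $c_{\pm\pm} = 1 \pm \lambda_3 \pm t_3$, e.g. $c_{+-} = 1 + \lambda_3 - t_3$. Then

\[
I - R_\Phi^\dagger R_\Phi \equiv M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} \tag{14}
\]

with

\[
m_{11} = 1 - \frac{|\tau|^2}{c_{++} c_{--}} - \frac{|\lambda_-|^2}{c_{--} c_{-+}} \tag{15}
\]

\[
m_{22} = 1 - \frac{|\tau|^2}{c_{+-} c_{-+}} - \frac{|\lambda_+|^2}{c_{++} c_{+-}} \tag{16}
\]

\[
m_{12} = \overline{m_{21}} = - \left( \frac{\overline{\tau} \lambda_+}{c_{++} \sqrt{c_{--} c_{+-}}} + \frac{\tau \lambda_-}{c_{-+} \sqrt{c_{--} c_{+-}}} \right). \tag{17}
\]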
Note that the denominators, although somewhat messy, are essentially constants depending only on $t_3$ and $\lambda_3$. Considering $\tau$ as also a fixed constant, it suffices to rotate (and dilate) the $\lambda_1$–$\lambda_2$ plane by $\pi/4$ and work instead with the variables $\lambda_\pm$. The diagonal conditions $m_{11} \ge 0$ and $m_{22} \ge 0$ define a rectangle in the $\lambda_+$–$\lambda_-$ plane, namely

\[
|\lambda_-|^2 \le c_{--} c_{-+} - \frac{c_{-+}}{c_{++}} |\tau|^2 = (1 - \lambda_3)^2 - t_3^2 - \frac{1 - \lambda_3 + t_3}{1 + \lambda_3 + t_3} |\tau|^2 \tag{18}
\]

\[
|\lambda_+|^2 \le c_{++} c_{+-} - \frac{c_{++}}{c_{-+}} |\tau|^2 = (1 + \lambda_3)^2 - t_3^2 - \frac{1 + \lambda_3 + t_3}{1 - \lambda_3 + t_3} |\tau|^2. \tag{19}
\]

These diagonal conditions imply the necessary conditions

\[
|\lambda_\pm|^2 \le (1 \pm \lambda_3)^2 - t_3^2 \tag{20}
\]

for complete positivity, which also become sufficient when $\tau = 0$. The determinant condition $m_{11} m_{22} \ge |m_{12}|^2$ is more complicated, but basically has the form

\[
[a - b \lambda_+^2][c - d \lambda_-^2] \ge e \lambda_+^2 + f \lambda_-^2 + g \lambda_+ \lambda_-. \tag{21}
\]

Fig. 1. The $\lambda_+$–$\lambda_-$ plane showing the regions described by the diagonal conditions (dotted lines) and the curves corresponding to $\det(I - R_\Phi^\dagger R_\Phi) = 0$ for $t = (0.2, 0.3, 0)$ and $\lambda_3 = 0.35$. The closed curve and its interior describes the parameters for which the corresponding map is completely positive.
In particular, we would like to know if the values of $(\lambda_+, \lambda_-)$ satisfying (21) necessarily lie within the rectangle defined by (18) and (19). Extending the lines bounding this rectangle, i.e. $m_{11} = 0$ and $m_{22} = 0$, one sees that the $\lambda_+$–$\lambda_-$ plane is divided into 9 regions, as shown in Fig. 1 and described below.

• the rectangle in the center which we denote ++,
• four outer corners which we denote −− since both $m_{11} < 0$ and $m_{22} < 0$,
• the four remaining regions (directly above, below and to the left and right of the center rectangle) which we denote as +− or −+ according to the signs of $m_{11}$ and $m_{22}$.

We know that the determinant condition (21) is never satisfied in the +− or −+ regions since $m_{11} m_{22} - |m_{12}|^2 < 0$ when $m_{11}$ and $m_{22}$ have opposite signs. This implies that equality in (21) defines a curve which bounds a convex region lying entirely within the ++ rectangle. Although (21) also has solutions in the −− regions as shown in Fig. 1, one expects that these will typically lie outside the region for which $|t_k| + |\lambda_k| \le 1$, i.e. the rectangle bounded by the line segments satisfying $|\lambda_+ + \lambda_-| \le 2(1 - |t_1|)$ and $|\lambda_+ - \lambda_-| \le 2(1 - |t_2|)$. However, John Cortese [6] has shown that this need not necessarily be the case. Nevertheless, one need only check one of the two conditions $m_{11} > 0$, $m_{22} > 0$, and might substitute a weaker condition, such as $\operatorname{Tr} M > 0$, to exclude points in the −− regions. For example, one could substitute for the diagonal conditions $c_{--} m_{11} + c_{+-} m_{22} \ge 0$, which is equivalent to

\[
(\lambda_1^2 + \lambda_2^2)(1 + t_3) + \lambda_3^2 (1 - t_3) \le (1 + t_3)(1 - |t|^2) + 2 \lambda_1 \lambda_2 \lambda_3. \tag{22}
\]

Fig. 2. The $\lambda_+$–$\lambda_-$ plane showing the region determined by the determinant condition when $t = (0.4, 0.3, 0.0)$ and $\lambda_3 = 0.15$ and the corresponding region with $\lambda_+$ and $\lambda_-$ interchanged. Their intersection corresponds to the entanglement breaking maps with the indicated parameters.
Thus, strict inequality in both (21) and (22) suffices to ensure complete positivity.

In general, when $t \ne 0$, the convex set $\Lambda_{t,\lambda_3}$ is determined by (21), i.e. by the closed curve for which equality holds and its interior. Since changing the sign of $\lambda_1$ or $\lambda_2$ is equivalent to changing $\lambda_+ \leftrightarrow \lambda_-$, the corresponding set of entanglement breaking maps is given by the intersection of this region with the corresponding one with $\lambda_+$ and $\lambda_-$ switched, as shown in Fig. 2.

Remark. If, instead of looking at $I - R_\Phi^\dagger R_\Phi$, we had considered $I - R_\Phi R_\Phi^\dagger$, the matrix $M$ would change slightly and the conditions (18) or (19) would be modified accordingly. (In fact, the only change would be to replace $+t_3$ by $-t_3$ in the fraction multiplying $|\tau|^2$.) However, the determinant condition (21) would not change. Since $R_\Phi^\dagger R_\Phi$ and $R_\Phi R_\Phi^\dagger$ are unitarily equivalent,

\[
\det[I - R_\Phi^\dagger R_\Phi] = \det(U [I - R_\Phi^\dagger R_\Phi] U^\dagger) = \det[I - R_\Phi R_\Phi^\dagger].
\]
It is worth noting that whether or not $R_\Phi$ is a contraction is not affected by the signs of the $t_k$. (In particular, changing $t_2 \mapsto -t_2$ takes $R_\Phi \mapsto \overline{R_\Phi}$, changing $t_3 \mapsto -t_3$ takes $R_\Phi \mapsto \sigma_x R_\Phi^T \sigma_x$, and changing $t_1 \mapsto -t_1$ takes $R_\Phi \mapsto -\sigma_z \overline{R_\Phi} \sigma_z$.) Therefore, one can change the sign of any one of the $t_k$ without affecting complete positivity. By contrast, one cannot, in general, change $\lambda_k \to -\lambda_k$ without affecting the complete positivity conditions. (Note, however, that one can always change the signs of any two of the $\lambda_k$ since this is equivalent to conjugation with a Pauli matrix on either the domain or range. The latter will also change the signs of two of the $t_k$.) Changing the sign of $\lambda_2$ is equivalent to composing $\Phi$ with the transpose, so that changing the sign of one of the $\lambda_k$ is equivalent to composing $\Phi$ with the transpose and conjugation with one of the Pauli matrices. Furthermore, if changing the sign of one particular $\lambda_k$ does not affect complete positivity, then one can change the sign of any of the $\lambda_k$ without affecting complete positivity.

In view of the role of the sign change condition it is worth summarizing these remarks.

Proposition 7. Let $\Phi$ be a CPT map in canonical form (3) and let $T(\rho) = \rho^T$ denote the transpose. Then
(i) $T \circ \Phi \circ T$ is also completely positive, i.e. changing $t_k \to -t_k$ does not affect complete positivity.
(ii) $\Phi \circ T$ is completely positive if and only if changing any $\lambda_k \to -\lambda_k$ does not affect complete positivity.
(iii) $\Phi \circ T$ is completely positive if and only if $T \circ \Phi$ is.

The only difference between $\Phi \circ T$ and $T \circ \Phi$ is that the former changes the sign of $\lambda_2$ while the latter changes the signs of both $t_2$ and $\lambda_2$.

5. Geometry

5.1. Image of the Bloch sphere

We first consider the geometry of entanglement breaking channels in terms of their effect on the Bloch sphere. It follows from the equivalence with the sign change condition in Theorem 1 that any CPT map with some $\lambda_k = 0$ is entanglement breaking.
We call such channels planar since the image lies in a plane within the Bloch sphere. Similarly, we call a channel with two λk = 0 linear. If all three λk = 0, the Bloch sphere is mapped into a point. Note that the subsets of channels whose images lie within points, lines, and planes respectively are not convex. However, they are well-defined and useful classes to consider.
Points: A channel which maps the Bloch sphere to a point has the Holevo form (1) in which the sum reduces to a single term with $R = \tfrac12 [I + t \cdot \sigma]$ and $E = I$. Then $\Phi(\rho) = R \operatorname{Tr}(E \rho) = R$ for all $\rho$ and $\mathbf{T} = \begin{pmatrix} 1 & 0 \\ t & 0 \end{pmatrix}$. When $|t| = 1$, $R$ is a pure state and the map is extreme. It is also a special case of the so-called amplitude damping channels, and (as noted at the end of Sec. 2) these are the only amplitude damping channels which break entanglement.

Lines: When two of the $\lambda_k = 0$ so that the image of the Bloch sphere is a line, the conditions for complete positivity reduce to a single inequality, which becomes (9) in the case $\lambda_1 = \lambda_2 = 0$. Moreover, it is straightforward to verify that any such channel can be realized as a CQ channel. Indeed, it suffices to choose $W$ as in (8).

Planar channels: The image of a map with exactly one $\lambda_k = 0$ lies in a plane. When this is $\lambda_3$, the condition $I - R_\Phi R_\Phi^\dagger \ge 0$ becomes

\[
\begin{pmatrix}
1 - |t|^2 - (\lambda_1 - \lambda_2)^2 & 2 (t_1 \lambda_1 + i t_2 \lambda_2) \\
2 (t_1 \lambda_1 - i t_2 \lambda_2) & 1 - |t|^2 - (\lambda_1 + \lambda_2)^2
\end{pmatrix} \ge 0
\]

where $|t|^2 = t_1^2 + t_2^2 + t_3^2$, and the condition on the diagonal becomes

\[
(|\lambda_1| + |\lambda_2|)^2 + |t|^2 \le 1. \tag{23}
\]

Now, if either diagonal element is identically zero, then one must have $t_1 \lambda_1 = t_2 \lambda_2 = 0$. Thus, if both $\lambda_1, \lambda_2 \ne 0$ and equality holds in the necessary condition (23), one must have $t_1 = t_2 = 0$, in which case it reduces to $(|\lambda_1| + |\lambda_2|)^2 + t_3^2 = 1$. This implies that a truly planar channel cannot touch the Bloch sphere, unless it reduces to a point or a line.

5.2. Geometry of $\lambda_k$ space

We now consider, instead of the geometry of the images of entanglement-breaking maps, the geometry of the allowed set of maps in $\lambda_k$ space. After reduction to the canonical form (3) it is often useful to look at the subset of $[\lambda_1, \lambda_2, \lambda_3]$ which corresponds to a particular class of maps. We first consider maps for which $t = 0$.

Theorem 8. In a fixed (diagonal) basis, the set of unital entanglement breaking maps on qubits corresponds to the octahedron whose extreme points correspond to the channels for which $[\lambda_1, \lambda_2, \lambda_3]$ is a permutation of $[\pm 1, 0, 0]$.

Since this octahedron is precisely the subset with $\sum_j |\lambda_j| \le 1$, the result follows immediately from Theorem 4. Alternatively, one could use Theorem 10 and the fact that the unital CQ maps must have the form above.

Remarks. (1) The channels corresponding to a permutation of $[\pm 1, 0, 0]$ belong to the subclass known as CQ channels. Hence, the set of unital entanglement breaking maps is the convex hull of unital CQ maps.
Fig. 3. The tetrahedron of bistochastic maps and its inversion through the origin (left); their intersection gives the octahedron of unital entanglement breaking maps (right). (Figures by K. Durstberger appeared in [3].)
(2) The octahedron in Theorem 8 is precisely the intersection of the tetrahedron with corners $[1, 1, 1]$, $[1, -1, -1]$, $[-1, 1, -1]$, $[-1, -1, 1]$ with its inversion through the origin, as shown in Fig. 3. (A similar picture arises in studies of entanglement and Bell inequalities. See, e.g. Fig. 3 in [18] or Fig. 2 in [3].)

(3) The tetrahedron of unital maps is precisely the intersection of four half-spaces bounded by planes of the form $n \cdot [\lambda_1, \lambda_2, \lambda_3] = 1$ with $n = [\pm 1, \pm 1, \pm 1]$ and an odd number of negative signs, i.e. $n_1 n_2 n_3 = -1$. The octahedron of unital EBT maps is precisely the intersection of all eight half-spaces of this form.

(4) If the octahedron of unital entanglement breaking maps is removed from the tetrahedron of unital maps, one is left with four disjoint tetrahedrons whose sides are half the length of the original. Each of these defines a region of "entanglement-preserving" unital channels with fixed sign. For example, the tetrahedron with corners $[1, 1, 1]$, $[1, 0, 0]$, $[0, 1, 0]$, $[0, 0, 1]$, whose boundary consists of four equilateral triangles, one in the plane $[-1, -1, -1] \cdot [\lambda_1, \lambda_2, \lambda_3] = -1$ and three in the planes $n \cdot [\lambda_1, \lambda_2, \lambda_3] = 1$ with $n = [1, 1, -1], [1, -1, 1], [-1, 1, 1]$. For many purposes, e.g. consideration of additivity questions, it suffices to confine attention to one of these four corner tetrahedrons. Indeed, conjugation with one of the Pauli matrices transforms the corner above into one of the other three.

We next consider non-unital maps, for which one finds the following analogue of Theorem 8.

Theorem 9. Let $t = (t_1, t_2, t_3)$ be a fixed vector in $\mathbb{R}^3$ and let $\Lambda_t$ denote the convex subset of $\mathbb{R}^3$ corresponding to the vectors $[\lambda_1, \lambda_2, \lambda_3]$ for which the canonical map with these parameters is completely positive. Then the intersection of $\Lambda_t$ with its inversion through the origin (i.e. $\lambda_j \to -\lambda_j$) is the subset of EBT maps with translation $t$.

Remark. The effect of changing the sign of $\lambda_2$ is $\lambda_+ \leftrightarrow \lambda_-$ and of changing the sign of $\lambda_1$ is $\lambda_+ \leftrightarrow -\lambda_-$. In either case, the effect on the determinant condition (21) is simply to switch $\lambda_+ \leftrightarrow \lambda_-$, i.e. to reflect the boundary across the $\lambda_+ = \lambda_-$ line. Thus, the intersection of these two regions will correspond to entanglement breaking channels. The remainder will, typically, consist of 4 disjoint (non-convex) regions, corresponding to the four corners remaining after the "rounded octahedron" of Theorem 9 is removed from the "rounded tetrahedron".

6. Convex Hull of Qubit CQ Maps

In [16] we found it useful to generalize the extreme points of the set of CPT maps $\mathcal{S}$ to include all maps for which $R_\Phi$ is unitary, which is equivalent to the statement that both singular values of $R_\Phi$ are 1. In addition to true extreme points, this includes "quasi-extreme" points which correspond to the edges of the tetrahedron of unital maps. Some of these quasi-extreme points are true extreme points for the set of entanglement-breaking maps. However, there are no extreme points of the latter which are not generalized extreme points of $\mathcal{S}$. This will allow us to conclude the following.

Theorem 10. Every extreme point of the set of entanglement-breaking qubit maps is a CQ map. Hence, the set of entanglement-breaking qubit maps is the convex hull of qubit CQ maps.

The goal of the section is to prove this result. Because our argument is somewhat subtle, we also include, at the end of this section, a direct proof of some special cases.

First we note that the following was shown in [16]. After reduction to canonical form (3), for any map which is a generalized extreme point, the parameters $\lambda_k$ must satisfy (up to permutation) $\lambda_3 = \lambda_1 \lambda_2$. This is compatible with the sign change condition if and only if at least two of the $\lambda_k = 0$, which implies that $\Phi$ is a CQ map.

We now wish to examine in more detail those maps for which $R_\Phi$ is not unitary. We can assume, without loss of generality, that the singular values of $R_\Phi$ can be written as $\cos\theta_1$ and $\cos\theta_2$, that $\cos\theta_1 \ge \cos\theta_2$, and that $0 \le \cos\theta_2 < 1$.
Recall that we showed in Lemma 15 of [16] that one can use the singular value decomposition of $R_\Phi$ to write

\[
R_\Phi = V \begin{pmatrix} \cos\theta_1 & 0 \\ 0 & \cos\theta_2 \end{pmatrix} W^\dagger = \frac12 U_+ + \frac12 U_- \tag{24}
\]

where $U_\pm = V \begin{pmatrix} e^{\pm i\theta_1} & 0 \\ 0 & e^{\pm i\theta_2} \end{pmatrix} W^\dagger$ and $V, W$ are unitary. Thus, $\Phi$ is the midpoint of a line segment in $\mathcal{S}$ and can be written as

\[
\Phi = \frac12 \Phi_{U_+} + \frac12 \Phi_{U_-} \tag{25}
\]

with $\Phi_{U_\pm}$ defined as in (12). Although $\Phi_{U_\pm}$ need not have the canonical form (3), they are related so that their sum does.
00171
Qubit Entanglement Breaking Channels
657
We now use the singular value decomposition of RΦ to decompose it into unitary maps in another way. ! cos θ1 0 RΦ = V W† (26) 0 cos θ2 =V
=
cos θ1 + cos θ2 cos θ1 − cos θ2 I+ σz W † 2 2
cos θ1 − cos θ2 cos θ1 + cos θ2 V W† + V σz W † . 2 2
(27)
Moreover, it follows from (27) that cos θ1 + cos θ2 cos θ1 − cos θ2 ΦV W † + ΦV σz W † + (1 − cos θ1 )Φ0 2 2 where Φ0 is the QC map corresponding to ! b 11 ) Φ(E 0 . M= b 22 ) 0 Φ(E Φ=
(28)
Since we have assumed that we do not have cos θ1 = cos θ2 = 1, Eq. (28) represents Φ as a non-trivial convex combination of at least two distinct CPT maps, the first two of which are generalized extreme points. (Unless cos θ1 = 1 or cos θ1 = cos θ2 , we will have three distinct points, and can already conclude that Φ lies in the interior of a segment of a plane within S.) Now, the assumption that cos θ2 6= 1 suffices to show that the decompositions (28) and (25) involve different sets of extreme points and, hence, that Φ can be written as a point on two distinct line segments in S. Therefore, there is a segment of a plane in S which contains Φ and for which Φ does not lie on the boundary of the plane (although the plane might be on the boundary of S). Thus we have proved the following. Lemma 11. Every map Φ in S lies in one of two disjoint sets which allows it to be characterized as follows. Either (I) Φ is a generalized extreme point of S, or (II) Φ is in the interior of a segment of a plane in S. Now let T denote the set of maps for which Φ ◦ T or, equivalently (−I) ◦ Φ, is in S. Since T is a convex set isomorphic to S, its elements can also be broken into two classes as above. The set of entanglement breaking maps is precisely S ∩ T . We can now prove Theorem 6 by showing that the convex hull of CQ maps is S ∩ T . Proof. Let Φ be in S ∩ T which is also a convex set. If Φ is a generalized extreme point of either S or T , then the only possibility consistent with Φ being entanglement-breaking is that it is CQ. Thus we suppose that Φ belongs to class II for both S and T . Then Φ lies within a plane in S and within a plane in T . The
September 1, 2003 12:19 WSPC/148-RMP
658
00171
M. B. Ruskai
intersection of these two planes is non-empty (since it contains Φ) and its intersection must contain a line segment in S ∩ T which contains Φ and for which Φ is not an endpoint. Therefore, Φ is not an extreme point of S ∩ T . Thus all possible extreme points of S ∩ T must be generalized extreme points of S or T , in which case they are CQ. Remark. Although this shows that all extreme points of S ∩ T are CQ maps, this need not hold for the various convex subsets, corresponding to allowed values of λk , tk in a fixed basis, discussed at the start of Sec. 4. The following remark shows that “most” points in the convex subset Λt,λ3 of the λ1 –λ2 plane can, in fact, be written as a convex combination of CQ maps in canonical form in the same basis. It also shows why it is necessary to go outside this region for those points close to the boundary. (a) First consider the set of entanglement-breaking maps with λ3 = 0, which is the convex set ∪t3 Ξt3 ,0 . Every extreme point must be an extreme point of the convex set Ξt 3 ,0 for some t3 . By Theorem 6, these are the maps for which √ 1 2 λτ λτ+ is unitary, which implies that either − 1−t3
(i) t1 = t2 = 0 and $(\lambda_1 \pm \lambda_2)^2 = 1 - t_3^2$, which implies that either λ1 = 0 or λ2 = 0 with $t_3^2 + \lambda_j^2 = 1$ for j = 1 or 2, or
(ii) λ1 = λ2 = 0 and $|t|^2 = 1$.

The first type of extreme point is obviously a CQ map; the second is a "point" channel which, as noted before, is a special case of a CQ map. Thus any map in Ξt3,0 can be written as a convex combination of CQ maps in Ξt3,0. Similar results hold if λ1 = 0 or λ2 = 0. Therefore, any entanglement breaking channel with some λk = 0 can be written as a convex combination of CQ channels with at most one nonzero λk in the same basis. Thus any planar channel can be written as a convex combination of CQ channels in the same plane.

(b) Next consider entanglement-breaking maps with at most one nonzero tk. We can assume, without loss of generality, that t1 = t2 = 0, in which case the conditions for complete positivity reduce to (20). Combining this with the sign change condition yields

$$(|\lambda_1| + |\lambda_2|)^2 \le (1 - |\lambda_3|)^2 - t_3^2. \qquad (29)$$

It follows that for each fixed value of λ3 the set of allowable (λ1, λ2) forms a square with corners (0, ±A3), (±A3, 0), where $A_3 = \sqrt{(1 - |\lambda_3|)^2 - t_3^2}$. Thus, the extreme points of Λ(0,0,t3),λ3 are planar channels which, by part (a), are in the convex hull of CQ channels. In particular, a map with λ1 = 0, λ2 = ±A3 can be written as a convex combination of CQ maps with either λ2 = 0 or λ3 = 0. However, these maps need not necessarily lie in Λ(0,0,t3),λ3; we can only be sure that λ1 = 0 and t1 = 0, but not that t2 = 0. Thus we can only state
that Λ(0,0,t3),λ3 is in the convex hull of those CQ maps with λj = 0 and tj = 0 for either j = 1 or 2. Although it may be necessary to enlarge the set Λ(0,0,t3),λ3 in order to ensure that it is in the convex hull of some subset of CQ maps, these CQ maps will have the canonical form in the same basis, and the same value for λ3 in that basis.

(c) Now consider the convex subset Λt,λ3 ∩ Λt,−λ3 of the λ1–λ2 plane corresponding to entanglement breaking maps with t, |λ3| fixed. These two regions intersect when either λ1 = 0 or λ2 = 0 (or, equivalently, |λ+| = |λ−| where λ± = λ1 ± λ2). One can again use part (a) to see that these intersection points can be written as convex combinations of CQ maps in canonical form in the same basis. Since their convex hull has the same property, the resulting parallelogram, as shown in Fig. 4, is also a convex combination of CQ maps of the same type. Only for those points in the strip between the parallelogram and the boundary might one need to make a change of basis in order to write the maps as a convex combination of CQ maps.

[Fig. 4. The region of the λ+–λ− plane corresponding to entanglement breaking maps with t = (0.4, 0.3, 0.0) and λ3 = 0.15. The dotted lines show the convex hull of the intersection points, which are planar maps.]

(d) Now suppose |λ1| = |λ2| = |λ3| = λ > 0. Since any two signs can be changed by conjugation with a Pauli matrix, Φ is unitarily equivalent to a map with λ1 = λ2 = λ3 = ±λ. One can then conjugate with another unitary matrix (corresponding to a rotation on the Bloch sphere) to conclude that Φ is unitarily equivalent to a channel Φ′ with $t_1 = |t| = \sqrt{\sum_k t_k^2}$ and t2 = t3 = 0. It then follows from part (b) that Φ′, and thus also Φ, can be written as a convex combination of CQ channels which have the form described above in the rotated basis. However, these maps need not necessarily have the canonical form in the original basis.

[Fig. 5. The region of the λ+–λ− plane corresponding to entanglement breaking maps with t = (0.4, 0.3, 0.3742) and λ3 = 0.20. Because the intersections of the axes with the boundary (at λ± = ±0.4, for which all |λk| = 0.2) correspond to maps known to be in the convex hull of CQ maps, one can enlarge the convex hull of such maps from the dotted line to the octagon shown by the dashed line.]
Consider the region Λt,λ3 with 0 < λ3 = λ < 1/3 and $|t|^2 = 1 - 2\lambda + 3\lambda^2$. The maps with |λ1| = |λ2| = λ lie on the boundary of this region (in fact, at the intersection of the boundary with the λ± axes, as shown in Fig. 5). Since these maps have the form considered in part (d) they can be written as a convex combination of CQ maps; however, those CQ maps need not have the canonical form in the original basis. Nevertheless, every point in the octagon formed from the convex hull of the intersection points of the lines |λ+| = |λ−|, |λ+| = 0, and |λ−| = 0 with the boundary, as shown in Fig. 5, can be written as a convex combination of CQ maps as described above.

As another example, consider the set of entanglement breaking maps with t = (0, 0, t3) fixed. For any fixed λ1, the set Λt,λ1 ∩ Λt,−λ1 is a convex subset of the λ2–λ3 plane. Let (λ2, λ3) be a point in this subset that lies between the boundary and a parallelogram as described in (c) above. By considering the associated map as a point in the set Λt,λ3 ∩ Λt,−λ3 instead, one can be sure that it can be written as a convex combination of CQ maps, since this subset of the λ1–λ2 plane is of the type described in (b). Moreover, these boundary points can be added to the convex hull of CQ maps without need for a change of basis.

One might expect that additional boundary points could be added in various ways with additional ingenuity and changes of basis. That this is always true is the essence of Theorem 10. Only for points near the boundary with two tk nonzero is it necessary to actually make the change of basis used in the proof of this theorem. In other cases, the necessary convex combinations (which are not unique) can be formed using the strategies outlined above.
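The square described in part (b) of the remark can be explored numerically. The following Python sketch evaluates inequality (29) and the corner value A3 for maps with t = (0, 0, t3). The function names and the sample values t3 = 0.3, λ3 = 0.2 are illustrative assumptions and do not appear in the paper.

```python
import math


def corner_a3(t3, lam3):
    """Corner value A3 = sqrt((1 - |lam3|)^2 - t3^2) of the square in part (b)."""
    radicand = (1.0 - abs(lam3)) ** 2 - t3 ** 2
    if radicand < 0:
        raise ValueError("no allowed (lam1, lam2): (1 - |lam3|)^2 < t3^2")
    return math.sqrt(radicand)


def satisfies_eq29(lam1, lam2, lam3, t3, tol=1e-12):
    """Check (|lam1| + |lam2|)^2 <= (1 - |lam3|)^2 - t3^2 for t = (0, 0, t3)."""
    lhs = (abs(lam1) + abs(lam2)) ** 2
    rhs = (1.0 - abs(lam3)) ** 2 - t3 ** 2
    return lhs <= rhs + tol  # small tolerance for points exactly on the boundary


def square_corners(t3, lam3):
    """The four corners (0, +-A3), (+-A3, 0) of the allowed (lam1, lam2) region."""
    a = corner_a3(t3, lam3)
    return [(0.0, a), (0.0, -a), (a, 0.0), (-a, 0.0)]


if __name__ == "__main__":
    t3, lam3 = 0.3, 0.2  # hypothetical sample values
    for lam1, lam2 in square_corners(t3, lam3):
        print((lam1, lam2), satisfies_eq29(lam1, lam2, lam3, t3))
```

Under these assumptions the corners lie on the boundary of the region defined by (29), while a point such as (A3, A3) violates the inequality.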
Acknowledgments This paper is the outgrowth of discussions with a number of people. I owe a particularly large debt to Peter Shor for stimulating my interest in this topic by communicating the results of [17] prior to publication. Both M. Horodecki and P. Shor communicated several other results which have now been incorporated into [11]. It is also a pleasure to thank Elisabeth Werner for helpful discussions, particularly regarding Sec. 6; Chris King for many helpful discussions; John Cortese for making stimulating comments, communicating a counter-example to the sufficiency of the determinant condition, and providing many plots, including four of the figures in this paper, which helped to clarify my thinking. Part of this work was done while visiting the mathematics department at the Amherst campus of the University of Massachusetts; I am grateful to Professor Donald St. Mary for arranging this visit and providing a very hospitable working environment. Finally, I would like to thank Heide Narnhofer for permission to use the postscript files produced by K. Durstberger for Fig. 3.
References
[1] A. Fujiwara and P. Algoet, Affine parameterization of completely positive maps on a matrix algebra, preprint; subsequently published as "One-to-one parameterization of quantum channels", Phys. Rev. A 59 (1999), 3290–3294.
[2] C. H. Bennett, D. P. DiVincenzo, J. Smolin and W. K. Wootters, Mixed-state entanglement and quantum error correction, Phys. Rev. A 54 (1996), 3824–3851, quant-ph/9604024.
[3] R. A. Bertlmann, H. Narnhofer and W. Thirring, A geometric picture of entanglement and Bell inequalities, Phys. Rev. A 66 (2002), 032319, quant-ph/0111116.
[4] D. Bruss, Characterizing entanglement, J. Math. Phys. 43 (2002), 4237–4251, quant-ph/0110078.
[5] M.-D. Choi, Completely positive linear maps on complex matrices, Lin. Alg. Appl. 10 (1975), 285–290.
[6] J. Cortese, private communication.
[7] A. S. Holevo, Coding theorems for quantum channels, Russian Math. Surveys 53 (1999), 1295–1331, preprint quant-ph/9809023.
[8] M. Horodecki, P. Horodecki and R. Horodecki, Separability of mixed states: necessary and sufficient conditions, Phys. Lett. A 223 (1996).
[9] M. Horodecki and P. Horodecki, Reduction criterion of separability and limits for a class of protocols of entanglement distillation, Phys. Rev. A 59 (1999), 4206–4216, quant-ph/9708015.
[10] M. Horodecki, P. Horodecki and R. Horodecki, Mixed-state entanglement and quantum communication, in Quantum Information: An Introduction to Basic Theoretical Concepts and Experiments (Springer Tracts in Modern Physics 173, 2001), quant-ph/0109124.
[11] M. Horodecki, P. Shor and M. B. Ruskai, Entanglement breaking channels, quant-ph/0302031, Rev. Math. Phys. 15 (2003), 629–641.
[12] A. Jamiolkowski, Linear transformations which preserve trace and positive semidefiniteness of operators, Rep. Math. Phys. 3 (1972), 275–278.
[13] C. King, Maximization of capacity and lp norms for some product channels, J. Math. Phys. 43 (2002), 1247–1260.
[14] C. King and M. B. Ruskai, Minimal entropy of states emerging from noisy quantum channels, IEEE Trans. Info. Theory 47 (2001), 192–209.
[15] A. Peres, Separability criterion for density matrices, Phys. Rev. Lett. 77 (1996), 1413–1415.
[16] M. B. Ruskai, S. Szarek and W. Werner, An analysis of completely positive trace-preserving maps on M2, Lin. Alg. Appl. 347 (2002), 159–187, quant-ph/0101003.
[17] P. W. Shor, Additivity of the classical capacity of entanglement-breaking quantum channels, J. Math. Phys. (2002), 4334–4340, quant-ph/0201149.
[18] K. G. H. Vollbrecht and R. F. Werner, Entanglement measures under symmetry, Phys. Rev. A 64 (2001), 062307, quant-ph/0010095.
November 4, 2003 10:45 WSPC/148-RMP
00176
Reviews in Mathematical Physics, Vol. 15, No. 7 (2003) 663–703 © World Scientific Publishing Company
POISSON GEOMETRY IN CONSTRAINED SYSTEMS
MARTIN BOJOWALD
Center for Gravitational Physics and Geometry, Department of Physics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
[email protected]

THOMAS STROBL
Institut für Theoretische Physik, Universität Jena, D–07743 Jena, Germany
[email protected]

Received 6 June 2002
Revised 8 August 2002

Associated to a constrained system with closed constraint algebra there are two Poisson manifolds P and Q forming a symplectic dual pair with respect to the original, unconstrained phase space: P is the image of the constraint map (equipped with the algebra of constraints) and Q the Poisson quotient with respect to the orbits generated by the constraints (the orbit space is assumed to be a manifold). We provide sufficient conditions so that the reduced phase space of the constrained system may be identified with a symplectic leaf of Q. By these methods, a second class constrained system with closed algebra is reformulated as an abelian first class system in an extended phase space. While any Poisson manifold (P, Π) has a symplectic realization (Karasev, Weinstein 87), it does not always permit a leafwise symplectic embedding into a symplectic manifold (M, ω). For regular P, it is seen that such an embedding exists iff the characteristic form-class of Π, a certain element of the third relative cohomology of P, vanishes. A tubular neighborhood of the constraint surface of a general second class constrained system equipped with the Dirac bracket provides a physical example for such an embedding into the original symplectic manifold. In contrast, a leafwise symplectic embedding of e.g. (the maximal regular part of) a Poisson Lie manifold associated to a compact, semisimple Lie algebra does not exist.

Keywords: Poisson geometry; symplectic dual pairs; constraint algebras; second class constraints.
Contents
1. Introduction
2. Preliminaries: Different Types of Constraints
3. Poisson Geometry from Closed Constraint Algebras
3.1. Symplectic dual pairs and related morphisms
3.2. Second-class constraints
3.3. The general case
3.4. Transforming second-class constraints to a first-class system
3.4.1. (M, ω̄) as graph of the constraint map
3.4.2. Extension by a Whitney sum
3.4.3. Possible application and generalization
4. Dirac Brackets and Leaf-Symplectic Embeddings of Poisson Manifolds
4.1. Dirac bracket
4.2. Closed constraint algebra revisited
4.3. Compatible presymplectic forms on neighborhoods of regular leaves
4.4. Leafwise symplectic embeddings of Poisson manifolds
Acknowledgments
References
1. Introduction

Within constrained Hamiltonian mechanics, one is used to the concept of presymplectic manifolds, manifolds equipped with a closed 2-form. Pulling back the symplectic form ω from the original phase space M to the constraint surface C, the resulting 2-form ωC is still closed, but in general no longer nondegenerate. The constraints (functions on M characterizing the constraint surface C as joint preimage of zero) are of "second class" [1] iff ωC happens to be nondegenerate; (C, ωC) is then the reduced phase space of the theory. Otherwise, there are vector fields in the kernel of ωC, which are integrable as a consequence of the Frobenius theorem; their orbits are the "gauge transformations" of the theory. Taking the factor space, provided it is well-defined, yields the reduced (or physical) phase space in this more general case.

Recently there has been increasing interest in Poisson manifolds, in part because of their relation to deformation quantization (cf. e.g. [2–4]) and the interplay of String Theory with noncommutative gauge theories (cf. e.g. [5, 6]). Poisson manifolds are a generalization of symplectic manifolds in a way dual to (but different from) the one of presymplectic manifolds. Instead of defining a symplectic manifold through the existence of a nondegenerate 2-form $\omega = \frac{1}{2}\omega_{ij}\, dx^i \wedge dx^j$, closed due to Jacobi, one could as well define it by its (negative) inverse, i.e. by a nondegenerate bivector field $\Pi = \frac{1}{2}\Pi^{ij}\, \partial_i \wedge \partial_j$, where $\Pi^{ij} = -(\omega^{-1})^{ij} \equiv \{x^i, x^j\}$, the fundamental Poisson brackets between local coordinates. (The reason for the somewhat unconventional minus sign in front of the matrix inverse to $\omega_{ij}$ in the definition of $\Pi^{ij}$ will be commented on later.) Now the Jacobi identity takes the form $[\Pi, \Pi] \equiv -\Pi^{ij}{}_{,s}\, \Pi^{sk}\, \partial_i \wedge \partial_j \wedge \partial_k = 0$. While presymplectic manifolds result from giving up the nondegeneracy of $\omega_{ij}$ (keeping dω = 0), Poisson manifolds result from giving up nondegeneracy of $\Pi^{ij}$ (keeping [Π, Π] = 0).
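A degenerate bivector with [Π, Π] = 0 can be made concrete in a small Python sketch. It uses the linear bivector Π^{ij} = ε_{ijk} x^k on R^3, a standard textbook example that is an assumption here and is not taken from the text above. The check uses the usual component form of the Jacobi identity, the cyclic sum of Π^{il} ∂_l Π^{jk}, and all function names are hypothetical.

```python
from itertools import product


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def poisson_tensor(x):
    """Components Pi^{ij}(x) = eps_{ijk} x^k of the linear bivector on R^3."""
    return [[sum(levi_civita(i, j, k) * x[k] for k in range(3))
             for j in range(3)] for i in range(3)]


def jacobiator(x):
    """Cyclic sum Pi^{il} d_l Pi^{jk} + (cyclic in i, j, k) at the point x."""
    pi = poisson_tensor(x)

    def term(i, j, k):
        # For this linear bivector, d_l Pi^{jk} = eps_{jkl} is constant.
        return sum(pi[i][l] * levi_civita(j, k, l) for l in range(3))

    return {(i, j, k): term(i, j, k) + term(j, k, i) + term(k, i, j)
            for i, j, k in product(range(3), repeat=3)}


def determinant3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def contract_with_gradient_of_r2(x):
    """Pi^{ij} d_j (x.x) = 2 Pi^{ij} x_j for the function x.x."""
    pi = poisson_tensor(x)
    return [sum(pi[i][j] * 2 * x[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    point = (1, -2, 3)  # arbitrary sample point
    print(all(v == 0 for v in jacobiator(point).values()))
    print(determinant3(poisson_tensor(point)))
    print(contract_with_gradient_of_r2(point))
```

In this example the Jacobi condition holds at every point although the matrix Π^{ij} is nowhere invertible, and the function x·x is annihilated by Π, so the bracket restricts to the spheres of constant radius.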
In the degenerate (nonsymplectic) Poisson case, Hamiltonian vector fields $v_f := -df \lrcorner \Pi \equiv \{\cdot, f\}$ no longer span the full tangent space $T_x P$ at any x ∈ P. Still, the respective distribution is integrable (since $[v_f, v_g] = v_{\{g,f\}}$ due to Jacobi,
i.e. due to [Π, Π] = 0), generating a (generalized) foliation of P into symplectic leaves.^a

^a The integral submanifolds (leaves) generated by Hamiltonian vector fields do not in general generate a foliation in the strict sense of the standard definition (called regular foliation in what follows): the leaves may for example have different topology, including their dimension, which equals the rank of $\Pi^{ij}$ at a point on the respective leaf. The "regular" part of a Poisson manifold M is the subset of M where $\Pi^{ij}$ has locally constant rank; it is dense in M (cf. e.g. [3]). Notice, however, that (not necessarily dense) parts of M where $\Pi^{ij}$ has a fixed constant rank are in general still not foliated regularly, as the leaves may lie dense in some higher dimensional submanifolds.

In this paper we demonstrate also that the notion of Poisson manifolds plays some role within the physical scenario of constrained Hamiltonian systems (for early work on Poisson brackets and constrained systems, see e.g. [7]). We show how (also nonsymplectic) Poisson manifolds arise naturally in this context and how they are related to the respective (pre)symplectic manifolds of the system.

The first instance where Poisson manifolds show up is the case that the Poisson algebra of the constraint functions defines a closed Poisson subalgebra. As pointed out recently [8], such a constrained system does not only give rise to a presymplectic manifold (the constraint surface C embedded in the original symplectic manifold) but is equally naturally associated to a Poisson manifold P, P being the image of the original phase space M under the constraint map, endowed with the Poisson algebra of these constraints. In fact, invoking theorems from the mathematics literature, any Poisson manifold P or any presymplectic manifold C (though not any pair (P, C)) can be regarded as arising in this manner (cf. Theorems 1 and 2 below). (In the context of physics, constraints are usually globally defined functions on M and then only any region of P homeomorphic to a subset of $\mathbb{R}^d$, d = dim(P), can be obtained. For most of our considerations, however, we can relax this condition on the constraints.)

Since the constraints form a closed algebra, their Hamiltonian vector fields define an integrable distribution (on all of M, not just on C as one is used to from a general first class constrained system). Assume the quotient of M with respect to these "generalized gauge orbits" gives a well-defined, differentiable manifold Q. Then also Q can be endowed with a natural Poisson bracket; essentially it is just the original Poisson algebra of "gauge invariant" functions on M.

In Sec. 3 of this paper we refine this construction. In particular, we show that P and Q form a so-called symplectic dual pair with respect to the original phase space M and specify conditions (cf. Theorem 3, Corollary 2, and Theorem 4 below) such that the reduced phase space R, of primary physical interest, may be identified with a symplectic leaf of Q. While the first part of this statement, discussed only briefly here, essentially has already been studied in the mathematical literature (up to technical properties which are important to ensure the proper applicability to constrained systems), the
rest is new and proven here in detail because it leads to a new perspective on the issue of constraint reduction. This is the main part of the present paper.

For a closed first-class constrained system (i.e. the origin in P is a symplectic leaf) this implies that we may reverse the two steps in the reduction process: instead of first restricting to a submanifold (which is presymplectic) and then taking the factor space with respect to the gauge orbits, we may as well first take a factor space with respect to generalized gauge orbits, leading to the Poisson manifold Q, and only then take the restriction (in this case to a particular symplectic leaf). Moreover, by these means a closed second-class constrained system may also be viewed similarly to a (general) first-class constrained system. This will be made more explicit in Sec. 3.4 by embedding the original, unconstrained phase space M into some extended phase space $\tilde{M}$, where for any second-class constraint function in M one gets a first-class constraint function in $\tilde{M}$. We will show that upon an appropriate extension, one can arrange for the first-class constraints to Poisson commute, which provides a global realization of abelian conversion [9–12]. This may constitute a significant simplification for the quantization of the original system.

There is also another instance where Poisson manifolds are used in the context of constrained systems. In the case of second-class constraints Φα ≈ 0 (not necessarily closed under taking Poisson brackets), it was Dirac himself who introduced a modified Poisson bracket on the original phase space, the Dirac bracket. This bracket has the feature that the constraints Φα(x) are its Casimir functions (a function C is called a Casimir function of a Poisson tensor Π if $\Pi^{ij}\partial_j C = 0$) and, essentially, that restricting to the constraint surface commutes with taking Poisson brackets.
In other words, one defines a new Poisson bivector ΠD such that one of its symplectic leaves coincides (as a symplectic manifold) with the reduced phase space Φα = 0 of the theory. In fact, such a relation holds even for a whole neighborhood of leaves: the slightly deformed constraint surfaces Φα = cα, for constants cα out of some interval containing zero, endowed with the symplectic form inherited from the embedding original phase space (M, ω), are symplectic leaves of ΠD.

In general ΠD is defined only in some tubular neighborhood S of the constraint surface C. What is the relation between (M, ω) and (S, ΠD)? Clearly, the restriction of the identity map on M to S is not a Poisson map since the respective Poisson brackets coincide only for particular functions. On the other hand, the embedding map from S to M is leafwise symplectic (or leaf-symplectic), i.e. restricted to any leaf of ΠD, the map is symplectic.

In Sec. 4 we consider the question whether a given Poisson manifold (P, ΠP) admits a leaf-symplectic embedding into some symplectic manifold. Locally and around a regular point x of P such a map always exists: in a neighborhood N of x one may choose Casimir–Darboux coordinates $(q^\alpha, p_\beta, C^I)$, I = 1, . . . , k, where k is the corank of Π at x, for which the given bivector has the form $\Pi_P = \frac{\partial}{\partial q^\alpha} \wedge \frac{\partial}{\partial p_\alpha}$ [13]. The respective embedding phase space may be chosen as $(N \times \mathbb{R}^k,\ dp_\alpha \wedge dq^\alpha + dP_I \wedge dC^I)$, where $P_I$ are linear coordinates in $\mathbb{R}^k$ and the embedding corresponds to fixing these "Casimir momenta" to some value.
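The statement that the second-class constraints become Casimir functions of the Dirac bracket can be checked in the simplest linear setting. The sketch below assumes the standard textbook formula {f, g}_D = {f, g} − {f, Φα} (C^{-1})_{αβ} {Φβ, g} with C_{αβ} = {Φα, Φβ}, which is not written out in the text above. It further assumes a flat phase space R^4 with coordinates (q1, q2, p1, p2), the normalization {q_i, p_j} = δ_ij, exactly two linear constraints, and integer or Fraction gradients; all names are hypothetical.

```python
from fractions import Fraction

# Canonical Poisson matrix on R^4 with coordinates x = (q1, q2, p1, p2),
# normalized so that {q_i, p_j} = delta_ij (an illustrative convention).
CANONICAL = [[0, 0, 1, 0],
             [0, 0, 0, 1],
             [-1, 0, 0, 0],
             [0, -1, 0, 0]]


def bracket(pi, a, b):
    """Bracket {a.x, b.x} of two linear functions for a constant bivector pi."""
    n = len(pi)
    return sum(a[i] * pi[i][j] * b[j] for i in range(n) for j in range(n))


def dirac_bivector(pi, phi1, phi2):
    """Dirac bivector for two linear second-class constraints phi1.x, phi2.x."""
    n = len(pi)
    c = bracket(pi, phi1, phi2)  # C_12 = {Phi1, Phi2}
    if c == 0:
        raise ValueError("constraints are not second class")
    # The inverse of [[0, c], [-c, 0]] is [[0, -1/c], [1/c, 0]].
    cinv = [[Fraction(0), Fraction(-1, c)], [Fraction(1, c), Fraction(0)]]
    phis = [phi1, phi2]
    basis = [[1 if k == i else 0 for k in range(n)] for i in range(n)]
    result = [[Fraction(pi[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            for a in range(2):
                for b in range(2):
                    # Subtract {x^i, Phi_a} (C^{-1})_{ab} {Phi_b, x^j}.
                    result[i][j] -= (bracket(pi, basis[i], phis[a])
                                     * cinv[a][b]
                                     * bracket(pi, phis[b], basis[j]))
    return result


if __name__ == "__main__":
    q1, p1 = [1, 0, 0, 0], [0, 0, 1, 0]  # constraints q1 = 0 and p1 = 0
    for row in dirac_bivector(CANONICAL, q1, p1):
        print([int(entry) for entry in row])
```

For the constraints q1 ≈ 0, p1 ≈ 0 the resulting bivector is the canonical bracket in (q2, p2) alone, so both constraints have vanishing Dirac bracket with every coordinate.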
Globally, however, there can be obstructions which may be characterized by a closed 3-form on P , the so-called form-class [14] of the bivector ΠP . Considered as an element of the third relative cohomology of a regular Poisson manifold (P, ΠP ), its vanishing is necessary and sufficient for the existence of a leaf-symplectic embedding (cf. Proposition 11 below). The condition on the form-class may, furthermore, be cast into the form of descent equations, familiar to physicists from the analysis of anomalies (cf. e.g. [15]) and, more recently, also from the cohomological deformation of action functionals [16]. In this manner we will find, for example, that a family of coadjoint orbits of a compact, semisimple Lie algebra (viewed as Lie Poisson manifold) does not permit a leaf-symplectic embedding. On the other hand, any regularly-foliated Poisson manifold whose leaves have trivial second cohomology does so. Thus while there always exists a surjective Poisson map from a symplectic manifold to any given Poisson manifold [17, 18] (i.e. a so-called symplectic realization, cf. Theorem 1 below), and likewise always a coisotropic embedding of (regular) presymplectic manifolds into a symplectic manifold [19, 20] (Theorem 2 below), a leaf-symplectic embedding of a given Poisson manifold exists only in particular cases. Both relations of (pre)symplectic and Poisson manifolds studied in this paper have physical applications: the information in the symplectic dual pairs, and in particular the reformulation of a second-class constrained system with closed algebra as an abelian first-class system, can be used, for example, in a path integral quantization, the prime example possibly being some Yang–Mills gauge theory where the constraint algebra is spoiled by an anomaly (cf. also [9, 11, 21] for related work). The existence of leaf-symplectic maps for a regular Poisson manifold leads to solutions [22, 23] of the associated Poisson Sigma model [24, 25] on a Riemann surface. 
Our intention is to keep the paper at a level which we hope is accessible to both mathematicians and physicists. Before describing the new results we set the stage in Sec. 2, recalling the basic definitions of different types of submanifolds in a symplectic manifold, in particular the definitions of first and second class constraint surfaces going back to Dirac. On this occasion we also suggest generalizations of these notions to submanifolds of presymplectic and Poisson manifolds (Definition 2). These definitions can, for example, be used for systems like the Plebanski action of gravity which are formulated in such a way that some constraints do not only restrict the phase space coordinates but also multipliers of other constraints [26].

2. Preliminaries: Different Types of Constraints

We start with some general remarks on constrained systems. Let us consider a phase space, i.e. a symplectic manifold, (M, ω) in which we single out a submanifold C, called the constraint surface. We assume that this submanifold can be characterized as the intersection of the zero level set of $\{\Phi^\alpha \in C^\infty(M),\ \alpha = 1, \ldots, d\}$, the
constraint functions or simply the constraints. We further assume that the constraints are regular and irreducible, which means that $\bigwedge_{\alpha=1}^{d} d\Phi^\alpha$ does not vanish on C. The Hamiltonian of the system Poisson commutes with all the Φα, i.e. the set of Φα's is already the total set of constraints; we are not interested in a splitting into "primary" and "secondary" constraints [1] as it arises when starting from different Lagrangian systems, eventually leading to the same constrained Hamiltonian system. We will, however, use the notion of first class and second class constraints:

Definition 1. A constraint function Φα of a constrained system C ↪ (M, ω) is of the first (second) class if its Hamiltonian vector field $v^\alpha \equiv \{\cdot, \Phi^\alpha\}$ is (is nowhere) tangent to C. The full set of constraint functions {Φα}, such that $C = \bigcap_\alpha (\Phi^\alpha)^{-1}(0)$, is of the first (second) class if each single constraint function Φα is of the first (second) class.

In other words, Φα is of first (second) class iff $v^\alpha$ lies (does not lie) in ι∗TC for all points on the constraint surface C ⊂ M, ι: C → M denoting the respective embedding map. In general a constraint will be neither of first- nor second-class, but of "mixed type", and a splitting of constraints, characterizing a given constraint surface C ⊂ M, into first- and second-class constraints can be achieved only locally.

The definition as given above is readily seen to reproduce the one given by Dirac [1] (use $\{\Phi^\alpha, \Phi^\beta\}|_C \equiv v^\beta(\Phi^\alpha)$ or cf. Proposition 1 below): a system of constraints {Φα} is of the first-class iff $\{\Phi^\alpha, \Phi^\beta\}|_C = 0$ ∀α, β, and of the second-class iff $\det\{\Phi^\alpha, \Phi^\beta\}|_C \neq 0$.

In the more mathematically inclined literature, the above two types of constraint surfaces are characterized in a different way, without any reference to the constraint functions. First note that the pullback bundle ι∗TM (or TM|C) is a symplectic vector bundle [27]^b over C and ι∗TC (or simply TC) a subbundle thereof.
This subbundle induces a symplectically orthogonal subbundle

$$(TC)^\perp := \{ v \in TM|_C : \omega(v, w) = 0 \text{ for all } w \in TC \} \qquad (1)$$

which is related to the annihilator of TC in M (or the "conormal bundle" of C),

$$\mathrm{Ann}_M(TC) = \{ \alpha \in T^*M|_C : \alpha(v) = 0 \text{ for all } v \in TC \}, \qquad (2)$$

in the following way. Denoting the map from a vector space V to its dual V∗ induced by a bilinear form B on V as $B^\sharp$, v ↦ B(v, ·), the symplectic form ω ∈ Ω²(M) defines a bijection $\omega^\sharp : TM \to T^*M$. Denote by $\Pi^\sharp : T^*M \to TM$ the transpose inverse map, $\Pi^\sharp := (\omega^\sharp)^{-1,T} \equiv -(\omega^\sharp)^{-1}$, which implicitly defines a bivector field Π.^c We now find $\Pi^\sharp(\mathrm{Ann}_M(TC)) = (TC)^\perp$ or $\mathrm{Ann}_M(TC) = \omega^\sharp((TC)^\perp)$: the inclusion is established by noting that $\alpha \in \mathrm{Ann}_M(TC)$ implies $\omega(\Pi^\sharp \alpha, w) = -\alpha(w) = 0$ for any w ∈ TC, while, vice versa, $\omega(v, w) \equiv \omega^\sharp v(w) = 0$ for all w ∈ TC implies $\omega^\sharp (TC)^\perp \subset \mathrm{Ann}_M(TC)$.

Clearly, $\mathrm{Ann}_M(TC) = \langle d\Phi^\alpha \rangle$ since $d\Phi^\alpha(w) = w(\Phi^\alpha)$ vanishes for any tangent vector w ∈ TC. With $v^\alpha = -d\Phi^\alpha \lrcorner \Pi \equiv -\Pi^\sharp d\Phi^\alpha$, we similarly recognize $(TC)^\perp$ as the span of the Hamiltonian vector fields generated by the constraints: $(TC)^\perp = \langle v^\alpha \rangle$. We now have several equivalent characterizations of first- and second-class constrained systems:

Proposition 1. For a constrained system C ↪ (M, ω) the statements in part (i) and (ii), respectively, are equivalent to one another:
(i) (a) C is first class.
    (b) $(TC)^\perp \subset TC$.
    (c) $\Pi|_{\mathrm{Ann}_M(TC)} = 0$, i.e. Π(α, β) = 0 for all $\alpha, \beta \in \mathrm{Ann}_M(TC)$.
    (d) The embedding ι: C → M is coisotropic, i.e. $(TC)^\perp$ is isotropic ($\omega|_{(TC)^\perp} = 0$).
(ii) (a) C is second class.
    (b) $(TC)^\perp \cap TC = 0$.
    (c) $\Pi|_{\mathrm{Ann}_M(TC)}$ is nondegenerate.
    (d) $\omega|_{(TC)^\perp}$ is nondegenerate.
    (e) C is symplectic, i.e. ι∗ω (or $\omega|_{TC}$) is nondegenerate.
In (ii)(b), 0 denotes the zero section.

^b Let us call a vector bundle endowed with a differentiable field of bilinear antisymmetric forms on its fibers a presymplectic vector bundle. If this bilinear form is nondegenerate, the bundle is symplectic [27]. Note that both the tangent bundle TM of a presymplectic manifold (M, ω) and the cotangent bundle T∗M of a Poisson manifold (M, Π) are presymplectic vector bundles over M in this terminology. In this context it is worthwhile mentioning a possible unification of the two apparently different generalizations of symplectic manifolds in terms of so-called Dirac structures [28, 29]. Here one starts from TM ⊕ T∗M, endowed with the canonical symmetric bilinear form from the pairing, and looks for certain maximally isotropic subbundles. (By definition a subbundle in a vector bundle equipped with a nondegenerate symmetric or antisymmetric bilinear form κ on its fibers is isotropic if all of its sections are mutually orthogonal with respect to κ.) Although it may be worthwhile to explore the considerations of the present paper within this more general framework of Dirac structures, we will not attempt to do so here.

^c Given an arbitrary, fiberwise nondegenerate bilinear form ω on TM, $\omega^\sharp$ defines a bijection which may be lifted to the full tensor bundle over M and which thus allows one to identify different types (covariant, contravariant) of tensors ("lowering and raising of indices"). If we want Π ∈ Γ(TM ⊗ TM) to be the same object as ω ∈ Γ(T∗M ⊗ T∗M) in this identification, $\Pi = (\omega^\sharp)^{-1} \otimes (\omega^\sharp)^{-1}(\omega)$, we find $\Pi^\sharp = (\omega^\sharp)^{-1,T}$. (Pseudo-)Riemannian geometry, where the bilinear form ω ≡ g is symmetric, is special in two ways: the transpose is irrelevant, $g^{ij}$ is the inverse to $g_{ij}$, and, maybe more important, symmetry of the bilinear form is the only case where index transport commutes with contraction of tensors. In the antisymmetric case considered here, we thus find the additional minus sign in the definition of $\Pi^{ij}$; moreover, now e.g. $A_i B^i = -A^i B_i$. (See also Appendix B of [30] for a discussion.)
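Dirac's criterion, with first class meaning that all brackets {Φα, Φβ} vanish on C and second class meaning that their matrix has nonzero determinant, is easy to test for linear constraints, whose brackets are constant. The following Python sketch assumes a flat phase space R^4 with coordinates (q1, q2, p1, p2) and the normalization {q_i, p_j} = δ_ij; the function names and the label "mixed" for the remaining case are illustrative choices, not the paper's notation.

```python
# Canonical Poisson matrix on R^4 with coordinates x = (q1, q2, p1, p2),
# normalized so that {q_i, p_j} = delta_ij (an illustrative convention).
CANONICAL = [[0, 0, 1, 0],
             [0, 0, 0, 1],
             [-1, 0, 0, 0],
             [0, -1, 0, 0]]


def bracket(a, b, pi=CANONICAL):
    """Bracket {a.x, b.x} of two linear functions with gradients a and b."""
    n = len(pi)
    return sum(a[i] * pi[i][j] * b[j] for i in range(n) for j in range(n))


def constraint_matrix(gradients, pi=CANONICAL):
    """Matrix of brackets {Phi_a, Phi_b} for linear constraints Phi_a = a.x."""
    return [[bracket(a, b, pi) for b in gradients] for a in gradients]


def determinant(m):
    """Determinant by cofactor expansion (adequate for a handful of constraints)."""
    if not m:
        return 1
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * determinant(minor)
    return total


def classify(gradients, pi=CANONICAL):
    """Return 'first class', 'second class' or 'mixed' for linear constraints."""
    m = constraint_matrix(gradients, pi)
    if all(entry == 0 for row in m for entry in row):
        return "first class"
    if determinant(m) != 0:
        return "second class"
    return "mixed"


if __name__ == "__main__":
    q1, p1, p2 = [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
    print(classify([p1, p2]))      # commuting momenta
    print(classify([q1, p1]))      # a conjugate pair
    print(classify([q1, p1, p2]))  # neither of the two pure types
```

The third example illustrates the remark after Definition 1 that a general set of constraints is of mixed type: an odd number of constraints can never be second class, since an antisymmetric matrix of odd size has vanishing determinant.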
November 4, 2003 10:45 WSPC/148-RMP
670
00176
M. Bojowald & T. Strobl
Proof. Equivalence of (a) and (b) in (i) and (ii) follows immediately from Definition 1 with the preceding considerations. Likewise for (c) and (d), if we note that $\omega(v^\alpha, v^\beta) = \Pi(d\Phi^\alpha, d\Phi^\beta)$; since, moreover, the right-hand side of this equation is equal to $\{\Phi^\alpha, \Phi^\beta\}|_C$, (c) in (i) and (ii) is recognized as the original definition of Dirac, formulated, however, without the use of constraint functions $\Phi^\alpha$ used to specify the embedded constraint surface $C$. Concerning (i) it is now sufficient to establish equivalence between (b) and (d): $(TC)^\perp \subset TC$ by definition implies isotropy of $(TC)^\perp$, and isotropy of $(TC)^\perp$ yields $\mathrm{Ann}_M(TC)((TC)^\perp) = [\omega^\sharp((TC)^\perp)]((TC)^\perp) = 0$. Concerning (ii) we next show (b) $\Leftrightarrow$ (e). Let us assume that $\iota^*\omega$ is degenerate, which implies that there is a point $p \in C$ which has a nonzero tangent vector $v \in T_pC$ such that $\iota^*\omega(v, w) = 0$ for all $w \in T_pC$. This is in contradiction to $TC \cap (TC)^\perp = 0$. Likewise, if there is an isotropic tangent vector, i.e. a nonzero vector which is contained in both $TC$ and $(TC)^\perp$, $\iota^*\omega$ is obviously degenerate. Finally, equivalence between (b) and (d) in (ii) may be shown likewise by noting that $TC = ((TC)^\perp)^\perp$, so that we may replace $TC$ by $(TC)^\perp$ in the preceding argument.

Since we are interested in both presymplectic and Poisson manifolds as two different generalizations of symplectic manifolds, let us in the following suggest a generalization of the notions of first-class and second-class submanifolds to these cases, ensuring agreement on their symplectic intersection. In this context we want to keep the following characteristics of first- and second-class constraint surfaces $C$. It should not be possible that $C$ is simultaneously first- and second-class (Dirac's classification is not exhaustive, but at least it should remain exclusive). Moreover, a second-class constraint surface is a (nondegenerate) phase space of its own ($C$ carries a natural symplectic structure), and a first-class constraint surface is a phase space after factoring out gauge transformations which are generated by the flow of the constraints (in both cases, first- and second-class, this defines the reduced phase space). (For presymplectic manifolds the flow generated by a constraint is not uniquely defined. However, the ambiguity is the kernel of $\omega \in \Omega^2(M)$, which we thus want to include into the generators of the "flow of gauge transformations". This is achieved in the definition below.) On the other hand, at least for Poisson manifolds there already exists a reasonable notion of coisotropic submanifolds (cf. e.g. [3]). In this way we arrive at the following definition:

Definition 2. (i) Let $C$ be a (closed) submanifold of a presymplectic manifold $(M, \omega)$.
    (a) $C$ is called coisotropic if $(TC)^\perp$, as defined in (1), is isotropic,
    (b) it is called first-class if $(TC)^\perp \subset TC$, and
    (c) it is called second-class if $TC \cap (TC)^\perp = 0$.
Poisson Geometry in Constrained Systems
(ii) A (closed) submanifold $C$ of a Poisson manifold $(M, \Pi)$ is called coisotropic if $\mathrm{Ann}_M(TC)$ is isotropic, i.e. $\Pi|_{\mathrm{Ann}_M(TC)} = 0$.
(iii) Let $C$ be a (closed) submanifold of a Poisson manifold $(M, \Pi)$ such that $\iota_* TC \subset \Pi^\sharp(T^*M)$. $C$ is called
    (a) first-class if $0 \neq \Pi^\sharp \mathrm{Ann}_M(TC)|_x \subset T_xC$ for any point $x \in C$, and
    (b) second-class if $\Pi^\sharp \mathrm{Ann}_M(TC) \cap TC = 0$.

Using the fact that in the symplectic case $\Pi^\sharp$ defines an isomorphism between $\mathrm{Ann}_M(TC)$ and $(TC)^\perp$, it is easy to see that the respective definitions for presymplectic and Poisson manifolds coincide if $M$ is symplectic. In particular, the additional condition in part (iii) is always true in this case due to $\Pi^\sharp(T^*M) = TM$.

Remark. Conditions (a) and (b) in part (i) are not equivalent to one another if $(M, \omega)$ is not symplectic. In particular, there may be coisotropic second-class submanifolds (iff $C$ is a symplectic submanifold of $M$ with $(TC)^\perp = \ker \omega$). On the other hand, in the Poisson case, $\Pi|_{\mathrm{Ann}_M(TC)} = 0$ is equivalent to $\Pi^\sharp \mathrm{Ann}_M(TC) \subset TC$. Nonetheless, a coisotropic submanifold of a Poisson manifold is not necessarily first-class because the additional condition in part (iii) may be violated. In both cases, Poisson and presymplectic, a first-class submanifold $C$ is coisotropic, but in general not vice versa.

It follows as in previous considerations that a first-class submanifold of a presymplectic manifold $(M, \omega)$, with the pulled back 2-form $\omega_C = \iota^*\omega$, is degenerate, i.e. the kernel has to be factored out to arrive at a reduced phase space, whereas a second-class submanifold is always symplectic. If we have a submanifold $C$ of a Poisson manifold $(M, \Pi_M)$, on the other hand, it is in general not possible to even define a Poisson or presymplectic structure on $C$ directly since we cannot use the pull back as in the (pre)symplectic case. However, one of the instances where this is possible occurs if the annihilator is a Poisson ideal:

$$\Pi_M^\sharp(\mathrm{Ann}_M(TC)) = 0, \qquad (3)$$

i.e. $\Pi_M(\alpha, \cdot) = 0$ for all $\alpha \in \mathrm{Ann}_M(TC)$. Using the isomorphism

$$i: T^*C \to T^*M|_C / \mathrm{Ann}_M(TC), \quad \alpha \mapsto [\alpha_M] = \alpha_M + \mathrm{Ann}_M(TC),$$

where $\alpha_M$ is some element of $T^*M|_C$ fulfilling $\alpha_M(v) = \alpha(v)$ for all $v \in TC \subset TM$,

$$\Pi_C(\alpha, \beta) := \Pi_M([\alpha_M], [\beta_M])|_C \qquad (4)$$

leads to a well-defined Poisson tensor $\Pi_C$ on $C$ if (3) is satisfied. Note that the condition (3) is stronger than the one required for coisotropic submanifolds, which only requires $\Pi^\sharp \mathrm{Ann}_M(TC) \subset TC$; correspondingly a general coisotropic submanifold of a Poisson manifold does not carry a canonical Poisson (or presymplectic) structure. The (coisotropic) submanifolds $C$ of smallest possible dimension fulfilling (3) are the symplectic leaves; in this case $\mathrm{Ann}_M(TC) = \ker \Pi^\sharp$ and $\Pi_C$ becomes nondegenerate. The symplectic form $\omega_C$ on the leaf $C$ is then defined by $(\omega_C)^\sharp = ((\Pi_C)^\sharp)^{-1}$.
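Condition (3) and the statement about symplectic leaves can be illustrated on a standard example that is not taken from the text: the linear Poisson structure $\Pi^{ij}(x) = \epsilon_{ijk} x_k$ on $\mathbb{R}^3$, whose leaves away from the origin are the spheres $|x| = \text{const}$. The sketch below assumes the convention $(\Pi^\sharp \alpha)^j = \alpha_i \Pi^{ij}$ (the opposite contraction changes only a sign) and uses hypothetical helper names. It checks that the annihilator of the sphere, spanned by $x_i\,dx^i$, is mapped to zero, that the rank of $\Pi$ equals the dimension of the sphere, and that $\Pi^\sharp$ of a generic covector is tangent to the sphere.

```python
from fractions import Fraction


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def poisson_tensor(x):
    """Pi^{ij}(x) = eps_{ijk} x_k, an illustrative linear Poisson structure."""
    return [[sum(levi_civita(i, j, k) * x[k] for k in range(3))
             for j in range(3)] for i in range(3)]


def sharp(pi, alpha):
    """Assumed convention: (Pi^sharp alpha)^j = alpha_i Pi^{ij}."""
    return [sum(alpha[i] * pi[i][j] for i in range(3)) for j in range(3)]


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
    return r


x = (1, 2, 2)                      # a point on the sphere of radius 3
pi_x = poisson_tensor(x)
annihilator_generator = list(x)    # components of d(|x|^2 / 2) at x

print(sharp(pi_x, annihilator_generator))   # condition (3): the zero vector
print(rank(pi_x))                            # equals dim of the sphere
tangent = sharp(pi_x, [1, 0, 0])
print(sum(a * b for a, b in zip(x, tangent)))  # tangent to the sphere
```

Here the annihilator coincides with $\ker \Pi^\sharp$, in line with the remark that the symplectic leaves are the submanifolds of smallest dimension fulfilling (3).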
The additional condition $\iota_* TC \subset \Pi^\sharp(T^*M)$ in part (iii) of Definition 2 ensures that $C$ is a subset of a symplectic leaf in $(M, \Pi)$. It is then always possible to define a symplectic structure on a first-class or second-class submanifold $C$ by pulling back the symplectic structure on the leaf to $C$. In this way, a first-class submanifold acquires a degenerate presymplectic structure, whereas a second-class submanifold always is symplectic.

In the following section we focus on closed constraint algebras, i.e. on a constrained system $(M, \omega, \{\Phi^\alpha\})$ in which the constraints $\Phi^\alpha = 0$ generate a Poisson subalgebra. This implies that

$$\{\Phi^\alpha, \Phi^\beta\}_M(x) = \Pi_P^{\alpha\beta}(\Phi(x)) \qquad (5)$$

holds for some $\Pi_P^{\alpha\beta}(\Phi)$. In the context of (5) the constraint surface $C$ is first-class iff $\Pi_P(0) = 0$ and second-class iff $\det \Pi_P(0) \neq 0$. For simplicity we will take $M$ to be connected in the following and require that the constraint functions $\Phi^\alpha$ can be used as part of a local coordinate system around any point in $M$ (instead of just points in $C \subset M$, which follows from the regularity and irreducibility of the constraints required in the beginning of this section).

3. Poisson Geometry from Closed Constraint Algebras

In this section we present results on the geometry of closed constraint algebras and the relation to the reduced phase space. This then yields an alternative procedure for a constraint reduction.

3.1. Symplectic dual pairs and related morphisms

Define $P$ as the subset of $\mathbb{R}^d$ with coordinates $\Phi^\alpha$ which is in the image of $M$ under the constraint map $\phi: M \to P \subset \mathbb{R}^d$, $x \mapsto \Phi^\alpha(x)$, and endow $P$ with the bivector $\Pi_P = \frac{1}{2} \Pi_P^{\alpha\beta}(\Phi)\, \partial_\alpha \wedge \partial_\beta$ appearing in the closed constraint algebra (5). In this way we have

$$(C, \omega_C) \overset{\iota}{\hookrightarrow} (M, \omega) \overset{\phi}{\longrightarrow} (P, \Pi_P). \qquad (6)$$
Both arrows in this diagram are morphisms. However, there are two different categories involved: $C$ endowed with $\omega_C = \iota^*\omega$ is a presymplectic manifold (as before $\iota$ denotes the embedding of the constraint surface $C$ into $M$); note that in the present case the dimension of the kernel of $\omega_C$ is necessarily constant on $C$, and in the following we will include this condition on the kernel of the 2-form in the definition of a presymplectic manifold unless stated otherwise. $(P, \Pi_P)$, on the other hand, is Poisson. Correspondingly, $\iota: C \to M$ is a morphism of presymplectic manifolds and $\phi: M \to P$ a morphism of Poisson manifolds, where, in the first instance, $(M, \omega)$ is regarded as a (nondegenerate) presymplectic manifold and, in the second case, as a (nondegenerate) Poisson manifold.

Definition 3. A morphism $f$ between two (pre)symplectic manifolds $(M_1, \omega_1)$ and $(M_2, \omega_2)$, a (pre)symplectic map, is a map $f: M_1 \to M_2$ such that $f^*\omega_2 = \omega_1$. A
morphism $g$ between two Poisson manifolds $(M_1, \Pi_1)$ and $(M_2, \Pi_2)$, a Poisson map, is a map $g: M_1 \to M_2$ such that $g_*\Pi_1 = \Pi_2$.

An alternative characterization of a Poisson map makes use of the definition of coisotropy (Definition 2) and the notion of the graph $\Gamma_g = \{(m_1, g(m_1)) : m_1 \in M_1\} \subset M_1 \times M_2$ of a map $g: M_1 \to M_2$:

Lemma 1. [3, 31] A map $g: M_1 \to M_2$ between Poisson manifolds is a Poisson map if and only if its graph $\Gamma_g$ is coisotropic in $M_1 \times \bar{M}_2$, where $\bar{M}_2$ has the negative Poisson structure of $M_2$.

Equivalently a Poisson map $g$ may be characterized by $g^*\{F, G\}_{M_2} = \{g^*F, g^*G\}_{M_1}$ for all $F, G \in C^\infty(M_2)$. (Warning: In general (and in particular always for (6) in the instance of symplectic $C$ and $P$, as we will verify explicitly in Sec. 3.2 below) a Poisson map (a (pre)symplectic map) between two symplectic manifolds is not symplectic (Poisson).) Thus by construction $\phi: M \to P$ is Poisson (cf. e.g. (5)) and found to provide a surjective, submersive symplectic realization of $P$:
Definition 4. A symplectic realization of a Poisson manifold $P$ is a Poisson map $\phi: M \to P$ where $M$ is a symplectic manifold.

For our mathematical considerations we will drop the condition that $P \subset \mathbb{R}^d$ with $d = \dim P$. We then have to single out a point $0 \in P$ in order to define the constraint surface $C = \phi^{-1}(0)$. As the following theorem shows, any Poisson manifold $P$ can be obtained in the above way:

Theorem 1. (Karasev, Weinstein [17, 18]) Every Poisson manifold has a surjective, submersive symplectic realization.

In a sense dual to this observation is the following embedding theorem for a presymplectic manifold:

Theorem 2. (Gotay [19]) Every presymplectic manifold $(M, \omega)$ can be coisotropically embedded into a symplectic manifold. This extension of $M$ is unique up to a local symplectomorphism in a neighborhood of $M$.

Later in this paper (Sec. 4), we will address the question as to whether there is an analogous, rather than dual, result for Poisson manifolds, namely whether any Poisson manifold $P$ can be embedded into a symplectic manifold $M$ such that the embedding becomes symplectic upon restriction to any leaf of $P$. We may also compare this with the embeddings considered in the previous section (cf. in particular Definition 2): the embedding $\iota$ of a second-class submanifold $C$ into a Poisson manifold cannot be Poisson. Indeed, in order to be able to define some $\Pi_C \in \Gamma(\Lambda^2 TC)$ such that $\iota$ is Poisson ($\iota_*\Pi_C = \Pi_M$), it is necessary and sufficient that $\Pi_M^\sharp(T^*M)|_C \subset TC$, which is easily recognized as the condition (3), and which obviously can be defined only for $\dim C \geq \mathrm{rank}\, \Pi_M^\sharp$ (while for a
second-class submanifold always $\dim C \leq \mathrm{rank}\, \Pi_M^\sharp$, equality holding for $C$ being a symplectic leaf). Nevertheless $C$ inherits a natural (nondegenerate) Poisson structure as a submanifold of a symplectic leaf. It may be quite interesting to reconsider (and possibly generalize) the (embedding) maps discussed in Definition 2 and Sec. 4 within the more general and unifying framework of Dirac structures (cf. footnote b).

In the context of constrained systems $(M, \omega)$ with a closed constraint algebra (5) there is, under appropriate regularity conditions, also another canonical Poisson manifold $Q$ associated to it: define an equivalence relation on $M$ by calling two points equivalent if they lie on the same orbit generated by the Hamiltonian vector fields $v^\alpha = \{\cdot, \Phi^\alpha\}$ of the constraints. Assume that $Q := M/\!\sim$ is a differentiable manifold (for a case where this condition is violated cf. e.g. Example 2 below) and denote the respective projection map from $M$ to $Q$ by $\pi$. Then $Q$ may be equipped with a Poisson bracket by $\pi^*\{f, g\}_Q := \{\pi^* f, \pi^* g\}_M$ for all $f, g \in C^\infty(Q)$. This indeed provides a well-defined bracket on $Q$ since, due to the Jacobi identity for the Poisson bracket on $M$, the right-hand side is a function in the kernel of all the $v^\alpha$ and thus it can be written as the pull back of a function on $Q$. With this choice for the bracket, $(Q, \Pi_Q)$ is a Poisson quotient of $M$, i.e. the projection map $\pi: M \to Q$ is Poisson. The two manifolds $Q$ and $P$ are certainly not unrelated. In fact we have

Proposition 2. With the assumptions above the two Poisson manifolds $Q$ and $P$ form a symplectic dual pair with respect to the original symplectic manifold $M$, in which the Poisson map $\pi: M \to Q$ is complete.

Definition 5. [3] Two Poisson manifolds $(P, \Pi_P)$ and $(Q, \Pi_Q)$ form a symplectic dual pair with respect to a symplectic manifold $(M, \omega)$ if there are two Poisson maps $\phi: (M, \omega) \to (P, \Pi_P)$, $\pi: (M, \omega) \to (Q, \Pi_Q)$ with symplectically orthogonal fibers.

$$\begin{array}{ccc} & (M, \omega) & \\ {}^{\phi}\swarrow & & \searrow{}^{\pi} \\ (P, \Pi_P) & & (Q, \Pi_Q) \end{array} \qquad (7)$$
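The orthogonality requirement in Definition 5 can be made concrete in a linear toy model, which is an assumption of this sketch and not an example from the text: $M = \mathbb{R}^4$ with coordinates $(q_1, q_2, p_1, p_2)$, canonical bracket, and constraints $\Phi^1 = q_1 + q_2$, $\Phi^2 = p_2$. The functions $q_1$ and $p_1 - p_2$, found by hand, are invariant under the constraint flows and serve as coordinates on $Q$. For linear maps the fibers of $\phi$ and $\pi$ are symplectically orthogonal precisely when all cross brackets vanish, which is what the hypothetical helper `fibers_orthogonal` tests.

```python
def canonical_bracket(a, b):
    """Bracket of linear functions a.x and b.x on R^{2n}.

    Assumption: coordinates ordered (q_1..q_n, p_1..p_n), {q_i, p_j} = delta_ij.
    """
    n = len(a) // 2
    return sum(a[i] * b[n + i] - a[n + i] * b[i] for i in range(n))


def bracket_matrix(functions):
    """Matrix of mutual brackets, i.e. the induced Poisson tensor."""
    return [[canonical_bracket(a, b) for b in functions] for a in functions]


def fibers_orthogonal(p_coordinates, q_coordinates):
    """True if every coordinate pulled back from P commutes with every
    coordinate pulled back from Q."""
    return all(canonical_bracket(g, f) == 0
               for g in p_coordinates for f in q_coordinates)


# Coordinates (q1, q2, p1, p2).
PHI_1 = (1, 1, 0, 0)    # q1 + q2
PHI_2 = (0, 0, 0, 1)    # p2
F_1 = (1, 0, 0, 0)      # q1        (invariant under both constraint flows)
F_2 = (0, 0, 1, -1)     # p1 - p2   (invariant under both constraint flows)

print(fibers_orthogonal([PHI_1, PHI_2], [F_1, F_2]))
print(bracket_matrix([PHI_1, PHI_2]))   # Poisson tensor induced on P
print(bracket_matrix([F_1, F_2]))       # Poisson tensor induced on Q
```

In this toy model both induced tensors are nondegenerate, so it falls into the second-class situation treated in Sec. 3.2.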
Definition 6. A Poisson map $\pi: M \to Q$ is complete if the Hamiltonian vector field $X_{\pi^* h} = \{\cdot, \pi^* h\}$ generated by $\pi^* h$ in $M$ is complete for any function $h$ of compact support on $Q$.

Proof. To prove the first part of the above Proposition (which has been studied in the context of Libermann foliations [32]) it is only necessary to note that the fibers are symplectically orthogonal iff the pull back of any function on $P$ (i.e. a function of the constraints) Poisson commutes with the pull back of any function on $Q = M/\!\sim$, i.e. a function $f$ satisfying $v^\alpha f = 0$ for all $1 \leq \alpha \leq d$. Any function
$g \in \phi^* C^\infty(P)$ can be expressed as a function of the $\Phi^\alpha$ only, for which we have $\{\Phi^\alpha, f\} = -v^\alpha f = 0$, proving $\{g, f\} = 0$ for any $f \in \pi^* C^\infty(Q)$, $g \in \phi^* C^\infty(P)$ and so symplectic orthogonality of the $\pi$- and $\phi$-fibers.

The completeness of the projection map $\pi$ may be shown by adapting the proof of Proposition 6.6 in [3]: let $h$ be a function of compact support on $Q$, which implies that $X_h = \{\cdot, h\}$ has a complete integral curve through some given point $q \in Q$. We now assume that the integral curve of $X_{\pi^* h}$ through any point $m \in \pi^{-1}(q)$ is not complete, i.e. there is (without loss of generality) a maximal parameter $t_+$ beyond which the curve cannot be defined. Through the projection to $Q$ by $\pi$ of the final point $m_+$ belonging to $t_+$, there is, however, an extension of the integral curve which can be lifted to $M$ to a curve through some point $m' \in \pi^{-1}(\pi(m_+))$ (not necessarily identical to $m_+$). Using the action generated by the constraints $\Phi^\alpha$ on $M$, by which we defined the equivalence relation $\sim$, we can construct local symplectomorphisms $s$ from a neighborhood of $m'$ to a neighborhood of $m_+$ such that $s(m') = m_+$ (e.g. by following integral curves of the $v^\alpha$ in a chosen ordering, where the parameter lengths of all the integral curves are determined so as to fulfill $s(m') = m_+$ and are the same for all points in the neighborhood of $m'$). Since $X_{\pi^* h}$ is by definition invariant under the flow, the local symplectomorphism transports a piece of its integral curve through $m'$ to a piece of the integral curve through $m_+$. This is a contradiction to our assumption that the original integral curve through $m$ is inextendible beyond $m_+$, which proves completeness of the map $\pi$.

The orthogonality in the above proof could have been inferred also from the results of Sec. 2, where the span of the vector fields $v^\alpha$ generating the orbits (the fibers of the map $\pi$) was identified with $(TC)^\perp$, the symplectic orthogonal of $TC$, which in turn is the tangent bundle to a (generalized) "constraint surface" $C$ defined by the pre-image of some point in $P$ (the fibers of the map $\phi$). Note in this context that orthogonality of the fibers does not imply that they are transversal to one another (this is only the case for second-class constraints). In particular, for first-class constraints the fibers of $\pi$ are even submanifolds of the fibers of $\phi$.

In the rest of this section we want to clarify the relation between the factor space $Q$ and the reduced phase space $R$. For instance, if $P$ is symplectic, so that the constraints $\Phi^\alpha$ are second class, $R$ coincides just with $(C, \omega_C)$ in (6). We then will show that, under some rather mild conditions specified below, $R$ and $Q$ are symplectomorphic. (Note that $P$ symplectic implies $Q$ symplectic because their pre-images in the symplectic manifold $M$ are symplectically orthogonal due to Proposition 2.) Thus in this case we can trade in the standard procedure of reducing the corresponding second-class constrained system on $M$ by restriction to the constraint surface $C$ for taking the factor space of $M$ with respect to the flow generated by the second-class constraints. This second approach is in its spirit more closely related to the (in physical systems generically better understood) reduction procedure used for constrained systems with purely first-class constraints. In fact, as we will show, $M$ may be regarded as a constraint surface in a higher dimensional (extended)
symplectic manifold $\tilde{M}$ characterized by first-class constraints, the orbits of which on $M \hookrightarrow \tilde{M}$ coincide with the original orbits of $v^\alpha$ on $M$. In the following we will first focus on this case of a symplectic target $(P, \Pi_P)$ (corresponding to second-class constraints) before we then consider the general case.

3.2. Second-class constraints

By assumption, $(P, \Pi_P)$ is symplectic (we denote the symplectic form inverse to $\Pi_P$ by $\Omega_P$ in the following). Then any Poisson map $\phi: M \to P$ is a submersion [3]. If, in addition, $\phi$ is complete, we have

Lemma 2. Let $\phi$ be a complete Poisson map from a Poisson manifold $M$ to a symplectic manifold $(P, \Omega_P)$. Then $M$ is a fiber bundle over $P$ with projection map $\phi$ which is endowed with a natural flat connection.

The proof [3] proceeds by showing that any vector $v \in TP$ can be lifted horizontally to a vector in $TM$ by pulling back the covector $v \lrcorner \Omega_P$. In this way one obtains a connection which is flat since the Lie bracket of two horizontal vector fields is again horizontal. To see this, one uses canonical coordinates on $P$ and completeness of the map $\phi$. In our situation of regular second-class constraints, we have

Theorem 3. Let $\Phi^\alpha$, $1 \leq \alpha \leq d$, be regular constraint functions with nonempty constraint surface $C = \phi^{-1}(0)$ on a connected symplectic manifold $(M, \omega)$ with a complete Poisson map $\phi: (M, \omega) \to (P, \Pi_P)$, $x \mapsto \Phi^\alpha(x)$, such that the Poisson manifold $(P, \Pi_P)$ is symplectic. If $Q = M/\!\sim$, which is symplectically dual to $(P, \Pi_P)$ with respect to $(M, \omega)$, is a manifold, then it is covered by the constraint surface $(C, \omega_C)$. This is always the case if the holonomy group of the flat connection of Lemma 2 is finite; if it is trivial, the covering by $C$ is a symplectomorphism.

Proof. By construction $C$, defined by $\Phi^\alpha = 0$ for all $\alpha$, is embedded as a submanifold into $M$, $\iota: C \to M$, and $\omega_C = \iota^*\omega$. Now define an equivalence relation for points on $C$ by calling two points on $C$ equivalent if they are connected by a constraint orbit in $M$. Taking the corresponding factor space yields $Q$ as a topological space. Here it is essential that any orbit $O_p$ of the Hamiltonian vector fields generated by $\Phi^\alpha$ through a point $p \in M$ intersects the constraint surface $C$ at least once. This is indeed the case due to completeness of the map $\phi$: if $p \notin C$, the point $p$ is mapped to $0 \neq \phi(p) \in P$, which can be connected to $0 \in P$ along trajectories of Hamiltonian vector fields, chosen to be of compact support, in $P$ since $P$ is symplectic. These trajectories can be carried to $M$ by pulling back their Hamiltonian functions with $\phi$. Due to the completeness of $\phi$, they can be extended to arbitrary parameters such that we reach a point in the pre-image of $0 \in P$. This point lies on the constraint surface $C$.
In general, there will be more than one intersection point. Thanks to Lemma 2 the bundle $\phi: M \to P$ with typical fiber $C = \phi^{-1}(0)$ comes equipped with a flat connection whose holonomy provides an action of the fundamental group $\pi_1(P)$ on $C$. The set of all points in the intersection of $C$ with an orbit $O$ through a point $p \in C$ is given by the orbit through $p$ of the action of $\pi_1(P)$. The number of copies of $p$ obtained in such a way is given by the number $k$ (maybe infinite) of elements in the factor space $\pi_1(P)/F_p$, where $F_p$ is the isotropy subgroup of $\pi_1(P)$ in $p$. Since $\pi_1(P)$ is discrete, this number is constant on any connected component of $C$. Factoring $C$ with the action of $\pi_1(P)$ yields a $k$-fold covering $C \to Q$, where $Q$ is obtained by the projection map $\pi: M \to Q$ along the orbits $O$ as in Proposition 2. If $k$ is finite, the action of $\pi_1(P)$ is properly discontinuous and $Q$ is a manifold (see Example 2 for a case with an infinite holonomy group). If $k = 1$ (trivial holonomy), $C \cong Q$ topologically.

We next show that $\omega_C$ factors through to $Q$ (under the assumption that $Q$ is a manifold). Hamiltonian vector fields generated by $\Phi^\alpha$ yield a symplectic map between neighborhoods of the surfaces $\phi(x) = \phi_0$ and $\phi(x) = \phi_1$ for any constants $\phi_0$ and $\phi_1$. This also holds true if $\phi_0 = \phi_1 = 0$, in which case we have the map identifying points on $C$ to result in $Q$ used above. Consequently, the symplectic structure $\omega_C$ on $C$ can be projected down to yield the symplectic manifold $(Q, \Omega_Q)$.

It remains to show that the map $\pi: (M, \omega) \to (Q, \Omega_Q)$ is indeed Poisson. (The rest, such as symplectic orthogonality of the $\pi$- and $\phi$-fibers, follows from (or as in the proof of) Proposition 2.) For this purpose we introduce some further notation: decompose the tangent and cotangent space at $x \in C \subset M$ according to the splitting induced by $C$ and the orbit $O_x$. More explicitly, $T_xM \cong T_xC \oplus T_xO$ and $T_x^*M = \mathrm{Ann}_M(T_xO) \oplus \mathrm{Ann}_M(T_xC) \cong T_x^*C \oplus T_x^*O$, where $\cong$ denotes canonical isomorphisms, used to identify respective vector spaces. (We could have replaced $T_xO$ by $(T_xC)^\perp$, cf. Sec. 2.) Denote by $\iota_i$ and $p_i$, $i = 1, 2$, the respective embeddings and projections to the first and second factor in $T_xM$; thus e.g. $\iota_1: T_xC \to T_xM$. Then $\pi_i := \iota_i \circ p_i$ is the corresponding projection operator inside $T_xM$. Proceed likewise for $T_x^*M$ with bars on the respective maps; thus, e.g. $\bar{\iota}_1: T_x^*C \to T_x^*M$ etc.

In this notation, the condition defining $\omega_C$, $\omega_C = \iota^*\omega$, takes the form $\omega_C^\sharp = \bar{p}_1 \omega^\sharp \iota_1$, while orthogonality of the fibers becomes $p_2 \Pi^\sharp \bar{\pi}_1 = 0 = p_1 \Pi^\sharp \bar{\pi}_2$. From this we need to prove that $\pi$ as defined above is Poisson, i.e. that $p_1 \Pi^\sharp \bar{\iota}_1 = \Pi_C^\sharp$. This indeed follows since

$$p_1 \Pi^\sharp \bar{\iota}_1 \omega_C^\sharp = p_1 \Pi^\sharp \bar{\iota}_1 \bar{p}_1 \omega^\sharp \iota_1 = p_1 \Pi^\sharp (\bar{\pi}_1 + \bar{\pi}_2) \omega^\sharp \iota_1 = -\mathbb{1}_{T_xC}.$$
Example 1. (generalized from [8]) Let $M := \mathbb{R}^2 \setminus \{(0, 0)\}$ be the punctured plane with canonically conjugate coordinates $(q, p)$ and define on $M$ the constraints

$$\Phi_n^1 := \sqrt{q^2 + p^2}\; T_n\big(q (q^2 + p^2)^{-1/2}\big) - 1, \qquad \Phi_n^2 := p\; U_{n-1}\big(q (q^2 + p^2)^{-1/2}\big)$$

for a fixed $n \in \mathbb{N}$ using the Chebyshev polynomials $T_n$ and $U_n$ of the first and second kind. With $T_n(\cos\theta) = \cos n\theta$ and $U_n(\cos\theta) = \sin[(n + 1)\theta]/\sin\theta$, one can see that $\{\Phi_n^1, \Phi_n^2\} = n$, and thus $P = \phi(M) = \mathbb{R}^2 \setminus \{(-1, 0)\}$ is a symplectic manifold.
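The value of the bracket can be checked numerically. The sketch below is only a sanity check, not a proof: it assumes the convention $\{f, g\} = \partial_q f\, \partial_p g - \partial_p f\, \partial_q g$ (so that $\{q, p\} = 1$), evaluates the Chebyshev polynomials by their standard three-term recurrences, and approximates the derivatives by central differences with a step size chosen for this example. All function names are hypothetical.

```python
import math


def chebyshev_t(n, x):
    """Chebyshev polynomial of the first kind via T_{k+1} = 2x T_k - T_{k-1}."""
    previous, current = 1.0, x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, 2.0 * x * current - previous
    return current


def chebyshev_u(n, x):
    """Chebyshev polynomial of the second kind via U_{k+1} = 2x U_k - U_{k-1}."""
    previous, current = 1.0, 2.0 * x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, 2.0 * x * current - previous
    return current


def phi_1(n, q, p):
    """First constraint of Example 1 (requires (q, p) != (0, 0))."""
    r = math.hypot(q, p)
    return r * chebyshev_t(n, q / r) - 1.0


def phi_2(n, q, p):
    """Second constraint of Example 1 (requires n >= 1)."""
    r = math.hypot(q, p)
    return p * chebyshev_u(n - 1, q / r)


def poisson_bracket(f, g, q, p, step=1e-6):
    """Central-difference approximation of {f, g} with {q, p} = 1."""
    df_dq = (f(q + step, p) - f(q - step, p)) / (2.0 * step)
    df_dp = (f(q, p + step) - f(q, p - step)) / (2.0 * step)
    dg_dq = (g(q + step, p) - g(q - step, p)) / (2.0 * step)
    dg_dp = (g(q, p + step) - g(q, p - step)) / (2.0 * step)
    return df_dq * dg_dp - df_dp * dg_dq


def constraint_bracket(n, q, p):
    """Numerical value of {Phi^1_n, Phi^2_n} at the point (q, p)."""
    return poisson_bracket(lambda a, b: phi_1(n, a, b),
                           lambda a, b: phi_2(n, a, b), q, p)


for n in range(1, 6):
    print(n, constraint_bracket(n, 0.7, -1.3))
```

In polar form the two constraints read $r\cos n\theta - 1$ and $r \sin n\theta$, which is why the bracket is constant on the punctured plane.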
In polar coordinates defined by $q = r\cos\varphi$, $p = r\sin\varphi$ the constraints read $r\cos n\varphi = 1$ and $r\sin n\varphi = 0$, and it follows easily that the constraint surface $C = \{(\cos(2\pi k/n), \sin(2\pi k/n)) : 0 \leq k < n\}$ consists of $n$ points which are all connected by the single orbit generated by the constraints in $M$. The map $C \to Q$ is an $n$-fold covering.

Example 2. Let $M := T^2 \times \mathbb{R}^2 \ni (q_1, p_1; q_2, p_2)$ with constraints $\Phi^1 := q_1 + \omega p_1 + q_2$ and $\Phi^2 := p_2$ (the first constraint function is taken modulo 1 such that $P = S^1 \times \mathbb{R}$). If $\omega = 0$, the constraint surface is homeomorphic to $S^1 \times \mathbb{R}$ where the $S^1$-factor comes from the unaffected coordinate $p_1$, and the $\mathbb{R}$-factor is the regular curve (a helix) $q_1 = -q_2 \pmod 1$ lying on the cylinder $S^1 \times \mathbb{R} \subset M$ with coordinates $q_1, q_2$. The flow generated by the second constraint intersects the constraint surface of the first constraint in an infinite number of points of the form $(q_1, p_1, -q_1 + m, 0)$ for $m \in \mathbb{Z}$, and vice versa. Since the corresponding identifications form a discrete set of translations in the $q_2$-direction of constant shift, the factor space $Q$ is a manifold. The situation is unchanged if $\omega$ is nonvanishing but rational. But if $\omega$ is irrational, the projection of the constraint surface to the cylinder appearing above will be dense. There are still infinitely many intersection points, and this time the factor space is not a differentiable manifold. This also follows from the fact that the orbits generated by the constraints are dense in a suitable subset of $M$, and so $Q = M/\!\sim$ cannot be a manifold.

Theorem 3 can be used to define a natural presymplectic form on $M$:

Corollary 1. [8] Under the assumptions of Theorem 3 the symplectic manifold $(M, \omega)$ carries a presymplectic form

$$\bar{\omega} := \omega - \phi^*\Omega_P \qquad (8)$$

whose kernel is tangent to the orbits generated by the $\Phi^\alpha$.

Proof. By definition $\bar{\omega}$ is closed. From Theorem 3, in particular the fact that the fibers of the dual pair are symplectically orthogonal, it follows that $\bar{\omega}$ vanishes when applied to vector fields tangential to the orbits $O$. This can also be seen directly: by definition, the Hamiltonian vector field $X_{\Phi^\alpha}$ generated by $\Phi^\alpha$ satisfies $X_{\Phi^\alpha} \lrcorner\, \omega = d\Phi^\alpha$. Furthermore, by definition of $\Omega_P$, we have

$$X_{\Phi^\alpha} \lrcorner\, \phi^*\Omega_P = (\Omega_P)_{\beta\gamma}\, X_{\Phi^\alpha}(\Phi^\beta)\, d\Phi^\gamma = -(\Pi_P^{-1})_{\beta\gamma}\, \Pi_P^{\beta\alpha}\, d\Phi^\gamma = d\Phi^\alpha$$

so that $X_{\Phi^\alpha} \lrcorner\, \bar{\omega} = 0$.

It is clear from the proof of Theorem 3 that we have a trivial covering by $C$ if $\pi_1(P) = 0$:

Corollary 2. If $P$ is simply connected, then $(P, \Pi_P)$ and $(C, \omega_C)$ form a symplectic dual pair with respect to $(M, \omega)$.

This is always the case if $M$ is simply connected, as we will now demonstrate. In Theorem 3 and its proof we made use of the fact that $M$ may be considered as a fiber
bundle over $P$ with typical fiber $C = \phi^{-1}(0)$, the constraint map $\phi$ playing the role of the projection from $M$ to $P$. However, if $Q$ is a manifold, $(Q, \Pi_Q)$ is symplectic and, according to Proposition 2, the respective projection map $\pi$ is a complete Poisson map. We thus may apply Lemma 2 also to this context and conclude as in the proof of Theorem 3 that the orbit $(O, \omega_O)$ (with symplectic form $\omega_O$ induced on it by $\omega$) covers $(P, \Omega_P)$. Moreover, this covering map is a symplectomorphism if the holonomy of the respective flat connection is trivial, which in particular is always the case if $\pi_1(Q) = 0$.

Proposition 3. If $M$ is simply connected (and connected), $P$ and $C$ are also simply connected and connected.

Proof. For the fiber bundle $O \hookrightarrow M \to Q$ we have the exact homotopy sequence (cf. e.g. [33])

$$\cdots \to \pi_1(M) \to \pi_1(Q) \to \pi_0(O) \to \cdots$$

Since by assumption $\pi_1(M) = 0$ and by definition $\pi_0(O) = 0$, it follows that $\pi_1(Q)$ vanishes. According to the considerations above, this implies that the fiber bundle $O \hookrightarrow M \to Q$ is trivial and that its fiber is isomorphic to $P$ (here we use the holonomy of Lemma 2 for which the completeness of the map $\phi: M \to P$ is important); thus $M \cong Q \times P$. This in turn yields $\pi_1(M) = \pi_1(Q) \oplus \pi_1(P)$, resulting in $\pi_1(P) = 0$. Thus $Q \cong C$, $\pi_1(C) = 0$, and with $M$ being connected, finally also $P$ and $Q$ (or $C$) are connected.

This result can be helpful because $M$ is the original symplectic manifold and may thus be more easily accessible than $P$ in particular cases. We also note that $P$ being simply connected is only a sufficient condition for $Q \cong C$. The essential object which determines $Q$ is the holonomy of the flat connection of Lemma 2, which certainly can be trivial even if $P$ is not simply connected. In other words, it is the behavior of the $v^\alpha$-orbits in $M$ which determines whether the reduced phase space $C$ itself and $P$ are symplectically dual with respect to $M$ (and consequently then $C$ may be obtained as factor space by projection $\pi$ along the orbits).

Example 3. Let $M = T^*(S^1 \times \mathbb{R})$ with canonically conjugate coordinates $(x, y; p_x, p_y)$, where $x \equiv x + 1$, and with constraints $\Phi^1 := x$, $\Phi^2 := p_x$. Then $P = T^*S^1$ is not simply connected, but still $R \equiv C = Q = T^*\mathbb{R}$.

According to Proposition 3, $M$ being simply connected implies $Q \cong C$ and also that $C$ is connected. In Example 1, $M$ was not simply connected, $C$ was not connected and $C$ was a nontrivial covering of $Q$, i.e. we had Gribov copies (a Gribov copy of a point on the constraint surface is a different point on the constraint surface which lies on the same orbit generated by the constraints [8, 34]). As the following example demonstrates, this can also happen if $C$ is connected (but $M$ is still not simply connected):
Example 4. Let $M$ be the 4-torus with canonically conjugate coordinates $(x, p_x; y, p_y) \in T^2 \times T^2$ which are identified modulo 1. The constraints are $\Phi^1 := nx + y$ and $\Phi^2 := p_x$ with $1 < n \in \mathbb{N}$ such that $\{\Phi^1, \Phi^2\} = n$ and the constraint surface $C \cong T^2$ characterized by $(x, 0, -nx, p_y)$ is connected. The Hamiltonian vector fields generated by the constraints are $v^1 = -n\, d/dp_x - d/dp_y$ and $v^2 = d/dx$. Their orbit $O$ parameterized by $t_1, t_2$ through the point $(x_0, 0, -nx_0, p_{y,0})$ is

$$\exp(t_1 v^1 + t_2 v^2)(x_0, 0, -nx_0, p_{y,0}) = (x_0 + t_2, -nt_1, -nx_0, p_{y,0} - t_1).$$

It intersects $C$ if and only if $t_1 = m_1/n$, $t_2 = m_2/n$ with $m_1, m_2 \in \mathbb{Z}$. On the other hand, points on $O \cong T^2$ coincide if $t_1$ or $t_2$ are changed by an integer. Therefore, the intersection $O \cap C$ consists of $n^2$ points, which are obtained with $1 \leq m_1, m_2 \leq n$. Thus, $C$ is connected, $M$ not simply connected and we have $n^2$ Gribov copies. We note that the map $\phi|_O: (x_0 + t_2, -nt_1, -nx_0, p_{y,0} - t_1) \mapsto (nt_2, -nt_1)$ is also an $n^2$-fold covering of $P \cong T^2$.
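The count of $n^2$ Gribov copies in Example 4 can be reproduced by brute force. The sketch below samples the orbit only at rational parameters $t_1, t_2 \in [0, 1)$ with denominator $n \cdot \text{refinement}$ (the refinement factor, the base point, and all function names are assumptions of this illustration), uses exact rational arithmetic modulo 1, and collects the distinct sample points that satisfy both constraints.

```python
from fractions import Fraction


def orbit_point(n, x0, py0, t1, t2):
    """Point (x, p_x, y, p_y) on the orbit of Example 4, reduced modulo 1."""
    return ((x0 + t2) % 1, (-n * t1) % 1, (-n * x0) % 1, (py0 - t1) % 1)


def on_constraint_surface(n, point):
    """Both constraints n*x + y = 0 and p_x = 0, understood modulo 1."""
    x, px, y, _ = point
    return px % 1 == 0 and (n * x + y) % 1 == 0


def gribov_copies(n, x0, py0, refinement=3):
    """Distinct points of the sampled orbit that lie on the constraint surface.

    The grid contains all parameters m/n, plus intermediate values that
    should fail the constraints.
    """
    steps = n * refinement
    copies = set()
    for a in range(steps):
        for b in range(steps):
            point = orbit_point(n, x0, py0, Fraction(a, steps), Fraction(b, steps))
            if on_constraint_surface(n, point):
                copies.add(point)
    return copies


base_x, base_py = Fraction(1, 7), Fraction(2, 5)   # arbitrary base point on C
for n in (2, 3, 4):
    print(n, len(gribov_copies(n, base_x, base_py)))
```

Because the grid is finite, this confirms the count only on the sampled parameters; the statement for all real $t_1, t_2$ is the one derived in the example.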
So the connectedness of the constraint surface $C$ (which coincides with the reduced phase space $R$ of physical interest) alone is not sufficient for the absence of Gribov copies, i.e. to guarantee together with completeness of $\phi$ that any orbit $O$ generated by the constraints intersects $C$ in precisely one point. However, in the above example, $P$ is not a subset of $\mathbb{R}^2$, which would usually be the case in physical models (so, e.g. $\Phi^1 \notin C^\infty(M) \equiv C^\infty(M, \mathbb{R})$ and, correspondingly, the orbit generating vector fields $v^1$ and $v^2$ are only locally Hamiltonian). It would be interesting to clarify whether the condition $\pi_0(C) = 0$ is sufficient if $P$ is a subset of $\mathbb{R}^d$, $d = \dim P$.

We now summarize our findings in a diagram:

$$\begin{array}{ccccc} & & (M, \omega) & & \\ & {}^{\iota}\nearrow & \downarrow{\scriptstyle \pi} & \searrow{}^{\phi} & \\ (R, \omega_R) & \overset{\cong}{\longrightarrow} & (Q = M/\!\sim,\ \Omega_Q) & & (P, \Omega_P) \end{array} \qquad (9)$$
In the present subsection we considered the case of a symplectic target $(P, \Omega_P)$ (the general case of a Poisson manifold will be considered in the subsequent subsection). For simplicity we take the original unconstrained phase space $(M, \omega)$ to be connected. We now require the orbit space $Q = M/\!\sim$ to be a differentiable
manifold (the orbits are generated by the constraints Φα ). The right-hand side of the diagram then follows from Proposition 1, yielding π to be a complete Poisson map. When P is symplectic, the constraints are second class and the reduced phase space (R, ωR ) coincides with the constraint surface (C, ωC ) as embedded in M . Thus with (6) we obtain the embedding of (R, ωR ) into M in the diagram. Requiring that the constraint map φ is complete, too, we showed in the present subsection that (R, ωR ) covers the Poisson quotient Q via the map π ◦ ι. Moreover, if (as a sufficient but not necessary condition) π1 (P ) = 0 (which in turn follows for example if M is simply connected), this covering is an isomorphism (“no Gribov copies”). (In [8] also another potential source for the map R → Q not to be an isomorphism was mentioned and backed up by an explicit example, namely systems where the map φ restricted to an orbit is not surjective. This qualitatively different case of a Gribov problem (corresponding to “gauge orbits” not intersecting the “gauge fixing surface” C) is excluded by the requirement that φ should be complete.) In this favorable case of an isomorphism we thus obtain the reduced phase space (R, ωR ) not only by restriction to the constraint surface C ⊂ M (standard procedure for reducing second-class constraints), but instead may consider the factor space Q of M with respect to the orbits generated by the constraints. This is more closely related to the standard and generally more preferred treatment of first-class constraints. In fact, the analogy with first-class constraints can be made even more precise: according to Corollary 1, M may be equipped with a presymplectic form ω ¯, the kernel of which coincides with the orbits to be factored out. Together with Corollary 2 we have Corollary 20 . 
If the Poisson map φ of the constraints is complete and the target P symplectic and simply connected, there exists a symplectomorphism from (M, ω) to (R, ωR) × (P, ΩP) with canonical projections to both factors. The reduced phase space may then be obtained by projecting to the first factor or, equivalently, by taking the factor space of (M, ω̄) with respect to the kernel of the presymplectic form ω̄ defined in (8).

The presymplectic manifold (M, ω̄) can then in turn be embedded coisotropically into some higher dimensional symplectic manifold (M̃, ω̃). This extended phase space is equipped with first-class constraints such that M is the level zero set of the constraints and ω̄ the pull back of ω̃ with respect to the corresponding embedding map. In this way, (R, ωR) can be obtained by standard reduction of the extended phase space (M̃, ω̃) constrained by first-class constraints. In Sec. 3.4 we will provide two explicit possibilities for constructing such a constrained extended phase space M̃, in one of which the constraints are even Poisson commuting (abelian). Within this reformulation, the above-mentioned orbits of the (originally second class) constraints Φα, α = 1, . . . , d, become just the standard gauge orbits of the extended system (generated by d first-class constraints) and the original constraint surface C corresponds to one possible choice of a gauge (which exists globally due to the
M. Bojowald & T. Strobl
absence of a Gribov problem, now in the standard use of the terminology, as a consequence of the assumptions specified above).

Let us add a remark concerning the arrows in diagram (9): as remarked before, they are morphisms of the respective categories. So ι as well as the covering map π ◦ ι are symplectic maps and π, φ are Poisson. Note that neither of the last two maps is symplectic, since π∗ΩQ = ω − φ∗ΩP ≠ ω. Similarly, ι is also not Poisson (except for C = M), since a Poisson map between symplectic manifolds is always a submersion [3]. Only the covering map is symplectic and Poisson. Similar remarks apply to the diagrams below.

We finally remark that it is a direct consequence of Proposition 3 that, provided M is connected and simply connected, so are (R, ωR) and P̄ := (P, −ΠP) ≅ (P, −ΩP). Moreover, in this case M ≅ R × P and the maps π and φ, which then are just projections to the first and second factor, respectively, have constant rank. This demonstrates that R and P̄ are Morita equivalent (cf. [3] for the definition) with respect to (M, ω).

3.3. The general case

Let us first consider the other extreme case opposite to the one of Sec. 3.2 and cast the standard reduction of a system of first-class constraints into a diagram similar to the one at the end of the previous subsection. This yields the following (commutative) diagram:

\[
\begin{array}{ccccc}
(C=\phi^{-1}(0),\,\omega_C) & \xrightarrow{\;\iota\;} & (M,\omega) & & \\
\Big\downarrow{\scriptstyle\pi|_C} & & \Big\downarrow{\scriptstyle\pi} & \searrow{\scriptstyle\phi} & \\
(R,\omega_R) & \hookrightarrow & (Q=M/\!\sim,\ \Pi_Q) & & (P,\Pi_P)
\end{array}
\tag{10}
\]
The right-hand side is again taken from Proposition 2 (the assumption from there that Q = M/∼ is a differentiable manifold is understood to hold here too). By definition the constraint surface C is the pre-image of zero with respect to the constraint map φ. Restricting the projection map π to C, which is equivalent to factoring out the flow generated by the kernel of ωC , yields the standard reduced phase space R. Proposition 4. Let (R, ωR ) be the reduced phase space of a first-class constrained system with the assumptions of Proposition 2. Then R is a symplectic leaf in the orbit space (Q = M/∼, ΠQ ) which is obtained by factoring out the constraint orbits. Proof. Note first that there is a natural embedding of R as a subset of Q, since the flow of the constraints does not leave C being the pre-image of a symplectic leaf (namely the origin) in (P, ΠP ).
To prove that the embedding is as a symplectic leaf, we first note that Π^♯_Q(AnnQ(T R)) = 0 if R is the reduced phase space. This follows from the fact that AnnQ(T R) is spanned by total derivatives of the constraints, which on R Poisson commute with each other as well as with elements of T∗Q|R / AnnQ(T R), since this space is spanned by total derivatives of physical observables which by definition commute on R with the constraints. This shows that we can define a Poisson structure ΠR on R using (4): {f, g}R = ΠR(df, dg) = ΠQ((df)Q, (dg)Q)|R using the isomorphism i: T∗R → T∗Q|R / AnnQ(T R), α ↦ αQ + AnnQ(T R). If this Poisson structure is nondegenerate, R is an open subset of a symplectic leaf in Q.

We proceed by showing that the pull back of {f, g}R under π|C coincides with the bracket computed using the pull back of the symplectic structure on M. This will imply that ΠR coincides with the structure of R as the reduced phase space, which by definition is nondegenerate. In a first step we use the fact that π is Poisson and obtain (π|C)∗{f, g}R = ΠM(π∗(df)Q, π∗(dg)Q)|C. Here we need the image under the map i of an exact 1-form df. More precisely, we want to show that αQ can be chosen to be exact if α = df is exact. To construct a function fQ on Q with αQ = dfQ, we choose a tubular neighborhood of R in Q, cover it with open subsets in which we can use local coordinates of R together with transversal coordinates, and use a partition of unity subordinate to these neighborhoods. First we can extend f to a function fU on any local neighborhood U by requiring that fU does not depend on the transversal coordinates. Using the partition of unity, we arrive at a smooth function which is defined on the full tubular neighborhood of R and which equals f when pulled back to R. Multiplying this function with a smooth function which vanishes outside the tubular neighborhood and equals one on R defines a smooth function fQ on Q.

Because fQ|R = f, we have v(fQ) = v(f) for any vector v tangential to R, so that we can choose (df)Q = dfQ as representative in T∗Q|R / AnnQ(T R). We use this relation to compute the pull back of the Poisson bracket (π|C)∗{f, g}R = ΠM(dπ∗fQ, dπ∗gQ)|C = {π∗fQ, π∗gQ}|C. Owing to the construction of fQ above, π∗fQ is a function on M which is constant along constraint orbits and whose values on the orbits through C coincide with f. This shows that the pull back to C of the symplectic structure on R as a leaf in Q coincides with the symplectic structure of M restricted to C.

Completeness of the map π implies that R is a symplectic leaf in Q, not just an open subset of a leaf: assume that R is contained in but not identical to a leaf L of Q and choose a point r ∉ R in the boundary of R in L. We can choose a function h on Q which is supported only on some neighborhood of r and which generates a Hamiltonian vector field Xh tangential to L. Using the trajectories of Xh, we can
connect the point r ∉ R to a point r0 ∈ R. Due to completeness, the pull back π∗h generates a complete Hamiltonian vector field on M. Furthermore, it is tangential to C because Xh was chosen to be tangential to L. Thus, there are points in π−1(r) and π−1(r0) which both lie in C ⊂ M and so are projected to R under π. Therefore, r ∈ R, contradicting our assumption R ≠ L.

Thus for the reduction process in a system of first-class constraints forming a closed constraint algebra we may exchange the order of restriction to a submanifold and taking the factor space. So, in this case, there are two equivalent, “dual” perspectives of the reduction process, the one using presymplectic geometry in an intermediary step and the other one using Poisson geometry.

The case of a general Hamiltonian system with a closed constraint algebra is now basically a combination of the previous two diagrams. The essential point to prove is again under which circumstances the reduced phase space (R, ωR) is isomorphic to (or at least a covering of) an appropriate orbit space.

Theorem 4. Let Φα, 1 ≤ α ≤ d, be regular constraint functions with nonempty constraint surface C = φ−1(0) on a connected symplectic manifold (M, ω) with a complete Poisson map φ: (M, ω) → (P, ΠP), x ↦ Φα(x), and denote the symplectic leaf through 0 ∈ P by L0 and its pre-image under φ in M by M0 := φ−1(L0). The orbit space of M (M0) with respect to the flow generated by the constraints Φα is denoted by Q (Q0). If Q is a manifold, then the reduced phase space R is a (symplectic) covering of Q0, which is a symplectic leaf of Q. If π1(L0) = 0, this covering is a symplectomorphism.

\[
\begin{array}{ccccccc}
(C=\phi^{-1}(0),\,\omega_C) & \stackrel{\iota_1}{\hookrightarrow} & (M_0=\phi^{-1}(L_0),\,\omega_{M_0}) & \stackrel{\iota_0}{\hookrightarrow} & (M,\omega) & & \\
\Big\downarrow{\scriptstyle\pi|_C} & & \Big\downarrow{\scriptstyle\pi|_{M_0}} & & \Big\downarrow{\scriptstyle\pi} & \searrow{\scriptstyle\phi} & \\
(R=C/\!\sim,\,\omega_R) & \stackrel{\cong}{\longrightarrow} & (Q_0=M_0/\!\sim,\,\omega_{Q_0}) & \hookrightarrow & (Q=M/\!\sim,\,\Pi_Q) & & (P,\Pi_P)\hookleftarrow(L_0,\omega_{L_0})
\end{array}
\tag{11}
\]

The standard reduction procedure consists in first going to the presymplectic (within this subsection “presymplectic” does not necessarily contain the condition that the kernel of the closed 2-form has constant rank) constraint surface (C, ωC) embedded by ι = ι0 ◦ ι1 into M and then taking the factor space with respect to the orbits generated by the vector fields in the kernel of ωC. (By a slight abuse of notation we denoted the corresponding equivalence relation by ∼, too, while the flow of all of the constraints Φα certainly does not remain inside of C.) The second step is equivalent to restricting the projection map π: M → Q to C ⊂ M. According to the theorem (provided the covering map from R to Q0 is an isomorphism), we may
alternatively restrict to a generically much larger (also presymplectic) submanifold (M0, ωM0) of M and then, in a second step, factor out the flow generated by all the constraints Φα. (Note that not all the vector fields v^α generating this flow are in the kernel of ωM0. Below, however, we will define another presymplectic form ω̄M0, generalizing ω̄ of Corollary 1, so that the kernel of this presymplectic form is spanned by the v^α.) Finally, as is also obvious from the commutative diagram (11), a third alternative consists in factoring out the flow of all the constraints in a first step, resulting in the Poisson manifold Q, in which Q0 is embedded as a symplectic leaf. In the second approach the first step amounts to solving only the first-class part of the constraints. Solving the second-class part of them is traded for taking the flow of all the constraints instead of just the part of the flow which remains inside C. Clearly, for this alternative to work the closure of the constraint algebra is essential, since only then are the Hamiltonian vector fields of the constraints in involution and thus generate orbits in M.

Proof. Locally, one can always split the constraints into first-class and second-class ones. This corresponds to choosing coordinates in a neighborhood of 0 ∈ P which are adapted to the leaf L0. First we choose arbitrary coordinates of L0 and supplement them by additional local functions in the kernel of the Poisson tensor Π such that, taken together, they form a coordinate system in a neighborhood of 0. If the transition from the original coordinates Φα to the adapted ones is nonsingular (i.e. with nonvanishing Jacobian), the new coordinates form local regular and irreducible constraints. By choosing the adapted coordinates we have performed a local splitting of the constraints: the coordinates of the symplectic leaf L0 are of second-class whereas the remaining coordinates are of first-class.
Restricting M to M0 amounts to solving the first-class part of the constraints implying that M0 is a coisotropic submanifold of M . It then follows as in the purely first-class case at the beginning of this subsection that Q0 is a symplectic leaf in Q. Note that the codimension of M0 in M and of L0 in P always coincide due to regularity and irreducibility of the constraints, which implies that the differential of the map φ is nonvanishing. To obtain the reduced phase space R, it remains to factor out the flow generated by the first-class constraints and to solve the second-class constraints (in a local splitting). The first part of this procedure is obviously contained in factoring out by the equivalence class ∼, and analogously to the purely second-class case the rest of the equivalence relation serves to solve the second-class constraints. As in Theorem 3, R and Q0 are in general not identical, even if Q0 is a manifold. In the presence of Gribov copies Q0 is obtained from R by a discrete set of identifications such that R → Q0 is a covering if Q0 is a manifold (which is always the case if there is a finite number of such identifications). Using Lemma 2 one sees that M0 is a fiber bundle over L0 with a natural flat connection whose holonomy group determines the number of Gribov copies. Hence, a sufficient condition for the absence of identifications is simply connectedness of L0 .
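As a minimal illustration of the statement just proved, one may work through a flat model with one second-class pair and one first-class constraint. The model below (M = T∗R³ with the three constraints shown) is our own assumption chosen for illustration and does not appear in the text; it is a sketch of how the objects of diagram (11) look in the simplest mixed case, not a general computation.

```latex
% Illustrative model (an assumption, not taken from the source):
% M = T^*\mathbb{R}^3,\quad \omega = dx\wedge dp_x + dy\wedge dp_y + dz\wedge dp_z .
\begin{align*}
\Phi^1 &= x, \qquad \Phi^2 = p_x, \qquad \Phi^3 = p_z, \\
\{\Phi^1,\Phi^2\} &= 1, \qquad \{\Phi^1,\Phi^3\} = \{\Phi^2,\Phi^3\} = 0
  \quad\Rightarrow\quad \Pi_P = \partial_{\Phi^1}\wedge\partial_{\Phi^2}
  \ \text{on } P=\mathbb{R}^3, \\
L_0 &= \{\Phi^3 = 0\}\cong\mathbb{R}^2, \qquad \pi_1(L_0) = 0, \\
M_0 &= \phi^{-1}(L_0) = \{p_z = 0\}, \qquad
C = \phi^{-1}(0) = \{x = p_x = p_z = 0\}, \\
Q &= M/\!\sim\ \cong\ \mathbb{R}^3 \ni (y,p_y,p_z), \qquad
\Pi_Q = \partial_y\wedge\partial_{p_y}, \\
Q_0 &= M_0/\!\sim\ = \{p_z = 0\}\subset Q, \qquad \omega_{Q_0} = dy\wedge dp_y, \\
R &= C/\!\sim\ \cong\ \{(y,p_y)\}, \qquad \omega_R = dy\wedge dp_y .
\end{align*}
```

In this model the orbits of all three constraints are the (x, p_x, z)-directions, p_z is a Casimir of ΠQ, and Q0 is the leaf p_z = 0. The map R → Q0 is a symplectomorphism, as one expects from π1(L0) = 0.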
As in Corollary 1 we may equip M0 with a presymplectic form ω̄M0 such that its kernel spans the tangent space of the fibers of π: M0 → Q. The image of M0 ⊂ M under φ is L0. Subtracting from (the presymplectic form) ωM0 the pull back of ωL0 under this map (i.e. under φ ◦ ι0) we obtain a presymplectic 2-form ω̄M0, as we did in Corollary 1. Note that the dimension of the kernel of ω̄M0 equals d, the number of constraints, and thus in particular is constant on all of M0. (The proof follows along the same lines as in the symplectic case.) As in the previous cases, here too there is a Poisson perspective of the reduction process, dual to the one using either the presymplectic manifold (M0, ω̄M0) or the presymplectic manifold (C, ωC).

In the particular case of P being symplectic, ι0 becomes an isomorphism, M0 ≅ M, and likewise Q0 ≅ Q, so that the middle part of the diagram (11) may be dropped. Furthermore, the presymplectic form ωC and the Poisson bivector ΠQ become nondegenerate, rendering both spaces symplectic, π|C becomes an isomorphism, and the diagram (11) reduces to the one of the previous subsection, diagram (9).

Let us finally note that the problem of Gribov copies (in the sense used in this paper) can occur only if the constrained system is not purely first-class. For first-class constraints, the equivalence relation ∼ is exactly what is needed for the standard constraint reduction; only when there are also second-class constraints can a discrete set of surplus identifications occur. The description of systems with constraints of both first- and second-class (or more generally of mixed type) given in the preceding paragraphs suggests taking those identifications seriously and identifying the physical phase space with Q0 = M0/∼ in all cases. The additional identifications can then be interpreted as large gauge transformations which appear together with the gauge transformations generated by first-class constraints.

3.4. Transforming second-class constraints to a first-class system

In the present subsection we expand on the considerations at the end of Sec. 3.2 with the aim to reformulate the constrained system with symplectic P as a first-class system in an extended phase space M̃. The setting is that of Corollary 2′ with a simply connected P, the original phase space M being a trivial fiber bundle with base space R ≅ C and fiber P, endowed with a canonical flat connection (Lemma 2). Moreover, with the respective two projections, (C, ωC) and (P, ΩP) form a symplectic dual pair with respect to (M, ω) (Theorem 3 and Corollary 2). Simple connectedness is a sufficient condition for the orbit space of (M, ω̄) to coincide with the reduced phase space (C, ωC); otherwise the former can also be a covering of the latter, giving rise to “Gribov copies”. (Cf. also diagram (9). The considerations below apply also to the slightly more general case where π1(P) ≠ 0 but still the holonomy of the flat connection is trivial such that Q ≅ R.) If the assumptions of Corollary 2′ are fulfilled, there are no Gribov problems and the reduced phase space (C, ωC) can be identified with the orbit space of
the presymplectic manifold (M, ω̄) [8]. This orbit space, in turn, is obtained by symplectic reduction (using first-class constraints) of an extended phase space which can be constructed in, for example, one of the following two ways.

3.4.1. (M, ω̄) as graph of the constraint map

Let (M̃1, ω̃1) := (M, ω) × (P, −ΩP) and embed M in M̃1 as the graph Γφ of the map φ: M → P, i.e. M → {(x, φ(x)) : x ∈ M} ⊂ M̃1.ᵈ As a direct product of two symplectic manifolds, M̃1 is symplectic with the product symplectic structure. Note that we take P with the symplectic structure reversed, i.e. ω̃1 = πM∗ω − πP∗ΩP, where the projections from M̃1 to M and P, respectively, are denoted by πM and πP. This ensures that the embedding of M into M̃1 as the graph of φ is coisotropic owing to Lemma 1, which can also be checked directly: as a subset, M is specified by the constraints Ψα(x, Φ) := Φα − Φα(x) = 0, where Φα are coordinates of P and Φα(x) is the respective original constraint as a function on M. The new constraints satisfy

\[
\{\Psi^\alpha(x,\Phi),\Psi^\beta(x,\Phi)\}_{\tilde M_1}
 = -\{\Phi^\alpha,\Phi^\beta\}_P + \{\Phi^\alpha(x),\Phi^\beta(x)\}_M
 = -\Pi_P^{\alpha\beta}(\Phi) + \Pi_P^{\alpha\beta}(\phi(x)) \approx 0
\tag{12}
\]

by definition of ΠP (“≈ 0” denotes vanishing on the constraint surface). This verifies that M is coisotropically embedded as the graph of φ in M̃1, and therefore the second-class constraints Φα can be replaced by first-class constraints Ψα. The constraint surface Ψα = 0 is the presymplectic manifold (M, ω̄) (rather than the symplectic manifold (M, ω)) and factoring out the kernel of ω̄ leads to the reduced phase space (R, ωR) according to Corollary 2′. We thus obtain

Proposition 5. The reduced phase space (R, ωR) of the second-class constrained system (M, ω) with constraint map φ: M → P, P symplectic and simply connected, is symplectomorphic to the reduced phase space of the first-class constrained system (M̃1, ω̃1) := (M, ω) × (P, −ΩP) with the graph of φ as constraint surface.

Note that the Poisson bracket (12) of two first-class constraints Ψα in general vanishes only on the constraint surface Φ = φ(x). Correspondingly, in the extended phase space M̃1 the new constraints are first-class, but, in general, they no longer form a closed Poisson subalgebra. In fact, they form a closed algebra iff ΠP is at most linear in Φ (while still respecting det Π_P^{αβ} ≠ 0, if necessary with a restricted M), i.e. in the case of some centrally extended Lie algebras. In contrast, the second construction introduced now, despite being more complicated at first sight, always leads to abelian (i.e. Poisson commuting) first-class constraints.

ᵈ We are grateful to P. Bressler for suggesting this construction.
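As a quick consistency check of the bracket (12), one may evaluate it for the simplest flat case, the toy model on T∗(R²) with constraints Φ1 = −x, Φ2 = −px that is mentioned later in this section. The sketch below assumes the convention {x, px}M = 1 and takes P = R² with the constraint values as global coordinates; these choices are ours, made only for illustration.

```latex
% Sketch; assumes \{x,p_x\}_M = 1 and P=\mathbb{R}^2 with coordinates (\Phi^1,\Phi^2).
\begin{align*}
\Phi^1(x) &= -x, \qquad \Phi^2(x) = -p_x, \qquad
\Pi_P^{12} = \{\Phi^1(x),\Phi^2(x)\}_M = \{x,p_x\}_M = 1, \\
\Psi^1 &= \Phi^1 - \Phi^1(x) = \Phi^1 + x, \qquad
\Psi^2 = \Phi^2 - \Phi^2(x) = \Phi^2 + p_x, \\
\{\Psi^1,\Psi^2\}_{\tilde M_1}
 &= -\{\Phi^1,\Phi^2\}_P + \{x,p_x\}_M = -1 + 1 = 0 .
\end{align*}
```

Since ΠP is constant in this model, the bracket vanishes identically and not only on the graph of φ, in line with the remark that the Ψα close whenever ΠP is at most linear in Φ.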
3.4.2. Extension by a Whitney sum

According to Lemma 2, M is a fiber bundle over P. So are P itself (with trivial fiber) and T∗P, and we can form the Whitney sum of these bundles to obtain a new fiber bundle M̃2 := M ⊕ P ⊕ T∗P with base P and fiber C × Rd:

Definition 7. Let π1: E1 → B and π2: E2 → B be two fiber bundles over the same base manifold B. The Whitney sum E1 ⊕ E2 is defined as the fiber bundle over B given by

E1 ⊕ E2 = {(u1, u2) ∈ E1 × E2 : π1 × π2(u1, u2) = (p, p)}.

If the bundles E1 and E2 are equipped with presymplectic forms ω1 and ω2, respectively, we can also define (E1, ω1) ⊕ (E2, ω2) as a sum of presymplectic manifolds which will be equipped with a new presymplectic form:

Definition 8. Let (E1, ω1) and (E2, ω2) be presymplectic manifolds which are simultaneously fiber bundles over the same base manifold. Then the sum (E1, ω1) ⊕ (E2, ω2) is the fiber bundle (E1 ⊕ E2, ω1 ⊕ ω2) with ω1 ⊕ ω2 := p1∗ω1 + p2∗ω2, where pi: E1 ⊕ E2 → Ei are the projections to the respective factors.

In our case, we have the three symplectic manifolds (M, ω), (P, ΩP) and T∗P which all are fiber bundles over P. We define the extended phase space (M̃2, ω̃2) := (M, ω) ⊕ (P, −ΩP) ⊕ T∗P. Note that P enters with negative symplectic structure, which implies that the presymplectic form on the sum of the first two spaces is just ω̄ [note that M ⊕ P = M topologically, and the projections are (M ⊕ P → M) = id, (M ⊕ P → P) = φ]. In addition to the projection p1: M̃2 → M we have a canonical embedding i: M → M̃2 as the zero section in the T∗P-part of the fiber. On the extended phase space we also add the symplectic form of T∗P, which results in a symplectic form ω̃2:

Lemma 3. The form ω̃2 = p1∗ω̄ + dΦα ∧ dπα on M̃2 = M ⊕ P ⊕ T∗P is nondegenerate (dΦα ∧ dπα is the symplectic form on T∗P pulled back to M̃2 via p3).

Proof. In this proof we will denote any Hamiltonian vector field generated by a function f using the symplectic structure ω on M by Xf. It can be transported to vector fields on i(M) ⊂ M̃2 by push forward with the canonical embedding i. Choosing local coordinates y^i on the fibers of φ and Φα in P, we then have X_{Φα} ⌟ ω = dΦα, X_{y^i} ⌟ ω = dy^i and, by definition of ΩP, X_{Φα} ⌟ dΦβ = {Φβ, Φα} = (ΩP^{−1})^{αβ}. As already seen in Corollary 1, X_{Φα} ⌟ ω̄ = 0 and thus

\[
i_* X_{\Phi^\alpha}\,\lrcorner\,\tilde\omega_2
 = i_* X_{\Phi^\alpha}\,\lrcorner\,(p_1^*\bar\omega + d\Phi^\beta\wedge d\pi_\beta)
 = (\Omega_P^{-1})^{\alpha\beta}\,d\pi_\beta
 \quad\text{on } i(M).
\tag{13}
\]

Since ∂/∂y^i ⌟ (ω̃2 − ω) = 0,

\[
i_* X_{y^i}\,\lrcorner\,\tilde\omega_2
 = dy^i + m_{i\alpha}\,d\Phi^\alpha + n_i^{\alpha}\,d\pi_\alpha
 \quad\text{on } i(M)
\tag{14}
\]
with some coefficient functions m_{iα} and n_i^α. Finally, on all of M̃ (not just on i(M))

\[
\frac{\partial}{\partial\pi_\alpha}\,\lrcorner\,\tilde\omega_2 = -d\Phi^\alpha
\tag{15}
\]

which together with (13) and (14) demonstrates that ω̃2^#: T M̃ → T∗M̃ is surjective and hence (due to equal dimension of domain of definition and image) bijective on i(M). By continuity of ω̃2, this holds true in a neighborhood of i(M), and translation invariance of ω̃2 in the πα-directions (according to (15), ∂/∂πα are Hamiltonian vector fields) implies that ω̃2 is nondegenerate on all of M̃2.

Important in physical applications is also the following observation:

Lemma 4. Let πα, 1 ≤ α ≤ d, denote the pull back by p3: M̃2 → T∗P of the momenta on T∗P. Then {πα, πβ} = 0 on (M̃2, ω̃2).

Proof. According to (13), on the constraint surface i(M) the variable πα generates the Hamiltonian vector field Xπα = (ΩP)αβ i∗XΦβ (where XΦα is the vector field generated by Φα using ω on M) using the symplectic structure ω̃2. Therefore, {πα, πβ}|i(M) = i∗Xπβ πα = 0. Since ∂/∂πα are Hamiltonian vector fields on M̃2, we have L∂/∂πγ{πα, πβ} = {L∂/∂πγ πα, πβ} + {πα, L∂/∂πγ πβ} = 0, which implies {πα, πβ} = 0 on all of M̃2.

Collecting the results we obtain

Proposition 6. The reduced phase space (R, ωR) of the second-class constrained system (M, ω) with constraint map φ: M → P, P symplectic and simply connected, is symplectomorphic to the reduced phase space of the constrained system (M̃2, ω̃2) := (M, ω) ⊕ P̄ ⊕ T∗P with the momenta of T∗P as abelian first-class constraints.

3.4.3. Possible application and generalization

On M̃2 we can use local coordinates (x^i, πα) in which the symplectic form is ω̃2(x, π) = ω̄(x) + dΦα(x) ∧ dπα and the constraints are just πα = 0. (Note that by construction M̃2 is M × Rd globally if and only if T∗P ≅ P × Rd, i.e. if P is parallelizable.) Despite the simple appearance of πα in the symplectic form, it is quite nontrivial that the constraints Poisson commute (on all of M̃2). In comparison, the first extension M̃1 is always a product manifold M̃1 ≅ M × P, albeit with a topologically more complicated factor (P may have nontrivial topology, even if embedded into Rd). This extension, however, has nonabelian first-class constraints; in general they may even have nontrivial structure functions. Various quantization schemes, such as BRST quantization (cf. e.g. [35]), simplify greatly in the case of Poisson commuting constraints.

In both approaches the original second-class system is reformulated as a first-class system. The original constraint surface C ≅ R is one admissible gauge fixing
surface in the new system. Choosing this gauge, i.e. Φα(x) = 0, in a quantization scheme relying on a gauge fixation, such as some path integral quantization schemes (Faddeev–Popov procedure and its BRST- or BV-generalization), one re-obtains the path integral formulation of the original second-class system (as one may verify for both approaches by straightforward explicit calculations). However, now one has the option to choose another gauge, which may greatly simplify the resulting path integral measure in concrete applications. Or one can use a quantization scheme that does not need any gauge fixing, such as Dirac quantization [1] (cf. also [35]), where one regards physical states as those which are in the kernel of the quantum operators corresponding to the constraint functions; this scheme is applicable only for a first-class constraint system (the first-class property remaining alive even after quantization, i.e. assuming the absence of anomalies).

The reformulation in Proposition 6 may be generalized by relaxing the constraints πα ≈ 0 to, e.g., πα − fα(x) ≈ 0 for some choice of f: M → P such that the new system of constraints is still first-class in (M̃2, ω̃2). It is possible that such more complicated constraints lead to a simplification in the final result for a path integral measure after fixing the gauge.

It does not seem straightforward (although interesting) to us how to extend the first-class reformulation of the present subsection to the general case of a Poisson manifold P (not necessarily symplectic). If, on the other hand, the constraints Φα may be split globally (on all of M, not just in a neighborhood of C) into first-class and second-class constraints, then essentially in the above extension one merely has to replace (P, ΩP) by the symplectic leaf (L0, ωL0) and the extension goes through without difficulties.
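To make the Whitney-sum extension and its relaxed constraints more concrete, the following sketch writes out ω̃2 for the flat toy model Φ1 = −x, Φ2 = −px on T∗(R²). It assumes that the presymplectic form of (8) reduces to ω̄ = ω − φ∗ΩP = dy ∧ dpy in this model (so that its kernel is spanned by ∂x and ∂px); this identification and the sample choice of fα are our own assumptions for illustration.

```latex
% Sketch for the toy model \Phi^1=-x,\ \Phi^2=-p_x on T^*\mathbb{R}^2.
% Assumption: \bar\omega = dy\wedge dp_y, with kernel spanned by \partial_x,\ \partial_{p_x}.
\begin{align*}
\tilde\omega_2 &= \bar\omega + d\Phi^\alpha\wedge d\pi_\alpha
 = dy\wedge dp_y - dx\wedge d\pi_1 - dp_x\wedge d\pi_2, \\
\frac{\partial}{\partial\pi_1}\,\lrcorner\,\tilde\omega_2 &= dx = -d\Phi^1, \qquad
\frac{\partial}{\partial\pi_2}\,\lrcorner\,\tilde\omega_2 = dp_x = -d\Phi^2, \\
\{\pi_1,\pi_2\} &= 0, \qquad
\{\pi_1 - f_1(y,p_y),\ \pi_2\} = 0 .
\end{align*}
```

Here ω̃2 is a sum of three canonical 2-forms in the independent pairs (y, py), (π1, x) and (π2, px), so it is nondegenerate on R⁶ and the two momenta commute, consistent with Lemmas 3 and 4 and with (15). The last bracket shows one simple way to relax the constraints while keeping them abelian: any f1 depending only on (y, py), together with f2 = 0, commutes with π2 because π2 is conjugate to px alone.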
In order to transform a complete physical system from second-class to first-class, one also has to find observables (in particular this applies to the Hamiltonian) which are in involution with the new constraints. In general, this can be done as follows: observables of second-class systems are arbitrary functions on M, where, however, only the restriction to the constraint surface is of physical significance. This means that independent observables correspond to arbitrary functions on Q. Using the projection π they can be pulled back to M and then extended in an arbitrary way to the extended phase space to yield a set of functions there. By construction, these functions are in involution with the new constraints on the “constraint surface” M (they are constant on the orbits of M) and so are observables of the first-class system. See also [36] for a discussion of observables in a nonstandard approach of handling second-class constrained systems.

The two proposals for turning a second-class constrained system into a first-class one may be compared to other methods in the literature. First, there is the Faddeev–Shatashvili approach [21] of handling second-class constraints. In fact, for their toy model (M ≅ T∗(R²), dx ∧ dpx + dy ∧ dpy) constrained by Φ1 = −x, Φ2 = −px, Proposition 5 specializes to [21]. However, for their more complicated, realistic system of physical interest, the constructions differ on several grounds. There are, for example, topological differences between the extended phase spaces;
but above all, their original system of second-class constraints does not form a closed Poisson subalgebra, and thus the constructions developed in the present paper (and also in [8]) do not even apply.

A general method, abelian conversion, for turning a second-class constrained system into a first-class one has been introduced in [9, 10]. Besides the fact that it works only locally, the new first-class constraints are given there by an iteration procedure which does not allow one to find the constraints in closed form for a general system. The original method has been extended to a global version [11, 12], where an alternative has also been mentioned which is easier to compare to the present discussion: as an extended phase space one uses simply M̃ = M × Rd with symplectic form ω̃ = p1∗ωM + (p2∗dπα) ∧ (p1∗dΦα) (in our notation, denoting the projections from M̃ to M and Rd by p1 and p2, respectively; πα are global coordinates of Rd). This procedure obviously works globally, and it turns the second-class constraints Φα into abelian first-class constraints p1∗Φα on M̃ (the fact that the constraints on M̃ are abelian follows easily from v^α = X_{p1∗Φα} = ∂/∂(p2∗πα)). However, in our context we prefer the extension M̃2, which also leads to an abelian first-class system, because according to (15) it preserves the orbits generated by the original second-class constraints on M, whereas the method of [11] leads to orbits Rd in the constraint surface which are always of trivial topology. (Note that in this extension it is not possible to choose p2∗πα as first-class constraints with Φα = 0 as gauge fixing conditions, since the constraint manifold p2∗πα = 0 would be (M, ω) and so nondegenerate.)

4. Dirac Brackets and Leaf-Symplectic Embeddings of Poisson Manifolds

In the previous section we have seen the appearance of Poisson manifolds in the context of a constrained system with closed constraint algebra.
In the pure second-class case, however, both P and Q were symplectic. On the other hand, we noted that the original manifold M can be equipped naturally with a presymplectic form (8) such that, under favorable circumstances specified above which guaranteed the absence of a “Gribov problem”, the constraint surface is an admissible cross section of the respective orbits. There is, however, another perspective, in a way dual to the one with a presymplectic manifold (M, ω̄) and which is applicable for any second-class constrained system (not necessarily forming a closed algebra), where a Poisson manifold plays a role. This is the concept of the well-known Dirac bracket [1].

4.1. Dirac bracket

Given a symplectic manifold (M, ω) with a system of (regular, irreducible) second-class constraints Φα, α = 1, . . . , d, {Φα, Φβ}(x) = F^{αβ}(x), det F^{αβ} ≉ 0, Dirac defined the so-called Dirac bracket as a modified Poisson bracket {·, ·}D on M. We
again denote the constraint surface Φα (x) = 0 by C and the induced symplectic 2-form by ωC ; if ιC : C ,→ M is the respective embedding map, then ωC = ι∗C ω. Using the Poisson bivector Π inverse to ω, the bivector ΠD corresponding to the bracket {·, ·}D has the form 1 ΠD = Π + Gαβ v α ∧ v β , 2
(16)
where Gαβ is the inverse to F αβ , well-defined at least in some tubular neighborhood S ⊂ M of the constraint surface, C ⊂ S, and v α = {·, Φα } ≡ −(∂i Φα )Πij ∂j is the Hamiltonian vector field generated by the constraint Φα . One may verify that ΠD indeed satisfies the Jacobi identity and thus defines a Poisson bracket. (This may be inferred also from the considerations to follow. With Proposition 7 one finds that the leaves Φα = const of the local foliation generated by ΠD are symplectic which shows that ΠD is Poisson: In a neighborhood S of C, ΠD has constant rank and is surface forming. This implies that there is a 3-form H on S such that [ΠD , ΠD ] = (Π]D )⊗3 H (such a bivector ΠD is called H-Poisson, cf. also the remark after Definition 9). On the leaves of ΠD , the exterior derivative of its inverse equals the pull back of H. Using our later result (Proposition 7) this implies that the pull back of H to an arbitrary leaf in S vanishes, and so H is purely transversal with respect to the foliation. The transversal part, however, is projected out by (Π]D )⊗3 which gives us [ΠD , ΠD ] = 0.) By construction the constraint functions restricted to S, Φα |S ∈ C ∞ (S), span the center of the algebra generated by this bracket. Correspondingly the constraint surface is a symplectic leaf of ΠD . Moreover, the symplectic form on that leaf coincides with ωC (we will verify this explicitly in Proposition 7 below.) Inspired by the relation of the Dirac bracket of a second-class constrained system with the original symplectic structure, we make the following definition. Definition 9. Let (P, Π) be a Poisson manifold and ΩL denote the induced symplectic form on a given leaf L of the foliation of P with ιL : L ,→ P being the respec˜ and a Poisson tive embedding map. A (symplectic, presymplectic, . . .) 2-form Ω ˜ bivector Π are called compatible within S ⊂ P , if for any leaf L in S: ΩL = ι∗L Ω. Remark. 
There is an obvious adaptation of the above definition to the case of an almost Poisson bivector $\Pi$ which still generates a (generalized) foliation, the so-called $H$-Poisson or twisted Poisson structures $\Pi$, fulfilling $[\Pi,\Pi] = (\Pi^\sharp)^{\otimes 3} H$ for some closed 3-form $H$ [37–39].
The Dirac bivector (16) is well-defined in some subset $S \subset M$ containing the constraint surface $C$. Within $S$ we have the following relation.
Proposition 7. The Dirac bivector $\Pi_D$ and the symplectic form $\omega \in \Lambda^2(T^*M)$ are compatible within the domain $S \subset M$ of definition of $\Pi_D$.
Proof. It suffices to show that for any two functions $f$ and $g$ on a given leaf $L$ of the foliation with embedding map $\iota_L : L \to S$ one has on all of $L$
$$\iota_L^*\big(\{F,G\}_D\big) = \tilde X_g(f) , \tag{17}$$
where $F$ and $G$ are arbitrary extensions of $f$ and $g$ to $S$, such that $\iota_L^*F = f$ and $\iota_L^*G = g$, and where $\tilde X_g \in \Gamma(TL)$ is (uniquely) defined by means of $\tilde X_g \lrcorner\, \iota_L^*\omega = dg$. (The right-hand side is the Poisson bracket between $f$ and $g$ as induced by $\iota_L^*\omega$, while on the other hand the left-hand side gives the corresponding bracket induced by the symplectic form of the bivector $\Pi_D$ on $L$. Thus, equality for all functions $f$ and $g$ proves equality of the respective 2-forms.) We first note that the choice of the extension of the function $f$ (and likewise of $g$) does not enter the left-hand side of Eq. (17), since by construction $\{\cdot,\Phi^\alpha\}_D \equiv 0$, while two different extensions $F$, $F'$ of $f$ differ only by a combination of the constraints: $F' = F + c_\alpha\Phi^\alpha$ for some smooth functions $c_\alpha$ defined in a neighborhood of $L$ (cf. e.g. [35]). We thus may choose a particularly convenient extension, namely one such that $\iota_L^*(v^\alpha(F)) = 0$ (and likewise for $G$), where, as before, $v^\alpha = \{\cdot,\Phi^\alpha\}$. It then immediately follows from the definition (16) of the Dirac bracket that $\iota_L^*\{F,G\}_D = \iota_L^*\{F,G\} \equiv \iota_L^*(X_G(F)) \equiv X_G|_L(f)$, where $X_G = \{\cdot,G\}$, since, due to the chosen extension, $X_G$ is tangential to $L$ and thus may be restricted consistently to $L$. From $X_G|_L \lrcorner\, \iota_L^*\omega = \iota_L^*(X_G \lrcorner\, \omega) = \iota_L^* dG \equiv dg$ it thus follows that $X_G|_L = \tilde X_g$, which proves the assertion.
Remark. (i) The constraint surface is the pre-image of the origin of the constraint map $\Phi^\alpha : M \to \mathbb{R}^d$. (Note that here we no longer require that the constraint algebra is closed; thus, the target does not inherit canonically a Poisson bracket in this case.) This corresponds only to one particular symplectic leaf $L_0$ of $(S,\Pi_D)$. If one shifts the constraint surface slightly by setting $\Phi^\alpha(x)$ to some constant $c^\alpha \in \mathbb{R}^d$ small enough such that the respective pre-image $L_c$ is still in $S$, the bivector $\Pi_D$ given in (16) still provides the respective Dirac bracket. So all the leaves $L_c$ of $(S,\Pi_D)$ are seen to be possible constraint surfaces.
(ii) Constraints describing a fixed constraint surface $L_0$ are defined only up to redefinitions $\Phi^\alpha \to \tilde\Phi^\alpha \equiv A^\alpha{}_\beta\Phi^\beta$ of the constraint functions, where the coefficient matrix $A^\alpha{}_\beta(x)$, required to have a nonvanishing determinant on $C$, is in general a smooth function on phase space (or at least on some neighborhood of $C$). The corresponding Dirac bivector $\tilde\Pi_D$, defined in some region $\tilde S \subset M$ containing $C$, now has in general different symplectic leaves on the intersection of $S$ and $\tilde S$ to the ones generated by $(\Pi_D, S)$; in general only the constraint surface $C$ itself is a joint symplectic leaf. So, one obtains different possible constraint surfaces $\tilde L_c$, shifted as in the previous remark, when one redefines the constraint functions which specify the original constraint surface $C = L_0 = \tilde L_0$. Any Dirac bivector $\Pi_D$ in Eq. (16), corresponding to some specific choice of the constraint functions $\Phi^\alpha$ and defined in a region $S \subset M$, is compatible with the original symplectic 2-form $\omega$ within $S$.
(iii) According to the above consideration, $S$ is foliated into leaves $C$ (with a slight abuse of notation, since previously $C$ denoted only the original constraint surface) which are all second-class submanifolds, and at any point $x \in S \subset M$ we have a splitting of the tangent space according to $T_xM = T_xC \oplus T_xC^\perp$, and likewise of the cotangent space $T_x^*M = \mathrm{Ann}_M(T_xC^\perp) \oplus \mathrm{Ann}_M(T_xC)$, induced by the original bivector $\Pi$ (cf. also (1) with $\omega^\sharp = -(\Pi^\sharp)^{-1}$ and recall the discussion at the end of the proof of Theorem 3). Denote by $\pi_1$ the projection to the first factor in $T_xM$ and by $\bar\pi_1$ to the first factor in $T_x^*M$ and regard both $\Pi$ and $\Pi_D$ as bivector fields defined on $S \subset M$. Then the definition (16) of the Dirac bivector $\Pi_D$ is equivalent [40] to saying that $\Pi_D$ is the projection of $\Pi$ to $TC$ along $TC^\perp$. In formulas this becomes $\Pi_D^\sharp = \pi_1 \circ \Pi^\sharp \circ \bar\pi_1$ or, equivalently, $\Pi_D = \Pi \circ (\bar\pi_1 \otimes \bar\pi_1)$ or, if one regards the bivectors as elements or sections of $\Lambda^2 TM$, $\Pi_D = (\pi_1 \otimes \pi_1)\Pi$. This follows easily using previous results. As mentioned before Proposition 1, we have $\mathrm{Ann}_M(TC) = \langle d\Phi^\alpha \rangle$, implying $\pi_1 \circ \Pi^\sharp \circ \bar\pi_1(d\Phi^\alpha) = 0$. Furthermore, $\pi_1 \circ \Pi^\sharp \circ \bar\pi_1$ restricted to the leaves is by definition of $\bar\pi_1$ non-degenerate and coincides with $\Pi^\sharp$. So the two properties which characterize the Dirac bivector $\Pi_D$ in (16) are fulfilled, which proves $\pi_1 \circ \Pi^\sharp \circ \bar\pi_1 = \Pi_D^\sharp$. This presents an alternative proof for the compatibility of $\Pi_D$ with the original symplectic form $\omega$.
4.2. Closed constraint algebra revisited
Before proceeding let us specialize the above considerations to a system of a closed, second-class constraint algebra. In this case the Dirac bivector takes the form
$$\Pi_D = \Pi - \tfrac{1}{2}\,(\Omega_P)_{\alpha\beta}\,(d\Phi^\alpha \lrcorner\, \Pi) \wedge (d\Phi^\beta \lrcorner\, \Pi) , \tag{18}$$
wherever $\Omega_P^\sharp = -(\Pi_P^\sharp)^{-1}$ is defined, i.e. at least in a tubular neighborhood $S$ of the constraint surface $C$. For simplicity we assume $S = M$, or, equivalently, $(P,\Pi_P)$ symplectic. (We refer to diagram (9) for the notation.) Then $(M,\Pi_D)$ is a Poisson manifold.
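As a concrete illustration of the closed-algebra case, the following minimal Python sketch builds a Dirac bivector for two linear second-class constraints on $\mathbb{R}^4$ and checks that the constraints become Casimir functions. It uses the standard textbook matrix form $\Pi_D = \Pi - \Pi A^T C^{-1} A\,\Pi$ with $C^{\alpha\beta} = \{\Phi^\alpha,\Phi^\beta\}$, which is assumed here to agree with (18) up to sign conventions. The coordinates $(q_1,p_1,q_2,p_2)$, the constraints $\Phi^1 = q_2 - q_1$, $\Phi^2 = p_2$ and all function names are hypothetical choices made for this example only.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(C):
    """Exact inverse of a 2x2 matrix; fails if the constraints are not second class."""
    (a, b), (c, d) = C
    det = a * d - b * c
    if det == 0:
        raise ValueError("bracket matrix of the constraints is singular")
    return [[Fraction(d) / det, Fraction(-b) / det],
            [Fraction(-c) / det, Fraction(a) / det]]

def dirac_bivector(Pi, A):
    """Return Pi_D = Pi - Pi A^T C^{-1} A Pi, where the rows of A are the
    constraint gradients and C = A Pi A^T is the matrix of constraint brackets."""
    At = transpose(A)
    C = matmul(matmul(A, Pi), At)
    correction = matmul(matmul(matmul(matmul(Pi, At), inverse_2x2(C)), A), Pi)
    return [[p - c for p, c in zip(p_row, c_row)]
            for p_row, c_row in zip(Pi, correction)]

# Canonical bivector on R^4 in coordinates (q1, p1, q2, p2), assuming {q_i, p_i} = 1.
Pi = [[0, 1, 0, 0],
      [-1, 0, 0, 0],
      [0, 0, 0, 1],
      [0, 0, -1, 0]]

# Gradients of the hypothetical constraints Phi^1 = q2 - q1 and Phi^2 = p2.
A = [[-1, 0, 1, 0],
     [0, 0, 0, 1]]

Pi_D = dirac_bivector(Pi, A)

# Column a of Pi_D A^T collects the Dirac brackets {x^i, Phi^a}_D; all should vanish.
casimir_check = matmul(Pi_D, transpose(A))
print(casimir_check)
```

All entries of `casimir_check` should be zero, reflecting that the constraint functions span the center of the Dirac bracket. Exact rational arithmetic is used so that this can be tested without a numerical tolerance.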
First of all we note that the expression (18) obviously is invariant with respect to a (regular) coordinate change in $P$, $\Phi^\alpha \to \tilde\Phi^\alpha(\Phi)$, corresponding to a particular change of constraint functions, i.e. to a particular matrix $A^\alpha{}_\beta(x)$ in Remark (ii) above. Furthermore, by construction, the reduced phase space $(R = \phi^{-1}(0), \omega_R)$ is a symplectic leaf of this Poisson manifold. This also applies to any other, shifted value for the constraint functions, $\Phi^\alpha := \mathrm{const}^\alpha$, cf. Remark (i) above. The bivector $\Pi_D$, defined on all of $M$, is not only compatible with $\omega$ (cf. Proposition 7 above), but clearly also with $\bar\omega$ defined in (8). As remarked already at the beginning of this section, $M$ equipped with the presymplectic form $\bar\omega$ provides a perspective “dual” to the one following from $M$ being equipped with the Dirac bivector $\Pi_D$. There is now also another (nonsymplectic) Poisson manifold we can associate to the system at hand: in Sec. 3.4, and under the assumptions stated there, we considered two reformulations of the original second-class constrained system in an
extended phase space. In the second of those extensions, $(\tilde M_2,\tilde\omega_2)$, the first-class constraint algebra in the extended phase space was abelian and thus in particular closed. Therefore we can adapt diagram (10) to the present situation (combining it with part of diagram (9)):
[Diagram (19): objects $(M,\bar\omega)$, $(\tilde M_2,\tilde\omega_2)$, $(R,\omega_R)$, $(Q,\Omega_Q)$, $(\tilde Q,\tilde\Pi_Q)$ and $(\tilde P = \mathbb{R}^d, 0)$; maps $i$, $\tilde\pi$, $\tilde\pi|_M = \pi$, an isomorphism $\cong$ and an inclusion $\hookrightarrow$.]
This diagram displays a “duality” between $(M,\bar\omega)$, a presymplectic manifold, and the Poisson manifold $(\tilde Q,\tilde\Pi_Q)$. It is therefore natural to ask what the relation between the two Poisson manifolds $(M,\Pi_D)$ and $(\tilde Q,\tilde\Pi_Q)$ is. Note as an aside that the second Poisson manifold can be defined in general only by the second of the two extensions of Sec. 3.4. (As remarked at the end of Sec. 3.4.1, the first-class constraints in $\tilde M_1$ form a closed algebra only when the original constraint algebra $\Pi_P$ is at most linear in the constraints $\Phi^\alpha$.) In this physically particularly interesting case, however, an analogous question may be posed.
Proposition 8. Under the conditions specified in Corollary 2′, the Poisson manifolds $(M,\Pi_D)$ and $(\tilde Q,\tilde\Pi_Q)$ are isomorphic, iff $P \cong \mathbb{R}^d$ topologically.
Proof. The necessity follows from the remark that $\tilde Q \cong Q \times \mathbb{R}^d$, i.e. the constraints $\pi_\alpha$ act trivially on the local linear factor $\mathbb{R}^d$ in $\tilde M_2$ (with its coordinates $\pi_\alpha$) and have a flow agreeing with that generated by $d\Phi^\alpha \lrcorner\, \Pi$ on $M$. This is a consequence of (13), where $i(M)$ may as well be taken to denote any section in $\tilde M_2$ specified by some constant value of the momenta $\pi_\alpha$ (cf. also Lemma 4). The rest of the claim is proved by establishing symplectomorphisms between the symplectic leaves $\pi_\alpha = \mathrm{const}$ of $\tilde\Pi_Q$ and $\Phi^\alpha(x) = \mathrm{const}$ of $\Pi_D$. Using the direct product structure of $M$ under the conditions of Corollary 2′, one easily sees that the leaves are all symplectomorphic to the reduced phase space, while the leaf spaces are both $\mathbb{R}^d$.
4.3. Compatible presymplectic forms on neighborhoods of regular leaves
We may now reverse the question addressed originally by Dirac: given a Poisson bivector $\Pi \in \Lambda^2(TM)$ on a manifold $M$, is there a symplectic (or, since $\dim M$ may be odd, at least presymplectic) 2-form which is compatible with $\Pi$ in $M$?
In general, such a compatible presymplectic 2-form will not exist on all of M (recall that also the Dirac bivector is defined only in a neighborhood of the
constraint surface); but it may nevertheless exist in some neighborhood $S$ of a given leaf $L$ in $M$. In particular rank $\Pi$ should be constant in all of $S$, so $S$ should be a regular Poisson manifold. For simplicity, we will assume that $M$ is already regular and look for a compatible presymplectic form on all of $M$. We remark that regularity of $(M,\Pi)$ does not imply that $M$ is foliated regularly. Consider for example the Poisson tensor $\Pi = (\partial_1 + \omega\partial_2) \wedge (\partial_3 + \bar\omega\partial_4)$ with irrational $\omega$, $\bar\omega$ on a torus $T^4$ (coordinates $x_i \equiv x_i + 1$). Any of its symplectic leaves is dense in all of $T^4$. This bivector is easily verified to permit the compatible symplectic 2-form $\Omega = (1 + \omega\bar\omega)^{-1}(dx_1 \wedge dx_3 + dx_2 \wedge dx_4)$ on $T^4$ (or simply $dx_1 \wedge dx_3$ as a compatible presymplectic form).
We first construct an almost presymplectic form (i.e. just any 2-form, not necessarily closed) which is compatible with $\Pi$. This is possible always: picking an auxiliary Riemannian metric on $M$, we can define a 2-form $\tilde\Omega \in \Lambda^2(T^*M)$ by requiring that for any $X \in M$ a vector normal to the leaf through $X$ is annihilated by $\tilde\Omega$, whereas for a tangential vector $v \in T_XL$ we have $v \lrcorner\, \tilde\Omega = v \lrcorner\, \Omega_L$ (again, $\Omega_L$ is the symplectic form induced on the leaf by $\Pi$). In this way we obtain a well-defined 2-form on $M$ which is compatible with $\Pi$, but not closed in general.
There is still some freedom in defining this form since one can add an arbitrary 2-form $\lambda$ which vanishes when pulled back to a leaf of the foliation $F$. The space of such $k$-forms is denoted by $\Lambda^k_0(M,F)$. This may be exploited to obtain another compatible 2-form $\Omega := \tilde\Omega + \lambda$. Since $d$ maps $\Lambda^k_0(M,F)$ to $\Lambda^{k+1}_0(M,F)$ and $d\tilde\Omega \in \Lambda^3_0(M,F)$, the question of whether $\lambda$ may be chosen in such a way that $\Omega$ is closed relates to the so-called characteristic form-class of $\Pi$, i.e. to $d\tilde\Omega$ regarded as an element of the relative cohomology$^e$ $H^3_{\mathrm{rel}}(M,F) := \ker d_0^{(3)} / \mathrm{Im}\, d_0^{(2)}$ with respect to the foliation $F$ ($d_0^{(k)}$ denotes the restriction of $d$ to $\Lambda^k_0(M,F)$; cf.
also [14]):
Proposition 9. Let $(M,\Pi)$ be a regular Poisson manifold and $\tilde\Omega$ be any 2-form compatible with $\Pi$ (such a 2-form exists always). There is a compatible presymplectic form $\Omega$ on $M$ if and only if the characteristic form-class of $\Pi$, $[d\tilde\Omega] \in H^3_{\mathrm{rel}}(M,F)$, vanishes.
Proof. Note first that by construction $d\tilde\Omega \in \Lambda^3_0(M,F)$ such that $[d\tilde\Omega]$ is well-defined in the relative cohomology. Let us first prove that the condition is necessary: assume that there exists a compatible presymplectic form $\Omega$. Then $[d\Omega] = 0$ because $\Omega$ is closed, and $\Omega = \tilde\Omega + \lambda$ for some $\lambda \in \Lambda^2_0(M,F)$ implies $[d\tilde\Omega] = [d\Omega] - [d\lambda] = 0$. Conversely, if $[d\tilde\Omega] = 0$, then $d\tilde\Omega = d\lambda$ for some $\lambda \in \Lambda^2_0(M,F)$. This implies that $\Omega := \tilde\Omega - \lambda$ is closed and compatible.
$^e$ To avoid confusion, we point out that this is not the standard definition of “relative cohomology”, but the definition of “relative cohomology with respect to a foliation” as used in [14]; we thank a referee for pointing this out.
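The torus example given before Proposition 9 can be checked by hand, or with a few lines of Python as sketched below. The sketch assumes the convention that the leaf symplectic form of $\Pi = u \wedge v$ is normalized by $\Omega_L(u,v) = 1$ (the overall sign depends on conventions), takes $\omega = \sqrt{2}$ and $\bar\omega = \sqrt{3}$ as illustrative irrational values, and uses a hypothetical helper `wedge_eval`; none of these choices is taken from the text.

```python
import math

def wedge_eval(form, u, v):
    """Evaluate a 2-form {(i, j): c}, meaning c * dx_i ^ dx_j, on the vectors u and v."""
    return sum(c * (u[i] * v[j] - u[j] * v[i]) for (i, j), c in form.items())

# Illustrative irrational slopes (an assumption of this sketch).
w, wbar = math.sqrt(2.0), math.sqrt(3.0)

# Pi = u ^ v with u = d_1 + w d_2 and v = d_3 + wbar d_4 (indices are 0-based below).
u = (1.0, w, 0.0, 0.0)
v = (0.0, 0.0, 1.0, wbar)

c = 1.0 / (1.0 + w * wbar)
omega_symplectic = {(0, 2): c, (1, 3): c}   # (1 + w wbar)^{-1} (dx1^dx3 + dx2^dx4)
omega_presymplectic = {(0, 2): 1.0}         # dx1 ^ dx3

value_symplectic = wedge_eval(omega_symplectic, u, v)
value_presymplectic = wedge_eval(omega_presymplectic, u, v)
print(value_symplectic, value_presymplectic)
```

Both values should equal 1 up to rounding, so both 2-forms restrict to the same form on the two-dimensional leaves spanned by $u$ and $v$; since their coefficients are constant, they are also closed.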
In the remainder of this section we will discuss this main condition in a more explicit form. Using the auxiliary metric, we have a splitting of the tangent bundle $TM = TL \oplus T^\perp L$, interpreting both bundles on the right-hand side as vector bundles over the base manifold $M$. Here, we have to assume that the normal distribution $T^\perp L$ is integrable for reasons which will become clear in the course of this subsection. Using the metric, we obtain a dual decomposition of the cotangent bundle $T^*M$ leading to a two-fold grading of $\Lambda^*(T^*M)$ by taking exterior powers. In this way the decomposition $TM = TL \oplus T^\perp L$ entails a decomposition of $n$-forms $\omega \in \Lambda^n(T^*M)$. More explicitly, we have $\omega = \sum_{i=0}^{n} \omega^{(i,n-i)}$ fulfilling $v_j \lrcorner \cdots \lrcorner\, v_1 \lrcorner\, \omega_X^{(i,n-i)} = 0$ for $j > n-i$ and all $v_1,\dots,v_j \in T_X^\perp L$, whereas for $j = n-i$ either $\omega^{(i,n-i)} = 0$ or there are $v_1,\dots,v_j \in T_X^\perp L$ such that $v_j \lrcorner \cdots \lrcorner\, v_1 \lrcorner\, \omega_X^{(i,n-i)} \neq 0$. The uniqueness of this decomposition (for a fixed Riemannian metric) can also be seen in local coordinates adapted to the foliation of $M$, in which it amounts to collecting terms with a fixed number of differentials along normal coordinates.
Definition 10. An $n$-form $\omega$ is called pure of degree $(i,n-i)$ if it has the decomposition $\omega = \omega^{(i,n-i)}$. The space $\Lambda^{(i,n-i)}(T^*M) \subset \Lambda^n(T^*M)$ is the space of pure forms of degree $(i,n-i)$.
Note that these and the following definitions depend on the Riemannian metric chosen on $M$. We can similarly decompose the exterior derivative operator $d$ into two parts $d_\parallel : \Lambda^{(i,n-i)}(T^*M) \to \Lambda^{(i+1,n-i)}(T^*M)$ and $d_\perp : \Lambda^{(i,n-i)}(T^*M) \to \Lambda^{(i,n+1-i)}(T^*M)$. Given a pure form $\omega \in \Lambda^{(i,n-i)}(T^*M)$, we define $d\omega = (d\omega)^{(i+1,n-i)} + (d\omega)^{(i,n+1-i)} =: d_\parallel\omega + d_\perp\omega$. Both derivative operators can be extended linearly so as to be defined on arbitrary forms.
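In adapted local coordinates the decomposition into pure parts is just bookkeeping: each basis monomial is sorted by how many of its differentials point along the leaves and how many along the normal directions. The following Python sketch shows this sorting under the assumption that the coordinates are already adapted to the foliation and to the (integrable) normal distribution; the dictionary representation of a form and the name `decompose` are hypothetical.

```python
from collections import defaultdict

def decompose(form, leaf_coords):
    """Split a differential form into its pure parts.

    `form` maps strictly increasing index tuples (a1, ..., an) to coefficients,
    standing for coeff * dX^{a1} ^ ... ^ dX^{an}. `leaf_coords` is the set of
    coordinate indices along the leaves. The result maps each degree (i, n - i),
    with i the number of leaf differentials, to the corresponding pure part.
    """
    parts = defaultdict(dict)
    for indices, coeff in form.items():
        i = sum(1 for a in indices if a in leaf_coords)
        parts[(i, len(indices) - i)][indices] = coeff
    return dict(parts)

# Hypothetical 2-form on a 3-dimensional chart: coordinates 0 and 1 run along
# the leaves, coordinate 2 is normal to them.
form = {(0, 1): 5.0, (0, 2): 2.0, (1, 2): -1.0}
parts = decompose(form, leaf_coords={0, 1})
print(parts)
```

Here the monomial $dX^0 \wedge dX^1$ lands in degree $(2,0)$ and the two mixed monomials in degree $(1,1)$; with constant coefficients as in this toy input the derivatives $d_\parallel$ and $d_\perp$ vanish, but for coordinate-dependent coefficients they would raise the first or the second entry of the degree, respectively.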
Choosing local coordinates $(X^\alpha, X^I)$ such that $X^\alpha$ parameterize the leaves in a neighborhood and $X^I$ the normal directions, the new derivative operators can be written as $d_\parallel = dX^\alpha \wedge \partial_\alpha$ and $d_\perp = dX^I \wedge \partial_I$. At this point the integrability of the normal distribution has been used: otherwise there would be an additional term in the decomposition of $d$ (see also [14]). This can be seen using the Cartan formula, which for a 1-form $\omega$ reads
$$d\omega(v_1,v_2) = v_1\,\omega(v_2) - v_2\,\omega(v_1) - \omega([v_1,v_2]) .$$
Choosing $v_1, v_2 \in T^\perp L$ such that $[v_1,v_2] \notin T^\perp L$ (which by definition is possible only if the normal distribution is not integrable), there is always a 1-form $\omega \in \Lambda^{(1,0)}(T^*M)$ with $d\omega(v_1,v_2) \neq 0$. Thus, $d\omega$ has a nonvanishing contribution in $\Lambda^{(0,2)}(T^*M)$ and $d \neq d_\parallel + d_\perp$ for a nonintegrable normal distribution. One can see that there is only one additional term in the general case mapping $\Lambda^{(i,j)}(T^*M)$ to
$\Lambda^{(i-1,j+2)}$; but this would already complicate the descent equations derived below. Therefore, we will only deal with the integrable case from now on, for which we have
Lemma 5. $d_\parallel^2 = 0$, $\{d_\parallel, d_\perp\} := d_\parallel d_\perp + d_\perp d_\parallel = 0$, $d_\perp^2 = 0$.
Proof. It suffices to prove the assertion for actions on a pure form $\omega$ of arbitrary degree. We then have
$$0 = d^2\omega = d(d_\parallel\omega + d_\perp\omega) = d_\parallel^2\omega + \{d_\parallel, d_\perp\}\omega + d_\perp^2\omega .$$
Because the three terms in the sum are all pure of different degrees, they have to vanish separately.
Now we are in the position to proceed with the derivation of conditions for the existence of a compatible presymplectic form. The 2-form $\tilde\Omega$ introduced above is pure of degree $(2,0)$ by construction, $\tilde\Omega = \tilde\Omega^{(2,0)}$. As already discussed in the paragraph preceding Proposition 9, $\tilde\Omega$ is not necessarily closed since $d_\perp\tilde\Omega \neq 0$ in general. Adding a form $\lambda \in \Lambda^2_0(M,F)$ leads to a new form $\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ which is closed if and only if
$$d\Omega = d_\perp\tilde\Omega^{(2,0)} + d_\parallel\Omega^{(1,1)} + d_\perp\Omega^{(1,1)} + d_\parallel\Omega^{(0,2)} + d_\perp\Omega^{(0,2)} = 0 .$$
Collecting forms of equal degree immediately leads to
Proposition 10. Let $(M,\Pi)$ be a regular Poisson manifold being equipped with a Riemannian metric and an associated integrable decomposition of the tangential bundle. There is a presymplectic 2-form $\Omega = \Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ compatible with $\Pi$ on $M$ if and only if the descent equations
$$d_\parallel\Omega^{(1,1)} = -d_\perp\Omega^{(2,0)} , \qquad d_\parallel\Omega^{(0,2)} = -d_\perp\Omega^{(1,1)} , \qquad d_\perp\Omega^{(0,2)} = 0 \tag{20}$$
have a solution on $M$ subject to the condition that $\Omega^{(2,0)}$ restricted to any leaf in $M$ coincides with the symplectic form of that leaf.
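The step of collecting forms of equal degree behind Proposition 10 can be written out as follows. This is only a sketch of the bookkeeping: it assumes, as the text implies, that $d_\parallel\tilde\Omega^{(2,0)}$ vanishes because $\tilde\Omega^{(2,0)}$ restricts to the closed symplectic form on every leaf, and it sorts the remaining terms of $d\Omega$ by their degree.

```latex
\begin{align*}
\text{degree } (3,0):&\quad d_\parallel \tilde\Omega^{(2,0)} = 0
   \quad \text{(automatic, since the leaf forms are closed)},\\
\text{degree } (2,1):&\quad d_\perp \tilde\Omega^{(2,0)} + d_\parallel \Omega^{(1,1)} = 0,\\
\text{degree } (1,2):&\quad d_\perp \Omega^{(1,1)} + d_\parallel \Omega^{(0,2)} = 0,\\
\text{degree } (0,3):&\quad d_\perp \Omega^{(0,2)} = 0.
\end{align*}
```

The last three lines are the three descent equations, read from left to right.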
In special cases of a foliation we can reformulate the conditions of the proposition.
Corollary 3. If $M$ is foliated trivially, i.e. it is of the form $M \cong L \times \mathbb{R}^k$, then the first equation in (20) implies
$$\partial_I \oint_\sigma \Omega^{(2,0)} = 0 ,$$
where $\partial_I$ denotes any differentiation transversal to $L$ and $\sigma$ is a closed two-cycle in $L$. This means that the symplectic volume of any closed two-cycle in a leaf has to be constant in $M$.
This condition is violated, for example, for any family of homeomorphic coadjoint orbits of a compact, semisimple Lie algebra because the symplectic form of leaves in
the dual Lie algebra with the Kirillov–Kostant structure depends nontrivially on the Casimir functions (the radial coordinate, e.g. for su(2)). In particular, according to the work of Kirillov, the (always discrete) set of irreducible unitary representations of a compact, semisimple Lie algebra corresponds to the set of all integral symplectic leaves in the corresponding Lie Poisson manifold, that is to leaves satisfying that $\oint_\sigma \Omega^{(2,0)}$ is an integer multiple of a fixed constant for any two-cycle $\sigma$ in the leaf. Correspondingly, $\oint_\sigma \Omega^{(2,0)}$ depends nontrivially on the leaf and Poisson manifolds of this kind do not allow compatible presymplectic forms. (In this argument we also used analyticity, which implies that $\partial_I \oint_\sigma \Omega^{(2,0)}$ cannot vanish on an interval if $\oint_\sigma \Omega^{(2,0)}$ is not constant. This means that any open subset of the dual Lie algebra contains (part of) a leaf violating the condition of the Corollary.)
We have another interesting situation if all leaves in $M$ have trivial second cohomology: $H^2(L) = 0$. If $M$ is foliated trivially, i.e. of the form $M \cong L \times \mathbb{R}^k$, one can easily verify that there is always a compatible presymplectic form:
Lemma 6. If $\tilde\Omega^{(2,0)}$ has a symplectic potential $\theta^{(1,0)}$ on any leaf $L$ in $M \cong L \times \mathbb{R}^k$, i.e. $\tilde\Omega^{(2,0)} = d_\parallel\theta^{(1,0)}$, and $\theta^{(1,0)}$ varies smoothly from leaf to leaf, then $\Omega := d\theta^{(1,0)}$ is a compatible presymplectic form on $M$. In particular, if all leaves $L$ in a trivially foliated $M \cong L \times \mathbb{R}^k$ have trivial second cohomology, then there exists a compatible presymplectic form on $M$.
This can also be derived using the descent equations, which acquire the form (using Lemma 5) $d_\parallel\Omega^{(1,1)} = d_\parallel d_\perp\theta^{(1,0)}$, solved by $\Omega^{(1,1)} := d_\perp\theta^{(1,0)}$, leading to $d_\parallel\Omega^{(0,2)} = -d_\perp^2\theta^{(1,0)} = 0$ in addition to $d_\perp\Omega^{(0,2)} = 0$. The latter two equations have an obvious solution $\Omega^{(0,2)} = 0$, implying
$$\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} = d_\parallel\theta^{(1,0)} + d_\perp\theta^{(1,0)} = d\theta^{(1,0)} .$$
As a more explicit example$^f$ we look at the manifold $M = T^2 \times \mathbb{R}$ with Poisson bivector $\Pi = F(x_1,x_2,x_3)(\partial_1 + \omega\partial_2) \wedge \partial_3$ with an arbitrary function $F$ on $M$. Leaves are submanifolds subject to the condition $\omega x_1 - x_2 = 0$ and we can use $x_1$ and $x_3$ as local coordinates of a leaf. If we choose the normal distribution to be spanned by $\partial_2$, any 2-form $\Omega$ is split into $\Omega = \tilde\Omega^{(2,0)} + \Omega^{(1,1)} + \Omega^{(0,2)}$ with $\tilde\Omega^{(2,0)} = F^{-1}(x_1,x_2,x_3)\, dx_1 \wedge dx_3$, $\Omega^{(1,1)} = \mu_1\, dx_1 \wedge dx_2 + \mu_2\, dx_3 \wedge dx_2$, $\Omega^{(0,2)} = 0$, and only the first descent equation is nontrivial and takes the form
$$d_\parallel\Omega^{(1,1)} = (\partial_3\mu_1 - \partial_1\mu_2)\, dx_1 \wedge dx_2 \wedge dx_3 = -d_\perp\Omega^{(2,0)} = \partial_2 F^{-1}\, dx_1 \wedge dx_2 \wedge dx_3 .$$
$^f$ We thank A. Weinstein for suggesting to look at such an example.
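The descent equation just stated can be sanity-checked numerically. The sketch below computes the coefficient of $dx_1 \wedge dx_2 \wedge dx_3$ in $d\Omega$ by central finite differences for trial functions chosen purely for illustration: $F^{-1} = 2 + \cos(2\pi x_2)$, $\mu_1 = x_3\,\partial_2 F^{-1}$ and $\mu_2 = 0$. These trial functions, the step size and all function names are assumptions of this example and do not come from the text.

```python
import math

def F_inv(x1, x2, x3):
    # Hypothetical choice of 1/F, periodic in x1 and x2 and independent of x3.
    return 2.0 + math.cos(2 * math.pi * x2)

def mu1(x1, x2, x3):
    # Trial coefficient: an x3-antiderivative of d(1/F)/dx2 for the choice above.
    return -2 * math.pi * math.sin(2 * math.pi * x2) * x3

def mu2(x1, x2, x3):
    return 0.0

def partial(f, i, x, h=1e-5):
    """Central finite difference of f along coordinate i (0-based) at the point x."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (f(*xp) - f(*xm)) / (2 * h)

def d_omega_coefficient(x):
    """Coefficient of dx1^dx2^dx3 in d(F^{-1} dx1^dx3 + mu1 dx1^dx2 + mu2 dx3^dx2)."""
    return partial(mu1, 2, x) - partial(mu2, 0, x) - partial(F_inv, 1, x)

sample_points = [(0.1, 0.2, 0.3), (0.7, -0.4, 1.5), (0.0, 0.25, -2.0)]
residuals = [d_omega_coefficient(x) for x in sample_points]
print(residuals)
```

The residuals should vanish up to discretization error, which is the statement that $\partial_3\mu_1 - \partial_1\mu_2 = \partial_2 F^{-1}$ makes this trial 2-form closed. Note that the trial $\mu_1$ grows linearly in $x_3$, so it is a function on $T^2 \times \mathbb{R}$ but not one that is periodic in $x_3$.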
This implies $\partial_3\mu_1 - \partial_1\mu_2 = \partial_2 F^{-1}$, which is solved by, e.g., $\mu_1 = \int \partial_2 F^{-1}\, dx_3$, $\mu_2 = 0$, yielding
$$\Omega = F^{-1}(x_1,x_2,x_3)\, dx_1 \wedge dx_3 + \Big(\int \partial_2 F^{-1}\, dx_3\Big)\, dx_1 \wedge dx_2$$
as compatible presymplectic form. Note that $\Omega$ is well-defined globally even if $\omega$ is irrational, in which case the leaves are dense in the torus factor of $M$. However, if we change $M$ to be $T^2 \times S^1$, the $x_3$-integration in the second term of $\Omega$ is not periodic if there is a nonvanishing zero-mode in the Fourier decomposition of $\partial_2 F^{-1}$ with respect to $x_3$. This means that in such a case the characteristic form class of $\Pi$ on $T^2 \times S^1$ does not vanish. If $\omega$ is rational the leaves in $M$ have nontrivial second cohomology, whereas for irrational $\omega$ the leaves are of topology $\mathbb{R} \times S^1$ and so have trivial second cohomology, but $M$ is not foliated trivially. So in both cases, there is no contradiction to Lemma 6.
4.4. Leafwise symplectic embeddings of Poisson manifolds
Theorem 1 asserts that any Poisson manifold $(P,\Pi_P)$ has a symplectic realization. This provides an appropriate Poisson map from a symplectic manifold to the given Poisson manifold. As demonstrated in Sec. 3 above (cf. in particular Proposition 2 and diagram (11)), physically this is, for example, of interest in constrained systems with a closed algebra: any Poisson manifold (or at least any region of it contained in $\mathbb{R}^d$, $d = \dim P$) can be understood as arising from a constrained Hamiltonian system with appropriate constraint map $\phi$. The Dirac bracket and the considerations of the present section motivate the question for another map, going in a reverse direction. Clearly, the identity map from $(S,\omega)$ to $(S,\Pi_D)$ is not a Poisson map (here $\Pi_D$ is the Dirac bivector (16) defined for some second-class constraints; $S$ is the neighborhood of $C$ for which $\Pi_D$ exists). Still, $\Pi_D$ and $\omega$ are by no means unrelated; they are what we called compatible with one another.
In the language of maps, this may be rephrased as follows: the embedding map from (S, ΠD ) into (M, ω) is leafwise symplectic. Definition 11. A map from a Poisson manifold (P, ΠP ) to a symplectic manifold (M, ω) is leafwise symplectic (or leaf-symplectic), if its restriction to any symplectic leaf of (P, ΠP ) is a symplectic map. Remark. Recall that according to Definition 3 a map f between two symplectic manifolds (M1 , ω1 ) and (M2 , ω2 ) is symplectic, iff f ∗ ω2 = ω1 . Therefore f : (P, ΠP ) → (M, ω) is leaf-symplectic, iff (f ◦ ιL )∗ ω = ΩL for any leaf L of (P, ΠP ), ΩL being the induced symplectic 2-form on L. Proposition 11. A regular Poisson manifold (P, ΠP ) permits a leaf-symplectic embedding into some symplectic manifold, iff its characteristic form-class vanishes. Proof. According to Proposition 9, there exists a compatible presymplectic form on P iff the characteristic form-class of ΠP vanishes. Assuming that there exists a
leaf-symplectic embedding $f : P \to M$ from $P$ to $(M,\omega)$, $f^*\omega$ provides a compatible presymplectic form on $(P,\Pi_P)$. So the vanishing of the characteristic form-class is a necessary condition. On the other hand, if it vanishes, there exists a compatible presymplectic form $\Omega$ on $P$. According to Theorem 2, $(P,\Omega)$ can be embedded coisotropically into a symplectic manifold $(M,\omega)$. Denoting this embedding by $\iota$ and the embedding of a symplectic leaf $(L,\Omega_L)$ into $P$ by $\iota_L$, we have $\Omega_L = \iota_L^*\Omega = \iota_L^*\iota^*\omega = (\iota|_L)^*\omega$. Thus the condition is also sufficient.
Leaf-symplectic embeddings can be regarded as a concept related to isotropic symplectic realizations,$^g$ which are Poisson maps $r : (Y,\omega) \to (M,\Pi)$ from a symplectic manifold $(Y,\omega)$ to a Poisson manifold $(M,\Pi)$ such that the fibers of $r$ are isotropic (i.e. $TF \subset (TF)^\perp$ for any fiber $F = r^{-1}(m)$, $m \in M$). They are of interest for symplectic groupoids [18, 41], which also appeared recently in the context of Poisson Sigma Models in Ref. [42]. Obstructions for the existence of an isotropic symplectic realization have been derived [14, 43] which are of cohomological nature and similar to those derived here for the existence of a leaf-symplectic embedding.
While the Dirac bracket, which provided the motivation for defining leaf-symplectic embeddings, is usually used only on symplectic manifolds $(M,\omega)$, there is a straightforward generalization of Definition 11 to the case of a Poisson target manifold:
Definition 12. A map from a Poisson manifold $(P_1,\Pi_1)$ to a Poisson manifold $(P_2,\Pi_2)$ is leaf-to-leaf symplectic if the image of any leaf $L_1$ of $P_1$ is symplectically embedded in a leaf $L_2$ of $P_2$.
Remark. (i) If $(P_2,\Pi_2)$ is symplectic, this clearly reduces to Definition 11.
(ii) Unlike leaf-symplectic embeddings, there always exists a leaf-to-leaf symplectic embedding of a Poisson manifold $(P,\Pi)$, namely the identity map.
(iii) Non-trivial examples of leaf-to-leaf symplectic embeddings are given by second-class submanifolds of a Poisson manifold as defined in part (iii), (b) of Definition 2 (see also the discussion following this definition). Other examples are cosymplectic submanifolds [13] and Dirac submanifolds [44] of a Poisson manifold; this follows from Corollary 2.11 and Theorem 2.3, (vi) of [44].
The situation in Definition 12 corresponds to a (generalized) Dirac bracket constructed for a family of second-class submanifolds $C$ in a (nonsymplectic) Poisson manifold $(P,\Pi)$ in the following way: we use the notation of Remark (iii) at the end of Sec. 4.1. However, for a degenerate Poisson structure $\Pi$ there is no symplectically orthogonal complement of $T_xC$. Instead, we use the fact that $C$ is, as a consequence of Definition 2, contained in a leaf $L$ of $P$ with symplectic structure $\Omega_L$ which defines the complement of $T_xC$ in $T_xL = T_xC \oplus T_xC^\perp$. Similarly, we have $T_x^*L = \mathrm{Ann}_L(T_xC^\perp) \oplus \mathrm{Ann}_L(T_xC)$. The projection $\bar\pi_1$ is now defined in two
$^g$ We are grateful to A. Weinstein for making us aware of this relation.
steps: we first project $T_x^*P$ to $T_x^*P / \mathrm{Ann}_P(T_xL) \cong T_x^*L$, followed by a projection to the first factor in the decomposition of $T_x^*L$. Since $\mathrm{Ann}_P(T_xL) = \ker \Pi_x^\sharp$, the Poisson bivector factors through the first projection and $\Pi_D = \Pi \circ (\bar\pi_1 \otimes \bar\pi_1)$ defines a bivector which generalizes the Dirac bracket.
Acknowledgments
We thank A. Alekseev, M. Bordemann, P. Bressler, S. Lyakhovich, D. Sternheimer and in particular A. Weinstein for interesting discussions and suggestions, and L. Dittmann for help with drawing the diagrams. M. B. is grateful for support from NSF grant PHY00-90091 and the Eberly research funds of Penn State, and to A. Wipf and the TPI in Jena for hospitality during an essential part of the completion of this work. T. S. thanks the Erwin Schrödinger Institute in Vienna for hospitality during an inspiring workshop on Poisson geometry.
References
[1] P. A. M. Dirac, Lectures on Quantum Mechanics, Yeshiva University, Academic Press, New York, 1967.
[2] M. Kontsevich, Deformation quantization of Poisson manifolds, I, q-alg/9709040.
[3] A. Cannas da Silva and A. Weinstein, Geometric models for noncommutative algebras, Providence, RI, Amer. Math. Soc. (AMS), 1999.
[4] I. Vaisman, On the geometric quantization of Poisson manifolds, J. Math. Phys. 32 (1991), 3339–3345.
[5] V. Schomerus, D-branes and deformation quantization, JHEP 9906 (1999), 030.
[6] B. Jurco, P. Schupp and J. Wess, Noncommutative gauge theory for Poisson manifolds, Nucl. Phys. B584 (2000), 784–794.
[7] M. Flato, A. Lichnerowicz and D. Sternheimer, Deformations of Poisson brackets, Dirac brackets and applications, J. Math. Phys. 17 (1976), 1754–1762.
[8] A. Y. Alekseev, V. Schomerus and T. Strobl, Closed constraint algebras and path integrals for loop group actions, J. Math. Phys. 42 (2001), 2144–2155.
[9] I. A. Batalin and E. S. Fradkin, Operatorial quantization of dynamical systems subject to second class constraints, Nucl. Phys. B279 (1987), 514–528.
[10] I. A. Batalin and I. V. Tyutin, Existence theorem for the effective gauge algebra in the generalized canonical formalism with Abelian conversion of second class constraints, Int. J. Mod. Phys. A6 (1991), 3255.
[11] I. A. Batalin, M. A. Grigoriev and S. L. Lyakhovich, Star product for second class constraint systems from a BRST Theory, Theor. Math. Phys. 128 (2001), 1109.
[12] M. Grigoriev and S. Lyakhovich, Fedosov deformation quantization as a BRST theory, Comm. Math. Phys. 218 (2001), 437–457.
[13] A. Weinstein, The local structure of Poisson manifolds, J. Differential Geom. 18 (1983), 523–557.
[14] I. Vaisman, Lectures on the Geometry of Poisson Manifolds, Birkhäuser, Basel, 1994.
[15] R. A. Bertlmann, Anomalies in Quantum Field Theory, Clarendon Press, Oxford, 1996.
[16] G. Barnich, F. Brandt and M. Henneaux, Local BRST cohomology in gauge theories, Phys. Rep. 338 (2000), 439–569.
[17] M. Karasev, Analogues of objects of the theory of Lie groups for nonlinear Poisson brackets, Math. USSR Izvestiya 28 (1987), 497–527.
[18] A. Weinstein, Symplectic groupoids and Poisson manifolds, Bull. Amer. Math. Soc. 16 (1987), 101–104.
[19] M. J. Gotay, On coisotropic embeddings of presymplectic manifolds, Proc. Am. Math. Soc. 84 (1982), 111–114.
[20] V. Guillemin and S. Sternberg, Symplectic Techniques in Physics, Cambridge University Press, Cambridge, 1984.
[21] L. D. Faddeev and S. L. Shatashvili, Realization of the Schwinger term in the Gauss law and the possibility of correct quantization of a theory with anomalies, Phys. Lett. B167 (1986), 225–228.
[22] M. Bojowald and T. Strobl, Classical Solutions for Poisson Sigma Models on a Riemann Surface, JHEP 0307, 002 (2003).
[23] T. Strobl, Gravity in Two Spacetime Dimensions, Habilitation thesis, RWTH Aachen, June 1999, hep-th/0011240.
[24] P. Schaller and T. Strobl, Poisson structure induced (topological) field theories, Mod. Phys. Lett. A9 (1994), 3129–3136.
[25] N. Ikeda, Two-dimensional gravity and nonlinear gauge theory, Ann. Phys. 235 (1994), 435–464.
[26] M. Bojowald and A. Perez, Spin Foam Quantization and Anomalies, gr-qc/0303026.
[27] I. Vaisman, Symplectic Geometry and Secondary Characteristic Classes, Birkhäuser, Basel, 1987.
[28] T. J. Courant, Dirac manifolds, Trans. Amer. Math. Soc. 319 (1990), 631–661.
[29] Z. J. Liu, A. Weinstein and P. Xu, Manin triples for Lie bialgebroids, J. Differential Geom. 45 (1997), 547–574.
[30] R. U. Sexl and H. K. Urbantke, Relativity, Groups, Particles: Special Relativity and Relativistic Symmetry in Field and Particle Physics, Springer-Verlag, New York, 2001.
[31] A. Weinstein, Coisotropic calculus and Poisson groupoids, J. Math. Soc. Japan 40 (1988), 705–727.
[32] P. Libermann and Ch.-M. Marle, Symplectic Geometry and Analytical Mechanics, D. Reidel Publ. Comp., Dordrecht-Boston, 1987.
[33] R. Bott and L. M. Tu, Differential Forms in Algebraic Topology, Graduate Texts in Mathematics 82, Springer-Verlag, New York, 1991.
[34] V. N. Gribov, Quantization of nonabelian gauge theories, Nucl. Phys. B139, 1 (1978).
[35] M. Henneaux and C. Teitelboim, Quantization of Gauge Systems, Princeton University Press, Princeton, 1992.
[36] S. Lyakhovich and R. Marnelius, Extended observables in theories with constraints, Int. J. Mod. Phys. A 16 (2001), 4271–4296.
[37] C. Klimčík and T. Strobl, WZW-Poisson manifolds, J. Geom. Phys. 43 (2002), 341–344.
[38] J.-S. Park, Topological Open P-Branes, hep-th/0012141.
[39] P. Severa and A. Weinstein, Poisson geometry with a 3-form background, Prog. Theor. Phys. Suppl. 144 (2001), 145–154.
[40] J. E. Marsden and T. S. Ratiu, Introduction to Mechanics and Symmetry, Springer, New York, 1999.
[41] M. V. Karasev and V. P. Maslov, Nonlinear Poisson Brackets: Geometry and Quantization, Translations of Mathematical Monographs 119, AMS, Providence, 1993.
[42] A. S. Cattaneo and G. Felder, Poisson Sigma Models and Symplectic Groupoids, math.SG/0003023.
[43] D. Dazord and T. Delzant, Le problème général des variables actions-angles, J. Differential Geom. 26 (1987), 223–251.
[44] P. Xu, Dirac Submanifolds and Poisson Involutions, math.SG/0110326.
November 5, 2003 9:42 WSPC/148-RMP
00173
Reviews in Mathematical Physics Vol. 15, No. 7 (2003) 705–743 © World Scientific Publishing Company
THE POISSON BRACKET FOR POISSON FORMS IN MULTISYMPLECTIC FIELD THEORY
MICHAEL FORGER∗
Departamento de Matemática Aplicada, Instituto de Matemática e Estatística, Universidade de São Paulo, Caixa Postal 66281, BR–05311-970 São Paulo, S.P., Brazil
[email protected]

CORNELIUS PAUFLER† and HARTMANN RÖMER‡
Fakultät für Physik, Albert-Ludwigs-Universität Freiburg im Breisgau, Hermann-Herder-Straße 3, D–79104 Freiburg i.Br., Germany
†[email protected]
‡[email protected]

Received 4 November 2002
Revised 17 April 2003

We present a general definition of the Poisson bracket between differential forms on the extended multiphase space appearing in the geometric formulation of first order classical field theories and, more generally, on exact multisymplectic manifolds. It is well defined for a certain class of differential forms that we propose to call Poisson forms and turns the space of Poisson forms into a Lie superalgebra.

Keywords: Geometric field theory; multisymplectic geometry; Poisson brackets.
∗ Partially supported by CNPq, Brazil
‡ Partially supported by FAPESP, Brazil

1. Introduction

The multiphase space approach to classical field theory, whose origins can be traced back to the early work of Hermann Weyl on the calculus of variations, has recently undergone a rapid development, but a number of conceptual questions are still open. The basic idea behind all attempts to extend the covariant formulation of classical field theory from the Lagrangian to the Hamiltonian domain is to treat spatial derivatives on the same footing as time derivatives. This requires associating to each field component $\varphi^i$ not just its standard canonically conjugate momentum $\pi_i$ but rather $n$ conjugate momenta $\pi_i^\mu$, where $n$ is the dimension of space-time. If one starts out from a Lagrangian $L$ depending on the field and its first partial derivatives, these are obtained by the covariant Legendre transformation

$$ \pi_i^\mu = \frac{\partial L}{\partial\,\partial_\mu \varphi^i} . $$

This allows one to rewrite the standard Euler–Lagrange equations of field theory,

$$ \partial_\mu \frac{\partial L}{\partial\,\partial_\mu \varphi^i} - \frac{\partial L}{\partial \varphi^i} = 0 , $$

as a covariant first order system, the covariant Hamiltonian equations or De Donder–Weyl equations

$$ \frac{\partial H}{\partial \pi_i^\mu} = \partial_\mu \varphi^i , \qquad \frac{\partial H}{\partial \varphi^i} = - \partial_\mu \pi_i^\mu , $$

where $H = \pi_i^\mu \, \partial_\mu \varphi^i - L$ is the covariant Hamiltonian density or De Donder–Weyl Hamiltonian.

Multiphase space (ordinary as well as extended) is the geometric environment built by appropriately patching together local coordinate systems of the form $(q^i, p_i^\mu)$, instead of the canonically conjugate variables $(q^i, p_i)$ of mechanics, together with space-time coordinates $x^\mu$ and, in the extended version, a further energy type variable that we shall denote by $p$ (without any index). In the recent literature on the subject, special attention has been devoted to the so-called multisymplectic form $\omega$ which is, except for a sign, the exterior derivative of another form $\theta$ that we propose to call the multicanonical form: both are naturally defined on extended multiphase space and are the geometric objects replacing, respectively, the symplectic form $\omega = dq^i \wedge dp_i$ and the canonical form $\theta = p_i \, dq^i$ of Hamiltonian mechanics (on cotangent bundles), or more precisely, the symplectic form $\omega = dq^i \wedge dp_i + dt \wedge dE$ and the canonical form $\theta = p_i \, dq^i + E \, dt$ of Hamiltonian mechanics (on cotangent bundles) for non-autonomous systems. Additional motivation and precise definitions will be given in the next section, and a table comparing the most relevant concepts of the field theoretical formalism with their counterparts in Hamiltonian mechanics can be found at the end of the paper.

The advantage of such an approach as compared to the orthodox strategy of treating field theoretical models as infinite-dimensional dynamical systems is threefold. First, general covariance (and in particular, Lorentz covariance) is trivially achieved. Second, by working on multiphase space which is a finite-dimensional manifold, one automatically avoids all the functional analytic complications that plague the orthodox method.
Third, space-time locality is also automatically guaranteed, since one works with the field variables and their first derivatives or conjugates of these at single points of space-time, rather than with fields defined over entire hypersurfaces: integration is deferred to the very last step of every procedure. Of course, there is also a price to be paid for all these benefits, namely that the obvious duality of classical mechanics between coordinates and momenta is lost.
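To make the covariant Legendre transformation and the De Donder–Weyl equations quoted above concrete, here is a minimal worked sketch. It assumes a single real scalar field $\varphi$ on flat space-time with constant metric $\eta_{\mu\nu}$ and a potential $V$; this example, and the symbols $\eta$ and $V$, are illustrative assumptions and are not taken from the text.

```latex
\begin{align*}
L &= \tfrac{1}{2} \, \eta^{\mu\nu} \, \partial_\mu \varphi \, \partial_\nu \varphi - V(\varphi) , \\
\pi^\mu &= \frac{\partial L}{\partial \, \partial_\mu \varphi}
         = \eta^{\mu\nu} \, \partial_\nu \varphi , \\
H &= \pi^\mu \, \partial_\mu \varphi - L
   = \tfrac{1}{2} \, \eta_{\mu\nu} \, \pi^\mu \pi^\nu + V(\varphi) , \\
\frac{\partial H}{\partial \pi^\mu} &= \eta_{\mu\nu} \, \pi^\nu = \partial_\mu \varphi , \qquad
\frac{\partial H}{\partial \varphi} = V'(\varphi) = - \partial_\mu \pi^\mu , \\
&\Longrightarrow \quad \eta^{\mu\nu} \, \partial_\mu \partial_\nu \varphi + V'(\varphi) = 0 .
\end{align*}
```

The last line is the Euler–Lagrange equation of the same Lagrangian, as expected. In this example the single field $\varphi$ has $n$ conjugate momenta $\pi^\mu$, which illustrates the loss of the one-to-one duality between coordinates and momenta just mentioned.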
As a result, there is no evident multiphase space quantization procedure. What seems to be needed is a new and more sophisticated concept of “multi-duality” to replace the standard duality underlying the canonical commutation relations. Certainly, an important step towards a better understanding of what might be the nature of this “multi-duality” and that of a multiphase space quantization procedure is the construction of Poisson brackets within this formalism. After all, the Poisson bracket should be the classical limit of the commutator of quantum theory. Surprisingly, this is to a large extent still an open problem. Our approach to the question has been motivated by the work of Kanatchikov [1, 2], who seems to have been the first to propose a Poisson bracket between differential forms of arbitrary degree in multimomentum variables and to analyze the restrictions that must be imposed on these forms in order to make this bracket well-defined: he uses the term “Hamiltonian form” in this context, although the concept as such is of course much older. It must be pointed out, however, that Kanatchikov’s approach is essentially local and makes extensive use of features that have no invariant geometric meaning, such as a systematic splitting into horizontal and vertical parts; moreover, his definition of Hamiltonian forms is too restrictive. We avoid all these problems by working exclusively within the multisymplectic framework and on the extended multiphase space, instead of the ordinary one: this leads naturally to a definition of the concept of a Poisson form which is more general than Kanatchikov’s notion of a Hamiltonian form, as well as to a coordinate-independent definition of the Poisson bracket between any two such forms. 
In fact, most of the concepts involved do not even depend on the explicit construction of extended multiphase space but only on its structure as an exact multisymplectic manifold, and we shall make use of this fact in order to simplify the treatment whenever possible.

The paper is organized as follows. In Sec. 2, we give a brief review of some salient features of the multiphase space approach to the geometric formulation of first order classical field theories, following Ref. [3] and, in particular, Ref. [4], to which the reader is referred for more details and for the discussion of many relevant examples; this material is included here mainly in order to fix notation and make our presentation reasonably self-contained. The main point is to show that the extended multiphase space of field theory does carry the structure of an exact multisymplectic manifold (in fact it seems to be the only known example of a multisymplectic manifold). In Sec. 3, we introduce the concept of a Poisson form on a general multisymplectic manifold, specify the notion of an exact multisymplectic manifold, define the Poisson bracket between Poisson forms on exact multisymplectic manifolds and prove our main theorem, which states that this bracket satisfies the usual axioms of a Lie superalgebra. The construction generalizes the corresponding one for Hamiltonian $(n-1)$-forms on the extended multiphase space of field theory given by two of the present authors in a previous paper [5]: the idea is to modify the standard formula that had been adopted for decades [6–11], even though it fails to satisfy the Jacobi identity, by adding a judiciously chosen exact form that turns out to cure the defect. Here, we show that the same trick works
for forms of arbitrary degree, provided one introduces appropriate sign factors. In both cases, it is the structure of the correction term that requires the underlying manifold to be exact multisymplectic and not just multisymplectic. In Sec. 4, we define the notion of an exact Hamiltonian multivector field on an exact multisymplectic manifold and show that by contraction with the multicanonical form $\theta$, any such multivector field gives rise to a Poisson form; moreover, this simple prescription yields an antihomomorphism of Lie superalgebras (with respect to the Schouten bracket of multivector fields and the Poisson bracket of Poisson forms introduced here). It can be viewed as an extension, from vector fields to multivector fields, of the universal part of the covariant momentum map [4], which is the geometric version of the construction of Noether currents and the energy-momentum tensor in field theory, and we shall therefore refer to it as the universal multimomentum map. In Sec. 5, we return to the case of extended multiphase space and discuss other examples for the construction of Poisson forms. More specifically, we show that arbitrary functions are Poisson forms (of degree 0) and find that Kanatchikov’s Hamiltonian forms, when pulled back from ordinary to extended multiphase space by means of the appropriate projection, constitute a special class of Poisson forms. The complete determination of the space of Poisson forms of arbitrary degree $> 0$ on extended multiphase space, together with that of exact Hamiltonian and locally Hamiltonian multivector fields of arbitrary degree $< n$, is a technically demanding problem whose solution will be presented elsewhere [12].
The paper concludes with two appendices: the first presents a number of important formulas from the multivector calculus on manifolds, related to the definition and main properties of the Schouten bracket and the Lie derivative of differential forms along multivector fields, while the second shows how, given a connection in a fiber bundle, one can construct induced connections in various other fiber bundles derived from it, including the multiphase spaces of geometric field theory; this possibility is important for the comparison of the multisymplectic formalism with other approaches that have been proposed in the literature and to a certain extent depend on the a priori choice of a connection.

Recently, the problem of constructing Poisson brackets has also been addressed in the context of other formalisms such as the one based on n-symplectic manifolds [13] (see [14] for a recent overview) or that of Lepage–Dedecker which is more general than that of De Donder–Weyl [15].

Finally, we would like to point out that there exists another construction of a covariant Poisson bracket in classical field theory, based on the same functional approach that underlies the construction of “covariant phase space” of Crnković–Witten [16, 17] and Zuckerman [18]. This bracket, originally due to Peierls [19] and further elaborated by DeWitt [20, 21] (see also [22] for a recent exposition), has been adapted to the multiphase space approach by Romero [23] and shown to be precisely the Poisson bracket associated with the symplectic form on covariant phase space introduced in Refs. [16, 17] and [18]; these results will be presented elsewhere [24]. It would be interesting to identify the relation between that bracket and the one introduced here; this question is presently under investigation.
2. Multiphase Spaces in Geometric Field Theory

The starting point for the geometric formulation of classical field theory is the choice of a configuration bundle, which in general will be a fiber bundle over space-time whose sections are the fields of the theory under consideration. In what follows, we shall denote its total space by $E$, its base space by $M$, its typical fiber by $Q$ and the projection from $E$ to $M$ by $\pi$; the dimensions are

$$ \dim M = n , \qquad \dim Q = N , \qquad \dim E = n + N . \tag{2.1} $$
In field theoretical models, $M$ is interpreted as space-time whereas $Q$ is the configuration space of the theory, a manifold whose (local) coordinates describe internal degrees of freedom.$^{a}$ The total space $E$ is locally but not necessarily globally isomorphic to the Cartesian product $M \times Q$, but it must be stressed that even when the configuration bundle is globally trivial, there will in general not exist any preferred trivialization, and it is precisely the freedom to change trivialization that allows one to incorporate gauge theories into the picture. Another point that deserves to be emphasized is that the configuration bundle does not in general carry any additional structures: these only appear when one focusses on special classes of field theories.

• Vector bundles arise naturally in theories with linear matter fields and also in general relativity: the metric tensor is an example.
• Affine bundles can be employed to incorporate gauge fields, since connections in a principal $G$-bundle $P$ over space-time $M$ can be viewed as sections of the connection bundle of $P$, an affine bundle $CP$ over $M$ constructed from $P$.
• General fiber bundles are used to handle nonlinear matter fields, in particular those corresponding to maps from space-time $M$ to some target manifold $Q$: standard examples are the nonlinear sigma models.

In order to cover this variety of situations, the general constructions on which the geometric formulation of classical field theory is based must not depend on the choice of any additional structure on the configuration bundle. This requirement is naturally satisfied in the multiphase space formalism, in contrast to the majority of similar approaches that have over the last few decades found their way into the literature: most of these depend on the a priori choice of a connection in the configuration bundle, thus excluding gauge theories in which connections must be treated as dynamical variables and not as fixed background fields.

$^{a}$ This interpretation is turned around in the theory of strings and membranes.

The multiphase space approach to first order classical field theory follows the same general pattern as the standard formalism of classical mechanics on the tangent and cotangent bundle of a configuration space $Q$ [25, 26].$^{b}$ However, the correspondence between the objects and concepts underlying the geometric formulation of mechanics and that of field theory becomes fully apparent only when one reformulates mechanics so as to incorporate the time dimension. (This is standard practice, e.g., in the study of non-autonomous systems, that is, mechanical systems whose Lagrangian/Hamiltonian depends explicitly on time, such as systems of particles in time-dependent external fields. Additional motivation is provided by relativistic mechanics where Newton’s concept of absolute time is abandoned and hence there is no place for an extraneous, absolute time variable that can be kept entirely separate from the arena where the dynamical phenomena take place.) In its simplest version, this reformulation amounts to replacing the configuration space $Q$ by the extended configuration space $\mathbb{R} \times Q$ and the velocity phase space $TQ$ (the tangent bundle of $Q$) by the extended velocity phase space $\mathbb{R} \times TQ$, where $\mathbb{R}$ stands for the time axis. The usual momentum phase space $T^*Q$ (the cotangent bundle of $Q$) admits two different extensions: the simply extended phase space $\mathbb{R} \times T^*Q$, where $\mathbb{R}$ represents the time variable, and the doubly extended phase space $\mathbb{R} \times T^*Q \times \mathbb{R}$, where the first copy of $\mathbb{R}$ represents the time variable whereas the second copy of $\mathbb{R}$ represents an energy variable. This second extension is required if one wants to maintain a symplectic structure, rather than just a contact structure, for extended phase space, since energy is the physical quantity canonically conjugate to time. A further generalization appears when one considers mechanical systems in external gauge fields, since time-dependent gauge transformations do not respect the direct product structure of the extended configuration and phase spaces mentioned above. What does remain invariant under such transformations are certain projections, namely the projection from the extended configuration space onto the time axis, the projections from the various extended phase spaces onto extended configuration space and, finally, the projection from the doubly extended to the simply extended phase space which amounts to “forgetting the additional energy variable”.

$^{b}$ The term “first order” refers to the fact that the Lagrangian is supposed to be a pointwise defined function of the coordinates or fields and of their derivatives or partial derivatives of no more than first order; higher order derivatives should be eliminated, e.g., by introducing appropriate auxiliary variables.

In passing to field theory, we must replace the time axis $\mathbb{R}$ by the space-time manifold $M$, the extended configuration space $\mathbb{R} \times Q$ by the configuration bundle $E$ over $M$ introduced above and the extended velocity phase space $\mathbb{R} \times TQ$ by the jet bundle $JE$ of $E$.$^{c}$ It is well known that $JE$ is, unlike the tangent bundle of a manifold, in general only an affine bundle over $E$ (of fiber dimension $Nn$) and not a vector bundle; the corresponding difference vector bundle over $E$ (also of fiber dimension $Nn$) will be called the linearized jet bundle of $E$ and be denoted by $\vec{J}E$. This leads to the possibility of forming two kinds of dual: the linear dual of $\vec{J}E$, denoted here by $\vec{J}^{*}E$, and the affine dual of $JE$, denoted here by $J^{\star}E$; both of them are vector bundles over $E$ (of fiber dimension $Nn$ and $Nn+1$, respectively). Even more important are their twisted versions, obtained by taking the tensor product with the line bundle of volume forms on $M$, pulled back to $E$ via $\pi$: this gives rise to the twisted linear dual of $\vec{J}E$, called ordinary multiphase space and denoted here by $\vec{J}^{\circledast}E$, and the twisted affine dual of $JE$, called extended multiphase space and denoted here by $J^{\circledast}E$; both of them, once again, are vector bundles over $E$ (of fiber dimension $Nn$ and $Nn+1$, respectively). The former replaces the simply extended phase space $\mathbb{R} \times T^*Q$ of mechanics whereas the latter replaces the doubly extended phase space $\mathbb{R} \times T^*Q \times \mathbb{R}$ of mechanics. Moreover, in both cases (twisted or untwisted), there is a natural projection $\eta$ that, as in mechanics, can be interpreted as “forgetting the additional energy variable”: it turns $J^{\circledast}E$ into an affine line bundle over $\vec{J}^{\circledast}E$ and, similarly, $J^{\star}E$ into an affine line bundle over $\vec{J}^{*}E$. The most remarkable property of extended multiphase space is that it is an exact multisymplectic manifold: it carries a naturally defined multicanonical form $\theta$, of degree $n$, whose exterior derivative is, up to a sign, the multisymplectic form $\omega$, of degree $n+1$, replacing the canonical form $\theta$ and the symplectic form $\omega$, respectively, on the doubly extended phase space $\mathbb{R} \times T^*Q \times \mathbb{R}$ of mechanics.

$^{c}$ We consider only first order jet bundles and therefore omit the index “1” used by many authors.

The global construction of the first order jet bundle $JE$ and the linearized first order jet bundle $\vec{J}E$ associated with a given fiber bundle $E$ over a manifold $M$, as well as that of the various duals mentioned above, is quite easy to understand. (Higher order jet bundles are somewhat harder to deal with, but we won’t need them in this paper.) Given a point $e$ in $E$ with base point $x = \pi(e)$ in $M$, the fiber $J_eE$ of $JE$ at $e$ consists of all linear maps from the tangent space $T_xM$ of the base space $M$ at $x$ to the tangent space $T_eE$ of the total space $E$ at $e$ whose composition with the tangent map $T_e\pi : T_eE \to T_xM$ to the projection $\pi : E \to M$ gives the identity on $T_xM$:

$$ J_eE = \{ \, u_e \in L(T_xM, T_eE) \mid T_e\pi \circ u_e = \mathrm{id}_{T_xM} \, \} . \tag{2.2} $$
Thus the elements of $J_eE$ are precisely the candidates for the tangent maps at $x$ to (local) sections $\varphi$ of the bundle $E$ satisfying $\varphi(x) = e$. Obviously, $J_eE$ is an affine subspace of the vector space $L(T_xM, T_eE)$ of all linear maps from $T_xM$ to the tangent space $T_eE$, the corresponding difference vector space being the vector space of all linear maps from $T_xM$ to the vertical subspace $V_eE$:

$$ \vec{J}_eE = L(T_xM, V_eE) . \tag{2.3} $$
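The definitions (2.2) and (2.3) can be made concrete in coordinates. The following is a sketch, assuming local coordinates $(x^\mu, q^i)$ on $E$ induced by a local trivialization, so that $T_e\pi$ maps $\partial/\partial x^\mu$ to $\partial/\partial x^\mu$ and $\partial/\partial q^i$ to zero; the symbols $\tilde{u}_e$ and $\tilde{q}^{\,i}_\mu$ for a second jet are introduced here only for illustration.

```latex
\begin{align*}
% the condition T_e\pi \circ u_e = id fixes the horizontal part of u_e
u_e \Bigl( \frac{\partial}{\partial x^\mu} \Bigr)
  &= \frac{\partial}{\partial x^\mu} + q^i_\mu \, \frac{\partial}{\partial q^i} , \\
% the difference of two jets at the same point e takes values in V_e E
(\tilde{u}_e - u_e) \Bigl( \frac{\partial}{\partial x^\mu} \Bigr)
  &= (\tilde{q}^{\,i}_\mu - q^i_\mu) \, \frac{\partial}{\partial q^i} \; \in \; V_eE , \\
% jets of sections
u_e = T_x\varphi \quad &\Longleftrightarrow \quad q^i_\mu = \partial_\mu \varphi^i(x) .
\end{align*}
```

The $Nn$ numbers $q^i_\mu$ thus parametrize the fiber $J_eE$, in agreement with the fiber dimension $Nn$ of $JE$ over $E$, and the second line exhibits $L(T_xM, V_eE)$ as the difference vector space.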
The jet bundle $JE$ thus defined admits two different projections, namely the target projection $\tau_{JE} : JE \to E$ and the source projection $\sigma_{JE} : JE \to M$ which is simply its composition with the original projection $\pi$, that is, $\sigma_{JE} = \pi \circ \tau_{JE}$. It is easily shown that $JE$ is a fiber bundle over $M$ with respect to $\sigma_{JE}$, in general without any additional structure, but it is an affine bundle over $E$ with respect to $\tau_{JE}$, the corresponding difference vector bundle being the vector bundle over $E$ of linear maps from the pull-back of the tangent bundle of the base space by the projection $\pi$ to the vertical bundle of $E$:

$$ \vec{J}E = L(\pi^*TM, VE) . \tag{2.4} $$
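The affine structure of the jet bundle $JE$ over $E$, as well as the linear structure of the linearized jet bundle $\vec{J}E$ over $E$, can also be read off directly from local coordinate expressions. Namely, choosing local coordinates $x^\mu$ for $M$, local coordinates $q^i$ for $Q$ and a local trivialization of $E$ induces naturally a local coordinate system $(x^\mu, q^i, q^i_\mu)$ for $JE$, as well as a local coordinate system $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ for $\vec{J}E$: such coordinates will simply be referred to as adapted local coordinates. Moreover, a transformation to new local coordinates $x'^{\kappa}$ for $M$, new local coordinates $q'^{k}$ for $Q$ and a new local trivialization of $E$, according to

$$ x'^{\kappa} = x'^{\kappa}(x^\mu) , \qquad q'^{k} = q'^{k}(x^\mu, q^i) \tag{2.5} $$

induces naturally a transformation to new adapted local coordinates $(x'^{\kappa}, q'^{k}, q'^{k}_{\kappa})$ for $JE$ and $(x'^{\kappa}, q'^{k}, \vec{q}^{\,\prime k}_{\kappa})$ for $\vec{J}E$ given by Eq. (2.5) and

$$ q'^{k}_{\kappa} = q'^{k}_{\kappa}(x^\mu, q^i, q^i_\mu) , \qquad \vec{q}^{\,\prime k}_{\kappa} = \vec{q}^{\,\prime k}_{\kappa}(x^\mu, q^i, \vec{q}^{\,i}_\mu) , \tag{2.6} $$

where

$$ q'^{k}_{\kappa} = \frac{\partial x^\mu}{\partial x'^{\kappa}} \frac{\partial q'^{k}}{\partial q^i} \, q^i_\mu + \frac{\partial x^\mu}{\partial x'^{\kappa}} \frac{\partial q'^{k}}{\partial x^\mu} , \qquad \vec{q}^{\,\prime k}_{\kappa} = \frac{\partial x^\mu}{\partial x'^{\kappa}} \frac{\partial q'^{k}}{\partial q^i} \, \vec{q}^{\,i}_\mu . \tag{2.7} $$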
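The first formula in Eq. (2.7) can be traced back to the chain rule. The following is a sketch, assuming that the jet in question is the tangent map of a local section $\varphi$, so that $q^i_\mu = \partial_\mu \varphi^i(x)$; the notation $\varphi'^{k}$ for the components of the same section in the new coordinates is introduced here only for illustration.

```latex
\begin{align*}
\varphi'^{k}(x') &= q'^{k}\bigl( x, \varphi(x) \bigr) \quad \text{with } x = x(x') , \\
q'^{k}_{\kappa}
  = \frac{\partial \varphi'^{k}}{\partial x'^{\kappa}}
 &= \frac{\partial x^\mu}{\partial x'^{\kappa}}
    \Bigl( \frac{\partial q'^{k}}{\partial x^\mu}
         + \frac{\partial q'^{k}}{\partial q^i} \, \partial_\mu \varphi^i \Bigr)
  = \frac{\partial x^\mu}{\partial x'^{\kappa}} \frac{\partial q'^{k}}{\partial q^i} \, q^i_\mu
  + \frac{\partial x^\mu}{\partial x'^{\kappa}} \frac{\partial q'^{k}}{\partial x^\mu} .
\end{align*}
```

The inhomogeneous term, which does not involve $q^i_\mu$, is what makes $JE$ an affine rather than a linear bundle over $E$. It cancels in the difference of two jets at the same point, which gives the homogeneous law for $\vec{q}^{\,i}_\mu$ in the second formula of Eq. (2.7).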
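Before going on, we pause to fix some notation concerning differential forms, for which we shall in terms of local coordinates $x^\mu$ use the following conventions:

$$ d^n x = dx^1 \wedge \cdots \wedge dx^n , \tag{2.8} $$

$$ d^n x_\mu = i_{\partial_\mu} d^n x = (-1)^{\mu-1} \, dx^1 \wedge \cdots \wedge dx^{\mu-1} \wedge dx^{\mu+1} \wedge \cdots \wedge dx^n , \tag{2.9} $$

$$ d^n x_{\mu\nu} = i_{\partial_\nu} i_{\partial_\mu} d^n x , \quad \ldots , \quad d^n x_{\mu_1 \ldots \mu_r} = i_{\partial_{\mu_r}} \cdots i_{\partial_{\mu_1}} d^n x . \tag{2.10} $$

Then

$$ i_{\partial_\mu} d^n x_{\mu_1 \ldots \mu_r} = d^n x_{\mu_1 \ldots \mu_r \mu} , \tag{2.11} $$

whereas

$$ dx^\kappa \wedge d^n x_\mu = \delta^\kappa_\mu \, d^n x , \tag{2.12} $$

$$ dx^\kappa \wedge d^n x_{\mu\nu} = \delta^\kappa_\nu \, d^n x_\mu - \delta^\kappa_\mu \, d^n x_\nu , \tag{2.13} $$

$$ dx^\kappa \wedge d^n x_{\mu_1 \ldots \mu_r} = \sum_{p=1}^{r} (-1)^{r-p} \, \delta^\kappa_{\mu_p} \, d^n x_{\mu_1 \ldots \mu_{p-1} \mu_{p+1} \ldots \mu_r} . \tag{2.14} $$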
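The sign conventions in Eqs. (2.9)–(2.14) are easy to get wrong, so a brute-force check can be useful. The following is a minimal sketch in Python (standard library only) that represents constant-coefficient forms as dictionaries and verifies Eq. (2.14), and hence its special cases (2.12) and (2.13), for small $n$ and $r$. The function names `contract`, `wedge_left`, `dnx`, `rhs_2_14` and `check` are hypothetical helpers introduced here, not part of the paper.

```python
from itertools import product

# A form with constant coefficients is stored as {sorted index tuple: coefficient};
# the key (1, 3) stands for dx^1 ^ dx^3, the empty key () for the constant 1.

def contract(mu, form):
    """Interior product i_{d/dx^mu} (insertion into the first slot)."""
    out = {}
    for idx, c in form.items():
        if mu in idx:
            j = idx.index(mu)            # dx^mu sits at position j (0-based)
            rest = idx[:j] + idx[j + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** j * c
    return {k: v for k, v in out.items() if v != 0}

def wedge_left(kappa, form):
    """Exterior product dx^kappa ^ form."""
    out = {}
    for idx, c in form.items():
        if kappa not in idx:
            j = sum(1 for i in idx if i < kappa)   # transpositions needed to sort
            new = idx[:j] + (kappa,) + idx[j:]
            out[new] = out.get(new, 0) + (-1) ** j * c
    return {k: v for k, v in out.items() if v != 0}

def dnx(n, mus=()):
    """d^n x_{mu_1...mu_r} = i_{mu_r} ... i_{mu_1} d^n x, as in Eq. (2.10)."""
    form = {tuple(range(1, n + 1)): 1}
    for mu in mus:
        form = contract(mu, form)
    return form

def rhs_2_14(n, kappa, mus):
    """Right-hand side of Eq. (2.14)."""
    r = len(mus)
    out = {}
    for p in range(1, r + 1):
        if mus[p - 1] == kappa:                    # Kronecker delta
            sign = (-1) ** (r - p)
            for idx, c in dnx(n, mus[:p - 1] + mus[p:]).items():
                out[idx] = out.get(idx, 0) + sign * c
    return {k: v for k, v in out.items() if v != 0}

def check(n, r):
    """True if Eq. (2.14) holds for all kappa and all index strings of length r."""
    return all(
        wedge_left(kappa, dnx(n, mus)) == rhs_2_14(n, kappa, mus)
        for kappa in range(1, n + 1)
        for mus in product(range(1, n + 1), repeat=r)
    )

print(all(check(4, r) for r in (1, 2, 3)))  # prints True
```

The case `r = 1` reproduces Eq. (2.12) and `r = 2` reproduces Eq. (2.13); index strings with repeated entries are included and give zero on both sides.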
Moreover, these (local) forms on $M$ are lifted to (local) forms on $E$ by pull-back with the projection $\pi_E$, and later (local) forms on $E$ will be lifted to (local) forms on total spaces of bundles over $E$ by pull-back with the respective projection, without change of notation.

The dual $J^{\star}E$ of the jet bundle $JE$ and the dual $\vec{J}^{*}E$ of the linearized jet bundle $\vec{J}E$ are obtained according to the standard rules for defining the dual of an affine space and of a vector space, respectively. In particular, these rules state that if $A$ is an affine space of dimension $k$ over $\mathbb{R}$, its dual $A^{\star}$ is the space $A(A, \mathbb{R})$ of affine maps from $A$ to $\mathbb{R}$, which is a vector space of dimension $k+1$. Thus the dual or, more precisely, affine dual $J^{\star}E$ of the jet bundle $JE$ and the dual or, more precisely, linear dual $\vec{J}^{*}E$ of the linearized jet bundle $\vec{J}E$ are obtained by defining their fiber over any point $e$ in $E$ to be the vector space

$$ J^{\star}_eE = \{ \, z_e : J_eE \to \mathbb{R} \ \text{affine} \, \} , \tag{2.15} $$

and the vector space

$$ \vec{J}^{*}_eE = \{ \, \vec{z}_e : \vec{J}_eE \to \mathbb{R} \ \text{linear} \, \} , \tag{2.16} $$

respectively. However, as mentioned before, the multiphase spaces of field theory are defined with an additional twist, replacing the real line by the one-dimensional space of volume forms on the base manifold $M$ at the appropriate point. Thus the twisted (affine) dual $J^{\circledast}E$ of the jet bundle $JE$ and the twisted (linear) dual $\vec{J}^{\circledast}E$ of the linearized jet bundle $\vec{J}E$ are obtained from the corresponding ordinary (untwisted) duals by taking the tensor product with the line bundle of volume forms on the base manifold $M$, pulled back to the total space $E$ via the projection $\pi$, i.e. we put

$$ J^{\circledast}E = J^{\star}E \otimes \pi^*\bigl( \bigwedge\nolimits^{n} T^*M \bigr) , \tag{2.17} $$

and

$$ \vec{J}^{\circledast}E = \vec{J}^{*}E \otimes \pi^*\bigl( \bigwedge\nolimits^{n} T^*M \bigr) , \tag{2.18} $$

respectively, which means that if $x = \pi(e)$, we set

$$ J^{\circledast}_eE = \{ \, z_e : J_eE \to \bigwedge\nolimits^{n} T^*_xM \ \text{affine} \, \} , \tag{2.19} $$

and

$$ \vec{J}^{\circledast}_eE = \{ \, \vec{z}_e : \vec{J}_eE \to \bigwedge\nolimits^{n} T^*_xM \ \text{linear} \, \} , \tag{2.20} $$
respectively. As is the case for the jet bundle itself, the linearized jet bundle and the various types of dual bundles introduced here all admit two different projections, namely the target projection $\tau_{\ldots}$ onto $E$ and the source projection $\sigma_{\ldots}$ onto $M$ which is simply its composition with the original projection $\pi$, that is, $\sigma_{\ldots} = \pi \circ \tau_{\ldots}$. It is easily shown that all of them are fiber bundles over $M$ with respect to $\sigma_{\ldots}$, in general without any additional structure, but, as stated before, they are vector bundles over $E$ with respect to $\tau_{\ldots}$. The global linear structure of these bundles over $E$ also becomes clear in local coordinates. Namely, choosing local coordinates $x^\mu$ for $M$, local coordinates $q^i$ for $Q$ and a local trivialization of $E$ induces naturally not only local coordinate systems $(x^\mu, q^i, q^i_\mu)$ for $JE$ and $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ for $\vec{J}E$ but also local coordinate systems $(x^\mu, q^i, p^\mu_i, p)$ both for $J^{\star}E$ and for $J^{\circledast}E$, as well as local coordinate systems $(x^\mu, q^i, p^\mu_i)$ both for $\vec{J}^{*}E$ and for $\vec{J}^{\circledast}E$, respectively: all these will again be referred to as adapted local coordinates. They are defined by requiring the dual pairing between a point in $J^{\star}E$ or in $J^{\circledast}E$ with coordinates $(x^\mu, q^i, p^\mu_i, p)$ and a point in $JE$ with coordinates $(x^\mu, q^i, q^i_\mu)$ to be given by

$$ p^\mu_i \, q^i_\mu + p \tag{2.21} $$

in the ordinary (untwisted) case and by

$$ (p^\mu_i \, q^i_\mu + p) \, d^n x \tag{2.22} $$

in the twisted case, whereas the dual pairing between a point in $\vec{J}^{*}E$ or in $\vec{J}^{\circledast}E$ with coordinates $(x^\mu, q^i, p^\mu_i)$ and a point in $\vec{J}E$ with coordinates $(x^\mu, q^i, \vec{q}^{\,i}_\mu)$ should be given by

$$ p^\mu_i \, \vec{q}^{\,i}_\mu \tag{2.23} $$

in the ordinary (untwisted) case and by

$$ p^\mu_i \, \vec{q}^{\,i}_\mu \, d^n x \tag{2.24} $$
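in the twisted case. Moreover, a transformation to new local coordinates $x'^{\kappa}$ for $M$, new local coordinates $q'^{k}$ for $Q$ and a new local trivialization of $E$, according to Eq. (2.5), induces naturally not only a transformation to new adapted local coordinates $(x'^{\kappa}, q'^{k}, q'^{k}_{\kappa})$ for $JE$ and $(x'^{\kappa}, q'^{k}, \vec{q}^{\,\prime k}_{\kappa})$ for $\vec{J}E$, as given by Eqs. (2.6) and (2.7), but also a transformation to new adapted local coordinates $(x'^{\kappa}, q'^{k}, p'^{\kappa}_{k}, p')$ both for $J^{\star}E$ and for $J^{\circledast}E$, as well as a transformation to new adapted local coordinates $(x'^{\kappa}, q'^{k}, p'^{\kappa}_{k})$ both for $\vec{J}^{*}E$ and for $\vec{J}^{\circledast}E$, respectively: they are given by

$$ p'^{\kappa}_{k} = p'^{\kappa}_{k}(x^\mu, q^i, p^\mu_i, p) , \qquad p' = p'(x^\mu, q^i, p^\mu_i, p) , \tag{2.25} $$

where

$$ p'^{\kappa}_{k} = \frac{\partial x'^{\kappa}}{\partial x^\mu} \frac{\partial q^i}{\partial q'^{k}} \, p^\mu_i , \qquad p' = p - \frac{\partial q'^{k}}{\partial x^\mu} \frac{\partial q^i}{\partial q'^{k}} \, p^\mu_i \tag{2.26} $$

in the ordinary (untwisted) case and

$$ p'^{\kappa}_{k} = \det\Bigl( \frac{\partial x}{\partial x'} \Bigr) \frac{\partial x'^{\kappa}}{\partial x^\mu} \frac{\partial q^i}{\partial q'^{k}} \, p^\mu_i , \qquad p' = \det\Bigl( \frac{\partial x}{\partial x'} \Bigr) \Bigl( p - \frac{\partial q'^{k}}{\partial x^\mu} \frac{\partial q^i}{\partial q'^{k}} \, p^\mu_i \Bigr) \tag{2.27} $$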
in the twisted case. Finally, it is worth noting that the affine duals $J^{\star}E$ and $J^{\circledast}E$ of $JE$ contain line subbundles $J^{\star}_0E$ and $J^{\circledast}_0E$ whose fiber over any point $e$ in $E$ consists of the constant (rather than affine) maps from $J_eE$ to $\mathbb{R}$ and to $\bigwedge\nolimits^{n} T^*_xM$, respectively, and the corresponding quotient vector bundles over $E$ can be naturally identified with the respective duals $\vec{J}^{*}E$ and $\vec{J}^{\circledast}E$ of $\vec{J}E$, i.e. we have

$$ J^{\star}E / J^{\star}_0E \cong \vec{J}^{*}E \cong L(VE, \pi^*TM) , \tag{2.28} $$

and

$$ J^{\circledast}E / J^{\circledast}_0E \cong \vec{J}^{\circledast}E \cong L\bigl( VE, \pi^*\bigl( \bigwedge\nolimits^{n-1} T^*M \bigr) \bigr) , \tag{2.29} $$

respectively. This shows that, in both cases, the corresponding projection onto the quotient amounts to “forgetting the additional energy variable” since it takes a
point with coordinates $(x^\mu, q^i, p^\mu_i, p)$ to the point with coordinates $(x^\mu, q^i, p^\mu_i)$; it will be denoted by $\eta$ (as a reminder for the fact that it projects the extended multiphase space to the ordinary one) and is easily seen to turn $J^{\star}E$ and $J^{\circledast}E$ into affine line bundles over $\vec{J}^{*}E$ and over $\vec{J}^{\circledast}E$, respectively.

An alternative but equivalent description of the extended multiphase space of field theory is as a certain bundle of differential forms on the total space $E$ of the configuration bundle, namely the bundle $\bigwedge\nolimits^{n}_{n-1} T^*E$ of $(n-1)$-horizontal $n$-forms on $E$, that is, of $n$-forms on $E$ that vanish whenever one inserts at least two vertical vectors. In fact, there is a canonical isomorphism

$$ \Phi : \bigwedge\nolimits^{n}_{n-1} T^*E \xrightarrow{\ \cong\ } J^{\circledast}E \tag{2.30} $$

of vector bundles over $E$ that can be defined explicitly as follows: given any point $e$ in $E$ with base point $x = \pi(e)$ in $M$ and any $(n-1)$-horizontal $n$-form $\alpha_e \in \bigwedge\nolimits^{n}_{n-1} T^*_eE$, together with a jet $u_e \in J_eE$, we can use $u_e$, which is a linear map from $T_xM$ to $T_eE$, to pull back the $n$-form $\alpha_e$ on $T_eE$ to an $n$-form $u_e^*\alpha_e$ on $T_xM$. Obviously, $u_e^*\alpha_e$ is an affine function of $u_e$ as $u_e$ varies over the affine space $J_eE$ because it is actually a linear function of $u_e$ when $u_e$ is allowed to vary over the entire vector space $L(T_xM, T_eE)$ (the restriction of a linear map between two vector spaces to an affine subspace of its domain is an affine map). Thus putting

$$ \Phi_e(\alpha_e) \cdot u_e = u_e^*\alpha_e \tag{2.31} $$

defines a map $\Phi_e : \bigwedge\nolimits^{n}_{n-1} T^*_eE \to J^{\circledast}_eE$ which is evidently linear and, as $e$ varies over $E$, provides the desired isomorphism (2.30). Further details can be found in Ref. [4].

The importance of this canonical isomorphism is due to the fact that it provides a natural way to introduce a multicanonical form $\theta$ and a multisymplectic form $\omega$ on extended multiphase space which play a similar role in field theory as the canonical form $\theta$ and the symplectic form $\omega$ on cotangent bundles in mechanics. Namely, $\theta$ is an $n$-form that can be defined intrinsically by using the tangent map $T\tau_{J^{\circledast}E} : T(J^{\circledast}E) \to TE$ to the bundle projection $\tau_{J^{\circledast}E} : J^{\circledast}E \to E$, as follows. Given a point $z \in J^{\circledast}E$ with base point $e = \tau_{J^{\circledast}E}(z)$ in $E$ and $n$ tangent vectors $w_1, \ldots, w_n$ to $J^{\circledast}E$ at $z$, put

$$ \theta_z(w_1, \ldots, w_n) = \bigl( \Phi_e^{-1}(z) \bigr) \bigl( T_z\tau_{J^{\circledast}E} \cdot w_1 , \ldots , T_z\tau_{J^{\circledast}E} \cdot w_n \bigr) . \tag{2.32} $$

Moreover, $\omega$ is an $(n+1)$-form which, as in mechanics, is defined to be the negative of the exterior derivative of $\theta$:

$$ \omega = - d\theta . \tag{2.33} $$
Another important object that can be defined globally both on extended and ordinary multiphase space is the scaling or Euler vector field which we shall denote here by $\Sigma$. Its definition is based exclusively on the fact that $J^{\circledast}E$ and $\vec{J}^{\circledast}E$ are total spaces of vector bundles over $E$. In fact, given any vector bundle $V$ over $E$, $\Sigma_V$ (which we shall simply denote by $\Sigma$ when there is no danger of confusion) is defined to be the fundamental vector field associated with the action of $\mathbb{R}$, considered as a commutative group under addition, by scaling transformations on the fibers:

$$ \mathbb{R} \times V \to V , \qquad (\lambda, v) \mapsto \exp(\lambda) \, v . $$

Thus $\Sigma$ is simply that vertical vector field on $V$ which, under identification of the vertical tangent spaces to $V$ with the fibers of $V$ itself typical for vector bundles, becomes the identity on $V$:

$$ \Sigma(v) = \frac{d}{d\lambda} \exp(\lambda) \, v \, \Big|_{\lambda=0} = v . $$
In adapted local coordinates, the isomorphism Φ can be defined by the requirement that the (n − 1)-horizontal n-form on E corresponding to the point in $J^{⍟}E$ with coordinates $(x^\mu, q^i, p_i^\mu, p)$ is explicitly given by
\[ p_i^\mu \, dq^i \wedge d^n x_\mu + p \, d^n x . \tag{2.34} \]
The tautological nature of the definition of θ then becomes apparent by realizing that exactly the same expression represents the multicanonical form θ:
\[ \theta = p_i^\mu \, dq^i \wedge d^n x_\mu + p \, d^n x . \tag{2.35} \]
Taking the exterior derivative yields
\[ \omega = dq^i \wedge dp_i^\mu \wedge d^n x_\mu - dp \wedge d^n x . \tag{2.36} \]
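The step from Eq. (2.35) to Eq. (2.36) can be spelled out as follows. This is a sketch under the assumption that $d^n x$ and $d^n x_\mu$ have constant coefficients in the adapted coordinates, so that both are closed:

```latex
% Assumption: d(d^n x) = 0 and d(d^n x_\mu) = 0 in adapted local coordinates.
\begin{aligned}
d\theta &= dp_i^\mu \wedge dq^i \wedge d^n x_\mu + dp \wedge d^n x , \\
\omega = -d\theta &= dq^i \wedge dp_i^\mu \wedge d^n x_\mu - dp \wedge d^n x .
\end{aligned}
```

The sign of the first term changes because the two one-forms $dp_i^\mu$ and $dq^i$ are interchanged.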
Moreover, the scaling vector fields on $J^{⍟}E$ and on $\vec{J}^{⊛}E$ are given by
\[ \Sigma = p_i^\mu \frac{\partial}{\partial p_i^\mu} + p \frac{\partial}{\partial p} \tag{2.37} \]
and by
\[ \Sigma = p_i^\mu \frac{\partial}{\partial p_i^\mu} \tag{2.38} \]
respectively. Finally, we note the following relations, which will be used later.

Proposition 2.1. The multicanonical form θ, the multisymplectic form ω and the scaling or Euler vector field Σ on extended multiphase space $J^{⍟}E$ satisfy the following relations:
\[ L_\Sigma \theta = \theta . \tag{2.39} \]
\[ L_\Sigma \omega = \omega . \tag{2.40} \]
\[ i_\Sigma \theta = 0 . \tag{2.41} \]
\[ i_\Sigma \omega = -\theta . \tag{2.42} \]
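Independently of the intrinsic argument, Eqs. (2.41) and (2.42) can be checked directly in adapted local coordinates. The following sketch uses only Eqs. (2.35)–(2.37) and assumes the usual graded Leibniz rule for the contraction with a vector field:

```latex
% Assumed convention: i_X(\alpha \wedge \beta)
%   = (i_X \alpha) \wedge \beta + (-1)^{\deg \alpha} \alpha \wedge (i_X \beta).
\begin{aligned}
i_\Sigma \theta &= 0
  \quad \text{(neither $dp_i^\mu$ nor $dp$ occurs in $\theta$)} , \\
i_\Sigma \omega &= i_\Sigma \left( dq^i \wedge dp_i^\mu \wedge d^n x_\mu \right)
                 - i_\Sigma \left( dp \wedge d^n x \right) \\
                &= - p_i^\mu \, dq^i \wedge d^n x_\mu - p \, d^n x
                 = -\theta .
\end{aligned}
```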
Proof. Let $(\varphi_\lambda)_{\lambda \in \mathbb{R}}$ denote the one-parameter group of scaling transformations on $J^{⍟}E$ given by $\varphi_\lambda(z) = e^\lambda z$. Then by the formula relating the Lie derivative of
The Poisson Bracket for Poisson Forms in Multisymplectic Field Theory
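a differential form along a vector field to the derivative of its pull-back under the flow of that vector field (see, e.g., [25, p. 91]) and the definition of θ, we have
\[
\begin{aligned}
(L_\Sigma \theta)_z(w_1, \dots, w_n)
&= \frac{\partial}{\partial\lambda} (\varphi_\lambda^* \theta)_z(w_1, \dots, w_n) \Big|_{\lambda=0} \\
&= \frac{\partial}{\partial\lambda} \theta_{\varphi_\lambda(z)}(T_z\varphi_\lambda \cdot w_1, \dots, T_z\varphi_\lambda \cdot w_n) \Big|_{\lambda=0} \\
&= \frac{\partial}{\partial\lambda} \Phi_e^{-1}(\varphi_\lambda(z))\big(T_{\varphi_\lambda(z)}\tau_{J^{⍟}E} \cdot (T_z\varphi_\lambda \cdot w_1), \dots, T_{\varphi_\lambda(z)}\tau_{J^{⍟}E} \cdot (T_z\varphi_\lambda \cdot w_n)\big) \Big|_{\lambda=0} \\
&= \frac{\partial}{\partial\lambda} \Phi_e^{-1}(e^\lambda z)\big(T_z(\tau_{J^{⍟}E} \circ \varphi_\lambda) \cdot w_1, \dots, T_z(\tau_{J^{⍟}E} \circ \varphi_\lambda) \cdot w_n\big) \Big|_{\lambda=0} \\
&= \frac{\partial}{\partial\lambda} e^\lambda \, \Phi_e^{-1}(z)\big(T_z\tau_{J^{⍟}E} \cdot w_1, \dots, T_z\tau_{J^{⍟}E} \cdot w_n\big) \Big|_{\lambda=0} \\
&= \frac{\partial}{\partial\lambda} e^\lambda \, \theta_z(w_1, \dots, w_n) \Big|_{\lambda=0} \\
&= \theta_z(w_1, \dots, w_n) ,
\end{aligned}
\]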
which proves Eq. (2.39) and also Eq. (2.40) since $L_\Sigma$ commutes with the exterior derivative. Next, observe that with respect to the target projection of $J^{⍟}E$ onto E, Σ is vertical whereas θ is horizontal, which implies Eq. (2.41). Combining these two equations, we finally get
\[ \theta = L_\Sigma \theta = d(i_\Sigma \theta) + i_\Sigma d\theta = -i_\Sigma \omega , \]
proving Eq. (2.42).

We note here that the existence of the canonically-defined forms θ and ω is what distinguishes the twisted affine dual $J^{⍟}E$ from the ordinary affine dual $J^{\star}E$ of JE.

Using the jet bundle JE and the multiphase spaces $\vec{J}^{⊛}E$ and $J^{⍟}E$ associated with a given fiber bundle E over space-time M, one can develop a general covariant Lagrangian and Hamiltonian formalism for field theories whose configurations are sections of E. For example, the Lagrangian function of mechanics is replaced by a Lagrangian density L, which is a function on JE with values in the volume forms on space-time, so that one can integrate it to compute the action functional and formulate a variational principle. It gives rise to a covariant Legendre transformation which replaces that of mechanics and comes in two variants, both defined by an appropriate notion of vertical derivative or fiber derivative: one of them is a fiber preserving smooth map $\vec{F}L : JE \to \vec{J}^{⊛}E$ and the other a fiber preserving smooth map $FL : JE \to J^{⍟}E$; of course, the former is obtained from the latter by composition with the natural projection η from $J^{⍟}E$ onto $\vec{J}^{⊛}E$ mentioned above. When $\vec{F}L$ is a local/global diffeomorphism, the Lagrangian L is called regular/hyperregular.
On the other hand, the Hamiltonian function of mechanics is replaced by a Hamiltonian density H, which is a section of extended multiphase space $J^{⍟}E$ as an affine line bundle over ordinary multiphase space $\vec{J}^{⊛}E$. Once again, any such section gives rise to a covariant Legendre transformation, defined by an appropriate notion of vertical derivative or fiber derivative: it is a fiber preserving smooth map $FH : \vec{J}^{⊛}E \to JE$. When FH is a local/global diffeomorphism, the Hamiltonian H is called regular/hyperregular. In any case, pulling back θ and ω from $J^{⍟}E$ to JE via FL generates the Poincaré-Cartan forms $\theta_L$ and $\omega_L$ on JE, and similarly, pulling them back from $J^{⍟}E$ to $\vec{J}^{⊛}E$ via H generates the forms $\theta_H$ and $\omega_H$ on $\vec{J}^{⊛}E$. As in mechanics, the Lagrangian and Hamiltonian formulations turn out to be completely equivalent in the hyperregular case, with $\vec{F}L$ and FH being each other’s inverse. For more details on these and related matters, the reader may consult Ref. [3] and, in particular, Ref. [4], except for the direct construction of the Legendre transformation FH associated with a Hamiltonian H, which was first derived in Ref. [23]; see also Ref. [24]. There is also a generalization of the Hamilton-Jacobi equation to the field theoretical situation; the reader may consult the extensive review by Kastrup [27] as a starting point for this direction.

3. Poisson Forms and Their Poisson Brackets

The constructions exposed in the previous section have identified the extended multiphase space of field theory as an example of a multisymplectic manifold.

Definition 3.1. A multisymplectic manifold is a manifold P equipped with a non-degenerate closed (n + 1)-form ω, called the multisymplectic form.

Remark. This definition is deliberately vague as to the meaning of the term “non-degenerate”, at least when n > 1. The standard interpretation is that the kernel of ω on vectors should vanish, that is,
\[ i_X \omega = 0 \ \Rightarrow \ X = 0 \quad \text{for vector fields } X . \tag{3.1} \]
Note that, of course, no such conclusion holds for multivector fields, that is, the kernel of ω on multivectors is non-trivial. (This is true even for symplectic forms which vanish on certain bivectors, for example on those that represent two-dimensional isotropic subspaces.) However, the condition (3.1) alone is too weak and it is not clear what additional algebraic constraints should be imposed on ω. A first attempt in this direction has been made by Martin [28, 29], but his conditions are too restrictive and do not seem to agree with what is needed in applications to field theory. More recently, a promising proposal has been made by Cantrijn, Ibort and de León [30] which seems to come close to a convincing definition of the concept of a multisymplectic manifold. Fortunately, there is no need to enter this discussion here since the “minimal” requirement of non-degeneracy formulated in Eq. (3.1) is sufficient for our purposes and will be used here to provide a working definition.
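The parenthetical statement about symplectic forms can be made concrete in the simplest case. The following is an illustrative sketch of our own, assuming the standard symplectic form on $\mathbb{R}^4$ (the case n = 1); the overall sign depends on the contraction convention and is left open:

```latex
% Illustrative assumption: P = \mathbb{R}^4 with coordinates (q^1, q^2, p_1, p_2).
\omega = dq^1 \wedge dp_1 + dq^2 \wedge dp_2 , \qquad
\xi = \frac{\partial}{\partial q^1} \wedge \frac{\partial}{\partial q^2} , \qquad
i_\xi \omega = \pm\, \omega\!\left( \frac{\partial}{\partial q^1} ,
                                    \frac{\partial}{\partial q^2} \right) = 0 .
```

Here ξ is a non-zero bivector in the kernel of ω, although $i_X \omega \neq 0$ for every non-zero vector X, so condition (3.1) is satisfied.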
In what follows, we shall make extensive use of the basic operations of calculus on manifolds involving multivector fields and differential forms, namely the Schouten bracket between multivector fields, the contraction of differential forms with multivector fields and the Lie derivative of differential forms along multivector fields. For the convenience of the reader, the relevant formulae are summarized in Appendix A; in particular, Eqs. (A.9) and (A.11) will be used constantly and often without further mention. On multisymplectic manifolds, there are special classes of multivector fields and of differential forms: Definition 3.2. An r-multivector field X on a multisymplectic manifold P is called locally Hamiltonian if iX ω is closed, or equivalently, if LX ω = 0 ,
(3.2)
and it is called globally Hamiltonian or simply Hamiltonian if iX ω is exact, i.e. if there exists an (n − r)-form f on P such that iX ω = df .
(3.3)
In this case, we say that f is associated with X or corresponds to X. Conversely, an (n − r)-form f on a multisymplectic manifold P is called Hamiltonian if there exists an r-multivector field X on P such that iX ω = df .
(3.4)
In this case, we say that X is associated with f or corresponds to f.

Remark. As mentioned before, the kernel of ω on multivectors is non-trivial, so the correspondence between Hamiltonian multivector fields and Hamiltonian forms is not unique (in either direction). Moreover, by far not every form is Hamiltonian. In particular, as first shown in special examples by Kijowski [8] and then more systematically by Kanatchikov [1], although in a somewhat different context, there are restrictions on the allowed multimomentum dependence of the coefficient functions. Of course, every closed form is Hamiltonian (the corresponding Hamiltonian multivector field vanishes identically). Below we will give more interesting examples to show that the definition is not empty.

Proposition 3.3. The Schouten bracket of any two locally Hamiltonian multivector fields X and Y on a multisymplectic manifold P is a globally Hamiltonian multivector field [X, Y] on P whose associated Hamiltonian form can, up to sign, be chosen to be the double contraction $i_X i_Y \omega$. More precisely, assuming X to be of degree r and Y to be of degree s, we have
\[ i_{[X,Y]} \omega = (-1)^{(r-1)s} \, d(i_X i_Y \omega) . \tag{3.5} \]
In particular, this implies that under the Schouten bracket, the space $\mathfrak{X}^\wedge_{LH}(P)$ of locally Hamiltonian multivector fields on P is a subalgebra of the Lie superalgebra
$\mathfrak{X}^\wedge(P)$ of all multivector fields on P, containing the space $\mathfrak{X}^\wedge_H(P)$ of globally Hamiltonian multivector fields, as well as the (smaller) space $\mathfrak{X}^\wedge_0(P)$ of multivector fields taking values in the kernel of ω, as ideals: if X is locally Hamiltonian, then
\[ i_\xi \omega = 0 \ \Rightarrow \ i_{[\xi,X]} \omega = 0 . \tag{3.6} \]
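To see what Eq. (3.5) says in the most familiar situation, one can specialize it to vector fields. The following is a direct substitution of r = s = 1 into the stated formula, not an additional result:

```latex
% Special case r = s = 1 of Eq. (3.5): X and Y are locally Hamiltonian vector fields.
i_{[X,Y]} \, \omega = (-1)^{(1-1) \cdot 1} \, d( i_X i_Y \omega )
                    = d( i_X i_Y \omega ) .
```

For n = 1 this is the classical statement that the commutator of two locally Hamiltonian vector fields on a symplectic manifold is globally Hamiltonian, with Hamiltonian function given, up to a convention-dependent sign, by ω evaluated on X and Y.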
Proof. According to Eqs. (A.11) and (A.9), we have for any two multivector fields X of degree r and Y of degree s,
\[
\begin{aligned}
i_{[X,Y]} \omega
&= (-1)^{(r-1)s} L_X i_Y \omega - i_Y L_X \omega \\
&= (-1)^{(r-1)s} d(i_X i_Y \omega) + (-1)^{(r-1)(s-1)} i_X d(i_Y \omega) - i_Y L_X \omega \\
&= (-1)^{(r-1)s} d(i_X i_Y \omega) + (-1)^{(r-1)(s-1)} i_X L_Y \omega - i_Y L_X \omega ,
\end{aligned}
\]
since dω = 0, showing that if X and Y are both locally Hamiltonian, then [X, Y] is globally Hamiltonian and Eq. (3.5) holds.

Definition 3.4. A Hamiltonian form f on a multisymplectic manifold P is called a Poisson form if its contraction with any multivector field ξ on P taking values in the kernel of ω vanishes:
\[ i_\xi \omega = 0 \ \Rightarrow \ i_\xi f = 0 . \tag{3.7} \]
Remark. For the Poisson bracket introduced below to be well-defined, it would be sufficient to impose the apparently weaker condition that the contraction of f with any multivector field ξ on P taking values in the kernel of ω should be a closed form:
\[ i_\xi \omega = 0 \ \Rightarrow \ d(i_\xi f) = 0 . \tag{3.8} \]
However, it turns out that this condition is already sufficient to imply the previous one. To see this, observe that if f is a differential form on P satisfying Eq. (3.8) and ξ is any multivector field on P taking values in the kernel of ω, then for any function ϕ on P, ϕξ will be a multivector field on P taking values in the kernel of ω as well and hence
\[ 0 = d(i_{\phi\xi} f) = d(\phi \, i_\xi f) = d\phi \wedge i_\xi f + \phi \, d(i_\xi f) = d\phi \wedge i_\xi f . \]
But this means that the exterior product of $i_\xi f$ with any one-form on P must vanish, which is only possible if $i_\xi f$ itself vanishes.

Definition 3.5. An exact multisymplectic manifold is a multisymplectic manifold whose multisymplectic form ω is the exterior derivative of a Poisson form:
\[ \omega = -d\theta . \tag{3.9} \]
\[ i_\xi \omega = 0 \ \Rightarrow \ i_\xi \theta = 0 . \tag{3.10} \]
We shall call θ the multicanonical form.
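A familiar special case may help to fix ideas. The sketch below is our own illustration, assuming P is a cotangent bundle with canonical coordinates (the case n = 1); it only checks the two displayed conditions (3.9) and (3.10):

```latex
% Illustrative assumption: P = T^*Q with canonical coordinates (q^i, p_i), n = 1.
\theta = p_i \, dq^i , \qquad \omega = -d\theta = dq^i \wedge dp_i .
% Vector fields: i_\xi \omega = 0 forces \xi = 0, so i_\xi \theta = 0 trivially.
% Multivector fields of degree r >= 2: i_\xi \theta = 0 automatically, since
% the contraction of a 1-form with a multivector of degree > 1 vanishes.
```

In this case the kernel condition (3.10) carries no information; it becomes a genuine restriction only for forms θ of higher degree.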
Remark. It is an immediate consequence of Proposition 2.1, in particular of Eq. (2.42), that the extended multiphase space of field theory is an exact multisymplectic manifold. However, the condition that the kernel of θ should contain that of ω is non-trivial in the sense that it is not always possible to modify a potential of an exact form by adding an appropriate closed form so as to achieve the desired inclusion of the kernels, as the following counterexample will show (see footnote d). Consider the three-sphere $S^3$ as the total space of the Hopf bundle, a principal U(1)-bundle over the two-sphere $S^2$, and let ξ be the fundamental vector field of the U(1) group action on $S^3$ and α be the canonical connection 1-form on $S^3$. Then $i_\xi \alpha = 1$ and $i_\xi d\alpha = 0$. We want to modify α by some closed form β so that $i_\xi(\alpha + \beta) = 0$. But $S^3$ is simply connected, so dβ = 0 implies that there is a function f with df = β. Hence we are looking for a function f on $S^3$ that satisfies $i_\xi df = -1$. But $S^3$ is compact, so f must have at least two critical points (a maximum and a minimum), and we arrive at a contradiction. In other words, we cannot modify the potential α of dα in such a way that the kernel of dα is contained in the kernel of the modified potential.

Definition 3.6. Let P be an exact multisymplectic manifold. Given any two Poisson forms f of degree n − r and g of degree n − s on P, their Poisson bracket is defined to be the (n + 1 − r − s)-form on P given by
\[ \{f, g\} = -L_X g + (-1)^{(r-1)(s-1)} L_Y f - (-1)^{(r-1)s} L_{X \wedge Y} \theta , \tag{3.11} \]
or equivalently,
\[ \{f, g\} = (-1)^{r(s-1)} i_Y i_X \omega + d\big( (-1)^{(r-1)(s-1)} i_Y f - i_X g - (-1)^{(r-1)s} i_Y i_X \theta \big) , \tag{3.12} \]
where X and Y are Hamiltonian multivector fields associated with f and with g, respectively.

Remark. This Poisson bracket is an extension of the one between Hamiltonian (n − 1)-forms introduced by two of the present authors in an earlier article [5], except for the fact that when f and g are (n − 1)-forms, X and Y are vector fields and are uniquely determined by f and g, so there is no need to impose restrictions on the contraction of f and g with multivector fields taking values in the kernel of ω: the definition given in Ref. [5] works for all Hamiltonian (n − 1)-forms and not just for Poisson (n − 1)-forms.

Proposition 3.7. The Poisson bracket introduced above closes and is well-defined, i.e. when f and g are Poisson forms, {f, g} is again a Poisson form which does not depend on the choice of the Hamiltonian multivector fields X and Y used in its definition. Moreover, we have
\[ i_{[Y,X]} \omega = d\{f, g\} , \tag{3.13} \]

Footnote d: This example is due to M. Bordemann.
i.e. if X is a Hamiltonian multivector field associated with f and Y is a Hamiltonian multivector field associated with g, then [Y, X] is a Hamiltonian multivector field associated with {f, g}.

Proof. We begin by using Eq. (A.9) to show that, for any two Hamiltonian forms f of degree n − r and g of degree n − s with associated Hamiltonian multivector fields X and Y, respectively, the expressions on the right-hand side of Eqs. (3.11) and (3.12) coincide:
\[
\begin{aligned}
& -L_X g + (-1)^{(r-1)(s-1)} L_Y f - (-1)^{(r-1)s} L_{X \wedge Y} \theta \\
&\quad = -d(i_X g) + (-1)^r i_X dg + (-1)^{(r-1)(s-1)} d(i_Y f) - (-1)^{(r-1)(s-1)+s} i_Y df \\
&\qquad - (-1)^{(r-1)s} d(i_{X \wedge Y} \theta) - (-1)^{r(s-1)} i_{X \wedge Y} \omega \\
&\quad = -d(i_X g) + (-1)^{rs+r} i_Y i_X \omega + (-1)^{(r-1)(s-1)} d(i_Y f) + (-1)^{rs-r} i_Y i_X \omega \\
&\qquad - (-1)^{(r-1)s} d(i_Y i_X \theta) - (-1)^{rs-r} i_Y i_X \omega \\
&\quad = (-1)^{r(s-1)} i_Y i_X \omega + d\big( (-1)^{(r-1)(s-1)} i_Y f - i_X g - (-1)^{(r-1)s} i_Y i_X \theta \big) .
\end{aligned}
\]
In order for the bracket to be well-defined, it is necessary and sufficient that this expression vanishes whenever X or Y takes its values in the kernel of ω: this is guaranteed by the requirement that f, g and θ should be Poisson forms. Moreover, in view of Eq. (3.5), Eq. (3.13) follows immediately from Eq. (3.12), proving that the Poisson bracket {f, g} of two Poisson forms is a Hamiltonian form. To check that it is in fact a Poisson form, assume ξ to be a multivector field taking values in the kernel of ω, say of degree k, and consider the expressions obtained by contracting each of the four terms in Eq. (3.12) with ξ. The first obviously vanishes, whereas the fourth can be seen to vanish due to Eqs. (3.6) and (3.10):
\[
\begin{aligned}
i_\xi d(i_Y i_X \theta)
&= (-1)^s i_\xi i_Y d(i_X \theta) + i_\xi L_Y i_X \theta \\
&= (-1)^{s(k-1)} i_Y i_\xi L_X \theta + (-1)^{r+s(k-1)} i_Y i_\xi i_X d\theta \\
&\quad - i_{[Y,\xi]} i_X \theta + (-1)^{(s-1)k} L_Y i_\xi i_X \theta \\
&= -(-1)^{s(k-1)} i_Y i_{[X,\xi]} \theta + (-1)^{(r-1)k+s(k-1)} i_Y L_X i_\xi \theta \\
&\quad - (-1)^{r+s(k-1)} i_Y i_\xi i_X \omega - i_{[Y,\xi]} i_X \theta + (-1)^{(s-1)k} L_Y i_\xi i_X \theta \\
&= 0 .
\end{aligned}
\]
Similarly, the second and third can be handled by using Eqs. (3.6) and (3.7) which imply that
\[ i_\xi d(i_Y f) = (-1)^s i_\xi i_Y df + i_\xi L_Y f = (-1)^s i_\xi i_Y i_X \omega - i_{[Y,\xi]} f + (-1)^{(s-1)k} L_Y i_\xi f \]
and
\[ i_\xi d(i_X g) = (-1)^r i_\xi i_X dg + i_\xi L_X g = (-1)^r i_\xi i_X i_Y \omega - i_{[X,\xi]} g + (-1)^{(r-1)k} L_X i_\xi g \]
vanish since f and g are Poisson forms.

Now we can formulate the main theorem of this paper:

Theorem 3.8. Let P be an exact multisymplectic manifold. The Poisson bracket introduced above is bilinear over $\mathbb{R}$, is graded antisymmetric, which means that for any two Poisson forms f of degree n − r and g of degree n − s on P, we have
\[ \{g, f\} = -(-1)^{(r-1)(s-1)} \{f, g\} , \tag{3.14} \]
and satisfies the graded Jacobi identity, which means that for any three Poisson forms f of degree n − r, g of degree n − s and h of degree n − t on P, we have
\[ (-1)^{(r-1)(t-1)} \{f, \{g, h\}\} + \text{cyclic perm.} = 0 , \tag{3.15} \]
thus turning the space of Poisson forms on P into a Lie superalgebra.

Remark. Bilinearity over $\mathbb{R}$ and the graded antisymmetry (3.14) being obvious, the main statement of the theorem is of course the validity of the graded Jacobi identity (3.15), which depends crucially on the exact correction terms, that is, the last three terms in the defining equation (3.12). To prove this, we need the following two lemmas:

Lemma 3.9. Let P be a multisymplectic manifold. For any three locally Hamiltonian multivector fields X of degree r, Y of degree s and Z of degree t on P, we have the cyclic identity
\[ (-1)^{r(t-1)} i_X d(i_Y i_Z \omega) + \text{cyclic perm.} = (-1)^{rt} d(i_X i_Y i_Z \omega) . \tag{3.16} \]

Proof. This is obtained by calculating
\[
\begin{aligned}
i_X d(i_Y i_Z \omega)
&= (-1)^{(s-1)t} i_X i_{[Y,Z]} \omega \\
&= (-1)^{(s-1)t + r(s+t-1)} i_{[Y,Z]} i_X \omega \\
&= (-1)^{r(s+t-1)} \big( L_Y i_Z - (-1)^{(s-1)t} i_Z L_Y \big) i_X \omega \\
&= (-1)^{r(s+t-1)} d(i_Y i_Z i_X \omega) + (-1)^{r(s+t-1)+s-1} i_Y d(i_Z i_X \omega) \\
&\quad - (-1)^{r(s+t-1)+(s-1)t} i_Z d(i_Y i_X \omega) ,
\end{aligned}
\]
and multiplying by $(-1)^{rt-r}$.
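As a sanity check on the signs, the cyclic identity (3.16) can be written out for three locally Hamiltonian vector fields. This is simply the substitution r = s = t = 1 in the stated formula:

```latex
% Special case r = s = t = 1 of Eq. (3.16): all three signs on the left are +1
% and the sign on the right is (-1)^{rt} = -1.
i_X \, d( i_Y i_Z \omega ) + i_Y \, d( i_Z i_X \omega ) + i_Z \, d( i_X i_Y \omega )
  = - \, d( i_X i_Y i_Z \omega ) .
```

When n = 1, ω is a 2-form, so the triple contraction on the right-hand side vanishes and the identity reduces to the statement that a cyclic sum of derivatives of the functions $i_Y i_Z \omega$ is zero.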
Lemma 3.10. Let P be an exact multisymplectic manifold. For any three locally Hamiltonian multivector fields X of degree r, Y of degree s and Z of degree t on P, we have the cyclic identity
\[
\begin{aligned}
& (-1)^{r(t-1)} i_X d(i_Y i_Z \theta) - (-1)^{r(t-1)+s} i_X i_Y d(i_Z \theta) + \text{cyclic perm.} \\
&\quad = (-1)^{rt+r+s+t} i_X i_Y i_Z \omega + (-1)^{rt} d(i_X i_Y i_Z \theta) .
\end{aligned}
\tag{3.17}
\]
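Proof. This is obtained by calculating
\[
\begin{aligned}
& i_X d(i_Y i_Z \theta) + (-1)^{s-1} i_X i_Y d(i_Z \theta) - (-1)^{(s-1)t} i_X i_Z d(i_Y \theta) + (-1)^{(s-1)(t-1)} i_X i_Z i_Y \omega \\
&\quad = i_X \big( L_Y i_Z - (-1)^{(s-1)t} i_Z L_Y \big) \theta \\
&\quad = (-1)^{(s-1)t} i_X i_{[Y,Z]} \theta \\
&\quad = (-1)^{(s-1)t + r(s+t-1)} i_{[Y,Z]} i_X \theta \\
&\quad = (-1)^{r(s+t-1)} \big( L_Y i_Z - (-1)^{(s-1)t} i_Z L_Y \big) i_X \theta \\
&\quad = (-1)^{r(s+t-1)} d(i_Y i_Z i_X \theta) + (-1)^{r(s+t-1)+s-1} i_Y d(i_Z i_X \theta) \\
&\qquad - (-1)^{r(s+t-1)+(s-1)t} i_Z d(i_Y i_X \theta) - (-1)^{r(s+t-1)+(s-1)(t-1)} i_Z i_Y d(i_X \theta) ,
\end{aligned}
\]
and multiplying by $(-1)^{rt-r}$.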
Proof of Theorem 3.8. Given any three Poisson forms f of degree n − r, g of degree n − s and h of degree n − t and fixing three Hamiltonian multivector fields X of degree r, Y of degree s and Z of degree t associated with f, with g and with h, respectively, we compute the double Poisson bracket
\[
\begin{aligned}
& (-1)^{(r-1)(t-1)} \{f, \{g, h\}\} \\
&= (-1)^{(r-1)(t-1)+r(s+t)} i_{[Z,Y]} i_X \omega + (-1)^{(r-1)(t-1)+(r-1)(s+t)} d(i_{[Z,Y]} f) \\
&\quad - (-1)^{(r-1)(t-1)} d(i_X \{g, h\}) - (-1)^{(r-1)(t-1)+(r-1)(s+t-1)} d(i_{[Z,Y]} i_X \theta) \\
&= -(-1)^{(rs+r+t)+r(s+t-1)+(st+s+t)} i_X i_{[Y,Z]} \omega \\
&\quad + (-1)^{(r-1)(s-1)+(t-1)s} d(L_Z i_Y f) - (-1)^{(r-1)(s-1)} d(i_Y L_Z f) \\
&\quad - (-1)^{(r-1)(t-1)+s(t-1)} d(i_X i_Z i_Y \omega) \\
&\quad - (-1)^{(r-1)(t-1)+(s-1)(t-1)} d(i_X d(i_Z g)) + (-1)^{(r-1)(t-1)} d(i_X d(i_Y h)) \\
&\quad + (-1)^{(r-1)(t-1)+(s-1)t} d(i_X d(i_Z i_Y \theta)) \\
&\quad - (-1)^{(r-1)s+(t-1)s} d(L_Z i_Y i_X \theta) + (-1)^{(r-1)s} d(i_Y L_Z i_X \theta) \\
&= -(-1)^{rt+s+t} i_X d(i_Y i_Z \omega) \\
&\quad + (-1)^{rs+st+r+t} d(i_Z d(i_Y f)) + (-1)^{rs+r+s} d(i_Y d(i_Z f)) \\
&\quad - \underline{(-1)^{rs+r+s+t} d(i_Y i_Z i_X \omega)} + \underline{(-1)^{rt+st+r+s+t} d(i_X i_Z i_Y \omega)} \\
&\quad - (-1)^{rt+st+r+s} d(i_X d(i_Z g)) - (-1)^{rt+r+t} d(i_X d(i_Y h)) \\
&\quad - (-1)^{rt+r} d(i_X d(i_Y i_Z \theta)) \qquad \leftarrow \\
&\quad + (-1)^{st+t} d(i_Z d(i_X i_Y \theta)) \qquad \leftarrow \\
&\quad + (-1)^{rs+s} d(i_Y d(i_Z i_X \theta)) - (-1)^{rs+s+t} d(i_Y i_Z d(i_X \theta)) .
\end{aligned}
\]
In the last expression, the underlined terms cancel each other. Moreover, under the cyclic sum, the terms marked by an arrow cancel each other and the terms containing derivatives of contractions of f, g, h cancel pairwise, i.e. the expression
\[
\begin{aligned}
& + (-1)^{rs+st+r+t} d(i_Z d(i_Y f)) + (-1)^{rs+r+s} d(i_Y d(i_Z f)) \\
& - (-1)^{rt+st+r+s} d(i_X d(i_Z g)) - (-1)^{rt+r+t} d(i_X d(i_Y h)) \\
& + (-1)^{st+tr+s+r} d(i_X d(i_Z g)) + (-1)^{st+s+t} d(i_Z d(i_X g)) \\
& - (-1)^{sr+tr+s+t} d(i_Y d(i_X h)) - (-1)^{sr+s+r} d(i_Y d(i_Z f)) \\
& + (-1)^{tr+rs+t+s} d(i_Y d(i_X h)) + (-1)^{tr+t+r} d(i_X d(i_Y h)) \\
& - (-1)^{ts+rs+t+r} d(i_Z d(i_Y f)) - (-1)^{ts+t+s} d(i_Z d(i_X g))
\end{aligned}
\]
vanishes. Finally, using the cyclic identities (3.16) and (3.17), we see that the remaining terms sum up as follows:
\[
\begin{aligned}
& (-1)^{(r-1)(t-1)} \{f, \{g, h\}\} + \text{cyclic perm.} \\
&= -(-1)^{r+s+t} \big[ (-1)^{r(t-1)} i_X d(i_Y i_Z \omega) + \text{cyclic perm.} \big] \\
&\quad + d\big[ (-1)^{r(t-1)} i_X d(i_Y i_Z \theta) - (-1)^{r(t-1)+s} i_X i_Y d(i_Z \theta) + \text{cyclic perm.} \big] \\
&= -(-1)^{r+s+t} (-1)^{rt} d(i_X i_Y i_Z \omega) \\
&\quad + d\big[ (-1)^{r+s+t} (-1)^{rt} i_X i_Y i_Z \omega + (-1)^{rt} d(i_X i_Y i_Z \theta) \big] \\
&= 0 .
\end{aligned}
\]
This completes the proof of the main theorem. Remark. From the definition given in Eq. (3.12), it is obvious that the Poisson bracket between an arbitrary Poisson form f and a closed Poisson form g is exact, since in this case the Hamiltonian multivector field Y associated with g may be
chosen to vanish identically, so that one gets {f, g} = −d(iX g). Therefore, the space of closed Poisson forms is an ideal in the Lie superalgebra of all Poisson forms. Concluding, it must not go unnoticed that the Poisson bracket between Poisson forms introduced in this paper should be looked upon with a certain amount of caution, for a variety of reasons. One of these is that the space of Poisson forms is a Lie superalgebra but apparently not a Poisson superalgebra, since the Poisson bracket does not act as a superderivation in its second argument with respect to the exterior product of forms, nor does there seem to exist any other naturally defined associative supercommutative product between Poisson forms with that property: this is in contrast to the situation for multivector fields which do form a Poisson superalgebra with respect to the exterior product and the Schouten bracket. There is also a degree problem, since for example, the Poisson bracket between functions would be a form of negative degree, which is always zero: this is, at least at first sight, rather odd. Finally, the question about the relation to the covariant Poisson bracket of Peierls and de Witt mentioned at the end of the introduction remains open. 4. The Universal Multimomentum Map On exact multisymplectic manifolds, Definition 3.2 can be complemented as follows. Definition 4.1. A multivector field X on an exact multisymplectic manifold P is called exact Hamiltonian if LX θ = 0 .
(4.1)
The terminology is consistent with that introduced before because exact Hamiltonian multivector fields are Hamiltonian: this is an immediate consequence of Proposition 4.3 below. Thus Proposition 3.3 can be complemented as follows.

Proposition 4.2. The Schouten bracket of any two exact Hamiltonian multivector fields X and Y on an exact multisymplectic manifold P is an exact Hamiltonian multivector field [X, Y] on P. This means that the space $\mathfrak{X}^\wedge_{EH}(P)$ of exact Hamiltonian multivector fields on P is a subalgebra of the Lie superalgebra $\mathfrak{X}^\wedge(P)$ of all multivector fields on P which, according to Eq. (3.6), contains the space $\mathfrak{X}^\wedge_0(P)$ of multivector fields taking values in the kernel of ω as an ideal.

Proof. The proposition follows directly from Eq. (A.12).

Exact Hamiltonian multivector fields generate Poisson forms, by contraction with the multicanonical form.

Proposition 4.3. Let P be an exact multisymplectic manifold. For every exact Hamiltonian r-multivector field X on P, the formula
\[ J(X) = (-1)^{r-1} i_X \theta \tag{4.2} \]
defines a Poisson (n − r)-form J(X) on P whose associated Hamiltonian multivector field is X itself. In particular, X is Hamiltonian.

Proof. Using Eq. (A.9), we see that the condition (4.1) implies
\[ d(J(X)) = (-1)^{r-1} d(i_X \theta) = (-1)^{r-1} L_X \theta - i_X d\theta = i_X \omega , \tag{4.3} \]
so J(X) is a Hamiltonian form whose associated Hamiltonian multivector field is X itself. Moreover, the kernel of J(X) on multivectors contains that of θ which in turn contains that of ω, so J(X) is a Poisson form.

Proposition 4.4. Let P be an exact multisymplectic manifold. The linear map J from the space $\mathfrak{X}^\wedge_{EH}(P)$ of exact Hamiltonian multivector fields on P to the space of Poisson forms on P defined by Eq. (4.2) is an antihomomorphism of Lie superalgebras, i.e. we have
\[ \{J(X), J(Y)\} = J([Y, X]) . \tag{4.4} \]
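Before turning to the general argument, it may help to see Eq. (4.4) for vector fields, where all sign factors are trivial. The sketch below substitutes r = s = 1 into Eqs. (3.12) and (4.2) and assumes Cartan's formula $L_X = d\, i_X + i_X d$ together with $i_{[Y,X]} = L_Y i_X - i_X L_Y$, which are the r = s = 1 cases of the formulae quoted from Appendix A:

```latex
% Special case r = s = 1: X, Y exact Hamiltonian vector fields, J(X) = i_X \theta,
% so L_X \theta = L_Y \theta = 0 and d(i_X \theta) = -i_X d\theta = i_X \omega.
\begin{aligned}
\{J(X), J(Y)\}
  &= i_Y i_X \omega + d\big( i_Y i_X \theta - i_X i_Y \theta - i_Y i_X \theta \big)
   = i_Y i_X \omega + d( i_Y i_X \theta ) , \\
J([Y, X])
  &= i_{[Y,X]} \theta = L_Y i_X \theta - i_X L_Y \theta
   = d( i_Y i_X \theta ) + i_Y \, d( i_X \theta )
   = d( i_Y i_X \theta ) + i_Y i_X \omega .
\end{aligned}
```

The first line uses $i_X i_Y \theta = -i_Y i_X \theta$ for vector fields; the two results agree term by term.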
Proof. For any two exact Hamiltonian multivector fields X of degree r and Y of degree s, we have, according to the defining Eqs. (3.12) and (4.2),
\[
\begin{aligned}
\{J(X), J(Y)\}
&= (-1)^{r(s-1)} i_Y i_X \omega + (-1)^{(r-1)(s-1)+r-1} d(i_Y i_X \theta) \\
&\quad - (-1)^{s-1} d(i_X i_Y \theta) - (-1)^{(r-1)s} d(i_Y i_X \theta) \\
&= (-1)^{r(s-1)} i_Y i_X \omega + (-1)^{(r-1)s} d(i_Y i_X \theta) ,
\end{aligned}
\]
whereas combining Eqs. (A.11), (A.9) and (4.3) gives
\[
\begin{aligned}
J([Y, X])
&= (-1)^{r+s} i_{[Y,X]} \theta \\
&= (-1)^{r+s+r(s-1)} L_Y i_X \theta \qquad \text{since } L_Y \theta = 0 \\
&= (-1)^{(r-1)s} d(i_Y i_X \theta) - (-1)^{rs} i_Y d(i_X \theta) \\
&= (-1)^{(r-1)s} d(i_Y i_X \theta) + (-1)^{r(s-1)} i_Y i_X \omega .
\end{aligned}
\]
Obviously, these two expressions coincide.

Remark. This proposition, even when restricted to vector fields and (n − 1)-forms, constitutes a remarkable improvement over the corresponding Proposition 4.5 of Ref. [4] where, due to an inadequate definition of the Poisson bracket (omitting the exact correction terms, that is, the last three terms in Eq. (3.12)), Eq. (4.4) must be modified by an exact correction term.

Definition 4.5. Let P be an exact multisymplectic manifold. The linear map J from the space $\mathfrak{X}^\wedge_{EH}(P)$ of exact Hamiltonian multivector fields on P to the space of Poisson forms on P defined by Eq. (4.2) will be called the universal multimomentum map and its restriction to the space $\mathfrak{X}_{EH}(P)$ of exact Hamiltonian vector fields on P the universal momentum map.
Remark. The term “universal momentum map” can be justified in the context of Noether’s theorem, dealing with the derivation of conservation laws from symmetries. In classical field theory, conserved quantities are described by Noether currents which depend on the fields of the theory and are (n − 1)-forms on n-dimensional space-time, so that they can be integrated over compact regions in spacelike hypersurfaces in order to provide Noether charges associated with each such region: Noether’s theorem then asserts that when the fields satisfy the equations of motion of the theory, these Noether currents are closed forms. In the multiphase space approach, the Noether currents on space-time are obtained from corresponding Noether current forms defined on (extended) multiphase space via pull-back of differential forms, their entire field dependence being induced by this pull-back. Moreover, there is an explicit procedure to construct these Noether current forms on (extended) multiphase space: it is the field theoretical analogue of the momentum map of Hamiltonian mechanics on cotangent bundles and, in Ref. [4], is called the “special covariant momentum map”.

Briefly, given a Lie group G, with Lie algebra $\mathfrak{g}$, the statement that G is a symmetry group of a specific theory supposes that we are given an action of G on the configuration bundle E over M by bundle automorphisms, which of course induces actions of G on JE and on $\vec{J}E$, as well as on all of their duals, including $J^{⍟}E$ and $\vec{J}^{⊛}E$, by bundle automorphisms. (In order to speak of a symmetry, we must also assume the Lagrangian or Hamiltonian density to be invariant, or rather equivariant, under the action of G, but this aspect is not relevant for the present discussion.) As usual, each of these actions induces an antihomomorphism from $\mathfrak{g}$ to the Lie algebra of vector fields on the corresponding manifold, taking each generator X in $\mathfrak{g}$ to the corresponding fundamental vector field $X_M, X_E, X_{JE}, X_{\vec{J}E}, \dots, X_{\vec{J}^{⊛}E}, X_{J^{⍟}E}$, all of which (except $X_M$) are projectable: for example, $X_E$ projects to $X_M$ under the tangent map $T\pi : TE \to TM$ to the projection $\pi : E \to M$. Moreover, the vector fields $X_{JE}, X_{\vec{J}E}, \dots, X_{\vec{J}^{⊛}E}, X_{J^{⍟}E}$ can all be obtained from the vector field $X_E$ by a canonical lifting process. In particular, the projectable vector fields $X_{J^{⍟}E}$ on $J^{⍟}E$ obtained from projectable vector fields $X_E$ on E by lifting are exact Hamiltonian, and conversely, it turns out that all exact Hamiltonian vector fields on $J^{⍟}E$ are obtained in this way. (The last statement, analogous to a corresponding statement for cotangent bundles, is not proved in Ref. [4]; it will be derived in Ref. [12].)

Now the “special covariant momentum map” of Ref. [4] associated with the symmetry under G is simply given by composing the antihomomorphism that takes generators X in $\mathfrak{g}$ to exact Hamiltonian fundamental vector fields $X_{J^{⍟}E}$ on $J^{⍟}E$ with the universal momentum map introduced above. Therefore, the universal momentum map comprises that part of the construction of the momentum map in field theory which does not depend on the a priori choice of a symmetry group or its action on the dynamical variables of the theory, and the universal multimomentum map extends that from vector fields to multivector fields.
5. Poisson Forms on Multiphase Space

Our aim in this final section is to give a series of examples for Poisson forms on the extended multiphase space $J^{⍟}E$ of field theory. A full, systematic treatment of the subject will be given in a forthcoming separate paper [12].

As a preliminary step, we observe that there is a natural, globally defined notion of vertical vectors and of horizontal covectors on $J^{⍟}E$. In fact, there are two such notions, one referring to the “source” projection onto space-time M and the other to the “target” projection onto the total space E of the configuration bundle. In either case, the vertical vectors are those that vanish under the tangent to the projection, while the horizontal covectors are those that vanish on all vertical vectors. In adapted local coordinates,
\[ \frac{\partial}{\partial q^i} , \ \frac{\partial}{\partial p_i^\mu} \ \text{and} \ \frac{\partial}{\partial p} \quad \text{are vertical with respect to the source projection,} \tag{5.1} \]
\[ \frac{\partial}{\partial p_i^\mu} \ \text{and} \ \frac{\partial}{\partial p} \quad \text{are vertical with respect to the target projection,} \tag{5.2} \]
while
\[ dx^\mu \quad \text{are horizontal with respect to the source projection,} \tag{5.3} \]
\[ dx^\mu \ \text{and} \ dq^i \quad \text{are horizontal with respect to the target projection.} \tag{5.4} \]
This can be extended to multivectors and exterior forms, as follows. Given positive integers r and s with s ≤ r, an exterior r-form is said to be s-horizontal if it vanishes whenever one inserts at least r − s + 1 vertical vectors (this includes the standard notion of horizontal forms by taking s = r), and an r-multivector is said to be s-vertical if it is annihilated by all (r − s + 1)-horizontal exterior forms. Using the standard expansion of multivectors and of exterior forms in adapted local coordinates, it is not difficult to see that an r-form is s-horizontal if and only if it is a linear combination of terms each of which is an exterior product containing at least s horizontal covectors and that an r-multivector is s-vertical if and only if it is a linear combination of terms each of which is an exterior product containing at least s vertical vectors. Thus for example, Eqs. (2.35)–(2.37) show that θ and ω are both (n − 1)-horizontal with respect to the source projection and even n-horizontal with respect to the target projection, while Σ is vertical with respect to both projections. In what follows, the terms “vertical” and “horizontal” will always refer to the source projection, except when explicitly stated otherwise. For later use, we first write down the expansion of a general multivector field X of degree r in terms of adapted local coordinates, as follows:
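\[
\begin{aligned}
X ={}& \frac{1}{r!} X^{\mu_1 \dots \mu_r} \frac{\partial}{\partial x^{\mu_1}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}}
 + \frac{1}{(r-1)!} X^{i,\mu_2 \dots \mu_r} \frac{\partial}{\partial q^i} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}} \\
& + \frac{1}{r!} X_i^{\mu_1 \dots \mu_r} \frac{\partial}{\partial p_i^{\mu_1}} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}}
 + \frac{1}{(r-1)!} X_0^{\mu_2 \dots \mu_r} \frac{\partial}{\partial p} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}} + \xi .
\end{aligned}
\tag{5.5}
\]
Here, all coefficients are assumed to be totally antisymmetric in their space-time indices, whereas ξ is assumed to take values in the kernel of ω. (This can always be achieved without loss of generality, because if we begin by supposing instead that ξ should contain all other terms of the standard expansion, that is, all 2-vertical terms, then ξ would contain just one group of terms that are not obviously annihilated under contraction with ω, namely the terms of the form
\[ \frac{\partial}{\partial q^i} \wedge \frac{\partial}{\partial p_k^\kappa} \wedge \frac{\partial}{\partial x^{\mu_3}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}} . \]
However, this part of ξ can be decomposed into the sum of a term which is annihilated under contraction with ω and a linear combination of the 1-vertical terms
\[ \frac{\partial}{\partial p} \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge \frac{\partial}{\partial x^{\mu_3}} \wedge \dots \wedge \frac{\partial}{\partial x^{\mu_r}} , \]
so that by a redefinition of the coefficients $X_0^{\mu_2 \dots \mu_r}$ and of ξ, we arrive at the expression for X given in Eq. (5.5), with ξ now taking values in the kernel of ω. For a more detailed discussion, see Ref. [12].) Explicitly, the contraction of ω with X then reads
\[
\begin{aligned}
i_X \omega ={}& \frac{1}{r!} X^{\mu_1 \dots \mu_r} dq^i \wedge dp_i^\mu \wedge d^n x_{\mu \mu_1 \dots \mu_r}
 - \frac{(-1)^r}{r!} X^{\mu_1 \dots \mu_r} dp \wedge d^n x_{\mu_1 \dots \mu_r} \\
& + \frac{(-1)^{r-1}}{(r-1)!} X^{i,\mu_2 \dots \mu_r} dp_i^\mu \wedge d^n x_{\mu \mu_2 \dots \mu_r}
 + \frac{(-1)^r}{r!} X_i^{\mu_1 \dots \mu_r} dq^i \wedge d^n x_{\mu_1 \dots \mu_r} \\
& - \frac{1}{(r-1)!} X_0^{\mu_2 \dots \mu_r} d^n x_{\mu_2 \dots \mu_r} ,
\end{aligned}
\tag{5.6}
\]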
while that of θ with X reads (−1)r µ1 ...µr µ i n 1 iX θ = X pi dq ∧ d xµµ1 ...µr + X µ1 ...µr p dn xµ1 ...µr r! r! +
1 X i,µ2 ...µr pµi dn xµµ2 ...µr , (r − 1)!
(5.7)
where, in each of the last two equations, the first term is to be omitted if r = n, whereas only the last term in the first equation remains and iX θ vanishes identically if r = n + 1. With these preliminaries out of the way, we can easily deal with the simplest case, which is that of functions.
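Proposition 5.1. A function f on $J^{\circledstar}E$ is always a Poisson 0-form. Moreover, in adapted local coordinates, the corresponding Hamiltonian n-multivector field X is, modulo terms taking values in the kernel of ω, given by
\[
\begin{aligned}
X \;=\;& -\frac{1}{(n-1)!}\, \epsilon^{\mu_2\ldots\mu_n\mu} \left( \frac{\partial f}{\partial x^\mu}\, \frac{\partial}{\partial p} \,-\, \frac{1}{n}\, \frac{\partial f}{\partial p}\, \frac{\partial}{\partial x^\mu} \right) \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge\cdots\wedge \frac{\partial}{\partial x^{\mu_n}} \\
&+ \frac{1}{(n-1)!}\, \epsilon^{\mu_2\ldots\mu_n\mu} \left( \frac{\partial f}{\partial p_i^\mu}\, \frac{\partial}{\partial q^i} \,-\, \frac{1}{n}\, \frac{\partial f}{\partial q^i}\, \frac{\partial}{\partial p_i^\mu} \right) \wedge \frac{\partial}{\partial x^{\mu_2}} \wedge\cdots\wedge \frac{\partial}{\partial x^{\mu_n}} \,.
\end{aligned}
\tag{5.8}
\]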
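Proof. First of all, observe that for functions f, the kernel condition (3.7) is void. Next, we simplify the expression (5.6), with r = n, by noting that due to our conventions (2.8), (2.9) and (2.10), we have
\[
d^n x_{\mu_1\ldots\mu_n} = \epsilon_{\mu_1\ldots\mu_n} \,, \qquad
d^n x_{\mu_2\ldots\mu_n} = \epsilon_{\mu_2\ldots\mu_n\mu}\, dx^\mu \,.
\tag{5.9}
\]
Thus
\[
\begin{aligned}
i_X\omega \;=\;& -\frac{(-1)^n}{n!}\, \epsilon_{\mu_1\ldots\mu_n}\, X^{\mu_1\ldots\mu_n}\, dp
\;+\; \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X^{i,\mu_2\ldots\mu_n}\, dp_i^\mu \\
&- \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X_i^{\mu,\mu_2\ldots\mu_n}\, dq^i
\;-\; \frac{1}{(n-1)!}\, \epsilon_{\mu_2\ldots\mu_n\mu}\, X_0^{\mu_2\ldots\mu_n}\, dx^\mu \,.
\end{aligned}
\tag{5.10}
\]
Equating this expression with the exterior derivative of f, we obtain the following system of equations:
\[
X^{\mu_1\ldots\mu_n} = (-1)^{n-1}\, \epsilon^{\mu_1\ldots\mu_n}\, \frac{\partial f}{\partial p} \,,
\tag{5.11}
\]
\[
X^{i,\mu_2\ldots\mu_n} = \epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{\partial f}{\partial p_i^\mu} \,,
\tag{5.12}
\]
\[
X_i^{\mu,\mu_2\ldots\mu_n} = -\,\epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{1}{n}\, \frac{\partial f}{\partial q^i} \,,
\tag{5.13}
\]
\[
X_0^{\mu_2\ldots\mu_n} = -\,\epsilon^{\mu_2\ldots\mu_n\mu}\, \frac{\partial f}{\partial x^\mu} \,.
\tag{5.14}
\]
Inserting this back into Eq. (5.5), with r = n, and rearranging the terms, we arrive at Eq. (5.8).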
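The step from Eq. (5.10) to Eqs. (5.11)–(5.14) relies on the standard contraction identities for the totally antisymmetric symbol, namely $\epsilon_{\mu_1\ldots\mu_n}\epsilon^{\mu_1\ldots\mu_n} = n!$ and $\epsilon_{\mu_2\ldots\mu_n\nu}\,\epsilon^{\mu_2\ldots\mu_n\mu} = (n-1)!\,\delta^\mu_\nu$. The following is a minimal numerical sketch of these two identities. It assumes that the symbols with upper and lower indices have the same numerical values, which is an assumption made here for illustration; the conventions (2.8)–(2.10) of the paper are not restated in this section. The function names are hypothetical.

```python
from itertools import product


def levi_civita(indices):
    """Sign of the permutation `indices` of (0, ..., n-1); 0 if an index repeats."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0
    sign = 1
    # Each inversion flips the sign of the permutation.
    for a in range(n):
        for b in range(a + 1, n):
            if indices[a] > indices[b]:
                sign = -sign
    return sign


def full_contraction(n):
    """Sum over all indices of eps_{m1...mn} * eps^{m1...mn}."""
    return sum(levi_civita(idx) ** 2 for idx in product(range(n), repeat=n))


def partial_contraction(n, mu, nu):
    """Sum over m2, ..., mn of eps_{m2...mn nu} * eps^{m2...mn mu}."""
    return sum(
        levi_civita(rest + (nu,)) * levi_civita(rest + (mu,))
        for rest in product(range(n), repeat=n - 1)
    )
```

With these identities, the coefficient of $dp_i^\mu$ in Eq. (5.10) reduces to $\partial f/\partial p_i^\mu$ once Eq. (5.12) is inserted, and similarly for the other three terms.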
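Remark. It has been shown in Ref. [31] that for functions h on $J^{\circledstar}E$ of the special form
\[
h(x^\mu, q^i, p_i^\mu, p) = -H(x^\mu, q^i, p_i^\mu) - p \,,
\tag{5.15}
\]
the associated Hamiltonian multivector field X can be chosen so that it defines an n-dimensional distribution in $J^{\circledstar}E$ because it is locally decomposable, that is, locally there exist vector fields $X_1, \ldots, X_n$ such that $X = X_1 \wedge\cdots\wedge X_n$ satisfies the equation $i_X\omega = dh$. Indeed, setting
\[
X_\mu = -\frac{\partial}{\partial x^\mu} + \frac{\partial h}{\partial p_i^\mu}\, \frac{\partial}{\partial q^i} - \frac{1}{n}\, \frac{\partial h}{\partial q^i}\, \frac{\partial}{\partial p_i^\mu} - \left( \frac{\partial h}{\partial x^\mu} - \frac{1}{n}\, \frac{\partial h}{\partial q^i}\, \frac{\partial h}{\partial p_i^\mu} \right) \frac{\partial}{\partial p} \,,
\tag{5.16}
\]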
we can convince ourselves that this choice of X and the choice of X made in Eq. (5.8) differ by a term taking values in the kernel of ω. Under additional assumptions, this distribution will be integrable and its integral manifolds will be the images of sections of $J^{\circledstar}E$ over M satisfying the covariant Hamiltonian equations of motion, or De Donder–Weyl equations.

Another method for constructing Poisson forms on the extended multiphase space $J^{\circledstar}E$ is from Hamiltonian forms on the ordinary multiphase space $\vec{J}^{\circledast}E$, as introduced by Kanatchikov [1, 2], pulling these back to $J^{\circledstar}E$ via the appropriate projection.

To describe the salient features of Kanatchikov’s construction, one must first of all introduce a structure on $\vec{J}^{\circledast}E$ similar to the multisymplectic form ω that exists naturally on $J^{\circledstar}E$. This requires the choice of a connection in E and of a linear connection in TM which, for the sake of convenience, will be assumed to be torsion free. Together, they induce connections in all the other bundles that are important in the multiphase space approach to field theory, including the multiphase spaces $\vec{J}^{\circledast}E$ and $J^{\circledstar}E$; for the convenience of the reader, the relevant formulas in adapted local coordinates are collected in Appendix B. In the case of $\vec{J}^{\circledast}E$, this induced connection can be used to define a “vertical multisymplectic form” $\omega^V$ which is however not closed; instead, it is annihilated under the action of a “vertical exterior derivative” $d^V$ for differential forms. In adapted local coordinates, these objects can be written in the form
\[
\omega^V = e^i \wedge e_i^\mu \wedge d^n x_\mu + \cdots
\tag{5.17}
\]
and
\[
d^V = e^i \wedge \frac{\partial}{\partial q^i} + e_i^\mu \wedge \frac{\partial}{\partial p_i^\mu}
\tag{5.18}
\]
respectively, where $e^i = dq^i + \Gamma^i_\nu\, dx^\nu$ and $e_i^\mu = dp_i^\mu - (\partial_i \Gamma^j_\kappa\, p_j^\mu - \Gamma^\mu_{\kappa\lambda}\, p_i^\lambda + \Gamma^\rho_{\kappa\rho}\, p_i^\mu)\, dx^\kappa$ are vertical 1-forms (with respect to the aforementioned induced connection): the dots in the definition of $\omega^V$ indicate n-horizontal terms that are not important here, while the partial derivatives in the definition of $d^V$ are meant to act on the coefficient functions. As shown by one of the present authors [32], $d^V$ is still a cohomology operator, i.e. it has square zero. Then the Hamiltonian forms as defined by Kanatchikov can be shown to be precisely the horizontal forms $\tilde{f}$ on $\vec{J}^{\circledast}E$ satisfying the equation
\[
i_{\tilde{X}}\, \omega^V = d^V \tilde{f} \,,
\tag{5.19}
\]
where $\tilde{X}$ is a multivector field on $\vec{J}^{\circledast}E$; this relation is of course completely analogous to our equation (3.3/3.4). Moreover, Kanatchikov introduces a Poisson bracket between Hamiltonian forms $\tilde{f}$ of degree n − r and $\tilde{g}$ of degree n − s, with multivector fields $\tilde{X}$ of degree r and $\tilde{Y}$ of degree s corresponding to $\tilde{f}$ and to $\tilde{g}$ according to Eq. (5.19), by setting
\[
\{\tilde{f}, \tilde{g}\}^V = (-1)^{r(s-1)}\, i_{\tilde{Y}}\, i_{\tilde{X}}\, \omega^V \,.
\tag{5.20}
\]
This Poisson bracket satisfies the analogue of the graded Jacobi identity (3.15).
We will now show how this approach can be naturally incorporated into the multisymplectic framework used in the present paper.

Proposition 5.2. Under the canonical projection from extended multiphase space $J^{\circledstar}E$ to ordinary multiphase space $\vec{J}^{\circledast}E$, every Hamiltonian form $\tilde{f}$ on $\vec{J}^{\circledast}E$ as defined by Kanatchikov pulls back to a horizontal Poisson form f on $J^{\circledstar}E$. Conversely, every horizontal Poisson form f of degree > 0 on $J^{\circledstar}E$ is obtained in this way. Moreover, the Hamiltonian multivector field X on $J^{\circledstar}E$ corresponding to f can be chosen so as to project to a Hamiltonian multivector field $\tilde{X}$ on $\vec{J}^{\circledast}E$ corresponding to $\tilde{f}$.

Proof. We begin by analyzing the properties of Poisson forms f of degree n − r (0 < r < n) on $J^{\circledstar}E$ which are horizontal. Being horizontal, such a form trivially satisfies the kernel condition (3.7) and its expansion in adapted local coordinates is
\[
f = \frac{1}{r!}\, f^{\mu_1\ldots\mu_r}\, d^n x_{\mu_1\ldots\mu_r} \,,
\]
implying
\[
\begin{aligned}
df \;=\;& \frac{1}{(r-1)!}\, \frac{\partial f^{\mu_2\ldots\mu_r\nu}}{\partial x^\nu}\, d^n x_{\mu_2\ldots\mu_r}
\;+\; \frac{1}{r!}\, \frac{\partial f^{\mu_1\ldots\mu_r}}{\partial q^i}\, dq^i \wedge d^n x_{\mu_1\ldots\mu_r} \\
&+ \frac{1}{r!}\, \frac{\partial f^{\mu_1\ldots\mu_r}}{\partial p_k^\kappa}\, dp_k^\kappa \wedge d^n x_{\mu_1\ldots\mu_r}
\;+\; \frac{1}{r!}\, \frac{\partial f^{\mu_1\ldots\mu_r}}{\partial p}\, dp \wedge d^n x_{\mu_1\ldots\mu_r} \,.
\end{aligned}
\]
Comparing this formula with Eq. (5.6), we see that f being a Hamiltonian form implies first of all that X must be 1-vertical since the coefficients $X^{\mu_1\ldots\mu_r}$ give a contribution to $i_X\omega$ proportional to $dq^i \wedge dp_i^\mu \wedge d^n x_{\mu\mu_1\ldots\mu_r}$ which is absent from df. But this implies that $i_X\omega$ contains no terms proportional to $dp \wedge d^n x_{\mu_1\ldots\mu_r}$ either and hence the coefficients $f^{\mu_1\ldots\mu_r}$ cannot depend on the energy variable p; the same then goes for all the coefficients of X. Therefore, f is the pull-back of a horizontal form $\tilde{f}$ on $\vec{J}^{\circledast}E$ whereas X projects onto a 1-vertical multivector field $\tilde{X}$ on $\vec{J}^{\circledast}E$ whose expansion in terms of adapted local coordinates is given by the second and third term in Eq. (5.5). Finally, we see that with these relations between the various objects involved, Eq. (3.3, 3.4) becomes equivalent to Eq. (5.19) plus the relation
\[
X_0^{\mu_2\ldots\mu_r} = -\,\frac{\partial f^{\mu_2\ldots\mu_r\nu}}{\partial x^\nu} \,,
\]
which has no counterpart in $\vec{J}^{\circledast}E$ but also does not convey any additional information.
Finally, the fact that the Poisson bracket (5.20) introduced by Kanatchikov, when pulled back from $\vec{J}^{\circledast}E$ to $J^{\circledstar}E$, coincides with the Poisson bracket defined by Eq. (3.12) follows from the following simple observation.

Proposition 5.3. Let f and g be two horizontal Poisson forms on $J^{\circledstar}E$ of respective degrees n − r and n − s, with corresponding 1-vertical Hamiltonian multivector fields X and Y of respective degrees r and s. Then the definition (3.12) of their Poisson bracket reduces to the pull-back of Eq. (5.20):
\[
\{f, g\} = (-1)^{r(s-1)}\, i_Y\, i_X\, \omega \,.
\tag{5.21}
\]
Proof. As we have seen in the proof of the preceding proposition, f and g being horizontal forces X and Y to be 1-vertical, so $i_Y f$ and $i_X g$ vanish. Similarly, Eq. (5.7) shows that $i_X\theta$ and $i_Y\theta$ are horizontal, so $i_Y i_X \theta$ and $i_X i_Y \theta$ vanish. Therefore, the exact correction term of Eq. (3.12) does not contribute in this case. Finally, X ∧ Y will be 2-vertical, so contraction of the pull-back of $\omega^V$ or of ω with X and Y gives the same result, implying that Eq. (5.21) is really the pull-back of Eq. (5.20).

Remark. In the case of horizontal Poisson forms, one can also introduce an associative product, which has been found by Kanatchikov [2]:
\[
f \bullet g = *^{-1}(*f \wedge *g) \,,
\tag{5.22}
\]
where ∗ is the Hodge star operator on M associated to some metric which can be transported to horizontal forms on $J^{\circledast}E$ in an obvious manner. With respect to this product, the Poisson bracket (5.21) satisfies a graded Leibniz rule
\[
\{f, g \bullet h\} = \{f, g\} \bullet h + (-1)^{(r-1)s}\, g \bullet \{f, h\} \,.
\tag{5.23}
\]
However, this product cannot be extended in any natural way to arbitrary Poisson forms. To see this, suppose we had such an extension at hand. Then we could define a space of vertical covectors at every point of $J^{\circledstar}E$ by requiring it to consist of all covectors that vanish when multiplied by a horizontal (n − 1)-form, which would be equivalent to the choice of a connection.

Appendix A. Multivector calculus on manifolds

The extension of the usual calculus on manifolds from vector fields to multivector fields is by now well known, although it does not seem to be treated in any of the standard textbooks on the subject. Moreover, there is a certain amount of ambiguity concerning sign conventions. Our sign conventions follow those of Tulczyjew [33], but for the sake of completeness we shall briefly expose the structural properties that naturally motivate these choices. Multivector fields of degree r on a manifold are sections of the rth exterior power of its tangent bundle: they are the dual objects to differential forms of degree r, which are sections of the rth exterior power of its cotangent bundle. Every known natural operation involving vector fields, such as the contraction on differential forms, the Lie bracket and the Lie derivative, has a natural extension to multivector fields: this is the subject of an area of differential geometry that we simply refer to as “multivector calculus”. The most important and the ones that we need in this
paper are (a) the Schouten bracket between multivector fields, (b) the contraction of a differential form with a multivector field and (c) the Lie derivative of a differential form along a multivector field. Throughout this appendix, let M be an n-dimensional manifold, $\mathfrak{F}(M)$ the commutative algebra of functions on M (with respect to pointwise multiplication), $\mathfrak{X}(M)$ the space of vector fields on M and
\[
\mathfrak{X}^{\wedge}(M) = \bigoplus_{r=0}^{n} \textstyle\bigwedge^{r} \mathfrak{X}(M)
\]
the supercommutative superalgebra of multivector fields on M (with respect to pointwise exterior multiplication).

A.1. The Schouten bracket

The Schouten bracket between multivector fields constitutes the natural, canonical extension both of the Lie bracket between vector fields and of the Lie derivative of multivector fields (as special tensor fields) along vector fields. Starting from the Lie derivative of multivector fields along vector fields, it can be defined by imposing a Leibniz rule with respect to the exterior product of multivector fields, as in Eq. (A.4) below.

Proposition A.1. There exists a unique R-bilinear map
\[
[\,\cdot\,,\cdot\,] : \mathfrak{X}^{\wedge}(M) \times \mathfrak{X}^{\wedge}(M) \to \mathfrak{X}^{\wedge}(M)
\tag{A.1}
\]
called the Schouten bracket, with the following properties.

1. It is homogeneous of degree −1 with respect to the standard tensor degree, i.e.
\[
\deg X = r \,,\ \deg Y = s \ \Rightarrow\ \deg [X, Y] = r + s - 1 \,.
\tag{A.2}
\]
2. It is graded antisymmetric: if X has tensor degree r and Y has tensor degree s, then
\[
[Y, X] = -(-1)^{(r-1)(s-1)}\, [X, Y] \,.
\tag{A.3}
\]
3. It coincides with the standard Lie bracket on vector fields.

4. It satisfies the graded Leibniz rule: if X has tensor degree r, Y has tensor degree s and Z has tensor degree t, then
\[
[X, Y \wedge Z] = [X, Y] \wedge Z + (-1)^{(r-1)s}\, Y \wedge [X, Z] \,.
\tag{A.4}
\]
5. It satisfies the graded Jacobi identity: if X has tensor degree r, Y has tensor degree s and Z has tensor degree t, then
\[
(-1)^{(r-1)(t-1)}\, [X, [Y, Z]] + \text{cyclic perm.} = 0 \,.
\tag{A.5}
\]
We shall not prove this proposition here but just point out that uniqueness of an operation with the properties stipulated above follows from the required R-bilinearity (not $\mathfrak{F}(M)$-bilinearity, of course), the homogeneity (A.2), the graded antisymmetry (A.3) and the graded Leibniz rule (A.4) alone; existence can then be proved, for example, by showing that the resulting local coordinate formula satisfies all these requirements. Moreover, the validity of the graded Jacobi identity (A.5) can be derived from the standard Jacobi identity for the Lie bracket of vector fields by means of the graded Leibniz rule (A.4), using induction on the degree. An explicit formula which is slightly more general than the local coordinate formula just mentioned and often useful in practical applications is that for the Schouten bracket between decomposable multivector fields; it follows directly from the same kind of argument and states that for any r + s vector fields $X_1, \ldots, X_r$ and $Y_1, \ldots, Y_s$, we have
\[
[X_1 \wedge\cdots\wedge X_r ,\, Y_1 \wedge\cdots\wedge Y_s]
= \sum_{i=1}^{r} \sum_{j=1}^{s} (-1)^{i+j}\, [X_i, Y_j] \wedge X_1 \wedge\cdots\wedge X_{i-1} \wedge X_{i+1} \wedge\cdots\wedge X_r \wedge Y_1 \wedge\cdots\wedge Y_{j-1} \wedge Y_{j+1} \wedge\cdots\wedge Y_s \,.
\tag{A.6}
\]
Note also that there is a graded Leibniz rule in the other factor as well: it follows from the one written down above by using graded antisymmetry and reads, for X of tensor degree r, Y of tensor degree s and Z of tensor degree t,
\[
[X \wedge Y, Z] = (-1)^{(t-1)s}\, [X, Z] \wedge Y + X \wedge [Y, Z] \,.
\tag{A.7}
\]
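As a small worked check of Eqs. (A.6) and (A.7), consider the simplest non-trivial case r = 2, s = 1, that is, a bivector field $X_1 \wedge X_2$ and a vector field Y. The following lines are only an illustration of how the signs work out; they use nothing beyond the two formulas just stated.

```latex
% Eq. (A.6) with r = 2, s = 1 (so j = 1 only):
\begin{align*}
[X_1 \wedge X_2 ,\, Y]
  &= (-1)^{1+1}\, [X_1, Y] \wedge X_2 \;+\; (-1)^{2+1}\, [X_2, Y] \wedge X_1 \\
  &= [X_1, Y] \wedge X_2 \;-\; [X_2, Y] \wedge X_1 \\
  &= [X_1, Y] \wedge X_2 \;+\; X_1 \wedge [X_2, Y] .
\end{align*}
% Eq. (A.7) with X = X_1, Y = X_2 (s = 1), Z = Y (t = 1):
%   (-1)^{(t-1)s} = (-1)^0 = 1, giving the same right-hand side.
```

Both formulas therefore reproduce the familiar statement that the Lie derivative along a vector field acts as an ordinary derivation on the exterior product of vector fields.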
Finally, a word seems in order on the adequate choice of signs and degrees. Indeed, one recognizes Eqs. (A.2), (A.3) and (A.5) as the graded homogeneity, the graded antisymmetry and the graded Jacobi identity familiar from the definition of a Lie superalgebra, provided one assigns to every multivector field X of tensor degree r the parity $(-1)^{r-1}$: this means that X is even with respect to the Schouten bracket if it has odd tensor degree and is odd with respect to the Schouten bracket if it has even tensor degree! This switch can be better understood by realizing that, according to Eq. (A.2), the operator ad(X) = [X, · ] changes the tensor degree of any multivector field that it operates on by r − 1. The same argument explains the sign that appears in the graded Leibniz identity (A.4), which can be thought of as stating that the operator ad(X) = [X, · ] should be a superderivation with respect to the exterior product and, more precisely, an even or odd superderivation according to whether X is even or odd with respect to the Schouten bracket. We can also think of this operator as defining the Lie derivative $L_X$ of multivector fields along X (possibly up to signs, which are a matter of convention), but this will not be needed here. Algebraically, the situation can be summarized by stating that $\mathfrak{X}^{\wedge}(M)$ is a Poisson superalgebra, the supersymmetric analogue of a Poisson algebra, which is the structure encountered, for example, on the space of functions on a symplectic manifold or, more generally, a Poisson manifold. The surprising aspect is that this
intricate structure requires no additional structure whatsoever on the underlying manifold.

A.2. Lie derivative of differential forms along multivector fields

We now come to the other two operations of multivector calculus mentioned at the beginning of this appendix, namely the contraction of differential forms with multivector fields and the Lie derivative of differential forms along multivector fields. The case of contraction is easy. First, the contraction of a differential form α with a decomposable multivector field $X_1 \wedge\cdots\wedge X_r$ is simply defined as repeated contraction with its constituents (which by convention should be performed in the opposite order):
\[
i_{X_1 \wedge\cdots\wedge X_r}\, \alpha = i_{X_r} \cdots i_{X_1}\, \alpha \,.
\tag{A.8}
\]
This is then extended to arbitrary (non-decomposable) multivector fields X by $\mathfrak{F}(M)$-linearity. (Here, of course, one uses that contraction is a purely algebraic operation; it would not work so naively if we were dealing with a differential operator.) The Lie derivative $L_X\alpha$ of a differential form α along a multivector field X is most conveniently defined by a generalization of a well known formula for vector fields.

Definition A.2. On differential forms, the Lie derivative $L_X$ along a multivector field X of tensor degree r is defined as the supercommutator of the exterior derivative d and the contraction operator $i_X$:
\[
L_X \alpha = d\, i_X \alpha - (-1)^r\, i_X\, d\alpha \,.
\tag{A.9}
\]
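The ordering convention of Eq. (A.8), which enters Definition A.2 through the operator $i_X$, can be illustrated pointwise with a few lines of code. The sketch below is not taken from the paper: it assumes constant coefficients, represents a form as a dictionary mapping strictly increasing index tuples to coefficients and a vector as a dictionary mapping an index to its component, and uses hypothetical function names.

```python
def contract_vector(vector, form):
    """Contraction i_v(form): insert `vector` into the first slot of `form`.

    `form` maps increasing index tuples (a1, ..., ak) to the coefficient of
    dx^{a1} ^ ... ^ dx^{ak}; `vector` maps an index a to the component v^a.
    """
    result = {}
    for indices, coeff in form.items():
        for position, a in enumerate(indices):
            component = vector.get(a, 0)
            if component == 0:
                continue
            # Moving dx^a to the front costs a sign (-1)^position.
            remaining = indices[:position] + indices[position + 1:]
            term = (-1) ** position * component * coeff
            result[remaining] = result.get(remaining, 0) + term
    return {idx: c for idx, c in result.items() if c != 0}


def contract_decomposable(vectors, form):
    """Contraction with X1 ^ ... ^ Xr as in Eq. (A.8): i_{Xr} ... i_{X1} form."""
    for vector in vectors:  # X1 acts first, Xr acts last
        form = contract_vector(vector, form)
    return form
```

For example, with `e1 = {1: 1}`, `e2 = {2: 1}` and `area = {(1, 2): 1}`, the call `contract_decomposable([e1, e2], area)` returns `{(): 1}`, whereas reversing the order of the two vectors returns `{(): -1}`, in line with the antisymmetry of the exterior product.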
According to the rules of supersymmetry, the sign of the second term in Eq. (A.9) is fixed by observing that d is an odd operator (it is of degree 1 since it raises the tensor degree of forms by 1) while $i_X$ is an even/odd operator if r is even/odd (it is of degree −r since it lowers the tensor degree of forms by r).

Proposition A.3. Given any two multivector fields X of tensor degree r and Y of tensor degree s, we have for any differential form α
\[
d\, L_X \alpha = (-1)^{r-1}\, L_X\, d\alpha \,,
\tag{A.10}
\]
\[
i_{[X,Y]}\, \alpha = (-1)^{(r-1)s}\, L_X\, i_Y \alpha - i_Y\, L_X \alpha \,,
\tag{A.11}
\]
\[
L_{[X,Y]}\, \alpha = (-1)^{(r-1)(s-1)}\, L_X L_Y \alpha - L_Y L_X \alpha \,,
\tag{A.12}
\]
\[
L_{X \wedge Y}\, \alpha = (-1)^s\, i_Y L_X \alpha + L_Y\, i_X \alpha \,.
\tag{A.13}
\]
Proof. The first formula, Eq. (A.10), is an immediate consequence of the definition (A.9), since $d^2 = 0$. Next, the last formula, Eq. (A.13), can be proved by direct calculation:
\[
\begin{aligned}
L_{X \wedge Y}\, \alpha &= d(i_{X \wedge Y}\, \alpha) - (-1)^{r+s}\, i_{X \wedge Y}\, d\alpha \\
&= d(i_Y i_X \alpha) - (-1)^{r+s}\, i_Y i_X\, d\alpha \\
&= d(i_Y i_X \alpha) - (-1)^s\, i_Y\, d(i_X \alpha) + (-1)^s\, i_Y\, d(i_X \alpha) - (-1)^{r+s}\, i_Y i_X\, d\alpha \\
&= L_Y\, i_X \alpha + (-1)^s\, i_Y L_X \alpha \,.
\end{aligned}
\]
Next, observe that Eq. (A.11) is well known to be true when X and Y are vector fields. The general case follows by induction on the tensor degree of both factors. Indeed, if X, Y and Z are multivector fields of tensor degree r, s and t, respectively, such that Eq. (A.11) holds for [X, Y] and for [X, Z], one can use the graded Leibniz rule (A.4) to derive that it also holds for [X, Y ∧ Z]:
\[
\begin{aligned}
i_{[X, Y \wedge Z]}\, \alpha &= i_{[X,Y] \wedge Z}\, \alpha + (-1)^{(r-1)s}\, i_{Y \wedge [X,Z]}\, \alpha \\
&= i_Z\, i_{[X,Y]}\, \alpha + (-1)^{(r-1)s}\, i_{[X,Z]}\, i_Y \alpha \\
&= (-1)^{(r-1)s}\, i_Z L_X i_Y \alpha - i_Z i_Y L_X \alpha + (-1)^{(r-1)s+(r-1)t}\, L_X i_Z i_Y \alpha - (-1)^{(r-1)s}\, i_Z L_X i_Y \alpha \\
&= (-1)^{(r-1)(s+t)}\, L_X\, i_{Y \wedge Z}\, \alpha - i_{Y \wedge Z}\, L_X \alpha \,.
\end{aligned}
\]
Similarly, if X, Y and Z are multivector fields of tensor degree r, s and t, respectively, such that Eq. (A.11) holds for [X, Z] and for [Y, Z], one can use the graded Leibniz rule (A.7) together with Eq. (A.13) to derive that it also holds for [X ∧ Y, Z]:
\[
\begin{aligned}
i_{[X \wedge Y, Z]}\, \alpha &= (-1)^{(t-1)s}\, i_{[X,Z] \wedge Y}\, \alpha + i_{X \wedge [Y,Z]}\, \alpha \\
&= (-1)^{(t-1)s}\, i_Y\, i_{[X,Z]}\, \alpha + i_{[Y,Z]}\, i_X \alpha \\
&= (-1)^{(t-1)s+(r-1)t}\, i_Y L_X i_Z \alpha - (-1)^{(t-1)s}\, i_Y i_Z L_X \alpha + (-1)^{(s-1)t}\, L_Y i_Z i_X \alpha - i_Z L_Y i_X \alpha \\
&= (-1)^{(r+s-1)t+s}\, i_Y L_X i_Z \alpha - (-1)^s\, i_Z i_Y L_X \alpha + (-1)^{(r+s-1)t}\, L_Y i_X i_Z \alpha - i_Z L_Y i_X \alpha \\
&= (-1)^{(r+s-1)t}\, L_{X \wedge Y}\, i_Z \alpha - i_Z\, L_{X \wedge Y}\, \alpha \,.
\end{aligned}
\]
Finally, Eq. (A.12) can now again be proved by direct calculation:
\[
\begin{aligned}
L_{[X,Y]}\, \alpha &= d\, i_{[X,Y]}\, \alpha + (-1)^{r+s}\, i_{[X,Y]}\, d\alpha \\
&= (-1)^{(r-1)s}\, d L_X i_Y \alpha - d\, i_Y L_X \alpha + (-1)^{r(s-1)}\, L_X i_Y\, d\alpha - (-1)^{r+s}\, i_Y L_X\, d\alpha \\
&= (-1)^{(r-1)(s-1)}\, L_X\, d\, i_Y \alpha - d\, i_Y L_X \alpha + (-1)^{r(s-1)}\, L_X i_Y\, d\alpha + (-1)^s\, i_Y\, d L_X \alpha \\
&= (-1)^{(r-1)(s-1)}\, L_X L_Y \alpha - L_Y L_X \alpha \,.
\end{aligned}
\]

Appendix B. Induced connections

In this appendix we want to describe briefly the construction of various induced connections in jet bundle language. First of all, if E is a fiber bundle over M, we shall view a connection in E as a section $\Gamma_E$ of the first order jet bundle JE of E, considered as an affine bundle over E; see [34, Ch. IV.17]. In adapted local coordinates $(x^\mu, q^i)$ for E and $(x^\mu, q^i, q^i_\mu)$ for JE, this section is given by
\[
\Gamma_E : (x^\mu, q^i) \mapsto (x^\mu, q^i, \Gamma^i_\mu(x, q)) \,.
\]
Next, if V is a vector bundle over M, a linear connection in V is given by a section $\Gamma_V$ of JV over V that depends linearly on the fiber coordinates. In adapted local coordinates $(x^\mu, v^i)$ for V and $(x^\mu, v^i, v^i_\mu)$ for JV, this section is given by
\[
\Gamma_V : (x^\mu, v^i) \mapsto (x^\mu, v^i, \Gamma^i_{\mu j}(x)\, v^j) \,,
\]
where the $\Gamma^i_{\mu j}$ are of course the connection coefficients (gauge potentials) associated with the corresponding covariant derivative. In particular, a linear connection in the tangent bundle TM of the base manifold M corresponds to a section $\Gamma_{TM}$ of J(TM) over TM which, in adapted local coordinates $(x^\mu, \dot{x}^\kappa)$ for TM and $(x^\mu, \dot{x}^\kappa, \dot{x}^\kappa_\mu)$ for J(TM), is given by
\[
\Gamma_{TM} : (x^\mu, \dot{x}^\kappa) \mapsto (x^\mu, \dot{x}^\kappa, \Gamma^\kappa_{\mu\lambda}(x)\, \dot{x}^\lambda) \,,
\]
where the $\Gamma^\kappa_{\mu\lambda}$ are of course the corresponding Christoffel symbols. Now given a fiber bundle E over M together with a connection in E and a linear connection in TM, we can introduce induced connections in all the various induced bundles that appear in this paper, regarded as fiber bundles over M, not over E. (This means that jets of sections will contain just one additional lower space-time index for counting partial derivatives with respect to the space-time variables.)
The simplest way to describe them is by introducing adapted local coordinates $(x^\mu, q^i)$ for E as before; then the local coefficient functions of the induced connections with respect to the induced adapted local coordinates can be expressed directly in terms of the local coefficient functions $\Gamma^i_\mu$ and $\Gamma^\kappa_{\mu\lambda}$ of the original two connections with respect to the original adapted local coordinates, as follows.

• The vertical bundle VE of E: in adapted local coordinates $(x^\mu, q^i, \dot{q}^k)$ for VE and $(x^\mu, q^i, \dot{q}^k, q^i_\mu, \dot{q}^k_\mu)$ for J(VE), the induced connection maps $(x^\mu, q^i, \dot{q}^k)$ to
\[ (x^\mu, q^i, \dot{q}^k, \Gamma^i_\mu(x,q), \partial_l \Gamma^k_\mu(x,q)\, \dot{q}^l) \,. \]
• The dual vertical bundle $V^*E$ of E: in adapted local coordinates $(x^\mu, q^i, p_k)$ for $V^*E$ and $(x^\mu, q^i, p_k, q^i_\mu, p_{\mu,k})$ for $J(V^*E)$, the induced connection maps $(x^\mu, q^i, p_k)$ to
\[ (x^\mu, q^i, p_k, \Gamma^i_\mu(x,q), -\partial_k \Gamma^l_\mu(x,q)\, p_l) \,. \]
• The pull-back $\pi^*(TM)$ of the tangent bundle TM of M to E: in adapted local coordinates $(x^\mu, q^i, \dot{x}^\kappa)$ for $\pi^*(TM)$ and $(x^\mu, q^i, \dot{x}^\kappa, q^i_\mu, \dot{x}^\kappa_\mu)$ for $J(\pi^*(TM))$, the induced connection maps $(x^\mu, q^i, \dot{x}^\kappa)$ to
\[ (x^\mu, q^i, \dot{x}^\kappa, \Gamma^i_\mu(x,q), \Gamma^\kappa_{\mu\lambda}(x)\, \dot{x}^\lambda) \,. \]
• The pull-back $\pi^*(T^*M)$ of the cotangent bundle $T^*M$ of M to E: in adapted local coordinates $(x^\mu, q^i, \alpha_\kappa)$ for $\pi^*(T^*M)$ and $(x^\mu, q^i, \alpha_\kappa, q^i_\mu, \alpha_{\mu,\kappa})$ for $J(\pi^*(T^*M))$, the induced connection maps $(x^\mu, q^i, \alpha_\kappa)$ to
\[ (x^\mu, q^i, \alpha_\kappa, \Gamma^i_\mu(x,q), -\Gamma^\lambda_{\mu\kappa}(x)\, \alpha_\lambda) \,. \]
• The pull-back $\pi^*(\bigwedge^n T^*M)$ of the bundle $\bigwedge^n T^*M$ of volume forms on M to E: in adapted local coordinates $(x^\mu, q^i, \epsilon)$ for $\pi^*(\bigwedge^n T^*M)$ and $(x^\mu, q^i, \epsilon, q^i_\mu, \epsilon_\mu)$ for $J(\pi^*(\bigwedge^n T^*M))$, the induced connection maps $(x^\mu, q^i, \epsilon)$ to
\[ (x^\mu, q^i, \epsilon, \Gamma^i_\mu(x,q), -\Gamma^\rho_{\mu\rho}(x)\, \epsilon) \,. \]
• The linearized jet bundle $\vec{J}E$ of E: in adapted local coordinates $(x^\mu, q^i, \vec{q}^{\,k}_\kappa)$ for $\vec{J}E$ and $(x^\mu, q^i, \vec{q}^{\,k}_\kappa, q^i_\mu, \vec{q}^{\,k}_{\mu,\kappa})$ for $J(\vec{J}E)$, the induced connection maps $(x^\mu, q^i, \vec{q}^{\,k}_\kappa)$ to
\[ (x^\mu, q^i, \vec{q}^{\,k}_\kappa, \Gamma^i_\mu(x,q), \partial_l \Gamma^k_\mu(x,q)\, \vec{q}^{\,l}_\kappa - \Gamma^\lambda_{\mu\kappa}(x)\, \vec{q}^{\,k}_\lambda) \,. \]
• The jet bundle JE of E: in adapted local coordinates $(x^\mu, q^i, q^k_\kappa)$ for JE and $(x^\mu, q^i, q^k_\kappa, q^i_\mu, q^k_{\mu,\kappa})$ for J(JE), the induced connection maps $(x^\mu, q^i, q^k_\kappa)$ to
\[ (x^\mu, q^i, q^k_\kappa, \Gamma^i_\mu(x,q), \partial_l \Gamma^k_\mu(x,q)\,(q^l_\kappa - \Gamma^l_\kappa(x,q)) - \Gamma^\lambda_{\mu\kappa}(x)\,(q^k_\lambda - \Gamma^k_\lambda(x,q))) \,. \]
• Ordinary multiphase space $\vec{J}^{\circledast}E$: in adapted local coordinates $(x^\mu, q^i, p^\kappa_k)$ for $\vec{J}^{\circledast}E$ and $(x^\mu, q^i, p^\kappa_k, q^i_\mu, p^\kappa_{\mu,k})$ for $J(\vec{J}^{\circledast}E)$, the induced connection maps $(x^\mu, q^i, p^\kappa_k)$ to
\[ (x^\mu, q^i, p^\kappa_k, \Gamma^i_\mu(x,q), -\partial_k \Gamma^l_\mu(x,q)\, p^\kappa_l + \Gamma^\kappa_{\mu\lambda}(x)\, p^\lambda_k - \Gamma^\rho_{\mu\rho}(x)\, p^\kappa_k) \,. \]
• Extended multiphase space $J^{\circledstar}E$: in adapted local coordinates $(x^\mu, q^i, p^\kappa_k, p)$ for $J^{\circledstar}E$ and $(x^\mu, q^i, p^\kappa_k, p, q^i_\mu, p^\kappa_{\mu,k}, p_\mu)$ for $J(J^{\circledstar}E)$, the induced connection maps $(x^\mu, q^i, p^\kappa_k, p)$ to
\[
\begin{aligned}
(x^\mu, q^i, p^\kappa_k, p,\ & \Gamma^i_\mu(x,q),\ -\partial_k \Gamma^l_\mu(x,q)\, p^\kappa_l + \Gamma^\kappa_{\mu\lambda}(x)\, p^\lambda_k - \Gamma^\rho_{\mu\rho}(x)\, p^\kappa_k, \\
& -\Gamma^\rho_{\mu\rho}(x)\, p - (\partial_\mu \Gamma^j_\nu(x,q) - \Gamma^k_\nu(x,q)\, \partial_k \Gamma^j_\mu(x,q) - \Gamma^\kappa_{\mu\nu}(x)\, \Gamma^j_\kappa(x,q))\, p^\nu_j) \,.
\end{aligned}
\]
Table 1. Correspondence of important concepts in the multiphase space approach: time-dependent mechanics versus field theory.

Mechanics | Field Theory
Extended configuration space R × Q, where R is the time axis | Configuration bundle E over M with typical fibre Q, where M is the space-time manifold
Extended velocity space R × TQ | Velocity bundle: jet bundle JE
Doubly extended phase space $P = T^*(R \times Q) = R \times T^*Q \times R$ | Extended multiphase space: twisted affine dual of JE, $P = J^{\circledstar}E = J^{\star}E \otimes \bigwedge^n T^*M$
Simply extended phase space $P_0 = R \times T^*Q$ | Ordinary multiphase space: twisted linear dual of $\vec{J}E$, $P_0 = \vec{J}^{\circledast}E = \vec{J}^{*}E \otimes \bigwedge^n T^*M$
Local coordinates for R × Q: $t, q^i$ | Local coordinates for E: $x^\mu, q^i$
Local coordinates for R × TQ: $t, q^i, \dot{q}^i$ | Local coordinates for JE: $x^\mu, q^i, q^i_\mu$
Local coordinates for P: $t, q^i, p_i, E$ | Local coordinates for P: $x^\mu, q^i, p_i^\mu, p$
Local coordinates for $P_0$: $t, q^i, p_i$ | Local coordinates for $P_0$: $x^\mu, q^i, p_i^\mu$
Projection from P to $P_0$: $(t, q^i, p_i, E) \mapsto (t, q^i, p_i)$ | Projection from P to $P_0$: $(x^\mu, q^i, p_i^\mu, p) \mapsto (x^\mu, q^i, p_i^\mu)$
Canonical 1-form on P: $\theta = p_i\, dq^i + E\, dt$ | Multicanonical n-form on P: $\theta = p_i^\mu\, dq^i \wedge d^n x_\mu + p\, d^n x$
Symplectic 2-form ω = −dθ on P, non-degenerate: $\omega = dq^i \wedge dp_i - dE \wedge dt$ | Multisymplectic (n + 1)-form ω = −dθ on P, non-degenerate (on vector fields): $\omega = dq^i \wedge dp_i^\mu \wedge d^n x_\mu - dp \wedge d^n x$
Hamiltonian is a function on $P_0$ | Hamiltonian is a section of P (as an affine line bundle over $P_0$)
$i_X\omega = df$: Hamiltonian vector fields X ↔ functions f | $i_X\omega = df$: Hamiltonian r-multivector fields X ↔ Hamiltonian or Poisson (n − r)-forms f
Poisson bracket for functions $f, g \in C^\infty(P)$: $\{f, g\} = L_Y f - L_X g$ | Poisson bracket for Poisson forms $f \in \Omega_P^{n-r}(P)$, $g \in \Omega_P^{n-s}(P)$: $\{f, g\} = (-1)^{(r-1)(s-1)} L_Y f - L_X g - (-1)^{(r-1)s} L_{X \wedge Y}\theta$
Hamiltonian equations: $\partial H/\partial p_i = \dot{q}^i$, $\partial H/\partial q^i = -\dot{p}_i$ | De Donder–Weyl equations: $\partial H/\partial p_i^\mu = \partial q^i/\partial x^\mu$, $\partial H/\partial q^i = -\partial p_i^\mu/\partial x^\mu$
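To make the last row of Table 1 concrete, the following worked example applies the De Donder–Weyl equations to a simple Hamiltonian. The Hamiltonian, the single field component q, the constant metric $g_{\mu\nu}$ and the potential V are assumptions chosen here purely for illustration; they do not appear in the text above.

```latex
% Assumed illustrative Hamiltonian (one field component, constant metric g):
\[
  H(q, p^\mu) \;=\; \tfrac{1}{2}\, g_{\mu\nu}\, p^\mu p^\nu + V(q) .
\]
% De Donder--Weyl equations from the last row of Table 1:
\[
  \frac{\partial q}{\partial x^\mu} = \frac{\partial H}{\partial p^\mu} = g_{\mu\nu}\, p^\nu ,
  \qquad
  \frac{\partial p^\mu}{\partial x^\mu} = -\frac{\partial H}{\partial q} = -V'(q) .
\]
% Eliminating p^\mu = g^{\mu\nu} \partial_\nu q gives a second-order field equation:
\[
  g^{\mu\nu}\, \partial_\mu \partial_\nu q + V'(q) = 0 .
\]
% Mechanical analogue (left column): H = p^2/2m + V(q) gives
%   \dot q = p/m ,  \dot p = -V'(q) ,  i.e.  m \ddot q + V'(q) = 0 .
```

In both columns the first equation expresses the momenta through first derivatives of the configuration variables, and the second equation then yields the second-order equation of motion.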
Acknowledgments

Two of the authors (M.F. and H.R.) wish to gratefully acknowledge the financial support of FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo, Brazil) which made this collaboration possible.
References

[1] I. V. Kanatchikov, On Field Theoretic Generalizations of a Poisson Algebra, Rep. Math. Phys. 40 (1997), 225–234, hep-th/9710069.
[2] I. V. Kanatchikov, Canonical Structure of Classical Field Theory in the Polymomentum Phase Space, Rep. Math. Phys. 41 (1998), 49–90, hep-th/9709229.
[3] J. F. Cariñena, M. Crampin and L. A. Ibort, On the Multisymplectic Formalism for First Order Field Theories, Diff. Geom. Appl. 1 (1991), 345–374.
[4] M. J. Gotay, J. Isenberg and J. E. Marsden, Momentum Maps and Classical Relativistic Fields I: Covariant Field Theory, physics/9801019.
[5] M. Forger and H. Römer, A Poisson Bracket on Multisymplectic Phase Space, Rep. Math. Phys. 48 (2001), 211–218, math-ph/0009037.
[6] H. Goldschmidt and S. Sternberg, The Hamilton-Cartan Formalism in the Calculus of Variations, Ann. Inst. Four. 23 (1973), 203–267.
[7] V. Guillemin and S. Sternberg, Geometric Asymptotics, Mathematical Surveys, Vol. 14, American Mathematical Society, Providence 1977.
[8] J. Kijowski, A Finite-dimensional Canonical Formalism in the Classical Field Theory, Commun. Math. Phys. 30 (1973), 99–128; Multiphase Spaces and Gauge in Calculus of Variations, Bull. Acad. Pol. Sci. SMAP 22 (1974), 1219–1225.
[9] J. Kijowski and W. Szczyrba, “Multisymplectic Manifolds and the Geometrical Construction of the Poisson Brackets in the Classical Field Theory”, in Géométrie Symplectique et Physique Mathématique, ed. J.-M. Souriau, C.N.R.S., Paris 1975, pp. 347–379.
[10] J. Kijowski and W. Szczyrba, Canonical Structure for Classical Field Theories, Commun. Math. Phys. 46 (1976), 183–206.
[11] J. Kijowski and W. Tulczyjew, “A Symplectic Framework for Field Theories”, Lecture Notes in Physics, Vol. 107, Springer-Verlag, Berlin 1979.
[12] M. Forger, C. Paufler and H. Römer, “More about Poisson Brackets and Poisson Forms”, in Multisymplectic Field Theory, in preparation.
[13] L. K. Norris, N-Symplectic Algebra of Observables in Covariant Lagrangian Field Theory, J. Math. Phys. 42 (2001), 4827–4845.
[14] M. de León, M. McLean, L. K. Norris, A. R. Roca and M. Salgado, Geometric Structures in Field Theory, preprint, math-ph/0208036.
[15] F. Hélein and J. Kouneiher, “Finite-Dimensional Hamiltonian Formalism for Gauge and Field Theories”, J. Math. Phys. 43 (2002), 2306–2347, math-ph/0010036; Covariant Hamiltonian Formalism for the Calculus of Variations with Several Variables, preprint, math-ph/0211046v2.
[16] C. Crnković and E. Witten, “Covariant Description of Canonical Formalism in Geometrical Theories”, in Three Hundred Years of Gravitation, eds. W. Israel and S. Hawking, Cambridge University Press, Cambridge 1987, pp. 676–684.
[17] C. Crnković, Symplectic Geometry of Covariant Phase Space, Class. Quant. Grav. 5 (1988), 1557–1575.
[18] G. Zuckerman, “Action Principles and Global Geometry”, in Mathematical Aspects of String Theory, ed. S.-T. Yau, World Scientific, Singapore 1987, pp. 259–288.
[19] R. E. Peierls, The Commutation Laws of Relativistic Field Theory, Proc. Roy. Soc. Lond. A214 (1952), 143–157.
[20] B. de Witt, “Dynamical Theory of Groups and Fields”, in Relativity, Groups and Topology, 1963 Les Houches Lectures, eds. B. de Witt and C. de Witt, Gordon and Breach, New York 1964, pp. 585–820.
[21] B. de Witt, “The Spacetime Approach to Quantum Field Theory”, in Relativity, Groups and Topology II, 1983 Les Houches Lectures, eds. B. de Witt and R. Stora, Elsevier, Amsterdam 1984, pp. 382–738.
[22] G. Bimonte, G. Esposito, G. Marmo and C. Stornaiolo, Peierls Brackets in Field Theory, preprint, hep-th/0301113.
[23] S. V. Romero, Colchete de Poisson Covariante na Teoria Geométrica dos Campos, PhD thesis, Institute for Mathematics and Statistics, University of São Paulo, June 2001.
[24] M. Forger and S. V. Romero, Covariant Poisson Brackets in Geometric Field Theory, in preparation.
[25] R. Abraham and J. E. Marsden, Foundations of Mechanics, 2nd edition, Benjamin/Cummings, Reading 1978.
[26] V. Arnold, Mathematical Methods of Classical Mechanics, 2nd edition, Springer, Berlin 1989.
[27] H. A. Kastrup, Canonical Theories of Lagrangian Dynamical Systems in Physics, Phys. Rep. 101 (1983), 3–167.
[28] G. Martin, A Darboux Theorem for Multisymplectic Manifolds, Lett. Math. Phys. 16 (1988), 133–138.
[29] G. Martin, Dynamical Structures for k-Vector Fields, Int. J. Theor. Phys. 41 (1988), 571–585.
[30] F. Cantrijn, A. Ibort and M. de León, On the Geometry of Multisymplectic Manifolds, J. Austral. Math. Soc. (Ser. A) 66 (1999), 303–330.
[31] C. Paufler and H. Römer, Geometry of Hamiltonian n-vectors in Multisymplectic Field Theory, J. Geom. Phys. 44 (2002), 52–69, math-ph/0102008.
[32] C. Paufler, A Vertical Exterior Derivative in Multisymplectic Geometry and a Graded Poisson Bracket for Nontrivial Geometries, Rep. Math. Phys. 47 (2001), 101–119, math-ph/0002032.
[33] W. M. Tulczyjew, The Graded Lie Algebra of Multivector Fields and the Generalized Lie Derivative of Forms, Bull. Acad. Pol. Sci. SMAP 22 (1974), 937–942.
[34] I. Kolář, P. W. Michor and J. Slovák, Natural Operations in Differential Geometry, Springer, Berlin 1993.
November 1, 2003 12:2 WSPC/148-RMP
00175
Reviews in Mathematical Physics Vol. 15, No. 7 (2003) 745–763 © World Scientific Publishing Company
COMBINATORIAL PROPERTIES OF ARNOUX–RAUZY SUBSHIFTS AND APPLICATIONS TO SCHRÖDINGER OPERATORS
D. DAMANIK Department of Mathematics 253–37, California Institute of Technology Pasadena, CA 91125, USA [email protected]
LUCA Q. ZAMBONI Department of Mathematics, University of North Texas Denton, TX 76203, USA [email protected]
Received 6 February 2003 Revised 16 July 2003
We consider Arnoux–Rauzy subshifts X and study various combinatorial questions: When is X linearly recurrent? What is the maximal power occurring in X? What is the number of palindromes of a given length occurring in X? We present applications of our combinatorial results to the spectral theory of discrete one-dimensional Schrödinger operators with potentials given by Arnoux–Rauzy sequences.
Keywords: Arnoux–Rauzy subshifts; linear recurrence; powers; palindromes; Schrödinger operators.
2000 AMS Subject Classification: 81Q10, 68R15, 37B10
1. Introduction
Mainly motivated by the discovery of quasicrystals by Shechtman et al. in 1984 [1], there has been a lot of research done on the spectral properties of Schrödinger operators with potentials displaying long-range order. The first rigorous mathematical results were obtained in the late eighties. By now, many key issues are well understood, at least in one dimension. The two survey articles [2] and [3] recount the history of this effort up to 1994 and 1999, respectively. The primary example is given by a discrete one-dimensional Schrödinger operator whose potential is given by the Fibonacci sequence. More generally, one considers Sturmian potentials or potentials generated by (primitive) substitutions. It turned out that all these potentials lead to the same qualitative behavior: the
corresponding Schrödinger operator has purely singular continuous zero-measure Cantor spectrum. This has been established for all Sturmian potentials and most substitution potentials. On the other hand, no counterexample is known. This led to the conjecture that these properties are shared by a large class of potentials displaying long-range order in a certain sense. One possible way to measure long-range order is given by the combinatorial complexity function f : N → N associated with a potential taking finitely many values, where f(n) is the number of subwords of length n of the potential. Since periodic potentials are well understood, one is interested in the case of an aperiodic potential and, in this case, it is well known that the complexity function grows at least linearly. One possible point of view could be the stipulation that long-range order manifests itself in a slowly (e.g. linearly) growing complexity function, possibly along with further conditions. This combinatorial approach is further motivated by the fact that the properties above, singular continuous zero-measure Cantor spectrum, can be shown by purely combinatorial methods. The key combinatorial properties that allow one to deduce these spectral properties are linear recurrence and the occurrence of local symmetries such as powers and palindromes. Here, a sequence is linearly recurrent if its subwords occur infinitely often, with gap lengths bounded linearly in the length of the subword. Powers are repetitions of subwords and palindromes are subwords that are the same when read backwards.
Thus, the interplay between the spectral theory of Schrödinger operators and combinatorics of infinite words has enjoyed quite some popularity recently, due to its success in answering long-standing questions; for example, the completion of the analysis of the Sturmian case [4] or the proof of zero-measure spectrum for all primitive substitution potentials [5] (see also [6] for a different proof of the latter result using trace maps). This interplay and its applications will be discussed in detail in [7]. As was mentioned above, the spectral theory is well understood for the primary example, the Fibonacci case, and more generally, for all Sturmian potentials. Thus it is natural to consider generalizations of Sturmian potentials and to explore whether the combinatorial approach continues to be applicable. There are a number of natural candidates:
• quasi-Sturmian sequences,
• sequences obtained by codings of rotations,
• Arnoux–Rauzy sequences.
Quasi-Sturmian sequences are essentially given by morphic images of Sturmian sequences and the corresponding Schrödinger operators were studied in [8], confirming all of the above points. Sturmian sequences have a geometric realization as a coding of an irrational rotation on the unit circle with respect to a decomposition of the circle into two half-open intervals, where the rotation number is equal to the length of one of the intervals. By dropping the latter condition, one obtains the more
general class of sequences associated to codings of rotations. The corresponding operators display purely singular continuous zero-measure Cantor spectrum in many cases [9–11]. Finally, Sturmian potentials can be characterized by a scarceness of so-called special factors, that is, aside from being defined over two symbols, there is, for each length, exactly one subword with multiple extensions to the right and one subword with multiple extensions to the left. When considering more than two symbols, this definition leads to the class of Arnoux–Rauzy sequences, originally defined and studied in [12] (the paper [12] considers the three-symbol case; for the case of $k \ge 3$ symbols, see [13] for initial definition and study). For the corresponding operators, no results have been shown yet. Thus, the spectral analysis of these operators via the combinatorial approach mentioned above is the objective of the present paper. Let us mention that some authors (e.g. [14–16]) study the class of episturmian sequences which contains all Arnoux–Rauzy sequences.

To this end, we shall recall the formal definition and some basic combinatorial properties of Arnoux–Rauzy sequences in Sec. 2 and then study the relevant combinatorial issues, namely, linear recurrence, powers, and palindromes, in Secs. 3–5, respectively. Applications of the combinatorial results obtained in these sections to the corresponding Schrödinger operators are then presented in Sec. 6.

2. Basic Properties of Arnoux–Rauzy Sequences

In this section we recall some known properties of Arnoux–Rauzy sequences and subshifts. In particular, we explain the two combinatorial descriptions of such subshifts from [13] since they will be used extensively in later sections. We begin with some definitions. Let $A_k = \{1, 2, \ldots, k\}$ with $k \ge 2$. We denote by $A_k^*$, $A_k^{\mathbb{N}}$, $A_k^{\mathbb{Z}}$ the sets of finite, one-sided infinite, and two-sided infinite words over $A_k$, respectively. Given $x \in A_k^{\mathbb{N}}$ or $A_k^{\mathbb{Z}}$, we denote by $F_x(n)$ the set of all subwords of $x$ of length $n \in \mathbb{N}$, that is,
$$F_x(n) = \{x_j \cdots x_{j+n-1} : j \in \mathbb{N} \ (\text{or } \mathbb{Z})\}.$$
We write $F_x = \bigcup_{n \in \mathbb{N}} F_x(n)$. The complexity function $f : \mathbb{N} \to \mathbb{N}$ of $x$ is defined by $f(n) = |F_x(n)|$, where $|\cdot|$ denotes cardinality. A factor (= subword) $u \in F_x(n)$ of $x$ is called right-special if it has at least two extensions to the right, that is, there are $a, b \in A_k$, $a \ne b$, such that $ua, ub \in F_x(n+1)$. A left-special factor is defined analogously. If a factor is both right-special and left-special, it is called bispecial. The sequence $x$ is called an Arnoux–Rauzy sequence if
• $x$ is uniformly recurrent (i.e. each factor of $x$ occurs with bounded gaps),
• $f(n) = (k-1)n + 1$,
• each $F_x(n)$ contains exactly one right-special factor $r_n$ and one left-special factor $l_n$.
It can be shown that $r_n = l_n^R$ [13], where the reversal $u^R$ of a word $u = u_1 \cdots u_m$ is defined by $u^R = u_m \cdots u_1$. In particular, $r_1 = l_1$ and this factor is bispecial. Observe that there is a unique symbol $a \in A_k$ such that $aa \in F_x(2)$ (which is given by $a = r_1 = l_1$). We shall say that $x$ is of type $a$.
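The definitions above can be explored numerically on a finite prefix. The following is a minimal sketch, not part of the original argument: it assumes the Tribonacci word (the fixed point of 1 → 12, 2 → 13, 3 → 1) as a standard example of an Arnoux–Rauzy sequence over three symbols, and the function names are hypothetical. A finite prefix only sees the factors that occur in it, so the checks are meaningful only for lengths that are small compared to the prefix.

```python
def tribonacci_prefix(length):
    """Prefix of the fixed point of 1 -> 12, 2 -> 13, 3 -> 1 (assumed example)."""
    rules = {1: (1, 2), 2: (1, 3), 3: (1,)}
    word = (1,)
    while len(word) < length:
        word = tuple(c for a in word for c in rules[a])
    return word[:length]


def factors(word, n):
    """Set of all length-n subwords occurring in the finite word."""
    return {word[j:j + n] for j in range(len(word) - n + 1)}


def right_special(word, n):
    """Length-n factors with at least two distinct right extensions."""
    extensions = {}
    for v in factors(word, n + 1):
        extensions.setdefault(v[:n], set()).add(v[n])
    return [u for u, ext in extensions.items() if len(ext) >= 2]


x = tribonacci_prefix(5000)
k = 3
for n in range(1, 9):
    # complexity (k - 1) n + 1 and a single right-special factor per length
    print(n, len(factors(x, n)), (k - 1) * n + 1, right_special(x, n))
```

On this prefix the factor 11 occurs, so the sequence is of type 1 in the terminology above.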
If $k = 2$, this recovers the definition of a Sturmian sequence. Hence, Arnoux–Rauzy (AR for short) sequences are a natural generalization of Sturmian sequences to larger alphabets. Given an AR sequence $x$, we define the associated AR subshift $X$ by
$$X = \{y \in A_k^{\mathbb{N}} \ (\text{or } A_k^{\mathbb{Z}}) : F_y(n) = F_x(n) \text{ for every } n \in \mathbb{N}\}.$$
By definition, we have $F_y = F_x \equiv F_X$ for every $y \in X$. When we want to be more specific (about the choice of $\mathbb{N}$ or $\mathbb{Z}$), we shall refer to $X$ as a one-sided (resp., two-sided) subshift. AR sequences and subshifts were originally defined and studied by Arnoux and Rauzy in [12]. Since $x$ is assumed to be uniformly recurrent, $X$ is minimal. Moreover, it was shown in [12] that $X$ is uniquely ergodic, that is, $X$ admits a unique shift-invariant probability measure $\nu$. Equivalently, for every factor $u \in F_X$ and every $m \in \mathbb{N}$ (or $m \in \mathbb{Z}$), the limit
$$d(u) = \lim_{n \to \infty} \frac{1}{n} \#_u(x_m \cdots x_{m+n-1})$$
exists, uniformly in $m$. Here, $\#_u(v)$ denotes the number of occurrences of $u$ in $v$. The number $d(u)$ is called the frequency of $u$. We have, for every $m$,
$$\nu(\{y \in X : y_m \cdots y_{m+|u|-1} = u\}) = d(u). \tag{2.1}$$
Two important objects associated with such a subshift are the index sequence $(i_n) \in A_k^{\mathbb{N}}$ and the characteristic sequence $(c_n) \in A_k^{\mathbb{N}}$, which are defined as follows: Let $\{\varepsilon = w_1, w_2, w_3, \ldots\}$ be the set of bispecial factors ordered so that $0 = |w_1| < |w_2| < |w_3| < \cdots$. For $n \in \mathbb{N}$, let $i_n \in A_k$ be the unique symbol so that $i_n w_n$ is right-special. The characteristic sequence $(c_n)$, on the other hand, is defined to be the unique accumulation point of the set $\{l_1, l_2, l_3, \ldots\}$ of left-special factors. Note that $(c_n)$ is an element of the one-sided subshift $X$ and hence has the same factors as $x$. Consequently, it carries all the necessary information and once we find a way of constructing $(c_n)$ from $(i_n)$, we see that the index sequence completely determines $X$. One such construction is given by the hat algorithm from [13]. Define a function
$$H : A_k^{\mathbb{N}} \to A_k^{\mathbb{N}}$$
as follows. Set $A_k' = \{1, \ldots, k, \hat{1}, \ldots, \hat{k}\}$ and let $\Phi$ denote the morphism
$$\Phi : A_k' \to A_k, \qquad \Phi(\hat{a}) = \Phi(a) = a \text{ for every } a \in A_k.$$
Clearly, $\Phi$ extends to both $(A_k')^*$ and $(A_k')^{\mathbb{N}}$. With each sequence $S = (s_n) \in A_k^{\mathbb{N}}$, we associate a sequence $(B_n)$ of words over the alphabet $A_k'$ as follows: $B_1 = \hat{s}_1$ and, for $n > 1$, $B_n$ is obtained from $B_{n-1}$ according to the following rule. If $\hat{s}_n$ does not occur in $B_{n-1}$, then
$$B_n = B_{n-1} \hat{s}_n \Phi(B_{n-1}).$$
Otherwise, if $\hat{s}_n$ occurs in $B_{n-1}$, then we can write $B_{n-1} = x' \hat{s}_n y'$, where $x', y'$ are words over $A_k'$ (possibly empty) and $\hat{s}_n$ does not occur in $y'$. In this case we set
$$B_n = B_{n-1} \hat{s}_n \Phi(y').$$
The sequence $(B_n)$ converges to a unique sequence $B \in (A_k')^{\mathbb{N}}$. We set $H(S) = \Phi(B)$. Then, the following holds:

Theorem 2.1 (Risley–Zamboni [13]). Let $X$ be an AR subshift over $A_k$. Let $I = (i_n)$ be its index sequence and $C = (c_n)$ its characteristic sequence. Then every $a \in A_k$ occurs in $I$ an infinite number of times and $C = H(I)$. Conversely, if $I = (i_n)$ is a sequence over $A_k$ such that every $a \in A_k$ occurs infinitely often in $I$, then $H(I)$ is the characteristic sequence of an AR subshift.

The key observation is that $\{\Phi(B_n) : n \in \mathbb{N}\}$ is precisely the set of all bispecial factors (see [13]). Thus, we can re-interpret the construction above on this level: Let $w$ be one of the bispecial factors. Suppose that each symbol $a \in A_k$ occurs in $w$ (this holds by minimality if $w$ is long enough). Then by the hat algorithm, for each symbol $a \in A_k$, there is a positive integer $m < |w|$ (depending on $a$) such that the next bispecial factor is obtained from $w$ by adjoining to the end of $w$ a suffix of $w$ of length $m$. The quantity $m$ is one of the $k$ periods of $w$. Let $p_1, p_2, \ldots, p_k$ denote the $k$ periods of $w$. Then we have the following formula [17]:
$$\sum_{i=1}^{k} (p_i - 1) = (k-1)|w|.$$
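The hat algorithm described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name is hypothetical, a hatted letter is modelled as a pair (letter, True), and the letters are assumed to be single digits so that words can be printed as strings.

```python
def hat_bispecial_words(index_prefix):
    """Return Phi(B_1), Phi(B_2), ... for a finite prefix of the index sequence.

    Each B_n is stored as a list of (letter, hatted) pairs.
    """
    B = []
    words = []
    for s in index_prefix:
        # positions of hatted occurrences of s in B_{n-1}
        hatted = [j for j, (c, h) in enumerate(B) if h and c == s]
        # y' is what follows the last hatted s; if there is none, use all of B_{n-1}
        tail = B[hatted[-1] + 1:] if hatted else B
        # B_n = B_{n-1}, then the hatted letter, then the un-hatted copy of the tail
        B = B + [(s, True)] + [(c, False) for c, _ in tail]
        words.append("".join(str(c) for c, _ in B))
    return words


print(hat_bispecial_words([1, 2, 1, 2]))  # alternating index sequence (Fibonacci case)
print(hat_bispecial_words([1, 2, 3]))     # periodic index sequence 1, 2, 3
```

Each word produced is a prefix of the next one, in line with the convergence of $(B_n)$, and each is a palindrome, as expected for bispecial factors since $r_n = l_n^R$.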
We can suppose that $p_1 \ge p_2 \ge \cdots \ge p_k$. In this case it follows that
$$p_i > \frac{|w|}{k}, \qquad 1 \le i \le k-1. \tag{2.2}$$
In fact, if $p_{k-1} \le |w|/k$, then $p_{k-1} + p_k \le |w|$, implying that $p_1 + p_2 + \cdots + p_{k-2} \ge (k-2)|w|$, which is a contradiction since each $p_i < |w|$.

Another way of constructing the characteristic sequence is given by the following result. For each $a \in A_k$, define the morphism $\tau_a$ by $\tau_a(a) = a$ and $\tau_a(b) = ab$ for $b \in A_k \setminus \{a\}$.

Theorem 2.2 (Risley–Zamboni [13]). Let $X$ be an AR subshift and let $(i_n)$ be its index sequence. For each $a \in A_k$, the characteristic sequence $(c_n)$ of $X$ is given by
$$\lim_{n \to \infty} \tau_{i_1} \circ \cdots \circ \tau_{i_n}(a).$$
That is, the characteristic sequence admits an S-adic representation where the underlying morphisms are given by {τa : a ∈ Ak } and they are iterated in an order dictated by the index sequence.
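As an illustration of this S-adic description, the following sketch composes the morphisms $\tau_a$ along a finite prefix of the index sequence and records the prefix shared by the images of all letters, which is the part of the limit that is already determined. The function names are hypothetical and the periodic index sequence 1, 2, 3, 1, 2, 3, … is an assumed example.

```python
def tau(a, word):
    """Apply tau_a letter by letter: a -> a, b -> ab for b != a."""
    out = []
    for b in word:
        if b != a:
            out.append(a)
        out.append(b)
    return out


def s_adic_image(index_prefix, letter):
    """Compute tau_{i_1} o ... o tau_{i_n}(letter) for index_prefix = (i_1, ..., i_n)."""
    word = [letter]
    for a in reversed(index_prefix):  # the innermost morphism tau_{i_n} acts first
        word = tau(a, word)
    return word


def common_prefix(words):
    """Longest common prefix of a list of words."""
    prefix = []
    for column in zip(*words):
        if len(set(column)) > 1:
            break
        prefix.append(column[0])
    return prefix


for n in (3, 6):
    index_prefix = [1, 2, 3] * (n // 3)
    images = [s_adic_image(index_prefix, a) for a in (1, 2, 3)]
    print(n, "".join(map(str, common_prefix(images))))
```

The common prefix grows with n, which is what makes the limit in Theorem 2.2 independent of the starting letter.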
3. Linearly Recurrent Arnoux–Rauzy Sequences

In this section we characterize the set of AR subshifts that are linearly recurrent. Recall that a subshift $X$ is called $K$-linearly recurrent (or $K$-LR) if there is a constant $K > 0$ such that every $w \in F_X$ is contained in every $v \in F_X$ of length $K|w|$. $X$ is called linearly recurrent (or LR) if it is $K$-LR for some $K$. Linear recurrence is a concept that has been quite popular since the late nineties and it is known to have a number of nice consequences; compare [5, 18–22]. For example, every linearly recurrent $X$ is uniquely ergodic and $N$-power free for some $N$ (i.e. $F_X$ does not contain an element of the form $u^N$).

Theorem 3.1. An AR subshift $X$ over $A_k$ is linearly recurrent if and only if every letter $a \in A_k$ occurs in $(i_n)$ with bounded gaps.

Remark 3.1. This is Corollary III.9 in [13]. One direction was stated without proof and the proof of the other direction was based on [19, Proposition 5] which turned out to be incorrect [23].

Proof. It clearly suffices to prove the assertion for the characteristic sequence $(c_n)$ since the LR property only depends on the set of factors of a sequence and hence is an invariant of a minimal subshift. We first prove that if some letter $\tilde{a} \in A_k$ occurs in $(i_n)$ with unbounded gaps, then $(c_n)$ is not linearly recurrent. A special case of this scenario is easy to handle: If for each $n \in \mathbb{N}$, there is $m \in \mathbb{N}$ such that $i_m = \cdots = i_{m+n-1}$, then $(c_n)$ is not LR since it is not $N$-power free for any $N$ (see [13, Corollary III.6] or the next section). Let us therefore assume, in addition, that there is some $N \in \mathbb{N}$ such that for every $a \in A_k$, $a^N$ does not occur in the index sequence. Fix some $K > 0$. Let $L > kNK$ (recall that $k$ is the size of the alphabet). Then there exists $n \in \mathbb{N}$ such that $i_{n+j} \ne \tilde{a}$, $0 \le j \le L$. We shall show that $w_{n+L}$ is a word of length $> (K+1)|w_n|$ which does not contain an occurrence of $w_n \tilde{a}$. (Recall that $w_m$ denotes the $m$th bispecial factor.) This implies that $(c_n)$ is not $K$-LR.
Since $K$ was arbitrary, $(c_n)$ is not LR. That $w_{n+L}$ does not contain $w_n \tilde{a}$ follows from the hat algorithm and the fact that $\tilde{a}$ does not occur in $i_n, \ldots, i_{n+L}$. That $w_{n+L}$ is of length $> K|w_n|$ follows from the fact that for each $j$, in passing from $w_{n+j}$ to $w_{n+j+1}$, one adds on a suffix; all but one of these suffixes are, by (2.2), of length $> |w_n|/k$. Since $(c_n)$ is $N$-power free, each window of length $N$ in the index sequence must contain at least two distinct symbols. Thus
$$|w_{n+jN}| > |w_n| + j \frac{|w_n|}{k},$$
and hence
$$|w_{n+kNK}| > |w_n| + Kk \frac{|w_n|}{k} = (K+1)|w_n|.$$
Consider now the case where each symbol $a \in A_k$ occurs in the index sequence with bounded gaps. That is, there is a number $g \in \mathbb{N}$ such that for every $a \in A_k$
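and every $m \in \mathbb{N}$, at least one of $i_m, \ldots, i_{m+g-1}$ equals $a$. We have to show that $C = (c_n)$ is linearly recurrent. Recall from Theorem 2.2 that
$$C = \lim_{n \to \infty} (\tau_{i_1} \circ \cdots \circ \tau_{i_n})(a) \quad \text{for every } a \in A_k. \tag{3.1}$$
Fix some $\tilde{a} \in A_k$ and define, for $m \in \mathbb{N}$,
$$C^{(m)} = \lim_{n \to \infty} (\tau_{i_m} \circ \cdots \circ \tau_{i_n})(\tilde{a}). \tag{3.2}$$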
In order to show that $C$ is linearly recurrent, we shall employ [23, Lemma 4] which provides a sufficient condition for LR: For each $m \in \mathbb{N}$, let $d_m$ be the largest gap between consecutive occurrences of a word of length 2 in $C^{(m)}$. If the set $\{d_m : m \in \mathbb{N}\}$ is bounded, then $C$ is linearly recurrent. Fix some $m \in \mathbb{N}$ and consider the sequence $C^{(m)}$. It is an AR sequence of type $i_m$ and its factors of length 2 are given by
$$F_{C^{(m)}}(2) = \{a i_m : a \in A_k\} \cup \{i_m a : a \in A_k\}.$$
The gaps between occurrences of $a i_m$ are bounded by twice the maximal length of the gaps between occurrences of $a$ in $C^{(m+1)}$, which in turn occurs with gaps bounded by $2^g$ since at least one of $i_{m+1}, \ldots, i_{m+g}$ is equal to $a$ (the corresponding substitution produces a sequence where the gaps between successive $a$'s are bounded by 2, and then the gaps increase under subsequent substitutions at most by a factor 2). The same argument works for words of the form $i_m a$ and hence $d_m \le 2^g$.

4. Powers in Arnoux–Rauzy Sequences

In this section we study the occurrences of powers in a given AR subshift $X$. As was noted in [13], if there are arbitrarily long runs in the index sequence $(i_n)$, then there are arbitrarily high powers. Here, we shall prove the converse and provide an explicit expression for the index of $X$ (the highest power occurring in $X$) in terms of the run lengths in $(i_n)$. To this end, we shall distinguish between two types of runs in $(i_n)$, namely, open runs and closed runs. An $r$-run in $(i_n)$ is a pair $(a, l) \in A_k \times \mathbb{N}$ such that $i_m = a$ for $l \le m \le l + r - 1$. If the value of $r$ is understood, such an $r$-run will sometimes be simply referred to as a run. An $r$-run $(a, l)$ is called open if $i_m \ne a$ for $1 \le m \le l - 1$; otherwise, it is called closed. Recall that the (integer) index of $X$, $\mathrm{ind}(X) \in \mathbb{N} \cup \{\infty\}$, is defined by
$$\mathrm{ind}(X) = \sup\{p \in \mathbb{N} : \text{there exists } u \in A_k^* \text{ such that } u^p \in F_X\}.$$
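The definition of the index can be made concrete on a finite word. The sketch below is a brute-force illustration with hypothetical function names: it computes the largest integer power occurring in a prefix of the Fibonacci word, the fixed point of 1 → 12, 2 → 1, which is assumed here as the standard Sturmian example. A finite prefix can only give a lower bound for the index of the subshift.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of 1 -> 12, 2 -> 1 (assumed example)."""
    rules = {1: (1, 2), 2: (1,)}
    word = (1,)
    while len(word) < length:
        word = tuple(c for a in word for c in rules[a])
    return word[:length]


def max_integer_power(word):
    """Largest p such that u^p occurs in word for some nonempty u."""
    best = 1
    n = len(word)
    for q in range(1, n // 2 + 1):  # candidate period q = |u|
        run = 0                     # number of consecutive i with word[i] == word[i + q]
        for i in range(n - q):
            if word[i] == word[i + q]:
                run += 1
                # a stretch of length run + q has period q
                best = max(best, (run + q) // q)
            else:
                run = 0
    return best


print(max_integer_power(fibonacci_prefix(300)))
```

For the Fibonacci word the cube (121)(121)(121) already occurs near the beginning, while fourth powers do not occur at all.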
The following result provides an explicit formula for the index in terms of the runs in the index sequence and generalizes the corresponding result in the Sturmian case (cf. e.g. [24–27]). See [15, Secs. 4.1 and 5.5] for related results on powers in AR subshifts.
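Theorem 4.1. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $\mathrm{ind}(X)$ is defined as above, then
$$\mathrm{ind}(X) = \max\{N_1, N_2\},$$
where
$$N_1 = 1 + \sup\{r \in \mathbb{N} : (i_n) \text{ contains an open } r\text{-run}\},$$
$$N_2 = 2 + \sup\{r \in \mathbb{N} : (i_n) \text{ contains a closed } r\text{-run}\}.$$

In particular, we obtain the following corollary which is a generalization of the corresponding result in the Sturmian case, which was proved by Mignosi in [28].

Corollary 4.1. An AR subshift $X$ has finite index if and only if the runs in its index sequence are uniformly bounded.

We begin by proving $\mathrm{ind}(X) \ge \max\{N_1, N_2\}$. The key observation is given in the following lemma:

Lemma 4.1. Let $a, a_1, \ldots, a_n, \alpha, \beta \in A_k$ with $a_i \ne a$, $1 \le i \le n$, and $\alpha \ne \beta$. Then $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a)$ is a prefix of $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(\alpha\beta)$.

Proof. For $n = 1$, we have $\tau_a \circ \tau_{a_1}(a) = a a_1 a$ and $\tau_a \circ \tau_{a_1}(\alpha\beta) = \tau_a(a_1 \alpha' \tau_{a_1}(\beta)) = a a_1 a \cdots$, where
$$\alpha' = \begin{cases} \varepsilon & \text{if } a_1 = \alpha, \\ \alpha & \text{if } a_1 \ne \alpha. \end{cases}$$
Thus the statement is true for $n = 1$. Let us now assume that the statement holds for $n$. We have
$$\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_{a_{n+1}}(a) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1}) \, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a).$$
On the other hand, we have (with $\alpha', \beta'$ defined as above)
$$\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_{a_{n+1}}(\alpha\beta) = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} \alpha' a_{n+1} \beta') = \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1}) \, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(\alpha' a_{n+1} \beta').$$
Since $\alpha \ne \beta$, at least one of them is $\ne a_{n+1}$, say $\alpha$, and then $\alpha' = \alpha \ne a_{n+1}$. Now apply the induction hypothesis.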
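The formula of Theorem 4.1 is easy to evaluate on a finite prefix of the index sequence. The following sketch uses a hypothetical function name and 0-based positions; since only the runs visible in the prefix are taken into account (and a run touching the end of the prefix may be truncated), the returned value is a lower bound for $\max\{N_1, N_2\}$.

```python
def index_from_runs(index_prefix):
    """Evaluate max{N_1, N_2} using only the runs visible in a finite prefix."""
    best_open = 0    # longest run starting at the first occurrence of its letter
    best_closed = 0  # longest run whose letter already occurred earlier
    for l, a in enumerate(index_prefix):
        # longest run of a starting at position l
        r = 1
        while l + r < len(index_prefix) and index_prefix[l + r] == a:
            r += 1
        if a in index_prefix[:l]:
            best_closed = max(best_closed, r)
        else:
            best_open = max(best_open, r)
    n1 = 1 + best_open
    n2 = 2 + best_closed if best_closed else 0
    return max(n1, n2)


print(index_from_runs([1, 2, 1, 2, 1, 2]))              # alternating index sequence
print(index_from_runs([1, 1, 1, 2, 1, 1, 1, 1, 2, 2]))  # open run 1^3, closed run 1^4
```

In the second example the open run of length 3 contributes $N_1 = 4$ and the closed run of length 4 contributes $N_2 = 6$.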
Proposition 4.1. Suppose $a, a_1, \ldots, a_n, a_{n+1} \in A_k$ with $a_i \ne a$, $1 \le i \le n+1$, and $x \in A_k^{\mathbb{N}}$ has an occurrence of $a$. Then, for every $r \in \mathbb{N}$, $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_a^r \circ \tau_{a_{n+1}}(x)$ contains $(\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a))^{r+2}$.

Proof. Since $x$ contains $a$ and $a_{n+1} \ne a$, $\tau_{a_{n+1}}(x)$ contains $a_{n+1} a a_{n+1}$. Thus $\tau_a^r \circ \tau_{a_{n+1}}(x)$ contains $a^r a_{n+1} a^{r+1} a_{n+1} a$. Therefore $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n} \circ \tau_a^r \circ \tau_{a_{n+1}}(x)$ contains
$$(\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a))^{r+1} \, \tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a).$$
By Lemma 4.1, $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a_{n+1} a)$ has $\tau_a \circ \tau_{a_1} \circ \cdots \circ \tau_{a_n}(a)$ as a prefix.

Proposition 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $(i_n)$ contains a run $a^r$, then $X$ has a factor $u^{r+1}$. If the run $a^r$ is preceded by $a$ somewhere in $(i_n)$, then $X$ has a factor $u^{r+2}$. In particular,
$$\mathrm{ind}(X) \ge \max\{N_1, N_2\}.$$

Proof. Both claims follow immediately from Proposition 4.1 and its proof.

We now aim at proving $\mathrm{ind}(X) \le \max\{N_1, N_2\}$. This will be done by starting from a factor $u^p$, $p \ge 3$, and then performing an iterated desubstitution process which will produce an $r$-run in the index sequence. In general, we have $r = p - 2$, but under certain circumstances, we have $r = p - 1$. To illustrate this procedure, let us start with an example. Suppose $X$ is an AR subshift over three symbols such that $F_X$ contains
$$(21232121232122123212123212212321)^p. \tag{4.1}$$
Clearly, $C = C^{(1)} = (c_n)$ is of type 2 and hence $i_1 = 2$. Thus, $C^{(1)} = \tau_2(C^{(2)})$ and $C^{(2)}$ must contain the factor
$$(13113121311312131)^p. \tag{4.2}$$
Now, $C^{(2)}$ is of type 1 and hence $i_2 = 1$. Thus, $C^{(2)} = \tau_1(C^{(3)})$ and $C^{(3)}$ must contain the factor
$$(3132313231)^{p-1} \, 313231323?. \tag{4.3}$$
Observe that the last symbol in the last block cannot be desubstituted uniquely. We indicate this ambiguity by “?” and note that the last block is one symbol shorter than the other blocks. Next, $i_3 = 3$, $C^{(3)} = \tau_3(C^{(4)})$ and $C^{(4)}$ must contain the factor
$$(12121)^{p-1} \, 1212?. \tag{4.4}$$
The ambiguity on this level comes from the ambiguity on the previous level, that is, the “?” in (4.3). However, we clearly have $i_4 = 1$ and hence the last 2 in (4.4) must be followed by a 1, so in fact $C^{(4)}$ must contain the factor
$$(12121)^{p-1} \, 12121. \tag{4.5}$$
This allows us to go up one level and replace the “?” in (4.3) by a 1. This deciphers all the ambiguities up to this point. Next, $C^{(4)} = \tau_1(C^{(5)})$ and $C^{(5)}$ must contain the factor
$$(221)^{p-1} \, 22? \tag{4.6}$$
and $i_5 = 2$, $C^{(5)} = \tau_2(C^{(6)})$ and $C^{(6)}$ must contain the factor
$$(21)^{p-1} \, 2?. \tag{4.7}$$
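The blocks appearing in this example can be checked by running the substitutions forward. The sketch below is only a sanity check with a hypothetical helper name: starting from one block 21 of (4.7), it applies $\tau_2, \tau_1, \tau_3, \tau_1, \tau_2$ in this order, corresponding to $i_5, i_4, i_3, i_2, i_1$ as read off in the example (in particular $i_3 = 3$), and collects the resulting words.

```python
def tau(a, word):
    """tau_a on strings of digits: a -> a, b -> ab for b != a."""
    return "".join(b if b == a else a + b for b in word)


index_values = ["2", "1", "3", "1", "2"]  # i_1, ..., i_5 from the example
word = "21"                               # one block of (4.7)
chain = [word]
for a in reversed(index_values):          # undo the desubstitution level by level
    word = tau(a, word)
    chain.append(word)

for level, block in zip(range(6, 0, -1), chain):
    print(f"block of C^({level}): {block}")
```

The last word produced is the block of (4.1), and the intermediate words are the blocks of (4.6), (4.5), (4.3) and (4.2).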
Now, $i_6$ is either 1 or 2, but in either case, further desubstitution yields a run of length $p - 2$ in the index sequence. Namely, if $i_6 = 1$, then $i_7 = \cdots = i_{7+(p-2)-1} = 2$, and $i_6 = 2$ gives $i_7 = \cdots = i_{7+(p-2)-1} = 1$. Note that, contrary to the situation above, the ambiguities in (4.6) and (4.7) cannot be removed. We observe:
(1) The desubstitution process takes $u^p$ to $a^{p-1}?$ for some $a \in A_k$ and “?” is either known or not. This yields at least a $(p-2)$-run in the index sequence.
(2) If at no step there is an ambiguity in the desubstitution process, then $u^p$ reduces to $a^p$ and hence produces a $(p-1)$-run in the index sequence.
These observations lead to the following lemma:

Lemma 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. Suppose there is $p \ge 3$ and a primitive $u \in A_k^*$ such that $u^p \in F_X$. Then we have one of the following scenarios:
(i) There are $a \in A_k$ and $m \in \mathbb{N}$ such that $i_{m+j} = a$, $1 \le j \le p - 1$.
(ii) There are $a \in A_k$ and $m \in \mathbb{N}$ such that $i_{m+j} = a$, $1 \le j \le p - 2$, and $i_j = a$ for some $j$ with $1 \le j \le m$.

Proof. Start with the word $u^p$ and perform a continued desubstitution process, as above, using $\tau_{i_1}^{-1}$, $\tau_{i_2}^{-1} \circ \tau_{i_1}^{-1}$, $\tau_{i_3}^{-1} \circ \tau_{i_2}^{-1} \circ \tau_{i_1}^{-1}$, $\ldots$, where the last symbol of the desubstituted word may be unknown and hence denoted by “?”. Clearly, this process leads, after, say, $m$ steps, to a desubstituted word which has either the form $a^p$ or $a^{p-1}?$, where $a$ is some symbol from $A_k$. In particular, we must have $i_{m+j} = a$ for $1 \le j \le p - 1$ (in the first case) or $1 \le j \le p - 2$ (in the second case). It only remains to be shown that in the second case, we must have applied $\tau_a$ somewhere along the way. Notice that each desubstituted word results from the previous word by a deletion of a number of symbols. In particular, the word $u$ we started with must contain $a$. Consider first the case where $u$ contains at least two occurrences of $a$. Then, in order to reduce $u^2$ (the first two of the $p \ge 3$ blocks) to $a^2$, we necessarily have to apply $\tau_a$ along the way.
Let us now consider the case where u contains exactly one a. Then we do not apply τa until we are left with a desubstituted word of length ≤ 2. That is, either the word is a, in which case we are done (since the a in the last block never gets deleted and hence up reduces to ap ), or the word is ab (or ba) for some b. In the next step, either τa or τb is applied. Remember that the
last word still contains $a$ (so that it is one of $ab$, $ba$, $a?$) so that desubstitution by $\tau_b$ leads to $a^p$. That is, only desubstitution by $\tau_a$ leads to $a^{p-1}?$.

Proposition 4.3. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. Then
$$\mathrm{ind}(X) \le \max\{N_1, N_2\}.$$

Proof. This is an immediate consequence of Lemma 4.2. Namely, a given power $u^p \in F_X$, for some $p \ge 3$, corresponds to either an open or closed $(p-1)$-run or a closed $(p-2)$-run in the index sequence. Note that every AR subshift contains squares (e.g. $i_1 i_1$) so that powers $p < 3$ are irrelevant for the computation of the index.

Proof of Theorem 4.1. The assertion follows from Propositions 4.2 and 4.3.

One might also be interested in powers that occur for arbitrarily long factors. That is, define $\text{i-ind}(X) \in \mathbb{N} \cup \{\infty\}$ by
$$\text{i-ind}(X) = \sup\{p \in \mathbb{N} : \text{there exist } u_n \text{ with } |u_n| \to \infty \text{ such that } u_n^p \in F_X\}.$$
Then, the above analysis has the following immediate consequence:

Corollary 4.2. Let $X$ be an AR subshift over $A_k$ and $(i_n)$ its index sequence. If $\text{i-ind}(X)$ is defined as above, then
$$\text{i-ind}(X) = 2 + \limsup_{n \to \infty} e_n,$$
where, for $n \in \mathbb{N}$, $e_n = \max\{l \in \mathbb{N} : i_n = i_{n+1} = \cdots = i_{n+l-1}\}$.

Proof. Since every symbol from $A_k$ occurs in $(i_n)$, the index sequence has exactly $k$ open runs. Moreover, there is $N \in \mathbb{N}$ such that beyond $i_N$, there are no more open runs. In particular, for the computation of $\text{i-ind}(X)$, only closed runs are relevant. Thus, the assertion follows in a straightforward way from Proposition 4.2, Lemma 4.2, and their proofs.

5. Palindromes in Arnoux–Rauzy Sequences

In this section we study the number of palindromes of a given length that occur in a given AR subshift $X$. Recall that a word is called a palindrome if it is the same when read backwards. Given a minimal subshift $X$, define its palindrome complexity function $p : \mathbb{N} \to \mathbb{N}_0$ by
$$p(n) = |\{p \in F_X : p = p^R, \ |p| = n\}|.$$
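The palindrome complexity function just defined can be computed directly on a finite prefix. The sketch below uses hypothetical function names and assumes the Tribonacci word (fixed point of 1 → 12, 2 → 13, 3 → 1) as an example of an AR sequence over three symbols. Only palindromes occurring in the prefix are counted, so the lengths considered should be small compared to the prefix.

```python
def tribonacci_prefix(length):
    """Prefix of the fixed point of 1 -> 12, 2 -> 13, 3 -> 1 (assumed example)."""
    rules = {1: (1, 2), 2: (1, 3), 3: (1,)}
    word = (1,)
    while len(word) < length:
        word = tuple(c for a in word for c in rules[a])
    return word[:length]


def palindrome_complexity(word, n):
    """Number of distinct palindromes of length n occurring in word."""
    seen = {word[j:j + n] for j in range(len(word) - n + 1)}
    return sum(1 for u in seen if u == u[::-1])


x = tribonacci_prefix(5000)
counts = [palindrome_complexity(x, n) for n in range(1, 11)]
print(counts)
```

For an AR sequence over three symbols, Theorem 5.1 below predicts the value 3 for odd lengths and 1 for even lengths.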
It was shown by Droubay and Pirillo that all Sturmian subshifts have the same palindrome complexity function, namely,
$$p(n) = \begin{cases} 2 & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}, \end{cases} \tag{5.1}$$
and that Sturmian subshifts are in fact characterized by this property [29]. The analogue of the first part, namely, that all AR subshifts over $A_k$ have the same palindrome complexity function, was shown by Justin and Pirillo in [15]. The second part, however, does not extend. That is, AR subshifts are not characterized by their palindrome complexity. We give a simple alternate proof of the result of Justin and Pirillo which reads as follows (cf. [15, Theorem 4.4]):

Theorem 5.1. The palindrome complexity function $p : \mathbb{N} \to \mathbb{N}_0$ of an AR subshift $X$ over $A_k$ is given by
$$p(n) = \begin{cases} k & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}. \end{cases} \tag{5.2}$$

Proof. We shall prove the statement
$$\forall\, n \in \mathbb{N} : \quad p(2n-1) = k, \quad p(2n) = 1, \tag{5.3}$$
which is equivalent to the assertion, by induction on $n$. The case $n = 1$ is readily checked. In fact, $p(1) = k$ is obvious, and if $X$ is of type $a \in A_k$, then $aa$ is the unique palindrome of length 2 which occurs in $X$. Now assume that (5.3) holds for $n$. Let us show (5.3) for $n + 1$ by proving that if $p$ is a palindrome occurring in $X$, then $p$ admits a unique extension $apa$ to a palindrome of length $|p| + 2$. Fix a palindrome $p \in F_X$. We show below that
$$p \in F_X \text{ bispecial} \ \Rightarrow \ \text{there exists a unique } a \in A_k \text{ such that } apa \in F_X. \tag{5.4}$$
Now, either there exists a unique $a \in A_k$ such that $apa \in F_X$, or else $p$ is bispecial (and so by (5.4) there exists a unique $a \in A_k$ such that $apa \in F_X$); hence in either case there exists a unique $a \in A_k$ such that $apa \in F_X$. Let us show (5.4). Let $a \in A_k$ be the unique letter for which $ap$ is right-special (and, equivalently, $pa$ is left-special). Then $apa \in F_X$. Consider any letter $b \ne a$. Then, we have that $bp$ is not right-special, and $bpa \in F_X$ (since $pa$ is left-special), so $bpb \notin F_X$.

As we mentioned above, every minimal subshift with palindrome complexity given by (5.1) is necessarily Sturmian. We are now going to show that this does not extend to the AR case, that is, there are non-AR subshifts with palindrome complexity given by (5.2). To this end, we consider subshifts $X_{3iet}$, defined over $A_3$, associated with three-interval exchange transformations. These dynamical systems have the following combinatorial description, as shown by Ferenczi et al. [30]:
• $F_{X_{3iet}}(2) = \{12, 13, 21, 22, 31\}$.
• If $u \in F_{X_{3iet}}$, then $u^R \in F_{X_{3iet}}$.
• For every $n \in \mathbb{N}$, there are exactly two left-special words in $F_{X_{3iet}}(n)$, one beginning in 1 and one beginning in 2.
• If $w$ is a bispecial word ending in 1 and $w \ne w^R$, then $w2$ is left-special if and only if $w^R 1$ is left-special.
Clearly, no such subshift is an AR subshift. We have the following result:

Proposition 5.1. The palindrome complexity function $p$ of $X_{3iet}$ is given by
$$p(n) = \begin{cases} 3 & \text{if } n \text{ is odd}, \\ 1 & \text{if } n \text{ is even}. \end{cases} \tag{5.5}$$

Proof. The proof is similar to the proof of Theorem 5.1. It follows from the proof of Proposition 2.6 in [30] that if $u$ is a bispecial palindrome factor, then there exists a unique symbol $a \in A_3$ such that $aua$ is a factor. In fact, if $u$ begins in 1, then $a \in \{2, 3\}$, while if $u$ begins in 2, then $a \in \{1, 2\}$. So now suppose $u$ is a palindrome factor of length $n$. We claim there exists a unique symbol $a \in A_3$ such that $aua$ is a factor. If no such $a$ exists, then there exist distinct symbols $b, c \in A_3$ such that $buc$ is a factor. This implies that $u$ is bispecial, so from the above there must exist $a \in A_3$ such that $aua$ is a factor. Next, suppose there exist distinct symbols $b, c \in A_3$ such that $bub$ and $cuc$ are factors. Then again $u$ is bispecial, and hence this cannot happen. Thus $p(n) = p(n+2)$. Since $p(1) = 3$ and $p(2) = 1$, the assertion follows.

On the other hand, $X_{3iet}$ has the same factor complexity function as an AR subshift over three symbols, so one may ask whether (5.5) implies that
$$f(n) = 2n + 1. \tag{5.6}$$
This is not true, at least on the level of individual sequences, as demonstrated by the following result (the example is due to J. Cassaigne [31]).

Proposition 5.2. There exists a sequence over $A_3$ whose palindrome complexity function is given by (5.5), but whose factor complexity function is not given by (5.6).

Proof. Let $w = 121312141213121 \cdots$ be the fixed point of the infinite substitution
$$1 \mapsto 12, \quad 2 \mapsto 13, \quad 3 \mapsto 14, \ \ldots.$$
Let $w'$ be the morphic image of $w$ under the map $\Theta$ where $\Theta(i) = 1^i 2 1^i 3 1^i$. So
$$w' = 1213111211311121311112111311112131 \cdots.$$
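The construction of $w'$ can be reproduced on a finite prefix. The following sketch uses hypothetical function names: it generates a prefix of $w$ by iterating $n \mapsto 1\,(n+1)$, applies $\Theta$, and counts factors and palindromes of small length. The counts are taken on a finite prefix, so they are reliable only for lengths that are small compared to the largest letter of $w$ appearing in the prefix.

```python
def ruler_prefix(length):
    """Prefix of the fixed point of the substitution n -> 1 (n+1)."""
    word = [1]
    while len(word) < length:
        word = [c for a in word for c in (1, a + 1)]
    return word[:length]


def theta(i):
    """Theta(i) = 1^i 2 1^i 3 1^i as a string over {1, 2, 3}."""
    return "1" * i + "2" + "1" * i + "3" + "1" * i


def factor_set(word, n):
    return {word[j:j + n] for j in range(len(word) - n + 1)}


w = ruler_prefix(64)
w_prime = "".join(theta(i) for i in w)
print(w_prime[:34])

for n in range(1, 8):
    facs = factor_set(w_prime, n)
    pals = [u for u in facs if u == u[::-1]]
    # palindrome count, factor count, and the value 2n + 1 from (5.6)
    print(n, len(pals), len(facs), 2 * n + 1)
```

Comparing the last two columns shows at which lengths the factor count departs from $2n + 1$, while the palindrome count follows the pattern of (5.5).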
Note that
$$22, \ 33, \ 21^n 2, \ 31^n 3, \ 31^n 21^n 3, \ 21^n 31^n 2 \ \text{ are not factors of } w'. \tag{5.7}$$
We therefore have for $w'$,
$$p(n) = 1 \ \text{ for } n \text{ even}, \tag{5.8}$$
since by (5.7) the only even length palindromes are $1^n$,
$$p(n) = 3 \ \text{ for } n \text{ odd}, \tag{5.9}$$
since by (5.7) the only odd length palindromes are $1^n$, $1^{\frac{n-1}{2}} 2 \, 1^{\frac{n-1}{2}}$, $1^{\frac{n-1}{2}} 3 \, 1^{\frac{n-1}{2}}$, and
$$f \ \text{ is not given by (5.6)}. \tag{5.10}$$
For example, $f(3) = 8$ (the factors of length 3 are 111, 112, 113, 121, 131, 211, 213, 311), whereas (5.6) would give $f(3) = 7$. By (5.8)–(5.10), we have palindrome complexity as in (5.5), but factor complexity different from (5.6).

Note, however, that the sequence $w'$ above is not uniformly recurrent and hence does not induce a minimal subshift. We consider it an interesting open problem to determine all minimal subshifts $X$ over $A_k$, with $k \ge 3$ arbitrary, whose palindrome complexity function is given by (5.2).

6. Applications to Schrödinger Operators

In this section, we discuss applications of our combinatorial results, Theorems 3.1 and 5.1 and Corollary 4.2, to the spectral theory of Schrödinger operators. A discrete one-dimensional Schrödinger operator acts in the Hilbert space $\mathcal{H} = \ell^2(\mathbb{Z})$. If $\phi \in \mathcal{H}$, then $H\phi$ is given by
$$(H\phi)(n) = \phi(n+1) + \phi(n-1) + V(n)\phi(n),$$
where $V : \mathbb{Z} \to \mathbb{R}$. The map $V$ is called the potential. For our purposes, we can assume $V$ bounded. Then $H$ is a bounded, self-adjoint operator. Denote the spectrum of $H$ by $\sigma(H)$. Given an initial state $\phi \in \mathcal{H}$, the Schrödinger time evolution is given by $\phi(t) = \exp(-itH)\phi$, where $\exp(-itH)$ is given by the spectral theorem. One is interested in the question whether $\phi(t)$ will spread out in space, and if so, how fast. One possible way to tackle this issue is to study the spectral measure $\mu_\phi$ associated with $\phi$, which is defined by
$$\langle \phi, (H - z)^{-1} \phi \rangle = \int_{\mathbb{R}} \frac{d\mu_\phi(x)}{x - z} \quad \text{for every } z \text{ with } \operatorname{Im} z > 0.$$
Roughly speaking, the more continuous $\mu_\phi$, the faster the spreading of $\phi(t)$; compare, for example, [32–34]. Denote
$$\mathcal{H}_{ac} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is absolutely continuous}\},$$
$$\mathcal{H}_{sc} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is singular continuous}\},$$
$$\mathcal{H}_{pp} = \{\phi \in \mathcal{H} : \mu_\phi \text{ is pure point}\}$$
and $\sigma_\varepsilon(H) = \sigma(H|_{\mathcal{H}_\varepsilon})$ for $\varepsilon \in \{ac, sc, pp\}$. We say that $H$ has purely absolutely continuous spectrum if both $\sigma_{sc}(H)$ and $\sigma_{pp}(H)$ are empty, etc.

As was mentioned in the introduction, there has been a considerable amount of research dealing with the spectral properties of $H$ if $V$ displays long-range order. The totally ordered case (i.e. $V$ periodic) is well understood [35]. In this case, $H$ has purely absolutely continuous spectrum. If $V$ takes on only finitely many values, one popular measure for long-range order is given by the complexity function. It has then been the goal to determine the spectral properties for aperiodic potentials of low combinatorial complexity. A complete understanding has been obtained for Sturmian potentials [4, 36] and quasi-Sturmian potentials [8]. It turned out that in all these cases, one has purely singular continuous spectrum, supported on a Cantor set of Lebesgue measure zero. Here, a Cantor set is a closed, perfect, nowhere dense set. It is natural to conjecture that these properties are shared by other low-complexity potentials. In fact, these questions can be studied from a purely combinatorial perspective. That is, there are results that deduce singular continuous, zero-measure spectrum from purely combinatorial properties of the potential.

Here we study the case of Arnoux–Rauzy potentials, which provide a natural class of low-complexity potentials. Fix a two-sided AR subshift $X$ over $A_k$ with index sequence $(i_n)$ and a non-constant function $f : A_k \to \mathbb{R}$. Denote the unique ergodic measure on $X$ by $\nu$. Each element $x$ of $X$ induces a potential via $V_x(n) = f(x_n)$. The Schrödinger operator with potential $V_x$ will be denoted by $H_x$. Since $X$ is minimal, the spectrum and the absolutely continuous spectrum are invariants of $X$; that is, there are sets $\Sigma, \Sigma_{ac} \subseteq \mathbb{R}$ such that $\sigma(H_x) = \Sigma$ and $\sigma_{ac}(H_x) = \Sigma_{ac}$ for every $x \in X$. The result for the spectrum follows from strong convergence and is folklore.
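The definitions $V_x(n) = f(x_n)$ and $(H\phi)(n) = \phi(n+1) + \phi(n-1) + V(n)\phi(n)$ can be tried out on a finite window. The following is only a sketch: it truncates the operator to a finite piece of a word and sets $\phi = 0$ outside the window (an assumption made here, not part of the text), and the sample word and the values of $f$ are hypothetical.

```python
def potential_from_word(word, f):
    """V(n) = f(word[n]) on the finite window 0 <= n < len(word)."""
    return [f[a] for a in word]


def apply_schrodinger(V, phi):
    """(H phi)(n) = phi(n+1) + phi(n-1) + V(n) phi(n), with phi = 0 outside the window.

    This is a finite truncation (an assumption of this sketch), not the
    operator on l^2(Z) itself.
    """
    N = len(V)
    out = []
    for n in range(N):
        left = phi[n - 1] if n > 0 else 0.0
        right = phi[n + 1] if n < N - 1 else 0.0
        out.append(left + right + V[n] * phi[n])
    return out


def inner(phi, psi):
    """Real inner product on the window."""
    return sum(a * b for a, b in zip(phi, psi))


# Hypothetical data: a short word over {1, 2, 3} and a non-constant f.
word = "1213121121312"
f = {"1": 0.0, "2": 1.0, "3": 2.0}
V = potential_from_word(word, f)

phi = [float(n % 3) for n in range(len(V))]
psi = [float((n * n) % 5) for n in range(len(V))]

# Symmetry of the truncated operator: <H phi, psi> = <phi, H psi>.
symmetry_defect = abs(inner(apply_schrodinger(V, phi), psi)
                      - inner(phi, apply_schrodinger(V, psi)))
```

The symmetry check mirrors, on the truncated window, the self-adjointness of $H$ stated above for bounded $V$.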
The result on the absolutely continuous spectrum is much deeper and more recent [37]. In fact, aperiodicity implies that $\Sigma_{ac}$ is empty [38]. Thus, to establish the desired picture, we have to show that $\Sigma$ has Lebesgue measure zero and $\sigma_{pp}(H_x)$ is often/always empty.

We first turn to the zero-measure property. It is a result of Lenz that linear recurrence provides a sufficient condition:

Theorem 6.1 (Lenz [5]). If $X$ is a linearly recurrent subshift and $X$ and $f$ are such that the resulting potentials $V_x$ are aperiodic, then $\Sigma$ has Lebesgue measure zero.

Combining this with our Theorem 3.1, we immediately obtain the following (the Cantor set properties follow from the zero-measure property by general principles):

Corollary 6.1. If every letter $a \in A_k$ occurs in $(i_n)$ with bounded gaps, then $\sigma(H_x)$ is a Cantor set of zero Lebesgue measure for every $x \in X$.

Let us now discuss the absence of point spectrum. Both palindromes and powers allow one to prove this property. The palindrome criterion is easy to verify, but it has the slight disadvantage that it only gives generic absence of eigenvalues:
Theorem 6.2 (Hof et al. [11]). If $X$ is a minimal subshift and its palindrome complexity function obeys $\limsup_{n\to\infty} p(n) > 0$, then for a dense $G_\delta$-set of $x \in X$, we have $\sigma_{pp}(H_x) = \emptyset$.

We immediately deduce from this and Theorem 5.1:

Corollary 6.2. For a dense $G_\delta$-set of $x \in X$, we have $\sigma_{pp}(H_x) = \emptyset$.

We remark that an analog of Theorem 6.2 for half-line Schrödinger operators was found in [39]. On the other hand, the criterion for empty point spectrum which is based on powers is slightly more complicated to state and requires more effort to verify, but yields a stronger conclusion. Define the set $X_n$ of elements of $X$ which have cubes of length $3n$, suitably centered around the origin, by
\[ X_n = \{x \in X : x_{-n+j} = x_j = x_{n+j},\ 1 \le j \le n\}. \]
Then we have the following result (the proof is based on a Gordon-type argument [40]; see e.g. [3, 10]):

Theorem 6.3. Suppose $\limsup_{n\to\infty} \nu(X_n) > 0$. Then, for $\nu$-almost every $x \in X$, we have $\sigma_{pp}(H_x) = \emptyset$.

We can use this theorem and our Corollary 4.2 to show:

Corollary 6.3. If the index sequence $(i_n)$ contains infinitely many 2-runs, we have $\sigma_{pp}(H_x) = \emptyset$ for $\nu$-almost every $x \in X$.

Proof. Corollary 4.2 shows that if the index sequence $(i_n)$ contains infinitely many 2-runs, then $F_X$ contains arbitrarily long fourth powers. That is, there are $u_n \in A_k^*$ with $|u_n| \to \infty$ and $u_n^4 \in F_X$. Since $X$ is aperiodic, either $u_n^4$ is right-special or one of its conjugates is right-special. Thus, we can assume without loss of generality that $u_n^4$ is right-special. It was shown in [17, Lemma 2.2] that among all factors of length $4|u_n|$, the right-special factor has the largest frequency. Since there are $4(k-1)|u_n| + 1$ words of length $4|u_n|$ whose frequencies add up to one, we infer that
\[ d(u_n^4) \ge \frac{1}{4(k-1)|u_n| + 1}. \]
This yields, using (2.1),
\[ \limsup_{n\to\infty} \nu(X_n) \ge \limsup_{n\to\infty} \nu(X_{|u_n|}) \ge \limsup_{n\to\infty} \frac{|u_n|}{4(k-1)|u_n| + 1} = \frac{1}{4(k-1)} > 0. \]
Thus, the assertion follows from Theorem 6.3.

Corollary 6.3 does not cover the prominent case of the Tribonacci subshift $X_{\mathrm{Trib}}$, which is defined over three symbols and corresponds to the index sequence $(i_n) = 1, 2, 3, 1, 2, 3, 1, 2, 3, \ldots$.
We shall nevertheless show that the conclusion of Corollary 6.3 holds for this case. By Theorem 2.2, the characteristic sequence $C = (c_n)$ is given by
\[ C = \lim_{n\to\infty} (\tau_1 \circ \tau_2 \circ \tau_3)^n(1). \]
The substitution $S = \tau_1 \circ \tau_2 \circ \tau_3$ on $A_3$ is given by
\[ S(1) = 1213121, \qquad S(2) = 121312, \qquad S(3) = 1213. \]
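The substitution $S$ can be iterated mechanically. The sketch below uses only the three images listed above; it checks that the words $S^n(1)$ are nested prefixes of one another, that each $S(a)$ contains all three letters, and that $S^4(1)$ contains a cube $w^3$ of the word $w = S(1213121)$ followed by the prefix 121312 of $w$, which is the repetition with exponent $3 + 3/22$ used in the proof of Corollary 6.4. The function and variable names are chosen here for illustration.

```python
from fractions import Fraction

# Images of the letters under S, copied from the text.
S = {"1": "1213121", "2": "121312", "3": "1213"}


def apply_S(word: str) -> str:
    """Apply the substitution letter by letter."""
    return "".join(S[a] for a in word)


def iterate_S(word: str, n: int) -> str:
    """Return S^n(word)."""
    for _ in range(n):
        word = apply_S(word)
    return word


# S^0(1), ..., S^4(1); each word is a prefix of the next one.
words = [iterate_S("1", n) for n in range(5)]
nested = all(words[n + 1].startswith(words[n]) for n in range(4))

# Every image S(a) contains all three letters.
all_letters = all(set(S[a]) == {"1", "2", "3"} for a in S)

# Look for w^3 followed by the prefix 121312 of w inside S^4(1),
# where w = S(1213121).
w = apply_S("1213121")
tail = "121312"
has_power = w.startswith(tail) and (3 * w + tail) in words[4]

# Exponent of the fractional power: p + |w'| / |w| with p = 3.
exponent = Fraction(3) + Fraction(len(tail), len(w))
```

Exact rational arithmetic (`Fraction`) is used for the exponent so that it can be compared with $3 + 3/22$ without rounding.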
Note that $S$ is primitive (i.e. there is $l \in \mathbb{N}$, namely $l = 1$, such that for every $a \in A_3$, $S^l(a)$ contains all symbols from $A_3$). Recall that a fractional power $w^q$ is a word $w^p w'$ with $p \in \mathbb{N}$, $w'$ a prefix of $w$, and $q = p + |w'|/|w|$. We have the following result for subshifts generated by primitive substitutions:

Theorem 6.4 (Damanik [41]). Suppose the subshift $X$ is generated by a primitive substitution $S$ and $F_X$ contains a fractional power $w^q$ with $q > 3$. Then we have $\sigma_{pp}(H_x) = \emptyset$ for $\nu$-almost every $x \in X$.

This allows us to prove the following:

Corollary 6.4. For the Tribonacci subshift $X_{\mathrm{Trib}}$, we have $\sigma_{pp}(H_x) = \emptyset$ for $\nu$-almost every $x \in X_{\mathrm{Trib}}$.

Proof. As we have seen above, $C$ is the unique fixed point of $S$ in $A_3^{\mathbb{N}}$ and we have
\[ F_{X_{\mathrm{Trib}}} = \bigcup_{n \in \mathbb{N}} F_{S^n(1)}. \]
Thus it suffices to find some $S^n(1)$ which contains $w^q$ with $q > 3$. The claim then follows from Theorem 6.4. First, $S^2(1)$ contains the word 1121. Thus, $S^3(1)$ contains the word
\[ 1213121\ 1213121\ 121312\ 1213121 = (1213121)^3\, 2 \cdots, \]
and hence $S^4(1)$ contains
\[ (1213121\ 121312\ 1213121\ 1213\ 1213121\ 121312\ 1213121)^3\ 121312 \cdots, \]
which yields a fractional power $w^q$ with $q = 3 + 3/22 > 3$.

Acknowledgments

We thank J. Cassaigne for useful discussions. D. D. would like to express his gratitude to the Department of Mathematics at the University of North Texas at Denton for its warm hospitality and financial support through the Texas Advanced Research Program. D. D. was supported in part by NSF Grant No. DMS–0227289.
References

[1] D. Shechtman, I. Blech, D. Gratias and J. W. Cahn, Metallic phase with long-range orientational order and no translational symmetry, Phys. Rev. Lett. 53 (1984), 1951–1953.
[2] A. Sütő, Schrödinger difference equation with deterministic ergodic potentials, in Beyond Quasicrystals (Les Houches, 1994), eds. F. Axel and D. Gratias, Springer, Berlin (1995), 481–549.
[3] D. Damanik, Gordon-type arguments in the spectral theory of one-dimensional quasicrystals, in Directions in Mathematical Quasicrystals, eds. M. Baake and R. V. Moody, CRM Monograph Series 13, AMS, Providence, RI (2000), 277–305.
[4] D. Damanik, R. Killip and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, III. α-continuity, Commun. Math. Phys. 212 (2000), 191–204.
[5] D. Lenz, Singular spectrum of Lebesgue measure zero for quasicrystals, Commun. Math. Phys. 227 (2002), 119–130.
[6] Q.-H. Liu, B. Tan, Z.-X. Wen and J. Wu, Measure zero spectrum of a class of Schrödinger operators, J. Statist. Phys. 106 (2002), 681–691.
[7] J.-P. Allouche and D. Damanik, Applications of combinatorics on words to physics, in preparation.
[8] D. Damanik and D. Lenz, Uniform spectral properties of one-dimensional quasicrystals, IV. Quasi-Sturmian potentials, to appear in J. d’Analyse Math.
[9] B. Adamczewski and D. Damanik, Linearly recurrent circle map subshifts and an application to Schrödinger operators, Ann. Henri Poincaré 3 (2002), 1019–1047.
[10] F. Delyon and D. Petritis, Absence of localization in a class of Schrödinger operators with quasiperiodic potential, Commun. Math. Phys. 103 (1986), 441–444.
[11] A. Hof, O. Knill and B. Simon, Singular continuous spectrum for palindromic Schrödinger operators, Commun. Math. Phys. 174 (1995), 149–159.
[12] P. Arnoux and G. Rauzy, Représentation géométrique de suites de complexité 2n + 1, Bull. Soc. Math. France 119 (1991), 199–215.
[13] R. N. Risley and L. Q. Zamboni, A generalization of Sturmian sequences: combinatorial structure and transcendence, Acta Arith. 95 (2000), 167–184.
[14] X. Droubay, J. Justin and G. Pirillo, Epi-Sturmian words and some constructions of de Luca and Rauzy, Theoret. Comput. Sci. 255 (2001), 539–553.
[15] J. Justin and G. Pirillo, Episturmian words and episturmian morphisms, Theoret. Comput. Sci. 276 (2002), 281–313.
[16] J. Justin and L. Vuillon, Return words in Sturmian and episturmian words, Theor. Inform. Appl. 34 (2000), 343–356.
[17] N. Wozny and L. Q. Zamboni, Frequencies of factors in Arnoux–Rauzy sequences, Acta Arith. 96 (2001), 261–278.
[18] D. Damanik and D. Lenz, Linear repetitivity. I. Uniform subadditive ergodic theorems and applications, Discrete Comput. Geom. 26 (2001), 411–428.
[19] F. Durand, Linearly recurrent subshifts have a finite number of non-periodic subshift factors, Ergodic Theory Dynam. Systems 20 (2000), 1061–1078.
[20] F. Durand, B. Host and C. Skau, Substitutional dynamical systems, Bratteli diagrams and dimension groups, Ergodic Theory Dynam. Systems 19 (1999), 953–993.
[21] J. C. Lagarias and P. A. B. Pleasants, Repetitive Delone sets and quasicrystals, to appear in Ergodic Theory Dynam. Systems.
[22] D. Lenz, Uniform ergodic theorems on subshifts over a finite alphabet, Ergodic Theory Dynam. Systems 22 (2002), 245–255.
[23] F. Durand, Corrigendum and appendum to: Linearly recurrent subshifts have a finite number of non-periodic subshift factors, to appear in Ergodic Theory Dynam. Systems.
[24] J. Berstel, On the index of Sturmian words, in Jewels are Forever, Springer, Berlin (1999), 287–294.
[25] D. Damanik and D. Lenz, The index of Sturmian sequences, European J. Combin. 23 (2002), 23–29.
[26] J. Justin and G. Pirillo, Fractional powers in Sturmian words, Theoret. Comput. Sci. 255 (2001), 363–376.
[27] D. Vandeth, Sturmian words and words with a critical exponent, Theoret. Comput. Sci. 242 (2000), 283–300.
[28] F. Mignosi, On the number of factors of Sturmian words, Theoret. Comput. Sci. 82 (1991), 71–84.
[29] X. Droubay and G. Pirillo, Palindromes and Sturmian words, Theoret. Comput. Sci. 223 (1999), 73–85.
[30] S. Ferenczi, C. Holton and L. Q. Zamboni, Structure of three-interval exchange transformations II: A combinatorial description of the trajectories, to appear in J. d’Analyse Math.
[31] J. Cassaigne, private communication.
[32] J. M. Barbaroux, F. Germinet and S. Tcheremchantsev, Fractal dimensions and the phenomenon of intermittency in quantum dynamics, Duke Math. J. 110 (2001), 161–193.
[33] I. Guarneri, Spectral properties of quantum diffusion on discrete lattices, Europhys. Lett. 10 (1989), 95–100.
[34] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142 (1996), 406–445.
[35] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs 72, AMS, Providence, RI, 2000.
[36] J. Bellissard, B. Iochum, E. Scoppola and D. Testard, Spectral properties of one-dimensional quasi-crystals, Commun. Math. Phys. 125 (1989), 527–543.
[37] Y. Last and B. Simon, Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators, Invent. Math. 135 (1999), 329–367.
[38] S. Kotani, Jacobi matrices with random potentials taking finitely many values, Rev. Math. Phys. 1 (1989), 129–133.
[39] D. Damanik, J.-M. Ghez and L. Raymond, A palindromic half-line criterion for absence of eigenvalues and applications to substitution Hamiltonians, Ann. Henri Poincaré 2 (2001), 927–939.
[40] A. Gordon, On the point spectrum of the one-dimensional Schrödinger operator, Usp. Math. Nauk 31 (1976), 257–258.
[41] D. Damanik, Singular continuous spectrum for a class of substitution Hamiltonians II, Lett. Math. Phys. 54 (2000), 25–31.
November 5, 2003 9:23 WSPC/148-RMP
00174
Reviews in Mathematical Physics, Vol. 15, No. 7 (2003) 765–788 © World Scientific Publishing Company
PHASE TRANSITION FROM THE VIEWPOINT OF RELAXATION PHENOMENA
NOBUO YOSHIDA
Division of Mathematics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan
Received 3 February 2003
Revised 15 August 2003
Some results on the relaxation processes (Glauber dynamics) obtained in the last decade are presented. This article is intended to be a short guided tour through these results for readers without prior knowledge of rigorous statistical mechanics or stochastic processes. Keywords: Gibbs measure; phase transition; relaxation time; log-Sobolev inequality.
Contents
0. Introduction
0.1 What is relaxation phenomenon?
0.2 Some notations in probability
1. Gibbs Measure
1.1 Gibbs measure
1.2 The phase transition
2. Glauber Dynamics
2.1 The case of the Ising model
2.2 The case of the lattice φ4-field
2.3 Relaxation time
3. Relaxation in the Unique Phase Region
3.1 The equivalence of the decay of correlation and the fast relaxation
3.2 Sufficient conditions for (DC), (SG) or (LS)
3.3 Temperatures slightly above 1/βc, or small non-zero magnetic fields
4. Relaxation in the Phase-Coexistence Region
4.1 Some general observations
4.2 The free boundary condition
4.3 The plus boundary condition
A. Appendix
A.1 Proof of Eq. (2.7)
A.2 Brownian motion and Eq. (2.10)
A.3 Proof of Eq. (2.14)
A.4 Proof of Proposition 2.3
0. Introduction

0.1. What is relaxation phenomenon?

Imagine a piece of iron from the microscopic point of view. The piece of iron is then a collection of a large number of atoms randomly interacting with each other through the spinning of their electrons. According to equilibrium statistical mechanics, the statistics of the spins is governed by a probability distribution called the Gibbs measure, which is supposed to describe the distribution of a large number of random objects in equilibrium. As we discuss later in detail, the Gibbs measure works as a mathematical tool to formulate the notion of phase transition.

On the other hand, there is a rather different way of looking at the Gibbs measure. The Gibbs measure is understood as the stationary measure for relaxation processes. Here, a relaxation process means the way a physical system goes back to its equilibrium after being disturbed by some external factor (e.g. temporary exposure of the piece of iron to an external magnetic field). Relaxation processes are a familiar aspect of nature which we experience through diffusion of particles, heat conduction, etc. An interesting thing from the physical/mathematical point of view is that the phase transition and the relaxation are roughly related as follows:
\[ \text{unique phase} \leftrightarrow \text{fast relaxation} \tag{0.1} \]
\[ \text{phase coexistence} \leftrightarrow \text{slow relaxation} \tag{0.2} \]
The purpose of this article is to present some theorems which make (0.1) and (0.2) clearer in the mathematical framework. The readers are not required to be experienced in statistical mechanics or stochastic processes, although knowledge in these fields would certainly make the reading easier. We have tried to make the exposition as clear as possible without going into technicalities. More complete lecture notes can be found in [1, 2].

0.2. Some notations in probability

Let $(X, \mathcal{A})$ be a measurable space and $\mu$ be a probability measure on $(X, \mathcal{A})$.
• The expectation of a function $f \in L^1(\mu)$ is denoted by $\mu(f) = \int_X f \, d\mu$.
• The correlation (or covariance) of $f, g \in L^2(\mu)$ is denoted by $\mu(f; g) = \mu((f - \mu(f))(g - \mu(g)))$.

1. Gibbs Measure

1.1. Gibbs measure

We would like to describe a physical system in which a large number of particles interact with each other. In the models we will consider here, the state of a particle
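at a position $x$ is denoted by $\sigma_x$. We will then introduce the Gibbs measure as a probability measure which governs the random variable $(\sigma_x)_{x \in \Lambda}$, where $\Lambda$ is a large set. We will discuss two different models (the Ising model and the lattice $\phi^4$-field) at the same time.

Position of a particle: The position $x$ of a particle is supposed to be a point in the $d$-dimensional cubic lattice
\[ \mathbb{Z}^d = \{x = (x_i)_{i=1}^d ;\ x_i \in \mathbb{Z}\}. \]
$\mathbb{Z}^d$ is endowed with the $\ell^1$-distance: $\|x\|_1 = \sum_{i=1}^d |x_i|$. The notation $\Lambda \subset\subset \mathbb{Z}^d$ means that $\Lambda$ is a non-empty, finite set in $\mathbb{Z}^d$. For $\Lambda \subset \mathbb{Z}^d$, the interior boundary $\partial_{in}\Lambda$ and the exterior boundary $\partial_{ex}\Lambda$ are defined by
\[ \partial_{in}\Lambda = \{x \in \Lambda ;\ \|x - y\|_1 = 1 \text{ for some } y \in \mathbb{Z}^d \setminus \Lambda\}, \tag{1.1} \]
\[ \partial_{ex}\Lambda = \{y \in \mathbb{Z}^d \setminus \Lambda ;\ \|x - y\|_1 = 1 \text{ for some } x \in \Lambda\}. \tag{1.2} \]

State $\sigma_x$ of a particle: The state of a particle is expressed as a value in a set $S$. In the sequel, the set $S$ is either a two-point set $\{-1, +1\}$ (the case of the Ising model) or the real line $\mathbb{R}$ (the case of the lattice $\phi^4$-field). We introduce the configuration spaces as follows:
\[ S^\Lambda = \{\sigma = (\sigma_x)_{x \in \Lambda} ;\ \sigma_x \in S\}, \quad \Lambda \subset\subset \mathbb{Z}^d, \]
\[ S^{\mathbb{Z}^d} = \{\eta = (\eta_y)_{y \in \mathbb{Z}^d} ;\ \eta_y \in S\}. \]
The way these configurations are used is as follows. We consider some large $\Lambda \subset\subset \mathbb{Z}^d$ and specify the spin configuration outside $\Lambda$ by $\eta \in S^{\mathbb{Z}^d}$, so that only $(\eta_y)_{y \notin \Lambda}$ (more exactly, only $(\eta_y)_{y \in \partial_{ex}\Lambda}$) is used in doing so. We then discuss the statistical property of $\sigma = (\sigma_x)_{x \in \Lambda}$. The configuration $\eta \in S^{\mathbb{Z}^d}$ which appears in the way described above is called the boundary condition.

Interaction among particles: Suppose that $\Lambda \subset\subset \mathbb{Z}^d$ and that the spin configuration of the particles on $\mathbb{Z}^d \setminus \Lambda$ is specified by $\eta \in S^{\mathbb{Z}^d}$. We define the Hamiltonian $H^{\Lambda,\eta} : S^\Lambda \to \mathbb{R}$ as follows:
\[ H^{\Lambda,\eta}(\sigma) = \frac{1}{2} \sum_{\substack{\{x,y\} \subset \Lambda \\ \|x-y\|_1 = 1}} \beta(\sigma_x - \sigma_y)^2 + \frac{1}{2} \sum_{\substack{x \in \Lambda,\ y \notin \Lambda \\ \|x-y\|_1 = 1}} \beta(\sigma_x - \eta_y)^2 + \sum_{x \in \Lambda} \beta h \sigma_x. \tag{1.3} \]
Here $\beta > 0$ and $h \ge 0$ are parameters, which are usually called the inverse temperature and the external magnetic field. Note that only $(\eta_y)_{y \in \partial_{ex}\Lambda}$ is relevant in (1.3).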
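To make (1.1)–(1.3) concrete, here is a minimal sketch that evaluates the boundaries and the Hamiltonian on a small box. Sites are represented as integer tuples and the boundary condition as a function on sites; these representations, the function names, and the sample box are choices made for this illustration. The three sums are coded term by term as (1.3) is printed above, including the sign of the field term.

```python
from itertools import product


def neighbors(x):
    """Sites of Z^d at l1-distance 1 from x (x is a tuple of ints)."""
    for i in range(len(x)):
        for step in (-1, 1):
            yield x[:i] + (x[i] + step,) + x[i + 1:]


def exterior_boundary(Lam):
    """Sites outside Lam with a neighbor in Lam, as in (1.2)."""
    return {y for x in Lam for y in neighbors(x) if y not in Lam}


def interior_boundary(Lam):
    """Sites of Lam with a neighbor outside Lam, as in (1.1)."""
    return {x for x in Lam if any(y not in Lam for y in neighbors(x))}


def hamiltonian(sigma, eta, beta, h):
    """Evaluate (1.3) as printed.

    sigma: dict mapping each site of Lambda to its spin.
    eta:   function giving the boundary spin at a site outside Lambda.
    """
    inner = 0.0
    outer = 0.0
    for x, sx in sigma.items():
        for y in neighbors(x):
            if y in sigma:
                if x < y:  # count each unordered pair {x, y} once
                    inner += beta * (sx - sigma[y]) ** 2
            else:
                outer += beta * (sx - eta(y)) ** 2
    field = sum(beta * h * s for s in sigma.values())
    return 0.5 * inner + 0.5 * outer + field


# Hypothetical example: the 2 x 2 box in Z^2 with all spins +1.
box = set(product(range(2), repeat=2))
all_plus = {x: 1 for x in box}
energy_plus_boundary = hamiltonian(all_plus, lambda y: 1, beta=1.0, h=0.0)
energy_minus_boundary = hamiltonian(all_plus, lambda y: -1, beta=1.0, h=0.0)
```

Only the values of `eta` on the exterior boundary are ever queried, which is the remark made after (1.3).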
Gibbs measure: To define two models (the Ising model and the lattice $\phi^4$-field) at the same time, we endow $S$ with a probability measure $\nu$ in two different ways depending on whether $S = \{-1, +1\}$ (Ising model) or $S = \mathbb{R}$ (lattice $\phi^4$-field).
• For the Ising model, we take
\[ S = \{-1, +1\}, \quad \nu(\{+1\}) = \nu(\{-1\}) = \frac{1}{2}. \tag{1.4} \]
• For the lattice $\phi^4$-field, we take
\[ S = \mathbb{R}, \quad \nu(ds) = \exp(-U(s))\,ds \Big/ \int_{\mathbb{R}} \exp(-U(s))\,ds, \tag{1.5} \]
where $U(s) = (s^2 - 1)^2$ and $ds$ is the Lebesgue measure; see Remark 1.2 below.

Suppose that $\Lambda \subset\subset \mathbb{Z}^d$ and that the spin configuration of the particles on $\mathbb{Z}^d \setminus \Lambda$ is specified by $\eta \in S^{\mathbb{Z}^d}$. We define a probability measure $\mu^{\Lambda,\eta}$ on the configuration space $S^\Lambda$ by
\[ \mu^{\Lambda,\eta}(d\sigma) = \frac{\exp(-H^{\Lambda,\eta}(\sigma))}{Z^{\Lambda,\eta}} \prod_{x \in \Lambda} \nu(d\sigma_x), \tag{1.6} \]
where $Z^{\Lambda,\eta}$ is the normalizing constant. We will henceforth refer to $\mu^{\Lambda,\eta}$ as the finite volume Gibbs measure on $\Lambda$ with the boundary condition $\eta$.
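For the Ising model on a very small volume, (1.6) can be computed by brute force. The sketch below assumes $d = 1$ and $\Lambda = \{0, \ldots, n-1\}$, so that the boundary condition reduces to two spins, one on each side; the parameter values and names are hypothetical. Since $\nu$ is uniform on $\{-1, +1\}$, the factor $2^{-n}$ from the product of the $\nu(d\sigma_x)$ cancels in the normalization and is omitted.

```python
import math
from itertools import product


def hamiltonian(sigma, eta_left, eta_right, beta, h):
    """(1.3) for the chain Lambda = {0, ..., n-1} in Z (d = 1)."""
    inner = sum((a - b) ** 2 for a, b in zip(sigma, sigma[1:]))
    outer = (sigma[0] - eta_left) ** 2 + (sigma[-1] - eta_right) ** 2
    return 0.5 * beta * (inner + outer) + beta * h * sum(sigma)


def gibbs_measure(n, eta_left, eta_right, beta, h):
    """Finite volume Gibbs measure (1.6) as a dict: configuration -> probability."""
    configs = list(product((-1, 1), repeat=n))
    weights = [math.exp(-hamiltonian(s, eta_left, eta_right, beta, h))
               for s in configs]
    Z = sum(weights)  # normalizing constant (uniform nu factor cancelled)
    return {s: w / Z for s, w in zip(configs, weights)}


def expectation(mu, f):
    """mu(f) for a function f of the configuration."""
    return sum(p * f(s) for s, p in mu.items())


# Mean of the spin at site 0 under plus and minus boundary conditions.
mu_plus = gibbs_measure(4, +1, +1, beta=0.7, h=0.0)
mu_minus = gibbs_measure(4, -1, -1, beta=0.7, h=0.0)
m_plus = expectation(mu_plus, lambda s: s[0])
m_minus = expectation(mu_minus, lambda s: s[0])
```

At $h = 0$ the two means are opposite to each other, because flipping all spins together with the boundary condition leaves (1.3) unchanged.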
Remark 1.1. Though we discuss only finite volume Gibbs measures in this article, it is possible to define infinite volume Gibbs measures ($\Lambda = \mathbb{Z}^d$) as is done in mathematical textbooks in statistical mechanics, e.g. [3, 4]. Roughly speaking, an infinite volume Gibbs measure $\mu$ is obtained as a limit of finite volume ones (1.6) as the set $\Lambda$ expands to $\mathbb{Z}^d$ along a suitable sequence $\{\Lambda_n\}_{n \ge 1}$. The situation here is not simple because of the fact that the limit may depend on the choice of the boundary condition $\eta$ and even on the way $\{\Lambda_n\}_{n \ge 1}$ is chosen. This uniqueness problem is fundamental in discussing infinite volume Gibbs measures. In this respect, see also Remark 1.4.

Remark 1.2. We can take a much more general function as $U(s)$ in (1.5). For example, it is enough to assume the following condition: for any $m > 0$, there exist $V, W \in C^\infty(\mathbb{R} \to \mathbb{R})$ such that
\[ U(s) = V(s) + W(s) \quad \text{for all } s \in \mathbb{R}, \tag{1.7} \]
\[ \inf_s V''(s) \ge m, \tag{1.8} \]
\[ \|W\|_\infty + \|W'\|_\infty < \infty, \tag{1.9} \]
where $\|W\|_\infty = \sup_s |W(s)|$. A typical example of $U$ with these requirements is given by the following polynomial:
\[ U(s) = \sum_{\nu=1}^{N} a_{2\nu} s^{2\nu} + a_1 s, \tag{1.10} \]
where $N \ge 2$, $a_1, a_2 \in \mathbb{R}$, $a_4 \ge 0, \ldots, a_{2(N-1)} \ge 0$ and $a_{2N} > 0$. Since $a_2$ can be a large negative value, the polynomial $U$ in (1.10) may have arbitrarily deep double wells. Here, we have decided to choose one of the simplest non-trivial examples, $U(s) = (s^2 - 1)^2$, only for the sake of simplicity.

1.2. The phase transition

One of the greatest advantages of introducing the Gibbs measure is that it works as a mathematical tool to capture the notion of phase transition. In fact, if the situation of a piece of iron described at the beginning of this article is properly formulated, one can show that the system exhibits phase transition. We begin by explaining it conceptually. Let us first simplify the picture and regard each atom as a small magnetic dipole which represents the spinning of the electron. Each dipole (or spin) can point up (value $+1$) or down (value $-1$) with a certain randomness due to thermal fluctuation. On the other hand, two spins at neighboring atoms tend to point in the same direction to decrease the interaction energy. Then, the phase transition can be explained as follows:
• If the temperature is sufficiently high, the system is entirely dominated by thermal fluctuation and only one equilibrium, called the disordered phase, can be realized, in which spins point up and down almost independently of each other.
• On the other hand, if the temperature is sufficiently low, the thermal fluctuation is suppressed by the interaction energy which tries to align the spins. As a result, there are (at least) two equilibrium states called pure phases; a great majority of spins point upwards in one of these pure phases, while the opposite situation can be seen in the other pure phase.

We now formulate the notion of phase transition more precisely using the Gibbs measure. We discuss the Ising model and the lattice $\phi^4$-field at the same time. We start with the following proposition. The value $m_\pm(\beta, h)$ introduced there can be understood as the average value of the spin with respect to the two “pure phases” alluded to above.

Proposition 1.1. There exist a function $(\beta, h) \mapsto m_\pm(\beta, h)$ and boundary conditions $\{\eta^+, \eta^-\} \subset S^{\mathbb{Z}^d}$ such that
\[ \lim_{\Lambda \nearrow \mathbb{Z}^d} \mu^{\Lambda,\eta}(\sigma_0) = \begin{cases} m_+(\beta, h) & \text{if } \eta \ge \eta^+, \\ m_-(\beta, h) & \text{if } \eta \le \eta^-. \end{cases} \tag{1.11} \]
For the Ising model, it is enough to take $\eta_x^\pm \equiv \pm 1$ (pure boundary conditions). For the lattice $\phi^4$-field, it is shown in [5] that (1.11) holds, for example, if $\eta_x^+ \ge (\ln(2 + |x|))^{\frac{1+\varepsilon}{2}}$ and $\eta_x^- \le -(\ln(2 + |x|))^{\frac{1+\varepsilon}{2}}$ for all $x \in \mathbb{Z}^d$.

Definition 1.1. We say that the phase is unique when $m_+(\beta, h) = m_-(\beta, h)$. We say that the phases coexist when $m_+(\beta, h) \ne m_-(\beta, h)$.
The phase transition is then described as follows:

Theorem 1.1.
• If $h > 0$, then $m_+(\beta, h) = m_-(\beta, h) > 0$.
• If $h = 0$, then there is $\beta_c \in (0, \infty]$ ($\beta_c = \infty \Leftrightarrow d = 1$) such that
\[ \beta < \beta_c \Rightarrow m_+(\beta, 0) = m_-(\beta, 0) = 0, \]
\[ \beta_c < \beta \Rightarrow m_+(\beta, 0) = -m_-(\beta, 0) > 0. \]
The number $\beta_c$ is called the critical inverse temperature.

Remark 1.3. It is believed that $m_+(\beta_c, 0) = m_-(\beta_c, 0) = 0$. For the Ising model this is proven for $d = 2$ (see (1.12) below) and for $d \ge 4$ [6, (1.9), (1.13)]. The critical inverse temperature and the magnetization for the two-dimensional Ising model are explicitly known:
\[ \sinh 2\beta_c = 1 \quad \text{and} \quad m_+(\beta, 0) = \left(1 - (\sinh 2\beta)^{-4}\right)^{1/8} \ \text{for } \beta \ge \beta_c. \tag{1.12} \]
See [7] and the references therein.
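As a small numerical illustration of (1.12), the sketch below solves $\sinh 2\beta_c = 1$ in closed form and evaluates the magnetization. The function name is chosen here; the value $0$ returned below $\beta_c$ is taken from Theorem 1.1, not from (1.12).

```python
import math

# sinh(2 * beta_c) = 1  <=>  beta_c = asinh(1) / 2 = ln(1 + sqrt(2)) / 2
BETA_C = 0.5 * math.asinh(1.0)


def magnetization_2d(beta: float) -> float:
    """m_+(beta, 0) for the two-dimensional Ising model.

    For beta > beta_c this is formula (1.12); for beta <= beta_c the
    value 0 is used (Theorem 1.1, and (1.12) at beta = beta_c).
    """
    if beta <= BETA_C:
        return 0.0
    # max(0, .) guards against a tiny negative rounding error near beta_c
    return max(0.0, 1.0 - math.sinh(2.0 * beta) ** -4) ** 0.125
```

The exponent $1/8$ makes the magnetization rise very steeply just above $\beta_c$, so values close to 1 are reached quickly as $\beta$ grows.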
Remark 1.4. For the boundary conditions $\{\eta^+, \eta^-\} \subset S^{\mathbb{Z}^d}$ in Proposition 1.1, not only the limit (1.11), but also infinite volume Gibbs measures $\mu_\pm$ called “pure phases” exist as the limits (in the weak topology): $\mu_+ = \lim_{\Lambda \nearrow \mathbb{Z}^d} \mu^{\Lambda,\eta}$ if $\eta \ge \eta^+$, and $\mu_- = \lim_{\Lambda \nearrow \mathbb{Z}^d} \mu^{\Lambda,\eta}$ if $\eta \le \eta^-$. Then, the phase coexistence discussed in Theorem 1.1 can be equivalently stated as $\mu_+ \ne \mu_-$. In fact, it might be more common to capture the notion of the phase transition in terms of the infinite volume Gibbs measures. However, we have chosen to formulate the phase transition referring only to finite volume Gibbs measures to make the exposition simpler.

2. Glauber Dynamics

We now formulate a notion of relaxation as the random time evolution called Glauber dynamics. There are at least two approaches to describing the time evolution.

Probabilistic definition: In this approach, time evolution is described as a family of $S^\Lambda$-valued random variables $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$ indexed by time $t$, that is, as a stochastic process. Here, we consider a continuous time parameter $t \in [0, \infty)$ rather than a discrete one $t \in \{0, 1, \ldots\}$. Each $\sigma_t^{\Lambda,\eta}$ here is a random configuration observed at time $t$. As will be explained below, the time evolution is a continuous-time Markov chain with values in $S^\Lambda$. The relaxation to the equilibrium (= Gibbs measure) is then described as:
\[ \lim_{t \nearrow \infty} P(\sigma_t^{\Lambda,\eta} = \sigma) = \mu^{\Lambda,\eta}(\{\sigma\}), \quad \sigma \in S^\Lambda, \tag{2.1} \]
where $P$ denotes a probability on a measurable space on which the process $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$ is defined.
Analytic definition: In this approach, we look at the averaged quantity over the stochastic process alluded to above, rather than the stochastic process itself. More precisely, we look at
\[ (T_t^{\Lambda,\eta} f)(\sigma) = P[f(\sigma_t^{\Lambda,\eta}) \mid \sigma_0^{\Lambda,\eta} = \sigma] \tag{2.2} \]
for $f : S^\Lambda \to \mathbb{R}$, where the right-hand side is the conditional expectation given the time-zero configuration $\sigma$. Thus, what we observe in this approach is more like “heat conduction” caused by the motion of the particles. As is explained below, it is possible to describe the quantity on the left-hand side of (2.2) in terms of the Hamiltonian, without referring to the stochastic process $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$. In this approach, (2.1) is rephrased as:
\[ \lim_{t \to \infty} (T_t^{\Lambda,\eta} f)(\sigma) = \mu^{\Lambda,\eta}(f), \quad \sigma \in S^\Lambda. \tag{2.3} \]
In what follows, we will present both probabilistic and analytic definitions of the Glauber dynamics.

2.1. The case of the Ising model

Probabilistic definition of the Glauber dynamics: We need some notations. $C_\Lambda$ = the set of all real functions on $S^\Lambda$. For $f \in C_\Lambda$,
\[ \nabla_x f(\sigma) = f(\sigma^x) - f(\sigma), \quad x \in \Lambda, \]
where
\[ \sigma^x_y = \begin{cases} -\sigma_y & \text{if } y = x, \\ \sigma_y & \text{if } y \ne x. \end{cases} \]
The Glauber dynamics evolves in time by a series of such spin flips $\sigma \mapsto \sigma^x$. The flip rate $c_x^{\Lambda,\eta}(\sigma)$ is defined by
\[ c_x^{\Lambda,\eta}(\sigma) = \exp\left(-\frac{1}{2} \nabla_x H^{\Lambda,\eta}(\sigma)\right). \tag{2.4} \]
The evolution of the dynamics $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$ which starts from a configuration $\sigma_0^{\Lambda,\eta} = \sigma$ can be described as follows:
• The first spin flip occurs at a random time $T_\sigma$, where $T_\sigma$ is an exponentially distributed random variable with the expectation given by the inverse of $C^{\Lambda,\eta}(\sigma) = \sum_{x \in \Lambda} c_x^{\Lambda,\eta}(\sigma)$, that is,
\[ P(T_\sigma \in dt) = C^{\Lambda,\eta}(\sigma)\, e^{-t C^{\Lambda,\eta}(\sigma)}\, dt. \]
• A spin flip $\sigma \mapsto \sigma^x$ is implemented at time $T_\sigma$, where the site $x$ is chosen with the probability $c_x^{\Lambda,\eta}(\sigma) / C^{\Lambda,\eta}(\sigma)$.
• After the first flip, continue the same procedure independently of the past, with $\sigma^x$ as the new starting configuration.

The time evolution of $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$ described above is nothing but the continuous-time Markov chain with the flip rate $c_x^{\Lambda,\eta}(\sigma)$. The time evolution $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$ and the Gibbs measure $\mu^{\Lambda,\eta}$ are then related as follows:
\[ \int \mu^{\Lambda,\eta}(d\sigma')\, P[\sigma_t^{\Lambda,\eta} = \sigma \mid \sigma_0^{\Lambda,\eta} = \sigma'] = \mu^{\Lambda,\eta}(\{\sigma\}). \tag{2.5} \]
This amounts to saying that the Gibbs measure is a stationary measure of the Markov chain $(\sigma_t^{\Lambda,\eta})_{t \ge 0}$. Since the Markov chain is irreducible in the present case, $\mu^{\Lambda,\eta}$ is in fact the unique stationary measure, and the relaxation to the equilibrium (2.1) follows from a well-known convergence theorem for Markov chains (see e.g. [8, p. 65]).

Analytic definition of the Glauber dynamics: The generator of the Glauber dynamics is defined by
\[ A^{\Lambda,\eta} f(\sigma) = \sum_{x \in \Lambda} c_x^{\Lambda,\eta}(\sigma) \nabla_x f(\sigma), \quad f \in C_\Lambda. \tag{2.6} \]
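The three-step description of the Ising dynamics can be turned into a short simulation. The sketch below assumes a chain $\Lambda = \{0, \ldots, n-1\}$ in $d = 1$ with hypothetical parameter values. It draws the exponential waiting time with rate $C^{\Lambda,\eta}(\sigma)$, picks the site with probability $c_x^{\Lambda,\eta}(\sigma)/C^{\Lambda,\eta}(\sigma)$, and also checks the reversibility identity $e^{-H(\sigma)} c_x(\sigma) = e^{-H(\sigma^x)} c_x(\sigma^x)$, which follows from (2.4) and is one standard way to see the stationarity (2.5).

```python
import math
import random
from itertools import product

# Hypothetical parameters for a chain Lambda = {0, ..., n-1} in d = 1.
BETA, H_FIELD, ETA_LEFT, ETA_RIGHT = 0.5, 0.0, 1, 1


def hamiltonian(sigma):
    """(1.3) on the chain, with boundary spins ETA_LEFT and ETA_RIGHT."""
    inner = sum((a - b) ** 2 for a, b in zip(sigma, sigma[1:]))
    outer = (sigma[0] - ETA_LEFT) ** 2 + (sigma[-1] - ETA_RIGHT) ** 2
    return 0.5 * BETA * (inner + outer) + BETA * H_FIELD * sum(sigma)


def flip(sigma, x):
    """The configuration sigma^x: spin at site x reversed."""
    return sigma[:x] + (-sigma[x],) + sigma[x + 1:]


def flip_rate(sigma, x):
    """c_x(sigma) = exp(-(1/2) (H(sigma^x) - H(sigma))), as in (2.4)."""
    return math.exp(-0.5 * (hamiltonian(flip(sigma, x)) - hamiltonian(sigma)))


def glauber_step(sigma, rng):
    """One jump of the chain: returns (waiting time, new configuration)."""
    rates = [flip_rate(sigma, x) for x in range(len(sigma))]
    total = sum(rates)                      # C(sigma)
    waiting_time = rng.expovariate(total)   # exponential with mean 1 / C(sigma)
    x = rng.choices(range(len(sigma)), weights=rates)[0]
    return waiting_time, flip(sigma, x)


def detailed_balance_defect(n):
    """Largest violation of exp(-H(s)) c_x(s) = exp(-H(s^x)) c_x(s^x)."""
    worst = 0.0
    for sigma in product((-1, 1), repeat=n):
        for x in range(n):
            lhs = math.exp(-hamiltonian(sigma)) * flip_rate(sigma, x)
            flipped = flip(sigma, x)
            rhs = math.exp(-hamiltonian(flipped)) * flip_rate(flipped, x)
            worst = max(worst, abs(lhs - rhs))
    return worst
```

A trajectory is obtained by calling `glauber_step` repeatedly with a `random.Random` instance and accumulating the waiting times.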
Since $A^{\Lambda,\eta}$ is a linear map on the finite dimensional vector space $C_\Lambda$, $A^{\Lambda,\eta}$ may be regarded as a matrix. It is not difficult (cf. Sec. A.1) to see that
\[ E^{\Lambda,\eta}(f, g) \overset{\text{def.}}{=} -\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g) = \frac{1}{2} \sum_{x \in \Lambda} \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta} \nabla_x f \nabla_x g). \tag{2.7} \]
We therefore see the following:

Proposition 2.1. The operator $A^{\Lambda,\eta} : L^2(\mu^{\Lambda,\eta}) \to L^2(\mu^{\Lambda,\eta})$ is symmetric and negative semi-definite. Moreover, the eigenspace for the eigenvalue zero consists only of constant functions.

The quadratic form defined by (2.7) is called the Dirichlet form of the Glauber dynamics. The semi-group generated by $A^{\Lambda,\eta}$ is denoted by $(T_t^{\Lambda,\eta})_{t \ge 0}$,
\[ T_t^{\Lambda,\eta} = \exp(t A^{\Lambda,\eta}), \quad t > 0. \tag{2.8} \]
This is again identified with a special case of the exponential of a matrix: $\exp(A) = \sum_{p \ge 0} A^p / p!$. For $f \in C_\Lambda$, $u(\sigma, t) = T_t^{\Lambda,\eta} f(\sigma)$ can be characterized by the following initial value problem:
\[ \frac{\partial}{\partial t} u(\sigma, t) = A^{\Lambda,\eta} u(\sigma, t), \quad t > 0, \qquad u(\sigma, 0) = f(\sigma), \quad t = 0. \tag{2.9} \]
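2.2. The case of the lattice $\phi^4$-field

Probabilistic definition of the Glauber dynamics: For $f \in C^2(\mathbb{R}^\Lambda)$, we introduce the following notations:
\[ \nabla_x f(\sigma) = \frac{\partial}{\partial \sigma_x} f(\sigma), \quad \Delta_x f(\sigma) = \left(\frac{\partial}{\partial \sigma_x}\right)^2 f(\sigma), \quad x \in \Lambda. \]
We then define
\[ C_\Lambda = \{f \in C^\infty(\mathbb{R}^\Lambda) ;\ \nabla_x f \ (x \in \Lambda) \text{ are bounded}\}. \]
For the lattice $\phi^4$-field, the Glauber dynamics $\sigma_t^{\Lambda,\eta} = (\sigma_{t,x}^{\Lambda,\eta})_{x \in \Lambda} \in \mathbb{R}^\Lambda$ is introduced as the solution to the following stochastic differential equation:
\[ \sigma_{t,x}^{\Lambda,\eta} = \sigma_{0,x}^{\Lambda,\eta} + B_{t,x} - \frac{1}{2} \int_0^t \nabla_x \tilde{H}^{\Lambda,\eta}(\sigma_s^{\Lambda,\eta})\, ds, \tag{2.10} \]
where
\[ \tilde{H}^{\Lambda,\eta}(\sigma) = H^{\Lambda,\eta}(\sigma) + \sum_{x \in \Lambda} U(\sigma_x) \tag{2.11} \]
and $B_t = (B_{t,x})_{x \in \Lambda}$ is a $|\Lambda|$-dimensional standard Brownian motion (see Sec. A.2). Knowledge of Brownian motion would be helpful for a firm grasp of the mathematics behind (2.10). However, almost no knowledge of Brownian motion is needed here to roughly capture the meaning of (2.10). In fact, (2.10) is just a random perturbation of the following deterministic integral equation:
\[ \sigma_{t,x}^{\Lambda,\eta} = \sigma_{0,x}^{\Lambda,\eta} - \frac{1}{2} \int_0^t \nabla_x \tilde{H}^{\Lambda,\eta}(\sigma_s^{\Lambda,\eta})\, ds. \tag{2.12} \]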
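Equations (2.10) and (2.12) can be explored with a simple time discretization. The sketch below uses the Euler–Maruyama scheme, which is a standard numerical method chosen here and not something the text prescribes, on a chain in $d = 1$ with hypothetical parameter values and $U(s) = (s^2 - 1)^2$. Passing no random generator drops the Brownian increment and gives an Euler scheme for the deterministic equation (2.12).

```python
import math
import random

# Hypothetical parameters for a chain Lambda = {0, ..., n-1} in d = 1.
BETA, H_FIELD, ETA_LEFT, ETA_RIGHT = 0.5, 0.0, 1.0, 1.0


def U(s):
    """Single-site potential U(s) = (s^2 - 1)^2."""
    return (s * s - 1.0) ** 2


def dU(s):
    """Derivative U'(s) = 4 s (s^2 - 1)."""
    return 4.0 * s * (s * s - 1.0)


def h_tilde(sigma):
    """H~ = H + sum_x U(sigma_x), as in (2.11), with H from (1.3) on the chain."""
    inner = sum((a - b) ** 2 for a, b in zip(sigma, sigma[1:]))
    outer = (sigma[0] - ETA_LEFT) ** 2 + (sigma[-1] - ETA_RIGHT) ** 2
    return (0.5 * BETA * (inner + outer) + BETA * H_FIELD * sum(sigma)
            + sum(U(s) for s in sigma))


def grad_h_tilde(sigma):
    """Partial derivatives of H~ with respect to each sigma_x."""
    n = len(sigma)
    grad = []
    for x in range(n):
        left = sigma[x - 1] if x > 0 else ETA_LEFT
        right = sigma[x + 1] if x < n - 1 else ETA_RIGHT
        grad.append(BETA * (2.0 * sigma[x] - left - right)
                    + BETA * H_FIELD + dU(sigma[x]))
    return grad


def euler_maruyama_step(sigma, dt, rng=None):
    """One time step of size dt for (2.10); rng=None gives a step for (2.12)."""
    grad = grad_h_tilde(sigma)
    if rng is None:
        noise = [0.0] * len(sigma)
    else:
        # Brownian increment over a time dt: Gaussian with variance dt
        noise = [rng.gauss(0.0, math.sqrt(dt)) for _ in sigma]
    return [s + b - 0.5 * dt * g for s, b, g in zip(sigma, noise, grad)]
```

Without noise the scheme is a gradient descent on $\tilde{H}^{\Lambda,\eta}$, which is the sense in which (2.10) is a random perturbation of (2.12).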
Analytic definition of the Glauber dynamics: The generator of the Glauber dynamics is defined by
\[ A^{\Lambda,\eta} f(\sigma) = \frac{1}{2} \sum_{x \in \Lambda} \Delta_x f(\sigma) - \frac{1}{2} \sum_{x \in \Lambda} \nabla_x \tilde{H}^{\Lambda,\eta}(\sigma) \nabla_x f(\sigma), \quad f \in C_\Lambda. \tag{2.13} \]
It is not difficult (cf. Sec. A.3) to see that
\[ E^{\Lambda,\eta}(f, g) \overset{\text{def.}}{=} -\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g) = \frac{1}{2} \sum_{x \in \Lambda} \mu^{\Lambda,\eta}(\nabla_x f \nabla_x g). \tag{2.14} \]
As in the Ising model case, we see from (2.14) that a statement similar to Proposition 2.1 is true for $A^{\Lambda,\eta} : L^2(\mu^{\Lambda,\eta}) \to L^2(\mu^{\Lambda,\eta})$ defined on $C_\Lambda$. However, unlike the Ising model case, the operator $A^{\Lambda,\eta}$ here is not bounded on $L^2(\mu^{\Lambda,\eta})$. For the associated semi-group to be defined rigorously, $A^{\Lambda,\eta}$ therefore needs a property called self-adjointness. (Readers who do not care about the rigorous construction of the semi-group can skip this point.) For this reason, we will extend the operator $A^{\Lambda,\eta}$ to a larger domain $\mathrm{Dom}(A^{\Lambda,\eta})$ on which it is self-adjoint, that is, the following hold:
(i) $f \in \mathrm{Dom}(A^{\Lambda,\eta})$ if and only if
\[ \sup\{|\mu^{\Lambda,\eta}(f A^{\Lambda,\eta} g)| ;\ g \in \mathrm{Dom}(A^{\Lambda,\eta}),\ \|g\|_{L^2(\mu^{\Lambda,\eta})} \le 1\} < \infty, \]
(ii) µΛ,η (f AΛ,η g) = µΛ,η (gAΛ,η f ) ,
for f, g ∈ Dom(AΛ,η ) .
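We state this procedure as:

Proposition 2.2. Let $\mathrm{Dom}(A^{\Lambda,\eta})$ be the set of $f \in L^2(\mu^{\Lambda,\eta})$ for which there is a sequence $\{f_n\}_{n\ge 1} \subset C_\Lambda$ such that
\[
\lim_{n\nearrow\infty} \|f - f_n\|_{L^2(\mu^{\Lambda,\eta})} = 0 \quad\text{and}\quad \lim_{m,n\nearrow\infty} \|A^{\Lambda,\eta}(f_m - f_n)\|_{L^2(\mu^{\Lambda,\eta})} = 0 . \tag{2.15}
\]
(a) For $f \in \mathrm{Dom}(A^{\Lambda,\eta})$ and a sequence $\{f_n\}_{n\ge 1} \subset C_\Lambda$ with the property (2.15), the following $L^2(\mu^{\Lambda,\eta})$-limit
\[
A^{\Lambda,\eta} f \stackrel{\rm def.}{=} \lim_{n\nearrow\infty} A^{\Lambda,\eta} f_n \tag{2.16}
\]
is independent of the choice of the sequence $\{f_n\}_{n\ge 1}$ and hence defines a linear operator on $L^2(\mu^{\Lambda,\eta})$ with $\mathrm{Dom}(A^{\Lambda,\eta})$ as its domain of definition.
(b) The operator $A^{\Lambda,\eta} : L^2(\mu^{\Lambda,\eta}) \to L^2(\mu^{\Lambda,\eta})$ defined by (2.16) is self-adjoint and negative semi-definite on $\mathrm{Dom}(A^{\Lambda,\eta})$. Moreover, the eigenspace for the eigenvalue zero consists only of constant functions.

The quadratic form defined by (2.14) is called the Dirichlet form of the Glauber dynamics. The semi-group generated by $A^{\Lambda,\eta}$ is denoted by $(T_t^{\Lambda,\eta})_{t\ge 0}$,
\[
T_t^{\Lambda,\eta} = \exp(t A^{\Lambda,\eta}) , \qquad t > 0 . \tag{2.17}
\]
Thanks to the self-adjointness (and the negative semi-definiteness) of $A^{\Lambda,\eta}$, the semi-group can be constructed by the spectral decomposition [9, p. 235]. For $f \in L^2(\mu^{\Lambda,\eta})$, $u(\sigma, t) = T_t^{\Lambda,\eta} f(\sigma)$ can be characterized by the following initial value problem:
\[
\begin{cases}
\dfrac{\partial}{\partial t} u(\sigma, t) = A^{\Lambda,\eta} u(\sigma, t) , & t > 0 , \\[1ex]
u(\sigma, 0) = f(\sigma) , & t = 0 .
\end{cases} \tag{2.18}
\]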
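2.3. Relaxation time

Proposition 2.3. Fix $\Lambda \subset\subset \mathbb{Z}^d$ and a boundary condition $\eta \in S^{\mathbb{Z}^d}$. Then, the constants $\gamma \in (0, \infty)$ described in the following two statements are the same.
(a) The smallest $\gamma \in (0, \infty)$ such that the following inequality (Poincaré inequality) holds for all $f \in C_\Lambda$:
\[
\mu^{\Lambda,\eta}(f ; f) \le \gamma\, \mathcal{E}^{\Lambda,\eta}(f, f) . \tag{2.19}
\]
(b) The smallest $\gamma \in (0, \infty)$ such that
\[
\|T_t^{\Lambda,\eta} f - \mu^{\Lambda,\eta} f\|_{L^2(\mu^{\Lambda,\eta})} \le \|f - \mu^{\Lambda,\eta} f\|_{L^2(\mu^{\Lambda,\eta})} \exp(-t/\gamma) \tag{2.20}
\]
for all $f \in C_\Lambda$ and $t > 0$.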
November 5, 2003 9:23 WSPC/148-RMP
00174
Phase Transition from the Viewpoint of Relaxation Phenomena
775
The constant $\gamma$ is called the relaxation time or the inverse spectral gap and is denoted by $\gamma_{SG}(\Lambda, \eta)$. The proof is easy and is presented in Sec. A.4 for interested readers. Since we have put the notion of “relaxation time” into a mathematical framework, we can now formulate the meaning of “fast/slow relaxation” as follows:
\[
\text{fast relaxation} \ \leftrightarrow\ \sup_{\ell\ge 1} \gamma_{SG}(\Lambda(\ell), \eta) < \infty , \tag{2.21}
\]
\[
\text{slow relaxation} \ \leftrightarrow\ \lim_{\ell\nearrow\infty} \gamma_{SG}(\Lambda(\ell), \eta) = \infty , \tag{2.22}
\]
where
\[
\Lambda(\ell) = \mathbb{Z}^d \cap (-\ell/2, \ell/2]^d . \tag{2.23}
\]
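The log-Sobolev inequality introduced in the following proposition plays an important role in the analysis of Glauber dynamics.

Proposition 2.4. Fix $\Lambda \subset\subset \mathbb{Z}^d$ and a boundary condition $\eta \in S^{\mathbb{Z}^d}$. We let $\gamma_{LS}(\Lambda, \eta)$ denote the smallest $\gamma \in (0, \infty)$ such that the following inequality (log-Sobolev inequality) holds for all $f \in C_\Lambda$:
\[
\mu^{\Lambda,\eta}\!\left( f^2 \ln \frac{f^2}{\mu^{\Lambda,\eta}(f^2)} \right) \le 2\gamma\, \mathcal{E}^{\Lambda,\eta}(f, f) . \tag{2.24}
\]
Then,
\[
0 < \gamma_{SG}(\Lambda, \eta) \le \gamma_{LS}(\Lambda, \eta) < \infty . \tag{2.25}
\]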
The log-Sobolev inequality was introduced by Gross [10] as a condition equivalent to the hypercontractivity of the associated semi-group. Since then, it has been applied to many aspects of probability theory. We refer the reader to [11, 12] or [13, Sec. 6.1] for expositions of the log-Sobolev inequality in more general settings. A proof of (2.25) can be found, for example, in [12, Theorem 2.5], [13, (6.1.17)].

It might be interesting to relate the log-Sobolev inequality to the relative entropy. For probability measures $\mu$ and $\nu$ on a measurable space, the relative entropy of $\nu$ with respect to $\mu$ is defined by:
\[
H(\nu|\mu) =
\begin{cases}
\nu(\log \rho) = \mu(\rho \log \rho) , & \text{if } d\nu = \rho\, d\mu \text{ with } \rho \in L^1(\mu) , \\
+\infty , & \text{otherwise} .
\end{cases} \tag{2.26}
\]
Note that the left-hand side of (2.24) equals $\mu^{\Lambda,\eta}(f^2)\, H(\nu|\mu^{\Lambda,\eta})$, where
\[
d\nu = \frac{f^2}{\mu^{\Lambda,\eta}(f^2)}\, d\mu^{\Lambda,\eta} .
\]
The relative entropy can be used to measure the deviation of $\nu$ from $\mu$. In fact, it is well known [13, (3.2.25)] that
\[
\frac{1}{2} |\nu(f) - \mu(f)|^2 \le H(\nu|\mu)\, \|f\|^2 , \tag{2.27}
\]
for any bounded measurable function $f$, where $\|f\|$ denotes the sup-norm. On the other hand, (2.24) implies that, for any probability measure $\nu$ on $S^\Lambda$,
\[
H(\nu T_t^{\Lambda,\eta} | \mu^{\Lambda,\eta}) \le H(\nu|\mu^{\Lambda,\eta}) \exp(-2t/\gamma) , \qquad t \ge 0 , \tag{2.28}
\]
where the probability measure $\nu T_t^{\Lambda,\eta}$ is defined by $(\nu T_t^{\Lambda,\eta})(f) = \nu(T_t^{\Lambda,\eta} f)$, see [13, (6.1.37)]. This shows that the log-Sobolev inequality (2.24) implies exponentially fast relaxation to equilibrium in the sense of entropy, while the Poincaré inequality (2.19) is equivalent to exponentially fast $L^2$-relaxation.

3. Relaxation in the Unique Phase Region

3.1. The equivalence of the decay of correlation and the fast relaxation

The following result originates in a celebrated work of Stroock and Zegarlinski [14], which is a typical example of the correspondence (0.1).

Theorem 3.1 ([14–17]). For the Ising model and lattice $\phi^4$-field, the following conditions are equivalent:

(DC) There exist constants $B_{3.1}, C_{3.1} \in (0, \infty)$ such that for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$ and $x, y \in \Lambda$,
\[
|\mu^{\Lambda,\eta}(\sigma_x ; \sigma_y)| \le B_{3.1} \exp(-\|x - y\|_1 / C_{3.1}) . \tag{3.1}
\]
(SG) $\sup\{\gamma_{SG}(\Lambda, \eta) \,;\, \Lambda \subset\subset \mathbb{Z}^d ,\ \eta \in S^{\mathbb{Z}^d}\} < \infty$.

(LS) $\sup\{\gamma_{LS}(\Lambda, \eta) \,;\, \Lambda \subset\subset \mathbb{Z}^d ,\ \eta \in S^{\mathbb{Z}^d}\} < \infty$.

The condition (DC), which we call decay of correlation, refers to a certain asymptotic independence of spins located far away from each other [55]. It is not difficult to see that (DC) implies the uniqueness of the phase. On the other hand, (SG) and (LS) are conditions which ensure fast enough relaxation of the Glauber dynamics (cf. Proposition 2.3, Proposition 2.4).

Here are some rough explanations for why the equivalence in Theorem 3.1 is true.

(DC) ⇒ (LS): An important observation, which is true even without condition (DC), is that for all $n \ge 1$ there exists $C(n) = C(n, \beta, d) \in (0, \infty)$ such that
\[
\sup\{\gamma_{SG}(\Lambda, \eta) \,;\, |\Lambda| \le n ,\ \eta \in S^{\mathbb{Z}^d}\} \le C(n) . \tag{3.2}
\]
On the other hand, (DC) implies that for large enough $\Lambda$, the measure $\mu^{\Lambda,\eta}$ is almost independent of the choice of $\Lambda$ and $\eta$, and hence $\sup_{\Lambda,\eta} \gamma_{SG}(\Lambda, \eta)$ is essentially the same as the left-hand side of (3.2) for finite $n$, if it is large enough.

(LS) ⇒ (SG): This follows from (2.25).

(SG) ⇒ (DC): We will use the following basic property of the Glauber dynamics, which is often referred to as the “finite speed propagation property”. For any $\varepsilon \in (0, \infty)$, there exists $C \in (0, \infty)$ such that
\[
P(\sigma_{t,x}^{\Lambda,\eta} ; \sigma_{t,y}^{\Lambda,\eta}) \le C \exp(Ct - \varepsilon |x - y|) \tag{3.3}
\]
for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$ and $x, y \in \mathbb{Z}^d$. This implies that, for fixed time $t$ and $|x - y|$ large enough, $\sigma_{t,x}^{\Lambda,\eta}$ and $\sigma_{t,y}^{\Lambda,\eta}$ are almost independent in the sense that their correlation decays exponentially in $|x - y|$. On the other hand, (SG) implies that the distribution of $\sigma_t^{\Lambda,\eta}$ with large enough finite $t$ is close to $\mu^{\Lambda,\eta}$ uniformly in $\Lambda$ and $\eta$; recall Proposition 2.3. Therefore, with (SG), the left-hand side of (3.3) can be replaced by $\mu^{\Lambda,\eta}(\sigma_x ; \sigma_y)$ if $t$ is a large enough finite number.

Remark 3.1. The relation between the decay of correlation and the fast relaxation was already being studied in the 1970s in the series of works by Holley and Stroock [18, 19, 54]. Stroock and Zegarlinski [14, 20] proved the equivalence stated in Theorem 3.1 for a certain class of spin systems where the single spin space $S$ is either a finite set or a compact Riemannian manifold. This implies Theorem 3.1 for the case of the Ising model. We will refer to the papers [15, 16] in Remark 3.2 below. See also a work of Cesi [21] for an elegant proof of (LS) in this context. Theorem 3.1 for a class of unbounded spin systems including the case of the lattice $\phi^4$-field can be found in [17].

Remark 3.2. Conditions in Theorem 3.1 require uniformity over all $\Lambda \subset\subset \mathbb{Z}^d$. In some cases, it is more reasonable to restrict one’s attention only to nicely-shaped $\Lambda$ (e.g. cubes or fat enough boxes) to avoid pathological phenomena caused by $\Lambda$’s whose shapes are too irregular. This idea was implemented in the series of works by Martinelli, Olivieri, Schonmann and Shlosman [15, 16, 22, 23], which led to improvements of Theorems 3.1 and 3.3, especially in the case of the two-dimensional Ising model (see Theorem 3.4).

A more practical role played by (LS) can be seen in the following result. Recall that (SG) is equivalent to $L^2$-convergence of the Glauber dynamics with uniform speed in $\Lambda$ and $\eta$. With (LS), one gets a much stronger result ($L^\infty$-convergence).

Theorem 3.2 ([14–16]). Consider the Ising model and suppose that
\[
\gamma(\eta) \stackrel{\rm def.}{=} \sup\{\gamma_{LS}(\Lambda, \eta) \,;\, \Lambda \subset\subset \mathbb{Z}^d\} < \infty \tag{3.4}
\]
for some $\eta \in S^{\mathbb{Z}^d}$. Then, there exist constants $B_{3.5}, C_{3.5} \in (0, \infty)$ such that for all $\Lambda \subset\subset \mathbb{Z}^d$, $\eta \in S^{\mathbb{Z}^d}$,
\[
\|T_t^{\Lambda,\eta} f - \mu^{\Lambda,\eta} f\| \le B_{3.5}\, |||f|||\, \exp(-t/C_{3.5}) , \quad \text{for all } f \in C_\Lambda \text{ and } t > 0 , \tag{3.5}
\]
where $\|f\|$ is the sup-norm and $|||f||| = \sum_{x\in\Lambda} \|\nabla_x f\|$.

Here is a rough explanation of how the log-Sobolev inequality is used to derive (3.5). Let $\Delta \subset \Lambda$ be the “support” of $f$: $\Delta = \{x \in \Lambda \,;\, \nabla_x f \not\equiv 0\}$. We then define
\[
\Lambda(t) = \{x \in \Lambda \,;\, \mathrm{dist.}(x, \Delta) \le C + Ct\} ,
\]
where the constant $C$ is large enough. By the finite propagation speed property (3.3), the coordinates $\sigma_{s,x}^{\Lambda,\eta}$, $x \in \Delta$, $s \le t$ are almost independent of what happens in $\Lambda \setminus \Lambda(t)$ up to time $t$. For this reason, the proof of (3.5) boils down to the estimation
of the left-hand side of (3.5) with $\Lambda$ replaced by $\Lambda(t)$. Now, by applying (2.27) and (2.28) to $\mu = \mu^{\Lambda(t),\eta}$ and $\nu = \delta_\sigma T_t^{\Lambda(t),\eta}$, we have
\[
\frac{1}{2} |T_t^{\Lambda(t),\eta} f(\sigma) - \mu^{\Lambda(t),\eta} f|^2 \le \|f\|^2 H(\delta_\sigma T_t^{\Lambda(t),\eta} | \mu^{\Lambda(t),\eta}) \le \|f\|^2 H(\delta_\sigma | \mu^{\Lambda(t),\eta}) \exp(-2t/\gamma(\eta)) .
\]
By the definition of the relative entropy, it is easy to see that
\[
H(\delta_\sigma | \mu^{\Lambda(t),\eta}) \le \ln(1/\mu^{\Lambda(t),\eta}(\sigma)) \le C_1 t ,
\]
where $C_1 = C_1(d, \beta, h) \in (0, \infty)$. These prove the exponential decay of the left-hand side of (3.5) with $\Lambda$ replaced by $\Lambda(t)$.

Remark 3.3. For the lattice $\phi^4$-field, a result similar to Theorem 3.2 is known [24, 25], where, however, the exponential decay is not in the sup-norm, but in a certain point-wise sense for each “tempered” initial configuration.

3.2. Sufficient conditions for (DC), (SG) or (LS)

The following result concerns sufficient conditions which ensure the validity of (DC), (SG) and (LS) in Theorem 3.1.

Theorem 3.3. Consider the Ising model and lattice $\phi^4$-field.
(a) (LS) holds if $d = 1$.
(b) For all $d \ge 1$, there is an inverse temperature $\beta_0 = \beta_0(d) \in (0, \infty)$ such that for $\beta \le \beta_0$, $\gamma_{LS}(\Lambda, \eta)$ is bounded in $\Lambda$, $\eta$ and $h$. In particular, (LS) holds for $\beta \le \beta_0$.
(c) For the Ising model, (LS) holds if $|h| > 2d$.

Remark 3.4.
• Theorem 3.3(a) was obtained by Zegarlinski [25, 26].
• A result of Zegarlinski [27, Theorem 4.3] implies Theorem 3.3(b) for the Ising model. Theorem 3.3(b), (c) for the Ising model can also be found in [28], where (DC) (rather than (LS)) is discussed. For the Ising model, it is known [29] that (DC) holds for $\beta < \beta_c/2$.
• As for Theorem 3.3(b) for the lattice $\phi^4$-field, Zegarlinski [27, Propositions 3.4, 3.5] also proved some results in this direction. However, the conditions imposed there for the potential function were not mild enough to cover the case of the lattice $\phi^4$-field. Theorem 3.3(b) for the lattice $\phi^4$-field was then obtained in [30] by the use of the “surgery construction” of Dobrushin and Shlosman [31] (to show a strong enough decay of correlation), together with the “martingale method” of Lu and Yau [32] (to conclude the spectral gap and the log-Sobolev inequality from the decay of correlation). Soon afterwards, Bodineau and Helffer [33, 56] used the Witten Laplacian to simplify a part of the proof (the decay of correlation and the spectral gap) and thereby generalized the assumptions on the interaction potentials needed to prove the log-Sobolev inequality; see also a paper by Gentil and Roberto [34] for
research on (SG) in this direction. More recently, Procacci and Scoppola [35] investigated (DC) for the lattice $\phi^4$-field by a different method based on the cluster expansion. An article by Ledoux [36] is now available on the subject of Theorem 3.3(b) for unbounded lattice spin systems.

3.3. Temperatures slightly above $1/\beta_c$, or small non-zero magnetic fields

The results in Sec. 3.1 apply if the temperature is sufficiently high, or if a large enough magnetic field is present (Theorem 3.3). To make the correspondence (0.1) more precise, one would like to study the model at a temperature only slightly above the critical one, or under a small magnetic field. As alluded to in Remark 3.2, research in this direction for the two-dimensional Ising model was carried out very successfully in a series of works by Martinelli, Olivieri, Schonmann and Shlosman [15, 22, 23]. Let us quote one of its final forms:

Theorem 3.4 ([23]). Consider the Ising model in $d = 2$. If $\beta < \beta_c$ or $h \ne 0$, then
\[
\sup_{\ell\ge 1,\ \eta\in S^{\mathbb{Z}^d}} \gamma_{LS}(\Lambda(\ell), \eta) < \infty ,
\]
or equivalently,
\[
\sup_{\ell\ge 1,\ \eta\in S^{\mathbb{Z}^d}} \gamma_{SG}(\Lambda(\ell), \eta) < \infty .
\]

Remark 3.5. In contrast with the situation in two dimensions described in Theorem 3.4, it is conjectured in dimension $d \ge 3$ that the fast relaxation property (2.21) is violated by an appropriate choice of low temperatures, small but non-zero magnetic fields, and boundary conditions $\eta$ which are mixtures of $+1$ and $-1$. The part of the $(\beta, h)$-plane referred to above is often called the “Basuev region”. Though the conjecture mentioned above is not yet proven, there are some results on related models which suggest the existence of such “dangerous” boundary conditions $\eta \in S^{\mathbb{Z}^d}$ [37, 38, 39].

The conjecture in Remark 3.5 motivated an attempt to find a class of “safe” boundary conditions even when $(\beta, h)$ is near the phase transition line. To describe a sufficient condition to be a safe boundary condition, we introduce the effective magnetic field:
\[
h_x^{\Lambda,\eta} \stackrel{\rm def.}{=} h + \sum_{\substack{y \notin \Lambda \\ \|x-y\|_1 = 1}} \eta_y , \qquad x \in \Lambda . \tag{3.6}
\]
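The definition (3.6) is easy to evaluate on a small box. The following Python sketch is only an illustration: the function names, the $3 \times 3$ box and the two boundary conditions (all plus, and a chessboard pattern) are choices made here, not taken from the text.

```python
def effective_field(h, box, eta):
    """Return {x: h + sum of eta(y) over y outside the box with |x - y|_1 = 1}, as in (3.6)."""
    box = set(box)
    field = {}
    for x in box:
        total = h
        for axis in range(len(x)):
            for step in (-1, 1):
                # Nearest neighbour of x in the given coordinate direction.
                y = x[:axis] + (x[axis] + step,) + x[axis + 1:]
                if y not in box:
                    total += eta(y)
        field[x] = total
    return field


def plus(y):
    """Boundary condition identically +1."""
    return 1.0


def chessboard(y):
    """Boundary condition alternating between +1 and -1."""
    return 1.0 if sum(y) % 2 == 0 else -1.0


box = [(i, j) for i in range(3) for j in range(3)]
inner_boundary = [x for x in box if x != (1, 1)]  # every site except the centre
field_plus = effective_field(0.1, box, plus)
field_chess = effective_field(0.1, box, chessboard)
print("min over inner boundary, plus b.c.:", min(field_plus[x] for x in inner_boundary))
print("min over inner boundary, chessboard b.c.:", min(field_chess[x] for x in inner_boundary))
```

In this toy case the effective field at the centre equals $h$, the plus boundary condition keeps the effective field non-negative on the inner boundary, and the chessboard one does not.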
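Note that the Hamiltonian (1.3) can be written, up to a term independent of $\sigma$, as
\[
\frac{1}{2} \sum_{\substack{\{x,y\} \subset \Lambda \\ \|x-y\|_1 = 1}} \beta (\sigma_x - \sigma_y)^2 + \sum_{x\in\Lambda} \beta h_x^{\Lambda,\eta} \sigma_x . \tag{3.7}
\]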
Note also that $h_x^{\Lambda,\eta} = h$ for $x \in \Lambda \setminus \partial_{\rm in}\Lambda$, recall (1.1). The effective magnetic field is thus a “magnetic field” with the effect of the boundary condition taken into account. A sufficient condition for a safe boundary condition is then, roughly speaking, that the signs of $\{h_x^{\Lambda,\eta}\}_{x\in\partial_{\rm in}\Lambda}$ are kept identical to that of the external magnetic field $h$. More precisely, we have:

Theorem 3.5 ([15, 40, 41]). Consider the Ising model, and allow the boundary condition $\eta$ to lie in the extended configuration space $\mathbb{R}^{\mathbb{Z}^d}$.
(a) If $\beta < \beta_c$ and $h \ge 0$, then there exist constants $B_{3.5}, C_{3.5} \in (0, \infty)$ such that (3.5) holds whenever $\Lambda \subset\subset \mathbb{Z}^d$ and $\eta \in \mathbb{R}^{\mathbb{Z}^d}$ satisfy $\min_{x\in\partial_{\rm in}\Lambda} h_x^{\Lambda,\eta} \ge 0$, cf. (3.6).
(b) There exists $\beta_0 = \beta_0(d) \in (0, \infty)$ as follows. If $\beta \ge \beta_0$, $h > 0$, then for any $h_0 > 0$, there exist constants $B_{3.5}, C_{3.5} \in (0, \infty)$ such that (3.5) holds whenever $\Lambda$ is a cube and $\eta \in \mathbb{R}^{\mathbb{Z}^d}$ satisfies $\min_{x\in\partial_{\rm in}\Lambda} h_x^{\Lambda,\eta} \ge h_0$, cf. (3.6).

Remark 3.6. Theorem 3.5 originates in a result of Martinelli and Olivieri [15, Theorem 5.1], where the infinite volume dynamics are discussed. The present form of Theorem 3.5, which is stronger than the above mentioned result (because of the uniformity over $\Lambda$), is due to Schonmann and Yoshida [40, 41].

4. Relaxation in the Phase-Coexistence Region

We now turn our attention to the correspondence (0.2), namely slow relaxation in the phase-coexistence region. In this section, we investigate the relaxation time $\gamma_{SG}(\Lambda(\ell), \eta)$ for the Ising model with $d \ge 2$, $\beta > \beta_c$ and $h = 0$.

4.1. Some general observations

• Heuristics: Here are heuristics to predict how long the relaxation time $\gamma_{SG}(\Lambda(\ell), \eta)$ is. Although the argument is quite rough, it leads to “correct” answers in many cases. The relaxation time is supposed to be proportional to the expected amount of time needed to cross the energy barrier (if the barrier is present). Let us assume for simplicity that the boundary condition is non-negative.
Then, the time needed to cross the energy barrier is roughly the time needed to make the transition from $\sigma \equiv -1$ to $\sigma \equiv +1$. There are many different ways to make this transition. However, the main contribution comes from the most “efficient” way to do it, namely the one that minimizes the increment of the energy along the transition. With this in mind, let us now approximate the original dynamics by a birth-death chain which moves back and forth along a fixed series of configurations $\sigma^{(0)}, \sigma^{(1)}, \ldots, \sigma^{(N)}$ with $\sigma^{(0)} \equiv -1$ and $\sigma^{(N)} \equiv +1$. Then, the computation of the expected hitting time of $\sigma^{(N)}$ under this approximation tells us that the relaxation time is the exponential of the maximum increment of the energy along the transition, up to some polynomial correction.
Another important observation is that the energy barrier along the transition from $\sigma \equiv -1$ to $\sigma \equiv +1$ is the creation of the layer of plus spins which separates a pair of opposite faces of $\Lambda(\ell)$ (and hence contains $\ell^{d-1}$ spins). In fact, after such a layer is created, the rest of the transition can be made without increasing the energy.

• A general upper bound of the relaxation time: It is known that for all $d \ge 2$, $\beta > 0$ and $h \ge 0$,
\[
\gamma_{SG}(\Lambda(\{\ell_i\}), \eta) \le \exp(\beta C \ell_2 \cdots \ell_d) , \tag{4.1}
\]
where $\Lambda(\{\ell_i\}) = \mathbb{Z}^d \cap \prod_{i=1}^d (-\ell_i/2, \ell_i/2]$ with $\ell_1 \ge \ell_2 \ge \cdots \ge \ell_d$ and the constant $C$ depends only on $d$, see [42, Theorem 5], [2, Theorem 6]. Note that the length $\ell_1$ of the longest side of the rectangle $\Lambda(\{\ell_i\})$ does not appear on the right-hand side of (4.1). In line with the heuristics explained above, the right-hand side of (4.1) refers to the cost of filling a layer, say, the bottom one $\{-\lfloor \ell_1/2 \rfloor\} \times \prod_{i=2}^d (-\ell_i/2, \ell_i/2]$, with plus spins. We have in particular
\[
\gamma_{SG}(\Lambda(\ell), \eta) \le \exp(\beta C \ell^{d-1}) . \tag{4.2}
\]
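The birth-death heuristic of Sec. 4.1 can be checked numerically on a toy energy profile. The sketch below is an illustration under assumptions made here, not the construction used in the cited proofs: the chain lives on states $0, \ldots, N$ with energies $E_0, \ldots, E_N$, jumps between neighbours at rate $\exp(-(E_j - E_i)/2)$, and the parabolic profile is a hypothetical choice. For such a chain the expected hitting time of state $N$ from state $0$ equals $\sum_{j \le k < N} \exp((E_k + E_{k+1})/2 - E_j)$, so it lies between $e^{B}$ and $\frac{N(N+1)}{2} e^{B}$, where $B$ is the largest exponent in the sum.

```python
import math


def hitting_time(energies):
    """Expected time to reach the last state from state 0 for a continuous-time
    birth-death chain with rates rate(i -> j) = exp(-(E_j - E_i) / 2), |i - j| = 1."""
    total = 0.0
    h_prev = 0.0  # expected time to go from state k-1 to state k
    for k in range(len(energies) - 1):
        up = math.exp(-(energies[k + 1] - energies[k]) / 2.0)
        down = math.exp(-(energies[k - 1] - energies[k]) / 2.0) if k > 0 else 0.0
        # One-step recursion: wait 1/up on average, plus the cost of excursions downwards.
        h_k = 1.0 / up + (down / up) * h_prev
        total += h_k
        h_prev = h_k
    return total


def barrier(energies):
    """Largest value of (E_k + E_{k+1}) / 2 - E_j over j <= k."""
    best = -math.inf
    lowest = math.inf
    for k in range(len(energies) - 1):
        lowest = min(lowest, energies[k])
        best = max(best, (energies[k] + energies[k + 1]) / 2.0 - lowest)
    return best


def parabolic_profile(height, n):
    """Hypothetical energy path: zero at both ends, maximum `height` in the middle."""
    return [height * (1.0 - (2.0 * k / n - 1.0) ** 2) for k in range(n + 1)]


for height in (2.0, 5.0, 10.0):
    profile = parabolic_profile(height, 20)
    print(height, barrier(profile), math.log(hitting_time(profile)))
```

The logarithm of the hitting time tracks the barrier height up to an additive term of order $\ln N$, which is the “polynomial correction” mentioned in the heuristics.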
4.2. The free boundary condition

We first consider the free boundary condition. The following result says that the “slowest relaxation time” (the right-hand side of (4.2)) is indeed realized here:

Theorem 4.1 ([43, 44]). Consider the Ising model with $h = 0$. For $d = 2$, $\beta > \beta_c$, and $\eta \equiv 0$,
\[
\lim_{\ell\to\infty} \frac{1}{\beta\ell} \ln \gamma_{SG}(\Lambda(\ell), \eta) = 2 - \frac{1}{\beta} \ln(\tanh \beta) . \tag{4.3}
\]
For $d \ge 2$, sufficiently large $\beta$ and $\eta \equiv 0$, there exist $B_i = B_i(\beta, d) > 0$, $C_i = C_i(\beta, d) > 0$ ($i = 1, 2$) such that
\[
B_1 \exp(C_1 \ell^{d-1}) \le \gamma_{SG}(\Lambda(\ell), \eta) \le B_2 \exp(C_2 \ell^{d-1}) , \qquad \ell = 1, 2, \ldots . \tag{4.4}
\]
Theorem 4.1 is quite reasonable from the viewpoint of the heuristics explained in Sec. 4.1. In fact, the exponent $\ell^{d-1}$ can be explained by the cardinality of the layer of plus spins alluded to there. Moreover, for $d = 2$, the right-hand side of (4.3) is the surface tension in a coordinate direction [45, p. 264, (81)]. In the proofs of (4.3) and (4.4), the lower bound of the relaxation time is based on the following immediate consequence of the definition of $\gamma_{SG}(\Lambda, \eta)$, cf. (2.19):
\[
\gamma_{SG}(\Lambda, \eta)^{-1} \le \mathcal{E}^{\Lambda,\eta}(f, f) / \mu^{\Lambda,\eta}(f ; f) , \tag{4.5}
\]
for any $f \in C_\Lambda$. To obtain the lower bound in (4.4), for example, one chooses $f$ to be the indicator function of an event in which a large closed surface separating $+1$ and $-1$ (a so-called “contour”) is present. One then uses Peierls’ argument to show that the right-hand side of (4.5) is exponentially small in $\ell^{d-1}$.
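The variational bound (4.5) can be evaluated exactly on a very small box. The following Python sketch is a toy illustration under assumptions made here rather than the Peierls argument itself: it takes a $3 \times 3$ box with free boundary condition and $h = 0$, assumes the Hamiltonian $H(\sigma) = -\beta \sum \sigma_x \sigma_y$ over nearest-neighbour bonds, uses flip rates with $\mu(\sigma) c_x(\sigma) = \exp(-(H(\sigma) + H(\sigma^x))/2)/Z$ as in Sec. A.1, and picks the test function $f = 1\{\sum_x \sigma_x > 0\}$.

```python
import itertools
import math

SIDE = 3
SITES = [(i, j) for i in range(SIDE) for j in range(SIDE)]
BONDS = [(a, b) for a in SITES for b in SITES
         if a < b and abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1]


def hamiltonian(spins, beta):
    """Assumed form H(sigma) = -beta * sum over bonds of sigma_x sigma_y (free b.c., h = 0)."""
    return -beta * sum(spins[a] * spins[b] for a, b in BONDS)


def variational_ratio(beta):
    """Right-hand side of (4.5) for f = indicator{total magnetisation > 0}."""
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(SITES)):
        weights[values] = math.exp(-hamiltonian(dict(zip(SITES, values)), beta))
    z = sum(weights.values())
    f = {v: 1.0 if sum(v) > 0 else 0.0 for v in weights}
    mean = sum(weights[v] * f[v] for v in weights) / z
    variance = sum(weights[v] * (f[v] - mean) ** 2 for v in weights) / z
    dirichlet = 0.0
    for v in weights:
        for x in range(len(SITES)):
            flipped = v[:x] + (-v[x],) + v[x + 1:]
            # mu(sigma) c_x(sigma) = exp(-(H(sigma) + H(sigma^x)) / 2) / Z
            rate_weight = math.sqrt(weights[v] * weights[flipped]) / z
            dirichlet += 0.5 * rate_weight * (f[flipped] - f[v]) ** 2
    return dirichlet / variance


ratio_hot = variational_ratio(0.2)
ratio_cold = variational_ratio(1.0)
print("upper bound on 1/gamma_SG at beta = 0.2:", ratio_hot)
print("upper bound on 1/gamma_SG at beta = 1.0:", ratio_cold)
```

Each printed value is an upper bound on $\gamma_{SG}^{-1}$, hence its reciprocal is a lower bound on the relaxation time. Lowering the temperature shrinks the ratio, because configurations on the border between the two magnetisation signs become unlikely.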
Remark 4.1. The asymptotics (4.3) were obtained by Cesi, Guadagni, Martinelli and Schonmann [43], while the bound (4.4) is due to Thomas [44].

We now consider extensions of Theorem 4.1 to a more general boundary condition $\eta$. We say a set $A \subset \mathbb{Z}^d$ is (*)-connected if for any $\{x, y\} \subset A$, there is $\{x_0, x_1, \ldots, x_n\} \subset A$ such that $x_0 = x$, $x_n = y$ and $\|x_i - x_{i-1}\|_\infty = 1$, $i = 1, \ldots, n$, where $\|y\|_\infty = \max_{1\le i\le d} |y_i|$ for $y = (y_i)_{i=1}^d \in \mathbb{Z}^d$.

Theorem 4.2 ([46–48]). Consider the Ising model with $h = 0$.
(a) For $d = 2$, suppose that $0 < \delta < 1$ and that the boundary condition $\eta_y \in [-1, +1]$, $y \in \partial_{\rm ex}\Lambda(\ell)$, satisfies
\[
\sum_{y\in I} \eta_y \le \delta |I| \quad \text{for every (*)-connected } I \subset \partial_{\rm ex}\Lambda(\ell) \text{ with } |I| = \ell . \tag{4.6}
\]
Then, there exist $\beta_0 = \beta_0(\delta) > 0$, $B_i = B_i(\beta, \delta) > 0$, $C_i = C_i(\beta, \delta) > 0$ ($i = 1, 2$) such that (4.4) holds for $\beta \ge \beta_0$ and $\ell = 1, 2, \ldots$.
(b) For $d \ge 3$, suppose that $\eta \in \{0, 1\}^{\mathbb{Z}^d}$ and that
\[
\lim_{\ell\nearrow\infty} \ell^{-(d-1)} \sum_{y\in\partial_{\rm ex}\Lambda(\ell)} \eta_y < \delta_d , \tag{4.7}
\]
9 27 , δ5 = 12 where δ3 = 43 , δ4 = 16 32 , δd ≤ (16d) for d ≥ 6. Then, there exist β0 = β0 (d) > 0, Bi = Bi (β, d) > 0, Ci = Ci (β, d) > 0 (i = 1, 2) such that (4.4) holds for β ≥ β0 and ` = 1, 2, . . . .
A typical boundary condition for which (4.6) holds is a chessboard-like one, i.e., a mixture of alternating $+1$ and $-1$. The condition (4.6) also allows an “almost plus” boundary condition, e.g. $+1$ for 99% of the boundary with 1% zero on each side; see [49] for an even stronger result in this direction. The condition (4.6) can be explained by the heuristics in Sec. 4.1. Under the condition, a creation of the layer of plus spins with length $\ell$ necessarily increases the energy by at least $(1 - \delta)\ell$. The condition (4.6) is optimal in some sense, as can be seen from the following example. For $\delta > 0$, consider a boundary condition $\eta$ defined by
\[
\eta_x =
\begin{cases}
+1 & \text{if } x_1 = \frac{\ell}{2} + 1 \text{ and } -\frac{\delta\ell}{2} < x_2 \le \frac{\delta\ell}{2} , \\
0 & \text{otherwise} .
\end{cases} \tag{4.8}
\]
By Theorem 4.2(a), one sees that (4.4) is true for all $\delta < 1$. On the other hand, it follows from [50, Corollary 4.1] that (4.4) is no longer valid for $\delta = 1$.

The proof of Theorem 4.2 is again based on (4.5) and Peierls’ argument taking the boundary condition into account. For $d \ge 3$, Peierls’ argument with boundary condition is more difficult to implement, due to the geometrical complexity. In fact, Theorem 4.2(b) is the only result in this direction for $d \ge 3$. For example, (4.4) for
a chessboard-like boundary condition which appears to be clearly true, is not yet proven as far as we know. Remark 4.2. Theorem 4.2(a) was obtained in a weaker form by Higuchi and Yoshida in [46] and then in the present form by Alexander and Yoshida [47]. Theorem 4.2(b) is due to Sugimine [48]. 4.3. The plus boundary condition In contrast to Theorems 4.1 and 4.2, the relaxation time with pure (+) boundary condition is shorter than the exponential of `d−1 . This can again be explained from the viewpoint of the heuristics explained in Sec. 4.1. For the (+) boundary condition, a plus layer along the boundary can be created without increasing the energy, i.e. there is no energy barrier to cross. This suggests that the relaxation time should be at most polynomial in `. In fact, a recent work of Bodineau and Martinelli [51], where the cube Λ(`) is replaced by Wulff shape with linear size `, suggests the following: for d = 2, h = 0 and β > βc , γSG (Λ(`), η ≡ +1) ` ,
` % ∞,
and for d ≥ 2, h = 0 and β > βc , γLS (Λ(`), η ≡ +1) `2 ,
` % ∞.
So far, existing rigorous upper bounds of the relaxation time are obtained only in the following weaker form:
\[
\gamma_{SG}(\Lambda(\ell), \eta \equiv +1) \le \exp(C(\beta)\, \ell^{d-2} \varphi(\ell)) , \qquad \ell = 1, 2, \ldots , \tag{4.9}
\]
for some $C(\beta) > 0$, $\lim_{\ell\nearrow\infty} \varphi(\ell) = \infty$ and $\lim_{\ell\nearrow\infty} \varphi(\ell)/\ell = 0$.

Theorem 4.3 ([2, 50, 52, 53]). Consider the Ising model with $h = 0$.
(a) For $d = 2$ and $\beta > \beta_c$, (4.9) holds with $\varphi(\ell) = (\ell \ln \ell)^{1/2}$.
(b) For $d \ge 3$ and sufficiently large $\beta$, (4.9) holds with $\varphi(\ell) = (\ln \ell)^2$.

Roughly speaking, the right-hand side of (4.9) is a bound for the relaxation time for a slice (say $S(\ell)$) of $\Lambda(\ell)$ with height $\varphi(\ell)$ from the bottom:
\[
S(\ell) = \{x \in \Lambda(\ell) \,;\, x_d \le -\ell/2 + \varphi(\ell)\} ,
\]
recall (4.1). In fact, in line with the heuristics in Sec. 4.1, the relaxation time $\gamma_{SG}(\Lambda(\ell), \eta \equiv +1)$ should be controlled by the time needed to fill $S(\ell)$ with (+) spins. Technically, to be able to control the thermal fluctuation, the slice $S(\ell)$ should not be too thin, i.e. $\varphi(\ell) \nearrow \infty$ should grow fast enough.

Remark 4.3. The bound (4.9) for $d = 2$ was first proven by Martinelli for large enough $\beta$ [50], where $\varphi(\ell) = \ell^{\frac{1}{2}+\varepsilon}$, and then for $\beta > \beta_c$ [2]. The present form of Theorem 4.3(a) was obtained by Higuchi and Wang [52]. Theorem 4.3(b) is due to Sugimine [53].
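Acknowledgments

This article originates in a course given by the author in a summer school “Mathematical Physics 2001” in Tokyo. The author would like to thank the organizers Huzihiro Araki and Hiroshi Ezawa for the opportunities to give the course and to write these notes. The author also would like to thank Thierry Bodineau, Fabio Martinelli, Nobuaki Sugimine, Boguslaw Zegarlinski and the anonymous referees for their useful comments.

A. Appendix

A.1. Proof of Eq. (2.7)

We will show that
\[
-\mu^{\Lambda,\eta}(f c_x^{\Lambda,\eta} \nabla_x g) = \frac{1}{2} \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta} \nabla_x f\, \nabla_x g) , \tag{A.1}
\]
which, by summation over $x \in \Lambda$, proves (2.7). It is clear that
\[
\sum_{\sigma\in S^\Lambda} f(\sigma^x) = \sum_{\sigma\in S^\Lambda} f(\sigma) , \qquad f \in C_\Lambda . \tag{A.2}
\]
On the other hand,
\[
\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma) f(\sigma^x)) = \frac{1}{Z^{\Lambda,\eta}} \sum_{\sigma\in S^\Lambda} \exp(-H^{\Lambda,\eta}(\sigma))\, c_x^{\Lambda,\eta}(\sigma) f(\sigma^x) , \tag{A.3}
\]
\[
\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma) f(\sigma)) = \frac{1}{Z^{\Lambda,\eta}} \sum_{\sigma\in S^\Lambda} \exp(-H^{\Lambda,\eta}(\sigma))\, c_x^{\Lambda,\eta}(\sigma) f(\sigma) . \tag{A.4}
\]
Note now that
\[
\exp(-H^{\Lambda,\eta}(\sigma))\, c_x^{\Lambda,\eta}(\sigma) = \exp\!\left( -\frac{1}{2} H^{\Lambda,\eta}(\sigma^x) - \frac{1}{2} H^{\Lambda,\eta}(\sigma) \right)
\]
is invariant under the spin flip $\sigma \mapsto \sigma^x$. Therefore we see from (A.2), (A.3) and (A.4) that
\[
\mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma) f(\sigma^x)) = \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta}(\sigma) f(\sigma)) . \tag{A.5}
\]
It is now easy to conclude (A.1) from (A.5):
\begin{align}
-\mu^{\Lambda,\eta}(f c_x^{\Lambda,\eta} \nabla_x g)
&= -\mu^{\Lambda,\eta}\bigl(c_x^{\Lambda,\eta}(\sigma) f(\sigma)(g(\sigma^x) - g(\sigma))\bigr) \tag{A.6} \\
&= -\mu^{\Lambda,\eta}\bigl(c_x^{\Lambda,\eta}(\sigma) f(\sigma^x)(g(\sigma) - g(\sigma^x))\bigr) \tag{A.7} \\
&= \frac{1}{2}\bigl(\text{(A.6)} + \text{(A.7)}\bigr) \notag \\
&= \frac{1}{2} \mu^{\Lambda,\eta}(c_x^{\Lambda,\eta} \nabla_x f\, \nabla_x g) . \notag
\end{align}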
A.2. Brownian motion and Eq. (2.10)

A family $\{B_t\}_{t\ge 0}$ of $\mathbb{R}^d$-valued random variables on a probability space $(\Omega, \mathcal{F}, P)$ is called a $d$-dimensional Brownian motion if the following properties are satisfied:
(B0) $P(B_0 = 0) = 1$.
(B1) There is an $\Omega_0 \in \mathcal{F}$ such that $P(\Omega_0) = 1$ and $t \mapsto B_t(\omega)$ is continuous for all $\omega \in \Omega_0$.
(B2) $\{B_{t_j} - B_{t_{j-1}}\}_{j=1}^n$ are independent if $n \ge 2$ and $0 = t_0 < t_1 < \ldots < t_n$.
(B3) For any $0 \le s < t$, $B_t - B_s$ is a mean-zero Gaussian random variable with the covariance matrix $(t - s)(\delta_{ij})_{i,j=1}^d$:
\[
P(B_t - B_s \in A) = \int_A p_{t-s}(x)\,dx , \quad \text{for all Borel sets } A \subset \mathbb{R}^d , \tag{A.8}
\]
where $p_t(x) = (2\pi t)^{-d/2} \exp\bigl(-\frac{|x|^2}{2t}\bigr)$.

Let us now think of the “time derivative” $\dot{B}_t = (\dot{B}_{t,i})_{i=1}^d$ of the Brownian motion, which however does not exist in the classical sense (it exists only in the distributional sense). Then, the intuition behind (B2) and (B3) above is that $\{\dot{B}_{t,i} \,;\, 1 \le i \le d ,\ t \ge 0\}$ is an independent (both in $i$ and $t$) Gaussian random field. With this nonrigorous formulation of the Brownian motion, the stochastic differential equation (2.10) may be interpreted in the following intuitive form:
\[
\frac{d\sigma_{t,x}^{\Lambda,\eta}}{dt} = \dot{B}_{t,x} - \frac{1}{2} \nabla_x \widetilde{H}^{\Lambda,\eta}(\sigma_t^{\Lambda,\eta}) .
\]
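A.3. Proof of Eq. (2.14)

Set $A_x^{\Lambda,\eta} g = \frac{1}{2} \Delta_x g - \frac{1}{2} \nabla_x \widetilde{H}^{\Lambda,\eta} \nabla_x g$. We will show that
\[
-\int_{\mathbb{R}^\Lambda} e^{-\widetilde{H}^{\Lambda,\eta}} f A_x^{\Lambda,\eta} g \prod_{y\in\Lambda} d\sigma_y = \frac{1}{2} \int_{\mathbb{R}^\Lambda} e^{-\widetilde{H}^{\Lambda,\eta}} \nabla_x f\, \nabla_x g \prod_{y\in\Lambda} d\sigma_y , \tag{A.9}
\]
which, by summation over $x \in \Lambda$, proves (2.14). The proof of (A.9) boils down to that of:
\[
-\int_{\mathbb{R}} e^{-\widetilde{H}^{\Lambda,\eta}} f A_x^{\Lambda,\eta} g\, d\sigma_x = \frac{1}{2} \int_{\mathbb{R}} e^{-\widetilde{H}^{\Lambda,\eta}} \nabla_x f\, \nabla_x g\, d\sigma_x . \tag{A.10}
\]
Note that
\[
A_x^{\Lambda,\eta} g = \frac{1}{2} e^{\widetilde{H}^{\Lambda,\eta}} \nabla_x \bigl( e^{-\widetilde{H}^{\Lambda,\eta}} \nabla_x g \bigr)
\]
and that therefore (A.10) is equivalent to
\[
-\int_{\mathbb{R}} f\, \nabla_x \bigl( e^{-\widetilde{H}^{\Lambda,\eta}} \nabla_x g \bigr)\, d\sigma_x = \int_{\mathbb{R}} e^{-\widetilde{H}^{\Lambda,\eta}} \nabla_x f\, \nabla_x g\, d\sigma_x .
\]
This can easily be seen by integration by parts.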
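For completeness, the integration by parts can be written out as follows. This is only a sketch: it assumes, as seems intended for the $\phi^4$ potential, that $e^{-\widetilde{H}^{\Lambda,\eta}}$ decays fast enough as $|\sigma_x| \to \infty$ to kill the boundary term, given that $f, g \in C_\Lambda$ have bounded gradients and hence grow at most linearly.

```latex
\begin{align*}
-\int_{\mathbb{R}} f\,\nabla_x\bigl(e^{-\widetilde H^{\Lambda,\eta}}\nabla_x g\bigr)\,d\sigma_x
 &= -\Bigl[\, f\,e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x g \,\Bigr]_{\sigma_x=-\infty}^{\sigma_x=+\infty}
    + \int_{\mathbb{R}} e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x f\,\nabla_x g\,d\sigma_x \\
 &= \int_{\mathbb{R}} e^{-\widetilde H^{\Lambda,\eta}}\,\nabla_x f\,\nabla_x g\,d\sigma_x ,
\end{align*}
% The bracket vanishes under the decay assumption stated above.
```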
A.4. Proof of Proposition 2.3

We will prove the equivalence of (2.19) and (2.20). We let $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ stand for the norm and the inner product of $L^2(\mu^{\Lambda,\eta})$. We may assume that $\mu^{\Lambda,\eta} f = 0$. Then,
\[
(2.19) \ \Leftrightarrow\ \|f\|^2 \le -\gamma \langle f, A^{\Lambda,\eta} f \rangle , \quad \text{for all } f \in L^2(\mu) ,
\]
\[
(2.20) \ \Leftrightarrow\ e^{2t/\gamma} \|T_t^{\Lambda,\eta} f\|^2 \le \|f\|^2 , \quad \text{for all } f \in L^2(\mu) \text{ and } t > 0 .
\]
The equivalence of (2.19) and (2.20) can therefore be seen from the following computation:
\[
\frac{d}{dt} \Bigl( e^{2t/\gamma} \|T_t^{\Lambda,\eta} f\|^2 \Bigr) = e^{2t/\gamma} \left( \frac{2}{\gamma} \|T_t^{\Lambda,\eta} f\|^2 + 2 \langle T_t^{\Lambda,\eta} f, A^{\Lambda,\eta} T_t^{\Lambda,\eta} f \rangle \right) .
\]
Indeed, if (2.19) holds, then applying it to $T_t^{\Lambda,\eta} f$ shows that the right-hand side is non-positive, which gives (2.20); conversely, (2.20) forces the derivative at $t = 0$ to be non-positive, which is (2.19).

References

[1] A. Guionnet and B. Zegarlinski, Lectures on Logarithmic Sobolev Inequalities, Séminaire de Probabilités, Lecture Notes in Math. 1801, Springer.
[2] F. Martinelli, Lectures on Glauber dynamics for discrete spin models, École de probabilités de St Flour 1997, Lecture Notes in Math. 1717, Springer, Berlin (1999).
[3] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics, Springer Verlag, New York (1985).
[4] H. O. Georgii, Gibbs Measures and Phase Transitions, Walter de Gruyter, Berlin, New York (1988).
[5] J. Bellissard and R. Høegh-Krohn, Compactness and maximal Gibbs state for random Gibbs fields on the lattice, Commun. Math. Phys. 84 (1982), 297–327.
[6] M. Aizenman and R. Fernández, On the critical behavior of the magnetization in Ising models, J. Stat. Phys. 44 (1986), 393–454.
[7] G. Benettin, G. Gallavotti, G. Jona-Lasinio and A. L. Stella, On the Onsager–Yang value of the spontaneous magnetization, Commun. Math. Phys. 30 (1973), 45–54.
[8] T. M. Liggett, Interacting Particle Systems, Springer Verlag, Berlin–Heidelberg–Tokyo (1985).
[9] M. Reed and B. Simon, Methods of Modern Mathematical Physics II, Academic Press, 1980.
[10] L. Gross, Logarithmic Sobolev inequalities, Amer. J. Math. 97 (1975), 1061–1083.
[11] C. Ané, S. Blachère, D. Chafaï, P. Fougères, I. Gentil, F. Malrieu, C. Roberto and G. Scheffer, Sur les inégalités de Sobolev logarithmiques, Panoramas et Synthèses 10, Société Mathématique de France, Paris, 2000, pp. xvi+217.
[12] L. Gross, Logarithmic Sobolev inequalities and contractivity properties of semigroups, Springer Lecture Notes in Math. 1563 (1994), 54–88.
[13] J. D. Deuschel and D. W. Stroock, Large Deviations, AMS Chelsea Publishing (2001).
[14] D. W. Stroock and B. Zegarlinski, The logarithmic Sobolev inequality for discrete spin systems on a lattice, Commun. Math. Phys. 149 (1992), 175–193.
[15] F. Martinelli and E. Olivieri, Approach to equilibrium of Glauber dynamics in the one phase region I. The attractive case, Commun. Math. Phys. 161 (1994), 447–486.
[16] F. Martinelli and E. Olivieri, Approach to equilibrium of Glauber dynamics in the one phase region II. General case, Commun. Math. Phys. 161 (1994), 487–514.
[17] N. Yoshida, The equivalence of the log-Sobolev inequality and a mixing condition for unbounded spin systems on the lattice, Ann. Inst. Henri Poincaré, Probabilités et Statistiques 37(2) (2001), 223–243.
[18] R. Holley and D. W. Stroock, L2 theory for the stochastic Ising model, Z. Wahrsch. verw. Gebiete 35 (1976), 87–101.
[19] R. Holley and D. W. Stroock, Applications of the stochastic Ising model to the Gibbs states, Commun. Math. Phys. 48 (1976), 249–265.
[20] D. W. Stroock and B. Zegarlinski, The equivalence of the logarithmic Sobolev inequality and the Dobrushin–Shlosman mixing condition, Commun. Math. Phys. 144 (1992), 303–323.
[21] F. Cesi, Quasi-factorization of the entropy and logarithmic Sobolev inequalities for Gibbs random fields, Prob. Th. Rel. Fields 120 (2001), 569–584.
[22] F. Martinelli, E. Olivieri and R. H. Schonmann, For 2-D lattice spin systems weak mixing implies strong mixing, Commun. Math. Phys. 165 (1994), 33–47.
[23] R. H. Schonmann and S. B. Shlosman, Complete analyticity for 2D Ising completed, Commun. Math. Phys. 170 (1995), 453–482.
[24] N. Yoshida, Application of log-Sobolev inequality to the stochastic dynamics of unbounded spin systems on the lattice, J. Funct. Anal. 173 (2000), 74–102.
[25] B. Zegarlinski, The strong decay to equilibrium for the stochastic dynamics of unbounded spin systems on a lattice, Commun. Math. Phys. 175 (1996), 401–432.
[26] B. Zegarlinski, Log-Sobolev inequalities for infinite one dimensional lattice systems, Commun. Math. Phys. 133 (1990), 147–162.
[27] B. Zegarlinski, Dobrushin Uniqueness Theorem and Logarithmic Sobolev Inequalities, J. Funct. Anal. 105 (1992), 77–111.
[28] R. L. Dobrushin and S. Shlosman, Completely analytical Gibbs fields, in Statistical Physics and Dynamical Systems, eds. J. Fritz, A. Jaffe and D. Szasz, Birkhäuser (1985).
[29] Y. Higuchi, unpublished result.
[30] N. Yoshida, The log-Sobolev inequality for weakly coupled lattice fields, Prob. Th. Rel. Fields 115 (1999), 1–40.
[31] R. L. Dobrushin and S. Shlosman, Constructive criterion for the uniqueness of Gibbs field, in Statistical Physics and Dynamical Systems, eds. J. Fritz, A. Jaffe and D. Szasz, Birkhäuser (1985).
[32] S. Lu and H. T. Yau, Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics, Commun. Math. Phys. 156 (1993), 399–433.
[33] T. Bodineau and B. Helffer, Log-Sobolev inequality for unbounded spin systems, J. Funct. Anal. 166 (1999), 168–178.
[34] I. Gentil and C. Roberto, Spectral gaps for spin systems: some non-convex phase examples, J. Funct. Anal. 180 (2001), 66–84.
[35] A. Procacci and B. Scoppola, On decay of correlations for unbounded spin systems with arbitrary boundary conditions, J. Stat. Phys. 105 (2001), 453–482.
[36] M. Ledoux, Log-Sobolev inequality for unbounded spin systems revisited, Lecture Notes in Math. 1755, Springer, Berlin (2001).
[37] F. Cesi and F. Martinelli, On the layering transition of an SOS surface interacting with a wall. I. Equilibrium results, J. Stat. Phys. 82 (1996), 823–913.
[38] F. Cesi and F. Martinelli, On the layering transition of an SOS surface interacting with a wall. II. The Glauber dynamics, Commun. Math. Phys. 177 (1996), 173–201.
[39] E. I. Dinaburg and A. E. Mazel, Layering transition in SOS model with external magnetic field, J. Stat. Phys. 74 (1994), 533–563.
[40] R. H. Schonmann and N. Yoshida, Exponential relaxation of Glauber dynamics with some special boundary conditions, Commun. Math. Phys. 189 (1997), 299–310.
[41] N. Yoshida, Finite volume Glauber dynamics in a small magnetic field, J. Stat. Phys. 90 (1998), 1015–1035.
[42] R. H. Schonmann, Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region, Commun. Math. Phys. 161 (1994), 1–49.
[43] F. Cesi, G. Guadagni, F. Martinelli and R. H. Schonmann, On the 2D dynamical Ising model in the phase coexistence region near the critical point, J. Stat. Phys. 85 (1996), 55–102.
[44] L. E. Thomas, Bound on the mass gap for finite volume stochastic Ising models at low temperature, Commun. Math. Phys. 126 (1989), 1–11.
[45] D. B. Abraham and A. Martin-Löf, The transfer matrix for a pure phase in the two dimensional Ising model, Commun. Math. Phys. 32 (1973), 245–268.
[46] Y. Higuchi and N. Yoshida, Slow relaxation of stochastic Ising models with random and non-random boundary conditions, in New Trends in Stochastic Analysis, eds. K. D. Elworthy, S. Kusuoka and I. Shigekawa, World Scientific Publishing (1997).
[47] K. S. Alexander and N. Yoshida, The spectral gap of the 2-D stochastic Ising model with mixed boundary conditions, J. Stat. Phys. 104 (2001), 89–109.
[48] N. Sugimine, Extension of Thomas' result and upper bound on the spectral gap of $d(\geq 3)$-dimensional stochastic Ising models, J. Math. Kyoto Univ. 42 (2002), 141–160.
[49] K. S. Alexander, The spectral gap of the 2-D stochastic Ising model with nearly single-spin boundary conditions, J. Stat. Phys. 104 (2001), 59–87.
[50] F. Martinelli, On the two dimensional dynamical Ising model in the phase coexistence region, J. Stat. Phys. 76 (1994), 1179–1246.
[51] T. Bodineau and F. Martinelli, Some new results on the kinetic Ising model in a pure phase, J. Stat. Phys. 109 (2002), 207–235.
[52] Y. Higuchi and J. Wang, Spectral gap of Ising model for Dobrushin's boundary condition in two dimensions, preprint, 1999.
[53] N. Sugimine, A lower bound on the spectral gap of the 3-dimensional stochastic Ising models, preprint, 2002.
[54] R. Holley and D. W. Stroock, Logarithmic Sobolev inequality and stochastic Ising models, J. Stat. Phys. 46 (1987), 1159–1194.
[55] R. Dobrushin and S. Shlosman, Completely analytical interactions: Constructive description, J. Stat. Phys. 46 (1987), 983–1014.
[56] T. Bodineau and B. Helffer, Correlations, Spectral gap and Log-Sobolev inequality for unbounded spin systems, in Differential Equations and Mathematical Physics, International Press, Birmingham, 1999, pp. 27–42.
December 8, 2003 11:39 WSPC/148-RMP
00181
Reviews in Mathematical Physics, Vol. 15, No. 8 (2003) 789–822 © World Scientific Publishing Company
COIDEAL SUBALGEBRAS IN QUANTUM AFFINE ALGEBRAS
A. I. MOLEV
School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia
[email protected]

E. RAGOUCY∗ and P. SORBA†
LAPTH, Chemin de Bellevue, BP 110, F-74941 Annecy-le-Vieux cedex, France
∗[email protected]
†[email protected]

Received 15 February 2003
Revised 21 August 2003

We introduce two subalgebras in the type A quantum affine algebra which are coideals with respect to the Hopf algebra structure. In the classical limit $q \to 1$ each subalgebra specializes to the enveloping algebra $U(\mathfrak{k})$, where $\mathfrak{k}$ is a fixed point subalgebra of the loop algebra $\mathfrak{gl}_N[\lambda, \lambda^{-1}]$ with respect to a natural involution corresponding to the embedding of the orthogonal or symplectic Lie algebra into $\mathfrak{gl}_N$. We also give an equivalent presentation of these coideal subalgebras in terms of generators and defining relations which have the form of reflection-type equations. We provide evaluation homomorphisms from these algebras to the twisted quantized enveloping algebras introduced earlier by Gavrilik and Klimyk and by Noumi. We also construct an analog of the quantum determinant for each of the algebras and show that its coefficients belong to the center of the algebra. Their images under the evaluation homomorphism provide a family of central elements of the corresponding twisted quantized enveloping algebra.

Keywords: Quantized enveloping algebra; quantum determinant; evaluation homomorphism.
1. Introduction

For a simple Lie algebra $\mathfrak{g}$ over $\mathbb{C}$ consider the corresponding quantized enveloping algebra $U_q(\mathfrak{g})$; see Drinfeld [9], Jimbo [18]. If $\mathfrak{k}$ is a subalgebra of $\mathfrak{g}$ then $U(\mathfrak{k})$ is a Hopf subalgebra of $U(\mathfrak{g})$. However, $U_q(\mathfrak{k})$, even when it is defined, need not be isomorphic to a Hopf subalgebra of $U_q(\mathfrak{g})$. In the case where $(\mathfrak{g}, \mathfrak{k})$ is a classical symmetric pair the twisted quantized enveloping algebra $U_q^{\rm tw}(\mathfrak{k})$ was introduced by Noumi [31] (type A pairs) and by Noumi and Sugitani [32] (remaining classical types). This is a subalgebra and a left coideal of the Hopf algebra $U_q(\mathfrak{g})$ which specializes to $U(\mathfrak{k})$ as $q \to 1$. The algebras $U_q^{\rm tw}(\mathfrak{k})$ play an important role in the theory of quantum
symmetric spaces developed in [31] and [32]. In particular, in type A, the only type we are concerned with in this paper, there are two twisted quantized enveloping algebras $U_q^{\rm tw}(\mathfrak{o}_N)$ and $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ corresponding to the symmetric pairs
$$ {\rm AI}: (\mathfrak{gl}_N, \mathfrak{o}_N), \qquad {\rm AII}: (\mathfrak{gl}_{2n}, \mathfrak{sp}_{2n}), \tag{1.1} $$
respectively. It was also shown by Noumi [31] that the algebra $U_q^{\rm tw}(\mathfrak{o}_N)$ coincides with the one introduced earlier by Gavrilik and Klimyk [12]. The algebra $U_q^{\rm tw}(\mathfrak{o}_N)$ also appears as the symmetry algebra for the q-oscillator representation of the quantized enveloping algebra $U_q(\mathfrak{sp}_{2n})$; see Noumi, Umeda and Wakayama [33, 34]. In Noumi's approach, the defining relations for the quantized algebras can be written in the form of a reflection-type equation. A constant solution of the reflection equation provides an embedding of the twisted quantized enveloping algebra into $U_q(\mathfrak{gl}_N)$. The quantum homogeneous spaces corresponding to the remaining series of the classical symmetric pairs of type A
$$ {\rm AIII}: (\mathfrak{gl}_N, \mathfrak{gl}_{N-l} \oplus \mathfrak{gl}_l) \tag{1.2} $$
were studied by Dijkhuizen, Noumi and Sugitani [6, 7]. A one-parameter family of constant solutions of the appropriate reflection equation was produced in [7], although no reflection-type presentation of the subalgebras of type $U_q^{\rm tw}(\mathfrak{k})$ was formally introduced.

A different description of the coideal subalgebras of $U_q(\mathfrak{g})$ associated with an arbitrary irreducible symmetric pair $(\mathfrak{g}, \mathfrak{k})$ was given by Letzter [22, 23]. The subalgebras are presented by generators and explicit relations depending on the Cartan matrix of $\mathfrak{g}$; see [23]. In particular, this work demonstrates the importance of the coideal property: it makes the construction of the twisted quantized algebras essentially unique.

Natural infinite-dimensional analogs of the symmetric pairs are provided by involutive subalgebras in the polynomial current Lie algebras $\mathfrak{g}[x] = \mathfrak{g} \otimes \mathbb{C}[x]$ or loop algebras $\mathfrak{g}[\lambda, \lambda^{-1}] = \mathfrak{g} \otimes \mathbb{C}[\lambda, \lambda^{-1}]$. Let $(\mathfrak{g}, \mathfrak{k})$ be a symmetric pair and $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$ be the decomposition determined by the involution $\theta$ of $\mathfrak{g}$, so that $\mathfrak{k}$ and $\mathfrak{p}$ are the eigenspaces of $\theta$ with the eigenvalues $1$ and $-1$, respectively. Then the twisted polynomial current Lie algebra $\mathfrak{g}[x]^\theta$ can be defined by
$$ \mathfrak{g}[x]^\theta = \mathfrak{k} \oplus \mathfrak{p}x \oplus \mathfrak{k}x^2 \oplus \mathfrak{p}x^3 \oplus \cdots, \tag{1.3} $$
or, equivalently, it is the fixed point subalgebra of $\mathfrak{g}[x]$ with respect to the extension of $\theta$ given by
$$ \theta: A x^p \mapsto (-1)^p\, \theta(A)\, x^p, \qquad A \in \mathfrak{g}. \tag{1.4} $$
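As a small illustration of (1.3) and (1.4), the following worked example assumes that the AI involution is realized on $\mathfrak{gl}_N$ as $\theta(A) = -A^t$; this concrete realization is an assumption made here for illustration and is not fixed in the text above.

```latex
% Assumption: the AI involution is \theta(A) = -A^t on gl_N.
\begin{align*}
\mathfrak{k} &= \{A : A^t = -A\} = \mathfrak{o}_N, &
\mathfrak{p} &= \{A : A^t = A\}, \\
\theta\big((E_{12}+E_{21})\,x\big) &= (-1)^{1}\,\theta(E_{12}+E_{21})\,x = (E_{12}+E_{21})\,x, \\
\big[(E_{12}+E_{21})\,x,\; E_{11}\,x\big] &= (E_{21}-E_{12})\,x^{2} \in \mathfrak{k}\,x^{2}.
\end{align*}
```

So a symmetric matrix times an odd power of $x$ is fixed by (1.4), and the bracket of two such elements lands in $\mathfrak{k}x^2$, in agreement with the grading in (1.3).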
As demonstrated by Drinfeld [9], the enveloping algebra U(g[x]) admits a canonical deformation in the class of Hopf algebras. The corresponding “quantum” algebra is called the Yangian and denoted by Y(g). For the case where θ is an involution of glN corresponding to the pair of type AI or AII, quantum analogs of the symmetric
pairs $(\mathfrak{gl}_N[x], \mathfrak{gl}_N[x]^\theta)$ are provided by the Olshanski twisted Yangians [35]. These are coideal subalgebras in the Yangian $Y(\mathfrak{gl}_N)$ and each of them is a deformation of the enveloping algebra $U(\mathfrak{gl}_N[x]^\theta)$. In the AIII case, the quantum analogs of the pairs $(\mathfrak{gl}_N[x], \mathfrak{gl}_N[x]^\theta)$ are provided by the reflection algebras $B(N, l)$ which are coideal subalgebras of $Y(\mathfrak{gl}_N)$ originally introduced by Sklyanin [37]. Recently, these algebras and their representations were studied in connection with the NLS model; see Liguori, Mintchev and Zhao [24], Mintchev, Ragoucy and Sorba [25], Molev and Ragoucy [29]. For the symmetric pairs $(\mathfrak{g}[x], \mathfrak{g}[x]^\theta)$ of general types the corresponding coideal subalgebras in the Yangian $Y(\mathfrak{g})$ were recently introduced by Delius, MacKay and Short [4] in relation with the principal chiral models with boundaries. These subalgebras are given in terms of the Q-presentation of the Yangian. A different R-matrix presentation of coideal subalgebras in the (super) Yangian $Y(\mathfrak{g})$ is given by Arnaudon, Avan, Crampé, Frappat and Ragoucy [1]. Field theoretical applications of the coideal subalgebras in the quantum affine algebras have been studied in a recent paper by Delius and MacKay [5].

In the case of the loop algebra $\widehat{\mathfrak{g}} = \mathfrak{g}[\lambda, \lambda^{-1}]$ there is another natural way [cf. (1.4)] to extend the involution $\theta$ of $\mathfrak{g}$,
$$ \theta: A \lambda^p \mapsto \theta(A)\, \lambda^{-p}, \qquad A \in \mathfrak{g}, \tag{1.5} $$
and thus to define the fixed point subalgebra $\widehat{\mathfrak{g}}^\theta$. In this paper we introduce certain quantizations of the symmetric pairs $(\widehat{\mathfrak{gl}}_N, \widehat{\mathfrak{gl}}_N^{\,\theta})$ associated with the involution $\theta$ corresponding to the pairs AI and AII. We define the twisted q-Yangians $Y_q^{\rm tw}(\mathfrak{o}_N)$ and $Y_q^{\rm tw}(\mathfrak{sp}_{2n})$ as subalgebras of the quantum affine algebra $U_q(\widehat{\mathfrak{gl}}_N)$. They are left coideals with respect to the coproduct on $U_q(\widehat{\mathfrak{gl}}_N)$ and specialize to $U(\widehat{\mathfrak{gl}}_N^{\,\theta})$ as $q \to 1$.

At this point we consider it necessary to comment on the terminology. Although, as we have mentioned above, the Lie algebra $\widehat{\mathfrak{gl}}_N^{\,\theta}$ is not a “twisted” quantum affine algebra in the usual meaning, we believe the names we use for the coideal subalgebras can be justified having in mind their analogy with both the twisted Yangians and the q-Yangian; cf. [30]. The latter is a subalgebra of the quantum affine algebra $U_q(\widehat{\mathfrak{gl}}_N)$ which can be regarded as a q-analog of the usual Yangian $Y(\mathfrak{gl}_N)$; see also Sec. 3 below.

Our first main result is a construction of the evaluation homomorphisms
$$ Y_q^{\rm tw}(\mathfrak{o}_N) \to U_q^{\rm tw}(\mathfrak{o}_N), \qquad Y_q^{\rm tw}(\mathfrak{sp}_{2n}) \to U_q^{\rm tw}(\mathfrak{sp}_{2n}) \tag{1.6} $$
to the corresponding twisted quantized enveloping algebras of [12] and [31]. Note that an evaluation homomorphism $U_q(\widehat{\mathfrak{g}}) \to U_q(\mathfrak{g})$ from the quantum affine algebra to the corresponding quantized enveloping algebra only exists if $\mathfrak{g}$ is of A type, and the same holds for the case of the Yangians; see Jimbo [19], Drinfeld [9]. In both cases, the evaluation homomorphisms play an important role in the representation theory of the quantum algebras; see Chari and Pressley [2]. An evaluation
homomorphism from the twisted Yangian to the corresponding enveloping algebra $U(\mathfrak{o}_N)$ or $U(\mathfrak{sp}_{2n})$ does exist (see [35, 28]) and has many applications in the classical representation theory; see e.g. [27] for an overview. Note also that the existence of the homomorphisms (1.6) is not directly related to the corresponding fact for the A type algebras but is quite a nontrivial property of the reflection equations satisfied by the generators of the twisted q-Yangians.

Next we construct an analog of the quantum determinant for each twisted q-Yangian and show that its coefficients belong to the center of this algebra. The application of the evaluation homomorphism (1.6) yields a family of central elements in $U_q^{\rm tw}(\mathfrak{o}_N)$ and $U_q^{\rm tw}(\mathfrak{sp}_{2n})$. In the orthogonal case we also produce a “short” determinant-like formula for this analog which employs a certain map from the symmetric group into itself. This same map was used in the short formulas for the Sklyanin determinants for the twisted Yangians; see [27]. Some other families of Casimir elements were constructed by Noumi, Umeda and Wakayama [34] and by Gavrilik and Iorgov [13]. It would be interesting to understand the relationship between the families, as well as to investigate possible applications to the study of the quantum Howe dual pairs; cf. [33, 34].

Another intriguing problem is to construct coideal subalgebras of $U_q(\widehat{\mathfrak{gl}}_N)$ of type AIII, i.e. to find q-analogs of the Sklyanin reflection algebras $B(N, l)$ mentioned above.

2. Coideal Subalgebras of $U_q(\mathfrak{gl}_N)$

We shall use an R-matrix presentation of the algebra $U_q(\mathfrak{gl}_N)$. Our main references are Jimbo [19] and Reshetikhin, Takhtajan and Faddeev [36]. We fix a complex parameter $q$ which is nonzero and not a root of unity. Consider the R-matrix
$$ R = q \sum_{i} E_{ii} \otimes E_{ii} + \sum_{i \neq j} E_{ii} \otimes E_{jj} + (q - q^{-1}) \sum_{i<j} E_{ij} \otimes E_{ji} \tag{2.1} $$
which is an element of ${\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N$, where the $E_{ij}$ denote the standard matrix units and the indices run over the set $\{1, \dots, N\}$. The R-matrix satisfies the Yang–Baxter equation
$$ R_{12} R_{13} R_{23} = R_{23} R_{13} R_{12}, \tag{2.2} $$
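where both sides take values in ${\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N$ and the subindices indicate the copies of ${\rm End}\,\mathbb{C}^N$, e.g. $R_{12} = R \otimes 1$ etc. The quantized enveloping algebra $U_q(\mathfrak{gl}_N)$ is generated by elements $t_{ij}$ and $\bar t_{ij}$ with $1 \leq i, j \leq N$ subject to the relations$^a$
$$ \begin{aligned} t_{ij} = \bar t_{ji} &= 0, \qquad 1 \leq i < j \leq N, \\ t_{ii}\, \bar t_{ii} = \bar t_{ii}\, t_{ii} &= 1, \qquad 1 \leq i \leq N, \\ R\, T_1 T_2 = T_2 T_1 R, \qquad R\, \bar T_1 \bar T_2 &= \bar T_2 \bar T_1 R, \qquad R\, \bar T_1 T_2 = T_2 \bar T_1 R. \end{aligned} \tag{2.3} $$
$^a$ Our $T$ and $\bar T$ correspond to the L-operators $L^-$ and $L^+$, respectively, in the notation of [36].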
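The Yang–Baxter equation (2.2) for the R-matrix (2.1) can be checked numerically for small $N$. The sketch below is only an illustration under stated assumptions: the function names (`r_matrix`, `embed`, `mul`, `yang_baxter_holds`) are hypothetical, the basis vector $e_a \otimes e_b$ is given the index $aN + b$ with 0-based $a, b$, and `q` is expected to be a `Fraction` so that the arithmetic is exact.

```python
from fractions import Fraction

def r_matrix(N, q):
    """R of (2.1) on C^N ⊗ C^N; e_a ⊗ e_b has index a*N + b (0-based)."""
    lam = q - 1 / q
    R = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            R[i * N + j][i * N + j] = q if i == j else Fraction(1)
            if i < j:
                # E_ij ⊗ E_ji sends e_j ⊗ e_i to e_i ⊗ e_j
                R[i * N + j][j * N + i] = lam
    return R

def embed(R, N, s, t):
    """Let the two-site operator R act on tensor factors s, t of (C^N)^{⊗3}."""
    (rest,) = {0, 1, 2} - {s, t}
    M = N ** 3
    digits = lambda x: (x // (N * N), x // N % N, x % N)
    out = [[Fraction(0)] * M for _ in range(M)]
    for row in range(M):
        a = digits(row)
        for col in range(M):
            c = digits(col)
            if a[rest] == c[rest]:
                out[row][col] = R[a[s] * N + a[t]][c[s] * N + c[t]]
    return out

def mul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def yang_baxter_holds(N, q):
    """Compare both sides of R12 R13 R23 = R23 R13 R12 exactly."""
    R = r_matrix(N, q)
    R12, R13, R23 = embed(R, N, 0, 1), embed(R, N, 0, 2), embed(R, N, 1, 2)
    return mul(mul(R12, R13), R23) == mul(mul(R23, R13), R12)
```

Calling `yang_baxter_holds(2, Fraction(2))` compares the two $8 \times 8$ products entry by entry; exact rationals are used so that no floating-point tolerance has to be chosen.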
Here $T$ and $\bar T$ are the matrices
$$ T = \sum_{i,j} t_{ij} \otimes E_{ij}, \qquad \bar T = \sum_{i,j} \bar t_{ij} \otimes E_{ij}, \tag{2.4} $$
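which are regarded as elements of the algebra $U_q(\mathfrak{gl}_N) \otimes {\rm End}\,\mathbb{C}^N$. Both sides of each of the R-matrix relations in (2.3) are elements of $U_q(\mathfrak{gl}_N) \otimes {\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N$ and the subindices of $T$ and $\bar T$ indicate the copies of ${\rm End}\,\mathbb{C}^N$ where $T$ or $\bar T$ acts; e.g. $T_1 = T \otimes 1$. In terms of the generators the defining relations between the $t_{ij}$ can be written as
$$ q^{\delta_{ij}}\, t_{ia} t_{jb} - q^{\delta_{ab}}\, t_{jb} t_{ia} = (q - q^{-1}) (\delta_{b<a} - \delta_{i<j})\, t_{ja} t_{ib}, \tag{2.5} $$
where $\delta_{i<j}$ equals 1 if $i < j$ and 0 otherwise. The relations between the $\bar t_{ij}$ are obtained by replacing $t_{ij}$ by $\bar t_{ij}$ everywhere in (2.5). Finally, the relations involving both $t_{ij}$ and $\bar t_{ij}$ have the form
$$ q^{\delta_{ij}}\, \bar t_{ia} t_{jb} - q^{\delta_{ab}}\, t_{jb} \bar t_{ia} = (q - q^{-1}) (\delta_{b<a}\, t_{ja} \bar t_{ib} - \delta_{i<j}\, \bar t_{ja} t_{ib}). \tag{2.6} $$
Introduce another R-matrix $\tilde R$ by
$$ \tilde R = q^{-1} \sum_{i} E_{ii} \otimes E_{ii} + \sum_{i \neq j} E_{ii} \otimes E_{jj} - (q - q^{-1}) \sum_{i>j} E_{ij} \otimes E_{ji}. \tag{2.7} $$
We have the relations
$$ R - \tilde R = (q - q^{-1})\, P, \qquad \tilde R = P R^{-1} P, \tag{2.8} $$
where
$$ P = \sum_{i,j} E_{ij} \otimes E_{ji} \tag{2.9} $$
is the permutation operator. The following relations are implied by (2.3):
$$ \tilde R\, T_1 T_2 = T_2 T_1 \tilde R, \qquad \tilde R\, \bar T_1 \bar T_2 = \bar T_2 \bar T_1 \tilde R, \qquad \tilde R\, T_1 \bar T_2 = \bar T_2 T_1 \tilde R. \tag{2.10} $$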
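The two relations in (2.8) lend themselves to a quick exact check. The sketch below assumes that $\tilde R$ has the explicit form $q^{-1}\sum_i E_{ii}\otimes E_{ii} + \sum_{i\neq j} E_{ii}\otimes E_{jj} - (q-q^{-1})\sum_{i>j} E_{ij}\otimes E_{ji}$, which is the form forced by the first relation in (2.8); the helper names are hypothetical and `q` should be a `Fraction`.

```python
from fractions import Fraction

def two_site(N, diag, upper, lower):
    """diag*sum E_ii⊗E_ii + sum_{i!=j} E_ii⊗E_jj
    + upper*sum_{i<j} E_ij⊗E_ji + lower*sum_{i>j} E_ij⊗E_ji,
    in the basis e_a ⊗ e_b -> index a*N + b (0-based)."""
    M = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            M[i * N + j][i * N + j] = Fraction(diag if i == j else 1)
            if i != j:
                M[i * N + j][j * N + i] = Fraction(upper if i < j else lower)
    return M

def perm(N):
    """Permutation operator P = sum_{i,j} E_ij ⊗ E_ji of (2.9)."""
    P = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            P[i * N + j][j * N + i] = Fraction(1)
    return P

def mul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def check_2_8(N, q):
    """Check R - R~ = (q - 1/q) P and R (P R~ P) = 1, i.e. R~ = P R^{-1} P."""
    lam = q - 1 / q
    R = two_site(N, q, lam, 0)            # (2.1)
    Rtilde = two_site(N, 1 / q, 0, -lam)  # assumed explicit form of R~
    P = perm(N)
    size = N * N
    diff_ok = all(R[r][c] - Rtilde[r][c] == lam * P[r][c]
                  for r in range(size) for c in range(size))
    ident = [[Fraction(int(r == c)) for c in range(size)] for r in range(size)]
    inverse_ok = mul(R, mul(P, mul(Rtilde, P))) == ident
    return diff_ok and inverse_ok
```

Since $P^2 = 1$, the identity $R\,(P \tilde R P) = 1$ is equivalent to $\tilde R = P R^{-1} P$, so no explicit matrix inversion is needed.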
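The coproduct $\Delta$ on $U_q(\mathfrak{gl}_N)$ is defined by the relations
$$ \Delta(t_{ij}) = \sum_{k=1}^{N} t_{ik} \otimes t_{kj}, \qquad \Delta(\bar t_{ij}) = \sum_{k=1}^{N} \bar t_{ik} \otimes \bar t_{kj}. \tag{2.11} $$
It is well known that the algebra $U_q(\mathfrak{gl}_N)$ specializes to $U(\mathfrak{gl}_N)$ as $q \to 1$. To make this more precise, regard $q$ as a formal variable and $U_q(\mathfrak{gl}_N)$ as an algebra over $\mathbb{C}(q)$. Then set $\mathcal{A} = \mathbb{C}[q, q^{-1}]$ and consider the $\mathcal{A}$-subalgebra $U_{\mathcal{A}}$ of $U_q(\mathfrak{gl}_N)$ generated by the elements
$$ \tau_{ij} = \frac{t_{ij}}{q - q^{-1}} \quad \text{for } i > j, \qquad \bar\tau_{ij} = \frac{\bar t_{ij}}{q - q^{-1}} \quad \text{for } i < j, \tag{2.12} $$
and
$$ \tau_{ii} = \frac{t_{ii} - 1}{q - 1}, \qquad \bar\tau_{ii} = \frac{\bar t_{ii} - 1}{q - 1}, \tag{2.13} $$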
for $i = 1, \dots, N$. Then we have an isomorphism
$$ U_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C} \cong U(\mathfrak{gl}_N) \tag{2.14} $$
with the action of $\mathcal{A}$ on $\mathbb{C}$ defined via the evaluation $q = 1$; see e.g. [2, Sec. 9.2]. Note that $\tau_{ij}$ and $\bar\tau_{ij}$ respectively specialize to the elements $E_{ij}$ and $-E_{ij}$ of $U(\mathfrak{gl}_N)$. More generally, given a subalgebra $V$ of $U_q(\mathfrak{gl}_N)$, set $V_{\mathcal{A}} = V \cap U_{\mathcal{A}}$. Following Letzter [22, Sec. 1], we shall say that $V$ specializes to the subalgebra $V^\circ$ of $U(\mathfrak{gl}_N)$ (as $q$ goes to 1) if the image of $V_{\mathcal{A}}$ in $U_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ is $V^\circ$.

2.1. Orthogonal case

Following Noumi [31], we introduce the twisted quantized enveloping algebra $U_q^{\rm tw}(\mathfrak{o}_N)$ as the subalgebra of $U_q(\mathfrak{gl}_N)$ generated by the matrix elements of the matrix $S = T\, \bar T^{\,t}$. It can be easily derived from (2.3) (see [31]) that the matrix $S$ satisfies the relations
$$ s_{ij} = 0, \qquad 1 \leq i < j \leq N, \tag{2.15} $$
$$ s_{ii} = 1, \qquad 1 \leq i \leq N, \tag{2.16} $$
$$ R\, S_1 R^{\,t} S_2 = S_2 R^{\,t} S_1 R, \tag{2.17} $$
where $R^{\,t} := R^{\,t_1}$ denotes the element obtained from $R$ by the transposition in the first tensor factor:
$$ R^{\,t} = q \sum_{i} E_{ii} \otimes E_{ii} + \sum_{i \neq j} E_{ii} \otimes E_{jj} + (q - q^{-1}) \sum_{i<j} E_{ji} \otimes E_{ji}. \tag{2.18} $$
Indeed, the only nontrivial part of this derivation is to verify that
$$ R\, T_1 \bar T_1^{\,t} R^{\,t}\, T_2 \bar T_2^{\,t} = T_2 \bar T_2^{\,t} R^{\,t}\, T_1 \bar T_1^{\,t} R. \tag{2.19} $$
However, this is implied by the relation $R R^{\,t} = R^{\,t} R$ and the following consequences of (2.3):
$$ \bar T_1^{\,t} R^{\,t}\, T_2 = T_2 R^{\,t}\, \bar T_1^{\,t}, \qquad R\, \bar T_1^{\,t} \bar T_2^{\,t} = \bar T_2^{\,t} \bar T_1^{\,t} R. \tag{2.20} $$
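The commutation relation $R R^{\,t} = R^{\,t} R$ used above is easy to confirm for small $N$. The following is a minimal sketch, assuming the index convention $e_a \otimes e_b \mapsto aN + b$ (0-based) and a `Fraction` value for `q`; the function names are hypothetical.

```python
from fractions import Fraction

def r_and_rt(N, q):
    """Return (R, R^t) of (2.1) and (2.18) as N^2 x N^2 nested lists."""
    lam = q - 1 / q
    R = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    Rt = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            R[i * N + j][i * N + j] = Rt[i * N + j][i * N + j] = (
                q if i == j else Fraction(1))
            if i < j:
                R[i * N + j][j * N + i] = lam    # E_ij ⊗ E_ji
                Rt[j * N + j][i * N + i] = lam   # E_ji ⊗ E_ji
    return R, Rt

def mul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def r_commutes_with_rt(N, q):
    """Exact check of R R^t = R^t R."""
    R, Rt = r_and_rt(N, q)
    return mul(R, Rt) == mul(Rt, R)
```

The check passes because the off-diagonal part of $R$ only moves vectors $e_j \otimes e_i$ with $i \neq j$, while the off-diagonal part of $R^{\,t}$ only moves vectors $e_i \otimes e_i$, so the two parts annihilate each other's images.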
We now prove an auxiliary lemma which establishes a weak form of the Poincaré–Birkhoff–Witt theorem for abstract algebras defined by the relation (2.17). It will be used in both the orthogonal and symplectic cases.

Lemma 2.1. Consider the associative algebra with $N^2$ generators $s_{ij}$, $i, j = 1, \dots, N$ and the defining relations written in terms of the matrix $S = (s_{ij})$ by the relation (2.17). Then the ordered monomials of the form
$$ s_{11}^{k_{11}} s_{12}^{k_{12}} \cdots s_{1N}^{k_{1N}} \cdots s_{N1}^{k_{N1}} s_{N2}^{k_{N2}} \cdots s_{NN}^{k_{NN}} \tag{2.21} $$
with nonnegative powers $k_{ij}$ linearly span the algebra.
Proof. Rewriting (2.17) in terms of the generators we get
$$ \begin{aligned} q^{\delta_{aj} + \delta_{ij}}\, s_{ia} s_{jb} - q^{\delta_{ab} + \delta_{ib}}\, s_{jb} s_{ia} ={}& (q - q^{-1})\, q^{\delta_{ai}} (\delta_{b<a} - \delta_{i<j})\, s_{ja} s_{ib} \\ &+ (q - q^{-1}) \big( q^{\delta_{ab}}\, \delta_{b<i}\, s_{ji} s_{ba} - q^{\delta_{ij}}\, \delta_{a<j}\, s_{ij} s_{ab} \big) \\ &+ (q - q^{-1})^2 (\delta_{b<a<i} - \delta_{a<i<j})\, s_{ji} s_{ab}, \end{aligned} \tag{2.22} $$
where $\delta_{i<j}$ or $\delta_{i<j<k}$ equals 1 if the subindex inequality is satisfied and 0 otherwise. Given a monomial
$$ s_{i_1 a_1} \cdots s_{i_p a_p} \tag{2.23} $$
we introduce its length $p$ and weight $w$ by $w = i_1 + \cdots + i_p$. Clearly, for a monomial of length $p$ the weight can range between $p$ and $pN$. We shall use induction on $w$. By (2.22) we obtain the following equality modulo products of weight less than $i + j$: for $i \geq j$
$$ q^{\delta_{aj} + \delta_{ij}}\, s_{ia} s_{jb} \equiv q^{\delta_{ab} + \delta_{ib}}\, s_{jb} s_{ia} + \delta_{b<a}\, (q - q^{-1})\, q^{\delta_{ai}}\, s_{ja} s_{ib}. \tag{2.24} $$
This allows us to represent (2.23) modulo monomials of weight less than $w$ as a linear combination of monomials $s_{j_1 b_1} \cdots s_{j_p b_p}$ of weight $w$ such that $j_1 \leq \cdots \leq j_p$. Consider a submonomial $s_{i c_1} \cdots s_{i c_r}$ containing generators with the same first index. By (2.24) with $i = j$ we have for $a > b$
$$ q^{\delta_{ai} - 1}\, s_{ia} s_{ib} \equiv q^{\delta_{ab} + \delta_{ib}}\, s_{ib} s_{ia}. \tag{2.25} $$
Using this relation repeatedly we bring the submonomial to the required form.

We shall now prove that (2.15)–(2.17) are precisely the defining relations of the algebra $U_q^{\rm tw}(\mathfrak{o}_N)$. In other words, the following theorem holds.

Theorem 2.2. The abstract algebra $\mathcal{S}$ generated by elements $s_{ij}$, $i, j = 1, \dots, N$ with the defining relations (2.15)–(2.17) is isomorphic to $U_q^{\rm tw}(\mathfrak{o}_N)$.

Proof. It was mentioned in [34, Remark 7.9(1)] without a detailed proof that this fact can be established with the use of the Diamond Lemma. We employ a different approach based on a weak Poincaré–Birkhoff–Witt theorem for the algebra $\mathcal{S}$. For this proof only, denote the matrix $T\, \bar T^{\,t}$ by $\tilde S$ and its matrix elements by $\tilde s_{ij}$. As we noted above, the map $S \mapsto \tilde S$ defines an algebra homomorphism $\varphi: \mathcal{S} \to U_q^{\rm tw}(\mathfrak{o}_N)$. We only need to show that this homomorphism is injective. By Lemma 2.1 the monomials
$$ s_{21}^{k_{21}} s_{31}^{k_{31}} s_{32}^{k_{32}} \cdots s_{N1}^{k_{N1}} s_{N2}^{k_{N2}} \cdots s_{N,N-1}^{k_{N,N-1}} \tag{2.26} $$
span the algebra $\mathcal{S}$. We shall show that the images of these monomials under $\varphi$ are linearly independent. We prove, in fact, that given any linear ordering on the set of generators $\tilde s_{ij}$, $i > j$, the ordered monomials in the $\tilde s_{ij}$ are linearly independent. Regarding $q$ as a formal variable, we keep the notation $\mathcal{A}$ for the algebra of Laurent
polynomials $\mathbb{C}[q, q^{-1}]$; see Sec. 2. Set $V = U_q^{\rm tw}(\mathfrak{o}_N)$ and note that the subalgebra $V_{\mathcal{A}}$ is generated by the elements $\sigma_{ij} := \tilde s_{ij}/(q - q^{-1})$ with $i > j$. It is enough to verify that the ordered monomials in the generators $\sigma_{ij}$ are linearly independent over $\mathcal{A}$. We have for $i > j$
$$ \sigma_{ij} = \tau_{ij} + \bar\tau_{ji} + (q - 1)(\tau_{ii} \bar\tau_{ji} + \tau_{ij} \bar\tau_{jj}) + (q - q^{-1}) \sum_{j<a<i} \tau_{ia} \bar\tau_{ja}. \tag{2.27} $$
Therefore, the image of $\sigma_{ij}$ in $V_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ is $F_{ij} := E_{ij} - E_{ji}$. The elements $F_{ij}$ with $i > j$ constitute a basis of a subalgebra of $\mathfrak{gl}_N$ isomorphic to the orthogonal Lie algebra $\mathfrak{o}_N$. By the Poincaré–Birkhoff–Witt theorem for $\mathfrak{o}_N$ the ordered monomials in the $F_{ij}$ are linearly independent. Now suppose, on the contrary, that there exists a nontrivial linear combination of the ordered monomials in the $\sigma_{ij}$ equal to zero:
$$ \sum_{(k)} c^{(k)} \prod_{i>j} \sigma_{ij}^{k_{ij}} = 0, \tag{2.28} $$
summed over the multi-indices $(k) = (k_{ij} \mid i > j)$, where $c^{(k)} \in \mathcal{A}$. We may assume that at least one coefficient $c^{(k)}$ in (2.28) does not vanish at $q = 1$. Taking the image of (2.28) in $V_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ yields a nontrivial linear combination of the ordered monomials in the $F_{ij}$ equal to zero. This is a contradiction, which proves the claim.

As a corollary of the above argument we obtain an analog of the Poincaré–Birkhoff–Witt theorem for the algebra $\mathcal{S}$. A different proof was given by N. Iorgov; see [15].

Corollary 2.3. The monomials (2.26) constitute a basis of the algebra $\mathcal{S}$.

It follows from the proof of the theorem that the algebra $V_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ coincides with the enveloping algebra $U(\mathfrak{o}_N)$. Thus, $U_q^{\rm tw}(\mathfrak{o}_N)$ specializes to $U(\mathfrak{o}_N)$ as $q \to 1$; see [12] and [31, Sec. 2.4]. This result also follows from [21, Sec. 6]. Regarding $U_q^{\rm tw}(\mathfrak{o}_N)$ as a subalgebra of $U_q(\mathfrak{gl}_N)$ we introduce another matrix $\bar S$ by
$$ \bar S = \bar T\, T^{\,t}. \tag{2.29} $$
The matrix $\bar S = (\bar s_{ij})$ is upper triangular with 1's on the diagonal. It is related to $S$ by the formula
$$ \bar S = 1 - q + q\, S^{\,t}, \tag{2.30} $$
or, in terms of the matrix elements, $\bar s_{ij} = q\, s_{ji}$ for $i < j$. Indeed, (2.3) implies
$$ \bar t_{ia}\, t_{ja} = q\, t_{ja}\, \bar t_{ia}. \tag{2.31} $$
Taking the sum over $a$ gives the result. So, the elements $\bar s_{ij}$ belong to the subalgebra $U_q^{\rm tw}(\mathfrak{o}_N)$.
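Returning briefly to the specialization used in the proof of Theorem 2.2: the claim that the elements $F_{ij} = E_{ij} - E_{ji}$ span a Lie subalgebra of $\mathfrak{gl}_N$ can be illustrated with integer matrices. The sketch below uses hypothetical helper names and 1-based indices for $F_{ij}$; it checks that every commutator of two generators is again antisymmetric, hence lies in the span of the $F_{ij}$.

```python
def F(N, i, j):
    """F_ij = E_ij - E_ji as an N x N integer matrix (1-based indices)."""
    M = [[0] * N for _ in range(N)]
    M[i - 1][j - 1] += 1
    M[j - 1][i - 1] -= 1
    return M

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    n = len(A)
    AB = [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]
    BA = [[sum(B[r][k] * A[k][c] for k in range(n)) for c in range(n)] for r in range(n)]
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def is_antisymmetric(M):
    """True if M^t = -M."""
    n = len(M)
    return all(M[r][c] == -M[c][r] for r in range(n) for c in range(n))

def closed_under_bracket(N):
    """All commutators of the generators F_ij (i > j) are antisymmetric."""
    gens = [F(N, i, j) for i in range(1, N + 1) for j in range(1, i)]
    return all(is_antisymmetric(bracket(A, B)) for A in gens for B in gens)
```

For instance, `bracket(F(3, 2, 1), F(3, 3, 2))` equals `F(3, 1, 3)`, that is, $[F_{21}, F_{32}] = F_{13} = -F_{31}$.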
The next proposition is immediate from (2.11). It shows that the subalgebra $U_q^{\rm tw}(\mathfrak{o}_N)$ is a left coideal of $U_q(\mathfrak{gl}_N)$. This property was mentioned in [31, Sec. 2.4] and proofs can be found in [14] and [21, Lemma 6.3].

Proposition 2.4. The image of the generator $s_{ij}$ of $U_q^{\rm tw}(\mathfrak{o}_N)$ under the coproduct is given by
$$ \Delta(s_{ij}) = \sum_{k,l=1}^{N} t_{ik}\, \bar t_{jl} \otimes s_{kl}. \tag{2.32} $$
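The short computation behind (2.32) can be written out as follows. It uses only the entries $s_{ij} = \sum_a t_{ia} \bar t_{ja}$ of $S = T\,\bar T^{\,t}$, the coproduct (2.11), and the fact that $\Delta$ is an algebra homomorphism.

```latex
\begin{align*}
\Delta(s_{ij})
 &= \sum_{a=1}^{N} \Delta(t_{ia})\,\Delta(\bar t_{ja})
  = \sum_{a=1}^{N} \Big(\sum_{k=1}^{N} t_{ik}\otimes t_{ka}\Big)
    \Big(\sum_{l=1}^{N} \bar t_{jl}\otimes \bar t_{la}\Big) \\
 &= \sum_{k,l=1}^{N} t_{ik}\,\bar t_{jl} \otimes \sum_{a=1}^{N} t_{ka}\,\bar t_{la}
  = \sum_{k,l=1}^{N} t_{ik}\,\bar t_{jl} \otimes s_{kl}.
\end{align*}
```

The first tensor factor lies in $U_q(\mathfrak{gl}_N)$ and the second in the subalgebra generated by the $s_{kl}$, which is exactly the left coideal property.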
Remark 2.5. It was shown in [31] that for any nondegenerate diagonal matrix $D$ the matrix $S = T D\, \bar T^{\,t}$ satisfies the reflection equation (2.17). Therefore one can define a family of subalgebras in $U_q(\mathfrak{gl}_N)$ parametrized by the matrices $D$. However, all of them are isomorphic to each other as abstract algebras, which can be seen from the fact that for any diagonal matrix $C$, the relation (2.17) is preserved by the transformation $S \mapsto C S C$. Indeed, the entries of $S$ are then transformed as $s_{ij} \mapsto s_{ij}\, c_i c_j$ and the claim is immediate from (2.22). The corresponding remark also applies to the symplectic case considered below.

2.2. Symplectic case

Following again [31] introduce the $2n \times 2n$ matrix $G$ by
$$ G = q \sum_{k=1}^{n} E_{2k-1,2k} - \sum_{k=1}^{n} E_{2k,2k-1}. \tag{2.33} $$
It is convenient to introduce the involution of the set of indices $\{1, \dots, 2n\}$ which we denote by a prime and which acts by the rule: for $k = 1, \dots, n$
$$ (2k-1)' = 2k \qquad \text{and} \qquad (2k)' = 2k - 1. \tag{2.34} $$
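The index involution (2.34) and the matrix $G$ of (2.33) are simple enough to write down directly. The sketch below is illustrative only; `prime` and `g_matrix` are hypothetical names, indices are 1-based in the mathematics and shifted by one inside the nested list.

```python
from fractions import Fraction

def prime(i):
    """Involution (2.34) on 1-based indices: 2k-1 <-> 2k."""
    return i + 1 if i % 2 == 1 else i - 1

def g_matrix(n, q):
    """G = q * sum_k E_{2k-1,2k} - sum_k E_{2k,2k-1} as a 2n x 2n nested list."""
    G = [[Fraction(0)] * (2 * n) for _ in range(2 * n)]
    for k in range(1, n + 1):
        G[2 * k - 2][2 * k - 1] = Fraction(q)    # entry (2k-1, 2k)
        G[2 * k - 1][2 * k - 2] = Fraction(-1)   # entry (2k, 2k-1)
    return G
```

The only nonzero entries of $G$ sit in positions $(i, i')$, which is why the prime notation is convenient in the formulas that follow.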
To introduce the twisted quantized enveloping algebra $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ consider the matrix $S = T G\, \bar T^{\,t}$ with entries in $U_q(\mathfrak{gl}_{2n})$. For each odd $i = 1, 3, \dots, 2n - 1$ the element $s_{ii'} = q\, t_{ii}\, \bar t_{i'i'}$ is invertible in $U_q(\mathfrak{gl}_{2n})$. We define $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ as the subalgebra of $U_q(\mathfrak{gl}_{2n})$ generated by the matrix elements of the matrix $S$ and by the elements$^b$ $s_{ii'}^{-1}$, $i = 1, 3, \dots, 2n - 1$. Clearly, the matrix $S$ has a block-triangular form with $n$ diagonal $2 \times 2$-blocks so that
$$ s_{ij} = 0 \qquad \text{for } i < j \text{ with } j \neq i'. \tag{2.35} $$
$^b$ These additional generators, not considered in [31], will bring some simplification in the arguments below.

As in the orthogonal case, the matrix $S$ satisfies the relation
$$ R\, S_1 R^{\,t} S_2 = S_2 R^{\,t} S_1 R, \tag{2.36} $$
with the same definition of $R$ and $R^{\,t}$ (taking $N = 2n$), cf. (2.15)–(2.17). This follows from the fact that $G$ is a solution of the reflection equation [31]
$$ R\, G_1 R^{\,t} G_2 = G_2 R^{\,t} G_1 R. \tag{2.37} $$
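The reflection equation (2.37) for $G$ can be verified numerically for small $n$. The following sketch assumes the index convention $e_a \otimes e_b \mapsto aN + b$ (0-based), takes $G_1 = G \otimes 1$ and $G_2 = 1 \otimes G$, and expects `q` as a `Fraction`; all function names are hypothetical.

```python
from fractions import Fraction

def mul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    """Kronecker product A ⊗ B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def identity(N):
    return [[Fraction(int(r == c)) for c in range(N)] for r in range(N)]

def r_and_rt(N, q):
    """Return (R, R^t) of (2.1) and (2.18)."""
    lam = q - 1 / q
    R = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    Rt = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            R[i * N + j][i * N + j] = Rt[i * N + j][i * N + j] = (
                q if i == j else Fraction(1))
            if i < j:
                R[i * N + j][j * N + i] = lam    # E_ij ⊗ E_ji
                Rt[j * N + j][i * N + i] = lam   # E_ji ⊗ E_ji
    return R, Rt

def g_matrix(n, q):
    """G of (2.33): q at (2k-1, 2k) and -1 at (2k, 2k-1), 1-based."""
    G = [[Fraction(0)] * (2 * n) for _ in range(2 * n)]
    for k in range(1, n + 1):
        G[2 * k - 2][2 * k - 1] = q
        G[2 * k - 1][2 * k - 2] = Fraction(-1)
    return G

def reflection_holds(n, q):
    """Exact check of R G1 R^t G2 = G2 R^t G1 R for N = 2n."""
    N = 2 * n
    R, Rt = r_and_rt(N, q)
    G = g_matrix(n, q)
    G1, G2 = kron(G, identity(N)), kron(identity(N), G)
    lhs = mul(mul(mul(R, G1), Rt), G2)
    rhs = mul(mul(mul(G2, Rt), G1), R)
    return lhs == rhs
```

For $n = 1$ both sides reduce to the same $4 \times 4$ matrix, sending $e_1 \otimes e_1 \mapsto q\, e_2 \otimes e_2$ and $e_2 \otimes e_2 \mapsto q^3 e_1 \otimes e_1$, which is a convenient case to follow by hand.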
Lemma 2.6. For any odd $i = 1, 3, \dots, 2n - 1$ we have the identity
$$ s_{i'i'}\, s_{ii} - q^2\, s_{i'i}\, s_{ii'} = q^3. \tag{2.38} $$
Proof. The calculation is the same for each $i$ so we take $i = 1$. We have
$$ s_{11} = q\, t_{11} \bar t_{12}, \qquad s_{12} = q\, t_{11} \bar t_{22}, \qquad s_{22} = q\, t_{21} \bar t_{22}, \qquad s_{21} = q\, t_{21} \bar t_{12} - t_{22} \bar t_{11}. \tag{2.39} $$
Now the relation is implied by the defining relations (2.3) in $U_q(\mathfrak{gl}_N)$.

Next we prove an analog of Theorem 2.2 for the algebra $U_q^{\rm tw}(\mathfrak{sp}_{2n})$. Introduce the abstract algebra $\mathcal{S}$ with generators $s_{ij}$, $i, j = 1, 2, \dots, 2n$ and $s_{ii'}^{-1}$, $i = 1, 3, \dots, 2n - 1$ with the defining relations given by (2.35), (2.36), (2.38) and
$$ s_{ii'}\, s_{ii'}^{-1} = s_{ii'}^{-1}\, s_{ii'} = 1, \qquad i = 1, 3, \dots, 2n - 1. \tag{2.40} $$
We show first that the relations (2.38) allow us to eliminate the generators $s_{i'i}$ in the spanning set of monomials.

Lemma 2.7. The algebra $\mathcal{S}$ is spanned by the ordered monomials of the form
$$ \overrightarrow{\prod_{i=1,3,\dots,2n-1}}\; s_{i1}^{k_{i1}} s_{i2}^{k_{i2}} \cdots s_{ii'}^{k_{ii'}}\; s_{i'i'}^{k_{i'i'}}\; s_{i'1}^{k_{i'1}} \cdots s_{i',i'-2}^{k_{i',i'-2}}, \tag{2.41} $$
where the $k_{ii'}$ with $i = 1, 3, \dots, 2n - 1$ are arbitrary integers while the remaining powers $k_{ij}$ are nonnegative integers.

Proof. Denote by $\mathcal{S}'$ the algebra with generators $s_{ij}$ and $s_{ii'}^{-1}$ and the defining relations given by (2.35), (2.36) and (2.40). The defining relations imply that for any odd $i$ we have
$$ q^{\delta_{i'k} - \delta_{ik}}\, s_{ii'}\, s_{kl} = q^{\delta_{il} - \delta_{i'l}}\, s_{kl}\, s_{ii'}. \tag{2.42} $$
Therefore, by Lemma 2.1 the algebra $\mathcal{S}'$ is spanned by the monomials
$$ \overrightarrow{\prod_{i=1,3,\dots,2n-1}}\; s_{i1}^{k_{i1}} s_{i2}^{k_{i2}} \cdots s_{ii'}^{k_{ii'}}\; s_{i'1}^{k_{i'1}} \cdots s_{i',i'-1}^{k_{i',i'-1}}\; s_{i'i'}^{k_{i'i'}}, \tag{2.43} $$
where the integers $k_{ii'}$ are allowed to be negative. As we noted in the proof of Lemma 2.1, the generators $s_{ia}$ and $s_{ib}$ can be permuted modulo terms of lower weight; see (2.25). Therefore, rearranging the generators appropriately, we conclude that $\mathcal{S}'$ is also spanned by the monomials
$$ \overrightarrow{\prod_{i=1,3,\dots,2n-1}}\; s_{i1}^{k_{i1}} s_{i2}^{k_{i2}} \cdots s_{ii'}^{k_{ii'}}\; s_{i',i'-1}^{k_{i',i'-1}}\; s_{i'i'}^{k_{i'i'}}\; s_{i'1}^{k_{i'1}} \cdots s_{i',i'-2}^{k_{i',i'-2}}. \tag{2.44} $$
It is a straightforward calculation to derive from the defining relations that the elements $s_{i'i'}\, s_{ii} - q^2\, s_{i'i}\, s_{ii'}$ with $i = 1, 3, \dots, 2n - 1$ are central in the algebra $\mathcal{S}'$. Let $I$ be the ideal of $\mathcal{S}'$ generated by the central elements $s_{i'i'}\, s_{ii} - q^2\, s_{i'i}\, s_{ii'} - q^3$ for $i = 1, 3, \dots, 2n - 1$. We need to show that the quotient $\mathcal{S}'/I$ is spanned by the monomials (2.41). However, the defining relations between the generators $s_{ii}$, $s_{ii'}$, $s_{i'i}$ and $s_{i'i'}$ do not involve any other generators. In order to complete the proof it is therefore sufficient to consider the particular case $n = 1$. We shall show that modulo the ideal $I$ generated by the central element $s_{22} s_{11} - q^2 s_{21} s_{12} - q^3$, every element of the algebra $\mathcal{S}'$ can be written as a linear combination of monomials of the form $s_{11}^{k_{11}} s_{12}^{k_{12}} s_{22}^{k_{22}}$. Let $s_{11}^{k_{11}} s_{12}^{k_{12}} s_{21}^{k_{21}} s_{22}^{k_{22}}$ be an arbitrary monomial of the form (2.44). We assume $k_{21} \geq 1$ and use induction on $k_{21}$. Modulo the ideal $I$ we have
$$ s_{21} \equiv q^{-2}\, s_{22}\, s_{11}\, s_{12}^{-1} - q\, s_{12}^{-1}. \tag{2.45} $$
Therefore, omitting the ordered monomials with smaller powers $k_{21}$ we have the relation modulo the ideal $I$:
$$ s_{11}^{k_{11}} s_{12}^{k_{12}} s_{21}^{k_{21}} s_{22}^{k_{22}} \equiv q^{-2}\, s_{11}^{k_{11}} s_{12}^{k_{12}-1}\, s_{22}\, s_{11}\, s_{21}^{k_{21}-1} s_{22}^{k_{22}}. \tag{2.46} $$
Note that by (2.22) we have the relation
$$ s_{22}\, s_{11} = s_{11}\, s_{22} + (q - q^{-1})(s_{12}^2 + q\, s_{12} s_{21}). \tag{2.47} $$
This allows us to bring the right-hand side of (2.46) to the form
$$ (1 - q^{-2})\, s_{11}^{k_{11}} s_{12}^{k_{12}} s_{21}^{k_{21}} s_{22}^{k_{22}}, \tag{2.48} $$
which completes the proof.

The following is an analog of Theorem 2.2 for the symplectic case.

Theorem 2.8. The algebra $\mathcal{S}$ is isomorphic to $U_q^{\rm tw}(\mathfrak{sp}_{2n})$.

Proof. We use the same argument as for the proof of Theorem 2.2. For this proof only, denote the matrix $T G\, \bar T^{\,t}$ by $\tilde S$ and its matrix elements by $\tilde s_{ij}$. The map $S \mapsto \tilde S$ defines an algebra homomorphism $\varphi: \mathcal{S} \to U_q^{\rm tw}(\mathfrak{sp}_{2n})$. We show that the images of the monomials (2.41) under $\varphi$ are linearly independent. Note that by (2.42) the product of a monomial of the form (2.41) and $s_{ii'}^{k}$ is, up to a nonzero factor, equal to the same monomial with the index $k_{ii'}$ replaced with $k_{ii'} + k$. Therefore we may assume without loss of generality that all powers in (2.41) are nonnegative integers. Set $V = U_q^{\rm tw}(\mathfrak{sp}_{2n})$. The corresponding generators $\sigma_{ij} \in V_{\mathcal{A}}$ can now be given by
$$ \sigma_{ij} = \frac{\tilde s_{ij} - g_{ij}}{q - q^{-1}}, \tag{2.49} $$
where the $g_{ij}$ are the matrix elements of $G$. We have the relation modulo $(q - 1)$,
$$ \sigma_{ij} \equiv \tilde g_{j'j}\, \tau_{ij'} + \tilde g_{ii'}\, \bar\tau_{ji'}, \tag{2.50} $$
where $\tilde g_{ij}$ is the value of $g_{ij}$ at $q = 1$ so that $\tilde g_{ii'} = (-1)^{i-1}$. Therefore, the image of the element $\sigma_{ij}$ in $V_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ is $F_{ij} := \tilde g_{j'j}\, E_{ij'} - \tilde g_{ii'}\, E_{ji'}$. The elements $F_{ij}$ span a subalgebra of $\mathfrak{gl}_{2n}$ isomorphic to the symplectic Lie algebra $\mathfrak{sp}_{2n}$ and this allows us to complete the proof exactly as in the orthogonal case.

The following is an analog of the Poincaré–Birkhoff–Witt theorem for the algebra $\mathcal{S}$ which is immediate from Theorem 2.8.

Corollary 2.9. The monomials (2.41) constitute a basis of the algebra $\mathcal{S}$.

It follows from the proof of the theorem that the algebra $V_{\mathcal{A}} \otimes_{\mathcal{A}} \mathbb{C}$ coincides with the enveloping algebra $U(\mathfrak{sp}_{2n})$. Thus, $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ specializes to $U(\mathfrak{sp}_{2n})$ as $q \to 1$. The result also follows from [21, Sec. 6]. Regarding $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ as a subalgebra of $U_q(\mathfrak{gl}_{2n})$ we introduce another matrix $\bar S$ by
$$ \bar S = \bar T\, G\, T^{\,t}. \tag{2.51} $$
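As a side check on the specialization argument in the proof of Theorem 2.8, one can confirm that each $F_{ij} = \tilde g_{j'j} E_{ij'} - \tilde g_{ii'} E_{ji'}$ preserves the antisymmetric form given by $G$ at $q = 1$. The sketch below assumes that $\mathfrak{sp}_{2n}$ is realized as the matrices $X$ with $X^t G_0 + G_0 X = 0$, where $G_0$ is $G$ at $q = 1$; this realization and the helper names are assumptions made for illustration.

```python
def prime(i):
    """Involution (2.34) on 1-based indices: 2k-1 <-> 2k."""
    return i + 1 if i % 2 == 1 else i - 1

def g(a):
    """Value at q = 1 of the entry of G in position (a, a'): (-1)^(a-1)."""
    return (-1) ** (a - 1)

def G0(n):
    """G at q = 1 as a 2n x 2n integer matrix."""
    M = [[0] * (2 * n) for _ in range(2 * n)]
    for a in range(1, 2 * n + 1):
        M[a - 1][prime(a) - 1] = g(a)
    return M

def F(n, i, j):
    """F_ij = g_{j'j} E_{i j'} - g_{i i'} E_{j i'} (1-based indices)."""
    M = [[0] * (2 * n) for _ in range(2 * n)]
    M[i - 1][prime(j) - 1] += g(prime(j))
    M[j - 1][prime(i) - 1] -= g(i)
    return M

def preserves_form(X, J):
    """True if X^t J + J X = 0."""
    n = len(X)
    return all(
        sum(X[k][r] * J[k][c] + J[r][k] * X[k][c] for k in range(n)) == 0
        for r in range(n) for c in range(n)
    )
```

Running `preserves_form(F(2, i, j), G0(2))` over all $1 \leq i, j \leq 4$ checks the invariance condition for every generator in the case $n = 2$.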
Using the same argument as in the orthogonal case (see Sec. 2.1) we derive the following relations between the matrix elements of $S$ and $\bar S$: for any $i = 1, 3, \dots, 2n - 1$
$$ \begin{aligned} \bar s_{ii} &= -q^{-2}\, s_{ii}, & \bar s_{i'i'} &= -q^{-2}\, s_{i'i'}, \\ \bar s_{i'i} &= -q^{-1}\, s_{ii'}, & \bar s_{ii'} &= -q^{-1}\, s_{i'i} + (1 - q^{-2})\, s_{ii'}, \end{aligned} \tag{2.52} $$
while for the remaining generators we have
$$ \bar s_{ij} = -q^{-1}\, s_{ji}, \qquad i < j, \quad j \neq i'. \tag{2.53} $$
Thus, the elements $\bar s_{ij}$ belong to the subalgebra $U_q^{\rm tw}(\mathfrak{sp}_{2n})$. The next proposition is immediate from (2.11); cf. Proposition 2.4. It shows that the subalgebra $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ is a left coideal of $U_q(\mathfrak{gl}_{2n})$; see [21, Lemma 6.3].

Proposition 2.10. The images of the generators of $U_q^{\rm tw}(\mathfrak{sp}_{2n})$ under the coproduct are given by
$$ \Delta(s_{ij}) = \sum_{k,l=1}^{2n} t_{ik}\, \bar t_{jl} \otimes s_{kl} \tag{2.54} $$
and
$$ \Delta(s_{ii'}^{-1}) = t_{i'i'}\, \bar t_{ii} \otimes s_{ii'}^{-1}. \tag{2.55} $$

3. Coideal Subalgebras of $U_q(\widehat{\mathfrak{gl}}_N)$

Consider the Lie algebra of Laurent polynomials $\mathfrak{gl}_N[\lambda, \lambda^{-1}]$ in an indeterminate $\lambda$. We denote it by $\widehat{\mathfrak{gl}}_N$ for brevity. The quantum affine algebra $U_q(\widehat{\mathfrak{gl}}_N)$ is a deformation of the universal enveloping algebra $U(\widehat{\mathfrak{gl}}_N)$. We use its R-matrix presentation following Reshetikhin, Takhtajan and Faddeev [36]; cf. Sec. 2. Note also
a recent work of Frenkel and Mukhin [11], where different realizations of $U_q(\widehat{\mathfrak{gl}}_N)$ are collected. By definition, the algebra $U_q(\widehat{\mathfrak{gl}}_N)$ has countably many generators $t_{ij}^{(r)}$ and $\bar t_{ij}^{(r)}$ where $1 \leq i, j \leq N$ and $r$ runs over nonnegative integers. They are combined into the matrices
$$ T(u) = \sum_{i,j=1}^{N} t_{ij}(u) \otimes E_{ij}, \qquad \bar T(u) = \sum_{i,j=1}^{N} \bar t_{ij}(u) \otimes E_{ij}, \tag{3.1} $$
where $t_{ij}(u)$ and $\bar t_{ij}(u)$ are formal series in $u^{-1}$ and $u$, respectively:
$$ t_{ij}(u) = \sum_{r=0}^{\infty} t_{ij}^{(r)} u^{-r}, \qquad \bar t_{ij}(u) = \sum_{r=0}^{\infty} \bar t_{ij}^{(r)} u^{r}. \tag{3.2} $$
The defining relations are
$$ \begin{aligned} t_{ij}^{(0)} = \bar t_{ji}^{(0)} &= 0, \qquad 1 \leq i < j \leq N, \\ t_{ii}^{(0)}\, \bar t_{ii}^{(0)} = \bar t_{ii}^{(0)}\, t_{ii}^{(0)} &= 1, \qquad 1 \leq i \leq N, \\ R(u, v)\, T_1(u) T_2(v) &= T_2(v) T_1(u) R(u, v), \\ R(u, v)\, \bar T_1(u) \bar T_2(v) &= \bar T_2(v) \bar T_1(u) R(u, v), \\ R(u, v)\, \bar T_1(u) T_2(v) &= T_2(v) \bar T_1(u) R(u, v), \end{aligned} \tag{3.3} $$
where we have used the notation of (2.3) and $R(u, v)$ is the trigonometric R-matrix given by
$$ \begin{aligned} R(u, v) ={}& (u - v) \sum_{i \neq j} E_{ii} \otimes E_{jj} + (q^{-1} u - q\, v) \sum_{i} E_{ii} \otimes E_{ii} \\ &+ (q^{-1} - q)\, u \sum_{i>j} E_{ij} \otimes E_{ji} + (q^{-1} - q)\, v \sum_{i<j} E_{ij} \otimes E_{ji}. \end{aligned} \tag{3.4} $$
It satisfies the Yang–Baxter equation
$$ R_{12}(u, v) R_{13}(u, w) R_{23}(v, w) = R_{23}(v, w) R_{13}(u, w) R_{12}(u, v), \tag{3.5} $$
where both sides take values in ${\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N \otimes {\rm End}\,\mathbb{C}^N$ and the subindices indicate the copies of ${\rm End}\,\mathbb{C}^N$, e.g. $R_{12}(u, v) = R(u, v) \otimes 1$ etc. Note that $R(u, v)$ is related to the constant R-matrices (2.1) and (2.7) by the formula
$$ R(u, v) = u\, \tilde R - v\, R. \tag{3.6} $$
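Both (3.6) and the Yang–Baxter equation (3.5) can be tested exactly for small $N$. The sketch below enters $R(u, v)$ term by term from (3.4) and compares it with $u \tilde R - v R$, assuming for $\tilde R$ the explicit form $q^{-1}\sum_i E_{ii}\otimes E_{ii} + \sum_{i\neq j} E_{ii}\otimes E_{jj} - (q-q^{-1})\sum_{i>j} E_{ij}\otimes E_{ji}$; the function names are hypothetical and all parameters should be `Fraction` values.

```python
from fractions import Fraction

def two_site(N, diag, offdiag, upper, lower):
    """diag*sum E_ii⊗E_ii + offdiag*sum_{i!=j} E_ii⊗E_jj
    + upper*sum_{i<j} E_ij⊗E_ji + lower*sum_{i>j} E_ij⊗E_ji."""
    M = [[Fraction(0)] * (N * N) for _ in range(N * N)]
    for i in range(N):
        for j in range(N):
            M[i * N + j][i * N + j] = Fraction(diag if i == j else offdiag)
            if i != j:
                M[i * N + j][j * N + i] = Fraction(upper if i < j else lower)
    return M

def r_uv(N, q, u, v):
    """Trigonometric R-matrix, entered term by term from (3.4)."""
    return two_site(N, u / q - q * v, u - v, (1 / q - q) * v, (1 / q - q) * u)

def r_uv_from_constants(N, q, u, v):
    """u*R~ - v*R as in (3.6), with R of (2.1) and the assumed form of R~."""
    lam = q - 1 / q
    R = two_site(N, q, 1, lam, 0)
    Rtilde = two_site(N, 1 / q, 1, 0, -lam)
    size = N * N
    return [[u * Rtilde[r][c] - v * R[r][c] for c in range(size)] for r in range(size)]

def embed(R, N, s, t):
    """Let the two-site operator R act on tensor factors s, t of (C^N)^{⊗3}."""
    (rest,) = {0, 1, 2} - {s, t}
    M = N ** 3
    digits = lambda x: (x // (N * N), x // N % N, x % N)
    out = [[Fraction(0)] * M for _ in range(M)]
    for row in range(M):
        a = digits(row)
        for col in range(M):
            c = digits(col)
            if a[rest] == c[rest]:
                out[row][col] = R[a[s] * N + a[t]][c[s] * N + c[t]]
    return out

def mul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def trig_ybe_holds(N, q, u, v, w):
    """Exact check of (3.5) at the given parameter values."""
    R12 = embed(r_uv(N, q, u, v), N, 0, 1)
    R13 = embed(r_uv(N, q, u, w), N, 0, 2)
    R23 = embed(r_uv(N, q, v, w), N, 1, 2)
    return mul(mul(R12, R13), R23) == mul(mul(R23, R13), R12)
```

A numerical check at a few parameter values is of course not a proof, but it is a cheap guard against sign or index slips when the formulas are transcribed.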
There is a Hopf algebra structure on $U_q(\widehat{\mathfrak{gl}}_N)$ with the coproduct defined by
$$ \Delta(t_{ij}(u)) = \sum_{k=1}^{N} t_{ik}(u) \otimes t_{kj}(u), \qquad \Delta(\bar t_{ij}(u)) = \sum_{k=1}^{N} \bar t_{ik}(u) \otimes \bar t_{kj}(u). \tag{3.7} $$
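The algebra $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ specializes to $\mathrm{U}(\widehat{\mathfrak{gl}}_N)$ as $q\to 1$. More precisely, as with the case of $\mathrm{U}_q(\mathfrak{gl}_N)$ (see Sec. 2), regard $q$ as a formal variable and introduce the $\mathcal{A}$-subalgebra $\mathrm{U}_{\mathcal{A}}$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ generated by the elements $\tau_{ij}^{(r)}$ and $\bar\tau_{ij}^{(r)}$ defined by
\[
\tau_{ij}^{(r)}=\frac{t_{ij}^{(r)}}{q-q^{-1}},\qquad
\bar\tau_{ij}^{(r)}=\frac{\bar t_{ij}^{(r)}}{q-q^{-1}}
\tag{3.8}
\]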
for $r\ge 0$ and all $i,j$, except for the case $r=0$ and $i=j$ where we set
\[
\tau_{ii}^{(0)}=\frac{t_{ii}^{(0)}-1}{q-1},\qquad
\bar\tau_{ii}^{(0)}=\frac{\bar t_{ii}^{(0)}-1}{q-1}.
\tag{3.9}
\]
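Then we have an isomorphism
\[
\mathrm{U}_{\mathcal{A}}\otimes_{\mathcal{A}}\mathbb{C}\cong\mathrm{U}(\widehat{\mathfrak{gl}}_N);
\tag{3.10}
\]
see [2, Sec. 12.2] and [11, Sec. 2]. The images of the generators of $\mathrm{U}_{\mathcal{A}}$ in (3.10) are given by
\[
\tau_{ij}^{(r)}\to E_{ij}\lambda^{r},\qquad
\bar\tau_{ij}^{(r)}\to -E_{ij}\lambda^{-r}
\tag{3.11}
\]
for all $r\ge 0$ with the exception $\tau_{ij}^{(0)}=\bar\tau_{ji}^{(0)}=0$ if $i<j$. Given a subalgebra $V$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ we set $V_{\mathcal{A}}=V\cap\mathrm{U}_{\mathcal{A}}$. We shall say that $V$ specializes to a subalgebra $V^{\circ}$ of $\mathrm{U}(\widehat{\mathfrak{gl}}_N)$ if $V_{\mathcal{A}}\otimes_{\mathcal{A}}\mathbb{C}\cong V^{\circ}$.
The quantized enveloping algebra $\mathrm{U}_q(\mathfrak{gl}_N)$ is a natural (Hopf) subalgebra of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ defined by the embedding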
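\[
t_{ij}\mapsto t_{ij}^{(0)},\qquad \bar t_{ij}\mapsto \bar t_{ij}^{(0)}.
\tag{3.12}
\]
Moreover, there is an algebra homomorphism $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)\to\mathrm{U}_q(\mathfrak{gl}_N)$ called the evaluation homomorphism defined by
\[
T(u)\mapsto T-\bar T\,u^{-1},\qquad \bar T(u)\mapsto \bar T-T\,u.
\tag{3.13}
\]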
The A type quantum affine algebras are exceptional in the sense that only in this case does such an evaluation homomorphism exist; see Chari–Pressley [2, Chapter 12]. The subalgebra of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ generated by the elements $t_{ij}^{(r)}$ was studied, e.g. in [3, 30] and [36]. We call it the q-Yangian. In what follows we construct quantum affine algebras associated with the orthogonal and symplectic Lie algebras for which analogs of the evaluation homomorphism (3.13) do exist; cf. the B and C type twisted Yangians [35, 28]. These algebras can be viewed as twisted analogs of the q-Yangian as well as q-analogs of the twisted Yangians. Note, however, that contrary to the case of the twisted Yangians, our algebras are not subalgebras of the q-Yangian; they are generated by certain combinations of both types of elements $t_{ij}^{(r)}$ and $\bar t_{ij}^{(r)}$.
3.1. Orthogonal twisted q-Yangians
Definition 3.1. The twisted q-Yangian $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ is the subalgebra of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ generated by the coefficients $s_{ij}^{(r)}$ of the matrix elements of the matrix $S(u)=T(u)\,\bar T(u^{-1})^{t}$. More precisely, we have
\[
s_{ij}(u)=\sum_{a=1}^{N}t_{ia}(u)\,\bar t_{ja}(u^{-1})
\tag{3.14}
\]
so that
\[
s_{ij}(u)=\sum_{r=0}^{\infty}s_{ij}^{(r)}\,u^{-r}.
\tag{3.15}
\]
The subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ is generated by the elements $s_{ij}^{(r)}$ with $1\le i,j\le N$ and $r$ running over the set of nonnegative integers. Next we give a presentation of the algebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ in terms of generators and defining relations by analogy with the finite-dimensional case; see Sec. 2.1. Consider the element $R^{t}(u,v):=R^{t_1}(u,v)$ obtained from $R(u,v)$ by the transposition in the first factor:
\[
\begin{aligned}
R^{t}(u,v)={}&(u-v)\sum_{i\ne j}E_{ii}\otimes E_{jj}+(q^{-1}u-q\,v)\sum_{i}E_{ii}\otimes E_{ii}\\
&+(q^{-1}-q)\,u\sum_{i>j}E_{ji}\otimes E_{ji}+(q^{-1}-q)\,v\sum_{i<j}E_{ji}\otimes E_{ji}.
\end{aligned}
\tag{3.16}
\]
The following relations are implied by (3.3):
\[
s_{ij}^{(0)}=0,\qquad 1\le i<j\le N,
\tag{3.17}
\]
\[
s_{ii}^{(0)}=1,\qquad 1\le i\le N,
\tag{3.18}
\]
\[
R(u,v)\,S_1(u)\,R^{t}(u^{-1},v)\,S_2(v)=S_2(v)\,R^{t}(u^{-1},v)\,S_1(u)\,R(u,v).
\tag{3.19}
\]
We shall prove that these are precisely the defining relations for the algebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$.
Lemma 3.2. Consider the (abstract) associative algebra with generators $s_{ij}^{(r)}$ where $i,j=1,\dots,N$ and $r=0,1,\dots$. The defining relations are written in terms of the matrix $S(u)=(s_{ij}(u))$ by the relation (3.19) with $s_{ij}(u)$ defined by (3.15). Introduce the ordering on the generators in such a way that $s_{ij}^{(r)}\prec s_{kl}^{(p)}$ if and only if $(i,j,r)\prec(k,l,p)$ in the lexicographical order. Then the ordered monomials in the generators span the algebra.
Proof. Write the defining relations in terms of the generating series $s_{ij}(u)$:
\[
(q^{-\delta_{ij}}u-q^{\delta_{ij}}v)\,\alpha_{ijab}(u,v)+(q^{-1}-q)(u\,\delta_{j\cdots}\ \cdots
\tag{3.20}
\]
where we have used the notation
\[
\alpha_{ijab}(u,v)=(q^{-\delta_{aj}}-q^{\delta_{aj}}uv)\,s_{ia}(u)\,s_{jb}(v)+(q^{-1}-q)(\delta_{j\cdots}\ \cdots
\tag{3.21}
\]
We shall show that any monomial
\[
s_{i_1a_1}^{(k_1)}\,s_{i_2a_2}^{(k_2)}\cdots s_{i_pa_p}^{(k_p)}
\tag{3.22}
\]
can be written as a linear combination of the ordered monomials. Define the degree of the monomial (3.22) as the sum $k_1+\cdots+k_p$ and argue by induction on the degree. The induction base is Lemma 2.1 which takes care of the monomials of degree zero. In other words, we can introduce the filtration on the algebra by setting $\deg s_{ia}^{(k)}=k$ and it will be sufficient to prove the lemma for the corresponding graded algebra. We keep the same notation for its generators while the defining relations are given by (3.20) where instead of (3.21) we should take
\[
\alpha_{ijab}(u,v)=-q^{\delta_{aj}}\,s_{ia}(u)\,s_{jb}(v)+(q^{-1}-q)\,\delta_{a<j}\,s_{ij}(u)\,s_{ab}(v).
\tag{3.23}
\]
Furthermore, we apply the same argument to this new algebra considering the filtration defined by setting the degree of $s_{ia}^{(k)}$ to be equal to $i$. The corresponding graded algebra is generated by elements $s_{ia}^{(k)}$ with the defining relations (3.20) where the expression $\alpha_{ijab}(u,v)$ further simplifies to
\[
\alpha_{ijab}(u,v)=-q^{\delta_{aj}}\,s_{ia}(u)\,s_{jb}(v).
\tag{3.24}
\]
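Working with this algebra we have for $i>j$
\[
q^{\delta_{aj}}\,s_{ia}(u)\,s_{jb}(v)=\frac{q^{-\delta_{ab}}u-q^{\delta_{ab}}v}{u-v}\,q^{\delta_{bi}}\,s_{jb}(v)\,s_{ia}(u)
+\frac{q-q^{-1}}{u-v}\,q^{\delta_{ai}}\,(u\,s_{ja}(u)\,s_{ib}(v)-(u\,\delta_{a\cdots}\ \cdots
\tag{3.25}
\]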
This implies that the monomials (3.22) with the condition $i_1\le\cdots\le i_p$ span the algebra. Now we take $i=j$ and $a<b$ in (3.20) with the assumption (3.24) to get
\[
q^{\delta_{bi}}\,s_{ib}(v)\,s_{ia}(u)=\frac{q^{-1}u-q\,v}{q^{-\delta_{ab}}u-q^{\delta_{ab}}v}\,q^{\delta_{ai}}\,s_{ia}(u)\,s_{ib}(v)
+\frac{q-q^{-1}}{q^{-\delta_{ab}}u-q^{\delta_{ab}}v}\,q^{\delta_{ai}}\,u\,s_{ia}(v)\,s_{ib}(u).
\tag{3.26}
\]
Therefore, the algebra is spanned by the monomials (3.22) such that $(i_1,a_1)\preceq\cdots\preceq(i_p,a_p)$ in the lexicographical order. Finally, using again (3.20) we note that
\[
s_{ia}(u)\,s_{ia}(v)=s_{ia}(v)\,s_{ia}(u)
\tag{3.27}
\]
which implies $[s_{ia}^{(k)},s_{ia}^{(r)}]=0$ for all $k,r$. This completes the proof.
Theorem 3.3. The abstract algebra $\mathcal{S}$ generated by elements $s_{ij}^{(r)}$ with $i,j=1,\dots,N$ and $r\ge 0$ with the defining relations (3.17)–(3.19) is isomorphic to $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$.
Proof. We use the argument of the proof of Theorem 2.2 appropriately modified for the affine case. Namely, we check first that the matrix elements of the matrix $T(u)\,\bar T(u^{-1})^{t}$ satisfy the defining relations of $\mathcal{S}$ so that we have an algebra homomorphism $\mathcal{S}\to\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$. To prove its injectivity, regard $q$ as a formal variable and set $V=\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$. Note that the algebra $V_{\mathcal{A}}$ is generated by the elements
\[
\sigma_{ij}^{(r)}=\frac{s_{ij}^{(r)}}{q-q^{-1}},
\tag{3.28}
\]
where the condition $i>j$ must hold if $r=0$. The image of $\sigma_{ij}^{(r)}$ in $V_{\mathcal{A}}\otimes_{\mathcal{A}}\mathbb{C}$ is $F_{ij}^{(r)}=E_{ij}\lambda^{r}-E_{ji}\lambda^{-r}$. However, the elements $F_{ij}^{(r)}$ span the fixed point subalgebra $\widehat{\mathfrak{gl}}{}_N^{\theta}$ of $\widehat{\mathfrak{gl}}_N$ corresponding to the automorphism
\[
\theta:\ E_{ij}\lambda^{p}\mapsto -E_{ji}\lambda^{-p}.
\tag{3.29}
\]
Thus, Lemma 3.2 and this observation complete the proof.
Consider the ordering on the generators of $\mathcal{S}$ defined in Lemma 3.2. We shall assume that for the generators $s_{ij}^{(0)}$, the condition $i>j$ holds.
Corollary 3.4. The ordered monomials in the generators constitute a basis of the algebra $\mathcal{S}$.
Corollary 3.5. The subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ specializes to the universal enveloping algebra $\mathrm{U}(\widehat{\mathfrak{gl}}{}_N^{\theta})$ as $q\to 1$.
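As an illustration of the fixed point subalgebra appearing in Corollary 3.5, the sketch below checks on basis elements that the map $\theta$ of (3.29) respects the commutator of the Laurent polynomial Lie algebra and fixes the elements $F_{ij}^{(r)}$. The dictionary encoding of matrices with Laurent polynomial entries, the 0-based indices and the function names are assumptions made for this illustration only.

```python
from itertools import product


def bracket(X, Y):
    """Commutator of two elements given as dicts {(i, j, p): coeff},
    each key standing for the basis element E_ij * lambda^p."""
    Z = {}
    for (i, j, p), x in X.items():
        for (k, l, s), y in Y.items():
            if j == k:  # E_ij E_kl = delta_jk E_il
                Z[i, l, p + s] = Z.get((i, l, p + s), 0) + x * y
            if l == i:  # E_kl E_ij = delta_li E_kj
                Z[k, j, p + s] = Z.get((k, j, p + s), 0) - x * y
    return {key: c for key, c in Z.items() if c != 0}


def theta(X):
    """The map (3.29): E_ij lambda^p -> -E_ji lambda^(-p)."""
    return {(j, i, -p): -c for (i, j, p), c in X.items()}


def F(i, j, r):
    """F_ij^(r) = E_ij lambda^r - E_ji lambda^(-r), with zero terms dropped."""
    X = {}
    for key, c in (((i, j, r), 1), ((j, i, -r), -1)):
        X[key] = X.get(key, 0) + c
    return {key: c for key, c in X.items() if c != 0}


def theta_is_automorphism(N=2, degrees=range(-1, 2)):
    """Check theta([X, Y]) = [theta(X), theta(Y)] on basis elements."""
    basis = [{(i, j, p): 1} for i, j, p in product(range(N), range(N), degrees)]
    return all(theta(bracket(X, Y)) == bracket(theta(X), theta(Y))
               for X in basis for Y in basis)
```

Since $\theta$ is linear, checking the bracket on basis elements of bounded degree is a finite spot check of the automorphism property, not a proof.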
The following proposition is immediate from (3.7).
Proposition 3.6. The subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ is a left coideal so that
\[
\Delta(s_{ij}(u))=\sum_{k,l=1}^{N}t_{ik}(u)\,\bar t_{jl}(u^{-1})\otimes s_{kl}(u).
\tag{3.30}
\]
The next proposition allows us to regard $\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)$ as a subalgebra of $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$.
Proposition 3.7. The map $s_{ij}\mapsto s_{ij}^{(0)}$ defines an embedding $\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)\hookrightarrow\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$.
Proof. It is clear from (3.17)–(3.19) that the elements $s_{ij}^{(0)}$ satisfy the defining relations for the $s_{ij}$; see (2.15)–(2.17). Corollary 3.4 ensures that the homomorphism is injective.
The following theorem establishes the existence of the evaluation homomorphisms for the twisted q-Yangians. We use notation (2.29). As before, we combine
the formal series $s_{ij}(u)$ into the matrix $S(u)$ so that $S(u)$ is a formal power series in $u^{-1}$ with matrix coefficients.
Theorem 3.8. The mapping
\[
S(u)\mapsto S+q^{-1}u^{-1}\bar S
\tag{3.31}
\]
defines an algebra homomorphism $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)\to\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)$.
Proof. Regarding $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ as an abstract algebra we need to verify that the relation (3.19) holds when we substitute the right-hand side of (3.31) for $S(u)$. That is, we need to verify
\[
\begin{aligned}
&(u\tilde R-vR)(q\,u\,S_1+\bar S_1)(\tilde R^{t}-uvR^{t})(q\,v\,S_2+\bar S_2)\\
&\qquad=(q\,v\,S_2+\bar S_2)(\tilde R^{t}-uvR^{t})(q\,u\,S_1+\bar S_1)(u\tilde R-vR).
\end{aligned}
\tag{3.32}
\]
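Applying (2.3) and (2.10) we derive the relations
\[
\begin{aligned}
R\,S_1R^{t}S_2&=S_2R^{t}S_1R, &\qquad \tilde R\,S_1R^{t}S_2&=S_2R^{t}S_1\tilde R,\\
R\,\bar S_1R^{t}S_2&=S_2R^{t}\bar S_1R, &\qquad \tilde R\,\bar S_1\tilde R^{t}\bar S_2&=\bar S_2\tilde R^{t}\bar S_1\tilde R,\\
R\,\bar S_1\tilde R^{t}S_2&=S_2\tilde R^{t}\bar S_1R, &\qquad \tilde R\,S_1\tilde R^{t}\bar S_2&=\bar S_2\tilde R^{t}S_1\tilde R,\\
R\,\bar S_1\tilde R^{t}\bar S_2&=\bar S_2\tilde R^{t}\bar S_1R, &\qquad \tilde R\,S_1R^{t}\bar S_2&=\bar S_2R^{t}S_1\tilde R.
\end{aligned}
\tag{3.33}
\]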
Expanding the products on the left and right-hand sides of (3.32) we see that in order to complete the proof, it suffices to verify the following four relations
\[
\begin{aligned}
R\,S_1R^{t}\bar S_2-\tilde R\,\bar S_1R^{t}S_2&=\bar S_2R^{t}S_1R-S_2R^{t}\bar S_1\tilde R,\\
R\,S_1\tilde R^{t}\bar S_2-\tilde R\,\bar S_1\tilde R^{t}S_2&=\bar S_2\tilde R^{t}S_1R-S_2\tilde R^{t}\bar S_1\tilde R,\\
\tilde R\,S_1\tilde R^{t}S_2-q^{-2}\tilde R\,\bar S_1R^{t}\bar S_2&=S_2\tilde R^{t}S_1\tilde R-q^{-2}\bar S_2R^{t}\bar S_1\tilde R,\\
R\,S_1\tilde R^{t}S_2-q^{-2}R\,\bar S_1R^{t}\bar S_2&=S_2\tilde R^{t}S_1R-q^{-2}\bar S_2R^{t}\bar S_1R.
\end{aligned}
\tag{3.34}
\]
Let us prove the first of them. Using (2.8), replace $R$ and $\tilde R$, respectively, with $\tilde R+(q-q^{-1})P$ and $R-(q-q^{-1})P$. Due to (3.33), we now have to check that
\[
P\,S_1R^{t}\bar S_2+P\,\bar S_1R^{t}S_2=\bar S_2R^{t}S_1P+S_2R^{t}\bar S_1P.
\tag{3.35}
\]
However, this is obvious if we observe that $P\,S_1=S_2P$, $P\,\bar S_1=\bar S_2P$ and $P\,R^{t}=R^{t}P$. The proof of the second relation in (3.34) is the same. The third and fourth relations are also verified in a way similar to each other so we only consider the third one. Note that by (2.8),
\[
R^{t}=\tilde R^{t}+(q-q^{-1})\,Q,\qquad Q=P^{t}=\sum_{i,j=1}^{N}E_{ij}\otimes E_{ij}.
\tag{3.36}
\]
Replacing $R^{t}$ and $\tilde R^{t}$, respectively, with $\tilde R^{t}+(q-q^{-1})\,Q$ and $R^{t}-(q-q^{-1})\,Q$ in the third relation and using again (3.33) we conclude that it is now sufficient to verify that
\[
\tilde R\,S_1QS_2+q^{-2}\tilde R\,\bar S_1Q\bar S_2=S_2QS_1\tilde R+q^{-2}\bar S_2Q\bar S_1\tilde R.
\tag{3.37}
\]
Since
\[
PQ=QP=Q\quad\text{and}\quad P\,A_1=A_2P
\tag{3.38}
\]
for any matrix $A$, we have $P\,\bar S_1Q\bar S_2=\bar S_2Q\bar S_1P$. Therefore, by (2.8) we may replace (3.37) with
\[
\tilde R\,S_1QS_2+q^{-2}R\,\bar S_1Q\bar S_2=S_2QS_1\tilde R+q^{-2}\bar S_2Q\bar S_1R.
\tag{3.39}
\]
Finally, we have the chain of equalities
\[
\tilde R\,S_1Q=\tilde R\,T_1\bar T_1^{t}Q=\tilde R\,T_1\bar T_2Q=\bar T_2T_1\tilde R\,Q
=q^{-1}\bar T_2T_1Q=q^{-1}\bar T_2T_2^{t}Q=q^{-1}\bar S_2Q,
\tag{3.40}
\]
where we have used (2.10) and the relations $\tilde R\,Q=Q\tilde R=q^{-1}Q$ together with the observation that $A_1^{t}Q=A_2Q$ implied by (3.38). Thus, $\tilde R\,S_1QS_2=q^{-1}\bar S_2QS_2$. A similar argument shows that
\[
R\,\bar S_1Q\bar S_2=q\,S_2Q\bar S_2,\qquad
S_2QS_1\tilde R=q^{-1}S_2Q\bar S_2,\qquad
\bar S_2Q\bar S_1R=q\,\bar S_2QS_2
\tag{3.41}
\]
proving (3.39) and the theorem.
3.2. Symplectic twisted q-Yangians
As in Sec. 2.2, we use the matrix $G=(g_{ij})$ given by (2.33).
Definition 3.9. The twisted q-Yangian $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ is defined as the subalgebra of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_{2n})$ generated by the coefficients $s_{ij}^{(r)}$ of the matrix elements of the matrix $S(u)=T(u)\,G\,\bar T(u^{-1})^{t}$ and the elements $(s_{ii'}^{(0)})^{-1}$ with $i=1,3,\dots,2n-1$. More precisely, the matrix elements are given by
\[
s_{ij}(u)=q\sum_{a=1}^{n}t_{i,2a-1}(u)\,\bar t_{j,2a}(u^{-1})-\sum_{a=1}^{n}t_{i,2a}(u)\,\bar t_{j,2a-1}(u^{-1})
\tag{3.42}
\]
so that
\[
s_{ij}(u)=\sum_{r=0}^{\infty}s_{ij}^{(r)}\,u^{-r}.
\tag{3.43}
\]
Now we give a presentation of the algebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ in terms of generators and defining relations. We shall use the notation (3.16). Denote by $\mathcal{S}$ the associative algebra with (abstract) generators $s_{ij}^{(r)}$ with $1\le i,j\le 2n$ and $r\ge 0$, $(s_{ii'}^{(0)})^{-1}$ with $i=1,3,\dots,2n-1$ and the following defining relations written with the use of the generating series (3.43) and the matrix $S(u)=(s_{ij}(u))$:
\[
s_{ij}^{(0)}=0\quad\text{for } i<j \text{ with } j\ne i';
\tag{3.44}
\]
also, for any odd $i=1,3,\dots,2n-1$
\[
\begin{aligned}
s_{i'i'}^{(0)}\,s_{ii}^{(0)}-q^{2}\,s_{i'i}^{(0)}\,s_{ii'}^{(0)}&=q^{3},\\
s_{ii'}^{(0)}\,(s_{ii'}^{(0)})^{-1}=(s_{ii'}^{(0)})^{-1}\,s_{ii'}^{(0)}&=1;
\end{aligned}
\tag{3.45}
\]
and
\[
R(u,v)\,S_1(u)\,R^{t}(u^{-1},v)\,S_2(v)=S_2(v)\,R^{t}(u^{-1},v)\,S_1(u)\,R(u,v).
\tag{3.46}
\]
Theorem 3.10. The algebra $\mathcal{S}$ is isomorphic to the twisted q-Yangian $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$.
Proof. Let us verify first that the matrix $G$ satisfies
\[
R(u,v)\,G_1\,R^{t}(u^{-1},v)\,G_2=G_2\,R^{t}(u^{-1},v)\,G_1\,R(u,v).
\tag{3.47}
\]
We have to check the four relations
\[
\begin{aligned}
R\,G_1R^{t}G_2&=G_2R^{t}G_1R, &\qquad \tilde R\,G_1R^{t}G_2&=G_2R^{t}G_1\tilde R,\\
R\,G_1\tilde R^{t}G_2&=G_2\tilde R^{t}G_1R, &\qquad \tilde R\,G_1\tilde R^{t}G_2&=G_2\tilde R^{t}G_1\tilde R.
\end{aligned}
\tag{3.48}
\]
Applying (2.37) together with (2.8) and (3.36) we see that it is sufficient to check that $R\,G_1QG_2=G_2QG_1R$, which can be done by a direct calculation. Further, repeating the arguments of the proof of Theorem 3.3 we construct the algebra homomorphism $\mathcal{S}\to\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$. Next we verify directly that the elements $s_{i'i'}^{(0)}\,s_{ii}^{(0)}-q^{2}\,s_{i'i}^{(0)}\,s_{ii'}^{(0)}$ with odd $i$ are central in the algebra $\mathcal{S}$. Now Lemmas 2.7 and 3.2 imply that the algebra $\mathcal{S}$ is spanned by the ordered monomials in the generators $s_{ij}^{(r)}$. Here the triples $(i,j,r)$ are ordered lexicographically and the generators $s_{i'i}^{(0)}$ with odd $i$ are eliminated. Finally, the linear independence of the ordered monomials is proved in the same way as in Theorem 3.3. Namely, setting $V=\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ we consider the corresponding elements of the algebra $V_{\mathcal{A}}$ given by
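\[
\sigma_{ij}^{(0)}=\frac{s_{ij}^{(0)}-g_{ij}}{q-q^{-1}},\qquad\text{and}\qquad
\sigma_{ij}^{(r)}=\frac{s_{ij}^{(r)}}{q-q^{-1}}\quad\text{for } r\ge 1,
\tag{3.49}
\]
cf. (2.49). Then we have modulo $(q-1)$
\[
\sigma_{ij}^{(r)}\equiv\tilde g_{j'j}\,\tau_{ij'}^{(r)}+\tilde g_{ii'}\,\bar\tau_{ji'}^{(r)},
\tag{3.50}
\]
where $\tilde g_{ii'}=(-1)^{i-1}$; see (2.50). Hence, the image of $\sigma_{ij}^{(r)}$ in $V_{\mathcal{A}}\otimes_{\mathcal{A}}\mathbb{C}$ is
\[
F_{ij}^{(r)}:=\tilde g_{j'j}\,E_{ij'}\lambda^{r}-\tilde g_{ii'}\,E_{ji'}\lambda^{-r}.
\tag{3.51}
\]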
The elements $F_{ij}^{(r)}$ span a fixed point subalgebra $\widehat{\mathfrak{gl}}{}_{2n}^{\theta}$ of $\widehat{\mathfrak{gl}}_{2n}$ with respect to the automorphism
\[
\theta:\ E_{ij}\lambda^{r}\mapsto(-1)^{i+j-1}E_{j'i'}\lambda^{-r}.
\tag{3.52}
\]
The application of the Poincaré–Birkhoff–Witt theorem to the Lie algebra $\widehat{\mathfrak{gl}}{}_{2n}^{\theta}$ completes the proof.
Consider the ordering on the generators of $\mathcal{S}$ defined in the proof of Theorem 3.10.
Corollary 3.11. The ordered monomials in the generators constitute a basis of the algebra $\mathcal{S}$.
Corollary 3.12. The subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_{2n})$ specializes to the universal enveloping algebra $\mathrm{U}(\widehat{\mathfrak{gl}}{}_{2n}^{\theta})$ as $q\to 1$.
The following proposition is immediate from (3.7).
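Proposition 3.13. The subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_{2n})$ is a left coideal so that
\[
\Delta(s_{ij}(u))=\sum_{k,l=1}^{2n}t_{ik}(u)\,\bar t_{jl}(u^{-1})\otimes s_{kl}(u).
\tag{3.53}
\]
The next proposition allows us to regard $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$ as a subalgebra of $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$.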
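Proposition 3.14. The map
\[
s_{ij}\mapsto s_{ij}^{(0)},\qquad s_{ii'}^{-1}\mapsto(s_{ii'}^{(0)})^{-1}
\tag{3.54}
\]
defines an embedding $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})\hookrightarrow\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$.
Proof. By the defining relations in $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$ and $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ the mapping is clearly an algebra homomorphism. Corollary 3.11 ensures that the homomorphism is injective.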
The following is an analog of Theorem 3.8 for the symplectic case. We use notation (2.51).
Theorem 3.15. The mapping
\[
S(u)\mapsto S+q\,u^{-1}\bar S,\qquad (s_{ii'}^{(0)})^{-1}\mapsto s_{ii'}^{-1}
\tag{3.55}
\]
defines an algebra homomorphism $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})\to\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$.
Proof. By analogy with the orthogonal case (see Theorem 3.8), we need to verify
\[
\begin{aligned}
&(u\tilde R-vR)(u\,S_1+q\,\bar S_1)(\tilde R^{t}-uvR^{t})(v\,S_2+q\,\bar S_2)\\
&\qquad=(v\,S_2+q\,\bar S_2)(\tilde R^{t}-uvR^{t})(u\,S_1+q\,\bar S_1)(u\tilde R-vR).
\end{aligned}
\tag{3.56}
\]
Note that (3.33) and the first two relations in (3.34) still hold in the same form for the symplectic case. To prove the theorem we shall verify that the third and fourth relations in (3.34) are respectively replaced by
\[
\begin{aligned}
\tilde R\,S_1\tilde R^{t}S_2-q^{2}\tilde R\,\bar S_1R^{t}\bar S_2&=S_2\tilde R^{t}S_1\tilde R-q^{2}\bar S_2R^{t}\bar S_1\tilde R,\\
R\,S_1\tilde R^{t}S_2-q^{2}R\,\bar S_1R^{t}\bar S_2&=S_2\tilde R^{t}S_1R-q^{2}\bar S_2R^{t}\bar S_1R.
\end{aligned}
\tag{3.57}
\]
The chain of equalities (3.40) is replaced with
\[
\tilde R\,S_1Q=\tilde R\,T_1G_1\bar T_1^{t}Q=\tilde R\,T_1\bar T_2G_1Q=\bar T_2T_1\tilde R\,G_1Q
=-q\,\bar T_2T_1G_2Q=-q\,\bar T_2G_2T_2^{t}Q=-q\,\bar S_2Q,
\tag{3.58}
\]
where we have used (2.10) and the relation $\tilde R\,G_1Q=-q\,G_2Q$ which is verified directly. Thus, $\tilde R\,S_1QS_2=-q\,\bar S_2QS_2$. A similar argument shows that
\[
R\,\bar S_1Q\bar S_2=-q^{-1}S_2Q\bar S_2,\qquad
S_2QS_1\tilde R=-q\,S_2Q\bar S_2,\qquad
\bar S_2Q\bar S_1R=-q^{-1}\bar S_2QS_2
\tag{3.59}
\]
completing the proof.
3.3. Comments on possible variations of the definitions
• By analogy with the algebras $\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)$ and $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$ we can introduce the matrices $\bar S(u)=(\bar s_{ij}(u))$ by
\[
\bar S(u)=\bar T(u)\,T(u^{-1})^{t}\qquad\text{and}\qquad\bar S(u)=\bar T(u)\,G\,T(u^{-1})^{t}
\tag{3.60}
\]
in the orthogonal and symplectic case, respectively; cf. Definitions 3.1 and 3.9. Then one can derive the following relations from (3.3):
\[
(uq-u^{-1}q^{-1})\,\bar s_{ij}(u)=(uq^{\delta_{ij}}-u^{-1}q^{-\delta_{ij}})\,s_{ji}(u^{-1})+(q-q^{-1})(u\,\delta_{j\cdots}\ \cdots
\tag{3.61}
\]
in the orthogonal case, and
\[
(u^{-1}q-uq^{-1})\,\bar s_{ij}(u)=(uq^{\delta_{ij}}-u^{-1}q^{-\delta_{ij}})\,s_{ji}(u^{-1})+(q-q^{-1})(u\,\delta_{i<j}+u^{-1}\delta_{j\cdots}\ \cdots
\tag{3.62}
\]
in the symplectic case. This implies that the coefficients of the series $\bar s_{ij}(u)$ generate the same subalgebra $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ or $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$.
• The construction of the coideal subalgebras $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ and $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_{2n})$ can be generalized to the case of the centrally extended quantum affine algebra $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)_c$. We use its presentation given in [8]. The R-matrix $R(u)$ we employ here is related to $R(u,v)$ by
\[
(uq^{-1}-q)\,R(u)=R(u,1).
\tag{3.63}
\]
The defining relations (3.3) are then replaced by
\[
\begin{aligned}
R(u/v)\,T_1(u)\,T_2(v)&=T_2(v)\,T_1(u)\,R(u/v),\\
R(u/v)\,\bar T_1(u)\,\bar T_2(v)&=\bar T_2(v)\,\bar T_1(u)\,R(u/v),\\
R(uq^{-c}/v)\,\bar T_1(u)\,T_2(v)&=T_2(v)\,\bar T_1(u)\,R(uq^{c}/v).
\end{aligned}
\tag{3.64}
\]
The definition of the matrix $S(u)$ is modified to
\[
S(u)=T(uq^{-c})\,\bar T(u^{-1})^{t}\qquad\text{and}\qquad S(u)=T(uq^{-c})\,G\,\bar T(u^{-1})^{t}
\tag{3.65}
\]
in the orthogonal and symplectic case, respectively. However, one can demonstrate that $S(u)$ still satisfies the same reflection equation (3.19) or (3.46) which does not involve the central charge $c$. The corresponding subalgebra is therefore isomorphic to the twisted q-Yangian, as an abstract algebra.
4. Centers of the Twisted q-Yangians
In this section we construct a formal series whose coefficients belong to the center of the twisted q-Yangian. We start by recalling the well-known construction of the quantum determinants for the quantum affine algebra $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$; see e.g. [3, 19] and [36]. We use an approach analogous to the case of the Yangian $\mathrm{Y}(\mathfrak{gl}_N)$ [17] and [20]; see also [28] for a detailed exposition.
Let us consider the multiple tensor product $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)\otimes(\operatorname{End}\mathbb{C}^N)^{\otimes r}$ and use the notation of (3.3). Then we have the following corollary of (3.5) and (3.3):
\[
R(u_1,\dots,u_r)\,T_1(u_1)\cdots T_r(u_r)=T_r(u_r)\cdots T_1(u_1)\,R(u_1,\dots,u_r),
\tag{4.1}
\]
where
\[
R(u_1,\dots,u_r)=\prod_{i<j}R_{ij}(u_i,u_j),
\tag{4.2}
\]
with the product taken in the lexicographical order on the pairs $(i,j)$. The proof of (4.1) is exactly the same as for the Yangians; see e.g. [28]. Furthermore, consider the q-permutation operator $P^{q}\in\operatorname{End}(\mathbb{C}^N\otimes\mathbb{C}^N)$ defined by
\[
P^{q}=\sum_{i}E_{ii}\otimes E_{ii}+q\sum_{i>j}E_{ij}\otimes E_{ji}+q^{-1}\sum_{i<j}E_{ij}\otimes E_{ji}.
\tag{4.3}
\]
The action of the symmetric group $\mathfrak{S}_r$ on the space $(\mathbb{C}^N)^{\otimes r}$ can be defined by setting $s_i\mapsto P^{q}_{s_i}:=P^{q}_{i,i+1}$ for $i=1,\dots,r-1$, where $s_i$ denotes the transposition $(i,i+1)$. If $\sigma=s_{i_1}\cdots s_{i_l}$ is a reduced decomposition of an element $\sigma\in\mathfrak{S}_r$ we set $P^{q}_{\sigma}=P^{q}_{s_{i_1}}\cdots P^{q}_{s_{i_l}}$. We denote by $A^{q}_r$ the q-antisymmetrizer
\[
A^{q}_r=\sum_{\sigma\in\mathfrak{S}_r}\operatorname{sgn}\sigma\cdot P^{q}_{\sigma}.
\tag{4.4}
\]
The following proposition is proved by induction on r in the same way as for the Yangians [28] with the use of a property of the reduced decompositions [16, p. 50].
Proposition 4.1. We have the relation in $\operatorname{End}(\mathbb{C}^N)^{\otimes r}$:
\[
R(1,q^{-2},\dots,q^{-2r+2})=\prod_{0\le i<j\le r-1}(q^{-2i}-q^{-2j})\;A^{q}_r.
\tag{4.5}
\]
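For $r=2$ the relation (4.5) reads $R(1,q^{-2})=(1-q^{-2})(1-P^{q})$, which can be confirmed directly from (3.4) and (4.3). The sketch below does this with exact rational arithmetic and also checks that $P^{q}$ squares to the identity and that $A^{q}_2=1-P^{q}$ satisfies $(A^{q}_2)^2=2A^{q}_2$. The sample value of $q$, the 0-based indices and all function names are assumptions made for the illustration.

```python
from fractions import Fraction


def r_matrix(u, v, q, N):
    """R(u, v) from (3.4); key ((i, k), (j, l)) is the coefficient of E_ij (x) E_kl."""
    R = {}
    for i in range(N):
        for j in range(N):
            if i == j:
                R[(i, i), (i, i)] = u / q - q * v
            else:
                R[(i, j), (i, j)] = u - v
                R[(i, j), (j, i)] = (1 / q - q) * (u if i > j else v)
    return R


def q_permutation(q, N):
    """P^q from (4.3), with the same key convention."""
    P = {}
    for i in range(N):
        for j in range(N):
            if i == j:
                P[(i, i), (i, i)] = Fraction(1)
            else:
                P[(i, j), (j, i)] = q if i > j else 1 / q
    return P


def as_matrix(D, N):
    idx = [(a, b) for a in range(N) for b in range(N)]
    return [[D.get((row, col), Fraction(0)) for col in idx] for row in idx]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def check_r2(q, N):
    """Return (P^q is an involution, A^2 = 2A, R(1, q^-2) = (1 - q^-2) A)."""
    n = N * N
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    Pq = as_matrix(q_permutation(q, N), N)
    A = [[identity[i][j] - Pq[i][j] for j in range(n)] for i in range(n)]
    R = as_matrix(r_matrix(Fraction(1), q ** -2, q, N), N)
    scalar = 1 - q ** -2
    return (matmul(Pq, Pq) == identity,
            matmul(A, A) == [[2 * a for a in row] for row in A],
            R == [[scalar * a for a in row] for row in A])
```

Only the case $r=2$ is covered here; larger $r$ would require building $A^{q}_r$ from reduced decompositions as in (4.4).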
Now (4.1) implies
\[
A^{q}_r\,T_1(u)\cdots T_r(q^{-2r+2}u)=T_r(q^{-2r+2}u)\cdots T_1(u)\,A^{q}_r
\tag{4.6}
\]
which equals
\[
\sum_{a_i,b_i}t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)\otimes E_{a_1b_1}\otimes\cdots\otimes E_{a_rb_r}
\tag{4.7}
\]
for some elements $t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)$ of $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)[[u^{-1}]]$ which we call the quantum minors. They can be given by the following formulas which are immediate from the definition. If $a_1<\cdots<a_r$ then
\[
t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)=\sum_{\sigma\in\mathfrak{S}_r}(-q)^{-l(\sigma)}\cdot t_{a_{\sigma(1)}b_1}(u)\cdots t_{a_{\sigma(r)}b_r}(q^{-2r+2}u),
\tag{4.8}
\]
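and for any $\tau\in\mathfrak{S}_r$ we have
\[
t^{\,a_{\tau(1)}\cdots a_{\tau(r)}}_{\,b_1\cdots b_r}(u)=(-q)^{l(\tau)}\,t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u),
\tag{4.9}
\]
where $l(\sigma)$ denotes the length of the permutation $\sigma$. If $b_1<\cdots<b_r$ (and the $a_i$ are arbitrary) then
\[
t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)=\sum_{\sigma\in\mathfrak{S}_r}(-q)^{l(\sigma)}\cdot t_{a_rb_{\sigma(r)}}(q^{-2r+2}u)\cdots t_{a_1b_{\sigma(1)}}(u),
\tag{4.10}
\]
and for any $\tau\in\mathfrak{S}_r$ we have
\[
t^{\,a_1\cdots a_r}_{\,b_{\tau(1)}\cdots b_{\tau(r)}}(u)=(-q)^{-l(\tau)}\,t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u).
\tag{4.11}
\]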
Moreover, the quantum minor is zero if two top or two bottom indices are equal. Note also that the standard row and column expansion formulas can be easily derived from (4.8) or (4.10). In particular, we have
\[
t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)=\sum_{l=1}^{r}(-q)^{-l+1}\,t_{a_lb_1}(u)\,t^{\,a_1\cdots\widehat{a_l}\cdots a_r}_{\,b_2\cdots b_r}(q^{-2}u),
\tag{4.12}
\]
where the hat indicates the index to be omitted. The quantum minors $\bar t^{\,a_1\cdots a_r}_{\,b_1\cdots b_r}(u)$ of the matrix $\bar T(u)$ are given by the same formulas where the $t_{ij}(u)$ are respectively replaced with $\bar t_{ij}(u)$. Furthermore, for any indices $i,j$ we have the well known relations which are deduced from (4.1):
\[
[t_{c_id_j}(u),\,t^{\,c_1\cdots c_r}_{\,d_1\cdots d_r}(v)]=0,\qquad
[t_{c_id_j}(u),\,\bar t^{\,c_1\cdots c_r}_{\,d_1\cdots d_r}(v)]=0,
\tag{4.13}
\]
and the same holds with $t_{c_id_j}(u)$ replaced by $\bar t_{c_id_j}(u)$. For their proof introduce an extra copy of $\operatorname{End}\mathbb{C}^N$ as a tensor factor which will be enumerated by the index $0$. Now we specialize the parameters $u_i$ in (4.1) as follows:
\[
u_0=v,\qquad u_i=q^{-2i+2}u\quad\text{for } i=1,\dots,r.
\tag{4.14}
\]
Then by Proposition 4.1 the element (4.2) will take the form
\[
R(v,u,\dots,q^{-2r+2}u)=\prod_{i=1}^{r}R_{0i}(v,q^{-2i+2}u)\;A^{q}_r.
\tag{4.15}
\]
Using the definition of the quantum minors and equating the matrix elements on both sides of (4.1) we get the first relation in (4.13); cf. [28]. The proof of the second is similar.
The quantum determinants of the matrices $T(u)$ and $\bar T(u)$ are respectively defined by the relations
\[
\operatorname{qdet}T(u)=t^{\,1\cdots N}_{\,1\cdots N}(u),\qquad
\operatorname{qdet}\bar T(u)=\bar t^{\,1\cdots N}_{\,1\cdots N}(u).
\tag{4.16}
\]
Write
\[
\operatorname{qdet}T(u)=\sum_{k=0}^{\infty}d_k\,u^{-k},\qquad
\operatorname{qdet}\bar T(u)=\sum_{k=0}^{\infty}\bar d_k\,u^{k},\qquad
d_k,\bar d_k\in\mathrm{U}_q(\widehat{\mathfrak{gl}}_N).
\tag{4.17}
\]
Due to the property (4.13), the coefficients $d_k$ and $\bar d_k$ belong to the center of the algebra $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$. Note also the relation $d_0\bar d_0=1$ which is implied by (3.3).
4.1. Sklyanin determinant
We shall consider the orthogonal and symplectic cases simultaneously unless otherwise stated. The symbol $\mathrm{Y}^{\rm tw}_q$ will denote either $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$ or $\mathrm{Y}^{\rm tw}_q(\mathfrak{sp}_N)$ (the latter with $N=2n$). As before, we denote by $S(u)$ the matrix $T(u)\,G\,\bar T(u^{-1})^{t}$, where $G$ is the identity matrix in the orthogonal case, and $G$ is given by (2.33) in the symplectic case. Then $S(u)$ satisfies the relations (3.19) and (3.46), respectively, which have the same form. Our arguments are similar to those used in [35] and [28] for the construction of the Sklyanin determinant for the twisted Yangians. The following relation in the algebra $\mathrm{Y}^{\rm tw}_q\otimes(\operatorname{End}\mathbb{C}^N)^{\otimes r}$ is a corollary of (3.46):
\[
\begin{aligned}
&R(u_1,\dots,u_r)\,S_1(u_1)R^{t}_{12}\cdots R^{t}_{1r}\,S_2(u_2)R^{t}_{23}\cdots R^{t}_{2r}\,S_3(u_3)\cdots R^{t}_{r-1,r}\,S_r(u_r)\\
&\qquad=S_r(u_r)R^{t}_{r-1,r}\cdots S_3(u_3)R^{t}_{2r}\cdots R^{t}_{23}\,S_2(u_2)R^{t}_{1r}\cdots R^{t}_{12}\,S_1(u_1)\,R(u_1,\dots,u_r),
\end{aligned}
\tag{4.18}
\]
where $R(u_1,\dots,u_r)$ is given by (4.2) and $R^{t}_{ij}=R^{t}_{ij}(u_i^{-1},u_j)$. Now take $r=N$ and specify $u_i=u\,q^{-2i+2}$. Then Proposition 4.1 implies
\[
\begin{aligned}
&A^{q}_N\,S_1(u)R^{t}_{12}\cdots R^{t}_{1N}\,S_2(uq^{-2})R^{t}_{23}\cdots R^{t}_{2N}\,S_3(uq^{-4})\cdots R^{t}_{N-1,N}\,S_N(uq^{-2N+2})\\
&\qquad=S_N(uq^{-2N+2})R^{t}_{N-1,N}\cdots S_3(uq^{-4})R^{t}_{2N}\cdots R^{t}_{23}\,S_2(uq^{-2})R^{t}_{1N}\cdots R^{t}_{12}\,S_1(u)\,A^{q}_N.
\end{aligned}
\tag{4.19}
\]
Since the q-antisymmetrizer $A^{q}_N$ is proportional to an idempotent and maps the space $(\mathbb{C}^N)^{\otimes N}$ into a one-dimensional subspace, both sides must be equal to $A^{q}_N$ times a series $\operatorname{sdet}S(u)$ in $u^{-1}$ with coefficients in $\mathrm{Y}^{\rm tw}_q$. We call this series the Sklyanin determinant. The following theorem provides an expression of $\operatorname{sdet}S(u)$ in terms of the quantum determinants.
Theorem 4.2. We have
\[
\operatorname{sdet}S(u)=\gamma_N(u)\,\operatorname{qdet}T(u)\,\operatorname{qdet}\bar T(q^{2N-2}u^{-1}),
\tag{4.20}
\]
where
\[
\gamma_N(u)=\prod_{1\le i<j\le N}(q^{2i-2}u^{-1}-q^{-2j+2}u)\cdot
\begin{cases}
1 & \text{in the case of } \mathfrak{o}_N,\\[4pt]
\dfrac{q^{n-2}-q^{n}u^{2}}{q^{2n-2}-q^{-2n}u^{2}} & \text{in the case of } \mathfrak{sp}_{2n}.
\end{cases}
\tag{4.21}
\]
Proof. We follow the arguments of [28, Sec. 4]. Substitute $S(u)=T(u)\,G\,\bar T(u^{-1})^{t}$ into (4.19) and transform the left-hand side using the relations
\[
\bar T_i(u^{-1})^{t}\,R^{t}_{ij}(u^{-1},v)\,T_j(v)=T_j(v)\,R^{t}_{ij}(u^{-1},v)\,\bar T_i(u^{-1})^{t}
\tag{4.22}
\]
which are implied by (3.3). We then bring it to the form
\[
A^{q}_N\,T_1(u)\cdots T_N(q^{-2N+2}u)\,\tilde R(u)\,\bar T_1(u^{-1})^{t}\cdots\bar T_N(q^{2N-2}u^{-1})^{t},
\tag{4.23}
\]
where
\[
\tilde R(u)=G_1R^{t}_{12}\cdots R^{t}_{1N}\,G_2R^{t}_{23}\cdots R^{t}_{2N}\,G_3\cdots R^{t}_{N-1,N}\,G_N.
\tag{4.24}
\]
By the definition of the quantum determinant we have
\[
A^{q}_N\,T_1(u)\cdots T_N(q^{-2N+2}u)=A^{q}_N\,\operatorname{qdet}T(u).
\tag{4.25}
\]
Further, using the homomorphism $S(u)\mapsto G$ we derive from (4.19)
\[
A^{q}_N\,\tilde R(u)=G_NR^{t}_{N-1,N}\cdots G_3R^{t}_{2N}\cdots R^{t}_{23}\,G_2R^{t}_{1N}\cdots R^{t}_{12}\,G_1\,A^{q}_N.
\tag{4.26}
\]
Thus, this expression equals $A^{q}_N\,\gamma_N(u)$ for a scalar function $\gamma_N(u)$. Using the explicit formulas (4.8) one easily derives that
\[
A^{q}_N\,\bar T_1(u^{-1})^{t}\cdots\bar T_N(q^{2N-2}u^{-1})^{t}=A^{q}_N\,\operatorname{qdet}\bar T(q^{2N-2}u^{-1}).
\tag{4.27}
\]
It remains to calculate the function $\gamma_N(u)$. Consider first the orthogonal case. Apply the operator $A^{q}_N\,\tilde R(u)$ to the basis vector $e_1\otimes\cdots\otimes e_N$, where the $e_i$ denote the canonical basis vectors of $\mathbb{C}^N$. By (3.16) the result is clearly $\gamma_N(u)\,A^{q}_N(e_1\otimes\cdots\otimes e_N)$ with $\gamma_N(u)$ given by (4.21). In the symplectic case apply $A^{q}_{2n}\,\tilde R(u)$ to the basis vector
\[
v=e_{2n-1}\otimes e_{2n-3}\otimes\cdots\otimes e_1\otimes e_2\otimes e_4\otimes\cdots\otimes e_{2n}.
\tag{4.28}
\]
Since $G\,e_{2k}=q\,e_{2k-1}$, the vector $A^{q}_{2n}\,\tilde R(u)\,v$ equals
\[
q^{n}\prod_{n\le i<j\le 2n}(q^{2i-2}u^{-1}-q^{-2j+2}u)\;\times\;
A^{q}_{2n}\,G_1R^{t}_{12}\cdots R^{t}_{1,2n}\,G_2\cdots G_nR^{t}_{n,n+1}\cdots R^{t}_{n,2n}\,w,
\tag{4.29}
\]
where
\[
w=e_{2n-1}\otimes e_{2n-3}\otimes\cdots\otimes e_1\otimes e_1\otimes e_3\otimes\cdots\otimes e_{2n-1}.
\tag{4.30}
\]
If $j>n+1$ then $R^{t}_{n,j}\,w=(q^{2n-2}u^{-1}-q^{-2j+2}u)\,w$. Further, we have
\[
R^{t}_{n,n+1}(e_1\otimes e_1)=(q^{2n-3}u^{-1}-q^{-2n+1}u)\,(e_1\otimes e_1)+(q^{-1}-q)\,q^{-2n}u\sum_{k=2}^{2n}(e_k\otimes e_k).
\tag{4.31}
\]
Next apply the operator $G_n$ and note that due to the subsequent application of the q-antisymmetrizer, we may only keep the linear combination of the tensor products containing $e_1$ or $e_2$ on the $n$th and $(n+1)$th places (see [28, Sec. 4] for a similar argument in the symplectic twisted Yangian case). That is, we may write
\[
\begin{aligned}
&A^{q}_{2n}\,G_1R^{t}_{12}\cdots R^{t}_{1,2n}\,G_2\cdots G_nR^{t}_{n,n+1}\,w\\
&\qquad=(q^{2n-4}u^{-1}-q^{-2n+2}u)\;\times\;A^{q}_{2n}\,G_1R^{t}_{12}\cdots R^{t}_{1,2n}\,G_2\cdots G_{n-1}R^{t}_{n-1,n}\cdots R^{t}_{n-1,2n}\,w'
\end{aligned}
\tag{4.32}
\]
with
\[
w'=e_{2n-1}\otimes e_{2n-3}\otimes\cdots\otimes e_3\otimes e_1\otimes e_2\otimes e_3\otimes\cdots\otimes e_{2n-1}.
\tag{4.33}
\]
By the skew-symmetry of the antisymmetrizer, replace $w'$ with the vector
\[
w''=e_{2n-1}\otimes e_{2n-3}\otimes\cdots\otimes e_3\otimes e_3\otimes e_2\otimes e_1\otimes\cdots\otimes e_{2n-1},
\tag{4.34}
\]
taking the sign into account. Continuing the calculation in a similar manner, we conclude that the product of the scalar factors occurring in the procedure will coincide with $\gamma_{2n}(u)$ given in (4.21) with $N=2n$.
The centrality of $\operatorname{qdet}T(u)$ and $\operatorname{qdet}\bar T(u)$ in $\mathrm{U}_q(\widehat{\mathfrak{gl}}_N)$ immediately implies the corresponding property of the Sklyanin determinant $\operatorname{sdet}S(u)$.
Corollary 4.3. The coefficients of the series $\operatorname{sdet}S(u)$ belong to the center of the algebra $\mathrm{Y}^{\rm tw}_q$.
Introduce the series $c(u)$ and the elements $c_k$ of the center of the algebra $\mathrm{Y}^{\rm tw}_q$ by the formula
\[
c(u)=\gamma_N(u)^{-1}\operatorname{sdet}S(u)=1+\sum_{k=1}^{\infty}c_k\,u^{-k}.
\tag{4.35}
\]
The series begins with 1 due to Theorem 4.2 and the relation $d_0\bar d_0=1$.
Proposition 4.4. The coefficients $c_k$, $k\ge 1$ are algebraically independent.
Proof. The coefficients $\{d_k,\ k\ge 0,\ \bar d_k,\ k\ge 1\}$ of the quantum determinants are algebraically independent. This can be derived by analogy with the case of the Yangian; see e.g. [28, Sec. 2]. The key observation here is the isomorphism (3.10). The statement is now implied by Theorem 4.2.
Since both evaluation homomorphisms (3.31) and (3.55) are surjective, we obtain families of central elements in the algebras $\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)$ and $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$ as images of the coefficients of $\operatorname{sdet}S(u)$. In other words, we have the following result.
Proposition 4.5. The coefficients of the Sklyanin determinants $\operatorname{sdet}(S+q^{-1}u^{-1}\bar S)$ and $\operatorname{sdet}(S+q\,u^{-1}\bar S)$ are central elements in the algebras $\mathrm{U}^{\rm tw}_q(\mathfrak{o}_N)$ and $\mathrm{U}^{\rm tw}_q(\mathfrak{sp}_{2n})$, respectively.
In what follows we only consider the case of the orthogonal twisted q-Yangian $\mathrm{Y}^{\rm tw}_q(\mathfrak{o}_N)$. In order to produce an explicit formula for the corresponding Sklyanin determinant we introduce a map
\[
\pi_N:\ \mathfrak{S}_N\to\mathfrak{S}_N,\qquad p\mapsto p'
\tag{4.36}
\]
which was previously used in the “short” formula for the Sklyanin determinant for the twisted Yangians; see [27]. This map is defined by an inductive procedure. Given a set of positive integers $\omega_1<\cdots<\omega_N$, we regard $\mathfrak{S}_N$ as the group of their permutations. If $N=2$ we define $\pi_2$ as the map $\mathfrak{S}_2\to\mathfrak{S}_2$ whose image is the identity permutation. For $N>2$ define a map from the set of ordered pairs $(\omega_k,\omega_l)$ with $k\ne l$ into itself by the rule
\[
\begin{aligned}
(\omega_k,\omega_l)&\mapsto(\omega_l,\omega_k), && k,l<N,\\
(\omega_k,\omega_N)&\mapsto(\omega_{N-1},\omega_k), && k<N-1,\\
(\omega_N,\omega_k)&\mapsto(\omega_k,\omega_{N-1}), && k<N-1,\\
(\omega_{N-1},\omega_N)&\mapsto(\omega_{N-1},\omega_{N-2}),\\
(\omega_N,\omega_{N-1})&\mapsto(\omega_{N-1},\omega_{N-2}).
\end{aligned}
\tag{4.37}
\]
Let $p=(p_1,\dots,p_N)$ be a permutation of the indices $\omega_1,\dots,\omega_N$. Its image under the map $\pi_N$ is the permutation $p'=(p'_1,\dots,p'_{N-1},\omega_N)$, where the pair $(p'_1,p'_{N-1})$ is the image of the ordered pair $(p_1,p_N)$ under the map (4.37). Then the pair $(p'_2,p'_{N-2})$ is found as the image of $(p_2,p_{N-1})$ under the map (4.37) which is defined on the set of ordered pairs of elements obtained from $(\omega_1,\dots,\omega_N)$ by deleting $p_1$ and $p_N$; etc. The map $\pi_N$ has curious combinatorial properties which were observed by Lascoux; see [26]. In particular, each fiber of this map is an interval in $\mathfrak{S}_N$ with respect to the Bruhat order, isomorphic to a Boolean poset.
Example 4.6. The Bruhat order on $\mathfrak{S}_3$ and the fibers of the map $\pi_3$:
[Figure: Hasse diagram of the Bruhat order on $S_3$, with vertices 321, 312, 231, 213, 132, 123, and arrows indicating the fibers of the map $\pi_3$.]
Consider the matrix $\bar S(u)$ introduced in (3.60). By (3.61) the matrix elements of $\bar S(u)$ are formal series in $u$ with coefficients in the subalgebra $Y^{tw}_q(\mathfrak{o}_N)$. Let $n$ denote the rank of the Lie algebra $\mathfrak{o}_N$ so that $N = 2n$ or $N = 2n + 1$.
Theorem 4.7. We have an explicit formula
\[
c(u) = \sum_{p \in S_N} (-q)^{-l(p)+l(p')}\, \bar s^{\,t}_{p_1 p'_1}(u^{-1}) \cdots \bar s^{\,t}_{p_n p'_n}(q^{2n-2} u^{-1}) \times s_{p_{n+1} p'_{n+1}}(q^{-2n} u) \cdots s_{p_N p'_N}(q^{-2N+2} u), \tag{4.38}
\]
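where the $\bar s^{\,t}_{ij}(u)$ denote the matrix elements of the transposed matrix $\bar S^{\,t}(u)$.
Example 4.8. If $N = 2$ then
\[
c(u) = \bar s^{\,t}_{11}(u^{-1})\, s_{22}(q^{-2}u) - q^{-1}\, \bar s^{\,t}_{21}(u^{-1})\, s_{12}(q^{-2}u). \tag{4.39}
\]
If $N = 3$ then
\[
\begin{aligned}
c(u) ={}& \bar s^{\,t}_{22}(u^{-1})\, s_{11}(q^{-2}u)\, s_{33}(q^{-4}u) + \bar s^{\,t}_{12}(u^{-1})\, s_{31}(q^{-2}u)\, s_{23}(q^{-4}u)\\
& - q\, \bar s^{\,t}_{12}(u^{-1})\, s_{21}(q^{-2}u)\, s_{33}(q^{-4}u) + q^{-2}\, \bar s^{\,t}_{21}(u^{-1})\, s_{32}(q^{-2}u)\, s_{13}(q^{-4}u)\\
& - q^{-1}\, \bar s^{\,t}_{32}(u^{-1})\, s_{11}(q^{-2}u)\, s_{23}(q^{-4}u) - q^{-3}\, \bar s^{\,t}_{31}(u^{-1})\, s_{22}(q^{-2}u)\, s_{13}(q^{-4}u).
\end{aligned}
\]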
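The powers of $-q$ in Example 4.8 can be cross-checked against the exponent $-l(p) + l(p')$ of Theorem 4.7. The following is a minimal sketch, not the authors' computation: it assumes that each monomial $\bar s^{\,t}_{p_1 p'_1} s_{p_2 p'_2} s_{p_3 p'_3}$ is read off exactly as printed, so that the first indices form $p$ and the second indices form $p'$, and that $l$ is the number of inversions. The names `TERMS_N3`, `inversions`, `exponent` and `check_terms` are introduced here for illustration only.

```python
from itertools import combinations


def inversions(perm):
    """Return the number of inversions l(p) of a permutation given as a tuple."""
    return sum(1 for i, j in combinations(range(len(perm)), 2) if perm[i] > perm[j])


# (p, p', k) for the six monomials of Example 4.8 with N = 3, where the printed
# coefficient is (-q)**k: 1, 1, -q, q^-2, -q^-1, -q^-3 (illustrative transcription).
TERMS_N3 = [
    ((2, 1, 3), (2, 1, 3), 0),
    ((1, 3, 2), (2, 1, 3), 0),
    ((1, 2, 3), (2, 1, 3), 1),
    ((2, 3, 1), (1, 2, 3), -2),
    ((3, 1, 2), (2, 1, 3), -1),
    ((3, 2, 1), (1, 2, 3), -3),
]


def exponent(p, p_image):
    """Exponent of -q predicted by Theorem 4.7: -l(p) + l(p')."""
    return inversions(p_image) - inversions(p)


def check_terms(terms):
    """True if every transcribed coefficient agrees with the predicted exponent."""
    return all(exponent(p, p_image) == k for p, p_image, k in terms)
```

Under this reading, each of the six permutations of $S_3$ occurs exactly once as $p$, and `check_terms(TERMS_N3)` compares the transcribed exponents with the inversion counts.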
Proof of Theorem 4.7. Our proof is based on the properties of the quantum minors of the matrices $T(u)$ and $\bar T(u)$. Let $a_1, \dots, a_r$ and $b_1, \dots, b_r$ be indices from the set $\{1, \dots, N\}$. Introduce the elements
\[
s^{a_1 \cdots a_r}_{b_1 \cdots b_r}(u) = \sum_{c_1 < \cdots < c_r} t^{a_1 \cdots a_r}_{c_1 \cdots c_r}(u)\; \bar t^{\,b_r \cdots b_1}_{c_r \cdots c_1}(q^{2r-2} u^{-1}), \tag{4.40}
\]
where the indices $c_1, \dots, c_r$ run over the set $\{1, \dots, N\}$. In particular, $s^{a}_{b}(u) = s_{ab}(u)$, and
\[
s^{1 \cdots N}_{1 \cdots N}(u) = c(u) \tag{4.41}
\]
by (4.20). We shall now derive a recurrence formula for the elements (4.40) with the conditions $b_i = a_i$ for $i = 1, \dots, r - 1$ and $a_1 < \cdots < a_r$. First, by the formulas
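(4.8)–(4.13) for the quantum minors we can write
\[
\begin{aligned}
r!\; s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u)
&= \sum_{c_1, \dots, c_r} t^{a_1 \cdots a_r}_{c_1 \cdots c_r}(u)\; \bar t^{\,b_r, a_{r-1} \cdots a_1}_{c_r \cdots c_1}(q^{2r-2} u^{-1})\\
&= r \sum_{c_1, \dots, c_r} t^{a_1 \cdots a_r}_{c_1 \cdots c_r}(u)\; \bar t^{\,a_{r-1} \cdots a_1}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1})\; \bar t_{b_r c_r}(q^{2r-2} u^{-1})\\
&= r \sum_{c_1, \dots, c_r} \bar t^{\,a_{r-1} \cdots a_1}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1})\; t^{a_1 \cdots a_r}_{c_1 \cdots c_r}(u)\; \bar t_{b_r c_r}(q^{2r-2} u^{-1}).
\end{aligned} \tag{4.42}
\]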
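Next, applying the column expansion to the quantum minor $t^{a_1 \cdots a_r}_{c_1 \cdots c_r}(u)$, we obtain
\[
\begin{aligned}
(r-1)!\; s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u)
&= \sum_{c_1, \dots, c_r} \sum_{k=1}^{r} (-q)^{k-r}\; \bar t^{\,a_{r-1} \cdots a_1}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1})\; t^{a_1 \cdots \widehat{a_k} \cdots a_r}_{c_1 \cdots c_{r-1}}(u)\\
&\qquad\qquad \times\; t_{a_k c_r}(q^{-2r+2} u)\; \bar t_{b_r c_r}(q^{2r-2} u^{-1})\\
&= \sum_{c_1, \dots, c_{r-1}} \sum_{k=1}^{r} (-q)^{k-r}\; \bar t^{\,a_{r-1} \cdots a_1}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1})\; t^{a_1 \cdots \widehat{a_k} \cdots a_r}_{c_1 \cdots c_{r-1}}(u)\; s_{a_k b_r}(q^{-2r+2} u),
\end{aligned}
\]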
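where the hats indicate the symbols to be omitted. Now for any $k < r$ write
\[
\bar t^{\,a_{r-1} \cdots a_1}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1}) = (-q)^{k-1}\; \bar t^{\,a_{r-1} \cdots \widehat{a_k} \cdots a_1, a_k}_{c_{r-1} \cdots c_1}(q^{2r-4} u^{-1}) \tag{4.43}
\]
and apply the column expansion to this minor to bring the previous formula to the form
\[
\begin{aligned}
(r-2)!\; s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u) = \sum_{c_1, \dots, c_{r-1}} \Bigg\{ & (-q)^{r-2}\; \bar t_{a_{r-1} c_1}(u^{-1})\; \bar t^{\,a_{r-2} \cdots a_1}_{c_{r-1} \cdots c_2}(q^{2r-4} u^{-1})\\
& \times\; t^{a_1 \cdots a_{r-1}}_{c_1 \cdots c_{r-1}}(u)\; s_{a_r b_r}(q^{-2r+2} u)\\
& + \sum_{k=1}^{r-1} (-q)^{2k-r-1}\; \bar t_{a_k c_1}(u^{-1})\; \bar t^{\,a_{r-1} \cdots \widehat{a_k} \cdots a_1}_{c_{r-1} \cdots c_2}(q^{2r-4} u^{-1})\\
& \times\; t^{a_1 \cdots \widehat{a_k} \cdots a_r}_{c_1 \cdots c_{r-1}}(u)\; s_{a_k b_r}(q^{-2r+2} u) \Bigg\}.
\end{aligned}
\]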
Applying again (4.13), we write this as
\[
\begin{aligned}
(r-2)!\; s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u) = \sum_{c_1, \dots, c_{r-1}} \Bigg\{ & (-q)^{r-2}\; \bar t_{a_{r-1} c_1}(u^{-1})\\
& \times\; t^{a_1 \cdots a_{r-1}}_{c_1 \cdots c_{r-1}}(u)\; \bar t^{\,a_{r-2} \cdots a_1}_{c_{r-1} \cdots c_2}(q^{2r-4} u^{-1})\; s_{a_r b_r}(q^{-2r+2} u)\\
& + \sum_{k=1}^{r-1} (-q)^{2k-r-1}\; \bar t_{a_k c_1}(u^{-1})\; t^{a_1 \cdots \widehat{a_k} \cdots a_r}_{c_1 \cdots c_{r-1}}(u)\\
& \times\; \bar t^{\,a_{r-1} \cdots \widehat{a_k} \cdots a_1}_{c_{r-1} \cdots c_2}(q^{2r-4} u^{-1})\; s_{a_k b_r}(q^{-2r+2} u) \Bigg\}.
\end{aligned}
\]
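Finally, using the column expansion (4.12) and the definition (3.60) of the elements $\bar s_{ij}(u)$ we get the following recurrence relation
\[
\begin{aligned}
s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u)
={}& \bar s^{\,t}_{a_{r-1} a_{r-1}}(u^{-1})\; s^{a_1 \cdots a_{r-2}}_{a_1 \cdots a_{r-2}}(q^{-2} u)\; s_{a_r b_r}(q^{-2r+2} u)\\
& + \sum_{l=1}^{r-2} (-q)^{2r-2l-3}\; \bar s^{\,t}_{a_l a_{r-1}}(u^{-1})\; s^{a_1 \cdots \widehat{a_l} \cdots a_{r-1}}_{a_1 \cdots \widehat{a_l} \cdots a_{r-2}, a_l}(q^{-2} u)\; s_{a_r b_r}(q^{-2r+2} u)\\
& + \sum_{k=1}^{r-1} \Bigg\{ (-q)^{2k-2r+1}\; \bar s^{\,t}_{a_r a_k}(u^{-1})\; s^{a_1 \cdots \widehat{a_k} \cdots a_{r-1}}_{a_1 \cdots \widehat{a_k} \cdots a_{r-1}}(q^{-2} u)\; s_{a_k b_r}(q^{-2r+2} u)\\
& \qquad + \sum_{l=1}^{k-1} (-q)^{2k-2l-2}\; \bar s^{\,t}_{a_l a_k}(u^{-1})\; s^{a_1 \cdots \widehat{a_l} \cdots \widehat{a_k} \cdots a_r}_{a_1 \cdots \widehat{a_l} \cdots \widehat{a_k} \cdots a_{r-1}, a_l}(q^{-2} u)\; s_{a_k b_r}(q^{-2r+2} u)\\
& \qquad + \sum_{l=k+1}^{r-1} (-q)^{2k-2l}\; \bar s^{\,t}_{a_l a_k}(u^{-1})\; s^{a_1 \cdots \widehat{a_k} \cdots \widehat{a_l} \cdots a_r}_{a_1 \cdots \widehat{a_k} \cdots \widehat{a_l} \cdots a_{r-1}, a_l}(q^{-2} u)\; s_{a_k b_r}(q^{-2r+2} u) \Bigg\}.
\end{aligned}
\]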
Using (4.41) and starting with $s^{1 \cdots N}_{1 \cdots N}(u)$ we apply this recurrence relation repeatedly to get an explicit expression for the series $c(u)$ in terms of the generators $s_{ij}(u)$ and $\bar s_{ij}(u)$. It follows from the definition of the map $\pi_N$ that $c(u)$ will be written as a combination of the monomials of the required form, the coefficients being powers of $-q$. The exact values of the powers are easily found by calculating the number of inversions of the permutations occurring in the recurrence relation.
Remark 4.9. The argument used in the proof of Theorem 4.7 can also be applied to produce a simpler proof for the formula for the Sklyanin determinant given in [27] in the case of orthogonal and symplectic twisted Yangians. However, the argument does not seem to be directly applicable to the case of the symplectic twisted q-Yangian $Y^{tw}_q(\mathfrak{sp}_{2n})$.
The image of the matrix $\bar S(u)$ under the evaluation homomorphism (3.31) is found from the relation (3.61) so that
\[
\bar S(u) \mapsto \frac{1 + u q^{-1}}{1 + u q}\, (\bar S + q\, u\, S). \tag{4.44}
\]
Applying the evaluation homomorphism to the Sklyanin determinant sdetS(u) and using (3.31) and (4.44) we derive the following corollary from Theorem 4.7.
Corollary 4.10. The coefficients of the polynomial
\[
\begin{aligned}
C(u) = \sum_{p \in S_N} (-q)^{-l(p)+l(p')}\; & [u \bar S + q S]_{p'_1 p_1} \cdots [u \bar S + q^{2n-1} S]_{p'_n p_n}\\
& \times [u S + q^{2n-1} \bar S]_{p_{n+1} p'_{n+1}} \cdots [u S + q^{2N-3} \bar S]_{p_N p'_N}
\end{aligned}
\]
are Casimir elements for the algebra $U^{tw}_q(\mathfrak{o}_N)$. Moreover, the polynomial $C(u)$ is monic of degree $N$.
Proof. The polynomial $C(u)$ is obtained by the application of the evaluation homomorphism to the series $c(u)$ and multiplication by an appropriate rational function in $u$. The centrality of its coefficients thus follows from Theorem 4.7. Obviously, the degree of $C(u)$ does not exceed $N$. The coefficient of $u^N$ can only occur in the summands with the property $p = p'$. However, it follows from the definition of the map (4.36) that this property is satisfied by the only permutation
\[
p = \begin{cases} (N-1, N-3, \dots, 1, 2, 4, \dots, N) & \text{if $N$ is even},\\ (N-1, N-3, \dots, 2, 1, 3, \dots, N) & \text{if $N$ is odd}. \end{cases} \tag{4.45}
\]
Since the diagonal entries of $S$ and $\bar S$ are ones, the second statement follows.
The polynomial $C(u)$ may be regarded as a q-analog of the Capelli polynomial for the algebra $U^{tw}_q(\mathfrak{o}_N)$. It would be interesting to find its eigenvalues in the irreducible representations and to get the corresponding q-analogs of the Capelli identities; cf. [27, 13, 15] and [34].
Example 4.11. If $N = 3$ then
\[
C(u) = (u + q)\,[(u + q)(u + q^3) - q^2 u\, C], \tag{4.46}
\]
where
\[
C = s_{21}^2 + q^2 s_{32}^2 + s_{31}^2 - q\, s_{21} s_{32} s_{31}. \tag{4.47}
\]
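The permutation (4.45) is easy to tabulate for small $N$. The sketch below is only an illustration of the displayed formula; the function name `fixed_permutation` is hypothetical and not taken from the paper. For $N = 3$ it returns $(2, 1, 3)$, which is the index pattern of the monomial $\bar s^{\,t}_{22}\, s_{11}\, s_{33}$ with coefficient 1 in Example 4.8.

```python
def fixed_permutation(n):
    """Return the permutation p of (4.45) as a tuple of integers 1..n.

    Even n: (n-1, n-3, ..., 1, 2, 4, ..., n)
    Odd n:  (n-1, n-3, ..., 2, 1, 3, ..., n)
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    # Descending part: n-1, n-3, ... down to 1 (n even) or 2 (n odd).
    head = list(range(n - 1, 0, -2))
    # Ascending part: 2, 4, ..., n (n even) or 1, 3, ..., n (n odd).
    start = 2 if n % 2 == 0 else 1
    tail = list(range(start, n + 1, 2))
    return tuple(head + tail)
```

For instance, `fixed_permutation(4)` gives `(3, 1, 2, 4)`, in line with the even case of (4.45).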
The following corollary provides a characteristic identity for the algebra $U^{tw}_q(\mathfrak{o}_N)$; cf. [27] and [30].
Corollary 4.12. We have the identity
\[
C(-q^{2N-3}\, \bar S S^{-1}) = 0. \tag{4.48}
\]
Proof. Introduce the quantum comatrix $\widehat S(u)$ by the formula
\[
\widehat S(u)\, S(u q^{-2N+2}) = \operatorname{sdet} S(u). \tag{4.49}
\]
Explicit expressions for the matrix elements of the matrix $\widehat S(u)$ can be found from the recurrence relation for the elements $s^{a_1 \cdots a_r}_{a_1 \cdots a_{r-1}, b_r}(u)$; see the end of the proof of
Theorem 4.7. Using these expressions and applying the evaluation homomorphism to both sides of (4.49) we get
\[
C(u) = \widehat C(u)\, (\bar S + u q^{-2N+3} S), \tag{4.50}
\]
where $\widehat C(u)$ is a polynomial in $u$ with coefficients in the algebra $U^{tw}_q(\mathfrak{o}_N) \otimes \operatorname{End} \mathbb{C}^N$. This completes the proof.
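The passage from (4.50) to (4.48) can be spelled out as follows. This is a sketch under assumptions that the text does not state explicitly: the matrix $X := -q^{2N-3}\, \bar S S^{-1}$ is substituted for $u$ on the right of each coefficient, and the symbols $c_k$ and $D_k$ are introduced here only for this computation.

```latex
% Factor the right-hand factor of (4.50):
\[
  \bar S + u\,q^{-2N+3} S \;=\; q^{-2N+3}\,(u - X)\,S ,
  \qquad X := -q^{2N-3}\,\bar S S^{-1},
\]
% so that (4.50) becomes
\[
  C(u)\,S^{-1} \;=\; q^{-2N+3}\,\widehat C(u)\,(u - X).
\]
% Write C(u) = \sum_{k=0}^{N} c_k u^k and
% q^{-2N+3}\,\widehat C(u) = \sum_{k=0}^{N-1} D_k u^k, with D_{-1} = D_N = 0.
% Comparing the coefficients of u^k gives
\[
  c_k\,S^{-1} \;=\; D_{k-1} - D_k X , \qquad k = 0, 1, \dots, N ,
\]
% and the sum over k telescopes:
\[
  \sum_{k=0}^{N} c_k\,S^{-1} X^k
  \;=\; \sum_{k=0}^{N} \bigl( D_{k-1} X^{k} - D_k X^{k+1} \bigr) \;=\; 0 .
\]
% Since the c_k are central (Corollary 4.10), they commute with the entries
% of S^{-1}; multiplying by S on the left yields C(X) = 0, which is (4.48).
```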
Acknowledgments
We are grateful to Gustav Delius, Masatoshi Noumi and Tôru Umeda for valuable discussions. The financial support of the Australian Research Council and the Laboratoire d'Annecy-le-Vieux de Physique Théorique is acknowledged.

References
[1] D. Arnaudon, J. Avan, N. Crampé, L. Frappat and E. Ragoucy, R-matrix presentation for super-Yangians Y(osp(m|2n)), J. Math. Phys. 44 (2003), 302–308.
[2] V. Chari and A. Pressley, A Guide to Quantum Groups, Cambridge University Press, 1994.
[3] I. V. Cherednik, A new interpretation of Gelfand–Tzetlin bases, Duke Math. J. 54 (1987), 563–577.
[4] G. W. Delius, N. J. MacKay and B. J. Short, Boundary remnant of Yangian symmetry and the structure of rational reflection matrices, Phys. Lett. B522 (2001), 335–344; Erratum ibid. B524 (2002), 401.
[5] G. W. Delius and N. J. MacKay, Quantum group symmetry in sine-Gordon and affine Toda field theories on the half-line, Commun. Math. Phys. 233 (2003), 173–190.
[6] M. S. Dijkhuizen and M. Noumi, A family of quantum projective spaces and related q-hypergeometric orthogonal polynomials, Trans. AMS 350 (1998), 3269–3296.
[7] M. S. Dijkhuizen, M. Noumi and T. Sugitani, Multivariable Askey–Wilson polynomials and quantum complex Grassmannians, Fields Inst. Commun. 14 (1997), 167–177.
[8] J. Ding, Spinor representations of $U_q(\widehat{gl}(n))$ and quantum boson-fermion correspondence, Commun. Math. Phys. 200 (1999), 399–420.
[9] V. G. Drinfeld, Hopf algebras and the quantum Yang–Baxter equation, Sov. Math. Dokl. 32 (1985), 254–258.
[10] V. G. Drinfeld, A new realization of Yangians and quantized affine algebras, Sov. Math. Dokl. 36 (1988), 212–216.
[11] E. Frenkel and E. Mukhin, The Hopf algebra Rep $U_q \widehat{gl}_\infty$, Selecta Math. (N.S.) 8 (2002), 537–635.
[12] A. M. Gavrilik and A. U. Klimyk, q-deformed orthogonal and pseudo-orthogonal algebras and their representations, Lett. Math. Phys. 21 (1991), 215–220.
[13] A. M. Gavrilik and N. Z. Iorgov, "On Casimir elements of q-algebras $U'_q(so_n)$ and their eigenvalues in representations", in Symmetry in Nonlinear Mathematical Physics, Proc. Inst. Mat. Ukr. Nat. Acad. Sci. 30, Kyiv, 1999, pp. 310–314.
[14] A. M. Gavrilik, N. Z. Iorgov and A. U. Klimyk, "Nonstandard deformation $U'_q(so_n)$: the embedding $U'_q(so_n) \subset U_q(sl_n)$ and representations", in Symmetries in Science, X (Bregenz, 1997), Plenum, New York, 1998, pp. 121–133.
[15] M. Havlíček, A. U. Klimyk and S. Pošta, Central elements of the algebras $U'_q(so_m)$ and $U_q(iso_m)$, Czechoslovak J. Phys. 50 (2000), 79–84.
[16] J. E. Humphreys, Introduction to Lie Algebras and Representation Theory, Springer, New York, 1972.
[17] A. G. Izergin and V. E. Korepin, A lattice model related to the nonlinear Schrödinger equation, Sov. Phys. Dokl. 26 (1981), 653–654.
[18] M. Jimbo, A q-difference analogue of U(g) and the Yang–Baxter equation, Lett. Math. Phys. 10 (1985), 63–69.
[19] M. Jimbo, A q-analogue of $U_q(gl(N+1))$, Hecke algebra and the Yang–Baxter equation, Lett. Math. Phys. 11 (1986), 247–252.
[20] P. P. Kulish and E. K. Sklyanin, "Quantum spectral transform method: recent developments", in Integrable Quantum Field Theories, Lecture Notes in Phys. 151, Springer, Berlin-Heidelberg, 1982, pp. 61–119.
[21] G. Letzter, Symmetric pairs for quantized enveloping algebras, J. Algebra 220 (1999), 729–767.
[22] G. Letzter, "Coideal subalgebras and quantum symmetric pairs", in New directions in Hopf algebras, Math. Sci. Res. Inst. Publ. 43, Cambridge Univ. Press, Cambridge, 2002, pp. 117–165.
[23] G. Letzter, Quantum symmetric pairs and their zonal spherical functions, preprint math.QA/0204103.
[24] A. Liguori, M. Mintchev and L. Zhao, Boundary exchange algebras and scattering on the half line, Commun. Math. Phys. 194 (1998), 569–589.
[25] M. Mintchev, E. Ragoucy and P. Sorba, Spontaneous symmetry breaking in the gl(N)-NLS hierarchy on the half line, J. Phys. A34 (2001), 8345–8364.
[26] A. I. Molev, Stirling partitions of the symmetric group and Laplace operators for the orthogonal Lie algebra, Discrete Math. 180 (1998), 281–300.
[27] A. I. Molev, "Yangians and their applications", in Handbook of Algebra, Vol. 3, ed. M. Hazewinkel, Elsevier, 2003.
[28] A. Molev, M. Nazarov and G. Olshanski, Yangians and classical Lie algebras, Russian Math. Surveys 51(2) (1996), 205–282.
[29] A. I. Molev and E. Ragoucy, Representations of reflection algebras, Rev. Math. Phys. 14 (2002), 317–342.
[30] M. Nazarov and V. Tarasov, Yangians and Gelfand–Zetlin bases, Publ. RIMS, Kyoto Univ. 30 (1994), 459–478.
[31] M. Noumi, Macdonald's symmetric polynomials as zonal spherical functions on quantum homogeneous spaces, Adv. Math. 123 (1996), 16–77.
[32] M. Noumi and T. Sugitani, "Quantum symmetric spaces and related q-orthogonal polynomials", in Group Theoretical Methods in Physics (ICGTMP) (Toyonaka, Japan, 1994), World Scientific Publishing, New Jersey, 1995, pp. 28–40.
[33] M. Noumi, T. Umeda and M. Wakayama, A quantum dual pair $(sl_2, o_n)$ and the associated Capelli identity, Lett. Math. Phys. 34 (1995), 1–8.
[34] M. Noumi, T. Umeda and M. Wakayama, Dual pairs, spherical harmonics and a Capelli identity in quantum group theory, Compos. Math. 104 (1996), 227–277.
[35] G. Olshanski, "Twisted Yangians and infinite-dimensional classical Lie algebras", in Quantum Groups, ed. P. P. Kulish, Lecture Notes in Math. 1510, Springer, Berlin-Heidelberg, 1992, pp. 103–120.
[36] N. Yu. Reshetikhin, L. A. Takhtajan and L. D. Faddeev, Quantization of Lie Groups and Lie algebras, Leningrad Math. J. 1 (1990), 193–225.
[37] E. K. Sklyanin, Boundary conditions for integrable quantum systems, J. Phys. A21 (1988), 2375–2389.
December 3, 2003 17:12 WSPC/148-RMP
00182
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 823–845 © World Scientific Publishing Company
DIRICHLET FORMS AND SYMMETRIC MARKOVIAN SEMIGROUPS ON Z2 -GRADED VON NEUMANN ALGEBRAS
CHANGSOO BAHN
Natural Science Research Institute, Yonsei University, Seoul 120-749, Korea
[email protected]

CHUL KI KO∗ and YONG MOON PARK†
Department of Mathematics, Yonsei University, Seoul 120-749, Korea
∗[email protected]
†[email protected]

Received 7 April 2003
Revised 9 September 2003

We extend the construction of Dirichlet forms and symmetric Markovian semigroups on standard forms of von Neumann algebras given in [1] to the case of Z2-graded von Neumann algebras. As an application of the extension, we construct symmetric Markovian semigroups on CAR algebras with respect to gauge invariant quasi-free states and also investigate detailed properties such as ergodicity of the semigroups.

Keywords: Dirichlet forms; Z2-graded algebras; standard forms; Markovian semigroups; CAR algebras; quasi-free states.
1. Introduction
In [1], the author has used the theory of Dirichlet forms and Markovian semigroups on standard forms of von Neumann algebras developed by Cipriani [2] to construct Dirichlet forms which generate symmetric Markovian semigroups. The result has been employed to construct Dirichlet forms and associated symmetric Markovian semigroups on CCR algebras with respect to quasi-free states [3] and quantum mechanical systems [4]. The purpose of this paper is to extend the construction method of Dirichlet forms in [1] to the case of Z2-graded algebras by using the notion of superderivations [5]. As an application of the extension, we construct symmetric Markovian semigroups on CAR algebras with respect to gauge invariant quasi-free states and also investigate detailed properties such as ergodicity of the semigroups. This application is similar to our previous work on CCR algebras [3].
The need to construct Markovian semigroups on von Neumann algebras, which are symmetric with respect to a non-tracial state, is clear for various applications to open systems [6], quantum statistical mechanics [7], and quantum probability
theory [8–10]. Although on the abstract level we have quite well-developed theories [2, 11–13], the progress in concrete applications is very slow. We would like to mention a few works in this direction. The completely positive Hamiltonian semigroup for quantum spin chains in the ground state representation has been considered in [14]. Completely positive unit preserving semigroups for infinite free Fermion systems have been constructed and analyzed in [15] (see Remark 3.3). In [16, 17], the authors used the generalized conditional expectation to construct generators of spin-flip type dynamics for quantum spin systems. It may be worth mentioning that in [16, 17], the authors have also constructed Markovian semigroups of a diffusion type generator (which have been formally introduced in [15]) for quantum systems under the condition of strong asymptotic abelianness, which is not proven yet (see Remark 2.2). As we mentioned above, one of the authors gave a general construction method of Dirichlet forms on standard forms of von Neumann algebras and applied the method to construct translation invariant Markovian semigroups for quantum spin systems [1]. The method of [1] has been extended to construct symmetric Markovian semigroups on the CCR algebras with respect to quasi-free states [3], and on quantum mechanical systems [4]. In [18, 19], quantum Ornstein–Uhlenbeck semigroups were constructed by means of noncommutative Dirichlet forms.
Let us describe the content of this paper briefly. For a Z2-graded von Neumann algebra M acting on a Hilbert space H, let ξ0 be a cyclic and separating vector for M which is invariant under the Z2-grading (see (2.6)). We consider a construction of Dirichlet forms on the natural standard form (M, H, P, J) associated with the pair (M, ξ0). For any admissible function f (Definition 2.1) and any odd analytic element x in M, we construct a bounded Dirichlet form which generates a Markovian semigroup on H (Theorem 2.1).
Applying this result, we construct Dirichlet forms and associated symmetric Markovian semigroups on CAR algebras with respect to gauge invariant quasi-free states. More precisely, let A(h0) be the CAR algebra over a complex separable pre-Hilbert space h0 and let ω be a gauge invariant quasi-free state on A(h0). For any normalized admissible function f and complete orthonormal system (CONS) {gn} ⊂ h0, we construct a Dirichlet form and corresponding symmetric Markovian semigroup on the natural standard form associated to the GNS representation of the pair (A(h0), ω) as follows: Let a(gn), n ∈ N, be the annihilation operators. For each n ∈ N, let En(·, ·) : H × H → C be the Dirichlet form corresponding to x = a(gn). We define a sesquilinear form on H by
\[
D(E) = \Big\{ \xi \in H : \sum_{n=1}^{\infty} E_n(\xi, \xi) < \infty \Big\}, \qquad
E(\eta, \xi) = \sum_{n=1}^{\infty} E_n(\eta, \xi), \quad \eta, \xi \in D(E).
\]
Then E is densely defined (Theorem 3.1). Since each En is a Dirichlet form (Theorem 2.1), the sum E is also a Dirichlet form by Theorem 5.2 of [2]. Furthermore it turns out that the form is independent of the normalized admissible function f and the CONS {gn} chosen (Theorem 3.1). By establishing a (chaos) decomposition of the quasi-free Hilbert space (see Sec. 4) and analyzing the spectrum of the generator (Dirichlet operator) of the semigroup, we prove that the semigroup is ergodic in the sense of [20] and tends to the equilibrium exponentially fast (Theorem 3.2).
We organize the paper as follows. In Sec. 2, we introduce some terminology from the theory of noncommutative Dirichlet forms in the sense of Cipriani [2] and give a brief review of Z2-graded von Neumann algebras and the notion of superderivations [5]. We then extend the construction of Dirichlet forms given in [1] to the case of Z2-graded von Neumann algebras. In Sec. 3, as an application of the results in the previous section, we give an explicit expression of a Dirichlet form on the natural standard form associated to the CAR algebra with a quasi-free state. In Sec. 4, we make a chaos decomposition of H which is essentially the same as that in [21], and then prove the ergodicity of the semigroup and the existence of a spectral gap.
Before closing the Introduction, let us mention that the semigroup on the CAR algebra with respect to a quasi-free state, which we construct in Sec. 3, depends strongly on the so-called quasi-free dynamics (3.5) of the modular automorphism. We use the quasi-free property extensively to prove that the Dirichlet form is independent of the admissible function f and the CONS {gn} chosen, and to analyze the spectrum of the semigroup completely. In the case of a general KMS state for an interacting system, the Dirichlet form given in (2.9) should depend on the admissible function used.

2. Construction of Dirichlet Forms on Z2-Graded von Neumann Algebras
In this section, we first introduce the necessary terminology from the theory of noncommutative Dirichlet forms in the sense of Cipriani [2] and the notion of superderivations [5], and then, by employing the method developed in [1], we construct Dirichlet forms on standard forms of Z2-graded von Neumann algebras which generate symmetric Markovian semigroups.
Let M be a σ-finite von Neumann algebra acting on a complex Hilbert space H equipped with an inner product ⟨·, ·⟩. Let ξ0 ∈ H be a cyclic and separating vector for M. We use ∆ and J to denote, respectively, the modular operator and the modular conjugation associated to the pair (M, ξ0) [7]. The associated modular automorphism group is denoted by σt: σt(A) = ∆^{it} A ∆^{−it}, ∀ A ∈ M, t ∈ R. The map j : M → M′ is the antilinear ∗-isomorphism defined by j(A) = JAJ, A ∈ M. The natural positive cone P associated with the pair (M, ξ0) is the closure of the set {Aj(A)ξ0 : A ∈ M}.
By a general result, the closed convex cone P can be obtained as the closure of the set {∆^{1/4} AA∗ ξ0 : A ∈ M}, and this cone P is self-dual in the sense that {ξ ∈ H : ⟨ξ, η⟩ ≥ 0, ∀ η ∈ P} = P. Then the form (M, H, P, J) is the standard form associated with the pair (M, ξ0). For details we refer to [22] and [7, Sec. 2.5]. We shall use the fact that H is the complexification of the real subspace HJ = {ξ ∈ H : ⟨ξ, η⟩ ∈ R, ∀ η ∈ P}, whose elements are called J-real: H = HJ ⊕ iHJ. The cone P gives rise to a structure of ordered Hilbert space on HJ (denoted by ≤) and to an anti-unitary involution J on H, which preserves P and HJ: J(ξ + iη) = ξ − iη, ∀ ξ, η ∈ HJ. Also note that any J-real element ξ ∈ HJ can be decomposed uniquely as a difference of two mutually orthogonal, positive elements, called the positive and negative part of ξ, respectively: ξ = ξ+ − ξ−, ξ+, ξ− ∈ P and ⟨ξ+, ξ−⟩ = 0. The order interval {η ∈ H : 0 ≤ η ≤ ξ0} will be denoted by [0, ξ0]. This is a closed convex subset of H, and we shall denote the nearest point projection onto [0, ξ0] by η ↦ ηI.
A bounded operator A on H is called J-real if AJ = JA and positive preserving if AP ⊂ P. The semigroup {Tt}t≥0 is said to be J-real if Tt is J-real for any t ≥ 0 and it is called positive preserving if Tt is positive preserving for any t ≥ 0. A bounded operator A : H → H is called sub-Markovian (with respect to ξ0) if 0 ≤ ξ ≤ ξ0 implies 0 ≤ Aξ ≤ ξ0. A is called Markovian if it is sub-Markovian and also Aξ0 = ξ0. A semigroup {Tt}t≥0 is said to be sub-Markovian (with respect to ξ0) if Tt is sub-Markovian for every t ≥ 0. The semigroup {Tt}t≥0 is called Markovian if Tt is Markovian for every t ≥ 0.
Next, we consider a sesquilinear form on some linear manifold of H: E(·, ·) : D(E) × D(E) → C. We also consider the associated quadratic form: E[·] : D(E) → C, E[ξ] := E(ξ, ξ). A real valued quadratic form E[·] is said to be semi-bounded if inf{E[ξ] : ξ ∈ D(E), ‖ξ‖ = 1} = −b > −∞.
A quadratic form (E, D(E)) is said to be J-real if JD(E) ⊂ D(E) and E[Jξ] = E[ξ] for any ξ ∈ D(E). For a given semi-bounded quadratic form E, one considers the inner product given by ⟨ξ, η⟩λ := E(ξ, η) + λ⟨ξ, η⟩, for λ > b. The form E is closed if D(E) is a Hilbert space for some of the above norms. The form E is called closable if it admits a closed extension. Associated to a semi-bounded closed form E, there is a self-adjoint operator (H, D(H)) and a strongly continuous, symmetric semigroup {Tt}t≥0. Each of the above objects determines uniquely the others according to well-known relations (see [7, Sec. 3.1]).
From now on we will consider only J-real, real-valued, semi-bounded, densely defined quadratic forms. It is easy to check that these forms satisfy the relation: E[ξ + iη] = E[ξ] + E[η] for all ξ + iη ∈ D(E)J + iD(E)J = D(E), where D(E)J := D(E) ∩ HJ.
A J-real, real-valued, densely defined quadratic form (E, D(E)) is called Markovian with respect to ξ0 ∈ P if
\[
\eta \in D(E)_J \ \text{ implies } \ \eta_I \in D(E) \ \text{ and } \ E[\eta_I] \le E[\eta]. \tag{2.1}
\]
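Condition (2.1) is perhaps easiest to picture in a commutative toy model, which is an assumption made here for illustration and not the setting of the paper: take M to be the diagonal matrices acting on C^n, ξ0 = (1, …, 1) and J complex conjugation, so that P is the nonnegative orthant, the J-real vectors are the real ones, and the nearest point projection onto [0, ξ0] clips every coordinate to [0, 1]. The sketch below tests E[ηI] ≤ E[η] on random real vectors for a weighted difference form; the names `clip01`, `energy` and `markovian_on_samples` are hypothetical.

```python
import random


def clip01(v):
    """Nearest point projection of a real vector onto the box [0, 1]^n."""
    return [min(1.0, max(0.0, t)) for t in v]


def energy(v, weights):
    """Quadratic form E[v] = sum over i < j of w_ij * (v_i - v_j)^2, with w_ij >= 0."""
    n = len(v)
    return sum(
        weights[i][j] * (v[i] - v[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def markovian_on_samples(weights, trials=200, seed=0):
    """Check E[clip(v)] <= E[v] on random real vectors (a numerical test, not a proof)."""
    rng = random.Random(seed)
    n = len(weights)
    for _ in range(trials):
        v = [rng.uniform(-2.0, 3.0) for _ in range(n)]
        if energy(clip01(v), weights) > energy(v, weights) + 1e-12:
            return False
    return True


# Illustrative symmetric nonnegative weights on three points.
EXAMPLE_WEIGHTS = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 0.5],
    [2.0, 0.5, 0.0],
]
```

In this model the inequality holds because clipping never increases the distance between two coordinates, so each term of the sum can only decrease.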
A closed Markovian form is called a Dirichlet form. Next, we collect the main results of [2]. Let (E, D(E)) be a J-real, real-valued, densely defined closed form. Assume that the following properties hold:
(a) ξ0 ∈ D(E),
(b) E(ξ, ξ) ≥ 0 for ξ ∈ D(E),
(c) ξ ∈ D(E)J implies ξ± ∈ D(E) and E(ξ+, ξ−) ≤ 0. (2.2)
Then E is a Dirichlet form if and only if E(ξ, ξ0) ≥ 0 for all ξ ∈ D(E) ∩ P. The above result follows from [2, Propositions 4.5(b) and 4.10(ii)]. The following is [2, Theorem 4.11]: Let {Tt}t≥0 be a J-real, strongly continuous, symmetric semigroup on H and let (E, D(E)) be the associated densely defined J-real, real-valued quadratic form. Then the following are equivalent:
(a) {Tt}t≥0 is sub-Markovian,
(b) (E, D(E)) is a Dirichlet form. (2.3)
We refer the reader to [2] for the details.
Let γ : M → M be a Z2-grading of M, so that γ is a ∗-automorphism of M which is involutive: γ² = id. Thus M is a direct sum of Me := {A ∈ M : γ(A) = A} and Mo := {A ∈ M : γ(A) = −A}. The elements of Me are called even and those of Mo odd. The identity element 1 ∈ M is even. A superderivation δ on a Z2-graded von Neumann algebra (M, γ) is a linear map satisfying, for all A, B ∈ M,
(a) δ(AB) = δ(A)B + γ(A)δ(B),
(b) δ(γ(A)) = −γ(δ(A)). (2.4)
A superderivation δ is called inner if there is x ∈ M such that
\[
\delta(A) = xA - \gamma(A)x, \qquad \forall A \in M. \tag{2.5}
\]
The element x which defines an inner superderivation as in (2.5) is odd. See [5, Lemma 1.1]. In this paper we assume that ξ0 is invariant under γ in the sense that
\[
\langle \xi_0, A\xi_0 \rangle = \langle \xi_0, \gamma(A)\xi_0 \rangle, \qquad A \in M. \tag{2.6}
\]
Then there exists a unitary operator Uγ on H such that
\[
\gamma(A) = U_\gamma A U_\gamma^{-1}, \quad A \in M, \qquad U_\gamma P \subset P, \qquad U_\gamma \xi_0 = \xi_0, \qquad U_\gamma^2 = 1. \tag{2.7}
\]
In fact Uγ Aξ0 = γ(A)ξ0, where A ∈ M. See [7, proof of Corollary 2.5.32]. Here we have used the property γ² = id for the last relation. In this case, it can be checked that Uγ commutes with ∆ and J, which implies that for any t ∈ R, A ∈ M,
\[
\gamma(\sigma_t(A)) = \sigma_t(\gamma(A)), \qquad \gamma(j(A)) = j(\gamma(A)). \tag{2.8}
\]
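The two superderivation properties in (2.4) can be verified by hand for the inner map (2.5) in a small matrix model. The sketch below is an illustration under stated assumptions rather than part of the construction: M is taken to be the 2 × 2 integer matrices, the grading is γ(A) = UAU with U = diag(1, −1), and x is off-diagonal, hence odd. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[s - t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]


def neg(a):
    """Entrywise negation of a matrix."""
    return [[-s for s in row] for row in a]


U = [[1, 0], [0, -1]]  # grading unitary, U * U = identity


def gamma(a):
    """Grading automorphism gamma(A) = U A U^{-1}, with U^{-1} = U."""
    return matmul(matmul(U, a), U)


def delta(x, a):
    """Inner superderivation (2.5): delta(A) = x A - gamma(A) x."""
    return sub(matmul(x, a), matmul(gamma(a), x))


def leibniz_holds(x, a, b):
    """Property (2.4)(a): delta(AB) = delta(A) B + gamma(A) delta(B)."""
    lhs = delta(x, matmul(a, b))
    rhs = add(matmul(delta(x, a), b), matmul(gamma(a), delta(x, b)))
    return lhs == rhs


def parity_holds(x, a):
    """Property (2.4)(b): delta(gamma(A)) = -gamma(delta(A)); requires x odd."""
    return delta(x, gamma(a)) == neg(gamma(delta(x, a)))
```

Property (a) holds for any x, while (b) uses γ(x) = −x; choosing a diagonal (even) x makes `parity_holds` fail for generic A, in line with the remark that the element x defining an inner superderivation is odd.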
In order to express Dirichlet forms, let us introduce the notion of admissible functions in [1].
Definition 2.1. An analytic function f : D → C on a domain D containing the strip Im z ∈ [−1/4, 1/4] is said to be admissible if the following properties hold:
(a) f(t) ≥ 0 for all t ∈ R,
(b) f(t + i/4) + f(t − i/4) ≥ 0 for all t ∈ R,
(c) there exist M > 0 and p > 1 such that the bound |f(t + is)| ≤ M(1 + |t|)^{−p} holds uniformly in s ∈ [−1/4, 1/4].
We remark that there exists an admissible function satisfying Definition 2.1. See [1, Lemma 3.1].
We are ready to give a construction of Dirichlet forms on the standard form (M, H, P, J) associated with the pair (M, ξ0), where M is a von Neumann algebra Z2-graded by γ. Denote by Man the dense subset of M consisting of σt-analytic elements on a domain containing the strip I_{1/2} := {z : |Im z| ≤ 1/2} [7]. By [7, Proposition 2.5.21], any element A ∈ Man is strongly analytic. For a given admissible function f and odd element x ∈ Mo ∩ Man, define a sesquilinear form E : H × H → C by
\[
\begin{aligned}
E(\eta, \xi) ={}& \int \big\langle (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\eta,\ (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\xi \big\rangle f(t)\, dt\\
& + \int \big\langle (\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\eta,\ (\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi \big\rangle f(t)\, dt\\
\equiv{}& E^{(1)}(\eta, \xi) + E^{(2)}(\eta, \xi).
\end{aligned} \tag{2.9}
\]
Here Uγ is the unitary operator satisfying γ(A) = Uγ A Uγ^{−1}, A ∈ M. The associated quadratic form E[·] is given by
\[
E[\xi] = \int \big\| (\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)\xi \big\|^2 f(t)\, dt + \int \big\| (\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi \big\|^2 f(t)\, dt. \tag{2.10}
\]
The form given above is a bounded form since ‖σ_{t−i/4}(x)‖ = ‖σ_{−i/4}(x)‖ for any x ∈ Man. Recall that Mξ0 is a dense subset of H and x is an odd element. Using the fact that JBξ0 = ∆^{1/2} B∗ ξ0, B ∈ M, one obtains that for any A ∈ M,
\[
\begin{aligned}
(\sigma_{t-i/4}(x^\#) - j(\sigma_{t-i/4}((x^\#)^*))U_\gamma) A\xi_0
&= \sigma_{t-i/4}(x^\#) A\xi_0 - \gamma(A)\, j(\sigma_{t-i/4}((x^\#)^*))\xi_0\\
&= (\sigma_{t-i/4}(x^\#) A - \gamma(A)\sigma_{t-i/4}(x^\#))\xi_0,
\end{aligned} \tag{2.11}
\]
where x^# stands for x or x∗. Thus we have used inner superderivations as in (2.5) to define the sesquilinear form in (2.9). The following is the main result, which corresponds to [1, Theorem 3.1].
Theorem 2.1. For a given admissible function f and an odd element x ∈ Mo ∩ Man, let (E, H) be defined as in (2.9). Let H be the self-adjoint operator associated with (E, H), i.e., E[ξ] = ⟨ξ, Hξ⟩, ∀ ξ ∈ H. Assume that there exists a constant M > 0 such that the bound
\[
\sup_{s \in [-1/4, 1/4]} \| \sigma_{t+is}(x) \| \le M \tag{2.12}
\]
holds uniformly in t ∈ R. Then the following properties hold:
(a) Hξ0 = 0,
(b) E[Jξ] = E[ξ], ∀ ξ ∈ H (J-real),
(c) E(ξ+, ξ−) ≤ 0 for all ξ ∈ HJ.
Furthermore (E, H) is a Dirichlet form.
We will give the proof of Theorem 2.1 at the end of this section. The following is a consequence of Theorem 2.1:
Theorem 2.2. Let H be the self-adjoint operator associated with (E, H) defined as in (2.9) and let Tt = e^{−tH}, t ≥ 0. Then {Tt}t≥0 is a J-real, strongly continuous, symmetric Markovian semigroup.
Proof. Clearly E[·] ≥ 0. Thus Theorem 2.1 and (2.3) imply that {Tt}t≥0 is J-real, strongly continuous, sub-Markovian. It follows from Theorem 2.1(a) that Tt ξ0 = ξ0 for any t ≥ 0. Thus {Tt}t≥0 is Markovian.
Remark 2.1. Consider the symmetric embedding i0 : M → H, i0(A) = ∆^{1/4} Aξ0. Define the maps St on M by
\[
S_t : M \to M, \qquad i_0 \circ S_t \equiv T_t \circ i_0.
\]
It follows from [2, Theorem 2.12] that {St }t≥0 is a weak* continuous, Markovian semigroup on M. Notice that St preserves parity: St (γ(A)) = γ(St (A)), where A ∈ M. This can be checked from the fact Uγ HUγ−1 = H where H is defined in Theorem 2.2. Remark 2.2. The notion of admissible function similar to that used in the formula (2.9) (also [1, (3.4)]) has already appeared in the diffusion type generator [15, formula (2)]. However, in order to make the generator in [15] be well-defined, one needs some kind of asymptotic abelianness [15–17]. On the other hand the Dirichlet forms we introduced in (2.9) do not need any asymptotic abelianness and are reduced to usual (canonical) forms [23, 5] in the case of tracial states. We now produce the proof of Theorem 2.1. Let us mention that we modify [1, proof of Theorem 3.1]. Proof of Theorem 2.1. (a) Notice that Uγ ξ0 = ξ0 , JAξ0 = ∆1/2 A∗ ξ0 for any A ∈ M. We obtain (σt−i/4 (x) − j(σt−i/4 (x∗ ))Uγ )ξ0 = ∆1/4 σt (x)ξ0 − J∆1/4 σt (x∗ )∆−1/4 ξ0 = ∆1/4 σt (x)ξ0 − ∆1/4 σt (x)ξ0 = 0, and also (σt−i/4 (x∗ ) − j(σt−i/4 (x))Uγ )ξ0 = 0. Thus (a) follows from (2.9) and the above facts. (b) Notice that Uγ xUγ−1 = −x for any x ∈ Mo . The first relation in (2.8) implies that σt−i/4 (x) = −Uγ σt−i/4 (x)Uγ−1 .
(2.13)
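Thus a direct calculation shows that for any ξ ∈ H,
\[
\|(\sigma_{t-i/4}(x) - j(\sigma_{t-i/4}(x^*))U_\gamma)J\xi\|^2
= \|U_\gamma J(-j(\sigma_{t-i/4}(x))U_\gamma + \sigma_{t-i/4}(x^*))\xi\|^2
= \|(\sigma_{t-i/4}(x^*) - j(\sigma_{t-i/4}(x))U_\gamma)\xi\|^2 .
\]
Here we have used $U_\gamma^{-1} = U_\gamma$ and the fact that $U_\gamma$ commutes with J. Thus we have $\mathcal{E}^{(1)}[J\xi] = \mathcal{E}^{(2)}[\xi]$. The same method also implies that $\mathcal{E}^{(2)}[J\xi] = \mathcal{E}^{(1)}[\xi]$.
(c) By the expression of $\mathcal{E}(\xi, \eta)$ in (2.9), $\mathcal{E}(\xi_+, \xi_-)$ can be written as
\[
\mathcal{E}(\xi_+, \xi_-) = \mathcal{E}^{(1)}(\xi_+, \xi_-) + \mathcal{E}^{(2)}(\xi_+, \xi_-) = (I^{(1)} + II^{(1)}) + (I^{(2)} + II^{(2)}) \tag{2.14}
\]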
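where
\[
\begin{aligned}
I^{(1)} &= \int \bigl( \langle \sigma_{t-i/4}(x)\xi_+ , \sigma_{t-i/4}(x)\xi_- \rangle
 + \langle \sigma_{t-i/4}(x^*)\xi_- , \sigma_{t-i/4}(x^*)\xi_+ \rangle \bigr) f(t)\,dt , \\
II^{(1)} &= -\int \bigl( \langle \sigma_{t-i/4}(x)\xi_+ , j(\sigma_{t-i/4}(x^*))U_\gamma \xi_- \rangle
 + \langle j(\sigma_{t-i/4}(x^*))U_\gamma \xi_+ , \sigma_{t-i/4}(x)\xi_- \rangle \bigr) f(t)\,dt
\end{aligned}
\tag{2.15}
\]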
and $I^{(2)}$ and $II^{(2)}$ are obtained from $I^{(1)}$ and $II^{(1)}$ respectively by replacing x by x∗ in the above. Here we have used the fact that $U_\gamma = U_\gamma^{-1}$, the antilinearity of J and (2.13) to obtain the second term of $I^{(1)}$ in (2.15). As a consequence of [22, Theorem 4(7)], Mξ+ ⊥ Mξ−, which implies $I^{(1)} = 0$. Similarly $I^{(2)} = 0$. See also [2, proof of Proposition 5.3(ii)].
Next, we consider $II^{(1)}$. Since $\sigma_{t-i/4}(x)^* = \sigma_{t+i/4}(x^*)$, it follows from (2.7) and (2.13) that
\[
\langle \sigma_{t-i/4}(x)\xi_+ , j(\sigma_{t-i/4}(x^*))U_\gamma \xi_- \rangle
= \langle \xi_+ , \sigma_{t+i/4}(x^*)\, j(\sigma_{t+i/4}(x)^*)U_\gamma \xi_- \rangle \tag{2.16}
\]
and
\[
\begin{aligned}
\langle j(\sigma_{t-i/4}(x^*))U_\gamma \xi_+ , \sigma_{t-i/4}(x)\xi_- \rangle
&= \langle -U_\gamma j(\sigma_{t-i/4}(x^*))\xi_+ , -U_\gamma \sigma_{t-i/4}(x)U_\gamma \xi_- \rangle \\
&= \langle \xi_+ , \sigma_{t-i/4}(x)\, j(\sigma_{t-i/4}(x^*)^*)U_\gamma \xi_- \rangle .
\end{aligned}
\tag{2.17}
\]
Substituting (2.16) and (2.17) into the second expression of (2.15), we get that
\[
II^{(1)} = -\int \langle \xi_+ , \sigma_{t+i/4}(x^*)\, j(\sigma_{t+i/4}(x)^*)U_\gamma \xi_- \rangle f(t)\,dt
 - \int \langle \xi_+ , \sigma_{t-i/4}(x)\, j(\sigma_{t-i/4}(x^*)^*)U_\gamma \xi_- \rangle f(t)\,dt .
\]
Notice that for any x ∈ Man and ξ ∈ H, the map
\[
z \mapsto j(\sigma_z(x)^*)U_\gamma \xi
\]
is analytic on a domain containing the strip $I_{1/2}$. In fact, the analyticity follows from the fact that
\[
\langle \eta , j(\sigma_z(x)^*)U_\gamma \xi \rangle = \langle \sigma_z(x)^* J U_\gamma \xi , J\eta \rangle = \langle J U_\gamma \xi , \sigma_z(x) J\eta \rangle
\]
for any η, ξ ∈ H, and that weak analyticity implies strong analyticity (see [24, Theorem VI.4]). Using the Cauchy integral theorem, the assumption in the theorem, the property (c) in Definition 2.1 and $\sigma_t(x)^* = \sigma_t(x^*)$, we obtain
\[
II^{(1)} = -\int \langle \xi_+ , \sigma_t(x^*)\, j(\sigma_t(x^*))U_\gamma \xi_- \rangle f\Bigl(t - \frac{i}{4}\Bigr)\,dt
 - \int \langle \xi_+ , \sigma_t(x)\, j(\sigma_t(x))U_\gamma \xi_- \rangle f\Bigl(t + \frac{i}{4}\Bigr)\,dt .
\]
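Replacing x by x∗ in the above, we obtain the expression of $II^{(2)}$. Thus we get
\[
II = -\int \bigl\langle \xi_+ , [\sigma_t(x)\, j(\sigma_t(x)) + \sigma_t(x^*)\, j(\sigma_t(x^*))]U_\gamma \xi_- \bigr\rangle
 \cdot \Bigl( f\Bigl(t - \frac{i}{4}\Bigr) + f\Bigl(t + \frac{i}{4}\Bigr) \Bigr)\,dt .
\]
Recall that $U_\gamma P \subset P$ in (2.7). Since $\sigma_t(x^\#)\, j(\sigma_t(x^\#))U_\gamma \xi_- \in P$,
\[
\langle \xi_+ , \sigma_t(x^\#)\, j(\sigma_t(x^\#))U_\gamma \xi_- \rangle \ge 0 \quad \text{for all } t \in \mathbb{R} ,
\]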
where x# is either x or x∗. By the property (b) in Definition 2.1, we conclude that II ≤ 0. This proves part (c) of the theorem. Clearly E[·] ≥ 0. Note that E(ξ, ξ0) = 0 for all ξ ∈ H. Theorem 2.1(b) and (c) imply that the form (E, H) satisfies the conditions in (2.2). Thus (E, H) is a Dirichlet form.

3. Markovian Semigroups on CAR Algebras with Quasi-Free States

In this section, we apply the results in Sec. 2 to construct Dirichlet forms and associated Markovian semigroups on CAR algebras with respect to quasi-free states. We first review the notion of CAR algebras. For the details we refer to [7, Sec. 5.2.2].
Let h0 be a separable pre-Hilbert space with an inner product (·, ·) and h the completion of h0. Let A(h) be the C∗-algebra generated by the identity 1 and elements a(f), f ∈ h, satisfying
\[
\begin{aligned}
&\text{(a) } f \mapsto a(f) \text{ is antilinear,} \\
&\text{(b) } \{a(f), a(g)\} = 0 , \\
&\text{(c) } \{a(f), a(g)^*\} = (f, g)\mathbf{1}
\end{aligned}
\tag{3.1}
\]
for all f, g ∈ h, where {A, B} := AB + BA. Notice that
\[
\|a(f)\| = \|f\| , \quad \forall f \in h , \tag{3.2}
\]
and so A(h0) = A(h). Let γ : A(h) → A(h) be the ∗-automorphism defined by
\[
\gamma(a(f)) = -a(f) \tag{3.3}
\]
for all f ∈ h. Then (A(h), γ) is a Z2-graded C∗-algebra.
Next, we describe quasi-free states on A(h). Let A be a bounded and nonnegative operator on h. Recall that a vector g ∈ h is called an analytic vector for an operator B on h if g ∈ D(B^n) for each n ∈ N and, for some t > 0,
\[
\sum_{n=0}^{\infty} \frac{\|B^n g\|}{n!}\, t^n < \infty .
\]
In the rest of this paper, we assume that A satisfies the following properties:
Assumption 3.1.
(a) There exists α > 0 such that 0 < A ≤ α1.
(b) The inverse $A^{-1}$ of A exists as an (unbounded) self-adjoint and positive operator on h.
(c) For any z ∈ C, $A^z$ leaves h0 invariant, i.e. $A^z h_0 \subset h_0$. Moreover, $z \mapsto A^z f$ is entire analytic for any f ∈ h0.
(d) Any g ∈ h0 is an analytic vector for $A^{-1/2}$.
We remark that a dense subspace h0 of h satisfying Assumption 3.1 exists by the spectral theorem.
Example 3.1 (Ideal Fermi Gases). Let h be the space $L^2(\mathbb{R}^d, dx)$ and ∆ the Laplacian operator on h. Let A be given by $A = e^{-\beta(-\Delta - \mu \mathbf{1})}$, where β > 0, µ ∈ R. The subspace h0 is given by $h_0 = \{f \in h : \hat{f} \in C_c(\mathbb{R}^d)\}$, where $\hat{f}$ denotes the Fourier transform of f. Then Assumption 3.1 is satisfied with $\alpha = e^{\beta\mu}$.
For a given bounded operator A on h satisfying Assumption 3.1, the gauge invariant quasi-free state ω on A(h) is defined by
\[
\omega(a(f_m)^* \cdots a(f_1)^* a(g_1) \cdots a(g_n)) = \delta_{nm} \det\bigl((g_i , A(1 + A)^{-1} f_j)\bigr) \tag{3.4}
\]
for any f1, . . . , fm, g1, . . . , gn ∈ h. Let σt : A(h) → A(h) be the ∗-automorphisms defined by
\[
\sigma_t(a(f)) = a(A^{it} f) \tag{3.5}
\]
for any f ∈ h, t ∈ R. Then one can check the KMS condition $\omega(B\sigma_{-i}(C)) = \omega(CB)$ for any analytic elements B, C ∈ A(h) [7].
Let (Hω, πω, Ωω) be the GNS representation of (A(h), ω). Define $M_\omega := \pi_\omega(A(h))''$, $a(f)_\omega := \pi_\omega(a(f))$, $a(f)^*_\omega := \pi_\omega(a(f)^*)$, f ∈ h, $\sigma_t^\omega := \pi_\omega \sigma_t \pi_\omega^{-1}$ and $\gamma_\omega := \pi_\omega \gamma \pi_\omega^{-1}$. From now on we suppress ω from the notation, e.g. $M := M_\omega$, $H := H_\omega$, $a(f) := a(f)_\omega$, $a(f)^* := a(f)^*_\omega$, $\sigma_t := \sigma_t^\omega$ and $\gamma := \gamma_\omega$, etc. By continuity, σt and γ extend to M. We also write ξ0 = Ωω. Notice that ω satisfies the σt-KMS conditions [7]. Thus, by the uniqueness of the modular automorphism, σt is the modular automorphism on M. See [7, Theorem 5.3.10]. Let ∆ and J denote the modular operator and the modular conjugation associated with the pair (M, ξ0) respectively. Thus $\sigma_t(B) = \Delta^{it} B \Delta^{-it}$, ∀ B ∈ M. It follows from (3.3) and (3.4) that
\[
\omega(\gamma(B)) = \omega(B) , \quad \forall B \in M .
\]
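Returning to Example 3.1, the following Python sketch evaluates the Fourier multipliers of A and of A(1 + A)^{-1}, the operator entering (3.4). It assumes the Fourier convention in which −∆ acts as multiplication by |k|^2 (an assumption, not stated in the text), so that A(1 + A)^{-1} becomes the usual Fermi-Dirac occupation. The function names and the sample values of β and µ are hypothetical.

```python
import math

def symbol_A(k2, beta, mu):
    """Fourier multiplier of A = exp(-beta(-Laplacian - mu)); k2 stands for |k|^2
    under the assumed Fourier convention."""
    return math.exp(-beta * (k2 - mu))

def occupation(k2, beta, mu):
    """Fourier multiplier of A(1 + A)^{-1}, the operator appearing in (3.4)."""
    a = symbol_A(k2, beta, mu)
    return a / (1.0 + a)

beta, mu = 2.0, 0.5                  # hypothetical sample values
alpha = math.exp(beta * mu)          # the bound alpha of Assumption 3.1(a)
grid = [0.1 * n for n in range(200)] # sample values of |k|^2

# 0 < A <= alpha on the sampled momenta
assert all(0.0 < symbol_A(k2, beta, mu) <= alpha for k2 in grid)

# A(1 + A)^{-1} coincides with the Fermi-Dirac distribution
fermi = [1.0 / (math.exp(beta * (k2 - mu)) + 1.0) for k2 in grid]
assert all(abs(occupation(k2, beta, mu) - fd) < 1e-12 for k2, fd in zip(grid, fermi))
```

The supremum of the multiplier of A is attained at k = 0, which is where the value α = e^{βµ} quoted in the example comes from.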
Since ω is γ-invariant, that is, ⟨ξ0, γ(B)ξ0⟩ = ⟨ξ0, Bξ0⟩, ∀ B ∈ M, there exists a unitary operator Uγ on H satisfying the properties in (2.7). For any g ∈ h, define an odd element B(g) ∈ Mo by
\[
B(g) := \frac{1}{\sqrt{2}} \bigl( a(g) + a(g)^* \bigr) . \tag{3.6}
\]
Using the antilinearity of a(g) and (3.6), we get that for any g ∈ h,
\[
a(g) = \frac{1}{\sqrt{2}} \bigl( B(g) + iB(ig) \bigr) , \qquad
a(g)^* = \frac{1}{\sqrt{2}} \bigl( B(g) - iB(ig) \bigr) . \tag{3.7}
\]
From the CAR relations (3.1), we obtain that for any f, g ∈ h,
\[
\{a(f), B(g)\} = \frac{1}{\sqrt{2}} (f, g)\mathbf{1} , \qquad
\{a(f)^*, B(g)\} = \frac{1}{\sqrt{2}} (g, f)\mathbf{1} . \tag{3.8}
\]
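The relations (3.1), (3.3) and (3.6)–(3.8) can be checked numerically in the smallest nontrivial case. The sketch below assumes h = C^2 and realizes a(f) by 4 × 4 Jordan-Wigner matrices; this particular matrix realization, the helper names and the sample vectors are illustrative choices, not part of the paper. The inner product is taken antilinear in its first argument, as required by (3.1)(a) and (3.1)(c).

```python
import math

# Minimal dense-matrix helpers on lists of lists of numbers.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(len(X))] for i in range(len(X[0]))]

def kron(X, Y):
    return [[x * y for x in rx for y in ry] for rx in X for ry in Y]

def anticomm(X, Y):
    return add(matmul(X, Y), matmul(Y, X))

def close(X, Y, tol=1e-12):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

I2 = [[1, 0], [0, 1]]
SZ = [[1, 0], [0, -1]]
SM = [[0, 1], [0, 0]]                  # empties one occupied mode
ID = kron(I2, I2)
ZERO = scale(0, ID)
MODES = [kron(SM, I2), kron(SZ, SM)]   # Jordan-Wigner a_1, a_2
PARITY = kron(SZ, SZ)                  # unitary implementing gamma on this toy space

def inner(f, g):
    """(f, g), antilinear in f and linear in g."""
    return sum(complex(x).conjugate() * y for x, y in zip(f, g))

def a(f):
    """Annihilation operator a(f), antilinear in f."""
    out = ZERO
    for fk, ak in zip(f, MODES):
        out = add(out, scale(complex(fk).conjugate(), ak))
    return out

def B(g):
    """Odd self-adjoint element B(g) of (3.6)."""
    return scale(1 / math.sqrt(2), add(a(g), dagger(a(g))))

f = [1 + 2j, 0.5j]                     # hypothetical test vectors
g = [0.3, -1 + 1j]
ig = [1j * x for x in g]
r = 1 / math.sqrt(2)

assert close(anticomm(a(f), a(g)), ZERO)                               # (3.1)(b)
assert close(anticomm(a(f), dagger(a(g))), scale(inner(f, g), ID))     # (3.1)(c)
assert close(matmul(matmul(PARITY, a(f)), PARITY), scale(-1, a(f)))    # (3.3)
assert close(a(g), scale(r, add(B(g), scale(1j, B(ig)))))              # (3.7)
assert close(anticomm(a(f), B(g)), scale(r * inner(f, g), ID))         # (3.8)
assert close(anticomm(dagger(a(f)), B(g)), scale(r * inner(g, f), ID)) # (3.8)
```

The check of (3.7) relies on antilinearity: a(ig) = −i a(g), so that iB(ig) = (a(g) − a(g)∗)/√2.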
Denote by Mfin the algebra generated by 1 and a(f)#, f ∈ h0, where a(f)# is either a(f) or a(f)∗ for f ∈ h0. Because of (3.7), Mfin is equal to the algebra generated by 1 and B(f), f ∈ h0. Since Mfin is norm dense in πω(A(h)), Mfin ξ0 is also dense in H. For any f ∈ h0, z ∈ C, we write
\[
\sigma_z(a(f)) := a(A^{i\bar{z}} f) , \qquad \sigma_z(a(f)^*) := a(A^{iz} f)^* . \tag{3.9}
\]
In fact, using (3.4) and Assumption 3.1(c), one can check that for any f ∈ h0 and ξ ∈ Mfin ξ0, the map $z \mapsto \sigma_z(a(f)^\#)\xi$ is the analytic extension on C of the map $t \mapsto \sigma_t(a(f)^\#)\xi$ (where a(f)# is either a(f) or a(f)∗). Thus for any g ∈ h0, a(g), a(g)∗ and B(g) are σt-entire analytic odd elements of M.
Let us turn to the construction of a Dirichlet form which generates the symmetric Markovian semigroup on the CAR algebra A(h) with respect to the quasi-free state ω. An admissible function f is said to be normalized if $\int f(t)\,dt = 1$. For a given normalized function f and a complete orthonormal system (CONS) $\{g_j\}_{j=1}^{\infty} \subset h_0$ for h, define a sesquilinear form E : D(E) × D(E) → C as follows:
\[
\begin{aligned}
D(\mathcal{E}) &= \Bigl\{ \xi \in H : \sum_{j=1}^{\infty} \mathcal{E}_j(\xi, \xi) < \infty \Bigr\} , \\
\mathcal{E}(\eta, \xi) &= \sum_{j=1}^{\infty} \mathcal{E}_j(\eta, \xi) , \quad \eta, \xi \in D(\mathcal{E}) ,
\end{aligned}
\tag{3.10}
\]
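where for each j ∈ N, D(Ej) = H and
\[
\begin{aligned}
\mathcal{E}_j(\eta, \xi) ={}& \int \bigl\langle (\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^*))U_\gamma)\eta ,\,
 (\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^*))U_\gamma)\xi \bigr\rangle f(t)\,dt \\
&+ \int \bigl\langle (\sigma_{t-i/4}(a(g_j)^*) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma)\eta ,\,
 (\sigma_{t-i/4}(a(g_j)^*) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma)\xi \bigr\rangle f(t)\,dt .
\end{aligned}
\tag{3.11}
\]
We also define the associated quadratic forms by
\[
\begin{aligned}
\mathcal{E}_j[\xi] &= \mathcal{E}_j(\xi, \xi) , \quad \xi \in H , \ j \in \mathbb{N} , \\
\mathcal{E}[\xi] &= \sum_{j=1}^{\infty} \mathcal{E}_j[\xi] , \quad \xi \in D(\mathcal{E}) .
\end{aligned}
\tag{3.12}
\]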
We remark that the expression Ej(η, ξ) in (3.11) can be obtained from E(η, ξ) in (2.9) by replacing x by a(gj). The following are the main results in this section. It turns out that the form defined as in (3.10) and (3.11) is independent of the CONS {gj} ⊂ h0 for h and the normalized admissible function f we have chosen:
Theorem 3.1. Let (E, D(E)) be defined as in (3.10) and (3.11). Then the form (E, D(E)) is a densely defined Dirichlet form, and independent of the CONS {gj} ⊂ h0 for h and the normalized admissible function f we have chosen. Moreover, let {Tt}t≥0 be the semigroup associated to the form (E, D(E)). Then the semigroup {Tt}t≥0 is J-real, strongly continuous, symmetric and Markovian.
Theorem 3.2. Let {Tt}t≥0 be the symmetric Markovian semigroup associated to the form (E, D(E)) in Theorem 3.1 and H the Dirichlet operator, i.e., $T_t = e^{-tH}$, t ≥ 0. Then the following results hold:
(a) H is essentially self-adjoint on Mfin ξ0.
(b) Zero is a simple eigenvalue of H with eigenvector ξ0. Moreover (0, 2) ∩ σ(H) = ∅.
By the spectral theorem, Theorem 3.2(b) implies that for any ξ ∈ H and t ≥ 0,
\[
\|T_t \xi - \langle \xi_0 , \xi \rangle \xi_0\|_H \le e^{-2t}\, \|\xi - \langle \xi_0 , \xi \rangle \xi_0\|_H .
\]
Thus {Tt}t≥0 converges to the equilibrium exponentially fast. Note that Theorem 3.2(b) implies that the vector ξ0 is a simple, strictly positive ground state for the generator H. In view of [20], Tt satisfies the indecomposability and the ergodicity (for each positive ξ, η, there exists t > 0 such that ⟨ξ, Tt η⟩ > 0). See [20, Theorem 4.3].
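The spectral statement of Theorem 3.2(b) can be illustrated on a toy model. The sketch below assumes a finite-dimensional h on which A is diagonal (so it does not literally cover the setting of the paper), and it takes for granted the Fock-space form of H obtained later in (4.13), namely dΓ(B) ⊗ 1 + 1 ⊗ dΓ(B) with B = A^{-1/2} + A^{1/2} from (4.1). The eigenvalues chosen for A and all function names are hypothetical.

```python
import itertools
import math

def b_eigs(a_eigs):
    """Eigenvalues of B = A^{-1/2} + A^{1/2} computed from those of A, cf. (4.1)."""
    return [a ** -0.5 + a ** 0.5 for a in a_eigs]

def dgamma_spectrum(b):
    """Spectrum of the second quantization of B on the antisymmetric Fock space:
    one eigenvalue (a sum of distinct b's) for each subset of occupied modes."""
    return [sum(s) for r in range(len(b) + 1) for s in itertools.combinations(b, r)]

def generator_spectrum(a_eigs):
    """Spectrum of dGamma(B) (x) 1 + 1 (x) dGamma(B), cf. (4.13)."""
    one_copy = dgamma_spectrum(b_eigs(a_eigs))
    return sorted(x + y for x in one_copy for y in one_copy)

a_eigs = [0.2, 0.7, 3.0]     # hypothetical eigenvalues of A, so 0 < A <= 3
spec = generator_spectrum(a_eigs)
zero_count = sum(1 for e in spec if abs(e) < 1e-12)
gap = min(e for e in spec if e > 1e-12)

def decay_ratio(t):
    """Largest value of ||T_t xi|| / ||xi|| over vectors xi orthogonal to xi_0."""
    return max(math.exp(-t * e) for e in spec if e > 1e-12)

assert zero_count == 1                            # zero is a simple eigenvalue
assert gap >= 2.0                                 # nothing in the open interval (0, 2)
assert decay_ratio(0.3) <= math.exp(-2 * 0.3)     # the e^{-2t} bound
```

In this toy model the gap equals the smallest eigenvalue of B, which is at least 2 because x^{-1/2} + x^{1/2} ≥ 2 for every x > 0.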
Remark 3.1. The main results in this section can be generalized in several ways. For instance, if one replaces gj by $B^\lambda g_j$ (for the definition of B see Eq. (4.1)), j ∈ N, for some λ ∈ R in the definition of E(η, ξ) in (3.10) and (3.11), and modifies Assumption 3.1(d) appropriately, then all of the results in this section still hold with a modified spectral gap in Theorem 3.2(b), i.e. $(0, 2^{1+2\lambda}) \cap \sigma(H) = \emptyset$.
Remark 3.2. The Markovian semigroup in Theorem 3.2 commutes with the modular operator (group). In fact, let the CONS {gj} be used to construct the Dirichlet operator H. See (4.11). It is easy to check that $\Delta^{it} H \Delta^{-it}$ is the Dirichlet operator corresponding to the CONS $\{A^{it} g_j\}$. Since H is independent of the CONS {gj} chosen, $H = \Delta^{it} H \Delta^{-it}$ for any t ∈ R.
Remark 3.3. It may be worthwhile comparing the results in Theorems 3.1 and 3.2 with those in [15, Lemma III.1 and Theorem III.2]. Since the free Fermion system satisfies the condition of L1-asymptotic abelianness [7], the authors of [15] used the formula (2) of [15] to construct a completely positive unit preserving semigroup $Q_t^{f,\phi}$ on (A(h), ω) for any function $f \in F_\beta$ and any vector $\phi \in H_0 \subset h$, where $F_\beta$ and $H_0$ are explicitly defined in [15]. They also showed that if any state $\tilde{\omega}$ is left invariant under $Q_t^{f,\phi}$ for all $f \in F_\beta$ and $\phi \in H_0$, then $\tilde{\omega}$ is the equilibrium state ω. On the other hand, Theorem 3.2(b) implies that the cyclic vector ξ0 corresponding to the quasi-free state ω is the unique invariant vector under Tt and that, for any ξ ∈ H, Tt ξ converges to ⟨ξ0, ξ⟩ξ0 exponentially fast.
In the rest of this section, we give the proof of Theorem 3.1. The proof of Theorem 3.2 is postponed to the next section.
Proof of Theorem 3.1. Notice that a(gj) ∈ Mo ∩ Man, ∀ j ∈ N (see below Eq. (3.9)). Since $\|\sigma_{t+is}(a(g_j))\| = \|A^s g_j\|$ (see (3.9) and (3.2)), the assumption (2.12) is satisfied. Thus Theorem 2.1 implies that each Ej, j ∈ N, is a Dirichlet form.
Thus if one can show that D(E) is dense in H, then it follows from [2, Theorem 5.2(ii)] that (E, D(E)) is a Dirichlet form and it generates a J-real, strongly continuous, symmetric and sub-Markovian semigroup $T_t = e^{-tH}$, t ≥ 0. Using the calculations used in the proof of Theorem 2.1(a), we get that for each j ∈ N,
\[
(\sigma_{t-i/4}(a(g_j)^\#) - j(\sigma_{t-i/4}((a(g_j)^\#)^*))U_\gamma)\xi_0 = 0 .
\]
Thus Hξ0 = 0 and so {Tt}t≥0 is Markovian. Let us show that D(E) is dense. Since Mfin ξ0 is dense in H, it is sufficient to show that Mfin ξ0 is contained in D(E). For any B ∈ M, it follows from (2.11) and (3.9) that
\[
\begin{aligned}
(\sigma_{t-i/4}(a(g_j)) - j(\sigma_{t-i/4}(a(g_j)^*))U_\gamma)B\xi_0
&= a(A^{it-1/4} g_j)B\xi_0 - \gamma(B)\, a(A^{it-1/4} g_j)\xi_0 \\
&= \begin{cases}
[a(A^{it-1/4} g_j), B]\xi_0 & \text{if } B \in M_e , \\
\{a(A^{it-1/4} g_j), B\}\xi_0 & \text{if } B \in M_o
\end{cases}
\end{aligned}
\tag{3.13}
\]
and
\[
(\sigma_{t-i/4}(a(g_j)^*) - j(\sigma_{t-i/4}(a(g_j)))U_\gamma)B\xi_0
= \begin{cases}
[a(A^{it+1/4} g_j)^*, B]\xi_0 & \text{if } B \in M_e , \\
\{a(A^{it+1/4} g_j)^*, B\}\xi_0 & \text{if } B \in M_o .
\end{cases}
\tag{3.14}
\]
Let B ∈ Mfin be any element of the form B = B(f1) · · · B(fn), f1, . . . , fn ∈ h0, n ∈ N (see (3.6) for the definition of B(fi)). Consider the case n = 2m, that is, B ∈ Me. Using the CARs (3.1) and (3.8), we obtain that
\[
\begin{aligned}
[a(A^{it-1/4} g_j), B] &= \sum_{k=1}^{n} \frac{(-1)^{k+1}}{\sqrt{2}}\, (A^{it-1/4} g_j , f_k)\, B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n) , \\
[a(A^{it+1/4} g_j)^*, B] &= \sum_{k=1}^{n} \frac{(-1)^{k+1}}{\sqrt{2}}\, (f_k , A^{it+1/4} g_j)\, B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n) ,
\end{aligned}
\tag{3.15}
\]
where $\widehat{B(f)}$, f ∈ h0, denotes that B(f) is omitted. By the Schwarz inequality and the Bessel inequality it is easy to see that for any h1, h2 ∈ h0, t ∈ R,
\[
\sum_{j=1}^{\infty} |(A^{it} h_1 , g_j)(g_j , A^{it} h_2)| \le \|h_1\|\, \|h_2\| .
\]
Using the above inequality and the fact that ‖B(h)‖ ≤ ‖h‖ for any h ∈ h, we get from (3.13)–(3.15) that
\[
\sum_{j=1}^{\infty} \mathcal{E}_j[B\xi_0] < \infty .
\]
Thus Bξ0 ∈ D(E). The same result holds for odd n. Hence Mfin ξ0 is contained in D(E) and so D(E) is dense.
Next we will show that (E, D(E)) is independent of the choice of the CONS {gj} ⊂ h0 for h and the normalized admissible function f. Using the Parseval relations, we get that for all t ∈ R, k, k′ ∈ N,
\[
\sum_{j=1}^{\infty} (f_k , A^{it-1/4} g_j)(A^{it-1/4} g_j , f_{k'}) = (f_k , A^{-1/2} f_{k'})
\]
and
\[
\sum_{j=1}^{\infty} (A^{it+1/4} g_j , f_k)(f_{k'} , A^{it+1/4} g_j) = (f_{k'} , A^{1/2} f_k) .
\]
It follows from (3.12)–(3.15), the above relations and the dominated convergence theorem that
\[
\begin{aligned}
\mathcal{E}[B\xi_0] = \sum_{k=1}^{n} \sum_{k'=1}^{n} \frac{(-1)^{k+k'}}{2}
 &\bigl\{ (f_k , A^{-1/2} f_{k'}) + (f_{k'} , A^{1/2} f_k) \bigr\} \\
 &\cdot \bigl\langle B(f_1) \cdots \widehat{B(f_k)} \cdots B(f_n)\xi_0 ,\, B(f_1) \cdots \widehat{B(f_{k'})} \cdots B(f_n)\xi_0 \bigr\rangle ,
\end{aligned}
\tag{3.16}
\]
where Bξ0 = B(f1) · · · B(fn)ξ0 ∈ Mfin ξ0. It turns out that Mfin ξ0 is a core for the Dirichlet operator H associated to the form (E, D(E)) (Theorem 3.2(a)). Thus H is independent of the normalized admissible function f and the CONS {gj} ⊂ h0, and so is (E, D(E)).

4. Decomposition of Quasi-Free Hilbert Space: Proof of Theorem 3.2

As in [3], we will decompose the Hilbert space H = Hω into the direct sum of $H^{(m,n)}$, m, n ∈ N ∪ {0}, where $H^{(m,n)}$ is the Hilbert space of m quasi-particles and n anti quasi-particles. We then use the results to prove Theorem 3.2. The decomposition method we use is essentially the same as that in [21]. See also [7, Example 5.2.20]. Denote by B the operator given by
\[
B := A^{-1/2} + A^{1/2} . \tag{4.1}
\]
In this section, a(g)#, g ∈ h0, stands for a(g) or a(g)∗. We write
\[
\delta(a(g)^\#) := a(g)^\# - j(\sigma_{-i/2}((a(g)^\#)^*))U_\gamma = a(g)^\# + U_\gamma\, j(\sigma_{-i/2}((a(g)^\#)^*)) , \tag{4.2}
\]
as a bounded operator on H. For g ∈ h0, C ∈ M, we have
\[
\delta(a(g)^\#)C\xi_0 = a(g)^\# C\xi_0 - \gamma(C)\, a(g)^\# \xi_0 \tag{4.3}
\]
(see the argument of (2.11)). Recall (2.13), $U_\gamma^{-1} = U_\gamma$ and the fact that Uγ commutes with J and ∆. It follows from (3.9) and (4.2) that for any g ∈ h0
\[
\begin{aligned}
\delta(a(B^{-1/2}A^{-1/4}g)) &= a(B^{-1/2}A^{-1/4}g) - j(\sigma_{-i/2}(a(B^{-1/2}A^{-1/4}g)^*))U_\gamma \\
&= a(B^{-1/2}A^{-1/4}g) + U_\gamma\, j(a(B^{-1/2}A^{1/4}g)^*) , \\
\delta(a(B^{-1/2}A^{1/4}g)^*) &= a(B^{-1/2}A^{1/4}g)^* - j(\sigma_{-i/2}(a(B^{-1/2}A^{1/4}g)))U_\gamma \\
&= a(B^{-1/2}A^{1/4}g)^* + U_\gamma\, j(a(B^{-1/2}A^{-1/4}g)) .
\end{aligned}
\tag{4.4}
\]
Since $(j(a(g)))^* = j(a(g)^*)$ and $U_\gamma = U_\gamma^{-1} = U_\gamma^*$, a computation shows that
\[
\begin{aligned}
(\delta(a(B^{-1/2}A^{-1/4}g)))^* &= a(B^{-1/2}A^{-1/4}g)^* + j(a(B^{-1/2}A^{1/4}g))U_\gamma \\
&= a(B^{-1/2}A^{-1/4}g)^* - U_\gamma\, j(a(B^{-1/2}A^{1/4}g)) \\
&= a(B^{1/2}A^{1/4}g)^* - \delta(a(B^{-1/2}A^{3/4}g)^*) .
\end{aligned}
\tag{4.5}
\]
Here we have used the fact that $B^{-1/2}(A^{-1/4} + A^{3/4}) = B^{1/2}A^{1/4}$. Using a similar method, we get
\[
(\delta(a(B^{-1/2}A^{1/4}g)^*))^* = a(B^{1/2}A^{-1/4}g) - \delta(a(B^{-1/2}A^{-3/4}g)) . \tag{4.6}
\]
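The operator identity quoted after (4.5), and its analogue behind (4.6), reduce to functional calculus for the single positive operator A. The following short check uses only the definition (4.1) of B and the fact that all powers of A commute with each other and with B; the last line records the lower bound on B that enters the proof of Theorem 3.2(b).

```latex
\begin{align*}
A^{-1/4} + A^{3/4} &= A^{1/4}\bigl(A^{-1/2} + A^{1/2}\bigr) = A^{1/4} B ,
 &\text{hence}\quad B^{-1/2}\bigl(A^{-1/4} + A^{3/4}\bigr) &= B^{1/2} A^{1/4} , \\
A^{1/4} + A^{-3/4} &= A^{-1/4}\bigl(A^{1/2} + A^{-1/2}\bigr) = A^{-1/4} B ,
 &\text{hence}\quad B^{-1/2}\bigl(A^{1/4} + A^{-3/4}\bigr) &= B^{1/2} A^{-1/4} , \\
B - 2\cdot\mathbf{1} &= \bigl(A^{1/4} - A^{-1/4}\bigr)^2 \ge 0 ,
 &\text{hence}\quad B &\ge 2\cdot\mathbf{1} .
\end{align*}
```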
For notational brevity, we write, for any g ∈ h0,
\[
D_1(g) := \delta(a(B^{-1/2}A^{-1/4}g)) , \qquad D_2(g) := \delta(a(B^{-1/2}A^{1/4}g)^*) . \tag{4.7}
\]
Then it follows from (4.5)–(4.7) that
\[
D_1(g)^* = a(B^{1/2}A^{1/4}g)^* - D_2(A^{1/2}g) , \qquad
D_2(g)^* = a(B^{1/2}A^{-1/4}g) - D_1(A^{-1/2}g) . \tag{4.8}
\]
We first collect some properties of Di(g) for g ∈ h0 and i = 1, 2.
Lemma 4.1. Di(g)ξ0 = 0 for any g ∈ h0 and i = 1, 2.
Proof. This follows from (4.3) and (4.7).
Lemma 4.2. The following relations hold for any g, h ∈ h0:
(a) {Di(g), Dj(h)} = 0, i = 1, 2, j = 1, 2,
(b) {D1(g), a(h)} = 0,
(c) {D2(g), a(h)∗} = 0,
(d) $\{D_1(g), a(B^{1/2}A^{1/4}h)^*\} = (g, h)\mathbf{1}$,
(e) $\{D_2(g), a(B^{1/2}A^{-1/4}h)\} = (h, g)\mathbf{1}$.
Proof. Notice that a(h)# Uγ = −Uγ a(h)# for h ∈ h0. The proof follows from direct computations using the definitions in (4.7) and the CARs in (3.1).
Proposition 4.1. The following canonical anti-commutation relations (CARs) hold for any g, h ∈ h0:
(a) {D1(g), D1(h)∗} = (g, h)1, {D1(g), D1(h)} = 0, {D1(g)∗, D1(h)∗} = 0,
(b) {D2(g), D2(h)∗} = (h, g)1, {D2(g), D2(h)} = 0, {D2(g)∗, D2(h)∗} = 0,
(c) {D1(g), D2(h)} = 0, {D1(g), D2(h)∗} = 0, {D1(g)∗, D2(h)} = 0, {D1(g)∗, D2(h)∗} = 0.
Proof. The anti-commutation relations follow from (4.7), (4.8), (3.1) and Lemma 4.2.
Now we are ready to decompose the Hilbert space H = Hω, called the quasi-free Hilbert space. According to Lemma 4.1 and the CARs in Proposition 4.1, Di(g) and Di(h)∗, i = 1, 2, g, h ∈ h0, can be thought of as annihilation and creation operators respectively. We remark that h ↦ D1(h)∗ is linear, but g ↦ D2(g)∗ is conjugate linear. With an abuse of terminology, we call D1(h)∗ and D2(h)∗ the creation operators for quasi-particles and anti quasi-particles respectively, for h ∈ h0. The following is the decomposition of H:
Theorem 4.1. The following decomposition holds:
\[
H = \bigoplus_{m,n=0}^{\infty} H^{(m,n)} ,
\]
where for each m, n ∈ N ∪ {0}, $H^{(m,n)}$ is the closure of the subspace spanned by the vectors of the form
\[
\Bigl( \prod_{j=1}^{m} D_1(g_j)^* \Bigr) \Bigl( \prod_{l=1}^{n} D_2(h_l)^* \Bigr) \xi_0 , \quad g_j , h_l \in h_0 .
\]
In the case in which m = 0 (n = 0), we replace the operator in the first (second) parenthesis in the above by the identity.
Proof. Recall that Mfin, the algebra generated by 1 and B(f), f ∈ h0, is dense in M. It follows from (4.8) and (3.6) that any B(g), g ∈ h0, can be written as a linear sum of four Di(h)#, h ∈ h0, i = 1, 2. Thus any $(\prod_{l=1}^{m} B(g_l))\xi_0$, gl ∈ h0, l = 1, . . . , m, can be expressed as a finite linear combination of the vectors of the form
\[
\Bigl( \prod_{j=1}^{p} D_1(g_j)^\# \Bigr) \Bigl( \prod_{l=1}^{q} D_2(h_l)^\# \Bigr) \xi_0 , \quad g_j , h_l \in h_0 ,
\]
where D(g)# is either D(g) or D(g)∗. Using Lemma 4.1 and the CARs in Proposition 4.1, the above vector can be expressed as a finite linear combination of the vectors of the form
\[
\Bigl( \prod_{j=1}^{p'} D_1(g_j)^* \Bigr) \Bigl( \prod_{l=1}^{q'} D_2(h_l)^* \Bigr) \xi_0 , \quad g_j , h_l \in h_0 , \ p', q' \in \mathbb{N} \cup \{0\} .
\]
The set of finite linear combinations of the vectors of the above form is dense in H. Thus the decomposition follows from Lemma 4.1 and the CARs in Proposition 4.1.
Recall that h0 is a dense subspace of a complex Hilbert space h. Let F = F(h) be the anti-symmetric Fock space over h, and a(g) and a(g)∗ , g ∈ h, the annihilation and creation operator on F respectively. Denote by Ω the vacuum vector in F. Let C : h → h be an anti-unitary operator. If h is a L2 -space, one may consider that C is the complex conjugation. Denote by Γ(C) the second quantization of C. See [7, Sec. 5.2.1]. Let F1 , Ω1 , a1 (g) and a1 (g)∗ , g ∈ h be the identical copies of F, Ω, a(g) and a(g)∗ , g ∈ h, respectively. Notice that Γ(C)a(g)# Γ(C)−1 = a(Cg)# . We write that F2 = Γ(C)F(= F), Ω2 = Ω, a2 (g) = a(Cg), and a2 (g)∗ = a(Cg)∗ , g ∈ h. Then the following anti-commutation relations hold: for any g, h ∈ h {a2 (g), a2 (h)∗ } = (h, g)1 , {a2 (g), a2 (h)} = 0 .
(4.9)
Proposition 4.2. Let U be the operator defined by
\[
\begin{aligned}
U &: H \to F_1 \otimes F_2 , \\
\Bigl( \prod_{j=1}^{m} D_1(g_j)^* \Bigr) \Bigl( \prod_{l=1}^{n} D_2(h_l)^* \Bigr) \xi_0
 &\mapsto \Bigl( \prod_{j=1}^{m} (a_1(g_j)^* \otimes \mathbf{1}) \Bigr) \Bigl( \prod_{l=1}^{n} (\theta \otimes a_2(h_l)^*) \Bigr) \Omega_1 \otimes \Omega_2
\end{aligned}
\]
for gj, hl ∈ h0, j = 1, . . . , m, l = 1, . . . , n, m, n ∈ N ∪ {0}, where θ is an operator which anti-commutes with a(g), a(h)∗ for g, h ∈ h0 and satisfies θΩ1 = Ω1. Then U is unitary.
Proof. Notice that $\theta^2 = 1$ and for g, h in h0
\[
(a_1(g)^* \otimes \mathbf{1})(\theta \otimes a_2(h)^*)\, \Omega_1 \otimes \Omega_2 = -(\theta \otimes a_2(h)^*)(a_1(g)^* \otimes \mathbf{1})\, \Omega_1 \otimes \Omega_2 . \tag{4.10}
\]
Since D1(g)# and a1(g)#, and D2(g)# and a2(g)#, for g ∈ h0 satisfy the same anti-commutation relations respectively by Proposition 4.1 and (4.9), the unitarity of U follows from Lemma 4.1, (4.10) and ai(g)Ωi = 0, i = 1, 2, for any g ∈ h0.
We next turn to the spectral analysis of H, where H is the generator of the symmetric Markovian semigroup {Tt}t≥0 associated to the Dirichlet form (E, D(E)). Let us first describe the basic idea of the proof of Theorem 3.2. Recall that the vectors in Mfin ξ0 can be expressed as finite linear combinations of the vectors of the form
\[
\Bigl( \prod_{j=1}^{p} D_1(f_j)^* \Bigr) \Bigl( \prod_{l=1}^{q} D_2(h_l)^* \Bigr) \xi_0 , \quad f_j , h_l \in h_0 , \ p, q \in \mathbb{N} \cup \{0\} .
\]
Let $\{g_j\}_{j=1}^{\infty} \subset h_0$ be a CONS for h. In the proof of Theorem 3.1, we showed that the form (E, Mfin ξ0) is independent of the CONS {gj} ⊂ h0 for h. Thus one can show that for any ξ ∈ Mfin ξ0
\[
\begin{aligned}
\mathcal{E}[\xi] &= \sum_{j=1}^{\infty} \int \bigl( \|D_1(A^{it}B^{1/2}g_j)\xi\|^2 + \|D_2(A^{it}B^{1/2}g_j)\xi\|^2 \bigr) f(t)\,dt \\
&= \sum_{j=1}^{\infty} \bigl( \|D_1(B^{1/2}g_j)\xi\|^2 + \|D_2(B^{1/2}g_j)\xi\|^2 \bigr) \\
&= \langle \xi , \hat{H}\xi \rangle ,
\end{aligned}
\tag{4.11}
\]
where
\[
\hat{H} = \sum_{j=1}^{\infty} \bigl\{ D_1(B^{1/2}g_j)^* D_1(B^{1/2}g_j) + D_2(B^{1/2}g_j)^* D_2(B^{1/2}g_j) \bigr\}
\]
as a bilinear form on Mfin ξ0 × Mfin ξ0. If one can show that Mfin ξ0 ⊂ D(H), that $H = \hat{H}$ on Mfin ξ0, and that Mfin ξ0 is a core for H, then one expects that the
spectrum of H can be analyzed completely. The following is one of the main results in this section:
Theorem 4.2. (a) Let $\tilde{H} : M_{fin}\xi_0 \to H$ be the operator defined by
\[
\begin{aligned}
\tilde{H} \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \Bigl( \prod_{q=1}^{n} D_2(h_q)^* \Bigr) \xi_0
={}& \sum_{k=1}^{m} \Bigl( \prod_{p=1}^{k-1} D_1(f_p)^* \Bigr) D_1(Bf_k)^* \Bigl( \prod_{p=k+1}^{m} D_1(f_p)^* \Bigr) \Bigl( \prod_{q=1}^{n} D_2(h_q)^* \Bigr) \xi_0 \\
&+ \sum_{k=1}^{n} \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \Bigl( \prod_{q=1}^{k-1} D_2(h_q)^* \Bigr) D_2(Bh_k)^* \Bigl( \prod_{q=k+1}^{n} D_2(h_q)^* \Bigr) \xi_0
\end{aligned}
\tag{4.12}
\]
for any m, n ∈ N ∪ {0} and fp, hq ∈ h0, p = 1, . . . , m, q = 1, . . . , n. Then the relation
\[
\mathcal{E}(\eta, \xi) = \langle \eta , \tilde{H}\xi \rangle
\]
holds for any η, ξ ∈ Mfin ξ0, where E is defined as in (3.10) and (3.11).
(b) $\tilde{H}$ is essentially self-adjoint and the self-adjoint extension, denoted by $\tilde{H}$ again, is equal to the Dirichlet operator H.
Proof. (a) Let (E1, Mfin ξ0) be the form given by
\[
\mathcal{E}_1[\xi] = \sum_{j=1}^{\infty} \|D_1(B^{1/2}g_j)\xi\|^2 ,
\]
where {gj}, gj ∈ h0, is a CONS for h. We write $\tilde{H} = \tilde{H}_1 + \tilde{H}_2$, where the image under $\tilde{H}_1$ (resp. $\tilde{H}_2$) is defined by the first (resp. second) vector in the right-hand side of (4.12). The CARs in Proposition 4.1(a) and Lemma 4.1 imply that
\[
\mathcal{E}_1\Bigl[ \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 \Bigr]
= \sum_{j=1}^{\infty} \sum_{p=1}^{m} \sum_{q=1}^{m} (-1)^{p+q} (B^{1/2}f_p , g_j)(g_j , B^{1/2}f_q)\, G(f_1, \ldots, f_m; p, q) ,
\]
where
\[
G(f_1, \ldots, f_m; p, q) := \Bigl\langle \Bigl( \prod_{\tau=1}^{p-1} D_1(f_\tau)^* \Bigr) \Bigl( \prod_{\tau=p+1}^{m} D_1(f_\tau)^* \Bigr) \xi_0 ,\,
 \Bigl( \prod_{\tau=1}^{q-1} D_1(f_\tau)^* \Bigr) \Bigl( \prod_{\tau=q+1}^{m} D_1(f_\tau)^* \Bigr) \xi_0 \Bigr\rangle .
\]
Using the Parseval relations and the fact that
\[
\begin{aligned}
\sum_{p=1}^{m} (-1)^{p+q} (f_p , Bf_q)\, G(f_1, \ldots, f_m; p, q)
&= (-1)^{q-1} \Bigl\langle \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 ,\,
 D_1(Bf_q)^* \Bigl( \prod_{\tau=1}^{q-1} D_1(f_\tau)^* \Bigr) \Bigl( \prod_{\tau=q+1}^{m} D_1(f_\tau)^* \Bigr) \xi_0 \Bigr\rangle \\
&= \Bigl\langle \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 ,\,
 \Bigl( \prod_{\tau=1}^{q-1} D_1(f_\tau)^* \Bigr) D_1(Bf_q)^* \Bigl( \prod_{\tau=q+1}^{m} D_1(f_\tau)^* \Bigr) \xi_0 \Bigr\rangle ,
\end{aligned}
\]
we have
\[
\begin{aligned}
\mathcal{E}_1\Bigl[ \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 \Bigr]
&= \sum_{q=1}^{m} \Bigl\langle \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 ,\,
 \Bigl( \prod_{\tau=1}^{q-1} D_1(f_\tau)^* \Bigr) D_1(Bf_q)^* \Bigl( \prod_{\tau=q+1}^{m} D_1(f_\tau)^* \Bigr) \xi_0 \Bigr\rangle \\
&= \Bigl\langle \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 ,\,
 \tilde{H}_1 \Bigl( \prod_{p=1}^{m} D_1(f_p)^* \Bigr) \xi_0 \Bigr\rangle .
\end{aligned}
\]
Notice that $\tilde{H}_2 = \tilde{H} - \tilde{H}_1$ commutes with D1(f)∗ for any f ∈ h0 by (4.12). Thus, by the polarization identity, we have proved that
\[
\mathcal{E}_1(\eta, \xi) = \langle \eta , \tilde{H}_1 \xi \rangle
\]
for any η, ξ ∈ Mfin ξ0. A similar method implies that
\[
\mathcal{E}_2(\eta, \xi) = \langle \eta , \tilde{H}_2 \xi \rangle
\]
for any η, ξ ∈ Mfin ξ0. This proves part (a) of the theorem.
(b) By Proposition 4.2, we have
\[
U \tilde{H} U^{-1} = d\Gamma_1(B) \otimes \mathbf{1} + \mathbf{1} \otimes d\Gamma_2(B) , \tag{4.13}
\]
where, for each i = 1, 2, dΓi(B) is the second quantization of B on Fi. We remark that dΓ2(B) is anti-unitarily equivalent to dΓ1(B). By Assumption 3.1(d), any f ∈ h0 is an analytic vector for B, and so it is easy to check that $(\prod_{p=1}^{m} a(f_p)^*)\Omega$ is an analytic vector for dΓ(B) for any fp ∈ h0, p = 1, . . . , m. Thus it follows that any ξ ∈ Mfin ξ0 is an analytic vector for $\tilde{H}$. Since $\tilde{H} M_{fin}\xi_0 \subset M_{fin}\xi_0$ by (4.12) and $\tilde{H} = H$ on Mfin ξ0 by part (a) of the theorem, it follows from [24, Corollary 2 of Theorem X.39] that $\tilde{H}$ and H are essentially self-adjoint on Mfin ξ0, and so $\tilde{H} = H$.
Finally we can give the proof of Theorem 3.2.
Proof of Theorem 3.2. (a) follows from Theorem 4.2.
(b) Recall 0 < A ≤ α1. Let
\[
m = \begin{cases}
\alpha^{-1/2} + \alpha^{1/2} , & \alpha < 1 , \\
2 , & \alpha \ge 1 .
\end{cases}
\]
Since $B = A^{-1/2} + A^{1/2}$,
\[
\inf \sigma(B) \ge m .
\]
It follows from the above lower bound and (4.13) that zero is a simple eigenvalue with eigenvector ξ0 and
\[
\inf(\sigma(H) \setminus \{0\}) \ge m \ge 2 .
\]
This completes the proof of the theorem.

Acknowledgment
This work was supported by Korea Research Foundation Grant (KRF-2001-005D20003). The authors wish to thank anonymous referees for suggesting several improvements.

References
[1] Y. M. Park, Construction of Dirichlet forms on standard forms of von Neumann algebras, Infinite Dimensional Analysis, Quantum Probability and Related Topics 3(1) (2000), 1–14.
[2] F. Cipriani, Dirichlet forms and Markovian semigroups on standard forms of von Neumann algebras, J. Funct. Anal. 147 (1997), 259–300.
[3] C. Bahn, C. K. Ko and Y. M. Park, Dirichlet forms and symmetric Markovian semigroups on CCR Algebras with quasi-free states, J. Math. Phys. 44 (2003), 723–753.
[4] C. Bahn and C. K. Ko, Construction of unbounded Dirichlet forms on standard forms of von Neumann Algebras, J. Korean Math. Soc. 39(6) (2002), 931–951.
[5] E. B. Davies and J. M. Lindsay, Superderivations and symmetric Markov semigroups, Commun. Math. Phys. 157 (1993), 359–370.
[6] E. B. Davies, Quantum theory of open systems, Academic Press, London-New York-San Francisco, 1976.
[7] O. Bratteli and D. W. Robinson, Operator algebras and quantum statistical mechanics, Springer-Verlag, New York-Heidelberg-Berlin, Vol. I 1979, Vol. II 1981.
[8] L. Accardi, Topics in quantum probability, Phys. Rep. 77 (1981), 169–192.
[9] L. Accardi, A. Frigerio and J. T. Lewis, Quantum stochastic processes, Publ. Res. Inst. Math. Sci. 18 (1982), 97–133.
[10] K. R. Parthasarathy, An introduction to quantum stochastic calculus, Birkhäuser, Basel, 1992.
[11] S. Goldstein and J. M. Lindsay, Beurling-Deny conditions for KMS-symmetric dynamical semigroups, C. R. Acad. Sci. Paris Ser. I 317 (1993), 1053–1057.
[12] S. Goldstein and J. M. Lindsay, KMS-symmetric Markov semigroups, Math. Zeit. 219 (1995), 590–608.
[13] S. Goldstein and J. M. Lindsay, Markov semigroups KMS-symmetric for a weight, Math. Ann. 313 (1999), 39–67.
[14] T. Matsui, Markov semigroups on UHF algebras, Rev. Math. Phys. 5 (1993), 587–600.
[15] S. Stragier, J. Quaegebeur and A. Verbeure, Quantum detailed balance, Ann. Inst. Henri Poincaré 41(1) (1984), 25–36.
[16] A. W. Majewski and B. Zegarlinski, Quantum stochastic dynamics I: Spin systems on a lattice, MPEJ 1, Paper 2 (1995).
[17] A. W. Majewski and B. Zegarlinski, Quantum stochastic dynamics II, Rev. Math. Phys. 8(5) (1996), 689–713.
[18] F. Cipriani, F. Fagnola and J. M. Lindsay, Spectral Analysis and Feller Properties for Quantum Ornstein–Uhlenbeck Semigroups, Commun. Math. Phys. 210 (2000), 85–105.
[19] C. K. Ko and Y. M. Park, Construction of a Family of Quantum Ornstein–Uhlenbeck Semigroups, preprint mp arc 03-141.
[20] F. Cipriani, Perron theory for positive maps and semigroups on von Neumann algebras, CMS Conf. Proc. 29 (2000), 115–123.
[21] H. Araki and W. Wyss, Representations of canonical anticommutation relations, Helv. Phys. Acta 37 (1964), 136–159.
[22] H. Araki, Some properties of modular conjugation operator of von Neumann algebras and noncommutative Radon-Nikodym theorem with chain rule, Pacific J. Math. 50(2) (1974), 309–354.
[23] E. B. Davies and J. M. Lindsay, Non-commutative symmetric Markov semigroups, Math. Zeit. 210 (1992), 379–411.
[24] M. Reed and B. Simon, Methods of modern mathematical physics I, II, Academic Press, 1980.
December 3, 2003 18:33 WSPC/148-RMP
00183
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 847–875 © World Scientific Publishing Company
CLASSICAL LIMIT OF QUANTUM DYNAMICAL ENTROPIES
F. BENATTI and V. CAPPELLINI
Dip. Fisica Teorica, Università di Trieste, Strada Costiera 11, I-34014 Trieste, Italy
M. DE COCK, M. FANNES and D. VANPETEGHEM∗
Instituut voor Theoretische Fysica, K.U. Leuven, B-3001 Leuven, Belgium
Received 25 July 2003
Revised 13 September 2003
Two non-commutative dynamical entropies are studied in connection with the classical limit. For systems with a strongly chaotic classical limit, the Kolmogorov–Sinai invariant is recovered on time scales that are logarithmic in the quantization parameter. The model of the quantized hyperbolic automorphisms of the 2-torus is examined in detail.
Keywords: Quantum dynamical entropy; coherent states; semi-classical limit; hyperbolic automorphisms of the 2-torus.
1. Introduction

Classical chaos is understood as motion on compact regions with trajectories highly sensitive to initial conditions [27, 19, 11, 33]. Once quantized, the motion has a discrete energy spectrum and behaves almost periodically in time. Nevertheless, nature is fundamentally quantal and, according to the correspondence principle, classical behavior emerges in the limit ħ → 0. Also, classical and quantum mechanics are expected to almost coincide in the semi-classical regime, that is, over times scaling as $\hbar^{-\alpha}$ for some α > 0 [33]. Actually, this is true only for regular classical limits, while for chaotic ones the semi-classical regime typically scales as −log ħ [19, 11, 33]. Both time scales diverge when ħ → 0, but the shortness of the latter means that classical mechanics has to be replaced by quantum mechanics much sooner for quantum systems with chaotic classical behavior. The logarithmic breaking time −log ħ has been considered by some as a violation of the correspondence principle [17, 18], by others (see [11] and Chirikov in [19]) as evidence that time and classical limits do not commute.
∗ Also Research Assistant of the Fund for Scientific Research - Flanders (Belgium) (F.W.O.-Vlaanderen).
The analytic studies of logarithmic time scales have been mainly performed by means of semi-classical tools, essentially by focusing, via coherent state techniques, on the phase space localization of specific time evolving quantum observables. In the following, we shall show how they emerge in the context of quantum dynamical entropies. As a particular example, we shall concentrate on finite dimensional quantizations of hyperbolic automorphisms of the 2-torus, which are prototypes of chaotic behavior; indeed, their trajectories separate exponentially fast with a Lyapounov exponent log λ > 0 [7, 31]. Standard quantization, à la Berry, of hyperbolic automorphisms [10, 14] yields Hilbert spaces of a finite dimension N. This dimension plays the role of semi-classical parameter and sets the minimal size 1/N of quantum phase space cells.
By the theorems of Ruelle and Pesin [21], the positive Lyapounov exponents of smooth, classical dynamical systems are related to the dynamical entropy of Kolmogorov [20], which measures the information per time step provided by the dynamics. There are several candidates for non-commutative extensions of the latter [12, 3, 30, 1, 28]: in this paper we shall use two of them [12, 3] and study their semi-classical limit. We show that, from both of them, one recovers the Kolmogorov–Sinai entropy by computing the average quantum entropy produced over a logarithmic time scale and then taking the classical limit. This confirms the numerical results in [5], where the dynamical entropy [3] is applied to the study of the quantum kicked top. In this approach, the presence of logarithmic time scales indicates the typical scaling for a joint time-classical limit suited to preserve positive entropy production in quantized classically chaotic quantum systems.
The paper is organized as follows: Sec. 2 contains a brief review of the algebraic approach to classical and dynamical systems, while Sec. 3 introduces some basic semi-classical tools. Sections 4 and 5 deal with the quantization of hyperbolic maps on finite dimensional Hilbert spaces and the relation between classical and time limits. Section 6 gives an overview of the quantum dynamical entropy of Connes, Narnhofer and Thirring [12] (CNT-entropy) and of Alicki and Fannes [2, 3] (ALF-entropy, where L stands for Lindblad); finally, in Sec. 7 their semi-classical behavior is studied and the emergence of a typical logarithmic time scale is shown.
2. Dynamical Systems: Algebraic Setting

We consider reversible, discrete time, compact classical dynamical systems that can be represented by a triple $(\mathcal{X}, T, \mu)$, where:

• $\mathcal{X}$ is a compact metric space: the phase space of the system.
• $T$ is a measurable transformation of $\mathcal{X}$ that is invertible such that $T^{-1}$ is also measurable. The group $\{T^k \mid k \in \mathbb{Z}\}$ implements the conservative dynamics in discrete time.
• $\mu$ is a $T$-invariant probability measure on $\mathcal{X}$, i.e. $\mu \circ T = \mu$.
December 3, 2003 18:33 WSPC/148-RMP
00183
Classical Limit of Quantum Dynamical Entropies
In this paper, we consider a general scheme for quantizing and dequantizing, i.e. for taking the classical limit (see [32]). Within this framework, we focus on the semiclassical limit of quantum dynamical entropies of finite dimensional quantizations of the Arnold cat map and of generic hyperbolic automorphisms of the 2-torus, cat maps for short. In order to make the quantization procedure more explicit, it proves useful to follow an algebraic approach and replace $(\mathcal{X}, T, \mu)$ with $(\mathcal{M}_\mu, \Theta, \omega_\mu)$, where

• $\mathcal{M}_\mu$ is the von Neumann algebra $L^\infty_\mu(\mathcal{X})$ of (equivalence classes of) essentially bounded $\mu$-measurable functions on $\mathcal{X}$, equipped with the so-called essential supremum norm $\|\cdot\|_\infty$ [26].
• $\omega_\mu$ is the state on $\mathcal{M}_\mu$ defined by the reference measure $\mu$:
$$\omega_\mu(f) := \int_{\mathcal{X}} \mu(dx)\, f(x) .$$
• $\{\Theta^k \mid k \in \mathbb{Z}\}$ is the discrete group of automorphisms of $\mathcal{M}_\mu$ which implements the dynamics: $\Theta(f) := f \circ T^{-1}$. The invariance of the reference measure now reads $\omega_\mu \circ \Theta = \omega_\mu$.

Quantum dynamical systems are described in a completely similar way by a triple $(\mathcal{M}, \Theta, \omega)$, the critical difference being that the algebra of observables $\mathcal{M}$ is no longer Abelian:

• $\mathcal{M}$ is a von Neumann algebra of operators, the observables, acting on a Hilbert space $\mathcal{H}$.
• $\Theta$ is an automorphism of $\mathcal{M}$.
• $\omega$ is an invariant normal state on $\mathcal{M}$: $\omega \circ \Theta = \omega$.

Quantizing essentially corresponds to suitably mapping the commutative, classical triple $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ to a non-commutative, quantum triple $(\mathcal{M}, \Theta, \omega)$.

3. Classical Limit: Coherent States

Performing the classical limit or a semi-classical analysis consists in studying how a family of algebraic triples $(\mathcal{M}, \Theta, \omega)$ depending on a quantization $\hbar$-like parameter is mapped onto $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ when the parameter goes to zero. The most successful semi-classical tools are based on the use of coherent states. For our purposes, we shall use a large integer $N$ as a quantization parameter, i.e. we use $1/N$ as the $\hbar$-like parameter. In fact, we shall consider cases where $\mathcal{M}$ is the algebra $M_N$ of $N$-dimensional square matrices acting on $\mathbb{C}^N$, the quantum reference state is the normalized trace $\frac{1}{N}\mathrm{Tr}$ on $M_N$, denoted by $\tau_N$, and the dynamics is given in terms of a unitary operator $U_T$ on $\mathbb{C}^N$ in the standard way: $\Theta_N(X) := U_T^* X U_T$. In full generality, coherent states will be identified as follows.
Definition 3.1. A family $\{|C_N(x)\rangle \mid x \in \mathcal{X}\} \in \mathcal{H}$ of vectors, indexed by points $x \in \mathcal{X}$, constitutes a set of coherent states if it satisfies the following requirements:

(1) Measurability: $x \mapsto |C_N(x)\rangle$ is measurable on $\mathcal{X}$;
(2) Normalization: $\|C_N(x)\|^2 = 1$, $x \in \mathcal{X}$;
(3) Overcompleteness: $N \int_{\mathcal{X}} \mu(dx)\, |C_N(x)\rangle\langle C_N(x)| = \mathbb{1}$;
(4) Localization: given $\varepsilon > 0$ and $d_0 > 0$, there exists $N_0(\varepsilon, d_0)$ such that for $N \geq N_0$ and $d(x, y) \geq d_0$, one has $N |\langle C_N(x), C_N(y)\rangle|^2 \leq \varepsilon$.
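The overcompleteness condition may be written in dual form as
$$N \int_{\mathcal{X}} \mu(dx)\, \langle C_N(x), X\, C_N(x)\rangle = \mathrm{Tr}\, X , \qquad X \in M_N .$$
Indeed,
$$N \int_{\mathcal{X}} \mu(dx)\, \langle C_N(x), X\, C_N(x)\rangle = N\, \mathrm{Tr} \int_{\mathcal{X}} \mu(dx)\, |C_N(x)\rangle\langle C_N(x)|\, X = \mathrm{Tr}\, X .$$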
3.1. Anti-Wick quantization

In order to study the classical limit and, more generally, semi-classical behavior of $(M_N, \Theta_N, \tau_N)$ when $N \to \infty$, we introduce two linear maps. The first, $\gamma_{N\infty}$ (anti-Wick quantization), associates $N \times N$ matrices to functions in $\mathcal{M}_\mu = L^\infty_\mu(\mathcal{X})$; the second one, $\gamma_{\infty N}$, maps $N \times N$ matrices to functions in $L^\infty_\mu(\mathcal{X})$.

Definition 3.2. Given a family $\{|C_N(x)\rangle \mid x \in \mathcal{X}\}$ of coherent states in $\mathbb{C}^N$, the anti-Wick quantization scheme will be described by a (completely) positive unital map $\gamma_{N\infty} : \mathcal{M}_\mu \to M_N$,
$$\mathcal{M}_\mu \ni f \mapsto N \int_{\mathcal{X}} \mu(dx)\, f(x)\, |C_N(x)\rangle\langle C_N(x)| =: \gamma_{N\infty}(f) \in M_N .$$
The corresponding dequantizing map $\gamma_{\infty N} : M_N \to \mathcal{M}_\mu$ will correspond to the (completely) positive unital map
$$M_N \ni X \mapsto \langle C_N(x), X\, C_N(x)\rangle =: \gamma_{\infty N}(X)(x) \in \mathcal{M}_\mu .$$

Both maps are identity preserving because of the conditions imposed on the family of coherent states and are also completely positive since the domain of $\gamma_{N\infty}$ is a commutative algebra as well as the range of $\gamma_{\infty N}$. Moreover,
$$\|\gamma_{\infty N} \circ \gamma_{N\infty}(g)\|_\infty \leq \|g\|_\infty , \qquad g \in \mathcal{M}_\mu , \tag{1}$$
where $\|\cdot\|_\infty$ denotes the essential norm on $\mathcal{M}_\mu = L^\infty_\mu(\mathcal{X})$. The following two equivalent properties are less trivial:

Proposition 3.1. For all $f \in \mathcal{M}_\mu$,
$$\lim_{N\to\infty} \gamma_{\infty N} \circ \gamma_{N\infty}(f) = f \qquad \mu\text{-a.e.}$$
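Proposition 3.2. For all $f, g \in \mathcal{M}_\mu$,
$$\lim_{N\to\infty} \tau_N\big(\gamma_{N\infty}(f)^* \gamma_{N\infty}(g)\big) = \omega_\mu(\bar{f} g) = \int_{\mathcal{X}} \mu(dx)\, \overline{f(x)}\, g(x) .$$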
The previous two propositions can be taken as requirements on any well-defined quantization–dequantization scheme for observables. In the sequel, we shall need the notion of quantum dynamical systems $(M_N, \Theta_N, \tau_N)$ tending to the classical limit $(\mathcal{X}, T, \mu)$. We then need convergence not only of observables but also of the dynamics. This aspect will be considered in Sec. 5.

Proof of Proposition 3.1. We first prove the assertion when $f$ is continuous on $\mathcal{X}$ and then remove this condition. We show that the quantity
$$\begin{aligned} F_N(x) &:= |f(x) - \gamma_{\infty N} \circ \gamma_{N\infty}(f)(x)| \\ &= \Big| f(x) - N \int_{\mathcal{X}} \mu(dy)\, f(y)\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &= \Big| N \int_{\mathcal{X}} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \end{aligned}$$
becomes arbitrarily small for $N$ large enough, uniformly in $x$. Selecting a ball $B(x, d_0)$ of radius $d_0$, using the mean-value theorem and property (3.1.3), we derive the upper bound
$$\begin{aligned} F_N(x) &\leq \Big| N \int_{B(x,d_0)} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &\quad + \Big| N \int_{\mathcal{X} \setminus B(x,d_0)} \mu(dy)\, (f(y) - f(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \qquad (2) \\ &\leq |f(c) - f(x)| + \int_{\mathcal{X} \setminus B(x,d_0)} \mu(dy)\, |f(y) - f(x)|\, N |\langle C_N(x), C_N(y)\rangle|^2 , \qquad (3) \end{aligned}$$
where $c \in B(x, d_0)$. Because $\mathcal{X}$ is compact, $f$ is uniformly continuous. Therefore, we can choose $d_0$ in such a way that $|f(c) - f(x)| < \varepsilon$ uniformly in $x \in \mathcal{X}$. On the other hand, from the localization property (3.1.4), given $\varepsilon' > 0$, there exists an integer $N_0(\varepsilon', d_0)$ such that $N |\langle C_N(x), C_N(y)\rangle|^2 < \varepsilon'$ whenever $N > N_0(\varepsilon', d_0)$ and $d(x, y) \geq d_0$. This choice leads to the upper bound
$$\begin{aligned} F_N(x) &\leq \varepsilon + \varepsilon' \int_{\mathcal{X} \setminus B(x,d_0)} \mu(dy)\, |f(y) - f(x)| \\ &\leq \varepsilon + \varepsilon' \int_{\mathcal{X}} \mu(dy)\, |f(y) - f(x)| \leq \varepsilon + 2\varepsilon' \|f\|_\infty . \qquad (4) \end{aligned}$$
To get rid of the continuity of $f$, we use Lusin's theorem [26]. It states that, given $f \in L^\infty_\mu(\mathcal{X})$, with $\mathcal{X}$ compact, there exists a sequence $\{f_n\}$ of continuous functions on $\mathcal{X}$ such that $|f_n| \leq \|f\|_\infty$ and converging to $f$ $\mu$-almost everywhere. Thus, for $f \in L^\infty_\mu(\mathcal{X})$, we pick such a sequence and estimate
$$F_N(x) \leq |f(x) - f_n(x)| + |f_n(x) - \gamma_{\infty N} \circ \gamma_{N\infty}(f_n)(x)| + |\gamma_{\infty N} \circ \gamma_{N\infty}(f_n - f)(x)| .$$
The first term can be made arbitrarily small ($\mu$-a.e.) by choosing $n$ large enough because of Lusin's theorem, while the second one goes to 0 when $N \to \infty$ since $f_n$ is continuous. Finally, the third term becomes as well vanishingly small with $n \to \infty$, as one can deduce from
$$\begin{aligned} \int_{\mathcal{X}} \mu(dx)\, |\gamma_{\infty N} \circ \gamma_{N\infty}(f - f_n)(x)| &= \int_{\mathcal{X}} \mu(dx) \Big| \int_{\mathcal{X}} \mu(dy)\, (f(y) - f_n(y))\, N |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &\leq \int_{\mathcal{X}} \mu(dy)\, |f(y) - f_n(y)| \int_{\mathcal{X}} \mu(dx)\, N |\langle C_N(x), C_N(y)\rangle|^2 \\ &= \int_{\mathcal{X}} \mu(dy)\, |f(y) - f_n(y)| , \end{aligned}$$
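where exchange of integration order is harmless because of the existence of the integral (1). The last integral goes to zero with $n$ by dominated convergence and thus the result follows.

Proof of Proposition 3.2. Consider
$$\begin{aligned} \Omega_N &:= \big| \tau_N\big(\gamma_{N\infty}(f)^* \gamma_{N\infty}(g)\big) - \omega_\mu(\bar{f} g) \big| \\ &= \Big| N \int_{\mathcal{X}} \mu(dx)\, \overline{f(x)} \int_{\mathcal{X}} \mu(dy)\, (g(y) - g(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &\leq \int_{\mathcal{X}} \mu(dx)\, |f(x)| \, \Big| \int_{\mathcal{X}} \mu(dy)\, (g(y) - g(x))\, N |\langle C_N(x), C_N(y)\rangle|^2 \Big| . \end{aligned}$$
By choosing a sequence of continuous $g_n$ approximating $g \in L^\infty_\mu(\mathcal{X})$, and arguing as in the previous proof, we get the following upper bound:
$$\begin{aligned} \Omega_N &\leq N \int_{\mathcal{X}} \mu(dx)\, |f(x)| \, \Big| \int_{\mathcal{X}} \mu(dy)\, (g(y) - g_n(y))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &\quad + N \int_{\mathcal{X}} \mu(dx)\, |f(x)| \, \Big| \int_{\mathcal{X}} \mu(dy)\, (g_n(y) - g_n(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| \\ &\quad + N \int_{\mathcal{X}} \mu(dx)\, |f(x)| \, \Big| \int_{\mathcal{X}} \mu(dy)\, (g(x) - g_n(x))\, |\langle C_N(x), C_N(y)\rangle|^2 \Big| . \end{aligned}$$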
The integrals in the first and third lines go to zero by dominated convergence and Lusin's theorem. As regards the middle line, one can apply the argument used for the quantity $F_N(x)$ in the proof of Proposition 3.1.

4. Classical and Quantum Cat Maps

In this section, we collect the basic material needed to describe both classical and quantum cat maps and we introduce a specific set of coherent states that will enable us to perform the semi-classical analysis of the dynamical entropy.

4.1. Finite dimensional quantizations

We first introduce cat maps in the spirit of the algebraic formulation introduced in the previous sections.

Definition 4.1. Hyperbolic automorphisms of the torus, i.e. cat maps, are generically represented by triples $(\mathcal{M}_\mu, \Theta, \omega_\mu)$, where

• $\mathcal{M}_\mu$ is the algebra of essentially bounded functions on the two dimensional torus $\mathbb{T} := \{x = (x_1, x_2) \in \mathbb{R}^2 \ (\mathrm{mod}\ 1)\}$, equipped with the Lebesgue measure $\mu(dx) := dx$.
• $\{\Theta^k \mid k \in \mathbb{Z}\}$ is the family of automorphisms (discrete time evolution) given by $\mathcal{M}_\mu \ni f \mapsto (\Theta^k f)(x) := f(A^{-k} x \ (\mathrm{mod}\ 1))$, where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has integer entries such that $ad - cb = 1$, $|a + d| > 2$ and maps $\mathbb{T}$ onto itself.
• $\omega_\mu$ is the expectation obtained by integration with respect to the Lebesgue measure: $\mathcal{M}_\mu \ni f \mapsto \omega_\mu(f) := \int_{\mathbb{T}} dx\, f(x)$, that is left invariant by $\Theta$.
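As an illustration of Definition 4.1, the following Python sketch applies a cat map to a point of the torus using exact rational arithmetic. The specific matrix (the Arnold cat map, with rows (2, 1) and (1, 1)) and all function names are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction
import math

# Arnold cat map as an illustrative choice: integer entries, ad - cb = 1, |a + d| > 2.
A = ((2, 1), (1, 1))

def is_hyperbolic(A):
    """Check the two conditions of Definition 4.1 on an integer 2x2 matrix."""
    (a, b), (c, d) = A
    return a * d - c * b == 1 and abs(a + d) > 2

def cat_map(x, A):
    """Apply x -> A x (mod 1) to a point x = (x1, x2) of the 2-torus."""
    (a, b), (c, d) = A
    x1, x2 = x
    return ((a * x1 + b * x2) % 1, (c * x1 + d * x2) % 1)

def inverse(A):
    """Inverse of a determinant-one integer matrix; it again has integer entries."""
    (a, b), (c, d) = A
    return ((d, -b), (-c, a))

def stretching_factor(A):
    """Modulus of the expanding eigenvalue, from the trace t: (|t| + sqrt(t^2 - 4)) / 2."""
    t = abs(A[0][0] + A[1][1])
    return (t + math.sqrt(t * t - 4)) / 2

x = (Fraction(1, 5), Fraction(2, 5))
y = cat_map(x, A)               # one step forward
back = cat_map(y, inverse(A))   # one step backward returns to x
```

Working with `Fraction` keeps the reduction mod 1 exact, so invertibility of the map on rational points can be checked without rounding issues.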
The matrix $A$ has irrational eigenvalues $\lambda > 1$ and $\lambda^{-1}$; therefore distances stretch along the eigendirection $u$ of $\lambda$ and shrink along $v$, the eigendirection of $\lambda^{-1}$. Once the folding condition is added, the hyperbolic automorphisms of the torus become prototypes of classical chaos, with positive Lyapounov exponent $\log\lambda$. One can quantize the associated algebraic triple $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ on either infinite [8] or finite dimensional Hilbert spaces [10, 14, 13]. In the following, we shall focus on the latter. Given an integer $N$, we consider an orthonormal basis $|j\rangle$ of $\mathbb{C}^N$, where the index $j$ runs through $\mathbb{Z}_N$, namely $|j + N\rangle \equiv |j\rangle$, $j \in \mathbb{Z}$. By using this basis we define two unitary matrices $U_N$ and $V_N$ as follows:
$$U_N |j\rangle := \exp\Big(\frac{2\pi i}{N} u\Big) |j + 1\rangle , \quad \text{and} \quad V_N |j\rangle := \exp\Big(\frac{2\pi i}{N} (v - j)\Big) |j\rangle . \tag{5}$$
Here $u, v \in [0, 1)$ are parameters labelling the representations and
$$U_N^N = e^{2i\pi u}\, \mathbb{1}_N , \qquad V_N^N = e^{2i\pi v}\, \mathbb{1}_N . \tag{6}$$
It turns out that
$$U_N V_N = \exp\Big(\frac{2i\pi}{N}\Big) V_N U_N . \tag{7}$$
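Relations (6) and (7) can be checked numerically for small $N$. The sketch below is only an illustration: it builds $U_N$ and $V_N$ from (5) as nested Python lists (column $j$ holds the image of $|j\rangle$), and the helper names and the sample values $N = 5$, $u = 0.3$, $v = 0.7$ are assumptions made for the example.

```python
import cmath

def shift_clock(N, u=0.0, v=0.0):
    """Return (U_N, V_N) of Eq. (5) as N x N nested lists."""
    U = [[0j] * N for _ in range(N)]
    V = [[0j] * N for _ in range(N)]
    for j in range(N):
        U[(j + 1) % N][j] = cmath.exp(2j * cmath.pi * u / N)   # U|j> = e^{2 pi i u/N} |j+1>
        V[j][j] = cmath.exp(2j * cmath.pi * (v - j) / N)       # V|j> = e^{2 pi i (v-j)/N} |j>
    return U, V

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(N):
    return [[1 + 0j if i == j else 0j for j in range(N)] for i in range(N)]

def matpow(A, p):
    R = identity(len(A))
    for _ in range(p):
        R = matmul(R, A)
    return R

def scale(c, A):
    return [[c * a for a in row] for row in A]

def max_diff(A, B):
    """Largest entrywise modulus of A - B."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

N, u, v = 5, 0.3, 0.7
U, V = shift_clock(N, u, v)
err_6_U = max_diff(matpow(U, N), scale(cmath.exp(2j * cmath.pi * u), identity(N)))
err_6_V = max_diff(matpow(V, N), scale(cmath.exp(2j * cmath.pi * v), identity(N)))
err_7 = max_diff(matmul(U, V), scale(cmath.exp(2j * cmath.pi / N), matmul(V, U)))
```

All three errors should be at the level of floating point rounding.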
Introducing Weyl operators labeled by $n = (n_1, n_2) \in \mathbb{Z}^2$,
$$W_N(n) := \exp\Big(\frac{i\pi}{N} n_1 n_2\Big) V_N^{n_2} U_N^{n_1} = W_N(-n)^* , \tag{8}$$
it follows that
$$W_N(N n) = e^{i\pi (N n_1 n_2 + 2 n_1 u + 2 n_2 v)} , \tag{9}$$
$$W_N(n) W_N(m) = \exp\Big(\frac{i\pi}{N} \sigma(n, m)\Big) W_N(n + m) , \tag{10}$$
where $\sigma(n, m) := n_1 m_2 - n_2 m_1$.

Definition 4.2. Quantized cat maps will be identified with algebraic triples $(M_N, \Theta_N, \tau_N)$ where

• $M_N$ is the full $N \times N$ matrix algebra linearly spanned by the Weyl operators $W_N(n)$.
• $\Theta_N : M_N \to M_N$ is the automorphism such that
$$W_N(p) \mapsto \Theta_N(W_N(p)) := W_N(A p) , \qquad p \in \mathbb{Z}^2 . \tag{11}$$
In the definition above, we have omitted reference to the parameters $u$, $v$ in (5): they must be chosen such that
$$\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u \\ v \end{pmatrix} + \frac{N}{2} \begin{pmatrix} ac \\ bd \end{pmatrix} \quad (\mathrm{mod}\ 1) . \tag{12}$$
Then, the folding condition (9) is compatible with the time evolution [14]. Further, the algebraic relations (10) are also preserved since the symplectic form remains invariant, i.e. $\sigma(A^t n, A^t m) = \sigma(n, m)$. Useful relations can be obtained by using
$$W_N(n) |j\rangle = \exp\Big(\frac{i\pi}{N} (-n_1 n_2 + 2 n_1 u + 2 n_2 v)\Big) \exp\Big(-\frac{2i\pi}{N} j n_2\Big) |j + n_1\rangle . \tag{13}$$
From (13) one readily derives
$$\tau_N(W_N(n)) = e^{\frac{i\pi}{N} (-n_1 n_2 + 2 n_1 u + 2 n_2 v)}\, \delta^{(N)}_{n,0} , \tag{14}$$
$$\tau_N(W_N(A n)) = \tau_N(W_N(n)) , \tag{15}$$
$$\frac{1}{N} \sum_{p_1, p_2 = 0}^{N-1} W_N(-p)\, W_N(n)\, W_N(p) = \mathrm{Tr}(W_N(n))\, \mathbb{1}_N , \tag{16}$$
$$M_N \ni X = \sum_{p_1, p_2 = 0}^{N-1} \tau_N(X W_N(-p))\, W_N(p) . \tag{17}$$
In (14), we have introduced the periodic Kronecker delta, that is $\delta^{(N)}_{n,0} = 1$ if and only if $n = 0 \ \mathrm{mod}\ (N)$.
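The Weyl relation (10) and the trace formula (14) lend themselves to a quick numerical check. The following sketch is a minimal illustration under stated assumptions: it builds $W_N(n)$ directly from its action (13) on the basis $|j\rangle$, and the function names and the sample values ($N = 7$, $u = 0.25$, $v = 0.5$, and the two labels) are chosen only for the example.

```python
import cmath
import math

def weyl(N, n, u=0.0, v=0.0):
    """Matrix of W_N(n), built column by column from Eq. (13)."""
    n1, n2 = n
    pref = cmath.exp(1j * math.pi * (-n1 * n2 + 2 * n1 * u + 2 * n2 * v) / N)
    W = [[0j] * N for _ in range(N)]
    for j in range(N):
        # |j + n1> is identified with |(j + n1) mod N>
        W[(j + n1) % N][j] = pref * cmath.exp(-2j * math.pi * j * n2 / N)
    return W

def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def normalized_trace(A):
    """tau_N(A) = Tr(A) / N."""
    return sum(A[i][i] for i in range(len(A))) / len(A)

def sigma(n, m):
    """Symplectic form sigma(n, m) = n1 m2 - n2 m1."""
    return n[0] * m[1] - n[1] * m[0]

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

N, u, v = 7, 0.25, 0.5
n, m = (2, 3), (-1, 4)

# Eq. (10): W(n) W(m) = exp(i pi sigma(n, m) / N) W(n + m)
lhs = matmul(weyl(N, n, u, v), weyl(N, m, u, v))
phase = cmath.exp(1j * math.pi * sigma(n, m) / N)
rhs = [[phase * w for w in row] for row in weyl(N, (n[0] + m[0], n[1] + m[1]), u, v)]
err_10 = max_diff(lhs, rhs)

# Eq. (14): the normalized trace vanishes unless n = 0 mod N
trace_generic = normalized_trace(weyl(N, n, u, v))
trace_periodic = normalized_trace(weyl(N, (N, 0), u, v))
```

For the label $(N, 0)$ the normalized trace reduces to the pure phase $e^{2\pi i u}$, in agreement with (9) and (14).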
From Eq. (10) one derives
$$[W_N(n), W_N(m)] = 2i \sin\Big(\frac{\pi}{N} \sigma(n, m)\Big) W_N(n + m) ,$$
which suggests that the $\hbar$-like parameter is $1/N$ and that the classical limit corresponds to $N \to \infty$. In the following section, we set up a coherent state technique suited to study classical cat maps as limits of quantized cats.

4.2. Coherent states for cat maps

We shall construct a family $\{|C_N(x)\rangle \mid x \in \mathbb{T}\}$ of coherent states on the 2-torus by means of the discrete Weyl group. We define
$$|C_N(x)\rangle := W_N([N x]) |C_N\rangle , \tag{18}$$
where $[N x] = ([N x_1], [N x_2])$, $0 \leq [N x_i] \leq N - 1$ is the largest integer smaller than $N x_i$ and the fundamental vector $|C_N\rangle$ is chosen to be
$$|C_N\rangle = \sum_{j=0}^{N-1} C_N(j) |j\rangle , \qquad C_N(j) := \frac{1}{2^{(N-1)/2}} \sqrt{\binom{N-1}{j}} . \tag{19}$$
Measurability and normalization are immediate, over-completeness comes as follows. Let $Y$ be the operator in the left-hand side of property (3.1.3). If $\tau_N(Y W_N(n)) = \tau_N(W_N(n))$ for all $n = (n_1, n_2)$ with $0 \leq n_i \leq N - 1$, then according to (17) applied to $Y$ it follows that $Y = \mathbb{1}$. This is indeed the case as, using (9) and $N$-periodicity,
$$\begin{aligned} \tau_N(Y W_N(n)) &= \int_{\mathbb{T}} dx\, \langle C_N(x), W_N(n) C_N(x)\rangle \\ &= \int_{\mathbb{T}} dx\, \exp\Big(\frac{2\pi i}{N} \sigma(n, [N x])\Big) \langle C_N, W_N(n) C_N\rangle \\ &= \frac{1}{N^2} \sum_{p_1, p_2 = 0}^{N-1} \exp\Big(\frac{2\pi i}{N} \sigma(n, p)\Big) \langle C_N, W_N(n) C_N\rangle \\ &= \tau_N(W_N(n)) . \end{aligned} \tag{20}$$
In the last line we used that, when $x$ runs over $\mathbb{T}$, $[N x_i]$, $i = 1, 2$, runs over the set of integers $0, 1, \ldots, N - 1$. The proof of the localization property (3.1.4) requires several steps. First, we observe that, due to (6),
$$\begin{aligned} E(n) := |\langle C_N, W_N(n) C_N\rangle| &= \frac{1}{2^{N-1}} \Bigg| \sum_{\ell=0}^{N-n_1-1} \exp\Big(-\frac{2\pi i}{N} \ell n_2\Big) \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1}} \\ &\qquad + \sum_{\ell=N-n_1}^{N-1} \exp\Big(-\frac{2\pi i}{N} \ell n_2\Big) \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1 - N}} \Bigg| \qquad (21) \\ &\leq \frac{1}{2^{N-1}} \Bigg[ \sum_{\ell=0}^{N-n_1-1} \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1}} + \sum_{\ell=N-n_1}^{N-1} \sqrt{\binom{N-1}{\ell} \binom{N-1}{\ell + n_1 - N}} \Bigg] . \qquad (22) \end{aligned}$$
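Second, using the entropic bound of the binomial coefficients
$$\binom{N-1}{\ell} \leq 2^{(N-1)\, \eta\left(\frac{\ell}{N-1}\right)} , \tag{23}$$
where
$$\eta(t) := \begin{cases} -t \log_2 t - (1 - t) \log_2 (1 - t) & \text{if } 0 < t \leq 1 \\ 0 & \text{if } t = 0 \end{cases} , \tag{24}$$
we estimate
$$E(n) \leq \frac{1}{2^{N-1}} \Bigg[ \sum_{\ell=0}^{N-1-n_1} 2^{\frac{N-1}{2} \left[ \eta\left(\frac{\ell}{N-1}\right) + \eta\left(\frac{\ell + n_1}{N-1}\right) \right]} + \sum_{\ell=N-n_1}^{N-1} 2^{\frac{N-1}{2} \left[ \eta\left(\frac{\ell}{N-1}\right) + \eta\left(\frac{\ell + n_1 - N}{N-1}\right) \right]} \Bigg] . \tag{25}$$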
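The exponents in the two sums are bounded by their maxima
$$\eta\Big(\frac{\ell}{N-1}\Big) + \eta\Big(\frac{\ell + n_1}{N-1}\Big) \leq 2\eta_1(n_1) , \qquad (0 \leq \ell \leq N - n_1 - 1) \tag{26}$$
$$\eta\Big(\frac{\ell}{N-1}\Big) + \eta\Big(\frac{\ell + n_1 - N}{N-1}\Big) \leq 2\eta_2(n_1) , \qquad (N - n_1 \leq \ell \leq N - 1) \tag{27}$$
where
$$\eta_1(n_1) := \eta\Big(\frac{1}{2} - \frac{n_1}{2(N-1)}\Big) \leq 1 , \tag{28}$$
$$\eta_2(n_1) := \eta\Big(\frac{1}{2} + \frac{N - n_1}{2(N-1)}\Big) \leq \eta_2 < 1 . \tag{29}$$
Notice that $\eta_2$ is automatically $< 1$, while $\eta_1(n_1) < 1$ if $\lim_N n_1/N \neq 0$. If so, the upper bound
$$E(n) \leq N \Big( 2^{-(N-1)(1 - \eta_1(n_1))} + 2^{-(N-1)(1 - \eta_2)} \Big) \tag{30}$$
implies $N |\langle C_N, W_N(n) C_N\rangle|^2 \to 0$ exponentially with $N \to \infty$. The condition for which $\eta_1(n_1) < 1$ is fulfilled when $|x_1 - y_1| > \delta$; in fact, $n = [N y] - [N x]$ and $\lim_N ([N x_1] - [N y_1])/N = x_1 - y_1$. On the other hand, if $x_1 = y_1$ and $n_2 = [N x_2] - [N y_2] \neq 0$, one explicitly computes
$$N |\langle C_N, W_N((0, n_2)) C_N\rangle|^2 = N \Big( \cos^2 \frac{\pi n_2}{N} \Big)^{N-1} . \tag{31}$$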
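The closed form (31) can be compared with a direct evaluation of the overlap. The sketch below is an illustration only: it uses the components (19) of the fundamental vector and the action (13) of the Weyl operators, with $u = v = 0$ by default (the modulus does not depend on these phases); the function names and sample sizes are assumptions made for the example.

```python
import cmath
import math

def fundamental_vector(N):
    """Components C_N(j) = 2^{-(N-1)/2} sqrt(binom(N-1, j)) of Eq. (19)."""
    return [math.sqrt(math.comb(N - 1, j) / 2 ** (N - 1)) for j in range(N)]

def overlap(N, n, u=0.0, v=0.0):
    """<C_N, W_N(n) C_N>, using the action (13) of W_N(n) on the basis |j>."""
    n1, n2 = n
    C = fundamental_vector(N)
    pref = cmath.exp(1j * math.pi * (-n1 * n2 + 2 * n1 * u + 2 * n2 * v) / N)
    return pref * sum(
        C[(j + n1) % N] * C[j] * cmath.exp(-2j * math.pi * j * n2 / N)
        for j in range(N)
    )

def localization(N, n):
    """The quantity N |<C_N, W_N(n) C_N>|^2 controlled by property (3.1.4)."""
    return N * abs(overlap(N, n)) ** 2

def closed_form(N, n2):
    """Right-hand side of Eq. (31): N (cos^2(pi n2 / N))^(N - 1)."""
    return N * math.cos(math.pi * n2 / N) ** (2 * (N - 1))

norm_sq = sum(c * c for c in fundamental_vector(21))   # normalization of |C_N>
direct = localization(21, (0, 3))
formula = closed_form(21, 3)
```

Keeping $n_2/N$ fixed (for instance $n_2 = N/4$) and increasing $N$ shows the rapid decay of `localization(N, (0, n2))`.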
Again, the expression in (31) goes exponentially fast to zero if $\lim_N n_2/N \neq 0$, which is the case if $x_2 \neq y_2$.

5. Quantum and Classical Time Evolutions

One of the main issues in the semi-classical analysis is to determine whether and how the quantum and classical time evolutions mimic each other when a quantization parameter goes to zero. In the case of classically chaotic quantum systems, the situation is strikingly different from the case of classically integrable quantum systems. In the former case, classical and quantum mechanics agree on the level of coherent states only over times which scale as $-\log\hbar$. As before, let $T$ denote the evolution on the classical phase space $\mathcal{X}$ and $U_T$ the unitary single step evolution on $\mathbb{C}^N$. We formally impose the relation between the classical and quantum evolution on the level of coherent states through:

Condition 5.1 (Dynamical Localization). There exists an $\alpha > 0$ such that for all choices of $\varepsilon > 0$ and $d_0 > 0$ there exists an $N_0 \in \mathbb{N}$ with the following property: if $N > N_0$ and $k \leq \alpha \log N$, then $N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \leq \varepsilon$ whenever $d(T^k x, y) \geq d_0$.

Remark. The condition of dynamical localization is what is expected of a good choice of coherent states, namely, on a time scale logarithmic in the inverse of the semi-classical parameter, evolving coherent states should stay localized around the classical trajectories. Informally, when $N \to \infty$, the quantities
$$K_k(x, y) := \langle U_T^k C_N(x), C_N(y)\rangle \tag{32}$$
should behave as if $N |K_k(x, y)|^2 \simeq \delta(T^k x - y)$. The constraint $k \leq \alpha \log N$ is typical of hyperbolic classical behavior and comes heuristically as follows. The maximal localization of coherent states cannot exceed the minimal coarse-graining dictated by $1/N$; if, while evolving, coherent states stayed localized forever around the classical trajectories, they would get more and more localized along the contracting direction. Since for hyperbolic systems the increase of localization is exponential with Lyapounov exponent $\lambda_{\mathrm{Lyap}} > 0$, this sets the upper bound and indicates that $\alpha \simeq 1/\lambda_{\mathrm{Lyap}}$.

Proposition 5.1. Let $(M_N, \Theta_N, \tau_N)$ be a general quantum dynamical system as defined in Sec. 3 and suppose that it satisfies Condition 5.1. Let $\|X\|_2 := \sqrt{\tau_N(X^* X)}$, $X \in M_N$, denote the normalized Hilbert–Schmidt norm. In the ensuing topology
$$\lim_{\substack{k, N \to \infty \\ k < \alpha \log N}} \|\Theta_N^k \circ \gamma_{N\infty}(f) - \gamma_{N\infty} \circ \Theta^k(f)\|_2 = 0 . \tag{33}$$
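Proof. One computes
$$\begin{aligned} \|\Theta_N^k \circ \gamma_{N\infty}(f) - \gamma_{N\infty} \circ \Theta^k(f)\|_2^2 &= 2N \int_{\mathcal{X}} \mu(dx) \int_{\mathcal{X}} \mu(dy)\, \overline{f(x)} f(y)\, |\langle C_N(x), C_N(y)\rangle|^2 \\ &\quad - 2N\, \Re e \int_{\mathcal{X}} \mu(dx) \int_{\mathcal{X}} \mu(dy)\, \overline{f(y)} f(T^k x)\, |\langle U_T^k C_N(x), C_N(y)\rangle|^2 . \end{aligned} \tag{34}$$
The double integral in the first term goes to $\int \mu(dx) |f(x)|^2$. So, we need to show that the second integral, which we shall denote by $I_N(k)$, does the same. We will concentrate on the case of continuous $f$; the extension to essentially bounded $f$ is straightforward. Explicitly, selecting a ball $B(T^k x, d_0)$, one derives
$$\begin{aligned} \Big| I_N(k) - \int_{\mathcal{X}} \mu(dy) |f(y)|^2 \Big| &= \Big| \int_{\mathcal{X}} \mu(dx) \int_{\mathcal{X}} \mu(dy)\, \overline{f(y)} (f(T^k x) - f(y))\, N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \Big| \\ &\leq \Big| \int_{\mathcal{X}} \mu(dx) \int_{B(T^k x, d_0)} \mu(dy)\, \overline{f(y)} (f(T^k x) - f(y))\, N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \Big| \\ &\quad + \Big| \int_{\mathcal{X}} \mu(dx) \int_{\mathcal{X} \setminus B(T^k x, d_0)} \mu(dy)\, \overline{f(y)} (f(T^k x) - f(y))\, N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \Big| . \end{aligned}$$
Applying the mean value theorem and approximating the integral of the kernel as in the proof of Proposition 3.2, we get that $\exists\, c \in B(T^k x, d_0)$ such that
$$\begin{aligned} \Big| I_N(k) - \int_{\mathcal{X}} \mu(dy) |f(y)|^2 \Big| &\leq \Big| \int_{\mathcal{X}} \mu(dx)\, \overline{f(c)} (f(T^k x) - f(c))\, N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \Big| \\ &\quad + \Big| \int_{\mathcal{X}} \mu(dx) \int_{\mathcal{X} \setminus B(T^k x, d_0)} \mu(dy)\, \overline{f(y)} (f(T^k x) - f(y))\, N |\langle U_T^k C_N(x), C_N(y)\rangle|^2 \Big| . \end{aligned}$$
By uniform continuity we can bound the first term by some arbitrarily small $\varepsilon$, provided we choose $d_0$ small enough. Now, for the second integral we use our localization Condition 5.1. As the constraint $k \leq \alpha \log N$ has to be enforced, we have to take a joint limit of time and size of the system with this constraint. In that case the second integral can also be bounded by an arbitrarily small $\varepsilon'$, provided $N$ is large enough.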
We shall not prove the dynamical localization condition 5.1 for the quantum cat maps but instead provide a direct derivation of formula (33) based on the simple expression (11) of the dynamics when acting on Weyl operators. For this reason, we introduce the Weyl quantization:

Definition 5.1. Let $f$ be a function in $L^\infty_\mu(\mathbb{T})$, $\mathbb{T}$ denoting the two-dimensional torus, whose Fourier series $\hat{f}$ has only finitely many non-zero terms. We shall denote by $\mathrm{Supp}(\hat{f})$ the support of $\hat{f}$ in $\mathbb{Z}^2$. Then, in the Weyl quantization scheme, one associates to $f$ the $N \times N$ matrix
$$W_N(f) := \sum_{k \in \mathrm{Supp}(\hat{f})} \hat{f}(k)\, W_N(k) .$$

Our aim is to prove:

Proposition 5.2. Let $(M_N, \Theta_N, \tau_N)$ be a sequence of quantum cat maps tending with $N \to \infty$ to a classical cat map with Lyapounov exponent $\log\lambda$; then
$$\lim_{\substack{k, N \to \infty \\ k < \log N / (2 \log\lambda)}} \|\Theta_N^k \circ \gamma_{N\infty}(f) - \gamma_{N\infty} \circ \Theta^k(f)\|_2 = 0 ,$$
where $\|\cdot\|_2$ is the Hilbert–Schmidt norm of Proposition 5.1.

First we prove an auxiliary result.

Lemma 5.1. If $n = (n_1, n_2) \in \mathbb{Z}^2$ is such that $0 \leq n_i \leq N - 1$ and $\lim_N n_i/\sqrt{N} = 0$, then the expectation of Weyl operators $W_N(n)$ with respect to the state $|C_N\rangle$ given in (19) is such that
$$\lim_{N\to\infty} \langle C_N, W_N(n) C_N\rangle = 1 .$$
Proof. The idea of the proof is to use the fact that, for large $N$, the binomial coefficients $\binom{N-1}{j}$ contribute to the binomial sum only when $j$ stays within a neighborhood of $(N - 1)/2$ of width $\simeq \sqrt{N}$, in which case they can be approximated by a normalized Gaussian function. We also notice that, by expanding the exponents in the bounds (30) and (31), the exponential decay fails only if $n_{1,2}$ grow with $N$ slower than $\sqrt{N}$, which is surely the case for fixed finite $n$, whereby it also follows that we can disregard the second term in the sum comprising the contributions (21). We then write the $j$'s in the binomial coefficients as
$$j = \Big[\frac{N-1}{2}\Big] + k = \frac{N-1}{2} + k - \alpha , \qquad \alpha \in \Big\{0, \frac{1}{2}\Big\} ,$$
and consider only $k = O(\sqrt{N})$. Stirling's formula
$$L! = L^{L+1/2} e^{-L} \sqrt{2\pi}\, (1 + O(L^{-1}))$$
allows us to rewrite the first term in the right-hand side of (21) as
$$\exp\Big(-\frac{1}{2} \frac{n_1^2}{N-1}\Big) \sum_{k=-\left[\frac{N-1}{2}\right]}^{N-1-\left[\frac{N-1}{2}\right]+n_1} \frac{2\, e^{\frac{2\pi i}{N} n_2 \left(k + \left[\frac{N-1}{2}\right]\right)}}{\sqrt{2\pi (N-1)}} \exp\Big(-\frac{2 (k - \alpha + \frac{n_1}{2})^2}{N-1}\Big) \Big(1 + O(N^{-1}) + O((k + n_1)^3 N^{-2})\Big) . \tag{35}$$
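For any fixed, finite $n$, both the sum and the factor in front tend to 1, the sum becoming the integral of a normalized Gaussian.

Proof of Proposition 5.2. Given $f \in L^\infty_\mu(\mathcal{X})$ and $\varepsilon > 0$, we choose $N_0$ such that the Fourier approximation $f_\varepsilon$ of $f$ with $\#(\mathrm{Supp}(\hat{f}_\varepsilon)) = N_0$ satisfies $\|f - f_\varepsilon\| \leq \varepsilon$, where $\|\cdot\|$ denotes the usual Hilbert space norm. Next, we estimate
$$\begin{aligned} I_N(f) &:= \|\Theta_N^k \circ \gamma_{N\infty}(f) - \gamma_{N\infty} \circ \Theta^k(f)\|_2 \\ &\leq \|\Theta_N^k \circ \gamma_{N\infty}(f - f_\varepsilon)\|_2 + \|\gamma_{N\infty} \circ \Theta^k(f - f_\varepsilon)\|_2 + \|\Theta_N^k \circ \gamma_{N\infty}(f_\varepsilon) - \gamma_{N\infty} \circ \Theta^k(f_\varepsilon)\|_2 \\ &\leq 2 \|f - f_\varepsilon\| + I_N(f_\varepsilon) . \end{aligned}$$
This follows from $\Theta_N$-invariance of the norm $\|\cdot\|_2$, from $T$-invariance of the measure $\mu$ and from the fact that the positivity inequality for unital completely positive maps such as $\gamma_{N\infty}$ gives:
$$\|\gamma_{N\infty}(g)\|_2^2 = \tau_N\big(\gamma_{N\infty}(g)^* \gamma_{N\infty}(g)\big) \leq \tau_N\big(\gamma_{N\infty}(|g|^2)\big) = \int_{\mathbb{T}} dx\, |g|^2(x) = \|g\|^2 .$$
We now use that $f_\varepsilon$ is a function with finitely supported Fourier transform and, inserting the Weyl quantization of $f_\varepsilon$, we estimate
$$I_N(f_\varepsilon) \leq \|\gamma_{N\infty}(f_\varepsilon) - W_N(f_\varepsilon)\|_2 + \|\gamma_{N\infty} \circ \Theta^k(f_\varepsilon) - \Theta_N^k(W_N(f_\varepsilon))\|_2 . \tag{36}$$
Then, we concentrate on the square of the second term, which we denote by $G_{N,k}(f_\varepsilon)$ and which explicitly reads
$$\begin{aligned} G_{N,k}(f_\varepsilon) &= \tau_N\big(\gamma_{N\infty} \circ \Theta^k(f_\varepsilon)^*\, \gamma_{N\infty} \circ \Theta^k(f_\varepsilon)\big) + \tau_N\big(W_N(f_\varepsilon)^* W_N(f_\varepsilon)\big) \\ &\quad - 2\, \Re e \big( \tau_N\big(\gamma_{N\infty} \circ \Theta^k(f_\varepsilon)^*\, \Theta_N^k(W_N(f_\varepsilon))\big) \big) . \end{aligned} \tag{37}$$
The first term tends to $\|f_\varepsilon\|^2$ as $N \to \infty$, because of Proposition 3.2, and the same is true of the second term; indeed,
$$\tau_N\big(W_N(f_\varepsilon)^* W_N(f_\varepsilon)\big) = \sum_{k, q \in \mathrm{Supp}(\hat{f}_\varepsilon)} \overline{\hat{f}_\varepsilon(k)}\, \hat{f}_\varepsilon(q)\, e^{\frac{i\pi}{N} \sigma(q, k)}\, \tau_N(W_N(q - k)) .$$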
Now, since $\mathrm{Supp}(\hat{f}_\varepsilon)$ is finite, the vector $k - q$ is uniformly bounded with respect to $N$. Therefore, with $N$ large enough, (14) forces $k = q$, whence the claim. It remains to show that the same holds for the third term in (37), which amounts to twice the real part of
$$\int_{\mathbb{T}} dx\, \overline{f_\varepsilon(A^{-k} x)}\, \langle C_N(x), \Theta_N^k(W_N(f_\varepsilon)) C_N(x)\rangle = \sum_{p \in \mathrm{Supp}(\hat{f}_\varepsilon)} \hat{f}_\varepsilon(p)\, \langle C_N, W_N(A^k p) C_N\rangle \int_{\mathbb{T}} dx\, \overline{f_\varepsilon(A^{-k} x)} \exp\Big(\frac{2\pi i}{N} \sigma(A^k p, [N x])\Big) .$$
According to Lemma 5.1, the matrix element $\langle C_N, W_N(A^k p) C_N\rangle$ tends to 1 as $N \to \infty$ whenever the vectorial components $(A^k p)_j$, $j = 1, 2$, satisfy
$$\lim_N \frac{(A^k p)_j^2}{N} = C_u(p) (u)_j \lim_N \frac{\lambda^{2k}}{N} = 0 ,$$
where we expanded $p = C_u(p) u + C_v(p) v$ along the stretching and squeezing eigendirections of $A$ (see Definition 4.1). This fact sets the logarithmic time scale $k < \frac{1}{2} \frac{\log N}{\log\lambda}$. Notice that, when $k = 0$, $G_{N,k}(f_\varepsilon)$ equals the first term in (36) and this concludes the proof.
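To get a feeling for the logarithmic time scale $k < \frac{1}{2} \log N / \log\lambda$, the sketch below evaluates it for a concrete case and compares it with the growth of $(A^k p)_j^2 / N$. The choice of the Arnold cat map, of $N = 10^6$ and of $p = (1, 0)$, as well as the function names, are assumptions made only for this illustration.

```python
import math

def lyapunov_exponent(A):
    """log(lambda) for an integer matrix with ad - cb = 1 and |a + d| > 2."""
    (a, b), (c, d) = A
    t = abs(a + d)
    if a * d - c * b != 1 or t <= 2:
        raise ValueError("not a hyperbolic automorphism of the torus")
    return math.log((t + math.sqrt(t * t - 4)) / 2)

def breaking_time(N, A):
    """Upper bound log N / (2 log lambda) on the number of time steps k."""
    return math.log(N) / (2 * lyapunov_exponent(A))

def evolve(p, A, k):
    """Exact integer vector A^k p (no reduction modulo N)."""
    (a, b), (c, d) = A
    p1, p2 = p
    for _ in range(k):
        p1, p2 = a * p1 + b * p2, c * p1 + d * p2
    return (p1, p2)

A = ((2, 1), (1, 1))      # Arnold cat map, illustrative choice
N = 10 ** 6
k_max = breaking_time(N, A)

# (A^k p)_j^2 / N stays small only while k is below the breaking time.
ratios = [max(evolve((1, 0), A, k)) ** 2 / N for k in range(11)]
```

For these values the ratio $(A^k p)_1^2/N$ is still below 1 at $k = 7$ and exceeds 1 at $k = 8$, in line with a breaking time slightly above 7.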
Remark. The previous result essentially points to the fact that the time evolution and the classical limit do commute over time scales that are logarithmic in the semi-classical parameter $N$. The upper bound of this time, which goes like $\mathrm{const.} \times \frac{\log N}{\log\lambda}$, is typical of quantum chaos and is known as logarithmic breaking-time. Such a scaling has been found numerically in [9] also for discrete classical cat maps, converging in a suitable classical limit to continuous cat maps.

6. Dynamical Entropies

Intuitively, one expects the instability proper to the presence of a positive Lyapounov exponent to correspond to some degree of unpredictability of the dynamics: classically, the metric entropy of Kolmogorov provides the link [16]. In the usual setting, one considers partitions $\mathcal{C} = \{C_0, C_1, \ldots, C_{q-1}\}$ of the phase space $\mathcal{X}$ into finitely many measurable disjoint subsets $C_j$ (atoms). Under the dynamics $T$, $\mathcal{C}$ evolves into another finite partition $T(\mathcal{C}) := \{T^{-1}(C_0), T^{-1}(C_1), \ldots, T^{-1}(C_{q-1})\}$. Moreover, by intersecting atoms of partitions at different times one gets disjoint atoms
$$C_i := \bigcap_{j=0}^{k-1} T^j(C_{i_j}) \quad \text{for } i = (i_0, i_1, \ldots, i_{k-1}) ,$$
which constitute the refined partition
$$\mathcal{C}^{(k)} := \bigvee_{j=0}^{k-1} T^j(\mathcal{C}) .$$
Given the invariant measure $\mu$ on $\mathcal{X}$, the probability for the system to belong to the atoms $C_{i_0}, C_{i_1}, \ldots, C_{i_{k-1}}$ at the successive times $0 \leq j \leq k - 1$ is $\mu(C_i)$. In terms of symbolic dynamics, one gets a stationary stochastic process. This amounts to a right-shift along a classical spin half-chain with respect to a translation-invariant state. At each site of the half-chain, one has the state space $\{0, 1, \ldots, q - 1\}$. The atom $C_i$ of the refined partition $\mathcal{C}^{(k)}$ is identified with the local configurations $i \in \{0, 1, \ldots, q - 1\}^k$ and has a weight
$$\mu_{\mathcal{C}^{(k)}}(i) := \mu(C_i) .$$
The local states $\mu_{\mathcal{C}^{(k)}}$ are compatible and define a global state on the set of extended configurations $\{0, 1, \ldots, q - 1\}^{\mathbb{N}}$. Such a state is invariant under the right-shift and has a well-defined mean entropy
$$h^{\mathrm{KS}}_\mu(T, \mathcal{C}) := \lim_{k\to\infty} \frac{1}{k} S(\mu_{\mathcal{C}^{(k)}}) , \tag{38}$$
where, for a discrete measure $\lambda$, $S(\lambda) := -\sum_j \lambda_j \log\lambda_j$. The entropy density (38) is also interpretable as average entropy production. It consistently measures how predictable the dynamics is on the coarse grained scale provided by the finite partition $\mathcal{C}$. Then, removal of the dependence on finite partitions leads to

Definition 6.1. The KS-entropy of a classical dynamical system $(\mathcal{X}, T, \mu)$ is
$$h^{\mathrm{KS}}_\mu(T) := \sup_{\mathcal{C}} h^{\mathrm{KS}}_\mu(T, \mathcal{C}) .$$

For the automorphisms of the 2-torus, we have the well-known result [20]:

Proposition 6.1. Let $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ be as in Definition 4.1, then $h^{\mathrm{KS}}_\mu(T) = \log\lambda$.

The idea behind the notion of dynamical entropy is that information can be obtained by repeatedly observing a system in the course of its time evolution. Due to the uncertainty principle, or, in other words, to non-commutativity, if observations are intended to gather information about the intrinsic dynamical properties of quantum systems, then non-commutative extensions of the KS-entropy ought first to decide whether quantum disturbances produced by observations have to be taken into account or not. Concretely, let us consider a quantum system described by a density matrix $\rho$ acting on a Hilbert space $\mathcal{H}$. Via the wave packet reduction postulate, generic measurement processes may reasonably well be described by finite sets $\mathcal{Y} = \{y_0, y_1, \ldots, y_{q-1}\}$ of bounded operators $y_j \in \mathcal{B}(\mathcal{H})$ such that $\sum_j y_j^* y_j = \mathbb{1}$. These sets are called partitions of unity and describe the change in the state of the system caused by the corresponding measurement process:
$$\rho \mapsto \Gamma^*_{\mathcal{Y}}(\rho) := \sum_j y_j\, \rho\, y_j^* . \tag{39}$$
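The state change (39) and the entropy $S(\lambda)$ can be illustrated on a single qubit. The sketch below is a minimal example under explicit assumptions: the partition of unity consists of the two projectors onto the basis vectors, the initial state is the pure state with all entries $1/2$, and the helper names are hypothetical.

```python
import math

def shannon_entropy(weights):
    """S(lambda) = -sum_j lambda_j log lambda_j, with the convention 0 log 0 = 0."""
    return -sum(w * math.log(w) for w in weights if w > 0)

def dagger(y):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(y)
    return [[complex(y[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_partition_of_unity(Y, tol=1e-12):
    """Check sum_j y_j^* y_j = identity."""
    n = len(Y[0])
    total = [[0j] * n for _ in range(n)]
    for y in Y:
        prod = matmul(dagger(y), y)
        total = [[total[i][j] + prod[i][j] for j in range(n)] for i in range(n)]
    return all(abs(total[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

def reduce_state(rho, Y):
    """Eq. (39): rho -> sum_j y_j rho y_j^*."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for y in Y:
        term = matmul(matmul(y, rho), dagger(y))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out

# Illustrative data: projectors onto |0> and |1>, and a pure superposition state.
Y = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
rho = [[0.5, 0.5], [0.5, 0.5]]
reduced = reduce_state(rho, Y)                                   # off-diagonal terms are removed
entropy = shannon_entropy([reduced[0][0].real, reduced[1][1].real])
```

In this example the measurement turns a pure state into the maximally mixed one, so the entropy of the diagonal weights rises from 0 to $\log 2$: an instance of the extrinsic randomness that measurements can introduce.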
It looks rather natural to rely on partitions of unity to describe the process of collecting information through repeated observations of an evolving quantum system [3]. Yet, most of these measurements interfere with the quantum evolution,
possibly acting as a source of unwanted extrinsic randomness. Nevertheless, the effect is typically quantal and rarely avoidable. Quite interestingly, as we shall see later, pursuing these ideas leads to quantum stochastic processes with a quantum dynamical entropy of their own, the ALF-entropy, that is also useful in a classical context. An alternative approach [12] leads to the CNT-entropy. This approach lacks the operational appeal of the ALF-construction, but is intimately connected with the intrinsic relaxation properties of quantum systems [12, 22] and possibly useful in the rapidly growing field of quantum communication. The CNT-entropy is based on decomposing quantum states rather than on reducing them as in (39). Explicitly, if the state $\rho$ is not a one-dimensional projection, any partition of unity $\mathcal{Y}$ yields a decomposition
$$\rho = \sum_j \mathrm{Tr}(\rho\, y_j^* y_j)\, \frac{\sqrt{\rho}\, y_j^* y_j \sqrt{\rho}}{\mathrm{Tr}(\rho\, y_j^* y_j)} . \tag{40}$$
When $\Gamma^*_{\mathcal{Y}}(\rho) = \rho$, reductions also provide decompositions, but not in general.

6.1. CNT-entropy

The CNT-entropy is based on decomposing quantum states into convex linear combinations of other states. The information content attached to the quantum dynamics is not based on modifications of the quantum state or on perturbations of the time evolution. Let $(\mathcal{M}, \Theta, \omega)$ represent a quantum dynamical system in the algebraic setting and assume $\omega$ to be decomposable. The construction runs as follows.

• Classical partitions are replaced by finite dimensional C*-algebras $\mathcal{N}$ with identity embedded into $\mathcal{M}$ by completely positive, unity preserving (cpu) maps $\gamma : \mathcal{N} \to \mathcal{M}$. Given $\gamma$, consider the cpu maps $\gamma_\ell := \Theta^\ell \circ \gamma$ that result from successive iterations of the dynamical automorphism $\Theta$, and associate to each of them an index set $I_\ell$. These index sets $I_\ell$ will be coupled to the cpu maps $\gamma_\ell$ through the variational problem (43).
• If $0 \leq \ell < k$ then consider multi-indices $i = (i_0, i_1, \ldots, i_{k-1}) \in I^{(k)} := I_0 \times \cdots \times I_{k-1}$ as labels of states $\omega_i$ on $\mathcal{M}$ and of weights $0 < \mu_i < 1$ such that $\sum_i \mu_i = 1$ and $\omega = \sum_i \mu_i \omega_i$. These states are given by elements $0 \leq x'_i \in \mathcal{M}'$, the commutant of $\mathcal{M}$, such that $\sum_i x'_i = \mathbb{1}_N$. Explicitly
$$y \in \mathcal{M} \mapsto \omega_i(y) := \frac{\omega(x'_i y)}{\omega(x'_i)} , \qquad \mu_i := \omega(x'_i) . \tag{41}$$
The decomposition has to be done with elements $x'$ in the commutant in order to ensure the positivity of the expectations $\omega_i$.
• From $\omega = \sum_i \mu_i \omega_i$, one obtains subdecompositions $\omega = \sum_{i_\ell \in I_\ell} \mu^\ell_{i_\ell} \omega^\ell_{i_\ell}$, where
$$\omega^\ell_{i_\ell} := \sum_{\substack{i \\ i_\ell \text{ fixed}}} \frac{\mu_i}{\mu^\ell_{i_\ell}}\, \omega_i \quad \text{and} \quad \mu^\ell_{i_\ell} := \sum_{\substack{i \\ i_\ell \text{ fixed}}} \mu_i . \tag{42}$$
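• Since $\mathcal{N}$ is finite dimensional, the states $\omega \circ \Theta^\ell \circ \gamma = \omega \circ \gamma$ and $\omega^\ell_{i_\ell} \circ \Theta^\ell \circ \gamma$ have finite von Neumann entropies $S(\omega \circ \gamma)$ and $S(\omega^\ell_{i_\ell} \circ \Theta^\ell \circ \gamma)$. With $\eta(x) := -x \log x$ if $0 < x \leq 1$ and $\eta(0) = 0$, one defines the $k$-subalgebra functional
$$H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) := \sup_{\omega = \sum_i \mu_i \omega_i} \Bigg\{ \sum_i \eta(\mu_i) - \sum_{\ell=0}^{k-1} \sum_{i_\ell \in I_\ell} \eta(\mu^\ell_{i_\ell}) + \sum_{\ell=0}^{k-1} \Big( S(\omega \circ \gamma_\ell) - \sum_{i_\ell \in I_\ell} \mu^\ell_{i_\ell}\, S(\omega^\ell_{i_\ell} \circ \gamma_\ell) \Big) \Bigg\} . \tag{43}$$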
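We list a number of properties of $k$-subalgebra functionals, see [12], that will be used in the sequel:

• positivity: $0 \leq H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1})$
• subadditivity: $H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) \leq H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{\ell-1}) + H_\omega(\gamma_\ell, \gamma_{\ell+1}, \ldots, \gamma_{k-1})$
• time invariance: $H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) = H_\omega(\gamma_\ell, \gamma_{\ell+1}, \ldots, \gamma_{\ell+k-1})$
• boundedness: $H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) \leq k H_\omega(\gamma) \leq k S(\omega \circ \gamma)$
• The $k$-subalgebra functionals are invariant under interchange and repetitions of arguments:
$$H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) = H_\omega(\gamma_{k-1}, \ldots, \gamma_0, \gamma_0) . \tag{44}$$
• monotonicity: If $i_\ell : \mathcal{N}_\ell \to \mathcal{N}$, $0 \leq \ell \leq k - 1$, are cpu maps from finite dimensional algebras $\mathcal{N}_\ell$ into $\mathcal{N}$, then the maps $\tilde{\gamma}_\ell := \gamma \circ i_\ell$ are cpu and
$$H_\omega(\tilde{\gamma}_0, \Theta \circ \tilde{\gamma}_1, \ldots, \Theta^{k-1} \circ \tilde{\gamma}_{k-1}) \leq H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) . \tag{45}$$
• continuity: Let us consider for $\ell = 0, 1, \ldots, k - 1$ a set of cpu maps $\tilde{\gamma}_\ell : \mathcal{N} \to \mathcal{M}$ such that $\|\gamma_\ell - \tilde{\gamma}_\ell\|_\omega \leq \varepsilon$ for all $\ell$, where
$$\|\gamma_\ell - \tilde{\gamma}_\ell\|_\omega := \sup_{x \in \mathcal{N},\, \|x\| \leq 1} \sqrt{\omega\big((\gamma_\ell(x) - \tilde{\gamma}_\ell(x))^* (\gamma_\ell(x) - \tilde{\gamma}_\ell(x))\big)} . \tag{46}$$
Then by [12], there exists $\delta(\varepsilon) > 0$ depending on the dimension of the finite dimensional algebra $\mathcal{N}$ and vanishing when $\varepsilon \to 0$, such that
$$|H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) - H_\omega(\tilde{\gamma}_0, \tilde{\gamma}_1, \ldots, \tilde{\gamma}_{k-1})| \leq k\, \delta(\varepsilon) . \tag{47}$$

On the basis of these properties, one proves the existence of the limit
$$h^{\mathrm{CNT}}_\omega(\Theta, \gamma) := \lim_k \frac{1}{k} H_\omega(\gamma_0, \gamma_1, \ldots, \gamma_{k-1}) \tag{48}$$
and defines [12]:

Definition 6.2. The CNT-entropy of a quantum dynamical system $(\mathcal{M}, \Theta, \omega)$ is
$$h^{\mathrm{CNT}}_\omega(\Theta) := \sup_\gamma h^{\mathrm{CNT}}_\omega(\Theta, \gamma) .$$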
6.2. ALF-entropy
The idea underlying the ALF-entropy is that the evolution of a quantum dynamical system can be modelled by repeated measurements at successive equally spaced times, the measurements corresponding to partitions of unity as defined in Sec. 6, which we shall refer to as p.u. for short. Such a construction associates a quantum dynamical system with a symbolic dynamics corresponding to the right-shift along a quantum spin half-chain [29]. Generic p.u. $\mathcal{Y} = \{y_0, y_1, \ldots, y_{\ell-1}\}$ need not preserve the state, but disturbances are kept under control by suitably selecting the $y_j$. The construction of the ALF-entropy for a quantum dynamical system $(\mathcal{M}, \Theta, \omega)$ can be summarized as follows:
• One selects a sub-algebra $\mathcal{M}_0 \subseteq \mathcal{M}$ which is invariant under $\Theta$ and a p.u. $\mathcal{Y} = \{y_0, y_1, \ldots, y_{\ell-1}\}$ of finite size $\ell$ with $y_j \in \mathcal{M}_0$. After $j$ time steps $\mathcal{Y}$ will have evolved into another p.u. from $\mathcal{M}_0$:
\[ \Theta^j(\mathcal{Y}) := \{\Theta^j(y_0), \Theta^j(y_1), \ldots, \Theta^j(y_{\ell-1})\} \subset \mathcal{M}_0 . \]
• Every p.u. $\mathcal{Y}$ of size $\ell$ gives rise to an $\ell$-dimensional density matrix
\[ \rho[\mathcal{Y}]_{i,j} := \omega(y_j^* y_i) , \tag{49} \]
with von Neumann entropy $H_\omega[\mathcal{Y}] := S(\rho[\mathcal{Y}])$.
• Given two p.u. $\mathcal{Y} = \{y_0, y_1, \ldots, y_{\ell-1}\}$ and $\mathcal{Z} = \{z_0, z_1, \ldots, z_{k-1}\}$, of sizes $\ell$ and $k$, their ordered refinement is the size $\ell k$ p.u.
\[ \mathcal{Y} \circ \mathcal{Z} := \{y_0 z_0, y_0 z_1, \ldots, y_0 z_{k-1}, \ldots, y_{\ell-1} z_{k-1}\} . \tag{50} \]
• Given a size $\ell$ p.u. $\mathcal{Y}$ and the ordered time refinements
\[ \mathcal{Y}^{(k)} := \Theta^{k-1}(\mathcal{Y}) \circ \Theta^{k-2}(\mathcal{Y}) \circ \cdots \circ \mathcal{Y} , \tag{51} \]
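the density matrices $\rho_{\mathcal{Y}}^{(k)} := \rho[\mathcal{Y}^{(k)}]$ define states on the $k$-fold tensor product $\mathbf{M}_\ell^{\otimes k}$ of $\ell$-dimensional matrix algebras $\mathbf{M}_\ell$.
• Given a p.u. $\mathcal{Y}$ of size $\ell$, let $\Phi_{\mathcal{Y}} : \mathbf{M}_\ell \otimes \mathcal{M} \to \mathcal{M}$ and $e_M : \mathcal{M} \to \mathcal{M}$, with $M \in \mathbf{M}_\ell$, be linear maps defined by
\[ \Phi_{\mathcal{Y}}(M \otimes x) := \sum_{i,j} y_i^*\, x\, y_j\, M_{ij} \quad \text{and} \quad e_M(x) := \sum_{i,j} y_i^*\, \Theta(x)\, y_j\, M_{ij} . \tag{52} \]
$\Phi_{\mathcal{Y}}$ is a cpu map, while $e_{\mathbb{1}}(\mathbb{1}) = \mathbb{1}$. One readily computes
\[ \omega\big( e_{M_0} \circ e_{M_1} \circ \cdots \circ e_{M_{k-1}}(\mathbb{1}) \big) = \mathrm{Tr}\big( \rho_{\mathcal{Y}}^{(k)}\, M_0 \otimes M_1 \otimes \cdots \otimes M_{k-1} \big) . \]
The states $\rho_{\mathcal{Y}}^{(k)}$ are compatible and therefore define a global state $\omega_{\mathcal{Y}}$ on the quantum spin half-chain $\mathbf{M}_\ell^{\mathbb{N}}$, which is the uniform closure of $\bigcup_{n \in \mathbb{N}} \mathbf{M}_\ell^{\otimes n}$. Along the same lines as in Sec. 6, one associates with the quantum dynamical system $(\mathcal{M}, \Theta, \omega)$ the right shift $\sigma$ along the quantum spin half-chain. However, non-commutativity shows up in that $\omega_{\mathcal{Y}}$ is shift-invariant only if $\omega\big( \sum_{j=0}^{\ell-1} y_j^*\, x\, y_j \big) = \omega(x)$ for all $x \in \mathbf{M}_\ell^{\mathbb{N}}$. Note that this is the case when p.u. give rise to decompositions of $\omega$ as in the CNT-construction (compare (39) and (40)). This leads to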
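Definition 6.3. The ALF-entropy of a quantum dynamical system $(\mathcal{M}, \Theta, \omega)$ is
\[ h^{\mathrm{ALF}}_{(\omega, \mathcal{M}_0)}(\Theta) := \sup_{\mathcal{Y} \subset \mathcal{M}_0} h^{\mathrm{ALF}}_\omega(\Theta, \mathcal{Y}) \quad \text{with} \quad h^{\mathrm{ALF}}_\omega(\Theta, \mathcal{Y}) := \limsup_k \frac{1}{k} H_\omega[\mathcal{Y}^{(k)}] . \tag{53} \]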
6.3. Quantum dynamical entropies compared
In this section we outline some of the main features of both quantum dynamical entropies. The first thing to notice is that the CNT- and the ALF-entropy coincide with the KS-entropy when $\mathcal{M} = \mathcal{M}_\mu$ is the Abelian von Neumann algebra $L^\infty_\mu(\mathcal{X})$ and $(\mathcal{M}, \Theta, \omega)$ represents a classical dynamical system. The next observation is that when, as for the quantized hyperbolic automorphisms of the torus considered in this paper, $\mathcal{M}$ is a finite-dimensional algebra, both the CNT- and the ALF-entropy are zero, see [12, 3]. Consequently, if we decide to take the strict positivity of quantum dynamical entropies as a signature of quantum chaos, quantized hyperbolic automorphisms of the torus cannot be called chaotic. The complete proofs of the above facts can be found in [12] for the CNT-entropy and in [3, 6] for the ALF-entropy. Here, we just sketch them, emphasizing those parts that are important to the study of their classical limit.
Proposition 6.2. Let $(\mathcal{M}_\mu, \Theta, \omega_\mu)$ represent a classical dynamical system. Then, with the notations of the previous sections,
\[ h^{\mathrm{CNT}}_{\omega_\mu}(\Theta) = h^{\mathrm{KS}}_\mu(T) = h^{\mathrm{ALF}}_{(\omega_\mu, \mathcal{M}_\mu)}(\Theta) . \]
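Proof. CNT-Entropy: In this case, $h^{\mathrm{CNT}}_{\omega_\mu}(\Theta)$ is computable by using natural embeddings of finite dimensional subalgebras of $\mathcal{M}_\mu$ rather than generic cpu maps $\gamma$. Partitions $\mathcal{C} = \{C_0, C_1, \ldots, C_{n-1}\}$ of $\mathcal{X}$ can be identified with the finite dimensional subalgebras $\mathcal{N}_{\mathcal{C}} \subset \mathcal{M}_\mu$ generated by the characteristic functions $\chi_{C_j}$ of the atoms of the partition, with $\omega_\mu(\chi_C) = \mu(C)$. Also, the refinements $\mathcal{C}^{(k)}$ of the evolving partitions $T^{-j}(\mathcal{C})$ correspond to the subalgebras $\mathcal{N}_{\mathcal{C}}^{(k)}$ generated by $\chi_{C_i} = \prod_{j=0}^{k-1} \chi_{T^{-j}(C_{i_j})}$. Thus, if $\imath_{\mathcal{N}_{\mathcal{C}}}$ embeds $\mathcal{N}_{\mathcal{C}}$ into $\mathcal{M}_\mu$, then $\omega_\mu \circ \imath_{\mathcal{N}_{\mathcal{C}}}$ corresponds to the state $\omega_\mu \restriction \mathcal{N}_{\mathcal{C}}$, which is obtained by restriction of $\omega_\mu$ to $\mathcal{N}_{\mathcal{C}}$ and is completely determined by the expectation values $\omega_\mu(\chi_{C_j})$, $1 \le j \le n-1$. Further, identifying the cpu maps $\gamma_\ell = \Theta^\ell \circ \imath_{\mathcal{N}_{\mathcal{C}}}$ with the corresponding subalgebras $\Theta^\ell(\mathcal{N}_{\mathcal{C}})$, $h^{\mathrm{CNT}}_{\omega_\mu}(\Theta) = h^{\mathrm{KS}}_\mu(T)$ follows from
\[ H_\omega\big( \mathcal{N}_{\mathcal{C}}, \Theta(\mathcal{N}_{\mathcal{C}}), \ldots, \Theta^{k-1}(\mathcal{N}_{\mathcal{C}}) \big) = S_\mu(\mathcal{C}^{(k)}) , \qquad \forall\, \mathcal{C} , \tag{54} \]
see (38). In order to prove (54), we decompose the reference state as
\[ \omega_\mu = \sum_i \mu_i\, \omega_i \quad \text{with} \quad \omega_i(f) := \frac{1}{\mu_i} \int_{\mathcal{X}} \mu(dx)\, \chi_{C_i}(x)\, f(x) , \]
where $\mu_i = \mu(C_i)$, see (42). Then, $\sum_i \eta(\mu_i) = S_\mu(\mathcal{C}^{(k)})$.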
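On the other hand,
\[ \omega^\ell_{i_\ell}(f) = \frac{1}{\mu^\ell_{i_\ell}} \int_{\mathcal{X}} \mu(dx)\, \chi_{T^{-\ell}(C_{i_\ell})}(x)\, f(x) \quad \text{and} \quad \mu^\ell_{i_\ell} = \mu(C_{i_\ell}) . \]
It follows that $\omega_\mu \circ \imath_{\mathcal{N}_{\mathcal{C}}} = \omega_\mu \restriction \mathcal{N}_{\mathcal{C}}$ is the discrete measure $\{\mu^\ell_0, \mu^\ell_1, \ldots, \mu^\ell_{n-1}\}$ for all $\ell = 0, 1, \ldots, k-1$ and, finally, that $S(\omega^\ell_{i_\ell} \circ \gamma_\ell) = 0$, as $\omega^\ell_{i_\ell} \circ \gamma_\ell = \omega^\ell_{i_\ell} \restriction \Theta^\ell(\mathcal{N}_{\mathcal{C}})$ is a discrete measure with values 0 and 1.
ALF-Entropy: One expects that (53), computed over all possible p.u. from $\mathcal{M}_\mu$, should equal $h^{\mathrm{KS}}_\mu(T)$. Notice, however, that, even if the dynamical system is classical, (53) still has to be computed within the non-commutative setting of density matrices as in (49). In [6], it is shown that $h^{\mathrm{ALF}}_{(\omega_\mu, \mathcal{M}_\mu)}(\Theta) = h^{\mathrm{KS}}_\mu(T)$. In the particular case of the hyperbolic automorphisms of the torus, we may restrict our attention to p.u. whose elements belong to the ∗-algebra $\mathcal{D}_\mu$ of complex functions $f$ on $\mathbb{T}$ such that the support of $\hat f$ is bounded:
\[ h^{\mathrm{KS}}_\mu(T) = h^{\mathrm{ALF}}_{(\omega_\mu, \mathcal{M}_\mu)}(\Theta) = h^{\mathrm{ALF}}_{(\omega_\mu, \mathcal{D}_\mu)}(\Theta) . \]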
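Remarkably, the computation of the classical KS-entropy via the quantum mechanical ALF-entropy yields a proof of Proposition 6.1 that is much simpler than the standard ones [7, 31].
Proposition 6.3. Let $(\mathcal{M}, \Theta, \omega)$ be a quantum dynamical system with $\mathcal{M}$ a finite dimensional C*-algebra. Then
\[ h^{\mathrm{CNT}}_\omega(\Theta) = 0 \quad \text{and} \quad h^{\mathrm{ALF}}_{(\omega, \mathcal{M})}(\Theta) = 0 . \]
Proof. CNT-Entropy: As in the commutative case, $h^{\mathrm{CNT}}_\omega(\Theta)$ is computable by means of cpu maps $\gamma$ that are the natural embeddings $\imath_{\mathcal{N}}$ of subalgebras $\mathcal{N} \subseteq \mathcal{M}$ into $\mathcal{M}$. Since each $\Theta^\ell(\mathcal{N})$ is obviously contained in the algebra $\mathcal{N}^{(k)} \subseteq \mathcal{M}$ generated by the subalgebras $\Theta^j(\mathcal{N})$, $j = 0, 1, \ldots, k-1$, from the properties of the $k$-subalgebra functionals $H$ and identifying again the natural embeddings $\gamma_\ell := \Theta^\ell \circ \imath_{\mathcal{N}}$ with the subalgebras $\Theta^\ell(\mathcal{N}) \subseteq \mathcal{M}$, we derive
\[ H_\omega\big( \mathcal{N}, \Theta(\mathcal{N}), \ldots, \Theta^{k-1}(\mathcal{N}) \big) \le H_\omega\big( \mathcal{N}^{(k)}, \mathcal{N}^{(k)}, \ldots, \mathcal{N}^{(k)} \big) \le H_\omega(\mathcal{N}^{(k)}) \le S(\omega \restriction \mathcal{N}^{(k)}) \le \log d , \]
where $\mathcal{M} \subseteq \mathbf{M}_d$. In fact, $\omega \restriction \mathcal{N}$ amounts to a density matrix with eigenvalues $\lambda_\ell$ and von Neumann entropy $S(\omega \restriction \mathcal{N}) = -\sum_{\ell=1}^{d} \lambda_\ell \log \lambda_\ell \le \log d$. Therefore, for all $\mathcal{N} \subseteq \mathcal{M}$, $h^{\mathrm{CNT}}_\omega(\Theta, \mathcal{N}) = 0$.
ALF-Entropy: Let the state $\omega$ on $\mathbf{M}_d$ be given by $\omega(x) = \mathrm{Tr}(\rho x)$, where $\rho$ is a density matrix in $\mathbf{M}_d$. Given a partition of unity $\mathcal{Y}$ of size $\ell$, the cpu map $\Phi_{\mathcal{Y}}$ in (52) can be used to define a state $\Phi^*_{\mathcal{Y}}(\rho)$ on $\mathbf{M}_\ell \otimes \mathcal{M}$ which is dual to $\omega$:
\[ \Phi^*_{\mathcal{Y}}(\rho)(M \otimes x) = \mathrm{Tr}\big( \rho\, \Phi_{\mathcal{Y}}(M \otimes x) \big) , \qquad M \in \mathbf{M}_\ell , \ x \in \mathcal{M} . \]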
Since $\sum_{j=0}^{\ell-1} y_j^* y_j = \mathbb{1}$, it follows that $\Phi^*_{\mathcal{Y}}(\rho^k) = (\Phi^*_{\mathcal{Y}}(\rho))^k$. Therefore, $\rho$ and $\Phi^*_{\mathcal{Y}}(\rho)$ have the same spectrum, apart possibly from the eigenvalue zero, and thus the same von Neumann entropy. Moreover, $\Phi^*_{\mathcal{Y}}(\rho) \restriction \mathbf{M}_\ell = \rho[\mathcal{Y}]$ and $\Phi^*_{\mathcal{Y}}(\rho) \restriction \mathcal{M} = \Gamma^*_{\mathcal{Y}}(\rho)$ as in (39). Applying the triangle inequality for the entropy [25],
\[ S(\Phi^*_{\mathcal{Y}}(\rho)) \ge \big| S(\Phi^*_{\mathcal{Y}}(\rho) \restriction \mathbf{M}_\ell) - S(\Phi^*_{\mathcal{Y}}(\rho) \restriction \mathcal{M}) \big| , \]
one obtains $S(\rho[\mathcal{Y}]) \le 2 \log d$. Finally, as evolving p.u. $\Theta^j(\mathcal{Y})$ and their ordered refinements (50), (51) remain in $\mathcal{M}$, one gets
\[ \limsup_k \frac{1}{k} H_\omega[\mathcal{Y}^{(k)}] = 0 , \qquad \mathcal{Y} \subset \mathcal{M} . \]
From the above considerations, it is clear that the main field of application of the CNT- and ALF-entropies is infinite quantum systems, where the differences between the two come to the fore [4]. The former has been proved to be useful to connect randomness with clustering properties and asymptotic commutativity. A rather strong form of clustering and asymptotic Abelianness is necessary to have a non-vanishing CNT-entropy [22–24]. In particular, the infinite dimensional quantization of the automorphisms of the torus has vanishing CNT-entropy for most irrational values of the deformation parameter $\phi$, whereas, independently of the value of $\phi$, the ALF-entropy is always equal to the positive Lyapounov exponent. These results reflect the different perspectives upon which the two constructions are based.
7. Classical Limit of Quantum Dynamical Entropies
Proposition 6.3 confirms the intuition that finite dimensional, discrete time, quantum dynamical systems, however complicated the distribution of their quasienergies might be, cannot produce enough information over large times to generate a non-vanishing entropy per unit time. This is due to the fact that, despite the presence of almost random features over finite intervals, the time evolution cannot bear random signatures if watched long enough, because almost periodicity would always prevail asymptotically. In this section we take the CNT- and the ALF-entropy as good indicators of the degree of randomness of a quantum dynamical system. Then, we show that underlying classical chaos plus Hilbert space finiteness make a characteristic logarithmic time scale emerge over which these systems can be called chaotic.
7.1. CNT-entropy
Theorem 7.1.
Let $(\mathcal{X}, T, \mu)$ be a classical dynamical system which is the classical limit of a sequence of finite dimensional quantum dynamical systems $(\mathcal{M}_N, \Theta_N, \tau_N)$. We also assume that the dynamical localization condition 5.1 holds. If
(1) $\mathcal{C} = \{C_0, C_1, \ldots, C_{q-1}\}$ is a finite measurable partition of $\mathcal{X}$,
(2) $\mathcal{N}_{\mathcal{C}} \subset \mathcal{M}_\mu$ is the finite dimensional subalgebra generated by the characteristic functions $\chi_{C_j}$ of the atoms of $\mathcal{C}$,
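(3) $\imath_{\mathcal{N}_{\mathcal{C}}}$ is the natural embedding of $\mathcal{N}_{\mathcal{C}}$ into $\mathcal{M}_\mu = L^\infty_\mu(\mathcal{X})$, $\gamma_{N\infty}$ the anti-Wick quantization map and
\[ \gamma^\ell_{\mathcal{C}} := \Theta^\ell_N \circ \gamma_{N\infty} \circ \imath_{\mathcal{N}_{\mathcal{C}}} , \qquad \ell = 0, 1, \ldots, k-1 , \]
then there exists an $\alpha$ such that
\[ \lim_{\substack{k, N \to \infty \\ k \le \alpha \log N}} \frac{1}{k} \big| H(\gamma^0_{\mathcal{C}}, \gamma^1_{\mathcal{C}}, \ldots, \gamma^{k-1}_{\mathcal{C}}) - S_\mu(\mathcal{C}^{(k)}) \big| = 0 . \]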
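Proof. We split the proof in two parts:
(1) We relate the quantal evolution $\gamma^\ell_{\mathcal{C}} = \Theta^\ell_N \circ \gamma_{N\infty} \circ \imath_{\mathcal{N}_{\mathcal{C}}}$ to the classical evolution $\tilde\gamma^\ell_{\mathcal{C}} := \gamma_{N\infty} \circ \Theta^\ell \circ \imath_{\mathcal{N}_{\mathcal{C}}}$ using the continuity property of the entropy functional.
(2) We find an upper and a lower bound to the entropy functional that converge to the KS-entropy in the long time limit.
We define for convenience the algebra $\mathcal{N}^\ell_{\mathcal{C}} := \Theta^\ell(\mathcal{N}_{\mathcal{C}})$ and the algebra $\mathcal{N}^{(k)}_{\mathcal{C}}$ corresponding to the refinements $\mathcal{C}^{(k)} = \bigvee_{\ell=0}^{k-1} T^{-\ell}(\mathcal{C})$, which consist of atoms $C_i := \bigcap_{\ell=0}^{k-1} T^{-\ell}(C_{i_\ell})$ labeled by the multi-indices $i = (i_0, i_1, \ldots, i_{k-1})$. Thus the algebra $\mathcal{N}^{(k)}_{\mathcal{C}}$ is generated by the characteristic functions $\chi_{C_i}$.
Step 1
The maps $\gamma^\ell_{\mathcal{C}}$ and $\tilde\gamma^\ell_{\mathcal{C}}$ connect the quantum and classical time evolution. Indeed, using Proposition 5.1,
\[ k \le \alpha \log N \ \Rightarrow\ \| \Theta^k_N \circ \gamma_{N\infty} \circ \imath_{\mathcal{N}_{\mathcal{C}}}(f) - \gamma_{N\infty} \circ \Theta^k \circ \imath_{\mathcal{N}_{\mathcal{C}}}(f) \|_2 \le \varepsilon , \]
or
\[ k \le \alpha \log N \ \Rightarrow\ \| \gamma^k_{\mathcal{C}} - \tilde\gamma^k_{\mathcal{C}} \|_2 \le \varepsilon . \]
This in turn implies, due to strong continuity,
\[ \big| H(\gamma^0_{\mathcal{C}}, \gamma^1_{\mathcal{C}}, \ldots, \gamma^{k-1}_{\mathcal{C}}) - H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}}) \big| \le k\, \delta(\varepsilon) \]
with $\delta(\varepsilon) > 0$ depending on the dimension of the space $\mathcal{N}_{\mathcal{C}}$ and vanishing when $\varepsilon \to 0$. From now on we can concentrate on the classical evolution and benefit from its properties.
Step 2, upper bound
We now show that
\[ H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}}) \le S_\mu(\mathcal{C}^{(k)}) . \]
Notice that we can embed $\mathcal{N}^\ell_{\mathcal{C}}$ into $\mathcal{M}_\mu$ by first embedding it into $\mathcal{N}^{(k)}_{\mathcal{C}}$ with $\imath_{\mathcal{N}^\ell_{\mathcal{C}} \mathcal{N}^{(k)}_{\mathcal{C}}}$ and then embedding $\mathcal{N}^{(k)}_{\mathcal{C}}$ into $\mathcal{M}_\mu$ with $\imath_{\mathcal{N}^{(k)}_{\mathcal{C}}}$:
\[ \imath_{\mathcal{N}^\ell_{\mathcal{C}}} = \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \circ \imath_{\mathcal{N}^\ell_{\mathcal{C}} \mathcal{N}^{(k)}_{\mathcal{C}}} . \]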
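We now estimate:
\[ H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}}) = H\big( \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \circ \imath_{\mathcal{N}^0_{\mathcal{C}} \mathcal{N}^{(k)}_{\mathcal{C}}}, \ldots, \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \circ \imath_{\mathcal{N}^{k-1}_{\mathcal{C}} \mathcal{N}^{(k)}_{\mathcal{C}}} \big) \]
\[ \le H\big( \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}}, \ldots, \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \big) \le H\big( \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \big) \le S\big( \tau_N \circ \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}} \big) . \tag{55} \]
The first inequality follows from monotonicity of the entropy functional, the second from invariance under repetitions and the third from boundedness in terms of von Neumann entropies. The state $\tau_N \circ \gamma_{N\infty} \circ \imath_{\mathcal{N}^{(k)}_{\mathcal{C}}}$ takes the values
\[ \tau_N(\gamma_{N\infty}(\chi_{C_i})) = \tau_N\Big( N \int_{\mathcal{X}} \mu(dx)\, \chi_{C_i}(x)\, |C_N(x)\rangle\langle C_N(x)| \Big) = \int_{\mathcal{X}} \mu(dx)\, \chi_{C_i}(x)\, \langle C_N(x), C_N(x) \rangle = \omega_\mu(\chi_{C_i}) = \mu(C_i) . \]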
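This gives, together with $S(\mu(C_i)) = S_\mu(\mathcal{C}^{(k)})$, the desired upper bound.
Step 2, lower bound
We show that $\forall\, \varepsilon > 0$ there exists an $N$ such that
\[ H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}}) \ge S_\mu(\mathcal{C}^{(k)}) - k \varepsilon . \]
As $H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}})$ is defined as a supremum over decompositions of the state $\tau_N$, we can construct a lower bound by picking a good decomposition. Consider the decomposition $\tau_N = \sum_i \mu_i \omega_i$ with
\[ \omega_i : \mathcal{M}_N \ni x \mapsto \omega_i(x) := \frac{\tau_N\big( \gamma_{N\infty}(\chi_{C_i})\, x \big)}{\tau_N\big( \gamma_{N\infty}(\chi_{C_i}) \big)} , \qquad \mu_i := \tau_N\big( \gamma_{N\infty}(\chi_{C_i}) \big) , \]
and the subdecompositions $\tau_N = \sum_{j_\ell} \mu^\ell_{j_\ell}\, \omega^\ell_{j_\ell}$, $\ell = 0, 1, \ldots, k-1$, with
\[ \omega^\ell_{j_\ell} : \mathcal{M}_N \ni x \mapsto \omega^\ell_{j_\ell}(x) := \frac{\tau_N\big( \gamma_{N\infty}(\chi_{T^{-\ell}(C_{j_\ell})})\, x \big)}{\tau_N\big( \gamma_{N\infty}(\chi_{C_{j_\ell}}) \big)} , \qquad \mu^\ell_{j_\ell} := \tau_N\big( \gamma_{N\infty}(\chi_{C_{j_\ell}}) \big) . \]
In comparison with (41), it is not necessary to go to the commutant, for one can use the cyclicity property of the trace. We then have:
\[ H(\tilde\gamma^0_{\mathcal{C}}, \tilde\gamma^1_{\mathcal{C}}, \ldots, \tilde\gamma^{k-1}_{\mathcal{C}}) \ge S_\mu(\mathcal{C}^{(k)}) - \sum_{\ell=0}^{k-1} \sum_{i_\ell \in I_\ell} \mu^\ell_{i_\ell}\, S(\omega^\ell_{i_\ell} \circ \tilde\gamma^\ell_{\mathcal{C}}) . \]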
The inequality stems from the fact that H(˜ γC0 , γ˜C1 , . . . , γ˜Ck−1 ) is a supremum, whereas the middle terms in the original definition of the entropy functional drop out because
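they are equal in magnitude but opposite in sign. For $s = 0, 1, \ldots, k-1$, $\omega^\ell_{i_\ell} \circ \tilde\gamma^\ell_{\mathcal{C}}$ takes on the values
\[ \omega^\ell_{i_\ell}\big( \gamma_{N\infty}(\chi_{T^{-\ell}(C_s)}) \big) = (\mu^\ell_{i_\ell})^{-1}\, \tau_N\big( \gamma_{N\infty}(\chi_{T^{-\ell}(C_{i_\ell})})\, \gamma_{N\infty}(\chi_{T^{-\ell}(C_s)}) \big) . \]
Due to Proposition 3.2, these converge to $(\mu^\ell_{i_\ell})^{-1}\, \omega_\mu\big( \chi_{T^{-\ell}(C_{i_\ell})}\, \chi_{T^{-\ell}(C_s)} \big) = \delta_{i_\ell, s}$. This means that in the limit the von Neumann entropy will be zero. Or, stated more carefully, $\forall\, \varepsilon' > 0\ \exists\, N'$ such that
\[ \sum_{\ell=0}^{k-1} \sum_{i_\ell \in I_\ell} \mu^\ell_{i_\ell}\, S(\omega^\ell_{i_\ell} \circ \tilde\gamma^\ell_{\mathcal{C}}) \le k \varepsilon' . \]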
We thus obtain a lower bound. Combining our results and choosing $\tilde N := \max(N, N')$, we conclude
\[ S_\mu(\mathcal{C}^{(k)}) - k \varepsilon' - k \delta(\varepsilon) \le H(\gamma^0_{\mathcal{C}}, \gamma^1_{\mathcal{C}}, \ldots, \gamma^{k-1}_{\mathcal{C}}) \le S_\mu(\mathcal{C}^{(k)}) + k \delta(\varepsilon) . \]
7.2. ALF-entropy
Theorem 7.2. Let $(\mathcal{X}, T, \mu)$ be a classical dynamical system which is the classical limit of a sequence of finite dimensional quantum dynamical systems $(\mathcal{M}_N, \Theta_N, \tau_N)$. We also assume that the dynamical localization condition 5.1 holds. If
(1) $\mathcal{C} = \{C_0, C_1, \ldots, C_{q-1}\}$ is a finite measurable partition of $\mathcal{X}$,
(2) $\mathcal{Y}_N = \{y_0, y_1, \ldots, y_q\}$ is a bistochastic partition of unity, which is the quantization of the previous partition, namely $y_i = \gamma_{N\infty}(\chi_{C_i})$ for $i = 0, 1, \ldots, q-1$ and $y_q := \sqrt{\mathbb{1} - \sum_{i=0}^{q-1} y_i^* y_i}$,
then there exists an $\alpha$ such that
\[ \lim_{\substack{k, N \to \infty \\ k \le \alpha \log N}} \frac{1}{k} \big| H[\mathcal{Y}_N^{(k)}] - S_\mu(\mathcal{C}^{(k)}) \big| = 0 . \]
Proof. First notice that $\mathcal{Y}_N = \{y_0, y_1, \ldots, y_q\}$ is indeed a bistochastic partition. We have
\[ y_i^* = \gamma_{N\infty}(\chi_{C_i})^* = \gamma_{N\infty}(\overline{\chi_{C_i}}) = \gamma_{N\infty}(\chi_{C_i}) = y_i , \]
\[ 0 \le \gamma_{N\infty}(\chi_{C_i})^2 = y_i^2 \le \gamma_{N\infty}(\chi_{C_i}^2) = \gamma_{N\infty}(\chi_{C_i}) . \]
Summing the last line over $i$ from 0 to $q-1$, we see that $\sum_{i=0}^{q-1} y_i^2 \le \mathbb{1}$. This means that $\{y_0, y_1, \ldots, y_{q-1}\}$ is not a partition of unity, but we can use this property to define an extra element $y_q$ which completes it to a bistochastic partition of unity, $\mathcal{Y}_N = \{y_0, y_1, \ldots, y_q\}$:
\[ y_q := \sqrt{\mathbb{1} - \sum_{i=0}^{q-1} y_i^* y_i} . \]
The bistochasticity is a useful property because it implies translation invariance of the state on the quantum spin chain, a state which arises during the construction of the ALF-entropy. The density matrix $\rho[\mathcal{Y}^{(k)}]$ of the refined partition reads
\[ \rho[\mathcal{Y}^{(k)}] = \sum_{i,j} \rho[\mathcal{Y}^{(k)}]_{i,j}\, |e_i\rangle\langle e_j| = \sum_{i,j} \tau_N\big( y_{j_1}^* \cdots \Theta_N^{k-1}(y_{j_k}^*)\, \Theta_N^{k-1}(y_{i_k}) \cdots y_{i_1} \big)\, |e_i\rangle\langle e_j| . \]
Now we will expand this formula using the operators $y_i$ defined above and the quantities $K_a(x, y)$ defined in (32), controlling the element $y_q$ as follows:
\[ \| y_q \|_2^2 = \Big\| \sqrt{\mathbb{1} - \sum_{i=0}^{q-1} y_i^* y_i} \Big\|_2^2 = \tau_N\Big( \mathbb{1} - \sum_{i=0}^{q-1} y_i^* y_i \Big) = \int dy\, dz \sum_{i \ne j} \chi_i(y)\, \chi_j(z)\, N\, |K_0(y, z)|^2 . \tag{56} \]
Thus, in the limit of large $N$, $N |K_0(y, z)|^2$ is just $\delta(y - z)$ (see (32)), so that (56) tends to $\int dz \sum_{i \ne j} \chi_i(z)\, \chi_j(z) = 0$ and we can consistently neglect those entries of $\rho[\mathcal{Y}^{(k)}]$ containing $y_q$. By means of the properties of coherent states, we write out explicitly the elements of the density matrix
\[ \rho[\mathcal{Y}^{(k)}]_{i,j} = N^{2k-1} \int dy\, dz\, \Big( \prod_{\ell=1}^{k} \chi_{C_{j_\ell}}(y_\ell)\, \chi_{C_{i_\ell}}(z_\ell) \Big)\, K_0(z_1, y_1)\, \Big( \prod_{p=1}^{k-1} K_1(y_p, y_{p+1}) \Big)\, K_0(y_k, z_k)\, \Big( \prod_{q=1}^{k-1} K_{-1}(z_{k-q+1}, z_{k-q}) \Big) . \tag{57} \]
We now use that, for $N$ large enough,
\[ \Big| N \int dy\, \chi_C(y)\, K_m(x, y)\, K_n(y, z) - \chi_{T^{-m} C}(x)\, K_{m+n}(x, z) \Big| \le \varepsilon_m(N) , \tag{58} \]
where $\varepsilon_m(N) \to 0$ with $N \to \infty$ uniformly in $x, z \in \mathcal{X}$. This is a consequence of the dynamical localization condition 5.1 and can be rigorously proven in the same way as Proposition 3.1. However, the rough idea is the following: from the property 3.1.3 of coherent states, one derives
\[ N \int dy\, \chi_C(y)\, K_m(x, y)\, K_n(y, z) = K_{m+n}(x, z) + N \int dy\, (\chi_C(y) - 1)\, K_m(x, y)\, K_n(y, z) . \]
For large $N$, the condition 5.1 makes the integral in (58) negligibly small unless $x \in T^{-m}(C)$, in which case it is the second integral in the formula above which can be neglected. By applying (58) to the couples of products in (57) one after the other, we finally arrive at the upper bound
\[ \big| \rho[\mathcal{Y}^{(k)}]_{i,j} - \delta_{i,j}\, \mu(C_i) \big| \le 2 \sum_{m=1}^{k} \varepsilon_m(N) + \varepsilon_0(N) =: \varepsilon(N) , \]
where $C_i := \bigcap_{\ell=1}^{k} T^{-\ell+1} C_{i_\ell}$ is an element of the partition $\mathcal{C}^{(k)}$.
We now set $\sigma[\mathcal{C}^{(k)}] := \sum_i \mu(C_i)\, |e_i\rangle\langle e_i|$ and use the following estimate: let $A$ be an arbitrary matrix of dimension $d$ and let $\{e_1, e_2, \ldots, e_d\}$ and $\{f_1, f_2, \ldots, f_d\}$ be two orthonormal bases of $\mathbb{C}^d$, then $\|A\|_1 := \mathrm{Tr}\, |A| \le \sum_{i,j} |\langle e_i, A f_j \rangle|$. This yields
\[ \Delta(k) := \big\| \rho[\mathcal{Y}^{(k)}] - \sigma[\mathcal{C}^{(k)}] \big\|_1 = \mathrm{Tr}\, \big| \rho[\mathcal{Y}^{(k)}] - \sigma[\mathcal{C}^{(k)}] \big| \le q^{2k}\, \varepsilon(N) . \]
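The trace-norm estimate just used can be sanity-checked numerically in the smallest non-trivial case. The sketch below is restricted, as an assumption made for simplicity, to real 2 × 2 matrices with both orthonormal bases taken to be the standard one; it relies on the identity $(s_1 + s_2)^2 = \|A\|_F^2 + 2 |\det A|$ for the singular values $s_1, s_2$. The function names and sample matrices are hypothetical.

```python
import math


def trace_norm_2x2(a):
    """Trace norm ||A||_1 = s1 + s2 of a real 2x2 matrix.

    Uses (s1 + s2)^2 = s1^2 + s2^2 + 2*s1*s2 = ||A||_F^2 + 2*|det A|.
    """
    (p, q), (r, s) = a
    frobenius_sq = p * p + q * q + r * r + s * s
    det = p * s - q * r
    return math.sqrt(frobenius_sq + 2.0 * abs(det))


def entrywise_sum(a):
    """sum_{i,j} |<e_i, A e_j>| in the standard basis."""
    return sum(abs(x) for row in a for x in row)


# Hypothetical sample matrices.
samples = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, -0.5], [-0.5, 0.5]],
    [[0.0, 1.0], [0.0, 0.0]],
]
for a in samples:
    print(trace_norm_2x2(a), "<=", entrywise_sum(a))
```

In the proof the same bound is applied to a matrix with $q^{2k}$ entries, each of modulus at most $\varepsilon(N)$, which is where the factor $q^{2k}$ comes from.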
Finally, by the continuity of the von Neumann entropy [15], we get
\[ \big| S(\rho[\mathcal{Y}^{(k)}]) - S(\sigma[\mathcal{C}^{(k)}]) \big| \le \Delta(k) \log q^k + \eta(\Delta(k)) . \]
Since, from $k \le \alpha \log N$, $q^{2k} \le N^{2\alpha \log q}$, if we want the bound $q^{2k} \varepsilon(N)$ to converge to zero with $N \to \infty$, the parameter $\alpha$ has to be chosen accordingly. Then, the result follows because the von Neumann entropy of $\sigma$ reduces to the Shannon entropy of the refinements of the classical partition.
8. Conclusions
In this paper, we have shown that both the CNT- and ALF-entropies reproduce the Kolmogorov–Sinai invariant if we observe a strongly chaotic system at a very short time scale. However, due to the discreteness of the spectrum of the quantizations, we know that saturation phenomena will appear. It would be interesting to study the scaling behavior of the quantum dynamical entropies in the intermediate region between the random breaking time and the Heisenberg time. This will, however, require techniques quite different from the coherent states approach.
Acknowledgments
Two of the authors (F.B., V.C.) wish to express their gratitude to the Institute of Theoretical Physics of the University of Leuven for its hospitality and financial support.
References
[1] L. Accardi, M. Ohya and N. Watanabe, Dynamical entropy through quantum Markov chains, Open Sys. & Information Dyn. 4 (1997) 71–87.
[2] R. Alicki and M. Fannes, Quantum Dynamical Systems, Oxford University Press, Oxford, 2001.
[3] R. Alicki and M. Fannes, Defining quantum dynamical entropy, Lett. Math. Phys. 32 (1994) 75–82.
[4] R. Alicki and H. Narnhofer, Comparison of dynamical entropies for the noncommutative shifts, Lett. Math. Phys. 33 (1995) 241–247.
[5] R. Alicki, D. Makowiec and W. Miklaszewski, Quantum chaos in terms of entropy for a periodically kicked top, Phys. Rev. Lett. 77 (1996) 838–841.
[6] R. Alicki, J. Andries, M. Fannes and P. Tuyls, An algebraic approach to the Kolmogorov–Sinai entropy, Rev. Math. Phys. 8 (1996) 167–184.
[7] V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics, Benjamin, New York, 1968.
[8] F. Benatti, H. Narnhofer and G. L. Sewell, A non-commutative version of the Arnold cat map, Lett. Math. Phys. 21 (1991) 157–172.
[9] F. Benatti, V. Cappellini and F. Zertuche, Quantum Dynamical Entropies in Discrete Classical Chaos, math-ph/0308033.
[10] M. V. Berry, N. L. Balazs, M. Tabor and A. Voros, Quantum maps, Ann. Phys. 122 (1979) 26–63.
[11] G. Casati and B. Chirikov, Quantum Chaos. Between Order and Disorder, Cambridge University Press, 1995.
[12] A. Connes, H. Narnhofer and W. Thirring, Dynamic entropy of C*-algebras and von Neumann algebras, Commun. Math. Phys. 112 (1987) 691–719.
[13] S. De Bievre, Chaos, quantization and the classical limit on the torus, Proceedings of the XIVth Workshop on Geometrical Methods in Physics, Bialowieza (1995), Polish Scientific Publishers PWN, 1998 (see also mp_arc 96-191).
[14] M. Degli Esposti, Quantization of the orientation preserving automorphisms of the torus, Ann. Inst. Henri Poincaré 58 (1993) 323–341.
[15] M. Fannes, A continuity property of the entropy density for spin lattice systems, Commun. Math. Phys. 31 (1973) 291–294.
[16] J. Ford and M. Ilg, Eigenfunctions, eigenvalues, and time evolution of finite, bounded, undriven, quantum systems are not chaotic, Phys. Rev. A45 (1992) 6165–6173.
[17] J. Ford, G. Mantica and G. H. Ristow, The Arnold cat: failure of the correspondence principle, Physica D50 (1991) 493–520.
[18] J. Ford and G. Mantica, Does quantum mechanics obey the correspondence principle? Is it complete?, Am. J. Phys. 60 (1992) 1086–1098.
[19] M.-J. Giannoni, A. Voros and J. Zinn-Justin, Chaos and Quantum Physics, Les Houches Summer School of Theoretical Physics 1989, North-Holland, Amsterdam, London, New York, Tokyo, 1991.
[20] B. Hasselblatt and A. Katok, Modern Theory of Dynamical Systems, Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 1999.
[21] R. Mañé, Ergodic Theory and Differentiable Dynamics, Springer Verlag, Berlin, 1987.
[22] H. Narnhofer, Quantized Arnold cat maps can be entropic K-systems, J. Math. Phys. 33 (1992) 1502–1510.
[23] H. Narnhofer, E. Størmer and W. Thirring, C*-dynamical systems for which the tensor product formula for entropy fails, Ergod. Th. and Dynam. Sys. 15 (1995) 961–968.
[24] S. V. Neshveyev, On the K-property of quantized Arnold cat maps, J. Math. Phys. 41 (2000) 1961–1965.
[25] M. Ohya and D. Petz, Quantum Entropy and its Use, Springer Verlag, Berlin, 1993.
[26] W. Rudin, Real and Complex Analysis, 3rd edn., McGraw-Hill, 1987.
[27] H. G. Schuster, Deterministic Chaos, 3rd edn., VCH, Weinheim, 1995.
[28] W. Słomczyński and K. Życzkowski, Quantum chaos, an entropy approach, J. Math. Phys. 35 (1994) 5674.
[29] P. Tuyls, Towards Quantum Dynamical Entropy, Thesis, KU-Leuven, 1997.
[30] D. Voiculescu, Dynamical approximation entropies and topological entropy in operator algebras, Commun. Math. Phys. 144 (1992) 443–490.
[31] P. Walters, An Introduction to Ergodic Theory, Graduate Texts in Mathematics, Vol. 79, Springer Verlag, Berlin, Heidelberg, New York, 1982.
[32] R. F. Werner, The classical limit of quantum theory, quant-ph/9504016.
[33] G. M. Zaslavsky, Chaos in Dynamic Systems, Harwood Academic Publ., 1985.
December 8, 2003 12:22 WSPC/148-RMP
00184
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 877–903 c World Scientific Publishing Company
ON ASYMPTOTIC STABILITY OF GROUND STATES OF NLS
SCIPIO CUCCAGNA
University of Virginia, Charlottesville, VA 22904, USA
[email protected]
Received 11 March 2003
Revised 14 September 2003
We prove in dimension n = 3 an asymptotic stability result for ground states of the Nonlinear Schrödinger Equation which contain one internal mode.
Keywords: Ground state; asymptotic stability; Fermi golden rule; scattering.
1. Introduction
We extend from dimension n = 1 to dimension n = 3 some of the ideas in Buslaev and Perelman [1] on asymptotic stability of ground states for the Nonlinear Schrödinger Equation (NLS) in the context of even solutions (our discussion can be extended with little effort to n ≥ 3; for solutions that are not even, our proof is not sufficient). The crux is a mechanism for the passage of energy from a finite dimensional Hamiltonian system to a dispersive partial differential equation, which takes the form of a Fermi Golden Rule (FGR). Sigal [12] used an FGR to prove instability of periodic and quasiperiodic solutions of wave equations under a generic small nonlinear perturbation. In the present context, [1] sketches an FGR for n = 1. [2] contains a detailed presentation of [1]. Similar ideas are in [14, 15, 10]. We consider even solutions of an NLS
\[ i u_t + \Delta u + \beta(|u|^2) u = 0 , \qquad (t, x) \in \mathbb{R} \times \mathbb{R}^n , \quad n = 3 . \tag{1.1} \]
Ground states are solutions $e^{i(t\omega + \gamma)} \phi_\omega(x)$ with: $\gamma$ and $\omega$ constants; $\phi_\omega(x)$ a positive, spherically symmetric solution, exponentially decreasing as $|x| \to \infty$, of
\[ -\Delta \phi_\omega + \omega \phi_\omega - \beta(\phi_\omega^2) \phi_\omega = 0 . \tag{1.2} \]
We assume existence and uniqueness of $\phi_\omega(x)$. We then show, under appropriate hypotheses, that even solutions $u(t, x)$ close to a ground state for t = 0 converge asymptotically to ground states. We highlight that Lemma 4.3 and what leads to it is the crux of the argument. We follow closely [1], which sketches the case n = 1. [15] treats for n = 3 a similar problem, simpler due to smallness of the ground states. Our arguments can be used also as an alternative analog to [15, Theorem 1.4]. This
paper gives analogs of [1, 15] with essentially a single argument and is a revision of a manuscript, [6], produced in February 2001.
2. Hypotheses and Statements
Hypotheses E and U below guarantee existence and uniqueness of ground states. We will assume $\beta(t)$ real-valued, smooth and satisfying
\[ \beta(0) = 0 ; \tag{E.1} \]
\[ \text{there is } \alpha > 0 \text{ such that } \omega t - \beta(t^2) t > 0 \text{ for } 0 < t < \alpha \text{ and } \omega t - \beta(t^2) t < 0 \text{ for } \alpha < t < \infty ; \tag{E.2} \]
\[ \lim_{t \to +\infty} \frac{\beta(t^2)}{t^l} = 0 , \qquad l = \frac{4}{n-2} , \quad \text{here } n = 3 . \tag{E.3} \]
Set $G(r) = \int_0^r 2 s \beta(s^2)\, ds$. Then we assume
\[ \text{there is } r > 0 \text{ such that } G(r) - \omega r^2 > 0 . \tag{E.4} \]
Define now, following [11], $I(t, \lambda) = \lambda t\, [\beta(t^2) t - \omega t]' - (\lambda + 2)\, [\beta(t^2) t - \omega t]$. For the $\alpha$ of (E.2) we assume
\[ \text{for any } U > \alpha \text{ there is } \lambda = \lambda(U) \text{ such that } I(t, \lambda) \ge 0 \text{ for } 0 < t < U \text{ and } I(t, \lambda) \le 0 \text{ for } t > U . \tag{U} \]
U and E imply that the map $(x, \omega) \to \phi_\omega(x)$ is smooth [13]. We assume
\[ \beta'(0) = 0 . \tag{D} \]
Remark. Using an idea in [15, p. 194] a theorem can be formulated and proved without (D) (similarly the hypotheses on the nonlinearity in [1, 2] can be made less stringent), see [15].
(D) implies that, for $u$ near 0, $|\beta(u^2) u| = O(u^5)$. We strengthen (E.3) and we assume that for $|u|$ large, $|\beta(u^2) u| \le c |u|^p$ for some $p < 5$. This hypothesis and (D) together imply that there is a $p$ such that:
\[ |\beta(u^2) u| \le c |u|^p , \qquad 3 < p < 5 , \tag{P} \]
so that $p > 3$ implies
\[ \mu\, d\, p = 3(p-2)/2 > 3/2 \quad \text{for} \quad d = 3 \Big( \frac{1}{2} - \frac{1}{p+1} \Big) , \qquad \mu = \frac{\frac{1}{2} - \frac{1}{p}}{\frac{1}{2} - \frac{1}{p+1}} . \]
Inequality $\mu d p > 3/2$ is used in Sec. 6.4. We have (see [3, Proposition 4.2.3 and Theorem 4.3.1]):
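The arithmetic behind hypothesis (P) can be verified with exact rational numbers. The sketch below assumes the reading $d = 3(1/2 - 1/(p+1))$ and $\mu = (1/2 - 1/p)/(1/2 - 1/(p+1))$ of the definitions above; the function name and the sample exponents are hypothetical.

```python
from fractions import Fraction


def decay_exponents(p):
    """Return (d, mu) for a given exponent p, using exact rational arithmetic."""
    p = Fraction(p)
    half = Fraction(1, 2)
    d = 3 * (half - 1 / (p + 1))
    mu = (half - 1 / p) / (half - 1 / (p + 1))
    return d, mu


# Hypothetical sample exponents in the admissible range 3 < p < 5.
for p in (Fraction(7, 2), Fraction(4), Fraction(9, 2)):
    d, mu = decay_exponents(p)
    product = mu * d * p
    # The product collapses to 3(p - 2)/2, which exceeds 3/2 exactly when p > 3.
    print(p, d, mu, product, product == 3 * (p - 2) / 2, product > Fraction(3, 2))
```

The factor $1/2 - 1/(p+1)$ cancels between $d$ and $\mu$, so $\mu d p = 3p(1/2 - 1/p) = 3(p-2)/2$ for every admissible $p$.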
Theorem 2.1. $\forall\, \phi(x) \in H^1(\mathbb{R}^3)$ $\exists\, t_0 > 0$, $t_0 = t_0(\|\phi\|_{H^1})$, and a unique solution to (1.1) $u \in C([0, t_0), H^1) \cap C^1([0, t_0), H^{-1})$ with initial condition $u(0, x) = \phi(x)$ and such that for any $t$, $\|u(t)\|_2 = \|\phi\|_2$ and (energy) $E(u(t)) = E(\phi)$.
Corollary 2.2. Fix an $\omega_0$ satisfying (E) and (U). $\forall\, T > 0$ and $\varepsilon > 0$, $\exists\, \delta > 0$ such that if $u(0) \in H^1(\mathbb{R}^3)$ is even and $\|u(0) - e^{i\gamma_0} \phi_{\omega_0}\|_{H^1} < \delta$ for some $\gamma_0$, then the solution $u(t)$ of (1.1) is defined in $[0, T]$ along with functions $\omega(t)$, $\gamma(t)$, so that $\forall\, t \in [0, T]$:
\[ u(t, x) = e^{i\Theta(t)} \big( \phi_{\omega(t)}(x) + R(t, x) \big) , \qquad \Theta(t) = \int_0^t \omega(s)\, ds + \gamma(t) , \tag{2.1} \]
with $\|R(t)\|_{H^1} + |(\omega(t), \gamma(t)) - (\omega_0, \gamma_0)| < \varepsilon$ $\forall\, t \in [0, T]$.
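We assume there is an open set $\mathcal{O} \subseteq (0, \infty)$ satisfying
\[ \forall\, \omega \in \mathcal{O} , \qquad \langle \phi_\omega, \partial_\omega \phi_\omega \rangle > 0 . \tag{LS} \]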
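From [17, 7, 8] we conclude:
Theorem 2.3. Under the above hypotheses, if in Corollary 2.2 $\omega_0 \in \mathcal{O}$, with $\mathcal{O}$ satisfying (LS), then the statement of Corollary 2.2 is valid also for $T = +\infty$.
We will assume all the above hypotheses and the resulting conclusions. We want to strengthen Theorem 2.3. Insert the ansatz (2.1) in (1.1):
\[ i R_t = -\Delta R + \omega(t) R - \beta(\phi^2_{\omega(t)}) R - \beta'(\phi^2_{\omega(t)}) \phi^2_{\omega(t)} R - \beta'(\phi^2_{\omega(t)}) \phi^2_{\omega(t)} \bar R + \dot\gamma(t) R + \dot\gamma(t) \phi_{\omega(t)} - i \dot\omega(t) \phi'_{\omega(t)} + e^{-i\Theta} N(e^{i\Theta} R) , \tag{2.2} \]
where $\phi'_\omega = \partial_\omega \phi_\omega$ and $N(e^{i\Theta} R)$ is defined by the equality and is $O(R^2)$ when $R$ is small. Because of $\bar R$, (2.2) is in effect a system. In the following formula $\eta$ is a constant, $\eta = 1$ (resp. a small constant) when treating Eq. (1.1) (resp. (2.9)). We set:
\[ \sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} , \quad \sigma_2 = \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix} , \quad \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} ; \]
\[ H(t) = \sigma_3 \big[ -\Delta + \omega(t) - \eta \beta(\phi^2_{\omega(t)}) - \eta \beta'(\phi^2_{\omega(t)}) \phi^2_{\omega(t)} \big] + i \eta \beta'(\phi^2_{\omega(t)}) \phi^2_{\omega(t)}\, \sigma_2 ; \tag{2.3$_\eta$} \]
\[ R = \begin{bmatrix} R_1 \\ R_2 \end{bmatrix} = \begin{bmatrix} R \\ \bar R \end{bmatrix} , \quad \Phi = \begin{bmatrix} \phi \\ \phi \end{bmatrix} , \quad \Phi' = \partial_\omega \Phi . \]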
(2.4)
Call also Hω operator in (2.31 ) for ω(t) = ω constant. Write H0 (ω) = σ3 (−∆ + ω) and V (ω) = Hω − H0 (ω). For the essential and discrete spectra of Hω we have: σe = σe (Hω ) = σe (H0 (ω)) = (−∞, −ω] ∪ [ω, +∞) ;
σd (Hω ) ⊇ {0} .
Given an operator $L$ we set $N_g(L) = \cup_{j \ge 1} N(L^j)$ and $N(L) = \ker L$. [11] and [16] imply that, if $\{\cdot\}$ means span, $N_g(H^*_\omega) = \{\Phi, \sigma_3 \Phi'\}$ (here we are in the category of even functions). We will assume the following hypothesis, with resonances defined in Sec. 3:
Hypothesis H1. There is an open set $\mathcal{O} \subseteq (0, +\infty)$ such that for $\omega \in \mathcal{O}$:
(1) $H_\omega$ has two simple eigenvalues $\pm\lambda(\omega)$ with $0 < \lambda(\omega) < \omega$ and $2\lambda(\omega) > \omega$;
(2) $H_\omega$ has no other eigenvalues except for $0, \pm\lambda(\omega)$ and has no resonances in $\sigma_e(H)$;
(3) We assume (LS), i.e. for the $L^2$ inner product we have $\langle \phi, \phi' \rangle > 0$, where $\phi' = \partial_\omega \phi$.
Remark. Here, when treating Eq. (1.1), we consider only even functions. So the eigenspace associated to $\lambda(\omega)$ is formed by even functions.
Lemma 2.4. Under Hypothesis H1 and if we use that $V(\omega)$ is smooth and exponentially decreasing as $|x| \to \infty$, the point spectrum $\sigma_p(H)$ of $H = H_\omega$ is finite with corresponding generalized eigenspaces finite dimensional. We have $H$ (resp. $H^*$) invariant Jordan block decompositions (here it is easy to see $\sigma_p(H) = \sigma_p(H^*)$),
\[ L^2 = \sum_{\lambda \in \sigma_p(H)} N_g(H - \lambda) \oplus X_c(H) , \qquad X_c(H) = \Big[ \sum_{\lambda \in \sigma_p(H)} N_g(H^* - \lambda) \Big]^\perp , \]
\[ L^2 = \sum_{\lambda \in \sigma_p(H)} N_g(H^* - \lambda) \oplus X_c(H^*) , \qquad X_c(H^*) = \Big[ \sum_{\lambda \in \sigma_p(H)} N_g(H - \lambda) \Big]^\perp . \]
We discuss briefly Lemma 2.4 in Sec. 3. Set
\[ H^r_s = \{ f : \| (1 - \Delta)^{r/2} \langle x \rangle^s f \|_{L^2} < \infty \} , \qquad \langle x \rangle = (1 + |x|^2)^{1/2} . \]
We consider $s > 0$ large enough, $\varepsilon > 0$ small enough and $\omega_0 \in \mathcal{O}$. Pick initial datum $u(0)$ such that
\[ \| u(0) - e^{i\gamma_0} \phi_{\omega_0} \|_{H^1} + \| u(0) - e^{i\gamma_0} \phi_{\omega_0} \|_{H^0_s} < \sqrt{\varepsilon} . \tag{2.5} \]
We can apply Theorem 2.3. Furthermore, a standard Implicit Function Theorem argument ensures the existence of unique functions $\gamma(t)$ and $\omega(t) \in \mathcal{O}$, for $0 \le t \le T$, such that $R(t) \in N_g^\perp(H^*(t))$ $\forall\, t$, here $H(t) = H_{\omega(t)}$. We consider the spectral decomposition of Lemma 2.4
\[ L^2 = N_g(H(t)) \oplus \Big( \bigoplus_{\pm} N(H(t) \mp \lambda(t)) \Big) \oplus X_c(t) = N_g(H(t)) \oplus N_g^\perp(H^*(t)) . \tag{2.6} \]
Set
\[ R(t) = \zeta(t) + f(t) \in \Big[ \sum_{\pm} N(H(t) \mp \lambda(t)) \Big] \oplus X_c(H(t)) . \tag{2.7} \]
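$\forall\,\omega\in O$ $\exists\,\xi(\omega)\in N(H_\omega^*-\lambda(\omega))$ such that: $\xi$ has real entries; $\langle\xi,\sigma_3\xi\rangle=1$ (here $\langle\xi,\sigma_3\xi\rangle>0$ due to positivity of $\langle H_\omega\cdot,\cdot\rangle$ in $N_g^{\perp}(H_\omega^*)$). It is elementary to show that $\sigma_1\xi(\omega)$ generates $N(H_\omega^*+\lambda(\omega))$. By routine arguments: the function $(\omega,x)\in O\times\mathbb{R}^3\to\xi(\omega,x)$ is smooth; $|\xi(\omega,x)|<ce^{-a|x|}$ for fixed $c>0$ and $a>0$ if $\omega\in K\subset O$, $K$ compact. $\xi(\omega,x)$ is even in $x$ by assumption. We consider the projection from (2.6), $P_1(\omega):L^2(\mathbb{R}^n)\to X_c(H_\omega)$. For $L_2(\xi)$ a quadratic expression in $\xi$ defined by (4.2) below, and for $R(z)=R(H_\omega,z)$ the resolvent, we assume $\forall\,\omega\in O$
\[ \lim_{\varepsilon\to0^+}\Im\big\langle[R(2\lambda(\omega)+i\varepsilon)-R(2\lambda(\omega)-i\varepsilon)]P_1(\omega)L_2(\xi),\,P_1^*(\omega)\sigma_3L_2(\xi)\big\rangle\neq0. \tag{2.8} \]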
(2.8) is the crucial assumption and provides in [1] the mechanism for the passage of energy from the finite dimensional hamiltonian system to its coupled dispersive PDE. We also set $N=\|f(0)\|_{H^1}+\|\langle x\rangle^{s}f(0)\|_2$, $f(0)\in X_c(0)$, see (2.7). Our main result is:

Theorem 2.4. Assume: Hypotheses U, E, D, H1, P, (2.8); $s>0$ large. $\forall\,\omega_0\in O$ $\exists\,\varepsilon>0$ such that if $u(0,x)$, for some $\gamma_0\in\mathbb{R}$, satisfies (2.5) (so in particular $|\zeta(0)|^2\le c_0\varepsilon$) and furthermore if $u(0,x)$ is even and $N\le\varepsilon$, then the solution $u(t)$ of Eq. (1.1) with initial data $u(0)$ extends into $[0,+\infty)$ and can be written in the form (2.1) with, as $t\to+\infty$: $\omega(t)$ convergent; $R(t)=\zeta(t)+f(t)$, with
\[ \sup_t(1+|t|)^{1/2}\|\zeta(t)\|_{L^\infty}<\infty\quad\text{and}\quad\sup_t\frac{(1+|t|)^{d}}{\log(2+t)}\|f(t)\|_{L^{p+1}}<\infty,\qquad d=3\Big(\frac12-\frac1{p+1}\Big), \]
for $p$ as in assumption P.

That $u(0,x)$ is even, assumed also in [1, 2], simplifies the nonlinear analysis, i.e. control of the modulation, but can be dropped in the proof of linear estimates. For a more careful analysis of the decay, which is at first slow and later faster, see [2]. Notice that we do not provide examples of Eq. (1.1) which satisfy our hypotheses.

The discussion to prove Theorem 2.4 can be used to give an alternative account of a weaker version of [15, Theorem 1.1]. Consider a Schrödinger operator $-\Delta+V(x)$. Assume: $V(x)$ smooth and exponentially decreasing at infinity; $-\Delta+V(x)$ has two simple eigenvalues $-e_0<-e_1<0$ with eigenspaces generated by $\phi_0$ and $\phi_1$; $0$ is not a resonance. Consider
\[ iu_t+(\Delta-V(x)+\eta\beta(|u|^2))u=0. \tag{2.9} \]
We have, see [15]:

Lemma 2.5. Assume $m$ is the first integer such that $\beta^{(m)}(0)\neq0$ and $\eta\neq0$. Assume P. For $\eta\beta^{(m)}(0)>0$ (resp. $\eta\beta^{(m)}(0)<0$), there is a small interval $I=(e_0,e_0+\delta)$ (resp. $I=(e_0-\delta,e_0)$) such that if $\omega\in I$ then the equation
\[ (-\Delta+V(x)+\omega-\eta\beta(|u|^2))u=0, \tag{2.10} \]
December 8, 2003 12:22 WSPC/148-RMP
882
00184
S. Cuccagna
has a nonzero solution $u=\phi_\omega(x)$, smoothly dependent on $(\omega,x)$, positive, and $|\phi_\omega(x)|<ce^{-a|x|}$ for some fixed $c>0$ and $a>0$. Furthermore
\[ \phi_\omega=\left(\frac{\omega(a)-e_0}{\eta\beta^{(m)}(0)\|\phi_0\|_{2m+2}^{2m+2}}\right)^{\frac1{2m}}\phi_0+o\big(|\omega(a)-e_0|^{\frac1{2m}}\big). \tag{2.11} \]
The operator in (2.10) has for $\eta=0$ spectrum $\{\omega-e_0\}\cup\{\omega-e_1\}\cup[\omega,+\infty)$ and for $\eta\neq0$ small, $\{0\}\cup\{\tilde\lambda(\omega)\}\cup[\omega,+\infty)$. Solutions to (2.9) initially close to $e^{i\gamma_0}\phi_{\omega_0}$ can be written in the form (2.1). We can write an analog of (2.4) with
\[ H(t,\eta)=(2.3_\eta)+\sigma_3V(x). \]
For $\eta=0$ we obtain $\sigma_3(-\Delta+V+\omega)$ with spectrum $\{\pm(\omega-e_0)\}\cup\{\pm(\omega-e_1)\}\cup[\omega,+\infty)\cup(-\infty,-\omega]$. For $\eta\neq0$ small, the spectrum of $H(t,\eta)$ is symmetric with respect to the $x$ and $y$ axes and close to the spectrum of $\sigma_3(-\Delta+V+\omega)$. More precisely, $\dim N_g(H(t,\eta))=2$, and the spectrum is $\{0\}\cup\{\pm\lambda(\omega)\}\cup[\omega,+\infty)\cup(-\infty,-\omega]$, with $\lambda(\omega)$ close to $e_0-e_1$. We assume as in Lemma 2.5 there is a smallest $m$ such that $\beta^{(m)}(0)\neq0$. For $P_c$ the projection associated with the continuous spectrum of $-\Delta+V$ we assume:
\[ 2(e_0-e_1)>e_0; \tag{2.12} \]
\[ \big\langle\phi_0^{2m-1}\phi_1^2,\ \delta(-\Delta+V+e_0-2(e_0-e_1))P_c\,\phi_0^{2m-1}\phi_1^2\big\rangle>0. \tag{2.13} \]
We have:

Theorem 2.6. Assume the above hypotheses for $-\Delta+V(x)$. Assume D and P. Assume hypotheses and conclusions of Lemma 2.5. Consider then a nonlinear ground state $e^{i\gamma_0}\phi_{\omega_0}(x)$. If $\omega_0$ is close enough to $e_0$, there exists $\varepsilon>0$ such that, if an initial datum $u(0)$ satisfies (2.5), so in particular if $|\zeta(0)|^2\le c_0\varepsilon$, and if, in the notation of Theorem 2.4, $N\le\varepsilon$, then the solution $u(t)$ of (2.9) with initial datum $u(0)$ exists in $[0,\infty)$ and can be written in the form (2.1), with $\omega(t)$ and $R(t)$ satisfying the same conclusions of Theorem 2.4.

This result is weaker than [15, Theorem 1.1] for two reasons: our stronger regularity and decay assumptions on $V(x)$, which can be replaced by the less stringent hypotheses of [15] to prove the analog of the material in Sec. 3 here and maybe also of the material in Sec. 8, without changing techniques; and our assumptions (D) and (P), which do not capture $\beta(|u|^2)u=|u|^2u$, but see the Remark after (D). For a more detailed analysis of the dynamics of $u(t)$, which is first slow and later fast, see [2, 15]. For a more precise description of the relation between the lower bound in (2.8) and $\varepsilon_0$, see [15]. In Sec. 10 we list some errors in [5].

3. Scattering Theory for $H=H_\omega$

Write $H=H_0+V$, and $V=B^*A$ with $A$, $B$ smooth and exponentially decreasing. We start with a sketch of proof of Lemma 2.4. For $\Im z>0$, set
$Q_0(z)=A(H_0-z)^{-1}B^*$. $Q_0(z):L^2\to L^2$ is well defined and compact, and can be extended to $\Im z\ge0$; we denote this extension by $Q_0^+(z)$. Because $A$ and $B$ are exponentially decreasing, $Q_0^+(z)$ can be extended as a meromorphic function in a neighborhood of any point $z\neq\pm\omega$, $\Im z\ge0$. Furthermore, at the branch points $\pm\omega$, $Q_0^+(z)$ admits a meromorphic extension in $\sqrt{z\mp\omega}$, see [18]. By Agmon, see [5], we have $\lim_{z\to\infty}\|Q_0^+(z):L^2\to L^2\|=0$. Then, by Analytic Fredholm Theory, $N_g(1+Q_0^+(z))\neq0$ in $\{\Im z\ge0\}$, $N_g$ the generalized kernel, at most for a zero measure set. The intersection of this set with $(\omega,\infty)$ is discrete because of the meromorphic extension in $z$ of $Q_0^+(z)$ at the points of $(\omega,\infty)$. The points $\pm\omega$ are not accumulation points because of the meromorphic extension in $\sqrt{z\mp\omega}$ of $Q_0(z)$ at $\pm\omega$, see [18]. So the set of singular points in $\{\Im z\ge0\}$ is finite and $\sum\dim N_g(1+Q_0^+(z))<\infty$. We say that $z$, with $\Im z\ge0$, is a resonance or an eigenvalue for $H$ if $N_g(1+Q_0^+(z))\neq0$. We assume, by Hypothesis H1, that no such points $z\ge\omega$ and $z\le-\omega$ exist. To sum up, $H$ admits finitely many eigenvalues, which are isolated and of finite algebraic dimension. The splitting of Lemma 2.4 follows by standard theory. This concludes the discussion of Lemma 2.4.

Remark. We show in [19] that if for $z>\omega$, $N_g(1+Q_0^+(z))\neq0$, then $z$ is an eigenvalue for $H$, see also [20]. The discussion on this point in [5] has flaws.

Let us assume now Hypothesis H1. We can apply the theory of smoothness by Kato [9] in order to prove that $H$ in $X_c(H)$ acts like $H_0$ in $L^2(\mathbb{R}^3)$:

Proposition 3.1. Suppose $n\ge3$. Let $H=H_\omega$, $H=H_0+B^*A$, $A$ and $B$ smooth and exponentially decreasing.
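Then there are isomorphisms $W:L^2(\mathbb{R}^3)\to X_c(H)$ and $Z:X_c(H)\to L^2(\mathbb{R}^3)$, inverses of each other, defined as follows: for $u\in L^2(\mathbb{R}^3)$, $v\in X_c(H^*)$,
\[ \langle Wu,v\rangle=\langle u,v\rangle+\lim_{\varepsilon\to0^+}\frac1{2\pi i}\int_{-\infty}^{+\infty}\big\langle A(H_0-i\varepsilon-\lambda)^{-1}u,\ B(H^*+i\varepsilon-\lambda)^{-1}v\big\rangle\,d\lambda; \]
for $u\in X_c(H)$, $v\in L^2(\mathbb{R}^3)$,
\[ \langle Zu,v\rangle=\langle u,v\rangle+\lim_{\varepsilon\to0^+}\frac1{2\pi i}\int_{-\infty}^{+\infty}\big\langle A(H-i\varepsilon-\lambda)^{-1}u,\ B(H_0+i\varepsilon-\lambda)^{-1}v\big\rangle\,d\lambda. \]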
By [9], Proposition 3.1 is a direct consequence of

Lemma 3.2. There is $c>0$ such that $\forall\,\varepsilon\neq0$ and for $\mathcal{V}=A$ or $\mathcal{V}=B$, the following inequalities hold:
(i)
\[ \int\|\mathcal{V}(H_0-i\varepsilon-\lambda)^{-1}u\|^2\,d\lambda\le c\|u\|^2\qquad\forall\,u\in L^2(\mathbb{R}^3); \]
(ii)
\[ \int\|B(H^*-i\varepsilon-\lambda)^{-1}u\|^2\,d\lambda\le c\|u\|^2\qquad\forall\,u\in X_c(H^*); \]
(iii)
\[ \int\|A(H-i\varepsilon-\lambda)^{-1}u\|^2\,d\lambda\le c\|u\|^2\qquad\forall\,u\in X_c(H). \]
Statement (i) is standard. (iii) follows from
\[ A(H-z)^{-1}u=(1+Q_0^+(z))^{-1}A(H_0-z)^{-1}u, \]
from (i) and from the following observations: $A(H-z)^{-1}v$ has a removable singularity at any isolated eigenvalue $z_0$ of $H$ if $v\in X_c(H)$, because the natural projection of $v$ in $N_g(H-z_0)$ is $0$; by Hypothesis H1, away from isolated eigenvalues of $H$, $(1+Q_0^+(z))^{-1}$ is uniformly bounded. For (ii) the same argument works.

By [9] we also have that $Wu=\lim_{t\to+\infty}e^{iHt}e^{-iH_0t}u$. Another important ingredient, proved in [5] and based on theory developed by Yajima, is summarized in:

Theorem 3.3. Let $O$ be as in Sec. 2. For $\omega\in O$ set
\[ W_+(\omega)u=\lim_{t\to+\infty}e^{iH_\omega t}e^{-i\sigma_3(-\Delta+\omega)t}u. \]
The limit exists $\forall\,u\in C_0^\infty$ and $W_+(\omega)$ extends into an isomorphism
\[ L^p(\mathbb{R}^3)\to L^p(\mathbb{R}^3)\cap\Big[N_g(H_\omega^*)\oplus\sum_{\pm}N(H_\omega^*\mp\lambda(\omega))\Big]^{\perp}\qquad\forall\,p\in[1,+\infty]. \]
For $Z_+(\omega)=W_+^{-1}(\omega)$, the operator norms of $Z_+(\omega)$ and $W_+(\omega)$ remain uniformly bounded for $\omega\in K\subset O$, with $K$ compact. In particular, for $q\ge2$ and a constant $c_q$ dependent only on $K$:
\[ \|e^{iH_\omega t}v\|_{L^q}\le c_q\,t^{-\frac n2+\frac nq}\|v\|_{L^{\frac q{q-1}}},\qquad v\in\Big[N_g(H_\omega^*)\oplus\sum_{\pm}N(H_\omega^*\mp\lambda(\omega))\Big]^{\perp}. \]
For s, s0 > s(n), the function z → (Hω + z)−1 from
(4.1)
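For $N(e^{i\Theta}R)$ the nonlinear term in (2.2) we have, for small $R$,
\[
\begin{aligned}
e^{-i\Theta}N(e^{i\Theta}R)&=e^{-i\Theta}\big\{\beta(\phi^2+\phi(R+\bar R)+|R|^2)e^{i\Theta}(\phi+R)\\
&\qquad-e^{i\Theta}[\beta(\phi^2)\phi+\beta(\phi^2)R+\beta'(\phi^2)\phi^2R+\beta'(\phi^2)\phi^2\bar R]\big\}\\
&=\beta'(\phi^2)\phi|R|^2+\tfrac12\beta''(\phi^2)\phi^3(R+\bar R)^2+\beta'(\phi^2)\phi(R+\bar R)R\\
&\quad+\beta'(\phi^2)|R|^2R+\tfrac12\beta''(\phi^2)\phi^2(R+\bar R)^2R\\
&\quad+\tfrac16\beta'''(\phi^2)\phi^4(R+\bar R)^3+O(R^4).
\end{aligned}
\]
We set $c_1(\omega,x)=\beta'(\phi^2)\phi+\tfrac12\beta''(\phi^2)\phi^3$ and $c_2(\omega,x)=\tfrac12\beta''(\phi^2)\phi^3$. Set
\[
\begin{aligned}
L_2(R)&=c_1\begin{bmatrix}R_1(R_1+2R_2)\\-R_2(2R_1+R_2)\end{bmatrix}+c_2\begin{bmatrix}R_2^2\\-R_1^2\end{bmatrix},\\
L_3(R)&=\beta'R_1R_2\begin{bmatrix}R_1\\-R_2\end{bmatrix}+\frac{\beta''}2\phi^2(R_1+R_2)^2\begin{bmatrix}R_1\\-R_2\end{bmatrix}+\frac{\beta'''}6\phi^4(R_1+R_2)^3\begin{bmatrix}1\\-1\end{bmatrix}.
\end{aligned}
\tag{4.2}
\]
Call $\zeta_j$, $j=1,2$, the entries of $\zeta$, see (2.7), and set
\[ C(\zeta)=2c_1\begin{bmatrix}\zeta_1+\zeta_2&\zeta_1\\-\zeta_2&-(\zeta_1+\zeta_2)\end{bmatrix}+2c_2\begin{bmatrix}0&\zeta_2\\-\zeta_1&0\end{bmatrix}. \]
Writing in vectorial form the expansion of $N(e^{i\Theta}R)$ we obtain
\[ \text{Nonlinear}=L_2(\zeta)+L_3(\zeta)+C(\zeta)f+\tilde N(R) \tag{4.3} \]
with $\tilde N(R)\le C\,\text{garbage}_1$, for
\[ \text{garbage}_1=|\zeta|^4+|\zeta|^2|f|+\tilde\psi|f|^2+|f|^p, \tag{4.4} \]
where $\tilde\psi(x)=ce^{-a|x|}$, for some $c>0$ and $a>0$ fixed. Using (4.3) and the condition $R(t)\in N_g^{\perp}(H^*(t))$ we obtain
\[
\begin{aligned}
i\dot\omega\langle\Phi,\Phi'\rangle&=\dot\gamma\langle\sigma_3\zeta,\Phi\rangle-i\dot\omega\langle\zeta',\Phi\rangle+\langle L_2(\zeta)+L_3(\zeta)+C(\zeta)f+\tilde N(R),\Phi\rangle,\\
\dot\gamma\langle\Phi,\Phi'\rangle&=i\dot\omega\langle\zeta',\sigma_3\Phi'\rangle-\dot\gamma\langle\sigma_3\zeta,\sigma_3\Phi'\rangle-\langle L_2(\zeta)+L_3(\zeta)+C(\zeta)f+\tilde N(R),\sigma_3\Phi'\rangle,\\
i\dot z-\lambda z&=\dot\gamma\langle\sigma_3\zeta,\sigma_3\xi\rangle-i\dot\omega\langle\zeta',\sigma_3\xi\rangle+\langle L_2(\zeta)+L_3(\zeta)+C(\zeta)f+\tilde N(R),\sigma_3\xi\rangle.
\end{aligned}
\]
Hence, $i\dot\omega\langle\Phi,\Phi'\rangle=\langle L_2(\zeta),\Phi\rangle+\text{error}$, $\dot\gamma\langle\Phi,\Phi'\rangle=-\langle L_2(\zeta),\sigma_3\Phi'\rangle+\text{error}$, with
\[ \text{error}\le c\,\big\langle|\zeta|^3+|\zeta|\,|f|+\psi|f|^2+|f|^p,\ \tilde\psi\big\rangle. \]
We set:
\[ L_3'(\zeta)=-\frac{\langle L_2(\zeta),\sigma_3\Phi'\rangle}{\langle\Phi,\Phi'\rangle}\,\sigma_3\zeta-\frac{\langle L_2(\zeta),\Phi\rangle}{\langle\Phi,\Phi'\rangle}\,\zeta'. \]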
We conclude, with $|N(R)|\le C\,\text{garbage}_1=C(|\zeta|^4+|\zeta|^2|f|+\tilde\psi|f|^2+|f|^p)$,
\[
\begin{aligned}
i\dot\omega\langle\Phi,\Phi'\rangle&=\langle L_2(\zeta)+L_3(\zeta)+L_3'(\zeta)+C(\zeta)f+N(R),\Phi\rangle,\\
-\dot\gamma\langle\Phi,\Phi'\rangle&=\langle L_2(\zeta)+L_3(\zeta)+L_3'(\zeta)+C(\zeta)f+N(R),\sigma_3\Phi'\rangle,\\
i\dot z-\lambda z&=\langle L_2(\zeta)+L_3(\zeta)+L_3'(\zeta)+C(\zeta)f+N(R),\sigma_3\xi(\omega)\rangle.
\end{aligned}
\tag{4.5}
\]

4.2. The Fermi golden rule

We approximate (4.5) with a closed system in $\omega$, $\gamma$ and $z$ and we uncover a Fermi Golden Rule. Notice that here the wave operators $W_+$ and $Z_+$ introduced in Sec. 3 are used in (4.6) and in Lemma 4.1. Consider an interval $[0,T]$ and the splitting (2.6) for $t=T$. For $W_+=W_+(\omega(T))$, $Z_+=Z_+(\omega(T))$, set
\[ P_+=W_+E_+Z_+,\quad P_-=W_+E_-Z_+,\quad E_+=\begin{bmatrix}1&0\\0&0\end{bmatrix},\quad E_-=\begin{bmatrix}0&0\\0&1\end{bmatrix}, \tag{4.6} \]
i.e. $P_+$ ($P_-$) is the spectral projector for $[\omega(T),+\infty)$ ($(-\infty,-\omega(T)]$). We have $P_1=P_++P_-$. Set $h_0=h_{0+}+h_{0-}$, with $h_{0,\pm}=|z|^2h_{00,\pm}+z^2h_{01,\pm}+\bar z^2\tilde h_{01,\pm}$ where
\[ h_{01,\pm}=R(H(T),2\lambda(T)+i0)P_\pm L_2(\xi) \]
˜ 01,± = R(H(T ), 2λ(T ) + i0)P± σ1 L2 (ξ) h " # 1 −1 P± F0 (ω, ·) , F0 = 2c1 (ξ1 ξ2 + ξ12 + ξ22 ) , h00,± = H(T ) −1
(4.7)
see in the next section how (4.7) arises. By elementary computation # " 1 2 2 2 . L2 (ζ) = z L2 (ξ) − z¯ σ1 L2 (ξ) + |z| 2c1 F0 −1 Coordinates of L2 (ξ) and F0 are real valued: this is crucial later. Set now garbage = garbage1 + C|z| |f − h0 | + C|ω(t) − ω(T )|(|z|3 + |f | |z|) .
(4.8)
We will prove that, roughly, garbage ≤ chti−2 . Consider now ωhΦ, ˙ Φ 0 i = F2 + F3 + FR ;
iz˙ − λz = G2 + G3 + GR ,
(4.9)
where (equalities define the $a_j(\omega)$ and $b_j(\omega)$; in the first line we use $\sigma_1\Phi=\Phi$)
\[
\begin{aligned}
F_2&=\Im\langle L_2(\zeta),\Phi\rangle=\Im\Big\langle z^2F(\omega)-\bar z^2\sigma_1F(\omega)+|z|^2F_0(\omega)\begin{bmatrix}1\\-1\end{bmatrix},\Phi\Big\rangle\\
&=2\langle F(\omega),\Phi\rangle\,\Im z^2=2a_0(\omega)\Im z^2;\\
F_3&=\Im\langle L_3(\zeta)+L_3'(\zeta)+C(\zeta)h_0,\sigma_3\Phi'\rangle=2\Im[a_1(\omega)z|z|^2+a_2(\omega)z^3];\\
G_2&=\langle L_2(\zeta),\sigma_3\xi\rangle=\Big\langle z^2F(\omega)-\bar z^2\sigma_1F(\omega)+|z|^2F_0(\omega)\begin{bmatrix}1\\-1\end{bmatrix},\sigma_3\xi\Big\rangle\\
&=b_1(\omega)z^2+b_2(\omega)\bar z^2+b_3(\omega)|z|^2;\\
G_3&=\langle L_3(\zeta)+L_3'(\zeta)+C(\zeta)h_0,\sigma_3\xi\rangle\\
&=b_4(\omega)z|z|^2+b_5(\omega)\bar z|z|^2+b_6(\omega)z^3+b_7(\omega)\bar z^3;\\
|F_R|+|G_R|&\le|\langle\text{garbage},\tilde\psi\rangle|.
\end{aligned}
\]
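In the above formulas $\Im b_j(\omega)=0$ for $j=1,2,3$. (Standard theory gives a pointwise estimate $|\xi(\omega)|\le ce^{-a|x|}$ for some $a>0$, $c>0$ fixed.) From $\zeta=z\xi+\bar z\sigma_1\xi$ we see
\[ L_3(\zeta)+L_3'(\zeta)=z|z|^2F_1(\omega,x)+\bar z|z|^2F_2(\omega,x)+z^3F_3(\omega,x)+\bar z^3F_4(\omega,x) \]
with $F_1,\dots,F_4$ vectors with real entries. Define now $h_{00}=h_{00+}+h_{00-}$. Define similarly $h_{01}$ and $\tilde h_{01}$, using the terms in (4.7). We have $C(\zeta)=zC(\xi)-\bar z\sigma_1C(\xi)\sigma_1$ (by direct computation), $C(\zeta)h_0=C(\zeta)(|z|^2h_{00}+z^2h_{01}+\bar z^2\tilde h_{01})$ and
\[ b_4(\omega)=\langle F_1(\omega,\cdot),\sigma_3\xi\rangle+\langle C(\xi)h_{00},\sigma_3\xi\rangle-\langle\sigma_1C(\xi)\sigma_1h_{01},\sigma_3\xi\rangle. \]
By definitions and computation, $\sigma_1C^*(\xi)\sigma_1\sigma_3\xi=-2\sigma_3L_2(\xi)$. Thus $\Im b_4=-\Im\langle h_{01},\sigma_1C^*(\xi)\sigma_1\sigma_3\xi\rangle=2\Im\langle h_{01},\sigma_3L_2(\xi)\rangle$. We have
\[
\begin{aligned}
\Im b_4&=2\lim_{\varepsilon\to0^+}\Im\big\langle-R(H(T),2\lambda(T)+i\varepsilon)P_1L_2(\xi),\ P_1^*\sigma_3L_2(\xi)\big\rangle\\
&=-\lim_{\varepsilon\to0^+}\Im\big\langle[R(H(T),2\lambda(T)+i\varepsilon)-R(H(T),2\lambda(T)-i\varepsilon)]P_1L_2(\xi),\ P_1^*\sigma_3L_2(\xi)\big\rangle.
\end{aligned}
\]
By hypothesis (2.8), $\Im b_4(\omega)\neq0$. In fact: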
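Lemma 4.1. $\forall\,\omega\in O$, we have $\Im b_4(\omega)<0$.

Proof. Notice that $P_1^*\sigma_3=\sigma_3P_1$, $W_+^*(\omega)\sigma_3=\sigma_3Z_+(\omega)$ and $Z_+^*(\omega)\sigma_3=\sigma_3W_+(\omega)$. The last two equalities are consequences of the definition of $W_+$ and of the equality (consequence of Sec. 3) $Z_+u=\lim_{t\to+\infty}e^{it\sigma_3(-\Delta+\omega)}e^{-itH_\omega}u$ for $u\in\mathrm{Range}(W_+)$. We can set $P_1F=W_+(\omega)\tilde F$ and $H_0=-\Delta+\omega$. Then, for $R=R(H_\omega)$, $R_0=R(H_0)$,
\[ \Big\langle\sum_{\pm}\pm R(2\lambda\pm i\varepsilon)P_1F,\ \sigma_3W_+\tilde F\Big\rangle=\Big\langle\sum_{\pm}\pm R_0(2\lambda\pm i\varepsilon)\tilde F,\ W_+^*\sigma_3W_+\tilde F\Big\rangle. \]
Use $W_+^*\sigma_3=\sigma_3Z_+$ and $Z_+W_+=\text{identity}$. Then, for $\varepsilon\to0^+$, $\tilde F=\begin{bmatrix}\tilde F_1\\\tilde F_2\end{bmatrix}$,
\[ \pi\big\langle\delta(\sigma_3[-\Delta+\omega]-2\lambda)\tilde F,\ \sigma_3\tilde F\big\rangle=\pi\big\langle\delta(-\Delta+\omega-2\lambda(\omega))\tilde F_1,\ \tilde F_1\big\rangle>0. \]
The lemma is proved.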
By the continuity of the function $(H_\omega+z)^{-1}$ in $z$ and $\omega$:

Lemma 4.2. $\forall$ compact sets $K\subset O$, $\exists\,\gamma>0$ such that $\forall\,\omega\in K$, lhs (2.8) $>\gamma$.

We rewrite (4.7) as follows:
\[
\begin{aligned}
\dot\omega\langle\Phi,\Phi'\rangle&=2a_0(\omega)\Im z^2+2\Im(a_1(\omega)z|z|^2+a_2(\omega)z^3)+F_R,\\
i\dot z-\lambda(t)z&=b_1(\omega)z^2+b_2(\omega)\bar z^2+b_3(\omega)|z|^2+b_4(\omega)z|z|^2+b_5(\omega)\bar z|z|^2+b_6(\omega)z^3+b_7(\omega)\bar z^3+G_R,\\
|F_R|+|G_R|&\le\langle(4.8),\ ce^{-a|x|}\rangle,
\end{aligned}
\]
with $a_0(\omega)$ and $b_1(\omega)$, $b_2(\omega)$ and $b_3(\omega)$ real. Ignoring $F_R$ and $G_R$, the above is a finite dimensional system. Standard theory of normal forms gives:

Lemma 4.3. For appropriate $c(\omega)$'s and $d(\omega)$'s, with $c_0$, $d_1$, $d_2$ and $d_3$ real, and $\tilde\omega=\omega+2c_0(\omega)$
\[ i\dot{\tilde z}-\lambda(t)\tilde z=b\,\tilde z|\tilde z|^2+\tilde G_R. \]
If we replace $\tilde G_R$ and $\tilde F_R$ with $0$, we obtain a closed system in $\tilde\omega$ and $\tilde z$ where $\tilde z$ decays. We show in Sec. 6 that the closed system is close to the starting one.

5. Description of the Continuous Spectrum Component $f(t)$ of $R(t)$

Recall $f$ in (2.7). Consider an interval $t\in[0,T]$. Decompose
\[ f(t)=k(t)+\tilde\zeta(t)+h(t)\in N_g(H(T))\oplus\sum_{\pm}N(H(T)\mp\lambda(T))\oplus X_c(T). \]
Here $h(t)$ is the continuous spectrum component. It is elementary to show
\[ |f(t)-h(t)|\le(\text{constant})\,|\omega(T)-\omega(t)|\,\langle|f(t)|,\psi\rangle. \]
We apply $P_1$, see (4.6), to (4.1). We set
\[ \delta\sigma=\dot\gamma+\omega-\omega(T),\qquad V(t)=H(t)-\sigma_3(-\Delta+\omega(t)) \]
and we write
\[ ih_t=H(T)h+\delta\sigma(P_+-P_-)h+P_1L_2(\zeta)+D_1 \tag{5.1} \]
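where
\[ D_1=\tilde D_1+P_1(\text{Nonlinear}-L_2(\zeta))+\delta\sigma[P_1\sigma_3-(P_+-P_-)]h, \]
where
\[
\begin{aligned}
\tilde D_1&=\dot\gamma P_1\sigma_3[\Phi-\Phi_{\omega(T)}]-i\dot\omega P_1[\Phi'-\Phi'_{\omega(T)}]+P_1[V(t)-V(T)]h-i\dot\omega P_1\zeta'\\
&\quad+P_1[(z\lambda-i\dot z)-(\bar z\lambda+i\dot{\bar z})\sigma_1][\xi-\xi(T)]+(\delta\sigma P_1\sigma_3+P_1[V(t)-V(T)])(k+\tilde\zeta).
\end{aligned}
\]
$D_1$ will be proved in Sec. 6 to be small. Informally, we obtain a closed system of ODE's by replacing in (4.9) $f$ with $h$ and by neglecting $D_1$. Set $h_\pm=P_\pm h$ and write $h_\pm=h_{1,\pm}+h_{2,\pm}+h_{3,\pm}$ where
\[
\begin{aligned}
h_{1\pm}(t)&=e^{-iH(T)t}e^{\pm i\int_0^t\delta\sigma(\tau)d\tau}h_\pm(0),\\
h_{2\pm}(t)&=-i\int_0^te^{-iH(T)(t-s)}e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}P_\pm L_2(\zeta)\,ds,\\
h_{3\pm}(t)&=-i\int_0^te^{-iH(T)(t-s)}e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}P_\pm D_1\,ds.
\end{aligned}
\]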
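Set $F(\omega,x)=L_2(\xi)$ and let $F_0(\omega,x)$ be the function in (4.7). Set $z_1=e^{-i\lambda(T)t}z$. Textbook arguments give:

Lemma 5.1. We have
\[
\begin{aligned}
h_{2,\pm}(t)&=-i\lim_{\varepsilon\to0^+}\int_0^t\Big\{e^{-iH(T)(t-s)-2i\lambda(T)s+\varepsilon s}e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}(z_1^2P_\pm F)(s)\\
&\qquad\qquad-e^{-iH(T)(t-s)+2i\lambda(T)s+\varepsilon s}e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}(\bar z_1^2P_\pm\sigma_1F)(s)\Big\}ds\\
&\quad+\int_0^te^{-iH(T)(t-s)}e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}(|z|^2P_\pm F_0)(s)\begin{bmatrix}1\\-1\end{bmatrix}ds.
\end{aligned}
\]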
Keeping in mind formulas (4.7), Lemma 5.1 and Theorem 3.3 lead to:

Lemma 5.2. We have $h_{2,\pm}=h_{0,\pm}-U(t,0)h_{0,\pm}(0)+h_{02,\pm}$ with:
\[
\begin{aligned}
U(t,0)h_{0,\pm}(0)&=-e^{-iH(T)t\pm i\int_0^t\delta\sigma(\tau)d\tau}R(H(T),2\lambda(T)+i0)z^2(0)P_\pm F(\omega(0),\cdot)\\
&\quad+e^{-iH(T)t\pm i\int_0^t\delta\sigma(\tau)d\tau}R(H(T),-2\lambda(T)+i0)\bar z^2(0)P_\pm F(\omega(0),\cdot)\\
&\quad-\frac{e^{-iH(T)t\pm i\int_0^t\delta\sigma(\tau)d\tau}}{H(T)}|z(0)|^2P_\pm F_0(\omega(0),\cdot)\begin{bmatrix}1\\-1\end{bmatrix};\\
h_{02,\pm}&=\int_0^te^{-iH(T)(t-s)-2i\lambda(T)s}R(H(T),2\lambda(T)+i0)\frac{d}{ds}\Big[e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}z_1^2P_\pm F(\omega,\cdot)\Big]ds\\
&\quad-\int_0^te^{-iH(T)(t-s)+2i\lambda(T)s}R(H(T),-2\lambda(T)+i0)\frac{d}{ds}\Big[e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}\bar z_1^2\sigma_1P_\pm F(\omega,\cdot)\Big]ds\\
&\quad+\int_0^t\frac{e^{-iH(T)(t-s)}}{H(T)}\frac{d}{ds}\Big[e^{\pm i\int_s^t\delta\sigma(\tau)d\tau}|z|^2P_\pm F(\omega,\cdot)\Big]ds.
\end{aligned}
\]
We finally state the following two facts, which will be proved in Secs. 7 and 8.
Proposition 5.3. For $p\in[1,2]$, $q\in[2,\infty)$ there is a constant $c_{p,q}$, independent of $\omega(T)$ as long as $\omega(T)\in K$, $K$ a preassigned compact subset of $O$, such that
\[ \|[P_1\sigma_3-(P_+-P_-)]g\|_p\le c_{p,q}\|g\|_q. \]

Lemma 5.4. We have for some $C=C(K)$, $K$ compact subset of $O$, $\omega(T)\in K$,
(i)
\[ \|\langle x\rangle^{-\gamma}e^{-iH(T)t}R(H(T),\pm2\lambda(T)+i0)P_1g\|_2<C\langle t\rangle^{-\frac32}\|\langle x\rangle^{\gamma}g\|_2,\qquad\gamma>4; \]
(ii)
\[ \|\langle x\rangle^{-\gamma}e^{-iH(T)t}(1/H(T))P_1g\|_2<C\langle t\rangle^{-\frac32}\|\langle x\rangle^{\gamma}g\|_2,\qquad\gamma>3/2. \]
6. Proof of Theorem 2.4 We prove Theorem 2.4 assuming Proposition 5.3 and Lemma 5.4. We need to justify the distinction between main and garbage terms made above. With the above preparations and the linear estimates, the proof in this section is a standard continuity argument. Consider an interval [0, T ]. ∀ t ∈ [0, T ] set M0 (t) = |ω(t) − ω(T )| , M1 (t) = |z(t)| , M2 (t) = khxi−s (f (t) − h0 (t) − h1 (t) + U(t, 0)h0 (0))k2 , M3 (t) = kf (t)kp+1 . The Mj (t) depend on T , i.e. on the spectral decomposition of H(T ). Consider, for hzi = (1 + |z|2 )1/2 , M0 (t, T ) = suphτ iM0 (τ ) , τ ≤t
1
M1 (t, T ) = suphτ i 2 M1 (τ ) , τ ≤t
3
M2 (t, T ) = suphτ i 2 M2 (τ ) , τ ≤t
M3 (t, T ) = sup τ ≤t
hτ id M3 (τ ) . log(1 + hτ i)
˜ j (t) = Mj (t, t). We Of course we can define similar quantities for any T0 ≤ T . Set M can pick T so that the Mj (t, T0 ) are smaller than a preassigned positive number M for all 0 ≤ t ≤ T0 ≤ T . Proposition 6.1. Fix σ > 0 small. There are fixed constants M (small ), C 1 (large) and 0 (very small ) such that for any 0 < < 0 and if we denote by (6.1) the ˜ 0 (0) = M ˜ 2 (0) = 0) following inequalities (notice M ˜ 1 (0) ≤ 21 ; M ∀j
and, ∀ t
˜ 3 (0) ≤ M
˜ j (t) < M , 0 ≤ t ≤ T, M
(6.1)
then inequalities (6.1) imply that ∀ t ∈ [0, T ] we have ˜ 0 (t) ≤ C1 21 , M
˜ 1 (t) ≤ C1 21 , M
(6.2)
˜ 2 (t) ≤ C1 1+σ , M ˜ 3 (τ ) ≤ C1 d−σ . M 1
Proposition 6.1 implies that we can take T = ∞ if C1 02 < M . We have: Proposition 6.2. There are fixed constants M and 0 as above and a function C(M ) continous at M = 0, such that if we assume inequalities (6.1), if we set Mj = Mj (t, T0 ), then for any 0 ≤ < 0 and any T0 ≤ T we have M0 ≤ C(M )[M21 + N + −1 (M1 M2 + M41 + M22 + M31 M0 + Mp3 )] , 1
1
1
M1 ≤ C(M )[ 2 + M14 (M1 N + M1 M21 (0) + M1 M2 + M41 + M22 + Mp3 + M0 M31 ) 4 ] , 2 M2 ≤ C(M )[M21 M0 + M0 M3 + M31 + Mµp 3 + M 1 M3 ] ,
M3 ≤ C(M )[N + d−1 (M21 + M0 M3 + M23 ) + M0 M21 (0) + M0 M2 ] . Proposition 6.2 implies Proposition 6.1 through a continuation argument. That is, suppose that in [0, T1 ] (6.2) holds with 2C1 rather than with C1 and call (6.2bis) this weaker form of (6.2). Then, by picking 0 very small with respect to 1/C1 , we obtain p 1 1 M0 ≤ C(M ) 2 , M1 ≤ 8 C1 C(M ) 2 , M2 ≤ C(M )1+σ , M3 ≤ C(M )d−σ . Then for C1 larger than an appropriate multiple of C(M )2 , we see (6.1) and (6.2bis) imply (6.2) in [0, T1 ]. By continuity we extend (6.2) in [0, T ]. Next we prove Propostion 6.2, only for t = T0 = T .
6.1. Estimates h1 and U(t, 0)h0 (0) Set h = h1 + h2 + h3 , hj = hj+ + hj− (see Sec. 5). The following lemma follows by the definition of h1 and U(t, 0)h0 (0), by Theorem 3.3 and by Lemma 5.4: Lemma 6.1.1. We have, for some fixed C = C(M ):
(i) $\|h_1(t)\|_{p+1}\le C\langle t\rangle^{-d}N$;
(ii) $\|\langle x\rangle^{-s}h_1(t)\|_2\le C\langle t\rangle^{-\frac32}N$;
(iii) $\|\langle x\rangle^{-s}U(t,0)h_0(0)\|_2\le C\langle t\rangle^{-\frac32}M_1^2(0)$.

6.2. Estimates for $|\omega(t)-\omega(T)|$, $|z(t)|$ and garbage = (4.8)

The first step is a bound on the garbage terms, which we state without proof:

Lemma 6.2.1. We have, for some fixed $C=C(M)$: $|\langle\text{garbage},ce^{-a|x|}\rangle|$
1
≤ Chti−2 (M1 M2 + M41 + M22 + Mp3 + M0 M31 ) + Chti− 2 hti− 2 M1 (M21 (0) + N ) . The next step is:
Lemma 6.2.2. We have, for some C = C(M ) and for 0 ≤ t ≤ T,
|ω(t) − ω(T )| ≤ Chti−1 [M21 + N + −1 (M1 M2 + M41 + M22 + M31 M0 + Mp3 )] .
Proof. We have ω ˜˙ = F˜R and so |ω ˜˙ | ≤ |hgarbage, ce−a|x|i|. We write Z T |ω(t) − ω(T )| ≤ |ω(t) − ω ˜ (t)| + |ω(T ) − ω ˜ (T )| + |ω ˜˙ (s)|ds t
and we apply Lemma 6.2.1.
Lemma 6.2.2 gives the first inequality in Proposition 6.2. The second one follows from the following lemma.

Lemma 6.2.3. We have, for some $C=C(M)$ and for $0\le t\le T$,
\[ \langle\varepsilon t\rangle^{\frac12}|z(t)|\le C\Big[\varepsilon^{\frac12}+M_1^{\frac14}\big(M_1N+M_1M_1^2(0)+M_1M_2+M_1^4+M_2^2+M_3^p+M_0M_1^3\big)^{\frac14}+M_1^2\Big]. \]

Proof. Consider, for $\Gamma=2\Im b(\omega(T))$, the equation
\[ \frac{d}{dt}|\tilde z|^2=-\Gamma|\tilde z|^4+2\Im(\tilde G_R\bar{\tilde z}). \]
We have, for $C=C(M)$ and $\mathcal{M}$ the product of $C$ with $[M_1N+M_1M_1^2(0)+M_1M_2+M_1^4+M_2^2+M_3^p+M_0M_1^3]M_1$,
\[ |2\Im(\tilde G_R\bar{\tilde z})|\le(1+\varepsilon t)^{-\frac52}\mathcal{M}. \]
Now set $\rho_0=B(1+\varepsilon t)^{-1}$, where $B>0$ satisfies
\[ \varepsilon B=\Gamma B^2-\mathcal{M},\qquad\text{i.e.}\qquad B=\frac{\varepsilon+\sqrt{4\mathcal{M}\Gamma+\varepsilon^2}}{2\Gamma}. \]
We have
\[ \dot\rho_0=-\Gamma\rho_0^2+(1+\varepsilon t)^{-2}\mathcal{M} \]
and so, if $|\tilde z(0)|^2\le\varepsilon/\Gamma\le B$, then
\[ |\tilde z(t)|^2\le\rho_0(t)\le\frac{\varepsilon+\sqrt{\mathcal{M}\Gamma}}{\Gamma}(1+\varepsilon t)^{-1}. \]
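The comparison argument in the proof of Lemma 6.2.3 can be checked numerically on a toy scalar model. The sketch below assumes the barrier has the form $\rho_0(t)=B(1+\varepsilon t)^{-1}$ with $B$ the positive root of $\Gamma B^2-\varepsilon B-\mathcal{M}=0$ (this is the reading under which $\rho_0$ solves the stated ODE exactly), replaces $|\tilde z|^2$ by a real function $y$ driven by the largest forcing allowed, and uses illustrative parameter values that are not taken from the paper.

```python
import math

def barrier_constant(eps, gamma, m):
    """Positive root B of gamma*B**2 - eps*B - m = 0."""
    return (eps + math.sqrt(eps * eps + 4.0 * m * gamma)) / (2.0 * gamma)

def rho0(t, eps, gamma, m):
    """Barrier rho_0(t) = B / (1 + eps*t)."""
    return barrier_constant(eps, gamma, m) / (1.0 + eps * t)

def forced_riccati(eps, gamma, m, t_end=200.0, dt=1e-3):
    """Explicit Euler for y' = -gamma*y^2 + m*(1 + eps*t)^(-5/2), y(0) = eps/gamma.

    Returns a list of (t, y) samples; y plays the role of |z~|^2.
    """
    t, y = 0.0, eps / gamma
    samples = [(t, y)]
    while t < t_end:
        y += dt * (-gamma * y * y + m * (1.0 + eps * t) ** -2.5)
        t += dt
        samples.append((t, y))
    return samples

eps, gamma, m = 0.1, 1.0, 0.01  # illustrative values only (assumptions)
B = barrier_constant(eps, gamma, m)
# Largest value of y - rho_0 along the trajectory; negative means y stays below the barrier.
worst_gap = max(y - rho0(t, eps, gamma, m) for t, y in forced_riccati(eps, gamma, m))
print(B, worst_gap)
```

The forcing in the toy equation decays like $(1+\varepsilon t)^{-5/2}$ while the barrier absorbs a forcing of size $(1+\varepsilon t)^{-2}$, which is why the solution remains under $\rho_0$ for all times sampled.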
˜ We estimate f . Recall f = k+ζ+h. From (5.1) the following follows immediately: Lemma 6.2.4. We have, for some fixed C = C(M ) and for any q ∈ [1, +∞], −2 ˜ kk(t)kq + kζ(t)k M0 (N + M2 + M21 ) . q ≤ Chti
The right-hand side in the last inequality can be absorbed in the right-hand sides in Proposition 6.2. 6.3. Further estimates for h2 and h02 Set h02 = h02,+ + h02,− . We have: Lemma 6.3.1. For some fixed C = C(M ), (i) kh2 (t)kp+1 ≤ Chti−d log(1 + hti)d−1 M21 (ii) 3
3
khxi−s h02 k2 ≤ Chti− 2 (M0 M1 + M21 + M23 )M1 + Chti− 2 M21 (0) . p+1
Proof. To bound (i), we use the L p → Lp+1 estimate provided by Theorem 3.3 and the fact that L2 (ζ) is spatially exponentially decreasing and we write: Z t Z t 1 −iH(T )(t−s) kh2± (t)kp+1 ≤ ke P± L2 (ζ)kp+1 ds ≤ C |z(s)|2 ds (t − s)d 0 0 ≤C
Z
t 0
1 hsi−1 dsM21 . (t − s)d
If t ≤ 1, the last integral is bounded by t1−d ≤ d−1 . If t > 1 we split the integral R t Rt into 02 + t , with 2
Z
t 2
0
≤ Ct
−d
log(t) ,
Z
t
t 2
≤C
1 1−d t . t
As for (ii), thanks to Lemma 5.4 we have: Z t 3 khxi−s h02,± k2 ≤ C˜ ht − si− 2 (|δσ(s)z 2 | + |z˙1 z|)ds . 0
Recalling δσ = γ˙ + ω − ω(T ), we bound |δσ(s)| ≤ Chsi−1 (M0 + M21 + M23 ).
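Therefore $|\delta\sigma(s)z^2(s)|\le C\langle s\rangle^{-2}M_1^2(M_0+M_1^2+M_3^2)$. We bound $|\dot z_1(s)z(s)|\le C\langle s\rangle^{-\frac32}M_1(M_0M_1+M_1^2+M_3^2)$. Finally, another elementary calculation leads to:
\[ \int_0^t\langle t-s\rangle^{-\frac32}\langle s\rangle^{-\frac32}\,ds\le C\langle t\rangle^{-\frac32}. \]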
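The elementary convolution bound just used, in the form $\int_0^t\langle t-s\rangle^{-3/2}\langle s\rangle^{-3/2}ds\le C\langle t\rangle^{-3/2}$, can be sanity-checked numerically. The sketch below uses a plain midpoint rule; the sample times and the number of quadrature nodes are arbitrary choices. Splitting the integral at $s=t/2$ and using $\langle t/2\rangle\ge\frac12\langle t\rangle$ shows by hand that $C=15$ is admissible, and the code only confirms that the computed ratios stay below that value.

```python
import math

def bracket(x):
    """Japanese bracket <x> = (1 + x^2)^(1/2)."""
    return math.sqrt(1.0 + x * x)

def convolution(t, n=100_000):
    """Midpoint-rule approximation of the integral over [0, t] of <t-s>^(-3/2) <s>^(-3/2) ds."""
    h = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += bracket(t - s) ** -1.5 * bracket(s) ** -1.5
    return total * h

# Ratio of the integral to <t>^(-3/2); the bound says this stays below a fixed C.
ratios = {t: convolution(t) * bracket(t) ** 1.5 for t in (0.5, 1.0, 5.0, 20.0, 100.0, 1000.0)}
for t, r in ratios.items():
    print(t, r)
```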
6.4. Estimate for kh3 kp+1 We next consider h3 . Lemma 6.4.1. We have, for C = C(M ), kh3 kp+1 ≤ Chti−d d−1 [M0 M21 + M31 + M0 M3 + M23 ] . Proof. First of all we have, for a rapidly decreasing function ψ and some C = C(M ), |Nonlinear − L2 (ζ)| ≤ C(ψ|f |2 + |z|3 + |f |p ) . Then we split D1 = D2 + D3 , with D3 = Nonlinear − L2 (ζ)
if |Nonlinear − L2 (ζ)| > 2C(ψ|f |2 + |z|3 )
= 0 otherwise , so that we have the estimate |D3 | ≤ 2C|f |p . Now we bound (here we use dp > 1, Theorem 3.3 and an elementary bound for the last integral) Z t Rt ke−iH(T )(t−s) e±i s δσ(τ )dτ P± D3 dskp+1 0
≤ C1
Z
≤ C1
Z
Our next step is to show:
t 0 t 0
(t − s)−d kf (s)kpp+1 ds logp (1 + hsi) dsMp3 ≤ Cd−1 hti−d Mp3 . (t − s)d hsidp
Claim 6.4.2. We have for some fixed C: 3
kD2 (t)k p+1 ≤ Chti− 2 (M0 M21 + M0 M3 + M31 + M23 ) . p
Assuming Claim 6.4.2, by Theorem 3.3 we have, for a C1 = C1 (M ), Z t Z t R −iH(T )(t−s) ±i st δσ(τ )dτ ke e (t − s)−d kD2 (s)k p+1 ds , P± D2 kp+1 ds ≤ C1 0
0
and Lemma 6.4.1 follows from the elementary Z t 3 (t − s)−d hsi− 2 ds ≤ Cσ d−1 hti−d . 0
p
Recalling the definition of D1 in Sec. 2, Claim 6.4.2 follows from the following subclaims. Subclaim (i). We have ˜ p+1 ≤ Chti− 23 (M0 + M1 )(M2 + M2 ) . ˜ 1 (t) − δσP1 σ3 (k + ζ)k kD 1 3 p
For any s > 0 we have ˜ 1 (t) − δσP1 σ3 (k + ζ)]k ˜ 2 ≤ Cs hti− 32 (M0 + M1 )(M2 + M2 ) . khxis [D 1 3 Proof. Recall ζ 0 = z∂ω ξ + z¯σ1 ∂ω ξ. Using we obtain
|ω| ˙ + |γ| ˙ + |iz˙ − λz| ≤ c(|z|2 + kf k2p+1 ) 3
kωP ˙ 1 ζ 0 k p+1 ≤ chti− 2 M1 (M21 + M23 ) . p
˜ we have ˜ 1 (t) − δσP1 σ3 (k + ζ), For another term of the sum defining D 3
kγP ˙ 1 σ3 (Φ − Φω(T ) )k p+1 ≤ chti− 2 M0 (M21 + M23 ) . p
The remaining terms can be treated like the last one. The proof of the second inequality in Subclaim (i) is the same (all the terms are rapidly decreasing and regular in the spatial variables). Subclaim (ii). We have ˜ p+1 kδσ[P1 σ3 − (P+ − P− )]hk p+1 + kδσP1 σ3 (k + ζ)k p
p
≤ Chti−1−d log(1 + hti)(M0 + M1 + M23 )M3 .
For any s > 0 we have
˜ 2 ≤ Cs hti−1−d log(1 + hti)(M0 + M1 + M2 )M3 . khxis δσP1 σ3 (k + ζ)k 3 We have by Proposition 5.3 kδσ[P1 σ3 − (P+ − P− )]hk p+1 ≤ C|δσ| khkp p
Similarly,
≤ Chti−1−d log(1 + hti)(M0 + M1 + M23 )M3 . khxis δσ[P1 σ3 − (P+ − P− )]hk2 ≤ kδσ[P1 σ3 − (P+ − P− )]hk2 ≤ C|δσ| khkp .
From Lemma 6.2.4
˜ p+1 + khxis δσP1 σ3 (k + ζ)k ˜ 2 ≤ Cs kδσhkp . kδσP1 σ3 (k + ζ)k p
If D3 = 0 we have |Nonlinear − L2 (ζ)| ≤ cψ(|f |2 + |z|3 ), which leads immediately to: Subclaim (iii). We have kNonlinear − L2 (ζ)k
3
L
p+1 p
(D3 =0)
≤ chti− 2 (M31 + M23 ) .
For any s > 0 we have 3
khxis [Nonlinear − L2 (ζ)]kL2 (D3 =0) ≤ cs hti− 2 (M31 + M23 ) . 6.5. Estimate for khxi−s h3 k2 Lemma 6.5.1. We have, for C = C(M ) and µ = (1/2 − 1/p)/(1/2 − 1/(p + 1)), 3
khxi−s h3 k2 ≤ Chti− 2 [M0 M21 + M0 M3 + M31 + M23 + Mµp 3 ].
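The exponents in Lemma 6.5.1 are linked by simple arithmetic: with $d=3(1/2-1/(p+1))$ and $\mu=(1/2-1/p)/(1/2-1/(p+1))$ one finds $\mu pd=3(p/2-1)$, so that $\mu pd>3/2$ exactly when $p>3$. The short check below uses exact rational arithmetic; the function name and the sample values of $p$ are ours, chosen only for illustration.

```python
from fractions import Fraction

def decay_exponents(p):
    """Return (d, mu, mu*p*d) for d = 3(1/2 - 1/(p+1)), mu = (1/2 - 1/p)/(1/2 - 1/(p+1))."""
    p = Fraction(p)
    half = Fraction(1, 2)
    d = 3 * (half - 1 / (p + 1))
    mu = (half - 1 / p) / (half - 1 / (p + 1))
    return d, mu, mu * p * d

for p in (3, Fraction(7, 2), 4, 5):
    d, mu, mupd = decay_exponents(p)
    # mu*p*d simplifies to 3*(p/2 - 1); it exceeds 3/2 precisely for p > 3.
    print(p, d, mu, mupd, mupd > Fraction(3, 2))
```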
We start by considering (D3 introduced in Lemma 6.4.1) Z t Z t−1 Z R −iH(T )(t−s) ±i st δσ(τ )dτ e e P± D3 (s)ds = ··· + 0
0
t t−1
··· .
We have (it is here that we use p > 3 which implies µpd > 3/2):
Z t−1 Z Z t−1
−s t−1 3 p − 32
hxi
≤ c1 ht − si− 2 kf (s)kµp kf (s)k ds ≤ c · · · ht − si 1 p p+1
0
≤ c1
We have
Z
−s
hxi
t t−1
0
0
2
Z
t−1
logµp (1 + hsi)
ht − si
0
Z
· · · ≤ c 1
2
t t−1
3 2
hsiµpd
3
−2 dsMµp Mµp 3 ≤ Chti 3 .
(t − s)−d kf (s)kpp+1 ds ≤ C ≤ Chti−dp Mp3 .
The integral with D2 instead of D3 can be estimated almost exactly like in the proof of Claim 6.4.2. The only change is for Z t Z t−1 Z t R −iH(T )(t−s) ±i st δσ(τ )dτ e e ···+ ··· . δσ(s)[P1 σ3 − (P+ − P− )]hds = 0
0
We have
Z
−s
hxi
0
t−1
Z
· · · ≤ c
2
≤ c1
Z
≤ c2
Z
≤ c2
Z
t−1 0 t−1
0 t−1 0 t−1 0
· · ·
t−1
∞ 3
ht − si− 2 kδσ(s)[P1 σ3 − (P+ − P− )]hk1 ds 3
ht − si− 2 kδσ(s)hkp ds 3
ht − si− 2 hsi−1−d log(1 + hsi)ds(M0 + M1 + M23 )M3 .
We have
Z
−s
hxi
t−1 0
Z
· · · ≤ c
2
≤ c1
Z
≤ c2
Z
t−1 0 t
t−1 t t−1
· · ·
p+1
(t − s)−d kδσ(s)[P1 σ3 − (P+ − P− )]hk p+1 ds p
(t − s)−d kδσ(s)hkp ds
\[ \le c_3\langle t\rangle^{-1-d}\log(1+\langle t\rangle)(M_0+M_1+M_3^2)M_3. \]

7. Proof of Proposition 5.3

In this section we do not make use of any evenness of functions. Just by adapting the exponent $s$ of the weighted spaces $L^2_{\pm s}$, the arguments are valid for all dimensions $n\ge3$. Set $\omega=\omega(T)$, $H=H_\omega$, $H_0=\sigma_3(-\Delta+\omega)$ and $W_+=W_+(\omega)$. Furthermore set $R_0(z)=(H_0-z)^{-1}$ and $R(z)=(H-z)^{-1}$. The first lemma would follow from standard theory if $H$ were self-adjoint.

Lemma 7.1. If $u\in X_c(H)$, then
\[
\begin{aligned}
u=P_+u+P_-u&=\lim_{\varepsilon\to0^+}\frac1{2\pi i}\lim_{M\to+\infty}\int_\omega^M[R(\lambda+i\varepsilon)-R(\lambda-i\varepsilon)]u\,d\lambda\\
&\quad+\lim_{\varepsilon\to0^+}\frac1{2\pi i}\lim_{M\to+\infty}\int_{-M}^{-\omega}[R(\lambda+i\varepsilon)-R(\lambda-i\varepsilon)]u\,d\lambda.
\end{aligned}
\tag{7.1}
\]
Furthermore the first (resp. second) limit in the right-hand side of (7.1) is $P_+u$ (resp. $P_-u$).

(7.1) is an easy consequence of the isomorphism $W_+:L^2\to X_c$ and of the limiting absorption principle for $H_0$. The representation for $P_+$ follows from $P_+=W_+E_+Z_+$ and $E_+=\chi_{[\omega,+\infty)}(H_0)$. The argument for $P_-$ is similar. Proposition 5.3 follows from
\[ \|[P_\pm\sigma_3\mp P_\pm]g\|_p\le c_{p,q}\|g\|_q. \]
The proofs are similar so we consider $P_+$ and the $[\omega,+\infty)$ integral in (7.1) only. Using Lemma 7.1 we express $P_+\sigma_3-P_+$ as a sum where for the first few terms the desired inequality is obvious or easy. For the remaining terms, we observe that the first term in the right-hand side of (7.1) is a path integral from $+\infty$ to $+\infty$ around $[\omega,+\infty)$ of an analytic function in $z$, which we deform into an integral on $(\omega-\delta-i\infty,\omega-\delta+i\infty)$ for $\delta>0$ small. On this vertical line, away from the spectra of $H$ and $H_0$, the resolvents behave nicely. Setting $H=H_0+V$, we write
\[ \sum_{\pm}\pm R(\lambda\pm i\varepsilon)=\sum_{\pm}\pm(1+R_0(\lambda\pm i\varepsilon)V)^{-1}R_0(\lambda\pm i\varepsilon). \tag{7.2} \]
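By elementary computation, for $E_-$ see (3.6),
\[ R_0(\lambda\pm i\varepsilon)\sigma_3=R_0(\lambda\pm i\varepsilon)-2(-\Delta+\omega+\lambda\pm i\varepsilon)^{-1}E_-. \]
Therefore
\[ \text{lhs}(7.2)\,\sigma_3=\text{lhs}(7.2)+2\sum_{\pm}\pm(1+R_0(\lambda\pm i\varepsilon)V)^{-1}E_-(-\Delta+\omega+\lambda\pm i\varepsilon)^{-1}. \]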
The main result of this section is: Proposition 7.2. In the above notation the following limit X Z M ± Ku = lim lim (1 + R0 (λ ± i)V )−1 E− (−∆ + ω + λ ± i)−1 udλ →0+ M →+∞
ω
±
defines an operator such that for any 1 ≤ p ≤ q < ∞ there is a constant c such that kKukp ≤ ckukq for any u ∈ Lq . PN +1 For N ≥ 1 (N = 1 is enough) (1 + R0 V )−1 = j=0 [−R0 V ]j + R0 V RV (−R0 V )N PN +1 0 and the corresponding decomposition K = j=0 Kj + K. The following lemma is straightforward: Lemma 7.3. For any u ∈ L2 the following limit exists and is equal to 0: Z MX K00 u = lim+ lim ±(−∆ + ω + λ ± i)−1 E− udλ = 0 . →0
M →+∞
ω
±
K10 .
We next consider We preliminarily observe that the multiplier operator −1 (−∆ + ω + z) has symbol satisfying, for
(7.3)
Therefore we have: Lemma 7.4. For any p ∈ (1, +∞) (resp. p = 1), z → (−∆ + ω + z)−1 defines a function 0 small ). We also state the following lemma, implied for R by Theorem 3.3. Lemma 7.5. The functions R0 (z) and R(z) from 1, are continuous with norm bounded by constant ×hzi−1/2 . We can assume u smooth and rapidly decreasing. Z MX K10 u = lim+ lim ±[R0 (λ ± i)V (−∆ + ω + λ ± i)−1 ]E− udλ . →0
M →+∞
ω
(7.4)
±
For u ∈ L2s , s > 1/2, Lemmas 7.4 and 7.5 imply that the limit on the right-hand side of (7.4) exists and converges to the following integral, absolutely convergent in
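$L^2_{-s}$ (but this is not a convergent integral in $L^2$; this is why we treat $K_1^0$ differently from the $K_j^0$, $j>1$):
\[ K_1^0u=\int_\omega^{+\infty}[R_0(\lambda+i0)-R_0(\lambda-i0)]V(-\Delta+\omega+\lambda)^{-1}E_-u\,d\lambda. \]
$R_0(\lambda+i0)-R_0(\lambda-i0)=2i\pi\delta(\Delta-\omega+\lambda)E_+$, for $E_+$ see (4.6), so schematically $K_1^0u$ is
\[ \int_\omega^{+\infty}\delta(\Delta-\omega+\lambda)V(-\Delta+\omega+\lambda)^{-1}u\,d\lambda. \tag{7.5} \]
After taking Fourier transform
\[ \int_\omega^{+\infty}\int_{\mathbb{R}^3}\delta(\lambda-\xi^2-\omega)\hat V(\eta)(|\xi-\eta|^2+\omega+\lambda)^{-1}\hat u(\xi-\eta)\,d\lambda\,d\eta=\int_{\mathbb{R}^3}d\eta\,(\xi^2+|\xi-\eta|^2+2\omega)^{-1}\hat V(\eta)\hat u(\xi-\eta). \]
Therefore, up to a constant factor, (7.5) is
\[ \int_{\mathbb{R}^6}e^{ix\cdot\xi}\,\frac{\hat V(\eta)\hat u(\xi-\eta)}{\xi^2+|\xi-\eta|^2+2\omega}\,d\eta\,d\xi. \]
We have
\[ \left|\frac{\partial^\alpha}{\partial(\xi,\eta)^\alpha}(\xi^2+|\xi-\eta|^2+2\omega)^{-1}\right|\le c_\alpha(1+|\xi|+|\eta|)^{-2-|\alpha|} \]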
and so an easy version of [4, Theorem 1] which we state without proof implies: Lemma 7.6. We have, for p, p1 , p2 ∈ [1, +∞), 1/p = 1/p1 + 1/p2 and a constant c = c(p1 , p2 ): kK10 ukp ≤ ckV kp1 kukp2 . We consider now Kj0 u = (−)j lim
→0+
X ±
±
Z
+∞
ω
[R0 (λ ± i)V ]j E− (−∆ + ω + λ ± i)−1 udλ .
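The symbol bound used for Lemma 7.6 can be made explicit in the case $\alpha=0$. Since $|\eta|\le|\xi|+|\xi-\eta|$, one has $(1+|\xi|+|\eta|)^2\le3+12(\xi^2+|\xi-\eta|^2)$, hence $(\xi^2+|\xi-\eta|^2+2\omega)^{-1}\le c_0(1+|\xi|+|\eta|)^{-2}$ with $c_0=\max(12,3/(2\omega))$. This constant is our own elementary estimate, not one stated in the text, and the random sampling below (with arbitrary sample sizes and ranges) only illustrates it.

```python
import math
import random

def norm(v):
    """Euclidean norm of a vector given as a tuple of floats."""
    return math.sqrt(sum(a * a for a in v))

def symbol(xi, eta, omega):
    """The multiplier (|xi|^2 + |xi - eta|^2 + 2*omega)^(-1) for xi, eta in R^3."""
    diff = tuple(a - b for a, b in zip(xi, eta))
    return 1.0 / (norm(xi) ** 2 + norm(diff) ** 2 + 2.0 * omega)

def weighted_symbol(xi, eta, omega):
    """Symbol multiplied by (1 + |xi| + |eta|)^2; the alpha = 0 bound says this is bounded."""
    return symbol(xi, eta, omega) * (1.0 + norm(xi) + norm(eta)) ** 2

def sup_over_samples(omega, n=20_000, scale=50.0, seed=0):
    """Largest sampled value of the weighted symbol over random (xi, eta) in R^3 x R^3."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n):
        xi = tuple(rng.uniform(-scale, scale) for _ in range(3))
        eta = tuple(rng.uniform(-scale, scale) for _ in range(3))
        best = max(best, weighted_symbol(xi, eta, omega))
    return best

for omega in (0.05, 1.0, 10.0):
    print(omega, sup_over_samples(omega), max(12.0, 3.0 / (2.0 * omega)))
```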
Using Lemmas 7.4 and 7.5 we see that this is a path integral around $[\omega,+\infty)$. For some $\delta>0$ small but fixed
\[ K_j^0u=(-)^j\int_{\omega-\delta-i\infty}^{\omega-\delta+i\infty}[R_0(\zeta)V]^jE_-(-\Delta+\omega+\zeta)^{-1}u\,d\zeta. \]
By (6.3), $\|R_0(\zeta):L^p\to L^p\|\le c(p,\delta)(1+|\zeta|)^{-1}$ $\forall\,p\in(1,+\infty)$. We conclude $\|K_j^0u\|_p\le c(p,q,\delta,V)\|u\|_q$ $\forall\,q\le p$. If $V$ were small, as in the case of Theorem 2.6, we would obtain a convergent series. Here we consider also the remainder term. Arguing as above
\[
\begin{aligned}
(-)^{N+2}Ku&=\lim_{\varepsilon\to0^+}\sum_{\pm}\pm\int_\omega^{+\infty}R_0(\lambda\pm i\varepsilon)VR(\lambda\pm i\varepsilon)V[R_0(\lambda\pm i\varepsilon)V]^N\\
&\qquad\qquad\times E_-(-\Delta+\omega+\lambda\pm i\varepsilon)^{-1}u\,d\lambda\\
&=\int_{\omega-\delta-i\infty}^{\omega-\delta+i\infty}R_0(\zeta)VR(\zeta)V[R_0(\zeta)V]^NE_-(-\Delta+\omega+\zeta)^{-1}u\,d\zeta.
\end{aligned}
\]
∀ p ∈ (1, 2] and ∀ q ∈ [2, ∞), if <ζ = ω − δ, (6.3) implies c(p, q, δ)(1 + |ζ|)−1 ≥ kR0 (ζ)V : L2 → Lp k + kV [R0 (ζ)V ]N : Lq → L2 k + k(−∆ + ω + ζ)−1 : Lq → Lq k . Furthermore kR(ζ) : L2 → L2 k ≤ c(δ)(1 + |ζ|)−1 for <ζ = ω − δ and so kK : Lq → Lp k < ∞ for 1 < p ≤ 2 ≤ q < ∞. 8. Proof of Lemma 5.4 Here the arguments are given for dimension n = 3 but, by minor changes, i.e. the explicit form of G0 below and the values of the exponent s in L2−s , the argument goes through for any dimension n ≥ 3. The proof of (ii) follows immediately from the following straightforward, thanks to Theorem 3.3, set of inequalities:
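\[ \left\| \langle x\rangle^{-\gamma}\,\frac{e^{-iH(T)t}}{H(T)}\,P_1 f \right\|_2 < c_1 \left\| \frac{e^{-iH(T)t}}{H(T)}\,P_1 f \right\|_\infty < c_2 \left\| Z_+\,\frac{e^{-iH(T)t}}{H(T)}\,P_1 f \right\|_\infty \le c_3 \left\| \frac{1}{H_0(T)} \right\|_{L^\infty\to L^\infty} \|e^{-iH_0(T)t} Z_+ P_1 f\|_\infty \le c_4\,t^{-\frac{n}{2}}\,\|f\|_1 . \]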
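For (i), as for (2.18) in [14], with $g(t)$ a smooth cutoff equal to 1 near $\mp 2\lambda(T)$ and supported in an interval $I = [a, b]$, $a > 0$, and $\bar g(t) = 1 - g(t)$, we consider (with $H = H(T)$, $H_0 = H_0(T)$, $g(H) = W_+ g(H_0) Z_+$, $R(z) = R(H, z)$, $R_0(z) = R(H_0, z)$, $\lambda = \lambda(T)$)
\[ \|\langle x\rangle^{-\gamma}\,\bar g(H)\,e^{-iHt} R(\mp 2\lambda + i0) P_1 f\|_2 + \|\langle x\rangle^{-\gamma}\,g(H)\,e^{-iHt} R(\mp 2\lambda + i0) P_1 f\|_2 . \]
The estimation of the first term is obtained from:
\[ \|\bar g(H) e^{-iHt} R(\mp 2\lambda + i0) P_1 f\|_{L^2_{-\gamma}} \le C_1 \|\bar g(H) e^{-iHt} R(\mp 2\lambda + i0) P_1 f\|_\infty \le c_2 \|Z_+ \bar g(H) e^{-iHt} R(\mp 2\lambda + i0) P_1 f\|_\infty \]
\[ \le C_2 \|\bar g(H_0) R_0(\mp 2\lambda + i0)\|_{B(L^\infty, L^\infty)}\,\|e^{-iH_0 t} Z_+ P_1 f\|_\infty \le C_3\,t^{-\frac{n}{2}}\,\|f\|_1 . \]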
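We now proceed with the analysis of
\[ \langle x\rangle^{-\gamma} g(H) e^{-iHt} R(\mp 2\lambda + i0) P_1 \langle y\rangle^{-\gamma} = e^{\pm 2i\lambda t}\,\langle x\rangle^{-\gamma} \int_t^{+\infty} e^{-i(H \pm 2\lambda - i0)s}\,g(H) P_1\,ds\,\langle y\rangle^{-\gamma} . \tag{8.1} \]
For definiteness we focus on the $-\lambda$ case. Using the plane waves $u(x, \xi)$ associated to $H$ and to the positive part $[\omega, +\infty)$ of the continuous spectrum, we can write
\[ \langle x\rangle^{-\gamma} g(H) e^{-i(H - 2\lambda - i\varepsilon)s} \langle y\rangle^{-\gamma} = (\text{constant})\,\langle x\rangle^{-\gamma} \int_{\mathbb{R}^3} u(x,\xi)\,e^{-i(\sigma_3(\xi^2 + \omega) - 2\lambda - i\varepsilon)s}\,g(\xi^2 + \omega)\,\bar u(y,\xi)\,d\xi\,\langle y\rangle^{-\gamma} . \tag{8.2} \]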
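Here $u(x, \xi)$ are characterized as
\[ u(x,\xi) = e^{ix\cdot\xi} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + e^{ix\cdot\xi}\,w(x,\xi) , \]
with $w(x, \xi)$ the unique solution in $L^2_{-s}$, $s > 1/2$, of the integral equation
\[ w(x,\xi) = -F(x,\xi) - \int_{\mathbb{R}^3} w(z,\xi)\,V^*(z)\,G_0(z,x,|\xi|)\,e^{i(z-x)\cdot\xi}\,dz , \]
with $V = H - H_0$, $[H_0 - (\omega + \xi^2)]\,G_0(z,x,|\xi|) = 4\pi\delta(z - x)$ and
\[ F(x,\xi) = \int_{\mathbb{R}^3} V(z) \begin{bmatrix} 1 \\ 0 \end{bmatrix} G_0(z,x,|\xi|)\,e^{i(z-x)\cdot\xi}\,dz . \]
$G_0$ is a diagonal matrix with components essentially
\[ \frac{e^{-i|y-z|\,|\xi|}}{|y-z|} \quad\text{and}\quad \frac{e^{-|y-z|\sqrt{\xi^2 + 2\omega}}}{|y-z|} . \]
It is elementary to show that, for $|\xi| \in [a, b]$, $|\partial_x^\alpha \partial_\xi^\beta F(x,\xi)| \le \tilde c_{\alpha\beta}\langle x\rangle^{|\beta|}$. Using standard arguments from stationary scattering theory it is possible to conclude $|\partial_x^\alpha \partial_\xi^\beta w(x,\xi)| \le c_{\alpha\beta}\langle x\rangle^{|\beta|}$. This implies that, after an integration by parts (i.e. starting from $e^{\pm i\xi^2 s} = (\pm 2i|\xi|s)^{-1}\frac{d}{d|\xi|}e^{\pm i\xi^2 s}$ etc., see [14, p. 25])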
\[ |(8.2)| \le c\,\langle x\rangle^{-\gamma+r}\langle y\rangle^{-\gamma+r}\,s^{-r}e^{-\varepsilon t} \]
and so $|(8.1)| \le c\,\langle x\rangle^{-\gamma+r}\langle y\rangle^{-\gamma+r}\,t^{-r+1}$. For $\gamma > r + 3/2$ and $r \ge 5/2$, we obtain the conclusion.

9. Sketch of the Proof of Theorem 2.6

Before discussing Theorem 2.6 we need to discuss Lemma 2.5. Let us write a solution of (2.10) in the form $u = a\phi_0 + h$, with $\|\phi_0\|_2 = 1$ and $h$ perpendicular to $\phi_0$ in $L^2$. Let $P$ be the projection onto the orthogonal complement of $\phi_0$. Then (2.10) becomes
\[ \begin{aligned} &(e_0 - \omega) + \eta a\langle \beta(|a^2\phi_0^2|)\phi_0, \phi_0\rangle + \langle N(h), \phi_0\rangle = 0 \\ &h - (-\Delta + V + e_0)^{-1} P\,[(e_0 - \omega)h + \eta\beta(|a\phi_0 + h|^2)(a\phi_0 + h)] = 0 , \end{aligned} \tag{9.1} \]
where pointwise $|N(h)| \le C(|a\phi_0|\,|h| + |h|^p)$. Thanks to [21], $(-\Delta + V + e_0)^{-1}P$ is bounded $W^{k,q} \to W^{k+2,q}$ for any positive integer $k$ and for any $q \in (1, \infty)$. So the
functions in (9.1) are $C^\infty$ in the arguments $a, \omega \in \mathbb{R}$ and $h \in H^1$. By Bifurcation Theory, [22, Theorem 3.2.2], there are two curves of solutions $\{u, \omega\}$ in $H^1 \times \mathbb{R}$, which intersect transversally in $(0, e_0)$. One of them is $\omega \to (0, \omega)$, the other can be expressed in the form $a \to (a\phi_0 + h(a), \omega(a))$, $\frac{d}{da}h(0) = 0$, with
\[ \omega(a) - e_0 = \eta a^{2m}\beta^{m}(0)\,\|\phi_0\|_{2m+2}^{2m+2}\,(1 + o(a^{2m})) . \]
Then we obtain (2.11).

The proof of Theorem 2.6 is the same as for Theorem 2.4 after replacing $H(t)$ with $H(t, \eta)$, (2.3). We have, for $\phi_\omega = a\phi_0 + h$ as above,
\[ \xi = c \begin{bmatrix} \phi_1 \\ -\phi_1 \end{bmatrix} + O(a) , \qquad 1/c = \sqrt{2}\,\|\phi_1\|_2 , \]
\[ L_2(\xi) = (1/l!)\,c\,\eta\,a^{2l-1}\beta^{(l)}(0)\,\phi_0^{2l-1} \begin{bmatrix} \phi_1^2 \\ -\phi_1^2 \end{bmatrix} (1 + O(a)) , \]
\[ P_1 = P_c I_2 + O(\eta) , \qquad I_2 = \text{identity matrix of rank 2} . \]
Then, for $H(\omega, \eta) = H_\omega + \sigma_3 V$,
\[ \Im\big\langle [R(H(\omega,\eta), 2\lambda(\omega) + i\varepsilon) - R(H(\omega,\eta), 2\lambda(\omega) - i\varepsilon)]\,P_1 L_2(\xi),\ \sigma_3 P_1 L_2(\xi) \big\rangle \]
\[ = (1/l!)^2 c^2 [\beta^{(l)}(0)]^2 a^{4l-2}\eta^2\ \Im\Big\langle \sum_{\pm} \pm R(\sigma_3(-\Delta + V), 2\lambda(\omega) \pm i\varepsilon)\,P_c\,\phi_0^{2l-1} \begin{bmatrix} \phi_1^2 \\ -\phi_1^2 \end{bmatrix},\ \sigma_3 P_c\,\phi_0^{2l-1} \begin{bmatrix} \phi_1^2 \\ -\phi_1^2 \end{bmatrix} \Big\rangle\,(1 + O(a)) . \]
The limit for $\varepsilon \to 0^+$ is $> C\eta^2 a^{4l-2} > \gamma_0 > 0$, for some very small $\gamma_0$ dependent on the initial $\omega_0$. For |z(0)| + N γ, the rest of the proof goes through.

10. Errata in [5]

(1) The proof of [5, Lemma 5.2] contains several mistakes. Nevertheless, the statement is correct and is proved, correctly this time, in [19, 20]. Here the statement of [5, Lemma 5.2] is taken as hypothesis.
(2) In the proof of [5, Corollary 3.2], we misquote from Kato's textbook on perturbation theory. In fact it is not true that, as asserted in [5], Corollary 3.2 follows from Lemma 3.1. However [5, Corollary 3.2] is a consequence of Proposition 3.1 in the present paper.

Acknowledgments

I wish to thank T. P. Tsai for discussions about an early version of [15], and Chongchun Zeng.
References

[1] V. S. Buslaev and G. S. Perelman, On the stability of solitary waves for nonlinear Schrödinger equations, Nonlinear Evolution Equations, ed. N. N. Uraltseva, Transl. Ser. 2, 164, Amer. Math. Soc., Providence, RI, 1995, pp. 75–98.
[2] V. S. Buslaev and C. Sulem, On the asymptotic stability of solitary waves of nonlinear Schrödinger equations, Ann. Inst. H. Poincaré. An. Nonlin. 20 (2003), 419–475.
[3] T. Cazenave, An introduction to nonlinear Schrödinger equations, Textos de Métodos Matemáticos, Instit. Matematica Universidade Federal IM-UFRJ, Rio de Janeiro, 1989.
[4] R. Coifman and Y. Meyer, Commutateurs d'intégrales singulières et opérateurs multilinéaires, Ann. Inst. Fourier 28 (1978), 177–202.
[5] S. Cuccagna, Stabilization of solutions to nonlinear Schrödinger equations, Commun. Pure Appl. Math. 54 (2001), 1110–1145.
[6] S. Cuccagna, On Asymptotic Stability of Ground States of NLS (preprint February 2001) http://www.math.siu.edu/preprints/preprintx.html
[7] M. Grillakis, J. Shatah and W. Strauss, Stability of solitary waves in the presence of symmetries, I, J. Funct. Anal. 74 (1987), 160–197.
[8] M. Grillakis, J. Shatah and W. Strauss, Stability of solitary waves in the presence of symmetries, II, J. Funct. Anal. 94 (1990), 308–348.
[9] T. Kato, Wave operators and similarity for some non-selfadjoint operators, Math. Annalen 162 (1966), 258–269.
[10] A. Komech, M. Kunze and H. Spohn, Long-time asymptotics for a classical particle interacting with a scalar wave field, Comm. Partial Differential Equations 22 (1997), 307–335.
[11] K. McLeod, Uniqueness of positive radial solutions of $\Delta u + f(u) = 0$ in $\mathbb{R}^n$, Trans. Amer. Math. Soc. 339 (1993), 495–505.
[12] I. M. Sigal, Nonlinear wave and Schrödinger equations. I. Instability of periodic and quasiperiodic solutions, Commun. Math. Phys. 153 (1993), 297–320.
[13] J. Shatah and W. Strauss, Instability of nonlinear bound states, Commun. Math. Phys. 100 (1985), 173–190.
[14] A. Soffer and M. Weinstein, Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations, Invent. Math. 136 (1999), 9–74.
[15] T. P. Tsai and H. T. Yau, Asymptotic dynamics of nonlinear Schrödinger equations: resonance dominated and radiation dominated solutions, Commun. Pure Appl. Math. 55 (2002), 153–216.
[16] M. Weinstein, Modulation stability of ground states of nonlinear Schrödinger equations, SIAM J. Math. Anal. 16 (1985), 472–491.
[17] M. Weinstein, Lyapunov stability of ground states of nonlinear dispersive equations, Commun. Pure Appl. Math. 39 (1986), 51–68.
[18] J. Rauch, Local decay of scattering solutions to Schrödinger's equation, Commun. Math. Phys. 61 (1978), 149–168.
[19] Cuccagna, Pelinovsky and Vougalter, Spectra of positive and negative energies in the linearized NLS problem, preprint.
[20] Perelman, Asymptotic stability of solitons for nonlinear Schrödinger equations, preprint.
[21] K. Yajima, The $W^{k,p}$ continuity of wave operators for Schrödinger operators, J. Math. Soc. Japan 47 (1995), 551–581.
[22] L. Nirenberg, Topics in nonlinear functional analysis, Lecture Notes, Courant Institute.
December 3, 2003 19:54 WSPC/148-RMP
00185
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 905–923 © World Scientific Publishing Company
VARIATIONAL PRINCIPLE FOR NON-EQUILIBRIUM STEADY STATES OF THE XX MODEL
TAKU MATSUI Graduate School of Mathematics, Kyushu University 1-10-6 Hakozaki, Fukuoka 812-8581, Japan [email protected] YOSHIKO OGATA Department of Physics, Graduate School of Science The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan [email protected] Received 2 February 2003 Revised 22 September 2003 We show that non-equilibrium steady states of the one-dimensional exactly-solved XX model can be characterized by the variational principle of free energy of a long range interaction and that they cannot be a KMS state for any C ∗ -dynamical system of the UHF algebra. Keywords: Variational principle; non-equilibrium steady state; the XX model.
1. Introduction

In [21] Ruelle proposed to investigate a class of invariant states for infinite quantum systems as a model of non-equilibrium statistical mechanics. The initial state consists of a finite number of heat reservoirs and a finite system. The non-equilibrium steady states are obtained after the long time average of the Hamiltonian time evolution under an interaction between the finite system and the heat reservoirs. Around the time of publication of [21], the same states were studied independently for the free Fermion and the XY model by Ho and Araki [13], by Dirren [9], and by Tasaki [23]. (See [6] as well.) Even though the construction of these non-equilibrium steady states (NESS) is simple and natural, a few questions arise. Do these NESS qualify as states far from equilibrium? Are these NESS ordinary equilibrium states of other Hamiltonians? In this paper, we consider these questions in a simple case. We use $C^*$-algebraic methods to achieve our goal (cf. [7] and [8]). Our problem is closely related to an attractive attempt to characterize non-equilibrium steady states via a variational principle. So far, no widely accepted
characterization of non-equilibrium steady states has been established for general infinite quantum systems. In the early stage of research on non-equilibrium statistical mechanics, Zubarev proposed his interpretation of the non-equilibrium steady state as an equilibrium state of some effective Hamiltonian (see [25]). Let us carry out a very heuristic argument for a while. Suppose that an initial state is given by the density matrix $\rho = e^{-\beta H_0}$ and that the state evolves under a Hamiltonian $H$. Then the state at a time $t$ is described by the density matrix $\rho(t) = e^{-\beta H(t)}$, with $H(t) = e^{itH} H_0 e^{-itH}$. We may regard this state as a state in equilibrium for the Hamiltonian $H(t)$. Thus the non-equilibrium steady state should be an equilibrium state for the Hamiltonian $H(\infty)$. This effective Hamiltonian $H(\infty)$ is referred to as the Zubarev Hamiltonian. Whether the Zubarev Hamiltonian $H(\infty)$ is well-defined is a subtle question, which Zubarev himself left untouched. In particular, for an infinite system, whether a state can be interpreted as an equilibrium state for some dynamics is a highly non-trivial matter. The claim that our NESS is an equilibrium state for the Zubarev Hamiltonian $H(\infty)$ can be justified very easily if the time evolution $\alpha_t$ satisfies the $L^1$-asymptotic abelian condition, that is, if the following integral is finite for sufficiently many local observables $Q$ and $R$:
\[ \int_0^\infty \|[Q, e^{itH} R e^{-itH}]\|\,dt < \infty . \]
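The heuristic step above, that conjugating $e^{-\beta H_0}$ by $e^{itH}$ gives $e^{-\beta H(t)}$ with $H(t) = e^{itH}H_0e^{-itH}$, is exact for finite matrices. The following is only a minimal sketch of that identity: the $2\times 2$ matrices `H0`, `H` and the values of `beta` and `t` are arbitrary illustrative choices, not taken from the paper, and nothing here touches the subtle question of the limit $t \to \infty$ in an infinite system.

```python
# Finite-dimensional check of  U exp(-beta*H0) U^* = exp(-beta * U H0 U^*),  U = exp(itH).
# H0, H, beta and t are hypothetical illustrative values.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm(a, terms=60):
    """Matrix exponential via its Taylor series (adequate for small matrices)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

H0 = [[1.0 + 0j, 0j], [0j, -1.0 + 0j]]                  # "initial" Hamiltonian
H = [[0.5 + 0j, 0.3 - 0.2j], [0.3 + 0.2j, -0.5 + 0j]]   # Hermitian generator of the dynamics
beta, t = 0.7, 1.3

U = expm(scale(1j * t, H))                          # e^{itH}
H_t = matmul(matmul(U, H0), dagger(U))              # H(t) = e^{itH} H0 e^{-itH}
lhs = matmul(matmul(U, expm(scale(-beta, H0))), dagger(U))
rhs = expm(scale(-beta, H_t))
error = max_abs_diff(lhs, rhs)
print(error)
```

The two sides agree up to rounding because conjugation by a unitary commutes with every power series in the matrix; the difficulty discussed in the text only appears when one asks for convergence of $H(t)$ as $t \to \infty$.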
The Zubarev Hamiltonian $H(\infty)$ is expressed in terms of an infinite integral and commutators, and the $L^1$-asymptotic abelian condition guarantees the convergence of the integral (cf. [24]). However, the $L^1$-asymptotic abelian condition is too strong to assume: it is difficult to verify, and it is not valid in various cases including the XX model (cf. [3]). After the publication of the papers by Ruelle [21, 22] on non-equilibrium steady states, both general and model-dependent results on entropy production were obtained by Jaksic and Pillet in [15] and [16], and by Fröhlich, Merkli, and Ueltschi in [12]. Techniques such as complex spectrum deformation and renormalization groups developed for non-relativistic quantum field theory can be applied to NESS to show exponential decay of time dependent correlation functions:
\[ \lim_{t\to\infty} \varphi(Q e^{itH} R) = \varphi(Q)\varphi(R) . \]
These techniques establish strict positivity of entropy production as well. However, we are not certain that these techniques give rise to any results on a variational principle for NESS. Let us turn to the XX model. Formally the Jordan–Wigner transformation gives rise to an equivalence between the XX model and the free Fermion. The exact meaning of the equivalence between the XX model and the corresponding Fermion system can be clarified by use of $C^*$-algebraic methods (see [3]). On the other hand, with the aid of the $C^*$-algebraic formalism, various justifications of the KMS state as the equilibrium state have been given. For quantum spin systems, the equivalence of the variational principle, the Gibbs condition, and the KMS condition was proved under
some summability conditions of the potential (interaction) ([20, 17, 2]). Recently, Araki and Moriya established the equivalence of these conditions for a conceivably widest class of potentials and they extended their results to Fermion lattice systems in [4] and [5]. The result of Araki and Moriya and results on the Bogoliubov automorphisms of CAR algebras are key tools of our analysis. The argument of Evans and Lewis in their analysis of the Ising Model in [11] is used in part of our proof.

The results of this paper are as follows. We will see that the non-equilibrium steady state of the XX model cannot be a KMS state of any one-parameter group of automorphisms of the UHF $C^*$-algebra of quasi-local observables. Nevertheless the non-equilibrium steady state of the XX model is the unique solution of the variational identity (maximization of free energy) for a translation invariant long range potential. The point is that the XX model and the Fermion system are equivalent if restricted to the even part of the quasi-local observables, while the (im)possibility of extending the equivalence to the whole algebra depends on the choice of states. The non-equilibrium steady state of the XX model can be characterized by local thermodynamic stability [5]. This is straightforward to see and for simplicity of exposition we will not make further comment on this. For the corresponding Fermion system our NESS is a KMS state for the quasi-free motion and satisfies the Gibbs condition and the variational identity for the corresponding potential [19]. Thus an explicit form of the Zubarev Hamiltonian $H(\infty)$ is available.

From the viewpoint of mathematical physics, our results illustrate two features of the NESS of the XX model. (i) The heuristic equivalence of Pauli spin and Fermion systems breaks down in our context. (ii) The NESS of quantum spin chains is a simple and natural model of non-equilibrium statistical mechanics.

Recall that non-Gibbsian measures appear in block spin transformations and in non-reversible interacting particle systems in the realm of classical spin models. Efforts were made to characterize non-Gibbsian measures via the variational principle for singular interactions. (See [10] for the recent status of research in this direction.) So far, quantum examples are missing and, as a consequence of our result, the NESS of the XX model turns out to be the first example of a natural non-equilibrium state characterized by the variational principle.

In Sec. 2, we recall the Jordan–Wigner transformation, and the non-equilibrium steady state is introduced in Sec. 3. In Sec. 4, we show that the non-equilibrium steady state of the XX model cannot be a KMS state of any $C^*$-dynamical system. In Sec. 5, we consider the corresponding Fermi system, and we show the variational equality for non-equilibrium steady states of the XX model.

2. The Model

Following Araki [3], we introduce the Jordan–Wigner transformation between the one-dimensional lattice spin system $\mathcal{A}_S$ and the one-dimensional lattice Fermion system $\mathcal{A}_F$. $\mathcal{A}_S$ is generated by Pauli matrices $\sigma_x^n$, $\sigma_y^n$, $\sigma_z^n$ located on the lattice site
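$n \in \mathbb{Z}$. $\sigma_x^n$, $\sigma_y^n$ and $\sigma_z^n$ satisfy
\[ \sigma_x^n \sigma_y^n = i\sigma_z^n , \qquad (\sigma_\alpha^n)^2 = 1 , \qquad [\sigma_\alpha^n, \sigma_\beta^m] = 0 \quad (m \ne n) . \]
$\mathcal{A}_F$ is generated by Fermion creation and annihilation operators $a_n^*$, $a_n$, $n \in \mathbb{Z}$, satisfying the canonical anti-commutation relations:
\[ \{a_n^*, a_m\} = \delta_{n,m} 1 , \qquad \{a_n, a_m\} = \{a_n^*, a_m^*\} = 0 . \]
We will use creation and annihilation operators smeared by square summable sequences:
\[ a(f) = \sum_{n=-\infty}^{\infty} a_n f_n , \qquad a^*(f) = \sum_{n=-\infty}^{\infty} a_n^* \bar f_n , \]
where $f = (f_j)$ is in $l^2(\mathbb{Z})$. The sum converges in norm topology. $\mathcal{A}_F$ is referred to as the CAR algebra. Both algebras $\mathcal{A}_S$ and $\mathcal{A}_F$ are realized as $C^*$-subalgebras of a larger algebra $\hat{\mathcal{A}}$ defined as follows. $\hat{\mathcal{A}}$ is the crossed product of $\mathcal{A}_F$ by a $\mathbb{Z}_2$-action $\Theta_-$, where the $*$-automorphism $\Theta_-$ on $\mathcal{A}_F$ is determined by the following equations:
\[ \Theta_-(a_n) = \begin{cases} a_n & n \ge 1 \\ -a_n & n \le 0 . \end{cases} \]
It is the $C^*$-algebra generated by $\mathcal{A}_F$ and $T$, which satisfies
\[ T = T^* , \qquad T^2 = 1 , \qquad TAT = \Theta_-(A) , \quad A \in \mathcal{A}_F . \]
The spin algebra $\mathcal{A}_S$ is the $C^*$-subalgebra defined by
\[ \sigma_z^n = 2a_n^* a_n - 1 , \qquad \sigma_x^n = T S_n (a_n + a_n^*) , \qquad \sigma_y^n = i T S_n (a_n - a_n^*) , \]
where
\[ S_n = \begin{cases} \sigma_z^1 \cdots \sigma_z^{n-1} & n > 1 \\ 1 & n = 1 \\ \sigma_z^0 \cdots \sigma_z^n & n < 1 . \end{cases} \]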
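The relations above can be checked on a small finite chain. The sketch below is a finite analogue only, under stated assumptions: the chain has $N = 3$ sites labelled $1, \ldots, N$, the element $T$ (which accounts for the infinite left half-chain) is replaced by the identity, and the helper names (`annihilator`, `string`, `local`) are hypothetical. It builds $a_n = \sigma_z^1 \cdots \sigma_z^{n-1} a$ as explicit matrices, then verifies the canonical anti-commutation relations and that $2a_n^*a_n - 1$, $S_n(a_n + a_n^*)$ and $iS_n(a_n - a_n^*)$ reproduce the local Pauli matrices.

```python
# Finite-chain sketch of the Jordan-Wigner relations (N = 3 sites, labelled 1..N).
# Assumption: on a finite chain the auxiliary element T is replaced by the identity.

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = identity(2)
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
A_SITE = [[0, 0], [1, 0]]  # one-site annihilator, chosen so that 2 a* a - 1 = sigma_z

N = 3
DIM = 2 ** N

def chain(factors):
    out = [[1]]
    for factor in factors:
        out = kron(out, factor)
    return out

def annihilator(n):
    """a_n = sigma_z^1 ... sigma_z^(n-1) (x) a (x) 1 ... 1."""
    return chain([SZ] * (n - 1) + [A_SITE] + [I2] * (N - n))

def string(n):
    """S_n = sigma_z^1 ... sigma_z^(n-1); the identity for n = 1."""
    return chain([SZ] * (n - 1) + [I2] * (N - n + 1))

def local(op, n):
    """The one-site operator op acting at site n."""
    return chain([I2] * (n - 1) + [op] + [I2] * (N - n))

def anticommutator(x, y):
    return add(matmul(x, y), matmul(y, x))

def car_holds():
    zero = scale(0, identity(DIM))
    for n in range(1, N + 1):
        for m in range(1, N + 1):
            expected = identity(DIM) if n == m else zero
            if not close(anticommutator(dagger(annihilator(n)), annihilator(m)), expected):
                return False
            if not close(anticommutator(annihilator(n), annihilator(m)), zero):
                return False
    return True

def spins_recovered():
    for n in range(1, N + 1):
        an, s = annihilator(n), string(n)
        sz = add(scale(2, matmul(dagger(an), an)), identity(DIM), sign=-1)
        sx = matmul(s, add(an, dagger(an)))
        sy = scale(1j, matmul(s, add(an, dagger(an), sign=-1)))
        if not (close(sz, local(SZ, n)) and close(sx, local(SX, n))
                and close(sy, local(SY, n))):
            return False
    return True

print(car_holds(), spins_recovered())
```

On the infinite chain the string $S_n$ cannot be extended to $-\infty$ inside the algebra, which is the role played by $T$ in the construction above; the finite sketch does not capture that point.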
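Next let us define an automorphism $\Theta$ on $\hat{\mathcal{A}}$ as
\[ \Theta(a_n) = -a_n \quad \forall n \in \mathbb{Z} , \qquad \Theta(T) = T . \]
Let $\mathcal{A}_+$ (resp. $\mathcal{A}_-$) be the even (resp. odd) part of $\mathcal{A}_S$, i.e.
\[ \mathcal{A}_+ \equiv \{A \in \mathcal{A}_S ;\ \Theta(A) = A\} , \qquad \mathcal{A}_- \equiv \{A \in \mathcal{A}_S ;\ \Theta(A) = -A\} . \]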
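$\hat{\mathcal{A}}$ is decomposed into four subspaces as
\[ \hat{\mathcal{A}} = \mathcal{A}_+ + \mathcal{A}_- + \mathcal{A}_{+T} + \mathcal{A}_{-T} , \qquad \mathcal{A}_{+T} = \mathcal{A}_+ T , \quad \mathcal{A}_{-T} = \mathcal{A}_- T , \]
and
\[ \mathcal{A}_S = \mathcal{A}_+ + \mathcal{A}_- , \qquad \mathcal{A}_F = \mathcal{A}_+ + \mathcal{A}_{-T} . \]
In this paper, we investigate the non-equilibrium steady state of the XX model and the corresponding Fermion system. The Hamiltonian of the XX model has the following form:
\[ H = \frac{1}{4} \sum_{n=-\infty}^{\infty} (\sigma_n^x \sigma_{n+1}^x + \sigma_n^y \sigma_{n+1}^y) + \frac{\gamma}{2} \sum_{n=-\infty}^{\infty} \sigma_n^z , \]
where the real parameter $\gamma$ is an external magnetic field. The above equation means that the generator of the dynamics $\tau_t^S$ is determined by the potential $\Phi_S^0(I)$ via the following equation:
\[ \frac{d}{dt}\tau_t^S\Big|_{t=0}(A) = \sum_{J\cap I \ne \emptyset} i[\Phi_S^0(J), A] , \qquad A \in \mathcal{A}_S(I) , \]
for each interval $I$. Here $\Phi_S^0(I)$ is given by
\[ \Phi_S^0(\{n\}) = \frac{\gamma}{2}\,\sigma_n^z , \qquad \Phi_S^0(\{n, n+1\}) = \frac{1}{4}\,(\sigma_n^x \sigma_{n+1}^x + \sigma_n^y \sigma_{n+1}^y) , \qquad \Phi_S^0(I) = 0 \ \text{otherwise} . \]
As each $\Phi_S^0(I) \in \mathcal{A}_S \cap \mathcal{A}_F$, it can be expressed by the Fermion operators:
\[ \Phi(\{n\}) = \gamma\Big(a_n^* a_n - \frac{1}{2}\Big) , \qquad \Phi(\{n, n+1\}) = -\frac{1}{2}\,[a_{n+1}^* a_n + a_n^* a_{n+1}] , \qquad \Phi(I) = 0 \ \text{otherwise} . \]
The corresponding Hamiltonian is
\[ H = -\frac{1}{2} \sum_{n=-\infty}^{\infty} [a_{n+1}^* a_n + a_n^* a_{n+1}] + \frac{\gamma}{2} \sum_{n=-\infty}^{\infty} (2a_n^* a_n - 1) \]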
where $a_n$ and $a_n^*$ are the fermionic annihilation and creation operators of the $n$th site. The automorphism $\tau_t^S$ can be extended to the whole $\hat{\mathcal{A}}$ [3] and the extension $\tau_t$ gives rise to the quasi-free motion of Fermions:
\[ \tau_t(a^*(f)) = a^*(e^{itb} f) , \tag{1} \]
where
\[ (bf)(n) = -\frac{1}{2}\,(f(n-1) + f(n+1)) + \gamma f(n) . \]

3. Non-Equilibrium Steady State

The non-equilibrium steady state we consider here is obtained as the limit state starting from an inhomogeneous initial state. In our system, the spin chain is initially separated from the left and the right infinite systems (reservoirs) with different temperatures, and the evolution of states is governed by the XX Hamiltonian as the time goes to infinity. The existence of the limit state itself is not obvious. However, as the set of states of $\mathcal{O}$ (here $\mathcal{A}_S$) is weak $*$-compact by Alaoglu's Theorem, the set $\{\omega \circ \tau_t\}_{t \in \mathbb{R}}$ has accumulation points, which need not be invariant under the dynamics $\tau_t$. To obtain an invariant state, we take the long time average of $\omega \circ \tau_t$ as follows:
\[ \omega^T = \frac{1}{T} \int_0^T \omega \circ \alpha_t\,dt . \]
Again by Alaoglu's Theorem, $\omega^T$ has accumulation points which, this time, are invariant under $\alpha_t$, i.e. there exists a sequence $T_n$ along which
\[ \omega(A) = \lim_n \omega^{T_n}(A) \qquad \forall A \in \mathcal{O} . \]
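Why the time average helps can be seen in a toy computation. The sketch below is an illustration under assumptions, not part of the construction in the text: it takes a hypothetical expectation value $t \mapsto \tfrac12 + \tfrac12\cos(\Delta t)$, as would arise for a two-level system with level spacing $\Delta$, which has no limit as $t \to \infty$, and compares its Cesàro mean over $[0, T]$ with the closed form $\tfrac12 + \tfrac12\,\sin(\Delta T)/(\Delta T)$.

```python
import math

def time_average(expectation, T, steps=20000):
    """Midpoint-rule approximation of (1/T) * integral_0^T expectation(s) ds."""
    h = T / steps
    return sum(expectation((j + 0.5) * h) for j in range(steps)) * h / T

DELTA = 2.0  # hypothetical level spacing

def expectation(t):
    # Oscillates forever, so omega(tau_t(A)) itself has no limit as t -> infinity.
    return 0.5 + 0.5 * math.cos(DELTA * t)

horizons = (10.0, 100.0, 1000.0)
averages = {T: time_average(expectation, T) for T in horizons}
exact = {T: 0.5 + 0.5 * math.sin(DELTA * T) / (DELTA * T) for T in horizons}

for T in horizons:
    print(T, averages[T], exact[T])
```

The averaged values approach $\tfrac12$ at rate $1/T$ while the integrand keeps oscillating. In the infinite system the analogous statement is only that accumulation points of $\omega^T$ exist and are invariant, as stated above.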
For the XX model, (1), we do not have to carry out such a time average in order to construct the steady state. An exact form of the non-equilibrium steady state of the XX model can be expressed in terms of a quasi-free state of Fermions:
\[ \omega|_{\mathcal{A}_+} = \omega_\rho|_{\mathcal{A}_+} , \qquad \omega|_{\mathcal{A}_-} = 0 . \tag{2} \]
Here $\omega_\rho$ is the quasi-free state on the Fermion algebra $\mathcal{A}_F$ defined by
\[ \omega_\rho(a(f_n)^* \cdots a(f_1)^* a(g_1) \cdots a(g_m)) = \delta_{nm} \det(\langle f_i, \rho g_j\rangle) , \]
where $\rho$ is the multiplication operator in the Fourier representation:
\[ \rho(k) = \begin{cases} (1 + e^{-\beta_+ \varepsilon(k)})^{-1} & k \in [0, \pi) \\ (1 + e^{-\beta_- \varepsilon(k)})^{-1} & k \in [-\pi, 0) , \end{cases} \qquad \varepsilon(k) = \cos(k) - \gamma . \]
Let $\alpha_t^F$ be a one-parameter group of automorphisms of $\mathcal{A}_F$ such that
\[ \alpha_t^F(a(f)) = a(e^{ith} f) \tag{3} \]
with
\[ (\widehat{hf})(k) = \begin{cases} -\dfrac{2\beta_+}{\beta_+ + \beta_-}\,(\cos(k) - \gamma)\,\hat f(k) & k \in [0, \pi) \\[2mm] -\dfrac{2\beta_-}{\beta_+ + \beta_-}\,(\cos(k) - \gamma)\,\hat f(k) & k \in [-\pi, 0) . \end{cases} \tag{4} \]
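The link between (4), the Fermi factors $\rho(k)$, and the inverse temperature $\beta = (\beta_+ + \beta_-)/2$ can be checked numerically. The sketch below rests on two assumptions that are not spelled out in the text: that the branch with $\beta_+$ in (4) corresponds to $k \in [0, \pi)$, as for $\rho$, and that, for the dynamics $a(f) \mapsto a(e^{ith}f)$, the KMS state at inverse temperature $\beta$ has one-particle symbol $(1 + e^{\beta h(k)})^{-1}$. The parameter values are arbitrary.

```python
import math

def epsilon(k, gamma):
    return math.cos(k) - gamma

def branch_beta(k, beta_plus, beta_minus):
    # Assumption: beta_plus on [0, pi), beta_minus on [-pi, 0).
    return beta_plus if 0.0 <= k < math.pi else beta_minus

def rho(k, beta_plus, beta_minus, gamma):
    """Fermi factor of the NESS as given in the text."""
    b = branch_beta(k, beta_plus, beta_minus)
    return 1.0 / (1.0 + math.exp(-b * epsilon(k, gamma)))

def h_symbol(k, beta_plus, beta_minus, gamma):
    """Multiplier of the one-particle generator h in (4)."""
    b = branch_beta(k, beta_plus, beta_minus)
    return -2.0 * b / (beta_plus + beta_minus) * epsilon(k, gamma)

def kms_symbol(k, beta_plus, beta_minus, gamma):
    """(1 + exp(beta * h(k)))^(-1) with beta = (beta_plus + beta_minus) / 2 (assumed convention)."""
    beta = 0.5 * (beta_plus + beta_minus)
    return 1.0 / (1.0 + math.exp(beta * h_symbol(k, beta_plus, beta_minus, gamma)))

BETA_PLUS, BETA_MINUS, GAMMA = 1.5, 0.4, 0.3  # hypothetical parameter values
grid = [-math.pi + 2.0 * math.pi * j / 200 for j in range(200)]
max_gap = max(abs(rho(k, BETA_PLUS, BETA_MINUS, GAMMA)
                  - kms_symbol(k, BETA_PLUS, BETA_MINUS, GAMMA)) for k in grid)
print(max_gap)
```

The agreement is immediate algebraically, since $\beta\,h(k) = -\beta_\pm\,\varepsilon(k)$ on the respective branches; the rescaling by $2/(\beta_+ + \beta_-)$ in (4) is what allows a single inverse temperature $\beta$ to be used for both halves of the Brillouin zone.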
It is easily verified that $\omega_\rho$ is the $(\alpha^F, \beta)$-KMS state of $\mathcal{A}_F$, with $\beta = (\beta_+ + \beta_-)/2$. As $\alpha^F$ is a quasi-free motion, $\omega_\rho$ is the unique $(\alpha^F, \beta)$-KMS state (cf. [1]). In spite of this fact, the state $\omega$ on $\mathcal{A}_S$ cannot be a KMS state of any one-parameter group of automorphisms on $\mathcal{A}_S$. In the next section, we will show this impossibility.

4. NESS and the KMS Condition

As we claimed in the previous section, we show that the state $\omega$ of (2) cannot be a KMS state of any one-parameter group of automorphisms on $\mathcal{A}_S$ even though it is a faithful factor state. We denote the GNS triple associated with the state $\varphi$ on $\mathcal{A}$ by $(H_\varphi, \pi_\varphi, \Omega_\varphi)$. The natural extension $\hat\varphi$ of $\varphi$ to $\pi_\varphi(\mathcal{A})''$ is defined by
\[ \hat\varphi(x) = \langle \Omega_\varphi, x\,\Omega_\varphi\rangle , \qquad x \in \pi_\varphi(\mathcal{A})'' . \]
Now, assuming the existence of a one-parameter group $\alpha^S$ of automorphisms of $\mathcal{A}_S$ such that $\omega$ is an $(\alpha^S, \beta)$-KMS state, we derive a contradiction. Recall that any KMS state is faithful due to [8, Corollary 5.3.9]. In what follows, we may assume that the state $\omega$ is faithful for $\mathcal{A}_S$.

Lemma 4.1. Let $\alpha_t^S$ be a strongly continuous one-parameter group of automorphisms of $\mathcal{A}_S$ such that $\omega$ is an $(\alpha^S, \beta)$-KMS state. Then $\alpha_t^S \circ \Theta = \Theta \circ \alpha_t^S$, and $\alpha_t^S|_{\mathcal{A}_+}$ is given by
\[ \alpha_t^S(a^*(f)a(g)) = \alpha_t^F(a^*(f)a(g)) = a^*(u_t f)\,a(u_t g) , \]
where $u_t = e^{ith}$.

Proof. By definition $\omega$ is $\Theta$ invariant, $\omega \circ \Theta = \omega$. As $\omega$ is an $\alpha^S$-KMS state, $\omega \circ \alpha^S = \omega$. Hence $\Theta$ and $\alpha^S$ are both extendible to the von Neumann algebra $\pi_\omega(\mathcal{A}_S)''$; we denote the extensions by the same symbols, and $\omega$ is an $(\hat\alpha^S, \beta)$-KMS state on $\pi_\omega(\mathcal{A}_S)''$. As was explained above, we assume that $\omega$ is faithful for $\pi_\omega(\mathcal{A}_S)''$. Due to the $\Theta$ invariance of $\omega$, we obtain $\alpha_t \circ \Theta = \Theta \circ \alpha_t$. Hence $\alpha|_{\mathcal{A}_+}$ gives an automorphism of $\mathcal{A}_+$ and, by restriction, $\omega|_{\mathcal{A}_+}$ is an $\alpha$-KMS state of $\mathcal{A}_+$. On the other hand, $\omega$ satisfies the KMS condition with the automorphism given by $\alpha'(a^*(f)a(g)) = a^*(u_t f)\,a(u_t g)$. As $\omega|_{\mathcal{A}_+}$ is faithful, we again obtain the uniqueness $\alpha^S|_{\mathcal{A}_+} = \alpha'$.

Lemma 4.2. Let $V_t$ be
\[ V_t \equiv \alpha_t^S(B_2 T) \cdot \alpha_t^F(B_2)\,T \]
with $B_2 = a_2^* + a_2$. Then we have
\[ V_t Q V_t^* = \alpha_t^S \circ \Gamma \circ \Theta_- \circ \Gamma \circ \alpha_{-t}^S \circ \Theta_-(Q) , \qquad \forall\, Q \in \mathcal{A}_+ , \]
where $\Gamma(Q) = B_2 Q B_2^{-1}$.

Proof.
\[ V_t \Theta_-(\alpha_t^S(Q)) = \alpha_t^S(B_2 T)\,\alpha_t^F(B_2)\,\alpha_t^F(Q)\,T = \alpha_t^S(B_2 T)\,\alpha_t^F(\Gamma(Q))\,\alpha_t^F(B_2)\,T \]
\[ = \alpha_t^S(B_2 T)\,\alpha_t^S(\Gamma(Q))\,\alpha_t^S(T B_2^{-1})\,V_t = \alpha_t^S(B_2 T\,\Gamma(Q)\,T B_2^{-1})\,V_t = \alpha_t^S \circ \Gamma \circ \Theta_- \circ \Gamma(Q)\,V_t . \]

Note that $V_t \in \mathcal{A}_+$. Let us consider an arbitrary gauge invariant quasi-free state $\varphi_q$ which is determined by the following two point function:
\[ \varphi_q(a^*(f)a(g)) = \langle f, qg\rangle \]
for $f$ and $g$ in $l^2(\mathbb{Z})$. For any $A$ in $\mathcal{A}_- T \subset \mathcal{A}_F$, $V_t A V_t^*$ is an element of $\mathcal{A}_- T$, and we obtain $\varphi_q(V_t A V_t^*) = 0$. On the other hand, by Lemma 4.2, we obtain for $A \in \mathcal{A}_+$
\[ \varphi_q(V_t A V_t^*) = \varphi_q(\alpha_t^S \circ \Gamma \circ \Theta_- \circ \Gamma \circ \alpha_{-t}^S \circ \Theta_-(A)) . \]
Note that $\Gamma \circ \Theta_- \circ \Gamma = \Theta_-$. We conclude that $\varphi_q(V_t \cdot V_t^*)$ is quasi-free, and is given by the two point function
\[ \varphi_q(V_t a^*(f)a(g) V_t^*) = \langle w_t f, q\,w_t g\rangle , \]
where $w_t = u_t \theta_- u_{-t} \theta_-$, and $\theta_-$ is the self-adjoint unitary on $l^2(\mathbb{Z})$ specified by
\[ \Theta_-(a(f)) = a(\theta_- f) . \]
Let $P_+$ be the projection onto the subspace of $l^2(\mathbb{Z})$ generated by $f = (f_j)$ with $f_j = 0$ ($j \le 0$). Then $\theta_- = 2P_+ - 1 = P_+ - P_-$. It is generally known that an automorphism which maps any quasi-free state to a quasi-free state is a Bogoliubov automorphism (see [14]). It turns out that there exists a one-parameter family of unitary operators $v_t$ such that
\[ V_t B(f) V_t^* = B(v_t f) . \]
This equation suggests that $v_t$ is continuous in $t$, as the left-hand side is continuous due to the strong continuity assumption on $\alpha_t^S$. By the equality
\[ \langle v_t f, q\,v_t g\rangle = \varphi_q(a^*(v_t f)a(v_t g)) = \varphi_q(V_t a^*(f)a(g) V_t^*) = \langle w_t f, q\,w_t g\rangle , \]
we obtain
\[ v_t w_t^* q = q\,v_t w_t^* , \]
for all $q \in B(l^2(\mathbb{Z}))$. As $v_t w_t^*$ is unitary, we obtain a continuous complex function $f(t)$ such that
\[ v_t w_t^* = f(t) \cdot 1 , \qquad |f(t)| = 1 . \]
Due to the equation
\[ a(w_t f)a(w_t g) = V_t a(f)a(g) V_t^* = a(v_t f)a(v_t g) = f(t)^2\,a(w_t f)a(w_t g) , \]
we obtain $f(t) = \pm 1$. By continuity of $f(t)$, we have $f(t) = 1$. Hence we obtain
\[ V_t B(f) V_t^* = B(w_t f) , \qquad w_t = u_t \theta_- u_{-t} \theta_- . \]
The gauge invariant Bogoliubov automorphism associated with $w_t$ is inner if and only if $1 - w_t$ is of trace class (cf. [1]). In the following, we show this is not the case. If $1 - w_t$ is of trace class, it is of Hilbert–Schmidt class, i.e.
\[ \mathrm{Tr}\big((1 - u_t\theta_- u_{-t}\theta_-)^*(1 - u_t\theta_- u_{-t}\theta_-)\big) = \sum_n \langle n|(1 - u_t\theta_- u_{-t}\theta_-)^*(1 - u_t\theta_- u_{-t}\theta_-)|n\rangle < \infty . \]
As
\[ (1 - u_t\theta_- u_{-t}\theta_-)^*(1 - u_t\theta_- u_{-t}\theta_-) = 2P_- u_t P_+ u_{-t} + 2P_+ u_t P_- u_{-t} + 2u_t P_+ u_{-t} P_- + 2u_t P_- u_{-t} P_+ , \]
for all $N > 0$ we have
\[ 0 \le \sum_{1 \le n \le N,\ -N \le m \le 0} 8\,|\langle m|u_t|n\rangle|^2 \le \mathrm{Tr}\big((1 - u_t\theta_- u_{-t}\theta_-)^*(1 - u_t\theta_- u_{-t}\theta_-)\big) < \infty . \]
That is, $\sum_{1 \le n \le N,\ -N \le m \le 0} |\langle m|u_t|n\rangle|^2$ is a bounded, monotonically increasing sequence in $N$. So it has a limit
\[ \sum_{m \le 0,\ 1 \le n} |\langle m|u_t|n\rangle|^2 < \infty . \]
Let us consider $|\langle m|u_t|n\rangle|$. Integrating by parts, we obtain
\[ \langle m|u_t|n\rangle = \frac{1}{2\pi i(m-n)}\big(e^{i\beta_+(1-\gamma)t} - e^{i\beta_+(-1-\gamma)t}\cdot(-1)^{m-n}\big) - f_{mn}^+ + \frac{1}{2\pi i(m-n)}\big(-e^{i\beta_-(1-\gamma)t} + e^{i\beta_-(-1-\gamma)t}\cdot(-1)^{m-n}\big) - f_{mn}^- , \]
where
\[ f_{mn}^+ = \frac{\beta_+ t}{2\pi(m-n)} \int_0^\pi dk\,\sin k\ e^{i\beta_+\varepsilon(k)t}\cdot e^{-ik(m-n)} , \qquad f_{mn}^- = \frac{\beta_- t}{2\pi(m-n)} \int_{-\pi}^0 dk\,\sin k\ e^{i\beta_-\varepsilon(k)t}\cdot e^{-ik(m-n)} . \]
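Again integrating by parts, we obtain a bound
\[ |f_{mn}^+| \le \frac{c}{(n-m)^2} , \qquad |f_{mn}^-| \le \frac{c}{(n-m)^2} , \]
for some constant $c$. Hence we obtain
\[ \big|\,|\langle m|u_t|n\rangle|^2 - B_{mn}\big| \le \frac{8c}{\pi(n-m)^3} + \frac{4c^2}{(n-m)^4} , \]
where
\[ B_{mn} = \frac{1}{4\pi^2(n-m)^2}\,\big|e^{i\beta_+(1-\gamma)t} - e^{i\beta_+(-1-\gamma)t}\cdot(-1)^{m-n} - e^{i\beta_-(1-\gamma)t} + e^{i\beta_-(-1-\gamma)t}\cdot(-1)^{m-n}\big|^2 . \]
It is summable:
\[ \sum_{m \le 0,\ 1 \le n} \big|\,|\langle m|u_t|n\rangle|^2 - B_{mn}\big| \le \sum_{m \le 0,\ 1 \le n} \Big(\frac{8c}{\pi(n-m)^3} + \frac{4c^2}{(n-m)^4}\Big) < \infty . \]
As $B_{mn}$ is
\[ B_{mn} = \frac{1}{\pi^2(n-m)^2}\Big[1 - \cos(\gamma t(\beta_+ - \beta_-))\cdot\cos(t(\beta_+ - \beta_-)) + 2(-1)^{m-n}\cos(t(\beta_+ + \beta_-))\cdot\sin\Big(\frac{t}{2}(\beta_+ - \beta_-)(1+\gamma)\Big)\cdot\sin\Big(\frac{t}{2}(\beta_+ - \beta_-)(1-\gamma)\Big)\Big] , \]
in case $\beta_+ \ne \beta_-$ we can take $t$ so that there exists $d > 0$ such that
\[ B_{mn} > \frac{d}{(n-m)^2} > 0 \]
for all $m \le 0$ and $n \ge 1$. Hence we have
\[ \sum_{m \le 0,\ n \ge 1} B_{mn} > \sum_{m \le 0,\ n \ge 1} \frac{d}{(n-m)^2} = +\infty . \]
Then we obtain
\[ \sum_{m \le 0,\ 1 \le n} \big(B_{mn} - |\langle m|u_t|n\rangle|^2\big) + \sum_{m \le 0,\ 1 \le n} |\langle m|u_t|n\rangle|^2 = \sum_{m \le 0,\ 1 \le n} B_{mn} = +\infty . \]
The left-hand side is finite by the previous arguments, while the right-hand side is infinite. This is a contradiction and we conclude that $1 - u_t\theta_- u_{-t}\theta_-$ is not of trace class.

Theorem 4.1. The non-equilibrium steady state $\omega$ of the XX model cannot be a KMS state of any strongly continuous one-parameter group of automorphisms of the $C^*$-algebra $\mathcal{A}_S$.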
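The two summability facts used in the argument above can be made concrete. Grouping the pairs $(m, n)$ with $m \le 0 < n$ by $j = n - m$, there are exactly $j$ pairs for each $j \ge 1$, so $\sum 1/(n-m)^2 = \sum_j 1/j$ diverges logarithmically while $\sum 1/(n-m)^3 = \sum_j 1/j^2$ converges. The sketch below, with hypothetical function names, evaluates the truncated double sums over $1 \le n \le N$, $-N \le m \le 0$ to display both behaviours.

```python
def square_partial_sum(N):
    """Sum of 1/(n-m)^2 over 1 <= n <= N and -N <= m <= 0."""
    return sum(1.0 / (n - m) ** 2 for n in range(1, N + 1) for m in range(-N, 1))

def cubic_partial_sum(N):
    """Sum of 1/(n-m)^3 over the same index set."""
    return sum(1.0 / (n - m) ** 3 for n in range(1, N + 1) for m in range(-N, 1))

sizes = (50, 100, 200)
squares = [square_partial_sum(N) for N in sizes]
cubics = [cubic_partial_sum(N) for N in sizes]

for N, s, c in zip(sizes, squares, cubics):
    print(N, s, c)
```

Each doubling of $N$ raises the first sum by roughly $\log 2$, with no sign of levelling off, whereas the second sum stays below $\zeta(2) = \pi^2/6$. This is why the error terms are summable but the leading term $B_{mn}$ is not.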
5. The Gibbs Condition and the Variational Principle

In this section, we prove that the NESS $\omega_\rho$ of the Fermion system satisfies the Gibbs condition and the variational principle for a long range interaction. As a result the NESS $\omega$ of the XX model satisfies the variational equality. We apply the equivalence of the KMS condition, the Gibbs condition, and the variational principle for long range interactions, which was recently established by Araki and Moriya in [4]. We collect here the essence of their results which we will use in our proof later. Before stating the details of the conditions, let us recall the notion of standard potentials in the sense of [4]. Let $\mathcal{A}(I)$ be the set of all observables localized in a subset $I$ of $\mathbb{Z}$. Set
\[ \mathcal{A}_{\mathrm{loc}} = \bigcup_{|I| < \infty} \mathcal{A}(I) . \]
A potential $\Phi$ is a map defined on the set of finite subsets $I$ of $\mathbb{Z}$ with values in the set of self-adjoint elements of $\mathcal{A}$. We define the translation invariant standard potential as follows.

Definition 5.1 (Translation Invariant Standard Potential). The set $P_\tau$ of translation invariant standard potentials is defined as the set of potentials which satisfy the following conditions:
(1) $\Phi(I) \in \mathcal{A}(I)$, $\Phi(\emptyset) = 0$,
(2) $\Phi(I)^* = \Phi(I)$,
(3) $\Theta(\Phi(I)) = \Phi(I)$,
(4) $E_J(\Phi(I)) = 0$, if $J \subset I$ and $J \ne I$,
(5) For each finite $I \subset \mathbb{Z}$, the limit
\[ \lim_{J \nearrow \mathbb{Z}} \sum_K \{\Phi(K) ;\ K \cap I \ne \emptyset,\ K \subset J\} \]
exists in $\mathcal{A}_F$,
(6) $\tau^{(k)}(\Phi(I)) = \Phi(I + k)$, where $\tau^{(k)}$ is the lattice translation specified by
\[ \tau^{(k)}(a_n) = a_{n+k} , \qquad \tau^{(k)}(a_n^*) = a_{n+k}^* . \]
$E_J$ is the conditional expectation from $\mathcal{A}_F$ to $\mathcal{A}_F(J)$ which is uniquely determined by the condition $E_J(a) \in \mathcal{A}(J)$ and $\mathrm{tr}(ab) = \mathrm{tr}(E_J(a)b)$ for all $a \in \mathcal{A}_F$ and $b \in \mathcal{A}(J)$. Here, $\mathrm{tr}$ is the unique trace state of $\mathcal{A}_F$. We define the internal energy $U(I)$ and the surface energy $W(I)$ as
\[ H(I) = \sum_K \{\Phi(K) ;\ K \cap I \ne \emptyset\} \]
\[ U(I) = \sum_{K \subset I} \Phi(K) \]
\[ W(I) = \sum_K \{\Phi(K) ;\ K \cap I \ne \emptyset,\ K \cap I^c \ne \emptyset\} . \]
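The convergence of the summation in the definition of $W(I)$ and $H(I)$ is guaranteed by the fifth condition of Definition 5.1. It is known that for any $*$-derivation $\delta$ defined on $\mathcal{A}_{\mathrm{loc}}$ with values in $\mathcal{A}$ there exists a unique standard potential satisfying
\[ \delta(Q) = i[H(I), Q] , \qquad Q \in \mathcal{A}(I) . \]
Now let us turn to the Gibbs condition. To define the Gibbs condition we need some preparations. Let $(H_\varphi, \pi_\varphi, \Omega_\varphi)$ be the GNS triple of $\varphi$. Suppose that $\varphi$ is a faithful state for $\pi_\varphi(\mathcal{A}_F)''$. Then $\Omega_\varphi$ is a cyclic and separating vector for $\pi_\varphi(\mathcal{A}_F)''$ and the modular operator $\Delta_\varphi$ corresponding to $\Omega_\varphi$ is well-defined. We denote the modular automorphism by $\sigma_t^\varphi$:
\[ \sigma_t^\varphi(x) = \Delta_\varphi^{it}\,x\,\Delta_\varphi^{-it} , \qquad x \in \pi_\varphi(\mathcal{A}_F)'' . \]
If $\varphi$ is an $\alpha$-KMS state of $\mathcal{A}_F$, we have the following identity:
\[ \sigma_t^\varphi(\pi_\varphi(A)) = \pi_\varphi(\alpha_t(A)) , \qquad A \in \mathcal{A}_F . \]
Now we define the bounded perturbation of the dynamics $\alpha_t$, $\sigma_t^\varphi$ and of the state $\varphi$. Let $(\tau_t, \mathcal{O})$ be a $C^*$-dynamical system or a $W^*$-dynamical system, and let $v \in \mathcal{O}$ be a bounded operator. There exists the perturbed dynamics $\tau_t^v$ of $\tau_t$ with generator $\delta^v$ given by
\[ \delta^v(A) \equiv \delta(A) + i[v, A] , \qquad A \in D(\delta) , \]
where $\delta$ is the generator of $\tau_t$. The perturbed dynamics $\alpha_t^v$ and $(\sigma_t^\varphi)^{\pi_\varphi(v)}$ are related by the following equation:
\[ (\sigma_t^\varphi)^{\pi_\varphi(v)}(\pi_\varphi(A)) = \pi_\varphi(\alpha_t^v(A)) , \qquad A \in \mathcal{A}_F . \]
The bounded perturbation of $\Omega_\varphi$ by $v$ is defined as
\[ \Omega_\varphi^v \equiv \sum_{m=0}^{\infty} \int_0^{1/2} dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{m-1}} dt_m\ \Delta_\varphi^{t_m}\pi_\varphi(v)\,\Delta_\varphi^{t_{m-1}-t_m}\pi_\varphi(v) \cdots \Delta_\varphi^{t_1-t_2}\pi_\varphi(v)\,\Omega . \]
The perturbed state $\varphi^v$ is defined by
\[ \varphi^v(A) = \frac{\langle \Omega_\varphi^v, \pi_\varphi(A)\,\Omega_\varphi^v\rangle}{\langle \Omega_\varphi^v, \Omega_\varphi^v\rangle} . \]
The sum converges absolutely. It is known that $\Omega_\varphi^v$ is cyclic and separating for $\pi_\varphi(\mathcal{A}_F)''$. Its modular automorphism group $\sigma_t^{\varphi^v}$ coincides with $(\sigma_t^\varphi)^{\pi_\varphi(v)}$. Let $\Phi : I \subset \mathbb{Z} \to \Phi(I) \in \mathcal{A}_F$ be an interaction which satisfies
\[ \delta_\alpha(A) = \sum_{J \cap I \ne \emptyset} i[\Phi(J), A] , \qquad A \in \mathcal{A}_F(I) . \]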
Suppose that W (I) converges in the norm topology of AF for any finite I. In [4], Araki and Moriya introduced the Gibbs condition of states of quantum lattice models in a different manner from the standard definition [8]. The definition of Araki and Moriya is convenient for the study of Fermion lattice models as follows: Definition 5.2. Suppose AF0 ⊂ D(δ), where AF0 is the union of finite local subalgebras. Let W (β, I) ≡ πϕ (βW (I)). A state ϕ satisfies the (δ, β)-Gibbs condition if ϕ is faithful and σtϕ σtϕ
W (β,I)
W (β,I)
satisfies
(πϕ (A)) = πϕ (e−iβU (I)t AeiβU (I)t ) ,
A ∈ AF (I) ,
for any finite I ⊂ Z. The above definition is equivalent to the following product property, which is chosen as the definition of the Gibbs state in quantum spin systems.

Proposition 5.1. Assume that αt is even, i.e. αt ∘ Θ = Θ ∘ αt for any t ∈ R, and AF0 ⊂ D(δ). Set W(β, I) ≡ πϕ(βW(I)) ∈ AF. Suppose that ϕ ∘ Θ = ϕ. Then ϕ satisfies the Gibbs condition if and only if $\varphi^{W(\beta,I)}$ has the following product property:
$$\varphi^{W(\beta,I)}(AB) = \varphi^{W(\beta,I)}(A)\,\varphi^{W(\beta,I)}(B) , \qquad A \in A(I) , \quad B \in A(I^c) ,$$
for any finite subset I ⊂ Z.

Next let us turn to the variational principle. Before treating the thermodynamical limit, let us recall the definition of the entropy of a state restricted to a finite system. Let ωI be the restriction of ω to AF(I). Then there exists a density matrix ρI which satisfies
$$\omega_I(A) = \mathrm{Tr}(\rho_I A) , \qquad A \in A_F(I) .$$
The entropy S(ωI) of ωI is defined as
$$S(\omega_I) \equiv -\omega_I(\log \rho_I) .$$
For any translation invariant potential Φ ∈ Pτ, and any translation invariant state ω, the thermodynamic limits of the pressure, mean entropy and mean energy exist:
$$P(\Phi) \equiv \text{v.H.}\lim_{I\to\infty} \frac{1}{|I|} \log \mathrm{Tr}_I\bigl(e^{-H(I)}\bigr) ,$$
$$e_\Phi(\omega) \equiv \text{v.H.}\lim_{I\to\infty} \frac{1}{|I|}\, \omega(H(I)) ,$$
$$s(\omega) \equiv \text{v.H.}\lim_{I\to\infty} \frac{1}{|I|}\, S(\omega_I) .$$
Here the limit v.H. is taken in the sense of van Hove. For any translation invariant state ϕ, the following variational inequality is valid:
$$P(\beta\Phi) \geq s(\varphi) - \beta e_\Phi(\varphi) .$$
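The variational inequality has an elementary finite-system analogue that can be checked numerically. The following is only a minimal sketch, assuming a commuting (diagonal) toy model: the four-level spectrum, the value of β and all function names are hypothetical and not taken from the paper, and no thermodynamic limit is involved. It illustrates that log Tr e^{-βH} ≥ S(ρ) − βρ(H) for a diagonal density matrix ρ, with equality at the Gibbs state.

```python
import math

def log_partition(energies, beta):
    """log sum_i exp(-beta * E_i), evaluated stably via log-sum-exp."""
    shifted = [-beta * e for e in energies]
    top = max(shifted)
    return top + math.log(sum(math.exp(s - top) for s in shifted))

def entropy(probs):
    """Entropy -sum_i p_i log p_i of a diagonal density matrix."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_functional(probs, energies, beta):
    """S(rho) - beta * rho(H) for a diagonal state rho."""
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    return entropy(probs) - beta * mean_energy

def gibbs_state(energies, beta):
    """Diagonal Gibbs weights exp(-beta * E_i) / Z."""
    log_z = log_partition(energies, beta)
    return [math.exp(-beta * e - log_z) for e in energies]

# Hypothetical toy data (assumption, not from the paper).
energies = [0.0, 0.5, 1.3, 2.0]
beta = 1.7

pressure = log_partition(energies, beta)
uniform = [0.25, 0.25, 0.25, 0.25]
gibbs = gibbs_state(energies, beta)

# The gap equals the relative entropy of the trial state to the Gibbs state.
gap_uniform = pressure - free_functional(uniform, energies, beta)
gap_gibbs = pressure - free_functional(gibbs, energies, beta)
print("gap for uniform state:", gap_uniform)
print("gap for Gibbs state:  ", gap_gibbs)
```

In this toy setting the gap is strictly positive for the uniform state and vanishes, up to rounding, for the Gibbs state, which mirrors the roles of the inequality above and of the variational equality in Theorem 5.1.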
There exists a state which achieves the equality:

Theorem 5.1. For any Φ ∈ Pτ, there exists a state ϕ which satisfies
$$P(\beta\Phi) = s(\varphi) - \beta e_\Phi(\varphi) .$$

The KMS condition is equivalent to the Gibbs condition and the variational equality under the following assumptions (cf. [4]).

Theorem 5.2. Let αt be a strongly continuous one-parameter group of automorphisms with the generator δα. Suppose that αt satisfies αtΘ = Θαt, and the domain D(δα) of δα contains the strict local algebra AF0. Let δ be the restriction of δα to AF0. There exists a translation invariant standard potential Φ ∈ Pτ such that
$$\delta(A) = \sum_{I\cap J\neq\emptyset} i[\Phi(J), A] , \qquad A \in A_F(I) .$$
Suppose further that the strict local algebra AF0 is a core of the generator δα. Then, for any translation invariant state ϕ, the following conditions are equivalent.
(1) ϕ is the (αt, β)-KMS state,
(2) ϕ satisfies the (δ, β)-Gibbs condition,
(3) P(βΦ) = s(ϕ) − βeΦ(ϕ).

Let us return to our system.

Lemma 5.1. Let $\alpha_t^F$ be a one-parameter group of Bogoliubov automorphisms on AF such that the generator of αt on the one-particle space l2(Z) is bounded. Then the strict local algebra AF0 is a core for the generator δ of $\alpha_t^F$.

Proof. This follows from the following fact (cf. [7, Corollary 3.1.20]). Let D be a dense αt-invariant subalgebra of AF such that any element of D is in the domain of the generator δ of $\alpha_t^F$. Then D is a core of δ. For our purpose D is the set of all polynomials in creation and annihilation operators with all smearing functions in l2(Z). As the generator of αt on the one-particle space l2(Z) is bounded, the graph closure of (AF0, δ(AF0)) contains (D, δ(D)). This shows the claim of the lemma.

Obviously $\alpha_t^F\Theta = \Theta\alpha_t^F$, $A_{F0} \subset D(\delta_{\alpha^F})$, and ωρ is translation invariant. Now let us check that the potential of our Fermion system is a translation invariant standard potential. Requiring
$$\frac{d}{dt}\,\alpha_t^F(a_m)\Big|_{t=0} = \sum_{I\subset Z,\ m\in I} i[\Phi_F(I), a_m] ,$$
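we obtain the potential ΦF(I) as follows:
$$\begin{aligned}
\Phi_F(\{m\}) &= \gamma\Bigl(a_m^* a_m - \tfrac{1}{2}\Bigr) , \\
\Phi_F(\{n, n+1\}) &= -\tfrac{1}{2}\bigl(a_n^* a_{n+1} + a_{n+1}^* a_n\bigr) , \\
\Phi_F(\{n, m : n \neq m\}) &=
\begin{cases}
\dfrac{2i}{\pi}\,\dfrac{\beta_+ - \beta_-}{\beta_+ + \beta_-}\,\dfrac{n-m}{(n-m)^2 - 1}\,\bigl(a_n^* a_m - a_m^* a_n\bigr) , & \text{for } n-m \text{ even} , \\[2ex]
-\dfrac{2i}{\pi}\,\gamma\,\dfrac{\beta_+ - \beta_-}{\beta_+ + \beta_-}\,\dfrac{1}{n-m}\,\bigl(a_n^* a_m - a_m^* a_n\bigr) , & \text{for } n-m > 1 \text{ odd} ,
\end{cases} \\
\Phi_F(I) &= 0 \quad \text{for } |I| \geq 3 .
\end{aligned} \tag{5}$$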
Obviously, the conditions 1, 2, 3, and 6 of the standard potential are satisfied. For b ∈ A(J) and n ∈ J^c, the conditional expectation satisfies
$$\mathrm{tr}(E_J(a_n)b) = \mathrm{tr}(a_n b) = 0 .$$
So we have EJ(an) = 0, which implies the condition 4 by
$$E_J(a_m^* a_n) = a_m^* E_J(a_n) = 0 , \qquad m \in J ,\ n \in J^c .$$
The condition 5 can be proved by noting that the internal energy UF(I) and the surface energy WF(I) are elements of AF and given by
$$U_F(I) = \frac{1}{2}\sum_{m\in I}\Bigl(a^*(P(I)f_m)\,a_m + a_m^*\,a(P(I)f_m)\Bigr) + \gamma\sum_{m\in I}\Bigl(a_m^* a_m - \frac{1}{2}\Bigr) ,$$
$$W_F(I) = \sum_{m\in I}\Bigl(a^*(P(I^c)f_m)\,a_m + a_m^*\,a(P(I^c)f_m)\Bigr) . \tag{6}$$
Here fm is given by
$$f_m = \frac{i}{\pi}\,\frac{\beta_- - \beta_+}{\beta_- + \beta_+}\sum_{l=-\infty}^{\infty}\frac{4l}{4l^2 - 1}\,|2l+m\rangle - \frac{1}{2}\bigl(|m+1\rangle + |m-1\rangle\bigr) - \frac{2i}{\pi}\,\gamma\,\frac{\beta_- - \beta_+}{\beta_- + \beta_+}\sum_{l=-\infty}^{\infty}\frac{1}{2l+1}\,|2l+1+m\rangle ,$$
where the sum converges in the norm of l2(Z). P(I) is a projection onto I. Hence the conditions required in Theorem 5.2 are satisfied. Recall that ωρ is the (αF, β)-KMS state, with β = (β+ + β−)/2. In general, the KMS state of a quasi-free motion is unique, and ωρ is the only (αF, β)-KMS state. By Theorem 5.2, ωρ is also the only translation invariant state which satisfies the (δ, β)-Gibbs condition and the (ΦF, β)-variational equality.
Theorem 5.3. Let $\alpha_t^F$ be the one-parameter group of automorphisms of AF determined by (3) and (4). Let ΦF(I) be the standard potential of (5). Then
$$\delta(A) = \sum_{I\cap J\neq\emptyset} i[\Phi_F(J), A] , \qquad A \in A_F(I) ,$$
where δ is the restriction of the generator of $\alpha_t^F$. The non-equilibrium steady state ωρ is the unique ($\alpha_t^F$, β)-KMS state, and it satisfies the (δ, β)-Gibbs condition and the (ΦF, β)-variational equality. No other translation invariant state satisfies either the (δ, β)-Gibbs condition or the (ΦF, β)-variational equality.

The above result in our Fermion system implies the variational equality of ωρ, because all the potentials Φ(I) belong to A+. However, for our proof of uniqueness of the state which satisfies the variational equality for the spin system of the XX model, we need more preparation. Suppose that ω and ω′ are translation invariant states of AS that satisfy the (ΦS, β)-variational principle. Then their restrictions to A+, ω|A+ and ω′|A+, satisfy the (ΦS, β)-variational principle as states on A+. Note that we are using the fact that the mean entropy is affine and that s(ω) = s((ω + ω ∘ Θ)/2). On the other hand, the statement of Theorem 5.2 remains valid even if AF is replaced by A+. (The proof is the same as in the AF case.) So any translation invariant state which satisfies the (ΦS|A+, β)-variational equality is the (αt|A+, β)-KMS state. Let ω+ be the restriction of our NESS ωρ to A+.

Lemma 5.2. ω+ is the unique (β, αt|A+)-KMS state on A+.

Proof. We apply [8, Theorem 5.4.3]. (AF, A+, Z2, αF, γ) is the field system, where γ is the group action of Z2 with γ1 = Θ. Let ϕ+ be an extremal (αF, β)-KMS state over A+. Then [8, Theorem 5.4.3] implies that any extremal αF-invariant extension ϕ of ϕ+ to AF is a ($\alpha_t^F\gamma_{\xi_t}$, β)-KMS state, where t → ξt is a continuous one-parameter subgroup of Z2. However, as Z2 is discrete, ξt = 1. So the extension is a ($\alpha_t^F$, β)-KMS state. But as the KMS state of our quasi-free motion is unique, we conclude that ω+ is the unique (β, αt|A+)-KMS state on A+.
By this Lemma, we obtain ω|A+ = ω′|A+. Furthermore we have
$$\frac{\omega + \omega\circ\Theta}{2} = \frac{\omega' + \omega'\circ\Theta}{2}$$
because they are even states.
Lemma 5.3. The NESS ω on AS is a factor state.

Proof. We use the notation of Lemma 4.1. The GNS representation of ω is unitarily equivalent to (HS, πS, ΩS). Here, the Hilbert space HS is HS = H+ ⊕ H−, where
$$H_+ = \pi_\omega(A_+)\Omega , \qquad H_- = \pi_\omega(A_-)\Omega ,$$
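and ΩS ≡ Ω ⊕ 0, where $B = B_2 T = (a_2 + a_2^*)T$. Let us define σ on A+ as the adjoint action of the unitary B:
$$\sigma(A) \equiv BAB^{-1} , \qquad A \in A_+ .$$
Set $U H_S U^* \equiv H_+ \oplus H_+$. By U we denote the unitary operator from HS to $U H_S U^*$ determined by
$$U A\Omega = A\Omega , \qquad U\sigma(A)B\Omega = A\Omega , \qquad A \in A_+ .$$
We have
$$U A U^* = \begin{pmatrix} A & 0 \\ 0 & \sigma(A) \end{pmatrix} , \quad A \in A_+ , \qquad U B U^* = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .$$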
Let us consider an element A of $U(\pi_S(\mathcal{A})'' \cap \pi_S(\mathcal{A})')U^*$. As $A \in U\pi_S(\mathcal{A})'U^*$, we obtain
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} , \qquad a, d \in \mathbb{C}1 , \quad xb = b\sigma(x) , \quad cx = \sigma(x)c , \quad x \in A_+ ,$$
because π+(A+)'' on H+ is a factor. If b ≠ 0 or c ≠ 0, the relation implies that the representation π+ of A+ is equivalent to π+ ∘ σ. We will see that this leads to a contradiction. As a consequence, we conclude b = c = 0. Similarly, as A commutes with $UBU^*$, we obtain a = d. Hence we obtain A ∈ C1. Thus, ω is a factor state.
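The intertwining relations for b and c used in the proof above come from a one-line block computation. As a sketch only, assuming (as in the displayed form of U A U∗) that each x ∈ A+ acts on H+ ⊕ H+ as the block-diagonal matrix with entries x and σ(x), the two products to be compared are:

```latex
\begin{aligned}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} x & 0 \\ 0 & \sigma(x) \end{pmatrix}
&= \begin{pmatrix} a x & b\,\sigma(x) \\ c x & d\,\sigma(x) \end{pmatrix} , \\
\begin{pmatrix} x & 0 \\ 0 & \sigma(x) \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
&= \begin{pmatrix} x a & x b \\ \sigma(x) c & \sigma(x) d \end{pmatrix} .
\end{aligned}
```

Equating entries gives ax = xa, bσ(x) = xb, cx = σ(x)c and dσ(x) = σ(x)d for all x ∈ A+, so a nonzero b or c would intertwine π+ and π+ ∘ σ.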
Lemma 5.4. The quasi-free states ω and ω ∘ σ restricted to A+ are disjoint.

We can apply the results of [18]. It suffices to show that $\rho^{1/2} - \theta_-\rho^{1/2}\theta_-$ and $(1-\rho)^{1/2} - \theta_-(1-\rho)^{1/2}\theta_-$ are not of Hilbert–Schmidt class. This claim can be proved in the same way as the proof of Lemma 4.3. We omit the proof.

The previous argument shows that ω = (ω′ + ω′ ∘ Θ)/2 if ω′ is a translation invariant state satisfying the variational equality. As ω is a translation invariant factor state, ω is ergodic (extremal in the set of translation invariant states), so ω = ω′ = ω′ ∘ Θ. As a consequence we arrive at the main result of this article.

Theorem 5.4. The non-equilibrium steady state ω of the XX model is the unique translation invariant state which satisfies the (ΦS, β)-variational equality:
$$P(\beta\Phi_S) = s(\omega) - \beta e_{\Phi_S}(\omega) ,$$
where
$$\begin{aligned}
\Phi_S(\{m\}) &= \frac{1}{2}\,\gamma\,\sigma_z^{m} , \\
\Phi_S(\{n, n+1\}) &= -\frac{1}{2}\bigl(\sigma_x^{n}\sigma_x^{n+1} + \sigma_y^{n}\sigma_y^{n+1}\bigr) , \\
\Phi_S([m, n]) &=
\begin{cases}
-\dfrac{1}{\pi}\,\dfrac{\beta_+ - \beta_-}{\beta_+ + \beta_-}\,\dfrac{n-m}{(n-m)^2 - 1}\,\displaystyle\prod_{k=m+1}^{n-1}\sigma_z^{k}\cdot\bigl(\sigma_y^{m}\sigma_y^{n} - \sigma_x^{m}\sigma_x^{n}\bigr) , & \text{for } n-m \text{ even} , \\[2ex]
\dfrac{1}{\pi}\,\gamma\,\dfrac{\beta_+ - \beta_-}{\beta_+ + \beta_-}\,\dfrac{1}{n-m}\,\displaystyle\prod_{k=m+1}^{n-1}\sigma_z^{k}\cdot\bigl(\sigma_y^{m}\sigma_y^{n} - \sigma_x^{m}\sigma_x^{n}\bigr) , & \text{for } n-m > 1 \text{ odd} .
\end{cases}
\end{aligned}$$
The convergence of the local Hamiltonian operator for our NESS
$$H(I) = \sum_{K}\{\Phi(K);\ K\cap I \neq \emptyset\}$$
is assured by the underlying anti-commutativity of the Fermions, while
$$\sum_{K}\{\|\Phi(K)\|;\ K\cap I \neq \emptyset\} = \infty .$$
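The divergence of the norm sum can be made plausible with a short numerical sketch based on the Fermion potential (5). The assumptions are as follows: only the size of the scalar coefficient in front of the two-site operator a∗n am − a∗m an is used (that operator has norm one, since its square is minus a projection), the values of β+, β− and γ are hypothetical, and the function names are illustrative. The partial sums over partners of a fixed site grow like a logarithm of the cutoff, so they do not converge.

```python
import math

def coupling_size(distance, kappa, gamma):
    """Absolute value of the coefficient in (5) for |n - m| = distance >= 2.

    kappa stands for (beta_plus - beta_minus) / (beta_plus + beta_minus);
    signs and the factor i are dropped because only norms matter here.
    """
    if distance % 2 == 0:
        return (2.0 / math.pi) * abs(kappa) * distance / (distance ** 2 - 1)
    return (2.0 / math.pi) * abs(kappa * gamma) / distance

def partial_sum(cutoff, kappa, gamma):
    """Sum of coupling sizes over all partners with 2 <= |n - m| <= cutoff.

    The factor 2 accounts for partners to the left and to the right.
    """
    return 2.0 * sum(coupling_size(d, kappa, gamma) for d in range(2, cutoff + 1))

# Hypothetical parameters (assumptions, not taken from the paper).
beta_plus, beta_minus, gamma = 1.5, 0.5, 0.3
kappa = (beta_plus - beta_minus) / (beta_plus + beta_minus)

sums = [partial_sum(10 ** k, kappa, gamma) for k in (2, 3, 4)]
increments = [later - earlier for earlier, later in zip(sums, sums[1:])]
print("partial sums:", sums)
print("growth per decade of cutoff:", increments)
```

Each tenfold increase of the cutoff adds roughly the same amount, which is the signature of a harmonic-type (logarithmic) divergence; the convergence of H(I) itself therefore has to come from cancellations rather than from absolute summability.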
References
[1] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. Res. Inst. Math. Sci. 6 (1970/71), 385–442.
[2] H. Araki, On the equivalence of the KMS condition and the variational principle for quantum lattice systems, Commun. Math. Phys. 38 (1974), 1–10.
[3] H. Araki, On the XY-model on two-sided infinite chain, Publ. RIMS, Kyoto Univ. 20 (1984), 277.
[4] H. Araki and H. Moriya, Equilibrium Statistical Mechanics of Fermion Lattice Systems, Rev. Math. Phys. 15 (2003), 93–198.
[5] H. Araki and H. Moriya, Local Thermodynamic Stability of Fermion Lattice Systems, Lett. Math. Phys. 62 (2003), 33–45.
[6] W. H. Aschbacher and C.-A. Pillet, Non-Equilibrium Steady States of the XY Chain, J. Stat. Phys. 112 (2003), 1153–1175.
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 1, Springer-Verlag, 1987.
[8] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, Springer-Verlag, 1996.
[9] S. Dirren, ETH master thesis under the supervision of J. Froehlich and G. M. Graf, 1998.
[10] A. van Enter, C. Maes and S. Shlosman, Dobrushin's program on Gibbsianity restoration: weakly Gibbs and almost Gibbs random fields, On Dobrushin's way, from probability theory to statistical physics, Amer. Math. Soc. Transl. Ser. 2 198 (2000), 59–70.
[11] D. E. Evans and J. T. Lewis, On a C∗-algebra approach to phase transition in the two-dimensional Ising model. II, Commun. Math. Phys. 102 (1986), 521–535.
[12] J. Fröhlich, M. Merkli and D. Ueltschi, Dissipative Transport: Thermal Contacts and Tunnelling Junctions, preprint, mp-arh/02-528, 2002.
[13] T. G. Ho and H. Araki, Asymptotic time evolution of a partitioned infinite two-sided isotropic XY-chain, Proc. Steklov. Inst. Math. 228 (2000), 191–204.
[14] N. M. Hugenholtz and R. V. Kadison, Automorphisms and quasi-free states of the CAR algebra, Commun. Math. Phys. 43 (1975), 181–197.
[15] V. Jaksic and C.-A. Pillet, On entropy production in quantum statistical mechanics, Commun. Math. Phys. 217 (2001), 285–293.
[16] V. Jaksic and C.-A. Pillet, Non-equilibrium steady states of finite quantum systems coupled to thermal reservoirs, Commun. Math. Phys. 226 (2002), 131–162.
[17] O. Lanford and D. W. Robinson, Statistical Mechanics of Quantum Spin Systems III, Commun. Math. Phys. 9 (1968), 327–338.
[18] T. Matsui, On quasi-equivalence of quasifree states of the gauge invariant CAR algebras, J. Operator Theory 17 (1987), 281–290.
[19] Y. Ogata, Non-equilibrium Properties in the Transverse XX Chain, Phys. Rev. E 2 (2002), 016135.
[20] D. Ruelle, A variational formulation of equilibrium statistical mechanics and the Gibbs phase rule, Commun. Math. Phys. 5 (1967), 324–329.
[21] D. Ruelle, Natural nonequilibrium states in quantum statistical mechanics, J. Stat. Phys. 98 (2000), 57.
[22] D. Ruelle, Entropy production in quantum spin systems, Commun. Math. Phys. 224 (2001), 3–16.
[23] S. Tasaki, Nonequilibrium stationary states of noninteracting electrons in a one-dimensional lattice, Chaos, Solitons and Fractals 12 (2001), 2657–2674.
[24] S. Tasaki and T. Matsui, Fluctuation Theorem, Nonequilibrium Steady States and MacLennan-Zubarev Ensembles of L1-Asymptotic Abelian C∗ Dynamical Systems, preprint mp-arh 02-533 (2002).
[25] D. N. Zubarev, Nonequilibrium Statistical Physics, Nauka, 317 (1971).
December 8, 2003 12:30 WSPC/148-RMP
00186
Reviews in Mathematical Physics Vol. 15, No. 8 (2003) 925–947 c World Scientific Publishing Company
A MATHEMATICAL VERIFICATION OF THE EXISTENCE OF STRINGS OF CIRCULATION IN SUPERFLUID FILMS ON POROUS MEDIA
CHRIS PETERSEN BLACK
Department of Mathematics, Seattle University, Seattle, WA 98122, USA
[email protected]
Received 12 April 2002
Revised 1 October 2003

A superfluid film flowing along the walls of a porous material can be modeled by a harmonic differential on a punctured Riemann surface satisfying integrality conditions on its periods and residues. In this paper, we use classical Riemann surface theory to mathematically investigate the observed physical phenomenon of the appearance of pairs of vortices in the fluid and the resulting patterns of circulation. We show that the existence of strings of circulation in the superfluid depends on a quantized divisibility condition between the number of vortex–antivortex pairs and the circulation of the fluid.

Keywords: Superfluid films; liquid helium; low-temperature physics; Riemann surface theory; vortex filaments; quantization; ideal fluid.
1. Introduction
In this paper, we explore the geometric aspects of superfluid films using classical Riemann surface theory to develop a mathematical model for the flow of a superfluid ⁴He film which can be applied to the special case of a superfluid film on a porous surface. Our inspiration comes from the explorations of the physical aspects of superfluid helium films by such physicists as Machta and Guyer [1] and Minoguchi and Nagaoka [2], who have studied the flow of superfluid ⁴He adsorbed on a porous surface using the Kosterlitz–Thouless model. In this work we formulate a general mathematical model of the physical situation that is readily accessible to a wide range of scientists, mathematicians and students. This model is consistent with experimental observations about the presence of strings of circulation in the flow, a particular circulation pattern where all circulation is concentrated along a path between vortex–antivortex pairs. In Sec. 4.2, we show that in the case of one vortex–antivortex pair there is a one-to-one correspondence between superfluid flows and strings of circulation. However, the one-to-one correspondence fails in the case of multiple vortex–antivortex pairs. We show that while it is possible for any set of strings to be produced by some superfluid flow,
the only superfluid flows that will produce strings of circulation are those whose real circulation along closed loops on the surface is quantized as an integer multiple of $r\frac{h}{m}$, where r is the number of vortex–antivortex pairs in the fluid, h is Planck's constant, and m is the mass of one ⁴He atom.

The behavior of superfluid helium is analogous to that of a superconductor in that both systems can sustain a significant particle current for long periods of time without any driving force. The microscopic behavior of a superfluid system can be described by the theory of a Bose–Einstein condensate [3], a state of matter where a large number of particles join together and behave like a single quantum object. The study of superfluidity also has connections to string theory and conformal field theory. However, in this paper we study the flow of a superfluid film from a purely geometric viewpoint. Before starting into the mathematics of such a situation, we first describe the physical situation that leads us to our mathematical model.

2. The Physics of Superfluidity

2.1. Basic properties of ⁴He
Atkins [4] provides a thorough and wonderfully readable history of the experiments performed during the period between 1908 and 1955 that led to the understanding of the properties of superfluid helium and provided the foundation for the work that has followed. A more modern review of the subject was provided by Brewer [5] in 1978, with an update in 1992 by Reppy [6]. The results of the experiments that we reference in this paper are summarized below, but this is by no means a comprehensive overview of the history of this field. For details of the experiments, the reader is referred to these aforementioned works.

The natural state of ⁴He is a gas, and condensation into a liquid can only occur if the temperature is within a few degrees of absolute zero. Below the boiling point of 4.21 K, ⁴He remains liquid down to temperatures as low as 0.1 K without solidifying.
It is likely that the liquid still exists at absolute zero [7]. The solid form of ⁴He can only be obtained under pressure of the order of 25 atmospheres. Immediately below the boiling point, ⁴He behaves as an ordinary low-viscosity liquid. However, at the transition temperature Tλ = 2.17 K (also known as the λ-point), liquid ⁴He undergoes a remarkable change called the superfluid/normal phase transition. Above Tλ and at reduced pressure so it does not vaporize, the liquid boils vigorously. But, as soon as the temperature Tλ is reached, the liquid becomes immediately still without any bubbles or agitation. This indicates that Tλ marks the transition between two different forms of liquid ⁴He, known as helium I above the λ-point, and helium II below it. (According to Atkins, the use of the Greek letter λ to represent this transition comes from the shape of the graph of specific heat as a function of temperature.) Helium I behaves as an ordinary low-viscosity, low-density liquid, but helium II has amazing properties.

One of the most remarkable properties of helium II is its ability to appear simultaneously viscous and non-viscous, as was determined in 1938 by both Kapitza [8] and Allen and Misener [9]. These seemingly contradictory results were resolved
by the two-fluid theory, proposed by Tisza [10] in 1940 and further refined by Landau [11] in 1941. According to the two-fluid theory, helium II is considered to be an intimate mixture of two components. The normal component has a normal viscosity on the order of 10⁻⁵ poise. However, the superfluid component has very low viscosity, on the order of 10⁻¹¹ poise, and is capable of flowing through very narrow channels (called superleaks) with very high velocities, leaving the normal component behind. The division of ⁴He into two components was convincingly demonstrated by Andronikashvili [12] in 1946. If we let ρs represent the density of the superfluid and ρn the density of the normal fluid, then the density of the total fluid is given by ρ = ρs + ρn. The fraction of superfluid is then given by the ratio ρs/ρ, and the fraction of normal fluid is given by ρn/ρ.

Fig. 1. Densities of He I and He II by temperature, after Andronikashvili. [Plot of ρs/ρ and ρn/ρ against temperature, with Tλ marked.]

Andronikashvili's results are shown in Fig. 1, which illustrates the relationship between the densities of the two fluids as the temperature approaches the transition point Tλ. The data points shown on the graph are Andronikashvili's data, which are modeled nicely by the exponential function $y = 0.001819\,(17.60976)^{t}$, for t < Tλ. For the purposes of this paper, we deal exclusively with the case where the ratio ρs/ρ = 1; that is, the case when He II is entirely superfluid, an idealization of the situation at temperatures near absolute zero.

Due to its extremely low viscosity, a film of liquid helium on a surface typically flows only a few atoms thick. This film thickness is sufficiently small to act as a superleak to prevent the normal component of the fluid from flowing, but allows the superfluid component to flow as a virtually two-dimensional fluid.
Thus, we can treat a superfluid film as one of Riemann's "ideal fluids", meaning that we assume the superfluid film is two-dimensional and frictionless. On a porous surface such as a packed powder, aerogel or Vycor glass (a porous glass with a sponge-like structure), the superfluid component of the ⁴He film will flow through the pores of the surface, which act as superleaks. Although superfluids exhibit many strange properties not mentioned here, it is the behavior of a superfluid film on a porous surface that caused us to become interested in this field of study, and it is this situation that we model in this paper.

2.2. Quantization of circulation
Before further discussing superfluid films, we should note one of the most important features of superfluid flow that affects our upcoming discussion. As we shall explain
below, the laws of quantum physics dictate that the circulation of a superfluid flow around any closed path on the surface is always an integer multiple of the ratio of Planck's constant to the mass of one ⁴He atom.

Liquid ⁴He below the λ-point possesses a macroscopically large number of particles occupying a single quantum state, called a condensate. In his classic work, London [13] argues that superfluid currents are quantum currents, and that there is some kind of wave function that extends throughout the superfluid. Under steady state conditions [7], this wave function can be written in the form Ψ(r) = |Ψ| exp[iS(r)], where the phase S(r) is a real function of the position r. This leads to an expression for the canonical momentum: p = ħ∇S. However, the canonical momentum can also be interpreted as the momentum of one particle of superfluid, so we also have the expression p = m vs, where m is the mass of one atom of ⁴He, and vs is the velocity of the superfluid. Thus, we have the expression for the superfluid velocity
$$v_s = \frac{\hbar}{m}\nabla S .$$
The circulation of the superfluid along a closed loop is given by
$$\kappa = \oint v_s \cdot dl .$$
Thus, we have an expression for the circulation that we can evaluate using the Fundamental Theorem for Line Integrals:
$$\kappa = \frac{\hbar}{m}\oint \nabla S \cdot dl = \frac{\hbar}{m}\,\Delta S .$$
Because the superfluid wave function is single-valued, the value of S can only change by a multiple of 2π as we traverse the loop. Thus, ΔS = 2πn for some integer n, and the total circulation is an integer multiple of the quantum of circulation, $2\pi\frac{\hbar}{m} = \frac{h}{m}$, where h = 2πħ is Planck's constant. This result is known as the Feynman–Onsager relation [14, 15].

2.3. Vortices, strings, and previous mathematical models
Although the theory we develop in this paper is not restricted to a superfluid film in a porous medium, this is the physical situation that attracted our attention, and thus it merits special mention here. In previous literature (for example, [1, 2, 16]) addressing superfluid films adsorbed in porous media, the surface has been modeled using the Kosterlitz–Thouless model [17] of the porous medium. Essentially, this is a jungle gym structure, where the surface is assumed to be composed of cylinders of fixed length and radius attached at right angles to make a three-dimensional lattice (see Fig. 2). Although there is some leeway in the length of the cylinders, this is a very restricted
geometry, and not an accurate representation of the porous surface. In this investigation we will be less restrictive. We instead model our porous medium by a compact Riemann surface of high genus, which is a topological specification and places no restriction on the geometry of the surface. For purposes of illustration (see Fig. 3), we will show a surface of genus 2 or 3, but the theory applies for surfaces of any genus, and the situation we are modeling demands a surface of extremely high genus: one for every pore in the surface.

Fig. 2. The Kosterlitz–Thouless (jungle gym) model of a porous medium.

Fig. 3. An "arbitrary" Riemann surface.

It is clear that the surface of a sponge-like material such as Vycor glass or aerogel would form a smooth Riemann surface, but for the surface of the grains of a packed powder this is not clear. In actuality, however, the grains of a packed powder deform at the contact points so that the surface formed is indeed smooth.

It is well known (see, for example, Kosterlitz and Thouless [17]) that vortices appear in helium II in pairs of equal and opposite polarity, called a vortex–antivortex pair. As we shall describe in the next paragraph, the vortex–antivortex pairs are linked by invisible strings that bind them together. These vortices can move in the fluid, they can be created and annihilated, and as the temperature increases to Tλ the pairs disassociate. However, in this paper we consider only a static picture of the vortices.

Using the jungle-gym model, Machta and Guyer and Minoguchi and Nagaoka independently theorized the existence of strings of circulation in the superfluid film [1, 2] in 1988. A string of circulation is a fixed path between a vortex–antivortex pair along which the circulation around any simple loop that intersects this path transversally remains constant. Such a situation is illustrated in Fig. 4. The dashed line indicates a "string" between the positive and negative vortices. The loop γ
does not cross the path of the string, so the circulation around the loop γ must be zero. The loop δ does cross the string, so the circulation around δ is non-zero (and quantized). Similarly, the circulation around any closed loop that intersects the string will also be non-zero, and is determined by the homotopy type of the loop in question. A string of circulation is distinguished from a vortex filament, which is an actual distribution of vorticity concentrated on a loop or an arc in a two-dimensional surface so that the fluid flow has a discontinuity in its tangential component across the filament. The goal of this paper is to mathematically determine when strings of circulation exist.

Fig. 4. Strings of circulation on an "arbitrary" Riemann surface.

Since the 1980s, the study of superfluid flows has taken a mathematical turn. Marsden provided an introduction to the mathematical method in his 1981 lectures [18]. Previous mathematical models of superfluid helium incorporate classical Hamiltonian dynamics and θ-statistics [19]. Rasetti and Regge [20] treat quantum vortices as global objects (as opposed to the Feynman–Onsager approach in which vortices are a purely local phenomenon) by creating a canonical quantization scheme for vortices in superfluid helium II using the infinite Lie algebra of incompressible flows. Similarly, Marsden and Weinstein [21] use symplectic geometry to describe the classical hydrodynamics of an ideal incompressible fluid using the dual of the Lie algebra of the group of area-preserving diffeomorphisms. The algebro-geometric approach of Goldin, Menikoff and Sharp [22] also quantizes the entire fluid velocity field instead of individual vortices using representations of the group of area-preserving diffeomorphisms. More recently, Penna and Spera [23] have used classical Riemann surface theory to investigate quantized point vortex theories in the zero total vorticity case. Our approach in this paper is similar to that of Penna and Spera.

2.4. The mathematical model
In this paper, a porous medium is modeled by a Riemann surface M of high genus. This removes the restriction of the jungle gym geometry, and allows the surface to take on a more arbitrary shape. Our only assumption is that the surface is smooth. However, the set-up we develop here is "ready-made" to later study the effects of deforming the surface (even catastrophically), so that it is no longer smooth.

Because the layer of fluid is only a few atoms thick, we model the flow of the superfluid by a two-dimensional vector field. The normal component of the fluid cannot flow through the pores of the medium, so we will neglect any normal
December 8, 2003 12:30 WSPC/148-RMP
00186
Superfluid Films on Porous Media
part and consider only the superfluid component. In order to satisfy the physical conditions for a fluid flow, it must first be irrotational and incompressible away from the singularities. That is, the flow must behave as a curl-free and divergence-free vector field. Thus, the superfluid flow can be modeled by a meromorphic differential $\omega$ on the Riemann surface, with the poles of the differential corresponding to the vortices in the fluid. Additionally, the Feynman–Onsager relation demands that the real part of the circulation of the fluid flow along any closed path $\gamma$ on the surface is quantized. That is, the real circulation of the fluid flow along any closed path is an integer multiple of $h/m$, where $h$ is Planck's constant and $m$ is the mass of one $^4$He atom.

Recall from Sec. 2.3 that vortices appear in the superfluid film in pairs of equal but opposite parity. Thus for a pair of vortices $(p, q)$ on $M$, we have
\[ \operatorname{res}_p \omega = -\operatorname{res}_q \omega . \]
In particular, the singularities in the fluid are vortices, instead of sources and sinks. This imposes the mathematical condition that the residues of $\omega$ are purely imaginary [24]. This, combined with the fact that circulation is quantized, implies that the residues must have the form
\[ \operatorname{res}_p \omega \in \frac{1}{2\pi i}\,\frac{h}{m}\,\mathbb{Z} \]
for any point $p$ on the Riemann surface $M$.

3. Mathematical Definitions

Before we can investigate the appearance of strings of circulation in the flow of a superfluid, we need to set up our model and define terms used for the remainder of the paper. We need a way to describe paths on the surface $M$, and we need to model the flow of the superfluid with a meromorphic differential to allow for vortices in the fluid. In particular, we first define a canonical homology basis, which is used throughout the paper to describe paths on the Riemann surface $M$. We then define the two bases of holomorphic forms on $M$ that we use to describe the fluid flows on this surface.

3.1. Riemann surfaces

Any smooth compact surface in $\mathbb{R}^3$ can be given the structure of a compact Riemann surface; that is, it is a one-dimensional compact complex manifold. Thus, the surface formed by grains of a packed powder or the walls of a porous ceramic is a Riemann surface $M$, and the superfluid flows on the outside of this surface as a two-dimensional film. Let $z$ be a complex local coordinate on $M$, so that for an open subset $U$ of $M$, $z : U \to \mathbb{C}$. Then, we get real local coordinates $(x, y) : U \to \mathbb{R}^2$ by $z = x + iy$.
C. P. Black
Definition 3.1. A 1-form $\omega$ on $M$ is an assignment of two continuous functions $f$ and $g$ to each local coordinate $z = x + iy$ on $M$ so that $\omega = f\,dx + g\,dy$ is invariant under coordinate changes.

Definition 3.2. We define the conjugation operator $\star$ on a 1-form $\omega = f\,dx + g\,dy$ as in Definition 3.1 on a Riemann surface $M$ by
\[ \star\omega = -g\,dx + f\,dy . \]

3.2. Homology bases

In this section, we define terms used to describe the topological paths on the Riemann surface that we use to define strings of circulation in the superfluid helium film.

Definition 3.3. A canonical homology basis on a Riemann surface $M$ of genus $g$ is a set of oriented loops $\{\alpha_i, \beta_i\}_{i=1}^{g}$ that satisfy
\[ \#(\alpha_i, \beta_j) = \delta_{ij}, \qquad \#(\alpha_i, \alpha_j) = 0, \qquad \#(\beta_i, \beta_j) = 0 . \]
For example, Fig. 5 illustrates a canonical homology basis in the genus 2 case.

Fig. 5. A canonical homology basis for $g = 2$ (loops $\alpha_1, \beta_1, \alpha_2, \beta_2$).

Definition 3.4. Let $M$ be a Riemann surface with paired points $\{(p_j, q_j)\}_{j=1}^{r}$, let $\gamma_{p_j}$ be a small loop around $p_j$, and let $\gamma_{q_j}$ be a small loop around $q_j$ oriented so that for any path $l$ from $q_j$ to $p_j$,
\[ \#(\gamma_{p_j}, l) = 1, \qquad \#(\gamma_{q_j}, l) = -1 . \]
Then, the set
\[ \{\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g, \gamma_{p_1}, \dots, \gamma_{p_r}, \gamma_{q_1}, \dots, \gamma_{q_{r-1}}\} \]
is a canonical homology basis on the punctured Riemann surface $M \setminus \{p_j, q_j\}_{j=1}^{r}$.

An example of a canonical homology basis on a genus 2 Riemann surface with punctures at $p$ and $q$ is shown in Fig. 6.

Fig. 6. A canonical homology basis for a punctured Riemann surface (loops $\alpha_1, \beta_1, \alpha_2, \beta_2, \gamma_p, \gamma_q$ around the punctures $p$ and $q$).

3.3. Meromorphic and holomorphic differentials on a Riemann surface
We now begin to define the meromorphic and holomorphic differentials used to describe fluid flows on $M$. We paraphrase the following facts from Farkas and Kra [25]:

Fact 3.1. For any Riemann surface $(M, g)$, there is a unique basis $\{\varphi_j\}_{j=1}^{2g}$ of real harmonic 1-forms with respect to a given canonical homology basis $H = \{\alpha_i, \beta_i\}_{i=1}^{g}$ so that for all $i, j = 1, \dots, g$,
\[ \int_{\alpha_i} \varphi_j = \delta_{ij}, \qquad \int_{\alpha_i} \varphi_{j+g} = 0, \qquad \int_{\beta_i} \varphi_j = 0, \qquad \int_{\beta_i} \varphi_{j+g} = \delta_{ij} . \]
These harmonic 1-forms also satisfy
\[ \iint_M \varphi_j \wedge \star\varphi_k = \delta_{jk} . \]
Additionally, there exist real non-zero constants $c_j$ so that these harmonic forms satisfy
\[ \star\varphi_j = c_j\,\varphi_{j+g} . \]

From this basis $\{\varphi_j\}_{j=1}^{2g}$ of real harmonic 1-forms on $M$, we build two bases $\{\xi_k\}_{k=1}^{g}$ and $\{\Gamma_k\}_{k=1}^{g}$ of holomorphic 1-forms that we reference throughout the remainder of this paper. For further discussion of these bases, refer to Farkas and Kra [25].

Fact 3.2. The set $\{\xi_k\}_{k=1}^{g}$ defined by
\[ \xi_k = \varphi_k + i \star\varphi_k \tag{1} \]
satisfies
\[ \operatorname{Re}\left(\int_{\alpha_j} \xi_k\right) = \delta_{jk}, \qquad \operatorname{Re}\left(\int_{\beta_j} \xi_k\right) = 0 . \]

Fact 3.3. Additionally, the set $\{\Gamma_k\}_{k=1}^{g}$ defined by
\[ \Gamma_k = \varphi_{k+g} + i \star\varphi_{k+g} \tag{2} \]
also forms a basis of holomorphic 1-forms. These satisfy
\[ \operatorname{Re}\left(\int_{\alpha_j} \Gamma_k\right) = 0, \qquad \operatorname{Re}\left(\int_{\beta_j} \Gamma_k\right) = \delta_{jk} . \]
The preceding holomorphic 1-forms on $M$ do not allow for vortices in the fluid, since holomorphic forms are non-singular. We need a meromorphic form that provides us with vortices at paired points $p$ and $q$ on $M$. Again, this is taken from Farkas and Kra [25].

Fact 3.4. Let $p$ and $q$ be any two points on a Riemann surface $M$ of genus $g$, and let $\{\alpha_i, \beta_i\}_{i=1}^{g}$ be a canonical homology basis not containing $p$ or $q$. Then there is a unique meromorphic 1-form $\mu_{pq}$ that satisfies
(1) $\operatorname{res}_p \mu_{pq} = -\operatorname{res}_q \mu_{pq} = 1$;
(2) $\operatorname{Re}\left(\int_{\alpha_j} \mu_{pq}\right) = 0$ and $\operatorname{Re}\left(\int_{\beta_j} \mu_{pq}\right) = 0$ for all $j = 1, \dots, g$;
(3) $\mu_{pq}$ is holomorphic on $M \setminus \{p, q\}$.

3.4. Superflows and the string condition

We now want to describe mathematically the flow of superfluid helium on a Riemann surface. Here we incorporate the requirements of the physical situation of a superfluid film into our developing mathematical model.

Definition 3.5. A meromorphic 1-form with at least one non-zero residue is called a differential of the third kind.

Definition 3.6. A differential of the third kind $\omega$ on $M$ is quantized if for any closed loop $\gamma$ on $M$,
\[ \operatorname{Re}\left(\int_{\gamma} \omega\right) \in \frac{h}{m}\,\mathbb{Z} . \]

Remark 1. A quantized flow satisfies our first physical requirement for a superfluid helium film: quantization of circulation. Since the residues of a flow with vortices (as opposed to sources and/or sinks) must be purely imaginary, we have
\[ \int_{\gamma_j} \omega = 2\pi i \operatorname{res}_{p_j} \omega \in \mathbb{R} , \]
where $\gamma_j$ is a small loop around $p_j$. Quantization of circulation gives
\[ 2\pi i \operatorname{res}_{p_j} \omega \in \frac{h}{m}\,\mathbb{Z} . \]
Thus for each $j = 1, \dots, r$, there exists $K_j \in \mathbb{Z}$ so that we have
\[ \operatorname{res}_{p_j} \omega = \frac{K_j}{2\pi i}\,\frac{h}{m} . \]
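Definition 3.7. Given a quantized flow $\omega$ on $M$, the strength of a pair of vortices $(p, q)$ is the integer $K$ so that
\[ \operatorname{res}_p \omega = \frac{K}{2\pi i}\,\frac{h}{m} \qquad \text{and} \qquad \operatorname{res}_q \omega = -\frac{K}{2\pi i}\,\frac{h}{m} . \]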
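To give a sense of scale for Remark 1 and Definition 3.7, the following is a minimal numerical sketch, not part of the original paper. It assumes a $^4$He atomic mass of roughly 4.002603 u and models a single vortex locally by the form $(\text{residue}/z)\,dz$ on the unit circle; the function names are hypothetical. It computes the quantum of circulation $h/m$ and checks that a purely imaginary residue of the stated form gives a real circulation.

```python
import cmath
import math

# Planck's constant in J s (exact SI value).
PLANCK_H = 6.62607015e-34
# Assumed inputs: approximate mass of one 4He atom in atomic mass units,
# and the atomic mass unit in kg.
HE4_MASS_U = 4.002603
ATOMIC_MASS_UNIT = 1.66053906660e-27


def circulation_quantum(mass_kg):
    """Return h/m, the quantum of circulation, in m^2/s."""
    return PLANCK_H / mass_kg


def circulation_around_pole(residue, samples=2000):
    """Numerically integrate (residue / z) dz once around the unit circle."""
    total = 0j
    dtheta = 2.0 * math.pi / samples
    for n in range(samples):
        z = cmath.exp(1j * n * dtheta)
        dz = 1j * z * dtheta  # d(e^{i theta}) = i e^{i theta} d(theta)
        total += (residue / z) * dz
    return total


kappa = circulation_quantum(HE4_MASS_U * ATOMIC_MASS_UNIT)
strength = 1                                  # the integer K of Definition 3.7
residue = strength * kappa / (2j * math.pi)   # purely imaginary residue
circulation = circulation_around_pole(residue)

print(f"h/m = {kappa:.4e} m^2/s")
print(f"circulation around the vortex = {circulation.real:.4e} m^2/s")
```

The local model $1/z$ is only an illustration of the residue computation; on the surface $M$ the flow is a global meromorphic differential.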
Finally, here is the definition we need to describe the flow of superfluid helium on a porous surface:

Definition 3.8. A quantized flow $\omega$ with vortices of strength 1 at the paired points $\{(p_j, q_j)\}_{j=1}^{r}$ is called a superflow.

To clarify this definition, a superflow is a meromorphic differential of the third kind on the Riemann surface $M$ that describes the physical situation of superfluid $^4$He flowing on a porous surface. The paired poles of the differential correspond to the paired vortices in the fluid.

Remark 2. To verify that a meromorphic differential $\omega$ with paired poles at $\{(p_j, q_j)\}_{j=1}^{r}$ is a superflow, we only have to show that for any canonical homology basis (as described in Definition 3.4) of $M \setminus \{p_j, q_j\}_{j=1}^{r}$, we have
(1) $\operatorname{Re}\left(\int_{\alpha_k} \omega\right) \in \frac{h}{m}\,\mathbb{Z}$, for $k = 1, \dots, g$;
(2) $\operatorname{Re}\left(\int_{\beta_k} \omega\right) \in \frac{h}{m}\,\mathbb{Z}$, for $k = 1, \dots, g$;
(3) $\operatorname{Re}\left(\int_{\gamma_{p_j}} \omega\right) = \frac{h}{m}$, for $j = 1, \dots, r$;
(4) $\operatorname{Re}\left(\int_{\gamma_{q_j}} \omega\right) = -\frac{h}{m}$, for $j = 1, \dots, r - 1$.

Now we want to investigate the appearance of "strings of circulation" in the superfluid. First, we need a formal mathematical definition for this phenomenon:

Definition 3.9. Given a superflow $\omega$ on $M$ with fixed vortex–antivortex pairs $\{(p_j, q_j)\}_{j=1}^{r}$, we say that $\omega$ satisfies the string condition if there exists a set of homotopy classes of paths $L = \{l_j\}_{j=1}^{r}$ from $q_j$ to $p_j$, so that for any closed loop $\gamma$ on $M \setminus \{p_j, q_j\}_{j=1}^{r}$,
\[ \operatorname{Re}\left(\int_{\gamma} \omega\right) = \sum_{j=1}^{r} \#(\gamma, l_j)\,\frac{h}{m} . \]
In this case, we say that the flow $\omega$ produces the strings $L$.

In the following sections, we show that while every string configuration $L$ is produced by some superflow, the converse is not true. In fact, a given superflow will produce strings only if it satisfies a divisibility condition on its periods. Only in the
special case of one vortex–antivortex pair do we have a one-to-one correspondence between strings and the superflows that produce them.

Remark 3. The definition of the string condition that we use does not take into account possible re-pairings of the vortices. It was designed to be used in the work that will follow this paper. If we were only going to pursue this field of study within the scope of this paper, then a correct mathematical theorem should contain an allowance for permutation of the vortices, so that strings arising from different pairings may be equivalent. But, in order to continue with the work that will follow from this paper, we will eventually have to revert to this definition, which distinguishes (to some extent) fixed vortex–antivortex pairs. Thus, changing to a looser definition of the string condition would be shortsighted.

4. The Mathematics of the String Condition

As described in Sec. 2.3, the appearance of strings of circulation in the flow of superfluid $^4$He leads us to the main question to be answered in this section: can these strings exist? If so, do they always exist? If a flow $\omega$ produces a set $L$ of such strings, then we say that the flow satisfies the string condition. Laboratory experiments have focused on the case where there is only one vortex–antivortex pair, and it seems that in this case strings always exist. We shall see here that the existence of strings is contingent upon the number of vortex–antivortex pairs and periodicity conditions of the flow $\omega$.

First, we need to develop a mathematical description of the flow of superfluid helium on a Riemann surface. For the remainder of this paper, $M$ denotes a Riemann surface of genus $g$, the set $\{(p_j, q_j)\}_{j=1}^{r}$ denotes paired points on $M$ and $L = \{l_j\}_{j=1}^{r}$ denotes fixed oriented paths from $q_j$ to $p_j$. Also, $\omega$ denotes a meromorphic differential of the third kind on $M$ with poles of opposite parity at the paired points $\{(p_j, q_j)\}_{j=1}^{r}$. This means that $\operatorname{res}_{p_i} \omega = -\operatorname{res}_{q_i} \omega$. The points $\{(p_j, q_j)\}_{j=1}^{r}$ are the paired vortices of the superfluid denoted by $\omega$.

Claim 4.1. A flow $\omega$ on $M$ satisfies the string condition and produces strings $L = \{l_j\}_{j=1}^{r}$ if and only if for a canonical homology basis of $M \setminus \{p_j, q_j\}_{j=1}^{r}$ we have
(1) $\operatorname{Re}\left(\int_{\alpha_k} \omega\right) = \sum_{j=1}^{r} \#(\alpha_k, l_j)\,\frac{h}{m}$, for $k = 1, \dots, g$;
(2) $\operatorname{Re}\left(\int_{\beta_k} \omega\right) = \sum_{j=1}^{r} \#(\beta_k, l_j)\,\frac{h}{m}$, for $k = 1, \dots, g$;
(3) $\operatorname{Re}\left(\int_{\gamma_{p_j}} \omega\right) = \frac{h}{m}$, for $j = 1, \dots, r$;
(4) $\operatorname{Re}\left(\int_{\gamma_{q_j}} \omega\right) = -\frac{h}{m}$, for $j = 1, \dots, r - 1$.
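Proof of Claim. Let $\eta$ be any closed loop on $M \setminus \{p_j, q_j\}_{j=1}^{r}$. Let the set of closed loops
\[ H = \{\alpha_1, \dots, \alpha_g, \beta_1, \dots, \beta_g, \gamma_{p_1}, \dots, \gamma_{p_r}, \gamma_{q_1}, \dots, \gamma_{q_{r-1}}\} \]
be a canonical homology basis for the punctured Riemann surface $M \setminus \{p_j, q_j\}_{j=1}^{r}$, as defined in Definition 3.4. We can write $\eta$ in terms of the elements of the canonical homology basis $H$. For some integers $\{A_k, B_k\}_{k=1}^{g}$, $\{C_j\}_{j=1}^{r}$ and $\{D_j\}_{j=1}^{r-1}$, we have
\[ \eta = \sum_{k=1}^{g} A_k \alpha_k + \sum_{k=1}^{g} B_k \beta_k + \sum_{j=1}^{r} C_j \gamma_{p_j} + \sum_{j=1}^{r-1} D_j \gamma_{q_j} . \]
By Definition 3.9, $\omega(L)$ produces strings $L$ if
\[ \operatorname{Re}\left(\int_{\eta} \omega(L)\right) = \sum_{j=1}^{r} \#(\eta, l_j)\,\frac{h}{m} . \]
The following calculation shows that this is true:
\begin{align*}
\operatorname{Re}\left(\int_{\eta} \omega(L)\right)
&= \sum_{k=1}^{g} A_k \operatorname{Re}\left(\int_{\alpha_k} \omega(L)\right) + \sum_{k=1}^{g} B_k \operatorname{Re}\left(\int_{\beta_k} \omega(L)\right) \\
&\quad + \sum_{j=1}^{r} C_j \operatorname{Re}\left(\int_{\gamma_{p_j}} \omega(L)\right) + \sum_{j=1}^{r-1} D_j \operatorname{Re}\left(\int_{\gamma_{q_j}} \omega(L)\right) \\
&= \sum_{k=1}^{g} A_k \left(\sum_{i=1}^{r} \#(\alpha_k, l_i)\,\frac{h}{m}\right) + \sum_{k=1}^{g} B_k \left(\sum_{i=1}^{r} \#(\beta_k, l_i)\,\frac{h}{m}\right) \\
&\quad + \sum_{j=1}^{r} C_j \operatorname{Re}\left(2\pi i \operatorname{res}_{p_j} \omega(L)\right) + \sum_{j=1}^{r-1} D_j \operatorname{Re}\left(2\pi i \operatorname{res}_{q_j} \omega(L)\right) \\
&= \sum_{k=1}^{g} \sum_{i=1}^{r} A_k\,\#(\alpha_k, l_i)\,\frac{h}{m} + \sum_{k=1}^{g} \sum_{i=1}^{r} B_k\,\#(\beta_k, l_i)\,\frac{h}{m} + \sum_{j=1}^{r} C_j\,\frac{h}{m} - \sum_{j=1}^{r-1} D_j\,\frac{h}{m} \\
&= \sum_{i=1}^{r} \sum_{k=1}^{g} \#(A_k \alpha_k, l_i)\,\frac{h}{m} + \sum_{i=1}^{r} \sum_{k=1}^{g} \#(B_k \beta_k, l_i)\,\frac{h}{m} \\
&\quad + \sum_{j=1}^{r} C_j\,\#(\gamma_{p_j}, l_j)\,\frac{h}{m} + \sum_{j=1}^{r-1} D_j\,\#(\gamma_{q_j}, l_j)\,\frac{h}{m} \\
&= \sum_{i=1}^{r} \#\left(\sum_{k=1}^{g} A_k \alpha_k + \sum_{k=1}^{g} B_k \beta_k + \sum_{j=1}^{r} C_j \gamma_{p_j} + \sum_{j=1}^{r-1} D_j \gamma_{q_j},\; l_i\right) \frac{h}{m} \\
&= \sum_{i=1}^{r} \#(\eta, l_i)\,\frac{h}{m} .
\end{align*}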
So, if conditions 1–4 are satisfied, then $\omega(L)$ satisfies the string condition and produces strings $L$. And by Definition 3.9, if $\omega(L)$ satisfies the string condition and produces the strings $L$, then conditions 1–4 are satisfied since, by the definition of a canonical homology basis on a punctured Riemann surface, we have for all $j$
\[ \#(\gamma_{p_j}, l_j) = 1 \qquad \text{and} \qquad \#(\gamma_{q_j}, l_j) = -1 . \]

4.1. The first string condition theorem

Now that we have established the definition of a superflow, we will investigate the relationship between strings and superflows that satisfy the string condition. We hope to find that the results obtained here agree with observations in the physics lab. We start with some lemmas that will lead to the main theorem in this section.

Lemma 4.1. There exists a unique superflow $\omega_0$ on $M$ with paired vortices at the points $\{(p_j, q_j)\}_{j=1}^{r}$ so that for a given canonical homology basis $\{\alpha_k, \beta_k\}_{k=1}^{g}$ of $M$, $\omega_0$ satisfies
\[ \operatorname{Re}\left(\int_{\alpha_k} \omega_0\right) = 0 \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\beta_k} \omega_0\right) = 0 . \]

Proof of Lemma. For each pair of vortices $(p_j, q_j)$, $j = 1, \dots, r$, we know from Riemann surface theory [25] that there exists a unique differential of the third kind $\mu_j = \mu_{p_j q_j}$ on $M$, holomorphic on $M \setminus \{p_j, q_j\}$, with simple poles at $p_j$ and $q_j$ so that
\[ \operatorname{res}_{p_j} \mu_j = 1 = -\operatorname{res}_{q_j} \mu_j \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\alpha_k} \mu_j\right) = 0 = \operatorname{Re}\left(\int_{\beta_k} \mu_j\right) . \]
The differential $\sum_{j=1}^{r} \mu_j$ is meromorphic, with poles at the points $\{p_j, q_j\}_{j=1}^{r}$. Then the differential
\[ \omega_0 = \frac{1}{2\pi i}\,\frac{h}{m} \sum_{j=1}^{r} \mu_j \]
is the unique meromorphic 1-form with paired vortices of strength 1 at the points $\{(p_j, q_j)\}_{j=1}^{r}$ and zero real periods. To show that $\omega_0$ is a superflow, we need to verify that it satisfies the conditions in Remark 2.
\[ \operatorname{Re}\left(\int_{\alpha_k} \omega_0\right) = \operatorname{Re}\left(\int_{\alpha_k} \frac{1}{2\pi i}\,\frac{h}{m} \sum_{j=1}^{r} \mu_j\right) = 0 \]
\[ \operatorname{Re}\left(\int_{\beta_k} \omega_0\right) = \operatorname{Re}\left(\int_{\beta_k} \frac{1}{2\pi i}\,\frac{h}{m} \sum_{j=1}^{r} \mu_j\right) = 0 \]
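\begin{align*}
\operatorname{Re}\left(\int_{\gamma_{p_k}} \omega_0\right)
&= \operatorname{Re}\left(\int_{\gamma_{p_k}} \frac{1}{2\pi i}\,\frac{h}{m} \sum_{j=1}^{r} \mu_j\right)
= \operatorname{Re}\left(\frac{1}{2\pi i}\,\frac{h}{m}\, 2\pi i \sum_{j=1}^{r} \operatorname{res}_{p_k} \mu_j\right) \\
&= \frac{h}{m} \sum_{j=1}^{r} \delta_{jk} = \frac{h}{m}
\end{align*}
\begin{align*}
\operatorname{Re}\left(\int_{\gamma_{q_k}} \omega_0\right)
&= \operatorname{Re}\left(\int_{\gamma_{q_k}} \frac{1}{2\pi i}\,\frac{h}{m} \sum_{j=1}^{r} \mu_j\right)
= \operatorname{Re}\left(\frac{1}{2\pi i}\,\frac{h}{m}\, 2\pi i \sum_{j=1}^{r} \operatorname{res}_{q_k} \mu_j\right) \\
&= \frac{h}{m} \sum_{j=1}^{r} (-\delta_{jk}) = -\frac{h}{m} .
\end{align*}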
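So, the meromorphic differential $\omega_0$ satisfies the conditions listed in Remark 2, and is thus a superflow. Now, let $\nu$ be another superflow with vortices at $\{p_j, q_j\}$ and zero real periods. Then for some real constants $c_k$ and $d_k$, $k = 1, \dots, g$, we have
\[ \nu = \omega_0 + \sum_{k=1}^{g} c_k \xi_k + \sum_{k=1}^{g} d_k \Gamma_k , \]
as described in Facts 3.2 and 3.3. The real periods of $\nu$ are
\[ \operatorname{Re}\left(\int_{\alpha_j} \nu\right) = \operatorname{Re}\left(\int_{\alpha_j} \omega_0\right) + \sum_{k=1}^{g} c_k \operatorname{Re}\left(\int_{\alpha_j} \xi_k\right) + \sum_{k=1}^{g} d_k \operatorname{Re}\left(\int_{\alpha_j} \Gamma_k\right) = c_j \]
and
\[ \operatorname{Re}\left(\int_{\beta_j} \nu\right) = \operatorname{Re}\left(\int_{\beta_j} \omega_0\right) + \sum_{k=1}^{g} c_k \operatorname{Re}\left(\int_{\beta_j} \xi_k\right) + \sum_{k=1}^{g} d_k \operatorname{Re}\left(\int_{\beta_j} \Gamma_k\right) = d_j . \]
Thus, we have $c_j = 0$ and $d_j = 0$ for all $j = 1, \dots, g$, and thus $\nu = \omega_0$. Hence, $\omega_0$ is the unique superflow with zero real periods and the prescribed poles.

We use $\omega_0$ as our basic superflow and construct other superflows by adding quantized holomorphic forms to it. Specifically, as we will see in Lemma 4.3, it is easy to use $\omega_0$ to build a superflow that produces any set of strings we want.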
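Lemma 4.2. Let $\{\alpha_k, \beta_k\}_{k=1}^{g}$ be a canonical homology basis of $M$. If $\omega_0$ is the unique superflow with paired vortices at $\{(p_j, q_j)\}_{j=1}^{r}$ constructed in Lemma 4.1, then any superflow $\omega$ is of the form
\[ \omega = \omega_0 + \sum_{k=1}^{g} M_k\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} N_k\,\frac{h}{m}\,\Gamma_k , \tag{3} \]
for unique integers $M_k$ and $N_k$, $k = 1, \dots, g$. Additionally, any meromorphic differential with paired poles at $\{(p_j, q_j)\}_{j=1}^{r}$ that can be written this way is a superflow.

Proof of Lemma. In the proof of the previous lemma we saw that $\omega_0$ is a meromorphic differential with paired vortices of strength 1 at the points $\{(p_j, q_j)\}_{j=1}^{r}$. Thus, any other meromorphic differential with these vortices differs from $\omega_0$ by a holomorphic form. Recall from Fact 3.2 that a basis of holomorphic forms is given by $\{\xi_k\}_{k=1}^{g}$, where
\[ \xi_k = \varphi_k + i \star\varphi_k , \qquad \text{for } k = 1, \dots, g , \]
and $\{\varphi_k\}_{k=1}^{2g}$ is a basis of real harmonic 1-forms. From Fact 3.3, another basis $\{\Gamma_k\}_{k=1}^{g}$ is given by
\[ \Gamma_k = \varphi_{k+g} + i \star\varphi_{k+g} , \qquad \text{for } k = 1, \dots, g . \]
These bases satisfy $\star\xi_k = c_k \Gamma_k$, where the $c_k$ are real constants so that $\star\varphi_k = c_k \varphi_{k+g}$. Thus we can write
\[ \omega = \omega_0 + \sum_{k=1}^{g} (a_k + i b_k)\,\xi_k , \]
for some $a_k, b_k \in \mathbb{R}$. For $\omega$ to be a superflow, it must satisfy the conditions of Remark 2. In particular, we must have
\[ \operatorname{Re}\left(\int_{\alpha_j} \omega\right) \in \frac{h}{m}\,\mathbb{Z} \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\beta_j} \omega\right) \in \frac{h}{m}\,\mathbb{Z} . \]
We have
\begin{align*}
\int_{\alpha_j} \omega
&= \int_{\alpha_j} \omega_0 + \sum_{k=1}^{g} (a_k + i b_k) \int_{\alpha_j} \xi_k \\
&= \int_{\alpha_j} \omega_0 + \sum_{k=1}^{g} \int_{\alpha_j} (a_k + i b_k)(\varphi_k + i \star\varphi_k) \\
&= \int_{\alpha_j} \omega_0 + \sum_{k=1}^{g} \int_{\alpha_j} (a_k \varphi_k + i a_k \star\varphi_k + i b_k \varphi_k - b_k \star\varphi_k) \\
&= \int_{\alpha_j} \omega_0 + \sum_{k=1}^{g} \int_{\alpha_j} (a_k \varphi_k + i a_k c_k \varphi_{k+g} + i b_k \varphi_k - b_k c_k \varphi_{k+g}) .
\end{align*}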
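We are only interested in the real part of this equation.
\begin{align*}
\operatorname{Re}\left(\int_{\alpha_j} \omega\right)
&= \operatorname{Re}\left(\int_{\alpha_j} \omega_0\right) + \sum_{k=1}^{g} \left[ \operatorname{Re}\left(a_k \int_{\alpha_j} \varphi_k\right) + \operatorname{Re}\left(i a_k c_k \int_{\alpha_j} \varphi_{k+g}\right) \right. \\
&\qquad \left. + \operatorname{Re}\left(i b_k \int_{\alpha_j} \varphi_k\right) - \operatorname{Re}\left(b_k c_k \int_{\alpha_j} \varphi_{k+g}\right) \right] = a_j .
\end{align*}
But, $\omega$ satisfies quantization of circulation. Thus, we have $a_j \in \frac{h}{m}\,\mathbb{Z}$. We let $a_j = M_j\,\frac{h}{m}$. Similarly for the circulation around the $\beta$ loops, we find that
\[ \int_{\beta_j} \omega = \int_{\beta_j} \omega_0 + \sum_{k=1}^{g} \left[ a_k \int_{\beta_j} \varphi_k + i a_k c_k \int_{\beta_j} \varphi_{k+g} + i b_k \int_{\beta_j} \varphi_k - b_k c_k \int_{\beta_j} \varphi_{k+g} \right] . \]
Thus we have
\[ \operatorname{Re}\left(\int_{\beta_j} \omega\right) = -b_j c_j . \]
Once again, quantization of circulation implies that $-b_j c_j \in \frac{h}{m}\,\mathbb{Z}$. Let $b_j = -\frac{1}{c_j}\,N_j\,\frac{h}{m}$ for some integer $N_j$. We now have
\[ \omega = \omega_0 + \sum_{k=1}^{g} \left( M_k - i\,\frac{1}{c_k}\,N_k \right) \frac{h}{m}\,\xi_k . \]
Using the facts
\[ \star\varphi_k = c_k \varphi_{k+g} , \qquad \varphi_k = -c_k \star\varphi_{k+g} , \]
we see that
\begin{align*}
-i\,\frac{1}{c_k}\,\xi_k
&= -i\,\frac{1}{c_k}\,(\varphi_k + i \star\varphi_k) \\
&= -\frac{1}{c_k}\,(i \varphi_k - \star\varphi_k) \\
&= -\frac{1}{c_k}\,\left[ i(-c_k \star\varphi_{k+g}) - c_k \varphi_{k+g} \right] \\
&= \varphi_{k+g} + i \star\varphi_{k+g} = \Gamma_k .
\end{align*}
Thus, for any superflow $\omega$, we have found unique integers $M_k$ and $N_k$, $k = 1, \dots, g$ so that
\[ \omega = \omega_0 + \sum_{k=1}^{g} M_k\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} N_k\,\frac{h}{m}\,\Gamma_k . \]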
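Additionally, if we are given a meromorphic differential $\omega$ in this form, then $\omega$ is a superflow if it satisfies the conditions of Remark 2. From the discussion above, we have
\[ \operatorname{Re}\left(\int_{\alpha_j} \omega\right) = M_j\,\frac{h}{m} \in \frac{h}{m}\,\mathbb{Z} , \qquad \operatorname{Re}\left(\int_{\beta_j} \omega\right) = N_j\,\frac{h}{m} \in \frac{h}{m}\,\mathbb{Z} . \]
Since the forms $\xi_k$ and $\Gamma_k$ are holomorphic, we have
\[ \operatorname{Re}\left(\int_{\gamma_{p_j}} \omega\right) = \operatorname{Re}\left(\int_{\gamma_{p_j}} \omega_0\right) = \frac{h}{m} \]
and
\[ \operatorname{Re}\left(\int_{\gamma_{q_j}} \omega\right) = \operatorname{Re}\left(\int_{\gamma_{q_j}} \omega_0\right) = -\frac{h}{m} . \]
Thus, any meromorphic differential that can be written in the form of Eq. (3) is a superflow.

Lemma 4.3. If there exist paths $L = \{l_j\}_{j=1}^{r}$ from $q_j$ to $p_j$ and a canonical homology basis $\{\alpha_k, \beta_k\}_{k=1}^{g}$ so that
\[ \omega = \omega_0 + \sum_{k=1}^{g} \sum_{j=1}^{r} \#(\alpha_k, l_j)\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} \sum_{j=1}^{r} \#(\beta_k, l_j)\,\frac{h}{m}\,\Gamma_k , \]
then $\omega$ satisfies the string condition and produces the strings $L$.

Proof of Lemma. In Lemma 4.2, we saw that any superflow $\omega$ has the form
\[ \omega = \omega_0 + \sum_{k=1}^{g} M_k\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} N_k\,\frac{h}{m}\,\Gamma_k \]
for unique integers $M_k$ and $N_k$, $k = 1, \dots, g$. Here we claim that if $\omega$ produces strings $L = \{l_j\}_{j=1}^{r}$, then
\[ M_k = \sum_{j=1}^{r} \#(\alpha_k, l_j) \qquad \text{and} \qquad N_k = \sum_{j=1}^{r} \#(\beta_k, l_j) . \]
In the proof of Lemma 4.2 we saw that
\[ \operatorname{Re}\left(\int_{\alpha_k} \omega\right) = M_k\,\frac{h}{m} \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\beta_k} \omega\right) = N_k\,\frac{h}{m} . \]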
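Thus, we have
\[ \operatorname{Re}\left(\int_{\alpha_k} \omega\right) = \sum_{j=1}^{r} \#(\alpha_k, l_j)\,\frac{h}{m} \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\beta_k} \omega\right) = \sum_{j=1}^{r} \#(\beta_k, l_j)\,\frac{h}{m} . \]
Since $\omega$ is a superflow, we have
\[ \operatorname{Re}\left(\int_{\gamma_{p_j}} \omega\right) = \frac{h}{m} \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\gamma_{q_j}} \omega\right) = -\frac{h}{m} . \]
Thus, $\omega$ satisfies the conditions in Claim 4.1, and hence $\omega$ produces the strings $L = \{l_j\}_{j=1}^{r}$.

Now that we have established these preliminary lemmas, we turn to our main result for this section.

Theorem 4.1. Given a fixed set of paths $L = \{l_j\}_{j=1}^{r}$ from $q_j$ to $p_j$, there is a unique superflow $\omega(L)$ so that $\omega(L)$ produces the strings $L$.

Proof. This theorem is a direct consequence of Lemma 4.3. Given a set of strings $L = \{l_j\}_{j=1}^{r}$ and a canonical homology basis $\{\alpha_k, \beta_k\}_{k=1}^{g}$ on $M$, we define a superflow $\omega(L)$ by
\[ \omega(L) = \omega_0 + \sum_{k=1}^{g} \sum_{j=1}^{r} \#(\alpha_k, l_j)\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} \sum_{j=1}^{r} \#(\beta_k, l_j)\,\frac{h}{m}\,\Gamma_k . \]
This superflow $\omega(L)$ produces strings $L$. The only question left to answer is the uniqueness of $\omega(L)$. To show that $\omega(L)$ is unique, let $\nu$ be another superflow with paired vortices at $\{(p_j, q_j)\}$. Then $\nu$ and $\omega(L)$ differ only by a holomorphic form. Thus, by Lemma 4.2 there exist $\{c_k, d_k\}_{k=1}^{g} \subset \mathbb{Z}$ so that
\[ \nu = \omega(L) + \sum_{k=1}^{g} c_k\,\frac{h}{m}\,\xi_k + \sum_{k=1}^{g} d_k\,\frac{h}{m}\,\Gamma_k . \]
Then we have
\begin{align*}
\operatorname{Re}\left(\int_{\alpha_i} \nu\right)
&= \operatorname{Re}\left(\int_{\alpha_i} \omega(L)\right) + \sum_{k=1}^{g} c_k\,\frac{h}{m} \operatorname{Re}\left(\int_{\alpha_i} \xi_k\right) + \sum_{k=1}^{g} d_k\,\frac{h}{m} \operatorname{Re}\left(\int_{\alpha_i} \Gamma_k\right) \\
&= \sum_{j=1}^{r} \#(\alpha_i, l_j)\,\frac{h}{m} + c_i\,\frac{h}{m} , \\
\operatorname{Re}\left(\int_{\beta_i} \nu\right)
&= \operatorname{Re}\left(\int_{\beta_i} \omega(L)\right) + \sum_{k=1}^{g} c_k\,\frac{h}{m} \operatorname{Re}\left(\int_{\beta_i} \xi_k\right) + \sum_{k=1}^{g} d_k\,\frac{h}{m} \operatorname{Re}\left(\int_{\beta_i} \Gamma_k\right) \\
&= \sum_{j=1}^{r} \#(\beta_i, l_j)\,\frac{h}{m} + d_i\,\frac{h}{m} .
\end{align*}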
So, $\nu$ will satisfy the string condition and produce strings $L = \{l_j\}_{j=1}^{r}$ if and only if $c_i$ and $d_i$ are zero for all $i = 1, \dots, g$, which means that $\nu = \omega(L)$. Thus, the superflow $\omega(L)$ that satisfies the string condition and produces the strings $L$ is unique.

When we started studying this situation, we had expected to find a one-to-one correspondence between superflows $\omega$ and sets of strings $L$ so that $\omega$ produces the strings $L$. Theorem 4.1 tells us that for every possible string configuration $L$ on $M$, there exists a unique superflow $\omega(L)$ that produces those particular strings. Interestingly, the converse of this theorem is not always true. In the following section we find the exact conditions required for a superflow to satisfy the string condition.

4.2. The second string condition theorem

In this section, we investigate the converse of Theorem 4.1. We saw in Sec. 4.1 that every set of strings is produced by a superflow. Now we look at the conditions necessary for a superflow to satisfy the string condition.

Theorem 4.2. Let $\{\alpha_k, \beta_k\}_{k=1}^{g}$ be a canonical homology basis of $M$ that does not contain any of the points $\{(p_j, q_j)\}_{j=1}^{r}$. Let $\omega$ be an arbitrary superflow on $M$ that produces strings $L(\omega) = \{l_j(\omega)\}_{j=1}^{r}$ from $q_j$ to $p_j$. Then the real periods of $\omega$ satisfy the following:
\[ \operatorname{Re}\left(\int_{\alpha_i} \omega\right) \in r\,\frac{h}{m}\,\mathbb{Z} , \qquad \operatorname{Re}\left(\int_{\beta_i} \omega\right) \in r\,\frac{h}{m}\,\mathbb{Z} \]
for all $i = 1, \dots, g$, where $r$ is the number of pairs of vortices on $M$. Additionally, the strings $L(\omega)$ produced by the superflow $\omega$ are unique up to homotopy.

Remark 4. Theorem 4.2 states that strings can appear in the superfluid only when the real periods of $\omega$ are divisible by the number of vortex pairs. In the proof of the theorem, we will actually construct these strings.

Proof. Because $\omega$ is a superflow, we know that
\[ \operatorname{Re}\left(\int_{\alpha_k} \omega\right) = M_k\,\frac{h}{m} \qquad \text{and} \qquad \operatorname{Re}\left(\int_{\beta_k} \omega\right) = N_k\,\frac{h}{m} \tag{4} \]
for some integers $M_k$ and $N_k$, $k = 1, \dots, g$. Because the superflow $\omega$ produces the strings $\{l_j(\omega)\}_{j=1}^{r}$, we know that
\[ \sum_{j=1}^{r} \#(\alpha_k, l_j) = M_k \qquad \text{and} \qquad \sum_{j=1}^{r} \#(\beta_k, l_j) = N_k \]
for $M_k$ and $N_k$ as in Eq. (4).
Choose a fixed set of oriented paths $\hat{l}_j$ from $q_j$ to $p_j$ so that for $i = 1, \dots, g$ and $j = 1, \dots, r$ we have
\[ \#(\alpha_i, \hat{l}_j) = 0 \qquad \text{and} \qquad \#(\beta_i, \hat{l}_j) = 0 . \]
Then the strings $l_j$, $j = 1, \dots, r$, have the form
\[ l_j = \hat{l}_j + \sum_{k=1}^{g} m_k \alpha_k + \sum_{k=1}^{g} n_k \beta_k , \]
for some integers $m_k$, $n_k$, $k = 1, \dots, g$. We then have
\[ \#(\alpha_k, l_j) = \#(\alpha_k, \hat{l}_j) + \sum_{i=1}^{g} m_i\,\#(\alpha_k, \alpha_i) + \sum_{i=1}^{g} n_i\,\#(\alpha_k, \beta_i) = \sum_{i=1}^{g} n_i \delta_{ki} = n_k \]
and
\[ \#(\beta_k, l_j) = \#(\beta_k, \hat{l}_j) + \sum_{i=1}^{g} m_i\,\#(\beta_k, \alpha_i) + \sum_{i=1}^{g} n_i\,\#(\beta_k, \beta_i) = \sum_{i=1}^{g} m_i \delta_{ki} = m_k . \]
Thus,
\[ M_k = \sum_{j=1}^{r} \#(\alpha_k, l_j) = \sum_{j=1}^{r} n_k = r n_k \qquad \text{and} \qquad N_k = \sum_{j=1}^{r} \#(\beta_k, l_j) = \sum_{j=1}^{r} m_k = r m_k , \]
so $M_k$ and $N_k$ are integer multiples of $r$, for all $k = 1, \dots, g$. From the argument above, we see that the strings produced by the superflow $\omega$ with periods that satisfy the conditions of the theorem are uniquely specified by the formula
\[ l_j(\omega) = \hat{l}_j + \sum_{k=1}^{g} \frac{1}{r}\,N_k \alpha_k + \sum_{k=1}^{g} \frac{1}{r}\,M_k \beta_k , \]
where $M_k$ and $N_k$ are as specified in Eq. (4).

Remark 5. Notice that when constructing strings, we can only add whole loops (not partial loops). This is why we need $M_k$ and $N_k$ to be divisible by $r$. When building a flow, we can add fractions of flows (as long as the result remains quantized), which is why we do not have a similar integer restriction in Theorem 4.1.
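The divisibility test of Theorem 4.2 and the string formula from its proof can be sketched in a few lines. This is an illustrative sketch only, not the author's construction: the function name and the list-based encoding of the periods are assumptions, with `M[k]` and `N[k]` standing for the integers $M_k$, $N_k$ of Eq. (4) and `r` for the number of vortex pairs.

```python
def strings_from_periods(M, N, r):
    """Return the loop coefficients of the strings, or None if no strings exist.

    M[k], N[k] are the integer periods of Eq. (4) in units of h/m, and r is
    the number of vortex pairs. Following the formula in the proof, each
    string is the reference path plus (N[k] / r) copies of alpha_k and
    (M[k] / r) copies of beta_k.
    """
    if r < 1:
        raise ValueError("r must be a positive number of vortex pairs")
    if len(M) != len(N):
        raise ValueError("M and N must both have one entry per handle")
    # Only whole loops can be added to a string (Remark 5), so every
    # period must be divisible by r.
    if any(m % r != 0 for m in M) or any(n % r != 0 for n in N):
        return None
    alpha_coefficients = [n // r for n in N]
    beta_coefficients = [m // r for m in M]
    return alpha_coefficients, beta_coefficients


# Genus 2, two vortex pairs, all periods even: strings exist.
print(strings_from_periods([2, -4], [0, 6], 2))
# Same surface, but one odd period: the divisibility condition fails.
print(strings_from_periods([1, 2], [0, 0], 2))
```

With `r = 1` the check never fails, which matches the one-pair case discussed in Corollary 4.1.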
Remark 6. Theorem 4.2 makes no distinction between different pairings of the vortices. That is, if the vortex pairs are rearranged so that a vortex is paired with any antivortex, then the existence or non-existence of the strings will not be affected by this rearrangement. The existence of the strings depends solely on the integrality of the real periods of the superflow $\omega$.

What does this model predict in the case of one vortex pair? The following corollary is an immediate consequence of Theorems 4.1 and 4.2.

Corollary 4.1. In the case of a superflow on a Riemann surface with one pair of vortices, there is a one-to-one correspondence between strings and superflows that satisfy the string condition.

Proof. Notice that for one pair of vortices, the divisibility condition in Theorem 4.2 is trivial, since $r = 1$. Every superflow then satisfies the divisibility condition, so every superflow produces a string. By Theorem 4.1, every possible string comes from a superflow. Therefore, there is a one-to-one correspondence between superflows and strings.

5. Agreement with Physical Results

In summary, what we have shown in this paper is that for every string configuration between paired vortices there exists a superflow, but that not every superflow produces strings. A superflow will produce strings if it has only one vortex–antivortex pair, or if it satisfies the periodicity condition specified in Theorem 4.2. Note that the results obtained here apply to any superfluid flow, not just $^4$He. We would just need to change the value of $m$ to be the mass of one atom of the superfluid to apply these equations to any other ideal fluid.

How do the results obtained in this paper compare to the experimental results? As far as we know, experiments have only been performed with one vortex–antivortex pair [28]. In that case, there appears to be a one-to-one correspondence between superflows and string configurations, which agrees with the results obtained here. However, our result goes further to predict that this one-to-one correspondence does not hold for all superflows with more than one vortex–antivortex pair. While this result is somewhat unexpected, it is not unbelievable due to the quantum nature of superfluid flows [29].

References
[1] J. Machta and R. A. Guyer, Phys. Rev. Lett. 60 (1988), 2054; J. Machta and R. A. Guyer, J. Low Temp. Phys. 74 (1989), 231.
[2] T. Minoguchi and Y. Nagaoka, Progr. Theoret. Phys. 80 (1988), 397.
[3] F. London, Phys. Rev. 54 (1938), 947.
[4] K. R. Atkins, Liquid Helium, Cambridge University Press, Cambridge, 1959.
[5] D. F. Brewer, The Physics of Liquid and Solid Helium Part II, eds. K. H. Benneman and J. B. Ketterson, John Wiley and Sons, New York, 1978.
[6] J. Reppy, J. Low Temp. Phys. 87 (1992), 205.
[7] D. R. Tilley and J. Tilley, Superfluidity and Superconductivity, Adam Hilger, New York, 1990.
[8] P. L. Kapitza, Nature 141 (1938), 74.
[9] J. F. Allen and A. D. Misener, Nature 141 (1938), 75.
[10] L. Tisza, J. Phys. Radium I (1940), 165–350.
[11] L. D. Landau, J. Phys. Moscow 5 (1941), 71.
[12] E. Andronikashvili, J. Phys. X (1946), 201.
[13] F. London, Superfluids, Vols. I and II, John Wiley and Sons, Inc., New York, 1950.
[14] R. P. Feynman, in Progress in Low Temperature Physics, Vol. I, ed. C. J. Gorter, Interscience Publishers Inc., New York, 1955, Ch. II.
[15] L. Onsager, Nuovo Cimento Suppl. 6 (1949), 249.
[16] F. Gallet and G. A. Williams, Phys. Rev. B39 (1989), 4673.
[17] J. M. Kosterlitz and D. J. Thouless, J. Phys. C: Solid State Phys. 6 (1973), 1181.
[18] J. Marsden, Lectures on Geometric Methods in Mathematical Physics, Society for Industrial and Applied Mathematics, Philadelphia, 1981.
[19] Y. Nambu, Phys. Lett. 92B (1980), 327; H. Kuratsuji, Phys. Rev. Lett. 68 (1992), 1746; J. Leinaas and J. Myrheim, Phys. Rev. B37 (1988), 37; R. Chiao, A. Hansen and A. Moulthrop, Phys. Rev. Lett. 54 (1985), 1339; G. A. Goldin, R. Menikoff and D. H. Sharp, J. Math. Phys. 28 (1987), 744.
[20] M. Rasetti and T. Regge, Physica 80A (1975), 217; M. Rasetti and T. Regge, Quantum Vortices in Highlights of Condensed Matter Theory, ed. F. Bassani, Elsevier Science Pub. Co., 1985, pp. 748–766.
[21] J. Marsden and A. Weinstein, Physica D7 (1983), 305.
[22] G. A. Goldin, R. Menikoff and D. H. Sharp, Phys. Rev. Lett. 58 (1987), 2162; G. A. Goldin, Acta Physica Polonica B27 (1996), 2341.
[23] V. Penna and M. Spera, J. Geom. Phys. 27 (1998), 99.
[24] Polya and Latta, Complex Variables, John Wiley and Sons, New York, 1994.
[25] H. Farkas and I. Kra, Riemann Surfaces, Springer-Verlag, 1992.
[26] J. H. Silverman, The Arithmetic of Algebraic Curves, Springer-Verlag, 1986.
[27] G. Springer, Introduction to Riemann Surfaces, Chelsea, 1957.
[28] J. Machta, Personal communication.
[29] R. A. Guyer, Personal communication.
December 12, 2003 15:2 WSPC/148-RMP
00177
Reviews in Mathematical Physics Vol. 15, No. 9 (2003) 949–993 c World Scientific Publishing Company
SINGLE SCALE ANALYSIS OF MANY FERMION SYSTEMS PART 1: INSULATORS
JOEL FELDMAN∗
Department of Mathematics, University of British Columbia, Vancouver, B.C., Canada V6T 1Z2
[email protected]
http://www.math.ubc.ca/~feldman/

HORST KNÖRRER† and EUGENE TRUBOWITZ‡
Mathematik, ETH-Zentrum, CH-8092 Zürich, Switzerland
†[email protected]
‡[email protected]
†http://www.math.ethz.ch/~knoerrer/
Received 22 April 2003 Revised 8 August 2003 We construct, using fermionic functional integrals, thermodynamic Green’s functions for a weakly coupled fermion gas whose Fermi energy lies in a gap. Estimates on the Green’s functions are obtained that are characteristic of the size of the gap. This prepares the way for the analysis of single scale renormalization group maps for a system of fermions at temperature zero without a gap. Keywords: Fermi liquid; renormalization; fermionic functional integral; insulator.
Contents
I. Introduction to Part 1
II. Norms
III. Covariances and the Renormalization Group Map
IV. Bounds for Covariances
    Integral bounds
    Contraction bounds
V. Insulators
Appendices
    A. Calculations in the Norm Domain
Notation
References
∗ Research supported in part by the Natural Sciences and Engineering Research Council of Canada and the Forschungsinstitut für Mathematik, ETH Zürich.
J. Feldman, H. Kn¨ orrer & E. Trubowitz
I. Introduction to Part 1

Consider a gas of fermions with prescribed, strictly positive, density, together with a crystal lattice of magnetic ions in $d$ space dimensions. The fermions interact with each other through a two-body potential. The lattice provides periodic scalar and vector background potentials. As well, the ions can oscillate, generating phonons, and then the fermions interact with the phonons. To start, turn off the fermion–fermion and fermion–phonon interactions. Then we have a gas of independent fermions, each with Hamiltonian
\[ H_0 = \frac{1}{2m}\bigl(i\nabla + \mathbf{A}(\mathbf{x})\bigr)^2 + U(\mathbf{x}) . \]
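Assume that the vector and scalar potentials $\mathbf{A}$, $U$ are periodic with respect to some lattice $\Gamma$ in $\mathbb{R}^d$. Note that it is the magnetic potential, and not just the magnetic field, that is assumed to be periodic. This forces the magnetic field to have mean zero. Here, bold face characters are $d$-component vectors. Because the Hamiltonian commutes with lattice translations, it is possible to simultaneously diagonalize the Hamiltonian and the generators of lattice translations. Call the eigenvalues and eigenvectors $\varepsilon_\nu(\mathbf{k})$ and $\phi_{\nu,\mathbf{k}}(\mathbf{x})$ respectively. They obey
\[ H_0\,\phi_{\nu,\mathbf{k}}(\mathbf{x}) = \varepsilon_\nu(\mathbf{k})\,\phi_{\nu,\mathbf{k}}(\mathbf{x}), \qquad \phi_{\nu,\mathbf{k}}(\mathbf{x}+\gamma) = e^{i\langle \mathbf{k},\gamma\rangle}\phi_{\nu,\mathbf{k}}(\mathbf{x}) \quad \forall \gamma\in\Gamma . \tag{I.1} \]
The crystal momentum $\mathbf{k}$ runs over $\mathbb{R}^2/\Gamma^{\#}$ where $\Gamma^{\#} = \{\mathbf{b}\in\mathbb{R}^2 \mid \langle \mathbf{b},\gamma\rangle \in 2\pi\mathbb{Z} \text{ for all } \gamma\in\Gamma\}$ is the dual lattice to $\Gamma$. The band index $\nu\in\mathbb{N}$ just labels the eigenvalues for boundary condition $\mathbf{k}$ in increasing order. When $\mathbf{A} = U = 0$, $\varepsilon_\nu(\mathbf{k}) = \frac{1}{2m}(\mathbf{k}-\mathbf{b}_{\nu,\mathbf{k}})^2$ for some $\mathbf{b}_{\nu,\mathbf{k}}\in\Gamma^{\#}$.

In the grand canonical ensemble, the Hamiltonian $H$ is replaced by $H - \mu N$ where $N$ is the number operator and the chemical potential $\mu$ is used to control the density of the gas. At very low temperature, which is the physically interesting domain, only those pairs $\nu,\mathbf{k}$ for which $\varepsilon_\nu(\mathbf{k})\approx\mu$ are important. To keep things as simple as possible, we assume that $\varepsilon_\nu(\mathbf{k})\approx\mu$ only for one value $\nu_0$ of $\nu$ and we fix an ultraviolet cutoff so that we consider only those crystal momenta in a region $B$ for which $|\varepsilon_{\nu_0}(\mathbf{k})-\mu|$ is smaller than some fixed small constant. We denote $E(\mathbf{k}) = \varepsilon_{\nu_0}(\mathbf{k})-\mu$.

When the fermion–fermion and fermion–phonon interactions are turned on, the models at temperature zero are characterized by the Euclidean Green's functions, formally defined by
\[ G_{2n}(p_1,\dots,q_n)\,(2\pi)^{d+1}\delta\bigl(\Sigma p_i - \Sigma q_i\bigr) = \Bigl\langle \prod_{i=1}^n \psi_{p_i}\bar\psi_{q_i} \Bigr\rangle = \frac{\int \bigl(\prod_{i=1}^n \psi_{p_i}\bar\psi_{q_i}\bigr)\, e^{\mathcal{A}(\psi,\bar\psi)} \prod_{k,\sigma} d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}}{\int e^{\mathcal{A}(\psi,\bar\psi)} \prod_{k,\sigma} d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}} . \tag{I.2} \]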
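The action
\[ \mathcal{A}(\psi,\bar\psi) = -\int \frac{d^{d+1}k}{(2\pi)^{d+1}}\,\bigl(ik_0 - E(\mathbf{k})\bigr)\,\bar\psi_k\psi_k + \mathcal{V}(\psi,\bar\psi) . \tag{I.3} \]
The interaction $\mathcal{V}$ will be specified shortly. We prefer to split $\mathcal{A} = \mathcal{Q} + \mathcal{V}$ where $\mathcal{Q} = -\int \frac{d^{d+1}k}{(2\pi)^{d+1}}\,(ik_0 - E(\mathbf{k}))\,\bar\psi_k\psi_k$ and write
\[ \langle f(\psi,\bar\psi)\rangle = \frac{\int f(\psi,\bar\psi)\, e^{\mathcal{A}(\psi,\bar\psi)} \prod_{k,\sigma} d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}}{\int e^{\mathcal{A}(\psi,\bar\psi)} \prod_{k,\sigma} d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}} = \frac{\int f(\psi,\bar\psi)\, e^{\mathcal{V}(\psi,\bar\psi)}\, d\mu_C(\psi,\bar\psi)}{\int e^{\mathcal{V}(\psi,\bar\psi)}\, d\mu_C(\psi,\bar\psi)} \]
where $d\mu_C$ is the Grassmann Gaussian "measure" with covariance
\[ C(k) = \frac{1}{ik_0 - E(\mathbf{k})} . \]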
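We now take some time to explain (I.2). The fermion fields are vectors
\[ \psi_k = \begin{bmatrix} \psi_{k,\uparrow} \\ \psi_{k,\downarrow} \end{bmatrix}, \qquad \bar\psi_k = \begin{bmatrix} \bar\psi_{k,\uparrow} & \bar\psi_{k,\downarrow} \end{bmatrix} \]
whose components $\psi_{k,\sigma}$, $\bar\psi_{k,\sigma}$, $k=(k_0,\mathbf{k})\in\mathbb{R}\times B$, $\sigma\in\{\uparrow,\downarrow\}$, are generators of an infinite dimensional Grassmann algebra over $\mathbb{C}$. That is, the fields anticommute with each other:
\[ \overset{(-)}{\psi}_{k,\sigma}\,\overset{(-)}{\psi}_{p,\tau} = -\,\overset{(-)}{\psi}_{p,\tau}\,\overset{(-)}{\psi}_{k,\sigma} . \]
We have deliberately chosen $\bar\psi$ to be a row vector and $\psi$ to be a column vector so that
\[ \bar\psi_k\psi_p = \bar\psi_{k,\uparrow}\psi_{p,\uparrow} + \bar\psi_{k,\downarrow}\psi_{p,\downarrow}, \qquad \psi_k\bar\psi_p = \begin{bmatrix} \psi_{k,\uparrow}\bar\psi_{p,\uparrow} & \psi_{k,\uparrow}\bar\psi_{p,\downarrow} \\ \psi_{k,\downarrow}\bar\psi_{p,\uparrow} & \psi_{k,\downarrow}\bar\psi_{p,\downarrow} \end{bmatrix} . \]
In the argument $k=(k_0,\mathbf{k})$, the last $d$ components $\mathbf{k}$ are to be thought of as a crystal momentum and the first component $k_0$ as the dual variable to a temperature or imaginary time. Hence the $\sqrt{-1}$ in $ik_0 - E(\mathbf{k})$. Our ultraviolet cutoff restricts $\mathbf{k}$ to $B$. In the full model, $\mathbf{k}$ is replaced by $(\nu,\mathbf{k})$ with $\nu$ summed over $\mathbb{N}$ and $\mathbf{k}$ integrated over $\mathbb{R}^d/\Gamma^{\#}$. On the other hand, the ultraviolet cutoff does not restrict $k_0$ at all. It still runs over $\mathbb{R}$. So we could equally well express the model in terms of a Hamiltonian acting on a Fock space. We find the functional integral notation more efficient, so we use it.

The relationship between the position space field $\psi_\sigma(x_0,\mathbf{x})$, with $(x_0,\mathbf{x})$ running over (imaginary) time × space, and the momentum space field $\psi_{k,\sigma}$ is really given, in our single band approximation, by
\[ \psi_\sigma(x_0,\mathbf{x}) = \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\; e^{-ik_0x_0}\,\phi_{\nu_0,\mathbf{k}}(\mathbf{x})\,\psi_{k,\sigma} . \]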
We find it convenient to use a conventional Fourier transform, so we work in a "pseudo" space–time and instead define
\[ \psi_\sigma(x_0,\mathbf{x}) = \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\; e^{\imath\langle k,x\rangle_-}\,\psi_{k,\sigma}, \qquad \bar\psi_\sigma(x_0,\mathbf{x}) = \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\; e^{-\imath\langle k,x\rangle_-}\,\bar\psi_{k,\sigma} \]
where $\langle k,x\rangle_- = -k_0x_0 + \mathbf{k}\cdot\mathbf{x}$ for $k=(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d$. Under this convention, the covariance in position space is
\[ C(x,x') = \int \psi(x)\bar\psi(x')\, d\mu_C(\psi,\bar\psi) = \delta_{\sigma,\sigma'} \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\; e^{\imath\langle k,x-x'\rangle_-}\, C(k) . \]
Under suitable conditions on $\phi_{\nu_0,\mathbf{k}}(\mathbf{x})$, it is easy to go from the pseudo space–time $\psi(x)$ to the real one. For a simple spin independent two-body fermion–fermion interaction, with no phonon interaction,
\[ \mathcal{V} = -\frac{1}{2}\sum_{\sigma,\tau\in\{\uparrow,\downarrow\}} \int dt\, d\mathbf{x}\, d\mathbf{y}\; v(\mathbf{x}-\mathbf{y})\,\bar\psi_\sigma(t,\mathbf{x})\psi_\sigma(t,\mathbf{x})\,\bar\psi_\tau(t,\mathbf{y})\psi_\tau(t,\mathbf{y}) . \]
The general form of the interaction is
\[ \mathcal{V}(\psi,\bar\psi) = \int_{(\mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\})^4} V_0(x_1,x_2,x_3,x_4)\,\bar\psi(x_1)\psi(x_2)\bar\psi(x_3)\psi(x_4)\; dx_1\,dx_2\,dx_3\,dx_4 \]
where, for $x=(x_0,\mathbf{x},\sigma)$, we write $\psi(x) = \psi_\sigma(x_0,\mathbf{x})$ and $\bar\psi(x) = \bar\psi_\sigma(x_0,\mathbf{x})$. The translation invariant function $V_0(x_1,x_2,x_3,x_4)$ can implement both the fermion–fermion and fermion–phonon interactions.

This series of four papers provides part of the construction of an interacting Fermi liquid at temperature zero$^{a}$ in $d=2$ space dimensions.$^{b}$ Before we give the description of the content of these four papers, we outline the main results of the full construction. For the detailed hypotheses and results, see [5]. The main assumptions concerning the interaction are contained in
Hypothesis I.1. The interaction is weak and short range. That is, $V_0$ is sufficiently near the origin in $\mathcal{V}$, which is a Banach space of fairly short range, spin independent, translation invariant functions $V_0(x_1,x_2,x_3,x_4)$. See [5, Theorem I.4] for $\mathcal{V}$'s precise norm. For some results, we also assume that $V_0$ is "$k_0$-reversal real"
\[ V_0(Rx_1,Rx_2,Rx_3,Rx_4) = V_0(x_1,x_2,x_3,x_4) \tag{I.4} \]
where $R(x_0,\mathbf{x},\sigma) = (x_0,-\mathbf{x},\sigma)$ and "bar/unbar exchange invariant"
\[ V_0(-x_2,-x_1,-x_4,-x_3) = V_0(x_1,x_2,x_3,x_4) \tag{I.5} \]
where $-(x_0,\mathbf{x},\sigma) = (-x_0,-\mathbf{x},\sigma)$. If $V_0$ corresponds to a two-body interaction $v(x_1-x_3)$ with a real-valued Fourier transform, then $V_0$ obeys (I.4) and (I.5).

a For results at strictly positive temperature see [1–3].
b For $d=1$, the corresponding system is a Luttinger liquid. See [4] and the references therein.

We prove that perturbation expansions for various objects converge. These objects depend on both $E(\mathbf{k})$ and $V_0$ and are not smooth in $V_0$ when $E(\mathbf{k})$ is held fixed. However, we can recover smoothness in $V_0$ by a change of variables. To do so, we split $E(\mathbf{k}) = e(\mathbf{k}) - \delta e(V_0,\mathbf{k})$ into two parts and choose $\delta e(V_0,\mathbf{k})$ to satisfy an implicit renormalization condition. This is called renormalization of the dispersion relation. Define the proper self energy $\Sigma(p)$ for the action $\mathcal{A}$ by the equation
\[ \bigl(ip_0 - e(\mathbf{p}) - \Sigma(p)\bigr)^{-1}(2\pi)^{d+1}\delta(p-q) = \frac{\int \psi_p\bar\psi_q\, e^{\mathcal{A}(\psi,\bar\psi)} \prod d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}}{\int e^{\mathcal{A}(\psi,\bar\psi)} \prod d\psi_{k,\sigma}\, d\bar\psi_{k,\sigma}} . \]
The counterterm $\delta e(V_0,\mathbf{k})$ is chosen so that $\Sigma(0,\mathbf{p})$ vanishes on the Fermi surface $F = \{\mathbf{p} \mid e(\mathbf{p}) = 0\}$. We take $e(\mathbf{k})$ and $V_0$, rather than the more natural $E(\mathbf{k})$ and $V_0$, as input data. The counterterm $\delta e$ will be an output of our main theorem. It will lie in a suitable Banach space $\mathcal{E}$. While the problem of inverting the map $e \mapsto E = e - \delta e$ is reasonably well understood on a perturbative level [6], our estimates are not yet good enough to do so nonperturbatively. Our main hypotheses are imposed on $e(\mathbf{k})$.

Hypothesis I.2. The dispersion relation $e(\mathbf{k})$ is a real-valued, sufficiently smooth, function. We further assume that
(a) the Fermi curve $F = \{\mathbf{k}\in\mathbb{R}^2 \mid e(\mathbf{k}) = 0\}$ is a simple closed, connected, convex curve with nowhere vanishing curvature.
(b) $\nabla e(\mathbf{k})$ does not vanish on $F$.
(c) For each $\mathbf{q}\in\mathbb{R}^2$, $F$ and $-F+\mathbf{q}$ have low degree of tangency. ($F$ is "strongly asymmetric".) Here $-F+\mathbf{q} = \{-\mathbf{k}+\mathbf{q} \mid \mathbf{k}\in F\}$.
Again, for the details, see [5, Hypothesis I.12].

It is the strong asymmetry condition, Hypothesis I.2(c), that makes this class of models somewhat unusual and permits the system to remain a Fermi liquid when the interaction is turned on. If $\mathbf{A} = 0$ then, taking the complex conjugate of (I.1), we see that $\varepsilon_\nu(-\mathbf{k}) = \varepsilon_\nu(\mathbf{k})$ so that Hypothesis I.2(c) is violated for $\mathbf{q} = 0$. Hence the presence of a nonzero vector potential $\mathbf{A}$ is essential. We shall say more about the role of strong asymmetry later. For now, we just mention one model that violates these hypotheses, not only for technical reasons but because it exhibits different physics. It is the Hubbard model at half filling, whose Fermi curve is sketched below. This Fermi curve is not smooth, violating Hypothesis I.2(b), has zero curvature almost everywhere, violating Hypothesis I.2(a), and is invariant under $\mathbf{k}\to-\mathbf{k}$ so that $F = -F$, violating Hypothesis I.2(c) with $\mathbf{q} = 0$.
[Figure: the Fermi curve $F$ of the Hubbard model at half filling.]
To give a rigorous definition of (I.2) one must introduce cutoffs and then take the limit in which the cutoffs are removed. To impose an infrared cutoff in the spatial directions one could put the system in a finite periodic box $\mathbb{R}^2/L\Gamma$. To impose an ultraviolet cutoff in the spatial directions one may put the system on a lattice. By also imposing infrared and ultraviolet cutoffs in the temporal direction, we could arrange to start from a finite dimensional Grassmann algebra. We choose not to do so. We prove that formal renormalized perturbation expansions converge. The coefficients in those expansions are well-defined even without a finite volume cutoff. So we choose to start with $x$ running over all $\mathbb{R}^3$. We impose a (permanent) ultraviolet cutoff through a smooth, compactly supported function $U(\mathbf{k})$. This keeps $\mathbf{k}$ permanently bounded. We impose a (temporary) infrared cutoff through a function $\nu_\varepsilon\bigl(k_0^2 + e(\mathbf{k})^2\bigr)$ where $\nu_\varepsilon(\kappa)$ looks like

[Figure: graph of $\nu_\varepsilon(\kappa)$; axis labels $\kappa$, $\varepsilon$ and $1$.]

When $\varepsilon > 0$ and $\nu_\varepsilon\bigl(k_0^2 + e(\mathbf{k})^2\bigr) > 0$, $|ik_0 - e(\mathbf{k})|$ is at least of order $\varepsilon$. The coefficients of the perturbation expansion (either renormalized or not) of the cutoff Euclidean Green's functions
\[ G_{2n;\varepsilon}(x_1,\sigma_1,\dots,y_n,\tau_n) = \Bigl\langle \prod_{i=1}^n \psi_{\sigma_i}(x_i)\,\bar\psi_{\tau_i}(y_i) \Bigr\rangle_\varepsilon \]
where
\[ \langle f\rangle_\varepsilon = \frac{\int f(\psi,\bar\psi)\, e^{\mathcal{V}(\psi,\bar\psi)}\, d\mu_{C_\varepsilon}(\psi,\bar\psi)}{\int e^{\mathcal{V}(\psi,\bar\psi)}\, d\mu_{C_\varepsilon}(\psi,\bar\psi)} \]
with
\[ C_\varepsilon(k;\delta e) = \frac{U(\mathbf{k})\,\nu_\varepsilon\bigl(k_0^2 + e(\mathbf{k})^2\bigr)}{ik_0 - e(\mathbf{k}) + \delta e(\mathbf{k})} \]
are well-defined. Our main result is
Theorem [5, Theorem I.4]. Assume that $d=2$ and that $e(\mathbf{k})$ fulfils Hypothesis I.2. There is
◦ a nontrivial open ball $B\subset\mathcal{V}$, centered on the origin, and
◦ an analytic$^{c}$ function $V\in B \mapsto \delta e(V)\in\mathcal{E}$, that vanishes for $V=0$,
such that:
◦ for any $\varepsilon>0$ and $n\in\mathbb{N}$, the formal Taylor series for the Green's functions $G_{2n;\varepsilon}$ converges to an analytic function on $B$,
◦ as $\varepsilon\to0$, $G_{2n;\varepsilon}$ converges uniformly, in $x_1,\dots,y_n$ and $V\in B$, to a translation invariant, spin independent, particle number conserving function $G_{2n}$ that is analytic in $V$.
If, in addition, $V$ is $k_0$-reversal real, as in (I.4), then $\delta e(\mathbf{k};V)$ is real for all $\mathbf{k}$.

c For an elementary discussion of analytic maps between Banach spaces see, for example, [7, Appendix A].

Theorem [5, Theorem I.5]. Under the hypotheses of [5, Theorem I.4] and the assumption that $V\in B$ obeys the symmetries (I.4) and (I.5), the Fourier transform
\[ \check G_2(k_0,\mathbf{k}) = \int dx_0\, d^2\mathbf{x}\; e^{\imath\langle k,x\rangle_-}\, G_2\bigl((0,\mathbf{0},\uparrow),(x_0,\mathbf{x},\uparrow)\bigr) = \int dx_0\, d^2\mathbf{x}\; e^{\imath\langle k,x\rangle_-}\, G_2\bigl((0,\mathbf{0},\downarrow),(x_0,\mathbf{x},\downarrow)\bigr) \]
\[ = \frac{1}{ik_0 - e(\mathbf{k}) - \Sigma(k)} \quad \text{when } U(\mathbf{k}) = 1 \]
of the two-point function exists and is continuous, except on the Fermi curve (precisely, except when $k_0=0$ and $e(\mathbf{k})=0$). The momentum distribution function
\[ n(\mathbf{k}) = \lim_{\tau\to0+} \int \frac{dk_0}{2\pi}\; e^{\imath k_0\tau}\, \check G_2(k_0,\mathbf{k}) \]
exists and is continuous except on the Fermi curve $F$. If $\bar{\mathbf{k}}\in F$, then $\lim_{\mathbf{k}\to\bar{\mathbf{k}},\, e(\mathbf{k})>0} n(\mathbf{k})$ and $\lim_{\mathbf{k}\to\bar{\mathbf{k}},\, e(\mathbf{k})<0} n(\mathbf{k})$ exist and obey
\[ \lim_{\substack{\mathbf{k}\to\bar{\mathbf{k}} \\ e(\mathbf{k})<0}} n(\mathbf{k}) - \lim_{\substack{\mathbf{k}\to\bar{\mathbf{k}} \\ e(\mathbf{k})>0}} n(\mathbf{k}) = 1 + O(V) > \frac{1}{2} . \]
Theorem [5, Theorem I.7]. Let $\check G_4(k_1,k_2,k_3,k_4)$ (spin dropped from notation) be the Fourier transform of the four-point function and
\[ \check G_4^A(k_1,k_2,k_3,k_4) = \check G_4(k_1,k_2,k_3,k_4) \prod_{\ell=1}^{4} \frac{1}{\check G_2(k_\ell)} \]
its amputation by the physical propagator. Under the hypotheses of [5, Theorem I.5], $\check G_4^A$ has a decomposition
\[ \check G_4^A(k_1,k_2,k_3,k_4) = N(k_1,k_2,k_3,k_4) + \tfrac{1}{2} L\Bigl(\tfrac{k_1+k_2}{2}, \tfrac{k_3+k_4}{2}, k_2-k_1\Bigr) - \tfrac{1}{2} L\Bigl(\tfrac{k_3+k_2}{2}, \tfrac{k_1+k_4}{2}, k_2-k_3\Bigr) \]
with
◦ $N$ continuous
◦ $L(q_1,q_2,t)$ continuous except at $t=0$
◦ $\lim_{t_0\to0} L(q_1,q_2,t)$ continuous
◦ $\lim_{\mathbf{t}\to0} L(q_1,q_2,t)$ continuous.
Think of $L$ as a particle–hole ladder

[Figure: ladder diagram for $L(q_1,q_2,t)$ with external legs labelled $q_1+\frac{t}{2}$, $q_2+\frac{t}{2}$, $q_1-\frac{t}{2}$, $q_2-\frac{t}{2}$.]

We now discuss further the role of the geometric conditions of Hypothesis I.2 in blocking the Cooper channel. When you turn on the interaction $V$, the system itself effectively replaces $V$ by a more complicated "effective interaction". The (dominant) contribution

[Figure: ladder diagram with lines labelled $p$, $q$, $k$, $-p+t$, $-q+t$.]

to the strength of the effective interaction between two particles of total momentum $t = p_1+p_2 = q_1+q_2$ is
\[ \int dk\; \frac{\text{stuff}}{[ik_0 - e(\mathbf{k})]\,[i(-k_0+t_0) - e(-\mathbf{k}+\mathbf{t})]} . \]
Note that
\[ [ik_0 - e(\mathbf{k})] = 0 \iff k_0 = 0,\ e(\mathbf{k}) = 0 \iff k_0 = 0,\ \mathbf{k}\in F \]
\[ [i(-k_0+t_0) - e(-\mathbf{k}+\mathbf{t})] = 0 \iff k_0 = t_0,\ e(-\mathbf{k}+\mathbf{t}) = 0 \iff k_0 = t_0,\ \mathbf{k}\in \mathbf{t}-F . \]
We can transform $\frac{1}{ik_0 - e(\mathbf{k})}$ locally to $\frac{1}{ik_0 - k_1}$ by a simple change of variables. Thus $\frac{1}{ik_0 - e(\mathbf{k})}$ is locally integrable, but is not locally $L^2$. So the strength of the effective interaction diverges when the total momentum $t$ obeys $t_0 = 0$ and $F = \mathbf{t}-F$, because then the singular locus of $\frac{1}{ik_0 - e(\mathbf{k})}$ coincides with the singular locus of $\frac{1}{i(-k_0+t_0) - e(-\mathbf{k}+\mathbf{t})}$. This always happens when $F = -F$ (for example, when $F$ is a circle) and $t = 0$. Similarly the strength of the effective interaction diverges when $F$ has a flat piece and $\mathbf{t}/2$ lies in that flat piece, as in the figure on the right below.

[Figure: two sketches of Fermi curves; labels $F$, $\mathbf{k}$, $-\mathbf{k}$ and $F$, $\mathbf{t}-F$, $\frac{\mathbf{t}}{2}$.]

On the other hand, when $F$ is strongly asymmetric, $F$ and $\mathbf{t}-F$ always intersect only at isolated points. A "worst" case is illustrated below. There the antipode, $a(\mathbf{k})$, of $\mathbf{k}\in F$, is the unique point of $F$, different from $\mathbf{k}$, such that the tangents to $F$ at $\mathbf{k}$ and $a(\mathbf{k})$ are parallel.

[Figure: a strongly asymmetric Fermi curve $F$ with the points $\mathbf{k}$ and $a(\mathbf{k})$, and the curves $F$ and $\mathbf{k}+a(\mathbf{k})-F$.]
For strongly asymmetric Fermi curves, $\frac{1}{[ik_0 - e(\mathbf{k})]\,[i(-k_0+t_0) - e(-\mathbf{k}+\mathbf{t})]}$ remains locally integrable in $k$ for each fixed $t$ and the strength of the effective interaction remains bounded.

The Green's functions $G_{2n}$ are constructed using a multiscale analysis and renormalization. The multiscale analysis is introduced by choosing a parameter $M>1$ and decomposing momentum space into a family of shells, with the $j$th shell consisting of those momenta $k$ obeying $|ik_0 - e(\mathbf{k})| \approx \frac{1}{M^j}$. Correspondingly, we write the covariance as a telescoping series $C(k) = \sum_{j=0}^{\infty} C^{(j)}(k)$ where, for $j\ge1$,
\[ C^{(j)} = C_{M^{-j}} - C_{M^{-j+1}} \]
is the "covariance at scale $j$". By construction $C^{(j)}(k)$ vanishes unless $\sqrt{k_0^2 + e(\mathbf{k})^2}$ is of order $M^{-j}$, and $\|C^{(j)}(k)\|_{L^\infty} \approx M^j$.

We consider, for each $j$, the cutoff amputated Euclidean Green's functions $G^{\mathrm{amp}}_{2n;M^{-j}}$. They are related to the previously defined Green's functions by
\[ G_{2n;\varepsilon}(x_1,y_1,\dots,x_n,y_n) = \int \prod_{i=1}^n dx_i'\, dy_i' \Bigl(\prod_{i=1}^n C_\varepsilon(x_i,x_i')\, C_\varepsilon(y_i',y_i)\Bigr)\, G^{\mathrm{amp}}_{2n;\varepsilon}(x_1',y_1',\dots,x_n',y_n') \]
for $n\ge2$, and
\[ G_{2;\varepsilon}(x,y) - C_\varepsilon(x,y) = \int dx'\, dy'\; C_\varepsilon(x,x')\, C_\varepsilon(y',y)\, G^{\mathrm{amp}}_{2;\varepsilon}(x',y') \]
where $\varepsilon = M^{-j}$. The amputated Green's functions are the coefficients in the expansion of
\[ \mathcal{G}_{\mathrm{amp},j}(\phi,\bar\phi) = \log\frac{1}{Z_j}\int e^{\mathcal{V}(\phi+\psi,\bar\phi+\bar\psi)}\, d\mu_{C_{M^{-j}}}(\psi,\bar\psi), \qquad Z_j = \int e^{\mathcal{V}(\psi,\bar\psi)}\, d\mu_{C_{M^{-j}}}(\psi,\bar\psi) \]
in powers of $\phi$. That is,
\[ \mathcal{G}_{\mathrm{amp},j}(\phi,\bar\phi) = \sum_{n=1}^{\infty} \frac{1}{(n!)^2} \int \prod_{i=1}^n dx_i\, dy_i\; G^{\mathrm{amp}}_{2n,M^{-j}}(x_1,y_1,\dots,x_n,y_n) \prod_{i=1}^n \bar\phi(x_i)\phi(y_i) . \]
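The telescoping scale decomposition of the covariance described above can be checked numerically in a toy setting. The following is only a sketch under stated assumptions: the cutoff `nu` is a hypothetical sharp step (the paper's $\nu_\varepsilon$ is a different, unspecified profile), $U = 1$, $\delta e = 0$, the values of $e(\mathbf{k})$ are simply sampled at random, and $C^{(0)}$ is taken to be $C_\varepsilon$ with $\varepsilon = M^0$.

```python
import numpy as np

def nu(kappa, eps):
    # Hypothetical sharp infrared cutoff (assumption): 1 where sqrt(kappa) >= eps.
    return (np.sqrt(kappa) >= eps).astype(float)

def C_eps(k0, e, eps):
    # Cutoff covariance nu_eps(k0^2 + e^2) / (i k0 - e), with U = 1 and delta e = 0.
    mask = nu(k0**2 + e**2, eps)
    denom = 1j * k0 - e
    out = np.zeros_like(denom)
    # Divide only where the cutoff is nonzero; there |denom| >= eps > 0.
    np.divide(mask, denom, out=out, where=mask > 0)
    return out

def C_scale(k0, e, j, M):
    # Covariance at scale j: C^(j) = C_{M^-j} - C_{M^-(j-1)} for j >= 1.
    if j == 0:
        return C_eps(k0, e, 1.0)  # assumed convention for the zeroth scale
    return C_eps(k0, e, M**(-j)) - C_eps(k0, e, M**(-j + 1))

M, J = 2.0, 6
rng = np.random.default_rng(0)
k0 = rng.uniform(-2.0, 2.0, 10000)
e = rng.uniform(-2.0, 2.0, 10000)   # stands in for sampled values of e(k)

# The scales 0..J telescope to the covariance with infrared cutoff M^-J.
total = sum(C_scale(k0, e, j, M) for j in range(J + 1))
telescopes = np.allclose(total, C_eps(k0, e, M**(-J)))

# On its support, |C^(j)| = 1/|i k0 - e| is at most M^j.
sup_bounds = [np.abs(C_scale(k0, e, j, M)).max() <= M**j * (1 + 1e-12)
              for j in range(1, J + 1)]
```

With a sharp cutoff the $j$th shell is exactly the set $M^{-j} \le \sqrt{k_0^2+e^2} < M^{-j+1}$, which makes the support and size statements easy to read off; a smooth cutoff would blur the shell boundaries but telescope in the same way.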
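The generating functionals $\mathcal{G}_{\mathrm{amp},j}$ are controlled using the renormalization group map
\[ \Omega_S(\mathcal{W})(\phi,\bar\phi,\psi,\bar\psi) = \log\frac{1}{Z}\int e^{\mathcal{W}(\phi,\bar\phi,\psi+\zeta,\bar\psi+\bar\zeta)}\, d\mu_S(\zeta,\bar\zeta) \]
which is defined for any covariance $S$. Here, $\mathcal{W}$ is a Grassmann function and the partition function is $Z = \int e^{\mathcal{W}(0,0,\zeta,\bar\zeta)}\, d\mu_S(\zeta,\bar\zeta)$. $\Omega_S$ maps Grassmann functions in the variables $\phi,\bar\phi,\psi,\bar\psi$ to Grassmann functions in the same variables. Clearly
\[ \mathcal{G}_{\mathrm{amp},j}(\phi,\bar\phi) = \Omega_{C_{M^{-j}}}(\mathcal{V})(0,0,\phi,\bar\phi) \tag{I.6} \]
where we view $\mathcal{V}(\psi,\bar\psi)$ as a function of the four variables $\phi,\bar\phi,\psi,\bar\psi$ that happens to be independent of $\phi,\bar\phi$. The renormalization group map is discussed in a general setting in great detail in [8, 9]. It obeys the semigroup property
\[ \Omega_{S_1+S_2} = \Omega_{S_1}\circ\Omega_{S_2} . \]
Therefore
\[ \mathcal{G}_{\mathrm{amp},j}(\phi,\bar\phi) = \Omega_{C^{(j)}}\bigl(\mathcal{G}_{\mathrm{amp},j-1}(\psi,\bar\psi)\bigr)(0,0,\phi,\bar\phi) \tag{I.7} \]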
where we again view $\mathcal{G}_{\mathrm{amp},j-1}(\psi,\bar\psi)$ as a function of $\phi,\bar\phi,\psi,\bar\psi$ that happens to be independent of $\phi,\bar\phi$. The limiting Green's functions are controlled by tracking the renormalization group flow (I.7).

One of the main inputs from this series of four papers to the proof of the theorems stated above is a detailed analysis, with bounds, of the map $\Omega_{C^{(j)}}$. This is the content of the third paper in this series. In this first paper of the series, we apply the general results of [8] to simple many fermion systems. We introduce concrete norms that fulfill the conditions of [8, Sec. II.4] and develop contraction and integral bounds for them. Then, we apply [8, Theorem II.28] and (I.6) to models for which the dispersion relation is both infrared and ultraviolet finite (insulators). For these models, no scale decomposition is necessary.

In the second paper of this series, we introduce scales and apply the results of Part 1 to integrate out the first few scales. It turns out that for higher scales the norms introduced in Parts 1 and 2 are inadequate and, in particular, power count poorly. Using sectors (see [5, Sec. II, Subsec. 8]), we introduce finer norms that, in dimension two,$^{d}$ power count appropriately. For these sectorized norms, passing from one scale to the next is not completely trivial. This question is dealt with in Part 4. Cumulative notation tables are provided at the end of each paper of this series.

d This is the only part of the construction that is restricted to $d=2$. We believe that the difficulties preventing the extension to $d=3$ are technical rather than physical. Indeed, there has already been some progress in this direction [10, 11].

II. Norms

Let $A$ be the Grassmann algebra freely generated by the fields $\phi(y)$, $\bar\phi(y)$ with $y\in\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}$. The generating functional for the connected Green's functions is a Grassmann Gaussian integral in the Grassmann algebra with coefficients in $A$ that is generated by the fields $\psi(x)$, $\bar\psi(x)$ with $x\in\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}$. We want to apply the results of [8] to this situation. To simplify notation we define, for
\[ \xi = (x_0,\mathbf{x},\sigma,a) = (x,a) \in \mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}, \]
the internal fields
\[ \psi(\xi) = \begin{cases} \psi(x) & \text{if } a = 0 \\ \bar\psi(x) & \text{if } a = 1 \end{cases} . \]
Similarly, we define for an external variable $\eta = (y_0,\mathbf{y},\tau,b) = (y,b) \in \mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}$, the source fields
\[ \phi(\eta) = \begin{cases} \phi(y) & \text{if } b = 0 \\ \bar\phi(y) & \text{if } b = 1 \end{cases} . \]
$\mathcal{B} = \mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\}\times\{0,1\}$ is called the "base space" parameterizing the fields. The Grassmann algebra $A$ is the direct sum of the vector spaces $A_m$ generated by the products $\phi(\eta_1)\cdots\phi(\eta_m)$. Let $V$ be the vector space generated by $\psi(\xi)$, $\xi\in\mathcal{B}$. An antisymmetric function $C(\xi,\xi')$ on $\mathcal{B}\times\mathcal{B}$ defines a covariance on $V$ by $C(\psi(\xi),\psi(\xi')) = C(\xi,\xi')$. The Grassmann Gaussian integral with this covariance, $\int\,\cdot\;d\mu_C(\psi)$, is a linear functional on the Grassmann algebra $\bigwedge_A V$ with values in $A$.

We shall define norms on $\bigwedge_A V$ by specifying norms on the spaces of functions on $\mathcal{B}^m\times\mathcal{B}^n$, $m,n\ge0$. The rudiments of such norms and simple examples are discussed in this section. In the next section we recall the results of [8] in the current concrete situation. The norms we construct are $(d+1)$-dimensional seminorms in the sense of [8, Definition II.15]. They measure the spatial decay of the functions, i.e. derivatives of their Fourier transforms.

Definition II.1 (Multi-indices).
(i) A multi-index is an element $\delta = (\delta_0,\delta_1,\dots,\delta_d)\in\mathbb{N}_0\times\mathbb{N}_0^d$. The length of a multi-index $\delta = (\delta_0,\delta_1,\dots,\delta_d)$ is $|\delta| = \delta_0+\delta_1+\cdots+\delta_d$ and its factorial is $\delta! = \delta_0!\,\delta_1!\cdots\delta_d!$. For two multi-indices $\delta,\delta'$ we say that $\delta\le\delta'$ if $\delta_i\le\delta_i'$ for $i=0,1,\dots,d$. The spatial part of the multi-index $\delta = (\delta_0,\delta_1,\dots,\delta_d)$ is $\boldsymbol{\delta} = (\delta_1,\dots,\delta_d)\in\mathbb{N}_0^d$. It has length $|\boldsymbol{\delta}| = \delta_1+\cdots+\delta_d$.
(ii) Let $\delta,\delta^{(1)},\dots,\delta^{(r)}$ be multi-indices such that $\delta = \delta^{(1)}+\cdots+\delta^{(r)}$. Then by definition
\[ \binom{\delta}{\delta^{(1)},\dots,\delta^{(r)}} = \frac{\delta!}{\delta^{(1)}!\cdots\delta^{(r)}!} . \]
(iii) For a multi-index $\delta$ and $x = (x_0,\mathbf{x},\sigma)$, $x' = (x_0',\mathbf{x}',\sigma')\in\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}$ set
\[ (x-x')^\delta = (x_0-x_0')^{\delta_0}(\mathbf{x}_1-\mathbf{x}_1')^{\delta_1}\cdots(\mathbf{x}_d-\mathbf{x}_d')^{\delta_d} . \]
If $\xi = (x,a)$, $\xi' = (x',a')\in\mathcal{B}$ we define $(\xi-\xi')^\delta = (x-x')^\delta$.
(iv) For a function $f(\xi_1,\dots,\xi_n)$ on $\mathcal{B}^n$, a multi-index $\delta$, and $1\le i,j\le n$; $i\ne j$ set
\[ D^\delta_{i,j} f(\xi_1,\dots,\xi_n) = (\xi_i-\xi_j)^\delta f(\xi_1,\dots,\xi_n) . \]
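Lemma II.2 (Leibniz's rule). Let $f(\xi_1,\dots,\xi_n)$ be a function on $\mathcal{B}^n$ and $f'(\xi_1,\dots,\xi_m)$ a function on $\mathcal{B}^m$. Set
\[ g(\xi_1,\dots,\xi_{n+m-2}) = \int_{\mathcal{B}} d\eta\; f(\xi_1,\dots,\xi_{n-1},\eta)\, f'(\eta,\xi_n,\dots,\xi_{n+m-2}) . \]
Let $\delta$ be a multi-index and $1\le i\le n-1$, $n\le j\le n+m-2$. Then
\[ D^\delta_{i,j}\, g(\xi_1,\dots,\xi_{n+m-2}) = \sum_{\delta'\le\delta} \binom{\delta}{\delta',\delta-\delta'} \int_{\mathcal{B}} d\eta\; D^{\delta'}_{i,n} f(\xi_1,\dots,\xi_{n-1},\eta)\; D^{\delta-\delta'}_{1,j-n+2} f'(\eta,\xi_n,\dots,\xi_{n+m-2}) . \]
Proof. For each $\eta\in\mathcal{B}$
\[ (\xi_i-\xi_j)^\delta = \bigl((\xi_i-\eta)+(\eta-\xi_j)\bigr)^\delta = \sum_{\delta'\le\delta} \binom{\delta}{\delta',\delta-\delta'} (\xi_i-\eta)^{\delta'}(\eta-\xi_j)^{\delta-\delta'} . \]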
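The multi-index binomial identity used in this proof can be verified directly on small integer data. The sketch below is illustrative only: the helper names (`mfact`, `mbinom`, `mpower`, `leibniz_rhs`) and the sample points are hypothetical, and points of $\mathcal{B}$ are represented by their $(d+1)$ real coordinates with the spin and bar/unbar labels ignored, since those labels do not enter $(\xi-\xi')^\delta$.

```python
from itertools import product
from math import factorial, prod

def mfact(delta):
    # Multi-index factorial: delta! = delta_0! * delta_1! * ... * delta_d!
    return prod(factorial(c) for c in delta)

def mbinom(delta, dprime):
    # Binomial coefficient delta! / (delta'! (delta - delta')!) for delta' <= delta.
    rest = tuple(c - p for c, p in zip(delta, dprime))
    return mfact(delta) // (mfact(dprime) * mfact(rest))

def mpower(x, delta):
    # x^delta = x_0^delta_0 * ... * x_d^delta_d
    return prod(c ** p for c, p in zip(x, delta))

def leibniz_rhs(a, b, delta):
    # Sum over all delta' <= delta of binom(delta; delta', delta - delta') a^delta' b^(delta - delta').
    total = 0
    for dprime in product(*(range(c + 1) for c in delta)):
        rest = tuple(c - p for c, p in zip(delta, dprime))
        total += mbinom(delta, dprime) * mpower(a, dprime) * mpower(b, rest)
    return total

# Sample coordinates (hypothetical values) with d + 1 = 3.
xi_i, eta, xi_j = (3, -1, 2), (1, 4, -2), (0, 2, 5)
delta = (2, 1, 3)

a = tuple(p - q for p, q in zip(xi_i, eta))    # xi_i - eta
b = tuple(p - q for p, q in zip(eta, xi_j))    # eta - xi_j
lhs = mpower(tuple(p - q for p, q in zip(xi_i, xi_j)), delta)
rhs = leibniz_rhs(a, b, delta)
```

Because the identity holds pointwise in $\eta$, multiplying it by $f(\dots,\eta)\,f'(\eta,\dots)$ and integrating over $\eta$ gives the formula of the lemma.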
Definition II.3 (Decay operators). Let $n$ be a positive integer. A decay operator $D$ on the set of functions on $\mathcal{B}^n$ is an operator of the form
\[ D = D^{\delta^{(1)}}_{u_1,v_1}\cdots D^{\delta^{(k)}}_{u_k,v_k} \]
with multi-indices $\delta^{(1)},\dots,\delta^{(k)}$ and $1\le u_j,v_j\le n$, $u_j\ne v_j$. The indices $u_j,v_j$ are called variable indices. The total order of derivatives in $D$ is
\[ \delta(D) = \delta^{(1)}+\cdots+\delta^{(k)} . \]
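In a similar way, we define the action of a decay operator on the set of functions on $(\mathbb{R}\times\mathbb{R}^d)^n$ or on $(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^n$.

Definition II.4.
(i) On $\mathbb{R}_+\cup\{\infty\} = \{x\in\mathbb{R} \mid x\ge0\}\cup\{+\infty\}$, addition and the total ordering $\le$ are defined in the standard way. With the convention that $0\cdot\infty = \infty$, multiplication is also defined in the standard way.
(ii) Let $d\ge-1$. For $d\ge0$, the $(d+1)$-dimensional norm domain $\mathfrak{N}_{d+1}$ is the set of all formal power series
\[ X = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} X_\delta\; t_0^{\delta_0}t_1^{\delta_1}\cdots t_d^{\delta_d} \]
in the variables $t_0,t_1,\dots,t_d$ with coefficients $X_\delta\in\mathbb{R}_+\cup\{\infty\}$. To shorten notation, we set $t^\delta = t_0^{\delta_0}t_1^{\delta_1}\cdots t_d^{\delta_d}$. Addition and partial ordering on $\mathfrak{N}_{d+1}$ are defined componentwise. Multiplication is defined by
\[ (X\cdot X')_\delta = \sum_{\beta+\gamma=\delta} X_\beta X'_\gamma . \]
The max and min of two elements of $\mathfrak{N}_{d+1}$ are again defined componentwise. The zero-dimensional norm domain $\mathfrak{N}_0$ is defined to be $\mathbb{R}_+\cup\{\infty\}$. We also identify $\mathbb{R}_+\cup\{\infty\}$ with the set of all $X\in\mathfrak{N}_{d+1}$ with $X_\delta = 0$ for all $\delta\ne\mathbf{0} = (0,\dots,0)$. If $a>0$, $X_{\mathbf{0}}\ne\infty$ and $a - X_{\mathbf{0}} > 0$ then $(a-X)^{-1}$ is defined as
\[ (a-X)^{-1} = \frac{1}{a-X_{\mathbf{0}}} \sum_{n=0}^{\infty} \Bigl(\frac{X-X_{\mathbf{0}}}{a-X_{\mathbf{0}}}\Bigr)^n . \]
For an element $X = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} X_\delta t^\delta$ of $\mathfrak{N}_{d+1}$ and $0\le j\le d$ the formal derivative $\frac{\partial}{\partial t_j}X$ is defined as
\[ \frac{\partial}{\partial t_j}X = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} (\delta_j+1)\,X_{\delta+\epsilon_j}\,t^\delta \]
where $\epsilon_j$ is the $j$th unit vector.

Definition II.5. Let $E$ be a complex vector space. A $(d+1)$-dimensional seminorm on $E$ is a map $\|\cdot\| : E\to\mathfrak{N}_{d+1}$ such that
\[ \|e_1+e_2\| \le \|e_1\|+\|e_2\|, \qquad \|\lambda e\| = |\lambda|\,\|e\| \]
for all $e,e_1,e_2\in E$ and $\lambda\in\mathbb{C}$.

Example II.6. For a function $f$ on $\mathcal{B}^m\times\mathcal{B}^n$ we define the (scalar valued) $L^1$–$L^\infty$ norm as
\[ |\!|\!|f|\!|\!|_{1,\infty} = \begin{cases} \displaystyle \max_{1\le j_0\le n}\ \sup_{\xi_{j_0}\in\mathcal{B}} \int \prod_{\substack{j=1,\dots,n \\ j\ne j_0}} d\xi_j\; |f(\xi_1,\dots,\xi_n)| & \text{if } m = 0 \\[3ex] \displaystyle \sup_{\eta_1,\dots,\eta_m\in\mathcal{B}} \int \prod_{j=1,\dots,n} d\xi_j\; |f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)| & \text{if } m \ne 0 \end{cases} \]
and the $(d+1)$-dimensional $L^1$–$L^\infty$ seminorm
\[ \|f\|_{1,\infty} = \begin{cases} \displaystyle \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} \frac{1}{\delta!} \Bigl(\max_{\substack{D \text{ decay operator} \\ \text{with } \delta(D)=\delta}} |\!|\!|Df|\!|\!|_{1,\infty}\Bigr)\, t_0^{\delta_0}t_1^{\delta_1}\cdots t_d^{\delta_d} & \text{if } m = 0 \\[3ex] |\!|\!|f|\!|\!|_{1,\infty} & \text{if } m \ne 0 \end{cases} . \]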
Here $|\!|\!|f|\!|\!|_{1,\infty}$ stands for the formal power series with constant coefficient $|\!|\!|f|\!|\!|_{1,\infty}$ and all other coefficients zero.
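The arithmetic of the norm domain in Definition II.4 can be made concrete with truncated power series. The following is a minimal sketch, not the authors' construction: series are stored as dictionaries from multi-indices to coefficients and cut off at a total degree `max_deg`, only finite coefficients are handled (so the convention $0\cdot\infty=\infty$ is not modelled), and the function names are hypothetical.

```python
from itertools import product
from math import comb, isclose

def multi_indices(nvars, max_deg):
    # All multi-indices in nvars variables with total degree <= max_deg.
    return [d for d in product(range(max_deg + 1), repeat=nvars) if sum(d) <= max_deg]

def mul(X, Y, nvars, max_deg):
    # (X . Y)_delta = sum over beta + gamma = delta of X_beta * Y_gamma, truncated.
    Z = {d: 0.0 for d in multi_indices(nvars, max_deg)}
    for beta, xb in X.items():
        for gamma, yg in Y.items():
            d = tuple(p + q for p, q in zip(beta, gamma))
            if sum(d) <= max_deg:
                Z[d] += xb * yg
    return Z

def inverse(a, X, nvars, max_deg):
    # (a - X)^(-1) = 1/(a - X_0) * sum_n ((X - X_0)/(a - X_0))^n, truncated at max_deg.
    zero = (0,) * nvars
    x0 = X.get(zero, 0.0)
    assert a > 0 and a - x0 > 0
    R = {d: c / (a - x0) for d, c in X.items() if d != zero}
    total = {d: 0.0 for d in multi_indices(nvars, max_deg)}
    term = {zero: 1.0}                     # R^0
    for _ in range(max_deg + 1):           # R^n has no terms of degree < n
        for d, c in term.items():
            total[d] += c
        term = mul(term, R, nvars, max_deg)
    return {d: c / (a - x0) for d, c in total.items()}

# Example with two variables (d = 1): X = 0.5 + t0 + 2 t1 and a = 2.
X = {(0, 0): 0.5, (1, 0): 1.0, (0, 1): 2.0}
inv = inverse(2.0, X, nvars=2, max_deg=4)
```

For this particular $X$ the geometric series can be summed by hand: the coefficient of $t_0^p t_1^q$ in $(a-X)^{-1}$ is $\binom{p+q}{p}(1/1.5)^{p}(2/1.5)^{q}/1.5$, which gives an independent check of the truncated computation.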
0
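The arithmetic of the norm domain in Definition II.4 can be made concrete with a small sketch. The code below is an illustration only, not part of the paper: it stores a truncated element of the norm domain as a Python dict from multi-indices to coefficients, and the names `nd_mul`, `series_mul` and `series_diff` are hypothetical. It assumes the convention $0\cdot\infty=\infty$ stated in Definition II.4(i).

```python
import math
from itertools import product

INF = math.inf

def nd_mul(a, b):
    """Product in R_+ U {inf} with the convention 0 * inf = inf."""
    if a == INF or b == INF:
        return INF
    return a * b

def series_mul(X, Y, max_deg):
    """Multiply two truncated power series, keeping multi-indices with entries <= max_deg.

    X and Y are dicts {multi-index tuple: coefficient}; a missing key means 0.
    The coefficient of t^delta is the sum over beta + gamma = delta of X_beta * Y_gamma.
    """
    dim = len(next(iter(X)))
    Z = {}
    for delta in product(range(max_deg + 1), repeat=dim):
        total = 0.0
        for beta in product(*(range(n + 1) for n in delta)):
            gamma = tuple(n - b for n, b in zip(delta, beta))
            total += nd_mul(X.get(beta, 0.0), Y.get(gamma, 0.0))
        Z[delta] = total
    return Z

def series_diff(X, j):
    """Formal derivative in t_j: the coefficient of t^delta is (delta_j + 1) * X_{delta + e_j}."""
    D = {}
    for idx, coeff in X.items():
        if idx[j] == 0:
            continue
        delta = idx[:j] + (idx[j] - 1,) + idx[j + 1:]
        D[delta] = nd_mul(float(idx[j]), coeff)
    return D

# Hypothetical example with d = 1: X = 1 + 2 t_0 and Y = 3 + inf * t_1.
X = {(0, 0): 1.0, (1, 0): 2.0}
Y = {(0, 0): 3.0, (0, 1): INF}
Z = series_mul(X, Y, max_deg=1)
```

An infinite coefficient in one factor propagates to every coefficient of the product whose multi-index dominates it, which is how the norm domain records "no bound available" for high derivatives.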
Lemma II.7. Let $f$ be a function on $B^m\times B^n$ and $f'$ a function on $B^{m'}\times B^{n'}$. Let $1\le\mu\le n$, $1\le\nu\le n'$. Define the function $g$ on $B^{m+m'}\times B^{n+n'-2}$ by
\[
\begin{aligned}
g(\eta_1,\dots,\eta_{m+m'};\,&\xi_1,\dots,\xi_{\mu-1},\xi_{\mu+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+\nu-1},\xi_{n+\nu+1},\dots,\xi_{n+n'})\\
&=\int_B d\zeta\,f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{\mu-1},\zeta,\xi_{\mu+1},\dots,\xi_n)\\
&\qquad\times f'(\eta_{m+1},\dots,\eta_{m+m'};\xi_{n+1},\dots,\xi_{n+\nu-1},\zeta,\xi_{n+\nu+1},\dots,\xi_{n+n'}) .
\end{aligned}
\]
If $m=0$ or $m'=0$
\[
|||g|||_{1,\infty}\le|||f|||_{1,\infty}\,|||f'|||_{1,\infty},\qquad
\|g\|_{1,\infty}\le\|f\|_{1,\infty}\,\|f'\|_{1,\infty} .
\]

Proof. We first consider the norm $|||\cdot|||_{1,\infty}$. In the case $m\ne 0$, $m'=0$, for all $\eta_1,\dots,\eta_m\in B$
\[
\begin{aligned}
\int\prod_{\substack{j=1\\ j\ne\mu,\,n+\nu}}^{n+n'}d\xi_j\,&|g(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{\mu-1},\xi_{\mu+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+\nu-1},\xi_{n+\nu+1},\dots,\xi_{n+n'})|\\
&\le\int d\xi_1\cdots d\xi_n\,|f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)|\ \times\ \sup_{\zeta\in B}\int\prod_{\substack{j=1\\ j\ne\nu}}^{n'}d\xi'_j\,|f'(\xi'_1,\dots,\xi'_{\nu-1},\zeta,\xi'_{\nu+1},\dots,\xi'_{n'})|\\
&\le|||f|||_{1,\infty}\,|||f'|||_{1,\infty} .
\end{aligned}
\]
The case $m=0$, $m'\ne 0$ is similar. In the case $m=m'=0$, first fix $j_0\in\{1,\dots,n\}\setminus\{\mu\}$, and fix $\xi_{j_0}\in B$. As in the case $m\ne 0$, $m'=0$ one shows that
\[
\begin{aligned}
\int\prod_{\substack{j=1\\ j\ne j_0,\,\mu,\,n+\nu}}^{n+n'}d\xi_j\,&|g(\xi_1,\dots,\xi_{\mu-1},\xi_{\mu+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+\nu-1},\xi_{n+\nu+1},\dots,\xi_{n+n'})|\\
&\le\int\prod_{\substack{j=1\\ j\ne j_0}}^{n}d\xi_j\,|f(\xi_1,\dots,\xi_n)|\ \sup_{\zeta\in B}\int\prod_{\substack{j=1\\ j\ne\nu}}^{n'}d\xi'_j\,|f'(\xi'_1,\dots,\xi'_{\nu-1},\zeta,\xi'_{\nu+1},\dots,\xi'_{n'})|\\
&\le|||f|||_{1,\infty}\,|||f'|||_{1,\infty} .
\end{aligned}
\]
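If one fixes one of the variables $\xi_{j_0}$ with $j_0\in\{n+1,\dots,n+n'\}\setminus\{n+\nu\}$, the argument is similar.

We now consider the norm $\|\cdot\|_{1,\infty}$. If $m\ne 0$ or $m'\ne 0$ this follows from the first part of this lemma and
\[
\|g\|_{1,\infty}=|||g|||_{1,\infty}\le|||f|||_{1,\infty}\,|||f'|||_{1,\infty}\le\|f\|_{1,\infty}\,\|f'\|_{1,\infty} .
\]
So assume that $m=m'=0$. Set
\[
\|f\|_{1,\infty}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}X_\delta t^\delta,\qquad
\|f'\|_{1,\infty}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}X'_\delta t^\delta,\qquad
\|g\|_{1,\infty}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}Y_\delta t^\delta
\]
with $X_\delta,X'_\delta,Y_\delta\in\mathbb{R}_+\cup\{\infty\}$. Let $D$ be a decay operator of degree $\delta$ acting on $g$. The variable indices for $g$ lie in the set $I\cup I'$, where
\[
I=\{1,\dots,\mu-1,\mu+1,\dots,n\},\qquad I'=\{n+1,\dots,n+\nu-1,n+\nu+1,\dots,n+n'\} .
\]
We can factor the decay operator $D$ in the form
\[
D=\pm\tilde D\,D_2D_1
\]
where all variable indices of $D_1$ lie in $I$, all variable indices of $D_2$ lie in $I'$, and
\[
\tilde D=D^{\delta^{(1)}}_{u_1,v_1}\cdots D^{\delta^{(k)}}_{u_k,v_k}
\]
with $u_1,\dots,u_k\in I$, $v_1,\dots,v_k\in I'$. Set $h=D_1f$ and $h'=D_2f'$. By Leibniz's rule
\[
\begin{aligned}
Dg&=\pm\tilde D\int_B d\zeta\,h(\xi_1,\dots,\xi_{\mu-1},\zeta,\xi_{\mu+1},\dots,\xi_n)\,h'(\xi_{n+1},\dots,\xi_{n+\nu-1},\zeta,\xi_{n+\nu+1},\dots,\xi_{n+n'})\\
&=\pm\sum_{\substack{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}\\ \text{for } i=1,\dots,k}}\Big(\prod_{i=1}^{k}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}\Big)\int d\zeta\,\Big(\prod_{i=1}^{k}D^{\alpha^{(i)}}_{u_i,\mu}h\Big)(\xi_1,\dots,\xi_{\mu-1},\zeta,\xi_{\mu+1},\dots,\xi_n)\\
&\hspace{12em}\times\Big(\prod_{i=1}^{k}D^{\beta^{(i)}}_{\nu,v_i}h'\Big)(\xi_{n+1},\dots,\xi_{n+\nu-1},\zeta,\xi_{n+\nu+1},\dots,\xi_{n+n'}) .
\end{aligned}
\]
By the first part of this lemma, the $L_1$–$L_\infty$-norm of each integral on the right-hand side is bounded by
\[
\Big|\Big|\Big|\prod_{i=1}^{k}D^{\alpha^{(i)}}_{u_i,\mu}h\Big|\Big|\Big|_{1,\infty}\ \Big|\Big|\Big|\prod_{i=1}^{k}D^{\beta^{(i)}}_{\nu,v_i}h'\Big|\Big|\Big|_{1,\infty} .
\]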
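Therefore, setting $\tilde\delta=\delta(\tilde D)=\delta^{(1)}+\cdots+\delta^{(k)}$,
\[
\begin{aligned}
|||Dg|||_{1,\infty}\,t^\delta
&\le\sum_{\substack{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}\\ \text{for } i=1,\dots,k}}\Big(\prod_{i=1}^{k}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}\Big)\,t^{\delta(D_1)}t^{\alpha^{(1)}+\cdots+\alpha^{(k)}}\Big|\Big|\Big|\prod_{i=1}^{k}D^{\alpha^{(i)}}_{u_i,\mu}(D_1f)\Big|\Big|\Big|_{1,\infty}\\
&\hspace{10em}\times t^{\delta(D_2)}t^{\beta^{(1)}+\cdots+\beta^{(k)}}\Big|\Big|\Big|\prod_{i=1}^{k}D^{\beta^{(i)}}_{\nu,v_i}(D_2f')\Big|\Big|\Big|_{1,\infty}\\
&\le\sum_{\alpha+\beta=\tilde\delta}\ \sum_{\substack{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}\\ \alpha^{(1)}+\cdots+\alpha^{(k)}=\alpha\\ \beta^{(1)}+\cdots+\beta^{(k)}=\beta}}\Big(\prod_{i=1}^{k}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}\Big)\,X_{\delta(D_1)+\alpha}\,t^{\delta(D_1)+\alpha}\,X'_{\delta(D_2)+\beta}\,t^{\delta(D_2)+\beta}\\
&=\sum_{\alpha+\beta=\tilde\delta}\binom{\tilde\delta}{\alpha,\beta}\,X_{\delta(D_1)+\alpha}\,t^{\delta(D_1)+\alpha}\,X'_{\delta(D_2)+\beta}\,t^{\delta(D_2)+\beta}\\
&\le\sum_{\alpha+\beta=\tilde\delta}\binom{\delta}{\delta(D_1)+\alpha,\ \delta(D_2)+\beta}\,X_{\delta(D_1)+\alpha}\,t^{\delta(D_1)+\alpha}\,X'_{\delta(D_2)+\beta}\,t^{\delta(D_2)+\beta} .
\end{aligned}
\tag{II.1}
\]
In the equality, we used the fact that for each pair of multi-indices $\alpha,\beta$ with $\alpha+\beta=\tilde\delta$ and each $k$-tuple of multi-indices $\delta^{(i)}$, $1\le i\le k$, with $\sum_i\delta^{(i)}=\tilde\delta$
\[
\sum_{\substack{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}\\ \alpha^{(1)}+\cdots+\alpha^{(k)}=\alpha\\ \beta^{(1)}+\cdots+\beta^{(k)}=\beta}}\ \prod_{i=1}^{k}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}=\binom{\tilde\delta}{\alpha,\beta} .
\]
This standard combinatorial identity follows from
\[
\begin{aligned}
\sum_{\alpha+\beta=\tilde\delta}\binom{\tilde\delta}{\alpha,\beta}x^\alpha y^\beta
&=(x+y)^{\tilde\delta}=\prod_{i=1}^{k}(x+y)^{\delta^{(i)}}
=\prod_{i=1}^{k}\Big[\sum_{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}x^{\alpha^{(i)}}y^{\beta^{(i)}}\Big]\\
&=\sum_{\substack{\alpha^{(i)}+\beta^{(i)}=\delta^{(i)}\\ i=1,\dots,k}}\Big(\prod_{i=1}^{k}\binom{\delta^{(i)}}{\alpha^{(i)},\beta^{(i)}}\Big)\,x^{\alpha^{(1)}+\cdots+\alpha^{(k)}}\,y^{\beta^{(1)}+\cdots+\beta^{(k)}}
\end{aligned}
\]
by matching the coefficients of $x^\alpha y^\beta$.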
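The combinatorial identity used in the equality of (II.1) is a multi-index form of the Vandermonde convolution, and it can be checked by brute force on small cases. The sketch below is an illustration under assumptions: the function names and the sample multi-indices are hypothetical, and each $\beta^{(i)}$ is taken to be $\delta^{(i)}-\alpha^{(i)}$, so only the $\alpha^{(i)}$ are enumerated.

```python
from itertools import product
from math import comb, prod

def multi_binom(delta, alpha):
    """Multi-index coefficient binom(delta; alpha, delta - alpha), a product over components."""
    return prod(comb(n, a) for n, a in zip(delta, alpha))

def splits(delta):
    """All multi-indices alpha with 0 <= alpha <= delta componentwise."""
    return product(*(range(n + 1) for n in delta))

def lhs(deltas, alpha):
    """Sum of prod_i binom(delta_i; alpha_i, beta_i) over all alpha_i <= delta_i with sum alpha_i = alpha."""
    total = 0
    for alphas in product(*(list(splits(d)) for d in deltas)):
        if tuple(map(sum, zip(*alphas))) == tuple(alpha):
            total += prod(multi_binom(d, a) for d, a in zip(deltas, alphas))
    return total

# Hypothetical sample: three multi-indices delta^(1), delta^(2), delta^(3) in two variables.
deltas = [(2, 1), (1, 3), (0, 2)]
delta_tilde = tuple(map(sum, zip(*deltas)))

identity_holds = all(
    lhs(deltas, alpha) == multi_binom(delta_tilde, alpha)
    for alpha in splits(delta_tilde)
)
```

The check enumerates every admissible $\alpha$, so `identity_holds` covers all coefficients of $x^\alpha y^\beta$ for the chosen sample.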
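It follows from (II.1) that
\[
\frac{1}{\delta!}Y_\delta t^\delta\le\sum_{\alpha'+\beta'=\delta}\frac{1}{\alpha'!}X_{\alpha'}t^{\alpha'}\ \frac{1}{\beta'!}X'_{\beta'}t^{\beta'}
\]
and
\[
\|g\|_{1,\infty}\le\sum_{\delta}\ \sum_{\alpha'+\beta'=\delta}\frac{1}{\alpha'!}X_{\alpha'}t^{\alpha'}\ \frac{1}{\beta'!}X'_{\beta'}t^{\beta'}=\|f\|_{1,\infty}\,\|f'\|_{1,\infty} .
\]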
Corollary II.8. Let $f$ be a function on $B^n$, $f'$ a function on $B^{n'}$ and $C_2$, $C_3$ functions on $B^2$. Set
\[
h(\xi_4,\dots,\xi_n,\xi'_4,\dots,\xi'_{n'})=\int d\zeta\,d\xi_2\,d\xi'_2\,d\xi_3\,d\xi'_3\ f(\zeta,\xi_2,\xi_3,\xi_4,\dots,\xi_n)\,C_2(\xi_2,\xi'_2)\,C_3(\xi_3,\xi'_3)\,f'(\zeta,\xi'_2,\xi'_3,\xi'_4,\dots,\xi'_{n'}) .
\]
Then
\[
\|h\|_{1,\infty}\le\sup_{\xi,\xi'}|C_2(\xi,\xi')|\ \sup_{\eta,\eta'}|C_3(\eta,\eta')|\ \|f\|_{1,\infty}\,\|f'\|_{1,\infty} .
\]

Proof. Set
\[
g(\xi_2,\dots,\xi_n,\xi'_2,\dots,\xi'_{n'})=\int d\zeta\,f(\zeta,\xi_2,\xi_3,\xi_4,\dots,\xi_n)\,f'(\zeta,\xi'_2,\xi'_3,\xi'_4,\dots,\xi'_{n'}) .
\]
Let $D$ be a decay operator acting on $h$. Then
\[
Dh=\int d\xi_2\,d\xi'_2\,d\xi_3\,d\xi'_3\ C_2(\xi_2,\xi'_2)\,C_3(\xi_3,\xi'_3)\,Dg(\xi_2,\dots,\xi_n,\xi'_2,\dots,\xi'_{n'}) .
\]
Consequently
\[
|||Dh|||_{1,\infty}\le\sup|C_2|\ \sup|C_3|\ |||Dg|||_{1,\infty}
\]
and therefore
\[
\|h\|_{1,\infty}\le\sup|C_2|\ \sup|C_3|\ \|g\|_{1,\infty} .
\]
The corollary now follows from Lemma II.7.

Definition II.9. Let $\mathcal{F}_m(n)$ be the space of all functions $f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)$ on $B^m\times B^n$ that are antisymmetric in the $\eta$ variables. If $f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)$ is any function on $B^m\times B^n$, its antisymmetrization in the external variables is
\[
\mathrm{Ant}_{\mathrm{ext}}f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)=\frac{1}{m!}\sum_{\pi\in S_m}\mathrm{sgn}(\pi)\,f(\eta_{\pi(1)},\dots,\eta_{\pi(m)};\xi_1,\dots,\xi_n) .
\]
For $m,n\ge 0$, the symmetric group $S_n$ acts on $\mathcal{F}_m(n)$ from the right by
\[
f^\pi(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)=f(\eta_1,\dots,\eta_m;\xi_{\pi(1)},\dots,\xi_{\pi(n)})\qquad\text{for }\pi\in S_n .
\]
Definition II.10. A seminorm $\|\cdot\|$ on $\mathcal{F}_m(n)$ is called symmetric, if for every $f\in\mathcal{F}_m(n)$ and $\pi\in S_n$
\[
\|f^\pi\|=\|f\|
\]
and $\|f\|=0$ if $m=n=0$.

For example, the seminorms $\|\cdot\|_{1,\infty}$ of Example II.6 are symmetric.

III. Covariances and the Renormalization Group Map

Definition III.1 (Contraction). Let $C(\xi,\xi')$ be any skew symmetric function on $B\times B$. Let $m,n\ge 0$ and $1\le i<j\le n$. For $f\in\mathcal{F}_m(n)$ the contraction $\mathop{\mathrm{Con}_C}_{i\to j}f\in\mathcal{F}_m(n-2)$ is defined as
\[
\begin{aligned}
\mathop{\mathrm{Con}_C}_{i\to j}f(\eta_1,\dots,\eta_m;\,&\xi_1,\dots,\xi_{i-1},\xi_{i+1},\dots,\xi_{j-1},\xi_{j+1},\dots,\xi_n)\\
&=(-1)^{j-i+1}\int d\zeta\,d\zeta'\,C(\zeta,\zeta')\,f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{i-1},\zeta,\xi_{i+1},\dots,\xi_{j-1},\zeta',\xi_{j+1},\dots,\xi_n) .
\end{aligned}
\]

Definition III.2 (Contraction Bound). Let $\|\cdot\|$ be a family of symmetric seminorms on the spaces $\mathcal{F}_m(n)$. We say that $c\in\mathfrak{N}_{d+1}$ is a contraction bound for the covariance $C$ with respect to this family of seminorms, if for all $m,n,m',n'\ge 0$ there exist $i$ and $j$ with $1\le i\le n$, $1\le j\le n'$ such that
\[
\Big\|\mathop{\mathrm{Con}_C}_{i\to j}\big(\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\big)\Big\|\le c\,\|f\|\,\|f'\|
\]
for all $f\in\mathcal{F}_m(n)$, $f'\in\mathcal{F}_{m'}(n')$. Observe that $f\otimes f'$ is a function on $(B^m\times B^n)\times(B^{m'}\times B^{n'})\cong B^{m+m'}\times B^{n+n'}$, so that $\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\in\mathcal{F}_{m+m'}(n+n')$.

Remark III.3. If $c$ is a contraction bound for the covariance $C$ with respect to a family of symmetric seminorms, then, by symmetry,
\[
\Big\|\mathop{\mathrm{Con}_C}_{i\to n+j}\big(\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\big)\Big\|\le c\,\|f\|\,\|f'\|
\]
for all $1\le i\le n$, $1\le j\le n'$ and all $f\in\mathcal{F}_m(n)$, $f'\in\mathcal{F}_{m'}(n')$.

Example III.4. The $L_1$–$L_\infty$-norm introduced in Example II.6 has $\max\{\|C\|_{1,\infty},|||C|||_\infty\}$ as a contraction bound for covariance $C$. Here, $|||C|||_\infty$ is the element of $\mathfrak{N}_{d+1}$ whose constant term is $\sup_{\xi,\xi'}|C(\xi,\xi')|$ and is the only nonzero term. This is easily proven by iterated application of Lemma II.7. See also [8, Example II.26]. A more general statement will be formulated and proven in Lemma V.1(iii).

Definition III.5 (Integral Bound). Let $\|\cdot\|$ be a family of symmetric seminorms on the spaces $\mathcal{F}_m(n)$. We say that $b\in\mathbb{R}_+$ is an integral bound for the covariance $C$ with respect to this family of seminorms, if the following holds:
Let $m\ge 0$, $1\le n'\le n$. For $f\in\mathcal{F}_m(n)$ define $f'\in\mathcal{F}_m(n-n')$ by
\[
f'(\eta_1,\dots,\eta_m;\xi_{n'+1},\dots,\xi_n)=\iint_{B^{n'}}d\xi_1\cdots d\xi_{n'}\,f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{n'},\xi_{n'+1},\dots,\xi_n)\,\psi(\xi_1)\cdots\psi(\xi_{n'})\,d\mu_C(\psi) .
\]
Then
\[
\|f'\|\le(b/2)^{n'}\,\|f\| .
\]

Remark III.6. Suppose that there is a constant $S$ such that
\[
\Big|\int\psi(\xi_1)\cdots\psi(\xi_n)\,d\mu_C(\psi)\Big|\le S^n
\]
for all $\xi_1,\dots,\xi_n\in B$. Then $2S$ is an integral bound for $C$ with respect to the $L_1$–$L_\infty$-norm introduced in Example II.6.

Definition III.7. (i) We define $A_m[n]$ as the subspace of the Grassmann algebra $\bigwedge_A V$ that consists of all elements of the form
\[
\mathrm{Gr}(f)=\int d\eta_1\cdots d\eta_m\,d\xi_1\cdots d\xi_n\ f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)\,\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1)\cdots\psi(\xi_n)
\]
with a function $f$ on $B^m\times B^n$.
(ii) Every element of $A_m[n]$ has a unique representation of the form $\mathrm{Gr}(f)$ with a function $f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)\in\mathcal{F}_m(n)$ that is antisymmetric in its $\xi$ variables. Therefore a seminorm $\|\cdot\|$ on $\mathcal{F}_m(n)$ defines a canonical seminorm on $A_m[n]$, which we denote by the same symbol $\|\cdot\|$.

Remark III.8. For $F\in A_m[n]$
\[
\|F\|\le\|f\|\qquad\text{for all } f\in\mathcal{F}_m(n)\text{ with }\mathrm{Gr}(f)=F .
\]

Proof. Let $f\in\mathcal{F}_m(n)$. Then $f'=\frac{1}{n!}\sum_{\pi\in S_n}\mathrm{sgn}(\pi)\,f^\pi$ is the unique element of $\mathcal{F}_m(n)$ that is antisymmetric in its $\xi$ variables such that $\mathrm{Gr}(f')=\mathrm{Gr}(f)$. Therefore
\[
\|\mathrm{Gr}(f)\|=\|f'\|\le\frac{1}{n!}\sum_{\pi\in S_n}\|f^\pi\|=\frac{1}{n!}\sum_{\pi\in S_n}\|f\|=\|f\|
\]
since the seminorm is symmetric.

Definition III.9. Let $\|\cdot\|$ be a family of symmetric seminorms, and let $\mathcal{W}(\phi,\psi)$ be a Grassmann function. Write
\[
\mathcal{W}=\sum_{m,n\ge 0}\mathcal{W}_{m,n}
\]
with $\mathcal{W}_{m,n}\in A_m[n]$. For any constants $c\in\mathfrak{N}_{d+1}$, $b>0$ and $\alpha\ge 1$ set
\[
N(\mathcal{W};c,b,\alpha)=\frac{1}{b^2}\,c\sum_{m,n\ge 0}\alpha^n\,b^n\,\|\mathcal{W}_{m,n}\| .
\]
In practice, the quantities $b$, $c$ will reflect the "power counting" of $\mathcal{W}$ with respect to the covariance $C$ and the number $\alpha$ is proportional to an inverse power of the largest allowed modulus of the coupling constant.

In this paper, we will derive bounds on the renormalization group map for several kinds of seminorms. The main ingredients from [8] are

Theorem III.10. Let $\|\cdot\|$ be a family of symmetric seminorms and let $C$ be a covariance on $V$ with contraction bound $c$ and integral bound $b$. Then the formal Taylor series $\Omega_C(:\!\mathcal{W}\!:)$ converges to an analytic map on $\{\mathcal{W}\mid\mathcal{W}\text{ even},\ N(\mathcal{W};c,b,8\alpha)_0<\frac{\alpha^2}{4}\}$. Furthermore, if $\mathcal{W}(\phi,\psi)$ is an even Grassmann function such that
\[
N(\mathcal{W};c,b,8\alpha)_0<\frac{\alpha^2}{4}
\]
then
\[
N\big(\Omega_C(:\!\mathcal{W}\!:)-\mathcal{W};c,b,\alpha\big)\le\frac{2}{\alpha^2}\ \frac{N(\mathcal{W};c,b,8\alpha)^2}{1-\frac{4}{\alpha^2}N(\mathcal{W};c,b,8\alpha)} .
\]
Here, : · : denotes Wick ordering with respect to the covariance C.
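To get a feel for the size of the bound in Theorem III.10, its right-hand side can be evaluated in the zero-dimensional case, where $N(\mathcal{W};c,b,8\alpha)$ is a single number in $\mathbb{R}_+\cup\{\infty\}$. The sketch below is an illustration only: it assumes the inequality reads $\frac{2}{\alpha^2}N^2/(1-\frac{4}{\alpha^2}N)$ with hypothesis $N<\alpha^2/4$, and the function name `theorem_III10_bound` is hypothetical.

```python
def theorem_III10_bound(N8, alpha):
    """Right-hand side of the Theorem III.10 estimate for a scalar N8 = N(W; c, b, 8*alpha).

    Returns None when alpha < 1 or when the hypothesis N8 < alpha**2 / 4 fails,
    since the theorem then gives no information.
    """
    if alpha < 1 or not (0 <= N8 < alpha ** 2 / 4):
        return None
    return (2 / alpha ** 2) * N8 ** 2 / (1 - (4 / alpha ** 2) * N8)

# Hypothetical sample values: the bound is quadratic in N8 for small N8.
small = theorem_III10_bound(0.5, 2.0)
outside = theorem_III10_bound(1.0, 2.0)   # hypothesis N8 < alpha^2/4 = 1 fails
```

For small `N8` the bound behaves like `2 * N8**2 / alpha**2`, which is the sense in which $\Omega_C(:\!\mathcal{W}\!:)$ differs from $\mathcal{W}$ only at second order.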
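In Sec. V we will use this theorem to discuss the situation of an insulator. More generally we have:

Theorem III.11. Let, for $\kappa$ in a neighborhood of $0$, $\mathcal{W}_\kappa(\phi,\psi)$ be an even Grassmann function and $C_\kappa$, $D_\kappa$ be antisymmetric functions on $B\times B$. Assume that $\alpha\ge 1$ and
\[
N(\mathcal{W}_0;c,b,32\alpha)_0<\alpha^2
\]
and that

$C_0$ has contraction bound $c$, and $\frac12 b$ is an integral bound for $C_0$, $D_0$;

$\frac{d}{d\kappa}C_\kappa\big|_{\kappa=0}$ has contraction bound $c'$, and $\frac12 b'$ is an integral bound for $\frac{d}{d\kappa}D_\kappa\big|_{\kappa=0}$;

and that $c\le\frac{1}{\mu}c^2$. Set
\[
:\!\tilde{\mathcal{W}}_\kappa(\phi,\psi)\!:_{\psi,D_\kappa}=\Omega_{C_\kappa}\big(:\!\mathcal{W}_\kappa\!:_{\psi,C_\kappa+D_\kappa}\big) .
\]
Then
\[
\begin{aligned}
N\Big(\frac{d}{d\kappa}\big[\tilde{\mathcal{W}}_\kappa-\mathcal{W}_\kappa\big]_{\kappa=0};c,b,\alpha\Big)
&\le\frac{1}{2\alpha^2}\ \frac{N(\mathcal{W}_0;c,b,32\alpha)}{1-\frac{1}{\alpha^2}N(\mathcal{W}_0;c,b,32\alpha)}\ N\Big(\frac{d}{d\kappa}\mathcal{W}_\kappa\Big|_{\kappa=0};c,b,8\alpha\Big)\\
&\quad+\frac{1}{2\alpha^2}\ \frac{N(\mathcal{W}_0;c,b,32\alpha)^2}{1-\frac{1}{\alpha^2}N(\mathcal{W}_0;c,b,32\alpha)}\ \Big\{\frac{1}{4\mu}c'+\Big(\frac{b'}{b}\Big)^2\Big\} .
\end{aligned}
\]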
Proof of Theorems III.10 and III.11. If $f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)$ is a function on $B^m\times B^n$ we define the corresponding element of $A_m\otimes V^{\otimes n}$ as
\[
\mathrm{Tens}(f)=\int\prod_{i=1}^{m}d\eta_i\prod_{j=1}^{n}d\xi_j\ f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)\,\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1)\otimes\cdots\otimes\psi(\xi_n) .
\]
Each element of $A_m\otimes V^{\otimes n}$ can be uniquely written in the form $\mathrm{Tens}(f)$ with a function $f\in\mathcal{F}_m(n)$. Therefore a seminorm on $\mathcal{F}_m(n)$ defines a seminorm on $A_m\otimes V^{\otimes n}$ and conversely. Under this correspondence, symmetric seminorms on $\mathcal{F}_m(n)$ in the sense of Definition II.10 correspond to symmetric seminorms on $A_m\otimes V^{\otimes n}$ in the sense of [8, Definition II.18], contraction bounds as in Definition III.2 correspond, by Remark III.3, to contraction bounds as in [8, Definition II.25(i)] and integral bounds as in Definition III.5 correspond to integral bounds as in [8, Definition II.25(ii)]. Furthermore the norms on the spaces $A_m[n]$ defined in Definition III.7(ii) agree with those of [8, Lemma II.22]. Therefore Theorem III.10 follows directly from [8, Theorem II.28] and Theorem III.11 follows from [8, Theorem IV.4].

IV. Bounds for Covariances

Integral bounds

Definition IV.1. For any covariance $C=C(\xi,\xi')$ we define
\[
S(C)=\sup_{m}\ \sup_{\xi_1,\dots,\xi_m\in B}\Big(\Big|\int\psi(\xi_1)\cdots\psi(\xi_m)\,d\mu_C(\psi)\Big|\Big)^{1/m} .
\]

Remark IV.2. (i) By Remark III.6, $2S(C)$ is an integral bound for $C$ with respect to the $L_1$–$L_\infty$-norms introduced in Example II.6.
(ii) For any two covariances $C$, $C'$
\[
S(C+C')\le S(C)+S(C') .
\]

Proof of (ii). For $\xi_1,\dots,\xi_m\in B$
\[
\int\psi(\xi_1)\cdots\psi(\xi_m)\,d\mu_{(C+C')}(\psi)=\int\big(\psi(\xi_1)+\psi'(\xi_1)\big)\cdots\big(\psi(\xi_m)+\psi'(\xi_m)\big)\,d\mu_C(\psi)\,d\mu_{C'}(\psi') .
\]
Multiplying out one sees that
\[
\big(\psi(\xi_1)+\psi'(\xi_1)\big)\cdots\big(\psi(\xi_m)+\psi'(\xi_m)\big)=\sum_{p=0}^{m}\ \sum_{\substack{I\subset\{1,\dots,m\}\\ |I|=p}}M(p,I)
\]
with
\[
M(p,I)=\pm\prod_{i\in I}\psi(\xi_i)\prod_{j\notin I}\psi'(\xi_j) .
\]
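Therefore
\[
\begin{aligned}
\Big|\int\psi(\xi_1)\cdots\psi(\xi_m)\,d\mu_{(C+C')}(\psi)\Big|
&\le\sum_{p=0}^{m}\ \sum_{\substack{I\subset\{1,\dots,m\}\\ |I|=p}}\Big|\iint M(p,I)\,d\mu_C(\psi)\,d\mu_{C'}(\psi')\Big|\\
&\le\sum_{p=0}^{m}\ \sum_{\substack{I\subset\{1,\dots,m\}\\ |I|=p}}S(C)^p\,S(C')^{m-p}\\
&=\big(S(C)+S(C')\big)^m .
\end{aligned}
\]

In this section, we assume that there is a function $C(k)$ such that for $\xi=(x,a)=(x_0,\mathbf{x},\sigma,a)$, $\xi'=(x',a')=(x'_0,\mathbf{x}',\sigma',a')\in B$
\[
C(\xi,\xi')=
\begin{cases}
\displaystyle\delta_{\sigma,\sigma'}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{\imath\langle k,x-x'\rangle_-}\,C(k) & \text{if } a=0,\ a'=1\\[2ex]
\displaystyle-\delta_{\sigma,\sigma'}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{\imath\langle k,x'-x\rangle_-}\,C(k) & \text{if } a=1,\ a'=0\\[2ex]
0 & \text{if } a=a'
\end{cases}
\tag{IV.1}
\]
(as usual, the case $x_0=x'_0=0$ is defined through the limit $x_0-x'_0\to 0-$) and derive bounds for $S(C)$ in terms of norms of $C(k)$.

Proposition IV.3 (Gram's estimate). (i)
\[
S(C)\le\sqrt{\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|C(k)|}
\]
(ii) Let, for each $s$ in a finite set $\Sigma$, $\chi_s(k)$ be a function on $\mathbb{R}\times\mathbb{R}^d$. Set, for $a\in\{0,1\}$,
\[
\hat\chi_s(x-x',a)=\int e^{(-1)^a\imath\langle k,x-x'\rangle_-}\,\chi_s(k)\,\frac{d^{d+1}k}{(2\pi)^{d+1}}
\]
and
\[
\psi_s(x,a)=\int d^{d+1}x'\,\hat\chi_s(x-x',a)\,\psi(x',a) .
\]
Then for all $\xi_1,\dots,\xi_m\in B$ and all $s_1,\dots,s_m\in\Sigma$
\[
\Big|\int\psi_{s_1}(\xi_1)\cdots\psi_{s_m}(\xi_m)\,d\mu_C(\psi)\Big|\le\Big(\max_{s\in\Sigma}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|C(k)\,\chi_s(k)^2|\Big)^{m/2} .
\]

Proof. Let $\mathcal{H}$ be the Hilbert space $\mathcal{H}=L^2(\mathbb{R}\times\mathbb{R}^d)\otimes\mathbb{C}^2$. For $\sigma\in\{\uparrow,\downarrow\}$ define the element $\omega(\sigma)\in\mathbb{C}^2$ by
\[
\omega(\sigma)=
\begin{cases}
(1,0) & \text{if }\sigma=\uparrow\\
(0,1) & \text{if }\sigma=\downarrow .
\end{cases}
\]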
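For each $\xi=(x,a)=(x_0,\mathbf{x},\sigma,a)\in B$ define $w(\xi)\in\mathcal{H}$ by
\[
w(\xi)=
\begin{cases}
\displaystyle\frac{e^{-\imath\langle k,x\rangle_-}}{(2\pi)^{(d+1)/2}}\,\sqrt{|C(k)|}\otimes\omega(\sigma) & \text{if } a=0\\[2ex]
\displaystyle\frac{e^{-\imath\langle k,x\rangle_-}}{(2\pi)^{(d+1)/2}}\,\frac{C(k)}{\sqrt{|C(k)|}}\otimes\omega(\sigma) & \text{if } a=1 .
\end{cases}
\]
Then
\[
\|w(\xi)\|^2_{\mathcal{H}}=\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|C(k)|\qquad\text{for all }\xi\in B
\]
and
\[
C(\xi,\xi')=\langle w(\xi),w(\xi')\rangle_{\mathcal{H}}
\]
if $\xi=(x,\sigma,0)$, $\xi'=(x',\sigma',1)\in B$. Part (i) of the proposition now follows from [8, Proposition B.1(i)].

(ii) For each $\xi=(x,a)=(x_0,\mathbf{x},\sigma,a)\in B$ and $s\in\Sigma$ define $w'(\xi,s)\in\mathcal{H}$ by
\[
w'(\xi,s)=
\begin{cases}
\displaystyle\frac{e^{-\imath\langle k,x\rangle_-}}{(2\pi)^{(d+1)/2}}\,\sqrt{|C(k)|}\,\chi_s(k)\otimes\omega(\sigma) & \text{if } a=0\\[2ex]
\displaystyle\frac{e^{-\imath\langle k,x\rangle_-}}{(2\pi)^{(d+1)/2}}\,\frac{C(k)}{\sqrt{|C(k)|}}\,\chi_s(k)\otimes\omega(\sigma) & \text{if } a=1 .
\end{cases}
\]
Then
\[
\|w'(\xi,s)\|^2_{\mathcal{H}}=\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|C(k)|\,|\chi_s(k)|^2
\]
and
\[
\int\psi_s(\xi)\,\psi_{s'}(\xi')\,d\mu_C(\psi)=\langle w'(\xi,s),w'(\xi',s')\rangle_{\mathcal{H}}
\]
if $\xi=(x_0,\mathbf{x},\sigma,0)$, $\xi'=(x'_0,\mathbf{x}',\sigma',1)\in B$. Part (ii) of the proposition now follows from [8, Proposition B.1(i)], applied to the generating system of fields $\psi_s(\xi)$.

Lemma IV.4. Let $\Lambda>0$ and $U(\mathbf{k})$ a function on $\mathbb{R}^d$. Assume that
\[
C(k)=\frac{U(\mathbf{k})}{\imath k_0-\Lambda} .
\]
Then
\[
S(C)\le\sqrt{\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,|U(\mathbf{k})|} .
\]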
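Proof. For $a=0$, $a'=1$
\[
\begin{aligned}
C\big((x_0,\mathbf{x},\sigma,a),(x'_0,\mathbf{x}',\sigma',a')\big)
&=\delta_{\sigma,\sigma'}\int\frac{dk_0}{2\pi}\,\frac{e^{-\imath k_0(x_0-x'_0)}}{\imath k_0-\Lambda}\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,e^{\imath\langle\mathbf{k},\mathbf{x}-\mathbf{x}'\rangle}\,U(\mathbf{k})\\
&=-\delta_{\sigma,\sigma'}\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,e^{\imath\langle\mathbf{k},\mathbf{x}-\mathbf{x}'\rangle}\,U(\mathbf{k})
\begin{cases}
e^{-\Lambda(x_0-x'_0)} & \text{if } x_0>x'_0\\
0 & \text{if } x_0\le x'_0 .
\end{cases}
\end{aligned}
\]
Let $\mathcal{H}$ be the Hilbert space $\mathcal{H}=L^2(\mathbb{R}^d)\otimes\mathbb{C}^2$. For $\sigma\in\{\uparrow,\downarrow\}$ define the element $\omega(\sigma)\in\mathbb{C}^2$ as in the proof of Proposition IV.3, and for each $\xi=(x_0,\mathbf{x},\sigma,a)\in B$ define $w(\xi)\in\mathcal{H}$ by
\[
w(\xi)=
\begin{cases}
\displaystyle\frac{e^{-\imath\langle\mathbf{k},\mathbf{x}\rangle}}{(2\pi)^{d/2}}\,\sqrt{|U(\mathbf{k})|}\otimes\omega(\sigma) & \text{if } a=0\\[2ex]
\displaystyle-\frac{e^{-\imath\langle\mathbf{k},\mathbf{x}\rangle}}{(2\pi)^{d/2}}\,\frac{U(\mathbf{k})}{\sqrt{|U(\mathbf{k})|}}\otimes\omega(\sigma) & \text{if } a=1 .
\end{cases}
\]
Again
\[
\|w(\xi)\|^2_{\mathcal{H}}=\frac{1}{(2\pi)^d}\int d^d\mathbf{k}\,|U(\mathbf{k})|\qquad\text{for all }\xi\in B .
\]
Furthermore set $\tau(x_0,\mathbf{x},\sigma,a)=\Lambda x_0$. Then for $\xi=(x_0,\mathbf{x},\sigma,0)$, $\xi'=(x'_0,\mathbf{x}',\sigma',1)\in B$
\[
C(\xi,\xi')=
\begin{cases}
e^{-(\tau(\xi)-\tau(\xi'))}\,\langle w(\xi),w(\xi')\rangle_{\mathcal{H}} & \text{if }\tau(\xi)>\tau(\xi')\\
0 & \text{if }\tau(\xi)\le\tau(\xi') .
\end{cases}
\]
The lemma now follows from [8, Proposition B.1(ii)].

Proposition IV.5. Assume that $C$ is of the form
\[
C(k)=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})}
\]
with real valued measurable functions $U(\mathbf{k})$, $e(\mathbf{k})$ on $\mathbb{R}^d$ and $\chi(k)$ on $\mathbb{R}\times\mathbb{R}^d$ such that $0\le\chi(k)\le U(\mathbf{k})\le 1$ for all $k=(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d$. Then
\[
S(C)^2\le 9\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,U(\mathbf{k})+\frac{3}{E}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\chi(k)+6\int_{|k_0|\le E}\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\frac{U(\mathbf{k})-\chi(k)}{|\imath k_0-e(\mathbf{k})|}
\]
where $E=\sup_{\mathbf{k}\in\operatorname{supp}U}|e(\mathbf{k})|$.

Proof. Write
\[
C(k)=\frac{U(\mathbf{k})}{\imath k_0-E}-\frac{\chi(k)}{\imath k_0-E}+\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big) .
\]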
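The three-term decomposition in the proof of Proposition IV.5 can be verified in one line. The following worked identity is added for illustration and uses only the symbols of the proposition: combining the two fractions that multiply $U(\mathbf{k})-\chi(k)$ over a common denominator gives

```latex
\[
\frac{1}{\imath k_0-E}+\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}
=\frac{(\imath k_0-e(\mathbf{k}))+(e(\mathbf{k})-E)}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}
=\frac{1}{\imath k_0-e(\mathbf{k})} ,
\]
\[
\text{so that}\qquad
\frac{U(\mathbf{k})}{\imath k_0-E}-\frac{\chi(k)}{\imath k_0-E}
+\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big)
=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})}=C(k) .
\]
```

The first two terms have the constant "dispersion relation" $E$ and so fall under Lemma IV.4 and Proposition IV.3(i), while the third carries an extra factor that decays in $k_0$.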
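By Remark IV.2, Lemma IV.4 and Proposition IV.3(i)
\[
\frac13\,S(C)^2\le\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,|U(\mathbf{k})|+\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\Big|\frac{\chi(k)}{\imath k_0-E}\Big|+\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\Big|\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big)\Big| .
\]
The first two terms are bounded by
\[
\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,U(\mathbf{k})+\frac{1}{E}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\chi(k) .
\]
The contribution to the third term having $|k_0|\le E$ is bounded by
\[
\int_{|k_0|\le E}\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\Big|\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big)\Big|
\le 2\int_{|k_0|\le E}\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\frac{U(\mathbf{k})-\chi(k)}{|\imath k_0-e(\mathbf{k})|} .
\]
The contribution to the third term having $|k_0|>E$ is bounded by
\[
\int_{|k_0|>E}\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\Big|\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big)\Big|
\le 4\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,U(\mathbf{k})\,\frac{E}{|\imath k_0-E|^2}
=2\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,U(\mathbf{k}) .
\]
Hence
\[
\frac13\,S(C)^2\le 3\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,U(\mathbf{k})+\frac{1}{E}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\chi(k)+2\int_{|k_0|\le E}\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\frac{U(\mathbf{k})-\chi(k)}{|\imath k_0-e(\mathbf{k})|} .
\]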
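The last equality in the $|k_0|>E$ estimate rests on the elementary integral $\int\frac{dk_0}{2\pi}\,\frac{E}{k_0^2+E^2}=\frac12$, which turns the factor 4 into 2. A quick numerical sanity check is sketched below; the function name and the step count are hypothetical choices, and the substitution $k_0=E\tan\theta$ is used only to map the real line to a bounded interval.

```python
import math

def lorentz_integral(E, steps=20_000):
    """Midpoint-rule value of the integral of E / (k0^2 + E^2) dk0 / (2*pi) over the real line.

    Uses k0 = E * tan(theta), theta in (-pi/2, pi/2); midpoints never hit the endpoints.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        k0 = E * math.tan(theta)
        jacobian = E / math.cos(theta) ** 2          # dk0 / dtheta
        total += E / (k0 * k0 + E * E) * jacobian * h
    return total / (2 * math.pi)

value = lorentz_integral(3.0)   # hypothetical sample value E = 3
```

The result is independent of `E`, as the scaling $k_0\to Ek_0$ suggests, so the bound on the $|k_0|>E$ region involves only $\int\frac{d^d\mathbf{k}}{(2\pi)^d}U(\mathbf{k})$.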
Contraction bounds

We have observed in Example III.4 that the $L_1$–$L_\infty$-norm introduced in Example II.6 has $\max\{\|C\|_{1,\infty},|||C|||_\infty\}$ as a contraction bound for covariance $C$. For the propagators of the form (IV.1), we estimate these position space quantities by norms of derivatives of $C(k)$ in momentum space.

Definition IV.6. (i) For a function $f(k)$ on $\mathbb{R}\times\mathbb{R}^d$ and a multi-index $\delta$ we set
\[
D^\delta f(k)=\frac{\partial^{\delta_0}}{\partial k_0^{\delta_0}}\,\frac{\partial^{\delta_1}}{\partial k_1^{\delta_1}}\cdots\frac{\partial^{\delta_d}}{\partial k_d^{\delta_d}}\,f(k)
\]
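and
\[
\|f(k)\|\check{\ }_{\infty}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}\Big(\sup_k|D^\delta f(k)|\Big)\,t^\delta\in\mathfrak{N}_{d+1},
\qquad
\|f(k)\|\check{\ }_{1}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}\Big(\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta f(k)|\Big)\,t^\delta\in\mathfrak{N}_{d+1} .
\]
If $B$ is a measurable subset of $\mathbb{R}\times\mathbb{R}^d$,
\[
\|f(k)\|\check{\ }_{\infty,B}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}\Big(\sup_{k\in B}|D^\delta f(k)|\Big)\,t^\delta\in\mathfrak{N}_{d+1},
\qquad
\|f(k)\|\check{\ }_{1,B}=\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d}\frac{1}{\delta!}\Big(\int_B\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta f(k)|\Big)\,t^\delta\in\mathfrak{N}_{d+1} .
\]
(ii) For $\mu>0$ and $X\in\mathfrak{N}_{d+1}$
\[
T_\mu X=\frac{1}{\mu^{d+1}}\,X+\frac{\mu}{d+1}\sum_{j=0}^{d}\Big(\frac{\partial}{\partial t_0}\cdots\frac{\partial}{\partial t_d}\Big)\frac{\partial}{\partial t_j}X .
\]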
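The operator $T_\mu$ of Definition IV.6(ii) acts coefficientwise and is easy to compute on a truncated series. The sketch below is an illustration under assumptions: it reads the definition as $T_\mu X=\mu^{-(d+1)}X+\frac{\mu}{d+1}\sum_{j=0}^{d}\partial_{t_0}\cdots\partial_{t_d}\partial_{t_j}X$, stores series as dicts from multi-indices to coefficients, and the names `diff` and `T` are hypothetical. For a truncated input only the low-order coefficients of the output are meaningful.

```python
import math

def diff(X, j):
    """Formal derivative in t_j of a series stored as {multi-index: coefficient}."""
    D = {}
    for idx, c in X.items():
        if idx[j] > 0:
            delta = idx[:j] + (idx[j] - 1,) + idx[j + 1:]
            D[delta] = math.inf if c == math.inf else idx[j] * c
    return D

def T(mu, X, d):
    """Apply T_mu to X: X / mu^(d+1) plus mu/(d+1) times the sum over j of d_0 ... d_d d_j X."""
    out = {idx: (math.inf if c == math.inf else c / mu ** (d + 1)) for idx, c in X.items()}
    for j in range(d + 1):
        Y = X
        for i in range(d + 1):      # one derivative in every variable t_0, ..., t_d
            Y = diff(Y, i)
        Y = diff(Y, j)              # one further derivative in t_j
        for idx, c in Y.items():
            out[idx] = out.get(idx, 0.0) + mu / (d + 1) * c
    return out

# Hypothetical example with d = 0: X = 1 + 2 t + 3 t^2, so T_mu X = X / mu + mu * X''.
X = {(0,): 1.0, (1,): 2.0, (2,): 3.0}
TX = T(0.5, X, d=0)
```

In this example the second derivative of `X` is the constant 6, so only the constant coefficient of `TX` receives a contribution from the derivative term.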
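Remark IV.7. For functions $f(k)$ and $g(k)$ on $B\subset\mathbb{R}\times\mathbb{R}^d$
\[
\|f(k)g(k)\|\check{\ }_{1,B}\le\|f(k)\|\check{\ }_{1,B}\ \|g(k)\|\check{\ }_{\infty,B}
\]
by Leibniz's rule for derivatives. The proof is similar to that of Lemma II.7.

Proposition IV.8. Let $d\ge 1$. Assume that there is a function $C(k)$ such that for $\xi=(x,a)=(x_0,\mathbf{x},\sigma,a)$, $\xi'=(x',a')=(x'_0,\mathbf{x}',\sigma',a')\in B$
\[
C(\xi,\xi')=
\begin{cases}
\displaystyle\delta_{\sigma,\sigma'}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{\imath\langle k,x-x'\rangle_-}\,C(k) & \text{if } a=0,\ a'=1\\[2ex]
0 & \text{if } a=a'\\
-C(\xi',\xi) & \text{if } a=1,\ a'=0 .
\end{cases}
\]
Let $\delta$ be a multi-index and $0<\mu\le 1$.

(i)
\[
|||D^\delta_{1,2}C|||_\infty\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta C(k)|\le\frac{\mathrm{vol}}{(2\pi)^{d+1}}\ \sup_{k\in\mathbb{R}\times\mathbb{R}^d}|D^\delta C(k)|
\]
and
\[
\|C\|_{1,\infty}\le\mathrm{const}\ T_\mu\|C(k)\|\check{\ }_{1}\le\mathrm{const}\ \frac{\mathrm{vol}}{(2\pi)^{d+1}}\ T_\mu\|C(k)\|\check{\ }_{\infty}
\]
where vol is the volume of the support of $C(k)$ in $\mathbb{R}\times\mathbb{R}^d$ and the constant const depends only on the dimension $d$.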
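(ii) Assume that there is an $r$-times differentiable real valued function $e(\mathbf{k})$ on $\mathbb{R}^d$ such that $|e(\mathbf{k})|\ge\mu$ for all $\mathbf{k}\in\mathbb{R}^d$ and a real valued, compactly supported, smooth, non-negative function $U(\mathbf{k})$ on $\mathbb{R}^d$ such that
\[
C(k)=\frac{U(\mathbf{k})}{\imath k_0-e(\mathbf{k})} .
\]
Set
\[
g_1=\int_{\operatorname{supp}U}d^d\mathbf{k}\ \frac{1}{|e(\mathbf{k})|},\qquad
g_2=\int_{\operatorname{supp}U}d^d\mathbf{k}\ \frac{\mu}{|e(\mathbf{k})|^2} .
\]
Then there is a constant const such that, for all multi-indices $\delta$ whose spatial part satisfies $|\boldsymbol{\delta}|\le r-d-1$,
\[
|||C|||_\infty\le\mathrm{const},\qquad
\frac{1}{\delta!}\,|||D^\delta_{1,2}C|||_{1,\infty}\le\frac{\mathrm{const}}{\mu^{d+|\delta|}}
\begin{cases}
g_1 & \text{if }|\delta|=0\\
2^{|\delta|}\,g_2 & \text{if }|\delta|\ge 1 .
\end{cases}
\]
The constant const depends only on the dimension $d$, the degree of differentiability $r$, the ultraviolet cutoff $U(\mathbf{k})$ and the quantities $\sup_{\mathbf{k}}|D^\gamma e(\mathbf{k})|$, $\gamma\in\mathbb{N}_0^d$, $|\gamma|\le r$.

(iii) Assume that $C$ is of the form
\[
C(k)=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})}
\]
with real valued functions $U(\mathbf{k})$, $e(\mathbf{k})$ on $\mathbb{R}^d$ and $\chi(k)$ on $\mathbb{R}\times\mathbb{R}^d$ that fulfill the following conditions: The function $e(\mathbf{k})$ is $r$ times differentiable. $|\imath k_0-e(\mathbf{k})|\ge\mu$ for all $k=(k_0,\mathbf{k})$ in the support of $U(\mathbf{k})-\chi(k)$. The function $U(\mathbf{k})$ is smooth and has compact support. The function $\chi(k)$ is smooth and has compact support and $0\le\chi(k)\le U(\mathbf{k})\le 1$ for all $k=(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d$.

There is a constant const such that
\[
|||C|||_\infty\le\mathrm{const} .
\tag{IV.2}
\]
The constant const depends on $d$, $\mu$ and the supports of $U(\mathbf{k})$ and $\chi$.

Let $r_0\in\mathbb{N}$. There is a constant const such that, for all multi-indices $\delta$ whose spatial part satisfies $|\boldsymbol{\delta}|\le r-d-1$ and whose temporal part satisfies $|\delta_0|\le r_0-2$,
\[
|||D^\delta_{1,2}C|||_{1,\infty}\le\mathrm{const} .
\tag{IV.3}
\]
The constant const depends on $d$, $r$, $r_0$, $\mu$, $U(\mathbf{k})$ and the quantities $\sup_{\mathbf{k}}|D^\gamma e(\mathbf{k})|$ with $\gamma\in\mathbb{N}_0^d$, $|\gamma|\le r$ and $\sup_k|D^\beta\chi(k)|$ with $\beta\in\mathbb{N}_0\times\mathbb{N}_0^d$, $\beta_0\le r_0$, $|\boldsymbol{\beta}|\le r$.

Proof. (i) As the Fourier transform of the operator $D^{\delta'}$ is, up to a sign, multiplication by $[-\imath(x-x')]^{\delta'}$, we have for $\xi=(x,\sigma,a)$ and $\xi'=(x',\sigma',a')$
\[
|(x-x')^{\delta'}|\ |D^\delta_{1,2}C(\xi,\xi')|\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^{\delta+\delta'}C(k)| .
\]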
In particular
\[
|D^\delta_{1,2}C(\xi,\xi')|\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta C(k)|
\tag{IV.4}
\]
and, for $j=0,1,\dots,d$,
\[
\mu^{d+2}\,|x_j-x'_j|\prod_{i=0}^{d}|x_i-x'_i|\ |D^\delta_{1,2}C(\xi,\xi')|\le\mu^{d+2}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^{\delta+\epsilon+\epsilon_j}C(k)|
\tag{IV.5$_j$}
\]
where $\epsilon=(1,1,\dots,1)$ and $\epsilon_j$ is the $j$th unit vector. Taking the geometric mean of (IV.5$_0$), …, (IV.5$_d$) on the left-hand side and the arithmetic mean on the right-hand side gives
\[
\mu^{d+2}\prod_{i=0}^{d}|x_i-x'_i|^{1+\frac{1}{d+1}}\ |D^\delta_{1,2}C(\xi,\xi')|\le\frac{\mu^{d+2}}{d+1}\sum_{j=0}^{d}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^{\delta+\epsilon+\epsilon_j}C(k)| .
\tag{IV.6}
\]
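The passage from the $d+1$ inequalities (IV.5$_j$) to (IV.6) is a geometric-mean/arithmetic-mean argument. The worked step below is added for illustration; it abbreviates the common factor $|D^\delta_{1,2}C(\xi,\xi')|$ by $A$ and the right-hand side of (IV.5$_j$) by $R_j$, both of which are shorthand introduced here rather than notation of the paper.

```latex
\[
\prod_{j=0}^{d}\Big(\mu^{d+2}\,|x_j-x'_j|\prod_{i=0}^{d}|x_i-x'_i|\ A\Big)^{\frac{1}{d+1}}
=\mu^{d+2}\Big(\prod_{j=0}^{d}|x_j-x'_j|\Big)^{\frac{1}{d+1}}\prod_{i=0}^{d}|x_i-x'_i|\ A
=\mu^{d+2}\prod_{i=0}^{d}|x_i-x'_i|^{1+\frac{1}{d+1}}\ A ,
\]
\[
\prod_{j=0}^{d}R_j^{\frac{1}{d+1}}\ \le\ \frac{1}{d+1}\sum_{j=0}^{d}R_j .
\]
```

Since each left-hand side of (IV.5$_j$) is bounded by $R_j$, the geometric mean of the left-hand sides is bounded by the geometric mean of the $R_j$, which in turn is bounded by their arithmetic mean.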
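Adding (IV.4) and (IV.6) gives
\[
\Big(1+\mu^{d+2}\prod_{i=0}^{d}|x_i-x'_i|^{1+\frac{1}{d+1}}\Big)\,|D^\delta_{1,2}C(\xi,\xi')|
\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta C(k)|+\frac{\mu^{d+2}}{d+1}\sum_{j=0}^{d}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^{\delta+\epsilon+\epsilon_j}C(k)| .
\tag{IV.7}
\]
Dividing across and using
\[
\int\frac{d^{d+1}x}{1+\mu^{d+2}\prod_{i=0}^{d}|x_i|^{1+\frac{1}{d+1}}}\le\mathrm{const}\ \frac{1}{\mu^{d+1}}
\]
we get
\[
|||D^\delta_{1,2}C(\xi,\xi')|||_{1,\infty}\le\mathrm{const}\ \Big(\frac{1}{\mu^{d+1}}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^\delta C(k)|+\frac{\mu}{d+1}\sum_{j=0}^{d}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,|D^{\delta+\epsilon+\epsilon_j}C(k)|\Big) .
\]
The contents of the bracket on the right-hand side are, up to a factor of $\frac{1}{\delta!}$, the coefficient of $t^\delta$ in $T_\mu\|C(k)\|\check{\ }_{1}$.

(ii) Denote by
\[
C(t,\mathbf{k})=\int\frac{dk_0}{2\pi}\,e^{-\imath k_0t}\,\frac{U(\mathbf{k})}{\imath k_0-e(\mathbf{k})}
=U(\mathbf{k})\,e^{-e(\mathbf{k})t}
\begin{cases}
-\chi(e(\mathbf{k})>0) & \text{if } t>0\\
\chi(e(\mathbf{k})<0) & \text{if } t\le 0
\end{cases}
\]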
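the partial Fourier transform of $C(k)$ in the $k_0$ direction. (As usual, the case $t=0$ is defined through the limit $t\to 0-$.) Then, for $|\delta|+|\delta'|\le r$,
\[
\begin{aligned}
|(\mathbf{x}-\mathbf{x}')^{\delta'}|\ |D^\delta_{1,2}C(\xi,\xi')|
&\le\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,\big|D^{\delta_0}_{1,2}D^{\delta+\delta'}C(t-t',\mathbf{k})\big|\\
&\le\mathrm{const}\int_{\operatorname{supp}U}d^d\mathbf{k}\ \big(|t-t'|^{\delta_0+|\delta|+|\delta'|}+|t-t'|^{\delta_0}\big)\,e^{-|e(\mathbf{k})(t-t')|}\\
&\le\mathrm{const}\int_{\operatorname{supp}U}d^d\mathbf{k}\ \Big[\frac{(\frac32)^{\delta_0+|\delta|+|\delta'|}\,(\delta_0+|\delta|+|\delta'|)!}{|e(\mathbf{k})|^{\delta_0+|\delta|+|\delta'|}}+\frac{(\frac32)^{\delta_0}\,\delta_0!}{|e(\mathbf{k})|^{\delta_0}}\Big]\,e^{-|e(\mathbf{k})(t-t')/3|}\\
&\le\mathrm{const}\int_{\operatorname{supp}U}d^d\mathbf{k}\ 2^{\delta_0}\,\delta_0!\ \frac{1}{|e(\mathbf{k})|^{|\delta|+|\delta'|}}\,e^{-|e(\mathbf{k})(t-t')/3|} .
\end{aligned}
\]
In particular, $|||C|||_\infty\le\mathrm{const}$ and
\[
\begin{aligned}
\int dt'\ |(\mathbf{x}-\mathbf{x}')^{\delta'}|\ |D^\delta_{1,2}C(\xi,\xi')|
&\le\mathrm{const}\ 2^{\delta_0}\delta_0!\int_{\operatorname{supp}U}d^d\mathbf{k}\ \frac{1}{|e(\mathbf{k})|^{|\delta|+|\delta'|+1}}\\
&\le\mathrm{const}\ 2^{\delta_0}\delta_0!
\begin{cases}
g_1 & \text{if }|\delta|+|\delta'|=0\\[1ex]
\dfrac{g_2}{\mu^{|\delta|+|\delta'|}} & \text{if }|\delta|+|\delta'|>0
\end{cases}\\
&\le\mathrm{const}\ \frac{2^{\delta_0}\delta_0!}{\mu^{|\delta'|}}
\begin{cases}
g_1 & \text{if }|\delta|=0\\[1ex]
\dfrac{g_2}{\mu^{|\delta|}} & \text{if }|\delta|>0
\end{cases}
\end{aligned}
\]
since $g_1\ge g_2$. As in Eqs. (IV.4)–(IV.7), choosing various $\delta'$'s with $|\delta'|=d+1$,
\[
\int dt'\ |D^\delta_{1,2}C(\xi,\xi')|\le\mathrm{const}\ 2^{\delta_0}\delta_0!\ \frac{1}{1+\mu^{d+1}\prod_{i=1}^{d}|x_i-x'_i|^{1+\frac1d}}\times
\begin{cases}
g_1 & \text{if }|\delta|=0\\[1ex]
\dfrac{g_2}{\mu^{|\delta|}} & \text{if }|\delta|>0 .
\end{cases}
\]
Integrating $\mathbf{x}'$ gives the desired bound on $|||D^\delta_{1,2}C|||_{1,\infty}$.

(iii) Write
\[
C(k)=C_1(k)-C_2(k)+C_3(k)
\]
with
\[
C_1(k)=\frac{U(\mathbf{k})}{\imath k_0-E},\qquad C_2(k)=\frac{\chi(k)}{\imath k_0-E},
\]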
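\[
C_3(k)=\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\,\big(U(\mathbf{k})-\chi(k)\big)
\]
and define the covariances $C_j$ by
\[
C_j(\xi,\xi')=
\begin{cases}
\displaystyle\delta_{\sigma,\sigma'}\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{\imath\langle k,x-x'\rangle_-}\,C_j(k) & \text{if } a=0,\ a'=1\\[2ex]
0 & \text{if } a=a'\\
-C_j(\xi',\xi) & \text{if } a=1,\ a'=0
\end{cases}
\]
for $j=1,2,3$. For $a=0$, $a'=1$
\[
C_1\big((x_0,\mathbf{x},\sigma,a),(x'_0,\mathbf{x}',\sigma',a')\big)=-\delta_{\sigma,\sigma'}\int\frac{d^d\mathbf{k}}{(2\pi)^d}\,e^{\imath\langle\mathbf{k},\mathbf{x}-\mathbf{x}'\rangle}\,U(\mathbf{k})
\begin{cases}
e^{-E(x_0-x'_0)} & \text{if } x_0>x'_0\\
0 & \text{if } x_0\le x'_0
\end{cases}
\]
and, for $|\boldsymbol{\delta}|\le r$, $|\delta_0|\le r_0$,
\[
|||D^\delta_{1,2}C_1|||_\infty\le\frac{\mathrm{const}}{E^{\delta_0}}\,\delta_0!\le\mathrm{const},\qquad
|||D^\delta_{1,2}C_1|||_{1,\infty}\le\frac{\mathrm{const}}{E^{\delta_0+1}}\,\delta_0!\le\mathrm{const} .
\]
By Remark IV.7
\[
\|C_2(k)\|\check{\ }_{1}\le\|\chi(k)\|\check{\ }_{1}\ \Big\|\frac{1}{\imath k_0-E}\Big\|\check{\ }_{\infty}\le\|\chi(k)\|\check{\ }_{1}\ \Big(\sum_{n=0}^{\infty}\frac{1}{E^{n+1}}\,t_0^n\Big)
\]
so that, for $|\delta_0|\le r_0-2$ and $|\boldsymbol{\delta}|\le r-d-1$,
\[
|||D^\delta_{1,2}C_2|||_\infty\le\mathrm{const},\qquad |||D^\delta_{1,2}C_2|||_{1,\infty}\le\mathrm{const}
\]
by part (i). We now bound $C_3$. Let $B$ be the support of $U(\mathbf{k})-\chi(k)$. On $B$, $|\imath k_0-e(\mathbf{k})|\ge\mu>0$ and $|e(\mathbf{k})|\le E$, so we have, for $\delta=(\delta_0,\boldsymbol{\delta})\ne 0$ with $|\boldsymbol{\delta}|\le r$ and $\delta_0\le r_0$,
\[
\Big|D^\delta\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\Big|
\le\mathrm{const}\ \frac{E}{|\imath k_0-E|}\Big(\frac{1}{|\imath k_0-e(\mathbf{k})|^{|\delta|+1}}+\frac{1}{|\imath k_0-e(\mathbf{k})|}\Big)
\le\mathrm{const}\ \frac{E}{\mu^{|\delta|}\,|\imath k_0-E|\,|\imath k_0-\mu|} .
\]
Integrating
\[
\frac{1}{\delta!}\int_B\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\Big|D^\delta\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\Big|\le\mathrm{const} .
\]
It follows that
\[
\Big\|\frac{e(\mathbf{k})-E}{(\imath k_0-e(\mathbf{k}))(\imath k_0-E)}\Big\|\check{\ }_{1,B}\le\mathrm{const}\ \Big(\sum_{\substack{|\boldsymbol{\delta}|\le r\\ |\delta_0|\le r_0}}t^\delta+\sum_{\substack{|\boldsymbol{\delta}|>r\\ \text{or }|\delta_0|>r_0}}\infty\,t^\delta\Big)
\]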
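and, by Remark IV.7, that
\[
\|C_3(k)\|\check{\ }_{1}\le\mathrm{const}\ \Big(\sum_{\substack{|\boldsymbol{\delta}|\le r\\ |\delta_0|\le r_0}}t^\delta+\sum_{\substack{|\boldsymbol{\delta}|>r\\ \text{or }|\delta_0|>r_0}}\infty\,t^\delta\Big)\big(\|U(\mathbf{k})\|\check{\ }_{\infty}+\|\chi(k)\|\check{\ }_{\infty}\big) .
\]
By part (i) of this proposition and the previous bounds on $C_1$ and $C_2$, this concludes the proof of part (iii).

Corollary IV.9. Under the hypotheses of Proposition IV.8(ii), the $(d+1)$-dimensional norm
\[
\|C\|_{1,\infty}\le\frac{\mathrm{const}}{\mu^d}\Big(g_1+g_2\sum_{\substack{|\delta|\ge 1\\ |\boldsymbol{\delta}|\le r-d-1}}\Big(\frac{2}{\mu}\Big)^{|\delta|}t^\delta+\sum_{|\boldsymbol{\delta}|\ge r-d}\infty\,t^\delta\Big)
\le\frac{\mathrm{const}}{\mu^d}\,g_1\Big(\sum_{|\boldsymbol{\delta}|\le r-d-1}\Big(\frac{2}{\mu}\Big)^{|\delta|}t^\delta+\sum_{|\boldsymbol{\delta}|\ge r-d}\infty\,t^\delta\Big) .
\]
Under the hypotheses of Proposition IV.8(iii)
\[
\|C\|_{1,\infty}\le\mathrm{const}\ \Big(\sum_{\substack{|\boldsymbol{\delta}|\le r-d-1\\ |\delta_0|\le r_0-2}}t^\delta+\sum_{\substack{|\boldsymbol{\delta}|>r-d-1\\ \text{or }|\delta_0|>r_0-2}}\infty\,t^\delta\Big) .
\]

In the renormalization group analysis we shall add a counterterm $\delta e(\mathbf{k})$ to the dispersion relation $e(\mathbf{k})$. For such a counterterm, we define the Fourier transform$^{e}$
\[
\delta\hat e(\xi,\xi')=\delta_{\sigma,\sigma'}\,\delta_{a,a'}\,\delta(x_0-x'_0)\int e^{(-1)^a\imath\,\mathbf{k}\cdot(\mathbf{x}-\mathbf{x}')}\,\delta e(\mathbf{k})\,\frac{d^d\mathbf{k}}{(2\pi)^d}
\]
for $\xi=(x,a)=(x_0,\mathbf{x},\sigma,a)$, $\xi'=(x',a')=(x'_0,\mathbf{x}',\sigma',a')\in B$.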
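Definition IV.10. Fix $r_0$ and $r$. Let
\[
c_0=\sum_{\substack{|\boldsymbol{\delta}|\le r\\ |\delta_0|\le r_0}}t^\delta+\sum_{\substack{|\boldsymbol{\delta}|>r\\ \text{or }|\delta_0|>r_0}}\infty\,t^\delta\ \in\mathfrak{N}_{d+1} .
\]
The map $e_0(X)=\frac{c_0}{1-X}$ from $X\in\mathfrak{N}_{d+1}$ with $X_0<1$ to $\mathfrak{N}_{d+1}$ is used to implement the differentiability properties of various kernels depending on a counterterm whose norm is bounded by $X$.

Proposition IV.11. Let
\[
C(k)=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})+\delta e(\mathbf{k})},\qquad
C_0(k)=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})}
\]

$^{e}$ A comprehensive set of Fourier transform conventions is formulated in Sec. IX.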
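with real valued functions $U(\mathbf{k})$, $e(\mathbf{k})$, $\delta e(\mathbf{k})$ on $\mathbb{R}^d$ and $\chi(k)$ on $\mathbb{R}\times\mathbb{R}^d$ that fulfill the following conditions: The function $e(\mathbf{k})$ is $r+d+1$ times differentiable. $|\imath k_0-e(\mathbf{k})|\ge\mu_e>0$ for all $k=(k_0,\mathbf{k})$ in the support of $U(\mathbf{k})-\chi(k)$. The function $U(\mathbf{k})$ is smooth and has compact support. The function $\chi(k)$ is smooth and has compact support and $0\le\chi(k)\le U(\mathbf{k})\le 1$ for all $k=(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d$. The function $\delta e(\mathbf{k})$ obeys
\[
\|\delta\hat e\|_{1,\infty}<\mu+\sum_{\delta\ne 0}\infty\,t^\delta .
\]
Then, there is a constant $\mu_1>0$ such that if $\mu<\mu_1$, the following hold.

(i) $C$ is an analytic function of $\delta e$ and
\[
|||C|||_\infty\le\mathrm{const},\qquad |||C-C_0|||_\infty\le\mathrm{const}\ |||\delta\hat e|||_{1,\infty}
\]
and
\[
\|C\|_{1,\infty}\le\mathrm{const}\ e_0(\|\delta\hat e\|_{1,\infty}),\qquad
\|C-C_0\|_{1,\infty}\le\mathrm{const}\ e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e\|_{1,\infty} .
\]
(ii) Let
\[
C_s(k)=\frac{U(\mathbf{k})-\chi(k)}{\imath k_0-e(\mathbf{k})+\delta e(\mathbf{k})+s\,\delta e'(\mathbf{k})} .
\]
Then
\[
\Big|\Big|\Big|\frac{d}{ds}C_s\Big|_{s=0}\Big|\Big|\Big|_\infty\le\mathrm{const}\ |||\delta\hat e'|||_{1,\infty},\qquad
\Big\|\frac{d}{ds}C_s\Big|_{s=0}\Big\|_{1,\infty}\le\mathrm{const}\ e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e'\|_{1,\infty} .
\]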
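Proof. (i) The first bound follows from (IV.2), by replacing $e$ by $e-\delta e$. Select a smooth, compactly supported function $\tilde U(\mathbf{k})$ and a smooth compactly supported function $\tilde\chi(k)$ such that $0\le\tilde\chi(k)\le\tilde U(\mathbf{k})\le 1$ for all $k=(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d$, $\tilde U(\mathbf{k})-\tilde\chi(k)$ is identically 1 on the support of $U(\mathbf{k})-\chi(k)$ and $|\imath k_0-e(\mathbf{k})|\ge\frac12\mu_e$ for all $k=(k_0,\mathbf{k})$ in the support of $\tilde U(\mathbf{k})-\tilde\chi(k)$. Let
\[
\tilde C_0(k)=\frac{\tilde U(\mathbf{k})-\tilde\chi(k)}{\imath k_0-e(\mathbf{k})} .
\]
Then
\[
C(k)=\frac{C_0(k)}{1+\frac{\delta e(\mathbf{k})}{\imath k_0-e(\mathbf{k})}}
=\frac{C_0(k)}{1+\frac{\delta e(\mathbf{k})\,(\tilde U(\mathbf{k})-\tilde\chi(k))}{\imath k_0-e(\mathbf{k})}}
=\frac{C_0(k)}{1+\delta e(\mathbf{k})\,\tilde C_0(k)}
=C_0(k)\sum_{n=0}^{\infty}\big(-\delta e(\mathbf{k})\,\tilde C_0(k)\big)^n .
\]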
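The resummation of $C(k)$ as a geometric series in $\delta e\,\tilde C_0$ can be checked numerically at a single momentum. The sketch below is an illustration under assumptions: it works at a point $k$ where $\tilde U-\tilde\chi=1$, treats $U-\chi$ there as a given number, and uses hypothetical sample values for $k_0$, $e$ and $\delta e$ chosen so that $|\delta e\,\tilde C_0|<1$.

```python
def geometric_check(k0, e, delta_e, u_minus_chi=1.0, terms=60):
    """Compare C(k) with the partial sum of C_0 * sum_n (-delta_e * C0_tilde)^n at one point k."""
    denom0 = 1j * k0 - e
    C0 = u_minus_chi / denom0
    C0_tilde = 1.0 / denom0                  # assumes U~ - chi~ = 1 at this k
    exact = u_minus_chi / (denom0 + delta_e)
    partial = C0 * sum((-delta_e * C0_tilde) ** n for n in range(terms))
    return exact, partial

# Hypothetical sample point: |delta_e * C0_tilde| is about 0.19, so the series converges quickly.
exact, partial = geometric_check(k0=0.3, e=1.0, delta_e=0.2, u_minus_chi=0.7)
```

The ratio of successive terms is bounded by $|\delta e|/|\imath k_0-e|$, which is the pointwise analogue of the smallness condition on $\|\delta\hat e\|_{1,\infty}$ used in the norm estimates.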
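Then, by iterated application of Lemma II.7 and the second part of Corollary IV.9, with $r$ replaced by $r+d+1$ and $r_0$ replaced by $r_0+2$,
\[
\|C\|_{1,\infty}\le\|C_0\|_{1,\infty}\sum_{n=0}^{\infty}\big(\|\delta\hat e\|_{1,\infty}\,\|\tilde C_0\|_{1,\infty}\big)^n
\le\mathrm{const}\ c_0\sum_{n=0}^{\infty}\big(\mathrm{const}'\,c_0\,\|\delta\hat e\|_{1,\infty}\big)^n
=\mathrm{const}\ \frac{c_0}{1-\mathrm{const}'\,c_0\,\|\delta\hat e\|_{1,\infty}} .
\]
If $\mu_1<\min\{\frac{1}{2\,\mathrm{const}'},1\}$, then, by Corollary A.5(i), with $\Delta=\{\delta\in\mathbb{N}^{d+1}\mid|\boldsymbol{\delta}|\le r,\ |\delta_0|\le r_0\}$, $\mu=\mathrm{const}'$, $\Lambda=1$ and $X=\|\delta\hat e\|_{1,\infty}$,
\[
\|C\|_{1,\infty}\le\mathrm{const}\ \frac{c_0}{1-\|\delta\hat e\|_{1,\infty}} .
\]
Similarly
\[
\begin{aligned}
\|C-C_0\|_{1,\infty}&\le\|C_0\|_{1,\infty}\sum_{n=1}^{\infty}\big(\|\delta\hat e\|_{1,\infty}\,\|\tilde C_0\|_{1,\infty}\big)^n
\le\mathrm{const}\ c_0\sum_{n=1}^{\infty}\big(\mathrm{const}'\,c_0\,\|\delta\hat e\|_{1,\infty}\big)^n\\
&\le\mathrm{const}\ \frac{c_0^2\,\|\delta\hat e\|_{1,\infty}}{1-\mathrm{const}'\,c_0\,\|\delta\hat e\|_{1,\infty}}
\le\mathrm{const}\ \frac{c_0\,\|\delta\hat e\|_{1,\infty}}{1-\|\delta\hat e\|_{1,\infty}}
\end{aligned}
\]
and
\[
\begin{aligned}
|||C-C_0|||_\infty&\le|||C_0|||_\infty\sum_{n=1}^{\infty}\big(|||\delta\hat e|||_{1,\infty}\,|||\tilde C_0|||_{1,\infty}\big)^n
\le\mathrm{const}\sum_{n=1}^{\infty}\big(\mathrm{const}'\,|||\delta\hat e|||_{1,\infty}\big)^n\\
&\le\mathrm{const}\ \frac{|||\delta\hat e|||_{1,\infty}}{1-\mathrm{const}'\,\mu}
\le\mathrm{const}\ |||\delta\hat e|||_{1,\infty} .
\end{aligned}
\]
(ii) As
\[
\frac{d}{ds}C_s(k)\Big|_{s=0}=-\frac{U(\mathbf{k})-\chi(k)}{[\imath k_0-e(\mathbf{k})+\delta e(\mathbf{k})]^2}\,\delta e'(\mathbf{k})
\]
the first bound is a consequence of Proposition IV.8(i).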
Let $\tilde U(\mathbf{k})$ and $\tilde\chi(k)$ be as in part (i) and set
\[
\tilde C(k)=\frac{\tilde U(\mathbf{k})-\tilde\chi(k)}{\imath k_0-e(\mathbf{k})+\delta e(\mathbf{k})} .
\]
Then
\[
\frac{d}{ds}C_s(k)\Big|_{s=0}=-C(k)\,\tilde C(k)\,\delta e'(\mathbf{k})
\]
and
\[
\Big\|\frac{d}{ds}C_s(k)\Big|_{s=0}\Big\|_{1,\infty}\le\|C\|_{1,\infty}\,\|\tilde C\|_{1,\infty}\,\|\delta\hat e'\|_{1,\infty}
\le\mathrm{const}\ e_0(\|\delta\hat e\|_{1,\infty})^2\,\|\delta\hat e'\|_{1,\infty}
\le\mathrm{const}\ e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e'\|_{1,\infty}
\]
by Corollary A.5(ii).

V. Insulators

An insulator is a many fermion system as described in the introduction, for which the dispersion relation $e(\mathbf{k})$ does not have a zero on the support of the ultraviolet cutoff $U(\mathbf{k})$. We may assume that there is a constant $\mu>0$ such that $e(\mathbf{k})\ge\mu$ for all $\mathbf{k}\in\mathbb{R}^d$. We shall show in Theorem V.2 that for a sufficiently small coupling constant the Green's functions for the interacting system exist and differ by very little from the Green's functions of the noninteracting system in the supremum norm.

Lemma V.1. Let $\rho_{m;n}$ be a sequence of nonnegative real numbers such that $\rho_{m;n'}\le\rho_{m;n}$ for $n'\le n$. Define for $f\in\mathcal{F}_m(n)$
\[
\|f\|=\rho_{m;n}\,\|f\|_{1,\infty}
\]
where $\|f\|_{1,\infty}$ is the $L_1$–$L_\infty$-norm introduced in Example II.6.
(i) The seminorms $\|\cdot\|$ are symmetric.
(ii) For a covariance $C$, let $S(C)$ be the quantity introduced in Definition IV.1. Then $2S(C)$ is an integral bound for the covariance $C$ with respect to the family of seminorms $\|\cdot\|$.
(iii) Let $C$ be a covariance. Assume that for all $m\ge 0$ and $n,n'\ge 1$
\[
\rho_{m;\,n+n'-2}\le\rho_{m;n}\,\rho_{0;n'} .
\]
Let $c\in\mathfrak{N}_{d+1}$ obey
\[
c\ge\|C\|_{1,\infty},\qquad
c_0\ge\frac{\rho_{m+m';\,n+n'-2}}{\rho_{m;n}\,\rho_{m';n'}}\ |||C|||_\infty\quad\text{for all } m,m',n,n'\ge 1
\]
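where $c_0$ is the constant coefficient of the formal power series $c$. Then $c$ is a contraction bound for the covariance $C$ with respect to the family of seminorms $\|\cdot\|$.

Proof. Parts (i) and (ii) are trivial. To prove part (iii), let $f\in\mathcal{F}_m(n)$, $f'\in\mathcal{F}_{m'}(n')$ and $1\le i\le n$, $1\le j\le n'$. Set
\[
\begin{aligned}
g(\eta_1,\dots,\eta_{m+m'};\,&\xi_1,\dots,\xi_{i-1},\xi_{i+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+j-1},\xi_{n+j+1},\dots,\xi_{n+n'})\\
&=\int d\zeta\,d\zeta'\ f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{i-1},\zeta,\xi_{i+1},\dots,\xi_n)\,C(\zeta,\zeta')\\
&\qquad\times f'(\eta_{m+1},\dots,\eta_{m+m'};\xi_{n+1},\dots,\xi_{n+j-1},\zeta',\xi_{n+j+1},\dots,\xi_{n+n'}) .
\end{aligned}
\]
Then
\[
\mathop{\mathrm{Con}_C}_{i\to n+j}\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')=\mathrm{Ant}_{\mathrm{ext}}\,g
\]
and therefore
\[
\Big\|\mathop{\mathrm{Con}_C}_{i\to n+j}\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\Big\|\le\|g\| .
\]
If $m,m'\ge 1$
\[
\|g\|_{1,\infty}\le\|f\|_{1,\infty}\,|||C|||_\infty\,\|f'\|_{1,\infty}
\]
and consequently
\[
\Big\|\mathop{\mathrm{Con}_C}_{i\to n+j}\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\Big\|\le\rho_{m+m';\,n+n'-2}\,|||C|||_\infty\,\|f\|_{1,\infty}\,\|f'\|_{1,\infty}
\le c_0\,\rho_{m;n}\|f\|_{1,\infty}\,\rho_{m';n'}\|f'\|_{1,\infty}\le c\,\|f\|\,\|f'\| .
\]
If $m=0$ or $m'=0$, by iterated application of Lemma II.7
\[
\|g\|_{1,\infty}\le\Big\|\int_B d\zeta\,f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_{i-1},\zeta,\xi_{i+1},\dots,\xi_n)\,C(\zeta,\zeta')\Big\|_{1,\infty}\,\|f'\|_{1,\infty}
\le\|f\|_{1,\infty}\,\|C\|_{1,\infty}\,\|f'\|_{1,\infty}
\]
and again
\[
\Big\|\mathop{\mathrm{Con}_C}_{i\to n+j}\mathrm{Ant}_{\mathrm{ext}}(f\otimes f')\Big\|\le c\,\|f\|\,\|f'\| .
\]

To formulate the result about insulators, we define for a function $f(x_1,\dots,x_n)$ on $(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^n$ the $L_1$–$L_\infty$-norm as in Example II.6 to be
\[
|||f|||_{1,\infty}=\max_{1\le j\le n}\ \sup_{x_j}\int\prod_{\substack{i=1\\ i\ne j}}^{n}dx_i\,|f(x_1,\dots,x_n)| .
\]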
J. Feldman, H. Knörrer & E. Trubowitz
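Theorem V.2 (Insulators). Let $r$ and $r_0$ be natural numbers. Let $e(\mathbf{k})$ be a dispersion relation on $\mathbb{R}^d$ that is at least $r+d+1$ times differentiable, and let $U(\mathbf{k})$ be a compactly supported, smooth ultraviolet cutoff on $\mathbb{R}^d$. Assume that there is a constant $0 < \mu < \frac12$ such that
\[ e(\mathbf{k}) \ge \mu \quad \text{for all } \mathbf{k} \text{ in the support of } U . \]
Set
\[ g = \int_{\mathrm{supp}\, U} d^d\mathbf{k}\ \frac{1}{|e(\mathbf{k})|}, \qquad \gamma = \max\Big\{ 1,\ \sqrt{ \int d^d\mathbf{k}\ U(\mathbf{k}) \log\frac{E}{|e(\mathbf{k})|} } \Big\} \]
where $E = \max\{1, \sup_{\mathbf{k}\in\mathrm{supp}\, U} |e(\mathbf{k})|\}$. Let, for $x = (x_0,\mathbf{x},\sigma)$, $x' = (x_0',\mathbf{x}',\sigma') \in \mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}$,
\[ C(x,x') = \delta_{\sigma,\sigma'} \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\ e^{\imath\langle k,\, x-x'\rangle_-}\ \frac{U(\mathbf{k})}{\imath k_0 - e(\mathbf{k})} \]
and set, for $\xi = (x,a)$, $\xi' = (x',a') \in \mathcal{B}$,
\[ C(\xi,\xi') = C(x,x')\,\delta_{a,0}\,\delta_{a',1} - C(x',x)\,\delta_{a,1}\,\delta_{a',0} . \]
Furthermore let
\[ \mathcal{V}(\psi,\bar\psi) = \int_{(\mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\})^4} dx_1\, dy_1\, dx_2\, dy_2\ V_0(x_1,y_1,x_2,y_2)\ \bar\psi(x_1)\psi(y_1)\bar\psi(x_2)\psi(y_2) \]
be a two particle interaction with a kernel $V_0$ that is antisymmetric in the variables $x_1, x_2$ and $y_1, y_2$ separately. Set
\[ \upsilon = \sup_{\substack{D \text{ decay operator} \\ \text{with } \delta_0 \le r_0,\ |\delta| \le r}} \mu^{|\delta(D)|}\ |||D V_0|||_{1,\infty} . \]
Then there exist $\varepsilon > 0$ and a constant const such that
(i) If $|||V_0|||_{1,\infty} \le \frac{\varepsilon \mu^d}{g\gamma^2}$, the connected amputated Green's functions $G^{amp}_{2n}(x_1,y_1,\dots,x_n,y_n)$ exist in the space of all functions on $(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^{2n}$ with finite $|||\cdot|||_{1,\infty}$ norms. They are analytic functions of $V_0$.
(ii) Suppose that $\upsilon \le \frac{\varepsilon \mu^d}{g\gamma^2}$. For all decay operators $D$ with $\delta_0(D) \le r_0$ and $|\delta(D)| \le r$
\[ |||D G^{amp}_{2n}|||_{1,\infty} \le \mathrm{const}^n\ \frac{g\,\gamma^{6-2n}}{\mu^{d+|\delta(D)|}}\ \upsilon^2 \quad \text{if } n \ge 3 \]
\[ |||D (G^{amp}_{4} - V_0)|||_{1,\infty} \le \mathrm{const}\ \frac{g\,\gamma^{2}}{\mu^{d+|\delta(D)|}}\ \upsilon^2 \]
\[ |||D (G^{amp}_{2} - K)|||_{1,\infty} \le \mathrm{const}\ \frac{g\,\gamma^{4}}{\mu^{d+|\delta(D)|}}\ \upsilon^2 \]
where
\[ K(x,y) = 4 \int dx'\, dy'\ V_0(x,y,x',y')\, C(x',y') . \]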
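The constants $\varepsilon$ and const depend on $r$, $r_0$, $U$, and the suprema of the $\mathbf{k}$-derivatives of the dispersion relation $e(\mathbf{k})$ up to order $r+d+1$, but not on $\mu$ or $V_0$.

Proof. By (I.6), the generating functional for the connected amputated Green's functions is
\[ \mathcal{G}^{amp}_{gen}(\phi) = \Omega_C(\mathcal{V})(0,\phi) . \]
To estimate it, we use the norms $\|\cdot\|$ of Lemma V.1 with $\rho_{m;n} = 1$. By Lemma V.1(ii) and Proposition IV.5, there is a constant $\mathrm{const}_0$ such that $b = \mathrm{const}_0\,\gamma$ is an integral bound for the covariance $C$ with respect to these norms. By Lemma V.1(iii), Corollary IV.9 and Proposition IV.8(ii), there is a constant $\mathrm{const}_1$ such that
\[ c = \mathrm{const}_1\, \frac{g}{\mu^d} \sum_{|\delta|\le r} \Big(\frac{2}{\mu}\Big)^{|\delta|} t^\delta + \sum_{|\delta|>r} \infty\, t^\delta \]
is a contraction bound for $C$ with respect to these norms. Here we used that $\frac{g}{\mu^d}$ is bounded below by a nonzero constant. As in Definition III.9, we set for any Grassmann function $\mathcal{W}(\phi,\psi)$ and any $\alpha > 0$
\[ N(\mathcal{W}; c, b, \alpha) = \frac{1}{b^2}\, c \sum_{m,n\ge 0} \alpha^n b^n\, \|\mathcal{W}_{m,n}\| . \]
In particular
\[ N(\mathcal{V}; c, b, \alpha) = \alpha^4 b^2 c\, \|V_0\|_{1,\infty} \quad\text{and}\quad N(\mathcal{V}; c, b, 8\alpha)_0 \le \mathrm{const}_3\, \frac{8^4 \alpha^4 \gamma^2 g}{\mu^d}\, |||V_0|||_{1,\infty} . \qquad \text{(V.1)} \]
Observe that
\[ c\, \|V_0\|_{1,\infty} \le \mathrm{const}_1\, \frac{g}{\mu^d} \Big( \sum_{|\delta|\le r} \Big(\frac{2}{\mu}\Big)^{|\delta|} t^\delta + \sum_{|\delta|>r} \infty\, t^\delta \Big) \times \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\delta!}\, \frac{\upsilon}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) \]
\[ \le \mathrm{const}_2\, \frac{g\upsilon}{\mu^d} \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) . \]
Write $\mathcal{V} = \, :\mathcal{V}':_C$. By [8, Proposition A.2(i)],
\[ \mathcal{V}' = \mathcal{V} + \int dx\, dy\ K(x,y)\, \bar\psi(x)\psi(y) + \mathrm{const} \]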
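and by [8, Corollary II.32(i)]
\[ N(\mathcal{V}'; c, b, \alpha) \le N(\mathcal{V}; c, b, 2\alpha) = 16\,\alpha^4 b^2 c\, \|V_0\|_{1,\infty} \le \alpha^4 \gamma^2\, \mathrm{const}_3\, \frac{g\upsilon}{\mu^d} \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) . \]
We set $\alpha = 2$ and $\varepsilon = \frac{1}{2^{17}\, \mathrm{const}_3}$. Then
\[ N(\mathcal{V}'; c, b, 16) \le \frac{g\gamma^2\upsilon}{2\varepsilon\mu^d} \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) \]
and, by (V.1),
\[ N(\mathcal{V}'; c, b, 16)_0 \le \frac{g\gamma^2}{2\varepsilon\mu^d}\, |||V_0|||_{1,\infty} . \]
Therefore, whenever $|||V_0|||_{1,\infty} \le \frac{\varepsilon\mu^d}{g\gamma^2}$, $\mathcal{V}'$ fulfills the hypotheses of Theorem III.10 and $\mathcal{G}^{amp}_{gen}(\psi) = \Omega_C(:\mathcal{V}':)(0,\psi)$ exists. Part (i) follows.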
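If, in addition, $\upsilon \le \frac{\varepsilon\mu^d}{g\gamma^2}$, then
\[ N(\mathcal{G}^{amp}_{gen} - \mathcal{V}'; c, b, 2) \le \frac12\, \frac{N(\mathcal{V}'; c, b, 16)^2}{1 - N(\mathcal{V}'; c, b, 16)} \le \frac12 \Big( \frac{g\gamma^2\upsilon}{2\varepsilon\mu^d} \Big)^2 f\Big(\frac{t}{\mu}\Big) \]
where
\[ f(t) = \frac{ \Big( \sum_{|\delta|\le r,\ |\delta_0|\le r_0} t^\delta + \sum_{|\delta|>r \text{ or } |\delta_0|>r_0} \infty\, t^\delta \Big)^2 }{ 1 - \frac12 \Big( \sum_{|\delta|\le r,\ |\delta_0|\le r_0} t^\delta + \sum_{|\delta|>r \text{ or } |\delta_0|>r_0} \infty\, t^\delta \Big) } = \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} F_\delta\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \]
with $F_\delta$ finite for all $|\delta| \le r$, $|\delta_0| \le r_0$. Hence
\[ N(\mathcal{G}^{amp}_{gen} - \mathcal{V}'; c, b, 2) \le \mathrm{const}_4 \Big( \frac{g\gamma^2\upsilon}{\varepsilon\mu^d} \Big)^2 \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) \]
with $\mathrm{const}_4 = \frac18 \max_{|\delta|\le r,\ |\delta_0|\le r_0} F_\delta$. As
\[ N(\mathcal{G}^{amp}_{gen} - \mathcal{V}'; c, b, 2) = c \Big( 4\, \|G^{amp}_2 - K\| + 16\, b^2\, \|G^{amp}_4 - V_0\| + \sum_{n=3}^{\infty} 4\, (2b)^{2n-2}\, \|G^{amp}_{2n}\| \Big) \]
\[ \ge 4\, \mathrm{const}_1\, \frac{g}{\mu^d} \Big( \|G^{amp}_2 - K\|_{1,\infty} + 4\, \mathrm{const}_0^2\, \gamma^2\, \|G^{amp}_4 - V_0\|_{1,\infty} + \sum_{n=3}^{\infty} (\mathrm{const}_0\, \gamma)^{2n-2}\, \|G^{amp}_{2n}\|_{1,\infty} \Big) \]
we have
\[ \|G^{amp}_2 - K\|_{1,\infty} + 4\, \mathrm{const}_0^2\, \gamma^2\, \|G^{amp}_4 - V_0\|_{1,\infty} + \sum_{n=3}^{\infty} (\mathrm{const}_0\, \gamma)^{2n-2}\, \|G^{amp}_{2n}\|_{1,\infty} \le \frac{\mathrm{const}_4}{4\, \mathrm{const}_1\, \varepsilon^2}\, \frac{g\gamma^4\upsilon^2}{\mu^d} \Big( \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} \frac{1}{\mu^{|\delta|}}\, t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \Big) . \]
The estimates on the amputated Green's functions follow.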
Remark V.3. (i) In reasonable situations, for example if the gradient of $e(\mathbf{k})$ is bounded below, the constants $\gamma$ and $g$ in Theorem V.2 are of order one and $\log\mu$ respectively.
(ii) Using Example A.3, one may prove an analog of Theorem V.2 with the constants $\varepsilon$ and const independent of $r_0$ and
\[ |||D G^{amp}_{2n}|||_{1,\infty} \le \mathrm{const}^n\ \delta(D)!\ g\,\gamma^{6-2n} \Big( \frac{8(d+1)}{\mu} \Big)^{d+|\delta(D)|} \upsilon^2 \quad \text{if } n \ge 3 \]
\[ |||D (G^{amp}_{4} - V_0)|||_{1,\infty} \le \mathrm{const}\ \delta(D)!\ g\,\gamma^{2} \Big( \frac{8(d+1)}{\mu} \Big)^{d+|\delta(D)|} \upsilon^2 \]
\[ |||D (G^{amp}_{2} - K)|||_{1,\infty} \le \mathrm{const}\ \delta(D)!\ g\,\gamma^{4} \Big( \frac{8(d+1)}{\mu} \Big)^{d+|\delta(D)|} \upsilon^2 . \]
(iii) Roughly speaking, the connected Green's functions are constructed from the connected amputated Green's functions by appending propagators $C$. The details are given in Sec. VI in Part 2. Using Proposition IV.8(ii), one sees that, under the hypotheses of Theorem V.2(i), the connected Green's functions exist in the space of all functions on $(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^{2n}$ with finite $|||\cdot|||_{1,\infty}$ and $|||\cdot|||_{\infty}$ norms.

Appendices

A. Calculations in the norm domain

Recall from Definition II.4 that the $(d+1)$-dimensional norm domain $\mathfrak{N}_{d+1}$ is the set of all formal power series
\[ X = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} X_\delta\ t_0^{\delta_0} t_1^{\delta_1} \cdots t_d^{\delta_d} = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d} X_\delta\ t^\delta \]
in the variables $t_0, t_1, \dots, t_d$ with coefficients $X_\delta \in \mathbb{R}_+ \cup \{\infty\}$.

Definition A.1. A nonempty subset $\Delta$ of $\mathbb{N}_0\times\mathbb{N}_0^d$ is called saturated if, for every $\delta \in \Delta$ and every multi-index $\delta'$ with $\delta' \le \delta$, the multi-index $\delta'$ also lies in $\Delta$. If $\Delta$ is a finite set, then
\[ N(\Delta) = \min\{ n \in \mathbb{N} \mid n\delta \notin \Delta \text{ for all } 0 \ne \delta \in \Delta \} \]
is finite. For example, if $r, r_0 \in \mathbb{N}$ then the set $\{\delta \in \mathbb{N}_0\times\mathbb{N}_0^d \mid \delta_0 \le r_0,\ |\delta| \le r\}$ is saturated and $N(\Delta) = \max\{r_0+1, r+1\}$.

Lemma A.2. Let $\Delta$ be a saturated set of multi-indices and $X, Y \in \mathfrak{N}_{d+1}$. Furthermore, let $f(t_0,\dots,t_d)$ and $g(t_0,\dots,t_d)$ be analytic functions in a neighborhood of the origin in $\mathbb{C}^{d+1}$ such that, for all $\delta \in \Delta$, the $\delta$th Taylor coefficients of $f$ and $g$ at the origin are real and nonnegative. Assume that $g(0) < 1$ and that, for all $\delta \in \Delta$,
\[ X_\delta \le \frac{1}{\delta!} \Big( \prod_{i=0}^{d} \frac{\partial^{\delta_i}}{\partial t_i^{\delta_i}} \Big) f(t_0,\dots,t_d) \Big|_{t_0=\cdots=t_d=0}, \qquad Y_\delta \le \frac{1}{\delta!} \Big( \prod_{i=0}^{d} \frac{\partial^{\delta_i}}{\partial t_i^{\delta_i}} \Big) g(t_0,\dots,t_d) \Big|_{t_0=\cdots=t_d=0} . \]
Set $Z = \frac{X}{1-Y}$ and $h(t) = \frac{f(t)}{1-g(t)}$. Then, for all $\delta \in \Delta$,
\[ Z_\delta \le \frac{1}{\delta!} \Big( \prod_{i=0}^{d} \frac{\partial^{\delta_i}}{\partial t_i^{\delta_i}} \Big) h(t_0,\dots,t_d) \Big|_{t_0=\cdots=t_d=0} . \]
Proof. Trivial.
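Example A.3. Let $\Delta$ be a saturated set and $a \ge 0$, $0 \le \lambda \le \frac12$. Then
\[ \frac{ \Big( \sum_{\delta\in\Delta} a^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta \Big)^2 }{ 1 - \lambda \Big( \sum_{\delta\in\Delta} a^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta \Big) } \le \frac{16}{3} \sum_{\delta\in\Delta} \big( 4(d+1)\, a \big)^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta . \]
Proof. Set
\[ X = \Big( \sum_{\delta\in\Delta} a^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta \Big)^2, \qquad f(t) = \Big( \sum_{\delta} a^{|\delta|} t^\delta \Big)^2 = \prod_{i=0}^{d} \frac{1}{(1 - a t_i)^2} \]
\[ Y = \lambda \Big( \sum_{\delta\in\Delta} a^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta \Big), \qquad g(t) = \lambda \sum_{\delta} a^{|\delta|} t^\delta = \lambda \prod_{i=0}^{d} \frac{1}{1 - a t_i} . \]
Set
\[ h(t) = \frac{f(t)}{1 - g(t)} = \frac{1}{\prod_i (1 - a t_i)}\ \frac{1}{\prod_i (1 - a t_i) - \lambda} . \]
By the Cauchy integral formula, with $\rho = \frac{1}{a}\Big( 1 - \sqrt[d+1]{\tfrac34} \Big)$,
\[ \frac{1}{\delta!} \Big( \prod_{j=0}^{d} \frac{\partial^{\delta_j}}{\partial t_j^{\delta_j}} \Big) h(t_0,\dots,t_d) \Big|_{t_0=\cdots=t_d=0} = \int_{|z_0|=\rho} \cdots \int_{|z_d|=\rho} h(z) \prod_{j=0}^{d} \frac{1}{z_j^{\delta_j+1}}\, \frac{dz_j}{2\pi\imath} \]
\[ \le \frac{1}{\rho^{|\delta|}} \sup_{|z_0|=\cdots=|z_d|=\rho} |h(z)| \]
\[ \le \frac{1}{\rho^{|\delta|}}\ \frac{1}{(1 - a\rho)^{d+1}}\ \frac{1}{(1 - a\rho)^{d+1} - \lambda} \]
\[ \le \frac{a^{|\delta|}}{\big( 1 - (3/4)^{1/(d+1)} \big)^{|\delta|}}\ \frac43\ \frac{1}{3/4 - 1/2} \]
\[ \le \frac{16}{3}\, \big( 4(d+1)\, a \big)^{|\delta|} . \]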
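The bound of Example A.3 can be spot-checked in the simplest case $d = 0$ (one variable $t$), where the Taylor coefficients of $h(t) = \frac{1}{(1-at)(1-at-\lambda)}$ have a closed form obtained by partial fractions. The following Python sketch is only an illustration under that assumption; the function names `h_coeff`, `bound` and `holds_up_to` are hypothetical and not part of the paper.

```python
from fractions import Fraction

def h_coeff(n, a, lam):
    """n-th Taylor coefficient of h(t) = 1 / ((1 - a t)(1 - a t - lam)), 0 < lam < 1.

    Partial fractions give h = (1/lam) * (1/(1 - lam - a t) - 1/(1 - a t)),
    so the coefficient is a^n / lam * ((1 - lam)^-(n+1) - 1).
    """
    return a**n / lam * ((1 - lam) ** -(n + 1) - 1)

def bound(n, a):
    """Right-hand side of Example A.3 for d = 0: (16/3) * (4 a)^n."""
    return Fraction(16, 3) * (4 * a) ** n

def holds_up_to(order, a, lam):
    """Check the coefficient bound for all orders 0, ..., order (illustration only)."""
    return all(h_coeff(n, a, lam) <= bound(n, a) for n in range(order + 1))

if __name__ == "__main__":
    print(holds_up_to(30, Fraction(1, 3), Fraction(1, 2)))
```

Exact rational arithmetic is used so that the comparison is not affected by rounding. The check covers only $d = 0$ and finitely many orders; it does not replace the Cauchy-integral argument above.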
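Lemma A.4. (i) Let $X, Y \in \mathfrak{N}_{d+1}$ with $X_0 + Y_0 < 1$. Then
\[ \frac{1}{1-X}\ \frac{1}{1-Y} \le \frac{1}{1-(X+Y)} . \]
(ii) Let $\Delta$ be a finite saturated set and $X, Y \in \mathfrak{N}_{d+1}$ with $X_0 + Y_0 < \frac12$. There is a constant, const, depending only on $\Delta$, such that
\[ \frac{1}{1-(X+Y)} \le \mathrm{const}\ \frac{1}{1-X}\ \frac{1}{1-Y} + \sum_{\delta\notin\Delta} \infty\, t^\delta \in \mathfrak{N}_{d+1} . \]
Proof. (i)
\[ \frac{1}{1-X}\ \frac{1}{1-Y} = \sum_{m,n=0}^{\infty} X^m Y^n = \sum_{p=0}^{\infty} \sum_{m=0}^{p} X^m Y^{p-m} \le \sum_{p=0}^{\infty} \sum_{m=0}^{p} \binom{p}{m} X^m Y^{p-m} = \sum_{p=0}^{\infty} (X+Y)^p = \frac{1}{1-(X+Y)} . \]
(ii) Set $\hat X = X - X_0$ and $\hat Y = Y - Y_0$. Then
\[ \frac{1}{1-(X+Y)} = \frac{1}{1 - (X_0+Y_0) - (\hat X + \hat Y)} \le \frac{1}{\frac12 - (\hat X + \hat Y)} \]
\[ \le 2 \sum_{n=0}^{N(\Delta)-1} (2\hat X + 2\hat Y)^n + \sum_{\delta\notin\Delta} \infty\, t^\delta \]
\[ = 2 \sum_{n=0}^{N(\Delta)-1} 2^n \sum_{m=0}^{n} \binom{n}{m} \hat X^m \hat Y^{n-m} + \sum_{\delta\notin\Delta} \infty\, t^\delta \]
\[ \le 2^{2N(\Delta)-1} \sum_{n=0}^{N(\Delta)-1} \sum_{m=0}^{n} \hat X^m \hat Y^{n-m} + \sum_{\delta\notin\Delta} \infty\, t^\delta \]
\[ \le 2^{2N(\Delta)-1}\ \frac{1}{1-\hat X}\ \frac{1}{1-\hat Y} + \sum_{\delta\notin\Delta} \infty\, t^\delta \]
\[ \le 2^{2N(\Delta)-1}\ \frac{1}{1-X}\ \frac{1}{1-Y} + \sum_{\delta\notin\Delta} \infty\, t^\delta . \]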
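The coefficientwise inequality of Lemma A.4(i) can be illustrated on truncated power series in a single variable with nonnegative rational coefficients. The sketch below is a numerical sanity check, not part of the proof; the helper names (`pad`, `multiply`, `inverse_one_minus`, `lemma_a4_i_holds`) and the truncation order are assumptions made for this example.

```python
from fractions import Fraction

def pad(coeffs, n):
    """First n coefficients of a one-variable series, as exact fractions."""
    coeffs = [Fraction(c) for c in coeffs]
    return (coeffs + [Fraction(0)] * n)[:n]

def multiply(a, b, n):
    """Product of two truncated series, keeping orders 0, ..., n-1."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out

def inverse_one_minus(x, n):
    """Coefficients of 1/(1 - x) up to order n-1, assuming x[0] < 1.

    Solves (1 - x) * y = 1 order by order.
    """
    y = [Fraction(0)] * n
    y[0] = 1 / (1 - x[0])
    for k in range(1, n):
        y[k] = sum(x[i] * y[k - i] for i in range(1, k + 1)) / (1 - x[0])
    return y

def lemma_a4_i_holds(x, y, n=8):
    """Compare 1/(1-X) * 1/(1-Y) with 1/(1-(X+Y)) coefficient by coefficient."""
    x, y = pad(x, n), pad(y, n)
    if not x[0] + y[0] < 1:
        raise ValueError("need X_0 + Y_0 < 1")
    lhs = multiply(inverse_one_minus(x, n), inverse_one_minus(y, n), n)
    rhs = inverse_one_minus([p + q for p, q in zip(x, y)], n)
    return all(l <= r for l, r in zip(lhs, rhs))

if __name__ == "__main__":
    X = [Fraction(1, 4), Fraction(1, 2), 1]
    Y = [Fraction(1, 3), 2, 0, 5]
    print(lemma_a4_i_holds(X, Y))
```

The comparison is made order by order, which is the partial order used on the norm domain; infinite coefficients are not modelled here.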
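Corollary A.5. Let $\Delta$ be a finite saturated set, $\mu, \Lambda > 0$. Set $c = \sum_{\delta\in\Delta} \Lambda^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta$. There is a constant, const, depending only on $\Delta$ and $\mu$, such that the following hold.
(i) For all $X \in \mathfrak{N}_{d+1}$ with $X_0 < \min\{\frac{1}{2\mu}, 1\}$,
\[ \frac{c}{1 - \mu c X} \le \mathrm{const}\ \frac{c}{1-X} . \]
(ii) Set, for $X \in \mathfrak{N}_{d+1}$, $e(X) = \frac{c}{1-\Lambda X}$. If $\mu + \Lambda X_0 < \frac12$, then
\[ e(X)^2 \le \mathrm{const}\ e(X), \qquad \frac{e(X)}{1 - \mu e(X)} \le \mathrm{const}\ e(X) . \]
Proof. (i) Decompose $X = X_0 + \hat X$. Then, by Example A.3 and Lemma A.4,
\[ \frac{c}{1 - \mu c X} = \frac{c}{1 - \mu X_0 c - \mu c \hat X} \le \mathrm{const}\ \frac{c}{1 - \mu X_0 c}\ \frac{1}{1 - \mu c \hat X} \le \mathrm{const}\ \frac{c}{1 - c/2}\ \frac{1}{1 - \mu c \hat X} \le \mathrm{const}\ \frac{c}{1 - \mu c \hat X} . \]
Expanding in a geometric series,
\[ \frac{c}{1 - \mu c \hat X} \le \mathrm{const}\ c \sum_{n=0}^{N(\Delta)-1} (\mu c \hat X)^n \le \mathrm{const}\ c^{N(\Delta)} \big( 1 + \mu^{N(\Delta)} \big) \sum_{n=0}^{N(\Delta)-1} \hat X^n \le \mathrm{const}\ \frac{c}{1 - \hat X} \le \mathrm{const}\ \frac{c}{1-X} . \]
(ii) The first claim follows from the second, by expanding the geometric series. By Lemma A.4(ii) and part (i),
\[ \frac{e(X)}{1 - \mu e(X)} = \frac{\frac{c}{1-\Lambda X}}{1 - \mu \frac{c}{1-\Lambda X}} = \frac{c}{1 - \Lambda X - \mu c} \le \mathrm{const}\ \frac{c}{1 - \mu c}\ \frac{1}{1 - \Lambda X} \le \mathrm{const}\ \frac{c}{1 - \Lambda X} = \mathrm{const}\ e(X) . \]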
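Remark A.6. The following generalization of Corollary A.5 is proven in the same way. Let $\Delta$ be a finite saturated set, $\mu, \lambda, \Lambda > 0$. Set $c = \sum_{\delta\in\Delta} \lambda^{\delta_0} \Lambda^{|\delta|} t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta$. There is a constant, const, depending only on $\Delta$ and $\mu$, such that the following hold.
(i) For all $X \in \mathfrak{N}_{d+1}$ with $X_0 < \min\{\frac{1}{2\mu}, 1\}$,
\[ \frac{c}{1 - \mu c X} \le \mathrm{const}\ \frac{c}{1-X} . \]
(ii) Set, for $X \in \mathfrak{N}_{d+1}$, $e(X) = \frac{c}{1-\Lambda X}$. If $\mu + \Lambda X_0 < \frac12$, then
\[ e(X)^2 \le \mathrm{const}\ e(X), \qquad \frac{e(X)}{1 - \mu e(X)} \le \mathrm{const}\ e(X) . \]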
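Lemma A.7. Let $\Delta$ be a finite saturated set and
\[ X = \sum_{\delta\in\Delta} X_\delta\, t^\delta + \sum_{\delta\notin\Delta} \infty\, t^\delta \in \mathfrak{N}_{d+1} . \]
Let $f(z)$ be analytic at $X_0$, with $f^{(n)}(X_0) \ge 0$ for all $n$, whose radius of convergence at $X_0$ is at least $r > 0$. Let $0 < \beta < \frac{1}{X_0}$. Then there exists a constant $C$, depending only on $\Delta$, $\beta$, $r$ and $\max_{|z-X_0|=r} |f(z)|$, such that
\[ f(X) \le C\, \frac{1}{1 - \beta X} . \]
Proof. Set $\alpha = \frac{\beta}{1 - \beta X_0}$ and $\hat X = X - X_0$. Then
\[ f(X) = \sum_{n} \frac{1}{n!} f^{(n)}(X_0)\, \hat X^n \le \sum_{n} \frac{1}{n!} f^{(n)}(X_0)\, \hat X^n + \sum_{\delta\notin\Delta} \infty\, t^\delta \le C \sum_{n} \alpha^n \hat X^n + \sum_{\delta\notin\Delta} \infty\, t^\delta \]
where
\[ C = \max_{n < N(\Delta)} \frac{f^{(n)}(X_0)}{n!\ \alpha^n} . \]
Hence
\[ f(X) \le \frac{C}{1 - \alpha \hat X} + \sum_{\delta\notin\Delta} \infty\, t^\delta = \frac{C}{1 - \alpha (X - X_0)} = \frac{C\, (1 - \beta X_0)}{1 - \beta X} \le \frac{C}{1 - \beta X} . \]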
Notation

Norms

Norm | Characteristics | Reference
$|||\cdot|||_{1,\infty}$ | no derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|_{1,\infty}$ | derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|\check{}_{\infty}$ | derivatives, external momenta, acts on functions | Definition IV.6
$|||\cdot|||_{\infty}$ | no derivatives, external positions, acts on functions | Example III.4
$\|\cdot\|\check{}_{1}$ | derivatives, external momenta, acts on functions | Definition IV.6
$\|\cdot\|\check{}_{\infty,B}$ | derivatives, external momenta, $B \subset \mathbb{R}\times\mathbb{R}^d$ | Definition IV.6
$\|\cdot\|\check{}_{1,B}$ | derivatives, external momenta, $B \subset \mathbb{R}\times\mathbb{R}^d$ | Definition IV.6
$\|\cdot\|$ | $\rho_{m;n}\, \|\cdot\|_{1,\infty}$ | Lemma V.1
$N(\mathcal{W}; c, b, \alpha)$ | $\frac{1}{b^2}\, c \sum_{m,n\ge 0} \alpha^n b^n \|\mathcal{W}_{m,n}\|$ | Definition III.9, Theorem V.2

Other notation

Notation | Description | Reference
$\Omega_S(\mathcal{W})(\phi,\psi)$ | $\log \frac{1}{Z} \int e^{\mathcal{W}(\phi,\psi+\zeta)}\, d\mu_S(\zeta)$ | before (I.6)
$S(C)$ | $\sup_m \sup_{\xi_1,\dots,\xi_m\in\mathcal{B}} \Big( \Big| \int \psi(\xi_1)\cdots\psi(\xi_m)\, d\mu_C(\psi) \Big| \Big)^{1/m}$ | Definition IV.1
$\mathcal{B}$ | $\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}$ viewed as position space | beginning of Sec. II
$\mathcal{F}_m(n)$ | functions on $\mathcal{B}^m\times\mathcal{B}^n$, antisymmetric in $\mathcal{B}^m$ arguments | Definition II.9
References

[1] M. Disertori and V. Rivasseau, Interacting Fermi liquid in two dimensions at finite temperature. Part I: convergent attributions, Comm. Math. Phys. 215 (2000), 251–290.
[2] M. Disertori and V. Rivasseau, Interacting Fermi liquid in two dimensions at finite temperature. Part II: renormalization, Comm. Math. Phys. 215 (2000), 291–341.
[3] W. Pedra and M. Salmhofer, Fermi systems in two dimensions and Fermi surface flows, to appear in Proc. 14th Int. Congress of Mathematical Physics, Lisbon, 2003.
[4] G. Benfatto and G. Gallavotti, Renormalization Group, Physics Notes, Vol. 1, Princeton University Press, 1995.
[5] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 1: overview, to appear in Commun. Math. Phy.
[6] J. Feldman, M. Salmhofer and E. Trubowitz, An inversion theorem in Fermi surface theory, Comm. Pure Appl. Math. LIII (2000), 1350–1384.
[7] J. Pöschel and E. Trubowitz, Inverse Spectral Theory, Academic Press, 1987.
[8] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in Fermionic models, Part 1: nonperturbative bounds, preprint.
[9] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in Fermionic models, Part 2: overlapping loops, preprint.
[10] J. Magnen and V. Rivasseau, A single scale infinite volume expansion for three-dimensional many Fermion Green's functions, Math. Phys. Electron. J. 1 (1995).
[11] M. Disertori, J. Magnen and V. Rivasseau, Interacting Fermi liquid in three dimensions at finite temperature: Part I: convergent contributions, Ann. Henri Poincaré 2 (2001), 733–806.
December 15, 2003 16:20 WSPC/148-RMP
00178
Reviews in Mathematical Physics Vol. 15, No. 9 (2003) 995–1037 © World Scientific Publishing Company
SINGLE SCALE ANALYSIS OF MANY FERMION SYSTEMS PART 2: THE FIRST SCALE
JOEL FELDMAN∗
Department of Mathematics, University of British Columbia, Vancouver, B.C., Canada V6T 1Z2
[email protected]
http://www.math.ubc.ca/∼feldman/

HORST KNÖRRER† and EUGENE TRUBOWITZ‡
Mathematik, ETH-Zentrum, CH-8092 Zürich, Switzerland
†[email protected]
‡[email protected]
†http://www.math.ethz.ch/∼knoerrer/
Received 22 April 2003
The first renormalization group map arising from the momentum space decomposition of a weakly coupled system of fermions at temperature zero differs from all subsequent maps. Namely, the component of momentum dual to temperature may be arbitrarily large — there is no ultraviolet cutoff. The methods of Part 1 are supplemented to control this special case. Keywords: Fermi liquid; renormalization; fermionic functional integral.
Contents
VI. Introduction to Part 2
VII. Amputated and Nonamputated Green's Functions
VIII. Scales
IX. The Fourier Transform
X. Momentum Space Norms
Appendices
B. Symmetries
C. Some standard Grassmann integral formulae
Notation
References
∗ Research supported in part by the Natural Sciences and Engineering Research Council of Canada and the Forschungsinstitut für Mathematik, ETH Zürich.
VI. Introduction to Part 2

We continue our analysis of models for weakly interacting fermions in $d$ dimensions given in terms of
• a single particle dispersion relation $e(\mathbf{k})$ on $\mathbb{R}^d$,
• an ultraviolet cutoff $U(\mathbf{k})$ on $\mathbb{R}^d$,
• an interaction.
From now on, we fix $r \ge 2$ and assume that the dispersion relation is at least $r+d+1$ times differentiable. As discussed in Part 1, formally, the generating functional for the connected amputated Green's functions is
\[ \mathcal{G}^{amp}(\phi) = \log \frac{1}{Z} \int e^{\mathcal{V}(\psi+\phi)}\, d\mu_C(\psi) \]
where $Z = \int e^{\mathcal{V}(\psi)}\, d\mu_C(\psi)$. In this Grassmann integral, there are anticommuting fields $\psi(\xi)$, where $\xi = (x_0, \mathbf{x}, \sigma, a) \in \mathcal{B} = \mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}$. See the beginning of Sec. II. The covariance of the Grassmann Gaussian measure $d\mu_C$ is the Fourier transform $C(\xi,\xi')$ of
\[ C(k_0,\mathbf{k}) = \frac{U(\mathbf{k})}{\imath k_0 - e(\mathbf{k})} \]
as in Proposition IV.8. The interaction is
\[ \mathcal{V}(\psi) = \int_{(\mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\})^4} V_0(x_1,x_2,x_3,x_4)\ \psi((x_1,1))\, \psi((x_2,0))\, \psi((x_3,1))\, \psi((x_4,0))\ dx_1\, dx_2\, dx_3\, dx_4 . \]
We shall, at various places, assume that $V_0$ has a number of symmetries, which we abbreviate by single letters: translation invariance (T), spin independence (S), conservation of particle number (N), “$k_0$-reversal reality” (R) and “bar/unbar exchange invariance” (B). Precise definitions and a discussion of the properties of these symmetries are given in Appendix B. Formally, the Green's functions of the many fermion system are
\[ S_{2n}(x_1,y_1,\dots,x_n,y_n) = \frac{1}{Z} \int \prod_{i=1}^{n} \psi((x_i,0))\, \psi((y_i,1))\ e^{\mathcal{V}(\psi)}\, d\mu_C(\psi) \]
where $Z = \int e^{\mathcal{V}(\psi)}\, d\mu_C(\psi)$. The generating functional for these Green's functions is
\[ \mathcal{S}(\phi) = \frac{1}{Z} \int e^{\phi J \psi}\, e^{\mathcal{V}(\psi)}\, d\mu_C(\psi) \]
where the operator $J$ has kernel
\[ J((x_0,\mathbf{x},\sigma,a),(x_0',\mathbf{x}',\sigma',a')) = \delta(x_0 - x_0')\, \delta(\mathbf{x} - \mathbf{x}')\, \delta_{\sigma,\sigma'} \begin{cases} 1 & \text{if } a = 1,\ a' = 0 \\ -1 & \text{if } a = 0,\ a' = 1 \\ 0 & \text{otherwise} \end{cases} \qquad \text{(VI.1)} \]
so that the source term has the form
\[ \phi J \psi = \int d\xi\, d\xi'\ \phi(\xi)\, J(\xi,\xi')\, \psi(\xi') = \int dx\ \big( \bar\phi(x)\psi(x) + \bar\psi(x)\phi(x) \big) = \psi J \phi . \qquad \text{(VI.2)} \]
The generating functional for the connected Green's functions is
\[ \mathcal{G}(\phi) = \log \frac{1}{Z} \int e^{\phi J \psi}\, e^{\mathcal{V}(\psi)}\, d\mu_C(\psi) \]
and the connected Green's functions themselves are determined by
\[ \mathcal{G}(\phi) = \sum_{n=1}^{\infty} \frac{1}{(n!)^2} \int \prod_{i=1}^{n} dx_i\, dy_i\ G_{2n}(x_1,y_1,\dots,x_n,y_n) \prod_{i=1}^{n} \bar\phi(x_i)\, \phi(y_i) . \]
The relation between the connected Green's functions and the amputated connected Green's functions is
\[ G_{2n}(x_1,y_1,\dots,x_n,y_n) = \int \prod_{i=1}^{n} dx_i'\, dy_i' \Big( \prod_{i=1}^{n} C(x_i,x_i')\, C(y_i',y_i) \Big)\ G^{amp}_{2n}(x_1',y_1',\dots,x_n',y_n') \]
for $n \ge 2$, and
\[ G_2(x,y) - C(x,y) = \int dx'\, dy'\ C(x,x')\, C(y',y)\, G^{amp}_2(x',y') . \]
In a multiscale analysis we shall estimate the position space supremum norm of connected Green's functions and the momentum space supremum norm of connected amputated Green's functions. We fix $r_0 \ge 2$ and control the Green's functions, including up to $r_0$ derivatives in the $k_0$ direction. In Sec. VII, we introduce a variant, $\tilde\Omega$, of the renormalization group map $\Omega$ for use with the connected Green's functions. In Sec. VIII, we introduce the scale decomposition that will be used for the multiscale analysis. Using the results of Part 1, we discuss the map $\tilde\Omega$ for the first few scales. The discussion will be sufficiently general to allow the absorption of a (renormalization) counterterm in the dispersion relation. In Sec. X, we introduce norms for use with the amputated Green's functions and discuss the map $\Omega$ for the first few scales. Notation tables are provided at the end of the paper.

VII. Amputated and Nonamputated Green's Functions

Definition VII.1. The (unamputated) renormalization group map $\tilde\Omega_C$ with respect to the covariance $C$ associates the Grassmann function
\[ \tilde\Omega_C(\mathcal{W})(\phi,\psi) = \log \frac{1}{Z} \int e^{\phi J \zeta}\, e^{\mathcal{W}(\phi,\psi+\zeta)}\, d\mu_C(\zeta) \quad \text{where } Z = \int e^{\mathcal{W}(0,\zeta)}\, d\mu_C(\zeta) \ne 0 \]
to the Grassmann function $\mathcal{W}(\phi,\psi)$. As was the case with $\Omega_C(\mathcal{W})$, [1, Theorem II.28] implies that, under hypotheses that we will make explicit later, the formal Taylor expansion of $\tilde\Omega_C(\mathcal{W})$ converges to an analytic function of $\mathcal{W}$.

Remark VII.2. (i) In the situation described in the introduction, the generating functional for the connected Green's functions is
\[ \mathcal{G}(\phi) = \tilde\Omega_C(\mathcal{V})(\phi,0) . \]
(ii) $\tilde\Omega$ obeys the semigroup property
\[ \tilde\Omega_{C_1+C_2} = \tilde\Omega_{C_1} \circ \tilde\Omega_{C_2} . \]
In order to use the results of Part 1 and [1], we note the following relationship between $\tilde\Omega_C$ and the renormalization group map
\[ \Omega_C(\mathcal{W})(\phi,\psi) = \log \frac{1}{Z} \int e^{\mathcal{W}(\phi,\psi+\zeta)}\, d\mu_C(\zeta) \quad \text{where } Z = \int e^{\mathcal{W}(0,\zeta)}\, d\mu_C(\zeta) \]
of Part 1.
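Lemma VII.3.
\[ \tilde\Omega_C(\mathcal{W})(\phi,\psi) = \tfrac12\, \phi J C J \phi + \Omega_C(\mathcal{W})(\phi, \psi + CJ\phi) \]
\[ = \tfrac12\, \phi J C J \phi + \Omega_C(\mathcal{W})(\phi,\psi) + \int :\Omega_C(\mathcal{W})(\phi,\psi+\zeta) - \Omega_C(\mathcal{W})(\phi,\psi):_\zeta\ :e^{\phi J \zeta}:_\zeta\ d\mu_C(\zeta) \]
where, for any kernel $B(\eta,\eta')$, $B\phi = \int d\eta'\ B(\eta,\eta')\, \phi(\eta')$ and $\phi B \phi = \int d\eta\, d\eta'\ \phi(\eta)\, B(\eta,\eta')\, \phi(\eta')$.

Proof. By Lemma C.1, with $\phi$ replaced by $J\phi$, and (VI.2),
\[ \tilde\Omega_C(\mathcal{W})(\phi,\psi) = \log \Big( e^{-\frac12 (J\phi) C (J\phi)}\ \frac{1}{Z} \int e^{\mathcal{W}(\phi,\, \zeta+\psi+CJ\phi)}\, d\mu_C(\zeta) \Big) = \log \Big( e^{\frac12 \phi J C J \phi}\ e^{\Omega_C(\mathcal{W})(\phi,\, \psi+CJ\phi)} \Big) = \tfrac12\, \phi J C J \phi + \Omega_C(\mathcal{W})(\phi, \psi + CJ\phi) . \]
Also by Lemma C.1
\[ \Omega_C(\mathcal{W})(\phi, \psi + CJ\phi) = \int :\Omega_C(\mathcal{W})(\phi, \psi + \zeta + CJ\phi):_\zeta\ d\mu_C(\zeta) \]
\[ = e^{\frac12 (J\phi) C (J\phi)} \int :\Omega_C(\mathcal{W})(\phi,\psi+\zeta):_\zeta\ e^{\zeta J \phi}\ d\mu_C(\zeta) \]
\[ = \int :\Omega_C(\mathcal{W})(\phi,\psi+\zeta):_\zeta\ :e^{\zeta J \phi}:_\zeta\ d\mu_C(\zeta) \]
\[ = \Omega_C(\mathcal{W})(\phi,\psi) + \int :\Omega_C(\mathcal{W})(\phi,\psi+\zeta) - \Omega_C(\mathcal{W})(\phi,\psi):_\zeta\ :e^{\phi J \zeta}:_\zeta\ d\mu_C(\zeta) . \qquad \text{(VII.1)} \]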
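The aim of the next section is to estimate the (unamputated) renormalization group map, $\tilde\Omega$, with respect to the norms of Definition III.9. The difference between the maps $\Omega_C$ and $\tilde\Omega_C$ lies in the source terms, as is described in Lemma VII.3. The estimates for this difference are similar to, but easier than, the estimates for the map $\Omega_C$ itself.

Definition VII.4 (External Improving). Let $\|\cdot\|$ be a family of symmetric seminorms on the spaces $\mathcal{F}_m(n)$. We say that the covariance $C$ is $\Gamma$-external improving with respect to this family of seminorms if, for each $m \ge 0$, $n \ge 1$, there is an $i$ with $1 \le i \le n$ such that
\[ \Big\| \mathrm{Ant}_{ext} \int d\zeta\, d\zeta'\ J(\eta_{m+1},\zeta)\, C(\zeta,\zeta')\, f(\eta_1,\dots,\eta_m;\ \xi_1,\dots,\xi_{i-1},\zeta',\xi_i,\dots,\xi_{n-1}) \Big\| \le \Gamma\, \|f\| \]
for all $f \in \mathcal{F}_m(n)$. Recall that $\mathrm{Ant}_{ext}$ was introduced in [2, Definition II.9]. Observe that the function on the left-hand side is in $\mathcal{F}_{m+1}(n-1)$.

Lemma VII.5. Let $\|\cdot\|$ be a family of symmetric seminorms and let the covariance $C$ be $\Gamma$-external improving with respect to this family of seminorms. Let $f(\phi,\psi,\zeta)$ be of degree $p'$ in $\zeta$. The integral $\int :f(\phi,\psi,\zeta):_{\zeta,C}\ :[\phi J \zeta]^p:_{\zeta,C}\ d\mu_C(\zeta)$ vanishes unless $p = p'$ and then
\[ \Big\| \int :f(\phi,\psi,\zeta):_{\zeta,C}\ :[\phi J \zeta]^p:_{\zeta,C}\ d\mu_C(\zeta) \Big\| \le p!\, \Gamma^p\, \|f\| . \]
Proof. Observe that $\phi J \psi = \int d\xi\, d\xi'\ \phi(\xi)\, J(\xi,\xi')\, \psi(\xi') \in A_1 \otimes V$. By Definition VII.4 and [2, Definition III.1],
\[ \Big\| \underset{i\to n+1}{\mathrm{Con}_C} \big( \mathrm{Ant}_{ext}(h \otimes \phi J \psi) \big) \Big\| \le \Gamma\, \|h\| \qquad \text{(VII.2)} \]
for all $h \in A_m \otimes V^{\otimes n}$, $m \ge 0$, $n \ge 1$ and some $1 \le i \le n$. Observe that $h \otimes \phi J \psi \in (A_m \otimes V^{\otimes n}) \otimes (A_1 \otimes V) \cong A_m \otimes A_1 \otimes V^{\otimes n+1}$, so that $\mathrm{Ant}_{ext}(h \otimes \phi J \psi) \in A_{m+1} \otimes V^{\otimes n+1}$ and $\underset{i\to n+1}{\mathrm{Con}_C}(\mathrm{Ant}_{ext}(h \otimes \phi J \psi)) \in A_{m+1} \otimes V^{\otimes n-1}$.
Set $g(\phi,\psi,\zeta,\zeta') = f(\phi,\psi,\zeta)\, [\phi J \zeta']^p$. By Lemma II.13 and [1, Remark II.12], $p$ times, starting with $f(\xi,\xi',\xi'') = \, :g(\phi,\psi,\xi',\xi''):_{\xi''}$,
\[ \int :f(\phi,\psi,\zeta):_{\zeta,C}\ :[\phi J \zeta]^p:_{\zeta,C}\ d\mu_C(\zeta) = \int \big[ :g(\phi,\psi,\zeta,\zeta'):_{\zeta,\zeta'} \big]_{\zeta'=\zeta}\ d\mu_C(\zeta) \]
\[ = \int \Big[ :\underset{\zeta\to\zeta'}{\mathrm{Con}_C}\ g(\phi,\psi,\zeta,\zeta'):_{\zeta,\zeta'} \Big]_{\zeta'=\zeta}\ d\mu_C(\zeta) \]
\[ = \int \Big[ :\underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta'):_{\zeta,\zeta'} \Big]_{\zeta'=\zeta}\ d\mu_C(\zeta) \]
\[ = \underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta') \]
if $p' \ge p$, since then $\underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta')$ is independent of $\zeta$ and $\zeta'$. If $p' > p$,
\[ \underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta') = 0 . \]
If $p' < p$, $\underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta')$ is of degree 0 in $\zeta$ and of degree $p - p' > 0$ in $\zeta'$ and the integral
\[ \int \Big[ :\underset{\zeta\to\zeta'}{\mathrm{Con}_C}^{p'} g(\phi,\psi,\zeta,\zeta'):_{\zeta,\zeta'} \Big]_{\zeta'=\zeta}\ d\mu_C(\zeta) \]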
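again vanishes. It now suffices to apply [1, Definition II.9] and (VII.2), $p$ times.

Proposition VII.6. Let $\gamma > 0$ and $\alpha \ge 1$ obey $\frac{\gamma}{\alpha} \le \frac13$. Let
\[ \mathcal{W}'(\phi,\psi) = \mathcal{W}(\phi, \psi + CJ\phi) . \]
If $C$ is $\gamma b$-external improving,
\[ N(\mathcal{W}' - \mathcal{W}; c, b, \alpha) \le \frac{\gamma}{\alpha}\, N(\mathcal{W}; c, b, 2\alpha) . \]
Proof. Write $\mathcal{W}(\phi,\psi) = \sum_{m,n} \mathcal{W}_{m,n}(\phi,\psi)$ with $\mathcal{W}_{m,n} \in A_m[n]$ and
\[ \mathcal{W}(\phi,\psi+\zeta) = \sum_{m,n} \mathcal{W}_{m,n}(\phi,\psi+\zeta) = \sum_{m,n} \sum_{p=0}^{n} \mathcal{W}_{m,n-p,p}(\phi,\psi,\zeta) \]
with $\mathcal{W}_{m,n-p,p} \in A_m[n-p,p]$. By [1, Lemma II.22(iii)]
\[ \|\mathcal{W}_{m,n-p,p}\| \le \binom{n}{p} \|\mathcal{W}_{m,n}\| . \]
Then, by (VII.1),
\[ \mathcal{W}'(\phi,\psi) - \mathcal{W}(\phi,\psi) = \mathcal{W}(\phi,\psi+CJ\phi) - \mathcal{W}(\phi,\psi) = \int :\mathcal{W}(\phi,\psi+\zeta) - \mathcal{W}(\phi,\psi):_{\zeta,C}\ :e^{\phi J \zeta}:_{\zeta,C}\ d\mu_C(\zeta) \]
\[ = \sum_{\substack{m,p\ge 0 \\ n\ge 1}} \sum_{p'=1}^{n} \frac{1}{p!} \int :\mathcal{W}_{m,n-p',p'}(\phi,\psi,\zeta):_{\zeta,C}\ :(\phi J \zeta)^p:_{\zeta,C}\ d\mu_C(\zeta) \]
\[ = \sum_{\substack{m\ge 0 \\ n\ge 1}} \sum_{p=1}^{n} \frac{1}{p!} \int :\mathcal{W}_{m,n-p,p}(\phi,\psi,\zeta):_{\zeta,C}\ :(\phi J \zeta)^p:_{\zeta,C}\ d\mu_C(\zeta) . \]
By Lemma VII.5
\[ \Big\| \int :\mathcal{W}_{m,n-p,p}(\phi,\psi,\zeta):_{\zeta,C}\ :(\phi J \zeta)^p:_{\zeta,C}\ d\mu_C(\zeta) \Big\| \le p!\, (\gamma b)^p\, \|\mathcal{W}_{m,n-p,p}\| \le p! \binom{n}{p} (\gamma b)^p\, \|\mathcal{W}_{m,n}\| \]
so that
\[ N(\mathcal{W}' - \mathcal{W}; c, b, \alpha) \le \frac{c}{b^2} \sum_{\substack{m\ge 0 \\ n\ge 1}} \sum_{p=1}^{n} \frac{1}{p!}\, \alpha^{n-p} b^{n-p}\ \Big\| \int :\mathcal{W}_{m,n-p,p}:_{\zeta,C}\ :(\phi J \zeta)^p:_{\zeta,C}\ d\mu_C(\zeta) \Big\| \]
\[ \le \frac{c}{b^2} \sum_{\substack{m\ge 0 \\ n\ge 1}} \sum_{p=1}^{n} \binom{n}{p} \Big( \frac{\gamma}{\alpha} \Big)^p \alpha^n b^n\, \|\mathcal{W}_{m,n}\| \]
\[ = \frac{c}{b^2} \sum_{\substack{m\ge 0 \\ n\ge 1}} \Big[ \Big( 1 + \frac{\gamma}{\alpha} \Big)^n - 1 \Big] \alpha^n b^n\, \|\mathcal{W}_{m,n}\| . \]
Applying
\[ \Big( 1 + \frac{\gamma}{\alpha} \Big)^n - 1 \le n\, \frac{\gamma}{\alpha} \Big( 1 + \frac{\gamma}{\alpha} \Big)^{n-1} \le \frac{\gamma}{\alpha} \Big( \frac32 \Big)^n \Big( 1 + \frac{\gamma}{\alpha} \Big)^{n-1} \le 2^n\, \frac{\gamma}{\alpha} \]
we have
\[ N(\mathcal{W}' - \mathcal{W}; c, b, \alpha) \le \frac{c}{b^2} \sum_{\substack{m\ge 0 \\ n\ge 1}} \frac{\gamma}{\alpha}\, 2^n \alpha^n b^n\, \|\mathcal{W}_{m,n}\| \le \frac{\gamma}{\alpha}\, N(\mathcal{W}; c, b, 2\alpha) . \]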
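The elementary estimate $(1 + \frac{\gamma}{\alpha})^n - 1 \le 2^n \frac{\gamma}{\alpha}$ for $\frac{\gamma}{\alpha} \le \frac13$, used at the end of the proof of Proposition VII.6, can be verified with exact arithmetic. In the Python sketch below, `x` stands for $\frac{\gamma}{\alpha}$; the intermediate steps are one possible chain of inequalities (an assumption of this sketch), and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def binomial_tail(x, n):
    """sum_{p=1}^{n} C(n, p) x^p, which equals (1 + x)^n - 1."""
    return sum(comb(n, p) * x**p for p in range(1, n + 1))

def chain_holds(x, n):
    """Check (1+x)^n - 1 <= n x (1+x)^(n-1) <= x (3/2)^n (1+x)^(n-1) <= 2^n x."""
    lhs = binomial_tail(x, n)
    if lhs != (1 + x) ** n - 1:  # binomial theorem, checked exactly
        return False
    step1 = n * x * (1 + x) ** (n - 1)
    step2 = x * Fraction(3, 2) ** n * (1 + x) ** (n - 1)
    return lhs <= step1 <= step2 <= 2**n * x

if __name__ == "__main__":
    x = Fraction(1, 3)  # the extreme case gamma/alpha = 1/3
    print(all(chain_holds(x, n) for n in range(1, 41)))
```

The last step uses $x \le \frac13$: then $(\frac32)^n (1+x)^{n-1} \le (\frac32)^n (\frac43)^{n-1} = \frac34\, 2^n$.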
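Corollary VII.7. Let $\gamma, \gamma' > 0$ and $\alpha > 1$ with $\frac{\gamma}{\alpha} \le \frac16$. Let
\[ \mathcal{W}'_\kappa(\phi,\psi) = \mathcal{W}(\phi, \psi + C_\kappa J \phi) . \]
If $C_0$ is $\gamma b$-external improving and $\frac{d}{d\kappa} C_\kappa \big|_{\kappa=0}$ is $\gamma' b$-external improving,
\[ N\Big( \frac{d}{d\kappa} \mathcal{W}'_\kappa \Big|_{\kappa=0}; c, b, \alpha \Big) \le 2\, \frac{\gamma'}{\alpha}\, N(\mathcal{W}; c, b, 2\alpha) . \]
Proof. Define $D_z = C_0 + z\, \frac{d}{d\kappa} C_\kappa \big|_{\kappa=0}$ and $\mathcal{W}''_z(\phi,\psi) = \mathcal{W}(\phi, \psi + D_z J \phi) - \mathcal{W}(\phi,\psi)$. Then $\frac{d}{d\kappa} \mathcal{W}'_\kappa \big|_{\kappa=0} = \frac{d}{dz} \mathcal{W}''_z \big|_{z=0}$. Furthermore, applying the triangle inequality directly to Definition VII.4 of “external improving”, we see that $D_z$ is $(\gamma + |z|\gamma') b$-external improving. As $\frac{\gamma + |z|\gamma'}{\alpha} \le \frac13$ for all $|z| \le \frac{\alpha}{6\gamma'}$, Proposition VII.6 implies that
\[ N(\mathcal{W}''_z; c, b, \alpha) \le \frac13\, N(\mathcal{W}; c, b, 2\alpha) \]
for all $|z| \le \frac{\alpha}{6\gamma'}$. The corollary now follows by using the Cauchy integral formula to bound $N\big( \frac{d}{d\kappa} \mathcal{W}'_\kappa \big|_{\kappa=0}; c, b, \alpha \big)$ by an integral of $N(\mathcal{W}''_z; c, b, \alpha)$ over the circle of radius $\frac{\alpha}{6\gamma'}$ centered on the origin.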
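Similarly to Lemma V.1, we have:

Lemma VII.8. Let $\rho_{m;n}$ be a sequence of nonnegative real numbers such that $\rho_{m;n'} \le \rho_{m;n}$ for $n' \le n$. Define, for $f \in \mathcal{F}_m(n)$,
\[ \|f\| = \rho_{m;n}\, \|f\|_{1,\infty} \]
where $\|f\|_{1,\infty}$ is the $L^1$–$L^\infty$-norm introduced in Example II.6. Let $C$ be a covariance and $\Gamma$ obey
\[ \Gamma \ge \frac{\rho_{1;n-1}}{\rho_{0;n}}\, |||C|||_{1,\infty} \quad \text{for all } n \ge 1 \]
\[ \Gamma \ge \frac{\rho_{m+1;n-1}}{\rho_{m;n}}\, |||C|||_{\infty} \quad \text{for all } m, n \ge 1 . \]
Then $C$ is $\Gamma$-external improving with respect to the family of seminorms $\|\cdot\|$, in the sense of Definition VII.4.

Proof. Let $f \in \mathcal{F}_m(n)$ and set
\[ g(\eta_1,\dots,\eta_{m+1};\ \xi_1,\dots,\xi_{n-1}) = \mathrm{Ant}_{ext} \int d\zeta\, d\zeta'\ J(\eta_{m+1},\zeta)\, C(\zeta,\zeta')\, f(\eta_1,\dots,\eta_m;\ \xi_1,\dots,\xi_{i-1},\zeta',\xi_i,\dots,\xi_{n-1}) . \]
If $m = 0$,
\[ |||g|||_{1,\infty} \le |||f|||_{1,\infty}\, |||JC|||_{1,\infty} = |||f|||_{1,\infty}\, |||C|||_{1,\infty} \]
by Lemma II.7. If $m \ne 0$,
\[ |||g|||_{1,\infty} \le |||f|||_{1,\infty}\, |||JC|||_{\infty} = |||f|||_{1,\infty}\, |||C|||_{\infty} . \]
Since $g \in \mathcal{F}_{m+1}$, $\|g\|_{1,\infty} = |||g|||_{1,\infty}$. The lemma now follows from the hypothesis on $\Gamma$.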
VIII. Scales

From now on we discuss the situation that the dispersion relation $e(\mathbf{k})$ has zeroes on the support of the ultraviolet cutoff $U(\mathbf{k})$; in other words, that the Fermi surface $F$ is not empty. Then a single scale analysis as for insulators is not possible because there is an infrared problem due to the singularity of the propagator $\frac{U(\mathbf{k})}{\imath k_0 - e(\mathbf{k})}$ on the set $\{(k_0,\mathbf{k}) \in \mathbb{R}\times\mathbb{R}^d \mid k_0 = 0,\ e(\mathbf{k}) = 0\}$, which can be canonically identified with the Fermi surface. This singularity causes the $L^1$–$L^\infty$ norm (in position space) of the propagator to be infinite.

To analyze the singularity at the Fermi surface, we introduce scales by slicing momentum space into shells around the Fermi surface. We choose a “scale parameter” $M > 1$ and a function $\nu \in C_0^\infty([\frac1M, 2M])$ that takes values in $[0,1]$, is identically one on $[\frac2M, M]$ and obeys
\[ \sum_{j=0}^{\infty} \nu(M^{2j} x) = 1 \]
for $0 < x < 1$. The scale parameter $M$ is chosen sufficiently big (depending on the dispersion relation $e(\mathbf{k})$ and the ultraviolet cutoff $U(\mathbf{k})$). The function $\nu$ may be constructed by choosing a function $\varphi \in C_0^\infty((-2,2))$ that is identically one on $[-1,1]$ and setting $\nu(x) = \varphi(x/M) - \varphi(Mx)$ for $x > 0$ and zero otherwise. Then $\nu(x)$ vanishes for $x \ge 2M$ and $x \le \frac1M$, is identically one for $\frac2M \le x \le M$, and $\sum_{j=0}^{\infty} \nu(M^{2j} x) = \varphi(x/M)$ for $x > 0$.

Definition VIII.1. (i) For $j \ge 1$, the $j$th scale function on $\mathbb{R}\times\mathbb{R}^d$ is defined as
\[ \nu^{(j)}(k) = \nu\big( M^{2j} (k_0^2 + e(\mathbf{k})^2) \big) . \]
By construction, $\nu^{(j)}$ is identically one on
\[ \Big\{ k = (k_0,\mathbf{k}) \in \mathbb{R}\times\mathbb{R}^d \ \Big|\ \sqrt{\tfrac{2}{M}}\, \frac{1}{M^j} \le |\imath k_0 - e(\mathbf{k})| \le \sqrt{M}\, \frac{1}{M^j} \Big\} . \]
The support of $\nu^{(j)}$ is called the $j$th shell. By construction, it is contained in
\[ \Big\{ k \in \mathbb{R}\times\mathbb{R}^d \ \Big|\ \frac{1}{\sqrt{M}}\, \frac{1}{M^j} \le |\imath k_0 - e(\mathbf{k})| \le \sqrt{2M}\, \frac{1}{M^j} \Big\} . \]
The momentum $k$ is said to be of scale $j$ if $k$ lies in the $j$th shell.
(ii) For real $j \ge 1$, set
\[ \nu^{(\ge j)}(k) = \varphi\big( M^{2j-1} (k_0^2 + e(\mathbf{k})^2) \big) \]
with the function $\varphi$ introduced just before this definition. By construction, $\nu^{(\ge j)}$ is identically one on
\[ \Big\{ k \in \mathbb{R}\times\mathbb{R}^d \ \Big|\ |\imath k_0 - e(\mathbf{k})| \le \sqrt{M}\, \frac{1}{M^j} \Big\} . \]
Observe that if $j$ is an integer, then for $|\imath k_0 - e(\mathbf{k})| > 0$
\[ \nu^{(\ge j)}(k) = \sum_{i\ge j} \nu^{(i)}(k) . \]
The support of $\nu^{(\ge j)}$ is called the $j$th neighborhood of the Fermi surface. By construction, it is contained in
\[ \Big\{ k \in \mathbb{R}\times\mathbb{R}^d \ \Big|\ |\imath k_0 - e(\mathbf{k})| \le \sqrt{2M}\, \frac{1}{M^j} \Big\} . \]
Remark VIII.2. Since the scale parameter $M > 1$, the shells near the Fermi curve have $j$ near $+\infty$, and the neighborhoods shrink as $j \to \infty$.
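The construction of $\nu$ from $\varphi$ described before Definition VIII.1 can be checked numerically. The Python sketch below assumes one particular choice of $\varphi$ (built from the standard smooth function $e^{-1/t}$), which the paper does not specify; the names `smooth_step`, `phi`, `nu` and `scale_sum` are hypothetical. It evaluates a partial sum of $\sum_j \nu(M^{2j}x)$, which telescopes to $\varphi(x/M)$ and hence to $1$ for $0 < x < 1$.

```python
import math

def smooth_step(t):
    """exp(-1/t) for t > 0 and 0 otherwise; a standard C-infinity building block."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def phi(x):
    """An assumed smooth cutoff: equal to 1 on [-1, 1] and to 0 outside (-2, 2)."""
    ax = abs(x)
    up = smooth_step(2.0 - ax)    # positive exactly when |x| < 2
    down = smooth_step(ax - 1.0)  # positive exactly when |x| > 1
    return up / (up + down)       # the denominator never vanishes

def nu(x, M):
    """nu(x) = phi(x/M) - phi(M x) for x > 0 and zero otherwise."""
    return phi(x / M) - phi(M * x) if x > 0 else 0.0

def scale_sum(x, M, terms=60):
    """Partial sum of nu(M^(2j) x) over j = 0, ..., terms - 1."""
    return sum(nu(M ** (2 * j) * x, M) for j in range(terms))

if __name__ == "__main__":
    M = 4.0
    for x in (0.9, 0.3, 1e-3, 1e-6):
        print(x, scale_sum(x, M))
```

For each fixed $x > 0$ only finitely many terms are nonzero, so a finite partial sum suffices once $M^{2j+1} x \ge 2$.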
Conventions VIII.3. (i) We choose $M$ so big that $\nu^{(\ge 1)}(k) \le U(\mathbf{k})$ for all $k = (k_0,\mathbf{k}) \in \mathbb{R}\times\mathbb{R}^d$.
(ii) We also use the notations
\[ \nu^{(\le j)}(k) = \sum_{i=0}^{j} \nu^{(i)}(k), \qquad \nu^{(<j)}(k) = \nu^{(\le j-1)}(k), \qquad \nu^{(>j)}(k) = \nu^{(\ge j+1)}(k) . \]
(iii) Generic constants that depend only on the dispersion relation $e(\mathbf{k})$ and the ultraviolet cutoff $U(\mathbf{k})$ will be denoted by “const”. Generic constants that may also depend on the scale parameter $M$, but still not on the scale $j$, will be denoted “const”.

For technical discussions we also need a set of functions that “envelope” the various shells. We set $\tilde\nu(x) = \varphi(x/M^2) - \varphi(M^2 x)$ for $x > 0$ and zero otherwise. It is in $C_0^\infty((\frac{1}{M^2}, 2M^2))$, takes values in $[0,1]$ and is identically one on $[\frac{2}{M^2}, M^2]$ and hence on the support of $\nu$, assuming that $M \ge 2$.

Definition VIII.4. (i) For $j \ge 1$, the $j$th extended scale function on $\mathbb{R}\times\mathbb{R}^d$ is defined as
\[ \tilde\nu^{(j)}(k) = \tilde\nu\big( M^{2j} (k_0^2 + e(\mathbf{k})^2) \big) . \]
The support of $\tilde\nu^{(j)}$ is called the $j$th extended shell. It is (for $j \ge 2$) contained in the union of the $(j-1)$st, $j$th and $(j+1)$st shells. In fact, if $M \ge 2$, $\tilde\nu^{(j)}$ is identically one on the $j$th shell and, if $j, M \ge 2$, $\nu^{(j-1)} + \nu^{(j)} + \nu^{(j+1)}$ is identically one on the $j$th extended shell.
(ii) By definition, the $j$th extended neighborhood is the union of the $i$th extended shells with $i \ge j$. It is (for $j \ge 2$) contained in the $(j-1)$st neighborhood of the Fermi surface. The function
\[ \tilde\nu^{(\ge j)}(k) = \varphi\big( M^{2j-2} (k_0^2 + e(\mathbf{k})^2) \big) \]
is supported on the $j$th extended neighborhood and identically one on the $j$th neighborhood.
(iii) Set $\bar\nu^{(\ge j)}(k) = \varphi\big( M^{2j-3} (k_0^2 + e(\mathbf{k})^2) \big) = \nu^{(\ge j-1)}(k)$. Then $\bar\nu^{(\ge j)}(k)$ is identically one on the $j$th extended neighborhood. The support of $\bar\nu^{(\ge j)}$ is called the $j$th doubly extended neighborhood and is contained in $\{ k \in \mathbb{R}\times\mathbb{R}^d \mid |\imath k_0 - e(\mathbf{k})| \le \sqrt{2}\, M^{3/2}\, \frac{1}{M^j} \}$.

Observe that the ultraviolet cutoff $U(\mathbf{k})$ does not depend on $k_0$, so that the propagator $\frac{U(\mathbf{k})}{\imath k_0 - e(\mathbf{k})}$ is not compactly supported. However, we can use the results of Part 1 to integrate out the ultraviolet part of the model in the $k_0$-direction and to pass to a model whose propagator is supported in the second neighborhood. We choose to pass to a model supported in the second, rather than first, neighborhood because the second doubly extended neighborhood can be made arbitrarily small by choosing $M$ sufficiently large. In the renormalization group analysis we shall add a counterterm $\delta e(\mathbf{k})$ to the dispersion relation $e(\mathbf{k})$.

Definition VIII.5. Let $\mu > 0$. The space of counterterms, $\mathcal{E}_\mu$, consists of all functions $\delta e(\mathbf{k})$ on $\mathbb{R}^d$ that are supported in $\{\mathbf{k} \in \mathbb{R}^d \mid U(\mathbf{k}) \ne 0\}$ and obey
\[ \|\delta\hat e\|_{1,\infty} < \mu + \sum_{\delta\ne 0} \infty\, t^\delta \]
where $\delta\hat e$ was defined just before Definition IV.10 and the norm $\|\cdot\|_{1,\infty}$ was defined in Example II.6.

Recall from Definition IV.10 that
\[ c_0 = \sum_{\substack{|\delta|\le r \\ |\delta_0|\le r_0}} t^\delta + \sum_{\substack{|\delta|>r \\ \text{or } |\delta_0|>r_0}} \infty\, t^\delta \in \mathfrak{N}_{d+1} \]
and $e_0(X) = \frac{c_0}{1-X}$, for $X \in \mathfrak{N}_{d+1}$ with $X_0 < 1$.
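Theorem VIII.6. Fix $j_0 \ge 1$ and set, for $\delta e \in \mathcal{E}_\mu$,
\[ C_0(k; \delta e) = \frac{U(\mathbf{k}) - \nu^{(>j_0)}(k)}{\imath k_0 - e(\mathbf{k}) + \delta e(\mathbf{k})} . \]
Define the covariance $C_0(\delta e)$ by
\[ C_0(\xi,\xi'; \delta e) = \begin{cases} \delta_{\sigma,\sigma'} \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\ e^{\imath\langle k,\, x-x'\rangle_-}\ C_0(k; \delta e) & \text{if } a = 0,\ a' = 1 \\ 0 & \text{if } a = a' \\ -C_0(\xi',\xi; \delta e) & \text{if } a = 1,\ a' = 0 \end{cases} \]
for $\xi = (x,a) = (x_0,\mathbf{x},\sigma,a)$, $\xi' = (x',a') = (x_0',\mathbf{x}',\sigma',a')$. Then there are ($M$ and $j_0$-dependent) constants $b$, $\beta_0$, $\varepsilon_0$, const and $\mu > 0$ such that, for all $\beta \ge \beta_0$ and $\varepsilon \le \varepsilon_0$, the following holds:
Choose a system $\rho = (\rho_{m;n})_{m,n\in\mathbb{N}_0}$ of positive real numbers obeying $\rho_{m;n-1} \le \rho_{m;n}$, $\rho_{m+1;n-1} \le \rho_{m;n}$ and $\rho_{m+m';\,n+n'-2} \le \rho_{m;n}\, \rho_{m';n'}$. For an even Grassmann function
\[ \mathcal{W}(\phi,\psi) = \sum_{\substack{m,n\ge 0 \\ m+n \text{ even}}} \int_{\mathcal{B}^{m+n}} d\eta_1 \cdots d\eta_m\, d\xi_1 \cdots d\xi_n\ W_{m,n}(\eta_1,\dots,\eta_m,\xi_1,\dots,\xi_n)\ \phi(\eta_1)\cdots\phi(\eta_m)\, \psi(\xi_1)\cdots\psi(\xi_n) \]
with kernels $W_{m,n}$ that are separately antisymmetric under permutations of their $\eta$ and $\xi$ arguments and $X \in \mathfrak{N}_{d+1}$ with $X_0 < 1$, set
\[ N_0(\mathcal{W}; \beta; X, \rho) = e_0(X) \sum_{\substack{m+n\ge 2 \\ m+n \text{ even}}} \beta^n\, \rho_{m;n}\, \|W_{m,n}\|_{1,\infty} . \]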
˜ C (δe) (V) converges Let X ∈ Nd+1 with X0 < 41 . The formal Taylor series Ω 0 to an analytic map on {(V(ψ), δe)|V even, N0 (V; 32β; X, ρ)0 ≤ εe0 (X)0 , δe ∈ Eµ , k|δˆ ek|1,∞ ≤ X0 }. Furthermore, for all δe ∈ Eµ with kδˆ ek1,∞ ≤ X and all even Grassmann functions V(ψ) with N0 (V; 32β; X, ρ) ≤ εe0 (X), one has ˜ C (δe) (V)(φ, ψ) − V(ψ) − 1 φJC0 (δe)Jφ; β; X, ρ ≤ εb e0 (X) N0 Ω 0 2 β and N0
d ˜ 1 ΩC0 (δe+sδe0 ) (V)(φ, ψ) − φJC0 (δe + sδe0 )Jφ ; β; X, ρ ds 2 s=0
≤
εb e0 (X)kδˆ e0 k1,∞ . β
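Proof. By Proposition IV.5 (with $\chi=\nu^{(>j_0)}$ and $e$ replaced by $e-\delta e$) and Proposition IV.11, there is a constant $\mathrm{const}_1$ such that
\[ S(C_0(\delta e))\le\mathrm{const}_1,\qquad |\!|\!|C_0(\delta e)|\!|\!|_\infty\le\mathrm{const}_1\qquad\text{and}\qquad \|C_0(\delta e)\|_{1,\infty}\le\tfrac14\,\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty}). \]
Set $\|\cdot\|=\rho_{m;n}\|\cdot\|_{1,\infty}$ for functions on $\mathcal B^m\times\mathcal B^n$. Set $b=4\,\mathrm{const}_1$ and $c=\mathrm{const}_1\,e_0(X)$. By Lemma V.1, with respect to this family of seminorms, $b$ is an integral bound for $C_0(\delta e)$ and $c$ is a contraction bound for $C_0(\delta e)$. Furthermore, by Lemma VII.8, $C_0(\delta e)$ is $\mathrm{const}_1$-external improving. For any Grassmann function $\mathcal W(\phi,\psi)$ let $N(\mathcal W;c,b,\alpha)$ be the norm of Definition III.9 with respect to the family of seminorms $\|\cdot\|$. Set $\alpha=\frac{\beta}{b}$. Then, as $c=\frac b4e_0(X)$,
\[ N(\mathcal W;c,b,\alpha)=\frac{1}{4b}N_0(\mathcal W;\beta;X,\rho) \]
and, if $\mathcal V={:}\mathcal V'{:}_{C_0(\delta e)}$,
\[ N(\mathcal V';c,b,\alpha)\le\frac{1}{4b}N_0(\mathcal V;2\beta;X,\rho),\qquad N(\mathcal V'-\mathcal V;c,b,\alpha)\le\frac{b}{2\beta^2}N_0(\mathcal V;2\beta;X,\rho) \]
by [1, Corollary II.32].

To prove the first part of the Theorem, set $\mathcal V={:}\mathcal V'{:}_{C_0(\delta e)}$. Then the hypotheses of Theorem III.10 with $C=C_0(\delta e)$ and $\mathcal W=\mathcal V'$ are fulfilled. Therefore,
\begin{align*}
\frac{1}{4b}N_0(\Omega_{C_0(\delta e)}(\mathcal V)-\mathcal V;\beta;X,\rho)
&\le N(\Omega_{C_0(\delta e)}({:}\mathcal V'{:}_{C_0(\delta e)})-\mathcal V';c,b,\alpha)+N(\mathcal V'-\mathcal V;c,b,\alpha)\\
&\le\frac{2}{\alpha^2}\,\frac{N(\mathcal V';c,b,8\alpha)^2}{1-\frac{4}{\alpha^2}N(\mathcal V';c,b,8\alpha)}+N(\mathcal V'-\mathcal V;c,b,\alpha)\\
&\le\frac{2}{\alpha^2}\,\frac{\frac{1}{16b^2}N_0(\mathcal V;16\beta;X,\rho)^2}{1-\frac{4}{\alpha^2}\frac{1}{4b}N_0(\mathcal V;16\beta;X,\rho)}+\frac{b}{2\beta^2}N_0(\mathcal V;2\beta;X,\rho)\\
&\le\frac{\varepsilon^2}{8\beta^2}\,\frac{e_0(X)^2}{1-\frac{\varepsilon b}{\beta^2}e_0(X)}+\frac{\varepsilon b}{2\beta^2}e_0(X)\\
&\le\frac{\varepsilon b}{\beta^2}\,e_0(X)
\end{align*}
since $\varepsilon_0$ is chosen so that $\frac{b}{\beta^2}\varepsilon<\frac14$ and, by Corollary A.5(ii), $\frac{e_0(X)^2}{1-\frac14e_0(X)}\le\mathrm{const}\;e_0(X)$. By Lemma VII.3 and Proposition VII.6,
\begin{align*}
N_0\big(\tilde\Omega_{C_0(\delta e)}(\mathcal V)-\tfrac12\phi JC_0(\delta e)J\phi-\mathcal V;\beta;X,\rho\big)
&=N_0\big(\Omega_{C_0(\delta e)}(\mathcal V)(\phi,\psi+C_0(\delta e)J\phi)-\mathcal V(\psi);\beta;X,\rho\big)\\
&\le N_0\big(\Omega_{C_0(\delta e)}(\mathcal V)(\phi,\psi+C_0(\delta e)J\phi)-\Omega_{C_0(\delta e)}(\mathcal V)(\phi,\psi);\beta;X,\rho\big)\\
&\qquad+N_0\big(\Omega_{C_0(\delta e)}(\mathcal V)(\phi,\psi)-\mathcal V(\psi);\beta;X,\rho\big)\\
&\le\frac{1}{4\alpha}N_0(\Omega_{C_0(\delta e)}(\mathcal V);2\beta;X,\rho)+\frac{4\varepsilon b^2}{\beta^2}e_0(X)\\
&\le\frac{1}{4\alpha}\big[N_0(\Omega_{C_0(\delta e)}(\mathcal V)-\mathcal V;2\beta;X,\rho)+N_0(\mathcal V;2\beta;X,\rho)\big]+\frac{4\varepsilon b^2}{\beta^2}e_0(X)\\
&\le\frac{b}{4\beta}\Big[\frac{\varepsilon b^2}{\beta^2}e_0(X)+\varepsilon e_0(X)\Big]+\frac{4\varepsilon b^2}{\beta^2}e_0(X)\\
&\le\frac{\varepsilon b}{\beta}\,e_0(X).
\end{align*}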
The joint analyticity in V and δe follows from Proposition IV.11 and [1, Remark III.11].
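The final inequality of the estimate above is elementary arithmetic. As a sketch only: reading the penultimate bound as $\frac{b}{4\beta}\big[\frac{\varepsilon b^2}{\beta^2}+\varepsilon\big]e_0(X)+\frac{4\varepsilon b^2}{\beta^2}e_0(X)$, and assuming for illustration that $\beta\ge8b$ (the theorem only asks for $\beta\ge\beta_0$; the explicit threshold $8b$ is an assumption made here, not a value taken from the text), one can check coefficientwise:

```latex
\begin{align*}
\frac{b}{4\beta}\Big[\frac{\varepsilon b^2}{\beta^2}+\varepsilon\Big]e_0(X)
  +\frac{4\varepsilon b^2}{\beta^2}\,e_0(X)
&=\frac{\varepsilon b}{\beta}\Big[\frac{b^2}{4\beta^2}+\frac14+\frac{4b}{\beta}\Big]e_0(X)\\
% with b/\beta \le 1/8:  b^2/(4\beta^2) \le 1/256  and  4b/\beta \le 1/2
&\le\frac{\varepsilon b}{\beta}\Big[\frac{1}{256}+\frac14+\frac12\Big]e_0(X)
\;\le\;\frac{\varepsilon b}{\beta}\,e_0(X).
\end{align*}
```

Any sufficiently large lower bound on $\beta/b$ would serve equally well; the point is only that the bracket stays below one.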
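Finally, we prove the bound on $\frac{d}{ds}\big[\tilde\Omega_{C_0(\delta e+s\delta e')}(\mathcal V)(\phi,\psi)-\frac12\phi JC_0(\delta e+s\delta e')J\phi\big]_{s=0}$. As
\[ \frac{d}{ds}C_0(k;\delta e+s\delta e')\Big|_{s=0}=-\frac{U(\mathbf k)-\nu^{(>j_0)}(k)}{[\imath k_0-e(\mathbf k)+\delta e(\mathbf k)]^2}\,\delta e'(\mathbf k), \]
Propositions IV.3(i), IV.8(i) and IV.11(ii) give that
\begin{align*}
S\Big(\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big)&\le\mathrm{const}_1\sqrt{|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}}\\
\Big|\!\Big|\!\Big|\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big|\!\Big|\!\Big|_\infty&\le\mathrm{const}_1\,|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}\\
\Big\|\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big\|_{1,\infty}&\le\tfrac12\,\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e'\|_{1,\infty}.
\end{align*}
Set $b'=4\,\mathrm{const}_1\sqrt{|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}}$ and $c'=\mathrm{const}_1\,e_0(X)\|\delta\hat e'\|_{1,\infty}$. By Lemma V.1, $\frac12b'$ is an integral bound for $\frac{d}{ds}C_0(\delta e+s\delta e')|_{s=0}$ and $c'$ is a contraction bound for $\frac{d}{ds}C_0(\delta e+s\delta e')|_{s=0}$. Furthermore, by Lemma VII.8, $\frac{d}{ds}C_0(\delta e+s\delta e')|_{s=0}$ is $\mathrm{const}_1|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}$ external improving. Define $C_\kappa=C_0(\delta e+\kappa\delta e')$ and $\mathcal W_\kappa$ by ${:}\mathcal W_\kappa{:}_{C_\kappa}=\mathcal V$. Even when $\alpha$ is replaced by $2\alpha$, the hypotheses of [1, Lemmas IV.5(i) and IV.7(i)], with $\mu=1$, are satisfied. By these two lemmas, followed by [1, Corollary II.32(iii)],
\begin{align*}
N\Big(\frac{d}{ds}\Omega_{C_s}(\mathcal V)\Big|_{s=0};c,b,\alpha\Big)
&=N\Big(\frac{d}{ds}\Omega_{C_s}({:}\mathcal W_s{:}_{C_s})\Big|_{s=0};c,b,\alpha\Big)\\
&\le N\Big(\frac{d}{ds}\Omega_{C_s}({:}\mathcal V'{:}_{C_s})\Big|_{s=0};c,b,\alpha\Big)+N\Big(\frac{d}{ds}\Omega_{C_0}({:}\mathcal W_s{:}_{C_0})\Big|_{s=0};c,b,\alpha\Big)\\
&\le\frac{1}{2\alpha^2}\,\frac{N(\mathcal V';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N(\mathcal V';c,b,8\alpha)}\,\mathrm{const}_1\,e_0(X)\|\delta\hat e'\|_{1,\infty}\\
&\qquad+\Big[1+\frac{2}{\alpha^2}\,\frac{N(\mathcal V';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N(\mathcal V';c,b,8\alpha)}\Big]N\Big(\frac{d}{ds}\mathcal W_s\Big|_{s=0};c,b,2\alpha\Big)\\
&\le\frac{1}{2\alpha^2}\,\frac{N(\mathcal V';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N(\mathcal V';c,b,8\alpha)}\,\mathrm{const}_1\,e_0(X)\|\delta\hat e'\|_{1,\infty}\\
&\qquad+\Big[1+\frac{2}{\alpha^2}\,\frac{N(\mathcal V';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N(\mathcal V';c,b,8\alpha)}\Big]\frac{\|\delta\hat e'\|_{1,\infty}}{(2\alpha-1)^2}\,N(\mathcal V;c,b,4\alpha)\\
&\le\mathrm{const}\,\frac{\varepsilon}{\alpha^2}\,e_0(X)\|\delta\hat e'\|_{1,\infty}
\end{align*}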
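as above. By Lemma VII.3, Proposition VII.6 and Corollary VII.7
\begin{align*}
N_0\Big(\frac{d}{ds}\Big[\tilde\Omega_{C_0(\delta e+s\delta e')}(\mathcal V)(\phi,\psi)&-\tfrac12\phi JC_0(\delta e+s\delta e')J\phi\Big]_{s=0};\beta;X,\rho\Big)\\
&=N_0\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal V)(\phi,\psi+C_0(\delta e+s\delta e')J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\le N_0\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal V)(\phi,\psi+C_0(\delta e)J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\qquad+N_0\Big(\frac{d}{ds}\Omega_{C_0(\delta e)}(\mathcal V)(\phi,\psi+C_0(\delta e+s\delta e')J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\le\frac{1}{4\alpha}N_0\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal V)\Big|_{s=0};2\beta;X,\rho\Big)+\frac{1}{2\alpha}N_0(\Omega_{C_0(\delta e)}(\mathcal V);2\beta;X,\rho)\,|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}\\
&\le\mathrm{const}\,\frac{\varepsilon}{\alpha^3}\,e_0(X)\|\delta\hat e'\|_{1,\infty}+\frac{\varepsilon}{2\alpha}\,e_0(X)\|\delta\hat e'\|_{1,\infty}\\
&\le\frac{\varepsilon b}{\beta}\,e_0(X)\|\delta\hat e'\|_{1,\infty}
\end{align*}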
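as above.

Remark VIII.7. (i) Let
\[ \mathcal V(\psi)=\sum_{n\ \mathrm{even}}\int_{\mathcal B^n}d\xi_1\cdots d\xi_n\,V_n(\xi_1,\dots,\xi_n)\,\psi(\xi_1)\cdots\psi(\xi_n) \]
as in Theorem VIII.6. Since $c_0^2\ge\mathrm{const}\;c_0$, if
\[ \sum_n(32\beta)^n\rho_{0;n}\|V_n\|_{1,\infty}\le\frac{\varepsilon}{\mathrm{const}}\,c_0 \]
then the hypothesis $N_0(\mathcal V;32\beta;X,\rho)\le\varepsilon e_0(X)$ is satisfied.

(ii) Observe that $\Omega_{C_0}(\mathcal V)$ does not depend on $\phi$, while $\tilde\Omega_{C_0}(\mathcal V)$ does.

(iii) In the applications we have in mind, there is a small constant $\lambda>0$ (the coupling constant) and a small number $\upsilon>0$ such that
\[ \rho_{m;n}=\begin{cases}\dfrac{1}{\lambda^{(1-\upsilon)(m+n-2)/2}} & \text{if } m+n\ge4\\[2ex]\dfrac{1}{\lambda^{(1-\upsilon)}} & \text{if } m+n=2.\end{cases} \]
Then the hypotheses on $\rho_{m;n}$ in the theorem are fulfilled.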
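The submultiplicativity hypothesis $\rho_{m+m';n+n'-2}\le\rho_{m;n}\rho_{m';n'}$ for the choice in part (iii) can be checked by hand. The following is a sketch under stated assumptions: $0<\lambda<1$ and $0<\upsilon<1$, the labels $k=m+n$, $k'=m'+n'$ (both even) and the exponent function $p$ are introduced here for illustration, and only this one hypothesis is checked (the formula in the remark is stated for even $m+n$ only).

```latex
% \rho_{m;n} = \lambda^{-(1-\upsilon)\,p(k)},  k = m+n,
% with p(2) = 1 and p(k) = (k-2)/2 for even k \ge 4.
\begin{align*}
k,k'\ge4:&\quad p(k+k'-2)=\tfrac{k+k'-4}{2}=p(k)+p(k'),\\
k=2,\ k'\ge4:&\quad p(k+k'-2)=p(k')\le 1+p(k')=p(2)+p(k'),\\
k=k'=2:&\quad p(k+k'-2)=p(2)=1\le 2=p(2)+p(2).
\end{align*}
% Since 0<\lambda<1 and 1-\upsilon>0, the map p \mapsto \lambda^{-(1-\upsilon)p}
% is increasing, so p(k+k'-2) \le p(k)+p(k') gives
% \rho_{m+m';n+n'-2} \le \rho_{m;n}\,\rho_{m';n'}.
```

The case $k,k'\ge4$ holds with equality, which is why the exponent $(m+n-2)/2$ is the natural one for this condition.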
Remark VIII.8. The norms of Theorem VIII.6 are too coarse for a multi scale analysis of many fermion systems. See [3, Sec. II, Subsec. 7]. In the notation of this paper, this may be seen as follows. For simplicity, set $r=r_0=0$. Let
\[ C^{(j)}(k)=\frac{\nu^{(j)}(k)}{\imath k_0-e(\mathbf k)}. \]
The condition
\[ \rho_{0;n+n'-2}\le\rho_{0;n}\rho_{0;n'} \]
with $n=n'=2$ implies that $\rho_{0;2}\ge1$ and hence $\rho_{0;n}\ge1$ for all $n\ge2$. From Proposition IV.3(i) one deduces that $\frac{\mathrm{const}}{M^{j/2}}$ is an integral bound for $C^{(j)}$. A direct application of Proposition IV.8(i) and Lemma V.1 gives the poor estimate $M^{dj}t^0+\sum_{|\delta|>0}\infty t^\delta$ for a contraction bound for $C^{(j)}$. A more careful argument in which one

• decomposes $C^{(j)}=\sum_{s\in\Sigma}C_s^{(j)}$ into $M^{(d-1)j/2}$ terms each having the projection of $\mathbf k$ onto the Fermi surface restricted to a roughly rectangular region of side $M^{-j/2}$ (see [4, Definition XII.1] for the precise construction)
• applies Proposition IV.8(i) to obtain $|\!|\!|C_s^{(j)}|\!|\!|_{1,\infty}\le\mathrm{const}\;M^j$ for all $s\in\Sigma$

yields $\mathrm{const}\;M^{\frac{d+1}{2}j}t^0+\sum_{|\delta|>0}\infty t^\delta$ as a realistic contraction bound. Thus, for an even Grassmann function
\[ \mathcal W(\psi)=\sum_{n=0}^\infty\int d\xi_1\cdots d\xi_{2n}\,W_{2n}(\xi_1,\dots,\xi_{2n})\,\psi(\xi_1)\cdots\psi(\xi_{2n}) \]
the norm $N(\mathcal W;c,b,\alpha)$ of Definition III.9 has
\[ N(\mathcal W;c,b,\alpha)_0=\mathrm{const}\;M^{\frac{d+3}{2}j}\Big\{\frac{\alpha^2}{M^j}\rho_{0;2}|\!|\!|W_2|\!|\!|_{1,\infty}+\frac{\alpha^4}{M^{2j}}\rho_{0;4}|\!|\!|W_4|\!|\!|_{1,\infty}+\sum_{n\ge3}\Big(\frac{\mathrm{const}\;\alpha^2}{M^j}\Big)^n\rho_{0;2n}|\!|\!|W_{2n}|\!|\!|_{1,\infty}\Big\}. \]
In particular, for $N(\mathcal W;c,b,\alpha)_0$ to be order one, it is necessary that $|\!|\!|W_4|\!|\!|_{1,\infty}$ is of order $\frac{1}{M^{\frac{d-1}{2}j}}$. For $d\ge2$, this is not even the case for the original interaction $\mathcal V$, with all momenta restricted to the jth shell.
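The power counting behind the last claim of Remark VIII.8 can be spelled out. This is a sketch that isolates the four-point term of $N(\mathcal W;c,b,\alpha)_0$ and treats $\alpha$ and const as order one (an assumption made for the illustration); recall that $\rho_{0;4}\ge1$.

```latex
\begin{align*}
M^{\frac{d+3}{2}j}\cdot\frac{\alpha^4}{M^{2j}}\,\rho_{0;4}\,
  |\!|\!|W_4|\!|\!|_{1,\infty}
&=\alpha^4\rho_{0;4}\,M^{\frac{d-1}{2}j}\,|\!|\!|W_4|\!|\!|_{1,\infty},
\qquad\text{since }\ \tfrac{d+3}{2}-2=\tfrac{d-1}{2},\\
N(\mathcal W;c,b,\alpha)_0=O(1)
&\ \Longrightarrow\
|\!|\!|W_4|\!|\!|_{1,\infty}=O\big(M^{-\frac{d-1}{2}j}\big).
\end{align*}
```

For $d=1$ the exponent vanishes and no smallness is forced; for $d\ge2$ the required decay grows with the scale $j$, which is the obstruction the remark points to.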
IX. The Fourier Transform

Theorem VIII.6 and its higher scale analog, Theorem XV.3, shall be used in a renormalization group flow to obtain estimates on the suprema of connected Green's functions in position space. We also wish to obtain estimates on the suprema of certain connected amputated Green's functions in momentum space. In Sec. X, we introduce norms tailored to that purpose and prove an analog of Theorem VIII.6. In this section, we set up notation and concepts for the passage between position space and momentum space that will be needed to do that.
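In Remark VIII.8, we pointed out that estimates based on integral and contraction bounds of the propagator $\frac{\nu^{(\ge j)}}{\imath k_0-e(\mathbf k)}$ are worse than those one expects by naive power counting in momentum space. The reason is that conservation of momentum is not exploited effectively. In the case $d=2$, this difficulty can be completely overcome by introducing sectors, see [5], [6, Sec. X], [3, Sec. II, Subsec. 8] and Sec. XII. They localize momenta into small pieces of the shells introduced in Definition VIII.1. The discussion of this section is also useful for that purpose.

To systematically deal with Fourier transforms, we call $\check{\mathcal B}=\mathbb R\times\mathbb R^d\times\{\uparrow,\downarrow\}\times\{0,1\}$ "momentum space". For $\check\xi=(k,\sigma',a')=(k_0,\mathbf k,\sigma',a')\in\check{\mathcal B}$ and $\xi=(x,a)=(x_0,\mathbf x,\sigma,a)\in\mathcal B$ we define the inner product
\[ \langle\check\xi,\xi\rangle=\delta_{\sigma',\sigma}\delta_{a',a}(-1)^a\langle k,x\rangle_-=\delta_{\sigma',\sigma}\delta_{a',a}(-1)^a(-k_0x_0+k_1x_1+\cdots+k_dx_d), \]
"characters"
\begin{align*}
E_+(\check\xi,\xi)&=\delta_{\sigma',\sigma}\delta_{a',a}\,e^{\imath\langle\check\xi,\xi\rangle}=\delta_{\sigma',\sigma}\delta_{a',a}\,e^{\imath(-1)^a(-k_0x_0+k_1x_1+\cdots+k_dx_d)}\\
E_-(\check\xi,\xi)&=\delta_{\sigma',\sigma}\delta_{a',a}\,e^{-\imath\langle\check\xi,\xi\rangle}=\delta_{\sigma',\sigma}\delta_{a',a}\,e^{-\imath(-1)^a(-k_0x_0+k_1x_1+\cdots+k_dx_d)}
\end{align*}
and integrals
\[ \int d\xi\;\bullet=\sum_{\substack{a\in\{0,1\}\\ \sigma\in\{\uparrow,\downarrow\}}}\int_{\mathbb R\times\mathbb R^d}dx_0\,d^d\mathbf x\;\bullet\qquad\qquad\int d\check\xi\;\bullet=\sum_{\substack{a\in\{0,1\}\\ \sigma\in\{\uparrow,\downarrow\}}}\int_{\mathbb R\times\mathbb R^d}dk_0\,d^d\mathbf k\;\bullet. \]
For $\check\xi=(k,\sigma,a)$, $\check\xi'=(k',\sigma',a')\in\check{\mathcal B}$ we set
\[ \check\xi+\check\xi'=(-1)^ak+(-1)^{a'}k'\in\mathbb R\times\mathbb R^d. \]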
Definition IX.1 (Fourier transforms). Let $f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)$ be a translation invariant function on $\mathcal B^m\times\mathcal B^n$.

(i) The total Fourier transform $\check f$ of $f$ is defined by
\[ \check f(\check\eta_1,\dots,\check\eta_m;\check\xi_1,\dots,\check\xi_n)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m+\check\xi_1+\cdots+\check\xi_n)=\int\prod_{i=1}^mE_+(\check\eta_i,\eta_i)\,d\eta_i\prod_{j=1}^nE_+(\check\xi_j,\xi_j)\,d\xi_j\;f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n) \]
or, equivalently, by
\[ f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)=\int\prod_{i=1}^m\frac{E_-(\check\eta_i,\eta_i)\,d\check\eta_i}{(2\pi)^{d+1}}\prod_{j=1}^n\frac{E_-(\check\xi_j,\xi_j)\,d\check\xi_j}{(2\pi)^{d+1}}\;\check f(\check\eta_1,\dots,\check\eta_m;\check\xi_1,\dots,\check\xi_n)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m+\check\xi_1+\cdots+\check\xi_n). \]
$\check f$ is defined on the set $\{(\check\eta_1,\dots,\check\eta_m;\check\xi_1,\dots,\check\xi_n)\in\check{\mathcal B}^m\times\check{\mathcal B}^n\mid\check\eta_1+\cdots+\check\eta_m+\check\xi_1+\cdots+\check\xi_n=0\}$. If $m=0$, $n=2$ and $f(\xi_1,\xi_2)$ conserves particle number and is spin independent and antisymmetric, we define $\check f(k)$ by
\[ \check f((k,\sigma,1),(k,\sigma',0))=\delta_{\sigma,\sigma'}\check f(k) \]
or equivalently by
\[ \check f(k)=\int dy\;e^{\imath\langle k,y\rangle_-}f((0,\sigma,1),(y,\sigma,0)). \]

(ii) If $n\ge1$, the partial Fourier transform $f^\sim$ is defined by
\[ f^\sim(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)=\int\Big(\prod_{i=1}^mE_+(\check\eta_i,\eta_i)\,d\eta_i\Big)f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n) \]
or, equivalently, by
\[ f(\eta_1,\dots,\eta_m;\xi_1,\dots,\xi_n)=\int\Big(\prod_{i=1}^mE_-(\check\eta_i,\eta_i)\,\frac{d\check\eta_i}{(2\pi)^{d+1}}\Big)f^\sim(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n). \]
If $n=0$, we set $f^\sim=\check f$.
Remark IX.2. Translation invariance of $f$ implies that for all $t\in\mathbb R\times\mathbb R^d$
\[ f^\sim(\check\eta_1,\dots,\check\eta_m;\xi_1+t,\dots,\xi_n+t)=e^{\imath\langle\check\eta_1+\cdots+\check\eta_m,t\rangle_-}f^\sim(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n). \]
This is what we mean when we say that "$f^\sim(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)$ is translation invariant".

There will be two different situations in which we wish to associate a two-point function $B(\xi,\xi')$ in position/spin/particle–hole space, $\mathcal B$, to a function $B(k)$ in momentum space that has no spin/particle–hole dependence. In the first case, treated in Definition IX.3 below, $B(\xi,\xi')$ is a propagator and so is spin-independent and particle number conserving, so that $B(\xi,\xi')$ vanishes unless one of $\xi,\xi'$ is particle and the other is hole. In the second case, treated in Definition IX.4 below, convolution with $B(\xi,\xi')$ corresponds to pure multiplication by $B(k)$ in momentum space. This is used, for example, to introduce partitions of unity in momentum space. In this case $B(\xi,\xi')$ is diagonal in the particle/hole indices.

Definition IX.3 (Fourier transforms of covariances). If $C(k)$ is a function on $\mathbb R\times\mathbb R^d$, we say that the covariance $C(\xi,\xi')$ on $\mathcal B\times\mathcal B$, defined by
\[ C((x_0,\mathbf x,\sigma,a),(x_0',\mathbf x',\sigma',a'))=\begin{cases}\delta_{\sigma,\sigma'}\displaystyle\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{\imath\langle k,x-x'\rangle_-}\,C(k) & \text{if } a=0,\ a'=1\\[1ex] 0 & \text{if } a=a'\\[1ex] -C((x',\sigma',a'),(x,\sigma,a)) & \text{if } a=1,\ a'=0\end{cases} \]
is the Fourier transform of $C(k)$.
As in part (ii) of Proposition IV.3, we use the notation

Definition IX.4. If $\chi(k)$ is a function on $\mathbb R\times\mathbb R^d$, we define the Fourier transform $\hat\chi$ by
\[ \hat\chi(\xi,\xi')=\delta_{\sigma,\sigma'}\delta_{a,a'}\int e^{(-1)^a\imath\langle k,x-x'\rangle_-}\,\chi(k)\,\frac{d^{d+1}k}{(2\pi)^{d+1}} \]
for $\xi=(x,a)=(x_0,\mathbf x,\sigma,a)$, $\xi'=(x',a')=(x_0',\mathbf x',\sigma',a')\in\mathcal B$.

Lemma IX.5. (i) Let $C(k)$ be a function on $\mathbb R\times\mathbb R^d$ and $C(\xi,\xi')$ the associated covariance in the sense of Definition IX.3. Then
\[ CJ=\hat C\qquad\qquad(JCJ)\check{\ }(k)=C(k) \]
where $J$ was defined in (VI.1), $\hat C$ was defined in Definition IX.4 and $\check f(k)$ was defined in Definition IX.1(i).

(ii) Let $\chi(k)$ and $\chi'(k)$ be functions on $\mathbb R\times\mathbb R^d$. Then
\[ \int d\xi''\,\hat\chi(\xi,\xi'')\,\hat\chi'(\xi'',\xi')=\widehat{\chi\chi'}(\xi,\xi'). \]

(iii) Let $\chi(k)$ be a function on $\mathbb R\times\mathbb R^d$. Then
\[ (J\hat\chi)\check{\ }(k)=\chi(k)\qquad\qquad(J\hat\chi J)(\xi,\xi')=-\hat\chi(\xi',\xi). \]

Proof. The proof of this lemma consists of a number of three or four line computations.

In a renormalization group analysis we will adjust the counterterms in such a way that, at each scale, the Fourier transform of the two point function is small on $\{k=(k_0,\mathbf k)\in\mathbb R\times\mathbb R^d\mid k_0=0\}$. Then the absolute value of the Fourier transform of the two point function at a point $(k_0,\mathbf k)$ can be estimated in terms of $|k_0|$ and the $k_0$ derivative of the Fourier transform of the two point function. The following lemma is used to make an analogous estimate in position space. For a function $f(x)$ on $\mathbb R\times\mathbb R^d$, we define
\[ \|f\|_{L^1}=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^d}\frac{1}{\delta!}\int|x^\delta f(x)|\,d^{d+1}x\;t^\delta\in\mathfrak N_{d+1}. \]
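As an illustration of the short computations behind Lemma IX.5, here is a sketch of part (ii). It assumes that $\chi$ and $\chi'$ decay fast enough for the integrals to be exchanged (an assumption not stated in the text) and writes $\xi=(x,\sigma,a)$, $\xi''=(x'',\sigma'',a'')$, $\xi'=(x',\sigma',a')$; the sums over $\sigma''$ and $a''$ have already been carried out using the Kronecker deltas of Definition IX.4.

```latex
\begin{align*}
\int d\xi''\,\hat\chi(\xi,\xi'')\,\hat\chi'(\xi'',\xi')
&=\delta_{\sigma,\sigma'}\delta_{a,a'}
  \int\frac{d^{d+1}k}{(2\pi)^{d+1}}\frac{d^{d+1}k'}{(2\pi)^{d+1}}\,
  \chi(k)\chi'(k')
  \int d^{d+1}x''\,
  e^{(-1)^a\imath\langle k,x-x''\rangle_-}\,
  e^{(-1)^a\imath\langle k',x''-x'\rangle_-}\\
% the x''-integral is (2\pi)^{d+1}\delta(k-k'): the sign pattern of
% <.,.>_- and the factor (-1)^a are linear maps of determinant \pm 1
&=\delta_{\sigma,\sigma'}\delta_{a,a'}
  \int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,d^{d+1}k'\;
  \chi(k)\chi'(k')\,\delta(k-k')\,
  e^{(-1)^a\imath\left(\langle k,x\rangle_--\langle k',x'\rangle_-\right)}\\
&=\delta_{\sigma,\sigma'}\delta_{a,a'}
  \int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,
  e^{(-1)^a\imath\langle k,x-x'\rangle_-}\,\chi(k)\chi'(k)
 =\widehat{\chi\chi'}(\xi,\xi').
\end{align*}
```

In other words, convolution of the kernels $\hat\chi$ and $\hat\chi'$ corresponds to pointwise multiplication of $\chi$ and $\chi'$ in momentum space.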
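Lemma IX.6. Let $u(\xi,\xi')$ be a translation invariant function on $\mathcal B^2$ that satisfies $\check u(((0,\mathbf k),\sigma,a),((0,\mathbf k'),\sigma',a'))=0$. Furthermore let $\chi(k)$ be a function on $\mathbb R\times\mathbb R^d$. For $x\in\mathbb R\times\mathbb R^d$ set
\[ \chi'(x)=\int e^{\imath\langle k,x\rangle_-}\,\chi(k)\,\frac{d^{d+1}k}{(2\pi)^{d+1}} \]
so that for $\xi=(x,\sigma,a)$, $\xi'=(x',\sigma',a')\in\mathcal B$
\[ \hat\chi(\xi,\xi')=\delta_{\sigma,\sigma'}\delta_{a,a'}\,\chi'((-1)^a(x-x')). \]
Then

(i)
\[ \Big\|\int d\eta\,\hat\chi(\xi,\eta)\,u(\eta,\xi')\Big\|_{1,\infty}\le\Big\|\frac{\partial\chi'}{\partial x_0}\Big\|_{L^1}\,\big\|D^{(1,0,\dots,0)}_{1,2}u\big\|_{1,\infty}+\sum_{\substack{\delta\in\mathbb N_0\times\mathbb N_0^d\\ \delta_0\ne0}}\infty\,t^\delta. \]

(ii)
\[ \Big\|D^{(1,0,\dots,0)}_{1,2}\int d\eta\,\hat\chi(\xi,\eta)\,u(\eta,\xi')\Big\|_{1,\infty}\le\mathrm{const}\Big(\|\hat\chi\|_{1,\infty}+\Big\|x_0\frac{\partial\chi'}{\partial x_0}\Big\|_{L^1}\Big)\big\|D^{(1,0,\dots,0)}_{1,2}u\big\|_{1,\infty}+\sum_{\substack{\delta\in\mathbb N_0\times\mathbb N_0^d\\ \delta_0>r_0}}\infty\,t^\delta. \]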
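Proof. (i) Fix $\xi=(x,\sigma,a)$, $\xi'=(x',\sigma',a')\in\mathcal B$. By translation invariance
\begin{align*}
\int d\eta\,\hat\chi(\xi,\eta)\,u(\eta,\xi')&=\int dy\,\chi'((-1)^a(x-y))\,u((y,\sigma,a),(x',\sigma',a'))\\
&=\int dy\,\chi'((-1)^a(x-y))\,u((y-x',\sigma,a),(0,\sigma',a'))\\
&=\int dy\,\chi'((-1)^ay)\,v(x-x'-y)
\end{align*}
where $v(y)=u((y,\sigma,a),(0,\sigma',a'))$. By hypothesis
\[ \int dy_0\,v(y_0,\mathbf y)=0\qquad\text{for all }\mathbf y\in\mathbb R^d. \]
Therefore
\begin{align*}
\int d\eta\,\hat\chi(\xi,\eta)\,u(\eta,\xi')
&=\int dy\,\big(\chi'((-1)^ay)-\chi'((-1)^a(x_0-x_0',\mathbf y))\big)\,v(x-x'-y)\\
&=\int dy\,\frac{\chi'((-1)^ay)-\chi'((-1)^a(x_0-x_0',\mathbf y))}{(x_0-x_0')-y_0}\,\big[(x_0-x_0'-y_0)\,v(x-x'-y)\big]\\
&=(-1)^{a+1}\int dy\int_0^1ds\,\frac{\partial\chi'}{\partial x_0}\big((-1)^a(sy_0+(1-s)(x_0-x_0'),\mathbf y)\big)\;D^{(1,0,\dots,0)}_{1,2}u((x-x'-y,\sigma,a),(0,\sigma',a'))\\
&=(-1)^{a+1}\int dy\int_0^1ds\,\frac{\partial\chi'}{\partial x_0}\big((-1)^a(sy_0+(1-s)(x_0-x_0'),\mathbf y)\big)\;D^{(1,0,\dots,0)}_{1,2}u((x-y,\sigma,a),\xi').
\end{align*}
Consequently, for fixed $\xi'\in\mathcal B$
\begin{align*}
\int d\xi\,\Big|\int d\eta\,\hat\chi(\xi,\eta)\,u(\eta,\xi')\Big|
&\le\sum_{\sigma,a}\int_0^1ds\int d\mathbf x\,d\mathbf y\int dx_0\,dy_0\,\Big|\frac{\partial\chi'}{\partial x_0}\big((-1)^a(sy_0+(1-s)(x_0-x_0'),\mathbf y)\big)\Big|\;\big|D^{(1,0,\dots,0)}_{1,2}u((x_0-y_0,\mathbf x-\mathbf y,\sigma,a),\xi')\big|\\
&=\sum_{\sigma,a}\int_0^1ds\int d\mathbf x\,d\mathbf y\int d\alpha\,d\beta\,\Big|\frac{\partial\chi'}{\partial x_0}\big((-1)^a(\beta,\mathbf y)\big)\Big|\;\big|D^{(1,0,\dots,0)}_{1,2}u((\alpha,\mathbf x-\mathbf y,\sigma,a),\xi')\big|\\
&\le\Big\|\frac{\partial\chi'}{\partial x_0}\Big\|_{L^1}\,\big\|D^{(1,0,\dots,0)}_{1,2}u\big\|_{1,\infty}.\qquad\text{(IX.1)}
\end{align*}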
Here we used,Rfor each of variables α = x0 −y0 , β = sy0 +(1−s)x0 . R fixed s, the change The integral dξ 0 | dη χ(ξ, ˆ η)u(η, ξ 0 )|, for fixed ξ ∈ B, is treated similarly. By Leibniz’s rule (Lemma II.2) and (IX.1), Z X tδ Z δ 0 dξ D1,2 dη χ(ξ, ˆ η)u(η, ξ ) δ!
δ0 =0
≤
≤
X
X
δ0 =0 α,β∈N0 ×Nd 0 α+β=δ
δ
α, β
!
tδ δ!
Z
Z β α 0 dξ dη(D1,2 χ)(ξ, ˆ η)(D1,2 u)(η, ξ )
X X tα tβ
∂ α 0
kD(1,0,...,0) Dβ uk1,∞
x χ 1,2 1,2
α! β! ∂x0 t=0 L1 α =0 0
β0 =0
X X tα tβ
α ∂ 0 (1,0,...,0) β
= uk1,∞
x ∂x0 χ 1 kD1,2 D1,2 α! β! t=0 L α =0 0
β0 =0
0
∂χ (1,0,...,0)
≤ uk1,∞ .
∂x0 1 kD1,2 L
(ii) By Leibniz’s rule (Lemma II.2), part (i) of this lemma (applied to Lemma II.7 Z X tδ 0 Dδ D(1,0,...,0) dη χ(ξ, ˆ η)u(η, ξ ) 1,2 1,2 δ! 1,∞
δ∈N0 ×Nd 0
≤
X
δ∈N0 ×Nd 0
X
δ + (1, 0, . . . , 0) α, β
α,β∈N0 ×Nd 0 α+β=δ+(1,0,...,0)
!
Z β α δ 0 × dηD1,2 χ(ξ, ˆ η)D1,2 D1,2 u(η, ξ )
tδ δ!
1,∞
≤
X
δ∈N0 ×Nd 0
X
δ + (1, 0, . . . , 0)
0
α + (1, 0, . . . , 0), β
α0 ,β∈N0 ×Nd 0 α0 +β=δ, β0 =0
!
Z (1,0,...,0) β α0 0 × dηD1,2 D1,2 χ(ξ, ˆ η)D1,2 u(η, ξ )
tδ δ!
1,∞
+
X
X
δ + (1, 0, . . . , 0) 0
α, β + (1, 0, . . . , 0)
0 d δ∈N0 ×Nd 0 α,β ∈N0 ×N0 α+β 0 =δ
!
tδ δ!
Z (1,0,...,0) β0 0 α u(η, ξ ) × dηD1,2 χ(ξ, ˆ η)D1,2 D1,2
1,∞
≤
X
0
(α00
α0 ,β∈N0 ×Nd 0 β0 =0
tα tβ + 1) 0 α ! β!
Z (1,0,...,0) β α0 × dηD1,2 D1,2 χ(ξ, ˆ η)D1,2 u(η, ξ 0 )
1,∞
+
X
α,β 0 ∈N0 ×Nd 0
0
(α0 + β00 + 1)
tα tβ α! β 0 !
Z (1,0,...,0) β0 α 0 ˆ η)D1,2 D1,2 u(η, ξ ) × dηD1,2 χ(ξ,
1,∞
∂χ ∂k0 )
and
≤
X
(α00
α0 ,β∈N0 ×Nd 0 β0 =0
0 tα tβ ∂ α0 +(1,0,...,0) 0
+ 1) 0 x χ
α ! β! ∂x0 1 L t=0
(1,0,...,0) β D1,2 u × D1,2 X
+
1,∞
(α0 + β00 + 1)
α,β 0 ∈N0 ×Nd 0
β 0 (1,0,...,0) u × D1,2 D1,2
0 tα tβ α D χ ˆ α! β 0 ! 1,2 1,∞
1,∞
∂χ0
≤ (r0 + 1) (r0 + 1)kχk ˆ 1,∞ + x0 ∂x0
(1,0,...,0)
× kD1,2 since
∂ α0 +(1,0,...,0) 0 χ (x)) ∂x0 (x
uk1,∞ +
X
δ∈N0 ×Nd 0 δ0 >r0 0
L1
+ kχk ˆ 1,∞
∞tδ
0
∂ = (α00 + 1)xα χ0 (x) + xα x0 ∂x χ0 (x). 0
X. Momentum Space Norms

In this section, we introduce momentum space norms designed to control amputated Green's functions in momentum space. The set of momentum conserving m-tuples of momenta is
\[ \check{\mathcal B}_m=\{(\check\eta_1,\dots,\check\eta_m)\in\check{\mathcal B}^m\mid\check\eta_1+\cdots+\check\eta_m=0\} \]
(we use the addition introduced before Definition IX.1). We are particularly interested in the two and four point functions. In the renormalization group analysis we shall control the external fields in momentum space, while the fields that are going to be integrated out are still treated in position space. That is, we will estimate partial Fourier transforms of functions on $\mathcal B^m\times\mathcal B^n$ as in Definition IX.1(ii). Motivated by Remark IX.2 we define

Definition X.1. A function $f$ on $\check{\mathcal B}^m\times\mathcal B^n$ is called translation invariant, if for all $t\in\mathbb R\times\mathbb R^d$
\[ f(\check\eta_1,\dots,\check\eta_m;\xi_1+t,\dots,\xi_n+t)=e^{\imath\langle\check\eta_1+\cdots+\check\eta_m,t\rangle_-}f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n). \]
Generalizing Definition II.1 we set
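Definition X.2 (Differential-decay operators). Let $m,n\ge0$. If $n\ge1$, let $f$ be a function on $\check{\mathcal B}^m\times\mathcal B^n$. If $n=0$, let $f$ be a function on $\check{\mathcal B}_m$.

(i) For $1\le j\le m$ and a multiindex $\delta$ set
\[ D_j^\delta f((p_1,\tau_1,b_1),\dots,(p_m,\tau_m,b_m);\xi_1,\dots,\xi_n)=[\imath(-1)^{b_j}]^{\delta_0}\prod_{\ell=1}^d[-\imath(-1)^{b_j}]^{\delta_\ell}\;\frac{\partial^{\delta_0}}{\partial p_{j,0}^{\delta_0}}\frac{\partial^{\delta_1}}{\partial p_{j,1}^{\delta_1}}\cdots\frac{\partial^{\delta_d}}{\partial p_{j,d}^{\delta_d}}\;f((p_1,\tau_1,b_1),\dots,(p_m,\tau_m,b_m);\xi_1,\dots,\xi_n). \]

(ii) Let $1\le i\ne j\le m+n$ and $\delta$ a multiindex. Set
\begin{align*}
D_{i;j}^\delta f&=(D_i-D_j)^\delta f&&\text{if } 1\le i<j\le m\\
D_{i;j}^\delta f&=(D_i-\xi_{j-m})^\delta f&&\text{if } 1\le i\le m,\ m+1\le j\le m+n\\
D_{i;j}^\delta f&=(\xi_{i-m}-D_j)^\delta f&&\text{if } m+1\le i\le m+n,\ 1\le j\le m\\
D_{i;j}^\delta f&=(\xi_{i-m}-\xi_{j-m})^\delta f=D^\delta_{i-m,j-m}f&&\text{if } m+1\le i<j\le m+n.
\end{align*}

(iii) A differential-decay operator (dd-operator) of type $(m,n)$, with $m+n\ge2$, is an operator $D$ of the form
\[ D=D^{\delta^{(1)}}_{i_1;j_1}\cdots D^{\delta^{(r)}}_{i_r;j_r} \]
with $1\le i_\ell\ne j_\ell\le m+n$ for all $1\le\ell\le r$. A dd-operator of type $(1,0)$ is an operator of the form $D=D_1^{\delta^{(1)}}\cdots D_1^{\delta^{(r)}}=D_1^{\delta^{(1)}+\cdots+\delta^{(r)}}$. The total order of $D$ is $\delta(D)=\delta^{(1)}+\cdots+\delta^{(r)}$.

Remark X.3. (i) Let $D$ be a differential-decay operator. If $f$ is a translation invariant function on $\check{\mathcal B}^m\times\mathcal B^n$, then $Df$ is again translation invariant.

(ii) For a translation invariant function $\varphi$ on $\mathcal B^m\times\mathcal B^n$
\[ D_{i;j}(\varphi^\sim)=(D_{i,j}\varphi)^\sim. \]
In particular, Leibniz's rule also applies for differential-decay operators.

(iii) Let $f$ be a translation invariant function on $\check{\mathcal B}^m\times\mathcal B$. Then, for $\xi=(x_0,\mathbf x,\sigma,a)\in\mathcal B$,
\[ f(\check\eta_1,\dots,\check\eta_m;\xi)=e^{\imath\langle\check\eta_1+\cdots+\check\eta_m,(x_0,\mathbf x)\rangle_-}f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,a)). \]
Consequently, for $1\le i\le m$ and a multiindex $\delta$
\[ D_{i;m+1}^\delta f(\check\eta_1,\dots,\check\eta_m;\xi)=e^{\imath\langle\check\eta_1+\cdots+\check\eta_m,(x_0,\mathbf x)\rangle_-}D_i^\delta f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,a)). \]

Definition X.4. For a function $f$ on $\check{\mathcal B}_m$, set
\[ \|f\|^\sim=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\ \text{dd-operator}\\ \text{with }\delta(D)=\delta}}\ \sup_{\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}}|Df(\check\eta_1,\dots,\check\eta_m)|\;t^\delta. \]
Let $f$ be a function on $\check{\mathcal B}^m\times\mathcal B^n$ with $n\ge1$. Set
\[ \|f\|^\sim=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\ \text{dd-operator}\\ \text{with }\delta(D)=\delta}}\ \sup_{\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)|\!|\!|_{1,\infty}\;t^\delta \]
when $m\le p\le m+n$. The norm $|\!|\!|\cdot|\!|\!|_{1,\infty}$ of Example II.6 refers to the variables $\xi_1,\dots,\xi_n$. That is,
\[ |\!|\!|Df(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)|\!|\!|_{1,\infty}=\max_{1\le j_0\le n}\ \sup_{\xi_{j_0}\in\mathcal B}\int\prod_{\substack{j=1,\dots,n\\ j\ne j_0}}d\xi_j\;|Df(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)|. \]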
Remark X.5. In the case $m=0$ the norm $\|\cdot\|_{1,\infty}$ of Example II.6 and the norm $\|\cdot\|^\sim$ of Definition X.4 agree.

In analogy to Lemma II.7, we have

Lemma X.6. Let $f$ be a translation invariant function on $\check{\mathcal B}^m\times\mathcal B^n$, $f'$ a translation invariant function on $\check{\mathcal B}^{m'}\times\mathcal B^{n'}$ and $1\le\mu\le n$, $1\le\nu\le n'$. If $n\ge2$ or $n'\ge2$ define the function $g$ on $\check{\mathcal B}^{m+m'}\times\mathcal B^{n+n'-2}$ by
\begin{align*}
g(\check\eta_1,\dots,\check\eta_{m+m'};\ &\xi_1,\dots,\xi_{\mu-1},\xi_{\mu+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+\nu-1},\xi_{n+\nu+1},\dots,\xi_{n+n'})\\
&=\int_{\mathcal B}d\zeta\;f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_{\mu-1},\zeta,\xi_{\mu+1},\dots,\xi_n)\;f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\xi_{n+1},\dots,\xi_{n+\nu-1},\zeta,\xi_{n+\nu+1},\dots,\xi_{n+n'}).
\end{align*}
If $n=n'=1$, define the function $g$ on $\check{\mathcal B}_{m+m'}$ by
\[ g(\check\eta_1,\dots,\check\eta_{m+m'})\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+m'})=\int_{\mathcal B}d\zeta\;f(\check\eta_1,\dots,\check\eta_m;\zeta)\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\zeta). \]
Then
\[ \|g\|^\sim\le\|f\|^\sim\,\|f'\|^\sim\begin{cases}4 & \text{if } n=n'=1\\ 1 & \text{otherwise.}\end{cases} \]

Proof. If $n\ge2$ or $n'\ge2$, the proof is analogous to that of Lemma II.7. Therefore we only discuss the case $n=n'=1$. In this case, by Remark X.3(iii)
\begin{align*}
\int_{\mathcal B}d\xi\;f(\check\eta_1,\dots,\check\eta_m;\xi)\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\xi)
&=\int dx_0\int d\mathbf x\sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,b))\;e^{\imath\langle\check\eta_1+\cdots+\check\eta_{m+m'},(x_0,\mathbf x)\rangle_-}\;f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};(0,\sigma,b))\\
&=\sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,b))\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};(0,\sigma,b))\;(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+m'}).
\end{align*}
Consequently
\[ g(\check\eta_1,\dots,\check\eta_{m+m'})=\sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,b))\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};(0,\sigma,b)). \]
The claim now follows by iterated application of the product rule for derivatives and Remark X.3(iii). The factor of 4 comes from the sum over $\sigma$ and $b$ and is required only when $n=n'=1$.

Remark X.7. Set
\[ F(\check\eta_1,\dots,\check\eta_m)=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\ \text{dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;\,\cdot\,,\dots,\,\cdot\,)|\!|\!|_{1,\infty}\;t^\delta \]
so that
\[ \|f\|^\sim=\sup_{\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}}F(\check\eta_1,\dots,\check\eta_m) \]
(with the supremum of the formal power series $F$ taken componentwise) and define $G(\check\eta_1,\dots,\check\eta_{m+m'})$ and $F'(\check\eta_{m+1},\dots,\check\eta_{m+m'})$ similarly. The proof of Lemma X.6 actually shows that
\[ G(\check\eta_1,\dots,\check\eta_{m+m'})\le F(\check\eta_1,\dots,\check\eta_m)\,F'(\check\eta_{m+1},\dots,\check\eta_{m+m'})\begin{cases}4 & \text{if } n=n'=1\\ 1 & \text{otherwise}\end{cases} \]
for all $(\check\eta_1,\dots,\check\eta_{m+m'})\in\check{\mathcal B}^{m+m'}$.
Definition X.8. (i) For $n\ge1$, denote by $\check{\mathcal F}_m(n)$ the space of all translation invariant, complex valued functions $f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n)$ on $\check{\mathcal B}^m\times\mathcal B^n$ that are antisymmetric in their external ($=\check\eta$) variables. Let $\check{\mathcal F}_m(0)$ be the space of all antisymmetric, complex valued functions $f(\check\eta_1,\dots,\check\eta_m)$ on $\check{\mathcal B}_m$.

(ii) Let $C(\xi,\xi')$ be any skew symmetric function on $\mathcal B^2$. Let $f\in\check{\mathcal F}_m(n)$ and $1\le i<j\le n$. We define "contraction", for $n>2$, by
\[ \underset{i\to j}{\mathrm{Con}_C}f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_{i-1},\xi_{i+1},\dots,\xi_{j-1},\xi_{j+1},\dots,\xi_n)=(-1)^{j-i+1}\int d\xi_i\,d\xi_j\;C(\xi_i,\xi_j)\,f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_n) \]
and, for $n=2$, by
\[ \underset{1\to2}{\mathrm{Con}_C}f(\check\eta_1,\dots,\check\eta_m)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m)=\int d\xi_1\,d\xi_2\;C(\xi_1,\xi_2)\,f(\check\eta_1,\dots,\check\eta_m;\xi_1,\xi_2). \]
$\check{\mathcal F}_m(n)$ consists of the partial Fourier transforms $\varphi^\sim$ [as in Definition IX.1(ii)] of translation invariant functions $\varphi\in\mathcal F_m(n)$ as in Definition II.9. Also, $\underset{i\to j}{\mathrm{Con}_C}\varphi^\sim=\big(\underset{i\to j}{\mathrm{Con}_C}\varphi\big)^\sim$, where $\underset{i\to j}{\mathrm{Con}_C}\varphi$ is defined in Definition III.1.

Corollary X.9. Let $C(\xi,\xi')\in\mathcal F_0(2)$ be an antisymmetric function. Let $m,m'\ge0$, $n,n'\ge1$ and $f\in\check{\mathcal F}_m(n)$, $f'\in\check{\mathcal F}_{m'}(n')$. Then
\[ \Big\|\underset{1\to n+1}{\mathrm{Con}_C}\,\mathrm{Ant}_{ext}(f\otimes f')\Big\|^\sim\le4\,\|C\|_{1,\infty}\,\|f\|^\sim\,\|f'\|^\sim. \]

Proof. The claim follows by iterated application of Lemma X.6 and the observation that $\|C\|^\sim=\|C\|_{1,\infty}$ by Remark X.5.

We shall prove an analog of Theorem VIII.6, for the momentum space norms of Definition X.4. By way of preparation, we first formulate the following variant of Lemma V.1.

Lemma X.10. Let $\rho_{m;n}$ be a sequence of nonnegative real numbers such that $\rho_{m;n'}\le\rho_{m;n}$ for $n'\le n$. Define (locally) for $f\in\check{\mathcal F}_m(n)$
\[ \|f\|=\rho_{m;n}\|f\|^\sim \]
where $\|f\|^\sim$ is the norm of Definition X.4.

(i) The seminorms $\|\cdot\|$ are symmetric.

(ii) For a covariance $C$, let $S(C)$ be the quantity introduced in Definition IV.1. Then $2S(C)$ is an integral bound for the covariance $C$ with respect to the family of seminorms $\|\cdot\|$.

(iii) Let $C$ be a covariance. Assume that for all $m,m'\ge0$ and $n,n'\ge1$
\[ \rho_{m+m';n+n'-2}\le\rho_{m;n}\rho_{m';n'} \]
and let $c$ obey $c\ge4\|C\|_{1,\infty}$. Then $c$ is a contraction bound for the covariance $C$ with respect to the family of seminorms $\|\cdot\|$.

Proof. Parts (i) and (ii) are trivial. To prove part (iii), let $f\in\check{\mathcal F}_m(n)$, $f'\in\check{\mathcal F}_{m'}(n')$ and $1\le i\le n$, $1\le j\le n'$. If $n\ge2$ or $n'\ge2$ define the function $g$ on $\check{\mathcal B}^{m+m'}\times\mathcal B^{n+n'-2}$ by
\begin{align*}
g(\check\eta_1,\dots,\check\eta_{m+m'};\ &\xi_1,\dots,\xi_{i-1},\xi_{i+1},\dots,\xi_n,\xi_{n+1},\dots,\xi_{n+j-1},\xi_{n+j+1},\dots,\xi_{n+n'})\\
&=\int d\zeta\,d\zeta'\;f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_{i-1},\zeta,\xi_{i+1},\dots,\xi_n)\;C(\zeta,\zeta')\;f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\xi_{n+1},\dots,\xi_{n+j-1},\zeta',\xi_{n+j+1},\dots,\xi_{n+n'}).
\end{align*}
If $n=n'=1$, define the function $g$ on $\check{\mathcal B}_{m+m'}$ by
\[ g(\check\eta_1,\dots,\check\eta_{m+m'})\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+m'})=\int d\zeta\,d\zeta'\;f(\check\eta_1,\dots,\check\eta_m;\zeta)\,C(\zeta,\zeta')\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\zeta') \]
or equivalently, by
\[ g(\check\eta_1,\dots,\check\eta_{m+m'})=\sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}\int d\zeta'\;f(\check\eta_1,\dots,\check\eta_m;(0,\sigma,b))\,C((0,\sigma,b),\zeta')\,f'(\check\eta_{m+1},\dots,\check\eta_{m+m'};\zeta'). \]
Then
\[ \underset{i\to n+j}{\mathrm{Con}_C}\,\mathrm{Ant}_{ext}(f\otimes f')=\mathrm{Ant}_{ext}\,g \]
and therefore
\[ \Big\|\underset{i\to n+j}{\mathrm{Con}_C}\,\mathrm{Ant}_{ext}(f\otimes f')\Big\|\le\|g\|. \]
Also,
\[ \|g\|^\sim\le4\,\|f\|^\sim\,\|C\|_{1,\infty}\,\|f'\|^\sim \]
and consequently
\begin{align*}
\Big\|\underset{i\to n+j}{\mathrm{Con}_C}\,\mathrm{Ant}_{ext}(f\otimes f')\Big\|&\le4\,\rho_{m+m';n+n'-2}\,\|C\|_{1,\infty}\,\|f\|^\sim\,\|f'\|^\sim\\
&\le c\,\rho_{m;n}\|f\|^\sim\,\rho_{m';n'}\|f'\|^\sim=c\,\|f\|\,\|f'\|.
\end{align*}
The analog of “external improving” for the current setting is the following lemma. We shall later, in [4, Lemma XVII.5], prove a general scale version. Let (ρm;n )m,n∈N0 be a system of positive real numbers and X ∈ Nd+1 with X0 < 1.
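For an even Grassmann function
\[ \mathcal W(\phi,\psi)=\sum_{\substack{m,n\ge0\\ m+n\ \mathrm{even}}}\int_{\mathcal B^{m+n}}d\eta_1\cdots d\eta_m\,d\xi_1\cdots d\xi_n\;W_{m,n}(\eta_1,\dots,\eta_m,\xi_1,\dots,\xi_n)\,\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1)\cdots\psi(\xi_n) \]
with kernels $W_{m,n}$ that are separately antisymmetric under permutations of their $\eta$ and $\xi$ arguments, define
\[ N_0^\sim(\mathcal W;\beta;X,\rho)=e_0(X)\sum_{\substack{m+n\ge2\\ m+n\ \mathrm{even}}}\beta^{m+n}\rho_{m;n}\|W_{m,n}^\sim\|^\sim \]
where $\|\cdot\|^\sim$ is the norm of Definition X.4.

Lemma X.11. Let $(\rho_{m;n})_{m,n\in\mathbb N_0}$ be a system of positive real numbers and $X\in\mathfrak N_{d+1}$ with $X_0<1$. There are constants const and $\Gamma_0$, independent of $M$, $X$ and $\rho$ such that the following holds for all $\Gamma\le\Gamma_0$. Let $C(k)$ be a function that obeys $\|C(k)\|^\sim\le\Gamma\,\frac{\rho_{m;n}}{\rho_{m+1;n-1}}\,e_0(X)$ for all $m\ge0$, $n\ge1$ and let $C(\xi,\xi')$ be the covariance associated to it by Definition IX.3.

(i) Let $\mathcal W(\phi,\psi)$ be a Grassmann function and set
\[ \mathcal W'(\phi,\psi)=\mathcal W(\phi,\psi+CJ\phi). \]
Then
\[ N_0^\sim(\mathcal W'-\mathcal W;\beta;X,\rho)\le\mathrm{const}\;\Gamma\,N_0^\sim(\mathcal W;2\beta;X,\rho). \]

(ii) Assume that there is a $Y\in\mathfrak N_{d+1}$ such that $\|C'(k)\|^\sim\le\frac{\rho_{m;n}}{\rho_{m+1;n-1}}\,Y\,e_0(X)$ for all $m\ge0$, $n\ge1$. Set
\[ \mathcal W_s'(\phi,\psi)=\mathcal W(\phi,\psi+CJ\phi+sC'J\phi). \]
Then
\[ N_0^\sim\Big(\frac{d}{ds}\mathcal W_s'\Big|_{s=0};\beta;X,\rho\Big)\le\mathrm{const}\;Y\,N_0^\sim(\mathcal W;2\beta;X,\rho). \]

Proof. Let
\[ J^\sim((k_0,\mathbf k,\sigma,a),(x_0,\mathbf x,\sigma',a'))=\delta_{\sigma,\sigma'}\,e^{\imath(-1)^a\langle k,x\rangle_-}\begin{cases}1 & \text{if } a=1,\ a'=0\\ -1 & \text{if } a=0,\ a'=1\\ 0 & \text{otherwise}\end{cases} \]
be the partial Fourier transform of $J(\eta,\xi)$ with respect to its first argument. Observe that
\[ \int d\zeta\,J^\sim(\check\eta,\zeta)\,C(\zeta,\zeta')=C(k)\,E_+(\check\eta,\zeta')\qquad\text{where }\check\eta=(k_0,\mathbf k,\sigma,a). \]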
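Let $f\in\check{\mathcal F}_m(n)$, $1\le i\le n$ and set, for $\check\eta_{m+1}=(k_{m+1},\sigma_{m+1},a_{m+1})$,
\begin{align*}
g(\check\eta_1,\dots,\check\eta_{m+1};\xi_1,\dots,\xi_{n-1})
&=\mathrm{Ant}_{ext}\int d\zeta\,d\zeta'\;J^\sim(\check\eta_{m+1},\zeta)\,C(\zeta,\zeta')\,f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_{i-1},\zeta',\xi_i,\dots,\xi_{n-1})\\
&=\mathrm{Ant}_{ext}\int d\zeta\;C(k_{m+1})\,E_+(\check\eta_{m+1},\zeta)\,f(\check\eta_1,\dots,\check\eta_m;\xi_1,\dots,\xi_{i-1},\zeta,\xi_i,\dots,\xi_{n-1})
\end{align*}
for $n\ge2$ and
\[ g(\check\eta_1,\dots,\check\eta_{m+1})\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+1})=\mathrm{Ant}_{ext}\int d\zeta\,d\zeta'\;J^\sim(\check\eta_{m+1},\zeta)\,C(\zeta,\zeta')\,f(\check\eta_1,\dots,\check\eta_m;\zeta') \]
or equivalently
\[ g(\check\eta_1,\dots,\check\eta_{m+1})=\mathrm{Ant}_{ext}\,C(k_{m+1})\,f(\check\eta_1,\dots,\check\eta_m;(0,0,\sigma_{m+1},a_{m+1})) \]
for $n=1$. In both cases, since $D^\delta_{m+1}E_+(\check\eta_{m+1},\zeta)=\zeta^\delta E_+(\check\eta_{m+1},\zeta)$,
\[ \|g\|^\sim\le\|f\|^\sim\,\|C(k)\|^\sim \]
so that
\[ \rho_{m+1;n-1}\|g\|^\sim\le\Gamma\,\rho_{m;n}\|f\|^\sim\,e_0(X).\qquad\text{(X.1)} \]

(i) Write $\mathcal W(\phi,\psi)=\sum_{m,n}\mathcal W_{m,n}(\phi,\psi)$, with $\mathcal W_{m,n}$ of degree $m$ in $\phi$ and degree $n$ in $\psi$, and
\[ \mathcal W(\phi,\psi+\zeta)=\sum_{m,n}\mathcal W_{m,n}(\phi,\psi+\zeta)=\sum_{m,n}\sum_{\ell=0}^n\mathcal W_{m,n-\ell,\ell}(\phi,\psi,\zeta) \]
with $\mathcal W_{m,n-\ell,\ell}$ of degrees $m$ in $\phi$, $n-\ell$ in $\psi$ and $\ell$ in $\zeta$. Let $w_{m,n}$ and $w_{m,n-\ell,\ell}$ be the kernels of $\mathcal W_{m,n}(\phi,\psi)$ and $\mathcal W_{m,n-\ell,\ell}(\phi,\psi,CJ\phi)$ respectively. By the binomial theorem and repeated application of (X.1),
\[ e_0(X)\,\rho_{m+\ell;n-\ell}\|w^\sim_{m,n-\ell,\ell}\|^\sim\le(\mathrm{const}\;\Gamma)^\ell\binom{n}{\ell}\,e_0(X)\,\rho_{m;n}\|w^\sim_{m,n}\|^\sim \]
if $\ell\ge1$. Then,
\[ \mathcal W'(\phi,\psi)-\mathcal W(\phi,\psi)=\mathcal W(\phi,\psi+CJ\phi)-\mathcal W(\phi,\psi)=\sum_{m,n\ge0}\sum_{\ell=1}^n\mathcal W_{m,n-\ell,\ell}(\phi,\psi,CJ\phi) \]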
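and
\begin{align*}
N_0^\sim(\mathcal W'-\mathcal W;\beta;X,\rho)
&\le e_0(X)\sum_{m,n\ge0}\sum_{\ell=1}^n\beta^{m+n}\rho_{m+\ell;n-\ell}\|w^\sim_{m,n-\ell,\ell}\|^\sim\\
&\le e_0(X)\sum_{m,n\ge0}\sum_{\ell=1}^n\binom{n}{\ell}(\mathrm{const}\;\Gamma)^\ell\beta^{m+n}\rho_{m;n}\|w^\sim_{m,n}\|^\sim\\
&=e_0(X)\sum_{m,n\ge0}\big[(1+\mathrm{const}\;\Gamma)^n-1\big]\beta^{m+n}\rho_{m;n}\|w^\sim_{m,n}\|^\sim.
\end{align*}
If $\mathrm{const}\;\Gamma\le\frac13$,
\[ (1+\mathrm{const}\;\Gamma)^n-1\le\mathrm{const}\;\Gamma\,n\,(1+\mathrm{const}\;\Gamma)^{n-1}\le\mathrm{const}\;\Gamma\,\Big(\frac32\Big)^n(1+\mathrm{const}\;\Gamma)^{n-1}\le\mathrm{const}\;\Gamma\,2^n \]
and
\[ N_0^\sim(\mathcal W'-\mathcal W;\beta;X,\rho)\le\mathrm{const}\;\Gamma\,N_0^\sim(\mathcal W;2\beta;X,\rho). \]

(ii) Write
\[ \mathcal W(\phi,\psi+\zeta+\eta)=\sum_{m,n}\sum_{\substack{n_1,n_2,n_3\ge0\\ n_1+n_2+n_3=n}}\mathcal W_{m,n_1,n_2,n_3}(\phi,\psi,\zeta,\eta) \]
with $\mathcal W_{m,n_1,n_2,n_3}$ of degrees $m$ in $\phi$, $n_1$ in $\psi$, $n_2$ in $\zeta$ and $n_3$ in $\eta$. Let $w_{m,n_1,n_2,n_3}$ be the kernel of $\mathcal W_{m,n_1,n_2,n_3}(\phi,\psi,CJ\phi,C'J\phi)$. By the binomial theorem, repeated application of (X.1) and the obvious analog of (X.1) with $C$ replaced by $C'$,
\[ e_0(X)\,\rho_{m+\ell+1;n-\ell-1}\|w^\sim_{m,n-\ell-1,\ell,1}\|^\sim\le(\mathrm{const}\;\Gamma)^\ell(\mathrm{const}\;Y)\binom{n}{n-\ell-1,\ \ell,\ 1}\,e_0(X)\,\rho_{m;n}\|w^\sim_{m,n}\|^\sim. \]
Then,
\[ \frac{d}{ds}\mathcal W_s'(\phi,\psi)\Big|_{s=0}=\frac{d}{ds}\mathcal W(\phi,\psi+CJ\phi+sC'J\phi)\Big|_{s=0}=\sum_{m,n\ge0}\sum_{\ell=0}^{n-1}\mathcal W_{m,n-\ell-1,\ell,1}(\phi,\psi,CJ\phi,C'J\phi) \]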
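and
\begin{align*}
N_0^\sim\Big(\frac{d}{ds}\mathcal W_s'\Big|_{s=0};\beta;X,\rho\Big)
&\le e_0(X)\sum_{m,n\ge0}\sum_{\ell=0}^{n-1}\beta^{m+n}\rho_{m+\ell+1;n-\ell-1}\|w^\sim_{m,n-\ell-1,\ell,1}\|^\sim\\
&\le e_0(X)\sum_{m,n\ge0}\sum_{\ell=0}^{n-1}\binom{n}{n-\ell-1,\ \ell,\ 1}(\mathrm{const}\;\Gamma)^\ell(\mathrm{const}\;Y)\,\beta^{m+n}\rho_{m;n}\|w^\sim_{m,n}\|^\sim\\
&=\mathrm{const}\;Y\,e_0(X)\sum_{m,n\ge0}\sum_{\ell=0}^{n-1}n\binom{n-1}{\ell}(\mathrm{const}\;\Gamma)^\ell\,\beta^{m+n}\rho_{m;n}\|w^\sim_{m,n}\|^\sim\\
&=\mathrm{const}\;Y\,e_0(X)\sum_{m,n\ge0}n\,(1+\mathrm{const}\;\Gamma)^{n-1}\beta^{m+n}\rho_{m;n}\|w^\sim_{m,n}\|^\sim.
\end{align*}
If $\mathrm{const}\;\Gamma\le\frac13$,
\[ n\,(1+\mathrm{const}\;\Gamma)^{n-1}\le\Big(\frac32\Big)^n(1+\mathrm{const}\;\Gamma)^{n-1}\le2^n \]
and
\[ N_0^\sim\Big(\frac{d}{ds}\mathcal W_s'\Big|_{s=0};\beta;X,\rho\Big)\le\mathrm{const}\;Y\,N_0^\sim(\mathcal W;2\beta;X,\rho). \]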
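The elementary inequalities used in both parts of the proof of Lemma X.11 can be verified directly. A sketch, writing $x$ for $\mathrm{const}\;\Gamma$ (a shorthand introduced here) and assuming $0\le x\le\frac13$ and integer $n\ge0$:

```latex
\begin{align*}
(1+x)^n-1
 &=\sum_{\ell=1}^{n}\binom{n}{\ell}x^\ell
  =x\sum_{\ell=1}^{n}\frac{n}{\ell}\binom{n-1}{\ell-1}x^{\ell-1}
  \le n\,x\,(1+x)^{n-1},\\
n&\le\Big(\tfrac32\Big)^n
  \qquad\text{(check } n=0,1,2 \text{ directly; for } n\ge2:\
  (\tfrac32)^{n+1}=(\tfrac32)^n+\tfrac12(\tfrac32)^n\ge n+1\text{)},\\
\Big(\tfrac32\Big)^n(1+x)^{n-1}
 &\le\Big(\tfrac32\Big)^n\Big(\tfrac43\Big)^{n}=2^n .
\end{align*}
```

The factor $2^n$ is what converts $\beta^{m+n}$ into (at most) $(2\beta)^{m+n}$ on the right hand side, which is why the bounds are stated with $N_0^\sim(\mathcal W;2\beta;X,\rho)$.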
In [7, Definition XIII.9], we shall amputate a Grassmann function by applying the Fourier transform $\hat A$, in the sense of Definition IX.4, of $A(k)=\imath k_0-e(\mathbf k)$ to its external arguments. Precisely, if $\mathcal W(\phi,\psi)$ is a Grassmann function, then
\[ \mathcal W^a(\phi,\psi)=\mathcal W(\hat A\phi,\psi) \]
where
\[ (\hat A\phi)(\xi)=\int d\xi'\,\hat A(\xi,\xi')\,\phi(\xi'). \]
If $C(\xi,\xi')$ is the covariance associated to $C(k)$ in the sense of Definition IX.3 and $J$ is the particle hole swap operator of (VI.1), then, by parts (i) and (ii) of Lemma IX.5,
\[ \int d\eta\,d\eta'\;C(\xi,\eta)\,J(\eta,\eta')\,\hat A(\eta',\xi')=\hat E(\xi,\xi') \]
where $\hat E$ is the Fourier transform of $(\imath k_0-e(\mathbf k))\,C(k)$ in the sense of Definition IX.4.

Theorem X.12. Fix $j_0\ge1$ and set, for $\delta e\in\mathcal E_\mu$,
\[ C_0(k;\delta e)=\frac{U(\mathbf k)-\nu^{(>j_0)}(k)}{\imath k_0-e(\mathbf k)+\delta e(\mathbf k)}. \]
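Let $C_0(\delta e)$ be the Fourier transform of $C_0(k;\delta e)$ in the sense of Definition IX.3. Then there are ($M$ and $j_0$-dependent) constants $\beta_0$, $\varepsilon_0$, const and $\mu>0$ such that, for all $\beta\ge\beta_0$ and $\varepsilon,\varepsilon'\le\varepsilon_0$, the following holds:

Choose a system $(\rho_{m;n})_{m,n\in\mathbb{N}_0}$ of positive real numbers obeying $\rho_{m;n-1}\le\rho_{m;n}$, $\rho_{m+1;n-1}\le\varepsilon'\rho_{m;n}$ and $\rho_{m+m';n+n'-2}\le\rho_{m;n}\,\rho_{m';n'}$. Let $X\in\mathfrak{N}_{d+1}$ with $X_0<\frac14$. For all $\delta e\in\mathcal{E}_\mu$ with $\|\delta\hat e\|_{1,\infty}\le X$ and all even Grassmann functions $\mathcal{V}(\psi)$ with $N_0^\sim(\mathcal{V};32\beta;X,\rho)\le\varepsilon\,e_0(X)$,
\[
\tilde{\mathcal{W}}(\phi,\psi;\delta e)=\tilde\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi)-\mathcal{V}(\psi)-\tfrac12\,\phi JC_0(\delta e)J\phi
\]
obeys
\[
N_0^\sim\big(\tilde{\mathcal{W}}^a(\phi,\psi;\delta e);\beta;X,\rho\big)\le\varepsilon\Big(\frac1\beta+\sqrt{\varepsilon'}\Big)e_0(X)
\]
and
\[
N_0^\sim\Big(\frac{d}{ds}\tilde{\mathcal{W}}^a(\phi,\psi;\delta e+s\delta e')\Big|_{s=0};\beta;X,\rho\Big)\le\varepsilon\Big(\frac1\beta+\sqrt{\varepsilon'}\Big)e_0(X)\,\|\delta\hat e'\|_{1,\infty} .
\]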
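Proof. The proof of this theorem is similar to that of Theorem VIII.6. Let $V_{\rm ext}$ be the vector space generated by $\phi(\eta)$, $\eta\in\mathcal{B}$, and recall from Sec. II that $V$ is the vector space generated by $\psi(\xi)$, $\xi\in\mathcal{B}$. Set $\tilde V=V_{\rm ext}\oplus V$ and $\tilde A=\mathbb{C}$. Then
\[
\tilde A\otimes\tilde V^{\otimes n}=\tilde V^{\otimes n}
=\bigoplus_{\substack{m_1+n_1+\cdots+m_r+n_r=n\\ n_1,m_2,\ldots,n_{r-1},m_r\ge 1}}
V_{\rm ext}^{\otimes m_1}\otimes V^{\otimes n_1}\otimes\cdots\otimes V_{\rm ext}^{\otimes m_r}\otimes V^{\otimes n_r} .
\]
Every element $F_{m_1,n_1,\ldots,m_r,n_r}$ of $V_{\rm ext}^{\otimes m_1}\otimes V^{\otimes n_1}\otimes\cdots\otimes V_{\rm ext}^{\otimes m_r}\otimes V^{\otimes n_r}$ can be uniquely written in the form
\[
\begin{aligned}
F_{m_1,n_1,\ldots,m_r,n_r}=\int & d\eta_1\cdots d\eta_{m_1+\cdots+m_r}\,d\xi_1\cdots d\xi_{n_1+\cdots+n_r}\\
&\times f_{m_1,n_1,\ldots,m_r,n_r}(\eta_1,\ldots,\eta_{m_1+\cdots+m_r};\xi_1,\ldots,\xi_{n_1+\cdots+n_r})\\
&\times\phi(\eta_1)\otimes\cdots\otimes\phi(\eta_{m_1})\otimes\psi(\xi_1)\otimes\cdots\otimes\psi(\xi_{n_1})\otimes\cdots\\
&\qquad\otimes\psi(\xi_{n_1+\cdots+n_{r-1}+1})\otimes\cdots\otimes\psi(\xi_{n_1+\cdots+n_r}) .
\end{aligned}
\]
We define
\[
|F_{m_1,n_1,\ldots,m_r,n_r}|^\sim=\rho_{m_1+\cdots+m_r;\,n_1+\cdots+n_r}\,\|f^\sim_{m_1,n_1,\ldots,m_r,n_r}\|^\sim
\]
and for
\[
F=\sum_{\substack{m_1+n_1+\cdots+m_r+n_r=n\\ n_1,m_2,\ldots,n_{r-1},m_r\ge 1}}F_{m_1,n_1,\ldots,m_r,n_r}\in\tilde V^{\otimes n}
\]
with $F_{m_1,n_1,\ldots,m_r,n_r}\in V_{\rm ext}^{\otimes m_1}\otimes V^{\otimes n_1}\otimes\cdots\otimes V_{\rm ext}^{\otimes m_r}\otimes V^{\otimes n_r}$,
\[
|F|^\sim=\sum_{m_1,n_1,\ldots,m_r,n_r}|F_{m_1,n_1,\ldots,m_r,n_r}|^\sim .
\]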
Define the covariance $\tilde C$ on $\tilde V$ by
\[
\tilde C(\phi(\eta),\phi(\eta'))=0,\qquad
\tilde C(\phi(\eta),\psi(\xi))=0,\qquad
\tilde C(\psi(\xi),\psi(\xi'))=C_0(\xi,\xi';\delta e) .
\]
The restriction of $\tilde C$ to $V$ coincides with the covariance on $V$ determined by $C_0(\delta e)$ as at the beginning of Sec. II, while $V_{\rm ext}$ is isotropic and perpendicular to $V$ with respect to $\tilde C$.

We have already observed, in the proof of Theorem VIII.6, that there is a constant $\mathrm{const}_1$ such that $S(C_0(\delta e))\le\mathrm{const}_1$ and $\|C_0(\delta e)\|_{1,\infty}\le\frac14\,\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty})$.
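Let $C_0^a(\delta e)$ be the covariance associated with $\frac{U(k)-\nu^{(>j_0)}(k)}{\imath k_0-e(k)+\delta e(k)}\,(\imath k_0-e(k))$. Then
\[
\|C_0^a(k;\delta e)\|^\sim\le\mathrm{const}\,\|C_0(\delta e)\|_{1,\infty}\le\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty}) .
\]
By Lemma X.10, $b=4\,\mathrm{const}_1$ is an integral bound for $C_0(\delta e)$, and hence for $\tilde C$, and $c=\mathrm{const}_1\,e_0(X)$ is a contraction bound for $C_0(\delta e)$, and hence for $\tilde C$. Furthermore, Lemma X.11(i), with $C=C_0^a(\delta e)$, $X=\|\delta\hat e\|_{1,\infty}$ and $\Gamma=\varepsilon' b$, is applicable.

Set $\alpha=\frac{\beta}{b}$. For any Grassmann function $\mathcal{W}(\phi,\psi)=\sum_{m,n}\mathcal{W}_{m,n}(\phi,\psi)$, with $\mathcal{W}_{m,n}(\phi,\psi)$ of degree $m$ and $n$ in $\phi$ and $\psi$, respectively, let
\[
N^\sim(\mathcal{W};c,b,\alpha)=\frac{1}{b^2}\,c\sum_{m,n}\alpha^{m+n}b^{m+n}\,|\mathcal{W}_{m,n}|^\sim
\]
be the norm of Definition III.9 and of [1, Definition II.23], but with $V$ replaced by $\tilde V$ and $A$ replaced by $\tilde A=\mathbb{C}$. Then $N^\sim(\mathcal{W};c,b,\alpha)=\frac{1}{4b}N_0^\sim(\mathcal{W};\beta;X,\rho)$ and, if $\mathcal{V}=\,:\!\mathcal{V}'\!:_{C_0(\delta e)}$,
\[
\begin{aligned}
N^\sim(\mathcal{V}';c,b,\alpha)&\le\frac{1}{4b}\,N_0^\sim(\mathcal{V};2\beta;X,\rho)\\
N^\sim(\mathcal{V}'-\mathcal{V};c,b,\alpha)&\le\frac{b}{2\beta^2}\,N_0^\sim(\mathcal{V};2\beta;X,\rho)
\end{aligned}
\]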
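by [1, Corollary II.32]. To prove the first part of the theorem, set $\mathcal{V}=\,:\!\mathcal{V}'\!:_{C_0(\delta e)}$. Then the hypotheses of Theorem III.10 with $C=C_0(\delta e)$ and $\mathcal{W}=\mathcal{V}'$ are fulfilled. Therefore
\[
\begin{aligned}
\frac{1}{4b}\,N_0^\sim(\Omega_{C_0(\delta e)}(\mathcal{V})-\mathcal{V};\beta;X,\rho)
&\le N^\sim\big(\Omega_{C_0(\delta e)}(:\!\mathcal{V}'\!:_{C_0(\delta e)})-\mathcal{V}';c,b,\alpha\big)+N^\sim(\mathcal{V}'-\mathcal{V};c,b,\alpha)\\
&\le\frac{2}{\alpha^2}\,\frac{N^\sim(\mathcal{V}';c,b,8\alpha)^2}{1-\frac{4}{\alpha^2}N^\sim(\mathcal{V}';c,b,8\alpha)}+N^\sim(\mathcal{V}'-\mathcal{V};c,b,\alpha)\\
&\le\frac{2}{\alpha^2}\,\frac{\frac{1}{16b^2}N_0^\sim(\mathcal{V};16\beta;X,\rho)^2}{1-\frac{b}{\beta^2}N_0^\sim(\mathcal{V};16\beta;X,\rho)}+\frac{b}{2\beta^2}\,N_0^\sim(\mathcal{V};2\beta;X,\rho)\\
&\le\frac{\varepsilon^2}{8\beta^2}\,\frac{e_0(X)^2}{1-\frac{\varepsilon b}{\beta^2}e_0(X)}+\frac{\varepsilon b}{2\beta^2}\,e_0(X)\\
&\le\frac{\varepsilon b}{\beta^2}\,e_0(X)
\end{aligned}
\]
since $\varepsilon_0$ is chosen so that $\frac{\varepsilon b}{\beta^2}<\frac14$ and, by Corollary A.5(ii), $\frac{e_0(X)^2}{1-\frac14 e_0(X)}\le\mathrm{const}\,e_0(X)$.

Observe that $\mathcal{V}$ and hence $\Omega_{C_0(\delta e)}(\mathcal{V})$ are independent of $\phi$ and consequently are not affected by amputation. So, by Lemmas VII.3 and X.11,
\[
\begin{aligned}
N_0^\sim\big(\tilde{\mathcal{W}}^a(\phi,\psi;\delta e);\beta;X,\rho\big)
&=N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi+C_0^a(\delta e)J\phi)-\mathcal{V}(\psi);\beta;X,\rho\big)\\
&\le N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi+C_0^a(\delta e)J\phi)-\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi);\beta;X,\rho\big)\\
&\qquad+N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi)-\mathcal{V}(\psi);\beta;X,\rho\big)\\
&\le\mathrm{const}\,\varepsilon' b\,N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V});2\beta;X,\rho\big)+\frac{4\varepsilon b^2}{\beta^2}\,e_0(X)\\
&\le\mathrm{const}\,\varepsilon' b\,\big[N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V})-\mathcal{V};2\beta;X,\rho\big)+N_0^\sim(\mathcal{V};2\beta;X,\rho)\big]+\frac{4\varepsilon b^2}{\beta^2}\,e_0(X)\\
&\le\mathrm{const}\,\varepsilon' b\,\Big[\frac{\varepsilon b^2}{\beta^2}\,e_0(X)+\varepsilon\,e_0(X)\Big]+\frac{4\varepsilon b^2}{\beta^2}\,e_0(X)\\
&\le\mathrm{const}\,(1+b)^3\,\varepsilon\Big(\frac{1}{\beta^2}+\varepsilon'\Big)e_0(X)
\le\varepsilon\Big(\frac1\beta+\sqrt{\varepsilon'}\Big)e_0(X) .
\end{aligned}
\]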
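Finally, we prove the bound on $\frac{d}{ds}\tilde{\mathcal{W}}^a(\phi,\psi;\delta e+s\delta e')\big|_{s=0}$. As
\[
\begin{aligned}
\frac{d}{ds}C_0(k;\delta e+s\delta e')\Big|_{s=0}&=-\frac{U(k)-\nu^{(>j_0)}(k)}{[\imath k_0-e(k)+\delta e(k)]^2}\,\delta e'(k)\\
\frac{d}{ds}C_0^a(k;\delta e+s\delta e')\Big|_{s=0}&=-\frac{U(k)-\nu^{(>j_0)}(k)}{[\imath k_0-e(k)+\delta e(k)]^2}\,\delta e'(k)\,(\imath k_0-e(k)) ,
\end{aligned}
\]
Propositions IV.3(i) and IV.11 give that
\[
\begin{aligned}
S\Big(\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big)&\le\sqrt{\mathrm{const}_1\,|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}}\\
\Big|\!\Big|\!\Big|\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big|\!\Big|\!\Big|_{\infty}&\le\mathrm{const}_1\,|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}\\
\Big\|\frac{d}{ds}C_0(\delta e+s\delta e')\Big|_{s=0}\Big\|_{1,\infty}&\le\frac14\,\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e'\|_{1,\infty}\\
\Big\|\frac{d}{ds}C_0^a(k;\delta e+s\delta e')\Big|_{s=0}\Big\|^\sim&\le\mathrm{const}_1\,e_0(\|\delta\hat e\|_{1,\infty})\,\|\delta\hat e'\|_{1,\infty} .
\end{aligned}
\]
Set $b'=4\sqrt{\mathrm{const}_1\,|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}}$ and $c'=\mathrm{const}_1\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}$. By Lemma X.10, $\frac12 b'$ is an integral bound for $\frac{d}{ds}C_0(\delta e+s\delta e')\big|_{s=0}$ and $c'$ is a contraction bound for $\frac{d}{ds}C_0(\delta e+s\delta e')\big|_{s=0}$. Furthermore, Lemma X.11(ii), with $C=C_0^a(\delta e)$, $C'=\frac{d}{ds}C_0^a(\delta e+s\delta e')\big|_{s=0}$, $X=\|\delta\hat e\|_{1,\infty}$, $\Gamma=\mathrm{const}_1\,\varepsilon'$ and $Y=\varepsilon' b\,\|\delta\hat e'\|_{1,\infty}$, is applicable.

Define $C_\kappa=C_0(\delta e+\kappa\delta e')$ and $\mathcal{W}_\kappa$ by $:\!\mathcal{W}_\kappa\!:_{C_\kappa}=\mathcal{V}$. Even when $\alpha$ is replaced by $2\alpha$, the hypotheses of [1, Lemmas IV.5(i) and IV.7(i)], with $\mu=1$, are satisfied. By these two lemmas, followed by [1, Corollary II.32(iii)],
\[
\begin{aligned}
N^\sim\Big(\frac{d}{ds}\Omega_{C_s}(\mathcal{V})\Big|_{s=0};c,b,\alpha\Big)
&=N^\sim\Big(\frac{d}{ds}\Omega_{C_s}(:\!\mathcal{W}_s\!:_{C_s})\Big|_{s=0};c,b,\alpha\Big)\\
&\le N^\sim\Big(\frac{d}{ds}\Omega_{C_s}(:\!\mathcal{V}'\!:_{C_s})\Big|_{s=0};c,b,\alpha\Big)+N^\sim\Big(\frac{d}{ds}\Omega_{C_0}(:\!\mathcal{W}_s\!:_{C_0})\Big|_{s=0};c,b,\alpha\Big)\\
&\le\frac{1}{2\alpha^2}\,\frac{N^\sim(\mathcal{V}';c,b,8\alpha)^2}{1-\frac{4}{\alpha^2}N^\sim(\mathcal{V}';c,b,8\alpha)}\,\mathrm{const}_1\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}\\
&\qquad+\Big(1+\frac{2}{\alpha^2}\,\frac{N^\sim(\mathcal{V}';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N^\sim(\mathcal{V}';c,b,8\alpha)}\Big)N^\sim\Big(\frac{d}{ds}\mathcal{W}_s\Big|_{s=0};c,b,2\alpha\Big)\\
&\le\frac{1}{2\alpha^2}\,\frac{N^\sim(\mathcal{V}';c,b,8\alpha)^2}{1-\frac{4}{\alpha^2}N^\sim(\mathcal{V}';c,b,8\alpha)}\,\mathrm{const}_1\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}\\
&\qquad+\Big(1+\frac{2}{\alpha^2}\,\frac{N^\sim(\mathcal{V}';c,b,8\alpha)}{1-\frac{4}{\alpha^2}N^\sim(\mathcal{V}';c,b,8\alpha)}\Big)\frac{|\!|\!|\delta\hat e'|\!|\!|_{1,\infty}}{(2\alpha-1)^2}\,N^\sim(\mathcal{V};c,b,4\alpha)\\
&\le\mathrm{const}\,\frac{\varepsilon}{\alpha^2}\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}
\end{aligned}
\]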
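as above. By Lemmas VII.3 and X.11,
\[
\begin{aligned}
N_0^\sim\Big(\frac{d}{ds}\tilde{\mathcal{W}}^a(\phi,\psi;\delta e+s\delta e')\Big|_{s=0};\beta;X,\rho\Big)
&=N_0^\sim\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal{V})(\phi,\psi+C_0^a(\delta e+s\delta e')J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\le N_0^\sim\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal{V})(\phi,\psi+C_0^a(\delta e)J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\qquad+N_0^\sim\Big(\frac{d}{ds}\Omega_{C_0(\delta e)}(\mathcal{V})(\phi,\psi+C_0^a(\delta e+s\delta e')J\phi)\Big|_{s=0};\beta;X,\rho\Big)\\
&\le(1+\mathrm{const}\,\varepsilon' b)\,N_0^\sim\Big(\frac{d}{ds}\Omega_{C_0(\delta e+s\delta e')}(\mathcal{V})\Big|_{s=0};2\beta;X,\rho\Big)\\
&\qquad+\mathrm{const}\,\varepsilon' b\,N_0^\sim\big(\Omega_{C_0(\delta e)}(\mathcal{V});2\beta;X,\rho\big)\,\|\delta\hat e'\|_{1,\infty}\\
&\le\mathrm{const}\,\frac{\varepsilon}{\beta^2}\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}+\mathrm{const}\,\varepsilon\varepsilon'\,e_0(X)\,\|\delta\hat e'\|_{1,\infty}\\
&\le\varepsilon\Big(\frac1\beta+\sqrt{\varepsilon'}\Big)e_0(X)\,\|\delta\hat e'\|_{1,\infty}
\end{aligned}
\]
as above.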
Appendices

B. Symmetries

Definition B.1 (Symmetries). Let $x=(x_0,\mathbf{x},\sigma)\in\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}$, $\xi=(x,a)\in\mathcal{B}=(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})\times\{0,1\}$ and $t=(t_0,\mathbf{t})\in\mathbb{R}\times\mathbb{R}^d$. We set
\[
\begin{aligned}
x+t&=(x_0+t_0,\mathbf{x}+\mathbf{t},\sigma) &\qquad \xi+t&=(x+t,a)\\
R_0x&=(-x_0,\mathbf{x},\sigma) & R_0\xi&=(R_0x,a)\\
-x&=(-x_0,-\mathbf{x},\sigma) & -\xi&=(-x,a) .
\end{aligned}
\]
(T) A function $f(\xi_1,\ldots,\xi_n)$ on $\mathcal{B}^n$ is called translation invariant if, for all $t\in\mathbb{R}\times\mathbb{R}^d$,
\[
f(\xi_1+t,\ldots,\xi_n+t)=f(\xi_1,\ldots,\xi_n) .
\]
In the same way, one defines translation invariance for functions on $(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^n$.

(N) A function $f$ on $\mathcal{B}^n$ conserves particle number if $f((x_1,a_1),\ldots,(x_n,a_n))=0$ unless
\[
\#\{j\mid a_j=0\}=\#\{j\mid a_j=1\}=\frac n2 .
\]
(S) Let $f$ be a function on $\mathcal{B}^n$. Set, for each $A\in SU(2)$,
\[
f^A((\cdot,\sigma_1,b_1),\ldots,(\cdot,\sigma_n,b_n))=\sum_{\tau_1,\ldots,\tau_n}f((\cdot,\tau_1,b_1),\ldots,(\cdot,\tau_n,b_n))\prod_{j=1}^{n}A^{(b_j)}_{\tau_j,\sigma_j}
\]
where $A^{(0)}=A$ and $A^{(1)}=\bar A$. $f$ is called spin independent if $f=f^A$ for all $A\in SU(2)$.

(R) A function $f(\xi_1,\ldots,\xi_n)$ on $\mathcal{B}^n$ is called $k_0$-reversal real if
\[
f(R_0\xi_1,\ldots,R_0\xi_n)=\overline{f(-\xi_1,\ldots,-\xi_n)}
\]
or, equivalently, if its Fourier transform obeys
\[
\check f(R_0\check\xi_1,\ldots,R_0\check\xi_n)=\overline{\check f(\check\xi_1,\ldots,\check\xi_n)}
\]
where $R_0(k_0,\mathbf{k},\sigma,a)=(-k_0,\mathbf{k},\sigma,a)$.

(B) A function $f(\xi_1,\ldots,\xi_n)$ on $\mathcal{B}^n$ is called bar/unbar exchange invariant if
\[
f((x_1,1-b_1),\ldots,(x_n,1-b_n))=i^n f((-x_1,b_1),\ldots,(-x_n,b_n))
\]
or, equivalently, if its Fourier transform obeys
\[
\check f((k_1,\sigma_1,1-b_1),\ldots,(k_n,\sigma_n,1-b_n))=i^n\check f((k_1,\sigma_1,b_1),\ldots,(k_n,\sigma_n,b_n)) .
\]
(vi) Let $m\in\mathbb{N}$ and $\Sigma_1,\ldots,\Sigma_m\in\{\mathrm{B},\mathrm{N},\mathrm{R},\mathrm{S},\mathrm{T}\}$. Then $f$ is $\Sigma_1\cdots\Sigma_m$-symmetric if $f$ satisfies part $(\Sigma_i)$ of this definition for $1\le i\le m$.

(vii) Let $\Sigma\in\{\mathrm{B},\mathrm{N},\mathrm{R},\mathrm{S},\mathrm{T}\}$. A Grassmann function
\[
\mathcal{W}(\phi,\psi)=\sum_{m,n}W_{m,n}\,\phi^m\psi^n
\]
is $\Sigma$-symmetric if all of the coefficient functions $W_{m,n}$ are.

Remark B.2. Let $\mathcal{W}(\phi,\psi)=\sum_{m,n}W_{m,n}\,\phi^m\psi^n$ be a Grassmann function.

$\mathcal{W}(\phi,\psi)$ is translation invariant if and only if it is invariant under
\[
\phi(\xi)\to\phi(\xi+t),\qquad\psi(\xi)\to\psi(\xi+t)\qquad\text{for all }t\in\mathbb{R}\times\mathbb{R}^d .
\]
$\mathcal{W}(\phi,\psi)$ conserves particle number if and only if it is invariant under
\[
\phi(x,a)\to e^{i(-1)^a\theta}\phi(x,a),\qquad\psi(x,a)\to e^{i(-1)^a\theta}\psi(x,a)\qquad\text{for all }\theta\in\mathbb{R} .
\]
$\mathcal{W}(\phi,\psi)$ is spin independent if and only if it is invariant under
\[
\phi(\cdot,\sigma,a)\to\sum_{\tau\in\{\uparrow,\downarrow\}}A^{(a)}_{\sigma,\tau}\,\phi(\cdot,\tau,a),\qquad
\psi(\cdot,\sigma,a)\to\sum_{\tau\in\{\uparrow,\downarrow\}}A^{(a)}_{\sigma,\tau}\,\psi(\cdot,\tau,a)\qquad\text{for all }A\in SU(2) .
\]
$\mathcal{W}(\phi,\psi)$ is bar/unbar exchange invariant if and only if it is invariant under
\[
\phi(x,a)\to i\phi(-x,1-a),\qquad\psi(x,a)\to i\psi(-x,1-a)
\]
or, equivalently, under
\[
\check\phi(k,\sigma,a)\to i\check\phi(k,\sigma,1-a),\qquad\check\psi(k,\sigma,a)\to i\check\psi(k,\sigma,1-a) .
\]
Define $\bar{\mathcal{W}}(\phi,\psi)=\sum_{m,n}\overline{W_{m,n}}\,\phi^m\psi^n$ and
\[
(R_0\psi)(\xi)=\psi(R_0\xi),\qquad(R\psi)(\xi)=\psi(-\xi) .
\]
$\mathcal{W}(\phi,\psi)$ is $k_0$-reversal real if and only if
\[
\mathcal{W}(R_0\phi,R_0\psi)=\bar{\mathcal{W}}(R\phi,R\psi) .
\]
Remark B.3. (i) If the function $f(\xi_1,\xi_2)$ on $\mathcal{B}^2$ is NS-symmetric, then it is of the form
\[
f((x,\sigma,a),(y,\tau,b))=\delta_{\sigma,\tau}\,\tilde f((x,a),(y,b))
\]
for some $\tilde f$.

(ii) If the function $f(\xi_1,\xi_2)$ on $\mathcal{B}^2$ is antisymmetric and NST-symmetric, then, translating by $-x-y$,
\[
f((x,\sigma,0),(y,\sigma,1))=f((-y,\sigma,0),(-x,\sigma,1))=-f((-x,\sigma,1),(-y,\sigma,0))
\]
so that $f$ is also B-symmetric.

(iii) If the function $f(\xi_1,\xi_2)$ on $\mathcal{B}^2$ is antisymmetric and NST-symmetric and if $\check f(k)$ is its Fourier transform specified at the end of Definition IX.1(i), then
\[
f=J\,\widehat{\check f} .
\]
The operator $J$ was defined in (VI.1) and, for a function $\chi(k)$, the Fourier transform $\hat\chi(\xi,\xi')$ was defined in Definition IX.4. If, in addition, $f$ is $k_0$-reversal real, then $\check f(-k_0,\mathbf{k})=\overline{\check f(k_0,\mathbf{k})}$.

(iv) If
\[
\mathcal{V}(\psi,\bar\psi)=\int_{(\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\})^4}V_0(x_1,x_2,x_3,x_4)\,\bar\psi(x_1)\psi(x_2)\bar\psi(x_3)\psi(x_4)\,dx_1\,dx_2\,dx_3\,dx_4
\]
with
\[
V_0((x_{1,0},\mathbf{x}_1,\sigma_1),\ldots,(x_{4,0},\mathbf{x}_4,\sigma_4))=-\tfrac12\,\delta(x_1,x_2)\,\delta(x_3,x_4)\,v(x_{1,0}-x_{3,0},\mathbf{x}_1-\mathbf{x}_3)
\]
where $\delta((x_0,\mathbf{x},\sigma),(x_0',\mathbf{x}',\sigma'))=\delta(x_0-x_0')\,\delta(\mathbf{x}-\mathbf{x}')\,\delta_{\sigma,\sigma'}$, then $\mathcal{V}$ is BNST-symmetric. If in addition $v(x_0,-\mathbf{x})=\overline{v(x_0,\mathbf{x})}$, then $\mathcal{V}$ is also R-symmetric. This is the case if $v(x_1-x_3)=\delta(x_{1,0}-x_{3,0})\,v(\mathbf{x}_1-\mathbf{x}_3)$ with $v(\mathbf{x})$ having a real-valued Fourier transform.

Remark B.4. $\phi J\psi$ is BNRST-symmetric. If $C(\xi,\xi')$ is the covariance associated to a function $C(k)$ as in Definition IX.3, then $C(\xi,\xi')$ is antisymmetric and BNST-symmetric. If $C(-k_0,\mathbf{k})=\overline{C(k_0,\mathbf{k})}$ then $C(\xi,\xi')$ is R-symmetric.

Remark B.5. Assume that $C(\xi,\xi')$ is the covariance associated to the function $C(k)$ as in Definition IX.3 and that $\mathcal{W}(\phi,\psi)$ is a Grassmann function. Let $\Sigma\in\{\mathrm{B},\mathrm{N},\mathrm{S},\mathrm{T}\}$. If $\mathcal{W}$ is $\Sigma$-symmetric, then $\int\mathcal{W}(\phi,\psi)\,d\mu_C(\psi)$, $\tilde\Omega_C(\mathcal{W})$ and $\Omega_C(\mathcal{W})$ are too. If $C(-k_0,\mathbf{k})=\overline{C(k_0,\mathbf{k})}$, then $\int\mathcal{W}(\phi,\psi)\,d\mu_C(\psi)$, $\Omega_C(\mathcal{W})$ and $\tilde\Omega_C(\mathcal{W})$ are R-symmetric.

Lemma B.6. Let $W(\eta,\xi)$ be BNST-symmetric. Then
\[
\int d\xi\,d\eta\;\psi(\xi)\,\big(J(\check W)\hat{\ }\big)(\xi,\eta)\,\phi(\eta)=\int d\eta\,d\xi\;\phi(\eta)\,W(\eta,\xi)\,\psi(\xi) .
\]
Proof. The bar/unbar invariance implies
\[
\begin{aligned}
&\check W((k,\sigma,a),(k,\sigma,1-a))=-\check W((k,\sigma,1-a),(k,\sigma,a))\\
\Longrightarrow\ &\check W((k,\sigma,a),(k,\sigma,1-a))=-(-1)^a\,\check W((k,\sigma,1),(k,\sigma,0))=-(-1)^a\,\check W(k)
\end{aligned}
\]
for an arbitrary spin $\sigma$. Hence, if $x$ and $y$ both have spin component $\sigma$,
\[
\begin{aligned}
\big(J(\check W)\hat{\ }\big)((x,1-a),(y,a))
&=(-1)^a\,(\check W)\hat{\ }((x,a),(y,a))\\
&=(-1)^a\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{(-1)^a\imath\langle k,x-y\rangle_-}\,\check W(k)\\
&=-\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,e^{(-1)^a\imath\langle k,x-y\rangle_-}\,\check W((k,\sigma,a),(k,\sigma,1-a))\\
&=-\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\,\frac{d^{d+1}k'}{(2\pi)^{d+1}}\,e^{-(-1)^a\imath\langle k,y\rangle_-}\,e^{-(-1)^{1-a}\imath\langle k',x\rangle_-}\\
&\qquad\qquad\times\check W((k,\sigma,a),(k',\sigma,1-a))\,(2\pi)^{d+1}\delta(k-k')\\
&=-W((y,a),(x,1-a)) .
\end{aligned}
\]
The lemma follows.

C. Some standard Grassmann integral formulae

For a function $C(\xi,\xi')$ on $\mathcal{B}\times\mathcal{B}$ we set, as in Sec. VII,
\[
\begin{aligned}
\psi\phi&=\int\psi(\xi)\,\phi(\xi)\,d\xi\\
(C\phi)(\xi)&=\int C(\xi,\xi')\,\phi(\xi')\,d\xi'\\
\phi C\psi&=\int\phi(\xi)\,C(\xi,\xi')\,\psi(\xi')\,d\xi\,d\xi' .
\end{aligned}
\]
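Lemma C.1. Let $f(\psi)$ be a Grassmann function and let $C$ be an arbitrary antisymmetric covariance. Then
\[
\int f(\psi)\,e^{\psi\phi}\,d\mu_C(\psi)=e^{-\frac12\phi C\phi}\int f(\psi+C\phi)\,d\mu_C(\psi) .
\]
Proof. It suffices to consider $f(\psi)=e^{\psi\zeta}$ with $\zeta$ another Grassmann field. Then comparing
\[
\begin{aligned}
\int f(\psi)\,e^{\psi\phi}\,d\mu_C(\psi)&=\int e^{\psi(\zeta+\phi)}\,d\mu_C(\psi)=e^{-\frac12(\zeta+\phi)C(\zeta+\phi)}\\
\int f(\psi+C\phi)\,d\mu_C(\psi)&=\int e^{(\psi+C\phi)\zeta}\,d\mu_C(\psi)=e^{-\phi C\zeta}\,e^{-\frac12\zeta C\zeta}
\end{aligned}
\]
gives the desired result.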
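The comparison in the proof of Lemma C.1 can be spelled out as follows. This is a sketch that uses only the antisymmetry of $C$ and the fact that the Grassmann fields $\zeta$ and $\phi$ anticommute; it is an expansion of the step, not a quotation from the source.

```latex
% Cross terms: anticommute the fields, then use C(\xi,\xi') = -C(\xi',\xi).
\zeta C\phi
  = \int \zeta(\xi)\,C(\xi,\xi')\,\phi(\xi')\,d\xi\,d\xi'
  = -\int \phi(\xi')\,C(\xi,\xi')\,\zeta(\xi)\,d\xi\,d\xi'
  = \int \phi(\xi')\,C(\xi',\xi)\,\zeta(\xi)\,d\xi\,d\xi'
  = \phi C\zeta ,
\qquad
(C\phi)\zeta
  = \int C(\xi,\xi')\,\phi(\xi')\,\zeta(\xi)\,d\xi\,d\xi'
  = -\phi C\zeta .
% Hence the quadratic form splits as
(\zeta+\phi)\,C\,(\zeta+\phi) = \zeta C\zeta + 2\,\phi C\zeta + \phi C\phi ,
% and the two displayed integrals are related by
e^{-\frac12(\zeta+\phi)C(\zeta+\phi)}
  = e^{-\frac12\phi C\phi}\; e^{-\phi C\zeta}\, e^{-\frac12\zeta C\zeta} .
```

The exponentials factor because each exponent is an even element of the Grassmann algebra and therefore commutes with the others.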
Lemma C.2. Let $C(\xi,\xi')$ and $U(\xi,\xi')$ be antisymmetric functions such that the norm of the integral operator with kernel $\int d\xi''\,C(\xi,\xi'')\,U(\xi'',\xi')$ is strictly smaller than one. Let $C'(\xi,\xi')$ be the kernel of the integral operator $[\mathbb{1}-CU]^{-1}C$ and set
\[
U(\psi)=\int U(\xi,\xi')\,\psi(\xi)\,\psi(\xi')\,d\xi\,d\xi'
\]
and $Z=\int e^{\frac12U(\psi)}\,d\mu_C(\psi)$. Then,

(a) For all Grassmann functions $f(\psi)$
\[
\frac1Z\int f(\psi)\,e^{\frac12U(\psi)}\,d\mu_C(\psi)=\int f(\psi)\,d\mu_{C'}(\psi) .
\]
(b) For all Grassmann functions $\mathcal{W}(\psi)$
\[
\frac1Z\int e^{\mathcal{W}(\psi+\phi)}\,d\mu_C(\psi)=e^{\frac12\phi U[\mathbb{1}+C'U]\phi}\int e^{(\mathcal{W}-\frac12U)(\psi+[\mathbb{1}+C'U]\phi)}\,d\mu_{C'}(\psi) .
\]
Proof. We give the proof for the case that the Grassmann algebra is finite dimensional. The general case then follows by approximation.

(a) It suffices to consider the generating functional $f(\psi)=e^{\psi\phi}$. By definition
\[
\begin{aligned}
\int e^{\psi\phi}\,e^{\frac12U(\psi)}\,d\mu_C(\psi)
&=\int e^{\psi\phi}\,e^{\frac12\psi U\psi}\,d\mu_C(\psi)\\
&=\mathrm{Pf}(C)\int e^{\psi\phi}\,e^{\frac12\psi U\psi}\,e^{-\frac12\psi C^{-1}\psi}\,d\psi\\
&=\mathrm{Pf}(C)\int e^{\psi\phi}\,e^{-\frac12\psi C'^{-1}\psi}\,d\psi\\
&=\frac{\mathrm{Pf}(C)}{\mathrm{Pf}(C')}\int e^{\psi\phi}\,d\mu_{C'}(\psi)
\end{aligned}
\]
where $\mathrm{Pf}(C)$ is the Pfaffian of $C$. In particular, setting $\phi=0$, $\frac{\mathrm{Pf}(C)}{\mathrm{Pf}(C')}=\int e^{\frac12U(\psi)}\,d\mu_C(\psi)=Z$.

(b) By part (a)
\[
\begin{aligned}
\int e^{\mathcal{W}(\psi+\phi)}\,d\mu_C(\psi)
&=e^{\frac12U(\phi)}\int e^{\mathcal{W}(\psi+\phi)-\frac12U(\psi+\phi)+\psi U\phi+\frac12U(\psi)}\,d\mu_C(\psi)\\
&=\frac{\mathrm{Pf}(C)}{\mathrm{Pf}(C')}\,e^{\frac12U(\phi)}\int e^{\mathcal{W}(\psi+\phi)-\frac12U(\psi+\phi)+\psi U\phi}\,d\mu_{C'}(\psi)\\
&=\frac{\mathrm{Pf}(C)}{\mathrm{Pf}(C')}\,e^{\frac12U(\phi)}\,e^{\frac12\phi UC'U\phi}\int e^{(\mathcal{W}-\frac12U)(\psi+\phi+C'U\phi)}\,d\mu_{C'}(\psi)
\end{aligned}
\]
by Lemma C.1, with the replacements $C\to C'$ and $\phi\to U\phi=-\phi U$.

Remark C.3. Recall from Lemma IX.5 that, if $C(k)$ is a function on $\mathbb{R}\times\mathbb{R}^d$ and $C(\xi,\xi')$ the associated covariance in the sense of Definition IX.3, then $C=-\widehat{C(k)}\,J$. Also recall from Remark B.3(iii) that if $U$ is antisymmetric, particle number conserving, translation invariant and spin independent, then $U=J\,\widehat{\check U(k)}$. In this case
\[
C'=-\widehat{C'(k)}\,J\qquad\text{where}\qquad C'(k)=\frac{C(k)}{1-C(k)\,\check U(k)} .
\]
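At a fixed momentum the formula for $C'(k)$ in Remark C.3 is the sum of a geometric (Neumann) series for $[\mathbb{1}-CU]^{-1}C$. The sketch below checks this for scalars only; it ignores the matrix structure carried by $J$, and the numerical values of $k_0$, $e$ and $\check U$ are hypothetical illustrations, not values from the paper.

```python
def c_prime_closed(c, u):
    """Closed form C'(k) = C(k) / (1 - C(k) U(k)) at one fixed momentum k."""
    return c / (1 - c * u)


def c_prime_series(c, u, terms=200):
    """Partial sum of sum_n (C U)^n C, which converges when |C U| < 1."""
    total = 0j
    term = complex(c)          # n = 0 term of the series
    for _ in range(terms):
        total += term
        term *= c * u          # pass from (C U)^n C to (C U)^(n+1) C
    return total


if __name__ == "__main__":
    k0, e = 0.3, 0.5           # hypothetical frequency and dispersion value
    c = 1 / (1j * k0 - e)      # C(k) = 1 / (i k0 - e(k))
    u = 0.2                    # hypothetical value of the Fourier transform of U at k
    assert abs(c * u) < 1      # scalar analogue of the norm condition in Lemma C.2
    assert abs(c_prime_closed(c, u) - c_prime_series(c, u)) < 1e-12
    # Equivalent statement used in the proof of Lemma C.2(a): C'^{-1} = C^{-1} - U.
    assert abs(1 / c_prime_closed(c, u) - (1 / c - u)) < 1e-12
```

The last assertion is the scalar form of the identity $C'^{-1}=C^{-1}-U$ that turns $e^{\frac12\psi U\psi}e^{-\frac12\psi C^{-1}\psi}$ into $e^{-\frac12\psi C'^{-1}\psi}$.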
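Notation

Norms

Norm | Characteristics | Reference
$|\!|\!|\cdot|\!|\!|_{1,\infty}$ | no derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|_{1,\infty}$ | derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|\check{\ }_{\infty}$ | derivatives, external momenta, acts on functions | Definition IV.6
$|\!|\!|\cdot|\!|\!|_{\infty}$ | no derivatives, external positions, acts on functions | Example III.4
$\|\cdot\|\check{\ }_{1}$ | derivatives, external momenta, acts on functions | Definition IV.6
$\|\cdot\|\check{\ }_{\infty,B}$ | derivatives, external momenta, $B\subset\mathbb{R}\times\mathbb{R}^d$ | Definition IV.6
$\|\cdot\|\check{\ }_{1,B}$ | derivatives, external momenta, $B\subset\mathbb{R}\times\mathbb{R}^d$ | Definition IV.6
$\|\cdot\|$ | $\rho_{m;n}\|\cdot\|_{1,\infty}$ | Lemma V.1
$N(\mathcal{W};c,b,\alpha)$ | $\frac{1}{b^2}\,c\sum_{m,n\ge0}\alpha^n b^n\|W_{m,n}\|$ | Definition III.9
$N_0(\mathcal{W};\beta;X,\rho)$ | $e_0(X)\sum_{m+n\in2\mathbb{N}}\beta^n\rho_{m;n}\|W_{m,n}\|_{1,\infty}$ | Theorem V.2, Theorem VIII.6
$\|\cdot\|_{L^1}$ | derivatives, acts on functions on $\mathbb{R}\times\mathbb{R}^d$ | before Lemma IX.6
$\|\cdot\|^\sim$ | derivatives, external momenta, acts on functions | Definition X.4
$N_0^\sim(\mathcal{W};\beta;X,\rho)$ | $e_0(X)\sum_{m+n\in2\mathbb{N}}\beta^{m+n}\rho_{m;n}\|W^\sim_{m,n}\|^\sim$ | before Lemma X.11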
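Other notation

Notation | Description | Reference
$\Omega_S(\mathcal{W})(\phi,\psi)$ | $\log\frac1Z\int e^{\mathcal{W}(\phi,\psi+\zeta)}\,d\mu_S(\zeta)$ | before (I.6)
$J$ | particle/hole swap operator | (VI.1)
$\tilde\Omega_C(\mathcal{W})(\phi,\psi)$ | $\log\frac1Z\int e^{\phi J\zeta}\,e^{\mathcal{W}(\phi,\psi+\zeta)}\,d\mu_C(\zeta)$ | Definition VII.1
$r_0$ | number of $k_0$ derivatives tracked | Sec. VI
$r$ | number of $\mathbf{k}$ derivatives tracked | Sec. VI
$M$ | scale parameter, $M>1$ | before Definition VIII.1
const | generic constant, independent of scale |
const | generic constant, independent of scale and $M$ |
$\nu^{(j)}(k)$ | $j$th scale function | Definition VIII.1
$\tilde\nu^{(j)}(k)$ | $j$th extended scale function | Definition VIII.4(i)
$\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-1}(k_0^2+e(\mathbf{k})^2))$ | Definition VIII.1
$\tilde\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-2}(k_0^2+e(\mathbf{k})^2))$ | Definition VIII.4(ii)
$\bar\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-3}(k_0^2+e(\mathbf{k})^2))$ | Definition VIII.4(iii)
$S(C)$ | $\sup_m\sup_{\xi_1,\ldots,\xi_m\in\mathcal{B}}\big(\big|\int\psi(\xi_1)\cdots\psi(\xi_m)\,d\mu_C(\psi)\big|\big)^{1/m}$ | Definition IV.1
$\check f$ | Fourier transform | Definition IX.1(i)
$f^\sim$ | partial Fourier transform | Definition IX.1(ii)
$\hat\chi$ | Fourier transform | Definition IX.4
$\mathcal{B}$ | $\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}$ viewed as position space | beginning of Sec. II
$\check{\mathcal{B}}$ | $\mathbb{R}\times\mathbb{R}^d\times\{\uparrow,\downarrow\}\times\{0,1\}$ viewed as momentum space | beginning of Sec. IX
$\check{\mathcal{B}}_m$ | $\{(\check\eta_1,\ldots,\check\eta_m)\in\check{\mathcal{B}}^m\mid\check\eta_1+\cdots+\check\eta_m=0\}$ | before Definition X.1
$\mathcal{F}_m(n)$ | functions on $\mathcal{B}^m\times\mathcal{B}^n$, antisymmetric in $\mathcal{B}^m$ arguments | Definition II.9
$\check{\mathcal{F}}_m(n)$ | functions on $\check{\mathcal{B}}_m\times\mathcal{B}^n$, antisymmetric in $\check{\mathcal{B}}_m$ arguments | Definition X.8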
References

[1] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in fermionic models, Part 1: Nonperturbative bounds, preprint.
[2] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many fermion systems, Part 1: Insulators, Rev. Math. Phys. 15 (2003), 949–993.
[3] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 1: Overview, to appear in Commun. Math. Phys.
[4] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many fermion systems, Part 3: Sectorized norms, Rev. Math. Phys. 15 (2003), 1039–1120.
[5] J. Feldman, J. Magnen, V. Rivasseau and E. Trubowitz, Two dimensional many fermion systems as vector models, Europhys. Lett. 24 (1993), 521–526.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in fermionic models, Part 2: Overlapping loops, preprint.
[7] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 3: The Fermi surface, to appear in Commun. Math. Phys.
December 15, 2003 16:44 WSPC/148-RMP
00179
Reviews in Mathematical Physics, Vol. 15, No. 9 (2003) 1039–1120 © World Scientific Publishing Company
SINGLE SCALE ANALYSIS OF MANY FERMION SYSTEMS PART 3: SECTORIZED NORMS
JOEL FELDMAN∗
Department of Mathematics, University of British Columbia
Vancouver, B.C., Canada V6T 1Z2
[email protected]
http://www.math.ubc.ca/~feldman/

HORST KNÖRRER† and EUGENE TRUBOWITZ‡
Mathematik, ETH-Zentrum, CH-8092 Zürich, Switzerland
†[email protected]
‡[email protected]
†http://www.math.ethz.ch/~knoerrer/
Received 22 April 2003

The generic renormalization group map associated to a weakly coupled system of fermions at temperature zero is treated by supplementing the methods of Part 1. The interplay between position and momentum space is captured by “sectors”. It is shown that the difference between the complete four-legged vertex and its “ladder” part is irrelevant for the sequence of renormalization group maps.

Keywords: Fermi liquid; renormalization; fermionic functional integral; Euclidean Green’s functions.
Contents
XI. Introduction to Part 3
XII. Sectors and Sectorized Norms
XIII. Bounds for Sectorized Propagators
XIV. Ladders
XV. Norm Estimates on the Renormalization Group Map
XVI. Sectorized Momentum Space Norms
XVII. The Renormalization Group Map and Norms in Momentum Space
Appendices
D. Naive ladder estimates
Notation
References
∗ Research supported in part by the Natural Sciences and Engineering Research Council of Canada and the Forschungsinstitut für Mathematik, ETH Zürich.
XI. Introduction to Part 3

We use “sectors” to construct norms that allow nonperturbative control of renormalization group maps for two-dimensional many fermion systems. Thus from Sec. XII on, we assume that the dimension $d$ of our system is two. Notation tables are provided at the end of the paper.

We assume that the dispersion relation $e(\mathbf{k})$ is $r+d+1$ times differentiable, with $r\ge2$, and that its gradient does not vanish on the Fermi surface
\[
F=\{(k_0,\mathbf{k})\in\mathbb{R}\times\mathbb{R}^d\mid k_0=0,\ e(\mathbf{k})=0\} .
\]
It follows from these hypotheses that the gradient of the dispersion relation $e(\mathbf{k})$ does not vanish in a neighborhood of $F$ and that there is an $r+d+1$ times differentiable projection $\pi_F$ to $F$ in a neighborhood of the Fermi surface. We assume that the scale parameter $M$ of Sec. VIII has been chosen so big that the “second doubly extended neighborhood” $\{k\in\mathbb{R}\times\mathbb{R}^2\mid\bar\nu^{(\ge2)}(k)\ne0\}$ is contained in the two above-mentioned neighborhoods.

XII. Sectors and Sectorized Norms

From now on we consider only $d=2$, so that the Fermi “surface” is a curve in $\mathbb{R}\times\mathbb{R}^2$.

Definition XII.1. (Sectors and sectorizations)
(i) Let $I$ be an interval on the Fermi surface $F$ and $j\ge2$. Then
\[
s=\{k\text{ in the $j$th neighborhood}\mid\pi_F(k)\in I\}
\]
is called a sector of length $|I|$ at scale $j$. Recall that $\pi_F(k)$ is the projection of $k$ on the Fermi surface. Two different sectors $s$ and $s'$ are called neighbors if $s'\cap s\ne\emptyset$.
(ii) If $s$ is a sector at scale $j$, its extension is
\[
\tilde s=\{k\text{ in the $j$th extended neighborhood}\mid\pi_F(k)\in s\} .
\]
(iii) A sectorization of length $l$ at scale $j$ is a set $\Sigma$ of sectors of length $l$ at scale $j$ that obeys the following:
– the set $\Sigma$ of sectors covers the Fermi surface
– each sector in $\Sigma$ has precisely two neighbors in $\Sigma$, one to its left and one to its right
– if $s,s'\in\Sigma$ are neighbors then $\frac{1}{16}\,l\le|s\cap s'\cap F|\le\frac18\,l$

Observe that there are at most $2\,\mathrm{length}(F)/l$ sectors in $\Sigma$.
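The counting statement can be illustrated on a toy model. The sketch below is an assumption-laden illustration, not the construction used in the paper: the Fermi curve is modelled as a closed curve of length `L` parametrized by arclength, all sectors are taken to be equally spaced arcs of length `l`, and the function names are hypothetical. It builds such a sectorization with neighbor overlaps between $l/16$ and $l/8$, then checks the bound on the number of sectors and that every point lies in at most two sectors.

```python
from fractions import Fraction
from math import ceil


def build_sectorization(L, l):
    """Return (n, overlap) for n equally spaced arcs of length l on a closed curve of length L.

    Neighboring arcs share an arc of length `overlap`, required to lie in [l/16, l/8].
    """
    L, l = Fraction(L), Fraction(l)
    n = ceil(16 * L / (15 * l))   # smallest n for which the overlap is at least l/16
    overlap = l - L / n           # arcs start at multiples of L/n
    if not (l / 16 <= overlap <= l / 8):
        raise ValueError("no equally spaced sectorization for this L and l")
    return n, overlap


def sectors_containing(x, L, l, n):
    """Indices i of the arcs [i*L/n, i*L/n + l] (mod L) that contain the point x."""
    L, l, x = Fraction(L), Fraction(l), Fraction(x)
    step = L / n
    return [i for i in range(n) if (x - i * step) % L <= l]


if __name__ == "__main__":
    L, l = 100, 1
    n, overlap = build_sectorization(L, l)
    assert n <= 2 * Fraction(L) / l                      # at most 2 length(F)/l sectors
    assert Fraction(l, 16) <= overlap <= Fraction(l, 8)  # overlap condition of (iii)
    counts = [len(sectors_containing(Fraction(k, 10), L, l, n)) for k in range(1000)]
    assert min(counts) >= 1 and max(counts) <= 2         # covering, and at most two sectors per point
```

In this model consecutive arcs are offset by at least $\frac78 l$, so $n\le\frac{8}{7}\,\mathrm{length}(F)/l$, which is consistent with the stated bound.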
We will need partitions of unity for the sectors, as well as functions that envelope the sectors — i.e. that are identically one on a sector and are supported near the sector. Their L1 –L∞ norm will be typical for a function with the specified support. To measure it we generalize Definition IV.10.
[Figure: two neighboring sectors $s$ and $s'$.]
Definition XII.2. The element $c_j$ of $\mathfrak{N}_{d+1}$ is defined as
\[
c_j=\sum_{\substack{|\boldsymbol\delta|\le r\\ |\delta_0|\le r_0}}M^{j|\delta|}\,t^\delta+\sum_{\substack{|\boldsymbol\delta|>r\\ \text{or }|\delta_0|>r_0}}\infty\,t^\delta\ \in\mathfrak{N}_{d+1} .
\]
Lemma XII.3. Let $\Sigma$ be a sectorization of length $\frac{1}{M^{j-3/2}}\le l\le\frac{1}{M^{(j-1)/2}}$ at scale $j\ge2$. Then there exist $\chi_s(k)$, $\tilde\chi_s(k)$, $s\in\Sigma$, that take values in $[0,1]$ such that
(i) $\chi_s$ is supported in the extended sector $\tilde s$ and
\[
\sum_{s\in\Sigma}\chi_s(k)=1\qquad\text{for $k$ in the $j$th neighborhood.}
\]
(ii) $\tilde\chi_s$ is identically one on the extended sector $\tilde s$, is supported on the $j$th doubly extended neighborhood and $\tilde\chi_s(k)\cdot\tilde\chi_{s'}(k)=0$ if $s\cap s'=\emptyset$. Furthermore,
\[
\int d^3k\,\frac{\tilde\chi_s(k)}{|\imath k_0-e(\mathbf{k})|}\le\mathrm{const}\,\frac{l}{M^j} .
\]
(iii)
\[
\|\hat\chi_s\|_{1,\infty},\ \|\hat{\tilde\chi}_s\|_{1,\infty}\le\mathrm{const}\,c_{j-1}\le\mathrm{const}\,c_j
\]
with a constant const that does not depend on $M$, $j$, $\Sigma$ or $s$.
The proof of this lemma is postponed to Sec. XIII.

Definition XII.4 (Sectorized representatives). Let $\Sigma$ be a sectorization at scale $j$, and let $m,n\ge0$.
(i) The antisymmetrization of a function $\varphi$ on $\mathcal{B}^m\times(\mathcal{B}\times\Sigma)^n$ is
\[
\mathrm{Ant}\,\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))=\frac{1}{m!\,n!}\sum_{\substack{\pi\in S_m\\ \pi'\in S_n}}\varphi(\eta_{\pi(1)},\ldots,\eta_{\pi(m)};(\xi_{\pi'(1)},s_{\pi'(1)}),\ldots,(\xi_{\pi'(n)},s_{\pi'(n)})) .
\]
(ii) Denote by $\mathcal{F}_m(n;\Sigma)$ the space of all translation invariant, complex valued functions
\[
\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))
\]
on $\mathcal{B}^m\times(\mathcal{B}\times\Sigma)^n$ that are antisymmetric in their external ($=\eta$) variables and whose Fourier transform $\check\varphi(\check\eta_1,\ldots,\check\eta_m;(\check\xi_1,s_1),\ldots,(\check\xi_n,s_n))$ vanishes unless $k_i\in\tilde s_i$ for all $1\le i\le n$. Here, $\check\xi_i=(k_i,\sigma_i,a_i)$.
(iii) Let $f\in\mathcal{F}_m(n)$ be translation invariant. A $\Sigma$-sectorized representative for $f$ is a function $\varphi\in\mathcal{F}_m(n;\Sigma)$ obeying
\[
\check f(\check\eta_1,\ldots,\check\eta_m;\check\xi_1,\ldots,\check\xi_n)=\sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n}}\check\varphi(\check\eta_1,\ldots,\check\eta_m;(\check\xi_1,s_1),\ldots,(\check\xi_n,s_n))
\]
for all $\check\xi_i=(k_i,\sigma_i,a_i)$ with $k_i$ in the $j$th neighborhood.
(iv) Let $u((\xi,s),(\xi',s'))$ be a translation invariant, spin independent, particle number conserving function on $(\mathcal{B}\times\Sigma)^2$. We define $\check u(k)$ by
\[
\delta_{\sigma,\sigma'}\,\check u(k)=\sum_{s,s'\in\Sigma}\check u((k,\sigma,1,s),(k,\sigma',0,s')) .
\]
Example XII.5. Set
\[
\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))=\int\prod_{i=1}^{n}\big(d\xi_i'\,\hat\chi_{s_i}(\xi_i,\xi_i')\big)\,f(\eta_1,\ldots,\eta_m;\xi_1',\ldots,\xi_n')
\]
where $\chi_s$ is the partition of unity of Lemma XII.3 and $\hat\chi_s$ was defined in Definition IX.4. Then $\varphi$ is a $\Sigma$-sectorized representative for $f$.

Recall that we want to control the renormalization group map $\Omega_C$ on $\bigwedge_AV$, where $A$ is the Grassmann algebra generated by the fields $\phi(\eta)$ and $V$ is the vector space generated by the fields $\psi(\xi)$. We shall do this by controlling norms of sectorized representatives of the coefficient functions. In preparation, we consider a renormalization group map that is adjusted to the sectorization.

Definition XII.6. (i) $V_\Sigma$ is the vector space generated by $\psi(\xi,s)$, $\xi\in\mathcal{B}$, $s\in\Sigma$. If $\varphi\in\mathcal{F}_m(n;\Sigma)$ we define the element $\mathrm{Tens}(\varphi)$ of $A_m\otimes V_\Sigma^{\otimes n}$ by
\[
\mathrm{Tens}(\varphi)=\sum_{\substack{s_j\in\Sigma\\ j=1,\ldots,n}}\int\prod_{i=1}^{m}d\eta_i\prod_{j=1}^{n}d\xi_j\;\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\;\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1,s_1)\otimes\cdots\otimes\psi(\xi_n,s_n)
\]
and the element $\mathrm{Gr}(\varphi)$ of $A_m\otimes\bigwedge^nV_\Sigma$ as
\[
\mathrm{Gr}(\varphi)=\sum_{\substack{s_j\in\Sigma\\ j=1,\ldots,n}}\int\prod_{i=1}^{m}d\eta_i\prod_{j=1}^{n}d\xi_j\;\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\;\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1,s_1)\cdots\psi(\xi_n,s_n) .
\]
Elements of $A\otimes\bigwedge V_\Sigma$ are called sectorized Grassmann functions.
(ii) Let
\[
\mathcal{W}=\sum_{m,n\ge0}\int\prod_{i=1}^{m}d\eta_i\prod_{j=1}^{n}d\xi_j\;W_{m,n}(\eta_1,\ldots,\eta_m;\xi_1,\ldots,\xi_n)\;\phi(\eta_1)\cdots\phi(\eta_m)\,\psi(\xi_1)\cdots\psi(\xi_n)
\]
be a Grassmann function with $W_{m,n}\in\mathcal{F}_m(n)$ antisymmetric in its internal ($=\psi$) variables. A sectorized representative for $\mathcal{W}$ is a sectorized Grassmann function of the form
\[
w=\sum_{m,n\ge0}\mathrm{Gr}(w_{m,n})
\]
where, for each $m,n$, $w_{m,n}$ is a sectorized representative for $W_{m,n}$ that is also antisymmetric in the variables $(\xi_1,s_1),\ldots,(\xi_n,s_n)$.

Remark XII.7. Let $I_O$ be the ideal in $\bigwedge_AV$ consisting of all
\[
\mathcal{W}=\sum_{n>0}\int\prod_{j=1}^{n}d\xi_j\;W_n(\xi_1,\ldots,\xi_n)\,\psi(\xi_1)\cdots\psi(\xi_n)
\]
obeying
\[
\check W_n((k_1,\sigma_1,a_1),\ldots,(k_n,\sigma_n,a_n))=0\qquad\text{for all $k_1,\ldots,k_n$ in the $j$th neighborhood}
\]
and let $V_\Sigma^{\rm eff}$ be the linear subspace of $V_\Sigma$ consisting of all
\[
\mathcal{V}=\sum_{s\in\Sigma}\int d\xi\;\varphi((\xi,s))\,\psi(\xi,s)
\]
obeying
\[
\check\varphi((k,\sigma,a,s))=0\qquad\text{unless }k\in\tilde s .
\]
Furthermore let $\pi:V_\Sigma\to V$ be the linear map that sends $\psi(\xi,s)\in V_\Sigma$ to $\psi(\xi)\in V$. It induces an algebra homomorphism from $\bigwedge_AV_\Sigma$ to $\bigwedge_AV$, which we again denote by $\pi$. Then the sectorized Grassmann function $w$ is a sectorized representative for the Grassmann function $\mathcal{W}$ if and only if $w\in\bigwedge_AV_\Sigma^{\rm eff}$ and $\pi(w)-\mathcal{W}\in I_O$.
Proposition XII.8 (Functoriality). Let $C(\xi,\xi')$ be a skew symmetric function on $\mathcal{B}\times\mathcal{B}$. Assume that there is an antisymmetric function $c((\xi,s),(\xi',s'))\in\mathcal{F}_0(2;\Sigma)$ such that
\[
C(\xi,\xi')=\sum_{s,s'\in\Sigma}c((\xi,s),(\xi',s'))
\]
and
\[
\check c((k,\sigma,a,s),(k',\sigma',a',s'))=0
\]
unless$^{a}$ $k\in s$, $k'\in s'$. Define a covariance on $V_\Sigma$ by
\[
C_\Sigma(\psi(\xi,s),\psi(\xi',s'))=\sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}}c((\xi,t),(\xi',t')) .
\]
(i) If $\varphi\in\mathcal{F}_0(n;\Sigma)$ then
\[
\begin{aligned}
&\sum_{s_1,\ldots,s_n\in\Sigma}\int\!\!\int d\xi_1\cdots d\xi_n\;\varphi((\xi_1,s_1),\ldots,(\xi_n,s_n))\,\psi(\xi_1,s_1)\cdots\psi(\xi_n,s_n)\,d\mu_{C_\Sigma}(\psi)\\
&\qquad=\int\!\!\int d\xi_1\cdots d\xi_n\sum_{s_1,\ldots,s_n\in\Sigma}\varphi((\xi_1,s_1),\ldots,(\xi_n,s_n))\,\psi(\xi_1)\cdots\psi(\xi_n)\,d\mu_C(\psi) .
\end{aligned}
\]
(ii) Let $\mathcal{W}(\phi;\psi)$ be an even Grassmann function and $w$ a sectorized representative for $\mathcal{W}$. Then $\Omega_{C_\Sigma}(w)$ is a sectorized representative for $\Omega_C(\mathcal{W})$. For any $\zeta(\xi)$, set, with some abuse of notation,
\[
(C\zeta)(\xi,s)=\int d\xi'\,C(\xi,\xi')\,\zeta(\xi') .
\]
Then $\frac12\phi JCJ\phi+\Omega_{C_\Sigma}(w)(\phi,\psi+CJ\phi)$ is a sectorized representative for $\tilde\Omega_C(\mathcal{W})$.
(iii) Let $\mathcal{W}(\phi;\psi)$ be a Grassmann function and $w$ a sectorized representative for $\mathcal{W}$. Then $:\!w\!:_{C_\Sigma}$ is a sectorized representative for $:\!\mathcal{W}\!:_C$.

Proof. (i) First consider $n=2$. Then
\[
\begin{aligned}
&\sum_{s_1,s_2\in\Sigma}\int\!\!\int d\xi_1\,d\xi_2\;\varphi((\xi_1,s_1),(\xi_2,s_2))\,\psi(\xi_1,s_1)\psi(\xi_2,s_2)\,d\mu_{C_\Sigma}(\psi)\\
&\qquad=\sum_{s_1,s_2\in\Sigma}\int d\xi_1\,d\xi_2\;\varphi((\xi_1,s_1),(\xi_2,s_2))\,C_\Sigma(\psi(\xi_1,s_1),\psi(\xi_2,s_2))\\
&\qquad=\sum_{\substack{s_1,s_2,t_1,t_2\in\Sigma\\ t_1\cap s_1\ne\emptyset\\ t_2\cap s_2\ne\emptyset}}\int d\xi_1\,d\xi_2\;\varphi((\xi_1,s_1),(\xi_2,s_2))\,c((\xi_1,t_1),(\xi_2,t_2))\\
&\qquad=\sum_{s_1,s_2,t_1,t_2\in\Sigma}\int d\xi_1\,d\xi_2\;\varphi((\xi_1,s_1),(\xi_2,s_2))\,c((\xi_1,t_1),(\xi_2,t_2))
\end{aligned}
\]
$^{a}$ The hypothesis $c((\xi,s),(\xi',s'))\in\mathcal{F}_0(2;\Sigma)$ implies that $\check c((k,\cdot,s),(k',\cdot,s'))$ vanishes unless $k\in\tilde s$, $k'\in\tilde s'$. Here we further require that $\check c((k,\cdot,s),(k',\cdot,s'))$ vanish unless $k$ and $k'$ are in the $j$th neighborhood.
\[
\begin{aligned}
&\qquad=\int d\xi_1\,d\xi_2\sum_{s_1,s_2\in\Sigma}\varphi((\xi_1,s_1),(\xi_2,s_2))\;C(\xi_1,\xi_2)\\
&\qquad=\int\!\!\int d\xi_1\,d\xi_2\sum_{s_1,s_2\in\Sigma}\varphi((\xi_1,s_1),(\xi_2,s_2))\;\psi(\xi_1)\psi(\xi_2)\,d\mu_C(\psi) .
\end{aligned}
\]
In the third equality, we used conservation of momentum to imply that
\[
\int d\xi_1\,d\xi_2\;\varphi((\xi_1,s_1),(\xi_2,s_2))\,c((\xi_1,t_1),(\xi_2,t_2))=0
\]
unless $\tilde s_1\cap t_1\ne\emptyset$ and $\tilde s_2\cap t_2\ne\emptyset$ and hence unless $s_1\cap t_1\ne\emptyset$ and $s_2\cap t_2\ne\emptyset$. The claim for general $n$ is now proven by induction on $n$ using integration by parts (see, for example, [1, Sec. II.2]).

(ii) Set $\mathcal{W}'=\mathcal{W}-\pi(w)\in I_O$. By assumption, $\check C((k,\sigma,a),(k',\sigma',a'))=0$ unless $k$ and $k'$ both lie in the $j$th neighborhood. Therefore $\int f(\psi+\zeta)\,d\mu_C(\zeta)\in I_O$ for all $f(\zeta)\in I_O$. Consequently
\[
\int e^{\pi(w)(\phi,\psi+\zeta)}\,d\mu_C(\zeta)-\int e^{\mathcal{W}(\phi,\psi+\zeta)}\,d\mu_C(\zeta)=\int e^{\pi(w)(\phi,\psi+\zeta)}\big[1-e^{\mathcal{W}'(\phi,\psi+\zeta)}\big]\,d\mu_C(\zeta)\in I_O
\]
since $1-e^{\mathcal{W}'(\phi,\psi)}\in I_O$ and $I_O$ is an ideal. In particular
\[
Z(\pi(w))=\int e^{\pi(w)(0,\zeta)}\,d\mu_C(\zeta)=\int e^{\mathcal{W}(0,\zeta)}\,d\mu_C(\zeta)=Z(\mathcal{W})
\]
so that
\[
\frac{1}{Z(\pi(w))}\int e^{\pi(w)(\phi,\psi+\zeta)}\,d\mu_C(\zeta)-\frac{1}{Z(\mathcal{W})}\int e^{\mathcal{W}(\phi,\psi+\zeta)}\,d\mu_C(\zeta)\in I_O .
\]
Expanding the power series for $\log(1+x)$, one sees that
\[
\Omega_C(\pi(w))-\Omega_C(\mathcal{W})\in I_O .
\]
As $C_\Sigma(v,v')=C(\pi(v),\pi(v'))$ for all $v,v'\in V_\Sigma^{\rm eff}$, functoriality of the renormalization group map ([1, Remarks III.3 and III.1(i)]) implies that $\Omega_C(\pi(w))=\pi(\Omega_{C_\Sigma}(w))$. So $\Omega_C(\mathcal{W})-\pi(\Omega_{C_\Sigma}(w))\in I_O$. Also, by construction, $\Omega_{C_\Sigma}(w)\in\bigwedge_AV_\Sigma^{\rm eff}$. Hence, by Remark XII.7, $\Omega_{C_\Sigma}(w)$ is a sectorized representative for $\Omega_C(\mathcal{W})$.

If $v(\phi,\psi)$ is a sectorized representative for $\mathcal{V}(\phi,\psi)$, then $v(\phi,\psi+CJ\phi)$ is a sectorized representative for $\mathcal{V}(\phi,\psi+CJ\phi)$. Therefore the second claim follows from the first and Lemma VII.3.

(iii) Part (iii) is an immediate consequence of part (i) and [1, Proposition A.2(i)].
Definition XII.9 (Norms for sectorized functions). Let $\Sigma$ be a sectorization at scale $j\ge2$ and let $m,n\ge0$ and $p>0$ be integers.
(i) For a function $\varphi$ on $\mathcal{B}^m\times(\mathcal{B}\times\Sigma)^n$ we define the seminorm $|\varphi|_{p,\Sigma}$ to be zero if $m\ge1$, $p\ge2$ or if $m=0$, $p>n$. In the case $m\ge1$, $p=1$ we set
\[
|\varphi|_{p,\Sigma}=\sum_{s_i\in\Sigma}\|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\|_{1,\infty} .
\]
In the case $m=0$, $p\le n$ we set
\[
|\varphi|_{p,\Sigma}=\max_{1\le i_1<\cdots<i_p\le n}\ \max_{s_{i_1},\ldots,s_{i_p}\in\Sigma}\ \sum_{\substack{s_i\in\Sigma\\ \text{for }i\ne i_1,\ldots,i_p}}\|\varphi((\xi_1,s_1),\ldots,(\xi_n,s_n))\|_{1,\infty} .
\]
In both cases, the $\|\cdot\|_{1,\infty}$ norm (defined in Example II.6) applies to all the position space variables. Furthermore, maxima acting on a formal power series $\sum_\delta a_\delta t^\delta$ are to be applied separately to each coefficient $a_\delta$.
(ii) Let $f\in A_m\otimes(V_\Sigma^{\rm eff})^{\otimes n}$ be a sectorized Grassmann function. Then there is a unique $\varphi\in\mathcal{F}_m(n;\Sigma)$ such that $f=\mathrm{Tens}(\varphi)$. By definition
\[
|f|_{p,\Sigma}=\begin{cases}|\varphi|_{p,\Sigma}&\text{if $\varphi$ is translation invariant, conserves particle numbers and is spin independent}\\ \infty&\text{otherwise.}\end{cases}
\]
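Example XII.10. Let $f\in\mathcal F_m(n)$ with $m\ge 1$. Then $f$ has a sectorized representative $\varphi$ fulfilling
\[
|\varphi|_{1,\Sigma}\le\Big(\frac{\mathrm{const}}{l}\Big)^{n}c_j^{\,n}\,\|f\|_{1,\infty} .
\]

Proof. Select a sectorized representative $\varphi$ for $f$ as in Example XII.5. For each choice of sectors $s_1,\ldots,s_n$ in $\Sigma$
\[
\begin{aligned}
\|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\|_{1,\infty}
&=\Big\|\int\prod_{i=1}^{n}\big(d\xi_i'\,\hat\chi_{s_i}(\xi_i,\xi_i')\big)\,f(\eta_1,\ldots,\eta_m;\xi_1',\ldots,\xi_n')\Big\|_{1,\infty}\\
&\le\|f\|_{1,\infty}\prod_{i=1}^{n}\|\hat\chi_{s_i}\|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}^{\,n}\,\|f\|_{1,\infty}
\end{aligned}
\]
by Lemma II.7 ($n$ times) and Lemma XII.3(iii). As there are $\frac{\mathrm{const}}{l}$ sectors, the claim follows from the definition.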
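The final step of this proof can be spelled out as follows. This is only a sketch: $|\Sigma|$ denotes the number of sectors, and the passage from $c_{j-1}$ to $c_j$ assumes that $c_{j-1}\le c_j$ coefficientwise, which holds for the series $c_j$ of Definition XII.2 when $M>1$.

```latex
|\varphi|_{1,\Sigma}
  = \sum_{s_1,\ldots,s_n\in\Sigma}
    \|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\|_{1,\infty}
  \le |\Sigma|^{n}\,\mathrm{const}^{n}\,c_{j-1}^{\,n}\,\|f\|_{1,\infty}
  \le \Big(\frac{\mathrm{const}'}{l}\Big)^{n} c_{j}^{\,n}\,\|f\|_{1,\infty},
\qquad |\Sigma|\le\frac{\mathrm{const}}{l}.
```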
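Remark XII.11. Let $D$ be a decay operator and $\varphi$ a function on $(\mathcal B\times\Sigma)^n$. Then
\[
|D\varphi|_{p,\Sigma}\le\frac{\partial^{|\delta(D)|}}{\partial t^{\delta(D)}}\,|\varphi|_{p,\Sigma} .
\]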
Single Scale Analysis of Many Fermion Systems — Part 3
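Lemma XII.12. Let $\varphi\in\mathcal F_0(n;\Sigma)$ be a sectorized representative for a translation invariant $f\in\mathcal F_0(n)$. Then, for $p\le n$,
\[
\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\Bigg[\max_{\substack{D\ \text{differential operator}\\ \text{with }\delta(D)=\delta}}\ \sup_{\substack{\check\eta_1,\ldots,\check\eta_n\\ \check\eta_1+\cdots+\check\eta_n=0\\ k_1,\ldots,k_n\ \text{in $j$th neighborhood}}}|D\check f(\check\eta_1,\ldots,\check\eta_n)|\Bigg]t^\delta\le 2^p\,|\varphi|_{p,\Sigma} .
\]
Here $\check\eta_i=(k_i,\sigma_i,b_i)$ and the differential operators $D$ are defined in Definition X.2(iii).

Proof. Fix a differential operator $D=D^{\delta^{(1)}}_{u_1;v_1}\cdots D^{\delta^{(q)}}_{u_q;v_q}$ and let $\mathcal D=\mathcal D^{\delta^{(1)}}_{u_1,v_1}\cdots\mathcal D^{\delta^{(q)}}_{u_q,v_q}$ be the corresponding decay operator as in Definition II.3. Fix any $\check\eta_i=(k_i,\sigma_i,b_i)$, $1\le i\le n$, with $k_1,\ldots,k_n$ in the $j$th neighborhood and $\sum_{i=1}^n(-1)^{b_i}k_i=0$. Then
\[
\check f(\check\eta_1,\ldots,\check\eta_n)=\sum_{\substack{s_i\ni k_i\\ 1\le i\le n}}\check\varphi((k_1,\sigma_1,b_1,s_1),\ldots,(k_n,\sigma_n,b_n,s_n))
\]
so that
\[
\begin{aligned}
|D\check f(\check\eta_1,\ldots,\check\eta_n)|
&\le\sum_{\substack{s_i\ni k_i\\ 1\le i\le p}}\ \sum_{\substack{s_i\in\Sigma\\ p+1\le i\le n}}\int\prod_{\ell\neq n}d\xi_\ell\ |\mathcal D\varphi((\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}),(0,\sigma_n,b_n,s_n))|\\
&\le 2^p\max_{s_1,\ldots,s_p}\ \sum_{\substack{s_i\in\Sigma\\ p+1\le i\le n}}\|\mathcal D\varphi((\cdot,s_1),\ldots,(\cdot,s_n))\|_{1,\infty}
\end{aligned}
\]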
since each $k_i$ can be contained in at most two sectors.

Remark XII.13. We will use the norms of Definition XII in a multi scale analysis to prove the existence of Green's functions in the position space $L^\infty$-norm. This is the reason why we take the suprema over the external variables $\eta_1,\ldots,\eta_m$, and why we do not "sectorize" these variables. In Sec. XVI, we will introduce another set of norms, designed to study the smoothness properties of the amputated two and four-point functions in momentum space.

Lemma XII.14. Let $\Sigma$ be a sectorization and let $\varphi$ be a function on $\mathcal B^m\times(\mathcal B\times\Sigma)^n$, $\varphi'$ be a function on $\mathcal B^{m'}\times(\mathcal B\times\Sigma)^{n'}$ and $1\le i\le n$, $1\le i'\le n'$. Define the function $\gamma$ on $\mathcal B^{m+m'}\times(\mathcal B\times\Sigma)^{n+n'-2}$ by
\[
\begin{aligned}
&\gamma(\eta_1,\ldots,\eta_{m+m'};(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'}))\\
&\quad=\sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}}\int_{\mathcal B}d\zeta\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_{i+1},s_{i+1}),\ldots,(\xi_n,s_n))\\
&\qquad\cdot\varphi'(\eta_{m+1},\ldots,\eta_{m+m'};(\xi_{n+1},s_{n+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\zeta,s'),(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'})) .
\end{aligned}
\]
If $m=0$ or $m'=0$,
\[
|\gamma|_{p,\Sigma}\le 3\max_{\substack{p_1+p_2=p+1\\ p_1,p_2\ \text{odd}}}|\varphi|_{p_1,\Sigma}\,|\varphi'|_{p_2,\Sigma}
\]
for all odd natural numbers $p$.

Proof. The variable indices for $\gamma$ lie in the set $I\cup I'$, where
\[
I=\{1,\ldots,i-1,i+1,\ldots,n\},\qquad I'=\{n+1,\ldots,n+i'-1,n+i'+1,\ldots,n+n'\} .
\]
Fix $u_1,\ldots,u_q\in I$, $u_{q+1},\ldots,u_p\in I'$ and fix sectors $s_{u_1},\ldots,s_{u_p}\in\Sigma$. First assume that $q$ is odd. Then $p-q$ is even. By Lemma II.7, for each choice of sectors $s_\nu$, $\nu\in I\cup I'\setminus\{u_1,\ldots,u_p\}$, one has
\[
\begin{aligned}
&\|\gamma(\eta_1,\ldots,\eta_{m+m'};(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'}))\|_{1,\infty}\\
&\quad\le\sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}}\|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_{i+1},s_{i+1}),\ldots,(\xi_n,s_n))\|_{1,\infty}\\
&\qquad\cdot\|\varphi'(\eta_{m+1},\ldots,\eta_{m+m'};(\xi_{n+1},s_{n+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\zeta',s'),\ldots,(\xi_{n+n'},s_{n+n'}))\|_{1,\infty} .
\end{aligned}
\]
Observe that for every $s\in\Sigma$ there are at most three sectors $s'$ such that $s'\cap s\neq\emptyset$. Consequently
\[
\begin{aligned}
&\sum_{\substack{s_\nu\in\Sigma\\ \nu\in I\cup I'\setminus\{u_1,\ldots,u_p\}}}\|\gamma(\eta_1,\ldots,\eta_{m+m'};(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'}))\|_{1,\infty}\\
&\quad\le 3\sum_{\substack{s_\nu\in\Sigma\\ \nu\in I\setminus\{u_1,\ldots,u_q\}}}\ \sum_{s\in\Sigma}\|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_{i+1},s_{i+1}),\ldots,(\xi_n,s_n))\|_{1,\infty}\\
&\qquad\cdot\max_{s'\in\Sigma}\sum_{\substack{s'_\mu\in\Sigma\\ \mu\in I'\setminus\{u_{q+1},\ldots,u_p\}}}\|\varphi'(\eta_{m+1},\ldots,\eta_{m+m'};(\xi_{n+1},s_{n+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\zeta',s'),\ldots,(\xi_{n+n'},s_{n+n'}))\|_{1,\infty}\\
&\quad\le 3\,|\varphi|_{q,\Sigma}\,|\varphi'|_{p-q+1,\Sigma} .
\end{aligned}
\]
The case that $q$ is even follows as in the case discussed above by interchanging the roles of $\varphi$ and $\varphi'$.

We define "contraction" for sectorized functions as the obvious generalization of Definition III.1.

Definition XII.15. Let $c((\xi,s),(\xi',s'))$ be any skew symmetric function on $(\mathcal B\times\Sigma)^2$. Let $m,n\ge 0$ and $1\le i < j\le n$. For $\varphi\in\mathcal F_m(n;\Sigma)$ the contraction $\mathop{\mathrm{Con}_c}\limits_{i\to j}\varphi\in\mathcal F_m(n-2;\Sigma)$ is defined as
\[
\begin{aligned}
&\mathop{\mathrm{Con}_c}\limits_{i\to j}\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{j-1},s_{j-1}),(\xi_{j+1},s_{j+1}),\ldots,(\xi_n,s_n))\\
&\quad=(-1)^{j-i+1}\sum_{\substack{s_i,s_j,t_i,t_j\in\Sigma\\ t_i\cap s_i\neq\emptyset\\ t_j\cap s_j\neq\emptyset}}\int d\xi_i\,d\xi_j\ c((\xi_i,t_i),(\xi_j,t_j))\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n)) .
\end{aligned}
\]

Proposition XII.16. Let $\Sigma$ be a sectorization of length $l$ at scale $j\ge 2$ and let $c((\xi,s),(\xi',s'))\in\mathcal F_0(2;\Sigma)$ be an antisymmetric function.

(i) Let $p$ be an odd integer, $m,m'\ge 0$, $n,n'\ge 1$ and $\varphi\in\mathcal F_m(n;\Sigma)$, $\varphi'\in\mathcal F_{m'}(n',\Sigma)$. If $m=0$ or $m'=0$
\[
\Big|\mathop{\mathrm{Con}_c}\limits_{1\to n+1}\mathrm{Ant}_{\mathrm{ext}}(\varphi\otimes\varphi')\Big|_{p,\Sigma}\le 9\,|c|_{1,\Sigma}\max_{\substack{p_1+p_2=p+1\\ p_1,p_2\ \text{odd}}}|\varphi|_{p_1,\Sigma}\,|\varphi'|_{p_2,\Sigma} .
\]
If $m\neq 0$ and $m'\neq 0$
\[
\Big|\mathop{\mathrm{Con}_c}\limits_{1\to n+1}\mathrm{Ant}_{\mathrm{ext}}(\varphi\otimes\varphi')\Big|_{1,\Sigma}\le 9\Big(\sup_{\xi,\xi',s,s'}|c((\xi,s),(\xi',s'))|\Big)\,|\varphi|_{1,\Sigma}\,|\varphi'|_{1,\Sigma} .
\]

(ii) Assume that there is a function $C(k)$ that is supported in the $j$th neighborhood, such that $c((\cdot,s),(\cdot,s'))$ is the Fourier transform of $\chi_s(k)C(k)\chi_{s'}(k)$ in the sense of Definition IX.3 and that $|C(k)|\le\frac{\varepsilon}{|\imath k_0-e(k)|}$ for some $\varepsilon\ge 0$.
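Let $\varphi\in\mathcal F_m(n;\Sigma)$, $n'\le n$ and set, as in Definition III.5 of integral bound,
\[
\begin{aligned}
&\varphi'(\eta_1,\ldots,\eta_m;(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))\\
&\quad=\sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n'}}\iint d\xi_1\cdots d\xi_{n'}\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{n'},s_{n'}),\ldots,(\xi_n,s_n))\ \psi(\xi_1,s_1)\cdots\psi(\xi_{n'},s_{n'})\,d\mu_{C_\Sigma}(\psi)
\end{aligned}
\]
where
\[
C_\Sigma(\psi(\xi,s),\psi(\xi',s'))=\sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}}c((\xi,t),(\xi',t')) .
\]
Then for all $p$
\[
|\varphi'|_{p,\Sigma}\le\Big(\varepsilon B_1\frac{l}{M^j}\Big)^{n'/2}|\varphi|_{p,\Sigma}
\]
with a constant $B_1$ that is independent of $j$ and $\Sigma$.

Proof. (i) Set
\[
\begin{aligned}
&\gamma(\eta_1,\ldots,\eta_{m+m'};(\xi_2,s_2),\ldots,(\xi_n,s_n),(\xi_{n+2},s_{n+2}),\ldots,(\xi_{n+n'},s_{n+n'}))\\
&\quad=\sum_{\substack{s,s',t,t'\in\Sigma\\ t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}}\int d\zeta\,d\eta\ \varphi(\eta_1,\ldots,\eta_m;(\zeta,s),(\xi_2,s_2),\ldots,(\xi_n,s_n))\ c((\zeta,t),(\eta,t'))\\
&\qquad\times\varphi'(\eta_{m+1},\ldots,\eta_{m+m'};(\eta,s'),(\xi_{n+2},s_{n+2}),\ldots,(\xi_{n+n'},s_{n+n'})) .
\end{aligned}
\]
Then $(-1)^{n+1}\mathrm{Ant}_{\mathrm{ext}}\gamma=\mathop{\mathrm{Con}_c}\limits_{1\to n+1}\mathrm{Ant}_{\mathrm{ext}}(\varphi\otimes\varphi')$. If $m\neq 0$ and $m'\neq 0$ then
\[
\Big|\mathop{\mathrm{Con}_c}\limits_{1\to n+1}\mathrm{Ant}_{\mathrm{ext}}(\varphi\otimes\varphi')\Big|_{1,\Sigma}=|\mathrm{Ant}_{\mathrm{ext}}\gamma|_{1,\Sigma}\le 9\Big(\sup_{\xi,\xi',t,t'}|c((\xi,t),(\xi',t'))|\Big)\,|\varphi|_{1,\Sigma}\,|\varphi'|_{1,\Sigma} .
\]
If $m=0$ or $m'=0$, by iterated application of Lemma XII.14
\[
\begin{aligned}
|\gamma|_{p,\Sigma}
&\le 3\max_{\substack{p_1+p_2=p+1\\ p_1,p_2\ \text{odd}}}\Big|\sum_{\substack{s,t\in\Sigma\\ s\cap t\neq\emptyset}}\int d\zeta\ \varphi(\eta_1,\ldots,\eta_m;(\zeta,s),(\xi_2,s_2),\ldots,(\xi_n,s_n))\ c((\zeta,t),(\eta,t'))\Big|_{p_1,\Sigma}\,|\varphi'|_{p_2,\Sigma}\\
&\le 9\max_{\substack{p_1+p_2=p+1\\ p_1,p_2\ \text{odd}}}|\varphi|_{p_1,\Sigma}\,|c|_{1,\Sigma}\,|\varphi'|_{p_2,\Sigma} .
\end{aligned}
\]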
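The constants $3$ and $9=3\cdot 3$ in Lemma XII.14 and Proposition XII.16(i), and the factor $2^p$ in Lemma XII.12, come from simple overlap counts: a sector meets at most three sectors (itself and its two neighbours), and a momentum lies in at most two sectors. The sketch below checks these counts for a hypothetical covering of the unit circle by equally spaced arcs whose length is 1.5 times their spacing. This geometry is an illustrative assumption, not the paper's construction of $\Sigma$.

```python
import math

def arcs(num, overlap=1.5):
    """Hypothetical sectorization: num arcs (centre, half-length) on the unit circle."""
    spacing = 2 * math.pi / num
    half = overlap * spacing / 2
    return [(i * spacing, half) for i in range(num)]

def intersects(a, b):
    """True if two arcs, given as (centre, half-length), overlap."""
    (ca, ha), (cb, hb) = a, b
    d = abs(ca - cb) % (2 * math.pi)
    d = min(d, 2 * math.pi - d)  # angular distance between the centres
    return d < ha + hb

def max_neighbours(sectors):
    """Largest number of sectors (itself included) that a sector intersects."""
    return max(sum(intersects(s, t) for t in sectors) for s in sectors)

def cover_count(theta, sectors):
    """Number of sectors that contain the point at angle theta."""
    return sum(intersects((theta, 0.0), s) for s in sectors)

sectors = arcs(12)
print(max_neighbours(sectors))
print(max(cover_count(k * 0.01, sectors) for k in range(629)))
```

In this toy geometry summing over pairs $s\cap s'\neq\emptyset$ costs a factor of at most $3$ per contracted leg, which is how two legs give the $9$ above.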
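(ii) Define the covariance $C(\xi,\xi')$ to be the Fourier transform of $C(k)$. By part (i) of Proposition XII,
\[
\begin{aligned}
&\varphi'(\eta_1,\ldots,\eta_m;(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))\\
&\quad=\sum_{s_1,\ldots,s_{n'}\in\Sigma}\iint d\xi_1\cdots d\xi_{n'}\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{n'},s_{n'}),\ldots,(\xi_n,s_n))\ \psi(\xi_1)\cdots\psi(\xi_{n'})\,d\mu_C(\psi) .
\end{aligned}
\]
Since $\varphi\in\mathcal F_m(n;\Sigma)$
\[
\begin{aligned}
&\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{n'},s_{n'}),(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))\\
&\quad=\int d\xi_1'\cdots d\xi_{n'}'\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1',s_1),\ldots,(\xi_{n'}',s_{n'}),(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))\prod_{i=1}^{n'}\hat{\tilde\chi}_{s_i}(\xi_i',\xi_i) .
\end{aligned}
\]
Consequently
\[
\begin{aligned}
&\varphi'(\eta_1,\ldots,\eta_m;(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))\\
&\quad=\sum_{s_1,\ldots,s_{n'}\in\Sigma}\iint d\xi_1'\cdots d\xi_{n'}'\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1',s_1),\ldots,(\xi_{n'}',s_{n'}),\ldots,(\xi_n,s_n))\ \psi_{s_1}(\xi_1')\cdots\psi_{s_{n'}}(\xi_{n'}')\,d\mu_C(\psi)
\end{aligned}
\]
where
\[
\psi_s(\xi')=\int d\xi\ \hat{\tilde\chi}_s(\xi',\xi)\,\psi(\xi) .
\]
Therefore, by Proposition IV.3(ii)
\[
\begin{aligned}
&|\varphi'(\eta_1,\ldots,\eta_m;(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n))|\\
&\quad\le\sum_{s_1,\ldots,s_{n'}\in\Sigma}\int d\xi_1\cdots d\xi_{n'}\ |\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))|\ \Big|\int\psi_{s_1}(\xi_1)\cdots\psi_{s_{n'}}(\xi_{n'})\,d\mu_C(\psi)\Big|\\
&\quad\le G^{n'/2}\sum_{s_1,\ldots,s_{n'}\in\Sigma}\int d\xi_1\cdots d\xi_{n'}\ |\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))|
\end{aligned}
\]
with
\[
G=\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\ \tilde\chi_s(k)^2\,|C(k)|\le\varepsilon\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\ \frac{\tilde\chi_s(k)^2}{|\imath k_0-e(k)|}\le\mathrm{const}\ \varepsilon\,\frac{l}{M^j} .
\]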
Remark XII.17. If $C$ fulfills the hypothesis of part (i) of the proposition, then
\[
\mathfrak c=9\max\Big\{|c|_{1,\Sigma},\ \sup_{\xi,\xi',s,s'}|c((\xi,s),(\xi',s'))|\Big\}
\]
is a contraction bound for the system $|\cdot|_{1,\Sigma}$ of seminorms. We shall show in Proposition XIII.5, that for $C^{(j)}(k)=\frac{\nu^{(j)}(k)}{\imath k_0-e(k)}$ and $\frac{1}{M^{j-3/2}}\le l\le\frac{1}{M^{(j-1)/2}}$, the constant coefficient $\mathfrak c_0$ of $\mathfrak c$ is bounded by $\mathrm{const}\,M^j$. On the other hand, part (ii) of Proposition XII.16 shows that $\mathfrak b=\sqrt{\mathrm{const}\,\frac{l}{M^j}}$ is an integral bound for $C_\Sigma^{(j)}$ with respect to this system of seminorms. Thus, if $\mathcal W(\psi)$ is an even Grassmann function with sectorized representative
\[
w(\psi)=\sum_{n=0}^{\infty}\ \sum_{s_1,\ldots,s_{2n}\in\Sigma}\int d\xi_1\cdots d\xi_{2n}\ w_{2n}((\xi_1,s_1),\ldots,(\xi_{2n},s_{2n}))\ \psi(\xi_1,s_1)\cdots\psi(\xi_{2n},s_{2n})
\]
the quantity $N(\mathcal W;\mathfrak c,\mathfrak b,\alpha)$ of [1, Definition II.23] (with $V$ replaced by $V_\Sigma$) has
\[
N(w;\mathfrak c,\mathfrak b,\alpha)_0\le\mathrm{const}\Big\{\alpha^2M^j\,|w_2|_{1,\Sigma}+\alpha^4\,l\,|w_4|_{1,\Sigma}+\frac{M^{2j}}{l}\sum_{n\ge 3}\Big(\mathrm{const}\,\frac{\alpha^2\,l}{M^j}\Big)^{n}|w_{2n}|_{1,\Sigma}\Big\} .
\]
In contrast to the situation of Remark VIII.8, this norm is of order one if $|w_4|_{1,\Sigma}$ is of order $\frac1l$, which is approximately the number of sectors. As $d=2$, this is a realistic estimate for the original interaction $V$, with all momenta restricted to the $j$th shell. Observe, however, that for $d\ge 3$ this estimate is not expected to hold. See [2, Sec. II, Subsec. 8]. For more precise control of $\mathcal W_4$ one also uses the norm $|w_4|_{3,\Sigma}$.

To prepare for the application of [3, Theorem VI.6] about overlapping loops, we state

Proposition XII.18. Let $\Sigma$ be a sectorization of length $l$ at scale $j\ge 2$ and let $c((\xi,s),(\xi',s'))\in\mathcal F_0(2;\Sigma)$ be an antisymmetric function. Let $D(k)$, $D'(k)$ be functions obeying $|D(k)|,|D'(k)|\le\frac{2}{|\imath k_0-e(k)|}$ and let $d((\cdot,s),(\cdot,s'))$ resp. $d'((\cdot,s),(\cdot,s'))$ be the Fourier transform of $\chi_s(k)D(k)\chi_{s'}(k)$ resp. $\chi_s(k)D'(k)\chi_{s'}(k)$ in the sense of Definition IX.3. Let $1\le i_1,i_2,i_3\le n$, $1\le i_1',i_2',i_3'\le n'$ with $i_1\neq i_2\neq i_3\neq i_1$, $i_1'\neq i_2'\neq i_3'\neq i_1'$, and let $p$ be an odd natural number. Then for $\varphi\in\mathcal F_0(n;\Sigma)$, $\varphi'\in\mathcal F_0(n',\Sigma)$
\[
\Big|\mathop{\mathrm{Con}_c}\limits_{i_1\to n+i_1'}\ \mathop{\mathrm{Con}_d}\limits_{i_2\to n+i_2'}\ \mathop{\mathrm{Con}_{d'}}\limits_{i_3\to n+i_3'}(\varphi\otimes\varphi')\Big|_{p,\Sigma}\le B_2\Big(\frac{l}{M^j}\Big)^{2}|c|_{1,\Sigma}\max_{\substack{p_1+p_2=p+3\\ p_1,p_2\ \text{odd}}}|\varphi|_{p_1,\Sigma}\,|\varphi'|_{p_2,\Sigma}
\]
with a constant $B_2$ that is independent of $j$ and $\Sigma$.
Proof. By the symmetry of the norms we may assume that $i_1=i_1'=1$, $i_2=i_2'=2$, $i_3=i_3'=3$. Set
\[
\begin{aligned}
&\gamma((\xi_4,s_4),\ldots,(\xi_n,s_n),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'}))\\
&\quad=\sum_{\substack{s_i,t_i,t_i',s_i'\in\Sigma\\ i=1,2,3}}\int\prod_{i=1}^{3}(d\zeta_i\,d\eta_i)\ \varphi((\zeta_1,s_1),(\zeta_2,s_2),(\zeta_3,s_3),(\xi_4,s_4),\ldots,(\xi_n,s_n))\\
&\qquad\cdot c((\zeta_1,t_1),(\eta_1,t_1'))\ d((\zeta_2,t_2),(\eta_2,t_2'))\ d'((\zeta_3,t_3),(\eta_3,t_3'))\\
&\qquad\cdot\varphi'((\eta_1,s_1'),(\eta_2,s_2'),(\eta_3,s_3'),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'})) .
\end{aligned}
\]
Then
\[
(-1)^{3(n+1)}\mathrm{Ant}_{\mathrm{ext}}\gamma=\mathop{\mathrm{Con}_c}\limits_{i_1\to n+i_1'}\ \mathop{\mathrm{Con}_d}\limits_{i_2\to n+i_2'}\ \mathop{\mathrm{Con}_{d'}}\limits_{i_3\to n+i_3'}(\varphi\otimes\varphi') .
\]
Observe that $d((\zeta,t),(\eta,t'))=0$ if $t\cap t'=\emptyset$ and that
\[
\sup_{\zeta,\eta;t,t'}|d((\zeta,t),(\eta,t'))|\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\ \chi_t(k)\,|D(k)|\,\chi_{t'}(k)\le\int\frac{d^{d+1}k}{(2\pi)^{d+1}}\ \frac{2\chi_t(k)\chi_{t'}(k)}{|\imath k_0-e(k)|}\le\mathrm{const}\,\frac{l}{M^j} . \tag{XII.1}
\]
The same properties hold for $d'$. Set
\[
\varphi''((\zeta_1,t_1),(\xi_2,s_2),\ldots,(\xi_{n'},s_{n'}))=\sum_{t_1',s_1'\in\Sigma}\int d\eta_1\ c((\zeta_1,t_1),(\eta_1,t_1'))\ \varphi'((\eta_1,s_1'),(\xi_2,s_2),\ldots,(\xi_{n'},s_{n'})) .
\]
By Lemma XII.14
\[
|\varphi''|_{p,\Sigma}\le 3\,|c|_{1,\Sigma}\,|\varphi'|_{p,\Sigma} .
\]
Furthermore $\check\varphi''((k_1,\sigma_1,a_1,t_1),(\check\xi_2,s_2),\ldots,(\check\xi_{n'},s_{n'}))=0$ unless $k_1\in\tilde t_1$. Set
\[
\begin{aligned}
&\tilde\gamma((\xi_4,s_4),\ldots,(\xi_n,s_n),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'});\ s_1,t_1,s_2,t_2,t_2',s_2',s_3,t_3,t_3',s_3')\\
&\quad=\int d\zeta_1\,d\zeta_2\,d\eta_2\,d\zeta_3\,d\eta_3\ \varphi((\zeta_1,s_1),(\zeta_2,s_2),(\zeta_3,s_3),(\xi_4,s_4),\ldots,(\xi_n,s_n))\\
&\qquad\cdot d((\zeta_2,t_2),(\eta_2,t_2'))\ d'((\zeta_3,t_3),(\eta_3,t_3'))\ \varphi''((\zeta_1,t_1),(\eta_2,s_2'),(\eta_3,s_3'),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'})) .
\end{aligned}
\]
Then
\[
\begin{aligned}
&\gamma((\xi_4,s_4),\ldots,(\xi_n,s_n),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'}))\\
&\quad=\sum_{s_1,t_1\in\Sigma}\ \sum_{\substack{s_i,t_i,t_i',s_i'\in\Sigma\\ i=2,3}}\tilde\gamma((\xi_4,s_4),\ldots,(\xi_n,s_n),(\xi_{n+4},s_{n+4}),\ldots,(\xi_{n+n'},s_{n+n'});\ s_1,t_1,s_2,t_2,t_2',s_2',s_3,t_3,t_3',s_3')
\end{aligned}
\]
and $\tilde\gamma(\,\cdot\,;s_1,t_1,s_2,t_2,t_2',s_2',s_3,t_3,t_3',s_3')=0$ unless $s_1\cap t_1\neq\emptyset$ and $s_i\cap t_i\cap t_i'\cap s_i'\neq\emptyset$ for $i=2,3$. By Corollary II.8, for all choices of sectors,
\[
\begin{aligned}
&\|\tilde\gamma((\cdot,s_4),\ldots,(\cdot,s_n),(\cdot,s_{n+4}),\ldots,(\cdot,s_{n+n'});\ s_1,t_1,s_2,t_2,t_2',s_2',s_3,t_3,t_3',s_3')\|_{1,\infty}\\
&\quad\le\sup|d|\ \sup|d'|\ \|\varphi((\cdot,s_1),(\cdot,s_2),(\cdot,s_3),(\cdot,s_4),\ldots,(\cdot,s_n))\|_{1,\infty}\\
&\qquad\cdot\|\varphi''((\cdot,t_1),(\cdot,s_2'),(\cdot,s_3'),(\cdot,s_{n+4}),\ldots,(\cdot,s_{n+n'}))\|_{1,\infty} .
\end{aligned}
\]
The variable indices for $\gamma$ lie in the set $I\cup I'$, where
\[
I=\{4,\ldots,n\},\qquad I'=\{n+4,\ldots,n+n'\} .
\]
Fix $u_1,\ldots,u_q\in I$, $u_{q+1},\ldots,u_p\in I'$ and fix sectors $s_{u_1},\ldots,s_{u_p}\in\Sigma$. First assume that $q$ is odd so that $p-q$ is even. Then, by (XII.1) and the estimate on $\tilde\gamma$ above
\[
\begin{aligned}
&\sum_{\substack{s_i\in\Sigma\\ i\in I\cup I'\setminus\{u_1,\ldots,u_p\}}}\|\gamma((\cdot,s_4),\ldots,(\cdot,s_n),(\cdot,s_{n+4}),\ldots,(\cdot,s_{n+n'}))\|_{1,\infty}\\
&\quad\le\mathrm{const}\,\frac{l^2}{M^{2j}}\sum_{\substack{s_i\in\Sigma\\ i\in\{1,\ldots,n\}\\ i\neq u_1,\ldots,u_q}}\|\varphi((\cdot,s_1),(\cdot,s_2),(\cdot,s_3),(\cdot,s_4),\ldots,(\cdot,s_n))\|_{1,\infty}\\
&\qquad\cdot\max_{t_1,s_2',s_3'\in\Sigma}\ \sum_{\substack{s_i\in\Sigma\\ i\in I'\setminus\{u_{q+1},\ldots,u_p\}}}\|\varphi''((\cdot,t_1),(\cdot,s_2'),(\cdot,s_3'),(\cdot,s_{n+4}),\ldots,(\cdot,s_{n+n'}))\|_{1,\infty}\\
&\quad\le\mathrm{const}\,\frac{l^2}{M^{2j}}\,|\varphi|_{q,\Sigma}\,|\varphi''|_{p-q+3,\Sigma}\\
&\quad\le\mathrm{const}\,\frac{l^2}{M^{2j}}\,|c|_{1,\Sigma}\,|\varphi|_{q,\Sigma}\,|\varphi'|_{p-q+3,\Sigma} .
\end{aligned}
\]
The case that $q$ is even is similar.

To treat source terms, we state, motivated by Definition VII.4,
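Lemma XII.19 (External improving). Let $C(k)$ be a function obeying $|C(k)|\le\frac{2}{|\imath k_0-e(k)|}$ and $c((\cdot,s),(\cdot,s'))$ be the Fourier transform of $\chi_s(k)C(k)\chi_{s'}(k)$ in the sense of Definition IX.3. Let $\varphi\in\mathcal F_m(n;\Sigma)$, $1\le i\le n$ and set
\[
\begin{aligned}
&\varphi'(\eta_1,\ldots,\eta_{m+1};(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}))\\
&\quad=\mathrm{Ant}_{\mathrm{ext}}\sum_{s,t,t'\in\Sigma}\int d\zeta\,d\zeta'\ \varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta',t),(\xi_i,s_i),\ldots,(\xi_{n-1},s_{n-1}))\ c((\zeta',t'),(\zeta,s))\ J(\zeta,\eta_{m+1}) .
\end{aligned}
\]
Then
\[
|\varphi'|_{1,\Sigma}\le\mathrm{const}\ |\varphi|_{1,\Sigma}\begin{cases}\dfrac{1}{l}\,|c|_{1,\Sigma} & \text{if } m=0\\[2mm] \dfrac{l}{M^j} & \text{if } m\neq 0\end{cases}
\]
with a constant that is independent of $j$ and $\Sigma$.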
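Proof. First consider the case $m=0$. Define the function $\varphi''$ on $(\mathcal B\times\Sigma)\times(\mathcal B\times\Sigma)^{n-1}$ by
\[
\begin{aligned}
&\varphi''((\eta,s);(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}))\\
&\quad=\sum_{t,t'\in\Sigma}\int d\zeta\,d\zeta'\ \varphi((\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta',t),(\xi_i,s_i),\ldots,(\xi_{n-1},s_{n-1}))\ c((\zeta',t'),(\zeta,s))\ J(\zeta,\eta) .
\end{aligned}
\]
Then
\[
\varphi'(\eta;(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}))=\sum_{s\in\Sigma}\varphi''((\eta,s);(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1})) .
\]
Hence, by Lemma XII.14,
\[
|\varphi'|_{1,\Sigma}\le|\Sigma|\ |\varphi''|_{1,\Sigma}\le\frac{\mathrm{const}}{l}\,|\varphi|_{1,\Sigma}\,|cJ|_{1,\Sigma}\le\frac{\mathrm{const}}{l}\,|c|_{1,\Sigma}\,|\varphi|_{1,\Sigma} .
\]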
Now suppose that $m\neq 0$. Then
\[
\begin{aligned}
&|\varphi'(\eta_1,\ldots,\eta_{m+1};(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}))|_{1,\Sigma}\\
&\quad\le\sum_{\substack{s_1,\ldots,s_{n-1}\\ s,t,t'}}\ \sup_{\eta_1,\ldots,\eta_{m+1}}\int d\zeta\,d\zeta'\,d\xi_1\cdots d\xi_{n-1}\ \big|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\zeta',t),\ldots,(\xi_{n-1},s_{n-1}))\ c((\zeta',t'),(\zeta,s))\ J(\zeta,\eta_{m+1})\big|\\
&\quad\le\sum_{s_1,\ldots,s_{n-1},t}\ \sum_{\substack{s,t'\\ s\cap t\cap t'\neq\emptyset}}\ \sup_{\eta_{m+1},\zeta'}|c((\zeta',t'),(\eta_{m+1},s))|\ \cdot\ \|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\zeta',t),\ldots,(\xi_{n-1},s_{n-1}))\|_{1,\infty}\\
&\quad\le 9\,\sup|c|\sum_{s_1,\ldots,s_{n-1},t}\|\varphi(\eta_1,\ldots,\eta_m;(\xi_1,s_1),\ldots,(\zeta',t),\ldots,(\xi_{n-1},s_{n-1}))\|_{1,\infty}\\
&\quad\le\mathrm{const}\,\frac{l}{M^j}\,|\varphi|_{1,\Sigma}
\end{aligned}
\]
by (XII.1).

XIII. Bounds for Sectorized Propagators

In this section we prove the existence of the partitions of unity, $\{\chi_s(k)\}$, and enveloping functions satisfying Lemma XII.3. We derive bounds on $|c|_{1,\Sigma}$, for various sectorized covariances $c$ whose Fourier transforms are related to $\chi_s(k)\frac{\nu^{(j)}(k)}{\imath k_0-e(k)}\chi_{s'}(k)$, that, together with Proposition XII.16, give good contraction bounds.

The reason it is not easy to get good $L^1$–$L^\infty$-bounds on the propagators in position space is that integration by parts in Cartesian coordinates is not well suited to the curvature of the Fermi surface and the shells around it. This is why we introduce sectorization. If the sectors are not too long (more precisely, at most of order $\frac{1}{M^{j/2}}$), the curvature of the sector has little effect. The first step in deriving $L^1$–$L^\infty$-bounds using sectorization is

Proposition XIII.1. Let $j\ge 2$. Let $I$ be an interval of the Fermi curve with length $l$ and let $f(k)$ be a function that is supported on $\{k\in\mathbb R^3\ |\ |\imath k_0-e(k)|\le\frac{2}{M^j},\ \pi_F(k)\in I\}$. Set, as in Lemma IX.6
\[
f'(x)=\int e^{\imath\langle k,x\rangle_-}f(k)\,\frac{d^3k}{(2\pi)^3} .
\]
Fix any point $k'_c\in I$, let $\hat t$ and $\hat n$ be unit tangent and normal vectors to the Fermi curve at $k'_c$ and let $x_\parallel$ be the component of $\mathbf x$ parallel to $\hat t$ and $x_\perp$ the component parallel to $\hat n$. There is a constant, const, depending on $e(k)$, but independent of $M$, $f$, $j$ and $x$ such that

(i) For all multiindices $\gamma\in\mathbb N_0\times\mathbb N_0^2$
\[
\Big|\Big(\frac{x_0}{M^j}\Big)^{\gamma_0}\Big(\frac{x_\perp}{M^j}\Big)^{\gamma_1}(l\,x_\parallel)^{\gamma_2}f'(x)\Big|\le\mathrm{const}\,\frac{l}{M^{2j}}\,\sup_k\frac{l^{\gamma_2}}{M^{j(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}f(k)\big| .
\]

[Figure: the sector around $k'_c$ with unit tangent $\hat t$ and unit normal $\hat n$; its extent is $O(l)$ along $\hat t$ and $O(1/M^j)$ along $\hat n$.]

(ii)
\[
\sup_x|f'(x)|\le\mathrm{const}\,\frac{l}{M^{2j}}\,\sup_k|f(k)| .
\]

(iii) If $l\ge\frac{1}{M^j}$, then, for all multiindices $\delta$
\[
\int dx\ |x^\delta f'(x)|\le\mathrm{const}\ 2^{\delta_1+\delta_2}M^{|\delta|j}\max_{\substack{\gamma_0\le\delta_0+2\\ \gamma_1+\gamma_2\le\delta_1+\delta_2+3}}\ \sup_k\frac{l^{\gamma_2}}{M^{j(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}f(k)\big| .
\]

Proof. (i) By integration by parts
\[
\Big|\Big(\frac{x_0}{M^j}\Big)^{\gamma_0}\Big(\frac{x_\perp}{M^j}\Big)^{\gamma_1}(l\,x_\parallel)^{\gamma_2}f'(x)\Big|=\Big|\int\frac{d^3k}{(2\pi)^3}\ e^{\imath\langle k,x\rangle_-}\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\gamma_1}(l\,\hat t\cdot\nabla_k)^{\gamma_2}f(k)\Big| .
\]
Use $S$ to denote the support of $f(k)$. Observe that $S$ has volume at most $\mathrm{const}\,M^{-2j}l$, since $k_0$ is supported in an interval of length $\mathrm{const}\,M^{-j}$, the distance of $\mathbf k$ from $F$ is bounded by $\mathrm{const}\,M^{-j}$ and $\pi_F(k)$ runs over an interval of length $l$. Hence
\[
\begin{aligned}
\Big|\Big(\frac{x_0}{M^j}\Big)^{\gamma_0}\Big(\frac{x_\perp}{M^j}\Big)^{\gamma_1}(l\,x_\parallel)^{\gamma_2}f'(x)\Big|
&\le\mathrm{vol}(S)\,\sup_k\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\gamma_1}(l\,\hat t\cdot\nabla_k)^{\gamma_2}f(k)\Big|\\
&\le\mathrm{const}\,\frac{l}{M^{2j}}\,\sup_k\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\gamma_1}(l\,\hat t\cdot\nabla_k)^{\gamma_2}f(k)\Big| .
\end{aligned}
\]

(ii) This is simply a restatement of part (i) with $\gamma=0$.

(iii) Set
\[
\rho(x)=\big[1+M^{-j}|x_0|\big]^2\big[1+M^{-j}|x_\perp|+l\,|x_\parallel|\big]^3 .
\]
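By part (i)
\[
\rho(x)\,|x^\delta f'(x)|\le\mathrm{const}\ 2^{\delta_1+\delta_2}M^{j|\delta|}\,\frac{l}{M^{2j}}\max_{\substack{\gamma_0\le\delta_0+2\\ \gamma_1+\gamma_2\le\delta_1+\delta_2+3}}\ \sup_k\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\gamma_1}(l\,\hat t\cdot\nabla_k)^{\gamma_2}f(k)\Big|
\]
since
\[
|x^\delta|\le M^{|\delta|j}\big[M^{-j}|x_0|\big]^{\delta_0}\big[M^{-j}|x_\perp|+l\,|x_\parallel|\big]^{\delta_1+\delta_2} .
\]
The desired bound now follows from
\[
\int dx\ \frac{1}{\rho(x)}\le\mathrm{const}\ \frac{M^{2j}}{l} .
\]
To see this, just make the change of variables $x_0=M^jz_0$, $x_\perp=M^jz_1$, $x_\parallel=z_2/l$.

We parametrize the Fermi curve $F$ by arc length, using a real variable $k'$ for the parametrization. To simplify notation, set $k'(k)=\pi_F(k)$, the projection on the Fermi surface.

Lemma XIII.2. Let $j>0$. Let $I$ be an interval of the Fermi curve with length $l\in[\frac{1}{M^j},\frac{1}{M^{j/2}}]$. Let $\chi(k)=R(k_0,e(k))\,\Theta(k'(k))$ with $R(x,y)$ vanishing unless $|y|\le\sqrt2\,l$ and $\Theta$ supported in $I$. Fix any point $k'_c\in I$ and let $\hat t$ and $\hat n$ be unit tangent and normal vectors to the Fermi curve at $k'_c$. There is a constant, const, depending on $r_0$, $r$ and $e(k)$, but independent of $M$, $\chi$ and $j$ such that, for all $\gamma\in\mathbb N_0\times\mathbb N_0^2$ with $\gamma_0\le r_0+2$, $\gamma_1+\gamma_2\le r+3$,
\[
\sup_k\frac{l^{\gamma_2}}{M^{j(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}\chi(k)\big|\le\mathrm{const}\max_{m+n\le\gamma_1+\gamma_2}\ \sup_{k'}l^m\,|\partial_{k'}^m\Theta(k')|\ \sup_{x,y}\frac{1}{M^{j(\gamma_0+n)}}\Big|\frac{\partial^{\gamma_0+n}R}{\partial x^{\gamma_0}\partial y^n}(x,y)\Big| .
\]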
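Proof. Since $l\ge\frac{1}{M^j}$ and all derivatives of $k'(k)$ to order $\gamma_1+\gamma_2$ are bounded,
\[
\sup_k\Big|\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\gamma_1-\beta_1}(l\,\hat t\cdot\nabla_k)^{\gamma_2-\beta_2}\Theta(k'(k))\Big|\le\mathrm{const}\max_{m\le\gamma_1+\gamma_2-\beta_1-\beta_2}\ \sup_{k'}l^m\,|\partial_{k'}^m\Theta(k')|
\]
for all $\beta_1\le\gamma_1$ and $\beta_2\le\gamma_2$. So, by the product rule, it suffices to prove
\[
\sup_{k\in S}\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}R(k_0,e(k))\Big|\le\mathrm{const}\max_{n\le\beta_1+\beta_2}\ \sup_{x,y}\frac{1}{M^{j(\gamma_0+n)}}\Big|\frac{\partial^{\gamma_0+n}R}{\partial x^{\gamma_0}\partial y^n}(x,y)\Big|
\]
where $S$ is the support of $\chi(k)$. Set $\pi=\{1,\ldots,\beta_1+\beta_2\}$,
\[
d_i=\begin{cases}\frac{1}{M^j}\,\hat n\cdot\nabla_k & \text{if } 1\le i\le\beta_1\\ l\,\hat t\cdot\nabla_k & \text{if } \beta_1+1\le i\le\beta_1+\beta_2\end{cases}
\]
and, for each $\pi'\subset\pi$, $d_{\pi'}=\prod_{i\in\pi'}d_i$. By the product and chain rules
\[
\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\gamma_0}d_\pi R(k_0,e(k))=\sum_{n=1}^{\beta_1+\beta_2}\ \sum_{(\pi_1,\ldots,\pi_n)\in P_n}\frac{1}{M^{j(\gamma_0+n)}}\frac{\partial^{\gamma_0+n}R}{\partial x^{\gamma_0}\partial y^n}(k_0,e(k))\prod_{i=1}^{n}M^jd_{\pi_i}e(k)
\]
where $P_n$ is the set of all partitions of $\pi$ into $n$ nonempty subsets $\pi_1,\ldots,\pi_n$ with, for all $i < i'$, the smallest element of $\pi_i$ smaller than the smallest element of $\pi_{i'}$. So to prove the lemma, it suffices to prove that
\[
\max_{1\le\beta_1+\beta_2\le\gamma_1+\gamma_2}\ \sup_{k\in S}M^j\Big|\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}e(k)\Big|\le\mathrm{const} . \tag{XIII.1}
\]
If $\beta_1\ge 1$ or $\beta_2\ge 2$, this follows from $\frac{M^j\,l^{\beta_2}}{M^{\beta_1j}}\le 1$. (Recall that $l\le\frac{1}{M^{j/2}}$.) The only remaining possibility is $\beta_1=0$, $\beta_2=1$.

If $\hat t\cdot\nabla_ke(k)$ is evaluated at $k=k'_c$, it vanishes, since $\nabla_ke(k'_c)$ is parallel to $\hat n$. The second derivative of $e$ is bounded so that
\[
M^jl\,\sup_{k\in S}|\hat t\cdot\nabla_ke(k)|\le\mathrm{const}\ M^jl\,\sup_{k\in S}|k-k'_c|\le\mathrm{const}\ M^jl^2\le\mathrm{const}
\]
since $l\le\frac{1}{M^{j/2}}$.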
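The last estimate, the case $\beta_1=0$, $\beta_2=1$ of (XIII.1), can be checked numerically in a toy model. The sketch below assumes the dispersion $e(k)=|k|-1$ (so the Fermi curve is the unit circle), takes $k'_c=(1,0)$, and samples only points on the Fermi curve rather than the full support $S$. None of these choices come from the paper, and the function names are hypothetical.

```python
import math

def tangential_derivative(theta):
    """t_hat . grad e at the Fermi-curve point at angle theta.

    Toy dispersion e(k) = |k| - 1, so grad e = k/|k|; t_hat is the unit
    tangent at k'_c = (1, 0), i.e. t_hat = (0, 1).
    """
    grad = (math.cos(theta), math.sin(theta))
    t_hat = (0.0, 1.0)
    return grad[0] * t_hat[0] + grad[1] * t_hat[1]

def scaled_sup(M, j, l, samples=200):
    """M^j * l * sup over the arc |theta| <= l of |t_hat . grad e|."""
    sup = max(abs(tangential_derivative(-l + 2 * l * i / samples))
              for i in range(samples + 1))
    return M ** j * l * sup

for j in (2, 6, 12):
    short = scaled_sup(2.0, j, 2.0 ** (-j / 2))  # l = M^(-j/2): stays bounded
    long_ = scaled_sup(2.0, j, 2.0 ** (-j / 4))  # l = M^(-j/4): grows with j
    print(j, round(short, 4), round(long_, 4))
```

With $l=M^{-j/2}$ the scaled quantity stays below one for every scale, while a longer sector such as $l=M^{-j/4}$ makes it grow like $M^{j/2}$, which is why the sector length is capped at order $M^{-j/2}$.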
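For the rest of this section, we fix a sectorization $\Sigma$ of scale $j\ge 2$ and length $\frac{1}{M^{j-3/2}}\le l\le\frac{1}{M^{(j-1)/2}}$. We choose a smooth partition of unity $\Theta_s(k')$, $s\in\Sigma$ of the Fermi curve $F$ subordinate to the sets $s\cap F$, such that $|\partial_{k'}^m\Theta_s(k')|\le\mathrm{const}\,\frac{1}{l^m}$ for $m=0,1,\ldots,r+3$. Furthermore choose enveloping functions $\tilde\Theta_s(k')$, $s\in\Sigma$ that are identically one on $s\cap F$ and obey $|\partial_{k'}^m\tilde\Theta_s(k')|\le\mathrm{const}\,\frac{1}{l^m}$ for $m=0,1,\ldots,r+3$ and $\tilde\Theta_s\tilde\Theta_{s'}=0$ if $s\cap s'=\emptyset$. Set
\[
\begin{aligned}
\chi_s(k)&=\tilde\nu^{(\ge j)}(k)\,\Theta_s(k'(k))=\varphi\big(M^{2j-2}(k_0^2+e(k)^2)\big)\,\Theta_s(k'(k))\\
\tilde\chi_s(k)&=\bar\nu^{(\ge j)}(k)\,\tilde\Theta_s(k'(k))=\varphi\big(M^{2j-3}(k_0^2+e(k)^2)\big)\,\tilde\Theta_s(k'(k))
\end{aligned}
\tag{XIII.2}
\]
where $\tilde\nu^{(\ge j)}$, $\bar\nu^{(\ge j)}$ are the functions of Definition VIII.4.

Lemma XIII.3. Let $s\in\Sigma$. Set for $x=(x_0,\mathbf x)\in\mathbb R\times\mathbb R^2$
\[
\chi'_s(x)=\int e^{\imath\langle k,x\rangle_-}\chi_s(k)\,\frac{d^3k}{(2\pi)^3},\qquad\tilde\chi'_s(x)=\int e^{\imath\langle k,x\rangle_-}\tilde\chi_s(k)\,\frac{d^3k}{(2\pi)^3},
\]
\[
\chi'_s(\mathbf x)=\int e^{\imath\mathbf k\cdot\mathbf x}\chi_s((0,\mathbf k))\,\frac{d^2\mathbf k}{(2\pi)^2} .
\]
Then
\[
\|\chi'_s\|_{L^1}\le\mathrm{const}\ c_{j-1},\qquad\Big\|\frac{\partial}{\partial x_0}\chi'_s\Big\|_{L^1}\le\mathrm{const}\ \frac{1}{M^{j-1}}\,c_{j-1},\qquad\Big\|x_0\frac{\partial}{\partial x_0}\chi'_s\Big\|_{L^1}\le\mathrm{const}\ c_{j-1},
\]
\[
\|\tilde\chi'_s\|_{L^1}\le\mathrm{const}\ c_{j-\frac32},\qquad\Big\|\frac{\partial}{\partial x_0}\tilde\chi'_s\Big\|_{L^1}\le\mathrm{const}\ \frac{1}{M^{j-3/2}}\,c_{j-\frac32},\qquad\Big\|x_0\frac{\partial}{\partial x_0}\tilde\chi'_s\Big\|_{L^1}\le\mathrm{const}\ c_{j-\frac32}
\]
and
\[
\|\chi'_s(\mathbf x)\|_{L^1}\le\mathrm{const}\ c_{j-1} .
\]
Here, for a function $f(x)$ on $\mathbb R\times\mathbb R^2$,
\[
\|f\|_{L^1}=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^d}\frac{1}{\delta!}\int|x^\delta f(x)|\,dx\ t^\delta\in\mathfrak N_{d+1}
\]
is the norm defined before Lemma IX.6, and for a function $g(\mathbf x)$ on $\mathbb R^2$ we set
\[
\|g\|_{L^1}=\sum_{\substack{\delta\in\mathbb N_0\times\mathbb N_0^d\\ \delta_0=0}}\frac{1}{\delta!}\int|\mathbf x^\delta g(\mathbf x)|\,d\mathbf x\ t^\delta .
\]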
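Proof. Fix a point $k'_c\in s\cap F$ and let $\hat t$ and $\hat n$ be unit tangent and normal vectors to $F$ at $k'_c$. By Lemma XIII.2, with $j$ replaced by $j-1$ and $R(x,y)=\varphi(M^{2j-2}(x^2+y^2))$,
\[
\max_{\substack{\gamma_0\le\delta_0+2\\ \gamma_1+\gamma_2\le\delta_1+\delta_2+3}}\ \sup_k\frac{l^{\gamma_2}}{M^{(j-1)(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}\chi_s(k)\big|\le\mathrm{const} \tag{XIII.3}
\]
for every multiindex $\delta\in\mathbb N_0\times\mathbb N_0^2$ with $\delta_0\le r_0$ and $\delta_1+\delta_2\le r$. Here, we have used that $M^{j-1}|x|,\ M^{j-1}|y|\le\sqrt2$ on the support of $R(x,y)$. Therefore, by Proposition XIII.1(iii),
\[
\frac{1}{M^{(j-1)|\delta|}}\int dx\ |x^\delta\chi'_s(x)|\le\mathrm{const} .
\]
By definition of $\|\cdot\|_{L^1}$, this implies that $\|\chi'_s\|_{L^1}\le\mathrm{const}\ c_{j-1}$.

By Lemma XIII.2 and the product rule
\[
\max_{\substack{\gamma_0\le\delta_0+2\\ \gamma_1+\gamma_2\le\delta_1+\delta_2+3}}\ \sup_k\frac{l^{\gamma_2}}{M^{(j-1)(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}\big(k_0\chi_s(k)\big)\big|\le\mathrm{const}\ \frac{1}{M^{j-1}}
\]
and as above it follows that $\|\frac{\partial}{\partial x_0}\chi'_s\|_{L^1}\le\mathrm{const}\ \frac{1}{M^{j-1}}c_{j-1}$.

Again, by Lemma XIII.2
\[
\max_{\substack{\gamma_0\le\delta_0+2\\ \gamma_1+\gamma_2\le\delta_1+\delta_2+3}}\ \sup_k\frac{l^{\gamma_2}}{M^{(j-1)(\gamma_0+\gamma_1)}}\,\Big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}\frac{\partial}{\partial k_0}\big(k_0\chi_s(k)\big)\Big|\le\mathrm{const}
\]
and the proof that $\|x_0\frac{\partial}{\partial x_0}\chi'_s\|_{L^1}\le\mathrm{const}\ c_{j-1}$ is as before.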
The bounds on $\tilde\chi'_s$ are obtained in the same way, with $j-1$ replaced by $j-\frac32$. The estimate on $\|\chi'_s(\mathbf x)\|_{L^1}$ follows from the fact that
\[
\chi'_s(\mathbf x)=\int dx_0\ \chi'_s(x_0,\mathbf x) .
\]

Proof of Lemma XII.3. Parts (i) and (ii) of the lemma are trivial. To prove part (iii) observe that
\[
\hat\chi_s(\xi,\xi')=\delta_{\sigma,\sigma'}\,\delta_{a,a'}\,\chi'_s((-1)^a(x-x'))
\]
for $\xi=(x,\sigma,a)$, $\xi'=(x',\sigma',a')\in\mathcal B$. Therefore, by Lemma XIII.3
\[
\|\hat\chi_s\|_{1,\infty}\le\mathrm{const}\ \|\chi'_s\|_{L^1}\le\mathrm{const}\ c_{j-1} .
\]
The estimate for $\|\hat{\tilde\chi}_s\|_{1,\infty}$ is obtained in the same way.

From now on, we fix for each sectorization $\Sigma$, a partition of unity $\chi_s$, $s\in\Sigma$ and a system of functions $\tilde\chi_s$, $s\in\Sigma$ that fulfill the conclusions of Lemmas XII.3 and XIII.3. Recall from Definition XII.2 that
\[
c_j=\sum_{\substack{|\delta|\le r\\ |\delta_0|\le r_0}}M^{j|\delta|}\,t^\delta+\sum_{\substack{|\delta|>r\\ \text{or } |\delta_0|>r_0}}\infty\,t^\delta\in\mathfrak N_{d+1} .
\]
Observe that by Corollary A.5, there is a constant const that is independent of $M$ such that for $2\le i\le j$
\[
c_ic_j\le\mathrm{const}\ c_j . \tag{XIII.4}
\]
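The inequality (XIII.4) is meant coefficientwise, and it can be checked directly on the finite part of the series. The sketch below stores only the finite coefficients of $c_j$ (those with $|\delta|\le r$, $\delta_0\le r_0$); the remaining coefficients of $c_j$ are $+\infty$, so the inequality is automatic there. The function names and the sample values of $M$, $r$, $r_0$ are illustrative assumptions, and the sketch is not a substitute for Corollary A.5.

```python
from itertools import product

def c(M, j, r, r0, dim=3):
    """Finite coefficients of c_j: delta -> M^(j|delta|) for |delta| <= r, delta_0 <= r0."""
    return {d: float(M) ** (j * sum(d))
            for d in product(range(r + 1), repeat=dim)
            if sum(d) <= r and d[0] <= r0}

def multiply(a, b):
    """Cauchy product of two truncated series, restricted to the finite index set."""
    out = {}
    for da, va in a.items():
        for db, vb in b.items():
            d = tuple(x + y for x, y in zip(da, db))
            if d in a:  # beyond the truncation the coefficient of c_j is infinite
                out[d] = out.get(d, 0.0) + va * vb
    return out

def best_constant(M, i, j, r, r0):
    """Smallest const with (c_i c_j)[delta] <= const * c_j[delta] on the finite part."""
    ci, cj = c(M, i, r, r0), c(M, j, r, r0)
    prod = multiply(ci, cj)
    return max(prod[d] / cj[d] for d in cj)

for M in (2, 3, 10):
    print(M, best_constant(M, 2, 5, 2, 2), best_constant(M, 5, 5, 2, 2))
```

The ratio at $\delta$ equals $\sum_{\alpha\le\delta}M^{-(j-i)|\alpha|}$, which is at most the number of multiindices $\alpha\le\delta$. That count depends only on $r$, which is consistent with the constant in (XIII.4) being independent of $M$.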
Lemma XIII.4. Set
\[
C^{(j)}(k)=\frac{\nu^{(j)}(k)}{\imath k_0-e(k)},\qquad\tilde C^{(j)}(k)=\frac{\tilde\nu^{(j)}(k)}{\imath k_0-e(k)}
\]
and, for $s,s'\in\Sigma$, let $c^{(j)}((\xi,s),(\xi',s'))$ resp. $\tilde c^{(j)}((\xi,s),(\xi',s'))$ be the Fourier transforms (as in Definition IX.3) of $\chi_s(k)C^{(j)}(k)\chi_{s'}(k)$ resp. $\chi_s(k)\tilde C^{(j)}(k)\chi_{s'}(k)$. Then

(i) $c^{(j)}((\cdot,s),(\cdot,s'))=\tilde c^{(j)}((\cdot,s),(\cdot,s'))=0$ if $s\cap s'=\emptyset$.

(ii) $|c^{(j)}|_{1,\Sigma},\ |\tilde c^{(j)}|_{1,\Sigma}\le\mathrm{const}\ M^jc_j$.

(iii) Let $c_0^{(j)}((\xi,s),(\xi',s'))$ be the Fourier transform (as in Definition IX.3) of $\chi_s(k)\,k_0\,C^{(j)}(k)\chi_{s'}(k)$ or $\chi_s(k)\,e(\mathbf k)\,C^{(j)}(k)\chi_{s'}(k)$ and let $\tilde c_0^{(j)}((\xi,s),(\xi',s'))$ be the Fourier transform of either $\chi_s(k)\,k_0\,\tilde C^{(j)}(k)\chi_{s'}(k)$ or $\chi_s(k)\,e(\mathbf k)\,\tilde C^{(j)}(k)\chi_{s'}(k)$. Then
\[
|c_0^{(j)}|_{1,\Sigma},\ |\tilde c_0^{(j)}|_{1,\Sigma}\le\mathrm{const}\ c_j .
\]

(iv)
\[
\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{s,s'\in\Sigma}\ \sup_{\xi,\xi'\in\mathcal B}\big|D_{1,2}^\delta c^{(j)}((\xi,s),(\xi',s'))\big|\,t^\delta\le\mathrm{const}\ \frac{l}{M^j}\,c_j .
\]
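Proof. Part (i) is obvious. To prove part (ii) fix sectors $s,s'\in\Sigma$, with $s\cap s'\neq\emptyset$ and a point $k'_c\in s\cap F$. Let $\hat t$ and $\hat n$ be unit tangent and normal vectors to $F$ at $k'_c$. First we claim that for all $k$ in the intersection of $s$ with the $j$th shell
\[
\max_{\substack{\beta_0\le r_0+2\\ \beta_1+\beta_2\le r+3}}\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\beta_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}\frac{1}{\imath k_0-e(k)}\Big|\le\mathrm{const}\ M^j . \tag{XIII.5}
\]
To see this, set $\pi=\{1,\ldots,|\beta|\}$,
\[
d_i=\begin{cases}\frac{1}{M^j}\,\partial_{k_0} & \text{if } 1\le i\le\beta_0\\ \frac{1}{M^j}\,\hat n\cdot\nabla_k & \text{if } \beta_0+1\le i\le\beta_0+\beta_1\\ l\,\hat t\cdot\nabla_k & \text{if } \beta_0+\beta_1+1\le i\le|\beta|\end{cases}
\]
and, for each $\pi'\subset\pi$, $d_{\pi'}=\prod_{i\in\pi'}d_i$. By the product and chain rules
\[
d_\pi\frac{1}{\imath k_0-e(k)}=M^j\sum_{n=1}^{|\beta|}\ \sum_{(\pi_1,\ldots,\pi_n)\in P_n}(-1)^n\,n!\,\Big(\frac{1/M^j}{\imath k_0-e(k)}\Big)^{n+1}\prod_{i=1}^{n}M^jd_{\pi_i}(\imath k_0-e(k)) .
\]
In the sector $s$, $|\imath k_0-e(k)|\ge\mathrm{const}\,\frac{1}{M^j}$ so that $\big(\frac{1/M^j}{\imath k_0-e(k)}\big)^{n+1}$ is bounded uniformly in $j$. That $M^jd_{\pi_i}(\imath k_0-e(k))$ is bounded uniformly in $j$ follows immediately from (XIII.1) and the fact that $|k_0|\le\frac{\mathrm{const}}{M^j}$ on the $j$th shell. This proves (XIII.5).

As in (XIII.3), for all $k$ in the intersection of $s$ with the $j$th shell,
\[
\max_{\substack{\beta_0\le r_0+2\\ \beta_1+\beta_2\le r+3}}\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\beta_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}\nu^{(j)}(k)\Big|\le\mathrm{const} .
\]
By Leibniz's rule it follows from this inequality and the inequalities (XIII.3) and (XIII.5) that
\[
\max_{\substack{\gamma_0\le r_0+2\\ \gamma_1+\gamma_2\le r+3}}\ \sup_k\frac{l^{\gamma_2}}{M^{j(\gamma_0+\gamma_1)}}\,\big|\partial_{k_0}^{\gamma_0}(\hat n\cdot\nabla_k)^{\gamma_1}(\hat t\cdot\nabla_k)^{\gamma_2}\chi_s(k)C^{(j)}(k)\chi_{s'}(k)\big|\le\mathrm{const}\ M^j .
\]
Hence, by Proposition XIII.1
\[
|c^{(j)}|_{1,\Sigma}\le\mathrm{const}\ M^jc_j .
\]
The proof for $|\tilde c^{(j)}|_{1,\Sigma}$ is analogous.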
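The proof of part (iii) is the same as the proof of part (ii) with (XIII.5) replaced by
\[
\max_{\substack{\beta_0\le r_0+2\\ \beta_1+\beta_2\le r+3}}\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\beta_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}\frac{k_0}{\imath k_0-e(k)}\Big|\le\mathrm{const}
\]
\[
\max_{\substack{\beta_0\le r_0+2\\ \beta_1+\beta_2\le r+3}}\Big|\Big(\frac{1}{M^j}\partial_{k_0}\Big)^{\beta_0}\Big(\frac{1}{M^j}\hat n\cdot\nabla_k\Big)^{\beta_1}(l\,\hat t\cdot\nabla_k)^{\beta_2}\frac{e(k)}{\imath k_0-e(k)}\Big|\le\mathrm{const} .
\]
To prove part (iv), observe that, by (XIII.5) and the fact that $\chi_s(k)C^{(j)}(k)\chi_{s'}(k)$ is supported in a region of volume $\mathrm{const}\,\frac{l}{M^{2j}}$,
\[
\max_{s,s'\in\Sigma}\ \sup_{\xi,\xi'\in\mathcal B}\big|D_{1,2}^\delta c^{(j)}((\xi,s),(\xi',s'))\big|\le\mathrm{const}\ \frac{l}{M^{2j}}\,M^{j(1+|\delta|)}
\]
for all $\delta_0\le r_0$ and $|\delta|\le r$.

Proposition XIII.5. There are constants $\tau_1$, const that depend on $e(k)$ and $M$, but not on $j$ or $\Sigma$ with the following property:

Let $u((\xi,s),(\xi',s'))\in\mathcal F_0(2;\Sigma)$ be antisymmetric, spin independent and particle number conserving and obey $|u|_{1,\Sigma}\le\frac{\tau_1}{M^j}+\sum_{\delta\neq 0}\infty\,t^\delta$. Let
\[
C(k)=\frac{\nu^{(j)}(k)}{\imath k_0-e(k)-\check u(k)},
\]
$c((\xi,s),(\xi',s'))$ be the Fourier transform (as in Definition IX.3) of $\chi_s(k)C(k)\chi_{s'}(k)$ and $c_0((\xi,s),(\xi',s'))$ be the Fourier transform of $\chi_s(k)\,k_0\,C(k)\chi_{s'}(k)$ or $\chi_s(k)\,e(\mathbf k)\,C(k)\chi_{s'}(k)$. Then

(i) $c((\cdot,s),(\cdot,s'))=0$ if $s\cap s'=\emptyset$.

(ii) $|c|_{1,\Sigma}\le\mathrm{const}\ \dfrac{M^jc_j}{1-M^j|u|_{1,\Sigma}}$.

(iii) $\displaystyle\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\sup_{\xi,\xi',s,s'}\big|D_{1,2}^\delta c((\xi,s),(\xi',s'))\big|\,t^\delta\le\mathrm{const}\ \frac{l}{M^j}\,\frac{c_j}{1-M^j|u|_{1,\Sigma}}$.

(iv) $|c_0|_{1,\Sigma}\le\mathrm{const}\ \dfrac{c_j}{1-M^j|u|_{1,\Sigma}}$.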
Proof. Again part (i) is trivial. To prove part (ii), observe that
\[
C(k)=\frac{C^{(j)}(k)}{1-\frac{\check u(k)}{\imath k_0-e(k)}}=\frac{C^{(j)}(k)}{1-\frac{\check u(k)\tilde\nu^{(j)}(k)}{\imath k_0-e(k)}}=\frac{C^{(j)}(k)}{1-\check u(k)\tilde C^{(j)}(k)}=C^{(j)}(k)\sum_{n=0}^{\infty}\big(\check u(k)\tilde C^{(j)}(k)\big)^n .
\]
Introducing the local notation
\[
C^{(j)}(k;s,s')=\chi_s(k)\,C^{(j)}(k)\,\chi_{s'}(k),\qquad\tilde C^{(j)}(k;s,s')=\chi_s(k)\,\tilde C^{(j)}(k)\,\chi_{s'}(k)
\]
we have
\[
\chi_s(k)C(k)\chi_{s'}(k)=C^{(j)}(k;s,s')+\sum_{n=1}^{\infty}\ \sum_{s''\in\Sigma}\ \sum_{\substack{t_i,t_i'\in\Sigma\\ \text{for } i=1,\ldots,n\\ \text{with } t_n'=s'}}C^{(j)}(k;s,s'')\prod_{i=1}^{n}\check u(k)\,\tilde C^{(j)}(k;t_i,t_i') .
\]
Define the operator $*$-product of the sectorized functions $A((\xi,s),(\xi',s'))$ and $B((\xi',s'),(\xi'',s''))$ by
\[
(A*B)((\xi,s),(\xi'',s''))=\sum_{\substack{s',t'\in\Sigma\\ s'\cap t'\neq\emptyset}}\int d\xi'\ A((\xi,s),(\xi',s'))\,B((\xi',t'),(\xi'',s'')) .
\]
Then, by part (i) of Lemma XIII.4
\[
c=c^{(j)}\sum_{n=0}^{\infty}(*\,u*\tilde c^{(j)})^n \tag{XIII.6}
\]
so that by iterated application of Lemmas XII.14 and XIII.4(ii)
\[
\begin{aligned}
|c|_{1,\Sigma}&\le|c^{(j)}|_{1,\Sigma}\sum_{n=0}^{\infty}\big(9\,|u|_{1,\Sigma}\,|\tilde c^{(j)}|_{1,\Sigma}\big)^n\\
&\le\mathrm{const}\ M^jc_j\sum_{n=0}^{\infty}\big(\mathrm{const}'\,M^jc_j\,|u|_{1,\Sigma}\big)^n\\
&=\mathrm{const}\ \frac{M^jc_j}{1-\mathrm{const}'\,M^jc_j\,|u|_{1,\Sigma}} .
\end{aligned}
\]
If $\tau_1<\min\{\frac{1}{2\,\mathrm{const}'},1\}$, then, by Corollary A.5(i), with $X=M^j|u|_{1,\Sigma}$, $\Lambda=M^j$ and $\mu=\mathrm{const}'$,
\[
|c|_{1,\Sigma}\le\mathrm{const}\ \frac{M^jc_j}{1-M^j|u|_{1,\Sigma}} .
\]
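The resummation at the start of the proof of part (ii) uses two elementary facts, spelled out below as a sketch. The first line assumes that $\tilde\nu^{(j)}$ is identically one on the support of $\nu^{(j)}$; this property of the cutoff functions of Definition VIII.4 is not restated in this section, so it is an assumption here. The second line is the geometric series, which converges because $\tau_1$ is chosen small.

```latex
\nu^{(j)}(k)\,\tilde\nu^{(j)}(k)=\nu^{(j)}(k)
\quad\Longrightarrow\quad
\frac{\nu^{(j)}(k)}{\imath k_0-e(k)-\check u(k)}
 =\frac{\nu^{(j)}(k)}{\imath k_0-e(k)}\cdot
  \frac{1}{1-\check u(k)\,\tilde C^{(j)}(k)}
 \quad\text{on } \operatorname{supp}\nu^{(j)},
\\[2mm]
\frac{1}{1-a}=\sum_{n\ge 0}a^{n}\quad\text{for } |a|<1,
\qquad a=\check u(k)\,\tilde C^{(j)}(k).
```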
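(iii) The bound
\[
\sum_\delta\frac{1}{\delta!}\sup_{\xi,\xi',s,s'}\big|D_{1,2}^\delta(A*B)((\xi,s),(\xi',s'))\big|\,t^\delta\le 3\Big\{\sum_\delta\frac{1}{\delta!}\sup_{\xi,\xi',s,s'}\big|D_{1,2}^\delta A((\xi,s),(\xi',s'))\big|\,t^\delta\Big\}\,|B|_{1,\Sigma}
\]
is proven in much the same way as Lemma II.7, but uses
\[
\sup_{\xi,\xi'}\Big|\int d\zeta\ A(\xi,\zeta)B(\zeta,\xi')\Big|\le\sup_{\xi,\zeta}|A(\xi,\zeta)|\ \sup_\zeta\int d\xi'\ |B(\zeta,\xi')|
\]
in place of
\[
\sup_\xi\int d\xi'\ \Big|\int d\zeta\ A(\xi,\zeta)B(\zeta,\xi')\Big|\le\sup_\xi\int d\zeta\ |A(\xi,\zeta)|\ \sup_\zeta\int d\xi'\ |B(\zeta,\xi')| .
\]
Repeatedly applying this bound to (XIII.6) and using Lemma XIII.4(iv) yields
\[
\begin{aligned}
\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\sup_{\xi,\xi',s,s'}\big|D_{1,2}^\delta c((\xi,s),(\xi',s'))\big|\,t^\delta
&\le\Big\{\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\sup_{\xi,\xi',s,s'}\big|D_{1,2}^\delta c^{(j)}((\xi,s),(\xi',s'))\big|\,t^\delta\Big\}\sum_{n=0}^{\infty}\big(9\,|u|_{1,\Sigma}\,|\tilde c^{(j)}|_{1,\Sigma}\big)^n\\
&\le\mathrm{const}\ \frac{l}{M^j}\,c_j\sum_{n=0}^{\infty}\big(\mathrm{const}'\,M^jc_j\,|u|_{1,\Sigma}\big)^n\\
&=\mathrm{const}\ \frac{l}{M^j}\,\frac{c_j}{1-\mathrm{const}'\,M^jc_j\,|u|_{1,\Sigma}}\\
&\le\mathrm{const}\ \frac{l}{M^j}\,\frac{c_j}{1-M^j|u|_{1,\Sigma}} .
\end{aligned}
\]

(iv) Repeat the proof of (ii) with (XIII.6) replaced by
\[
c_0=c_0^{(j)}\sum_{n=0}^{\infty}(*\,u*\tilde c^{(j)})^n
\]
and using Lemma XIII.4(iii).

Lemma XIII.6. There are constants $\tau_2$, const that depend on $e(k)$ and $M$, but not on $j$ or $\Sigma$ with the following property:

Let, for $\kappa$ in a neighborhood of zero, $u_\kappa\in\mathcal F_0(2;\Sigma)$ be antisymmetric, spin independent and particle number conserving and obey $|u_0|_{1,\Sigma}\le\frac{\tau_2}{M^j}+\sum_{\delta\neq 0}\infty\,t^\delta$. Let
\[
C_\kappa(k)=\frac{\nu^{(j)}(k)}{\imath k_0-e(k)-\check u_\kappa(k)}
\]
and $c_\kappa((\xi,s),(\xi',s'))$ be the Fourier transform of $\chi_s(k)C_\kappa(k)\chi_{s'}(k)$. Let $c_{\kappa,0}((\xi,s),(\xi',s'))$ be the Fourier transform of $\chi_s(k)\,k_0\,C_\kappa(k)\chi_{s'}(k)$ or $\chi_s(k)\,e(\mathbf k)\,C_\kappa(k)\chi_{s'}(k)$. Then

(i) $\Big|\frac{d}{d\kappa}c_\kappa\big|_{\kappa=0}\Big|_{1,\Sigma}\le\mathrm{const}\ M^jc_j\ \dfrac{M^j\big|\frac{d}{d\kappa}u_\kappa|_{\kappa=0}\big|_{1,\Sigma}}{1-M^j|u_0|_{1,\Sigma}}$.

(ii) $\displaystyle\sup_{\xi,\xi',s,s'}\Big|\frac{d}{d\kappa}c_\kappa((\xi,s),(\xi',s'))\big|_{\kappa=0}\Big|\le\mathrm{const}\ l\ \Big|\frac{d}{d\kappa}u_\kappa\big|_{\kappa=0}\Big|_{1,\Sigma}$.

(iii) $\Big|\frac{d}{d\kappa}c_{\kappa,0}\big|_{\kappa=0}\Big|_{1,\Sigma}\le\mathrm{const}\ c_j\ \dfrac{M^j\big|\frac{d}{d\kappa}u_\kappa|_{\kappa=0}\big|_{1,\Sigma}}{1-M^j|u_0|_{1,\Sigma}}$.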
December 15, 2003 16:44 WSPC/148-RMP
1066
00179
J. Feldman, H. Kn¨ orrer & E. Trubowitz
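Proof. The proof is similar to that of Proposition XIII.5, using
\begin{align*}
\frac{d}{d\kappa} C_\kappa(k) &= \frac{d}{d\kappa}\, \frac{C^{(j)}(k)}{1 - \check u_\kappa(k) \tilde C^{(j)}(k)} \\
&= C^{(j)}(k)\, \frac{1}{1 - \check u_\kappa(k) \tilde C^{(j)}(k)}\; \frac{d}{d\kappa} \check u_\kappa(k)\, \tilde C^{(j)}(k)\; \frac{1}{1 - \check u_\kappa(k) \tilde C^{(j)}(k)} \\
&= C^{(j)}(k)\, \frac{d}{d\kappa} \check u_\kappa(k)\, \tilde C^{(j)}(k) \sum_{m,n=0}^{\infty} \big(\check u_\kappa(k) \tilde C^{(j)}(k)\big)^n \big(\check u_\kappa(k) \tilde C^{(j)}(k)\big)^m
\end{align*}
and [6, Corollary A.5], which implies that $\Big( \dfrac{c_j}{1 - M^j |u_0|_{1,\Sigma}} \Big)^2 \le \mathrm{const}\, \dfrac{c_j}{1 - M^j |u_0|_{1,\Sigma}}$.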
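Lemma XIII.7. Let $u((\xi,s),(\xi',s'))$ be a translation invariant function on $(B\times\Sigma)^2$ with the property that $\check u((k,\sigma,a,s),(k',\sigma',a',s'))$ vanishes unless $\pi_F(k) \in \pi_F(s)$ and $\pi_F(k') \in \pi_F(s')$. Let $\mu(t)$ be a $C_0^\infty$ function on $\mathbb{R}$ and set, for each $\Lambda > 0$,
\begin{align*}
\mu_\Lambda(k) &= \mu\big(\Lambda^2 [k_0^2 + e(k)^2]\big) \\
(u * \hat\mu_\Lambda)((\xi,s),(\xi',s')) &= \int_B d\zeta\; u((\xi,s),(\zeta,s'))\, \hat\mu_\Lambda(\zeta,\xi') \\
(\hat\mu_\Lambda * u)((\xi,s),(\xi',s')) &= \int_B d\zeta\; u((\zeta,s),(\xi',s'))\, \hat\mu_\Lambda(\zeta,\xi)
\end{align*}
where $\hat\mu_\Lambda$ was defined in Definition IX.4. Denote $j(\Lambda) = \min\{ i \in \mathbb{N} \mid M^i \ge \Lambda \}$. Then there is a constant const, depending on $\mu$, but not on $M$, $j$ or $\Lambda$, such that
\[
|u * \hat\mu_\Lambda|_{1,\Sigma},\ |\hat\mu_\Lambda * u|_{1,\Sigma} \le \mathrm{const}\; c_{j(\Lambda)}\, |u|_{1,\Sigma} .
\]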
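Proof. Let $\{\Theta_s \mid s \in \Sigma\}$ be the smooth partition of unity of the Fermi curve $F$ that was chosen just before (XIII.2) and set
\[
\chi_{\Lambda,s}(k) = \mu_\Lambda(k)\, \Theta_s(\pi_F(k)) .
\]
Then, by Lemma XIII.2 and Proposition XIII.1(iii), as in Lemma XIII.3, $\|\hat\chi_{\Lambda,s}\|_{1,\infty} \le \mathrm{const}\; c_{j(\Lambda)}$. We treat $u * \hat\mu_\Lambda$. The other case is similar. As
\[
u * \hat\mu_\Lambda((\xi,s),(\xi',s')) = \sum_{\substack{s''\in\Sigma\\ s''\cap s'\neq\emptyset}} \int_B d\zeta\; u((\xi,s),(\zeta,s'))\, \hat\chi_{\Lambda,s''}(\zeta,\xi') ,
\]
Lemma II.7 implies that
\begin{align*}
\| u * \hat\mu_\Lambda((\cdot,s),(\cdot,s')) \|_{1,\infty} &\le \sum_{\substack{s''\in\Sigma\\ s''\cap s'\neq\emptyset}} \mathrm{const}\; c_{j(\Lambda)}\, \| u((\cdot,s),(\cdot,s')) \|_{1,\infty} \\
&\le \mathrm{const}\; c_{j(\Lambda)}\, \| u((\cdot,s),(\cdot,s')) \|_{1,\infty}
\end{align*}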
since, for each $s' \in \Sigma$, there are only three $s'' \in \Sigma$ with $s'' \cap s' \neq \emptyset$. The lemma follows.

Remark XIII.8. In the notation of Lemma XIII.7,
\[
(u * \hat\mu_\Lambda)\check{\ }(k) = (\hat\mu_\Lambda * u)\check{\ }(k) = \check u(k)\, \mu_\Lambda(k) .
\]

XIV. Ladders

In Sec. XV, we will apply [3, Theorem VI.6] to estimate the renormalization group map $\tilde\Omega$ of Definition VII.1, with respect to the sectorized norms of Definition XII.9. It will give "improved power counting" for two-legged contributions and "improved power counting" for those four-legged contributions that are not ladders. A similar result using the norms of Sec. X will be derived in Sec. XVII. Depending on the geometry of the Fermi curve, ladders have different behavior. We shall investigate ladders in Sec. XXII, [4, Sec. VII] and the paper [5]. In this section, we introduce notation for ladders that will be useful in all these investigations.

In this section, the internal lines of ladders will be functions with arguments running over an arbitrary measure space $X$. We think of $X$ as $B = \mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\}\times\{0,1\}$ or $\mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\}$ or $\mathbb{R}\times\mathbb{R}^2$ or $B\times\Sigma$, where $\Sigma$ is a sectorization.
Definition XIV.1. (i) A complex valued function on $X \times X$ is called a propagator over $X$.

(ii) A four-legged kernel over $X$ is a complex valued function on $X^2 \times X^2$. We sometimes consider it as a bubble propagator over $X$, graphically depicted by [diagram of a bubble propagator omitted], or as a rung over $X$, graphically depicted by [diagram of a rung omitted].

(iii) If $A$ and $B$ are propagators over $X$ then the tensor product
\[
A \otimes B(x_1,x_2,x_3,x_4) = A(x_1,x_3)\, B(x_2,x_4)
\]
is a bubble propagator over $X$. We set
\[
C(A,B) = A \otimes A + A \otimes B + B \otimes A .
\]

(iv) Let $F$, $F'$ be four-legged kernels over $X$. We define the four-legged kernel $F \circ F'$ as
\[
(F \circ F')(x_1,x_2;x_3,x_4) = \int dx_1'\, dx_2'\; F(x_1,x_2;x_1',x_2')\, F'(x_1',x_2';x_3,x_4)
\]
whenever the integral is well-defined.
(v) Let $\ell \ge 1$. The ladder with rungs $R_1, \dots, R_{\ell+1}$ and bubble propagators $P_1, \dots, P_\ell$ is defined to be
\[
R_1 \circ P_1 \circ R_2 \circ P_2 \circ \cdots \circ P_{\ell-1} \circ R_\ell \circ P_\ell \circ R_{\ell+1}
\]
[diagram omitted: a chain of rungs $R_1, R_2, \dots, R_{\ell+1}$ joined by the bubble propagators $P_1, P_2, \dots, P_\ell$]

If $R$ is a rung and $A$, $B$ are propagators we define $L_\ell(R;A,B)$ as the ladder with $\ell+1$ rungs $R$ and $\ell$ bubble propagators $C(A,B)$.

The ladders contribute to the four-point function, which is antisymmetric. So the ladders must be antisymmetrized.

Definition XIV.2. Let $F$ be a four-legged kernel. The antisymmetrization of $F$ is the four-legged kernel
\[
(\mathrm{Ant}\, F)(x_1,x_2,x_3,x_4) = \frac{1}{4!} \sum_{\pi\in S_4} \mathrm{sign}(\pi)\, F(x_{\pi(1)}, x_{\pi(2)}, x_{\pi(3)}, x_{\pi(4)}) .
\]
$F$ is called antisymmetric if $F = \mathrm{Ant}\, F$.

In the direct application of [3, Theorem VI.6], we will consider ladders with internal lines taking values in the measure space $X = B \times \Sigma$, where $\Sigma$ is a sectorization. However, the propagators are not naturally sectorized, and in [4, Sec. VII] we will combine bubble propagators of different scales. This motivates the following variant of the previous definitions.

Definition XIV.3. Let $S$ be a finite set.$^{b}$ It is endowed with the counting measure. Then $X \times S$ is also a measure space.

(i) Let $P$ be a propagator over $X$, $f$ a four-legged kernel over $X \times S$ and $F$ a function on $(X\times S)^2 \times X^2$. We define
\begin{align*}
(f \bullet P)((x_1,s_1),(x_2,s_2);x_3,x_4) &= \sum_{s_1',s_2'\in S} \int dx_1'\, dx_2'\; f((x_1,s_1),(x_2,s_2),(x_1',s_1'),(x_2',s_2'))\, P(x_1',x_2';x_3,x_4) \\
(F \bullet f)((x_1,s_1),\dots,(x_4,s_4)) &= \sum_{s_1',s_2'\in S} \int dx_1'\, dx_2'\; F((x_1,s_1),(x_2,s_2);x_1',x_2')\, f((x_1',s_1'),(x_2',s_2'),(x_3,s_3),(x_4,s_4))
\end{align*}
whenever the integrals are well-defined. Observe that $(f \bullet P)$ is a function on $(X\times S)^2 \times X^2$ and $F \bullet f$ is a four-legged kernel over $X \times S$.

$^{b}$ In practice, $S$ will be a set of sectors and $X$ will be $B$ or $\mathbb{R}\times\mathbb{R}^2\times\{\uparrow,\downarrow\}$ or $\mathbb{R}\times\mathbb{R}^2$.
(ii) Let $\ell \ge 1$, $r_1, \dots, r_{\ell+1}$ be rungs over $X \times S$ and $P_1, \dots, P_\ell$ be bubble propagators over $X$. The ladder with rungs $r_1, \dots, r_{\ell+1}$ and bubble propagators $P_1, \dots, P_\ell$ is defined to be
\[
r_1 \bullet P_1 \bullet r_2 \bullet P_2 \bullet \cdots \bullet r_\ell \bullet P_\ell \bullet r_{\ell+1} .
\]
If $r$ is a rung over $X \times S$ and $A$, $B$ are propagators over $X$, we define $L_\ell(r;A,B)$ as the ladder with $\ell+1$ rungs $r$ and $\ell$ bubble propagators $C(A,B)$.

Lemma XIV.4. Let $c$ and $d$ be propagators over $X \times S$ and $r$ a rung over $X \times S$. Define the propagators $C$ and $D$ over $X$ by
\[
C(x_1,x_2) = \sum_{t_1,t_2\in S} c((x_1,t_1),(x_2,t_2)) , \qquad D(x_1,x_2) = \sum_{t_1,t_2\in S} d((x_1,t_1),(x_2,t_2))
\]
and new propagators $\tilde c$ and $\tilde d$ over $X \times S$ by
\[
\tilde c((x_1,s_1),(x_2,s_2)) = C(x_1,x_2) , \qquad \tilde d((x_1,s_1),(x_2,s_2)) = D(x_1,x_2) .
\]
Then
\[
L_\ell(r;C,D) = L_\ell(r;\tilde c,\tilde d)
\]
for all $\ell \ge 1$. Here, the ladder $L_\ell(r;\tilde c,\tilde d)$, of the right-hand side, is defined over the measure space $X \times S$ and uses the $\circ$ product while the ladder on the left-hand side is as in Definition XIV.3(ii).

Proof. For any rungs $r'$, $r''$ over $X \times S$,
\[
r' \bullet C(C,D) \bullet r'' = r' \circ C(\tilde c,\tilde d) \circ r'' .
\]
The lemma now follows by induction on $\ell$.

Lemma XIV.5. Let $\Sigma$ be a sectorization at scale $j$ and $\varphi \in F_0(4,\Sigma)$. Let $C(k)$ and $D(k)$ be functions on $\mathbb{R}\times\mathbb{R}^2$ that are supported in the $j$th neighborhood, and $C(\xi,\xi')$, $D(\xi,\xi')$ their Fourier transforms as in Definition IX.3. Furthermore, let $c((\cdot,s),(\cdot,s'))$ and $d((\cdot,s),(\cdot,s'))$ be the Fourier transforms of $\chi_s(k) C(k) \chi_{s'}(k)$ and $\chi_s(k) D(k) \chi_{s'}(k)$. Define propagators over $B \times \Sigma$ by
\begin{align*}
c_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} c((\xi,t),(\xi',t')) \\
d_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} d((\xi,t),(\xi',t')) .
\end{align*}
Then
\[
L_\ell(\varphi;C,D) = L_\ell(\varphi;c_\Sigma,d_\Sigma)
\]
for all $\ell \ge 1$. Here, the ladder on the right-hand side is defined over the measure space $B \times \Sigma$ and uses the $\circ$ product while the ladders on the left-hand side are as in Definition XIV.3(ii).
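The sector-sum mechanism behind Lemmas XIV.4 and XIV.5 can be checked on a toy model. The following sketch is only an illustration under stated assumptions: $X$ and $S$ are replaced by two-point sets with counting measure, the propagators and the rung are random integer tables, and only the ladder with $\ell = 1$ is compared. All names (`sector_sum`, `lift`, `bubble`, `circ`, `bullet_ladder`) are hypothetical helpers introduced here and do not appear in the paper.

```python
from itertools import product
import random

random.seed(0)
X = [0, 1]                  # toy measure space with counting measure (assumption)
S = [0, 1]                  # toy finite set of sector labels (assumption)
XS = list(product(X, S))    # points (x, s) of X x S

# Integer-valued test data: propagators c, d and a rung r over X x S.
c = {(a, b): random.randint(-3, 3) for a in XS for b in XS}
d = {(a, b): random.randint(-3, 3) for a in XS for b in XS}
r = {key: random.randint(-3, 3) for key in product(XS, repeat=4)}

def sector_sum(prop):
    """C(x1, x2) = sum over t1, t2 in S of prop((x1, t1), (x2, t2))."""
    return {(x1, x2): sum(prop[(x1, t1), (x2, t2)] for t1 in S for t2 in S)
            for x1 in X for x2 in X}

def lift(prop):
    """c~((x1, s1), (x2, s2)) = C(x1, x2): a propagator over X x S ignoring sectors."""
    return {(a, b): prop[a[0], b[0]] for a in XS for b in XS}

def bubble(A, B, space):
    """C(A, B) = A(x)A + A(x)B + B(x)A with (A(x)B)(1,2,3,4) = A(1,3) * B(2,4)."""
    return {(p1, p2, p3, p4):
            A[p1, p3] * A[p2, p4] + A[p1, p3] * B[p2, p4] + B[p1, p3] * A[p2, p4]
            for p1, p2, p3, p4 in product(space, repeat=4)}

def circ(F, G, space):
    """(F o G)(1,2;3,4) = sum over 1', 2' of F(1,2;1',2') * G(1',2';3,4)."""
    return {(p1, p2, p3, p4):
            sum(F[p1, p2, u, v] * G[u, v, p3, p4] for u in space for v in space)
            for p1, p2, p3, p4 in product(space, repeat=4)}

def bullet_ladder(rung, P):
    """rung . P . rung: positions and sectors are summed, P sees positions only."""
    # First (rung . P)(a, b; x3, x4), then ((rung . P) . rung)(a, b, e, f).
    rP = {(a, b, x3, x4):
          sum(rung[a, b, u, v] * P[u[0], v[0], x3, x4] for u in XS for v in XS)
          for a in XS for b in XS for x3 in X for x4 in X}
    return {(a, b, e, f):
            sum(rP[a, b, u[0], v[0]] * rung[u, v, e, f] for u in XS for v in XS)
            for a, b, e, f in product(XS, repeat=4)}

C, D = sector_sum(c), sector_sum(d)
lhs = bullet_ladder(r, bubble(C, D, X))                        # L_1(r; C, D)
rhs = circ(circ(r, bubble(lift(C), lift(D), XS), XS), r, XS)   # L_1(r; c~, d~)
```

Because the lifted propagators do not depend on the sector labels, the sector sums in the $\circ$ product act only on the rungs, which is exactly what the $\bullet$ product does; the two tables `lhs` and `rhs` therefore agree entry by entry.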
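Proof. Since $\sum_{s\in\Sigma} \chi_s(k)$ is identically one on the support of $C(k)$ and $D(k)$,
\[
C(\xi_1,\xi_2) = \sum_{t,t'\in\Sigma} c((\xi_1,t),(\xi_2,t')) , \qquad D(\xi_1,\xi_2) = \sum_{t,t'\in\Sigma} d((\xi_1,t),(\xi_2,t')) .
\]
As in Lemma XIV.4, set
\[
\tilde c((\xi_1,s_1),(\xi_2,s_2)) = C(\xi_1,\xi_2) , \qquad \tilde d((\xi_1,s_1),(\xi_2,s_2)) = D(\xi_1,\xi_2) .
\]
Denote
\[
p_\Sigma = C(c_\Sigma,d_\Sigma) , \qquad p = C(c,d) , \qquad \tilde p = C(\tilde c,\tilde d) .
\]
Then, $p$ is a $\Sigma$-sectorized bubble propagator and
\begin{align*}
p_\Sigma((\xi_1,s_1),(\xi_2,s_2),(\xi_3,s_3),(\xi_4,s_4)) &= \sum_{\substack{t_i\cap s_i\neq\emptyset\\ 1\le i\le 4}} p((\xi_1,t_1),(\xi_2,t_2),(\xi_3,t_3),(\xi_4,t_4)) \\
\tilde p((\xi_1,s_1),(\xi_2,s_2),(\xi_3,s_3),(\xi_4,s_4)) &= \sum_{\substack{t_i\in\Sigma\\ 1\le i\le 4}} p((\xi_1,t_1),(\xi_2,t_2),(\xi_3,t_3),(\xi_4,t_4)) .
\end{align*}
For any $w \in F_0(4,\Sigma)$,
\begin{align*}
w \circ p_\Sigma \circ \varphi &= \sum_{\substack{s_i'\in\Sigma\\ 1\le i\le 4}} \int d\xi_1' \cdots d\xi_4'\; w(\cdot,\cdot;(\xi_1',s_1'),(\xi_2',s_2'))\, p_\Sigma((\xi_1',s_1'),\dots,(\xi_4',s_4'))\, \varphi((\xi_3',s_3'),(\xi_4',s_4'),\cdot,\cdot) \\
&= \sum_{\substack{s_i',t_i\in\Sigma\\ s_i'\cap t_i\neq\emptyset\\ 1\le i\le 4}} \int d\xi_1' \cdots d\xi_4'\; w(\cdot,\cdot;(\xi_1',s_1'),(\xi_2',s_2'))\, p((\xi_1',t_1),\dots,(\xi_4',t_4))\, \varphi((\xi_3',s_3'),(\xi_4',s_4'),\cdot,\cdot) \\
&= \sum_{\substack{s_i',t_i\in\Sigma\\ 1\le i\le 4}} \int d\xi_1' \cdots d\xi_4'\; w(\cdot,\cdot;(\xi_1',s_1'),(\xi_2',s_2'))\, p((\xi_1',t_1),\dots,(\xi_4',t_4))\, \varphi((\xi_3',s_3'),(\xi_4',s_4'),\cdot,\cdot) \\
&= w \circ \tilde p \circ \varphi \tag{XIV.1}
\end{align*}
because
\[
\int d\xi_1' \cdots d\xi_4'\; w(\cdot,\cdot;(\xi_1',s_1'),(\xi_2',s_2'))\, p((\xi_1',t_1),\dots,(\xi_4',t_4))\, \varphi((\xi_3',s_3'),(\xi_4',s_4'),\cdot,\cdot)
\]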
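vanishes unless $s_i' \cap t_i \neq \emptyset$ for all $1 \le i \le 4$. Observe that $w \circ \tilde p \circ \varphi$ is again in $F_0(4,\Sigma)$. It follows by induction from (XIV.1) that
\[
L_\ell(\varphi;c_\Sigma,d_\Sigma) = L_\ell(\varphi;\tilde c,\tilde d) .
\]
The lemma follows by Lemma XIV.4.

XV. Norm Estimates on the Renormalization Group Map

Again, let $j \ge 2$ and $\Sigma$ be a sectorization of scale $j$ and length $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$. Fix a system $\rho = (\rho_{m;n})$ of positive real numbers such that
\begin{align*}
\rho_{m;n} &\le \rho_{m;n'} \quad \text{if } n \le n' \\
\rho_{m+m';n+n'-2} &\le \rho_{m;n}\, \rho_{m';n'} \\
\rho_{m+1;n-1} &\le \rho_{m;n} \quad \text{if } m \ge 1 \\
\rho_{1;n-1} &\le \sqrt{l M^j}\; \rho_{0;n} . \tag{XV.1}
\end{align*}

Definition XV.1. (i) For $\varphi \in F_m(n;\Sigma)$ set
\[
|\varphi|_\Sigma = \rho_{m;n} \begin{cases} |\varphi|_{1,\Sigma} + \frac{1}{l} |\varphi|_{3,\Sigma} + \frac{1}{l^2} |\varphi|_{5,\Sigma} & \text{if } m = 0 \\[4pt] \frac{l}{M^{2j}}\, |\varphi|_{1,\Sigma} & \text{if } m \neq 0 . \end{cases}
\]

(ii) We set, for $X = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} X_\delta\, t^\delta \in N_{d+1}$ with $X_0 < \frac{1}{M^j}$,
\[
e_j(X) = \frac{c_j}{1 - M^j X} .
\]

(iii) A sectorized Grassmann function $w$ can be uniquely written in the form
\[
w(\phi,\psi) = \sum_{m,n} \sum_{s_1,\dots,s_n\in\Sigma} \int d\eta_1 \cdots d\eta_m\, d\xi_1 \cdots d\xi_n\; w_{m,n}(\eta_1,\dots,\eta_m\, (\xi_1,s_1),\dots,(\xi_n,s_n))\; \phi(\eta_1) \cdots \phi(\eta_m)\, \psi((\xi_1,s_1)) \cdots \psi((\xi_n,s_n))
\]
with $w_{m,n}$ antisymmetric separately in the $\eta$ and in the $\xi$ variables. Set, in analogy with Theorem VIII.6, for $\alpha > 0$ and $X \in N_{d+1}$,
\[
N_j(w;\alpha;X,\Sigma,\rho) = \frac{M^{2j}}{l}\, e_j(X) \sum_{m,n\ge 0} \alpha^n \Big( \frac{lB}{M^j} \Big)^{n/2} |w_{m,n}|_\Sigma .
\]
The constant $B$ will be chosen in Definition XVII.1(iii). It will obey $B > 4 \max\{8B_1, B_2\}$ with $B_1$, $B_2$ being the constants of Propositions XII.16 and XII.18.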
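As a quick sanity check on the $j$-dependent prefactor in $N_j$, one can expand $\frac{M^{2j}}{l} \big( \frac{lB}{M^j} \big)^{n}$ for a kernel with $2n$ $\psi$-legs and compare it with the closed form $B^n l^{n-1} / M^{j(n-2)}$. The sketch below does this in exact rational arithmetic. The numerical values of $M$, $j$, $l$ and $B$ are arbitrary illustrative choices, not values from the paper, and the function names are hypothetical.

```python
from fractions import Fraction

def weight(M, j, l, B, legs):
    """Prefactor (M^(2j)/l) * (l*B/M^j)^(legs/2) for an even number of psi legs."""
    assert legs % 2 == 0
    n = legs // 2
    return Fraction(M) ** (2 * j) / l * (l * B / Fraction(M) ** j) ** n

def expanded(M, j, l, B, legs):
    """Closed form B^n * l^(n-1) / M^(j*(n-2)) with legs = 2n."""
    n = legs // 2
    return Fraction(B) ** n * Fraction(l) ** (n - 1) / Fraction(M) ** (j * (n - 2))

# Arbitrary sample values (assumptions for illustration only).
M, j, l, B = 2, 3, Fraction(1, 4), 5

two_leg = weight(M, j, l, B, 2)    # equals B * M^j
four_leg = weight(M, j, l, B, 4)   # equals B^2 * l
```

With these sample values the two-legged weight is $B M^j = 40$ and the four-legged weight is $B^2 l = 25/4$, which illustrates why a bounded norm forces the two-legged kernel to be small on the scale $1/M^j$.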
Remark XV.2. (i) By definition, for even $w$,
\begin{align*}
N_j(w;\alpha;X,\Sigma,\rho) = {} & e_j(X) \sum_{n\ge 1} B^n \alpha^{2n}\, \frac{l^{n-1}}{M^{j(n-2)}}\, \rho_{0;2n}\, |w_{0,2n}|_{1,\Sigma} \\
& + e_j(X) \sum_{n\ge 2} B^n \alpha^{2n}\, \frac{l^{n-2}}{M^{j(n-2)}}\, \rho_{0;2n}\, |w_{0,2n}|_{3,\Sigma} \\
& + e_j(X) \sum_{n\ge 3} B^n \alpha^{2n}\, \frac{l^{n-3}}{M^{j(n-2)}}\, \rho_{0;2n}\, |w_{0,2n}|_{5,\Sigma} \\
& + e_j(X) \sum_{m\ge 1} \sum_{n\ge 0} \alpha^n \Big( \frac{lB}{M^j} \Big)^{n/2} \rho_{m;n}\, |w_{m,n}|_{1,\Sigma} .
\end{align*}
If, in a renormalization group analysis, $\rho_{0;2n}$ is independent of the scale number $j$, then boundedness of the norms $N_j$ implies that
\[
|w_{0,2}|_{1,\Sigma} = O\Big( \frac{1}{M^j} \Big) , \qquad |w_{0,4}|_{3,\Sigma} = O(1)
\]
modulo $t$.

(ii) If $X \le \frac{1}{2M^j}\, c_j$ then $e_j(X) \le \mathrm{const}\; c_j$.

(iii) $\frac{\partial}{\partial t_0} c_j \le \mathrm{const}\, M^j c_j + \sum_{\delta_0 = r_0} \infty\, t^\delta$.

(iv) If $X$ is independent of $t_0$, then $\frac{\partial}{\partial t_0} e_j(X) \le \mathrm{const}\, M^j e_j(X) + \sum_{\delta_0 = r_0} \infty\, t^\delta$.

(v) The $j$-dependent factors in the definition of $N_j$ were largely motivated by the discussion in [2, Sec. II, Subsec. 8] and [4, Remark VI.8].

The main result of this paper is that the norms of Definition XV.1 are not changed very much by the renormalization group map $\tilde\Omega_C$ of Definition VII.1, and that there is volume improvement for the two-point function and all contributions to the four-point function with the exception of ladders.

Theorem XV.3. There are constants const, $\mathrm{const}_0$, $\alpha_0$ and $\tau_0$ that are independent of $j$, $\Sigma$, $\rho$ such that for all $\alpha \ge \alpha_0$ the following estimates hold: Let $u((\xi,s),(\xi',s'))$, $v((\xi,s),(\xi',s')) \in F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms obey $|\check u(k)|, |\check v(k)| \le \frac{1}{2} |\imath k_0 - e(k)|$. Furthermore, let $X \in N_{d+1}$, $\mu, \Lambda > 0$ and assume that $|u|_{1,\Sigma} \le \mu (\Lambda + X)\, e_j(X)$ and $(1+\mu)(\Lambda + X_0) \le \frac{\tau_0}{M^j}$. Set
\[
C(k) = \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u(k)} , \qquad D(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v(k)}
\]
and let $C(\xi,\xi')$, $D(\xi,\xi')$ be the Fourier transforms of $C(k)$, $D(k)$ as in Definition IX.3. Let $W(\phi,\psi)$ be a Grassmann function and set$^{c}$
\[
{:}W'(\phi,\psi){:}_{\psi,D} = \tilde\Omega_C\big( {:}W(\phi,\psi){:}_{\psi,C+D} \big) .
\]

$^{c}$ The definition of $W'$ as an analytic function, rather than merely a formal Taylor series, will be explained in Remark XV.11.
Assume that $W$ has a sectorized representative $w$ with $w_{0,2} = 0$ and
\[
N_j(w;64\alpha;X,\Sigma,\rho) \le \mathrm{const}_0\, \alpha + \sum_{\delta\neq 0} \infty\, t^\delta .
\]
Then $W'$ has a sectorized representative $w'$ such that
\[
N_j\Big( w' - \tfrac{1}{2} \phi J C J \phi - w;\ \alpha;X,\Sigma,\rho \Big) \le \frac{\mathrm{const}}{\alpha}\, \frac{N_j(w;64\alpha;X,\Sigma,\rho)}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha;X,\Sigma,\rho)} .
\]
Furthermore
\[
|w'_{0,2}|_{1,\Sigma} \le \frac{\mathrm{const}}{\alpha^8}\, \frac{l}{\rho_{0;2}\, M^j}\, \frac{N_j(w;64\alpha;X,\Sigma,\rho)^2}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha;X,\Sigma,\rho)}
\]
and
\[
\Big| w'_{0,4} - w_{0,4} - \frac{1}{4} \sum_{\ell=1}^{\infty} (-1)^\ell (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(w_{0,4};C,D) \Big|_{3,\Sigma} \le \frac{\mathrm{const}}{\alpha^{10}}\, \frac{l}{\rho_{0;4}}\, \frac{N_j(w;64\alpha;X,\Sigma,\rho)^2}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha;X,\Sigma,\rho)} .
\]

Remark XV.4. (i) When we use Theorem XV.3 in a renormalization group analysis, $u$ will depend on counterterms that will ultimately be generated at scales $j' > j$. Then the derivatives of $\check u(k)$ can have a scaling behavior characteristic of scale $j'$. In this case $|u|_{1,\Sigma}$ will not be of order $c_j$. This is why we introduce the factor $e_j(X)$ in the definition of $N_j$.

(ii) The hypothesis that $w_{0,2} = 0$ is used, in conjunction with Wick ordering, to ensure that all non-ladder contributions to $w'_{0,2}$ and $w'_{0,4}$ contain overlapping loops. See [2, Sec. II, Subsecs. 4 and 9].

(iii) In Appendix D, we give naive power-counting bounds for ladders $L_\ell(w_{0,4};C,D)$. These estimates are not good enough for a renormalization group analysis. They would lead to logarithmic divergences. Stronger estimates on the "particle–particle" part of the ladders are derived in Theorem XXII.7. The "particle–hole" parts of the ladders are treated in [5].

Most of the rest of this paper is devoted to the proof of Theorem XV.3. To simplify notation we write $N_j(w;\alpha)$ for $N_j(w;\alpha;X,\Sigma,\rho)$. We define a family of seminorms on the spaces $F_m(n;\Sigma)$ by
\[
|\varphi|_p = \rho_{m;n} \begin{cases} |\varphi|_{p,\Sigma} & \text{if } m = 0 \\[4pt] \frac{l}{M^{2j}}\, |\varphi|_{p,\Sigma} & \text{if } m \neq 0 \end{cases}
\]
with $p = 1, 3, 5$. As in Definition XII.6, these norms induce a family of symmetric seminorms on the spaces $A_m \otimes V_\Sigma^{\otimes n}$. This family of seminorms will only appear in the proof of Theorem XV.3 and in the preliminary Lemma XV.5.
Let $c((\cdot,s),(\cdot,s'))$ and $d((\cdot,s),(\cdot,s'))$ be the Fourier transforms of $\chi_s(k) C(k) \chi_{s'}(k)$ and $\chi_s(k) D(k) \chi_{s'}(k)$ in the sense of Definition IX.3. As in Lemma XIV.5, let
\begin{align*}
c_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} c((\xi,t),(\xi',t')) \\
d_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} d((\xi,t),(\xi',t')) .
\end{align*}
As in Proposition XII,
\begin{align*}
C_\Sigma(\psi(\xi,s),\psi(\xi',s')) &= c_\Sigma((\xi,s),(\xi',s')) \\
D_\Sigma(\psi(\xi,s),\psi(\xi',s')) &= d_\Sigma((\xi,s),(\xi',s'))
\end{align*}
are covariances on $V_\Sigma$.

Lemma XV.5. Under the hypotheses of Theorem XV.3, there exists a constant $\mathrm{const}_1$ that is independent of $j$ and $\Sigma$ such that the covariances $C_\Sigma$, $D_\Sigma$ have integration constants$^{d}$
\[
c = \mathrm{const}_1\, M^j e_j(X) , \qquad \frac{1}{2}\, b = \sqrt{\frac{Bl}{4M^j}}
\]
(in the sense of [3, Definition VI.13]) for the configuration $|\cdot|_p$ of seminorms.

Proof. Clearly, the functions $C(k)$ and $D(k)$ are supported on the $j$th neighborhood, and $|C(k)|, |D(k)| \le \frac{2}{|\imath k_0 - e(k)|}$. By Proposition XII.16(ii) and the first condition of (XV.1), $\frac{1}{2} b$ is an integral bound both for $C_\Sigma$ and $D_\Sigma$. We now verify the contraction estimates of [3, Definition VI.13]. Contraction by $c$ for functions on $B^m \times (B\times\Sigma)^m$ as in Definition XII.15 corresponds to contraction by $C_\Sigma$ in the Grassmann algebra over $V_\Sigma$ as in [1, Definition II.5]. Set
\[
c' = 9 \max\Big\{ |c|_{1,\Sigma},\ \frac{M^{2j}}{l} \sup_{\xi,\xi',s,s'} |c((\xi,s),(\xi',s'))| \Big\} .
\]
It follows from Proposition XII.16(i), combined with the second property of $\rho$, and Proposition XII.18, combined with the first two properties of $\rho$, that $C_\Sigma$, $D_\Sigma$ have integration constants $c'$, $\frac{1}{2} b$. By Proposition XIII.5, if $\tau_0$ is small enough,
\[
c' \le \mathrm{const}\, \max\Big\{ \frac{M^j c_j}{1 - M^j |u|_{1,\Sigma}},\ M^j \Big\} \le \frac{\mathrm{const}\, M^j c_j}{1 - M^j |u|_{1,\Sigma}} .
\]

$^{d}$ We shall, in the proof of Theorem XV.7, apply [1, Theorem IV.4], which requires integral bounds $\frac{1}{2} b$. Of course, then $b$ is also an integral bound, as is required in the proof of the current Theorem XV.3.
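Therefore, by the hypotheses on $u$,
\[
c' \le \frac{\mathrm{const}\, M^j c_j}{1 - \mu M^j (\Lambda + X) \frac{c_j}{1 - M^j X}} \le \frac{\mathrm{const}\, M^j c_j}{1 - \mu\, \frac{M^j c_j (\Lambda + X)}{1 - M^j c_j (\Lambda + X)}} = \mathrm{const}\, M^j c_j\, f(Y)
\]
where $Y = M^j c_j (\Lambda + X)$ and
\[
f(z) = \frac{1}{1 - \mu \frac{z}{1-z}} = \frac{1 - z}{1 - (1+\mu) z} .
\]
By Lemma A.7, $f(Y) \le \mathrm{const}\, \frac{1}{1-Y}$ so that
\begin{align*}
c' &\le \frac{\mathrm{const}\, M^j c_j}{1 - M^j c_j \Lambda - M^j c_j X} \le \mathrm{const}\, M^j\, \frac{1}{1 - M^j c_j \Lambda}\, \frac{c_j}{1 - M^j c_j X} \\
&\le \mathrm{const}\, M^j\, \frac{1}{1 - c_j/3}\, \frac{c_j}{1 - M^j c_j X} .
\end{align*}
In the second inequality we used Lemma A.4(ii). As, by Corollary A.5(i), $\frac{c_j}{1 - c_j/3} \le \mathrm{const}\; c_j$ and $\frac{c_j}{1 - M^j c_j X} \le \mathrm{const}\, \frac{c_j}{1 - M^j X}$, we have
\[
c' \le \mathrm{const}\, M^j e_j(X) . \tag{XV.2}
\]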
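The algebraic identity for $f(z)$ used in the proof above is elementary, and it can be confirmed in exact arithmetic. The sketch below treats $z$ and $\mu$ as ordinary rational numbers, which is a simplification: in the text $Y$ is an element of $N_{d+1}$, that is a formal power series in $t$, and Lemma A.7 is what controls $f(Y)$ there. The function names are hypothetical.

```python
from fractions import Fraction

def f_nested(z, mu):
    """f(z) = 1 / (1 - mu * z / (1 - z)), the form produced by the bound on c'."""
    return 1 / (1 - mu * z / (1 - z))

def f_closed(z, mu):
    """f(z) = (1 - z) / (1 - (1 + mu) * z), the simplified form."""
    return (1 - z) / (1 - (1 + mu) * z)

# Sample points with (1 + mu) * z < 1, so that no denominator vanishes.
samples = [(Fraction(a, 40), Fraction(b, 4))
           for a in range(0, 10) for b in range(1, 9)
           if (1 + Fraction(b, 4)) * Fraction(a, 40) < 1]

identity_holds = all(f_nested(z, mu) == f_closed(z, mu) for z, mu in samples)
```

The closed form makes the location of the singularity explicit: $f$ stays bounded as long as $(1+\mu) z$ stays away from 1, which is the role played by the hypothesis $(1+\mu)(\Lambda + X_0) \le \tau_0 / M^j$.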
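Lemma XV.6. Let $g(\phi,\psi)$ be a sectorized Grassmann function. Let $C(k)$ be a function obeying $|C(k)| \le \frac{2}{|\imath k_0 - e(k)|}$ and $C(\xi,\xi')$, resp. $c((\xi,s),(\xi',s'))$, be the Fourier transforms of $C(k)$, resp. $\chi_s(k) C(k) \chi_{s'}(k)$, in the sense of Definition IX.3. Set
\[
g'(\phi,\psi) = g(\phi,\psi + CJ\phi) .
\]
If $|c|_{1,\Sigma} \le \mathrm{const}\, M^j + \sum_{\delta\neq 0} \infty\, t^\delta$, then
\[
N_j(g' - g;\alpha) \le \frac{\mathrm{const}}{\alpha}\, N_j(g;2\alpha) .
\]
In particular, this bound is true under the hypotheses of Theorem XV.3.

Proof. Let $\varphi \in F_m(n;\Sigma)$, $1 \le i \le n$ and set
\begin{align*}
\varphi'(\eta_1,\dots,\eta_{m+1};(\xi_1,s_1),\dots,(\xi_{n-1},s_{n-1})) = \mathrm{Ant}_{\mathrm{ext}} \sum_{s,t,t'\in\Sigma} \int d\zeta\, d\zeta'\; & \varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_{i-1},s_{i-1}),(\zeta',t),(\xi_i,s_i),\dots,(\xi_{n-1},s_{n-1})) \\
& \cdot c((\zeta',t'),(\zeta,s))\, J(\zeta,\eta_{m+1}) .
\end{align*}
Under the hypotheses of Theorem XV.3, $|c|_{1,\Sigma} \le \mathrm{const}\, M^j + \sum_{\delta\neq 0} \infty\, t^\delta$ by Proposition XIII.5(ii). Hence, by Lemma XII,
\[
|\varphi'|_{1,\Sigma} \le \mathrm{const}\, |\varphi|_{1,\Sigma} \begin{cases} \frac{M^j}{l} & \text{if } m = 0 \\[4pt] \frac{l}{M^j} & \text{if } m \neq 0 . \end{cases}
\]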
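Here we have used that the coefficient of $t^\delta$ in $|\varphi'|_{1,\Sigma}$ vanishes for $\delta \neq 0$ so that in Lemma XII we may replace $|c|_{1,\Sigma}$ by its value at $t = 0$. Hence, for $m = 0$,
\begin{align*}
|\varphi'|_\Sigma &= \rho_{1;n-1}\, \frac{l}{M^{2j}}\, |\varphi'|_{1,\Sigma} \le \mathrm{const}\, \rho_{1;n-1}\, \frac{1}{M^j}\, |\varphi|_{1,\Sigma} \\
&\le \mathrm{const}\, \frac{\rho_{1;n-1}}{\rho_{0;n}}\, \frac{1}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\, \sqrt{l M^j}\, \frac{1}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\; b\, |\varphi|_\Sigma
\end{align*}
and, for $m \neq 0$,
\begin{align*}
|\varphi'|_\Sigma &= \rho_{m+1;n-1}\, \frac{l}{M^{2j}}\, |\varphi'|_{1,\Sigma} \le \mathrm{const}\, \rho_{m+1;n-1}\, \frac{l^2}{M^{3j}}\, |\varphi|_{1,\Sigma} \\
&\le \mathrm{const}\, \frac{\rho_{m+1;n-1}}{\rho_{m;n}}\, \frac{l}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\, \frac{l}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\; b\, |\varphi|_\Sigma .
\end{align*}
The lemma now follows from $|\varphi'|_\Sigma \le \mathrm{const}\; b\, |\varphi|_\Sigma$ as Proposition VII.6 follows from the bound of Definition VII.4.

Proof of Theorem XV.3. For $\varphi \in F_m(n;\Sigma)$ set
\[
|\varphi|_{\mathrm{impr},\Sigma} = \rho_{m;n} \begin{cases} |\varphi|_{1,\Sigma} + \frac{1}{l}\, |\varphi|_{3,\Sigma} & \text{if } m = 0 \\[4pt] 0 & \text{if } m \neq 0 . \end{cases}
\]
This family of seminorms will only appear in this proof. By Lemma XV.5 and [3, Lemma VI.15], with $q = 5$, $J = l$ and $\|\cdot\|_p = |\cdot|_p$, the covariances $(C_\Sigma, D_\Sigma)$ have improved integration constants $c$, $b$, $l$ for the families $|\cdot|_\Sigma$ and $|\cdot|_{\mathrm{impr},\Sigma}$ of seminorms (in the sense of [3, Definition VI.1]). For a sectorized Grassmann function $v = \sum_{m,n} v_{m,n}$ with $v_{m,n} \in A_m \otimes \bigwedge^n V_\Sigma$ let
\begin{align*}
N(v;\alpha) &= \frac{1}{b^2}\, c \sum_{m,n} \alpha^n b^n\, |v_{m,n}|_\Sigma \\
N_{\mathrm{impr}}(v;\alpha) &= \frac{1}{b^2}\, c \sum_{n} \alpha^n b^n\, |v_{0,n}|_{\mathrm{impr},\Sigma}
\end{align*}
be the quantities introduced in [1, Definition II.23] and just after [3, Lemma VI.2]. Then
\[
N(v;\alpha) = \frac{\mathrm{const}_1}{B}\, N_j(v;\alpha;X,\Sigma,\rho)
\]
where $\mathrm{const}_1$ is the constant of Lemma XV.5. Set
\[
{:}w''{:}_{\psi,D_\Sigma} = \Omega_{C_\Sigma}\big( {:}w{:}_{\psi,C_\Sigma+D_\Sigma} \big)
\]
and
\[
w' = \frac{1}{2}\, \phi J C J \phi + w''(\phi,\psi + CJ\phi) .
\]
By parts (ii) and (iii) of Proposition XII, ${:}w'{:}_{\psi,D_\Sigma}$ is a sectorized representative for ${:}W'(\phi,\psi){:}_{\psi,D}$. Hence, by Proposition XII(i) and [1, Proposition A.2(ii)], $w'$ is a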
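sectorized representative for $W'$. We apply [3, Theorem VI.6] to get estimates on $w''$. With $\mathrm{const}_0 = \frac{B}{8\, \mathrm{const}_1}$ the hypotheses of this theorem are fulfilled. Consequently
\[
N(w'' - w;\alpha) \le \frac{1}{2\alpha^2}\, \frac{N(w;32\alpha)^2}{1 - \frac{1}{\alpha^2} N(w;32\alpha)} \tag{XV.3}
\]
and
\begin{align*}
\alpha^2 c\, |w''_{0,2}|_{\mathrm{impr},\Sigma} &\le \frac{2^{10}\, l}{\alpha^6}\, \frac{N(w;64\alpha)^2}{1 - \frac{8}{\alpha} N(w;64\alpha)} \\
\alpha^4 b^2 c\, \Big| w''_{0,4} - w_{0,4} - \frac{1}{4} \sum_{\ell=1}^{\infty} (-1)^\ell (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(w_{0,4};c_\Sigma,d_\Sigma) \Big|_{\mathrm{impr},\Sigma} &\le \frac{2^{10}\, l}{\alpha^6}\, \frac{N(w;64\alpha)^2}{1 - \frac{8}{\alpha} N(w;64\alpha)} .
\end{align*}
For the last estimate, we also used the description of ladders in terms of kernels of [3, Proposition C.4]. As $w'_{0,2} = w''_{0,2}$ and $w'_{0,4} = w''_{0,4}$ this implies that
\[
e_j(X)\, |w'_{0,2}|_{1,\Sigma} \le \frac{\mathrm{const}}{\alpha^8}\, \frac{l}{\rho_{0;2}\, M^j}\, \frac{N_j(w;64\alpha)^2}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha)}
\]
and, using Lemma XIV.5,
\begin{align*}
e_j(X)\, \Big| w'_{0,4} - w_{0,4} - \frac{1}{4} \sum_{\ell=1}^{\infty} (-1)^\ell (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(w_{0,4};C,D) \Big|_{3,\Sigma}
&\le \frac{1}{\rho_{0;4}}\, l\, e_j(X)\, \Big| w''_{0,4} - w_{0,4} - \frac{1}{4} \sum_{\ell=1}^{\infty} (-1)^\ell (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(w_{0,4};c_\Sigma,d_\Sigma) \Big|_{\mathrm{impr},\Sigma} \\
&\le \frac{\mathrm{const}}{\alpha^{10}}\, \frac{l}{\rho_{0;4}}\, \frac{N_j(w;64\alpha)^2}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha)} .
\end{align*}
By Lemma XV.6,
\begin{align*}
N_j\Big( w' - \tfrac{1}{2} \phi J C J \phi - w;\ \alpha \Big) &= N_j\big( w''(\phi,\psi + CJ\phi) - w(\phi,\psi);\ \alpha \big) \\
&\le N_j\big( w''(\phi,\psi + CJ\phi) - w''(\phi,\psi);\ \alpha \big) + N_j\big( w''(\phi,\psi) - w(\phi,\psi);\ \alpha \big) \\
&\le \frac{\mathrm{const}}{\alpha}\, N_j(w'';2\alpha) + N_j(w'' - w;\alpha) \\
&\le \frac{\mathrm{const}}{\alpha}\, N_j(w;2\alpha) + \Big( 1 + \frac{\mathrm{const}}{\alpha} \Big) \frac{B}{\mathrm{const}_1}\, N(w'' - w;2\alpha)
\end{align*}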
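\begin{align*}
&\le \frac{\mathrm{const}}{\alpha}\, N_j(w;2\alpha) + \Big( 1 + \frac{\mathrm{const}}{\alpha} \Big) \frac{B}{\mathrm{const}_1}\, \frac{1}{8\alpha^2}\, \frac{N(w;64\alpha)^2}{1 - \frac{1}{4\alpha^2} N(w;64\alpha)} \\
&\le \frac{\mathrm{const}}{\alpha}\, N_j(w;2\alpha) + \frac{\mathrm{const}}{\alpha^2}\, \frac{N_j(w;64\alpha)^2}{1 - \frac{\mathrm{const}}{\alpha^2} N_j(w;64\alpha)} \\
&\le \frac{\mathrm{const}}{\alpha}\, N_j(w;64\alpha) + \frac{\mathrm{const}}{\alpha^2}\, \frac{N_j(w;64\alpha)^2}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha)} \\
&\le \frac{\mathrm{const}}{\alpha}\, \frac{N_j(w;64\alpha)}{1 - \frac{\mathrm{const}}{\alpha} N_j(w;64\alpha)} .
\end{align*}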
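The last step of the chain above is an instance of the elementary identity $x + \frac{x^2}{1-x} = \frac{x}{1-x}$. The sketch below checks it in exact arithmetic, with the scalar $x$ standing in for $\frac{\mathrm{const}}{\alpha} N_j(w;64\alpha)$. This is an illustrative simplification: in the text the norms are power series in $t$, and the value of const is allowed to change from line to line. The function names are hypothetical.

```python
from fractions import Fraction

def two_term_bound(x):
    """x + x^2 / (1 - x): the linear term plus the quadratic correction."""
    return x + x * x / (1 - x)

def resummed_bound(x):
    """x / (1 - x): the single fraction that appears in the final estimate."""
    return x / (1 - x)

# Rational sample points in [0, 1), where both expressions are defined.
samples = [Fraction(k, 20) for k in range(0, 20)]
identity_holds = all(two_term_bound(x) == resummed_bound(x) for x in samples)
```

In other words, once both terms carry the same geometric denominator, the linear and quadratic contributions recombine into a single geometric-series bound.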
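We also wish to allow the functions $u$ and $v$ of Theorem XV.3 to depend on a parameter $\kappa$.

Theorem XV.7. There are constants const, $\mathrm{const}_0$, $\alpha_0$, $\tau_0$ that are independent of $j$, $\Sigma$, $\rho$ such that for all $\varepsilon > 0$ and $\alpha \ge \alpha_0$ the following estimates hold: Let, for $\kappa$ in a neighborhood of zero, $u_\kappa, v_\kappa \in F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms satisfy $|\check u_0(k)|, |\check v_0(k)| \le \frac{1}{2} |\imath k_0 - e(k)|$ and $\big| \frac{d}{d\kappa} \check v_\kappa(k) |_{\kappa=0} \big| \le \varepsilon\, |\imath k_0 - e(k)|$. Furthermore, let $X, Y \in N_{d+1}$, $\mu, \Lambda > 0$ and assume that
\[
|u_0|_{1,\Sigma} \le \mu (\Lambda + X)\, e_j(X) , \qquad \Big| \frac{d}{d\kappa} u_\kappa \Big|_{\kappa=0} \Big|_{1,\Sigma} \le e_j(X)\, Y
\]
and $(1+\mu)(\Lambda + X_0) \le \frac{\tau_0}{M^j}$. Set
\[
C_\kappa(k) = \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u_\kappa(k)} , \qquad D_\kappa(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v_\kappa(k)}
\]
and let $C_\kappa(\xi,\xi')$, $D_\kappa(\xi,\xi')$ be the Fourier transforms of $C_\kappa(k)$, $D_\kappa(k)$. Let, for $\kappa$ in a neighborhood of zero, $W_\kappa(\phi,\psi)$ be an even Grassmann function and set
\[
{:}W'_\kappa(\psi){:}_{\psi,D_\kappa} = \tilde\Omega_{C_\kappa}\big( {:}W_\kappa{:}_{\psi,C_\kappa+D_\kappa} \big) .
\]
Assume that $W_\kappa$ has a sectorized representative $w_\kappa$ with
\[
n \equiv N_j(w_0;64\alpha;X,\Sigma,\rho) \le \mathrm{const}_0\, \alpha + \sum_{\delta\neq 0} \infty\, t^\delta .
\]
Then $W'_\kappa$ has a sectorized representative $w'_\kappa$ such that
\begin{align*}
N_j\Big( \frac{d}{d\kappa} \Big[ w'_\kappa - \tfrac{1}{2} \phi J C_\kappa J \phi - w_\kappa \Big]_{\kappa=0};\ \alpha;X,\Sigma,\rho \Big)
\le {} & \mathrm{const}\, \Big( \frac{1}{\alpha} + \frac{1}{\alpha^2}\, \frac{n}{1 - \frac{\mathrm{const}}{\alpha^2} n} \Big)\, N_j\Big( \frac{d}{d\kappa} w_\kappa \Big|_{\kappa=0};\ 16\alpha;X,\Sigma,\rho \Big) \\
& + \mathrm{const}\, \frac{n}{1 - \frac{\mathrm{const}}{\alpha^2} n}\, \Big( \Big( \frac{1}{\alpha} + \frac{n}{\alpha^2} \Big) M^j Y + \frac{\varepsilon}{\alpha^2}\, n \Big) .
\end{align*}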
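Lemma XV.8. Under the hypotheses of Theorem XV.7, there exists a constant $\mathrm{const}_2$ that is independent of $j$ and $\Sigma$ such that $C_{0,\Sigma}$ has contraction bound $c$, $C_{0,\Sigma}$ and $D_{0,\Sigma}$ have integral bound $\frac{1}{2} b$ and
\begin{align*}
\frac{d}{d\kappa} C_{\kappa,\Sigma} \Big|_{\kappa=0} &\text{ has contraction bound } c' = \mathrm{const}_2\, M^{2j} e_j(X)\, Y \\
\frac{d}{d\kappa} D_{\kappa,\Sigma} \Big|_{\kappa=0} &\text{ has integral bound } \tfrac{1}{2}\, b' = \sqrt{\varepsilon}\; b
\end{align*}
for the family $|\cdot|_\Sigma$ of symmetric seminorms.

Proof. The contraction and integral bounds on $C_{0,\Sigma}$ and $D_{0,\Sigma}$ were proven in Lemma XV.5. Clearly, the function
\[
\frac{d}{d\kappa} D_\kappa(k) = \frac{d}{d\kappa}\, \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v_\kappa(k)} = \frac{\nu^{(\ge j+1)}(k)}{[\imath k_0 - e(k) - \check v_\kappa(k)]^2}\, \frac{d}{d\kappa} \check v_\kappa(k)
\]
is supported on the $j$th neighborhood and obeys $\big| \frac{d}{d\kappa} D_\kappa(k) |_{\kappa=0} \big| \le \frac{4\varepsilon}{|\imath k_0 - e(k)|}$. By Proposition XII.16(ii), $2 \sqrt{4 B_1 \varepsilon\, \frac{l}{M^j}} \le \sqrt{\varepsilon}\; b$ is an integral bound for $\frac{d}{d\kappa} D_{\kappa,\Sigma} |_{\kappa=0}$. Set
\[
c'' = 9 \max\Big\{ \Big| \frac{d}{d\kappa} c_\kappa \Big|_{\kappa=0} \Big|_{1,\Sigma},\ \frac{M^{2j}}{l} \sup_{\xi,\xi',s,s'} \Big| \frac{d}{d\kappa} c_\kappa((\xi,s),(\xi',s')) \Big|_{\kappa=0} \Big| \Big\} .
\]
By Proposition XII.16(i) and the second property of $\rho$, $\big( \frac{d}{d\kappa} c_\kappa |_{\kappa=0} \big)_\Sigma$ has contraction bound $c''$. By Lemma XIII.6
\begin{align*}
c'' &\le \mathrm{const}\, \max\Big\{ M^j c_j\, \frac{M^j \big| \frac{d}{d\kappa} u_\kappa |_{\kappa=0} \big|_{1,\Sigma}}{1 - M^j |u_0|_{1,\Sigma}},\ M^{2j} \Big| \frac{d}{d\kappa} u_\kappa \Big|_{\kappa=0} \Big|_{1,\Sigma} \Big\} \\
&\le \mathrm{const}\, M^j c_j\, \frac{M^j e_j(X)\, Y}{1 - M^j \mu (\Lambda + X)\, e_j(X)} \\
&\le \mathrm{const}\, M^{2j} c_j\, \frac{\frac{1}{1 - M^j X}}{1 - M^j \mu (\Lambda + X) \frac{c_j}{1 - M^j X}}\; Y \\
&\le \mathrm{const}\, M^{2j} c_j\, Y\, \frac{\frac{1}{1 - M^j c_j (\Lambda + X)}}{1 - \mu\, \frac{M^j c_j (\Lambda + X)}{1 - M^j c_j (\Lambda + X)}} \\
&= \mathrm{const}\, M^{2j} c_j\, Y\, f(Z) \tag{XV.4}
\end{align*}
where $Z = M^j c_j (\Lambda + X)$ and
\[
f(z) = \frac{\frac{1}{1-z}}{1 - \mu \frac{z}{1-z}} = \frac{1}{1 - (1+\mu) z} .
\]
By Lemma A.7, $f(Z) \le \mathrm{const}\, \frac{1}{1-Z}$ so that
\[
c'' \le \mathrm{const}\, M^{2j} e_j(X)\, Y
\]
as in Lemma XV.5.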
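Lemma XV.9. Let $g(\phi,\psi)$ be a sectorized Grassmann function and set
\[
g'_\kappa(\phi,\psi) = g(\phi,\psi + C_\kappa J \phi) .
\]
Under the hypotheses of Theorem XV.7,
\[
N_j\Big( \frac{d}{d\kappa} g'_\kappa \Big|_{\kappa=0};\ \alpha;X,\Sigma,\rho \Big) \le \frac{\mathrm{const}}{\alpha}\, M^j Y_0\, N_j(g;2\alpha;X,\Sigma,\rho) .
\]

Proof. Define
\[
\tilde c_z = c_0 + z\, \frac{d}{d\kappa} c_\kappa \Big|_{\kappa=0}
\]
and
\[
\tilde g_z(\phi,\psi) = g\Big( \phi,\ \psi + \Big[ C_0 + z\, \frac{d}{d\kappa} C_\kappa \Big|_{\kappa=0} \Big] J \phi \Big) .
\]
Let $\varphi \in F_m(n;\Sigma)$, $1 \le i \le n$ and set
\begin{align*}
\varphi'_z(\eta_1,\dots,\eta_{m+1};(\xi_1,s_1),\dots,(\xi_{n-1},s_{n-1})) = \mathrm{Ant}_{\mathrm{ext}} \sum_{s,t,t'\in\Sigma} \int d\zeta\, d\zeta'\; & \varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_{i-1},s_{i-1}),(\zeta',t),(\xi_i,s_i),\dots,(\xi_{n-1},s_{n-1})) \\
& \cdot \tilde c_z((\zeta',t'),(\zeta,s))\, J(\zeta,\eta_{m+1}) .
\end{align*}
By Lemma XIII.6(i) and (XV.4),
\[
\Big| \frac{d}{d\kappa} c_\kappa \Big|_{\kappa=0} \Big|_{1,\Sigma} \Big|_{t=0} \le \mathrm{const}\, M^{2j} Y_0
\]
so that, using the bound on $|c_0|_{1,\Sigma}$ that was derived in Lemma XV.5,
\[
|\tilde c_z|_{1,\Sigma} \big|_{t=0} \le \mathrm{const}\, M^j
\]
for all $|z| \le \frac{1}{M^j Y_0}$. Consequently, for all $|z| \le \frac{1}{M^j Y_0}$, Lemma XII yields
\[
|\varphi'_z|_{1,\Sigma} \le \mathrm{const}\, |\varphi|_{1,\Sigma} \begin{cases} \frac{M^j}{l} & \text{if } m = 0 \\[4pt] \frac{l}{M^j} & \text{if } m \neq 0 \end{cases}
\]
so that, for $m = 0$,
\begin{align*}
|\varphi'_z|_\Sigma &= \rho_{1;n-1}\, \frac{l}{M^{2j}}\, |\varphi'_z|_{1,\Sigma} \le \mathrm{const}\, \rho_{1;n-1}\, \frac{1}{M^j}\, |\varphi|_{1,\Sigma} \\
&\le \mathrm{const}\, \frac{\rho_{1;n-1}}{\rho_{0;n}}\, \frac{1}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\, \sqrt{l M^j}\, \frac{1}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\; b\, |\varphi|_\Sigma
\end{align*}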
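and, for $m \neq 0$,
\begin{align*}
|\varphi'_z|_\Sigma &= \rho_{m+1;n-1}\, \frac{l}{M^{2j}}\, |\varphi'_z|_{1,\Sigma} \le \mathrm{const}\, \rho_{m+1;n-1}\, \frac{l^2}{M^{3j}}\, |\varphi|_{1,\Sigma} \\
&\le \mathrm{const}\, \frac{\rho_{m+1;n-1}}{\rho_{m;n}}\, \frac{l}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\, \frac{l}{M^j}\, |\varphi|_\Sigma \le \mathrm{const}\; b\, |\varphi|_\Sigma .
\end{align*}
Hence, as in Lemma XV.6,
\[
N_j(\tilde g_z - g;\alpha;X,\Sigma,\rho) \le \frac{\mathrm{const}}{\alpha}\, N_j(g,2\alpha;X,\Sigma,\rho)
\]
for all $|z| \le \frac{1}{M^j Y_0}$ and, by the Cauchy integral theorem,
\begin{align*}
N_j\Big( \frac{d}{d\kappa} g'_\kappa \Big|_{\kappa=0};\ \alpha;X,\Sigma,\rho \Big) &= N_j\Big( \frac{d}{dz} [\tilde g_z - g]_{z=0};\ \alpha;X,\Sigma,\rho \Big) \\
&\le \frac{\mathrm{const}}{\alpha}\, M^j Y_0\, N_j(g,2\alpha;X,\Sigma,\rho) .
\end{align*}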
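The Cauchy integral step used here converts a uniform bound on $\tilde g_z - g$ for $|z| \le R$ into a bound on the derivative at $z = 0$ that is larger by the factor $1/R$. The sketch below illustrates this mechanism for an ordinary scalar polynomial. This is an assumption made purely for illustration: in the proof the role of $g$ is played by a Grassmann function measured in the norm $N_j$, and $R = 1/(M^j Y_0)$. The helper names and the sample polynomial are hypothetical.

```python
import cmath

def circle_points(radius, samples=64):
    """Equally spaced points on the circle |z| = radius."""
    return [radius * cmath.exp(2j * cmath.pi * k / samples) for k in range(samples)]

def cauchy_derivative_at_zero(g, radius, samples=64):
    """Discretized Cauchy formula g'(0) = (1/(2*pi*i)) * integral of g(z)/z^2 dz.

    On the circle dz = i*z*dtheta, so the integrand reduces to g(z)/z and the
    integral becomes an average over the sample points.
    """
    return sum(g(z) / z for z in circle_points(radius, samples)) / samples

def sup_deviation(g, radius, samples=64):
    """Maximum of |g(z) - g(0)| over the sample points on |z| = radius."""
    return max(abs(g(z) - g(0)) for z in circle_points(radius, samples))

def g(z):
    """Sample polynomial with g'(0) = 3 (an arbitrary illustrative choice)."""
    return 1 + 3 * z - 2 * z ** 2 + 0.5 * z ** 3

R = 0.25
derivative = cauchy_derivative_at_zero(g, R)
cauchy_bound = sup_deviation(g, R) / R
```

For a polynomial of degree below the number of sample points the discrete average reproduces $g'(0)$ up to rounding, and the estimate $|g'(0)| \le \sup_{|z|=R} |g(z) - g(0)| / R$ is the scalar analogue of the factor $M^j Y_0$ gained in the lemma.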
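Proof of Theorem XV.7. As in the proof of Theorem XV.3, let, for a sectorized Grassmann function $v = \sum_{m,n} v_{m,n}$ with $v_{m,n} \in A_m \otimes \bigwedge^n V_\Sigma$,
\[
N(v;\alpha) = \frac{1}{b^2}\, c \sum_{m,n} \alpha^n b^n\, |v_{m,n}|_\Sigma = \frac{\mathrm{const}_1}{B}\, N_j(v;\alpha;X,\Sigma,\rho)
\]
and
\[
{:}w''_\kappa{:}_{\psi,D_{\kappa,\Sigma}} = \Omega_{C_{\kappa,\Sigma}}\big( {:}w_\kappa{:}_{\psi,C_{\kappa,\Sigma}+D_{\kappa,\Sigma}} \big) .
\]
By Proposition XII, parts (ii) and (iii), and [1, Proposition A.2(ii)],
\[
w'_\kappa = \frac{1}{2}\, \phi J C_\kappa J \phi + w''_\kappa(\phi,\psi + C_\kappa J \phi)
\]
is a sectorized representative for $W'_\kappa$. By the chain rule and the triangle inequality
\begin{align*}
N\Big( \frac{d}{d\kappa} \Big[ w'_\kappa - \tfrac{1}{2} \phi J C_\kappa J \phi - w_\kappa \Big]_{\kappa=0};\ \alpha \Big)
\le {} & N\Big( \frac{d}{d\kappa}\, w''_0(\phi,\psi + C_\kappa J \phi) \Big|_{\kappa=0};\ \alpha \Big) \\
& + N\Big( \frac{d}{d\kappa} \big[ w''_\kappa(\phi,\psi + C_0 J \phi) - w''_\kappa(\phi,\psi) \big]_{\kappa=0};\ \alpha \Big) \\
& + N\Big( \frac{d}{d\kappa} \big[ w''_\kappa(\phi,\psi) - w_\kappa(\phi,\psi) \big]_{\kappa=0};\ \alpha \Big) . \tag{XV.5}
\end{align*}
By Lemma XV.9,
\[
N\Big( \frac{d}{d\kappa}\, w''_0(\phi,\psi + C_\kappa J \phi) \Big|_{\kappa=0};\ \alpha \Big) \le \frac{\mathrm{const}}{\alpha}\, M^j Y_0\, N_j(w''_0;2\alpha;X,\Sigma,\rho) .
\]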
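By (XV.3),
\begin{align*}
N_j(w''_0;2\alpha;X,\Sigma,\rho) &\le N_j(w_0;2\alpha;X,\Sigma,\rho) + N_j(w''_0 - w_0;2\alpha;X,\Sigma,\rho) \\
&\le N_j(w_0;2\alpha;X,\Sigma,\rho) + \frac{B}{\mathrm{const}_1}\, \frac{1}{8\alpha^2}\, \frac{N(w_0;64\alpha)^2}{1 - \frac{1}{4\alpha^2} N(w_0;64\alpha)} \\
&\le N_j(w_0;64\alpha;X,\Sigma,\rho) + \frac{\mathrm{const}}{\alpha^2}\, \frac{N_j(w_0;64\alpha;X,\Sigma,\rho)^2}{1 - \frac{\mathrm{const}}{\alpha^2} N_j(w_0;64\alpha;X,\Sigma,\rho)} \\
&\le \mathrm{const}\, \frac{N_j(w_0;64\alpha;X,\Sigma,\rho)}{1 - \frac{\mathrm{const}}{\alpha^2} N_j(w_0;64\alpha;X,\Sigma,\rho)}
\end{align*}
so that
\[
N\Big( \frac{d}{d\kappa}\, w''_0(\phi,\psi + C_\kappa J \phi) \Big|_{\kappa=0};\ \alpha \Big) \le \mathrm{const}\, \frac{M^j Y_0}{\alpha}\, \frac{n}{1 - \frac{\mathrm{const}}{\alpha^2} n} . \tag{XV.6}
\]
By Lemma XV.6, with $g = \frac{d}{d\kappa} w''_\kappa |_{\kappa=0}$,
\begin{align*}
N\Big( \frac{d}{d\kappa} \big[ w''_\kappa(\phi,\psi + C_0 J \phi) - w''_\kappa(\phi,\psi) \big]_{\kappa=0};\ \alpha \Big)
&\le \frac{\mathrm{const}}{\alpha}\, N_j\Big( \frac{d}{d\kappa} w''_\kappa \Big|_{\kappa=0};\ 2\alpha;X,\Sigma,\rho \Big) \\
&\le \frac{\mathrm{const}}{\alpha}\, N_j\Big( \frac{d}{d\kappa} w_\kappa \Big|_{\kappa=0};\ 2\alpha;X,\Sigma,\rho \Big) + \frac{\mathrm{const}}{\alpha}\, N_j\Big( \frac{d}{d\kappa} [w''_\kappa - w_\kappa]_{\kappa=0};\ 2\alpha;X,\Sigma,\rho \Big) . \tag{XV.7}
\end{align*}
By [1, Theorem IV.4], with $\mu = \frac{1}{M^j}$,
\begin{align*}
N\Big( \frac{d}{d\kappa} [w''_\kappa - w_\kappa]_{\kappa=0};\ \alpha \Big)
\le {} & \frac{1}{2\alpha^2}\, \frac{N(w_0;32\alpha)}{1 - \frac{1}{\alpha^2} N(w_0;32\alpha)}\, N\Big( \frac{d}{d\kappa} w_\kappa \Big|_{\kappa=0};\ 8\alpha \Big) \\
& + \frac{1}{2\alpha^2}\, \frac{N(w_0;32\alpha)^2}{1 - \frac{1}{\alpha^2} N(w_0;32\alpha)}\, \Big\{ \frac{1}{4M^j}\, \mathrm{const}_2\, M^{2j} e_j(X)\, Y + 4\varepsilon \Big\} \\
\le {} & \frac{\mathrm{const}}{\alpha^2}\, \frac{n}{1 - \frac{\mathrm{const}}{\alpha^2} n}\, \Big\{ N\Big( \frac{d}{d\kappa} w_\kappa \Big|_{\kappa=0};\ 8\alpha \Big) + M^j Y n + 4\varepsilon n \Big\} \tag{XV.8}
\end{align*}
since $e_j(X)\, N(w_0;32\alpha) \le \mathrm{const}\, N(w_0;32\alpha)$. Also
\[
N_j\Big( \frac{d}{d\kappa} [w''_\kappa - w_\kappa]_{\kappa=0};\ 2\alpha;X,\Sigma,\rho \Big) \le \frac{\mathrm{const}}{\alpha^2}\, \frac{n}{1 - \frac{\mathrm{const}}{\alpha^2} n}\, \Big\{ N\Big( \frac{d}{d\kappa} w_\kappa \Big|_{\kappa=0};\ 16\alpha \Big) + M^j Y n + 4\varepsilon n \Big\} .
\]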
Substituting (XV.6)–(XV.8) into (XV.5), d 1 ;α wκ0 − φJCκ Jφ − wκ N dκ 2 κ=0 ≤
const const n j N const M Y0 + α 1 − α2 n α +
const n α2 1 − const α2 n
≤ const
+ const
N
1−
const α2 n
d ; 2α wκ dκ κ=0
d wκ ; 16α + M j Y n + 4εn dκ κ=0
1 n 1 + α α2 1 − const α2 n n
Nj
n 1 + 2 α α
d wκ ; 16α; X, Σ, ρ dκ κ=0
ε M Y + 2n . α j
We must also control the pure $\phi$ contributions in a situation similar to that of Theorem XV.3.

Proposition XV.10. There are constants $\alpha_0$ and $\tau_0$ that are independent of $j$, $\Sigma$, $\rho$ such that for all $\alpha \ge \alpha_0$ the following estimates hold: Let $u((\xi,s),(\xi',s'))$, $v((\xi,s),(\xi',s')) \in \mathcal F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms obey $|\check u(k)|, |\check v(k)| \le \frac12 |\imath k_0 - e(k)|$. Furthermore, let $X \in \mathfrak N_{d+1}$ and assume that
$$X,\ |u|_{1,\Sigma} \le \frac{\tau_0}{M^j} + \sum_{\delta\neq0}\infty\, t^\delta .$$
Let $\bar\jmath$ be a real number in $(j+1, j+2]$ and set
$$S(k) = \frac{\nu^{(\ge j+1)}(k) - \nu^{(\ge \bar\jmath)}(k)}{\imath k_0 - e(k) - \check u(k)}, \qquad D(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v(k)}$$
and let $S(\xi,\xi')$, $D(\xi,\xi')$ be the Fourier transforms of $S(k)$, $D(k)$ as in Definition IX.3. Let $\mathcal W(\phi,\psi)$ be a Grassmann function obeying $\mathcal W(\phi,0) = 0$ and set
$$\mathcal G(\phi) = \tilde\Omega_S\big(:\!\mathcal W(\phi,\psi)\!:_{\psi,D}\big)(\phi,0) .$$
Assume that $\mathcal W$ has a sectorized representative $w$ with
$$N_j(w;\alpha;X,\Sigma,\rho) \le 2 + \sum_{\delta\neq0}\infty\, t^\delta .$$
Write
$$\mathcal G(\phi) - \tfrac12 \phi J S J \phi = \sum_m \int d\eta_1\cdots d\eta_m\ G_m(\eta_1,\ldots,\eta_m)\,\phi(\eta_1)\cdots\phi(\eta_m)$$
with $G_m$ antisymmetric. Then
$$\sum_{m>0} \rho_{m;0}\,|\!|\!|G_m|\!|\!|_\infty \le 10 .$$
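Proof. We use the notation of the proof of Theorem XV.3. As in Lemma XV.5, $c = \text{const}_1 M^j + \sum_{\delta\neq0}\infty\, t^\delta$ is a contraction bound for $S_\Sigma$ and $b = \sqrt{\frac{B\mathfrak l}{M^j}}$ is an integral bound for both $S_\Sigma$ and $D_\Sigma$ (in the sense of [1, Definition II.25]). Write
$$:\!w(\phi,\psi)\!:_{\psi,D_\Sigma} = :\!\tilde w(\phi,\psi)\!:_{\psi,S_\Sigma}, \qquad w''(\phi,\psi) = \Omega_{S_\Sigma}\big(:\!w(\phi,\psi)\!:_{\psi,D_\Sigma}\big) .$$
By Proposition XII
$$\mathcal G(\phi) = \tfrac12 \phi J S J \phi + w''(\phi, SJ\phi) .$$
By [1, Corollary II.32(ii)]
$$N_j\big(\tilde w;\tfrac{\alpha}{2};X,\Sigma,\rho\big) \le 2 N_j(w;\alpha;X,\Sigma,\rho) .$$
By [1, Theorem II.28], with $\alpha$ replaced by $\frac{\alpha}{16}$,
$$\begin{aligned}
N_j\big(w''(\phi,\psi);\tfrac{\alpha}{16};X,\Sigma,\rho\big) &\le N_j\big(\tilde w;\tfrac{\alpha}{16};X,\Sigma,\rho\big) + \frac{2^9}{\alpha^2}\,\frac{N_j(\tilde w;\frac12\alpha;X,\Sigma,\rho)^2}{1 - \frac{2^{10}}{\alpha^2} N_j(\tilde w;\frac12\alpha;X,\Sigma,\rho)}\\
&\le 5 + \sum_{\delta\neq0}\infty\, t^\delta .
\end{aligned}$$
By Lemma XV.6
$$N_j\big(w''(\phi,\psi + SJ\phi);\tfrac{\alpha}{32};X,\Sigma,\rho\big) \le 2 N_j\big(w''(\phi,\psi);\tfrac{\alpha}{16};X,\Sigma,\rho\big) \le 10 + \sum_{\delta\neq0}\infty\, t^\delta$$
so that
$$\begin{aligned}
e_j(X) \sum_{m>0} \rho_{m;0}\,|\!|\!|G_m|\!|\!|_\infty &= N_j\big(\mathcal G(\phi) - \tfrac12\phi JSJ\phi;\tfrac{\alpha}{32};X,\Sigma,\rho\big)\\
&\le N_j\big(w''(\phi,\psi + SJ\phi);\tfrac{\alpha}{32};X,\Sigma,\rho\big)\\
&\le 10 + \sum_{\delta\neq0}\infty\, t^\delta .
\end{aligned}$$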
Remark XV.11. In Theorem XV.3, the sectorized representative $w'$ of $\mathcal W'$ may be obtained from the sectorized representative $w$ of $\mathcal W$ by
$$:\!w'\!:_{\psi,D_\Sigma} = \tfrac12 \phi J C J \phi + \Omega_{C_\Sigma}\big(:\!w\!:_{\psi,C_\Sigma+D_\Sigma}\big)(\phi,\psi + CJ\phi) .$$
Again, $w'$ is initially defined as a formal Taylor series in $w$. By [1, Remark IV.3] and the observation that, as in Proposition IV.11(i), $C_\Sigma$ and $D_\Sigma$ are analytic functions of $u$ and $v$, respectively, this formal Taylor series converges to a function that is
jointly analytic in $w$, $u$ and $v$. By the functoriality Proposition XII.8, if $w_1$ and $w_2$ are two sectorized representatives of $\mathcal W$, then the corresponding $w'_1$ and $w'_2$ represent the same unsectorized Grassmann function $\mathcal W'$. In this way one sees that the formal Taylor series for $\mathcal W'$ converges. The obvious analogs of these statements apply to Theorem XV.7 and Proposition XV.10.

XVI. Sectorized Momentum Space Norms

Again, let $\Sigma$ be a sectorization of length $\frac{1}{M^{j-3/2}} \le \mathfrak l \le \frac{1}{M^{(j-1)/2}}$ at scale $j \ge 2$. In Sec. XV we described the renormalization group map $\tilde\Omega_C$ using the algebra $\bigwedge_A V_\Sigma$, where $V_\Sigma$ is the vector space generated by $\psi(\xi,s)$, $\xi \in \mathcal B$, $s \in \Sigma$ (see Definition XII.6) and $A$ is the Grassmann algebra in the external fields $\phi(\eta)$, $\eta \in \mathcal B$. To deal with amputated Green's functions in momentum space, we set for $\check\eta = (k,\sigma,a) \in \check{\mathcal B}$
$$\check\phi(\check\eta) = \int d^{d+1}x\ e^{-(-1)^a \imath\langle k,x\rangle_-}\,\phi(x,\sigma,a)$$
and denote by $V_{\rm ext}$ the vector space generated by $\check\phi(\check\eta)$, $\check\eta \in \check{\mathcal B}$. Furthermore set
$$\tilde V = V_{\rm ext} \oplus V_\Sigma .$$
Then $\bigwedge_A V_\Sigma$ is canonically isomorphic to the Grassmann algebra $\bigwedge \tilde V$ over $\tilde V$ with complex coefficients. In terms of Grassmann functions, this isomorphism amounts to the following: a translation invariant sectorized Grassmann function $w$ can be uniquely written in the form
$$w(\phi,\psi) = \sum_{m,n}\ \sum_{s_1,\ldots,s_n\in\Sigma} \int d\eta_1\cdots d\eta_m\, d\xi_1\cdots d\xi_n\ w_{m,n}(\eta_1,\ldots,\eta_m\ (\xi_1,s_1),\ldots,(\xi_n,s_n))\ \phi(\eta_1)\cdots\phi(\eta_m)\,\psi((\xi_1,s_1))\cdots\psi((\xi_n,s_n))$$
with $w_{m,n}$ antisymmetric separately in the $\eta$ and in the $\xi$ variables. As well,
$$\begin{aligned}
w(\phi,\psi) &= \sum_m \int d\check\eta_1\cdots d\check\eta_m\ w^\sim_{m,0}(\check\eta_1,\ldots,\check\eta_m)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m)\,\check\phi(\check\eta_1)\cdots\check\phi(\check\eta_m)\\
&\quad + \sum_{\substack{m,n\\ n\ge1}}\ \sum_{s_1,\ldots,s_n\in\Sigma} \int d\check\eta_1\cdots d\check\eta_m\, d\xi_1\cdots d\xi_n\ w^\sim_{m,n}(\check\eta_1,\ldots,\check\eta_m\ (\xi_1,s_1),\ldots,(\xi_n,s_n))\\
&\qquad\qquad \cdot\ \check\phi(\check\eta_1)\cdots\check\phi(\check\eta_m)\,\psi((\xi_1,s_1))\cdots\psi((\xi_n,s_n)) .
\end{aligned}$$
Here $w^\sim_{m,n}$ is the partial Fourier transform of Definition IX.1. The basis elements of the vector space $\tilde V = V_{\rm ext} \oplus V_\Sigma$ are in one-to-one correspondence with the points of the disjoint union $\mathfrak X_\Sigma$ of $\check{\mathcal B}$ and $\mathcal B \times \Sigma$. To simplify notation, we make the
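Definition XVI.1. For $x \in \mathfrak X_\Sigma = \check{\mathcal B}\ \dot\cup\ (\mathcal B\times\Sigma)$ set
$$\Psi(x) = \begin{cases} \check\phi(\check\eta) & \text{if } x = \check\eta \in \check{\mathcal B}\\ \psi(\xi,s) & \text{if } x = (\xi,s) \in \mathcal B\times\Sigma . \end{cases}$$
The purpose of this section is to define and analyze norms on functions on $\mathfrak X^n_\Sigma$ to which the results of [1, 3] can be applied. First, we look at the structure of $\mathfrak X^n_\Sigma$ more carefully.

Definition XVI.2. Set $\mathfrak X_0 = \check{\mathcal B}$ and $\mathfrak X_1 = \mathcal B\times\Sigma$. Let $\vec\imath = (i_1,\ldots,i_n) \in \{0,1\}^n$.
(i) The inclusions of $\mathfrak X_{i_j}$, $j = 1,\ldots,n$, in $\mathfrak X_\Sigma$ induce an inclusion of $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$ in $\mathfrak X^n_\Sigma$. We identify $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$ with its image in $\mathfrak X^n_\Sigma$.
(ii) Set $m(\vec\imath) = n - (i_1+\cdots+i_n)$. Clearly, $m(\vec\imath)$ is the number of copies of $\check{\mathcal B}$ in $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$.
(iii) If $f$ is a function on $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$, then $\mathrm{Ord}\, f$ is the function on $\check{\mathcal B}^{m(\vec\imath)}\times(\mathcal B\times\Sigma)^{n-m(\vec\imath)}$ obtained from $f$ by shifting all of the $\check{\mathcal B}$ arguments before all of the $\mathcal B\times\Sigma$ arguments, while preserving the relative order of the $\check{\mathcal B}$ arguments and the relative order of the $\mathcal B\times\Sigma$ arguments, and multiplying by the sign of the permutation that implements the reordering of the arguments. That is,
$$\mathrm{Ord}\, f(x_1,\ldots,x_n) = \mathrm{sgn}\,\pi\ f(x_{\pi(1)},\ldots,x_{\pi(n)})$$
where the permutation $\pi \in S_n$ is determined by $\pi(j) < \pi(j')$ if $i_j < i_{j'}$ or $i_j = i_{j'}$, $j < j'$.

Remark XVI.3. Using the identification of Definition XVI.2(i),
$$\mathfrak X^n_\Sigma = \dot{\bigcup_{i_1,\ldots,i_n\in\{0,1\}}} \mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$$
where, on the right-hand side, we have a disjoint union. If $f$ is a function on $\mathfrak X^n_\Sigma$ and $\vec\imath = (i_1,\ldots,i_n) \in \{0,1\}^n$, we denote by $f|_{\vec\imath}$ the restriction of $f$ to $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$. To define norms for functions on $\mathfrak X^n_\Sigma$ it thus suffices to define norms for functions on each of the spaces $\mathfrak X_{i_1}\times\cdots\times\mathfrak X_{i_n}$. As we want these norms to be invariant under permutations, it suffices, using the map Ord of Definition XVI.2(iii), to define norms for functions on the spaces $\check{\mathcal B}^m\times(\mathcal B\times\Sigma)^{n-m}$.

Definition XVI.4. Let $p$ be a natural number.
(i) For a function $f$ on $\check{\mathcal B}^m$ we define
$$|f|^\sim_{p,\Sigma} = \begin{cases} \|f\|^\sim & \text{if } p = m-1,\ m = 2,4\\ 0 & \text{otherwise} \end{cases}$$
with $\|\cdot\|^\sim$ being the norm of Definition X.4.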
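(ii) For a translation invariant function $f$ on $\check{\mathcal B}^m\times(\mathcal B\times\Sigma)^n$ with $n \ge 1$, we set $|f|^\sim_{p,\Sigma} = 0$ when $p > m+n$ or $p < m$, and
$$|f|^\sim_{p,\Sigma} = \sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\ \sup_{1\le i_1<\cdots}\ \sum\ \frac{1}{\delta!}\ \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!|D f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta$$
when $m \le p \le m+n$. The norm $|\!|\!|\cdot|\!|\!|_{1,\infty}$ of Example II.6 refers to the variables $\xi_1,\ldots,\xi_n$.

Remark XVI.5. In the case $m = 0$ and $p$ odd, the norm $|\cdot|_{p,\Sigma}$ of Definition XII and the norm $|\cdot|^\sim_{p,\Sigma}$ of Definition XVI.4 agree.

Lemma XVI.6. Let $f$ be a translation invariant function on $\check{\mathcal B}^m\times(\mathcal B\times\Sigma)^n$, $f'$ a translation invariant function on $\check{\mathcal B}^{m'}\times(\mathcal B\times\Sigma)^{n'}$ and $1 \le i \le n$, $1 \le i' \le n'$. If $n \ge 2$ or $n' \ge 2$ define the function $g$ on $\check{\mathcal B}^{m+m'}\times(\mathcal B\times\Sigma)^{n+n'-2}$ by
$$\begin{aligned}
&g(\check\eta_1,\ldots,\check\eta_{m+m'};(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'}))\\
&\quad = \sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}} \int_{\mathcal B} d\zeta\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_{i+1},s_{i+1}),\ldots,(\xi_n,s_n))\\
&\qquad\qquad \cdot\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(\xi_{n+1},s_{n+1}),\ldots,(\xi_{n+i'-1},s_{n+i'-1}),(\zeta,s'),(\xi_{n+i'+1},s_{n+i'+1}),\ldots,(\xi_{n+n'},s_{n+n'})) .
\end{aligned}$$
If $n = n' = 1$, define the function $g$ on $\check{\mathcal B}^{m+m'}$ by
$$g(\check\eta_1,\ldots,\check\eta_{m+m'})\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+m'}) = \sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}} \int_{\mathcal B} d\zeta\ f(\check\eta_1,\ldots,\check\eta_m;(\zeta,s))\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(\zeta,s')) .$$
Then, for all natural numbers $p$,
$$|g|^\sim_{p,\Sigma} \le \begin{cases} 3 \max\limits_{\substack{p_1+p_2=p\\ m\le p_1<m+n\\ m'\le p_2<m'+n'}} \min\big\{|f|^\sim_{p_1,\Sigma}\,|f'|^\sim_{p_2+1,\Sigma},\ |f|^\sim_{p_1+1,\Sigma}\,|f'|^\sim_{p_2,\Sigma}\big\} & \text{if } (n,n') \neq (1,1)\\[2ex] 4\,|f|^\sim_{m,\Sigma}\,|f'|^\sim_{m',\Sigma} & \text{if } (n,n') = (1,1) . \end{cases}$$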
Proof. The case $n \ge 2$ or $n' \ge 2$. The proof is analogous to that of Lemma XII.14. The $\mathcal B\times\Sigma$ indices for $g$ lie in the set $I \cup I'$, where
$$I = \{1,\ldots,i-1,i+1,\ldots,n\}, \qquad I' = \{n+1,\ldots,n+i'-1,n+i'+1,\ldots,n+n'\} .$$
Let $q$ obey $0 \le q-m \le n-1$ and $0 \le p-q-m' \le n'-1$ or equivalently $m \le q < n+m$ and $m' \le p-q < m'+n'$. Fix $u_1,\ldots,u_{q-m} \in I$, $u_{q-m+1},\ldots,u_{p-m-m'} \in I'$ and fix sectors $s_{u_1},\ldots,s_{u_{p-m-m'}} \in \Sigma$. Let
$$F(\check\eta_1,\ldots,\check\eta_m;s_1,\ldots,s_n) = \sum_{\delta\in\mathbb N_0\times\mathbb N_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!|D f(\check\eta_1,\ldots,\check\eta_m;(\cdot,s_1),\ldots,(\cdot,s_n))|\!|\!|_{1,\infty}\ t^\delta$$
so that
$$|f|^\sim_{p,\Sigma} = \sup_{1\le i_1<\cdots}\ \sum\ F(\check\eta_1,\ldots,\check\eta_m;s_1,\ldots,s_n)$$
and define $G(\check\eta_1,\ldots,\check\eta_{m+m'};s_1,\ldots,\not{s}_i,\ldots,\not{s}_{n+i'},\ldots,s_{n+n'})$ and $F'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};s_{n+1},\ldots,s_{n+n'})$ similarly. By Remark X.7, for each choice of sectors $s_\nu$, $\nu \in I\cup I'$, one has
$$\begin{aligned}
G(\check\eta_1,\ldots,\check\eta_{m+m'};s_1,\ldots,\not{s}_i,\ldots,\not{s}_{n+i'},\ldots,s_{n+n'}) \le \sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}} &F(\check\eta_1,\ldots,\check\eta_m;s_1,\ldots,s_{i-1},s,s_{i+1},\ldots,s_n)\\
&\cdot\ F'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};s_{n+1},\ldots,s_{n+i'-1},s',\ldots,s_{n+n'}) .
\end{aligned}$$
Observe that for every $s \in \Sigma$ there are at most three sectors $s'$ such that $s'\cap s \neq \emptyset$. Consequently
$$\begin{aligned}
\sum_{\substack{s_\nu\in\Sigma\\ \nu\in I\cup I'\setminus\{u_1,\ldots,u_{p-m-m'}\}}} &G(\check\eta_1,\ldots,\check\eta_{m+m'};s_1,\ldots,\not{s}_i,\ldots,\not{s}_{n+i'},\ldots,s_{n+n'})\\
&\le 3 \sum_{\substack{s_\nu\in\Sigma\\ \nu\in I\setminus\{u_1,\ldots,u_{q-m}\}}}\ \sum_{s\in\Sigma} F(\check\eta_1,\ldots,\check\eta_m;s_1,\ldots,s_{i-1},s,s_{i+1},\ldots,s_n)\\
&\qquad \cdot\ \max_{s'\in\Sigma} \sum_{\substack{s'_\mu\in\Sigma\\ \mu\in I'\setminus\{u_{q-m+1},\ldots,u_{p-m-m'}\}}} F'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};s_{n+1},\ldots,s_{n+i'-1},s',\ldots,s_{n+n'})\\
&\le 3\,|f|^\sim_{q,\Sigma}\,|f'|^\sim_{p-q+1,\Sigma} .
\end{aligned}$$
Taking the supremum over the $\check\eta$'s and the remaining $s_\nu$'s gives
$$|g|^\sim_{p,\Sigma} \le 3\,|f|^\sim_{q,\Sigma}\,|f'|^\sim_{p-q+1,\Sigma} .$$
By interchanging the roles of $(f,q)$ and $(f',p-q)$, we get the bound $3\,|f|^\sim_{q+1,\Sigma}\,|f'|^\sim_{p-q,\Sigma}$.
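The factor 3 in this bound comes only from the overlap count for sectors. As a sketch, with hypothetical nonnegative numbers $a_s$ and $b_{s'}$ standing in for the partially summed values of $F$ and $F'$ (these names are not notation of the text), the step reads:

```latex
% Sketch: each sector s meets at most three sectors s'.
\begin{align*}
\sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}} a_s\, b_{s'}
 &= \sum_{s\in\Sigma} a_s \sum_{\substack{s'\in\Sigma\\ s'\cap s\neq\emptyset}} b_{s'}
 \;\le\; \sum_{s\in\Sigma} a_s \cdot 3\max_{s'\in\Sigma} b_{s'}
 \;=\; 3\Big(\sum_{s\in\Sigma} a_s\Big)\max_{s'\in\Sigma} b_{s'} .
\end{align*}
```

The sum over $s$ is absorbed into the norm of $f$ with $q-m$ fixed sectors, while the maximum over $s'$ counts as one additional fixed sector for $f'$, which is why the index $p-q+1$ appears on the second factor.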
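The case $n = n' = 1$. In this case, the norm $|g|^\sim_{p,\Sigma}$ is defined in Definition XVI.4(i). We need only consider the case $p = m+m'-1$. By Remark X.3(iii)
$$\begin{aligned}
\sum_{\substack{s,s'\in\Sigma\\ s\cap s'\neq\emptyset}} &\int_{\mathcal B} d\xi\ f(\check\eta_1,\ldots,\check\eta_m;(\xi,s))\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(\xi,s'))\\
&= \int dx_0 \int d\mathbf x \sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}\ \sum_{s,s'\in\Sigma} f(\check\eta_1,\ldots,\check\eta_m;(0,\sigma,b,s))\ e^{\imath\langle\check\eta_1+\cdots+\check\eta_{m+m'},(x_0,\mathbf x)\rangle_-}\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(0,\sigma,b,s'))\\
&= \sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}\ \sum_{s,s'\in\Sigma} f(\check\eta_1,\ldots,\check\eta_m;(0,\sigma,b,s))\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(0,\sigma,b,s'))\ (2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_{m+m'}) .
\end{aligned}$$
Consequently
$$g(\check\eta_1,\ldots,\check\eta_{m+m'}) = \sum_{\substack{\sigma\in\{\uparrow,\downarrow\}\\ b\in\{0,1\}}}\ \sum_{s,s'\in\Sigma} f(\check\eta_1,\ldots,\check\eta_m;(0,\sigma,b,s))\ f'(\check\eta_{m+1},\ldots,\check\eta_{m+m'};(0,\sigma,b,s')) .$$
The claim now follows, as in Lemma II.7, by iterated application of the product rule for derivatives and Remark X.3(iii).

Definition XVI.7. Let $m, n \ge 0$.
(i) For $n \ge 1$, denote by $\check{\mathcal F}_m(n;\Sigma)$ the space of all translation invariant, complex valued functions $f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))$ on $\check{\mathcal B}^m\times(\mathcal B\times\Sigma)^n$ whose Fourier transform $\check f(\check\eta_1,\ldots,\check\eta_m;(\check\xi_1,s_1),\ldots,(\check\xi_n,s_n))$ vanishes unless $k_i \in \tilde s_i$ for all $1 \le i \le n$. Here, $\check\xi_i = (k_i,\sigma_i,a_i)$. Also, let $\check{\mathcal F}_m(0;\Sigma)$ be the space of all momentum conserving, complex valued functions $f(\check\eta_1,\ldots,\check\eta_m)$ on $\check{\mathcal B}^m$.
(ii) Let $c((\xi,s),(\xi',s'))$ be any skew symmetric function on $(\mathcal B\times\Sigma)^2$. Let $f \in \check{\mathcal F}_m(n;\Sigma)$ and $1 \le i < j \le n$. We define "contraction", for $n > 2$, by
$$\underset{i\to j}{\mathrm{Con}_c}\, f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\xi_{i+1},s_{i+1}),\ldots,(\xi_{j-1},s_{j-1}),(\xi_{j+1},s_{j+1}),\ldots,(\xi_n,s_n))$$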
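$$= (-1)^{j-i+1} \sum_{\substack{s_i,s_j,t_i,t_j\in\Sigma\\ t_i\cap s_i\neq\emptyset\\ t_j\cap s_j\neq\emptyset}} \int d\xi_i\, d\xi_j\ c((\xi_i,t_i),(\xi_j,t_j))\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))$$
and, for $n = 2$, by
$$\underset{1\to 2}{\mathrm{Con}_c}\, f(\check\eta_1,\ldots,\check\eta_m)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m) = \sum_{\substack{s_1,s_2,t_1,t_2\in\Sigma\\ t_1\cap s_1\neq\emptyset\\ t_2\cap s_2\neq\emptyset}} \int d\xi_1\, d\xi_2\ c((\xi_1,t_1),(\xi_2,t_2))\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),(\xi_2,s_2)) .$$
(iii) We denote by $\check{\mathcal F}_{n;\Sigma}$ the set of functions on $\mathfrak X^n_\Sigma$ with the property that for each $\vec\imath = (i_1,\ldots,i_n) \in \{0,1\}^n$ with $m(\vec\imath) < n$
$$\mathrm{Ord}(f|_{\vec\imath}) \in \check{\mathcal F}_{m(\vec\imath)}(n - m(\vec\imath);\Sigma)$$
and there is a function $g$ on $\check{\mathcal B}^n$ such that
$$f|_{(0,\ldots,0)}(\check\eta_1,\ldots,\check\eta_n) = (2\pi)^3\delta(\check\eta_1+\cdots+\check\eta_n)\ g(\check\eta_1,\ldots,\check\eta_n) .$$
The map Ord was introduced in Definition XVI.2(iii) and the restriction $f|_{\vec\imath}$ was introduced in Remark XVI.3.

The partial Fourier transforms $\varphi^\sim$ (as in Definition IX.1(ii)) of functions $\varphi \in \mathcal F_m(n;\Sigma)$ as in Definition XII(ii) are the functions in $\check{\mathcal F}_m(n;\Sigma)$ that are antisymmetric in their external variables. Also, $\underset{i\to j}{\mathrm{Con}_c}\,\varphi^\sim = \big(\underset{i\to j}{\mathrm{Con}_c}\,\varphi\big)^\sim$, where $\underset{i\to j}{\mathrm{Con}_c}\,\varphi$ is defined in Definition XII.15.

Proposition XVI.8. Let $c((\xi,s),(\xi',s')) \in \mathcal F_0(2;\Sigma)$ be an antisymmetric function.
(i) Let $p$ be a natural number, $m, m' \ge 0$, $n, n' \ge 1$ and $f \in \check{\mathcal F}_m(n;\Sigma)$, $f' \in \check{\mathcal F}_{m'}(n',\Sigma)$. If $(n,n') \neq (1,1)$ then
$$\Big|\underset{1\to n+1}{\mathrm{Con}_c}\,\mathrm{Ant}_{\rm ext}(f\otimes f')\Big|^\sim_{p,\Sigma} \le 9\,|c|_{1,\Sigma} \max_{\substack{p_1+p_2=p\\ m\le p_1<m+n\\ m'\le p_2<m'+n'}} \min\big\{|f|^\sim_{p_1+1,\Sigma}\,|f'|^\sim_{p_2,\Sigma},\ |f|^\sim_{p_1,\Sigma}\,|f'|^\sim_{p_2+1,\Sigma}\big\}$$
and if $(n,n') = (1,1)$ then
$$\Big|\underset{1\to n+1}{\mathrm{Con}_c}\,\mathrm{Ant}_{\rm ext}(f\otimes f')\Big|^\sim_{m+m'-1,\Sigma} \le 12\,|c|_{1,\Sigma}\,|f|^\sim_{m,\Sigma}\,|f'|^\sim_{m',\Sigma} .$$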
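(ii) Assume that there is a function $C(k)$ that is supported in the $j$th neighborhood, such that $c((\cdot,s),(\cdot,s'))$ is the Fourier transform of $\chi_s(k)C(k)\chi_{s'}(k)$ in the sense of Definition IX.3 and that $|C(k)| \le \frac{\varepsilon}{|\imath k_0 - e(k)|}$ for some $\varepsilon \ge 0$. Let $f \in \check{\mathcal F}_m(n;\Sigma)$, $n' \le n$ and set, when $n' < n$, as in Definition III.5,
$$f'(\check\eta_1,\ldots,\check\eta_m;(\xi_{n'+1},s_{n'+1}),\ldots,(\xi_n,s_n)) = \sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n'}} \int\!\!\int d\xi_1\cdots d\xi_{n'}\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_{n'},s_{n'}),\ldots,(\xi_n,s_n))\ \psi(\xi_1,s_1)\cdots\psi(\xi_{n'},s_{n'})\ d\mu_{C_\Sigma}(\psi)$$
where
$$C_\Sigma(\psi(\xi,s),\psi(\xi',s')) = \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} c((\xi,t),(\xi',t')) .$$
For $n' = n$, set
$$f'(\check\eta_1,\ldots,\check\eta_m)\,(2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_m) = \sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n}} \int\!\!\int d\xi_1\cdots d\xi_n\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\ \psi(\xi_1,s_1)\cdots\psi(\xi_n,s_n)\ d\mu_{C_\Sigma}(\psi) .$$
Then, for all natural numbers $p$,
$$|f'|^\sim_{p,\Sigma} \le \Big(\varepsilon B_3 \frac{\mathfrak l}{M^j}\Big)^{n'/2} \begin{cases} |f|^\sim_{p,\Sigma} & \text{if } n \neq n'\\ |f|^\sim_{p+1,\Sigma} & \text{if } n = n' \end{cases}$$
with a constant $B_3$ that is independent of $j$ and $\Sigma$.
(iii) Let $D(k)$, $D'(k)$ be functions obeying $|D(k)|, |D'(k)| \le \frac{2}{|\imath k_0 - e(k)|}$ and let $d((\cdot,s),(\cdot,s'))$ resp. $d'((\cdot,s),(\cdot,s'))$ be the Fourier transform of $\chi_s(k)D(k)\chi_{s'}(k)$ resp. $\chi_s(k)D'(k)\chi_{s'}(k)$ in the sense of Definition IX.3. Let $1 \le i_1, i_2, i_3 \le n$, $1 \le i'_1, i'_2, i'_3 \le n'$ with $i_1 \neq i_2 \neq i_3 \neq i_1$, $i'_1 \neq i'_2 \neq i'_3 \neq i'_1$, and let $p \ge 1$. Then there is a constant $B_4$ that is independent of $j$ and $\Sigma$ such that for all $f \in \check{\mathcal F}_m(n;\Sigma)$, $f' \in \check{\mathcal F}_{m'}(n',\Sigma)$
$$\Big|\underset{i_1\to n+i'_1}{\mathrm{Con}_c}\ \underset{i_2\to n+i'_2}{\mathrm{Con}_d}\ \underset{i_3\to n+i'_3}{\mathrm{Con}_{d'}}(f\otimes f')\Big|^\sim_{p,\Sigma} \le \Big(B_4\frac{\mathfrak l}{M^j}\Big)^2 |c|_{1,\Sigma} \max_{\substack{p_1+p_2=p\\ m\le p_1<m+n\\ m'\le p_2<m'+n'}} \min\big\{|f|^\sim_{p_1+3,\Sigma}\,|f'|^\sim_{p_2,\Sigma},\ |f|^\sim_{p_1,\Sigma}\,|f'|^\sim_{p_2+3,\Sigma}\big\}$$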
if $(n,n') \neq (3,3)$ and
$$\Big|\underset{i_1\to n+i'_1}{\mathrm{Con}_c}\ \underset{i_2\to n+i'_2}{\mathrm{Con}_d}\ \underset{i_3\to n+i'_3}{\mathrm{Con}_{d'}}(f\otimes f')\Big|^\sim_{m+m'-1,\Sigma} \le \Big(B_4\frac{\mathfrak l}{M^j}\Big)^2 |c|_{1,\Sigma} \min\big\{|f|^\sim_{m+2,\Sigma}\,|f'|^\sim_{m',\Sigma},\ |f|^\sim_{m,\Sigma}\,|f'|^\sim_{m'+2,\Sigma}\big\}$$
if $(n,n') = (3,3)$.

Proof. The proofs of part (i) and part (ii), except for the case $n = n'$, are similar to those of parts (i) and (ii) of Proposition XII.16. The proof of part (iii) is similar to that of Proposition XII.18. So we only give the proof of part (ii) for the case $n = n'$. By translation invariance,
$$f'(\check\eta_1,\ldots,\check\eta_m) = \sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n}} \int\!\!\int d\xi_1\cdots d\xi_n\ \delta(x_{n,0})\,\delta(\mathbf x_n)\ f(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_n,s_n))\ \psi(\xi_1,s_1)\cdots\psi(\xi_n,s_n)\ d\mu_{C_\Sigma}(\psi)$$
where $\xi_n = (x_{n,0},\mathbf x_n,\sigma_n,a_n)$. By Proposition IV.3(ii), with
$$G = \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\ \tilde\chi_s(k)^2\,|C(k)| \le \varepsilon \int \frac{d^{d+1}k}{(2\pi)^{d+1}}\ \frac{\tilde\chi_s(k)^2}{|\imath k_0 - e(k)|} \le \text{const}\ \varepsilon\,\frac{\mathfrak l}{M^j}$$
we have, for any dd-operator $D$,
$$|D f'(\check\eta_1,\ldots,\check\eta_m)| \le 4\,G^{n/2} \sum_{\substack{s_i\in\Sigma\\ i=1,\ldots,n}} |\!|\!|D f(\check\eta_1,\ldots,\check\eta_m;(\cdot,s_1),\ldots,(\cdot,s_n))|\!|\!|_{1,\infty} .$$
Hence $|f'|^\sim_{m-1,\Sigma} \le 4\,G^{n/2}\,|f|^\sim_{m,\Sigma}$.

In Theorem XV.3, ladders played a special role. Due to the "external improvement" of Lemma XII, we needed to consider only ladders all of whose "ends" correspond to $\psi$ fields and are integrated out at a later scale. This is not the case when we use the norms developed in this chapter. We consider ladders some of whose "ends" correspond to $\psi$ fields and have sectorized position space variables $(\xi,s) \in \mathcal B\times\Sigma$, and some of whose ends correspond to $\phi$ fields and have momentum space variables $\check\eta \in \check{\mathcal B}$. To do this, we extend the definitions and estimates of ladders from Sec. XIV.

Definition XVI.9.
(i) Let $C$ be a propagator over $\mathcal B$. We define its extension $\tilde C$ over the disjoint union $\check{\mathcal B}\ \dot\cup\ \mathcal B$ by
$$\tilde C(x,y) = \begin{cases} C(x,y) & \text{if } x, y \in \mathcal B\\ 0 & \text{if } x \in \check{\mathcal B} \text{ or } y \in \check{\mathcal B} . \end{cases}$$
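(ii) Let $C$, $D$ be propagators over $\mathcal B$ and $R$ a rung over $\check{\mathcal B}\ \dot\cup\ \mathcal B$. We set
$$L_\ell(R;C,D) = L_\ell(R;\tilde C,\tilde D) .$$
(iii) Let $P$ be a bubble propagator over $\mathcal B$, $r$ a rung over $\mathfrak X_\Sigma = \check{\mathcal B}\ \dot\cup\ (\mathcal B\times\Sigma)$. We set
$$(r\bullet P)(y_1,y_2;x_3,x_4) = \sum_{s'_1,s'_2\in\Sigma} \int_{\mathcal B\times\mathcal B} dx'_1\,dx'_2\ r(y_1,y_2,(x'_1,s'_1),(x'_2,s'_2))\ P(x'_1,x'_2;x_3,x_4) .$$
$(r\bullet P)$ is a function on $\mathfrak X^2_\Sigma\times\mathcal B^2$. For a general function $F$ on $\mathfrak X^2_\Sigma\times\mathcal B^2$, define the rung $(F\bullet r)$ over $\mathfrak X_\Sigma$ by
$$(F\bullet r)(y_1,y_2,y_3,y_4) = \sum_{s'_1,s'_2\in\Sigma} \int_{\mathcal B\times\mathcal B} dx'_1\,dx'_2\ F(y_1,y_2;x'_1,x'_2)\ r((x'_1,s'_1),(x'_2,s'_2),y_3,y_4)$$
if at least one of the arguments $y_1,\ldots,y_4$ lies in $\mathcal B\times\Sigma \subset \mathfrak X_\Sigma$, and for $\check\eta_1,\check\eta_2,\check\eta_3,\check\eta_4 \in \check{\mathcal B} \subset \mathfrak X_\Sigma$
$$(F\bullet r)(\check\eta_1,\check\eta_2,\check\eta_3,\check\eta_4)\,(2\pi)^{d+1}\delta(\check\eta_1+\check\eta_2+\check\eta_3+\check\eta_4) = \sum_{s'_1,s'_2\in\Sigma} \int_{\mathcal B\times\mathcal B} dx'_1\,dx'_2\ F(\check\eta_1,\check\eta_2;x'_1,x'_2)\ r((x'_1,s'_1),(x'_2,s'_2),\check\eta_3,\check\eta_4) .$$
(iv) Let $\ell \ge 1$, $r_1,\ldots,r_{\ell+1}$ rungs over $\mathfrak X_\Sigma$ and $P_1,\ldots,P_\ell$ bubble propagators over $\mathcal B$. The ladder with rungs $r_1,\ldots,r_{\ell+1}$ and bubble propagators $P_1,\ldots,P_\ell$ is defined to be
$$r_1\bullet P_1\bullet r_2\bullet P_2\bullet\cdots\bullet r_\ell\bullet P_\ell\bullet r_{\ell+1} .$$
If $r$ is a rung over $\mathfrak X_\Sigma$ and $A$, $B$ are propagators over $\mathcal B$, we define $L_\ell(r;A,B)$ as the ladder with $\ell+1$ rungs $r$ and $\ell$ bubble propagators $\mathcal C(A,B)$.

Remark XVI.10. In the situation of Definition XVI.9(ii), let $R'$ be the restriction of $R$ to $\mathcal B^4$, $R_{\rm left}$ the restriction of $R$ to $(\check{\mathcal B}\ \dot\cup\ \mathcal B)^2\times\mathcal B^2$ and $R_{\rm right}$ the restriction of $R$ to $\mathcal B^2\times(\check{\mathcal B}\ \dot\cup\ \mathcal B)^2$. Then
$$L_\ell(R;C,D) = R_{\rm left}\circ\mathcal C(C,D)\circ R'\circ\cdots\circ R'\circ\mathcal C(C,D)\circ R_{\rm right} .$$
Similarly, in the situation of Definition XVI.9(iv), let $r'$ be the restriction of $r$ to $(\mathcal B\times\Sigma)^4$, $r_{\rm left}$ the restriction of $r$ to $\mathfrak X^2_\Sigma\times(\mathcal B\times\Sigma)^2$ and $r_{\rm right}$ the restriction of $r$ to $(\mathcal B\times\Sigma)^2\times\mathfrak X^2_\Sigma$. Then
$$L_\ell(r;C,D) = r_{\rm left}\bullet\mathcal C(C,D)\bullet r'\bullet\cdots\bullet r'\bullet\mathcal C(C,D)\bullet r_{\rm right} .$$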
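For orientation, the two shortest ladders of Definition XVI.9(iv) can be written out. This is only an unpacking of that definition and of Remark XVI.10 for $\ell = 1, 2$, with $r$ a rung over $\mathfrak X_\Sigma = \check{\mathcal B}\ \dot\cup\ (\mathcal B\times\Sigma)$ and $A$, $B$ propagators over $\mathcal B$. Nothing beyond the notation already introduced is assumed.

```latex
% Sketch: ladders with one and two bubble propagators.
\begin{align*}
L_1(r;A,B) &= r \bullet \mathcal C(A,B) \bullet r
            \;=\; r_{\rm left} \bullet \mathcal C(A,B) \bullet r_{\rm right},\\
L_2(r;A,B) &= r \bullet \mathcal C(A,B) \bullet r \bullet \mathcal C(A,B) \bullet r
            \;=\; r_{\rm left} \bullet \mathcal C(A,B) \bullet r' \bullet \mathcal C(A,B) \bullet r_{\rm right}.
\end{align*}
```

Only the two outer rungs can carry momentum space arguments in $\check{\mathcal B}$. Every interior rung is contracted with bubble propagators on both sides and is therefore restricted to $(\mathcal B\times\Sigma)^4$.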
In analogy to Lemma XIV.4 we have

Lemma XVI.11. Let $c$ and $d$ be propagators over $\mathcal B\times\Sigma$ and $r$ a rung over $\mathfrak X_\Sigma$. Define the propagators $C$ and $D$ over $\mathcal B$ by
$$C(x_1,x_2) = \sum_{t_1,t_2\in\Sigma} c((x_1,t_1),(x_2,t_2)), \qquad D(x_1,x_2) = \sum_{t_1,t_2\in\Sigma} d((x_1,t_1),(x_2,t_2))$$
and new propagators $\tilde c$ and $\tilde d$ over $\mathcal B\times\Sigma$ by
$$\tilde c((x_1,s_1),(x_2,s_2)) = C(x_1,x_2), \qquad \tilde d((x_1,s_1),(x_2,s_2)) = D(x_1,x_2) .$$
Then, for all $\ell \ge 1$,
$$L_\ell(r;C,D) = L_\ell(r;\tilde c,\tilde d) .$$

In analogy to Lemma XIV.5, we have

Lemma XVI.12. Let $f \in \check{\mathcal F}_{4;\Sigma}$. Let $C(k)$ and $D(k)$ be functions on $\mathbb R\times\mathbb R^2$, that are supported in the $j$th neighborhood, and $C(\xi,\xi')$, $D(\xi,\xi')$ their Fourier transforms as in Definition IX.3. Furthermore, let $c((\cdot,s),(\cdot,s'))$ and $d((\cdot,s),(\cdot,s'))$ be the Fourier transform of $\chi_s(k)C(k)\chi_{s'}(k)$ and $\chi_s(k)D(k)\chi_{s'}(k)$. Define propagators over $\mathcal B\times\Sigma$ by
$$c_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} c((\xi,t),(\xi',t')), \qquad d_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} d((\xi,t),(\xi',t')) .$$
Then
$$L_\ell(f;C,D) = L_\ell(f;c_\Sigma,d_\Sigma)$$
for all $\ell \ge 1$. Here, the ladder on the right-hand side is defined as in Definition XVI.9(ii), but with $\mathcal B$ replaced by $\mathcal B\times\Sigma$, and uses the $\circ$ product of Definition XIV.1(iv), while the ladder on the left-hand side is as in Definition XVI.9(iv) and uses the $\bullet$ product.

Also observe that, by Remark XVI.10, for $f \in \check{\mathcal F}_{4;\Sigma}$
$$L_\ell(f;C,D) = \sum_{i_1,\ldots,i_4\in\{0,1\}} f|_{(i_1,i_2,1,1)}\bullet\mathcal C(C,D)\bullet f|_{(1,1,1,1)}\bullet\cdots\bullet\mathcal C(C,D)\bullet f|_{(1,1,i_3,i_4)} . \tag{XVI.1}$$
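XVII. The Renormalization Group Map and Norms in Momentum Space

This section provides the analog of Sec. XV for the $|\cdot|^\sim$-norms. Again, let $j \ge 2$ and let $\Sigma$ be a sectorization of scale $j$ and length $\frac{1}{M^{j-3/2}} \le \mathfrak l \le \frac{1}{M^{(j-1)/2}}$. Fix a system $\rho = (\rho_{m;n})$ of positive real numbers such that
$$\begin{aligned}
\rho_{m;n} &\le \rho_{m;n'} \qquad \text{if } n \le n'\\
\rho_{m+m';n+n'-2} &\le \rho_{m;n}\,\rho_{m';n'}\\
\rho_{m+1;n-1} &\le \rho_{m;n} .
\end{aligned}\tag{XVII.1}$$

Definition XVII.1.
(i) For a function $f \in \check{\mathcal F}_{n;\Sigma}$ and a natural number $p$ we set
$$|f|^\sim_{p,\Sigma,\rho} = \rho_{n;0}\,|g|^\sim_{p,\Sigma} + \sum_{\substack{\vec\imath\in\{0,1\}^n\\ m(\vec\imath)<n}} \rho_{m(\vec\imath);n-m(\vec\imath)}\ |\mathrm{Ord}(f|_{\vec\imath})|^\sim_{p,\Sigma}$$
where $g$ is the function on $\check{\mathcal B}^n$ such that
$$f|_{(0,\ldots,0)}(\check\eta_1,\ldots,\check\eta_n) = (2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_n)\ g(\check\eta_1,\ldots,\check\eta_n) .$$
(ii) For $f \in \check{\mathcal F}_m(n;\Sigma)$ set
$$|f|^\sim_\Sigma = \rho_{m;n} \begin{cases} |f|^\sim_{1,\Sigma} + |f|^\sim_{2,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{4,\Sigma} + \frac{1}{\mathfrak l^2}|f|^\sim_{5,\Sigma} + \frac{1}{\mathfrak l^2}|f|^\sim_{6,\Sigma} & \text{if } m \neq 0\\[1ex] |f|^\sim_{1,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma} + \frac{1}{\mathfrak l^2}|f|^\sim_{5,\Sigma} & \text{if } m = 0 \end{cases}$$
and for $f \in \check{\mathcal F}_{n;\Sigma}$ set
$$|f|^\sim_\Sigma = |g|^\sim_\Sigma + \sum_{\substack{\vec\imath\in\{0,1\}^n\\ m(\vec\imath)<n}} |\mathrm{Ord}(f|_{\vec\imath})|^\sim_\Sigma$$
where, as in part (i), $g$ is the function on $\check{\mathcal B}^n$ such that
$$f|_{(0,\ldots,0)}(\check\eta_1,\ldots,\check\eta_n) = (2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_n)\ g(\check\eta_1,\ldots,\check\eta_n) .$$
(iii) With the notation introduced in Definitions XVI.1 and XVI.7(iii), every translation invariant sectorized Grassmann function $w$ can be uniquely written in the form
$$w(\phi,\psi) = \sum_n \int_{\mathfrak X^n_\Sigma} dx_1\cdots dx_n\ f_n(x_1,\ldots,x_n)\ \Psi(x_1)\cdots\Psi(x_n)$$
where $f_n \in \check{\mathcal F}_{n;\Sigma}$ is an antisymmetric function. Set, in analogy with Definition XV.1(iii), for $\alpha > 0$ and $X \in \mathfrak N_{d+1}$,
$$N^\sim_j(w;\alpha;X,\Sigma,\rho) = \frac{M^{2j}}{\mathfrak l}\,e_j(X) \sum_{n\ge0} \alpha^n \Big(\frac{\mathfrak l B}{M^j}\Big)^{n/2} |f_n|^\sim_\Sigma$$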
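where $B = 4\max\{8B_1, B_2, 32B_3, 4B_4\}$ with $B_1$, $B_2$ being the constants of Propositions XII.16 and XII.18 and $B_3$, $B_4$ being the constants of Proposition XVI.8.

Remark XVII.2. A sectorized Grassmann function $w$ can also be uniquely written in the form
$$w(\phi,\psi) = \sum_{m,n} \int d\eta_1\cdots d\eta_m\,d\xi_1\cdots d\xi_n\ w_{m,n}(\eta_1,\ldots,\eta_m\ (\xi_1,s_1),\ldots,(\xi_n,s_n))\ \phi(\eta_1)\cdots\phi(\eta_m)\,\psi((\xi_1,s_1))\cdots\psi((\xi_n,s_n))$$
with $w_{m,n}$ antisymmetric separately in the $\eta$ variables and in the $\xi$ variables. Then,
$$\begin{aligned}
N^\sim_j(w;\alpha;X,\Sigma,\rho) &= \frac{M^{2j}}{\mathfrak l}\,e_j(X) \sum_{m,n\ge0} \alpha^{m+n} \Big(\frac{\mathfrak l B}{M^j}\Big)^{(m+n)/2} |w^\sim_{m,n}|^\sim_\Sigma\\
&= \frac{M^{2j}}{\mathfrak l}\,e_j(X) \sum_{n\ge0} \alpha^n \Big(\frac{\mathfrak l B}{M^j}\Big)^{n/2} \rho_{0;n} \Big[|w^\sim_{0,n}|^\sim_{1,\Sigma} + \frac{1}{\mathfrak l}|w^\sim_{0,n}|^\sim_{3,\Sigma} + \frac{1}{\mathfrak l^2}|w^\sim_{0,n}|^\sim_{5,\Sigma}\Big]\\
&\quad + \frac{M^{2j}}{\mathfrak l}\,e_j(X) \sum_{\substack{m,n\ge0\\ m\neq0}} \alpha^{m+n} \Big(\frac{\mathfrak l B}{M^j}\Big)^{(m+n)/2} \rho_{m;n} \Big[\sum_{p=1}^{6} \frac{1}{\mathfrak l^{[(p-1)/2]}}\,|w^\sim_{m,n}|^\sim_{p,\Sigma}\Big] .
\end{aligned}$$
Here, $w^\sim_{m,n}$ is the partial Fourier transform of $w_{m,n}$ of Definition IX.1(ii) and $[(p-1)/2]$ is the integer part of $\frac{p-1}{2}$. In particular,
$$N^\sim_j(w(\phi,0);\alpha;X,\Sigma,\rho) = e_j(X)\big[\alpha^2\rho_{2;0}\,B M^j\,|w^\sim_{2,0}|^\sim_{1,\Sigma} + \alpha^4\rho_{4;0}\,B^2\,|w^\sim_{4,0}|^\sim_{3,\Sigma}\big]$$
and
$$N^\sim_j(w(0,\psi);\alpha;X,\Sigma,\rho) = N_j(w(0,\psi);\alpha;X,\Sigma,\rho) .$$

Theorem XVII.3. Let $c_B > 0$. There are constants const, $\text{const}_0$, $\alpha_0$, $\gamma_0$ and $\tau_0$ that are independent of $j$, $\Sigma$, $\rho$ such that for all $\alpha \ge \alpha_0$ and $\gamma \le \gamma_0$ the following holds: Let $u((\xi,s),(\xi',s'))$, $v((\xi,s),(\xi',s')) \in \mathcal F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions. Set
$$C(k) = \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u(k)}, \qquad D(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v(k)}$$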
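and let $C(\xi,\xi')$, $D(\xi,\xi')$ be the Fourier transforms of $C(k)$, $D(k)$ as in Definition IX.3. Let $B(k)$ be a function on $\mathbb R\times\mathbb R^2$ and set
$$(\hat B\phi)(\xi) = \int d\xi'\ \hat B(\xi,\xi')\,\phi(\xi')$$
where $\hat B$ was defined in Definition IX.4. Furthermore, let $\mathcal W(\phi,\psi)$ be an even Grassmann function and set (footnote e)
$$:\!\mathcal W'(\phi,\psi)\!:_{\psi,D} = \Omega_C\big(:\!\mathcal W(\phi,\psi)\!:_{\psi,C+D}\big)(\phi,\psi + \hat B\phi) .$$
Footnote e: The definition of $\mathcal W'$ as an analytic function, rather than merely a formal Taylor series, was explained in Remark XV.11.
Assume that the following estimates are fulfilled:
• $\rho_{m+1;n-1} \le \gamma\,\rho_{m;n}$ for all $m \ge 0$ and $n \ge 1$.
• $|\check u(k)|, |\check v(k)| \le \frac12|\imath k_0 - e(k)|$.
• $|u|_{1,\Sigma} \le \mu(\Lambda + X)e_j(X)$ with $X \in \mathfrak N_{d+1}$, $\mu, \Lambda > 0$ such that $(1+\mu)(\Lambda + X_0) \le \frac{\tau_0}{M^j}$.
• $\|B(k)\|^\sim \le c_B\,e_j(X)$.
• $\mathcal W$ has a sectorized representative
$$w(\phi,\psi) = \sum_n \int_{\mathfrak X^n_\Sigma} dx_1\cdots dx_n\ f_n(x_1,\ldots,x_n)\ \Psi(x_1)\cdots\Psi(x_n)$$
with antisymmetric functions $f_n \in \check{\mathcal F}_{n;\Sigma}$ such that $f_2 = 0$ and
$$N^\sim_j(w;64\alpha;X,\Sigma,\rho) \le \text{const}_0\,\alpha + \sum_{\delta\neq0}\infty\,t^\delta .$$
Then $\mathcal W'$ has a sectorized representative $w'$ such that
$$N^\sim_j(w' - w;\alpha;X,\Sigma,\rho) \le \text{const}\Big(\frac{1}{\alpha} + \gamma\Big)\frac{N^\sim_j(w;64\alpha;X,\Sigma,\rho)}{1 - \frac{\text{const}}{\alpha}N^\sim_j(w;64\alpha;X,\Sigma,\rho)} .$$
Furthermore, if one writes $w'(\phi,\psi) = \sum_n \int_{\mathfrak X^n_\Sigma} dx_1\cdots dx_n\ f'_n(x_1,\ldots,x_n)\,\Psi(x_1)\cdots\Psi(x_n)$, with antisymmetric functions $f'_n \in \check{\mathcal F}_{n;\Sigma}$, then
$$|f'_2|^\sim_{1,\Sigma,\rho} \le \frac{\text{const}}{\alpha^8}\,\frac{\mathfrak l}{M^j}\,\frac{N^\sim_j(w;64\alpha;X,\Sigma,\rho)^2}{1 - \frac{\text{const}}{\alpha}N^\sim_j(w;64\alpha;X,\Sigma,\rho)} .$$
If one writes $w'(\phi,\psi) = w''(\phi,\psi + \hat B\phi)$ where $(\hat B\phi)(\xi,s) = \int d\xi'\ \hat B(\xi,\xi')\,\phi(\xi')$, with abuse of notation, and expands
$$w''(\phi,\psi) = \sum_n \int_{\mathfrak X^n_\Sigma} dx_1\cdots dx_n\ f''_n(x_1,\ldots,x_n)\ \Psi(x_1)\cdots\Psi(x_n)$$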
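with antisymmetric functions $f''_n \in \check{\mathcal F}_{n;\Sigma}$, then
$$\Big|f''_4 - f_4 - \frac14\sum_{\ell=1}^{\infty}(-1)^\ell(12)^{\ell+1}\,\mathrm{Ant}\,L_\ell(f_4;C,D)\Big|^\sim_{3,\Sigma,\rho} \le \frac{\text{const}}{\alpha^{10}}\ \mathfrak l\ \frac{N^\sim_j(w;64\alpha;X,\Sigma,\rho)^2}{1 - \frac{\text{const}}{\alpha}N^\sim_j(w;64\alpha;X,\Sigma,\rho)} .$$
Here $L_\ell(f_4;C,D)$ is a ladder in the sense of Definition XVI.9(iv).

The proof of Theorem XVII.3 is similar to that of Theorems XV.3 and X.12. Recall that $w$ and $w'$ are elements of the Grassmann algebra over the vector space, $\tilde V$, generated by $\check\phi(\check\eta)$, $\check\eta \in \check{\mathcal B}$, $\psi(\xi,s)$, $(\xi,s) \in (\mathcal B\times\Sigma)$. Let $c((\cdot,s),(\cdot,s'))$ and $d((\cdot,s),(\cdot,s'))$ be the Fourier transform of $\chi_s(k)C(k)\chi_{s'}(k)$ and $\chi_s(k)D(k)\chi_{s'}(k)$ in the sense of Definition IX.3. Then $c$ and $d$ define covariances on $\tilde V$ by
$$\begin{aligned}
\tilde C_\Sigma(\check\phi(\check\eta),\check\phi(\check\eta')) &= 0, & \tilde D_\Sigma(\check\phi(\check\eta),\check\phi(\check\eta')) &= 0\\
\tilde C_\Sigma(\check\phi(\check\eta),\psi((\xi,s))) &= 0, & \tilde D_\Sigma(\check\phi(\check\eta),\psi((\xi,s))) &= 0
\end{aligned}$$
and
$$\begin{aligned}
\tilde C_\Sigma(\psi(\xi,s),\psi(\xi',s')) &= c_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} c((\xi,t),(\xi',t'))\\
\tilde D_\Sigma(\psi(\xi,s),\psi(\xi',s')) &= d_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\neq\emptyset\\ t'\cap s'\neq\emptyset}} d((\xi,t),(\xi',t')) .
\end{aligned}$$
The restriction of $\tilde C_\Sigma$ resp. $\tilde D_\Sigma$ to the vector space, $V_\Sigma$, generated by $\psi(\xi,s)$, $(\xi,s) \in (\mathcal B\times\Sigma)$, coincides with the $C_\Sigma$ resp. $D_\Sigma$ of Proposition XII, while the subspace $V_{\rm ext}$, generated by $\check\phi(\check\eta)$, $\check\eta \in \check{\mathcal B}$, is isotropic and perpendicular to $V_\Sigma$ with respect to both $\tilde C_\Sigma$ and $\tilde D_\Sigma$.

For $f \in \check{\mathcal F}_m(n;\Sigma)$ set
$$|f|^\sim_{\rm impr,\Sigma} = \rho_{m;n} \begin{cases} |f|^\sim_{1,\Sigma} + |f|^\sim_{2,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{4,\Sigma} & \text{if } m \neq 0\\[1ex] |f|^\sim_{1,\Sigma} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma} & \text{if } m = 0 \end{cases}\tag{XVII.2}$$
and for $f \in \check{\mathcal F}_{n;\Sigma}$ set
$$|f|^\sim_{\rm impr,\Sigma} = |g|^\sim_{\rm impr,\Sigma} + \sum_{\substack{\vec\imath\in\{0,1\}^n\\ m(\vec\imath)<n}} |\mathrm{Ord}(f|_{\vec\imath})|^\sim_{\rm impr,\Sigma}$$
where $g$ is the function on $\check{\mathcal B}^n$ such that
$$f|_{(0,\ldots,0)}(\check\eta_1,\ldots,\check\eta_n) = (2\pi)^{d+1}\delta(\check\eta_1+\cdots+\check\eta_n)\ g(\check\eta_1,\ldots,\check\eta_n) .$$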
The seminorms $|\cdot|^\sim_{\rm impr,\Sigma}$ (and $|\cdot|'_{\rm impr}$, $N^\sim_{\rm impr}(\cdot;\alpha)$, to be introduced shortly) are used only locally, between this point and the end of the proof of Theorem XVII.3.
Lemma XVII.4. Under the hypotheses of Theorem XVII.3, there exists a constant ˜ Σ ) have const1 that is independent of j and Σ such that the covariances (C˜Σ , D improved integration constants r 1 Bl j c = const1 M ej (X) , b= , J =l 2 4M j for the families |·|eΣ and |·|eimpr,Σ of seminorms (in the sense of [3, Definition VI.1]).
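Proof. By Proposition XVI.8 and (XVII.1), the covariances $(\tilde C_\Sigma,\tilde D_\Sigma)$ have integration constants
$$c' = 12\,|c|_{1,\Sigma}, \qquad b' = \sqrt{\max\{8B_3,B_4\}\,\frac{\mathfrak l}{M^j}}$$
for the configuration $|\cdot|^\sim_{1,\Sigma,\rho}, |\cdot|^\sim_{2,\Sigma,\rho},\ldots,|\cdot|^\sim_{6,\Sigma,\rho}$ of seminorms, in the sense of [3, Definition VI.11]. Hence, by [3, Lemma VI.12], $(\tilde C_\Sigma,\tilde D_\Sigma)$ have improved integration constants $c'$, $b'$ and $J = \mathfrak l$ for the families
$$\begin{aligned}
|f|' &= |f|^\sim_{1,\Sigma,\rho} + |f|^\sim_{2,\Sigma,\rho} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma,\rho} + \frac{1}{\mathfrak l}|f|^\sim_{4,\Sigma,\rho} + \frac{1}{\mathfrak l^2}|f|^\sim_{5,\Sigma,\rho} + \frac{1}{\mathfrak l^2}|f|^\sim_{6,\Sigma,\rho}\\
|f|'_{\rm impr} &= |f|^\sim_{1,\Sigma,\rho} + |f|^\sim_{2,\Sigma,\rho} + \frac{1}{\mathfrak l}|f|^\sim_{3,\Sigma,\rho} + \frac{1}{\mathfrak l}|f|^\sim_{4,\Sigma,\rho} .
\end{aligned}$$
When $f \in \check{\mathcal F}_0(n;\Sigma)$, $|f|^\sim_{p+1,\Sigma,\rho} \le |f|^\sim_{p,\Sigma,\rho}$ for all odd $p$ so that
$$|f|^\sim_\Sigma \le |f|' \le 2\,|f|^\sim_\Sigma, \qquad |f|^\sim_{\rm impr,\Sigma} \le |f|'_{\rm impr} \le 2\,|f|^\sim_{\rm impr,\Sigma} .$$
Hence $(\tilde C_\Sigma,\tilde D_\Sigma)$ have improved integration constants $4c'$, $2b'$ and $J = \mathfrak l$ for the families $|\cdot|^\sim_\Sigma$ and $|\cdot|^\sim_{\rm impr,\Sigma}$ of seminorms. As in Lemma XV.5, $c' \le \text{const}\,M^j e_j(X)$ and the lemma follows.

Lemma XVII.5. Let $c_B > 0$. Then there are constants const and $\gamma_0$, independent of $M$, $j$, $\Sigma$, $\rho$ such that the following holds for all $\gamma \le \gamma_0$ and all $X, X_B \in \mathfrak N_{d+1}$. Let $g(\phi,\psi)$ be a sectorized Grassmann function and set
$$g'(\phi,\psi) = g(\phi,\psi + \hat B\phi) .$$
Assume that $\|B(k)\|^\sim \le c_B\,e_j(X)$ and $\|B(k)\|^\sim \le c_B X_B\,e_j(X)$. If $\rho_{m+1;n-1} \le \gamma\,\rho_{m;n}$ for all $m \ge 0$ and $n \ge 1$, then
$$N^\sim_j(g' - g;\alpha;X,\Sigma,\rho) \le \text{const}\ \gamma X_B\,N^\sim_j(g;2\alpha;X,\Sigma,\rho) .$$
Let $G_{m,n}$, resp. $G'_{m,n}$, be the kernel of the part of $g$, resp. $g'$, that is of degree $m$ in $\phi$ and degree $n$ in $\psi$. Then, for $p \in \{1,3\}$,
$$\sum_{\substack{m,n\\ m+n=p+1}} |G'^\sim_{m,n} - G^\sim_{m,n}|^\sim_{p,\Sigma,\rho} \le \text{const}\ \gamma X_B\,e_j(X) \sum_{\substack{m,n\\ m+n=p+1}} |G^\sim_{m,n}|^\sim_{p,\Sigma,\rho} .$$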
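Proof. Let $\varphi \in \check{\mathcal F}_m(n;\Sigma)$, $1 \le i \le n$ and set, for $\check\eta_{m+1} = (k_{m+1},\sigma_{m+1},a_{m+1})$,
$$\varphi'(\check\eta_1,\ldots,\check\eta_{m+1};(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1})) = \mathrm{Ant}_{\rm ext} \sum_{s\in\Sigma} \int d\zeta\ B(k_{m+1})\,E_+(\check\eta_{m+1},\zeta)\ \varphi(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_i,s_i),\ldots,(\xi_{n-1},s_{n-1}))$$
if $n \ge 2$, and
$$\begin{aligned}
\varphi'(\check\eta_1,\ldots,\check\eta_{m+1})\,(2\pi)^{d+1}\delta(k_1+\cdots+k_{m+1}) &= \mathrm{Ant}_{\rm ext} \sum_{s\in\Sigma} \int d\zeta\ B(k_{m+1})\,E_+(\check\eta_{m+1},\zeta)\ \varphi(\check\eta_1,\ldots,\check\eta_m;(\zeta,s))\\
&= \mathrm{Ant}_{\rm ext} \sum_{s\in\Sigma} B(k_{m+1})\ \varphi(\check\eta_1,\ldots,\check\eta_m;(0,\sigma_{m+1},a_{m+1},s))\ (2\pi)^{d+1}\delta(k_1+\cdots+k_{m+1})
\end{aligned}$$
if $n = 1$. For any fixed $\check\eta_1,\ldots,\check\eta_{m+1}$
$$|\!|\!|\varphi'(\check\eta_1,\ldots,\check\eta_{m+1};(\xi_1,s_1),\ldots,(\xi_{n-1},s_{n-1}))|\!|\!|_{1,\infty} \le 2\sup_{k,s}\ |B(k)|\ |\!|\!|\varphi(\check\eta_1,\ldots,\check\eta_m;(\xi_1,s_1),\ldots,(\zeta,s),\ldots,(\xi_{n-1},s_{n-1}))|\!|\!|_{1,\infty}$$
when $n \ge 2$, since $|E_+(\check\eta_{m+1},\zeta)| \le 1$ and the requirement that $k_{m+1}$ be in the sector $s$ restricts the choice of $s$ to at most two different sectors. For $n = 1$,
$$|\varphi'(\check\eta_1,\ldots,\check\eta_{m+1})| \le 2\sup_{k,s}\ |B(k)|\ |\varphi(\check\eta_1,\ldots,\check\eta_m;(0,\sigma_{m+1},a_{m+1},s))| .$$
Since $D^\delta_{m+1}E_+(\check\eta_{m+1},\zeta) = \zeta^\delta E_+(\check\eta_{m+1},\zeta)$, Leibniz and [6, Corollary A.5(ii)] imply that, for both $X' = X_B$ and $X' = 1$,
$$e_j(X)\,|\varphi'|^\sim_{p,\Sigma} \le \text{const}\ e_j(X)\,\|B(k)\|^\sim\,|\varphi|^\sim_{p,\Sigma} \le \text{const}\ c_B X'\,e_j(X)\,|\varphi|^\sim_{p,\Sigma} \tag{XVII.3}$$
so that
$$e_j(X)\,|\varphi'|^\sim_{p,\Sigma,\rho} \le \text{const}\ c_B\gamma X'\,e_j(X)\,|\varphi|^\sim_{p,\Sigma,\rho}$$
and
$$e_j(X)\,|\varphi'|^\sim_\Sigma \le \text{const}\ c_B\gamma X'\,e_j(X)\,|\varphi|^\sim_\Sigma . \tag{XVII.4}$$
Write $g(\phi,\psi) = \sum_{m,n} g_{m,n}(\phi,\psi)$, with $g_{m,n}$ of degree $m$ in $\phi$ and degree $n$ in $\psi$, and
$$g(\phi,\psi+\zeta) = \sum_{m,n} g_{m,n}(\phi,\psi+\zeta) = \sum_{m,n}\sum_{\ell=0}^{n} g_{m,n-\ell,\ell}(\phi,\psi,\zeta)$$
with $g_{m,n-\ell,\ell}$ of degrees $m$ in $\phi$, $n-\ell$ in $\psi$ and $\ell$ in $\zeta$. Let $G_{m,n}$ and $G_{m,n-\ell,\ell}$ be the kernels of $g_{m,n}(\phi,\psi)$ and $g_{m,n-\ell,\ell}(\phi,\psi,\hat B\phi)$ respectively. By the binomial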
theorem and repeated application of (XVII.4), ` − 1 times with X 0 = 1 and once with X 0 = XB , ! n ∼ e ` e ej (X)|Gm,n−`,` |Σ ≤ (const cB γ) XB ej (X)|G∼ m,n |Σ ` if ` ≥ 1. Then, ˆ − g(φ, ψ) = g (φ, ψ) − g(φ, ψ) = g(φ, ψ + Bφ) 0
n X X
ˆ gm,n−`,` (φ, ψ, Bφ)
m,n≥0 `=1
and Nj∼ (g 0 − g; α; X, Σ, ρ) ≤
(m+n)/2 n X X lB M 2j e ej (X) |G∼ αm+n m,n−`,` |Σ l Mj m,n≥0 `=1
n X X M 2j XB ej (X) ≤ l m,n≥0 `=1
n `
!
` m+n
(const cB γ) α
lB Mj
(m+n)/2
e |G∼ m,n |Σ
(m+n)/2 X M 2j lB n m+n e XB ej (X) = [(1 + const cB γ) − 1]α |G∼ m,n |Σ . l Mj m,n≥0
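If $\mathrm{const}\; c_B\gamma \le \frac13$,
\[
\begin{aligned}
(1+\mathrm{const}\; c_B\gamma)^{n} - 1
&\le \mathrm{const}\; c_B\gamma\, n\,(1+\mathrm{const}\; c_B\gamma)^{n-1}\\
&\le \mathrm{const}\; c_B\gamma\, \big(\tfrac32\big)^{n} (1+\mathrm{const}\; c_B\gamma)^{n-1}\\
&\le \mathrm{const}\; c_B\gamma\, 2^{n} \le \mathrm{const}\;\gamma\, 2^{n}
\end{aligned}
\]
and
\[
N_j^{\sim}(g'-g;\alpha;X,\Sigma,\rho) \le \mathrm{const}\;\gamma X_B\, N_j^{\sim}(g;2\alpha;X,\Sigma,\rho) .
\]
The proof of the second claim is similar but uses
\[
|G^{\sim}_{m,n-\ell,\ell}|^{\tilde{}}_{p,\Sigma,\rho} \le \binom{n}{\ell} (\mathrm{const}\; c_B\gamma)^{\ell}\, X_B\, e_j(X)\,|G^{\sim}_{m,n}|^{\tilde{}}_{p,\Sigma,\rho}
\]
and $(\mathrm{const}\; c_B\gamma)^{\ell}\binom{n}{\ell} \le \mathrm{const}\;\gamma$ for $\ell \ge 1$, $n \le 4$.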
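The elementary chain of inequalities used at the end of this proof can be checked numerically. The following is only a sanity-check sketch, not part of the original argument: the variable `x` stands in for the product const c_B γ, and the function name and sample grid are illustrative assumptions. Exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def chain_holds(x, n):
    """Check (1+x)^n - 1 <= x n (1+x)^(n-1) <= x (3/2)^n (1+x)^(n-1) <= x 2^n."""
    a = (1 + x) ** n - 1                              # left-hand side
    b = x * n * (1 + x) ** (n - 1)                    # mean value bound
    c = x * Fraction(3, 2) ** n * (1 + x) ** (n - 1)  # uses n <= (3/2)^n
    d = x * 2 ** n                                    # uses 1 + x <= 4/3
    return a <= b <= c <= d

# Hypothetical sample grid: x = 0, 1/300, ..., 1/3 and n = 1, ..., 40.
xs = [Fraction(k, 300) for k in range(0, 101)]
all_ok = all(chain_holds(x, n) for x in xs for n in range(1, 41))
print(all_ok)
```

The last step of the chain relies on the hypothesis const c_B γ ≤ 1/3: for instance `chain_holds(Fraction(1), 2)` is false, because (3/2)^2 · 2 exceeds 2^2.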
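Proof of Theorem XVII.3. For a sectorized Grassmann function $v = \sum_n v_n$ with $v_n \in \bigwedge^n \tilde V$ let
\[
N^{\sim}(v;\alpha) = \frac{1}{b^2}\, c \sum_n \alpha^n b^n\, |v_n|^{\tilde{}}_{\Sigma},
\qquad
N^{\sim}_{\mathrm{impr}}(v;\alpha) = \frac{1}{b^2}\, c \sum_n \alpha^n b^n\, |v_n|^{\tilde{}}_{\mathrm{impr},\Sigma}
\]
be the quantities introduced in [1, Definition II.23] and just after [3, Lemma VI.2]. Then
\[
N^{\sim}(v;\alpha) = \frac{\mathrm{const}_1}{B}\, N_j^{\sim}(v;\alpha;X,\Sigma,\rho)
\]
where $\mathrm{const}_1$ is the constant of Lemma XVII.4. If
\[
{:}w''{:}_{\psi,\tilde D_\Sigma} = \Omega_{\tilde C_\Sigma}\big({:}w{:}_{\psi,\tilde C_\Sigma+\tilde D_\Sigma}\big),
\]
then, by Proposition XII, parts (ii) and (iii), and [1, Proposition A.2(ii)],
\[
w' = w''(\phi,\psi+\hat B\phi)
\]
is a sectorized representative for $W'$. We apply [3, Theorem VI.6] to get estimates on $w''$. Choosing $\mathrm{const}_0 = \frac{B}{8\,\mathrm{const}_1}$, the hypotheses of this theorem are fulfilled by Lemma XVII.4. Consequently,
\[
N^{\sim}(w''-w;\alpha) \le \frac{1}{2\alpha^2}\, \frac{N^{\sim}(w;32\alpha)^2}{1-\frac{1}{\alpha^2} N^{\sim}(w;32\alpha)}
\tag{XVII.5}
\]
\[
\alpha^2 c\, |f_2''|^{\tilde{}}_{\mathrm{impr},\Sigma} \le \frac{2^{10}\, l}{\alpha^6}\, \frac{N^{\sim}(w;64\alpha)^2}{1-\frac{8}{\alpha} N^{\sim}(w;64\alpha)}
\tag{XVII.6}
\]
\[
\alpha^4 b^2 c\, \Big| f_4'' - f_4 - \frac14 \sum_{\ell\ge 1} (-1)^{\ell} (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(f_4; c_\Sigma, d_\Sigma) \Big|^{\tilde{}}_{\mathrm{impr},\Sigma}
\le \frac{2^{10}\, l}{\alpha^6}\, \frac{N^{\sim}(w;64\alpha)^2}{1-\frac{8}{\alpha} N^{\sim}(w;64\alpha)} .
\tag{XVII.7}
\]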
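In (XVII.7), we used the description of ladders in terms of kernels given in [3, Proposition C.4]. By Lemma XVII.5, with $X_B = 1$,
\[
\begin{aligned}
N^{\sim}(w'-w;\alpha) &= N^{\sim}\big(w''(\phi,\psi+\hat B\phi) - w(\phi,\psi);\alpha\big)\\
&\le N^{\sim}\big(w''(\phi,\psi+\hat B\phi) - w''(\phi,\psi);\alpha\big) + N^{\sim}\big(w''(\phi,\psi) - w(\phi,\psi);\alpha\big)\\
&\le \mathrm{const}\;\gamma\, N^{\sim}(w'';2\alpha) + N^{\sim}(w''-w;\alpha)\\
&\le \mathrm{const}\;\gamma\, N^{\sim}(w;2\alpha) + (1+\mathrm{const}\;\gamma)\, N^{\sim}(w''-w;2\alpha)\\
&\le \mathrm{const}\;\gamma\, N^{\sim}(w;2\alpha) + (1+\mathrm{const}\;\gamma)\, \frac{1}{8\alpha^2}\, \frac{N^{\sim}(w;64\alpha)^2}{1-\frac{1}{\alpha^2} N^{\sim}(w;64\alpha)}\\
&\le \mathrm{const}\, \Big(\frac{1}{\alpha}+\gamma\Big)\, \frac{N_j^{\sim}(w;64\alpha;X,\Sigma,\rho)}{1-\frac{\mathrm{const}}{\alpha} N_j^{\sim}(w;64\alpha;X,\Sigma,\rho)} .
\end{aligned}
\]
By (XVII.6),
\[
M^j \alpha^2 e_j(X)\, |f_2''|^{\tilde{}}_{1,\Sigma,\rho} \le M^j \alpha^2 e_j(X)\, |f_2''|^{\tilde{}}_{\mathrm{impr},\Sigma} \le \mathrm{const}\, \frac{l}{\alpha^6}\, \frac{N^{\sim}(w;64\alpha)^2}{1-\frac{8}{\alpha} N^{\sim}(w;64\alpha)} .
\]
Applying Lemma XVII.5 to the part of $w''$ that is homogeneous of degree two in $\psi$ and $\phi$ combined yields
\[
|f_2'|^{\tilde{}}_{1,\Sigma,\rho} \le 4\, e_j(X)\, |f_2''|^{\tilde{}}_{1,\Sigma,\rho}
\]
and hence
\[
|f_2'|^{\tilde{}}_{1,\Sigma,\rho} \le \mathrm{const}\, \frac{l}{\alpha^8 M^j}\, \frac{N_j^{\sim}(w;64\alpha;X,\Sigma,\rho)^2}{1-\frac{\mathrm{const}}{\alpha} N_j^{\sim}(w;64\alpha;X,\Sigma,\rho)} .
\]
By Lemma XVI.12 and (XVII.7)
\[
\Big| f_4'' - f_4 - \frac14 \sum_{\ell=1}^{\infty} (-1)^{\ell} (12)^{\ell+1}\, \mathrm{Ant}\, L_\ell(f_4; C, D) \Big|^{\tilde{}}_{3,\Sigma,\rho}
\le \mathrm{const}\, \frac{l}{\alpha^{10}}\, \frac{N_j^{\sim}(w;64\alpha)^2}{1-\frac{\mathrm{const}}{\alpha} N_j^{\sim}(w;64\alpha)} .
\]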
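Theorem XVII.6. Let $c_B > 0$. There are constants $\mathrm{const}$, $\mathrm{const}_0$, $\alpha_0$, $\gamma_0$ and $\tau_0$ that are independent of $j$, $\Sigma$, $\rho$ such that for all $\alpha \ge \alpha_0$, $\varepsilon > 0$ and $\gamma \le \gamma_0$ the following holds: Let, for $\kappa$ in a neighborhood of zero, $u_\kappa, v_\kappa \in F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions. Set
\[
C_\kappa(k) = \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u_\kappa(k)}, \qquad
D_\kappa(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v_\kappa(k)}
\]
and let $C_\kappa(\xi,\xi')$, $D_\kappa(\xi,\xi')$ be the Fourier transforms of $C_\kappa(k)$, $D_\kappa(k)$. Let $B_\kappa(k)$ be a function on $\mathbb{R}\times\mathbb{R}^2$ and set
\[
(\hat B_\kappa \phi)(\xi) = \int d\xi'\, \hat B_\kappa(\xi,\xi')\,\phi(\xi') .
\]
Furthermore, let, for $\kappa$ in a neighborhood of zero, $W_\kappa(\phi,\psi)$ be an even Grassmann function and set
\[
{:}W'_\kappa(\phi,\psi){:}_{\psi,D_\kappa} = \Omega_{C_\kappa}\big({:}W_\kappa(\phi,\psi){:}_{\psi,C_\kappa+D_\kappa}\big)(\phi,\psi+\hat B_\kappa\phi) .
\]
Assume that the following estimates are fulfilled:
• $\rho_{m+1;n-1} \le \gamma\,\rho_{m;n}$ for all $m \ge 0$ and $n \ge 1$.
• $|\check u_0(k)|, |\check v_0(k)| \le \frac12 |\imath k_0 - e(k)|$ and $\big|\frac{d}{d\kappa}\check v_\kappa(k)\big|_{\kappa=0}\big| \le \varepsilon\, |\imath k_0 - e(k)|$.
• $|u_0|_{1,\Sigma} \le \mu(\Lambda+X)\,e_j(X)$ and $\big|\frac{d}{d\kappa} u_\kappa\big|_{\kappa=0}\big|_{1,\Sigma} \le e_j(X)\,Y$ with $X, Y \in N_{d+1}$, $\mu, \Lambda > 0$ such that $(1+\mu)(\Lambda+X_0) \le \frac{\tau_0}{M^j}$.
• $\|B_0(k)\|^{\tilde{}} \le c_B\, e_j(X)$ and $\big\|\frac{d}{d\kappa} B_\kappa(k)\big\|^{\tilde{}} \le c_B\, e_j(X)\,Z$ with $Z \in N_{d+1}$.
• $W_\kappa$ has a sectorized representative $w_\kappa$ with
\[
n \equiv N_j^{\sim}(w_0;64\alpha;X,\Sigma,\rho) \le \mathrm{const}_0\,\alpha + \sum_{\delta\ne 0} \infty\, t^{\delta} .
\]
Then $W'_\kappa$ has a sectorized representative $w'_\kappa$ such that
\[
\begin{aligned}
N_j^{\sim}\Big(\frac{d}{d\kappa}[w'_\kappa - w_\kappa]_{\kappa=0};\alpha;X,\Sigma,\rho\Big)
&\le \mathrm{const}\,\Big(\gamma+\frac{1}{\alpha^2}\Big)\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\; N_j^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};16\alpha;X,\Sigma,\rho\Big)\\
&\quad + \mathrm{const}\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, \Big(\frac{1}{\alpha^2} M^j\, Y\, n + \frac{\varepsilon}{\alpha^2}\, n + \gamma Z\Big) .
\end{aligned}
\]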
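Lemma XVII.7. Under the hypotheses of Theorem XVII.6, there exists a constant $\mathrm{const}_2$ that is independent of $j$ and $\Sigma$ such that $\tilde C_{0,\Sigma}$ has contraction bound $c$, $\tilde C_{0,\Sigma}$ and $\tilde D_{0,\Sigma}$ have integral bound $\frac12 b$ and
\[
\frac{d}{d\kappa}\tilde C_{\kappa,\Sigma}\Big|_{\kappa=0} \text{ has contraction bound } c' = \mathrm{const}_2\, M^{2j} e_j(X)\, Y,
\]
\[
\frac{d}{d\kappa}\tilde D_{\kappa,\Sigma}\Big|_{\kappa=0} \text{ has integral bound } \tfrac12 b' = \sqrt{\varepsilon}\, b
\]
for the family $|\cdot|^{\tilde{}}_{\Sigma}$ of symmetric seminorms.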
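Proof. The contraction and integral bounds on $\tilde C_{0,\Sigma}$ and $\tilde D_{0,\Sigma}$ were proven in Lemma XVII.4. Clearly, the function
\[
\frac{d}{d\kappa} D_\kappa(k) = \frac{d}{d\kappa}\, \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v_\kappa(k)} = \frac{\nu^{(\ge j+1)}(k)}{[\imath k_0 - e(k) - \check v_\kappa(k)]^2}\, \frac{d}{d\kappa}\check v_\kappa(k)
\]
is supported on the $j$th neighborhood and obeys $\big|\frac{d}{d\kappa} D_\kappa(k)\big|_{\kappa=0}\big| \le \frac{4\varepsilon}{|\imath k_0 - e(k)|}$. By Proposition XVI.8(ii) and the first property of (XVII.1), $2\sqrt{4 B_3\, \varepsilon\, \frac{l}{M^j}} \le \sqrt{\varepsilon}\, b$ is an integral bound for $\frac{d}{d\kappa}\tilde D_{\kappa,\Sigma}\big|_{\kappa=0}$.

Set $c'' = 12\, \big|\frac{d}{d\kappa} c_\kappa\big|_{\kappa=0}\big|_{1,\Sigma}$. By Proposition XVI.8(i) (see also [3, Lemma VI.15]) and the second property of $\rho$ in (XVII.1), $\big(\frac{d}{d\kappa} c_\kappa\big|_{\kappa=0}\big)_\Sigma$ has contraction bound $c''$. We showed in Lemma XV.8 that
\[
c'' \le \mathrm{const}\; M^{2j} e_j(X)\, Y .
\]

Lemma XVII.8. Let $g(\phi,\psi)$ be a sectorized Grassmann function and set
\[
g'_\kappa(\phi,\psi) = g(\phi,\psi+\hat B_\kappa\phi) .
\]
Under the hypotheses of Theorem XVII.6,
\[
N_j^{\sim}\Big(\frac{d}{d\kappa} g'_\kappa\Big|_{\kappa=0};\alpha;X,\Sigma,\rho\Big) \le \mathrm{const}\;\gamma Z\, N_j^{\sim}(g;2\alpha;X,\Sigma,\rho) .
\]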
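Proof. Define, as in Lemma XVII.5, for $\check\eta_{m+1} = (k_{m+1},\sigma_{m+1},a_{m+1})$,
\[
\begin{aligned}
\varphi'_\kappa(\check\eta_1,\dots,\check\eta_{m+1};(\xi_1,s_1),\dots,(\xi_{n-1},s_{n-1}))
= \mathrm{Ant}_{\mathrm{ext}} \sum_{s\in\Sigma}\int d\zeta\; & B_\kappa(k_{m+1})\,E_+(\check\eta_{m+1},\zeta)\,
\varphi(\check\eta_1,\dots,\check\eta_m;\\
&(\xi_1,s_1),\dots,(\xi_{i-1},s_{i-1}),(\zeta,s),(\xi_i,s_i),\dots,(\xi_{n-1},s_{n-1}))
\end{aligned}
\]
if $n \ge 2$, and
\[
\begin{aligned}
\varphi'_\kappa(\check\eta_1,\dots,\check\eta_{m+1})\,(2\pi)^{d+1}\delta(k_1+\cdots+k_{m+1})
&= \mathrm{Ant}_{\mathrm{ext}} \sum_{s\in\Sigma}\int d\zeta\, B_\kappa(k_{m+1})\,E_+(\check\eta_{m+1},\zeta)\,\varphi(\check\eta_1,\dots,\check\eta_m;(\zeta,s))\\
&= \mathrm{Ant}_{\mathrm{ext}} \sum_{s\in\Sigma} B_\kappa(k_{m+1})\,\varphi(\check\eta_1,\dots,\check\eta_m;(0,\sigma_{m+1},a_{m+1},s))\,(2\pi)^{d+1}\delta(k_1+\cdots+k_{m+1})
\end{aligned}
\]
if $n = 1$. By (XVII.4), with $X' = X_B = 1$,
\[
e_j(X)\,|\varphi'_0|^{\tilde{}}_{\Sigma} \le \mathrm{const}\;\gamma\, e_j(X)\,|\varphi|^{\tilde{}}_{\Sigma}
\tag{XVII.8}
\]
and by the same derivation as led to (XVII.4), but with $X' = X_B = Z$,
\[
e_j(X)\,\Big|\frac{d}{d\kappa}\varphi'_\kappa\Big|_{\kappa=0}\Big|^{\tilde{}}_{\Sigma} \le \mathrm{const}\;\gamma\, e_j(X)\, Z\, |\varphi|^{\tilde{}}_{\Sigma} .
\tag{XVII.9}
\]
As in Lemma XVII.5, write $g(\phi,\psi) = \sum_{m,n} g_{m,n}(\phi,\psi)$, with $g_{m,n}$ of degree $m$ in $\phi$ and degree $n$ in $\psi$, and
\[
g(\phi,\psi+\zeta) = \sum_{m,n} g_{m,n}(\phi,\psi+\zeta) = \sum_{m,n}\sum_{\ell=0}^{n} g_{m,n-\ell,\ell}(\phi,\psi,\zeta)
\]
with $g_{m,n-\ell,\ell}$ of degrees $m$ in $\phi$, $n-\ell$ in $\psi$ and $\ell$ in $\zeta$. Let $G_{m,n}$ and $G_{\kappa;m,n-\ell,\ell}$ be the kernels of $g_{m,n}(\phi,\psi)$ and $g_{m,n-\ell,\ell}(\phi,\psi,\hat B_\kappa\phi)$ respectively. By the binomial theorem, Leibniz's rule, one application of (XVII.9) and $\ell-1$ applications of (XVII.8),
\[
e_j(X)\,\Big|\frac{d}{d\kappa} G^{\sim}_{\kappa;m,n-\ell,\ell}\Big|_{\kappa=0}\Big|^{\tilde{}}_{\Sigma} \le \binom{n}{\ell} (\mathrm{const}\;\gamma)^{\ell}\, \ell\; e_j(X)\, Z\, |G^{\sim}_{m,n}|^{\tilde{}}_{\Sigma} .
\]
Since $G^{\sim}_{\kappa;m,n,0}$ is independent of $\kappa$,
\[
\begin{aligned}
N_j^{\sim}\Big(\frac{d}{d\kappa} g'_\kappa\Big|_{\kappa=0};\alpha;X,\Sigma,\rho\Big)
&\le \frac{M^{2j}}{l}\, e_j(X) \sum_{m,n\ge 0}\sum_{\ell=1}^{n} \alpha^{m+n} \Big(\frac{l B}{M^j}\Big)^{(m+n)/2} \Big|\frac{d}{d\kappa} G^{\sim}_{\kappa;m,n-\ell,\ell}\Big|_{\kappa=0}\Big|^{\tilde{}}_{\Sigma}\\
&\le \frac{M^{2j}}{l}\, e_j(X)\, Z \sum_{m,n\ge 0}\sum_{\ell=1}^{n} \ell \binom{n}{\ell} (\mathrm{const}\;\gamma)^{\ell}\, \alpha^{m+n} \Big(\frac{l B}{M^j}\Big)^{(m+n)/2} |G^{\sim}_{m,n}|^{\tilde{}}_{\Sigma}\\
&= \frac{M^{2j}}{l}\, e_j(X)\, Z \sum_{m,n\ge 0}\sum_{\ell=1}^{n} n \binom{n-1}{\ell-1} (\mathrm{const}\;\gamma)^{\ell}\, \alpha^{m+n} \Big(\frac{l B}{M^j}\Big)^{(m+n)/2} |G^{\sim}_{m,n}|^{\tilde{}}_{\Sigma}\\
&= \frac{M^{2j}}{l}\, e_j(X)\, Z \sum_{m,n\ge 0} \mathrm{const}\;\gamma\, n\, (1+\mathrm{const}\;\gamma)^{n-1}\, \alpha^{m+n} \Big(\frac{l B}{M^j}\Big)^{(m+n)/2} |G^{\sim}_{m,n}|^{\tilde{}}_{\Sigma}\\
&\le \frac{M^{2j}}{l}\, e_j(X)\, Z \sum_{m,n\ge 0} \mathrm{const}\;\gamma\, 2^{n}\, \alpha^{m+n} \Big(\frac{l B}{M^j}\Big)^{(m+n)/2} |G^{\sim}_{m,n}|^{\tilde{}}_{\Sigma}\\
&\le \mathrm{const}\;\gamma Z\, N_j^{\sim}(g;2\alpha;X,\Sigma,\rho) .
\end{aligned}
\]

Proof of Theorem XVII.6. As in the proof of Theorem XVII.3, let, for a sectorized Grassmann function $v = \sum_n v_n$ with $v_n \in \bigwedge^n \tilde V$,
\[
N^{\sim}(v;\alpha) = \frac{1}{b^2}\, c \sum_n \alpha^n b^n\, |v_n|^{\tilde{}}_{\Sigma} = \frac{\mathrm{const}_1}{B}\, N_j^{\sim}(v;\alpha;X,\Sigma,\rho)
\]
and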
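\[
{:}w''_\kappa{:}_{\psi,\tilde D_{\kappa,\Sigma}} = \Omega_{\tilde C_{\kappa,\Sigma}}\big({:}w_\kappa{:}_{\psi,\tilde C_{\kappa,\Sigma}+\tilde D_{\kappa,\Sigma}}\big) .
\]
By Proposition XII, parts (ii) and (iii), and [1, Proposition A.2(ii)],
\[
w'_\kappa = w''_\kappa(\phi,\psi+\hat B_\kappa\phi)
\]
is a sectorized representative for $W'_\kappa$. By the chain rule and the triangle inequality
\[
\begin{aligned}
N^{\sim}\Big(\frac{d}{d\kappa}[w'_\kappa - w_\kappa]_{\kappa=0};\alpha\Big)
&\le N^{\sim}\Big(\frac{d}{d\kappa} w''_0(\phi,\psi+\hat B_\kappa\phi)\Big|_{\kappa=0};\alpha\Big)\\
&\quad + N^{\sim}\Big(\frac{d}{d\kappa}\big[w''_\kappa(\phi,\psi+\hat B_0\phi) - w''_\kappa(\phi,\psi)\big]_{\kappa=0};\alpha\Big)\\
&\quad + N^{\sim}\Big(\frac{d}{d\kappa}\big[w''_\kappa(\phi,\psi) - w_\kappa(\phi,\psi)\big]_{\kappa=0};\alpha\Big) .
\end{aligned}
\tag{XVII.10}
\]
By Lemma XVII.8,
\[
N^{\sim}\Big(\frac{d}{d\kappa} w''_0(\phi,\psi+\hat B_\kappa\phi)\Big|_{\kappa=0};\alpha\Big) \le \mathrm{const}\;\gamma Z\, N_j^{\sim}(w''_0;2\alpha;X,\Sigma,\rho) .
\]
By (XVII.5),
\[
\begin{aligned}
N_j^{\sim}(w''_0;2\alpha;X,\Sigma,\rho)
&\le N_j^{\sim}(w_0;2\alpha;X,\Sigma,\rho) + N_j^{\sim}(w''_0 - w_0;2\alpha;X,\Sigma,\rho)\\
&\le N_j^{\sim}(w_0;2\alpha;X,\Sigma,\rho) + \frac{B}{\mathrm{const}_1}\, \frac{1}{8\alpha^2}\, \frac{N^{\sim}(w_0;64\alpha)^2}{1-\frac{1}{4\alpha^2} N^{\sim}(w_0;64\alpha)}\\
&\le N_j^{\sim}(w_0;2\alpha;X,\Sigma,\rho) + \frac{\mathrm{const}}{\alpha^2}\, \frac{N_j^{\sim}(w_0;64\alpha;X,\Sigma,\rho)^2}{1-\frac{\mathrm{const}}{\alpha^2} N_j^{\sim}(w_0;64\alpha;X,\Sigma,\rho)}\\
&\le \mathrm{const}\, \frac{N_j^{\sim}(w_0;64\alpha;X,\Sigma,\rho)}{1-\frac{\mathrm{const}}{\alpha^2} N_j^{\sim}(w_0;64\alpha;X,\Sigma,\rho)}
\end{aligned}
\]
so that
\[
N_j^{\sim}\Big(\frac{d}{d\kappa} w''_0(\phi,\psi+\hat B_\kappa\phi)\Big|_{\kappa=0};\alpha;X,\Sigma,\rho\Big) \le \mathrm{const}\;\gamma\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, Z .
\tag{XVII.11}
\]
By Lemma XVII.5, with $g = \frac{d}{d\kappa} w''_\kappa\big|_{\kappa=0}$, $B = B_0$ and $X_B = 1$,
\[
\begin{aligned}
N_j^{\sim}\Big(\frac{d}{d\kappa}\big[w''_\kappa(\phi,\psi+\hat B_0\phi) - w''_\kappa(\phi,\psi)\big]_{\kappa=0};\alpha;X,\Sigma,\rho\Big)
&\le \mathrm{const}\;\gamma\, N_j^{\sim}\Big(\frac{d}{d\kappa} w''_\kappa\Big|_{\kappa=0};2\alpha;X,\Sigma,\rho\Big)\\
&\le \mathrm{const}\;\gamma\, N_j^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};2\alpha;X,\Sigma,\rho\Big)\\
&\quad + \mathrm{const}\;\gamma\, N_j^{\sim}\Big(\frac{d}{d\kappa}[w''_\kappa - w_\kappa]_{\kappa=0};2\alpha;X,\Sigma,\rho\Big) .
\end{aligned}
\tag{XVII.12}
\]
By [1, Theorem IV.4], with $\mu = M^j$ (and assuming that we have chosen $\mathrm{const}_1 \ge 1$),
\[
\begin{aligned}
N^{\sim}\Big(\frac{d}{d\kappa}[w''_\kappa - w_\kappa]_{\kappa=0};\alpha\Big)
&\le \frac{1}{2\alpha^2}\, \frac{N^{\sim}(w_0;32\alpha)}{1-\frac{1}{\alpha^2} N^{\sim}(w_0;32\alpha)}\; N^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};8\alpha\Big)\\
&\quad + \frac{1}{2\alpha^2}\, \frac{N^{\sim}(w_0;32\alpha)^2}{1-\frac{1}{\alpha^2} N^{\sim}(w_0;32\alpha)}\, \Big\{\frac{1}{4M^j}\, \mathrm{const}_2\, M^{2j} e_j(X)\, Y + 4\varepsilon\Big\}\\
&\le \frac{\mathrm{const}}{\alpha^2}\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, \Big\{ N^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};8\alpha\Big) + M^j\, Y\, n + 4\varepsilon n \Big\}
\end{aligned}
\tag{XVII.13}
\]
since $e_j(X)\, N^{\sim}(w_0;32\alpha) \le \mathrm{const}\; N^{\sim}(w_0;32\alpha)$. Also
\[
N_j^{\sim}\Big(\frac{d}{d\kappa}[w''_\kappa - w_\kappa]_{\kappa=0};2\alpha;X,\Sigma,\rho\Big)
\le \frac{\mathrm{const}}{\alpha^2}\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, \Big\{ N^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};16\alpha\Big) + M^j\, Y\, n + 4\varepsilon n \Big\} .
\]
Substituting (XVII.11)–(XVII.13) into (XVII.10),
\[
\begin{aligned}
N^{\sim}\Big(\frac{d}{d\kappa}[w'_\kappa - w_\kappa]_{\kappa=0};\alpha\Big)
&\le \mathrm{const}\;\gamma\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, Z + \mathrm{const}\;\gamma\, N^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};2\alpha\Big)\\
&\quad + (1+\gamma)\, \frac{\mathrm{const}}{\alpha^2}\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, \Big\{ N^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};16\alpha\Big) + M^j\, Y\, n + 4\varepsilon n \Big\}\\
&\le \mathrm{const}\,\Big(\gamma+\frac{1}{\alpha^2}\Big)\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\; N_j^{\sim}\Big(\frac{d}{d\kappa} w_\kappa\Big|_{\kappa=0};16\alpha;X,\Sigma,\rho\Big)\\
&\quad + \mathrm{const}\, \frac{n}{1-\frac{\mathrm{const}}{\alpha^2} n}\, \Big(\frac{1}{\alpha^2} M^j\, Y\, n + \frac{\varepsilon}{\alpha^2}\, n + \gamma Z\Big) .
\end{aligned}
\]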
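The resummation over $\ell$ in the proof of Lemma XVII.8 rests on the identity $\ell\binom{n}{\ell} = n\binom{n-1}{\ell-1}$, which turns the weighted binomial sum into $n x (1+x)^{n-1}$. The sketch below checks this in exact arithmetic; it is an illustration only, with `x` standing in for const γ, and the function names and the sample value 1/5 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def weighted_binomial_sum(n, x):
    """Sum over l = 1..n of l * C(n, l) * x**l."""
    return sum(l * comb(n, l) * x ** l for l in range(1, n + 1))

def closed_form(n, x):
    """n * x * (1 + x)**(n - 1), obtained from l*C(n,l) = n*C(n-1,l-1)."""
    return n * x * (1 + x) ** (n - 1)

# Hypothetical sample value for const*gamma; it must be small for the 2^n bound.
x = Fraction(1, 5)
identity_ok = all(weighted_binomial_sum(n, x) == closed_form(n, x)
                  for n in range(1, 30))
bound_ok = all(closed_form(n, x) <= x * 2 ** n for n in range(1, 30))
print(identity_ok, bound_ok)
```

The second check mirrors the step that replaces $\mathrm{const}\,\gamma\, n (1+\mathrm{const}\,\gamma)^{n-1}$ by $\mathrm{const}\,\gamma\, 2^n$, which is what moves the weight from $\alpha$ to $2\alpha$ in the final norm.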
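Remark XVII.9. In Theorem XVII.3, the sectorized representative $w'$ of $W'$ may be obtained from the sectorized representative $w$ of $W$ by
\[
{:}w'{:}_{\psi,\tilde D_\Sigma} = \Omega_{\tilde C_\Sigma}\big({:}w{:}_{\psi,\tilde C_\Sigma+\tilde D_\Sigma}\big)(\phi,\psi+\hat B\phi) .
\]
The obvious analog of this statement applies to Theorem XVII.6.

Appendices

D. Naive ladder estimates

Let $j \ge 2$ and let $\Sigma$ be a sectorization of scale $j$ and length $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$. To systematically treat ladders, we introduce an auxiliary channel norm, similar to the $|\cdot|^{\tilde{}}_{2,\Sigma}$ norm, but with only the leftmost momenta held fixed.

Definition D.1. (i) Let $0 \le r \le 2$ and $f \in \check F_r(4-r,\Sigma)$. We set
\[
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} = \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_1,\dots,s_{2-r}\in\Sigma}}\ \sum_{s_{3-r},s_{4-r}\in\Sigma}\ \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D f(\check\eta_1,\dots,\check\eta_r;(\xi_1,s_1),\dots,(\xi_{4-r},s_{4-r})) |\!|\!|_{1,\infty}\; t^{\delta} .
\]
The norm $|\!|\!|\cdot|\!|\!|_{1,\infty}$ of Example II.6 refers to the variables $\xi_1,\dots,\xi_{4-r}$. If $r = 0$, we also write $|f|_{\mathrm{ch},\Sigma}$ instead of $|f|^{\tilde{}}_{\mathrm{ch},\Sigma}$.
(ii) If $f \in \check F_{4;\Sigma}$, we set
\[
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} = \sum_{i_1,i_2\in\{0,1\}} \big|\mathrm{Ord}\, f|_{(i_1,i_2,1,1)}\big|^{\tilde{}}_{\mathrm{ch},\Sigma} .
\]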
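Lemma D.2. There is a constant $\mathrm{const}$, independent of $j$ and $M$ such that the following hold. Let $0 \le r \le 2$ and $f \in \check F_r(4-r,\Sigma)$.
(i)
\[
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} \le |f|^{\tilde{}}_{1,\Sigma} \quad \text{if } r \le 1,
\qquad
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} \le \frac{\mathrm{const}}{l}\, |f|^{\tilde{}}_{3,\Sigma} .
\]
(ii)
\[
|f|^{\tilde{}}_{4,\Sigma} \le |f|^{\tilde{}}_{3,\Sigma},
\qquad
|f|^{\tilde{}}_{3,\Sigma} \le \mathrm{const}\; |f|^{\tilde{}}_{4,\Sigma} .
\]
(iii) If $r = 1$ or if $r = 0$ and $f$ is antisymmetric, then
\[
|f|^{\tilde{}}_{1,\Sigma} \le \frac{\mathrm{const}}{l}\, |f|^{\tilde{}}_{\mathrm{ch},\Sigma} .
\]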
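Proof. Set
\[
F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r}) = \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D f(\check\eta_1,\dots,\check\eta_r;(\xi_1,s_1),\dots,(\xi_{4-r},s_{4-r})) |\!|\!|_{1,\infty}\; t^{\delta} .
\]
(i) Then
\[
\begin{aligned}
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} &= \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_1,\dots,s_{2-r}\in\Sigma}}\ \sum_{s_{3-r},s_{4-r}\in\Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&\le \sup_{1\le i_1 < \cdots < i_{1-r}\le 4-r}\ \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_{i_1},\dots,s_{i_{1-r}}\in\Sigma}}\ \sum_{\substack{s_i\in\Sigma\\ i\ne i_1,\dots,i_{1-r}}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r}) = |f|^{\tilde{}}_{1,\Sigma}
\end{aligned}
\]
if $r \le 1$ and, since $\Sigma$ contains at most $\frac{\mathrm{const}}{l}$ elements,
\[
\begin{aligned}
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} &= \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_1,\dots,s_{2-r}\in\Sigma}}\ \sum_{s_{3-r},s_{4-r}\in\Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&\le \frac{\mathrm{const}}{l} \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_1,\dots,s_{3-r}\in\Sigma}}\ \sum_{s_{4-r}\in\Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&\le \frac{\mathrm{const}}{l} \sup_{1\le i_1 < \cdots < i_{3-r}\le 4-r}\ \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_{i_1},\dots,s_{i_{3-r}}\in\Sigma}}\ \sum_{\substack{s_i\in\Sigma\\ i\ne i_1,\dots,i_{3-r}}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&= \frac{\mathrm{const}}{l}\, |f|^{\tilde{}}_{3,\Sigma} .
\end{aligned}
\]
(ii)
\[
\begin{aligned}
|f|^{\tilde{}}_{4,\Sigma} &= \sup_{\substack{s_1,\dots,s_{4-r}\in\Sigma\\ \check\eta_1,\dots,\check\eta_r\in\check B}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&\le \sup_{1\le i_1 < \cdots < i_{3-r}\le 4-r}\ \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_{i_1},\dots,s_{i_{3-r}}\in\Sigma}}\ \sum_{\substack{s_i\in\Sigma\\ i\ne i_1,\dots,i_{3-r}}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r}) = |f|^{\tilde{}}_{3,\Sigma}\\
&\le \mathrm{const} \sup_{\substack{s_1,\dots,s_{4-r}\in\Sigma\\ \check\eta_1,\dots,\check\eta_r\in\check B}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r}) = \mathrm{const}\; |f|^{\tilde{}}_{4,\Sigma}
\end{aligned}
\]
since, by conservation of momentum, for any fixed $s_{i_1},\dots,s_{i_{3-r}}$ and $\check\eta_1,\dots,\check\eta_r$, there are at most $\mathrm{const}$ choices of $s_i$, $i \ne i_1,\dots,i_{3-r}$, for which $F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})$ does not vanish.
(iii) If $r = 1$ or if $r = 0$ and $f$ is antisymmetric, then
\[
\begin{aligned}
|f|^{\tilde{}}_{1,\Sigma} &= \sup_{1\le i_1 < \cdots < i_{1-r}\le 4-r}\ \sup_{\substack{\check\eta_1,\dots,\check\eta_r\in\check B\\ s_{i_1},\dots,s_{i_{1-r}}\in\Sigma}}\ \sum_{\substack{s_i\in\Sigma\\ i\ne i_1,\dots,i_{1-r}}} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&= \sup_{\substack{s_1,\dots,s_{1-r}\in\Sigma\\ \check\eta_1,\dots,\check\eta_r\in\check B}}\ \sum_{s_{2-r},\dots,s_{4-r}\in\Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&\le \frac{\mathrm{const}}{l} \sup_{\substack{s_1,\dots,s_{2-r}\in\Sigma\\ \check\eta_1,\dots,\check\eta_r\in\check B}}\ \sum_{s_{3-r},s_{4-r}\in\Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})\\
&= \frac{\mathrm{const}}{l}\, |f|^{\tilde{}}_{\mathrm{ch},\Sigma} .
\end{aligned}
\]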
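Corollary D.3. There is a constant $\mathrm{const}$, independent of $j$ and $M$ such that for all $f \in \check F_{4;\Sigma}$
\[
|f|^{\tilde{}}_{\mathrm{ch},\Sigma} \le \frac{\mathrm{const}}{l}\, |f|^{\tilde{}}_{3,\Sigma} .
\]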
Lemma D.4. Let $f_1, f_2 \in \check F_{4;\Sigma}$ and $c, d \in F_0(2;\Sigma)$. Define propagators
\[
c_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} c((\xi,t),(\xi',t')),
\qquad
d_\Sigma((\xi,s),(\xi',s')) = \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} d((\xi,t),(\xi',t'))
\]
over $B\times\Sigma$. Then
\[
\begin{aligned}
|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma} &\le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}\\
|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2|^{\tilde{}}_{\mathrm{ch},\Sigma} &\le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{\mathrm{ch},\Sigma}\\
|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2|^{\tilde{}}_{3,\Sigma} &\le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{3,\Sigma}
\end{aligned}
\]
where $|\!|\!|d|\!|\!|_\infty = \max_{s,s'\in\Sigma} \sup_{\xi,\xi'} |d((\xi,s),(\xi',s'))|$.

Proof. Set
\[
\begin{aligned}
f(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s'))
&= \sum_{s'_1,s'_3\in\Sigma} \int d\zeta_1\, d\zeta_3\; f_1(\cdot,\cdot,(\zeta_3,s'_3),(\xi,s))\; c_\Sigma((\zeta_3,s'_3),(\zeta_1,s'_1))\; f_2((\zeta_1,s'_1),(\xi',s'),\cdot,\cdot)\\
&= \sum_{\substack{s'_3,s''_3\in\Sigma\\ s'_1,s''_1\in\Sigma}} \int d\zeta_1\, d\zeta_3\; f_1(\cdot,\cdot,(\zeta_3,s'_3),(\xi,s))\; c((\zeta_3,s''_3),(\zeta_1,s''_1))\; f_2((\zeta_1,s'_1),(\xi',s'),\cdot,\cdot) .
\end{aligned}
\]
By iterated application of Lemma XVI.6,
\[
|f|^{\tilde{}}_{1,\Sigma} \le \mathrm{const}\; |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma} .
\]
Since
\[
\begin{aligned}
f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2
&= \sum_{s,s'\in\Sigma} \int d\xi\, d\xi'\; f(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s'))\; d_\Sigma((\xi,s),(\xi',s'))\\
&= \sum_{\substack{s,s',t,t'\in\Sigma\\ \tilde s\cap\tilde t\ne\emptyset,\ \tilde s'\cap\tilde t'\ne\emptyset}} \int d\xi\, d\xi'\; f(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s'))\; d((\xi,t),(\xi',t'))
\end{aligned}
\]
we have
\[
|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma} \le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |f|^{\tilde{}}_{1,\Sigma} .
\]
This proves the first inequality of the lemma. To prove the third inequality, set
\[
g(\cdot,\cdot,\cdot,\cdot;(\xi,s),\xi') = \sum_{\tilde s'\cap\tilde s\ne\emptyset} f(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s')) .
\]
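By conservation of momentum
\[
f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2 = \sum_{\substack{s,t,t'\in\Sigma\\ \tilde s\cap\tilde t\ne\emptyset,\ \tilde t\cap\tilde t'\ne\emptyset}} \int d\xi\, d\xi'\; g(\cdot,\cdot,\cdot,\cdot;(\xi,s),\xi')\; d((\xi,t),(\xi',t')) .
\tag{D.1}
\]
Fix any $\imath = (i_1,\dots,i_4) \in \{0,1\}^4$ and let
\[
\begin{aligned}
f_\imath(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s')) &= \sum_{\substack{s'_3,s''_3\in\Sigma\\ s'_1,s''_1\in\Sigma}} \int d\zeta_1\, d\zeta_3\; f_1|_{(i_1,i_2,1,1)}(\cdot,\cdot,(\zeta_3,s'_3),(\xi,s))\; c((\zeta_3,s''_3),(\zeta_1,s''_1))\; f_2|_{(1,1,i_3,i_4)}((\zeta_1,s'_1),(\xi',s'),\cdot,\cdot)\\
g_\imath(\cdot,\cdot,\cdot,\cdot;(\xi,s),\xi') &= \sum_{\tilde s'\cap\tilde s\ne\emptyset} f_\imath(\cdot,\cdot,\cdot,\cdot;(\xi,s),(\xi',s')) .
\end{aligned}
\]
By (D.1)
\[
f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath = \sum_{\substack{s,t,t'\in\Sigma\\ \tilde s\cap\tilde t\ne\emptyset,\ \tilde t\cap\tilde t'\ne\emptyset}} \int d\xi\, d\xi'\; g_\imath(\cdot,\cdot,\cdot,\cdot;(\xi,s),\xi')\; d((\xi,t),(\xi',t')) .
\tag{D.2}
\]
For each $\nu = 1,\dots,4$, fix $\check\eta_\nu \in \check B$ when $i_\nu = 0$ and $s_\nu \in \Sigma$ when $i_\nu = 1$. Let
\[
z_\nu = \begin{cases} \check\eta_\nu & \text{if } i_\nu = 0\\ (\xi_\nu,s_\nu) & \text{if } i_\nu = 1 \end{cases}
\]
and
\[
G_\imath = \sum_{s\in\Sigma}\ \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D g_\imath(z_1,\dots,z_4;(\xi,s),\xi') |\!|\!|_{1,\infty}\; t^{\delta} .
\]
By iterated application of Leibniz's rule and Lemma D.2(ii),
\[
\begin{aligned}
G_\imath &\le \mathrm{const}\; |c|_{1,\Sigma}\, \big|f_1|_{(i_1,i_2,1,1)}\big|^{\tilde{}}_{\mathrm{ch},\Sigma}\, \big|f_2|_{(1,1,i_3,i_4)}\big|^{\tilde{}}_{4,\Sigma}\\
&\le \mathrm{const}\; |c|_{1,\Sigma}\, \big|f_1|_{(i_1,i_2,1,1)}\big|^{\tilde{}}_{\mathrm{ch},\Sigma}\, \big|f_2|_{(1,1,i_3,i_4)}\big|^{\tilde{}}_{3,\Sigma}\\
&\le \mathrm{const}\; |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{3,\Sigma} .
\end{aligned}
\]
Furthermore, by (D.2),
\[
\sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D\, f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath (z_1,\dots,z_4) |\!|\!|_{1,\infty}\; t^{\delta} \le 9\, |\!|\!|d|\!|\!|_\infty\, G_\imath .
\]
This, together with Lemma D.2(ii), shows that
\[
\big|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath\big|^{\tilde{}}_{3,\Sigma} \le \mathrm{const}\; \big|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath\big|^{\tilde{}}_{4,\Sigma} \le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{3,\Sigma} .
\]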
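The proof of the second inequality is similar to that of the third. Choose $\imath = (i_1,i_2,1,1)$ with $i_1, i_2 \in \{0,1\}$. For each $\nu = 1,2$, fix $\check\eta_\nu \in \check B$ when $i_\nu = 0$ and $s_\nu \in \Sigma$ when $i_\nu = 1$. Let
\[
z_\nu = \begin{cases} \check\eta_\nu & \text{if } i_\nu = 0\\ (\xi_\nu,s_\nu) & \text{if } i_\nu = 1 \end{cases}
\]
and
\[
G'_\imath = \sum_{s_3,s_4,s\in\Sigma}\ \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D g_\imath(z_1,z_2,(\xi_3,s_3),(\xi_4,s_4);(\xi,s),\xi') |\!|\!|_{1,\infty}\; t^{\delta} .
\]
By iterated application of Leibniz's rule,
\[
G'_\imath \le \mathrm{const}\; |c|_{1,\Sigma}\, \big|f_1|_{(i_1,i_2,1,1)}\big|^{\tilde{}}_{\mathrm{ch},\Sigma}\, \big|f_2|_{(1,1,1,1)}\big|^{\tilde{}}_{\mathrm{ch},\Sigma} \le \mathrm{const}\; |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{\mathrm{ch},\Sigma} .
\]
Furthermore, by (D.2),
\[
\sum_{s_3,s_4\in\Sigma}\ \sum_{\delta\in\mathbb{N}_0\times\mathbb{N}_0^2} \frac{1}{\delta!} \max_{\substack{D\ \text{dd-operator}\\ \text{with}\ \delta(D)=\delta}} |\!|\!| D\, f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath (z_1,z_2,(\xi_3,s_3),(\xi_4,s_4)) |\!|\!|_{1,\infty}\; t^{\delta} \le 9\, |\!|\!|d|\!|\!|_\infty\, G'_\imath .
\]
This shows that
\[
\big|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2\big|_\imath\big|^{\tilde{}}_{\mathrm{ch},\Sigma} \le \mathrm{const}\; |\!|\!|d|\!|\!|_\infty\, |c|_{1,\Sigma}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{\mathrm{ch},\Sigma} .
\]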
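Lemma D.5. Let $f_1, f_2 \in \check F_{4;\Sigma}$ be momentum conserving functions. Also let $u, v, u', v' \in F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms obey $|\check u(k)|, |\check v(k)|, |\check u'(k)|, |\check v'(k)| \le \frac12 |\imath k_0 - e(k)|$. Let $X \in N_3$ and $0 < \varepsilon \le 1$ such that
\[
|u|_{1,\Sigma},\ |u'|_{1,\Sigma} \le \frac{1}{2M^j}\, X, \qquad |u-u'|_{1,\Sigma} \le \frac{\varepsilon}{M^j}\, X
\]
and
\[
|\check u(k) - \check u'(k)|,\ |\check v(k) - \check v'(k)| \le \varepsilon\, |\imath k_0 - e(k)| .
\]
Set
\[
\begin{aligned}
C(k) &= \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u(k)}, & D(k) &= \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v(k)},\\
C'(k) &= \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u'(k)}, & D'(k) &= \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v'(k)}
\end{aligned}
\]
and let $C(\xi,\xi')$, $D(\xi,\xi')$, $C'(\xi,\xi')$, $D'(\xi,\xi')$ be their Fourier transforms as in Definition IX.3. Assume that $X_0 \le \min\{\tau_1,\tau_2\}$, where $\tau_1$ and $\tau_2$ are the constants of Proposition XIII.5 and Lemma XIII.6, respectively.
(i)
\[
\begin{aligned}
|f_1\bullet\mathcal{C}(C,D)\bullet f_2|^{\tilde{}}_{1,\Sigma} &\le \mathrm{const}\; l\, \frac{c_j}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}\\
|f_1\bullet\mathcal{C}(C,D)\bullet f_2|^{\tilde{}}_{\mathrm{ch},\Sigma} &\le \mathrm{const}\; l\, \frac{c_j}{1-X}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{\mathrm{ch},\Sigma}\\
|f_1\bullet\mathcal{C}(C,D)\bullet f_2|^{\tilde{}}_{3,\Sigma} &\le \mathrm{const}\; l\, \frac{c_j}{1-X}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{3,\Sigma} .
\end{aligned}
\]
(ii)
\[
\begin{aligned}
|f_1\bullet\mathcal{C}(C,D)\bullet f_2 - f_1\bullet\mathcal{C}(C',D')\bullet f_2|^{\tilde{}}_{1,\Sigma} &\le \mathrm{const}\;\varepsilon\, l\, \frac{c_j(1+X)}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}\\
|f_1\bullet\mathcal{C}(C,D)\bullet f_2 - f_1\bullet\mathcal{C}(C',D')\bullet f_2|^{\tilde{}}_{\mathrm{ch},\Sigma} &\le \mathrm{const}\;\varepsilon\, l\, \frac{c_j(1+X)}{1-X}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{\mathrm{ch},\Sigma}\\
|f_1\bullet\mathcal{C}(C,D)\bullet f_2 - f_1\bullet\mathcal{C}(C',D')\bullet f_2|^{\tilde{}}_{3,\Sigma} &\le \mathrm{const}\;\varepsilon\, l\, \frac{c_j(1+X)}{1-X}\, |f_1|^{\tilde{}}_{\mathrm{ch},\Sigma}\, |f_2|^{\tilde{}}_{3,\Sigma} .
\end{aligned}
\]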
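Proof. Let $c((\cdot,s),(\cdot,s'))$, $d((\cdot,s),(\cdot,s'))$, $c'((\cdot,s),(\cdot,s'))$, $d'((\cdot,s),(\cdot,s'))$ be the Fourier transforms of $\chi_s(k)\,C(k)\,\chi_{s'}(k)$, $\chi_s(k)\,D(k)\,\chi_{s'}(k)$, $\chi_s(k)\,C'(k)\,\chi_{s'}(k)$, $\chi_s(k)\,D'(k)\,\chi_{s'}(k)$ in the sense of Definition IX.3. By Proposition XIII.5(ii) and Lemma XIII.6(i)
\[
|c|_{1,\Sigma} \le \mathrm{const}\, \frac{M^j c_j}{1-X}, \qquad
|c-c'|_{1,\Sigma} \le \mathrm{const}\;\varepsilon\, \frac{M^j c_j\, X}{1-X} .
\tag{D.3}
\]
For all $s, s' \in \Sigma$, the $L^1$-norm of $\chi_s(k)D(k)\chi_{s'}(k)$ is bounded by $\mathrm{const}\, \frac{l}{M^{2j}}\, M^j = \mathrm{const}\, \frac{l}{M^j}$. The same holds for $\chi_s(k)D'(k)\chi_{s'}(k)$. Also, the $L^1$-norm of
\[
\chi_s(k)D(k)\chi_{s'}(k) - \chi_s(k)D'(k)\chi_{s'}(k) = \chi_s(k)\,(v(k)-v'(k))\,D(k)\,D'(k)\,\chi_{s'}(k)
\]
is bounded by $\varepsilon\, \mathrm{const}\, \frac{l}{M^j}$. The same bounds apply when $D$ is replaced by $C$. Consequently
\[
\begin{aligned}
|\!|\!|d|\!|\!|_\infty &\le \mathrm{const}\, \frac{l}{M^j}, & |\!|\!|d'|\!|\!|_\infty &\le \mathrm{const}\, \frac{l}{M^j}, & |\!|\!|d-d'|\!|\!|_\infty &\le \varepsilon\, \mathrm{const}\, \frac{l}{M^j},\\
|\!|\!|c|\!|\!|_\infty &\le \mathrm{const}\, \frac{l}{M^j}, & |\!|\!|c'|\!|\!|_\infty &\le \mathrm{const}\, \frac{l}{M^j}, & |\!|\!|c-c'|\!|\!|_\infty &\le \varepsilon\, \mathrm{const}\, \frac{l}{M^j} .
\end{aligned}
\tag{D.4}
\]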
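Also recall from Lemma XVI.12 that if
\[
\begin{aligned}
c_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} c((\xi,t),(\xi',t')), &
c'_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} c'((\xi,t),(\xi',t')),\\
d_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} d((\xi,t),(\xi',t')), &
d'_\Sigma((\xi,s),(\xi',s')) &= \sum_{\substack{t\cap s\ne\emptyset\\ t'\cap s'\ne\emptyset}} d'((\xi,t),(\xi',t'))
\end{aligned}
\]
then
\[
f_1\bullet(C\otimes D)\bullet f_2 = f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2, \qquad
f_1\bullet(C'\otimes D')\bullet f_2 = f_1\circ(c'_\Sigma\otimes d'_\Sigma)\circ f_2 .
\]
To prove the first inequality in part (i), observe that by Lemma D.4, (D.3) and (D.4)
\[
|f_1\circ(c_\Sigma\otimes d_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma} \le \mathrm{const}\, \frac{M^j c_j}{1-X}\, \frac{l}{M^j}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma} = \mathrm{const}\, \frac{l\, c_j}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma} .
\]
Similarly
\[
\begin{aligned}
|f_1\circ(d_\Sigma\otimes c_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma} &\le \mathrm{const}\, \frac{l\, c_j}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}\\
|f_1\circ(c_\Sigma\otimes c_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma} &\le \mathrm{const}\, \frac{l\, c_j}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma} .
\end{aligned}
\]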
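By Definition XIV.1(iii), $\mathcal{C}(C,D) = C\otimes D + D\otimes C + C\otimes C$. Therefore, the first inequality of part (i) follows. The proof of the other inequalities in part (i) is similar. To prove the first inequality of part (ii), it suffices by Definition XIV.1(iii) to bound each of the quantities
\[
\begin{aligned}
&|f_1\bullet(C\otimes D)\bullet f_2 - f_1\bullet(C'\otimes D')\bullet f_2|^{\tilde{}}_{1,\Sigma}\\
&|f_1\bullet(D\otimes C)\bullet f_2 - f_1\bullet(D'\otimes C')\bullet f_2|^{\tilde{}}_{1,\Sigma}\\
&|f_1\bullet(C\otimes C)\bullet f_2 - f_1\bullet(C'\otimes C')\bullet f_2|^{\tilde{}}_{1,\Sigma}
\end{aligned}
\]
by $\mathrm{const}\;\varepsilon\, l\, \frac{c_j(1+X)}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}$. Again we only bound the first quantity; the other two are similar. As above, by Lemma D.4, (D.3) and (D.4)
\[
\begin{aligned}
|f_1\bullet(C\otimes D)\bullet f_2 - f_1\bullet(C'\otimes D')\bullet f_2|^{\tilde{}}_{1,\Sigma}
&\le |f_1\circ(c_\Sigma\otimes(d_\Sigma-d'_\Sigma))\circ f_2|^{\tilde{}}_{1,\Sigma} + |f_1\circ((c_\Sigma-c'_\Sigma)\otimes d'_\Sigma)\circ f_2|^{\tilde{}}_{1,\Sigma}\\
&\le \mathrm{const}\, \Big(\frac{M^j c_j}{1-X}\, \varepsilon\, \frac{l}{M^j} + \varepsilon\, \frac{M^j c_j\, X}{1-X}\, \frac{l}{M^j}\Big)\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma}\\
&\le \mathrm{const}\;\varepsilon\, l\, \frac{c_j(1+X)}{1-X}\, |f_1|^{\tilde{}}_{1,\Sigma}\, |f_2|^{\tilde{}}_{1,\Sigma} .
\end{aligned}
\]
The proof of the other inequalities in part (ii) of the lemma is similar.

Corollary D.6. Let $f \in \check F_{4;\Sigma}$ and let $C$, $D$, $C'$, $D'$ be as in Lemma D.5. Then
(i)
\[
\begin{aligned}
|L_\ell(f;C,D)|^{\tilde{}}_{1,\Sigma} &\le \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{1,\Sigma}\big)^{\ell+1}\\
|L_\ell(f;C,D)|^{\tilde{}}_{\mathrm{ch},\Sigma} &\le \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{\mathrm{ch},\Sigma}\big)^{\ell+1} .
\end{aligned}
\]
(ii)
\[
\begin{aligned}
|L_\ell(f;C,D) - L_\ell(f;C',D')|^{\tilde{}}_{1,\Sigma} &\le \varepsilon(1+X)\, \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{1,\Sigma}\big)^{\ell+1}\\
|L_\ell(f;C,D) - L_\ell(f;C',D')|^{\tilde{}}_{\mathrm{ch},\Sigma} &\le \varepsilon(1+X)\, \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{\mathrm{ch},\Sigma}\big)^{\ell+1} .
\end{aligned}
\]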
Proof. Part (i) follows by induction on $\ell$ from the first two inequalities of Lemma D.5(i) using
\[
L_\ell(f;C,D) = L_{\ell-1}(f;C,D)\bullet\mathcal{C}(C,D)\bullet f .
\tag{D.5}
\]
To prove part (ii), observe that
\[
\begin{aligned}
L_\ell(f;C,D) - L_\ell(f;C',D') &= [L_{\ell-1}(f;C,D) - L_{\ell-1}(f;C',D')]\bullet\mathcal{C}(C,D)\bullet f\\
&\quad + L_{\ell-1}(f;C',D')\bullet[\mathcal{C}(C,D) - \mathcal{C}(C',D')]\bullet f
\end{aligned}
\tag{D.6}
\]
and again apply induction on $\ell$, using part (i) and the first two inequalities of Lemma D.5(ii).

Proposition D.7. Let $f \in \check F_{4;\Sigma}$. Also let $u, v, u', v' \in F_0(2;\Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms obey $|\check u(k)|, |\check v(k)|, |\check u'(k)|, |\check v'(k)| \le \frac12 |\imath k_0 - e(k)|$. Let $X \in N_3$ and $0 < \varepsilon \le 1$ such that
\[
|u|_{1,\Sigma},\ |u'|_{1,\Sigma} \le \frac{1}{2M^j}\, X, \qquad |u-u'|_{1,\Sigma} \le \frac{\varepsilon}{M^j}\, X
\]
and
\[
|\check u(k) - \check u'(k)|,\ |\check v(k) - \check v'(k)| \le \varepsilon\, |\imath k_0 - e(k)| .
\]
Set
\[
\begin{aligned}
C(k) &= \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u(k)}, & D(k) &= \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v(k)},\\
C'(k) &= \frac{\nu^{(j)}(k)}{\imath k_0 - e(k) - \check u'(k)}, & D'(k) &= \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(k) - \check v'(k)}
\end{aligned}
\]
and let $C(\xi,\xi')$, $D(\xi,\xi')$, $C'(\xi,\xi')$, $D'(\xi,\xi')$ be their Fourier transforms as in Definition IX.3. Assume that $X_0 \le \min\{\tau_1,\tau_2\}$, where $\tau_1$ and $\tau_2$ are the constants of Proposition XIII.5 and Lemma XIII.6, respectively. Then for all $\ell \ge 1$
(i)
\[
\begin{aligned}
|L_\ell(f;C,D)|^{\tilde{}}_{1,\Sigma} &\le \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{1,\Sigma}\big)^{\ell+1}\\
|L_\ell(f;C,D)|^{\tilde{}}_{3,\Sigma} &\le \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{\mathrm{ch},\Sigma}\big)^{\ell}\, |f|^{\tilde{}}_{3,\Sigma} .
\end{aligned}
\]
(ii)
\[
|L_\ell(f;C,D) - L_\ell(f;C',D')|^{\tilde{}}_{3,\Sigma} \le \varepsilon(1+X)\, \Big(\mathrm{const}\, \frac{c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{3,\Sigma}\big)^{\ell+1} .
\]
Proof. The first inequality of part (i) was already stated in Corollary D.6(i). By (D.5), the second inequality of part (i) follows from Corollary D.6(i) and the third inequality of Lemma D.5(i). With an argument as above, using Lemma D.5 and Corollary D.6, one deduces from (D.6) that
\[
|L_\ell(f;C,D) - L_\ell(f;C',D')|^{\tilde{}}_{3,\Sigma} \le \varepsilon(1+X)\, \Big(\mathrm{const}\, \frac{l\, c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{\mathrm{ch},\Sigma}\big)^{\ell}\, |f|^{\tilde{}}_{3,\Sigma} .
\]
The claim now follows from Corollary D.3.

Remark D.8. Using Corollary D.3, one also sees that in the situation of Proposition D.7,
\[
|L_\ell(f;C,D)|^{\tilde{}}_{3,\Sigma} \le \Big(\mathrm{const}\, \frac{c_j}{1-X}\Big)^{\ell}\, \big(|f|^{\tilde{}}_{3,\Sigma}\big)^{\ell+1} .
\]
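The induction behind Corollary D.6(ii) and Proposition D.7(ii) is driven by the purely algebraic telescoping identity (D.6). The following toy model is a sketch under stated assumptions, not the construction of the paper: kernels are replaced by 2×2 integer matrices, the ladder convolution $\bullet$ by matrix multiplication, `K` and `Kp` stand in for $\mathcal{C}(C,D)$ and $\mathcal{C}(C',D')$, and the convention $L_0 = f$ is assumed so that (D.5) generates $L_\ell$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def ladder(f, K, ell):
    """Toy ladder: L_0 = f (assumed convention), L_ell = L_{ell-1} K f as in (D.5)."""
    L = f
    for _ in range(ell):
        L = matmul(matmul(L, K), f)
    return L

def telescoping_holds(f, K, Kp, ell):
    """Check the analogue of (D.6) for rung number ell >= 1."""
    lhs = matsub(ladder(f, K, ell), ladder(f, Kp, ell))
    prev_diff = matsub(ladder(f, K, ell - 1), ladder(f, Kp, ell - 1))
    first = matmul(matmul(prev_diff, K), f)
    second = matmul(matmul(ladder(f, Kp, ell - 1), matsub(K, Kp)), f)
    return lhs == matadd(first, second)

# Hypothetical, non-commuting sample data.
f = [[1, 2], [3, 4]]
K = [[0, 1], [1, 1]]
Kp = [[1, 0], [2, 1]]
checks = [telescoping_holds(f, K, Kp, ell) for ell in range(1, 7)]
print(all(checks))
```

Because the identity only uses bilinearity and associativity, it holds for non-commuting factors; in the text each of the two terms is then estimated separately, the first by the induction hypothesis and Lemma D.5(i), the second by part (i) and Lemma D.5(ii).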
Notation Norms Norm
Characteristics
k| · k|1,∞ k · k1,∞ k · kˇ∞ k| · k|∞ k · kˇ1 k · kˇ∞,B k · kˇ1,B
no derivatives, external positions, acts on functions derivatives, external positions, acts on functions derivatives, external momenta, acts on functions no derivatives, external positions, acts on functions derivatives, external momenta, acts on functions derivatives, external momenta, B ⊂ R × Rd derivatives, external momenta, B ⊂ R × Rd
Example II.6 Example II.6 Definition IV.6 Example III.4 Definition IV.6 Definition IV.6 Definition IV.6
ρm;n k · k1,∞ X 1 αn bn kWm,n k c b2 m,n≥0
Lemma V.1
k·k N (W; c, b, α)
N0 (W; β; X, ρ)
e0 (X)
X
Reference
Definition III.9 Theorem V.2
β n ρm;n kWm,n k1,∞
Theorem VIII.6
m+n∈2N
k · k L1 k · ke
N0∼ (W; β; X, ρ)
derivatives, acts on functions on R × Rd derivatives, external momenta, acts on functions X ∼ e β m+n ρm;n kWm,n | e0 (X)
before Lemma IX.6 Definition X.4
like ρm;n k · ke but acts on V˜ ⊗n
Theorem X.12
before Lemma X.11
m+n∈2N
| · |e
N ∼ (W; c, b, α) | · |p,Σ
|ϕ|Σ
Nj (w; α; X, Σ, ρ) | · |ep,Σ
| · |ep,Σ,ρ |f |eΣ Nj∼ (w; α; X, Σ, ρ) | · |ech,Σ
| · |ch,Σ
1 X m+n m+n ∼ e | c α b |Wm,n b2 m,n
Theorem X.12
derivatives, external positions, all but p sectors summed 1 1 |ϕ|1,Σ + |ϕ|3,Σ + 2 |ϕ|5,Σ if m = 0 l l ρm;n l |ϕ|1,Σ if m 6= 0 M 2j n/2 2j X M lB ej (X) αn |wm,n |Σ j l M m,n≥0
Definition XII.9
derivatives, external momenta, all but p sectors summed
Definition XVI.4
weighted variant of | · |ep,Σ 1 1 |f |e + |f |e3,Σ + 2 |f |e5,Σ 1,Σ l l 6 ρm;n X 1 |f |ep,Σ [(p−1)/2] l p=1
Definition XVII.1(i)
X M 2j l B n/2 ej (X) αn |fn |eΣ j l M n≥0
channel variant of | · |e2,Σ for ladders
channel variant of | · |2,Σ for ladders
Definition XV.1
Definition XV.1
if m = 0 Definition XVII.1(ii) if m 6= 0
Definition XVII.1(iii) Definition D.1 Definition D.1
Other notation Notation
Description Z 1 ΩS (W)(φ, ψ) log eW(φ,ψ+ζ) dµS (ζ) Z J particle/hole swap operator Z ˜ C (W)(φ, ψ) log 1 Ω eφJ ζ eW(φ,ψ+ζ) dµC (ζ) Z r0 number of k0 derivatives tracked r number of k derivatives tracked M scale parameter, M > 1 const generic constant, independent of scale const generic constant, independent of scale and M ν (j) (k) jth scale function ν˜(j) (k) jth extended scale function
Reference before (I.6) (VI.1) Definition VII.1 Sec. VI Sec. VI before Definition VIII.1
Definition VIII.1 Definition VIII.4(i)
ν (≥j) (k)
ϕ(M 2j−1 (k02 + e(k)2 ))
Definition VIII.1
ν˜(≥j) (k)
ϕ(M 2j−2 (k02 + e(k)2 ))
Definition VIII.4(ii)
ν¯(≥j) (k)
ϕ(M 2j−3 (k02 + e(k)2 ))
Definition VIII.4(iii)
length of sectors sectorization 1/m Z ψ(ξ1 ) · · · ψ(ξm )dµC (ψ) sup sup m
Definition XII.1 Definition XII.1
l Σ S(C)
Definition IV.1
ξ1 ,...,ξm ∈B
B cj
j-independent constant =
X
M j|δ| tδ +
∗ ◦ • fˇ u ˇ f∼ χ ˆ B Bˇ Bˇm XΣ Fm (n) Fˇm (n) Fm (n; Σ) Fˇm (n; Σ) Fˇn;Σ
∞tδ ∈ Nd+1
Definitions XV.1, XVII.1 Definition XII.2
|δ|>r or |δ0 |>r0
|δ|≤r |δ0 |≤r0
ej (X)
X
cj 1 − MjX convolution ladder convolution ladder convolution
=
Definition XV.1(ii)
Fourier transform Fourier transform for sectorized u partial Fourier transform Fourier transform R × Rd × {↑, ↓} × {0, 1} viewed as position space
before (XIII.6) Definition XIV.1(iv) Definitions XIV.3, XVI.9 Definition IX.1(i) Definition XII(iv) Definition IX.1(ii) Definition IX.4 beginning of Sec. II
R × Rd × {↑, ↓} × {0, 1} viewed as momentum space {(ˇ η1 , . . . , ηˇm ) ∈ Bˇm |ˇ η1 + · · · + ηˇm = 0} Bˇ ∪· (B × Σ) functions on functions on functions on in sectors functions on in sectors functions on
beginning of Sec. IX before Definition X.1
Definition XVI.1 B m × B n , antisymmetric in B m arguments Definition II.9 Bˇm × B n , antisymmetric in Bˇm arguments Definition X.8 B m × (B × Σ)n , internal momenta Definition XII(ii) Bˇm × (B × Σ)n , internal momenta
Definition XVI.7(i)
ˇ Xn Σ that reorder to Fm (n − m; Σ)’s
Definition XVI.7(iii)
References
[1] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in Fermionic models, Part 1: nonperturbative bounds, to appear in Commun. Math. Phys.
[2] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 1: overview, to appear in Commun. Math. Phys.
[3] J. Feldman, H. Knörrer and E. Trubowitz, Convergence of perturbation expansions in Fermionic models, Part 2: overlapping loops, preprint.
[4] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 2: convergence, to appear in Commun. Math. Phys.
[5] J. Feldman, H. Knörrer and E. Trubowitz, Particle–hole ladders, preprint.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many Fermion systems, Part 1: insulators, Rev. Math. Phys. 15 (2003) 949–993.
December 15, 2003 16:52 WSPC/148-RMP
00180
Reviews in Mathematical Physics Vol. 15, No. 9 (2003) 1121–1169 c World Scientific Publishing Company
SINGLE SCALE ANALYSIS OF MANY FERMION SYSTEMS PART 4: SECTOR COUNTING
JOEL FELDMAN∗
Department of Mathematics, University of British Columbia
Vancouver, B.C., Canada V6T 1Z2
[email protected]
http://www.math.ubc.ca/∼ feldman/

HORST KNÖRRER† and EUGENE TRUBOWITZ‡
Mathematik, ETH-Zentrum, CH-8092 Zürich, Switzerland
†[email protected]
‡[email protected]
†http://www.math.ethz.ch/∼ knoerrer/
For a two-dimensional, weakly coupled system of fermions at temperature zero, one principal ingredient used to control the composition of the associated renormalization group maps is the careful counting of the number of quartets of sectors that are consistent with conservation of momentum. A similar counting argument is made to show that particle–particle ladders are irrelevant in the case of an asymmetric Fermi curve. Keywords: Fermi liquid; renormalization; fermionic functional integral; Fermi surface.
Contents
XVIII. Introduction to Part 4
XIX. Comparison of Norms
XX. Sums of Momenta and ε-Separated Sets
    ε-separated sets
    Sums of momenta
    Pairs of momenta
    Sectors that are compatible with conservation of momentum
XXI. Sectors Compatible with Conservation of Momentum
    Comparison of the 1-norm and the 3-norm for four-legged kernels
    Auxiliary norms
    Change of sectorization
XXII. Sector Counting for Particle–Particle Ladders
Appendices
    E. Sectors for k0 independent functions
Notation
References

∗ Research supported in part by the Natural Sciences and Engineering Research Council of Canada and the Forschungsinstitut für Mathematik, ETH Zürich.
XVIII. Introduction to Part 4

In the application of the results of Parts 1 through 3 to many fermion systems ([1–3]) the effective potential and all the quantities derived from it will conserve particle number. Particle number conservation implies that sectorized functions $\varphi((\cdot,s_1),\dots,(\cdot,s_n)) \in \mathcal{F}_0(n;\Sigma)$, where $\Sigma$ is a sectorization, vanish unless the configuration $s_1,\dots,s_n$ of sectors is consistent with conservation of momentum (for a more precise statement see Definition XX.1 and Remark XX.2). We shall count the number of configurations $s_1,\dots,s_n$ of sectors consistent with conservation of momentum that satisfy certain constraints. The results are used to compare different norms for four-point functions (Proposition XIX.1), and to compare norms associated to different sectorizations at different scales (Proposition XIX.4). The latter is crucial for a multi scale analysis of many fermion systems ([1–3]). Notation tables are provided at the end of the paper.

We retain the assumptions that the dispersion relation $e(k)$ is $r + d + 1$ times differentiable, with $r \ge 2$ and $d = 2$, and that its gradient does not vanish on the Fermi curve $F = \{k \in \mathbb{R}^d \mid e(k) = 0\}$. All the above results hold under additional assumptions on the geometry of the Fermi curve $F$. First of all, we assume throughout the rest of the paper that the Fermi curve $F$ is strictly convex, with curvature bounded away from zero. If the dispersion relation $e(k)$ is that of a background electric field alone then $e(k) = e(-k)$ and the Fermi curve $F$ is symmetric about the origin. That is, $k \in F$ if and only if $-k \in F$.

Definition XVIII.1. (i) Since $F$ is strictly convex, for each point $k \in F$ there is a unique point $a(k) \in F$ different from $k$ such that the tangent lines to $F$ at $k$ and $a(k)$ are parallel. $a(k)$ is called the antipode of $k$. (ii) We say that $F$ is symmetric about a point $p \in \mathbb{R}^2$ if $F = \{2p - k \mid k \in F\}$.

Example XVIII.2. If $F$ is symmetric about a point $p$ then $a(k) = 2p - k$ for all $k \in F$.

Symmetry of the Fermi curve about a point allows for the formation of Cooper pairs and the phase transition to a superconducting state. In [1–3] we show that this is the only instability in a broad class of short range many fermion models. We now make a precise asymmetry assumption on the geometry of the Fermi surface.

Definition XVIII.3. Choose an orientation for $F$. (i) Let $k \in F$, $\vec t$ the oriented unit tangent vector to $F$ at $k$ and $\vec n$ the inward pointing unit normal vector to $F$ at $k$. Then there is a function $\varphi_k(s)$, defined on a neighborhood of $0$ in $\mathbb{R}$, such that $s \mapsto k + s\vec t + \varphi_k(s)\vec n$ is an oriented parametrization of $F$ near $k$.
December 15, 2003 16:52 WSPC/148-RMP
00180
Single Scale Analysis of Many Fermion Systems — Part 4
1123
(ii) We say that $F$ is strongly asymmetric if there is $n_0 \in \mathbb{N}$, with $n_0 \le r$, such that for each $k \in F$ there exists an $n \le n_0$ such that
$$\varphi_k^{(n)}(0) \ne \varphi_{a(k)}^{(n)}(0).$$
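The notions of antipode and of symmetry about a point can be checked numerically on a simple curve. The following is a minimal sketch, assuming an ellipse as a stand-in for the Fermi curve (the ellipse, its semi-axes and all function names are illustrative choices, not taken from the paper). It checks that the point reflection $2p - k$ of a point $k$ lands on the curve at a point with parallel tangent, and that the curvature, which is the second derivative of the local graph function, agrees at both points, so that such a curve cannot satisfy the asymmetry condition with $n = 2$.

```python
import math

def ellipse_point(center, a, b, theta):
    """Point on the ellipse with semi-axes a, b centred at `center` (hypothetical Fermi curve)."""
    return (center[0] + a * math.cos(theta), center[1] + b * math.sin(theta))

def unit_tangent(a, b, theta):
    """Oriented unit tangent vector of the ellipse at parameter theta."""
    tx, ty = -a * math.sin(theta), b * math.cos(theta)
    norm = math.hypot(tx, ty)
    return (tx / norm, ty / norm)

def curvature(a, b, theta):
    """Curvature of the ellipse x = a cos(theta), y = b sin(theta)."""
    return a * b / (a * a * math.sin(theta) ** 2 + b * b * math.cos(theta) ** 2) ** 1.5

center, a, b, theta = (0.3, -0.2), 2.0, 1.0, 0.7
k = ellipse_point(center, a, b, theta)

# Point reflection 2p - k about the centre p of the ellipse.
reflected = (2 * center[0] - k[0], 2 * center[1] - k[1])
# The same point, obtained from the parametrization at theta + pi.
antipode = ellipse_point(center, a, b, theta + math.pi)

# The tangents at k and at its antipode are parallel: their cross product vanishes.
t_k = unit_tangent(a, b, theta)
t_a = unit_tangent(a, b, theta + math.pi)
cross = t_k[0] * t_a[1] - t_k[1] * t_a[0]

# The curvatures at k and a(k) coincide for this symmetric curve.
curvature_gap = abs(curvature(a, b, theta) - curvature(a, b, theta + math.pi))
print(reflected, antipode, cross, curvature_gap)
```

A strongly asymmetric curve would need some derivative of order at most $n_0$ to differ at $k$ and $a(k)$; the ellipse is only meant to illustrate the symmetric alternative.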
Remark XVIII.4. (i) By construction, $\varphi_k(0) = \dot\varphi_k(0) = 0$ and $\ddot\varphi_k(0)$ is the curvature of $F$ at $k$. (ii) If $F$ is symmetric under inversion in some point $p \in \mathbb{R}^2$, then $\varphi_k = \varphi_{a(k)}$ for all $k \in F$. (iii) In [4] we show that independent electrons in a suitably chosen periodic electromagnetic background field have a dispersion relation whose associated Fermi curve, for suitably chosen chemical potential, is smooth, strictly convex, strongly asymmetric and has nonzero curvature everywhere. (iv) In [1–3] we show that a many fermion system with a strongly asymmetric Fermi surface and weak, short range interaction is a Fermi liquid.

Throughout the rest of the paper we assume, unless otherwise stated, that the Fermi surface is strictly convex and either symmetric about a point or strongly asymmetric in the sense of Definition XVIII.3. In Sec. XXII, we derive a sector counting result that holds only for strongly asymmetric Fermi curves and use it to get an estimate on particle–particle bubbles that is better than the logarithmic divergence that, in the case of a symmetric Fermi surface, is responsible for the Cooper instability. We emphasize that for the sector counting arguments of Sec. XX, the fact that the model is in two space dimensions is crucial. Propositions XX.10 and XX.11 would not hold in a three dimensional situation. See [1, Sec. II, Subsec. 8].

XIX. Comparison of Norms

Theorem XV.3 indicates that ladders give the dominant contributions to $w_{0,4}$. The $|\cdot|_{3,\Sigma}$ norm of ladders will be estimated in Sec. XXII and [5]. To control the $N(w;\cdots)$ norms of $w$, we develop a bound on the $|\cdot|_{1,\Sigma}$ norm of a ladder in terms of its $|\cdot|_{3,\Sigma}$ norm.

Proposition XIX.1. Let $\Sigma$ be a sectorization of length $\frac{1}{M^{2j/3}} \le l \le \frac{1}{M^{j/2}}$ at scale $j \ge 4$. Furthermore let $\varphi \in \mathcal{F}_0(4,\Sigma)$ and $f \in \check{\mathcal{F}}_{4;\Sigma}$ be particle number conserving functions. Then
$$|\varphi|_{1,\Sigma} \le \mathrm{const}\,\frac{1}{l}\,|\varphi|_{3,\Sigma} \qquad\text{and}\qquad |f|^{\sim}_{1,\Sigma} \le \mathrm{const}\,\frac{1}{l}\,|f|^{\sim}_{3,\Sigma}$$
with a constant const that is independent of $M$, $j$, $\Sigma$.
This proposition is proven after Lemma XXI.1. In the renormalization group analysis, we go from scale to scale. After integrating out scale j, we shall have an effective potential W with a representative w, sectorized at scale j; and we will have an estimate on the norm of w. To apply Theorem XV.3 at scale j + 1 we then need a representative for W that is sectorized at scale j + 1 and estimates on it. This change of sectorizations is implemented by
Definition XIX.2. Let $j, i \ge 2$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively. If $i \ne j$, define, for functions $\varphi$ on $\mathcal{B}^m \times (\mathcal{B}\times\Sigma')^n$ and $f$ on $\check{\mathcal{B}}^m \times (\mathcal{B}\times\Sigma')^n$,
$$\varphi_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n)) = \sum_{s'_1,\dots,s'_n\in\Sigma'} \int d\xi'_1\cdots d\xi'_n\; \varphi(\eta_1,\dots,\eta_m;(\xi'_1,s'_1),\dots,(\xi'_n,s'_n)) \prod_{\ell=1}^{n} \hat\chi_{s_\ell}(\xi'_\ell,\xi_\ell)$$
$$f_\Sigma(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n)) = \sum_{s'_1,\dots,s'_n\in\Sigma'} \int d\xi'_1\cdots d\xi'_n\; f(\check\eta_1,\dots,\check\eta_m;(\xi'_1,s'_1),\dots,(\xi'_n,s'_n)) \prod_{\ell=1}^{n} \hat\chi_{s_\ell}(\xi'_\ell,\xi_\ell)$$
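where $\chi_s$, $s \in \Sigma$, is the partition of unity of Lemma XII.3 and (XIII.2). If $\varphi$ is translation invariant and antisymmetric under permutation of its $\eta$ arguments, then $\varphi_\Sigma \in \mathcal{F}_m(n;\Sigma)$. For $i = j$ and $\Sigma' = \Sigma$, define $\varphi_\Sigma = \varphi$ and $f_\Sigma = f$.

Remark XIX.3. (i) If $u \in \mathcal{F}_0(2;\Sigma')$ is an antisymmetric, spin independent and particle number conserving function then $\check u_\Sigma(k) = \check u(k)\,\big(\tilde\nu^{(\ge j)}(k)\big)^2$. (ii) For a function $\varphi$ on $\mathcal{B}^m \times (\mathcal{B}\times\Sigma')^n$ one has $(\varphi_\Sigma)^{\sim} = (\varphi^{\sim})_\Sigma$. (iii) Let $j, i_1, i_2 \ge 2$ with $i_2 > i_1$. Let $\Sigma$, $\Sigma_1$ and $\Sigma_2$ be sectorizations at scales $j$, $i_1$ and $i_2$ respectively. Then, for each function $\varphi$ on $\mathcal{B}^m \times (\mathcal{B}\times\Sigma)^n$ and each function $f$ on $\check{\mathcal{B}}^m \times (\mathcal{B}\times\Sigma)^n$,
$$(\varphi_{\Sigma_1})_{\Sigma_2} = \varphi_{\Sigma_2} \qquad\text{and}\qquad (f_{\Sigma_1})_{\Sigma_2} = f_{\Sigma_2}.$$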
Proposition XIX.4. Let $j > i \ge 2$, $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$ and $\frac{1}{M^{i-3/2}} \le l' \le \frac{1}{M^{(i-1)/2}}$ with $4l < l'$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively. Let $\varphi \in \mathcal{F}_m(n;\Sigma')$ and $f \in \check{\mathcal{F}}_m(n;\Sigma')$ be particle number conserving functions.

(i) If $m \ne 0$,
$$|\varphi_\Sigma|_{1,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n} |\varphi|_{1,\Sigma'}.$$

(ii) If $f$ is antisymmetric in its $(\xi,s)$ arguments, then for all $p$
$$|f_\Sigma|^{\sim}_{p,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n+m-p-1} |f|^{\sim}_{p,\Sigma'}.$$
Moreover, if $l \ge \frac{1}{M^{\frac{2}{3}(j-1)}}$, $l' \le \frac{1}{6}\sqrt{l}$ and $n \ge 3$,
$$|f_\Sigma|^{\sim}_{1,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n+m-3} \Big[\, |f|^{\sim}_{1,\Sigma'} + \frac{1}{l'}\,|f|^{\sim}_{3,\Sigma'} \Big].$$

(iii) If $f$ is antisymmetric in its $(\xi,s)$ arguments, then for all $p$
$$|f_{\Sigma'}|^{\sim}_{p,\Sigma'} \le \mathrm{const}^n\, c_{i-1} \Big[\frac{l'}{l}\Big]^{p-m} |f|^{\sim}_{p,\Sigma}.$$

Here const is a constant that is independent of $M$, $j$, $\Sigma$.
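This proposition is proved after Lemma XXI.4.

Remark XIX.5. Since for $m = 0$ the norms $|\varphi|_{p,\Sigma}$ and $|\varphi|^{\sim}_{p,\Sigma}$ agree, Proposition XIX.4(ii) implies that, in the case that $l \ge \frac{1}{M^{\frac{2}{3}(j-1)}}$ and $4l < l' < \frac{1}{6}\sqrt{l}$, for antisymmetric $\varphi \in \mathcal{F}_0(n;\Sigma')$
$$|\varphi_\Sigma|_{1,\Sigma} \le \mathrm{const}\, c_{j-1}\, |\varphi|_{1,\Sigma'} \qquad \text{if } n = 2$$
$$|\varphi_\Sigma|_{3,\Sigma} \le \mathrm{const}\, c_{j-1}\, |\varphi|_{3,\Sigma'} \qquad \text{if } n = 4$$
$$|\varphi_\Sigma|_{1,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n-3} \Big[\, |\varphi|_{1,\Sigma'} + \frac{1}{l'}\,|\varphi|_{3,\Sigma'} \Big] \qquad \text{if } n \ge 4.$$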
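The resectorization of functions on $\mathfrak{X}_\Sigma^n = (\check{\mathcal{B}} \,\dot\cup\, (\mathcal{B}\times\Sigma))^n$ is defined just as in Definition XIX.2. To be precise, recall from Remark XVI.3 and Definition XVI.2(iii) that
$$\mathfrak{X}_\Sigma^n = \dot{\bigcup_{i_1,\dots,i_n\in\{0,1\}}} \mathfrak{X}_{i_1}(\Sigma)\times\cdots\times\mathfrak{X}_{i_n}(\Sigma)$$
where $\mathfrak{X}_0(\Sigma) = \check{\mathcal{B}}$ and $\mathfrak{X}_1(\Sigma) = \mathcal{B}\times\Sigma$. Furthermore, for each $\vec\imath = (i_1,\dots,i_n) \in \{0,1\}^n$, the map Ord gives a bijection between functions on $\mathfrak{X}_{i_1}(\Sigma)\times\cdots\times\mathfrak{X}_{i_n}(\Sigma)$ and functions on $\check{\mathcal{B}}^{m(\vec\imath)} \times (\mathcal{B}\times\Sigma)^{n-m(\vec\imath)}$, where $m(\vec\imath) = n - i_1 - \cdots - i_n$.

Definition XIX.6. Let $j, i \ge 2$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively. (i) Let $\vec\imath = (i_1,\dots,i_n) \in \{0,1\}^n$ and $f$ a function on $\mathfrak{X}_{i_1}(\Sigma')\times\cdots\times\mathfrak{X}_{i_n}(\Sigma')$. Then $f_\Sigma$ is the function on $\mathfrak{X}_{i_1}(\Sigma)\times\cdots\times\mathfrak{X}_{i_n}(\Sigma)$ determined by $\mathrm{Ord}(f_\Sigma) = (\mathrm{Ord}\, f)_\Sigma$. (ii) If $f$ is a function on $\mathfrak{X}_{\Sigma'}^n$, its resectorization $f_\Sigma$ is the function on $\mathfrak{X}_\Sigma^n$ determined by
$$f_\Sigma|_{\vec\imath} = (f|_{\vec\imath})_\Sigma \qquad \text{for all } \vec\imath \in \{0,1\}^n.$$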
From Proposition XIX.4, we have

Corollary XIX.7. Let $j > i \ge 2$, $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$ and $\frac{1}{M^{i-3/2}} \le l' \le \frac{1}{M^{(i-1)/2}}$ with $4l < l'$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively. Let $f \in \check{\mathcal{F}}_{n;\Sigma'}$ be an antisymmetric particle number conserving function. Then for all $p$
$$|f_\Sigma|^{\sim}_{p,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n-p-1} |f|^{\sim}_{p,\Sigma'}.$$
Moreover, if $l \ge \frac{1}{M^{\frac{2}{3}(j-1)}}$, $l' \le \frac{1}{6}\sqrt{l}$ and $n \ge 4$,
$$|f_\Sigma|^{\sim}_{1,\Sigma} \le \mathrm{const}^n\, c_{j-1} \Big[\frac{l'}{l}\Big]^{n-3} \Big[\, |f|^{\sim}_{1,\Sigma'} + \frac{1}{l'}\,|f|^{\sim}_{3,\Sigma'} \Big].$$
In the renormalization group analysis of [1–3], the numbers $\rho_{0;n}$ used as weights in the norms $N_j$ of Definition XV.1 do not depend on the scale $j$. As pointed out in Remark XV.2, boundedness in $j$ of the norms $N_j$ implies that the coefficient of $t^0$ in $|w_{0,2}|_{1,\Sigma}$ has positive power counting (that is, tends to zero as a power of $\frac{1}{M^j}$) and the coefficient of $t^0$ in $|w_{0,4}|_{3,\Sigma}$ has neutral power counting. The other contributions $w_{m,n}$ behave well with respect to resectorization.

Corollary XIX.8. Fix $\frac12 < \aleph < \frac23$ and let $j \ge \frac{3}{2-3\aleph}$. Let $\Sigma_{j+1}$ and $\Sigma_j$ be sectorizations of length $l_{j+1} = \frac{1}{M^{\aleph(j+1)}}$ at scale $j+1$ and $l_j = \frac{1}{M^{\aleph j}}$ at scale $j$, respectively. Let $\vec\rho = (\rho_{m;n})$ be a system of positive real numbers obeying (XV.1) and set
$$\rho'_{m;n} = \begin{cases} \rho_{m;n} & \text{if } m = 0 \\[4pt] \sqrt[4]{\dfrac{l_j M^j}{l_{j+1} M^{j+1}}}\;\rho_{m;n} = \dfrac{1}{M^{(1-\aleph)/4}}\,\rho_{m;n} & \text{if } m > 0. \end{cases}$$
Let
$$w(\phi,\psi) = \sum_{\substack{m,n \\ m+n\ \mathrm{even}}}\ \sum_{s_1,\dots,s_n\in\Sigma_{j+1}} \int d\eta_1\cdots d\eta_m\, d\xi_1\cdots d\xi_n\; w_{m,n}(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))\; \phi(\eta_1)\cdots\phi(\eta_m)\,\psi((\xi_1,s_1))\cdots\psi((\xi_n,s_n))$$
with $w_{m,n} \in \mathcal{F}_m(n;\Sigma_j)$, be an even $\Sigma_j$-sectorized particle number conserving Grassmann function with $w_{0,2} = 0$ and $w_{m,0} = 0$ for all $m$. If $M$ is big enough, then
$$N_{j+1}(w_{\Sigma_{j+1}};\, 64\alpha;\, X, \Sigma_{j+1}, \vec\rho\,) \le \mathrm{const}\; e_{j+1}(X)\, N_j\Big(w;\, \frac{\alpha}{2};\, X, \Sigma_j, \vec\rho\,'\Big)$$
with the constant const independent of $M$, $j$, $\Sigma_j$ and $\Sigma_{j+1}$. If, in addition $w_{0,4} = 0$, then
$$N_{j+1}(w_{\Sigma_{j+1}};\, 64\alpha;\, X, \Sigma_{j+1}, \vec\rho\,) \le \frac{1}{M^{(1-\aleph)/8}}\; e_{j+1}(X)\, N_j\Big(w;\, \frac{\alpha}{2};\, X, \Sigma_j, \vec\rho\,'\Big).$$
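The two expressions for the weight $\rho'_{m;n}$ with $m > 0$ agree because $l_j M^j / (l_{j+1} M^{j+1}) = M^{\aleph - 1}$. The snippet below is a small numerical sanity check of that identity, assuming sample values of $M$ and $\aleph$ that are chosen here purely for illustration (the function name is hypothetical as well).

```python
import math

def rho_prime_factor(M, aleph, j):
    """Fourth root of l_j M^j / (l_{j+1} M^{j+1}) with l_j = M^(-aleph * j)."""
    l_j = M ** (-aleph * j)
    l_j1 = M ** (-aleph * (j + 1))
    return ((l_j * M ** j) / (l_j1 * M ** (j + 1))) ** 0.25

# Illustrative values only: any M > 1 and 1/2 < aleph < 2/3 would do.
M, aleph = 3.0, 0.625
j_min = math.ceil(3 / (2 - 3 * aleph))  # smallest scale allowed by j >= 3 / (2 - 3 aleph)

closed_form = M ** (-(1 - aleph) / 4)
factors = [rho_prime_factor(M, aleph, j) for j in range(j_min, j_min + 5)]
print(closed_form, factors)
```

The factor does not depend on $j$, which is why a single scale-independent loss of $M^{-(1-\aleph)/4}$ per external $\phi$ leg sector sum can be absorbed into $\vec\rho\,'$.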
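Proof. We apply Proposition XIX.4 with $j$ replaced by $j+1$, $i = j$, $l = l_{j+1}$ and $l' = l_j$. Observe that the hypotheses of part (ii) are fulfilled. In this proof, use $|\cdot|_{\Sigma,\vec\rho}$ to designate the norm of Definition XV.1 using the indicated $\vec\rho$. If $m, n \ge 1$, by Proposition XIX.4(i),
$$\frac{M^{2(j+1)}}{l_{j+1}}\, e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} |(w_{m,n})_{\Sigma_{j+1}}|_{\Sigma_{j+1},\vec\rho}$$
$$= e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} \rho_{m;n}\, |(w_{m,n})_{\Sigma_{j+1}}|_{1,\Sigma_{j+1}}$$
$$\le \mathrm{const}^n\, c_j\, e_{j+1}(X) \Big[\frac{l_j}{l_{j+1}}\Big]^{n} \Big[2^{14}\,\frac{l_{j+1}}{M\, l_j}\Big]^{n/2} \frac{\rho_{m;n}}{\rho'_{m;n}} \times \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} \rho'_{m;n}\, |w_{m,n}|_{1,\Sigma_j}$$
$$\le \mathrm{const}^n\, e_{j+1}(X)\, c_j \Big[\frac{1}{M^{1-\aleph}}\Big]^{(2n-1)/4} \frac{M^{2j}}{l_j} \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |w_{m,n}|_{\Sigma_j,\vec\rho\,'}$$
$$\le \frac{1}{M^{(1-\aleph)/8}}\, e_{j+1}(X)\, \frac{M^{2j}}{l_j}\, e_j(X) \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |w_{m,n}|_{\Sigma_j,\vec\rho\,'}$$
if $M$ is large enough. If $m = 0$ and $n \ge 4$, by Proposition XIX.4(ii) and Remark XVI.5,
$$\frac{M^{2(j+1)}}{l_{j+1}}\, e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} |(w_{0,n})_{\Sigma_{j+1}}|_{\Sigma_{j+1},\vec\rho}$$
$$= \frac{M^{2(j+1)}}{l_{j+1}}\, e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} \rho_{0;n} \times \Big[\, |(w_{0,n})_{\Sigma_{j+1}}|_{1,\Sigma_{j+1}} + \frac{1}{l_{j+1}}\, |(w_{0,n})_{\Sigma_{j+1}}|_{3,\Sigma_{j+1}} + \frac{1}{l_{j+1}^2}\, |(w_{0,n})_{\Sigma_{j+1}}|_{5,\Sigma_{j+1}} \Big]$$
$$\le \mathrm{const}^n\, c_j\, e_{j+1}(X)\, \frac{M^{2j}}{l_j} \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} \Big[\frac{l_{j+1}}{l_j}\Big]^{(n-2)/2} \frac{1}{M^{(n-4)/2}}\, \rho_{0;n}$$
$$\qquad \times \Big[ \Big[\frac{l_j}{l_{j+1}}\Big]^{n-3} \Big( |w_{0,n}|_{1,\Sigma_j} + \frac{1}{l_j}\, |w_{0,n}|_{3,\Sigma_j} \Big) + \Big[\frac{l_j}{l_{j+1}}\Big]^{n-3} \frac{1}{l_j}\, |w_{0,n}|_{3,\Sigma_j} + \Big[\frac{l_j}{l_{j+1}}\Big]^{n-4} \frac{1}{l_j^2}\, |w_{0,n}|_{5,\Sigma_j} \Big]$$
$$\le \mathrm{const}^n\, c_j\, e_{j+1}(X)\, \frac{M^{2j}}{l_j} \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} \frac{1}{M^{(n-4)/2}} \Big[\frac{l_j}{l_{j+1}}\Big]^{(n-4)/2} \rho'_{0;n} \times \Big[\, |w_{0,n}|_{1,\Sigma_j} + \frac{1}{l_j}\, |w_{0,n}|_{3,\Sigma_j} + \frac{1}{l_j^2}\, |w_{0,n}|_{5,\Sigma_j} \Big]$$
$$= \mathrm{const}^n\, e_{j+1}(X) \Big[\frac{1}{M^{1-\aleph}}\Big]^{(n-4)/2} \frac{M^{2j}}{l_j}\, c_j \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |w_{0,n}|_{\Sigma_j,\vec\rho\,'}$$
$$\le \Big[ \frac{1}{M^{(1-\aleph)/8}} + \mathrm{const}\; \delta_{n,4} \Big]\, e_{j+1}(X)\, \frac{M^{2j}}{l_j}\, e_j(X) \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |w_{0,n}|_{\Sigma_j,\vec\rho\,'}.$$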
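The power counting in this proof rests on collecting powers of $M$, using $l_j / l_{j+1} = M^{\aleph}$. As a hedged sanity check, the sketch below verifies the two exponent identities in exact rational arithmetic: for $m, n \ge 1$ the factors $[l_j/l_{j+1}]^n$, $[l_{j+1}/(M l_j)]^{n/2}$ and $\rho_{m;n}/\rho'_{m;n}$ combine to $M^{-(1-\aleph)(2n-1)/4}$, and for $m = 0$ the prefactor ratio together with $[l_j/l_{j+1}]^{n-3}$ gives $M^{-(1-\aleph)(n-4)/2}$. The function names and the sample value of $\aleph$ are illustrative assumptions.

```python
from fractions import Fraction

def exponent_m_positive(aleph, n):
    """Power of M from (l_j/l_{j+1})^n * (l_{j+1}/(M l_j))^(n/2) * rho/rho'."""
    return aleph * n - (1 + aleph) * Fraction(n, 2) + (1 - aleph) / 4

def exponent_prefactor(aleph, n):
    """Power of M in (M^{2(j+1)}/l_{j+1}) (l_{j+1}/M^{j+1})^(n/2) relative to scale j."""
    return 2 + aleph - (1 + aleph) * Fraction(n, 2)

def exponent_m_zero(aleph, n):
    """Power of M after multiplying the prefactor ratio by (l_j/l_{j+1})^(n-3)."""
    return exponent_prefactor(aleph, n) + aleph * (n - 3)

aleph = Fraction(3, 5)  # illustrative value with 1/2 < aleph < 2/3

ok_positive = all(
    exponent_m_positive(aleph, n) == -(1 - aleph) * Fraction(2 * n - 1, 4)
    for n in range(1, 12)
)
ok_prefactor = all(
    exponent_prefactor(aleph, n) == -aleph * Fraction(n - 2, 2) - Fraction(n - 4, 2)
    for n in range(4, 14, 2)
)
ok_zero = all(
    exponent_m_zero(aleph, n) == -(1 - aleph) * Fraction(n - 4, 2)
    for n in range(4, 14, 2)
)
print(ok_positive, ok_prefactor, ok_zero)
```

For $n = 4$ and $m = 0$ the exponent vanishes, which is the origin of the $\delta_{n,4}$ term without a small factor.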
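The analog of Corollary XIX.8 for the $N_j^{\sim}$ norms is

Corollary XIX.9. Fix $\frac12 < \aleph < \frac23$ and let $j \ge \frac{3}{2-3\aleph}$. Let $\Sigma_{j+1}$ and $\Sigma_j$ be sectorizations of length $l_{j+1} = \frac{1}{M^{\aleph(j+1)}}$ at scale $j+1$ and $l_j = \frac{1}{M^{\aleph j}}$ at scale $j$, respectively. Let $\vec\rho = (\rho_{m;n})$ be a system of positive real numbers obeying (XVII.1). Let
$$w(\phi,\psi) = \sum_{n} \int_{\mathfrak{X}_\Sigma^n} dx_1\cdots dx_n\; f_n(x_1,\dots,x_n)\, \Psi(x_1)\cdots\Psi(x_n)$$
with $f_n \in \check{\mathcal{F}}_{n;\Sigma}$ antisymmetric, be an even $\Sigma_j$-sectorized particle number conserving Grassmann function with $f_2 = 0$. If $M$ is big enough, then
$$N^{\sim}_{j+1}(w_{\Sigma_{j+1}};\, 64\alpha;\, X, \Sigma_{j+1}, \vec\rho\,) \le \mathrm{const}\; e_{j+1}(X)\, N^{\sim}_j\Big(w;\, \frac{\alpha}{2};\, X, \Sigma_j, \vec\rho\,\Big)$$
with the constant const independent of $M$, $j$, $\Sigma_j$ and $\Sigma_{j+1}$. If, in addition $f_4 = 0$, then
$$N^{\sim}_{j+1}(w_{\Sigma_{j+1}};\, 64\alpha;\, X, \Sigma_{j+1}, \vec\rho\,) \le \frac{1}{M^{(1-\aleph)/8}}\; e_{j+1}(X)\, N^{\sim}_j\Big(w;\, \frac{\alpha}{2};\, X, \Sigma_j, \vec\rho\,\Big).$$

Proof. If $n \ge 4$, by Proposition XIX.4(ii) with $j$ replaced by $j+1$, $i = j$, $l = l_{j+1}$ and $l' = l_j$,
$$\frac{M^{2(j+1)}}{l_{j+1}}\, e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} |(f_n)_{\Sigma_{j+1}}|^{\sim}_{\Sigma_{j+1}}$$
$$\le \frac{M^{2(j+1)}}{l_{j+1}}\, e_{j+1}(X)\,(64\alpha)^n \Big[\frac{l_{j+1}B}{M^{j+1}}\Big]^{n/2} \times \Big\{ |(f_n)_{\Sigma_{j+1}}|^{\sim}_{1,\Sigma_{j+1},\vec\rho} + \sum_{p=2}^{6} \frac{1}{l_{j+1}^{[(p-1)/2]}}\, |(f_n)_{\Sigma_{j+1}}|^{\sim}_{p,\Sigma_{j+1},\vec\rho} \Big\}$$
$$\le \mathrm{const}^n\, c_j\, e_{j+1}(X)\, \frac{M^{2j}}{l_j} \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} \Big[\frac{l_{j+1}}{l_j}\Big]^{(n-2)/2} \frac{1}{M^{(n-4)/2}}$$
$$\qquad \times \Big\{ \Big[\frac{l_j}{l_{j+1}}\Big]^{n-3} \Big( |f_n|^{\sim}_{1,\Sigma_j,\vec\rho} + \frac{1}{l_j}\, |f_n|^{\sim}_{3,\Sigma_j,\vec\rho} \Big) + \sum_{p=2}^{6} \frac{1}{l_{j+1}^{[(p-1)/2]}} \Big[\frac{l_j}{l_{j+1}}\Big]^{n-p-1} |f_n|^{\sim}_{p,\Sigma_j,\vec\rho} \Big\}$$
$$\le \mathrm{const}^n\, e_{j+1}(X)\, \frac{M^{2j}}{l_j}\, c_j \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} \frac{1}{M^{(n-4)/2}} \Big[\frac{l_j}{l_{j+1}}\Big]^{(n-4)/2}$$
$$\qquad \times \Big\{ |f_n|^{\sim}_{1,\Sigma_j,\vec\rho} + \frac{1}{l_j}\, |f_n|^{\sim}_{3,\Sigma_j,\vec\rho} + \sum_{p=2}^{6} \Big[\frac{l_{j+1}}{l_j}\Big]^{p-2-[(p-1)/2]} \frac{1}{l_j^{[(p-1)/2]}}\, |f_n|^{\sim}_{p,\Sigma_j,\vec\rho} \Big\}$$
$$\le \mathrm{const}^n\, e_{j+1}(X) \Big[\frac{1}{M^{1-\aleph}}\Big]^{(n-4)/2} \frac{M^{2j}}{l_j}\, c_j \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |f_n|^{\sim}_{\Sigma_j}$$
$$\le \Big[ \frac{1}{M^{(1-\aleph)/8}} + \mathrm{const}\; \delta_{n,4} \Big]\, e_{j+1}(X)\, \frac{M^{2j}}{l_j}\, e_j(X) \Big(\frac{\alpha}{2}\Big)^n \Big[\frac{l_j B}{M^j}\Big]^{n/2} |f_n|^{\sim}_{\Sigma_j}.$$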
The positive power counting of $|w_{0,2}|_{1,\Sigma}$ is achieved by renormalization. That is, we choose the counterterm in such a way that, at each scale, the restriction of the Fourier transform of $w_{0,2}$ to the Fermi surface is small. The following proposition ensures then that $|w_{0,2}|_{1,\Sigma}$ is also small.

Definition XIX.10. The function $u \in \mathcal{F}_0(2;\Sigma)$ is said to vanish at $k_0 = 0$ if
$$\check u\big(((0,\mathbf{k}),\sigma,a,s),\,((0,\mathbf{k}),\sigma',a',s')\big) = 0$$
for all $a, a' \in \{0,1\}$, $\sigma, \sigma' \in \{\uparrow,\downarrow\}$ and $s, s' \in \Sigma$.

Proposition XIX.11. There is a constant const, independent of $M$, such that the following holds: let $j \ge i \ge 2$ and $\Sigma$ and $\Sigma'$ be sectorizations at scale $j$ and $i$, respectively. If $i = j$ assume that $\Sigma = \Sigma'$. Let $u \in \mathcal{F}_0(2;\Sigma')$ be a function that vanishes at $k_0 = 0$. Then

(i)
$$|u_\Sigma|_{1,\Sigma} \le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta.$$

(ii)
$$|D_{1,2}^{(1,0,0)} u_\Sigma|_{1,\Sigma} \le \mathrm{const}\; c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'}.$$
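Proof. (i) Fix $s_1, s_2 \in \Sigma$. If $i < j$, by Lemmas II.7, IX.6(i), XIII.3 and (XIII.4)
$$\|u_\Sigma((\xi_1,s_1),(\xi_2,s_2))\|_{1,\infty} \le \mathrm{const} \max_{s'_1,s'_2\in\Sigma'} \Big\| \int d\eta_1\, d\eta_2\; u((\eta_1,s'_1),(\eta_2,s'_2))\, \hat\chi_{s_1}(\eta_1,\xi_1)\, \hat\chi_{s_2}(\eta_2,\xi_2) \Big\|_{1,\infty}$$
$$\le \mathrm{const}\, \|\hat\chi_{s_2}\|_{1,\infty} \max_{s'_1,s'_2\in\Sigma'} \Big\| \int d\eta_1\; u((\eta_1,s'_1),(\,\cdot\,,s'_2))\, \hat\chi_{s_1}(\eta_1,\,\cdot\,) \Big\|_{1,\infty}$$
$$\le \mathrm{const}\; c_{j-1} \Big\| \frac{\partial\hat\chi_{s_1}}{\partial x_0} \Big\|_{L^1} \max_{s'_1,s'_2\in\Sigma'} \big\| D_{1,2}^{(1,0,0)} u((\,\cdot\,,s'_1),(\,\cdot\,,s'_2)) \big\|_{1,\infty} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta$$
$$\le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}^2\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta$$
$$\le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta.$$
Similarly, if $i = j$ and $\Sigma = \Sigma'$, then setting
$$\chi_s^{(e)}(k) = \varphi\Big( \tfrac12 M^{2j-2}\big(k_0^2 + e(\mathbf{k})^2\big) \Big)\, \Theta_s(\mathbf{k}'(\mathbf{k}))$$
(which just differs by a $\frac12$ from the definition of $\chi_s(k)$ in (XIII.2)), we have, using the support property of Definition XII.4(ii),
$$\|u_\Sigma((\xi_1,s_1),(\xi_2,s_2))\|_{1,\infty} = \|u((\xi_1,s_1),(\xi_2,s_2))\|_{1,\infty}$$
$$= \Big\| \sum_{s'_1,s'_2\in\Sigma} \int d\eta_1\, d\eta_2\; u((\eta_1,s_1),(\eta_2,s_2))\, \hat\chi^{(e)}_{s'_1}(\eta_1,\xi_1)\, \hat\chi^{(e)}_{s'_2}(\eta_2,\xi_2) \Big\|_{1,\infty}$$
$$\le \mathrm{const} \max_{s'_1,s'_2\in\Sigma} \Big\| \int d\eta_1\, d\eta_2\; u((\eta_1,s_1),(\eta_2,s_2))\, \hat\chi^{(e)}_{s'_1}(\eta_1,\xi_1)\, \hat\chi^{(e)}_{s'_2}(\eta_2,\xi_2) \Big\|_{1,\infty}$$
$$\le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta.$$
Since for every $s_1 \in \Sigma$ there are at most three sectors $s_2$ with $\tilde s_1 \cap \tilde s_2 \ne \emptyset$, in both cases
$$|u_\Sigma|_{1,\Sigma} \le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0\ne 0}} \infty\, t^\delta.$$

(ii) If $i = j$ and $\Sigma = \Sigma'$ the statement is trivial. So assume that $i < j$. Fix $s_1, s_2 \in \Sigma$. By Lemma IX.6(ii) (twice), Lemma XIII.3 and (XIII.4)
$$\|D_{1,2}^{(1,0,0)} u_\Sigma((\xi_1,s_1),(\xi_2,s_2))\|_{1,\infty} \le \mathrm{const} \max_{s'_1,s'_2\in\Sigma'} \Big\| D_{1,2}^{(1,0,0)} \int d\eta_1\, d\eta_2\; u((\eta_1,s'_1),(\eta_2,s'_2))\, \hat\chi_{s_1}(\eta_1,\xi_1)\, \hat\chi_{s_2}(\eta_2,\xi_2) \Big\|_{1,\infty}$$
$$\le \mathrm{const} \Big( \|\hat\chi_{s_2}\|_{1,\infty} + \Big\| x_0\, \frac{\partial\hat\chi_{s_2}}{\partial x_0}(x) \Big\|_{L^1} \Big) \times \max_{s'_1,s'_2\in\Sigma'} \Big\| D_{1,2}^{(1,0,0)} \int d\eta_1\; u((\eta_1,s'_1),(\,\cdot\,,s'_2))\, \hat\chi_{s_1}(\eta_1,\,\cdot\,) \Big\|_{1,\infty} + \sum_{\substack{\delta\in\mathbb{N}_0\times\mathbb{N}_0^d \\ \delta_0 > r_0}} \infty\, t^\delta$$
$$\le \mathrm{const}\; c_{j-1} \Big( \|\hat\chi_{s_1}\|_{1,\infty} + \Big\| x_0\, \frac{\partial\hat\chi_{s_1}}{\partial x_0}(x) \Big\|_{L^1} \Big) \times \max_{s'_1,s'_2\in\Sigma'} \big\| D_{1,2}^{(1,0,0)} u((\,\cdot\,,s'_1),(\,\cdot\,,s'_2)) \big\|_{1,\infty}$$
$$\le \mathrm{const}\; c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'}.$$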
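Corollary XIX.12. There is a constant const, independent of $M$, such that the following holds: let $\Sigma$ be a sectorization of scale $j \ge 2$ and $u \in \mathcal{F}_0(2;\Sigma)$ be a function that vanishes at $k_0 = 0$. Let $X \in \mathfrak{N}_{d+1}$. If
$$|D_{1,2}^{(1,0,0)} u|_{1,\Sigma} \le c_{j-1}\, X + \sum_{\delta_0 = r_0} \infty\, t^\delta$$
then
$$|u|_{1,\Sigma} \le \mathrm{const}\, \frac{M}{M^{j}}\, c_{j-1}\, X.$$

Proof. By Proposition XIX.11(i) and (XIII.4)
$$|u|_{1,\Sigma} \le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma} + \sum_{\delta_0\ne 0} \infty\, t^\delta \le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, X + \sum_{\delta_0\ne 0} \infty\, t^\delta.$$
Also
$$|u|_{1,\Sigma} \le t_0\, \frac{\partial}{\partial t_0} |u|_{1,\Sigma} + \sum_{\delta_0 = 0} \infty\, t^\delta = t_0\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma} + \sum_{\delta_0 = 0} \infty\, t^\delta$$
$$\le t_0\, c_{j-1}\, X + \sum_{\delta_0 = r_0+1} \infty\, t^\delta + \sum_{\delta_0 = 0} \infty\, t^\delta \le \frac{1}{M^{j-1}}\, c_{j-1}\, X + \sum_{\delta_0 = 0} \infty\, t^\delta$$
since $t_0\, c_{j-1} \le \frac{1}{M^{j-1}}\, c_{j-1}$. The corollary now follows by taking the minimum of the two estimates on $|u|_{1,\Sigma}$.

Corollary XIX.13. Let $j > i \ge 2$ and $\Sigma$ and $\Sigma'$ be sectorizations at scale $j$ and $i$, respectively. Let $u \in \mathcal{F}_0(2;\Sigma')$ vanish at $k_0 = 0$. Assume that $|u|_{1,\Sigma'} \le \lambda\, c_i$ for some $\lambda > 0$. Then
$$|u_\Sigma|_{1,\Sigma} \le \mathrm{const}\; \lambda\, M\, \frac{M^i}{M^j}\, c_{j-1}.$$

Proof. By hypothesis
$$|D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} \le \mathrm{const}\; \lambda\, M^i\, c_i + \sum_{\delta_0 = r_0} \infty\, t^\delta.$$
Therefore, by Proposition XIX.11(i) and (XIII.4)
$$|u_\Sigma|_{1,\Sigma} \le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\delta_0\ne 0} \infty\, t^\delta \le \mathrm{const}\; \lambda\, \frac{M^i}{M^{j-1}}\, c_{j-1} + \sum_{\delta_0\ne 0} \infty\, t^\delta.$$
Also, by Proposition XIX.11(ii)
$$|u_\Sigma|_{1,\Sigma} \le t_0\, |D_{1,2}^{(1,0,0)} u_\Sigma|_{1,\Sigma} + \sum_{\delta_0 = 0} \infty\, t^\delta \le \mathrm{const}\, (t_0\, c_{j-1})\, |D_{1,2}^{(1,0,0)} u|_{1,\Sigma'} + \sum_{\delta_0 = 0} \infty\, t^\delta$$
$$\le \mathrm{const}\, \frac{1}{M^{j-1}}\, c_{j-1}\, \lambda\, M^i\, c_i + \sum_{\delta_0 = r_0+1} \infty\, t^\delta + \sum_{\delta_0 = 0} \infty\, t^\delta \le \mathrm{const}\; \lambda\, \frac{M^i}{M^{j-1}}\, c_{j-1} + \sum_{\delta_0 = 0} \infty\, t^\delta.$$
Again, the corollary follows by taking the minimum of the two estimates on $|u_\Sigma|_{1,\Sigma}$.

When we start the multi scale analysis in [2], the effective potential after integrating out the first scales does not have a natural sectorized representative (see also Theorem VIII.6). Therefore we need analogs of Definition XIX.2 and Proposition XIX.4 that pass from unsectorized functions to sectorized functions (see also Example XII.5).

Definition XIX.14. Let $\Sigma$ be a sectorization of scale $j \ge 2$. For a function $f$ on $\mathcal{B}^m \times \mathcal{B}^n$ define the function $f_\Sigma$ on $\mathcal{B}^m \times (\mathcal{B}\times\Sigma)^n$ by
$$f_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n)) = \int \prod_{i=1}^{n} \big( d\xi'_i\; \hat\chi_{s_i}(\xi'_i,\xi_i) \big)\, f(\eta_1,\dots,\eta_m;\xi'_1,\dots,\xi'_n)$$
where $\chi_s$ is the partition of unity of Lemma XII.3.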
Proposition XIX.15. Let $\Sigma$ be a sectorization of scale $j \ge 0$ and $f \in \mathcal{F}_m(n)$, $f' \in \check{\mathcal{F}}_m(n)$ particle number conserving functions that are antisymmetric in their $\xi$-variables.

(i) If $m = 0$ and $f$ is translation invariant, then for all $p < n$
$$|f_\Sigma|_{p,\Sigma} \le \mathrm{const}^n\, \frac{1}{l^{\,n-p-1}}\, c_{j-1}\, \|f\|_{1,\infty}.$$

(ii) If $m \ne 0$
$$|f_\Sigma|_{1,\Sigma} \le \Big[\frac{\mathrm{const}}{l}\Big]^{n} c_{j-1}\, \|f\|_{1,\infty}.$$

(iii) If $0 < m \le p \le m+n$
$$|f'_\Sigma|^{\sim}_{p,\Sigma} \le \Big[\frac{\mathrm{const}}{l}\Big]^{m+n-p} c_{j-1}\, \|f'\|^{\sim}.$$
The proof of part (i) of this proposition is analogous to that of Proposition XIX.4, and part (ii) was already proven in Example XII.10. The proof of part (iii) is similar to that of part (ii).

XX. Sums of Momenta and ε-Separated Sets

In the next section we shall exploit conservation of momentum to prove Proposition XIX.1, relating the 1- and 3-norms of a four-legged kernel, and Proposition XIX.4, concerning the behavior of norms under change of sectorization. Conservation of momentum is equivalent to translation invariance in position space. Recall that we assume that the Fermi surface is strictly convex and either symmetric about a point or strongly asymmetric in the sense of Definition XVIII.3.

The following definition is motivated by [6, Definition B.1.N], of conservation of particle number, and Definition XVI.7(i), of the spaces $\check{\mathcal{F}}_m(n;\Sigma)$.

Definition XX.1. A configuration $(\check\eta_1,\dots,\check\eta_m;\, s_1,\dots,s_n) \in \check{\mathcal{B}}^m \times \Sigma^n$, where $\Sigma$ is a sectorization of some scale $j$, is consistent with conservation of momentum for the sequence $(a_1,\dots,a_n)$ of creation–annihilation indices if there are $k_1,\dots,k_n \in \mathbb{R}\times\mathbb{R}^2$, with $k_i$ in the extended sector $\tilde s_i$ for each $i = 1,\dots,n$, such that
$$\sum_{i=1}^{m} (-1)^{b_i} p_i + \sum_{i=1}^{n} (-1)^{a_i} k_i = 0$$
where $\check\eta_i = (p_i,\sigma_i,b_i)$. We say that the configuration $(\check\eta_1,\dots,\check\eta_m;\, s_1,\dots,s_n)$ is consistent with conservation of momentum if it is consistent with conservation of momentum for some sequence $(a_1,\dots,a_n) \in \{0,1\}^n$ of creation–annihilation indices such that
$$\#\{i \mid a_i = 0\} + \#\{\ell \mid b_\ell = 0\} = \#\{i \mid a_i = 1\} + \#\{\ell \mid b_\ell = 1\}.$$

Remark XX.2. Let $\Sigma$ be a sectorization of scale $j$. (i) If $f \in \check{\mathcal{F}}_m(n;\Sigma)$ preserves particle number then
$$f(\check\eta_1,\dots,\check\eta_m;(\,\cdot\,,s_1),\dots,(\,\cdot\,,s_n)) = 0$$
unless the configuration $(\check\eta_1,\dots,\check\eta_m;\, s_1,\dots,s_n)$ is consistent with conservation of momentum. (ii) If a configuration $(\check\eta_1,\dots,\check\eta_m;\, s_1,\dots,s_n)$ is consistent with conservation of momentum for the sequence $(a_1,\dots,a_n)$ of creation–annihilation indices then there are $\mathbf{k}_1,\dots,\mathbf{k}_n \in \mathbb{R}^2$ such that
$$\sum_{i=1}^{m} (-1)^{b_i} \mathbf{p}_i + \sum_{i=1}^{n} (-1)^{a_i} \mathbf{k}_i = 0$$
and
$$\pi_F((0,\mathbf{k}_i)) \in s_i, \qquad |e(\mathbf{k}_i)| \le \frac{\sqrt{2}}{M^{j-1}}$$
for $i = 1,\dots,n$.

The comparison of the 1- and 3-norms of a four-legged kernel (Proposition XIX.1) uses an estimate on the maximal number of triples $(s_2,s_3,s_4)$ of sectors that complete a given sector $s_1$ to a quadruple $(s_1,s_2,s_3,s_4)$ that is consistent with conservation of momentum (Proposition XX.10). Similarly, the estimate on the behavior of norms under change of sectorization (Proposition XIX.4) is based on estimates of the number of $(2n)$-tuples $(s_1,\dots,s_{2n})$ of sectors that are consistent with conservation of momentum and such that each $s_i$ intersects a given bigger sector from another sectorization (Proposition XX.11). We reduce these counting problems to problems of estimating volumes of sets in momentum space that are characterized by the geometric constraints that the sectors are required to satisfy. To pass from volume estimates to sector counting we use the concept of ε-separated sets (see also [7, p. 22]).

ε-separated sets

Let $M$ be a Riemannian manifold of dimension $n$. For any two points $x, y \in M$ we denote by $d(x,y)$ the distance between $x$ and $y$ in $M$. For $x \in M$ and $r > 0$ let $B_r(x) = \{y \in M \mid d(x,y) < r\}$ be the open ball of radius $r$ around $x$. We set, for $\varepsilon > 0$,
$$V_{M,\varepsilon} = \inf_{x\in M,\ 0<r\le\varepsilon} \frac{1}{r^n}\, \mathrm{vol}\, B_{r/2}(x).$$
For a subset $A \subset M$ and $\delta > 0$ we call
$$A_\delta = \Big\{ x \in M \;\Big|\; \inf_{y\in A} d(x,y) \le \delta \Big\}$$
the (closed) δ-neighborhood of A. If X is a tangent vector to M at the point x we denote by kXk the length of X with respect to the Riemannian metric on M. If f is a differentiable map from M to another Riemannian manifold N we denote by Df (x) the derivative of f at the point x ∈ M. The point x is said to be a critical point of f if Df (x) has rank strictly less than the dimension of N.
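Definition XX.3. Let $\varepsilon > 0$. A subset $\Gamma$ of $M$ is called ε-separated if for any two different $\gamma, \gamma' \in \Gamma$
$$d(\gamma,\gamma') \ge \varepsilon.$$

Example. Let $\Sigma$ be a sectorization of length $l$ and let $\Gamma$ be the set of centers of the intervals $s \cap F$, $s \in \Sigma$. If $\varepsilon < 7l/8$, then $\Gamma$ is an ε-separated subset of the Fermi curve $F$ and more generally $\Gamma^n$ is an ε-separated subset of $F^n$.

We wish to count, for example, for a given sector $s_4$, the number of triples of sectors $(s_1,s_2,s_3)$ such that there exist $k_i \in s_i$ obeying $k_1 + k_2 - k_3 - k_4 = 0$. If $(s_1,s_2,s_3)$ are such sectors, then the map
$$f : F\times F\times F \longrightarrow \mathbb{R}^2, \qquad (k_1,k_2,k_3) \longmapsto k_1 + k_2 - k_3$$
maps $F^3 \cap (s_1\times s_2\times s_3)$ to a neighborhood of $s_4$. We start with an abstract lemma counting the number of points of an ε-separated set $\Gamma$ in the preimage of a specified set $A$ under a specified map $f$.

Lemma XX.4. Let $M$ be a Riemannian manifold of dimension $n$ and $f : M \to \mathbb{R}^d$ a differentiable map. For $x \in M$ denote by $Df(x)$ the derivative of $f$ at the point $x$. Let $\vec n_1,\dots,\vec n_d$ be an orthonormal basis of $\mathbb{R}^d$. Set, for $i = 1,\dots,d$,
$$C_i = \sup_{x\in M}\, \sup\big\{ |\vec n_i \cdot Df(x)v| \;\big|\; v \text{ is a unit tangent vector to } M \text{ at } x \big\}.$$
Furthermore, for any subset $A$ of $\mathbb{R}^d$ and any $\varepsilon > 0$ set
$$A'(\varepsilon) = \Big\{ y \in \mathbb{R}^d \;\Big|\; \exists\, (t_1,\dots,t_d) \in (-\varepsilon,\varepsilon)^d \text{ such that } y + \sum_{i=1}^{d} t_i\, C_i\, \vec n_i \in A \Big\}.$$
Then for all $A \subset \mathbb{R}^d$, $\varepsilon' > 0$, $0 < \varepsilon < \varepsilon'$ and all ε-separated subsets $\Gamma$ of $M$
$$\#\big(f^{-1}(A) \cap \Gamma\big) \le \frac{1}{\varepsilon^n\, V_{M,\varepsilon'}}\, \mathrm{vol}\big(f^{-1}(A'(\varepsilon))\big).$$

Proof. If $\gamma \in f^{-1}(A) \cap \Gamma$ and $x \in M$ with $d(x,\gamma) < \varepsilon/2$, then, by the assumption on the derivative of $f$, for $i = 1,\dots,d$
$$|\vec n_i \cdot (f(x) - f(\gamma))| < \varepsilon\, C_i$$
so that $f(x) \in A'(\varepsilon)$. Obviously the sets $B_{\varepsilon/2}(\gamma)$, $\gamma \in f^{-1}(A) \cap \Gamma$, are pairwise disjoint. Consequently, by the definition of $V_{M,\varepsilon'}$
$$V_{M,\varepsilon'}\, \varepsilon^n\, \#\big(f^{-1}(A) \cap \Gamma\big) \le \sum_{\gamma\in f^{-1}(A)\cap\Gamma} \mathrm{vol}\big(B_{\varepsilon/2}(\gamma)\big) = \mathrm{vol}\Big( \bigcup_{\gamma\in f^{-1}(A)\cap\Gamma} B_{\varepsilon/2}(\gamma) \Big) \le \mathrm{vol}\big(f^{-1}(A'(\varepsilon))\big).$$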
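The counting bound of Lemma XX.4 can be tried out on a toy case. The sketch below is only an illustration under assumptions made here, not the setting of the paper: $M$ is the unit circle (dimension $n = 1$), $f(\theta) = \cos\theta$ (so $d = 1$ and $C_1 = 1$), $\Gamma$ is a set of equally spaced points with spacing $\varepsilon$, and $A = [a,b]$, so that $A'(\varepsilon) = (a - \varepsilon, b + \varepsilon)$. On the circle a ball of radius $r/2$ has length $r$ for $r \le 2\pi$, hence $V_{M,\varepsilon'} = 1$ for $\varepsilon' \le 2\pi$.

```python
import math

def preimage_length(c, d):
    """Length of {theta in [0, 2*pi) : c < cos(theta) < d}."""
    c, d = max(c, -1.0), min(d, 1.0)
    if c >= d:
        return 0.0
    # cos is decreasing on [0, pi]: the preimage there is (acos(d), acos(c)),
    # and (pi, 2*pi) contributes a mirror image of the same length.
    return 2.0 * (math.acos(c) - math.acos(d))

def count_and_bound(a, b, n_points):
    """Return (#(f^{-1}(A) ∩ Γ), vol(f^{-1}(A'(eps))) / eps) for A = [a, b]."""
    eps = 2.0 * math.pi / n_points            # spacing, so Γ is eps-separated
    gamma = [i * eps for i in range(n_points)]
    count = sum(1 for theta in gamma if a <= math.cos(theta) <= b)
    bound = preimage_length(a - eps, b + eps) / eps   # V = 1 and n = 1 here
    return count, bound

# Hypothetical test intervals and numbers of points.
cases = [(0.2, 0.5, 100), (-1.0, 1.0, 50), (0.99, 1.0, 200), (-0.3, -0.1, 7)]
results = [count_and_bound(a, b, n) for (a, b, n) in cases]
for (a, b, n), (count, bound) in zip(cases, results):
    print(f"A=[{a}, {b}], N={n}: count={count}, bound={bound:.3f}")
```

The bound is not sharp in general: it trades an exact count for a volume, which is the quantity that the later sector counting arguments estimate.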
Sums of momenta

For the proofs of Propositions XX.10 and XX.11 we shall apply the discussion of the previous subsection with $\Gamma$ being the set of centers of sectors of a given sectorization. The proofs of these propositions then lead to the problem of estimating the number of points $(k_1,\dots,k_n,k_{n+1},\dots,k_{2n-1}) \in \Gamma^{2n-1}$ such that $k_1 + \cdots + k_n - k_{n+1} - \cdots - k_{2n-1}$ is close to $q$. Thus we are led to studying the maps
$$F^{2n-1} \longrightarrow \mathbb{R}^2, \qquad (k_1,\dots,k_n,k_{n+1},\dots,k_{2n-1}) \longmapsto k_1 + \cdots + k_n - k_{n+1} - \cdots - k_{2n-1}$$
and the intersection of preimages of sets in $\mathbb{R}^2$ with $\Gamma^{2n-1}$. Outside a neighborhood of the set of critical points of this map this can usually be done using Lemma XX.4. The critical points of the map are exactly those points $(k_1,\dots,k_{2n-1}) \in F^{2n-1}$ for which the tangent lines of $F$ at $k_1,\dots,k_{2n-1}$ are all parallel. This is the case if and only if all the points $k_2,\dots,k_{2n-1}$ coincide either with $k_1$ or its antipode $a(k_1)$. For $k \in F$ and $0 < \Lambda \le l$, we call
$$s_{\Lambda,l}(k) = \{ q \in \mathbb{R}^2 \mid |e(q)| \le \Lambda,\ d_F(q',k) \le l/2 \}$$
the two-dimensional sector of length $l$ and width $\Lambda$ around $k$. Here $q \mapsto q'$ is the projection to the Fermi curve introduced in Sec. XI and used in [8, Definition XII.1(i)] and $d_F$ is the intrinsic metric on $F$. Near critical points of the map discussed above we shall use

Proposition XX.5. Let $k, k_1,\dots,k_{2n-3} \in F$ and $\omega > 0$ be such that for $i = 1,\dots,2n-3$ one has $\|k_i - k\| < \omega$ or $\|k_i - a(k)\| < \omega$. Let $\varepsilon_1,\dots,\varepsilon_{2n-3} \in \{\pm 1\}$ and set $q = \varepsilon_1 k_1 + \cdots + \varepsilon_{2n-3} k_{2n-3}$. Furthermore, let $0 < \Lambda \le l \le \omega$ and let $\vec n$ respectively $\vec t$ be unit normal respectively tangent vectors of $F$ at $k$. Then
$$\{ \varepsilon_1 x_1 + \cdots + \varepsilon_{2n-3} x_{2n-3} \mid x_i \in s_{\Lambda,l}(k_i) \}$$
is contained in the rectangle
$$\{ q + t_1 \vec n + t_2 \vec t \;\mid\; |t_1| \le n\, \mathrm{const}\,(\Lambda + l\omega),\ |t_2| \le 4nl \}.$$
The constant const depends only on the geometry of $F$.

Proof. Without loss of generality we may assume that $\vec n = (0,1)$ and $\vec t = (1,0)$. The angle between $F$ and the $k_1$ direction at a point $q \in F$ is bounded by $\mathrm{const}\,\|q - k\|$ and $\mathrm{const}\,\|q - a(k)\|$. Therefore $s_{\Lambda,l}(k_i)$ is contained in a rectangle that is centered at $k_i$ and has two edges parallel to the $k_1$ axis of length $2l$ and two edges parallel to the $k_2$ axis of length $\mathrm{const}\,(\Lambda + l \min\{\|k_i - k\|, \|k_i - a(k)\|\}) \le \mathrm{const}\,(\Lambda + l\omega)$. The claim now follows.
Proposition XX.6. Let $k, k_1,\dots,k_{2n-1} \in F$, $I \subset \{1,\dots,n\}$, $J \subset \{n+1,\dots,2n-1\}$ and $\omega$ a positive real number smaller than the diameter of $F$ such that $\|k_i - k\| < \omega$ for $i \in I \cup J$ and $\|k_j - a(k)\| < \omega$ for $j \notin I \cup J$. Furthermore, let $p \in \mathbb{R}^2$ and $0 < \Lambda \le l \le \omega$. Assume that there are points $x_i \in s_{\Lambda,l}(k_i)$, $i = 1,\dots,2n-1$, such that
$$\|x_1 + \cdots + x_n - x_{n+1} - \cdots - x_{2n-1} - p\| \le 2l.$$
Then

(i) $\| p - (\#I - \#J)\,k + (\#I - \#J - 1)\,a(k) \| \le \mathrm{const}\; n\omega$.

(ii) If $p \in F$ then $\|p - k\| \le \mathrm{const}\; n\omega$ or $\|p - a(k)\| \le \mathrm{const}\; n\omega$.

The constants const depend only on the geometry of $F$.

Proof. By Proposition XX.5,
$$\|x_1 + \cdots + x_n - x_{n+1} - \cdots - x_{2n-1} - (k_1 + \cdots + k_n - k_{n+1} - \cdots - k_{2n-1})\| \le (n+1)\,\mathrm{const}\,(\Lambda + l\omega) + 4(n+1)\,l \le \mathrm{const}\; n\omega.$$
Since
$$\|k_1 + \cdots + k_n - k_{n+1} - \cdots - k_{2n-1} - (\#I)\,k - (n - \#I)\,a(k) + (\#J)\,k + (n - 1 - \#J)\,a(k)\| \le \mathrm{const}\; n\omega$$
part (i) follows. To prove part (ii) assume that $p \in F$. Set $r = \#I$, $s = \#J$. By possibly interchanging $k$ with its antipode, we may assume that $r \ge s$. If $r = s$ or $r = s+1$, it follows directly from part (i) that $\|p - k\| \le \mathrm{const}\; n\omega$. So we assume that $r - s \ge 2$. Let $x_i \in s_{\Lambda,l}(k_i)$, $i = 1,\dots,2n-1$, be such that $y = x_1 + \cdots + x_n - x_{n+1} - \cdots - x_{2n-1}$ has distance at most $2l$ from $p$. Let $\vec n$ be the outward pointing unit normal vector of $F$ at $k$. Then $(k - a(k))\cdot\vec n \ge \mathrm{const}_1$ and
$$|(x_i - k)\cdot\vec n| \le \|x_i - k_i\| + \|k_i - k\| \le \mathrm{const}\,(\Lambda + l) + \omega \le \mathrm{const}\;\omega$$
for $i \in I \cup J$. Similarly for $j \notin I \cup J$
$$|(x_j - a(k))\cdot\vec n| \le \mathrm{const}\;\omega.$$
Consequently
$$|(x_1 + \cdots + x_n - x_{n+1} - \cdots - x_{2n-1} - k)\cdot\vec n - (r - s - 1)(k - a(k))\cdot\vec n| \le (2n-1)\,\mathrm{const}\;\omega$$
and therefore
$$(y - k)\cdot\vec n \ge A\big((r - s - 1) - \mathrm{const}\; n\omega\big)$$
with strictly positive constants $A$, const. The tangent line to $F$ at $k$ is a “supporting hyperplane” for the convex hull of $F$. Therefore
$$F \cap \{x \in \mathbb{R}^2 \mid (x - k)\cdot\vec n > 0\} = \emptyset.$$
So
$$0 \ge (p - k)\cdot\vec n = (p - y)\cdot\vec n + (y - k)\cdot\vec n \ge (p - y)\cdot\vec n + A\big((r - s - 1) - n\,\mathrm{const}\;\omega\big).$$
As $|(p - y)\cdot\vec n| \le 2l$ and hence $(p - y)\cdot\vec n \ge -2l \ge -2\omega \ge -n\,\mathrm{const}\;\omega$, this shows that
$$n \ge \mathrm{const}'\, \frac{r - s - 1}{\omega} \ge \mathrm{const}'\, \frac{1}{\omega}$$
since $r - s \ge 2$. Thus $n\omega$ is larger than some strictly positive constant and the estimate of part (ii) holds.

Pairs of momenta

Lemma XX.7. Let $\varepsilon_1, \varepsilon_2 \in \{\pm 1\}$. There is a subset $X$ of $F \times F$ and a constant $C$ such that

(i) For every $p \in \mathbb{R}^2$
$$\#\{(k_1,k_2) \in X \mid \varepsilon_1 k_1 + \varepsilon_2 k_2 = p\} \le C.$$

(ii) $(F \times F) \setminus X$ has measure zero.

Proof. We may assume that $\varepsilon_1 = +1$. If $\varepsilon_2 = -1$, then we claim that, for every $p \ne 0$
$$\#\{(k_1,k_2) \in F \times F \mid k_1 - k_2 = p\} \le 2. \qquad \text{(XX.1)}$$
In this case, the lemma with $C = 2$ and $X = \{(k_1,k_2) \mid k_1, k_2 \in F,\ k_1 \ne k_2\}$ follows directly from (XX.1). To prove (XX.1), choose a nonzero vector $\vec v$ perpendicular to $p$. Assume that there are distinct pairs $(k_1,k_2), (k_3,k_4), (k_5,k_6) \in F \times F$ such that $k_1 - k_2 = k_3 - k_4 = k_5 - k_6 = p$. Without loss of generality we assume that $k_1\cdot\vec v < k_3\cdot\vec v < k_5\cdot\vec v$. By convexity, the parallelogram, $P$, with vertices $k_1, k_2, k_5, k_6$ is contained in the convex hull of $F$.

[Figure: the parallelogram $P$ with vertices $k_1$, $k_2$, $k_5$, $k_6$ and the direction $\vec v$.]

The segment joining $k_3$ and $k_4$ must cross this parallelogram. Therefore $k_3$ and $k_4$ lie on the edges of $P$. This contradicts the strict convexity of $F$.

Formula (XX.1) may be phrased in more geometrical terms as follows. Let $s$ be a secant of $F$ (that is, a straight line segment joining two different points $k_1, k_2 \in F$). Then there is at most one other secant $s'$ for $F$ that is parallel to $s$ and has the same length. In the case that there is no such second secant, we set $s' = s$. Clearly there are $r, R > 0$ such that for all $k \in F$

(i) $B_r(k) \cap B_R(a(k)) = \emptyset$.

(ii) For any secant $s \subset B_r(k)$ one has $s' \subset B_R(a(k))$.

[Figure: the balls $B_r(k)$ and $B_R(a(k))$, containing the secants $s$ and $s'$ respectively.]

Now consider $\varepsilon_2 = +1$. If $F$ is invariant under inversion in some point $p_0 \in \mathbb{R}^2$, then, for $p \ne 2p_0$,
$$\#\{(k_1,k_2) \in F \times F \mid k_1 + k_2 = p\} = \#\{(k_1,k'_2) \in F \times F \mid k_1 + (2p_0 - k'_2) = p\} \le 2$$
by the case $\varepsilon_2 = -1$.

Now we discuss the case that $F$ is strongly asymmetric in the sense of Definition XVIII.3. Since $F \times F$ is compact, it suffices to show that for each point $(k_1,k_2) \in F \times F$ there exists a neighborhood $U$ of $(k_1,k_2)$ in $F \times F$, a subset $U'$ of $U$ whose complement $U \setminus U'$ has measure zero and a number $m$ such that, for all $p \in \mathbb{R}^2$,
$$\#\{(q_1,q_2) \in U' \mid q_1 + q_2 = p\} \le m.$$

If $k_2 \ne k_1, a(k_1)$, then the map $(q_1,q_2) \mapsto q_1 + q_2$ has rank 2 at $(k_1,k_2)$. By the inverse function theorem, there is a neighborhood $U$ of $(k_1,k_2)$ such that for all $p \in \mathbb{R}^2$, $\#\{(q_1,q_2) \in U \mid q_1 + q_2 = p\} \le 1$.

Next assume that $k_1 = k_2 = k$. Let $U_r = \{(q_1,q_2) \in F^2 \mid \|q_1 - k\|, \|q_2 - k\| < r\}$, where $r$ is defined in the discussion of secants above. We claim that for $(q_1,q_2), (q'_1,q'_2) \in U_r$
$$q_1 + q_2 = q'_1 + q'_2 \iff (q_1,q_2) = (q'_1,q'_2) \ \text{ or } \ (q_1,q_2) = (q'_2,q'_1).$$
Assume that $q_1 + q_2 = q'_1 + q'_2$ and $(q_1,q_2) \ne (q'_1,q'_2), (q'_2,q'_1)$. Then $q_1 - q'_1 = q'_2 - q_2 \ne 0$, so that the secant $s$ of $F$ joining $q_1$ to $q'_1$ is parallel to and of the same length as, but disjoint from the secant $\tilde s$ joining $q'_2$ to $q_2$. Therefore, $\tilde s = s'$ where, as above, $s'$ is the unique second secant parallel to and of the same length as $s$. But this is impossible, as $\tilde s \subset B_r(k)$, $s' \subset B_R(a(k))$ and $B_r(k) \cap B_R(a(k)) = \emptyset$.

Finally, assume that $k_1 = k$ and $k_2 = a(k)$ for some $k \in F$. We may assume, without loss of generality, that the oriented unit tangent vector to $F$ at $k$ is $(1,0)$ and that
J. Feldman, H. Knörrer & E. Trubowitz
length as, but disjoint from, the secant $\tilde s$ joining $q_2'$ to $q_2$. Therefore $\tilde s=s'$ where, as above, $s'$ is the unique second secant parallel to and of the same length as $s$. But this is impossible, as $\tilde s\subset B_r(k)$, $s'\subset B_R(a(k))$ and $B_r(k)\cap B_R(a(k))=\emptyset$.

Finally, assume that $k_1=k$ and $k_2=a(k)$ for some $k\in F$. We may assume, without loss of generality, that the oriented unit tangent vector to $F$ at $k$ is $(1,0)$ and that the inward pointing unit normal vector to $F$ at $k$ is $(0,1)$. Then, in the notation of Definition XVIII.3, $t\mapsto k+(t,\varphi_k(t))$ is a parametrization of $F$ near $k$ and $t\mapsto a(k)-(t,\varphi_{a(k)}(t))$ is a parametrization of $F$ near $a(k)$. Then
\[
(t_1,t_2)\mapsto\bigl(k+(t_1,\varphi_k(t_1)),\ a(k)-(t_2,\varphi_{a(k)}(t_2))\bigr)
\]
is a parametrization of $F\times F$ near $(k_1,k_2)$. With respect to these coordinates, the map $(q_1,q_2)\mapsto q_1+q_2-(k+a(k))$ from $F\times F$ to $\mathbb R^2$ is
\[
\tilde f:(t_1,t_2)\mapsto\bigl(t_1-t_2,\ \varphi_k(t_1)-\varphi_{a(k)}(t_2)\bigr).
\]
Since $F$ is strongly asymmetric, there is $2\le n\le n_0$ such that
\[
\varphi_k^{(n)}(0)\ne\varphi_{a(k)}^{(n)}(0).
\]
We show that, if $\epsilon$ is small enough, then, for any $p=(p_1,p_2)\in\mathbb R^2$, the equation
\[
\tilde f(t_1,t_2)=p \tag{XX.2}
\]
has at most $n$ solutions in $(-\epsilon,\epsilon)^2$. Fix $p$ and set $g(t)=\varphi_k(p_1+t)-\varphi_{a(k)}(t)-p_2$. Then $(t_1,t_2)$ is a solution of (XX.2) if and only if $(t_1,t_2)=(p_1+t,t)$ with $t$ a zero of $g(t)$. Hence it suffices to prove that $g(t)$ has at most $n$ zeros. But, since $\varphi_k^{(n)}(0)-\varphi_{a(k)}^{(n)}(0)\ne0$, the $n$th derivative $g^{(n)}(t)=\varphi_k^{(n)}(p_1+t)-\varphi_{a(k)}^{(n)}(t)$ never vanishes for $|p_1+t|,|t|<\epsilon$, for $\epsilon$ sufficiently small. Consequently, by repeated application of Rolle's theorem, $g$ can have at most $n$ zeros on this set.

Lemma XX.8. Let $p\in F$, $0<\omega_1<\tfrac12\omega_2$ and set
\[
\mathcal M=\bigl\{(k_1,k_2)\in F\times F \bigm| \min[d(k_1,k_2),d(a(k_1),k_2)]\ge\omega_1 \text{ and } \min[d(k_i,p),d(a(k_i),p)]\le\omega_2 \text{ for } i=1,2\bigr\}.
\]
Let $\varepsilon_1,\varepsilon_2\in\{\pm1\}$ and let $f$ be the map from $F\times F$ to $\mathbb R^2$ given by $f(k_1,k_2)=\varepsilon_1k_1+\varepsilon_2k_2$. There are constants const that depend only on the geometry of $F$ such that
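(i) for all measurable subsets $A$ of $\mathbb R^2$
\[
\mathrm{vol}\bigl(f^{-1}(A)\cap\mathcal M\bigr)\le\frac{\mathrm{const}}{\omega_1}\,\mathrm{vol}(A).
\]
(ii) $V_{\mathcal M,\omega_1}\ge\mathrm{const}$.
(iii) Let $\vec n$ be a unit normal vector to $F$ at $p$. Then for all $(k_1,k_2)\in\mathcal M$
\[
\sup_{\substack{\vec v\in T_{(k_1,k_2)}\mathcal M\\ \|\vec v\|\le1}}|\vec n\cdot Df(k_1,k_2)\vec v|\le\mathrm{const}\,\omega_2 .
\]
(iv) Let $0<\epsilon<\omega_1/4$ and let $\Gamma$ be an $\epsilon$-separated subset of $F$. Furthermore let $R$ be a rectangle in $\mathbb R^2$ having one pair of sides parallel to $\vec n$ of length $A>0$ and a second pair of sides perpendicular to $\vec n$ of length $B>0$. Then
\[
\#\bigl(f^{-1}(R)\cap\mathcal M\cap\Gamma^2\bigr)\le\frac{\mathrm{const}}{\omega_1\epsilon^2}\,(A+\epsilon\,\omega_2)(B+\epsilon).
\]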
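Proof. (i) For $k_1,k_2\in F$ let $\theta(k_1,k_2)$ be the angle between the normal vectors to $F$ at $k_1$ and at $k_2$. Then the Jacobian determinant of $f$ at $(k_1,k_2)$ is
\[
|\sin\theta(k_1,k_2)|\ge\mathrm{const}\,\min\bigl(d(k_1,k_2),d(k_1,a(k_2))\bigr)\ge\mathrm{const}\,\omega_1 .
\]
The claim follows from the rule for the change of variables in integrals and Lemma XX.7.

(ii) is trivial.

(iii) For $q\in F$, let $\vartheta(q)$ be the angle between $\vec n$ and the normal vector to $F$ at $q$. Then
\begin{align*}
\sup_{\substack{\vec v\in T_{(k_1,k_2)}\mathcal M\\ \|\vec v\|\le1}}|\vec n\cdot Df(k_1,k_2)\vec v|
&\le 2\max\bigl(|\sin\vartheta(k_1)|,|\sin\vartheta(k_2)|\bigr)\\
&\le\mathrm{const}\,\max\bigl(\min(\|p-k_1\|,\|p-a(k_1)\|),\ \min(\|p-k_2\|,\|p-a(k_2)\|)\bigr)\\
&\le\mathrm{const}\,\omega_2 .
\end{align*}
(iv) Let $\vec t$ be the tangent vector to $F$ at $p$. Obviously
\[
\sup_{\substack{\vec v\in T_{(k_1,k_2)}\mathcal M\\ \|\vec v\|\le1}}|\vec t\cdot Df(k_1,k_2)\vec v|\le2
\]
for all $(k_1,k_2)\in\mathcal M$. So by parts (ii) and (iii) of this lemma and Lemma XX.4
\[
\#\bigl(f^{-1}(R)\cap\mathcal M\cap\Gamma^2\bigr)\le\frac{\mathrm{const}}{\epsilon^2}\,\mathrm{vol}\bigl(f^{-1}(R')\bigr)
\]
where $R'$ is a rectangle of side lengths $A+\mathrm{const}\,\epsilon\,\omega_2$ and $B+4\epsilon$. By part (i),
\[
\mathrm{vol}\bigl(f^{-1}(R')\bigr)\le\mathrm{const}\,\frac{1}{\omega_1}\,(A+\mathrm{const}\,\epsilon\,\omega_2)(B+4\epsilon).
\]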
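The two estimates in the proof of part (iv) can be chained as in the following sketch. It assumes that $\epsilon$ denotes the separation of $\Gamma$, that $R'$ is the enlarged rectangle with side lengths $A+\mathrm{const}\,\epsilon\,\omega_2$ and $B+4\epsilon$, and that const may change from line to line; the last step only absorbs the numerical factors into const.

```latex
\begin{align*}
\#\bigl(f^{-1}(R)\cap\mathcal M\cap\Gamma^{2}\bigr)
 &\le \frac{\mathrm{const}}{\epsilon^{2}}\,
      \mathrm{vol}\bigl(f^{-1}(R')\bigr) \\
 &\le \frac{\mathrm{const}}{\epsilon^{2}}\cdot\frac{1}{\omega_1}\,
      (A+\mathrm{const}\,\epsilon\,\omega_2)\,(B+4\epsilon) \\
 &\le \frac{\mathrm{const}}{\omega_1\,\epsilon^{2}}\,
      (A+\epsilon\,\omega_2)\,(B+\epsilon).
\end{align*}
```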
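Sectors that are compatible with conservation of momentum

Let $0\le\Lambda\le l$, $\Lambda\ge l^2$, let $p\in F$, $q\in\mathbb R^2$ and let $\Gamma$ be a discrete subset of $F$. We define for $n\ge2$
\begin{align*}
\mathrm{Mom}_{2n-1}(\Gamma,p)&=\bigl\{(k_1,\dots,k_{2n-1})\in\Gamma^{2n-1}\bigm|\exists\,x_i\in s_{\Lambda,l}(k_i),\ i=1,\dots,2n-1 \text{ such that } x_1+\dots+x_n-x_{n+1}-\dots-x_{2n-1}\in s_{\Lambda,l}(p)\bigr\}\\
\mathrm{Mom}^{\sim}_{2n-1}(\Gamma,q)&=\bigl\{(k_1,\dots,k_{2n-1})\in\Gamma^{2n-1}\bigm|\exists\,x_i\in s_{\Lambda,l}(k_i),\ i=1,\dots,2n-1 \text{ such that } x_1+\dots+x_n-x_{n+1}-\dots-x_{2n-1}=q\bigr\}.
\end{align*}
Lemma XX.9. Let $p\in F$, $q\in\mathbb R^2$ and let $\Gamma$ be an $l$-separated subset of $F$.

(i) If $\omega\ge\Lambda/l$ then
\begin{align*}
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}&\le\mathrm{const}\,\frac{\omega}{l}\\
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}^{\sim}_3(\Gamma,q)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}&\le\mathrm{const}\,\frac{\omega}{l}.
\end{align*}
(ii) If $4l\le\omega\le\Lambda/l$ then
\begin{align*}
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}&\le\mathrm{const}\,\frac{\Lambda}{l^2}\\
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}^{\sim}_3(\Gamma,q)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}&\le\mathrm{const}\,\frac{\Lambda}{l^2}.
\end{align*}
The constants const above depend only on the geometry of $F$.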
Proof. If $(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)$ and $\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega$, then by Proposition XX.6(ii), with $n=2$, $\omega$ replaced by $2\omega$ and $k=k_i$ or $a(k_i)$, $i=1,2,3$,
\[
\min[d(k_i,p),d(a(k_i),p)]\le4\,\mathrm{const}'\,\omega\qquad\text{for }1\le i\le3 \tag{XX.3}
\]
where $\mathrm{const}'$ is the constant of Proposition XX.6. Therefore we set for $1\le\mu\ne\nu\le3$
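\begin{align*}
S_{\mu,\nu}=\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\ &\omega\le\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega,\\
&\max_{1\le\alpha\ne\beta\le3}\min[d(k_\alpha,k_\beta),d(a(k_\alpha),k_\beta)]\le2\omega\\
&\text{and } \min[d(k_i,p),d(a(k_i),p)]\le4\,\mathrm{const}'\,\omega \text{ for }1\le i\le3\Bigr\}.
\end{align*}
The discussion above shows that
\[
\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}\subset\bigcup_{1\le\mu\ne\nu\le3}S_{\mu,\nu}. \tag{XX.4}
\]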
Similarly, if $(k_1,k_2,k_3)\in\mathrm{Mom}^{\sim}_3(\Gamma,q)$ and $\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega$ for all $1\le\mu\ne\nu\le3$, then by Proposition XX.6(i), we have for $i=1,2,3$
\begin{align*}
&d(k_i,p)\le4\,\mathrm{const}'\,\omega\quad\text{or}\quad d(a(k_i),p)\le4\,\mathrm{const}'\,\omega\quad\text{or}\\
&d(2k_i-a(k_i),p)\le4\,\mathrm{const}'\,\omega\quad\text{or}\quad d(2a(k_i)-k_i,p)\le4\,\mathrm{const}'\,\omega . \tag{XX.5}
\end{align*}
Setting for $1\le\mu\ne\nu\le3$
\begin{align*}
S^{\sim}_{\mu,\nu}=\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}^{\sim}_3(\Gamma,q)\Bigm|\ &\omega\le\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega,\\
&\max_{1\le\alpha\ne\beta\le3}\min[d(k_\alpha,k_\beta),d(a(k_\alpha),k_\beta)]\le2\omega \text{ and (XX.5) holds}\Bigr\}
\end{align*}
we get
\[
\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}^{\sim}_3(\Gamma,q)\Bigm|\omega\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le2\omega\Bigr\}\subset\bigcup_{1\le\mu\ne\nu\le3}S^{\sim}_{\mu,\nu}. \tag{XX.6}
\]
We show that for $1\le\mu\ne\nu\le3$ one has $\#S_{\mu,\nu},\#S^{\sim}_{\mu,\nu}\le\mathrm{const}\,\frac{\omega}{l}$ in case (i), and that $\#S_{\mu,\nu},\#S^{\sim}_{\mu,\nu}\le\mathrm{const}\,\frac{\Lambda}{l^2}$ in case (ii). We only discuss the case $\mu=1$, $\nu=2$; the other cases are similar.

Set $S=S_{1,2}$ or $S=S^{\sim}_{1,2}$. By construction, if $(k_1,k_2,k_3)\in S$,
\[
\min[\|k_1-k_3\|,\|k_1-a(k_3)\|],\ \min[\|k_2-k_3\|,\|k_2-a(k_3)\|]\le\mathrm{const}\,\omega \tag{XX.7}
\]
and (XX.3), respectively (XX.5), holds for $i=3$. Since the maps $k\mapsto k$, $k\mapsto a(k)$, $k\mapsto2k-a(k)$ and $k\mapsto2a(k)-k$ are embeddings of $F$, there are at most $\mathrm{const}\,\frac{\omega}{l}$ choices of $k_3\in\Gamma$ for which (XX.3) or (XX.5) is satisfied. Fix such a $k_3$. Let $\vec n$ be a unit normal vector to $F$ at $k_3$ and $\vec t$ be a unit tangent vector to $F$ at $k_3$. If $(k_1,k_2,k_3)\in S$, by (XX.7), the sectors $s_{\Lambda,l}(k_i)$, $i=1,2,3$, are each contained in a
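rectangle two of whose edges are parallel to $\vec t$ and have length at most $\mathrm{const}\,l$, and two of whose edges are parallel to $\vec n$ and have length at most
\[
\begin{cases}
\mathrm{const}\,l\,\omega & \text{in case (i)}\\
\mathrm{const}(\Lambda+l\,\omega)\le\mathrm{const}\,\Lambda & \text{in case (ii).}
\end{cases}
\]
The same holds for $s_{\Lambda,l}(p)$ when $S=S_{1,2}$. In particular, if $S=S_{1,2}$, the set
\[
\{x_3+y \mid x_3\in s_{\Lambda,l}(k_3),\ y\in s_{\Lambda,l}(p)\}
\]
is contained in a rectangle $R$ whose one pair of edges is parallel to $\vec t$ and has length at most $\mathrm{const}\,l$, and whose other pair of edges is parallel to $\vec n$ and has length at most $\mathrm{const}\,l\,\omega$ in case (i) and length at most $\mathrm{const}\,\Lambda$ in case (ii). If $S=S^{\sim}_{1,2}$, the set $\{x_3+q \mid x_3\in s_{\Lambda,l}(k_3)\}$ is contained in such a rectangle $R$. Let
\[
\mathcal M(k_3)=\{(k_1,k_2)\in\Gamma^2 \mid (k_1,k_2,k_3)\in S\}.
\]
By definition, if $(k_1,k_2)\in\mathcal M(k_3)$, there are $x_1\in s_{\Lambda,l}(k_1)$, $x_2\in s_{\Lambda,l}(k_2)$ such that $x_1+x_2\in R$. The shape of $s_{\Lambda,l}(k_1)$, $s_{\Lambda,l}(k_2)$ and $R$ determined above implies that the map $f:(k_1,k_2)\mapsto k_1+k_2$ maps $\mathcal M(k_3)$ to a rectangle $R'$ that contains $R$ and has one pair of edges parallel to $\vec t$ and of length at most $\mathrm{const}'\,l$, and a second pair of edges parallel to $\vec n$ and of length at most $\mathrm{const}'\,l\,\omega$ in case (i) and $\mathrm{const}'\,\Lambda$ in case (ii). Observe that
\[
\mathcal M(k_3)\subset\bigl\{(k_1,k_2)\in\Gamma^2\bigm|\omega\le\min[d(k_1,k_2),d(a(k_1),k_2)] \text{ and } \min[d(k_i,k_3),d(a(k_i),k_3)]\le\mathrm{const}\,\omega \text{ for } i=1,2\bigr\}.
\]
It follows from part (iv) of Lemma XX.8, with $p=k_3$, $A=\mathrm{const}'\,l\,\omega$ or $\mathrm{const}'\,\Lambda$, $B=\mathrm{const}'\,l$, $\omega_1=\omega$, $\omega_2=\mathrm{const}\,\omega$ and $\epsilon=l$ that
\[
\#\mathcal M(k_3)\le
\begin{cases}
\dfrac{\mathrm{const}}{\omega l^2}\,(l\,\omega)(l)=\mathrm{const} & \text{in case (i)}\\[2mm]
\dfrac{\mathrm{const}}{\omega l^2}\,\Lambda\,l=\mathrm{const}\,\dfrac{\Lambda}{\omega l} & \text{in case (ii).}
\end{cases}
\]
Together with the observation made above, that there are at most $\mathrm{const}\,\frac{\omega}{l}$ choices of $k_3\in\Gamma$ for which there exist $(k_1,k_2)\in\Gamma^2$ with $(k_1,k_2,k_3)\in S$, this completes the proof of the lemma.

Proposition XX.10. For all $l$-separated subsets $\Gamma$ of $F$ and all $p\in F$, $q\in\mathbb R^2$
\begin{align*}
\#\mathrm{Mom}_3(\Gamma,p)&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{\Lambda}{l}\log\frac{\Lambda}{l^2}\Bigr)\\
\#\mathrm{Mom}^{\sim}_3(\Gamma,q)&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{\Lambda}{l}\log\frac{\Lambda}{l^2}\Bigr)
\end{align*}
with a constant const that depends only on the geometry of $F$.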
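Proof. We give the proof for $\mathrm{Mom}_3(\Gamma,p)$; the proof for $\mathrm{Mom}^{\sim}_3(\Gamma,q)$ is similar. Applying part (i) of Lemma XX.9 successively to $\omega=\Lambda/l,\ 2\Lambda/l,\ 4\Lambda/l,\dots,\mathrm{const}$ one sees that
\[
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\ge\Lambda/l\Bigr\}
\le\sum_{j=1}^{\log_2(\mathrm{const}\,\frac{l}{\Lambda})}\mathrm{const}\,2^j\,\frac{\Lambda}{l^2}
\le\mathrm{const}\,\frac{l}{\Lambda}\,\frac{\Lambda}{l^2}
\le\frac{\mathrm{const}}{l}.
\]
Similarly, if $4l\le\Lambda/l$ and one applies part (ii) of Lemma XX.9 successively to $\omega=4l,\ 8l,\ 16l,\dots,2^{1+[\log_2\frac{\Lambda}{l^2}]}\,l$ one sees that
\[
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|4l\le\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le\Lambda/l\Bigr\}
\le\mathrm{const}\,\frac{\Lambda}{l^2}\Bigl(1+\log_2\frac{\Lambda}{l^2}\Bigr).
\]
Finally, it is obvious that
\[
\#\Bigl\{(k_1,k_2,k_3)\in\mathrm{Mom}_3(\Gamma,p)\Bigm|\max_{1\le\mu\ne\nu\le3}\min[d(k_\mu,k_\nu),d(a(k_\mu),k_\nu)]\le4l\Bigr\}\le\frac{\mathrm{const}}{l}.
\]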
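Adding the three ranges of $\omega$ recovers the bound stated in Proposition XX.10. The following is only a sketch of that bookkeeping: $J=\log_2(\mathrm{const}\,l/\Lambda)$ is a label introduced here for the number of dyadic steps, const changes from line to line, the middle term is present only when $4l\le\Lambda/l$, and the last step uses $\Lambda\le l$ and $\Lambda\ge l^2$.

```latex
\begin{align*}
\sum_{j=1}^{J}\mathrm{const}\,2^{j}\,\frac{\Lambda}{l^{2}}
  &\le \mathrm{const}\,2^{J+1}\,\frac{\Lambda}{l^{2}}
   = \mathrm{const}\,\frac{l}{\Lambda}\cdot\frac{\Lambda}{l^{2}}
   = \frac{\mathrm{const}}{l}, \\
\#\mathrm{Mom}_3(\Gamma,p)
  &\le \frac{\mathrm{const}}{l}
     + \mathrm{const}\,\frac{\Lambda}{l^{2}}
       \Bigl(1+\log_2\frac{\Lambda}{l^{2}}\Bigr)
     + \frac{\mathrm{const}}{l} \\
  &\le \frac{\mathrm{const}}{l}
       \Bigl(1+\frac{\Lambda}{l}\,\log\frac{\Lambda}{l^{2}}\Bigr).
\end{align*}
```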
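Proposition XX.11. Let $n\ge2$, $\delta\ge l$ and let $I_1,\dots,I_{2n-1}$ be intervals of length $\delta$ in $F$. Assume that
\[
\tfrac13\,\omega=\max_{1\le i\ne j\le2n-1}\min\bigl(\mathrm{dist}(I_i,I_j),\mathrm{dist}(I_i,a(I_j))\bigr)>\max(\delta,4l).
\]
Then for all $l$-separated subsets $\Gamma$ of $F$, all $p\in F$ and all $q\in\mathbb R^2$
\begin{align*}
\#\bigl(\mathrm{Mom}_{2n-1}(\Gamma,p)\cap(I_1\times\dots\times I_{2n-1})\bigr)&\le\mathrm{const}\,n^2\Bigl(\frac{\delta}{l}+1\Bigr)^{2n-3}\Bigl(1+\frac{\Lambda}{l\omega}\Bigr)\\
\#\bigl(\mathrm{Mom}^{\sim}_{2n-1}(\Gamma,q)\cap(I_1\times\dots\times I_{2n-1})\bigr)&\le\mathrm{const}\,n^2\Bigl(\frac{\delta}{l}+1\Bigr)^{2n-3}\Bigl(1+\frac{\Lambda}{l\omega}\Bigr)
\end{align*}
with a constant const that depends only on the geometry of $F$.

Proof. The proof is similar to the proof of Proposition XX.10. Set
\[
\varepsilon_i=\begin{cases}+1&\text{for }1\le i\le n\\-1&\text{for }n+1\le i\le2n-1.\end{cases}
\]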
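Choose a point $k\in I_1$. Then for all $x\in\bigcup_{i=1}^{2n-1}I_i$
\[
d(x,k)\le\omega\quad\text{or}\quad d(x,a(k))\le\omega .
\]
Choose $1\le i_0<j_0\le2n-1$ such that
\[
\min\bigl(\mathrm{dist}(I_{i_0},I_{j_0}),\mathrm{dist}(I_{i_0},a(I_{j_0}))\bigr)=\tfrac13\,\omega .
\]
Since $\Gamma$ is $l$-separated and each $I_j$ is of length $\delta$,
\[
\#\Bigl(\prod_{\substack{i=1\\ i\ne i_0,j_0}}^{2n-1}\Gamma\cap I_i\Bigr)\le\Bigl(\frac{\delta}{l}+1\Bigr)^{2n-3}. \tag{XX.8}
\]
Fix $k_i\in\Gamma\cap I_i$ for $i=1,\dots,2n-1$, $i\ne i_0,j_0$. Let $\vec n$ be a unit normal vector to $F$ at $k$ and $\vec t$ be a unit tangent vector to $F$ at $k$. By Proposition XX.5
\[
\Bigl\{-\Bigl(\sum_{\substack{i=1\\ i\ne i_0,j_0}}^{2n-1}\varepsilon_ix_i-q\Bigr)\Bigm|x_i\in s_{\Lambda,l}(k_i)\Bigr\}
\]
is contained in a rectangle $R^{\sim}$ having one pair of edges parallel to $\vec t$ and of length at most $\mathrm{const}\,nl$ and a second pair of edges parallel to $\vec n$ and of length at most $\mathrm{const}\,n[\Lambda+l\,\omega]$. As each $s_{\Lambda,l}(k_i)$ is contained in a rectangle having one pair of edges parallel to $\vec t$ and of length at most $\mathrm{const}\,l$ and a second pair of edges parallel to $\vec n$ and of length at most $\mathrm{const}[\Lambda+l\,\omega]$, the map $f:(k_{i_0},k_{j_0})\mapsto\varepsilon_{i_0}k_{i_0}+\varepsilon_{j_0}k_{j_0}$ maps the set
\[
\mathcal M^{\sim}=\bigl\{(k_{i_0},k_{j_0})\in\Gamma^2\cap(I_{i_0}\times I_{j_0})\bigm|\exists\,x_i\in s_{\Lambda,l}(k_i),\ i=1,\dots,2n-1 \text{ such that } x_1+\dots+x_n-x_{n+1}-\dots-x_{2n-1}=q\bigr\}
\]
to a rectangle $R'$ having one pair of edges parallel to $\vec t$ and of length at most $B=n\,\mathrm{const}'\,l$, and a second pair of edges parallel to $\vec n$ and of length at most $A=\mathrm{const}'\,n[\Lambda+l\,\omega]$. By part (iv) of Lemma XX.8, with $p$ replaced by $k$, $\omega_1=\tfrac13\omega$, $\omega_2=\omega$ and $\epsilon=l$,
\[
\#\mathcal M^{\sim}\le\frac{\mathrm{const}}{l^2\omega}\,(n\Lambda+nl\omega)(nl)\le\mathrm{const}\,n^2\Bigl(1+\frac{\Lambda}{l\omega}\Bigr).
\]
This, together with (XX.8), proves
\[
\#\bigl(\mathrm{Mom}^{\sim}_{2n-1}(\Gamma,q)\cap(I_1\times\dots\times I_{2n-1})\bigr)\le\mathrm{const}\Bigl(\frac{\delta}{l}+1\Bigr)^{2n-3}n^2\Bigl(1+\frac{\Lambda}{l\omega}\Bigr).
\]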
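The arithmetic behind the bound on $\#\mathcal M^{\sim}$ can be spelled out as follows. This is a sketch only: it assumes Lemma XX.8(iv) in the form $\frac{\mathrm{const}}{\omega_1\epsilon^2}(A+\epsilon\,\omega_2)(B+\epsilon)$, with $A$ and $B$ the side lengths of $R'$ given above, and it absorbs all numerical factors into const.

```latex
\begin{align*}
\#\mathcal M^{\sim}
 &\le \frac{\mathrm{const}}{\omega_1\,\epsilon^{2}}\,
      (A+\epsilon\,\omega_2)\,(B+\epsilon)
  \qquad \bigl(\omega_1=\tfrac13\omega,\ \omega_2=\omega,\ \epsilon=l\bigr) \\
 &\le \frac{\mathrm{const}}{\omega\,l^{2}}\,
      \bigl(n\Lambda+n\,l\,\omega+l\,\omega\bigr)\,(n\,l+l) \\
 &\le \frac{\mathrm{const}}{\omega\,l^{2}}\;
      n\,(\Lambda+l\,\omega)\; n\,l
  \;=\; \mathrm{const}\; n^{2}\Bigl(1+\frac{\Lambda}{l\,\omega}\Bigr).
\end{align*}
```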
By Proposition XX.6(ii)
\[
\|p-k\|\le\mathrm{const}\,n\omega\quad\text{or}\quad\|p-a(k)\|\le\mathrm{const}\,n\omega .
\]
Therefore $s_{\Lambda,l}(p)$ is contained in a rectangle, two of whose edges are parallel to $\vec t$ and have length at most $\mathrm{const}\,l$ and two of whose edges are parallel to $\vec n$ and have length at most
\[
\mathrm{const}(\Lambda+l\,\|p-k\|)\le\mathrm{const}(\Lambda+n\,l\,\omega).
\]
This and Proposition XX.5 imply that
\[
\Bigl\{-\Bigl(\sum_{\substack{i=1\\ i\ne i_0,j_0}}^{2n-1}\varepsilon_ix_i-y\Bigr)\Bigm|x_i\in s_{\Lambda,l}(k_i),\ y\in s_{\Lambda,l}(p)\Bigr\}
\]
is contained in a rectangle $R$ having one pair of edges parallel to $\vec t$ and of length at most $\mathrm{const}\,n\,l$ and a second pair of edges parallel to $\vec n$ and of length at most
\[
\mathrm{const}[n\Lambda+n\,l\,\omega+n\,l\,\omega]\le\mathrm{const}\,n[\Lambda+l\,\omega].
\]
As above, this implies that
\[
\#\bigl(\mathrm{Mom}_{2n-1}(\Gamma,p)\cap(I_1\times\dots\times I_{2n-1})\bigr)\le\mathrm{const}\Bigl(\frac{\delta}{l}+1\Bigr)^{2n-3}n^2\Bigl(1+\frac{\Lambda}{l\omega}\Bigr).
\]

XXI. Sectors Compatible with Conservation of Momentum

Comparison of the 1-norm and the 3-norm for four-legged kernels

Lemma XXI.1. There is a constant const independent of $M$ such that the following holds. Let $\Sigma$ be a sectorization of length $l$ at scale $j$ with $\frac{2}{M^{j-1}}\le l\le\frac{1}{M^{(j-1)/2}}$. Furthermore let $\varphi\in\mathcal F_0(4;\Sigma)$ and $f\in\check{\mathcal F}_1(3;\Sigma)$ be particle number conserving functions. Then
\begin{align*}
|\varphi|_{1,\Sigma}&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{1}{lM^{j-1}}\log\frac{1}{l^2M^{j-1}}\Bigr)|\varphi|_{3,\Sigma}\\
|f|^{\sim}_{1,\Sigma}&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{1}{lM^{j-1}}\log\frac{1}{l^2M^{j-1}}\Bigr)|f|^{\sim}_{3,\Sigma}.
\end{align*}
Proof. By [8, Definition XII.9] and Remark XX.2(i)
\[
|\varphi|_{1,\Sigma}=\max_{1\le i_1\le4}\ \max_{s_{i_1}\in\Sigma}\sum_{\substack{s_i\in\Sigma\text{ for }i\ne i_1\\ s_1,s_2,s_3,s_4\text{ compatible with}\\ \text{conservation of momentum}}}\|\varphi((\cdot,s_1),\dots,(\cdot,s_4))\|_{1,\infty}.
\]
Let $1\le i_1\le4$ and $s_{i_1}\in\Sigma$. Choose $i_2,i_3,i_4$ such that $\{1,2,3,4\}=\{i_1,i_2,i_3,i_4\}$. By Remark XX.2(ii) and Proposition XX.10, with $\Lambda=\frac{\sqrt2}{M^{j-1}}$, there are at most $\frac{\mathrm{const}}{l}\bigl(1+\frac{M}{lM^j}\log\frac{M}{l^2M^j}\bigr)$ triples $(s_{i_2},s_{i_3},s_{i_4})$ such that $(s_1,s_2,s_3,s_4)$ is compatible with conservation of momentum. Consequently
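\begin{align*}
|\varphi|_{1,\Sigma}&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{M}{lM^j}\log\frac{M}{l^2M^j}\Bigr)\max_{s_1,s_2,s_3,s_4\in\Sigma}\|\varphi((\cdot,s_1),\dots,(\cdot,s_4))\|_{1,\infty}\\
&\le\frac{\mathrm{const}}{l}\Bigl(1+\frac{1}{lM^{j-1}}\log\frac{1}{l^2M^{j-1}}\Bigr)\max_{1\le i_1<i_2<i_3\le4}\ \max_{s_{i_1},s_{i_2},s_{i_3}\in\Sigma}\sum_{s_i\in\Sigma\text{ for }i\ne i_1,i_2,i_3}\|\varphi((\cdot,s_1),\dots,(\cdot,s_4))\|_{1,\infty}\\
&=\frac{\mathrm{const}}{l}\Bigl(1+\frac{1}{lM^{j-1}}\log\frac{1}{l^2M^{j-1}}\Bigr)|\varphi|_{3,\Sigma}.
\end{align*}
The argument for $|f|^{\sim}_{1,\Sigma}$ is analogous.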
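Proof of Proposition XIX.1. Under the hypotheses of this proposition, the term $\frac{1}{lM^{j-1}}\log\frac{1}{l^2M^{j-1}}$ is bounded by an $M$-independent constant and the first inequality, $|\varphi|_{1,\Sigma}\le\mathrm{const}\,\frac1l\,|\varphi|_{3,\Sigma}$, follows. If $f\in\check{\mathcal F}_{4;\Sigma}$ and $\vec i\in\{0,1\}^4$, then $\bigl|f|_{\vec i}\bigr|^{\sim}_{1,\Sigma}=0$ unless $m(\vec i)\le1$. Therefore the second inequality, $|f|^{\sim}_{1,\Sigma}\le\mathrm{const}\,\frac1l\,|f|^{\sim}_{3,\Sigma}$, also follows from Lemma XXI.1.

Auxiliary norms

Let $\Sigma$ be a sectorization of scale $j$ and length $l\ge\frac{1}{M^{j-1}}$. For $\omega>0$ we define auxiliary norms on functions $\varphi$ in $\mathcal F_0(n;\Sigma)$ and $f\in\check{\mathcal F}_1(n;\Sigma)$ that are antisymmetric in their $(\xi,s)$ arguments by
\begin{align*}
|\varphi|_{1,\Sigma,\omega}&=\max_{s_1\in\Sigma}\sum_{\substack{s_2,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\ge\omega\text{ and }\mathrm{dist}(s_k,a(s_\ell))\ge\omega\\ \text{for some }2\le k\ne\ell\le n}}\|\varphi((\cdot,s_1),\dots,(\cdot,s_n))\|_{1,\infty}\\
|f|^{\sim}_{1,\Sigma,\omega}&=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\ \sup_{\check\eta\in\check{\mathcal B}}\sum_{\substack{s_1,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\ge\omega\text{ and }\mathrm{dist}(s_k,a(s_\ell))\ge\omega\\ \text{for some }1\le k\ne\ell\le n}}\ \max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta;(\xi_1,s_1),\dots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta .
\end{align*}
The norm $|\!|\!|\cdot|\!|\!|_{1,\infty}$ of Example II.6 refers to the variables $\xi_1,\dots,\xi_n$. Furthermore, maxima, like $\max_{s_1\in\Sigma}$, that act on a formal power series $\sum_\delta a_\delta t^\delta$ are to be applied separately to each coefficient $a_\delta$.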
Lemma XXI.2. Let $\omega\ge\max\{l,\frac{1}{M^{(j-1)/2}}\}$, $n\ge3$ and let $\varphi\in\mathcal F_0(n;\Sigma)$ and $f\in\check{\mathcal F}_1(n;\Sigma)$ be particle number conserving functions that are antisymmetric in their $(\xi,s)$ arguments. Then
\begin{align*}
|\varphi|_{1,\Sigma}&\le|\varphi|_{1,\Sigma,\omega}+\mathrm{const}\,n\,\frac{\omega^2}{l^2}\,|\varphi|_{3,\Sigma}\\
|f|^{\sim}_{1,\Sigma}&\le|f|^{\sim}_{1,\Sigma,\omega}+\mathrm{const}\,n^2\,\frac{\omega^2}{l^2}\,|f|^{\sim}_{3,\Sigma}.
\end{align*}
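Proof. By definition of $|\varphi|_{1,\Sigma}$ and the antisymmetry of $\varphi$
\[
|\varphi|_{1,\Sigma}\le|\varphi|_{1,\Sigma,\omega}+\max_{s_1\in\Sigma}\sum_{\substack{s_2,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\le\omega\text{ or }\mathrm{dist}(s_k,a(s_\ell))\le\omega\\ \text{for all }2\le k\ne\ell\le n}}\|\varphi((\cdot,s_1),\dots,(\cdot,s_n))\|_{1,\infty}.
\]
Fix $s_1\in\Sigma$. If $s_2,\dots,s_n\in\Sigma$ are such that for all $2\le k\ne\ell\le n$ one has $\mathrm{dist}(s_k,s_\ell)\le\omega$ or $\mathrm{dist}(s_k,a(s_\ell))\le\omega$ and such that $s_1,\dots,s_n$ are compatible with conservation of momentum for some choice of annihilation/creation indices $(b_1,\dots,b_n)$, then, by Proposition XX.6(ii) with $\Lambda=\frac{\sqrt2}{M^{j-1}}$, $p$ the center of $s_1$ and $k$ the center of $s_2$,
\[
\mathrm{dist}(s_1,s_k)\le\mathrm{const}\,n\omega\quad\text{or}\quad\mathrm{dist}(s_1,a(s_k))\le\mathrm{const}\,n\omega
\]
for $2\le k\le n$. Set
\begin{align*}
\mathrm{Sect}=\bigl\{(s_2,s_3)\in\Sigma^2\bigm|\ &\mathrm{dist}(s_1,s_2)\le\mathrm{const}\,n\omega\text{ or }\mathrm{dist}(s_1,a(s_2))\le\mathrm{const}\,n\omega\\
&\text{and }\mathrm{dist}(s_2,s_3)\le\mathrm{const}\,\omega\text{ or }\mathrm{dist}(s_2,a(s_3))\le\mathrm{const}\,\omega\bigr\}.
\end{align*}
Clearly $\#\mathrm{Sect}\le\mathrm{const}\,n\,\frac{\omega^2}{l^2}$. Consequently
\[
\sum_{\substack{s_2,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\le\omega\text{ or }\mathrm{dist}(s_k,a(s_\ell))\le\omega\\ \text{for all }2\le k\ne\ell\le n}}\|\varphi((\cdot,s_1),\dots,(\cdot,s_n))\|_{1,\infty}
\le\sum_{(s_2,s_3)\in\mathrm{Sect}}\ \sum_{s_4,\dots,s_n\in\Sigma}\|\varphi((\cdot,s_1),\dots,(\cdot,s_n))\|_{1,\infty}
\le\mathrm{const}\,n\,\frac{\omega^2}{l^2}\,|\varphi|_{3,\Sigma}.
\]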
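The count of $\mathrm{Sect}$ is a product of two one-dimensional counts. The sketch below assumes that a stretch of $F$ of length $L$ meets at most $\mathrm{const}(1+L/l)$ sectors of $\Sigma$, and uses $\omega\ge l$ to absorb the additive 1; const changes from line to line.

```latex
\begin{align*}
\#\{s_2\in\Sigma : \mathrm{dist}(s_1,s_2)\le\mathrm{const}\,n\omega
      \ \text{or}\ \mathrm{dist}(s_1,a(s_2))\le\mathrm{const}\,n\omega\}
  &\le \mathrm{const}\,\frac{n\,\omega}{l}, \\
\#\{s_3\in\Sigma : \mathrm{dist}(s_2,s_3)\le\mathrm{const}\,\omega
      \ \text{or}\ \mathrm{dist}(s_2,a(s_3))\le\mathrm{const}\,\omega\}
  &\le \mathrm{const}\,\frac{\omega}{l}, \\
\#\mathrm{Sect}
  &\le \mathrm{const}\,\frac{n\,\omega}{l}\cdot\mathrm{const}\,\frac{\omega}{l}
   = \mathrm{const}\; n\,\frac{\omega^{2}}{l^{2}}.
\end{align*}
```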
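Similarly,
\[
|f|^{\sim}_{1,\Sigma}\le|f|^{\sim}_{1,\Sigma,\omega}+\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\ \sup_{\check\eta\in\check{\mathcal B}}\sum_{\substack{s_1,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\le\omega\text{ or }\mathrm{dist}(s_k,a(s_\ell))\le\omega\\ \text{for all }1\le k\ne\ell\le n}}\ \max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta;(\xi_1,s_1),\dots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta .
\]
Fix $\check\eta=(p_0,p,\sigma,b)\in\check{\mathcal B}$. If $s_1,\dots,s_n\in\Sigma$ are such that for all $1\le k\ne\ell\le n$ one has $\mathrm{dist}(s_k,s_\ell)\le\omega$ or $\mathrm{dist}(s_k,a(s_\ell))\le\omega$ and such that the configuration $(\check\eta;s_1,\dots,s_n)$ is compatible with conservation of momentum, then, by Proposition XX.6(i) with $\Lambda=\frac{\sqrt2}{M^{j-1}}$ and $k$ the center of $s_1$, there is an integer $r$ with $|r|\le n$ such that
\[
\|p-rk+(r-1)a(k)\|\le\mathrm{const}\,n\omega . \tag{XXI.1}
\]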
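The maps $F\to\mathbb R^2$, $k\mapsto rk-(r-1)a(k)$ are embeddings. Therefore there are at most $\mathrm{const}\,n\omega/l$ sectors $s_1$ containing a $k$ such that (XXI.1) holds. Set
\begin{align*}
\mathrm{Sect}=\bigl\{(s_1,s_2)\in\Sigma^2\bigm|\ &s_1\text{ contains a point }k\text{ for which (XXI.1) holds with some }|r|\le n\\
&\text{and }\mathrm{dist}(s_1,s_2)\le\mathrm{const}\,\omega\text{ or }\mathrm{dist}(s_1,a(s_2))\le\mathrm{const}\,\omega\bigr\}.
\end{align*}
Again $\#\mathrm{Sect}\le\mathrm{const}\,n^2\,\frac{\omega^2}{l^2}$. Consequently
\begin{align*}
&\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\sum_{\substack{s_1,\dots,s_n\in\Sigma\\ \mathrm{dist}(s_k,s_\ell)\le\omega\text{ or }\mathrm{dist}(s_k,a(s_\ell))\le\omega\\ \text{for all }1\le k\ne\ell\le n}}\ \max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta;(\xi_1,s_1),\dots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta\\
&\qquad\le\sum_{(s_1,s_2)\in\mathrm{Sect}}\ \sum_{s_3,\dots,s_n\in\Sigma}\ \sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta;(\xi_1,s_1),\dots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta\\
&\qquad\le\mathrm{const}\,n^2\,\frac{\omega^2}{l^2}\,|f|^{\sim}_{3,\Sigma}.
\end{align*}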
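Change of sectorization

To prepare for the proof of Proposition XIX.4, we note

Lemma XXI.3. Let $j>i\ge2$, $\frac{1}{M^{j-3/2}}\le l\le\frac{1}{M^{(j-1)/2}}$ and $\frac{1}{M^{i-3/2}}\le l'\le\frac{1}{M^{(i-1)/2}}$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively. Suppose that $l<l'$. Let $\varphi\in\mathcal F_m(n;\Sigma')$ and $f\in\check{\mathcal F}_m(n;\Sigma')$ be particle number conserving.

(i) For $s_1,\dots,s_n\in\Sigma$
\[
\|\varphi_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))\|_{1,\infty}
\le\mathrm{const}^n\,c_{j-1}\sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\|\varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))\|_{1,\infty}
\]
and for $\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}$ and $s_1,\dots,s_n\in\Sigma$
\begin{align*}
&\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df_\Sigma(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))|\!|\!|_{1,\infty}\ t^\delta\\
&\qquad\le\mathrm{const}^n\,c_{j-1}\sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\ \sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))|\!|\!|_{1,\infty}\ t^\delta .
\end{align*}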
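(ii) If $f$ is antisymmetric in its $(\xi,s)$ arguments
\[
|f_\Sigma|^{\sim}_{p,\Sigma}\le\mathrm{const}^n\,c_{j-1}\,|f|^{\sim}_{p,\Sigma'}\ \sup_{\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}\ \max_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ s_{p-m+1}',\dots,s_n'\in\Sigma'}}\#\mathrm{Cons}(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_{p-m};s_{p-m+1}',\dots,s_n')
\]
where $\mathrm{Cons}(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_{p-m};s_{p-m+1}',\dots,s_n')$ denotes the set of all $(s_{p-m+1},\dots,s_n)\in\Sigma^{m+n-p}$ such that $\tilde s_i\cap\tilde s_i'\ne\emptyset$ for $i=p-m+1,\dots,n$ and the configuration $(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_n)$ is consistent with conservation of momentum in the sense of Definition XX.1.

(iii) If $m=0$, $\omega\ge l'$ and $\varphi$ is antisymmetric, then
\[
|\varphi_\Sigma|_{1,\Sigma,\omega}\le\mathrm{const}^n\,c_{j-1}\,|\varphi|_{1,\Sigma'}\ \max_{s_1\in\Sigma}\ \max_{\substack{s_2',\dots,s_n'\in\Sigma'\\ \mathrm{dist}(s_k',s_\ell')\ge\omega-2l'\text{ and }\mathrm{dist}(s_k',a(s_\ell'))\ge\omega-2\alpha l'\\ \text{for some }2\le k\ne\ell\le n}}\#\mathrm{Cons}(s_1;s_2',\dots,s_n').
\]
Here, $\alpha$ is the supremum of the derivative of the antipodal map $a$ on the Fermi curve $F$. If $m=1$, $\omega\ge l'$ and $f$ is antisymmetric in its $(\xi,s)$ arguments, then
\[
|f_\Sigma|^{\sim}_{1,\Sigma,\omega}\le\mathrm{const}^n\,c_{j-1}\,|f|^{\sim}_{1,\Sigma'}\ \sup_{\check\eta\in\check{\mathcal B}}\ \max_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \mathrm{dist}(s_k',s_\ell')\ge\omega-2l'\text{ and }\mathrm{dist}(s_k',a(s_\ell'))\ge\omega-2\alpha l'\\ \text{for some }1\le k\ne\ell\le n}}\#\mathrm{Cons}(\check\eta;s_1',\dots,s_n').
\]
Proof. (i)
\[
\varphi_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))
=\sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\int d\xi_1'\cdots d\xi_n'\ \varphi(\eta_1,\dots,\eta_m;(\xi_1',s_1'),\dots,(\xi_n',s_n'))\prod_{\ell=1}^n\hat\chi_{s_\ell}(\xi_\ell',\xi_\ell).
\]
Hence, by [9, Lemma II.7] and [8, Lemma XII.3],
\[
\|\varphi_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))\|_{1,\infty}
\le\mathrm{const}^n\,c_{j-1}^n\sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\|\varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))\|_{1,\infty}.
\]
The proof of the second inequality is similar.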
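(ii) By part (i) and Remark XX.2(i)
\begin{align*}
|f_\Sigma|^{\sim}_{p,\Sigma}
&\le\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\ \sup_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{s_{p-m+1},\dots,s_n\in\Sigma}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df_\Sigma(\check\eta_1,\dots,\check\eta_m;(\cdot,s_1),\dots,(\cdot,s_n))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}\sum_{\delta}\ \sup_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{s_{p-m+1},\dots,s_n\in\Sigma}\ \sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\cdot,s_1'),\dots,(\cdot,s_n'))|\!|\!|_{1,\infty}\\
&=\mathrm{const}^n\,c_{j-1}\sum_{\delta}\ \sup_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_1',\dots,s_{p-m}'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset\\ i=1,\dots,p-m}}\ \sum_{s_{p-m+1}',\dots,s_n'\in\Sigma'}\ \sum_{\substack{s_{p-m+1},\dots,s_n\in\Sigma\\ \tilde s_i'\cap\tilde s_i\ne\emptyset\\ i=p-m+1,\dots,n}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\cdot,s_1'),\dots,(\cdot,s_n'))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}\sum_{\delta}\ \sup_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_1',\dots,s_{p-m}'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset\\ i=1,\dots,p-m}}\ \sum_{s_{p-m+1}',\dots,s_n'\in\Sigma'}\#\mathrm{Cons}(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_{p-m};s_{p-m+1}',\dots,s_n')\\
&\hskip4cm\times\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\cdot,s_1'),\dots,(\cdot,s_n'))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}\,|f|^{\sim}_{p,\Sigma'}\ \sup_{\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}\ \max_{\substack{s_1,\dots,s_{p-m}\in\Sigma\\ s_{p-m+1}',\dots,s_n'\in\Sigma'}}\#\mathrm{Cons}(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_{p-m};s_{p-m+1}',\dots,s_n')
\end{align*}
since, for $i=1,\dots,p-m$, there are at most three sectors $s_i'\in\Sigma'$ with $\tilde s_i'\cap\tilde s_i\ne\emptyset$.

(iii) If $s_k,s_\ell\in\Sigma$ with $\mathrm{dist}(s_k,s_\ell)\ge\omega$, $\mathrm{dist}(s_k,a(s_\ell))\ge\omega$ and $s_k',s_\ell'\in\Sigma'$ with $\tilde s_k\cap\tilde s_k'\ne\emptyset$, $\tilde s_\ell\cap\tilde s_\ell'\ne\emptyset$ then $\mathrm{dist}(s_k',s_\ell')\ge\omega-2l'$ and $\mathrm{dist}(s_k',a(s_\ell'))\ge\omega-2\alpha l'$. Using this observation, the proof of (iii) is analogous to the proof of (ii).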
Lemma XXI.4. Let $j>i\ge2$, $\frac{1}{M^{j-3/2}}\le l\le\frac{1}{M^{(j-1)/2}}$ and $\frac{1}{M^{i-3/2}}\le l'\le\frac{1}{M^{(i-1)/2}}$ with $l<\frac14l'$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length $l$ at scale $j$ and length $l'$ at scale $i$, respectively.

(i) There is a constant const independent of $M$ such that for every $s'\in\Sigma'$
\[
\#\{s\in\Sigma \mid \tilde s\cap\tilde s'\ne\emptyset\}\le\mathrm{const}\,\frac{l'}{l}.
\]
(ii) Let $m\ge0$, $p\ge m$, $n\ge p-m+1$, $\check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}$, $s_1,\dots,s_{p-m}\in\Sigma$ and $s_{p-m+1}',\dots,s_n'\in\Sigma'$. Then
\[
\#\mathrm{Cons}(\check\eta_1,\dots,\check\eta_m;s_1,\dots,s_{p-m};s_{p-m+1}',\dots,s_n')\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n+m-p-1}.
\]
(iii) Let $\omega'\ge4l'$, and let $s_1\in\Sigma$ and $s_2',\dots,s_n'\in\Sigma'$ be such that $\mathrm{dist}(s_k',s_\ell')\ge\omega'$ and $\mathrm{dist}(s_k',a(s_\ell'))\ge\omega'$ for some $2\le k\ne\ell\le n$. Then
\[
\#\mathrm{Cons}(s_1;s_2',\dots,s_n')\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n-3}\Bigl(1+\frac{1}{M^{j-1}l\omega'}\Bigr).
\]
(iv) Let $\omega'\ge4l'$, $\check\eta=(q_0,q,\sigma,a)\in\check{\mathcal B}$ and $s_1',\dots,s_n'\in\Sigma'$ be such that $\mathrm{dist}(s_k',s_\ell')\ge\omega'$ and $\mathrm{dist}(s_k',a(s_\ell'))\ge\omega'$ for some $1\le k\ne\ell\le n$. Then
\[
\#\mathrm{Cons}(\check\eta;s_1',\dots,s_n')\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n-2}\Bigl(1+\frac{1}{M^{j-1}l\omega'}\Bigr).
\]
Proof. (i) is trivial.

(ii) By part (i), there are at most $\mathrm{const}^n(\frac{l'}{l})^{n+m-p-1}$ $(n+m-p-1)$-tuples $(s_{p-m+1},\dots,s_{n-1})$ of sectors in $\Sigma$ such that $\tilde s_i\cap\tilde s_i'\ne\emptyset$ for $i=p-m+1,\dots,n-1$. Given such an $(n+m-p-1)$-tuple $(s_{p-m+1},\dots,s_{n-1})$ and a particle number preserving sequence $(a_1,\dots,a_n)$ of creation–annihilation indices, the set
\[
\bigl\{-(-1)^{a_n}\bigl(\check\eta_1+\dots+\check\eta_m+(-1)^{a_1}k_1+\dots+(-1)^{a_{n-1}}k_{n-1}\bigr)\bigm|k_i\in\tilde s_i\text{ for }i=1,\dots,n-1\bigr\}
\]
has diameter at most $\mathrm{const}(n-1)l$ and therefore meets at most $\mathrm{const}(n-1)$ extended sectors of $\Sigma$. This shows that there are at most $\mathrm{const}\,n$ sectors $s_n\in\Sigma$ such that $(s_1,\dots,s_n)$ is consistent with conservation of momentum.

(iii) Let $(a_1,\dots,a_n)$ be a particle number preserving sequence of creation–annihilation indices. For $i=1,\dots,n-1$ let $I_i=\{k\in F \mid \mathrm{dist}(k,s_{i+1}')\le l\}$. We apply the first inequality of Proposition XX.11 with $\delta=l'+2l$, $\Lambda=\frac{\sqrt2}{M^{j-1}}$, $\Gamma$ the set of centers of the intervals $s\cap F$, $s\in\Sigma$, and $p$ the center of $s_1\cap F$. It follows that
\[
\#\mathrm{Cons}(s_1;s_2',\dots,s_n')\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n-3}\Bigl(1+\frac{1}{M^{j-1}l\omega'}\Bigr).
\]
(iv) is similar to (iii), using the second inequality of Proposition XX.11 instead.
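The substitution into Proposition XX.11 used in part (iii) can be made explicit. In the sketch below $N$ is a label introduced here for the integer called $n$ in Proposition XX.11, so that the $n-1$ intervals $I_1,\dots,I_{n-1}$ correspond to $2N-1=n-1$. It is assumed that the $\omega$ of that proposition is bounded below by a constant multiple of $\omega'$, and that the factor $N^2$ and the number of admissible index sequences $(a_1,\dots,a_n)$ are absorbed into $\mathrm{const}^n$.

```latex
\begin{align*}
2N-1 &= n-1 \quad\Longrightarrow\quad 2N-3 = n-3, \\
\frac{\delta}{l}+1 &= \frac{l'+2l}{l}+1 = \frac{l'}{l}+3
   \le 4\,\frac{l'}{l} \qquad (l<l'), \\
\frac{\Lambda}{l\,\omega} &= \frac{\sqrt2}{M^{j-1}\,l\,\omega}
   \le \frac{\mathrm{const}}{M^{j-1}\,l\,\omega'}, \\
\#\mathrm{Cons}(s_1;s_2',\dots,s_n')
  &\le \mathrm{const}\,N^{2}\Bigl(4\,\frac{l'}{l}\Bigr)^{n-3}
       \Bigl(1+\frac{\mathrm{const}}{M^{j-1}\,l\,\omega'}\Bigr)
   \le \mathrm{const}^{\,n}\Bigl(\frac{l'}{l}\Bigr)^{n-3}
       \Bigl(1+\frac{1}{M^{j-1}\,l\,\omega'}\Bigr).
\end{align*}
```

For part (iv) the same substitution with $n$ intervals gives $2N-3=n-2$, which accounts for the different exponent.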
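Proof of Proposition XIX.4. (i) As $m\ne0$, by Lemmas XXI.3(i) and XXI.4(i)
\begin{align*}
|\varphi_\Sigma|_{1,\Sigma}&=\sum_{s_1,\dots,s_n\in\Sigma}\|\varphi_\Sigma(\eta_1,\dots,\eta_m;(\xi_1,s_1),\dots,(\xi_n,s_n))\|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}\sum_{s_1,\dots,s_n\in\Sigma}\ \sum_{\substack{s_1',\dots,s_n'\in\Sigma'\\ \tilde s_i'\cap\tilde s_i\ne\emptyset}}\|\varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))\|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^n\sum_{s_1',\dots,s_n'\in\Sigma'}\|\varphi(\eta_1,\dots,\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))\|_{1,\infty}\\
&=\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^n|\varphi|_{1,\Sigma'}.
\end{align*}
(ii) By Lemmas XXI.3(ii) and XXI.4(ii)
\[
|f_\Sigma|^{\sim}_{p,\Sigma}\le\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^{n+m-p-1}|f|^{\sim}_{p,\Sigma'}. \tag{XXI.2}
\]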
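Now assume that $l\ge\frac{1}{M^{\frac23(j-1)}}$, $l'\le\frac16\sqrt l$ and $n\ge3$. Observe that $|f_\Sigma|^{\sim}_{1,\Sigma}$ vanishes for $m\ge2$, so it suffices to consider $m=0,1$. Set $\omega=\alpha\sqrt l$. By Lemma XXI.2
\[
|f_\Sigma|^{\sim}_{1,\Sigma}\le|f_\Sigma|^{\sim}_{1,\Sigma,\omega}+\mathrm{const}\,n^2\,\frac{\omega^2}{l^2}\,|f_\Sigma|^{\sim}_{3,\Sigma}.
\]
By (XXI.2),
\begin{align*}
n^2\,\frac{\omega^2}{l^2}\,|f_\Sigma|^{\sim}_{3,\Sigma}
&\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n+m-4}c_{j-1}\,\frac{\omega^2}{l^2}\,|f|^{\sim}_{3,\Sigma'}\\
&\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n+m-4}c_{j-1}\,\frac1l\,|f|^{\sim}_{3,\Sigma'}\\
&=\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{n+m-3}c_{j-1}\,\frac{1}{l'}\,|f|^{\sim}_{3,\Sigma'}.
\end{align*}
If $m=0$, by Lemmas XXI.3(iii) and XXI.4(iii), with $\omega'=\omega-2\alpha l'$,
\begin{align*}
|f_\Sigma|^{\sim}_{1,\Sigma,\omega}&\le\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^{n+m-3}\Bigl(1+\frac{1}{M^{j-1}l(\omega-2\alpha l')}\Bigr)|f|^{\sim}_{1,\Sigma'}\\
&\le\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^{n+m-3}|f|^{\sim}_{1,\Sigma'}
\end{align*}
since $M^{j-1}l(\omega-2\alpha l')\ge M^{j-1}l\bigl(\alpha\sqrt l-\frac\alpha3\sqrt l\bigr)=\frac23\,\alpha\,M^{j-1}l^{3/2}\ge\frac23\,\alpha$. Similarly one sees, using Lemma XXI.4(iv), that also in the case $m=1$
\[
|f_\Sigma|^{\sim}_{1,\Sigma,\omega}\le\mathrm{const}^n\,c_{j-1}\Bigl(\frac{l'}{l}\Bigr)^{n+m-3}|f|^{\sim}_{1,\Sigma'}.
\]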
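The conversion of the factor $\omega^2/l^2$ into $1/l'$ in part (ii) is a one-line computation. The sketch below assumes the choice $\omega=\alpha\sqrt l$ and uses (XXI.2) with $p=3$; the factors $n^2$ and $\alpha^2$ are absorbed into $\mathrm{const}^n$ in the text.

```latex
\begin{align*}
\frac{\omega^{2}}{l^{2}} &= \frac{\alpha^{2}\,l}{l^{2}} = \frac{\alpha^{2}}{l}, \\
n^{2}\,\frac{\omega^{2}}{l^{2}}\,|f_\Sigma|^{\sim}_{3,\Sigma}
  &\le \mathrm{const}^{\,n}\,c_{j-1}
       \Bigl(\frac{l'}{l}\Bigr)^{n+m-4}\frac{\alpha^{2}}{l}\,
       |f|^{\sim}_{3,\Sigma'} \\
  &= \mathrm{const}^{\,n}\,c_{j-1}
       \Bigl(\frac{l'}{l}\Bigr)^{n+m-3}\frac{\alpha^{2}}{l'}\,
       |f|^{\sim}_{3,\Sigma'} .
\end{align*}
```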
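(iii) Write
\[
f_{\Sigma'}(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))=\sum_{s_1,\dots,s_n\in\Sigma}g(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n');s_1,\dots,s_n)
\]
with
\[
g(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n');s_1,\dots,s_n)=\int d\xi_1'\cdots d\xi_n'\ f(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))\prod_{\ell=1}^n\hat\chi_{s_\ell'}(\xi_\ell',\xi_\ell).
\]
Then
\begin{align*}
|f_{\Sigma'}|^{\sim}_{p,\Sigma'}
&=\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i'\in\Sigma'\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df_{\Sigma'}(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n'))|\!|\!|_{1,\infty}\\
&\le\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i'\in\Sigma'\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\ \sum_{\substack{s_1,\dots,s_n\in\Sigma\\ s_i\cap s_i'\ne\emptyset\\ \text{for }1\le i\le n}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Dg(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n');s_1,\dots,s_n)|\!|\!|_{1,\infty}.
\end{align*}
For each fixed $\check\eta_1,\dots,\check\eta_m$, $s_1',\dots,s_n'$, $s_1,\dots,s_n$,
\begin{align*}
&\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Dg(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n');s_1,\dots,s_n)|\!|\!|_{1,\infty}\ t^\delta\\
&\qquad\le\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\ t^\delta\ \prod_{\ell=1}^n\|\hat\chi_{s_\ell'}\|_{1,\infty}
\end{align*}
as in [9, Lemma II.7]. Hence, by [8, Lemma XII.3] and [9, Example A.3],
\begin{align*}
&\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Dg(\check\eta_1,\dots,\check\eta_m;(\xi_1,s_1'),\dots,(\xi_n,s_n');s_1,\dots,s_n)|\!|\!|_{1,\infty}\ t^\delta\\
&\qquad\le\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\ t^\delta\ \prod_{\ell=1}^n\mathrm{const}\,c_{i-1}\\
&\qquad\le\mathrm{const}^n\,c_{i-1}\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\frac{1}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\ t^\delta
\end{align*}
uniformly in $s_1',\dots,s_n'$, $s_1,\dots,s_n$, $\check\eta_1,\dots,\check\eta_m$. So
\begin{align*}
|f_{\Sigma'}|^{\sim}_{p,\Sigma'}
&\le\mathrm{const}^n\,c_{i-1}\sum_{\delta}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i'\in\Sigma'\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\ \sum_{\substack{s_1,\dots,s_n\in\Sigma\\ s_i\cap s_i'\ne\emptyset\\ \text{for }1\le i\le n}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{i-1}\sum_{\delta}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i\in\Sigma\\ s_i\cap s_i'\ne\emptyset\\ \text{for }i=i_1,\dots,i_{p-m}}}\ \sum_{\substack{s_i\in\Sigma\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\ \sum_{\substack{s_i'\in\Sigma'\\ s_i\cap s_i'\ne\emptyset\\ \text{for }i\ne i_1,\dots,i_{p-m}}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\,c_{i-1}\sum_{\delta}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i\in\Sigma\\ s_i\cap s_i'\ne\emptyset\\ \text{for }i=i_1,\dots,i_{p-m}}}\ \sum_{\substack{s_i\in\Sigma\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}
\end{align*}
since the set $\{s_i'\in\Sigma',\ i\ne i_1,\dots,i_{p-m} \mid s_i\cap s_i'\ne\emptyset\text{ for }i\ne i_1,\dots,i_{p-m}\}$ contains at most $3^n$ terms. Finally, applying
\[
\sup_{s_{i_1}',\dots,s_{i_{p-m}}'\in\Sigma'}\ \sum_{\substack{s_i\in\Sigma\\ s_i\cap s_i'\ne\emptyset\\ \text{for }i=i_1,\dots,i_{p-m}}}h(s_{i_1},\dots,s_{i_{p-m}})
\le\mathrm{const}\Bigl(\frac{l'}{l}\Bigr)^{p-m}\sup_{s_{i_1},\dots,s_{i_{p-m}}\in\Sigma}h(s_{i_1},\dots,s_{i_{p-m}})
\]
yields
\begin{align*}
|f_{\Sigma'}|^{\sim}_{p,\Sigma'}
&\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{p-m}c_{i-1}\sum_{\delta\in\mathbb N_0\times\mathbb N_0^2}\ \sup_{\substack{1\le i_1<\dots<i_{p-m}\le n\\ s_{i_1},\dots,s_{i_{p-m}}\in\Sigma\\ \check\eta_1,\dots,\check\eta_m\in\check{\mathcal B}_m}}\ \sum_{\substack{s_i\in\Sigma\text{ for}\\ i\ne i_1,\dots,i_{p-m}}}\frac{t^\delta}{\delta!}\max_{\substack{D\text{ dd-operator}\\ \text{with }\delta(D)=\delta}}|\!|\!|Df(\check\eta_1,\dots,\check\eta_m;(\xi_1',s_1),\dots,(\xi_n',s_n))|\!|\!|_{1,\infty}\\
&\le\mathrm{const}^n\Bigl(\frac{l'}{l}\Bigr)^{p-m}c_{i-1}\,|f|^{\sim}_{p,\Sigma}.
\end{align*}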
XXII. Sector Counting for Particle–Particle Ladders

In this section we prove that, when the Fermi surface $F$ is strongly asymmetric in the sense of Definition XVIII.3, particle–particle ladders obey bounds that are stronger than those given by standard power counting. The precise formulation of this result is given in Theorem XXII.7. It bounds the $|\cdot|_{3,\Sigma}$ norm of a particle–particle ladder $L_\ell(f;C,D)$ with $\ell+1$ vertices $f$ and propagators determined by $C$ and $D$. Such a ladder looks like

[Figure: a particle–particle ladder whose vertices $f$ are joined by rungs.]

Each line of this ladder is either $C$ or $D$ with at least one $C$ in each of the $\ell$ rungs.$^{a}$ The detailed definition of $L_\ell(f;C,D)$ is given in Definition XIV.1. When applied to four-legged kernels, the $|\cdot|_{3,\Sigma}$ norm measures roughly the supremum in momentum space of the kernel and its derivatives. Naive power counting, as in Appendix D, leads to a bound on $|L_\ell(f;C,D)|^{\sim}_{3,\Sigma}$ of order $\bigl(|f|^{\sim}_{3,\Sigma}\bigr)^{\ell+1}$. In this section, we use sector counting to implement the geometric argument outlined in [1, Sec. II, Subsec. 5], exploiting the asymmetry of the Fermi surface to improve the bound to one of order $(l^{1/n_0})^{\ell}\bigl(|f|^{\sim}_{3,\Sigma}\bigr)^{\ell+1}$. The main sector counting result is

Proposition XXII.1. Assume that the Fermi surface $F$ is strongly asymmetric. There is a constant const independent of $M$ such that for all sectorizations $\Sigma$ of scale $j\ge2$ and length $l\ge\frac{1}{M^{j-1}}$ and all $s_1',s_2'\in\Sigma$ and all $k_1,k_2\in\mathbb R\times\mathbb R^2$
\begin{align*}
\#\{(s_1,s_2)\in\Sigma\times\Sigma \mid (\tilde s_1+\tilde s_2)\cap(\tilde s_1'+\tilde s_2')\ne\emptyset\}&\le\mathrm{const}\,\frac{l^{1/n_0}}{l}\\
\#\{(s_1,s_2)\in\Sigma\times\Sigma \mid (\tilde s_1+\tilde s_2)\cap(k_1+\tilde s_1')\ne\emptyset\}&\le\mathrm{const}\,\frac{l^{1/n_0}}{l}\\
\#\{(s_1,s_2)\in\Sigma\times\Sigma \mid k_1+k_2\in\tilde s_1+\tilde s_2\}&\le\mathrm{const}\,\frac{l^{1/n_0}}{l}.
\end{align*}

$^{a}$ The reader should think of $C$ as a “hard” propagator and $D$ as a “soft” propagator arising from Wick ordering.
December 15, 2003 16:52 WSPC/148-RMP
1158
00180
J. Feldman, H. Knörrer & E. Trubowitz
The proof of this proposition, which is given after Proposition XXII.4, is based on the following three lemmas.

Lemma XXII.2. Assume that F is strongly asymmetric. There exists a constant const such that for all $\varepsilon > 0$ and all disks D in $\mathbb{R}^2$ of radius $\varepsilon$
$$\mathrm{length}\,\{k \in F \mid k + a(k) \in D\} \le \mathrm{const}\ \varepsilon^{1/(n_0-1)}$$
where $n_0$ is the constant of Definition XVIII.3 and a(k) is the antipode of k.

Proof. Since F is compact, it suffices to show that each point $p \in F$ has a neighborhood U in F for which there exist $1 \le n \le n_0 - 1$ and a constant const such that, for all $\varepsilon > 0$ and all disks D in $\mathbb{R}^2$ of radius $\varepsilon$,
$$\mathrm{length}\,\{k \in U \mid k + a(k) \in D\} \le \mathrm{const}\ \varepsilon^{1/n} .$$
Fix $p \in F$. Without loss of generality, we may assume that the oriented unit tangent vector to F at p is (1, 0) and that the unit inward pointing normal vector to F at p is (0, 1). Let $\varphi(t) = \varphi_p(t)$, $\bar\varphi(t) = \varphi_{a(p)}(t)$, where $\varphi_p$ is the parameterizing map of Definition XVIII.3. Precisely, $t \mapsto k(t) = p + (t, \varphi(t))$ is a parameterization of F near p and $t \mapsto \bar k(t) = a(p) - (t, \bar\varphi(t))$ is a parameterization of F near a(p). By strict convexity, the slopes $\dot\varphi(t)$ and $\dot{\bar\varphi}(t)$ for the Fermi curve at $k(t)$ and $\bar k(t)$, respectively, are strictly increasing with t. Hence there is a strictly increasing function $\bar t(t)$ such that
$$\dot{\bar\varphi}(\bar t(t)) = \dot\varphi(t) \qquad \text{(XXII.1)}$$
and hence
$$\bar k(\bar t(t)) = a(k(t))$$
so that
$$k(t) + a(k(t)) = p + a(p) + \big(t - \bar t(t),\ \varphi(t) - \bar\varphi(\bar t(t))\big) .$$
By construction, $\varphi(0) = \bar\varphi(0) = \dot\varphi(0) = \dot{\bar\varphi}(0) = 0$. Since F is strongly asymmetric, there is a minimal $1 \le n \le n_0 - 1$ such that $\bar\varphi^{(n+1)}(0) \neq \varphi^{(n+1)}(0)$. We may assume, without loss of generality, that
$$|\bar\varphi^{(n+1)}(0)| < |\varphi^{(n+1)}(0)| . \qquad \text{(XXII.2)}$$
Since the curvature of F is assumed to be bounded away from zero, the second derivatives of both $\varphi$ and $\bar\varphi$ are nonzero. Thus
$$\dot\varphi(0) = \dot{\bar\varphi}(0) = 0, \qquad \ddot\varphi(0),\ \ddot{\bar\varphi}(0) \neq 0,$$
$$\varphi^{(i)}(0) = \bar\varphi^{(i)}(0) \ \text{ for } 1 \le i \le n, \qquad \varphi^{(n+1)}(0) \neq \bar\varphi^{(n+1)}(0) .$$
Using (XXII.1) we conclude that $\bar t(t)$ is $C^n$ and obeys
$$\bar t^{(i)}(0) = \begin{cases} 1 & \text{if } i = 1 \\ 0 & \text{if } 1 < i \le n-1 \\ \tilde b \neq 0 & \text{if } i = n \end{cases} \qquad \text{if } n > 1, \qquad \text{and} \qquad \dot{\bar t}(0) = \tilde b \neq 1 \quad \text{if } n = 1 .$$
Consequently, there is a neighborhood $U'$ of 0 and a $b > 0$ such that for all $t \in U'$
$$\Big| \frac{d^n}{dt^n} \big(t - \bar t(t)\big) \Big| \ge b . \qquad \text{(XXII.3)}$$
Set $U = \{p + (t, \varphi(t)) \mid t \in U'\}$. If D is a disk of radius $\varepsilon$, then its projection to the x-axis is an interval J of length $2\varepsilon$ and
$$\mathrm{length}\,\{k \in U \mid k + a(k) \in D\} \le \mathrm{const}\ \mathrm{length}\,\{t \in U' \mid x_0 + t - \bar t(t) \in J\}$$
where $x_0$ is the x-component of $p + a(p)$. Therefore, by (XXII.3), this lemma follows from Lemma XXII.3 below.

Lemma XXII.3. Let b be a strictly positive real number and n be a strictly positive integer. Let $I \subset \mathbb{R}$ be an interval (not necessarily compact) and f a $C^n$ function on I obeying
$$|f^{(n)}(x)| \ge b \qquad \text{for all } x \in I .$$
Then for all $\varepsilon > 0$ and all intervals J of length $2\varepsilon$,
$$\mathrm{length}\,\{x \in I \mid f(x) \in J\} \le 2^{n+1} \Big(\frac{\varepsilon}{b}\Big)^{1/n} .$$
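Proof. Set $\alpha = (\varepsilon/b)^{1/n}$ and $g(x) = f(x) - y_0$, where $y_0$ is the midpoint of J. We must show
$$|g^{(n)}(x)| \ge \frac{\varepsilon}{\alpha^n} \ \text{ for all } x \in I \ \Longrightarrow\ \mathrm{length}\,\{x \in I \mid |g(x)| \le \varepsilon\} \le 2^{n+1} \alpha .$$
Define $c_n$ inductively by $c_1 = 2$ and $c_n = 2 + 2c_{n-1}$. Because $d_n = 2^{-n} c_n$ obeys $d_1 = 1$ and $d_n = 2^{-n+1} + d_{n-1}$ we have $d_n \le 2$ and hence $c_n \le 2^{n+1}$. We shall prove
$$|g^{(n)}(x)| \ge \frac{\varepsilon}{\alpha^n} \ \text{ for all } x \in I \ \Longrightarrow\ \mathrm{length}\,\{x \in I \mid |g(x)| \le \varepsilon\} \le c_n \alpha$$
by induction on n. Suppose that n = 1 and let x and y be any two points in $\{x \in I \mid |g(x)| \le \varepsilon\}$. Then
$$|x - y| = \frac{|x - y|}{|g(x) - g(y)|}\, |g(x) - g(y)| = \frac{|g(x) - g(y)|}{|g'(\zeta)|} \le \frac{2\varepsilon}{|g'(\zeta)|}$$
for some $\zeta \in I$. As $|g'(\zeta)| \ge \frac{\varepsilon}{\alpha}$ we have $|x - y| \le 2\alpha$. Thus $\{x \in I \mid |g(x)| \le \varepsilon\}$ is contained in an interval of length at most $2\alpha$, as desired.

Now suppose that $|g^{(n)}(x)| \ge \frac{\varepsilon}{\alpha^n}$ on I and that the induction hypothesis is satisfied for $n - 1$. As in the last paragraph, the set $\{x \in I \mid |g^{(n-1)}(x)| \le \frac{\varepsilon}{\alpha^{n-1}}\}$ is contained in a subinterval $I_0$ of I of length at most $2\alpha$. Then $I \setminus I_0$ is the union of at most two other intervals $I_+$, $I_-$ on which $|g^{(n-1)}(x)| \ge \frac{\varepsilon}{\alpha^{n-1}}$. By the inductive hypothesis
$$\mathrm{length}\,\{x \in I \mid |g(x)| \le \varepsilon\} \le \mathrm{length}(I_0) + \sum_{i = \pm} \mathrm{length}\,\{x \in I_i \mid |g(x)| \le \varepsilon\} \le 2\alpha + 2c_{n-1}\alpha = c_n \alpha .$$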
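As a numerical sanity check of Lemma XXII.3, and not as part of the argument, the following Python sketch evaluates the recursion $c_1 = 2$, $c_n = 2 + 2c_{n-1}$ and compares a grid estimate of the sublevel-set length with the bound $2^{n+1}(\varepsilon/b)^{1/n}$. The test function $f(x) = b\,x^n/n!$, the interval $[-2, 2]$, the sample count and the helper names `c`, `sublevel_length` and `check_bound` are illustrative assumptions that do not appear in the text.

```python
import math


def c(n):
    """Constants from the proof: c_1 = 2 and c_n = 2 + 2*c_{n-1}."""
    value = 2
    for _ in range(n - 1):
        value = 2 + 2 * value
    return value


def sublevel_length(f, lo, hi, y0, eps, samples=200_001):
    """Grid estimate of length{x in [lo, hi] : |f(x) - y0| <= eps}."""
    step = (hi - lo) / (samples - 1)
    hits = sum(1 for i in range(samples) if abs(f(lo + i * step) - y0) <= eps)
    return hits * step


def check_bound(n, b=1.0, eps=1e-3):
    """Return (estimated length, bound of Lemma XXII.3) for f(x) = b*x**n/n!."""
    # Assumed test function: its n-th derivative is identically b.
    f = lambda x: b * x**n / math.factorial(n)
    measured = sublevel_length(f, -2.0, 2.0, 0.0, eps)
    bound = 2 ** (n + 1) * (eps / b) ** (1.0 / n)
    return measured, bound


for n in range(1, 6):
    measured, bound = check_bound(n)
    print(n, c(n), round(measured, 4), round(bound, 4))
```

For this particular family the estimated length stays well below the bound, which is consistent with the lemma giving a uniform constant rather than a sharp one.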
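Proposition XXII.4. Assume that F is strongly asymmetric. Let $\Gamma$ be an $\varepsilon$-separated set in F and R a square of side length $8\varepsilon$ in $\mathbb{R}^2$. Then
$$\#\{(\gamma_1,\gamma_2) \in \Gamma \times \Gamma \mid \gamma_1 + \gamma_2 \in R\} \le \mathrm{const}\, \frac{\varepsilon^{1/n_0}}{\varepsilon}$$
with const depending only on the geometry of F. Here $n_0$ is the constant of Definition XVIII.3.

Proof. Let $\omega_1 = \varepsilon^{1 - \frac{1}{n_0}}$ and
$$X_1 = \{(\gamma_1,\gamma_2) \in \Gamma \times \Gamma \mid \gamma_1 + \gamma_2 \in R,\ \min\{d(\gamma_1,\gamma_2),\, d(a(\gamma_1),\gamma_2)\} \ge \omega_1\}$$
$$X_2 = \{(\gamma_1,\gamma_2) \in \Gamma \times \Gamma \mid \gamma_1 + \gamma_2 \in R,\ d(\gamma_1,\gamma_2) \le \omega_1\}$$
$$X_3 = \{(\gamma_1,\gamma_2) \in \Gamma \times \Gamma \mid \gamma_1 + \gamma_2 \in R,\ d(a(\gamma_1),\gamma_2) \le \omega_1\} .$$
By Lemma XX.8, part (iv), with an arbitrary point p and $\omega_2$ large enough,
$$\# X_1 \le \frac{\mathrm{const}}{\omega_1} = \mathrm{const}\, \frac{\varepsilon^{1/n_0}}{\varepsilon} .$$
Next observe that, for any given $\gamma_1 \in \Gamma$, the length of $\{\gamma_1 + k \mid k \in F\} \cap R$ is bounded by $\mathrm{const}\ \varepsilon$, so that
$$\#\{\gamma_2 \in \Gamma \mid \gamma_1 + \gamma_2 \in R\} \le \mathrm{const} . \qquad \text{(XXII.4)}$$
If, for some $\gamma_1 \in \Gamma$, there exists $\gamma_2 \in \Gamma$ such that $(\gamma_1,\gamma_2) \in X_2$, then $2\gamma_1 = \gamma_1 + \gamma_2 + (\gamma_1 - \gamma_2)$ lies in the disk D of radius $8\varepsilon + \omega_1$ centered at the center of R. Since $\mathrm{length}\,\{k \in F \mid 2k \in D\} \le \mathrm{const}\ \omega_1$, there are at most $\mathrm{const}\, \frac{\omega_1}{\varepsilon}$ choices of $\gamma_1 \in \Gamma$ with $2\gamma_1 \in D$. By (XXII.4) this implies that
$$\# X_2 \le \mathrm{const}\, \frac{\omega_1}{\varepsilon} = \mathrm{const}\ \varepsilon^{-1/n_0} \le \mathrm{const}\, \frac{\varepsilon^{1/n_0}}{\varepsilon}$$
since $n_0 \ge 2$. If, for some $\gamma_1 \in \Gamma$, there exists $\gamma_2 \in \Gamma$ such that $(\gamma_1,\gamma_2) \in X_3$, then $\gamma_1 + a(\gamma_1) \in D$. By Lemma XXII.2
$$\mathrm{length}\,\{k \in F \mid k + a(k) \in D\} \le \mathrm{const}\ \omega_1^{\frac{1}{n_0 - 1}} .$$
Consequently
$$\# X_3 \le \mathrm{const}\, \frac{\omega_1^{\frac{1}{n_0-1}}}{\varepsilon} = \mathrm{const}\, \frac{\varepsilon^{1/n_0}}{\varepsilon} .$$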
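The exponent bookkeeping in this proof can be checked mechanically. The Python sketch below, offered only as an illustration, tracks the power of $\varepsilon$ in each of the three bounds for the choice $\omega_1 = \varepsilon^{1 - 1/n_0}$, using exact rational arithmetic. The function name `exponents` and the dictionary keys are hypothetical labels introduced here, and constants are ignored throughout.

```python
from fractions import Fraction


def exponents(n0):
    """Powers of epsilon in the bounds on #X1, #X2, #X3, for omega_1 = eps**(1 - 1/n0)."""
    omega = 1 - Fraction(1, n0)        # omega_1 = eps**omega
    target = Fraction(1, n0) - 1       # eps**(1/n0) / eps
    x1 = -omega                        # 1 / omega_1
    x2 = omega - 1                     # omega_1 / eps
    x3 = omega / (n0 - 1) - 1          # omega_1**(1/(n0 - 1)) / eps
    return {"target": target, "X1": x1, "X2": x2, "X3": x3}


for n0 in range(2, 7):
    e = exponents(n0)
    # For 0 < eps <= 1, a larger exponent means a smaller quantity.
    print(n0, e["X1"] == e["target"], e["X2"] >= e["target"], e["X3"] == e["target"])
```

The comparison for $X_2$ is where the hypothesis $n_0 \ge 2$ enters: $-1/n_0 \ge 1/n_0 - 1$ is equivalent to $n_0 \ge 2$.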
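Proof of Proposition XXII.1. For each sector $s \in \Sigma_j$ let $\gamma_s$ be the center of $s \cap F$. Then $\Gamma = \{\gamma_s \mid s \in \Sigma_j\}$ is a $\frac{3}{4} l_j$-separated set. Clearly $\tilde s'_1 + \tilde s'_2$ is contained in the disk of radius $\mathrm{const}'\, l_j$ around $\gamma_{s'_1} + \gamma_{s'_2}$. Therefore $(\tilde s_1 + \tilde s_2) \cap (\tilde s'_1 + \tilde s'_2) \neq \emptyset$ only if $\gamma_{s_1} + \gamma_{s_2}$ is contained in the disk of radius $2\, \mathrm{const}'\, l_j$ around $\gamma_{s'_1} + \gamma_{s'_2}$. So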
the first part of the proposition follows directly from Proposition XXII.4, applied $\big(\frac{4\, \mathrm{const}'}{8} \times \frac{4}{3}\big)^2$ times. The other two parts are similar.

Definition XXII.5. (i) The creation/annihilation index of $z \in \check B \,\dot\cup\, (B \times \Sigma)$ is
$$b(z) = \begin{cases} b & \text{if } z = (k, \sigma, b) \in \check B \\ b & \text{if } z = (x, \sigma, b, s) \in B \times \Sigma . \end{cases}$$
(ii) Let $f \in \check F_{4;\Sigma}$. We say that f is of particle–particle type if $f(z_1, z_2, z_3, z_4) = 0$ unless
$$b(z_1) = b(z_2) = 0, \qquad b(z_3) = b(z_4) = 1 .$$
Lemma XXII.6. Let $f \in \check F_{4;\Sigma}$ be of particle–particle type. Then,
$$|f|^{\sim}_{ch,\Sigma} \le \mathrm{const}\, \frac{l^{1/n_0}}{l}\, |f|^{\sim}_{3,\Sigma}$$
with the channel norm $|\cdot|^{\sim}_{ch,\Sigma}$ of [8, Definition D.1].
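Proof. It suffices to consider $f \in \check F_r(4 - r, \Sigma)$ with $r \le 2$. As in the proof of [8, Lemma D.2], set
$$F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r}) = \sum_{\delta \in \mathbb{N}_0 \times \mathbb{N}_0^2} \frac{1}{\delta!}\ \max_{\substack{D \text{ dd-operator} \\ \text{with } \delta(D) = \delta}} |||Df(\check\eta_1,\dots,\check\eta_r; (\xi_1,s_1),\dots,(\xi_{4-r},s_{4-r}))|||_{1,\infty}\ t^\delta .$$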
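Then, by Proposition XXII.1,
$$|f|^{\sim}_{ch,\Sigma} = \sup_{\substack{\check\eta_1,\dots,\check\eta_r \in \check B \\ s_{3-r}, s_{4-r} \in \Sigma}}\ \sum_{s_1,\dots,s_{2-r} \in \Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})$$
$$\le \mathrm{const}\, \frac{l^{1/n_0}}{l}\ \sup_{\substack{\check\eta_1,\dots,\check\eta_r \in \check B \\ s_1,\dots,s_{3-r} \in \Sigma}}\ \sum_{s_{4-r} \in \Sigma} F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})$$
$$\le \mathrm{const}\, \frac{l^{1/n_0}}{l}\ \sup_{1 \le i_1 < \cdots}\ \sum F(\check\eta_1,\dots,\check\eta_r; s_1,\dots,s_{4-r})$$
$$= \mathrm{const}\, \frac{l^{1/n_0}}{l}\, |f|^{\sim}_{3,\Sigma} .$$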
Theorem XXII.7. Let $\Sigma$ be a sectorization of scale $j \ge 2$ and length $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$. Let $u((\xi, s), (\xi', s'))$, $v((\xi, s), (\xi', s')) \in F_0(2; \Sigma)$ be antisymmetric, spin independent, particle number conserving functions whose Fourier transforms obey $|\check u(k)|, |\check v(k)| \le \frac{1}{2} |\imath k_0 - e(\mathbf{k})|$. Furthermore, let $X \in \mathfrak{N}_3$ and assume that $|u|_{1,\Sigma} \le \frac{1}{2} X$ and $M^j X_0 \le \min\{\tau_1, \tau_2\}$, where $\tau_1$ and $\tau_2$ are the constants of Proposition XIII.5 and [8, Lemma XIII.6], respectively. Set
$$C(k) = \frac{\nu^{(j)}(k)}{\imath k_0 - e(\mathbf{k}) - \check u(k)}, \qquad D(k) = \frac{\nu^{(\ge j+1)}(k)}{\imath k_0 - e(\mathbf{k}) - \check v(k)}$$
and let $C(\xi, \xi')$, $D(\xi, \xi')$ be the Fourier transforms of C(k), D(k) as in [6, Definition IX.3]. Furthermore, let $f \in \check F_{4;\Sigma}$ be of particle–particle type. If the Fermi curve F is strongly asymmetric in the sense of Definition XVIII.3, then for all $\ell \ge 1$
$$|L_\ell(f; C, D)|^{\sim}_{3,\Sigma} \le \big(\mathrm{const}\ l^{1/n_0}\, e_j(X)\big)^{\ell}\, \big(|f|^{\sim}_{3,\Sigma}\big)^{\ell+1}$$
where $e_j(X) = \frac{c_j}{1 - M^j X}$.

Proof. By Proposition D.7, with X replaced by $M^j X$, and Lemma XXII.6
$$|L_\ell(f; C, D)|^{\sim}_{3,\Sigma} \le \Big(\mathrm{const}\ l\, \frac{c_j}{1 - M^j X}\Big)^{\ell} \big(|f|^{\sim}_{ch,\Sigma}\big)^{\ell}\, |f|^{\sim}_{3,\Sigma} \le \big(\mathrm{const}\ l^{1/n_0}\, e_j(X)\big)^{\ell}\, \big(|f|^{\sim}_{3,\Sigma}\big)^{\ell+1} .$$

Appendices

E. Sectors for $k_0$ independent functions

In [1–3] we shall implement a renormalization algorithm that uses counterterms for the dispersion relation $e(\mathbf{k})$ that are independent of $k_0$. In this appendix we adjust the discussion of sectorized norms in Sec. XII and the discussion of resectorization, following Definition XIX.2, to deal with such functions.

Definition E.1. Let $f(\mathbf{x}, \mathbf{x}')$ be a translation invariant function on $\mathbb{R}^2 \times \mathbb{R}^2$. We define its extension $f_{ext}(\xi, \xi')$ by
$$f_{ext}((x_0, \mathbf{x}, \sigma, a), (x'_0, \mathbf{x}', \sigma', a')) = \delta_{\sigma,\sigma'}\, \delta(x_0 - x'_0) \begin{cases} f(\mathbf{x}, \mathbf{x}') & \text{if } a = 1,\ a' = 0 \\ -f(\mathbf{x}', \mathbf{x}) & \text{if } a = 0,\ a' = 1 \\ 0 & \text{otherwise} \end{cases}$$
and its Fourier transform as
$$\check f(\mathbf{k}) = \int d^2\mathbf{x}\ e^{-\imath(k_1 x_1 + k_2 x_2)} f(\mathbf{x}, 0) .$$

Remark E.2. If $\check f_{ext}(k)$ is the Fourier transform of $f_{ext}$ as in [6, Definition IX.1(i)], then
$$\check f_{ext}((k_0, \mathbf{k})) = \check f(\mathbf{k}) .$$

Definition E.3. Let $\Sigma$ be a sectorization at scale j and $K((\mathbf{x}, s), (\mathbf{x}', s'))$ a translation invariant function on $(\mathbb{R}^2 \times \Sigma)^2$.
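(i) We define its extension $K_{ext}((\xi, s), (\xi', s'))$ on $(B \times \Sigma)^2$ by
$$K_{ext}((x_0, \mathbf{x}, \sigma, a, s), (x'_0, \mathbf{x}', \sigma', a', s')) = \delta_{\sigma,\sigma'}\, \delta(x_0 - x'_0) \begin{cases} K((\mathbf{x}, s), (\mathbf{x}', s')) & \text{if } a = 1,\ a' = 0 \\ -K((\mathbf{x}', s'), (\mathbf{x}, s)) & \text{if } a = 0,\ a' = 1 \\ 0 & \text{otherwise.} \end{cases}$$
(ii) The function $K((\mathbf{x}, s), (\mathbf{x}', s'))$ is said to be sectorized if its Fourier transform
$$\int d^2\mathbf{x}\, d^2\mathbf{x}'\ e^{-\imath(k_1 x_1 + k_2 x_2)}\, e^{\imath(k'_1 x'_1 + k'_2 x'_2)}\, K((\mathbf{x}, s), (\mathbf{x}', s'))$$
vanishes unless $(0, \mathbf{k}) \in \tilde s$ and $(0, \mathbf{k}') \in \tilde s'$, where $\tilde s$ and $\tilde s'$ are the extensions of s and s' of [8, Definition XII.1(ii)].
(iii) We define $\check K(\mathbf{k})$ by
$$\check K(\mathbf{k}) = \sum_{s, s' \in \Sigma} \int d^2\mathbf{x}\ e^{-\imath(k_1 x_1 + k_2 x_2)}\, K((\mathbf{x}, s), (0, s')) .$$
(iv) We set $\|K\|_{1,\Sigma} = |K_{ext}|_{1,\Sigma}$.

Remark E.4. (i) If $\check K_{ext}(k)$ is the Fourier transform of $K_{ext}$ as in Definition XII.4(iv), then
$$\check K_{ext}((k_0, \mathbf{k})) = \check K(\mathbf{k}) .$$
(ii) If K is sectorized, then $K((\mathbf{x}, s), (\mathbf{x}', s'))$ and $K_{ext}((x_0, \mathbf{x}, \sigma, a, s), (x'_0, \mathbf{x}', \sigma', a', s'))$ vanish unless $s \cap s' \neq \emptyset$.
(iii) Suppose that K is sectorized and write $\|K\|_{1,\Sigma} = \sum_{\delta \in \mathbb{N} \times \mathbb{N}^2} \frac{1}{\delta!} \kappa_\delta t^\delta$. Then $\kappa_\delta$ vanishes unless $\delta_0 = 0$ and otherwise is given by
$$\kappa_{0,\boldsymbol\delta} = \max\Big\{ \max_{s' \in \Sigma} \sum_{s \in \Sigma} \int d^2\mathbf{x}\ |\mathbf{x}^{\boldsymbol\delta} K((\mathbf{x}, s), (0, s'))| ,\ \ \max_{s \in \Sigma} \sum_{s' \in \Sigma} \int d^2\mathbf{x}\ |\mathbf{x}^{\boldsymbol\delta} K((\mathbf{x}, s), (0, s'))| \Big\}$$
and obeys
$$\max_{s, s' \in \Sigma} \int d^2\mathbf{x}\ |\mathbf{x}^{\boldsymbol\delta} K((\mathbf{x}, s), (0, s'))| \ \le\ \kappa_{0,\boldsymbol\delta} \ \le\ 3 \max_{s, s' \in \Sigma} \int d^2\mathbf{x}\ |\mathbf{x}^{\boldsymbol\delta} K((\mathbf{x}, s), (0, s'))| .$$

Lemma E.5. Let $\Sigma$ be a sectorization of scale $j \ge 2$ and length $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$ and $K((\mathbf{x}, s), (\mathbf{x}', s'))$ be a sectorized, translation invariant function on $(\mathbb{R}^2 \times \Sigma)^2$. Let $\mu(t)$ be a $C_0^\infty$ function on $\mathbb{R}$ and set, for each $\Lambda > 0$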
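$$\mu_\Lambda(k) = \mu(\Lambda^2 [k_0^2 + e(\mathbf{k})^2])$$
$$(K_{ext} * \hat\mu_\Lambda)((\xi, s), (\xi', s')) = \int_B d\zeta\ K_{ext}((\xi, s), (\zeta, s'))\, \hat\mu_\Lambda(\zeta, \xi')$$
$$(\hat\mu_\Lambda * K_{ext})((\xi, s), (\xi', s')) = \int_B d\zeta\ K_{ext}((\zeta, s), (\xi', s'))\, \hat\mu_\Lambda(\zeta, \xi)$$
where $\hat\mu_\Lambda$ was defined in [6, Definition IX.4]. Denote $j(\Lambda) = \min\{i \in \mathbb{N} \mid M^i \ge \Lambda\}$. Then, there is a constant const, depending on $\mu$, but not on M, j or $\Lambda$, such that
$$|K_{ext} * \hat\mu_\Lambda|_{1,\Sigma},\ \ |\hat\mu_\Lambda * K_{ext}|_{1,\Sigma} \ \le\ \mathrm{const}\ c_{j(\Lambda)}\, \|K\|_{1,\Sigma} .$$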
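This lemma is an immediate consequence of [8, Lemma XIII.7].

Remark E.6. In the notation of Lemma E.5,
$$(K_{ext} * \hat\mu_\Lambda)\check{\ }(k) = (\hat\mu_\Lambda * K_{ext})\check{\ }(k) = \check K(\mathbf{k})\, \mu_\Lambda(k) .$$

As in Definition XIX.2, we define a resectorization for functions on $(\mathbb{R}^2 \times \Sigma)^2$. For a function $\chi(k)$ on $\mathbb{R} \times \mathbb{R}^2$, set, as in [8, Lemma XIII.3],
$$\chi'(\mathbf{x}) = \int e^{\imath \mathbf{k} \cdot \mathbf{x}}\, \chi(0, \mathbf{k})\, \frac{d^2\mathbf{k}}{(2\pi)^2} = \int dx_0\ \hat\chi((x_0, \mathbf{x}, \uparrow, 0), (0, 0, \uparrow, 0))$$
and let
$$\hat\chi'(\xi, \xi') = \delta_{\sigma,\sigma'}\, \delta_{a,a'}\, \delta(x_0 - x'_0)\, \chi'((-1)^a (\mathbf{x} - \mathbf{x}')) .$$
Then
$$\hat\chi'((x_0, \mathbf{x}, \sigma, a), (x'_0, \mathbf{x}', \sigma', a')) = \delta(x_0 - x'_0) \int dt\ \hat\chi((t, \mathbf{x}, \sigma, a), (0, \mathbf{x}', \sigma', a')) = \hat\chi((x_0, \mathbf{x}, \sigma, a), (x'_0, \mathbf{x}', \sigma', a')) .$$

Definition E.7. Let $j, i \ge 2$. Let $\Sigma$ and $\Sigma'$ be sectorizations of scale j and i, respectively. If $i \neq j$, define, for each function K on $(\mathbb{R}^2 \times \Sigma')^2$,
$$K_\Sigma((\mathbf{x}, s_1), (\mathbf{y}, s_2)) = \sum_{s'_1, s'_2 \in \Sigma'} \int d\mathbf{x}'\, d\mathbf{y}'\ \chi'_{s_1}(\mathbf{x} - \mathbf{x}')\, K((\mathbf{x}', s'_1), (\mathbf{y}', s'_2))\, \chi'_{s_2}(\mathbf{y}' - \mathbf{y})$$
where $\chi_s$, $s \in \Sigma$, is the partition of unity of Lemma XII.3 and [8, (XIII.2)]. For $i = j$ and $\Sigma' = \Sigma$, define $K_\Sigma = K$.

Remark E.8. (i) If K is translation invariant, then
$$\check K_\Sigma(\mathbf{k}, s_1, s_2) = \sum_{s'_1, s'_2 \in \Sigma'} \check K(\mathbf{k}, s'_1, s'_2)\, \chi_{s_1}(0, \mathbf{k})\, \chi_{s_2}(0, \mathbf{k}) .$$
(ii) The resectorization $K_\Sigma$ is sectorized.

Remark E.9. Let $K((\mathbf{x}, s), (\mathbf{x}', s'))$ be a translation invariant sectorized function on $(\mathbb{R}^2 \times \Sigma')^2$. Then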
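$$(K_\Sigma)_{ext}((\xi, s_1), (\eta, s_2)) = \sum_{\substack{s'_1, s'_2 \in \Sigma' \\ s'_1 \cap s_1 \neq \emptyset \\ s'_2 \cap s_2 \neq \emptyset}} \int d\xi'\, d\eta'\ K_{ext}((\xi', s'_1), (\eta', s'_2))\, \hat\chi'_{s_1}(\xi', \xi)\, \hat\chi'_{s_2}(\eta', \eta) .$$

Proof. Let $\xi = (x_0, \mathbf{x}, \sigma, a)$, $\eta = (y_0, \mathbf{y}, \tau, b) \in B$. We consider the case a = 1, b = 0; the other cases are similar. Fix any $s'_1, s'_2 \in \Sigma'$. If $s'_1 \cap s_1 = \emptyset$ or $s'_2 \cap s_2 = \emptyset$, then
$$\int d\xi'\, d\eta'\ K_{ext}((\xi', s'_1), (\eta', s'_2))\, \hat\chi'_{s_1}(\xi', \xi)\, \hat\chi'_{s_2}(\eta', \eta) = 0 = \int d\mathbf{x}'\, d\mathbf{y}'\ \chi'_{s_1}(\mathbf{x} - \mathbf{x}')\, K((\mathbf{x}', s'_1), (\mathbf{y}', s'_2))\, \chi'_{s_2}(\mathbf{y}' - \mathbf{y}) .$$
Otherwise
$$\int d\xi'\, d\eta'\ K_{ext}((\xi', s'_1), (\eta', s'_2))\, \hat\chi'_{s_1}(\xi', \xi)\, \hat\chi'_{s_2}(\eta', \eta)$$
$$= \delta_{\sigma,\tau} \int dx'_0\, dy'_0\, d\mathbf{x}'\, d\mathbf{y}'\ \delta(x'_0 - y'_0)\, \delta(x'_0 - x_0)\, \delta(y'_0 - y_0)\ \chi'_{s_1}(\mathbf{x} - \mathbf{x}')\, K((\mathbf{x}', s'_1), (\mathbf{y}', s'_2))\, \chi'_{s_2}(\mathbf{y}' - \mathbf{y})$$
$$= \delta_{\sigma,\tau}\, \delta(x_0 - y_0) \int d\mathbf{x}'\, d\mathbf{y}'\ \chi'_{s_1}(\mathbf{x} - \mathbf{x}')\, K((\mathbf{x}', s'_1), (\mathbf{y}', s'_2))\, \chi'_{s_2}(\mathbf{y}' - \mathbf{y}) .$$

Proposition E.10. Let $j > i \ge 2$, $\frac{1}{M^{j-3/2}} \le l \le \frac{1}{M^{(j-1)/2}}$ and $\frac{1}{M^{i-3/2}} \le l' \le \frac{1}{M^{(i-1)/2}}$ with $4l < l'$. Let $\Sigma$ and $\Sigma'$ be sectorizations of length l at scale j and length l' at scale i, respectively.

(i) Let $K((\mathbf{x}, s), (\mathbf{x}', s'))$ be a translation invariant sectorized function on $(\mathbb{R}^2 \times \Sigma')^2$. Then
$$\|K_\Sigma\|_{1,\Sigma} \le \mathrm{const}\ c_{j-1}\, \|K\|_{1,\Sigma'} .$$
(ii) Let $K((\mathbf{x}, s), (\mathbf{x}', s'))$ be a translation invariant sectorized function on $(\mathbb{R}^2 \times \Sigma)^2$. Then
$$\|K_{\Sigma'}\|_{1,\Sigma'} \le \mathrm{const}\, \frac{l'}{l}\, c_{i-1}\, \|K\|_{1,\Sigma} .$$

Proof. (i) By Definition E.3(iv), Remark E.9, [9, Lemma II.7], Lemma XIII.3 and [8, (XIII.4)],
$$\|K_\Sigma\|_{1,\Sigma} = |(K_\Sigma)_{ext}|_{1,\Sigma} \le \mathrm{const} \max_{s_1, s_2 \in \Sigma} \sum_{\substack{s'_1, s'_2 \in \Sigma' \\ s'_1 \cap s_1 \neq \emptyset \\ s'_2 \cap s_2 \neq \emptyset}} \|K_{ext}((\cdot, s'_1), (\cdot, s'_2))\|_{1,\infty}\, \|\hat\chi'_{s_1}\|_{1,\infty}\, \|\hat\chi'_{s_2}\|_{1,\infty}$$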
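$$\le \mathrm{const} \max_{s_1, s_2 \in \Sigma} \|\chi'_{s_1}\|_{L^1}\, \|\chi'_{s_2}\|_{L^1}\ \max_{s'_1, s'_2 \in \Sigma'} \|K_{ext}((\cdot, s'_1), (\cdot, s'_2))\|_{1,\infty}$$
$$\le \mathrm{const}\ c_{j-1}^2\, |K_{ext}|_{1,\Sigma'} \le \mathrm{const}\ c_{j-1}\, \|K\|_{1,\Sigma'}$$
since, for any $s \in \Sigma$, there are at most three sectors $s' \in \Sigma'$ with $s' \cap s \neq \emptyset$.

(ii) Similarly
$$\|K_{\Sigma'}\|_{1,\Sigma'} = |(K_{\Sigma'})_{ext}|_{1,\Sigma'} \le \mathrm{const} \max_{s'_1, s'_2 \in \Sigma'} \sum_{\substack{s_1, s_2 \in \Sigma \\ s_1 \cap s'_1 \neq \emptyset \\ s_2 \cap s'_2 \neq \emptyset}} \|K_{ext}((\cdot, s_1), (\cdot, s_2))\|_{1,\infty}\, \|\hat\chi'_{s'_1}\|_{1,\infty}\, \|\hat\chi'_{s'_2}\|_{1,\infty}$$
$$\le \mathrm{const} \max_{s'_1, s'_2 \in \Sigma'} \|\chi'_{s'_1}\|_{L^1}\, \|\chi'_{s'_2}\|_{L^1} \sum_{\substack{s_1 \in \Sigma \\ s_1 \cap s'_1 \neq \emptyset}}\ \sum_{s_2 \in \Sigma} \|K_{ext}((\cdot, s_1), (\cdot, s_2))\|_{1,\infty}$$
$$\le \mathrm{const}\ c_{i-1}^2\, \frac{l'}{l}\ \max_{s_1 \in \Sigma} \sum_{s_2 \in \Sigma} \|K_{ext}((\cdot, s_1), (\cdot, s_2))\|_{1,\infty}$$
$$\le \mathrm{const}\, \frac{l'}{l}\, c_{i-1}\, \|K\|_{1,\Sigma}$$
since, for any $s'_1 \in \Sigma'$, there are at most $\mathrm{const}\, \big[\frac{l'}{l}\big]$ sectors $s_1 \in \Sigma$ with $s_1 \cap s'_1 \neq \emptyset$.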
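Notation

Norms

Norm | Characteristics | Reference
$|||\cdot|||_{1,\infty}$ | no derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|_{1,\infty}$ | derivatives, external positions, acts on functions | Example II.6
$\|\cdot\|\check{\ }_{\infty}$ | derivatives, external momenta, acts on functions | Definition IV.6
$|||\cdot|||_{\infty}$ | no derivatives, external positions, acts on functions | Example III.4
$\|\cdot\|\check{\ }_{1}$ | derivatives, external momenta, acts on functions | Definition IV.6
$\|\cdot\|\check{\ }_{\infty,B}$ | derivatives, external momenta, $B \subset \mathbb{R} \times \mathbb{R}^d$ | Definition IV.6
$\|\cdot\|\check{\ }_{1,B}$ | derivatives, external momenta, $B \subset \mathbb{R} \times \mathbb{R}^d$ | Definition IV.6
$\|\cdot\|$ | $\rho_{m;n} \|\cdot\|_{1,\infty}$ | Lemma V.1
$N(W; c, b, \alpha)$ | $\frac{1}{b^2}\, c \sum_{m,n \ge 0} \alpha^n b^n \|W_{m,n}\|$ | Definition III.9, Theorem V.2
$N_0(W; \beta; X, \vec\rho)$ | $e_0(X) \sum_{m+n \in 2\mathbb{N}} \beta^n \rho_{m;n} \|W_{m,n}\|_{1,\infty}$ | Theorem VIII.6
$\|\cdot\|_{L^1}$ | derivatives, acts on functions on $\mathbb{R} \times \mathbb{R}^d$ | before Lemma IX.6
$\|\cdot\|^{\sim}$ | derivatives, external momenta, acts on functions | Definition X.4
$N_0^{\sim}(W; \beta; X, \vec\rho)$ | $e_0(X) \sum_{m+n \in 2\mathbb{N}} \beta^{m+n} \rho_{m;n} \|W^{\sim}_{m,n}\|^{\sim}$ | Theorem X.12
$|\cdot|^{\sim}$ | like $\rho_{m;n} \|\cdot\|^{\sim}$ but acts on $\tilde V^{\otimes n}$ | before Lemma X.11
$N^{\sim}(W; c, b, \alpha)$ | $\frac{1}{b^2}\, c \sum_{m,n} \alpha^{m+n} b^{m+n} |W^{\sim}_{m,n}|^{\sim}$ | Theorem X.12
$|\cdot|_{p,\Sigma}$ | derivatives, external positions, all but p sectors summed | Definition XII.9
$\|\cdot\|_{1,\Sigma}$ | like $|\cdot|_{1,\Sigma}$, but for functions on $(\mathbb{R}^2 \times \Sigma)^2$ | Definition E.3
$|\varphi|_{\Sigma}$ | $\rho_{m;n} \big( |\varphi|_{1,\Sigma} + \frac{1}{l} |\varphi|_{3,\Sigma} + \frac{1}{l^2} |\varphi|_{5,\Sigma} \big)$ if m = 0; $\rho_{m;n} \frac{l}{M^{2j}} |\varphi|_{1,\Sigma}$ if m ≠ 0 | Definition XV.1
$N_j(w; \alpha; X, \Sigma, \vec\rho)$ | $\frac{M^{2j}}{l}\, e_j(X) \sum_{m,n \ge 0} \alpha^n \big( \frac{l B}{M^j} \big)^{n/2} |w_{m,n}|_{\Sigma}$ | Definition XV.1
$|\cdot|^{\sim}_{p,\Sigma}$ | derivatives, external momenta, all but p sectors summed | Definition XVI.4
$|\cdot|^{\sim}_{p,\Sigma,\rho}$ | weighted variant of $|\cdot|^{\sim}_{p,\Sigma}$ | Definition XVII.1(i)
$|f|^{\sim}_{\Sigma}$ | $\rho_{m;n} \big( |f|^{\sim}_{1,\Sigma} + \frac{1}{l} |f|^{\sim}_{3,\Sigma} + \frac{1}{l^2} |f|^{\sim}_{5,\Sigma} \big)$ if m = 0; $\rho_{m;n} \sum_{p=1}^{6} \frac{1}{l^{[(p-1)/2]}} |f|^{\sim}_{p,\Sigma}$ if m ≠ 0 | Definition XVII.1(ii)
$N_j^{\sim}(w; \alpha; X, \Sigma, \vec\rho)$ | $\frac{M^{2j}}{l}\, e_j(X) \sum_{n \ge 0} \alpha^n \big( \frac{l B}{M^j} \big)^{n/2} |f_n|^{\sim}_{\Sigma}$ | Definition XVII.1(iii)
$|\cdot|^{\sim}_{ch,\Sigma}$ | channel variant of $|\cdot|^{\sim}_{2,\Sigma}$ for ladders | Definition D.1
$|\cdot|_{ch,\Sigma}$ | channel variant of $|\cdot|_{2,\Sigma}$ for ladders | Definition D.1
$|\cdot|_{1,\Sigma,\omega}$ | like $|\cdot|_{1,\Sigma}$ but excludes almost degenerate sectors | Lemma XXI.2
$|\cdot|^{\sim}_{1,\Sigma,\omega}$ | like $|\cdot|^{\sim}_{1,\Sigma}$ but excludes almost degenerate sectors | Lemma XXI.2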
Other notation

Notation | Description | Reference
$\Omega_S(W)(\phi, \psi)$ | $\log \frac{1}{Z} \int e^{W(\phi, \psi + \zeta)}\, d\mu_S(\zeta)$ | before (I.6)
$J$ | particle/hole swap operator | (VI.1)
$\tilde\Omega_C(W)(\phi, \psi)$ | $\log \frac{1}{Z} \int e^{\phi J \zeta}\, e^{W(\phi, \psi + \zeta)}\, d\mu_C(\zeta)$ | Definition VII.1
$r_0$ | number of $k_0$ derivatives tracked | Sec. VI
$r$ | number of $\mathbf{k}$ derivatives tracked | Sec. VI
$M$ | scale parameter, M > 1 | before Definition VIII.1
const | generic constant, independent of scale |
const | generic constant, independent of scale and M |
$\nu^{(j)}(k)$ | jth scale function | Definition VIII.1
$\tilde\nu^{(j)}(k)$ | jth extended scale function | Definition VIII.4(i)
$\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-1}(k_0^2 + e(\mathbf{k})^2))$ | Definition VIII.1
$\tilde\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-2}(k_0^2 + e(\mathbf{k})^2))$ | Definition VIII.4(ii)
$\bar\nu^{(\ge j)}(k)$ | $\varphi(M^{2j-3}(k_0^2 + e(\mathbf{k})^2))$ | Definition VIII.4(iii)
$n_0$ | degree of asymmetry | Definition XVIII.3
$l$ | length of sectors | Definition XII.1
$\Sigma$ | sectorization | Definition XII.1
$S(C)$ | $\sup_m \sup_{\xi_1,\dots,\xi_m \in B} \big( \big| \int \psi(\xi_1) \cdots \psi(\xi_m)\, d\mu_C(\psi) \big| \big)^{1/m}$ | Definition IV.1
$B$ | j-independent constant | Definitions XV.1, XVII.1
$c_j$ | $= \sum_{|\boldsymbol\delta| \le r,\ |\delta_0| \le r_0} M^{j|\delta|} t^\delta + \sum_{|\boldsymbol\delta| > r \text{ or } |\delta_0| > r_0} \infty\, t^\delta \in \mathfrak{N}_{d+1}$ | Definition XII.2
$e_j(X)$ | $= \frac{c_j}{1 - M^j X}$ | Definition XV.1(ii)
$f_{ext}$ | extends $f(\mathbf{x}, \mathbf{x}')$ to $f_{ext}((x_0, \mathbf{x}, \sigma, a), (x'_0, \mathbf{x}', \sigma', a'))$ | Definition E.1
$*$ | convolution | before (XIII.6)
$\circ$ | ladder convolution | Definition XIV.1(iv)
$\bullet$ | ladder convolution | Definitions XIV.3, XVI.9
$\check f$ | Fourier transform | Definition IX.1(i)
$\check u$ | Fourier transform for sectorized u | Definition XII.4(iv)
$f^{\sim}$ | partial Fourier transform | Definition IX.1(ii)
$\hat\chi$ | Fourier transform | Definition IX.4
$B$ | $\mathbb{R} \times \mathbb{R}^d \times \{\uparrow, \downarrow\} \times \{0, 1\}$ viewed as position space | beginning of Sec. II
$\check B$ | $\mathbb{R} \times \mathbb{R}^d \times \{\uparrow, \downarrow\} \times \{0, 1\}$ viewed as momentum space | beginning of Sec. IX
$\check B_m$ | $\{(\check\eta_1,\dots,\check\eta_m) \in \check B^m \mid \check\eta_1 + \cdots + \check\eta_m = 0\}$ | before Definition X.1
$X_\Sigma$ | $\check B \,\dot\cup\, (B \times \Sigma)$ | Definition XVI.1
$F_m(n)$ | functions on $B^m \times B^n$, antisymmetric in $B^m$ arguments | Definition II.9
$\check F_m(n)$ | functions on $\check B^m \times B^n$, antisymmetric in $\check B^m$ arguments | Definition X.8
$F_m(n; \Sigma)$ | functions on $B^m \times (B \times \Sigma)^n$, internal momenta in sectors | Definition XII.4(ii)
$\check F_m(n; \Sigma)$ | functions on $\check B^m \times (B \times \Sigma)^n$, internal momenta in sectors | Definition XVI.7(i)
$\check F_{n;\Sigma}$ | functions on $X_\Sigma^n$ that reorder to $\check F_m(n - m; \Sigma)$'s | Definition XVI.7(iii)
References
[1] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 1: overview, to appear in Commun. Math. Phys.
[2] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 2: convergence, to appear in Commun. Math. Phys.
[3] J. Feldman, H. Knörrer and E. Trubowitz, A two dimensional Fermi liquid, Part 3: the Fermi surface, to appear in Commun. Math. Phys.
[4] J. Feldman, H. Knörrer and E. Trubowitz, Asymmetric Fermi surfaces for magnetic Schrödinger operators, Comm. Partial Differential Equations 25 (2000), 319–336.
[5] J. Feldman, H. Knörrer and E. Trubowitz, Particle–hole ladders, preprint.
[6] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many Fermion systems, Part 2: the first scale, Rev. Math. Phys. 15 (2003) 995–1037.
[7] M. Gromov, Asymptotic Invariants of Infinite Groups.
[8] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many Fermion systems, Part 3: sectorized norms, Rev. Math. Phys. 15 (2003) 1039–1120.
[9] J. Feldman, H. Knörrer and E. Trubowitz, Single scale analysis of many Fermion systems, Part 1: insulators, Rev. Math. Phys. 15 (2003) 949–993.
February 9, 2004 18:52 WSPC/148-RMP
00188
Reviews in Mathematical Physics Vol. 15, No. 10 (2003) 1171–1217 © World Scientific Publishing Company
ASPECTS OF NONCOMMUTATIVE LORENTZIAN GEOMETRY FOR GLOBALLY HYPERBOLIC SPACETIMES
VALTER MORETTI
Department of Mathematics of the University of Trento, I.N.d.A.M, Istituto Nazionale di Alta Matematica “F. Severi”, unità locale di Trento, I.N.F.N., Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Trento, via Sommarive 14, I-38050 Povo (TN), Italy
[email protected]

Received 18 April 2003
Revised 30 September 2003

Connes’ functional formula of the Riemannian distance is generalized to the Lorentzian case using the so-called Lorentzian distance, the d’Alembert operator and the causal functions of a globally-hyperbolic spacetime. As a step of the presented machinery, a proof of the almost-everywhere smoothness of the Lorentzian distance considered as a function of one of the two arguments is given. Afterwards, using a C*-algebra approach, the spacetime causal structure and the Lorentzian distance are generalized into noncommutative structures giving rise to a Lorentzian version of part of Connes’ noncommutative geometry. The generalized noncommutative spacetime consists of a direct set of Hilbert spaces and a related class of C*-algebras of operators. In each algebra a convex cone made of self-adjoint elements is selected which generalizes the class of causal functions. The generalized events, called loci, are realized as the elements of the inductive limit of the spaces of the algebraic states on the C*-algebras. A partial-ordering relation between pairs of loci generalizes the causal order relation in spacetime. A generalized Lorentz distance of loci is defined by means of a class of densely-defined operators which play the role of a Lorentzian metric. Specializing the formalism back to the usual globally-hyperbolic spacetime, it is found that compactly-supported probability measures give rise to a non-pointwise extension of the concept of events.

Keywords: Noncommutative geometry; world function; Lorentzian distance; globally hyperbolic spacetime; C*-algebra.
1. Introduction

1.1. Some aspects of Connes’ Riemannian noncommutative geometry

Connes’ noncommutative geometry is a very impressive, coherent set of mathematical theories which encompasses parts of mathematics that arose in very distant and different contexts [2]. On the physical side, applications of Connes’ noncommutative geometry include general relativity, quantum field theory and many other research areas [2, 11, 20]. As regards the content of this paper, we are interested in the approach of [2, Chap. VI] (see also [20, Chap. 6]). The basic ingredient introduced
by Connes to develop the analogue of differential calculus for noncommutative algebras is given by a so-called spectral triple, (A, H, D). A is a unital algebra which is a subalgebra of the natural C*-algebra of bounded operators on a Hilbert space H. D : D(D) → H is a self-adjoint operator on H, D(D) ⊂ H being a dense linear manifold, such that the resolvent $(D - \lambda I)^{-1}$ is compact for each $\lambda \notin \mathbb{R}$. [D, a] must be well-defined at least as a quadratic form (see [2, VI.1]) and bounded for every a ∈ A. Every smooth compact n-dimensional Riemannian manifold M equipped with a (Euclidean) spin structure determines a natural commutative (i.e. A is commutative) spectral triple. In that case A is the normed commutative unital involutive (the involution being the usual complex conjugation) algebra of Lipschitz^a maps $f : M \to \mathbb{C}$, the norm being the usual sup-norm $\|\cdot\|_\infty$. H is the space $L^2(M, S)$ of the square integrable sections of the irreducible $\mathbb{C}^{2^{[\dim M/2]}}$-spinor bundle over M with measure $\mu_g$ associated to the metric g on M. The positive Hermitean scalar product used to define $L^2$ reads
$$(\psi, \phi) := \int_M \psi^\dagger(x)\, \phi(x)\, d\mu_g(x) .$$
This scalar product induces an operator norm which we denote by $\|\cdot\|_{L(L^2(M,S))}$. Finally D is the Dirac operator associated with the Levi–Civita connection. It turns out that if f ∈ A is seen as a multiplicative operator, $\|f\|_\infty = \|f\|_{L(L^2(M,S))}$, $f^* = \bar f$, $1 = I$, where $1 : M \to \mathbb{C}$ is the constant map 1(x) = 1. Therefore A is a subalgebra of the C*-algebra of the bounded operators on $L^2(M, S)$, as it must be.

^a I.e., for some $K_f \ge 0$, it holds $|f(p) - f(q)| \le K_f\, d_E(p, q)$ for every p, q ∈ M, $d_E$ being the distance in M.

Remarkably, one can realize the topological and metric structure of the manifold in terms of the spectral triple only (see [20, Propositions 6.5.1 and 10.1.1]). Let us summarize this result. In the following, $\bar A$ denotes the (unital) C*-algebra given by the completion of A. M is homeomorphic to the space of (the classes of unitary equivalence of) irreducible representations of the C*-algebra $\bar A$, equipped with the topology of pointwise convergence (also called Gel’fand’s or ∗-weak topology). In the commutative case, the irreducible representations are one-dimensional and coincide with the pure algebraic states on $\bar A$. In this sense the points of M are pure algebraic states. All that is essentially due [2, 20, 11] to the well-known “commutative Gel’fand–Naimark theorem” [25]. In practice, $\bar A$ turns out to be nothing but the C*-algebra of the complex-valued continuous functions on M, C(M), with the norm $\|\cdot\|_\infty$, and the pure state associated to any p ∈ M trivially acts as p(f) := f(p) for every f ∈ C(M). As regards the metric, one has the functional formula
$$d_E(x, y) = \sup \{ |f(x) - f(y)| \ \mid\ f \in A,\ \|[D, f]\| \le 1 \} , \qquad (1)$$
where $d_E$ is the distance in the manifold which is induced by the metric. Notice that there is no reference to paths in the manifold, despite the left-hand side being defined as the infimum of the length of the paths from p to q:
$$d_E(p, q) := \inf_{\Omega_{p,q}} \{ L(\gamma) \} , \qquad (2)$$
where $\Omega_{p,q}$ is the class of all continuous piecewise-smooth curves joining p and q and $L(\gamma)$ is the Riemannian length of $\gamma \in \Omega_{p,q}$. As remarked by Connes [2], this fact is interesting on purely physical grounds. Indeed, the paths of quantum particles do not exist: wave functions exist, but one must assume the existence of geometrical structures also when discussing quantum particles. There is an analogous formula for the integration of functions f ∈ A over M based on the Dixmier trace $\mathrm{tr}_\omega$ (below, c(n) is a coefficient depending on the dimension n of the manifold M only) [2, 11, 20],
$$\int_M f(x)\, d\mu_g(x) = c(n)\, \mathrm{tr}_\omega \big( f |D|^{-n} \big) . \qquad (3)$$
Whenever the algebra A of a spectral triple is taken noncommutative, (1) can be re-interpreted as defining a distance in the space of pure states [2, 20, 11] and generalized interpretations are possible for (3). Similar noncommutative generalizations can be performed for much of differential and integral calculus, yielding very interesting and useful mathematical structures and giving rise to a remarkable interplay between mathematics and theoretical physics [2, 11, 20]. It is worth noticing that, for most applications, the Dirac operator D can be replaced by the Laplace–Beltrami one, ∆, as suggested in [7, 8] (see also [20]), and this is the way we follow within the present work. Most physicists interested in quantum gravity believe that the Planck-scale geometry may reveal a structure very different from the geometry at macroscopic scales. This is a strong motivation for developing further any sort of noncommutative geometry. However, physics deals with Lorentzian spacetimes rather than Euclidean^b spaces. To this end, the principal aim of this paper is to attempt to find the Lorentzian analogue of (1). Actually, we shall see that this is nothing but the first step towards a noncommutative approach to spacetime causality.

^b We use “Euclidean” as a synonym of “Riemannian” throughout.

1.2. The Lorentzian puzzle

Lorentzian geometry, i.e. the geometry of spacetimes, is more complicated than Euclidean geometry due to the presence of local and global causal structures. These take temporal and causal relations among events into account. The local, metrical and causal, structure is given by the Lorentzian metric. A physically relevant global causal structure is involved in the definition of a globally-hyperbolic spacetime.

Roughly speaking, a globally-hyperbolic spacetime is a time-oriented Lorentzian manifold (that is a spacetime) which admits spacelike surfaces, called Cauchy surfaces, such that the assignment of Cauchy data on those surfaces determines the evolution of any field everywhere in the manifold if the field satisfies,
for instance, the Klein–Gordon equation. A globally-hyperbolic spacetime seems to be the natural scenario where one represents the theory of the matter content of the universe, including (quantum) fields, elementary interactions and all that [28, 29]. In order to build up a Lorentzian noncommutative geometry, a generalization of the (local and global) causal structure of a spacetime is necessary. To make contact with Connes’ program a natural question arises: what is the Lorentzian analogue of $d_E$ to be used to generalize (1) in Lorentzian manifolds? An interesting object defined in both Euclidean and Lorentzian manifolds is the so-called Synge world function σ (see Appendix A), which is related to the function $d_E$ in Euclidean manifolds. Any smooth, either Riemannian or Lorentzian, manifold is locally endowed with a smooth function σ : N × N → ℝ where N is any convex normal neighborhood. σ maps x, y ∈ N into one half the (signed) squared length of the unique geodesic segment, which joins x and y, contained in N. In Riemannian manifolds σ ≥ 0. In Lorentzian manifolds, the sign is positive if and only if x, y are spatially separated, negative if and only if x, y are timelike related, and σ(x, y) = 0 for either x = y or when x, y are null related. It is known that σ completely determines the metric at each point of the spacetime. In Euclidean manifolds $d_E = \sqrt{2\sigma}$ holds whenever x, y belong to a common convex normal neighborhood, so, at least locally, it is possible to define $d_E$ in terms of $\sqrt{2\sigma}$. However, any attempt to generalize (1) in Lorentzian manifolds by means of any analogue of $d_E$ built up by means of σ faces the basic issue of the indefiniteness of the Lorentzian world function. $d := \sqrt{2\sigma}$ would be complex-valued and so useless to restore some identity similar to (1). One could try to define d for spatially-separated events only, by taking the square root of 2σ in that case.

An immediate drawback is that the definition would not work whenever x and y are too far from each other, since σ is not well-defined outside convex normal neighborhoods. To avoid the problem, one may try to use (2) for x, y spatially separated, with $\Omega_{x,y}$ now denoting the class of space-like continuous piecewise smooth curves joining x and y. This is not a good idea either, because it would entail d(x, y) = 0 (and thus also $d(x, y) \neq \sqrt{2\sigma(x, y)}$) at least for x and y sufficiently close to each other and spatially separated. This is because, in convex normal neighborhoods, one may arbitrarily approximate null piecewise smooth curves by means of piecewise smooth space-like curves with the same endpoints. Actually several other problematic issues are related to the indefiniteness of $d^2$. For instance, if D indicates the Dirac operator, the identity $\|[D, f]\| = \mathrm{ess\,sup}_M |g(df, df)|$, necessary to give rise to (1) (e.g., see [2, 20, 11]), fails to be fulfilled. This is because the left-hand side is not well-defined as a Hilbert-space operator norm since, in Minkowski spacetime (but this generalizes to any Lorentzian manifold equipped with a spin structure), the natural Lorentz invariant scalar product of spinors turns out to be indefinite. We do not address these issues in the present work because we shall employ the Laplace–Beltrami–D’Alembert operator instead of the Dirac one (see [27] for another approach based on the Dirac operator and Krein spaces).
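The sign behaviour of the world function described above can be made concrete in the simplest case. The following Python sketch assumes flat Minkowski spacetime with signature (−,+,+,+), where $\sigma(x, y) = \frac{1}{2}\eta(x - y, x - y)$ is defined globally; the paper itself treats general spacetimes, where σ only exists on convex normal neighborhoods. The function names `world_function`, `classify` and `naive_distance` are illustrative and do not come from the text.

```python
import math


def world_function(x, y):
    """sigma(x, y) = (1/2) * eta(x - y, x - y) in Minkowski spacetime, signature (-,+,+,+)."""
    dt = x[0] - y[0]
    spatial_sq = sum((a - b) ** 2 for a, b in zip(x[1:], y[1:]))
    return 0.5 * (-dt * dt + spatial_sq)


def classify(x, y, tol=1e-12):
    """Causal relation of two events read off from the sign of sigma."""
    s = world_function(x, y)
    if abs(s) <= tol:
        return "null or coincident"
    return "spacelike" if s > 0 else "timelike"


def naive_distance(x, y):
    """sqrt(2*sigma): real only when sigma >= 0, otherwise None."""
    s = world_function(x, y)
    return math.sqrt(2.0 * s) if s >= 0 else None


origin = (0.0, 0.0, 0.0, 0.0)
for event in [(2.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0), (1.0, 2.0, 0.0, 0.0)]:
    print(event, world_function(origin, event), classify(origin, event), naive_distance(origin, event))
```

For timelike related events `naive_distance` has no real value to return, which is the indefiniteness obstruction discussed in the text.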
February 9, 2004 18:52 WSPC/148-RMP
00188
Aspects of Noncommutative Lorentzian Geometry
Another problematic technical issue related to the indefiniteness of the metric is the failure of the Lipschitz condition to define a valuable background algebra of functions A. Indeed, in the Euclidean case dE(p, ·) cannot be everywhere smooth but it turns out to be Lipschitz because of the triangular inequality (false in the Lorentzian case). The Lipschitz condition plays a relevant role in proving (1) and in the choice of the algebra A which contains dE(p, ·). We also remark that the compactness of the manifold has to be dropped in the Lorentzian case because a compact spacetime contains a closed timelike curve [1, Proposition 3.10] and thus fails to be physical. The failure of compactness gives rise to problems in the Euclidean case too. However, approaches to noncommutative Euclidean geometry exist in some cases [10]. If M is a Hausdorff locally-compact space but is not compact, there is a homeomorphism from M onto the space of complex homomorphisms of the nonunital C∗-algebra of the complex functions on M which vanish at infinity, C0(M), equipped with the pointwise-convergence topology [6]. So the points of M can be thought of as multiplicative functionals on the C∗-algebra Ā := C0(M), and A can be taken as the algebra of complex continuous compactly-supported functions in M, Cc(M). However, in the noncompact case, (1) cannot be re-stated as it stands. In the Lorentzian case, possible attempts to solve all these problems (also connected with the Hamiltonian formulation of field theories including the gravitational field) [15, 16, 18, 19] are based on the foliation of the manifold by means of space-like hypersurfaces. On these hypersurfaces, provided they are compact (and endowed with spin structures), one can restore Connes’ standard noncommutative approach referring to the Euclidean distance induced by the Lorentzian background metric. However, barring globally static spacetimes, any choice of the foliation is quite arbitrary.
Moreover the relation between spatial spectral triples and causality seems to be quite involved. Finally, a classical background spacetime cannot be completely eliminated this way, reducing the scope of possible attempts to formulate approaches to quantum gravity. Another approach to noncommutative Lorentzian geometry is presented in [27] in terms of Krein spaces. However the issue of the generalization of (1) is not investigated there; attention is focused on the generalization of (3) and the noncommutative differential calculus.
1.3. A natural Lorentzian approach
In this paper, first of all we show that there is a possible generalization of (1) in any physically well-behaved spacetime (see Sec. 1.5 for more details on the definitions used). In fact, in every globally-hyperbolic spacetime M (i.e. a connected time-oriented Lorentzian manifold which admits Cauchy surfaces) a functional identity similar to (1) arises which uses the so-called Lorentzian distance d(x, y) [1, 21], the class of almost-everywhere smooth causal functions and the Laplace–Beltrami–d’Alembert operator, locally ∆ = ∇_µ ∇^µ, associated to the Levi–Civita connection derivative ∇. (Actually the same result holds, working with a vector fiber bundle F → M and more complicated second-order hyperbolic operators, see the remark
V. Moretti
after Theorem 3.1 below). The original idea to express the Lorentzian distance by a functional formula using the metric Laplacian was formulated by Parfinov and Zapatrin in [22], where part of the approach developed in the first part of this work was presented in a more elementary form without the requirement of global hyperbolicity. Let us illustrate the ingredients pointed out above. Take p, q ∈ M. First suppose that p ≠ q and p ⪯ q, which means that q belongs to the causal future of p (i.e. the subset of M of the events r such that there is a causal future-directed curve from p to r). In that case, the Lorentzian distance from p to q is defined as d(p, q) := sup{L(γ) | γ ∈ Ωp,q}, Ωp,q denoting the set of all causal future-directed curves from p to q and L(γ) ≥ 0 being the length of γ. d(p, q) := 0 if either p = q or p ⋠ q. d enjoys an inverse triangular inequality if p ⪯ q ⪯ r: d(p, r) ≥ d(p, q) + d(q, r). d is a natural object in time-oriented Lorentzian manifolds, i.e. spacetimes, and it turns out to be continuous in globally-hyperbolic spacetimes. d plays a crucial role in Lorentzian geometry [1, 21] because one can re-build the topology, the differential structure, the metric tensor and the time orientation of the spacetime by using d only, as we shall see shortly. If N ⊂ M is open, a causal function on N is a continuous map f : N → R which does not decrease along every causal future-directed curve contained in N. C[µg](N) denotes the class of causal functions on N ⊂ M which are smooth almost everywhere in N. X denotes the class of all regions I in the spacetime M which are open, causally convex (i.e. if p, q belong to such a region, every future-directed causal curve from p to q also lies in the region) and such that Ī is compact, causally convex and ∂I has measure zero.
The Lorentzian equation which corresponds to (1) reads, in a globally-hyperbolic spacetime M, for p, q ∈ M with q in the causal future of p,
$$ d(p, q) = \inf\left\{ \langle f(q) - f(p) \rangle \;\middle|\; f \in C_{[\mu_g]}(\bar I),\ I \in X,\ p, q \in \bar I,\ \left\| [f, [f, \not\Delta]]^{-1} \right\|_{I} \le 1 \right\} , \qquad (4) $$
where ⟨α⟩ := max{0, α} for α ∈ R and $2\not\Delta := \Delta$, the latter being the Laplace–Beltrami–d’Alembert operator. || · ||S denotes the uniform norm of operators A : L²(S, µg) → L²(S, µg), µg being the measure on S ⊂ M naturally induced by the metric g of the spacetime. The restriction to a suitable class of compact sets Ī is useful to realize the events of the spacetime as pure states of unital C∗-algebras of functions containing the causal functions. This holds even though the manifold is not compact and these functions are, in general, not bounded on the whole manifold. Afterwards we analyze, from the point of view of the C∗-algebras, the ingredients above, showing that noncommutative generalizations are possible. In particular we introduce, in a suitable algebraic context, the generalizations of the causal ordering relation and of the Lorentzian distance. Specializing back to the commutative case, these generalizations give rise to a non-pointwise concept of event (compactly-supported probability measures on globally-hyperbolic spacetimes) preserving the notion of causal ordering relation and Lorentzian distance.
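To see why the double commutator in (4) acts as a constraint on the gradient of f, the following formal computation may help. It is only a sketch under stated assumptions: f is taken to be smooth and to act by multiplication on a smooth test function ψ, and domain questions are ignored. It is not taken from the paper's proof.

```latex
% Formal sketch (assumptions: f smooth, acting by multiplication on a smooth
% test function \psi; \Delta = \nabla_\mu \nabla^\mu; domains are ignored).
\begin{align*}
\Delta(f\psi) &= f\,\Delta\psi + 2\,g(\mathrm{d}f,\mathrm{d}\psi) + \psi\,\Delta f ,\\
[f,[f,\Delta]]\,\psi &= f^{2}\Delta\psi - 2f\,\Delta(f\psi) + \Delta(f^{2}\psi)
  = 2\,g(\mathrm{d}f,\mathrm{d}f)\,\psi .
\end{align*}
% With 2\not\Delta := \Delta this gives [f,[f,\not\Delta]] = g(df,df) as a
% multiplication operator, so the constraint in (4) amounts to
% |g(df,df)| \ge 1 almost everywhere.
%
% Example (assumption: 2D Minkowski space, g = -dt^2 + dx^2, f = t):
% g(df,df) = -1, and for q in the causal future of p
% f(q) - f(p) = \Delta t \ge \sqrt{\Delta t^{2} - \Delta x^{2}} = d(p,q),
% with equality when \Delta x = 0.
```

In this example the admissible function f = t bounds d(p, q) from above, in line with the infimum appearing in (4).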
1.4. Structure of the work
This paper is organized as follows. The remaining part of Sec. 1 contains basic definitions, notations and conventions used throughout the paper. In Sec. 2 we introduce the Lorentzian distance and the causal functions on a spacetime. More precisely (a) we present the basic properties of d, (b) we show that it completely determines the structure of the spacetime and (c) we prove some propositions necessary to generalize (1). In particular we prove a theorem concerning the almost-everywhere smoothness of d in globally-hyperbolic spacetimes. Section 3 is devoted to proving (4). Section 4 contains an algebraic analysis of the introduced mathematical structures and several generalizations. In particular (a) we introduce the concept of locus, which generalizes the concept of event (or point in noncompact Euclidean manifolds), and (b) we prove that loci reduce to compactly-supported (regular Borel) probability measures in the commutative case. Finally (c) we show that ⪯ and d can be extended into analogous mathematical objects related to the space of the loci which give rise to a noncommutative causality.
1.5. Basic definitions, notations and conventions
Throughout the work “iff” means “if and only if” and “smooth” means C∞. Concerning differentiable manifolds we assume the usual definitions. More precisely, an (n-dimensional) differentiable manifold M is a connected, Hausdorff, second countable topological space which is locally homeomorphic to R^n and is equipped with a C∞-differentiable structure. Concerning differentiable functions on nonopen sets we give the following definition. If M is a differentiable manifold, U ⊂ M is open and nonempty, and V ⊂ ∂U, C∞(U ∪ V) denotes the set of functions f : U ∪ V → R such that f↾U ∈ C∞(U) and, for every y ∈ V, each derivative of any order, computed in a coordinate patch in some open neighborhood Uy of y, can be extended into a continuous function in Uy ∩ (U ∪ V).
We assume that the reader knows the basic definitions and properties of manifolds equipped with (Lorentzian or Riemannian) metrics, the Levi–Civita connection and the geodesic flow. Section A.1 in Appendix A contains definitions and properties of the exponential map and the related mathematical machinery (convex normal neighborhoods). We direct the reader to [1, 14, 21, 23, 28] as general reference textbooks on spacetime structures. Let us summarize the basic definitions; further definitions used in the paper will be given before the relevant statements in the text. Appendix A contains a complete summary. A (smooth) Lorentzian manifold (M, g) is an n-dimensional (n ≥ 2) smooth manifold M with a smooth Lorentzian metric g (with signature (−, +, · · · , +)). We use the following terminology concerning the classification of vectors and co-vectors. A vector T ∈ TxM, T ≠ 0, is said to be spacelike, timelike or null if, respectively, gx(T, T) > 0, gx(T, T) < 0, gx(T, T) = 0. T is said to be causal if it is either timelike or null. The same terminology is used for co-vectors ω ∈ Tx∗M referring to ↑ω ∈ TxM, where gx(↑ω, ·) = ω.
We remind the reader that a Lorentzian manifold (M, g) is said to be time orientable if it admits a smooth nonvanishing vector field Z ∈ TM which is everywhere timelike. A time orientation, Ot, is then the choice of one of the two equivalence classes of smooth timelike vector fields Z with respect to the equivalence relation Z ∼ Z′ if and only if g(Z, Z′) < 0 everywhere. For each point p ∈ M, an orientation determines an analogous equivalence class of timelike vectors of TpM, Otp. With the given definitions, a causal vector (co-vector) T ∈ TpM (ω ∈ Tp∗M) is said to be future directed if gp(Z(p), T) < 0 (gp(Z(p), ↑ω) < 0) and past directed if gp(Z(p), T) > 0 (gp(Z(p), ↑ω) > 0). A spacetime (M, g, Ot) is a Lorentzian manifold (M, g) which is time orientable and equipped with a time orientation Ot; the points of M are also called events. To conclude we give the definition of causal curves. In a spacetime M, a piecewise C¹ curve γ (see Sec. A.5 for the detailed definition of piecewise C^k curve used in this work) is said to be timelike, spacelike, null, causal if its tangent vector γ̇ is respectively timelike, spacelike, null, causal. Moreover, the curve is said to be future (past) directed if its tangent vector γ̇ is future (past) directed.
2. Lorentzian Distance and Causal Functions
In this section we collect and review important notions and results in Lorentzian geometry, in particular focusing on the role of the Lorentzian distance. Part of these results are well-known but spread out in the literature. A relevant result proven in Sec. 2.1 concerns the almost-everywhere smoothness of the Lorentzian distance (Theorem 2.1). In Sec. 2.2 the interplay of the Lorentzian distance and the notion of causal function in a spacetime is investigated. Finally, a preliminary formulation of the functional formula of the Lorentzian distance is presented (Theorem 2.2) using the built-up machinery.
2.1. The role and properties of the Lorentzian distance in spacetimes
To define the Lorentzian distance it is necessary to recall the notion of Lorentzian length of a (causal) curve. As is well-known, the Lorentzian length L(γ) of a piecewise C¹ curve γ : [a, b] → M is
$$ L(\gamma) := \int_a^b \sqrt{\left| g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t)) \right|}\; dt . \qquad (5) $$
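As a concrete illustration of (5), the following sketch evaluates the Lorentzian length of piecewise-straight causal curves. It assumes flat Minkowski spacetime with signature (−, +, ..., +), where the integrand is constant on each straight segment; `segment_length` and `lorentzian_length` are hypothetical helper names.

```python
import math

# Minimal sketch, assuming flat Minkowski spacetime, signature (-, +, ..., +).
# A curve is a list of events (t, x1, ..., xk) joined by straight segments.

def segment_length(a, b):
    """Lorentzian length of the straight segment from a to b, following (5)."""
    dt = b[0] - a[0]
    dx2 = sum((v - u) ** 2 for u, v in zip(a[1:], b[1:]))
    # |g(gamma', gamma')| is constant along a straight segment
    return math.sqrt(abs(-dt * dt + dx2))

def lorentzian_length(points):
    """Sum of the segment lengths of a piecewise-straight curve."""
    return sum(segment_length(a, b) for a, b in zip(points, points[1:]))

straight = [(0.0, 0.0), (10.0, 0.0)]
broken = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]
print(lorentzian_length(straight))  # 10.0
print(lorentzian_length(broken))    # 8.0
```

The broken timelike curve is shorter than the straight one with the same endpoints, which is why the Lorentzian distance is defined through a supremum of lengths rather than an infimum.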
Obviously the definition does not depend on the chosen parametrization. It is convenient to extend the definition of Lorentzian length to continuous causal curves because several definitions and results of Lorentzian geometry found in the literature require the use of continuous causal curves. A continuous curve γ : I → M is said to be a continuous future-directed causal curve (see Sec. A.6) if the following requirement is fulfilled. For each t ∈ I, there is a neighborhood of t, It, and a convex normal neighborhood of γ(t), Ut, such that the following requirements are fulfilled. For t′ ∈ It\{t}, one has γ(t′) ≠ γ(t) and (a) there is
a future-directed causal (smooth) geodesic segment γ′ ⊂ Ut from γ(t) to γ(t′) if t′ > t or (b) there is a future-directed causal (smooth) geodesic segment γ′ ⊂ Ut from γ(t′) to γ(t) if t′ < t. Similar definitions hold concerning continuous future-directed timelike curves, by replacing “causal” with “timelike” in the definitions above. The definition of L(γ) is extended as follows [14, p. 214] to continuous future-directed causal curves γ. Suppose that γ, from p to q, is such that, for every open neighborhood Uγ of γ, there is a future-directed timelike piecewise C¹ curve γ′ from p to q; then define LUγ(γ) := sup L(γ′), varying γ′ in Uγ as said. Then L(γ) := inf LUγ(γ), where Uγ varies in the class of all open neighborhoods of γ. If γ does not fulfill the initial requirement then γ must be an unbroken null geodesic (see [14, p. 215]) and thus one defines L(γ) := 0 (see footnote c).
Remark. From now on a causal curve, either future-directed or past-directed, is supposed to be a continuous, respectively future-directed or past-directed, causal curve. Moreover continuous curves γ : I → M and γ′ : I′ → M are identified if there is an increasing homeomorphism h : I → I′ with γ′ ◦ h = γ.
Let us give the definition of Lorentzian distance. We remind the reader that, in a spacetime (M, g, Ot), if p, q ∈ M, p ⪯ q means that either p = q or there is a future-directed causal curve from p to q, whereas p ≺ q means that p ⪯ q and p ≠ q, and finally p ≺≺ q means that there is a future-directed timelike curve from p to q. (≺≺ and ⪯ are clearly transitive relations; moreover, if p, q, r ∈ M, p ≺≺ q and q ⪯ r entail p ≺≺ r, and similarly p ⪯ q and q ≺≺ r entail p ≺≺ r [23].)
Definition 2.1. Let (M, g, Ot) be a spacetime. If p, q ∈ M and Ωp,q denotes the class of the future-directed causal curves from p to q, the Lorentzian distance from p to q, d(p, q) ∈ [0, +∞) ∪ {+∞}, is [1, 21]
$$ d(p, q) := \begin{cases} \sup\{ L(\gamma) \mid \gamma \in \Omega_{p,q} \} & \text{if } p \prec q , \\ 0 & \text{if } p \not\prec q . \end{cases} \qquad (6) $$
Remarks. (1) By the given definition of L(γ), d(p, q) = sup{L(γ)} attains the same value if one restricts the range of γ to the piecewise C¹ curves of Ωp,q. (2) Differently from the Euclidean case, in general Ωp,q ≠ Ωq,p, and thus d(p, q) ≠ d(q, p).
The Lorentzian distance enjoys several relevant properties which will be useful later. Proposition 2.1 below presents the elementary properties of the Lorentzian distance in relation with the causal sets of a spacetime. From now on we use the following definitions of causal sets in a spacetime (M, g, Ot).
(Footnote c: A possibly equivalent definition can be given by noticing that a continuous future-directed causal curve γ satisfies a local Lipschitz condition (with respect to the coordinates of a sufficiently small neighborhood of each point of γ) and thus it is almost-everywhere differentiable. So one can define L(γ) using (5) too (see [1, p. 136]).)
The topological and
causal properties of these sets which are employed in the work are presented in Secs. A.9, A.11, A.12 and A.23. If S ⊂ M,
J+(S) := {q ∈ M | p ⪯ q for some p ∈ S} is the causal future of S,
J−(S) := {q ∈ M | q ⪯ p for some p ∈ S} is the causal past of S,
I+(S) := {q ∈ M | p ≺≺ q for some p ∈ S} is the chronological future of S,
I−(S) := {q ∈ M | q ≺≺ p for some p ∈ S} is the chronological past of S.
Moreover I(p, q) := I+(p) ∩ I−(q) and J(p, q) := J+(p) ∩ J−(q). p, q ∈ M are said to be time related if either I+(p) ∩ I−(q) ≠ ∅ or I−(p) ∩ I+(q) ≠ ∅, and causally related if either J+(p) ∩ J−(q) ≠ ∅ or J−(p) ∩ J+(q) ≠ ∅. Causally-related events p, q ∈ M, p ≠ q, which are not time related are called null related. S, S′ ⊂ M are said to be spatially separated if (J+(S) ∪ J−(S)) ∩ S′ = ∅ (which is equivalent to (J+(S′) ∪ J−(S′)) ∩ S = ∅). We remind the reader that a set S of a spacetime M is causally convex when J(p, q) ⊂ S if p, q ∈ S (see Sec. A.11 for properties of causally-convex sets and strongly-causal spacetimes). A spacetime is strongly causal when every event admits a fundamental set of open neighborhoods consisting of causally-convex sets. A spacetime is called chronological if there are no events p, q such that p ≺≺ q ≺≺ p (equivalently, it does not contain any closed future-directed timelike curve). Finally, a globally-hyperbolic spacetime (see also Secs. A.16–A.23 and the end of [28, Sec. 8.3] about possible equivalent definitions) is a strongly-causal spacetime (M, g, Ot) such that every J(p, q) is either empty or compact for each pair p, q ∈ M (see Secs. A.12–A.15 for further definitions and properties; here we only remind the reader that a globally-hyperbolic spacetime is both strongly causal and chronological).
Proposition 2.1. If (M, g, Ot) is a spacetime and p, q, r ∈ M:
(a) I+(p) = {q ∈ M | d(p, q) > 0}. Moreover, if p ≠ q and both d(p, q) and d(q, p) are finite, then either d(p, q) = 0 or d(q, p) = 0;
(b) if p ⪯ q ⪯ r, the inverse triangular inequality holds, that is
$$ d(p, r) \ge d(p, q) + d(q, r) ; \qquad (7) $$
(c) d is lower semicontinuous on M × M;
(d) if Up is a sufficiently small convex normal neighborhood of p, d(p, ·)↾Up∩J+(p) is finite, belongs to the class C∞(Up ∩ J+(p)) and, for all q ∈ Up ∩ J+(p),
$$ \sigma(p, q) = -\frac{1}{2}\, d(p, q)^2 , \qquad (8) $$
where σ(p, q) is one half the squared geodesic distance from p to q, also called Synge’s world function, defined by using the exponential map (see Sec. A.0 in Appendix A);
(e) if p ⪯ q and there is a curve γ ∈ Ωp,q with L(γ) = d(p, q) (i.e. γ is maximal), then γ can be re-parametrized to be a smooth geodesic.
If (M, g, Ot) is globally-hyperbolic it also holds that:
(f) $J^+(p) = \overline{\{q \in M \mid d(p, q) > 0\}}$;
(g) d is finite;
(h) d is continuous on M × M;
(i) if p ≺ q, there is a causal geodesic γ from p to q with L(γ) = d(p, q).
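Several items of Proposition 2.1 can be checked by hand in the simplest example. The sketch below assumes flat Minkowski spacetime with signature (−, +, ..., +), where the Lorentzian distance between causally related events is the proper time along the straight segment joining them; `lorentzian_distance` is an illustrative name, not notation from the paper.

```python
import math

# Minimal sketch, assuming flat Minkowski spacetime, signature (-, +, ..., +).

def lorentzian_distance(p, q):
    """d(p, q): proper time from p to q if q lies in the causal future of p, else 0."""
    dt = q[0] - p[0]
    dx2 = sum((v - u) ** 2 for u, v in zip(p[1:], q[1:]))
    if dt >= 0 and dt * dt >= dx2:
        return math.sqrt(dt * dt - dx2)
    return 0.0

p, q, r = (0.0, 0.0), (5.0, 3.0), (10.0, 0.0)
print(lorentzian_distance(p, r))                              # 10.0
print(lorentzian_distance(p, q) + lorentzian_distance(q, r))  # 8.0
print(lorentzian_distance(r, p))                              # 0.0
```

The first two lines illustrate the inverse triangular inequality (7), and the third shows that d is not symmetric, in agreement with item (a).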
Proof. Items (a), (c), (e), (g), (h) are proven in [1, Sec. 4.1]. (b) is a trivial consequence of the definition of d. Concerning (d), everything is a consequence of the smoothness of σ and of (8). The latter can be proven by noticing that the length from p of causal geodesic segments through p in a convex normal neighborhood is maximal [14, Sec. 4.5, Proposition 4.5.3] and using [1, Theorem 4.27]. (f) is a consequence of (a) and Sec. A.12. The proof of (i) can be found in [21, p. 411].
A very remarkable result of Lorentzian geometry is that the Lorentzian distance determines the whole, local and global (topological, differential, metric), structure of a spacetime, as summarized in Proposition 2.2.
Proposition 2.2. Let (M, g, Ot) be a spacetime with Lorentzian distance d and n := dim M.
(a) If M is strongly causal (in particular if M is globally-hyperbolic), its topology is generated by the sets {x ∈ M | d(p, x) · d(x, q) > 0} for all pairs p, q ∈ M with p ≺≺ q (we assume that 0 · ∞ = ∞ · 0 = 0).
(b) There is an atlas of M, {(Up, ϕp)}_{p∈M}, Up being an open neighborhood of p with coordinate maps given by ϕp : q ↦ (d(p1, q), . . . , d(pn, q)) ∈ R^n, p1, p2, . . . , pn being suitable events about p.
(c) For every pair of smooth vector fields X, Y and every event p ∈ M it holds that
$$ g_p(X_p, Y_p) = -\frac{1}{2} \lim_{q \to p,\; p \prec\prec q} X_q\big(Y_q(d(p, q)^2)\big) . \qquad (9) $$
(d) If M is chronological (in particular if M is globally-hyperbolic), Tp ∈ TpM is timelike future-directed if and only if d(p, expp(tTp)) > 0, t ∈ (0, u], for some u > 0.
(e) Let (M′, g′, O′t) be another spacetime with Lorentzian distance d′. If M is strongly causal (in particular if M is globally-hyperbolic) and f : M → M′ (not necessarily continuous) is surjective and d′(f(p), f(q)) = d(p, q) for all p, q ∈ M, then f is a diffeomorphism (and thus a fortiori a homeomorphism), preserves the metric, i.e. f∗g′ = g, and preserves the time orientation.
Proof. (a) See the end of Sec. A.11. (b) Let n := dim M. Fix p ∈ M and a sufficiently small convex normal neighborhood U of p. Take a basis of Tp∗M made of
future-directed co-vectors ωk, k = 1, . . . , n, consider n geodesics γk through p, with respective tangent vectors ↑ωk, and take n events pk ∈ γk ∩ U ∩ I−(p). The maps x ↦ d(pk, x) are smooth in a neighborhood of p by (d) of Proposition 2.1. Using that proposition and (A.1) one gets dd(pk, x)|p = βk ωk (there is no summation over k) for some reals βk ≠ 0. Since the co-vectors ωk are linearly independent, such a requirement is preserved by the co-vectors dd(pk, x) in a neighborhood of p and the maps x ↦ d(pk, x) define an admissible coordinate map about p. (c) In a Riemannian normal coordinate system centered on p, σ(p, q) = (1/2) g_ab(p) x_q^a x_q^b. Hence gp(Xp, Yp) = lim_{q→p} Xq(Yq(σ(p, q))) by direct computation. The limit does not depend on the curve used because q ↦ Xq(Yq(σ(p, q))) is continuous about p. Using a curve γ from p to some q0 ∈ I+(p) with γ\{p} ⊂ I+(p), Proposition 2.1(d) implies (9). (d) If Tp is timelike and future-directed, t ↦ expp(tTp) is a timelike future-directed curve, thus expp(tTp) ∈ I+(p) if t > 0 and the thesis is a consequence of Proposition 2.1(a). Conversely, if Tp ∈ TpM and d(p, expp(tTp)) > 0 when t ∈ (0, u] for some u > 0, then expp(tTp) ∈ I+(p) in that interval by Proposition 2.1(a). Taking t0 < u, t0 > 0 sufficiently small, there is a convex normal neighborhood Up containing p, q := expp(t0Tp) and expp(tTp) for t ∈ (0, t0]. [28, Theorem 8.1.2] implies that the unique geodesic in Up from p to q must be timelike and thus Tp is such. If Tp were past-directed, t ↦ expp(tTp) would be such, giving I+(p) ∩ I−(p) ≠ ∅, which violates the chronological condition. (e) One has to prove the injectivity of f only, because the proof of the remaining items is a direct consequence of (a)–(d). The preservation of the Lorentzian distance implies that p ≺≺ q in M if and only if f(p) ≺≺ f(q) in M′. Then suppose p ≠ q in M and f(p) = f(q). Let V be an open causally-convex neighborhood of p with q ∉ V.
Take q1, q2 ∈ V with q1 ≺≺ p ≺≺ q2. It holds that I+(q1) ∩ I−(q2) ⊂ V and thus q ∉ I+(q1) ∩ I−(q2). However f(q1) ≺≺ f(p) = f(q) ≺≺ f(q2) implies q1 ≺≺ q ≺≺ q2 and q ∈ I+(q1) ∩ I−(q2), which is a contradiction.
Remark. Item (e) can be made stronger (see [1, Theorem 4.17]) by proving that if (M′, g′, O′t) is another spacetime with Lorentzian distance d′, (M, g, Ot) is strongly causal and f : M → M′ (not assumed to be continuous) is surjective and, for some constant β > 0, d′(f(p), f(q)) = βd(p, q) for all p, q ∈ M, then f is a diffeomorphism and satisfies f∗g′ = βg.
We can state the first important technical result of this section in Theorem 2.1. The theorem concerns some features of the structure of the cut locus in Lorentzian geometry and establishes that the Lorentzian distance is almost-everywhere smooth if considered as a function of one of the two arguments. These properties, in turn, will be used to prove the functional formula of the Lorentzian distance (in particular they are useful to prove Proposition 2.3). To understand the statement of the theorem we remind the reader that a subset X of a manifold M is said to have measure zero if for every local chart (U, φ),
the set φ(U ∩ X) ⊂ R^dim(M) has Lebesgue measure zero. When M is endowed with a nondegenerate smooth metric g, it turns out that X ⊂ M has measure zero if and only if it has measure zero with respect to the positive complete Borel measure µg induced by g on M. Some further preliminary definitions and results concerning the nonspace-like cut locus are necessary. We use notations and definitions in [1, Chap. 9]. Consider p ∈ M, with M globally-hyperbolic. Let h be a complete Riemannian metric (see footnote d) on M. Define
UM := {v ∈ TM | h(v, v) = 1, g(v, v) ≤ 0, v is future directed},
UMp := {v ∈ UM | π(v) = p}.
If v ∈ UM, t ↦ cv(t), with t ∈ [0, a), denotes the unique geodesic starting from p = π(v) with initial tangent vector given by v and maximal domain. Finally define, for v ∈ UM,
s1(v) := sup{t > 0 | cv(t) is maximal from p to cv(t)}.
“cv(t) is maximal from p to cv(t)” means [1] that L(cv↾[0,t]) = d(p, cv(t)). Using Proposition 2.1(b), it follows that if a future-directed causal geodesic segment γ : [a, b] → M is maximal, then γ↾[a′,b′] is so for a ≤ a′ < b′ ≤ b. Notice that s1(v) > 0 in strongly-causal spacetimes, and thus in globally-hyperbolic spacetimes, because in these spacetimes every geodesic is maximal in a convex normal neighborhood containing the initial point [1]. It is known (see [1, Proposition 9.33]) that s1 is lower semicontinuous in globally-hyperbolic spacetimes and, if (a) the spacetime is globally-hyperbolic, (b) s1(v) is finite and (c) cv extends to [0, s1(v)], then s1 is continuous in v. Finally define
Γ+ns(p) := {s1(v)v | v ∈ UMp, s1(v) < +∞, cv extends to [0, s1(v)]} and C+(p) := exp(Γ+ns(p)).
The second definition is consistent because cv extends to [0, s1(v)] if and only if it is defined in some maximal domain [0, s1(v) + ε), ε > 0, and this is equivalent to saying that c_{s1(v)v} is defined in some maximal domain [0, 1 + ε/s1(v)). Therefore if v ∈ UMp, “s1(v) < +∞ and cv extends to [0, s1(v)]” is equivalent to “s1(v)v ∈ Up” and so Γ+ns(p) = {s1(v)v | v ∈ UMp, s1(v)v ∈ Up}. C+(p) is a subset of J+(p) by construction and it is called the future nonspace-like cut locus of p. If s1(v)v ∈ Γ+ns(p), exp(s1(v)v) is called the future cut point of p along cv. The past nonspace-like cut locus is defined similarly, with the obvious changes.
(Footnote d: A complete Riemannian metric exists on any differentiable Hausdorff second-countable manifold, as proven by Nomizu and Ozeki, Proc. Amer. Math. Soc. 12, 889–891.)
Everything can be re-stated for the past nonspace-like cut
locus with the necessary obvious replacements. By [1, Theorem 9.35], in globally-hyperbolic spacetimes, C+(p) is closed (and thus J+(p)\C+(p) is the union of the open set I+(p)\C+(p) and ∂I+(p)\C+(p) ⊂ ∂(I+(p)\C+(p))).
Theorem 2.1. Let (M, g, Ot) be a globally-hyperbolic spacetime and take any p ∈ M. The following statements hold:
(a) ∂I+(p) = ∂J+(p) = J+(p)\I+(p) and C+(p) ⊂ J+(p) are closed, without internal points, with measure zero;
(b) J+(p)\(C+(p) ∪ ∂J+(p)) = I+(p)\C+(p) is open and homeomorphic to R^dim(M);
(c) expp defines a diffeomorphism onto I+(p)\C+(p) with domain given by an open subset of TpM of the form Ap = {X ∈ TpM | X is timelike and future directed, 0 < |gp(X, X)| < λX for some λX > 0};
(d) d(p, ·)² belongs to C∞(J+(p)\C+(p)) and d(p, ·) belongs to C∞(I+(p)\C+(p));
(e) d(p, ·) satisfies the timelike eikonal equation for q ∈ I+(p)\C+(p),
$$ g_q\big({\uparrow} d_q d(p, q), {\uparrow} d_q d(p, q)\big) = -1 . $$
Proof. See Appendix C.
Remarks. (1) The statement and the proof of item (b) are known in the literature [1]. (2) C∞(J+(p)\C+(p)) is valid in the sense of Sec. 1.5. Indeed since C+(p) is closed, I+(p) is open and $J^+(p) = \overline{I^+(p)}$ (see Sec. A.12), one has that J+(p)\C+(p) = (I+(p)\C+(p)) ∪ (∂I+(p)\C+(p)) where I+(p)\C+(p) is open and ∂I+(p)\C+(p) ⊂ ∂(I+(p)\C+(p)). (3) Due to the possibility of reversing the time orientation while preserving global hyperbolicity, it turns out that, fixing the latter argument of d(p, q) and varying the former, one gets a function in C∞(J−(q)\C−(q)) and the analogues of items (a)–(e) above hold. Finally q ∈ C+(p) if and only if p ∈ C−(q) as a consequence of [1, Theorems 9.12 and 9.15].
2.2. Causal functions and Lorentzian distance
We introduce a lemma and a proposition necessary to generalize (1) to Lorentzian manifolds in terms of the Lorentzian distance.
To this end, we have to give some introductory definitions, in particular concerning so-called causal functions. The introduced machinery, together with the results achieved in Sec. 2.1, will allow us to present a preliminary version of the formula of the Lorentzian distance in globally-hyperbolic spacetimes (Theorem 2.2).
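Before moving on, the timelike eikonal equation of Theorem 2.1(e) can be checked numerically in the simplest example. The sketch assumes two-dimensional Minkowski spacetime with g = −dt² + dx² and p at the origin, where d(p, (t, x)) = √(t² − x²) inside the future cone and the cut locus is empty; the function names and the finite-difference step are illustrative choices.

```python
import math

# Minimal sketch, assuming 2D Minkowski spacetime, g = -dt^2 + dx^2, p = origin.

def d_from_origin(t, x):
    """Lorentzian distance from the origin, valid inside the future cone t > |x|."""
    return math.sqrt(t * t - x * x)

def eikonal_lhs(t, x, h=1e-6):
    """g(df, df) = -(df/dt)^2 + (df/dx)^2 for f = d(p, .), by central differences."""
    df_dt = (d_from_origin(t + h, x) - d_from_origin(t - h, x)) / (2 * h)
    df_dx = (d_from_origin(t, x + h) - d_from_origin(t, x - h)) / (2 * h)
    return -df_dt ** 2 + df_dx ** 2

for event in [(2.0, 0.5), (5.0, -3.0), (1.0, 0.0)]:
    print(event, eikonal_lhs(*event))  # each value should be close to -1
```

Analytically, ∂t d = t/d and ∂x d = −x/d, so that −(∂t d)² + (∂x d)² = (x² − t²)/d² = −1, which is what the finite differences approximate.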
Definition 2.2. Let (M, g, Ot) be a spacetime. Let N ⊂ M be such that N = A ∪ B where A is open and B ⊂ ∂A. A continuous function f : N → C is said to be essentially smooth on N if there is a closed set Cf ⊂ N with measure zero, such that f↾N\Cf is smooth. E[µg](N) indicates the class of such functions.
Definition 2.3. Let (M, g, Ot) be a spacetime. Let N ⊂ M. A continuous function f : N → R is either a causal function or a time function on N if, respectively, it is non-decreasing or increasing along every future-directed causal curve contained in N. C(N) and T(N) respectively denote the class of causal functions and the class of time functions on N. If N is taken as in Definition 2.2, C[µg](N) := E[µg](N) ∩ C(N), T[µg](N) := E[µg](N) ∩ T(N).
Remark. Notice that T(N) ⊂ C(N). Moreover, if N ⊂ M is taken as in Definition 2.2 and M is globally-hyperbolic, T(N) ∩ C∞(N) ≠ ∅ because a smooth time function exists on the whole manifold M (see Secs. A.13 and A.15). In general spacetimes C(N) ∩ C∞(N) ≠ ∅ because the constant functions are causal functions.
The following technical lemma and a proposition are useful in generalizing (1). The proposition states that, in suitable domains, d defines a natural causal/time function which is also essentially smooth.
Lemma 2.1. In a globally-hyperbolic spacetime (M, g, Ot) with Lorentzian distance d, take an open causally convex (Sec. A.11) set I ⊂ M such that ∂I has measure zero. If f ∈ C[µg](Ī) then df is either 0 or causal and past directed in an open set J ⊂ Ī with µg(J) = µg(Ī) (= µg(I)) and
$$ \operatorname{ess\,inf}\{ |d_z f| \mid z \in \bar I \} \le \inf\left\{ \frac{f(y) - f(x)}{d(x, y)} \;\middle|\; x, y \in \bar I,\ x \prec\prec y \right\} . \qquad (10) $$
Above, ≤ can be replaced by = if f ∈ T[µg](Ī).
Proof. See Appendix C.
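The content of (10) can be probed numerically in a toy case. The sketch below assumes two-dimensional Minkowski spacetime with g = −dt² + dx², interprets |df| as √|g(df, df)| (an assumption about the notation), and uses the hypothetical linear time function f(t, x) = 2t + x, for which |df| = √3 everywhere. It samples chronologically related pairs x ≺≺ y and records the smallest ratio (f(y) − f(x))/d(x, y).

```python
import math
import random

# Minimal sketch, assuming 2D Minkowski spacetime, g = -dt^2 + dx^2.
# The time function f and the sampling ranges are illustrative assumptions.

def lorentzian_distance(x, y):
    """d(x, y) for y in the chronological future of x, else 0."""
    dt, dx = y[0] - x[0], y[1] - x[1]
    return math.sqrt(dt * dt - dx * dx) if dt > abs(dx) else 0.0

def sampled_inf_ratio(f, trials=20000, seed=0):
    """Smallest sampled value of (f(y) - f(x)) / d(x, y) over pairs x << y."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(trials):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        dt = rng.uniform(0.1, 1.0)
        dx = rng.uniform(-0.99, 0.99) * dt  # keeps y inside the future cone of x
        y = (x[0] + dt, x[1] + dx)
        best = min(best, (f(*y) - f(*x)) / lorentzian_distance(x, y))
    return best

def f(t, x):
    return 2.0 * t + x  # df = 2 dt + dx, g(df, df) = -4 + 1 = -3

norm_df = math.sqrt(abs(-(2.0 ** 2) + 1.0 ** 2))
print(norm_df, sampled_inf_ratio(f))  # the sampled infimum stays above, and close to, sqrt(3)
```

For this f the ratio depends only on the velocity v = Δx/Δt and equals (2 + v)/√(1 − v²), whose minimum over |v| < 1 is √3 at v = −1/2, consistent with the equality case of the lemma for time functions.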
Proposition 2.3. Let (M, g, Ot) be a globally-hyperbolic spacetime, let d indicate the corresponding Lorentzian distance and, for each p ∈ M, define the functions fp(·) := d(p, ·) and hp(·) := −d(·, p). It holds that
(a) fp, hp ∈ E[µg](M);
(b) fp↾I+(p) ∈ T[µg](I+(p)) and hp↾I−(p) ∈ T[µg](I−(p));
(c) fp↾N ∈ C[µg](N) and hp↾N ∈ C[µg](N) for every N ⊂ M as in Def. 2.2.
Proof. We prove the thesis for fp, the other case being analogous. (a) is a direct consequence of Theorem 2.1 and the fact that fp(x) = 0 if x ∉ J+(p). (b) Let γ ⊂ I+(p) be a causal future-directed curve. Take x, y ∈ γ with x = γ(t), y = γ(t′) and t′ > t. We want to show that fp(x) < fp(y), i.e. that d(p, y) ≤ d(p, x) is not possible. Notice that y ≠ x because the spacetime is globally-hyperbolic and thus
causal; in fact we have p ≺≺ x ≺ y (and thus p ≺≺ y). Suppose that d(p, x) ≥ d(p, y). By Proposition 2.1(b) it must also hold that d(p, y) ≥ d(p, x) + d(x, y). Putting these together and using d(x, y) ≥ 0 one gets 0 ≤ d(x, y) ≤ d(p, y) − d(p, x) ≤ 0.
The only chance is d(x, y) = 0 and d(p, y) = d(p, x). Since the spacetime is globallyhyperbolic, there must be a future-directed maximal null geodesic γ2 from x to y by Proposition 2.1(a). By the same item there must be a timelike maximal futuredirected geodesic γ1 from p to x. γ1 ∗ γ2 is a causal future-directed curve from p to y. Moreover it holds that L(γ1 ∗ γ2 ) = d(p, x) + 0 = d(p, y). By Proposition 2.1(e), γ1 ∗ γ2 can be re-parametrized into a maximal geodesic from p to y which must be timelike, since d(p, y) > 0, y being in I + (p). This is impossible since γ2 is null. (c) If N ∩ J + (p) = ∅ the proof is trivial since fp is constant on N . Suppose that N ∩ J + (p) 6= ∅ and that γ ⊂ N is a future-directed causal curve with γ(u) ∈ J + (p) for some u, the remaining cases being trivial. In these hypotheses γ(u0 ) ∈ J + (p) for u0 > u because of Sec. A.7. Then there are various cases to be analyzed for t < t0 where we use the fact that fp vanishes outside I + (p) by Proposition 2.1. (i) If γ(t), γ(t0 ) 6∈ J + (p), the thesis holds because 0 = fp (γ(t)) ≤ fp (γ(t0 )) = 0. (ii) If γ(t) 6∈ J + (p) and γ(t0 ) ∈ J + (p) the thesis holds because 0 = fp (γ(t)) ≤ fp (γ(t0 )) ≥ 0. (iii) If γ(t), γ(t0 ) ∈ I + (p), the thesis holds by (a). (iv) γ(t), γ(t0 ) ∈ ∂I + (p) = ∂J(p). In that case fp (γ(t)) = fp (γ(t0 )) = 0 by Proposition 2.1(a) and (f). (v) γ(t) ∈ ∂I + (p) and γ(t0 ) ∈ I + (p), in that case 0 = γ(t) < γ(t0 ) by Proposition 2.1(a) and (f). The case γ(t0 ) ∈ ∂I + (p) and γ(t) ∈ I + (p) is forbidden because p ≺ ≺ γ(t) γ(t0 ) 0 implies p ≺ ≺ γ(t ) by the remark in Sec. A.7. The last technical proposition necessary to state the preliminary version of the functional formula of the Lorentzian distance concerns the interplay of relativelycompact causally-convex sets in globally-hyperbolic spacetimes and essentially smooth causal functions. In Secs. A.16–A.23 of Appendix A the definition of Cauchy surface and the relevant properties of these surfaces are given. 
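The monotonicity of $f_p = d(p, \cdot)$ stated in Proposition 2.3(c) can be spot-checked numerically in a flat toy model. The sketch below is an illustration only, not the author's construction: it assumes 1+1 Minkowski spacetime, where the Lorentzian distance has the closed form used in `lorentz_distance`, and the helper names and the sampled curve are hypothetical choices made for this example.

```python
import math

def lorentz_distance(p, q):
    """Lorentzian distance d(p, q) in 1+1 Minkowski spacetime (an assumption).

    Events are (t, x) pairs. The value is sqrt(dt^2 - dx^2) when q lies in
    the causal future J^+(p), and 0 otherwise.
    """
    dt, dx = q[0] - p[0], q[1] - p[1]
    if dt >= abs(dx):
        return math.sqrt(dt * dt - dx * dx)
    return 0.0

def sample_causal_curve(start, velocities, dt=0.1):
    """Sample a piecewise-linear future-directed causal curve (|v| <= 1)."""
    points = [start]
    for vel in velocities:
        if abs(vel) > 1.0:
            raise ValueError("causal curves need |velocity| <= 1")
        t, x = points[-1]
        points.append((t + dt, x + vel * dt))
    return points

p = (0.0, 0.0)
# Hypothetical test curve: it starts outside J^+(p) and later enters I^+(p).
curve = sample_causal_curve((-1.0, 0.5), [0.9, -0.8, 0.3, 1.0, -1.0, 0.0] * 5)
values = [lorentz_distance(p, event) for event in curve]  # f_p along the curve

# f_p should never decrease along the curve (small tolerance for rounding).
non_decreasing = all(a <= b + 1e-12 for a, b in zip(values, values[1:]))
print(non_decreasing, values[0], round(values[-1], 3))
```

The curve mixes timelike and null segments on purpose, so that the check touches several of the cases (i)–(v) distinguished in the proof.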
An important result of Lorentzian geometry (see Sec. A.20) states that a spacetime $(M, g, O_t)$ is globally-hyperbolic if and only if it admits a Cauchy surface. This statement can be adopted as an equivalent definition of a globally-hyperbolic spacetime (see the remark at the end of [28, Sec. 8.3] for a proof of the equivalence of the various definitions of global hyperbolicity). In a globally-hyperbolic spacetime $M$, if $S \subset M$ is a smooth Cauchy surface and $p \in J^+(S)$, $I(S, p)$ and $J(S, p)$ respectively denote $I^-(p) \cap I^+(S)$ and $J^-(p) \cap J^+(S)$. One can straightforwardly prove that $I(S, p)$ is not empty if and only if $p \in I^+(S)$. It is not very difficult to show that $I(S, p)$ and $J(S, p)$ are causally convex. Section A.8 implies that $I(S, p)$ is open and $I(S, p) \subset J(S, p)$. The sets $I(p, S)$ and $J(p, S)$ enjoy analogous properties.

Proposition 2.4. Let $(M, g, O_t)$ be a globally-hyperbolic spacetime and let $\mathcal{X}$ denote the class of open, nonempty, causally-convex subsets $I$ of $M$ such that $\bar{I}$ is compact and causally convex and $\partial I$ has measure zero. The following statements hold.
(a) The class $\mathcal{X}$ is a covering of $M$, i.e. $\bigcup \mathcal{X} = M$, and defines a direct set with respect to the set-inclusion partial-ordering relation, i.e. if $A, B \in \mathcal{X}$ there is $C \in \mathcal{X}$ such that $A \cup B \subset C$.
(b) If $A \in \mathcal{X}$, $\mathcal{T}_{[\mu_g]}(A) \neq \emptyset$ and $\mathcal{C}_{[\mu_g]}(\bar{A}) \neq \emptyset$.
(c) If $p, q \in M$, $p \preceq q$ if and only if $f(p) \le f(q)$ for all $f \in \mathcal{C}_{[\mu_g]}(\bar{I})$ and $I \in \mathcal{X}$ such that $p, q \in \bar{I}$.
(d) (i) If $S \subset M$ is a smooth Cauchy surface for $M$ and either $p \in I^+(S)$ or $p \in I^-(S)$, then respectively $I(S, p) \in \mathcal{X}$ and $\overline{I(S, p)} = J(S, p)$, or $I(p, S) \in \mathcal{X}$ and $\overline{I(p, S)} = J(p, S)$. (ii) If $p \in M$, there is a fundamental system of neighborhoods of $p$ made of sets $I(r, s) \in \mathcal{X}$ with $\overline{I(r, s)} = J(r, s)$.

Proof. See Appendix B.

Now we are able to state and prove the second important theorem of this section, which is nothing but a preliminary version of the functional formula of the Lorentzian distance. From now on we use the following notation: if $T \in T_pM$, $|T| := \sqrt{|g_p(T, T)|}$; similarly, if $\omega \in T_p^*M$, $|\omega| := \sqrt{|g_p(\uparrow\omega, \uparrow\omega)|}$.

Theorem 2.2. Let $(M, g, O_t)$ be a globally-hyperbolic spacetime and $p, q \in M$. Defining $\langle\alpha\rangle := \max\{0, \alpha\}$ for all $\alpha \in \mathbb{R}$, it holds that

\[ d(p, q) = \inf\left\{ \langle f(q) - f(p) \rangle \;:\; f \in \mathcal{C}_{[\mu_g]}(\bar{I}),\ I \in \mathcal{X},\ p, q \in \bar{I},\ \operatorname{ess\,inf}_{\bar{I}} |df| \ge 1 \right\} . \tag{11} \]

Proof. Define $\mu(p, q) := \inf\{ \langle f(q) - f(p) \rangle : f \in \mathcal{C}_{[\mu_g]}(\bar{I}),\ I \in \mathcal{X},\ p, q \in \bar{I},\ \operatorname{ess\,inf}_{\bar{I}} |df| \ge 1 \}$. We want to show that $\mu(p, q) = d(p, q)$. First consider the case $p \preceq q$. To this end consider the map $f_p : x \mapsto d(p, x)$, where $x \in I_{f_p}$ with $I_{f_p} = I(p, S)$, $S$ being a smooth Cauchy surface with $p, q \in I^-(S)$. Theorem 2.1 and Proposition 2.3 say that such an $f_p$ can be used to evaluate $\mu(p, q)$ because all of the necessary requirements are fulfilled. We trivially have $0 \le d(p, q) = f_p(q) - f_p(p) = \langle f_p(q) - f_p(p) \rangle$ and thus $\mu(p, q) \le d(p, q)$. To conclude, it is sufficient to show that $\mu(p, q) \ge d(p, q)$. By Lemma 2.1, if $I \in \mathcal{X}$, $f \in \mathcal{C}_{[\mu_g]}(\bar{I})$ and $\operatorname{ess\,inf}_{\bar{I}} |df| \ge 1$, we have

\[ \inf\left\{ \frac{f(y) - f(x)}{d(x, y)} \;:\; x, y \in \bar{I},\ x \prec\prec y \right\} \ge 1 . \]

Therefore, in $\bar{I}$, $x \prec\prec y$ entails $\langle f(y) - f(x) \rangle \ge d(x, y)$. The inequality holds also if $x \preceq y$ because, by Proposition 2.1(a) and (f), if $x \preceq y$ and $x \prec\prec y$ is false, it must be $d(x, y) = 0$. In that case $\langle f(y) - f(x) \rangle \ge d(x, y)$ is trivially true. In particular, if $p, q \in \bar{I}$ and $p \preceq q$, then $0 \le d(p, q) \le \langle f(q) - f(p) \rangle$. By the definition of $\mu$, this implies $\mu(p, q) \ge d(p, q)$. Let us consider the case $q \preceq p$. Similarly to above, take $f_q : x \mapsto d(q, x)$ in some $J(q, S)$ with $p, q \in I^-(S)$. $f_q$ can be used to compute $\mu(p, q)$, obtaining
$f_q(q) - f_q(p) \le 0$, which implies $\langle f_q(q) - f_q(p) \rangle = 0$ and thus $\mu(p, q) = 0$ because $0 \le \mu(p, q) \le \langle f_q(q) - f_q(p) \rangle$ by definition. Finally consider the case of $p$ and $q$ spatially separated. In that case it is possible to find (see below) two sufficiently small regions $I(x, y)$, $I(x', y')$ with $p \in I(x, y)$, $q \in I(x', y')$ and such that $\overline{I(x, y)} = J(x, y)$ and $\overline{I(x', y')} = J(x', y')$ are spatially separated. We conclude that $A := I(p, y) \cup I(q, y') \in \mathcal{X}$. Then $x \mapsto f(x) := d(p, x) + d(q, x)$ defines an element of $\mathcal{C}_{[\mu_g]}(\bar{A})$ and satisfies $g(df, df) = -1$ a.e. by construction, hence it can be used to evaluate $\mu(p, q)$, producing $\mu(p, q) = 0 = d(p, q)$ because $\langle f(q) - f(p) \rangle = \langle 0 - 0 \rangle = 0$ and $0 \le \mu(p, q) \le \langle f(q) - f(p) \rangle$.

Let us prove the existence of $I(x, y)$, $I(x', y')$ with the properties above. Since $\{q\} \cap (J^+(p) \cup J^-(p)) = \emptyset$ and $J^+(p) \cup J^-(p)$ is closed (Sec. A.12), there is a neighborhood $V$ of $q$ which satisfies $V \cap (J^+(p) \cup J^-(p)) = \emptyset$. As the spacetime is strongly causal, $V$ can be fixed with the form $I(x', y')$. By a suitable restriction (Sec. A.8) it is possible to fix $J(x', y')$ such that $q \in I(x', y')$ and $J(x', y') \cap (J^+(p) \cup J^-(p)) = \emptyset$. This is equivalent to $\{p\} \cap (J^+(J(x', y')) \cup J^-(J(x', y'))) = \emptyset$. Section A.12 implies that $J^+(J(x', y')) \cup J^-(J(x', y'))$ is closed because, since the spacetime is globally-hyperbolic, $J(x', y')$ is compact. Proceeding as above, one can find $I(x, y)$ such that $p \in I(x, y)$ and $J(x, y) \cap (J^+(J(x', y')) \cup J^-(J(x', y'))) = \emptyset$. We have proven that there are two regions $I(x, y)$, $I(x', y')$ with $p \in I(x, y)$, $q \in I(x', y')$ and such that $J(x, y)$, $J(x', y')$ are spatially separated.

3. The Functional Formula of the Lorentzian Distance

3.1. Laplace–Beltrami–d'Alembert operator and the net of Hilbert spaces

The results achieved in Sec. 2 allow us to generalize (1) in a globally-hyperbolic spacetime $(M, g, O_t)$ using the Lorentzian distance $d$. The procedure consists of a translation of the statement of Theorem 2.2, Eq. (11) in particular, in terms of operators. To this end, a preliminary discussion on the remaining ingredients (operators) which appear in (4) is necessary.

Consider the class of Hilbert spaces $L^2(\bar{I}, \mu_g)$, $I \in \mathcal{X}$. These spaces are naturally isomorphic to closed subspaces of $L^2(M, \mu_g)$. $\|\cdot\|_{L(L^2(\bar{I}))}$ denotes the uniform operator norm in the corresponding $L^2(\bar{I}, \mu_g)$. In those spaces three classes of useful operators can be defined: the operators $\Delta_I$, which are obtained by means of a suitable restriction of the Laplace–Beltrami–d'Alembert operator; the functions $f \in \mathcal{C}_{[\mu_g]}(\bar{I})$, viewed as multiplicative operators; and the commutators $[f, [h, \Delta_I]]$.

Definition 3.1. Let $(M, g, O_t)$ be a globally-hyperbolic spacetime. Referring to the notations above, the Laplace–Beltrami–d'Alembert operator on $L^2(M, \mu_g)$ is

\[ \Delta : C_0^\infty(M) \to L^2(M, \mu_g) , \]

with $\Delta := \nabla_\mu \nabla^\mu$ in local coordinates, $\nabla$ denoting the Levi–Civita covariant derivative. $\Delta_I$ denotes the restriction of $\Delta$ to the linear manifold $C^\infty(\bar{I})$, which is dense in $L^2(\bar{I}, \mu_g)$, $I \in \mathcal{X}$.
As a general remark we notice that $\Delta$ is densely defined and symmetric, and admits self-adjoint extensions because it commutes with the complex conjugation; conversely, every $\Delta_I$ is not symmetric, because it is not Hermitean in the considered domain owing to nonvanishing boundary terms. Let us pass to consider the causal functions and commutators. Every $f \in \mathcal{C}_{[\mu_g]}(\bar{I})$ ($I \in \mathcal{X}$) can be seen as a multiplicative (self-adjoint) operator in $L^2(\bar{I}, \mu_g)$ with domain given by the whole space $L^2(\bar{I}, \mu_g)$. The commutator $[f, [h, \Delta_I]]$ is well defined as an operator in $L^2(\bar{I}, \mu_g)$ with the domain and the properties stated below. A remarkable step which permits us to translate (11) into (4) is the identity established by Eq. (12) in item (b) below. In the following, if $A$ is an operator in a Hilbert space with scalar product $(\cdot, \cdot)$, $A \le \alpha\mathbb{I}$ means $\alpha\mathbb{I} - A \ge 0$, i.e. $(\Psi, (\alpha\mathbb{I} - A)\Psi) \ge 0$ for all $\Psi$ in the domain of $A$.

Lemma 3.1. In a globally-hyperbolic spacetime $(M, O_t, g)$ take $I \in \mathcal{X}$ and $f, h \in \mathcal{C}_{[\mu_g]}(\bar{I})$. Let $D_{I,f,h} := C_0^\infty(I \setminus (S_f \cup S_h))$, $S_t$ being the set of singular points of $t \in \mathcal{T}_{[\mu_g]}(I)$.
(a) $D_{I,f,h} \subset L^2(\bar{I}, \mu_g)$ is a dense linear manifold, invariant under each of $f$, $h$, $\Delta_I$.
(b) $\Delta_I$ and $[f, [h, \Delta_I]]$ are symmetric on $D_{I,f,h}$; the latter operator is also essentially self-adjoint on $D_{I,f,h}$ and

\[ [f, [h, \Delta_I]] = 2 g(\uparrow df, \uparrow dh) \quad \text{almost everywhere in } \bar{I} . \tag{12} \]

(c) The following equivalent relations hold: (i) $\sigma(\overline{[f, [h, \Delta_I]]}) \subset (-\infty, 0]$, (ii) $[f, [h, \Delta_I]] \le 0$ on $D_{I,f,h}$, (iii) $\overline{[f, [h, \Delta_I]]} \le 0$.

Proof. (a) It is obvious that $D_{I,f,h}$ is a linear manifold in $L^2(\bar{I}, \mu_g)$. It is also dense therein because, as $S_f \cup S_h$ is closed, $I \setminus (S_f \cup S_h)$ is open and $C_0^\infty(I \setminus (S_f \cup S_h))$ is dense in $L^2(I \setminus (S_f \cup S_h), \mu_g)$, which coincides with $L^2(\bar{I}, \mu_g)$ because $S_f \cup S_h \cup \partial I$ has measure zero. The invariance properties can be proven by direct inspection. (b) $\Delta_I$ restricted to the linear manifold $D_{I,f,h}$ is Hermitean by construction (notice that $I \setminus (S_f \cup S_h)$ is open) and thus it is symmetric too because the domain is dense. (12) can be proven by direct inspection on $I \setminus (S_f \cup S_h)$. $[f, [h, \Delta_I]] = 2 g(\uparrow df, \uparrow dh)$ entails the Hermiticity (and thus the symmetry, the domain being dense) because $g(\uparrow df, \uparrow dh)$ is a real measurable function which acts as a multiplicative operator. However, the symmetry also follows from standard properties of the commutator and the symmetry of the operators $f$, $h$, $\Delta_I$. The essential self-adjointness of $[f, [h, \Delta_I]]$ on $D_{I,f,h}$ is assured by Nelson's theorem [25], proving that $D_{I,f,h}$ is made of analytic vectors. The proof is immediate using the fact that $g(\uparrow df, \uparrow dh)$ is smooth and thus bounded when restricted to any compact set contained in $I \setminus (S_f \cup S_h)$. (c) By Lemma 2.1, $df$ and $dh$ are almost everywhere causal and past directed where they do not vanish, therefore $[f, [h, \Delta_I]] = 2 g(\uparrow df, \uparrow dh) \le 0$ almost everywhere. In turn this entails (ii), namely $(\Psi, [f, [h, \Delta_I]]\Psi) \le 0$ for all $\Psi \in D_{I,f,h}$. Let us prove the equivalence of (i), (ii) and (iii). The unique self-adjoint extension of $[f, [h, \Delta_I]]$
coincides with the closure of the same operator and thus (ii) implies (iii). Moreover (iii) implies (ii) trivially. Using the spectral measure of $\overline{[f, [h, \Delta_I]]}$ one trivially sees that (i) is equivalent to (iii).

3.2. The functional formula of the Lorentz distance

To conclude, we can state and prove the formula (4) which generalizes (1) in globally-hyperbolic spacetimes.

Theorem 3.1. Let $(M, g, O_t)$ be a globally-hyperbolic spacetime with Lorentzian distance $d$ and define $\slashed{\Delta}_I := \frac{1}{2}\Delta_I$ and $\langle\alpha\rangle := \max\{0, \alpha\}$ if $\alpha \in \mathbb{R}$. The Lorentzian distance of $p, q \in M$ can be computed as follows:

\[ d(p, q) = \inf\left\{ \langle f(q) - f(p) \rangle \;:\; f \in \mathcal{C}_{[\mu_g]}(\bar{I}),\ I \in \mathcal{X},\ p, q \in \bar{I},\ \left\| \overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1} \right\|_{L(L^2(\bar{I}))} \le 1 \right\} , \tag{13} \]

where the condition $\| \overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1} \|_{L(L^2(\bar{I}))} \le 1$ (which includes the requirement on the existence of $\overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1}$) can be replaced by one of the following equivalent requirements:

\[ [f, [f, \slashed{\Delta}_I]] \le -\mathbb{I} \quad (\text{on } D_{I,f,f}) , \tag{14} \]

\[ \overline{[f, [f, \slashed{\Delta}_I]]} \le -\mathbb{I} , \tag{15} \]

\[ \sigma\left(\overline{[f, [f, \slashed{\Delta}_I]]}\right) \subset (-\infty, -1] . \tag{16} \]

Proof. First we show that under the assumption $[f, [f, \slashed{\Delta}_I]] \le 0$ (which holds by Lemma 3.1(c) as $f \in \mathcal{C}_{[\mu_g]}(\bar{I})$), the four requirements (14)–(16) and (R): "$\overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1}$ exists and $\| \overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1} \|_{L(L^2(\bar{I}))} \le 1$", are equivalent. The proof of the equivalence of (14)–(16) is essentially the same as that used to prove the equivalence of the analogous three conditions in Lemma 3.1(c); we leave the details to the reader. Using the spectral representation of $\overline{[f, [f, \slashed{\Delta}_I]]}$, and viewing $\overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1}$ as a spectral function of the former, (16) implies (R) straightforwardly. On the other hand, using the spectral theorem for $\overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1}$, (R) implies that $\sigma(\overline{[f, [f, \slashed{\Delta}_I]]}) \subset (-\infty, -1] \cup [1, +\infty)$ (use $\sigma(A) \subset [-\|A\|, \|A\|]$ and $\sigma(A^{-1}) \subset \{1/\lambda \mid \lambda \in \sigma(A) \setminus \{0\}\}$ provided $0 \notin \sigma_p(A)$, this being our case when $A = \overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1}$ because $A$ admits an inverse by construction). Then $[f, [f, \slashed{\Delta}_I]] \le 0$, which is equivalent to $\sigma(\overline{[f, [f, \slashed{\Delta}_I]]}) \subset (-\infty, 0]$ by Lemma 3.1(c), entails $\sigma(\overline{[f, [f, \slashed{\Delta}_I]]}) \subset (-\infty, -1]$, which is (16).

To conclude and prove (13) we reduce to the expression for $d$ given in Theorem 2.2. The condition $\operatorname{ess\,inf}_{\bar{I}} |df| \ge 1$ which appears in the thesis of Theorem 2.2 is equivalent to $\operatorname{ess\,inf}_{\bar{I}} |df|^2 \ge 1$ which, in turn, is equivalent to

\[ \operatorname{ess\,sup}\left\{ |g_x(\uparrow df, \uparrow df)|^{-1} \;:\; x \in \bar{I} \right\} \le 1 . \tag{17} \]

Using the function $g_x(\uparrow df, \uparrow df)^{-1} = -|g_x(\uparrow df, \uparrow df)|^{-1}$ as a multiplicative (self-adjoint) operator on the whole space $L^2(\bar{I}, \mu_g)$, (17) can equivalently be re-written
\[ \left\| g_x(\uparrow df, \uparrow df)^{-1} \right\|_{L(L^2(\bar{I}))} \le 1 . \tag{18} \]

On the other hand, since $g_x(\uparrow df, \uparrow df)^{-1} \cdot g_x(\uparrow df, \uparrow df) = 1$ a.e. and $g_x(\uparrow df, \uparrow df) = [f, [f, \slashed{\Delta}_I]]$ a.e., we also have

\[ [f, [f, \slashed{\Delta}_I]] \circ g_x(\uparrow df, \uparrow df)^{-1} = \mathbb{I}_{L^2(\bar{I})} , \tag{19} \]

\[ g_x(\uparrow df, \uparrow df)^{-1} \circ [f, [f, \slashed{\Delta}_I]] = \mathbb{I}_{D_{I,f,f}} . \tag{20} \]

Notice that the closure of $[f, [f, \slashed{\Delta}_I]]$ is an operator because $[f, [f, \slashed{\Delta}_I]]$ is essentially self-adjoint (Lemma 3.1(b)); moreover $g_x(\uparrow df, \uparrow df)^{-1}$ is bounded by (18). These two facts together imply that $[f, [f, \slashed{\Delta}_I]]$ can be replaced by $\overline{[f, [f, \slashed{\Delta}_I]]}$ in both the identities above (also replacing $\mathbb{I}_{D_{I,f,f}}$ with the identity operator on the domain of $\overline{[f, [f, \slashed{\Delta}_I]]}$). Then, the uniqueness of the inverse operator implies that (18) is nothing but $\| \overline{[f, [f, \slashed{\Delta}_I]]}^{\,-1} \|_{L(L^2(\bar{I}))} \le 1$.

Remark. Theorem 3.1 holds if one replaces $M$ with a vector fiber bundle $F \to M$ equipped with a positive Hermitean fiber-scalar product, and uses a second-order differential operator working on compactly-supported almost-everywhere smooth sections, locally given by

\[ \slashed{\Delta}_{(X,V)} = \frac{1}{2}\left[ (\nabla^\mu - iX^\mu)(\nabla_\mu - iX_\mu) + V \right] . \]

$X$ is any smooth Hermitean $SU(N)$-connection field, $V$ defining a Hermitean linear map $V_x : F_x \to F_x$ on each fiber $F_x$, $x \in M$. This is because the identity (12) is preserved:

\[ [h, [f, \slashed{\Delta}_{(X,V)}]] = g(\uparrow dh, \uparrow df)\,\mathbb{I} , \tag{21} \]

where $\mathbb{I}$ is the fiber identity.

4. The Algebraic Point of View: Generalizations Towards a Lorentzian Noncommutative Geometry

As found in Sec. 3, a generalization of the functional identity for the Riemannian distance exists in globally-hyperbolic spacetimes. Here, we shall not attempt to give a complete investigation of noncommutative Lorentzian causal structures, but we try to extract the algebraic content from the structure involved in the commutative case, obtaining generalizations of the causal structure in both the commutative and noncommutative case. In particular we present a set of five axioms on noncommutative causality which give a straightforward generalization of the causal structure of globally-hyperbolic spacetimes. We stress that there is no guarantee for the minimality of the presented set of axioms.

4.1. Algebraic ingredients

Assume that $(M, O_t, g)$ is a globally-hyperbolic spacetime and adopt all the notations and definitions given in Secs. 1–3 (including Appendix A). In particular we
focus attention on the ingredients used to write down (4) from the point of view of $C^*$-algebra theory. A relevant mathematical object is the net of Hilbert subspaces $H = \{H_I \mid I \in \mathcal{X}\}$, where $H_I = L^2(\bar{I}, \mu_g)$. $H$ enjoys several properties induced by the properties of the class of subsets $\mathcal{X}$ defined in Proposition 2.4. In the following, $\le$, used between elements of $\mathcal{X}$, indicates the partial ordering relation on $\mathcal{X}$ given by the set-inclusion relation. $(\mathcal{X}, \le)$ is a direct set as shown in Proposition 2.4(a). We have a consequent trivial proposition concerning the elements of $H$.

Proposition 4.1. Referring to the given definitions and notations,
(a) for any pair $I, J \in \mathcal{X}$ with $I \le J$, $H_I \subset H_J$. More precisely, there is a Hilbert-space isomorphism from $H_I$ onto a (closed) subspace of $H_J$;
(b) $H$ is a direct set with respect to that inclusion relation. More precisely, for any pair $I, J \in \mathcal{X}$ there is $K \in \mathcal{X}$ with $I, J \le K$ such that $H_I + H_J \subset H_K$.

A second set of relevant mathematical objects is given as follows. An elementary computation proves that if $f \in C(\bar{I})$, $C(\bar{I})$ denoting the commutative unital $C^*$-algebra of the continuous complex functions on $\bar{I}$, then $\|f\|_\infty = \|O_f\|_{L(H_I)}$, where $O_f$ is the multiplicative operator $O_f h := f \cdot h$ for all $h \in H_I$. Moreover the involution in $C(\bar{I})$, i.e. the complex conjugation $\bar{\cdot}$, is equivalent to the involution in $L(H_I)$, that is the Hermitean conjugation $\cdot^*$. Therefore $C(\bar{I})$ can be viewed as a subalgebra of the $C^*$-algebra of all bounded operators on $H_I$, $L(H_I)$. From here on we use the following notation: $A_0 := \{A_I\}_{I \in \mathcal{X}}$, where $A_I$ denotes the commutative unital normed $*$-algebra containing all the multiplicative operators $O_f$, $f \in \mathcal{E}_{[\mu_g]}(\bar{I})$. Moreover $A := \{\overline{A_I}\}_{I \in \mathcal{X}}$, where $\overline{A_I}$ indicates the $C^*$-algebra given by the Banach completion of $A_I$.

Lemma 4.1. Referring to the given definitions and notations, if $I \in \mathcal{X}$, $\overline{A_I}$ is (isometrically) isomorphic to the $C^*$-algebra of the continuous functions on $\bar{I}$, $C(\bar{I})$.

Proof. $C^\infty(\bar{I}) \subset A_I$. $C^\infty(\bar{I})$ is $\|\cdot\|_\infty$-dense in $C(\bar{I})$ by Stone–Weierstrass' approximation theorem because $C^\infty(\bar{I})$, and thus the closed sub $*$-algebra of $C(\bar{I})$, $\overline{C^\infty(\bar{I})}$, separates the points of $\bar{I}$ and so $\overline{C^\infty(\bar{I})}$ must coincide with the algebra $C(\bar{I})$ itself.

Proposition 4.2. Referring to the given definitions and notations, for $I, J \in \mathcal{X}$ and $I \le J$, define $\Pi_{I,J}(a) := a|_{H_I}$, $a \in \overline{A_J}$. Then
(a) $\Pi_{I,J}(A_J) \subset A_I$ and thus $\Pi_{I,J}|_{A_J} : A_J \to A_I$ is a continuous (norm decreasing) unital $*$-algebra homomorphism;
(b) $\Pi_{I,J}(\overline{A_J}) = \overline{A_I}$, in other words $\Pi_{I,J} : \overline{A_J} \to \overline{A_I}$ is a surjective continuous unital $C^*$-algebra homomorphism.

Proof. (a) can be proved by direct inspection using the fact that, in the sense of Lemma 4.1, $a|_{\bar{I}} = a|_{H_I}$ where $a \in C(\bar{J})$ in the left-hand side is viewed as a function
and $a \in L(H_J)$ in the right-hand side is viewed as a multiplicative operator. Let us prove (b). $\Pi_{I,J}(\overline{A_J}) = \overline{A_I}$ and the surjectivity onto $\overline{A_I}$ of $\Pi_{I,J}$ on $\overline{A_J}$ are trivially equivalent because $\Pi_{I,J}$ is continuous. We directly prove the surjectivity. Using Lemma 4.1, it is sufficient to show that, for every $f \in C(\bar{I})$, there is $g \in C(M)$ such that $g|_{\bar{I}} = f$. Since $M$ is Hausdorff and locally compact and $\bar{I}$ is compact, the existence of $g$ follows from the Tietze extension theorem [24].

We have an immediate corollary:

Corollary 4.1. In the hypotheses of Proposition 4.2, for $I, J, K \in \mathcal{X}$, $\Pi_{I,I} = \mathrm{Id}$ and $I \le J \le K$ entails $\Pi_{I,K} = \Pi_{I,J} \circ \Pi_{J,K}$.

The third ingredient is given by the class of causal functions. It takes the causal structure of the spacetime into account. Let us examine this ingredient from the algebraic point of view. First of all, notice that $I \in \mathcal{X}$ entails $\mathcal{C}_{[\mu_g]}(\bar{I}) \subset A_I$. From now on we use the notation $\mathrm{Co}_I := \mathcal{C}_{[\mu_g]}(\bar{I})$ and $C := \{\mathrm{Co}_I\}_{I \in \mathcal{X}}$. $\mathrm{Co}_I$ is called the causal cone in $A_I$.

Proposition 4.3. Referring to the given definitions and notations, for $I, J \in \mathcal{X}$:
(a) $\mathrm{Co}_I$ is a convex cone containing the origin (i.e. $\alpha t + \beta t' \in \mathrm{Co}_I$ for $\alpha, \beta \in [0, +\infty)$ and $t, t' \in \mathrm{Co}_I$) whose elements are self-adjoint (i.e. $t \in \mathrm{Co}_I$ implies $t^* = t$);
(b) $[\mathrm{Co}_I] := \{t_1 - t_2 + i(t_3 - t_4) \mid t_k \in \mathrm{Co}_I,\ k = 1, 2, 3, 4\}$ is a dense sub $*$-algebra of $\overline{A_I}$;
(c) $I \le J$ entails $\Pi_{I,J}(\mathrm{Co}_J) \subset \mathrm{Co}_I$.

Proof. The only nontrivial statement is (b); let us prove it. First notice that $[\mathrm{Co}_I]$ is closed with respect to the linear operations and the involution, and $\mathbb{I} \in [\mathrm{Co}_I]$, $\mathbb{I}$ being the unit of $A_I$. Indeed, by the given definitions, $u, v \in [\mathrm{Co}_I]$ entails $\alpha u + \beta v \in [\mathrm{Co}_I]$ for all $\alpha, \beta \in \mathbb{C}$, and $u \in [\mathrm{Co}_I]$ entails $u^* \in [\mathrm{Co}_I]$. Then notice that $\mathbb{I}$ is nothing but the constant map $x \mapsto 1$, which is an element of $\mathcal{C}_{[\mu_g]}(\bar{I}) = \mathrm{Co}_I$. Moreover if $t \in \mathrm{Co}_I$, since $\bar{I}$ is compact and $t$ is continuous, there is $\alpha > 0$ such that, if $t_\alpha := t + \alpha\mathbb{I}$, $t_\alpha(x) > 0$ for all $x \in \bar{I}$. So take $t, t' \in \mathrm{Co}_I$ and define $t_\alpha, t'_{\alpha'} > 0$ as said. It is clear that $t_\alpha \cdot t'_{\alpha'} \in \mathrm{Co}_I$ because the product of positive non-decreasing functions is a non-decreasing function. $t_\alpha \cdot t'_{\alpha'} \in \mathrm{Co}_I$ means $t \cdot t' + \alpha\alpha'\mathbb{I} + \alpha t' + \alpha' t \in [\mathrm{Co}_I]$, therefore the definition of $[\mathrm{Co}_I]$ implies $t \cdot t' \in [\mathrm{Co}_I]$. That result trivially generalizes to any pair $u, u' \in [\mathrm{Co}_I]$. We have proven that $[\mathrm{Co}_I]$ is a sub $*$-algebra of $A_I$. Now we prove that $[\mathrm{Co}_I]$ separates the points of $\bar{I}$ and hence the closed algebra $\overline{[\mathrm{Co}_I]}$ coincides with $\overline{A_I} = C(\bar{I})$ because of Stone–Weierstrass' theorem. Let us show that, if $p, q \in \bar{I}$ are distinct, there is $t_1 \in \mathrm{Co}_I$ ($\subset [\mathrm{Co}_I]$) such that $t_1(p) \neq t_1(q)$. Indeed, if $p \prec q$, take $p' \prec\prec p$, fix any $t \in \mathrm{Co}_I$ and define $t_1 := t + d(p', \cdot)|_{\bar{I}}$. By Proposition 2.3(c), $t_1 \in \mathrm{Co}_I$. Then $t_1(p) < t_1(q)$ by construction because $d(p', p) < d(p', q)$ by Proposition 2.3(b). If $p, q \in \bar{I}$ are spatially separated there is $p' \prec\prec p$ with $q \notin J^+(p')$ (see the proof of
Theorem 2.2). If $t \in \mathrm{Co}_I$, take $\alpha \in [0, +\infty)$ with $t(p) + \alpha d(p', p) > t(q)$. Then $t_1 := t + \alpha d(p', \cdot)|_{\bar{I}} \in \mathrm{Co}_I$ and $t_1(p) \neq t_1(q)$.

Remark. Since nontrivial causal functions cannot have compact support, we are forced to consider the unital normed $*$-algebras $A_I$ as natural objects instead of the nonunital normed $*$-algebras $C_c(I)$ (the compactly-supported continuous functions on the open set $I$) if we want some time function such as $d(p, \cdot)$ to belong to $A_I$, as is required by the proof of Proposition 4.3. On physical grounds this is related to the fact that a physical spacetime cannot be compact. A consequence of such a choice is that the class of $C^*$-algebras $\{\overline{A_I}\}$ is not a net of $C^*$-algebras in the sense used in Quantum Field Theory [13] and it is not possible to define an overall $C^*$-algebra given by the inductive limit of the net.

The last ingredient we introduce is the class of densely-defined operators used in Sec. 3.2, $G := \{G_I\}_{I \in \mathcal{X}}$, where $G_I := \slashed{\Delta}_I : D_I \to H_I$ and $D_I := C^\infty(\bar{I})$, $H_I = L^2(\bar{I}, \mu_g)$. $G_I$ is said to be the causal operator on $H_I$.

Proposition 4.4. Referring to the given definitions and notations:
(a) for every $J \in \mathcal{X}$, $f, g \in \mathrm{Co}_J$, there is a linear manifold $D_{J,f,g} \subset D_J$ such that:
(i) $D_{J,f,g}$ is dense in $H_J$ and invariant with respect to $f$, $g$, $G_J$;
(ii) if $K \in \mathcal{X}$, $K \le J$, and $\Psi \in D_{K,\Pi_{K,J}(f),\Pi_{K,J}(g)}$,

\[ [f, [g, G_J]]\,\Psi = [\Pi_{K,J}(f), [\Pi_{K,J}(g), G_K]]\,\Psi ; \tag{22} \]

(iii) $[f, [g, G_J]]$ is essentially self-adjoint on $D_{J,f,g}$;
(iv) if $\alpha, \beta > 0$, it holds that

\[ D_{J,f,f} \cap D_{J,g,g} \cap D_{J,f,g} \cap D_{J,g,f} \subset D_{J,\alpha f + \beta g,\alpha f + \beta g} , \tag{23} \]

and $D_{J,f,f} \cap D_{J,g,g} \cap D_{J,f,g} \cap D_{J,g,f}$ is a core for $[\alpha f + \beta g, [\alpha f + \beta g, G_J]]$;
(v) $[f, [g, G_J]] \le 0$ on $D_{J,f,g}$.
(b) $\mathrm{Co}^G_K := \{f \in \mathrm{Co}_K \mid [f, [f, G_K]] \le -\gamma\mathbb{I} \text{ for some } \gamma > 0\}$ is not empty.

Proof. (a) (22) and (23) can be proven by direct inspection. $D_{J,f,f} \cap D_{J,g,g} \cap D_{J,f,g} \cap D_{J,g,f}$ is a core for $[\alpha f + \beta g, [\alpha f + \beta g, G_J]]$ because that operator is essentially self-adjoint on that domain. This fact can straightforwardly be shown by following the argument, based on Nelson's theorem, used in the proof of Lemma 3.1. The remaining statements of the thesis are parts of the thesis of Lemma 3.1. (b) As the spacetime is globally-hyperbolic there is a smooth time function $t$ with $dt$ everywhere timelike (see Sec. A.13). Therefore the smooth function $g(\uparrow dt, \uparrow dt)$ is strictly negative on the compact $\bar{K}$ and thus, posing $-\gamma := \max_{\bar{K}} g(\uparrow dt, \uparrow dt)$, one has $\gamma > 0$ and $f := t|_{\bar{K}} \in \mathrm{Co}^G_K$ because, by (12), $[f, [f, G_K]] = g(\uparrow dt, \uparrow dt) \le -\gamma\mathbb{I}$.

Corollary 4.2. With the hypotheses of Proposition 4.4, $t \in \mathrm{Co}^G_K$ entails $f + \alpha t \in \mathrm{Co}^G_K$ for every $f \in \mathrm{Co}_K$ and $\alpha > 0$; in particular $\mathrm{Co}^G_K$ is a convex cone.
Proof. Take $t \in \mathrm{Co}^G_K$, with $[t, [t, G_K]] \le -\gamma\mathbb{I}$ for some $\gamma > 0$. By Proposition 4.4(a)(iv), if $A := D_{K,t,t} \cap D_{K,f,f} \cap D_{K,f,t} \cap D_{K,t,f}$,

\[ \overline{[f + \alpha t, [f + \alpha t, G_K]]|_A} = \overline{[f + \alpha t, [f + \alpha t, G_K]]} . \]

In $A$, it also holds that $[f + \alpha t, [f + \alpha t, G_K]] = [f, [f, G_K]] + \alpha[f, [t, G_K]] + \alpha[t, [f, G_K]] + \alpha^2[t, [t, G_K]]$. Finally Proposition 4.4(a)(v) implies $[f + \alpha t, [f + \alpha t, G_K]] \le -\gamma'\mathbb{I}$ in the considered domain, with $\gamma' := \alpha^2\gamma > 0$, and so $\overline{[f + \alpha t, [f + \alpha t, G_K]]} \le -\gamma'\mathbb{I}$. In particular it holds in $D_{K,f+\alpha t,f+\alpha t}$.

Putting all together we can state the following general algebraic hypotheses, which are fulfilled in the commutative case. However, it is worthwhile stressing that they do not explicitly require the commutativity itself and thus they could be used in noncommutative generalizations. A fifth axiom will be introduced shortly.

(AH1) $H = \{H_I\}_{I \in \mathcal{X}}$ is a class of Hilbert spaces labeled in a partially-ordered direct set $(\mathcal{X}, \le)$ such that Proposition 4.1(a) and (b) is fulfilled.
(AH2) $A = \{A_I\}_{I \in \mathcal{X}}$ is a class of unital sub $*$-algebras of the $C^*$-algebra of the bounded operators on $H_I$, $L(H_I)$. $\overline{A_I}$ denotes the unital $C^*$-algebra obtained as the Banach completion of $A_I$ and we assume that Proposition 4.2(a)(b) holds (and thus its corollary holds too) with $\Pi_{I,J}$ defined therein.
(AH3) $C = \{\mathrm{Co}_I\}_{I \in \mathcal{X}}$, with $\mathrm{Co}_I \subset A_I$, fulfills Proposition 4.3(a)–(c).
(AH4) $G = \{G_I\}_{I \in \mathcal{X}}$, with $G_I : D_I \to H_I$ and $D_I \subset H_I$, is a class of densely-defined operators satisfying Proposition 4.4(a) and (b) (and thus its corollary).

4.2. Events, loci and causality

Let us examine how the events of $M$ and its topology arise in the algebraic picture introduced above. In particular, we show that the presented approach gives rise to a generalization of the concept of event in a spacetime, preserving the causal relations. When a manifold is compact, its points can be realized in terms of pure algebraic states on the $C^*$-algebra of continuous functions on the manifold [2, 11, 20]. If a manifold is only locally compact the construction is more complicated and involves irreducible $\mathbb{C}$-representations of the nonunital $C^*$-algebra of the functions which vanish at infinity [6, 20]. Here we want to develop an alternative procedure, involving pure states, which is useful from a metric point of view. We remind the reader that a linear functional $\omega : A \to \mathbb{C}$, where $A$ is a $C^*$-algebra, is said to be positive if $\omega(a^*a) \ge 0$ for all $a \in A$. If $A$ is unital, $\omega$ is said to be normalized if $\omega(\mathbb{I}) = 1$, $\mathbb{I}$ being the unit element of $A$. In unital $C^*$-algebras, the positivity of a linear functional $\omega$ implies (a) the boundedness of $\omega$ and (b) $\|\omega\| = \omega(\mathbb{I})$ (see [25, proof of Theorem 5.1]), so the normalization condition can be equivalently stated by requiring that $\|\omega\| = 1$. A positive normalized linear functional on a unital $C^*$-algebra is called an (algebraic) state. Concerning the GNS theorem we direct the reader to [25, Theorem 5.1], where a concise proof of that theorem is provided. In particular we remark that, by the GNS theorem, a positive normalized linear
functional on a unital $C^*$-algebra is real, i.e. $\omega(a^*) = \overline{\omega(a)}$, and this implies that $\omega(a) \in \mathbb{R}$ if $a$ is self-adjoint, $a = a^*$. A state is said to be pure when it is an extremal element of the convex set of states. As is known, a state is pure if and only if it admits an irreducible GNS representation [25].

Proposition 4.5. In our general algebraic hypotheses, let $S_I$ denote the convex set of algebraic states $\lambda_I$ on $A_I$, $I \in \mathcal{X}$, and let $S_{pI} \subset S_I$ denote the subset of pure states. Define the maps $J_{J,I} : S_I \to S_J$ with $J_{J,I} : \lambda_I \mapsto \lambda_I \circ \Pi_{I,J}$, where $I, J \in \mathcal{X}$ and $I \le J$. Then,
(a) $J_{I,I} = \mathrm{Id}$ and $J_{J,I}$ is injective if $I \le J$;
(b) $J_{K,I} = J_{K,J} \circ J_{J,I}$ provided $I \le J \le K$.

Proof. Everything is a trivial consequence of Proposition 4.2 and its corollary.

As $\mathcal{X}$ is a direct set and Proposition 4.5 holds, it is natural to consider the inductive limit of the spaces $S_I$ with respect to the maps $J_{J,I}$ and give the following definition. The definition of causal ordering $\trianglelefteq$ given below is a direct generalization of Proposition 2.4(c). After the introduction of a generalization of the notion of event in terms of the notion of locus (also given in the definition below), it will be clear that $\trianglelefteq$ is nothing but a generalization of the causal partial ordering on the spacetime.

Definition 4.1. In our general algebraic hypotheses and using the notation introduced above,
(1) a locus on $A$, $\Lambda$, is an element of the inductive limit of the class $\{S_I\}_{I \in \mathcal{X}}$ with respect to the class of maps $\{J_{J,I}\}$. That is, $\Lambda$ is an equivalence class of states in $\bigcup_{I \in \mathcal{X}} S_I$ with respect to the equivalence relation

\[ \lambda_I \sim \lambda_J \quad \text{if and only if there is } K \ge I, J \text{ in } \mathcal{X} \text{ with } J_{K,I}(\lambda_I) = J_{K,J}(\lambda_J) ; \tag{24} \]

$L$ denotes the space of loci, i.e. the inductive limit of the class $\{S_I\}_{I \in \mathcal{X}}$;
(2) $\Lambda \in L$ is said to be pointwise if and only if there is some pure state $\lambda_{I_0} \in \Lambda$. $L_p \subset L$ indicates the space of pointwise loci on $A$;
(3) $\Lambda \in L$ is said to belong to $I \in \mathcal{X}$, and we write $\Lambda \,\varepsilon\, I$, if and only if $S_I \cap \Lambda \neq \emptyset$. In that case we define $\Lambda(f) := \lambda_I(f)$ for every $f \in A_I$ and $\lambda_I \in \Lambda \cap S_I$;
(4) for $\Lambda, \Lambda' \in L$, we say that $\Lambda'$ causally follows $\Lambda$, and we write $\Lambda \trianglelefteq \Lambda'$, if and only if $\Lambda(f) \le \Lambda'(f)$ for every $I \in \mathcal{X}$ with $\Lambda, \Lambda' \,\varepsilon\, I$ and every $f \in \mathrm{Co}_I$.

Remark. Definition 4.1 is consistent, i.e. the equivalence relation preserves positivity and normalization. Indeed, for $I \le K$, $\lambda_I$ is respectively positive/normalized if and only if $J_{K,I}(\lambda_I)$ is respectively such. We leave the trivial proof to the reader, based on the fact that $\Pi_{I,K}$ is a homomorphism of unital $C^*$-algebras. The well-definedness of $\Lambda(f)$ follows from Proposition 4.6(a) below.
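To make the equivalence relation (24) concrete, the following worked example (added here as an illustration, not taken from the original) looks at evaluation states in the commutative case. It assumes $A_I = C(\bar{I})$ with $\Pi_{I,J}$ acting as restriction of functions, which is the setting formalized later in Theorem 4.1; the symbol $\delta^I_x$ is notation introduced only for this sketch.

```latex
% Worked example (assumption: commutative case, A_I = C(\bar{I}),
% \Pi_{I,J}(a) = a|_{\bar{I}} for I \le J).
% For x \in \bar{I}, let \delta^I_x(f) := f(x) be the evaluation state on A_I.
\begin{align*}
  J_{J,I}(\delta^I_x)(f)
    &= \delta^I_x\bigl(\Pi_{I,J}(f)\bigr)
     = \bigl(f|_{\bar{I}}\bigr)(x)
     = f(x)
     = \delta^J_x(f),
     \qquad f \in A_J,\ I \le J, \\
  J_{K,I}(\delta^I_x) &= \delta^K_x = J_{K,J}(\delta^J_x)
     \qquad \text{for any } K \ge I, J \text{ with } x \in \bar{I} \cap \bar{J}.
\end{align*}
% Hence \delta^I_x \sim \delta^J_x in the sense of (24): all evaluation
% states at the same point x belong to a single locus.
```

Under these assumptions the evaluation states at a fixed point are all identified by (24), which is the mechanism by which pointwise loci play the role of events.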
Proposition 4.6. In our general algebraic hypotheses and using the notation introduced above, assuming Λ, Λ0 , Λ00 ∈ L and I, J, K ∈ X , we have the following statements. (a) If Λ ε I, there is only one λI ∈ Λ ∩ SI . (b) ΛεI and I ≤ J entail ΛεJ. In that case λJ := JJ,I (λI ) ∈ Λ∩SJ for λI ∈ Λ∩SI and Λ(f ) = Λ(ΠI,J (f )) as f ∈ AJ . (c) Λ ∈ Lp if and only if every λI ∈ Λ is a pure state. Hence Lp is the inductive limit of the class {SpI }I∈X with respect to the class of the above-defined maps {JJ,I }, I ≤ J in X . (d) ΛEΛ0 , Λ0 EΛ00 imply that Λ(f ) ≤ Λ00 (f ) for every f ∈ CoI such that Λ, Λ0 , Λ00 εI. Proof. (a) The thesis is a direct consequence of the injectivity of the maps JI,K . (b) λJ ∼ λI by construction. The remaining part is a direct consequence of the given definitions. Let us pass to prove (c). If every λI ∈ Λ is pure, Λ ∈ Lp by definition, so consider the other case. Suppose there is a pure state λI ∈ Λ, we want to show that all the remaining states λJ ∈ Λ are pure too. By definition of locus there must be K ∈ X with I, J ≤ K and λJ ◦ ΠJ,K = λI ◦ ΠI,K =: λK . GNS theorem [25, Theorem 5.1] and the surjectivity of ΠI,K imply that if hH, π, Ωi is a GNS triple for AI associated to λI , hH, π ◦ ΠI,K , Ωi is a GNS triple for AK associated to λK , and π ◦ ΠI,K is irreducible if and only if π is irreducible. Similarly if hH 0 , π 0 , Ω0 i is a GNS triple for AJ associated to λJ , hH 0 , π 0 ◦ ΠJ,K , Ω0 i is another GNS triple for the same algebra AK associated to the same state λK and π 0 ◦ ΠI,K is irreducible if and only if π 0 is irreducible. Since (by GNS theorem) all GNS triples for an algebra (AK ) referred to a state (λK ) are unitarily equivalent and the irreducibility is unitarily invariant, we conclude that π is irreducible if and only if π 0 is irreducible. This is the thesis. The proof of (d) is immediate by the given definitions and the item (b). 
The relationship between points on $M$ and pointwise loci is established by the following theorem, which requires neither the spacetime structure nor a differentiable manifold structure. The only requirement is that $M$ is a locally-compact Hausdorff topological space. More generally, the theorem shows that there is a bijection between loci on $M$ and regular Borel probability measures $\mu$ with compact support on $M$. This bijection reduces to a homeomorphism when restricted to the space of pointwise loci equipped with a suitable topology. We remind the reader that the support of a regular Borel measure is the complement of the largest open set with measure zero. Below, $\int_M f\,d\mu$ is well defined by setting $f \equiv 0$ outside $\bar J$, since $\mathrm{supp}(\mu) \subset \bar J$.

Theorem 4.1. Let $M$ be a locally-compact Hausdorff topological space and $X$ a covering of $M$ made of open relatively compact subsets and defining a directed set with respect to the set-inclusion relation. Define $A := \{A_I\}_{I\in X}$ with $A_I := C(\bar I)$, $S_I$, $L$ and $L_p$ as done in Definition 4.1, $\Pi_{I,J}(a) := a\restriction_{\bar I}$ and $J_{J,I}$ as in Proposition 4.3.
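Finally, denote the space of compact-support regular Borel probability measures on $M$ by $P$. Consider the map $F : P \to L$ such that, for $\mu \in P$,
$$F(\mu) := \left\{ \lambda^{(\mu)}_J \in S_J \;\middle|\; J \in X \text{ with } \mathrm{supp}(\mu) \subset \bar J,\ \ \lambda^{(\mu)}_J(f) := \int_M f\,d\mu \ \text{ for } f \in A_J \right\}.$$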
(a) $F$ is well-defined, i.e. $F(\mu)$ is a locus for every $\mu \in P$. Moreover $F(\mu) \mathrel{\varepsilon} I \in X$ if and only if $\mathrm{supp}(\mu) \subset \bar I$. (b) $F$ is bijective onto the set of the loci $L$. (c) $F$ restricted to the space of Dirac measures $\{\delta_x\}_{x\in M}$ gives rise to a homeomorphism from $M$ onto $L_p$ equipped with the inductive-limit topology, every $S_I$, $I \in X$, being endowed with Gel'fand's topology.
Proof. See Appendix B.

The following theorem proves that the relation $\trianglelefteq$ among loci is nothing but a generalization of the causal partial ordering on the spacetime.

Theorem 4.2. In the hypotheses of Theorem 4.1, also assume that $(M, g, O_t)$ is a globally-hyperbolic spacetime and $X$ is defined as in Proposition 2.4, $C = \{Co_I\}_{I\in X}$ with $Co_I := T_{[g]}(I)$. Consider the relation $\trianglelefteq$ defined in $L$ by Definition 4.1 and $\Lambda, \Lambda', \Lambda'' \in L$. Then (a) $\Lambda \trianglelefteq \Lambda'$ and $\Lambda' \trianglelefteq \Lambda$ together entail $\Lambda = \Lambda'$; (b) if $\Lambda \trianglelefteq \Lambda'$ and $\Lambda' \trianglelefteq \Lambda''$ then $\Lambda, \Lambda'' \mathrel{\varepsilon} I \in X$ entails $\Lambda' \mathrel{\varepsilon} I$, and thus $\trianglelefteq$ is transitive and defines a partial ordering relation on $L$; (c) if $F$ is that in Theorem 4.1, for every pair $x, y \in M$, $F(\delta_x) \trianglelefteq F(\delta_y)$ if and only if $x \preceq y$.

Proof. See Appendix B.

Actually most of the content of Theorem 4.2 can be generalized using the general algebraic hypotheses as well as a further causal convexity axiom: (AH5) For $\Lambda, \Lambda', \Lambda'' \in L$, if $\Lambda \trianglelefteq \Lambda'$, $\Lambda' \trianglelefteq \Lambda''$ and $\Lambda, \Lambda'' \mathrel{\varepsilon} I \in X$, then $\Lambda' \mathrel{\varepsilon} I$. Notice that (AH5) is fulfilled in the globally-hyperbolic spacetime case by Theorem 4.2(b).

Theorem 4.3. In the general algebraic hypotheses, including the causal convexity axiom (AH5), and employing notations above, $\trianglelefteq$ is a partial-ordering relation in $L$.

Proof. $\Lambda \trianglelefteq \Lambda$ is a trivial consequence of the definition of $\trianglelefteq$. The fact that $\Lambda \trianglelefteq \Lambda'$ and $\Lambda' \trianglelefteq \Lambda$ together entail $\Lambda = \Lambda'$ can be proven as in Theorem 4.2, where we have not used the spacetime structure. The transitivity of $\trianglelefteq$ follows from (AH5) and Proposition 4.6(d).
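Theorem 4.2(c) can be illustrated numerically in the simplest commutative setting. The sketch below assumes 1+1 dimensional Minkowski spacetime, finitely supported probability measures in place of general elements of $P$, and a hypothetical test family made of the two null coordinates $t + x$ and $t - x$ standing in for the causal functions of $Co_I$ (whether they satisfy the regularity required of $Co_I$ is an assumption). All function names are illustrative and not taken from the text.

```python
# Finitely supported probability measures on 1+1 Minkowski spacetime.
# An event is a pair (t, x); a measure is a list of (event, weight) pairs.

def expectation(mu, f):
    """State induced by the measure mu, evaluated on the function f."""
    return sum(weight * f(event) for event, weight in mu)

def dirac(event):
    """Dirac measure concentrated at a single event."""
    return [(event, 1.0)]

# Null coordinates: both are non-decreasing along future-directed causal
# curves, and together they characterise causal precedence in 1+1 flat space.
NULL_COORDINATES = (
    lambda e: e[0] + e[1],
    lambda e: e[0] - e[1],
)

def causally_follows(mu, nu, causal_functions=NULL_COORDINATES):
    """True if nu 'causally follows' mu on the given test family,
    i.e. mu(f) <= nu(f) for every test function f."""
    return all(expectation(mu, f) <= expectation(nu, f)
               for f in causal_functions)
```

For Dirac measures the check reduces to $\Delta t \ge |\Delta x|$, that is, to $x \preceq y$; for general loci the relation $\trianglelefteq$ quantifies over all of $Co_I$, so a finite test family such as this one gives only a necessary condition.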
4.3. Lorentzian distance

We conclude by presenting a generalization of the Lorentzian distance in the general case. The following definition is very natural and can also be used in the generalized commutative case in the hypotheses of Theorem 4.2 concerning compactly supported probability measures on a spacetime. Notice that the definition makes sense by (AH3) and (AH4), which assure the existence of some function satisfying $[t, [t, G_I]] \le -I$ below.

Definition 4.2. In the general algebraic hypotheses including the causal convexity axiom (AH5) and employing notations and conventions above, the Lorentzian distance of $\Lambda, \Lambda' \in L$ is
$$D(\Lambda, \Lambda') = \inf\left\{ \langle \Lambda'(t) - \Lambda(t) \rangle \;\middle|\; t \in Co_I,\ \ \Lambda, \Lambda' \mathrel{\varepsilon} I \in X,\ \ [t, [t, G_I]] \le -I \right\}, \tag{25}$$
where $\langle \alpha \rangle := \max\{0, \alpha\}$ if $\alpha \in \mathbb{R}$. Item (iii) of (a) in (AH4) implies the following result, the proof being the same as that given for the corresponding part of Theorem 3.1.

Proposition 4.7. In Definition 4.2 the condition $[t, [t, G_I]] \le -I$ can be replaced by one of the three following conditions:
$$\sigma([t, [t, G_I]]) \subset (-\infty, -1], \tag{26}$$
$$[t, [t, G_I]] \le -I, \tag{27}$$
$$[t, [t, G_I]]^{-1} \text{ exists and } \left\| [t, [t, G_I]]^{-1} \right\|_{L(H_I)} \le 1. \tag{28}$$
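The link between the three conditions can be traced through the spectral theorem. The sketch below assumes that $A := [t, [t, G_I]]$ is a bounded self-adjoint operator on $H_I$; the symbol $A$ and this standing assumption are introduced here only for illustration, the precise hypotheses being those of (AH4).

```latex
\begin{align*}
A \le -I
  &\iff \langle \psi, (A + I)\psi \rangle \le 0 \ \text{ for all } \psi \in H_I
   \iff \sigma(A) \subset (-\infty, -1], \\
\sigma(A) \subset (-\infty, -1]
  &\implies 0 \notin \sigma(A), \qquad
   \|A^{-1}\| = \sup_{\lambda \in \sigma(A)} |\lambda|^{-1} \le 1 .
\end{align*}
```

The converse of the second line needs sign information: $\|A^{-1}\| \le 1$ alone gives only $\sigma(A) \subset (-\infty, -1] \cup [1, +\infty)$, so one also needs $A \le 0$, which is presumably what item (iii) of (a) in (AH4) supplies.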
We have a conclusive theorem.

Theorem 4.4. In the general algebraic hypotheses including the causal convexity axiom (AH5), employing notations and conventions above, the Lorentzian distance enjoys the following properties for $\Lambda, \Lambda', \Lambda'' \in L$. (a) In the hypotheses of Theorem 4.2 and assuming $G_I = \not\Delta_I$ (defined in Theorem 3.1 for $I \in X$), $D(F(\delta_p), F(\delta_q)) = d(p, q)$ for every pair $p, q \in M$. (b) $0 \le D(\Lambda, \Lambda') < +\infty$. In particular, $D(\Lambda, \Lambda') = 0$ if either $\Lambda = \Lambda'$ or $\Lambda \not\trianglelefteq \Lambda'$. (c) If $\Lambda \trianglelefteq \Lambda' \trianglelefteq \Lambda''$ then $D(\Lambda, \Lambda'') \ge D(\Lambda, \Lambda') + D(\Lambda', \Lambda'')$.

Proof. (a) The right-hand side of the definition of $D(F(\delta_p), F(\delta_q))$ in (25) can be re-written as the right-hand side of (13) in Theorem 3.1. So the proof of the thesis is obvious. (b) The set in the right-hand side of (25) is not empty because, if $\Lambda \in L$, there is some $I \in X$ with $\Lambda \mathrel{\varepsilon} I$ by definition of locus; moreover (AH4) implies that there is some $f \in Co^G_I \neq \emptyset$ and thus $t = \alpha f \in Co_I$ and $[t, [t, G_I]] \le -I$ for some $\alpha > 0$. Then positivity and boundedness of $D$ hold by definition. $\Lambda = \Lambda'$ implies $D(\Lambda, \Lambda') = 0$ by the definition of $D$. Finally suppose $\Lambda \not\trianglelefteq \Lambda'$. In that case there must exist $f \in Co_I$ for some $I \in X$ such that $\Lambda, \Lambda' \mathrel{\varepsilon} I$ and $\Lambda(f) - \Lambda'(f) = \epsilon > 0$. Define $f_\nu := \nu f$; $f_\nu \in Co_I$ for all $\nu > 0$ because $Co_I$ is a convex cone, and $\Lambda(f_\nu) - \Lambda'(f_\nu) = \nu\epsilon$. Then take $t_\gamma \in Co^G_I$ (which exists by (AH4)) with $\gamma > 0$ such that $[t_\gamma, [t_\gamma, G_I]] \le -\gamma I$. Therefore, by (AH4) (and (iv), (v) of Proposition 4.4(a) and its corollary in particular), $t_\nu := f_\nu + (1/\sqrt{\gamma})\,t_\gamma$ is in $Co_I$ as before and satisfies $[t_\nu, [t_\nu, G_I]] \le -I$. Finally $\Lambda'(t_\nu) - \Lambda(t_\nu) = -\nu\epsilon + (1/\sqrt{\gamma})(\Lambda'(t_\gamma) - \Lambda(t_\gamma)) < 0$ if $\nu > 0$ is sufficiently large. Then the definition of $D(\Lambda, \Lambda')$ gives $D(\Lambda, \Lambda') = 0$. (c) Take $I \in X$ with $\Lambda, \Lambda'' \mathrel{\varepsilon} I$. $\Lambda \trianglelefteq \Lambda' \trianglelefteq \Lambda''$ and (AH5) entail $\Lambda' \mathrel{\varepsilon} I$ and thus $\Lambda''(f) - \Lambda(f) = (\Lambda''(f) - \Lambda'(f)) + (\Lambda'(f) - \Lambda(f))$ makes sense. In the given hypotheses, by (b), the identity can also be written $\langle \Lambda''(f) - \Lambda(f) \rangle = \langle \Lambda''(f) - \Lambda'(f) \rangle + \langle \Lambda'(f) - \Lambda(f) \rangle$. Using the definition of $D$, it entails $\langle \Lambda''(f) - \Lambda(f) \rangle \ge D(\Lambda, \Lambda') + D(\Lambda', \Lambda'')$. Finally, since $f$ is arbitrary, it implies the thesis.

It is possible to define relations analogous to $\prec\!\prec$ and $\prec$ respectively, which we denote by $\triangleleft\triangleleft$ and $\triangleleft$. $\Lambda \triangleleft\triangleleft \Lambda'$ means $D(\Lambda, \Lambda') > 0$, and $\Lambda \triangleleft \Lambda'$ means $\Lambda \trianglelefteq \Lambda'$ and $\Lambda \neq \Lambda'$ together. The final corollary shows that the content of Sec. A.7 can be restated in the general context without using causal paths.

Corollary 4.3. In the hypotheses of Theorem 4.4 and with the given definitions: (a) $\triangleleft\triangleleft$ and $\triangleleft$ are transitive and $\Lambda \triangleleft\triangleleft \Lambda'$ implies $\Lambda \triangleleft \Lambda'$; (b) either $\Lambda \triangleleft\triangleleft \Lambda'$ and $\Lambda' \trianglelefteq \Lambda''$, or $\Lambda \trianglelefteq \Lambda'$ and $\Lambda' \triangleleft\triangleleft \Lambda''$, implies $\Lambda \triangleleft\triangleleft \Lambda''$.

Proof. (a) By Theorem 4.4(b), $\Lambda \triangleleft\triangleleft \Lambda'$ entails $\Lambda \trianglelefteq \Lambda'$ and $\Lambda \triangleleft \Lambda'$; hence, by (c), $\triangleleft\triangleleft$ is transitive. $\triangleleft$ is transitive too because of Theorem 4.3 and the definition of $\trianglelefteq$. (b) is a direct consequence of Theorem 4.4(c) and $D \ge 0$.

5. Open Issues and Outlook

This paper shows that a generalization of part of the noncommutative Connes' program is possible in order to encompass Lorentzian and causal structures of (globally-hyperbolic) spacetimes. However several relevant issues remain open.
First of all, concrete models of the presented generalized formalism should be exhibited in the noncommutative case; moreover, the minimality of the proposed axioms should be analyzed. An important point which should be investigated is the interplay between the topology of the space of loci and $D$. In the commutative case and considering the events of a globally-hyperbolic spacetime, $d$ turns out to be continuous with respect to the topology of the manifold. Presumably a natural topology of the space of loci, in the general case, could be the inductive limit topology, each space $S_I$ being equipped with the $*$-weak topology. One expects that $D$ is continuous with respect to such a topology. Another point is the following. We have focused attention on the Lorentzian generalization of (1), avoiding the difficulties involved in possible generalizations of (3) which, presumably, should require a careful analysis of the spectral properties of the metric operators $G_I$ introduced above. Such an analysis could reveal contact points with the content of [27] in spite of the evident differences of the presented approach and obtained
results. Another important question which should be investigated concerns possible physical applications of the presented mathematical structure.

Acknowledgments

I am grateful to A. Cassa, S. Delladio, G. Landi, M. Luminati and F. Serra Cassano for useful mathematical suggestions and comments. I would like to thank R. M. Wald for a short but useful technical discussion and H. Thaler who pointed out Ref. [27] to me.

Appendix A. Exponential Map, Synge's World Function, Spacetimes

A.0. Exponential map, Synge's world function

Let $(M, g)$ be a smooth Riemannian or Lorentzian manifold. $\pi : TM \to M$ denotes the natural projection of $TM$ onto $M$ and, if $v \in TM$, $v_{\pi(v)}$ is the vector of $T_{\pi(v)}M$ associated to $v$. If $v \in TM$ and $\lambda \in \mathbb{R}$, $\lambda v$ is the element of $TM$ with $\pi(\lambda v) = \pi(v)$ and $(\lambda v)_{\pi(\lambda v)} = \lambda v_{\pi(v)}$. Consider the map $(t, v) \mapsto \gamma(t, v) \in M$, where $\gamma(\cdot, v)$ is the unique geodesic which starts from $\pi(v)$ at $t = 0$ with initial tangent vector $v_{\pi(v)}$ and $t$ belongs to the maximal domain $(a_v, b_v)$ ($a_v < 0 < b_v$). From known theorems on maximal solutions of (first-order) differential equations on manifolds ($TM$) [21], the domain of $\gamma$, $\bigcup_{v \in TM} (a_v, b_v) \times \{v\}$, is open in $\mathbb{R} \times TM$ and $\gamma$ is smooth therein. Then pick out the set $U \subset TM$ of elements $v$ such that $1 \in (0, b_v)$. It is possible to show that $U$ is open. Notice that for each $v \in TM$, there is a sufficiently small $\lambda > 0$ such that $1 \in (0, b_{\lambda v})$, because of the identity $\gamma(\lambda t, v) = \gamma(t, \lambda v)$. From that identity one trivially proves that $U$ is starshaped, i.e. if $v \in U$ then $\lambda v \in U$ for $\lambda \in [0, 1]$. The exponential map, $\exp : U \to M$, is defined as $\exp(v) := \gamma(1, v)$ [17]. Notice that $\exp \in C^\infty(U)$. If $p \in M$, $\exp_p$ denotes $\exp\restriction_{T_pM}$ and the open neighborhood of $0$, $U_p := \{v \in U \mid \pi(v) = p\} \subset T_pM$, its natural domain. By direct inspection, one finds that $d\exp|_v \neq 0$ if $v$ belongs to the zero section of $TM$.
This entails that if one shrinks each $U_p$ sufficiently about the origin to some starshaped and open neighborhood of $0$, $V_p \subset U_p$, then $\exp_p\restriction_{V_p}$ defines a diffeomorphism from $V_p$ to $\exp_p(V_p)$, which is open too. If $\{e_\alpha|_p\} \subset T_pM$ is a basis, $(t^1, \ldots, t^D) \mapsto \exp_p(t^\alpha e_\alpha|_p)$, $t = t^\alpha e_\alpha|_p \in V_p$, defines a normal coordinate system centred on $p$. An open set $C \subset M$ is called a (geodesically) convex normal neighborhood if there is an open and starshaped set $W \subset TM$, with $\pi(W) = C$, such that $\exp\restriction_W$ is a diffeomorphism onto $C \times C$. It is clear that $C$ is connected and there is only one geodesic segment joining any pair $q, q' \in C$ which is completely contained in $C$, that is $t \mapsto \exp_q(t\,(\exp_q)^{-1} q')$, $t \in [0, 1]$. It is possible to take $C$ diffeomorphic to an open ball in $\mathbb{R}^{\dim M}$ [17]. Moreover if $q \in C$ and $\{e_\alpha|_q\} \subset T_qM$ is a basis, $(t^1, \ldots, t^D) \mapsto \exp_q(t^\alpha e_\alpha|_q)$, $t = t^\alpha e_\alpha|_q \in W_q$, defines a global normal coordinate system onto $C$ centred on $q$. The class of the convex normal neighborhoods of a point $p \in M$ is not empty and defines a fundamental system of neighborhoods of $p$ [5, 9, 17, 21].
In $(M, g)$ as above, $\sigma(x, y)$ indicates one half the squared geodesic distance of $x$ from $y$, also known as Synge's world function: $\sigma(x, y) := \frac{1}{2}\, g_x(\exp_x^{-1} y, \exp_x^{-1} y)$ [9]. By definition $\sigma(x, y) = \sigma(y, x)$, and $\sigma$ turns out to be smoothly defined on $C \times C$ if $C$ is a convex normal neighborhood. With the signature $(-, +, \cdots, +)$, we have $\sigma(x, y) > 0$ if the events are space-like separated, $\sigma(x, y) < 0$ if the events are time related and $\sigma(x, y) = 0$ if the events belong to a common null geodesic or $x = y$. All that and everything that follows also hold in manifolds endowed with a Euclidean metric, where $\sigma$ (defined as above) is everywhere nonnegative. It turns out that [9], if $\gamma$ is the unique geodesic from $p$ to $q$ in a convex normal neighborhood containing $p, q$, with affine parameter $\lambda \in [0, l]$,
$$\uparrow\! d_y \sigma(x, y)\big|_{y = \gamma(\lambda)} = \lambda\,\dot\gamma(\lambda), \tag{A.1}$$
$$2\sigma(x, y) = g_x(d_x \sigma(x, y), d_x \sigma(x, y)) = g_y(d_y \sigma(x, y), d_y \sigma(x, y)). \tag{A.2}$$
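The sign convention for $\sigma$ is easy to check in flat spacetime. The following is a minimal sketch, assuming Minkowski spacetime with signature $(-, +, \cdots, +)$, where $\exp_x^{-1} y$ is simply the coordinate displacement $y - x$; the function names are illustrative only.

```python
def synge_sigma(x, y):
    """Synge's world function in Minkowski spacetime.

    Events are sequences (t, x1, ..., xk). In flat space exp_x^{-1}(y)
    is the displacement y - x, so sigma is half its Lorentzian square.
    """
    dt = y[0] - x[0]
    spatial_sq = sum((b - a) ** 2 for a, b in zip(x[1:], y[1:]))
    return 0.5 * (-dt * dt + spatial_sq)

def separation(x, y, tol=1e-12):
    """Classify a pair of events by the sign of sigma."""
    s = synge_sigma(x, y)
    if s > tol:
        return "spacelike"
    if s < -tol:
        return "timelike"
    return "null or coincident"
```

For instance, the events $(0, 0)$ and $(2, 1)$ give $\sigma = \frac{1}{2}(-4 + 1) = -\frac{3}{2}$, so they are time related, while $(0, 0)$ and $(1, 1)$ lie on a common null geodesic.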
A.1. Lorentzian manifold

A (smooth) Lorentzian manifold $(M, g)$ is an $n$-dimensional ($n \ge 2$) smooth manifold $M$ with a smooth Lorentzian metric $g$ (with signature $(-, +, \cdots, +)$).

A.2. Signature of vectors

A vector $T \in T_xM$, $T \neq 0$, is said to be spacelike, timelike or null if, respectively, $g_x(T, T) > 0$, $g_x(T, T) < 0$, $g_x(T, T) = 0$. $T \neq 0$ is said to be causal if it is either timelike or null. The same nomenclature is used for co-vectors $\omega \in T^*_xM$ referring to $\uparrow\!\omega \in T_xM$, where $g_x(\uparrow\!\omega, \cdot\,) = \omega$. If $T \in T_pM$, $|T| := \sqrt{|g_p(T, T)|}$; similarly, if $\omega \in T^*_pM$, $|\omega| := \sqrt{|g_p(\uparrow\!\omega, \uparrow\!\omega)|}$.

A.3. Time orientation

A Lorentzian manifold $(M, g)$ is said to be time orientable if it admits a smooth nonvanishing vector field $Z$ which is everywhere timelike. A time orientation, $O_t$, on a time-orientable Lorentz manifold $(M, g)$ is one of the two equivalence classes of smooth timelike vector fields $Z$ with respect to the equivalence relation $Z \sim Z'$ if and only if $g(Z, Z') < 0$ everywhere. For each point $p \in M$, an orientation determines an analogous equivalence class of timelike vectors of $T_pM$, $O_{tp}$. In a time-orientable Lorentz manifold, to assign a time orientation it is sufficient to single out a timelike vector in $T_pM$ for a $p \in M$. With the given definitions, and for $Z \in O_t$, a causal vector (co-vector) $T \in T_pM$ ($\omega \in T^*_pM$) is said to be future directed if $g_p(Z(p), T) < 0$ ($g_p(Z(p), \uparrow\!\omega) < 0$). A causal vector (respectively co-vector) $T \in T_pM$ ($\omega \in T^*_pM$) is said to be past directed if $g_p(Z(p), T) > 0$ ($g_p(Z(p), \uparrow\!\omega) > 0$).
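The definitions of A.2 and A.3 can be exercised on a single tangent space. The sketch below assumes the flat metric with signature $(-, +, \cdots, +)$ and takes the coordinate time direction $(1, 0, \ldots, 0)$ as the default representative $Z(p)$ of the time orientation; both choices, and all names, are illustrative assumptions.

```python
def g(u, v):
    """Flat Lorentzian inner product with signature (-, +, ..., +)."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def signature(T, tol=1e-12):
    """Return 'zero', 'spacelike', 'timelike' or 'null' for the vector T."""
    if all(abs(c) <= tol for c in T):
        return "zero"
    q = g(T, T)
    if q > tol:
        return "spacelike"
    if q < -tol:
        return "timelike"
    return "null"

def direction(T, Z=None):
    """Return 'future' or 'past' for a causal vector T.

    Z is a timelike vector representing the time orientation; by default
    the coordinate time direction is used. T is future directed when
    g(Z, T) < 0 and past directed when g(Z, T) > 0.
    """
    if signature(T) not in ("timelike", "null"):
        raise ValueError("direction is defined only for causal vectors")
    if Z is None:
        Z = (1.0,) + (0.0,) * (len(T) - 1)
    return "future" if g(Z, T) < 0 else "past"
```

Note that $g(Z, T)$ cannot vanish when $Z$ is timelike and $T$ is causal, so the two cases in `direction` are exhaustive.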
A.4. Spacetime

A spacetime $(M, g, O_t)$ is a Lorentzian manifold $(M, g)$ which is time orientable and equipped with a time orientation $O_t$; the points of $M$ are also called events.

A.5. Regularity of curves and causal curves

In a spacetime $M$, a piecewise $C^k$ curve defined in a (open, closed, semi-closed) nonempty interval $I \subset \mathbb{R}$ is a continuous map $\gamma : I \to M$ with a finite partition of $I$ such that each subcurve obtained by restricting $\gamma$ to each subinterval of the partition (including its boundary) is $C^k$. If the partition coincides with $I$ itself, the curve is said to be $C^k$. A piecewise $C^1$ curve $\gamma$ is said to be timelike, spacelike, null, causal if its tangent vector $\dot\gamma$ is respectively timelike, spacelike, null, causal, everywhere in each subinterval of the associated partition. A piecewise $C^1$ causal curve in a spacetime, $\gamma : I \to M$, is said to be future (past) directed if its tangent vector $\dot\gamma$ is future (past) directed everywhere in each subinterval of the associated partition. In a spacetime $M$, if $p, q \in M$, a curve $\gamma : [a, b] \to M$ is said to be from $p$ to $q$ if $\gamma(a) = p$ and $\gamma(b) = q$.

A.6. Continuous causal curves

It is possible to extend the notion of causal future-directed curves, considering continuous future-directed causal curves $\gamma : I \to M$. That is done by requiring that, for each $t \in I$, there is a neighborhood of $t$, $I_t$, and a convex normal neighborhood of $\gamma(t)$, $U_t$, such that, for $t' \in I_t \setminus \{t\}$, one has $\gamma(t') \neq \gamma(t)$ and there is a future-directed causal (smooth) geodesic segment $\gamma' \subset U_t$ from $\gamma(t)$ to $\gamma(t')$ if $t' > t$, and a future-directed causal (smooth) geodesic segment $\gamma' \subset U_t$ from $\gamma(t')$ to $\gamma(t)$ if $t' < t$. Similar definitions hold concerning continuous future-directed timelike curves, by replacing "causal" with "timelike" in the definitions above.
In this work a causal curve is supposed to be a continuous causal curve; moreover continuous curves $\gamma : I \to M$ and $\gamma' : I' \to M$ are identified if there is an increasing homeomorphism $h : I \to I'$ with $\gamma' \circ h = \gamma$.

A.7. Causal relations of events

In a spacetime $(M, g, O_t)$, if $p, q \in M$, (i) $p \preceq q$ means that either $p = q$ or there is a future-directed causal curve from $p$ to $q$, (ii) $p \prec q$ means that $p \preceq q$ and $p \neq q$, (iii) $p \prec\!\prec q$ means that there is a future-directed time-like curve from $p$ to $q$. $\prec\!\prec$ and $\preceq$ are clearly transitive.

Remark. In a spacetime $(M, g, O_t)$, if $p, q, r \in M$, $p \prec\!\prec q$ and $q \preceq r$ entail $p \prec\!\prec r$, and similarly $p \preceq q$ and $q \prec\!\prec r$ entail $p \prec\!\prec r$ [23].
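A.8. Causal sets

We make use of the following notations. Consider a spacetime $(M, g, O_t)$ and $S \subset M$, then
$$\begin{aligned}
J^+(S) &:= \{q \in M \mid p \preceq q \text{ for some } p \in S\} && \text{the causal future of } S,\\
J^-(S) &:= \{q \in M \mid q \preceq p \text{ for some } p \in S\} && \text{the causal past of } S,\\
I^+(S) &:= \{q \in M \mid p \prec\!\prec q \text{ for some } p \in S\} && \text{the chronological future of } S,\\
I^-(S) &:= \{q \in M \mid q \prec\!\prec p \text{ for some } p \in S\} && \text{the chronological past of } S.
\end{aligned}$$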
Moreover $I(p, q) := I^+(p) \cap I^-(q)$ (which is not empty if and only if $p \prec\!\prec q$) and $J(p, q) := J^+(p) \cap J^-(q)$. If $\emptyset \neq S \subset M$, $I^+(S)$ and $I^-(S)$ are open, $S \subset J^\pm(S) \subset \overline{I^\pm(S)}$, $I^\pm(S) = \mathrm{Int}(J^\pm(S))$ [21].

A.9. Properties of $I^\pm(p)$ and $J^\pm(p)$ ([28, Theorem 8.1.2])

In a spacetime $(M, g, O_t)$, taking a sufficiently small normal convex neighborhood $U$ of $p \in M$, $\exp_p^{-1}$ defines a local diffeomorphism $\phi : U \to \mathbb{R}^n$ with $\phi(p) = 0$ and $\phi(U \cap I^\pm(p)) = B \cap C$, where $B \subset \mathbb{R}^n$ is an open ball centred in $0$ and $C$ the open convex cone, with vertex $0$, made of all the future-directed (respectively past-directed) timelike vectors. This result implies that both $I^\pm(p)$ and $J^\pm(p)$ are nonempty and path-connected, hence connected.

A.10. Causal relations of events again

$p, q \in M$ are said to be time related if either $I^+(p) \cap I^-(q) \neq \emptyset$ or $I^-(p) \cap I^+(q) \neq \emptyset$, and causally related if either $J^+(p) \cap J^-(q) \neq \emptyset$ or $J^-(p) \cap J^+(q) \neq \emptyset$. Causally related events $p, q \in M$, $p \neq q$, which are not time related are called null-related. $S, S' \subset M$ are said to be spatially separated if $(J^+(S) \cup J^-(S)) \cap S' = \emptyset$ (which is equivalent to $(J^+(S') \cup J^-(S')) \cap S = \emptyset$).

A.11. Causally convex sets, strongly causal spacetimes, Alexandrov topology

In a spacetime $M$, we say that a set $S \subset M$ is causally convex when $J(p, q) \subset S$ if $p, q \in S$. It can be proven that an open set $U \subset M$ is causally convex if and only if, for any future-directed causal curve $\gamma$ and any choice of (continuous) parametrization, $\gamma^{-1}(U)$ is open and connected in $\mathbb{R}$. The transitivity of $\preceq$ implies that $J^+(S)$, $J^-(S)$, $J(r, s)$ are causally convex for $\emptyset \neq S \subset M$ and $r \preceq s$. Also using the remark in A.7, one directly shows that $I^+(S)$, $I^-(S)$, $I(r, s)$ are causally convex for $\emptyset \neq S \subset M$ and $r \prec\!\prec s$. A spacetime is strongly causal when every event admits a fundamental set of open neighborhoods consisting of causally convex sets. It is known that a spacetime $M$ is strongly causal if and only if the Alexandrov topology, i.e. that generated by all the sets $I(p, q)$, $p, q \in M$, is the topology of $M$ [1, 23].
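The relations of Secs. A.7 to A.11 become explicit inequalities in flat spacetime. The sketch below assumes Minkowski spacetime with the standard time orientation, where $p \preceq q$ reduces to $\Delta t \ge |\Delta \vec{x}|$ and $p \prec\!\prec q$ to $\Delta t > |\Delta \vec{x}|$; the function names are hypothetical.

```python
def _displacement(p, q):
    """Return (time difference, squared spatial distance) from p to q."""
    dt = q[0] - p[0]
    dx_sq = sum((b - a) ** 2 for a, b in zip(p[1:], q[1:]))
    return dt, dx_sq

def causally_precedes(p, q):
    """p 'precedes or equals' q: q lies in the causal future J^+(p)."""
    dt, dx_sq = _displacement(p, q)
    return dt >= 0 and dt * dt >= dx_sq

def chronologically_precedes(p, q):
    """p chronologically precedes q: q lies in the open cone I^+(p)."""
    dt, dx_sq = _displacement(p, q)
    return dt > 0 and dt * dt > dx_sq

def in_open_diamond(x, p, q):
    """Membership of x in I(p, q) = I^+(p) intersected with I^-(q)."""
    return chronologically_precedes(p, x) and chronologically_precedes(x, q)
```

With $p = (0, 0)$, $q = (2, 0)$ and $r = (3, 1)$ one has $p \prec\!\prec q$ and $q \preceq r$ along a null segment, and indeed $p \prec\!\prec r$, in agreement with the remark in A.7. The open diamonds tested by `in_open_diamond` are the sets generating the Alexandrov topology.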
A.12. Globally-hyperbolic spacetimes

A globally-hyperbolic spacetime (see the end of [28, Sec. 8.3] about possible equivalent definitions) is a strongly-causal spacetime $(M, g, O_t)$ such that every $J(p, q)$ is either empty or compact for each pair $p, q \in M$. If the spacetime $M$ is globally-hyperbolic and $S \subset M$ is compact, $J^\pm(S) = \overline{I^\pm(S)}$ [28] and thus, using $I^\pm(S) = \mathrm{Int}(J^\pm(S))$ (A.8), $J^\pm(S) \setminus I^\pm(S) = \partial I^\pm(S) = \partial J^\pm(S)$. In particular $J^\pm(p) = \overline{I^\pm(p)}$.

A.13. Stably causal spacetimes, global time functions

A spacetime $(M, g, O_t)$ is said to be stably causal if there is a smooth map $f : M \to \mathbb{R}$ with $df$ everywhere timelike (other equivalent definitions are possible [1]). A continuous map $t : M \to \mathbb{R}$ is said to be a global time function if it strictly increases along every future-directed causal curve. A stably causal spacetime admits a global time function given by either $+f$ or $-f$, $f$ being defined above. Remarkably, the converse is also true [12, 26]: if a spacetime admits a (global) time function, it admits a smooth map $f : M \to \mathbb{R}$ with $df$ everywhere timelike.

A.14. Causal spacetimes

A spacetime is said to be causal if there are no events $p, q$ such that $p \prec q \prec p$ (equivalently, it does not contain any closed causal curve). It is trivially proven that $\preceq$ is a partial-ordering relation in causal spacetimes. A spacetime is called chronological if there are no events $p, q$ such that $p \prec\!\prec q \prec\!\prec p$ (equivalently, it does not contain any closed timelike curve).

A.15. Implications of causal conditions

It is known that [1, 14, 21]

globally hyperbolic $\Rightarrow$ stably causal $\Rightarrow$ strongly causal $\Rightarrow$ causal $\Rightarrow$ chronological.

In particular $\preceq$ is a partial-ordering relation in globally-hyperbolic spacetimes too.

A.16. Inextendible curves

A causal curve $\gamma : I \to M$ is said to be future (past) inextendible if it admits no future (past) endpoint, i.e. no $e \in M$ such that, for every neighborhood $O$ of $e$, there is $t_0 \in I$ with $\gamma(t) \in O$ for $t > t_0$ ($t < t_0$).
Any causal curve which admits an endpoint can be extended beyond that endpoint into a larger causal curve (only continuous in general). Hausdorff’s maximality theorem implies that every (causal, timelike) curve can be extended up to an inextendible (causal, timelike) curve.
A.17. Cauchy developments

Let $S \subset M$ be any set in the spacetime $(M, g, O_t)$. $D^+(S)$ ($D^-(S)$) indicates the future (past) Cauchy development of $S$, i.e. the set of points $p$ of the spacetime such that every past (future) inextendible causal curve through $p$ intersects $S$. (In particular $S \subset D^\pm(S)$.) $D(S) := D^+(S) \cup D^-(S)$ is the Cauchy development of $S$.

A.18. Achronal and acausal sets

A set $S \subset M$ is said to be achronal if $S \cap I^\pm(S) = \emptyset$ and acausal if $S \cap J^\pm(S) = \emptyset$. An achronal smooth spacelike embedded submanifold with dimension $\dim(M) - 1$ turns out to be also acausal [21, p. 425].

A.19. Cauchy surfaces

A Cauchy surface (for $M$ itself), $S \subset M$, is a closed achronal set such that $D(S) = M$. There are different, even inequivalent, definitions of Cauchy surfaces; we use the definition of [28], which is equivalent to that given in [21] as stated in Lemma 29 in Chap. 14 therein.

A.20. Globally-hyperbolic spacetimes and Cauchy surfaces

An important result states that a spacetime $(M, g, O_t)$ is globally-hyperbolic if and only if it admits a Cauchy surface. This statement can be adopted as an equivalent definition of a globally-hyperbolic spacetime (see the remark at the end of [28, Sec. 8.3] for a proof of equivalence of the various definitions of global hyperbolicity).

A.21. Cauchy surfaces and global time functions in globally-hyperbolic spacetimes

All Cauchy surfaces of a globally-hyperbolic spacetime $M$ are connected and homeomorphic. $M$ itself is homeomorphic to $\mathbb{R} \times S$, $S$ being a Cauchy surface of $M$, and the projection map from $M$ onto $\mathbb{R}$ can be fixed to be a smooth global time function [1, 21, 23, 28].

A.22. Smooth Cauchy surfaces

The existence of spacelike smooth Cauchy surfaces in any globally-hyperbolic spacetime is a very subtle issue. A first, incomplete, proof of existence of smooth Cauchy surfaces in general globally-hyperbolic spacetimes is due to Dieckmann [3]; however the complete proof, by Bernal and Sánchez [4], is much more recent.
In (quantum) field theories, those are used to give initial data for hyperbolic field equations determining the dynamics of the fields everywhere in the spacetime [28, 29].
A.23. Sets $I(S, p)$ and $J(S, p)$ and their properties

In a globally-hyperbolic spacetime $(M, g, O_t)$, if $S \subset M$ is a smooth Cauchy surface and $p \in J^+(S)$, $I(S, p)$ and $J(S, p)$ respectively denote $I^-(p) \cap I^+(S)$ and $J^-(p) \cap J^+(S)$. One can straightforwardly prove that $I(S, p)$ is not empty if and only if $p \in I^+(S)$. It is not so difficult to show that $I(S, p)$ and $J(S, p)$ are causally convex. Section A.8 implies that $I(S, p)$ is open and $I(S, p) \subset J(S, p)$. Finally, $J(S, p)$ is compact [28, Theorem 8.3.12] and $J(S, p) = \overline{I(S, p)}$ (Proposition 2.4). Analogous properties hold for the analogously-defined regions $I(p, S)$ and $J(p, S)$.

Appendix B.

B.1. Proof of Lemma 2.1

In our hypotheses, $f \in C_{[\mu_g]}(\bar I)$ is continuous on $\bar I$ and smooth in an open set $J := I \setminus C = \bar I \setminus C$, where $C \subset \bar I$ is closed with measure zero and $\partial I \subset C$. Notice that $\mu_g(J) = \mu_g(I) = \mu_g(\bar I)$ by construction. Therefore, concerning the first part of the thesis, it is sufficient to show that, in $I \setminus C$, $g(\uparrow\! df, \uparrow\! df) \le 0$ and $\uparrow\! df$ is past-directed if $\uparrow\! df \neq 0$. To this end, suppose $g(\uparrow\! df, \uparrow\! df) > 0$ at $p \in I \setminus C$; then there must be an open neighborhood of $p$, $U$, where the same inequality holds. So define a smooth vector field $T'$ which is timelike, future-directed and orthogonal to $\uparrow\! df$ in $U$ with $|T'| = 2|df|$, then define $T := T' - \uparrow\! df$. $T$ is timelike and future-directed in $U$. If $\gamma : [0, 1] \to U$ is a smooth integral curve of $T$, $\gamma$ is timelike and future-directed and it trivially holds that $f(\gamma(1)) - f(\gamma(0)) = \int_0^1 g_{\gamma(s)}(\uparrow\! df, \dot\gamma)\,ds < 0$. This is not allowed if $f$ is a causal function in $I$. Similarly, if $\uparrow\! df \neq 0$ is future-directed at $p \in I \setminus C$, the same fact must hold in a neighborhood $U$ of $p$. Take a local coordinate frame $x^1, \ldots, x^n$ in $U$ where $\partial_{x^1}$ is timelike, future-directed and orthogonal to the spacelike vectors $\partial_{x^k}$, $k = 2, \ldots, n$. Obviously $g(\partial_{x^1}, \uparrow\! df) < 0$. Let $\gamma$ be an integral curve of $\partial_{x^1}$ in $U$.
$\gamma$ is causal and future-directed by construction and one gets the contradiction $f(\gamma(1)) - f(\gamma(0)) = \int_0^1 g_{\gamma(s)}(\uparrow\! df, \dot\gamma)\,ds < 0$. Concerning (10), it is sufficient to prove it in $I$. Indeed, the thesis in $\bar I = I \cup \partial I$ is a direct consequence of the continuity of $f$ and $d$ in $\bar I$ and the fact that $\partial I$ has measure zero. (In particular, if $x$ or $y$ or both belong to $\partial I$ and $x \prec\!\prec y$, there are two sequences $\{y_n\} \subset I$, $\{x_n\} \subset I$ with $x_n \to x$ and $y_n \to y$ as $n \to \infty$. The continuity of $d$ implies that $x_n \prec\!\prec y_m$ if $n, m$ are sufficiently large, and thus the right-hand side of (10) can be computed restricting to $I$.) Let us pass to prove that
$$\frac{f(y) - f(x)}{d(x, y)} \ge \operatorname{ess\,inf}_I |df| \quad \text{for each pair } x, y \in I \text{ with } x \prec\!\prec y. \tag{B.1}$$
To this end, fix $x, y \in I$ with $x \prec\!\prec y$. Since the spacetime is globally-hyperbolic there is a timelike future-directed geodesic segment $\gamma_0 : [0, 1] \to M$ from $x$ to $y$. This geodesic completely belongs to $I$ because $I$ is causally convex. Using normal coordinates (see [21, Chap. 7, Lemma 2.5]) about a geodesic segment $\gamma'_0$, with $\gamma_0 \subset \gamma'_0$, it is possible to define a smooth variation of $\gamma_0$, $(t, s) \mapsto \gamma_s(t)$ with $t \in [0, 1]$, $s \in D_1$, $\gamma_{s=0} = \gamma_0$, $D_\delta$ being the open disk in $\mathbb{R}^{\dim M - 1}$ with radius $\delta > 0$ and
centred in $0$. It is possible to arrange $(t, s) \mapsto \gamma_s(t)$ in order that (i) $(t, s) \mapsto \gamma_s(t)$ with $(t, s) \in (0, 1) \times D_1$ defines an admissible local coordinate map, (ii) each curve $\gamma_s$ is timelike and future-directed for $t \in (0, 1)$ and admits $t$-limits towards $0^+$ and $1^-$ defining smooth future-directed causal curves from $x$ to $y$. Notice that for every $s \in D_1$, $\gamma_s\restriction_{(0,1)} \subset I$ by construction. Take $s \in D_\delta$, $0 < \delta < 1$, and consider, for $t \in [0, 1]$, $t \mapsto h_s(t) = f(\gamma_s(t))$. This function is non-decreasing and hence must admit a derivative almost everywhere; such derivative is integrable and $f(y) - f(x) = h_s(1) - h_s(0) \ge \int_0^1 h'_s(\tau)\,d\tau$. The derivative is nonnegative and thus we may also write, using Fubini's theorem,
$$f(y) - f(x) \ge \frac{1}{\mathrm{vol}(D_\delta)} \int_{D_\delta} ds \int_0^1 d\tau\, h'_s(\tau) = \frac{1}{\mathrm{vol}(D_\delta)} \int_{[0,1] \times D_\delta} \frac{h'_s(t)}{\sqrt{|g(t, s)|}}\, d\mu_g(t, s).$$
$\mathrm{vol}(D_\delta)$ is the $\mathbb{R}^{\dim M - 1}$ volume of $D_\delta$. In other words
$$f(y) - f(x) \ge \frac{1}{\mathrm{vol}(D_\delta)} \int_{[0,1] \times D_\delta} \frac{|g_{\gamma_s(t)}(\uparrow\! df, \dot\gamma_s)|}{\sqrt{|g(t, s)|}}\, d\mu_g(t, s).$$
Barring vanishing measure sets, $\uparrow\! df$ is causal and past-directed or vanishes. Referring to an orthonormal basis of $T_{\gamma_s(t)}M$, $e_1, \ldots, e_D$ ($D = \dim M$), where $e_1 = \dot\gamma_s(t)/|\dot\gamma_s(t)|$ is timelike, one straightforwardly proves that if $T \in T_{\gamma_s(t)}M$ is causal (future- or past-directed) or vanishes, then $|g_{\gamma_s(t)}(T, \dot\gamma_s(t))| \ge |T|\,|\dot\gamma_s(t)|$. Hence, posing $T = \uparrow\! d_{\gamma_s(t)} f$, we have
$$f(y) - f(x) \ge \frac{1}{\mathrm{vol}(D_\delta)} \int_{[0,1] \times D_\delta} \frac{|g_{\gamma_s(t)}(\uparrow\! df, \dot\gamma_s)|}{\sqrt{|g(t, s)|}}\, d\mu_g(t, s) \ge \frac{\operatorname{ess\,inf}_{[0,1] \times D_\delta} |df|}{\mathrm{vol}(D_\delta)} \int_{D_\delta} ds \int_0^1 |\dot\gamma_s(t)|\, dt$$
and thus
$$f(y) - f(x) \ge \frac{\operatorname{ess\,inf}_I |df|}{\mathrm{vol}(D_\delta)} \int_{D_\delta} ds\, L(\gamma_s).$$
Changing variables $s \to \delta\sigma$,
$$f(y) - f(x) \ge \frac{\operatorname{ess\,inf}_I |df|}{\mathrm{vol}(D_1)} \int_{D_1} d\sigma\, L(\gamma_{\delta\sigma}).$$
Notice that $\sigma \mapsto L(\gamma_{\delta\sigma}) \le L(\gamma_0) = d(x, y)$ is continuous in $D_1$. Taking the limit as $\delta \to 0^+$ (using Lebesgue's dominated convergence theorem) we have
$$f(y) - f(x) \ge \frac{\operatorname{ess\,inf}_I |df|}{\mathrm{vol}(D_1)} \int_{D_1} d\sigma\, L(\gamma_0) = \frac{\operatorname{ess\,inf}_I |df|}{\mathrm{vol}(D_1)}\, \mathrm{vol}(D_1)\, d(x, y).$$
As $x \prec\!\prec y$, $d(x, y) > 0$ and thus
$$\inf\left\{ \frac{f(y) - f(x)}{d(x, y)} \;\middle|\; x, y \in I,\ x \prec\!\prec y \right\} \ge \operatorname{ess\,inf}_I |df|. \tag{B.2}$$
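A quick sanity check of (B.2) in the simplest setting may be useful. The example assumes Minkowski spacetime with coordinates $(x^0, \vec{x})$ and signature $(-, +, \cdots, +)$, an arbitrary causally convex region $I$, and the illustrative choice $f = c\,x^0$ with a constant $c > 0$; none of these choices is taken from the text.

```latex
\begin{align*}
df &= c\,dx^0, \qquad \uparrow\! df = -c\,\partial_{x^0} \ \text{(past-directed)}, \qquad
|df| = \sqrt{|g(\uparrow\! df, \uparrow\! df)|} = c, \\
d(x, y) &= \sqrt{(y^0 - x^0)^2 - |\vec{y} - \vec{x}|^2} \;\le\; y^0 - x^0
\qquad (x \prec\!\prec y), \\
\frac{f(y) - f(x)}{d(x, y)} &= \frac{c\,(y^0 - x^0)}{d(x, y)} \;\ge\; c
= \operatorname{ess\,inf}_I |df| .
\end{align*}
```

Equality holds exactly when $\vec{y} = \vec{x}$, so in this example the infimum in (B.2) is attained and coincides with $\operatorname{ess\,inf}_I |df|$.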
To conclude the proof it would be sufficient to show that, if $f$ is a time function, for every $\epsilon > 0$ there are $x_\epsilon \prec\!\prec y_\epsilon$ in $I$ such that
$$\left| \frac{f(y_\epsilon) - f(x_\epsilon)}{d(x_\epsilon, y_\epsilon)} - \operatorname{ess\,inf}_I |df| \right| < \epsilon. \tag{B.3}$$
To this end notice that, if $\operatorname{ess\,inf}_I |df| > 0$, there must be a sequence $\{z_n\}_n \subset I$ such that each $\uparrow\! d_{z_n} f$ is timelike (and thus past-directed as we said above) and $|d_{z_n} f| \to \operatorname{ess\,inf}_I |df|$ as $n \to \infty$. In that case (B.3) is a consequence of the statement: for each $z_n$ and each $\mu > 0$ there are $x_{n,\mu} \prec\!\prec y_{n,\mu}$ in $I$ such that
$$\left| \frac{f(y_{n,\mu}) - f(x_{n,\mu})}{d(x_{n,\mu}, y_{n,\mu})} - |d_{z_n} f| \right| < \mu. \tag{B.4}$$
Let us prove the statement above. Let $d_{z_n} f$ be timelike and past-directed. Define the normalized vector $e_1 = -\uparrow\! d_{z_n} f / |d_{z_n} f|$ and complete the basis of $T_{z_n}M$ with $D - 1$ spacelike vectors normalized and orthogonal to $e_1$. Finally consider the Riemannian normal coordinate system $\xi^1, \ldots, \xi^D$ centred on $z_n$ generated by the basis $e_1, \ldots, e_D$. We restrict such a coordinate system to a sufficiently small convex normal neighborhood of $z_n$. By Proposition 2.1(d), if $y$ has coordinates $\xi^1, \ldots, \xi^D$, $\xi^1 = d(z_n, y)$. Then define $x_{n,\mu} = z_n$, and for every $y \equiv (t, 0, \ldots, 0)$ one has, by Lagrange's theorem (where $y' \equiv (t', 0, \ldots, 0)$ with $t' \in (0, t)$),
$$\frac{f(y) - f(x_{n,\mu})}{d(x_{n,\mu}, y)} = \left.\frac{\partial f}{\partial \xi^1}\right|_{y'} \to \left.\frac{\partial f}{\partial \xi^1}\right|_{z_n} = |d_{z_n} f|,$$
as $t \to 0^+$, i.e. $y \to z_n$. For every $\mu > 0$, the existence of $x_{n,\mu} \prec\!\prec y_{n,\mu}$ in $I$ such that (B.4) is fulfilled follows trivially. The same proof can be used for the case $\operatorname{ess\,inf}_I |df| = 0$ provided that a sequence $\{z_n\}_n \subset I$ exists such that each $\uparrow\! d_{z_n} f$ is timelike and $|d_{z_n} f| \to 0$ as $n \to \infty$. Let us prove that such a sequence does exist if $f$ is a time function. Suppose it is not the case and $\operatorname{ess\,inf}_I |df| = 0$. So it must happen that $|df| \equiv 0$ in some $E \subset I \setminus C$ with $\mu_g(E) \neq 0$. In turn it implies that there is some $q \in I \setminus C$ where $df_q$ is a null vector or vanishes. So, take a sequence of open neighborhoods of $q$, $U_i \subset I \setminus C$, where $df$ is smoothly defined, such that $U_{i+1} \subset U_i$ and $\cap_i U_i = \{q\}$. If $df_{q_i}$ is timelike for some $q_i \in U_i \setminus \{q\}$ for every $i$, the wanted sequence exists, and this is assumed to be impossible by hypothesis. So it must be $|df| = 0$ in some $U_{i_0}$. But this is not possible either because, if $df_r \neq 0$ for some $r \in U_{i_0}$, the time function $f$ would be constant along a future-directed causal curve given by an integral curve of $\uparrow\! df$ in a neighborhood of $r$. Conversely, if $df \equiv 0$ in $U_{i_0}$, $f$ would be constant along any timelike future-directed curve in $U_{i_0}$.

B.2. Proof of Proposition 2.4
It is convenient to prove (d) before (a) because most of (a) is a straightforward consequence of the former. (d) If $S \subset M$ is a smooth Cauchy surface for $M$ and $p \in I^+(S)$, the set $I(S, p)$ is open and causally convex (Sec. A.23). Since $\partial I^+(S) = S$, one has $\partial I(S, p) \subset S \cup \partial I^-(p)$ and hence $\partial I(S, p)$ has measure zero since $S$ is
a smooth hypersurface and Theorem 2.1(a) holds. To conclude that $I(S,p) \in X$, it is sufficient to show that $\overline{I(S,p)} = J(S,p)$, noticing that the latter is compact and causally convex (Sec. A.23). Obviously, $\overline{I(S,p)} \subset J(S,p) = \overline{I^+(S)} \cap \overline{I^-(p)}$, so we have to show that $\overline{I^+(S)} \cap \overline{I^-(p)} \subset \overline{I(S,p)}$. Using the decomposition $\overline{I^+(S)} \cap \overline{I^-(p)} = (\partial I^+(S)\cap\partial I^-(p)) \cup (\partial I^+(S)\cap I^-(p)) \cup (I^+(S)\cap\partial I^-(p)) \cup (I^+(S)\cap I^-(p))$, one finds that the only thing to be shown is that $x \in \partial I^+(S) \cap \partial I^-(p)$ implies $x \in \overline{I^+(S) \cap I^-(p)}$. Take such an $x$. Notice that $x \in \partial I^+(S) = S$. Let $B_x$ be an open neighborhood of $x$ and $\gamma$ a maximal causal geodesic segment from $x = \gamma(0)$ to $p = \gamma(1)$, which exists by Proposition 2.1(i). Extend $\gamma$ into an inextendible causal geodesic $\gamma'$. Notice that $\gamma$ (and $\gamma'$) must be null, because $d(x,p) = 0$ as a consequence of Proposition 2.1(a) and (f) and Sec. A.12. Since $p \in I^+(S)$ and $\gamma'$ intersects $S = \partial I^+(S)$ in $x$ only ($S$ being a Cauchy surface), $\gamma(t) \in I^+(S)$ for $t > 0$. It is possible to fix $t' > 0$, $t' < 1$, such that $x' := \gamma(t') \in B_x$. Subsegments of a maximal geodesic segment are maximal and hence $x' \in \partial I^-(p)$. Therefore there is a sequence of points $\{x_n\} \subset I^-(p)$ with $x_n \to x'$ as $n \to \infty$. As $x' \in I^+(S)$, $x' \in B_x$ and these sets are open, for some $N \in \mathbb{N}$ it must hold that $x_n \in I^-(p)$, $I^+(S)$, $B_x$ if $n > N$. We have proven that for every open neighborhood $B_x$ of $x$ there is some $x_n \in I^-(p) \cap I^+(S) \cap B_x$. In other words, $x \in \overline{I^-(p) \cap I^+(S)}$.

Let us now consider the open diamond regions $I(r,s)$. We want to show that, if $p \in M$, there is a fundamental set of open neighborhoods of $p$, $\{I(r_n,s_n)\} \subset X$, and in particular $\overline{I(r_n,s_n)} = J(r_n,s_n)$. From [23, Proposition 4.12] one finds that,$^{e}$ in a globally-hyperbolic spacetime $M$ (the strongly causal condition is sufficient), each point $p$ admits a convex normal neighborhood $U_p$ and an open neighborhood $A_p$ such that (i) $A_p \subset U_p$ and (ii) if $r, s \in A_p$, $I(r,s) \subset A_p$.
In $A_p$, take a future-directed geodesic segment through $p$, $\gamma$, and two sequences of points on $\gamma$, $\{r_n\}$, $\{s_n\}$, such that $r_n \prec\prec r_{n+1} \prec\prec p \prec\prec s_{n+1} \prec\prec s_n$, and $r_n, s_n \to p$ as $n \to \infty$. As the spacetime is strongly causal (Sec. A.11), using the remark in Sec. A.7 one proves that $\{I(r_n,s_n)\}$ is a fundamental set of neighborhoods of $p$, and $J(r_{n+1},s_{n+1}) \subset I(r_n,s_n) \subset U_p$. It is clear that each $I(r_n,s_n)$ is open, causally convex and $\partial I(r_n,s_n) \subset \partial I^+(r_n) \cup \partial I^-(s_n)$ has measure zero. As $J(r_n,s_n)$ is causally convex (Sec. A.11) and compact (Sec. A.12), to conclude it is sufficient to show that $J(r_n,s_n) = \overline{I(r_n,s_n)}$. Suppose this is not the case and thus there is $x \in J(r_n,s_n)$ with $x \notin \overline{I(r_n,s_n)}$. As is known, causal curves from $r_n$ to $s_n$ which are not smooth geodesic segments from $r_n$ to $s_n$ can be approximated by timelike curves from $r_n$ to $s_n$ [21]. This means that there must be a smooth null geodesic segment $\eta \subset J(r_n,s_n)$ from $r_n$ to $s_n$ with $x \in \eta$. Let us show that this is impossible. Indeed, since $r_n \prec\prec s_n$, by Proposition 2.1(i) there is a timelike (and thus $\neq \eta$) geodesic segment $\eta'$ from $r_n$ to $s_n$ and both $\eta$, $\eta'$ must belong to the same geodesically convex neighborhood $U_p$.

$^{e}$ The reader should pay attention to the fact that the definition of causally convex sets given in [23] is different from that used in this paper.

(a) Proposition 2.4(d)(ii) implies $\cup X = M$. Let us show that $X$ is a directed set. We want to show that if $A, B \in X$, there is $C \in X$ with $A, B \subset C$. From here on
$D := A \cup B$. As the spacetime is globally-hyperbolic, it is homeomorphic to $\mathbb{R} \times S$, where $S \subset M$ is a smooth Cauchy surface (Secs. A.19–A.21). Then consider the natural smooth time function $t : M \to \mathbb{R}$ (which exists by Secs. A.13, A.15, A.21) associated to the former Cartesian factor. Take $t_1 < \min t|_{\bar D}$ and $t_2 > \max t|_{\bar D}$. The Cauchy surface $S_1 := \{x \in M \mid t(x) = t_1\}$ is in the past (with respect to $t$) of $\bar D$ and the Cauchy surface $S_2 := \{x \in M \mid t(x) = t_2\}$ is in the future of $\bar D$ and $S_1$. By definition of Cauchy surface, if $p \in \bar D$, every inextendible future-directed timelike curve $\gamma$ through $p$ must intersect both $S_1$ and $S_2$. Let $q$ be the intersection of $\gamma$ and $S_2$ and $F$ the set of such points $q$. By construction, it holds that $\bar D \subset \cup_{q\in F} I(S_1,q)$. Since $\bar D$ is compact we can extract a finite covering from that found above. In particular we have $\bar D \subset \cup_{i=1}^n I(S_1,q_i)$. To conclude define $C := \cup_{i=1}^n I(S_1,q_i)$. $C$ is open because it is a union of open sets, causally convex (if $p, q \in C$ satisfy $q \preceq p$, $p \in I(S_1,q_k)$ for some $k$ and thus $q$ and every causal curve from $p$ to $q$ belong to $I(S_1,q_k) \subset C$), $\partial C \subset \cup_{i=1}^n \partial I(S_1,q_i)$ has measure zero because every $I(S_1,q_i) \in X$ and thus $\partial I(S_1,q_i)$ has measure zero, $\bar C = \cup_{i=1}^n J(S_1,q_i)$ is compact (being a finite union of compact sets) and causally convex (the proof is similar to that for $C$). We conclude that $C \in X$ and $A, B \subset C$.

(b) If $f$ is a global smooth time function, which exists by Secs. A.13 and A.15 in globally-hyperbolic spacetimes, and $A \in X$, then, trivially, $f|_A \in T_{[\mu_g]}(A)$ and $f|_{\bar A} \in C_{[\mu_g]}(\bar A)$.

(c) If $p, q \in \bar I$ and $p \preceq q$, it holds that $f(p) \le f(q)$ for all $f \in C_{[\mu_g]}(\bar I)$, $I \in X$, because $\bar I$ is causally convex. Let us prove that if $f(p) \le f(q)$ for all essentially smooth causal functions $f$ defined in any $\bar I$, $I \in X$, with $p, q \in \bar I$, then $p \preceq q$. Suppose that the implication is false.
If $p$ and $q$ are spatially separated, as in the proof of Theorem 2.2 one finds two spatially separated, sufficiently small, regions $I(p',q')$ and $I(p'',q'')$ which respectively contain $p$ and $q$ and have spatially separated closures. By Proposition 2.3(c), $f_c : z \mapsto d(p',z) + c\,d(p'',z)$, $c > 0$, is an element of $C_{[\mu_g]}(\bar I)$ with $I = I(p',q') \cup I(p'',q'') \in X$. Moreover $f_c(q) = c\,d(p'',q) < d(p',p) = f_c(p)$ for $c$ sufficiently small and this is a contradiction. If $q \prec p$, the map $f : z \mapsto d(x,z)$, defined on $J(x,y)$ with $x \prec\prec q$ and $p \prec\prec y$, produces a contradiction once again.

B.3. Proof of Theorem 4.1

In the following we take advantage of Riesz' representation theorem [24], which proves that there is a bijective map $L \mapsto \mu_L$ between the set of positive linear functionals $L$ on $C_c(\Omega)$, $\Omega$ being a locally compact topological space, and the set of regular Borel measures on $\Omega$, such that $L(f) = \int_\Omega f\,d\mu_L$ for every $f \in C_c(\Omega)$.

(a) By Proposition 2.4(a), for every $\mu \in P$, there is $I \in X$ with $\operatorname{supp}(\mu) \subset \bar I$. It is trivially proven that $\lambda_I^{(\mu)}$ is a state on $A_I$. Moreover, varying $I$, one obtains equivalent states since, by trivial properties of measures, $\operatorname{supp}(\mu) \subset \bar I, \bar J$ entails $J_{K,I}(\lambda_I^{(\mu)}) = J_{K,J}(\lambda_J^{(\mu)})$ for $I, J \le K$. We only have to show that if $\Lambda_\mu$ is the locus generated by some $\lambda_I^{(\mu)}$, every element $\lambda_J \in \Lambda_\mu$ must belong to $F(\mu)$. To this end assume that $\lambda_J \in \Lambda_\mu$, that is $\lambda_J \sim \lambda_I^{(\mu)}$. That equivalence relationship can be
re-written as follows: for every $K \in X$ with $\bar I \cup \bar J \subset \bar K$ and for every $f \in C(\bar K)$, it holds that $\int_{\bar K} f\,d\mu = \lambda_J(f|_{\bar J})$. Using the fact that $\operatorname{supp}(\mu) \subset \bar K$ and Proposition 2.4(a), the obtained identity implies that if $h \in C_c(M)$ and $\operatorname{supp}(h) \subset M\setminus\bar J$, $\int_M h\,d\mu = 0$. Urysohn's lemma [24] implies that $\mu(R) = 0$ for every compact $R \subset M\setminus\bar J$, then the regularity of $\mu$ implies that $\operatorname{supp}(\mu) \subset \bar J$ and thus $\lambda_J \in F(\mu)$. We have proven that $F(\mu)$ is a locus, but also that $F(\mu)\ \varepsilon\ I$ implies $\operatorname{supp}(\mu) \subset \bar I$. To conclude notice that if $\operatorname{supp}(\mu) \subset \bar I$ then $F(\mu)\ \varepsilon\ I$ because $\lambda_I(\cdot) := \int_M \cdot\,d\mu \in F(\mu)$.

(b) Injectivity: if $\mu \neq \mu'$, by Riesz' theorem there is $f \in C_c(M)$ with $\int_M f\,d\mu \neq \int_M f\,d\mu'$. Taking $K \in X$ with $\operatorname{supp}(\mu), \operatorname{supp}(\mu') \subset \bar K$ one gets $\lambda_K^{(\mu)}(f|_{\bar K}) \neq \lambda_K^{(\mu')}(f|_{\bar K})$ and thus $F(\mu) \neq F(\mu')$. Surjectivity: if $\Lambda$ is a locus, take $I \in X$ with $\Lambda\ \varepsilon\ I$. Define $L_\Lambda(f) := \Lambda(f|_{\bar I})$ for every $f \in C_c(M)$. $L_\Lambda$ turns out to be a positive linear functional, therefore Riesz' theorem proves that $L_\Lambda(f) = \int_M f\,d\mu_\Lambda$ for some regular Borel probability measure $\mu_\Lambda$ with $\operatorname{supp}(\mu_\Lambda) \subset \bar I$. Then consider the locus $F(\mu_\Lambda)$ and some $\lambda_J^{(\mu_\Lambda)} \in F(\mu_\Lambda)$. Tietze's extension theorem [24] entails $\lambda_J^{(\mu_\Lambda)} \in \Lambda$ and thus $F(\mu_\Lambda) = \Lambda$ by (a).

(c) First we prove that the elements of $F(\delta_x)$ are pure states. $f \in A_I = C(\bar I)$ and $x \in \bar I$ imply that $\lambda_I^{(\delta_x)} : f \mapsto f(x)$ defines a pure state because $\langle H, \pi, \Omega\rangle$ is a GNS triple for $\lambda_I^{(\delta_x)}$ if $H = \mathbb{C}$, $\Omega = 1 \in \mathbb{C}$ and $\pi : f \mapsto f(x)$, and $\pi$ is trivially irreducible. So $F(\delta_x)$ is a pointwise locus for every $x \in M$. $F$ restricted to the space of Dirac measures $\{\delta_x\}_{x\in M}$ is surjective onto $L_p$. Indeed, take $\Lambda \in L_p$ and let $\lambda_I \in \Lambda \cap S_{pI}$. As irreducible representations of a commutative $C^*$-algebra are one-dimensional, a pure state $\omega$ on a commutative $C^*$-algebra admits a GNS representation on $\mathbb{C}$. As the cyclic vector is $1 \in \mathbb{C}$, one sees that $\omega$ is also multiplicative: $\omega(ab) = \omega(a)\omega(b)$. In other words $\omega$ itself is an irreducible $\mathbb{C}$-representation of the $C^*$-algebra. Therefore $\lambda_I$ is an irreducible representation of the commutative $C^*$-algebra $C(\bar I)$. A known theorem in commutative $C^*$-algebra theory (e.g., see [20, Proposition 2.2.2]) implies that there is $x_\Lambda \in \bar I$ such that $\lambda_I(f) = f(x_\Lambda)$ for all $f \in C(\bar I)$. (Precisely, the theorem states that $h_I : x \mapsto \lambda_I^{(\delta_x)}$ is a homeomorphism from $\bar I$ onto $S_{pI}$ equipped with the Gel'fand topology.) Then consider $F(\delta_{x_\Lambda})$. It is clear that $\lambda_I \in F(\delta_{x_\Lambda})$ and thus $\Lambda = F(\delta_{x_\Lambda})$ by (a). Up to now we have proven that the map $H : M \to L_p$ with $H(x) = F(\delta_x)$ is a bijection from $M$ onto $L_p$. It remains to show that $H$ is a homeomorphism when $L_p$ is equipped with the inductive-limit topology obtained by equipping each $S_I$ with the weak $*$-topology (Gel'fand topology).
To prove that $H$ is a homeomorphism, notice that $M$ can be naturally identified with $M'$, the inductive limit of the class of compact sets $\{\bar I\}_{I\in X}$ equipped with a class of maps $F_{I,J} : \bar J \to \bar I$, when $\bar J \subset \bar I$, $F_{I,J}$ being the inclusion map. As $M \equiv M'$, the injective inclusion maps $F_I : \bar I \to M'$ ($F_I : x \mapsto [x]$ where $x \in \bar I$, $I \in X$ and $[x] \in M' \equiv M$ being the equivalence class of $x$ in the inductive limit) coincide with the usual inclusion maps of each $\bar I$ in $M$ itself. By definition the inductive-limit topology is the finest topology on the inductive limit set which makes all the inclusion maps $F_I$ continuous. In other words a set $A \subset M' \equiv M$ is open if and only if $A \cap \bar I$ is open in the topology of $\bar I$, for all $I \in X$. As the sets $I$ are open and $\cup X = M$, the
inductive-limit topology on $M' \equiv M$ coincides with the original topology of $M$. To conclude, consider the following ingredients: the space $L_p$ realized as the inductive limit of the family $\{S_{pI}\}_{I\in X}$ (with maps $J_{I,J}$) equipped with the inductive-limit topology induced by the Gel'fand topology in the spaces $S_{pI}$, the injective inclusion maps $G_I : S_{pI} \to L_p$ and the homeomorphisms $h_I : \bar I \to S_{pI}$ mentioned above. Using (b) above, it is a trivial task to show that, for every $I \in X$, $G_I \circ h_I = H \circ F_I$. As every $G_I \circ h_I$ is continuous, $H$ turns out to be continuous. Conversely, since it also holds that $H^{-1} \circ G_I = F_I \circ h_I^{-1}$ and every $F_I \circ h_I^{-1}$ is continuous, $H^{-1}$ turns out to be continuous too. We have obtained that $H$ is a homeomorphism.

B.4. Proof of Theorem 4.2

(a) In the given hypotheses, take $I \in X$ with $\Lambda, \Lambda'\ \varepsilon\ I$. If $\lambda_I \in \Lambda \cap S_I$ and $\lambda'_I \in \Lambda' \cap S_I$, one has $\lambda_I(t) = \lambda'_I(t)$ for every $t \in \mathrm{Co}_I$. By Proposition 4.3(b), the linearity and the continuity of the states $\lambda_I$, $\lambda'_I$, one gets $\lambda_I = \lambda'_I$ and thus $\Lambda = \Lambda'$. The proof of (b) and (c) is based on the following lemma.

Lemma B.1. In the hypotheses of Theorem 4.2, $\Lambda \trianglelefteq \Lambda'$ implies both $\operatorname{supp}(\mu') \subset J^+(\operatorname{supp}(\mu))$ and $\operatorname{supp}(\mu) \subset J^-(\operatorname{supp}(\mu'))$, where $F(\mu) = \Lambda$ and $F(\mu') = \Lambda'$ and $F$ is defined in Theorem 4.1.

Proof of Lemma B.1. We prove $\operatorname{supp}(\mu) \subset J^-(\operatorname{supp}(\mu'))$; the other inclusion is analogous, taking $p \prec\prec q$ and $-d(\cdot,q)$ in place of $d(q,\cdot)$ in the following proof. Suppose that $\Lambda \trianglelefteq \Lambda'$ and there is $p \in \operatorname{supp}(\mu)$ with $p \notin J^-(\operatorname{supp}(\mu'))$. Since $J^-(\operatorname{supp}(\mu'))$ is closed as $\operatorname{supp}(\mu')$ is compact (see Sec. A.12), there is an open neighborhood of $p$ which has no intersection with $J^-(\operatorname{supp}(\mu'))$. Take $q \prec\prec p$ in such a neighborhood. $J^+(q) \cap J^-(\operatorname{supp}(\mu')) = \emptyset$ by construction. Let $I \in X$ be such that $\operatorname{supp}(\mu), \operatorname{supp}(\mu') \subset I$ (such a set exists because of Proposition 2.4(a)) and let $t \in \mathrm{Co}_I$. By Proposition 2.3(c), $t_\alpha = t + \alpha\, d(q,\cdot)|_{\bar I} \in \mathrm{Co}_I$ for all $\alpha > 0$. Therefore $\Lambda \trianglelefteq \Lambda'$ entails, making use of Theorem 4.1, $\int_{\bar I} t_\alpha\,d\mu \le \int_{\bar I} t_\alpha\,d\mu'$.
Since $\alpha\,d(q,\cdot)$ vanishes on $\operatorname{supp}(\mu') \subset J^-(\operatorname{supp}(\mu'))$, the same inequality can be re-written $\int_{\bar I} t\,d\mu + \alpha\int_{\bar I} d(q,\cdot)\,d\mu \le \int_{\bar I} t\,d\mu'$ for every $\alpha > 0$. On the other hand it holds that $\int_{\bar I} d(q,\cdot)\,d\mu > 0$. Indeed (i) if $U_p \subset \bar U_p \subset I^+(q) \cap I$ is an open neighborhood of $p$, it must be $\mu(U_p) \neq 0$ because $p \in \operatorname{supp}(\mu)$, moreover (ii) $d(q,\cdot)|_{U_p} \ge \gamma > 0$ because of Proposition 2.1(a) and the continuity of $d$. Consequently there must be some sufficiently large $\alpha > 0$ which produces a contradiction in $\int_{\bar I} t\,d\mu + \alpha\int_{\bar I} d(q,\cdot)\,d\mu \le \int_{\bar I} t\,d\mu'$.
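The mechanism of the proof of Lemma B.1 can be illustrated in a toy setting. The sketch below is not part of the paper: it assumes two-dimensional Minkowski space with metric $-dt^2 + dx^2$, takes $\mu$ and $\mu'$ to be Dirac measures at hypothetical points `p` and `p_prime`, and uses the time coordinate as the causal function $t$. It shows that the inequality $t_\alpha(p) \le t_\alpha(p')$ must fail for large $\alpha$ when $p \notin J^-(p')$.

```python
import math

def lorentzian_distance(q, x):
    """Lorentzian distance d(q, x) in 2D Minkowski space (metric -dt^2 + dx^2).

    Points are (t, x) pairs. The value is the proper time of the straight
    segment from q to x when x lies in the causal future of q, and 0 otherwise.
    """
    dt = x[0] - q[0]
    ds = x[1] - q[1]
    if dt >= abs(ds):
        return math.sqrt(dt * dt - ds * ds)
    return 0.0

def t_alpha(x, q, alpha):
    """Causal function t + alpha * d(q, .), with t the time coordinate."""
    return x[0] + alpha * lorentzian_distance(q, x)

# Hypothetical data: p is NOT in the causal past of p_prime (spacelike separation).
p = (0.0, 0.0)        # support of mu (Dirac measure)
p_prime = (1.0, 5.0)  # support of mu' (Dirac measure)
q = (-0.1, 0.0)       # chronologically precedes p; J^+(q) misses J^-(p_prime)

def inequality_holds(alpha):
    """The ordering condition tested on t_alpha: integral against mu <= integral against mu'."""
    return t_alpha(p, q, alpha) <= t_alpha(p_prime, q, alpha)

# Beyond this value of alpha the inequality is violated.
threshold = (p_prime[0] - p[0]) / lorentzian_distance(q, p)
```

With these values $d(q,\cdot)$ vanishes at `p_prime` but not at `p`, so the left-hand side grows linearly in $\alpha$ while the right-hand side stays fixed, which is exactly the contradiction used in the proof.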
Let us come back to the main proof. Concerning (c), the proof of the statement "$x \preceq y$ implies $F(\delta_x) \trianglelefteq F(\delta_y)$" is obvious by the given definitions. Using the proven lemma, the proof of the statement "$F(\delta_x) \trianglelefteq F(\delta_y)$ implies $x \preceq y$" is straightforward. Concerning (b), notice that by the lemma and Theorem 4.1(a), $\Lambda \trianglelefteq \Lambda' \trianglelefteq \Lambda''$ implies, with obvious notations, $\operatorname{supp}(\mu') \subset J^+(\operatorname{supp}(\mu)) \cap J^-(\operatorname{supp}(\mu''))$. For every open set $I \in X$ such that $\bar I$ contains both $\operatorname{supp}(\mu)$ and $\operatorname{supp}(\mu'')$, $\bar I$ must contain $J^+(\operatorname{supp}(\mu)) \cap J^-(\operatorname{supp}(\mu''))$ because $\bar I$ is causally convex. Hence $\operatorname{supp}(\mu') \subset J^+(\operatorname{supp}(\mu)) \cap J^-(\operatorname{supp}(\mu'')) \subset \bar I$. In other words, using Theorem 4.1(a), if $\Lambda \trianglelefteq \Lambda' \trianglelefteq \Lambda''$ and $\Lambda, \Lambda''\ \varepsilon\ I \in X$, then $\Lambda'\ \varepsilon\ I$.

Appendix C. Proof of Theorem 2.1

The proof of Theorem 2.1 is based on two lemmata.

Lemma C.1. If $(M, g, O_t)$ is globally-hyperbolic and $p \in M$, referring to the definitions above, $C^+(p)$ and $\partial J^+(p) = \partial I^+(p) = J^+(p)\setminus I^+(p)$ are closed without internal points and have measure zero; finally $J^+(p)\setminus(C^+(p) \cup \partial J^+(p)) = I^+(p)\setminus C^+(p)$ is homeomorphic to $\mathbb{R}^{\dim(M)}$.

Proof. From here on $n := \dim(M)$ and $V_p^+ \subset T_pM$ is the cone made of future-directed causal vectors and 0. First consider $\partial J^+(p) = \partial I^+(p) = J^+(p)\setminus I^+(p)$, these identities being given in Sec. A.12. It is obvious that $\partial J^+(p)$ is closed; let us prove that it has measure zero. $J^+(p)\setminus I^+(p) \subset \exp_p(U_p \cap \partial V_p^+)$ where $U_p$ is the open domain of the exponential map at $p$ (see Appendix A). Indeed if $q \in J^+(p)\setminus I^+(p)$, either $q = p$ or, by Proposition 2.1(i), there is a geodesic from $p$ to $q$ which must be null, it being maximal and $q \notin I^+(p)$. Therefore (re-scaling the vector if necessary) there must be a vector $v \in \partial V_p^+ \cap U_p$ with $\exp_p v = q$. The Lebesgue measure of $\partial V_p^+ \subset \mathbb{R}^n$ vanishes and thus, since $\exp_p$ is smooth and thus locally Lipschitz, $\partial I^+(p)$ must have measure zero. Indeed one has that the part of $\partial I^+(p)$ contained in the domain $V$ of any local coordinate chart $(V, \psi)$ has measure zero, with respect to the Lebesgue coordinate measure and thus $\mu_g$, because $\psi \circ \exp_p$ is locally Lipschitz on $\exp_p^{-1}(V)$ for all $k \in \mathbb{N}$. Then the countable additivity of $\mu_g$ and the existence of a countable atlas of the manifold entail the thesis for the whole set $\partial I^+(p)$. The closedness of $C^+(p)$ was proven in [1, Theorem 9.35]; the absence of internal points is a trivial consequence of the measure zero (since nonempty open sets have positive measure $\mu_g$). The last statement in the thesis is a consequence of [1, Proposition 9.36], due to Galloway.
It remains to show that $C^+(p)$ has measure zero. Similarly to the proof for $\partial J^+(p)$, it is sufficient to prove that the Lebesgue measure in $\mathbb{R}^n$ of $\Gamma^+_{ns}(p)$ is zero: since $\Gamma^+_{ns}(p) \subset U_p$ and $C^+(p) = \exp_p(\Gamma^+_{ns}(p))$, the latter has measure zero if $\Gamma^+_{ns}(p)$ has measure zero. To this end notice that $UM_p$ can be thought to be embedded in $\mathbb{R}^n$ and diffeomorphic to the intersection of the sphere $S^{n-1}$, $\sum_{i=1}^n (X^i)^2 = 1$, and the cone $V^+$, $X^1 \ge \sqrt{\sum_{i=2}^n (X^i)^2}$. $UM_p$ is compact by construction. Fix $N \in \mathbb{N}$ and consider the set $K_N := \{v \in UM_p \mid s_1(v) \le N\}$. As $s_1$ is lower semicontinuous, $K_N$ is closed and thus compact (since $K_N \subset UM_p$ which is compact and the topology being Hausdorff), moreover $\cup_N K_N = UM_p$. As a second step we define $S_N = \{v \in K_N \mid s_1(v)v \in U_p\}$. It is clear that, by the countable additivity of the Lebesgue measure, the thesis is proven if one shows that, for every $N \in \mathbb{N}$, the image of the map $v \mapsto s_1(v)v$ with $v \in S_N$ has measure zero. The map $v \mapsto s_1(v)$, $v \in S_N$, is continuous (see the beginning of this appendix). Using the continuity of $v \mapsto s_1(v)v$ and the fact that $U_p$ is open, it arises that $S_N$ is open with respect to the topology of $K_N$. We conclude that $S_N = K_N \cap B_N$, where $K_N$ is compact and
$B_N$ is an open set in $S^{n-1}$. If $B_N$ is not connected we shall refer to each connected component of $B_N$ in the following. The open set $B_N$ admits a finite or countable class of components because the topology of $S^{n-1}$ is second countable. Consider a countable class of compact sets $H_n \subset B_N$ such that $\cup_n H_n = B_N$ (they do exist because $B_N$ is a connected manifold or it holds for each connected component). $H_n \cap K_N$ is compact (since the topology of $S^{n-1}$ is Hausdorff), $K_N \cap H_n \subset S_N$ and $\cup_n (K_N \cap H_n) = S_N$. The function $v \mapsto s_1(v)v$ is continuous on each compact $K_N \cap H_n$ and thus its image has measure zero in $\mathbb{R}^n$. By countability the image of $v \mapsto s_1(v)v$, $v \in S_N$, has measure zero as required.

The proof of the theorem ends by proving the lemma below. In the proof we make use of the sets
\[ A_p := \{\lambda v \mid v \in UM_p,\ \lambda \in [0, s_1(v))\} \quad\text{and}\quad \mathring{A}_p := A_p \setminus \{v \in T_pM \mid g_p(v,v) = 0\} . \]
$A_p \subset U_p$ as one can straightforwardly prove using the definition of $s_1$ and the domain of the exponential map at $p$, $U_p$; moreover $\mathring{A}_p$ is open as a consequence of the lower semicontinuity of $s_1$.

Lemma C.2. If $(M, g, O_t)$ is globally-hyperbolic and $p \in M$, referring to the definitions above the following statements hold true.
(a) $\exp_p|_{A_p}$ is surjective onto $J^+(p)\setminus C^+(p)$, $\exp_p|_{\mathring{A}_p}$ is surjective onto $I^+(p)\setminus C^+(p)$;
(b) $\exp_p|_{A_p}$ and $\exp_p|_{\mathring{A}_p}$ are smooth and injective;
(c) $(\exp_p|_{A_p})^{-1} \in C^\infty(J^+(p)\setminus C^+(p))$ and $(\exp_p|_{\mathring{A}_p})^{-1} \in C^\infty(I^+(p)\setminus C^+(p))$;
(d) The map $q \mapsto d(p,q)^2$ belongs to $C^\infty(J^+(p)\setminus C^+(p))$;
(e) The map $q \mapsto d(p,q)$ belongs to $C^\infty(I^+(p)\setminus C^+(p))$;
(f) $g_q({\uparrow}d_q d(p,q), {\uparrow}d_q d(p,q)) = -1$ for $q \in I^+(p)\setminus C^+(p)$.

Proof. (a) Take $q \in J^+(p)\setminus C^+(p)$. If $q = p$, $q = \exp_p(0)$ and $0 \in A_p$. If $q \neq p$, by Proposition 2.1(i) (since the spacetime is globally-hyperbolic) there is a future-directed causal geodesic $\gamma : [0,b) \to M$ with $\gamma(0) = p$ and $\gamma(a) = q$, $a < b$, and $\gamma$ is maximal from $p$ to $q$. Rescaling the affine parameter of $\gamma$, we can assume that $v := \dot\gamma(0) \in UM_p$.
It must hold that $a \le s_1(v)$ by maximality and $a \neq s_1(v)$ because it would imply $q \in C^+(p)$ by definition. Therefore $q = \exp_p(\lambda v)$ with $\lambda \in [0, s_1(v))$, namely, $q \in \exp_p(A_p)$. If $q \in J^+(p)\setminus C^+(p)$ but $q \notin \partial J^+(p)$ then (the spacetime being globally-hyperbolic) $q \in I^+(p)\setminus C^+(p)$ and so $v$ above must belong to $\mathring{A}_p$. (b) As is known (see Appendix A), the exponential map is smooth where it is defined. Let us consider the injectivity. Suppose there are $u, v \in A_p$, $u \neq v$, with $\exp_p(u) = \exp_p(v)$. This is equivalent to saying that $\exp_p(\lambda v_0) = \exp_p(\mu u_0) = q$ for some $v_0, u_0 \in UM_p$ and $0 < \lambda < s_1(v_0)$, $0 < \mu < s_1(u_0)$. In other words $q$ is contained in a maximal future-directed causal geodesic from $p$ to some $q'$ (after $q$), and thus the subsegment from $p$ to $q$ is a maximal geodesic, too. Moreover there is another
maximal future-directed causal geodesic from $p$ to $q$ itself. [1, Lemmas 9.1 and 9.12] imply that $q$ cannot be the image of a point in $A_p$ and this is impossible. (c) It is a trivial consequence of (a), (b) and the fact that $\exp_p$ is a local diffeomorphism about every point of $A_p$. This is because there are no conjugate points with $p$ along each future-directed causal geodesic starting from $p$ before the corresponding cut point, as stated in [1, Theorems 9.12 and 9.15]. (d) If $q \in J^+(p)\setminus C^+(p)$, there is a causal future-directed geodesic, $\gamma$, from $p$ to $q$ whose length coincides with $d(p,q)$ and whose initial tangent vector is nothing but $(\exp_p|_{A_p})^{-1}(q) \in A_p$. Therefore
\[ d(p,q)^2 = L(\gamma)^2 = -g_p\big((\exp_p|_{A_p})^{-1}(q), (\exp_p|_{A_p})^{-1}(q)\big) \]
from trivial properties of geodesics. From here on $-2\sigma(p,q)$ indicates the right-hand side of the obtained identity. (e) $d(p,\cdot) = \sqrt{d(p,\cdot)^2}$ and $x \mapsto \sqrt{x}$ is smooth for $x > 0$. $d(p,\cdot)^2$ cannot vanish in the open set $I^+(p)\setminus C^+(p)$. (f) $d_q d(p,q) = d_q\sqrt{-2\sigma(p,q)}$. Thus $d_q d(p,q) = -(-2\sigma(p,q))^{-1/2}\, d_q\sigma(p,q)$ and consequently one gets
\[ g_q({\uparrow}d_q d(p,q), {\uparrow}d_q d(p,q)) = (-2\sigma)^{-1} g_q({\uparrow}d_q\sigma(p,q), {\uparrow}d_q\sigma(p,q)) . \]
(A.2) holds in geodesically convex neighborhoods. However it can also be proven in our hypotheses following the proof of [9, Theorem 1.2.3, items (iii) and (iv)] which only employs the variational definition of (timelike) geodesics. Using (A.2) one has $g_q({\uparrow}d_q d(p,q), {\uparrow}d_q d(p,q)) = -1$.

References
[1] J. K. Beem, P. E. Ehrlich and K. L. Easley, Global Lorentzian Geometry, 2nd ed., Marcel Dekker, New York, 1996.
[2] A. Connes, Noncommutative Geometry, Academic Press, New York, 1994.
[3] J. Dieckmann, Cauchy surfaces in a globally-hyperbolic spacetime, J. Math. Phys. 29 (1988), 578.
[4] A. N. Bernal and M. Sánchez, On smooth Cauchy hypersurfaces and Geroch's splitting theorem, Commun. Math. Phys., in print [arXiv:gr-qc/0306108].
[5] M. P. do Carmo, Riemannian Geometry, Birkhäuser, Boston, 1992.
[6] J. M. G. Fell and R. S. Doran, Representations of ∗-Algebras, Locally Compact Groups, and Banach ∗-Algebraic Bundles, Vol. 1, Academic Press, Boston, 1988.
[7] J. Fröhlich and K. Gawedzki, Conformal Field Theory and the Geometry of Strings, CRM Proc. and Lecture Notes 7 (1994), 57–97 [arXiv:hep-th/9310187].
[8] J. Fröhlich, O. Grandjean and A. Recknagel, Supersymmetric quantum theory and (non-commutative) differential geometry, Commun. Math. Phys. 193 (1998), 527–594 [arXiv:hep-th/9612205].
[9] F. G. Friedlander, The Wave Equation on a Curved Space-time, Cambridge University Press, Cambridge, 1975.
[10] V. Gayral, J. M. Gracia-Bondía, B. Iochum, T. Schücker and J. C. Várilly, Moyal planes are spectral triples [arXiv:hep-th/0307241].
[11] J. M. Gracia-Bondía, J. C. Várilly and H. Figueroa, Elements of Noncommutative Geometry, Birkhäuser, Boston, 2000.
[12] S. W. Hawking, The existence of cosmic time functions, Proc. Roy. Soc. London A308 (1968), 433.
[13] R. Haag, Local Quantum Physics, 2nd ed., Springer, Berlin, 1996.
[14] S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-time, Cambridge University Press, Cambridge, 1973.
[15] E. Hawkins, Hamiltonian gravity and noncommutative geometry, Commun. Math. Phys. 187 (1997), 471 [arXiv:gr-qc/9605068].
[16] W. Kalau, Hamilton formalism in noncommutative geometry, J. Geom. Phys. 18 (1996), 349 [arXiv:hep-th/9409193].
[17] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, Vol. 1, Interscience Publishers, New York, 1963.
[18] T. Kopf, Spectral geometry and causality, Int. J. Mod. Phys. A13 (1998), 2693 [arXiv:gr-qc/9609050].
[19] T. Kopf, Spectral geometry of spacetime, Int. J. Mod. Phys. B14 (2000), 2359 [arXiv:hep-th/0005260].
[20] G. Landi, An Introduction to Noncommutative Spaces and Their Geometries, Springer-Verlag, Berlin Heidelberg, 1997.
[21] B. O'Neill, Semi-Riemannian Geometry with Applications to Relativity, Academic Press, New York, 1983.
[22] G. N. Parfinov and R. Zapatrin, Connes duality in pseudo-Riemannian geometry, J. Math. Phys. 41 (2000), 7122 [arXiv:gr-qc/9803090].
[23] R. Penrose, Techniques of Differential Topology in Relativity, Regional Conference Series in Applied Mathematics 7, SIAM, Philadelphia, 1972.
[24] W. Rudin, Real and Complex Analysis, McGraw-Hill, London, 1970.
[25] B. Simon, Mathematics of Contemporary Physics, ed. R. F. Streater, Academic Press, London, 1972.
[26] H.-J. Seifert, Smoothing and extending cosmic time functions, Gen. Rel. Grav. 8 (1977), 815.
[27] A. Strohmaier, On noncommutative and semi-Riemannian geometry [arXiv:math-ph/0110001].
[28] R. M. Wald, General Relativity, University of Chicago Press, Chicago, 1984.
[29] R. M. Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics, University of Chicago Press, Chicago, 1994.
February 9, 2004 19:32 WSPC/148-RMP
00189
Reviews in Mathematical Physics, Vol. 15, No. 10 (2003) 1219–1253 © World Scientific Publishing Company
CONFINEMENT TO LOWEST LANDAU BAND AND APPLICATION TO QUANTUM CURRENT
S. FOURNAIS∗
Laboratoire de Mathématiques, Université Paris-Sud – Bât 425, F-91405 Orsay Cedex, France
Received 18 October 2002
Revised 23 October 2003

In this paper we study neutral atoms of nuclear charge $Z$ in a strong constant magnetic field of strength $B$. We improve the bound on confinement to the lowest Landau band given by Lieb, Solovej and Yngvason in [1]. This permits the calculation of the asymptotic form of the quantum current of atoms in the parameter region $Z^{4/3} \ll B \ll Z^{98/51}$.

Keywords: Large atoms; magnetic fields; magnetic currents; magnetic Thomas–Fermi theory.
Contents
1. Introduction
1.1. Organization of the paper
2. Statement of the Results on the Current
2.1. Short summary of relevant known results on the energy and density
2.2. Main result on the current
2.3. Splitting the current
2.4. The parallel current
3. Basic Estimates
4. Commutator Formula
5. Analysis of JINT
5.1. A note on how and why
5.2. A concave function
5.3. MTF-theory with a current term
5.4. Calculating JINT
6. Analysis of JKIN
7. Localization to Lowest Landau Band
Acknowledgments
References

∗ Supported by a grant from the Carlsberg Foundation.
1. Introduction

In the present paper we will study the behavior of large atoms in very strong magnetic fields. The asymptotics of the ground state energy and density of such atoms have already been determined in the pioneering papers [1, 2] (see also [3] and [4] for the case of non-constant magnetic field). Here the objects of study will be the local current and magnetic moment of the atoms. The (local) current is defined as the variational derivative of the energy with respect to the magnetic vector potential; similarly the (local) magnetic moment is the derivative with respect to the magnetic field.

Before further discussion, let us set the mathematical stage. The magnetic Hamiltonian (Pauli operator) of an $N$-electron atom of nuclear charge $Z$ in a magnetic field $B = \operatorname{curl} A$ is
\[ H(N,Z,A) = \sum_{j=1}^{N}\Big( H_A^{(j)} - \frac{Z}{|x^{(j)}|} \Big) + \sum_{1\le j<k\le N} \frac{1}{|x^{(j)} - x^{(k)}|} . \tag{1.1} \]
Here $H_A = (p + A(x))^2 + \sigma\cdot B(x)$, with $\sigma = (\sigma_1,\sigma_2,\sigma_3)$ being the vector of Pauli spin matrices,
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . \]
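As a small sanity check of the spin algebra entering $H_A$, the following sketch verifies the Pauli matrix identity $(\sigma\cdot u)(\sigma\cdot v) = (u\cdot v)\,1 + i\,\sigma\cdot(u\times v)$ for ordinary (commuting) vectors. This identity is a standard fact and is not stated in the text; applied formally to the non-commuting components of $p + A$ it is what makes the term $\sigma\cdot B$ appear, so that $H_A$ can be viewed as $(\sigma\cdot(p+A))^2$. All helper names below are illustrative.

```python
# Pauli matrices as 2x2 nested lists (pure Python; exact for small integers).
I2 = [[1, 0], [0, 1]]
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k] of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """The matrix sigma . v for a 3-vector v."""
    return lincomb(v, SIGMA)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def product_identity_holds(u, v):
    """Check (sigma.u)(sigma.v) == (u.v) I + i sigma.(u x v)."""
    lhs = matmul(sigma_dot(u), sigma_dot(v))
    rhs = lincomb((dot(u, v), 1j), (I2, sigma_dot(cross(u, v))))
    return lhs == rhs
```

For example, `product_identity_holds((1, 2, 3), (4, 5, 6))` evaluates the identity on two integer vectors, where the arithmetic is exact.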
The operator $H(N,Z,A)$ acts on the electronic Hilbert space (including spin) $\mathcal{H} = \wedge_{j=1}^{N} L^2(\mathbb{R}^3;\mathbb{C}^2)$. As usual $p = (p_1,p_2,p_3) = (-i\nabla)$ and a superscript $(j)$ denotes that the corresponding operator acts on the $j$th factor in the antisymmetric tensor product defining $\mathcal{H}$. In particular $x^{(j)}$ is the coordinate of the $j$th electron. In the following we will use the notation $p_A = (p_{A,1},p_{A,2},p_{A,3}) = (p + A)$. Let $\psi \in \mathcal{H}$ be a ground state of $H(N,Z,A)$ with energy $E(N,Z,A)$. It is easy to see that (if the derivative exists)
\[ \frac{d}{dt} E(N,Z,A + ta)\Big|_{t=0} = \langle\psi; J(a)\psi\rangle . \tag{1.2} \]
Here we have introduced the notation $\langle\cdot\,;\cdot\rangle$ for the scalar product in $\mathcal{H}$ and we have, with $b = \operatorname{curl} a = (b_1,b_2,b_3)$,
\[ J(a) = \sum_{j=1}^{N}\Big( a(x^{(j)})\cdot p_A^{(j)} + p_A^{(j)}\cdot a(x^{(j)}) + \sigma^{(j)}\cdot b(x^{(j)}) \Big) . \]
Equation (1.2) defines the current $j_\psi$ in the state $\psi$ as the distribution
\[ C_0(\mathbb{R}^3,\mathbb{R}^3) \ni a \mapsto \int_{\mathbb{R}^3} a(x)\cdot j_\psi(x)\,dx \equiv \langle\psi; J(a)\psi\rangle . \tag{1.3} \]
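The relation (1.2) is a first-order perturbation (Hellmann–Feynman type) identity: the derivative of the ground state energy equals the expectation of the derivative of the Hamiltonian in the ground state. The sketch below illustrates this in a finite-dimensional toy model only; the $2\times 2$ matrices `H0` and `V` are hypothetical stand-ins for $H(N,Z,A)$ and $J(a)$ and have nothing to do with the actual operators.

```python
import math

def ground_state(h):
    """Lowest eigenvalue and a normalized eigenvector of a real symmetric
    2x2 matrix with nonzero off-diagonal entry."""
    a, b, c = h[0][0], h[0][1], h[1][1]
    energy = 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b * b)
    v = (b, energy - a)  # solves (a - E) v0 + b v1 = 0
    norm = math.hypot(v[0], v[1])
    return energy, (v[0] / norm, v[1] / norm)

def perturbed(h0, v, t):
    """The matrix h0 + t * v."""
    return [[h0[i][j] + t * v[i][j] for j in range(2)] for i in range(2)]

def expectation(m, psi):
    """Quadratic form <psi; m psi> for a real vector psi."""
    return sum(psi[i] * m[i][j] * psi[j] for i in range(2) for j in range(2))

# Hypothetical toy data (assumptions, not taken from the paper).
H0 = [[1.0, 0.5], [0.5, -1.0]]
V = [[0.3, 0.2], [0.2, -0.1]]

# Left-hand side of the analogue of (1.2): central finite difference of E(t).
step = 1e-5
e_plus, _ = ground_state(perturbed(H0, V, step))
e_minus, _ = ground_state(perturbed(H0, V, -step))
finite_difference = (e_plus - e_minus) / (2 * step)

# Right-hand side: expectation of the perturbation in the unperturbed ground state.
_, psi0 = ground_state(H0)
hellmann_feynman = expectation(V, psi0)
```

The two numbers agree up to the finite-difference error because the ground state of `H0` is non-degenerate; in the paper the existence of the derivative is an explicit proviso.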
We will normally discuss the current in a ground state and leave out the reference to the state $\psi$ (also in the notation). By gauge invariance, the quantity $\langle\psi; J(a)\psi\rangle$ only depends on the magnetic field $b$ generated by $a$. Thus, one can equivalently define the local magnetic moment $m$ as
\[ \int_{\mathbb{R}^3} b(x)\cdot m(x)\,dx \equiv \langle\psi; J(a)\psi\rangle . \]
The magnetic moment and the current are connected by $\nabla\times m = j$ (as is easily seen by partial integration). In the rest of the paper we will only discuss the current.

In what follows we will fix $B = (0,0,B)$ and $A(x) = \frac{1}{2} B\times x$. The parameter $B$, which measures the strength of the magnetic field, will tend to infinity. With this choice of $B$, $A$ we will write $H(N,Z,B)$ instead of $H(N,Z,A)$, and define the ground state energy of the atom by
\[ E(N,Z,B) = \inf \operatorname{Spec} H(N,Z,B) . \tag{1.4} \]
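That the choice $A(x) = \frac{1}{2}B\times x$ indeed generates the constant field $B = (0,0,B)$ can be checked numerically. The following is a minimal sketch using central finite differences; the function names, the field strength and the sample point are illustrative assumptions.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vector_potential(x, field_strength):
    """A(x) = (1/2) B x x for the constant field B = (0, 0, field_strength)."""
    b_vec = (0.0, 0.0, field_strength)
    return tuple(0.5 * c for c in cross(b_vec, x))

def curl(field, x, h=1e-3):
    """Curl of a vector field at the point x by central finite differences."""
    def partial(i, j):
        # derivative of component i with respect to coordinate j
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (field(xp)[i] - field(xm)[i]) / (2 * h)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

strength = 2.0                 # hypothetical value of B
point = (0.7, -1.3, 0.4)       # arbitrary sample point
computed = curl(lambda x: vector_potential(x, strength), point)
```

Since $A$ is linear in $x$, the finite differences are exact up to rounding, and `computed` reproduces $(0, 0, B)$ at every point.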
We will sometimes denote this energy by $E^Q(N,Z,B)$ to distinguish it from other energies present in the paper. Furthermore, if $\psi$ is a normalized ground state of $H(N,Z,B)$, i.e. a normalized function with $\langle\psi; H(N,Z,B)\psi\rangle = E(N,Z,B)$, we define the ground state density as the following function in $L^1(\mathbb{R}^3)$:
\[ \rho^Q(x) = N\int_{\mathbb{R}^{3N-3}} |\psi(x, x^{(2)},\dots,x^{(N)})|^2\,dx^{(2)}\cdots dx^{(N)} . \tag{1.5} \]
(Here $|\cdot|$ is the euclidean norm in $\mathbb{C}^{2^N}$.) This definition of the density depends on the existence of a ground state. Notice that if the ground state is degenerate, then $\rho^Q$ is not uniquely defined. This non-uniqueness is not important. The convergence results below will be true for any choice of the ground state in case of degeneracy.

In the case of large atoms without magnetic fields an important mathematical result is the asymptotic correctness of the Thomas–Fermi theory as the nuclear charge $Z$ of the (neutral) atom tends to infinity, see [5, 6]. Thus to find the ground state energy of large atoms (to leading order) one can replace the study of the (linear) many-particle operator (1.1) (with $B = 0$) with the study of the (non-linear) Thomas–Fermi functional of the one-electron density. Similarly, in the presence of a magnetic field it was proved in [1, 2] that certain magnetic functionals of the density (and more generally of the one-electron density matrix) correctly predict the ground state energy and density in the limit where $B$ and $Z$ tend to $+\infty$. In order to be concrete let us consider the Magnetic Thomas–Fermi theory (MTF-theory), which will be defined in Sec. 2.1 below. It is a result of [1, 2] (see also [4] for the case of non-constant magnetic fields) that in the asymptotic regime $|(B,Z)| \to +\infty$, with $B/Z^3 \to 0$, we have for neutral (i.e. $N = Z$) atoms
\[ \frac{E^Q(N,Z,B)}{E^{\mathrm{MTF}}(N,Z,B)} \to 1 , \tag{1.6} \]
where $E^{\mathrm{MTF}}(N,Z,B)$ is the ground state energy of the magnetic Thomas–Fermi atom. Since the MTF-functional depends on $B$ (and therefore on $A$) it is straightforward to define an MTF-current by a variational derivative as in (1.2). Thus the natural question arises:

Are the quantum and MTF-currents asymptotically equal? (1.7)
One of the main results of the present paper is an affirmative answer to this question for $B \ll Z^{98/51}$. In the paper [4], the asymptotic correctness of MTF-theory with non-constant magnetic fields was studied. When the magnetic field was allowed to vary on a length scale comparable to the size of the atom, the convergence in (1.6) was proved to hold for magnetic field strengths $B \ll Z^2$ [4, Theorem 5.1, p. 642]. Thus one might expect new difficulties to appear in the calculation of the current for $Z^2 \le B \ll Z^3$. We expect that it might be possible to technically strengthen the estimates in Sec. 7 in order to improve the exponent $\frac{98}{51}$ to 2. However, we expect new ideas to be necessary to go beyond the exponent 2. Our precise results on the current will be given and discussed further in Sec. 2 below.

Notice that, due to (1.6), the question above is mathematically a question of interchanging limit and differentiation. The similar question for the density was already answered in [1, 2]. The density can also be viewed as a variational derivative of the energy (with respect to the electrostatic potential). But due to a convexity (concavity) argument the interchange of limit and differentiation is standard for the density. The energy $E(N,Z,A)$ is not a concave function of $A$, which makes the calculation of the current much more difficult than the calculation of the density.

As a technical tool we will obtain a result of independent interest concerning concentration to the lowest Landau band. We will state this result here before continuing the discussion of the current in Sec. 2. We define the kinetic energy operator in the coordinates perpendicular to the magnetic field, $\hat K$, as $\hat K = p_{A,1}^2 + p_{A,2}^2 + \sigma\cdot B$. It is well-known that the spectrum of $\hat K$ is $2B(\mathbb{N}\cup\{0\})$. The lowest Landau band (for one electron) is defined as the kernel of the operator $\hat K$ (acting on $L^2(\mathbb{R}^3,\mathbb{C}^2)$). The projection $\Pi_0$ on the lowest Landau band for one electron has the integral kernel (see for instance [7])
\[ \Pi_0(x,y) = \frac{B}{2\pi}\, e^{iB(x_1y_2 - x_2y_1)/2}\, e^{-B(x_\perp - y_\perp)^2/4}\, \delta(x_3 - y_3)\, P^{\downarrow} , \tag{1.8} \]
where $x_\perp = (x_1,x_2)$, and where $P^{\downarrow} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is the projection to the spin-down subspace. We define $\Pi_0^N$ as the projection in $\wedge_{j=1}^{N} L^2(\mathbb{R}^3,\mathbb{C}^2)$ to the space where all electrons are in the lowest Landau band, i.e. $\Pi_0^N = \otimes_{j=1}^{N}\Pi_0^{(j)}$. Now we can define the ground state energy for electrons in the lowest Landau band $E^Q_{\mathrm{conf}}(N,Z,B)$ as
\[ E^Q_{\mathrm{conf}}(N,Z,B) = \inf\operatorname{Spec}\, \Pi_0^N H(N,Z,B)\Pi_0^N = \inf_{\psi\in\operatorname{Ran}\Pi_0^N\setminus\{0\}} \frac{\langle\psi; H(N,Z,B)\psi\rangle}{\|\psi\|^2} . \]
The improved estimate on localization to the lowest Landau band is the following (compare [1, Theorem 1.3]), proved in Sec. 7.
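Theorem 1.1. Let $\lambda, \Lambda, c > 0$ be given. Then there exist $C_1, C_2 > 0$ such that if $\lambda < N/Z < \Lambda$, $Z \ge C_2$, $B \le C_2^{-1} Z^3$, and $\beta = BZ^{-4/3} \ge C_2$, then

\[
E^{Q}_{\mathrm{conf}}(N, Z, B) \ge E^{Q}(N, Z, B) \ge E^{Q}_{\mathrm{conf}}(N, Z, B)\,(1 - C_1 R), \tag{1.9}
\]

where

\[
R = \begin{cases}
\beta^{-9/10} + \beta^{-141/515} Z^{-18/103}, & B \le Z^{508/321}, \\[2pt]
\beta^{-12/37} Z^{-6/37}, & Z^{508/321} \le B \le Z^{98/51}, \\[2pt]
\beta^{-72/205} Z^{-6/41}, & Z^{98/51} \le B .
\end{cases}
\]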
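The three branches of $R$ can be cross-checked at the two thresholds using exact rational arithmetic. The following is a minimal sketch, not part of the original argument: it assumes the exponents as transcribed in Theorem 1.1, writes $B = Z^{t}$ so that $\beta = Z^{t - 4/3}$, and compares the resulting powers of $Z$. The names `z_power` and `matching_report` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction as F

# Each term of R has the form beta**b * Z**z; we store the pair (b, z).
# The exponents below are transcribed from the three cases of Theorem 1.1.
CASE_LOW = [(F(-9, 10), F(0)), (F(-141, 515), F(-18, 103))]  # B <= Z^(508/321)
CASE_MID = [(F(-12, 37), F(-6, 37))]                          # middle regime
CASE_HIGH = [(F(-72, 205), F(-6, 41))]                        # Z^(98/51) <= B


def z_power(term, t):
    """Power of Z carried by beta^b * Z^z when B = Z^t, i.e. beta = Z^(t - 4/3)."""
    b, z = term
    return b * (t - F(4, 3)) + z


def matching_report():
    """Powers of Z of neighbouring branches at the thresholds t1 and t2."""
    t1, t2 = F(508, 321), F(98, 51)
    return {
        "low_second_term_at_t1": z_power(CASE_LOW[1], t1),
        "mid_at_t1": z_power(CASE_MID[0], t1),
        "mid_at_t2": z_power(CASE_MID[0], t2),
        "high_at_t2": z_power(CASE_HIGH[0], t2),
        # beta^(-3/5) is the relative error needed in (1.10).
        "target_at_t2": z_power((F(-3, 5), F(0)), t2),
    }


if __name__ == "__main__":
    for name, value in matching_report().items():
        print(name, value)
```

Under these assumptions the second term of the first branch and the middle branch carry the same power of $Z$ at $B = Z^{508/321}$, and the middle and last branches both reduce to $\beta^{-3/5}$ at $B = Z^{98/51}$.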
We easily get the following corollary to Theorem 1.1, which will be proved in Sec. 7 after the proof of the theorem. Given the kinetic energy operator in the coordinates perpendicular to the magnetic field, $\hat K$, defined above, we define the corresponding total perpendicular kinetic energy operator as

\[
\hat K_N = \sum_{j=1}^{N} \hat K^{(j)} .
\]

Corollary 1.2. Let $\lambda, \Lambda, c > 0$ be given, let $\psi = \psi_{N,Z,B}$ be a ground state for $H(N, Z, B)$ and let $R$ denote the error term from Theorem 1.1. Then there exist $C_1, C_2 > 0$ such that if $\lambda \le N/Z \le \Lambda$, $Z \ge C_2$, $BZ^{-4/3} \ge C_2$, $BZ^{-3} \le C_2^{-1}$, then the perpendicular kinetic energy satisfies the estimate

\[
0 \le \langle \psi | \hat K_N | \psi \rangle \le C_1 R\, |E^{Q}(N, Z, B)| .
\]

In particular, for all $\varepsilon > 0$ there exists $C > 0$ such that if $\lambda \le N/Z \le \Lambda$, $Z \ge C$, $BZ^{-4/3} \ge C$, $BZ^{-98/51} \le C^{-1}$, then

\[
\langle \psi | \hat K_N | \psi \rangle \le \varepsilon\, \beta^{-3/5}\, |E^{Q}(N, Z, B)| . \tag{1.10}
\]
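The role of the exponent $98/51$ in the second part of the corollary can be read off from a comparison of exponents. As a sketch (using only the middle branch of $R$ from Theorem 1.1 and $\beta = BZ^{-4/3}$):

```latex
\[
\frac{\beta^{-12/37} Z^{-6/37}}{\beta^{-3/5}}
  = \beta^{\frac{3}{5}-\frac{12}{37}}\, Z^{-\frac{6}{37}}
  = \beta^{\frac{51}{185}}\, Z^{-\frac{30}{185}}
  = \left( \frac{B^{51}}{Z^{68+30}} \right)^{1/185}
  = \left( \frac{B}{Z^{98/51}} \right)^{51/185} .
\]
```

So in the middle regime $R \le \varepsilon\beta^{-3/5}$ holds once $BZ^{-98/51}$ is small enough, while in the first regime the term $\beta^{-9/10} = \beta^{-3/10}\beta^{-3/5}$ is small relative to $\beta^{-3/5}$ as soon as $\beta$ is large.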
The interest in the estimate (1.10) is that $\beta^{-3/5}$ is exactly the relative error that we need for our application to the quantum current (in order for the right-hand side in Lemma 6.3 to be of lower order than the ground state energy).

1.1. Organization of the paper

The paper consists of two fairly independent parts: In Sec. 7 we prove Theorem 1.1 and Corollary 1.2, using a few standard estimates that we recall in Sec. 3. The rest of the paper is devoted to the study of the current, using the information implied by Corollary 1.2. In Sec. 2 we state our precise results on the current and discuss the relation to previous work on the subject. Section 4 describes a useful splitting of the current, and Secs. 5 and 6 contain the calculation of the non-trivial terms of the current after this splitting.

In this paper we will follow a general convention and denote by $c$ or $C$ constants that we will not try to control. These may change from line to line in a calculation but will anyway be denoted by the same symbol.
2. Statement of the Results on the Current

2.1. Short summary of relevant known results on the energy and density

To describe the results on the quantum current, we need to discuss the approximating theory. We refer to [1, 2] for proofs and more details. In the region $B \ll Z^3$, a Thomas–Fermi type theory (MTF-theory) describes the ground state of the atom correctly to highest order in $B$, $Z$. In this theory one assigns an energy to an electronic density $\rho \in L^1(\mathbb R^3)$ using the following functional

\[
\mathcal E^{\mathrm{MTF}}_{Z,B}[\rho] = \int_{\mathbb R^3} \Big( \tau_B^{\mathrm{MTF}}(\rho(x)) - Z\, \frac{\rho(x)}{|x|} \Big)\, dx + D(\rho, \rho), \tag{2.1}
\]

where

\[
D(f, g) = \frac{1}{2} \iint \frac{f(x) g(y)}{|x - y|}\, dx\, dy . \tag{2.2}
\]

Here the kinetic energy term, $\tau_B^{\mathrm{MTF}}(\rho)$, is given as follows

\[
\tau_B^{\mathrm{MTF}}(\rho) = \sup_{\nu \ge 0}\, [\rho\nu - P_B(\nu)],
\]

with

\[
P_B(\nu) = \frac{B}{3\pi^2} \Big[ \nu^{3/2} + 2 \sum_{j=1}^{\infty} [2jB - \nu]_-^{3/2} \Big].
\]

Here we have introduced the notation

\[
[x]_- = \begin{cases} -x, & x \le 0, \\ 0, & x > 0 . \end{cases}
\]

The domain $\mathcal C_B$ of the MTF-functional is the set of positive measurable functions $\rho$ satisfying

\[
\int \tau_B(\rho(x))\, dx < \infty, \qquad D(\rho, \rho) < \infty .
\]

We define the MTF-ground state energy as

\[
E^{\mathrm{MTF}}(N, Z, B) = \inf \Big\{ \mathcal E^{\mathrm{MTF}}_{Z,B}[\rho];\ 0 \le \rho(x),\ \int \rho(x)\, dx \le N \Big\}. \tag{2.3}
\]

Here the infimum is, of course, taken over all $\rho$ in the domain $\mathcal C_B$. It is proved in [2] that a unique minimizer $\rho^{\mathrm{MTF}}_{N,Z,B}$ exists.

Remark 2.1.
• In usual (non-magnetic) Thomas–Fermi theory the kinetic energy term is just $c_{\mathrm{TF}}\, \rho^{5/3} = \lim_{B \to 0} \tau_B^{\mathrm{MTF}}(\rho)$, with $c_{\mathrm{TF}} = \frac{3}{5}(3\pi^2)^{2/3}$.
• One can easily extend the definition of $E^{\mathrm{MTF}}$ to magnetic fields varying in space: just replace $\tau_B^{\mathrm{MTF}}$ by $\tau_{|B(x)|}^{\mathrm{MTF}}$ in (2.1). Thus one can define an MTF-current by a variational derivative of the MTF-energy.
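The Legendre transform defining $\tau_B^{\mathrm{MTF}}$ is easy to evaluate numerically, which gives a quick sanity check of the two limiting regimes. The sketch below is illustrative only: it assumes the formula for $P_B$ as written above, uses that $P_B$ is convex so the supremum is attained where $P_B'(\nu) = \rho$, and solves that equation by bisection. The function names (`pressure`, `density`, `tau`) are hypothetical. The small-field comparison value $\frac{3}{5}(3\pi^2)^{2/3}\rho^{5/3}$ and the strong-field value $\frac{4\pi^4}{3}\rho^3/B^2$ (valid when only the $j = 0$ term contributes) both follow from a direct computation with this $P_B$.

```python
import math


def pressure(nu, B):
    """P_B(nu) = B/(3 pi^2) * [nu^(3/2) + 2 * sum_{j>=1} (nu - 2jB)_+^(3/2)]."""
    total = nu ** 1.5
    j = 1
    while 2 * j * B < nu:
        total += 2.0 * (nu - 2 * j * B) ** 1.5
        j += 1
    return B / (3.0 * math.pi ** 2) * total


def density(nu, B):
    """dP_B/dnu = B/(2 pi^2) * [nu^(1/2) + 2 * sum_{j>=1} (nu - 2jB)_+^(1/2)]."""
    total = math.sqrt(nu)
    j = 1
    while 2 * j * B < nu:
        total += 2.0 * math.sqrt(nu - 2 * j * B)
        j += 1
    return B / (2.0 * math.pi ** 2) * total


def tau(rho, B, iterations=80):
    """sup_{nu >= 0} [rho * nu - P_B(nu)], located by bisection on density(nu) = rho."""
    if rho <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while density(hi, B) < rho:  # enlarge the bracket until it contains the maximiser
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if density(mid, B) < rho:
            lo = mid
        else:
            hi = mid
    nu = 0.5 * (lo + hi)
    return rho * nu - pressure(nu, B)


if __name__ == "__main__":
    c_tf = 0.6 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)
    print("weak field  :", tau(1.0, 0.01), "vs c_TF =", c_tf)
    print("strong field:", tau(1.0, 100.0), "vs", 4.0 * math.pi ** 4 / (3.0 * 100.0 ** 2))
```

For $\rho = 1$ the weak-field value should approach $c_{\mathrm{TF}}$ as $B \to 0$, and the strong-field value should coincide with the lowest-band formula up to the bisection tolerance.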
The result on convergence of the energies can be written as follows.

Theorem 2.2 (Energy Asymptotics [1, 2]). Let $N/Z$ be fixed and let $\psi = \psi_{N,Z,B}$ be a ground state of $H(N, Z, B)$. Introduce the parameter $\beta = BZ^{-4/3}$. Suppose that $B/Z^3 \to 0$ as $Z \to \infty$, then

\[
\frac{E^{Q}(N, Z, B)}{E^{\mathrm{MTF}}(N, Z, B)} \to 1 \quad \text{as } Z \to \infty . \tag{2.4}
\]

Furthermore, with $\ell = Z^{-1/3}(1 + \beta)^{-2/5}$,

\[
\int_{\mathbb R^3} \rho^{Q}(x)\, Z\ell^{-1} V(x/\ell)\, dx = \int_{\mathbb R^3} \rho^{\mathrm{MTF}}(x)\, Z\ell^{-1} V(x/\ell)\, dx + o(E^{Q}(N, Z, B)), \tag{2.5}
\]

as $Z \to \infty$, for all $V \in L^{5/2}(\mathbb R^3)$, $\operatorname{supp} V$ compact.

Remark 2.3.
• From Theorem 2.2 we can read off $\ell = Z^{-1/3}(1 + \beta)^{-2/5}$ as the main length scale of the atom for $B \ll Z^3$.
• As a consequence of the scaling behavior of the magnetic Thomas–Fermi theory, we get the following asymptotic behavior of the ground state energy:

\[
E(N, Z, B) \approx \begin{cases} Z^{7/3}, & B/Z^{4/3} \text{ bounded}, \\ Z^{7/3} \beta^{2/5}, & B/Z^{4/3} \to \infty,\ B/Z^3 \to 0 . \end{cases} \tag{2.6}
\]

• Theorem 2.2 remains true, even if the ground state $\psi$ does not exist. Then the ground state energy is just the bottom of the spectrum. The convergence of the densities still holds in the sense that the densities of approximate ground states will converge to the minimizers of the MTF-functional. Furthermore, the result on the densities in Theorem 2.2 remains true if the ground state is not unique, i.e. if the ground state energy is a degenerate eigenvalue. This stability also holds for the current, i.e. the results below on the current do not assume the ground state energy to be a non-degenerate eigenvalue (only that it is an eigenvalue). It is generally not true that the currents of approximate ground states and the current of a true ground state converge to the same limit. One can construct (see the appendix to [8]) approximate ground states, the currents of which do not converge to the current of the true ground state.

2.2. Main result on the current

Before we state the main result on the current let us introduce the symmetry $P$: $(P\psi)(x^{(1)}, \dots, x^{(N)}) = \psi(\bar x^{(1)}, \dots, \bar x^{(N)})$, with $\bar x = \overline{(x_1, x_2, x_3)} = (x_1, x_2, -x_3)$.

Theorem 2.4. Let $a_{\mathrm{sc}} \in C_0^4(\mathbb R^3, \mathbb R^3)$ and define $a(x) = \ell a_{\mathrm{sc}}(x/\ell)$, where $\ell = Z^{-1/3}(1 + B/Z^{4/3})^{-2/5}$. Suppose that $N/Z$ is held fixed as $Z \to \infty$ and that
$B/Z^{98/51} \to 0$. Suppose finally that $\psi$ is a ground state for $H(N, Z, B)$, and that $\psi$ is also an eigenfunction of the symmetry $P$. Then

\[
\langle \psi; BJ(a)\psi \rangle = \frac{d}{dt}\Big|_{t=0} E^{\mathrm{MTF}}(N, Z, |\mathbf{B} + tB \operatorname{curl} a|) + o(E^{Q}(N, Z, B)) . \tag{2.7}
\]

Here $E^{\mathrm{MTF}}(N, Z, |\mathbf{B} + tB \operatorname{curl} a|)$ is defined as indicated in Remark 2.1.

Remark 2.5. For generic $a$, the term $\frac{d}{dt}\big|_{t=0} E^{\mathrm{MTF}}(N, Z, |\mathbf{B} + tB \operatorname{curl} a|)$ in (2.7) is of the same order of magnitude as the energy $E^{\mathrm{MTF}}(N, Z, B)$ (which, by Theorem 2.2, is the same as the order of magnitude of $E^{Q}(N, Z, B)$). Thus, Theorem 2.4 effectively states that the quantum and MTF-currents are equal to highest order.
Sketch of proof. We only need to prove Theorem 2.4 for $B \gg Z^{4/3}$, since the same result was proved in [9] for $B \le CZ^{4/3}$. In Sec. 2.3 below we will give more precise results on different parts of the current. Theorem 2.4 follows upon collecting these results (Eq. (2.8), Theorems 2.7, 2.8, 2.10 and 2.12, and Remark 2.13) and recalling the calculation of the MTF-current from [9]. Since the calculation of the MTF-current and the comparison with the different terms from Sec. 2.3 has already been done in [9] (and is elementary, though tedious) we will not repeat it here.

The result of Theorem 2.4 was proved in [9] (where also more discussion of the MTF-current can be found) for $B \le CZ^{4/3}$. Thus the new work is the extension to magnetic field strengths in the region $Z^{4/3} \ll B \ll Z^{98/51}$. As can be seen by a scaling argument in MTF-theory (see [2] for details) $B \approx Z^{4/3}$ is the region where the magnetic field starts to play a role. From Theorem 1.1 we see that for $B \gg Z^{4/3}$ the magnetic field becomes dominating in the sense that the electrons are confined to the lowest Landau band. Theorem 2.4 discusses (a part of) that region.

In the recent paper [10] the easier one-body (mean-field) case was considered in a semiclassical limit corresponding to $Z^{4/3} \ll B \ll Z^3$. The essential new work there was to obtain (what can be understood as) an improved estimate on confinement to the lowest Landau band (see [10, Lemma 6.1, Lemma 7.1]). Unfortunately, this confinement estimate is not readily generalizable to the $N$-body case. Instead we use another approach to obtain the confinement estimate, i.e. Theorem 1.1, and collect the consequences for the current. Theorem 1.1 is too crude in that it (essentially) estimates the norm of the part of the ground state living in the higher Landau bands. The better estimate [10, Lemma 7.1] (in the one-particle context) contains a further $x$-space localization away from the singularities of the potential. It is our impression that to go beyond $B \approx Z^2$ one would have to implement this idea in the $N$-particle context, as well. Thus, in as much as our exponent $98/51$ is close to $2$, we expect our result to be close to optimal with the present strategy of proof.

Let us summarize the known results on the current: in the one-body case (non-interacting electron gas) the current has been calculated in all asymptotic regimes. That is the result of the papers [8, 10, 11]. For atoms (the $N$-particle situation)
Theorem 2.4 summarizes the results of the present paper and [9]. Finally, still in the case of atoms, there is a special situation where complete results can be obtained. This is the case $a = A$, which corresponds to the total magnetic moment. That case is analyzed in detail in [12], where asymptotic formulae for the current in all parameter regimes are given.

2.3. Splitting the current

In order to calculate the current in strong magnetic fields a virial type theorem (also called commutator formula) is very useful (see e.g. [9]). This virial theorem (Lemma 4.1) will be discussed further in Sec. 4. It only applies to eigenfunctions. This is the reason why, for the results on the current, we need to assume the existence of a ground state. Using the virial theorem we can write, for all eigenstates $\psi$ of $H$,

\[
B \langle \psi; J(a)\psi \rangle = \langle \psi; (J_{\mathrm{KIN}} - J_{\mathrm{INT}} + J_{\mathrm{DENS}})\psi \rangle, \tag{2.8}
\]

where the operators on the right, $J_{\mathrm{KIN}}$, $J_{\mathrm{INT}}$ and $J_{\mathrm{DENS}}$, will be given in (2.10) below. The basis for the formula (2.8) is the observation (see Sec. 4 for details) that one can find an operator $O$ such that

\[
i[H, O] = BJ(a) - (J_{\mathrm{KIN}} - J_{\mathrm{INT}} + J_{\mathrm{DENS}}) .
\]

Since the commutator vanishes on eigenfunctions of $H$, (2.8) results. The commutator formula only applies in the case where $a_3 = 0$, i.e. $a = (a_1, a_2, 0)$. Thus, one needs to reduce the study of the current to that case. This can be achieved by an argument using gauge invariance (see Sec. 2.4 below).

For $a = (a_1, a_2, 0)$ we define $\tilde a = (-a_2, a_1, 0)$, $\tilde a_0(x) = \tilde a(x) - \tilde a(0)$, and

\[
M(x) = -\big( D\tilde a + (D\tilde a)^t \big) =
\begin{pmatrix}
2\partial_1 a_2 & \partial_2 a_2 - \partial_1 a_1 & \partial_3 a_2 \\
\partial_2 a_2 - \partial_1 a_1 & -2\partial_2 a_1 & -\partial_3 a_1 \\
\partial_3 a_2 & -\partial_3 a_1 & 0
\end{pmatrix}, \tag{2.9}
\]

where $D\tilde a$ is the Jacobian matrix of $\tilde a$ and $(\cdot)^t$ denotes the transpose. Then the operators on the right-hand side in (2.8) are given by:

\[
\begin{aligned}
J_{\mathrm{KIN}} &= \sum_{j=1}^{N} \Big( p_A^{(j)} \cdot M(x^{(j)})\, p_A^{(j)} + B\, b(x^{(j)}) \cdot \sigma^{(j)} - \frac{1}{2} \big( \Delta_\perp b_3(x^{(j)}) \big) \Big), \\
J_{\mathrm{DENS}} &= \sum_{j=1}^{N} \Big( Z\, \frac{\tilde a_0(x^{(j)}) \cdot x^{(j)}}{|x^{(j)}|^3} - \frac{1}{2} \big( \Delta_\parallel b_3(x^{(j)}) \big) \Big), \\
J_{\mathrm{INT}} &= \sum_{1 \le j < k \le N} \frac{(x^{(j)} - x^{(k)}) \cdot (\tilde a(x^{(j)}) - \tilde a(x^{(k)}))}{|x^{(j)} - x^{(k)}|^3},
\end{aligned} \tag{2.10}
\]

where $\Delta_\perp = \partial_{x_1}^2 + \partial_{x_2}^2$ and $\Delta_\parallel = \partial_{x_3}^2$.
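The explicit matrix in (2.9) can be checked against its definition $M = -(D\tilde a + (D\tilde a)^t)$ with finite differences. The sketch below is only an illustration: the test field `a_field` is a hypothetical smooth choice (any smooth $a = (a_1, a_2, 0)$ would do), and derivatives are approximated by central differences. It also checks that the trace of $M$ equals $2b_3$, with $b_3 = \partial_1 a_2 - \partial_2 a_1$ as in the identity $\operatorname{div} \tilde a = -b_3$ used in Sec. 4.

```python
import math


def a_field(x):
    """Hypothetical smooth test field a = (a1, a2, 0)."""
    x1, x2, x3 = x
    g = math.exp(-(x1 * x1 + x2 * x2 + x3 * x3))
    return (x2 * g, math.sin(x1) * g, 0.0)


def a_tilde(x):
    """The rotated field a~ = (-a2, a1, 0)."""
    a1, a2, _ = a_field(x)
    return (-a2, a1, 0.0)


def jacobian(f, x, h=1e-5):
    """Central-difference Jacobian with entry [i][j] = d f_i / d x_j."""
    rows = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(3):
            rows[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return rows


def m_from_definition(x):
    """M = -(D a~ + (D a~)^t)."""
    d = jacobian(a_tilde, x)
    return [[-(d[i][j] + d[j][i]) for j in range(3)] for i in range(3)]


def m_from_formula(x):
    """The explicit matrix on the right-hand side of (2.9)."""
    d = jacobian(a_field, x)
    d1a1, d2a1, d3a1 = d[0]
    d1a2, d2a2, d3a2 = d[1]
    return [
        [2 * d1a2, d2a2 - d1a1, d3a2],
        [d2a2 - d1a1, -2 * d2a1, -d3a1],
        [d3a2, -d3a1, 0.0],
    ]


if __name__ == "__main__":
    point = (0.3, -0.4, 0.2)
    print(m_from_definition(point))
    print(m_from_formula(point))
```

Both functions should return the same symmetric matrix at every point, up to rounding.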
Remark 2.6. The usefulness of the commutator formula (2.8) can be seen as follows: We know (from Theorem 1.1) that the ground state wave function $\psi$ is essentially a function in the lowest Landau band. So we would expect the current to satisfy

\[
\langle \psi; J(a)\psi \rangle \approx \langle \Pi_0 \psi; J(a) \Pi_0 \psi \rangle .
\]

However, a direct calculation gives $\Pi_0 J(a) \Pi_0 = 0$ (this uses $a_3 = 0$). Thus to calculate the main non-zero term of the current we need to understand the part of $\psi$ which is not in the lowest Landau band, which is fairly difficult. The commutator formula (2.8) replaces $BJ(a)$ by the operator $(J_{\mathrm{KIN}} - J_{\mathrm{INT}} + J_{\mathrm{DENS}})$. For the new operators (except one) we have $\Pi_0 J_{\#} \Pi_0 \ne 0$, so we can expect

\[
\langle \psi; J_{\#}\psi \rangle \approx \langle \Pi_0 \psi; J_{\#} \Pi_0 \psi \rangle,
\]

which is indeed what we will prove in this paper. All this is very well for $\# = \mathrm{INT}, \mathrm{DENS}$, but $J_{\mathrm{KIN}}$ is more problematic because $\Pi_0 J_{\mathrm{KIN}} \Pi_0 = 0$. Thus one might think that we are back where we started. However, all is not lost: notice that there is no factor $B$ on the right-hand side of (2.8). Suppose we replace each factor $p_A^{(j)}$ in $J_{\mathrm{KIN}}$ and $BJ(a)$ by $\sqrt{B}$ (since the distance between one-particle Landau levels is $2B$, this gives a reasonable order of magnitude). Then each term in the sum defining $BJ(a)$ becomes of size $B^{3/2}$, whereas in $J_{\mathrm{KIN}}$ they only get size $B$. So we have won a factor $B^{-1/2}$. Therefore it is possible to obtain a sufficiently good estimate on the non-lowest Landau band part of $\psi$ to conclude that

\[
\langle \psi; J_{\mathrm{KIN}}\psi \rangle = o(E^{Q}(N, Z, B)) .
\]

This is indeed the strategy we will use; the necessary estimate is given in Corollary 1.2.

On a very concrete level, a second glance at (2.10) reveals that the operators on the right-hand side in (2.8) have a structure similar to the kinetic, nuclear potential and electrostatic interaction energy in the Hamiltonian. Therefore, it is probably not too surprising that the expectation of these operators in the ground state in most cases will be of the same order of magnitude as the energy itself. This is in contrast to the original current operator $BJ(a)$, which is not directly comparable to any of the parts of the Hamiltonian.

We will give results for each of the three terms on the right-hand side in (2.8) individually. As can be seen from (2.10) the operator $J_{\mathrm{DENS}}$ is actually just a multiplication operator $J_{\mathrm{DENS}} = \sum_{j=1}^{N} \tilde V(x^{(j)})$, with $\tilde V(x) = Z\, \frac{\tilde a_0(x) \cdot x}{|x|^3} - \frac{1}{2} \Delta_\parallel b_3(x)$. Thus

\[
\langle \psi; J_{\mathrm{DENS}}\psi \rangle = \int \rho_\psi \tilde V\, dx,
\]

where $\rho_\psi$ is the density associated to the state $\psi$ (as in (1.5)). We therefore immediately get the asymptotics of $J_{\mathrm{DENS}}$ from Theorem 2.2. Let us first define the
common notation for the results on the different parts of the current. We take an $a_{\mathrm{sc}} = (a_{\mathrm{sc},1}, a_{\mathrm{sc},2}, 0) \in C_0^4(\mathbb R^3; \mathbb R^3)$, i.e. a test function that lives on a scale of order 1. We then scale it to the size of an atom, which is $\ell = Z^{-1/3}(1 + B/Z^{4/3})^{-2/5}$. Thus

\[
a(x) = a_{Z,B}(x) = \ell\, a_{\mathrm{sc}}(x/\ell) . \tag{2.11}
\]
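The prefactor $\ell$ in (2.11) is chosen so that first derivatives of $a$ do not depend on the scale. As a short worked step (a direct application of the chain rule, with indices $i, j, k \in \{1, 2, 3\}$):

```latex
\[
\partial_i a_j(x) = (\partial_i a_{\mathrm{sc},j})(x/\ell), \qquad
\operatorname{curl} a(x) = (\operatorname{curl} a_{\mathrm{sc}})(x/\ell), \qquad
\partial_i \partial_k a_j(x) = \ell^{-1} (\partial_i \partial_k a_{\mathrm{sc},j})(x/\ell) .
\]
```

In particular the matrix $M$ from (2.9), which is built from first derivatives of $a$, is a fixed matrix-valued function evaluated at $x/\ell$, and $\|\nabla a\|_\infty$ is bounded uniformly in $Z$ and $B$.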
Theorem 2.7. Let $\lambda, \Lambda > 0$ be given, let $\psi$ be a ground state of $H(N, Z, B)$ and let $a$ be as in (2.11). Then for all $\varepsilon > 0$ there exists $C > 0$ such that if

\[
\lambda \le N/Z \le \Lambda, \qquad Z \ge C, \qquad BZ^{-3} \le C^{-1}, \tag{2.12}
\]

then

\[
\left| \langle \psi; J_{\mathrm{DENS}}\psi \rangle - Z \int_{\mathbb R^3} \frac{x_2 (a_1(x) - a_1(0)) - x_1 (a_2(x) - a_2(0))}{|x|^3}\, \rho^{\mathrm{MTF}}(x)\, dx \right| \le \varepsilon\, |E^{Q}(N, Z, B)| .
\]

Proof. Follows from Theorem 2.2.

The second operator, $J_{\mathrm{INT}}$, is very similar to the electron–electron interaction $\sum_{j < k} |x^{(j)} - x^{(k)}|^{-1}$. In order to analyze this term we need to slightly generalize

Theorem 2.8. Let the assumptions be as in Theorem 2.7, then

\[
\left| \langle \psi; J_{\mathrm{INT}}\psi \rangle - \iint_{\mathbb R^3 \times \mathbb R^3} \rho^{\mathrm{MTF}}(x)\, \frac{(\tilde a(x) - \tilde a(y)) \cdot (x - y)}{|x - y|^3}\, \rho^{\mathrm{MTF}}(y)\, dx\, dy \right| \le \varepsilon\, |E^{Q}(N, Z, B)| . \tag{2.13}
\]

The proof of Theorem 2.8 is given in Sec. 5.4.

Finally, we have to deal with $J_{\mathrm{KIN}}$. This is an operator which resembles the kinetic energy. However, it contains an off-diagonal part that couples the lowest Landau band with the second. This off-diagonal part causes some trouble. In general we would expect this term to be small:

Conjecture 2.9. Let $\lambda, \Lambda > 0$ be given and let $\psi = \psi_{N,Z,B}$ be a ground state for $H(N, Z, B)$. Let $a$ be as given in (2.11). Then for all $\varepsilon > 0$ there exists a constant $C > 0$ such that if

\[
\lambda \le N/Z \le \Lambda, \qquad Z \ge C, \qquad BZ^{-4/3} \ge C, \qquad BZ^{-3} \le C^{-1}, \tag{2.14}
\]

then

\[
|\langle \psi; J_{\mathrm{KIN}}\psi \rangle| \le \varepsilon\, |E^{Q}(N, Z, B)| .
\]

Unfortunately, we are not able to prove this estimate in the entire parameter region $Z^{4/3} \ll B \ll Z^3$. In fact, we get
Theorem 2.10. Let the notation and conditions be as in Conjecture 2.9 but with the assumption $BZ^{-98/51} \le C^{-1}$ added in (2.14). Then the conclusion of Conjecture 2.9 holds.

The proof of this theorem is carried through in Secs. 6 and 7. In Sec. 6 we reduce the proof to the estimate on confinement to the lowest Landau band, Theorem 1.1. That estimate, which is a sharpening of the similar result in [1], is proved in Sec. 7. This improved confinement is interesting in its own right, independent of applications to the current.

Remark 2.11. There are other models than MTF-theory that correctly describe the ground state of large atoms in strong magnetic fields, in particular the density matrix (DM) model analyzed in [1]. The results above, Theorems 2.7 to 2.10, remain true for the DM-theory (with MTF replaced everywhere by DM) in the parameter region $B \gg Z^{4/3}$ where the DM-model correctly predicts the ground state energy (the upper restrictions on $B$ remaining imposed). Contrary to MTF-theory, it is not clear how to define a current in DM-theory, which makes the result above even more interesting. For simplicity of exposition, we will in this paper only explicitly discuss the MTF-case.

2.4. The parallel current

As discussed above, the "virial theorem" (2.8) only works for the perpendicular current, i.e. for $a = (a_1, a_2, 0)$ (and therefore perpendicular to $\mathbf{B} = (0, 0, B)$). The calculation of the parallel current thus has to be reduced to that case. This reduction is accomplished using a symmetry argument and gauge invariance of the current. Notice that the Hamiltonian $H(N, Z, B)$ is invariant under the symmetry $P$ that changes $x_3^{(j)}$ into $-x_3^{(j)}$ for all $j$ (here it is, of course, important that $\mathbf{B} = B(0, 0, 1)$). Thus, the space of ground states of $H(N, Z, B)$ can be decomposed in eigenfunctions (with eigenvalue $\pm 1$) of this symmetry.

Theorem 2.12. Let $\psi$ be an eigenfunction of $H(N, Z, B)$.
• Suppose that $a \in C_0^1(\mathbb R^3, \mathbb R^3)$ satisfies $\operatorname{curl} a = 0$, then $\langle \psi; J(a)\psi \rangle = 0$.
• If $P$ is the symmetry described above, if $a = (0, 0, a_3)$ satisfies $a(x_1, x_2, x_3) = a(x_1, x_2, -x_3)$, and the eigenfunction $\psi$ is also an eigenfunction of $P$, then $\langle \psi; J(a)\psi \rangle = 0$.

Proof. The proof is straightforward. Details are given in [9].
Remark 2.13. Using Theorem 2.12, given any $a_1 \in C_0^4(\mathbb R^3, \mathbb R^3)$, we can find an $a = (a_1, a_2, 0) \in C_0^3(\mathbb R^3, \mathbb R^3)$ such that

\[
\langle \psi; J(a - a_1)\psi \rangle = 0,
\]

for all $\psi$ which are eigenfunctions of $H(N, Z, B)$ and $P$. Thus, using (2.8), the proof of Theorem 2.4 reduces to the results proved on the different parts of the (perpendicular) current in Sec. 2.3.

3. Basic Estimates

In this section we collect a few basic estimates that will be used in the paper. First we summarize some results on the different parts of the energy (kinetic, potential).

Lemma 3.1 ([1]). Let $\lambda, \Lambda > 0$ be given, let $\psi$ be a ground state for $H(N, Z, B)$ and let $V \in L^{5/2}(\mathbb R^3)$, $\operatorname{supp} V$ compact. Then for all $\varepsilon > 0$ there exist constants $c, C > 0$ such that if

\[
\lambda \le N/Z \le \Lambda, \qquad \beta = BZ^{-4/3} \ge C, \tag{3.1}
\]

then

\[
\langle \psi; \hat K_N \psi \rangle \le \varepsilon\, |E^{Q}(N, Z, B)|,
\]

and

\[
\Big\langle \psi; \sum_{j=1}^{N} \big( p_{A,3}^{(j)} \big)^2 \psi \Big\rangle \le c\, |E^{Q}(N, Z, B)| .
\]

Furthermore, if the condition $BZ^{-3} \le C^{-1}$ is added in (3.1), then

\[
\left| \Big\langle \Pi_0^N \psi; \sum_{j=1}^{N} Z\ell^{-1} V(x^{(j)}/\ell)\, \Pi_0^N \psi \Big\rangle - \int \rho^{Q}(x)\, Z\ell^{-1} V(x/\ell)\, dx \right| \le \varepsilon\, |E^{Q}(N, Z, B)| .
\]

Finally,

\[
\left| \Big\langle \Pi_0^N \psi; \sum_{j=1}^{N} Z\ell^{-1} V(x^{(j)}/\ell)\, \Pi_0^N \psi \Big\rangle \right| \le c\, |E^{Q}(N, Z, B)| .
\]

We will also need to know that if $B \to \infty$, then the projection $\Pi_0$ commutes (to a very high accuracy) with regular functions. This is given more precisely by the following lemma.

Lemma 3.2. Let $\Pi_0$ be the projection on the lowest Landau band given in (1.8). Suppose $f$ is a function satisfying $\nabla f \in L^\infty(\mathbb R^3)$ (i.e. $f \in C^{0,1}(\mathbb R^3)$, the space of uniformly Lipschitz continuous functions). Then there exists a constant $c$ independent of $B$ and $f$ such that

\[
\| [\Pi_0, f] \| \le c B^{-1/2} \| \nabla f \|_\infty,
\]

where $\| \nabla f \|_\infty$ is the $L^\infty$-norm of $\nabla f$.
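The power $B^{-1/2}$ in Lemma 3.2 can be made concrete by a formal computation with the kernel (1.8). The following is a sketch only: it treats $\delta(x_3 - y_3)$ as a restriction to $x_3 = y_3$, writes $u = x_\perp - y_\perp$, uses $|f(y) - f(x)| \le \|\nabla f\|_\infty |x - y|$, and relies on the Gaussian moment $\int_0^\infty r^2 e^{-ar^2}\, dr = \sqrt{\pi}/(4a^{3/2})$ with $a = B/4$.

```latex
\[
\sup_x \int |\Pi_0(x, y)|\, |f(y) - f(x)|\, dy
  \le \|\nabla f\|_\infty\, \frac{B}{2\pi} \int_{\mathbb R^2} |u|\, e^{-B|u|^2/4}\, du
  = \|\nabla f\|_\infty\, B \int_0^\infty r^2 e^{-Br^2/4}\, dr
  = 2\sqrt{\pi}\, B^{-1/2}\, \|\nabla f\|_\infty .
\]
```

The same bound holds with the roles of $x$ and $y$ exchanged, so this sketch suggests that $c = 2\sqrt{\pi}$ is an admissible constant; the lemma itself only asserts the existence of some $c$.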
Proof. The operator $[\Pi_0, f]$ has kernel $K(x, y) = \Pi_0(x, y)(f(y) - f(x))$. We will use Schur's Lemma, i.e. the following estimate of the norm $\|K\|$ of the operator with kernel $K(x, y)$:

\[
\|K\| \le \max \Big\{ \sup_x \int |K(x, y)|\, dy,\ \sup_y \int |K(x, y)|\, dx \Big\} .
\]

Notice from (1.8) that the integral kernel of $\Pi_0$ has a length scale of order $B^{-1/2}$. The lemma follows using $f(y) - f(x) = (y - x) \cdot \int_0^1 \nabla f(x + t(y - x))\, dt$, the explicit form of $\Pi_0(x, y)$ given in (1.8) and Schur's Lemma.

Finally, we state some Lieb–Thirring inequalities. Recall that a density operator $\gamma$ is a trace class operator satisfying $0 \le \gamma \le I$ (as operator inequalities). The Lieb–Thirring inequalities that we give estimate $\operatorname{tr}[H\gamma]$ from below, for some self-adjoint operator $H$ and all density matrices $\gamma$. By the min–max principle these lower bounds are, in fact, lower bounds on the sum of the negative eigenvalues of $H$.

Lemma 3.3 (Lieb–Thirring Inequalities).
(i) Let $\gamma$ be a density operator on $L^2(\mathbb R^d)$ and let $V \in L^{1+d/2}(\mathbb R^d)$, then

\[
\operatorname{tr}[(p^2 + V)\gamma] \ge -C_d \int [V(x)]_-^{1+d/2}\, dx, \tag{3.2}
\]

where $C_d$ is some positive constant depending only on the dimension $d$.
(ii) For $d = 3$ the inequality (3.2) remains true if we replace $p$ by $p_A$.
(iii) If $V \in L^{3/2}(\mathbb R^3)$, and $\gamma$ is a density operator on $L^2(\mathbb R^3)$, then

\[
\operatorname{tr}[\Pi_0 (H_A + V) \Pi_0 \gamma] \ge -\frac{2}{3\pi}\, B \int [V(x)]_-^{3/2}\, dx . \tag{3.3}
\]

(iv) If $V \in L^{3/2}(\mathbb R^3) \cap L^{5/2}(\mathbb R^3)$, and $\gamma$ is a density operator on $L^2(\mathbb R^3)$, then

\[
\operatorname{tr}[(H_A + V)\gamma] \ge -\frac{4}{3\pi}\, B \int [V(x)]_-^{3/2}\, dx - \frac{8\sqrt{6}}{5\pi} \int [V(x)]_-^{5/2}\, dx . \tag{3.4}
\]

Proof. The estimate in (i) was first proved in [13]. The extension in (ii) (based on the diamagnetic inequality) is immediate and can for instance be found in [14]. Finally, the truly magnetic estimates (iii) and (iv) are [2, Theorem 2.7] and [2, Theorem 2.1], respectively.

We will need the following standard consequence of Lemma 3.3 (cf. [2, Corollary 2.2], [13]; see also [2, (5.15), (5.16)]).

Corollary 3.4. Let $\psi \in \wedge_{j=1}^{N} L^2(\mathbb R^3; \mathbb C^2)$ be normalized and satisfy

\[
\Big\langle \psi; \sum_{j=1}^{N} \big( H_A^{(j)} - Z |x^{(j)}|^{-1} \big) \psi \Big\rangle \le 0 .
\]
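Let $\rho_\psi$ be the density associated to $\psi$ as in (1.5). Suppose $Z^{4/3} \ll B \ll Z^3$ and that there exist $\lambda, \Lambda > 0$ such that $\lambda < N/Z < \Lambda$. Then

\[
\int \rho_\psi^{5/3}(x)\, dx \le C \beta^{4/5} Z^{7/3} .
\]

4. Commutator Formula

In calculations of the current, the commutator formula (4.1) below is very useful. It makes it possible to replace the operator $J(a)$ by a sum of other operators that are all better behaved than $J(a)$. Recall the definitions of $\tilde a$, $\tilde a_0$ from Sec. 2.3. Let

\[
\Sigma = \Big\{ (x^{(1)}, \dots, x^{(N)}) \in \mathbb R^{3N} \ \Big|\ \prod_{j=1}^{N} |x^{(j)}| \prod_{j < k} |x^{(j)} - x^{(k)}| = 0 \Big\} .
\]

Then a calculation (see for instance [9]) gives (as an identity for operators on $C_0^\infty(\mathbb R^{3N} \setminus \Sigma)$):

\[
\Big[ H(N, Z, B),\ \frac{1}{2i} \sum_{j=1}^{N} \big( \tilde a_0(x^{(j)}) \cdot p_A^{(j)} + p_A^{(j)} \cdot \tilde a_0(x^{(j)}) \big) \Big] = -BJ(a) + J_{\mathrm{KIN}} - J_{\mathrm{INT}} + J_{\mathrm{DENS}}, \tag{4.1}
\]

where

\[
\begin{aligned}
J_{\mathrm{KIN}} &= \sum_{j=1}^{N} \Big( p_A^{(j)} \cdot M(x^{(j)})\, p_A^{(j)} + B\, b(x^{(j)}) \cdot \sigma^{(j)} + \frac{1}{2} \Delta_\perp \operatorname{div} \tilde a(x^{(j)}) \Big), \\
J_{\mathrm{INT}} &= \sum_{1 \le j < k \le N} \frac{(x^{(j)} - x^{(k)}) \cdot (\tilde a(x^{(j)}) - \tilde a(x^{(k)}))}{|x^{(j)} - x^{(k)}|^3}, \\
J_{\mathrm{DENS}} &= \sum_{j=1}^{N} \Big( Z\, \frac{\tilde a_0(x^{(j)}) \cdot x^{(j)}}{|x^{(j)}|^3} + \frac{1}{2} \Delta_\parallel \operatorname{div} \tilde a(x^{(j)}) \Big),
\end{aligned} \tag{4.2}
\]

where $\Delta_\perp = \partial_{x_1}^2 + \partial_{x_2}^2$ and $\Delta_\parallel = \partial_{x_3}^2$. In the definition of $J_{\mathrm{KIN}}$ we introduced the symmetric matrix $M$, which was defined in (2.9). Notice that the term $\operatorname{div} \tilde a$ satisfies $\operatorname{div} \tilde a = \partial_2 a_1 - \partial_1 a_2 = -b_3$. Thus we can rewrite $J_{\mathrm{KIN}}$ and $J_{\mathrm{DENS}}$,

\[
\begin{aligned}
J_{\mathrm{KIN}} &= \sum_{j=1}^{N} \Big( p_A^{(j)} \cdot M(x^{(j)})\, p_A^{(j)} + B\, b(x^{(j)}) \cdot \sigma^{(j)} - \frac{1}{2} \big( \Delta_\perp b_3(x^{(j)}) \big) \Big), \\
J_{\mathrm{DENS}} &= \sum_{j=1}^{N} \Big( Z\, \frac{\tilde a_0(x^{(j)}) \cdot x^{(j)}}{|x^{(j)}|^3} - \frac{1}{2} \big( \Delta_\parallel b_3(x^{(j)}) \big) \Big) .
\end{aligned}
\]

Now let $\psi$ be an eigenstate of $H = H(N, Z, B)$. Then formally (for instance it would be true for matrices) for all operators $O$ we have

\[
\langle \psi; [H, O]\psi \rangle = 0 .
\]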
This result can easily be applied to (4.1) above and we thereby get:

Lemma 4.1. Let $\psi$ be an eigenstate of $H(N, Z, B)$. Then

\[
B \langle \psi; J(a)\psi \rangle = \langle \psi; (J_{\mathrm{KIN}} - J_{\mathrm{INT}} + J_{\mathrm{DENS}})\psi \rangle . \tag{4.3}
\]

Sketch of Proof. The identity (4.3) follows from the identity (4.1) upon approximating the eigenstate $\psi$ by a sequence of functions $\{\psi_n\}_{n=1}^{\infty} \subset C_0^\infty(\mathbb R^{3N} \setminus \Sigma)$, converging to $\psi$ in graph norm.

5. Analysis of $J_{\mathrm{INT}}$

5.1. A note on how and why

We wish to calculate $\langle \psi | J_{\mathrm{INT}} | \psi \rangle$ as $\frac{d}{d\alpha}\big|_{\alpha=0} E_\alpha$, where $E_\alpha$ is the ground state energy of an operator $H_\alpha$. This is essentially mimicking the standard proof of the convergence of the density (see for instance [5]). We will choose $H_\alpha$ to formally satisfy

\[
H_\alpha = H(N, Z, B) + \alpha J_{\mathrm{INT}} + O(\alpha^2) . \tag{5.1}
\]

We then make a similar perturbation of the approximating theory. If we have concavity of the energy as a function of $\alpha$, then convergence of the energies implies convergence of their first derivatives in $\alpha$, i.e. the desired formula for $\langle \psi | J_{\mathrm{INT}} | \psi \rangle$.

In [9] this was carried out by only including the $\alpha$-term in (5.1) (i.e. defining the $O(\alpha^2)$-term to be zero). This choice gives concavity automatically (since the infimum of a family of concave functions is concave). However, when proving convergence of $E_\alpha$ to the (similarly defined) perturbed MTF-energy $E_\alpha^{\mathrm{MTF}}(N, Z, B)$ one is led to require that

\[
\tilde D_\alpha(f, g) = \iint_{\mathbb R^3 \times \mathbb R^3} f(x)\, |x - y|^{-1} \Big( 1 + \alpha\, \frac{(x - y) \cdot (\tilde a(x) - \tilde a(y))}{|x - y|^2} \Big)\, g(y)\, dx\, dy
\]

defines a non-negative quadratic form for small $\alpha$ (remember, cf. [6, p. 621 after (5.17)], [9, p. 390 before (15)] or see the proof of Lemma 5.3 below, that in the proof of correctness of TF-theory one needs the inequality $D(\rho_\psi - \rho^{\mathrm{TF}}, \rho_\psi - \rho^{\mathrm{TF}}) \ge 0$ in the lower bound). It is not clear that this positivity holds for $\tilde D_\alpha$ as defined above. Thus we correct that mistake in [9] here. In order to avoid the above problem we define slightly more complicated operators/functionals (still of the form (5.1)).

5.2. A concave function

Let $\alpha \in (-\varepsilon_0, \varepsilon_0)$, where $\varepsilon_0$ is so small that the function $x \mapsto \psi_\alpha(x) \equiv x + \alpha \tilde a(x)$ is invertible on all of $\mathbb R^3$ for all $\alpha \in (-\varepsilon_0, \varepsilon_0)$ (where $\tilde a$ was defined just before (2.9)). Since $\tilde a$ is scaled to the size $\ell$ of an atom, we need to be a bit careful. By definition
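$\tilde a(x) = \ell\, \tilde a_{\mathrm{sc}}(x/\ell)$, where $\tilde a_{\mathrm{sc}}$ is a function living on a scale of order unity. Let $g_\alpha$ be the inverse of the function

\[
x \mapsto x + \alpha \tilde a_{\mathrm{sc}}(x) .
\]

We denote the inverse of $\psi_\alpha$ by $\phi_\alpha$. Clearly $\phi_\alpha(y) = \ell g_\alpha(y/\ell)$. This function satisfies $\phi_\alpha(x) = x - \alpha \tilde a(x) + O(\alpha^2)$. Notice also that since $\tilde a_{\mathrm{sc}}$ has compact support there exists an $R \in \mathbb R_+$ independent of $\alpha$ such that $g_\alpha(x) = x$ for all $|x| \ge R$, $\alpha \in (-\varepsilon_0, \varepsilon_0)$. The Taylor expansion (in $\alpha$) of $g_\alpha$ is

\[
g_\alpha(y) = y - \alpha \tilde a_{\mathrm{sc}}(y) + \alpha^2 \eta_{\mathrm{sc},\alpha}(y),
\]

where $\eta_{\mathrm{sc},\alpha} \in C_0^1(\mathbb R^3)$. With $\eta_\alpha(y) = \ell\, \eta_{\mathrm{sc},\alpha}(y/\ell)$ we therefore get

\[
\phi_\alpha(y) = y - \alpha \tilde a(y) + \alpha^2 \eta_\alpha(y), \tag{5.2}
\]

where $\eta_\alpha$ is uniformly bounded in $C_0^1(\mathbb R^3)$ as $\ell \to 0$. Consider, for $x, y \in \mathbb R^3$, $x \ne y$, the function

\[
\alpha \mapsto F_{x,y}(\alpha) = \frac{|x - y|}{|\phi_\alpha(x) - \phi_\alpha(y)|} - k_0 \alpha^2 .
\]

By explicit differentiation we see that for $k_0 > 0$ and sufficiently big, we have $\frac{d^2}{d\alpha^2} F_{x,y} \le -1/2$ for $\alpha \in (-\varepsilon_0, \varepsilon_0)$, uniformly in $x, y$. Thus $\alpha \mapsto F_{x,y}(\alpha)$ is a concave function on $(-\varepsilon_0, \varepsilon_0)$. From now on we fix $k_0$ satisfying the conditions above. Notice that the concavity of $F_{x,y}$ implies that

\[
\alpha \mapsto \Big\langle u \Big| \frac{F_{x,y}(\alpha)}{|x - y|} \Big| u \Big\rangle
\]

is concave (for all $u(x, y)$ with $\langle u | |x - y|^{-1} | u \rangle < +\infty$).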
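5.3. MTF-theory with a current term

We will need a perturbed magnetic Thomas–Fermi theory. Let $D_\alpha(\rho, \rho)$ be defined by

\[
D_\alpha(\rho, \rho) = \frac{1}{2} \iint \rho(x) K_\alpha(x, y) \rho(y)\, dx\, dy,
\]

where

\[
K_\alpha(x, y) = \frac{1 - k_0 \alpha^2}{2}\, |x - y|^{-1} + \frac{1}{2}\, |\phi_\alpha(x) - \phi_\alpha(y)|^{-1} . \tag{5.3}
\]

Notice, using (5.2), that for fixed $\rho$ with $D(\rho, \rho) < +\infty$,

\[
D_\alpha(\rho, \rho) = D(\rho, \rho) + \frac{\alpha}{4} \iint \rho(x)\, \frac{(x - y) \cdot (\tilde a(x) - \tilde a(y))}{|x - y|^3}\, \rho(y)\, dx\, dy + O(\alpha^2) .
\]

We define the functional $\mathcal E^{\mathrm{MTF}}_{Z,B,\alpha}$ for densities $\rho \in \mathcal C_B$ (see Sec. 2.1 for the definition of $\mathcal C_B$) by substituting $D_\alpha$ for $D$ in (2.1). The corresponding energy is

\[
E_\alpha^{\mathrm{MTF}}(N, Z, B) = \inf \Big\{ \mathcal E^{\mathrm{MTF}}_{Z,B,\alpha}[\rho];\ \int \rho(x)\, dx \le N,\ \rho \in \mathcal C_B \Big\} .
\]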
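The first-order term in the expansion of $D_\alpha$ comes from expanding the perturbed Coulomb kernel with (5.2). As a sketch of the intermediate steps (the $O(\alpha^2)$ remainders are controlled using that $\tilde a$ and $\eta_\alpha$ are Lipschitz, so that differences such as $|\tilde a(x) - \tilde a(y)|$ are bounded by a constant times $|x - y|$):

```latex
\begin{align*}
|\phi_\alpha(x) - \phi_\alpha(y)|^2
  &= |x - y|^2 - 2\alpha\, (x - y) \cdot (\tilde a(x) - \tilde a(y)) + O(\alpha^2)\, |x - y|^2, \\
|\phi_\alpha(x) - \phi_\alpha(y)|^{-1}
  &= |x - y|^{-1} \Big( 1 + \alpha\, \frac{(x - y) \cdot (\tilde a(x) - \tilde a(y))}{|x - y|^2} + O(\alpha^2) \Big), \\
K_\alpha(x, y)
  &= |x - y|^{-1} + \frac{\alpha}{2}\, \frac{(x - y) \cdot (\tilde a(x) - \tilde a(y))}{|x - y|^3} + O(\alpha^2)\, |x - y|^{-1} .
\end{align*}
```

Inserting the last line into $D_\alpha(\rho, \rho) = \frac{1}{2} \iint \rho K_\alpha \rho$ produces the prefactor $\alpha/4$ in the expansion of $D_\alpha$.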
By a change of variables we see that, with $f_\alpha = |\det D\psi_\alpha(x)|\, f(\psi_\alpha(x))$, $g_\alpha = |\det D\psi_\alpha(x)|\, g(\psi_\alpha(x))$ (with $D\psi_\alpha$ denoting the Jacobian matrix of $\psi_\alpha$),

\[
\frac{1}{2} \iint \frac{f(x) g(y)}{|\phi_\alpha(x) - \phi_\alpha(y)|}\, dx\, dy = D(f_\alpha, g_\alpha) .
\]

This implies, in particular, due to the positivity of the Coulomb kernel that

\[
D_\alpha(f, f) \ge 0 \quad \text{for all } f . \tag{5.4}
\]
(5.5)
Therefore, for all ρ ≥ 0, we have (1 − c00 α)D(ρ, ρ) ≤ Dα (ρ, ρ) ≤ (1 + c00 α)D(ρ, ρ) .
(5.6)
These properties of Dα imply that the analysis of [2, Sec. 4] of the MTFfunctional goes through without changes for EαMTF . This gives the following results on the perturbed MTF-theory. Theorem 5.1. For 0 sufficiently small , EαMTF has the following properties: • For each N ≥ 0 there is a unique minimizing density ρMTF ∈ CB with α R MTF ρα (x) dx ≤ N . • The minimizing density satisfies the Thomas–Fermi equation τB0 (ρMTF ) = [Veff,α ]− , (5.7) α R where Veff,α (x) = −Z|x|−1 + Kα (x, y)ρMTF (y) dy + µ, for some unique α (chemical potential ) µ = µ(N, Z, α). The chemical potential satisfies µ = 0 if R MTF ρα (x) dx < N . The Thomas–Fermi equation (5.7) has the following useful equivalent form −PB ([Veff,α ]− ) = τB (ρMTF (x)) + Veff,α (x)ρMTF (x) . α α
(5.8)
• The function α 7→ EαMTF (N, Z, B) is differentiable at α = 0 with d E MTF (N, Z, B) dα α=0 α ZZ 1 (x − y) · (˜ a(x) − ˜ a(y)) MTF = ρMTF (x) ρ0 (y) dxdy . 0 3 4 |x − y| More precisely, there exists a function f (x) with lim α→0 f (α) = 0 such that MTF MTF Eα (N, Z, B) − E0MTF (N, Z, B) − α d E (N, Z, B) α dα α=0 ≤ αf (α)|E Q (N, Z, B)| .
(5.9)
Sketch of Proof. Upon going through the steps of [2, Sec. 4] for the perturbed functional (using (5.6)), we get the first two statements of Theorem 5.1. The last statement is an easy variational argument. Since we will give a similar argument in Sec. 5.4 (in a more difficult setting) we omit the argument here.

Furthermore, we need the following semi-classical estimate. This is also almost identical to the analysis in [2] (see [2, Theorem 3.1 and Proposition 4.19]) so we give it without proof.

Lemma 5.2. Let $V_{\mathrm{eff},\alpha}$ be the effective potential from Theorem 5.1. Suppose there exist $\lambda, \Lambda > 0$ such that $\lambda < N/Z < \Lambda$, and that $B/Z^3 \to 0$. Then as $B, Z \to \infty$, we get the bound

\[
\inf \operatorname{Spec} \sum_{j=1}^{N} \big( H_A^{(j)} + V_{\mathrm{eff},\alpha}(x^{(j)}) \big) \ge -\int P_B([V_{\mathrm{eff},\alpha}]_-)\, dx + o(E^{Q}(N, Z, B)) . \tag{5.10}
\]

Here the operator on the left-hand side is understood as acting on the fermionic Hilbert space $\wedge_{j=1}^{N} L^2(\mathbb R^3, \mathbb C^2)$.

5.4. Calculating $J_{\mathrm{INT}}$

In order to calculate $\langle \psi; J_{\mathrm{INT}}\psi \rangle$ we introduce a perturbed Hamiltonian for $\alpha \in (-\varepsilon_0, \varepsilon_0)$:

\[
H_\alpha = \sum_{j=1}^{N} \Big( H_A^{(j)} - \frac{Z}{|x^{(j)}|} \Big) + \sum_{j < k} K_\alpha(x^{(j)}, x^{(k)}),
\]

where $K_\alpha$ is the kernel from (5.3). Let $E_\alpha(N, Z, B) = \inf \operatorname{Spec} H_\alpha$. The idea of the proof is to use the Feynman–Hellmann theorem (this idea was probably first used in the context of the Thomas–Fermi theory in [5]) to calculate $\langle \psi; J_{\mathrm{INT}}\psi \rangle$ by differentiating $E_\alpha(N, Z, B)$ at $\alpha = 0$. Using (5.5), we see (using scaling and (2.6)) that for some $c > 0$,

\[
(1 - \alpha c)\, E(N, Z, B) \le E_\alpha(N, Z, B) \le (1 + \alpha c)\, E(N, Z, B) .
\]

The proof of Theorem 2.8 starts by justifying the following calculation.

\[
\begin{aligned}
\frac{\alpha}{2} \langle \psi; J_{\mathrm{INT}}\psi \rangle &\ge \langle \psi; (H_\alpha - H)\psi \rangle \\
&\ge E_\alpha(N, Z, B) - E(N, Z, B) \\
&\ge E_\alpha^{\mathrm{MTF}}(N, Z, B) - E^{\mathrm{MTF}}(N, Z, B) + o(E(N, Z, B)) .
\end{aligned} \tag{5.11}
\]
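The variational step in (5.11) can be illustrated on a toy model. The sketch below is not the $N$-body problem: it uses two hypothetical real symmetric $2 \times 2$ matrices `H` and `J` and a perturbation that is exactly linear in $\alpha$, so the first inequality of (5.11) becomes an equality and only the variational principle is exercised. For the ground state $\psi$ of `H` one then has $E(\alpha) \le E(0) + \alpha \langle \psi; J\psi \rangle$ for both signs of $\alpha$, which traps $\langle \psi; J\psi \rangle$ between the one-sided difference quotients of the concave function $E(\alpha)$.

```python
import math


def ground_state(m):
    """Lowest eigenvalue and normalised eigenvector of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = m
    energy = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    if b != 0.0:
        v = (b, energy - a)  # solves (m - energy) v = 0 when b != 0
    else:
        v = (1.0, 0.0) if a <= d else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return energy, (v[0] / norm, v[1] / norm)


def expectation(m, v):
    """Quadratic form <v; m v> for a 2x2 matrix m and a 2-vector v."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


def perturbed(h, j, alpha):
    """The matrix h + alpha * j."""
    return [[h[r][c] + alpha * j[r][c] for c in range(2)] for r in range(2)]


H = [[-2.0, 0.5], [0.5, -1.0]]   # hypothetical toy "Hamiltonian"
J = [[0.3, -0.7], [-0.7, 1.1]]   # hypothetical toy perturbation

E0, PSI = ground_state(H)
J_EXPECT = expectation(J, PSI)


def energy(alpha):
    """Ground state energy of H + alpha * J."""
    return ground_state(perturbed(H, J, alpha))[0]


if __name__ == "__main__":
    for alpha in (0.1, 0.01, 0.001):
        upper = (energy(-alpha) - E0) / (-alpha)
        lower = (energy(alpha) - E0) / alpha
        print(alpha, lower, "<=", J_EXPECT, "<=", upper)
```

As $\alpha \to 0$ the two difference quotients close in on the expectation value, which is the mechanism used to pass from convergence of energies to convergence of $\langle \psi; J_{\mathrm{INT}}\psi \rangle$.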
The first inequality is just the concavity in $\alpha$. The second inequality is a consequence of the variational principle for the energy, so we only need to justify the final inequality. Using the upper bound

\[
E(N, Z, B) \le E^{\mathrm{MTF}}(N, Z, B) + o(E(N, Z, B)),
\]

from [2], we only need to prove a lower bound

\[
E_\alpha(N, Z, B) \ge E_\alpha^{\mathrm{MTF}}(N, Z, B) + o(E(N, Z, B)) . \tag{5.12}
\]
We state this as a lemma Lemma 5.3. Let λ, Λ > 0 be given. Let > 0, then there exists C > 0 such that if λ ≤ N/Z ≤ Λ, BZ −4/3 > C, BZ −3 < C −1 , Z > C, then Eα (N, Z, B) ≥ EαMTF (N, Z, B) − |E Q (N, Z, B)| , for all α ∈ (−0 , 0 ). To prove Lemma 5.3 we state a version of the perturbed Lieb–Oxford inequality from [9]. Lemma 5.4. There exists a constant C such that for α ∈ (−0 , 0 ) and all normalized ψ ∈ L2 (R3N ) the following bound holds. * + Z X 1 4/3 (j) (k) Kα (x , x )ψ ≥ Dα (ρψ , ρψ ) − C ψ; ρψ dx . 2 R3 j
Sketch of Proof of Lemma 5.4. The proof of this lemma is simple, using the standard Lieb–Oxford inequality and the argument from [9, Appendix A]. By a change of variables (which defines $\psi_\alpha$):
$$\Bigl\langle \psi; \sum_{j<k} |\phi_\alpha(x^{(j)}) - \phi_\alpha(x^{(k)})|^{-1} \psi \Bigr\rangle = \Bigl\langle \psi_\alpha; \sum_{j<k} |x^{(j)} - x^{(k)}|^{-1} \psi_\alpha \Bigr\rangle. \tag{5.13}$$
On the right-hand side in (5.13), we can now use the standard Lieb–Oxford inequality. Upon changing the variables back, we get the desired estimate. We refer to [9, Appendix A] for further details.

Proof of Lemma 5.3. The argument is essentially as in [2, Sec. 5]. Let $\psi \in \wedge_{j=1}^{N} L^2(\mathbb{R}^3; \mathbb{C}^2)$ be normalized and satisfy $\langle \psi, H_\alpha(N,Z,B)\psi \rangle \le 0$. Then, using Corollary 3.4 and a Hölder inequality we see that
$$\int_{\mathbb{R}^3} \rho_\psi^{4/3}\,dx \le c\,\beta^{2/5} Z^{5/3} = o(E(N,Z,B)).$$
Thus the error term in the correlation inequality, Lemma 5.4, is of lower order. Using (5.4) we see that $D_\alpha(\rho_\psi - \rho_\alpha^{\mathrm{MTF}}, \rho_\psi - \rho_\alpha^{\mathrm{MTF}}) \ge 0$. Therefore, it is straightforward, using first Lemma 5.4 and then Lemma 5.2, to obtain
$$\begin{aligned}
\langle \psi; H_\alpha(N,Z,B)\psi \rangle
&\ge \Bigl\langle \psi; \sum_{j=1}^{N} \bigl( H_A^{(j)} + V_{\mathrm{eff},\alpha}(x^{(j)}) \bigr) \psi \Bigr\rangle - D_\alpha(\rho_\alpha^{\mathrm{MTF}}, \rho_\alpha^{\mathrm{MTF}}) - \mu N + o(E(N,Z,B)) \\
&\ge - \int P_B([V_{\mathrm{eff},\alpha}(x)]_-)\,dx - D_\alpha(\rho_\alpha^{\mathrm{MTF}}, \rho_\alpha^{\mathrm{MTF}}) - \mu N + o(E(N,Z,B)).
\end{aligned}$$
Using the Thomas–Fermi equation (5.8), we therefore find
$$\langle \psi; H_\alpha(N,Z,B)\psi \rangle \ge \int \tau_B(\rho_\alpha^{\mathrm{MTF}})\,dx - Z \int \frac{\rho_\alpha^{\mathrm{MTF}}(x)}{|x|}\,dx + D_\alpha(\rho_\alpha^{\mathrm{MTF}}, \rho_\alpha^{\mathrm{MTF}}) + o(E(N,Z,B)).$$
Here we used the result from Lemma 5.1 that if $\int \rho_\alpha^{\mathrm{MTF}} < N$, then $\mu = 0$, which assures that the terms proportional to $\mu$ cancel. This finishes the proof of Lemma 5.3.

At this point we have justified the calculation in (5.11). To finish the proof of Theorem 2.8 we only have to invoke (5.9).

6. Analysis of $J_{\mathrm{KIN}}$

Let $M$ be the matrix from (2.9). It is clear from the definition of $a$, $\tilde a$ that we can write $M = M_0(x/\ell)$, where $M_0 \in C_0^2(\mathbb{R}^3)$ is a matrix-valued function independent of $B$, $Z$. Define
$$M = M_t + N_1 + N_2 =
\begin{pmatrix} b_3 & 0 & 0 \\ 0 & b_3 & 0 \\ 0 & 0 & 0 \end{pmatrix}
+ \begin{pmatrix} N_{11} & N_{12} & 0 \\ N_{12} & -N_{11} & 0 \\ 0 & 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & N_{13} \\ 0 & 0 & N_{23} \\ N_{13} & N_{23} & 0 \end{pmatrix}.$$
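Notice that $\operatorname{tr}[N_1] = \operatorname{tr}[N_2] = 0$. With these definitions we write
$$J_{\mathrm{KIN,diag}} = \sum_{j=1}^{N} \Bigl( p_A^{(j)} \cdot M_t(x^{(j)})\, p_A^{(j)} + B\, b(x^{(j)}) \cdot \sigma^{(j)} - \tfrac{1}{2} (\Delta_\perp b_3(x^{(j)})) \Bigr), \tag{6.1}$$
$$J_s = \sum_{j=1}^{N} p_A^{(j)} \cdot N_s(x^{(j)})\, p_A^{(j)} \quad \text{for } s = 1, 2. \tag{6.2}$$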
The contribution to the current of each of these terms will be small. First the terms $J_{\mathrm{KIN,diag}}$ and $J_2$ are estimated in Lemma 6.1. The remaining operator, $J_1$, is further split in (6.8). The individual components of $J_1$ are then estimated in Lemmas 6.2 and 6.3. The final lines of this section combine the estimates proved here with Corollary 1.2 to obtain a proof of Theorem 2.10. Notice that $J_1$ is the only part of $J_{\mathrm{KIN}}$ for which we need to apply Corollary 1.2 to get a useful bound. Thus $J_1$ is the part of the current responsible for the limitations of admissible field strengths $B$ in Theorem 2.10.

Lemma 6.1. Let $\lambda, \Lambda > 0$ be given and let $\psi$ be a ground state for the Pauli operator $H(N,Z,B)$. Let $\epsilon > 0$, then there exists $C > 0$ such that if $\lambda \le N/Z \le \Lambda$, $BZ^{-4/3} > C$, $BZ^{-3} < C^{-1}$, then
$$|\langle \psi; J_{\mathrm{KIN,diag}} \psi \rangle| + |\langle \psi; J_2 \psi \rangle| \le \epsilon\, |E^Q(N,Z,B)|. \tag{6.3}$$
Proof. We will treat the terms separately. Recall the definitions of $\Pi_0$, $\Pi_>$ from (1.8), (7.1). Both for $J = J_{\mathrm{KIN,diag}}$ and $J = J_2$ we will decompose
$$J = \Pi_0 J \Pi_0 + \Pi_> J \Pi_> + (\Pi_0 J \Pi_> + \Pi_> J \Pi_0),$$
and estimate the resulting three terms separately. Let us introduce the raising and lowering operators $a$, $a^*$:
$$a = p_{A,1} - i p_{A,2}, \qquad a^* = p_{A,1} + i p_{A,2}. \tag{6.4}$$
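The algebra of $a$ and $a^*$ used in the rest of this proof can be checked in a few lines. The sketch below assumes the commutation relation $[p_{A,1}, p_{A,2}] = -iB$; this sign convention is our assumption, chosen because it is the one that reproduces the identities $a^* a = p_{A,1}^2 + p_{A,2}^2 - B$ and $[a, a^*] = 2B$ quoted in the text.

```latex
% Assumption: [p_{A,1}, p_{A,2}] = -iB (sign convention inferred from the text).
\begin{align*}
a^* a &= (p_{A,1} + i p_{A,2})(p_{A,1} - i p_{A,2})
       = p_{A,1}^2 + p_{A,2}^2 - i\,[p_{A,1}, p_{A,2}]
       = p_{A,1}^2 + p_{A,2}^2 - B, \\
a a^* &= (p_{A,1} - i p_{A,2})(p_{A,1} + i p_{A,2})
       = p_{A,1}^2 + p_{A,2}^2 + i\,[p_{A,1}, p_{A,2}]
       = p_{A,1}^2 + p_{A,2}^2 + B, \\
[a, a^*] &= a a^* - a^* a = 2B .
\end{align*}
```

In particular, since $a \Pi_0 = 0$, the first line gives $(p_{A,1}^2 + p_{A,2}^2 - B)\Pi_0 = a^* a\, \Pi_0 = 0$, and taking adjoints of $a\Pi_0 = 0$ gives $\Pi_0 a^* = 0$, which is the form in which these relations are used in the estimates of this section.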
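We have $a^* a = p_{A,1}^2 + p_{A,2}^2 - B$ and $[a, a^*] = 2B$. It is now an easy calculation that
$$p_A M_t\, p_A = (a^* b_3 a + a b_3 a^*)/2.$$
Therefore, we get by commutations
$$p_A M_t\, p_A - B b_3 - \tfrac{1}{2} \Delta_\perp b_3 = \tfrac{1}{2} (b_3 a^* a + a^* a b_3), \tag{6.5}$$
so, since $a \Pi_0 = 0$ and $\Pi_0 (b \cdot \sigma) \Pi_0 = -\Pi_0 b_3 \Pi_0$,
$$\Pi_0 \bigl( p_A M_t\, p_A + B b \cdot \sigma - \tfrac{1}{2} \Delta_\perp b_3 \bigr) \Pi_0 = \Pi_0\, \tfrac{1}{2} (b_3 a^* a + a^* a b_3)\, \Pi_0 = 0.$$
We now use the original representation (6.2) for $J_{\mathrm{KIN,diag}}$ and the fact that $|\Delta_\perp b_3| \le c \ell^{-2} \le c' B$. Thereby we can estimate
$$\pm \Pi_> \bigl( p_A M_t\, p_A + B b \cdot \sigma - \tfrac{1}{2} \Delta_\perp b_3 \bigr) \Pi_> \le c \hat K,$$
and we know from Lemma 3.1 that
$$\langle \psi; \hat K_N \psi \rangle = o(E^Q).$$
We finally estimate the off-diagonal terms. Notice first, using Cauchy–Schwarz, that
$$\pm \{ B \Pi_0 (b_1 \sigma_1 + b_2 \sigma_2) \Pi_> + \mathrm{h.c.} \} \le \eta\, \Pi_0 (|b_1|^2 + |b_2|^2) \Pi_0 + \eta^{-1} \hat K, \tag{6.6}$$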
for any $\eta > 0$. Notice next, using (6.5), $a \Pi_0 = 0$, commutation and Cauchy–Schwarz that
$$\begin{aligned}
\pm \bigl\{ \Pi_0 \bigl( p_A M_t\, p_A + B b_3 \sigma_3 - \tfrac{1}{2} \Delta_\perp b_3 \bigr) \Pi_> + \mathrm{h.c.} \bigr\}
&= \pm \tfrac{1}{2} \{ \Pi_0 [b_3, a^*]\, a\, \Pi_> + \Pi_> a^* [a, b_3] \Pi_0 \} \\
&\le \eta\, \ell^{-2}\, \Pi_0 |\phi(x/\ell)|^2 \Pi_0 + \eta^{-1} \hat K,
\end{aligned} \tag{6.7}$$
for any $\eta > 0$. Here we wrote $[a, b_3] = -(i \partial_1 + \partial_2) b_3 \equiv \ell^{-1} \phi(x/\ell)$. Now,
$$\sum_{j=1}^{N} \langle \psi; Z \ell^{-1}\, \Pi_0^N |\phi(x^{(j)}/\ell)|^2 \Pi_0^N \psi \rangle = O(E^Q),$$
by Lemma 3.1, and similarly for the corresponding term in (6.6). Therefore, since $\ell^{-2} \ll \ell^{-1} Z$ (since $B \ll Z^3$), we get the bound
$$\Bigl\langle \psi; \sum_{j=1}^{N} \Bigl\{ \Pi_0^{(j)} \Bigl( p_A^{(j)} M_t(x^{(j)})\, p_A^{(j)} + B\, b(x^{(j)}) \cdot \sigma^{(j)} - \tfrac{1}{2} (\Delta_\perp b_3)(x^{(j)}) \Bigr) \Pi_>^{(j)} + \mathrm{h.c.} \Bigr\} \psi \Bigr\rangle = o(E^Q).$$
This proves (6.3) for $J_{\mathrm{KIN,diag}}$.

The analysis of $J_2$ is similar. Notice that $p_A N_2\, p_A$ consists of terms of the form $(p_{A,j} N_{j,3}\, p_{A,3} + \mathrm{h.c.})$. We use that $p_{A,j}$ and $p_{A,3}$ commute to calculate (for $j = 1, 2$), where $c = 1/2$ for $j = 1$ and $c = -i/2$ for $j = 2$:
$$\Pi_0\, p_{A,j} N_{j,3}\, p_{A,3} \Pi_0 = c\, \Pi_0\, a N_{j,3}\, p_{A,3} \Pi_0 = c\, \Pi_0 [a, N_{j,3}]\, p_{A,3} \Pi_0.$$
Thus
$$\pm (\Pi_0\, p_{A,j} N_{j,3}\, p_{A,3} \Pi_0 + \mathrm{h.c.}) \le c' \eta^{-1} \ell^{-2}\, \Pi_0 |\phi(x/\ell)|^2 \Pi_0 + \eta\, p_{A,3}^2,$$
for any $\eta > 0$. Once again we use Lemma 3.1 to bound the involved terms. For the other diagonal term, $\Pi_>\, p_{A,j} N_{j,3}\, p_{A,3} \Pi_>$, we estimate
$$\pm (\Pi_>\, p_{A,j} N_{j,3}\, p_{A,3} \Pi_> + \mathrm{h.c.}) \le c \eta^{-1}\, \Pi_>\, p_{A,\perp}^2 \Pi_> + \eta\, p_{A,3}^2 \le c' \eta^{-1}\, \Pi_> \hat K \Pi_> + \eta\, p_{A,3}^2,$$
for any $\eta > 0$. This easily implies the desired bound.

Finally, the off-diagonal part of $J_2$. Here we have to consider the following type of terms (since $\Pi_0 a^* = 0$):
$$\Pi_0\, a r\, \Pi_>\, p_{A,3}, \qquad \Pi_>\, a r\, \Pi_0\, p_{A,3}, \qquad \Pi_>\, a^* r\, \Pi_0\, p_{A,3},$$
for some functions $r(x) = r_0(x/\ell)$ with $r_0 \in C_0^2(\mathbb{R}^3)$. We will only consider the first type of term, the analysis of the others being similar (and easier). We write
$$\Pi_0\, a r\, \Pi_>\, p_{A,3} = p_{A,3}\, \Pi_0\, a r\, \Pi_> + \Pi_0\, a [r, p_{A,3}] \Pi_>.$$
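Since $\|\Pi_0 a\| \le cB$, we get
$$\pm (p_{A,3}\, \Pi_0\, a r\, \Pi_> + \mathrm{h.c.}) \le \eta\, p_{A,3}^2 + \eta^{-1} (\Pi_0\, a r\, \Pi_>)^* (\Pi_0\, a r\, \Pi_>) \le \eta\, p_{A,3}^2 + c \eta^{-1} \hat K$$
for any $\eta > 0$. Using $\|\nabla r\| \le c \ell^{-1}$, we can similarly estimate
$$\pm (\Pi_0\, a [r, p_{A,3}] \Pi_> + \mathrm{h.c.}) \le \eta\, \ell^{-2} + c \eta^{-1} \hat K$$
for any $\eta > 0$. From here the estimates proceed as they did after (6.7).

We write $J_1 = J_{1,\mathrm{diag}} + J_{1,\mathrm{off}}$ with
$$\begin{aligned}
J_{1,\mathrm{diag}} &= \sum_{j=1}^{N} \Bigl( \Pi_0^{(j)}\, p_A^{(j)} \cdot N_1(x^{(j)})\, p_A^{(j)}\, \Pi_0^{(j)} + \Pi_>^{(j)}\, p_A^{(j)} \cdot N_1(x^{(j)})\, p_A^{(j)}\, \Pi_>^{(j)} \Bigr), \\
J_{1,\mathrm{off}} &= \sum_{j=1}^{N} \Bigl( \Pi_0^{(j)}\, p_A^{(j)} \cdot N_1(x^{(j)})\, p_A^{(j)}\, \Pi_>^{(j)} + \Pi_>^{(j)}\, p_A^{(j)} \cdot N_1(x^{(j)})\, p_A^{(j)}\, \Pi_0^{(j)} \Bigr).
\end{aligned} \tag{6.8}$$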
The diagonal part of $J_1$ is readily estimated:

Lemma 6.2. Let $\lambda, \Lambda > 0$ be given and let $\psi = \psi_{N,Z,B}$ be a ground state for $H(N,Z,B)$. Let $\epsilon > 0$, then there exists $C > 0$ such that if $\lambda \le N/Z \le \Lambda$, $\beta > C$, $Z > C$, then
$$|\langle \psi; J_{1,\mathrm{diag}} \psi \rangle| \le \epsilon\, |E^Q(N,Z,B)|. \tag{6.9}$$

Proof. A simple calculation using the raising and lowering operators $a$, $a^*$ from (6.4) gives
$$p_A \cdot N_1\, p_A = \tfrac{1}{2} \bigl( a (N_{11} + i N_{12})\, a + a^* (N_{11} - i N_{12})\, a^* \bigr). \tag{6.10}$$
Using $a \Pi_0 = 0$, we therefore get
$$\Pi_0\, p_A \cdot N_1\, p_A\, \Pi_0 = 0.$$
For the component with $\Pi_>$ we estimate $N_1$ by its norm to get
$$\pm \Pi_>\, p_A \cdot N_1\, p_A\, \Pi_> \le c\, \Pi_>\, p_{A,\perp}^2 \Pi_> \le c' \hat K.$$
Therefore,
$$\pm J_{1,\mathrm{diag}} \le c' \hat K_N,$$
and we finish the proof of Lemma 6.2 by invoking Lemma 3.1.

We finally need to estimate the off-diagonal part of the operator $J_1$. This is the result of Lemma 6.3. Combined with Corollary 1.2 on the perpendicular kinetic energy, Lemma 6.3 gives a useful bound on $J_{1,\mathrm{off}}$ for a range of magnetic field strengths.
Lemma 6.3. Let $\psi$ be a ground state of $H(N,Z,B)$. Then there exists a constant $c$ (independent of $N$, $Z$, $B$) such that
$$|\langle \psi; J_{1,\mathrm{off}} \psi \rangle| \le c B^{1/2} N^{1/2} \sqrt{\langle \psi; \hat K_N \psi \rangle}. \tag{6.11}$$

Proof. Using (6.10) and $a \Pi_0 = 0$ we get, with $r(x) = (N_{11}(x) + i N_{12}(x))/2$,
$$\Pi_0\, p_A \cdot N_1\, p_A\, \Pi_> = \Pi_0\, a r a\, \Pi_>.$$
We now apply the Cauchy–Schwarz inequality and $\|a^* \Pi_0\| \le c B^{1/2}$ to get
$$|\langle \psi; J_{1,\mathrm{off}} \psi \rangle| \le c \sum_{j=1}^{N} B^{1/2} \langle \psi; \Pi_>^{(j)} (a^{(j)})^* |r(x^{(j)})|^2 a^{(j)} \Pi_>^{(j)} \psi \rangle^{1/2}.$$
By using the antisymmetry of $\psi$ this expression is estimated by
$$c B^{1/2} N^{1/2} \Bigl\langle \psi; \sum_{j=1}^{N} \Pi_>^{(j)} (a^{(j)})^* |r(x^{(j)})|^2 a^{(j)} \Pi_>^{(j)} \psi \Bigr\rangle^{1/2}.$$
The proof of Lemma 6.3 finishes by estimating $|r(x^{(j)})| \le c$, yielding
$$\Pi_>^{(j)} (a^{(j)})^* |r(x^{(j)})|^2 a^{(j)} \Pi_>^{(j)} \le c' \hat K^{(j)}.$$

The proof of Theorem 2.10 is now straightforward.

Proof of Theorem 2.10. It is easy to see that Theorem 2.10 follows by combining Corollary 1.2 with Lemmas 6.1–6.3.

7. Localization to Lowest Landau Band

The objective of this section is to prove the localization to the lowest Landau band, Theorem 1.1. Notice that the first inequality in (1.9) is an easy consequence of the variational principle. Therefore, we will only prove the lower bound in (1.9). The proof follows some of the ideas in [1, Sec. 6] but with more careful estimates. Recall the definition of $\Pi_0$ from (1.8), and define
$$\Pi_> = I - \Pi_0. \tag{7.1}$$
We are able to gain extra precision by noticing that $\Pi_0$ essentially commutes with the potential away from the singularities of the potential.

Proof Part 1: Splitting of $H$. We write
$$H = \sum_{j=1}^{N} \bigl( H_A^{(j)} - Z V(x^{(j)}) \bigr) + \sum_{j<k} V(x^{(j)} - x^{(k)}),$$
with $V(x) = |x|^{-1}$. We will split $V$ as follows. Let $f_1$, $f_2$ be a smooth partition of unity on $\mathbb{R}$ satisfying $f_1(x) = 1$ for $|x| \le 1$, $\operatorname{supp} f_1 \subset (-2, 2)$. Define $d > 0$ by
$$d = \operatorname{dist}(\operatorname{supp} f_1(\cdot), \operatorname{supp} f_2(\cdot/2)). \tag{7.2}$$
Let $\delta_3, \delta_\perp > 0$ (we will optimize in $\delta_3$, $\delta_\perp$ in the end under the condition $\delta_\perp > \delta_3$), define $\phi_1(x_\perp) = f_1(\delta_3^{-1} |x_\perp|) - f_1(\delta_\perp^{-1} |x_\perp|)$ and define
$$V_> = \frac{1 - f_1(\delta_3^{-1} |x_3|)\, f_1(\delta_3^{-1} |x_\perp|)}{|x|}, \qquad
V_< = \frac{f_1(\delta_3^{-1} |x_3|)\, f_1(\delta_\perp^{-1} |x_\perp|)}{|x|}, \qquad
V_{12}(x) = \frac{f_1(\delta_3^{-1} |x_3|)\, \phi_1(x_\perp)}{|x|}. \tag{7.3}$$
Obviously $V = V_> + V_< + V_{12}$. We will also need the function
$$g_1(x) = f_1(\delta_3^{-1} |x_3|)\, f_1^2((2\delta_3)^{-1} |x_\perp|). \tag{7.4}$$
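The identity $V = V_> + V_< + V_{12}$ is a one-line cancellation once the definition of $\phi_1$ is inserted. The sketch below spells it out, assuming the reading of (7.3) in which $V_>$ carries $f_1(\delta_3^{-1}|x_\perp|)$ and $V_<$ carries $f_1(\delta_\perp^{-1}|x_\perp|)$.

```latex
% Uses phi_1(x_perp) = f_1(delta_3^{-1}|x_perp|) - f_1(delta_perp^{-1}|x_perp|).
\begin{align*}
V_< + V_{12}
 &= \frac{f_1(\delta_3^{-1}|x_3|)}{|x|}
    \Bigl[ f_1(\delta_\perp^{-1}|x_\perp|)
         + f_1(\delta_3^{-1}|x_\perp|)
         - f_1(\delta_\perp^{-1}|x_\perp|) \Bigr]
  = \frac{f_1(\delta_3^{-1}|x_3|)\, f_1(\delta_3^{-1}|x_\perp|)}{|x|}, \\
V_> + V_< + V_{12}
 &= \frac{1 - f_1(\delta_3^{-1}|x_3|)\, f_1(\delta_3^{-1}|x_\perp|)}{|x|}
  + \frac{f_1(\delta_3^{-1}|x_3|)\, f_1(\delta_3^{-1}|x_\perp|)}{|x|}
  = \frac{1}{|x|} = V(x).
\end{align*}
```

The point of the splitting is that $V_>$ is smooth (the Coulomb singularity is cut out), while $V_<$ and $V_{12}$ are supported near the singularity and are handled separately.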
Notice, using Lemma 3.2, that
$$\|\Pi_0 V_> \Pi_>\| = \|\Pi_0 [\Pi_0, V_>] \Pi_>\| \le c B^{-1/2} \|\nabla V_>\|_\infty = c B^{-1/2} \delta_3^{-2}. \tag{7.5}$$
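The first equality in (7.5) only uses that $\Pi_0$ and $\Pi_>$ are complementary orthogonal projections. A minimal check, valid for any multiplication operator $W$ in place of $V_>$:

```latex
% Uses Pi_0^2 = Pi_0 and Pi_0 Pi_> = Pi_0 (I - Pi_0) = 0.
\begin{align*}
\Pi_0 [\Pi_0, W] \Pi_>
 &= \Pi_0 \Pi_0 W \Pi_> - \Pi_0 W \Pi_0 \Pi_> \\
 &= \Pi_0 W \Pi_> - 0
  = \Pi_0 W \Pi_> .
\end{align*}
```

So the off-diagonal block of $W$ is controlled by its commutator with $\Pi_0$, which is where the gain $B^{-1/2} \|\nabla W\|_\infty$ from Lemma 3.2 enters.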
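Similarly
$$\Bigl\| \Pi_0\, \frac{\phi_1(x_\perp)}{|x|}\, \Pi_> \Bigr\| \le c B^{-1/2} \delta_\perp^{-2}. \tag{7.6}$$
We will now proceed as in [1, Sec. 6] but using (7.5) and (7.6). Define $\Pi^\alpha$ for $\alpha \subseteq \{1, \ldots, N\}$ by
$$\Pi^\alpha = \prod_{j \in \alpha} \Pi_0^{(j)} \prod_{j \notin \alpha} \Pi_>^{(j)}.$$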
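The operators $\Pi^\alpha$ just defined form a resolution of the identity, which is what makes a decomposition of $H$ into blocks indexed by $\alpha$ possible. A short sketch of why, using only $\Pi_0^{(j)} + \Pi_>^{(j)} = I$ and the fact that projections acting on different particles commute:

```latex
\begin{align*}
I = \prod_{j=1}^{N} \bigl( \Pi_0^{(j)} + \Pi_>^{(j)} \bigr)
  &= \sum_{\alpha \subseteq \{1,\dots,N\}}
     \prod_{j \in \alpha} \Pi_0^{(j)} \prod_{j \notin \alpha} \Pi_>^{(j)}
   = \sum_{\alpha} \Pi^{\alpha}, \\
\Pi^{\alpha} \Pi^{\alpha'} &= 0 \quad \text{for } \alpha \neq \alpha',
\end{align*}
% The second line holds because some index j lies in exactly one of
% alpha, alpha', so the product contains Pi_0^{(j)} Pi_>^{(j)} = 0.
```

Operators that commute with every $\Pi^\alpha$ (such as the kinetic terms $H_A^{(j)}$) are therefore block diagonal, and only the potential terms produce cross terms that need to be estimated.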
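With this notation we will prove that for any $0 < \epsilon_>, \epsilon_<, \epsilon_{12} \le 1$ (and for some $C > 0$ independent of $Z, B, \epsilon_>, \epsilon_<, \epsilon_{12}, \delta_\perp, \delta_3$),
$$H \ge \sum_{\alpha} \Pi^\alpha \bigl( \hat H^\alpha + \tilde H^\alpha \bigr) \Pi^\alpha, \tag{7.7}$$
where
$$\begin{aligned}
\hat H^\alpha = {} & \sum_{j \in \alpha} \Bigl\{ H_A^{(j)} - Z V(x^{(j)}) - \epsilon_< Z V_<(x^{(j)}) - \epsilon_> Z B^{-1/2} \delta_3^{-2} \\
& \qquad - \epsilon_{12} Z B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)}) - C Z \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8) \Bigr\} \\
& + \sum_{j,k \in \alpha:\, j<k} \Bigl\{ V(x^{(j)} - x^{(k)}) - 3 \epsilon_< V_<(x^{(j)} - x^{(k)}) - \epsilon_> B^{-1/2} \delta_3^{-2} \\
& \qquad - \epsilon_{12} B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)} - x^{(k)}) - C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8) \Bigr\},
\end{aligned}$$
and
$$\begin{aligned}
\tilde H^\alpha = {} & \sum_{j \notin \alpha} \Bigl\{ H_A^{(j)} - Z V(x^{(j)}) - \epsilon_<^{-1} Z V_<(x^{(j)}) - C \epsilon_>^{-1} Z B^{-1/2} \delta_3^{-2} \\
& \qquad - C \epsilon_{12}^{-1} Z B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)}) - C Z \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8) \Bigr\} \\
& + \sum_{j,k \notin \alpha:\, j<k} \Bigl\{ V(x^{(j)} - x^{(k)}) - 3 \epsilon_<^{-1} V_<(x^{(j)} - x^{(k)}) - C \epsilon_>^{-1} B^{-1/2} \delta_3^{-2} \\
& \qquad - C \epsilon_{12}^{-1} B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)} - x^{(k)}) - C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8) \Bigr\} \\
& + \sum_{j \in \alpha,\, k \notin \alpha} \Bigl\{ V(x^{(j)} - x^{(k)}) - \tfrac{3}{2} (\epsilon_< + \epsilon_<^{-1}) V_<(x^{(j)} - x^{(k)}) - C \epsilon_>^{-1} B^{-1/2} \delta_3^{-2} \\
& \qquad - C \epsilon_{12}^{-1} B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)} - x^{(k)}) - C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8) \Bigr\}.
\end{aligned}$$
In order to prove (7.7) notice first that it is an equality for the kinetic energy terms $H_A^{(j)}$, since these commute with the $\Pi^\alpha$. For $V(x^{(j)})$ we proceed as follows:
$$\begin{aligned}
V(x^{(j)}) = {} & \Pi_0^{(j)} V(x^{(j)}) \Pi_0^{(j)} + \Pi_>^{(j)} V(x^{(j)}) \Pi_>^{(j)} \\
& + \bigl( \Pi_>^{(j)} V_<(x^{(j)}) \Pi_0^{(j)} + \Pi_0^{(j)} V_<(x^{(j)}) \Pi_>^{(j)} \bigr)
  + \bigl( \Pi_>^{(j)} V_>(x^{(j)}) \Pi_0^{(j)} + \Pi_0^{(j)} V_>(x^{(j)}) \Pi_>^{(j)} \bigr) \\
& + \bigl( \Pi_>^{(j)} V_{12}(x^{(j)}) \Pi_0^{(j)} + \Pi_0^{(j)} V_{12}(x^{(j)}) \Pi_>^{(j)} \bigr).
\end{aligned} \tag{7.8}$$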
For the cross terms with $V_<$ in (7.8), we use the Cauchy–Schwarz inequality to obtain
$$- \Pi_>^{(j)} V_<(x^{(j)}) \Pi_0^{(j)} - \Pi_0^{(j)} V_<(x^{(j)}) \Pi_>^{(j)} \ge - \epsilon_<\, \Pi_0^{(j)} V_<(x^{(j)}) \Pi_0^{(j)} - \epsilon_<^{-1}\, \Pi_>^{(j)} V_<(x^{(j)}) \Pi_>^{(j)}. \tag{7.9}$$
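The operator form of the Cauchy–Schwarz inequality used for (7.9), and repeatedly in this section, follows from the positivity of a single square. The sketch below writes $P = \Pi_0^{(j)}$, $Q = \Pi_>^{(j)}$, $W = V_<(x^{(j)})$ and assumes $W \ge 0$ (which holds when $0 \le f_1 \le 1$, an assumption on the partition of unity that we make explicit here).

```latex
% For any epsilon > 0 put X = epsilon^{1/2} W^{1/2} P - epsilon^{-1/2} W^{1/2} Q.
\begin{align*}
0 \le X^* X
 &= \epsilon\, P W P - P W Q - Q W P + \epsilon^{-1}\, Q W Q, \\
\text{hence}\qquad
 - Q W P - P W Q
 &\ge - \epsilon\, P W P - \epsilon^{-1}\, Q W Q .
\end{align*}
```

Choosing $\epsilon = \epsilon_<$ gives (7.9). The asymmetric weights are deliberate: the small factor $\epsilon_<$ is placed on the lowest-band block $\Pi_0 V_< \Pi_0$, and the large factor $\epsilon_<^{-1}$ on the block $\Pi_> V_< \Pi_>$, where the gap of the higher Landau bands is available to absorb it.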
To estimate the cross terms with $V_>$ in (7.8), we use (7.5) before applying Cauchy–Schwarz and get
$$- \Pi_>^{(j)} V_>(x^{(j)}) \Pi_0^{(j)} - \Pi_0^{(j)} V_>(x^{(j)}) \Pi_>^{(j)} \ge - \epsilon_> B^{-1/2} \delta_3^{-2}\, \Pi_0^{(j)} - C \epsilon_>^{-1} B^{-1/2} \delta_3^{-2}\, \Pi_>^{(j)}. \tag{7.10}$$
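Finally, we estimate the term with $V_{12}$ in (7.8). Here we observe that $f_1(\delta_3^{-1} |x_3|)$ commutes with the projections $\Pi_0$, $\Pi_>$. Therefore, we may write
$$\Pi_0 V_{12}(x) \Pi_> = \sqrt{f_1(\delta_3^{-1} |x_3|)}\; \Pi_0\, \frac{\phi_1(x_\perp)}{|x|}\, \Pi_> \sqrt{f_1(\delta_3^{-1} |x_3|)}.$$
Notice, using (7.2), that $f_2((2\delta_3)^{-1} |x_\perp|)\, \phi_1(x_\perp) = 0$. We calculate, using the function $g_1$ from (7.4),
$$\begin{aligned}
\sqrt{f_1(\delta_3^{-1} |x_3|)}\; \Pi_0\, \frac{\phi_1(x_\perp)}{|x|}\, \Pi_> \sqrt{f_1(\delta_3^{-1} |x_3|)}
= {} & \Pi_0 \sqrt{g_1(x)}\, \Bigl[ \Pi_0, \frac{\phi_1(x_\perp)}{|x|} \Bigr] \sqrt{g_1(x)}\, \Pi_> \\
& + \sqrt{f_1(\delta_3^{-1} |x_3|)}\; \Pi_0\, f_2((2\delta_3)^{-1} |x_\perp|) \Bigl[ \Pi_0, \frac{\phi_1(x_\perp)}{|x|} \Bigr] \sqrt{g_1(x)}\, \Pi_> \\
& + \Pi_0 \sqrt{g_1(x)}\, \Bigl[ \Pi_0, \frac{\phi_1(x_\perp)}{|x|} \Bigr] f_2((2\delta_3)^{-1} |x_\perp|)\, \Pi_> \sqrt{f_1(\delta_3^{-1} |x_3|)}.
\end{aligned} \tag{7.11}$$
On the first term in (7.11) we apply (7.6), before we apply Cauchy–Schwarz. The result is
$$\Pi_0 \sqrt{g_1(x)}\, \Bigl[ \Pi_0, \frac{\phi_1(x_\perp)}{|x|} \Bigr] \sqrt{g_1(x)}\, \Pi_>
\ge - \epsilon_{12} B^{-1/2} \delta_\perp^{-2}\, \Pi_0^{(j)} g_1(x^{(j)}) \Pi_0^{(j)} - C \epsilon_{12}^{-1} B^{-1/2} \delta_\perp^{-2}\, \Pi_>^{(j)} g_1(x^{(j)}) \Pi_>^{(j)}. \tag{7.12}$$
We then consider the two last terms in (7.11). For these we expand the commutator and use $f_2((2\delta_3)^{-1} |x_\perp|)\, \phi_1(x_\perp) = 0$, $f_1((2\delta_3)^{-1} |x_\perp|)\, \phi_1(x_\perp) = \phi_1(x_\perp)$. Thus, the two terms from (7.11) become
$$\sqrt{f_1(\delta_3^{-1} |x_3|)}\; \Pi_0 \Bigl\{ f_2((2\delta_3)^{-1} |x_\perp|)\, \Pi_0\, \frac{\phi_1(x_\perp)}{|x|} - \frac{\phi_1(x_\perp)}{|x|}\, \Pi_0\, f_2((2\delta_3)^{-1} |x_\perp|) \Bigr\} \Pi_> \sqrt{f_1(\delta_3^{-1} |x_3|)}. \tag{7.13}$$
Lemma 7.2 below can be applied to estimate the operator in $\{\cdot\}$ in (7.13) by $C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8)$. Therefore, we can collect our estimates and find
$$\begin{aligned}
- \Pi_0 V_{12}(x) \Pi_> - \Pi_> V_{12}(x) \Pi_0 \ge {} & - \epsilon_{12} B^{-1/2} \delta_\perp^{-2}\, \Pi_0\, g_1(x)\, \Pi_0 - C \epsilon_{12}^{-1} B^{-1/2} \delta_\perp^{-2}\, \Pi_>\, g_1(x)\, \Pi_> \\
& - C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8)\, \Pi_0 - C \delta_\perp^{-1} \exp(-d^2 B \delta_3^2 / 8)\, \Pi_>.
\end{aligned} \tag{7.14}$$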
Combining the right-hand sides from (7.8)–(7.10) and (7.14) we get the terms in $\hat H^\alpha$ and $\tilde H^\alpha$ with one-particle potentials. For the two-particle terms $V(x^{(j)} - x^{(k)})$ we proceed similarly, but with projections $\Pi_0$ and $\Pi_>$ in both the $j$ and $k$ variables. We leave out the details, since the mathematics is identical to the above, but the book-keeping is more tedious.

Part 2: Study of $\hat H^\alpha$. Notice that we actually proved (7.7) as an operator inequality on all of $\otimes_{j=1}^{N} L^2(\mathbb{R}^3, \mathbb{C}^2)$. For the next steps we will need the antisymmetry of the electronic wavefunctions. Suppose that $\psi \in \wedge_{j=1}^{N} L^2(\mathbb{R}^3, \mathbb{C}^2)$, then $\Pi^\alpha \psi(x^{(1)}, \ldots, x^{(N)})$ is antisymmetric separately in the variables in $\alpha$ and those in the complement, $\tilde\alpha$. We will bound $\Pi^\alpha (\hat H^\alpha + \tilde H^\alpha) \Pi^\alpha$ from below by $\inf \operatorname{Spec} \Pi^\alpha \hat H^\alpha \Pi^\alpha + \inf \operatorname{Spec} \Pi^\alpha \tilde H^\alpha \Pi^\alpha$, where the (anti-)symmetry conditions above are imposed on each of the operators. We clearly have, since $N/Z$ is bounded,
$$\begin{aligned}
\hat H^\alpha \ge {} & \Bigl\{ \sum_{j \in \alpha} \bigl( H_A^{(j)} - Z V(x^{(j)}) - \epsilon_< Z V_<(x^{(j)}) - \epsilon_{12} Z B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)}) \bigr) \\
& \quad + \sum_{j,k \in \alpha:\, j<k} \bigl( V(x^{(j)} - x^{(k)}) - 3 \epsilon_< V_<(x^{(j)} - x^{(k)}) - 3 \epsilon_{12} B^{-1/2} \delta_\perp^{-2}\, g_1(x^{(j)} - x^{(k)}) \bigr) \Bigr\} \\
& - c\, \epsilon_> Z^2 B^{-1/2} \delta_3^{-2} - C Z^2 \delta_\perp^{-1} e^{-d^2 B \delta_3^2 / 8}.
\end{aligned} \tag{7.15}$$
Assume without loss of generality that $\alpha = \{1, \ldots, n\}$ (for some $n \le N$) in (7.15) and define $\Pi_0^n = \otimes_{j=1}^{n} \Pi_0^{(j)}$. Then
$$\inf \operatorname{Spec} \Pi^\alpha \hat H^\alpha \Pi^\alpha \ge \inf \operatorname{Spec} \Pi_0^n \hat H^\alpha \Pi_0^n, \tag{7.16}$$
where the operator on the right-hand side is considered as acting on $\wedge_{j=1}^{n} L^2(\mathbb{R}^3; \mathbb{C}^2)$. Let $H^{\alpha_0}$ be the operator in $\{\cdot\}$ in (7.15) with $\alpha = \alpha_0 \equiv \{1, \ldots, N\}$. Then $\Pi_0^N H^{\alpha_0} \Pi_0^N$ is an operator acting on $\wedge_{j=1}^{N} L^2(\mathbb{R}^3; \mathbb{C}^2)$, and we find:
$$\inf \operatorname{Spec} \Pi^\alpha \hat H^\alpha \Pi^\alpha \ge \inf \operatorname{Spec} \Pi_0^N H^{\alpha_0} \Pi_0^N - c\, \epsilon_> Z^{4/3} \beta^{-1/2} \delta_3^{-2} - C Z^2 \delta_\perp^{-1} e^{-d^2 B \delta_3^2 / 8}. \tag{7.17}$$
This follows from (7.16) and a variational argument (“sending electrons to infinity” as in the standard proof of the HVZ-theorem). We will now estimate the difference between the ground state energy of the perturbed operator, $\Pi_0^N H^{\alpha_0} \Pi_0^N$, and the confined energy $E^Q_{\mathrm{conf}}$. The result will be
$$\inf \operatorname{Spec} \Pi_0^N H^{\alpha_0} \Pi_0^N \ge E^Q_{\mathrm{conf}}(N,Z,B) - c\, \epsilon_< \delta_\perp \beta^{2/5} Z^{1/3}\, |E^Q(N,Z,B)| - c\, \epsilon_{12} \Bigl( \frac{\delta_3}{\delta_\perp} \Bigr)^{2} \beta^{-1/10} Z^{-1/3}\, |E^Q(N,Z,B)|. \tag{7.18}$$
Recall from Sec. 2.1 that ` = Z −1/3 (1 + β)−2/5 is the mean size (the size of an atom in magnetic Thomas–Fermi theory) of a neutral atom with nuclear charge Z in a magnetic field of magnitude B. Since β 1, we have ` ≈ Z −1/3 β −2/5 . Thus the first error term above, of order < (δ⊥ /`)E Q , is very natural if we consider < V< a perturbation of V on a length scale δ⊥ . To prove (7.18) we decompose H α0 as δ⊥ δ⊥ α0 α0 α0 + R H3α0 + RH4α0 , H = H 1 + < H2 + < ` ` 2 with R = 12 δδ⊥3 β −1/10 Z −1/3 , where ) !( N X X δ⊥ Z 1 (j) α0 H1 = 1 − 2< HA − (j) + , − 2R ` |x | |x(j) − x(k) | j=1 j
H2α0 =
N X j=1
(j)
HA −
X 3` Z` V< (x(j) ) − V< (x(j) − x(k) ) , δ⊥ δ⊥ j
H3α0
=
N X
(j) HA
j=1
H4α0
=
N X
(j) HA
j=1
2Z − (j) |x |
+
X j
2 , |x(j) − x(k) |
X 12 Z 12 (j) (k) (j) g1 (x3 − x3 ) . − g1 (x3 ) − 2 2 1/2 1/2 Rδ⊥ B Rδ⊥ B j
Clearly,
δ⊥ Q (N, Z, B) − 2R Econf ` δ⊥ Q ≥ Econf (N, Z, B) − C < + R |E Q (N, Z, B)| . `
α0 N inf Spec ΠN 0 H1 Π 0 =
1 − 2<
Thus, we get the desired estimate on H1α0 . In order to estimate H2α0 from below we write ) ( N N X 3`(N − 1) Z` 1 X (j) α0 (j) (j) (k) V< (x ) − V< (x − x ) . HA − H2 = N −1 δ⊥ 2δ⊥ k=1
j=1,j6=k
(7.19)
For fixed k we consider the operator in {·}. Without loss of generality, we assume (k) that k = N . Since there is no kinetic energy term HA we may estimate N inf Spec ΠN 0 {·}Π0
≥ inf3 inf Spec Π0N −1 z∈R
( N −1 X j=1
(j) HA
) Z` 3`(N − 1) −1 V< (x(j) ) − V< (x(j) − z) ΠN . − 0 δ⊥ 2δ⊥
Now we are in a position to use the Lieb–Thirring inequality for the lowest Landau band, Lemma 3.3(iii), to get 3/2 Z 2B 3`(N − 1) `Z N N inf Spec Π0 {·}Π0 ≥ − inf V< (x) + V< (x − z) dx 3π z∈R3 R3 δ⊥ 2δ⊥ = −cβ 2/5 Z 7/3 . Here we used the triangle inequality in L3/2 and the fact that N and Z are of the same order of magnitude. This proves that the contribution from < δ`⊥ H2α0 can be included in the error term in (7.18). Remark 7.1. Notice that we would have got an error of the same order if we had neglected the two-body terms V< (x(j) − x(k) ). A second look at the proof reveals that this is a consequence of the decomposition (7.19), the Lieb–Thirring inequality and the fact that N/Z is bounded. Below we will repeatedly use that string of arguments (decomposition a ` la (7.19), Lieb–Thirring inequality) and it will always be true that the one-body term and the two-body terms are of the same order of magnitude.
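Using the Lieb–Thirring inequality, Lemma 3.3(iii), we easily get the estimate
$$\inf \operatorname{Spec} \Bigl( \epsilon_< \frac{\delta_\perp}{\ell} + R \Bigr) \Pi_0^N H_3^{\alpha_0} \Pi_0^N \ge - c \Bigl( \epsilon_< \frac{\delta_\perp}{\ell} + R \Bigr) \beta^{2/5} Z^{7/3}.$$
Thus the contribution from $H_3^{\alpha_0}$ is included in the error term in (7.18). We estimate the term with $H_4^{\alpha_0}$ using the same method as for $H_2^{\alpha_0}$. After application of the Lieb–Thirring inequality in the lowest Landau band, Lemma 3.3(iii), we get (using Remark 7.1) a lower bound of order:
$$- c B \int \Bigl( \beta^{1/10} Z^{4/3} B^{-1/2} \delta_\perp^{-2} \Bigl( \frac{\delta_3}{\delta_\perp} \Bigr)^{-2} g_1(x) \Bigr)^{3/2} dx = - c' \beta^{2/5} Z^{7/3}. \tag{7.20}$$
Thus we have proved (7.18). We therefore get the following final bound, using (7.17) and (7.18),
$$\begin{aligned}
\inf \operatorname{Spec} \Pi^\alpha \hat H^\alpha \Pi^\alpha \ge {} & E^Q_{\mathrm{conf}}(N,Z,B) \\
& - c \bigl\{ \epsilon_< \delta_\perp \beta^{2/5} Z^{1/3} + \epsilon_{12} \delta_3^2 \delta_\perp^{-2} \beta^{-1/10} Z^{-1/3} + \epsilon_> \delta_3^{-2} \beta^{-9/10} Z^{-1} \bigr\}\, |E^Q(N,Z,B)| \\
& - C Z^2 \delta_\perp^{-1} e^{-d^2 B \delta_3^2 / 8}\, |E^Q(N,Z,B)|.
\end{aligned} \tag{7.21}$$
For the best error estimate, we will choose the parameters $\epsilon_{12}, \epsilon_<, \epsilon_>, \delta_\perp, \delta_3$ such that
$$\epsilon_< \delta_\perp \beta^{2/5} Z^{1/3} = \epsilon_{12} \delta_3^2 \delta_\perp^{-2} \beta^{-1/10} Z^{-1/3} = \epsilon_> \delta_3^{-2} \beta^{-9/10} Z^{-1}. \tag{7.22}$$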
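Part 3: Study of $\tilde H^\alpha$. For $\tilde H^\alpha$ we use the following estimate for $j \notin \alpha$: $\Pi^\alpha H_A^{(j)} \Pi^\alpha \ge \tfrac{1}{2} \Pi^\alpha \bigl( (p_A^{(j)})^2 + 2B \bigr) \Pi^\alpha$. This follows since the $j$th electron is restricted away from the lowest Landau band (when $j \notin \alpha$). So we get the estimate
$$\Pi^\alpha \tilde H^\alpha \Pi^\alpha \ge \Pi^\alpha \bigl( \tilde H_1^\alpha + \tilde H_2^\alpha + \tilde H_3^\alpha + \tilde H_4^\alpha \bigr) \Pi^\alpha, \tag{7.23}$$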
where ˜α = H 1
X j ∈α /
+
(j)
((pA )2 + 2B)/8 − X
j,k∈α:j
˜α = H 2
X j ∈α /
˜ 3α = H
j ∈α /
1 + − x(k) |
X
j∈α,k∈α /
|x(j)
1 , − x(k) |
(j)
(j) ((pA )2 + 2B)/8 − −1 ) < ZV< (x
− 3−1 < X
|x(j)
Z |x(j) |
X
j,k∈α:j
(j)
3 + < ) V< (x(j) − x(k) ) − (−1 2 <
X
j∈α,k∈α /
−1/2 −2 ((p3 )2 + 2B)/8 − C−1 δ⊥ g1 (x(j) ) 12 ZB
V< (x(j) − x(k) ) ,
S. Fournais −1/2 −2 − C−1 δ⊥ 12 B −1/2 −2 − C−1 δ⊥ 12 B
˜ 4α = H
X j ∈α /
X
g1 (x(j) − x(k) )
X
g1 (x(j) − x(k) ) ,
j,k∈α:j
j∈α,k∈α /
−1 −d −1/2 −2 (B/4 − C−1 δ3 − CZδ⊥ e > ZB
−C −C
X
−1 −d −1/2 −2 (−1 δ3 + δ ⊥ e > B
2
2
Bδ32 /8
Bδ32 /8
)
)
j,k∈α / : j
X
−1 −d −1/2 −2 (−1 δ3 + δ ⊥ e > B
2
Bδ32 /8
).
j∈α,k∈α /
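We want the negative constants in $\tilde H_4^\alpha$ to be controlled by the positive $B/4$, so we get the condition (using that $N/Z$ is bounded)
$$C \beta^{-3/2} Z^{-1} + \epsilon_>\, C Z^{-1/3} \beta^{-1} \delta_\perp^{-1} \delta_3^2\, e^{-d^2 \beta Z^{4/3} \delta_3^2 / 8} \le \epsilon_> \delta_3^2. \tag{7.24}$$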
Under this condition $\tilde H_4^\alpha$ is positive and can be discarded for a lower bound. The remaining three operators in (7.23) are all estimated using a decomposition similar to (7.19) and a Lieb–Thirring type inequality. Using Remark 7.1, we will not explicitly consider the two-body terms in each of these operators. A repetition of the arguments given in the proof of (7.18) above shows that the error bounds are not affected by this.

For $\tilde H_1^\alpha$, we use the non-magnetic Lieb–Thirring inequality, Lemma 3.3(ii), and get
$$\inf \operatorname{Spec} \tilde H_1^\alpha \ge - C \int_{\mathbb{R}^3} [B - Z |x|^{-1}]_-^{5/2}\, dx = - C' \beta^{-9/10}\, |E^Q|. \tag{7.25}$$
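The power $\beta^{-9/10}$ in (7.25) can be recovered by evaluating the integral explicitly. The sketch below assumes $[t]_- = \max(-t, 0)$, $\beta = B Z^{-4/3}$, and that $|E^Q|$ is of order $\beta^{2/5} Z^{7/3}$ for large $\beta$; these scalings are inferred from the way the error terms are normalised elsewhere in this section and are stated here as assumptions.

```latex
% The integrand is supported in |x| < Z/B; substitute r = (Z/B) s.
\begin{align*}
\int_{\mathbb{R}^3} \bigl[ B - Z|x|^{-1} \bigr]_-^{5/2} dx
 &= 4\pi \int_0^{Z/B} \Bigl( \frac{Z}{r} - B \Bigr)^{5/2} r^2 \, dr
  = 4\pi\, Z^3 B^{-1/2} \int_0^1 s^{-1/2} (1-s)^{5/2} \, ds \\
 &= 4\pi\, Z^3 B^{-1/2}\,
    \frac{\Gamma(\tfrac12)\,\Gamma(\tfrac72)}{\Gamma(4)}
  = \frac{5\pi^2}{4}\, Z^3 B^{-1/2}, \\
Z^3 B^{-1/2}
 &= \beta^{-1/2} Z^{7/3}
  = \beta^{-9/10} \cdot \beta^{2/5} Z^{7/3} .
\end{align*}
```

The last line is $\beta^{-9/10}$ times the assumed order of $|E^Q|$, in agreement with the right-hand side of (7.25).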
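To estimate the operator $\tilde H_3^\alpha$ we also use the 3-dimensional Lieb–Thirring inequality, Lemma 3.3(ii), and get an estimate as:
$$\inf \operatorname{Spec} \tilde H_3^\alpha \ge - C' \int_{\mathbb{R}^3} \bigl[ B - C \epsilon_{12}^{-1} Z B^{-1/2} \delta_\perp^{-2}\, g_1(z) \bigr]_-^{5/2}\, dz
\ge \begin{cases}
0, & C Z^{-1} \beta^{-3/2} < \delta_\perp^2 \epsilon_{12}, \\
- C \epsilon_{12}^{-5/2} \delta_3^3 \delta_\perp^{-5} \beta^{-33/20} Z^{-3/2}\, |E^Q|, & C Z^{-1} \beta^{-3/2} \ge \delta_\perp^2 \epsilon_{12},
\end{cases}
\;\equiv R_1. \tag{7.26}$$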
The final operator, $\tilde H_2^\alpha$, is also estimated using Lemma 3.3(ii). We get
$$\inf \operatorname{Spec} \tilde H_2^\alpha \ge - C' \int_{\mathbb{R}^3} \bigl[ B - C \epsilon_<^{-1} Z V_<(x) \bigr]_-^{5/2}\, dx
\ge \begin{cases}
- C \epsilon_<^{-3} \beta^{-9/10}\, |E^Q|, & \dfrac{Z}{\epsilon_< B} < \delta_\perp, \\[2mm]
- C \delta_\perp^{1/2} \epsilon_<^{-5/2} \beta^{-2/5} Z^{1/6}\, |E^Q|, & \dfrac{Z}{\epsilon_< B} \ge \delta_\perp,
\end{cases}
\;\equiv R_2. \tag{7.27}$$
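Thus we get our final lower bound on $\tilde H^\alpha$ (using (7.25)–(7.27) and under the condition (7.24)) uniformly in $\alpha$ as
$$\tilde H^\alpha \ge C (\beta^{-9/10} + R_1 + R_2)\, |E^Q|. \tag{7.28}$$

Part 4: Choice of parameters. By combining (7.7), (7.21) and (7.28) (under the conditions (7.22) and (7.24)), we get
$$E^Q \ge E^Q_{\mathrm{conf}} - c \bigl\{ \epsilon_< \delta_\perp \beta^{2/5} Z^{1/3} + \beta^{-9/10} + R_1 + R_2 + Z^2 \delta_\perp^{-1} e^{-d^2 B \delta_3^2 / 8} \bigr\}\, |E^Q|. \tag{7.29}$$
We choose$^a$
$$\delta_\perp = \beta^{-\frac{107}{206}} Z^{-\frac{166}{309}}, \quad
\delta_3 = \beta^{-\frac{219}{412}} Z^{-\frac{47}{103}}, \quad
\epsilon_{12} = \beta^{-\frac{77}{515}} Z^{-\frac{1}{309}}, \quad
\epsilon_< = \beta^{-\frac{159}{1030}} Z^{\frac{3}{103}}, \quad
\epsilon_> = \beta^{-\frac{45}{103}} Z^{-\frac{9}{103}}. \tag{7.30}$$
Thereby, the three main error terms, $\epsilon_< \delta_\perp \beta^{2/5} Z^{1/3}$, $\delta_\perp^{1/2} \epsilon_<^{-5/2} \beta^{-2/5} Z^{1/6}$, and $\epsilon_{12}^{-5/2} \delta_3^3 \delta_\perp^{-5} \beta^{-33/20} Z^{-3/2}$ in (7.29) become of the same order. Furthermore, since $\delta_3 \gg B^{-1/2}$, the Gaussian term becomes negligibly small. The leading error term becomes
$$\beta^{-\frac{141}{515}} Z^{-\frac{18}{103}} + \beta^{-9/10}. \tag{7.31}$$
In the case $\beta^{-\frac{159}{1030}} Z^{\frac{3}{103}} > 1$ the above choice violates $\epsilon_< \le 1$. Therefore, in that case, we choose $\epsilon_< = 1$ and the other parameters as in (7.30). The dominating error term from (7.21) and (7.28) is then
$$\delta_\perp^{1/2} \epsilon_<^{-5/2} \beta^{-2/5} Z^{1/6} + \beta^{-9/10} = \beta^{-\frac{1359}{2060}} Z^{-\frac{21}{206}} + \beta^{-9/10} \le 2 \beta^{-9/10},$$
for $\beta^{-\frac{159}{1030}} Z^{\frac{3}{103}} > 1$.

For $B \ge Z^{\frac{508}{321}}$ the choice (7.30) gives $Z^{-1} \beta^{-3/2} \le \delta_\perp^2 \epsilon_{12}$. Therefore, we fall in the first case in (7.26). Thus, we add the condition $Z^{-1} \beta^{-3/2} \le \delta_\perp^2 \epsilon_{12}$, and get $R_1 = 0$. This results in the following choice of parameters:
$$\delta_\perp = \beta^{-\frac{107}{185}} Z^{-\frac{58}{111}}, \quad
\delta_3 = \beta^{-\frac{96}{185}} Z^{-\frac{17}{37}}, \quad
\epsilon_< = \beta^{-\frac{27}{185}} Z^{\frac{1}{37}}, \quad
\epsilon_{12} = \beta^{-\frac{127}{370}} Z^{\frac{5}{111}}, \quad
\epsilon_> = \beta^{-\frac{171}{370}} Z^{-\frac{3}{37}}, \tag{7.32}$$
which gives a (relative) error term of order
$$\beta^{-\frac{12}{37}} Z^{-\frac{6}{37}} + \beta^{-9/10}. \tag{7.33}$$
Finally, for $B \ge Z^{\frac{98}{51}}$, we choose the first bound $\epsilon_<^{-3} \beta^{-9/10}\, |E^Q|$ in (7.27). This results in the following choice of parameters
$$\delta_\perp = \beta^{-\frac{233}{410}} Z^{-\frac{65}{123}}, \quad
\delta_3 = \beta^{-\frac{21}{41}} Z^{-\frac{19}{41}}, \quad
\epsilon_{12} = \beta^{-\frac{149}{410}} Z^{\frac{7}{123}}, \quad
\epsilon_< = \beta^{-\frac{15}{82}} Z^{\frac{2}{41}}, \quad
\epsilon_> = \beta^{-\frac{39}{82}} Z^{-\frac{3}{41}}, \tag{7.34}$$
which gives a (relative) error of order
$$\beta^{-\frac{72}{205}} Z^{-\frac{6}{41}} + \beta^{-9/10}. \tag{7.35}$$

$^a$ We only give the dependence on $\beta$, $Z$. One needs to add small/big multiplicative constants to make inequalities such as (7.24) correct.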
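The exponents in the first parameter choice can be checked against the balance condition (7.22) by bookkeeping the powers of $\beta$ and $Z$. The sketch below reads (7.30) as $\delta_\perp = \beta^{-107/206} Z^{-166/309}$, $\delta_3 = \beta^{-219/412} Z^{-47/103}$, $\epsilon_{12} = \beta^{-77/515} Z^{-1/309}$, $\epsilon_< = \beta^{-159/1030} Z^{3/103}$, $\epsilon_> = \beta^{-45/103} Z^{-9/103}$; pairing the printed numerators with these denominators is our assumption, and the check below is what makes that pairing plausible.

```latex
% Powers of beta over the common denominator 1030, powers of Z over 309.
\begin{align*}
\epsilon_< \delta_\perp \beta^{2/5} Z^{1/3}:
 &\quad \beta:\ \tfrac{-159 - 535 + 412}{1030} = -\tfrac{282}{1030} = -\tfrac{141}{515},
 &&Z:\ \tfrac{9 - 166 + 103}{309} = -\tfrac{54}{309} = -\tfrac{18}{103}, \\
\epsilon_{12} \delta_3^{2} \delta_\perp^{-2} \beta^{-1/10} Z^{-1/3}:
 &\quad \beta:\ \tfrac{-154 - 1095 + 1070 - 103}{1030} = -\tfrac{282}{1030},
 &&Z:\ \tfrac{-1 - 282 + 332 - 103}{309} = -\tfrac{54}{309}, \\
\epsilon_> \delta_3^{-2} \beta^{-9/10} Z^{-1}:
 &\quad \beta:\ \tfrac{-450 + 1095 - 927}{1030} = -\tfrac{282}{1030},
 &&Z:\ \tfrac{-27 + 282 - 309}{309} = -\tfrac{54}{309}.
\end{align*}
```

All three expressions equal $\beta^{-141/515} Z^{-18/103}$, which is the first term of the leading error (7.31). The same bookkeeping applied to (7.32) and (7.34) reproduces the first terms of (7.33) and (7.35).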
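Combining the terms from (7.31), (7.33) and (7.35), we get (1.9).

In the proof above we used the following localization lemma:

Lemma 7.2. Let $f_2$, $\phi_1$ be as defined in the text before (7.3). Let $d = \operatorname{dist}\bigl( \operatorname{supp} f_2( \tfrac{|x_\perp|}{2\delta_3} ), \operatorname{supp} \phi_1(x_\perp) \bigr)$. Then there exists a constant $C > 0$ such that
$$\Bigl\| f_2(|x_\perp| / (2\delta_3))\, \Pi_0\, \frac{\phi_1(x_\perp)}{|x|} \Bigr\| \le C \delta_\perp^{-1} e^{-d^2 B \delta_3^2}.$$

Proof. Let
$$K(x, y) = f_2(|x_\perp| / (2\delta_3))\, \Pi_0(x, y)\, \frac{\phi_1(y_\perp)}{|y|},$$
be the integral kernel of the operator in question. From the explicit kernel (1.8) for $\Pi_0$ we see that
$$\begin{aligned}
|K(x, y)| &\le \sup \Bigl\{ f_2(|x_\perp| / (2\delta_3))\, e^{-B |x_\perp - y_\perp|^2 / 8}\, \frac{\phi_1(y_\perp)}{|y|} \Bigr\}\, \frac{B}{2\pi}\, e^{-B |x_\perp - y_\perp|^2 / 8}\, \delta(x_3 - y_3) \\
&\le C \delta_\perp^{-1} e^{-d^2 B \delta_3^2 / 8}\, \frac{B}{2\pi}\, e^{-B |x_\perp - y_\perp|^2 / 8}\, \delta(x_3 - y_3).
\end{aligned}$$
Now the desired estimate follows from the standard inequality:
$$\|K\| \le \max \Bigl\{ \sup_x \int |K(x, y)|\, dy,\ \sup_y \int |K(x, y)|\, dx \Bigr\}.$$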
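For completeness, the two suprema in the standard inequality above are easy to evaluate for the Gaussian bound on the kernel. The following sketch only uses the elementary integral $\int_{\mathbb{R}^2} e^{-c|u|^2}\,du = \pi/c$ with $c = B/8$ and the fact that $\delta(x_3 - y_3)$ integrates to one.

```latex
\begin{align*}
\int_{\mathbb{R}^3} \frac{B}{2\pi}\, e^{-B|x_\perp - y_\perp|^2/8}\,
   \delta(x_3 - y_3)\, dy
 &= \frac{B}{2\pi} \int_{\mathbb{R}^2} e^{-B|u|^2/8}\, du
  = \frac{B}{2\pi} \cdot \frac{8\pi}{B} = 4 .
\end{align*}
% The bound on |K(x,y)| is symmetric in x and y, so the integral over x
% gives the same value, and therefore
% ||K|| <= 4 C delta_perp^{-1} exp(-d^2 B delta_3^2 / 8).
```

The factor 4 is independent of $B$, so it is absorbed into the constant $C$ of Lemma 7.2.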
Finally we give the short proof of Corollary 1.2.

Proof of Corollary 1.2. Let
$$\hat H = H - \frac{1}{2} \sum_{j=1}^{N} \hat K_j,$$
let $\hat E(N,Z,B)$ be the ground state energy of $\hat H$ and let $\psi$ be a ground state of $H$. Then
$$0 \ge - \frac{1}{2} \sum_{j=1}^{N} \langle \psi | \hat K_j | \psi \rangle \ge \hat E(N,Z,B) - E(N,Z,B).$$
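The chain of inequalities above is the variational principle for $\hat H$ evaluated in the ground state of $H$. A short sketch, assuming $\psi$ is normalised and each $\hat K_j \ge 0$ (which is what the leftmost inequality expresses):

```latex
\begin{align*}
\hat E(N,Z,B)
 &\le \langle \psi | \hat H | \psi \rangle
  = \langle \psi | H | \psi \rangle
    - \frac12 \sum_{j=1}^{N} \langle \psi | \hat K_j | \psi \rangle
  = E(N,Z,B) - \frac12 \sum_{j=1}^{N} \langle \psi | \hat K_j | \psi \rangle, \\
\text{hence}\qquad
 0 \le \frac12 \sum_{j=1}^{N} \langle \psi | \hat K_j | \psi \rangle
 &\le E(N,Z,B) - \hat E(N,Z,B).
\end{align*}
```

So any bound on the energy difference $E - \hat E$ translates directly into a bound on the expectation of $\sum_j \hat K_j$ in the true ground state.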
One now notices that (the proof of) Theorem 1.1 remains true with $H$ replaced by $\hat H$. Thus
$$\hat E(N,Z,B) - E(N,Z,B) = O(R\, E^Q(N,Z,B)).$$
This finishes the proof.

Acknowledgments

It is a pleasure for the author to use this opportunity to thank Prof. B. Helffer for useful discussions, encouragement and comments on a preliminary manuscript. The author also wishes to thank the anonymous referee for many helpful comments and corrections.

References

[1] E. H. Lieb, J. P. Solovej and J. Yngvason, Asymptotics of heavy atoms in high magnetic fields. I. Lowest Landau band regions, Comm. Pure Appl. Math. 47(4) (1994), 513–591.
[2] E. H. Lieb, J. P. Solovej and J. Yngvason, Asymptotics of heavy atoms in high magnetic fields. II. Semiclassical regions, Commun. Math. Phys. 161(1) (1994), 77–124.
[3] C. Hainzl and R. Seiringer, A discrete density matrix theory for atoms in strong magnetic fields, Commun. Math. Phys. 217(1) (2001), 229–248.
[4] L. Erdős and J. P. Solovej, Semiclassical eigenvalue estimates for the Pauli operator with strong non-homogeneous magnetic fields. II. Leading order asymptotic estimates, Commun. Math. Phys. 188(3) (1997), 599–656.
[5] E. H. Lieb and B. Simon, The Thomas–Fermi theory of atoms, molecules and solids, Adv. Math. 23(1) (1977), 22–116.
[6] E. H. Lieb, Thomas–Fermi and related theories of atoms and molecules, Rev. Modern Phys. 53(4) (1981), 603–641.
[7] I. Fushiki, E. H. Gudmundsson, C. J. Pethick and J. Yngvason, Matter in a magnetic field in the Thomas–Fermi and related theories, Ann. Phys. 216 (1992), 29–72.
[8] S. Fournais, On the semiclassical asymptotics of the current and magnetic moment of a non-interacting electron gas at zero temperature in a strong constant magnetic field, Ann. Henri Poincaré 2(6) (2001), 1189–1212.
[9] S. Fournais, The magnetisation of large atoms in strong magnetic fields, Commun. Math. Phys. 216(2) (2001), 375–393.
[10] S. Fournais, Semiclassics of the quantum current in very strong magnetic fields, Annales de l’Institut Fourier 52(6) (2002), 1901–1945.
[11] S. Fournais, Semiclassics of the quantum current, Comm. Partial Differential Equations 23(3,4) (1998), 601–628.
[12] S. Fournais, On the total magnetic moment of large atoms in strong magnetic fields, Lett. Math. Phys. 59(1) (2002), 33–45.
[13] E. H. Lieb and W. Thirring, A bound on the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in Mathematical Physics: Essays in Honor of Valentine Bargmann, Academic Press, New York, 1976, pp. 269–303.
[14] B. Simon, Functional Integration and Quantum Physics, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1979.
February 9, 2004 19:51 WSPC/148-RMP
00187
Reviews in Mathematical Physics Vol. 15, No. 10 (2003) 1255–1283 c World Scientific Publishing Company
ESSENTIAL PROPERTIES OF THE VACUUM SECTOR FOR A THEORY OF SUPERSELECTION SECTORS

GIUSEPPE RUZZI
Dipartimento di Matematica, Università di Roma “Tor Vergata”
Via della Ricerca Scientifica I-00133, Roma, Italy
[email protected]

Received 2 April 2003
Revised 29 October 2003

As a generalization of DHR analysis, the superselection sectors are studied in the absence of the spectrum condition for the reference representation. Considering a net of local observables in 4-dimensional Minkowski spacetime, we associate to a set of representations, that are local excitations of a reference representation fulfilling Haag duality, a symmetric tensor C∗-category B(A) of bimodules of the net, with subobjects and direct sums. The existence of conjugates is studied introducing an equivalent formulation of the theory in terms of the presheaf associated with the observable net. This allows us to find, under the assumption that the local algebras in the reference representation are properly infinite, necessary and sufficient conditions for the existence of conjugates. Moreover, we present several results that suggest how the mentioned assumption on the reference representation can be considered essential also in the case of theories in curved spacetimes.

Keywords: Superselection sectors; generalized vacuum state; spectrum condition.
Contents
1. Introduction
2. The Net and the Presheaf Approach to the Theory, and Their Equivalence
2.1. The category B(A)
2.2. The category B(A⊥)
2.3. The isomorphism between B(A⊥) and B(A)
2.4. Faithfulness and double faithfulness
3. Statistics and Selection of the Relevant Subcategory
3.1. Symmetry
3.2. Net-left inverses, presheaf-left inverses and homogeneity
3.3. Simple objects
3.4. The category of objects with finite statistics
3.5. The selection of the relevant subcategory
4. Conjugation
5. Conclusions
Appendix A. Some Notions and Results on Tensor C∗-Categories
A.1. Left inverses, symmetry and simple objects
A.2. The notation introduced in Sec. 3.2
Acknowledgments
References
1. Introduction In a series of papers [4, 5, 7], the first two of which are known as DHR analysis, Doplicher, Haag and Roberts have shown that the properties of charges associated with a global gauge group, like the Bose–Fermi alternative and the charge conjugation symmetry, find a natural description in the superselection sectors of a net of local observables. The theory was based on one important result obtained in a previous investigation [3]: the representations of the net local observables, corresponding to such kinds of charges, fulfill the following property: they are local excitations of the vacuum representation. This property was used in [4, 5] as the criterion for selecting a set of representations of a net of local observables. The authors associate to this set a C∗ -category, in which the charge structure arises from the existence of a tensor product, a symmetry and a conjugation. Finally it has been shown by Doplicher and Roberts [7] that the unobservable quantities underlying the theory, namely the fields and the global gauge group, can be reconstructed from the observables. At the present time it is not possible to apply this program in a curved spacetime without a global symmetry. In this case, in fact, a notion corresponding to the spectrum condition by which one could define a vacuum representation of a net of local observables does not yet exist.a The DHR analysis is, however, well suited for treating this situation, because no explicit use of Poincar´e covariance is made. Moreover, the spectrum condition is not fully used in the theory: only the Borchers property, a consequence of the spectrum condition [1], has a real role. In this paper we further generalize the theory. We will consider the set of representations that are local excitations of a reference representation, which is not required to satisfy the Borchers property. Also in this case, a tensor C∗ -category having a symmetry is associated with this set of representations. 
Then, the subject of this paper will be the search for a criterion for selecting the relevant subcategory of the theory: namely, the maximal full subcategory which is closed under tensor products, direct sums and subobjects, and whose objects have conjugates.

^a The superselection sectors of a net of local observables on an arbitrary globally-hyperbolic spacetime have been studied in [9]. Except when geometrical obstructions are present, the results of the DHR analysis are reproduced. However, the reference representation used in this analysis is not characterized by physical conditions, as the vacuum in the case of Minkowski space, but only by mathematical ones, suggested by a study on the representations, induced by quasi-free Hadamard states, of the local algebras of a free Bose field [17]. In this connection see also [18].

In the usual setting of Algebraic Quantum Field Theory (see [10] and references therein), we consider a local net of von Neumann algebras $\mathcal{R}$ over $\mathbb{M}^4$, namely a correspondence $\mathcal{R} : \mathcal{K} \ni a \to \mathcal{R}(a)$ associating to an open double cone $a$ of $\mathbb{M}^4$ a von Neumann algebra $\mathcal{R}(a)$ on a fixed Hilbert space $\mathcal{H}$, subject to the conditions:

\[ a_1 \subseteq a_2 \;\Rightarrow\; \mathcal{R}(a_1) \subseteq \mathcal{R}(a_2) \qquad \text{(isotony)} \]
\[ a_1 \perp a_2 \;\Rightarrow\; \mathcal{R}(a_1) \subseteq \mathcal{R}(a_2)' \qquad \text{(locality)} \]

where the symbol $\perp$ stands for spacelike separation and the prime for the commutant. The algebra $\mathcal{R}(a)$ is generated by all the observables measurable within $a$. For an unbounded region $S \subseteq \mathbb{M}^4$ there is an associated C*-algebra $\mathcal{R}(S)$ generated by all the algebras $\mathcal{R}(a)$ such that $a \in \mathcal{K}$, $a \subset S$. We denote by $\mathcal{R}$ the algebra associated with $\mathbb{M}^4$.

As reference representation of $\mathcal{R}$ we consider a locally normal, faithful, irreducible representation $\pi_o$, on an infinite dimensional separable Hilbert space $\mathcal{H}_o$, such that the net of local observables in the reference representation, $\mathcal{A} : \mathcal{K} \ni a \to \mathcal{A}(a) \subset \mathcal{A}$, where $\mathcal{A}(a) \equiv \pi_o(\mathcal{R}(a))$ and $\mathcal{A} \equiv \pi_o(\mathcal{R})$, satisfies Haag duality, namely

\[ \mathcal{A}(a^\perp)' = \mathcal{A}(a) \qquad \forall a \in \mathcal{K}, \]

where $a^\perp$ denotes the spacelike complement of $a$.

Now, in the present investigation we are interested in a set of representations of $\mathcal{R}$ which is closed under direct sums and subrepresentations, and whose elements are local excitations of $\pi_o$. Without the Borchers property, such a set of representations can be selected by a suitable generalization of the DHR criterion. Precisely, we consider the representations $\pi$ of $\mathcal{R}$ satisfying the following relation: for each $a \in \mathcal{K}$ there exists $n_a \in \mathbb{N}$ and an isometry $V_a : \mathcal{H}_\pi \to \mathcal{H}_o \otimes \mathbb{C}^{n_a}$ such that

\[ V_a \cdot \pi(A) = \pi_o(A) \otimes 1_{n_a} \cdot V_a, \qquad A \in \mathcal{R}(a^\perp). \tag{1} \]
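As a short worked step (an editorial sketch, using only that $V_a$ is an isometry, i.e. $V_a^{+}V_a = 1$ on $\mathcal{H}_\pi$, and that $\mathcal{R}(a^\perp)$ is closed under the adjoint), relation (1) can be read as saying that $\pi$ restricted to $\mathcal{R}(a^\perp)$ is unitarily equivalent to a subrepresentation of $n_a$ copies of $\pi_o$. The projection $E_a \equiv V_a V_a^{+}$ is notation introduced here for illustration only.

```latex
\[
\begin{gathered}
\pi(A) \;=\; V_a^{+}V_a\,\pi(A) \;=\; V_a^{+}\bigl(\pi_o(A)\otimes 1_{n_a}\bigr)V_a ,
\qquad A\in\mathcal{R}(a^{\perp}), \\[4pt]
\pi(A)\,V_a^{+} \;=\; \bigl(V_a\,\pi(A^{*})\bigr)^{+}
 \;=\; \bigl((\pi_o(A^{*})\otimes 1_{n_a})\,V_a\bigr)^{+}
 \;=\; V_a^{+}\bigl(\pi_o(A)\otimes 1_{n_a}\bigr), \\[4pt]
E_a\bigl(\pi_o(A)\otimes 1_{n_a}\bigr)
 \;=\; V_a\,\pi(A)\,V_a^{+}
 \;=\; \bigl(\pi_o(A)\otimes 1_{n_a}\bigr)E_a ,
\qquad E_a \equiv V_aV_a^{+}.
\end{gathered}
\]
```

In the special case $n_a = 1$ with $V_a$ unitary, this reduces to the unitary equivalence of $\pi$ and $\pi_o$ on $\mathcal{R}(a^\perp)$, which is the form of the DHR criterion.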
We denote by SC the set of the representations verifying this selection criterion. We will associate to SC the tensor C*-category $\mathcal{B}(\mathcal{A})$ of the localized transportable bimodules of the net. This category is closed under direct sums and subobjects. Moreover, we will show the existence of a symmetry $\varepsilon$; thus, a notion of statistics of sectors can be introduced. However, since there might exist objects without left inverses, not all the sectors of $\mathcal{B}(\mathcal{A})$ fall into the DHR classes of sectors with finite/infinite statistics. The study of the properties of objects having conjugates will show that, apart from the finiteness of the statistics, an additional condition, called homogeneity, is necessary for the existence of conjugates. Under the assumption that the local algebras are properly infinite, we will prove that the homogeneous sectors with finite statistics have conjugates.

The key result that will allow us to formulate the property of a homogeneous object is that the superselection sectors theory of the net $\mathcal{A}$ is equivalent to the one of the presheaf $\mathcal{A}^\perp : \mathcal{K} \ni a \to \mathcal{A}(a)'$. Namely, we will introduce the category $\mathcal{B}(\mathcal{A}^\perp)$ of the localized transportable bimodules of the presheaf $\mathcal{A}^\perp$: a bimodule $\hat\rho$ of $\mathcal{A}^\perp$ is a collection of morphisms

\[ {}_a\rho : \mathcal{A}(a)' \to B(\mathcal{H}_o) \otimes \mathbb{M}_{n_\rho}, \qquad \forall a \in \mathcal{K}, \]

compatible with the presheaf structure. We will show that this category is isomorphic to $\mathcal{B}(\mathcal{A})$; in particular, any object $\rho$ of $\mathcal{B}(\mathcal{A})$ admits a canonical extension to a localized transportable bimodule $\hat\rho$ of the presheaf (Theorem 2.3). Using this isomorphism, we introduce the notion of presheaf-left inverse of $\rho$, which generalizes the concept of left inverse for unital endomorphisms of a C*-algebra to its extension $\hat\rho$ to the presheaf (Definition 3.9). However, the property of admitting presheaf-left inverses is not stable under equivalence and depends on the double cone where the object is localized. Hence, we will say that $\rho$ is homogeneous whenever all the elements of its equivalence class $[\rho]$ admit presheaf-left inverses (Definition 3.10).

The existence of a maximal full subcategory $\mathcal{B}(\mathcal{A})_{\mathrm{fh}}$ of $\mathcal{B}(\mathcal{A})$ with homogeneous objects, closed under direct sums, tensor products and subobjects, and having finite statistics, will be proved in Proposition 3.20. Any object of $\mathcal{B}(\mathcal{A})_{\mathrm{fh}}$ is a finite direct sum of irreducible objects $\rho$ fulfilling the following conditions: there exists an integer $d$, an object $\gamma$ and an isometry $V \in (\gamma, \rho^d)$ such that

(1) $\varepsilon(\gamma, \gamma) = \pm 1_\gamma$;
(2) $V V^{+}$ equals either $A_d$ or $S_d$, the totally (anti)symmetric projector in $(\rho^d, \rho^d)$;
(3) the extension $\hat\gamma$ of $\gamma$ is a faithful morphism of the presheaf, that is, ${}_a\gamma : \mathcal{A}(a)' \to B(\mathcal{H}_o) \otimes \mathbb{M}_{n_\rho}$ is a faithful morphism for each $a \in \mathcal{K}$.

$\mathcal{B}(\mathcal{A})_{\mathrm{fh}}$ is the relevant subcategory of the theory. Indeed, we will prove that on the one hand each object with conjugates belongs to this category (Theorem 4.1), and, on the other hand, if the local algebras are properly infinite any object of $\mathcal{B}(\mathcal{A})_{\mathrm{fh}}$ has conjugates (Theorem 4.4). This last result suggests that it is reasonable to include proper infiniteness of the local algebras $\mathcal{A}(a)$ as an axiom of the theory. This proposal is also supported by the following facts: first, this property can be derived, in a particular case, from the existence of conjugates (Theorem 4.2); secondly, in a globally-hyperbolic spacetime the algebras of local observables of a multiplet of $n$ Klein–Gordon fields in any Fock representation, acted on by $U(n)$ as a global gauge group, fulfill this property (this result is proved in [15] and it will be described in a forthcoming article).

The paper is organized as follows: in Sec. 2 we introduce the categories $\mathcal{B}(\mathcal{A})$ and $\mathcal{B}(\mathcal{A}^\perp)$ and show that they are isomorphic; Sec. 3 is entirely devoted to the construction of the category $\mathcal{B}(\mathcal{A})_{\mathrm{fh}}$; in Sec. 4 we study the conjugation and derive the results stated above; finally, Sec. 5 concludes the work. In Appendix A some definitions and results on tensor C*-categories are presented.
2. The Net and the Presheaf Approach to the Theory, and Their Equivalence

In this section we introduce the categories $\mathcal{B}(\mathcal{A})$ and $\mathcal{B}(\mathcal{A}^\perp)$, which are respectively the categories of localized transportable bimodules of the net and of the presheaf. We show how these categories are related to SC and that they are isomorphic.^b We conclude by introducing the notions of faithfulness and of double faithfulness for the objects of $\mathcal{B}(\mathcal{A})$.

^b The relevance of sheaves of von Neumann algebras in the theory of superselection sectors was pointed out for the first time by J. E. Roberts [13], who showed a correspondence between sectors and some Hermitian bimodules over a sheaf of von Neumann algebras on Minkowski space.

2.1. The category $\mathcal{B}(\mathcal{A})$

We show that there is, up to unitary equivalence, a bijective correspondence between SC and the set $\Delta_t$ of the localized transportable morphisms of the net. After the observation that the elements of $\Delta_t$ are bimodules of $\mathcal{A}$, the category $\mathcal{B}(\mathcal{A})$ is defined as the category whose set of objects is $\Delta_t$ and whose arrows are the intertwiners of the elements of $\Delta_t$.

A morphism $\rho : \mathcal{A} \to B(\mathcal{H}_o) \otimes \mathbb{M}_{n_\rho}$ is said to have multiplicity $n_\rho$ and to be localized in $o$ if, for any $a \in \mathcal{K}$ with $a \perp o$,

\[ \rho(A) = A \otimes 1_{n_\rho} \cdot \rho(1) \qquad \forall A \in \mathcal{A}(a). \]

We denote by $\Delta$ the set of localized morphisms and by $\Delta(o)$ the subset of those morphisms which are localized within $o$. Given $\rho, \sigma \in \Delta$, the set $(\rho, \sigma)$ of the intertwiners between $\rho$ and $\sigma$ is the set of the operators $T \in B(\mathcal{H}_o \otimes \mathbb{C}^{n_\rho}, \mathcal{H}_o \otimes \mathbb{C}^{n_\sigma})$ such that

\[ T \rho(1) = T = \sigma(1) T \quad \text{and} \quad T \rho(A) = \sigma(A) T \qquad \forall A \in \mathcal{A}. \]

A localized morphism $\rho$ is said to be transportable if for each $o$ there exists $\tau \in \Delta(o)$ and a unitary $U \in (\rho, \tau)$. We denote by $\Delta_t$ the set of localized transportable morphisms and by $\Delta_t(o)$ the subset of those morphisms which are localized in $o$. The following lemma is an easy consequence of Haag duality and of the localization property of the elements of $\Delta_t$.

Lemma 2.1. Let $\rho \in \Delta_t(o)$. Then the following assertions hold:
(a) for each $a \in \mathcal{K}$, $o \subseteq a$, we have $\rho(\mathcal{A}(a)) \subseteq \mathcal{A}(a) \otimes \mathbb{M}_{n_\rho}$;
(b) if $\sigma \in \Delta_t(o_1)$ and $T \in (\rho, \sigma)$, then $T$ has values $T_{i,j}$ in $\mathcal{A}(a)$ for all $a \in \mathcal{K}$, $o \cup o_1 \subseteq a$.

Studying SC is equivalent to studying $\Delta_t$ because there exists, up to unitary equivalence, a bijective correspondence between the representations satisfying SC and the morphisms in $\Delta_t$. In fact, for each $\pi \in$ SC there is a corresponding set of localized transportable morphisms defined as follows:

\[ \rho_a(A) \equiv V_a^{+} \pi_o(A) V_a, \qquad A \in \mathcal{A},\ a \in \mathcal{K}, \]

where $\{V_a\}_{a \in \mathcal{K}}$ is a set of isometries associated with $\pi$ by (1). Conversely, given $\rho \in \Delta_t$, then

\[ \pi(A) \equiv \rho(\pi_o^{-1}(A)), \qquad A \in \mathcal{R}, \]

is a representation belonging to SC.
In order to associate a category with $\Delta_t$, and hence with SC, we need to introduce the tensor C*-category $\mathcal{B}$ of bimodules of the C*-algebra $\mathcal{A}$ [6]. The objects are bimodules of $\mathcal{A}$, namely the morphisms $\rho : \mathcal{A} \to \mathcal{A} \otimes \mathbb{M}_{n_\rho}$ with multiplicity $n_\rho \in \mathbb{N}$. The arrows between two objects $\rho, \tau$ are the intertwiners $T \in (\rho, \tau)$ with values in $\mathcal{A}$, i.e. $T_{i,j} \in \mathcal{A}$. The composition law between the arrows is the usual rows times columns product, and it is denoted by "$\cdot$". The identity arrow $1_\rho$ of an object $\rho$ is the projection $\rho(1)$. The adjoint "$+$" is defined as $\rho^{+} \equiv \rho$ on the objects, and $(T^{+})_{i,j} \equiv (T_{j,i})^{*}$ for each $T \in (\rho, \tau)$, where $*$ denotes the involution of $\mathcal{A}$. The tensor product "$\times$" is defined by using the lexicographical ordering. Namely, $\times$ is defined on the objects as

\[ \rho\sigma(\,\cdot\,)_{i,j} \equiv \rho\bigl(\sigma(\,\cdot\,)_{i_2,j_2}\bigr)_{i_1,j_1}, \qquad i = i_1 + n_\rho i_2, \quad j = j_1 + n_\rho j_2 \]

(observe that $\rho\sigma$ has multiplicity $n_\rho n_\sigma$), and on the arrows, with summation over the repeated index $k$, as

\[ (T \times S)_{i,j} \equiv T_{i_1,k}\, \rho_1(S_{i_2,j_2})_{k,j_1}, \qquad i = i_1 + n_{\rho_2} i_2, \quad j = j_1 + n_{\rho_1} j_2, \]

for each $T \in (\rho_1, \rho_2)$, $S \in (\sigma_1, \sigma_2)$. The identity object $\iota$ of the tensor product is the morphism $\iota(A) \equiv A$ for each $A \in \mathcal{A}$. Since $\mathcal{A}$ is irreducible, $\iota$ is irreducible. Finally, one can easily check that $\mathcal{B}$ is closed under direct sums and subobjects.

Now, returning to the problem of stating what category is associated with $\Delta_t$, we notice that $\Delta_t$ is a subset of the objects of $\mathcal{B}$ because of Lemma 2.1(a), and that by Lemma 2.1(b) the set of the intertwiners between $\rho, \sigma \in \Delta_t$ is equal to the set of the arrows between $\rho$ and $\sigma$ as objects of $\mathcal{B}$. The category $\mathcal{B}(\mathcal{A})$ of the localized transportable bimodules of $\mathcal{A}$ is the full subcategory of $\mathcal{B}$ whose objects belong to $\Delta_t$. $\mathcal{B}(\mathcal{A})$ is closed under tensor products, direct sums and subobjects, and the identity object $\iota$ is irreducible. In conclusion, $\mathcal{B}(\mathcal{A})$ is the category associated with $\Delta_t$ that we were looking for. The superselection sectors of the theory are the unitary equivalence classes of the irreducible objects of $\mathcal{B}(\mathcal{A})$.

2.2. The category $\mathcal{B}(\mathcal{A}^\perp)$

The presheaf $\mathcal{A}^\perp$ associated with the net $\mathcal{A}$ is defined as the correspondence $\mathcal{A}^\perp : \mathcal{K} \ni a \to \mathcal{A}(a)'$, where for $a \subseteq b$ the restriction $\mathcal{A}(b)' \to \mathcal{A}(a)'$ is given by the inclusion $\mathcal{A}(b)' \subseteq \mathcal{A}(a)'$. A morphism $\hat\rho$ of $\mathcal{A}^\perp$ is a collection ${}_a\rho : \mathcal{A}(a)' \to B(\mathcal{H}_o) \otimes \mathbb{M}_{n_\rho}$, $a \in \mathcal{K}$, of morphisms with a fixed multiplicity $n_\rho$, fulfilling the relations:
(a) ${}_a\rho(1) = \rho(1)$ for all $a \in \mathcal{K}$;
(b) if $a \subseteq b$ then ${}_a\rho \upharpoonright \mathcal{A}(b)' = {}_b\rho$ (compatibility).

In a similar way as has been done for the net, the notion of localized transportable morphism of the presheaf can be introduced. A morphism $\hat\rho$ of $\mathcal{A}^\perp$ is said to be localized in $o$ if

\[ {}_o\rho(A) = A \otimes 1_{n_\rho} \cdot \rho(1) \qquad \forall A \in \mathcal{A}(o)'. \]
We denote by $\Delta^\perp$ the set of localized morphisms and by $\Delta^\perp(o)$ the subset of those morphisms localized within $o$. Given $\hat\rho, \hat\sigma \in \Delta^\perp$, the set $(\hat\rho, \hat\sigma)$ of the intertwiners between $\hat\rho$ and $\hat\sigma$ is the set of the operators $T \in B(\mathcal{H}_o \otimes \mathbb{C}^{n_\rho}, \mathcal{H}_o \otimes \mathbb{C}^{n_\sigma})$ such that

\[ T \rho(1) = T = \sigma(1) T \quad \text{and} \quad T\, {}_a\rho(A) = {}_a\sigma(A)\, T \qquad \forall A \in \mathcal{A}(a)',\ \forall a \in \mathcal{K}. \]

A localized morphism $\hat\rho$ is said to be transportable if for each $a \in \mathcal{K}$ there exists $\hat\sigma \in \Delta^\perp(a)$ and a unitary $U \in (\hat\rho, \hat\sigma)$. By $\Delta^\perp_t$ we denote the set of localized transportable morphisms and by $\Delta^\perp_t(o)$ the subset of transportable morphisms which are localized within $o$. Finally, we call the category of localized transportable bimodules of $\mathcal{A}^\perp$, and denote it by $\mathcal{B}(\mathcal{A}^\perp)$, the category whose set of objects is $\Delta^\perp_t$, and whose set of arrows between $\hat\rho, \hat\sigma \in \Delta^\perp_t$ is $(\hat\rho, \hat\sigma)$. Clearly, $\mathcal{B}(\mathcal{A}^\perp)$ is a C*-category closed under direct sums and subobjects.

Proposition 2.2. Let $\hat\rho \in \Delta^\perp_t$ be localized in $o$.
(a) For each $a, b, c \in \mathcal{K}$, $c \perp a, b$, we have ${}_a\rho \upharpoonright \mathcal{A}(c) = {}_b\rho \upharpoonright \mathcal{A}(c)$.
(b) If $c \in \mathcal{K}$, $c \perp o$, then ${}_a\rho(A) = A \otimes 1_{n_\rho} \cdot \rho(1)$ for all $A \in \mathcal{A}(c)$ and all $a \in \mathcal{K}$, $a \perp c$.
(c) ${}_a\rho(\mathcal{A}(a)') \subseteq \mathcal{A}(a)' \otimes \mathbb{M}_{n_\rho}$ for each $a \in \mathcal{K}$, $o \perp a$.
(d) Given $\hat\sigma \in \Delta^\perp_t(b)$ and $T \in (\rho, \sigma)$, then $T_{i,j} \in \mathcal{A}(a)$ for each $a \in \mathcal{K}$, $o \cup b \subseteq a$.

Proof. (a) Since the spacelike complement of a double cone is pathwise connected in $\mathbb{M}^4$, there is a path $p$, contained in $c^\perp$, joining $a$ to $b$. As $\mathcal{A}(c)$ is contained in the commutant of the algebras associated with each double cone of the path, the proof follows from the compatibility of the morphisms. (b) Notice that $c \perp o, a$. Then, by (a), we have ${}_a\rho(A) = {}_o\rho(A)$ for each $A \in \mathcal{A}(c)$. Since $\hat\rho$ is localized in $o$, the proof follows from the fact that $\mathcal{A}(c) \subset \mathcal{A}(o)'$. (c) is postponed to the next section. (d) follows from (b).

Some comments are in order. First, the proposition does not hold in a 2-dimensional Minkowski spacetime because the spacelike complement of a double cone is not pathwise connected. Secondly, the statement (a) does not depend on the double cone where the object is localized. Thirdly, notice that, once $o \in \mathcal{K}$ is fixed, the correspondence

\[ \{a \in \mathcal{K} \mid a \perp o\} \ni a \to \mathcal{A}(a)' \tag{2} \]

is a presheaf of von Neumann algebras and, if $\hat\rho \in \Delta^\perp_t$ is localized in $o$, then also the correspondence

\[ \{a \in \mathcal{K} \mid a \perp o\} \ni a \to (\mathcal{A}(a)' \otimes \mathbb{M}_{n_\rho})_{\rho(1)} \tag{3} \]

is a presheaf of von Neumann algebras, because $\rho(1) \in \mathcal{A}(a)' \otimes \mathbb{M}_{n_\rho}$ for each $a \in \mathcal{K}$, $a \perp o$; here $(\mathcal{A}(a)' \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$ denotes the reduced algebra $\rho(1)(\mathcal{A}(a)' \otimes \mathbb{M}_{n_\rho})\rho(1)$. Then, as a consequence of Proposition 2.2(c), the collection $\{{}_a\rho \mid a \in \mathcal{K},\ a \perp o\}$ is a presheaf morphism from (2) to (3). In the following we will refer to the presheaves (2) and (3) as, respectively, the domain and the codomain of $\hat\rho$ as an element of the set $\Delta^\perp_t(o)$.
2.3. The isomorphism between $\mathcal{B}(\mathcal{A}^\perp)$ and $\mathcal{B}(\mathcal{A})$

The relation between $\mathcal{B}(\mathcal{A}^\perp)$ and $\mathcal{B}(\mathcal{A})$ is deeper than the one suggested by their definitions: in fact they are isomorphic. The key point of the proof consists in proving that each element of $\Delta_t$ admits an extension to a morphism of the presheaf. In order to prove this we need to introduce the cohomological description of the theory of superselection sectors developed by J. E. Roberts [12] (see also [14]). By using the same reasoning as in that paper, it is possible to introduce the category of 1-cocycles of the net and show that it is equivalent to $\mathcal{B}(\mathcal{A})$. However, we limit ourselves to describing the way the set $Z^1_t(\mathcal{A})$ of 1-cocycles of the net and $\Delta_t$ are related.

Having fixed a double cone $o$ and $\rho$ localized in $o$, for each $a \in \mathcal{K}$ let us choose a set of unitary arrows $V_{ao} \in (\rho, \tau_a)$, where $\tau_a$ is localized in $a$ and $\rho = \tau_o$. Defining

\[ z_{ab} \equiv V_{ao} \cdot V_{bo}^{+}, \qquad a, b \in \mathcal{K}, \tag{4} \]

and observing that $z_{ab} \in (\tau_b, \tau_a)$, we have the 1-cocycle identity
(a) $z_{ab} \cdot z_{bc} = z_{ac}$, for $a, b, c \in \mathcal{K}$,
and that
(b) $z_{aa} = \tau_a(1)$, for $a \in \mathcal{K}$;
(c) $z_{ab}$ has values in $\mathcal{A}(d)$, for $a, b, d \in \mathcal{K}$, $a \cup b \subseteq d$.
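Properties (a) and (b), and the intertwining relation $z_{ab} \in (\tau_b, \tau_a)$, can be checked directly from (4). The following worked step is an editorial sketch; it assumes that "unitary arrow" $V_{ao} \in (\rho, \tau_a)$ means $V_{ao}^{+}V_{ao} = 1_\rho = \rho(1)$ and $V_{ao}V_{ao}^{+} = 1_{\tau_a} = \tau_a(1)$, in line with the convention that the identity arrow of an object is the projection $\rho(1)$.

```latex
\[
\begin{gathered}
z_{ab}\cdot z_{bc}
  = V_{ao}\,V_{bo}^{+}V_{bo}\,V_{co}^{+}
  = V_{ao}\,\rho(1)\,V_{co}^{+}
  = V_{ao}\,V_{co}^{+}
  = z_{ac}, \\[4pt]
z_{aa} = V_{ao}\,V_{ao}^{+} = \tau_a(1), \\[4pt]
z_{ab}\,\tau_b(A)
  = V_{ao}\,V_{bo}^{+}\,\tau_b(A)
  = V_{ao}\,\rho(A)\,V_{bo}^{+}
  = \tau_a(A)\,V_{ao}\,V_{bo}^{+}
  = \tau_a(A)\,z_{ab},
  \qquad A\in\mathcal{A}.
\end{gathered}
\]
```

The first line uses $V_{ao}\,\rho(1) = V_{ao}$, which is part of the definition of an intertwiner in $(\rho, \tau_a)$; the third uses $V_{bo}^{+} \in (\tau_b, \rho)$.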
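A collection of partial isometries $\{\bar z_{ab}\}_{a,b \in \mathcal{K}}$ satisfying (a)–(c) is called a 1-cocycle of $\mathcal{A}$. A different choice of the set $\{V_{ao}\}_{a \in \mathcal{K}}$ yields a cohomologous cocycle. Conversely, we can associate with each $\bar z \in Z^1_t(\mathcal{A})$ an element of $\Delta_t$. Note that for each $a, b \in \mathcal{K}$, $a \perp b$, we have

\[ \rho(A) = z_{ob} \cdot \tau_b(A) \cdot z_{bo} = z_{ob} \cdot A \otimes 1_{n_b} \cdot z_{bo}, \qquad A \in \mathcal{A}(a). \tag{5} \]

Then by replacing $z$ with $\bar z$ in the right-hand side of (5) one gets a transportable morphism localized in $o$.

Theorem 2.3. The categories $\mathcal{B}(\mathcal{A}^\perp)$ and $\mathcal{B}(\mathcal{A})$ are isomorphic.

Proof. First we define the extension functor $E : \mathcal{B}(\mathcal{A}) \to \mathcal{B}(\mathcal{A}^\perp)$. Let $\rho \in \Delta_t$ be localized in $o$, and let $z$ be the 1-cocycle defined by (4). We set

\[ {}_aE(\rho)(A) \equiv z_{oa} \cdot A \otimes 1_{n_a} \cdot z_{ao}, \qquad A \in \mathcal{A}(a)', \]
\[ E(T) \equiv T, \qquad T \in (\rho, \sigma). \]

For each $a \in \mathcal{K}$, ${}_aE(\rho)$ is a normal morphism of $\mathcal{A}(a)'$ and ${}_aE(\rho)(1) = \rho(1)$. If $a \subseteq b$, then for each $A \in \mathcal{A}(b)'$ we have

\[ {}_aE(\rho)(A) = z_{oa} \cdot A \otimes 1_{n_a} \cdot z_{ao} = z_{ob} z_{ba} \cdot A \otimes 1_{n_a} \cdot z_{ab} z_{bo} = {}_bE(\rho)(A), \]

because the coefficients of $z_{ba}$ belong to $\mathcal{A}(b)$. Moreover, ${}_oE(\rho)(B) = B \otimes 1_{n_o} \cdot \rho(1)$ for each $B \in \mathcal{A}(o)'$ because $z_{oo} = \rho(1)$. Hence $E(\rho)$ is a morphism of the presheaf and it is localized in $o$. It is worth observing that, by (5), we have

\[ {}_aE(\rho)(A) = z_{oa} \cdot A \otimes 1_{n_a} \cdot z_{ao} = \rho(A), \qquad A \in \mathcal{A}(a^\perp). \]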
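Thus, ${}_aE(\rho)$ is a normal extension of $\rho$ to $\mathcal{A}(a)'$ and it is unique because, by Haag duality, $\mathcal{A}(a^\perp)$ is weakly dense in $\mathcal{A}(a)'$. After this observation it is easy to see that $T \in (E(\rho), E(\sigma))$ and consequently that $E(\rho)$ is transportable.

We now pass to define the restriction functor $R : \mathcal{B}(\mathcal{A}^\perp) \to \mathcal{B}(\mathcal{A})$. Let $\hat\rho \in \Delta^\perp_t$. Given $a \in \mathcal{K}$, we take $b \in \mathcal{K}$, $a \subset b^\perp$, and define

\[ R(\hat\rho)(A) = {}_b\rho(A), \qquad A \in \mathcal{A}(a), \]
\[ R(S) \equiv S, \qquad S \in (\hat\rho, \hat\sigma). \]

$R(\hat\rho)$ is a morphism of $\mathcal{A}(a)$ and $R(\hat\rho)(1) = \rho(1)$. By Proposition 2.2(a), $R(\hat\rho)$ does not depend on the choice of $b \perp a$ and, for this reason, it is compatible with the net $\mathcal{A}$. Hence it is extendible by continuity to a morphism of $\mathcal{A}$. If $\hat\rho$ is localized in $o$ then, by Proposition 2.2(b), $R(\hat\rho)$ is also localized in $o$. The proofs that $R(\hat\rho)$ is transportable and that $S$ belongs to $(R(\hat\rho), R(\hat\sigma))$ are straightforward and, therefore, we omit them. Finally, observing that $R(\hat\rho)$ is the restriction of the components of $\hat\rho$ to the algebras of double cones, it easily follows that $R \circ E = \mathrm{id}_{\mathcal{B}(\mathcal{A})}$ and $E \circ R = \mathrm{id}_{\mathcal{B}(\mathcal{A}^\perp)}$.

Concerning the functors $E$ and $R$ introduced in the previous theorem, from here on we will use the following notation: we will denote by $\hat\rho$ the extension $E(\rho)$ of $\rho \in \Delta_t$; conversely, we will denote by $\sigma$ the restriction $R(\hat\sigma)$ of $\hat\sigma \in \Delta^\perp_t$.

As a first consequence of Theorem 2.3, we prove Proposition 2.2(c). Given $\hat\rho \in \Delta^\perp_t(o)$, let $z$ be the 1-cocycle, defined by (4), associated with $\rho$. Let $a \in \mathcal{K}$, $a \perp o$. Then for each $A \in \mathcal{A}(a)$ and $B \in \mathcal{A}(a)'$ we have

\[ \begin{gathered} {}_a\rho(B) \cdot A \otimes 1_{n_o} = z_{oa} \cdot B \otimes 1_{n_a} \cdot z_{ao} \cdot A \otimes 1_{n_o} = z_{oa} \cdot B \otimes 1_{n_a} \cdot \tau_a(A) \cdot z_{ao} \\ = z_{oa} \cdot \tau_a(A) \cdot B \otimes 1_{n_a} \cdot z_{ao} = A \otimes 1_{n_o} \cdot {}_a\rho(B), \end{gathered} \]

where the inclusion $\tau_a(\mathcal{A}(a)) \subseteq \mathcal{A}(a) \otimes \mathbb{M}_{n_a}$ has been used. This completes the proof.

Secondly, a tensor product $\hat\otimes$ can be easily introduced on $\mathcal{B}(\mathcal{A}^\perp)$:

\[ \Delta^\perp_t \ni \hat\rho, \hat\sigma \to \hat\rho \mathbin{\hat\otimes} \hat\sigma \in \Delta^\perp_t, \qquad \text{where } \hat\rho \mathbin{\hat\otimes} \hat\sigma \equiv \widehat{\rho\sigma}, \]
\[ (\hat\rho_1, \hat\rho_2), (\hat\sigma_1, \hat\sigma_2) \ni T, S \to T \mathbin{\hat\otimes} S \in (\hat\rho_1 \mathbin{\hat\otimes} \hat\sigma_1, \hat\rho_2 \mathbin{\hat\otimes} \hat\sigma_2), \qquad \text{where } T \mathbin{\hat\otimes} S \equiv T \times S. \]

A useful property of $\hat\otimes$ is shown in the following:

Proposition 2.4. Let $\rho \in \Delta_t(a)$, $\sigma \in \Delta_t(b)$. If $c \in \mathcal{K}$, $a \cup b \perp c$, then

\[ {}_c(\rho \mathbin{\hat\otimes} \sigma)(A) = {}_c\rho({}_c\sigma(A)), \qquad A \in \mathcal{A}(c)'. \]

Proof. Without loss of generality we prove the statement only in the case of objects with multiplicity equal to one (namely endomorphisms, in general not unital, of the algebras $\mathcal{A}(a)'$). Let $z, \bar z$ be two 1-cocycles associated with $\rho$ and $\sigma$ respectively. First we observe that ${}_c(\rho \mathbin{\hat\otimes} \sigma)(A) = {}_c(\rho\sigma)(A) = z_{oc} \times \bar z_{bc} \cdot A \cdot z_{co} \times \bar z_{cb}$. Taking $d, e, h \in \mathcal{K}$ such that $d \perp e$, $d \cup e \subset c$, and $b \cup e \subseteq h$, $h \perp d$, then for each $A \in \mathcal{A}(c)'$ we have

\[ \begin{gathered} {}_c(\rho \mathbin{\hat\otimes} \sigma)(A) = z_{od} z_{dc} \times \bar z_{be} \bar z_{ec} \cdot A \cdot z_{cd} z_{do} \times \bar z_{ce} \bar z_{eb} \\ = z_{od} \times \bar z_{be} \cdot z_{dc} \times \bar z_{ec} \cdot A \cdot z_{cd} \times \bar z_{ce} \cdot z_{do} \times \bar z_{eb} \\ = z_{od} \times \bar z_{be} \cdot A \cdot z_{do} \times \bar z_{eb}, \end{gathered} \]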
where the last equality holds because $z_{dc} \times \bar z_{ec} \in \mathcal{A}(c)'$. Observing that $z_{od} \times \bar z_{be} = z_{od} \cdot \tau_d(\bar z_{be}) = z_{od} \cdot \bar z_{be}$, because $\bar z_{be} \in \mathcal{A}(h)$, and that, by Proposition 2.2(c), ${}_c\sigma(\mathcal{A}(c)') \subseteq \mathcal{A}(c)'$, it follows that

\[ z_{od} \times \bar z_{be} \cdot A \cdot z_{do} \times \bar z_{eb} = z_{od} \cdot (\bar z_{be} \cdot A \cdot \bar z_{eb}) \cdot z_{do} = {}_d\rho({}_e\sigma(A)) = {}_c\rho({}_c\sigma(A)), \]

where we used the fact that $\tau_d$ is localized in $d$. This completes the proof.

2.4. Faithfulness and double faithfulness

Our aim is to identify the relevant subcategory of $\mathcal{B}(\mathcal{A})$. A first step toward the understanding of this problem is made in this section.

Definition 2.5. We say that an object $\rho$ of $\mathcal{B}(\mathcal{A})$ is:
(i) faithful if it is a faithful morphism of $\mathcal{A}$;
(ii) doubly faithful if its extension $\hat\rho$ is a faithful morphism of the presheaf, namely, for each $a \in \mathcal{K}$, ${}_a\rho$ is a faithful morphism of the algebra $\mathcal{A}(a)'$.

Since an object $\rho$ of $\mathcal{B}(\mathcal{A})$ is the restriction to the local algebras of its extension $\hat\rho$, double faithfulness implies faithfulness. The converse is, in general, false, as can be easily seen from the following proposition.

Proposition 2.6. Let $\rho$ be an object of $\mathcal{B}(\mathcal{A})$ and let us denote by $[\rho]$ the equivalence class of $\rho$. The following assertions hold:
(a) $\rho$ is faithful if, and only if, for each $o, a \in \mathcal{K}$, $o \perp a$, and for each $\tau \in [\rho]$ localized in $a$, the central support of $\tau(1)$ in $\mathcal{A}(o)' \otimes \mathbb{M}_{n_\sigma}$ is equal to $1 \otimes 1_{n_\sigma}$;
(b) $\rho$ is doubly faithful if, and only if, for each $o \in \mathcal{K}$ and for each $\sigma \in [\rho]$ localized in $o$, the central support of $\sigma(1)$ in $\mathcal{A}(o) \otimes \mathbb{M}_{n_\sigma}$ is equal to $1 \otimes 1_{n_\sigma}$.

Proof. (b) Since the extensions $\hat\rho$ and $\hat\sigma$ of $\rho$ and $\sigma$ are equivalent and $\sigma$ is localized in $o$, for $A \in \mathcal{A}(o)'$ we have ${}_o\rho(A) = 0 \Leftrightarrow {}_o\sigma(A) = \sigma(1) \cdot A \otimes 1_{n_\sigma} = 0$, and the proof follows from the definition of central support. The assertion (a) follows in a similar way.

Without any further assumptions on the structure of local algebras, and in particular on their centers, we have no way to conclude that these two properties are fulfilled. Notice that properties like the Schlieder property^c or the simplicity of $\mathcal{A}$, which are weaker than the Borchers property and imply faithfulness, cannot be deduced from the hypotheses we have made on the local algebras. Thus, we have to accept the possible existence both of nonfaithful objects and of not doubly faithful objects. Since double faithfulness will turn out to be necessary for the existence of conjugates, in the following, not doubly faithful objects will have to be excluded from the analysis.

^c The net $\mathcal{A}$ has the Schlieder property if, given $a, b \in \mathcal{K}$, $a \perp b$, and $A \in \mathcal{A}(a)$, $B \in \mathcal{A}(b)$, then $A \cdot B = 0 \Leftrightarrow A = 0$ or $B = 0$.
Direct sums, tensor products and equivalence preserve (double) faithfulness:

Proposition 2.7. The following assertions hold:
(a) if $\rho \in \Delta_t$ is doubly faithful, then each $\sigma \in [\rho]$ is doubly faithful;
(b) if $\rho_1, \rho_2 \in \Delta_t$ are doubly faithful, then $\rho_1 \oplus \rho_2$ and $\rho_1\rho_2$ are doubly faithful.
The same assertions hold for faithfulness.

Proof. (a) follows from the fact that if $\sigma \in [\rho]$ then $\hat\sigma \in [\hat\rho]$. The proof of statement (b), concerning the direct sum, is obvious. Given $a \in \mathcal{K}$, let $b_1, b_2 \in \mathcal{K}$, $b_1, b_2 \perp a$, and let $V_i \in (\rho_i, \sigma_i)$ be unitary such that $\sigma_i$ is localized in $b_i$ for $i = 1, 2$. For $A \in \mathcal{A}(a)'$, by Proposition 2.4, we have

\[ {}_a(\rho_1\rho_2)(A) = V_1^{+} \times V_2^{+} \cdot {}_a(\sigma_1\sigma_2)(A) \cdot V_1 \times V_2 = V_1^{+} \times V_2^{+} \cdot {}_a\sigma_1({}_a\sigma_2(A)) \cdot V_1 \times V_2, \]

and the proof is now completed.

3. Statistics and Selection of the Relevant Subcategory

This section is entirely devoted to showing how the relevant subcategory of the theory can be selected. We will find it convenient to work in the net approach, that is to say with $\mathcal{B}(\mathcal{A})$; nevertheless, to introduce the notion of a homogeneous object of $\mathcal{B}(\mathcal{A})$, we will have to rely on the presheaf approach. Homogeneity will turn out to be one of the properties characterizing the objects of the relevant subcategory. We start by proving the existence of a symmetry $\varepsilon$. Afterwards, we introduce the notions of net-left inverse, presheaf-left inverse and homogeneity. We prove that each doubly faithful simple object is homogeneous. After having introduced the category of objects with finite statistics, we conclude by showing how the relevant subcategory can be selected using doubly faithful simple objects.

3.1. Symmetry

When a tensor C*-category has a symmetry $\varepsilon$ (see the definition in the appendix) it is possible to introduce, as in the DHR analysis, a notion of statistics of sectors. Briefly, one first notes that for each object $\rho$, by means of $\varepsilon$, there is an associated unitary representation $\varepsilon^n_\rho$ of the permutation group of $n$ elements $\mathbb{P}(n)$, with values in $(\rho^n, \rho^n)$. If $\rho$ is irreducible, the statistics of the sector $[\rho]$ is the collection of the unitary equivalence classes of the representations $\varepsilon^n_\rho$ as $n$ varies. In this section we show that $\mathcal{B}(\mathcal{A})$ has a symmetry $\varepsilon$ and, therefore, an associated notion of statistics of sectors.

We start by recalling a result from [6]. Let $(n, m)$ denote the set of $m \times n$ matrices $A$ with values $A_{i,j}$ in $\mathcal{A}$. Then, there exists a map $\theta : n, m \to \theta(n, m) \in (nm, mn)$, with values $\theta(n, m)_{i,j}$ in $\mathbb{C}$, that verifies

\[ \theta(m, m_1) \cdot A \otimes B = B \otimes A \cdot \theta(n, n_1) \tag{6} \]

for each pair $A \in (n, m)$, $B \in (n_1, m_1)$ with commuting values, that is to say $[A_{i,j}, B_{l,k}] = 0$, where $(A \otimes B)_{i,j} \equiv A_{i_1,j_1} B_{i_2,j_2}$ is the lexicographical order product.
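Relation (6) can be illustrated numerically in the simplest situation, where all matrix entries are plain numbers and therefore commute. The following is a minimal sketch, not the construction of [6]: the helper names `kron`, `matmul`, `flip` and `relation_6_holds` are hypothetical, and the composite index follows the usual Kronecker convention (last index fastest) rather than the convention $i = i_1 + n\,i_2$ used in the text, which only relabels rows and columns.

```python
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Rows-times-columns product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def flip(p, q):
    """Permutation matrix sending x (x) y to y (x) x, for x in C^p and y in C^q.

    Plays the role of theta(p, q) for scalar entries (an assumption of this sketch).
    """
    theta = [[0] * (p * q) for _ in range(p * q)]
    for i, j in product(range(p), range(q)):
        # kron(x, y) has x[i]*y[j] at position i*q + j,
        # kron(y, x) has y[j]*x[i] at position j*p + i.
        theta[j * p + i][i * q + j] = 1
    return theta

def relation_6_holds(A, B):
    """Check theta(m, m1) . (A (x) B) == (B (x) A) . theta(n, n1) for scalar matrices."""
    m, n = len(A), len(A[0])      # A is an m x n matrix, i.e. an element of (n, m)
    m1, n1 = len(B), len(B[0])    # B is an m1 x n1 matrix, i.e. an element of (n1, m1)
    return matmul(flip(m, m1), kron(A, B)) == matmul(kron(B, A), flip(n, n1))

if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
    B = [[7, 8, 9], [1, 0, 2]]     # 2 x 3
    print(relation_6_holds(A, B))
```

Integer entries are used so that the comparison is exact. For operator-valued entries the same index bookkeeping applies, but the commutation hypothesis $[A_{i,j}, B_{l,k}] = 0$ is then a genuine requirement.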
The proof of the existence of a symmetry $\varepsilon$ is based on the following:

Lemma 3.1. Let $\rho_i \in \Delta_t(o_i)$ and $\sigma_i \in \Delta_t(a_i)$ for $i = 1, 2$, such that $o_1 \perp a_1$, $o_2 \perp a_2$. For $T \in (\rho_1, \rho_2)$, $S \in (\sigma_1, \sigma_2)$ we have

\[ \theta(n_2, m_2) \cdot T \times S = S \times T \cdot \theta(n_1, m_1), \]

where $n_1, n_2, m_1, m_2$ denote, respectively, the multiplicities of $\rho_1$, $\rho_2$, $\sigma_1$ and $\sigma_2$.

To prove the statement we need three preliminary lemmas.

Lemma 3.2. Let $\rho \in \Delta_t(o)$, $\sigma \in \Delta_t(a)$ such that $o \perp a$. If $b \in \mathcal{K}$, $o, a \perp b$, and $A \in \mathcal{A}(b)$, then we have $\theta(n, m) \cdot \rho\sigma(A) = \sigma\rho(A) \cdot \theta(n, m)$.

Proof. Notice that $\rho\sigma(A) = 1_\rho \otimes 1_\sigma \cdot A \otimes 1_{n_\sigma}$ and $\sigma\rho(A) = 1_\sigma \otimes 1_\rho \cdot A \otimes 1_{n_\rho}$ because of the localization of $\rho$ and $\sigma$. As $1_\rho$ and $1_\sigma$ have commuting values, the proof follows from (6).

Lemma 3.3. Let $\rho_1, \rho_2 \in \Delta_t(o)$, $\sigma_1, \sigma_2 \in \Delta_t(a)$ such that $o \perp a$, and let $T \in (\rho_1, \rho_2)$, $S \in (\sigma_1, \sigma_2)$. Then we have $\theta(n_2, m_2) \cdot T \otimes S = S \otimes T \cdot \theta(n_1, m_1)$.

Proof. $(T \times S)_{i,j} = T_{i_1,k}\, \rho_1(S_{i_2,j_2})_{k,j_1} = T_{i_1,j_1} S_{i_2,j_2} = (T \otimes S)_{i,j}$. Similarly $S \times T = S \otimes T$. Since the values of $T$ and $S$ commute, the proof follows from (6).

Lemma 3.4. If $\rho \in \Delta_t(o)$, $\sigma \in \Delta_t(a)$ such that $o \perp a$, then $\theta(n, m) \in (\rho\sigma, \sigma\rho)$.

Proof. For each $b \in \mathcal{K}$, $o \cup a \subseteq b$, let us take $a_1, o_1, c, d \in \mathcal{K}$ such that $o_1, a_1 \perp b$, $(o \cup o_1) \subset d$, $(a \cup a_1) \subset c$ and $d \perp c$. Moreover, let $U \in (\rho, \rho_1)$, $V \in (\sigma, \sigma_1)$ be unitaries with $\rho_1 \in \Delta_t(o_1)$, $\sigma_1 \in \Delta_t(a_1)$. For $A \in \mathcal{A}(b)$, by Lemma 3.2 and Lemma 3.3 we have

\[ \begin{gathered} \theta(n, m) \cdot \rho\sigma(A) = \theta(n, m) \cdot (U^{+} \times V^{+}) \cdot \rho_1\sigma_1(A) \cdot (U \times V) \\ = (V^{+} \times U^{+}) \cdot \theta(n_1, m_1) \cdot \rho_1\sigma_1(A) \cdot (U \times V) \\ = (V^{+} \times U^{+}) \cdot \sigma_1\rho_1(A) \cdot \theta(n_1, m_1) \cdot (U \times V) \\ = (V^{+} \times U^{+}) \cdot \sigma_1\rho_1(A) \cdot (V \times U) \cdot \theta(n, m) = \sigma\rho(A) \cdot \theta(n, m). \end{gathered} \]

Proof of Lemma 3.1. We use a standard deformation argument. Let $U \in (\sigma_3, \sigma_1)$ be unitary with $\sigma_3 \in \Delta_t(a_3)$ and $(a_3 \cup a_1) \perp o_1$. By Lemma 3.3, $\theta(n_1, m_1) \cdot 1_{\rho_1} \times U = U \times 1_{\rho_1} \cdot \theta(n_1, m_3)$. Setting $S_1 \equiv S \cdot U$, we have $\theta(n_2, m_2) \cdot T \times S_1 = S_1 \times T \cdot \theta(n_1, m_3)$ if, and only if, $\theta(n_2, m_2) \cdot T \times S = S \times T \cdot \theta(n_1, m_1)$. Similarly we can move the support^d of $\rho_1$ without changing the statement of the lemma. If the number of spatial dimensions is bigger than 1, by a finite number of displacements of the supports we can reduce the problem to the trivial situation of Lemma 3.3.

^d By support we mean the double cone where the object is localized.
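Theorem 3.5. The category $\mathcal{B}(\mathcal{A})$ has a unique symmetry $\varepsilon$ satisfying $\varepsilon(\rho, \sigma) = \theta(n, m)$ whenever $\rho$ and $\sigma$ are localized in mutually spacelike double cones.

Proof. Let us observe that if $\varepsilon$ is a symmetry satisfying the relation in the statement, given two unitaries $U \in (\rho, \rho_1)$, $V \in (\sigma, \sigma_1)$ such that $\rho_1 \in \Delta_t(o_1)$, $\sigma_1 \in \Delta_t(a_1)$, where $o_1 \perp a_1$, then we have $\theta(n_1, m_1) \cdot U \times V = \varepsilon(\rho_1, \sigma_1) \cdot U \times V = V \times U \cdot \varepsilon(\rho, \sigma)$. According to this observation we define

\[ \varepsilon(\rho, \sigma) \equiv V^{+} \times U^{+} \cdot \theta(n_1, m_1) \cdot U \times V. \tag{7} \]

The definition of $\varepsilon$ does not depend on the choice of $U$, $V$. Indeed, given two unitaries $U_1 \in (\rho, \rho_2)$, $V_1 \in (\sigma, \sigma_2)$ such that $\rho_2 \in \Delta_t(o_2)$, $\sigma_2 \in \Delta_t(a_2)$, where $o_2 \perp a_2$, then by Lemma 3.1 we have $\theta(n_2, m_2) \cdot (U_1 U^{+} \times V_1 V^{+}) = (V_1 V^{+} \times U_1 U^{+}) \cdot \theta(n_1, m_1)$. Hence $(V_1^{+} \times U_1^{+}) \cdot \theta(n_2, m_2) \cdot (U_1 \times V_1) = V^{+} \times U^{+} \cdot \theta(n_1, m_1) \cdot U \times V$.

We now prove that $\varepsilon$ is a symmetry for $\mathcal{B}(\mathcal{A})$. Let $S \in (\rho, \tau)$, $T \in (\sigma, \beta)$ and let $W \in (\tau, \tau_1)$, $R \in (\beta, \beta_1)$ be unitaries such that $\tau_1$ and $\beta_1$ are localized in mutually spacelike double cones. By Lemma 3.1 we have $\theta(l_1, k_1) \cdot (W S U^{+}) \times (R T V^{+}) = (R T V^{+}) \times (W S U^{+}) \cdot \theta(n_1, m_1)$; therefore, multiplying this identity on the left by $R^{+} \times W^{+}$ and on the right by $U \times V$, we obtain

\[ \varepsilon(\tau, \beta) \cdot S \times T = T \times S \cdot \varepsilon(\rho, \sigma). \]

Now, by using (7) we have

\[ \begin{gathered} \varepsilon(\rho, \sigma) \cdot \varepsilon(\sigma, \rho) = V^{+} \times U^{+} \cdot \theta(n_1, m_1) \cdot 1_{\rho_1} \times 1_{\sigma_1} \cdot \theta(m_1, n_1) \cdot V \times U \\ = V^{+} \times U^{+} \cdot \theta(n_1, m_1) \cdot \theta(m_1, n_1) \cdot V \times U = 1_{\sigma\rho}. \end{gathered} \]

Finally, let $X \in (\gamma, \gamma_1)$ be unitary, such that $\gamma_1 \in \Delta_t(b_1)$, where $b_1 \perp o_1$ and $(b_1 \cup o_1) \perp a_1$. Observing that $\theta(n_1, m_1) \times 1_{\gamma_1} \cdot 1_{\rho_1} \times \theta(l_1, m_1) = \theta(n_1 l_1, m_1)$, we have

\[ \begin{gathered} \varepsilon(\rho, \sigma) \times 1_\gamma \cdot 1_\rho \times \varepsilon(\gamma, \sigma) = (V^{+} \times U^{+} \times 1_\gamma) \cdot \theta(n_1, m_1) \times 1_\gamma \cdot (U \times V \times 1_\gamma) \cdot (1_\rho \times V^{+} \times X^{+}) \cdot 1_\rho \times \theta(l_1, m_1) \cdot (1_\rho \times X \times V) \\ = (V^{+} \times U^{+} \times X^{+}) \cdot \theta(n_1, m_1) \times 1_{\gamma_1} \cdot 1_{\rho_1} \times \theta(l_1, m_1) \cdot (U \times X \times V) \\ = (V^{+} \times U^{+} \times X^{+}) \cdot \theta(n_1 l_1, m_1) \cdot (U \times X \times V) = \varepsilon(\rho\gamma, \sigma). \end{gathered} \]

This completes the proof.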
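Two purely combinatorial facts about $\theta$ enter the proof above: $\theta(n_1, m_1) \cdot \theta(m_1, n_1) = 1$ and $\theta(n_1, m_1) \times 1 \cdot 1 \times \theta(l_1, m_1) = \theta(n_1 l_1, m_1)$. The sketch below checks their scalar analogues, with $\times$ replaced by the ordinary Kronecker product, under the assumption that $\theta(p, q)$ acts as the flip permutation matrix. The helper names are hypothetical and the index convention is the Kronecker one (last index fastest), which differs from the text only by a relabeling.

```python
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Rows-times-columns product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def identity(n):
    """n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def flip(p, q):
    """Permutation matrix sending x (x) y to y (x) x, for x in C^p and y in C^q."""
    theta = [[0] * (p * q) for _ in range(p * q)]
    for i, j in product(range(p), range(q)):
        theta[j * p + i][i * q + j] = 1
    return theta

def symmetry_checks(n, m, l):
    """Return (involutive, multiplicative) for the flip matrices.

    involutive:      flip(n, m) . flip(m, n) == identity
    multiplicative:  (flip(n, m) (x) 1_l) . (1_n (x) flip(l, m)) == flip(n*l, m)
    """
    involutive = matmul(flip(n, m), flip(m, n)) == identity(n * m)
    lhs = matmul(kron(flip(n, m), identity(l)), kron(identity(n), flip(l, m)))
    multiplicative = lhs == flip(n * l, m)
    return involutive, multiplicative

if __name__ == "__main__":
    print(symmetry_checks(2, 3, 2))
```

The second check moves a factor of dimension `m` past two factors of dimensions `n` and `l` one at a time, which is the scalar counterpart of the identity $\varepsilon(\rho, \sigma) \times 1_\gamma \cdot 1_\rho \times \varepsilon(\gamma, \sigma) = \varepsilon(\rho\gamma, \sigma)$.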
3.2. Net-left inverses, presheaf-left inverses and homogeneity

The net-left inverse of an object of $\mathcal{B}(\mathcal{A})$ is the obvious generalization of the concept of left inverse of unital endomorphisms of a C*-algebra to the case where $\rho$ is a morphism of the net.

Definition 3.6. A net-left inverse $\varphi$ of an object $\rho \in \Delta_t$ is a nonzero completely positive normalized linear map $\varphi : (\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)} \to \mathcal{A}$ fulfilling the relation

\[ \varphi(B \cdot \rho(A)) = \varphi(B) \cdot A \]

for each $A \in \mathcal{A}$, $B \in (\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$.

Clearly the faithfulness of the objects is necessary for the existence of net-left inverses. It is less obvious to see that it is also sufficient. The physical idea used in the DHR analysis to show the existence of left inverses for unital endomorphisms, which is based on a charge transfer chain to infinity, can be suitably adapted to our case. Unfortunately it does not work. The reason is clear: as the objects are nonunital morphisms, it is not possible to check whether the chain has a trivial limit or not. However, there is another idea that can be used to prove the existence of net-left inverses. Notice that a net-left inverse of $\rho$ is also a linear map extending $\rho^{-1}$ to the codomain algebra $(\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$ of $\rho$. Such an extension can be obtained by generalizing, to our case, an argument used for unital endomorphisms [8].

Proposition 3.7. Each faithful object has a net-left inverse.

Proof. Let $\rho \in \Delta_t$ be faithful and let $\Omega \in \mathcal{H}_o$. Then we can define a state $\omega$ as $\omega(A) \equiv (\Omega, \rho^{-1}(A)\Omega)$ for $A \in \rho(\mathcal{A})$. Since the inclusion $\rho(\mathcal{A}) \subseteq (\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$ preserves the identity, there is a state $\omega'$ of the algebra $(\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$ which extends $\omega$. Let $(\mathcal{H}', \pi', \Omega')$ be the GNS construction associated with $\omega'$ and let us define $V A \Omega \equiv \pi'(\rho(A))\Omega'$ for $A \in \mathcal{A}$. As $\mathcal{A}$ is irreducible and $\omega'$ is an extension of $\omega$, $V : \mathcal{H}_o \to \mathcal{H}'$ is an isometry fulfilling the relation $V A = \pi'(\rho(A)) V$ for $A \in \mathcal{A}$. Now, by setting $\varphi(A) \equiv V^{*} \pi'(A) V$ for $A \in (\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$, one easily checks that $\varphi$ is a net-left inverse of $\rho$.
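The final check in the proof above can be written out in two lines. This is an editorial sketch that uses only the relations already stated in the proof, namely that $V$ is an isometry with $V A = \pi'(\rho(A)) V$ for $A \in \mathcal{A}$, and that $\pi'$ is a representation of the reduced algebra, whose unit is $\rho(1)$.

```latex
\[
\begin{gathered}
\varphi\bigl(B\cdot\rho(A)\bigr)
  = V^{*}\pi'\bigl(B\,\rho(A)\bigr)V
  = V^{*}\pi'(B)\,\pi'\bigl(\rho(A)\bigr)V
  = V^{*}\pi'(B)\,V\,A
  = \varphi(B)\cdot A, \\[4pt]
\varphi\bigl(\rho(1)\bigr)
  = V^{*}\pi'\bigl(\rho(1)\bigr)V
  = V^{*}V
  = 1 .
\end{gathered}
\]
```

Complete positivity needs no separate computation: $\varphi = V^{*}\pi'(\,\cdot\,)V$ has the form of a compression of a representation by a bounded operator, and any map of that form is completely positive.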
A left inverse (see the definition in the appendix) is uniquely associated with a net-left inverse of an object $\rho$. To show this we will need to represent an element $E \in (\rho\sigma, \rho\tau)$ as an $n_\sigma \times n_\tau$ matrix with values $[E]_{i,j}$ in $(\mathcal{A} \otimes \mathbb{M}_{n_\rho})_{\rho(1)}$ (see the Appendix for more details).

Proposition 3.8. Let $\varphi$ be a net-left inverse of an object $\rho$. Then there exists a unique positive normalized left inverse $\Phi$ of $\rho$ verifying, for each $\sigma, \tau \in \Delta_t$, the relation

\[ \Phi_{\sigma,\tau}(E)_{i,j} = \varphi([E]_{i,j}), \qquad E \in (\rho\sigma, \rho\tau), \tag{8} \]

for $i = 1, \ldots, n_\tau$, $j = 1, \ldots, n_\sigma$.
February 9, 2004 19:51 WSPC/148-RMP
00187
Essential Properties of the Vacuum Sector for a Theory of Superselection Sectors
1269
Proof. The proof of the uniqueness follows once we have shown that the relation (8) defines a left inverse of ρ. So, let Φ be the set of the linear maps Φσ,τ for σ, τ ∈ ∆t defined by (8). Φ is obviously normalized, and it is positive because ϕ is completely positive. In the rest of the proof the relations (A.6)–(A.8) will be used. Let A ∈ A and E ∈ (ρσ, ρτ); then

(Φσ,τ(E) · σ(A))i,j = ϕ([E]i,k) · σ(A)k,j = ϕ([E]i,k · ρ(σ(A)k,j)) = ϕ([E · ρ(σ(A))]i,j) = ϕ([ρ(τ(A)) · E]i,j) = (τ(A) · Φσ,τ(E))i,j.

Hence Φσ,τ(E) ∈ (σ, τ). Given S ∈ (α, σ), T ∈ (β, τ), we have

Φα,β(1ρ × T⁺ · E · 1ρ × S)i,j = ϕ([1ρ × T⁺ · E · 1ρ × S]i,j) = ϕ(ρ(T⁺i,k) · [E]k,l · ρ(Sl,j)) = T⁺i,k · ϕ([E]k,l) · Sl,j = (T⁺ · Φσ,τ(E) · S)i,j.

In a similar way one can show that Φσπ,τπ(X × 1π) = Φσ,τ(X) × 1π for X ∈ (ρσ, ρτ). This completes the proof.

Summing up, faithfulness is a necessary and sufficient condition for the existence of net-left inverses. Moreover, for each net-left inverse there is an associated left inverse of the object. It is worth observing that this excludes neither the existence of nonfaithful objects with left inverses nor the existence of objects without left inverses. We now turn to the presheaf-left inverse.

Definition 3.9. A presheaf-left inverse of an object ρ in ∆t(o) is defined as a collection

ϕ̂ ≡ { aϕ : (A(a)′ ⊗ Mnρ)ρ(1) → A(a)′ | a ∈ K, a ⊥ o }

of nonzero completely positive normalized linear maps verifying the relations:
(i) aϕ ↾ (A(b)′ ⊗ Mnρ)ρ(1) = bϕ, ∀ b ∈ K, b ⊥ o, a ⊆ b;
(ii) aϕ(B · aρ(A)) = aϕ(B) · A, ∀ A ∈ A(a)′, B ∈ (A(a)′ ⊗ Mnρ)ρ(1).

It is worth observing that the definition of presheaf-left inverse depends on the double cone o where the object ρ is localized. This is so because the inclusion aρ(A(a)′) ⊂ (A(a)′ ⊗ Mnρ)ρ(1) is verified only for double cones a which are spacelike separated from o. Moreover, notice that by (2) and (3), a presheaf-left inverse of ρ ∈ ∆t(o) can be seen as a linear map from the codomain of ρ̂ ∈ ∆t⊥(o) onto its domain which is, by relation (i), compatible with the presheaf structure and fulfills relation (ii). Now, if ρ ∈ ∆t(o) has a presheaf-left inverse then each σ ∈ ∆t(b) equivalent to ρ, with o ⊆ b, has a presheaf-left inverse. In fact, given a unitary U ∈ (ρ, σ), the collection of linear maps defined as aϕ(U⁺BU) for B ∈ (A(a)′ ⊗ Mnσ)σ(1) and for each a ⊥ b is a presheaf-left inverse of σ. This argument cannot be applied to the
elements of [ρ] localized in double cones which do not contain o. Hence, in general, having a presheaf-left inverse is not a property of the equivalence class of the object. This leads to the following:

Definition 3.10. We say that an object ρ of B(A) is homogeneous if each element of its equivalence class has presheaf-left inverses, namely if for each a ∈ K, any σ ∈ [ρ] localized in a has, as an element of ∆t(a), a presheaf-left inverse.

Concerning homogeneity and existence of presheaf-left inverses, in this section we will limit ourselves to the following remark. If ρ ∈ ∆t(o) has a presheaf-left inverse, then aρ is a faithful morphism for each a ⊥ o. This does not imply the double faithfulness of ρ. Double faithfulness occurs when ρ is homogeneous. Conversely, it is not clear whether, in general, double faithfulness is enough both for the existence of presheaf-left inverses and for homogeneity. We only know that this happens in the particular case of doubly faithful simple objects (see next section). As a consequence of the definition of presheaf-left inverse we have the following:

Proposition 3.11. Let ϕ̂ be a presheaf-left inverse of ρ ∈ ∆t(o). Then there exists a unique net-left inverse ϕ of ρ such that for each a, b ∈ K with b ⊥ a, o we have

ϕ(A) = bϕ(A), A ∈ (A(a) ⊗ Mnρ)ρ(1). (9)
Proof. Since a⊥ ∩ o⊥ is pathwise connected, the compatibility of ϕ̂ implies (in the same way as in Proposition 2.2(a)) that the relation (9) defines a net-left inverse of ρ.

The relations (8) and (9) yield a correspondence between presheaf-left inverses and left inverses. Namely, given a presheaf-left inverse ϕ̂ of ρ ∈ ∆t(o) we have

{presheaf-left inverses of ρ ∈ ∆t(o)} ∋ ϕ̂ --(9)--> ϕ --(8)--> Φ ∈ {left inverses of ρ}. (10)

We denote by l(ϕ̂) the left inverse defined by relation (10) and call it the left inverse associated with ϕ̂. Notice that for each E ∈ (ρ, ρ) we have

l(ϕ̂)ι,ι(E) = aϕ(E), ∀ a ⊂ o⊥ ⇒ aϕ(E) = c · 1, ∀ a ∈ K, a ⊥ o
because of the irreducibility of ι. We conclude this section by studying how the correspondence ϕ̂ → l(ϕ̂) behaves under the categorical operations. Let ϕ̂, ϕ̂1 and ϕ̂2 be presheaf-left inverses of ρ, ρ1, ρ2 ∈ ∆t(o) respectively. Given two isometries Wi ∈ (α, ρi) for i = 1, 2 verifying W1W1⁺ + W2W2⁺ = 1α, and given an isometry V ∈ (β, ρ) (V V⁺ ≡ E), the linear maps ϕ̂1 ◦ ϕ̂2, ϕ̂1 ⊕s ϕ̂2 for s ∈ [0, 1] and ϕ̂ᴱ defined as

a(ϕ1 ◦ ϕ2)(A) ≡ aϕ1(aϕ2([A])), A ∈ (A(a)′ ⊗ Mn2n1)ρ2ρ1(1) (11)

a(ϕ1 ⊕s ϕ2)(B) ≡ s · aϕ1(W1⁺BW1) + (1 − s) · aϕ2(W2⁺BW2), B ∈ (A(a)′ ⊗ Mnα)α(1) (12)

aϕᴱ(C) ≡ aϕ(E)⁻¹ · aϕ(V CV⁺) if aϕ(E) ≠ 0, C ∈ (A(a)′ ⊗ Mnβ)β(1) (13)
for each a ⊥ o, are, respectively, presheaf-left inverses of ρ2ρ1, α and β as elements of ∆t(o). The definitions (11) and (12) entail that the existence of presheaf-left inverses and the homogeneity are stable properties under tensor products and direct sums. This cannot be asserted for subobjects because ϕ̂ᴱ exists only if the scalar aϕ(E) ≠ 0. Now, note that the same constructions we have made for presheaf-left inverses can be made for the associated left inverses l(ϕ̂), l(ϕ̂1) and l(ϕ̂2) (see (A.1)–(A.3)). One can easily show that the correspondence ϕ̂ → l(ϕ̂) is compatible: for each σ, τ ∈ ∆t we have

(l(ϕ̂1) ◦ l(ϕ̂2))σ,τ = l(ϕ̂1 ◦ ϕ̂2)σ,τ (14)

(l(ϕ̂1) ⊕s l(ϕ̂2))σ,τ = l(ϕ̂1 ⊕s ϕ̂2)σ,τ ∀ s ∈ [0, 1] (15)

(l(ϕ̂)ᴱ)σ,τ = l(ϕ̂ᴱ)σ,τ if l(ϕ̂)ι,ι(E) ≠ 0. (16)
These relations allow us to work with the more tractable associated left inverses rather than with the presheaf-left inverses. In particular, by (16) the existence of the presheaf-left inverse ϕ̂ᴱ for the subobject β ∈ ∆t(o) is equivalent to the condition l(ϕ̂)ι,ι(E) ≠ 0.

3.3. Simple objects

In this section we study the simple objects of B(A), namely objects characterized by the following property: γ ∈ ∆t is simple if ε(γ, γ) = χγ · 1γ², where χγ ∈ {1, −1}. We show that each doubly faithful simple object is homogeneous. Let us start by noting that if γ ∈ ∆t is simple then each element of [γ] is simple. Now, if γ has a left inverse Φ then the following properties are equivalent:

Φγ,γ(ε(γ, γ)) = ±1γ ⇔ γ is simple ⇔ (γ², γ²) = C · 1γ²

(see Proposition A.6). Moreover the left-hand side and the right-hand side relations imply that γ is irreducible. Hence when a simple object γ has a left inverse we can say that [γ] is a simple sector. We now study the structure of the faithful simple objects.

Lemma 3.12. Let γ ∈ ∆t(o) be a faithful simple object and let ϕ be a net-left inverse of γ. For each b ∈ K, o ⊥ b and for each unitary U ∈ (σ, γ) such that σ is localized in b we have

σ(ϕ(B)) = U⁺ · B · U, B ∈ (A(o) ⊗ Mn)γ(1).

In particular ϕ ↾ (A(a) ⊗ Mn)γ(1) is injective for each a ∈ K, o ⊆ a.

Proof. In the proof the relations (A.6)–(A.8) will be used. Observing that 1σ ⊗ B = σ(B), because o ⊥ b, we have

[1σ ⊗ B]i,j = σ(Bi,j) = U⁺ · γ(Bi,j) · U = U⁺ · γ((1γ)i,t) · γ(Bt,s) · γ((1γ)s,j) · U = [U⁺ × 1γ · γ(B) · U × 1γ]i,j.
By using this identity we have

B ⊗ 1σ = θ(nσ, nγ) · 1σ ⊗ B · θ(nγ, nσ)
= θ(nσ, nγ) · U⁺ × 1γ · γ(B) · U × 1γ · θ(nγ, nσ)
= ε(σ, γ) · U⁺ × 1γ · γ(B) · U × 1γ · ε(γ, σ)
= 1γ × U⁺ · ε(γ, γ) · γ(B) · ε(γ, γ) · 1γ × U
= 1γ × U⁺ · γ(B) · 1γ × U.

In conclusion we have

σ(ϕ(B))i,j = ϕ(B) · σ(1)i,j = ϕ(Bγ(σ(1)i,j)) = ϕ([B ⊗ 1σ]i,j)
= ϕ([1γ × U⁺]i,t · [γ(B)]t,s · [1γ × U]s,j) = ϕ(γ(U⁺i,t) · γ(Bt,s) · γ(Us,j))
= (U⁺ · B · U)i,j.

Theorem 3.13. Let γ be a simple object. If γ is faithful, then γ : A → (A ⊗ Mn)γ(1) is an isomorphism and γ⁻¹ is the unique net-left inverse of γ.

Proof. Let γ be localized in o. For each a ∈ K, o ⊆ a, let U ∈ (σ, γ) be a unitary such that σ is localized in a double cone spacelike separated from a. Observing that ϕ((A(a) ⊗ Mn)γ(1)) = A(a), by the previous lemma we have

σ(ϕ(A⁺B)) = U⁺A⁺BU = U⁺A⁺U U⁺BU = σ(ϕ(A))∗ · σ(ϕ(B)) = σ(ϕ(A)∗ϕ(B)), A, B ∈ (A(a) ⊗ Mn)γ(1).

Since σ is faithful, ϕ(A⁺B) = ϕ(A)∗ϕ(B), that is, ϕ is a morphism. It is surjective and injective, therefore ϕ = γ⁻¹ on (A(a) ⊗ Mn)γ(1). Now, the proof follows by continuity of ϕ.

Corollary 3.14. Let γ be a simple object. Then γ is doubly faithful if, and only if, γ is homogeneous. In particular if γ ∈ ∆t(o) is doubly faithful, then {aγ : A(a)′ → (A(a)′ ⊗ Mnγ)γ(1), a ⊥ o} is a presheaf isomorphism and γ̂⁻¹ ≡ {aγ⁻¹, a ⊥ o} is the unique presheaf-left inverse of γ ∈ ∆t(o).

Proof. Let us assume that γ is localized in o ∈ K. It is only a matter of calculation to check that, for each a ⊥ o, aγ⁻¹((A(a⊥) ⊗ Mnγ)γ(1)) ⊆ A(a)′. Applying aγ to this inclusion and observing that, by Proposition 2.2(c), aγ(A(a)′) ⊆ (A(a)′ ⊗ Mnγ)γ(1), we have

(A(a⊥) ⊗ Mnγ)γ(1) ⊆ aγ(A(a)′) ⊆ (A(a)′ ⊗ Mnγ)γ(1).

Passing to the weak closure, by Haag duality, we have aγ(A(a)′) = (A(a)′ ⊗ Mnγ)γ(1). This implies that {aγ : A(a)′ → (A(a)′ ⊗ Mnγ)γ(1), a ⊥ o} is a presheaf isomorphism and that γ̂⁻¹, defined as above, is the unique presheaf-left inverse of γ ∈ ∆t(o). Since
double faithfulness is stable under equivalence, each element of the equivalence class of γ admits presheaf-left inverses. Hence, γ is homogeneous; the converse statement is contained in the observation following Definition 3.10.

3.4. The category of objects with finite statistics

The only known way to classify the statistics of sectors of a tensor C∗-category with a symmetry is the one followed in DHR analysis and based on using left inverses. But this procedure might not be applicable to all the sectors of B(A) because, as observed in Sec. 3.2, we cannot exclude the existence of objects without left inverses. Disregarding the sectors associated with this kind of objects, we could proceed as in DHR analysis and classify the statistics of the sectors as finite or infinite. However, we will see that objects with conjugates have finite statistics; henceforth we will confine ourselves to this case. Let Ad, Sd ∈ (ρᵈ, ρᵈ) be the (anti)symmetrizer associated with ε_ρ^d.

Definition 3.15. We say that an object of B(A) has finite statistics if it is a finite direct sum of irreducibles ρ fulfilling the following conditions: there is a 3-tuple (d, γ, V) where d is an integer, γ is a faithful simple object and V ∈ (γ, ρᵈ) is an isometry satisfying one of the following alternatives:

(B) V V⁺ = Ad or (F) V V⁺ = Sd.

We denote by ∆f the set of the objects with finite statistics and by B(A)f the full subcategory of B(A) whose objects have finite statistics. The finiteness of the statistics is stable under equivalence. Moreover, each object ρ ∈ ∆f has left inverses. By definition of ∆f and by (A.2) it is enough to prove this in the case where ρ is irreducible. Let (d, γ, V) be the 3-tuple associated with ρ. Since γ is faithful, it has a left inverse Φ (Propositions 3.7 and 3.8). Then

Ψσ,β(X) ≡ Φσ,β(V⁺ × 1β · 1ρᵈ⁻¹ × X · V × 1σ), X ∈ (ρσ, ρβ) (17)

defines a left inverse of ρ. Our definition of objects with finite statistics is equivalent to the usual one, as the following propositions show.

Proposition 3.16. Let ρ be irreducible. Then the following assertions are equivalent:
(a) ρ has finite statistics;
(b) each left inverse Ψ of ρ verifies the relation Ψρ,ρ(ε(ρ, ρ)) = λ · 1ρ ≠ 0.

Proof. (b) ⇒ (a) follows directly from DHR analysis [5, Sec. III]. We show (a) ⇒ (b) only in the case (B). In a similar way, one can easily check that this holds in the case (F) as well. Furthermore, it is possible to prove, as in DHR analysis, that if Ψ is a left inverse of ρ such that Ψρ,ρ(ε(ρ, ρ)) = λ · 1ρ then the real
number λ is an invariant of the equivalence class [ρ]. Hence, the proof follows once we have shown the existence of one left inverse verifying the relation in the statement. In order to prove this let (d, γ, V) be the 3-tuple associated with ρ, and let Φ be a left inverse of γ. Moreover, let Ψ be the left inverse of ρ defined by using Φ in (17). Since ρ is irreducible, Ψρ,ρ(ε(ρ, ρ)) = λ · 1ρ. We now prove that λ ≠ 0. To this aim, we need some preliminary observations. First, we recall the following formula [4, Lemma 5.3]:

(Ψ◦d)ι,ι(Ad) = (d!)⁻¹ · (1 − λ)(1 − 2λ) · · · (1 − (d − 1)λ) (∗)

where Ψ◦d is the left inverse of ρᵈ given by the d-fold composition of Ψ. Secondly, (Ψ◦d)ρᵈ,ρᵈ(ε(ρᵈ, ρᵈ)) = λᵈ · 1ρᵈ because of (A.4). Thirdly, if (Ψ◦d)ι,ι(Ad) ≠ 0, then

Φ̃σ,τ(F) = ((Ψ◦d)ι,ι(Ad))⁻¹ · (Ψ◦d)σ,τ(V × 1τ · F · V⁺ × 1σ), F ∈ (γσ, γτ)

defines a left inverse of γ such that

χγ · 1γ = Φ̃γ,γ(ε(γ, γ)) = ((Ψ◦d)ι,ι(Ad))⁻¹ · (Ψ◦d)γ,γ(V⁺ × 1γ · ε(γ, γ) · V × 1γ)
= ((Ψ◦d)ι,ι(Ad))⁻¹ · V (Ψ◦d)ρᵈ,ρᵈ(ε(ρᵈ, ρᵈ)) V⁺ = ((Ψ◦d)ι,ι(Ad))⁻¹ · λᵈ · 1γ (∗∗)

because ε(γ, γ) = χγ · 1γ². Now the proof that λ ≠ 0 proceeds as follows. If λ were equal to 0, then Φ̃ would be well-defined because, by (∗), (Ψ◦d)ι,ι(Ad) = (d!)⁻¹. This leads to a contradiction because, by (∗∗), we should have χγ · 1γ = 0.

Proposition 3.17. The following assertions hold:
(a) B(A)f is closed under tensor products, direct sums and subobjects;
(b) ρ ∈ ∆f ⇔ ρ has a standard left inverse Φ, that is Φρ,ρ(ε(ρ, ρ))² = c · 1ρ > 0.

Proof. (a) The closedness under direct sums and subobjects is obvious by definition of B(A)f. Once we have shown that given two irreducibles ρ, σ with finite statistics each subobject of ρσ has left inverses, the closedness under tensor products follows as in DHR analysis. For this purpose, let Φ, Ψ be two left inverses of ρ and σ respectively. By Proposition A.4 the left inverse Ψ ◦ Φ of ρσ is faithful. Hence each subobject of ρσ has a left inverse defined by (A.3). (b) (⇒) follows as in the DHR analysis. (⇐) By Proposition A.3 any standard left inverse is faithful. Therefore each subobject of ρ has left inverses. The rest of the proof follows as in DHR analysis.

3.5. The selection of the relevant subcategory

In the previous section, in order to exclude objects without conjugates, we introduced the category B(A)f. With the same motivation, we can affirm that this is only a preliminary step since properties like (double) faithfulness and homogeneity might fail to hold in B(A)f. Observing that homogeneity implies the other two properties, we show in this section how to select the maximal full subcategory of B(A)f with homogeneous objects, closed under tensor products, direct sums and
subobjects. We claim here, but will prove in the next section, that this is the relevant subcategory. To understand the problem we are facing, we recall that homogeneity might not be stable under subobjects (see the observation related to (13)). Consequently the category we are looking for does not correspond, in general, to the subcategory of B(A)f whose objects are homogeneous. However, this category can be selected by adding further conditions.

Definition 3.18. We denote by ∆fh the subset of ∆t whose objects are finite direct sums of irreducibles ρ fulfilling the following conditions:
(a) ρ has finite statistics;
(b) given the 3-tuple (d, γ, V) associated with ρ, the simple object γ is doubly faithful (or, equivalently, homogeneous).
We denote by B(A)fh the full subcategory of B(A)f whose objects belong to ∆fh.

Notice that the property of belonging to ∆fh is stable under equivalence. Now, the next proposition shows a useful characterization of the irreducible elements of ∆fh, while the subsequent one proves the main claim of this section.

Proposition 3.19. Let ρ be irreducible. Then the following assertions are equivalent:
(a) ρ ∈ ∆fh;
(b) ρ is homogeneous and belongs to ∆f.

Proof. (a) ⇒ (b) Let (d, γ, V) be the 3-tuple associated with ρ, where γ is homogeneous. Let us assume that ρ and γ are localized in the same double cone o. Let γ̂⁻¹ be the presheaf-left inverse of γ ∈ ∆t(o) defined by Corollary 3.14. Setting

aϕ(A) ≡ aγ⁻¹(V⁺ aρᵈ⁻¹(A) V), A ∈ (A(a)′ ⊗ Mnγ)ρ(1) (18)

for each a ∈ K, a ⊥ o, we have that the set ϕ̂ = {aϕ, a ⊥ o} is a presheaf-left inverse of ρ ∈ ∆t(o). Since each element of [ρ] belongs to ∆fh, ρ is homogeneous.

(a) ⇐ (b) Let (d, γ, V) be the 3-tuple associated with ρ, where in this case γ is faithful. The proof follows once we have shown that γ is homogeneous. Let us assume that ρ and γ are localized in the same double cone o and let ϕ̂ be a presheaf-left inverse of ρ ∈ ∆t(o). By Proposition 3.16 the left inverse l(ϕ̂) of ρ associated with ϕ̂ satisfies the relation l(ϕ̂)ρ,ρ(ε(ρ, ρ)) = λ · 1ρ ≠ 0. Combining this with Proposition A.4 we obtain that l(ϕ̂)◦d is a faithful left inverse of ρᵈ. We now notice that l(ϕ̂)◦d = l(ϕ̂◦d) because of (14), where ϕ̂◦d is the presheaf-left inverse of ρᵈ ∈ ∆t(o) given by the d-fold composition of ϕ̂. This entails that l(ϕ̂◦d)ι,ι(V V⁺) ≠ 0 and, by (13), that γ ∈ ∆t(o) has presheaf-left inverses. Since, by transportability, this argument can be applied to each element of [γ], γ is homogeneous.

Proposition 3.20. The following assertions hold:
(a) B(A)fh is closed under tensor products, direct sums and subobjects, and its objects are homogeneous;
(b) B(A)fh is the maximal full subcategory of B(A)f closed under subobjects and whose objects are homogeneous.

Proof. (a) The closedness under direct sums and subobjects is obvious. Let ρ1, ρ2 ∈ ∆fh be two irreducibles localized in the same double cone o. Since ρ1, ρ2 are homogeneous, both the direct sum and the tensor product of these two objects are homogeneous (see observation related to (11), (12)). It remains to be proved that each subobject of ρ1ρ2 is homogeneous. Since ρ1 and ρ2 are homogeneous objects with finite statistics, the proof follows by the same argument used in the proof of the implication (a) ⇐ (b) of the previous proposition. (b) Let C be a category fulfilling the properties written in the statement. Since each object of C is a finite direct sum of homogeneous irreducible objects with finite statistics, the proof follows from Proposition 3.19.

4. Conjugation

This section concludes the investigation of Secs. 2 and 3. We start by recalling the definition of the conjugate of an object. An object ρ has conjugates if there exists an object ρ̄ and a pair of arrows R ∈ (ι, ρ̄ρ), R̄ ∈ (ι, ρρ̄) satisfying the conjugate equations:

R̄⁺ × 1ρ · 1ρ × R = 1ρ, R⁺ × 1ρ̄ · 1ρ̄ × R̄ = 1ρ̄.
Conjugation is a property which is stable under equivalence, tensor products, direct sums and subobjects [11]. The next result proves the assertions we have been claiming throughout this paper:

Theorem 4.1. Each object of B(A) with conjugates belongs to B(A)fh.

Proof. Given ρ, ρ̄ localized in o, let R, R̄ be a pair of arrows solving the conjugate equations for ρ and ρ̄. Then by setting

Φσ,τ(X) ≡ (R⁺R)⁻¹ · (R⁺ × 1τ · 1ρ̄ × X · R × 1σ), X ∈ (ρσ, ρτ)

for each σ, τ ∈ ∆t, we get a left inverse Φ of ρ. Since it is always possible to choose R, R̄ in such a way that Φ is standard [11], by Proposition 3.17(b) ρ has finite statistics. Now, the set of linear maps defined as

aϕ(A) ≡ (R⁺R)⁻¹ · (R⁺ · aρ̄(A) · R), A ∈ (A(a)′ ⊗ Mnρ)ρ(1)

for each a ⊥ o, is a presheaf-left inverse of ρ ∈ ∆t(o). As conjugation is stable under equivalence, each element of [ρ] has presheaf-left inverses. Thus ρ is homogeneous. Finally, since conjugation is stable under subobjects, Proposition 3.20(b) completes the proof.
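As an illustration of why the map Φ in the proof of Theorem 4.1 is a normalized left inverse, the following sketch checks normalization and property (i) of the Appendix definition. It assumes only that R⁺R is a nonzero scalar multiple of 1ι (because ι is irreducible), that ι is a strict tensor unit, and the exchange property of the tensor product; the symbols σ′, τ′, S, T are those of property (i), written here with the adjoint denoted by ⁺.

```latex
% Sketch under the stated assumptions:
%   R \in (\iota,\bar\rho\rho), \quad R^{+}R \in (\iota,\iota) = \mathbb{C}\cdot 1_\iota,\ R^{+}R \neq 0,
%   T \in (\tau,\tau'),\ S \in (\sigma,\sigma'),\ X \in (\rho\sigma,\rho\tau),
%   exchange property (T \times S)\cdot(T' \times S') = TT' \times SS'.
\begin{align*}
\Phi_{\iota,\iota}(1_\rho)
  &= (R^{+}R)^{-1}\, R^{+}\cdot(1_{\bar\rho}\times 1_\rho)\cdot R
   = (R^{+}R)^{-1}\, R^{+}R
   = 1_\iota ,\\[4pt]
\Phi_{\sigma',\tau'}\bigl(1_\rho\times T\cdot X\cdot 1_\rho\times S^{+}\bigr)
  &= (R^{+}R)^{-1}\,(R^{+}\times 1_{\tau'})\cdot(1_{\bar\rho\rho}\times T)
     \cdot(1_{\bar\rho}\times X)\cdot(1_{\bar\rho\rho}\times S^{+})\cdot(R\times 1_{\sigma'})\\
  &= (R^{+}R)^{-1}\,(R^{+}\times T)\cdot(1_{\bar\rho}\times X)\cdot(R\times S^{+})\\
  &= T\cdot(R^{+}R)^{-1}\,(R^{+}\times 1_{\tau})\cdot(1_{\bar\rho}\times X)
     \cdot(R\times 1_{\sigma})\cdot S^{+}\\
  &= T\cdot\Phi_{\sigma,\tau}(X)\cdot S^{+} .
\end{align*}
```

The middle steps use R⁺ × T = T · (R⁺ × 1τ) and R × S⁺ = (R × 1σ) · S⁺, which are instances of the exchange property.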
Theorem 4.1 states that all the relevant objects of the theory belong to B(A)fh. We claim that B(A)fh is the relevant subcategory of the theory. In fact, as we will prove later, each object of B(A)fh has conjugates under the assumption that the local algebras are properly infinite, and there are several reasons for considering this assumption as an essential property of the reference representation. First, in an arbitrary globally-hyperbolic spacetime the algebras of local observables of a multiplet of n Klein–Gordon fields in any Fock representation, acted on by U(n) as a global gauge group, fulfill this property (this result is proved in [15] and will be described in a forthcoming article). Secondly, it turns out to be a necessary condition in the following particular situation:

Theorem 4.2. Let ρ, ρ̄ be two conjugate objects with multiplicity equal to one. If ρ is not a simple object, then A is a net of properly infinite von Neumann algebras.

The proof is based on the following:

Lemma 4.3. Let σ, σ̄ be two conjugate objects localized, respectively, in o1, o2 ∈ K. If σ, σ̄ have multiplicity equal to one, then each nonzero orthogonal projection E ∈ (σ, σ) is equivalent to 1 on A(a) for each a ∈ K, o1 ∪ o2 ⊆ a.

Proof. We recall that an object with multiplicity equal to one is an endomorphism, in general not unital, of the algebra A. Let V ∈ (τ, σ) be an isometry such that V · V⁺ = E. The subobject τ is an endomorphism of A localized in o1, and τ(1) = E. Given a pair of arrows R, R̄ solving the conjugate equations for σ and σ̄, let us define S ≡ V⁺ × 1σ̄ · R̄ ∈ (ι, τσ̄). By [11, Lemma 2.1] S ≠ 0, hence S′ ≡ S · t^(−1/2) is an isometry, where S⁺ · S = t · 1. Then 1 ∼ S′ · S′⁺ ≤ τσ̄(1) ≤ τ(1) ≤ 1. Hence E ∼ 1 on A(a).

Proof of Theorem 4.2. Let us assume that ρ and ρ̄ are, respectively, localized in o1, o2, and let a ∈ K, o1 ∪ o2 ⊆ a. By [2, Proposition 3] it is enough to show that A(a) is properly infinite.

Notice that ρ² is reducible because ρ is not simple (Proposition A.6). Hence there is a nonzero orthogonal projection E ∈ (ρ², ρ²) such that E ≠ ρ²(1). By Lemma 4.3, ρ²(1) ∼ 1 ∼ (ρ²(1) − E) ∼ E on A(a). If Tr were a finite normal trace of A(a), then the previous relation would entail the equality Tr(1) = Tr(ρ²(1)) = Tr(E) = Tr(ρ²(1)) − Tr(E), which is possible if, and only if, Tr(1) = 0.

In spite of this theorem, we cannot assert that a general requirement on the reference representation for the existence of nontrivial conjugate objects is that the local algebras are properly infinite. A counterexample of this assertion can be found in [16]. The results obtained in that paper, however, do not affect the hypothesis of considering the local algebras A(a) properly infinite as an essential property of the reference representation because we are interested in applications deriving from
models of quantum field theory, while that paper concerns quantum statistical mechanics and the local algebras are defined on a lattice.

Theorem 4.4. If A is a net of properly infinite algebras, then B(A)fh is the full subcategory of B(A) whose objects have conjugates.

Proof. By virtue of Theorem 4.1, we only have to prove that each object of B(A)fh has conjugates. The existence of conjugates in B(A)fh is equivalent to the existence of conjugates for its simple objects. In order to prove this, let us consider an irreducible object ρ of B(A)fh and let (d, γ, V) be the 3-tuple associated with ρ. If γ has a conjugate γ̄ and T ∈ (ι, γ̄γ), T̄ ∈ (ι, γγ̄) solve the conjugate equations for γ and γ̄, then by setting

ρ̄ ≡ γ̄ρᵈ⁻¹, R ≡ 1γ̄ × V · T, R̄ ≡ ε(ρ̄, ρ) · R,

one can easily check that R and R̄ solve the conjugate equations for ρ and ρ̄. Now, let γ be a simple object of B(A)fh localized in o. Since γ is doubly faithful, by Proposition 2.6(b) γ(1) has central support 1 ⊗ 1nγ in A(o) ⊗ Mnγ. Since the local algebras are properly infinite, there exists an isometry V : Ho → Ho ⊗ C^(nγ) with values Vi in A(o) for i = 1, . . . , nγ, such that V V⁺ = γ(1). So, γ1( ) ≡ V⁺γ( )V is an automorphism of A localized in o and transportable. Hence, as in DHR analysis, it turns out that γ1⁻¹ is a conjugate of γ.

5. Conclusions

We have shown that in a tensor C∗-category associated with a set of representations of a net of local observables which are local excitations of a reference representation, the charge structure, in the sense of DHR analysis, manifests itself even when the Borchers property of the reference representation is not assumed. What is essential is that the local algebras are properly infinite in the reference representation. The main problem we have solved has been to identify the subcategory carrying the charge structure of the theory, that is the subcategory whose objects have conjugates. Apart from the finiteness of the statistics, the sectors of this category are characterized by a new property called homogeneity. The key result allowing us to formulate this property has been that the theory can be equivalently studied using either the net or the presheaf approach. As mentioned at the beginning, the superselection sectors of a net of local observables on globally-hyperbolic spacetimes have been studied under the assumption that the reference representation fulfills the Borchers property [9]. We observed that this assumption has been verified only for certain models [17]. The results here suggest that it is reasonable to include the proper infiniteness of the algebras of local observables in the reference representation as an axiom of the theory.
In this case the charge structure of the theory is carried not by the subcategory whose objects have finite statistics but by the one generated by the homogeneous sectors with finite statistics.
Appendix A. Some Notions and Results on Tensor C∗-Categories

We introduce the definition of a tensor C∗-category and prove some results concerning left inverses and simple objects. The last part of this section is devoted to exposing some relations concerning the notation introduced in Sec. 3.2. The references are [6, 11].

A.1. Left inverses, symmetry and simple objects

Let C be a category. We denote by ρ, σ, τ, etc. the objects of the category and the set of the arrows between ρ, σ by (ρ, σ). The composition of arrows is indicated by “·” and the unit arrow of ρ by 1ρ. Sometimes, if no confusion is possible, we will omit the symbol “·” when we write the composition of arrows. C is said to be a C∗-category if the set of the arrows between two objects (ρ, σ) is a complex Banach space and the composition between arrows is bilinear; there is an adjoint, that is an involutive contravariant functor ∗ acting as the identity on the objects; the norm satisfies the C∗-property, namely ‖R∗R‖ = ‖R‖² for each R ∈ (ρ, σ). Notice that if C is a C∗-category then the set (ρ, ρ) is a C∗-algebra for each ρ.

Assume that C is a C∗-category. An arrow V ∈ (ρ, σ) is said to be an isometry if V∗V = 1ρ; a unitary, if it is an isometry and V V∗ = 1σ. The property of admitting a unitary arrow defines an equivalence relation on the set of the objects of the category. We denote by the symbol [ρ] the unitary equivalence class of the object ρ. An object σ is said to be irreducible if (σ, σ) = C1σ. C is said to be closed under subobjects if for each orthogonal projection E ∈ (ρ, ρ), E ≠ 0, there exists an isometry V ∈ (β, ρ) such that V V∗ = E. C is said to be closed under direct sums if, given ρi, i = 1, 2, there exists an object α and two isometries Wi ∈ (ρi, α) such that W1W1∗ + W2W2∗ = 1α.
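As a small worked check of these definitions, the following sketch shows that an isometry V ∈ (β, ρ) has norm one and that V V∗ is an orthogonal projection in (ρ, ρ), which is why subobjects are labelled by projections. It uses only the C∗-property and V∗V = 1β; the assumption 1β ≠ 0 is added here for the norm computation.

```latex
% Sketch, assuming V \in (\beta,\rho) with V^{*}V = 1_\beta and 1_\beta \neq 0.
% From \|1_\beta\|^{2} = \|1_\beta^{*}1_\beta\| = \|1_\beta\| one gets \|1_\beta\| = 1.
\begin{align*}
\|V\|^{2} &= \|V^{*}V\| = \|1_\beta\| = 1 ,\\
(VV^{*})^{2} &= V\,(V^{*}V)\,V^{*} = V\,1_\beta\,V^{*} = VV^{*},
\qquad (VV^{*})^{*} = VV^{*} .
\end{align*}
```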
A strict tensor C∗-category (or tensor C∗-category) is a C∗-category C equipped with a tensor product, namely an associative bifunctor ⊗ : C × C → C with a unit ι, commuting with ∗, bilinear on the arrows and satisfying the exchange property, i.e. (T ⊗ S) · (T′ ⊗ S′) = T T′ ⊗ SS′ when the composition of the arrows is defined. To simplify the notation we omit the symbol ⊗ when applied to objects, namely ρσ ≡ ρ ⊗ σ. From now on, we assume that C is a tensor C∗-category closed under direct sums and subobjects, and that the identity object ι is irreducible.

A left inverse Ψ of an object ρ is a set of nonzero linear maps Ψ = {Ψσ,τ : (ρσ, ρτ) → (σ, τ)} satisfying
(i) Ψσ′,τ′(1ρ ⊗ T · X · 1ρ ⊗ S∗) = T Ψσ,τ(X) S∗, T ∈ (τ, τ′), S ∈ (σ, σ′);
(ii) Ψσπ,τπ(X ⊗ 1π) = Ψσ,τ(X) ⊗ 1π, X ∈ (ρσ, ρτ).
Ψ is said to be positive if Ψσ,σ is positive ∀σ ∈ C; faithful if Ψσ,σ is positive and faithful ∀σ ∈ C; normalized if Ψι,ι(1ρ) = 1ι.
From now on by left inverse we mean a positive normalized left inverse.

Lemma A.1. Let Ψ be a left inverse of ρ. The following relations hold:
(a) Ψσ,γ(R)∗ = Ψγ,σ(R∗), R ∈ (ρσ, ργ);
(b) Ψσ,σ(R∗R) ≥ Ψγ,σ(R∗) · Ψσ,γ(R), R ∈ (ρσ, ργ).

Proof. (a) By polarization of the identity the assertion holds for the C∗-algebra (ρβ, ρβ) for each object β. For the general case, let R ∈ (ρσ, ργ). Since the category is closed under direct sums there exists an object β and two isometries V ∈ (σ, β), W ∈ (γ, β) such that V V∗ + W W∗ = 1β. Since Ψβ,β(1ρ ⊗ W · R · 1ρ ⊗ V∗)∗ = Ψβ,β(1ρ ⊗ V · R∗ · 1ρ ⊗ W∗), we have V Ψσ,γ(R)∗ W∗ = V Ψγ,σ(R∗) W∗, therefore Ψσ,γ(R)∗ = Ψγ,σ(R∗). The statement (b) follows from the following inequality:

0 ≤ Ψσ,σ((R − 1ρ ⊗ Ψσ,γ(R))∗ · (R − 1ρ ⊗ Ψσ,γ(R))) = Ψσ,σ(R∗R) − Ψγ,σ(R∗)Ψσ,γ(R).

A first consequence of this lemma is the following:

Lemma A.2. Let Ψ be a left inverse of ρ. Then Ψ is zero if, and only if, Ψι,ι(1ρ) = 0.

Proof. (⇒) is trivial. (⇐) Let R ∈ (ρσ, ργ). Since Ψσ,σ(R∗R) ≤ ‖R‖²Ψσ,σ(1ρσ) = ‖R‖²(Ψι,ι(1ρ) ⊗ 1σ), we have Ψσ,σ(R∗R) = 0. By Lemma A.1(b) Ψσ,γ(R) = 0.

Let Ψ, Ψ1, Ψ2 be, respectively, left inverses of ρ, ρ1, ρ2. Let α be the direct sum of ρ1 and ρ2 and let β be a subobject of ρ. Hence there are two isometries Wi ∈ (ρi, α) for i = 1, 2 such that 1α = W1W1∗ + W2W2∗ and there is an isometry V ∈ (β, ρ) such that V V∗ ≡ E. Then the sets Ψ1 ◦ Ψ2, Ψ1 ⊕s Ψ2 for s ∈ [0, 1], and Ψᴱ defined by

(Ψ1 ◦ Ψ2)σ,γ( ) ≡ (Ψ1)σ,γ((Ψ2)ρ2σ,ρ2γ( )) (A.1)

(Ψ1 ⊕s Ψ2)σ,γ( ) ≡ s(Ψ1)σ,γ(W1∗ ⊗ 1γ · ( ) · W1 ⊗ 1σ) + (1 − s)(Ψ2)σ,γ(W2∗ ⊗ 1γ · ( ) · W2 ⊗ 1σ) (A.2)

(Ψᴱ)σ,γ( ) ≡ (Ψι,ι(E))⁻¹ Ψσ,γ(V ⊗ 1γ · ( ) · V∗ ⊗ 1σ) if Ψι,ι(E) ≠ 0 (A.3)

are, respectively, left inverses for ρ2ρ1, α and β. Let us observe that Ψᴱ is defined if Ψι,ι(E) ≠ 0. Hence, for an object the existence of left inverses does not imply the existence of left inverses for its subobjects.

A symmetry ε in the tensor C∗-category C is a map Obj(C) ∋ ρ, σ → ε(ρ, σ) ∈ (ρσ, σρ) satisfying the relations:
(i) ε(ρ, σ) · T ⊗ S = S ⊗ T · ε(τ, β);
(ii) ε(ρ, σ)∗ = ε(σ, ρ);
(iii) ε(ρ, τσ) = 1τ ⊗ ε(ρ, σ) · ε(ρ, τ) ⊗ 1σ;
(iv) ε(ρ, σ) · ε(σ, ρ) = 1σρ,
where T ∈ (τ, ρ), S ∈ (β, σ). By (ii)–(iv) it follows that ε(ρ, ι) = ε(ι, ρ) = 1ρ for each ρ. From now on we assume that C has a symmetry ε.

Proposition A.3. Let Ψ be a left inverse of ρ. Then,

‖Ψσ,σ(R∗R) ⊗ 1ρ‖ ≥ ‖R · (Ψρ,ρ(ε(ρ, ρ)) ⊗ 1σ)‖², R ∈ (ρσ, ργ).

Proof. By using the properties (i) and (iii) of ε we have R∗R ⊗ 1ρ = 1ρ ⊗ ε(ρ, σ) · ε(ρ, ρ) ⊗ 1σ · (1ρ ⊗ R∗R) · ε(ρ, ρ) ⊗ 1σ · 1ρ ⊗ ε(σ, ρ). Using this relation and Lemma A.1(a) we have Ψσ,σ(R∗R) ⊗ 1ρ ≥ (ε(ρ, σ) · Ψρσ,ρσ(ε(ρ, ρ) ⊗ 1σ) · R∗) · (R · Ψρσ,ρσ(ε(ρ, ρ) ⊗ 1σ) · ε(σ, ρ)), which implies the inequality written in the statement.

Let us now recall that given two left inverses Φ, Ψ of ρ and σ respectively, the following relation holds ([14, Sec. 3.2.6, Lemma 6]):

(Ψ ◦ Φ)ρσ,ρσ(ε(ρσ, ρσ)) = Φρ,ρ(ε(ρ, ρ)) × Ψσ,σ(ε(σ, σ)). (A.4)

Proposition A.4. Let Φ, Ψ be two left inverses of ρ and σ respectively. Assume that Φρ,ρ(ε(ρ, ρ)) = λρ · 1ρ and Ψσ,σ(ε(σ, σ)) = λσ · 1σ where λρ, λσ ∈ R. If λρ ≠ 0 and λσ ≠ 0, then Ψ ◦ Φ is faithful.

Proof. Using the relation (A.4) we have (Ψ ◦ Φ)ρσ,ρσ(ε(ρσ, ρσ)) = λρλσ · 1ρσ ≠ 0. The proof follows from Proposition A.3.

An object γ is said to be simple if ε(γ, γ) = χγ 1γ². Since ε(γ, γ) is self-adjoint and unitary, χγ ∈ {1, −1}.

Lemma A.5. Let γ be simple. Then:
(a) γⁿ is simple for each integer n.
Moreover, if γ has left inverses then:
(b) each left inverse of γ is faithful;
(c) γⁿ is irreducible for each integer n.

Proof. (a) follows from the property (iii) of the symmetry. (b) follows from Proposition A.3. (c) By (a) it suffices to prove the statement for γ. Let us assume that γ is reducible. Then there exists an orthogonal projection E ∈ (γ, γ) such that E ≠ 1γ. Moreover 0 ≠ E ⊗ (1γ − E) ∈ (γ², γ²). In fact, since each left inverse Ψ of γ is faithful, Ψγ,γ((1γ − E) ⊗ E) = c E where c1ι = Ψι,ι((1γ − E)) ≠ 0. By virtue of this fact we have E ⊗ (1γ − E) = χγ ε(γ, γ) · (E ⊗ (1γ − E)) = χγ ((1γ − E) ⊗ E) · ε(γ, γ) = (1γ − E) ⊗ E, which gives rise to a contradiction.

Proposition A.6. Let Ψ be a left inverse of γ. Then the following properties are equivalent:
(a) Ψγ,γ(ε(γ, γ)) = ±1γ; (b) γ is simple; (c) γ² is irreducible.
Proof. (a) $\Rightarrow$ (b) By Proposition A.3, $\Psi$ is faithful. Since $(1_{\gamma^2} \mp \varepsilon(\gamma, \gamma)) \in (\gamma^2, \gamma^2)$ is positive, we have $1_{\gamma^2} = \pm\varepsilon(\gamma, \gamma)$. (b) $\Rightarrow$ (c) follows from the previous lemma. (c) $\Rightarrow$ (a) is obvious.

A.2. The notation introduced in Sec. 3.2
Given $\rho \in \Delta_t$, let us consider, for each pair $\sigma, \tau \in \Delta_t$, a bounded linear operator $T \in B(\mathcal{H}_o \otimes \mathbb{C}^{n_\rho n_\sigma}, \mathcal{H}_o \otimes \mathbb{C}^{n_\rho n_\tau})$ with values $T_{i,j}$ in $\mathcal{A}$, and such that $T\rho\sigma(1) = T = \rho\tau(1)T$. Such an operator can be represented as an $n_\tau \times n_\sigma$ matrix with values $[T]_{i,j}$ in $(\mathcal{A} \otimes M_{n_\rho})\rho(1)$, that is
\[ T = \begin{pmatrix} [T]_{1,1} & \cdots & [T]_{1,n_\sigma} \\ \vdots & \ddots & \vdots \\ [T]_{n_\tau,1} & \cdots & [T]_{n_\tau,n_\sigma} \end{pmatrix}, \quad \text{where } [T]_{i,j} \in (\mathcal{A} \otimes M_{n_\rho})\rho(1), \tag{A.5} \]
and $i = 1, \ldots, n_\tau$ and $j = 1, \ldots, n_\sigma$. The following relations hold:
• if $\rho, \sigma \in \Delta_t$ then
\[ [\rho\sigma(A)]_{i,j} = \rho(\sigma(A)_{i,j}), \quad i, j = 1, \ldots, n_\sigma, \quad \forall A \in \mathcal{A} \tag{A.6} \]
• if $F \in (\rho, \rho)$, $E \in (\tau, \sigma)$ then
\[ [F \times E]_{i,j} = F \cdot \rho(E_{i,j}), \quad i = 1, \ldots, n_\tau, \ j = 1, \ldots, n_\sigma \tag{A.7} \]
• if $G \in (\rho\tau, \rho\sigma)$, $L \in (\rho\sigma, \rho\beta)$ then
\[ [L \cdot G]_{i,j} = [L]_{i,k} \cdot [G]_{k,j}, \quad i = 1, \ldots, n_\tau, \ j = 1, \ldots, n_\beta. \tag{A.8} \]

Acknowledgments
I would like to thank John E. Roberts for helpful discussions and his constant interest in this work. I am also grateful to him and to Daniele Guido for their critical comments on the preliminary version of the manuscript. Moreover, I would like to thank the anonymous referee for a careful reading of the manuscript. Finally, I am grateful to Ezio Vasselli, Fabio Ciolli, Gerardo Morsella, Gherardo Piacitelli, and Pasquale Zito for fruitful discussions, and to Isabella Baccarelli and Patrick O'Keefe for their precious help.

References
[1] H. J. Borchers, A remark on a theorem of B. Misra, Commun. Math. Phys. 4 (1967), 315–323.
[2] J. Dixmier and O. Maréchal, Vecteurs totalisateurs d'une algèbre de von Neumann (French), Commun. Math. Phys. 22 (1971), 44–50.
[3] S. Doplicher, R. Haag and J. E. Roberts, Fields observables and gauge transformations I, Commun. Math. Phys. 13 (1969), 1–23.
[4] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics I, Commun. Math. Phys. 23 (1971), 199–230.
[5] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics II, Commun. Math. Phys. 35 (1974), 49–85.
[6] S. Doplicher and J. E. Roberts, A new duality theory for compact groups, Invent. Math. 98(1) (1989), 157–218.
[7] S. Doplicher and J. E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection sectors in particle physics, Commun. Math. Phys. 131(1) (1990), 51–107.
[8] K. Fredenhagen, Superselection Sectors, Lecture Notes, Hamburg University 1994/1995, http://www.desy.de/uni-th/lqp/psfiles/superselect.ps.gz
[9] D. Guido, R. Longo, J. E. Roberts and R. Verch, Charged sectors, spin and statistics in quantum field theory on curved spacetimes, Rev. Math. Phys. 13(2) (2001), 125–198.
[10] R. Haag, Local Quantum Physics, 2nd ed. Springer Texts and Monographs in Physics, 1996.
[11] R. Longo and J. E. Roberts, A theory of dimension, K-Theory 11(2) (1997), 103–159.
[12] J. E. Roberts, Local cohomology and superselection structure, Commun. Math. Phys. 51 (1976), 107–119.
[13] J. E. Roberts, New light on the mathematical structure of algebraic field theory, in Operator Algebras and Applications, Part 2 (Kingston, Ont. 1980) Proc. Sympos. Pure Math. 38, 523–550, Amer. Math. Soc., Providence, R.I., (1982).
[14] J. E. Roberts, Lectures on algebraic quantum field theory, in The Algebraic Theory of Superselection Sectors (Palermo 1989), ed. D. Kastler, World Scientific Publishing, River Edge, NJ, 1990, 1–112.
[15] G. Ruzzi, Essential properties of the vacuum representation in the theory of superselection sectors, PhD thesis, Università di Genova, March 2002.
[16] K. Szlachányi and P. Vecsernyés, Quantum symmetry and braid group statistics in G-spin models, Commun. Math. Phys. 156(1) (1993), 127–168.
[17] R. Verch, Continuity of symplectically adjoint maps and the algebraic structure of Hadamard vacuum representations for quantum fields in curved spacetime, Rev. Math. Phys. 9(5) (1997), 635–674.
[18] R. Verch, On generalizations of spectrum condition, in Mathematical Physics in Mathematics and Physics (Siena 2000), ed. R. Longo, Fields Inst. Commun. 30, 409–428, Amer. Math. Soc. Providence, RI, 2001.
February 5, 2004 15:55 WSPC/148-RMP
00190
Reviews in Mathematical Physics Vol. 15, No. 10 (2003) 1285–1317 © World Scientific Publishing Company
WIGNER MEASURES AND MOLECULAR PROPAGATION THROUGH GENERIC ENERGY LEVEL CROSSINGS
CLOTILDE FERMANIAN KAMMERER
Université de Cergy-Pontoise, Mathématiques, 2 Avenue Adolphe Chauvin BP 222, Pontoise, 95 302 Cergy-Pontoise Cedex, France
[email protected]
Received 23 June 2003
Revised 6 November 2003

We study the time-dependent Schrödinger equation with matrix-valued potential presenting a generic crossing of type B, I, J or K in Hagedorn's classification. We use two-scale Wigner measures for describing the Landau–Zener energy transfer which occurs at the crossing. In particular, in the case of multiplicity 2 eigenvalues, we calculate precisely the change of polarization at the crossing. Our method provides a unified framework in which codimension 2, 3 or 5 crossings can be discussed. We recover Hagedorn's results for wave packets, from the Wigner measure point of view, and extend them to any data uniformly bounded in $L^2$. The proof is based on a normal form theorem which reduces the problem to an operator-valued Landau–Zener formula.

Keywords: Energy level crossings; Landau–Zener transitions; Born–Oppenheimer approximation; semi-classical measures (Wigner measures); multi-scale convergence.
1. Introduction
The quantum description of the dynamics of a molecule consisting of $k_n$ nuclei and $k_e$ electrons, $k_e + k_n = N$, is given by the time-dependent Schrödinger equation
\[ \begin{cases} ih\partial_t \Phi^h = H^h_{\mathrm{mol}} \Phi^h, \\ \Phi^h(0) = \Phi^h_0, \end{cases} \tag{1} \]
with initial data $\Phi^h_0 \in L^2(\mathbb{R}^{3N})$ and where $H^h_{\mathrm{mol}}$ is an essentially self-adjoint molecular Hamiltonian. The scale parameter $h$ is the square-root of the ratio of the electronic mass $m_e$ to the average mass $M$ of the molecule's nuclei,
\[ h = \sqrt{\frac{m_e}{M}}. \]
We are concerned with molecules where the nuclei are very massive, more precisely with the limit $M \to +\infty$, i.e. $h \to 0$. The molecular Hamiltonian $H^h_{\mathrm{mol}}$ can be written as
\[ H^h_{\mathrm{mol}} = -\frac{h^2}{2}\Delta_x + H_e(x), \]
where $x$ describes the nuclear configuration, $x \in \mathbb{R}^{3k_n}$. The electronic Hamiltonian $H_e(x)$ contains the electrons' kinetic energy and the interaction between electrons and nuclei. We suppose that $H_e$ depends smoothly on $x$. We consider $\sigma_e(H_e(x))$ the spectrum of $H_e(x)$ and $\sigma_*(x)$ a closed subset of $\sigma_e(H_e(x))$ which is the union of two eigenvalues $\lambda_1$ and $\lambda_2$ of the same multiplicity $k$. We denote by $P_*(x)$ the spectral projector of $H_e(x)$ associated with $\sigma_*(x)$; then there exists an isometry $U : \mathrm{Ran}\,P_* \to L^2(\mathbb{R}^{3k_n}, \mathbb{C}^{2k})$. Let us suppose, as in [25], that the initial data $\Phi^h_0$ belongs to $\mathrm{Ran}(P_*)$ with $\|\Phi^h_0\|_{L^2} = 1$ and such that $(\|h^2\Delta\Phi^h_0\|_{L^2})_{h>0}$ is uniformly bounded. If $\sigma_*(x)$ is isolated from the rest of the spectrum, that is if
\[ d(\sigma_*(x), \sigma(H_e(x)) \setminus \sigma_*(x)) \geq d, \quad \forall x \in \mathbb{R}^{3k_n}, \]
then $\mathrm{Ran}(P_*)$ is said to be adiabatically protected: it is possible to approximate $\Phi^h$ by a Born–Oppenheimer solution modulo an error of order $h$. Namely, there exists a constant $C > 0$ such that
\[ \|\Phi^h(t) - \Phi^h_{BO}(t)\|_{L^2} \leq C(1 + |t|)h, \]
for all times $t \in \mathbb{R}$, where $\Phi^h_{BO}(t) = U^* \exp\left(-i\frac{t}{h}H^h_{BO}\right) U\Phi^h_0$, and $H^h_{BO}$ is of the form $H^h_{BO} = -\frac{h^2}{2}\Delta_x + V(x)$ for a $2k \times 2k$ matrix-valued potential $V$. Thus, it is possible to reduce the study of (1) to that of a system of $2k$ equations
\[ \begin{cases} ih\partial_t \psi^h = -\frac{h^2}{2}\Delta_x \psi^h + V(x)\psi^h, \\ \psi^h(0) = U\Phi^h_0. \end{cases} \tag{2} \]
This type of observation dates back to the 1920s and is originally attributed to Born, Fock and Oppenheimer. The reader can refer to [25] for a precise statement of this fact. We focus more precisely on the case where $\lambda_1(x)$ and $\lambda_2(x)$ cannot be separated because there exists $x_0$ such that
\[ \lambda_1(x_0) = \lambda_2(x_0). \]
In [15], G. Hagedorn derives normal forms for the matrix potential $V$ in such a context. Let us briefly recall his argument. It is classical to associate with $H^h_{\mathrm{mol}}$ its symmetry group $G$ and the subgroup $H$ of unitary elements of $G$. Two cases occur: either $G = H$, or $H$ is a subgroup of index 2 of $G$.
When $G = H$, standard group representation theory applies and one associates with the eigenvalues $\lambda_1$ and $\lambda_2$ a unique representation of $G$. These representations are either unitarily equivalent or not. If $H \neq G$, one uses the theory of co-representations (see [26] or [22]) of which there exist three types (see [22]). With $\lambda_1$ and $\lambda_2$, one associates co-representations of one of these three types. If these co-representations are of the same type, they are either unitarily equivalent or not, which gives six different situations. If not, they are of different types, and we have to deal with three new situations. Adding the two different cases which arise when $G = H$, we obtain eleven cases to study. For each of them, assuming that
the eigenvalues are of minimal multiplicity, Hagedorn derives in [15] normal forms which are diagonal as soon as the representations or co-representations associated with the eigenvalues are not unitarily equivalent, i.e. in seven of the derived cases. The four remaining normal forms obtained are not diagonal; they are of the form
\[ H_j = k + V_j(p), \quad j \in \{2, 3, 3', 5\}, \]
where $x \mapsto k(x)$ is a smooth scalar function, $x \mapsto p(x)$ is a smooth vector-valued function with $p \in \mathbb{R}^j$ for $j = 2, 3, 5$ and $p \in \mathbb{R}^3$ for $j = 3'$. Moreover the matrices $V_j$ are given by
\[ V_2(p) = \begin{pmatrix} p_1 & p_2 \\ p_2 & -p_1 \end{pmatrix}, \qquad V_3(p) = \begin{pmatrix} p_1 & p_2 + ip_3 \\ p_2 - ip_3 & -p_1 \end{pmatrix}, \]
\[ V_{3'}(p) = \begin{pmatrix} p_1 & p_2 + ip_3 & 0 & 0 \\ p_2 - ip_3 & -p_1 & 0 & 0 \\ 0 & 0 & p_1 & p_2 - ip_3 \\ 0 & 0 & p_2 + ip_3 & -p_1 \end{pmatrix}, \]
\[ V_5(p) = \begin{pmatrix} p_4 & 0 & p_0 + ip_1 & p_2 + ip_3 \\ 0 & p_4 & -p_2 + ip_3 & p_0 - ip_1 \\ p_0 - ip_1 & -p_2 - ip_3 & -p_4 & 0 \\ p_2 - ip_3 & p_0 + ip_1 & 0 & -p_4 \end{pmatrix}. \]
For $j = 2$ the crossing is said to be of type I in Hagedorn's classification, for $j = 3$ of type B, for $j = 3'$ of type K and for $j = 5$ of type J. For these four Hamiltonians, the eigenvalues are $k \pm |p|$. We shall say that these crossings are generic if
\[ dp \text{ is of maximal rank on } \{p = 0\}, \tag{A1} \]
i.e. of rank 2 for $j = 2$, of rank 3 for $j = 3, 3'$ and of rank 5 for $j = 5$. This explains why it is usual to refer to these crossings as codimension 2, 3 and 5 crossings, and it sheds light on the choice of the index $j$ we made. Since the earlier works of Landau and Zener (see [20] and [27]), it is well-known that energy crossings yield energy transfer at leading order between the two modes. This transfer has been precisely calculated by Hagedorn in [15] for Gaussian wave packets passing through the crossing with non-vanishing speed (see also [16, 19]). Actually, in [15], Hagedorn can reduce his study to linear functions $p$. Indeed, the wave packets he considers are localized at microlocal scale on one point, thus it is enough to deal with the Taylor expansion of $p$ at the point concerned by the crossing. Our aim here is to consider more general initial data, thus we cannot suppose that $p$ is linear without loss of generality. We shall describe the energy transfer by means of Wigner measures for initial data $(\psi^h_0)$ uniformly bounded in $L^2$, in the same spirit as in [7]–[9]. Actually, the
case of the Hamiltonians $H_2$ and $H_3$, for which the eigenvalues are of multiplicity 1, is dealt with in [8] and [9] respectively. Our aim here is to generalize these results to codimension 5 crossings and to the case of eigenvalues of multiplicity higher than 1. In the following we denote by $d$ the dimension of the set of nuclear configurations, $d = 3k_n$, and we consider the system $(S_j)$, $j \in \{2, 3, 3', 5\}$,
\[ \begin{cases} ih\partial_t \psi^h = -\frac{h^2}{2}\Delta_x \psi^h + H_j(x)\psi^h, & (t, x) \in \mathbb{R} \times \mathbb{R}^d, \\ \psi^h(0) = \psi^h_0, \end{cases} \tag{$S_j$} \]
with $(\psi^h_0) \in L^2(\mathbb{R}^d, \mathbb{C}^{N(j)})$, $N(2) = N(3) = 2$ and $N(3') = N(5) = 4$, $\|\psi^h_0\|_{L^2} \leq C$, $H_j(x) = k(x) + V_j(p(x))$. We suppose that (A1) holds and we aim at describing Wigner measures of $(\psi^h)$.

1.1. Wigner measures
We begin by briefly recalling basic facts about semi-classical measures. We consider an observable $a \in C_0^\infty(\mathbb{R}^{d+1} \times \mathbb{R}^{d+1}, \mathbb{C}^{N,N})$ and we associate with $a$ the semi-classical pseudo-differential operator of symbol $a$, which is expressed in Weyl quantization as
\[ \mathrm{op}_h(a)f(y) = (2\pi h)^{-d} \int a\left(\frac{y + y'}{2}, \eta\right) e^{\frac{i}{h}\eta \cdot (y - y')} f(y')\,dy'\,d\eta, \]
with $y = (t, x) \in \mathbb{R}_t \times \mathbb{R}^d_x$, $f \in L^2(\mathbb{R}^{d+1}, \mathbb{C}^N)$. If $(v^h)$ is a uniformly bounded family in $L^2(\mathbb{R}^{d+1}, \mathbb{C}^N)$, there exist a sequence $h_k$, $h_k \to 0$ as $k \to +\infty$, and a positive matrix-valued Radon measure $\mu$ on $T^*\mathbb{R}^{d+1}$ such that
\[ \forall a \in C_0^\infty(\mathbb{R}^{d+1} \times \mathbb{R}^{d+1}, \mathbb{C}^{N,N}), \quad (\mathrm{op}_{h_k}(a)v^{h_k} \mid v^{h_k}) \xrightarrow[k \to +\infty]{} \mathrm{tr}\,(\langle a, \mu \rangle). \]
The link with the energy density of $(v^{h_k})$ is given by
\[ w\text{-}\lim |v^{h_k}(x)|^2\,dx = \mathrm{tr} \int_{\xi \in \mathbb{R}^{d+1}} \mu(x, d\xi), \]
provided some assumption of $h$-oscillation is fulfilled (see [11]), namely
\[ \limsup_{h \to 0} \int_{|\xi| \geq R/h} |\hat{v}^h(\xi)|^2\,d\xi \xrightarrow[R \to +\infty]{} 0. \]
The reader can refer to [11] and [21] or to [12] for a survey on Wigner measures. We endow $T^*\mathbb{R}^{d+1}$ with the symplectic form
\[ \sigma = d\tau \wedge dt + d\xi \wedge dx, \]
and we denote by $\{f, g\}$ the Poisson bracket of functions $f$ and $g$,
\[ \{f, g\} = \nabla_{\tau,\xi} f \cdot \nabla_{t,x} g - \nabla_{t,x} f \cdot \nabla_{\tau,\xi} g. \]
The vector field $H_f$ is the Hamiltonian vector field associated with the function $f$, thus we have
\[ dg\,H_f = \{f, g\} = \sigma(H_f, H_g). \]
A family $(\psi^h)$ solution to $(S_j)$ belongs to $C(\mathbb{R}_t, L^2(\mathbb{R}^d, \mathbb{C}^{N(j)}))$ and thus to $L^2_{\mathrm{loc}}(\mathbb{R}^{d+1}_{t,x})$. We consider semi-classical measures $\mu$ of $(\psi^h)$ viewed as a function of $(t, x)$. The matrix-valued measure $\mu$ acts on $\mathbb{R}^{2d+2}_{t,x,\tau,\xi}$ and is supported on the characteristic set
\[ \Sigma = \left\{ |p|^2 = \left(\tau + k + \frac{1}{2}|\xi|^2\right)^2 \right\}. \]
Moreover, according to [12] and [13], $\mu$ decomposes as $\mu = \mu^+ + \mu^-$ with $\Pi^{\pm,j}\mu^\pm\Pi^{\pm,j} = \mu^\pm$, where $\Pi^{\pm,j}$ are the spectral projectors associated with the matrix $H_j$ for the eigenvalue $k \pm |p|$,
\[ \Pi^{\pm,j} = \frac{1}{2}\left(1 \pm \frac{1}{|p|}V_j(p)\right). \]
Moreover, $\mu^\pm$ are positive matrix-valued measures which satisfy the localization property
\[ \mathrm{Supp}(\mu^\pm) \subset \left\{ \tau + k + \frac{1}{2}|\xi|^2 \pm |p| = 0 \right\}. \]
Outside
\[ S = \left\{ p = 0, \ \tau + k + \frac{1}{2}|\xi|^2 = 0 \right\}, \]
these measures propagate along the Hamiltonian curves of
\[ \lambda^\pm = \tau + k + \frac{1}{2}|\xi|^2 \pm |p|, \]
and we have
\[ H_{\lambda^\pm}\mu^\pm = [F^{\pm,j}, \mu^\pm] \quad \text{outside } S, \tag{3} \]
with
\[ F^{\pm,j} = \left[\Pi^{\pm,j}, \left\{\tau + k + \frac{|\xi|^2}{2} \pm |p|, \ \Pi^{\pm,j}\right\}\right] = \frac{1}{4|p|}\left[V_j\left(\frac{p}{|p|}\right), V_j(E)\right], \tag{4} \]
where the latter formula comes from the application of the result of [13, p. 281]. Therefore, as long as the classical trajectories do not reach $S$, Wigner measures of $(\psi^h)$ are determined by the transport equations. The problem is that there exist classical trajectories which reach $S$ in finite time.
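The spectral projectors $\Pi^{\pm,j}$ used above can be checked numerically in the simplest complex case. The following is a minimal sketch for $j = 3$ only, assuming $p \neq 0$; the helper names (`v3`, `matmul`, `spectral_projectors`, `max_abs_diff`) are illustrative and are not taken from the paper.

```python
import math


def v3(p):
    """Matrix V_3(p) for p = (p1, p2, p3), as in the normal form H_3 = k + V_3(p)."""
    p1, p2, p3 = p
    return [[complex(p1, 0.0), complex(p2, p3)],
            [complex(p2, -p3), complex(-p1, 0.0)]]


def matmul(a, b):
    """Plain product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def spectral_projectors(p):
    """Return (Pi_plus, Pi_minus) = (1 +/- V_3(p)/|p|)/2; assumes p != 0."""
    norm = math.sqrt(sum(c * c for c in p))
    v = v3(p)
    ident = [[1.0, 0.0], [0.0, 1.0]]
    plus = [[0.5 * (ident[i][j] + v[i][j] / norm) for j in range(2)]
            for i in range(2)]
    minus = [[0.5 * (ident[i][j] - v[i][j] / norm) for j in range(2)]
             for i in range(2)]
    return plus, minus


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))
```

With these helpers one can confirm, for a sample $p$, that $\Pi^{+}$ is idempotent, that $\Pi^{+}\Pi^{-} = 0$, and that $V_3(p)\Pi^{+} = |p|\,\Pi^{+}$, which is the statement that $\Pi^{\pm}$ project on the eigenspaces of $H_3$ for $k \pm |p|$.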
1.2. The geometry of the crossing
The results of this section are corollaries of those proved in [9, Sec. 2] in a larger framework. However, for the convenience of the reader, we give a direct proof of Propositions 1 and 2 in the Appendix. In the following, we set $E = dp(x)\xi$ and we consider the assumption
\[ E := dp(x)\xi \neq 0. \tag{A2} \]
Note that if $d = j$, the condition (A2) simply reduces to $\xi \neq 0$ in view of (A1). In [15], $d = j$ and the wave packets considered are supposed to have non-vanishing speed at the crossing, which means that $\xi \neq 0$ at the crossing points, i.e. that (A2) holds at all the crossing points.

Proposition 1. Consider $\rho_0 \in S$ such that (A1) and (A2) hold in $\rho_0$, then there exist two unique curves $\rho^+_s$ and $\rho^-_s$ such that
\[ \dot{\rho}^\pm_s = H_{\lambda^\pm}(\rho^\pm_s), \quad \rho^\pm_0 = \rho_0. \]
Moreover, $(\rho^\pm_s)_{s \leq 0}$ can be smoothly continued by $(\rho^\mp_s)_{s \geq 0}$.

Consider $\rho_0 \in S$ and a neighborhood $\mathcal{V}$ of $\rho_0$ in $S$ such that (A1) and (A2) hold in $\mathcal{V}$. We define $J^{\pm,\mathrm{in}}$ as the union of the curves $(\rho^\pm_s)_{s<0}$ which reach $\mathcal{V}$, and $J^{\pm,\mathrm{out}}$ as the union of the curves $(\rho^\pm_s)_{s>0}$ issued from $\mathcal{V}$. Because of Proposition 1,
\[ J := J^{+,\mathrm{in}} \cup J^{-,\mathrm{out}}, \quad J' := J^{+,\mathrm{out}} \cup J^{-,\mathrm{in}}, \]
are smooth submanifolds of $T^*\mathbb{R}^{d+1}$. We recall that a submanifold $I$ of $T^*\mathbb{R}^{d+1}$ is said to be involutive if at any point $\rho \in I$, the vector subspace $TI|_\rho$ contains its orthogonal $TI|_\rho^\perp$ for the symplectic form $\sigma(\rho)$.

Proposition 2. The submanifolds $J$ and $J'$ are involutive submanifolds of $T^*\mathbb{R}^{d+1}$. Moreover, there exist smooth functions $u = u(t, x, \tau, \xi) \in \mathbb{S}^{j-1}$, $u' = u'(t, x, \tau, \xi) \in \mathbb{S}^{j-1}$ such that
\[ J = \{p = (\tau + k + |\xi|^2/2)u\}, \quad J' = \{p = (\tau + k + |\xi|^2/2)u'\}, \]
\[ u|_S = -u'|_S = \frac{E}{|E|}. \]

As in [7], the knowledge of the incoming semi-classical measure is not enough to describe the outgoing one. The distribution of the energy depends on the way $(\psi^h)$ concentrates on $J$ or $J'$ at the scale $\sqrt{h}$, which can be described by two-scale Wigner measures for involutive submanifolds. These measures, on which we shall focus now, have been introduced by Miller in [23] and studied in [7].

1.3. Two-scale Wigner measures for involutive submanifolds
Let us consider for a while a more general setting and recall results of [7]. Let $I$ be a codimension $m$ involutive submanifold of the cotangent space $T^*\mathbb{R}^D$. We
suppose that $I$ is given by a system of equations $f = 0$ where $f = (f_1, \cdots, f_m) \in C^\infty(\mathbb{R}^{2D}, \mathbb{R}^m)$, $\{f_j, f_k\} = 0$ for $j \neq k$ and $\mathrm{Rank}(df) = m$ on $f = 0$. Let us denote by $\overline{\mathbb{R}}^m$ the ball obtained by adding a sphere at infinity to $\mathbb{R}^m$. We consider the set $\mathcal{A}$ of symbols $a = a(z, \zeta, \eta) \in C^\infty(\mathbb{R}^D \times \mathbb{R}^D \times \mathbb{R}^m)$ which are uniformly compactly supported in the variables $(z, \zeta)$ with respect to $\eta$ and which can be extended as a function of $C^\infty(\mathbb{R}^D \times \mathbb{R}^D \times \overline{\mathbb{R}}^m)$ by
\[ a(z, \zeta, \infty\omega) = \lim_{R \to +\infty} a(z, \zeta, R\omega), \quad \forall \omega \in \mathbb{S}^{m-1}, \]
where the convergence is in $C^\infty$. Observe that one can equivalently consider symbols which coincide with homogeneous functions of degree 0 as soon as $|\eta| > R$ for some $R > 0$. With any $a \in \mathcal{A}$, we associate the two-scaled pseudo-differential operator,
\[ \mathrm{op}^I_h(a) := \mathrm{op}_h\left(a\left(z, \zeta, \frac{f(z, \zeta)}{\sqrt{h}}\right)\right). \]
By the Calderon–Vaillancourt Theorem (see [1]), the family of operators $\mathrm{op}^I_h(a)$ is a bounded family of bounded operators in $L^2(\mathbb{R}^D)$. If $(v^h)$ is a bounded family in $L^2(\mathbb{R}^D, \mathbb{C}^N)$, $N \in \mathbb{N}$, its concentration on $I$ at the scale $\sqrt{h}$ is characterized two-microlocally by a positive Radon measure on the compactified normal bundle to $I$, $\overline{N}(I)$, which describes the evolution as $h$ goes to 0 of
\[ K_h(a) := (\mathrm{op}^I_h(a)v^h \mid v^h). \]
Let us explain what the bundle $\overline{N}(I)$ is. We associate with $I$ its tangent bundle $TI$. Taking the quotient of the tangent space $T(T^*\mathbb{R}^D)|_\rho$ above a point $\rho$ of $I$ by $TI|_\rho$, we obtain the fiber above $\rho$ of the normal bundle to $I$, $N(I)$. Then, $\overline{N}(I)|_\rho$ is the closed $m$-dimensional ball obtained by adding a sphere at infinity to $N(I)|_\rho$. The choice of the equation $f$ yields local coordinates on $\overline{N}(I)|_\rho$ given by the continuation $\bar{\chi}$ of the isomorphism $\chi$,
\[ \chi : [\delta\rho] \mapsto \eta = df(\rho)\delta\rho. \]
If $\nu$ is a measure on $\overline{N}(I)$, we denote by $\nu_f$ the measure on $I \times \overline{\mathbb{R}}^m$ which is the image of $\nu$ by $\bar{\chi}$. Let us come back to the limit of $K_h(a)$ as studied in [7].

Proposition 3. There exist a sequence $h_k \to 0$ as $k \to +\infty$ and a positive Radon measure $\nu$ on $\overline{N}(I)$ such that for all $a \in \mathcal{A}$,
\[ K_{h_k}(a) \xrightarrow[k \to +\infty]{} \mathrm{tr} \int_{I \times \overline{\mathbb{R}}^m} a\,d\nu_f + \mathrm{tr} \int_{f \neq 0} a\left(x, \xi, \frac{f(x, \xi)}{|f(x, \xi)|}\infty\right) d\mu, \]
where $\mu$ is a semi-classical measure of $(v^h)$.

We point out that $\nu$ determines $\mu$ on $I$ by
\[ \mu \mathbf{1}_I = \int_{\overline{N}_{x,\xi}(I)} \nu(x, \xi, d\eta). \]
Proposition 3 is easy to prove when $I$ is a vector subspace of $T^*\mathbb{R}^D$. Then, one gets the geometric character of two-scale Wigner measures by a lemma of invariance
through canonical transforms. The proof is detailed in [7]. The same type of arguments is developed in [4] and [5] for measures associated with the concentration of a family on an involutive submanifold when no second scale is emphasized, or in [6] for two-scale Wigner measures in the case of concentration on a symplectic submanifold. Indeed, for families of solutions of p.d.e.'s, two-scale Wigner measures satisfy properties of localization and of propagation. Let us suppose that $(v^h)$ is a family of solutions to a scalar equation and that the involutive submanifold $I$ is such that $\Sigma \cap I$ is an involutive submanifold of the characteristic set $\Sigma$ and that the intersection is transverse. It is possible to study the concentration of $(v^h)$ on $I$ or on $J := \Sigma \cap I$, which gives, a priori, two different measures $\nu_I$ and $\nu_J$. It is proved in [7] that the two points of view are equivalent since both two-scale Wigner measures can be identified with one measure $\nu$ supported on the bundle $\overline{N}_\Sigma(J)$ which is obtained by the compactification of the fibers of $T\Sigma/TJ$. Moreover, $\nu$ is propagated by the linearized Hamiltonian flow of the principal symbol of the equation, transversally to $J$. We shall use these properties in Sec. 2.4 for calculating two-scale Wigner measures associated with families $(\psi^h)$ of solutions to $(S_j)$ and the involutive submanifolds $J$ and $J'$, which is now our purpose.

1.4. The Landau–Zener formula
We suppose that (A1) and (A2) hold in $\rho_0 \in S$. We denote by $\bar{\nu}$ (respectively $\bar{\nu}'$) the measures associated with $(\psi^h)$ and $J$ (respectively $J'$). We denote by $\Sigma^+$, $\Sigma^-$, $\dot{J}^{\pm,\mathrm{in}}$ and $\dot{J}^{\pm,\mathrm{out}}$ the sets
\[ \Sigma^\pm = \{\lambda^\pm = 0\}, \quad \dot{J}^{\pm,\mathrm{in}} = J^{\pm,\mathrm{in}} \setminus S, \quad \dot{J}^{\pm,\mathrm{out}} = J^{\pm,\mathrm{out}} \setminus S. \]
We consider the bundles over $\Sigma^\pm$, $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{in}})$ and $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{out}})$, obtained respectively by compactification of the fibers of $T\Sigma^\pm/T(\dot{J}^{\pm,\mathrm{in}})$ and of $T\Sigma^\pm/T(\dot{J}^{\pm,\mathrm{out}})$.
Because of the properties of localization of semi-classical measures, there exist scalar positive Radon measures $\nu^{\pm,\mathrm{in}}$ and $\nu^{\pm,\mathrm{out}}$ supported on $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{in}})$ and $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{out}})$ respectively such that
\[ \bar{\nu} = \nu^{+,\mathrm{in}} + \nu^{-,\mathrm{out}}, \quad \bar{\nu}' = \nu^{-,\mathrm{in}} + \nu^{+,\mathrm{out}}, \]
\[ \Pi^\pm\nu^{\pm,\mathrm{in}}\Pi^\pm = \nu^{\pm,\mathrm{in}}, \quad \Pi^\pm\nu^{\pm,\mathrm{out}}\Pi^\pm = \nu^{\pm,\mathrm{out}}. \]
Moreover, if $L^{\pm,\mathrm{in}}(H_{\lambda^\pm})$ (respectively $L^{\pm,\mathrm{out}}(H_{\lambda^\pm})$) is the linearized Hamiltonian flow of $\lambda^\pm$ transversally to $J^{\pm,\mathrm{in}}$ (respectively $J^{\pm,\mathrm{out}}$), in $\Sigma^\pm$, the measures $\nu^{\pm,\mathrm{in}}$ (respectively $\nu^{\pm,\mathrm{out}}$) satisfy
\[ L^{\pm,\mathrm{in}}(H_{\lambda^\pm})\nu^{\pm,\mathrm{in}} = [\nu^{\pm,\mathrm{in}}, F^{\pm,j}], \]
\[ L^{\pm,\mathrm{out}}(H_{\lambda^\pm})\nu^{\pm,\mathrm{out}} = [\nu^{\pm,\mathrm{out}}, F^{\pm,j}]. \]
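These propagation properties result from [12] and [7]. Observe that (4) and Proposition 2 yield that
\[ F^{\pm,j} = \frac{1}{4|p|}\,\mathrm{sgn}\left(\tau + k + \frac{|\xi|^2}{2}\right)[V_j(u), V_j(E)] \quad \text{on } J, \]
\[ F^{\pm,j} = \frac{1}{4|p|}\,\mathrm{sgn}\left(\tau + k + \frac{|\xi|^2}{2}\right)[V_j(u'), V_j(E)] \quad \text{on } J'. \]
Since $u$ and $u'$ are collinear to $E$ on $S$, $F^{\pm,j} = O(1)$ near $S$. Moreover, in view of the fact that the Hamiltonian flows are transverse to $S$, these measures have traces on $S$ in the set of distributions, that we denote by $\nu_S^{\pm,\mathrm{in}}$ and $\nu_S^{\pm,\mathrm{out}}$. These four traces can be identified with measures on one set in which we can study the existing link between $\nu_S^{\pm,\mathrm{out}}$ and $\nu_S^{\pm,\mathrm{in}}$.

Lemma 1. For $\rho_0 \in S$, consider $\mathcal{P}(\rho_0) \in \mathcal{L}(\mathbb{R}^j)$, the orthogonal projection on the normal hyperplane to $E(\rho_0)$ for the Euclidean structure of $\mathbb{R}^j$. The map
\[ \eta : T(T^*(\mathbb{R}^{d+1}))|_{\rho_0} \to \mathbb{R}^j, \quad \delta\rho \mapsto \mathcal{P}(\rho_0)\,(dp\,\delta\rho) \]
induces an isomorphism between the limits of the fibres of $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{in}})$ and $\overline{N}_{\Sigma^\pm}(\dot{J}^{\pm,\mathrm{out}})$ above a point $\rho$ which goes to $\rho_0$, and this hyperplane.

This lemma is proved in the Appendix. Using this isomorphism $\eta$, the connection between $\nu_S^{\pm,\mathrm{in}}$ and $\nu_S^{\pm,\mathrm{out}}$ near the point $\rho_0$ is described by the following theorem.

Theorem 1. If $\nu_S^{+,\mathrm{in}}$ and $\nu_S^{-,\mathrm{in}}$ are mutually singular, then
\[ \begin{cases} \nu_S^{+,\mathrm{out}} = T\,\nu_S^{-,\mathrm{in}} + (1 - T)\,R_j\,\nu_S^{+,\mathrm{in}}\,R_j, \\ \nu_S^{-,\mathrm{out}} = T\,\nu_S^{+,\mathrm{in}} + (1 - T)\,R_j\,\nu_S^{-,\mathrm{in}}\,R_j, \end{cases} \tag{5} \]
where, for $\eta = \mathcal{P}(dp\,\delta\rho)$,
\[ T = \exp\left(-\frac{\pi|\eta|^2}{|E|}\right), \]
and for $\eta \neq 0$,
\[ R_j = V_j\left(\frac{\eta}{|\eta|}\right). \]

Remark 1. We emphasize that the part of the energy two-microlocally localized above $\eta = 0$ propagates along the smooth curves $(\rho^\pm_s)_{s<0} \cup (\rho^\mp_s)_{s>0}$, while the part localized on $|\eta| = +\infty$ is completely reflected and is not at all transferred to the other mode.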
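To make the two regimes of Remark 1 concrete, here is a minimal numerical sketch of the transition coefficient $T = \exp(-\pi|\eta|^2/|E|)$ and of the mass balance obtained by taking traces in (5) at a fixed $\eta$. It assumes scalar incoming masses (the polarization matrices $R_j$ are not modelled; they drop out of the trace because $R_j^2 = 1$), and the function names are illustrative, not from the paper.

```python
import math


def transition_rate(eta, big_e):
    """Landau-Zener coefficient T = exp(-pi |eta|^2 / |E|); requires E != 0 (A2)."""
    eta_sq = sum(c * c for c in eta)
    norm_e = math.sqrt(sum(c * c for c in big_e))
    if norm_e == 0.0:
        raise ValueError("assumption (A2) requires E != 0")
    return math.exp(-math.pi * eta_sq / norm_e)


def outgoing_masses(in_plus, in_minus, eta, big_e):
    """Trace version of (5): scalar masses of the outgoing + and - modes."""
    t = transition_rate(eta, big_e)
    out_plus = t * in_minus + (1.0 - t) * in_plus
    out_minus = t * in_plus + (1.0 - t) * in_minus
    return out_plus, out_minus
```

For $\eta = 0$ the coefficient equals 1, so an incoming mass on the $+$ mode leaves entirely on the $-$ mode along the continued curve; for large $|\eta|$ the coefficient tends to 0 and the mass stays on its mode. In both cases the total mass is conserved.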
Remark 2. The results stated above can easily be generalized to time-dependent Hamiltonians (i.e. $k = k(t, x)$, $p = p(t, x)$) provided the assumptions (A1) and (A2) are satisfied with
\[ E = \partial_t p + \nabla_x p\,\xi. \]
With such a notation, Theorem 1 holds and the proof below applies easily.

Remark 3. Observe that if at $t = 0$ the initial data is localized on the $+$ mode and supported outside $S$, then we will have $\nu^{-,\mathrm{in}} = 0$ when a classical trajectory reaches $S$ for the first time. Therefore, the measures $\nu_S^{+,\mathrm{in}}$ and $\nu_S^{-,\mathrm{in}}$ will be mutually singular and Theorem 1 will apply. If one supposes moreover that $x \mapsto |p(x)|^2$ is convex, then one can prove that the trajectories associated with the mode $-$ and issued from points of $S$ where (A1) and (A2) hold never reach $S$ again. Therefore, the assumptions of Theorem 1 will be fulfilled above any point of $S$ as soon as (A1) and (A2) are satisfied, and the local result of Theorem 1 yields global results. This phenomenon has already been described by the author and Lasser in [10] for linear codimension 2 crossings.

The proof of this theorem is mainly inspired by [9]. It crucially uses a reduction theorem which turns $J$ and $J'$ into vector subspaces. These theorems for $H_2$ and $H_3$, $H_{3'}$ are direct consequences of the results of [8] (see also [2]) and [9] respectively. Thus, we focus on the proofs in the case of $H_5$. This article is organized as follows. Section 2 is devoted to the proof of the reduction theorem. In Sec. 3, we prove the reflection of the measure for $|\eta| = \infty$ and, in Sec. 4, the Landau–Zener formula for $|\eta| < \infty$. Finally, the results concerning the geometry of the crossing are proved in the Appendix.

Notations. For $\tilde{y} = (y_0, y') = (y_0, y_1, y_2, y_3) \in \mathbb{R}^4$, we denote by $q(\tilde{y})$ the quaternion
\[ q(\tilde{y}) = \begin{pmatrix} y_0 + iy_1 & y_2 + iy_3 \\ -y_2 + iy_3 & y_0 - iy_1 \end{pmatrix}. \]
We shall use the following relations
\[ \forall \tilde{y} \in \mathbb{R}^4, \quad q(\tilde{y}) + q(\tilde{y})^* = 2y_0, \]
\[ \forall \tilde{y} \in \mathbb{R}^4, \quad q(\tilde{y})q(\tilde{y})^* = |\tilde{y}|^2, \]
\[ \forall \tilde{y} \in \mathbb{R}^4, \ \forall \tilde{v} \in \mathbb{R}^4, \quad q(\tilde{v})q(\tilde{y}) = q(v_0y_0 - v' \cdot y', \ v_0y' + y_0v' - v' \wedge y'), \]
where $v' \wedge y' = (v_2y_3 - v_3y_2, \ v_3y_1 - v_1y_3, \ v_1y_2 - v_2y_1)$. For $y = (\tilde{y}, y_4) \in \mathbb{R}^5$, we set
\[ V(y) = \begin{pmatrix} y_4 & q(\tilde{y}) \\ q(\tilde{y})^* & -y_4 \end{pmatrix}. \]
Therefore, we have $H_5 = k + V(p)$.
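The first two quaternion relations, and the fact that $V(y)^2 = |y|^2$ (which is why the eigenvalues of $H_5$ are $k \pm |p|$), can be checked numerically. The following is a small sketch using plain nested lists; the helper names (`q`, `dagger`, `matmul`, `big_v`, `scaled_identity`, `max_abs_diff`) are illustrative and are not part of the paper.

```python
def q(y_tilde):
    """2x2 complex matrix q(y~) for y~ = (y0, y1, y2, y3)."""
    y0, y1, y2, y3 = y_tilde
    return [[complex(y0, y1), complex(y2, y3)],
            [complex(-y2, y3), complex(y0, -y1)]]


def dagger(a):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def matmul(a, b):
    """Plain matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def big_v(y):
    """4x4 matrix V(y) for y = (y0, y1, y2, y3, y4), written in 2x2 blocks."""
    *y_tilde, y4 = y
    top = q(y_tilde)
    bottom = dagger(top)
    return [[y4, 0, top[0][0], top[0][1]],
            [0, y4, top[1][0], top[1][1]],
            [bottom[0][0], bottom[0][1], -y4, 0],
            [bottom[1][0], bottom[1][1], 0, -y4]]


def scaled_identity(c, n):
    """c times the n x n identity matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))
```

For instance, with `y = (0.3, -1.2, 0.7, 2.0, -0.5)`, `matmul(big_v(y), big_v(y))` agrees with `scaled_identity(sum(c * c for c in y), 4)` up to rounding.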
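2. A Reduction Theorem in Codimension 5
We recall that if $\kappa$ is a canonical transform of $T^*\mathbb{R}^{d+1}$, there exists a semi-classical Fourier integral operator $K = K_h$, called associated with $\kappa$, which satisfies: $\forall f \in L^2(\mathbb{R}^{d+1})$, $\forall a \in C_0^\infty(\mathbb{R}^{2d+2})$,
\[ K^*\,\mathrm{op}_h(a)\,Kf = \mathrm{op}_h(a \circ \kappa)f + O(h^2)\|f\|_{L^2} \quad \text{in } L^2(\mathbb{R}^{d+1}). \tag{6} \]
The reader can refer to [24] for a complete study of semi-classical Fourier integral operators or to [7], where a self-contained approach is included and where this claim is proved.

Theorem 2. Let $\rho_0$ be such that (A1) and (A2) hold in $\rho_0$. There exist a local canonical transform $\kappa$ mapping a neighborhood of $\rho_0$ into a neighborhood $\Omega$ of 0,
\[ \kappa : (t, x, \tau, \xi) \mapsto (s, z, \sigma, \zeta), \quad \kappa(\rho_0) = 0, \]
a semi-classical Fourier integral operator $K$ associated with $\kappa$ and a matrix $A$ such that $v^h = K\,\mathrm{op}_h(A)\psi^h$ satisfies, for any $\phi \in C_0^\infty(\Omega)$,
\[ \mathrm{op}_h(\phi)\,\mathrm{op}_h\begin{pmatrix} -\sigma + s & q(\Gamma\tilde{\zeta}) \\ q^*(\Gamma\tilde{\zeta}) & -\sigma - s \end{pmatrix} v^h = O(h) \quad \text{in } L^2(\mathbb{R}^d), \tag{7} \]
where $\zeta = (\tilde{\zeta}, \zeta_4, \ldots, \zeta_{d-1}) \in \mathbb{R}^d$ with $\tilde{\zeta} = (\zeta_0, \zeta_1, \zeta_2, \zeta_3) = (\zeta_0, \zeta') \in \mathbb{R}^4$ and where $\Gamma$ is a smooth invertible matrix, $\Gamma \in C^\infty(\mathbb{R}^{2d+2}_{s,z,\sigma,\zeta}, \mathbb{C}^{4,4})$. Moreover if $I = \{\tilde{\zeta} = 0\}$, then we have
\[ J \cup J' = \{\sigma^2 = s^2\} \cap I, \]
\[ J^{\pm,\mathrm{in}} = \{\sigma \mp s = 0, \ s > 0\} \cap I, \quad J^{\pm,\mathrm{out}} = \{\sigma \pm s = 0, \ s < 0\} \cap I. \]

Remark 4. The above normal form is to be compared with the normal forms obtained for $(H_2)$, $(H_3)$ and thus $(H_{3'})$ in [8, 2] and [9]:
\[ N_2 = \begin{pmatrix} -\sigma + s & \alpha\zeta_1 \\ \alpha\zeta_1 & -\sigma - s \end{pmatrix} \quad \text{for } H_2, \]
\[ N_3 = \begin{pmatrix} -\sigma + s & \gamma_1\zeta_1 + \gamma_2\zeta_2 \\ \bar{\gamma}_1\zeta_1 + \bar{\gamma}_2\zeta_2 & -\sigma - s \end{pmatrix} \quad \text{for } H_3, \]
\[ N_{3'} = \begin{pmatrix} -\sigma + s & \gamma_1\zeta_1 + \gamma_2\zeta_2 & 0 & 0 \\ \bar{\gamma}_1\zeta_1 + \bar{\gamma}_2\zeta_2 & -\sigma - s & 0 & 0 \\ 0 & 0 & -\sigma + s & \bar{\gamma}_1\zeta_1 + \bar{\gamma}_2\zeta_2 \\ 0 & 0 & \gamma_1\zeta_1 + \gamma_2\zeta_2 & -\sigma - s \end{pmatrix} \quad \text{for } H_{3'}, \]
where the real-valued function $\alpha$ does not vanish and the complex-valued functions $\gamma_1$ and $\gamma_2$ satisfy $\mathrm{Im}(\gamma_1\bar{\gamma}_2) \neq 0$.

The proof of this theorem follows the proof of [9, Theorem 2] which states the reduction result for codimension 3 crossings. The first step consists of taking advantage of algebraic properties of $H_5$ to turn this Hamiltonian into a new one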
where equations of $J$ and $J'$ appear. Then, it remains to define the canonical transform and to reduce the study of $(S_5)$ to that of (7) by means of semi-classical pseudo-differential calculus. We shall emphasize the first step, which is specific to codimension 5, while the remainder of the proof is quite similar to the proof of [9, Theorem 2]. Finally, at the end of this section, we use Theorem 2 to translate Theorem 1 into the coordinates $(s, z, \sigma, \zeta)$ in order to obtain an equivalent statement, on the proof of which we shall focus in the following.

2.1. An algebraic lemma
Let $\rho_0$ be such that (A1) and (A2) hold in $\rho_0$. The results of this section are local results near $\rho_0$. We use the equations of $J$ and $J'$,
\[ J = \left\{ p = \left(\tau + k + \frac{1}{2}|\xi|^2\right)u \right\}, \quad J' = \left\{ p = \left(\tau + k + \frac{1}{2}|\xi|^2\right)u' \right\}. \]
Observe that if $e = \frac{u' - u}{|u' - u|}$ and
\[ L = \left\{ p - \left(\tau + k + \frac{1}{2}|\xi|^2\right)u = \lambda e; \ \lambda \in \mathbb{R} \right\}, \]
then $J \cup J' = L \cap \Sigma$. Moreover we have
\[ e|_S = -\frac{E}{|E|}. \tag{8} \]

Lemma 2. There exist a smooth matrix $A_1 = A_1(t, x, \tau, \xi)$, a smooth vector-valued function $\alpha = \alpha(t, x, \tau, \xi) = (\alpha_0, \alpha_1, \alpha_2, \alpha_3)$, a smooth complex-valued function $\theta = \theta(t, x, \tau, \xi)$ and equations of $L$, $g = (g_0, g_1, g_2, g_3)$, such that $\tau + \frac{1}{2}|\xi|^2 + H_5 = A_1^* H_5' A_1$ with
\[ H_5' = \tau + k + \frac{1}{2}|\xi|^2 + \alpha \cdot g + \begin{pmatrix} (1 - |\theta|^2)^{-1/2}\,p \cdot e & q(g) \\ q(g)^* & -(1 - |\theta|^2)^{-1/2}\,p \cdot e \end{pmatrix}. \]
Moreover $\theta|_S = 0$, $\alpha|_S = 0$ and $A_1^*|_S = A_1^{-1}|_S$ with
\[ \forall y \in \mathbb{R}^5, \quad (A_1)|_S\,V(y)\,(A_1)^*|_S = V\left(l(y), \ -y \cdot \frac{E}{|E|}\right), \tag{9} \]
where $l(y)$ is a vector of $\mathbb{R}^4$ consisting of the coordinates of $\mathcal{P}(y)$, the orthogonal projection of $y$ on the hyperplane normal to $E$, in an orthonormal basis of this hyperplane.

Proof. The proof relies on two lemmas. We consider the subset $\mathcal{U}$ of $U_4(\mathbb{R})$ defined by
\[ \mathcal{U} = \left\{ U(v) = \begin{pmatrix} v_4 & q(\tilde{v}) \\ -q(\tilde{v})^* & v_4 \end{pmatrix}; \ v = (\tilde{v}, v_4) = (v_0, v', v_4) = (v_0, v_1, v_2, v_3, v_4) \in \mathbb{S}^4 \right\}. \]
Lemma 3. For all $v \in \mathbf{S}^4$, there exists $R_v \in O(5)$ such that for all $y \in \mathbb{R}^5$, we have
\[ U(v)V(y)U(v)^* = V(R_v(y)), \]
with
\[ (R_v(y))_j = y_j - 2v_j (y\cdot v),\quad 0\le j\le 3, \qquad (R_v(y))_4 = -y_4 + 2v_4 (y\cdot v). \]
The proof of this lemma consists of straightforward calculations.

Consider now the vector $e = (e_0,e_1,e_2,e_3,e_4)$. If $e_4(\rho_0) \neq -1$, then $e_4 \neq -1$ near $\rho_0$ and the function $z = z(t,x,\tau,\xi) = (z_0,z_1,z_2,z_3,z_4)$ defined by
\[ z_4 = \sqrt{\frac{1+e_4}{2}}, \qquad z_j = \frac{1}{2z_4}\, e_j,\quad 0\le j\le 3, \]
is a smooth function valued in $\mathbf{S}^4$. Moreover we have for this special $z$,
\[ U(z)V(y)U(z)^* = V(\tilde l(y),\, y\cdot e), \quad \forall y\in\mathbb{R}^5, \]
where $\tilde l(y)\in\mathbb{R}^4$ are the coordinates in an orthonormal basis of the orthogonal projection of $y$ on the hyperplane normal to $e$.

Let us suppose now that $e_4(\rho_0) = -1$. Since $|e| = 1$, $e_0(\rho_0) = 0$ and $e_0 \neq -1$ near $\rho_0$. Then we use the fact that
\[ \forall y\in\mathbb{R}^5,\quad \frac12 \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} V(y) \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = V(y_4,y_1,y_2,y_3,y_0). \]
Therefore, if we consider the function $z$ defined by
\[ z_4 = \sqrt{\frac{1+e_0}{2}}, \qquad z_0 = \frac{1}{2z_4}\, e_4, \qquad z_j = \frac{1}{2z_4}\, e_j,\quad j\in\{1,2,3\}, \]
then $|z|^2 = 1$, $U(z)$ is smooth near $\rho_0$ and
\[ \frac12\, U(z) \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} V(y) \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} U(z)^* = V(\tilde l(y),\, y\cdot e), \quad \forall y\in\mathbb{R}^5, \]
where $\tilde l(y)$ again denotes the coordinates of the orthogonal projection of $y$ on the hyperplane normal to $e$ in an orthonormal basis. We conclude on the existence of a smooth unitary matrix $A_2 = A_2(t,x,\tau,\xi)$ such that
\[ A_2\Big(\tau + \frac{|\xi|^2}{2} + H_5\Big)A_2^* = \tau + k + \frac12|\xi|^2 + V(\tilde l(p),\, p\cdot e), \]
with
\[ \forall y\in\mathbb{R}^5,\quad |y|^2 = (y\cdot e)^2 + |\tilde l(y)|^2. \tag{10} \]
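As a sanity check on the first case above, one can verify directly that $z$ has unit norm and that the last component of $R_z(y)$ from Lemma 3 equals $y\cdot e$. The sketch below uses only $|e| = 1$, the definition $z_4 = \sqrt{(1+e_4)/2}$, $z_j = e_j/(2z_4)$ for $0\le j\le 3$, and the formula for $(R_v(y))_4$ in Lemma 3:

```latex
\[
\begin{aligned}
|z|^2 &= z_4^2 + \sum_{j=0}^{3}\frac{e_j^2}{4z_4^2}
       = \frac{1+e_4}{2} + \frac{1-e_4^2}{2(1+e_4)}
       = \frac{1+e_4}{2} + \frac{1-e_4}{2} = 1, \\
(R_z(y))_4 &= -y_4 + 2z_4\,(y\cdot z)
       = -y_4 + 2z_4^2\,y_4 + \sum_{j=0}^{3} y_j e_j
       = e_4\,y_4 + \sum_{j=0}^{3} y_j e_j = y\cdot e .
\end{aligned}
\]
```

The second case follows the same pattern after exchanging the roles of the indices 0 and 4.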
Let us denote by $\theta = \theta(t,x,\tau,\xi) = (\theta_0,\theta') = (\theta_0,\theta_1,\theta_2,\theta_3)$ the vector of $\mathbb{R}^4$ such that
\[ \theta = \tilde l(u). \]
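Observe that since $u|_S = -u'|_S = -e|_S$ and because of (10) we have
\[ \theta|_S = 0. \tag{11} \]
We set
\[ S(\theta) = \begin{pmatrix} 1 & q(\theta) \\ q(\theta)^* & 1 \end{pmatrix}. \]
We obtain
\[ A_2\Big(\tau + \frac{|\xi|^2}{2} + H_5\Big)A_2^* = \Big(\tau + k + \frac12|\xi|^2\Big) S(\theta) + V\Big(\tilde l\Big(p - \big(\tau + k + \tfrac12|\xi|^2\big)u\Big),\ p\cdot e\Big). \]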
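The passage from (10) to the identity above can be unpacked as follows. This is only a sketch: it assumes that $V$ and $\tilde l$ are linear and that $V(\tilde y, y_4)$ has the block form $\begin{pmatrix} y_4 & q(\tilde y) \\ q(\tilde y)^* & -y_4 \end{pmatrix}$, which is suggested by the shape of $S(\theta)$ but is not restated in this section. The shorthand $c$ is introduced here for convenience.

```latex
\[
\begin{aligned}
c &:= \tau + k + \tfrac12|\xi|^2, \qquad
\tilde l(p) = \tilde l(p - c\,u) + c\,\tilde l(u) = \tilde l(p - c\,u) + c\,\theta, \\
c + V(\tilde l(p),\, p\cdot e)
  &= c + c\,V(\theta, 0) + V(\tilde l(p - c\,u),\, p\cdot e) \\
  &= c \begin{pmatrix} 1 & q(\theta) \\ q(\theta)^* & 1 \end{pmatrix}
     + V(\tilde l(p - c\,u),\, p\cdot e)
   = c\,S(\theta) + V(\tilde l(p - c\,u),\, p\cdot e).
\end{aligned}
\]
```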
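Notice that the equations of $L$ appear in this Hamiltonian. However, we need to get rid of the matrix $S(\theta)$. Because of (11), the matrix $S(\theta)$ is invertible near $\rho_0$ and we have moreover
\[ S(\theta)^{-1} = (1-|\theta|^2)^{-1} S(-\theta), \tag{12} \]
\[ \sqrt{S(\theta)} = a\,S(b\theta) \quad\text{with}\quad a = \frac12\Big(\sqrt{1+|\theta|} + \sqrt{1-|\theta|}\Big),\quad b = \Big(1+\sqrt{1-|\theta|^2}\Big)^{-1}, \tag{13} \]
\[ \sqrt{S(\theta)}^{\,-1} = a' S(-b\theta), \quad\text{with}\quad a' = a\,(1-|\theta|^2)^{-1/2}. \tag{14} \]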
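Formulas (12)–(14) can be checked by hand. The sketch below assumes the algebraic property $q(v)q(v)^* = q(v)^*q(v) = |v|^2$ for $v\in\mathbb{R}^4$ (not restated in this section), so that $N(v) := S(v) - 1$ satisfies $N(v)^2 = |v|^2$. The abbreviations $N(v)$ and $r$ are introduced here only for the computation.

```latex
\[
\begin{aligned}
S(\theta)S(-\theta) &= (1+N(\theta))(1-N(\theta)) = 1 - |\theta|^2, \\
r := \sqrt{1-|\theta|^2}:\qquad
a^2 &= \tfrac14\big(2 + 2r\big) = \frac{1+r}{2}, \qquad b = \frac{1}{1+r}, \qquad 2a^2 b = 1, \\
\big(a\,S(b\theta)\big)^2 &= a^2\big(1 + b^2|\theta|^2\big) + 2a^2 b\,N(\theta)
   = \Big(\frac{1+r}{2} + \frac{1-r}{2}\Big) + N(\theta) = S(\theta), \\
1 - b^2|\theta|^2 &= 1 - \frac{1-r}{1+r} = \frac{2r}{1+r} = \frac{r}{a^2}, \qquad
\big(a\,S(b\theta)\big)^{-1} = \frac{S(-b\theta)}{a\,(1-b^2|\theta|^2)} = \frac{a}{r}\,S(-b\theta), \\
(a')^2\big(1 - b^2|\theta|^2\big) &= \frac{a^2}{r^2}\cdot\frac{r}{a^2} = (1-|\theta|^2)^{-1/2}.
\end{aligned}
\]
```

The last line is the identity invoked at the end of the proof of Lemma 2.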
Lemma 4. $\forall y = (y_0,y',y_4) = (\tilde y, y_4)\in\mathbb{R}^5$, $\forall v = (v_0,v')\in\mathbb{R}^4$,
\[ S(v)V(y)S(v) = 2\,v\cdot\tilde y + V\big((1-|v|^2)\tilde y + 2(v\cdot\tilde y)v,\ (1-|v|^2)y_4\big). \]
This lemma comes again from straightforward computations. We set
\[ \phi(\tilde y, v) = (1-|v|^2)\tilde y + 2(v\cdot\tilde y)v. \]
Observe that for $v$ such that $|v|\neq 1$, $\phi(\tilde y,v) = 0$ if and only if $\tilde y = 0$. Lemma 4 and Eq. (14) yield that if $A_1 = \sqrt{S(\theta)}\,A_2$,
\[ \tau + \frac12|\xi|^2 + H_5 = A_1^* H_5' A_1 \]
with
\[ H_5' = \tau + k + \frac12|\xi|^2 - 2(a')^2 b\,\theta\cdot\tilde l\Big(p - \big(\tau+k+\tfrac12|\xi|^2\big)u\Big) + (a')^2\, V\big(g,\ (1-b^2|\theta|^2)\,p\cdot e\big), \]
and
\[ g = \phi\Big(\tilde l\Big(p - \big(\tau+k+\tfrac12|\xi|^2\big)u\Big),\ -b\theta\Big). \tag{15} \]
Therefore, we have $g = 0$ if and only if $\tilde l\big(p - (\tau+k+\tfrac12|\xi|^2)u\big) = 0$. By definition of $\tilde l$, we obtain that $g = 0$ if and only if $p - (\tau+k+\tfrac12|\xi|^2)u$ is colinear to $e$, thus $g = 0$ is a system of equations of $L$. Therefore, there exists a vector-valued function $\alpha$ such that
\[ \alpha\cdot g = -2(a')^2 b\,\theta\cdot\tilde l\Big(p - \big(\tau+k+\tfrac12|\xi|^2\big)u\Big). \]
Moreover, Eq. (11) yields $\alpha|_S = 0$ and $(A_1)|_S = (A_2)|_S$. By definition of $A_2$ and $l$, and because $e|_S = -\frac{E}{|E|}$, we obtain
\[ (A_1)|_S V(y)(A_1)^*|_S = V(\tilde l|_S(y),\ y\cdot e|_S). \]
We set $\tilde l|_S = l$ and we check that $(a')^2(1-b^2|\theta|^2) = (1-|\theta|^2)^{-1/2}$, which completes the proof of Lemma 2.

2.2. The canonical transform

Proposition 4. There exist a function $\lambda$, $\lambda>0$ near $\rho_0$, and a local canonical transform
\[ \kappa : (t,x,\tau,\xi)\mapsto(s,z,\sigma,\zeta),\qquad \kappa(\rho_0) = 0, \]
such that
\[ \sigma = -\lambda\Big(\tau + k + \frac{|\xi|^2}{2} + \alpha\cdot g\Big),\qquad s = \lambda\,\frac{p\cdot e}{\sqrt{1-|\theta|^2}},\qquad \lambda g = \Gamma_1\tilde\zeta + (\sigma^2 - s^2)\beta, \tag{16} \]
where $\zeta = (\zeta_0,\zeta',\zeta_5,\dots,\zeta_d) = (\tilde\zeta,\zeta_5,\dots,\zeta_d)$, $\beta = \beta(s,z,\sigma,\zeta)$ is a smooth function valued in $\mathbb{R}^4$ and $\Gamma_1 = \Gamma_1(s,\sigma,z,\zeta)$ is a smooth invertible matrix. Moreover
\[ \lambda^2|_S = 1/|E|. \tag{17} \]
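Proof. We set
\[ \check\sigma = -\Big(\tau + k + \frac{|\xi|^2}{2} + \alpha\cdot g\Big),\qquad \check s = \frac{p\cdot e}{\sqrt{1-|\theta|^2}}. \]
Using (8), we obtain
\[ \{\check\sigma,\check s\}|_S = -\Big\{\tau + k + \frac{|\xi|^2}{2},\ p\cdot e\Big\} = |E| > 0. \]
Moreover, since $\check\sigma$ and $\check s$ vanish at $\rho_0$, we can use [18, Lemma 21.3.4] and we obtain the existence of a function $\lambda>0$ such that $\{\lambda\check\sigma,\lambda\check s\} = 1$.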
By the Darboux Theorem, we find functions $z$, $\zeta$ such that $(s,z,\sigma,\zeta)$ are local symplectic coordinates near $\rho_0$. Let us denote by $f$ the vector-valued function defined by
\[ f(s,z,\sigma,\zeta) = \lambda g(t,x,\tau,\xi). \]
Since $u\cdot e<0$ and $u'\cdot e>0$ by definition of $e$, we have
\[ \mathrm{sgn}(s) = -\mathrm{sgn}(\tau+k+|\xi|^2/2) \text{ on } J,\qquad \mathrm{sgn}(s) = \mathrm{sgn}(\tau+k+|\xi|^2/2) \text{ on } J'. \]
Moreover we have $\tau+k+|\xi|^2/2 = \mp|p|$ on $J^{\pm,in}\cup J^{\pm,out}$. Therefore
\[ \pm\sigma\ge 0 \text{ on } J^{\pm,in}\cup J^{\pm,out},\qquad s\ge 0 \text{ on } J^{+,in}\cup J^{-,in},\qquad s\le 0 \text{ on } J^{+,out}\cup J^{-,out}. \]
Moreover on $L$, we have $g = 0$, thus
\[ \lambda^2(\tau+k+|\xi|^2/2)^2 = \sigma^2, \]
\[ \lambda^2|p|^2 = \lambda^2(p\cdot e)^2 + \lambda^2|\tilde l(p)|^2 = s^2(1-|\theta|^2) + \lambda^2(\tau+k+|\xi|^2/2)^2|\tilde l(u)|^2 = s^2(1-|\theta|^2) + \sigma^2|\theta|^2. \]
Therefore $L\cap\{\sigma^2 = s^2\} = L\cap\Sigma$. Using $L\cap\Sigma = J\cup J'$, we obtain
\[ J^{\pm,in} = \{\sigma\mp s = 0,\ f = 0,\ s>0\},\qquad J^{\pm,out} = \{\sigma\pm s = 0,\ f = 0,\ s<0\}. \]
Since $J$ and $J'$ are involutive submanifolds, we have
\[ \{\sigma\pm s, f\} = 0 \quad\text{on}\quad \sigma\pm s = 0,\ f = 0. \]
Arguing as in the proof of [9, Lemma 4], we obtain that there exist an invertible matrix $\Gamma_2 = \Gamma_2(s,z,\sigma,\zeta)$ and a function $\beta$ such that
\[ f(s,z,\sigma,\zeta) = \Gamma_2(s,z,\sigma,\zeta)\,f(0,z,0,\zeta) + (\sigma^2-s^2)\,\beta(s,z,\sigma,\zeta). \]
Observe that the submanifold $\{f(0,z,0,\zeta) = 0\}$ is involutive. Therefore, there exist symplectic coordinates $z = z(z,\zeta)$, $\zeta = \zeta(z,\zeta)$ and an invertible smooth matrix $M = M(z,\zeta)$ such that
\[ f(0,z,0,\zeta) = M(z,\zeta)\tilde\zeta,\qquad \tilde\zeta = (\zeta_0,\zeta_1,\zeta_2,\zeta_3), \]
whence (16) with $\Gamma_1(s,z,\sigma,\zeta) = \Gamma_2(s,z,\sigma,\zeta)M(z,\zeta)$.
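The identification $L\cap\{\sigma^2 = s^2\} = L\cap\Sigma$ used in the proof above follows from the two displayed identities on $L$. The sketch below assumes that $\Sigma$ is described near $\rho_0$ by $(\tau+k+|\xi|^2/2)^2 = |p|^2$ (the crossing set is defined earlier in the paper, not in this section) and that $|\theta|<1$ near $\rho_0$, which holds by (11):

```latex
\[
\begin{aligned}
\text{on } L:\qquad
\lambda^2\Big(|p|^2 - \big(\tau+k+\tfrac12|\xi|^2\big)^2\Big)
  &= s^2(1-|\theta|^2) + \sigma^2|\theta|^2 - \sigma^2 \\
  &= (s^2-\sigma^2)\,(1-|\theta|^2), \\
\text{hence}\qquad
|p|^2 = \big(\tau+k+\tfrac12|\xi|^2\big)^2
  &\iff \sigma^2 = s^2 \qquad (\lambda>0,\ |\theta|<1).
\end{aligned}
\]
```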
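2.3. Proof of Theorem 2

We set
\[ A = \lambda^{-1/2}A_1 \tag{18} \]
and for $(s,z,\sigma,\zeta) = \kappa(t,x,\tau,\xi)$, we denote by $\tilde f$ the function defined by
\[ \tilde f(s,z,\sigma,\zeta) = \lambda g(t,x,\tau,\xi) = \Gamma_1\tilde\zeta + (\sigma^2-s^2)\beta. \]
We consider a semi-classical Fourier integral operator $U$ associated with $\kappa$ and we set
\[ v^h = U\,\mathrm{op}_h(A)\psi^h. \]
Then, pseudo-differential symbolic calculus and (6) yield that, in $L^2(\mathbb{R}^{d+1})$,
\[
\begin{aligned}
\mathrm{op}_h\begin{pmatrix} -\sigma+s & q(\tilde f) \\ q(\tilde f)^* & -\sigma-s \end{pmatrix} v^h
 &= U\,\mathrm{op}_h(\lambda H_5')\,v^h + O(h) \\
 &= U\,\mathrm{op}_h((A^*)^{-1})\,\mathrm{op}_h\Big(\tau+\frac{|\xi|^2}{2}+H_5\Big)\mathrm{op}_h(A^{-1})\,v^h + O(h) \\
 &= U\,\mathrm{op}_h((A^*)^{-1})\,\mathrm{op}_h\Big(\tau+\frac{|\xi|^2}{2}+H_5\Big)\psi^h + O(h) \\
 &= O(h).
\end{aligned}
\]
We argue now as in [8, pp. 19–20]. Consider the symplectic coordinates
\[ y = \sigma + s,\qquad \eta = \sigma - s, \]
and a semi-classical Fourier integral operator $K$ associated with it. We set $f = \Gamma_2\tilde\zeta$ and
\[ D = \sqrt2\begin{pmatrix} \eta & 0 \\ 0 & y \end{pmatrix},\qquad F = \begin{pmatrix} 0 & q(f) \\ q(f)^* & 0 \end{pmatrix},\qquad B = \begin{pmatrix} 0 & q(\beta) \\ q(\beta)^* & 0 \end{pmatrix}. \]
With these notations, we are left with the system
\[ \mathrm{op}_h(D - DBD)Kv^h = \mathrm{op}_h(F)Kv^h + O(h) \quad\text{in } L^2(\mathbb{R}^{d+1}). \]
Using pseudo-differential symbolic calculus, we have
\[ \mathrm{op}_h(D-DBD) = \mathrm{op}_h(D) - \mathrm{op}_h(D)\,\mathrm{op}_h(B)\,\mathrm{op}_h(D) + O(h). \]
Plugging the equation of $Kv^h$ into itself and using that
\[ \mathrm{op}_h(D)\,\mathrm{op}_h(B)\,\mathrm{op}_h(F) = \mathrm{op}_h(B)\,\mathrm{op}_h(F)\,\mathrm{op}_h(D) + O(h), \]
we get
\[ \big(1 - [\mathrm{op}_h(D)\,\mathrm{op}_h(B)]^2 - \mathrm{op}_h(B)\,\mathrm{op}_h(F)\big)\,\mathrm{op}_h(D)v^h = \mathrm{op}_h(F)v^h + O(h). \]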
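The step "plugging the equation into itself" can be made explicit. In the sketch below, $w$ stands for $Kv^h$ and $X := \mathrm{op}_h(D)w$; the operators $\mathrm{op}_h(D)$, $\mathrm{op}_h(B)$, $\mathrm{op}_h(F)$ are abbreviated to $D$, $B$, $F$. These abbreviations are introduced here only for readability, and all equalities hold modulo $O(h)$ in $L^2$.

```latex
\[
\begin{aligned}
X &= F w + D B\,X
   && \text{(the system, rewritten with } X = D w\text{)} \\
  &= F w + D B\,\big(F w + D B\,X\big)
   && \text{(substitute } X \text{ on the right-hand side)} \\
  &= F w + B F\,D w + (D B)^2 X
   && \text{(use } D B F = B F D + O(h)\text{)} \\
  &= F w + B F\,X + (D B)^2 X , \\
\text{hence}\quad
\big(1 - (D B)^2 - B F\big)\,X &= F w .
\end{aligned}
\]
```

This is the equation displayed above, with the operator $P = 1 - [\mathrm{op}_h(D)\,\mathrm{op}_h(B)]^2 - \mathrm{op}_h(B)\,\mathrm{op}_h(F)$ acting on $\mathrm{op}_h(D)Kv^h$.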
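Using a parametrix of the operator $P = 1 - [\mathrm{op}_h(D)\,\mathrm{op}_h(B)]^2 - \mathrm{op}_h(B)\,\mathrm{op}_h(F)$, which exists near 0 and is of the form
\[ P^{-1} = \begin{pmatrix} \mathcal{D} & 0 \\ 0 & \mathcal{D} \end{pmatrix}, \]
where $\mathcal{D}$ is a $2\times 2$ matrix, one proves the existence of $\Gamma$ with $\Gamma|_S = (\Gamma_1)|_S$ such that
\[ \mathrm{op}_h(D)Kv^h = \mathrm{op}_h\begin{pmatrix} 0 & q(\Gamma\tilde\zeta) \\ q(\Gamma\tilde\zeta)^* & 0 \end{pmatrix}Kv^h + O(h), \tag{19} \]
which completes the proof of Theorem 2.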
2.4. Consequences of the reduction theorem

Theorem 2 has two important consequences. The first one is that this theorem provides us with an involutive submanifold of $T^*\mathbb{R}^{d+1}$, $I = \{\tilde\zeta = 0\}$, such that
\[ J\cup J' = \Sigma\cap I, \]
with transverse intersection. Thus, if $\nu$ is the two-scale Wigner measure of $(\psi^h)$ for $I$, we can identify the measures $\bar\nu$ (respectively $\bar\nu'$) with $\nu$ above $\bar N(I)|_J$ (respectively $\bar N(I)|_{J'}$). Actually, if $\bar N_\Sigma(J)$ is the bundle above $\Sigma$ obtained by compactification of the fibers of $T\Sigma/TJ$, the canonical isomorphism from $T\Sigma/TJ$ onto $T(T^*\mathbb{R}^{d+1}|_J)/TJ_J$ extends to an isomorphism
\[ \theta_{\Sigma,I} : \bar N_\Sigma(J)\to\bar N(I), \]
that we use for identifying $\bar\nu$ and $\nu\,\mathbf{1}_{\bar N(I)|_J}$. As mentioned in Sec. 1, the reader can refer to [7, Lemma 4] for a proof of this fact. Because of this identification, we shall focus on calculating the two-scale Wigner measure of $(\psi^h)$ for $I$.

The second consequence is that, because of the invariance of two-scale Wigner measures through canonical transforms (see [7, Lemma 2]), it is equivalent to study the two-scale Wigner measure of $(v^h)$ for $I$, or of $(\psi^h)$ for the same set. Indeed, the two-scale Wigner measures $\tilde\nu$ of $(v^h)$ and $\nu$ of $(\psi^h)$ are linked by
\[ \forall a\in\mathcal{A},\quad \langle a,\tilde\nu\rangle = \langle a\circ\bar N(\kappa),\ A\nu A^*\rangle, \tag{20} \]
where for $(\rho,\eta)\in\bar N(I)$, $\bar N(\kappa)(\rho,\eta) = (\kappa(\rho),\eta)$. We concentrate now on translating Theorem 1 into a result on $\tilde\nu$. We set
\[ \tilde\Pi^\pm = \frac12\left(1 \pm \frac{1}{\sqrt{s^2+|\Gamma\tilde\zeta|^2}}\,V(\Gamma\tilde\zeta, s)\right). \tag{21} \]
We decompose $\tilde\nu$ as $\tilde\nu = \tilde\nu^+ + \tilde\nu^-$ with the commutation relations
\[ \tilde\Pi^\pm\tilde\nu^\pm\tilde\Pi^\pm = \tilde\nu^\pm, \tag{22} \]
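and the localization property: $\tilde\nu^\pm$ is supported on $J^{\pm,in}\cup J^{\pm,out}$. Let us denote by $\tilde\nu^{\pm,in}$ (respectively $\tilde\nu^{\pm,out}$) the traces of $\tilde\nu^\pm$ on $s = 0^+$ (respectively $s = 0^-$). In view of (20), we have
\[ \tilde\nu^{\pm,out}\circ\bar N(\kappa)^{-1} = A|_S\,\nu^{\pm,out}A^*|_S,\qquad \tilde\nu^{\pm,in}\circ\bar N(\kappa)^{-1} = A|_S\,\nu^{\pm,in}A^*|_S. \]
In the coordinates $(s,z,\sigma,\zeta)$ we choose the equations of $I$, $\tilde\zeta = 0$. Let $\rho\in S$; this equation generates coordinates $\tilde\eta\in\bar{\mathbb{R}}^4$ on $\bar N(I)|_\rho$. In the coordinates $(t,x,\tau,\xi)$, we characterize $\bar N(I)|_\rho$ as in Lemma 1. We aim to express the transformation $\bar N(\kappa)$ in these systems of coordinates. Note that, in view of (19) and Proposition 4, we have
\[ \Gamma|_S\,d\tilde\zeta = \lambda\,dg|_S. \]
Since $g = \tilde l(p - (\tau+k+|\xi|^2/2)u)$ with $u|_S = e|_S = -\frac{E}{|E|}$, we have for $\delta\rho\in TI|_\rho$, $\rho\in S$,
\[ \Gamma(\rho)\,d\tilde\zeta = \lambda\,dg(\rho)\delta\rho = \lambda\,l(dp(\rho)\delta\rho), \]
where $l(dp(\rho)\delta\rho)$ are the coordinates of $P(dp(\rho)\delta\rho)$, the orthogonal projection of $dp(\rho)\delta\rho$ on the hyperplane normal to $E$, in the orthonormal basis of this hyperplane introduced in Lemma 2. We denote by $\tilde\eta$ the coordinate on $N(I)$ induced by the choice of the equations $\tilde\zeta$ of $I$ and we obtain that Eqs. (5) read
\[
\begin{cases}
\tilde\nu^{+,out} = \tilde T\,\tilde\nu^{-,in} + (1-\tilde T)\,\tilde R\,\tilde\nu^{+,in}\tilde R, \\
\tilde\nu^{-,out} = \tilde T\,\tilde\nu^{+,in} + (1-\tilde T)\,\tilde R\,\tilde\nu^{-,in}\tilde R,
\end{cases} \tag{23}
\]
with
\[ \tilde T = T\circ\bar N(\kappa),\qquad \tilde R = A|_S\,R\,A^{-1}|_S\circ\bar N(\kappa). \]
By (18), we get
\[ \tilde R = A_1|_S\,R\,A_1^{-1}|_S\circ\bar N(\kappa) = A_1|_S\,R\,A_1^*|_S\circ\bar N(\kappa). \]
Therefore, (17) and (9) yield
\[ \tilde T(z,\zeta_5,\dots,\zeta_{d-1},\tilde\eta) = \exp(-|\Gamma|_S\tilde\eta|^2), \]
\[ \tilde R(z,\zeta_5,\dots,\zeta_{d-1},\tilde\eta) = \frac{1}{|\Gamma|_S\tilde\eta|}\begin{pmatrix} 0 & q(\Gamma|_S\tilde\eta) \\ q(\Gamma|_S\tilde\eta)^* & 0 \end{pmatrix}. \]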
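Two elementary consequences of these formulas are worth recording. The sketch below assumes, as before, that $q(v)q(v)^* = q(v)^*q(v) = |v|^2$ for $v\in\mathbb{R}^4$ (a property of $q$ not restated in this section); under that assumption $\tilde R$ is an involution and the system (23) conserves the total trace.

```latex
\[
\begin{aligned}
\tilde R^{\,2} &= \frac{1}{|\Gamma|_S\tilde\eta|^2}
  \begin{pmatrix} q(\Gamma|_S\tilde\eta)\,q(\Gamma|_S\tilde\eta)^* & 0 \\
                  0 & q(\Gamma|_S\tilde\eta)^*\,q(\Gamma|_S\tilde\eta) \end{pmatrix} = 1,
\qquad
\mathrm{tr}\big(\tilde R\,\tilde\nu\,\tilde R\big) = \mathrm{tr}\big(\tilde\nu\,\tilde R^{\,2}\big) = \mathrm{tr}\,\tilde\nu, \\
\mathrm{tr}\,\tilde\nu^{+,out} + \mathrm{tr}\,\tilde\nu^{-,out}
  &= \tilde T\,\big(\mathrm{tr}\,\tilde\nu^{-,in} + \mathrm{tr}\,\tilde\nu^{+,in}\big)
   + (1-\tilde T)\,\big(\mathrm{tr}\,\tilde\nu^{+,in} + \mathrm{tr}\,\tilde\nu^{-,in}\big) \\
  &= \mathrm{tr}\,\tilde\nu^{+,in} + \mathrm{tr}\,\tilde\nu^{-,in}.
\end{aligned}
\]
```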
In the following sections, we drop the tilde on $\tilde\eta$ and we focus on proving these formulas, for $|\eta| = \infty$ first, then for finite $\eta$. Before closing this section, let us state a last result concerning the system (7). The dependence on the variable $\sigma$ of the function $\Gamma$ does not prevent us from dealing with system (7) as a system of evolution equations. Actually, arguing as in [9, Proposition 5], we can prove the following hyperbolic estimate.

Proposition 5. Let $\phi\in C_0^\infty(\mathbb{R}^2)$, $\phi(0) = 1$, then there exist $h_0>0$ and $\delta_0>0$ such that for any $\delta<\delta_0$, the family $\Big(\mathrm{op}_h\Big(\phi\Big(\frac{|\tilde\zeta|^2}{\delta},s\Big)\Big)v^h\Big)_h$
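Moreover, again as in [9], the following corollary is straightforward and will be useful in the following.

Corollary 1. There exist $\delta_0>0$ and $h_0>0$ such that if $b_h\in C_0^\infty(\mathbb{R}^{2d+2})$ is supported in $\{s^2+|\tilde\zeta|^2<\delta_0^2\}$, there exists a constant $C$ such that for all $h<h_0$,
\[ \big|\big(\mathrm{op}_h(b_h)v^h \,|\, v^h\big)\big| \le C\int_{-\infty}^{+\infty}\ \sup_{\gamma+|\beta|\le N}\ \sup_{z,\sigma,\zeta}\ \big|\partial_z^\beta\partial_\sigma^\gamma b_h(s,z,\sigma,\zeta)\big|\,ds. \]

3. Analysis at Infinity

In this section, we aim at proving Eqs. (23) at infinity, i.e.
\[ \tilde\nu^{+,out} = \tilde R\,\tilde\nu^{+,in}\tilde R,\qquad \tilde\nu^{-,out} = \tilde R\,\tilde\nu^{-,in}\tilde R \quad\text{on } |\eta| = +\infty. \tag{24} \]
We proceed in the same spirit as in [7] and [9], but serious modifications are induced by the fact that the eigenvalues of the matrix $V$ are of multiplicity 2, while the multiplicity was exactly 1 in the references above. Thus, the measures $\tilde\nu^{\pm,out}$ and $\tilde\nu^{\pm,in}$ are matrix-valued. Because of (22), they have a special polarization. Observe that
\[
\begin{aligned}
\tilde\Pi^+ &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \text{ above } J^{+,in}, &
\tilde\Pi^+ &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \text{ above } J^{+,out}, \\
\tilde\Pi^- &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \text{ above } J^{-,in}, &
\tilde\Pi^- &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \text{ above } J^{-,out}.
\end{aligned} \tag{25}
\]
Therefore, the polarization of $\tilde\nu^{\pm,in}$ is not the same as the one of $\tilde\nu^{\pm,out}$. This explains the presence of the matrix $\tilde R$ in (24).

We proceed in two steps. First, we reduce (24) to a statement on the family $(v^h)$ by use of suitably chosen symbols. Then, in a second step, we prove this statement by Weyl–Hörmander pseudo-differential calculus. Let us first introduce some notation and a technical lemma. We set
\[ \lambda = \sqrt{s^2+|\Gamma\tilde\zeta|^2},\qquad A = V(\Gamma\tilde\zeta, s), \]
and we denote by $(f^h)$ the family such that, locally near 0,
\[ \frac{h}{i}\partial_s v^h = \mathrm{op}_h(A)v^h + hf^h, \]
with $(\mathrm{op}_h(\phi)f^h)$ bounded in $L^2_{s,z}$ for any $\phi$ compactly supported sufficiently near 0. Then, for $\omega\in\mathbf{S}^1$, we denote by $X_\omega$ the unit eigenvector of $A$ for the eigenvalue $\lambda$ defined for $s\neq 0$ or $\tilde\zeta\neq 0$ by
\[ X_\omega = \frac{1}{\sqrt{2\lambda(\lambda-s)}}\,\big(q(\Gamma\tilde\zeta)\omega,\ (\lambda-s)\omega\big) = \frac{1}{\sqrt{2\lambda(\lambda-s)}}\,(\lambda+A)\begin{pmatrix} 0 \\ \omega \end{pmatrix}. \]
We shall use the following properties of $X_\omega$.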
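Lemma 5.
(1) $X_\omega = (0,\omega) + O(|\tilde\zeta|)$ if $s<0$, and $X_\omega = \Big(q\Big(\frac{\Gamma\tilde\zeta}{|\Gamma\tilde\zeta|}\Big)\omega,\ 0\Big) + O(|\tilde\zeta|)$ if $s>0$.
(2) There exists a matrix-valued function $\phi$, homogeneous of degree 0 in $s$, $\lambda$ and $\tilde\zeta$, such that if $(\omega,\omega')$ is an orthonormal basis of $\mathbb{R}^2$,
\[ \partial_sX_\omega = \frac{1}{2\lambda}(\partial_sA - \partial_s\lambda)X_\omega + (\phi\,\omega\cdot\omega')X_{\omega'}. \]

Proof. Point (1) is straightforward by Taylor expansion. Let us focus on point (2). Consider $\omega,\omega'\in\mathbf{S}^1$ such that $\omega\cdot\omega' = 0$. The vectors $X_\omega$ and $X_{\omega'}$ form an orthonormal basis of the subspace of all the eigenvectors of $A$ for the eigenvalue $\lambda$. Moreover, using that $X_\omega\cdot X_{\omega'} = 0$, we obtain that
\[ \partial_sX_\omega\cdot X_{\omega'} = \frac{1}{2\lambda(\lambda-s)}\,(\lambda+A)(\partial_s\lambda+\partial_sA)\begin{pmatrix} 0 \\ \omega \end{pmatrix}\cdot\begin{pmatrix} 0 \\ \omega' \end{pmatrix}. \]
We get, through straightforward computation,
\[ \partial_sX_\omega\cdot X_{\omega'} = \frac{1}{2\lambda(\lambda-s)}\,\big(q(\Gamma\tilde\zeta)^*q(\partial_s\Gamma\tilde\zeta)\,\omega\cdot\omega'\big). \]
We set
\[ \phi(s,z,\sigma,\zeta) = \frac{1}{2\lambda(\lambda-s)}\,q(\Gamma\tilde\zeta)^*q(\partial_s\Gamma\tilde\zeta). \tag{26} \]
Since $|X_\omega| = 1$, we also have $\partial_sX_\omega\cdot X_\omega = 0$. Thus, the vector $\partial_sX_\omega$ is of the form
\[ \partial_sX_\omega = (\phi\,\omega\cdot\omega')X_{\omega'} + Y, \]
where $Y$ is an eigenvector of $A$ for the eigenvalue $-\lambda$. Differentiating the relation $AX_\omega = \lambda X_\omega$ with respect to $s$ yields
\[ (\partial_s\lambda - \partial_sA)X_\omega = (A-\lambda)\partial_sX_\omega = -2\lambda Y = -2\lambda\big(\partial_sX_\omega - (\phi\,\omega\cdot\omega')X_{\omega'}\big), \]
whence (2) of Lemma 5.

3.1. An equivalent statement to (24)

We consider $\omega,\omega'\in\mathbf{S}^1$ and the matrix
\[ \Pi_{\omega,\omega'} = X_\omega\otimes\bar X_{\omega'}. \]
We define the following $4\times 4$ matrices
\[ \Pi^{in}_{\omega,\omega'} = \begin{pmatrix} q\big(\frac{\Gamma\tilde\zeta}{|\Gamma\tilde\zeta|}\big)\omega \\ 0 \end{pmatrix}\otimes\overline{\begin{pmatrix} q\big(\frac{\Gamma\tilde\zeta}{|\Gamma\tilde\zeta|}\big)\omega' \\ 0 \end{pmatrix}},\qquad \Pi^{out}_{\omega,\omega'} = \begin{pmatrix} 0 \\ \omega \end{pmatrix}\otimes\begin{pmatrix} 0 \\ \bar\omega' \end{pmatrix}, \]
so that we have
\[ \tilde R\,\Pi^{in}_{\omega,\omega'}\tilde R = \Pi^{out}_{\omega,\omega'}, \tag{27} \]
\[ \Pi_{\omega,\omega'} = \Pi^\infty_{\omega,\omega'} + O(|\tilde\zeta|) \tag{28} \]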
with
\[ \Pi^\infty_{\omega,\omega'} = \Pi^{in}_{\omega,\omega'} \text{ if } s>0,\qquad \Pi^\infty_{\omega,\omega'} = \Pi^{out}_{\omega,\omega'} \text{ if } s<0. \]
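The two regimes in (28) can be traced back to the explicit formula for $X_\omega$. The following sketch uses only $\lambda = \sqrt{s^2+|\Gamma\tilde\zeta|^2}$ and $X_\omega = (2\lambda(\lambda-s))^{-1/2}\big(q(\Gamma\tilde\zeta)\omega,\ (\lambda-s)\omega\big)$, for fixed $s\neq 0$ and $\tilde\zeta\to 0$:

```latex
\[
\begin{aligned}
s<0:&\quad \lambda - s = \lambda + |s| \to 2|s|,\quad 2\lambda(\lambda-s)\to 4s^2,
      \quad\text{so}\quad X_\omega \to \frac{(0,\ 2|s|\,\omega)}{2|s|} = (0,\omega); \\
s>0:&\quad \lambda - s = \frac{|\Gamma\tilde\zeta|^2}{\lambda+s},\quad
      2\lambda(\lambda-s) = \frac{2\lambda}{\lambda+s}\,|\Gamma\tilde\zeta|^2
      = |\Gamma\tilde\zeta|^2\,\big(1+O(|\tilde\zeta|^2)\big), \\
    &\quad \frac{\lambda-s}{\sqrt{2\lambda(\lambda-s)}} = \frac{|\Gamma\tilde\zeta|}{\sqrt{2\lambda(\lambda+s)}} = O(|\tilde\zeta|),
      \quad\text{so}\quad X_\omega = \Big(q\Big(\tfrac{\Gamma\tilde\zeta}{|\Gamma\tilde\zeta|}\Big)\omega,\ 0\Big) + O(|\tilde\zeta|).
\end{aligned}
\]
```

Taking the tensor product $X_\omega\otimes\bar X_{\omega'}$ in each regime gives $\Pi^{out}_{\omega,\omega'}$ for $s<0$ and $\Pi^{in}_{\omega,\omega'}$ for $s>0$, up to $O(|\tilde\zeta|)$.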
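Equation (24) links the traces on $S$ of $\tilde\nu^\pm$. We shall use the following lemma, which translates (24) into a result on $\tilde\nu^\pm$ itself and not only on its traces.

Lemma 6. If $\rho\in C_0^\infty(\mathbb{R})$, $\rho\equiv 1$ near 0, the relations (24) are equivalent to the fact that for all $a_0\in C_0^\infty(\mathbb{R}^{2d+1}_{z,\sigma,\zeta}\times\mathbf{S}^1)$ and for all $\omega,\omega'\in\mathbf{S}^1$,
\[ \lim_{\varepsilon\to 0}\ \mathrm{tr}\Big\langle \frac1\varepsilon\,\rho'\Big(\frac s\varepsilon\Big)\,a_0\Big(z,\sigma,\zeta,\frac{\eta}{|\eta|}\Big)\mathbf{1}_{|\eta|=+\infty}\,\Pi^\infty_{\omega,\omega'},\ \tilde\nu^+\Big\rangle = 0. \tag{29} \]

Proof. We crucially use that the matrices $\tilde\nu^+$ and $\tilde\nu^-$ satisfy the commutation relations (22). We write the $4\times 4$ matrix $\tilde\nu^\pm$ by blocks of $2\times 2$ matrices
\[ \tilde\nu^\pm = \begin{pmatrix} A^\pm & B^\pm \\ C^\pm & D^\pm \end{pmatrix}. \]
Because of (25), $\tilde\nu^+$ is of the form
\[ \tilde\nu^+ = \begin{pmatrix} A^+ & 0 \\ 0 & 0 \end{pmatrix} \text{ above } J^{+,in},\qquad \tilde\nu^+ = \begin{pmatrix} 0 & 0 \\ 0 & D^+ \end{pmatrix} \text{ above } J^{+,out}. \]
Equation (24) for the mode $+$ is equivalent to the fact that $D^+ = |\Gamma|_S\tilde\eta|^{-2}\,q(\Gamma|_S\tilde\eta)^*A^+q(\Gamma|_S\tilde\eta)$ on $|\eta| = +\infty$. Since any $2\times 2$ matrix $M$ is completely determined by the knowledge, for any $\omega,\omega'$ in $\mathbf{S}^1$, of $\mathrm{tr}(M\,\omega\otimes\bar\omega')$, this is equivalent to the fact that for all $\omega,\omega'$ in $\mathbf{S}^1$ and for $|\eta| = +\infty$,
\[ \mathrm{tr}\big(q(\Gamma|_S\tilde\eta)^*A^+q(\Gamma|_S\tilde\eta)\,\omega\otimes\bar\omega'\big) = \mathrm{tr}\big(D^+\omega\otimes\bar\omega'\big). \]
Observe that above $J^{+,out}$,
\[ \mathrm{tr}(D^+\omega\otimes\bar\omega') = \mathrm{tr}\,\tilde\nu^+\begin{pmatrix} 0 \\ \omega \end{pmatrix}\otimes\begin{pmatrix} 0 \\ \bar\omega' \end{pmatrix} = \mathrm{tr}\,\tilde\nu^+\Pi^{out}_{\omega,\omega'}, \]
and above $J^{+,in}$,
\[ \mathrm{tr}\big(q(\Gamma|_S\tilde\eta)^*A^+q(\Gamma|_S\tilde\eta)\,\omega\otimes\bar\omega'\big) = \mathrm{tr}\,\tilde R\,\tilde\nu^+\tilde R\begin{pmatrix} 0 \\ \omega \end{pmatrix}\otimes\begin{pmatrix} 0 \\ \bar\omega' \end{pmatrix} = \mathrm{tr}\big(\tilde R\,\tilde\nu^{+,in}\tilde R\,\Pi^{out}_{\omega,\omega'}\big). \]
Therefore, Eq. (24) is equivalent to the fact that, for any choice of $\omega$ and $\omega'$,
\[ \mathrm{tr}\big(\tilde\nu^{+,out}\Pi^{out}_{\omega,\omega'}\big) = \mathrm{tr}\big(\tilde R\,\tilde\nu^{+,in}\tilde R\,\Pi^{out}_{\omega,\omega'}\big). \]
In view of
\[ \mathrm{tr}\big(\tilde R\,\tilde\nu^{+,in}\tilde R\,\Pi^{out}_{\omega,\omega'}\big) = \mathrm{tr}\big(\tilde\nu^{+,in}\tilde R\,\Pi^{out}_{\omega,\omega'}\tilde R\big) \quad\text{and}\quad \tilde R\,\Pi^{out}_{\omega,\omega'}\tilde R = \Pi^{in}_{\omega,\omega'}, \]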
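(24) is equivalent to the fact that for any choice of $\omega$ and $\omega'$, on $|\eta| = +\infty$,
\[ \mathrm{tr}\big(\tilde\nu^{+,out}\Pi^{out}_{\omega,\omega'}\big) = \mathrm{tr}\big(\tilde\nu^{+,in}\Pi^{in}_{\omega,\omega'}\big). \]
Because of the value of $\Pi^\infty_{\omega,\omega'}$ for $s>0$ and $s<0$, the latter equation reads
\[ \mathrm{tr}\big(\tilde\nu^{+,out}\Pi^\infty_{\omega,\omega'}\big) = \mathrm{tr}\big(\tilde\nu^{+,in}\Pi^\infty_{\omega,\omega'}\big) \quad\text{on } |\eta| = +\infty. \]
Using that $\tilde\nu^{+,in}$ and $\tilde\nu^{+,out}$ are the traces on $s = 0^-$ and $s = 0^+$ of $\tilde\nu^+$, we are led to proving that for $\rho$ as in the lemma, the measure
\[ \frac1\varepsilon\,\rho'\Big(\frac s\varepsilon\Big)\mathbf{1}_{|\eta|=+\infty}\,\mathrm{tr}\big(\Pi^\infty_{\omega,\omega'}\tilde\nu^+\big) \]
goes to 0 as $\varepsilon$ goes to 0 in the set of measures. Whence (29).

For $a_0$ as before, $\delta<\delta_0$ (where $\delta_0$ is defined in Proposition 5), $R>0$, $\varepsilon>0$, we define $a$ as
\[ a(s,z,\sigma,\zeta,\eta) = \rho\Big(\frac s\varepsilon\Big)\,a_0\Big(z,\sigma,\zeta,\frac{\eta}{|\eta|}\Big)\,\rho\Big(\frac{|\tilde\zeta|^2}{\delta}\Big)\Big(1-\rho\Big(\frac{|\eta|}{R}\Big)\Big). \]
The function $a$ is a symbol of $\mathcal{A}$. Actually, $a$ is smooth despite the term $\frac{\eta}{|\eta|}$ because $\eta\neq 0$ on $\mathrm{Supp}(a)$. For the same reason, the function
\[ q_h(s,\sigma,z,\zeta) = a\Big(s,z,\sigma,\zeta,\frac{\tilde\zeta}{\sqrt h}\Big)\Pi_{\omega,\omega'} \]
is smooth. Thus the operator $\mathrm{op}_h(q_h)$ is well-defined. We shall denote it by $\mathrm{op}_h^I(a\Pi_{\omega,\omega'})$ even though $a\Pi_{\omega,\omega'}\notin\mathcal{A}$. This fact compels us to work in a larger class of symbols than $\mathcal{A}$. Since $\Pi_{\omega,\omega'}$ is a homogeneous function of degree 0 of the variable $(s,\Gamma\tilde\zeta)$, we have the estimate
\[ \big|\partial^\beta_{\sigma,z,\zeta_5,\dots,\zeta_d}\,\partial^\alpha_{s,\tilde\zeta}\,q_h\big| \le C_{\alpha,\beta}(\varepsilon,\delta)\Big(\frac{1}{R\sqrt h}\Big)^{|\alpha|}, \]
where we implicitly assumed $R\sqrt h\le 1$. This estimate allows us to use Weyl–Hörmander symbolic calculus, for which we refer to [18, Secs. 18.4–18.6]. The symbol $q_h$ belongs to the class $S(1,g_h)$ where $g_h$ is the metric
\[ g_h = dz^2 + \frac{ds^2}{R^2h} + h^2\big(d\sigma^2 + d\zeta_5^2 + \dots + d\zeta_d^2\big) + \frac{h}{R^2}\,d\tilde\zeta^2. \]
Since $\frac{g_h}{g_h^\sigma}\le\Big(\frac{\sqrt h}{R}\Big)^2$, the gain of this symbolic calculus is $\frac{\sqrt h}{R}$. Observe that, on the other hand, $\partial_sa\,\Pi_{\omega,\omega'}\in\mathcal{A}$ and, by definition of $\rho$, $\mathbf{1}_{\pm s>0}\,\partial_sa\,\Pi_{\omega,\omega'}\in\mathcal{A}$. Moreover, because of (27), we have
\[ \Big\langle \mathrm{tr}\,\tilde\nu^+\mathbf{1}_{|\eta|=\infty}\mathbf{1}_{s>0},\ \frac1\varepsilon\,\rho'\Big(\frac s\varepsilon\Big)a_0\Pi^\infty_{\omega,\omega'}\Big\rangle = \lim_{R\to\infty}\lim_{\delta\to 0}\lim_{h\to 0}\big(\mathrm{op}_h^I(\partial_sa\,\Pi_{\omega,\omega'})v^h \,|\, v^h\big). \]
We obtain that (29) is equivalent to
\[ \lim_{\varepsilon\to 0}\lim_{R\to\infty}\lim_{\delta\to 0}\lim_{h\to 0}\big(\mathrm{op}_h^I(\partial_sa\,\Pi_{\omega,\omega'})v^h \,|\, v^h\big) = 0. \tag{30} \]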
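3.2. Proof of (30)

The proof of this claim crucially uses the Weyl–Hörmander metric $g_h$ defined above and Corollary 1. It is inspired by [9, Sec. 4.2], to which the reader can refer. We decompose
\[ L_{\varepsilon,R,\delta,h}(a_0) := -\big(\mathrm{op}_h^I(\partial_sa\,\Pi_{\omega,\omega'})v^h \,|\, v^h\big) \]
as $L_{\varepsilon,R,\delta,h}(a_0) = L^1+L^2+L^3$ with
\[
\begin{aligned}
L^1 &= \frac ih\left(\left[\mathrm{op}_h^I(a\Pi_{\omega,\omega'}),\ \mathrm{op}_h\begin{pmatrix} s & q(\Gamma\tilde\zeta) \\ q(\Gamma\tilde\zeta)^* & -s \end{pmatrix}\right]v^h \,\Big|\, v^h\right), \\
L^2 &= \big(\mathrm{op}_h^I(a\,\partial_s\Pi_{\omega,\omega'})v^h \,|\, v^h\big), \\
L^3 &= i\big(\mathrm{op}_h^I(a\Pi_{\omega,\omega'})f^h \,|\, v^h\big) - i\big(\mathrm{op}_h^I(a\Pi_{\omega,\omega'})v^h \,|\, f^h\big).
\end{aligned}
\]
Then, we prove the convergence to 0 of $L^1$, $L^2$ and $L^3$. For $b = b(s,z,\sigma,\zeta,\eta)\in C^\infty$, we set
\[ b_h^\flat(s,z,\sigma,\zeta) = b\Big(s,z,\sigma,\zeta,\frac{\tilde\zeta}{\sqrt h}\Big),\qquad b_h^\sharp(s,z,\sigma,\zeta) = b\big(s,z,h\sigma,h\zeta,\sqrt h\,\tilde\zeta\big), \]
so that we have for $b\in\mathcal{A}$,
\[ \mathrm{op}_h^I(b) = \mathrm{op}_h(b_h^\flat) = \mathrm{op}_1(b_h^\sharp). \]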
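The decomposition $L = L^1+L^2+L^3$ can be recovered from the evolution equation $\frac hi\partial_sv^h = \mathrm{op}_h(A)v^h + hf^h$. The sketch below writes $Q := \mathrm{op}_h^I(a\Pi_{\omega,\omega'})$ and $Q' := \mathrm{op}_h^I(\partial_s(a\Pi_{\omega,\omega'}))$ (shorthands introduced here), and assumes that $\mathrm{op}_h(A)$ is symmetric, that $\big[\tfrac hi\partial_s, Q\big] = \tfrac hi Q'$, and that the pairing is the $L^2_{s,z}$ scalar product with no boundary terms in $s$:

```latex
\[
\begin{aligned}
(Q' v^h \,|\, v^h)
 &= \frac ih\Big(\big[\tfrac hi\partial_s,\ Q\big]v^h \,\Big|\, v^h\Big)
  = \frac ih\Big[\big(Qv^h \,\big|\, \tfrac hi\partial_sv^h\big) - \big(Q\,\tfrac hi\partial_sv^h \,\big|\, v^h\big)\Big] \\
 &= \frac ih\big([\mathrm{op}_h(A),\ Q]\,v^h \,\big|\, v^h\big)
    + i\,(Qv^h \,|\, f^h) - i\,(Qf^h \,|\, v^h), \\
-\big(\mathrm{op}_h^I(\partial_sa\,\Pi_{\omega,\omega'})v^h \,|\, v^h\big)
 &= -(Q'v^h \,|\, v^h) + \big(\mathrm{op}_h^I(a\,\partial_s\Pi_{\omega,\omega'})v^h \,|\, v^h\big)
  = L^1 + L^3 + L^2 .
\end{aligned}
\]
```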
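• We begin with $L^3$. By the Weyl–Hörmander symbolic calculus, if $\chi\in C_0^\infty(\Omega)$ with $\chi = 1$ near the support of $a$, we have
\[ \mathrm{op}_h^I(a\Pi_{\omega,\omega'}) = \mathrm{op}_h^I(a\chi\Pi_{\omega,\omega'}) = \mathrm{op}_h^I(a\Pi_{\omega,\omega'})\,\mathrm{op}_h(\chi) + \mathrm{op}_1\Big(S\Big(\Big(\frac{\sqrt h}{R}\Big)^N, g_h\Big)\Big), \]
for all $N\in\mathbb{N}$. On the other hand, we can apply Corollary 1 to the symbol $b_h = a_h^\flat\Pi_{\omega,\omega'}$. Using the explicit expressions of $a$ and of $\Pi_{\omega,\omega'}$, we get
\[ |L^3| \le C\int_{-\infty}^{+\infty}\rho\Big(\frac s\varepsilon\Big)ds + O\Big(\Big(\frac{\sqrt h}{R}\Big)^N\Big). \]
• Let us consider now $L^1$. Note that $A_h^\sharp\in S(\lambda_h^\sharp,g_h)$, therefore by symbolic calculus, we obtain
\[ L^1 = \frac12\Big(\mathrm{op}_h\big(\{a_h^\flat\tilde\Pi_{\omega,\omega'},A\} - \{A,a_h^\flat\tilde\Pi_{\omega,\omega'}\}\big)v^h \,\Big|\, v^h\Big) + O\Big(\frac{1}{R^2}\Big). \]
We set
\[ b_h = \{a_h^\flat\tilde\Pi_{\omega,\omega'},A\} - \{A,a_h^\flat\tilde\Pi_{\omega,\omega'}\}. \tag{31} \]
Observe that $\Pi_{\omega,\omega'} = U(s,\lambda,\Gamma\tilde\zeta)$ where $U = U(r,t,X)$ is homogeneous of degree 0. Therefore, $\partial_s\Pi_{\omega,\omega'}$ and $\nabla_{\tilde\zeta}\Pi_{\omega,\omega'}$ are homogeneous of degree $-1$ in the variables $s$, $\lambda$ and $\tilde\zeta$. However, $\partial_\sigma\Pi_{\omega,\omega'}$ and $\nabla_z\Pi_{\omega,\omega'}$ have a better degree of homogeneity: they are homogeneous functions of degree 0 in $\lambda$, $s$ and $\tilde\zeta$. Indeed, we have
\[ \partial_\sigma\Pi_{\omega,\omega'} = \partial_\sigma\lambda\ \partial_rU(s,\lambda,\Gamma\tilde\zeta) + \nabla_XU(s,\lambda,\Gamma\tilde\zeta)\,\partial_\sigma\Gamma\,\tilde\zeta, \]
with $\partial_\sigma\lambda = \lambda^{-1}\,\partial_\sigma\Gamma\tilde\zeta\cdot\Gamma\tilde\zeta$. Similar relations hold for $\partial_z\Pi_{\omega,\omega'}$. Note that, in (31), the derivatives $\partial_s\Pi_{\omega,\omega'}$ and $\nabla_{\tilde\zeta}\Pi_{\omega,\omega'}$ appear with a factor $\tilde\zeta_j$, which compensates the $-1$ degree of homogeneity of these functions. Therefore, applying Corollary 1, we obtain the estimate
\[ \big|\big(\mathrm{op}_h(b_h)v^h, v^h\big)\big| \le C\int_{-\infty}^{+\infty}\rho\Big(\frac s\varepsilon\Big)ds = O(\varepsilon). \]
Thus
\[ \limsup_{(\varepsilon,\delta)\to(0,0)}\ \limsup_{R\to+\infty}\ \limsup_{h\to 0}\ |L^1| = 0. \]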
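• Finally, let us deal with the remainder term $L^2$. We claim that if $(\omega,\omega')$ is an orthonormal basis of $\mathbb{R}^2$,
\[
\begin{aligned}
\partial_s\Pi_{\omega,\omega} &= \Big[A,\ \frac{A}{2\lambda^2}\,\partial_s\Pi_{\omega,\omega}\Big] + (\phi\,\omega\cdot\omega)(\Pi_{\omega',\omega}+\Pi_{\omega,\omega'}), \\
\partial_s\Pi_{\omega,\omega'} &= \Big[A,\ \frac{A}{2\lambda^2}\,\partial_s\Pi_{\omega,\omega'}\Big] + (\phi\,\omega\cdot\omega')(\Pi_{\omega,\omega}+\Pi_{\omega',\omega'}).
\end{aligned} \tag{32}
\]
Indeed, because of Lemma 5(2), if $M = \frac{1}{2\lambda}(\partial_sA - \partial_s\lambda)$, we have for $\omega\cdot\omega' = 0$,
\[
\begin{aligned}
\partial_s\Pi_{\omega,\omega'} &= M\Pi_{\omega,\omega'} + \Pi_{\omega,\omega'}M + (\phi\,\omega\cdot\omega')(\Pi_{\omega,\omega}+\Pi_{\omega',\omega'}), \\
\partial_s\Pi_{\omega,\omega} &= M\Pi_{\omega,\omega} + \Pi_{\omega,\omega}M + (\phi\,\omega\cdot\omega)(\Pi_{\omega',\omega}+\Pi_{\omega,\omega'}).
\end{aligned}
\]
Observe that since
\[ \partial_s(A^2) = \partial_s(\lambda^2) = A\,\partial_sA + \partial_sA\,A = 2\lambda\,\partial_s\lambda \quad\text{and}\quad \lambda - A = 2\lambda\tilde\Pi^-, \]
we have $AM+MA = 4\,\partial_s\lambda\,\tilde\Pi^-$. Therefore, in view of
\[ A\Pi_{\omega,\omega'} = \Pi_{\omega,\omega'}A = \lambda\Pi_{\omega,\omega'},\qquad A\Pi_{\omega,\omega} = \Pi_{\omega,\omega}A = \lambda\Pi_{\omega,\omega},\qquad \tilde\Pi^-\Pi_{u,v} = \Pi_{u,v}\tilde\Pi^- = 0,\quad u,v\in\{\omega,\omega'\}, \]
we obtain
\[
\begin{aligned}
A\,\partial_s\Pi_{\omega,\omega'}A &= -\lambda^2\partial_s\Pi_{\omega,\omega'} + 2\lambda^2(\phi\,\omega\cdot\omega')(\Pi_{\omega,\omega}+\Pi_{\omega',\omega'}), \\
A\,\partial_s\Pi_{\omega,\omega}A &= -\lambda^2\partial_s\Pi_{\omega,\omega} + 2\lambda^2(\phi\,\omega\cdot\omega)(\Pi_{\omega,\omega'}+\Pi_{\omega',\omega}).
\end{aligned}
\]
Whence (32). This latter equation allows us to conclude for $L^2$. Indeed, the homogeneity of $\phi$ and $\Pi_{\omega,\omega'}$ in $s$, $\lambda$ and $\tilde\zeta$ yields that
\[ \limsup_{(\varepsilon,\delta)\to(0,0)}\ \limsup_{R\to+\infty}\ \limsup_{h\to 0}\ \big(\mathrm{op}_h^I(a(\phi\,\omega\cdot\omega')\Pi_{\omega,\omega'})v^h \,|\, v^h\big) = 0. \]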
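Moreover, we transform the bracket part of $\partial_s\Pi_{\omega,\omega'}$ as in [9] so that we can use again the equation. In the metric $g_h$, $\lambda_h^\sharp\ge C\sqrt h\,R$ on $\mathrm{Supp}(a_h^\sharp)$ and
\[ (a\lambda^{-2}A)_h^\sharp\,\partial_s\Pi_{\omega,\omega'}\in S\Big(\frac{1}{\sqrt h\,R\,\lambda_h^\sharp},\ g_h\Big). \]
We obtain
\[ \mathrm{op}_h^I\Big[A,\ a\frac{A}{2\lambda^2}\,\partial_s\Pi_{\omega,\omega}\Big] - \frac12\,\mathrm{op}_h(A)\,\mathrm{op}_h^I\big(a\lambda^{-2}A\,\partial_s\Pi_{\omega,\omega'}\big) + \frac12\,\mathrm{op}_h^I\big(a\lambda^{-2}A\,\partial_s\Pi_{\omega,\omega'}\big)\,\mathrm{op}_h(A) \in \mathrm{op}_1\Big(S\Big(\frac{1}{R^2},\ g_h\Big)\Big). \]
Therefore, we can use again the equation and we get
\[ \Big(\mathrm{op}_h^I\Big[A,\ a\frac{A}{2\lambda^2}\,\partial_s\Pi_{\omega,\omega}\Big]v^h \,\Big|\, v^h\Big) = O\Big(\frac{1}{R^2}\Big) + O(h) - \frac{h}{2i}\Big(\mathrm{op}_h^I\big(\partial_s\big(a\lambda^{-2}A\,\partial_s\Pi_{\omega,\omega'}\big)\big)v^h \,\Big|\, v^h\Big). \]
Since $\big|\partial^\beta_{\sigma,z}\big(\partial_s(a\lambda^{-2}A\,\partial_s\Pi_{\omega,\omega'})\big)_h^\flat\big| \le \frac{C}{(s^2+hR^2)^{3/2}}$, as $h$ goes to 0,
\[ \Big(\mathrm{op}_h^I\Big[A,\ a\frac{A}{2\lambda^2}\,\partial_s\Pi_{\omega,\omega}\Big]v^h \,\Big|\, v^h\Big) = O\Big(\frac{1}{R^2}\Big) + o(1) + Ch\int_{-\infty}^{+\infty}\frac{ds}{(s^2+hR^2)^{3/2}} = o(1) + O\Big(\frac{1}{R^2}\Big). \]
Hence, $\limsup_{R\to+\infty}\limsup_{h\to 0}L^2 = 0$. This completes the proof of (23) for $|\eta| = +\infty$.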
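Two computations used in the treatment of $L^2$ can be spelled out. The sketch below writes $X = \partial_s\Pi_{\omega,\omega'}$ and $C = (\phi\,\omega\cdot\omega')(\Pi_{\omega,\omega}+\Pi_{\omega',\omega'})$ (shorthands introduced here), and uses $A^2 = \lambda^2$ together with the relation $AXA = -\lambda^2X + 2\lambda^2C$ obtained in the text; the second line is the elementary integral behind the $O(1/R^2)$ bound.

```latex
\[
\begin{aligned}
\Big[A,\ \frac{A}{2\lambda^2}\,X\Big]
  &= \frac{A^2X - AXA}{2\lambda^2}
   = \frac{\lambda^2X + \lambda^2X - 2\lambda^2C}{2\lambda^2}
   = X - C
  \quad\Longrightarrow\quad X = \Big[A,\ \frac{A}{2\lambda^2}\,X\Big] + C, \\
h\int_{-\infty}^{+\infty}\frac{ds}{(s^2+hR^2)^{3/2}}
  &= h\left[\frac{s}{hR^2\sqrt{s^2+hR^2}}\right]_{-\infty}^{+\infty}
   = \frac{2}{R^2}.
\end{aligned}
\]
```

The first line is the second relation in (32); the relation for $\partial_s\Pi_{\omega,\omega}$ is obtained in the same way.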
4. Analysis at Finite Distance

In this section, we aim at proving the Landau–Zener formula (23) for $\{|\eta|<+\infty\}$. We proceed in three steps. We begin by stating a normal form which holds in any ball $B\subset\mathbb{R}^4_\eta$. Then we are reduced to dealing with some abstract scattering problem that can be solved explicitly. This allows us to obtain the Landau–Zener formula for the measure $\tilde\nu$ and then for $\nu$.

4.1. A normal form at finite distance

Proposition 6. For any ball $B\subset\mathbb{R}^4_\eta$, there exists a matrix $C\in C_0^\infty(\mathbb{R}^{2d+4}_{s,z,\sigma,\zeta,\eta})$ such that if $u^h = (1+\sqrt h\,\mathrm{op}_h^I(C))v^h$, then, in $L^2(\mathbb{R}^{d+1})$,
\[ \forall a\in C_0^\infty(\mathbb{R}^{2d+2}\times B),\quad \mathrm{op}_h^I(a)\,\mathrm{op}_h\begin{pmatrix} -\sigma+s & q(\Gamma|_S\tilde\zeta) \\ q(\Gamma|_S\tilde\zeta)^* & -\sigma-s \end{pmatrix}u^h = O(h). \]

Remark 5. Note that $(u^h)$ and $(v^h)$ have the same two-scale Wigner measures on $I$.
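Proof. We set $J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\in\mathbb{R}^{4,4}$, $\Gamma' = \Gamma(s,z,0,\zeta)$ and $\Gamma'' = \Gamma(0,z,0,\zeta)$,
\[ \tilde H_5 = \begin{pmatrix} -\sigma+s & q(\Gamma\tilde\zeta) \\ q(\Gamma\tilde\zeta)^* & -\sigma-s \end{pmatrix},\qquad \tilde H_5' = \begin{pmatrix} -\sigma+s & q(\Gamma'\tilde\zeta) \\ q(\Gamma'\tilde\zeta)^* & -\sigma-s \end{pmatrix},\qquad \tilde H_5'' = \begin{pmatrix} -\sigma+s & q(\Gamma''\tilde\zeta) \\ q(\Gamma''\tilde\zeta)^* & -\sigma-s \end{pmatrix}. \]
We prove the following lemma.

Lemma 7. There exist two smooth matrices $C_1 = C_1(s,z,\sigma,\zeta,\eta)$ and $C_2 = C_2(s,z,\sigma,\zeta,\eta)$, $C_1,C_2\in C_0^\infty(\mathbb{R}^{2d+4})$, such that for all $a\in C_0^\infty(\mathbb{R}^{2d+2}\times B)$,
\[ \Big\|\mathrm{op}_h^I(a)\Big[\big(1+\sqrt h\,\mathrm{op}_h^I(JC_1J)\big)\,\mathrm{op}_h(\tilde H_5) - \mathrm{op}_h(\tilde H_5')\big(1+\sqrt h\,\mathrm{op}_h^I(C_1)\big)\Big]\Big\|_{\mathcal{L}(L^2)} = O(h), \tag{33} \]
\[ \Big\|\mathrm{op}_h^I(a)\Big[\big(1+\sqrt h\,\mathrm{op}_h^I(C_2)\big)\,\mathrm{op}_h(\tilde H_5') - \mathrm{op}_h(\tilde H_5'')\big(1+\sqrt h\,\mathrm{op}_h^I(C_2)\big)\Big]\Big\|_{\mathcal{L}(L^2)} = O(h). \tag{34} \]

This lemma yields Proposition 6 thanks to the following remark. Actually, observe that if $a$ is compactly supported in all the variables $s,z,\sigma,\zeta,\eta$, then we have in $\mathcal{L}(L^2)$,
\[ \mathrm{op}_h^I(a)\,\mathrm{op}_h(\zeta_jb) = \sqrt h\,\mathrm{op}_h^I(a\eta_jb) + O(\sqrt h). \]
Hence
\[ \big\|\mathrm{op}_h^I(a)\,\mathrm{op}_h(\zeta_jb)\big\|_{\mathcal{L}(L^2)} = O(\sqrt h), \tag{35} \]
for $j\in\{0,1,2,3\}$ and $b\in C_0^\infty(\mathbb{R}^{2d+2})$. Therefore, writing $\Gamma\tilde\zeta = \Gamma|_S\tilde\zeta + O(|\zeta|^2)$, we obtain that for any $a$ compactly supported in all the variables,
\[ \big\|\mathrm{op}_h^I(a)\,\mathrm{op}_h(q(\Gamma\tilde\zeta)) - \mathrm{op}_h^I(a)\,\mathrm{op}_h(q(\Gamma|_S\tilde\zeta))\big\|_{\mathcal{L}(L^2)} = O(h). \]
Therefore, Eqs. (33) and (34) yield Proposition 6.

Proof of Lemma 7. Equation (33) is equivalent to
\[ \Big\|\mathrm{op}_h^I(a)\Big[\big(1+\sqrt h\,\mathrm{op}_h^I(C_1)\big)\,\mathrm{op}_h(J\tilde H_5) - \mathrm{op}_h(J\tilde H_5')\big(1+\sqrt h\,\mathrm{op}_h^I(C_1)\big)\Big]\Big\|_{\mathcal{L}(L^2)} = O(h). \]
Note that $J\tilde H_5 = \begin{pmatrix} -\sigma+s & q(\Gamma\tilde\zeta) \\ -q(\Gamma\tilde\zeta)^* & \sigma+s \end{pmatrix}$. We use symbolic calculus to expand the left-hand side in powers of $\sqrt h$. Because of (35), we obtain that $C_1$ must satisfy
\[ \sigma[J,C_1] = \begin{pmatrix} 0 & q((\Gamma'-\Gamma)\eta) \\ -q((\Gamma'-\Gamma)\eta)^* & 0 \end{pmatrix}. \]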
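Therefore, we get
\[ C_1 = \frac{1}{2\sigma}\begin{pmatrix} 0 & q((\Gamma'-\Gamma)\eta) \\ q((\Gamma'-\Gamma)\eta)^* & 0 \end{pmatrix}\chi(\eta), \]
for a function $\chi\in C_0^\infty(\mathbb{R}^4_\eta)$ identically equal to 1 on $B$. A similar proof determines $C_2$, whence Lemma 7.

4.2. Landau–Zener formula

Because of Proposition 6, once given a ball $B\subset\mathbb{R}^4_\eta$, we are reduced to the study of the traces on $s = 0^+$ and $s = 0^-$ of the two-scale Wigner measure of a family $(u^h)$ satisfying
\[ \forall a\in C_0^\infty(\mathbb{R}^{2d+2}\times B),\quad \mathrm{op}_h^I(a)\,\mathrm{op}_h\begin{pmatrix} -\sigma+s & q(\Gamma|_S\tilde\zeta) \\ q(\Gamma|_S\tilde\zeta)^* & -\sigma-s \end{pmatrix}u^h = O(h) \quad\text{in } L^2(\mathbb{R}^{d+1}). \]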
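Returning briefly to the matrix $C_1$ of Lemma 7, one can check that the stated formula does solve the commutator equation on $B$, where $\chi = 1$. In the sketch below $c := q((\Gamma'-\Gamma)\eta)$ is a shorthand introduced here, and the smoothness remark assumes that $q$ is linear in its argument:

```latex
\[
\begin{aligned}
\left[\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},\ \begin{pmatrix} 0 & c \\ c^* & 0 \end{pmatrix}\right]
 &= \begin{pmatrix} 0 & c \\ -c^* & 0 \end{pmatrix} - \begin{pmatrix} 0 & -c \\ c^* & 0 \end{pmatrix}
  = \begin{pmatrix} 0 & 2c \\ -2c^* & 0 \end{pmatrix},
 \qquad\text{so}\qquad
 \sigma\,[J, C_1] = \chi(\eta)\begin{pmatrix} 0 & c \\ -c^* & 0 \end{pmatrix}, \\
\Gamma' - \Gamma &= \Gamma(s,z,0,\zeta) - \Gamma(s,z,\sigma,\zeta)
  = -\sigma\int_0^1\partial_\sigma\Gamma(s,z,t\sigma,\zeta)\,dt .
\end{aligned}
\]
```

The second line shows that the factor $1/(2\sigma)$ in $C_1$ is harmless: $c$ vanishes to first order at $\sigma = 0$, so $C_1$ is smooth there.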
Moreover, by applying a cut-off function, we may suppose that $\Gamma$ is compactly supported and turn $\Gamma|_S$ into $\phi(\eta)\Gamma|_S$ with $\phi$ compactly supported and identically equal to 1 on $B$. In this way, our system is microlocalized in the ball $B$, which is enough to calculate the two-scale Wigner measures in $B$. We are left with a system of the form
\[ \frac hi\partial_su^h = \mathrm{op}_h^I\begin{pmatrix} s & \phi(\eta)q(\Gamma|_S\tilde\zeta) \\ \phi(\eta)q(\Gamma|_S\tilde\zeta)^* & -s \end{pmatrix}u^h + hf^h, \]
where $(\mathrm{op}_h^I(a)f^h)$ is uniformly bounded in $L^2_{s,z}$ for symbols $a$ compactly supported in $\mathbb{R}^{2d+2}\times B$. However, $f^h$ does not contribute to the description of the traces on $s = 0^+$, $s = 0^-$ of the two-scale Wigner measure of $(u^h)$. Actually, if $S_h(s,s')$ denotes the evolution operator associated with the free system
\[ \frac hi\partial_s\underline{u}^h = \mathrm{op}_h^I\begin{pmatrix} s & \phi(\eta)q(\Gamma|_S\tilde\zeta) \\ \phi(\eta)q(\Gamma|_S\tilde\zeta)^* & -s \end{pmatrix}\underline{u}^h,\qquad \underline{u}^h|_{s=0} = u^h|_{s=0}, \]
then we have
\[ u^h = \underline{u}^h + i\int_0^sS_h(0,t)f^h(t)\,dt. \]
Hence, $u^h = \underline{u}^h + O(\sqrt{|s|})$ in $L^2(\mathbb{R}^d_z)$. Therefore, the traces of the two-scale Wigner measures of $(u^h)$ and $(\underline{u}^h)$ on $s = 0^\pm$ are the same. Let us denote by $G$ the compact operator
\[ G = \mathrm{op}_h^I\big(\phi(\eta)q(\Gamma|_S\eta)\big). \]
As a consequence of [9, Proposition 7], we have the following lemma.
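Lemma 8. There exist $\alpha_j^h = \alpha_j^h(z)$, $\omega_j^h = \omega_j^h(z)$, $j\in\{1,2\}$, such that, as $h$ goes to 0, for any $\chi\in C_0^\infty(\mathbb{R})$, $\chi(GG^*)\alpha_1^h$, $\chi(G^*G)\alpha_2^h$, $\chi(GG^*)\omega_1^h$ and $\chi(G^*G)\omega_2^h$ are bounded and for $s<0$,
\[
\begin{aligned}
\chi(GG^*)\begin{pmatrix} u_1^h(s,z) \\ u_2^h(s,z) \end{pmatrix} &= \chi(GG^*)\,e^{i\frac{s^2}{2h}}\Big|\frac{s}{\sqrt h}\Big|^{i\frac{GG^*}{2}}\alpha_1^h + o(1), \\
\chi(G^*G)\begin{pmatrix} u_3^h(s,z) \\ u_4^h(s,z) \end{pmatrix} &= \chi(G^*G)\,e^{-i\frac{s^2}{2h}}\Big|\frac{s}{\sqrt h}\Big|^{-i\frac{G^*G}{2}}\alpha_2^h + o(1),
\end{aligned}
\]
for $s>0$,
\[
\begin{aligned}
\chi(GG^*)\begin{pmatrix} u_1^h(s,z) \\ u_2^h(s,z) \end{pmatrix} &= \chi(GG^*)\,e^{i\frac{s^2}{2h}}\Big|\frac{s}{\sqrt h}\Big|^{i\frac{GG^*}{2}}\omega_1^h + o(1), \\
\chi(G^*G)\begin{pmatrix} u_3^h(s,z) \\ u_4^h(s,z) \end{pmatrix} &= \chi(G^*G)\,e^{-i\frac{s^2}{2h}}\Big|\frac{s}{\sqrt h}\Big|^{-i\frac{G^*G}{2}}\omega_2^h + o(1).
\end{aligned}
\]
Moreover
\[ \begin{pmatrix} \omega_1^h \\ \omega_2^h \end{pmatrix} = S_h\begin{pmatrix} \alpha_1^h \\ \alpha_2^h \end{pmatrix} \tag{36} \]
with
\[ S_h = \begin{pmatrix} a(GG^*) & -\bar b(GG^*)G \\ b(G^*G)G^* & a(G^*G) \end{pmatrix}, \]
with
\[ a(\lambda) = e^{-\pi\frac\lambda2},\qquad b(\lambda) = \frac{2i\,e^{i\frac\pi4}}{\lambda\sqrt\pi}\,2^{-i\lambda/2}\,e^{-\pi\frac\lambda4}\,\Gamma\Big(1+i\frac\lambda2\Big)\,\mathrm{sh}\Big(\frac{\pi\lambda}{2}\Big),\qquad a(\lambda)^2 = 1-\lambda|b(\lambda)|^2. \]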
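The relation $a(\lambda)^2 = 1-\lambda|b(\lambda)|^2$ can be verified from the explicit expressions of $a$ and $b$. The sketch below uses the classical identity $|\Gamma(1+iy)|^2 = \pi y/\mathrm{sh}(\pi y)$ for real $y$, applied with $y = \lambda/2$, $\lambda>0$:

```latex
\[
\begin{aligned}
|b(\lambda)|^2
 &= \frac{4}{\lambda^2\pi}\,e^{-\pi\lambda/2}\,\Big|\Gamma\Big(1+i\frac\lambda2\Big)\Big|^2\,\mathrm{sh}^2\Big(\frac{\pi\lambda}{2}\Big)
  = \frac{4}{\lambda^2\pi}\,e^{-\pi\lambda/2}\,\frac{\pi\lambda/2}{\mathrm{sh}(\pi\lambda/2)}\,\mathrm{sh}^2\Big(\frac{\pi\lambda}{2}\Big) \\
 &= \frac{2}{\lambda}\,e^{-\pi\lambda/2}\,\mathrm{sh}\Big(\frac{\pi\lambda}{2}\Big)
  = \frac{1-e^{-\pi\lambda}}{\lambda}, \\
1-\lambda|b(\lambda)|^2 &= e^{-\pi\lambda} = a(\lambda)^2 .
\end{aligned}
\]
```

This is the scalar counterpart of the unitarity of the scattering matrix $S_h$ in (36).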
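These formulas allow the calculation of two-scale Wigner measures thanks to [9, Lemma 8], which states that for $\chi\in C_0^\infty(\mathbb{R})$ we have
\[ \mathrm{op}_h^I\big(\chi\big(\phi(\eta)^2\,|\Gamma|_S\eta|^2\big)\big) = \chi(GG^*) + o(1) = \chi(G^*G) + o(1). \tag{37} \]
In view of
\[
\begin{aligned}
\tilde\Pi^+ &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \text{ on } J^{+,in}, &
\tilde\Pi^- &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \text{ on } J^{-,in}, \\
\tilde\Pi^+ &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \text{ on } J^{+,out}, &
\tilde\Pi^- &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \text{ on } J^{-,out},
\end{aligned}
\]
we obtain that $\tilde\nu^+\mathbf{1}_{s>0}$ (respectively $\tilde\nu^-\mathbf{1}_{s>0}$) is the two-scale Wigner measure of the family $e^{i\frac{s^2}{2h}}\big|\frac{s}{\sqrt h}\big|^{i\frac{GG^*}{2}}(\alpha_1^h,0)$ (respectively $e^{-i\frac{s^2}{2h}}\big|\frac{s}{\sqrt h}\big|^{-i\frac{G^*G}{2}}(0,\alpha_2^h)$). Similarly, $\tilde\nu^+\mathbf{1}_{s<0}$ (respectively $\tilde\nu^-\mathbf{1}_{s<0}$) is the two-scale Wigner measure of $e^{-i\frac{s^2}{2h}}\big|\frac{s}{\sqrt h}\big|^{-i\frac{G^*G}{2}}(0,\omega_2^h)$ (respectively $e^{i\frac{s^2}{2h}}\big|\frac{s}{\sqrt h}\big|^{i\frac{GG^*}{2}}(\omega_1^h,0)$). We use [9, Lemma 9]: for every $\chi\in C_0^\infty(\mathbb{R}^{2d+4})$, we have in $\mathcal{L}(L^2)$,
\[ \Big|\frac{s}{\sqrt h}\Big|^{-i\frac{GG^*}{2}}\mathrm{op}_h^I(\chi)\Big|\frac{s}{\sqrt h}\Big|^{i\frac{GG^*}{2}} = \Big|\frac{s}{\sqrt h}\Big|^{-i\frac{G^*G}{2}}\mathrm{op}_h^I(\chi)\Big|\frac{s}{\sqrt h}\Big|^{i\frac{G^*G}{2}} + o(1) = \mathrm{op}_h^I(\chi) + o(1). \]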
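Therefore, $\tilde\nu^{+,in}$ (respectively $\tilde\nu^{-,in}$) is the two-scale Wigner measure of the family $(\alpha_1^h,0)$ (respectively $(0,\alpha_2^h)$). Similarly, $\tilde\nu^{+,out}$ (respectively $\tilde\nu^{-,out}$) is the two-scale Wigner measure of the family $(0,\omega_2^h)$ (respectively $(\omega_1^h,0)$). Observe that Eq. (36) yields
\[
\begin{aligned}
\begin{pmatrix} \omega_1^h(z) \\ 0 \end{pmatrix} &= a(GG^*)\begin{pmatrix} \alpha_1^h(z) \\ 0 \end{pmatrix} - \bar b(GG^*)\begin{pmatrix} 0 & G \\ G^* & 0 \end{pmatrix}\begin{pmatrix} 0 \\ \alpha_2^h(z) \end{pmatrix}, \\
\begin{pmatrix} 0 \\ \omega_2^h(z) \end{pmatrix} &= a(G^*G)\begin{pmatrix} 0 \\ \alpha_2^h(z) \end{pmatrix} + b(G^*G)\begin{pmatrix} 0 & G \\ G^* & 0 \end{pmatrix}\begin{pmatrix} \alpha_1^h(z) \\ 0 \end{pmatrix}.
\end{aligned}
\]
These relations yield (23) via (37), which completes the proof of Theorem 1.

5. Appendix: The Geometry of the Crossing

Proof of Proposition 1. Consider $\rho_0 = (t_0,x_0,\tau_0,\xi_0)\in S$ such that $E(x_0,\xi_0)\neq 0$ and let us prove the local existence of integral curves $\rho_s^\pm = (t_0+s,x_s^\pm,\xi_s^\pm,\tau_0)$ of $H_{\lambda^\pm}$. We must have
\[ \dot x_s^\pm = \xi_s^\pm,\qquad \dot\xi_s^\pm = -\nabla k(x_s^\pm) \mp {}^t\nabla p(x_s^\pm)\,\frac{p(x_s^\pm)}{|p(x_s^\pm)|}. \]
In order to define $(\rho_s^+)_{s<0}$ and $(\rho_s^-)_{s>0}$, we aim at proving the existence of functions $\tilde x_s$, $\tilde\xi_s$ such that
\[
\begin{cases}
\tilde x_s = x_s^+,\ \tilde\xi_s = \xi_s^+ & \text{for } s<0, \\
\tilde x_s = x_s^-,\ \tilde\xi_s = \xi_s^- & \text{for } s>0.
\end{cases}
\]
Therefore, we have to solve
\[ \frac{d}{ds}\tilde x_s = \tilde\xi_s,\qquad \frac{d}{ds}\tilde\xi_s = -\nabla k(\tilde x_s) + \mathrm{sgn}(s)\,{}^t\nabla p(\tilde x_s)\,\frac{p(\tilde x_s)}{|p(\tilde x_s)|},\qquad \frac{d}{ds}\big(p(\tilde x_s)\big) = E(\tilde x_s,\tilde\xi_s). \tag{38} \]
Necessarily, $p(\tilde x_s) = \int_0^sE(\tilde x_\sigma,\tilde\xi_\sigma)\,d\sigma = s\int_0^1E(\tilde x_{st},\tilde\xi_{st})\,dt$ and thus
\[ \tilde\xi_s = \xi_0 + \int_0^s\left(-\nabla k(\tilde x_\sigma) + {}^t\nabla p(\tilde x_\sigma)\,\frac{\int_0^1E(\tilde x_{\sigma t},\tilde\xi_{\sigma t})\,dt}{\big|\int_0^1E(\tilde x_{\sigma t},\tilde\xi_{\sigma t})\,dt\big|}\right)d\sigma,\qquad \tilde x_s = x_0 + \int_0^s\tilde\xi_\sigma\,d\sigma. \]
Since $E(\rho_0)\neq 0$, a fixed point argument can be used on these latter equations. One defines $\tilde x_s$, $\tilde\xi_s$ and thus $\rho_s^+$ for $s<0$ and $\rho_s^-$ for $s>0$. We argue similarly for $\rho_s^-$, $s<0$ and $\rho_s^+$, $s>0$.

Let us denote by $H$ and $H'$ the limits of the two Hamiltonian flows along $J$ and $J'$ above $S$; we have
\[
\begin{aligned}
\lim_{s\to 0^-}H_{\lambda^+}(\rho_s^+) &= \lim_{s\to 0^+}H_{\lambda^-}(\rho_s^-) = H_{\tau+k+|\xi|^2/2} - \frac{E}{|E|}\cdot H_p := H, \\
\lim_{s\to 0^+}H_{\lambda^+}(\rho_s^+) &= \lim_{s\to 0^-}H_{\lambda^-}(\rho_s^-) = H_{\tau+k+|\xi|^2/2} + \frac{E}{|E|}\cdot H_p := H'.
\end{aligned}
\]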
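The reason the singular factor $p/|p|$ causes no difficulty in (38), and the origin of the signs in $H$ and $H'$, can be seen from the integral formula for $p(\tilde x_s)$. The sketch below introduces the shorthand $\bar E_s := \int_0^1E(\tilde x_{st},\tilde\xi_{st})\,dt$, which is continuous in $s$ with $\bar E_0 = E(\rho_0)\neq 0$:

```latex
\[
\begin{aligned}
p(\tilde x_s) = s\,\bar E_s
 \quad&\Longrightarrow\quad
 \mathrm{sgn}(s)\,\frac{p(\tilde x_s)}{|p(\tilde x_s)|}
   = \mathrm{sgn}(s)\,\frac{s}{|s|}\,\frac{\bar E_s}{|\bar E_s|}
   = \frac{\bar E_s}{|\bar E_s|}
   \quad (s\neq 0), \\
 &\Longrightarrow\quad
 \lim_{s\to 0^\mp}\frac{p(\tilde x_s)}{|p(\tilde x_s)|} = \mp\frac{E(\rho_0)}{|E(\rho_0)|}.
\end{aligned}
\]
```

The right-hand side of the equation for $\tilde\xi_s$ therefore extends continuously across $s = 0$, which is what makes the fixed point argument available. Assuming $\lambda^\pm = \tau+k+|\xi|^2/2\pm|p|$ (the eigenvalues are defined earlier in the paper), so that $H_{\lambda^\pm} = H_{\tau+k+|\xi|^2/2}\pm\frac{p}{|p|}\cdot H_p$, the one-sided limits above give the stated expressions for $H$ and $H'$.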
Proof of Proposition 2. Let us focus on $J$; the case of $J'$ is similar. By definition, $J$ is a codimension $j$ submanifold of $T^*\mathbb{R}^{d+1}$. Moreover, since $H_{\lambda^\pm}$ are Hamiltonian vector fields, $J$ is an involutive submanifold if and only if for any $\rho\in S$, $TJ|_\rho$ is an involutive subspace. Since $TJ|_\rho = TS|_\rho\oplus\mathbb{R}H$, we have $(TJ|_\rho)^\perp = (TS|_\rho)^\perp\cap H^\perp$. Observe that
\[ (TS|_\rho)^\perp = \mathrm{Vect}\big(H_{\tau+k+|\xi|^2/2},\ H_{p_i},\ 0\le i\le j\big),\qquad H = H_{\tau+k+|\xi|^2/2} - \frac{E}{|E|}\cdot H_p. \]
Simple calculation shows that if $\delta\rho\in(TJ|_\rho)^\perp$, i.e. if $\delta\rho\in(TS|_\rho)^\perp$ and $\sigma(H,\delta\rho) = 0$, then there exists $X\in\mathbb{R}^j$ such that
\[ \delta\rho = X\cdot H_p - \Big(X\cdot\frac{E}{|E|}\Big)H_{\tau+k+|\xi|^2/2} = -\Big(X\cdot\frac{E}{|E|}\Big)H + \Big(X - \Big(X\cdot\frac{E}{|E|}\Big)\frac{E}{|E|}\Big)\cdot H_p. \]
Since $X' = X - \big(X\cdot\frac{E}{|E|}\big)\frac{E}{|E|}$ satisfies $X'\cdot E = 0$, we have $X'\cdot H_p\in TS|_\rho$. Thus, $\delta\rho\in TS|_\rho\oplus\mathbb{R}H = TJ|_\rho$ and $(TJ|_\rho)^\perp\subset TJ|_\rho$, which proves that $J$ is an involutive submanifold.

Since $p$ is of rank $j$, there exists a function $y = y(t,x,\tau,\xi)\in\mathbb{R}^{2d-j+1}$ such that
\[ (t,x,\tau,\xi)\mapsto(\tau+k+|\xi|^2/2,\ p,\ y) \]
is a local diffeomorphism near $\rho_0\in S$. We claim that
\[ J\to\mathbb{R}^{2d-j+2},\qquad \rho\mapsto(\tau+k+|\xi|^2/2,\ y) \]
also is a local diffeomorphism. Indeed, let $\delta\rho\in TJ$ be a null vector for the differential of this mapping at $\rho\in S$. We have $\delta\rho = \delta s+\lambda H$ with $\delta s\in TS|_\rho$ and $\lambda\in\mathbb{R}$. Thus
\[ 0 = d(\tau+k+|\xi|^2/2)\,\delta\rho = \lambda\,d(\tau+k+|\xi|^2/2)H = \lambda\,\sigma(H,H_{\tau+k+|\xi|^2/2}) = \lambda|E|. \]
Therefore $\lambda = 0$ and $dy\,\delta s = 0$, whence $\delta s = 0$ in view of the definition of $y$. As a consequence, there exist equations of $J$ of the form $p = \phi(\tau+k+|\xi|^2/2,\ y)$. Since $|p| = |\tau+k+|\xi|^2/2|$ on $J$, there exists a smooth vector $u = u(t,x,\tau,\xi)\in\mathbf{S}^{j-1}$ such that $\phi(\tau+k+|\xi|^2/2,\ y) = (\tau+k+|\xi|^2/2)\,u$.
Observe that in view of (38)
\[
\begin{aligned}
E &= \lim_{s\to 0^-}dp(x_s^+)\,H_{\lambda^+}(\rho_s^+) \\
  &= \lim_{s\to 0^-}d(\tau+k+|\xi|^2/2)(\rho_s^+)\,H_{\lambda^+}(\rho_s^+)\,u \\
  &= -\lim_{s\to 0^-}\{\tau+k+|\xi|^2/2,\ p\}\cdot\frac{p}{|p|}(\rho_s^+)\,u = |E|\,u.
\end{aligned}
\]
Therefore we have $u|_S = \frac{E}{|E|}$, which completes the proof of Proposition 2.
Proof of Lemma 1. Let us study the limits of the fibers of $\bar N_{\Sigma^\pm}(\dot J^{\pm,in})$ and $\bar N_{\Sigma^\pm}(\dot J^{\pm,out})$ above a point $\rho$ which goes to $S$. Consider $H^\perp$ and $(H')^\perp$, the orthogonal of $H$ and $H'$ respectively for the symplectic form on $T(T^*\mathbb{R}^{d+1})$. The measures $\nu_S^{+,in}$ and $\nu_S^{-,out}$ are measures on the compactification of $H^\perp/TJ$, and the measures $\nu_S^{-,in}$ and $\nu_S^{+,out}$ are measures on the compactification of $(H')^\perp/TJ'$. Note that $TJ = TS\oplus\mathbb{R}H$ and $TJ' = TS\oplus\mathbb{R}H'$; therefore, if we set
\[ F = TJ+TJ' = TS\oplus\mathbb{R}H\oplus\mathbb{R}H', \]
the $(j-1)$-dimensional spaces $H^\perp/TJ$ and $(H')^\perp/TJ'$ can be identified with $T(T^*\mathbb{R}^{d+1})/F$. Observe that
\[ F^\perp = TJ^\perp\cap H^\perp\cap H'^\perp = \{W = X\cdot H_p,\ X\cdot E = 0,\ X\in\mathbb{R}^j\}. \]
Moreover, if $W = X\cdot H_p$ and $\delta\rho\in T(T^*\mathbb{R}^{d+1})|_\rho$, $\rho\in S$,
\[ \sigma(W,\delta\rho) = -X\cdot dp\,\delta\rho. \]
Therefore, if $P\in\mathcal{L}(\mathbb{R}^j)$ is the orthogonal projection on the normal hyperplane to $E$, the class modulo $F$ of $\delta\rho$ is completely determined by the knowledge of $P(dp\,\delta\rho)$. The map
\[ \eta : T(T^*(\mathbb{R}^{d+1}))\to\mathbb{R}^j,\qquad \delta\rho\mapsto P(dp\,\delta\rho) \]
defines a system of coordinates on $T(T^*\mathbb{R}^{d+1})/F$.

Acknowledgments

Crucial points of this work were understood by the author while collaborating with C. Lasser and P. Gérard; may they find here her acknowledgment for their acute questions and their challenging discussions.

References

[1] A. P. Calderón and R. Vaillancourt, On the boundedness of pseudo-differential operators, J. Math. Soc. Japan 23(2) (1971), 374–378.
[2] Y. Colin de Verdière, The level crossing problem in semi-classical analysis. I. The symmetric case, Annales de l'Institut Fourier 53 (2003), 1023–1054.
[3] Y. Colin de Verdière, The level crossing problem in semi-classical analysis. II. The Hermitian case, to appear in the proceedings of Louis Boutet de Monvel's congress, Annales de l'Institut Fourier (2003).
[4] C. Fermanian Kammerer, Propagation of concentration effects near shock hypersurfaces for the heat equation, Asymptot. Anal. 24 (2000), 107–141.
[5] C. Fermanian Kammerer, Mesures semi-classiques deux-microlocales, C. R. Acad. Sci. Paris 331(1) (2000), 515–518.
[6] C. Fermanian Kammerer, A non commutative Landau–Zener formula, to appear in Math. Nachr.
[7] C. Fermanian Kammerer and P. Gérard, Mesures semi-classiques et croisements de modes, Bull. Soc. Math. France 130(1) (2002), 123–168.
[8] C. Fermanian Kammerer and P. Gérard, Une formule de Landau–Zener pour un croisement générique de codimension 2, Séminaire Équations aux Dérivées Partielles 2001–2002, Exposé No. 21, École Polytechnique.
[9] C. Fermanian Kammerer and P. Gérard, A Landau–Zener formula for nondegenerated involutive codimension 3 crossings, Ann. Henri Poincaré 4 (2003), 513–552.
[10] C. Fermanian Kammerer and C. Lasser, Wigner measures and codimension two crossings, J. Math. Phys. 44(2) (2003), 507–527.
[11] P. Gérard and E. Leichtnam, Ergodic properties of eigenfunctions for the Dirichlet problem, Duke Math. J. 71 (1993), 559–607.
[12] P. Gérard, P. A. Markowich, N. J. Mauser and F. Poupaud, Homogenization limits and Wigner transforms, Comm. Pure Appl. Math. 50(4) (1997), 323–379.
[13] P. Gérard, P. A. Markowich, N. J. Mauser and F. Poupaud, Erratum: Homogenization limits and Wigner transform, Comm. Pure Appl. Math. 53 (2000), 280–281.
[14] G. A. Hagedorn, Proof of the Landau–Zener formula in an adiabatic limit with small eigenvalue gaps, Commun. Math. Phys. 136 (1991), 433–449.
[15] G. A. Hagedorn, Molecular propagation through electron energy level crossings, Memoirs of the A. M. S. 111(536) (1994).
[16] G. A. Hagedorn and A. Joye, Landau–Zener transitions through small electronic eigenvalue gaps in the Born–Oppenheimer approximation, Ann. Inst. Henri Poincaré 68(1) (1998), 85–134.
[17] G. A. Hagedorn and A. Joye, Molecular propagation through small avoided crossings of electron energy levels, Rev. Math. Phys. 11(1) (1999), 41–101.
[18] L. Hörmander, The Analysis of Linear Partial Differential Operators III (Springer-Verlag, 1985).
[19] A. Joye, Proof of the Landau–Zener formula, Asymptot. Anal. 9 (1994), 209–258.
[20] L. Landau, Collected Papers of L. Landau (Pergamon Press, 1965).
[21] P.-L. Lions and T. Paul, Sur les mesures de Wigner, Revista Matemática Iberoamericana 9 (1993), 553–618.
[22] J. Lever and R. Shaw, Irreducible multiplier corepresentations and generalized inducing, Commun. Math. Phys. 38 (1974), 257–277.
[23] L. Miller, Propagation d'ondes semi-classiques à travers une interface et mesures 2-microlocales, Thèse de l'École Polytechnique (1996).
[24] D. Robert, Autour de l'approximation Semi-classique (Birkhäuser, 1983).
[25] H. Spohn and S. Teufel, Adiabatic decoupling and time-dependent Born–Oppenheimer theory, Commun. Math. Phys. 224 (2001), 113–132.
[26] E. P. Wigner, Group Theory and its Application to the Quantum Mechanics of Atomic Spectra (Academic Press, New York, 1959).
[27] C. Zener, Non-adiabatic crossing of energy levels, Proc. Roy. Soc. Lond. 137 (1932), 696–702.
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 15 (2003)
Arai, A. & Kawano, H., Enhanced binding in a general class of quantum field models, 4 (2003) 387
Arai, A., Non-relativistic limit of a Dirac–Maxwell operator in relativistic quantum electrodynamics, 3 (2003) 245
Araki, H. & Moriya, H., Equilibrium statistical mechanics of fermion lattice systems, 2 (2003) 93
Bahn, C., Ko, C. K. & Park, Y. M., Dirichlet forms and symmetric Markovian semigroups on Z2 graded von Neumann algebras, 8 (2003) 823
Benatti, F., Cappellini, V., De Cock, M., Fannes, M. & Vanpeteghem, D., Classical limit of quantum dynamical entropies, 8 (2003) 847
Bieliavsky, P. & Bonneau, P., On the geometry of the characteristic class of a star product on a symplectic manifold, 2 (2003) 199
Bieliavsky, P., Gutt, S., Bordemann, M. & Waldmann, S., Traces for star products on the dual of a Lie algebra, 5 (2003) 425
Black, C. P., A mathematical verification of the existence of strings of circulation in superfluid films on porous media, 8 (2003) 925
Blanchard, P. & Olkiewicz, R., Decoherence induced transition from quantum to classical dynamics, 3 (2003) 217
Bojowald, M. & Strobl, T., Poisson geometry in constrained systems, 7 (2003) 663
Bonneau, P., see Bieliavsky, P.
Bordemann, M., see Bieliavsky, P.
Cappellini, V., see Benatti, F.
Cuccagna, S., On asymptotic stability of ground states of NLS, 8 (2003) 877
Damanik, D. & Zamboni, L. Q., Combinatorial properties of Arnoux–Rauzy subshifts and applications to Schrödinger operators, 7 (2003) 745
De Cock, M., see Benatti, F.
Dereziński, J., Jakšić, V. & Pillet, C.-A., Perturbation theory of W*-dynamics, Liouvilleans and KMS-states, 5 (2003) 447
Fannes, M., see Benatti, F.
Feldman, J., Knörrer, H. & Trubowitz, E., Single scale analysis of many fermion systems part 1: Insulators, 9 (2003) 949
Feldman, J., Knörrer, H. & Trubowitz, E., Single scale analysis of many fermion systems part 2: The first scale, 9 (2003) 995
Feldman, J., Knörrer, H. & Trubowitz, E., Single scale analysis of many fermion systems part 3: Sectorized norms, 9 (2003) 1039
Feldman, J., Knörrer, H. & Trubowitz, E., Single scale analysis of many fermion systems part 4: Sector counting, 9 (2003) 1121
Fermanian Kammerer, C., Wigner measures and molecular propagation through generic energy level crossings, 10 (2003) 1285
Forger, M., Paufler, C. & Römer, H., The Poisson bracket for Poisson forms in multisymplectic field theory, 7 (2003) 705
Fournais, S., Confinement to lowest Landau band and application to quantum current, 10 (2003) 1219
Gutt, S., see Bieliavsky, P.
Hiroshima, F., Localization of the number of photons of ground states in nonrelativistic QED, 3 (2003) 271
Horodecki, M., Shor, P. W. & Ruskai, M. B., Entanglement breaking channels, 6 (2003) 629
Iksanov, A. M., see Kim, C. S.
Jakšić, V., see Dereziński, J.
Kabluchko, Z. A., see Kim, C. S.
Kawano, H., see Arai, A.
Kim, C. S., Proskurin, D. P., Iksanov, A. M. & Kabluchko, Z. A., The generalized CCR: representations and enveloping C*-algebra, 4 (2003) 313
Knörrer, H., see Feldman, J.
Ko, C. K., see Bahn, C.
Matsui, T. & Ogata, Y., Variational principle for nonequilibrium steady states of the XX model, 8 (2003) 905
Matsutani, S. & Ônishi, Y., On the moduli of a quantized elastica in P and KDV flows: study of hyperelliptic curves as an extension of Euler's perspective of elastica I, 6 (2003) 559
Molev, A. I., Ragoucy, E. & Sorba, P., Coideal subalgebras in quantum affine algebras, 8 (2003) 789
Moretti, V., Aspects of noncommutative Lorentzian geometry for globally hyperbolic spacetimes, 10 (2003) 1171
Moriya, H., see Araki, H.
Müller, V. F., Perturbative renormalization by flow equations, 5 (2003) 491
Ogata, Y., see Matsui, T.
Olkiewicz, R., see Blanchard, P.
Ônishi, Y., see Matsutani, S.
Park, Y. M., see Bahn, C.
Paufler, C., see Forger, M.
Petz, D., Monotonicity of quantum relative entropy revisited, 1 (2003) 79
Pillet, C.-A., see Dereziński, J.
Procesi, M., Exponentially small splitting and Arnold diffusion for multiple time scale systems, 4 (2003) 339
Proskurin, D. P., see Kim, C. S.
Ragoucy, E., see Molev, A. I.
Ruskai, M. B., see Horodecki, M.
Ruskai, M. B., Qubit entanglement breaking channels, 6 (2003) 643
Ruzzi, G., Essential properties of the vacuum sector for a theory of superselection sectors, 10 (2003) 1255
Römer, H., see Forger, M.
Shor, P. W., see Horodecki, M.
Sorba, P., see Molev, A. I.
Strobl, T., see Bojowald, M.
Talagrand, M., Self organization in the low temperature region of a spin glass model, 1 (2003) 1
Trubowitz, E., see Feldman, J.
Vanpeteghem, D., see Benatti, F.
Waldmann, S., see Bieliavsky, P.
Yoshida, N., Phase transition from the viewpoint of relaxation phenomena, 7 (2003) 765
Zamboni, L. Q., see Damanik, D.