Commun. Math. Phys. 215, 1 – 24 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Asymptotic Behavior of Thermal Nonequilibrium Steady States for a Driven Chain of Anharmonic Oscillators
Luc Rey-Bellet1, Lawrence E. Thomas2
1 Department of Mathematics, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854, USA. E-mail: [email protected]
2 Department of Mathematics, University of Virginia, Kerchof Hall, Charlottesville, VA 22903, USA. E-mail: [email protected]
Received: 6 January 2000 / Accepted: 4 May 2000
Abstract: We consider a model of heat conduction introduced in [6], which consists of a finite nonlinear chain coupled to two heat reservoirs at different temperatures. We study the low temperature asymptotic behavior of the invariant measure. We show that, in this limit, the invariant measure is characterized by a variational principle. The main technical ingredients are some control theoretic arguments to extend the Freidlin–Wentzell theory of large deviations to a class of degenerate diffusions.
1. Introduction

We consider a model of heat conduction introduced in [6]. In this model a finite nonlinear chain of $n$ $d$-dimensional oscillators is coupled to two Hamiltonian heat reservoirs initially at different temperatures $T_L$, $T_R$, each of which is described by a $d$-dimensional wave equation. A natural goal is to obtain a usable expression for the invariant (marginal) state of the chain analogous to the Boltzmann–Gibbs prescription $\mu = Z^{-1}\exp(-H/T)$ which one has in equilibrium statistical mechanics. We show here that the invariant state $\mu$ describing steady state energy flow through the chain is asymptotic to the expression $\exp(-W^{(\eta)}/T)$ to leading order in the mean temperature $T$, $T \to 0$, where the action $W^{(\eta)}$, defined on phase space, is obtained from an explicit variational principle. The action $W^{(\eta)}$ depends on the temperatures only through the parameter $\eta = (T_L - T_R)/(T_L + T_R)$. As one might anticipate, in the limit $\eta \to 0$, $W^{(\eta)}$ reduces to the chain Hamiltonian plus a residual term from the bath interaction, i.e., $\exp(-W^{(\eta)}/T)$ becomes the Boltzmann–Gibbs expression.

Present address: Department of Mathematics, University of Virginia, Kerchof Hall, Charlottesville, VA 22903, USA.
Turning to the physical model at hand, we assume that the Hamiltonian $H(p,q)$ of the isolated chain is of the form
\[
H(p,q) = \sum_{i=1}^{n} \frac{p_i^2}{2} + \sum_{i=1}^{n} U^{(1)}(q_i) + \sum_{i=1}^{n-1} U^{(2)}(q_i - q_{i+1}) \equiv \sum_{i=1}^{n} \frac{p_i^2}{2} + V(q), \tag{1}
\]
where $q_i$ and $p_i$ are the coordinate and momentum of the $i$th particle, and where $U^{(1)}$ and $U^{(2)}$ are $C^\infty$ confining potentials, i.e. $\lim_{|q|\to\infty} V(q) = +\infty$. The coupling between the reservoirs and the chain is assumed to be of dipole approximation type and it occurs at the boundary only: the first particle of the chain is coupled to one reservoir and the $n$th particle to the other heat reservoir. At time $t = 0$ each reservoir is assumed to be in thermal equilibrium, i.e., the initial conditions of the reservoirs are distributed according to (Gaussian) Gibbs measures with temperatures $T_1 = T_L$ and $T_n = T_R$ respectively. Projecting the dynamics onto the phase space of the chain results in a set of integro-differential equations which differ from the Hamiltonian equations of motion by additional force terms in the equations for $p_1$ and $p_n$. Each of these terms consists of a deterministic integral part independent of temperature and a Gaussian random part with covariance proportional to the temperature. Due to the integral (memory) terms, the study of the long-time limit is a difficult mathematical problem (see [13] for the study of such systems in the case of a single reservoir). But by a further appropriate choice of couplings, the integral parts can be treated as auxiliary variables $r_1$ and $r_n$, and the random parts become Markovian. Thus we obtain (see [6] for details) the following system of Markovian stochastic differential equations on the extended phase space $\mathbb{R}^{2dn+2d}$: For $x = (p,q,r)$, we have
\[
\begin{aligned}
\dot q_i &= p_i, & i &= 1,\dots,n,\\
\dot p_i &= -\nabla_{q_i} V(q) + \delta_{1,i}\, r_1 + \delta_{n,i}\, r_n, & i &= 1,\dots,n,\\
dr_i &= -\gamma (r_i - \lambda^2 q_i)\,dt + (2\gamma\lambda^2 T_i)^{1/2}\,dw_i, & i &= 1, n.
\end{aligned} \tag{2}
\]
In Eq. (2), $w_1(t)$ and $w_n(t)$ are independent $d$-dimensional Wiener processes, and $\lambda^2$ and $\gamma$ are coupling constants. It will be useful to introduce a generalized Hamiltonian $G(p,q,r)$ on the extended phase space, given by
\[
G(p,q,r) = \sum_{i=1,n} \left( \frac{r_i^2}{2\lambda^2} - r_i q_i \right) + H(p,q),
\]
where $H(p,q)$ is the Hamiltonian of the isolated system of oscillators given by (1). We also introduce the parameters $\varepsilon = (T_1 + T_n)/2$ (the mean temperature of the reservoirs) and $\eta = (T_1 - T_n)/(T_1 + T_n)$ (the relative temperature difference), so that $T_1 = \varepsilon(1+\eta)$ and $T_n = \varepsilon(1-\eta)$. Then Eq. (2) takes the form
\[
\dot q = \nabla_p G, \qquad \dot p = -\nabla_q G, \qquad dr = -\gamma\lambda^2 \nabla_r G\,dt + \varepsilon^{1/2} (2\gamma\lambda^2 D)^{1/2}\,dw, \tag{3}
\]
where $p = (p_1,\dots,p_n)$, $q = (q_1,\dots,q_n)$, $r = (r_1, r_n)$ and where $D$ is the $2d \times 2d$ matrix given by $D = \mathrm{diag}(1+\eta, 1-\eta)$.
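To make the structure of Eq. (2) concrete, the following is a minimal Euler–Maruyama sketch for $d = 1$. It is only an illustration under stated assumptions: the potentials $U^{(1)}(q) = q^2/2 + q^4/4$ and $U^{(2)}(q) = q^2/2$, the parameter values, and the function name `simulate_chain` are hypothetical choices, not taken from [6] or from this paper.

```python
import math
import random


def simulate_chain(n=3, T1=0.2, Tn=0.1, gamma=1.0, lam2=1.0,
                   dt=1e-3, steps=2000, seed=0):
    """Euler-Maruyama sketch of Eq. (2) for d = 1.

    Assumed (not from the paper): U1(q) = q**2/2 + q**4/4 and
    U2(q) = q**2/2, the latter being strictly convex.
    """
    rng = random.Random(seed)
    q = [0.0] * n
    p = [0.0] * n
    r = [0.0, 0.0]          # r[0] plays the role of r_1, r[1] of r_n
    temps = [T1, Tn]
    ends = [0, n - 1]       # chain sites coupled to the two reservoirs

    def grad_V(q):
        g = [qi + qi ** 3 for qi in q]      # derivative of U1 at each site
        for i in range(n - 1):
            diff = q[i] - q[i + 1]          # derivative of U2 on each bond
            g[i] += diff
            g[i + 1] -= diff
        return g

    for _ in range(steps):
        force = [-gi for gi in grad_V(q)]
        force[0] += r[0]                    # delta_{1,i} r_1 term
        force[n - 1] += r[1]                # delta_{n,i} r_n term
        new_q = [q[i] + dt * p[i] for i in range(n)]
        new_p = [p[i] + dt * force[i] for i in range(n)]
        new_r = []
        for k in range(2):
            drift = -gamma * (r[k] - lam2 * q[ends[k]])
            noise = math.sqrt(2 * gamma * lam2 * temps[k] * dt) * rng.gauss(0.0, 1.0)
            new_r.append(r[k] + dt * drift + noise)
        q, p, r = new_q, new_p, new_r
    return q, p, r
```

With `T1 = Tn = 0` the noise term vanishes and the scheme reduces to an explicit Euler discretization of the deterministic flow; the noise enters only through the two auxiliary variables, which is the degeneracy discussed later in the text.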
The function $G$ is a Liapunov function, non-increasing in time, for the deterministic part of the flow (3). If the system is in equilibrium, i.e., if $T_1 = T_n = \varepsilon$ and $\eta = 0$, it is not difficult to check that the generalized Gibbs measure $\mu_\varepsilon = Z^{-1}\exp(-G(p,q,r)/\varepsilon)$ is an invariant measure for the Markov process solving Eq. (3). If the temperatures of the reservoirs are not identical, no explicit formula for the invariant measure $\mu_{T_1,T_n}$ can be given, in general. It is the goal of this paper to provide a variational principle for the leading asymptotic form of $\mu_{T_1,T_n}$ at low temperature, $\varepsilon \to 0$.

To suggest what $\mu_{T_1,T_n}$ looks like, we observe that a typical configuration of a reservoir has infinite energy; therefore the reservoir does not only act as a sink of energy but true fluctuations can take place. The physical picture is as follows: the system spends most of the time very close to the critical set of $G$ (in fact close to a stable equilibrium) and very rarely (typically after an exponential time) an excursion far away from the equilibria occurs. This picture brings us into the framework of rare events, hence into the theory of large deviations and more specifically the Freidlin–Wentzell theory [8] of small random perturbations of dynamical systems.

In the following we employ notation which is essentially that of [8]. Let $C([0,T])$ denote the Banach space of continuous functions (paths) with values in $\mathbb{R}^{2d(n+1)}$ equipped with the uniform topology. We introduce the following functional $I^{(\eta)}_{x,T}$ on the set of paths $C([0,T])$: If $\phi(t) = (p(t), q(t), r(t))$ has one $L^2$-derivative with respect to time and satisfies $\phi(0) = x$ we set
\[
I^{(\eta)}_{x,T}(\phi) = \frac{1}{4\gamma\lambda^2} \int_0^T (\dot r + \gamma\lambda^2 \nabla_r G)\, D^{-1} (\dot r + \gamma\lambda^2 \nabla_r G)\,dt, \tag{4}
\]
if
\[
\dot q(t) = \nabla_p G(\phi(t)), \qquad \dot p(t) = -\nabla_q G(\phi(t)), \tag{5}
\]
and $I^{(\eta)}_{x,T}(\phi) = +\infty$ otherwise. Notice that $I^{(\eta)}_{x,T}(\phi) = 0$ if and only if $\phi(t)$ is a solution of Eq. (3) with the temperature $\varepsilon$ set equal to zero. The functional $I^{(\eta)}_{x,T}$ is called a rate function and it describes, in the sense of large deviations, the probability of the path $\phi$. Roughly speaking, as $\varepsilon \to 0$, the asymptotic probability of the path $\phi$ is given by $\exp\bigl(-I^{(\eta)}_{x,T}(\phi)/\varepsilon\bigr)$. For $x, y \in \mathbb{R}^{2d(n+1)}$ we define $V^{(\eta)}(x,y)$ as
\[
V^{(\eta)}(x,y) = \inf_{T>0}\ \inf_{\phi:\ \phi(T)=y} I^{(\eta)}_{x,T}(\phi), \tag{6}
\]
and for any sets $B, C \subset \mathbb{R}^{2d(n+1)}$ we set
\[
V^{(\eta)}(B,C) = \inf_{x\in B;\ y\in C} V^{(\eta)}(x,y). \tag{7}
\]
The function $V^{(\eta)}(x,y)$ represents the cost to bring the system from $x$ to $y$ (in an arbitrary amount of time). We introduce an equivalence relation on the phase space $\mathbb{R}^{2d(n+1)}$: we say $x \sim y$ if $V^{(\eta)}(x,y) = V^{(\eta)}(y,x) = 0$. We divide the critical set
$K = \{x;\ \nabla G(x) = 0\}$ (about which the invariant measure concentrates) according to this equivalence relation: we have $K = \cup_i K_i$ with $x \sim y$ if $x \in K_i$, $y \in K_i$ and $x \not\sim y$ if $x \in K_i$, $y \in K_j$, $i \neq j$.

Our first assumption is on the existence of an invariant measure, the structure of the set $K$ and the dynamics near temperature zero. Let $\rho > 0$ be arbitrary, denote by $B(\rho)$ the $\rho$-neighborhood of $K$ and let $\tau_\rho$ be the first time the Markov process $x(t)$ which solves (3) hits $B(\rho)$.

K1 The process $x(t)$ has an invariant measure. The ω-limit set of the deterministic part of the flow (3) (which turns out to be the set of critical points of the Hamiltonian $G$) can be decomposed into a finite number of inequivalent compact sets $K_i$. Finally, for any $\varepsilon_0 > 0$, the expected hitting time $E_x(\tau_\rho)$ of the diffusion with initial condition $x$ is bounded uniformly for $0 \le \varepsilon \le \varepsilon_0$ and uniformly in $x$ on any compact set.

Remark 1. The assumption K1 ensures that the dynamics is sufficiently confining in order to apply large deviations techniques to study the invariant measure.

Remark 2. The assumptions used in [6, 5] to prove the existence of an invariant measure imply the assumption made on the structure of the critical set $K$. But it is not clear that they imply the assumptions made on the hitting time. We will merely assume the validity of condition K1 in this paper. Its validity can be established by constructing Liapunov-like functions for the model. Such methods also allow one to prove a fairly general theorem on the existence of invariant measures for Hamiltonian systems coupled to heat reservoirs and will be the subject of a separate publication [19].

Our second condition is identical to condition H2 of [6, 5].

K2 The 2-body potential $U^{(2)}(q)$ is strictly convex.

Remark 3. The condition K2 will be important to establish various regularity properties of $V^{(\eta)}(x,y)$. It will imply several controllability properties of the control system associated with the stochastic differential equations (3).
Following [8], we consider graphs on the set $\{1,\dots,L\}$. A graph consisting of arrows $m \to n$ ($m \in \{1,\dots,L\} \setminus \{i\}$, $n \in \{1,\dots,L\}$) is called an $\{i\}$-graph if
1. Every point $j$, $j \neq i$, is the initial point of exactly one arrow.
2. There are no closed cycles in the graph.
We denote by $G\{i\}$ the set of $\{i\}$-graphs. The weight of the set $K_i$ is defined by
\[
W^{(\eta)}(K_i) = \min_{g \in G\{i\}} \sum_{m \to n \in g} V^{(\eta)}(K_m, K_n). \tag{8}
\]
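The minimization over $\{i\}$-graphs in Eq. (8) can be illustrated by brute force when $L$ is small. The sketch below assumes the pairwise costs $V^{(\eta)}(K_m, K_n)$ are already known and stored in a matrix `V`; the numerical values in the usage note, the 0-based indexing, and the name `graph_weight` are hypothetical and serve only to show the combinatorics.

```python
from itertools import product


def _reaches(succ, start, root):
    """Follow arrows from `start`; True if the walk ends at `root` without cycling."""
    seen = set()
    node = start
    while node != root:
        if node in seen:
            return False
        seen.add(node)
        node = succ[node]
    return True


def graph_weight(V, i):
    """Minimum over {i}-graphs of the summed arrow costs V[m][n], cf. Eq. (8).

    V is an L x L matrix of costs; vertices are labelled 0, ..., L-1.
    """
    L = len(V)
    others = [j for j in range(L) if j != i]
    best = float("inf")
    # Each vertex j != i is the initial point of exactly one arrow j -> target.
    for targets in product(range(L), repeat=len(others)):
        succ = dict(zip(others, targets))
        if any(m == t for m, t in succ.items()):
            continue                      # arrows m -> m are excluded
        if not all(_reaches(succ, j, i) for j in others):
            continue                      # the graph contains a closed cycle
        best = min(best, sum(V[m][t] for m, t in succ.items()))
    return best
```

For the made-up cost matrix `V = [[0, 2, 5], [1, 0, 4], [3, 1, 0]]`, the three weights are `[graph_weight(V, i) for i in range(3)]`, and the set with the smallest weight is the one near which the invariant measure is expected to concentrate. The enumeration is exponential in $L$ and is meant only for very small examples.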
Our main result is the following:

Theorem 1. Under the conditions K1 and K2 the invariant measure $\mu_{T_1,T_n} = \mu_{\varepsilon,\eta}$ of the Markov process (3) has the following asymptotic behavior: For any open set $D$ with compact closure and sufficiently regular boundary
\[
\lim_{\varepsilon\to 0} \varepsilon \log \mu_{\varepsilon,\eta}(D) = -\inf_{x\in D} W^{(\eta)}(x),
\]
where
\[
W^{(\eta)}(x) = \min_i \left( W^{(\eta)}(K_i) + V^{(\eta)}(K_i, x) \right) - \min_j W^{(\eta)}(K_j). \tag{9}
\]
In particular, if $\eta = 0$, then
\[
W^{(0)}(x) = G(x) - \min_x G(x). \tag{10}
\]
The function $W^{(\eta)}(x)$ satisfies the bound, for $\eta \ge 0$,
\[
(1+\eta)^{-1} \left( G(x) - \min_x G(x) \right) \le W^{(\eta)}(x) \le (1-\eta)^{-1} \left( G(x) - \min_x G(x) \right), \tag{11}
\]
and a similar bound for $\eta \le 0$.

Remark 4. Equations (10) and (11) imply that $\mu_{\varepsilon,\eta}$ reduces to the Boltzmann–Gibbs expression $\mu_\varepsilon \sim \exp(-G/\varepsilon)$ for $\eta \to 0$ in the low temperature limit. Of course, at $\eta = 0$, they are actually equal at all temperatures $\varepsilon$. Moreover these equations imply that the relative probability $\mu_{\varepsilon,\eta}(x)/\mu_{\varepsilon,\eta}(y)$ is (asymptotically) bounded above and below by
\[
\exp\left[ -\left( \frac{G(x)}{\varepsilon(1\pm\eta)} - \frac{G(y)}{\varepsilon(1\mp\eta)} \right) \right],
\]
so that no especially hot or cold spots develop for $\eta \neq 0$.

The theorem draws heavily from the large deviations theory of Freidlin–Wentzell [8]. That theory was developed for stochastic differential equations with a non-degenerate (elliptic) generator; but for Eq. (3) this is not the case since the random force acts only on $2d$ of the $2d(n+1)$ variables. A large part of this paper is devoted to developing the control theory necessary to extend Freidlin–Wentzell theory to a class of Markov processes containing our model. Diffusions with hypoelliptic generators have been considered in the literature, e.g. [3, 2]. But these works assume in effect everywhere small-time controllability, which is too strong for our purposes. Once the control theory estimates have been established, our proof follows rather closely the proof of Freidlin–Wentzell [8] and the presentation of it given in [3] with suitable technical modifications.

We also note that the use of Freidlin–Wentzell theory in non-equilibrium statistical mechanics has been advocated in particular by Graham (see [10] and references therein). In these applications to non-equilibrium statistical mechanics, as in [10], the models are mostly taken as mesoscopic: the variables of the system describe some suitably coarse-grained quantities, which fluctuate slightly around their average values. In contrast to these models, ours is entirely microscopic and derived from first principles, and the small-noise limit is seen as a low-temperature limit.
We note that the variational principle for $W^{(\eta)}$ here certainly can be formulated analogously for more complicated arrays of oscillators, plates with multiple thermocoupled baths, etc. We conjecture that generically there is an onset of non-smooth behavior in $W^{(\eta)}$ as a function of $x$ for $\eta \neq 0$ in the case where $G$ has multiple critical sets, but this sort of critical behavior, as well as other physical phenomena to be deduced from $W^{(\eta)}$, are questions which remain to be elucidated.
Finally we note that the action functional $I^{(\eta)}_{x,T}$ can be related to an entropy production. As in [7] the entropy production can be defined as $\Gamma = -F_1/T_1 - F_n/T_n$, where $F_1$ and $F_n$ are energy flows from the chain to the respective reservoirs. For a given path $\phi$ with $\phi(0) = x$ and $\phi(T) = y$ we denote by $\tilde\phi$ the time-reversed path with $\tilde\phi(0) = Jy$ and $\tilde\phi(T) = Jx$, where $J(p,q,r) = (-p,q,r)$. A simple computation shows that for any path $\phi$ we have
\[
I^{(\eta)}_{x,T}(\phi) = I^{(\eta)}_{Jy,T}(\tilde\phi) + R(y) - R(x) - \int_0^T \Lambda(\phi(s))\,ds,
\]
where $\Gamma = \varepsilon^{-1}\Lambda$ and $R(x) = (1+\eta)^{-1}(\lambda^{-1} r_1 - \lambda q_1)^2 + (1-\eta)^{-1}(\lambda^{-1} r_n - \lambda q_n)^2$. Up to the boundary term $R$, the weight of a given path is the weight of the time-reversed path times the exponential of minus the entropy production along the path. In the case of equilibrium this reduces to the usual detailed balance $I^{(0)}_{x,T}(\phi) = I^{(0)}_{Jy,T}(\tilde\phi) + G(y) - G(x)$. These identities are an asymptotic version of identities needed for the proof of the Gallavotti–Cohen fluctuation theorem [4, 9] for stochastic dynamics [15, 16, 18].

The paper is organized as follows: In Sect. 2 we recall the large deviation principle for the paths of Markovian stochastic differential equations and, using methods from control theory, we prove the required regularity properties of the function $V^{(\eta)}(x,y)$ defined in Eq. (6). Section 3 is devoted to an extension of Freidlin–Wentzell results to a certain class of diffusions with hypoelliptic generators (Theorem 3): we give a set of conditions under which the asymptotic behavior of the invariant measure is proved. The result of Sect. 2 implies that our model, under Assumptions K1 and K2, satisfies the conditions of Theorem 3. In Sect. 4 we prove the equality (10) and the bound (11), which depend on the particular properties of our model.

2. Large Deviations and Control Theory

In this section we first recall a certain number of concepts and theorems which will be central in our analysis: the large deviation principle for the sample paths of diffusions, introduced by Schilder for the Brownian motion [20] and generalized to arbitrary diffusions by [8, 1, 23] (see also [3]), and the relationship between diffusion processes and control theory, exemplified by the Support Theorem of Stroock and Varadhan [22]. With these tools we then prove several properties of the dynamics for our model.
We prove that “at zero temperature” the (deterministic) dynamics is dissipative: the ω-limit set is the set of the critical points of $G(p,q,r)$. We also prove several properties of the control system associated with Eq. (3): a local control property around the critical points of $G(p,q,r)$ and, roughly speaking, a global “smoothness” property of the weight of the paths between $x$ and $y$, when $x$ and $y$ vary. The central hypothesis in this analysis is condition K2: this condition implies the hypoellipticity [12] of the generator of the Markov semigroup associated with Eq. (3), but it implies in fact a kind of global hypoellipticity which will be used here to prove the aforementioned properties of the dynamics.

2.1. Sample paths large deviation and control theory. Let us consider the stochastic differential equation
\[
dx(t) = Y(x)\,dt + \varepsilon^{1/2} \sigma(x)\,dw(t), \tag{12}
\]
where $x \in X = \mathbb{R}^n$, $Y(x)$ is a $C^\infty$ vector field, $w(t)$ is an $m$-dimensional Wiener process and $\sigma(x)$ is a $C^\infty$ map from $\mathbb{R}^m$ to $\mathbb{R}^n$. Let $C([0,T])$ denote the Banach space of continuous functions with values in $\mathbb{R}^n$ equipped with the uniform topology. Let $L^2([0,T])$ denote the set of square integrable functions with values in $\mathbb{R}^m$ and $H_1([0,T])$
denote the space of absolutely continuous functions with values in $\mathbb{R}^m$ with square integrable derivatives. Let $x_\varepsilon(t)$ denote the solution of (12) with initial condition $x_\varepsilon(0) = x$. We assume that $Y(x)$ and $\sigma(x)$ are such that, for arbitrary $T$, the paths of the diffusion process $x_\varepsilon(t)$ belong to $C([0,T])$. We let $P^\varepsilon_x$ denote the probability measure on $C([0,T])$ induced by $x_\varepsilon(t)$, $0 \le t \le T$, and denote by $E^\varepsilon_x$ the corresponding expectation. We introduce the rate function $I_{x,T}(f)$ on $C([0,T])$ given by
\[
I_{x,T}(f) = \frac{1}{2} \inf_{\{g \in H_1:\ f(t) = x + \int_0^t Y(f(s))\,ds + \int_0^t \sigma(f(s))\,\dot g(s)\,ds\}} \int_0^T |\dot g(t)|^2\,dt, \tag{13}
\]
where, by definition, the infimum over an empty set is taken as $+\infty$. The rate function has a particularly convenient form for us since it accommodates degenerate situations where $\operatorname{rank} \sigma < n$. In [3], Corollary 5.6.15 (see also [1]), the following large deviation principle for the sample paths of the solution of (12) is proven. It gives a version of the large deviation principle which is uniform in the initial condition of the diffusion.

Theorem 2. Let $x_\varepsilon(t)$ denote the solution of Eq. (12) with initial condition $x$. Then, for any $x \in \mathbb{R}^n$ and for any $T < \infty$, the rate function $I_{x,T}(f)$ is a lower semicontinuous function on $C([0,T])$ with compact level sets (i.e. $\{f;\ I_{x,T}(f) \le \alpha\}$ is compact for any $\alpha \in \mathbb{R}$). Furthermore the family of measures $P^\varepsilon_x$ satisfies the large deviation principle on $C([0,T])$ with rate function $I_{x,T}(f)$:
1. For any compact $K \subset X$ and any closed $F \subset C([0,T])$,
\[
\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{x\in K} P^\varepsilon_x(x_\varepsilon \in F) \le -\inf_{x\in K} \inf_{\phi\in F} I_{x,T}(\phi).
\]
2. For any compact $K \subset X$ and any open $G \subset C([0,T])$,
\[
\liminf_{\varepsilon\to 0} \varepsilon \log \inf_{x\in K} P^\varepsilon_x(x_\varepsilon \in G) \ge -\sup_{x\in K} \inf_{\phi\in G} I_{x,T}(\phi).
\]
Recall that for our model given by Eq. (3), the rate function takes the form given in Eqs. (4) and (5). We introduce further the cost function $V_T(x,y)$ given by
\[
V_T(x,y) = \inf_{\phi \in C([0,T]):\ \phi(T) = y} I_{x,T}(\phi). \tag{14}
\]
Heuristically $V_T(x,y)$ describes the cost of forcing the system to be at $y$ at time $T$ starting from $x$ at time 0. The function $V(x,y)$ defined in the introduction, Eq. (6), is equal to
\[
V(x,y) = \inf_{T>0} V_T(x,y), \tag{15}
\]
and describes the minimal cost of forcing the system from $x$ to $y$ in an arbitrary amount of time. The form of the rate function suggests a connection between large deviations and control theory. In Eq. (13), the infimum is taken over functions $g \in H_1([0,T])$ which are more regular than a path of the Wiener process. If we do the corresponding substitution in Eq. (12), we obtain an ordinary differential equation
\[
\dot x(t) = Y(x(t)) + \sigma(x(t))\,u(t), \tag{16}
\]
where we have set $u(t) = \varepsilon^{1/2} \dot g(t) \in L^2([0,T])$. The map $u$ is called a control and Eq. (16) a control system. We fix an arbitrary time $T > 0$. We denote by $\varphi^u_x : [0,T] \to \mathbb{R}^n$ the solution of the differential equation (16) with control $u$ and initial condition $x$. The correspondence between the stochastic system Eq. (12) and the deterministic system Eq. (16) is exemplified by the Support Theorem of Stroock and Varadhan [22]. The support of the diffusion process $x(t)$ with initial condition $x$ on $[0,T]$ is, by definition, the smallest closed subset $S_x$ of $C([0,T])$ such that $P_x[x(t) \in S_x] = 1$. The Support Theorem asserts that the support of the diffusion is equal to the closure of the set of solutions of Eq. (16) as the control $u$ is varied: $S_x = \overline{\{\varphi^u_x : u \in L^2([0,T])\}}$, for all $x \in \mathbb{R}^n$.

The control system (16) is said to be strongly completely controllable if, for any $T > 0$ and any pair of points $x, y$, there exists a control $u$ such that $\varphi^u_x(0) = x$ and $\varphi^u_x(T) = y$. In [7] it is shown that, under condition K2, the control system associated with Eq. (3) is strongly completely controllable. This is an ergodic property and it implies [7] uniqueness of the invariant measure (provided it exists). In terms of the cost function $V_T(x,y)$ defined in (14), strong complete controllability simply means that $V_T(x,y) < \infty$ for any $T > 0$ and any $x, y$. The large deviation principle, Theorem 2, gives more quantitative information on the actual weight of paths between $x$ and $y$ in time $T$, in particular that the weight is $\sim \exp(-V_T(x,y)/\varepsilon)$. As we will see below, these weights determine completely the leading (exponential) behavior of the invariant measure for $x_\varepsilon(t)$, $\varepsilon \downarrow 0$.

2.2. Dissipative properties of the dynamics. We first investigate the ω-limit set of the dynamics “at temperature zero”, i.e., when both temperatures $T_1$, $T_n$ are set equal to zero in the equations of motion. In this case the dynamics is deterministic and, as the following result shows, dissipative.

Lemma 1.
Assume condition K2. Consider the system of differential equations given by
\[
\begin{aligned}
\dot q_i &= \nabla_{p_i} G, & i &= 1,\dots,n,\\
\dot p_i &= -\nabla_{q_i} G, & i &= 1,\dots,n,\\
\dot r_i &= -\gamma\lambda^2 \nabla_{r_i} G, & i &= 1, n.
\end{aligned} \tag{17}
\]
Then the ω-limit set of the flow given by Eq. (17) is the set of critical points of the generalized Hamiltonian $G(p,q,r) = \sum_{j=1,n} (\lambda^{-2} r_j^2/2 - r_j q_j) + H(p,q)$, i.e., the set $A = \{x \in \mathbb{R}^{2d(n+1)} : \nabla G(x) = 0\}$.

Proof. As noted in the introduction $G(x)$ is a Liapunov function for the flow given by (17). A simple computation shows that
\[
\frac{d}{dt} G(x(t)) = -\gamma\lambda^2 \sum_{i=1,n} \left( \lambda^{-2} r_i(t) - q_i(t) \right)^2 = -\gamma\lambda^2 \sum_{i=1,n} |\nabla_{r_i} G(x(t))|^2 \le 0.
\]
Therefore it is enough to show that the flow does not get “stuck” at some point of the hypersurfaces $(\lambda^{-2} r_i - q_i)^2 = 0$, $i = 1, n$, which does not belong to the set $A$.
Let us assume the contrary, i.e., $G(x(t))$ is constant for $t \in [T_1, T_2]$ so that $dG/dt = 0$, implying that
\[
\lambda^{-2} r_1(t) - q_1(t) = \nabla_{r_1} G(x(t)) = 0. \tag{18}
\]
Taking the time derivative of Eq. (18), and using $\dot r_1 = -\gamma\lambda^2 \nabla_{r_1} G = 0$, yields $p_1 = \nabla_{p_1} G = 0$. Since $p_1 \equiv 0$, $q_1$ is constant, by Eq. (18) $r_1$ is constant, and
\[
0 = -\dot p_1(t) = \nabla_{q_1} G(x(t)) = \nabla_{q_1} V(q(t)) - r_1(t). \tag{19}
\]
Equation (19) implies that $q_2$ is constant, since $\nabla_{q_1} V$ is a function of $q_1$ and $q_2$ only and is a diffeomorphism in $q_2$ (since $U^{(2)}$ is strictly convex). Thus $p_2 = \dot q_2 = \nabla_{p_2} G = 0$. Proceeding inductively we find that if $G(x(t))$ is constant for $t \in [T_1, T_2]$, then $\nabla G(x(t)) = 0$. This concludes the proof of Lemma 1.
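The dissipation identity used in the proof of Lemma 1 can be checked numerically on a small instance. The sketch below integrates the zero-temperature flow (17) for $n = 2$, $d = 1$ with a classical fourth-order Runge–Kutta scheme and records $G$ along the trajectory. The potentials $U^{(1)}(q) = q^2/2 + q^4/4$, $U^{(2)}(q) = q^2/2$, the step size, and all function names are assumptions made for this illustration only.

```python
def grad_G(x, lam2=1.0):
    """Gradient of G for n = 2, d = 1, with x = (q1, q2, p1, p2, r1, r2)."""
    q1, q2, p1, p2, r1, r2 = x
    # Assumed potentials: U1(q) = q^2/2 + q^4/4, U2(q) = q^2/2 (strictly convex).
    dV1 = q1 + q1 ** 3 + (q1 - q2)
    dV2 = q2 + q2 ** 3 - (q1 - q2)
    return [dV1 - r1, dV2 - r2, p1, p2, r1 / lam2 - q1, r2 / lam2 - q2]


def G(x, lam2=1.0):
    """Generalized Hamiltonian G(p, q, r) for the same assumed potentials."""
    q1, q2, p1, p2, r1, r2 = x
    U1 = lambda q: q * q / 2 + q ** 4 / 4
    H = (p1 * p1 + p2 * p2) / 2 + U1(q1) + U1(q2) + (q1 - q2) ** 2 / 2
    return (r1 * r1 + r2 * r2) / (2 * lam2) - r1 * q1 - r2 * q2 + H


def flow(x, gamma=1.0, lam2=1.0):
    """Right-hand side of Eq. (17): Hamiltonian in (q, p), gradient-like in r."""
    gq1, gq2, gp1, gp2, gr1, gr2 = grad_G(x, lam2)
    return [gp1, gp2, -gq1, -gq2, -gamma * lam2 * gr1, -gamma * lam2 * gr2]


def rk4_step(x, dt):
    """One classical Runge-Kutta step for the flow above."""
    def shifted(k, s):
        return [xi + s * ki for xi, ki in zip(x, k)]
    k1 = flow(x)
    k2 = flow(shifted(k1, dt / 2))
    k3 = flow(shifted(k2, dt / 2))
    k4 = flow(shifted(k3, dt))
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def energy_along_flow(x0, dt=0.005, steps=4000):
    """Values of G along the numerically integrated trajectory started at x0."""
    x = list(x0)
    values = [G(x)]
    for _ in range(steps):
        x = rk4_step(x, dt)
        values.append(G(x))
    return values
```

Up to discretization error, the recorded values of $G$ should be non-increasing, with strict decrease whenever $\nabla_r G \neq 0$; this mirrors the statement that the flow cannot remain on the hypersurfaces $\lambda^{-2} r_i = q_i$ outside the critical set.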
2.3. Continuity properties of $V^{(\eta)}_T(x,y)$. It will be important to establish certain continuity properties of the cost function $V^{(\eta)}_T(x,y)$. We prove first a global property: we show that for any time $T$, $V^{(\eta)}_T(x,y)$ as a map from $X \times X \to \mathbb{R}$ is everywhere finite and upper semicontinuous. Furthermore we need a local property of $V^{(\eta)}_T(x,y)$ near the ω-limit set of the zero-temperature dynamics (see Lemma 1). We prove that if $x$ and $y$ are sufficiently close to this ω-limit set then $V^{(\eta)}_T(x,y)$ is small. Both results are obtained using control theory and hypoellipticity.

Proposition 1. Assume condition K2. Then the functions $V^{(\eta)}_T$, for all $T > 0$, and $V^{(\eta)}$ are upper semicontinuous maps $X \times X \to \mathbb{R}$.
Proof. By definition $V^{(\eta)}_T(y,z)$ is given by
\[
V^{(\eta)}_T(y,z) = \inf \frac{1}{2} \int_0^T \left( u_1(t)^2 + u_n(t)^2 \right) dt, \tag{20}
\]
where the infimum in (20) is taken over all $u = (u_1, u_n) \in L^2([0,T])$ such that
\[
\dot q = \nabla_p G, \qquad \dot p = -\nabla_q G, \qquad \dot r = -\gamma\lambda^2 \nabla_r G + (2\gamma\lambda^2 D)^{1/2} u, \tag{21}
\]
with boundary conditions $(p(0), q(0), r(0)) = y$ and $(p(T), q(T), r(T)) = z$. In other words, the infimum in (20) is taken over all controls $u$ which steer $y$ to $z$. We first show that, for any $y$ and $z$, there is a control which steers $y$ to $z$, i.e., that $V^{(\eta)}_T(y,z) < \infty$. By condition K2, $\nabla_q U^{(2)}(q)$ is a diffeomorphism. As a consequence the identity (we set $r_1 \equiv q_0$ and $r_n \equiv q_{n+1}$)
\[
\ddot q_l = -\nabla_{q_l} G(q_{l-1}, q_l, q_{l+1}), \qquad l = 1,\dots,n,
\]
can be solved for either $q_{l-1}$ or $q_{l+1}$: there are smooth functions $G_l$ and $H_l$ such that
\[
q_{l-1} = G_l(q_l, \ddot q_l, q_{l+1}), \qquad q_{l+1} = H_l(q_{l-1}, q_l, \ddot q_l). \tag{22}
\]
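For an interior site $2 \le l \le n-1$ the functions in (22) can be written out explicitly. The following worked step is a sketch based only on the form (1) of $V$ and on the invertibility of $\nabla U^{(2)}$ asserted above; the boundary sites $l = 1, n$ are simpler, since $r_1 = q_0$ and $r_n = q_{n+1}$ enter the equations for $\ddot q_1$ and $\ddot q_n$ linearly.

```latex
% Equation of motion at an interior site, from V(q) in Eq. (1):
\ddot q_l = -\nabla U^{(1)}(q_l) - \nabla U^{(2)}(q_l - q_{l+1})
            + \nabla U^{(2)}(q_{l-1} - q_l).
% Solving for q_{l+1}, using that \nabla U^{(2)} is invertible:
q_{l+1} = H_l(q_{l-1}, q_l, \ddot q_l)
        = q_l - \bigl(\nabla U^{(2)}\bigr)^{-1}\!\Bigl(
            \nabla U^{(2)}(q_{l-1} - q_l) - \nabla U^{(1)}(q_l) - \ddot q_l \Bigr).
% Solving instead for q_{l-1}:
q_{l-1} = G_l(q_l, \ddot q_l, q_{l+1})
        = q_l + \bigl(\nabla U^{(2)}\bigr)^{-1}\!\Bigl(
            \ddot q_l + \nabla U^{(1)}(q_l) + \nabla U^{(2)}(q_l - q_{l+1}) \Bigr).
```

In the harmonic case $U^{(2)}(q) = k q^2/2$ with $k > 0$ the inverse is simply division by $k$, which makes the role of the coupling strength in propagating information along the chain transparent.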
Using this we now rewrite the equations in the following form: We assume for simplicity that $n$, the number of oscillators, is an even number and we set $j = n/2$. (If $n$ is odd, take $j = (n+1)/2$ and up to notational modifications the argument goes as in the even case.) It follows inductively from Eq. (22) and their derivatives and from the equations for $r_1 = q_0$ and $r_n = q_{n+1}$ (see Eq. (21)) that we can express $u_1$, $u_n$ and $q_0, \dots, q_{n+1}$, $p_1, \dots, p_n$ as functions of $q_j$ and $q_{j+1}$ and their derivatives up to order $2j+1$. Writing $q^{[\alpha]} \equiv (q, q^{(1)}, \dots, q^{(\alpha)})$, a straightforward induction argument shows that there are smooth maps $B$ and $N$ so that
\[
(u_1, u_n) = B\left( q_j^{[2j+1]}, q_{j+1}^{[2j+1]} \right), \tag{23}
\]
and
\[
(q_0, \dots, q_{n+1}, p_1, \dots, p_n) = N\left( q_j^{[2j]}, q_{j+1}^{[2j]} \right).
\]
Conversely, differentiating repeatedly the equations of motion we can express $q_j^{[2j]}$ and $q_{j+1}^{[2j]}$ as a function of $q_0, \dots, q_{n+1}, p_1, \dots, p_n$: there is a smooth map $M$ such that
\[
\left( q_j^{[2j]}, q_{j+1}^{[2j]} \right) = M(q_0, \dots, q_{n+1}, p_1, \dots, p_n).
\]
Thus $N$ is a diffeomorphism with inverse $M$. We have proven the following: The system of Eqs. (21) with given boundary conditions at $t = 0$ and $t = T$ is equivalent to Eq. (23) with the boundary data
\[
\left( q_j^{[2j]}(0), q_{j+1}^{[2j]}(0) \right) = M(y), \qquad \left( q_j^{[2j]}(T), q_{j+1}^{[2j]}(T) \right) = M(z). \tag{24}
\]
From this the assertion of the proposition follows easily: First we see that $V^{(\eta)}_T(y,z)$ is finite, for all $T > 0$ and for all $y, z$. Indeed choose any sufficiently smooth curves $q_j(t)$ and $q_{j+1}(t)$ which satisfy the boundary conditions (24) and consider the $u$ given by Eq. (23). Then the function
\[
(q_0(t), \dots, q_{n+1}(t), p_1(t), \dots, p_n(t)) = N\left( q_j^{[2j]}(t), q_{j+1}^{[2j]}(t) \right)
\]
is a solution of Eq. (21) with a control $u(t)$ given by (23) which steers $y$ to $z$.

In order to prove the upper semicontinuity of $V^{(\eta)}_T(y,z)$, let us choose some $C > 0$. By definition of $V^{(\eta)}_T$ there is a control $u$ which steers $y$ to $z$ along a path $\phi = \phi^u$ such that
\[
I^{(\eta)}_{y,T}(\phi^u) \le V^{(\eta)}_T(y,z) + C/2,
\]
and
\[
u(t) = B\left( q_j^{[2j+1]}(t), q_{j+1}^{[2j+1]}(t) \right).
\]
Let $\delta$ be chosen sufficiently small so that if
\[
\sup_{t\in[0,T]} \left( \left| q_j^{[2j+1]} - \tilde q_j^{[2j+1]} \right| + \left| q_{j+1}^{[2j+1]} - \tilde q_{j+1}^{[2j+1]} \right| \right) \le \delta, \tag{25}
\]
for $\tilde q_j$, $\tilde q_{j+1}$ corresponding to a path $\tilde\phi$ and control $\tilde u$, then
\[
\sup_{t\in[0,T]} |u(t) - \tilde u(t)| \le \frac{C}{T} \tag{26}
\]
is true. But since $N$ is a diffeomorphism, the set $\{(\tilde y, \tilde z);\ \tilde\phi(0) = \tilde y,\ \tilde\phi(T) = \tilde z\}$ with $\tilde\phi$ satisfying Eqs. (25) and (26) is a neighborhood of $(y,z)$. Hence
\[
V^{(\eta)}_T(\tilde y, \tilde z) \le I^{(\eta)}_{\tilde y,T}(\phi^{\tilde u}) \le V^{(\eta)}_T(y,z) + C.
\]
This shows the upper semicontinuity of $V^{(\eta)}_T(y,z)$, and the upper semicontinuity of $V^{(\eta)}(y,z)$ follows easily from this. This concludes the proof of Proposition 1.

An immediate consequence of this proposition is a bound on the cost function around critical points of the generalized Hamiltonian $G$.

Corollary 1. For any $x \in A = \{y : \nabla G(y) = 0\}$ and any $h > 0$ there is $\delta > 0$ such that, if $|y - x| + |z - x| \le \delta$, then one has $V^{(\eta)}(y,z) \le h$.

Proof. If $x \in A$, $x$ is a critical point of Eq. (17) and, as a consequence, the control $u \equiv 0$ steers $x$ to $x$ and hence $V^{(\eta)}(x,x) = 0$. The upper semicontinuity of $V^{(\eta)}(y,z)$ immediately implies the statement of the corollary.

Remark 5. This corollary slightly falls short of what is needed to obtain the asymptotics of the invariant measure. More detailed information about the geometry of the control paths around the critical points is needed and will be proved in the next subsection.

2.4. Geometry of the paths around the critical points. Let us consider a control system of the form
\[
\dot x = Y(x) + \sum_{i=1}^{m} X_i(x)\, u_i, \tag{27}
\]
where $x \in \mathbb{R}^n$ and $Y(x)$, $X_i(x)$ are smooth vector fields. We assume that $Y(x)$, $X_i(x)$ are such that Eq. (27) has a unique solution for all time $t > 0$. We want to investigate properties of the set which can be reached from a given point by allowing only controls with bounded size. The class of controls $u$ we consider is given by
\[
U_M = \{u \text{ piecewise smooth, with } |u_i(t)| \le M,\ 1 \le i \le m\}.
\]
We denote by $Y^M_{\le\tau}(x)$ the set of points which can be reached from $x$ in time less than $\tau$ with a control $u \in U_M$. We say that the control system is small-time locally controllable (STLC) at $x$ if $Y^M_{\le\tau}(x)$ contains a neighborhood of $x$ for every $\tau > 0$. The following result is standard in control theory, see e.g. [21, 17] for a proof.

Proposition 2. Consider the control system Eq. (27) with $u \in U_M$. Let $x_0$ be a critical point of $Y(x)$, i.e., $Y(x_0) = 0$. If the linear span of the brackets
\[
\mathrm{ad}^k(Y)(X_i)(x), \qquad i = 1,\dots,m, \quad k = 0, 1, 2, \dots,
\]
has rank $n$ at $x_0$ then Eq. (27) is STLC at $x_0$.
Proof. One proves Proposition 2 by linearizing around $x_0$ and using e.g. the implicit function theorem, see e.g. [17], Chapter 6, Theorem 1.

As a consequence of Proposition 2 and results obtained in [6] one gets

Lemma 2. Consider the control system given by Eqs. (21) with $u \in U_M$. Let $x_0$ be a critical point of $G(x)$. If condition K2 is satisfied, then the system (21) is STLC at $x_0$.

Proof. An explicit computation, see [6], shows that condition K2 implies that the brackets
\[
\mathrm{ad}^k(Y)(X_i)(x), \qquad i = 1,\dots,m, \quad k = 0,\dots,n,
\]
generate the tangent space at each point $x$, in particular at every critical point $x_0$. Therefore by Proposition 2, the control system Eq. (21) is STLC at $x_0$.

With these results we can derive the basic fact on the geometry of the control paths around critical points of $G(x)$.

Proposition 3. Consider the control system given by (21). Let $x_0$ be a critical point of $G(x)$ and $B(\rho)$ the ball of radius $\rho$ centered at $x_0$. Then for any $h > 0$, $\rho' > 0$, there are $M, T > 0$, and $\rho > 0$ with $\rho < \rho'/3$ such that for all $x, y \in B(\rho)$, there is $u \in U_M$ with
\[
\phi^u(0) = x, \qquad \phi^u(T) = y, \qquad \phi^u(t) \in B(2\rho'/3) \ \text{for } t \in [0,T],
\]
and $I_{x,T}(\phi^u) \le h$.

Proof. Together with the control system (21), we consider the time-reversed system
\[
\dot{\tilde q} = -\nabla_p G, \qquad \dot{\tilde p} = \nabla_q G, \qquad \dot{\tilde r} = \gamma\lambda^2 \nabla_r G + (2\gamma\lambda^2 D)^{1/2} u. \tag{28}
\]
Lemma 2 implies the STLC of the control system (21). Furthermore from Proposition 2 it is easy to see that the control system (28) is STLC if and only if the control system (21) is. We denote by $\phi^u$ ($\tilde\phi^u$) the solution of Eq. (21) (Eq. (28)) and by $Y^M_{\le T}(x)$ ($\tilde Y^M_{\le T}(x)$) the set of reachable points for the control system (21) ((28)). We now choose $M$ and $T$ such that $M^2 T \le h$ and such that $Y^M_{\le T}(x), \tilde Y^M_{\le T}(x) \subset B(2\rho'/3)$. By Lemma 2, $Y^M_{\le T}(x)$ and $\tilde Y^M_{\le T}(x)$ contain a neighborhood $B(\rho)$ of $x_0$ for $|x - x_0|$ sufficiently small, with $\rho < \rho'/3$. Therefore there are controls $u_1, u_2 \in U_M$ and $\tau_1, \tau_2 \le T$ such that
\[
\phi^{u_1}(0) = x_0, \quad \phi^{u_1}(\tau_1) = y, \qquad \tilde\phi^{u_2}(0) = x_0, \quad \tilde\phi^{u_2}(\tau_2) = x.
\]
By reversing the time, the trajectory $\tilde\phi^{u_2}(t)$ yields a trajectory $\phi^{u_2}(t)$ with $\phi^{u_2}(0) = x$ and $\phi^{u_2}(\tau_2) = x_0$. Concatenating the trajectories $\phi^{u_2}(t)$ and $\phi^{u_1}(t)$ yields a path $\phi$ from $x$ to $y$ which does not leave the ball $B(2\rho'/3)$ and for which we have the estimate
\[
I_{x,2T}(\phi) = \frac{1}{2} \int_0^{\tau_1+\tau_2} dt\, |u(t)|^2 \le \frac{1}{2} M^2 (\tau_1 + \tau_2) \le h,
\]
and this concludes the proof of Proposition 3.
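The rank condition of Proposition 2, which enters the argument above through Lemma 2, can be checked by hand in a linear toy case. For a linear system $\dot x = Ax + \sum_i b_i u_i$ the brackets $\mathrm{ad}^k(Y)(X_i)$ are, up to sign, the constant vectors $A^k b_i$, so the condition reduces to the classical Kalman rank test. The sketch below assumes a harmonic chain with $n = 3$, $d = 1$, pinning $U^{(1)}(q) = a q^2/2$ and coupling $U^{(2)}(q) = k q^2/2$; this is not the anharmonic model of the paper, and the function names and parameter values are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in vec] for vec in vectors]
    rk = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(rk + 1, len(M)):
            f = M[i][c] / M[rk][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[rk])]
        rk += 1
    return rk


def chain_matrix(k, a=1, gamma=1, lam2=1):
    """Drift matrix A of Eq. (21) for x = (q1, q2, q3, p1, p2, p3, r1, r3)."""
    A = [[0] * 8 for _ in range(8)]
    A[0][3] = A[1][4] = A[2][5] = 1                   # dq_i/dt = p_i
    A[3][0], A[3][1], A[3][6] = -(a + k), k, 1        # dp_1/dt
    A[4][0], A[4][1], A[4][2] = k, -(a + 2 * k), k    # dp_2/dt
    A[5][1], A[5][2], A[5][7] = k, -(a + k), 1        # dp_3/dt
    A[6][0], A[6][6] = gamma * lam2, -gamma           # dr_1/dt (drift part)
    A[7][2], A[7][7] = gamma * lam2, -gamma           # dr_3/dt (drift part)
    return A


def kalman_rank(A, controls):
    """Rank of the span of A^k b for k = 0, ..., N-1 and b in controls."""
    N = len(A)
    vectors = []
    for b in controls:
        v = list(b)
        for _ in range(N):
            vectors.append(v)
            v = [sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
    return rank(vectors)


# The controls act on r_1 and r_3 only (positive scalar factors do not change the rank).
CONTROLS = [[0, 0, 0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 0, 0, 1]]
```

With a nonzero coupling, `kalman_rank(chain_matrix(k=1), CONTROLS)` gives the full dimension 8, whereas with `k=0` the middle oscillator is decoupled from both reservoirs and the rank drops to 6. This is the linear shadow of the role played by the strict convexity of $U^{(2)}$ in condition K2.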
3. Asymptotics of the Invariant Measure

We consider a stochastic differential equation of the form
\[
dx_\varepsilon = Y(x_\varepsilon)\,dt + \varepsilon^{1/2} \sigma(x_\varepsilon)\,dw, \tag{29}
\]
where $x \in X = \mathbb{R}^n$, $Y(x)$ is a $C^\infty$ vector field, $\sigma(x)$ a $C^\infty$ map from $\mathbb{R}^m$ to $\mathbb{R}^n$ and $w(t)$ a standard $m$-dimensional Wiener process. We view the stochastic process given by Eq. (29) as a small perturbation of the dynamical system
\[
\dot x = Y(x). \tag{30}
\]
We denote by $I_{x,T}(\cdot)$ the large deviation functional associated with Eq. (29) (see Eq. (13)) and by $V_T(x,y)$ and $V(x,y)$ the cost functions given by (14) and (15). The functions $V(K_i,K_j)$, $V(K_i,z)$, $W(K_i)$ and $W(z)$ are defined as in Eqs. (6), (7), (8), and (9). We assume that the diffusion $x_\varepsilon$ satisfies Condition K1 of the introduction. In addition we require:

L2 The diffusion process $x_\varepsilon(t)$ has a hypoelliptic generator, and for any $x$ in the $\omega$-limit set of the deterministic flow (30) the control system associated with Eq. (29) is small-time locally controllable.

L3 The diffusion process is strongly completely controllable and, for any $T > 0$, $V_T(x,y)$ is upper semicontinuous as a map from $X \times X$ to $\mathbb{R}$.

Remark 6. It is shown in Sect. 2 that, for the model we consider, Condition K2 implies Conditions L2 and L3, and also that the $\omega$-limit set of the deterministic flow is the set of critical points of the Hamiltonian $G$.

We call a domain $D \subset X$ regular if its boundary $\partial D$ is a piecewise smooth manifold. Then we have

Theorem 3. Assume Conditions K1, L2, and L3. Let $D$ be a regular domain with compact closure such that $\mathrm{dist}(D, \cup_i K_i) > 0$. Then the (unique) invariant measure $\mu_\varepsilon$ of the process $x_\varepsilon(t)$ satisfies

$$\lim_{\varepsilon\to 0} \varepsilon \ln \mu_\varepsilon(D) = -\inf_{z\in D} W(z). \tag{31}$$

In particular if there is a single critical set $K$ one has

$$\lim_{\varepsilon\to 0} \varepsilon \ln \mu_\varepsilon(D) = -\inf_{z\in D} V(K,z). \tag{32}$$
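A simple sanity check of Eq. (32) is possible in a case where the invariant measure is known in closed form. The sketch below assumes the hypothetical one-dimensional elliptic example $dx = -x\,dt + \varepsilon^{1/2}dw$, which is not the degenerate model of this paper: its invariant density is proportional to $\exp(-x^2/\varepsilon)$, the single critical set is $K = \{0\}$, and $V(K,z) = z^2$. For $D = (1,2)$ the limit in Eq. (32) should then be $-1$. The function name `eps_log_mu` is illustrative.

```python
import math


def eps_log_mu(eps, a=1.0, b=2.0):
    """eps * ln mu_eps((a, b)) for dx = -x dt + sqrt(eps) dw, with 0 < a < b.

    The invariant law is Gaussian with variance eps / 2, hence
    mu_eps((a, b)) = (erfc(a / sqrt(eps)) - erfc(b / sqrt(eps))) / 2.
    For very small eps the value of erfc underflows to zero in floating point.
    """
    s = math.sqrt(eps)
    mu = 0.5 * (math.erfc(a / s) - math.erfc(b / s))
    return eps * math.log(mu)


values = [eps_log_mu(eps) for eps in (0.1, 0.01, 0.005)]
```

The entries of `values` approach $-\inf_{z\in D} z^2 = -1$ from below as `eps` decreases, the gap being the subexponential prefactor that the logarithmic limit ignores.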
We first recall some general results on hypoelliptic diffusions obtained in [14], in particular a very useful representation of the invariant measure $\mu_\varepsilon$ in terms of embedded Markov chains [11], see Proposition 4 below. Then we prove the large deviation estimates.

Let $U$ and $V$ be open subsets of $X$ with compact closure with $U \subset V$. Below, $U$ and $V$ will be the disjoint union of small neighborhoods of the sets $K_i$. We introduce an increasing sequence of Markov times $\tau_0, \sigma_0, \tau_1, \ldots$ defined as follows. We set $\tau_0 = 0$ and

$$\sigma_n = \inf\{t > \tau_n : x_\varepsilon(t) \in \partial V\}, \tag{33}$$

$$\tau_n = \inf\{t > \sigma_{n-1} : x_\varepsilon(t) \in \partial U\}. \tag{34}$$
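The alternation of the times in Eqs. (33) and (34) can be made concrete on a sampled path. The sketch below is a discrete approximation under stated assumptions: the path is scalar, $U = \{|x| < \rho\}$ and $V = \{|x| < \rho'\}$, and a boundary is considered hit at the first sample lying on or beyond it. The function name `markov_times` and the sample values are hypothetical.

```python
def markov_times(path, rho, rho_prime):
    """Return (taus, sigmas) as sample indices, with tau_0 = 0.

    sigma_n: first index after tau_n with |x| >= rho_prime (exit from V);
    tau_n:   first index after sigma_{n-1} with |x| <= rho (return to U).
    """
    taus, sigmas = [0], []
    waiting_for_exit = True  # after each tau_n we wait for the exit from V
    for k in range(1, len(path)):
        r = abs(path[k])
        if waiting_for_exit and r >= rho_prime:
            sigmas.append(k)
            waiting_for_exit = False
        elif not waiting_for_exit and r <= rho:
            taus.append(k)
            waiting_for_exit = True
    return taus, sigmas


path = [0.0, 0.5, 1.2, 0.8, 0.1, 0.3, 1.5, 0.05]
taus, sigmas = markov_times(path, rho=0.2, rho_prime=1.0)
```

Excursions that leave $U$ but return before reaching $\partial V$ (such as the sample at index 5) do not create a new pair of times, which is the reason for using two nested neighborhoods.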
As a consequence of hypoellipticity and the strong complete controllability of the control problem associated with the diffusion $x_\varepsilon(t)$ (Conditions L2 and L3) we have the following result ([14], Theorem 4.1): If the diffusion $x_\varepsilon(t)$ is hypoelliptic and strongly completely controllable then the diffusion admits a (unique) invariant measure $\mu_\varepsilon$ if and only if $x_\varepsilon(t)$ is positive recurrent. It follows from this result that, almost surely, the Markov times $\tau_j$ and $\sigma_j$ defined in Eqs. (33) and (34) are finite. An important ingredient in the proof of this result in [14] is the following representation of the invariant measure $\mu_\varepsilon$ in terms of an invariant measure $l_\varepsilon(dx)$ for the Markov chain $\{x_\varepsilon(\tau_j)\}$ on the (compact) state space $\partial U$; see e.g. [11], Chap. IV, Lemma 4.2 for a proof.

Proposition 4. Let the measure $\nu_\varepsilon$ be defined as

$$\nu_\varepsilon(D) = \int_{\partial U} l_\varepsilon(dx)\, E_x^\varepsilon \int_0^{\tau_1} 1_D(x_\varepsilon(t))\,dt, \tag{35}$$

where $D$ is a Borel set and $1_D$ is the characteristic function of the set $D$. Then one has

$$\mu_\varepsilon(D) = \frac{\nu_\varepsilon(D)}{\nu_\varepsilon(X)}.$$
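The mechanism behind Proposition 4 can be checked exactly in a much simpler setting. The sketch below is an analogue under stated assumptions, not the continuous-time statement itself: a discrete-time chain on four states with a hypothetical transition matrix `P`, a set `A` playing the role of $\partial U$, and $U$ and $V$ collapsed so that an excursion ends at the first return to `A`. The names `stationary` and `cycle_representation` are illustrative.

```python
def stationary(M, iters=2000):
    """Left fixed vector of a stochastic matrix, by power iteration."""
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iters):
        v = [sum(v[i] * M[i][j] for i in range(n)) for j in range(n)]
    return v


def cycle_representation(P, A, iters=2000):
    """Rebuild the invariant law of P from excursions between visits to A."""
    n = len(P)
    B = [s for s in range(n) if s not in A]
    nb = len(B)
    occ = {}  # occ[a][i]: expected visits to B[i] before the return to A
    for a in A:
        m = [P[a][b] for b in B]  # mass sitting in B after one step
        total = m[:]
        for _ in range(iters):
            m = [sum(m[i] * P[B[i]][B[j]] for i in range(nb)) for j in range(nb)]
            total = [t + x for t, x in zip(total, m)]
        occ[a] = total
    # Transition matrix of the chain observed only at its visits to A.
    Q = [[P[a][a2] + sum(occ[a][i] * P[B[i]][a2] for i in range(nb)) for a2 in A]
         for a in A]
    l = stationary(Q, iters)  # analogue of the measure l_eps
    nu = [0.0] * n            # analogue of the unnormalized measure nu_eps
    for ia, a in enumerate(A):
        nu[a] += l[ia]        # time 0 of each excursion is spent at a itself
        for i, b in enumerate(B):
            nu[b] += l[ia] * occ[a][i]
    z = sum(nu)
    return [x / z for x in nu]


P = [[0.5, 0.2, 0.2, 0.1],
     [0.1, 0.6, 0.2, 0.1],
     [0.2, 0.2, 0.4, 0.2],
     [0.3, 0.1, 0.1, 0.5]]
from_cycles = cycle_representation(P, [0, 1])
direct = stationary(P)
```

Here `from_cycles` and `direct` agree up to rounding, which is the finite-state counterpart of $\mu_\varepsilon = \nu_\varepsilon/\nu_\varepsilon(X)$.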
Up to normalization, the invariant measure $\mu_\varepsilon$ assigns to a set $D$ a measure equal to the time spent by the process in $D$ between two consecutive hits on $\partial U$.

The proof of Theorem 3 is quite long and will be split into a sequence of lemmas. The proof is based on the following ideas: As $\varepsilon \to 0$ the invariant measure is more and more concentrated on a small neighborhood of the critical set $\cup_i K_i$. To estimate the measure of a set $D$ one uses the representation of the invariant measure given in Proposition 4, where the sets $U$ and $V$ are neighborhoods of the sets $\{K_i\}$.

Let $\rho > 0$ and denote by $B(i,\rho)$ the $\rho$-neighborhood of $K_i$ and $B(\rho) = \cup_i B(i,\rho)$. Let $D$ be a regular open set such that $\mathrm{dist}(\cup_i K_i, D) > 0$. We choose $\rho'$ so small that $\mathrm{dist}(B(i,\rho'), B(j,\rho')) > 0$ for $i \neq j$ and $\mathrm{dist}(B(i,\rho'), D) > 0$ for $i = 1, \ldots, L$, and we choose $\rho > 0$ such that $0 < \rho < \rho'$. We set $U = B(\rho)$ and $V = B(\rho')$. We let $\sigma_0$ and $\tau_1$ be the Markov times defined in Eqs. (33) and (34) and let $\tau_D$ be the Markov time defined as follows: $\tau_D = \inf\{t : x_\varepsilon(t) \in D\}$.

The first two lemmas will yield an upper bound on $\nu_\varepsilon(D)$, the unnormalized measure given by Eq. (35). The first lemma shows that, for $\varepsilon$ sufficiently small, the probability that the diffusion wanders around without hitting $B(\rho)$ or $D$ is negligible.

Lemma 3. For any compact set $K$ one has

$$\lim_{T\to\infty} \limsup_{\varepsilon\to 0} \varepsilon \log \sup_{x\in K} P_x^\varepsilon(\min\{\tau_D, \tau_1\} > T) = -\infty.$$
Proof. From Condition K1 and the Markov inequality we obtain

$$P_x^\varepsilon(\min\{\tau_D, \tau_1\} > T) \le \frac{1}{T} E_x^\varepsilon(\min\{\tau_D, \tau_1\}) \le \frac{1}{T} E_x^\varepsilon(\tau_1) < \infty,$$

uniformly in $\varepsilon \to 0$ and, by L2, uniformly in $x \in K$, since the diffusion has a hypoelliptic generator and thus $E_x(\tau_1)$ is a $C^\infty$ function of $x$.
Instead of the quantities $V(K_i,K_j)$ and $V(K_i,z)$, it is useful to introduce the following quantities:

$$\tilde V(K_i,K_j) = \inf_{T>0} \inf\{ I_{x,T}(\phi) : \phi(0) \in K_i,\ \phi(T) \in K_j,\ \phi(t) \notin \cup_{l\neq i,j} K_l \},$$

$$\tilde V(K_i,z) = \inf_{T>0} \inf\{ I_{x,T}(\phi) : \phi(0) \in K_i,\ \phi(T) = z,\ \phi(t) \notin \cup_{l\neq i} K_l \}.$$
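The following lemma will yield an upper bound on $\nu_\varepsilon(D)$, where $\nu_\varepsilon$ is the (unnormalized) measure given by Eq. (35).

Lemma 4. Given $h > 0$, for $0 < \rho < \rho'$ sufficiently small one has

(i) $$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{y\in\partial B(i,\rho')} P_y^\varepsilon(\tau_D < \tau_1) \le -\Big(\inf_{z\in D} \tilde V(K_i,z) - h\Big),$$

(ii) $$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{y\in\partial B(i,\rho')} P_y^\varepsilon(x_\varepsilon(\tau_1) \in \partial B(j,\rho)) \le -\big(\tilde V(K_i,K_j) - h\big).$$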
Proof. We first prove item (i). If $\inf_{z\in D} \tilde V(K_i,z) = +\infty$ there is no curve connecting $K_i$ to $z \in D$ without touching the other $K_j$, $j \neq i$. Therefore $P_y^\varepsilon(\tau_D < \tau_1) = 0$ and there is nothing to prove. Otherwise, for $h > 0$ we set $\tilde V_h = \inf_{z\in D} \tilde V(K_i,z) - h$. Since $V(y,z)$ satisfies the triangle inequality, we have, by Condition L2 (see Corollary 1), that, for $\rho$ small enough,

$$\inf_{y\in\partial B(i,\rho')} \inf_{z\in D} \tilde V(y,z) \ge \inf_{z\in D} \tilde V(K_i,z) - \sup_{y\in\partial B(i,\rho')} \tilde V(K_i,y) \ge \tilde V_h,$$

where

$$\tilde V(y,z) = \inf_{T>0} \inf\{ I_{x,T}(\phi) : \phi(0) = y,\ \phi(T) = z,\ \phi(t) \notin \cup_{l\neq i} K_l \}.$$

By Lemma 3, there is $T < \infty$ such that

$$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{y\in\partial B(i,\rho')} P_y^\varepsilon(\tau_D \wedge \tau_1 > T) < -\tilde V_h. \tag{36}$$
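Let $G_T$ denote the subset of $C([0,T])$ which consists of functions $\phi(t)$ such that $\phi(t) \in D$ for some $t \in [0,T]$ and $\phi(t) \notin B(\rho)$ if $t \le \inf\{s : \phi(s) \in D\}$. The set $G_T$ is closed, as is seen by considering its complement. We have

$$\inf_{y\in\partial B(i,\rho')} \inf_{\phi\in G_T} I_{y,T}(\phi) \ge \inf_{y\in\partial B(i,\rho')} \inf_{z\in D} \tilde V(y,z) \ge \tilde V_h,$$

and thus by Theorem 2, we have

$$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{y\in\partial B(i,\rho')} P_y^\varepsilon(x_\varepsilon \in G_T) \le -\inf_{y\in\partial B(i,\rho')} \inf_{\phi\in G_T} I_{y,T}(\phi) \le -\tilde V_h.$$

We have the inequality

$$P_y^\varepsilon(\tau_D < \tau_1) \le P_y^\varepsilon(\tau_D \wedge \tau_1 > T) + P_y^\varepsilon(x_\varepsilon \in G_T), \tag{37}$$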
and combining the estimates (36) and (37) yields

$$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{y\in\partial B(i,\rho')} P_y^\varepsilon(\tau_D < \tau_1) \le -\tilde V_h.$$
This completes the proof of item (i) of Lemma 4. The proof of part (ii) of the lemma is very similar to the first part and follows closely the corresponding estimates in [8], Chapter 6, Lemma 2.1. The details are left to the reader.

The following lemma will yield a lower bound on $\nu_\varepsilon(D)$. It makes full use of the information contained in Lemmas 1 and 3.

Lemma 5. Given $h > 0$, for $0 < \rho < \rho'$ sufficiently small one has

(i) $$\liminf_{\varepsilon\to 0} \varepsilon \log \inf_{x\in\partial B(i,\rho)} P_x^\varepsilon(\tau_D < \tau_1) \ge -\Big(\inf_{z\in D} \tilde V(K_i,z) + h\Big),$$

(ii) $$\liminf_{\varepsilon\to 0} \varepsilon \log \inf_{x\in\partial B(i,\rho)} P_x^\varepsilon(x_\varepsilon(\tau_1) \in \partial B(j,\rho)) \ge -\big(\tilde V(K_i,K_j) + h\big). \tag{38}$$
Proof. We start with the proof of item (i). If $\inf_{z\in D} \tilde V(K_i,z) = +\infty$ there is nothing to prove. Otherwise let $h > 0$ be given. By Condition L2 (see Corollary 3), there are $\rho$ and $\rho' > 0$ with $\rho < \rho'/3$ and $T_0 < \infty$ such that, for all $x \in \partial B(i,\rho)$, there is a path $\psi^x \in C([0,T_0])$ which satisfies $I_{x,T_0}(\psi^x) \le h/3$ with $\psi^x(0) = x$, $\psi^x(T_0) = x_0 \in K_i$ and $\psi^x(t) \in B(2\rho'/3)$, $0 \le t \le T_0$.

By Condition L3, there are $z \in D$, $T_1 < \infty$ and $\phi_1 \in C([0,T_1])$ such that $I_{x_0,T_1}(\phi_1) \le \inf_{z\in D} \tilde V(K_i,z) + h/3$, $\phi_1(0) = x_0 \in K_i$, $\phi_1(T_1) = z$, and $\phi_1$ does not touch $K_j$ with $j \neq i$. We may and will assume that $\rho$ and $\rho'$ are chosen such that $2\rho' \le \mathrm{dist}(\phi_1(t), \cup_{j\neq i} K_j)$. We set $\epsilon = \mathrm{dist}(z, \partial D)$. Let $x_1$ be the point of last intersection of $\phi_1$ with $\partial B(i,\rho)$ and let $t_1$ be such that $\phi_1(t_1) = x_1$. We denote by $\phi_2 \in C([0,T_2])$, with $T_2 = T_1 - t_1$, the path obtained from $\phi_1$ by deleting the part up to time $t_1$ and translating in time. Notice that the path $\phi_2$ may hit $\partial B(i,\rho')$ several times, but hits $\partial B(i,\rho)$ only at time 0. Denote by

$$\sigma = \inf\{t : \phi_2(t) \in \partial B(i,\rho')\} \tag{39}$$

the first time $\phi_2(t)$ hits $\partial B(i,\rho')$. We choose $\epsilon'$ so small that if $\psi \in C([0,T_2])$ belongs to the $\epsilon'$-neighborhood of $\phi_2$, then $\psi(t)$ does not intersect $\partial B(i,\rho)$ and $\partial B(i,\rho')$ for $0 < t < \sigma$ and does not intersect $\partial B(i,\rho)$ for $t > \sigma$.

By Condition L2, there are $T_3 < \infty$ and $\phi_3 \in C([0,T_3])$ such that $\phi_3(0) = x_0$, $\phi_3(T_3) = x_1$, $\phi_3(t) \in B(2\rho'/3)$, $0 \le t \le T_3$, and $I_{x_0,T_3}(\phi_3) \le h/3$. Concatenating $\psi^x$, $\phi_3$ and $\phi_2$, we obtain a path $\phi^x \in C([0,T])$ with $T = T_0 + T_3 + T_2$ and $I_{x,T}(\phi^x) \le \inf_{z\in D} \tilde V(K_i,z) + h$. By construction the path $\phi^x$ avoids $\partial B(i,\rho)$ after the time $T_0 + T_3 + \sigma$, where $\sigma$ is defined in Eq. (39). We consider the open set

$$U_T = \bigcup_{x\in\partial B(\rho)} \Big\{ \psi \in C([0,T]) : \|\psi - \phi^x\| < \min\Big\{\frac{\rho}{3}, \frac{\epsilon}{2}, \frac{\epsilon'}{2}\Big\} \Big\}.$$
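By construction the event $\{x_\varepsilon \in U_T\}$ is contained in the event $\{\tau_D \le \tau_1\}$. By Theorem 2 we have

$$\begin{aligned}
\liminf_{\varepsilon\to 0} \varepsilon \log \inf_{x\in\partial B(\rho)} P_x^\varepsilon(\tau_D < \tau_1)
&\ge \liminf_{\varepsilon\to 0} \varepsilon \log \inf_{x\in\partial B(\rho)} P_x^\varepsilon(x_\varepsilon \in U_T) \\
&\ge -\sup_{x\in\partial B(\rho)} \inf_{\psi\in U_T} I_{x,T}(\psi) \\
&\ge -\sup_{x\in\partial B(\rho)} I_{x,T}(\phi^x) \\
&\ge -\Big(\inf_{z\in D} \tilde V(K_i,z) + h\Big).
\end{aligned}$$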
This concludes the proof of item (i). The proof of (ii) follows very closely the corresponding estimate in [8], Chapter 6, Lemma 2.1, which considers the case where the generator of the diffusion is elliptic: for any $h > 0$ one constructs paths $\phi^{xy} \in C([0,T])$ from $x \in \partial B(i,\rho)$ to $y \in \partial B(j,\rho)$ such that $I_{x,T}(\phi^{xy}) \le \tilde V(K_i,K_j) + h/2$ and such that if $x_\varepsilon(t)$ is in a small neighborhood of $\phi^{xy}$, then $x_\varepsilon(\tau_1) \in \partial B(j,\rho)$. As in part (i) of the lemma, the key element in the construction of the paths $\phi^{xy}$ is Condition L2 of small-time controllability around the sets $K_i$. The details are left to the reader.

The following two lemmas give upper and lower bounds on the normalization constant $\nu_\varepsilon(X)$, where $\nu_\varepsilon$ is defined in Eq. (35).

Lemma 6. For any $h > 0$, we have

$$\liminf_{\varepsilon\to 0} \varepsilon \log \nu_\varepsilon(X) \ge -h.$$
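Proof. We choose an arbitrary $h > 0$. For any $\rho' > 0$ we have the inequality:

$$\begin{aligned}
\nu_\varepsilon(X) \ge \nu_\varepsilon(B(\rho'))
&= \int_{\partial B(\rho)} l_\varepsilon(dx)\, E_x^\varepsilon \int_0^{\tau_1} 1_{B(\rho')}(x_\varepsilon(t))\,dt \\
&\ge \int_{\partial B(\rho)} l_\varepsilon(dx)\, E_x^\varepsilon \int_0^{\sigma_0} 1_{B(\rho')}(x_\varepsilon(t))\,dt \\
&= \int_{\partial B(\rho)} l_\varepsilon(dx)\, E_x^\varepsilon(\sigma_0).
\end{aligned}$$

Using the small-time local controllability around the set $K_i$ (Condition L2), as in Lemma 5, it is easy to show, as in Lemma 1.8 of [8], that for any $h > 0$,

$$\inf_{x\in\partial B(\rho)} E_x^\varepsilon(\sigma_0) \ge \exp\Big(-\frac{h}{\varepsilon}\Big),$$

for $\varepsilon$ and $\rho'$ sufficiently small. This completes the proof of Lemma 6.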
To get an upper bound on the normalization constant $\nu_\varepsilon(X)$ we will need an upper bound on the escape time out of the ball $B(\rho')$ around $\cup_i K_i$, starting from $x \in \partial B(\rho)$.

Lemma 7. Given $h > 0$, for $0 < \rho < \rho'$ sufficiently small,

$$\limsup_{\varepsilon\to 0} \varepsilon \log \sup_{x\in\partial B(\rho)} E_x^\varepsilon(\sigma_0) \le h.$$
Proof. Since we have the property of small-time local controllability near the sets $K_i$, the proof of this lemma is similar to the proof of Lemma 1.7 of [8] in the elliptic case.

With this lemma we have proved all large deviation estimates needed in the proof of Theorem 3. We will need upper and lower estimates on $l_\varepsilon(\partial B(i,\rho))$, where $l_\varepsilon$ is the invariant measure of the Markov chain $x_\varepsilon(\tau_j)$. These estimates are proved in [8], Chapter 6, Sects. 3 and 4; they are purely combinatorial and rely on the representation of the invariant measure of a Markov chain with a finite state space via graphs on the state space. By Lemma 4, (ii) and Lemma 5, (ii) we have the following estimates on the transition probability $q(x,y)$, $x, y \in \partial B(\rho)$, of the Markov chain $x_\varepsilon(\tau_j)$: Given $h > 0$, for $0 < \rho < \rho'$ sufficiently small,

$$\exp\Big(-\frac{1}{\varepsilon}\big(\tilde V(K_i,K_j) + h\big)\Big) \le q(x, \partial B(j,\rho)) \le \exp\Big(-\frac{1}{\varepsilon}\big(\tilde V(K_i,K_j) - h\big)\Big), \tag{40}$$

for all $x \in \partial B(i,\rho)$ and sufficiently small $\varepsilon$. It is shown in [8], Chapter 6, Lemmas 3.1 and 3.2 that the bound (40) implies a bound on $l_\varepsilon(\partial B(i,\rho))$. One obtains

$$\exp\Big(-\frac{1}{\varepsilon}\big(\tilde W(K_i) - \min_j \tilde W(K_j) + h\big)\Big) \le l_\varepsilon(\partial B(i,\rho)) \le \exp\Big(-\frac{1}{\varepsilon}\big(\tilde W(K_i) - \min_j \tilde W(K_j) - h\big)\Big) \tag{41}$$

for sufficiently small $\varepsilon$, where

$$\tilde W(K_i) = \min_{g\in G\{i\}} \sum_{(m\to n)\in g} \tilde V(K_m,K_n). \tag{42}$$
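For a small number of critical sets the minimum over $\{i\}$-graphs in Eq. (42) can be evaluated by brute force. The sketch below assumes the usual definition of an $\{i\}$-graph from [8] (every vertex other than $i$ is the origin of exactly one arrow and there are no closed cycles). The cost matrix `V`, the values `G` and the function name `w_graph` are hypothetical; the costs are chosen so that `V[m][n] - V[n][m] == G[n] - G[m]`, a symmetry assumed here only to make the output easy to verify.

```python
from itertools import product


def w_graph(V, i):
    """Minimum over {i}-graphs of the summed arrow costs V[m][n]."""
    n = len(V)
    others = [m for m in range(n) if m != i]
    best = float("inf")
    for targets in product(range(n), repeat=len(others)):
        arrow = dict(zip(others, targets))
        if any(m == t for m, t in arrow.items()):
            continue  # no arrow from a vertex to itself
        ok = True
        for m in others:  # every vertex must reach i by following arrows
            seen, cur = set(), m
            while cur != i:
                if cur in seen:  # closed cycle: not an {i}-graph
                    ok = False
                    break
                seen.add(cur)
                cur = arrow[cur]
            if not ok:
                break
        if ok:
            best = min(best, sum(V[m][t] for m, t in arrow.items()))
    return best


G = [0.0, 1.0, 0.5]                      # hypothetical values of G on K_0, K_1, K_2
c = [[0.0, 0.3, 0.7], [0.3, 0.0, 0.2], [0.7, 0.2, 0.0]]  # symmetric barrier part
V = [[max(G[n] - G[m], 0.0) + c[m][n] for n in range(3)] for m in range(3)]
W = [w_graph(V, i) for i in range(3)]
```

For this cost matrix the differences `W[i] - G[i]` are the same for all `i`, so that `W[i] - min(W)` equals `G[i] - min(G)`.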
Also in [8], Chapter 6, Lemmas 4.1 and 4.2, it is shown that $\tilde W(K_i)$ is in fact equal to $W(K_i)$ defined in Eq. (8) and that the function $W(x)$, defined by Eq. (9), satisfies the identity

$$W(x) = \min_i\big(W(K_i) + V(K_i,x)\big) - \min_j W(K_j) = \min_i\big(\tilde W(K_i) + \tilde V(K_i,x)\big) - \min_j \tilde W(K_j). \tag{43}$$
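We can now turn to the proof of Theorem 3.

Proof of Theorem 3. In order to prove Eq. (31), it is enough to show that, for any $h > 0$, there is $\varepsilon_0 > 0$ such that, for $\varepsilon < \varepsilon_0$, we have the inequalities:

$$\mu_\varepsilon(D) \ge \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) + h\big)\Big), \tag{44}$$

$$\mu_\varepsilon(D) \le \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) - h\big)\Big). \tag{45}$$

We let $\rho' > 0$ be such that $\rho' < \mathrm{dist}(x_{\min}, D)$. Recall that $\tau_D = \inf\{t : x_\varepsilon(t) \in D\}$ is the first hitting time of the set $D$. We have the following bound on $\nu_\varepsilon(D)$:

$$\begin{aligned}
\nu_\varepsilon(D) &\le \sum_i l_\varepsilon(\partial B(i,\rho)) \sup_{x\in\partial B(i,\rho)} E_x^\varepsilon \int_0^{\tau_1} 1_D(x_\varepsilon(t))\,dt \\
&\le L \max_i\, l_\varepsilon(\partial B(i,\rho)) \sup_{x\in\partial B(i,\rho)} P_x^\varepsilon(\tau_D \le \tau_1)\, \sup_{y\in\partial D} E_y^\varepsilon(\tau_1).
\end{aligned} \tag{46}$$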
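By K1, there exists a constant $C$ independent of $\varepsilon$ such that

$$\sup_{y\in\partial D} E_y^\varepsilon(\tau_1) \le C, \tag{47}$$

for $\varepsilon \le \varepsilon_0$. From Lemma 4, (i), given $h > 0$, for sufficiently small $0 < \rho < \rho'$, we have the bound

$$P_x^\varepsilon(\tau_D < \tau_1) \le \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} \tilde V(K_i,z) - h/4\big)\Big), \tag{48}$$

for sufficiently small $\varepsilon$. From Eq. (41), given $h > 0$, for sufficiently small $0 < \rho < \rho'$, we have the bound

$$l_\varepsilon(\partial B(i,\rho)) \le \exp\Big(-\frac{1}{\varepsilon}\big(\tilde W(K_i) - \min_j \tilde W(K_j) - h/4\big)\Big). \tag{49}$$

From the estimates (46)–(49) and the identity (43) we obtain the bound

$$\nu_\varepsilon(D) \le \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) - h/2\big)\Big), \tag{50}$$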
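for sufficiently small $\varepsilon$. From Lemma 6, given $h > 0$, for sufficiently small $0 < \rho < \rho'$, we have the bound

$$\nu_\varepsilon(X) \ge \exp\Big(-\frac{h}{2\varepsilon}\Big), \tag{51}$$

for sufficiently small $\varepsilon$. Combining estimates (50) and (51), we obtain that

$$\mu_\varepsilon(D) \le \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) - h\big)\Big),$$

for sufficiently small $\varepsilon$, and this gives the bound (45).

In order to prove (44), we consider the set $D_\delta = \{x \in D : \mathrm{dist}(x, \partial D) \ge \delta\}$. For $\delta$ sufficiently small, $D_\delta \neq \emptyset$. By L3, $\tilde V(K_i,z)$ is upper semicontinuous in $z$ so that $\tilde V(K_i,z') \le \tilde V(K_i,z) + h/4$ for $|z' - z| \le \delta$. Therefore

$$\inf_{z\in D_\delta} \tilde V(K_i,z) \le \inf_{z\in D} \tilde V(K_i,z) + h/4. \tag{52}$$

We have the bound

$$\nu_\varepsilon(D) \ge \max_i\, l_\varepsilon(\partial B(i,\rho)) \inf_{x\in\partial B(i,\rho)} P_x^\varepsilon(\tau_{D_\delta} < \tau_1) \inf_{x\in\partial D_\delta} E_x^\varepsilon \int_0^{\tau_1} 1_D(x_\varepsilon(t))\,dt. \tag{53}$$

There are $\varepsilon_0 > 0$ and a constant $C > 0$ such that we have the bound

$$\inf_{x\in D_\delta} E_x^\varepsilon \int_0^{\tau_1} 1_D(x_\varepsilon(t))\,dt \ge C > 0, \tag{54}$$

uniformly in $\varepsilon \le \varepsilon_0$. From Eq. (41), given $h > 0$, for sufficiently small $0 < \rho < \rho'$, we have the bound

$$l_\varepsilon(\partial B(i,\rho)) \ge \exp\Big(-\frac{1}{\varepsilon}\big(\tilde W(K_i) - \min_j \tilde W(K_j) + h/4\big)\Big), \tag{55}$$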
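for sufficiently small $\varepsilon$. Furthermore, by Lemma 5 and inequality (52), given $h > 0$, for $0 < \rho < \rho'$ sufficiently small, we have

$$\inf_{x\in\partial B(i,\rho)} P_x^\varepsilon(\tau_{D_\delta} \le \tau_1) \ge \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} \tilde V(K_i,z) + h/4\big)\Big), \tag{56}$$

for sufficiently small $\varepsilon$. Combining estimates (53)–(56) and identity (43) we find

$$\nu_\varepsilon(D) \ge \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) + h/2\big)\Big). \tag{57}$$

In order to give an upper bound on the normalization constant $\nu_\varepsilon(X)$, we use Eq. (35). Using the Markov property, we obtain

$$\begin{aligned}
\nu_\varepsilon(X) &= \int_{\partial B(\rho)} l_\varepsilon(dx)\, E_x^\varepsilon(\tau_1) = \int_{\partial B(\rho)} l_\varepsilon(dx) \Big( E_x^\varepsilon(\sigma_0) + E_x^\varepsilon\big(E^\varepsilon_{x_\varepsilon(\sigma_0)}(\tau_1)\big) \Big) \\
&\le \sup_{x\in\partial B(\rho)} E_x^\varepsilon(\sigma_0) + \sup_{y\in\partial B(\rho')} E_y^\varepsilon(\tau_1).
\end{aligned} \tag{58}$$

By Lemma 7, given $h > 0$, for sufficiently small $0 < \rho < \rho'$ we have the estimate

$$\sup_{x\in\partial B(\rho)} E_x^\varepsilon(\sigma_0) \le \exp\Big(\frac{h}{2\varepsilon}\Big),$$

for sufficiently small $\varepsilon$. By K1, the second term on the right-hand side of (58) is bounded by a constant, uniformly in $0 \le \varepsilon \le \varepsilon_0$. Therefore we obtain the estimate

$$\nu_\varepsilon(X) \le \exp\Big(\frac{h}{2\varepsilon}\Big), \tag{59}$$

for sufficiently small $\varepsilon$. Combining estimates (57) and (59) we obtain the bound

$$\mu_\varepsilon(D) \ge \exp\Big(-\frac{1}{\varepsilon}\big(\inf_{z\in D} W(z) + h\big)\Big),$$

and this is the bound (44). This concludes the proof of Theorem 3.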
4. Properties of the Rate Function and Proof of Theorem 1

To complete the proof of Theorem 1 we need the following lemma, which expresses the property of detailed balance for $\eta = 0$. Recall that for a path $\phi \in C([0,T])$ with $\phi(0) = x$ and $\phi(T) = y$ we denote by $\tilde\phi$ the time reversed path, which satisfies $\tilde\phi(0) = Jy$ and $\tilde\phi(T) = Jx$.

Lemma 8. Let $\phi(t) \in C([0,T])$ with $\phi(0) = x$ and $\phi(T) = y$. Either $I^{(0)}_{x,T}(\phi) = +\infty$ or we have

$$I^{(0)}_{x,T}(\phi) = I^{(0)}_{Jy,T}(\tilde\phi) + G(y) - G(x). \tag{60}$$
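Proof. We rewrite the rate function $I^{(0)}_{x,T}(\phi)$ given by Eqs. (4) and (5) as

$$\begin{aligned}
I^{(0)}_{x,T}(\phi) &= \frac{1}{4\gamma\lambda^2} \int_0^T (\dot r + \gamma\lambda^2 \nabla_r G)(\dot r + \gamma\lambda^2 \nabla_r G)\,dt \\
&= \frac{1}{4\gamma\lambda^2} \int_0^T (\dot r - \gamma\lambda^2 \nabla_r G)(\dot r - \gamma\lambda^2 \nabla_r G)\,dt + \int_0^T (\nabla_r G)\,\dot r\,dt \\
&\equiv K_1(\phi) + K_2(\phi).
\end{aligned} \tag{61}$$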
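The splitting in Eq. (61) rests on the elementary identity $(a+b)^2 = (a-b)^2 + 4ab$. As a numerical illustration, the sketch below checks it on hypothetical scalar data: $G(r) = r^2/2$, the path $r(t) = \sin t$ and the value $\gamma\lambda^2 = 0.7$ are arbitrary choices for the example, and the $(p,q)$ variables of the model are left out, so that the second term is the total derivative of $G$ along $r$ alone.

```python
import math


def integrate(f, T, n=20000):
    """Midpoint rule on [0, T]."""
    h = T / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


# Hypothetical scalar data (illustrative only, not taken from the model).
a, T = 0.7, 2.0            # a stands for gamma * lambda**2
r, rdot = math.sin, math.cos


def grad_G(x):
    return x               # G(r) = r**2 / 2


def G(x):
    return 0.5 * x * x


I_forward = integrate(lambda t: (rdot(t) + a * grad_G(r(t))) ** 2, T) / (4 * a)
K1 = integrate(lambda t: (rdot(t) - a * grad_G(r(t))) ** 2, T) / (4 * a)
K2 = G(r(T)) - G(r(0))     # integral of grad_G(r) * rdot, a total derivative
```

Up to quadrature error, `I_forward` equals `K1 + K2`.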
The term $K_1(\phi)$ can be interpreted as the rate function corresponding to the set of stochastic differential equations with the associated control system

$$\dot q = \nabla_p G, \qquad \dot p = -\nabla_q G, \qquad \dot r = +\gamma\lambda^2 \nabla_r G + (2\gamma\lambda^2 D)^{1/2} u. \tag{62}$$
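Consider now the transformation $(p,q,r) \to J(p,q,r)$ and $t \to -t$. This transformation maps the solution $\phi$ of Eq. (62) into a solution of Eq. (21) with $\tilde\phi(0) = Jy$, $\tilde\phi(T) = Jx$. This implies the equality

$$\begin{aligned}
K_1(\phi) &= \frac{1}{4\gamma\lambda^2} \int_0^T (\dot r - \gamma\lambda^2 \nabla_r G)(\dot r - \gamma\lambda^2 \nabla_r G)\,dt \\
&= \frac{1}{4\gamma\lambda^2} \int_0^T (\dot{\tilde r} + \gamma\lambda^2 \nabla_r G)(\dot{\tilde r} + \gamma\lambda^2 \nabla_r G)\,dt = I^{(0)}_{Jy,T}(\tilde\phi).
\end{aligned}$$

This means that $K_1(\phi)$ is nothing but the weight of the time reversed path. We now consider the second term, $K_2(\phi)$, in Eq. (61). Using the constraints $\dot q = \nabla_p G$ and $\dot p = -\nabla_q G$ we obtain the identity $\nabla_p G\,\dot p + \nabla_q G\,\dot q = 0$ and therefore we get

$$K_2(\phi) = \int_0^T \nabla_r G\,\dot r\,dt = \int_0^T \big(\nabla_r G\,\dot r + \nabla_p G\,\dot p + \nabla_q G\,\dot q\big)\,dt = \int_0^T \frac{d}{dt} G\,dt = G(y) - G(x),$$

and this proves Eq. (60). With this result we obtain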
Lemma 9. If $\eta = 0$ then $W^{(0)}(x) = G(x) - \min_x G(x)$.

Proof. The Hamiltonian $G$ is constant on $K_j$ and we set $G(x) = G_j$ for all $x \in K_j$. Furthermore if $(p,q,r) \in K_j$, then $p = 0$ and therefore the sets $K_j$ are invariant under time reversal: $JK_j = K_j$. Using Lemma 8, we see that for any path $\phi \in C([0,T])$ with $\phi(0) = x \in K_m$ and $\phi(T) = y \in K_n$ we have

$$I^{(0)}_{x,T}(\phi) = I^{(0)}_{Jy,T}(\tilde\phi) + G(y) - G(x) = I^{(0)}_{Jy,T}(\tilde\phi) + G_n - G_m.$$

Taking the infimum over all paths $\phi$ and all times $T$, we obtain the identity

$$V^{(0)}(K_m,K_n) = V^{(0)}(K_n,K_m) + G_n - G_m.$$
In the definition of $W^{(0)}(K_i)$, see Eq. (8), the minimum is taken over all $\{i\}$-graphs (see the definition in the introduction). Given an $\{i\}$-graph and a $j$ with $j \neq i$, there is a sequence of arrows leading from $j$ to $i$. Consider now the graph obtained by reversing all the arrows leading from $j$ to $i$; in this way we obtain a $\{j\}$-graph. Using the preceding identity for $V^{(0)}$, the weight of this graph is equal to the weight of the original graph plus $G_j - G_i$. Taking the infimum over all graphs we obtain the identity

$$W^{(0)}(K_j) = W^{(0)}(K_i) + G_j - G_i,$$

and therefore we have $W^{(0)}(K_i) = G_i + \mathrm{const}$, and so $W^{(0)}(x)$, defined in Eq. (9), satisfies the identity

$$W^{(0)}(x) = \min_i\big(G_i + V^{(0)}(K_i,x)\big) - \min_j G_j. \tag{63}$$

The second term in Eq. (63) is equal to $\min_x G(x)$, since $G(x)$ is bounded below. We now derive upper and lower bounds on the first term in Eq. (63). A lower bound follows easily from Lemma 8: For any path $\phi \in C([0,T])$ with $\phi(0) = z \in K_i$ and $\phi(T) = x$ we obtain the inequality

$$I^{(0)}_{z,T}(\phi) = I^{(0)}_{Jx,T}(\tilde\phi) + G(x) - G_i \ge G(x) - G_i,$$
since the rate function is nonnegative. Taking the infimum over all paths $\phi$ and times $T$ we obtain

$$W^{(0)}(x) \ge G(x) - \min_x G(x).$$

To prove the upper bound we consider the trajectory $\tilde\phi$ starting at $Jx$ at time 0 which is the solution of the deterministic equation (17). By Lemma 1, there is some $K_i$ such that $\lim_{t\to\infty} \tilde\phi(t) \in K_i$. Furthermore, since $\tilde\phi$ is a solution of Eq. (17), the rate function of this path vanishes, $I^{(0)}_{Jx,T}(\tilde\phi) = 0$, for any $T > 0$. Now consider the time reversed path $\phi(t)$. It starts at $t = -T$ with $T \le \infty$ at $K_i$ and reaches $x$ at time 0. For such a path we have

$$\lim_{T\to\infty} I^{(0)}_{z,T}(\phi) = \lim_{T\to\infty} I^{(0)}_{Jx,T}(\tilde\phi) + G(x) - G_i = G(x) - G_i,$$

and therefore $V^{(0)}(K_i,x) \le G(x) - G_i$. We finally obtain

$$W^{(0)}(x) \le G_i + V^{(0)}(K_i,x) - \min_x G(x) \le G(x) - \min_x G(x),$$

and this concludes the proof of Lemma 9.
We have the following bound on the rate function in the case $\eta \neq 0$:

Lemma 10. If $\eta \ge 0$ then

$$(1+\eta)^{-1}\big(G(x) - \min_x G(x)\big) \le W^{(\eta)}(x) \le (1-\eta)^{-1}\big(G(x) - \min_x G(x)\big),$$

and a similar bound holds for $\eta \le 0$.

The assertion follows from the fact that the subset of $C([0,T])$ on which $I^{(\eta)}_{x,T}(\phi) < \infty$ is independent of $\eta$. This is easily seen from the definition of the rate function (13). Inspection of Eq. (4) implies the bound

$$(1+\eta)^{-1} I^{(0)}_{x,T}(\phi) \le I^{(\eta)}_{x,T}(\phi) \le (1-\eta)^{-1} I^{(0)}_{x,T}(\phi).$$

Taking the infimum completes the proof of the lemma.
Combining Theorem 3 with Lemmas 9 and 10 we obtain Theorem 1.

Acknowledgements. We would like to thank J.-P. Eckmann, M. Hairer, J. Lebowitz, C.-A. Pillet, and H. Spohn for useful discussions. This work was partially supported by the Swiss National Science Foundation (L.R.-B.) and NSF grant DMS 980139 (L.E.T).
References

1. Azencott, R.: Grandes déviations et applications. In: Ecole d’été de probabilités de Saint-Flour VIII-1978, Lecture Notes in Mathematics 778, Berlin–Heidelberg–New York: Springer, 1980, pp. 2–176
2. Ben Arous, G. and Léandre, R.: Décroissance exponentielle du noyau de la chaleur sur la diagonale. I and II. Probab. Theory Related Fields 90, 175–202 and 377–402 (1991)
3. Dembo, A. and Zeitouni, O.: Large deviations techniques and applications. Applications of Mathematics, Vol. 38, Berlin–Heidelberg–New York: Springer, 1998
4. Evans, D.J., Cohen, E.G.D., and Morriss, G.P.: Probability of second law violations in shearing steady states. Phys. Rev. Lett. 71, 2401–2404 (1993)
5. Eckmann, J.-P. and Hairer, M.: Non-equilibrium statistical mechanics of strongly anharmonic chains of oscillators. Preprint, University of Geneva (1999)
6. Eckmann, J.-P., Pillet, C.-A., and Rey-Bellet, L.: Non-equilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures. Commun. Math. Phys. 201, 657–697 (1999)
7. Eckmann, J.-P., Pillet, C.-A., and Rey-Bellet, L.: Entropy production in non-linear, thermally driven Hamiltonian systems. J. Stat. Phys. 95, 305–331 (1999)
8. Freidlin, M.I. and Wentzell, A.D.: Random perturbations of dynamical systems. Grundlehren der Mathematischen Wissenschaften 260, Berlin–Heidelberg–New York: Springer, 1984
9. Gallavotti, G. and Cohen, E.G.D.: Dynamical ensembles in stationary states. J. Stat. Phys. 80, 931–970 (1995)
10. Graham, R.: Weak noise limit and nonequilibrium potentials of dissipative dynamical systems. In: Instabilities and nonequilibrium structures (Valparaiso 1985), Math. Appl. 33, Dordrecht–Boston: Reidel, 1987, pp. 271–290
11. Hasminskii, R.H.: Stochastic stability of differential equations. Alphen aan den Rijn–Germantown: Sijthoff and Noordhoff, 1980
12. Hörmander, L.: The Analysis of linear partial differential operators. Vol III, Berlin–Heidelberg–New York: Springer, 1985
13. Jakšić, V. and Pillet, C.-A.: Ergodic properties of classical dissipative systems. I. Acta Math. 181, 245–282 (1998)
14. Kliemann, W.: Recurrence and invariant measures for degenerate diffusions. Ann. of Prob. 15, 690–702 (1987)
15. Kurchan, J.: Fluctuation theorem for stochastic dynamics. J. Phys. A 31, 3719–3729 (1998)
16. Lebowitz, J.L. and Spohn, H.: A Gallavotti–Cohen-type symmetry in the large deviation functional for stochastic dynamics. J. Stat. Phys. 95, 333–365 (1999)
17. Lee, E.B. and Markus, L.: Foundations of optimal control theory. The SIAM Ser. in Appl. Math., New York: Wiley, 1967
18. Maes, C.: The fluctuation theorem as a Gibbs property. J. Stat. Phys. 95, 367–392 (1999)
19. Rey-Bellet, L. and Thomas, L.E.: Energy decay estimates for Hamiltonian systems coupled to heat reservoirs. In preparation
20. Schilder, M.: Some asymptotic formulae for Wiener integrals. Trans. Am. Math. Soc. 125, 63–85 (1966)
21. Sussmann, H.J.: Lie brackets, real analyticity and geometric control. In: Differential Geometric Control Theory, Proc. Conf. Michigan, Basel–Boston: Birkhäuser, 1983, pp. 1–116
22. Stroock, D.W. and Varadhan, S.R.S.: On the support of diffusion processes with applications to the strong maximum principle. In: Proc. 6th Berkeley Symp. Math. Stat. Prob., Vol III, 333–368 (1972)
23. Varadhan, S.R.S.: Large Deviations and Applications. Philadelphia: SIAM, 1984

Communicated by J. L. Lebowitz
Commun. Math. Phys. 215, 25 – 43 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Non-Equilibrium Dynamics of Three-Dimensional Infinite Particle Systems E. Caglioti, C. Marchioro, M. Pulvirenti Dipartimento di Matematica, Università di Roma, La Sapienza, Piazzale A. Moro 2, 00185 Roma, Italy. E-mail:
[email protected];
[email protected];
[email protected] Received: 7 December 1999 / Accepted: 9 May 2000
Abstract: We show existence and uniqueness for the solutions to the Newton equations relative to a system of infinitely many particles in the space, interacting by means of a positive and short-range potential. The initial conditions are chosen in a set sufficiently large to be the support of any reasonable non-equilibrium state. We extend previous results in one and two dimensions, obtained by Lanford and by Fritz and Dobrushin respectively, many years ago.

1. Introduction

A preliminary problem in the rigorous study of Nonequilibrium Statistical Mechanics is to give a precise sense to the time evolution of states of infinitely extended systems. For particle systems the problem can be formulated in the following way. Let $d = 1, 2, 3$ be the dimension of the physical space. A phase point of the system is an infinite sequence $\{x_i, v_i\}_{i\in\mathbb{N}}$ of the positions and velocities of the particles and its time evolution is characterized by the solutions of the Newton equations:

$$\ddot x_i(t) = \sum_{j\in\mathbb{N},\ j\neq i} F(x_i(t) - x_j(t)), \qquad i \in \mathbb{N}, \tag{1.1}$$
where $F(x) = -\nabla\phi(x)$ and $\phi$ is a two-body potential. Equation (1.1) is complemented by the initial conditions $\{x_i(0), v_i(0)\}_{i\in\mathbb{N}}$. The initial conditions must be chosen in a set sufficiently large to be the support of states of interest from a thermodynamical point of view.

The first mathematical problem associated to the system (1.1) is to establish existence and uniqueness of the solutions. It is clear that the main effort is to give a bound to the right-hand side of Eq. (1.1) for any positive time. To assure this, assuming $\phi$ smooth enough and short-range, it is sufficient to show that the number of particles in any bounded region remains finite in an arbitrary given time interval. As we shall illustrate, the difficulty in estimating the particle number increases with the dimension $d$.

Work performed under the auspices of the CNR, GNFM-INDAM and the Italian Ministry of the University (MURST).

As we mentioned above the initial conditions must be a full measure set for all Gibbs measures at least. As a consequence the velocities and the local densities have, typically, a logarithmic divergence with the distance from the origin. We ignore this important fact, for the moment, and assume that, initially, $|v_i| < M$ and

$$\sup_{\mu\in\mathbb{R}^d,\ R>1} \frac{N(X;\mu,R)}{R^d} < +\infty, \tag{1.2}$$

where $X = \{x_i, v_i\}_{i\in\mathbb{N}}$ is the particle configuration and $N(X;\mu,R)$ is the number of particles in the sphere of radius $R$ centered in $\mu$. If $V(t)$ denotes the modulus of the maximal velocity delivered by the particles in the time interval $[0,t]$ and if $X(t)$ denotes the time evolved configuration, the conservation of the particle number yields:

$$N(X(t);\mu,R_0) \le N(X;\mu,R(t)) \le \mathrm{Const}\, R(t)^d, \tag{1.3}$$

where

$$R(t) = R_0 + \int_0^t ds\, V(s). \tag{1.4}$$
On the other hand $V(s)$ is controlled by the force, which turns out to be bounded by $\sup_\mu N(X(s);\mu,r)$, where $r$ is the range of the potential. By virtue of (1.3) and (1.4) we arrive at the integral inequality:

$$R(t) \le R_0 + \mathrm{Const}\, t + \mathrm{Const} \int_0^t ds\, R(s)^d, \tag{1.5}$$

which is solvable in the large only if $d = 1$.

A first positive answer to the problem of the existence of nonequilibrium dynamics was in fact given for one-dimensional systems, by O. Lanford about thirty years ago (see [L1] and [L2]). Other results for one-dimensional systems interacting via singular potentials have been obtained in [DF] and [MPPu], while one-dimensional Coulomb systems have been treated in [MP].

After some years, J. Fritz and R. L. Dobrushin (see [FD]) were able to solve the two-dimensional case by using the energy conservation. Roughly speaking the idea is the following. Denoting by $E(X;\mu,R)$ the energy of the configuration $X$ in the ball of radius $R$ centered in $\mu$, if it were true that

$$E(X(t);\mu,R_0) \le E(X;\mu,R(t)) \le \mathrm{Const}\, R(t)^d, \tag{1.6}$$
where $R(t)$ solves Eq. (1.4), then we could repeat the above argument, using that $V(s) \le \sup_\mu E(X(s);\mu,1)^{1/2} \le \mathrm{Const}\, R(s)^{d/2}$, to obtain

$$R(t) \le R_0 + \mathrm{Const} \int_0^t ds\, R(s)^{d/2}, \tag{1.7}$$

which is solvable in the large if $d \le 2$.
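The role of the exponent in the integral inequalities (1.5) and (1.7) can be seen on a model comparison equation. The sketch below assumes the simplified equation $R' = C R^p$ with $R(0) = R_0 > 0$, which drops the additive terms of (1.5); its solution exists for all times when $p \le 1$ and blows up at $t^* = R_0^{1-p}/((p-1)C)$ when $p > 1$. The function name `blowup_time` and the constants are illustrative.

```python
import math


def blowup_time(p, R0=1.0, C=1.0):
    """Blow-up time of R' = C * R**p with R(0) = R0 > 0 (math.inf if global)."""
    if p <= 1.0:
        return math.inf  # at most exponential growth, no finite-time blow-up
    return R0 ** (1.0 - p) / ((p - 1.0) * C)


# p = d: particle counting as in (1.5); p = d / 2: energy bound as in (1.7).
counting = {d: blowup_time(float(d)) for d in (1, 2, 3)}
energy = {d: blowup_time(d / 2) for d in (1, 2, 3)}
```

A finite entry only means that the corresponding a priori bound degenerates in finite time and gives no global control; it does not by itself show that the particle system blows up.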
Of course the bound (1.6) is not true in that form. However a suitable form of it can be established to make the above argument work (see [FD] for the details). It is also remarkable that the authors showed how the energy conservation alone cannot prevent a blow-up in three dimensions. Indeed they gave an example of a particle system with instantaneous interactions, which preserves the energy in the collisions but is not Hamiltonian, delivering a collapse in three dimensions, but not in two.

After such contributions, the three-dimensional problem remained unsolved. However one has to mention that it is possible to give a sense to the time evolution of a special class of initial states (see [P, CC and SS]). It is also worth mentioning that the so-called Equilibrium Dynamics can be constructed in any dimension. Namely the existence of the solutions to Eq. (1.1) can be proven for a full set of initial conditions with respect to a Gibbs measure (see [MPP and L3] for smooth short-range potential, [PPT] for long-range potential, [A] for hard spheres, [S1 and S2] for the cluster dynamics). The main limitation in this approach is that the set of the initial conditions is not explicitly characterized so that one cannot try to extend these results to a non-equilibrium situation.

All the results which we are aware of have been obtained before the early eighties and, apparently, the three-dimensional problem, maybe forgotten, is still unsolved. The purpose of the present paper is to solve the problem in the three-dimensional case for a smooth, positive and short-range potential.

Let us give a rough account of the idea of the proof. We have seen that the dimensional limitation relies on the fact that $V \approx E^{1/2}$. On the contrary if one would have $V \approx E^{1/3}$, the integral inequality (1.7) would be linear and globally solvable. We show basically this fact.
Consider a large particle system with energy $E$ in a large box (the volume is $O(E)$ as well as the total number of particles). $E$ is large and, initially, the energy per particle is $O(1)$. We want to control the energy transferred to the quickest particle to see whether it is possible to obtain a better bound than $V \le \mathrm{Const}\,\sqrt{E}$. We assume the potential positive and short-range. The energy of a single particle is

$$E_i = \frac{1}{2} v_i^2 + \sum_j \phi_{i,j}, \tag{1.8}$$

where $\phi_{i,j}$ is the interaction between the particles $i$ and $j$ and $F_{i,j}$ is the force generated by the particle $j$ on the particle $i$. We have:

$$\dot E_i(t) = \sum_j F_{i,j} \cdot v_j(t). \tag{1.9}$$

Moreover

$$\sum_j F_{i,j} \cdot v_j(t) \le \Big(\sum_j v_j^2(t)\Big)^{1/2} \Big(\sum_j |F_{i,j}|^2(t)\Big)^{1/2} \le \mathrm{Const}\, E^{1/2} N(x_i(t))^{1/2}, \tag{1.10}$$

where $N(x_i(t))$ is the number of particles in the sphere of radius $r$ around $x_i(t)$. In our case $N(x_i(t)) \le \mathrm{Const}\, E^{1/2}$ (this is a standard estimate for positive potential, as we shall see later), so that

$$E_i(t) \le E_i(0) + \mathrm{Const}\, E^{3/4}, \tag{1.11}$$

from which

$$V \le \mathrm{Const}\, E^{3/8}. \tag{1.12}$$
Inequality (1.12) is useless because we need $\frac{1}{3}$ in place of $\frac{3}{8}$ as the exponent of $E$. However it shows that the exponent $\frac{1}{2}$ is too pessimistic. To improve the inequality we assume that at some time there are quick particles with velocity $O(E^{1/3})$ and show that the kinetic energy of such particles cannot increase. Indeed, if one considers the contribution to the sum in the left-hand side of (1.10) due to slow particles (whose velocity is $O(1)$), we can profit from the time integral. Indeed, with the sums restricted to the slow particles:

$$\int_0^T dt \sum_j F_{i,j} \cdot v_j(t) \le \mathrm{Const} \sum_j \int_0^T dt\, |F_{i,j}|. \tag{1.13}$$

We observe that the time integral in the right-hand side, due to the large relative velocity of the particle $i$ with respect to the particle $j$ (it is $O(E^{1/3})$), is expected to be $O(E^{-1/3})$. Roughly speaking a quick particle spends a short time in the field of a slow particle, so that the contribution is relatively small. Since $\sum_j 1 \le \mathrm{Const}\, E$ we conclude that the right-hand side of (1.13) is bounded by $\mathrm{Const}\, E^{2/3}$.

The other extreme case, namely the contribution due to the quick particles (whose velocity is $O(E^{1/3})$), can be easily bounded by energy conservation. Indeed, let $N_q$ be the number of such particles. We have:

$$N_q E^{2/3} \le \mathrm{Const}\, E, \tag{1.14}$$

and hence $N_q \le \mathrm{Const}\, E^{1/3}$. Therefore the contribution due to the quick particles in the sum in the left-hand side of (1.10) is bounded by

$$\mathrm{Const}\, E^{1/2} N_q^{1/2} \le \mathrm{Const}\, E^{2/3}. \tag{1.15}$$
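The bookkeeping of exponents in (1.10)–(1.15) can be replayed with exact rational arithmetic. The sketch below only tracks powers of $E$ and ignores all constants; the variable names are illustrative.

```python
from fractions import Fraction as F

# Crude estimate (1.10)-(1.12): exponents of E.
n_local = F(1, 2)                    # N(x_i) <= Const E^(1/2)
crude_energy = F(1, 2) + n_local / 2  # E^(1/2) * N^(1/2) = E^(3/4)
crude_velocity = crude_energy / 2     # V ~ E_i^(1/2)

# Refined estimate with quick particles of speed E^(1/3).
slow = F(1) - F(1, 3)                # E particles, each felt for a time E^(-1/3)
n_quick = F(1) - 2 * F(1, 3)         # N_q * E^(2/3) <= Const E
quick = F(1, 2) + n_quick / 2        # E^(1/2) * N_q^(1/2)
refined_velocity = max(slow, quick) / 2
```

The crude route gives the velocity exponent $3/8$, while both the slow and the quick contributions give the energy exponent $2/3$, hence the velocity exponent $1/3$ needed to make (1.7) linear.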
Both estimates, for slow and quick particles, are satisfactory because they yield the right bound $V \le \mathrm{Const}\, E^{1/3}$. Therefore our main goal is to make the above ideas rigorous by partitioning the particles $j$ according to the magnitude of their velocity. Actually, during the time $T$ a slow particle can increase its velocity, so that the idea sketched above cannot be implemented so simply. What we really do is to partition the interval $[0,T]$ into many subintervals, long enough to take advantage of the time average, but sufficiently short to prevent a slow particle from increasing its velocity too much.

In more detail the proof is organized in the following steps. We consider a locally bounded initial configuration and define a sequence of finite dynamics $\{x_i^n(t), v_i^n(t)\}$ obtained by letting only the particles initially in the sphere, centered in zero, of radius $n$ evolve. Denoting by $V^n(t)$ the maximal velocity of all the particles up to the time $t$, we show:

Step 1. If $V^n(t)$ is diverging as $(\log n)^{3/2}$, then $\{x_i^n(t), v_i^n(t)\}$ are converging for all $i$, as $n \to \infty$. The limit is the unique solution of Eq. (1.1). This is the content of Sect. 3. The technique is basically known (see e.g. [MPPu]) and we present it here in detail for the sake of completeness.

Step 2. Defining

$$R(t) = (\log n)^{3/2} + \int_0^t V^n(s)\,ds, \tag{1.16}$$

we show that the energy of any sphere of radius $R(t)$, in the time evolved configuration, is bounded by $C R(t)^3$. This is shown in Proposition 4.2.
Step 3. This is the last and most important step. By a careful analysis of
\[ \int_0^t ds \sum_j F_{i,j}\cdot v_j(s) \tag{1.17} \]
along the lines sketched above, we show that $V^n(t) \le C R(t)$. Inserting this last result in (1.16) we obtain the desired weakly diverging estimate, which is necessary to make Step 1 work. We remark once more that we take care of this logarithmic divergence in order to take into account configurations typical of non-equilibrium states.

2. Notation, Definitions and Main Results

The basic object of investigation of the present paper is an infinite particle system in $\mathbb{R}^3$. We denote by $X = \{x_i, v_i\}_{i\in\mathbb{N}}$ the infinite sequence of positions and velocities of the particles. $X$ is assumed to be locally bounded, namely, for any bounded region $\Lambda \subset \mathbb{R}^3$, the number of particles in that region:
\[ N(\Lambda) = \sum_i \chi(x_i \in \Lambda) \tag{2.1} \]
is finite. Here and in the sequel $\chi(A)$ will indicate the characteristic function of the event $A$. The particles interact by means of a non-negative two-body potential $\phi = \phi(|x|)$, $x \in \mathbb{R}^3$, twice differentiable, strictly positive at the origin and short-range:
\[ \phi(|x|) = 0 \quad \text{if } |x| \ge r > 0. \tag{2.2} \]
We denote the force by $F = -\nabla\phi$. It is easy to show that, for a locally finite configuration $X$ and a bounded region $\Lambda$, we have:
\[ \sum_{i\ne j} \chi(x_i\in\Lambda)\,\chi(x_j\in\Lambda)\,\phi_{i,j} \ge \frac{B_1}{\nu(\Lambda)} N(\Lambda)^2 - B_2 N(\Lambda), \tag{2.3} \]
where $B_1$ and $B_2$ are two positive constants,
\[ \phi_{i,j} = \phi(|x_i - x_j|) \tag{2.4} \]
and $\nu(\Lambda)$ is the number of disjoint cubes of side one necessary to cover $\Lambda$. Here and later on, if $\Lambda \subset \mathbb{R}^3$ is a measurable region, we denote by $|\Lambda|$ its Lebesgue measure while, if $\Lambda$ is a discrete set, $|\Lambda|$ will denote its cardinality. For a locally finite configuration $X$ we define:
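\[ W(X;\mu,R) = \sum_i f_i^{\mu,R}\left( \frac{v_i^2}{2} + \frac12 \sum_{j:\,j\ne i}\phi_{i,j} + 1 \right), \tag{2.5} \]
where
\[ f_i^{\mu,R} = f\!\left( \frac{|x_i-\mu|}{R} \right), \tag{2.6} \]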
and the function $f \in C^\infty(\mathbb{R}^+)$ is not increasing and satisfies:
\[ f(x) = 1 \ \text{for } x\in[0,1], \qquad f(x) = 0 \ \text{for } x\in(2,+\infty) \]
and $|f'(x)| \le 2$. Note that $W(X;\mu,R)$ is a smoothed version of the energy plus the particle number relative to the sphere $B(\mu,R) = \{y\in\mathbb{R}^3 \mid |y-\mu| < R\}$.

In order to consider configurations which are typical for thermodynamical states, we must allow logarithmic divergences in the velocities and local densities. Defining
\[ Q(X) = \sup_{\mu}\ \sup_{R:\,R>\varphi_1(|\mu|)} \frac{W(X;\mu,R)}{R^3}, \tag{2.7} \]
where
\[ \varphi_1(x) = \log(\max(x,e)), \quad x\in\mathbb{R}^+, \tag{2.8} \]
it is well known that the set of all configurations for which $Q(X)$ is finite constitutes a full measure set for all Gibbs states associated to the particle system (see e.g. [FD] and [DF]).

The problem we deal with in this paper is that of making sense of the infinite set of equations:
\[ \ddot x_i(t) = \sum_j F_{i,j}(t), \tag{2.9} \]
where $F_{i,j} = -\nabla_{x_i}\phi_{i,j}$ is the force exerted by the particle $j$ on the particle $i$. The initial conditions are $x_i(0) = x_i$, $v_i(0) = v_i$, where the infinite configuration $X = \{x_i, v_i\}$ satisfies the condition $Q(X) < +\infty$. The solution to Eq. (2.9) will be constructed by means of a limiting procedure. Neglecting all the particles outside $B(0,n)$, the sphere of radius $n$ around the origin, we consider, for an integer $n$:
\[ \ddot x_i^n(t) = F_i(X^n(t)), \qquad x_i^n(0) = x_i, \quad v_i^n(0) = v_i, \quad i \in I_n, \tag{2.10} \]
where $I_n = \{i\in\mathbb{N} \mid x_i \in B(0,n)\}$,
\[ F_i(X^n(t)) = \sum_{j:\,j\ne i} F\big(x_i^n(t) - x_j^n(t)\big), \]
and $X^n(t) = \{x_i^n(t), v_i^n(t)\}_{i\in I_n}$ is the time evolved finite configuration.
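To make the truncation (2.10) concrete, here is a minimal numerical sketch of the finite dynamics. It is only an illustration under stated assumptions: the sample potential $\phi(s) = (1 - s^2/r^2)^3$ for $s < r$ (non-negative, twice differentiable, positive at the origin, short-range), the velocity Verlet integrator, and all function names are choices made here, not taken from the paper.

```python
import math
import random

R_INT = 1.0  # interaction range r (illustrative value)

def phi(s):
    """Sample potential: non-negative, C^2, phi(0) > 0, zero for s >= r."""
    return (1.0 - (s / R_INT) ** 2) ** 3 if s < R_INT else 0.0

def force(dx):
    """F(x) = -grad phi(|x|) for the sample potential above."""
    s2 = sum(c * c for c in dx)
    if s2 >= R_INT ** 2:
        return [0.0, 0.0, 0.0]
    k = 6.0 * (1.0 - s2 / R_INT ** 2) ** 2 / R_INT ** 2
    return [k * c for c in dx]

def accelerations(xs):
    """Sum of pair forces on each particle (unit masses)."""
    acc = [[0.0] * 3 for _ in xs]
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            f = force([a - b for a, b in zip(xs[i], xs[j])])
            for c in range(3):
                acc[i][c] += f[c]   # force of j on i
                acc[j][c] -= f[c]   # F_{j,i} = -F_{i,j}
    return acc

def energy(xs, vs):
    """Kinetic energy plus pair potential energy of a finite configuration."""
    kin = 0.5 * sum(c * c for v in vs for c in v)
    pot = sum(phi(math.dist(xs[i], xs[j]))
              for i in range(len(xs)) for j in range(i + 1, len(xs)))
    return kin + pot

def truncated_dynamics(xs, vs, n, t_final, dt=1e-3):
    """Evolve only the particles initially in B(0, n), as in (2.10)."""
    idx = [i for i, x in enumerate(xs) if math.hypot(*x) < n]
    xs = [list(xs[i]) for i in idx]
    vs = [list(vs[i]) for i in idx]
    acc = accelerations(xs)
    for _ in range(round(t_final / dt)):
        for x, v, a in zip(xs, vs, acc):      # half kick, then drift
            for c in range(3):
                v[c] += 0.5 * dt * a[c]
                x[c] += dt * v[c]
        acc = accelerations(xs)
        for v, a in zip(vs, acc):             # second half kick
            for c in range(3):
                v[c] += 0.5 * dt * a[c]
    return xs, vs

random.seed(0)
X0 = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(40)]
V0 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(40)]
inside = [i for i, x in enumerate(X0) if math.hypot(*x) < 2.0]
e0 = energy([X0[i] for i in inside], [V0[i] for i in inside])
xs, vs = truncated_dynamics(X0, V0, n=2.0, t_final=1.0)
e1 = energy(xs, vs)
print(len(inside), abs(e1 - e0))
```

The energy of the truncated system is conserved up to the discretization error of the integrator, which is the finite-$n$ analogue of the conservation law used repeatedly in Sect. 4.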
The main result of this paper is:

Theorem 2.1. Let $\mathcal{X} = \{X \mid Q(X) < +\infty\}$ and $X \in \mathcal{X}$. There exists a unique flow $t \to X(t) = \{x_i(t), v_i(t)\}_{i\in\mathbb{N}} \in \mathcal{X}$ satisfying:
\[ \ddot x_i(t) = F_i(X(t)), \qquad X(0) = X. \tag{2.11} \]
Moreover, for all $t\in\mathbb{R}$ and $i\in\mathbb{N}$,
\[ \lim_{n\to\infty} x_i^n(t) = x_i(t), \qquad \lim_{n\to\infty} v_i^n(t) = v_i(t). \tag{2.12} \]

The time evolution $t \to X(t)$ will be constructed in the next section after having established a basic bound on the maximal velocity of the particles, which will be proved in Sect. 4. We conclude this section with a lemma which summarizes some technicalities which will be useful in the sequel. Its proof is presented in the Appendix. We shall often use the short-hand notation:
\[ \chi_{i,j} = \chi(|x_i - x_j| < r); \qquad \chi_i(\mu,R) = \chi(|\mu - x_i| < R). \]
Lemma 2.1. For a given locally finite configuration $X$ and $R \ge 2r+1$, we have:

i) for $R' > R$:
\[ W(X;\mu,R') \ge W(X;\mu,R); \tag{2.13} \]
ii) setting $N(X;\mu,R) = \sum_i \chi(|x_i-\mu| < R)$:
\[ N(X;\mu,R) \le C_0 R^{3/2}\, W(X;\mu,R)^{1/2}; \tag{2.14} \]
iii)
\[ \sum_{i\ne j} \chi_{i,j}\,\chi_i(\mu,R)\,\chi_j(\mu,R) \le C_0 W(X;\mu,R); \tag{2.15} \]
iv) if $|X| < +\infty$,
\[ W(X;\mu,2R) \le C_0 \sup_{\mu} W(X;\mu,R). \tag{2.16} \]
Here $C_0$ denotes a positive constant.

3. Construction of the Time Evolution

The main effort in the proof is to show that the maximal velocity of all the particles in the $n$th dynamics is weakly diverging with $n$. Define
\[ V^n(t) = \max_{i\in I_n}\ \sup_{0\le s\le t} |v_i^n(s)|, \tag{3.1} \]
and fix an arbitrary time $T > 0$. Here and after we will denote by $C_i$, $i = 1, 2, \ldots$, any positive constant, possibly depending on $X\in\mathcal{X}$ and $T$, both assumed arbitrary but fixed.
Proposition 3.1. There exists a constant $C_1$ such that, for $t \le T$ and $n$ sufficiently large,
\[ V^n(t) \le C_1\varphi(n), \tag{3.2} \]
where $\varphi(x) = \varphi_1(x)^{3/2}$ and $\varphi_1$ is given by (2.8).

As we shall see below, the bound (3.2) is sufficient to control the limit $n\to\infty$ and prove Theorem 2.1.

Proof of Theorem 2.1. Define
\[ \delta_i(n,t) = |x_i^n(t) - x_i^{n+1}(t)|; \qquad u_k(n,t) = \sup_{i\in I_k} \delta_i(n,t) \tag{3.3} \]
and
\[ d_n(t) = \sup_{i\in I_n}\ \sup_{s\in[0,t]} |x_i^n(s) - x_i(0)|. \tag{3.4} \]
By (3.2) we have:
\[ d_n(t) \le C_2\varphi(n), \tag{3.5} \]
where $C_2 = C_1 T$. Therefore the maximal number of particles that can be in the interaction sphere of a given particle $x_i(t)$ cannot be larger than the number of particles that, at time zero, were in a sphere of radius $r + C_2\varphi(n)$. Therefore, for $n$ sufficiently large:
\[ N(X^n(t); x_i^n(t), r) \le N(X; x_i^n(t), r + C_2\varphi(n)) \le W(X; x_i^n(t), r + C_2\varphi(n)) \le Q(X)\big[\varphi(C_2\varphi(n) + n) + C_2\varphi(n) + r\big]^3 \le C_3\varphi(n)^3, \tag{3.6} \]
where $N$ is defined in Lemma 2.1 and we have used that $|x_i^n(t)| \le C_2\varphi(n) + n$. Writing Eq. (2.11) in integral form:
\[ x_i^n(t) = x_i(0) + v_i(0)\,t + \int_0^t ds\,(t-s) \sum_j F\big(x_i^n(s) - x_j^n(s)\big), \tag{3.7} \]
we get, for $i\in I_k$ and $n$ sufficiently large:
\[ \delta_i(n,t) \le C_4 \int_0^t ds\,(t-s) \sum_j{}^{*}\, \big[\delta_j(n,s) + \delta_i(n,s)\big], \tag{3.8} \]
where $\sum_j^{*}$ means the sum restricted to all the particles which can fall in the interaction sphere of $x_i^n(s)$ or $x_i^{n+1}(s)$, for $s \le t$, and $C_4 = \|\nabla F\|_\infty$. Notice that, since $k + r + d_n(s) + d_{n+1}(s) < k + r + 2C_2\varphi(n+1) \ll n$ (provided that $n$ is sufficiently large), the particle $i$ cannot interact with the particles initially in $B(0,n+1)\setminus B(0,n)$. Having in mind (3.6) and Definition (3.3), we have, for $k < n$:
\[ u_k(n,t) \le C_5\varphi(n)^3 \int_0^t ds\,(t-s)\, u_{k_1}(n,s), \tag{3.9} \]
where $k_1 = [k + C_6\varphi(n)] + 1$, $[\cdot]$ denotes the integer part of $\cdot$ and
\[ C_6 = \sup_{n\ge 1} \frac{r + 2C_2\varphi(n+1)}{\varphi(n)}. \]
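Defining $k_r = [k_{r-1} + C_6\varphi(n)] + 1$, with $k_0 = k$, we can iterate Eq. (3.9) $l$ times, where
\[ l = \left[ \frac{n}{10\,C_6\,\varphi(n)} \right]. \tag{3.10} \]
Since $u_l(n,t) \le C_2\varphi(n)$, the result is
\[ u_k(n,t) \le [C_7\varphi(n)]^{3l+1}\, \frac{t^{2l}}{(2l)!}. \tag{3.11} \]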
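The size of the right-hand side of (3.11) is easiest to judge on a logarithmic scale, since the factorial eventually dominates the polylogarithmic powers. The sketch below evaluates its logarithm with $l$ as in (3.10); the values $C_6 = C_7 = 1$ and $t = 1$ are placeholders chosen here for illustration, not constants from the paper.

```python
import math

def log_bound(n, t=1.0, C6=1.0, C7=1.0):
    """log of [C7 phi(n)]^(3l+1) t^(2l) / (2l)!, with l = [n / (10 C6 phi(n))]."""
    phi = math.log(max(n, math.e)) ** 1.5          # phi(n) = phi_1(n)^{3/2}
    l = int(n / (10.0 * C6 * phi))                 # integer part, as in (3.10)
    return ((3 * l + 1) * math.log(C7 * phi)
            + 2 * l * math.log(t)
            - math.lgamma(2 * l + 1))              # lgamma(2l + 1) = log((2l)!)

values = {n: log_bound(n) for n in (10**3, 10**5, 10**6, 10**7)}
for n, v in values.items():
    print(n, v)
```

With these placeholder constants the bound is not small for moderate $n$, but its logarithm becomes strongly negative once $l \sim n/\varphi(n)$ is large, because $\log (2l)!$ grows like $2l\log l$ while $\log\varphi(n)$ grows only like $\log\log n$.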
With the choice (3.10) we realize that $u_k(n,t)$ is vanishing summably as $n\to\infty$. The rest of the proof of Theorem 2.1 is now straightforward.

Remark. It is easy to show that the convergence holds even if $\varphi(n)$ is replaced by $n^\alpha$ with $\alpha < 2/5$. As a consequence Theorem 2.1 can be proven for a larger class of initial conditions, namely those for which $\varphi_1(n) = n^\beta$, $\beta < 4/15$.

4. Proof of Proposition 3.1

The main ingredient in the proof of Theorem 2.1 is Proposition 3.1, which is a corollary of the following:

Proposition 4.1. There exists a positive constant $C_8$ such that, for $t\le T$,
\[ V^n(t) \le C_8 R(t), \tag{4.1} \]
where
\[ R(t) = \varphi(n) + \int_0^t V^n(s)\,ds. \tag{4.2} \]
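Proposition 4.1 closes a Gronwall-type loop: by (4.2), $\dot R = V^n \le C_8 R$, so $R(t) \le \varphi(n)e^{C_8 t}$ and hence $V^n(t) \le C_8 e^{C_8 T}\varphi(n)$. The following sketch integrates the extremal case $\dot R = C_8 R$ numerically and compares it with the closed form; the numerical values of $C_8$, $T$ and $n$ are arbitrary placeholders.

```python
import math

def worst_case_R(phi_n, C8, T, steps=100_000):
    """Integrate R' = C8 * R, R(0) = phi_n: the extremal case of (4.1)-(4.2)."""
    R, dt = phi_n, T / steps
    for _ in range(steps):
        R += dt * C8 * R      # explicit Euler step with V^n = C8 * R
    return R

phi_n = math.log(1000.0) ** 1.5      # phi(n) for n = 1000
C8, T = 2.0, 1.0                     # placeholder constants
R_T = worst_case_R(phi_n, C8, T)
gronwall = phi_n * math.exp(C8 * T)  # closed-form Gronwall bound on R(T)
print(R_T, gronwall, C8 * gronwall)  # last value bounds V^n(T)
```

The resulting constant $C_1 = C_8 e^{C_8 T}$ depends on $T$ but not on $n$, which is what (3.2) requires.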
Indeed, by (4.1), (4.2) and the Gronwall Lemma we easily obtain (3.2). We shall omit any explicit notational dependence on $n$ for $R$ and $\{x_j^n(t), v_j^n(t)\}$ for simplicity, since, from now on, $n$ will be fixed. We now concentrate on proving (4.1), which is the key ingredient in the proof. We first exploit the energy conservation to control the energy of a sphere in terms of the energy of a larger sphere at time zero.

Proposition 4.2. There exists a positive constant $C_9$ such that, for $t\le T$,
\[ \sup_{\mu\in\mathbb{R}^3} W(X^n(t);\mu,R(t)) \le C_9 R(t)^3. \tag{4.3} \]

Remark. Note that the left-hand side of (4.3) is finite because the cardinality of $X^n(t)$ is finite. However it diverges with $n$ and, at time zero, is $O(\varphi(n)^3)$. Estimate (4.3) says that this divergence is controlled by $R$, which will turn out to be $O(\varphi(n))$.

Proof. For $0\le s\le t\le T$, we define
\[ R(t,s) = \varphi(n) + \int_0^t V^n(\tau)\,d\tau + \int_s^t V^n(\tau)\,d\tau \tag{4.4} \]
(note that $R(t) = R(t,t)$) and compute the derivative with respect to $s$ of the quantity:
\[ W(X^n(s);\mu,R(t,s)) = \sum_i f_i^{\mu,R(t,s)} \left( \frac{v_i^2}{2} + \frac12\sum_{j\ne i}\phi_{i,j} + 1 \right). \tag{4.5} \]
We have:
\[ \dot W(X^n(s);\mu,R(t,s)) = \sum_i f'\!\left(\frac{|x_i-\mu|}{R(t,s)}\right) \left( \frac{\hat x_i^\mu\cdot v_i}{R(t,s)} - \frac{\dot R(t,s)}{R^2(t,s)}\,|x_i-\mu| \right) \left( \frac{v_i^2}{2} + \frac12\sum_{j\ne i}\phi_{i,j} + 1 \right) + \sum_{i\ne j} f_i^{\mu,R(t,s)} \left( v_i\cdot F_{i,j} - \frac12 F_{i,j}\cdot(v_i - v_j) \right). \tag{4.6} \]
In (4.6) we neglect the explicit dependence on $s$ (and $n$) of $x_i$ and $v_i$ and denote by $\hat x_i^\mu$ the unit vector in the direction of $(x_i - \mu)$. We note that the first term on the right-hand side of (4.6) is not positive. Indeed $f' \le 0$, $|x_i-\mu| > R$, $|v_i| \le V^n(s)$ and $\dot R(t,s) = -V^n(s)$, so that:
\[ \frac{\hat x_i^\mu\cdot v_i}{R(t,s)} - \frac{\dot R(t,s)}{R^2(t,s)}\,|x_i-\mu| \ge -\frac{|v_i|}{R(t,s)} - \frac{\dot R(t,s)}{R(t,s)} \ge 0. \tag{4.7} \]
On the other hand the second term is (by using $F_{i,j} = -F_{j,i}$):
\[ \frac12 \sum_{i\ne j} f_i^{\mu,R(t,s)}\, F_{i,j}\cdot(v_i + v_j) = \frac12 \sum_{i\ne j} \big(f_i^{\mu,R(t,s)} - f_j^{\mu,R(t,s)}\big)\, F_{i,j}\cdot v_i. \tag{4.8} \]
By the obvious inequality:
\[ \big|f_i^{\mu,R(t,s)} - f_j^{\mu,R(t,s)}\big| \le 2R(t,s)^{-1}\,|x_i - x_j|, \tag{4.9} \]
putting $C_{10} = 2\|F\|_\infty r$, the modulus of the quantity in (4.8) is bounded by
\[ -C_{10}\,\frac{\dot R(t,s)}{R(t,s)} \sum_{i\ne j} \chi(|x_i - x_j| < r)\,\chi\big(|x_i-\mu| < 2R(t,s) + r\big)\,\chi\big(|x_j-\mu| < 2R(t,s) + r\big). \tag{4.10} \]
Setting
\[ W(X;R) = \sup_{\mu\in\mathbb{R}^3} W(X;\mu,R), \tag{4.11} \]
by (2.15) and (2.16) we conclude:
\[ \dot W(X^n(s);\mu,R(t,s)) \le -C_{11}\,\frac{\dot R(t,s)}{R(t,s)}\, W(X^n(s);\mu,4R(t,s)) \le -C_{12}\,\frac{\dot R(t,s)}{R(t,s)}\, W(X^n(s);R(t,s)), \tag{4.12} \]
so that
\[ W(X^n(s);R(t,s)) \le W(X^n(0);R(t,0)) \exp\left( -C_{12}\int_0^s d\tau\, \frac{\dot R(t,\tau)}{R(t,\tau)} \right). \tag{4.13} \]
Hence, for $s\le t$:
\[ W(X^n(s);R(t,s)) \le W(X^n(0);R(t,0)) \left( \frac{R(t,0)}{R(t,s)} \right)^{C_{12}}. \tag{4.14} \]
Since $\frac{R(t,0)}{R(t,s)} < 2$, we conclude that
\[ W(X^n(t);R(t)) \le C_{13}\, W(X^n(0);R(t,0)) \le 8C_{13}\,Q(X)R(t)^3 \le C_9 R(t)^3. \tag{4.15} \]
We observe that $V^n(t)$ is certainly bounded by $W(X^n(t);R(t))^{1/2}$ and hence by $\mathrm{Const}\,R(t)^{d/2}$. Note that this is enough to obtain the bound (4.1) in two dimensions, but not in the present case. The proof of Proposition 4.1 is based on the following Proposition 4.3, whose proof will be presented at the end of the section.

Proposition 4.3. For $0\le s\le t\le T$ and any $\alpha\in[1/2,1]$, we set:
\[ \Delta = \alpha R(t)^{-4/6}. \tag{4.16} \]
Suppose that, for some $i\in I_n$ and some constant $\bar A > 1$:
\[ \inf_{\tau\in[s-\Delta,s]} |v_i(\tau)| = \bar A R(t). \tag{4.17} \]
Then there exists a constant $C_{27}$ independent of $\bar A$ such that:
\[ \left| \int_{s-\Delta}^s d\tau \sum_j v_j\cdot F_{i,j} \right| \le C_{27}R(t)^2. \tag{4.18} \]
By the use of Proposition 4.3 we can prove Proposition 4.1. Before doing so, however, we shall prove a bound on the force generated by the configuration $X^n(\tau)$ on the particle $i$, which will often be used in the sequel. We have, for $\tau\le t$ (by (2.14)):
\[ |F_i(X^n(\tau))| \le \sum_j |F(x_i(\tau) - x_j(\tau))| \le \|F\|_\infty N(X^n(\tau);x_i(\tau),r) \le C_0\|F\|_\infty (2r+1)^{3/2}\, W(X^n(\tau);x_i(\tau),2r+1)^{1/2} \le C_0\|F\|_\infty (2r+1)^{3/2} \sup_\mu W(X^n(\tau);\mu,R(\tau))^{1/2} \le C_0\|F\|_\infty (2r+1)^{3/2} \sqrt{C_9}\,R(\tau)^{3/2} \le C_{14}R(\tau)^{3/2}. \tag{4.19} \]

Proof of Proposition 4.1. We first notice that, by (2.7), $V^n(0) \le Q(X)^{1/2}\varphi(n) = Q(X)^{1/2}R(0)$ (this determines the dependence of $V^n(0)$, and hence $V^n(t)$, on $n$). Then (4.1) is verified for $t = 0$.
Suppose that, for some $t^*\in[0,t]$ and $i\in I_n$, we have:
\[ V^n(t^*) = |v_i(t^*)| = A R(t) \tag{4.20} \]
for a suitable constant $A$ to be fixed later and satisfying $A > 2(Q(X)^{1/2} + 1)$. We also fix $t_1\in[0,t^*)$ such that
\[ |v_i(t_1)| = (Q(X)^{1/2} + 1)R(t); \qquad \inf_{\tau\in(t_1,t^*)} |v_i(\tau)| \ge (Q(X)^{1/2} + 1)R(t) \tag{4.21} \]
and $|t^* - t_1| = H\Delta$ for some integer $H\ge 1$ and a suitable choice of $\alpha$ (see (4.16)). This can be done because by
\[ v_i(t^*) = v_i(t_1) + \int_{t_1}^{t^*} F_i(X^n(\tau))\,d\tau, \tag{4.22} \]
and by (4.19), we find
\[ A R(t) \le (Q(X)^{1/2} + 1)R(t) + C_{14}(t^* - t_1)R(t)^{3/2} \tag{4.23} \]
and hence:
\[ (t^* - t_1) \ge C_{15}R^{-1/2} \gg R^{-4/6}, \tag{4.24} \]
therefore, for a suitable choice of $\alpha\in[1/2,1]$, $\frac{R^{4/6}|t^* - t_1|}{\alpha}$ is an integer. Furthermore
\[ \frac12 v_i^2(t^*) - \frac12 v_i^2(t_1) = \int_{t_1}^{t^*} ds \sum_j v_i\cdot F_{i,j} = \int_{t_1}^{t^*} ds \sum_j (v_i - v_j)\cdot F_{i,j} + \sum_{h=1}^{H} \int_{t_1+(h-1)\Delta}^{t_1+h\Delta} ds \sum_j v_j\cdot F_{i,j} = -\sum_j \phi(x_i(t^*) - x_j(t^*)) + \sum_j \phi(x_i(t_1) - x_j(t_1)) + \sum_{h=1}^{H} \int_{t_1+(h-1)\Delta}^{t_1+h\Delta} ds \sum_j v_j\cdot F_{i,j}. \tag{4.25} \]
Note that, proceeding as for estimate (4.19):
\[ \sum_j \phi(x_i(t^*) - x_j(t^*)) \le \|\phi\|_\infty N(X^n(t^*);x_i(t^*),r) \le C_{16}R(t)^{3/2}. \tag{4.26} \]
The same estimate holds for $\sum_j \phi(x_i(t_1) - x_j(t_1))$. Therefore, by using Proposition 4.3:
\[ \frac12 v_i^2(t^*) \le (Q(X) + 1)R(t)^2 + 2C_{16}R(t)^{3/2} + C_{27}R(t)^2\,|t^* - t_1|, \tag{4.27} \]
from which
\[ A^2R(t)^2 \le 2\big[(Q(X) + 1) + 2C_{16} + C_{27}T\big]R(t)^2. \tag{4.28} \]
The above inequality cannot be verified for $A^2$ larger than $2[(Q(X) + 1) + 2C_{16} + C_{27}T]$. This contradicts (4.20) (with this choice of $A$) and the proposition is proven.
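We finally prove Proposition 4.3.

Proof of Proposition 4.3. We set
\[ J = [s-\Delta, s], \tag{4.29} \]
\[ Y_n = \{ j\in I_n \mid |x_i(\tau) - x_j(\tau)| \le r \ \text{for some } \tau\in J \}, \tag{4.30} \]
and decompose the set $Y_n$ according to the following partition:
\[ a_k = \{ j\in Y_n \mid 2^{k-1}R(t)^{4/6} \le \sup_{\tau\in J}|v_j(\tau)| < 2^kR(t)^{4/6} \} \quad \text{for } k = 1,\ldots,k_{\max}, \]
where $k_{\max}$ is the maximum integer for which $2^{k_{\max}} \le \frac12 R(t)^{2/6}$,
\[ a_0 = \{ j\in Y_n \mid \sup_{\tau\in J}|v_j(\tau)| \le R(t)^{4/6} \}, \qquad \tilde a = \bigcup_{k=1}^{k_{\max}} a_k, \qquad \bar a = Y_n \setminus (a_0\cup\tilde a). \]
Therefore
\[ \int_{s-\Delta}^s d\tau \sum_{j\in Y_n} v_j\cdot F_{i,j} = \int_{s-\Delta}^s d\tau \left( \sum_{j\in\bar a} + \sum_{j\in\tilde a} + \sum_{j\in a_0} \right)(v_j\cdot F_{i,j}), \tag{4.31} \]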
and we estimate the three sums separately. We start by estimating the cardinality of $\bar a$. Note that if $j\in\bar a$:
\[ |v_j(t^*)| = \max_{\tau\in J}|v_j(\tau)| \ge \frac14 R(t), \tag{4.32} \]
then, by (4.19),
\[ |v_j(\tau)| \ge \frac14 R(t) - C_{14}R(t)^{3/2}\Delta \ge \frac14 R(t) - C_{14}R(t)^{5/6} \ge \frac18 R(t), \tag{4.33} \]
for $n$ (and hence $R(t)$) large enough. By definition $R(t)$ is larger than the maximal displacement of each particle, so all the particles with indices in $Y_n$ must be contained in the sphere $B(x_i(0), 2R + r)$. Therefore, by (2.16) and (4.3),
\[ \sum_{j\in\bar a} v_j^2(\tau) \le 2W(X^n(\tau);x_i(0),2R(t)+r) \le 2C_0^2\, W(X^n(\tau);R(t)) \le 2C_0^2C_9R(t)^3, \tag{4.34} \]
thus
\[ \frac{1}{64}\,|\bar a|\,R(t)^2 \le C_{17}R(t)^3, \tag{4.35} \]
and hence
\[ |\bar a| \le C_{18}R(t). \tag{4.36} \]
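As a consequence, by (4.34) and (4.36):
\[ \left| \int_{s-\Delta}^s d\tau \sum_{j\in\bar a} v_j\cdot F_{i,j} \right| \le \|F\|_\infty \int_{s-\Delta}^s d\tau \Big[ \sum_{j\in\bar a}|v_j|^2 \Big]^{1/2} |\bar a|^{1/2} \le C_{19}R(t)^{3/2}R(t)^{1/2}\Delta \le C_{19}R(t)^2. \tag{4.37} \]
Furthermore
\[ \left| \int_{s-\Delta}^s d\tau \sum_{j\in a_k} v_j\cdot F_{i,j} \right| \le \|F\|_\infty\, 2^kR(t)^{4/6} \sum_{j\in a_k} \int_{s-\Delta}^s d\tau\, \chi_{i,j}(\tau). \tag{4.38} \]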
To estimate the time integral we first realize that, for $n$ large enough:
\[ |v_i(\tau) - v_j(\tau)| \ge \inf_{\tau\in J}|v_i(\tau)| - \sup_{\tau\in J}|v_j(\tau)| \ge R(t) - 2^{k_{\max}}R(t)^{4/6} \ge \frac12 R(t). \tag{4.39} \]
In addition the pair $(i,j)$ interacts at most once, in the sense that the set $\{\tau\in J \mid |x_i(\tau) - x_j(\tau)| < r\}$ is connected. In fact, suppose that at time $t_0$ we have $|x_i(t_0) - x_j(t_0)| = r$, with outgoing velocities (i.e. $(x_i(t_0) - x_j(t_0))\cdot(v_i(t_0) - v_j(t_0)) \ge 0$). Let $t_1\in(s-\Delta,s)$ be the time at which $(x_i(\tau) - x_j(\tau))^2$ takes its maximum value, say $r_1^2$ (for which $(x_i(t_1) - x_j(t_1))\cdot(v_i(t_1) - v_j(t_1)) = 0$; indeed, if the maximum is attained at $s$ there is no recollision). By the identity:
\[ \frac12\frac{d^2}{d\tau^2}(x_i(\tau) - x_j(\tau))^2 = (v_i(\tau) - v_j(\tau))^2 + (x_i(\tau) - x_j(\tau))\cdot(F_i(\tau) - F_j(\tau)), \]
we obtain, for $\tau > t_1$, using (4.39) and (4.19):
\[ (x_i(\tau) - x_j(\tau))^2 \ge r_1^2 + \frac{(\tau - t_1)^2}{2}\left( \frac{R(t)^2}{4} - C_{14}\,r_1R(t)^{3/2} \right). \tag{4.40} \]
It must be $r_1 \ge \frac{R^{1/2}}{4C_{14}}$, otherwise $(x_i(\tau) - x_j(\tau))^2 > r_1^2$. In this case
\[ \frac{(\tau - t_1)^2}{2}\,r_1R^{3/2} \le \frac{\Delta^2}{2}\,r_1R^{3/2} \le \frac{\alpha^2}{2}R^{1/6}\,r_1 \le C_{20}\,r_1^{4/3}. \tag{4.41} \]
Therefore
\[ (x_i(\tau) - x_j(\tau))^2 \ge r_1^2 - C_{20}\,r_1^{4/3} \gg r^2, \tag{4.42} \]
and no other collision is possible.

We now use (4.40) again when $r_1$ is the minimal distance between the particles $i$ and $j$ and $t_1$ is a time at which this distance is reached. Moreover we define $\tau^+$ and $\tau^-$ to be the times at which the particle $i$ escapes from, or enters, the interaction sphere of the particle $j$ respectively. More precisely:
\[ \tau^+ = \min\{s,\ \sup\{\tau > t_1 \mid |x_i(\tau) - x_j(\tau)| < r\}\}, \qquad \tau^- = \max\{s-\Delta,\ \inf\{\tau < t_1 \mid |x_i(\tau) - x_j(\tau)| < r\}\}. \]
Then
\[ \int_{s-\Delta}^s \chi_{i,j}(\tau)\,d\tau \le (\tau^+ - t_1) + (t_1 - \tau^-) \le \frac{8r}{R(t)}. \tag{4.43} \]
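The order of magnitude in (4.43) can be understood from free flight: if the relative velocity has modulus at least $R(t)/2$ and the force is ignored, the relative trajectory crosses a ball of radius $r$ along a chord of length at most $2r$. The sketch below computes this crossing time; it is a heuristic illustration only (the actual proof controls the force through (4.40)), and the function name and the numbers are placeholders.

```python
import math

def time_inside(rel_pos, rel_vel, r):
    """Time a free relative trajectory x + v t spends in the ball |x + v t| < r."""
    a = sum(v * v for v in rel_vel)
    b = 2.0 * sum(x * v for x, v in zip(rel_pos, rel_vel))
    c = sum(x * x for x in rel_pos) - r * r
    disc = b * b - 4.0 * a * c           # discriminant of |x + v t|^2 = r^2
    return math.sqrt(disc) / a if disc > 0 else 0.0   # difference of the two roots

R, r = 100.0, 1.0
head_on = time_inside((-5.0, 0.0, 0.0), (R / 2, 0.0, 0.0), r)    # diameter crossing
grazing = time_inside((-5.0, 0.9, 0.0), (R / 2, 0.0, 0.0), r)    # shorter chord
print(head_on, grazing, 8 * r / R)
```

The head-on case gives $2r/(R/2) = 4r/R$, which is within the bound $8r/R$ of (4.43).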
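To estimate the cardinality of $a_k$ we again use an energy bound as for $\bar a$. Let $\tau_j\in J$ be such that $|v_j(\tau_j)| = \max_{\tau\in J}|v_j(\tau)|$. Then
\[ |a_k|\,2^{2(k-1)}R^{8/6} \le \sum_{j\in a_k}|v_j(\tau_j)|^2 \le \sum_{j\in a_k}|v_j(s-\Delta)|^2 + \|F\|_\infty \int_{s-\Delta}^s d\tau \sum_{j\in a_k}|v_j(\tau)| \sum_l \chi_{j,l}(\tau). \tag{4.44} \]
Multiplying the above inequality by $2^{-k}$ and summing over $k$ we get:
\[ \frac12\sum_k |a_k|\,2^{(k-1)}R^{8/6} \le \sum_k 2^{-k}\sum_{j\in a_k}|v_j(s-\Delta)|^2 + \|F\|_\infty R^{4/6} \int_{s-\Delta}^s d\tau \sum_{j\in\tilde a}\sum_l \chi_{j,l}(\tau). \tag{4.45} \]
Using now estimates (2.15) and (4.3), for which
\[ \sum_{j\in\tilde a}\sum_l \chi_{j,l}(\tau) \le C_0C_9R(t)^3, \tag{4.46} \]
and bound (4.34):
\[ \sum_k 2^{-k}\sum_{j\in a_k}|v_j(s-\Delta)|^2 \le 2C_0^2C_9R(t)^3, \tag{4.47} \]
we arrive at
\[ \sum_k |a_k|\,2^k \le C_{21}R(t)^{14/6}. \tag{4.48} \]
Finally by (4.38), (4.43) and (4.47):
\[ \left| \int_{s-\Delta}^s d\tau \sum_k\sum_{j\in a_k} v_j\cdot F_{i,j} \right| \le 8r\|F\|_\infty \sum_k 2^k|a_k|\,R(t)^{-1}R(t)^{4/6} \le C_{22}R(t)^2. \tag{4.49} \]
It remains to estimate the last contribution, namely that associated to the set of indices $a_0$,
\[ \left| \int_{s-\Delta}^s d\tau \sum_{j\in a_0} v_j\cdot F_{i,j} \right| \le \|F\|_\infty R(t)^{4/6} \sum_{h=0}^{H-1}\int_{s_h}^{s_{h+1}} d\tau\, N(\tau). \tag{4.50} \]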
Here we decomposed the interval $J$ into $H$ identical intervals:
\[ J = \bigcup_{h=0}^{H-1}[s_h, s_{h+1}] \]
with $s_H = s$, $s_0 = s-\Delta$ and $|s_{h+1} - s_h| = \delta\in\left[\frac{1}{2\bar AR(t)}, \frac{1}{\bar AR(t)}\right]$. Moreover we set
\[ N(\tau) = \sum_{j\in a_0}\chi(|x_i(\tau) - x_j(\tau)| < r). \tag{4.51} \]
Since $|v_j(\tau)| \le R(t)^{4/6}$, the maximal displacement of each particle of the set $a_0$ is bounded, in the time interval $J$, by 1. Therefore, defining
\[ N_h = \sum_{j\in a_0}\chi\Big( \inf_{\tau\in(s_h,s_{h+1})}|x_i(\tau) - x_j(s_0)| < r+1 \Big), \tag{4.52} \]
for $\tau\in(s_h,s_{h+1})$ we deduce that $N(\tau) \le N_h$. Indeed if $|x_i(\tau) - x_j(\tau)| < r$, then $|x_i(\tau) - x_j(s_0)| \le |x_i(\tau) - x_j(\tau)| + |x_j(\tau) - x_j(s_0)| \le r+1$. Hence:
\[ \left| \int_{s-\Delta}^s d\tau \sum_{j\in a_0} v_j\cdot F_{i,j} \right| \le \|F\|_\infty R(t)^{4/6}\,\delta \sum_{h=0}^{H-1}N_h \le \|F\|_\infty R(t)^{4/6}\sqrt{H}\,\delta \Big( \sum_{h=0}^{H-1}N_h^2 \Big)^{1/2}. \tag{4.53} \]
Defining
\[ T_h = \{ y\in\mathbb{R}^3 \mid \inf_{\tau\in(s_h,s_{h+1})}|x_i(\tau) - y| < r+1 \} \tag{4.54} \]
and
\[ E(T_h) = \sum_{l,j}\phi(x_l(s_0) - x_j(s_0)) + N(X^n(s_0),T_h), \tag{4.55} \]
where the sum is restricted to the pairs of particles in $T_h$, by the bound (2.3) we conclude
\[ (4.53) \le C_{23}\|F\|_\infty \sqrt{H}\,R(t)^{4/6}\,\delta \Big( \sum_{h=0}^{H-1}E(T_h) \Big)^{1/2}. \tag{4.56} \]
Here we used the fact that $|T_h|$ is bounded independently of $R(t)$ and $\bar A$, as follows from the obvious inequality:
\[ |v_i(\tau)| \le \bar AR(t) + C_{14}R(t)^{3/2}\Delta \le \bar AR(t) + C_{14}R(t)^{5/6} \le \frac32\bar AR(t). \tag{4.57} \]
Denote by $T = \bigcup_h T_h$
the whole tube spanned by the particle $i$ in the time interval $J$ and by $E(T)$ its energy according to formula (4.55) (replacing $T_h$ by $T$). We claim that:
\[ \Big( \sum_{h=0}^{H-1}E(T_h) \Big)^{1/2} \le C_{24}E(T)^{1/2}, \tag{4.58} \]
where $C_{24}$ is independent of $\bar A$. We prove (4.58) after having concluded the proof. Inserting (4.58) in (4.56) we have (recalling that $\sqrt{H} = (\Delta/\delta)^{1/2} \le \sqrt{2\alpha\bar A}\,R^{1/6}$ and that $E(T) \le \mathrm{const}\,R^3$):
\[ \left| \int_{s-\Delta}^s d\tau \sum_{j\in a_0} v_j\cdot F_{i,j} \right| \le C_{25}R^{5/6}\,\bar A^{1/2}\,\delta\,(E(T))^{1/2} \le C_{26}\frac{R(t)^2}{\bar A^{1/2}} \le C_{26}R(t)^2, \tag{4.59} \]
so that it remains to prove (4.58). For a given $h$, let $e = \frac{v_i(s_{h+1})}{|v_i(s_{h+1})|}$ and $\xi(\tau) = (x_i(\tau) - x_i(s_{h+1}))\cdot e$. Hence
\[ \xi(\tau) = |v_i(s_{h+1})|(\tau - s_{h+1}) + \int_{s_{h+1}}^{\tau}(\tau - \sigma)\,d\sigma\, F_i(\sigma)\cdot e \tag{4.60} \]
from which
\[ |\xi(\tau)| \ge |v_i(s_{h+1})|(\tau - s_{h+1}) - \frac{|\tau - s_{h+1}|^2}{2}C_{14}R(t)^{3/2} \ge |\tau - s_{h+1}|\big( \bar AR(t) - C_{14}R(t)^{3/2}R(t)^{-4/6} \big) \ge |\tau - s_{h+1}|\,\frac{\bar AR(t)}{2} \tag{4.61} \]
for $n$ large enough. On the other hand by (4.57) it follows that
\[ T_h \subset B\big(x_i(s_{h+1});\ \tfrac32\bar AR(t)\delta + r\big) \subset B\big(x_i(s_{h+1});\ 2 + r\big). \]
Therefore, by (4.61), if we choose $|\tau - s_{h+1}| > (8 + 4r)\delta$, it follows that $|\xi(\tau)| > 2 + r$. This implies that after this time $\tau$, $x_i$ will never meet the sphere $B(x_i(s_{h+1}); 2 + r)$, so that $T_h$ has a nonempty intersection only with a definite number of other $T_k$'s and (4.58) becomes evident.

Remark. In the proof we mainly used the positivity of the potential, in particular in the definition of $W$ and in the control of the interaction. However we expect that the result is true also for superstable interactions, although this extension will probably require some more technical effort.
Appendix

Proof of Lemma 2.1. (2.13) is obvious. Using (2.3):
\[ N(X;\mu,R)^2 \le C_0R^3\Big[ \sum_{i\ne j}\phi_{i,j}\,\chi_i(\mu,R)\chi_j(\mu,R) + N(X;\mu,R) \Big], \tag{A.1} \]
by which we deduce (2.14). To prove (2.15) we cover $B(\mu,R)$ by a collection of disjoint cubic cells $\{\Lambda_\alpha\}_{\alpha\in\mathbb{N}}$ of side 23 r. Therefore
\[ \sum_{i\ne j}\chi_i(\mu,R)\chi_j(\mu,R)\chi_{i,j} \le \sum_{\langle\alpha,\beta\rangle}N_\alpha N_\beta + \sum_\alpha N_\alpha^2, \tag{A.2} \]
where $\langle\alpha,\beta\rangle$ means that the sum is restricted to all pairs of different cells at distance not larger than $r$ and $N_\alpha = N(\Lambda_\alpha)$. We estimate the right-hand side of (A.2) by
\[ \sum_\alpha N_\alpha^2 + \frac12\sum_{\langle\alpha,\beta\rangle}(N_\alpha^2 + N_\beta^2) \le \tilde C\sum_\alpha N_\alpha^2 \le C_0W(X;\mu,R), \tag{A.3} \]
where we used again (2.3). To prove (2.16) we cover the sphere $B(\mu,4R+r)$ by a finite union of spheres $B(\mu + n\frac{R}{4}, R)$ with $n\in\mathbb{Z}^3$. Then each pair of particles is fully contained in at least one such sphere, for a suitable $n\in\mathbb{Z}^3$. Due to the positivity of the potential:
\[ W(X;\mu,2R) \le \sum_{n\in\mathbb{Z}^3,\ |n|\le 20}W\Big(X;\mu + n\frac{R}{4},R\Big) \le C_0\sup_\mu W(X;\mu,R). \tag{A.4} \]
References
[A] Alexander, R.: Time evolution for infinitely many hard spheres. Commun. Math. Phys. 49, 217–232 (1976)
[CC] Calderoni, P. and Caprino, S.: Time evolution of infinitely many particles: an existence theorem. J. Stat. Phys. 28, 815–833 (1982)
[DF] Dobrushin, R.L. and Fritz, J.: Non-Equilibrium Dynamics of One-Dimensional Infinite Particle Systems with a Hard-Core Interaction. Commun. Math. Phys. 55, 275–292 (1977)
[FD] Fritz, J. and Dobrushin, R.L.: Non-Equilibrium Dynamics of Two-Dimensional Infinite Particle Systems with a Singular Interaction. Commun. Math. Phys. 57, 67–81 (1977)
[L1] Lanford, O.E.: Classical Mechanics of one-dimensional systems with infinitely many particles. I An existence theorem. Commun. Math. Phys. 9, 176–191 (1968)
[L2] Lanford, O.E.: Classical Mechanics of one-dimensional systems with infinitely many particles. II Kinetic Theory. Commun. Math. Phys. 11, 257–292 (1969)
[L3] Lanford, O.E.: Time evolution of large classical systems. Moser Ed. Lect Notes in Physics 38. Berlin–Heidelberg–New York: Springer, 1975
[MP] Marchioro, C. and Pulvirenti, M.: Time evolution of infinite one-dimensional Coulomb systems. J. Stat. Phys. 27, 809–822 (1982)
[MPP] Marchioro, C., Pellegrinotti, A. and Presutti, E.: Existence of time evolution in ν-dimensional Statistical Mechanics. Commun. Math. Phys. 40, 175–185 (1975)
[MPPu] Marchioro, C., Pellegrinotti, A. and Pulvirenti, M.: Remarks on the existence of non-equilibrium dynamics. Proceedings Esztergom Summer School. Coll. Math. Soc. Janos Bolyai 27, 733–746 (1978)
[P] Pulvirenti, M.: On the time evolution of the states of infinitely extended particle systems. J. Stat. Phys. 27, 693–713 (1982)
[PPT] Presutti, E., Pulvirenti, M. and Tirozzi, B.: Time evolution of infinite classical systems with singular, long-range, two-body interaction. Commun. Math. Phys. 47, 81–95 (1976)
[SS] Sigmund-Shultze, R.: On non-equilibrium dynamics of multidimensional infinite particle systems in the translation invariant case. Commun. Math. Phys. 100, 245–265 (1985)
[S1] Sinai, Ya.: Construction of the dynamics for one-dimensional systems of Statistical Mechanics. Sov. Theor. Math. Phys. 12, 487–501 (1973)
[S2] Sinai, Ya.: The construction of the cluster dynamics of dynamical systems in Statistical Mechanics. Vest. Moskow Univ. Sez. I Math. Mech. 29, 152–176 (1974)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 215, 45 – 56 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Equivariant Self-Similar Wave Maps from Minkowski Spacetime into 3-Sphere
Piotr Bizoń
Institute of Physics, Jagellonian University, Kraków, Poland
Received: 20 October 1999 / Accepted: 12 May 2000

Abstract: We prove existence of a countable family of spherically symmetric self-similar wave maps from 3 + 1 Minkowski spacetime into the 3-sphere. These maps can be viewed as excitations of the ground state solution found previously by Shatah. The first excitation is particularly interesting in the context of the Cauchy problem since it plays the role of a critical solution sitting at the threshold for singularity formation. We analyze the linear stability of our wave maps and show that the number of unstable modes about a given map is equal to its nodal number. Finally, we formulate a condition under which these results can be generalized to higher dimensions.

1. Introduction

Wave maps, defined as harmonic maps from a spacetime $(M,\eta)$ into a Riemannian manifold $(N,g)$, have been intensively studied during the past decade (see the recent review [1]). The interest in wave maps (sometimes also called sigma models) stems from the fact that they contain many features of more complex relativistic field models but are simple enough to be tractable rigorously. In particular, the investigation of questions of global existence and formation of singularities for wave maps can give insight into the analogous, but much more difficult, problems in general relativity. With this motivation we have recently studied numerically the development of singularities for wave maps from 3 + 1 Minkowski spacetime into the 3-sphere [2]. In this case it was known that: (i) solutions with small initial data exist globally in time [3, 4]; (ii) there exist smooth initial data which lead to blow-up in finite time. An example of (ii) is due to Shatah [5], who constructed a spherically symmetric self-similar wave map of the form $u(r,t) = f_0\!\left(\frac{r}{T-t}\right)$. This solution is perfectly smooth for $t < T$ but it breaks down at $t = T$.
Our numerical simulations [2] strongly suggested that the self-similar blow-up found by Shatah is generic in the sense that there is a large set of initial data which comprise the basin of attraction of the solution f0 . In particular, it seems that all initial data of nonzero degree (which by definition are not small in the sense of [3, 4]) blow up in this
universal self-similar manner. The dynamical evolution of degree zero wave maps is more interesting because, depending on the “size” of initial data, the solutions either exist globally in time converging to the vacuum (this scenario is usually referred to as dispersion), or blow up in finite time (where, as before, the blow-up profile is given by $f_0$). Thus, in this case there arises a natural question of determining the boundary between the basins of attraction of these two generic asymptotic behaviors. In [2] we studied this question numerically by evolving various one-parameter families of degree zero initial data interpolating between blow-up and dispersion. A typical example of initial data in this class is a Gaussian with varying amplitude. We found that the initial data lying on the boundary between the basins of attraction of the solution $f_0$ and the vacuum solution converge asymptotically to a certain codimension-one attractor which is self-similar. This suggested that, besides $f_0$, the model admits another self-similar solution, call it $f_1\!\left(\frac{r}{T-t}\right)$, which has exactly one unstable direction. This expectation was confirmed numerically in [2]. In a sense, the solution $f_1$ can be thought of as the excitation of the ground state solution $f_0$.

The aim of this paper is to give a rigorous proof of existence of a countable family of spherically symmetric self-similar wave maps from Minkowski spacetime into the 3-sphere. The above mentioned solutions $f_0$ and $f_1$ are the first two elements of this family. The proof is based on a shooting technique very similar to the one used by us in the case of harmonic maps between 3-spheres [6].

2. Preliminaries

A wave map $U$ is a map from a spacetime $M$ with metric $\eta$ into a Riemannian manifold $N$ with metric $g$ which is a critical point of the action
\[ S(U) = \frac12\int_M g_{AB}\,\frac{\partial U^A}{\partial x^a}\frac{\partial U^B}{\partial x^b}\,\eta^{ab}\,dV_M. \tag{1} \]
The associated Euler–Lagrange equations
\[ \Box_\eta U^A + \Gamma^A_{BC}(U)\,\partial_aU^B\,\partial^aU^C = 0 \tag{2} \]
constitute a system of semilinear wave equations, where the $\Gamma$'s are the Christoffel symbols of the metric $g$. In this paper we consider the case where $M = \mathbb{R}^{3+1}$, the 3 + 1 dimensional Minkowski spacetime, and $N = S^3$, the unit 3-sphere. In polar coordinates on $\mathbb{R}^{3+1}$ and $S^3$ the respective metrics are
\[ \eta = -dt^2 + dr^2 + r^2d\omega^2, \tag{3} \]
and
\[ g = du^2 + \sin^2(u)\,d\Omega^2, \tag{4} \]
where $d\omega^2$ and $d\Omega^2$ are the standard metrics on $S^2$, and $u\in[0,\pi]$. We consider spherically symmetric maps of the form
\[ U(t,r,\omega) = (u(t,r),\ \Omega = \omega). \tag{5} \]
Then the action (1) reduces to
\[ S = \frac12\int\left( -u_t^2 + u_r^2 + \frac{2\sin^2(u)}{r^2} \right) r^2\,dt\,dr\,d\omega, \tag{6} \]
and the corresponding Euler–Lagrange equation is
\[ -u_{tt} + u_{rr} + \frac{2}{r}u_r - \frac{\sin(2u)}{r^2} = 0. \tag{7} \]
This equation is invariant under dilations: if $u(t,r)$ is a solution of Eq. (7), so is $u_\lambda(t,r) = u(\lambda t,\lambda r)$. It is thus natural to look for self-similar solutions of the form
\[ u(t,r) = f\!\left(\frac{r}{T-t}\right), \tag{8} \]
where $T$ is a positive constant. As mentioned in the introduction, such solutions are important in the context of the Cauchy problem for Eq. (7) since they appear in the dynamical evolution as intermediate or final attractors. Substituting the ansatz (8) into (7) we obtain the ordinary differential equation
\[ f'' + \frac{2}{\rho}f' - \frac{\sin(2f)}{\rho^2(1-\rho^2)} = 0, \tag{9} \]
where $\rho = r/(T-t)$ and ${}' = d/d\rho$. For $t < T$ we have $0\le\rho<\infty$. It is sufficient to consider Eq. (9) only inside the past light cone of the point $(T,0)$, i.e., for $\rho\in[0,1]$. This constitutes a two-point singular boundary value problem with the boundary conditions
\[ f(0) = 0 \quad\text{and}\quad f(1) = \frac{\pi}{2}, \tag{10} \]
which are dictated by the requirement of smoothness at the endpoints. Once a solution of Eq. (9) satisfying the conditions (10) is constructed, it can be easily extended to $\rho > 1$ [5]. Note that solutions of (9) and (10) are the critical points of the functional
\[ E[f] = \frac12\int_0^1\left( \rho^2f'^2 - \frac{2\cos^2(f)}{1-\rho^2} \right)d\rho, \tag{11} \]
which, as was pointed out by Shatah and Tahvildar-Zadeh [7], can be interpreted as the energy for harmonic maps from the hyperbolic space $H^3$ into the upper hemisphere of $S^3$. Shatah [5] showed that $E[f]$ is bounded from below over the $H^1$-space of functions satisfying (10) and attains an infimum at a smooth function $f_0$, the ground state solution of Eq. (9). Independently, Turok and Spergel [8] found this solution in closed form
\[ f_0 = 2\arctan(\rho). \tag{12} \]
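The closed form (12) can be checked directly against Eq. (9) and the boundary conditions (10). The sketch below evaluates the left-hand side of (9) on $f_0$ using its exact derivatives $f_0' = 2/(1+\rho^2)$ and $f_0'' = -4\rho/(1+\rho^2)^2$; the helper name `residual` is introduced here for illustration.

```python
import math

def residual(rho):
    """Left-hand side of Eq. (9) evaluated on f0(rho) = 2 arctan(rho)."""
    f = 2.0 * math.atan(rho)
    f1 = 2.0 / (1.0 + rho ** 2)                  # f0'
    f2 = -4.0 * rho / (1.0 + rho ** 2) ** 2      # f0''
    return f2 + 2.0 * f1 / rho - math.sin(2.0 * f) / (rho ** 2 * (1.0 - rho ** 2))

# Sample the open interval (0, 1); the endpoints are singular points of (9).
worst = max(abs(residual(k / 100.0)) for k in range(1, 100))
print(worst)
print(2.0 * math.atan(0.0), 2.0 * math.atan(1.0), math.pi / 2)
```

The residual vanishes up to rounding error, and $f_0(0) = 0$, $f_0(1) = \pi/2$ as required by (10).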
The main result of this paper is

Theorem 1. There exists a countable family of smooth solutions $f_n$ of Eq. (9) satisfying the boundary conditions (10). The index $n = 0, 1, 2, \ldots$ denotes the number of intersections of $f_n(\rho)$ with the line $f = \pi/2$ (the equator of $S^3$) on $\rho\in[0,1)$.

Before proving this theorem in the next section, we present some numerical results. As will be shown below, the solutions satisfying $f(1) = \pi/2$ form a one-parameter family with the asymptotics $f(\rho)\sim\pi/2 + b(\rho-1)$ near $\rho = 1$, while the solutions satisfying $f(0) = 0$ form a one-parameter family with the asymptotics $f(\rho)\sim a\rho$ near $\rho = 0$. The solutions $f_n$ can be obtained numerically by a standard shooting-to-a-fitting-point method, that is, by integrating Eq. (9) away from the singular points $\rho = 0$ and $\rho = 1$ in opposite directions with some trial parameters $a$ and $b$ and then adjusting these parameters so that the solution joins smoothly at the fitting point. The discrete set of pairs $(a_n, b_n)$ generated in this way and the energies characterizing the solutions $f_n$ are shown below for $n\le 4$.

Fig. 1. The ground state solution $f_0$ and the first three excitations $f_1$, $f_2$, $f_3$ generated numerically ($f$ plotted against $\rho$ on a logarithmic scale). The solutions $f_n$ with $n > 0$ were first discovered numerically by Äminneborg and Bergström [9]

n    a_n          b_n           E_n = E[f_n]        E_n/E_{n+1}
0    2            1             π/4 − 1             10.891
1    21.757413    −0.305664     −1.97045 × 10^−2    10.764
2    234.50147    0.0932163     −1.83055 × 10^−3    10.751
3    2522.0683    −0.0284312    −1.70276 × 10^−4    10.749
4    27113.388    0.0086717     −1.58411 × 10^−5    10.749
3. Proof of Theorem 1

To prepare the ground for the proof of Theorem 1 we first discuss some basic properties of solutions of Eq. (9). It is convenient to use new variables defined by

$$ \rho = \frac{1}{\cosh x}, \quad \text{and} \quad h(x) = f(\rho) - \frac{\pi}{2}. \tag{13} $$

The range of x is from x = 0 (for ρ = 1) to x = ∞ (for ρ = 0). Note that the number of intersections of f with the line f = π/2 is the same as the number of zeros of h. In these new variables Eq. (9) becomes

$$ h'' - \coth(x)\, h' + \sin(2h) = 0, \tag{14} $$

and the boundary conditions (10) translate into

$$ h(0) = 0 \quad \text{and} \quad h(\infty) = \pm\frac{\pi}{2}, \tag{15} $$

where the ± sign in the last expression, obviously allowed by the reflection symmetry h → −h, is introduced for convenience.

Lemma 1. For any b there exists a unique global solution $h_b(x)$ to Eq. (14) such that

$$ h_b(x) \sim b x^2 \tag{16} $$

as x → 0.

Proof. Defining $v = h'$, let us rewrite Eq. (14) as the system of two integral equations

$$ v(x) = -\sinh(x) \int_0^x \frac{\sin(2h(s))}{\sinh s}\, ds, \qquad h(x) = \int_0^x v(s)\, ds. \tag{17} $$

Following the standard procedure we solve (17) by iteration, setting

$$ v^{(n+1)}(x) = -\sinh(x) \int_0^x \frac{\sin(2h^{(n)}(s))}{\sinh s}\, ds, \qquad h^{(n+1)}(x) = \int_0^x v^{(n)}(s)\, ds, \tag{18} $$
with the starting values $h^{(0)} = bx^2$ and $v^{(0)} = 2bx$. It can easily be shown that the mapping $(h^{(n)}(x), v^{(n)}(x)) \to (h^{(n+1)}(x), v^{(n+1)}(x))$ defined by (18) is contractive for any finite x, hence the sequence $(h^{(n)}, v^{(n)})$ converges to a solution of Eq. (14). The proof of uniqueness is also routine so we omit it.

Definition 1. A solution of Eq. (14) starting at x = 0 with the asymptotic behavior (16) will be called the b-orbit. Without loss of generality we assume that b ≥ 0. The b-orbit which satisfies h(∞) = ±π/2 will be called a connecting orbit.

Remark 1. In the following whenever we say “a solution” we always mean the b-orbit. Also, when we say that some property holds for all x we always mean for all x > 0. We use lim to denote $\lim_{x\to\infty}$.

Remark 2. The endpoints of connecting orbits ($h = \pm\pi/2$, $h' = 0$) are saddle-type critical points of the asymptotic (x → ∞) autonomous equation $h'' - h' + \sin 2h = 0$. One can easily show (cf. [6]) that the connecting orbits converge to these points along the one-dimensional stable manifolds $\pm h(x) \sim -\pi/2 + a e^{-x}$.

The following function, defined for b-orbits, will play a crucial role in our analysis:

$$ W(x) = \frac{1}{2} h'^2 + \sin^2 h. \tag{19} $$

We have

$$ \frac{dW}{dx} = \coth(x)\, h'^2, \tag{20} $$

so W is increasing (unless h is a constant solution). Equations (19) and (20) imply that if $W(x_0) \ge 1$ for some $x_0$ (and h is not identically equal to ±π/2) then $|h'(x)| > \epsilon > 0$ for $x > x_0$, hence lim W(x) = ∞. Thus, if a b-orbit crosses the line h = ±π/2, then h and $h'$ tend monotonically to ±∞.
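The b-orbits and the function W can also be explored numerically. The following Python sketch is an illustration only and is not taken from the paper: it integrates Eq. (14) with a classical fourth-order Runge–Kutta scheme, starting a short distance from the singular point x = 0 with the two-term series $h = bx^2(1 - x^2/6)$, which follows from substituting a power series into (14). The step size, the starting point and all function names are assumptions made here. As a check it uses the ground state: combining (12) with (13) and the reflection h → −h gives the explicit connecting orbit $\pi/2 - 2\arctan(1/\cosh x)$, which behaves like $x^2/2$ near x = 0 and therefore corresponds to b = 1/2 in the normalization (16).

```python
import math

def rhs(x, h, v):
    """Eq. (14) as a first-order system: h' = v, v' = coth(x) v - sin(2h)."""
    return v, v / math.tanh(x) - math.sin(2.0 * h)

def b_orbit(b, x_end, dx=1e-3, x_start=1e-2):
    """Integrate the b-orbit with the classical RK4 scheme.

    The start values come from the series h = b x^2 (1 - x^2/6) + O(x^6).
    Returns lists of x, h(x) and W(x), with W as defined in (19).
    """
    x = x_start
    h = b * x**2 * (1.0 - x**2 / 6.0)
    v = 2.0 * b * x * (1.0 - x**2 / 3.0)
    xs, hs, ws = [x], [h], [0.5 * v * v + math.sin(h) ** 2]
    steps = int(round((x_end - x_start) / dx))
    for i in range(steps):
        k1h, k1v = rhs(x, h, v)
        k2h, k2v = rhs(x + dx / 2, h + dx / 2 * k1h, v + dx / 2 * k1v)
        k3h, k3v = rhs(x + dx / 2, h + dx / 2 * k2h, v + dx / 2 * k2v)
        k4h, k4v = rhs(x + dx, h + dx * k3h, v + dx * k3v)
        h += dx / 6 * (k1h + 2 * k2h + 2 * k3h + k4h)
        v += dx / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x = x_start + (i + 1) * dx
        xs.append(x)
        hs.append(h)
        ws.append(0.5 * v * v + math.sin(h) ** 2)
    return xs, hs, ws

def ground_state(x):
    """Explicit connecting orbit obtained from f0 = 2 arctan(rho) and (13)."""
    return math.pi / 2 - 2.0 * math.atan(1.0 / math.cosh(x))

xs, hs, ws = b_orbit(0.5, 3.0)
print("deviation from closed form at x = 3:", abs(hs[-1] - ground_state(xs[-1])))
print("W at x = 3:", ws[-1])
```

Because the endpoints of connecting orbits are saddles, the integration is run only over a moderate range of x; further out, rounding errors are amplified along the unstable direction and the numerical orbit eventually leaves the strip |h| < π/2.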
Lemma 2. A b-orbit (with nonzero b) which satisfies |h(x)| < π/2 for all x is a connecting orbit.

Proof. We showed above that if $W(x_0) \ge 1$ for some $x_0$, then |h| tends to infinity, hence |h| < π/2 implies that W(x) < 1 for all x, so lim W(x) exists. Thus, $\lim W' = 0$ which means by (20) that $\lim h' = 0$ and next by (19) that $\lim \sin^2 h$ exists, implying that also lim h exists. By Eq. (14), sin 2h(∞) = 0 since otherwise $\lim h'' \ne 0$ contradicting $\lim h' = 0$. Hence, h(∞) = ±π/2 or h(∞) = 0. To conclude the proof note that the latter implies lim W = 0 which in view of (20) is possible only if W ≡ 0, that is h ≡ 0.

The next two lemmas describe the behavior of b-orbits for small and large values of the shooting parameter b, respectively.

Lemma 3. If b is sufficiently small then the solution $h_b(x)$ has arbitrarily many zeros.

Proof. Define $\tilde{h}(x) = h_b(x)/b$. The function $\tilde{h}$ satisfies

$$ \tilde{h}'' - \coth(x)\, \tilde{h}' + \frac{\sin(2b\tilde{h})}{b} = 0 \tag{21} $$

with the asymptotic behavior $\tilde{h}(x) \sim x^2$ as x → 0. As b → 0, the solutions of Eq. (21) tend uniformly on compact intervals to the solution of the limiting equation

$$ H'' - \coth(x)\, H' + 2H = 0 \tag{22} $$

with the asymptotic behavior $H(x) \sim x^2$ as x → 0. The solution H(x) can be found in closed form in terms of the hypergeometric function but for the purpose of the argument it is enough to observe that H(x) is oscillating at infinity, since this implies that the number of zeros of $h_b(x) = b\tilde{h}(x)$ increases to infinity as b tends to zero.

Lemma 4. If b is sufficiently large then the solution $h_b(x)$ increases monotonically to ∞.

Proof. As in the proof of Lemma 3, we use a scaling argument. This time, we define $\bar{h}(x) = h_b(x/\sqrt{b})$. The function $\bar{h}$ satisfies

$$ \bar{h}'' - \frac{1}{\sqrt{b}} \coth\left(\frac{x}{\sqrt{b}}\right) \bar{h}' + \frac{\sin(2\bar{h})}{b} = 0 \tag{23} $$

with the asymptotic behavior $\bar{h}(x) \sim x^2$ as x → 0. As b → ∞, the solutions of Eq. (23) tend uniformly on compact intervals to the solution of the limiting equation

$$ \bar{H}'' - \frac{1}{x} \bar{H}' = 0, \tag{24} $$

that is to $\bar{H}(x) = x^2$. Thus, on any compact interval the solution $h_b(x)$ stays arbitrarily close to $bx^2$ if b is sufficiently large. In particular, $h_b(x)$ strictly increases up to some $x_0$, where $h(x_0) = \pi/2$. Since $W(x_0) > 1$, by the argument following (20) $h_b$ tends monotonically to ∞.

Now we are ready to prove Theorem 1. The proof will be the immediate corollary of the following proposition.
Proposition 1. There exists a decreasing sequence of positive numbers $\{b_n\}$, n = 0, 1, 2, ..., such that the corresponding $b_n$-orbits are connecting orbits with exactly n zeros for x > 0. Moreover, $\lim_{n\to\infty} b_n = 0$.

Proof. The proof is based on an inductive application of the standard shooting argument. Let $S_0 = \{b \mid h_b(x)$ strictly increases up to some $x_0$ where $h_b(x_0) = \pi/2\}$. Let $b_0 = \inf S_0$. By Lemma 4 the set $S_0$ is nonempty and by Lemma 3 $b_0 > \epsilon > 0$. The $b_0$-orbit cannot cross the line h = π/2 at a finite x because the same would be true for nearby b-orbits with $b < b_0$, violating the definition of $b_0$. Thus, the $b_0$-orbit stays in the region |h| < π/2 for all x, and therefore due to Lemma 2 it is a connecting orbit. By definition the $b_0$-orbit has no zeros for x > 0. To make the inductive step we need one more lemma.

Lemma 5. If $b = b_0 - \epsilon$ for sufficiently small ε > 0, then the solution $h_b(x)$ increases up to some $x_0$ where it attains a positive local maximum $h(x_0) < \pi/2$ and then decreases monotonically to −∞.

Proof. By the definition of $b_0$ there must exist a point $x_0$, where $h_b'(x_0) = 0$. Since by (14) a solution h cannot have a local minimum if h > 0, it follows that there must be a point $x_1 > x_0$, where $h_b(x_1) = 0$ (otherwise the b-orbit would contradict Lemma 2). The idea of the proof is to show that $W(x_1) > 1$ provided that ε is sufficiently small. As argued above this implies that for $x > x_1$ $h_b$ decreases monotonically to −∞. In the following we drop the index b on $h_b$. From (20) we have

$$ W(x_1) - W(x_0) = \int_{x_0}^{x_1} \coth(x)\, h'^2\, dx > -\int_0^{h(x_0)} h'\, dh. \tag{25} $$

In order to estimate the last integral note that for $x > x_0$,

$$ W(x) - W(x_0) = \frac{1}{2} h'^2 + \sin^2 h(x) - \sin^2 h(x_0) > 0, \tag{26} $$

so $-h' > \sqrt{2(\sin^2 h(x_0) - \sin^2 h)}$. Inserting this into (25) gives

$$ W(x_1) > \sin^2 h(x_0) + \int_0^{h(x_0)} \sqrt{2(\sin^2 h(x_0) - \sin^2 h)}\, dh. \tag{27} $$
The right-hand side of this inequality is an increasing function of $h(x_0)$ which exceeds 1 if, say, $\pi/3 < h(x_0) < \pi/2$, as can be checked by direct calculation. The value $h_b(x_0)$ will fall into that interval if ε is sufficiently small because by continuous dependence of solutions on initial conditions, $h_b(x_0) \to \pi/2$ as ε → 0. This concludes the proof of Lemma 5.

Having Lemma 5 we return now to the proof of Proposition 1. Let $S_1 = \{b \mid h_b(x)$ increases up to some $x_0$, where it attains a positive local maximum $h(x_0) < \pi/2$ and then decreases monotonically up to some $x_1$, where $h(x_1) = -\pi/2\}$. Let $b_1 = \inf S_1$. Due to Lemma 5 the set $S_1$ is nonempty and by Lemma 3 $b_1$ is strictly positive. Using the same argument as above we conclude that the $b_1$-orbit must stay in the region |h| < π/2 for all x, so it is a connecting orbit (asymptoting to −π/2). By definition the $b_1$-orbit has exactly one zero for x > 0.
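The "direct calculation" invoked in the proof of Lemma 5 can be reproduced numerically. The Python sketch below is an illustration added here, not the author's computation: it evaluates the lower bound for $W(x_1)$, namely $\sin^2 h(x_0)$ plus the integral of $\sqrt{2(\sin^2 h(x_0) - \sin^2 h)}$ over h from 0 to $h(x_0)$, with a midpoint rule. The function name and the node count are assumptions. At $h(x_0) = \pi/2$ the integrand reduces to $\sqrt{2}\cos h$, so the bound equals $1 + \sqrt{2}$ there, which gives an independent check of the quadrature.

```python
import math

def lemma5_bound(h0, nodes=20000):
    """Lower bound for W(x1) as a function of the turning height h0 = h(x0).

    Computes sin^2(h0) + integral_0^h0 sqrt(2 (sin^2 h0 - sin^2 h)) dh
    with the midpoint rule, valid for 0 < h0 <= pi/2.
    """
    s0 = math.sin(h0) ** 2
    step = h0 / nodes
    integral = 0.0
    for i in range(nodes):
        h = (i + 0.5) * step
        integral += math.sqrt(2.0 * (s0 - math.sin(h) ** 2))
    return s0 + step * integral

# Tabulate the bound on a few heights between pi/3 and pi/2.
for k in range(5):
    h0 = math.pi / 3 + k * (math.pi / 2 - math.pi / 3) / 4
    print(round(h0, 4), lemma5_bound(h0))
```

The midpoint rule keeps every sample strictly inside (0, h0), so the argument of the square root stays positive.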
The subsequent connecting orbits are obtained by repetition of the above construction. Since the sequence $\{b_n\}$ is decreasing and bounded below by zero, it has a nonnegative limit. Suppose that $\lim_{n\to\infty} b_n = b^* > 0$. The $b^*$-orbit cannot leave the region |h| < π/2 for a finite x because the set of such orbits is clearly open. Thus, the $b^*$-orbit is a connecting orbit with some finite number of zeros. But this contradicts the fact that the number of zeros of $b_n$-orbits increases with n. We conclude therefore that $\lim_{n\to\infty} b_n = 0$. This completes the proof of Proposition 1.

Returning to the original variables f(ρ) and ρ, and using the notation $h_n(x) \equiv h_{b_n}(x)$, we have $f_n(\rho) = h_n(x) + \pi/2$ with $f_n(1) = \pi/2$ and $f_n(0) = 0 \pmod{\pi}$, as claimed in Theorem 1.

We end this section with a remark about the large n limit. From $\lim_{n\to\infty} b_n = 0$, it follows that $\lim_{n\to\infty} h_n(x) = 0$ for any finite x. The limiting solution $h^* = 0$ (or $f^* = \pi/2$) is a singular map which geometrically corresponds to the map into the equator of $S^3$. The “energy” of this map $E[f^*] = 0$ provides the upper bound for the “energies” of critical points of (11) (we write “energy” in quotation marks to emphasize that the functional (11) is not the true conserved energy associated with the action (6)). As follows from the proof of Lemma 3, the behavior of connecting orbits with large n (and consequently small $b_n$) can be approximated by the solution of Eq. (22), namely $h_n(x) \approx b_n H(x)$ on $x \in [0, x_n)$, where $x_n$ tends to infinity as n → ∞. This fact can be used to prove some remarkable scaling properties of connecting orbits in the limit of large n. For example one can show that (see the table in Sect. 2)

$$ \lim_{n\to\infty} \frac{E_n}{E_{n+1}} = e^{2\pi/\sqrt{7}}. \tag{28} $$
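The limit (28) can be compared with the energies tabulated in Sect. 2. The short Python sketch below is only a consistency check added here; the list of energies is copied from that table and the variable names are illustrative. For orientation, as x → ∞ the coefficient coth(x) in Eq. (22) tends to 1 and the characteristic roots become $(1 \pm i\sqrt{7})/2$, so consecutive zeros of H are asymptotically spaced by $2\pi/\sqrt{7}$, the exponent appearing in (28).

```python
import math

# Energies E_n = E[f_n] for n = 0..4, copied from the table in Sect. 2.
energies = [math.pi / 4 - 1, -1.97045e-2, -1.83055e-3, -1.70276e-4, -1.58411e-5]

# Successive ratios E_n / E_{n+1} available from the tabulated energies.
ratios = [energies[n] / energies[n + 1] for n in range(len(energies) - 1)]

# Predicted large-n limit from (28).
limit = math.exp(2.0 * math.pi / math.sqrt(7.0))

for n, ratio in enumerate(ratios):
    print(n, round(ratio, 3))
print("limit:", round(limit, 3))
```

The ratios decrease with n and the last one agrees with the predicted limit to the three decimals given in the table.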
For more detailed discussion of this issue we refer the reader to [6] where the analogous behavior in the case of harmonic maps between spheres was derived.

4. Stability

The role of self-similar solutions $f_n$ in the evolution depends crucially on their stability with respect to small perturbations. This problem was analyzed by us in [2] by mixed analytic-numerical methods. In particular, we provided evidence towards the conjecture that the solution $f_0$ is asymptotically stable and, as such, has an open basin of attraction. To make the results obtained in [2] rigorous is a formidable task. In this section we discuss the first (easy) step in achieving this goal, namely we determine the character of the spectrum of the linearized operator around the solutions $f_n$. A somewhat different but equivalent version of the linear stability analysis was presented in [2]. We restrict attention to the interior of the past light cone of the point (T, 0) and define the new time coordinate $s = -\ln\sqrt{(T-t)^2 - r^2}$. In terms of s and ρ, Eq. (7) becomes

$$ -\frac{e^{2s}}{(1-\rho^2)^2}\, (e^{-2s} u_s)_s + u_{\rho\rho} + \frac{2}{\rho} u_\rho - \frac{\sin(2u)}{\rho^2 (1-\rho^2)} = 0. \tag{29} $$

Of course, this equation reduces to Eq. (9) if the solution is self-similar, that is s-independent. Following the standard procedure we seek solutions of (29) in the form $u(s, \rho) = f_n(\rho) + w(s, \rho)$. Neglecting the $O(w^2)$ terms we obtain a linear evolution equation for the perturbation w(s, ρ),

$$ -\frac{e^{2s}}{(1-\rho^2)^2}\, (e^{-2s} w_s)_s + w_{\rho\rho} + \frac{2}{\rho} w_\rho - \frac{2\cos(2f_n)}{\rho^2 (1-\rho^2)}\, w = 0. \tag{30} $$
Substituting $w(s, \rho) = e^{(\lambda+1)s} v(\rho)$ into (30) we get the eigenvalue problem

$$ Av = (1 - \lambda^2) v, \quad \text{where} \quad A = -\frac{(1-\rho^2)^2}{\rho^2} \frac{d}{d\rho} \left( \rho^2 \frac{d}{d\rho} \right) + \frac{2(1-\rho^2)\cos(2f_n)}{\rho^2}. \tag{31} $$

Note that the principal part of the operator A is the radial Laplacian on the hyperbolic space $H^3$. We consider this problem in the space of smooth functions which are square-integrable on the interval [0, 1] with respect to the natural inner product on $H^3$, that is

$$ v \in L^2\left( [0, 1],\ \frac{\rho^2}{(1-\rho^2)^2}\, d\rho \right). \tag{32} $$

In this function space A is self-adjoint hence its spectrum is real. Both endpoints are of the limit-point type. Near ρ = 0 the leading behavior of solutions of (31) is $v(\rho) \sim \rho^\alpha$, where α(α + 1) = 2, so admissible solutions behave as

$$ v(\rho) \sim \rho \quad \text{as} \quad \rho \to 0. \tag{33} $$

Near ρ = 1 the leading behavior is $v(\rho) \sim (1-\rho)^\beta$, where $\beta = (1 \pm \sqrt{\lambda^2})/2$, so eigenfunctions must have $\lambda^2 > 0$ and behave as (up to a normalization constant)

$$ v(\rho) \sim (1-\rho)^{\frac{1}{2}(1+|\lambda|)} \quad \text{as} \quad \rho \to 1. \tag{34} $$

All $\lambda^2 \le 0$ belong to the continuous spectrum of A. The case λ = 0 will be treated separately below. Note that this eigenvalue problem has the symmetry λ → −λ (that is why we wrote λ + 1 rather than λ in the ansatz for w). Each eigenvalue $\lambda^2 > 0$ gives rise to an unstable mode which grows exponentially as $e^{(|\lambda|+1)s}$. To find the eigenvalues we need to solve (31) on the interval ρ ∈ [0, 1] with the boundary conditions (33) and (34). In [2] we did this numerically (for n ≤ 4) by shooting the solutions from both ends and matching the logarithmic derivatives at a midpoint. For example, for n = 1 we got $\lambda_1^2 \approx 28.448$; for n = 2 we got $\lambda_1^2 \approx 28.132$, $\lambda_2^2 \approx 3372.12$. Our numerics strongly suggested that the point spectrum of the operator A around the solution $f_n$ has exactly n positive eigenvalues $\lambda_k^2 > 0$ (k = 1, . . . , n). Now, we will give a simple proof of this property.

Proof. The proof is based on the observation that the solution with λ = 0 corresponds to the gauge mode which is due to the freedom of choosing the blowup time T. To see this, consider a solution $f_n(r/(T' - t))$. In terms of the similarity variables $s = -\ln\sqrt{(T-t)^2 - r^2}$ and ρ = r/(T − t), we have

$$ f_n\left( \frac{r}{T' - t} \right) = f_n\left( \frac{\rho}{1 + \epsilon \sqrt{1-\rho^2}\, e^s} \right), \quad \text{where} \quad \epsilon = T' - T. \tag{35} $$

In other words, each self-similar solution $f_n(\rho)$ generates the orbit of solutions of (29) parametrized by ε. It is easy to verify that the generator of this orbit

$$ w(s, \rho) = -\frac{d}{d\epsilon} f_n\left( \frac{\rho}{1 + \epsilon \sqrt{1-\rho^2}\, e^s} \right) \bigg|_{\epsilon=0} = e^s \rho \sqrt{1-\rho^2}\, f_n'(\rho) \tag{36} $$
solves (30), thus $v^{(n)}_{\rm gauge}(\rho) = \rho \sqrt{1-\rho^2}\, f_n'(\rho)$ satisfies (31) with λ = 0¹. Since $v^{(n)}_{\rm gauge}(\rho)$ has exactly n zeros on ρ ∈ (0, 1) (because $f_n$ has n extrema), it follows by the standard result from Sturm–Liouville theory that there are exactly n positive eigenvalues, as conjectured in [2].

To summarize, we showed that the self-similar solution $f_n$ has exactly n unstable modes, which means in particular that the solution $f_0$ is linearly stable.

Remark. If we view the solutions $f_n$ as harmonic maps from the hyperboloid $H^3$ into $S^3$, then the eigenvalue problem (31) determines the spectrum of the Hessian of the energy functional (11)

$$ \delta^2 E[f_n](v, v) = \frac{1}{2} \int_0^1 \left( \rho^2 v'^2 + \frac{2\cos(2f_n)}{1-\rho^2}\, v^2 \right) d\rho. \tag{37} $$
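The statement that the gauge mode satisfies (31) with λ = 0 can be spot-checked for the ground state, where everything is explicit. The Python sketch below is an illustration added here under that restriction (n = 0, $f_0 = 2\arctan\rho$): it applies a central-difference approximation of the operator A to $v_{\rm gauge}(\rho) = \rho\sqrt{1-\rho^2}\, f_0'(\rho)$ and compares the result with $v_{\rm gauge}$ itself, as required by $Av = (1 - \lambda^2)v$ at λ = 0. The function names, the difference step and the sample points are assumptions.

```python
import math

def v_gauge(rho):
    """Gauge mode for n = 0: rho * sqrt(1 - rho^2) * f0'(rho), f0 = 2 arctan(rho)."""
    return rho * math.sqrt(1.0 - rho**2) * 2.0 / (1.0 + rho**2)

def apply_A(v, rho, d=1e-4):
    """Central-difference approximation of (A v)(rho) from (31) around f0."""
    # d/drho (rho^2 dv/drho), using fluxes evaluated at the half-steps.
    flux_plus = (rho + d / 2) ** 2 * (v(rho + d) - v(rho)) / d
    flux_minus = (rho - d / 2) ** 2 * (v(rho) - v(rho - d)) / d
    laplace_part = -((1.0 - rho**2) ** 2) / rho**2 * (flux_plus - flux_minus) / d
    # cos(2 f0) with f0 = 2 arctan(rho).
    cos_2f0 = math.cos(4.0 * math.atan(rho))
    potential_part = 2.0 * (1.0 - rho**2) * cos_2f0 / rho**2 * v(rho)
    return laplace_part + potential_part

for rho in (0.2, 0.5, 0.8):
    print(rho, apply_A(v_gauge, rho) - v_gauge(rho))
```

The printed residuals should be at the level of the discretization error. The sample points are kept away from ρ = 1, where the gauge mode behaves like $(1-\rho)^{1/2}$ and finite differences lose accuracy.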
Within this approach the argument given above can be rephrased in terms of the Morse index. In particular, it implies that the Morse index of the solution $f_0$ is zero, in agreement with Shatah’s result that $f_0$ is a local minimum of the energy functional (11). Note that in this context the gauge mode acquires a geometrical interpretation as the perturbation induced by K, the conformal Killing vector field on $H^3$,

$$ v^{(n)}_{\rm gauge} = \mathcal{L}_K f_n, \quad \text{where} \quad K = \rho \sqrt{1-\rho^2}\, \partial/\partial\rho. \tag{38} $$

As an aside, we remark that by solving (31) one can show that the Morse index of the singular map $f^* = \pi/2$ is infinite. This fact could probably be used to give an alternative proof of Theorem 1 via Morse theory methods using the ideas of Corlette and Wald developed recently in the case of harmonic maps between spheres [10].

5. Generalization to Higher Dimensions

The proof of Theorem 1 is rather robust which suggests that the result can be generalized in various directions. One possibility, which will not be pursued here, is to consider more general nonconvex targets². Another possibility is to consider the analogous problem in higher dimensions, that is wave maps U : M → N, where $M = \mathbb{R}^{m+1}$, (m + 1)-dimensional Minkowski spacetime, and $N = S^m$, the unit m-sphere. At the same time one can relax the equivariance ansatz (5) by admitting the maps of the form

$$ U(t, r, \omega) = (u(t, r),\ \Omega = \chi(\omega)), \tag{39} $$
where χ is a homogeneous harmonic polynomial of degree l > 0. The ansatz (5) is the special case of (39) with l = 1. Assuming self-similarity we obtain the analogue of Eq. (9)

$$ f'' + \left( \frac{m-1}{\rho} + \frac{(m-3)\rho}{1-\rho^2} \right) f' - \frac{k \sin(2f)}{\rho^2 (1-\rho^2)} = 0, \tag{40} $$

¹ We emphasize that λ = 0 is not an eigenvalue because $v^{(n)}_{\rm gauge}(\rho)$ is not square-integrable at ρ = 1. However, this solution is distinguished from the rest of the continuous spectrum by the fact that it is subdominant at ρ = 1 (such a solution is sometimes referred to as a pseudo-eigenfunction).
² For example, one can easily verify that the proof of Theorem 1 goes through if the metric (4) is replaced by $g = du^2 + s^2(u)\, d\Omega^2$, where the function s(u) satisfies the following conditions (cf. [11]): (i) s(0) = 0 and $s'(0) = 1$; (ii) s(u) is monotone increasing from u = 0 up to some $u^* > 0$ where it attains a maximum.

where k = l(l + m − 2)/2. As before, we want to construct smooth solutions on the interval [0, 1] satisfying the boundary conditions (10). Standard analysis of the behavior of such solutions at the endpoints yields that $f(\rho) \sim a\rho^l$ near ρ = 0 and $f(\rho) \sim \pi/2 + b(1-\rho)^{(m-1)/2}$ near ρ = 1. Note that the latter implies that the desired smooth solutions can exist only if the dimension m is odd. Although Eq. (40) looks more complex than (9), the same change of variables as in (13) transforms (40) into

$$ h'' - (m-2) \coth(x)\, h' + k \sin(2h) = 0, \tag{41} $$
which has the same form as (14) apart from the change of constant coefficients. Now, let us see which steps of the shooting argument from Sect. 3 are affected by this change of coefficients. Lemma 1 holds with the asymptotic behavior near x = 0 replaced by $h(x) \sim bx^{m-1}$. Lemmas 2 and 4 remain valid because their proofs are dimension independent. The only fact which is dimension sensitive is Lemma 3, because in m dimensions the limiting equation analogous to (22) reads

$$ H'' - (m-2) \coth(x)\, H' + 2kH = 0, \tag{42} $$

so Lemma 3 is true iff the solution H(x) is oscillating at infinity, that is, iff the characteristic roots of the asymptotic equation $H'' - (m-2)H' + 2kH = 0$ are complex, $8k > (m-2)^2$. With k = l(l + m − 2)/2 this imposes the condition

$$ l > \frac{\sqrt{2} - 1}{2}\, (m - 2). \tag{43} $$

Under this condition the proofs of Lemma 5 and Proposition 1 remain basically unchanged, thus we have

Theorem 2. For each odd m ≥ 3 and l satisfying the condition (43), there exists a countable family of smooth solutions $f_n$ of Eq. (40) satisfying the boundary conditions (10). The index n = 0, 1, 2, ... denotes the number of intersections of $f_n(\rho)$ with the line f = π/2 on ρ ∈ [0, 1).

This theorem extends the recent result of Cazenave, Shatah, and Tahvildar-Zadeh [11] who proved existence of the ground state solution $f_0$ in odd dimensions under the condition (43) using variational methods.

Acknowledgements. I thank Robert Wald for discussions and Arthur Wasserman for reading the manuscript and helpful remarks. This research was supported in part by the KBN grant 2 P03B 010 16.
References
1. Struwe, M.: Wave maps. In: Progress in Nonlinear Differential Equations and their Applications, Vol. 29. Basel–Boston: Birkhäuser, 1997
2. Bizoń, P., Chmaj, T. and Tabor, Z.: Dispersion and collapse of wave maps. Nonlinearity 13, 1411–1423 (2000)
3. Sideris, T.: Global existence of harmonic maps in Minkowski space. Comm. Pure Appl. Math. 42, 1–13 (1989)
4. Kovalyov, M.: Long-time behaviour of solutions of a system of nonlinear equations. Comm. PDE 12, 471–501 (1987)
5. Shatah, J.: Weak solutions and development of singularities of the SU(2) σ-model. Comm. Pure Appl. Math. 41, 459–469 (1988)
6. Bizoń, P.: Harmonic maps between 3-spheres. Proc. Roy. Soc. London A 451, 779–793 (1995)
7. Shatah, J. and Tahvildar-Zadeh, A.: On the Cauchy problem for equivariant wave maps. Comm. Pure Appl. Math. 47, 719–754 (1994)
8. Turok, N. and Spergel, D.: Global texture and the microwave background. Phys. Rev. Lett. 64, 2736–2739 (1990)
9. Äminneborg, S. and Bergström, L.: On selfsimilar global textures in an expanding universe. Phys. Lett. B362, 39–43 (1995)
10. Corlette, K. and Wald, R.M.: Morse theory and infinite families of harmonic maps between spheres. math-ph/9912001
11. Cazenave, T., Shatah, J. and Tahvildar-Zadeh, A.: Harmonic maps of the hyperbolic space and development of singularities for wave maps and Yang–Mills fields. Ann. Inst. H. Poincaré Phys. Theor. 68, 315–349 (1998)

Communicated by A. Kupiainen
Commun. Math. Phys. 215, 57 – 68 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Convergence to Equilibrium for Spin Glasses P. Mathieu1,2 1 CMI, Université de Provence, 39 Rue F. Joliot Curie, 13453 Marseille Cedex 13, France.
E-mail:
[email protected]
2 IME, Universidade de São Paulo, Rua do Matão 1010, Butantã, CEP: 05508-900, São Paulo, Brasil.
E-mail:
[email protected] Received: 4 November 1999 / Accepted: 15 May 2000
Abstract: We derive estimates on the thermalisation times for dynamical spin glasses at high temperature.
Introduction

Static properties of random spin systems – spin glasses – have been the object of many recent papers by both physicists and mathematicians. Although physicists have also made many attempts to describe the dynamics of such models, rigorous results are few. (See [3] for instance.) Still the claims of physicists suggest a great variety of interesting features: very large relaxation times, aging properties . . . (see [2]). The aim of the present paper is to discuss the convergence to equilibrium for quite general dynamical spin models. This set-up covers the examples of the Glauber or Metropolis dynamics for spin glasses such as Derrida’s Random Energy model and the Sherrington and Kirkpatrick model. The question we address is “how long does it take for the process to reach equilibrium?” We believe that one of the interesting features of spin glasses is that the time to reach equilibrium may depend a lot on the initial configuration. This is easily understood at a heuristic level: because of the fluctuations of the environment, a large system creates, with probability one, deep traps, i.e. initial configurations for which the thermalisation time will be large. Deep traps are not numerous, randomly located and should be considered as “pathological” in many examples. Still when started from one of these configurations, the relaxation to equilibrium is very slow. On the other hand we expect the system to equilibrate much quicker provided we make it start from a “typical” initial configuration. One way to do that is to choose as initial law the uniform probability on the state space. We will then define the equilibrium time as the time needed for the process to reach equilibrium when starting from the uniform law. (This is actually the point of view of many physicists, see [2] for instance.) We shall see that for spin glasses at high
temperature the thermalisation time starting from the uniform law is much shorter than the inverse spectral gap. This same issue can also be rephrased in terms of annealed vs quenched models. In the quenched situation – which amounts to choosing as initial configuration the worst one – one is led to very large relaxation times corresponding to the almost sure largest fluctuation in the environment. One possible way to estimate quenched relaxation times is to compute the inverse spectral gap of the dynamics. On the other hand, making the process start from the uniform law is almost equivalent to computing annealed convergence times, once we have assumed that the law of the disorder is translation invariant. In these terms, the main conclusion of this paper is that, at high temperature, the annealed convergence is much faster than the quenched convergence. In part III, we derive rigorous bounds on the relaxation to equilibrium of the process of the environment as seen from the particle that illustrate this fact. In fact, one could argue that the annealed relaxation time is the true physically meaningful quantity. Indeed, it was shown in [6] that the quenched relaxation time for the Metropolis dynamics of Derrida’s random energy model does not feel the static phase transition of the model. Later, in [11], we showed that the annealed relaxation time, for this same model, does present a discontinuity at the critical temperature. To complete this program, one needs a technique to estimate the distance to equilibrium which, on one hand, does not require too precise information on the geometry of the Hamiltonian – since this information is usually not available for spin glasses – and, on the other hand, takes into account the dependence on the initial law. This is not the case of the spectral gap, nor of the other standard tools from semi-group theory such as Log-Sobolev, Sobolev or Nash inequalities.
Our proof is actually based on a special class of functional inequalities that we call “generalized Poincaré inequalities” (GPI). GPI were introduced in our paper [9] to derive capacity estimates for a general Dirichlet form. They were later used in [12] to study diffusion processes, in [10] to study random walks in a random environment and in [11] to describe the dynamics of the REM. In this last paper, the estimates on the Poincaré constants are derived from geometrical arguments. We then heavily relied on our precise knowledge of the static properties of the REM as exposed in [7] and [13]. In particular the value of the ground energy is needed. Such information is not available anymore for other spin glasses such as the S-K model. In order to be able to use GPI in a general context, we shall rather use comparison arguments. These lead to estimates of the equilibrium time which depend only on the pressure and therefore can be explicitly computed in most interesting models. On the other hand, the bounds we obtain are not always sharp.

The paper is organized as follows: in Sect. 1, we describe our model and state the main result on the thermalization time starting with the uniform law. Section 2 is devoted to a self-contained exposition of generalized Poincaré inequalities and the comparison lemma we need. We conclude the proof of the theorem at the end of Sect. 2. Section 3 contains estimates of the law of the process of the environment seen from the particle. Finally note that our results are only interesting for those mean field models for which one expects thermalisation times to grow exponentially fast in N. In particular we do not consider the usual finite range spin models. Nor do we study spin systems in the so-called Griffiths phase.
1. The Model and Main Results

Let $S := S_N = \{-1, +1\}^N$ be the N-dimensional cube. All the quantities we are going to define depend on the dimension N, but we shall not indicate it explicitly. We are interested in the large N limit. Let η be the uniform probability measure on S. Let $\Omega := \Omega_N = \mathbb{R}^S$. By definition an environment is an element $\omega = (\omega(x))_{x \in S} \in \Omega$. ω will play the role of the Hamiltonian of the system. To each environment ω, we first associate a Gibbs measure $\pi^\omega$ defined by

$$ \pi^\omega(x) = \frac{e^{-\beta\omega(x)}}{Z^\omega(\beta)}, \tag{1.1} $$

where $Z^\omega(\beta) := \sum_{x \in S} \exp(-\beta\omega(x))$ is the partition function, and β > 0 is the inverse temperature. We shall consider different examples of dynamics on S, all of them admitting $\pi^\omega$ as a unique invariant and reversible probability measure. For each i = 1 . . . N, let i denote the element of S whose i-th coordinate is −1 and the other coordinates are +1. We call admissible an edge $(x, y) \in S^2$ such that there exists i ∈ [1, N] such that y = i.x. (a.b denotes the usual multiplication of the group S). Let A denote the set of admissible edges. Given a family of transition rates, $(k^\omega(x, y))_{x,y \in S}$, consider the Markov generator:

$$ L^\omega f(x) = \sum_{y \in S} k^\omega(x, y) (f(y) - f(x)). \tag{1.2} $$

We shall always assume that: $k^\omega(x, y) > 0$ iff the edge (x, y) is admissible. Besides we assume the detailed balance condition: $k^\omega(x, y)\pi^\omega(x) = k^\omega(y, x)\pi^\omega(y)$ for all x, y ∈ S. $L^\omega$ generates a continuous time Markov process that we denote by $(X^\omega_t, t \in \mathbb{R}_+)$, $(P^\omega_x, x \in S)$. From the detailed balance condition it follows that $\pi^\omega$ is invariant and reversible for $X^\omega$. Let $P^\omega_t f(x) = E^\omega_x[f(X^\omega_t)]$ be the semi-group generated by $L^\omega$.

The following special choices of $k^\omega$ will be used: when $k^\omega(x, y) = \exp(-\frac{\beta}{2}(\omega(y) - \omega(x)))$ for an admissible edge (x, y), we then call the dynamics generated by $L^\omega$ the Glauber dynamics. If $k^\omega(x, y) = \exp(-\beta(\omega(y) - \omega(x))^+)$, we then speak of the Metropolis dynamics. When $k^\omega(x, y) = \exp(\beta\omega(x))$, we call the dynamics the random hopping times dynamics (RHT). The names of “Glauber” and “Metropolis” are borrowed from the statistical physics literature. The RHT dynamics is easily described: when sitting at configuration x, the process waits for an exponential time of inverse mean N exp(βω(x)) and then chooses uniformly among its neighbors the new configuration.

Let $\mathcal{L}_\eta(X^\omega(t))$ be the law of the process $X^\omega$ at time t started with uniform initial law. Our aim is to estimate the decay of $d_{TV}(\mathcal{L}_\eta(X^\omega(t)), \pi^\omega)$, the distance in total variation between $\mathcal{L}_\eta(X^\omega(t))$ and the Gibbs measure $\pi^\omega$. For c > 0, let

$$ T^\omega(c) = \inf\Big\{ t > 0 \ \text{s.t.} \ \sup_{s \ge t}\ \sup_{f;\ \|f\|_\infty \le 1} \eta\big[ |P^\omega_s f - \pi^\omega(f)| \big] \le c \Big\}. \tag{1.3} $$

$T^\omega(c)$ is the time needed by the process to get at a distance shorter than c of equilibrium and stay close to equilibrium forever. Note in particular that for any time $s \ge T^\omega(c)$, we have $d_{TV}(\mathcal{L}_\eta(X^\omega(s)), \pi^\omega) \le c$.
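To make the three choices of rates concrete, the following Python sketch builds a small environment and checks the detailed balance condition numerically. It is an illustration added here, not code from the paper: the dimension N = 4, the Gaussian environment, the seed and all function names are assumptions, and a configuration is represented as a tuple of ±1 so that moving along an admissible edge amounts to flipping one coordinate.

```python
import itertools
import math
import random

def gibbs(omega, beta):
    """Gibbs measure (1.1) for an environment omega: dict configuration -> energy."""
    weights = {x: math.exp(-beta * energy) for x, energy in omega.items()}
    z = sum(weights.values())  # partition function Z(beta)
    return {x: w / z for x, w in weights.items()}

def flip(x, i):
    """Neighbour of x along an admissible edge: flip coordinate i."""
    return x[:i] + (-x[i],) + x[i + 1:]

def rate(kind, omega, beta, x, y):
    """Transition rate k(x, y) on an admissible edge for the three dynamics."""
    if kind == "glauber":
        return math.exp(-0.5 * beta * (omega[y] - omega[x]))
    if kind == "metropolis":
        return math.exp(-beta * max(omega[y] - omega[x], 0.0))
    if kind == "rht":
        return math.exp(beta * omega[x])
    raise ValueError(kind)

def max_detailed_balance_defect(kind, omega, beta):
    """Largest |pi(x) k(x,y) - pi(y) k(y,x)| over all admissible edges."""
    pi = gibbs(omega, beta)
    n = len(next(iter(omega)))
    return max(
        abs(pi[x] * rate(kind, omega, beta, x, flip(x, i))
            - pi[flip(x, i)] * rate(kind, omega, beta, flip(x, i), x))
        for x in omega for i in range(n)
    )

random.seed(0)
N = 4  # small illustrative dimension
omega = {x: math.sqrt(N) * random.gauss(0.0, 1.0)
         for x in itertools.product((-1, 1), repeat=N)}

for kind in ("glauber", "metropolis", "rht"):
    print(kind, max_detailed_balance_defect(kind, omega, beta=1.0))
```

In all three cases the product $\pi^\omega(x) k^\omega(x, y)$ is symmetric in x and y: it equals $e^{-\beta(\omega(x)+\omega(y))/2}/Z$ for Glauber, $e^{-\beta \max(\omega(x), \omega(y))}/Z$ for Metropolis and $1/Z$ for RHT, so the printed defects are pure rounding error.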
The following assumptions will be useful: we say that β satisfies the $(H^\omega)$ hypothesis if both following properties are satisfied:
– for some ε > 0 and all β′ ∈ [β, β + ε], the limit

$$ P^\omega(\beta') := \lim_{N \to +\infty} \frac{1}{N} \log \frac{Z^\omega(\beta')}{2^N} \tag{1.4} $$

exists. We then call $P^\omega(\beta')$ the pressure.
– the function $\beta' \to P^\omega(\beta')$ has a right derivative at point β, say $(P^\omega)'(\beta)$.

Theorem 1.1. Let c > 0. Assume that β satisfies the $(H^\omega)$ hypothesis.
(i) Consider the RHT dynamics. Then

$$ \limsup_{N \to +\infty} \frac{1}{N} \log T^\omega(c) \le \beta (P^\omega)'(\beta). \tag{1.5} $$

(ii) Consider the Glauber or the Metropolis dynamics and assume further that the pressure at −β exists (i.e. the limit (1.4) is defined for β′ = −β). Then

$$ \limsup_{N \to +\infty} \frac{1}{N} \log T^\omega(c) \le 2\beta (P^\omega)'(\beta) - P^\omega(\beta) + P^\omega(-\beta). \tag{1.6} $$

(iii) Consider the Glauber or Metropolis dynamics and assume further that ω(x) ≤ 0 for all x ∈ S. Then

$$ \limsup_{N \to +\infty} \frac{1}{N} \log T^\omega(c) \le \beta (P^\omega)'(\beta). \tag{1.7} $$
Examples. We discuss applications of Theorem 1.1 to two different examples of spin glasses: Derrida’s Random Energy model and the Sherrington and Kirkpatrick model. Some information about the dynamics of the REM is already available from [6] and [11], which allows us to test the sharpness of our bounds. The S-K model is more difficult to analyse. To our knowledge, the results quoted below are the only estimates of convergence times for this model. In both our examples, the pressure P ω (β) turns out not to depend on ω. We therefore simply denote it by P(β). √ Derrida’s Random Energy Model. Choose ω(x) = N Ex , where the family (Ex , x ∈ {−1, +1}N ) is a realization of a sequence of i.i.d. N (0, 1) random variables. The model thus defined is Derrida’s √ Random Energy Model. √ From [7] and [13] we √ then know that P(β) = β 2 /2 if β ≤ 2 log 2 and P(β) = β 2 log 2 − log 2 if β ≥ 2 log 2. Theorem 1.1 then implies that, for the RHT dynamics, 1 log T ω (c) ≤ β 2 if β ≤ 2 log 2, N→+∞ N 1 lim sup log T ω (c) ≤ β 2 log 2 if β ≥ 2 log 2. N→+∞ N lim sup
(1.8) (1.9)
Equation (1.9) could also be deduced from the asymptotics of the spectral gap.√Indeed, if λω denotes the spectral gap of the RHT dynamics, we have N1 log λω → −β 2 log 2
Convergence for Spin Glasses
61
for all $\beta$, as can be easily checked using the same arguments as in [6]. The bound (1.8) improves upon the asymptotics of $\lambda^\omega$.

Note that the REM has a static phase transition at critical temperature $\beta = \sqrt{2\log 2}$. According to the general ideology of statistical physics, a similar phase transition should occur for the dynamics. Still the asymptotics of the spectral gap do not feel the difference between the high and low temperature regimes. Our upper bounds do. The typical configurations for the Gibbs measure have energy close to $\omega(x) \sim -N\beta$ when $\beta \le \sqrt{2\log 2}$ (resp. $\omega(x) \sim -N\sqrt{2\log 2}$ when $\beta \ge \sqrt{2\log 2}$). When the process reaches such a typical configuration, it stays still for a time of order $\exp(N\beta^2)$ (resp. $\exp(N\beta\sqrt{2\log 2})$). Therefore, since we expect the process to visit more than just one typical configuration before reaching equilibrium, the bounds (1.8) and (1.9) are expected to be sharp. The bounds given by Theorem 1.1 for the Glauber and Metropolis dynamics are
\[ \limsup_{N\to+\infty} \frac{1}{N}\log T^\omega(c) \le 2\beta^2 \quad \text{if } \beta \le \sqrt{2\log 2}, \tag{1.10} \]
\[ \limsup_{N\to+\infty} \frac{1}{N}\log T^\omega(c) \le 2\beta\sqrt{2\log 2} \quad \text{if } \beta \ge \sqrt{2\log 2}, \tag{1.11} \]
i.e. twice the bounds in (1.8) and (1.9). In the case of the Metropolis dynamics, the asymptotics of the spectral gap given in [6] imply that (1.9) actually holds. Besides we proved in [11] that (1.8) also holds. Therefore (1.10) and (1.11) are off by a factor 2. This is due to the fact that our proof of Theorem 1.1 ignores the details of the geometry of $\omega$. Actually the main ingredient in the estimates of [6] and [11] is the remark that, for most neighbouring configurations $x$ and $y$, $\frac{1}{N}(\omega(x) \vee \omega(y)) \sim 0$. Therefore the transition rates for the Metropolis and RHT dynamics are similar and the equilibrium times should have the same asymptotics for both dynamics (and they do).

The Sherrington–Kirkpatrick Model. Let $\omega(x) = N^{-1/2}\sum_{i<j} J_{i,j}\,x_i x_j$, where the $(J_{i,j},\ 1 \le i < j \in \mathbb N)$ form a realization of i.i.d. $\mathcal N(0,1)$ random variables. The pressure for this model is known to be $P(\beta) = \beta^2/4$, when $\beta < 1$ (see [1] and [15]). Therefore we get from Theorem 1.1 that:
\[ \limsup_{N\to+\infty} \frac{1}{N}\log T^\omega(c) \le \beta^2/2 \quad \text{for the RHT dynamics,} \tag{1.12} \]
\[ \limsup_{N\to+\infty} \frac{1}{N}\log T^\omega(c) \le \beta^2 \quad \text{for the Glauber or Metropolis dynamics.} \tag{1.13} \]
As in the case of the REM, we expect (1.12) to be sharp but not (1.13). One can interpret (1.12) and (1.13) in terms of dynamical phase transitions through the following non-rigorous reasoning: although the true value of the ground energy of the S-K model is not known, it is easy to show that it is of order $-CN$, for some constant $C$. By analogy with the REM, it suggests that the spectral gap should be of order $\exp(-C\beta N)$, where the constant $C$ does not depend on $\beta$. Therefore we should have $\limsup \frac{1}{N}\log T^\omega(c) \le C\beta$. At low temperature, this estimate should be sharp since the Gibbs measure gets more and more concentrated on configurations of low energy as $\beta$ increases. We can therefore distinguish between two regimes: the high temperature regime, for which $\lim \frac{1}{N}\log T^\omega(c)$ is quadratic in $\beta$ (as suggested by (1.12) and (1.13)), and the low temperature regime, for which $\lim \frac{1}{N}\log T^\omega(c)$ is linear in $\beta$. I admit it requires a good dose of optimism to believe in these claims.
2. Generalized Poincaré Inequalities

In this section we introduce the technique used to prove Theorem 1.1. Since we have assumed the detailed balance condition for the transition rates $k^\omega$, the generator $L^\omega$ is a symmetric operator in $L^2(S,\pi^\omega)$. Let $P_t^\omega := \exp(tL^\omega)$ be the Markov semi-group generated by $L^\omega$ and let us denote by $\mathcal E^\omega$ the Dirichlet form associated to $L^\omega$ on $L^2(S,\pi^\omega)$. $\mathcal E^\omega$ is therefore a symmetric quadratic form and
\[ \mathcal E^\omega(f,f) := -\pi^\omega(f L^\omega f) = \frac{1}{2Z^\omega(\beta)}\sum_{e\in A} q^\omega(e)\,|d_e f|^2, \tag{2.1} \]
where, for the edge $e = (x,y) \in S^2$, we have set $q^\omega(e) = k^\omega(x,y)\,e^{-\beta\omega(x)}$ and $d_e f = f(x) - f(y)$. Note that $q^\omega(e)$ is non-zero iff $e$ is admissible.

Lemma 2.1 (Generalized Poincaré Inequalities). Let $p \in\, ]0,1[$ and define
\[ L^\omega(p) := \inf_{f \text{ s.t. } \pi^\omega(f)=0} \frac{\mathcal E^\omega(f,f)\,\|f\|_\infty^{(2-2p)/p}}{\eta[|f|]^{2/p}}. \tag{2.2} \]
Then, for any $t > 0$ and any function $f$ bounded by 1, we have
\[ \eta\big[|P_t^\omega f - \pi^\omega(f)|\big] \le (2e\,L^\omega(p)\,t)^{-p/2}. \tag{2.3} \]
Therefore, for any $c > 0$,
\[ \log T^\omega(c) \le \frac{2}{p}\log\frac{1}{c} + \log\frac{1}{L^\omega(p)}. \tag{2.4} \]
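Proof. Let $f$ be bounded by 1 and such that $\pi^\omega(f) = 0$. Then, for any $t > 0$, $\pi^\omega(P_t^\omega f) = 0$ and $\|P_t^\omega f\|_\infty \le 1$. Besides
\[ \begin{aligned} \mathcal E^\omega(P_t^\omega f, P_t^\omega f) &= -\pi^\omega\big[\{L^\omega P_t^\omega f\}\,P_t^\omega f\big] \\ &= \pi^\omega\big[\{\sqrt{-L^\omega}\,e^{tL^\omega} f\}^2\big] \\ &\le \frac{1}{2et}\,\pi^\omega[f^2] \\ &\le \frac{1}{2et} \end{aligned} \tag{2.5} \]
since $\sup_{x>0} x e^{-x} = 1/e$. By definition of $L^\omega(p)$, we therefore have:
\[ \eta\big[|P_t^\omega f|\big] \le L^\omega(p)^{-p/2}\,\mathcal E^\omega(P_t^\omega f, P_t^\omega f)^{p/2} \le (2e\,L^\omega(p)\,t)^{-p/2}. \tag{2.6} \]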
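The spectral estimate behind (2.5), namely that the Dirichlet form of $P_t f$ is at most $\pi[f^2]/(2et)$ for a mean-zero $f$, can be checked by hand on the smallest reversible chain. The sketch below uses a toy two-state chain with rates `a` and `b` (an assumption made purely for illustration; it is not one of the spin-glass dynamics), for which $P_t f$ is known in closed form.

```python
import math


def stationary(a, b):
    """Reversible probability of the chain on {0, 1} with k(0,1) = a, k(1,0) = b."""
    return (b / (a + b), a / (a + b))


def two_state_semigroup(a, b, f, t):
    """Return P_t f; the only non-zero eigenvalue of the generator is -(a + b)."""
    pi = stationary(a, b)
    mean = pi[0] * f[0] + pi[1] * f[1]
    decay = math.exp(-(a + b) * t)
    return tuple(mean + decay * (v - mean) for v in f)


def dirichlet_form(a, b, g):
    """E(g, g) = -pi(g L g), with (L g)(x) = sum_y k(x, y) (g(y) - g(x))."""
    pi = stationary(a, b)
    lg = (a * (g[1] - g[0]), b * (g[0] - g[1]))
    return -(pi[0] * g[0] * lg[0] + pi[1] * g[1] * lg[1])


def second_moment(a, b, g):
    """pi[g^2]."""
    pi = stationary(a, b)
    return pi[0] * g[0] ** 2 + pi[1] * g[1] ** 2


if __name__ == "__main__":
    a, b = 0.3, 1.1
    f = (a, -b)  # mean zero under the stationary law
    for t in (0.01, 0.1, 1.0, 5.0):
        lhs = dirichlet_form(a, b, two_state_semigroup(a, b, f, t))
        rhs = second_moment(a, b, f) / (2.0 * math.e * t)
        print(t, lhs, rhs)
```

In this toy case the inequality becomes an equality at $t = 1/(2(a+b))$, which is where $x e^{-x}$ attains its maximum $1/e$.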
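We now have to estimate the constant $L^\omega(p)$:

Lemma 2.2. Let $a \in\, ]0,(1-p)/p[$. Then
\[ \frac{1}{L^\omega(p)} \le 2^{1/p-1}\left(\frac{Z^\omega(\beta)}{2^N}\right)^{1-2/p}\left(\frac{Z^\omega\big(\beta(1+\frac{(1+a)p}{2-(1+a)p})\big)}{2^N}\right)^{2/p-(1+a)} \times \left(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\right)^{a}. \tag{2.7} \]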
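Proof. We prove (2.7) using comparison arguments. For a function $f$ on $S$, let $\mathrm{var}_\eta(f) := \eta[(f-\eta(f))^2]$ be the variance of $f$ w.r.t. the probability $\eta$. Let $\gamma \in\, ]0,1[$. We have
\[ \begin{aligned} \eta\big[|f-\pi^\omega(f)|\big] &\le \sum_{x,y\in S}\frac{1}{2^N}\frac{e^{-\beta\omega(y)}}{Z^\omega(\beta)}\,|f(x)-f(y)| \\ &\le 2^\gamma\|f\|_\infty^\gamma \sum_{x,y\in S}\frac{1}{2^N}\frac{e^{-\beta\omega(y)}}{Z^\omega(\beta)}\,|f(x)-f(y)|^{1-\gamma} \\ &\le 2^\gamma\|f\|_\infty^\gamma \Big(\sum_{x,y\in S}\frac{1}{2^N\cdot 2^N}(f(x)-f(y))^2\Big)^{(1-\gamma)/2} \times \Big(\sum_{x,y\in S}\frac{1}{2^N\cdot 2^N}\Big(\frac{2^N e^{-\beta\omega(y)}}{Z^\omega(\beta)}\Big)^{2/(1+\gamma)}\Big)^{(1+\gamma)/2} \\ &= 2^\gamma\|f\|_\infty^\gamma\,\mathrm{var}_\eta(f)^{(1-\gamma)/2}\,\frac{2^N}{Z^\omega(\beta)}\Big(\frac{Z^\omega(\beta\frac{2}{1+\gamma})}{2^N}\Big)^{(1+\gamma)/2}. \end{aligned} \tag{2.8} \]
(We used Hölder’s inequality between lines 2 and 3.) Let $\mathcal E(f,f) := (1/(2\cdot 2^N))\sum_{e\in A}|d_e f|^2$ be the Dirichlet form of the usual random walk on $S$. As in the proof of (2.8), we get that, for any $b \in\, ]0,2[$,
\[ \begin{aligned} \mathcal E(f,f) &\le \frac{1}{2\cdot 2^N}\,2^b\|f\|_\infty^b \sum_{e\in A}|d_e f|^{2-b} \\ &\le \frac{2^b}{2\cdot 2^N}\|f\|_\infty^b \Big(\sum_{e\in A} q^\omega(e)|d_e f|^2\Big)^{(2-b)/2}\Big(\sum_{e\in A}(q^\omega(e))^{-(2-b)/b}\Big)^{b/2} \\ &\le \frac{2^b}{2\cdot 2^N}\|f\|_\infty^b \big(2 Z^\omega(\beta)\,\mathcal E^\omega(f,f)\big)^{(2-b)/2}\Big(\sum_{e\in A}(q^\omega(e))^{-(2-b)/b}\Big)^{b/2} \\ &= 2^{b/2}\|f\|_\infty^b\,\big(\mathcal E^\omega(f,f)\big)^{(2-b)/2}\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{(2-b)/2}\Big(\frac{\sum_{e\in A}(q^\omega(e))^{-(2-b)/b}}{2^N}\Big)^{b/2}. \end{aligned} \tag{2.9} \]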
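We now combine (2.8) and (2.9) with the values $\gamma = 1-(1+a)p$ and $2-b = 2p/(1-\gamma)$ to get
\[ \begin{aligned} \mathcal E^\omega(f,f) &\ge 2^{-a}\|f\|_\infty^{-2a}\,\frac{2^N}{Z^\omega(\beta)}\Big(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\Big)^{-a}\big(\mathcal E(f,f)\big)^{1+a} \\ &\ge 2^{-a}\|f\|_\infty^{-2a}\,\frac{2^N}{Z^\omega(\beta)}\Big(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\Big)^{-a}\Big(\frac{\mathcal E(f,f)}{\mathrm{var}_\eta(f)}\Big)^{1+a} \\ &\qquad\times 2^{-2\gamma/p}\|f\|_\infty^{-2\gamma/p}\Big(\frac{Z^\omega(\beta\frac{2}{1+\gamma})}{2^N}\Big)^{-(1+\gamma)/p}\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{2/p}\,\eta\big[|f-\pi^\omega(f)|\big]^{2/p} \\ &= 2^{1-2/p}\|f\|_\infty^{2-2/p}\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{2/p-1}\Big(\frac{Z^\omega(\beta\frac{2}{1+\gamma})}{2^N}\Big)^{1+a-2/p} \\ &\qquad\times\Big(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\Big)^{-a}\Big(\frac{\mathcal E(f,f)}{\mathrm{var}_\eta(f)}\Big)^{1+a}\,\eta\big[|f-\pi^\omega(f)|\big]^{2/p}. \end{aligned} \tag{2.10} \]
The spectral gap of $\mathcal E$ in $L^2(S,\eta)$ is known to be 2, i.e. we always have $\mathcal E(f,f) \ge 2\,\mathrm{var}_\eta(f)$. Using this inequality in (2.10), we obtain:
\[ \mathcal E^\omega(f,f) \ge 2^{1-1/p}\|f\|_\infty^{2-2/p}\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{2/p-1}\Big(\frac{Z^\omega(\beta\frac{2}{1+\gamma})}{2^N}\Big)^{1+a-2/p} \times \Big(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\Big)^{-a}\,\eta\big[|f-\pi^\omega(f)|\big]^{2/p}. \tag{2.11} \]
Replacing $\gamma$ by its value in (2.11) leads to the claim of Lemma 2.2.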
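The spectral gap inequality $\mathcal E(f,f) \ge 2\,\mathrm{var}_\eta(f)$ used in the last step can be tested by brute force on a small hypercube. The sketch below assumes that $A$ is the set of ordered pairs of configurations differing by a single spin flip and that $\eta$ is the uniform probability on $S = \{-1,+1\}^N$; these conventions are inferred from the normalisation $1/(2\cdot 2^N)$ and are not restated in this section.

```python
import itertools
import random


def dirichlet_and_variance(f, n):
    """Return (E(f, f), var_eta(f)) for a function f on {-1, +1}^n.

    f is a dict mapping configurations (tuples) to real numbers.
    Edges are ordered pairs (s, t) that differ by one spin flip.
    """
    states = list(itertools.product((-1, 1), repeat=n))
    size = len(states)
    mean = sum(f[s] for s in states) / size
    variance = sum((f[s] - mean) ** 2 for s in states) / size
    total = 0.0
    for s in states:
        for i in range(n):
            t = s[:i] + (-s[i],) + s[i + 1:]
            total += (f[s] - f[t]) ** 2
    return total / (2.0 * size), variance


if __name__ == "__main__":
    n = 4
    rng = random.Random(0)
    f = {s: rng.gauss(0.0, 1.0) for s in itertools.product((-1, 1), repeat=n)}
    print(dirichlet_and_variance(f, n))
```

For a single coordinate function $f(x) = x_1$ the two sides coincide, so the constant 2 cannot be improved under these conventions.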
Proof of Theorem 1.1. In the case of the RHT dynamics, we have $q^\omega(e) = 1$, for any admissible edge. Therefore Lemma 2.2 implies that
\[ \log\frac{1}{L^\omega(p)} \le \Big(\frac{1}{p}-1\Big)\log 2 + \Big(1-\frac{2}{p}\Big)\log\frac{Z^\omega(\beta)}{2^N} + \Big(\frac{2}{p}-(1+a)\Big)\log\frac{Z^\omega\big(\beta(1+\frac{(1+a)p}{2-(1+a)p})\big)}{2^N} + a\log N. \tag{2.12} \]
Therefore, provided that $(1+a)p$ is small enough, we can pass to the limit to get that
\[ \limsup \frac{1}{N}\log\frac{1}{L^\omega(p)} \le \Big(1-\frac{2}{p}\Big)P^\omega(\beta) + \Big(\frac{2}{p}-(1+a)\Big)P^\omega\Big(\beta\Big(1+\frac{(1+a)p}{2-(1+a)p}\Big)\Big). \tag{2.13} \]
Letting $a$ tend to 0, and then $p$ tend to 0, we get
\[ \limsup \frac{1}{N}\log\frac{1}{L^\omega(p)} \le \beta\,{P^\omega}'(\beta). \tag{2.14} \]
Equation (1.5) follows from (2.14) and Lemma 2.1.

For the Glauber (resp. Metropolis) dynamics, we have $q^\omega(e) = \exp(-\frac{\beta}{2}(\omega(x)+\omega(y)))$ (resp. $q^\omega(e) = \exp(-\beta(\omega(x)\vee\omega(y)))$), where $e = (x,y)$. In both cases, we have $q^\omega(e)^{-1/a} \le e^{\beta\omega(x)/a} + e^{\beta\omega(y)/a}$. Therefore the last term in the upper-bound (2.7) is bounded by
\[ \Big(\frac{\sum_{e\in A}(q^\omega(e))^{-1/a}}{2^N}\Big)^{a} \le \Big(2N\,\frac{Z^\omega(-\beta/a)}{2^N}\Big)^{a}. \]
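Choosing $a = 1$ (and assuming that $p < 1/2$), we therefore obtain from (2.7):
\[ \limsup \frac{1}{N}\log\frac{1}{L^\omega(p)} \le \Big(1-\frac{2}{p}\Big)P^\omega(\beta) + \Big(\frac{2}{p}-2\Big)P^\omega\Big(\beta\Big(1+\frac{p}{1-p}\Big)\Big) + P^\omega(-\beta). \tag{2.15} \]
Letting $p$ tend to 0, we get
\[ \limsup \frac{1}{N}\log\frac{1}{L^\omega(p)} \le 2\beta\,{P^\omega}'(\beta) - P^\omega(\beta) + P^\omega(-\beta). \tag{2.16} \]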
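The passage from (2.15) to (2.16) is a first-order Taylor expansion in $p$. As a numerical illustration, the sketch below evaluates the right-hand side of (2.15) for small $p$, assuming the high-temperature S-K pressure $P(\beta) = \beta^2/4$ (used here on both sides of $\beta$ purely for illustration) and compares it with $2\beta P'(\beta) - P(\beta) + P(-\beta) = \beta^2$. The function names are hypothetical.

```python
def sk_pressure(beta: float) -> float:
    """Assumed pressure P(beta) = beta^2 / 4 (S-K model at high temperature)."""
    return beta * beta / 4.0


def bound_2_15(pressure, beta: float, p: float) -> float:
    """Right-hand side of (2.15) for a given p in ]0, 1/2[."""
    shifted = beta * (1.0 + p / (1.0 - p))
    return ((1.0 - 2.0 / p) * pressure(beta)
            + (2.0 / p - 2.0) * pressure(shifted)
            + pressure(-beta))


if __name__ == "__main__":
    beta = 0.5
    for p in (0.1, 0.01, 0.001, 0.0001):
        print(p, bound_2_15(sk_pressure, beta, p), beta * beta)
```

For this quadratic pressure the right-hand side of (2.15) decreases to $\beta^2$ as $p \to 0$, in agreement with (1.13).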
Under the assumption that $\omega$ is non positive, in both the Glauber and Metropolis cases, we have $q^\omega(e) \ge 1$. Therefore the last term in the upper-bound (2.7) is bounded by $N^a$. Hence (2.12) holds and one can conclude as above.
3. The Process of the Environment as Seen from the Particle

In this section, we shall only deal with spin glasses, i.e. we assume that some probability measure, $Q$, is given on $\Omega$. $Q$ is supposed to be translation invariant. We now define the process of the “environment as seen from the particle”: $S$ acts on $\Omega$ through the rule $x.\omega(y) \equiv \omega(x.y)$. (Remember that $a.b$ denotes the multiplication in $S$.) Let $X^\omega$ denote the RHT, Glauber or Metropolis dynamics associated to $\omega$. We define the process $\Omega_t := X_t^\omega.\omega$. $\Omega_t$ represents the environment translated according to the position of the particle at time $t$. Note in particular that, if we call 0 the configuration in $S$ whose coordinates are all $+1$, then $\Omega_t(0) = \omega(X_t^\omega)$ is the value of the Hamiltonian at the current position of the dynamics. $\Omega_t$ is a Markov process. Denoting by $E_x^\omega$ its law when $\Omega_0 = \omega$ and $X_0^\omega = x$, we then have, as $t$ tends to $+\infty$,
\[ E_x^\omega[F(\Omega_t)] \to \sum_{y\in S}\pi^\omega(y)\,F(y.\omega) \tag{3.1} \]
for any bounded function $F$. We are interested in estimating the speed of convergence in (3.1). For this purpose, we introduce the following relaxation time: for $c > 0$, let
\[ T(c) := \inf\Big\{ t > 0 \ \text{s.t.}\ \sup_{s\ge t}\,\sup_{|F|\le 1}\, Q\Big[\Big|E_x^\omega[F(\Omega_s)] - \sum_{y\in S}\pi^\omega(y)F(y.\omega)\Big|\Big] \le c \Big\}. \tag{3.2} \]
Due to the invariance by translation of $Q$, this quantity does not depend on $x$. Furthermore, from Lemma 2.1, we get that
\[ \begin{aligned} Q\Big[\Big|E_x^\omega[F(\Omega_t)] - \sum_{y\in S}\pi^\omega(y)F(y.\omega)\Big|\Big] &= \sum_{x\in S}\eta(x)\,Q\Big[\Big|E_x^\omega[F(\Omega_t)] - \sum_{y\in S}\pi^\omega(y)F(y.\omega)\Big|\Big] \\ &\le Q\big[(2e\,L^\omega(p)\,t)^{-p/2}\big]. \end{aligned} \tag{3.3} \]
At this point, all that remains to be done in order to estimate $T(c)$ is to replace in (3.3) $L^\omega(p)$ by the upper bound of (2.7) and pass to the limit as $N$ tends to $+\infty$. To justify this last step, we shall modify a little the definition of the pressure and make the ad-hoc assumptions. Let us say that the (H) hypothesis is satisfied if, for some $\varepsilon > 0$ and all $\beta' \in [\beta,\beta+\varepsilon]$, we have:
– The limit
\[ P(\beta') := \lim_{N\to+\infty}\frac{1}{N}\,Q\Big[\log\frac{Z^\omega(\beta')}{2^N}\Big] \tag{3.4} \]
exists.
– The function $\beta' \to P(\beta')$ has a right derivative at point $\beta$, say $P'(\beta)$.
– There exists a constant $K$ such that, for all $u > 0$,
\[ Q\Big[\big|\log Z^\omega(\beta') - Q[\log Z^\omega(\beta')]\big| \ge u\Big] \le e^{-\frac{u^2}{KN}}. \tag{3.5} \]
Note that, when $Q$ is a Gaussian measure, the assumption (3.5) usually easily follows from the concentration properties of Gaussian measures (see [8]).

Theorem 3.1. Let $c > 0$. Assume that $\beta$ satisfies the (H) hypothesis.
(i) Consider the RHT dynamics. Then
\[ \limsup_{N\to+\infty}\frac{1}{N}\log T(c) \le \beta P'(\beta). \tag{3.6} \]
(ii) Consider the Glauber or the Metropolis dynamics and assume further that the pressure at $-\beta$ exists (i.e. the limit (3.4) is defined for $\beta' = -\beta$ and (3.5) holds for $\beta' = -\beta$). Then
\[ \limsup_{N\to+\infty}\frac{1}{N}\log T(c) \le 2\beta P'(\beta) - P(\beta) + P(-\beta). \tag{3.7} \]
(iii) Consider the Glauber or Metropolis dynamics and assume further that, $Q$.a.s., $\omega(x) \le 0$ for all $x \in S$. Then
\[ \limsup_{N\to+\infty}\frac{1}{N}\log T(c) \le \beta P'(\beta). \tag{3.8} \]
Proof. The proof follows the same lines as for Theorem 1.1. To justify the passage to the large $N$ limit, one uses the following property: for all $k \in \mathbb R$, and all $\beta$ such that (3.5) holds, we then have:
\[ \frac{1}{N}\log Q\Big[\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{k}\Big] - \frac{k}{N}\,Q\Big[\log\frac{Z^\omega(\beta)}{2^N}\Big] \to 0. \tag{3.9} \]
The proof of (3.9) is straightforward: for $\varepsilon$ small enough, one has
\[ \begin{aligned} Q\Big[\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{k}\Big] &= k\int_0^\infty s^{k-1}\,ds\;Q\Big[\frac{Z^\omega(\beta)}{2^N} \ge s\Big] \\ &\le k\int_0^\infty s^{k-1}\,ds\;\mathbf 1\Big[Q\Big[\log\frac{Z^\omega(\beta)}{2^N}\Big] \ge (1-\varepsilon)\log s\Big] \\ &\qquad + k\int_0^\infty s^{k-1}\,ds\;Q\Big[\log Z^\omega(\beta) - Q[\log Z^\omega(\beta)] \ge \varepsilon\log s\Big] \\ &\le \exp\Big(\frac{k}{1-\varepsilon}\,Q\Big[\log\frac{Z^\omega(\beta)}{2^N}\Big]\Big) + \int_0^\infty k s^{k-1}\,ds\;e^{-\frac{\varepsilon^2}{KN}(\log s)^2}. \end{aligned} \]
Finally note that
\[ \frac{1}{N}\log\int_0^\infty k s^{k-1}\,ds\;e^{-\frac{\varepsilon^2}{KN}(\log s)^2} \to 0 \]
and let $\varepsilon$ tend to 0, to get that
\[ \limsup \frac{1}{N}\log Q\Big[\Big(\frac{Z^\omega(\beta)}{2^N}\Big)^{k}\Big] - \frac{k}{N}\,Q\Big[\log\frac{Z^\omega(\beta)}{2^N}\Big] \le 0. \]
The proof for the lim inf is identical. One can now replace $L^\omega(p)$ by the upper bound (2.7) in the expression (3.3), pass to the large $N$ limit using (3.9) and conclude the proof of Theorem 3.1 as for Theorem 1.1.

Example. The p-spins interaction model. Let $p \ge 2$ be an integer. Define
\[ \omega(x) := \Big(\frac{p!}{2N^{p-1}}\Big)^{1/2}\sum_{1\le i_1<i_2<\dots<i_p\le N} J(i_1,i_2,\dots,i_p)\,x_{i_1}\cdots x_{i_p}, \]
where the summation is over all possible choices of indices $1 \le i_1 < i_2 < \dots < i_p \le N$. The $J(i_1,i_2,\dots,i_p)$ are independent standard normal random variables. The special case $p = 2$ is the Sherrington–Kirkpatrick model. Let now $\varphi(t) := \frac{1}{2}[(1+t)\log(1+t) + (1-t)\log(1-t)]$ and assume that
\[ \beta^2 < \inf_{0\le t\le 1} 2(1+t^{-p})\,\varphi(t). \]
From [16], we then know that (H) is satisfied with $P(\beta) = \beta^2/4$. Therefore we get the following estimates for the relaxation time $T(c)$:
– for the RHT dynamics,
\[ \limsup_{N\to+\infty}\frac{1}{N}\log T(c) \le \beta^2/2, \]
– for the Glauber or the Metropolis dynamics,
\[ \limsup_{N\to+\infty}\frac{1}{N}\log T(c) \le \beta^2. \]
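The temperature range covered by this example is set by the infimum of $2(1+t^{-p})\varphi(t)$ over $0 \le t \le 1$. The sketch below approximates that infimum on a uniform grid (the grid size and function names are illustrative choices, and a grid minimum only approximates the true infimum). For $p = 2$ the infimum is approached as $t \to 0$ and equals 1, which matches the condition $\beta < 1$ quoted for the S-K model.

```python
import math


def phi(t: float) -> float:
    """phi(t) = ((1 + t) log(1 + t) + (1 - t) log(1 - t)) / 2 on ]0, 1]."""
    if t >= 1.0:
        return math.log(2.0)  # (1 - t) log(1 - t) tends to 0 as t tends to 1
    return 0.5 * ((1.0 + t) * math.log1p(t) + (1.0 - t) * math.log1p(-t))


def beta_squared_threshold(p: int, grid: int = 20000) -> float:
    """Grid approximation of inf over ]0, 1] of 2 (1 + t^(-p)) phi(t)."""
    return min(2.0 * (1.0 + (k / grid) ** (-p)) * phi(k / grid)
               for k in range(1, grid + 1))


if __name__ == "__main__":
    for p in (2, 3, 4):
        print(p, beta_squared_threshold(p))
```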
References
1. Aizenman, M., Lebowitz, J.L., Ruelle, D.: Some rigorous results on the Sherrington–Kirkpatrick model. Commun. Math. Phys. 112, 3–20 (1987)
2. Bouchaud, J.P., Cugliandolo, L.F., Kurchan, J., Mézard, M.: Out of equilibrium dynamics in spin glasses and other glassy systems. cond-mat/9702070 (1997)
3. Bovier, A., Picco, P. (eds): Mathematical aspects of spin glasses and neural networks. Progress in Probability, Vol. 41, Boston: Birkhäuser, 1998
4. Derrida, B.: Random Energy model: Limit of a family of disordered systems. Phys. Rev. Lett. 45, 79–82 (1980)
5. Derrida, B.: Random Energy model: An exactly solvable model of a spin glass. Phys. Rev. B 24, 2613–2626 (1981)
6. Fontes, L.R., Isopi, M., Kohayakawa, Y., Picco, P.: The spectral gap of the REM under Metropolis dynamics. Ann. Appl. Prob. 8, 917–943 (1998)
7. Galves, J.A., Martínez, S., Picco, P.: Fluctuations in Derrida’s Random Energy and Generalized Random Energy Models. J. Stat. Phys. 54, 515–529 (1989)
8. Ibragimov, A., Sudakov, V.N., Tsirelson, B.S.: Norms of Gaussian sample functions. Proc. Third Japan USSR Symposium on Probability theory, Lecture Notes in Math. 550, Berlin–Heidelberg–New York: Springer, 1976, pp. 20–41
9. Mathieu, P.: Hitting times and spectral gap inequalities. Annales de l’Institut H. Poincaré. Série Probabilités 33, no. 4, 437–465 (1997)
10. Mathieu, P.: Sur la convergence des marches aléatoires dans un milieu aléatoire et les inégalités de Poincaré généralisées. C.R.A.S. Série I, 329, 1015–1020 (1999)
11. Mathieu, P., Picco, P.: Convergence to equilibrium for the random energy model. Preprint (1999)
12. Mathieu, P., Veretennikov, A.Yu.: On semigroups, resolvents and hitting times for SDE in divergence form. Preprint (1999)
13. Olivieri, E., Picco, P.: On the Existence of Thermodynamics for the Random Energy Model. Commun. Math. Phys. 96, 125–144 (1984)
14. Saloff-Coste, L.: Lectures on finite Markov chains. Ecole d’été de St-Flour. Lecture Notes in Math. Berlin: Springer Verlag, (1997)
15. Talagrand, M.: The Sherrington–Kirkpatrick model: A challenge for mathematicians. Prob. Theo. Rel. Fields 110, 109–176 (1998)
16. Talagrand, M.: Rigorous low temperature results for the mean field p-spins interaction model.

Communicated by J. L. Lebowitz
Commun. Math. Phys. 215, 69 – 103 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Anderson Localization for the Holstein Model
G. Gentile1, V. Mastropietro2
1 Dipartimento di Matematica, Università di Roma Tre, 00146 Roma, Italy
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, 00139 Roma, Italy
Received: 20 October 1999 / Accepted: 18 May 2000
Abstract: A one-dimensional system of electrons on a lattice, interacting with a periodic potential, with period incommensurate with the lattice spacing and satisfying a Diophantine condition, is considered in the case of strong interaction. The Schwinger functions are computed and their asymptotic behaviour is studied, proving Anderson localization. The decay of the Schwinger functions is shown to depend critically on the value of the chemical potential. 1. Introduction 1.1. The experimental discovery of a class of anisotropic (one-dimensional) crystals exhibiting a periodic modulation of the atomic positions (called charge density wave), superimposed on the Bravais periodic lattice and such that the two periodicities are generally incommensurate (see for instance [AAR]), motivates the theoretical study of systems of electrons (describing the conduction electrons of such crystals) on a lattice subject to an incommensurate periodic potential. This problem is generally faced studying the discrete Schrödinger equation with an incommensurate potential or the Schrödinger equation with a quasi-periodic potential. Also mathematically a lot of effort was devoted to it; see for instance the review [PF] for references. The solutions of the Schrödinger equation were analysed by a perturbation theory (in the two opposite situations of small and large potential) which is afflicted by a small divisor problem. In particular if the potential is small the solutions are quasi-Bloch waves, while for large potentials they are exponentially localized. However the results available for the Schrödinger equation do not apply immediately to the physical situations described above, as they concern single-particle properties and not statistical many-particle ones. 
It is true that, neglecting the interaction between electrons, the many-electron wave-functions could be constructed starting from the single-electron ones, but the deduction of statistical observables like the correlation functions might not be straightforward.
A different approach was started in [BGM] studying directly in the thermodynamic limit the correlation functions of the one-dimensional Holstein model, describing a system of electrons on a lattice interacting with a periodic potential, in the case of weak interaction and period incommensurate with the lattice spacing. The analysis was performed by using the techniques and methods typical of the Renormalization Group (RG): the correlation functions are written as Grassmannian integrals expressed by a series plagued by a small denominator problem. The convergence was proved using techniques similar to the ones adopted for the convergence of the invariant tori of perturbed Hamiltonians in Classical Mechanics, see [G1,GM]. In fact a formulation of the KAM problem in terms of quantum field theory has been recently proposed, see [FT,G2,GGM,BGK]. The advantage of such a method is twofold. On the one hand one has directly the correlation functions of the system in the thermodynamic limit, which can be compared in principle with the experiments; on the other hand the above techniques can also take into account the two-body interaction between electrons, which is important to make the model more realistic. This was done in [M], by considering the Holstein–Hubbard model (similar to the Holstein model but in which the electron-electron interaction is added). The non-analytic dependence of the correlation functions on the amplitude of the electron-electron interaction (due to the Luttinger liquid nature of d = 1 interacting fermions) found in [M] shows that it is not possible to study this model by an analytic continuation from the non-interacting limit, which means that the results found for the Schrödinger equation cannot be used directly in this problem. In this work we continue our study of the Holstein model considering the opposite limit in which the interaction with the incommensurate potential is very large (near to the anti-integrable limit).
The analysis is performed perturbatively by considering as the “free Hamiltonian” the interaction of the fermions with the external potential, while the hopping part is treated as a perturbation. We write the Schwinger functions (defined in the next section) as functional integrals which will be proved to admit a well defined thermodynamic limit. A large distance exponential decay of the Schwinger functions is also obtained. Our method would allow us in principle to consider also the interaction between electrons.

1.2. The Hamiltonian of the Holstein model is given by
\[ H = \sum_{x,y\in\Lambda} t_{x,y}\,\psi_x^+\psi_y^- + \lambda(\mu+\nu)\sum_{x\in\Lambda}\psi_x^+\psi_x^- - \lambda\sum_{x\in\Lambda}\varphi_x\,\psi_x^+\psi_x^-, \tag{1.1} \]
where $x, y$ are points on the one-dimensional lattice $\Lambda$ with unit spacing and length $L$; we shall identify $\Lambda$ with $\{x \in \mathbb Z : -[L/2] \le x \le [(L-1)/2]\}$. Moreover the matrix $t_{x,y}$ is defined as $t_{x,y} = \delta_{x,y} - (1/2)[\delta_{x,y+1} + \delta_{x,y-1}]$, where $\delta_{x,y}$ is the Kronecker delta: the term in (1.1) containing the matrix $t_{x,y}$ (the only surviving one for $\lambda = 0$) is usually referred to as hopping term. The fields $\psi_x^\pm$ are creation (+) and annihilation (−) fermionic fields. The potential $\varphi_x$ is a smooth function physically representing the interaction with the phonon field. In (1.1) $\lambda$ is the interaction strength, $\lambda\mu$ is the chemical potential and $\nu$ is a counterterm to be fixed in a proper way; we will discuss its physical meaning in Sect. 2.11 below. We shall consider potentials $\varphi_x$ which are of the form $\varphi_x = \bar\varphi(\omega x)$, where $\bar\varphi$ is a real function on the real line periodic of period 1 and $\omega$, the rotation number, is an irrational number (satisfying some Diophantine condition; see the statement of Theorem 1.4 below).
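A Diophantine condition of the form $\|\omega n\|_{\mathbb T} \ge C_0|n|^{-\tau}$ for all $n \ne 0$, which is the type of condition required in Theorem 1.4, can be probed numerically for a given rotation number. The sketch below is only a finite check up to a cutoff `n_max` (so it cannot prove the condition), and the choice of the golden mean as $\omega$ is an illustrative assumption, not a value taken from the paper.

```python
import math


def torus_distance(t: float) -> float:
    """Distance from t to the nearest integer, i.e. the norm on T = R/Z."""
    frac = t - math.floor(t)
    return min(frac, 1.0 - frac)


def diophantine_constant(omega: float, tau: float, n_max: int) -> float:
    """Largest C0 with ||omega n|| >= C0 |n|^(-tau) for all 1 <= |n| <= n_max.

    The norm is even in n, so only positive n need to be scanned.
    """
    return min(torus_distance(omega * n) * n ** tau for n in range(1, n_max + 1))


if __name__ == "__main__":
    golden_mean = (math.sqrt(5.0) - 1.0) / 2.0
    print(diophantine_constant(golden_mean, 1.0, 5000))
```

Since $|n|^{-\tau} \le |n|^{-1}$ for $\tau \ge 1$, a constant found with exponent 1 also works for any larger $\tau$ over the scanned range.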
We are interested in the behaviour of the Schwinger functions of the model (1.1) for large $|\lambda|$; then it is more natural to consider the Hamiltonian obtained from (1.1) by dividing it by $\lambda$:
\[ \begin{aligned} H &= H_0 + V, \qquad H_0 = \sum_{x\in\Lambda}(\mu-\varphi_x)\,\psi_x^+\psi_x^-, \\ V &= -\frac{\varepsilon}{2}\sum_{x\in\Lambda}\big[\psi_x^+\psi_{x+1}^- + \psi_x^+\psi_{x-1}^- - 2\psi_x^+\psi_x^-\big] + \nu\sum_{x\in\Lambda}\psi_x^+\psi_x^-, \end{aligned} \tag{1.2} \]
where $\varepsilon \equiv 1/\lambda$ can be considered a small parameter. Given a Hamiltonian $H$, the finite-temperature two-point Schwinger function (two-point Matsubara function, [NO]) for $H$ is defined as
\[ S_{L,\beta}(\mathbf x,\mathbf y) = \frac{\mathrm{Tr}\,\big[e^{-\beta H}\,\mathbf T(\psi_{\mathbf x}^-\psi_{\mathbf y}^+)\big]}{\mathrm{Tr}\,e^{-\beta H}}, \tag{1.3} \]
where $\psi_{\mathbf x}^\pm = e^{x_0 H}\psi_x^\pm e^{-x_0 H}$, with $\mathbf x = (x,x_0)$, $-\beta/2 \le x_0 \le \beta/2$ for some $\beta > 0$, and $\mathbf T$ is the time-ordering operator, [NO]. We shall impose free boundary conditions on $x$ and antiperiodic boundary conditions on $x_0$, [NO]. Note that the free boundary conditions on $x$ allow potentials of the considered form even at finite $L$.

1.3. We denote by $\mathbb T = \mathbb R/\mathbb Z$ the one-dimensional torus and by $\|\cdot\|_{\mathbb T}$ the distance on $\mathbb T$; given a $C^s$ function $f(t)$, $\partial^s f(t)$ denotes its $s$-th derivative with respect to $t$, i.e. $\partial^s f(t) \equiv d^s f(t)/dt^s$; in particular we shall set $f'(t) = \partial f(t)$. We shall prove the following theorem. The potential $\varphi_x$ in (1.1) will be chosen to be an even $C^1$ function, essentially for expository clearness; however we stress from now that what really matters is the form of the potential $\varphi_x$ near the critical points of $\varphi_x - \mu$. Extensions will be discussed in Sect. 4.

Theorem 1.4. Let $\varphi_x = \bar\varphi(\omega x)$ be an even function in $C^1(\mathbb T)$, i.e. $\varphi_x = \varphi_{-x}$, $\bar\varphi(x) = \bar\varphi(x+1)$, and with $\omega$ verifying a Diophantine condition
\[ \|\omega n\|_{\mathbb T} \ge C_0|n|^{-\tau}, \qquad \forall n \in \mathbb Z\setminus\{0\}, \tag{1.4} \]
for some constants $\tau > 1$ and $C_0 > 0$. Let us define $\bar\omega \equiv \omega\bar x$ such that $\mu = \bar\varphi(\bar\omega)$ and assume that there is only one $\bar x \in (0,1/2)$ satisfying such a condition and that $\bar\varphi'(\bar\omega) \ne 0$. Then there exists $\varepsilon_0 > 0$, depending on $\omega$ and $\bar\omega$, and, for $|\varepsilon| < \varepsilon_0$, a function $\nu \equiv \nu(\varepsilon) = 0$, such that
(1) if $\bar\omega \notin \omega\mathbb Z$ mod 1 and the additional Diophantine condition
\[ \|\omega n \pm 2\bar\omega\|_{\mathbb T} \ge C_0|n|^{-\tau}, \qquad \forall n \in \mathbb Z\setminus\{0\}, \]
is verified, then the two-point Schwinger function $S_{L,\beta}(\mathbf x,\mathbf y)$ admits a limit
\[ \lim_{\beta\to\infty}\lim_{L\to\infty} S_{L,\beta}(\mathbf x,\mathbf y) = S(\mathbf x,\mathbf y) \tag{1.5} \]
bounded by
\[ |S(\mathbf x,\mathbf y)| \le \log\big(1+\min\{|x|,|y|\}\big)^{\tau}\; \frac{C_N\,\exp\{-4^{-1}|x-y|\log\varepsilon^{-1}\}}{1+\big[(1+\min\{|x|,|y|\})^{-\tau}\,|x_0-y_0|\big]^{N}}, \tag{1.6} \]
for any $N \ge 1$ and for some constant $C_N$ depending on $N$;
(2) if $2\bar\omega = (2k+1)\omega$ mod 1, $k \in \mathbb Z$, then, for $\alpha = 2(k+1)$ and for some constant $C_N$ depending on $N$,
\[ |S(\mathbf x,\mathbf y)| \le \log\Upsilon(x,y,\sigma)\; \frac{C_N\,\exp\{-(4\alpha)^{-1}|x-y|\log\varepsilon^{-1}\}}{1+\big[\Upsilon(x,y,\sigma)\,|x_0-y_0|\big]^{N}}, \tag{1.7} \]
with
\[ \Upsilon(x,y,\sigma) = \max\{(1+\min\{|x|,|y|\})^{-\tau},\,\sigma\} \quad\text{and}\quad 0 \le \sigma \le C\,|\varepsilon|^{\eta(k)}, \tag{1.8} \]
where
\[ \eta(k) = \begin{cases} (2k+1)/4, & k > 1, \\ 1, & k = 1, \end{cases} \tag{1.9} \]
for some constant $C$.

1.5. Note that both the conditions (1) and (2) in the theorem above exclude the possibility that $\bar x$ is a point of the lattice, so avoiding divergences in the free propagator (see (2.3) below). We see that the two-point Schwinger function decays exponentially for large spatial distances with rate $O(|\log|\lambda||^{-1})$, for large $\lambda$. This is a consequence of the Anderson localization of the solutions of the Schrödinger equation with a large incommensurate potential, see [PF]. However the decay for large values of $|x_0-y_0|$ is different in the two cases, corresponding to two choices of the chemical potential (hence of the fermion density). This is due to the presence of a (possibly vanishing) gap in the ground state energy of $H$ in correspondence with the choice of the chemical potential made in case (2) of the theorem. We are not able to prove that in general these gaps are nonvanishing and we find only an upper bound for them, except in the case $\bar\omega \equiv \omega/2$ (in which a bound from below is possible).

1.6. The above theorem will be proven in Sect. 2 and Sect. 3, referring to the Appendices for the technical aspects. In Sect. 4 we shall deal with the problem of extending the results to more general potentials: this will lead to Theorem 4.6, whose physical relevance will be discussed in Sect. 4.8. Also a comparison will be presented therein with the existing literature about the Schrödinger equation, in particular with the results in [S,E].

2. Anomalous Integration and Effective Potential

2.1. For $H \equiv H_0$, the two-point Schwinger function (1.3) is given by
\[ g_{L,\beta}(\mathbf x,\mathbf y) = \delta_{x,y}\,\frac{e^{-(\mu-\varphi_x)(x_0-y_0)}\big[\theta(x_0-y_0) - e^{-\beta(\mu-\varphi_x)}\,\theta(y_0-x_0)\big]}{1+e^{-\beta(\mu-\varphi_x)}}. \tag{2.1} \]
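If $g_{L,\beta}(\mathbf x,\mathbf y) \equiv g_{L,\beta}(x,y;\tau)$, with $\tau = x_0-y_0$, then, for $-\beta \le \tau \le 0$, one has $g_{L,\beta}(x,y;\tau+\beta) = -g_{L,\beta}(x,y;\tau)$. Therefore we can write
\[ g_{L,\beta}(x,y;\tau) = \frac{1}{\beta}\sum_{k_0\in D_\beta} e^{-ik_0\tau}\,\hat g_{L,\beta}(x,y;k_0), \tag{2.2} \]
where $D_\beta = \{k_0 = (2n+1)\pi\beta^{-1},\ n \in \mathbb Z\}$ and
\[ \hat g_{L,\beta}(x,y;k_0) = \int_{-\beta/2}^{\beta/2} d\tau\; e^{ik_0\tau}\,g_{L,\beta}(x,y;\tau) = \delta_{x,y}\,\hat g(x,k_0) \equiv \frac{\delta_{x,y}}{-ik_0-\varphi_x+\mu}. \tag{2.3} \]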
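The antiperiodicity in $\tau$ and the closed form of the Fourier coefficient in (2.3) can be checked numerically for a single site. The sketch below assumes the free propagator has the form $g(\tau) = e^{-E\tau}[\theta(\tau) - e^{-\beta E}\theta(-\tau)]/(1+e^{-\beta E})$ with $E = \mu - \varphi_x$ (the bracketing is this sketch's reading of (2.1)); the midpoint-rule integration and all names are illustrative choices.

```python
import cmath
import math


def free_propagator(tau: float, energy: float, beta: float) -> float:
    """Single-site free propagator g(tau) for -beta < tau < beta, tau != 0.

    `energy` stands for mu - phi_x.
    """
    weight = math.exp(-beta * energy)
    if tau > 0:
        return math.exp(-energy * tau) / (1.0 + weight)
    return -math.exp(-energy * tau) * weight / (1.0 + weight)


def fourier_coefficient(k0: float, energy: float, beta: float, steps: int = 20000) -> complex:
    """Midpoint-rule value of the integral of e^{i k0 tau} g(tau) over [-beta/2, beta/2].

    `steps` is even, so tau = 0 is a cell boundary and no midpoint hits the jump.
    """
    h = beta / steps
    total = 0j
    for j in range(steps):
        tau = -beta / 2.0 + (j + 0.5) * h
        total += cmath.exp(1j * k0 * tau) * free_propagator(tau, energy, beta)
    return total * h


if __name__ == "__main__":
    beta, energy = 2.0, 0.7
    k0 = 3.0 * math.pi / beta          # an element of D_beta (n = 1)
    print(fourier_coefficient(k0, energy, beta), 1.0 / (-1j * k0 + energy))
```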
Let us introduce a cut-off $M$ so that $k_0 = 2(n+1/2)\pi/\beta$, $n \in \mathbb Z$, $-M \le n \le M-1$. The Schwinger function (1.3) can be written as a power series in $\varepsilon$, convergent for $|\varepsilon| \le \varepsilon_\beta$, for some constant $\varepsilon_\beta$ (the only trivial bound of $\varepsilon_\beta$ goes to zero, as $\beta \to \infty$). This power expansion can be constructed in the usual way, [NO], in terms of Feynman graphs (in this case only chains, since the interaction is quadratic in the fields), by using as free propagator the function (2.2): in the following we shall look for a different expansion which will turn out to be more suitable to find a nontrivial bound for $\varepsilon_\beta$.

2.2. We introduce a finite set of Grassmannian variables $\psi^\pm_{x,k_0}$ (one for each of the allowed values for $x \in \Lambda$ and $k_0 \in D_\beta$, provided that the ultraviolet cut-off $M$ has been introduced and $L$ is kept finite) and a linear functional $P(d\psi)$ on the generated Grassmannian algebra, such that
\[ \int P(d\psi)\,\psi^-_{x,k_0}\psi^+_{y,k_0'} = \frac{\beta\,\delta_{x,y}\,\delta_{k_0,k_0'}}{-ik_0-\varphi_x+\mu} \equiv \beta\,\delta_{x,y}\,\delta_{k_0,k_0'}\,\hat g(x,k_0). \tag{2.4} \]
The integration $P(d\psi)$ has a simple representation in terms of the Grassmannian integration $d\psi^-d\psi^+$, defined as the linear functional on the Grassmannian algebra, such that, given a monomial $Q(\psi^-,\psi^+)$ in the variables $\psi^-_{x,k_0}$, $\psi^+_{x,k_0}$,
\[ \int d\psi^-d\psi^+\,Q(\psi^-,\psi^+) = \begin{cases} 1 & \text{if } Q(\psi^-,\psi^+) = \prod_{x,k_0}\psi^-_{x,k_0}\psi^+_{x,k_0}, \\ 0 & \text{otherwise.} \end{cases} \tag{2.5} \]
We have
\[ P(d\psi) = d\psi^-d\psi^+\,\Big\{\prod_{x,k_0}\big(\beta\,\hat g(x,k_0)\big)\Big\}\exp\Big\{-\sum_{x,k_0}\big(\beta\,\hat g(x,k_0)\big)^{-1}\psi^+_{x,k_0}\psi^-_{x,k_0}\Big\}. \tag{2.6} \]
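For a single pair of Grassmann variables, the normalisation of the measure (2.6) and the propagator (2.4) can be verified with a few lines of exterior-algebra arithmetic. The sketch below is a minimal toy implementation written for this purpose (the dictionary representation and all names are assumptions of the sketch, not notation from the paper): generator 0 plays the role of $\psi^-$, generator 1 the role of $\psi^+$, and the Berezin integral $\int d\psi^-d\psi^+$ reads off the coefficient of $\psi^-\psi^+$, as in (2.5).

```python
def gmul(a, b):
    """Product in a Grassmann algebra; elements are {sorted generator tuple: coefficient}."""
    out = {}
    for mono_a, coef_a in a.items():
        for mono_b, coef_b in b.items():
            if set(mono_a) & set(mono_b):
                continue  # a repeated generator gives zero
            merged = list(mono_a + mono_b)
            sign = 1
            # Bubble sort; every transposition of two generators flips the sign.
            for i in range(len(merged)):
                for j in range(len(merged) - 1 - i):
                    if merged[j] > merged[j + 1]:
                        merged[j], merged[j + 1] = merged[j + 1], merged[j]
                        sign = -sign
            key = tuple(merged)
            out[key] = out.get(key, 0) + sign * coef_a * coef_b
    return out


PSI_MINUS = {(0,): 1}   # generator 0 stands for psi^-
PSI_PLUS = {(1,): 1}    # generator 1 stands for psi^+


def berezin(q):
    """Integral d psi^- d psi^+ of q: the coefficient of psi^- psi^+."""
    return q.get((0, 1), 0)


def gaussian_density(beta, g_hat):
    """(beta g_hat) exp(-(beta g_hat)^{-1} psi^+ psi^-): the one-mode factor of (2.6)."""
    c = 1.0 / (beta * g_hat)
    quad = gmul(PSI_PLUS, PSI_MINUS)          # psi^+ psi^-
    expo = {(): 1.0}
    for mono, coef in quad.items():           # exp(-c q) = 1 - c q because q * q = 0
        expo[mono] = expo.get(mono, 0) - c * coef
    return {mono: beta * g_hat * coef for mono, coef in expo.items()}


if __name__ == "__main__":
    beta, k0, energy = 2.0, 1.5707963267948966, 0.7
    g_hat = 1.0 / (-1j * k0 + energy)
    density = gaussian_density(beta, g_hat)
    print(berezin(density))                                         # total mass
    print(berezin(gmul(density, gmul(PSI_MINUS, PSI_PLUS))), beta * g_hat)
```

The truncation of the exponential after the linear term is exactly the identity $e^{-z\psi^+\psi^-} = 1 - z\psi^+\psi^-$ noted in the text.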
Note that, as $(\psi^+_{x,k_0})^2 = (\psi^-_{x,k_0})^2 = 0$, then $e^{-z\psi^+_{x,k_0}\psi^-_{x,k_0}} = 1 - z\psi^+_{x,k_0}\psi^-_{x,k_0}$, for any complex $z$. By using standard arguments (see, for example, [NO], where a different regularization of the propagator is used), one can show that the Schwinger functions can be calculated as “expectations” of suitable functions of the Grassmannian variables with respect to the “measure” (2.6). In particular, the two-point Schwinger function, which in our case
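determines the other Schwinger functions through the Wick rule (as the Hamiltonian is quadratic in the fermionic fields), can be written as
\[ \begin{aligned} S_{L,\beta}(\mathbf x,\mathbf y) &= \frac{1}{\beta}\sum_{k_0\in D_\beta} e^{-ik_0(x_0-y_0)}\,\hat S_{L,\beta}(x,y;k_0), \\ \hat S_{L,\beta}(x,y;k_0) &= \lim_{M\to\infty}\frac{\int P(d\psi)\,e^{\mathcal V(\psi)}\,\psi^-_{x,k_0}\psi^+_{y,k_0}}{\int P(d\psi)\,e^{\mathcal V(\psi)}}, \end{aligned} \tag{2.7} \]
with
\[ \mathcal V(\psi) = \frac{\varepsilon}{2}\,\frac{1}{\beta}\sum_{x\in\Lambda}\sum_{k_0\in D_\beta}\big[\psi^+_{x,k_0}\psi^-_{x+1,k_0} + \psi^+_{x,k_0}\psi^-_{x-1,k_0}\big] - \nu_0\,\frac{1}{\beta}\sum_{x\in\Lambda}\sum_{k_0\in D_\beta}\psi^+_{x,k_0}\psi^-_{x,k_0}, \tag{2.8} \]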
where $\nu_0 = \nu + \varepsilon$.

Remark 2.3. The ultraviolet cut-off $M$ on the $k_0$ variable was introduced in order to give a precise meaning to the Grassmannian integration (so that $\Lambda$ and $D_\beta$ become indeed finite sets, hence the numerator and the denominator in (2.7) are finite sums), but it does not play any essential rôle in this paper, as all bounds will be uniform with respect to $M$ and they easily imply the existence of the limit. Hence, we shall not stress anymore the dependence on $M$ of the various quantities we shall study.

2.4. For the Grassmannian integration (2.6), we can write $P(d\psi) = \prod_{h=h_\beta}^{1} P(d\psi^{(h)})$ for some $h_\beta \le 1$ to be fixed later (after (2.20) below). This can be done by setting
\[ \psi^\pm_{x,k_0} = \sum_{h=h_\beta}^{1}\psi^{(h)\pm}_{x,k_0}, \qquad \hat g(x,k_0) = \sum_{h=h_\beta}^{1}\hat g^{(h)}(x,k_0), \tag{2.9} \]
where $\psi^{(h)\pm}_{x,k_0}$ are families of Grassmann fields with propagators $\hat g^{(h)}(x,k_0)$ which are defined in the following way. Note that, for small $\omega x'$ (mod 1),
\[ \varphi_{x'+\rho\bar x} - \mu = \rho v_0\,\omega x' + \Delta^\rho_{x'}, \qquad v_0 = \partial\bar\varphi(\omega\bar x), \qquad \rho = \pm 1, \tag{2.10} \]
with $\Delta^\rho_{x'} = o(\|\omega x'\|_{\mathbb T})$; the parity assumption on $\varphi_x$ implies $\Delta^1_{x'} = \Delta^{-1}_{-x'}$. Set $\mathbf r = (x,k_0)$ and $\bar{\mathbf x} = (\bar x,0)$. Given $\mathbf r \in \Lambda\times D_\beta$ define $\mathbf r' = (x',k_0)$, where $x' = x - \rho\bar x$, with $\rho = \mathrm{sign}(\omega x)$, and set $\Lambda' = \{x' : x \in \Lambda \text{ such that } x = \rho\bar x + x' \text{ with } \rho = \mathrm{sign}(\omega x)\}$. We introduce a scaling parameter $\gamma > 1$ and a function $\chi(x',k_0) \in C^\infty(\mathbb T^1\times\mathbb R)$ such that, if $\|\mathbf r'\|^2 \equiv k_0^2 + v_0^2\|\omega x'\|^2_{\mathbb T}$, then
\[ \chi(\mathbf r') = \chi(-\mathbf r') = \begin{cases} 1 & \text{if } \|\mathbf r'\| < t_0 \equiv a_0/\gamma, \\ 0 & \text{if } \|\mathbf r'\| > a_0, \end{cases} \tag{2.11} \]
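where $a_0$ is such that the supports of $\chi(\mathbf r - \bar{\mathbf x})$ and $\chi(\mathbf r + \bar{\mathbf x})$ are disjoint. Then define
\[ \hat f_1(\mathbf r) = 1 - \chi(\mathbf r - \bar{\mathbf x}) - \chi(\mathbf r + \bar{\mathbf x}) \tag{2.12} \]
and, for any integer $h \le 0$,
\[ f_h(\mathbf r') = \chi(\gamma^{-h}\mathbf r') - \chi(\gamma^{-h+1}\mathbf r'); \tag{2.13} \]
then, for any $\bar h < 0$, we have
\[ \chi(\mathbf r') = \sum_{h=\bar h+1}^{0} f_h(\mathbf r') + \chi(\gamma^{-\bar h}\mathbf r'). \tag{2.14} \]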
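The telescoping identity (2.14) and the support of each $f_h$ can be illustrated with an explicit cut-off. The sketch below works only with the radial variable $\|\mathbf r'\|$ and uses one particular $C^\infty$ interpolation between $t_0 = a_0/\gamma$ and $a_0$; this choice of $\chi$ and all names are assumptions of the sketch, since the text only prescribes the values of $\chi$ outside the transition region.

```python
import math


def smooth_step(s: float) -> float:
    """C-infinity function equal to 0 for s <= 0 and to 1 for s >= 1."""
    def bump(u: float) -> float:
        return math.exp(-1.0 / u) if u > 0 else 0.0
    return bump(s) / (bump(s) + bump(1.0 - s))


def chi(radius: float, a0: float, gamma: float) -> float:
    """Cut-off equal to 1 for radius < a0 / gamma and to 0 for radius > a0."""
    t0 = a0 / gamma
    return 1.0 - smooth_step((radius - t0) / (a0 - t0))


def f_scale(h: int, radius: float, a0: float, gamma: float) -> float:
    """f_h = chi(gamma^{-h} r) - chi(gamma^{-h+1} r), for h <= 0, as in (2.13)."""
    return chi(gamma ** (-h) * radius, a0, gamma) - chi(gamma ** (-h + 1) * radius, a0, gamma)


if __name__ == "__main__":
    a0, gamma, h_bar = 1.0, 2.0, -6
    for radius in (0.001, 0.01, 0.1, 0.3, 0.6, 0.9, 1.5):
        scales = sum(f_scale(h, radius, a0, gamma) for h in range(h_bar + 1, 1))
        remainder = chi(gamma ** (-h_bar) * radius, a0, gamma)
        print(radius, chi(radius, a0, gamma), scales + remainder)
```

Each $f_h$ is supported in the shell $t_0\gamma^{h-1} \le \|\mathbf r'\| \le t_0\gamma^{h+1}$, so at most two consecutive scales contribute at any given radius.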
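Note that, if $h \le 0$, $f_h(\mathbf r') = 0$ for $\|\mathbf r'\| < t_0\gamma^{h-1}$ or $\|\mathbf r'\| > t_0\gamma^{h+1}$. We finally define, for any $h \le 0$,
\[ \hat f_h(\mathbf r) = f_h(\mathbf r - \bar{\mathbf x}) + f_h(\mathbf r + \bar{\mathbf x}), \tag{2.15} \]
and, for any $h \le 1$,
\[ \hat g^{(h)}(\mathbf r) \equiv \frac{\hat f_h(\mathbf r)}{-ik_0-\varphi_x+\mu}. \tag{2.16} \]
The definition (2.15) also implies that, if $h \le 0$, the support of $\hat f_h(\mathbf r)$ is the union of two disjoint sets, $I_h^+$ and $I_h^-$. In $I_h^+$, $\omega x$ (mod 1) is strictly positive and $\|\omega(x-\bar x)\|_{\mathbb T} \le a_0\gamma^h/|v_0|$, while, in $I_h^-$, $\omega x$ (mod 1) is strictly negative and $\|\omega(x+\bar x)\|_{\mathbb T} \le a_0\gamma^h/|v_0|$. Therefore, if $h \le 0$, we can write $\psi^{(h)\pm}_{\mathbf r}$ as the sum of two independent Grassmann variables $\psi^{(h)\pm}_{\mathbf r,\rho}$, $\rho = \pm 1$, with propagator
\[ \int P(d\psi^{(h)})\,\psi^{(h)-}_{\mathbf r_1,\rho_1}\psi^{(h)+}_{\mathbf r_2,\rho_2} = \beta\,\delta_{\mathbf r_1,\mathbf r_2}\,\delta_{\rho_1,\rho_2}\,\hat g^{(h)}_{\rho_1}(\mathbf r_1), \tag{2.17} \]
so that
\[ \psi^{(h)\pm}_{\mathbf r} = \sum_{\rho=\pm 1}\psi^{(h)\pm}_{\mathbf r,\rho}, \qquad \hat g^{(h)}(\mathbf r) = \sum_{\rho=\pm 1}\hat g^{(h)}_\rho(\mathbf r), \tag{2.18} \]
\[ \hat g^{(h)}_\rho(\mathbf r) = \frac{\tilde\theta(\rho\omega x)\,\hat f_h(\mathbf r)}{-ik_0-\varphi_x+\mu}, \tag{2.19} \]
where $\tilde\theta(\cdot)$ is the (periodic) Heaviside function. If $\rho\omega x > 0$, we will write in the following $x = x' + \rho\bar x$; note that as $\bar x \notin \Lambda$ then $x' \notin \mathbb Z$. In order to simplify the notation, it will be useful in the following to denote $\hat g^{(1)}(\mathbf r)$ also as $\hat g^{(1)}_1(\mathbf r')$, with $x = x' + \bar x$. It is easy to prove, by using (2.10), that, for any $h \le 1$ and any $\rho$,
\[ |\hat g^{(h)}_\rho(\mathbf r')| \le G_0\,\gamma^{-h}, \tag{2.20} \]
for a suitable positive constant $G_0$. Finally note that, as $|k_0| \ge \pi/\beta$, one has $\|\mathbf r'\| \ge \pi/\beta$, so that $h_\beta = [\log(\pi/\beta)/\log\gamma]$ (where $[\cdot]$ denotes the integer part), i.e. $\gamma^{h_\beta} \approx \pi/\beta$.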
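2.5. In order to prove that the Schwinger functions in (2.7) exist, we start by studying the denominator $\int P(d\psi)\,e^{\mathcal V(\psi)}$. We perform the integration $\int P(d\psi)$ in the following way, defined by induction. Assume that we have integrated all the fields with scale $1 \ge h' > h$ and we have to integrate the r.h.s. of
\[ \int P(d\psi)\,e^{\mathcal V(\psi)} = e^{-E_h}\int P(d\psi^{(\le h)})\,e^{\mathcal V^{(h)}(\psi^{(\le h)})}, \tag{2.21} \]
where $\mathcal V^{(h)}$ is the effective potential
\[ \mathcal V^{(h)}(\psi^{(\le h)}) = \sum_{\rho_1,\rho_2=\pm 1}\ \sum_{m=-\infty}^{\infty}\ \frac{1}{\beta}\sum_{x'\in\Lambda'}\sum_{k_0\in D_\beta} \psi^{(\le h)+}_{x'+\rho_1\bar x,k_0,\rho_1}\,\psi^{(\le h)-}_{x'+\rho_2\bar x+[m+(\rho_1-\rho_2)\bar x],k_0,\rho_2}\,W^{(h)}(x,x+m;k_0) \tag{2.22} \]
and $E_h$ is defined iteratively, see (2.42) below. For $h = 1$ the integration is just given by (2.6). The kernels $W^{(h)}(x,x+m;k_0)$ are expressed as the sum of suitable Feynman graphs, see Sect. (2.7) below. The integration $P(d\psi^{(\le h)})$ is defined, for $h \le 0$, as
\[ \begin{aligned} P(d\psi^{(\le h)}) &= \prod_{\mathbf r'\in\Lambda'\times D_\beta}\frac{1}{N_h(\mathbf r')}\prod_{\rho=\pm 1} d\psi^{(\le h)+}_{\mathbf r'+\rho\bar{\mathbf x},\rho}\,d\psi^{(\le h)-}_{\mathbf r'+\rho\bar{\mathbf x},\rho} \\ &\quad\times\exp\Big\{-\frac{1}{\beta}\sum_{\rho=\pm 1}\sum_{x'\in\Lambda'}\sum_{k_0\in D_\beta} C_h(\mathbf r')\Big[\big(-ik_0-\rho v_0\,\omega x'-\Delta^\rho_{x'}\big)\,\psi^{(\le h)+}_{\mathbf r'+\rho\bar{\mathbf x},\rho}\psi^{(\le h)-}_{\mathbf r'+\rho\bar{\mathbf x},\rho} \\ &\qquad\qquad - \sigma_h(\mathbf r')\,\psi^{(\le h)+}_{\mathbf r'+\rho\bar{\mathbf x},\rho}\psi^{(\le h)-}_{\mathbf r'-\rho\bar{\mathbf x},-\rho}\Big]\Big\}, \end{aligned} \tag{2.23} \]
where
\[ C_h^{-1}(\mathbf r') = \sum_{j=h_\beta}^{h} f_j(\mathbf r'), \tag{2.24} \]
\[ N_h(\mathbf r') = \beta^{-1}C_h(\mathbf r')\Big[\big(-ik_0-\Delta^1_{x'}\big)\big(-ik_0-\Delta^{-1}_{x'}\big) - v_0^2\|\omega x'\|^2_{\mathbb T} - [\sigma_h(\mathbf r')]^2\Big] \tag{2.25} \]
and $\sigma_h(\mathbf r')$ is also defined iteratively, see (2.31) below. We write
\[ \mathcal V^{(h)} = \mathcal L\mathcal V^{(h)} + \mathcal R\mathcal V^{(h)}, \tag{2.26} \]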
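where $\mathcal L$ is the localization operator, a linear operator such that
\[ \begin{aligned} &\mathcal L\,\frac{1}{\beta}\sum_{x'\in\Lambda'}\sum_{k_0\in D_\beta} \psi^{(\le h)+}_{x'+\rho_1\bar x,k_0,\rho_1}\,\psi^{(\le h)-}_{x'+\rho_2\bar x+[m+(\rho_1-\rho_2)\bar x],k_0,\rho_2}\,W^{(h)}(x,x+m;k_0) \\ &\quad= \tilde\delta_{\omega m+(\rho_1-\rho_2)\bar\omega,0}\;W^{(h)}(\rho_1\bar x,\rho_1\bar x+m;0)\cdot\frac{1}{\beta}\sum_{x'\in\Lambda'}\sum_{k_0\in D_\beta} \psi^{(\le h)+}_{x'+\rho_1\bar x,k_0,\rho_1}\,\psi^{(\le h)-}_{x'+\rho_2\bar x+[m+(\rho_1-\rho_2)\bar x],k_0,\rho_2}, \end{aligned} \tag{2.27} \]
if $\tilde\delta_{x,y} = \sum_{k\in\mathbb Z}\delta_{x,y+k}$, and $\mathcal R = 1 - \mathcal L$ is the renormalization operator.

Remark 2.6. Note that in case (1) of the theorem the condition defining the delta in (2.27) can be verified only if $\rho_1 = \rho_2$ and $m = 0$, while in case (2) it can be verified also if $\rho_1 = -\rho_2$, $m = -\rho_1(2k+1)$.

2.7. Using the parity property of $\varphi_x$, we write, for $h \le 0$,
\[ \mathcal L\mathcal V^{(h)} = \gamma^h\nu_h F_\nu^{(h)} + s_h F_\sigma^{(h)}, \tag{2.28} \]
where $F_\nu^{(h)}$ and $F_\sigma^{(h)}$ are given by
\[ \begin{aligned} F_\nu^{(h)} &= \sum_{\rho=\pm 1}\sum_{x'\in\Lambda'}\frac{1}{\beta}\sum_{k_0\in D_\beta}\psi^{(\le h)+}_{\mathbf r'+\rho\bar{\mathbf x},\rho}\,\psi^{(\le h)-}_{\mathbf r'+\rho\bar{\mathbf x},\rho}, \\ F_\sigma^{(h)} &= \sum_{\rho=\pm 1}\sum_{x'\in\Lambda'}\frac{1}{\beta}\sum_{k_0\in D_\beta}\psi^{(\le h)+}_{\mathbf r'+\rho\bar{\mathbf x},\rho}\,\psi^{(\le h)-}_{\mathbf r'-\rho\bar{\mathbf x},-\rho}. \end{aligned} \tag{2.29} \]
We write, for $h \le 0$,
\[ e^{-E_h}\int P(d\psi^{(\le h)})\,e^{\mathcal V^{(h)}(\psi^{(\le h)})} = e^{-E_h-t_h}\int \tilde P(d\psi^{(\le h)})\,e^{\tilde{\mathcal V}^{(h)}(\psi^{(\le h)})}, \tag{2.30} \]
where $\tilde P(d\psi^{(\le h)})$ has the same form as $P(d\psi^{(\le h)})$ in (2.23) with $\sigma_h(\mathbf r')$ replaced by $\sigma_{h-1}(\mathbf r')$, where
\[ \sigma_{h-1}(\mathbf r') = \sigma_h(\mathbf r') + C_h^{-1}(\mathbf r')\,s_h, \quad h < 0, \qquad \sigma_0(\mathbf r') = 0, \tag{2.31} \]
and $\tilde{\mathcal V}^{(h)} = \mathcal L\tilde{\mathcal V}^{(h)} + \mathcal R\mathcal V^{(h)}$, if
\[ \mathcal L\tilde{\mathcal V}^{(h)} = \gamma^h\nu_h F_\nu^{(h)} \tag{2.32} \]
is the localized effective potential on scale $h$. In (2.30) $t_h$ takes into account the different normalizations of the two integrations and it is given, for $h \le -1$, by
\[ t_h = -\sum_{\mathbf r'\in\Lambda'\times D_\beta}\log\frac{\big(-ik_0-\Delta^1_{x'}\big)\big(-ik_0-\Delta^{-1}_{x'}\big) - v_0^2\|\omega x'\|^2_{\mathbb T} - [\sigma_{h-1}(\mathbf r')]^2}{\big(-ik_0-\Delta^1_{x'}\big)\big(-ik_0-\Delta^{-1}_{x'}\big) - v_0^2\|\omega x'\|^2_{\mathbb T} - [\sigma_h(\mathbf r')]^2}; \]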
a similar expression holds for $h = 0$. The r.h.s. of (2.30) can be written as
$$e^{-E_h - t_h} \int P(d\psi^{(\le h-1)}) \int \tilde P(d\psi^{(h)})\, e^{\tilde V^{(h)}(\psi^{(\le h)})}, \tag{2.33}$$
where $P(d\psi^{(\le h-1)})$ and $\tilde P(d\psi^{(h)})$ are given by (2.23) with $\sigma_h(\mathbf{r}')$ replaced by $\sigma_{h-1}(\mathbf{r}')$, with $C_h(\mathbf{r}')$ replaced by $C_{h-1}(\mathbf{r}')$ and $f_h^{-1}(\mathbf{r}')$, respectively, and with $\psi^{(\le h)}$ replaced by $\psi^{(\le h-1)}$ and $\psi^{(h)}$, respectively. Note that $\sigma_h(\mathbf{r}')$ is defined iteratively by (2.31), for all $h \le 0$, with $\sigma_0(\mathbf{r}') = 0$; by the $k_0$-dependence of the propagator one easily checks that $\sigma_h(\mathbf{r}')$ is real. In case (1) of the theorem $s_0 = 0$ and also $s_j = 0$ for any $j < 0$ (see Remark 2.6), so that $\sigma_h(\mathbf{r}') = 0$ for any $h$. On the other hand in case (2), by defining $\eta(k)$ as in (1.9), one has $|\sigma_h(\mathbf{r}')| \le C|\varepsilon|^{\eta(k)}$, for some constant $C$, as will be proven in Appendix A.2. Then, as a consequence of the change of the Grassmannian integration, (2.16) has to be replaced with
$$\hat g^{(h)}(\mathbf{r}) = \sum_{\rho,\rho'=\pm 1} f_h(\mathbf{r}')\, [T_h^{-1}(\mathbf{r}')]_{\rho,\rho'} \equiv \sum_{\rho,\rho'=\pm 1} \tilde g^{(h)}_{\rho,\rho'}(\mathbf{r}'), \tag{2.34}$$
where the 2 × 2 matrix Th (r ) has entries 1 [Th (r )]1,1 = −ik0 − v0 ωx − 4x , [Th (r )]1,2 = [Th (r )]2,1 = −σh (r ), [T (r )] = −ik + v ωx − 4−1 , h 2,2 0 0 x
(2.35)
which is well defined on the support of fh (r ), so that, if we set 2 2 2 Ah (r ) ≡ det Th (r ) = [−ik0 − 41x ][−ik0 − 4−1 x ] − v0 ωx T − [σh (r )] , (2.36)
then Th−1 (r ) with
1 [τh (r )]1,1 [τh (r )]1,2 = , Ah (r ) [τh (r )]2,1 [τh (r )]2,2
−1 [τh (r )]1,1 = −ik0 − 4x + v0 ωx , [τh (r )]1,2 = [τ0 (r )]2,1 = σh (r ), [τ (r )] = −ik − 41 − v ωx . h 2,2 0 0 x (1)
(2.37)
(2.38)
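The passage from (2.35) to (2.37)–(2.38) is the standard adjugate formula for the inverse of a symmetric 2 × 2 matrix with off-diagonal entries $-\sigma_h$. A minimal numerical check in Python is sketched below. The names `a` and `b` are placeholders for the diagonal entries $[T_h]_{1,1}$ and $[T_h]_{2,2}$, and the sample values are arbitrary assumptions chosen only to test the algebra, not values from the paper.

```python
def inverse_2x2(a: complex, b: complex, sigma: float):
    """Invert T = [[a, -sigma], [-sigma, b]] via the adjugate formula.

    Returns (det, inv) with det = a*b - sigma**2 and
    inv = (1/det) * [[b, sigma], [sigma, a]].
    """
    det = a * b - sigma ** 2
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    inv = [[b / det, sigma / det], [sigma / det, a / det]]
    return det, inv

def matmul_2x2(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Arbitrary sample values (assumptions for illustration only).
a, b, sigma = -0.3j - 0.7, -0.3j + 0.5, 0.2
det, inv = inverse_2x2(a, b, sigma)
product = matmul_2x2([[a, -sigma], [-sigma, b]], inv)  # should be the identity
```

The diagonal entries of the inverse are swapped with respect to those of the original matrix and the off-diagonal entries change sign, which is the pattern displayed in (2.38).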
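For $h = 1$ we set $\tilde g^{(1)}_{1,1} = \hat g^{(1)}_{1}(\mathbf{r})$. Moreover
$$\sigma_h(\mathbf{r}') = \sum_{j=h}^{0} C_j^{-1}(\mathbf{r}')\, s_j. \tag{2.39}$$
Note that there exists a constant $G_1$ such that
$$|\tilde g^{(h)}_{\rho,\rho'}(\mathbf{r}')| \le G_1 \gamma^{-h}, \tag{2.40}$$
which can be proven as (2.20).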
Integrating the $\psi^{(h)}$ field we find that (2.33) becomes
$$e^{-E_{h-1}} \int P(d\psi^{(\le h-1)})\, e^{V^{(h-1)}(\psi^{(\le h-1)})}, \tag{2.41}$$
with
$$E_{h-1} = E_h + t_h + \tilde E_h, \tag{2.42}$$
˜ (h) (h) where E˜ h = − log P˜ (dψ (h) ) eV (ψ ) ; we can consider (2.42) defined for any h ≤ 1, provided that we set E1 = 0 and t1 = 0. 2.8. In order to perform some estimates it is convenient to introduce a diagrammatic representation of the effective potential V˜ (h) , in terms of chain graphs described below. A graph ϑ of order n is a chain of n + 1 lines A1 , . . . , An+1 connecting a set of n ordered points (vertices) v1 , . . . , vn , so that Ai enters vi and Ai+1 exits from vi ; A1 and An+1 are the external lines of the graph and both have a free extreme, while the others are the internal lines; we shall denote by int(ϑ) the set of all internal lines. We say that vi < vj if vi precedes vj and we denote vj the vertex immediately following vj , if j < n. We denote also by Av the line entering the vertex v, so that Ai ≡ Avi , 1 ≤ i ≤ n. We say that a line A emerges from a vertex v if A either enters v (A = Av ) or exits from v (A = Av ). We shall say that ϑ is a labelled graph of order n and external scale h, if ϑ is a graph of order n, to which the following labels are associated: • a label δv = 0, ±1 for each vertex, • a scale label h for both the external lines and a scale label hA ≥ h + 1 for each A ∈ int(ϑ), • two labels ρA1 , ρA2 = ±1 for each internal line A, setting ρA21 , ρA1q+1 = ±1, ρA11 = 0 and
ρA2q+1 = 0, • a momentum k0 on each line, and • a coordinate xA 1 = x , with x = x + ρ1 x, ¯ for the first line and a coordinate xA v = x +
δw + ρA2w − ρA1w x¯
w≤v
for each other line Av . Moreover, h(ϑ) ≡ minA∈int(ϑ) hA will be called the internal scale or simply the scale of ϑ. A graph of order n can be obtained from n graph elements formed by a vertex with two emerging half-lines (representing the left one a ψ + field and the right one a ψ − field), by pairing the half-lines (contractions) in such a way that a line ψ ± can be paired only with a line ψ ∓ and the resulting graph turns out to be connected with only two half-lines left not contracted (the external lines of the graph). (h ) Given a line A, we associate to it a propagator g˜ ρ 1A,ρ 2 (rA ). Given a vertex v one has A
A
δv = 0 only if hAv ≤ 0 and we associate to it a factor γ h νh , if h = min{hv , hv }; we say that such a vertex is a ν-vertex. If a vertex v has δv = ±1, we associate to it simply a factor ε.
Given a labelled graph ϑ, we can consider a maximal connected subset T of lines A in ϑ with scales hA ≥ hT and with at least one line on scale hT . Then the external lines of T ( i.e. the lines that have only one vertex inside T ) have scale labels smaller than hT . We shall say that T is a cluster of scale hT . The vertices connected by the lines internal to T are said to belong to T . An inclusion relation can be established between the clusters, in such a way that the innermost clusters are the clusters with the highest scale (minimal clusters), and so on. Note that ϑ itself is a cluster (of scale h(ϑ)). Each cluster T has an incoming line AiT and an outgoing line AoT ; we set xA o − x i ≡ T
¯ where mT + (ρ 2i − ρA1o )x, AT
T
mT =
δv +
v∈T
A∈T
ρA2 − ρA1 x¯
AT
(2.43)
is an integer. The maximum between hAi and hAoT will be called the external scale of T . T Note that, for mT = 0, 2a0 γ hT /|v0 | ≥ ωxA o T + ωxA i T ≥ ω(xA o − xA i )T T
=
T
T
ωmT + (ρA2i T
T
(2.44)
− ρA1o )ω ¯ T, T
where hT is the scale of the cluster immediately containing T ( i.e. hT is the external scale of T ), so that C0 |mT |−τ , case (1), hT 2a0 γ /|v0 | ≥ (2.45) −τ C0 (|mT | + (2k + 1)) , case (2), a key inequality which will be deeply used in the proof of Lemma 2.10 below (see Appendix A1). We say that V is a resonance (or a resonant cluster) of ϑ, if x i = xA o , i.e. if the AV
V
Kronecker delta in the r.h.s of (2.28) is verified. On each resonance the R operation acts. h the set of the graphs ϑ of order n and with external scale h, such that We define Tn,m the difference between the coordinate xA1 of the entering and the coordinate xAn+1 of the exiting line is m, i.e. ρA2 − ρA1 x¯ = m, (2.46) δv + v∈ϑ
A∈int(ϑ)
h(ϑ) = h + 1 and on each resonance the R operator acts. Then we can write ∞ Wn(h) (x, x + m; k0 ), W (h) (x, x + m; k0 ) = Wn(h) (x, x + m; k0 ) = Val(ϑ) = εn R
n=1 ∞
h n=1 ϑ∈Tn,m
A∈int(ϑ)
Val(ϑ),
(h ) g˜ ρ 1A,ρ 2 (rA ) A A
(ν) γ hT νh MT0 T , ε
T ∈T
(2.47)
where T is the set of clusters in ϑ, T0 is the set of lines and vertices inside T and outside (ν) the clusters internal to T and MT0 is the number of ν-vertices in T0 . A resonance can be seen as a tree with external lines carrying the same coordinate labels xA 1 = xA p+1 ; in (h)
¯ ±x; ¯ k0 ) the resonance value. such a case we shall call Wn (x, Note that the R operator in (2.47) produces derivatives of the propagators: one can easily show that, for any values r1 , r2 , d (hA ) g˜ 1 2 (tr + r ) ≤ G2 r γ −2h , t ∈ [0, 1], (2.48) 2 2 dt ρA ,ρA 1 for some constant G2 , a property that will be used in Appendix A1 to prove Lemma 2.10 below. 2.9. Let we define (G1 is defined in (2.40)) h∗ = inf{h ≥ hβ : G1 γ h ≥ |σh |}.
(2.49)
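In case (1) of the theorem of course $h^* \equiv h_\beta$. In case (2) however $h^* \neq h_\beta$ generically (we cannot exclude that for some potential $\varphi_x$ one has $\sigma_h = 0$ identically: no lower bounds for $\gamma^{h^*}$ can be in general given). If one defines
$$\tilde g^{(\le h^*)}_{\rho,\rho'}(\mathbf{r}') = \sum_{j=h_\beta}^{h^*} \tilde g^{(j)}_{\rho,\rho'}(\mathbf{r}'), \tag{2.50}$$
then
$$|\tilde g^{(\le h^*)}_{\rho,\rho'}(\mathbf{r}')| \le G_1 \gamma^{-h^*}; \tag{2.51}$$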
this means that, if h∗ > hβ , the scales ≤ h∗ can be integrated all together. The convergence of the effective potential is a consequence of the following lemma, proved in Appendices A1 and A2. Lemma 2.10. If γ > 2τ , there exists ε0 such that, for |ε| ≤ ε0 and h∗ ≤ h ≤ 0 (h∗ = hβ in case (1)), one has −1 1 (h) (2.52) Wn (x, x + m; k0 ) ≤ CD n−1 |ε|n/4 e− 4α log ε |m| , for some constants C, D and α = 1 in case (1), α = 2(k + 1) in case (2). 2.11. Let us make some comments on the elaborate integration procedure described above. The series for the effective potential are plagued by a problem of small divisors similar to the one in the Lindstedt series for KAM tori of Hamiltonian systems close to integrable ones. Retaining only the terms in the series with no resonances, it would be possible to show that, as a consequence of the Diophantine condition, a bound O(|ε|n/4 ) for the graphs with n vertices could be obtained. According to the RG approach, the resonance values are written as sums of two terms, using the decomposition 1 = L + R given by (2.27), and one considers a renormalized expansion in terms of graphs such that (1) on all the clusters the R operation acts and (2) the graph values depend also on a set of
running coupling constants, which take into account the local part of the resonances; of course the action of R is trivial (R = 1) except for the resonances. It is then possible to show that, if the running coupling constants admit a bound O(|ε|), the renormalized graphs still admit a bound O(|ε|n/4 ) (Appendix A1). However one has still to show that the running coupling constants are bounded (Appendix A.2). The running coupling constants have a clear physical meaning: the νh represent the renormalization of the chemical potential with respect to the ε = 0 case, while the σh , present only in the case (2), take into account the opening of a gap in the singleparticle spectrum. The flow of the νh is controlled by adding a counterterm ν = ν(ε) in the Hamiltonian. Note that also in the case with gaps it is necessary to fix the chemical potential as the gap is quite smaller than the chemical potential renormalization. The flow of σh is controlled by putting it in the Grassmannian integration; this is often referred to by saying that the “free measure” is changed as an effect of the interaction (anomalous Grassmannian integration). 3. The Two-Point Schwinger Function 3.1. In this section we define a perturbative expansion, similar to the one discussed for the effective potential in the previous section, for the two-point Schwinger function, defined by (2.7), which can be rewritten, at finite L, β, #
" 1 ∂2 V (ψ)+ dx φx+ ψx− +ψx+ φx− SL,β (x, y) = P (dψ) e (3.1) + − , φ =φ =0 ∂φx+ ∂φy− N1
β/2
where dx is a shortcut for x∈ −β/2 dx0 , N1 = P (dψ) eV (ψ) and {φx± } are Grassmannian variables (the external field), anticommuting with {ψx± }. Setting ψ = ψ (≤0) + ψ (1) and performing the integration over the field ψ (1) (ultraviolet integration), which can be easily performed as in [BGM], to which we refer for more details, we find
(0) ∂2 dxdy φx+ Vφ,φ (x,y) φy− e + − ∂φx ∂φy #
" + (≤0)− (≤0)+ − 1 (0) (≤0) (0) (≤0) P (dψ (≤0) ) e dx φx ψx +ψx φx eV (ψ )+W (ψ ,φ) + − , · φ =φ =0 N0 (3.2)
SL,β (x, y) =
where
N0 =
W (0) (ψ (≤0) , φ) =
P (dψ (≤0) ) eV
(0) (ψ (≤0) )
,
(0) (0) dxdy φx+ Kφ,ψ (x, y) ψy(≤0)− + ψx(≤0)+ Kψ,φ (x, y) φy− , (3.3)
(0)
(0)
Vφ,φ (x, y) = g (1) (x, y) + Kφ,φ (x, y), with g (1) (x, y) given by g (1) (x, y) =
1 −ik0 (x0 −y0 ) fˆ1 (x − y, k0 ) e . β −ik0 − ϕx + µ k0 ∈Dβ
(3.4)
We have, in particular, (0)
Kφ,φ (x, y) =
∞
e−ik0 (x0 −y0 ) Val(ϑ),
(3.5)
n=3 ϑ∈T φφ,0 k0 ∈Dβ n,m
φφ,0
where Tn,m is the set of all labelled graphs of order n with two external propagators (corresponding to the lines A1 and An+1 ), such that δv + 1 = m, (3.6) v∈ϑ
A∈int(ϑ)
with hA = 1 ∀A ∈ ϑ; moreover, Val(ϑ) is obtained from (2.47) by adding two external propagators with argument (x, k0 ) and (y, k0 ) and scale 1. In the same way the kernels (0) (0) Kφ,ψ (x, y) or Kψ,φ (x, y) are defined, with the only difference that only to one external line a propagator is associated. Then (3.2) can be written (0)
SL,β (x, y) = Vφ,φ (x, y) + S (0) (x, y), where S (0) (x; y) =
1 ∂2 P (dψ (≤0) ) ∂φx+ ∂φy− N0 #
" + (≤0)− (≤0)+ − (0) (≤0) (0) (≤0) · e dx φx ψx +ψx φx eV (ψ )+W (ψ ,φ)
(3.7)
(3.8) φ + =φ − =0
.
3.2. Proceeding as above, after integrating ψ (0) , . . . , ψ (h+1) we find SL,β (x, y) =
0 h =h
(h )
Vφ,φ (x; y) + S (h) (x, y),
(3.9)
where
1 ∂2 S (x, y) = P (dψ (≤h) ) ∂φx+ ∂φy− Nh #
" + (≤h)− (≤h)+ − (h) (≤h) (h) (≤h) +ψx φx dx φx ψx ·e eV (ψ )+W (ψ ,φ) + − , φ =φ =0 (h) (h) W (h) (ψ (≤h) , φ) = dxdy φx+ Kφ,ψ (x, y) ψy(≤h)− + ψy(≤h)+ Kψ,φ (x, y) φx− , (h)
(3.10) (h)
(h)
Vφ,φ (x, y) = g (h+1) (x, y) + Kφ,φ (x, y). (h)
The kernels Kχ (1) ,χ (2) (x, y) can be represented as sums of graphs of the same type as
those appearing in the graph expansion of the effective potential V (h) ; the new graphs differ only in the following respects: • if χ (2) = φ, the right external line is associated to the φ − field and the graph ends with a vertex carrying no ε factor;
• if χ (1) = φ, the left external line is associated to the φ + field and the graph begins with a vertex carrying no ε factor; • R ≡ 1 on resonances containing an external propagator (defined as after (3.5)); • h(ϑ) = h + 1 for all graphs, if χ (1) = χ (2) = φ. Then the functional derivatives in (4.8) give SL,β (x, y) =
∗ (h) (h∗ ) g (h+1) (x; y) + Kφ,φ (x, y) + g¯ (h ) (x, y) + K¯ φ,φ (x, y),
0 h=h∗ +1
(3.11) ∗
∗ (h ) where h∗ is defined in (2.49), g¯ (h ) (x, y) and K¯ φ,φ (x, y) have a different meaning in case (1) or (2) of the theorem (see Sect. 3.3 and Sect. 3.4 below) and
(h)
Kφ,φ (x, y) =
∞ n=3 ϑ∈T φφ,h
1 −ik0 (x0 −y0 ) e Val(ϑ) β
n,m
(3.12)
k0 ∈Dβ
φφ,h
if Tn,m , with x − y = m, is the set of all labelled graphs of order n with two external propagators, such that δv + ρA1 − ρA2 x¯ = m, (3.13) v∈ϑ
A∈int(ϑ)
and Val(ϑ) is computed with the rules explained after (3.5) and (3.10). The discussion will proceed from now on in a different way depending on case (1) or (2) of the theorem. ∗
∗ ∗ (h ) 3.3. In the case (1) in (3.11) one has g¯ (h ) (x, y) = g (h ) (x, y) and K¯ φ,φ (x, y) =
(h∗ )
Kφ,φ (x, y), with h∗ = hβ . Note that, from (3.13), |v0 |t0−1 γ −h−1 ≤ sup max
0≤k≤2n ρ=±1
≤
4C0−1 2τ
1 ω(x + k) + ρ ω ¯
(3.14)
τ
[min{|x|, |y|} + 2n] ,
where in the last inequality we have used the Diophantine condition (1.5) for ω(x +k)+ ρ ω ¯ ≤ 1/4 (in such a case one can write ω(x + k) + ρ ω ¯ = ω2(x + k) + ρ2ω/2), ¯ while for ω(x + k) + ρ ω ¯ > 1/4, the bound in (3.14) is of course trivial. Then τ n −1 τ −h−1 τ |v0 |t0 γ ≤ 4C0 4 (1 + min{|x|, |y|}) 1 + 1 + min{|x|, |y|} (3.15) ¯
≡ |v0 |t0 γ −h(n)−1 . In order to bound (3.11) we use the following result, proven in Appendix A.3:
|Val(ϑ)| ≤ γ −h CD n−1 |ε|(n−2)/4 e− 4α log ε 1
−1 m
φφ,h ∀ϑ ∈ Tn,m ,
(3.16)
with the same notations used in (2.52); in particular α = 1. The integral over k0 is over a domain at most of order γ h(ϑ) , with h(ϑ) = h + 1, as k0 is constrained to be on the compact support of the propagator with scale h(ϑ), so that 0 0 ∞ (h) C3 γ h sup |Val(ϑ)| , Kφ,φ (x, y) ≤
(3.17)
k0 ∈Dβ
h=hβ n=3 ϑ∈T φφ,h n,m
h=hβ
where Val(ϑ) is bounded as in (3.16). By (3.15) we see that the sums over h and n are not independent; in particular we can exchange the order of the sums writing 0 0 ∞ (h) Kφ,φ (x, y) ≤
C3 γ h sup |Val(ϑ)| k0 ∈Dβ
φφ,h n=3 h=h(n) ¯ ϑ∈T
h=hβ
n,m
≤
∞
C3 CD n |ε|(n−2)/4 e− 4 log ε 1
−1 |x−y|
n=[|x−y|/2]
$
%τ n · log (1 + min{|x|, |y|}) 1 + 1 + min{|x|, |y|} & ' |x − y| −1 τ ≤ C5 log (1 + min{|x|, |y|}) exp − log ε , 4
(3.18)
where C5 depends on ω and ω, ¯ the sum over the scales of the tree is controlled by (3.15) and we have used (3.14) and the fact that n ≥ |x − y|/2. We can obtain another bound, which is better for large |x0 − y0 |; by using (h)
Kφ,φ (x, y) =
∞ 1 1 e−ik0 (x0 −y0 ) D0N Val(ϑ), N |x0 − y0 | β φφ,h n=3 ϑ∈T
(3.19)
k0 ∈Dβ
n,m
where D0 denotes the discrete derivative with respect to k0 (see also [BGM]) and proceeding as above we find the bound & 0 ' 1 |x − y| (h) (3.20) log ε−1 , Kφ,φ (x, y) ≤ C6 " #N exp − 4 |x0 − y0 | |x|−τ h=hβ
which is better than the bound (3.18) for |x0 − y0 | ≥ |x|τ . So the bound (1.6) follows. ∗
(h∗ )
∗
3.4. In case (2) of the theorem, one has g¯ (h ) (x, y) = g (≤h ) (x, y) and K¯ φ,φ (x, y) = (≤h∗ )
Kφ,φ (x, y), with h∗ > hβ , generically, as one can integrate all the scales ≤ h∗ in a single step, namely SL,β (x, y) =
0 h=h∗ +1
+g
(h) g (h+1) (x, y) + Kφ,φ (x, y)
(≤h∗ )
(3.21)
(
(x, y) + Kφ,φ (x, y),
(h)
(≤h∗ )
∗)
and an expression similar to (3.12) for Kφ,φ (x, y) is valid for Kφ,φ (x, y), with g (≤h in place of g (h+1) . Proceeding as above the bound (1.7) in Theorem 1.4 follows.
4. Generalizations and Final Comments 4.1. We compare now our results with the result about the Schrödinger equation with a large quasiperiodic potential −ψ(x − 1) − ψ(x + 1) + 2ψ(x) + λV (x)ψ(x) = Eψ(x),
(4.1)
where $V(x) = \bar V(\alpha + \omega x)$, with $\bar V(\cdot)$ periodic of period 1 and $\alpha \in \mathbb{R}$. Note the presence of the free parameter $\alpha$ in $V(x)$, whereas in our model $\alpha$ is kept fixed. The results obtained for $|\lambda|$ large depend crucially on $\alpha$; in fact, under "reasonable" assumptions on $\bar V$, see [PF], it is proved that the spectrum is pure point and the eigenfunctions decay exponentially for almost all $\alpha$. The search for exponentially localized solutions of the above Schrödinger equation was done through the use of perturbation theory, considering the second difference operator to be the perturbation; such a perturbation theory is afflicted by a small divisors problem. In [BLS], $V(x)$ was assumed strictly monotone in the period; a typical example is $V(x) = \tan(2\pi(\alpha + \omega x))$, with $\omega$ verifying a Diophantine condition. Functions like $V(x) = \cos(2\pi(\alpha + \omega x))$, which are the more interesting ones, were excluded by [BLS], but were considered later in [BLT,S]. In particular [S] considered $C^2$ functions $\bar V(\alpha + \omega x)$, having exactly one nondegenerate maximum and minimum and strictly monotone with nonzero derivative between them (and some assumptions on the continued fraction expansion of $\omega$, slightly different from the usual Diophantine one, were made). In particular, imposing some conditions on $\alpha$ and $E$ (satisfied on a full measure set), in [S] the existence of Anderson localization was proved. In [E] further generalizations were considered, in particular relaxing the hypothesis of monotonicity made in [S]; see also Sect. 4.4 and Sect. 4.5 below. In order to compare the above results, especially [S], with our work, we can study the following problem. We replace in (1.1) the function $\varphi_x$ with $\bar V(\alpha + \omega x)$ and we consider $\mu = \bar V(\alpha + \omega\bar n)$, with $\bar n$ integer. This corresponds to choosing $\mu$ at an eigenvalue of the spectrum of the Schrödinger equation. We assume as Diophantine condition, besides (1.3), also
$$\|\omega n \pm 2\alpha\|_{\mathbb{T}} \ge C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z} \setminus \{0\}. \tag{4.2}$$
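A condition of the form (4.2) can be probed numerically on a finite range of $n$. The following minimal Python sketch computes the smallest value of $|n|^{\tau}\,\|\omega n + \text{shift}\|$ over $1 \le |n| \le n_{\max}$, which is an upper bound for any admissible constant $C_0$. The function names, the golden-mean frequency, the phase `ALPHA`, the exponent and the cutoff are illustrative assumptions, not values taken from the paper, and a finite scan cannot establish the condition for all $n$.

```python
import math

def torus_norm(x: float) -> float:
    """Distance of x from the nearest integer."""
    return abs(x - round(x))

def diophantine_constant(omega: float, shift: float, tau: float, n_max: int) -> float:
    """min over 1 <= |n| <= n_max of |n|**tau * ||omega*n + shift||.

    Scanning both n and -n with a fixed shift covers both signs in (4.2),
    since ||omega*n - shift|| = ||omega*(-n) + shift||.
    """
    best = math.inf
    for n in range(1, n_max + 1):
        for s in (n, -n):
            best = min(best, abs(s) ** tau * torus_norm(omega * s + shift))
    return best

OMEGA = (math.sqrt(5) - 1) / 2   # assumed sample frequency (golden mean)
ALPHA = 0.3                      # assumed sample phase

c_plain = diophantine_constant(OMEGA, 0.0, 1.0, 2000)        # condition without shift
c_shift = diophantine_constant(OMEGA, 2 * ALPHA, 1.0, 2000)  # shifted condition as in (4.2)
```

For the golden mean with exponent 1 and no shift, the minimum over this range is attained at $n = \pm 1$ and equals $1 - \omega$. The shifted constant depends on the phase and has to be checked case by case.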
Then we can proceed as in the proof of Theorem 1.4, with the following differences (let C1 , C2 , C3 , C4 denote suitable constants): • we replace (2.44) and (2.45), case (1), with C3 γ hT ≥ V¯ (α + ωxAoT ) − V¯ (α + ωxAi
T
)
≥ C2 sup ωm + 2εαT ≥ C1 |m|−τ ;
(4.3)
ε=0,±1
• we replace (3.14) with 1 C4 t0 γ −h−1 ≤ sup ¯ ¯ 0≤k≤2n V (α + ω(x + k)) − V¯ (α + ωn) τ
≤ C1 [min{|x|, |y|} + 2n] .
(4.4)
By these substitutions everything is essentially unchanged with respect to case (1) of Theorem 1.4: so we can prove the convergence of the Schwinger functions, at least if we make the extra assumption that the potential $V(x)$ is an even function (as Theorem 1.4 was stated only under such an assumption). We shall come back in a moment (see in particular Sect. 4.9) to the relevance of the parity assumption. We prefer to study the model (1.1), in which the phase $\alpha$ is fixed and the chemical potential can vary, as physically it corresponds to changing the number of electrons. Moreover we are interested also in fixing the chemical potential within gaps of order $O(\sigma)$ in the spectrum of the Schrödinger equation, which could be important from a physical point of view: this corresponds to a zero measure set of $\bar\omega = \omega\bar x$ (equivalently of $\alpha$), hence it is irrelevant in the analysis performed in [S,E], where only properties holding almost everywhere were explicitly considered.
4.2. We consider now the problem of extending the results to more general potentials. In fact we shall see that, if we allow the potential $\varphi_x$ to be modified by the perturbation (i.e. if we choose $\varphi_x$ as a function of $\varepsilon$), then the case of any potential with only one nondegenerate maximum and minimum and strictly monotone between them, as the one considered in [S], can be easily recovered. The same holds also if we want to consider potentials in the class defined in [E]; see Sect. 4.6–Sect. 4.7 below. The physical meaning of such a modification will be discussed in Sect. 4.9.
4.3. We start by considering in the Hamiltonian (1.1) the class of the potentials considered in [S].
First of all the analysis of the previous sections can be easily extended to the case in which the two roots of the equation ϕ(ωx) ¯ = µ (which we shall call the critical points of the potential ϕx ) are x¯1 and x¯2 , with x¯2 = −x¯1 , and ϕx is even with respect to (x¯1 + x¯2 )/2; if we have such a function ϕ(ωx) ¯ it is enough to define ϕ(ωx) ˜ = ϕ(ω(x ¯ + x0 )), with x0 = (x¯1 + x¯2 )/2, in order to obtain a potential even with respect to x0 , so that its roots are opposite to each other. So it is always possible to have a potential ϕx = ϕ(ωx) ˜ such that ϕ(±ω ¯ x) ¯ = µ. In general, if ϕx is not necessarily even, not only x¯1 = −x¯2 , but also, setting (1) (2) (1) (2) ϕ¯ (ωx¯1 ) = v0 and ϕ¯ (ωx¯2 ) = v0 , one has v0 = −v0 ; as x¯1 = −x¯2 we prefer to denote 1, 2 the values of the label ρ, instead of ±1 as in Sect. 2. A property used in the analysis of the previous sections due to the parity assumption on ϕx was indeed that ϕ¯ (±ωx) ¯ = ±v0 , but it is immediate to note that the analysis (1) (2) can be adapted to the case in which v0 = −v0 . Simply one has to use a different scale unit for the compact support functions (appearing in the multiscale decomposition of the propagator) near the two points x¯1 and x¯2 , i.e. one replaces in (2.11) a0 /|v0 | (1) (2) with a0 /|v0 | near x¯1 and with a0 /|v0 | near x¯2 , where a0 is such that the supports of χ (r − x¯ 1 ) and χ (r − x¯ 2 ) are disjoint (here x¯ j = (x¯j , 0), for j = 1, 2). Another property that fails in the non-even case is the equality between 41x and 42−x ; this implies that (2.28) has to be replaced by
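$$\mathcal{L}V^{(h)} = \sum_{\rho=1}^{2} \left( \gamma^h \nu_{\rho h} F^{(h)}_{\nu,\rho} + s_{\rho h} F^{(h)}_{\sigma,\rho} \right), \tag{4.5}$$
where $F^{(h)}_{\nu,\rho}$ and $F^{(h)}_{\sigma,\rho}$ are given by
$$F^{(h)}_{\nu,\rho} = \frac{1}{\beta} \sum_{x\in\Lambda} \sum_{k_0\in D_\beta} \psi^{(\le h)+}_{\mathbf{r}'+\bar{\mathbf{x}}_\rho,\rho}\, \psi^{(\le h)-}_{\mathbf{r}'+\bar{\mathbf{x}}_\rho,\rho}, \qquad F^{(h)}_{\sigma,\rho} = \frac{1}{\beta} \sum_{x\in\Lambda} \sum_{k_0\in D_\beta} \psi^{(\le h)+}_{\mathbf{r}'+\bar{\mathbf{x}}_\rho,\rho}\, \psi^{(\le h)-}_{\mathbf{r}'+\bar{\mathbf{x}}_{3-\rho},3-\rho}, \tag{4.6}$$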
so that four running coupling constants turn out to be involved: ν1h , ν2h , σ1h , σ2h . This does not introduce any extra problems as the flow of the running coupling constants can still be controlled as in the case considered in the even case. The main difference is that now a counterterm of the form in (1.2) is not sufficient to control the flow: one has to introduce in the Hamiltonian a term νx ψx+ ψx− , (4.7) x∈
where νx = ν¯ (ωx) is any function in C 1 (T) such that ν¯ (ωx¯1 ) = ν1 in a neighbourhood of radius a0 and center ωx¯1 and ν¯ (ωx¯2 ) = ν2 in a neighbourhood of radius a0 and center ωx¯2 , if ν1 and ν2 are fixed in such a way that the flow of the running coupling constants ν1h , ν2h is controlled as in the even case (see Sect. 2.11); we refer to Sect. A2.11 for more technical details. Note that (4.7) corresponds to a counterterm depending on x. A similar situation (in Fourier space) arose in [FST], where the modification of the Fermi surface is studied in nonspherical two-dimensional fermion systems with quartic interaction. By using the above comments one finds that for potentials like the ones of [S] a result analogous to Theorem 1.4 holds, provided that the potential ϕx is modified into ϕx + νx ; see Sect. 4.9 below. We also note that what really matters is the behaviour of the potential ϕx near the two points x¯1 , x¯2 , so that, far from them, no regularity assumption is required on the potential (which can even be a very irregular function far enough from the critical points x¯1 and x¯2 of the potential). Of course if one is interested in properties holding almost everywhere in the spectrum, as in [S,E], a global regularity has to be required. 4.4. For the case (1) of the theorem, also potentials with more than two points x¯1 , . . . , x¯p , p ≥ 3, such that ϕ(ω ¯ x¯j ) = µ,
j = 1, . . . , p,
(4.8)
like the ones in [E], could be considered with our methods, essentially without any real extra difficulties (however see Sect. 4.9). Of course some additional Diophantine conditions should be imposed on $\omega$. Instead of (1.5) one should require the $p(p-1)/2$ conditions
$$\|\omega n + \omega(\bar x_i - \bar x_j)\| > C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z} \setminus \{0\}, \quad \forall i \neq j = 1, \dots, p, \tag{4.9}$$
which slightly narrow the set of admissible $\omega$, while still leaving a full measure set, provided that $\bar x_i - \bar x_j$ does not remain constant for $i \neq j$ when varying $\mu$. If the interaction potential is a function of class $C^s$, with $s \ge 2$, also the case in which $\partial^{s'}\bar\varphi(\omega\bar x_j) = 0$, for some $1 \le s' \le s$, can be dealt with by our techniques (this would allow us to recover the first transversality condition in [E]). Anyway, if $s_0$ is the minimum $s'$ such that $\partial^{s'}\bar\varphi(\omega\bar x_j) \neq 0$, a renormalization to order $s_0$ (and not just to first order except for the case $s_0 = 1$) would be required, so that the flow of $s_0$ running coupling constants ought to be controlled. The analysis turns out to be more involved but not out of reach; see Sect. 4.5 below. Concerning the second transversality condition in [E], in the case $s_0 = 1$, $p = 2$ it is trivially satisfied near the critical points $\bar x_1$ and $\bar x_2$, as $v_0^{(1)} \neq v_0^{(2)}$. If the second transversality condition is assumed to hold for any $x$ as in [E], then it follows that the set of $\mu$ for which the condition (1.5) is satisfied has full measure. Also the case $s \ge 2$ can be treated along the lines sketched in the above paragraph. In fact the second transversality condition in [E] automatically excludes that the potential $\varphi_x$ is locally translation invariant, so that it imposes that, when varying $\mu$, the quantities $\bar x_i - \bar x_j$ are not constant for any $i \neq j$: then the set of $\omega$ satisfying (4.9) is of full measure (see comments after (4.9)). In conclusion when varying $\mu$, if $\varphi_x$ satisfies the two transversality conditions in [E], we have that (4.9) holds almost everywhere in $\mu$.
4.5. If there are $p$ critical points $\bar x_1, \dots, \bar x_p$ satisfying (4.8), for fixed $\mu$, and $\omega$ fulfills the Diophantine condition (4.9), then we have to renormalize the theory to order $s_0$ (see Sect. 4.4), by introducing, for any point $\bar x_j$, $j = 1, \dots, p$, $s_0$ running coupling constants
(0)
(1)
(s −2)
γ h νj h , γ (s0 −1)h/s0 νj h , . . . , γ (s0 −2)h/s0 νj h0
(s −1)
, γ h/s0 νj h0
,
(4.10)
such that LV (h) =
p s0 j =1 r=1
(r)
(h)
γ (s0 −r)h/s0 νj h Fν,j,r ,
(4.11)
where (h)
Fν,j,r =
1 (≤h)+ " #r (≤h)− ψr +¯xj ,j ωx ψr +¯xj ,j ; β
x∈
(4.12)
k0 ∈Jβ
recall that no $\sigma$-type running coupling constants appear in the case in which $\omega$ is incommensurate with respect to $\omega(\bar x_i - \bar x_j)$ for all $i \neq j$. The flow of the running coupling constants (4.10) can be discussed as in the case of Theorem 1.4 and it turns out to be controlled if suitable counterterms are introduced in the Hamiltonian. Instead of only one counterterm (4.7) one has to introduce $s_0$ counterterms, which leads to adding to the Hamiltonian
$$\sum_{x\in\Lambda} (\mu - \varphi_x)\, \psi^+_x \psi^-_x - \frac{\varepsilon}{2} \sum_{x\in\Lambda} \left( \psi^+_x \psi^-_{x+1} + \psi^+_x \psi^-_{x-1} - 2\psi^+_x \psi^-_x \right) \tag{4.13}$$
a term
$$\sum_{x\in\Lambda} \nu_x \psi^+_x \psi^-_x = \sum_{x\in\Lambda} \left( \tilde\nu^{(0)}_x + \tilde\nu^{(1)}_x + \dots + \tilde\nu^{(s_0-1)}_x \right) \psi^+_x \psi^-_x, \tag{4.14}$$
where $\tilde\nu^{(r)}_x = \bar\nu^{(r)}(\omega x)$ are any functions in $C^{s_0}(\mathbb{T})$ such that $\bar\nu^{(r)}(\omega x) = \left(\omega(x - \bar x_j)\right)^r \nu^{(r)}_j$ in a neighbourhood of radius $a_0$ and center $\omega\bar x_j$, if $a_0$ is so chosen that
the supports of the functions $\chi(\mathbf{r}' - \bar{\mathbf{x}}_j)$, with $j = 1, \dots, p$, are disjoint and the constants $\nu^{(0)}_j, \dots, \nu^{(s_0-1)}_j$ are fixed in such a way that the running coupling constants remain bounded; we refer to Appendix A.4 for details. Here we confine ourselves to stating the final result, which is a generalization of Theorem 1.4, case (1).
Theorem 4.6. Consider the fermion system described by the Hamiltonian
$$H_0 = \sum_{x\in\Lambda} (\mu - \varphi_x)\, \psi^+_x \psi^-_x - \frac{\varepsilon}{2} \sum_{x\in\Lambda} \left( \psi^+_x \psi^-_{x+1} + \psi^+_x \psi^-_{x-1} - 2\psi^+_x \psi^-_x \right), \tag{4.15}$$
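where $\varphi_x = \bar\varphi(\omega x)$ is a function in $C^s(\mathbb{T})$, of period 1 and with $\omega$ satisfying a Diophantine condition
$$\|\omega n\|_{\mathbb{T}} \ge C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z} \setminus \{0\}, \tag{4.16}$$
for some constants $\tau > 1$ and $C_0 > 0$. Let us define $\bar\omega_j \equiv \omega\bar x_j$ such that $\mu = \bar\varphi(\bar\omega_j)$ and assume that there exists $s_0 \le s$ such that
$$\max_{r \le s_0} \left| \partial^r \bar\varphi(\omega x) \right| \ge \xi, \qquad \forall x \in \{\bar x_1, \dots, \bar x_p\}, \tag{4.17}$$
for some constant $\xi$. Then there exists $\varepsilon_0 > 0$, depending on $\omega$, $\bar\omega_1, \dots, \bar\omega_p$ and $\xi$, and, for $|\varepsilon| < \varepsilon_0$, $s_0 p$ functions $\nu^{(r)}_j \equiv \nu^{(r)}_j(\varepsilon)$ (some of which possibly vanishing) with $j = 1, \dots, p$ and $r = 0, \dots, s_0 - 1$, such that, if $\bar\omega_j \notin \omega\mathbb{Z} \bmod 1$ and the additional Diophantine condition
$$\|\omega n \pm (\bar\omega_i - \bar\omega_j)\|_{\mathbb{T}} \ge C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z} \setminus \{0\}, \quad \forall i \neq j = 1, \dots, p, \tag{4.18}$$
is verified, then the two-point Schwinger function $S_{L,\beta}(x, y)$ for the system described by the Hamiltonian
$$H_\nu = H_0 + \sum_{x\in\Lambda} \left( \tilde\nu^{(0)}_x + \tilde\nu^{(1)}_x + \dots + \tilde\nu^{(s_0-1)}_x \right) \psi^+_x \psi^-_x \tag{4.19}$$
admits a limit
$$\lim_{L\to\infty} \lim_{\beta\to\infty} S_{L,\beta}(x, y) = S(x, y)$$
bounded by
$$|S(x, y)| \le \log\left(1 + \min\{|x|, |y|\}\right)^{\tau} \exp\left(-4^{-1} |x - y| \log \varepsilon^{-1}\right) \cdot \frac{C_N}{1 + \left( (1 + \min\{|x|, |y|\})^{-\tau} |x_0 - y_0| \right)^N}, \tag{4.20}$$
for any $N \ge 1$ and for some constant $C_N$ depending on $N$.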
4.7. Let us make some minor comments on the statement of the theorem above, deferring to Sect. 4.9 the main one. (1) The number of running coupling constants is in general $s_1 + \dots + s_p$, if $s_j$ is the order of the first nonvanishing derivative of $\bar\varphi(\omega x)$ at $x = \bar x_j$; as $s_j \le s_0$ for some $s_0 \le s$, the number of nonvanishing running coupling constants is $\le ps$: this motivates the statement about the number of running coupling constants in the theorem. (2) In the case of Theorem 1.4 the term (4.7) takes into account the shifting of the singularity of the propagator with respect to the free case ($\varepsilon = 0$). If the first $s_j - 1$ derivatives vanish at some critical point $\bar x_j$, such a counterterm is not sufficient and one has to make nonvanishing also the first $s_j - 1$ derivatives of the potential by modifying it, near the critical points $\bar x_j$, in the following way:
$$\varphi_x - \mu \equiv \left(\omega(x - \bar x_j)\right)^{s_j} L_x \;\to\; \nu^{(0)}_x + \nu^{(1)}_x\, \omega(x - \bar x_j) + \dots + \nu^{(s_j-1)}_x \left(\omega(x - \bar x_j)\right)^{s_j-1} + \left(\omega(x - \bar x_j)\right)^{s_j} L_x. \tag{4.21}$$
Of course if $s_j = 1$ we recover the result of Theorem 1.4.
4.8. As to case (2), generalizations could be possible, but several cases should be discussed, depending on the possible commensurability relations between the values $\bar\omega_1, \dots, \bar\omega_p$, so we prefer not to consider explicitly further extensions in this direction, also in view of the observations in the following section.
4.9. Let us discuss the physical meaning of the above results. In general, with the chemical potential fixed to an $\varepsilon$-independent value, we expect the singularities of the Schwinger functions in the two cases $\varepsilon = 0$ and $\varepsilon \neq 0$ to be different. As such singularities are physically observable, it is reasonable to fix them to some $\varepsilon$-independent value (accessible to the experiments), by introducing a suitable counterterm which leads to adding a term (4.7) to the Hamiltonian (if the formal "Luttinger theorem" held in this case, this would correspond to fixing the density of the fermion system).
In Renormalization Group language, we want to fix the dressed quantities to some ε-independent value. However, as the symmetries of H0 and V are different, except in the case discussed in Theorem 1.4, the counterterm will depend on the coordinate x; in other words we have to choose an ε-dependent potential in such a way that the singularities of the Schwinger functions are ε-independent. This is not surprising: we want to prove that the Schwinger function is a perturbation of (2.3), in which the potential ϕx appears, but we know that the eigenvalues of the Schrödinger equation are of the form ϕx + εδϕx , for some suitable function δϕx ; see [S,E]. Note that in the particular case of Theorem 1.4 the counterterm is a constant (so that (4.7) becomes proportional to the fermion density). So in such a case another interpretation is possible: one has to fix the chemical potential to an ε-dependent value so that the singularities of the Schwinger functions are ε-independent. Having nonconstant counterterms is what usually happens in interacting fermion systems. For instance in order to study a system of two-dimensional fermions with a non-rotation invariant dispersion relation and interacting with a weak two-body potential, one has to introduce an angle-dependent counterterm, see [FST]. If one is interested in studying a Hamiltonian with no counterterms (but it is questionable if this is the right thing to do: this corresponds to consider as physically observable the bare instead of the dressed quantities), one has an implicit function to solve, which is nontrivial in our case because of the small divisors problem.
G. Gentile, V. Mastropietro
Appendix A1. Convergence of the Effective Potential
A1.1. Let us consider the quantity $W_n^{(h)}(x, x + m; k_0)$ introduced in Sect. 3:
$$W_n^{(h)}(x, x + m; k_0) = \sum_{\vartheta \in \mathcal{T}^h_{n,m}} \mathrm{Val}(\vartheta). \tag{A1.1}$$
We define the depth of a cluster $T$ inductively by setting $D_{T'} = D_T + 1$, if $T'$ is the cluster immediately containing $T$, and considering the vertices as clusters with depth 0; the graph $\vartheta$ is a cluster with maximal depth $D_\vartheta$. Given a cluster $T$ we shall say that a cluster $\tilde T$ is maximal in $T$ if it is contained inside $T$ but not in any other cluster inside $T$. We introduce the following notations:
• $\mathcal{T}$ is the set of clusters contained in $\vartheta$ (including $\vartheta$);
• $\mathcal{V}$ is the set of resonances;
• $T_0$ is the set of lines and vertices inside $T$ and outside the clusters internal to $T$;
• $M_T$ is the number of vertices in $T$;
• $M_{T_0}$ is the number of vertices in $T_0$;
• $M_{T_0}^{(\nu)}$ is the number of ν-vertices in $T_0$;
• $L_{T_0}$ is the number of lines in $T_0$;
• $k_T^0$ is the number of maximal nonresonant clusters in $T$;
• $k_T^R$ is the number of maximal renormalized resonant clusters in $T$;
• $k_T$ is the number of maximal clusters in $T$;
• $D_T$ is the depth of the cluster $T$;
• $\mathcal{T}_k$ is the set of clusters with depth $D_T = k$, $\mathcal{T}_k = \{T \in \mathcal{T} : D_T = k\}$.
Then, in (A1.1), one can write
$$\sum_{\vartheta \in \mathcal{T}^h_{n,m}} \mathrm{Val}(\vartheta) = \sum_{\substack{\delta_1, \dots, \delta_n \\ \delta_1 + \dots + \delta_n = m}} \varepsilon^n \sum_{\{h_A\}} \Big[ R \prod_{A \in \mathrm{int}(\vartheta)} \tilde g_A \Big] \prod_{T \in \mathcal{T}} \Big( \frac{\gamma^{h_T} \nu_{h_T}}{\varepsilon} \Big)^{M_{T_0}^{(\nu)}}, \tag{A1.2}$$
with the notations introduced in §3 (in particular $\tilde g_A$ is a shorthand for $\tilde g^{(h_A)}_{\rho^1_A, \rho^2_A}(r_A)$).
Recall (2.44) and (2.45); by using in case (1) of the theorem that $|m_T| \le M_T$, one has
$$2a_0 \gamma^{h_T}/|v_0| \ge C_0 |m_T|^{-\tau} \ge C_0 M_T^{-\tau}, \tag{A1.3}$$
by (1.5). In case (2) of the theorem one finds, by using that $|m_T| \le M_T + (2k - 1)(M_T - 1)$,
$$2a_0 \gamma^{h_T}/|v_0| \ge C_0 \left(|m_T| + (2k + 1)\right)^{-\tau} \ge C_0 \left[M_T (2k + 2)\right]^{-\tau}, \tag{A1.4}$$
as $2\omega = (2k + 1)\bar\omega \bmod 1$. We can express both (A1.3) and (A1.4) through the single inequality
$$2a_0 \gamma^{h_T}/|v_0| \ge C_0 (\alpha M_T)^{-\tau}, \qquad \alpha = \begin{cases} 1, & \text{case (1)}, \\ 2k + 2, & \text{case (2)}. \end{cases} \tag{A1.5}$$
In particular, if $T = \vartheta$, one finds $|m| \le \alpha n$.
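The passage from the first to the second bound in (A1.4) is elementary. A short worked check, assuming only $M_T \ge 1$, $k \ge 0$ and $\tau > 0$ (the positivity of $\tau$ is an assumption here, as the exponent is fixed elsewhere in the paper), might read:

```latex
\begin{align*}
|m_T| + (2k+1) &\le M_T + (2k-1)(M_T-1) + (2k+1) \\
               &= 2k\,M_T + 2 \\
               &\le (2k+2)\,M_T \qquad (\text{since } M_T \ge 1),
\end{align*}
% and therefore, for \tau > 0,
\[
\bigl(|m_T| + (2k+1)\bigr)^{-\tau} \;\ge\; \bigl[M_T(2k+2)\bigr]^{-\tau}.
\]
```

This is the inequality that fixes $\alpha = 2k + 2$ in case (2) of (A1.5).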
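Lemma A1.2. For any $\vartheta \in \mathcal{T}^h_{n,m}$, one has
$$|\varepsilon|^{n/2} = \prod_{v \in \vartheta} |\varepsilon|^{1/2} \le \prod_{T \in \mathcal{T}} |\varepsilon|^{2^{-(D_T+2)} M_T}, \tag{A1.6}$$
if $M_T$ is the number of vertices in $T$ and $D_T$ is the depth of $T$.
Proof A1.3. We prove by induction on the depth $k \in [0, D_\vartheta]$ the following bound:
$$\prod_{T \in \mathcal{T}_k} \prod_{v \in T} |\varepsilon|^{1/2} \le \Big( \prod_{p=0}^{k-1} \prod_{T \in \mathcal{T}_p} |\varepsilon|^{2^{-(p+2)} M_T} \Big) \prod_{T \in \mathcal{T}_k} |\varepsilon|^{2^{-(k+1)} M_T}, \tag{A1.7}$$
where the product in the first parentheses has to be thought of as 1 for $k = 0$. Then, for $k = 0$, (A1.7) is a trivial identity. Suppose that (A1.7) holds for $k - 1$; then we show that it holds also for $k$. In fact, by denoting by $M_{T_0}$ the number of vertices in $T_0$ (i.e. internal to $T$ and external to all clusters contained inside $T$), one has
$$\begin{aligned}
\prod_{T \in \mathcal{T}_k} \prod_{v \in T} |\varepsilon|^{1/2}
&= \Big( \prod_{T \in \mathcal{T}_k} \prod_{v \in T_0} |\varepsilon|^{1/2} \Big) \Big( \prod_{T \in \mathcal{T}_{k-1}} \prod_{v \in T} |\varepsilon|^{1/2} \Big) \\
&\le \Big( \prod_{T \in \mathcal{T}_k} |\varepsilon|^{M_{T_0}/2} \Big) \Big( \prod_{p=0}^{k-2} \prod_{T \in \mathcal{T}_p} |\varepsilon|^{2^{-(p+2)} M_T} \Big) \prod_{T \in \mathcal{T}_{k-1}} |\varepsilon|^{2^{-k} M_T} \\
&\le \Big( \prod_{T \in \mathcal{T}_k} |\varepsilon|^{M_{T_0}/2} \Big) \Big( \prod_{p=0}^{k-1} \prod_{T \in \mathcal{T}_p} |\varepsilon|^{2^{-(p+2)} M_T} \Big) \prod_{T \in \mathcal{T}_{k-1}} |\varepsilon|^{2^{-(k+1)} M_T} \\
&\le \Big( \prod_{p=0}^{k-1} \prod_{T \in \mathcal{T}_p} |\varepsilon|^{2^{-(p+2)} M_T} \Big) \prod_{T \in \mathcal{T}_k} |\varepsilon|^{2^{-(k+1)} M_T},
\end{aligned} \tag{A1.8}$$
so proving (A1.7). By taking $k = D_\vartheta$, (A1.6) follows. □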
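As a sanity check on (A1.6), and not as a replacement for the induction above, one can also estimate the total exponent directly. The sketch below assumes $|\varepsilon| \le 1$ and that clusters of equal depth are pairwise disjoint, so that $\sum_{T \in \mathcal{T}_p} M_T \le n$ for each depth $p$; the four-vertex graph in the comments is a hypothetical illustration, not an example taken from the paper.

```latex
\[
\sum_{T\in\mathcal T} 2^{-(D_T+2)} M_T
  \;=\; \sum_{p=0}^{D_\vartheta} 2^{-(p+2)} \sum_{T\in\mathcal T_p} M_T
  \;\le\; n \sum_{p=0}^{\infty} 2^{-(p+2)}
  \;=\; \frac{n}{2}.
\]
% Hypothetical example: n = 4 vertices, two depth-1 clusters with 2 vertices each,
% and the whole graph at depth D_\vartheta = 2:
%   depth 0:  4 \cdot 2^{-2} \cdot 1 = 1
%   depth 1:  2 \cdot 2^{-3} \cdot 2 = 1/2
%   depth 2:  1 \cdot 2^{-4} \cdot 4 = 1/4
% Total exponent 7/4 \le 2 = n/2, hence |\varepsilon|^{2} \le |\varepsilon|^{7/4}.
```

Since $|\varepsilon| \le 1$, a smaller exponent gives a larger quantity, which is the direction of the inequality in (A1.6).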
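Lemma A1.4. For any tree $\vartheta$ one has
$$\prod_{\substack{T \in \mathcal{T} \\ D_T \ge 0}} |\varepsilon|^{2^{-(D_T+2)} M_T} \le \prod_{\substack{T \in \mathcal{T} \\ D_T \ge 1}} |\varepsilon|^{2^{-(D_T+1)} C_1 \gamma^{-h_T/\tau} k_T^0}, \tag{A1.9}$$
where $C_1 = (C_0 |v_0|/2a_0)^{1/\tau} \alpha^{-1}$ and $k_T^0$ is the number of nonresonant clusters in $T_0$.
Proof A1.5. One has
$$\prod_{\substack{T \in \mathcal{T} \\ D_T \ge 0}} |\varepsilon|^{2^{-(D_T+2)} M_T} = \prod_{\substack{T \in \mathcal{T} \\ D_T \ge 1}} \; \prod_{\substack{\tilde T \in \mathcal{T} \\ \tilde T' = T}} |\varepsilon|^{2^{-(D_{\tilde T}+2)} M_{\tilde T}}, \tag{A1.10}$$
where we denote by $T'$ the cluster immediately containing $T$. By taking into account that $\min\{h_{A^o_T}, h_{A^i_T}\} = h_T$ by construction and $\tilde T' = T$, one has
$$\alpha M_{\tilde T} \ge |m_{\tilde T}| \ge C_2 \gamma^{-h_T/\tau}, \qquad C_2 = (C_0 |v_0|/2a_0)^{1/\tau}, \tag{A1.11}$$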
by (2.44), provided that mT˜ = 0. Note that DT˜ + 1 = DT and kT −(D ˜ +2) −(D ˜ +2) T T MT˜ MT˜ |ε|2 = |ε|2 ,
(A1.12)
T ∈T T˜ =T
if kT0 is the number of maximal clusters in T . For any nonresonant cluster T˜ ( i.e. with mT˜ = 0) one can use (A1.10). Then (A1.9) follows. ! h one has Lemma A1.6. For any tree ϑ ∈ Tn,m ( )( ) n−1 −hT LT0 hV −hV , g˜ A ≤ 2 (G3 γ ) γ R T ∈T
A∈int(ϑ)
(A1.13)
V ∈V
where G3 = max{G1 , a0 G2 }, with G1 and G2 being defined, respectively, in (2.40) and in (2.48). h and n > 1 (the case n = 1 is trivial, except for the ν-vertex). Proof A1.7. Let ϑ ∈ Tn,m Let us consider the collection V1 of maximal resonances, i.e. resonances which are not strictly contained in any other resonance. If V is such a resonance, AiV and AoV are its external lines, and x i = xA o . Then
R
AV
V
g˜ A =
A∈intϑ
g˜ A
g˜ Ai g˜ AoV RMhVV (rA i ) ,
V ∈V1
A∩V1 =∅
V
(A1.14)
V
where (h )
• g˜ A is a shorthand for g˜ ρ 1A,ρ 2 (rA ), if A is an internal line of ϑ, g˜ A = 1 otherwise; A A • A∩V1 =∅ g˜ A = 1, if ϑ itself is a resonance (so that all lines intersect V1 );
• the resonance value (i.e. the kernel corresponding to a resonance) MhVV (r i ) is given AV
by MhVV (rA i ) =
V
A∈V : A∩V2 =∅
g˜ A
W ∈V2 ∩V
g˜ Ai g˜ AoW RMhWW (rA i ) ,
W
W
(A1.15)
where V2 is the collection of resonances which are strictly contained inside some resonance in V1 , and which are maximal, and V2 ∩ V is the subset of resonances in V2 which are contained in V . We can write RMhVV (r i ) as AV
RMhVV (rA i ) ≡ MhVV (xA i , k0 ) − MhV (0, 0) V V 1 d = dtV MhV (tV xA i , tV k0 )) . dtV V 0
(A1.16)
Note that MhV (tV x i , tV k0 ) can be written as in (A1.15), by substituting the argument AV
xA of any line A with tx i + r˜A , for suitable values of r˜A . Therefore the r.h.s. of (A1.16) AV
can be written as a sum of terms of the form (A1.15) with a derivative d/dtV acting either
(1) on one of the propagators corresponding to a line outside V2 , or (2) on one of the RMhWW . In case (2), we write d d hW MW (tV xA i , tV k0 ) − MhWW (0, 0) RMhWW (tV xA i , tV k0 ) = dtV dtV V V d hW = M (tV xA o , tV k0 ), V dtV W
(A1.17)
so that, if the derivative corresponding to a resonance V acts on the value of some resonance W ⊂ V , one can replace with 1 the R operator corresponding to W . We can now iterate this procedure, by applying to MhWW (tV x i , tV k0 ) Eq. (A1.14), AV
with V3 (the family of resonances which are strictly contained inside some resonance belonging to V2 in place of V2 ), and so on. At the end (A1.14) can be written as a sum of MV − 1 terms, if MV denotes the number of vertices contained in V , which can be described in the following way. (1) There is one term for each line A¯ ∈ V ; (2) if A¯ ∈ T0 , where T is a cluster contained in V (note that T can be equal to V ), and T = Tr ⊂ Tr−1 . . . ⊂ T1 = V is the chain of r clusters containing T and contained in V , then the graph value can be computed by replacing with 1 the R operator acting on Ti , i = 1, . . . , r, even if Ti is a resonance, because of the comments after (A1.16); (3) the R operation acts on all other resonances contained in V ; ¯ whose argument is of the form (4) the derivative d/dtV acts on the propagator of A, (tV x i + r˜A¯, k0 ). AV
A similar decomposition of the resonance value is now applied, for each term of the previous sum, to all resonant clusters, which are still affected by the R operation. This procedure is iterated, until no R operation is explicitly present; it is easy to see that we end with an expression of the form R
A∈int(ϑ)
g˜ A =
d dA dt1 . . . dts g˜ A , dti(A)
(A1.18)
T ∈T A∈T0
where the sum is over all possible choices of s, {dA } and {i(A)}, which satisfy the following conditions: (1) dA is equal to 0 or 1; (2) if dA = 0, i(A) is arbitrarily defined, otherwise i(A) ∈ {1, . . . , s} and i(A) = i(A ), if A = A ; (3) the number of lines for which dA = 1 is equal to the number of interpolating parameters s; (4) for each derived line A there is a chain of r clusters T = Tr ⊂ Tr−1 . . . ⊂ T1 = V , such that A ∈ T0 and V is a resonance; (5) no cluster can belong to more than one chain of clusters; (6) each resonance belongs to one of the chains of clusters;
(7) the argument of the derived line is of the form ti(A) x + r˜A , with |x | ≤ ao γ −hV (in general x is not xA 0 , but it can depend also on the interpolation parameters V
corresponding to resonances containing V , if any), where hV is the scale of the smaller cluster containing it.
The item (7) above implies that, for each derived line, d h −hA −hA g ˜ γ , A ≤ a0 G2 γ V dt
(A1.19)
i(A)
(see also (2.48)). Note that hV − hA =
r i=1
hTi − hTi ,
(A1.20)
with the notations of item (4); hence the “gain” γ hV −hA in the bound (A1.24), with respect to the bound of a nonderived propagator, can be divided between the clusters of the chain h −hT
associated to the derived line A, so that each cluster has a factor γ Ti ≤ 1 associated with it; in particular we have a factor of this type associated with each resonance, for each term in the sum of (A1.18). Since the number of terms in this sum is bounded by 2n−1 , we obtain the bound (A1.13). ! A1.8. of the previous lemmata, one has −(DT +1) C γ −hT /τ k 0 1 T |ε|2 Val(ϑ) ≤ 2n−1 |ε|n/2 ( ·
T ⊂T DT ≥1
)( (G3 γ −hT )LT0
T ∈T
) (ν) γ hT νh MT0 T , γ hV −hV |ε|
V ∈V
(A1.21)
T ∈T
(ν)
where LT0 and MT0 denote, respectively, the number of lines and the number of νvertices in T0 . Lemma A1.9. If |νh | ≤ B|ε| for any h ≤ 1 and for some constant B, then one has Val(ϑ) ≤ γ h(ϑ) CD n−1 |ε|n/2 , h ϑ∈Tn,m for some constants C, D.
(A1.22)
Proof A1.10. One has the following (obvious) relations: hT ≤ −DT + 2, LT0 = kT − 1, kT =
(ν) MT0
+ kTR
∀T ∈ T, (A1.23) + kT0 ,
with the notations listed at the beginning of this section. One can write in (A1.21) γ −hT LT0 = γ hT γ and use that
(
T ∈T
)( γ
−hT kTR
(ν) 0
−hT MT
0
) γ
R
γ −hT kT γ −hT kT ,
hV −hV
=
V ∈V
(A1.24)
γ −hV ,
(A1.25)
V ∈V
as kTR is the number of resonances contained inside T , i.e. the number of resonances V ∈ V such that V = T . Then one can bound in (A1.21) ( )( )( ) h M (ν) T T 2−(DT +1) C1 γ −hT /τ kT0 −hT LT0 hV −hV 0 |ε| (γ ) γ γ T ∈T
T ⊂T DT ≥1
V ∈V
T ∈T
0 −(DT +1) C γ −hT /τ k 0 1 T, ≤ γ h(ϑ) γ −hT kT |ε|2
(A1.26)
T ⊂T DT ≥1
as ϑ ∈ V so that, by using the first relation in (A1.23), one sees that each sum over hT ≤ −DT + 2 can be easily performed in (A1.2), if kT0 = 0. In fact if γ is so large that γ˜ ≡ γ 1/τ /2 > 1, if, ∀N > 0, CN is such that
CN , (A1.27) exp − log ε−1 2−3 C1 γ˜ r ≤ 1 + (2−3 C1 log ε−1 γ˜ r )N (one can take CN = 1 + N !) and if N is so that γ˜ N ≥ 2γ , then
≤ ≤
γ −hT |ε|2
hT ≤−DT +2 ∞
k 0 r −2−3 logε−1 C1 γ˜ r T
γ e
r=DT −2
C4 2−DT
k 0 T
−(DT +1) C γ −hT /τ 1
k 0 T
≤
∞
γ r e−2
r=DT −2
≤
∞
r=DT −2
−3 C logε −1 (γ 1/τ /2)r 1
1 + (2−3 C
k 0 T
k 0 CN γ r T −1 N r log |ε |) (2γ ) 1 (A1.28)
#N " where C4 = 8CN / 2−3 C1 log ε−1 . The sum over {hT } would give some bad factor, when kT0 = 0, but it turns out that there is indeed no sum in this case. In fact, if all the clusters and vertices strictly contained
in T are resonant ( i.e. if kT0 = 0), then T itself must be a resonance and all its internal lines have the same x as the external ones, implying, by support properties of the fh functions, that the scale label of the external lines is equal to hT − 1. ! A1.11. As n ≥ |m|/α then |ε|
n/4
' |m| −1 log ε , ≤ exp − 4α &
(A1.29)
so that Lemma 2.10 follows from Lemma A1.9. Appendix A2. The Flow of the Running Coupling Constants A2.1. We still have to check that the bound on |νh | stated in Lemma A1.9 of the previous section is satisfied. Note that, ∀h < 0, sh = σh − σh+1 = νh = γ νh+1 + γ
−h
∞ q=2 ∞ q=2
(h)
W q (x, ¯ −x; ¯ 0), (A2.1) (h) W q (x, ¯ x; ¯ 0).
(h)
¯ ±x; ¯ 0) admit an expansion in terms of graphs ϑ, differing from the where W q (x, (h)
¯ ±x; ¯ 0) in the following respects: corresponding expansion of Wq (x, (1) the R operation on the whole graph, which is necessarily a resonance, is substituted with the localization operation, hence in the previous analysis ϑ must not be included in the set V; (2) the internal scale of ϑ is equal to h + 1, that is there is in the graph at least one line of frequency h + 1. Remark A2.2. If all maximal clusters strictly contained in ϑ are resonant, as well as the vertices belonging to ϑ0 , that is if kϑ0 = 0, then Val(ϑ) = 0; the same holds if kϑ0 = 1. This follows from the support properties of the propagators, from the definition of resonance and from the observation that all lines A ∈ ϑ would have x = 0, if kϑ0 = 0, since xA 1 = xA n+1 = 0 for the external lines. Lemma A2.3. If |νh | ≤ B|ε| for any h ≤ 1 and for some constant B, then, for any N , one has (h) n−1 n/2 ¯ ±x; ¯ 0) ≤ γ Nh CN DN |ε| , (A2.2) W n (x, for suitable constants CN , DN . Proof A2.4. Item (2) in Sect. A2.1 implies that ϑ is a cluster on scale h+1; by the remark Sect. A2.2 one has kϑ00 = 0. Then we can bound the value of each graph contributing (h)
to W n (x, ¯ ±x; ¯ 0) as in the proof of Lemma A1.9. The only difference is that now, in
(A1.26), the product in the right-hand side is also on ϑ itself, and there is no sum on h(ϑ) as h(ϑ) = h + 1. Therefore we can bound the factor corresponding to T = ϑ as 0 −1 1/τ −(Dϑ +1) C γ −h(ϑ)/τ kϑ0 −3 −(h+1) 1 γ h(ϑ) γ −h(ϑ) |ε|2 ≤ e−2 C1 log ε (γ /2) (A2.3) AN hN ≤ ≤ B γ , N 1 + (2−3 C1 log |ε−1 |)N γ˜ −(h+1)N for suitable constants AN , BN . As all other factors can be bounded as before, the bound (A2.2) follows. ! A2.5. Let us find a bound for σh . In case (1) of the theorem there is nothing to prove (h) ¯ ±x; ¯ 0) = 0 for as by definition σh ≡ 0. In the second case note that in (A2.1) W n (x, all n ≤ 2k + 1, as it is easy to see by using (2.47), (3.13) and (A2.1). Using (A2.1) and (A2.2) then for k ≥ 1, |σh | ≤ C|ε|(2k+1)/4 , for some constant C. Note that no lower bounds are in general found. Nevertheless in the case k = 0, the first contribution to σ arises from the perturbation, so that the value σ is explicitly computable: σ = −ε/2 + o(ε). A2.6. We discuss now the flow of the running coupling constant νh . The discussion is identical in case (1) or (2) of the theorem, for the lacking of lower bounds for σh (we cannot use the infrared cut-off for the flow of νh ). Define in (A2.1) βh+1 (ε; [νh+1 , ν1 ]) = γ −h
∞ q=2
(h)
W q (x, ¯ x; ¯ 0),
(A2.4)
[νh+1 , ν1 ] ≡ (νh+1 , νh+2 , . . . , ν0 , ν1 ), so that, by iteration, one finds ∀h ≤ 0, 1 νh = γ −h+1 ν1 + γ k−2 βk (ε; [νk , ν1 ]) .
(A2.5)
k=h+1
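The iteration leading to (A2.5) can be made explicit. The sketch below reads the second relation in (A2.1), together with the definition (A2.4), as the one-step recursion $\nu_h = \gamma \nu_{h+1} + \beta_{h+1}$; this compact form, and the bracketed grouping in the last line, are an assumed reading of the formulas above rather than a statement quoted from the text.

```latex
\begin{align*}
\nu_h &= \gamma\,\nu_{h+1} + \beta_{h+1} \\
      &= \gamma^{2}\,\nu_{h+2} + \gamma\,\beta_{h+2} + \beta_{h+1} \\
      &\;\;\vdots \\
      &= \gamma^{-h+1}\,\nu_1 + \sum_{k=h+1}^{1} \gamma^{\,k-h-1}\,\beta_k \\
      &= \gamma^{-h+1}\Bigl[\,\nu_1 + \sum_{k=h+1}^{1} \gamma^{\,k-2}\,\beta_k \Bigr].
\end{align*}
```

After $1 - h$ steps the recursion reaches scale 1, and the factor $\gamma^{k-h-1} = \gamma^{-h+1} \gamma^{k-2}$ accounts for the weight $\gamma^{k-2}$ appearing in (A2.5).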
Remarks A2.7. (i) Note that, in any contribution to $\beta_k(\varepsilon; [\nu_{h+1}, \nu_1])$ containing at least one vertex $\nu_{h'}$, for some $h' \ge k + 1$, there must be at least two nonresonant vertices (see Remark A2.2). (ii) There are contributions to $\beta_k(\varepsilon; [\nu_{h+1}, \nu_1])$ containing only nonresonant vertices (at least two of them).
A2.8. Introduce a sequence $\{\nu_h^{(n)}\}$, with $n \ge 0$, defined recursively, for any $h \le 0$, as
$$\nu_h^{(0)} = 0, \qquad \nu_h^{(n)} = \gamma^{-h+1} \Big[ \nu_1^{(n-1)} + \sum_{k=h+1}^{1} \gamma^{k-2} \beta_k^{(n-1)} \Big], \qquad \beta_k^{(n-1)} = \beta_k(\varepsilon; [\nu_k^{(n-1)}, \nu_1^{(n-1)}]), \tag{A2.6}$$
and set $\beta_k^{(0)} = 0$.
Lemma A2.9. If for any $n \ge 0$ one formally sets
$$\nu_1^{(n)} = - \sum_{k=-\infty}^{1} \gamma^{k-2} \beta_k^{(n)}, \tag{A2.7}$$
then the sequence $\{\nu_h^{(n)}\}$ converges uniformly to a limit $\nu_h$ such that
$$|\nu_h| \le B\, |\varepsilon|\, \gamma^{(N-1)h}, \tag{A2.8}$$
for all h ≤ 1 and for some constant B. Proof A2.10. We show by induction that ∀h ≤ 1, (n) (n) (n−1) (n−1) ≤ B0 γ (N−1)h |ε|n , βh − βh ≤ B1 γ (N−1)h |ε|n , νh − νh
(A2.9)
for some constants B0 and B1 . (0) For n = 1, by considering that νh = 0 for any h, the bound (A2.2) and Remark (ii) above give (1) (A2.10) βh ≤ B1 γ (N−1)h |ε| , (1)
for some constant B1 , so that by defining ν1 as in (A2.6) for n = 1, one obtains (1) (1) (A2.11) ν1 ≤ B0 γ (N−1) |ε|n , νh ≤ B0 γ (N−1)h |ε|n ∀h ≤ 0, so proving (A2.9) for n = 1. (n+1) (n) − βh can be written as a sum of If n > 1 and (A2.9) holds for n, then βh (n) (n−1) values of graphs in which there is at least one vertex with νh − νh , for some h ≥ h (and at least two other vertices; see Remark A2.7, (i), above). Therefore, as (n) (n−1) (n+1) (n) − βh has |νh − νh | ≤ B0 γ (N−1)h |ε|n and any graph contributing to βh to contain two nonresonant vertices (see Remarks A2.7), the second bound in (A2.9) follows by using (A2.2). Then (A2.6) and (A2.7) imply the first one. (n) Therefore {νh } converges uniformly to a limit νh , and νh verifies the bound (A2.8) for h ≤ 0. Moreover ν1 , which is given by the limit of (A2.6), for n → ∞ verifies (A2.7) for h = 1. ! A2.11. If ϕx is not even one has four running coupling constants (see Sect. 4.2). The flow of the constants σ1h and σ2h is controlled exactly as in Sect. A2.5. As far as the constants ν1h and ν2h are concerned, one can define the functions βj h (ε; [ν1h , ν11 ], [ν2h , ν21 ]),
(A2.12)
with j = 1, 2 and with the notations of (A2.4); it is easy to check that, formally setting (n)
ν1j = −
1 k=−∞
(n)
(n)
(n)
(n)
(n−1)
γ k−2 βj k
,
j = 1, 2,
(n)
(n)
(A2.13) (n)
with βj k = βj k (ε; [ν1k , ν11 ], [ν2k , ν21 ]), the sequences {ν1h } and {ν2h }, defined as in (A2.6), with the obvious modifications, converge uniformly to two limits ν1h and ν2h , respectively, such that |ν1h | , |ν2h | ≤ B |ε| γ (N−1)h , so that the same conclusions as in the even case are obtained.
(A2.14)
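To see why a choice of the form (A2.7) is compatible with bounds such as (A2.8) and (A2.14), one can combine it with the iterated form of the flow. The sketch below assumes that the flow reads $\nu_h = \gamma^{-h+1}[\nu_1 + \sum_{k=h+1}^{1} \gamma^{k-2} \beta_k]$, that $|\beta_k| \le B_1 \gamma^{(N-1)k} |\varepsilon|$ for all $k \le 1$, and that $\gamma > 1$, $N \ge 1$; the explicit prefactor in the last line is illustrative and is not the constant $B$ of the text.

```latex
\begin{align*}
\nu_h &= \gamma^{-h+1}\Bigl[-\sum_{k=-\infty}^{1}\gamma^{k-2}\beta_k
          + \sum_{k=h+1}^{1}\gamma^{k-2}\beta_k\Bigr]
       = -\,\gamma^{-h+1}\sum_{k=-\infty}^{h}\gamma^{k-2}\beta_k, \\[4pt]
|\nu_h| &\le B_1\,|\varepsilon|\,\gamma^{-h-1}\sum_{k=-\infty}^{h}\gamma^{Nk}
         = B_1\,|\varepsilon|\,\gamma^{-h-1}\,\frac{\gamma^{Nh}}{1-\gamma^{-N}}
         = \frac{B_1\,\gamma^{-1}}{1-\gamma^{-N}}\;|\varepsilon|\,\gamma^{(N-1)h}.
\end{align*}
```

The cancellation in the first line removes the contributions from scales above $h$, so only the geometrically small terms with $k \le h$ remain; this is what allows the running coupling constant to decay as $h \to -\infty$ instead of growing like $\gamma^{-h}$.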
Appendix A3. Convergence of the Schwinger Functions (h)
A3.1. Let ϑ be one of the graphs contributing to the kernel Kφ,φ (x, y), and let us consider the two vertices, v1 and vq , connected to the external lines (which are associated with the external field): such vertices will be called the external vertices. Suppose first that neither v1 nor vq are contained in any cluster, different from ϑ itself. In this case, we can bound Val(ϑ) as in Appendix A1, by taking into account that (1) there is no factor associated to the external vertices; (2) h(ϑ) = h + 1; (3) there are at least two lines of scale h + 1, the external propagators. Hence we get a bound differing from (A1.29) only because the power of |ε| is n − 2 instead of n and each external propagator gives a contribution proportional to γ −h(ϑ) . 0 n −3 log |ε −1 |C γ −hT /τ kT0 4M −h(ϑ) −h −2 T 1 |Val(ϑ)| ≤ γ , (A3.1) γ Te |ε| 2 C2 T ∈T
where the same notation of Appendix A1 is used, except for the definition of MT , which differs from the previous one, since we do not consider the external vertices in the (2) calculation of MT ; moreover we assigned a label nv = 0 to the external vertices. A3.2. Suppose now that v1 is contained in some cluster strictly contained in ϑ and that the scale of the external propagator emerging from v1 is h1 . In this case, there is a chain of clusters T (1) ⊂ T (2) . . . ⊂ T (r) = ϑ, such that v1 ∈ T (i) and hT (1) = h1 ; moreover R = 1 on T (i) , i = 1, . . . , r, even if T (i) is a resonance. We proceed again as in Appendix A1, but we have to take into account the lack of the he −h factor γ T (i) T (i) , which was present before, when T (i) is a resonance. Since ϑ is not a resonance (by definition) and heT (i) = hT (i+1) , we loose at most a factor γ h(ϑ)−hT (1) = γ h+1−h1 . If we also consider the bound of the external propagator emerging from v1 , we see that the overall effect of the vertex v1 in the bound of Val(ϑ) is to add a factor γ −h−1 to the expression in the r.h.s. of (A1.29), that is the same effect that we should get, if the only cluster containing v1 was ϑ. A similar argument can be used for studying the effect of the vertex vq . Hence we (h) get the bound (A2.6) for all graphs contributing to Kφ,φ (x, y). Appendix A4. Proof of the Theorem 4.6 A4.1. As the proof of Theorem 4.6 proceeds very similar to that of Theorem 1.4, we only outline the main differences and show how they can be dealt with. A4.2. Suppose that ϕ(ω ¯ x¯j ) = µ for p points x¯1 , . . . , x¯p . For each j one introduce sj (0)
(s −1)
running coupling constants νhj , . . . , νhjj , if sj is the first nonvanishing derivative of ϕ(ωx) ¯ at x = x¯j . Then we can set s0 = max{s1 , . . . , sp }. In defining the scales, we set (j )
r 2 ≡ k02 + (v0 )2 ωx T j , 2s
(A4.1)
where x = x¯j + x when ωx − ω¯ j T is small, so that, if (j ) "
ϕx +εx¯j − µ = v0
ωx
#sj
j
+ 4x ,
(j )
v0 =
1 sj ∂ ϕ(ω ¯ x¯j ), sj !
(A4.2)
then, by setting t0 = a0 /γ , one has that (j )
|v0 | ωx T ≤ t0 γ (h+1)/sj ,
(A4.3)
when the corresponding propagator is on scale h. Then we introduce the running coupling constants as in (4.10). Taking into account the bound (A4.3) on ωx T , one proves a result like Lemma A1.9, provided, again, all running coupling constants remain bounded by B|ε|, for some constant B. Note that it has been aiming at having the condition expressed in such a way that the writing (4.10) for the running coupling constants imposes itself as the natural one. The only difference is that now each resonance has to be renormalized to sj order: this means that one has to derive up to sj − 1 times the propagators, but a condition r d (hA ) 2sj −2h g ˜ (tr + r ) t ∈ [0, 1], (A4.4) 2 ≤ G2 r2 γ dt r εA1 ,εA2 1 holds, for some constant G2 (in general G2 grows as s0 !). A4.3. One can then proceed like in the proof of Theorem 1.4, by using that the bound in Lemma A2.3 can be extended to the quantity ∂r W (x¯j , x¯j ; 0), ∂(ωx )r
1 ≤ r ≤ sj ,
j = 1, . . . , p.
(A4.5)
One finds that it is possible to choose the running coupling constants on scale h = 1, (r) ν1j , by proceeding exactly as in Sect. A2.8, and setting (r,n)
νj 1
=−
1 k=−∞
(r,n)
γ k−2 βj k ,
(A4.6)
with obvious meaning of the symbols. The analysis is as in Sect. A2.8 and the same conclusions are obtained. Then the running coupling constants on scale h = 1 define the counterterms appearing in (4.14). Acknowledgements. We thank IHES for hospitality while part of this work was done.
References
[AAR] Aubry, S., Abramovici, G., Raimbaut, J.: Chaotic polaronic and bipolaronic states in the adiabatic Holstein model. J. Stat. Phys. 67, 675–780 (1992)
[BLS] Belissard, J., Lima, R., Scoppola, E.: Localization in ν-dimensional incommensurate structures. Commun. Math. Phys. 88, 465–477 (1983)
[BLT] Belissard, J., Lima, R., Testard, D.: A metal-insulator transition for almost Mathieu model. Commun. Math. Phys. 88, 207–234 (1983)
[BGM] Benfatto, G., Gentile, G., Mastropietro, V.: Electrons in a lattice with an incommensurate potential. J. Stat. Phys. 89, 655–708 (1997)
[BGK] Bricmont, J., Gawedzki, K., Kupiainen, A.: KAM theorem and quantum field theory. Commun. Math. Phys. 201, 699–727 (1999)
[E] Eliasson, L.H.: Discrete one-dimensional quasi-periodic Schrödinger operators with pure point spectrum. Acta Math. 179, 153–196 (1997)
[FST] Feldman, J., Salmhofer, M., Trubowitz, E.: Perturbation theory around nonnested Fermi surfaces. I. Keeping the Fermi surface fixed. J. Stat. Phys. 84, 1209–1336 (1996)
[FT] Feldman, J., Trubowitz, E.: Renormalization on classical mechanics and many body quantum field theory. J. Anal. Math. 58, 213–247 (1992)
[G1] Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164, 145–156 (1994)
[G2] Gallavotti, G.: Invariant tori: A field theoretic point of view on Eliasson’s work. In: Advances in Dynamical Systems and Quantum Physics, Ed. R. Figari, Singapore: World Scientific, 1995, pp. 117–132
[GGM] Gallavotti, G., Gentile, G., Mastropietro, V.: Field theory and KAM tori. Math. Phys. Electron. J. 1, paper 5, pp. 1–13 (1995)
[GM] Gentile, G., Mastropietro, V.: Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications. Rev. Math. Phys. 8, 393–444 (1996)
[M] Mastropietro, V.: Small denominators and anomalous behaviour in the Holstein–Hubbard model. Commun. Math. Phys. 201, 81–115 (1999)
[NO] Negele, J.W., Orland, H.: Quantum many particle systems. New York: Addison-Wesley, 1988
[PF] Pastur, L., Figotin, A.: Spectra of random and quasi-periodic operators. Berlin: Springer, 1991
[S] Sinai, Ya.G.: Anderson localization for one-dimensional difference Schrödinger operator with quasiperiodic potential. J. Stat. Phys. 46, 861–909 (1987)
Communicated by A. Kupiainen
Commun. Math. Phys. 215, 105 – 118 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Reeh–Schlieder Property for Quantum Fields on Stationary Spacetimes A. Strohmaier Institut für Theoretische Physik, Universität Leipzig, Augustusplatz 10/11, 04109 Leipzig, Germany. E-mail:
[email protected] Received: 1 March 2000 / Accepted: 30 May 2000
Abstract: We show that as soon as a linear quantum field on a stationary spacetime satisfies a certain type of hyperbolic equation, the (quasifree) ground- and KMS-states with respect to the canonical time flow have the Reeh–Schlieder property. We also obtain an analog of Borchers’ timelike tube theorem. The class of fields we consider contains the Dirac field, the Klein–Gordon field and the Proca field. 1. Introduction For quantum field theory in curved spacetime it has turned out that the framework of algebraic quantum field theory (see [19]) is most suitable for analyzing the problems connected with the non-uniqueness of the quantization of linear fields (see e.g. [39] and the references therein). The problem reduces to finding appropriate representations of the field algebra which can straightforwardly be constructed on manifolds (see [13, 14]). This is the same as specifying vacuum-like states over the algebra. On stationary spacetimes it is possible to distinguish states as ground- or KMS-states with respect to the canonical time translations. For free fields satisfying certain wave-equations it was recently shown in [35] that such passive quasifree states satisfy the microlocal spectrum condition (see [8, 32]) which is believed to be a substitute for the usual spectrum condition in Minkowski spacetime. For the case of the scalar field on a 4-dimensional globally hyperbolic spacetime it was shown in [32,33] that the microlocal spectrum condition is equivalent to the requirement that the 2-point function is of Hadamard form (see e.g. [15, 39]), which allows for a renormalization of the stress energy tensor ([38]). It therefore seems reasonable to consider ground- or KMS-states for free quantum fields on stationary spacetimes as good substitutes for the vacuum in flat spacetime. In the Minkowski space theory the vacuum vector turns out to be cyclic for field algebras associated to nonvoid open regions ([34]), i.e. 
the quantum field has the Reeh– Schlieder property. This can be shown to hold also for the GNS vacuum vector in thermal representations of quantum fields ([22]). By the lack of symmetry in general spacetimes it
is not clear whether physically reasonable vacuum states have this property as well. It was shown in [37] that the quasifree ground-state of the massive scalar field on an ultrastatic spacetime is of this kind. The proof uses an anti-locality property of the square root of the Laplace operator. Using similar methods it was possible to obtain a Reeh–Schlieder-type property for solutions to the Dirac equation on an ultrastatic spacetime with compact Cauchy surface in [3], and the Reeh–Schlieder property for ground- and KMS-states of the free Dirac field on a static globally hyperbolic 4-dimensional spacetime in [36]. We will show in this paper that the Reeh–Schlieder property for ground- and KMSstates holds for a large class of free fields on stationary spacetimes. We introduce the notions of (classical) linear fermionic and bosonic field theories on a spacetime and show that these notions lead via canonical quantization to quantum field theories on this spacetime. We note that our approach does not use an initial data formulation. In the last section we show how the most common free fields fit into this framework. Our main theorem states that as soon as the classical field fulfills a certain hyperbolic partial differential equation, a state over the field algebra of the quantized theory, which is a (quasifree in the bosonic case) ground- or KMS-state with respect to the group of time translations, has the Reeh–Schlieder property. In order to show this we combine the Gelfand-Maurin theorem on generalized eigenvectors with classical results on the strong unique continuation property of solutions of certain second order elliptic differential equations. As a result we obtain a curved spacetime analog to the timelike tube theorem of Borchers ([6]). Using standard arguments this yields the Reeh–Schlieder property. 
The class of fields which fulfill our assumptions contains the Dirac field, the Proca field and the scalar field on arbitrary connected 4-dimensional globally hyperbolic stationary Lorentzian manifolds. We note that as a consequence the Hartle-Hawking state for the Klein–Gordon field on the external Schwarzschild spacetime has the Reeh–Schlieder property (see [25]). This is a new result which so far has not been obtained by the methods previously employed. The Reeh–Schlieder property serves as the starting point for the use of the Tomita–Takesaki theory within quantum field theory. The application of this theory to the analysis of quantum field theory in Minkowski spacetime was very fruitful (see e.g. [5]) and there might be an impact as well on curved space quantum physics (see [9, 18]).
2. Classical Linear Field Theories
In the following and throughout the text a smooth manifold will always be Hausdorff and separable as a topological space. If we are given a smooth vector bundle E over a smooth manifold M one can endow the space of smooth sections $\Gamma(E)$ and the space of compactly supported smooth sections $\Gamma_0(E)$ with locally convex topologies (see [11, 12]) in a similar way as for $C^\infty(\mathbb{R}^n)$ and $C_0^\infty(\mathbb{R}^n)$. These locally convex vector spaces turn out to be nuclear (see [29], or [31, 23] for properties of nuclear spaces). We denote the topological dual of the space $\Gamma_0(E)$ by $\mathcal{D}'(M, E^*)$, calling it the space of distributions with values in the dual bundle $E^*$. For some open $O \subset M$ the subspaces $\Gamma_0(E, O)$ and $\Gamma(E, O)$ of sections supported in O are closed. We call a smooth vector bundle E a G-vector bundle for some Lie-group G if there is a smooth action of G on the base space M and on E, such that the bundle projection $E \to M$ is equivariant and the left action $q : E_x \to E_{qx}$ is linear for all $q \in G$ and $x \in M$. In this case we have a canonical action of G on $\Gamma_0(E)$ and $\Gamma(E)$, for which we use the notation $G \times \Gamma(E) \to \Gamma(E)$, $(q, f) \mapsto qf$.
These representations of G are continuous in the corresponding locally convex topologies. A smooth n-dimensional Lorentzian manifold
$(M, g)$ is a smooth $n$-dimensional manifold with smooth metric $g$ of constant signature $(1, -1, \ldots, -1)$ and $n \geq 2$ (see e.g. [30]). For any open set $O$ we denote the open causal complement of $O$, i.e. the interior of the set of points in $M$ which cannot be joined to a point in $O$ by a causal curve, by $O^\perp$. If $E$ is a smooth vector bundle over a Lorentzian manifold $(M, g)$ we say a second order differential operator $P : \Gamma(E) \to \Gamma(E)$ has metric principal part, if in local coordinates the principal part is of the form $1 \cdot g^{ik} \partial_i \partial_k$.

2.1. The fermionic case. Let $(M, g)$ be a smooth $n$-dimensional Lorentzian manifold. We denote the identity component of the group of isometries of $M$ by $G$ and its universal covering group by $\tilde G$. Note that $\tilde G$ is a Lie-group which acts smoothly on $M$.

Definition 1. A linear fermionic field theory on $M$ is a 5-tuple $(\mathcal{H}, K, E, \eta, \rho)$, where $\mathcal{H}$ is a complex Hilbert space with conjugation $K$, $E$ a smooth complex $\tilde G$-vector bundle, $\eta$ a linear map $\eta : \Gamma_0(E) \to \mathcal{H}$ with dense range, and $\rho$ a unitary representation of $\tilde G$ on $\mathcal{H}$ commuting with $K$, such that the following conditions are satisfied:
1. Covariance: $\eta(qf) = \rho(q)(\eta(f))$, $\forall f \in \Gamma_0(E)$, $q \in \tilde G$.
2. Causality: $O_1 \subset O_2^\perp$ implies $\mathcal{H}(O_1) \perp \mathcal{H}(O_2)$, where $\mathcal{H}(O)$ denotes the closure of the image of $\Gamma_0(E, O)$ under $\eta$.
3. Continuity: $\eta$ is weakly continuous, i.e. $\langle v, \eta(\cdot) \rangle$ is in $D'(M, E^*)$ for all $v \in \mathcal{H}$.
4. $K$ is local, i.e. $K(\mathcal{H}(O)) = \mathcal{H}(O)$.

Proposition 1. For any linear fermionic field theory $(\mathcal{H}, K, E, \eta, \rho)$ on $M$ the map $\eta$ is norm continuous. Moreover, the representation $\rho$ is strongly continuous.

Proof. The sesquilinear form $B(f, g) := \langle \eta(f), \eta(g) \rangle$ on $\Gamma_0(E) \times \Gamma_0(E)$ is separately continuous, and as a consequence of the nuclearity of $\Gamma_0(E)$ it is jointly continuous. This gives the norm continuity of $\eta$. Since the action of $\tilde G$ on $E$ is smooth, the action of $\tilde G$ on $\Gamma_0(E)$ is continuous. As a consequence we obtain the strong continuity of $\rho$.

2.2. The bosonic case.
Let $(M, g)$ be again a smooth $n$-dimensional Lorentzian manifold and $G$ be the identity component of the group of isometries of $M$.

Definition 2. A linear bosonic field theory on $M$ is a 5-tuple $(W, \sigma, E, \eta, \rho)$, where $W$ is a symplectic vector space with symplectic form $\sigma$, $E$ a smooth real $G$-vector bundle, $\eta$ a surjective linear map $\eta : \Gamma_0(E) \to W$, and $\rho$ a representation of $G$ on $W$ by symplectomorphisms, such that the following conditions are satisfied:
1. Covariance: $\eta(qf) = \rho(q)(\eta(f))$, $\forall f \in \Gamma_0(E)$, $q \in G$.
2. Causality: $O_1 \subset O_2^\perp$ implies $\sigma(W(O_1), W(O_2)) = 0$, where $W(O)$ denotes the image of $\Gamma_0(E, O)$ under $\eta$.
3. Continuity: for all $v \in W$ the map $\Gamma_0(E) \to \mathbb{R}$, $f \mapsto \sigma(v, \eta(f))$ is continuous, i.e. defines a distribution in $D'(M, E^*)$.

3. Canonical Quantization

In this section $(M, g)$ will be a smooth $n$-dimensional Lorentzian manifold. We will work in the framework of algebraic quantum field theory ([19]). A quantum field theory on $M$ will be defined by a net of local field algebras, i.e. a map $O \to F(O)$ from the relatively compact open subsets of $M$ to the set of closed $*$-subalgebras of a $C^*$- or $W^*$-algebra $F$ which is isotone, i.e. $F(O_1) \subset F(O_2)$ whenever $O_1 \subset O_2$.
3.1. Quantization of a linear fermionic field theory. We show how a linear fermionic field theory gives rise to a quantum field theory on $M$ given by a net of local field algebras. Given a linear fermionic field theory $(\mathcal{H}, K, E, \eta, \rho)$, the field algebra $F$ is the (selfdual) CAR-algebra $\mathrm{CAR}(\mathcal{H}, K)$ (see [1]). This is the $C^*$-algebra with unit generated by symbols $B(v)$ with $v \in \mathcal{H}$ and the relations
\begin{align}
  & v \mapsto B(v) \text{ is complex linear}, \tag{1}\\
  & B(v)^* = B(Kv), \tag{2}\\
  & \{B(v_1), B(v_2)\} = B(v_1)B(v_2) + B(v_2)B(v_1) = \langle K v_1, v_2 \rangle. \tag{3}
\end{align}
$F$ has a natural $\mathbb{Z}_2$-grading. The even/odd parts are spanned by those products $B(v_1) \cdots B(v_k)$ with an even/odd number of generators. For each relatively compact subset $O \subset M$ we define the local algebra $F(O) \subset F$ to be the closed unital $*$-subalgebra generated by the symbols $B(\eta(f))$ with $f \in \Gamma_0(E, O)$. The representation $\rho$ of the group $\tilde G$ on $\mathcal{H}$ gives rise to a representation $\tau$ of $\tilde G$ by strongly continuous Bogoliubov automorphisms of the algebra (see [1]). It is not difficult to check the following properties of the net $O \to F(O)$:
1. Isotony: $O_1 \subset O_2$ implies $F(O_1) \subset F(O_2)$.
2. Causality: if $O_1 \subset O_2^\perp$, then $\{F(O_1), F(O_2)\} = \{0\}$.
3. Covariance: $\tau(q)F(O) = F(qO)$ $\forall q \in \tilde G$.
Here $\{\cdot, \cdot\}$ denotes the graded commutator, which is equal to the commutator if one of the arguments is even and coincides with the anti-commutator if both arguments are odd. Moreover, $F$ is the quasilocal algebra of the net $O \to F(O)$ (see [7, Proposition 5.2.6]). Hence, this defines a reasonable quantum field theory on the manifold $M$. Note that $F$ is not the algebra of observables. The algebra of observables $A$ should be a $*$-subalgebra of $F_{\mathrm{even}}$ consisting of elements $a$ for which $\tau_{g_1} a = \tau_{g_2} a$ whenever $p(g_1) = p(g_2)$, where $p : \tilde G \to G$ is the covering map. Usually $A$ is the $*$-subalgebra of elements which are invariant under the action of a gauge group.
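As a worked check of the causality property of the fermionic net, the following sketch spells out why generators localized in causally separated regions anticommute. It uses only relation (3), the locality of $K$ and the causality condition of Definition 1; the vectors $v_1$, $v_2$ are names introduced here for illustration.

```latex
% \mathcal{H} is the Hilbert space of Definition 1, \mathcal{H}(O) the closure of \eta(\Gamma_0(E,O)).
% Take v_1 \in \mathcal{H}(O_1), v_2 \in \mathcal{H}(O_2) with O_1 \subset O_2^\perp.
\begin{align*}
  K v_1 &\in \mathcal{H}(O_1)
    && \text{(locality of $K$)},\\
  \{B(v_1), B(v_2)\} &= \langle K v_1, v_2 \rangle = 0
    && \text{(relation (3) and $\mathcal{H}(O_1) \perp \mathcal{H}(O_2)$)}.
\end{align*}
```

Since each generator $B(v)$ is odd, the graded commutator of two generators is their anticommutator, so the computation above covers the generators; the statement for products of generators then follows from the graded Leibniz rule.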
3.2. Quantization of a linear bosonic field theory. Each linear bosonic field theory $(W, \sigma, E, \eta, \rho)$ gives rise to a quantum field theory on $M$. The field algebra $F$ is defined to be the CCR-algebra $\mathrm{CCR}(W, \sigma)$ (see [26, 27, 7]). This is the $C^*$-algebra generated by symbols $W(v)$ with $v \in W$ and the relations
\begin{align}
  W(-v) &= W(v)^*, \tag{4}\\
  W(v_1)W(v_2) &= e^{-i\sigma(v_1, v_2)/2}\, W(v_1 + v_2). \tag{5}
\end{align}
We define for each relatively compact open subset $O \subset M$ the local field algebra $F(O) \subset F$ to be the closed $*$-subalgebra generated by the symbols $W(v)$ with $v \in W(O)$. The representation $\rho$ of $G$ gives rise to a representation $\tau$ of $G$ by Bogoliubov automorphisms of the algebra $F$ (see e.g. [7]), and the net $O \to F(O)$ has the following properties:
1. Isotony: $O_1 \subset O_2$ implies $F(O_1) \subset F(O_2)$.
2. Causality: if $O_1 \subset O_2^\perp$, then $[F(O_1), F(O_2)] = \{0\}$.
3. Covariance: $\tau(q)F(O) = F(qO)$ $\forall q \in G$.
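The bosonic causality property can be read off directly from the Weyl relation (5). The short derivation below uses only relation (5) and the antisymmetry of $\sigma$; it is meant as a sketch of the standard argument rather than a quotation from the references.

```latex
\begin{align*}
  W(v_1)W(v_2) &= e^{-i\sigma(v_1,v_2)/2}\, W(v_1+v_2),\\
  W(v_2)W(v_1) &= e^{-i\sigma(v_2,v_1)/2}\, W(v_1+v_2)
                = e^{+i\sigma(v_1,v_2)/2}\, W(v_1+v_2),\\
  \Longrightarrow\quad
  W(v_1)W(v_2) &= e^{-i\sigma(v_1,v_2)}\, W(v_2)W(v_1).
\end{align*}
```

If $v_1$ lies in the subspace $W(O_1)$ and $v_2$ in $W(O_2)$ with $O_1 \subset O_2^\perp$, the causality condition of Definition 2 gives $\sigma(v_1, v_2) = 0$, so the two Weyl operators commute. Because the local algebras are generated by such operators, this yields $[F(O_1), F(O_2)] = \{0\}$.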
Moreover, $F$ is the quasilocal algebra of the net $O \to F(O)$ (see [7, Proposition 5.2.10]). Hence, this defines a reasonable quantum field theory on the manifold $M$. Unlike the fermionic case the representation $\tau$ fails to be strongly continuous whenever it is nontrivial. We therefore need to pass to certain representations of the field algebra to obtain a net of von Neumann algebras on which $\tau$ extends to a $\sigma$-weakly-continuous representation. In order to avoid complications we specialize to the so-called quasifree states. Assume that we are given a scalar product $\mu$ on $W$ which dominates $\sigma$, i.e. satisfies the estimate
\[
  |\sigma(v_1, v_2)|^2 \leq 4\,\mu(v_1, v_1)\,\mu(v_2, v_2), \qquad v_1, v_2 \in W. \tag{6}
\]
In this case the linear functional $\omega_\mu : F \to \mathbb{C}$, defined by
\[
  \omega_\mu(W(v)) := e^{-\mu(v, v)/2}, \qquad v \in W, \tag{7}
\]
is a state (see [26, 27, 7]). The states over $F$ which can be realized in this way are called quasifree states. A quasifree state $\omega_\mu$ gives rise to a one particle structure (Proposition 3.1 in [27]), that is a map $K_\mu : W \to H_\mu$ to some complex Hilbert space $H_\mu$, such that
1. the complexified range of $K_\mu$ (i.e. $K_\mu W + i K_\mu W$) is dense in $H_\mu$,
2. $\langle K_\mu v_1, K_\mu v_2 \rangle = \mu(v_1, v_2) + \frac{i}{2}\sigma(v_1, v_2)$.
This structure is unique up to equivalence. A one particle structure $(K_\mu, H_\mu)$ for a quasifree state allows one to construct the GNS-triple $(\pi_{\omega_\mu}, H_{\omega_\mu}, \Omega_{\omega_\mu})$ explicitly (see [27, 26, 7]). Namely, one takes $H_{\omega_\mu}$ to be the bosonic Fock space over $H_\mu$ with Fock vacuum $\Omega_{\omega_\mu}$, and defines $\pi_{\omega_\mu}(W(v)) = \exp(-(\hat a^*(K_\mu v) - \hat a(K_\mu v)))$, where $\hat a^*(\cdot)$ and $\hat a(\cdot)$ are the usual creation and annihilation operators. One clearly has the following

Proposition 2. Let $\omega_\mu$ be a quasifree state over the CCR-algebra $F = \mathrm{CCR}(W, \sigma)$ and let $(\pi_{\omega_\mu}, H_{\omega_\mu}, \Omega_{\omega_\mu})$ be its GNS-triple. If $V \subset W$ is a subspace which is dense in $W$ in the topology defined by $\mu$, then the $*$-algebra generated by the set $\{\pi_{\omega_\mu}(W(v)),\ v \in V\} \subset \pi_{\omega_\mu}(F)$ is strongly dense in the von Neumann algebra $\pi_{\omega_\mu}(F)''$.

Definition 3. Let $F$ be a field algebra constructed from a linear bosonic field theory $(W, \sigma, E, \eta, \rho)$ on $M$. We call a quasifree state $\omega_\mu$ over $F$ continuous if the map
\[
  \Gamma_0(E) \times \Gamma_0(E) \to \mathbb{R}, \qquad (f_1, f_2) \mapsto \mu(\eta(f_1), \eta(f_2))
\]
is continuous and hence defines a distribution in $D'(M \times M, E^* \boxtimes E^*)$, where $E^* \boxtimes E^*$ is the direct product of the bundle $E^*$ with itself over the base space $M \times M$.

This is clearly a necessary and sufficient condition for the Wightman 2-point function $w_2(\cdot, \cdot) := \langle K_\mu \eta(\cdot), K_\mu \eta(\cdot) \rangle$ to be a distribution. For the class of $G$-invariant continuous quasifree states we can circumvent the problems connected with the non-continuity of the representation $\tau$ of $G$ on $F$. The action of $G$ is continuous on $\Gamma_0(E)$ and $\rho$ leaves $\sigma$ invariant. If the state $\omega_\mu$ is invariant then also $\mu$ is invariant under the action of $G$ and hence there exists a unique strongly continuous representation $\tilde U$ of $G$ on the one particle Hilbert space $H_\mu$, such that $K_\mu \circ \rho(q) = \tilde U(q) \circ K_\mu$ for all $q \in G$. Second quantization gives a strongly continuous unitary representation $U$ of $G$ on $H_{\omega_\mu}$, such that $\pi_{\omega_\mu}(\rho(q)a) = U(q)\pi_{\omega_\mu}(a)U^{-1}(q)$ for all $q \in G$ and $a \in F$. Hence, one gets the following proposition:
Proposition 3. Let $\omega_\mu$ be a $G$-invariant continuous quasifree state over the field algebra $F$ constructed from a linear bosonic field theory $(W, \sigma, E, \eta, \rho)$ on $M$. Let $(\pi_{\omega_\mu}, H_{\omega_\mu}, \Omega_{\omega_\mu})$ be its GNS-triple and $U$ be the unitary representation of $G$ on $H_{\omega_\mu}$ induced by $\tau$. Then $U$ is strongly continuous and hence $\tau$ can be continued to a $\sigma$-weakly-continuous representation $\hat\tau$ of $G$ by automorphisms of the von Neumann algebra $\hat F := \pi_{\omega_\mu}(F)''$.

Therefore, given a continuous quasifree state $\omega_\mu$ we can construct the net of von Neumann algebras $O \to \hat F(O) := \pi_{\omega_\mu}(F(O))''$. This assignment is isotone, causal and covariant, and the representation $\hat\tau$ of $G$ is $\sigma$-weakly-continuous. It gives rise to a quantum field theory on $M$ with reasonable physical properties.

4. The Reeh–Schlieder Property for Quantized Linear Fields

4.1. Stationary spacetimes. Let $(M, g)$ be an $n$-dimensional time-oriented Lorentzian manifold which admits a one parameter group $h_t$ of isometries, smooth in $t$, with timelike orbits giving rise to a timelike Killing vector field $\xi$. Such a manifold is called stationary. For later considerations we need a special class of charts.

Lemma 1. For each point $p \in M$ there exists an open neighbourhood $O$ and a chart $\varphi : O \to \mathbb{R}^n$ with coordinates $(x_0, \ldots, x_{n-1})$, such that in local coordinates
1. $\xi = \frac{\partial}{\partial x_0}$,
2. the $(n-1) \times (n-1)$ matrix $-g^{\alpha\beta}(x)$, $\alpha, \beta = 1, \ldots, n-1$, is positive for all $x \in \varphi(O)$.

Proof. One can always choose a neighbourhood $O_1$ of $p$ and a chart $\varphi : O_1 \to \mathbb{R}^n$ with coordinates $(x_0, \ldots, x_{n-1})$, such that $\xi = \frac{\partial}{\partial x_0}$ and the metric tensor is diagonal in the point $p$, i.e.
\[
  g_{ik}(\varphi(p)) = \mathrm{diag}(g(\xi, \xi), -1, \ldots, -1). \tag{8}
\]
$-g^{\alpha\beta}(\varphi(p))$, $\alpha, \beta = 1, \ldots, n-1$, is then a positive matrix. Since the matrix-valued function $-g^{\alpha\beta}$ is continuous, there exists a neighbourhood $\varphi(O)$ of $\varphi(p)$, on which it is positive.

4.2. Free quantum fields on stationary spacetimes. Since $h_t$ is a group of isometries it defines a group homomorphism $\mathbb{R} \to G$ which lifts uniquely to a group homomorphism $\mathbb{R} \to \tilde G$. Hence, in case we have a net of field algebras $O \to F(O)$ constructed from a linear fermionic or bosonic field theory we canonically get a one parameter group of automorphisms $\tau_t$ which acts covariantly, i.e. $\tau_t F(O) = F(h_t O)$. We call this group the group of canonical time translations induced by $h_t$. In the interesting cases one can realize $F$ on a Hilbert space, such that $\tau_t$ is $\sigma$-weakly-continuous and hence extends to an automorphism group $\hat\tau_t$ of the von Neumann algebra $F''$ (see e.g. Proposition 3). It is then possible to distinguish vacuum states over the field algebra $F$ as ground- or KMS-states with respect to the group of time translations $\tau_t$.

Definition 4. Let $A$ be a $W^*$-algebra and $\alpha_t$ be a $\sigma$-weakly-continuous one-parameter group of $*$-automorphisms of $A$. An $\alpha_t$-invariant normal state $\omega$ is called ground-state with respect to $\alpha_t$ if the generator of the corresponding strongly continuous unitary group $U(t)$ on the GNS-Hilbert space is positive.
Definition 5. Let $A$ and $\alpha_t$ be as above. A normal state $\omega$ is called KMS-state with inverse temperature $\beta > 0$ with respect to $\alpha_t$ if for any pair $A, B \in A$ there exists a complex function $F_{A,B}$ which is analytic in the strip $D_\beta := \{z \in \mathbb{C};\ 0 < \mathrm{Im}(z) < \beta\}$ and bounded and continuous on $\overline{D_\beta}$, such that
\[
  F_{A,B}(t) = \omega(A\,\alpha_t(B)), \qquad F_{A,B}(t + i\beta) = \omega(\alpha_t(B)\,A).
\]

Definition 6. Let $(M, g, h_t)$ be a connected stationary Lorentzian manifold. Let $\{F(O)\}$ be the net of local field algebras constructed from a linear fermionic field theory $(\mathcal{H}, K, E, \eta, \rho)$ or from a linear bosonic field theory $(W, \sigma, E, \eta, \rho)$. Denote the group of canonical time translations by $\tau_t$. Let $\omega$ be a $\tau_t$-invariant state over the quasilocal algebra $F$, which we assume to be quasifree and continuous in the bosonic case, and denote by $(\pi_\omega, H_\omega, \Omega_\omega)$ the corresponding GNS-triple. We say that $\omega$ is a ground or KMS state, if the unique normal extension of $\omega$ over $\pi_\omega(F)''$ is a ground or KMS state with respect to the unique $\sigma$-weakly-continuous extension of the group of time translations $\tau_t$.

In the fermionic case the existence of ground states is always guaranteed ([1]). Moreover, there exists a unique quasifree KMS-state with inverse temperature $\beta > 0$ (see [7, 1]). In the bosonic case the construction of continuous quasifree ground- and KMS-states seems problematic in the general case. See e.g. [24] for conditions that allow the construction of a continuous quasifree ground state for the Klein–Gordon quantum field on a 4-dimensional stationary spacetime.

4.3. A density theorem and a tube theorem.

Theorem 1. Let $(M, g, h_t)$ be a connected stationary Lorentzian manifold and $E$ a smooth complex $h_t$-vector bundle. Let $H$ be a complex Hilbert space and $\rho_t$ a strongly continuous unitary one-parameter group on $H$. Assume that we have a linear strongly continuous map $\hat\eta : \Gamma_0(E) \to H$ with dense range which is covariant, i.e. $\hat\eta(h_t f) = \rho_t \hat\eta(f)$ for all $t \in \mathbb{R}$, $f \in \Gamma_0(E)$. Assume furthermore that $\hat\eta \circ P = 0$ for some second order differential operator $P$ with metric principal part. Then $\hat\eta(\Gamma_0(E, h_{\mathbb{R}} O))$ is dense in $H$ for each nonvoid open set $O \subset M$.

Proof. We introduce the following notations:
\begin{align}
  & V := \hat\eta(\Gamma_0(E, h_{\mathbb{R}} O))^\perp, \tag{9}\\
  & p_V \ldots \text{orthogonal projection onto } V, \tag{10}\\
  & \Phi := \mathrm{Ran}(p_V \circ \hat\eta). \tag{11}
\end{align}
$V$ is clearly a $\rho_t$-invariant subspace and $\Phi$ is dense in $V$. Identifying $\Phi$ with $\Gamma_0(E)/\ker(p_V \circ \hat\eta)$ we can endow $\Phi$ with the locally convex quotient topology. Since $\Gamma_0(E)$ is nuclear and $p_V \circ \hat\eta$ is continuous, $\Phi$ is a nuclear space (see [29, 31, 23]), and clearly the inclusion map $\Phi \to V$ is continuous. We denote the dual of $\Phi$ by $\Phi'$. It follows that $\Phi \subset V \subset \Phi'$ is a Gelfand triple. We denote the selfadjoint generator of the group $\rho_t|_V$ by $A$. Clearly, $\Phi \subset D(A)$ and $A$ restricts to a continuous operator $\Phi \to \Phi$. Moreover $\Phi$ is invariant under the action of $\rho_t|_V$. As a consequence $A$ is essentially selfadjoint on $\Phi$. Hence, there exists a complete set of generalized eigenvectors for $A$ (see [17] and Sect. II.3 in [29]), i.e. a family of elements $v_{\lambda,k} \in \Phi'$ indexed by a subset $I \subset \mathbb{R}$ and a second index $k$, such that
\begin{align}
  & v_{\lambda,k}(A\phi) = \lambda\, v_{\lambda,k}(\phi), \quad \forall \phi \in \Phi, \tag{12}\\
  & v_{\lambda,k}(\phi) = 0 \text{ for all } \lambda, k \ \Leftrightarrow\ \phi = 0. \tag{13}
\end{align}
By continuity
\[
  \psi_{\lambda,k}(\cdot) := v_{\lambda,k}(p_V \circ \hat\eta(\cdot)) \tag{14}
\]
defines for each $\lambda$ and $k$ a distribution in $D'(M, E^*)$, such that
\begin{align}
  & \psi_{\lambda,k}(\Gamma_0(E, h_{\mathbb{R}} O)) = \{0\}, \tag{15}\\
  & \psi_{\lambda,k}(L_\xi f) = i\lambda\, \psi_{\lambda,k}(f), \tag{16}\\
  & \psi_{\lambda,k}(P\,\cdot) = 0, \tag{17}
\end{align}
where $\xi$ is the timelike Killing vector field induced by $h_t$ and $L_\xi$ the Lie derivative on $\Gamma_0(E)$ defined by $L_\xi f = \lim_{t \to 0} \frac{h_t f - f}{t}$. For each point $p \in M$ there exists an open contractible neighbourhood $U$ and a chart mapping $U$ to $\mathbb{R}^n$ which we can choose to be of the form constructed in Lemma 1. The restriction of $E$ to $U$ is trivial and we can identify $\Gamma_0(E, U)$ with $C^\infty(U) \otimes \mathbb{C}^N$ in such a way that $L_\xi f = \frac{\partial}{\partial x_0} f$ for both functions and sections. We consider the distribution $\psi_{\lambda,k}$ in such a chart. We have $P^* \psi_{\lambda,k} = 0$, where $P^*$ is the adjoint operator. Moreover, Eq. (16) reads $\frac{\partial}{\partial x_0} \psi_{\lambda,k} = -i\lambda\, \psi_{\lambda,k}$. Note that the principal part of $P^*$ has the form
\[
  g^{00} \frac{\partial^2}{\partial x_0^2} + g^{0\alpha} \frac{\partial^2}{\partial x_0\, \partial x_\alpha} + g^{\alpha\beta} \frac{\partial^2}{\partial x_\alpha\, \partial x_\beta}, \qquad \alpha, \beta = 1, 2, \ldots, n-1. \tag{18}
\]
Replacing the $x_0$-derivatives by $-i\lambda$ and adding the term $-\frac{\partial^2}{\partial x_0^2} - \lambda^2$ we obtain a second order differential operator $P_e^*$ with $P_e^* \psi_{\lambda,k} = 0$. Note that $P_e^*$ is elliptic due to having chosen a chart with the properties in Lemma 1. Hence, $\psi_{\lambda,k}$ is smooth (see e.g. [21, 10]) and since $P_e$ has scalar principal part the classical result of Aronszajn [2] (see especially Remark 3)^1 implies that $\psi_{\lambda,k} = 0$ in each such chart in which $\psi_{\lambda,k}$ vanishes in an open nonvoid set, in particular in each such chart intersecting with $h_{\mathbb{R}} O$. Since $M$ is connected this implies $\psi_{\lambda,k} = 0$ on $M$. The set of generalized eigenvectors $v_{\lambda,k}$ is complete and therefore $\mathrm{Ran}(p_V \circ \hat\eta) = \{0\}$. Since $\hat\eta(\Gamma_0(E))$ is dense in $H$ this implies $p_V = 0$ and the theorem is proved.

As a consequence one gets a result similar to the timelike tube theorem ([6]) in Minkowski spacetime.

^1 For a detailed treatment see Sect. 17.1 of [20] and the references therein.
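Two claims about the operator $P_e^*$ in the proof of Theorem 1 can be checked in a few lines: that the added term annihilates $\psi_{\lambda,k}$, and that the resulting operator is elliptic. The sketch below works in a chart as in Lemma 1 and writes $\zeta = (\zeta_0, \ldots, \zeta_{n-1})$ for a real covector, a symbol introduced here only for illustration; lower order terms are ignored.

```latex
% (a) The added term annihilates \psi_{\lambda,k}, using Eq. (16) in the chart:
\[
  \frac{\partial}{\partial x_0}\psi_{\lambda,k} = -i\lambda\,\psi_{\lambda,k}
  \;\Longrightarrow\;
  \Bigl(-\frac{\partial^2}{\partial x_0^2} - \lambda^2\Bigr)\psi_{\lambda,k}
  = (\lambda^2 - \lambda^2)\,\psi_{\lambda,k} = 0 .
\]
% (b) After replacing \partial/\partial x_0 by -i\lambda in (18), the terms with
%     g^{00} and g^{0\alpha} drop to order 0 and 1, so the second order part of P_e^* is
\[
  -\frac{\partial^2}{\partial x_0^2}
  + g^{\alpha\beta}\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta} .
\]
% (c) Its symbol (each derivative \partial/\partial x_j replaced by i\zeta_j) is
\[
  \zeta_0^2 + \bigl(-g^{\alpha\beta}(x)\bigr)\zeta_\alpha\zeta_\beta \;>\; 0
  \qquad \text{for } \zeta \neq 0,
\]
% which is positive because -g^{\alpha\beta}(x) is a positive matrix by Lemma 1.
```

Step (c) is where the special chart is needed: without the positivity of $-g^{\alpha\beta}$ the symbol could vanish for some nonzero $\zeta$ and ellipticity would fail.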
Theorem 2. Let $(M, g, h_t)$ be a connected stationary Lorentzian manifold. Let $\{F(O)\}$ be a net of local field algebras constructed from a linear fermionic field theory $(\mathcal{H}, K, E, \eta, \rho)$ or from a linear bosonic field theory $(W, \sigma, E, \eta, \rho)$. Assume that $\eta \circ P = 0$ for some second order differential operator $P$ with metric principal part. One has
1. The fermionic case. The $*$-subalgebra of $F$ generated by the subset $\bigcup_{t \in \mathbb{R}} F(h_t O)$ is norm dense in $F$ for each nonvoid relatively compact open set $O \subset M$.
2. The bosonic case. Let $\omega_\mu$ be a quasifree and continuous state over the quasilocal algebra $F$. Denote its GNS-triple by $(\pi_{\omega_\mu}, H_{\omega_\mu}, \Omega_{\omega_\mu})$. Assume that $\omega_\mu$ is invariant under the automorphism group $\tau_t$ induced by the Killing flow $h_t$. The $*$-subalgebra of $\pi_{\omega_\mu}(F)$ generated by the subset $\bigcup_{t \in \mathbb{R}} \pi_{\omega_\mu}(F(h_t O))$ is strongly dense in $\pi_{\omega_\mu}(F)''$ for each nonvoid relatively compact open set $O \subset M$.

Proof. We start with the fermionic case. Let $\mathcal{H}_1$ be the subspace of $\mathcal{H}$ generated by the set $\{\eta(h_t f) \in \mathcal{H};\ f \in \Gamma_0(E, O),\ t \in \mathbb{R}\}$, and let $F_1$ be the unital $*$-subalgebra of $F$ generated by the set $\{B(v),\ v \in \mathcal{H}_1\}$. Clearly, $F_1$ is equal to the $*$-subalgebra of $F$ generated by $\bigcup_{t \in \mathbb{R}} F(h_t O)$. The group of time translations $\rho_t$ is a strongly continuous one parameter group on $\mathcal{H}$, and with $\hat\eta = \eta$ we can apply Theorem 1 above. It follows that $\mathcal{H}_1$ is dense in $\mathcal{H}$ and hence $F_1$ is norm dense in $F$.

In the bosonic case let $W_{\mathbb{C}}$ be the complexification of $W$ and take the complexification of $\mu$ as a scalar product on $W_{\mathbb{C}}$. We complete this space and obtain a Hilbert space $H$. We complexify the real vector bundle $E$ and obtain the complex vector bundle $E_{\mathbb{C}} \cong E \oplus E$ with a canonical action of $h_t$. We can extend the map $\eta$ to a map $\eta_{\mathbb{C}}$ which maps from the sections of $E_{\mathbb{C}}$ to $H$. By construction $\eta_{\mathbb{C}}$ has dense range. Since $\mu$ is invariant under the action of the time translations $\rho_t$ we get a strongly continuous unitary action of the group of time translations on $H$ such that $\eta_{\mathbb{C}}$ is equivariant. Let $H_1$ be the complex subspace of $H$ generated by the set $\{\eta_{\mathbb{C}}(h_t f) \in H;\ f \in \Gamma_0(E_{\mathbb{C}}, O),\ t \in \mathbb{R}\}$. We see that all the assumptions for Theorem 1 are fulfilled and hence $H_1$ is dense in $H$. This implies that $\eta(\Gamma_0(E, h_{\mathbb{R}} O))$ is $\mu$-dense in $W$. Using Proposition 2 one therefore concludes that the $*$-subalgebra of $\pi_{\omega_\mu}(F)$ generated by the set $\{W(\eta(h_t f)),\ f \in \Gamma_0(E, O),\ t \in \mathbb{R}\}$ is strongly dense in $\pi_{\omega_\mu}(F)''$.
4.4. The Reeh–Schlieder property for ground- and KMS-states.

Definition 7. Let $\{F(O)\}$ be a net of local field algebras indexed by the relatively compact open subsets of a manifold $M$. Let $\omega$ be a state over the quasilocal field algebra $F$ and $(\pi_\omega, H_\omega, \Omega_\omega)$ its GNS-triple. We say that $\omega$ has the Reeh–Schlieder property if $\Omega_\omega$ is cyclic for the von Neumann algebra $\pi_\omega(F(O))''$ for each nonvoid relatively compact open set $O \subset M$.
Our main theorem is:

Theorem 3. Let $(M, g, h_t)$ be a connected stationary Lorentzian manifold. Let $\{F(O)\}$ be the net of local field algebras constructed from a linear fermionic field theory $(\mathcal{H}, K, E, \eta, \rho)$ or from a linear bosonic field theory $(W, \sigma, E, \eta, \rho)$. Assume that $\eta \circ P = 0$ for some second order differential operator $P$ with metric principal part. Let $\omega$ be a state over the quasilocal algebra $F$ which we assume to be quasifree and continuous in the bosonic case. If $\omega$ is a ground- or KMS-state with respect to the automorphism group $\tau_t$ induced by the Killing flow $h_t$, then $\omega$ has the Reeh–Schlieder property.

We postpone the proof for a moment. Let $O \subset M$ be a nonvoid relatively compact open set and $B(O) \subset F(O)$ the $*$-subalgebra consisting of those elements $a \in F(O)$ for which there exists a neighbourhood $I \subset \mathbb{R}$ of 0, such that $\tau_I(a) \subset F(O)$. Let $O_1 \subset M$ be another nonvoid open set such that $O_1 \subset O$. We clearly have the inclusions
\[
  F(O_1) \subset B(O) \subset F(O). \tag{19}
\]
We denote the strongly continuous unitary group, implementing $\tau_t$ on the GNS-Hilbert space, by $\tilde U(t)$. One has

Lemma 2. Let $E := \pi_\omega(B(O))\Omega_\omega$. Then $E$ is invariant under the action of $\tilde U(t)$, i.e. $\tilde U(\mathbb{R})E \subset E$.

Proof. Let $\psi \in E^\perp$. For each $a \in \pi_\omega(B(O))$ we then have at least for some open neighbourhood $I \subset \mathbb{R}$ of 0,
\[
  f(t) := \langle \psi, \tilde U(t)\, a\, \Omega_\omega \rangle = 0 \qquad \forall t \in I.
\]
Since $\omega$ is a KMS-state ($\beta > 0$) or a ground-state ($\beta = \infty$), $f(t)$ is the boundary value of a function $F(z)$ which is analytic on the strip $D_{\beta/2} = \{z \in \mathbb{C};\ 0 < \mathrm{Im}(z) < \beta/2\}$ and bounded and continuous on $\overline{D_{\beta/2}}$. By the Schwarz reflection principle $f(t)$ vanishes on the whole real axis. Therefore $\langle \tilde U(t)\psi, a\,\Omega_\omega \rangle = 0$ for all $t \in \mathbb{R}$. Hence, $E^\perp$ is invariant under the action of $\tilde U(t)$.

We are now able to give the proof of the main theorem:
Proof of Theorem 3. We denote by $R$ the von Neumann algebra $\bigl(\bigcup_{t \in \mathbb{R}} \tilde U(t)\,\pi_\omega(F(O_1))\,\tilde U(t)^*\bigr)''$. Lemma 2 implies that $E$ is invariant under the action of $R$, i.e. $RE \subset E$. By Theorem 2, $R = \pi_\omega(F)''$. Hence $\Omega_\omega$ is cyclic for $R$ and therefore $E = H_\omega$. By the inclusions (19) we have $E \subset \pi_\omega(F(O))\Omega_\omega$ and hence $\Omega_\omega$ is cyclic for $\pi_\omega(F(O))$.
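The proof of Theorem 3 compresses several inclusions into two sentences. One way to unpack them is sketched below, under the assumption (made here, not stated explicitly in the text) that $E$ is read as the closed subspace generated by $\pi_\omega(B(O))\Omega_\omega$ and that $B(O)$ contains the unit.

```latex
\begin{align*}
  & \Omega_\omega \in E
      && \text{since } 1 \in B(O),\\
  & \pi_\omega(F(O_1))\,E \subset E
      && \text{since } F(O_1) \subset B(O) \text{ by (19) and } B(O) \text{ is an algebra},\\
  & \tilde U(t)\,E \subset E \quad \forall t \in \mathbb{R}
      && \text{by Lemma 2},\\
  & \Longrightarrow\ R\,E \subset E
      \ \text{ and }\ \overline{R\,\Omega_\omega} \subset E,\\
  & R = \pi_\omega(F)''
      \ \Longrightarrow\ \overline{R\,\Omega_\omega} = H_\omega
      \ \Longrightarrow\ E = H_\omega .
\end{align*}
```

The last line uses that the GNS vector $\Omega_\omega$ is cyclic for $\pi_\omega(F)$, which is part of the GNS construction, together with the density statement of Theorem 2.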
5. Examples of Linear Field Theories

In this section $(M, g)$ will be an oriented time-oriented 4-dimensional Lorentzian manifold which is globally hyperbolic in the sense that it admits a smooth Cauchy surface. For some subset $O \subset M$ the set of points, which can be reached by future/past directed causal curves emanating from $O$, will be denoted by $J^\pm(O)$.
The free Dirac field (see [14]). It is known that $M$ possesses a trivial spin structure given by a $\mathrm{Spin}(3, 1)$-principal bundle $SM$ and a two-fold covering map $SM \to FM$ onto the bundle $FM$ of oriented time-oriented orthonormal frames. One can now construct the Dirac bundle $DM$ which is associated to $SM$ by the spinor representation and is a natural module for the Clifford algebra bundle $\mathrm{Cliff}(TM)$ (see [4] and [28]). Furthermore, the Levi-Civita connection on $TM$ induces a connection on $DM$ with covariant derivative $\nabla : \Gamma(SM) \to \Gamma(SM \otimes T^*M)$. Given a vector field $n$ we write as usual $\slashed{n}$ for the section in the Clifford algebra bundle $\gamma(n)$, or in local coordinates $\gamma^i n_i$. There exists an antilinear bijection $\Gamma(DM) \to \Gamma(DM^*) : u \mapsto u^+$, the Dirac conjugation, which in the standard representation in a local orthonormal spin frame has the form $u^+ = \bar u \gamma^0$, where the bar denotes complex conjugation in the dual frame. We use the symbol $+$ also for the inverse map. Canonically associated with the Dirac bundle there is the Dirac operator which in a frame takes the form $\slashed{\nabla} = \gamma^i \nabla_{e_i}$. The Dirac equation for mass $m \geq 0$ is
\[
  (-i\slashed{\nabla} + m)u = 0, \qquad u \in \Gamma(DM). \tag{20}
\]
The Dirac equation has unique advanced and retarded fundamental solutions $S^\pm : \Gamma_0(DM) \to \Gamma(DM)$ satisfying
\[
  (-i\slashed{\nabla} + m)S^\pm = S^\pm(-i\slashed{\nabla} + m) = \mathrm{id} \quad \text{on } \Gamma_0(DM), \qquad \mathrm{supp}(S^\pm f) \subset J^\pm(\mathrm{supp}(f)).
\]
We can define the operator $S := S^+ - S^-$ and form the pre-Hilbert space
\[
  H := \Gamma_0(DM)/\ker(S) \tag{21}
\]
with inner product
\[
  \langle [u_1], [u_2] \rangle_H := -i \int_M (u_1^+, S u_2)(x)\, w(x), \tag{22}
\]
where $(\cdot, \cdot)$ denotes the dual pairing between the fibres of $DM$ and $DM^*$ and $w$ the pseudo-Riemannian volume form on $M$. $[f]$ denotes the equivalence class containing $f$. Analogously we form the pre-Hilbert space
\[
  H^+ := \Gamma_0(DM^*)/(\ker(S))^+ \tag{23}
\]
with inner product
\[
  \langle [v_1], [v_2] \rangle_{H^+} := \langle [v_2^+], [v_1^+] \rangle_H. \tag{24}
\]
We note that $S$ maps $H$ onto the space of smooth solutions to the Dirac equation whose supports have compact intersections with any Cauchy surface. The construction of the Dirac field starts with the Hilbert space $\mathcal{H} := H \oplus H^+$ and the antiunitary involution
\[
  K : \mathcal{H} \to \mathcal{H}, \qquad u \oplus v \mapsto v^+ \oplus u^+. \tag{25}
\]
By construction we have a map
\[
  \eta : \Gamma_0(DM \oplus DM^*) \to \mathcal{H}; \qquad f \oplus g \mapsto [f] \oplus [g]
\]
with dense range. Let $G$ be the identity component of the group of isometries of $M$ and $\tilde G$ its universal covering group. $\tilde G$ acts canonically on $FM$ and since $\tilde G$ is simply connected this action lifts uniquely to a smooth action on $SM$, yielding a smooth equivariant action on the bundles $DM$ and $DM^*$, making them $\tilde G$-vector bundles. The Dirac operator $\slashed{\nabla}$ and its conjugate $\slashed{\nabla}^+ := {+} \circ \slashed{\nabla} \circ {+}$ are both invariant under these actions and hence the representation of $\tilde G$ on $\Gamma_0(DM \oplus DM^*)$ gives rise to a unitary representation $\rho$ of $\tilde G$ on $\mathcal{H}$. Taking $E := DM \oplus DM^*$ one shows that $(\mathcal{H}, K, E, \eta, \rho)$ is a linear fermionic field theory on $M$. Moreover, the differential operator $\slashed{D} = (\slashed{\nabla} + im) \oplus (\slashed{\nabla}^+ - im) : \Gamma_0(E) \to \Gamma_0(E)$ maps to the kernel of $\eta$, i.e. $\eta \circ \slashed{D} = 0$. The square of $\slashed{D}$ has metric principal part.

The real scalar field (see [13]). The construction of the Klein–Gordon field starts with the Klein–Gordon operator for mass $m \geq 0$:
\[
  P := \Box_g + m^2, \tag{26}
\]
where $\Box_g = g^{ik} \nabla_i \nabla_k$ and $\nabla$ is the Levi-Civita covariant derivative. This operator acts on the real-valued smooth functions with compact support $C_{0r}^\infty(M)$. It has unique advanced and retarded fundamental solutions $F_s^\pm : C_{0r}^\infty(M) \to C_r^\infty(M)$ satisfying
\[
  P F_s^\pm = F_s^\pm P = \mathrm{id} \quad \text{on } C_{0r}^\infty(M), \qquad \mathrm{supp}(F_s^\pm f) \subset J^\pm(\mathrm{supp}(f)).
\]
With $F_s := F_s^+ - F_s^-$, $\hat\sigma(f_1, f_2) := \int_M f_1 F_s(f_2)\, w$ defines an antisymmetric bilinear form on $C_{0r}^\infty(M) \times C_{0r}^\infty(M)$, where $w$ is the pseudo-Riemannian volume form on $M$. Defining $W := C_{0r}^\infty(M)/\ker(F_s)$ with quotient map $\eta$, the bilinear form $\sigma(\eta(f_1), \eta(f_2)) := \hat\sigma(f_1, f_2)$ on $W$ is symplectic. We have a canonical linear action of the group $G$ on $C_{0r}^\infty(M)$ which leaves $\ker(F_s)$ and $\hat\sigma$ invariant and hence gives a representation $\rho$ of $G$ on $W$ by symplectomorphisms. Taking the trivial bundle $M \times \mathbb{R}$ for $E$, one shows that $(W, \sigma, E, \eta, \rho)$ is a linear bosonic field theory on $M$ and $\eta \circ P = 0$. Note that $P$ has metric principal part.

Remark 1. Since the complex scalar field consists of two independent real scalar fields it fits into this framework as well.

The Proca field (see [16]). Let $d$ be the exterior derivative of differential forms, $*$ the Hodge star operator and $\delta = *d*$. Take the cotangent bundle for $E$. For mass $m > 0$ the Proca equation for sections $f \in \Gamma_0(E)$ is
\[
  (\delta \circ d + m^2) f = 0, \tag{27}
\]
which is equivalent to the hyperbolic system
\begin{align}
  (\Box_g + m^2) f = (\delta \circ d + d \circ \delta + m^2) f &= 0, \tag{28}\\
  \delta f &= 0. \tag{29}
\end{align}
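The equivalence between the Proca equation (27) and the system (28), (29) is a short computation. The sketch below uses only $\delta \circ \delta = 0$ (which follows from $d \circ d = 0$ and $\delta = *d*$) and the assumption $m > 0$.

```latex
% (27) implies (29): apply \delta to (27) and use \delta\circ\delta = 0.
\[
  0 = \delta(\delta \circ d + m^2) f = m^2\, \delta f
  \;\Longrightarrow\; \delta f = 0 \qquad (m > 0).
\]
% (27) and (29) imply (28):
\[
  (\delta \circ d + d \circ \delta + m^2) f
  = (\delta \circ d + m^2) f + d(\delta f) = 0 .
\]
% Conversely, (28) and (29) imply (27):
\[
  (\delta \circ d + m^2) f
  = (\delta \circ d + d \circ \delta + m^2) f - d(\delta f) = 0 .
\]
```

The first step is where the mass enters: for $m = 0$ the constraint $\delta f = 0$ no longer follows from (27), which is consistent with the restriction to $m > 0$ for the Proca field.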
We define the differential operator $\tilde P : \Gamma_0(E) \to \Gamma_0(E)$ by
\[
  \tilde P := \Box_g + m^2. \tag{30}
\]
It has unique advanced and retarded fundamental solutions $\tilde F_p^\pm : \Gamma_0(E) \to \Gamma(E)$ satisfying
\[
  \tilde P \tilde F_p^\pm = \tilde F_p^\pm \tilde P = \mathrm{id} \quad \text{on } \Gamma_0(E), \qquad \mathrm{supp}(\tilde F_p^\pm f) \subset J^\pm(\mathrm{supp}(f)).
\]
We define the operators $F_p^\pm := (m^{-2} d \circ \delta + 1)\tilde F_p^\pm$. It is not difficult to see that the $F_p^\pm$ are the unique fundamental solutions for the operator $P = \delta \circ d + m^2$ with the above properties. With $F_p := F_p^+ - F_p^-$, $\hat\sigma(f_1, f_2) := \int_M f_1 \wedge *F_p(f_2)$ defines an antisymmetric bilinear form on $\Gamma_0(E) \times \Gamma_0(E)$. Taking $W := \Gamma_0(E)/\ker(F_p)$ with quotient map $\eta$, the bilinear form $\sigma(\eta(f_1), \eta(f_2)) := \hat\sigma(f_1, f_2)$ on $W$ is symplectic. The pullback of forms induces a linear action of $G$ on $\Gamma_0(E)$ which leaves $\ker(F_p)$ and $\hat\sigma$ invariant and hence gives rise to a representation $\rho$ of $G$ on $W$ by symplectomorphisms. Again one can show that $(W, \sigma, E, \eta, \rho)$ is a linear bosonic field theory and moreover, $\eta \circ \tilde P = \eta \circ P = 0$. One gets the following corollary.

Corollary 1. Let $(M, g, h_t)$ be a connected stationary globally hyperbolic oriented time-oriented 4-dimensional Lorentzian manifold. Let $O \to F(O)$ be a net of field algebras for one of the following free fields:
– The real or complex scalar field for mass $m \geq 0$,
– The Proca field for mass $m > 0$,
– The Dirac field for mass $m \geq 0$.
Assume that $\omega$ is a state over the field algebra $F$ which we require to be quasifree and continuous in the bosonic case and which is a ground- or KMS-state with respect to the canonical time translations. Then $\omega$ has the Reeh–Schlieder property.

Acknowledgement. The author would like to thank Prof. M. Wollenberg and Dr. R. Verch for useful discussions and comments. This work was supported by the Deutsche Forschungsgemeinschaft within the scope of the postgraduate scholarship programme “Graduiertenkolleg Quantenfeldtheorie” at the University of Leipzig.
References
1. Araki, H.: On Quasifree States of CAR and Bogoliubov Automorphisms. Publ. RIMS Kyoto Univ. 6, 385–442 (1970/71)
2. Aronszajn, N.: A unique continuation theorem for solutions of elliptic partial differential equations or inequalities of second order. J. Math. Pures Appl. 36, 235–249 (1957)
3. Bär, C.: Localization and semibounded energy – a weak unique continuation theorem. J. Geom. Phys. 34, 155–161 (2000)
4. Baum, H.: Spin-Strukturen und Dirac-Operatoren über pseudoriemannschen Mannigfaltigkeiten. Leipzig: Teubner, 1981
5. Borchers, H.J.: On revolutionizing quantum field theory. J. Math. Phys. 41, 3604–3673 (2000)
6. Borchers, H.J.: Über die Vollständigkeit lorentzinvarianter Felder in einer zeitartigen Röhre. Nuovo Cim. 19, 787 (1961)
7. Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics 2. Berlin–Heidelberg–New York: Springer, 1996
8. Brunetti, R., Fredenhagen, K., and Köhler, M.: The microlocal spectrum condition and Wick polynomials of free fields on curved space-times. Commun. Math. Phys. 180, 633–652 (1996)
9. Buchholz, D., Dreyer, O., Florig, M., Summers, S.J.: Geometric modular action and space-time symmetry groups. Rev. Math. Phys. 12, 475–560 (2000)
10. Dencker, N.: On the propagation of polarization sets for systems of real principal type. J. Funct. Anal. 46, 351–372 (1982)
11. Dieudonné, J.: Grundzüge der modernen Analysis, Volume 3. Braunschweig: Vieweg Verlag, 1976
12. Dieudonné, J.: Grundzüge der modernen Analysis, Volume 7. Braunschweig: Vieweg Verlag, 1976
13. Dimock, J.: Algebras of local observables on a manifold. Commun. Math. Phys. 77, 219–228 (1980)
14. Dimock, J.: Dirac quantum fields on a manifold. Trans. Am. Math. Soc. 269, 133–147 (1982)
15. Fulling, S.A.: Aspects of Quantum Field Theory in Curved Space-Time. Cambridge: Univ. Pr., 1989
16. Furlani, E.P.: Quantization of massive vector fields in curved space-time. J. Math. Phys. 40, 2611 (1999)
17. Gelfand, I.M., Wilenkin, N.J.: Verallgemeinerte Funktionen. Berlin: Deutscher Verlag der Wissenschaften, 1964
18. Guido, D., Longo, R., Roberts, J.E., Verch, R.: Charged sectors, spin and statistics in quantum field theory on curved space-times. 1999. math-ph/9906019, to appear in Rev. Math. Phys.
19. Haag, R.: Local quantum physics: Fields, particles, algebras. Berlin–Heidelberg–New York: Springer, 1992
20. Hörmander, L.: The Analysis of Linear Partial Differential Operators III. Berlin–Heidelberg–New York: Springer, 1985
21. Hörmander, L.: The Analysis of Linear Partial Differential Operators I. Berlin–Heidelberg–New York: Springer, 1990
22. Jäkel, C.D.: The Reeh-Schlieder property for thermal field theories. J. Math. Phys. 41, 1745–1754 (2000)
23. Jarchow, H.: Locally convex spaces. Stuttgart: B. G. Teubner, 1981
24. Kay, B.S.: Linear spin-zero quantum fields in external gravitational and scalar fields. I. A one particle structure for the stationary case. Commun. Math. Phys. 62, 55 (1978)
25. Kay, B.S.: The double wedge algebra for quantum fields on Schwarzschild and Minkowski space-times. Commun. Math. Phys. 100, 57 (1985)
26. Kay, B.S.: Sufficient conditions for quasifree states and an improved uniqueness theorem for quantum fields on space-times with horizons. J. Math. Phys. 34, 4519–4539 (1993)
27. Kay, B.S., Wald, R.M.: Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon. Phys. Rept. 207, 49–136 (1991)
28. Lawson, H.B., Michelsohn, M.L.: Spin Geometry. Princeton, NJ: University Press, 1989
29. Maurin, K.: General Eigenfunction Expansions and Unitary Representations of Topological Groups. Warszawa: Polish Scientific Publishers, 1968
30. O’Neill, B.: Semi-Riemannian Geometry. London–New York: Academic Press, 1983
31. Pietsch, A.: Nukleare Lokalkonvexe Räume. Berlin: Akademie-Verlag, 1965
32. Radzikowski, M.J.: The Hadamard condition and Kay’s conjecture in (axiomatic) quantum field theory on curved space-times. PhD-Thesis, 1992
33. Radzikowski, M.J.: Micro-local approach to the Hadamard condition in quantum field theory on curved space-time. Commun. Math. Phys. 179, 529 (1996)
34. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von Lorentz-invarianten Feldern. Nuovo Cimento 22, 1051–1068 (1961)
35. Sahlmann, H., Verch, R.: Passivity and microlocal spectrum condition. math-ph/0002021, to appear in Commun. Math. Phys.
36. Strohmaier, A.: The Reeh–Schlieder property for the Dirac field on static spacetimes. math-ph/9911023
37. Verch, R.: Antilocality and a Reeh-Schlieder theorem on manifolds. Lett. Math. Phys. 28, 143–154 (1993)
38. Wald, R.M.: The back reaction effect in particle creation in curved space-time. Commun. Math. Phys. 54, 1 (1977)
39. Wald, R.M.: Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago Lectures in Physics, 1994

Communicated by H. Araki
Commun. Math. Phys. 215, 119 – 142 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Existence of Non-Topological Multivortex Solutions in the Relativistic Self-Dual Chern–Simons Theory
Dongho Chae1, Oleg Yu. Imanuvilov2
1 Department of Mathematics, Seoul National University, Seoul 151-742, Korea. E-mail: [email protected]
2 Department of Mathematics, Iowa State University, Ames, IA 50011, USA. E-mail: [email protected]
Received: 6 July 1999 / Accepted: 14 June 2000
Abstract: We construct a general type of multivortex solutions of the self-duality equations (the Bogomol’nyi equations) of the (2+1)-dimensional relativistic Chern–Simons model with the non-topological boundary condition near infinity. For this construction we use a perturbation argument around the explicit solutions of the Liouville equation.

Introduction

The Lagrangian density of the (2+1)-dimensional relativistic Chern–Simons gauge field theory is given by
\[ \mathcal{L} = \frac{\kappa}{4}\,\varepsilon^{\mu\nu\rho}F_{\mu\nu}A_\rho + (D_\mu\phi)\overline{(D^\mu\phi)} - \frac{1}{\kappa^2}|\phi|^2(1-|\phi|^2)^2, \tag{1} \]
where $A_\mu$ ($\mu = 0, 1, 2$) is the gauge field on $\mathbb{R}^3$, $F_{\mu\nu} = \frac{\partial}{\partial x^\mu}A_\nu - \frac{\partial}{\partial x^\nu}A_\mu$ is the corresponding curvature tensor, $\phi = \phi_1 + i\phi_2$ ($i = \sqrt{-1}$) is a complex field on $\mathbb{R}^3$, called the Higgs field, $D_\mu = \frac{\partial}{\partial x^\mu} - iA_\mu$ is the gauge covariant derivative associated with $A_\mu$, $\varepsilon^{\mu\nu\rho}$ is the totally skew-symmetric tensor with $\varepsilon^{012} = 1$, and finally $\kappa > 0$ is the Chern–Simons coupling constant. Our metric on $\mathbb{R}^3$ is $(g_{\mu\nu}) = \mathrm{diag}(1, -1, -1)$. This model was suggested by Hong–Kim–Pac [8] and Jackiw–Weinberg [10] to study vortex solutions of the Abelian Higgs model which carry both electric and magnetic charges (see [5] for a general survey of the model). This feature of the model is important in the physics of high critical temperature superconductivity. The Gauss equation (the variational equation for $A_0$) of (1) is given by
\[ \kappa F_{12} = -2|\phi|^2 A_0. \tag{2} \]
This research was supported in part by GARC-KOSEF, BSRI-MOE, KOSEF (K95070, 970702013), and KIAS-M97003.
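Using this relation and integrating by parts, the static energy corresponding to (1) can be written as ([8, 10])
\[ E = \int_{\mathbb{R}^2}\Big\{ \frac{\kappa^2 F_{12}^2}{4|\phi|^2} + \sum_{j=1}^{2}|D_j\phi|^2 + \frac{1}{\kappa^2}|\phi|^2(1-|\phi|^2)^2 \Big\}\,dx \tag{3a} \]
\[ \phantom{E} = \int_{\mathbb{R}^2}\Big\{ |(D_1 \pm iD_2)\phi|^2 + \Big|\frac{\kappa F_{12}}{2\phi} \pm \frac{1}{\kappa}\bar{\phi}(|\phi|^2-1)\Big|^2 \Big\}\,dx \pm \int_{\mathbb{R}^2}F_{12}\,dx, \tag{3b} \]
where the upper (lower) sign is chosen if the integral $\int_{\mathbb{R}^2}F_{12}\,dx$ is nonnegative (nonpositive). Below we choose the upper sign. We thus have
\[ E \ge \Big|\int_{\mathbb{R}^2}F_{12}\,dx\Big|, \]
and this lower bound for the energy is saturated if and only if $(\phi, A)$, $A = (A_1, A_2)$, satisfies the self-duality equations, or the Bogomol’nyi equations:
\[ (D_1 + iD_2)\phi = 0, \tag{4} \]
\[ F_{12} + \frac{2}{\kappa^2}|\phi|^2(|\phi|^2-1) = 0. \tag{5} \]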
The system (4)–(5) is equipped with one of the following natural boundary conditions,
\[ |\phi(x)| \to 1 \quad \text{as } |x| \to \infty \tag{6} \]
or
\[ |\phi(x)| \to 0 \quad \text{as } |x| \to \infty, \tag{7} \]
in order to make the energy (3a) finite. The solutions $(\phi, A)$ of (4)–(5) satisfying (6) are called topological solutions, while the solutions of (4)–(5) satisfying the boundary condition (7) are called nontopological solutions. Following Jaffe–Taubes [11], we can reduce the system (4)–(5) with (6) or (7) to a simpler partial differential equation as follows. We introduce new variables $(u, \theta)$ by
\[ \phi = e^{\frac{1}{2}(u+i\theta)}, \qquad \theta = 2\sum_{j=1}^{N}\arg(z-z_j), \qquad z = x_1 + ix_2 \in \mathbb{C}^1 = \mathbb{R}^2, \tag{8} \]
where $z_j$, $j = 1, 2, \dots, N$, allowing multiplicities, are the zeros of $\phi(z)$, called the centers of the vorticities. Then we can rewrite (4)–(5) with (6) or (7) as (hereafter, we set $\kappa = 2$ for simplicity)
\[ \Delta u = e^{u}(e^{u}-1) + 4\pi\sum_{j=1}^{N}\delta(z-z_j), \tag{9} \]
\[ u(x) \to 0 \ \text{as } |x| \to \infty \ \ (\text{topological boundary condition}) \quad \text{or} \quad u(x) \to -\infty \ \text{as } |x| \to \infty \ \ (\text{non-topological boundary condition}). \tag{10} \]
For the topological boundary condition Wang [15] proved existence of general multivortex solutions, using the variational method similar to Jaffe–Taubes [11]. Later,
Spruck–Yang [12] proved existence of topological solutions, using a more constructive iteration method, and generated even shapes of vortices by numerical simulations. (See [14] also for the study of (4)–(5) in a periodic bounded domain, and [3] for the study of topological solitons of the Chern–Simons model coupled with the Maxwell fields in a self-dual fashion.) The non-topological solutions, however, are not yet as well understood as the topological ones. In [13] Spruck–Yang proved existence of radially symmetric non-topological solutions which correspond to solutions of (4)–(5) with a single center (see [4] also for related studies of the radial solutions). In this paper we prove existence of a general type of non-topological multivortex solutions. Moreover, we establish precise decay estimates near infinity for our solutions. More specifically, we prove the following theorem:

Main Theorem. Let $\{z_j\}_{j=1}^{N} \subset \mathbb{C}^1$ be arbitrarily given. Then there exists a solution $(\phi, A)$ to (4)–(5) (with $\kappa = 2$), (7) such that the function $\phi(z)$ has the zeros $\{z_j\}_{j=1}^{N}$ with possible multiplicities, and the pair $(\phi, A)$ makes the energy functional (3a) finite. Moreover, our constructed solutions satisfy:

(i) The decay estimates: there exists $\beta > 0$ such that
\[ |\phi|^2 + |F_{12}| + |D_1\phi|^2 + |D_2\phi|^2 = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty. \tag{11} \]
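(ii) Flux integral: there exist a sequence of solutions $\phi_i$ and a sequence of positive numbers $\{\gamma_i\}_{i=1}^{\infty}$, $\lim_{i\to+\infty}\gamma_i = 0$, such that
\[ \Phi = \int_{\mathbb{R}^2}F_{12}\,dx = \frac{2}{\kappa^2}\int_{\mathbb{R}^2}|\phi_i|^2(1-|\phi_i|^2)\,dx = 4\pi(N+1) + \gamma_i. \tag{12} \]

Remark 0.1. The solutions constructed in the Main Theorem have the following representation: there exist $\varepsilon_0 > 0$ and a continuous mapping $G : (0, \varepsilon_0) \ni \varepsilon \mapsto (a^*_{1,\varepsilon}, a^*_{2,\varepsilon}, u^*_\varepsilon)$, in an appropriate sense to be described later, such that
\[ u(x) = \ln\rho_{\varepsilon,a^*_\varepsilon}(x) + \varepsilon^2 w_0(\varepsilon x) + \varepsilon^2 u^*_\varepsilon(\varepsilon x), \tag{13} \]
where
\[ \rho_{\varepsilon,a}(z) = \frac{8\varepsilon^{2N+2}|f(z)|^2}{\big(1+\varepsilon^{2N+2}\big|F(z)+\frac{a}{\varepsilon^{N+1}}\big|^2\big)^2} \]
with
\[ f(z) = (N+1)\prod_{k=1}^{N}(z-z_k), \qquad F(z) = \int_0^z f(\xi)\,d\xi, \]
and $z = x_1 + ix_2$, $a = a_1 + ia_2$. The function $\ln\rho_{\varepsilon,a}$ satisfies the Liouville equation
\[ \Delta\ln\rho_{\varepsilon,a} + \rho_{\varepsilon,a} = 4\pi\sum_{j=1}^{N}\delta(z-z_j) \quad \text{in } \mathbb{R}^2, \tag{14} \]
and $w_0(r)$ is the solution to the ordinary differential equation
\[ \frac{1}{r}\frac{d}{dr}\Big(r\frac{dw_0}{dr}\Big) + \rho w_0 = \rho^2 \quad \text{in } \mathbb{R}_+, \]
where $\rho(r) = \frac{8(N+1)^2r^{2N}}{(1+r^{2N+2})^2}$, and satisfies the asymptotic formula (see Lemma 2.2)
\[ \varepsilon^2 w_0(\varepsilon x) = -\tilde{C}\varepsilon^2\ln|x| + o(\ln|x|) \quad \text{as } |x| \to \infty, \tag{15} \]
where $\tilde{C}$ is a positive number independent of $\varepsilon$. Finally, for the last term in (13) we have the following estimate:
\[ |u^*_\varepsilon(x)| \le C(\varepsilon)\ln(|x|+2) \quad \forall x \in \mathbb{R}^2, \tag{16} \]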
where $C(\varepsilon) \to 0$ as $\varepsilon \to +0$. Therefore the contributions from the second and the third terms in (13) can be made as small as we want compared to the first one, and we could say that $u(x)$ is close to the solution of the Liouville equation (in the topology of the Hilbert space $Y_{1/2}$).1 Moreover, (15) and (16) imply that the contribution from the third term can be made arbitrarily small compared to the second one as $\varepsilon \to +0$. Since $\lim_{r\to+\infty}\frac{w_0(r)}{\ln r} < 0$, from (13) we can deduce that if
\[ \lim_{|x|\to\infty}\frac{u(x)}{\ln|x|} = -B, \tag{17} \]
then necessarily we have $B > 2N+4$. It is interesting to compare this result with the previously known one (see Theorem 2.1 of [13], and Theorem 3.2 of [4]) for the radially symmetric solutions of the following equation for vorticities concentrated at the origin of $\mathbb{R}^2$:
\[ \Delta u = e^{u}(e^{u}-1) + 4\pi N\delta(z), \tag{18} \]
equipped with the nontopological boundary condition (10). There the existence of a radially symmetric non-topological solution $u$ of (18) satisfying (17) with $B > 2N+4$ was proved using the shooting method, which consists in looking for $t_0$, $\alpha$ such that the solution of the ordinary differential equation
\[ \frac{1}{r}\frac{d}{dr}\Big(r\frac{du}{dr}\Big) - e^{u}(e^{u}-1) = 0, \qquad u(t_0) = -\alpha, \quad u'(t_0) = 0, \]
satisfies the boundary conditions
\[ \lim_{r\to+\infty}u = -\infty, \qquad \lim_{r\to+0}\frac{u}{\ln r} = 2N. \]
It was established that for all $\alpha > \ln 2$ a suitable $t_0$ exists. The solutions constructed in this paper correspond to the case when the parameter $\alpha$ is sufficiently large (the parameter $\varepsilon$ is very small). Thus our existence result is a “proper” generalization of the previous ones for the radially symmetric solutions in [4] and [13] for large values of the parameter $\alpha$.

Remark 0.2. Equation (12) implies that the integral $\int_{\mathbb{R}^2}F_{12}\,dx$, which corresponds to the minimum of the total static energy given by (3a)–(3b), is not “quantized”, contrary to the case of topological solutions. This phenomenon, which was discussed in the physics literature (see e.g. [5] and references therein) for radially symmetric solutions, is to the authors’ knowledge rigorously verified for the first time in this paper for a nonradial solution. We also note that the estimate $\Phi > 4(N+1)\pi$ was proved by Spruck–Yang [13] for any radially symmetric solution. The fact that $\gamma_i$ in the Main Theorem can be chosen arbitrarily small implies that their estimate is sharp for radially symmetric solutions. For arbitrary nonradial solutions such an estimate is currently not available.

1 Definitions of the spaces $Y_{1/2}$ and $X_{1/2}$ are given in the next section.
Our idea of proof is that after a suitable scale transformation our equation can be regarded as a perturbation of a radially symmetric equation, and that an iteration starting from a radial solution of the Liouville equation, superposed with an appropriate “small” nonradial function, works well. For the reader’s convenience we briefly sketch below the existence part of the proof of our main theorem in the radially symmetric case. As we mentioned above, our solution is constructed to be close to that of the Liouville equation in the norm of a certain Hilbert space. One of the possible solutions to the Liouville equation (14) is the function $\ln\rho$, where
\[ \rho(r) = \frac{8(N+1)^2r^{2N}}{(1+r^{2N+2})^2}. \]
Then the function
\[ v(r) = v_\varepsilon(r) = \frac{1}{\varepsilon^2}u\Big(\frac{r}{\varepsilon}\Big) - \frac{1}{\varepsilon^2}\ln\big(\varepsilon^2\rho(r)\big) - w_0(r) \]
satisfies the functional equation
\[ P(v, \varepsilon) = \Delta v + \varepsilon^{-2}\rho\,e^{\varepsilon^2(v+w_0)} - \rho^2e^{2\varepsilon^2(v+w_0)} - \varepsilon^{-2}\rho + \Delta w_0 = 0. \]
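As a numerical sanity check of the statement that $\ln\rho$ solves the Liouville equation away from the origin, the following sketch evaluates the residual of $\Delta\ln\rho + \rho = 0$ in radial form with central differences. It is only an illustration and not part of the original argument; the function names (`rho`, `radial_laplacian`, `liouville_residual`) and the step size are assumptions made here.

```python
import math


def rho(r, N):
    """rho(r) = 8 (N+1)^2 r^(2N) / (1 + r^(2N+2))^2, as defined in the text."""
    return 8.0 * (N + 1) ** 2 * r ** (2 * N) / (1.0 + r ** (2 * N + 2)) ** 2


def radial_laplacian(func, r, h=1e-4):
    """Central-difference approximation of f''(r) + f'(r)/r for r > h."""
    d1 = (func(r + h) - func(r - h)) / (2.0 * h)
    d2 = (func(r + h) - 2.0 * func(r) + func(r - h)) / (h * h)
    return d2 + d1 / r


def liouville_residual(r, N):
    """Residual of Delta(ln rho) + rho, which should vanish for r > 0."""
    return radial_laplacian(lambda s: math.log(rho(s, N)), r) + rho(r, N)


if __name__ == "__main__":
    for N in (1, 2, 3):
        worst = max(abs(liouville_residual(r, N)) for r in (0.5, 1.0, 2.0, 5.0))
        print("N =", N, "largest residual:", worst)
```

The residual is only zero up to discretization and rounding error, so the check compares it against a small tolerance rather than against exact zero.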
Now our goal is to find a continuous mapping $\varepsilon \mapsto v_\varepsilon$ in a neighborhood of the origin, satisfying the equation $P(v_\varepsilon, \varepsilon) = 0$. We find that $P'_v(0, 0)(w) = (\Delta + \rho)w$. By explicitly solving the ordinary differential equation (see Lemma 2.1)
\[ (\Delta + \rho)w(r) = \frac{d^2w}{dr^2} + \frac{1}{r}\frac{dw}{dr} + \rho w = f \quad \text{in } \mathbb{R}_+ \]
for a given $f \in X_{1/2} \cap C^1(\mathbb{R}_+)$, we find that the linearized operator $A = P'_v(0, 0) : Y_{1/2} \to X_{1/2}$ is onto. Moreover, we can infer that $\ker A$ is spanned by $\{\varphi_0\}$, where
\[ \varphi_0(r) = \frac{1-r^{2N+2}}{1+r^{2N+2}}. \]
Thus we can apply the standard implicit function theorem to $P(v, \varepsilon) = 0$, which guarantees the existence of at least one solution $v_\varepsilon$ in a neighborhood of the origin in the quotient space $Y_{1/2}/\ker A$. The fact that our real solution $u(r) = \ln\rho_\varepsilon(r) + \varepsilon^2w_0(\varepsilon r) + \varepsilon^2v_\varepsilon(\varepsilon r)$ satisfies the nontopological boundary condition is explained briefly in Remark 0.1 above.

The organization of the paper is the following. In Sect. 1 we introduce basic function spaces, and derive some of their properties that are used in the following sections. In Sect. 2 we formulate our problem in terms of a functional equation, and establish properties of the linearized operator of the functional equation. In Sect. 3, based on the preliminary results of Sect. 2, we prove our main theorem. We postpone all the proofs of the auxiliary lemmas to Sect. 4, so that the reader can follow the main line of the argument.
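1. Introduction of Function Spaces

In this section we define the class of function spaces which are necessary in the sequel and establish some properties of functions from these spaces. Let $\alpha \in (0, 1)$ be given. We introduce the Hilbert spaces $X_\alpha$ and $Y_\alpha$ as follows:
\[ X_\alpha = \Big\{ u \in L^2_{loc}(\mathbb{R}^2) \ \Big|\ \int_{\mathbb{R}^2}(1+|x|^{2+\alpha})u^2\,dx < \infty \Big\}, \]
equipped with the inner product $(u, v)_{X_\alpha} = \int_{\mathbb{R}^2}(1+|x|^{2+\alpha})uv\,dx$, and
\[ Y_\alpha = \Big\{ u \in W^{2,2}_{loc}(\mathbb{R}^2) \ \Big|\ \|\Delta u\|^2_{X_\alpha} + \Big\|\frac{u}{1+|x|^{1+\frac{\alpha}{2}}}\Big\|^2_{L^2(\mathbb{R}^2)} < \infty \Big\}, \]
equipped with the inner product
\[ (u, v)_{Y_\alpha} = (\Delta u, \Delta v)_{X_\alpha} + \int_{\mathbb{R}^2}\frac{uv}{1+|x|^{2+\alpha}}\,dx. \]
These spaces are equipped with the natural norms $\|u\|^2_{X_\alpha} = (u, u)_{X_\alpha}$ and $\|u\|^2_{Y_\alpha} = (u, u)_{Y_\alpha}$, respectively. Thanks to the inequality
\[ \int_{\mathbb{R}^2}|u|\,dx \le \Big(\int_{\mathbb{R}^2}\frac{dx}{1+|x|^{2+\alpha}}\Big)^{\frac{1}{2}}\Big(\int_{\mathbb{R}^2}(1+|x|^{2+\alpha})u^2\,dx\Big)^{\frac{1}{2}} \]
there is a continuous imbedding
\[ X_\alpha \hookrightarrow L^1(\mathbb{R}^2) \quad \forall\alpha \in (0, 1). \tag{1.1} \]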
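The imbedding (1.1) rests on the finiteness of the weight integral $\int_{\mathbb{R}^2}(1+|x|^{2+\alpha})^{-1}dx$. The sketch below, which is an illustration added here and not part of the original text, evaluates this integral in polar coordinates with the substitution $r = e^s$ and compares it with the closed form $2\pi\cdot\frac{\pi/(2+\alpha)}{\sin(2\pi/(2+\alpha))}$ obtained from the standard formula $\int_0^\infty\frac{r^{s-1}}{1+r^p}dr = \frac{\pi}{p\sin(\pi s/p)}$, $0 < s < p$. The function names and the truncation parameters are assumptions chosen for this example.

```python
import math


def weight_integral(alpha, half_width=80.0, steps=40000):
    """Trapezoid rule for 2*pi * int_0^inf r / (1 + r^(2+alpha)) dr.

    After the substitution r = exp(s) the integrand becomes
    exp(2 s) / (1 + exp((2 + alpha) s)), which decays exponentially
    in both directions, so the s-range is truncated to [-half_width, half_width].
    """
    p = 2.0 + alpha
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        s = -half_width + i * h
        value = math.exp(2.0 * s) / (1.0 + math.exp(p * s))
        total += value * (0.5 if i in (0, steps) else 1.0)
    return 2.0 * math.pi * total * h


def closed_form(alpha):
    """Closed form of the same integral, valid for 0 < alpha (so that 2 < 2 + alpha)."""
    p = 2.0 + alpha
    return 2.0 * math.pi * (math.pi / p) / math.sin(2.0 * math.pi / p)


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        print(alpha, weight_integral(alpha), closed_form(alpha))
```

The closed form blows up as $\alpha \to +0$, which reflects the fact that the weight $(1+|x|^2)^{-1}$ is no longer integrable on $\mathbb{R}^2$.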
Also, by the local regularity of the Laplace operator (see [7]) we have
\[ Y_\alpha \subset C^0_{loc}(\mathbb{R}^2) \quad \forall\alpha \in (0, 1). \tag{1.2} \]
We start from the following elementary proposition:

Proposition 1.1. Let $\alpha \in (0, 1)$ and let $v \in Y_\alpha$ be a harmonic function. Then $v \equiv \mathrm{const}$.

Proof. In polar coordinates the Fourier expansion in $\theta$ yields
\[ v(x) = \sum_{k=0}^{\infty}v_k(r)e^{ik\theta} \]
with coefficients $v_k(r)$ rapidly decreasing in $k$ for each fixed $r \ge 0$. Since $v(x)$ is a harmonic function, each $v_k(r)$ satisfies the ordinary differential equation
\[ \frac{d^2v_k}{dr^2} + \frac{1}{r}\frac{dv_k}{dr} - \frac{k^2}{r^2}v_k = 0 \quad \forall k \in \mathbb{Z}_+ \cup \{0\}. \]
The general solution of these ordinary differential equations is well known:
\[ v_k(r) = M_{1,k}r^k + \frac{M_{2,k}}{r^k} \ \ \text{for } k \in \mathbb{Z}_+, \qquad v_0(r) = M_{1,0} + M_{2,0}\ln r, \]
where $M_{1,k}$, $M_{2,k}$ are constants. Noting that $\int_{-\pi}^{\pi}e^{ik\theta}d\theta$ equals $0$ for $k \ne 0$ and $2\pi$ for $k = 0$, we deduce $v_k \in X_\alpha$ $\forall k \in \mathbb{Z}_+ \cup \{0\}$. Hence $M_{1,k} = 0$ for all $k \in \mathbb{Z}_+$. Then, taking into account that $v$ is a smooth function in a neighborhood of zero, we obtain $M_{2,k} = 0$ for all $k \in \mathbb{Z}_+ \cup \{0\}$. Hence $v \equiv M_{1,0}$ is constant.
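Denote $\ln^+|x| = \max\{0, \ln|x|\}$ below. The next lemma provides a pointwise estimate for an arbitrary function from $Y_\alpha$. In particular it implies that functions from this space have at most logarithmic growth at infinity.

Lemma 1.1. Let $\alpha \in (0, 1)$. Then there exists $C_1 > 0$ such that for all $v \in Y_\alpha$,
\[ |v(x)| \le C_1\|v\|_{Y_\alpha}(\ln^+|x| + 1) \quad \forall x \in \mathbb{R}^2. \tag{1.3} \]

Proof. For given $v \in Y_\alpha$, we set $\Delta v = g$. By definition of the space $Y_\alpha$ the function $g$ belongs to $X_\alpha$. We consider the function
\[ \tilde{v}(x) = \frac{1}{2\pi}\int_{\mathbb{R}^2}\ln|x-\tau|\,g(\tau)\,d\tau. \]
It is well known that $\Delta\tilde{v} = g$ in $\mathbb{R}^2$. Using the Cauchy–Bunyakovskii inequality we obtain
\[ \begin{aligned} |\tilde{v}(x)| &\le \frac{1}{2\pi}\Big|\int_{\mathbb{R}^2}\ln|x-\tau|\,g(\tau)\,d\tau\Big| \le \frac{1}{2\pi}\int_{\mathbb{R}^2}\big|\ln|x-\tau|\,g(\tau)\big|\,d\tau \\ &\le \frac{1}{2\pi}\Big(\int_{\mathbb{R}^2}\frac{\ln^2|x-\tau|}{1+|\tau|^{2+\alpha}}\,d\tau\Big)^{\frac{1}{2}}\Big(\int_{\mathbb{R}^2}(1+|\tau|^{2+\alpha})|g(\tau)|^2\,d\tau\Big)^{\frac{1}{2}} \\ &\le \frac{1}{2\pi}\|v\|_{Y_\alpha}\Big(\int_{\mathbb{R}^2}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau\Big)^{\frac{1}{2}}. \end{aligned} \tag{1.4} \]
Let us estimate the last integral in (1.4),
\[ I^2 = \int_{\mathbb{R}^2}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau = \int_{|\tau|\le1}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau + \int_{|\tau|\ge1}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau = A_1 + A_2. \]
Obviously,
\[ A_1 = \int_{|\tau|\le1}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau \le \int_{|\tau|\le1}\ln^2|\tau|\,d\tau \le C. \tag{1.5} \]
Note that for $|\tau| \ge 2|x|$ we have
\[ |\tau - x| \ge |\tau| - |x| = \frac{1}{2}|\tau| + \frac{1}{2}|\tau| - |x| \ge \frac{1}{2}|\tau|. \]
Thus,
\[ \int_{|\tau|\ge1}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau = \int_{\{|\tau|\ge1\}\cap\{|\tau|\ge2|x|\}}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau + \int_{\{|\tau|\ge1\}\cap\{|\tau|<2|x|\}}\frac{\ln^2|\tau|}{1+|x-\tau|^{2+\alpha}}\,d\tau \le C\big(1+(\ln^+2|x|)^2\big), \tag{1.6} \]
where the constant $C$ depends on $\alpha$ only. Then inequalities (1.4)–(1.6) imply
\[ |\tilde{v}(x)| \le C\|v\|_{Y_\alpha}(\ln^+|x| + 1) \quad \forall x \in \mathbb{R}^2. \tag{1.7} \]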
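Hence $\tilde{v} \in Y_\alpha$, and by Proposition 1.1 there exists $k \in \mathbb{R}$ such that $\tilde{v}(x) = v(x) + k$. Thus the pointwise estimate (1.7) yields $\|v + k\|_{Y_\alpha} \le C\|v\|_{Y_\alpha}$. Hence
\[ |k| \le C\|k\|_{Y_\alpha} \le C(\|v+k\|_{Y_\alpha} + \|v\|_{Y_\alpha}) \le C\|v\|_{Y_\alpha}. \]
Then by (1.7) and this estimate we obtain (1.3).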
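2. Functional Formulation of the Problem

The aim of this section is two-fold. First we wish to transform Eq. (9) to a more convenient form, and second we would like to outline the strategy of the proof of the main theorem. As the first step in transforming Eq. (9) we get rid of the delta-functions on its right-hand side, using solutions of the Liouville equation. For the reader’s convenience we recall some notation. Throughout this paper we denote $z = x_1 + ix_2$, $a = a_1 + ia_2 \in \mathbb{C}^1 = \mathbb{R}^2$. Let us define
\[ f(z) = (N+1)\prod_{k=1}^{N}(z-z_k), \qquad F(z) = \int_0^z f(\xi)\,d\xi. \]
Let us introduce functions $\rho_{\varepsilon,a}(z)$, $\rho(r)$ by
\[ \rho_{\varepsilon,a}(z) = \frac{8\varepsilon^{2N+2}|f(z)|^2}{\big(1+\varepsilon^{2N+2}\big|F(z)+\frac{a}{\varepsilon^{N+1}}\big|^2\big)^2}, \qquad \rho(r) = \frac{8(N+1)^2r^{2N}}{(1+r^{2N+2})^2}. \tag{2.1} \]
We note that for any $\varepsilon > 0$ and $a \in \mathbb{C}^1$, $\ln\rho_{\varepsilon,a}(z)$ is a solution of the Liouville equation (14). Defining $v(z) = u(z) - \ln\rho_{\varepsilon,a}(z)$, we obtain from (9)
\[ \Delta v + \rho_{\varepsilon,a}e^{v} - \rho_{\varepsilon,a}^2e^{2v} - \rho_{\varepsilon,a} = 0. \tag{2.2} \]
Then, making the change of variables $z \to \frac{z}{\varepsilon}$ and denoting $\tilde{v}(z) = v\big(\frac{z}{\varepsilon}\big)$, we have
\[ \Delta\tilde{v} + \frac{8\varepsilon^{2N}\big|f\big(\frac{z}{\varepsilon}\big)\big|^2e^{\tilde{v}}}{\big(1+\varepsilon^{2N+2}\big|F\big(\frac{z}{\varepsilon}\big)+\frac{a}{\varepsilon^{N+1}}\big|^2\big)^2} - \frac{64\varepsilon^{4N+2}\big|f\big(\frac{z}{\varepsilon}\big)\big|^4e^{2\tilde{v}}}{\big(1+\varepsilon^{2N+2}\big|F\big(\frac{z}{\varepsilon}\big)+\frac{a}{\varepsilon^{N+1}}\big|^2\big)^4} - \frac{8\varepsilon^{2N}\big|f\big(\frac{z}{\varepsilon}\big)\big|^2}{\big(1+\varepsilon^{2N+2}\big|F\big(\frac{z}{\varepsilon}\big)+\frac{a}{\varepsilon^{N+1}}\big|^2\big)^2} = 0. \tag{2.3} \]
We denote
\[ g_\varepsilon(z, a) = \frac{8\varepsilon^{2N}\big|f\big(\frac{z}{\varepsilon}\big)\big|^2}{\big(1+\big|\varepsilon^{N+1}F\big(\frac{z}{\varepsilon}\big)+a\big|^2\big)^2}. \tag{2.4} \]
Before going further we note that
\[ g_0(z, 0) = \rho(|z|). \tag{2.5} \]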
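Two facts used in this reduction can be checked numerically: that $\ln\rho_{\varepsilon,a}$ satisfies $\Delta\ln\rho_{\varepsilon,a} + \rho_{\varepsilon,a} = 0$ away from the zeros $z_j$, and that $g_\varepsilon(z, 0)$ approaches $\rho(|z|)$ as $\varepsilon \to 0$, in agreement with (2.5). The sketch below is an illustration under stated assumptions and not part of the original text: the helper names, the sample zeros, the finite-difference step and the sample points are all hypothetical choices made here.

```python
import math


def poly_from_roots(roots, lead):
    """Coefficients (lowest degree first) of lead * prod_k (z - roots[k])."""
    coeffs = [complex(lead)]
    for root in roots:
        times_z = [0j] + coeffs
        times_root = [-root * c for c in coeffs] + [0j]
        coeffs = [p + q for p, q in zip(times_z, times_root)]
    return coeffs


def antiderivative(coeffs):
    """Coefficients of F(z) = int_0^z f(xi) d xi for a polynomial f."""
    return [0j] + [c / (k + 1) for k, c in enumerate(coeffs)]


def poly_eval(coeffs, z):
    """Horner evaluation of a polynomial given lowest-degree-first coefficients."""
    value = 0j
    for c in reversed(coeffs):
        value = value * z + c
    return value


def rho_radial(r, N):
    """rho(r) from (2.1)."""
    return 8.0 * (N + 1) ** 2 * r ** (2 * N) / (1.0 + r ** (2 * N + 2)) ** 2


def rho_eps_a(z, roots, eps, a):
    """rho_{eps,a}(z) from (2.1) with f(z) = (N+1) prod (z - z_k)."""
    N = len(roots)
    f = poly_from_roots(roots, N + 1)
    F = antiderivative(f)
    scale = eps ** (2 * N + 2)
    shifted = poly_eval(F, z) + a / eps ** (N + 1)
    return 8.0 * scale * abs(poly_eval(f, z)) ** 2 / (1.0 + scale * abs(shifted) ** 2) ** 2


def g_eps(z, roots, eps, a):
    """g_eps(z, a) from (2.4), for eps > 0."""
    N = len(roots)
    f = poly_from_roots(roots, N + 1)
    F = antiderivative(f)
    inner = eps ** (N + 1) * poly_eval(F, z / eps) + a
    return 8.0 * eps ** (2 * N) * abs(poly_eval(f, z / eps)) ** 2 / (1.0 + abs(inner) ** 2) ** 2


def liouville_residual(z, roots, eps, a, h=1e-3):
    """Five-point approximation of Delta(ln rho_{eps,a}) + rho_{eps,a} at z."""
    def log_rho(w):
        return math.log(rho_eps_a(w, roots, eps, a))
    lap = (log_rho(z + h) + log_rho(z - h) + log_rho(z + 1j * h)
           + log_rho(z - 1j * h) - 4.0 * log_rho(z)) / (h * h)
    return lap + rho_eps_a(z, roots, eps, a)


if __name__ == "__main__":
    zeros = [1 + 0.5j, -0.7j]  # hypothetical vortex centers, N = 2
    for point in (0.2 + 0.1j, 2 - 1j, -1.5 + 0.8j):
        print(point, liouville_residual(point, zeros, 0.5, 0.3 - 0.2j))
    for point in (0.5 + 0.3j, -1.2 + 0.4j):
        print(point, g_eps(point, zeros, 1e-5, 0j), rho_radial(abs(point), 2))
```

The sample points are kept away from the zeros, where $\ln\rho_{\varepsilon,a}$ is singular and the finite-difference stencil is not meaningful.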
Now, we can write (2.3) as
\[ \Delta\tilde{v} + g_\varepsilon e^{\tilde{v}} - \varepsilon^2g_\varepsilon^2e^{2\tilde{v}} - g_\varepsilon = 0. \tag{2.6} \]
To transform (2.6) further we construct a solution to the ordinary differential equation
\[ L_1w = \frac{d^2w}{dr^2} + \frac{1}{r}\frac{dw}{dr} + \rho w = f \quad \text{in } \mathbb{R}_+. \tag{2.7} \]
We first observe that the function $\varphi_0(r)$ defined by
\[ \varphi_0(r) = \frac{1-r^{2N+2}}{1+r^{2N+2}} \tag{2.8} \]
solves (2.7) with $f = 0$. (For a derivation of this fact see the proof of Lemma 2.4.) From this we establish the following two lemmas, the proofs of which are given in Sect. 4.

Lemma 2.1. Given $\alpha \in (0, \frac{1}{2})$ and $f = f(r)$ with $f \in X_\alpha \cap C^1(\mathbb{R}_+)$, the ordinary differential equation (2.7) has a solution $w(r) \in Y_\alpha$ given by the formula
\[ w(r) = \varphi_0(r)\Big\{\int_0^r\frac{\phi_f(s)-\phi_f(1)}{(1-s)^2}\,ds + \frac{\phi_f(1)\,r}{1-r}\Big\} \tag{2.9a} \]
with
\[ \phi_f(r) = \Big(\frac{1+r^{2N+2}}{1-r^{2N+2}}\Big)^2\frac{(1-r)^2}{r}\int_0^r\varphi_0(t)\,t\,f(t)\,dt, \tag{2.9b} \]
where $\phi_f(1)$ and $w(1)$ are defined as limits of $\phi_f(r)$ and $w(r)$ as $r \to 1$.

Lemma 2.2. Let $w_0(r)$ be a solution of the equation $L_1w_0 - \rho^2 \ (= \Delta w_0 + \rho w_0 - \rho^2) = 0$ in $\mathbb{R}_+$, obtained by substituting $f = \rho^2$ in the solution formula (2.9). Then the pointwise estimate
\[ |w_0(r)| \le C(\ln^+r + 1) \quad \forall r > 0 \tag{2.10} \]
holds true. Moreover, we have the asymptotic formula
\[ w_0(r) = -\tilde{C}\ln r + o(\ln r) \quad \text{as } r \to \infty, \tag{2.11} \]
where $C$ and $\tilde{C}$ are positive constants independent of $r$.

Let us introduce the mapping $P(\cdot, \cdot, \cdot) : \Omega \to X_\alpha$, where $\Omega$ is an open subset of $Y_\alpha \times \mathbb{R}^2 \times \mathbb{R}$ to be specified below, given by
\[ P(u, a, \varepsilon) = \Delta u + \varepsilon^{-2}g_\varepsilon(z, a)e^{\varepsilon^2(u+w_0)} - g_\varepsilon^2(z, a)e^{2\varepsilon^2(u+w_0)} - \varepsilon^{-2}g_\varepsilon(z, a) + \Delta w_0, \tag{2.12} \]
where $w_0(x) = w_0(|x|)$ is the function defined by Lemma 2.2. Then the problem of solving Eq. (9) is reduced to that of finding a mapping $\varepsilon \mapsto (u^*_\varepsilon, a^*_\varepsilon)$ from $\mathbb{R}$ into $Y_\alpha \times \mathbb{R}^2$, satisfying the functional equation
\[ P(u^*_\varepsilon, a^*_\varepsilon, \varepsilon) = 0. \tag{2.13} \]
The transformation of Eq. (9) is finished.
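We note that once a solution $(u^*_\varepsilon, a^*_\varepsilon)$ of (2.13) is found, our solution $u$ of Eq. (9) is recovered by the formula
\[ u(x) = \ln\rho_{\varepsilon,a^*_\varepsilon}(x) + \varepsilon^2w_0(\varepsilon x) + \varepsilon^2u^*_\varepsilon(\varepsilon x), \quad \varepsilon > 0. \tag{2.14} \]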
Of course, one should then check that this solution really satisfies the nontopological boundary condition. Now we describe the domain of definition $\Omega \subset Y_\alpha \times \mathbb{R}^2 \times \mathbb{R}$ of the mapping $P$. We first observe that, using $\Delta w_0 = -\rho w_0 + \rho^2$, we can rearrange
\[ \begin{aligned} P(u, a, \varepsilon) = {} & \Delta u + g_\varepsilon(z, a)u + \varepsilon^{-2}g_\varepsilon(z, a)\big\{e^{\varepsilon^2(u+w_0)} - 1 - \varepsilon^2(u+w_0)\big\} \\ & + (g_\varepsilon(z, a) - \rho)w_0 - g_\varepsilon^2(z, a)\big(e^{2\varepsilon^2(u+w_0)} - 1\big) - (g_\varepsilon^2 - \rho^2). \end{aligned} \tag{2.15} \]
Then, by the decay property $\rho + |g_\varepsilon(z, a)| = O(|x|^{-2N-4})$ as $|x| \to \infty$, we find that all terms on the right-hand side of (2.15) which do not contain an exponential factor belong to $X_\alpha$. Since
\[ e^{2\varepsilon^2(u+w_0)} \le e^{2\varepsilon^2C(\|u\|_{Y_\alpha}+1)\ln(|x|+2)} \le (2+|x|)^{2\varepsilon^2C(\|u\|_{Y_\alpha}+1)}, \]
where we used (1.3) and (2.10), in order that $g_\varepsilon^2(z, a)e^{2\varepsilon^2(u+w_0)} \in X_\alpha$ it suffices to have
\[ 2\varepsilon^2C(\|u\|_{Y_\alpha} + 1) < 1. \]
This, in turn, is satisfied under the restriction $(u, \varepsilon) \in B_1(0) \times (0, \varepsilon_0)$, where
\[ B_1(0) = \{u \in Y_\alpha \mid \|u\|_{Y_\alpha} < 1\}, \]
and $\varepsilon_0$ is a sufficiently small number. Clearly, this restriction also implies $g_\varepsilon(z, a)e^{\varepsilon^2(u+w_0)} \in X_\alpha$. Next, by Taylor’s expansion together with (1.3) and (2.10), we easily find
\[ e^{\varepsilon^2(u+w_0)} - 1 - \varepsilon^2(u+w_0) = \varepsilon^4R(x), \quad \text{where } |R(x)| \le C(\ln^+|x| + 1)^2. \]
Thus, defining
\[ P(u, a, 0) = \Delta u + g_0(z, a)u + (g_0(z, a) - \rho)w_0 - (g_0^2(z, a) - \rho^2), \]
we infer that $P(u, a, \varepsilon)$ has an obvious continuous extension to $\varepsilon = 0$ for each $(u, a) \in B_1(0) \times \mathbb{R}^2$. Thus our domain of definition of $P$ is $\Omega = B_1(0) \times \mathbb{R}^2 \times (-\varepsilon_0, \varepsilon_0)$. Using (2.5) and Lemma 2.3, we find that
\[ P(0, 0, 0) = 0. \tag{2.16} \]
Let us introduce a linear operator $A : Y_\alpha \times \mathbb{R}^2 \to X_\alpha$ defined by
\[ A(u, a) = Lu + Ma, \tag{2.17} \]
where the operator $L : Y_\alpha \to X_\alpha$ is defined by
\[ Lu = \Delta u + \rho u, \quad r = |x|, \tag{2.18} \]
and the operator $M : \mathbb{R}^2 \to X_\alpha$ is defined by
\[ Ma = -4(\rho w_0 - 2\rho^2)\varphi_+a_1 - 4(\rho w_0 - 2\rho^2)\varphi_-a_2 \quad \forall a = (a_1, a_2) \in \mathbb{R}^2 \tag{2.19} \]
with the functions $\varphi_+(r, \theta)$, $\varphi_-(r, \theta) \in Y_\alpha$ given by the formulas
\[ \varphi_+(r, \theta) = \frac{r^{N+1}\cos(N+1)\theta}{1+r^{2N+2}}, \qquad \varphi_-(r, \theta) = \frac{r^{N+1}\sin(N+1)\theta}{1+r^{2N+2}}. \tag{2.20} \]
By direct computation we find that
\[ \frac{\partial g_\varepsilon(z, a)}{\partial a_1}\Big|_{\varepsilon=0,\,a=0} = -4\rho\varphi_+, \qquad \frac{\partial g_\varepsilon(z, a)}{\partial a_2}\Big|_{\varepsilon=0,\,a=0} = -4\rho\varphi_-, \]
from which we find that the mapping $P(\cdot, \cdot, \cdot)$ from $\Omega$ into $X_\alpha$ is continuously differentiable, and
\[ P'_{(u,a)}(0, 0, 0)(u, a) = Lu + Ma = A(u, a), \tag{2.21a} \]
i.e.
\[ P'_{(u,a)}(0, 0, 0) = A. \tag{2.21b} \]
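The two partial derivatives of $g_\varepsilon$ that enter the operator $M$ can be checked with a short finite-difference computation. The sketch below works with the formal limit $g_0(z, a) = 8(N+1)^2|z|^{2N}/(1+|z^{N+1}+a|^2)^2$, which is an assumption made here by letting $\varepsilon \to 0$ in (2.4); the function names and sample points are likewise hypothetical and not taken from the original text.

```python
import math


def g0(z, a, N):
    """Formal eps -> 0 limit of g_eps(z, a) in (2.4) (assumed form, see lead-in)."""
    return 8.0 * (N + 1) ** 2 * abs(z) ** (2 * N) / (1.0 + abs(z ** (N + 1) + a) ** 2) ** 2


def rho(r, N):
    """rho(r) from (2.1)."""
    return 8.0 * (N + 1) ** 2 * r ** (2 * N) / (1.0 + r ** (2 * N + 2)) ** 2


def phi_plus(z, N):
    """phi_+(r, theta) from (2.20), evaluated at z = r e^{i theta}."""
    r, theta = abs(z), math.atan2(z.imag, z.real)
    return r ** (N + 1) * math.cos((N + 1) * theta) / (1.0 + r ** (2 * N + 2))


def phi_minus(z, N):
    """phi_-(r, theta) from (2.20), evaluated at z = r e^{i theta}."""
    r, theta = abs(z), math.atan2(z.imag, z.real)
    return r ** (N + 1) * math.sin((N + 1) * theta) / (1.0 + r ** (2 * N + 2))


def dg_da(z, N, direction, h=1e-6):
    """Central difference of g0 in a at a = 0.

    direction = 1 differentiates in a_1, direction = 1j differentiates in a_2.
    """
    return (g0(z, h * direction, N) - g0(z, -h * direction, N)) / (2.0 * h)


if __name__ == "__main__":
    for N in (1, 2):
        for z in (0.6 + 0.2j, -1.1 + 0.7j, 0.3 - 1.4j):
            print(N, z, dg_da(z, N, 1), -4.0 * rho(abs(z), N) * phi_plus(z, N))
            print(N, z, dg_da(z, N, 1j), -4.0 * rho(abs(z), N) * phi_minus(z, N))
```

Both printed pairs should agree up to the finite-difference error, which is consistent with the identities $\partial g/\partial a_1 = -4\rho\varphi_+$ and $\partial g/\partial a_2 = -4\rho\varphi_-$ stated above.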
In order to apply the standard implicit function theorem to the functional equation (2.13), we will prove the invertibility of $P'_{(u,a)}(0, 0, 0)$ below. For this purpose we need the following:

Theorem 2.1. The operator $A$ defined by (2.17)–(2.20) belongs to $\mathcal{L}(Y_\alpha \times \mathbb{R}^2, X_\alpha)$ for all $\alpha \in (0, \frac{1}{2})$, and $\mathrm{Im}\,A = X_\alpha$.

To prove this theorem we first establish facts on the image and kernel of the operator $L$. We start from the following:

Proposition 2.1. Let $\alpha \in (0, \frac{1}{2})$. Then the image of $L$, $\mathrm{Im}\,L$, is closed in $X_\alpha$.

The proof of Proposition 2.1 follows immediately from the lemma below.

Lemma 2.3. Suppose that $X$, $Y$ are Hilbert spaces, a bounded linear operator $B : X \to Y$ is onto, and $K : X \to Y$ is a linear compact operator. Then $\mathrm{Im}\,(B + K)$ is closed in $Y$.

Proof. See the proof of Lemma 5.1 of [6], p. 413.

Proof of Proposition 2.1. We apply Lemma 2.3 to the operators $B = \Delta$, $K = \rho$. It is easy to check that $K : Y_\alpha \to X_\alpha$ is a compact operator, by decomposing $\mathbb{R}^2$ into a bounded domain and an exterior domain, and using the Rellich–Kondrachov compactness lemma for the bounded domain. We omit the details of this part. Here we only check that $\Delta : Y_\alpha \to X_\alpha$ is onto. Given $f \in X_\alpha$, let us consider the solution $u$ of the equation
\[ \Delta u = f \quad \text{in } \mathbb{R}^2 \]
represented by
\[ u(x) = \frac{1}{2\pi}\int_{\mathbb{R}^2}\ln|x-y|\,f(y)\,dy. \]
We will show $u \in Y_\alpha$. First, $\|\Delta u\|_{X_\alpha} < \infty$ is immediate. Next, we claim
\[ \int_{\mathbb{R}^2}\frac{u^2(x)}{1+|x|^{2+\alpha}}\,dx < \infty. \tag{2.22} \]
Indeed, we have
\[ \begin{aligned} |u(x)| &\le \frac{1}{2\pi}\int_{\mathbb{R}^2}\frac{|\ln|x-y||}{(1+|y|^{2+\alpha})^{\frac{1}{2}}}(1+|y|^{2+\alpha})^{\frac{1}{2}}|f(y)|\,dy \\ &\le \frac{1}{2\pi}\Big(\int_{\mathbb{R}^2}\frac{\ln^2|x-y|}{1+|y|^{2+\alpha}}\,dy\Big)^{\frac{1}{2}}\|f\|_{X_\alpha} \le C\|f\|_{X_\alpha}(\ln^+|x| + 1), \end{aligned} \]
where we used the previous estimate for $I$ in the proof of Lemma 1.1 (see (1.5), (1.6)). Thus (2.22) follows, and $u \in Y_\alpha$.

Now we can compute the kernel of the operator $L$.

Lemma 2.4. Let $\alpha \in (0, \frac{1}{2})$, and let $L : Y_\alpha \to X_\alpha$ be the differential operator defined by (2.18). Then
\[ \ker L = \mathrm{Span}\{\varphi_0, \varphi_+, \varphi_-\}, \]
where $\varphi_0$, $\varphi_\pm$ are introduced in (2.8) and (2.20) respectively.

Proof. As is well known, for any $N, k \in \mathbb{Z}_+ \cup \{0\}$ and $a = a_1 + ia_2 \in \mathbb{C}^1$ the function
\[ D(a, z) = \ln\frac{8|(N+1)z^N + a(k+N+1)z^{N+k}|^2}{(1+|z^{N+1}+az^{N+k+1}|^2)^2} \]
satisfies the Liouville equation
\[ \Delta D + e^{D} = 0 \tag{2.23} \]
in $\mathbb{C}^1$, except at the zeros of the polynomial $p(z) = (N+1)z^N + a(k+N+1)z^{N+k}$. Taking the derivative of (2.23) with respect to $a_1$ and $a_2$ at the point $a = 0$, we find that for each $k \in \mathbb{Z}_+ \cup \{0\}$ the functions
\[ \varphi_{+,k} = \frac{\partial D(a, z)}{\partial a_1}\Big|_{a=0} = \phi_k(r)\cos k\theta, \qquad \varphi_{-,k} = \phi_k(r)\sin k\theta, \]
with
\[ \phi_k(r) = \frac{(k+N+1) + (k-N-1)r^{2N+2}}{(N+1)(1+r^{2N+2})}\,r^k, \]
satisfy the equation
\[ L\varphi_{\pm,k} = 0 \quad \text{in } \mathbb{R}^2, \]
and the function $\phi_k$ satisfies the ordinary differential equation
\[ \frac{\partial^2\phi_k}{\partial r^2} + \frac{1}{r}\frac{\partial\phi_k}{\partial r} - \frac{k^2\phi_k}{r^2} + \rho\phi_k = 0 \quad \forall r > 0. \tag{2.24} \]
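The claim that $\phi_k$ solves (2.24) can be checked numerically for several $k$ and $N$. The sketch below is an added illustration, not part of the original proof: it evaluates the left-hand side of (2.24) by central differences and also confirms that $\phi_0$ coincides with $\varphi_0$ from (2.8), while $\phi_{N+1}(r) = 2r^{N+1}/(1+r^{2N+2})$ is twice the radial factor of $\varphi_\pm$ in (2.20). The function names, step size and sample radii are assumptions made for this example.

```python
def rho(r, N):
    """rho(r) from (2.1)."""
    return 8.0 * (N + 1) ** 2 * r ** (2 * N) / (1.0 + r ** (2 * N + 2)) ** 2


def phi(k, r, N):
    """phi_k(r) as given in the proof of Lemma 2.4."""
    m = 2 * N + 2
    return ((k + N + 1) + (k - N - 1) * r ** m) * r ** k / ((N + 1) * (1.0 + r ** m))


def ode_residual(k, r, N, h=1e-3):
    """Left-hand side of (2.24) at r, with derivatives replaced by central differences."""
    d1 = (phi(k, r + h, N) - phi(k, r - h, N)) / (2.0 * h)
    d2 = (phi(k, r + h, N) - 2.0 * phi(k, r, N) + phi(k, r - h, N)) / (h * h)
    return d2 + d1 / r - k * k * phi(k, r, N) / (r * r) + rho(r, N) * phi(k, r, N)


if __name__ == "__main__":
    for N in (1, 2):
        for k in range(0, 5):
            worst = max(abs(ode_residual(k, r, N)) for r in (0.5, 1.0, 2.0))
            print("N =", N, "k =", k, "largest residual:", worst)
```

For $k = 0$ the term $k^2\phi_k/r^2$ drops out and (2.24) reduces to $L_1\varphi_0 = 0$, which is the statement made after (2.8).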
On the other hand, as can be checked by direct computation, the new functions $\tilde{\phi}_k(r) = \phi_k(1/r)$ are also solutions of Eq. (2.24) for all $k \in \mathbb{N}_+$. Obviously the pair $\phi_k$, $\tilde{\phi}_k$ is a fundamental set of solutions of the ordinary differential equation (2.24). If $k \notin \{0, N+1\}$, then these solutions blow up like $r^k$ near infinity, or like $r^{-k}$ near $0$, and do not belong to the space $Y_\alpha$. Since $\phi_0(r) = \phi_0(1/r)$, we should introduce the second linearly independent solution in a different way. For all $r \in (0, \frac{1}{2})$ we set
\[ \tilde{\phi}_0(r) = \phi_0(r)\int_r^{\frac{1}{2}}\frac{ds}{s\,\phi_0^2(s)}. \]
For $r \in (\frac{1}{2}, \infty)$ we introduce $\tilde{\phi}_0$ as the solution of the following Cauchy problem:
\[ \frac{\partial^2\tilde{\phi}}{\partial r^2} + \frac{1}{r}\frac{\partial\tilde{\phi}}{\partial r} + \rho\tilde{\phi} = 0, \qquad \tilde{\phi}\big(\tfrac{1}{2}\big) = \lim_{r\to\frac{1}{2}-0}\tilde{\phi}(r), \qquad \frac{\partial\tilde{\phi}}{\partial r}\big(\tfrac{1}{2}\big) = \lim_{r\to\frac{1}{2}-0}\frac{\partial\tilde{\phi}(r)}{\partial r}. \]
Now let $v(x) \in \ker L \cap W^{2,2}_{loc}(\mathbb{R}^2)$. Using well-known regularity results for elliptic differential operators we obtain that $v(x) \in C^2_{loc}(\mathbb{R}^2)$. In polar coordinates the Fourier expansion yields $v(x) = \sum_{k=0}^{\infty}v_k(r)e^{ik\theta}$. Obviously the function $v_k(r)$ satisfies (2.24) for all $k \in \mathbb{Z}_+ \cup \{0\}$. Thus $v_k(r)$ is a linear combination of the functions $\phi_k(r)$, $\tilde{\phi}_k(r)$,
\[ v_k(r) = c_{k,1}\phi_k(r) + c_{k,2}\tilde{\phi}_k(r). \]
Since $v$ is a smooth function, $c_{k,2} = 0$. On the other hand $v(x) \in Y_\alpha$, thus $c_{k,1} = 0$ for all $k \in \mathbb{Z}_+ \setminus \{N+1\}$. This proves the lemma.

The image of the operator $L$ is completely described by the following proposition.

Proposition 2.2. Let $\alpha \in (0, \frac{1}{2})$. Then for the image of the operator $L$ defined by (2.18) we have
\[ \mathrm{Im}\,L = \Big\{f \in X_\alpha \ \Big|\ \int_{\mathbb{R}^2}f\varphi_\pm\,dx = 0\Big\}. \]
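Proof. Since the image $\mathrm{Im}\,L$ is closed in $X_\alpha$ by Proposition 2.1, we can decompose $X_\alpha$ as follows: $X_\alpha = \mathrm{Im}\,L \oplus (\mathrm{Im}\,L)^\perp$. Let $\xi \in (\mathrm{Im}\,L)^\perp$. Then
\[ (Lu, \xi)_{X_\alpha} = 0 \quad \forall u \in Y_\alpha. \]
Thus, denoting $(1+|x|^{2+\alpha})\xi(x) = \psi(x)$, we have
\[ (Lu, \psi)_{L^2(\mathbb{R}^2)} = 0 \quad \forall u \in Y_\alpha. \]
Since $C_0^\infty(\mathbb{R}^2) \subset Y_\alpha$, this implies immediately that
\[ L\psi = 0 \quad \text{in } \mathbb{R}^2 \]
by integration by parts and a standard density argument. Since $\xi \in X_\alpha$, we have $\int_{\mathbb{R}^2}\frac{|\psi|^2}{1+|x|^{2+\alpha}}\,dx < \infty$, and thus $\psi \in \ker L \cap Y_\alpha$. By Lemma 2.4 the function $\psi$ is a linear combination of $\varphi_0$, $\varphi_\pm$,
\[ \psi = C_0\varphi_0 + C_+\varphi_+ + C_-\varphi_-, \tag{2.25} \]
where $C_0$, $C_\pm$ are constants. Let us consider a function $f(r) = \lambda(r)\varphi_0(r)$, where $\lambda(r)$ is a smooth cut-off function which is nonnegative, and whose non-empty support is in $(\frac{1}{2}, 1)$. Let $w \in Y_\alpha$ be the solution of $L_1w = f$ given by (2.9). Now suppose $C_0$ is not equal to $0$. Then this leads us to the contradiction
\[ C_0\int_{\frac{1}{2}}^{1}\lambda(r)\varphi_0^2(r)\,r\,dr = \int_0^\infty L_1w(r)\,C_0\varphi_0(r)\,r\,dr = C_0\int_{\mathbb{R}^2}Lw\,\varphi_0\,dx = 0, \]
where, in the last equality, we used (2.25) and the following fact:
\[ (Lu, \varphi_\pm)_{L^2(\mathbb{R}^2)} = 0 \quad \forall u \in Y_\alpha. \]
Indeed, let $m(r) \in C^\infty(\mathbb{R}^1_+)$, $m(r) = 1$ for $r \in [0, 1]$ and $m(r) = 0$ for $r > 2$. We denote $m_\varepsilon(x) = m(\varepsilon|x|)$. Obviously
\[ \varphi_\pm m_\varepsilon \to \varphi_\pm \quad \text{in } X_\delta \quad \text{as } \varepsilon \to +0, \]
for all $\delta < -2$. Thus, using (1.3), we have
\[ (Lu, \varphi_\pm)_{L^2(\mathbb{R}^2)} = \lim_{\varepsilon\to+0}(Lu, \varphi_\pm m_\varepsilon)_{L^2(\mathbb{R}^2)} = \lim_{\varepsilon\to+0}\int_{\mathbb{R}^2}\big(\varepsilon^2(\Delta m)(\varepsilon r)\varphi_\pm u + 2\varepsilon((\nabla m)(\varepsilon r), \nabla\varphi_\pm)u\big)\,dx = 0 \quad \forall u \in Y_\alpha. \]
This completes the proof.

In (2.17)–(2.20) the operator $A$ was introduced by $A(u, a) = Lu + Ma$. Since the variables $u$ and $a$ are independent, and $\mathrm{Im}\,L$ is a linear space of codimension two, we can prove Theorem 2.2 by showing that the space $(\mathrm{Im}\,L)^\perp$ does not contain any vector orthogonal to $\mathrm{Im}\,M$. This goal will be achieved with the help of the following auxiliary lemma, whose proof is given in Sect. 4.

Lemma 2.5. The following inequality holds true:
\[ \int_{\mathbb{R}^2}(\rho w_0 - 2\rho^2)\varphi_\pm^2\,dx < 0. \]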
Now we can prove Theorem 2.2.

Proof of Theorem 2.2. Given $f \in X_\alpha$, we want to show that there exist $u \in Y_\alpha$, $a_1, a_2 \in \mathbb{R}$ such that $A(u, a_1, a_2) = f$. Let us define
\[ C_\pm = \int_{\mathbb{R}^2}f\varphi_\pm\,dx, \qquad \tilde{C}_\pm = 4\int_{\mathbb{R}^2}(\rho w_0 - 2\rho^2)\varphi_\pm^2\,dx. \]
We note that $\tilde{C}_\pm \ne 0$ due to Lemma 2.5. Thus the function $\tilde{f}$ introduced below is well defined:
\[ \tilde{f} = f - \frac{4C_+}{\tilde{C}_+}(\rho w_0 - 2\rho^2)\varphi_+ - \frac{4C_-}{\tilde{C}_-}(\rho w_0 - 2\rho^2)\varphi_-. \]
Then, from $\int_0^{2\pi}\sin(N+1)\theta\cos(N+1)\theta\,d\theta = 0$ we obtain
\[ \int_{\mathbb{R}^2}\tilde{f}\varphi_\pm\,dx = 0. \]
Hence, by Proposition 2.2 there exists $u \in Y_\alpha$ such that $\Delta u + \rho u = \tilde{f}$, and we have
\[ A\Big(u, -\frac{C_+}{\tilde{C}_+}, -\frac{C_-}{\tilde{C}_-}\Big) = f. \]
This completes the proof of the theorem.
3. Proof of the Main Theorem

Although the operator $A$ is onto due to Theorem 2.2, its kernel is not trivial (actually Lemma 2.4 implies that $\dim\{\ker A\} = 3$). To overcome this difficulty let us decompose $Y_\alpha = \ker L \oplus (\ker L)^\perp$, and set $U_\alpha = (\ker L)^\perp \times \mathbb{R}^2$. We equip the space $U_\alpha$ with the norm $\|(u, a)\|^2_{U_\alpha} = \|u\|^2_{Y_\alpha} + a_1^2 + a_2^2$. From now on we are going to work in the space $U_\alpha$ instead of $Y_\alpha \times \mathbb{R}^2$. By Theorem 2.2 there exists
\[ A^{-1} : X_\alpha \to U_\alpha \quad \forall\alpha \in (0, \tfrac{1}{2}). \tag{2.26} \]
Indeed, to prove (2.26) suppose $A(v, a) = 0$. Then this is equivalent to
\[ Lv = -Ma, \]
and Proposition 2.2 implies that $\int_{\mathbb{R}^2}Ma\,\varphi_\pm\,dx = 0$. By Lemma 2.5 this, in turn, is possible only if $a = 0$. So we have $Lv = 0$. Hence $v = 0$ by definition of the space $U_\alpha$. We are now equipped with all the necessary facts to apply the implicit function theorem to prove our main theorem.

Proof of the Main Theorem. Let $\alpha \in (0, \frac{1}{2})$ be given. We consider the mapping $P(u, a, \varepsilon)$ in the domain $\Omega_1 = \Omega \cap U_\alpha$. From the previous section we find that $P(0, 0, 0) = 0$, and $P'_{(u,a)}(0, 0, 0) = A : U_\alpha \to X_\alpha$ is an isomorphism. Thus by the standard implicit function theorem (see e.g. [16, p. 150]) there exist $\varepsilon_0 > 0$ and a continuous function $\varepsilon \mapsto (u^*_\varepsilon, a^*_\varepsilon) = v^*_\varepsilon$ from $(0, \varepsilon_0)$ into a neighborhood of $0$ in $U_\alpha$ such that
\[ P(u^*_\varepsilon, a^*_\varepsilon, \varepsilon) = 0 \quad \forall\varepsilon \in (0, \varepsilon_0). \]
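We now verify the decay estimate (11) for $u$ recovered from formula (2.14), thus showing that our solution $u$ is really nontopological. From the explicit formula given in (2.1) we know that
\[ \ln\rho_{\varepsilon,a^*_\varepsilon}(x) = -(2N+4)\ln|x| + o(\ln|x|) \quad \text{as } |x| \to \infty. \tag{3.1} \]
On the other hand, from the asymptotic formula (2.11),
\[ \varepsilon^2w_0(\varepsilon x) = -\tilde{C}\varepsilon^2\ln|x| + o(\ln|x|) \quad \text{as } |x| \to \infty, \]
where $\tilde{C}$ is a positive number. Now, from (1.3) we obtain
\[ |u^*_\varepsilon(x)| \le C\|u^*_\varepsilon\|_{Y_\alpha}(\ln^+|x| + 1) \le C\|v^*_\varepsilon\|_{U_\alpha}(\ln^+|x| + 1). \tag{3.2} \]
This then implies
\[ |u^*_\varepsilon(\varepsilon x)| \le C\|v^*_\varepsilon\|_{U_\alpha}(\ln^+|\varepsilon x| + 1) \le C\|v^*_\varepsilon\|_{U_\alpha}(\ln^+|x| + 1). \tag{3.3} \]
From the continuity of the function $\varepsilon \mapsto v^*_\varepsilon$ from $(0, \varepsilon_0)$ into $U_\alpha$ and the fact that $v^*_0 = 0$ we have
\[ \|v^*_\varepsilon\|_{U_\alpha} \to 0 \quad \text{as } \varepsilon \to 0, \tag{3.4} \]
from which we deduce that there are $\varepsilon_1 \in (0, \varepsilon_0)$ and a constant $\beta = \beta(\varepsilon) > 0$ such that our solution $u(x)$ of (9) represented by the formula (2.14) satisfies
\[ u(x) = -(2N+4+\beta)\ln|x| + o(\ln|x|) \quad \text{as } |x| \to \infty \tag{3.5} \]
for all $\varepsilon \in (0, \varepsilon_1)$. Then,
\[ e^{u(x)} = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty. \tag{3.6} \]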
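We recall that $z = x_1 + ix_2$, $\partial_z = \frac{1}{2}\big(\frac{\partial}{\partial x_1} - i\frac{\partial}{\partial x_2}\big)$, $\bar{\partial}_z = \frac{1}{2}\big(\frac{\partial}{\partial x_1} + i\frac{\partial}{\partial x_2}\big)$ as previously, and define
\[ \phi(z) = \exp\Big(\frac{1}{2}(u + i\theta)\Big) \quad \text{with } \theta = 2\sum_{j=1}^{N}\arg(z-z_j) \tag{3.7} \]
and
\[ A_1 = -\mathrm{Re}\,\{2i\bar{\partial}_z\ln\phi(z)\}, \qquad A_2 = -\mathrm{Im}\,\{2i\bar{\partial}_z\ln\phi(z)\}. \]
Then $(\phi, A)$ becomes a solution of the Bogomol’nyi equations (4), (5) satisfying (7). We now show that our solution $(\phi, A)$ is of finite energy, and satisfies the decay estimates (11). Since by (3.6) and (3.7),
\[ |\phi(x)|^2 = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty, \tag{3.8} \]
we obtain
\[ |\phi|^2(1-|\phi|^2) \in L^1(\mathbb{R}^2). \tag{3.9} \]
Then by (5) and (3.8),
\[ F_{12}(x) = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty, \quad \text{and} \quad \frac{F_{12}^2}{|\phi|^2},\ |\phi|^2(1-|\phi|^2) \in L^1(\mathbb{R}^2). \]
Now it suffices to show that
\[ |D_1\phi|^2 + |D_2\phi|^2 = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty, \]
and also that it is integrable. From (4) we have immediately that $\bar{\partial}_z\ln\phi(z) = i\bar{\alpha}$, where $\bar{\alpha} = \frac{1}{2}(A_1 + iA_2)$, which, in turn, gives
\[ \frac{1}{2}\frac{\partial\theta}{\partial x_1} - A_1 = -\frac{1}{2}\frac{\partial u}{\partial x_2}, \qquad \frac{1}{2}\frac{\partial\theta}{\partial x_2} - A_2 = \frac{1}{2}\frac{\partial u}{\partial x_1}. \]
Thus,
\[ D_1\phi = \Big(\frac{1}{2}\frac{\partial u}{\partial x_1} + \frac{i}{2}\frac{\partial\theta}{\partial x_1} - iA_1\Big)\phi = \frac{1}{2}\Big(\frac{\partial u}{\partial x_1} - i\frac{\partial u}{\partial x_2}\Big)\phi \]
and
\[ D_2\phi = \Big(\frac{1}{2}\frac{\partial u}{\partial x_2} + \frac{i}{2}\frac{\partial\theta}{\partial x_2} - iA_2\Big)\phi = \frac{1}{2}\Big(\frac{\partial u}{\partial x_2} + i\frac{\partial u}{\partial x_1}\Big)\phi. \]
Therefore,
\[ |D_1\phi|^2 + |D_2\phi|^2 \le \frac{1}{2}|\nabla u|^2|\phi|^2 = \frac{1}{2}|\nabla u|^2e^{u}. \]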
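From (9) and Proposition 1.1 we obtain
\[ u(x) = \frac{1}{2\pi}\int_{\mathbb{R}^2}\ln|x-y|\,e^{u(y)}(e^{u(y)}-1)\,dy + 2\sum_{j=1}^{N}\ln|x-z_j| + C \tag{3.10} \]
for some constant $C$. Since $u(x) \le 0$ for $x \in \mathbb{R}^2$ (by the maximum principle applied to (9)), taking the derivative of $u$ from (3.10), and using the inequality $\big(\sum_{j=1}^{n}a_j\big)^2 \le n\sum_{j=1}^{n}a_j^2$, we obtain
\[ |\nabla u(x)|^2e^{u(x)} \le Ce^{u(x)}\Big(\int_{\mathbb{R}^2}\frac{e^{u(y)}}{|x-y|}\,dy\Big)^2 + C\sum_{j=1}^{N}\frac{e^{u(x)}}{|x-z_j|^2} = I_1(x) + I_2(x). \]
Since $u(x) = 2\ln|x-z_j| + O(1)$ as $x \to z_j$, the function $I_2(x)$ is locally integrable, and by (3.6),
\[ I_2(x) = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty. \]
Hence $I_2(x) \in L^1(\mathbb{R}^2)$. From the estimate
\[ \int_{\mathbb{R}^2}\frac{e^{u(y)}}{|x-y|}\,dy \le \int_{|x-y|\le1}\frac{e^{u(y)}}{|x-y|}\,dy + \int_{|x-y|>1}e^{u(y)}\,dy \le C \]
and (3.6) we also find that
\[ I_1(x) = O\Big(\frac{1}{|x|^{2N+4+\beta}}\Big) \quad \text{as } |x| \to \infty, \]
and that $I_1$ is also integrable.

We now prove (12). From (5) with $\kappa = 2$ we first obtain
\[ \int_{\mathbb{R}^2}F_{12}\,dx = -\frac{1}{2}\int_{\mathbb{R}^2}e^{u}(e^{u}-1)\,dx. \tag{3.11} \]
Let us denote $B(z_j, \delta) = \{z \in \mathbb{C}^1 \mid |z-z_j| < \delta\}$, and
\[ B_R = \{z \in \mathbb{C}^1 \mid |z| < R\}, \qquad S_R = \{z \in \mathbb{C}^1 \mid |z| = R\}. \]
Let us choose $\delta$ small enough so that the balls $B(z_j, \delta)$ ($j = 1, \dots, N$) are mutually disjoint, and $R$ large enough so that $B_R$ contains all of the balls $B(z_j, \delta)$. Integrating (9) over $B_R \setminus \cup_{j=1}^{N}B(z_j, \delta)$, we obtain by the divergence theorem
\[ \int_{S_R}\frac{\partial u}{\partial r}\,ds = \int_{B_R\setminus\cup_{j=1}^{N}B(z_j,\delta)}e^{u}(e^{u}-1)\,dx + \sum_{j=1}^{N}\int_{|x-z_j|=\delta}\frac{x-z_j}{|x-z_j|}\cdot\nabla u\,ds. \tag{3.12} \]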
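Now, using the representation formula of u(x) in (2.14), we deduce
$$\int_{|x-z_j|=\delta}\frac{x - z_j}{|x - z_j|}\cdot\nabla u\,ds = 2\int_{|x-z_j|=\delta}\frac{x - z_j}{|x - z_j|}\cdot\nabla\ln|x - z_j|\,ds + o(1) = 2\int_{|x|=\delta}\frac{\partial}{\partial r}\ln r\,ds + o(1) = 4\pi + o(1) \tag{3.13}$$
as δ → 0. Substituting (3.13) into (3.12), and passing δ → 0, we find that
$$\int_{S_R}\frac{\partial u}{\partial r}\,ds = \int_{B_R} e^u(e^u - 1)\,dx + 4\pi N. \tag{3.14}$$
From (3.11) and (3.14),
$$\int_{\mathbb{R}^2} F_{12}\,dx = -\frac{1}{2}\lim_{R\to\infty}\int_{S_R}\frac{\partial u}{\partial r}\,ds + 2\pi N,$$
and for our solution, $u(x) = u_\varepsilon(x)$ given by (2.14), it suffices to show that there exists a positive constant $\hat C$ such that
$$-\frac{1}{2}\lim_{R\to\infty}\int_{S_R}\frac{\partial u_\varepsilon}{\partial r}\,ds = 2\pi N + 4\pi + \hat C\varepsilon^2 + o(\varepsilon^2) \tag{3.15}$$
as ε → 0. Taking into account the formula for $u_\varepsilon(x)$ in (2.14), it suffices, in turn, to prove the following, (3.16)–(3.18):
$$-\frac{1}{2}\lim_{R\to\infty}\int_{S_R}\frac{\partial}{\partial r}\ln\rho_{\varepsilon,a^*_\varepsilon}\,ds = 2\pi N + 4\pi, \tag{3.16}$$
$$-\lim_{R\to\infty}\int_{S_R}\frac{\partial w_0(\varepsilon|z|)}{\partial r}\,ds = \hat C, \tag{3.17}$$
where $\hat C$ is a positive constant independent of ε, and
$$\lim_{R\to\infty}\int_{S_R}\frac{\partial u^*_\varepsilon(\varepsilon z)}{\partial r}\,ds \to 0, \tag{3.18}$$
as ε → 0. We first prove (3.16). From the formula (2.1) one can write
$$\frac{\partial}{\partial r}\ln\rho_{\varepsilon,a^*_\varepsilon} = \frac{1}{\rho_{\varepsilon,a^*_\varepsilon}}\frac{\partial}{\partial r}\rho_{\varepsilon,a^*_\varepsilon} = \frac{8\varepsilon^{2N+2}}{\rho_{\varepsilon,a^*_\varepsilon}}\left[\frac{\frac{\partial}{\partial r}|f(z)|^2}{\left(1 + \varepsilon^{2N+2}\left|F(z) + \frac{a^*_\varepsilon}{\varepsilon^{N+1}}\right|^2\right)^2} - \frac{2|f(z)|^2\varepsilon^{2N+2}\frac{\partial}{\partial r}\left|F(z) + \frac{a^*_\varepsilon}{\varepsilon^{N+1}}\right|^2}{\left(1 + \varepsilon^{2N+2}\left|F(z) + \frac{a^*_\varepsilon}{\varepsilon^{N+1}}\right|^2\right)^3}\right] \tag{3.19}$$
with
$$|f(z)|^2 = (N+1)^2 r^{2N} + p_{2N-1}(\theta)r^{2N-1} + \cdots + p_0(\theta)$$
and
$$|F(z)|^2 = r^{2N+2} + q_{2N+1}(\theta)r^{2N+1} + \cdots + q_0(\theta),$$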
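where the $p_j(\theta)$ and $q_k(\theta)$ are functions only of θ. Thus, substituting
$$\frac{\partial}{\partial r}|f(z)|^2 = 2N(N+1)^2 r^{2N-1} + (2N-1)p_{2N-1}(\theta)r^{2N-2} + \cdots + p_1(\theta),$$
and
$$\frac{\partial}{\partial r}|F(z)|^2 = (2N+2)r^{2N+1} + (2N+1)q_{2N+1}(\theta)r^{2N} + \cdots + q_1(\theta)$$
into (3.19), we have for each ε > 0,
$$\int_{S_R}\frac{\partial}{\partial r}\ln\rho_{\varepsilon,a^*_\varepsilon}\,ds = \int_{S_R}\left[\frac{1}{|f(z)|^2}\frac{\partial}{\partial r}|f(z)|^2 - \frac{2\varepsilon^{2N+2}\frac{\partial}{\partial r}\left|F(z) + \frac{a^*_\varepsilon}{\varepsilon^{N+1}}\right|^2}{1 + \varepsilon^{2N+2}\left|F(z) + \frac{a^*_\varepsilon}{\varepsilon^{N+1}}\right|^2}\right]ds$$
$$= \int_0^{2\pi}\left[\frac{2N(N+1)^2R^{2N-1}}{(N+1)^2R^{2N}} - \frac{2(2N+2)\varepsilon^{2N+2}R^{2N+1}}{1 + \varepsilon^{2N+2}R^{2N+2}}\right]R\,d\theta + O\left(\frac{1}{R}\right) = -4\pi N - 8\pi + O\left(\frac{1}{R}\right)$$
as R → ∞. Thus (3.16) is proved. Next, in order to prove (3.17) we first note
$$r\frac{\partial w_0(r)}{\partial r} = -\int_0^\infty \varphi_0(t)\,t\rho^2(t)\,dt + O\left(\frac{1}{r}\right) \quad \text{as } r \to \infty,$$
which is obtained by explicit computation, using the formula for $w_0(r)$ given by (2.9) with $f(t) = \rho^2(t)$. From this fact we have
$$-\lim_{R\to\infty}\int_{S_R}\frac{\partial w_0(\varepsilon|z|)}{\partial r}\,ds = -\lim_{R\to\infty}\int_{S_{\varepsilon R}}\frac{\partial w_0(|z|)}{\partial r}\,ds = -\lim_{R\to\infty}\int_0^{2\pi}\left.\frac{\partial w_0(r)}{\partial r}\right|_{r=R}R\,d\theta$$
$$= -2\pi\lim_{R\to\infty}R\frac{\partial w_0(R)}{\partial R} = 2\pi\int_0^\infty\varphi_0(t)\,t\rho^2(t)\,dt > 0,$$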
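where the estimate on I in the proof of Lemma 2.2 was used in the last inequality. The formula (3.17) is proved. Finally let us prove (3.18). By the divergence theorem,
$$\lim_{R\to\infty}\int_{S_R}\frac{\partial u^*_\varepsilon(\varepsilon x)}{\partial r}\,ds = \lim_{R\to\infty}\int_{S_{\varepsilon R}}\frac{\partial u^*_\varepsilon(x)}{\partial r}\,ds = \lim_{R\to\infty}\int_{B_{\varepsilon R}}\Delta u^*_\varepsilon(x)\,dx = \int_{\mathbb{R}^2}\Delta u^*_\varepsilon(x)\,dx,$$
and
$$\left|\int_{\mathbb{R}^2}\Delta u^*_\varepsilon(x)\,dx\right| \le \int_{\mathbb{R}^2}|\Delta u^*_\varepsilon(x)|\,dx = \int_{\mathbb{R}^2}|\Delta u^*_\varepsilon(x)|(1 + |x|^{2+\alpha})^{\frac{1}{2}}(1 + |x|^{2+\alpha})^{-\frac{1}{2}}\,dx$$
$$\le \left(\int_{\mathbb{R}^2}|\Delta u^*_\varepsilon(x)|^2(1 + |x|^{2+\alpha})\,dx\right)^{\frac{1}{2}}\left(\int_{\mathbb{R}^2}\frac{dx}{1 + |x|^{2+\alpha}}\right)^{\frac{1}{2}} \le C\|v^*_\varepsilon\|_{U_\alpha} \to 0$$
as ε → 0 due to (3.4). We have thus proved (3.18). The proof of the theorem is complete.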
4. Proof of the Auxiliary Lemmas

Proof of Lemma 2.1. Let w(r) be a solution to $L_1 w = f$. Since $\varphi_0(r)$ is a solution of (2.7) with f = 0, we try the ansatz $w(r) = \xi(r)\varphi_0(r)$, and obtain ξ(r) below. By an elementary computation we have
$$f = L_1 w = \varphi_0\xi'' + \left(2\varphi_0' + \frac{\varphi_0}{r}\right)\xi',$$
where we used the fact $L_1\varphi_0 = 0$. Multiplying the above equation by $r\varphi_0$, we deduce that
$$\frac{d}{dr}(r\varphi_0^2\xi') = r\varphi_0^2\xi'' + (2r\varphi_0\varphi_0' + \varphi_0^2)\xi' = r\varphi_0 f.$$
From this we immediately have
$$\xi(r) = \int_0^r\frac{1}{s\varphi_0^2(s)}\left(\int_0^s\varphi_0(t)\,tf(t)\,dt\right)ds = \int_0^r\frac{\phi_f(s)}{(1-s)^2}\,ds, \quad r \in (0, 1),$$
where $\phi_f(\cdot)$ is defined by
$$\phi_f(s) = \frac{(s-1)^2}{s\varphi_0^2(s)}\int_0^s tf(t)\varphi_0(t)\,dt, \quad s \in (0, \infty)$$
as in (2.9b). Thus we find that
$$w_1(r) = \varphi_0(r)\int_0^r\frac{\phi_f(s)}{(1-s)^2}\,ds$$
solves (2.7) for r ∈ (0, 1). Similarly we can deduce that for any L > 0,
$$w_2(r) = \varphi_0(r)\int_{1+L}^r\frac{\phi_f(s)}{(1-s)^2}\,ds$$
solves (2.7) for r ∈ (1, ∞). In order to extend our solution w(r) to the range r ∈ (0, ∞) we set
$$w(r) = \varphi_0(r)\int_0^r\frac{\phi_f(s) - \phi_f(1)}{(1-s)^2}\,ds + \varphi_0(r)\frac{\phi_f(1)\,r}{1-r} = I_1(r) + I_2(r). \tag{4.1}$$
We claim that w(r) is a well-defined $C^2_{loc}(\mathbb{R}^1_+)$ solution of (2.7). Obviously $I_2 \in C^\infty_{loc}(\mathbb{R}^1_+)$. In order to show that $I_1 \in C^2_{loc}(\mathbb{R}^1_+)$ we observe first $\phi_f(s) \in C^1_{loc}(\mathbb{R}^1_+)$, $\phi_f(s) \in C^2(0, k)$ for all k > 0. In particular, we have
$$\lim_{s\to 0+}\frac{1}{s}\int_0^s\varphi_0(t)\,tf(t)\,dt = 0,$$
and there is no singularity near s = 0 of $\phi_f(s)$. Next we will show that
$$\phi_f'(1) = 0. \tag{4.2}$$
To see this we compute first
$$\phi_f'(s) = \frac{(s-1)\eta(s)}{s^2\varphi_0^3(s)}\int_0^s tf(t)\varphi_0(t)\,dt + \frac{(1-s)^2 f(s)}{\varphi_0(s)} = J_1(s) + J_2(s),$$
where we set
$$\eta(s) = (s+1)\varphi_0(s) - 2(s-1)s\varphi_0'(s).$$
We note $J_2(1) = 0$. In order to show $J_1(1) = 0$ it suffices to prove
$$\eta(1) = \eta'(1) = \eta''(1) = 0. \tag{4.3}$$
Obviously η(1) = 0. Since
$$\eta'(s) = \varphi_0(s) - 3(s-1)\varphi_0'(s) - 2(s-1)s\varphi_0''(s), \tag{4.4}$$
we also find that η'(1) = 0. In order to see η''(1) = 0 we use the substitution $s\varphi_0'' = -\varphi_0' - s\rho(s)\varphi_0$ for the last term in the right-hand side of (4.4), which is immediate from the ordinary differential equation for $\varphi_0$, $L_1\varphi_0 = 0$. Thus, we obtain
$$\eta'(s) = \varphi_0(s) - 3(s-1)\varphi_0'(s) + 2(s-1)(\varphi_0' + s\rho\varphi_0) = \varphi_0(s) - (s-1)\varphi_0'(s) + 2s(s-1)\rho(s)\varphi_0(s),$$
from which η''(1) = 0 follows easily. We have thus proved (4.3), and hence (4.2). From (4.2) we have
$$\frac{\phi_f(s) - \phi_f(1)}{(1-s)^2} \in C^1(0, k) \quad \forall k > 0.$$
This implies that $I_1(r)$ in (4.1), and hence w(r), belongs to $C^2(0, k)$ for all k > 0. Now, in order to show that w(r) is a $C^2$ solution of (2.7) in (0, ∞), it suffices to prove that w(r) satisfies (2.7) on (0, 1) and (1, ∞). For r ∈ (0, 1) we find easily
$$w(r) = \varphi_0(r)\int_0^r\frac{\phi_f(s)}{(1-s)^2}\,ds = w_1(r),$$
which is a solution of (2.7) by construction. For r > 1, setting L > 0, we have
$$w(r) = \varphi_0(r)\left[\int_{1+L}^r\frac{\phi_f(s) - \phi_f(1)}{(1-s)^2}\,ds + \int_0^{1+L}\frac{\phi_f(s) - \phi_f(1)}{(1-s)^2}\,ds + \frac{\phi_f(1)\,r}{1-r}\right]$$
$$= w_2(r) + \varphi_0(r)\left[-\frac{(1+L)\phi_f(1)}{L} + \int_0^{1+L}\frac{\phi_f(s) - \phi_f(1)}{(1-s)^2}\,ds\right]. \tag{4.5}$$
By construction the first term above is a solution of (2.6), while the second term is a solution of the homogeneous equation of (2.7) (i.e. (2.7) with f = 0). In order to prove that $w \in Y_\alpha$ we note that the second term in the right-hand side of (4.5) is a bounded function and the first term satisfies the estimate $|w_2(r)| \le C(\ln r + 1)$ for all r ≥ 2. Thus $\Delta w = -\rho w + f \in X_\alpha$. This completes the proof of Lemma 2.1.

Proof of Lemma 2.2. From the explicit solution formula (2.9) we find that $w_0(\cdot) \in C^0(\mathbb{R}_+)$, and thus (2.10) follows from (2.11). It suffices now to prove (2.11). We observe from formula (2.9) that
$$w_0(r) = \varphi_0(r)\int_2^r\left(\frac{1 + s^{2N+2}}{1 - s^{2N+2}}\right)^2\frac{I(s)}{s}\,ds + (\text{bounded function of } r)$$
as r → ∞, where
$$I(s) = \int_0^s\varphi_0(t)\,t\rho^2(t)\,dt.$$
Since $\varphi_0(r) \to -1$ as r → ∞, (2.11) follows if
$$I = I(\infty) = \int_0^\infty\varphi_0(r)\,r\rho^2(r)\,dr > 0.$$
Indeed, substituting $r^2 = t$ in the integrand of I, we have
$$I = 32(N+1)^4\int_0^\infty\frac{1 - t^{N+1}}{1 + t^{N+1}}\cdot\frac{t^{2N}}{(1 + t^{N+1})^4}\,dt$$
$$= 32(N+1)^4\int_0^1\frac{(1 - t^{N+1})t^{2N}}{(1 + t^{N+1})(1 + t^{N+1})^4}\,dt + 32(N+1)^4\int_1^\infty\frac{(1 - t^{N+1})t^{2N}}{(1 + t^{N+1})(1 + t^{N+1})^4}\,dt$$
$$= 32(N+1)^4\int_0^1\frac{(1 - t^{N+1})t^{2N}}{(1 + t^{N+1})(1 + t^{N+1})^4}\,dt - 32(N+1)^4\int_0^1\frac{(1 - s^{N+1})s^{2N+2}}{(1 + s^{N+1})(1 + s^{N+1})^4}\,ds$$
$$= 32(N+1)^4\int_0^1\frac{(1 - t^{N+1})(1 - t^2)t^{2N}}{(1 + t^{N+1})(1 + t^{N+1})^4}\,dt > 0,$$
where we changed the variable $s = \frac{1}{t}$ in the third equality.
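As a numerical sanity check of the positivity argument above, the following minimal sketch compares the integral over (0, ∞) with its folded form over (0, 1). The function names, the Simpson rule, and the truncation point `cutoff` are illustrative choices made here, not part of the proof; the neglected tail of the direct integral is of order `cutoff**(-2N-3)`.

```python
def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def I_folded(N):
    """32 (N+1)^4 * int_0^1 (1 - t^(N+1)) (1 - t^2) t^(2N) / (1 + t^(N+1))^5 dt."""
    def g(t):
        return (1 - t ** (N + 1)) * (1 - t * t) * t ** (2 * N) / (1 + t ** (N + 1)) ** 5
    return 32 * (N + 1) ** 4 * simpson(g, 0.0, 1.0)


def I_direct(N, cutoff=50.0):
    """32 (N+1)^4 * int_0^cutoff (1 - t^(N+1)) t^(2N) / (1 + t^(N+1))^5 dt (tail dropped)."""
    def g(t):
        return (1 - t ** (N + 1)) * t ** (2 * N) / (1 + t ** (N + 1)) ** 5
    return 32 * (N + 1) ** 4 * simpson(g, 0.0, cutoff, n=200000)


if __name__ == "__main__":
    for N in (1, 2, 3):
        print(N, I_folded(N), I_direct(N))
```

The folded integrand is nonnegative on (0, 1), which is why the sign of I is evident only after the change of variable.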
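Proof of Lemma 2.5. Suppose $L_1$ is the differential operator introduced in (2.7). Let us first observe that the following equality holds:
$$L_1\left[\frac{1}{16(1 + r^{2N+2})^2}\right] = \frac{(N+1)^2 r^{4N+2}}{(1 + r^{2N+2})^4},$$
which can be verified by an elementary computation. We prove our lemma for $\varphi_+$ only. The case for $\varphi_-$ is similar. Using the above identity, we have the following:
$$\int_{\mathbb{R}^2}(\rho w_0 - 2\rho^2)\varphi_+^2\,dx = \int_0^{2\pi}\int_0^\infty(\rho w_0 - 2\rho^2)\frac{r^{2N+2}\cos^2(N+1)\theta}{(1 + r^{2N+2})^2}\,r\,dr\,d\theta$$
$$= \int_0^{2\pi}\cos^2(N+1)\theta\,d\theta\int_0^\infty\left[\frac{8(N+1)^2 r^{2N}}{(1 + r^{2N+2})^2}\cdot\frac{r^{2N+2}}{(1 + r^{2N+2})^2}\,w_0 - \frac{2\rho^2 r^{2N+2}}{(1 + r^{2N+2})^2}\right]r\,dr$$
$$= \pi\int_0^\infty\left[\frac{1}{2}L_1\left(\frac{1}{(1 + r^{2N+2})^2}\right)w_0 - \frac{2\rho^2 r^{2N+2}}{(1 + r^{2N+2})^2}\right]r\,dr$$
$$= \pi\int_0^\infty\left[\frac{1}{2}\cdot\frac{1}{(1 + r^{2N+2})^2}\,L_1 w_0 - \frac{2\rho^2 r^{2N+2}}{(1 + r^{2N+2})^2}\right]r\,dr$$
$$= \pi\int_0^\infty\left[\frac{\rho^2}{2(1 + r^{2N+2})^2} - \frac{2\rho^2 r^{2N+2}}{(1 + r^{2N+2})^2}\right]r\,dr$$
$$= \pi\int_0^\infty\left[\frac{5\rho^2}{2(1 + r^{2N+2})^2} - \frac{2\rho^2}{1 + r^{2N+2}}\right]r\,dr$$
$$= 64\pi(N+1)^4\int_0^\infty\left[\frac{5r^{4N}}{2(1 + r^{2N+2})^6} - \frac{2r^{4N}}{(1 + r^{2N+2})^5}\right]r\,dr$$
$$= 32\pi(N+1)^4\int_0^\infty\left[\frac{5t^{2N}}{2(1 + t^{N+1})^6} - \frac{2t^{2N}}{(1 + t^{N+1})^5}\right]dt \quad (\text{by the change of variable } t = r^2)$$
$$= 32\pi(N+1)^4\int_0^\infty\left[\frac{-5t^N}{10(N+1)}\frac{d}{dt}\frac{1}{(1 + t^{N+1})^5} + \frac{t^N}{2(N+1)}\frac{d}{dt}\frac{1}{(1 + t^{N+1})^4}\right]dt$$
$$= 32\pi(N+1)^3 N\int_0^\infty\left[\frac{t^{N-1}}{2(1 + t^{N+1})^5} - \frac{t^{N-1}}{2(1 + t^{N+1})^4}\right]dt$$
$$= -16\pi(N+1)^3 N\int_0^\infty\frac{t^{2N}}{(1 + t^{N+1})^5}\,dt < 0.$$
The proof of the lemma is completed.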
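The identity for $L_1$ used at the start of this proof can be checked numerically. The sketch below assumes that $L_1 = \frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr} + \rho$ with $\rho(r) = 8(N+1)^2 r^{2N}/(1 + r^{2N+2})^2$; this form is inferred from the computations in this section, since the definition (2.7) is not reproduced here. Derivatives are approximated by central differences, and all function names are hypothetical.

```python
def rho(r, N):
    """rho(r) = 8 (N+1)^2 r^(2N) / (1 + r^(2N+2))^2 (assumed form)."""
    return 8 * (N + 1) ** 2 * r ** (2 * N) / (1 + r ** (2 * N + 2)) ** 2


def g(r, N):
    """The function 1 / (16 (1 + r^(2N+2))^2) appearing in the identity."""
    return 1.0 / (16.0 * (1 + r ** (2 * N + 2)) ** 2)


def L1_numeric(func, r, N, h=1e-4):
    """Apply the assumed L1 = d^2/dr^2 + (1/r) d/dr + rho via central differences."""
    d1 = (func(r + h, N) - func(r - h, N)) / (2 * h)
    d2 = (func(r + h, N) - 2 * func(r, N) + func(r - h, N)) / (h * h)
    return d2 + d1 / r + rho(r, N) * func(r, N)


def rhs(r, N):
    """Right-hand side (N+1)^2 r^(4N+2) / (1 + r^(2N+2))^4 of the identity."""
    return (N + 1) ** 2 * r ** (4 * N + 2) / (1 + r ** (2 * N + 2)) ** 4


if __name__ == "__main__":
    for N in (1, 2):
        for r in (0.5, 1.0, 1.7):
            print(N, r, L1_numeric(g, r, N), rhs(r, N))
```

The two printed columns should agree up to the finite-difference error, which is of order `h**2`.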
Acknowledgement. The authors would like to deeply thank the anonymous referee for many invaluable comments, in particular for his suggestion that made it possible to reduce considerably the implicit function part of our previous proof.
References
1. Baraket, S. and Pacard, F.: Construction of singular limits for a semilinear elliptic equation in dimension 2. Calc. Var. and P.D.E. 6, 1–38 (1998)
2. Caffarelli, L. and Yang, Y.: Vortex condensation in the Chern–Simons–Higgs model: an existence theory. Commun. Math. Phys. 168, 321–336 (1995)
3. Chae, D. and Kim, N.: Topological multivortex solutions of the self-dual Maxwell–Chern–Simons–Higgs System. J. Diff. Eqns. 134 No. 1, 154–182 (1997)
4. Chen, X., Hastings, S., McLeod, J. B. and Yang, Y.: A nonlinear elliptic equation arising from gauge field theory and cosmology. Proc. R. Soc. Lond. A 446, 453–478 (1994)
5. Dunne, G.: Self-Dual Chern–Simons Theories. Lecture Notes in Physics, Vol. M36. Berlin–New York: Springer-Verlag, 1995
6. Fursikov, A. V. and Imanuvilov, O.Yu.: Local Exact Boundary Controllability of the Boussinesq Equation. SIAM Journal of Control and Optimization 36 No. 2, 391–421 (1998)
7. Gilbarg, D. and Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order, 2nd ed. New York: Springer-Verlag, 1982
8. Hong, J., Kim, Y. and Pac, P.Y.: Phys. Rev. Lett. 64, 2230 (1990)
9. Jackiw, R. and Pi, S.Y.: Phys. Rev. Lett. 64, 2969 (1990)
10. Jackiw, R. and Weinberg, E. J.: Phys. Rev. Lett. 63, 2234 (1990)
11. Jaffe, A. and Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
12. Spruck, J. and Yang, Y.: Topological solutions in the self-dual Chern–Simons theory: Existence and approximation. Ann. Inst. Henri Poincaré 1, 75–97 (1995)
13. Spruck, J. and Yang, Y.: The existence of nontopological solitons in the self-dual Chern–Simons theory. Commun. Math. Phys. 149, 361–376 (1992)
14. Tarantello, G.: Multiple condensate solutions for the Chern–Simons–Higgs Theory. J. Math. Phys. 37, 3769–3796 (1996)
15. Wang, R.: The existence of Chern–Simons Vortices. Commun. Math. Phys. 137, 587–597 (1991)
16. Zeidler, E.: Nonlinear functional analysis and applications. Vol. 1. New York: Springer-Verlag, 1985

Communicated by A. Kupiainen
Commun. Math. Phys. 215, 143 – 175 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Anderson Localization for Schrödinger Operators on Z with Strongly Mixing Potentials Jean Bourgain1 , Wilhelm Schlag2 1 Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA.
E-mail:
[email protected]
2 Department of Mathematics, Princeton University, Fine Hall, Princeton, NJ 08544, USA.
E-mail:
[email protected] Received: 28 January 2000 / Accepted: 14 June 2000
Abstract: In this paper we show that for a.e. $x \in [0, 2\pi)$ the operators defined on $\ell^2(\mathbb{Z}_+)$ as
$$(H(x)\psi)_n = \psi_{n+1} + \psi_{n-1} + \lambda\cos(2^n x)\psi_n$$
for n ≥ 0 and with Dirichlet condition $\psi_{-1} = 0$, have pure point spectrum in $[-2+\delta, -\delta] \cup [\delta, 2-\delta]$ with exponentially decaying eigenfunctions, where δ > 0 and $0 < |\lambda| < \lambda_0(\delta)$ are small. As it is a simple consequence of known techniques that for small λ one has $[-2+\delta, 2-\delta] \subset \mathrm{spectrum}(H(x))$ for a.e. $x \in [0, 2\pi)$, we have thus established Anderson localization on the spectrum up to the edges and the center. More general potentials than cosine can be treated, but only those energies with nonzero spectral density are allowed. Finally, we prove the same result for operators on the whole line $\mathbb{Z}$ with potential $v_n(x) = \lambda F(A^n x)$, where $A : \mathbb{T}^2 \to \mathbb{T}^2$ is a hyperbolic toral automorphism, $F \in C^1(\mathbb{T}^2)$, $\int F = 0$, and λ small. The basis for our analysis is an asymptotic formula for the Lyapunov exponent for λ → 0 by Figotin–Pastur, and generalized by Chulaevsky–Spencer. We combine this asymptotic expansion with certain martingale large deviation estimates in order to apply the methods developed by Bourgain and Goldstein in the quasi-periodic case.

1. Introduction

In this paper we consider discrete Schrödinger operators
$$(H\psi)_n = \psi_{n+1} + \psi_{n-1} + v_n\psi_n \tag{1.1}$$
acting on $\ell^2(\mathbb{Z})$ or $\ell^2(\mathbb{Z}_+)$. The potentials are of the form $v_n = F(2^n\theta)$, n ≥ 0, where $\theta \in \mathbb{T} = [0, 2\pi]$ and F is 2π-periodic, or $v_n = F(A^n x)$, $n \in \mathbb{Z}$, where $A : \mathbb{T}^2 \to \mathbb{T}^2$ is a hyperbolic toral automorphism and $F : \mathbb{T}^2 \to \mathbb{R}$ (in both cases $F \in C^1$ suffices). It is well-known that the spectra and the spectral parts of these families of operators
do not depend on the parameters θ or x (up to a set of measure zero), see Figotin and Pastur [10]. A classical problem is to establish Anderson localization for (1.1). This means that the spectrum sp(H) is pure point and the corresponding eigenfunctions decay exponentially. For the random case (i.e., the $v_n$ are independent and identically distributed random variables) this problem has a rich history and has been considered by several authors, see for example [10], Carmona and Lacroix [8], Cycon, Froese, Kirsch and Simon [7], as well as the references there. Another important class is the quasiperiodic one, i.e., $v_n = F(n\omega + \theta)$, where $\theta \in \mathbb{T}^d$ and ω determines a dense orbit on the d-dimensional torus $\mathbb{T}^d$. Although the dynamics we consider in this paper are strongly mixing and thus closer to the random case, the methods developed in the quasiperiodic case are more relevant for us, because they address certain issues that are specific to deterministic potentials. In fact, this paper is based on the techniques from Chulaevsky and Spencer [9], Bourgain and Goldstein [4], and Goldstein and Schlag [16]. Before describing our approach in more detail, we recount some well-known basic facts from the theory of the operators (1.1). Let
$$M_N(E) = \begin{pmatrix} E - \lambda v_N & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} E - \lambda v_{N-1} & -1 \\ 1 & 0 \end{pmatrix}\cdots\begin{pmatrix} E - \lambda v_1 & -1 \\ 1 & 0 \end{pmatrix} \tag{1.2}$$
be the monodromy matrix associated with the operator H − E (here E is any real number). Then any solution of Hψ = Eψ satisfies
$$\begin{pmatrix}\psi_{N+1} \\ \psi_N\end{pmatrix} = M_N(E)\begin{pmatrix}\psi_1 \\ \psi_0\end{pmatrix} \quad \text{for } N = 1, 2, \dots.$$
The Lyapunov exponent is given by
$$L(E) = \lim_{N\to\infty} L_N(E) = \lim_{N\to\infty}\frac{1}{N}\,\mathbb{E}\log\|M_N(E)\|, \tag{1.3}$$
where $\mathbb{E}$ refers to the average over the random parameter (thus, in the case of the doubling map θ → 2θ, $\mathbb{E} = \int_{\mathbb{T}}\dots\frac{d\theta}{2\pi}$). Since $\det(M_N(E)) = 1$, one has L(E) ≥ 0. As observed by Pastur and Ishii, cf. [10], L(E) > 0 a.e. on some interval I implies that $\mathrm{sp}_{a.c.}(H) \cap I = \emptyset$. This follows from Oseledec's multiplicative ergodic theorem and the characterization of the spectrum of self-adjoint operators due to Shnol and Simon, see [10]. It is important to note, however, that it is not possible to remove the singular continuous part of the spectrum by such general methods. In fact, Avron and Simon [3] proved that for potentials $v_n = \lambda\cos(2\pi\omega n + \theta)$, where ω is a Liouville number and λ > 2, the spectral measures are singular continuous. On the other hand, Jitomirskaya [17] recently proved that for Diophantine ω and λ > 2 the spectrum is pure point and that the eigenfunctions decay exponentially (it is well-known that $\inf_E L(E) > 0$ for λ > 2). The positivity of the Lyapunov exponent and general ergodic arguments are therefore insufficient to establish Anderson localization, and a more detailed analysis involving finer properties of the underlying dynamics is required. To our knowledge, this has been done for the truly random case and the quasiperiodic case, but not for other dynamics such as the ones considered here. As mentioned above, one might believe that most results from the random, independent case remain true for the doubling map and hyperbolic automorphisms. However, all of the arguments in the literature for the random case rely strongly on independence, see for example [8], Fröhlich, Martinelli, Scoppola, Spencer [11], Simon and Wolff [22],
von Dreifuss and Klein [25], Aizenman and Molchanov [1], and it is not immediately clear how to apply them to deterministic potentials like the ones considered here. We now describe our results in more detail. For the sake of simplicity we restrict ourselves to potentials
$$v_n = \lambda F(2^n\theta), \tag{1.4}$$
where F is 2π-periodic and λ > 0, the case of toral automorphisms being similar. It was observed by Chulaevsky and Spencer [9] that in this case the Lyapunov exponent L(λ, E) admits an asymptotic expansion
$$L(\lambda, E) = \lambda^2 c_0(E) + O(\lambda^3) \tag{1.5}$$
for small λ and 0 < |E| < 2. Here $c_0$ is some function of E that depends on F, see Sect. 2 below, and the constant in $O(\lambda^3)$ depends on E, but remains bounded on intervals of the form $[-2+\delta, -\delta] \cup [\delta, 2-\delta]$. They in turn used a method of Figotin and Pastur [10], Theorem 14.6, where (1.5) was proven for the independent case. We use the formalism of these authors in order to establish large deviation theorems of the form
$$\mathrm{mes}\left[\theta \in \mathbb{T} : \left|\frac{1}{n}\log\|M_n(\theta, E)\| - L_n(E)\right| > \lambda^{5/2}\right] < \exp(-C_\lambda n). \tag{1.6}$$
Here $M_n(\theta, E)$ is as in (1.2) with $v_n = \lambda F(2^n\theta)$. For this reason our results require sufficiently small λ (it seems to be even unknown how to show that $\inf_E L(E) > 0$ for all λ > 0 for potentials (1.4)). In order to prove Anderson localization we use the scheme from Bourgain and Goldstein [4]. More precisely, suppose that ψ is a solution of
$$(H(\theta)\psi)(n) = \psi_{n+1} + \psi_{n-1} + \lambda F(2^n\theta)\psi_n = E\psi_n \text{ on } \mathbb{Z} \cap [0, \infty) \text{ satisfying } \psi_{-1} = 0, \tag{1.7}$$
where $\theta \in \mathbb{T}$ is fixed. By a result of Shnol [20] and Simon [21] every energy E in the spectrum, up to a set of spectral measure zero, has a nontrivial polynomially bounded solution of (1.7) (such a solution is called a generalized eigenfunction). Our goal is to construct a set $\Omega \subset \mathbb{T}$ with $\mathrm{mes}(\mathbb{T} \setminus \Omega) = 0$ such that for any $\theta \in \Omega$ every polynomially bounded solution ψ has to decay exponentially. To this end let $H_\Lambda(\theta)$ be the restriction of the operator H(θ) to the interval $\Lambda \subset \mathbb{Z} \cap [0, \infty)$ with zero boundary conditions. Assume for the moment that θ and E are fixed. Choose some large positive integer N and suppose that for any interval
$$\Lambda \subset [N/2, 4N] \text{ of length about } N \tag{1.8}$$
one has
$$|(H_\Lambda(\theta) - E)^{-1}(n_1, n_2)| < \exp(-c_0|n_1 - n_2|) \tag{1.9}$$
for every $n_1, n_2 \in \Lambda$ with $|n_1 - n_2| > N/10$ and some constant $c_0 > 0$. Setting $\Lambda = [n - N/2, n + N/2] = [a, b]$ it is easy to see that for any solution H(θ)ψ = Eψ and any $n \in [N, 2N]$ (1.9) implies that
$$|\psi(n)| = |((H_\Lambda(\theta) - E)^{-1}(\psi_{a-1}\delta_a + \psi_{b+1}\delta_b))(n)| < C_\psi e^{-cn}. \tag{1.10}$$
This suggests introducing
$$B_N = \{\theta \in \mathbb{T} \mid (1.9) \text{ fails for some } E \text{ and some } \Lambda\} \tag{1.11}$$
and then to set $\Omega = \mathbb{T} \setminus \limsup B_{2^j}$. It is immediately clear, however, that this fails as necessarily $\Omega = \emptyset$ because E can be chosen to be an eigenvalue of $H_\Lambda(\theta)$. The device used by Bourgain and Goldstein [4] is to define $B_N$ as the set of those $\theta \in \mathbb{T}$ for which there is some E that comes very close to the spectrum of both $H_{[0,N]}(\theta)$ and $H_\Lambda(\theta)$, where Λ is an interval as in (1.8). These eigenvalues are said to cause “double resonances” and appear frequently in the mathematical analysis of Anderson localization, see for example Fröhlich, Spencer, Wittwer [14], Sinai [23], and von Dreifuss and Klein [25]. More precisely, for any large integer N one defines $B_N = B_N(\delta, \lambda)$ as the set of those $\theta \in \mathbb{T}$ such that for some $N_1 \in [N^{12}, 2N^{12}]$, δ < |E| < 2 − δ and $\bar N < k < 2\bar N$, where $\bar N = e^{(\log N)^2}$, the following conditions hold:
$$\|(H_{[0,N_1]}(\theta) - E)^{-1}\| > e^{N^2}, \tag{1.12}$$
$$\frac{1}{N}\log\|M_N(2^k\theta, E)\| < L_N(E) - C\lambda^{5/2}. \tag{1.13}$$
One needs to show that $\mathrm{mes}(B_N)$ decays rapidly. In fact, it is shown in Sect. 9 below that for small λ > 0,
$$\mathrm{mes}(B_N) \le C e^{-c\lambda^4 N}. \tag{1.14}$$
Let $\theta \in \mathbb{T} \setminus \limsup B_N$ and N be sufficiently large. It is not hard to see that the existence of a generalized eigenfunction ψ with energy E implies that (1.12) holds, which implies that the converse of (1.13) has to be true. Invoking Cramer's rule and a simple algebraic fact about the entries of $M_N(E)$ one concludes that for most $\Lambda \subset [\bar N, 2\bar N]$ of size N one has estimate (1.9). Finally, applying the resolvent identity about $\bar N/N$ many times shows that $\Lambda = [\bar N, 2\bar N]$ also satisfies (1.9), as desired. The choice of scales $N_1$ and $\bar N$ is to some extent technical. However, it is essential that $\bar N$ be much larger than N. This allows one to exploit some type of independence between the events (1.12) and (1.13). The important estimate (1.14) is obtained by means of the large deviation estimate (1.6). It is perhaps worth pointing out that in [4], where the dynamics is given by the shift θ → θ + ω on the torus $\mathbb{T}$, the events (1.12) and (1.13) are strongly dependent. For this reason, these authors need to invoke complexity bounds for certain semi-algebraic sets. Near independence allows us to avoid appealing to these complexity bounds. In addition to greater simplicity, this has the advantage of allowing us to take F to be $C^1$ rather than analytic, see (1.4). For details we refer the reader to Sects. 8 and 9. Finally, we consider regularity of the integrated density of states for the two models mentioned above. Let $E_j^{(n)}(\theta)$ with 0 ≤ j ≤ n be the eigenvalues of $H_\lambda(\theta)$ restricted to the interval [0, n] with a Dirichlet boundary condition $\psi_{n+1} = 0$. It is well known, see [10] Theorem 6.4, that for a.e. θ
$$\frac{1}{n+1}\,\mathrm{card}\{j : E_j^{(n)}(\theta) < E\} \to N(E) \tag{1.15}$$
as n → ∞, where N is a distribution function that does not depend on θ. The convergence in (1.15) is in the sense of measures. Thouless' formula provides a relation between the
Lyapunov exponent L and the integrated density of states N:
$$L(E) = \int\log|E - E'|\,dN(E'). \tag{1.16}$$
It basically states that L is the Hilbert transform of N and vice versa. In particular, if L is Hölder continuous, so is N. For details see [10], Theorems 11.8 and 11.15. In this section we follow [16] to show that both N and L are Hölder continuous. In [16] this was done in the context of quasiperiodic potentials. In that case it is known that N has no greater smoothness. In the case of potentials with strong mixing as considered here one would, however, expect the behavior to be close to the random case, where Simon and Taylor [24] have shown that the integrated density of states is actually infinitely differentiable. Compare also the series of papers on this issue by Klein and collaborators, in particular the work by Campanino, Klein, and Speis [5, 6, 18]. It is unclear, however, to what extent the methods of these authors apply to other cases than the independent one.

2. Representation of Schrödinger Matrices in Terms of Polar Coordinates

In this section we recall the formalism of Figotin–Pastur [10], see also Chulaevsky–Spencer [9]. Consider the discrete Schrödinger operator
$$(H_\lambda\psi)_n = \psi_{n+1} + \psi_{n-1} + \lambda v_n\psi_n, \tag{2.1}$$
where $v_n$ is a sequence of real numbers and λ > 0. We shall consider (2.1) both on the entire integer lattice $\mathbb{Z}$ as well as on the half-line $\mathbb{Z}_+ \cup \{0\}$ with specific choices of $v_n$. In this section we allow for any choice of $v_n$. Fix some small δ > 0 and let
$$\delta < |E| < 2 - \delta. \tag{2.2}$$
Define κ ∈ (0, π) and $V_n$ by means of
$$E = 2\cos\kappa, \tag{2.3}$$
$$V_n = -\frac{v_n}{\sin\kappa}. \tag{2.4}$$
Now let ψ be a solution of Hψ = Eψ on the half-line $\mathbb{Z}_+ \cup \{0\}$. Then
$$\begin{pmatrix}\psi_{n+1} \\ \psi_n\end{pmatrix} = \begin{pmatrix} E - \lambda v_n & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix}\psi_n \\ \psi_{n-1}\end{pmatrix} \quad \text{for any } n = 1, 2, \dots. \tag{2.5}$$
The coordinate change
$$Y_n = (\psi_n - \cos\kappa\,\psi_{n-1},\ \sin\kappa\,\psi_{n-1}) \tag{2.6}$$
transforms (2.5) into
$$Y_{n+1} = \begin{pmatrix}\cos\kappa & -\sin\kappa \\ \sin\kappa & \cos\kappa\end{pmatrix}Y_n + \lambda V_n\begin{pmatrix}\sin\kappa & \cos\kappa \\ 0 & 0\end{pmatrix}Y_n. \tag{2.7}$$
Introducing polar coordinates
$$Y_n = \rho_n(\cos\varphi_n, \sin\varphi_n), \tag{2.8}$$
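Eq. (2.7) implies
$$\rho_{n+1}\begin{pmatrix}\cos\varphi_{n+1} \\ \sin\varphi_{n+1}\end{pmatrix} = \rho_n\begin{pmatrix}\cos(\varphi_n + \kappa) + \lambda V_n\sin(\varphi_n + \kappa) \\ \sin(\varphi_n + \kappa)\end{pmatrix}, \tag{2.9}$$
$$\mathrm{cotg}\,\varphi_{n+1} = \mathrm{cotg}\,(\varphi_n + \kappa) + \lambda V_n, \tag{2.10}$$
$$\rho_{n+1}^2 = \rho_n^2\left[1 + \lambda V_n\sin(2(\varphi_n + \kappa)) + \lambda^2 V_n^2\sin^2(\varphi_n + \kappa)\right]. \tag{2.11}$$
From (2.11) with $\rho_1 = 1$,
$$\frac{1}{N}\log\rho_N(\theta) = \frac{1}{2N}\sum_{1}^{N}\log\left[1 + \lambda V_n\sin 2(\varphi_n + \kappa) + \lambda^2 V_n^2\sin^2(\varphi_n + \kappa)\right] \tag{2.12}$$
$$= \frac{\lambda^2}{8N}\sum_{1}^{N}V_n^2 \tag{2.13}$$
$$+ \frac{\lambda}{2N}\sum_{1}^{N}V_n\sin 2(\varphi_n + \kappa) \tag{2.14}$$
$$- \frac{\lambda^2}{4N}\sum_{1}^{N}V_n^2\cos 2(\varphi_n + \kappa) \tag{2.15}$$
$$+ \frac{\lambda^2}{8N}\sum_{1}^{N}V_n^2\cos 4(\varphi_n + \kappa) + O(\lambda^3). \tag{2.16}$$
Letting
$$\zeta_n = e^{2i\varphi_n}, \quad \mu = e^{2i\kappa}, \tag{2.17}$$
one verifies that (2.10) is equivalent to
$$\zeta_{n+1} = \mu\zeta_n + \frac{i\lambda}{2}V_n\frac{(\mu\zeta_n - 1)^2}{1 - \frac{i\lambda}{2}V_n(\mu\zeta_n - 1)}. \tag{2.18}$$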
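The equivalence of (2.10) and (2.18) is easy to test numerically. The sketch below advances the angle by the cotangent recursion and, independently, advances ζ by the rational map, then compares the two. The function names and the sample values of φ, κ, λ, V are arbitrary illustrative choices; the only requirement is sin(φ + κ) ≠ 0.

```python
import cmath
import math


def step_angle(phi, kappa, lam, V):
    """One step of (2.10): cot(phi') = cot(phi + kappa) + lam * V, with phi' in (0, pi)."""
    alpha = phi + kappa
    c = math.cos(alpha) / math.sin(alpha) + lam * V
    return math.atan2(1.0, c)  # the angle in (0, pi) whose cotangent equals c


def step_zeta(zeta, kappa, lam, V):
    """One step of (2.18) for zeta = exp(2 i phi), with mu = exp(2 i kappa)."""
    w = cmath.exp(2j * kappa) * zeta
    q = 0.5j * lam * V
    return w + q * (w - 1) ** 2 / (1 - q * (w - 1))


if __name__ == "__main__":
    phi, kappa, lam, V = 0.7, 1.1, 0.2, -0.8
    via_angle = cmath.exp(2j * step_angle(phi, kappa, lam, V))
    via_zeta = step_zeta(cmath.exp(2j * phi), kappa, lam, V)
    print(abs(via_angle - via_zeta))
```

Since ζ depends on φ only modulo π, the branch chosen for the arccotangent does not matter.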
Denoting the right-hand side of (2.18) by $F(\zeta_n, V_n)$, one checks that F(ζ, u) is holomorphic on a neighborhood of |ζ| ≤ 1 and |u| ≤ 1 for small λ. Moreover, for small λ,
$$|F(\zeta_1, u_1) - F(\zeta_2, u_2)| \le (1 + C\lambda)|\zeta_1 - \zeta_2| + C\lambda|u_1 - u_2|, \tag{2.19}$$
see Lemma 2 in [9]. Formulas (2.12) to (2.16) and especially (2.18) are of fundamental importance to this paper. Figotin and Pastur [10], Theorem 14.6, used this formalism to show that for small λ the Lyapunov exponent L(λ, E) obeys the expansion
$$L(\lambda, E) = \frac{\lambda^2\,\mathbb{E}(v_0^2)}{2(4 - E^2)} + O(\lambda^3) \tag{2.20}$$
provided the potentials are identically distributed independent random variables with zero mean (notice that a change of variables, as the one given by (2.6), does not change
the limit in (1.3)). The constant in $O(\lambda^3)$ depends on E, but remains bounded on intervals of the form $[-2+\delta, -\delta] \cup [\delta, 2-\delta]$ (we shall not mention this fact in what follows). The main observation in this context is that $\zeta_{n+1}$, and thus $\varphi_{n+1}$, only depends on $V_1, \dots, V_n$, see (2.18). In particular, if the $v_n$ are i.i.d., $V_n$ and $\varphi_n$ are independent random variables. Taking expectations in (2.13) to (2.16), one obtains (2.20). Suppose $v_n = v_n(x) = F(T^n x)$, where T : X → X is an ergodic transformation on some probability space (X, µ). In that case (1.3) remains meaningful if one takes $\mathbb{E}$ to be $\int_X d\mu$. Moreover, Chulaevsky and Spencer showed in [9] that (2.20) remains valid, with $\sum_{s=-\infty}^{\infty}e^{2i\kappa s}\langle F, F(T^s\cdot)\rangle$ instead of $\mathbb{E}(v_0^2)$, if T is assumed only to be strongly mixing rather than the Bernoulli shift. This is accomplished by iteration of (2.18), thus exploiting the decay of the correlations.

3. Potentials Given by the Doubling Map

A special case of potentials given by strongly mixing dynamics is
$$v_n(\theta) = \cos(2^n\theta), \tag{3.1}$$
where $\theta \in \mathbb{T} = [0, 2\pi]$ with the normalized measure $dP = \frac{d\theta}{2\pi}$. The choice of cos here is for simplicity only and we want to emphasize that the analysis below generalizes easily to the case of arbitrary $C^1$-functions, cf. Remark 4.4. The goal of the following sections is to prove Anderson localization for the operator
$$(H_\lambda(\theta)\psi)_n = \psi_{n+1} + \psi_{n-1} + \lambda\cos(2^n\theta)\psi_n \tag{3.2}$$
defined on the half-line $\mathbb{Z}_+ \cup \{0\}$ with the boundary condition $\psi_{-1} = 0$. We also establish that the integrated density of states is Hölder continuous in the energy E ∈ [δ, 2 − δ]. All of this requires λ > 0 to be sufficiently small so that the perturbative formulas (2.12)–(2.16) apply. The first step in this program is to prove the large deviation theorem (1.6). Let µ, $\zeta_n$ be as in (2.17) and $v_n$ as in (3.1). Clearly,
$$(2.13) = \frac{\lambda^2}{16\sin^2\kappa} + \frac{\lambda^2}{16\sin^2\kappa}\cdot\frac{1}{N}\sum_{n=1}^{N}\cos(2^{n+1}\theta). \tag{3.3}$$
To estimate the contribution of (2.14), we need to control
$$\frac{\lambda}{N}\sum_{1}^{N}V_n\zeta_n. \tag{3.4}$$
Since
$$v_n^2 = \cos^2(2^n\theta) = \frac{1}{2}(1 + v_{n+1}), \tag{3.5}$$
the contribution of (2.15) is bounded by
$$\frac{1}{\sin^2\kappa}\cdot\frac{\lambda^2}{N}\left|\sum_{1}^{N}\zeta_n\right| + \frac{1}{\sin\kappa}\cdot\frac{\lambda^2}{N}\left|\sum_{1}^{N}V_{n+1}\zeta_n\right|. \tag{3.6}$$
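Similarly, the contribution of (2.16) is bounded by
$$\frac{1}{\sin^2\kappa}\cdot\frac{\lambda^2}{N}\left|\sum_{1}^{N}\zeta_n^2\right| + \frac{1}{\sin\kappa}\cdot\frac{\lambda^2}{N}\left|\sum_{1}^{N}V_{n+1}\zeta_n^2\right|. \tag{3.7}$$
From (2.18)
$$\frac{1}{N}\sum_{1}^{N}\zeta_n = \mu\frac{1}{N}\sum_{1}^{N}\zeta_n + O(\lambda) + O\left(\frac{1}{N}\right). \tag{3.8}$$
Hence, for N large,
$$\left|\frac{1}{N}\sum_{1}^{N}\zeta_n\right| < C\lambda. \tag{3.9}$$
Also, from (2.18)
$$\zeta_{n+1}^2 = \mu^2\zeta_n^2 + O(\lambda), \tag{3.10}$$
and thus
$$\left|\frac{1}{N}\sum_{1}^{N}\zeta_n^2\right| < C\lambda. \tag{3.11}$$
From (3.6), (3.7), (3.9), (3.11) therefore
$$|(2.15) + (2.16)| \le \frac{C\lambda^2}{N}\left|\sum_{1}^{N}V_{n+1}\zeta_n\right| + \frac{C\lambda^2}{N}\left|\sum_{1}^{N}V_{n+1}\zeta_n^2\right| + O(\lambda^3). \tag{3.12}$$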
4. Estimation of (3.4)

Let $\mathbb{E}_r$ be the conditional expectation operators with respect to the dyadic partition of $\mathbb{T}$ into $2^r$ congruent intervals, i.e.,
$$\mathbb{E}_r f = \sum_I\chi_I\frac{1}{|I|}\int_I f,$$
where the sum runs over dyadic intervals of size $2^{-r}2\pi$.

Lemma 4.1. For r > n,
$$|\zeta_n - \mathbb{E}_r[\zeta_n]| < C_0\left(\frac{2}{3}\right)^{r-n}. \tag{4.1}$$
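Proof. Observe that
$$0 = |\mathbb{E}_r[\cos 2^n\theta]| < C2^{r-n} \quad \text{if } r \le n, \tag{4.2}$$
$$|(\cos 2^n\theta) - \mathbb{E}_r[\cos 2^n\theta]| < C2^{n-r} \quad \text{if } r \ge n. \tag{4.3}$$
We shall prove (4.1) by induction on n using (2.18). Thus from (2.18), (2.19), (4.1), (4.2),
$$|\zeta_{n+1} - \mathbb{E}_r[\zeta_{n+1}]| \le |\zeta_n - \mathbb{E}_r[\zeta_n]| + C\lambda\{|V_n - \mathbb{E}_r[V_n]| + |\zeta_n - \mathbb{E}_r[\zeta_n]|\} \tag{4.4}$$
$$\le C_0\left(\frac{2}{3}\right)^{r-n} + C\lambda\left[2^{n-r} + \left(\frac{2}{3}\right)^{r-n}\right] < (C_0 + C\lambda)\left(\frac{2}{3}\right)^{r-n}. \tag{4.5}$$
Thus it suffices to take λ small enough to ensure that
$$1 + \frac{C}{C_0}\lambda < \frac{3}{2} \tag{4.6}$$
and the lemma follows.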
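The two elementary facts (4.2) and (4.3) about dyadic averages of $\cos(2^n\theta)$ can be illustrated with a short computation. This is a minimal sketch: `dyadic_average_cos` is a hypothetical helper name, and the average over each dyadic interval is evaluated in closed form rather than by quadrature.

```python
import math


def dyadic_average_cos(n, r, theta):
    """E_r[cos(2^n .)](theta): mean of cos(2^n t) over the dyadic interval
    of length 2*pi / 2^r that contains theta."""
    h = 2 * math.pi / 2 ** r
    a = math.floor(theta / h) * h  # left endpoint of the dyadic interval
    k = 2 ** n
    return (math.sin(k * (a + h)) - math.sin(k * a)) / (k * h)


if __name__ == "__main__":
    # r <= n: each dyadic interval contains 2^(n-r) full periods, so the mean vanishes
    print(dyadic_average_cos(10, 4, 1.234))
    # r >= n: the mean differs from the point value by at most 2*pi * 2^(n-r)
    theta = 2.5
    print(abs(math.cos(2 ** 3 * theta) - dyadic_average_cos(3, 10, theta)))
```

The bound in the second case is just the Lipschitz estimate: cos(2^n t) varies by at most 2^n times the interval length.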
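Iteration of (2.18) gives
$$|\zeta_n - \mu^r\zeta_{n-r}| < Cr\lambda. \tag{4.7}$$
Fix a number T = T(λ) such that
$$T \sim \log\frac{1}{\lambda}. \tag{4.8}$$
Throughout this paper we let a ∼ b denote $c_1 a \le b \le c_2 a$, where $c_1, c_2$ are suitable positive constants which can be inferred from the context. From (2.18)
$$\zeta_n = \mu\zeta_{n-1} + \frac{i\lambda}{2}V_{n-1}(\mu\zeta_{n-1} - 1)^2 + O(\lambda^2),$$
which gives upon iteration
$$\zeta_n = \mu^T\zeta_{n-T} + \frac{i\lambda}{2}\sum_{s=1}^{T}\mu^{s-1}V_{n-s}(\mu\zeta_{n-s} - 1)^2 + O(T\lambda^2) \tag{4.9}$$
for any n > T. From (4.7) and (4.9)
$$\zeta_n = \mu^T\zeta_{n-T} + \frac{i\lambda}{2}\sum_{s=1}^{T}\mu^{s-1}V_{n-s}(\mu^{T-s+1}\zeta_{n-T} - 1)^2 + O(T^2\lambda^2). \tag{4.10}$$
In view of the previous lemma therefore
$$\zeta_n = \mu^T\mathbb{E}_{n-\frac{T}{2}}[\zeta_{n-T}] + \frac{i\lambda}{2}\sum_{s=1}^{T}\mu^{s-1}V_{n-s}\mathbb{E}_{n-\frac{T}{2}}[(\mu^{T-s+1}\zeta_{n-T} - 1)^2] + O\left(T^2\lambda^2 + (\lambda T + 1)\left(\frac{2}{3}\right)^{\frac{T}{2}}\right). \tag{4.11}$$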
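Appropriate choice of T in (4.8) yields
$$\zeta_n = \mu^T\mathbb{E}_{n-\frac{T}{2}}[\zeta_{n-T}] + \frac{i\lambda}{2}\sum_{s=1}^{T}\mu^{s-1}V_{n-s}\mathbb{E}_{n-\frac{T}{2}}[(\mu^{T-s+1}\zeta_{n-T} - 1)^2] + O\left(\lambda^2\log^2\frac{1}{\lambda}\right). \tag{4.12}$$
Returning to (3.4), we obtain from (4.12) for large N,
$$\frac{\lambda}{N}\sum_{1}^{N}V_n\zeta_n = \mu^T\frac{\lambda}{N}\sum_{T}^{N}V_n\mathbb{E}_{n-\frac{T}{2}}[\zeta_{n-T}] \tag{4.13}$$
$$+ \frac{i\lambda^2}{2}\sum_{s=1}^{T}\mu^{s-1}\frac{1}{N}\sum_{T}^{N}V_nV_{n-s}\mathbb{E}_{n-\frac{T}{2}}[(\mu^{T-s+1}\zeta_{n-T} - 1)^2] + O\left(\lambda^2\left(\log\frac{1}{\lambda}\right)^2\right). \tag{4.14}$$
Our aim is to replace (4.13), (4.14) with sums of martingale differences. Recall that
$$V_n = -\frac{\cos(2^n\theta)}{\sin\kappa}. \tag{4.15}$$
Hence, for s ≥ 1
$$V_nV_{n-s} = \frac{1}{2}(\sin\kappa)^{-2}\left\{\cos[(2^n + 2^{n-s})\theta] + \cos[(2^n - 2^{n-s})\theta]\right\} \tag{4.16}$$
and thus, by (4.2) and (4.3)
$$|V_n - \mathbb{E}_{n+\frac{T}{2}}[V_n]| < O(2^{-T/2}) = O(\lambda^2), \quad |\mathbb{E}_{n-\frac{T}{2}}[V_n]| < O(\lambda^2) \tag{4.17}$$
and
$$|V_nV_{n-s} - \mathbb{E}_{n+\frac{T}{2}}[V_nV_{n-s}]| < O(\lambda^2), \quad |\mathbb{E}_{n-\frac{T}{2}}[V_nV_{n-s}]| < O(\lambda^2). \tag{4.18}$$
Consequently, the sum in (4.13) equals
$$\frac{\lambda}{N}\mu^T\sum_{T}^{N}\mathbb{E}_{n-\frac{T}{2}}[\zeta_{n-T}]\,(\mathbb{E}_{n+\frac{T}{2}} - \mathbb{E}_{n-\frac{T}{2}})[V_n] + O(\lambda^3) \tag{4.19}$$
and the sum in (4.14) equals
$$\frac{i\lambda^2}{2}\sum_{s=1}^{T}\mu^{s-1}\frac{1}{N}\sum_{T}^{N}\mathbb{E}_{n-\frac{T}{2}}[(\mu^{T-s+1}\zeta_{n-T} - 1)^2]\,(\mathbb{E}_{n+\frac{T}{2}} - \mathbb{E}_{n-\frac{T}{2}})[V_nV_{n-s}] + O(T\lambda^4). \tag{4.20}$$
Writing n in the form
$$n = mT + r \quad \left(m < \frac{N}{T} \equiv M,\ r = 0, 1, \dots, T-1\right) \tag{4.21}$$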
it is clear that in (4.19), (4.20) the sum $\sum_{1}^{N}$ may be expressed by T sums of martingale differences of the form
$$\sum_{1}^{N/T}d_m, \tag{4.22}$$
where
$$|d_m| < C, \quad d_m = (\mathbb{E}_{mT+r} - \mathbb{E}_{(m-1)T+r})[d_m]. \tag{4.23}$$
Recall at this point the general large deviation estimate for sums of martingale differences with bounded increments. This is a standard fact, see for example, Chapt. 7 in Alon, Erdös, and Spencer [2].

Lemma 4.2. Let $d_m$ be a martingale difference sequence adapted to some filtration. Then
$$P\left[\left|\sum_{1}^{M}d_m\right| > \delta M^{1/2}\left(\sum_{1}^{M}\|d_m\|_\infty^2\right)^{1/2}\right] < e^{-c\delta^2 M}. \tag{4.24}$$
Consequently, the preceding implies that

Lemma 4.3. For N sufficiently large
$$P\left[\theta \in \mathbb{T} \,\Big|\, \left|\frac{\lambda}{N}\sum_{n=1}^{N}V_n\zeta_n\right| > \lambda^{5/2}\right] < e^{-c\lambda^4 N} \tag{4.25}$$
providing a large deviation estimate for (2.14).

Remark 4.4. As mentioned above, the choice of cos was for simplicity only. In fact, one can check by the same methods that for any $F \in C^1(\mathbb{T})$ (or even $F \in C^\alpha(\mathbb{T})$ for some 0 < α < 1)
$$P\left[\theta \in \mathbb{T} \,\Big|\, \left|\frac{1}{N}\sum_{n=T}^{N}V_n\zeta_n - \frac{i\lambda}{2\mu}\sum_{s=1}^{\infty}e^{2is\kappa}\langle F, F(2^s\cdot)\rangle\right| > \lambda^{3/2}\right] < e^{-c\lambda^4 N}. \tag{4.26}$$
The interested reader can find most of the details in Sect. 11, where they are presented in the closely related context of toral automorphisms.

5. Contribution of (3.12)

The arguments here are similar to those in the previous section. In fact, the $\lambda^2$-factor permits us to proceed in a cruder way. Applying (4.7), we get for large N,
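$$\frac{\lambda^2}{N}\sum_{1}^{N}V_{n+1}\zeta_n = \lambda^2\mu^T\frac{1}{N}\sum_{T}^{N}\zeta_{n-T}V_{n+1} + O(T\lambda^3), \tag{5.1}$$
$$\frac{\lambda^2}{N}\sum_{1}^{N}V_{n+1}\zeta_n^2 = \lambda^2\mu^{2T}\frac{1}{N}\sum_{T}^{N}\zeta_{n-T}^2V_{n+1} + O(T\lambda^3), \tag{5.2}$$
and the sums in the right members may then again be converted to martingale differences. Here we obtain a deviation estimate
$$P[\theta \in \mathbb{T} \mid |(3.12)| > \lambda^{5/2}] < e^{-c\lambda^2 N}. \tag{5.3}$$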
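6. Large Deviation Estimate

Recall (3.3) as the contribution of (2.13) to $\frac{1}{N}\log\rho_N(\theta)$. Observe that the second term in (3.3) is again subject to a large deviation estimate
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\lambda^2\Big|\frac{1}{N}\sum_{1}^{N}\cos(2^{n+1}\theta)\Big| > \lambda^{5/2}\Big] < e^{-c\lambda N}. \tag{6.1}$$
Together with the estimates (4.25), (5.3), it follows that for $N$ large enough,
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|\frac{1}{N}\log\rho_N(\theta) - \frac{\lambda^2}{16\sin^2\kappa}\Big| > \lambda^{5/2}\Big] < Ce^{-c\lambda^4 N}. \tag{6.2}$$
Let $M_N(\theta,E)$ be the monodromy matrix defined in (1.2). Since $\rho_N(\theta) = \|M_N(\theta,E)v\|$ with a unit vector $v = e^{i\varphi_1}$, one immediately derives the following important large deviation estimates for the norm of the monodromy from (6.2):
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|\frac{1}{N}\log\|M_N(\theta,E)\| - c_0\lambda^2\Big| > \lambda^{5/2}\Big] < e^{-c\lambda^4 N}, \tag{6.3}$$
where
$$c_0 = c_0(E) = \frac{1}{16\sin^2\kappa}. \tag{6.4}$$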
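For orientation, the constant in (6.4) can be evaluated directly from the energy. The following is a minimal sketch, assuming only the parametrization $E = 2\cos\kappa$ used in the text; the function names `c0` and `c0_closed_form` are ours and purely illustrative.

```python
import math

def c0(energy):
    """c_0(E) = 1 / (16 sin^2 kappa), where E = 2 cos kappa and |E| < 2."""
    kappa = math.acos(energy / 2.0)
    return 1.0 / (16.0 * math.sin(kappa) ** 2)

def c0_closed_form(energy):
    """Same constant without kappa, using sin^2 kappa = 1 - E^2 / 4."""
    return 1.0 / (4.0 * (4.0 - energy * energy))

if __name__ == "__main__":
    for e in (0.5, 1.0, 1.5, 1.9):
        print(e, c0(e), c0_closed_form(e))
```

Since $\sin^2\kappa = 1 - E^2/4$, the constant equals $1/(4(4-E^2))$ and grows without bound as $|E| \to 2$.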
7. Hölder Continuity of the Integrated Density of States

In this section we establish that the integrated density of states is Hölder continuous. See Sect. 1 for an introduction to this section. We follow the approach from [16]. In particular, we use the following lemma from [16], also called the “avalanche principle”. In order to keep the paper self-contained, the proof is reproduced here.

Definition 7.1. Fix some unimodular $2\times2$ matrix $K$. We denote the normalized eigenvectors of $\sqrt{K^*K}$ by $u_K^+$ and $u_K^-$, respectively. One has $Ku_K^+ = \|K\|v_K^+$ and $Ku_K^- = \|K\|^{-1}v_K^-$, where $v_K^+$ and $v_K^-$ are unit vectors. Given two unimodular $2\times2$ matrices $K$ and $M$, we let $b^{(+,+)}(K,M) = v_K^+\cdot u_M^+$ and similarly for $b^{(+,-)}$, $b^{(-,+)}$, and $b^{(-,-)}$. Strictly speaking, these quantities are defined up to a sign, but we are only interested in the absolute value.

Lemma 7.2. Let $A_1,\dots,A_n$ be a sequence in $SL_2(\mathbb{R})$ satisfying the conditions
$$\min_{1\le j\le n}\|A_j\| \ge \mu \ge n, \tag{7.1}$$
$$\max_{1\le j<n}\big|\log\|A_{j+1}\| + \log\|A_j\| - \log\|A_{j+1}A_j\|\big| < \frac{1}{2}\log\mu. \tag{7.2}$$
Then
$$\Big|\log\|A_n\cdots A_1\| + \sum_{j=2}^{n-1}\log\|A_j\| - \sum_{j=1}^{n-1}\log\|A_{j+1}A_j\|\Big| \le C\,\frac{n}{\mu}. \tag{7.3}$$
Anderson Localization for Strong Mixing
155
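Proof. One checks from the definition that
$$\|K\|\|M\|\,|b^{(+,+)}(K,M)| - \|K\|\|M\|^{-1} \le \|MK\| \le \|K\|\|M\|\,|b^{(+,+)}(K,M)| + \|K\|^{-1}\|M\| + \|K\|\|M\|^{-1}.$$
In particular,
$$\frac{\|A_{j+1}A_j\|}{\|A_{j+1}\|\|A_j\|} - \frac{1}{\|A_j\|^2} \le |b^{(+,+)}(A_j,A_{j+1})| \le \frac{\|A_{j+1}A_j\|}{\|A_{j+1}\|\|A_j\|} + \frac{1}{\|A_j\|^2} + \frac{1}{\|A_{j+1}\|^2}.$$
In view of our assumptions therefore
$$1 - \frac{\sqrt{\mu}}{\mu^2} \le |b^{(+,+)}(A_j,A_{j+1})|\,\frac{\|A_{j+1}\|\|A_j\|}{\|A_{j+1}A_j\|} \le 1 + \frac{2\sqrt{\mu}}{\mu^2}, \tag{7.4}$$
which implies $|b^{(+,+)}(A_j,A_{j+1})| \ge \frac{1}{\sqrt{\mu}}(1-\mu^{-3/2}) \ge \frac{1}{2}\mu^{-1/2}$ if $n\ge2$, say. One checks easily by induction that for any vector $u$,
$$A_n\cdots A_1u = \sum_{\varepsilon_1,\dots,\varepsilon_n=\pm1}\|A_n\|^{\varepsilon_n}\prod_{j=1}^{n-1}\|A_j\|^{\varepsilon_j}\,b^{(\varepsilon_j,\varepsilon_{j+1})}(A_j,A_{j+1})\,(u_{A_1}^{\varepsilon_1}\cdot u)\,v_{A_n}^{\varepsilon_n}.$$
Hence
$$\|A_n\cdots A_1u\| = \|A_n\|\prod_{j=1}^{n-1}\|A_j\|\,|b^{(+,+)}(A_j,A_{j+1})|\,|u_{A_1}^+\cdot u|\,[1+R_n(u)],$$
where
$$|R_n(u)| \le \sum_{\substack{\varepsilon_1,\dots,\varepsilon_n=\pm1\\ \min_j\varepsilon_j=-1}}\ \prod_{j=1}^{n}\|A_j\|^{\varepsilon_j-1}\prod_{k=1}^{n-1}\Big|\frac{b^{(\varepsilon_k,\varepsilon_{k+1})}(A_k,A_{k+1})}{b^{(+,+)}(A_k,A_{k+1})}\Big| \le \sum_{\ell=1}^{n}\binom{n}{\ell}\mu^{-2\ell}(2\sqrt{\mu})^{2\ell} = \sum_{\ell=1}^{n}\binom{n}{\ell}(4/\mu)^{\ell} = \Big(1+\frac{4}{\mu}\Big)^{n} - 1 < 4e^4\,\frac{n}{\mu},$$
and consequently
$$\Big|\log\|A_n\cdots A_1\| - \sum_{j=1}^{n}\log\|A_j\| - \sum_{j=1}^{n-1}\log|b^{(+,+)}(A_j,A_{j+1})|\Big| < C\,\frac{n}{\mu}. \tag{7.5}$$
In view of (7.4)
$$\Big|\sum_{j=1}^{n-1}\Big[\log|b^{(+,+)}(A_j,A_{j+1})| - \log\|A_{j+1}A_j\| + \log\|A_j\| + \log\|A_{j+1}\|\Big]\Big| \le C\mu^{-3/2}n \le C\,\frac{n}{\mu}.$$
Combining this with (7.5) yields (7.3).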
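The estimate (7.3) is easy to probe numerically. The following is a minimal sketch, assuming the test family $A_j = R(\varphi)\,\mathrm{diag}(\mu,\mu^{-1})$ with a small rotation angle $\varphi$; this family, and all function names, are our illustrative choices and do not come from the argument above. It evaluates the left-hand side of (7.3) in plain floating point.

```python
import math

def mat_mul(a, b):
    """Product a*b of two 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def op_norm(a):
    """Largest singular value of a real 2x2 matrix."""
    s = sum(x * x for row in a for x in row)            # squared Frobenius norm
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = max(s * s - 4.0 * det * det, 0.0)
    return math.sqrt((s + math.sqrt(disc)) / 2.0)

def rot_diag(phi, mu):
    """R(phi) * diag(mu, 1/mu): a matrix in SL_2(R) whose norm equals mu."""
    c, s = math.cos(phi), math.sin(phi)
    return ((c * mu, -s / mu), (s * mu, c / mu))

def avalanche_defect(mats):
    """Left-hand side of (7.3) for mats = [A_1, ..., A_n]."""
    n = len(mats)
    prod = mats[0]
    for a in mats[1:]:
        prod = mat_mul(a, prod)                          # builds A_n ... A_1
    total = math.log(op_norm(prod))
    total += sum(math.log(op_norm(mats[j])) for j in range(1, n - 1))   # A_2 .. A_{n-1}
    total -= sum(math.log(op_norm(mat_mul(mats[j + 1], mats[j]))) for j in range(n - 1))
    return abs(total)

if __name__ == "__main__":
    mu, n = 1000.0, 10
    mats = [rot_diag(0.1, mu) for _ in range(n)]
    print("defect:", avalanche_defect(mats), " n/mu:", n / mu)
```

For $\varphi = 0$ all matrices are diagonal and the three terms in (7.3) cancel exactly, which gives a convenient sanity check of the bookkeeping of the index ranges.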
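Apply this lemma to
$$n < e^{\frac{c}{10}\lambda^4\ell}, \tag{7.6}$$
$$A_j = M_\ell(2^{(j-1)\ell}\theta). \tag{7.7}$$
From (6.3) with $N=\ell$, there is an exceptional set $\Omega\subset\mathbb{T}$ with
$$\mathbb{P}(\Omega) < e^{-\frac{c}{2}\lambda^4\ell} \tag{7.8}$$
such that if $\theta\notin\Omega$ and $j=0,\dots,n$ then
$$\frac{1}{\ell}\log\|M_\ell(2^{j\ell}\theta)\| = c_0\lambda^2\big(1+o(1)\big), \tag{7.9}$$
$$\frac{1}{2\ell}\log\|M_{2\ell}(2^{j\ell}\theta)\| = c_0\lambda^2\big(1+o(1)\big), \tag{7.10}$$
where $o(1)\to0$ as $\lambda\to0$. Hence for $\theta\notin\Omega$
$$\|A_j\| = e^{c_0\lambda^2\ell(1+o(1))}, \tag{7.11}$$
$$\|A_{j+1}A_j\| = e^{2c_0\lambda^2\ell(1+o(1))}, \tag{7.12}$$
$$\big|\log\|A_{j+1}A_j\| - \log\|A_j\| - \log\|A_{j+1}\|\big| = o(1)\lambda^2\ell. \tag{7.13}$$
Thus taking
$$\mu = e^{\frac{1}{2}c_0\lambda^2\ell} \tag{7.14}$$
conditions (7.1), (7.2) of Lemma 7.2 are clearly fulfilled ($\lambda$ is assumed sufficiently small). For $\theta\notin\Omega$ the conclusion (7.3) is that
$$\Big|\log\|M_{\ell n}(\theta)\| + \sum_{j=2}^{n-1}\log\|M_\ell(2^{(j-1)\ell}\theta)\| - \sum_{j=1}^{n-1}\log\|M_{2\ell}(2^{(j-1)\ell}\theta)\|\Big| < Cn\mu^{-1}. \tag{7.15}$$
Divide (7.15) by $\ell n$ and integrate in $\theta\in\mathbb{T}$. Splitting the integration as $\mathbb{T}\setminus\Omega + \Omega$, it follows from (7.8), (7.15) that if we denote
$$L_N(E) = \frac{1}{N}\int_{\mathbb{T}}\log\|M_N(\theta,E)\|\,d\theta \ \xrightarrow{\ \text{by (6.3)}\ }\ c_0\lambda^2 + O(\lambda^{5/2}), \tag{7.16}$$
then
$$\Big|L_{\ell n}(E) + \frac{n-2}{n}L_\ell(E) - \frac{2(n-1)}{n}L_{2\ell}(E)\Big| < C\ell^{-1}\mu^{-1} + \mathbb{P}(\Omega) < Ce^{-\frac{c}{2}\lambda^4\ell}. \tag{7.17}$$
Taking (7.6) into account, one obtains the following estimate from (7.17).

Lemma 7.3. Fix $\lambda>0$ sufficiently small. Then for large $\ell$,
$$|L(E) + L_\ell(E) - 2L_{2\ell}(E)| < Ce^{-c\lambda^4\ell}. \tag{7.18}$$
Here $L(E) = \lim_{N\to\infty}L_N(E) = \inf_{N\ge1}L_N(E)$ is the Lyapunov exponent.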
Proof. Applying (7.17) to both $n$ and $2n$ with the same $\ell$ yields
$$|L_{\ell n}(E) + L_\ell(E) - 2L_{2\ell}(E)| < Cn^{-1}, \qquad |L_{2\ell n}(E) + L_\ell(E) - 2L_{2\ell}(E)| < Cn^{-1}. \tag{7.19}$$
Therefore $|L_n(E) - L_{2n}(E)| < C\lambda^{-4}\frac{\log n}{n}$ for all large $n$. Summing this over $2^kn$ for $k = 0, 1, 2, \dots$ yields
$$|L(E) - L_n(E)| < C\lambda^{-4}\,\frac{\log n}{n}.$$
Replacing $L_{\ell n}(E)$ with $L(E)$ in (7.19) proves the lemma.
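The summation over dyadic scales in the proof can be written out as follows. This is a sketch under the sole assumption that $|L_m(E) - L_{2m}(E)| \le B\,\frac{\log m}{m}$ for all $m \ge n \ge 2$, where $B$ is our shorthand for the constant $C\lambda^{-4}$; it also uses $L_{2^kn}(E) \to L(E)$ as $k \to \infty$.

```latex
\begin{aligned}
|L(E) - L_n(E)|
  &\le \sum_{k=0}^{\infty} \bigl| L_{2^k n}(E) - L_{2^{k+1} n}(E) \bigr|
   \le B \sum_{k=0}^{\infty} \frac{\log n + k \log 2}{2^k n} \\
  &= \frac{B}{n} \Bigl( \log n \sum_{k \ge 0} 2^{-k} + \log 2 \sum_{k \ge 0} k\, 2^{-k} \Bigr)
   = \frac{2B (\log n + \log 2)}{n}
   \le \frac{4B \log n}{n}.
\end{aligned}
```

Here $\sum_{k\ge0}2^{-k} = \sum_{k\ge0}k\,2^{-k} = 2$, so the geometric decay of the scales costs only a constant factor.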
Since there is the obvious bound
$$|\partial_EL_\ell(E)| < (1+|\lambda|+|E|)^{\ell} < 4^{\ell}, \tag{7.20}$$
an appropriate choice of $\ell$ in (7.18) implies

Corollary 7.4. For energies $E_1, E_2\in[\delta,2-\delta]\cup[-2+\delta,-\delta]$ and $\lambda$ sufficiently small,
$$|L(E_1) - L(E_2)| < c\,|E_1-E_2|^{c\lambda^4}. \tag{7.21}$$
From the Thouless formula and (7.21) we have thus

Corollary 7.5. Fixing a range $\delta<|E|<2-\delta$, and taking $\lambda>0$ small enough, the integrated density of states of $H_\lambda$ is Hölder continuous.

Corollaries 7.4 and 7.5 are formulated for the potentials given by (3.1). Using the large deviation theorem from Remark 4.4, the arguments above show that the integrated density of states is Hölder continuous on any subinterval $I$ of $[-2+\delta,-\delta]\cup[\delta,2-\delta]$ so that $E\in I$ implies (recall $E = 2\cos\kappa$)
$$\sum_{s\in\mathbb{Z}}e^{2is\kappa}\langle F, F(2^{|s|}\,\cdot)\rangle \ne 0.$$
Here $F$ is an arbitrary $C^1$-function.

8. Green’s Function Estimate

In the following two sections we show that the operator $H_\lambda(\theta)$ given by (3.2) on the half-line $[0,\infty)\cap\mathbb{Z}$ with a Dirichlet boundary condition $\psi_{-1}=0$ displays Anderson localization. More precisely, for any small but fixed $\delta>0$ and all sufficiently small $\lambda>0$ there exists $\Omega = \Omega(\lambda,\delta)\subset\mathbb{T}$ with $\mathbb{P}(\mathbb{T}\setminus\Omega)=0$ such that for every $\theta\in\Omega$ the spectrum $\mathrm{sp}(H_\lambda(\theta))\cap([-2+\delta,-\delta]\cup[\delta,2-\delta])$ is pure point, and the corresponding eigenfunctions decay exponentially. The approach chosen here is basically the same as in [4]. The basic idea behind this method is described in the introduction, and we will not give any heuristic motivation here. This section deals with upper bounds for the Green’s function, which are of fundamental importance. We will need the following simple deviation estimate.
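Lemma 8.1. Let $K>1$ and assume that $F$ is a function on $\mathbb{T}$ satisfying $|F|\le1$, $|F'|\le K$. Then
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|\int_{\mathbb{T}}F\,d\theta - \frac{1}{r}\sum_{s=1}^{r}F(2^s\theta)\Big| > \delta\Big] < \exp\Big(-c\,\delta^2\,\frac{r}{(\log K\delta^{-1})^2}\Big). \tag{8.1}$$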
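Before turning to the proof, the concentration asserted in (8.1) can be observed experimentally. The following is a minimal sketch, assuming a pseudo-random dyadic rational as a stand-in for a typical $\theta$ and the sample function $F(\theta)=\cos(2\pi\theta)$, whose mean is zero. The helper names are ours. The orbit is computed in exact integer arithmetic, because naive floating-point doubling loses all information after about 53 steps.

```python
import math
import random

def doubling_average(theta_bits, total_bits, r, func):
    """Average of func(2^s * theta mod 1) over s = 1..r, where
    theta = theta_bits / 2^total_bits is handled as an exact integer."""
    mask = (1 << total_bits) - 1
    denom = 1 << total_bits
    acc = 0.0
    x = theta_bits
    for _ in range(r):
        x = (x << 1) & mask          # exact doubling map on a dyadic rational
        acc += func(x / denom)       # convert to float only for evaluating func
    return acc / r

def sample_deviation(r, seed, func):
    """Ergodic average along the orbit of one pseudo-random theta."""
    rng = random.Random(seed)
    total_bits = r + 64              # 64 bits remain meaningful after r doublings
    theta_bits = rng.getrandbits(total_bits)
    return doubling_average(theta_bits, total_bits, r, func)

if __name__ == "__main__":
    cosine = lambda t: math.cos(2.0 * math.pi * t)
    for r in (100, 1000, 4000):
        print(r, sample_deviation(r, seed=0, func=cosine))
```

The printed averages should shrink roughly like $r^{-1/2}$, in line with the Gaussian-type tail in (8.1); the experiment illustrates the statement and proves nothing.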
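Proof. Denote by $\{\mathbb{E}_j\}$ the dyadic expectation operators and write $F$ as a martingale difference sequence
$$F = \int_{\mathbb{T}}F\,d\mathbb{P} + \sum_{j=1}^{\infty}\Delta_jF, \quad\text{where } \Delta_jF = \mathbb{E}_j[F] - \mathbb{E}_{j-1}[F]. \tag{8.2}$$
Choose an integer $J$ such that
$$2^{-J}K < \frac{\delta}{10}, \quad\text{i.e., } J\sim\log\frac{K}{\delta}. \tag{8.3}$$
Since $|F'|\le K$,
$$\|F - \mathbb{E}_JF\|_\infty < K2^{-J} < \frac{\delta}{10}. \tag{8.4}$$
Denote $\mathbb{E}_J[F]$ by $F_J$ for simplicity. Observe that
$$\frac{1}{r}\sum_{s=1}^{r}F_J(2^s\theta) = \int_{\mathbb{T}}F\,d\theta + \underbrace{\frac{1}{r}\big[\Delta_1F(2\theta)+\Delta_2F(\theta)\big]}_{=\,d_2(\theta)} + \underbrace{\frac{1}{r}\big[\Delta_1F(2^2\theta)+\Delta_2F(2\theta)+\Delta_3F(\theta)\big]}_{=\,d_3(\theta)} + \cdots$$
is also a martingale difference sequence adapted to $\{\mathbb{E}_j\}$. Thus for $j\le J+r$ ($r>j$),
$$d_j(\theta) = \frac{1}{r}\sum_{(j-r)_+<s<j\wedge J}\Delta_sF(2^{j-s}\theta), \tag{8.5}$$
$$\|d_j\|_\infty \le \frac{J}{r}, \tag{8.6}$$
$$\Big(\sum_{j\le J+r}\|d_j\|_\infty^2\Big)^{1/2} \le \frac{2J}{\sqrt{r}}. \tag{8.7}$$
Recalling the martingale deviation estimate (4.24), it follows that
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|\frac{1}{r}\sum_{s=1}^{r}F_J(2^s\theta) - \int_{\mathbb{T}}F\,d\mathbb{P}\Big| > \frac{\delta}{10}\Big] < \exp\Big(-c\,\frac{\delta^2r}{J^2}\Big) \tag{8.8}$$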
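and hence, in view of (8.3), (8.4)
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|\frac{1}{r}\sum_{s=1}^{r}F(2^s\theta) - \int_{\mathbb{T}}F\,d\mathbb{P}\Big| > \delta\Big] < \exp\big(-c\,\delta^2r/(\log\delta^{-1}K)^2\big), \tag{8.9}$$
and the lemma follows.

In what follows we will suppress the parameter $\lambda$ in the notation for $H_\lambda$. For any interval $\Lambda\subset\mathbb{Z}^+\cup\{0\}$ denote the restriction of $H$ to $\Lambda$ by
$$H_\Lambda = H_\Lambda(\theta) = R_\Lambda H(\theta)R_\Lambda, \tag{8.10}$$
where $R_\Lambda$ is the coordinate restriction. As usual, the Green’s function is given by
$$G_\Lambda = G_\Lambda(\theta,E) = (H_\Lambda(\theta) - E\,\mathbf{1}_\Lambda)^{-1} \tag{8.11}$$
for any $E$ which is not an eigenvalue of $H_\Lambda(\theta)$. Letting $\Lambda=[0,N]$ and $H_n(\theta) = H_{[0,n]}(\theta)$, we have for $0\le n_1\le n_2\le N$ by Cramer’s rule
$$G_N(\theta,E)(n_1,n_2) = \frac{\det[H_{n_1-1}(\theta)-E]\,\det[H_{N-n_2-1}(2^{n_2+1}\theta)-E]}{\det[H_N(\theta)-E]}. \tag{8.12}$$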
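The identity (8.12) can be checked on a small example. The following is a minimal sketch, assuming an arbitrary finite list of potential values `v` in place of the dynamically generated potential; the function names are ours. The determinants of the restricted operators are computed by the three-term recurrence for tridiagonal matrices, and the sketch keeps the cofactor sign $(-1)^{n_1+n_2}$, which plays no role once absolute values are taken.

```python
def tridiag_dets(diag):
    """Leading-minor determinants D[0..n] of the Jacobi matrix with the given
    diagonal and all off-diagonal entries equal to 1 (D[0] = 1)."""
    dets = [1.0]
    prev2 = 0.0
    for d in diag:
        dets.append(d * dets[-1] - prev2)   # D_k = d_k * D_{k-1} - D_{k-2}
        prev2 = dets[-2]
    return dets

def green_entry(v, energy, n1, n2):
    """Entry (n1, n2), with n1 <= n2, of (H - E)^{-1} on [0, N], N = len(v) - 1."""
    diag = [x - energy for x in v]
    left = tridiag_dets(diag[:n1])[-1]        # det[H_{[0, n1-1]} - E]
    right = tridiag_dets(diag[n2 + 1:])[-1]   # det[H_{[n2+1, N]} - E]
    full = tridiag_dets(diag)[-1]             # det[H_{[0, N]} - E]
    sign = -1.0 if (n1 + n2) % 2 else 1.0
    return sign * left * right / full

def residual(v, energy, col):
    """Max-norm of (H - E) G[:, col] - e_col, with G taken from green_entry."""
    size = len(v)
    g = [green_entry(v, energy, min(i, col), max(i, col)) for i in range(size)]
    worst = 0.0
    for i in range(size):
        val = (v[i] - energy) * g[i]
        if i > 0:
            val += g[i - 1]
        if i + 1 < size:
            val += g[i + 1]
        worst = max(worst, abs(val - (1.0 if i == col else 0.0)))
    return worst

if __name__ == "__main__":
    potential = [0.3, -0.1, 0.2, 0.0, -0.25, 0.15, 0.05, -0.3]
    print(max(residual(potential, 3.0, c) for c in range(len(potential))))
```

The residual check multiplies the candidate column by $H-E$ and compares with the unit vector, so it verifies the formula without calling a matrix inversion routine.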
Recall that $M_n(\theta,E)$ is the monodromy matrix defined in (1.2) and $L_n(E) = \frac{1}{n}\int_{\mathbb{T}}\log\|M_n(\theta,E)\|\,\frac{d\theta}{2\pi}$. We will need the following fact, which can be easily checked inductively:
$$M_N(\theta,E) = \begin{pmatrix}\det[H_N(\theta)-E] & -\det[H_{N-1}(2\theta)-E]\\ \det[H_{N-1}(\theta)-E] & -\det[H_{N-2}(2\theta)-E]\end{pmatrix}. \tag{8.13}$$
Proposition 8.2. There exists $\Omega_1\subset\mathbb{T}$ such that $\mathbb{P}(\mathbb{T}\setminus\Omega_1)=0$ with the following property: For every $\theta\in\Omega_1$ there is $n_0 = n_0(\theta,\lambda)$ such that for all $n>n_0(\theta,\lambda)$, $|E|<2$ and $0\le s_0\le n^8$ we have that
$$\Big|L_n(E) - n^{-4}\sum_{s=1}^{n^4}\frac{1}{n}\log\|M_n(2^{s+s_0}\theta,E)\|\Big| < \lambda^3. \tag{8.14}$$
In particular, for every $\theta\in\Omega_1$ and $N>n_0^4(\theta,\lambda)$,
$$|G_N(\theta,E)(n_1,n_2)| < \frac{\exp\big((N-|n_1-n_2|)L(E) + O(\lambda^{5/2}N)\big)}{|\det[H_N(\theta)-E]|} \tag{8.15}$$
for all $n_1,n_2\in[0,N]$.
Proof. The function
$$F(\theta,E) = \frac{1}{n}\log\|M_n(\theta,E)\| \tag{8.16}$$
satisfies
$$|\partial_EF| < 4^n, \qquad |\partial_\theta F| < 2^n4^n = 8^n = K. \tag{8.17}$$
In view of Lemma 8.1
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\Big|L_n(E) - \frac{1}{r}\sum_{s=1}^{r}\frac{1}{n}\log\|M_n(2^s\theta,E)\|\Big| > \lambda^3\Big] < \exp\Big(-c\lambda^6\,\frac{r}{n^2}\Big). \tag{8.18}$$
Considering a $\lambda^34^{-n}$-dense set of $E$ values in $[0<|E|<2]$, we obtain that for
$$r = n^4, \qquad n>n_0(\lambda) \tag{8.19}$$
$$\mathbb{P}\Big[\theta\in\mathbb{T}\,\Big|\,\sup_{[|E|<2,\ 0\le s_0\le r^2]}\Big|L_n(E) - \frac{1}{r}\sum_{s=1}^{r}\frac{1}{n}\log\|M_n(2^{s+s_0}\theta,E)\|\Big| > \lambda^3\Big] < r^2\lambda^{-3}4^n\,e^{-c\lambda^6n^{-2}r} < e^{-n}. \tag{8.20}$$
The exceptional set $\mathbb{T}\setminus\Omega_1$ for (8.14) is obtained by taking the lim sup of the sets on the left-hand side of (8.20). Now fix some $n>n_0^4(\theta,\lambda)$ and let $k = [n^{1/4}]$ and $m = [n/k]+1$. Estimating
$$\|M_n(\theta,E)\| \le \prod_{s'=0}^{m}\|M_k(2^{s'k}\theta,E)\| \tag{8.21}$$
we may deduce from (8.14) that
$$n^{-1}\log\|M_n(2^{s_0}\theta,E)\| < L_k(E) + \lambda^3 + O(n^{-3/4}) \tag{8.22}$$
for $s_0<n^2$. For sufficiently large $k$ convergence implies that $L_k(E) < L(E) + O(\lambda^3)$. Hence, for $\theta\in\Omega_1$, and any $|E|<2$, $n>n_0^4(\theta,\lambda)$, $0\le j\le n^2$,
$$\frac{1}{n}\log\|M_n(2^j\theta,E)\| < L(E) + O(\lambda^3). \tag{8.23}$$
Combining (8.12), (8.13), and (8.23) yields
$$|G_N(\theta,E)(n_1,n_2)| \le \frac{\|M_{n_1-1}(\theta)\|\,\|M_{N-n_2-1}(2^{n_2+1}\theta)\|}{|\det[H_N(\theta)-E]|} < \frac{\exp\big((N-|n_1-n_2|)L(E) + O(\lambda^{5/2})N\big)}{|\det[H_N(\theta)-E]|},$$
as claimed.
Let $\theta\in\Omega_1$. In view of (8.13) and (8.15), for any $N>n_0^4(\theta,\lambda)$ and $|E|<2$,
$$|G_N(\theta,E)(n_1,n_2)| \le \exp\big(-|n_1-n_2|L(E) + [NL_N(E) - \log\|M_N(\theta,E)\|] + O(\lambda^{5/2}N)\big) \tag{8.24}$$
possibly up to replacing $N$ with $N-1$ or $N-2$ and $\theta$ with $2\theta$. Combining (8.24) with the large deviation theorem from Sect. 6 yields
$$|G_N(\theta,E)(n_1,n_2)| < \exp\big(-|n_1-n_2|L(E) + C\lambda^{5/2}N\big) \tag{8.25}$$
for all $n_1,n_2\in[0,N]$ provided $\theta$ lies in a set whose complement is of measure $<e^{-c\lambda^4N}$. Indeed, this set, which depends on $E$, consists of those $\theta\in\mathbb{T}$ such that
$$L_N(E) < \frac{1}{N}\log\|M_N(\theta,E)\| + C\lambda^{5/2},$$
see (6.3). The exponential decay estimate (8.25) will be used in the following section to establish Anderson localization. It is again subject to possible replacement of $N$ with $N-1$ or $N-2$ and of $\theta$ with $2\theta$, but this is irrelevant for the applications below.

9. Localization for the Doubling Map

In this section we shall define a set $\Omega\subset\Omega_1$ so that $\mathbb{P}(\mathbb{T}\setminus\Omega)=0$ and Anderson localization holds on $\Omega$. As explained in the introduction, $\theta\in\Omega$ iff only finitely many double resonances occur. Proposition 9.2 below makes this precise. First we want to establish a simple and standard fact about the location of the spectrum. This will show that Theorem 9.3 is meaningful.

Lemma 9.1. Let $H(x)$ be a family of Schrödinger operators as in (1.1) with potentials $v_n(x) = f(2^nx)$, $x\in\mathbb{T}$. Assume that $f$ is a Hölder continuous function on $\mathbb{T}$. Then $\mathrm{sp}(H(x))\supset[-2,2]+f(0)$ for a.e. $x\in\mathbb{T}$.

Proof. This is a simple consequence of the results of Kunz and Souillard [19], see also Theorem 4.22 in [19]. For the convenience of the reader, we give the short argument. First consider the case of a single operator (1.1) with an arbitrary sequence $\{v_n\}$. Assume that there is $\alpha$ and an interval $[n_0,n_1]\subset\mathbb{Z}^+$ so that
$$\sup_{[n_0,n_1]}|v_n-\alpha| < \varepsilon$$
for some small $\varepsilon$. Define $\psi_n = \chi_{[n_0,n_1]}(n)\,e^{i\kappa n}$ for all $n\in\mathbb{Z}^+$. Then
$$(H\psi)_n = (2\cos\kappa + v_n)e^{i\kappa n} \quad\text{provided } n_0<n<n_1 \tag{9.1}$$
and $(H\psi)_n = 0$ if $n<n_0-1$ or $n>n_1+1$. Let $E = 2\cos\kappa+\alpha$. Hence
$$\|(H-E)\psi\|_2^2 = O(1) + \sum_{n_0}^{n_1}|v_n-\alpha|^2,$$
which implies that
$$\mathrm{dist}(E,\mathrm{sp}(H)) < \big(\varepsilon^2 + O((n_1-n_0+1)^{-1})\big)^{1/2}. \tag{9.2}$$
Returning to our family of operators $H(x)$, observe that a.e. $x\in\mathbb{T}$ has the property that the binary expansion of $x$ contains arbitrarily long blocks of zeros. In view of our assumptions, a.e. $x\in\mathbb{T}$ therefore has the following property: For every $\varepsilon>0$ and every positive integer $N$ there exists an interval $[n_0,n_0+N]\subset\mathbb{Z}^+$ such that
$$\sup_{[n_0,n_0+N]}|v_n(x)-f(0)| < \varepsilon.$$
Thus (9.1) holds with $\alpha = f(0)$, which implies that (9.2) also holds with any $E\in[-2,2]+f(0)$. Hence a.e. $x\in\mathbb{T}$ has the property that $E\in\mathrm{sp}(H(x))$. Recall that there exists a compact set $K$ so that $\mathrm{sp}(H(x)) = K$ for a.e. $x\in\mathbb{T}$. We have therefore shown that $[-2,2]+f(0)\subset K$, as claimed.

Applying this lemma to the operators (3.2) shows that the spectrum of that family contains $[-2+\lambda,2+\lambda]$. Since $\lambda$ is small depending on $\delta$, one obtains that $[-2+\delta,2-\delta]$ lies in the spectrum, as desired.

Proposition 9.2. Fix some small $\delta>0$ and a sufficiently small $\lambda>0$. Let $N\ge N_0(\lambda,\delta)$ be a large integer. Define $B_N = B_N(\delta,\lambda)$ as the set of those $\theta\in\mathbb{T}$ such that for some choice of $N_1$, $E$, and $k$, where $N_1\in[N^{12},2N^{12}]\cap\mathbb{Z}$, $\delta<|E|<2-\delta$, and $\bar N\le k\le2\bar N$ with $\bar N = e^{(\log N)^2}$, the following conditions hold:
$$\|G_{N_1}(\theta,E)\| > e^{N^2}, \tag{9.3}$$
$$\frac{1}{N}\log\|M_N(2^k\theta,E)\| < L_N(E) - C\lambda^{5/2}. \tag{9.4}$$
Then
$$\mathbb{P}(B_N) \le e^{-c\lambda^4N}. \tag{9.5}$$
Proof. To estimate the measure of those $\theta\in\mathbb{T}$ satisfying (9.3), (9.4) we define
$$\tilde B_N(\theta) = \bigcup_{N_1\sim N^{12}}\ \bigcup_{E\in\mathrm{sp}(H_{N_1}(\theta))\cap I_0}\Big\{\eta\in\mathbb{T}\,\Big|\,\frac{1}{N}\log\|M_N(\eta,E)\| < L_N(E) - C\lambda^{5/2}\Big\}, \tag{9.6}$$
where $I_0 = [-2+\delta,-\delta]\cup[\delta,2-\delta]$. Now suppose $\theta\in\mathbb{T}$ is chosen such that (9.3) and (9.4) hold. Since $\|G_{N_1}(\theta,E)\|^{-1} = \mathrm{dist}(E,\mathrm{sp}(H_{N_1}(\theta)))$, one obtains that $2^k\theta\in\tilde B_N(\theta)$ for some $k\sim\bar N$. Indeed, since $N$ is large the $e^{-N^2}$-errors that arise by passing from $B_N$ to $\tilde B_N$ can be absorbed into the $C\lambda^{5/2}$-term in (9.6) (this fact will be used to obtain (9.7) below). It therefore suffices to bound the measure of $\tilde B_N$. Denote the set on the right-hand side of (9.6) by $S_N(E)$. By (6.3)
$$\mathbb{P}(S_N(E)) < e^{-c\lambda^4N}.$$
Divide the torus into congruent intervals $\mathbb{T} = \bigcup_{\ell=1}^{M}J_\ell$, with $|J_\ell|\sim e^{-N^2}$ and fix some $\theta_\ell\in J_\ell$. In view of (8.17) one has (the step leading to (9.8) will be explained below)
$$\mathbb{P}\big[\theta\in\mathbb{T} : 2^k\theta\in\tilde B_N(\theta)\text{ for some }k\sim\bar N\big] \le \sum_{k\sim\bar N}\sum_{\ell=1}^{M}\mathbb{P}\big[\theta\in J_\ell : 2^k\theta\in\tilde B_N(\theta_\ell)\big] \tag{9.7}$$
$$\le \sum_{k\sim\bar N}\sum_{\ell=1}^{M}\ \sum_{N_1\sim N^{12}}\ \sum_{E\in\mathrm{sp}(H_{N_1}(\theta_\ell))\cap I_0}\mathbb{P}\big[\theta\in J_\ell : 2^k\theta\in S_N(E)\big]$$
$$\le \sum_{k\sim\bar N}\sum_{\ell=1}^{M}\ \sum_{N_1\sim N^{12}}\ \sum_{E\in\mathrm{sp}(H_{N_1}(\theta_\ell))\cap I_0}(2^k|J_\ell|+1)\,2^{-k}\,\mathbb{P}[S_N(E)] \tag{9.8}$$
$$\le 2\sum_{k\sim\bar N}\sum_{\ell=1}^{M}N^{100}|J_\ell|\,e^{-c\lambda^4N} \le 2\bar NN^{100}e^{-c\lambda^4N} < Ce^{-c\lambda^4N/2}.$$
In order to pass to line (9.7), one might need to take a smaller constant $C$ in Definition (9.6), but this is irrelevant. The independent behavior of the underlying dynamics is reflected in the inequality leading to (9.8), i.e.,
$$\mathbb{P}\big[\theta\in J_\ell : 2^k\theta\in S_N(E)\big] \le (2^k|J_\ell|+1)\,2^{-k}\,\mathbb{P}[S_N(E)].$$
This is proved by covering $J_\ell$ with dyadic intervals of size $2^{-k}$. Since no more than $2^k|J_\ell|+1$ many intervals are needed for such a cover, a change of variables on each of the $2^{-k}$-intervals leads to the line above. In view of the preceding, the measure of the $\theta$-set such that (9.3), (9.4) holds for some choice of $E$, $N_1$, and $k$, is at most $e^{-c\lambda^4N/2}$. The proposition follows.

A similar estimate is proved in [16] for the case of the shift. There, however, the proof requires more detailed analysis of the structure of the set (9.6) which is provided by certain complexity bounds for semi-algebraic sets. Although this approach can also be used to obtain (9.5), it is much easier to invoke the “near independence” of the events (9.3) and (9.4), as was done in (9.8) above.

Theorem 9.3. Fix some $\delta>0$ and a sufficiently small $\lambda>0$. Let $\Omega = \Omega(\delta,\lambda) = \Omega_1\setminus\limsup B_N$, where $\Omega_1$ was defined in Proposition 8.2 and $B_N$ in Proposition 9.2. Then $\mathbb{P}(\mathbb{T}\setminus\Omega)=0$ and for every $\theta\in\Omega$ the operator $H_\lambda(\theta)$ defined in (3.2) on $\ell^2(\mathbb{Z}\cap[0,\infty))$ with a Dirichlet boundary condition $\psi_{-1}=0$ has pure-point spectrum in $[-2+\delta,-\delta]\cup[\delta,2-\delta]$ and the corresponding eigenfunctions decay exponentially.

Proof. We follow the same scheme as in the case of the shift, see [4] and Sect. 1 above. Fix some $\delta$, $\lambda$ and an arbitrary $\theta\in\Omega$. By a Theorem of Shnol [20] and Simon [21] for
a.e. $E\in\mathrm{sp}(H_\lambda(\theta))$ with respect to spectral measure there exists $\xi$ such that
$$(H_\lambda(\theta)-E)\xi = 0 \quad\text{on the half-line } [0,\infty)\cap\mathbb{Z}, \tag{9.9}$$
$$\xi_0 = 1, \tag{9.10}$$
$$|\xi_k| < Ck^p \quad\text{for all } k = 1,2,\dots, \tag{9.11}$$
where $p$ is some positive integer. It therefore suffices to show that for any $E$ any polynomially bounded solution $\xi$ as in (9.9) decays exponentially. To this end fix some such $E$ and $\xi$ and choose $N$ so large that $\theta\in\Omega^{(N)} := \Omega_1\setminus B_N$. We claim that there is $N_1$ such that
$$N_1\sim N^{12} \tag{9.12}$$
and
$$\|G_{N_1}(\theta,E)\| = \|G_{[0,N_1]}(\theta,E)\| > e^{N^2}. \tag{9.13}$$
From (9.9)
$$(H_{N_1}-E)R_{[0,N_1]}\xi = -R_{[0,N_1]}HR_{\mathbb{Z}^+\setminus[0,N_1]}\xi, \qquad |(H_{N_1}-E)R_{[0,N_1]}\xi| = |\xi_{N_1+1}|, \tag{9.14}$$
where $R$ is the restriction and hence, by (9.10),
$$\|G_{N_1}(\theta,E)\| > |\xi_{N_1+1}|^{-1}. \tag{9.15}$$
It therefore suffices to find
$$k\sim N^{12} \tag{9.16}$$
with
$$|\xi_k| < e^{-N^2}. \tag{9.17}$$
Define
$$\Lambda = \big[k-N^3/2,\ k+N^3/2\big] = [a,b] \tag{9.18}$$
and suppose $G_\Lambda$ satisfies the estimate
$$|G_\Lambda(\theta,E)(n_1,n_2)| < \exp\big(-c\lambda^2|n_1-n_2| + O(\lambda^{5/2}N^3)\big) \tag{9.19}$$
for any $n_1,n_2\in\Lambda$. Again, from (9.9), (9.14), (9.11), and (9.16),
$$|\xi_k| \le \big|(H_\Lambda(\theta)-E)^{-1}(\xi_{a-1}\delta_a + \xi_{b+1}\delta_b)\big| \le Ce^{-c\lambda^2N^3/2}N^{12p} < e^{-N^2}$$
for large $N$, which is (9.17). Next we show how to ensure (9.19). Estimate by (8.24)
$$|G_\Lambda(\theta,E)(n_1,n_2)| = \big|G_{N^3}(2^{k-\frac{N^3}{2}}\theta)(n_1,n_2)\big| < \exp\Big(-c\lambda^2|n_1-n_2| + O(\lambda^{5/2}N^3) + N^3\big[L_{N^3}(E) - N^{-3}\log\|M_{N^3}(2^{k-\frac{N^3}{2}}\theta,E)\|\big]\Big). \tag{9.20}$$
The issue is thus to find $k\sim N^{12}$ for which
$$L_{N^3}(E) - \frac{1}{N^3}\log\|M_{N^3}(2^k\theta,E)\| < \lambda^{5/2}. \tag{9.21}$$
Applying (8.14) with $n = N^3$, $s_0 = N^{12}$ gives
$$\Big|L_{N^3}(E) - N^{-12}\sum_{k=N^{12}}^{2N^{12}}\frac{1}{N^3}\log\|M_{N^3}(2^k\theta,E)\|\Big| < \lambda^3. \tag{9.22}$$
Thus (9.21) follows from (9.22) by averaging over $k$. This permits us to establish (9.12), (9.13). Since (9.3) now holds for this choice of $N_1$ and $E$, the definition of $\Omega^{(N)}$ implies that (9.4) fails for every $k\in[\bar N,2\bar N]$. First we claim that
$$\|G_{[\bar N,2\bar N]}(\theta,E)\| \le \exp(CN). \tag{9.23}$$
By the resolvent identity,
$$G_{[\bar N,2\bar N]}(\theta,E+i\varepsilon)(n,m) - G_\Lambda(\theta,E+i\varepsilon)(n,m) = -\sum_{k\in\partial\Lambda}G_{[\bar N,2\bar N]}(\theta,E+i\varepsilon)(k,m)\,G_\Lambda(\theta,E+i\varepsilon)(n,k), \tag{9.24}$$
where $\Lambda = [\bar N,2\bar N]\cap[n-N/2,n+N/2]$ and $\varepsilon>0$ (we might have to shift $\Lambda$ by one unit or shrink its length by one or two, see the comments following (8.25)). Because (9.4) fails, the terms involving $G_\Lambda$ in (9.24) can be controlled by means of (8.25), and (9.23) follows by letting $\varepsilon\to0$. The usual “block-expansion”, i.e., repeated application of the resolvent identity now yields
$$|G_{[\bar N,2\bar N]}(\theta,E)(n,m)| \le C\exp\big(-(L(E)-C\lambda^{5/2})|n-m|\big) \tag{9.25}$$
for any $n,m\in[\bar N,2\bar N]$ with $|n-m|\ge\bar N/2$. Taking (9.25) for granted for the moment, one obtains from (9.11),
$$|\xi_n| = \big|\big[G_{[\bar N,2\bar N]}(\theta,E)(\xi_{\bar N-1}\delta_{\bar N} + \xi_{2\bar N+1}\delta_{2\bar N})\big](n)\big| \le \exp(-L(E)n/10)$$
for any $n\in[5\bar N/4,7\bar N/4]$, say. Since this argument applies to all sufficiently large $N$, the theorem follows.

To obtain (9.25), fix $n,m\in[\bar N,2\bar N]$ with $|n-m|\ge\bar N/2$ and apply the resolvent identity $\ell = [|n-m|/N]$ many times:
$$G_{[\bar N,2\bar N]}(\theta,E)(n,m) = (-1)^{\ell}\sum_{k_1\in\partial\Lambda_1}\ \sum_{k_2\in\partial\Lambda_2(k_1)}\cdots\sum_{k_\ell\in\partial\Lambda_\ell(k_{\ell-1})}G_{\Lambda_1}(n,k_1)(\theta,E)\,G_{\Lambda_2(k_1)}(k_1,k_2)(\theta,E)\,G_{\Lambda_3(k_2)}(k_2,k_3)(\theta,E)\cdots G_{\Lambda_\ell(k_{\ell-1})}(k_{\ell-1},k_\ell)(\theta,E)\,G_{[\bar N,2\bar N]}(\theta,E)(k_\ell,m).$$
Here $\Lambda_1$ is an interval centered at $n$ of length $N$ intersected with $[\bar N,2\bar N]$, $\Lambda_2(k_1)$ is an interval of length $N$ centered at $k_1$ intersected with $[\bar N,2\bar N]$, etc. (again up to possible shifting and shrinking). Bounding each of the Green’s functions over the $\Lambda$’s in terms of (8.25) and the last factor by (9.23) yields (9.25).
10. Some Lemmas About Functions on the Line

The remainder of this paper is devoted to potentials of the form $v_n(x) = F(A^nx)$, where $A:\mathbb{T}^2\to\mathbb{T}^2$ is a hyperbolic toral automorphism and $F$ is a fixed function. For any $f\in C^1(\mathbb{R})$, and $h>0$, let
$$m_f(h) = \sup_{I:\,|I|=h}\Big|\fint_If\,dy\Big|,$$
where the supremum is taken over all intervals $I\subset\mathbb{R}$ and $\fint_I = \frac{1}{|I|}\int_I$.

Lemma 10.1. Let $f, \{g_j\}_{j=1}^{\infty}\in C^1(\mathbb{R})$, $\|f\|_\infty\le1$, $\sup_j\|g_j\|_\infty\le1$. Suppose $\|f'\|_\infty + \sup_j\|g_j'\|_\infty\le K$. Then for any $L\ge2$,
$$\sup_{|I|=1}\int_I\exp\Big(t\sum_{j=1}^{N}g_j(L^{j-1}y)f(L^jy)\Big)dy \le e^{Ct^2N} \tag{10.1}$$
provided
$$KL^{-\frac{1}{2}} + m_f(\sqrt{L}) \le t. \tag{10.2}$$
Proof. Let $S_N(y) = \sum_{j=1}^{N}g_j(L^{j-1}y)f(L^jy)$. Then
exp tSN (y) dy = exp tSN−1 (y) exp tgN (LN−1 y)f (LN y) dy (10.3) I
1 1 1 − exp tSN−1 (L 2 −N y) [exp tgN (L− 2 y)f (L 2 y) − at,L ()]dy = J
1 − exp tSN−1 (L 2 −N y) dy at,L (). +
Here LN− 2 I = 1
(10.4)
J
J ,
where J are congruent, |J | ∼ 1 and
1 1 at,L () = − exp tgN (L− 2 y)f (L 2 y) dy. J
(10.5)
Expanding the exponential function yields at,L ≤ e
Ct 2
+ t[L
− 21
√
gN + mf (
L)] ≤ e
Ct 2
√ K + t √ + mf ( L) . L
(10.6)
Let 1 1 exp tgN (L− 2 y)f (L 2 y) − at,L () = ht,L (y)
(10.7)
on J where ht,L = 0 at the endpoints of J . Then |ht,L | ≤ 2et on J . Therefore, − exp tSN−1 (L 21 −N y) h (y)dy t,L J 1 1 ≤ − exp tSN−1 (L 2 −N y) tL 2 −N SN−1 ∞ |ht,L (y)|dy J
≤ tL 2 −N 1
N−1
j =1
≤ Ctet L− 2 K 1
1 K Lj 2et − exp tSN−1 (L 2 −N y) dy J
J
1 exp tSN−1 (L 2 −N y) dy
(10.8)
In view of (10.4), (10.6) and (10.8), √ Ctet K 2 exp tSN (y) dy ≤ exp tSN−1 (y) dy eCt + √ + tmf ( L) L I I 2 ≤ eCt exp tSN−1 (y) dy I
provided (10.2) holds. The lemma follows.
The following result is the analogue of Lemma 8.1.

Corollary 10.2. Fix some hyperbolic toral automorphism $A$ on $\mathbb{T}^2$. Let $F\in C^1(\mathbb{T}^2)$, $\langle F\rangle = 0$, and such that $|F|\le1$, $|\nabla F|\le K$. Then for any $0<\delta<1$,
$$\mathrm{mes}\Big[x\in\mathbb{T}^2 : \Big|\frac{1}{N}\sum_{n=1}^{N}F(A^nx)\Big| > \delta\Big] \le C\exp\Big(-c\,\frac{\delta^2N}{\log(K/\delta)}\Big) \tag{10.9}$$
for all $N\ge N_0(A,K,\delta)$. $c$, $C$ are constants depending only on $A$.

Proof. Let $Av^+ = \rho v^+$, $Av^- = \rho_-v^-$, where $|\rho|>1$, $\rho_- = \pm\frac{1}{\rho}$ and $\|v^+\| = \|v^-\| = 1$. We can assume that $\rho>1$. Let $F_R = \psi_{1/R}*F$, where $\psi_{1/R}$ is an $L^1$-normalized bump function with $\mathrm{diam}(\mathrm{supp}(\psi_{1/R}))\sim1/R$. For $R\sim K/\delta$,
$$\|F_R-F\|_\infty < \delta/2. \tag{10.10}$$
Set $f(y) = F_R(v^+y)$, $y\in\mathbb{R}$. Then $\|f'\|_\infty\le K$ and for any interval $I\subset\mathbb{R}$,
$$\Big|\int_If(y)\,dy\Big| \le \sum_{0<|k|}|\hat F(k)|\,|\hat\psi_{1/R}(k)|\,\Big|\int_Ie(k\cdot v^+y)\,dy\Big| \le \sum_{0<|k|}|\hat F(k)|\,|\hat\psi_{1/R}(k)|\,|k\cdot v^+|^{-1} \le CR^4. \tag{10.11}$$
Here we have used that $|k\cdot v^+|\ge|k|^{-2}$, which holds since the slope of $v^+$ is a quadratic irrationality. Thus
$$h\,m_f(h) \le CR^4. \tag{10.12}$$
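The small-divisor bound used in (10.11) can be examined for a concrete automorphism. The following is a minimal sketch, assuming the example matrix $A = \begin{pmatrix}2&1\\1&1\end{pmatrix}$, which is our choice and is not singled out in the text; the function names are likewise ours. It computes the expanding eigenpair and the minimum of $|k\cdot v^+|\,|k|$ over a box of nonzero integer vectors. A positive lower bound on that product is stronger than the $|k|^{-2}$ bound quoted above, and it is what Liouville's theorem gives for a quadratic irrational slope.

```python
import math

def unstable_direction(a):
    """Expanding eigenvalue rho and unit eigenvector v_plus of a 2x2 integer
    matrix a = ((a11, a12), (a21, a22)) with det = 1, |trace| > 2, a12 != 0."""
    tr = a[0][0] + a[1][1]
    rho = (tr + math.copysign(math.sqrt(tr * tr - 4.0), tr)) / 2.0
    # First row of (a - rho) v = 0 gives v proportional to (a12, rho - a11).
    vx, vy = float(a[0][1]), rho - a[0][0]
    norm = math.hypot(vx, vy)
    return rho, (vx / norm, vy / norm)

def min_scaled_small_divisor(v_plus, radius):
    """Minimum of |k . v_plus| * |k| over integer k != 0 with |k_1|, |k_2| <= radius."""
    best = float("inf")
    for k1 in range(-radius, radius + 1):
        for k2 in range(-radius, radius + 1):
            if k1 == 0 and k2 == 0:
                continue
            dot = abs(k1 * v_plus[0] + k2 * v_plus[1])
            best = min(best, dot * math.hypot(k1, k2))
    return best

if __name__ == "__main__":
    cat = ((2, 1), (1, 1))
    rho, v_plus = unstable_direction(cat)
    print("rho =", rho, " v_plus =", v_plus)
    print("min |k.v+| |k| =", min_scaled_small_divisor(v_plus, 50))
```

For this matrix the near-minimizers are the integer vectors built from consecutive Fibonacci numbers, which is why the minimum stabilizes as the box grows.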
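Let $s$ be a positive integer to be determined. Then
$$\int_0^1\exp\Big(t\sum_{n=1}^{N}f(\rho^ny)\Big)dy \le e^{st}\int_0^1\exp\Big(t\sum_{r=1}^{s}\sum_{j=0}^{(N/s)-1}f(\rho^{js}\rho^ry)\Big)dy \le e^{st}\prod_{r=1}^{s}\Big(\frac{1}{\rho^r}\int_0^{\rho^r}\exp\Big(ts\sum_{j=0}^{(N/s)-1}f(\rho^{js}y)\Big)dy\Big)^{1/s}. \tag{10.13}$$
Let $\rho^s = L\ge2$. By Lemma 10.1, the integral in (10.13) is
$$\le Ce^{Ct^2s^2\frac{N}{s}}, \tag{10.14}$$
provided
$$K\rho^{-\frac{s}{2}} + m_f(\rho^{s/2}) \le st \tag{10.15}$$
or, in view of (10.12)
$$K\rho^{-\frac{s}{2}} + R^4\rho^{-\frac{s}{2}} \le st \iff \Big(\frac{K}{\delta}\Big)^4 \le \rho^{\frac{s}{2}}ts. \tag{10.16}$$
Now choose $s$ so that
$$\rho^s\sim\Big(\frac{K}{\delta}\Big)^{10} \tag{10.17}$$
and $t = \frac{\delta}{Cs}$ for $C$ large. Then (10.16) is satisfied if $K\ge C$. Therefore, (10.14) holds with this choice of $t$ and $s$. Hence, in view of (10.13),
$$\mathrm{mes}\Big[x\in\mathbb{T}^2 : \Big|\frac{1}{N}\sum_{n=1}^{N}F(A^nx)\Big| > \delta\Big] \le 2e^{-\delta Nt}\int_{\mathbb{T}^2}\exp\Big(t\sum_{n=1}^{N}F(A^nx)\Big)dx \le Ce^{-\delta Nt}e^{2st}\int_0^1\exp\Big(t\sum_{n=1}^{N}f(\rho^ny)\Big)dy \le Ce^{-\delta Nt+3st}\,e^{t^2sN} \le Ce^{-C\delta^2N/\log\frac{K}{\delta}}, \tag{10.18}$$
as claimed (with $C = C(\rho)$). Notice that for $s\gg\delta N$ (10.9) becomes trivial. To pass to the first inequality in (10.18) we used that (with $y = v^+\cdot x$)
$$\sum_{n=1}^{N}F(A^nx) \le \sum_{n=s}^{N}F(\rho^nv^+\,v^+\cdot x + \rho^{-n}v^-\,v^-\cdot x) + s \le \sum_{n=s}^{N}f(\rho^ny) + C\sum_{n=s}^{\infty}K\rho^{-n} + s \le \sum_{n=1}^{N}f(\rho^ny) + \frac{CK}{\rho-1}\rho^{-s} + 2s.$$
Since $\rho^{-s} = \big(\frac{\delta}{K}\big)^{10}$, (10.18) follows.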
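11. Large Deviation Theorems for the Monodromy

From now on fix some toral automorphism $A$ on $\mathbb{T}^2$ and consider $H_x\psi(n) = \psi_{n+1}+\psi_{n-1}+\lambda V_n(x)\psi(n)$, where $V_n(x) = F(A^nx)$, $\langle F\rangle = 0$. $F$ will be assumed to be $C^1$. Recall (2.12)–(2.16) from Sect. 2:
$$\frac{1}{N}\log\rho_N = \frac{\lambda^2}{8N}\sum_{n=1}^{N}V_n^2 + \frac{\lambda}{2N}\sum_{n=1}^{N}V_n\sin2(\varphi_n+\kappa) - \frac{\lambda^2}{4N}\sum_{1}^{N}V_n^2\cos2(\varphi_n+\kappa) + \frac{\lambda^2}{8N}\sum_{1}^{N}V_n^2\cos4(\varphi_n+\kappa) + O(\lambda^3). \tag{11.1}$$
As above, fix some small $\delta>0$ and let $\delta<|E|<2-\delta$, $E = 2\cos\kappa$, $\zeta_n = e^{2i\varphi_n}$, and $\mu = e^{2i\kappa}$. By (10.9) with $K\sim1$,
$$\mathrm{mes}\Big[x\in\mathbb{T}^2 : \Big|\frac{1}{N}\sum_{1}^{N}V_n^2(x) - \langle V^2\rangle\Big| > \delta\Big] \le \exp\Big(-C\,\frac{\delta^2N}{|\log\delta|}\Big). \tag{11.2}$$
Lemma 11.1. If $0<\lambda\le\lambda_0(\rho)$, then $\|\nabla\zeta_n\|_\infty\le C\lambda\rho^{n-1}$ for all $n$.

Proof. In view of (2.18) $\zeta_n = F(\zeta_{n-1},V_{n-1})$, where
$$F(\zeta,u) = \mu\zeta + \frac{i\lambda u}{2}\,\frac{(\mu\zeta-1)^2}{1-\frac{i\lambda u}{2}(\mu\zeta-1)}. \tag{11.3}$$
Notice that $|F_\zeta|\le1+C\lambda$, $|F_u|\le C\lambda$ if $|\lambda|\le\frac{1}{2}$, $|u|\le1$. Thus
$$|\nabla\zeta_n| \le (1+C\lambda)|\nabla\zeta_{n-1}| + C\lambda\rho^{n-1} \le C\lambda\rho^{n-1}\sum_{\ell=0}^{\infty}\Big(\frac{1+C\lambda}{\rho}\Big)^{\ell} \le C\lambda\rho^{n-1} \tag{11.4}$$
if $1+C\lambda<\rho$.

As before, $Av^+ = \rho v^+$, $Av^- = \rho_-v^-$, where $\rho>1$, $\rho_- = \pm\frac{1}{\rho}$ and $\|v^+\| = \|v^-\| = 1$.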
1 N 2 1 N 2 2 √ mes x ∈ T : Vn ζn (x) + Vn ζn (x) > λ ≤ e−Cλ N . N N
2
1
1
(11.5)
Proof. Let W = F 2 − F 2 so that Vn2 (x) − Vn2 = W (An x) = Wn (x). Since
1 N 2 | N1 N 1 ζn | + | N 1 ζn | ≤ Cλ, see (3.9) and (3.11), it suffices to prove (11.5) with Wn instead of Vn2 . Let ζn = ξn + iηn . We will show that N √ 1
mes x ∈ T2 : Wn ξn (x) > λ ≤ e−Cλ N , N
(11.6)
1
the other cases being similar. Fix a positive integer s to be specified below. Since |Wn (x)− Wn (v + v + · x)| ≤ |∇W |ρ −n one has N s 1
1 1 Wn (x)ξn (x) = N s N/s n=1
=
1 s
s (Wj s+r ξj s+r )(x) + 0 N
r=1
j =0
s
[N/s]−1
r=1
1 N/s
[N/s]−1
Wj s+r (v + x · v + )ξj s+r (x) + 0
j =0
s . N (11.7)
Now let f (y) = W (v + y) and gj,r (y, y ) = ξj s+r (ρ −s(j −1)−r v + y + v − y ). By Lemma 11.1, ∂ gj,r (y, y ) ≤ Cλρ s . ∂y Moreover, up to a suitable truncation, h mf (h) ≤ Cλ−2 , cf. (10.11). Clearly, for large N, and with L = ρ s , (remove ρ r by rescaling)
1 mes x ∈ T : N/s 2
[N/s]−1
j =0
√ λtN/s
≤ Ce−
≤ C exp
√ Wj s+r ξj s+r (x) > λ
1
sup
0 |I |=1 I
[N/s]−1
exp t f (Lj y)gj,r (Lj −1 y, y ) dydy
√ N 2N , − λ t + Ct s s
j =0
(11.8)
from Lemma 10.1 provided (1 + λρ s )L− 2 + λ−2 L− 2 ≤ t. 1
1
Setting L = ρ s = λ−3 one has t ∼ λ− 2 . Since (11.7) and (11.8) imply (11.6) for large N , the lemma follows. 1
Fix an integer T so that ρ T ∼ λ−1000 . Recall (4.9) ζn = µT ζn−T +
T
iλ s−1 µ Vn−s (µζn−s − 1)2 + 0(T λ2 ). 2 s=1
This implies N T N N iλ s−1 1
1 T 1
Vn ζn = µ Vn−s Vn + µ Vn ζn−T N 2 N N s=1
n=T +1
+
n=T +1
(11.9)
n=T +1
T N iλ s−1 1
2 µ Vn−s Vn (µ2(T −s) ζn−T − 2µT −s ζn−T ) + 0(λ2 T 2 ). 2 N s=1
n=T +1
By Corollary 10.2 with K = ρ s , mes x ∈ T2
1 N : Vn−s Vn − V0 Vs > λ ≤ exp(−Cλ N ). N
(11.10)
n=1
2 Lemma 11.3. mes[x ∈ T2 : N1 N n=T +1 Vn ζn−T (x) > λ ] ≤ exp(−Cλ N ) for N > N0 (ρ, λ). Proof. As before, it suffices to prove that for any r = 1, . . . , T , mes x ∈ T2 :
[N/T ] 1
2 Vj T +r (x)ξ(j −1)T +r (x) ≥ λ ≤ exp(−Cλ N ). N/T
(11.11)
j =1
This follows again by the usual reduction from Lemma 10.1 if one sets f (y) = V (v + y) and gj,r (y, y ) = ξn−T (ρ T −n yv + +y v − ), where n = j T +r (recall that ζn = ξn +iηn ). Note that Vj T +r (ρ −r yv + )ξn−T (ρ −r yv + + y v − ) = f (Lj y)gj,r (Lj −1 y, y ) ∂ gj,r (y, y )| ≤ Cλ and hmf (h) ≤ λ−8 . Hence, with L = ρ T ∼ λ−100 . Moreover, | ∂y with t ∼ λ2 ,
(11.11) ≤ e
−tλ2 N/T
1
sup
0 |I |=1 I
[N/T
] j j −1 exp t f (L y)gj,r (L y, y ) dydy j =1
N ≤ exp(−tλ2 + Ct 2 N/T ) ≤ exp(−Cλ N ). T Notice that condition (10.2) is satisfied, since λ−8 L− 2 ≤ t. 1
Lemma 11.4. For any 1 ≤ s ≤ T , N
√ 2 1 2(T −s) 2 T −s mes x ∈ T : Vn−s Vn (µ ζn−T − 2µ ζn−T ) > λ ≤ e−Cλ N , N n=T +1
provided N ≥ N0 (ρ, λ). Proof. One considers two cases: s ≤ T2 and s > T2 . In the former case, define f and 2 , respectively. If s > T , then define f and gj,r in terms of Vn−s Vn − V0 Vs and ζn−T 2 2 , respectively. Otherwise the details are the same as gj,r in terms of Vn and Vn−s ζn−T before and will be omitted.
iκn F, F (An ·). Then Proposition 11.5. Let σ (κ) = ∞ n=−∞ e 1 σ (2κ) 5/2 > λ mes x ∈ T2 : log ρN − λ2 ≤ exp(−Cλ N ) N 8 sin2 κ for N ≥ N0 (λ, A) and any 0 < λ < λ0 (δ, A, F ). Proof. This follows immediately from (11.1), (11.2), (11.5), (11.9), (11.10), Lemma 11.3 and Lemma 11.4. Note that the monodromy corresponds to ϕ1 = 0 or ϕ1 = equally well covers N1 log MN (E).
π 2 , so that Proposition 11.5
12. Hölder Continuity of the Integrated Density of States

In this section we prove the analogous result to Corollary 7.5. As the arguments are almost identical, we shall only state the result and skip the details. The only difference from Sect. 7 is that we need to use the large deviation theorems from the previous section rather than those from Sect. 6.

Proposition 12.1. Let $I\subset[-2+\delta,-\delta]\cup[\delta,2-\delta]$ so that $|\sigma(2\kappa)|>\sigma_0>0$ on $I$ (recall that $E = 2\cos\kappa$). Then for any $0<\lambda<\lambda_0(\sigma_0,\delta,A,F)$ the integrated density of states is Hölder continuous on $I$.

Proof. Identical with the proof in Sect. 7.

13. Localization for Toral Automorphisms

In this section we shall establish Anderson localization for the operator
$$H_\lambda(x)\psi(n) = \psi_{n+1}+\psi_{n-1}+\lambda v_n\psi_n \tag{13.1}$$
for small $\lambda$, where $v_n(x) = F(A^nx)$, $A$ is a fixed toral automorphism on $\mathbb{T}^2$, and $F\in C^1(\mathbb{T}^2)$. Since the details are very similar to those from Sects. 8 and 9, we shall provide only a sketch of the argument. As far as the location of the spectrum is concerned in this case, a lemma similar to Lemma 9.1 holds.

Lemma 13.1. Consider the family of operators $H(x)$ on $\ell^2(\mathbb{Z})$ defined by (1.1) with potentials $v_n = f(A^nx)$. Assume $f$ is Hölder continuous on $\mathbb{T}^2$. Then $\mathrm{sp}(H(x))\supset[-2,2]+f(0)$ for a.e. $x\in\mathbb{T}^2$.

Proof. This follows by basically the same arguments as in Lemma 9.1. As far as the dynamics is concerned, one simply uses that a.e. trajectory returns to arbitrarily small neighborhoods of the origin. We skip the details.

We follow the notation from Sect. 8. Let $M_N(E)$ be as in (1.2) with $v_n = v_n(x) = F(A^nx)$.
Proposition 13.2. There exists $\Omega_1\subset\mathbb{T}^2$ such that $\mathrm{mes}(\mathbb{T}^2\setminus\Omega_1)=0$ with the following property: For every $x\in\Omega_1$ there is $n_0 = n_0(x,\lambda)$ such that for all $n>n_0(x,\lambda)$, $|E|<2$ and $-n^8\le s_0\le n^8$ one has
$$\Big|L_n(E) - n^{-4}\sum_{s=1}^{n^4}\frac{1}{n}\log\|M_n(A^{s+s_0}x,E)\|\Big| < \lambda^3. \tag{13.2}$$
In particular, for every $x\in\Omega_1$ and $N>n_0^4(x,\lambda)$,
$$|G_N(x,E)(n_1,n_2)| < \frac{\exp\big((N-|n_1-n_2|)L(E) + O(\lambda^{5/2}N)\big)}{|\det[H_N(x)-E]|} \tag{13.3}$$
for all $n_1,n_2\in[0,N]$.

Proof. The proof is identical with that of Proposition 8.2. In fact, one simply needs to invoke Corollary 10.2 instead of Lemma 8.1. Details are left to the reader.

Recall that
$$\sigma(\kappa) = \sum_{n=-\infty}^{\infty}e^{i\kappa n}\langle F, F(A^n\,\cdot)\rangle. \tag{13.4}$$
Fix some small δ > 0 and suppose that I0 ⊂ [−2 + δ, −δ] ∪ [δ, 2 − δ] is some nonempty interval on which |σ (2κ)| > σ0 > 0. Proposition 13.3. Let λ > 0 be sufficiently small. Consider any large integer N . Let 12 BN = BN (δ, λ) be the set of those x ∈ T2 such that for some N1 ∼ N , E ∈ I0 and 2 k ∈ [−2N¯ , −N¯ ] ∪ [N¯ , 2N¯ ], where N¯ = e(log N) , the following conditions hold: G[−N1 ,N1 ] (x, E) > eN , 2
1 log MN (Ak x, E) < LN (E) − Cλ5/2 . N
(13.5) (13.6)
Then mes(BN ) ≤ e−cλ N . (13.7) 2 Proof. Let τ = e−N and let T2 = J be a partition of T2 into squares of size ∼ τ . Let x ∈ J be fixed. Define 1 y ∈ T2 : log MN (y, E) < LN (E) − Cλ5/2 . BN (x) = N 12 4
N1 ∼N
E∈sp(HN1 (x))∩I0
(13.8) If (13.5) and (13.6) hold simultaneously for some x ∈ J , then for some k as above Ak x ∈ BN (x ) provided the constant C in (13.8) is chosen appropriately. Since ∂x MN (E) ≤ (Cρ)N
and ρ −N is much smaller than ρ −N , one has ρ k (x · v + )v + ∈ BN (x ) for some |k| ∼ N¯ .
(13.9)
(τ1 ) χN,E (y)
an indicator of the set SN,E on the right Let τ1 ∼ (Cρ)−N and denote by hand side of (13.8) smoothed out over scale ∼ τ1 . With any fixed choice of y0 compute (τ1 ) (τ1 ) χN,E (ρ k yv + )dy = ρ −k χN,E (uv + )du |y−y0 |<τ |u−u0 |<τρ k
(τ1 ) (τ1 ) = 2τ χN,E χN,E (x)dx + ρ −k (ν) O(min(τρ k , v + · ν−1 )) T2
0<|ν|
≤ 2τ mes [SN,E ] + ρ −k
mes [SN,E ]
−1/2 1≤2
Cτ1−2 ·2 2
(13.10)
∼ τ mes [SN,E ] + ρ −k τ1−2 | log τ1 | mes [SN,E ]. To obtain the right-hand side of (13.10), one splits the summation range into those ν for which v + ·ν ∼ 2− . Since v + ·(ν 1 −ν 2 ) ≥ c|ν 1 −ν 2 |−2 , there are at most ∼ τ1−2 2− choices for each , as claimed. Also notice that in the O-term the first expression can be neglected because k ∼ N¯ implies τρ k > τ1−100 . In view of the preceding, mes [x ∈ T2 : (13.5) and (13.6) hold simultaneously] −2
≤
τ
k∼N¯ =1
mes x ∈ J : ρ k v + x · v + ∈ BN (x )
−2
≤
τ
=1 N1
≤
τ −2
∼N 12
J
E∈sp(HN 1 (x ))∩I0 k∼N¯
τ sup
|I |≤2τ
=1 N1 ∼N 12 E∈sp(HN1 (x ))∩I0 k∼N¯ −2
≤
τ
χSN,E (ρ k v + x · v + )dx (τ )
I
1 χN,E (ρ k yv + )dy
τ 2 mes[SN,E ] + ρ −k (Cρ)N
k∼N¯ =1 N1 ∼N 12 E∈sp(HN1 (x ))∩I0 ¯ ≤ N 100 N¯ e−Cλ N + ρ −N (Cρ)N N 100 < e−CN ,
and the proposition follows.

The same arguments that established localization for the doubling map, cf. Theorem 9.3, yield the following theorem. One simply replaces Propositions 8.2 and 9.2 in the proof of that theorem with the propositions of this section. Otherwise the details are basically the same, and will be omitted.

Theorem 13.4. Fix some small $\delta>0$ and a sufficiently small $\lambda>0$. Let $I_0\subset[-2+\delta,-\delta]\cup[\delta,2-\delta]$ have the property that $\sigma\ne0$ on $I_0$, cf. (13.4). Let $\Omega = \Omega(\delta,\lambda) = \Omega_1\setminus\limsup B_N$. Then $\mathrm{mes}(\mathbb{T}^2\setminus\Omega)=0$ and for every $x\in\Omega$ the operator $H_\lambda(x)$ defined in (13.1) on $\ell^2(\mathbb{Z})$ has pure-point spectrum in the interval $I_0$ and the corresponding eigenfunctions decay exponentially.
Acknowledgement. The authors thank Thomas Spencer for suggesting the problem and helpful discussions and Michael Goldstein for useful suggestions. The authors are grateful to the anonymous referee for pointing out that the location of the spectrum should be determined, as well as for suggesting several improvements to the presentation. The second author was supported in part by the NSF, grant number DMS-9706889.
Communicated by Ya. G. Sinai
Commun. Math. Phys. 215, 177 – 196 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
The Variational Principle for a Class of Asymptotically Abelian C∗-Algebras

Sergey Neshveyev1,*, Erling Størmer2,**

1 Institute for Low Temperature Physics and Engineering, 47, Lenin Ave., 310164 Kharkov, Ukraine
2 Department of Mathematics, University of Oslo, P.O. Box 1053, Blindern, 0316 Oslo, Norway

* Partially supported by NATO grant SA (PST.CLG.976206)5273.
** Partially supported by the Norwegian Research Council.

Received: 19 April 2000 / Accepted: 14 June 2000

Abstract: Let $(A, \alpha)$ be a C∗-dynamical system. We introduce the notion of pressure $P_\alpha(H)$ of the automorphism $\alpha$ at a self-adjoint operator $H \in A$. Then we consider the class of AF-systems satisfying the following condition: there exists a dense $\alpha$-invariant ∗-subalgebra $\mathcal{A}$ of $A$ such that for all pairs $a, b \in \mathcal{A}$ the C∗-algebra they generate is finite dimensional, and there is $p = p(a, b) \in \mathbb{N}$ such that $[\alpha^j(a), b] = 0$ for $|j| \ge p$. For systems in this class we prove the variational principle, i.e. show that $P_\alpha(H)$ is the supremum of the quantities $h_\phi(\alpha) - \phi(H)$, where $h_\phi(\alpha)$ is the Connes–Narnhofer–Thirring dynamical entropy of $\alpha$ with respect to the $\alpha$-invariant state $\phi$. If $H \in \mathcal{A}$, and $P_\alpha(H)$ is finite, we show that any state on which the supremum is attained is a KMS-state with respect to a one-parameter automorphism group naturally associated with $H$. In particular, Voiculescu's topological entropy is equal to the supremum of $h_\phi(\alpha)$, and any state of finite maximal entropy is a trace.

1. Introduction

The variational principle has over the years attracted much attention both in classical ergodic theory, see e.g. [W], and in the C∗-algebra setting of quantum statistical mechanics, see e.g. [BR]. In the years around 1970 there was much progress in the study of spin lattice systems. In that case one is given for each point $x \in \mathbb{Z}^\nu$ ($\nu \in \mathbb{N}$) the algebra of all linear operators $B(\mathcal{H})_x$ on a finite dimensional Hilbert space $\mathcal{H}$, and if $\Lambda \subset \mathbb{Z}^\nu$ one defines the C∗-algebra $A(\Lambda) = \otimes_{x \in \Lambda} B(\mathcal{H})_x$. One then considers the UHF-algebra $A = \overline{\cup_{\Lambda \subset \mathbb{Z}^\nu} A(\Lambda)}$ with space translations $\alpha$, and studies mean entropy $s(\phi)$ of invariant states and the corresponding variational principle
\[
P(H) = \sup_{\phi} \big( s(\phi) - \phi(H) \big) \tag{1.1}
\]
together with the KMS-states defined by a natural derivation associated with a self-adjoint operator $H$, see [BR, Chapter 6]. With the development of dynamical entropy of automorphisms of C∗-algebras [CS, CNT, V] it was natural to replace the mean entropy $s(\phi)$ by dynamical entropy $h_\phi(\alpha)$. This was done by Narnhofer [N], who considered KMS-properties of the states on which the quantity $h_\phi(\alpha) - \phi(H)$ attains its maximal value. Then Moriya [M] showed that one can replace $s(\phi)$ by the CNT-entropy $h_\phi(\alpha)$ in (1.1) and get the same result, i.e.
\[
P(H) = \sup_{\phi} \big( h_\phi(\alpha) - \phi(H) \big). \tag{1.2}
\]
If one wants to study the variational principle and equilibrium states for more general C∗-dynamical systems, the mean entropy is usually not well defined, and it is necessary to consider dynamical entropy. However, in order to define time translations and extend the theory of spin lattice systems rather strong assumptions of asymptotic abelianness are needed. In the present paper we shall study a restricted class of asymptotically abelian systems $(A, \alpha)$, namely we shall assume that $A$ is a unital separable C∗-algebra, and $(A, \alpha)$ is asymptotically abelian with locality, i.e. there exists a dense $\alpha$-invariant ∗-subalgebra $\mathcal{A}$ of $A$ such that for all pairs $a, b \in \mathcal{A}$ the C∗-algebra they generate is finite dimensional, and there is $p = p(a, b) \in \mathbb{N}$ such that $[\alpha^j(a), b] = 0$ for $|j| \ge p$. In particular, $A$ is an AF-algebra. Examples of such systems are described in [S]. They are all different shifts, on infinite tensor products of the same AF-algebra with itself, on the Temperley-Lieb algebras, on towers of relative commutants, on binary shift algebras defined by finite subsets of the natural numbers.

In Sect. 2 we define the pressure $P_\alpha(H)$ of $\alpha$ with respect to a self-adjoint operator $H \in A$. This can be done in any unital C∗-dynamical system $(A, \alpha)$ with $A$ a nuclear C∗-algebra, and follows closely Voiculescu's definition of topological entropy $ht(\alpha)$ in [V]. The main difference is that he considered $\operatorname{rank} B$ of a finite dimensional C∗-algebra $B$, while we consider quantities of the form $\mathrm{Tr}_B(e^{-K})$ for $K$ self-adjoint, where $\mathrm{Tr}_B$ is the canonical trace on $B$ (so in particular we get $\operatorname{rank} B$ when $K = 0$). We can then show the analogues of several of the classical results on pressure as presented in [W].

If $(A, \alpha)$ is asymptotically abelian with locality we show the variational principle (1.2) in Sect. 3. Furthermore, if we assume $ht(\alpha) < \infty$, $H \in \mathcal{A}$ and $\phi$ is a $\beta$-equilibrium state at $H$, i.e. $\phi$ is $\alpha$-invariant and $P_\alpha(\beta H) = h_\phi(\alpha) - \beta\phi(H)$, then we show in Sect. 4, via a proof modelled on the corresponding proof based on (1.1) in [BR] for spin lattice systems, that $\phi$ is a $\beta$-KMS state with respect to the one-parameter group defined by the derivation
\[
\delta_H(x) = \sum_{j \in \mathbb{Z}} [\alpha^j(H), x], \quad x \in \mathcal{A}.
\]
In particular, when $H = 0$, so $P_\alpha(0) = ht(\alpha)$, we obtain
\[
ht(\alpha) = \sup_{\phi} h_\phi(\alpha), \tag{1.3}
\]
and if $h_\phi(\alpha) = ht(\alpha)$ then $\phi$ is a trace.

Equation (1.3) is false in general. Indeed in [NST] there is exhibited a (non-asymptotically abelian) automorphism $\alpha$ of the CAR-algebra for which the trace $\tau$ is the unique invariant state, $h_\tau(\alpha) = 0$, while $ht(\alpha) \ge \frac{1}{2} \log 2$. Furthermore, our assumption of locality is essential to conclude that $\phi$ is a trace when $h_\phi(\alpha) = ht(\alpha) < \infty$, even for asymptotically abelian systems. In Example 5.7 we show that there is a Bogoliubov
automorphism $\alpha$ of the even CAR-algebra which is asymptotically abelian, $ht(\alpha) = 0$, while there are an infinite number of non-tracial $\alpha$-invariant states.

In Sect. 5 we consider some other examples and special cases. First we apply our results to C∗-algebras associated with certain AF-groupoids arising naturally from expansive homeomorphisms of zero-dimensional compact spaces. We show that if $H$ lies in the diagonal then there is a one-to-one correspondence between equilibrium states on the algebra and equilibrium measures in the classical sense. We also consider unique ergodicity for non-abelian systems. If $(A, \alpha)$ is asymptotically abelian with locality, unique ergodicity turns out to be of marginal interest. Indeed, the unique invariant state $\tau$ is a trace, and the image of $A$ in the GNS-representation of $\tau$ is abelian.

2. Pressure

In order to define pressure of a C∗-dynamical system we follow the setup of Voiculescu [V] for his definition of topological entropy. Let $A$ be a nuclear C∗-algebra with unit and $\alpha$ an automorphism. Let $\mathrm{CPA}(A)$ denote the set of triples $(\rho, \psi, B)$, where $B$ is a finite dimensional C∗-algebra, and $\rho : A \to B$, $\psi : B \to A$ are unital completely positive maps. $P_f(A)$ denotes the family of finite subsets of $A$. For $\delta > 0$, $\omega \in P_f(A)$ and $H \in A_{sa}$, put
\[
P(H, \omega; \delta) = \inf\{ \log \mathrm{Tr}_B(e^{-\rho(H)}) \mid (\rho, \psi, B) \in \mathrm{CPA}(A),\ \|(\psi \circ \rho)(x) - x\| < \delta\ \ \forall x \in \omega \},
\]
where $\mathrm{Tr}_B$ is the canonical trace on $B$, i.e. the trace which takes the value 1 on each minimal projection. Then set
\[
P_\alpha(H, \omega; \delta) = \limsup_{n \to \infty} \frac{1}{n} P\Big( \sum_{j=0}^{n-1} \alpha^j(H),\ \bigcup_{j=0}^{n-1} \alpha^j(\omega);\ \delta \Big),
\qquad
P_\alpha(H, \omega) = \sup_{\delta > 0} P_\alpha(H, \omega; \delta).
\]

Definition 2.1. The pressure of $\alpha$ at $H$ is
\[
P_\alpha(H) = \sup_{\omega \in P_f(A)} P_\alpha(H, \omega).
\]
We have chosen the minus sign e−ρ(H ) in the definition of P (H, ω; δ) because of its use in physical applications, see [BR], rather than the plus sign used in ergodic theory, see [W]. It is easy to see that if ω1 ⊂ ω2 ⊂ . . . is an increasing sequence of finite subsets of A such that the linear span of ∪n ωn is dense in A, then Pα (H ) = limn Pα (H, ωn ). If H = 0 the pressure coincides with the topological entropy of Voiculescu [V]. Recall that hφ (α) denotes the CNT-entropy of α with respect to an α-invariant state φ of A. Proposition 2.2. For any α-invariant state φ of A we have Pα (H ) ≥ hφ (α) − φ(H ).
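Proof. The proof is a rewording of the proof of [V, Prop. 4.6]. Let $N$ be a finite dimensional C∗-algebra, and $\gamma : N \to A$ a unital completely positive map. Let $\omega \in P_f(A)$ be such that $H \in \omega$ and $\gamma(\{x \in N \mid \|x\| \le 1\})$ is contained in the convex hull of $\omega$. Then if $(\rho, \psi, B) \in \mathrm{CPA}(A)$ and
\[
\|(\psi \circ \rho)(a) - a\| < \delta \quad \text{for } a \in \bigcup_{j=0}^{n-1} \alpha^j(\omega),
\]
we obtain by [CNT, Proposition IV.3]
\[
\big| H_\phi(\{\alpha^j \circ \gamma\}_{0 \le j \le n-1}) - H_\phi(\{\psi \circ \rho \circ \alpha^j \circ \gamma\}_{0 \le j \le n-1}) \big| < n\varepsilon,
\]
where $\varepsilon = \varepsilon(\delta) \to 0$ as $\delta \to 0$. If $K \in B_{sa}$ and $\theta$ is a state on $B$ then $\log \mathrm{Tr}_B(e^{-K}) \ge S(\theta) - \theta(K)$, hence
\[
\begin{aligned}
H_\phi(\{\psi \circ \rho \circ \alpha^j \circ \gamma\}_{0 \le j \le n-1}) &\le H_\phi(\psi) \le S(\phi \circ \psi) \\
&\le \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) + (\phi \circ \psi)\Big( \rho\Big( \sum_{j=0}^{n-1} \alpha^j(H) \Big) \Big) \\
&\le \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) + n\phi(H) + n\delta.
\end{aligned}
\]
Thus
\[
\frac{1}{n} H_\phi(\{\alpha^j \circ \gamma\}_{0 \le j \le n-1}) \le \frac{1}{n} P\Big( \sum_{j=0}^{n-1} \alpha^j(H),\ \bigcup_{j=0}^{n-1} \alpha^j(\omega);\ \delta \Big) + \phi(H) + \delta + \varepsilon.
\]
It follows that $h_\phi(\gamma; \alpha) \le P_\alpha(H, \omega) + \phi(H)$, hence $h_\phi(\alpha) \le P_\alpha(H) + \phi(H)$.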
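The finite-dimensional inequality $\log \mathrm{Tr}_B(e^{-K}) \ge S(\theta) - \theta(K)$ used in the proof can be checked numerically. The following is only a minimal sketch of the commutative case, in which $K$ and the density of $\theta$ are simultaneously diagonal; the eigenvalues and weights are hypothetical test data, not taken from the paper. It also illustrates that equality holds at the Gibbs state $e^{-K}/\mathrm{Tr}_B(e^{-K})$.

```python
import math

def log_partition(k):
    """log Tr(e^{-K}) for a diagonal K with eigenvalues k."""
    return math.log(sum(math.exp(-x) for x in k))

def entropy(p):
    """Entropy S(theta) of a diagonal state with weights p."""
    return -sum(x * math.log(x) for x in p if x > 0)

def free_energy(p, k):
    """S(theta) - theta(K) for diagonal theta (weights p) and K (eigenvalues k)."""
    return entropy(p) - sum(pi * ki for pi, ki in zip(p, k))

def gibbs(k):
    """Weights of the Gibbs state e^{-K} / Tr(e^{-K})."""
    z = sum(math.exp(-x) for x in k)
    return [math.exp(-x) / z for x in k]

K = [0.3, -1.2, 2.0]        # hypothetical eigenvalues of K
theta = [0.5, 0.25, 0.25]   # an arbitrary (non-Gibbs) state

# Gap in the inequality for an arbitrary state: strictly positive here.
gap = log_partition(K) - free_energy(theta, K)
# Gap at the Gibbs state: zero up to rounding.
gap_gibbs = log_partition(K) - free_energy(gibbs(K), K)
```

In this diagonal setting the gap is the relative entropy of $\theta$ with respect to the Gibbs state, which is why it vanishes only there.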
Remark 2.3. If $A$ is abelian, $A = C(X)$, and $P^{cl}_\alpha(H)$ denotes the pressure as defined in [W], then $P_\alpha(H) = P^{cl}_\alpha(-H)$. The inequality '≤' can be proved just the same as in [V, Proposition 4.8]. The converse inequality follows from Proposition 2.2 and the classical variational principle. In the AF-case however, i.e. when $X$ is zero-dimensional, it is easy to give a direct proof. Indeed, if in the proof of Proposition 2.2 $N$ was the subalgebra of $A$ corresponding to a clopen partition $P$ of $X$, then we could conclude that
\[
\frac{1}{n} H_\phi(\{\alpha^j(N)\}_{0 \le j \le n-1}) \le \frac{1}{n} P\Big( \sum_{j=0}^{n-1} \alpha^j(H),\ \bigcup_{j=0}^{n-1} \alpha^j(\omega);\ \delta \Big) + \frac{1}{n} \phi\Big( \sum_{j=0}^{n-1} \alpha^j(H) \Big) + \delta + \varepsilon \tag{2.1}
\]
for any (not necessarily $\alpha$-invariant) state $\phi$ of $A$, with $\varepsilon$ independent of $\phi$. Let $T$ be the homeomorphism corresponding to $\alpha$, so that $\alpha(f) = f \circ T$. Suppose the points $x_1, \dots, x_m$ lie in different elements of the partition $\vee_{j=0}^{n-1} T^{-j} P$. Then inequality (2.1) for the measure
\[
\phi = \Big( \sum_i e^{-(S_n H)(x_i)} \Big)^{-1} \sum_i e^{-(S_n H)(x_i)} \delta_{x_i},
\]
where $(S_n H)(x) = \sum_{j=0}^{n-1} H(T^j x)$, means that
\[
\frac{1}{n} \log \sum_i e^{-(S_n H)(x_i)} \le \frac{1}{n} P\Big( \sum_{j=0}^{n-1} \alpha^j(H),\ \bigcup_{j=0}^{n-1} \alpha^j(\omega);\ \delta \Big) + \delta + \varepsilon.
\]
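Recalling the definition of pressure [W, Def. 9.7], we see that $P^{cl}_\alpha(-H) \le P_\alpha(H)$.

We list some properties of the function $H \mapsto P_\alpha(H)$ on $A_{sa}$.

Proposition 2.4. The following properties are satisfied by $P_\alpha$ for $H, K \in A_{sa}$.
(i) If $H \le K$ then $P_\alpha(H) \ge P_\alpha(K)$.
(ii) $P_\alpha(H + c1) = P_\alpha(H) - c$, $c \in \mathbb{R}$.
(iii) $P_\alpha(H)$ is either infinite for all $H$ or is finite valued.
(iv) If $P_\alpha$ is finite valued then $|P_\alpha(H) - P_\alpha(K)| \le \|H - K\|$.
(v) For $k \in \mathbb{N}$, $P_{\alpha^k}\big( \sum_{j=0}^{k-1} \alpha^j(H) \big) = k P_\alpha(H)$.
(vi) $P_\alpha(H + \alpha(K) - K) = P_\alpha(H)$.

Proof. (i) Given $H \le K$ take $\omega \in P_f(A)$. If $(\rho, \psi, B) \in \mathrm{CPA}(A)$ we have
\[
\log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) \ge \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(K) \right)} \Big),
\]
see e.g. [OP, Corollary 3.15]. Thus (i) follows.

(ii) As in (i) we have
\[
\log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H + c1) \right)} \Big) = \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) - nc,
\]
and (ii) follows.

(iii) By (i) and (ii) we have $P_\alpha(H) \ge P_\alpha(\|H\|) = P_\alpha(0) - \|H\| = ht(\alpha) - \|H\|$, and similarly $P_\alpha(H) \le ht(\alpha) + \|H\|$. Thus (iii) follows.

(iv) For any $(\rho, \psi, B) \in \mathrm{CPA}(A)$ we have by the Peierls–Bogoliubov inequality [OP, Corollary 3.15]
\[
\Big| \frac{1}{n} \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) - \frac{1}{n} \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(K) \right)} \Big) \Big| \le \|H - K\|.
\]
Thus $P_\alpha(H, \omega; \delta) - \|H - K\| \le P_\alpha(K, \omega; \delta) \le P_\alpha(H, \omega; \delta) + \|H - K\|$ for any $\omega \in P_f(A)$. Thus (iv) follows.

(v) Let $(\rho, \psi, B) \in \mathrm{CPA}(A)$ and $\omega \in P_f(A)$. Given $n \in \mathbb{N}$ choose $m \in \mathbb{N}$ such that $mk \le n < (m+1)k$. Set $H_k = \sum_{j=0}^{k-1} \alpha^j(H)$ and $\omega_k = \cup_{j=0}^{k-1} \alpha^j(\omega)$. Then
\[
\begin{aligned}
\log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) &\ge \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{mk-1} \alpha^j(H) \right)} \Big) - k\|H\| \\
&= \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{m-1} \alpha^{jk}(H_k) \right)} \Big) - k\|H\|.
\end{aligned}
\]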
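Similarly
\[
\log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) \le \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{m} \alpha^{jk}(H_k) \right)} \Big) + k\|H\|.
\]
Since $\cup_{j=0}^{m-1} \alpha^{jk}(\omega_k) \subset \cup_{j=0}^{n-1} \alpha^j(\omega) \subset \cup_{j=0}^{m} \alpha^{jk}(\omega_k)$, it follows that
\[
P_\alpha(H, \omega; \delta) = \frac{1}{k} P_{\alpha^k}\Big( \sum_{j=0}^{k-1} \alpha^j(H),\ \bigcup_{j=0}^{k-1} \alpha^j(\omega);\ \delta \Big),
\]
and hence $P_\alpha(H) = \frac{1}{k} P_{\alpha^k}\big( \sum_{j=0}^{k-1} \alpha^j(H) \big)$.

(vi) Set $H_k = \sum_{j=0}^{k-1} \alpha^j(H)$ and $H'_k = \sum_{j=0}^{k-1} \alpha^j(H + \alpha(K) - K) = H_k + \alpha^k(K) - K$. Then by (iv) and (v) we have
\[
|P_\alpha(H) - P_\alpha(H + \alpha(K) - K)| = \frac{1}{k} |P_{\alpha^k}(H_k) - P_{\alpha^k}(H'_k)| \le \frac{2\|K\|}{k}.
\]
Thus (vi) follows.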
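Properties (ii) and (iv) rest on two elementary facts about the function $K \mapsto \log \mathrm{Tr}_B(e^{-K})$: it shifts by $-c$ when $c1$ is added, and it is Lipschitz with constant 1 in the operator norm. The sketch below checks both in the commutative (diagonal) case only, where the Peierls–Bogoliubov inequality is not needed; the eigenvalue lists are hypothetical test data.

```python
import math

def log_tr_exp(h):
    """log Tr(e^{-H}) for a diagonal H with eigenvalues h."""
    return math.log(sum(math.exp(-x) for x in h))

H = [0.7, -0.4, 1.5, 0.0]   # hypothetical eigenvalues of H
K = [0.5, -0.1, 1.1, 0.3]   # hypothetical eigenvalues of K
c = 2.5

# Finite-dimensional analogue of (ii): adding c*1 lowers log Tr(e^{-H}) by c.
shift_error = abs(log_tr_exp([x + c for x in H]) - (log_tr_exp(H) - c))

# Finite-dimensional analogue of (iv): the difference is bounded by ||H - K||,
# which for diagonal operators is the largest entrywise difference.
sup_norm = max(abs(x - y) for x, y in zip(H, K))
lipschitz_gap = sup_norm - abs(log_tr_exp(H) - log_tr_exp(K))
```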
The next result is the analogue of [W, Theorem 9.12], see also [R].

Proposition 2.5. Suppose $ht(\alpha) < \infty$. Let $\phi$ be a self-adjoint linear functional on $A$. Then $\phi$ is an $\alpha$-invariant state if and only if $-\phi(H) \le P_\alpha(H)$ for all $H \in A_{sa}$.

Proof. If $\phi$ is an $\alpha$-invariant state then by Proposition 2.2, $-\phi(H) \le P_\alpha(H) - h_\phi(\alpha) \le P_\alpha(H)$ for all $H \in A_{sa}$. Conversely if $-\phi(H) \le P_\alpha(H)$ for all $H \in A_{sa}$ then by Proposition 2.4(vi),
\[
-\phi(\alpha(H) - H) = -\frac{1}{n} \phi(\alpha(nH) - nH) \le \frac{1}{n} P_\alpha(\alpha(nH) - nH) = \frac{1}{n} P_\alpha(0) = \frac{1}{n} ht(\alpha) \xrightarrow[n \to \infty]{} 0.
\]
Applying this also to $-H$ we see that $\phi$ is $\alpha$-invariant. Furthermore, by Proposition 2.4(i), (ii),
\[
-\phi(H) = -\frac{1}{n} \phi(nH) \le \frac{1}{n} P_\alpha(nH) \le \frac{1}{n} ht(\alpha) + \|H\| \xrightarrow[n \to \infty]{} \|H\|,
\]
so that $\|\phi\| \le 1$. For $c \in \mathbb{R}$ we have $-c\phi(1) \le P_\alpha(c1) = ht(\alpha) - c$. Hence $\phi(1) = 1$, and $\phi$ is a state.

Definition 2.6. We say $\phi$ is an equilibrium state at $H$ if $P_\alpha(H) = h_\phi(\alpha) - \phi(H)$, hence by Proposition 2.2,
\[
h_\phi(\alpha) - \phi(H) = \sup_{\psi} \big( h_\psi(\alpha) - \psi(H) \big),
\]
where the sup is taken over all $\alpha$-invariant states.
Recall that if $F$ is a real convex continuous function on a real Banach space $X$, then a linear functional $f$ on $X$ is called a tangent functional to the graph of $F$ at the point $x_0 \in X$ if
\[
F(x_0 + x) - F(x_0) \ge f(x) \quad \forall x \in X.
\]
In the sequel we will identify self-adjoint linear functionals on $A$ with real linear functionals on $A_{sa}$. The next proposition is the analogue of [W, Theorem 9.14].

Proposition 2.7. Suppose $ht(\alpha) < \infty$ and the pressure is a convex function on $A_{sa}$. Then
(i) if $\phi$ is an equilibrium state at $H$ then $-\phi$ is a tangent functional for the pressure at $H$;
(ii) if $-\phi$ is a tangent functional for the pressure at $H$ then $\phi$ is an $\alpha$-invariant state.

Proof. (i) Let $K \in A_{sa}$. Then by Proposition 2.2,
\[
P_\alpha(H + K) - P_\alpha(H) \ge (h_\phi(\alpha) - \phi(H + K)) - (h_\phi(\alpha) - \phi(H)) = -\phi(K),
\]
so $-\phi$ is a tangent functional.

(ii) If $K \in A_{sa}$ then by Proposition 2.4(vi),
\[
-\phi(\alpha(K) - K) \le P_\alpha(H + \alpha(K) - K) - P_\alpha(H) = 0.
\]
Applying this also to $-K$ we see that $\phi$ is $\alpha$-invariant. Now note that $\|\phi\| \le 1$ by Proposition 2.4(iv). By Proposition 2.4(ii) we have also $-c \ge -c\phi(1)$ for any $c \in \mathbb{R}$. Hence $\phi(1) = 1$, and $\phi$ is a state.

3. The Variational Principle

We shall prove the variational principle for the following class of C∗-dynamical systems.

Definition 3.1. A unital C∗-dynamical system $(A, \alpha)$ is called asymptotically abelian with locality if there is a dense $\alpha$-invariant ∗-subalgebra $\mathcal{A}$ of $A$ such that for each pair $a, b \in \mathcal{A}$ the C∗-algebra generated by $a$ and $b$ is finite dimensional, and for some $p = p(a, b) \in \mathbb{N}$ we have $[\alpha^j(a), b] = 0$ whenever $|j| \ge p$.

We call elements of $\mathcal{A}$ local operators and finite dimensional C∗-subalgebras of $\mathcal{A}$ local algebras. Note that since we may add the identity operator to $\mathcal{A}$, we may assume that $1 \in \mathcal{A}$. Since each finite dimensional C∗-algebra is singly generated, an easy induction argument shows that the C∗-algebra generated by a finite set of local operators is finite dimensional. In particular, if $A$ is separable then $A$ is an AF-algebra. Note also that another easy induction argument shows that for each local algebra $N$ there is $p \in \mathbb{N}$ such that $[\alpha^j(a), b] = 0$ for all $a, b \in N$ whenever $|j| \ge p$.

Theorem 3.2. Let $(A, \alpha)$ be a unital separable C∗-dynamical system which is asymptotically abelian with locality. Let $H \in A_{sa}$. Then
\[
P_\alpha(H) = \sup_{\phi} \big( h_\phi(\alpha) - \phi(H) \big),
\]
where the sup is taken over all $\alpha$-invariant states of $A$. In particular, the topological entropy satisfies
\[
ht(\alpha) = \sup_{\phi} h_\phi(\alpha).
\]
Consider first the case when there exists a finite dimensional C∗-subalgebra $N$ of $A$ such that $H \in N$, $\alpha^j(N)$ commutes with $N$ for $j \ne 0$, and $\vee_{j \in \mathbb{Z}} \alpha^j(N) = A$.

Lemma 3.3. Under the above assumptions there exists an $\alpha$-invariant state $\phi$ such that
\[
P_\alpha(H) = h_\phi(\alpha) - \phi(H) = \lim_{n \to \infty} \frac{1}{n} \log \mathrm{Tr}_{\vee_{j=0}^{n-1} \alpha^j(N)}\Big( e^{-\sum_{j=0}^{n-1} \alpha^j(H)} \Big).
\]
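To see what the limit in Lemma 3.3 looks like in the simplest situation, one may take (as an assumption made only for this illustration) $N$ to be the diagonal algebra $\mathbb{C}^d$, $A$ the infinite tensor product of copies of $N$, $\alpha$ the shift, and $H \in N$ with eigenvalues $h_1, \dots, h_d$. Then the minimal projections of $\vee_{j=0}^{n-1} \alpha^j(N)$ correspond to words of length $n$, and the trace factorizes, so every term of the sequence already equals $\log \sum_i e^{-h_i}$. The sketch below verifies this by brute-force enumeration; the eigenvalues are hypothetical.

```python
import itertools
import math

def log_trace_n(h, n):
    """log Tr over the n-fold tensor power of the diagonal algebra of
    exp(-sum_{j<n} alpha^j(H)), computed by summing over all words of length n."""
    total = 0.0
    for word in itertools.product(range(len(h)), repeat=n):
        total += math.exp(-sum(h[i] for i in word))
    return math.log(total)

h = [0.0, 1.0, -0.5]   # hypothetical eigenvalues of H in N

# Single-site value log Tr_N(e^{-H}).
single_site = math.log(sum(math.exp(-x) for x in h))

# (1/n) log Tr(...) for n = 1..6; each entry should agree with single_site.
per_site = [log_trace_n(h, n) / n for n in range(1, 7)]
```

In this abelian example the value is the classical pressure of $-H$ for the full shift on $d$ symbols, consistent with Remark 2.3.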
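Proof. First note that if $A_1$ and $A_2$ are commuting finite dimensional C∗-algebras, and $a_i \in A_i$, $a_i \ge 0$, $i = 1, 2$, then
\[
\mathrm{Tr}_{A_1 \vee A_2}(a_1 a_2) \le \mathrm{Tr}_{A_1}(a_1)\, \mathrm{Tr}_{A_2}(a_2),
\]
since if $p_i$ is a minimal projection in $A_i$, $i = 1, 2$, then $p_1 p_2$ is either zero or minimal in $A_1 \vee A_2$. Hence the limit in the formulation of the lemma really exists. We denote it by $\tilde{P}_\alpha(H)$. It is easy to see that $\tilde{P}_\alpha(H) \ge P_\alpha(H)$.

For each $n \in \mathbb{Z}$ let $N_n$ be a copy of $N$. Consider the C∗-algebra $M = \otimes_{n \in \mathbb{Z}} N_n$. Let $\beta$ be the shift to the right on $M$, and $\pi : M \to A$ the homomorphism which intertwines $\beta$ with $\alpha$, and identifies $N_0$ with $N$. Set $I = \operatorname{Ker} \pi$. For each $n \in \mathbb{N}$ let $M_n = N_0 \otimes \dots \otimes N_{n-1}$, $I_n = I \cap M_n$, $\pi_n = \pi|_{M_n}$. Identifying $M$ with $M_n^{\otimes \mathbb{Z}}$ consider the $\beta^n$-invariant state $\psi_n = \otimes (f_n \circ \pi_n)$ on $M$, where $f_n$ is the state on $\vee_{j=0}^{n-1} \alpha^j(N)$ with density operator
\[
\Big( \mathrm{Tr}_{\vee_{j=0}^{n-1} \alpha^j(N)}\, e^{-\sum_{j=0}^{n-1} \alpha^j(H)} \Big)^{-1} e^{-\sum_{j=0}^{n-1} \alpha^j(H)}.
\]
Set $\phi_n = \frac{1}{n} \sum_{j=0}^{n-1} \psi_n \circ \beta^j$. Then $\phi_n$ is $\beta$-invariant. Using concavity of entropy we obtain
\[
\begin{aligned}
h_{\phi_n}(\beta) &= \frac{1}{n} h_{\phi_n}(\beta^n) \ge \frac{1}{n^2} \sum_{j=0}^{n-1} h_{\psi_n \circ \beta^j}(\beta^n) = \frac{1}{n} h_{\psi_n}(\beta^n) = \frac{1}{n} S(f_n \circ \pi_n) = \frac{1}{n} S(f_n) \\
&= \frac{1}{n} \log \mathrm{Tr}_{\vee_{j=0}^{n-1} \alpha^j(N)}\Big( e^{-\sum_{j=0}^{n-1} \alpha^j(H)} \Big) + \frac{1}{n} f_n\Big( \sum_{j=0}^{n-1} \alpha^j(H) \Big) \\
&\ge \tilde{P}_\alpha(H) + \frac{1}{n} f_n\Big( \sum_{j=0}^{n-1} \alpha^j(H) \Big) \\
&= \tilde{P}_\alpha(H) + \frac{1}{n} \psi_n\Big( \sum_{j=0}^{n-1} \beta^j(H) \Big) = \tilde{P}_\alpha(H) + \phi_n(H).
\end{aligned}
\]
Let $\tilde{\phi}$ be any weak∗ limit point of the sequence $\{\phi_n\}_n$. Then $\tilde{\phi}$ is $\beta$-invariant. Let $B$ be a masa in $N_0$ containing $H$. Then $B$ is in the centralizer of the state $\phi_n$, hence $h_{\phi_n}(\beta) = h_{\phi_n}(B; \beta)$.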
Since the mapping $\psi \mapsto h_\psi(B; \beta)$ is upper semicontinuous, we conclude that
\[
h_{\tilde{\phi}}(\beta) \ge \tilde{P}_\alpha(H) + \tilde{\phi}(H).
\]
Now note that $\tilde{\phi}$ is zero on $I$. Indeed, if $x \in I_n$ then $\beta^j(x) \in I_m$ for $j = 0, \dots, m - n$ and $m \ge n$, whence
\[
|\phi_m(x)| \le \frac{1}{m} \sum_{j=m-n+1}^{m-1} |(\psi_m \circ \beta^j)(x)| \le \frac{n-1}{m} \|x\|,
\]
so $\tilde{\phi}(x) = 0$. Thus $\tilde{\phi}$ defines a state $\phi$ on $A$. We have
\[
h_\phi(\alpha) = h_{\tilde{\phi}}(\beta) \ge \tilde{P}_\alpha(H) + \phi(H),
\]
where the first equality follows from [CNT, Theorem VII.2]. Since by Proposition 2.2, $h_\phi(\alpha) - \phi(H) \le P_\alpha(H) \le \tilde{P}_\alpha(H)$, the proof of the lemma is complete.

We shall reduce the general case to the case considered above by replacing $\alpha$ by its powers. For this suppose that $N$ is a local subalgebra of $A$, and $H \in N$. Choose $p$ such that $\alpha^j(N)$ commutes with $N$ whenever $|j| \ge p$. For $k \ge p$ set $M_k = \vee_{j=0}^{k-p} \alpha^j(N)$, $H_k = \sum_{j=0}^{k-p} \alpha^j(H)$. Then $H_k \in M_k$, and $\alpha^{jk}(M_k)$ commutes with $M_k$ for $j \ne 0$.

Lemma 3.4. For any finite subset $\omega$ of $N$ we have
\[
P_\alpha(H, \omega) \le \liminf_{k \to \infty} \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)}}(H_k).
\]

Proof. The idea of the proof is to reduce to the situation of Lemma 3.3 by showing that the contribution of the indices in the intervals $[jk - p + 1, jk - 1]$, $j \in \mathbb{N}$, becomes negligible for large $k$. Fix $\delta > 0$. Choose $m_0 \in \mathbb{N}$ such that $\frac{2(p-1)\|a\|}{m_0} < \delta$ for $a \in \omega$. Take any $k \ge m_0 + p$. Let $n \in \mathbb{N}$. Then $(m-1)k \le n < mk$ for some $m \in \mathbb{N}$. Set
\[
B_0 = \vee_{j=0}^{m} \alpha^{jk}(M_k) \quad \text{and} \quad B = \underbrace{B_0 \oplus \dots \oplus B_0}_{m_0}.
\]
Choose a conditional expectation $E : A \to B_0$, and define unital completely positive mappings $\psi : B \to A$ and $\rho : A \to B$ as follows:
\[
\psi(b_1, \dots, b_{m_0}) = \frac{1}{m_0} \sum_{i=1}^{m_0} \alpha^{-i+1}(b_i),
\qquad
\rho(a) = \big( E(a), (E \circ \alpha)(a), \dots, (E \circ \alpha^{m_0 - 1})(a) \big).
\]
For any $a \in A$ we have
\[
\|(\psi \circ \rho)(a) - a\| \le \frac{2\|a\|}{m_0}\, \#\{0 \le i \le m_0 - 1 \mid \alpha^i(a) \notin B_0\},
\]
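where $\#S$ means the cardinality of a set $S$. Let $a = \alpha^l(b)$ for some $b \in \omega$ and $l$, $0 \le l \le n - 1$. Then $l = jk + r$ for some $j$ and $r$, $0 \le j \le m - 1$, $0 \le r < k$. Since $m_0 \le k - p$, the interval $[l, l + m_0 - 1]$ is contained in $[jk, (j+1)k + k - p]$. But for $i \in [jk, (j+1)k + k - p] \setminus [jk + k - p + 1, (j+1)k - 1]$ we have $\alpha^i(N) \subset B_0$, so
\[
\#\{0 \le i \le m_0 - 1 \mid \alpha^i(a) \notin B_0\} \le p - 1,
\quad \text{and} \quad
\|(\psi \circ \rho)(a) - a\| \le \frac{2(p-1)\|b\|}{m_0} < \delta.
\]
Hence
\[
P\Big( \sum_{j=0}^{n-1} \alpha^j(H),\ \bigcup_{j=0}^{n-1} \alpha^j(\omega);\ \delta \Big) \le \log \mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big).
\]
Now note that for $0 \le i \le m_0 - 1$ the sets $X_i = [i, i + n - 1]$ and $X = \cup_{j=0}^{m} [jk, jk + k - p]$ are contained in $Y = [0, mk + k - p]$, so
\[
\begin{aligned}
\#(X_i \,\Delta\, X) &\le \#(Y \setminus X_i) + \#(Y \setminus X) = (mk + k - p + 1 - n) + m(p - 1) \\
&\le mk + k - p + 1 - (m-1)k + m(p-1) \le mp + 2k.
\end{aligned}
\]
When $j \in X_i \cap X$, $\alpha^j(H) \in B_0$. Hence
\[
\Big\| (E \circ \alpha^i)\Big( \sum_{j=0}^{n-1} \alpha^j(H) \Big) - \sum_{j=0}^{m} \alpha^{jk}(H_k) \Big\|
= \Big\| E\Big( \sum_{j \in X_i} \alpha^j(H) \Big) - \sum_{j \in X} \alpha^j(H) \Big\|
\le (mp + 2k)\|H\|.
\]
By the Peierls–Bogoliubov inequality we obtain
\[
\mathrm{Tr}_{B_0}\Big( e^{-(E \circ \alpha^i)\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) \le e^{(mp + 2k)\|H\|}\, \mathrm{Tr}_{B_0}\Big( e^{-\sum_{j=0}^{m} \alpha^{jk}(H_k)} \Big),
\]
so
\[
\mathrm{Tr}_B\Big( e^{-\rho\left( \sum_{j=0}^{n-1} \alpha^j(H) \right)} \Big) \le m_0\, e^{(mp + 2k)\|H\|}\, \mathrm{Tr}_{B_0}\Big( e^{-\sum_{j=0}^{m} \alpha^{jk}(H_k)} \Big).
\]
Taking the log, dividing by $n$, and letting $n \to \infty$, we obtain
\[
\begin{aligned}
P_\alpha(H, \omega; \delta) &\le \frac{p\|H\|}{k} + \frac{1}{k} \lim_{m \to \infty} \frac{1}{m} \log \mathrm{Tr}_{\vee_{j=0}^{m-1} \alpha^{jk}(M_k)}\Big( e^{-\sum_{j=0}^{m-1} \alpha^{jk}(H_k)} \Big) \\
&= \frac{p\|H\|}{k} + \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)}}(H_k),
\end{aligned}
\]
where the last equality follows from Lemma 3.3.

We shall need also the following

Lemma 3.5. Let $(A, \alpha)$ be a C∗-dynamical system with $A$ nuclear, $B$ an $\alpha$-invariant C∗-subalgebra of $A$, $\phi$ an $\alpha$-invariant state on $B$. Then for any $\varepsilon > 0$ there exists an $\alpha$-invariant state $\psi$ on $A$ such that $\psi|_B = \phi$ and $h_\psi(\alpha) > h_\phi(\alpha|_B) - \varepsilon$.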
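Proof. Since the Sauvageot–Thouvenot entropy is not less than the CNT-entropy for general C∗-systems, there exist a commutative C∗-dynamical system $(C, \beta, \mu)$, an $(\alpha \otimes \beta)$-invariant state $\lambda$ on $B \otimes C$, and a finite dimensional subalgebra $P$ of $C$ such that $\lambda|_B = \phi$, $\lambda|_C = \mu$ and
\[
h_\phi(\alpha|_B) < H_\mu(P, P^-) - H_\lambda(P|B) + \varepsilon,
\]
see [ST] for notations. Extend $\lambda$ to an $(\alpha \otimes \beta)$-invariant state $\Lambda$ on $A \otimes C$, and set $\psi = \Lambda|_A$. Since the conditional entropy $H_\lambda(P|B)$ is decreasing in the second variable, and ST-entropy coincides with CNT-entropy for nuclear algebras, we have
\[
h_\psi(\alpha) \ge H_\mu(P, P^-) - H_\Lambda(P|A) \ge H_\mu(P, P^-) - H_\lambda(P|B) > h_\phi(\alpha|_B) - \varepsilon.
\]

Proof of Theorem 3.2. The inequality "≥" has been proved in Proposition 2.2. Since the pressure is continuous by Proposition 2.4(iv), to prove the converse inequality it suffices to consider local $H$. Then by Lemma 3.4 we have only to show that if $H$ is contained in a local algebra $N$ then
\[
\sup_{\phi} \big( h_\phi(\alpha) - \phi(H) \big) \ge \liminf_{k \to \infty} \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)}}(H_k).
\]
By Lemma 3.3, for each $k \in \mathbb{N}$ there exists an $\alpha^k$-invariant state $\psi_k$ on $\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)$ such that
\[
h_{\psi_k}\big( \alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)} \big) - \psi_k(H_k) = P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)}}(H_k).
\]
By Lemma 3.5 we may extend $\psi_k$ to an $\alpha^k$-invariant state $\tilde{\phi}_k$ on $A$ such that
\[
h_{\tilde{\phi}_k}(\alpha^k) \ge h_{\psi_k}\big( \alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)} \big) - 1.
\]
Set $\phi_k = \frac{1}{k} \sum_{j=0}^{k-1} \tilde{\phi}_k \circ \alpha^j$. Then as in the proof of Lemma 3.3
\[
h_{\phi_k}(\alpha) \ge \frac{1}{k} h_{\tilde{\phi}_k}(\alpha^k) \ge \frac{1}{k} h_{\psi_k}\big( \alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)} \big) - \frac{1}{k}.
\]
Since
\[
\phi_k(H) = \frac{1}{k} \sum_{j=0}^{k-1} \tilde{\phi}_k(\alpha^j(H)) \le \frac{1}{k} \tilde{\phi}_k(H_k) + \frac{p-1}{k} \|H\| = \frac{1}{k} \psi_k(H_k) + \frac{p-1}{k} \|H\|,
\]
we get
\[
\begin{aligned}
h_{\phi_k}(\alpha) - \phi_k(H) &\ge \frac{1}{k} h_{\psi_k}\big( \alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)} \big) - \frac{1}{k} \psi_k(H_k) - \frac{1 + (p-1)\|H\|}{k} \\
&= \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(M_k)}}(H_k) - \frac{1 + (p-1)\|H\|}{k},
\end{aligned}
\]
and the proof is complete.

Corollary 3.6. With our assumptions the pressure is a convex function of $H$.

Proof. Use the affinity of the function $H \mapsto h_\phi(\alpha) - \phi(H)$.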
Corollary 3.7. If $(A_1, \alpha_1)$ and $(A_2, \alpha_2)$ are asymptotically abelian systems with locality then $ht(\alpha_1 \otimes \alpha_2) = ht(\alpha_1) + ht(\alpha_2)$.

Proof. If $\phi_i$ is an $\alpha_i$-invariant state, $i = 1, 2$, then by [SV, Lemma 3.4] and [V, Propositions 4.6 and 4.9],
\[
h_{\phi_1}(\alpha_1) + h_{\phi_2}(\alpha_2) \le h_{\phi_1 \otimes \phi_2}(\alpha_1 \otimes \alpha_2) \le ht(\alpha_1 \otimes \alpha_2) \le ht(\alpha_1) + ht(\alpha_2).
\]
Taking the sup over $\phi_i$ we get the conclusion.

4. KMS-States

By Corollary 3.6 and Proposition 2.7 it follows that if $(A, \alpha)$ is asymptotically abelian with locality and $ht(\alpha) < \infty$, then for every equilibrium state $\phi$ at $H$, $-\phi$ is a tangent functional for the pressure $P_\alpha$ at $H$. Furthermore, if $\omega$ is a tangent functional for $P_\alpha$ at $H$ then $-\omega$ is an $\alpha$-invariant state. If $H$ is local and $I \subset \mathbb{Z}$ is a subset then the derivation
\[
\delta_{H,I}(x) = \sum_{j \in I} [\alpha^j(H), x], \quad x \in \mathcal{A},
\]
defines a strongly continuous one-parameter automorphism group $\sigma_t^{H,I} = \exp(it\delta_{H,I})$ of $A$ (see [BR, Theorem 6.2.6 and Example 6.2.8]). We shall mainly be concerned with the case $I = \mathbb{Z}$, and will write $\delta_H = \delta_{H,\mathbb{Z}}$, $\sigma_t^H = \sigma_t^{H,\mathbb{Z}}$. Recall that a state $\phi$ is a $(\sigma_t^H, \beta)$-KMS state if $\phi(ab) = \phi(b\, \sigma_{i\beta}^H(a))$ for $\sigma_t^H$-analytic elements $a, b \in A$.

We say that an $\alpha$-invariant state $\phi$ is an equilibrium state at $H$ at inverse temperature $\beta$ if $P_\alpha(\beta H) = h_\phi(\alpha) - \beta\phi(H)$. By Theorem 3.2, for systems which are asymptotically abelian with locality, this is equivalent to
\[
h_\phi(\alpha) - \beta\phi(H) = \sup_{\psi} \big( h_\psi(\alpha) - \beta\psi(H) \big).
\]
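The KMS condition just recalled can be made concrete in finite dimensions, where the only dynamics is $\sigma_t(a) = e^{itH} a e^{-itH}$ and the Gibbs state $\phi(x) = \mathrm{Tr}(e^{-\beta H} x)/\mathrm{Tr}(e^{-\beta H})$ satisfies $\phi(ab) = \phi(b\, \sigma_{i\beta}(a))$ with $\sigma_{i\beta}(a) = e^{-\beta H} a e^{\beta H}$. The following sketch checks this identity on $2 \times 2$ matrices; it is an illustration under the assumption that $H$ is diagonal in the chosen basis, and the matrices $a$, $b$ and the numbers $h$, $\beta$ are hypothetical test data.

```python
import math

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def gibbs_state(h, beta):
    """Return phi(x) = Tr(e^{-beta H} x) / Tr(e^{-beta H}) for H = diag(h)."""
    z = sum(math.exp(-beta * e) for e in h)
    weights = [math.exp(-beta * e) / z for e in h]
    def phi(x):
        return sum(w * x[i][i] for i, w in enumerate(weights))
    return phi

def sigma_i_beta(a, h, beta):
    """sigma_{i beta}(a) = e^{-beta H} a e^{beta H} for H = diag(h)."""
    n = len(h)
    return [[math.exp(-beta * (h[j] - h[k])) * a[j][k] for k in range(n)]
            for j in range(n)]

h = [0.0, 1.3]                        # hypothetical eigenvalues of H
beta = 0.8                            # hypothetical inverse temperature
a = [[1 + 2j, 0.5], [-1j, 3.0]]       # arbitrary non-commuting test matrices
b = [[0.2, 1 - 1j], [4.0, -2j]]

phi = gibbs_state(h, beta)
lhs = phi(matmul(a, b))                            # phi(ab)
rhs = phi(matmul(b, sigma_i_beta(a, h, beta)))     # phi(b sigma_{i beta}(a))
```

For $\beta = 0$ the same computation reduces to the trace property $\phi(ab) = \phi(ba)$, which matches the statement that states of maximal entropy are traces.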
The main result in this section is

Theorem 4.1. Suppose a unital separable C∗-dynamical system $(A, \alpha)$ is asymptotically abelian with locality, and $ht(\alpha) < \infty$. If $H$ is a local self-adjoint operator in $A$ and $\phi$ is an equilibrium state at $H$ at inverse temperature $\beta$, then $\phi$ is a $(\sigma_t^H, \beta)$-KMS state. In particular, if $ht(\alpha) = h_\phi(\alpha)$ then $\phi$ is a trace.

In order to prove the theorem we may replace $H$ by $\beta H$ and show that $\phi$ is a $(\sigma_t^H, 1)$-KMS state. We shall prove the following more general result.

Theorem 4.2. If $-\phi$ is a tangent functional for $P_\alpha$ at $H$ then $\phi$ is a $(\sigma_t^H, 1)$-KMS state.

We shall need an explicit formula for the pressure, which is a consequence of our proof of the variational principle.

Lemma 4.3. Let $N$ be a local algebra. Then there exist a sequence $\{A_n\}_n$ of local algebras containing $N$ and three sequences $\{p_n\}_n$, $\{m_n\}_n$, $\{k_n\}_n$ of positive integers such that
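(i) $\alpha^p(A_n)$ commutes with $A_n$ whenever $|p| \ge p_n$;
(ii) $\dfrac{p_n}{k_n} \to 0$ as $n \to \infty$;
(iii) $P_\alpha(H) = \lim_{n \to \infty} \dfrac{1}{k_n m_n} \log \mathrm{Tr}_{\vee_{j \in I_n} \alpha^j(A_n)}\Big( e^{-\sum_{j \in I_n} \alpha^j(H)} \Big)$ for all $H \in N_{sa}$, where $I_n = \cup_{j=0}^{m_n - 1} [jk_n, jk_n + k_n - p_n]$.

Proof. Let $\{A_n\}_n$ be an increasing sequence of local algebras containing $N$ such that $\cup_n A_n$ is dense in $A$, $\omega_n$ a finite subset of $A_n$ such that $\operatorname{span}(\omega_n) = A_n$. Let $\{p_n\}_n$ be a sequence satisfying condition (i). By Lemma 3.4
\[
P_\alpha(H, \omega_n) \le \liminf_{k \to \infty} \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(A_{n,k})}}\Big( \sum_{j=0}^{k - p_n} \alpha^j(H) \Big) \quad \forall H \in N_{sa},
\]
where $A_{n,k} = \vee_{j=0}^{k - p_n} \alpha^j(A_n)$. On the other hand, by the proof of Theorem 3.2,
\[
P_\alpha(H) \ge \limsup_{k \to \infty} \frac{1}{k} P_{\alpha^k|_{\vee_{j \in \mathbb{Z}} \alpha^{jk}(A_{n,k})}}\Big( \sum_{j=0}^{k - p_n} \alpha^j(H) \Big) \quad \forall H \in N_{sa}.
\]
Choose a countable dense subset $X$ of $N_{sa}$. Since $P_\alpha(H, \omega_n) \nearrow P_\alpha(H)$ for any $H \in X$, we can find a sequence $\{k_n\}_n$ such that condition (ii) is satisfied and
\[
P_\alpha(H) = \lim_{n \to \infty} \frac{1}{k_n} P_{\alpha^{k_n}|_{\vee_{j \in \mathbb{Z}} \alpha^{jk_n}(A_{n,k_n})}}\Big( \sum_{j=0}^{k_n - p_n} \alpha^j(H) \Big) \quad \forall H \in X.
\]
Since by Lemma 3.3
\[
P_{\alpha^{k_n}|_{\vee_{j \in \mathbb{Z}} \alpha^{jk_n}(A_{n,k_n})}}\Big( \sum_{j=0}^{k_n - p_n} \alpha^j(H) \Big) = \lim_{m \to \infty} \frac{1}{m} \log \mathrm{Tr}_{\vee_{j \in I_{n,m}} \alpha^j(A_n)}\Big( e^{-\sum_{j \in I_{n,m}} \alpha^j(H)} \Big),
\]
where $I_{n,m} = \cup_{j=0}^{m-1} [jk_n, jk_n + k_n - p_n]$, we can choose a sequence $\{m_n\}_n$ such that condition (iii) is satisfied for all $H \in X$. But then it is satisfied for all $H \in N_{sa}$ by Proposition 2.4(iv) and the Peierls–Bogoliubov inequality.

Every local operator is analytic for the dynamics, and $\sigma_t^H$ depends continuously on $H$ in a fixed local algebra. More precisely, we have

Lemma 4.4. (i) The series
\[
\sigma_\beta^{H,I}(a) = \sum_{n=0}^{\infty} \frac{(i\beta)^n}{n!} \delta_{H,I}^n(a)
\]
converges absolutely in norm for any $\beta \in \mathbb{C}$ and any local operator $a$.
(ii) Given a local algebra $N$, $R > 0$, $C > 0$ and $\varepsilon > 0$ there exist $q \in \mathbb{N}$ and $\delta > 0$ such that
\[
\|\sigma_\beta^{H_1,I_1}(a) - \sigma_\beta^{H_2,I_2}(a)\| \le \varepsilon \|a\|
\]
$\forall a \in N$, $\forall H_1, H_2 \in N_{sa}$ with $\|H_1\|, \|H_2\| \le C$ and $\|H_1 - H_2\| < \delta$, $\forall \beta \in \mathbb{C}$ with $|\beta| \le R$, $\forall I_1, I_2 \subset \mathbb{Z}$ with $[-q, q] \subset I_1 \cap I_2$.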
Proof. We shall use the arguments of Araki [A, Theorem 4.2]. Let H and a lie in a local algebra N . Choose p ∈ N such that α j (N ) commutes with N for |j | ≥ p. Then m (a) = [α jm (H ), [. . . , [α j1 (H ), a] . . . ]], δH,I j1 ,... ,jm
where the sum is over all j1 , . . . , jm ∈ I such that
[jl − p, jl + p] jk ∈ [−p, p]
(4.1)
l
for each k = 1, . . . , m. But as was already noted in [GN] condition (4.1) is equivalent to
[jl , jl + p] = ∅. [jk , jk + p] ∩ [0, p] l
Thus the lemma follows from the proof of [A, Theorem 4.2] (with n = p and r = p).

The following lemma contains the main technical result needed to prove Theorem 4.2.

Lemma 4.5. Let N be a local algebra, H ∈ Nsa, and let −φ ∈ N∗ be a tangent functional for (Pα)|Nsa at H. Let E : A → N be a conditional expectation. Then for any function f ∈ D (the space of C∞-functions with compact support) and any a, b ∈ N we have
$$\Big|\int_{\mathbb{R}} \hat f(t)\,\varphi(aE(\sigma_t^H(b)))\,dt - \int_{\mathbb{R}} \hat f(t+i)\,\varphi(E(\sigma_t^H(b))a)\,dt\Big| \le \|a\| \int_{\mathbb{R}} \big(|\hat f(t)| + |\hat f(t+i)|\big)\,\|\sigma_t^H(b) - E(\sigma_t^H(b))\|\,dt.$$
Proof. First consider the case when (Pα)|Nsa is differentiable at H, in other words −φ is the unique tangent functional. With the notations of Lemma 4.3 consider the state $f_n$ on $\vee_{j\in I_n}\alpha^j(A_n)$ with density operator
$$\Big(\mathrm{Tr}_{\vee_{j\in I_n}\alpha^j(A_n)}\, e^{-\sum_{j\in I_n}\alpha^j(H)}\Big)^{-1} e^{-\sum_{j\in I_n}\alpha^j(H)}.$$
Then define a positive linear functional $\varphi_n$ on N by
$$\varphi_n(x) = \frac{1}{k_n m_n}\sum_{j\in I_n} f_n(\alpha^j(x)).$$
Note that $\|\varphi_n\| = \varphi_n(1) \le 1$. Since $-f_n$ is a tangent functional for the convex function $x \mapsto \log \mathrm{Tr}_{\vee_{j\in I_n}\alpha^j(A_n)}(e^{-x})$ on $(\vee_{j\in I_n}\alpha^j(A_n))_{sa}$ at the point $\sum_{j\in I_n}\alpha^j(H)$, $-\varphi_n$ is a tangent functional for the function $N_{sa} \ni x \mapsto \frac{1}{k_n m_n}\log \mathrm{Tr}_{\vee_{j\in I_n}\alpha^j(A_n)}\, e^{-\sum_{j\in I_n}\alpha^j(x)}$ at H. It follows that any limit point of the sequence $\{-\varphi_n\}_n$ is a tangent functional for (Pα)|Nsa at H. Since the latter is unique by assumption, $\varphi_n \to \varphi$ as n → ∞.
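Since $f_n$ is a $(\sigma_t^{H,I_n}, 1)$-KMS state, by [BR, Proposition 5.3.12] we have
$$\int_{\mathbb{R}} \hat f(t)\, f_n\big(\alpha^j(a)\,\sigma_t^{H,I_n}(\alpha^j(b))\big)\,dt = \int_{\mathbb{R}} \hat f(t+i)\, f_n\big(\sigma_t^{H,I_n}(\alpha^j(b))\,\alpha^j(a)\big)\,dt \quad \forall j \in I_n. \qquad (4.2)$$
Note that $\sigma_t^{H,I_n}(\alpha^j(b)) = \alpha^j(\sigma_t^{H,I_n-j}(b))$. Fix q ∈ N, and set $I_{n,q} = \cup_{j=0}^{m_n-1}[jk_n + q, jk_n + k_n - p_n - q]$. By Lemma 4.4, if q is large enough then $\sigma_t^{H,I_n-j}(b)$ is arbitrarily close to $\sigma_t^H(b)$ for any $j \in I_{n,q}$ and any t in a fixed compact subset of R. But then $\sigma_t^{H,I_n}(\alpha^j(b)) - \alpha^j(E(\sigma_t^H(b)))$ is arbitrarily close to $\alpha^j(\sigma_t^H(b) - E(\sigma_t^H(b)))$. In other words,
$$\Big|\int_{\mathbb{R}} \hat f(t)\,\frac{1}{k_n m_n}\sum_{j\in I_{n,q}} f_n\big(\alpha^j(a)\,\sigma_t^{H,I_n}(\alpha^j(b)) - \alpha^j(a)\,\alpha^j(E(\sigma_t^H(b)))\big)\,dt\Big| \le \|a\|\int_{\mathbb{R}} |\hat f(t)|\,\|\sigma_t^H(b) - E(\sigma_t^H(b))\|\,dt + \varepsilon(q) \quad \forall n \in \mathbb{N},$$
where ε(q) → 0 as q → ∞. Since $\#I_{n,q}/\#I_n \to 1$ as n → ∞, letting n → ∞ we may replace averaging over the set $I_{n,q}$ by averaging over $I_n$, and then obtain
$$\limsup_{n\to\infty}\Big|\int_{\mathbb{R}} \hat f(t)\,\frac{1}{k_n m_n}\sum_{j\in I_n} f_n\big(\alpha^j(a)\,\sigma_t^{H,I_n}(\alpha^j(b))\big)\,dt - \int_{\mathbb{R}} \hat f(t)\,\varphi(aE(\sigma_t^H(b)))\,dt\Big| \le \|a\|\int_{\mathbb{R}} |\hat f(t)|\,\|\sigma_t^H(b) - E(\sigma_t^H(b))\|\,dt.$$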
Since an analogous estimate holds for $\int_{\mathbb{R}} \hat f(t+i)\,\varphi(E(\sigma_t^H(b))a)\,dt$, we obtain the conclusion of the lemma by virtue of (4.2).

If (Pα)|Nsa is not differentiable at H then by [LR, Theorem 1], φ lies in the closed convex hull of those $\tilde\varphi$ for which there exists a sequence $\{H_n\}_n \subset N_{sa}$ converging to H such that (Pα)|Nsa has a unique tangent functional $-\varphi_n$ at $H_n$ and $\varphi_n \to \tilde\varphi$. Since for $\varphi_n$ the lemma is already proved (for $H_n$ instead of H), using Lemma 4.4 we conclude that the conclusion of the lemma is true for $\tilde\varphi$. But then it is true for any functional in the closed convex hull of the $\tilde\varphi$'s.

Proof of Theorem 4.2. If −φ is a tangent functional for Pα at H then −φ|N is a tangent functional for (Pα)|Nsa at H for any local algebra N containing H. Thus by Lemma 4.5 the equality
$$\int_{\mathbb{R}} \hat f(t)\,\varphi(a\sigma_t^H(b))\,dt = \int_{\mathbb{R}} \hat f(t+i)\,\varphi(\sigma_t^H(b)a)\,dt$$
holds for all f ∈ D and all local a, b, hence for all a, b ∈ A. By [BR, Proposition 5.3.12] this is equivalent to the KMS-condition.

Remark 4.6. Under the assumptions of Theorem 4.1, if
$$\varphi \in \bigcap_{\varepsilon>0} \overline{\{\psi \mid P_\alpha(H) < h_\psi(\alpha) - \psi(H) + \varepsilon\}}$$
(weak∗ closure), then −φ is a tangent functional for the pressure at H, hence φ is a $(\sigma_t^H, 1)$-KMS state. In other words, any weak∗ limit point of a sequence on which the sup in the variational principle is attained is a $(\sigma_t^H, 1)$-KMS state. If ht(α) = +∞, this is of course false in general. Moreover, for any α-invariant state φ there exists a sequence $\{\varphi_n\}_n$ converging in norm to φ such that $h_{\varphi_n}(\alpha) = +\infty$ for all n. Indeed, first note that by taking infinite convex combinations of states of large entropy we can find a state ψ of infinite entropy. Then $\varphi_n = \frac{1}{n}\psi + \frac{n-1}{n}\varphi \to \varphi$ as n → ∞ and $h_{\varphi_n}(\alpha) = +\infty$.
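As a finite-dimensional sanity check of the statements in this section (an illustration only, not part of the argument), one can take a full matrix algebra with a diagonal self-adjoint H and verify numerically that the Gibbs state x ↦ Tr(e^{−H}x)/Tr(e^{−H}) satisfies the KMS identity φ(aσ_i(b)) = φ(ba) at inverse temperature 1, and that it attains the supremum in the finite-dimensional variational principle log Tr e^{−H} = sup(S(ψ) − ψ(H)). The sketch below assumes σ_z(b) = e^{izH} b e^{−izH} and restricts the entropy check to densities that are diagonal in the eigenbasis of H; all function names are hypothetical.

```python
import cmath
import math


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gibbs_state(energies):
    """Return the functional x -> Tr(e^{-H} x) / Tr(e^{-H}) for diagonal H."""
    weights = [math.exp(-e) for e in energies]
    z = sum(weights)

    def phi(x):
        return sum(w * x[i][i] for i, w in enumerate(weights)) / z

    return phi


def sigma(energies, z, b):
    """sigma_z(b) = e^{izH} b e^{-izH} for diagonal H and complex z."""
    n = len(energies)
    return [[cmath.exp(1j * z * (energies[i] - energies[j])) * b[i][j]
             for j in range(n)] for i in range(n)]


def kms_defect(energies, a, b):
    """|phi(a sigma_i(b)) - phi(b a)|; zero for a (sigma, 1)-KMS state."""
    phi = gibbs_state(energies)
    lhs = phi(matmul(a, sigma(energies, 1j, b)))
    rhs = phi(matmul(b, a))
    return abs(lhs - rhs)


def free_energy_gap(energies, probs):
    """log Tr e^{-H} - (S(p) - p(H)) for a density diagonal in the H-basis.

    The gap equals a relative entropy, so it is nonnegative and vanishes
    exactly at the Gibbs weights.
    """
    pressure = math.log(sum(math.exp(-e) for e in energies))
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    return pressure - (entropy - mean_energy)
```

For example, calling `kms_defect` with arbitrary complex matrices a and b returns a value at the level of rounding error, while `free_energy_gap` is strictly positive for any diagonal density other than the Gibbs one.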
5. Examples

First we consider a class of systems arising naturally from systems of topological dynamics. Let σ be an expansive homeomorphism of a zero-dimensional compact space X, G the group of uniformly finite-dimensional homeomorphisms of X in the sense of Krieger [K]. By definition, a homeomorphism T belongs to G if
$$\lim_{|n|\to\infty}\ \sup_{x\in X}\ d(\sigma^n T x, \sigma^n x) = 0,$$
where d is a metric defining the topology of X. In other words, G consists of those homeomorphisms T of X, for which there exists a bound on the number of coordinates of any point that are changed under the action of T, when (X, σ) is represented as a subshift by means of some generator. Since the group G is locally finite, the orbit equivalence relation R ⊂ X × X has a structure of AF-groupoid [Re]. Consider the groupoid C∗-algebra A = C∗(R) and the automorphism α of A defined by α(f) = f ◦ (σ × σ). The algebra C(X) is a subalgebra of A, and there exists a unique conditional expectation E : A → C(X). Let C0(X) be the ∗-subalgebra of C(X) spanned by characteristic functions of clopen sets, and C0(X, R) the subalgebra of C0(X) consisting of real functions. Every element g ∈ G defines a canonical unitary $u_g \in A$ such that $u_g f u_g^* = f \circ g^{-1}$ for f ∈ C(X). The ∗-algebra generated by C0(X) and $u_g$, g ∈ G, is our algebra A of local operators. For H ∈ C0(X, R) consider the 1-cocycle $c_H \in Z^1(R, \mathbb{R})$,
$$c_H(x,y) = \sum_{j\in\mathbb{Z}} \big(H(\sigma^j x) - H(\sigma^j y)\big).$$
Recall [Re, Definition 3.15] that a measure µ on $X = R^{(0)}$ satisfies the $(c_H, 1)$-KMS condition if its modular function is equal to $e^{-c_H}$. In other words,
$$\frac{d g_*\mu}{d\mu}(x) = e^{-c_H(g^{-1}x,\,x)}.$$

Proposition 5.1. Let H ∈ C0(X, R). Then
(i) Any measure µ on X which is an equilibrium measure at −H satisfies the $(c_H, 1)$-KMS condition. In particular, any measure of maximal entropy is G-invariant.
(ii) The mapping µ → µ ◦ E defines a one-to-one correspondence between equilibrium measures on X at −H and equilibrium states on C∗(R) at H.
Proof. First note that if φ is an α-invariant state on C∗(R), and µ = φ|C(X), then hµ(σ) = hµ◦E(α) ≥ hφ(α). The equality is proved by standard arguments using [CNT, Corollary VIII.8]. The inequality follows from the fact that if ψ is a state on a finite dimensional C∗-algebra M with a masa B then S(ψ) ≤ S(ψ|B). It follows that if µ is an equilibrium measure at −H then µ ◦ E is an equilibrium state at H, and if φ is an equilibrium state at H then φ|C(X) is an equilibrium measure at −H. By Theorem 4.1 any equilibrium state is a $(\sigma_t^H, 1)$-KMS state. But by [Re, Proposition 5.4] any $(\sigma_t^H, 1)$-KMS state has the form µ ◦ E for some measure µ satisfying the $(c_H, 1)$-KMS condition. From this both assertions of the proposition follow.

Example 5.2. As an application of Proposition 5.1 consider a topological Markov chain (X, σ) with transition matrix $A_T$. As is well known, if $A_T$ is primitive then the Perron–Frobenius theorem implies the uniqueness of the trace on C∗(R). If $A_T$ is only supposed to be irreducible, then the traces of C∗(R) form a simplex with the number of vertices equal to the index of cyclicity of the matrix. The barycenter of this simplex is the unique α-invariant trace. By Proposition 5.1 we conclude that if $A_T$ is irreducible then (X, σ) has a unique measure of maximal entropy. Thus we have recovered a well-known result of Parry (see [W, Theorem 8.10]).

While in the abelian case uniquely ergodic systems are of great interest, they are not so for asymptotically abelian systems with locality. Indeed, we have

Proposition 5.3. Let (A, α) be a C∗-dynamical system which is asymptotically abelian with locality. If there is a unique invariant state τ, then τ is a trace, and πτ(A) is an abelian algebra.

For later use the main part of the proof will be given in a separate lemma.

Lemma 5.4. Let (A, α) be an asymptotically abelian system with locality, τ an α-invariant ergodic trace on A, H a local self-adjoint operator.
Suppose for each $H'$ in the real linear span of $\alpha^j(H)$, j ∈ Z, and for each k ∈ N there exists an $\alpha^k$-invariant $(\sigma_t^{H',k\mathbb{Z}}, 1)$-KMS state φ such that $\tau = \frac{1}{k}\sum_{j=0}^{k-1}\varphi\circ\alpha^j$. Then πτ(H) is central in πτ(A).

Proof. Replacing A by A/Ker πτ we may identify A with πτ(A) ⊂ B(Hτ). The automorphism α being extended to $A'' \subset B(H_\tau)$ is strongly asymptotically abelian. Hence for any k ∈ N the fixed point algebra $(A'')^{\alpha^k}$ is central. Since α is ergodic, this algebra is $k_0$-dimensional for some $k_0 \mid k$, and we may enumerate its atoms $z_1, \dots, z_{k_0}$ in such a way that $\alpha(z_1) = z_2, \dots, \alpha(z_{k_0-1}) = z_{k_0}$, $\alpha(z_{k_0}) = z_1$. Now if φ is an $\alpha^k$-invariant state such that $\tau = \frac{1}{k}\sum_{j=0}^{k-1}\varphi\circ\alpha^j$ then φ ≤ kτ, hence φ(x) = τ(xa) for some positive $a \in (A'')^{\alpha^k}$. In particular, φ is a trace. So if in addition φ is a $(\sigma_t^{H',k\mathbb{Z}}, 1)$-KMS state then the dynamics $\sigma_t^{H',k\mathbb{Z}}$ is trivial on $(A'')s(a)$, where s(a) is the support of a. Hence $\delta_{H',k\mathbb{Z}}(y)s(a) = 0$ for all local y. Then $\delta_{H',k\mathbb{Z}}(y)z_i = 0$ for some $z_i$ ($1 \le i \le k_0$) majorized by s(a).

Fix a local x ∈ A. Choose p ∈ N such that $\alpha^j(H)$ commutes with x whenever |j| ≥ p. Pick any m > p and set k = 2m + 1. For $\lambda \in \mathbb{R}^k$ consider the operator
$$H(\lambda) = \sum_{j=-m}^{m} \lambda_j\,\alpha^j(H).$$
Applying the result of the previous paragraph to $H' = H(\lambda)$, we find i, $1 \le i \le k_0$, such that $\delta_{H(\lambda),k\mathbb{Z}}(y)z_i = 0$ for all local y. Denote by $X_i$ the set of all $\lambda \in \mathbb{R}^k$ satisfying the latter condition. Since $\mathbb{R}^k = \cup_i X_i$, there exists i for which $\mathbb{R}^k$ coincides with the linear span of $X_i$. Without loss of generality we may suppose that i = 1. Since for any $j \in [-m+p, m-p]$, any $j' \ne 0$, and any $\lambda \in X_1$, the elements $\alpha^{j'k}(H(\lambda))$ and $\alpha^j(x)$ commute, we obtain
$$0 = \delta_{H(\lambda),k\mathbb{Z}}(\alpha^j(x))z_1 = [H(\lambda), \alpha^j(x)]z_1,$$
hence $[\alpha^{j'}(H), \alpha^j(x)]z_1 = 0$ for $j' \in [-m, m]$. In particular, $\alpha^j([H, x])z_1 = 0$ for $j \in [-m+p, m-p]$.

If $k_0 \ne k$ and $k - 2p \ge \frac{k}{2}$ ($> k_0$) then $\vee_{j=-m+p}^{m-p}\alpha^j(z_1) = 1$, so [H, x] = 0. If $k_0 = k$ then [H, x]z = 0, where
$$z = \bigvee_{j=-m+p}^{m-p}\alpha^j(z_1) = \sum_{j=-m+p}^{m-p}\alpha^j(z_1), \qquad \tau(z) = \frac{k-2p}{k}.$$
Since $\frac{k-2p}{k} \to 1$ as m → ∞, we conclude that [H, x] = 0.
Proof of Proposition 5.3. Since A is a unital AF-algebra, there exists a trace on A, hence there exists an α-invariant trace. It follows that the unique α-invariant state is a trace. If H is local then for any subset I of Z there exists a $(\sigma_t^{H,I}, 1)$-KMS state. Indeed, if we take an increasing sequence of finite subsets $I_n$ of I such that $\cup_n I_n = I$, an increasing sequence of local algebras $A_n$ such that $\alpha^j(H) \in A_n$ for $j \in I_n$ and $\cup_n A_n$ is dense in A, and a sequence of states $\varphi_n$ such that $\varphi_n|_{A_n}$ is a $(\sigma_t^{H,I_n}, 1)$-KMS state, then any weak∗ limit point of the sequence $\{\varphi_n\}_n$ will be a $(\sigma_t^{H,I}, 1)$-KMS state. If in addition I + k = I, then the state can be chosen to be $\alpha^k$-invariant (since the set of $(\sigma_t^{H,I}, 1)$-KMS states is $\alpha^k$-invariant). But if φ is an $\alpha^k$-invariant state then the state $\frac{1}{k}\sum_{j=0}^{k-1}\varphi\circ\alpha^j$ is α-invariant, hence it coincides with τ. Thus the conditions of Lemma 5.4 are satisfied. Hence πτ(H) is central in πτ(A) for any local H, so πτ(A) is abelian.

We consider two examples illustrating Proposition 5.3.

Example 5.5. Let U be the bilateral shift on a separable Hilbert space H, and α = Ad U|A, where A is the C∗-algebra K(H) + C1, K(H) being the algebra of compact operators. Then the only α-invariant state is the trace τ, which annihilates K(H). Then πτ(A) = C1.

Example 5.6. More generally, consider a uniquely ergodic system (X, σ) and construct a system (C∗(R), α) as above. Let τ be an α-invariant trace. Then τ = µ ◦ E for some measure µ, and the unique ergodicity of (X, σ) means that µ is the unique invariant measure. We check the conditions of Lemma 5.4 for any H ∈ C0(X, R). By the same reasons as in the proof of Lemma 5.4, the fixed point algebra $(\pi_\tau(A)'')^{\alpha}$ is central. By [FM] the center of the algebra $\pi_\tau(A)''$ is isomorphic to $L^\infty(X, \mu)^G$. Since the measure µ is ergodic, we conclude that the trace τ is also ergodic. Let H ∈ C0(X, R), k ∈ N, and φ any $\alpha^k$-invariant $(\sigma_t^{H,k\mathbb{Z}}, 1)$-KMS state. Then φ = ν ◦ E for some $\sigma^k$-invariant measure ν. Since $\mu = \frac{1}{k}\sum_{j=0}^{k-1}\nu\circ\sigma^j$, we have $\tau = \frac{1}{k}\sum_{j=0}^{k-1}\varphi\circ\alpha^j$.
Thus we can apply Lemma 5.4, and conclude that πτ(C(X)) ≅ C(supp µ) is central in πτ(A). This means that G acts trivially on supp µ, and πτ(A) = C(supp µ). By [Re, Proposition 4.5] the kernel of πτ is the algebra corresponding to the groupoid $R_{X\setminus \mathrm{supp}\,\mu}$. Since there are no non-zero finite σ-invariant measures on X\supp µ, any α-invariant state is zero on Ker πτ. Thus the system (A, α) is uniquely ergodic and πτ(A) = C(supp µ).

We next give an example of an asymptotically abelian C∗-dynamical system (A, α) with A an AF-algebra, for which there exist non-tracial α-invariant states with maximal finite entropy. Hence the assumption of locality in Theorem 4.1 is essential.

Example 5.7. Let H be an infinite-dimensional Hilbert space, A the even CAR-algebra over H, α the Bogoliubov automorphism corresponding to a unitary U. It is easy to see that α is asymptotically abelian if and only if $(U^n f, g) \to 0$ as n → ∞ for any f, g ∈ H. If in addition U has singular spectrum then by the proof of [SV, Theorem 5.2] we have ht(α) = 0, while there are many non-tracial α-invariant states (for example, quasi-free states corresponding to scalars λ ∈ (0, 1/2)). Unitaries with such properties can be obtained using Riesz products. We shall briefly recall the construction.

Let q > 3 be a real number, $\{n_k\}_{k=1}^\infty$ a sequence of positive integers such that $\frac{n_{k+1}}{n_k} \ge q$, $\{a_k\}_{k=1}^\infty$ a sequence of real numbers such that $a_k \in (-1, 1)$, $a_k \to 0$ as k → ∞, $\sum_k a_k^2 = \infty$. Then the sequence of measures
$$\frac{1}{2\pi}\prod_{k=1}^{n}(1 + a_k\cos n_k t)\,dt$$
on [0, 2π] converges weakly∗ to a probability measure µ with Fourier coefficients
$$\hat\mu(n) = \mu(e^{int}) = \begin{cases} \prod_{k=1}^{\infty}\big(\frac{a_k}{2}\big)^{|\varepsilon_k|}, & \text{if } n = \sum_k \varepsilon_k n_k \text{ with } \varepsilon_k \in \{-1, 0, 1\},\\ 0, & \text{otherwise.}\end{cases}$$
The measure µ is singular by [Z, Theorem V.7.6]. We see also that $\hat\mu(n) \to 0$ as |n| → ∞. Thus the operator U of multiplication by $e^{it}$ on $L^2([0, 2\pi], d\mu)$ has the desired properties.

Acknowledgement. The authors are indebted to A. Connes for suggesting to us to study the variational principle and equilibrium states in the setting of asymptotically abelian C∗-algebras. Theorem 4.1 gives a solution to a problem stated by Connes in Section V.6 of his book “Noncommutative Geometry”.
References
[A] Araki, H.: Gibbs states of a one dimensional quantum lattice. Commun. Math. Phys. 14, 120–157 (1969)
[BR] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics II. Berlin–Heidelberg–New York: Springer-Verlag, 1981
[CNT] Connes, A., Narnhofer, H., Thirring, W.: Dynamical entropy of C∗-algebras and von Neumann algebras. Commun. Math. Phys. 112, 691–719 (1987)
[CS] Connes, A., Størmer, E.: Entropy of automorphisms in II1 von Neumann algebras. Acta Math. 134, 289–306 (1975)
[FM] Feldman, J., Moore, C.C.: Ergodic equivalence relations, cohomology and von Neumann algebras II. Trans. Am. Math. Soc. 234, 325–361 (1977)
[GN] Golodets, V.Ya., Neshveyev, S.V.: Gibbs states for AF-algebras. J. Math. Phys. 39, 6329–6344 (1998)
[K] Krieger, W.: On dimension functions and topological Markov chains. Invent. Math. 56, 239–250 (1980)
[LR] Lanford, O.E. III, Robinson, D.W.: Statistical mechanics of quantum spin systems III. Commun. Math. Phys. 9, 327–338 (1968)
[M] Moriya, H.: Variational principle and the dynamical entropy of space translation. Rev. Math. Phys. 11, 1315–1328 (1999)
[N] Narnhofer, H.: Free energy and the dynamical entropy of space translations. Rep. Math. Phys. 25, 345–356 (1988)
[NST] Narnhofer, H., Størmer, E., Thirring, W.: C∗-dynamical systems for which the tensor product formula for entropy fails. Ergod. Th. & Dynam. Sys. 15, 961–968 (1995)
[OP] Ohya, M., Petz, D.: Quantum Entropy and Its Use. Berlin–Heidelberg–New York: Springer-Verlag, 1993
[Re] Renault, J.: A groupoid approach to C∗-algebras. Lect. Notes in Math. 793, Berlin–Heidelberg–New York: Springer-Verlag, 1980
[R] Ruelle, D.: Statistical mechanics on a compact set with Zν action satisfying expansiveness and specification. Trans. Am. Math. Soc. 185, 237–251 (1973)
[ST] Sauvageot, J.-L., Thouvenot, J.-P.: Une nouvelle definition de l’entropie dynamique des systems non-commutatifs. Commun. Math. Phys. 145, 411–423 (1992)
[S] Størmer, E.: Entropy of endomorphisms and relative entropy in finite von Neumann algebras. J. Funct. Anal. 171, 34–52 (2000)
[SV] Størmer, E., Voiculescu, D.: Entropy of Bogoliubov automorphisms of the canonical anticommutation relations. Commun. Math. Phys. 133, 521–542 (1990)
[V] Voiculescu, D.: Dynamical approximation entropies and topological entropy in operator algebras. Commun. Math. Phys. 170, 249–281 (1995)
[W] Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Math. 79. Berlin–Heidelberg–New York: Springer-Verlag, 1982
[Z] Zygmund, A.: Trigonometric Series, Vol. I & II. Cambridge: Cambridge Univ. Press, 1977
Communicated by A. Connes
Commun. Math. Phys. 215, 197 – 216 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Localization Regions of Local Observables Bernd Kuckert Mathematics Institute, Pl. Muidergracht 24, 1018 TV Amsterdam, The Netherlands. E-mail:
[email protected] Received: 22 February 2000 / Accepted: 29 June 2000
Abstract: Exploiting the properties of the Jost–Lehmann–Dyson representation, it is shown that in 1+2 or more spacetime dimensions, a nonempty smallest localization region can be associated with each local observable (except for the c-numbers) in a theory of local observables in the sense of Araki, Haag, and Kastler. Necessary and sufficient conditions are given that observables with spacelike separated localization regions commute (locality of the net alone does not imply this yet). 1. Introduction The algebraic approach to relativistic quantum physics [2,13] aims at joining the structures familiar from nonrelativistic quantum mechanics to those of special relativity. The object of investigation is a net A of local observables that associates with every bounded open region O in the Minkowski spacetime R1+s a unital C∗ -algebra A(O) of bounded operators in a Hilbert space H in such a way that O ⊂ P ⊂ R1+s implies A(O) ⊂ A(P ) (isotony), and such that the elements of algebras associated with spacelike separated regions commute (locality). For every region O, the elements of A(O) are interpreted as the observables measurable in a lab located in O. This paper deals with the question whether, given a single element A ∈ A(O) for some bounded O ⊂ R1+s , one can find a smallest nonempty region L(A) within which A can be measured. From a theorem of Landau it is well known that in at least 1+2 spacetime dimensions, one observable cannot be measured in two labs located in disjoint spacetime regions. A generalization of this theorem will be proved below, and for at least 1+2 spacetime dimensions, it will lead to several meaningful definitions of convex localization regions. Some additional technical assumptions then provide a strongest localization prescription with the property that observables with spacelike separated localization regions commute. 
But the fact that locality of the net alone (not even together with those assumptions that lead to the definition of a nonempty localization region) does not imply this version of locality, is a little surprising.

This article is structured as follows: Sect. 2 discusses the notation, concepts, and basic assumptions that play a role in this paper. Sect. 3 collects and completes the tools that will be used in what follows. In Sect. 4 the theorem due to Landau mentioned above is discussed and generalized: it states that in 1+2 or more spacetime dimensions, the algebras of any two double cones with disjoint closures only have in common the complex multiples of the identity operator. Using the techniques introduced in the first sections, it is shown that this result can be generalized to the empty-intersection theorem: instead of considering two double cones, one can consider one double cone and any finite number of wedge regions; if the closures of the regions under consideration have an empty common intersection, then the corresponding algebras have in common exactly the c-numbers. In Sect. 5, finally, it is discussed how one can, in 1+2 or more spacetime dimensions, use the empty-intersection theorem in order to associate a nonempty localization region with any given single local observable that is not a multiple of the identity. Despite the locality of the net, the question whether local observables with spacelike separated localization regions commute turns out to be nontrivial. A necessary and sufficient criterion for the locality of such a localization prescription is provided by the nonempty-intersection theorem: the criterion requires that, given any finite family of wedges, all local observables contained in the algebras associated with all these wedges are contained in the algebra associated with any neighbourhood of the intersection of the wedges as well. It is shown that an additional additivity assumption, which is typically fulfilled by nets arising from Wightman fields, implies this criterion, so it should be quite difficult to find a nonpathological example of a local net whose localization prescription does not exhibit locality. In the Conclusion, some related results are discussed briefly.

Author footnote: Casimir-Ziegler fellow of the Nordrhein-Westfälische Akademie der Wissenschaften.
2. Notation and Assumptions In what follows, H will be an infinite-dimensional (not necessarily separable) Hilbert space, and A will be a net of observables as defined above, i.e., a net which satisfies locality and isotony. The union of all algebras A(O) associated with bounded open regions1 O ⊂ R1+s , s ≥ 1, is an involutive algebra Aloc called the algebra of local observables. Throughout Sects. 4 and 5 it will be assumed that s ≥ 2. A will be assumed to satisfy the following (standard) conditions throughout this article: (A) Translation Covariance. A is covariant under a strongly continuous unitary representation U of the group (R1+s , +) of spacetime translations, i.e., U (a)A(O)U (−a) = A(O + a) for every bounded region O and every a ∈ R1+s . (B) Spectrum Condition. The spectrum of the four-momentum operator generating U is contained in the closure of the forward light cone. 1 In this article we refer to arbitrary subsets of R1+s as “regions”.
(C) Existence and Uniqueness of the Vacuum. The space of U-invariant vectors in H is one-dimensional. Ω will denote an arbitrary, but fixed unit vector in this space, the vacuum vector. Ω is cyclic with respect to Aloc, i.e., $\overline{A_{loc}\Omega} = H$.

Throughout Sects. 4 and 5 the following additional condition will be assumed to hold:

(D) Reeh–Schlieder Property. For every nonempty bounded open region O, the space A(O)Ω is dense in H.

Conditions (A) and (B) make sure that the system described by the net has a well-defined four-momentum whose spectrum ensures energetic stability of the system. Given Conditions (A) and (B), Condition (C) holds if and only if Ω induces a unique and pure vacuum state, which, in turn, holds if and only if the algebra Aloc is irreducible. A sufficient condition for this uniqueness is that the bicommutant $A_{loc}''$ of Aloc is a factor (Thm. III.3.2.6 in [13]). But as soon as H is separable, this implies that the uniqueness part of Condition (C) does not mean any loss of generality: every von Neumann algebra in a separable Hilbert space admits a direct-integral decomposition into factors, and since the unitaries representing the translations commute with the elements of the center of $A_{loc}''$, one can conclude that almost all factors of the central decomposition inherit Conditions (A) and (B) (cf. also the remarks in [13], Sect. III.3.2, and references therein).

The Reeh–Schlieder property, Condition (D), holds for all Wightman fields [22]. If the region O contains an open cone, it is well known to follow from Conditions (A) through (C) (cf., e.g., the Appendix in [10]). The Reeh–Schlieder property holds as soon as one has weak additivity ([8], cf. also Thm. 7.3.37 in [4]): if O is any bounded open region, then
$$\Big(\bigcup_{a\in\mathbb{R}^{1+s}} A(O + a)\Big)'' = A_{loc}''.$$
Conversely, it is well known that weak additivity can be derived from the Reeh–Schlieder property as well, provided Conditions (A) through (C) hold (cf., e.g., Lemma 2.6 in [26]). For the reader’s convenience we include a proof of this fact in the appendix.

Some special classes of spacetime regions will be used below. The first one is the class K of double cones, i.e., all regions of the form $(a + V_+) \cap (b - V_+)$, $a, b \in \mathbb{R}^{1+s}$. The class W of wedges consists of the region $W_1 := \{x \in \mathbb{R}^{1+s} : x_1 > |x_0|\}$ and its images under Poincaré transforms. If M is a region in $\mathbb{R}^{1+s}$, one denotes by $M^c$ the causal complement or spacelike complement, which is the region consisting of all points that are spacelike with respect to all points of M. The spacelike complement of the spacelike complement $(M^c)^c =: M^{cc} \supset M$ is called the causal completion of M, and M is called causally complete if $M = M^{cc}$. It is convenient to denote the interior of $M^c$ by $M'$. K and W are subclasses of the class C of convex, causally complete and open proper subsets of $\mathbb{R}^{1+s}$. The wedges in W are maximal elements of C in the sense that for every wedge W ∈ W, every element R ∈ C with R ⊃ W is a wedge. Every element R of C is an intersection of wedges (cf. [24], Thm. 3.2). The class of all wedges that contain a region R will be denoted by $W_R$. In general, the causal complement of a region in C is not convex. If R ∈ C, then $R'$ is a union of wedges ([24], Thm. 3.2), and $W^R$ will denote the class of all wedges that are subsets of $R'$. If O is an open convex region and if P is a convex region that is spacelike separated from O, there is a wedge W ∈ W such that O ⊂ W and $P \subset W^c$ (cf. [24], Prop. 3.1).
B will denote the bounded elements of the class C. Clearly, the double cones are in B. Every element of B is contained in some double cone, and it is precisely the intersection of all such double cones ([24], Prop. 3.8). The class of all double cones which contain a region O will be denoted by $K_O$, and the class of all double cones contained in an arbitrary region R will be called $K^R$.

In Sect. 5, two more technical assumptions will occur:

(E) Wedge Duality. For all W ∈ W, one has $A(W')'' = A(W)'$.

(F) Wedge Additivity. For each wedge W ∈ W and each double cone O ∈ K with W ⊂ W + O one has
$$A(W)'' \subset \Big(\bigcup_{a\in W} A(a + O)\Big)''.$$

All nets arising from finite-component Wightman fields satisfy wedge duality [5, 6]. One checks that wedge duality implies the condition of essential duality known from the analysis of superselection sectors, since for any two spacelike separated regions in B, one can find a wedge which contains one of the two, whereas its spacelike complement contains the other one (see above, and cf. Lemma 5.2 below). The algebras $A(O')'$ will also occur below for other regions O, e.g., for double cones. By locality, $A(O')' \supset A(O)$, and with the above remarks, it is easy to show that as soon as wedge duality holds, one obtains that for spacelike separated double cones O and P, the elements of the algebras $A(O')'$ and $A(P')'$ commute, a property which is called essential duality and which is used in the theory of localized superselection sectors (cf. [13] and references given there).

Condition (F) strengthens Condition (D) slightly, but it is a standard property of all Wightman fields as well. It has been used extensively by Thomas and Wichmann in [26]. The authors have obtained results in the spirit of Theorem 4.5 and Proposition 5.5 below, but their results do not imply ours.

Occasionally, terminology borrowed from PDEs and General Relativity will be used (timelike curves, Cauchy surfaces, etc.). These notions will not be defined in detail, but will be used as in [14].
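The region classes introduced in this section can be made concrete with a small numerical sketch. It assumes the metric signature (+, −, …, −), takes V+ to be the open forward light cone, and represents points of Minkowski space as tuples (x0, x1, …, xs); these conventions and all function names are assumptions made for illustration only.

```python
def minkowski_sq(x):
    """Minkowski square x0^2 - |x_spatial|^2 (signature +, -, ..., -)."""
    return x[0] ** 2 - sum(c ** 2 for c in x[1:])


def sub(x, y):
    """Componentwise difference x - y."""
    return tuple(a - b for a, b in zip(x, y))


def in_forward_cone(x):
    """Membership in the open forward light cone V+."""
    return x[0] > 0 and minkowski_sq(x) > 0


def in_double_cone(x, a, b):
    """Membership in the double cone (a + V+) intersected with (b - V+)."""
    return in_forward_cone(sub(x, a)) and in_forward_cone(sub(b, x))


def in_wedge_w1(x):
    """Membership in the wedge W1 = {x : x1 > |x0|}."""
    return x[1] > abs(x[0])


def spacelike_separated(x, y):
    """True if the points x and y are spacelike with respect to each other."""
    return minkowski_sq(sub(x, y)) < 0
```

With these helpers one can check, for instance, that a point of W1 and a point of the opposite wedge {x : x1 < −|x0|} are always spacelike separated, which is the pointwise content of the statement that the opposite wedge lies in the causal complement of W1.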
3. Commutator Functions and Wave Equation Techniques

It is a classical result of the Wightman approach to quantum field theory that one can reconstruct a Wightman field from its vacuum expectation values [23, 15]. The following lemma shows how one can reconstruct commutation relations of a net of observables from the behaviour of its vacuum expectation values. Since these have some convenient properties, this will facilitate the subsequent investigations. A will be a local net of local observables satisfying the above Conditions (A) through (C).

Lemma 3.1. For an arbitrary double cone O ∈ K, let A be an element of $A(O')'$.
(i) If a region $R \subset \mathbb{R}^{1+s}$ contains some open cone and has the property that ⟨Ω, ABΩ⟩ = ⟨Ω, BAΩ⟩ for all B ∈ A(R), then $A \in A(R)'$.
(ii) Assume that A has the Reeh–Schlieder property, and suppose there is a double cone P ∈ K with the property that ⟨Ω, ABΩ⟩ = ⟨Ω, BAΩ⟩ for all B ∈ A(P). If there is a double cone Q ⊂ P with the property that $A \in A(Q)'$, then $A \in A(P)'$.
(iii) Assume that A exhibits the Reeh–Schlieder property, and suppose there is a double cone P ∈ K with the property that ⟨Ω, ABΩ⟩ = ⟨Ω, BAΩ⟩ for all $B \in \bigcup_{a\in\mathbb{R}^{1+s}} A(P + a)$. Then A is a multiple of the identity.

Proof. (i) If S is an open cone contained in R, there is a translation $a \in \mathbb{R}^{1+s}$ such that $S + a \subset R \cap O'$. Choose C and D in A(S + a) and B in A(R). Since $A \in A(O')'$, the operators A and C∗ commute:
⟨CΩ, ABDΩ⟩ = ⟨Ω, C∗ABDΩ⟩ = ⟨Ω, AC∗BDΩ⟩.
Since C∗BD is in A(R), the assumption implies ⟨Ω, AC∗BDΩ⟩ = ⟨Ω, C∗BDAΩ⟩, and since D and A, in turn, commute because of $A \in A(O')'$, one concludes
⟨CΩ, ABDΩ⟩ = ⟨Ω, C∗BDAΩ⟩ = ⟨CΩ, BADΩ⟩.
But since C and D are arbitrary elements of A(S + a), and since Ω is cyclic with respect to this algebra, it follows that AB = BA; since B ∈ A(R) was arbitrary, one obtains $A \in A(R)'$, which is (i).

(ii) Choose C and D in A(Q) and B in A(P). Since A has been assumed to be in $A(Q)'$, it commutes with C∗, so
⟨CΩ, ABDΩ⟩ = ⟨Ω, C∗ABDΩ⟩ = ⟨Ω, AC∗BDΩ⟩.
Since C∗BD is in A(P), the assumption implies ⟨Ω, AC∗BDΩ⟩ = ⟨Ω, C∗BDAΩ⟩, and since D and A commute by the assumption that $A \in A(Q)'$, one concludes
⟨CΩ, ABDΩ⟩ = ⟨Ω, C∗BDAΩ⟩ = ⟨CΩ, BADΩ⟩.
But since C and D are arbitrary elements of A(Q), and since by the Reeh–Schlieder property, Ω is cyclic with respect to this algebra, it follows that AB = BA; since B ∈ A(P) was arbitrary, one obtains $A \in A(P)'$, which is (ii).

(iii) There is a translation $a \in \mathbb{R}^{1+s}$ such that $P + a \subset O'$, so that $A \in A(O')' \subset A(P + a)'$. Now choose a $b \in \mathbb{R}^{1+s}$ such that P + b intersects P + a, and let Q be a double cone contained in (P + b) ∩ (P + a). Isotony implies that $A \in A(Q)'$. Since by assumption, ⟨Ω, ABΩ⟩ = ⟨Ω, BAΩ⟩ for all B ∈ A(P + b), (ii) implies that $A \in A(P + b)'$. Now one can iterate this procedure: choose an arbitrary $c \in \mathbb{R}^{1+s}$ such that (P + c) ∩ (P + b) is nonempty, choose a new double cone Q in this intersection, and conclude from (ii) that $A \in A(P + c)'$. Note that only the double cone P + a chosen in the first step needs to be spacelike separated from O, since each step uses the result of the preceding one, so one finds that for every $a \in \mathbb{R}^{1+s}$, one proves that $A \in A(P + a)'$ with a finite number of steps. The statement now follows from weak additivity, which follows from the Reeh–Schlieder property (see above), and from irreducibility.
Given any two local observables A, B ∈ Aloc, the commutator function $f_{A,B}$ will henceforth be defined by
$$\mathbb{R}^{1+s} \ni x \mapsto \langle\Omega, [A, U(x)BU(-x)]\Omega\rangle =: f_{A,B}(x).$$
Due to Lemma 3.1, the analysis of the support of this function yields information on the structure of the net. Crucial for this analysis is the fact that $f_{A,B}$ is a boundary value of a solution of the wave equation, and a well-known lemma due to Asgeirsson concerning such solutions (cf., e.g., [1], Sect. 4.4.D in [7], or [11]) immediately implies the following lemma, which, for this reason, will be referred to as Asgeirsson’s Lemma. Another important consequence of the “wave nature” of the function $f_{A,B}$ is a theorem due to Jost, Lehmann and Dyson [16, 12] which will also be recalled for the reader’s convenience.

Lemma 3.2 (Asgeirsson). If the commutator function $f_{A,B}$ and all its partial derivatives are zero along a timelike curve segment γ, $f_{A,B}$ vanishes in the entire double cone $\gamma^{cc}$.

Proof. The Fourier transform of the operator valued function $\mathbb{R}^{1+s} \ni x \mapsto U(x)$ is the spectral measure of the four-momentum operator. It follows that the Fourier transform $\hat f_{A,B}$ of the function $f_{A,B}$ is a finite (not necessarily positive) measure, and by the spectrum condition, one has supp $\hat f_{A,B}$ ⊂ V. It follows that the function
$$F(x, \sigma) := (2\pi)^{-\frac{1+s}{2}} \int \cos\big(\sigma\sqrt{k^2}\big)\, e^{ikx}\, d\hat f_{A,B}(k)$$
is a continuous function with $F(x, 0) = f_{A,B}(x)$ for all $x \in \mathbb{R}^{1+s}$. This F is a solution of the 1+(s+1)-dimensional wave equation. This implies the statement by Asgeirsson’s result for solutions of the wave equation, see the references quoted above.

Evidently, the assumption of the lemma is satisfied as soon as $f_{A,B}$ vanishes in some open neighbourhood of γ. In the proof of Theorem 3.5 below, however, the function F defined in the proof is analysed, and the information one has about $f_{A,B}$ from locality merely implies that F vanishes in a null set of $\mathbb{R}^{1+(s+1)}$. In this case one makes use of the fact that F has been constructed in such a way that all its partial derivatives, including the one in the σ-direction, are zero at all points of this null set; one may then use the above lemma to show that the region where F vanishes also extends into the σ-direction.

Definition 3.3. Let R be a region in Minkowski space.
(i) R will be called Asgeirsson complete if for every timelike curve segment γ ⊂ R, the double cone $\gamma^{cc}$ is a subset of R as well. The smallest Asgeirsson complete extension of R will be called the Asgeirsson hull of R.
(ii) R will be called timelike convex if it contains as subsets all double cones with tips in R, i.e., if $(R + V_+) \cap (R - V_+) \subset R$.
(iii) R will be called a Jost–Lehmann–Dyson region if it is timelike convex and if every inextendible timelike curve in $\mathbb{R}^{1+s}$ intersects $R \cup R^c$.

Timelike convex regions contain all timelike path segments connecting two points in the region, so the terminology is in harmony with other notions of convexity. In [18] the term “double cone complete” was used instead of “timelike convex”, but the latter term was also used in [24] (Par. IV) and will be used in what follows to facilitate reading. The following lemma collects some relations between these notions most of which will be used below.
Localization Regions of Local Observables
Lemma 3.4. (i) Every causally complete region is timelike convex. (ii) Every timelike convex region is Asgeirsson complete. (iii) Every timelike convex and bounded open region is a Jost–Lehmann–Dyson region. (iv) The causal complement of a Jost–Lehmann–Dyson region is a Jost–Lehmann–Dyson region. (v) Let R and S be timelike convex regions, and assume that there exists a Cauchy surface T which is a subset of both R and S. Then the region R ∪ S is timelike convex (and, like R and S, trivially, a Jost–Lehmann–Dyson region). (vi) Let (Rρ)ρ>0 be an increasing family of Jost–Lehmann–Dyson regions. Then R := ⋃ρ Rρ is a Jost–Lehmann–Dyson region.

Before proving the lemma, we give some counterexamples to strengthened statements or converse implications. An example of a timelike convex region (and Jost–Lehmann–Dyson region) that is not causally complete (cf. (i)) is the time slice region {x ∈ R1+s : 0 ≤ x0 ≤ 1}. An example of an Asgeirsson complete region that is not timelike convex (cf. (ii)) is the union of two disjoint double cones at a timelike distance; this shows that the classes of timelike convex regions and of Jost–Lehmann–Dyson regions, respectively, are not stable under taking unions, so Statement (v) is far from tautological. The same holds for the class of Asgeirsson complete regions: consider the regions R+ := {x ∈ R1+s : ρx1 < x0 < ρx1 + 1} and R− := {x ∈ R1+s : −ρx1 < x0 < 1 − ρx1} for some ρ with 0 < ρ ≤ 1. One easily checks that both regions are Asgeirsson complete, while their union is not: its Asgeirsson hull is R1+s. If ρ < 1, the two regions are even Jost–Lehmann–Dyson regions, while their union evidently is not (cf. (v) and (vi)). An example of a timelike convex region which is neither causally complete nor a Jost–Lehmann–Dyson region (cf. (iii)) is the region R := {x ∈ R1+s : 1 < x² < 2, x0 > 0}, since there are timelike curves which do not intersect R, e.g., the curve R ∋ t ↦ (sinh t, cosh t, 0, . . . , 0).

Proof of Lemma 3.4.
(i) Let R be a causally complete region, and pick two points x ∈ R and y ∈ R ∩ (x + V+). The causal completion {x, y}^cc of the set {x, y} is the closure of the double cone (x + V+) ∩ (y − V+), and since {x, y} ⊂ R implies {x, y}^cc ⊂ R^cc = R, this immediately implies (i). Statement (ii) immediately follows from the definition.

(iii) Let R be timelike convex, bounded and open. Since R is open, a point x ∈ R1+s is not contained in the spacelike complement R^c if and only if it is timelike with respect to some point of R, i.e., R1+s \ R^c = R + V, where V is the open light cone. Now let γ be an inextendible timelike curve that does not intersect R ∪ R^c. Since γ does not intersect R^c, it has to stay within the region R + V. But since γ is an inextendible timelike curve, while R is bounded, γ cannot stay in the future R + V+ or the past R − V+ of R, i.e., it has to pass from R − V+ to R + V+. Since both these regions are open, while γ is continuous, it follows that it has to hit the region (R + V+) ∩ (R − V+). But this region coincides with R since R is timelike convex and open, so γ hits R, which is a contradiction and proves (iii).

(iv) The causal complement of any region is causally complete; by (i), this entails timelike convexity of R^c. The condition that R ∪ R^c is intersected by every inextendible timelike curve implies that R^c ∪ R^cc (⊃ R^c ∪ R) is intersected by every such curve. This proves (iv).
(v) Let x and y be points in R ∪ S that are timelike with respect to each other. Since R1+s is timelike convex, one finds an inextendible timelike curve γ hitting both x and y. Let z be the unique point where γ hits T. Since R and S are timelike convex, and since z ∈ T ⊂ R ∩ S, the closed double cones with the tips z and x and the tips z and y, respectively, are subsets of R ∪ S. If, with respect to the time ordering along γ, z is earlier or later than both x and y, it follows that the double cone with tips x and y is contained in R ∪ S as well. If z is between x and y, then, as before, we can conclude that the segments of γ between z and x and between z and y are subsets of R ∪ S, and since z ∈ T ⊂ R ∩ S, it follows that all of the segment of γ joining x to y is a subset of R ∪ S. Since γ can be any inextendible timelike curve hitting x and y, one obtains that all timelike curve segments joining x and y are contained in R ∪ S, so the double cone with tips x and y is contained in R ∪ S, which completes the proof of (v).

(vi) Let x and y be two points in R at a timelike distance. There are a ρx > 0 and a ρy > 0 such that x ∈ Rρx and y ∈ Rρy, so it follows that both x and y are elements of Rmax{ρx,ρy}. Since this region is timelike convex, it follows that the double cone with tips x and y is in R, proving that R is timelike convex. It remains to be shown that every inextendible timelike curve intersects R ∪ R′. Let γ be an inextendible timelike curve that does not intersect R. Since all Rρ are Jost–Lehmann–Dyson regions, it follows that γ has to intersect every R′ρ, so it has to intersect the region ⋂ρ>0 R′ρ = R′. This completes the proof.

The above statements and proofs can be extended in a straightforward manner to the spacetime one obtains by endowing the cylinder Zρ := {x = (x0, x⃗) ∈ R1+s : |x⃗| = ρ} with the spacetime structure it inherits from R1+s, provided s ≥ 2.
For s = 1 this spacetime fails to be timelike convex, and the proof of part (v) no longer works. For further results of the above kind, see [24]. The useful property of Jost–Lehmann–Dyson regions (which is the reason for their name) is established by the following theorem.

Theorem 3.5 (Jost, Lehmann, Dyson). Let A and B be local observables, and assume that the commutator function fA,B vanishes in a Jost–Lehmann–Dyson region R. Then the support of fA,B is contained not only in the complement of R, but even in the (in general, smaller) union of all admissible mass hyperboloids of R, i.e., the mass hyperboloids

\[ H_{a,\sigma} := \{x \in \mathbb{R}^{1+s} : (x-a)^2 = \sigma^2\}, \qquad a \in \mathbb{R}^{1+s},\ \sigma \in \mathbb{R}, \]

which do not intersect R.

Sketch of proof. Define F as in the proof of Lemma 3.2. Since F is a solution of the wave equation, it is well-known that for every Cauchy surface ζ in R1+(s+1), there exists a distribution Fζ with support in ζ such that F = Fζ ∗ D1+(s+1), where D1+(s+1) denotes a fundamental solution of the 1+(s+1)-dimensional wave equation (see, e.g., [7], pp. 175-184). The support of D1+(s+1) is contained in the closed light cone V̂ of R1+(s+1). Since R is a Jost–Lehmann–Dyson region in R1+s, its 1+(s+1)-dimensional Asgeirsson hull R̂ is easily seen to be a Jost–Lehmann–Dyson region in R1+(s+1). Provided this region is “well-behaved”, there is a Cauchy surface ζ in R̂ ∪ R̂^c. This Cauchy surface has the property that for every point z ∈ ζ, either both the forward and the backward part of V̂ + z or neither of them intersects R̂. The former case occurs if and only if z ∈ ζ ∩ R̂. The latter case occurs if and only if z ∈ ζ ∩ R̂^c, the Asgeirsson hull R̂ of R and the spacelike complement being taken in the spacetime R1+(s+1). But since all partial derivatives of F can be checked to vanish in all points in R, one obtains from Lemma 3.2 that F vanishes in R̂, so the support of Fζ contains only points of the second kind, i.e., it is contained in R̂^c ∩ ζ. This implies that the support of F is contained in (R̂^c ∩ ζ) + V̂. Since fA,B is a boundary value of F and since the intersection of V̂ + c with R1+s is the convex hull of a shifted mass hyperboloid, the support of the boundary value fA,B of the function F is contained in the union of admissible mass hyperboloids, as stated.

We conclude this section with another lemma to be used below that concerns the geometry of Minkowski space.

Lemma 3.6. Let P ∈ K be a double cone. (i) If O is a double cone, so is (O + P)^cc. (ii) If W is a wedge, so is (W + P)^cc.

Proof. Denote by aO and aP the lower tips, and by bO and bP the upper tips of O and P, respectively. Let x = aO + ξ and y = aP + η be points in O and P, respectively. Then x + y = aO + aP + ξ + η, and since ξ and η are elements of V+, so is ξ + η, so x + y is contained in aO + aP + V+. In the same way one proves that x + y ∈ bO + bP − V+, so one has O + P ⊂ (aO + aP + V+) ∩ (bO + bP − V+). Since the right hand side is a double cone and, hence, causally complete, it follows that (O + P)^cc is a subset of this double cone as well. On the other hand it is straightforward to check that the tips aO + aP and bO + bP of this double cone and the straight line joining them are contained in O + P, whence the converse inclusion follows as well, so the proof of (i) is complete.

The region W + P is a union of wedges that are images of W under translations. Consequently, (W + P)^c is the intersection of the corresponding translates of W^c.
But this intersection is the closure of a wedge, so it follows that the causal complement of this region, (W + P)^cc, is a wedge. This proves (ii).

4. Landau’s Theorem and the Empty-Intersection Theorem

By definition, a local net associates algebras with regions. In the sequel it will be discussed how to associate a localization region with a given algebra and even with a single local observable. The analysis is based on a theorem due to Landau [20]. In order to localize single observables, a new generalization of Landau’s theorem will be used. It will be stated and proved below. In this section, the theorem of Landau and its consequences for the localization of algebras will be discussed, and the mentioned generalization will be proved. This generalization will be the basis for the analysis of localization regions for single local observables, which is presented in the next section.
From now on, Conditions (A) through (D) will be assumed to hold without further mention, and it will be assumed in addition that s ≥ 2; Landau’s theorem and all generalizations discussed below depend heavily on this assumption, and so do the consequences to be discussed later on. Using the wave equation techniques discussed in the preceding section, Landau [20] proved the following:

Theorem 4.1 (Landau). If the closures of two double cones O and P are disjoint, then A(O′)′ ∩ A(P′)′ = C idH.

This already implies that for an O satisfying the assumptions of the corollary, the region

L(A(O′)′) := ⋃{P ∈ K : A(P) ⊂ A(O′)′},

which will be called the localization region of the algebra A(O′)′, coincides with O (cf. [3]):

Corollary 4.2. Let O ⊂ R1+s be a bounded, causally complete and convex open region. (i) For every open region M ⊂ R1+s, one has A(M) ⊂ A(O′)′ if and only if M ⊂ O. (ii) L(A(O′)′) = O.

Proof. By isotony and locality, the condition in statement (i) is sufficient. To prove that it is necessary, assume M ⊄ O. Then, since K is a topological base and since the region M\O has a nonempty interior, M\O contains a double cone P ∈ K whose closure is disjoint from Ō. Since Ō is an intersection of closures of wedges, it follows from this that a wedge W can be found whose closure is disjoint from P̄ and contains Ō. Since P̄ is compact, the distance between P̄ and W̄ is > 0, so, shifting it a little bit if necessary, one can choose W in such a way that Ō is a subset not only of W̄, but also of W itself. By Proposition 3.8 (b) in [24], one can now conclude that there is a double cone Q with Q ⊂ W and Q ⊃ O (note that O itself does not need to be a double cone). Landau’s theorem now implies that A(P′)′ ∩ A(Q′)′ = C idH. It follows from the Reeh–Schlieder property that A(P) ⊄ C idH, so A(P) ⊄ A(Q′)′. Since A(P) ⊂ A(M) follows from isotony, A(M) cannot be a subset of A(Q′)′, and since O ⊂ Q, it cannot be a subset of A(O′)′. This proves (i) and, trivially, implies (ii).
The proof of Corollary 4.2 can be made shorter as soon as one knows that Landau’s theorem still works if one of the two double cones is replaced by a wedge. That this, indeed, is possible has been shown in the context of the proof of the P₁CT-part of the first uniqueness theorem for modular symmetries (Theorem 2.1 in [17]).

Theorem 4.3. If the closures of a double cone O and a wedge W are disjoint, then A(O′)′ ∩ A(W′)′ = C idH.

Using this generalized version of Landau’s theorem, one concludes that in Corollary 4.2, the assumption that O is bounded may be omitted:

Corollary 4.4. Let R ⊂ R1+s be a causally complete convex open region. (i) For every open region M ⊂ R1+s, one has A(M) ⊂ A(R′)′ if and only if M ⊂ R. (ii) L(A(R′)′) = R.
Proof. By isotony and locality, the condition is sufficient. To prove that it is necessary, assume M ⊄ R. Then, since K is a topological base and since the region M\R has a nonempty interior, M\R contains a double cone O ∈ K whose closure is disjoint from R̄. As in the proof of Corollary 4.2, it follows that a wedge W can be found whose closure is disjoint from Ō and whose interior contains R̄. Landau’s theorem now implies that A(O) ∩ A(W′)′ = C idH. It follows from the Reeh–Schlieder property that A(O) ⊄ C idH, so A(O) ⊄ A(W′)′. Since A(O) ⊂ A(M) follows from isotony, A(M) cannot be a subset of A(W′)′, and since R ⊂ W, it cannot be a subset of A(R′)′, proving both statements.

In order to investigate the localization behaviour of a single local observable, a further generalization of Landau’s theorem will be used. It is the main result of this section. It is a generalization of Theorem 2.1 in [17]. A^d_loc will denote the algebra of local observables of the dual net A^d.

Theorem 4.5 (Empty-Intersection Theorem). Let (Wν)1≤ν≤n be a family of n wedges in W. If ⋂ν W̄ν = ∅, then

\[ \mathcal{A}_{\mathrm{loc}} \cap \bigcap_{\nu} \mathcal{A}(W_\nu')' = \mathbb{C}\,\mathrm{id}_{\mathcal{H}}. \]
Proof. Choose an A ∈ Aloc ∩ ⋂ν A(W′ν)′, and let O be a double cone with A ∈ A(O) (or A ∈ A(O′)′). Since the closed wedges W̄ν have empty common intersection, so do the compact regions Ō ∩ W̄ν. But if a finite family of compact regions has empty common intersection, there is an ε > 0 such that the family of the ε-neighbourhoods of the regions still has empty common intersection. The proof of this is an elementary induction: any two disjoint compact regions have a positive distance, which implies the statement for two regions. Now assume the statement to hold for any family of n compact sets, and let C1, . . . , Cn+1 be a family of n+1 regions. If n of these regions already have empty common intersection, there is nothing more to prove. So consider the case that the set Θ := ⋂ν=1,...,n Cν is nonempty. This region is compact and, as shown, has a positive distance δ from Cn+1, so the δ/3-neighbourhoods of the two regions still have a positive distance. But the δ/3-neighbourhood of Θ is the intersection of the δ/3-neighbourhoods of C1, . . . , Cn, so the statement follows for ε = δ/3.

It follows from this that there is a double cone P which is so small that the wedges W̃ν := (Wν − P)′′ = (Wν − P)^cc, ν ≤ n, and the double cone Õ := (O − P)′′ = (O − P)^cc (cf. Lemma 3.6 above to see that these regions are wedges and a double cone, respectively) still have empty common intersection. Choose any B ∈ A(P). By locality, the commutator function fA,B vanishes in the region R := Õ′ ∪ ⋃ν W̃′ν. There is no admissible mass hyperboloid for this region. To see this, note that if a (shifted) mass hyperboloid is disjoint from a union of wedges, so is the unique shift x + V̄, x ∈ R1+s, of the closure of the full light cone which contains the hyperboloid. Now choose x ∈ R1+s such that x + V̄ is disjoint from all W̃′ν, ν ≤ n, and from Õ′, which is a union of wedges, too. This is equivalent to {x}′ ⊃ Õ′ ∪ ⋃ν W̃′ν, i.e.,

\[ x \in \tilde{O}'' \cap \bigcap_{\nu} \tilde{W}_\nu'' = \tilde{O} \cap \bigcap_{\nu} \tilde{W}_\nu = \emptyset. \]

Hence there is no admissible mass hyperboloid for R.
If R is a Jost–Lehmann–Dyson region, it follows from Theorem 3.5 that fA,B(x) vanishes for all x ∈ R1+s and all B ∈ A(P), so using part (iii) of Lemma 3.1, one concludes that A ∈ C idH, and the proof is complete. But since R does not need to be a Jost–Lehmann–Dyson region, Asgeirsson’s lemma will be used to show that the function fA,B vanishes in a larger region N ⊃ R which is a Jost–Lehmann–Dyson region. Since there is no admissible hyperboloid for R, there is, a fortiori, no admissible hyperboloid for N, so the proof will be complete as soon as N has been shown to exhibit the stated properties.

To this end, choose coordinates such that Õ is the double cone (−ρ0 e0 + V+) ∩ (ρ0 e0 − V+), where e0 denotes the unit vector in the 0-direction, and ρ0 > 0 is the radius of the double cone Õ. Let Zρ = {x = (x0, x⃗) ∈ R1+s : |x⃗| = ρ} be the boundary of the cylinder of radius ρ around the time axis in R1+s, and define

\[ R_{\rho,0} := \tilde{O}' \cap Z_\rho, \qquad R_{\rho,\nu} := \tilde{W}_\nu' \cap Z_\rho, \quad \nu \le n. \]

All these regions are bounded subsets of R1+s. Due to our choice of coordinates, the region Rρ,0 is a strip: Rρ,0 = {x ∈ Zρ : |x0| ≤ ρ − ρ0} (which is empty if ρ < ρ0). For 1 ≤ ν ≤ n, the wedge W̃′ν is timelike convex in R1+s, so the region Rρ,ν is timelike convex with respect to the inherited spacetime structure of Zρ. We now show that there is a ρν > 0 such that the union Rρ,0 ∪ Rρ,ν is timelike convex as well for all ρ > ρν. To this end, let C be a spacelike hypersurface in W̃ν ∪ W̃ν^c. As a spacelike surface, it is a subset of Õ′ up to a compact set. For ρ so large that this compact set is enclosed by Zρ, one finds that C ∩ Zρ is a subset of Rρ,0. Since C ∩ Zρ is a Cauchy surface in the spacetime Zρ, it follows that Rρ,ν ∪ C and Rρ,0 are timelike convex regions in the spacetime Zρ whose intersection contains a Cauchy surface, so part (v) of Lemma 3.4 implies that Rρ,0 ∪ Rρ,ν is timelike convex. This proves that ρν with the stated properties exists for 1 ≤ ν ≤ n. Now choose ρ > ρ̂ := maxν ρν, and apply Lemma 3.4 (v) another n − 1 times to conclude that the region

\[ R_\rho := R \cap Z_\rho = \bigcup_{0 \le \nu \le n} R_{\rho,\nu} \]

is timelike convex in Zρ. Since the R1+s-Asgeirsson hull R̂ρ of Rρ is open, bounded, and timelike convex, it is a Jost–Lehmann–Dyson region by Lemma 3.4 (iii). On the other hand, the part of Õ′ and W̃′ν, respectively, which is enclosed by Zρ is a subset of the R1+s-Asgeirsson hull R̂ρ,ν of Rρ,ν. It follows that

\[ R \subset N := \bigcup_{\rho > \hat\rho}\ \bigcup_{\nu \le n} \hat{R}_{\rho,\nu} = \bigcup_{\rho > \hat\rho} \hat{R}_\rho, \]

and by Asgeirsson’s lemma, fA,B vanishes in N. Since the Jost–Lehmann–Dyson region R̂ρ increases with ρ, it follows from Lemma 3.4 (vi) that N is a Jost–Lehmann–Dyson region. This is what remained to be shown, so the proof is complete.
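The elementary compactness step at the beginning of the proof (finitely many compact sets without a common point keep this property after a small enlargement) can be illustrated in a toy setting. The sketch below is not taken from the paper: it assumes axis-aligned closed boxes and sup-norm neighbourhoods, and the names common_gap, inflate and safe_epsilon as well as the sample boxes are hypothetical. In this special case one third of the largest coordinate gap plays the role of ε.

```python
# Toy illustration only: axis-aligned closed boxes with sup-norm neighbourhoods
# stand in for the general compact regions of the proof (an assumption).

def common_gap(boxes):
    """Largest coordinate-wise gap of a family of closed boxes.

    Each box is a pair (lo, hi) of coordinate lists. The boxes have empty
    common intersection exactly when the returned value is positive.
    """
    dims = range(len(boxes[0][0]))
    return max(
        max(lo[i] for lo, _ in boxes) - min(hi[i] for _, hi in boxes)
        for i in dims
    )

def inflate(box, eps):
    """Closed eps-neighbourhood of a box in the sup norm."""
    lo, hi = box
    return ([c - eps for c in lo], [c + eps for c in hi])

def safe_epsilon(boxes):
    """Return an eps > 0 whose eps-neighbourhoods still have no common point."""
    gap = common_gap(boxes)
    if gap <= 0:
        raise ValueError("the boxes have a common point")
    # Inflating every box by eps shrinks the gap by 2 * eps, leaving gap / 3.
    return gap / 3

# Hypothetical sample: the first and the third box are disjoint.
boxes = [([0.0, 0.0], [1.0, 1.0]),
         ([0.5, 0.0], [2.0, 1.0]),
         ([1.6, 0.0], [3.0, 1.0])]
eps = safe_epsilon(boxes)
inflated = [inflate(b, eps) for b in boxes]
print(eps, common_gap(inflated) > 0)
```

For general compact sets no such closed formula is available, which is why the proof argues by induction on the number of sets.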
Actually, the following, slightly stronger version has been established by the preceding proof:

Corollary 4.6. Let W1, . . . , Wn be wedges in W, and let O ∈ K be a double cone. If Ō ∩ ⋂1≤ν≤n W̄ν = ∅, then

\[ \mathcal{A}(O')' \cap \bigcap_{\nu} \mathcal{A}(W_\nu')' = \mathbb{C}\,\mathrm{id}. \]
After completing this article, it was brought to the author’s attention that in 1+3 dimensions, one can also use the results of Thomas and Wichmann for the above proof. As the region Õ′ ∪ ⋃ν W̃′ν is a union of wedges, one can apply Theorem 3.6 in [25] to prove that the function fA,B vanishes in the causal closure of this region, which one can check to be all of R1+s. Their proof, which uses a completely different line of argument, has been written down for 1+3 dimensions only, and it is not evident whether it also holds in other dimensions; verifying this would require checking several hard proofs at the end of [26]. On the other hand the Thomas-Wichmann analysis is more general in other aspects, so the reader interested in the above special problem may find the above argument more direct.

5. Localization Regions and the Nonempty-Intersection Theorem

Theorem 4.5 prepares for the definition of a localization region for local observables. As the following proposition shows, there are several “natural” ways to define such a localization region, and it follows from the empty-intersection theorem that all of them yield nonempty localization regions.

Proposition 5.1. Let X be any of the classes K, B, W and C. For every A ∈ Aloc which is not a multiple of the identity, the localization regions

\[ L_X(A) := \bigcap \{\overline{O} : O \in X,\ A \in \mathcal{A}(O)\}, \qquad L^X(A) := \bigcap \{\overline{O} : O \in X,\ A \in \mathcal{A}(O')'\} \]

are nonempty, causally complete, convex, and compact sets. Between them, one has the following equalities and inclusions:

\[
\begin{array}{ccccccc}
L_B(A) & = & L_K(A) & \supset & L^K(A) & = & L^B(A) \\
 & & \cup & & \cup & & \\
L_C(A) & = & L_W(A) & \supset & L^W(A) & = & L^C(A).
\end{array}
\]

Proof. We start with the proof of the equalities and inclusions. The equalities immediately follow from the definitions, since on the one hand, K ⊂ B and W ⊂ C, while on the other hand, every region in B is an intersection of double cones in K and every region in C is an intersection of wedges in W (see Sect. 2). The inclusions in the upper and the lower row of the diagram immediately follow from locality. The inclusions in the two columns follow from the fact that every double cone is an intersection of wedges and that, by isotony, an observable contained in the algebra associated with a given double cone is contained in all algebras associated with wedges containing this double cone. By these inclusions, it is sufficient to prove that L^W(A) is nonempty if A ∉ C id. It already follows from Theorem 4.5 that the intersection of the closures of any finite
family of wedges whose algebras contain A is nonempty. But the family of all wedges whose algebras contain A is never finite. Since A is a local observable, there is a double cone O with A ∈ A(O), and it follows from isotony, locality, and the above inclusions that L^W(A) ⊂ Ō. But this implies that

\[ L^W(A) = \bigcap \{\overline{O} \cap \overline{W} : W \in \mathcal{W},\ A \in \mathcal{A}(W')'\}, \]

which is an intersection of closed subsets of the compact set Ō. But if for a class of closed subsets of a compact space, every finite subclass has a nonempty intersection, it follows from the Heine-Borel property that the whole class has a nonempty intersection. Now Corollary 4.6 implies the statement.

In the sequel the maps Aloc ∋ A ↦ L_X(A) and Aloc ∋ A ↦ L^X(A) will be referred to as localization prescriptions. Clearly, the localization prescriptions L_K and L^K coincide if the net satisfies Haag duality, i.e., if A(O′)′ = A(O) for all O ∈ K, and the prescriptions L_W and L^W coincide if the net satisfies wedge duality. Furthermore, wedge duality also makes L^W coincide with L^K by the following lemma (cf. also [9], Lemma 4.1).

Lemma 5.2. Assume the net A to satisfy wedge duality. For every region R ∈ C, one has

\[ \mathcal{A}(R')' = \bigcap_{W \in \mathcal{W}_R} \mathcal{A}(W) =: \mathcal{M}(R), \]

and the net M satisfies locality.

Proof. We first show that the net (A(R′)′)R∈C satisfies locality. This immediately follows from the fact remarked above that if R and S are spacelike separated regions in C, there is a wedge W ∈ W with R ⊂ W and S ⊂ W′. For such a constellation one has A(R′)′ ⊂ A(W′)′ = A(W) ⊂ A(S′)′′, which is the stated locality for the net (A(R′)′)R∈C. One proves in the same way that the net M satisfies locality with respect to A, i.e., M(R) ⊂ A(R′)′ for all R ∈ C. On the other hand,

\[ \mathcal{A}(R')' \subset \bigcap_{W \in \mathcal{W}_R} \mathcal{A}(W')' = \bigcap_{W \in \mathcal{W}_R} \mathcal{A}(W) = \mathcal{M}(R) \qquad \text{for all } R \in \mathcal{C}, \]

and this completes the proof.
So if one assumes wedge duality, the localization prescriptions L_W(A), L^W(A), and L^K(A) coincide and provide the smallest localization region out of the above suggestions. In what follows, wedge duality will be assumed, and for every local observable A ∈ Aloc, we simply write L_W(A) = L^W(A) = L^K(A) =: L(A). Now the question arises to what extent L(A) can be considered as the region where the observable A can be measured. For this interpretation to be consistent it is important that the localization prescription L satisfies locality in the sense that observables with spacelike separated localization regions commute. This does not follow from the locality assumption made for the net A. To illustrate this, consider the wedge X := W1 + e1, where e1 is the unit vector in the 1-direction, and its images Y and Z under rotations in the
1-2-plane by 120° and 240°, respectively. Assume a local observable A to be contained in A(X′)′ and in A(Y′)′, while another local observable B is contained in A(Y′)′ and A(Z′)′. In this case, the localization regions L(A) and L(B) are spacelike with respect to each other, but locality of the net alone is not yet sufficient to conclude that A and B should commute, since no two of the three wedges are spacelike separated. Actually, this simplified sketch already points towards the sufficient and necessary condition for locality of L provided by the following theorem.

Theorem 5.3 (Nonempty-Intersection Theorem). Assume A to satisfy wedge duality. The localization prescription Aloc ∋ A ↦ L(A) satisfies locality if and only if for every finite family W1, . . . , Wn of wedges and for every causally complete and convex region R ∈ C with ⋂ν W̄ν ⊂ R, one has

\[ \mathcal{A}_{\mathrm{loc}} \cap \bigcap_{1 \le \nu \le n} \mathcal{A}(W_\nu')' \subset \mathcal{A}(R')'. \]
Proof. To prove that the condition is sufficient, let ∂Bε(L(A)) be the boundary of the open ε-neighbourhood Bε(L(A)) of L(A) for ε > 0, and define W_A := {W ∈ W : ∃X ∈ W : A ∈ A(X′)′, X̄ ⊂ W}. A class of closed subsets of the compact space ∂Bε(L(A)) is defined by X := {∂Bε(L(A)) ∩ W̄ : W ∈ W_A}. X has empty intersection, and by the Heine-Borel property, there is a finite subclass of X whose intersection is still empty, i.e., there are wedges W1, . . . , Wn ∈ W_A such that

\[ \partial B_\varepsilon(L(A)) \cap \bigcap_{\nu} \overline{W}_\nu = \emptyset. \]

Due to the convexity of L(A) and of wedges it follows that the region

\[ R := \bigcap_{\nu} W_\nu \]

is a subset of Bε(L(A)), and that R ∈ B. By the definition of the class W_A, there are wedges X1, . . . , Xn with A ∈ A(X′ν)′ and X̄ν ⊂ Wν for 1 ≤ ν ≤ n. Since R ∈ B ⊂ C, one now obtains from the condition that

\[ A \in \mathcal{A}_{\mathrm{loc}} \cap \bigcap_{\nu} \mathcal{A}(X_\nu')' \subset \mathcal{A}(R')' \subset \mathcal{A}\bigl(B_\varepsilon(L(A))'\bigr)', \]

as stated. This holds for each ε > 0, and evidently, the same reasoning proves that B ∈ A(Bε(L(B))′)′. Since L(A) and L(B) are compact, convex, and spacelike separated, the euclidean distance between these regions is positive, and one can choose ε > 0 so small that Bε(L(A)) and Bε(L(B)) still are spacelike separated. As remarked in Sect. 2, there is a wedge X such that Bε(L(A)) ⊂ X and Bε(L(B)) ⊂ X′. Using wedge duality and Lemma 5.2, one concludes

\[ A \in \mathcal{A}\bigl(B_\varepsilon(L(A))'\bigr)' \subset \mathcal{A}(X), \]
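and

\[ B \in \mathcal{A}\bigl(B_\varepsilon(L(B))'\bigr)' \subset \mathcal{A}(X') = \mathcal{A}(X)', \]

so AB = BA, proving that the condition is sufficient.

To prove that the condition is necessary, let W1, . . . , Wn be a family of wedges, and choose an R ∈ C with ⋂ν W̄ν ⊂ R. Whenever A ∈ Aloc ∩ ⋂ν A(W′ν)′ and B ∈ Aloc ∩ A(X) for any X ∈ W^R, locality of L implies that AB = BA, and one concludes that

\[ A \in \bigcap_{X \in \mathcal{W}^R} \bigl(\mathcal{A}_{\mathrm{loc}} \cap \mathcal{A}(X)\bigr)' = \bigcap_{X \in \mathcal{W}^R} \mathcal{A}(X)' = \bigcap_{X \in \mathcal{W}^R} \mathcal{A}(X') = \bigcap_{X \in \mathcal{W}_R} \mathcal{A}(X) = \mathcal{A}(R')', \]

where Lemma 5.2 has been used in the last step.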
This theorem immediately implies the following corollary.

Corollary 5.4. Assume A to satisfy wedge duality, and suppose that the localization prescription L satisfies locality. If A is a local observable and R ∈ C is a causally complete convex open region in R1+s such that L(A) ⊂ R, then A ∈ A(R′)′.

As the following proposition shows, the additional assumption of wedge additivity (Condition (F) above) is sufficient to ensure locality of L.

Proposition 5.5. Assume A to satisfy wedge duality and wedge additivity. Then the localization prescription L satisfies locality.

Proof. Let A and B be local observables with spacelike separated localization regions. There is a wedge W such that L(A) ⊂ W and L(B) ⊂ W′. So as soon as one proves that this implies A ∈ A(W′)′ and B ∈ A(W)′, wedge duality implies the statement. To this end, we consider any A ∈ Aloc and any wedge W whose closure is spacelike separated from L(A), and show that A ∈ A(W)′. This follows from wedge additivity as soon as one has found a double cone P with the property that W ⊂ W + P and that fA,B vanishes in W for all B ∈ A(P). So fix an ε > 0 such that the ε-neighbourhood Bε(L(A)) of L(A) is still spacelike separated from W. As in the proof of Theorem 5.3, we choose a finite number of wedges X1, . . . , Xn in the class W_A such that

\[ \bigcap_{\nu} \overline{X}_\nu \subset B_\varepsilon(L(A)). \]

Now define P := (−ρe0 + V+) ∩ (ρe0 − V+) for some ρ > 0 (again, e0 denotes the unit vector in the time direction), and note that W ⊂ W + P. Fixing ρ > 0 sufficiently small, one can make sure that

\[ W \subset \bigcup_{\nu} (X_\nu - P)'. \]
Choosing any B ∈ A(P ) , one obtains from wedge duality that the commutator function fA,B defined above vanishes in the region R :=
(Xν − P ) , ν
which is a union of wedges. As in the proof of Theorem 4.5, fA,B can be shown to vanish in a larger region N ⊃ R which is a Jost–Lehmann–Dyson region. This can be shown by mimicking the corresponding part of that proof, as it does not depend on the assumption that the intersection of the closed wedges under consideration is empty. So one can keep A, B, and the double cone P , choose some double cone O with A ∈ A(O), replace X1 , . . . , Xn by W1 , . . . , Wn , and proceed like above to construct N . A mass hyperboloid H is admissible with respect to N only if it is admissible with respect to R, and as R is a union of closed wedges, this is the case only if the whole unique shift of the open light cone which contains H is disjoint from R. But by Theorem 3.5, this implies that in particular, fA,B vanishes in the region W , completing the proof. Thomas and Wichmann have obtained a similar result for 1+3 dimensions from slightly stronger assumptions (Theorem 4.10 in [26]). One may ask what is the difference between the intersection condition found in Theorem 5.3 and the ‘brute force’ condition that A(O) ∩ A(P ) = A(O ∩ P ) for all O, P ∈ C. Clearly, this condition is the stronger one of the two, and it would imply locality of L in a straightforward fashion. Furthermore the property appears so natural that one could expect it to be a general feature of local nets. In [21], Landau has given examples of theories which exhibit wedge duality, wedge additivity, and, hence, locality of L and the equivalent condition given above, while A(O) ∩ A(P ) = A(O ∩ P ) for O, P ∈ C. To illustrate the geometrical trick of Landau’s example, start from some local net B of observables in 1+(s+1) dimensions, and with every double cone O = (a+V+ )∩(b+V− ) in R1+s , associate the algebra + ) ∩ (b + V − )) =: B(O), B0 (O) := B((a + V + and V − denote the 1+(s+1)-dimensional forward and backward where, as before, V light cone, respectively. 
One easily checks that $B_0(O) \cap B_0(P)$ might not coincide with $B_0(O \cap P)$, since the intersection of the 1+(s+1)-dimensional Asgeirsson hulls $\tilde O$ and $\tilde P$ of O and P differs from the 1+(s+1)-dimensional Asgeirsson hull $\widetilde{O \cap P}$ of the intersection O ∩ P, i.e., $\tilde O \cap \tilde P \neq \widetilde{O \cap P}$. Indeed, Landau has given examples for theories where the corresponding algebras differ. In particular, they differ if the ‘large’ net B has the intersection property, i.e., if B(O) ∩ B(P) = B(O ∩ P) for all O, P ∈ B. This shows that the intersection property cannot be a general property of all local nets of observables. While Landau’s examples do satisfy all of our above conditions, they illustrate that the sufficient and necessary condition for L to be local is not self-evident, as it is similar to the intersection property violated by Landau’s examples. On the other hand, the fact that all our sufficient conditions for the locality of L hold gives some hope that locality of L is a rather natural property of local nets.
6. Conclusion Generalizing Landau’s result that the algebras associated with two strictly disjoint double cones have a trivial intersection, the empty-intersection theorem makes it possible to associate a nonempty causally complete, convex and compact localization region with every single local operator of a local net. If one makes the additional assumption of wedge duality, there is a natural way how to obtain a smallest localization region from the emptyintersection theorem. Even in this situation it is a nontrivial issue whether observables with spacelike separated localization regions commute. As a necessary and sufficient condition for this, the nonempty-intersection theorem establishes a special intersection property, and sufficient for this property is the additional condition of wedge additivity, a property typically shared by models arising from Wightman fields. As these results depend on very weak additional assumptions, locality of the localization prescription L turns out to be a rather natural property of local nets. The question what the intersection of two algebras of local observables contains has arisen earlier, as, e.g., the remarks in Sect. III.4.2 of Haag’s monograph [13] show. Haag’s ‘Tentative Postulate’ that the map O → A(O) be a homomorphism from the orthocomplemented lattice of all causally complete regions (which, in general, are neither bounded nor convex) of Minkowski space into the orthocomplemented lattice of von Neumann algebras on a Hilbert space does not hold in general as it stands (cf. also Haag’s heuristic remarks which illustrate the physical limits of the postulate). But if a net satisfies wedge duality and strong additivity for wedges, the above results, indeed, imply parts of Haag’s conjecture: for arbitrary finite families of wedges, one obtains relations in the spirit of (III.4.7) through (III.4.11) in [13] for the dual net. 
The results of this article have been used for the analysis of the Unruh effect and related symmetries of quantum fields [18, 19]. Proceeding, so to speak, in the converse direction, Thomas and Wichmann have investigated the implications that the symmetries providing the Unruh effect exert on the localization behaviour of local observables. Assuming the theory to exhibit the Unruh effect and a couple of (standard) technical properties including wedge additivity, they found that the localization region of an observable A with respect to a minimal Poincaré covariant local net generated by A is the smallest region $O_A$ in B with the property that for every $(a, \Lambda) \in P_+^\uparrow$, one has $a + \Lambda O_A \subset O_A'$ if and only if $[A, U(a, \Lambda) A U(a, \Lambda)^*] = 0$ [26]. This definition of a localization region no longer explicitly refers to any other operators of the net (while it does refer to the representation, which, by the Bisognano-Wichmann symmetries, is closely related to the net). While this interesting conclusion has been derived from the Unruh effect and other assumptions of relevance in the above discussion, all these assumptions have been avoided above since they are a goal rather than a starting point of the above analysis. In this sense, the results of Thomas and Wichmann are complementary to the above results.

Appendix

For the reader’s convenience we include a proof that the Reeh–Schlieder property entails weak additivity (cf. also Lemma 2.6 in [26]).

Lemma. Let A be a local net of local observables satisfying Conditions (A) through (D) above, let $O \subset \mathbb{R}^{1+s}$ be a bounded open region, and let $a \in \mathbb{R}^{1+s}$ be some timelike vector. Then
$$C_{O,a} := \bigvee_{t \in \mathbb{R}} A(O + ta) = B(H).$$
Proof. For any a and O as above, let A be any local observable commuting with all elements of $C_{O,a}$, and pick a B ∈ A(O). Define $f_+(t) := \langle\Omega, A^* U(ta) B\Omega\rangle$ and $f_-(t) := \langle\Omega, B U(-ta) A^*\Omega\rangle$. By the spectral theorem and the spectrum condition, the Fourier transforms of these functions are (not necessarily positive, but bounded) measures one of which has its support in the closed positive half axis, while the other one has its support in the closed negative half axis. Since $f_+$ and $f_-$ coincide by construction, it follows that the Fourier transform of $f_+$ (and of $f_-$) is a measure with support {0}, i.e., some multiple of the Dirac measure, so that $f_+$ is a constant function. Using this, the spectral theorem, and uniqueness of the vacuum, one concludes
$$\langle A\Omega, B\Omega\rangle = f_+(0) = f_+(t) = \langle\Omega, A^*\Omega\rangle \langle\Omega, B\Omega\rangle =: \bar\alpha \langle\Omega, B\Omega\rangle = \langle\alpha\Omega, B\Omega\rangle.$$
Since by the Reeh–Schlieder property, $B\Omega$ runs through a dense set, one concludes $A\Omega = \alpha\Omega$, and since A is a local observable, one obtains A = α id since Ω is cyclic with respect to A(O′), which implies that it is separating with respect to A(O) due to locality. This proves the lemma.

Acknowledgements. It was a very important help that D. Arlt, K. Fredenhagen, W. Kunhardt, N. P. Landsman, and the referee read preliminary versions of the manuscript very carefully. For helpful discussions, thanks are due to H.-J. Borchers, D. Guido, M. Lutz, M. Requardt, S. Trebels, and E. Wichmann. The above research is part of a project that has been re-initiated at the Erwin Schrödinger Institute in Vienna in the autumn of 1997. It has been supported by the Deutsche Forschungsgemeinschaft, the European Union’s TMR Network “Noncommutative Geometry”, a Feodor-Lynen grant of the Alexander von Humboldt Foundation, a part of which has been funded by the University of Amsterdam, and a Casimir-Ziegler award of the Nordrhein-Westfälische Akademie der Wissenschaften.
References
1. Araki, H.: A generalization of Borchers’ Theorem. Helv. Phys. Acta 36, 132–139 (1963)
2. Araki, H.: Mathematical Theory of Quantum Fields. Oxford: Oxford University Press, 1999
3. Bannier, U.: Intrinsic Algebraic Characterization of Space-Time Structure. Int. J. Theor. Phys. 33, 1797–1809 (1994)
4. Baumgärtel, H., Wollenberg, M.: Causal Nets of Operator Algebras. Berlin: Akademie-Verlag, 1992
5. Bisognano, J.J., Wichmann, E.H.: On the duality condition for a Hermitian scalar field. J. Math. Phys. 16, 985–1007 (1975)
6. Bisognano, J.J., Wichmann, E.H.: On the duality condition for quantum fields. J. Math. Phys. 17, 303 (1976)
7. Bogoliubov, N.N., Logunov, A.A., Oksak, A.I., Todorov, I.T.: General Principles of Quantum Field Theory. Dordrecht: Kluwer, 1990
8. Borchers, H.-J.: On the Vacuum State in Quantum Field Theory. II. Commun. Math. Phys. 1, 57–79 (1965)
9. Borchers, H.-J., Yngvason, J.: Transitivity of locality and duality in quantum field theory. Some modular aspects. Rev. Math. Phys. 6, 597–619 (1994)
10. Buchholz, D.: Collision Theory for Massless Fermions. Commun. Math. Phys. 42, 269–279 (1974)
11. Courant, R., Hilbert, D.: Methods of Mathematical Physics, Vol. II. New York, London: Interscience Publishers, 1962
12. Dyson, F.J.: Integral Representations of Causal Commutators. Phys. Rev. 110, 1460 (1958)
13. Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer, 1992
14. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge: Cambridge Univ. Press, 1980
15. Jost, R.: The General Theory of Quantized Fields. Providence, RI: Am. Math. Soc., 1965
16. Jost, R., Lehmann, H.: Integral Representation of Causal Commutators. Nuovo Cimento 5, 1598–1608 (1957)
17. Kuckert, B.: Borchers’ Commutation Relations and Modular Symmetries in Quantum Field Theory. Lett. Math. Phys. 41, 307–320 (1997)
18. Kuckert, B.: Spin & Statistics, Localization Regions, and Modular Symmetries in Quantum Field Theory. PhD-thesis, Hamburg 1998, DESY-thesis 1998-026
19. Kuckert, B.: Two uniqueness results on the Unruh effect and PCT-symmetry. Preprint math-ph 0010008, Amsterdam, 2000
20. Landau, L.J.: A Note on Extended Locality. Commun. Math. Phys. 13, 246–253 (1969)
21. Landau, L.J.: On Local Functions of Fields. Commun. Math. Phys. 39, 49–62 (1974)
22. Reeh, H., Schlieder, S.: Bemerkungen zur Unitäräquivalenz von lorentzinvarianten Feldern. Nuovo Cimento 22, 1051 (1961)
23. Streater, R.F., Wightman, A.S.: PCT, Spin & Statistics, and All That. New York: Benjamin, 1964
24. Thomas, L.J., Wichmann, E.H.: On the causal structure of Minkowski spacetime. J. Math. Phys. 38, 5044–5086 (1997)
25. Thomas, L.J., Wichmann, E.H.: On a class of distributions of interest in quantum field theory. J. Math. Phys. 39, 1680–1719 (1998)
26. Thomas, L.J., Wichmann, E.H.: Standard forms of local nets in quantum field theory. J. Math. Phys. 39, 2643–2681 (1998)

Communicated by H. Araki
Commun. Math. Phys. 215, 217 – 236 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Towards Cohomology of Renormalization: Bigrading the Combinatorial Hopf Algebra of Rooted Trees
D. J. Broadhurst (1), D. Kreimer (2)
1 Physics Dept., Open University, Milton Keynes MK7 6AA, UK. E-mail: [email protected]
2 Physics Dept., Univ. Mainz, 55099 Mainz, Germany. E-mail: [email protected]
Received: 31 January 2000 / Accepted: 7 July 2000
D. Kreimer: Heisenberg Fellow.

Abstract: The renormalization of quantum field theory twists the antipode of a noncocommutative Hopf algebra of rooted trees, decorated by an infinite set of primitive divergences. The Hopf algebra of undecorated rooted trees, $H_R$, generated by a single primitive divergence, solves a universal problem in Hochschild cohomology. It has two nontrivial closed Hopf subalgebras: the cocommutative subalgebra $H_{\rm ladder}$ of pure ladder diagrams and the Connes–Moscovici noncocommutative subalgebra $H_{CM}$ of noncommutative geometry. These three Hopf algebras admit a bigrading by n, the number of nodes, and an index k that specifies the degree of primitivity. In each case, we use iterations of the relevant coproduct to compute the dimensions of subspaces with modest values of n and k and infer a simple generating procedure for the remainder. The results for $H_{\rm ladder}$ are familiar from the theory of partitions, while those for $H_{CM}$ involve novel transforms of partitions. Most beautiful is the bigrading of $H_R$, the largest of the three. Thanks to Sloane’s superseeker, we discovered that it saturates all possible inequalities. We prove this by using the universal Hochschild-closed one-cocycle $B_+$, which plugs one set of divergences into another, and by generalizing the concept of natural growth beyond that entailed by the Connes–Moscovici case. We emphasize the yet greater challenge of handling the infinite set of decorations of realistic quantum field theory.

1. Introduction

In this paper we bigrade the Hopf algebra of undecorated rooted trees, and both of its closed Hopf subalgebras, taking account of Hochschild cohomology. In [1–9] we have exposed the connection between renormalization and a Hopf algebra. The joblist of renormalization specifies a noncocommutative coproduct, Δ. On the left are products of divergent subdiagrams; on the right these shrink to points. An
antipode, S, upgrades this bialgebra to a Hopf algebra, by specifying the procedure of subtracting subdivergences. If this antipode is twisted, by taking only the poles of the Laurent series in ε, for dimensionally regularized diagrams in d := 4 − 2ε spacetime dimensions, the final subtraction delivers a finite renormalized Green function, in the limit d → 4, corresponding to the minimal subtraction scheme. Different twists correspond to different renormalization schemes [1]. The general problem of perturbative quantum field theory involves the Hopf algebra of decorated rooted trees. These decorations represent primitive divergences, coming from diagrams with no subdivergences. Restriction to the Hopf algebra HR of undecorated rooted trees, generated by a single primitive divergence, reveals a remarkable feature. This apparently small problem in quantum field theory has a mathematical structure larger than a very general problem in noncommutative geometry, investigated by Alain Connes and Henri Moscovici [10], who showed that the composition of diffeomorphisms can be described algebraically, and hence extended to noncommutative manifolds, by making use of an appropriate Hopf algebra HCM . In [2] it was shown that the Hopf algebra HCM of [10] is, in the one-dimensional case, the unique noncocommutative Hopf subalgebra of HR , corresponding to adding Feynman diagrams [4] with weights determined by natural growth. The only other closed Hopf subalgebra is the cocommutative Hopf algebra Hladder of rooted trees whose nodes have fertility less than 2, corresponding to the ladder (or rainbow) diagrams of [11–14]. Suppose we are given an n-loop Feynman diagram that represents an n-node tree in HR . It involves n ultraviolet-divergent integrations, and hence one may expect that it delivers, in dimensional regularization, a pole of nth order. 
But then, combinations of diagrams corresponding to sums of products of rooted trees can provide cancellations of poles, and may hence eliminate leading pole terms. A prominent example of such a mechanism occurs in the calculation of an anomalous dimension, γ = d log Z/d log µ, which detects only single-pole terms, after minimal subtraction of subdivergences. All higher order poles are determined by the requirement that they cancel when one takes the derivative of the logarithm of the renormalization factor Z with respect to the renormalization scale µ. Every practitioner of multiloop quantum chromodynamics is vividly aware of the bigrading of her/his work, by loop number and degree of singularity. The slightest error in handling either the combinatorics or the integrations usually – and mercifully – reveals itself by a failure to get the uniquely finite answer that is ensured by the locality of counterterms. Thus there is deep – and largely uncharted – structure in the relations between Laurent expansions of products of Feynman diagrams, corresponding to forests of rooted trees. In this work we make preparation for a cohomological approach to renormalization, by identifying and analyzing a combinatoric bigrading of linear combinations of undecorated rooted forests. In Sect. 2, we define this bigrading, in terms of the number of nodes n and an index k that classifies subspaces according to their projection into an augmentation ideal, analyzed by k-fold iterations of the coproduct Δ. We wish to learn the dimension, $H_{n,k}$, of the subspace with weight n and index k. In Sect. 3 we find that this problem has a very simple solution in the cocommutative subalgebra $H_{\rm ladder}$, where the dimension is the number of ways of partitioning n into k positive integers, given in Table 1. In Sect.
4 we find that the corresponding problem in the noncocommutative subalgebra $H_{CM}$ has the subtler solution of Table 2, which we find to be related to Table 1 by a remarkable transform, which preserves the sums of rows. In Sect. 5 we team Neil Sloane’s superseeker [15] with Tony Hearn’s Reduce [16] and find
$$H(x, y) := \sum_{n,k} H_{n,k}\, x^n y^k = \frac{R(x)}{(1 - y)R(x) + xy} \tag{1}$$
for the generating function of the bigrading of $H_R$, with results in Table 3 obtained from [17]
$$R(x) := \sum_{n>0} r_n x^n = x \prod_{n>0} (1 - x^n)^{-r_n} = x + x^2 + 2x^3 + 4x^4 + 9x^5 + 20x^6 + \ldots \tag{2}$$
which generates the number $r_n$ of rooted trees with n nodes. Our discovery of the generating principle of Table 3 was triggered by superseeker analysis of merely the first 8 entries of its first column. After thorough study of the filtration in Table 4, we prove (1).

2. The Second Grading

The weight n of a rooted tree t is the number of its nodes. The weight of a forest $F = \prod_j t_j$ is the sum of the weights of the trees $t_j$ in the product. This is the first grading. To define the index k for the second grading, k-primitivity, we use k-fold iterations of the coproduct Δ, defined by the highly nontrivial recursion [1]
$$\Delta(t) = t \otimes e + (\mathrm{id} \otimes B_+) \circ \Delta \circ B_-(t) \tag{3}$$
for a nonempty tree t. Here e is the empty tree, evaluating to unity, id is the identity map, $B_-$ removes the root of t, and $B_+$ combines the trees of a product by appending them to a common root. The coproduct is coassociative. Hence it has a unique iteration, which may be written in a variety of equivalent ways. Since Δ has only single trees on the right, the recursion
$$\Delta^k = (\mathrm{id} \otimes \Delta^{k-1}) \circ \Delta \tag{4}$$
is particularly convenient. For a forest $F = \prod_j t_j$ we have $\Delta^k(F) = \prod_j \Delta^k(t_j)$. Let X be a Q-linear combination of monomials of trees, i.e. of forests. We say that X is k-primitive if every term of $\Delta^k(X)$ has at least one empty tree e. Symbolically we may consider the composition of tensor products of the projection operator $P := \mathrm{id} - E \circ \bar e$ with iterations of Δ. P projects onto the augmentation ideal $H_c = \{X \in H_R \mid P(X) = X\}$, where $X = P(X) + E \circ \bar e(X)$. Here $\bar e$ is the counit, which annihilates everything except the empty tree, for which it gives $\bar e(e) = 1$. The map from the rationals back to the algebra is simply E(q) = qe for q ∈ Q. Hence P annihilates e and leaves everything else unchanged. Let $U_0 := P$ and
$$U_k := \underbrace{(P \otimes \ldots \otimes P)}_{k+1\ \text{times}} \circ \Delta^k = (P \otimes U_{k-1}) \circ \Delta \tag{5}$$
for k > 0. In using the recursive form, note should be taken that, in general, the projection makes $U_k(X_1 X_2) \neq U_k(X_1) U_k(X_2)$: one should store results for forests, not just for trees.
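The recursions (3), (4) and (5) can be tried out on small examples. The following is a minimal Python sketch, not the authors' Reduce implementation: the nested-tuple encoding of trees and the function names `forest`, `coproduct_tree`, `coproduct_forest`, `iterated_coproduct` and `U` are illustrative assumptions. It stores coproducts for forests as well as for trees, in line with the remark above.

```python
from collections import defaultdict
from fractions import Fraction
from functools import lru_cache

# Assumed encoding: a tree is the sorted tuple of its child trees, so the
# one-node tree is (). A forest is a sorted tuple of trees, and the empty
# forest () plays the role of e. With this encoding B_-(t) is t read as a
# forest, and B_+(F) is F read as a tree.

def forest(*trees):
    return tuple(sorted(trees))

@lru_cache(maxsize=None)
def coproduct_forest(f):
    """Delta(F) = prod_j Delta(t_j), as {(left forest, right forest): coeff}."""
    result = {((), ()): 1}
    for t in f:
        step = defaultdict(int)
        for (a1, b1), c1 in result.items():
            for (a2, b2), c2 in coproduct_tree(t).items():
                step[(forest(*a1, *a2), forest(*b1, *b2))] += c1 * c2
        result = dict(step)
    return result

@lru_cache(maxsize=None)
def coproduct_tree(t):
    """Recursion (3): Delta(t) = t (x) e + (id (x) B_+) Delta(B_-(t))."""
    out = defaultdict(int)
    out[((t,), ())] += 1
    for (left, right), c in coproduct_forest(t).items():
        out[(left, (right,))] += c   # (right,) is the forest holding B_+(right)
    return dict(out)

def iterated_coproduct(f, k):
    """Recursion (4): Delta^k = (id (x) Delta^(k-1)) o Delta, Delta^0 = id."""
    if k == 0:
        return {(f,): 1}
    out = defaultdict(int)
    for (left, right), c in coproduct_forest(f).items():
        for rest, c2 in iterated_coproduct(right, k - 1).items():
            out[(left,) + rest] += c * c2
    return dict(out)

def U(X, k):
    """U_k(X) for X = {forest: coeff}: keep terms of Delta^k with no empty slot."""
    out = defaultdict(Fraction)
    for f, a in X.items():
        for slots, c in iterated_coproduct(f, k).items():
            if all(s != () for s in slots):
                out[slots] += a * c
    return {s: v for s, v in out.items() if v != 0}

l1 = ()          # single node
l2 = (l1,)       # two-node ladder
l3 = (l2,)       # three-node ladder
p2 = {forest(l2): Fraction(1), forest(l1, l1): Fraction(-1, 2)}
p3 = {forest(l3): Fraction(1), forest(l2, l1): Fraction(-1),
      forest(l1, l1, l1): Fraction(1, 3)}
print(U(p2, 1), U(p3, 1))              # two empty dicts: both are 1-primitive
print(U({forest(l1, l1): 1}, 2))       # empty dict: l1^2 is 2-primitive
```

The two test elements are the ladder combinations $l_2 - \frac12 l_1^2$ and $l_3 - l_2 l_1 + \frac13 l_1^3$. Solving $U_k(X) = 0$ for a general X with unknown coefficients would additionally need a linear-algebra step, which this sketch leaves out.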
We have said that X is k-primitive if $U_k(X) = 0$. Then clearly X is (k + 1)-primitive, since $\Delta^{k+1}(X)$ has at least two empty trees e in every term. We are interested in the number, $H_{n,k} := D_{n,k} - D_{n,k-1}$, of weight-n terms that are k-primitive but are not (k − 1)-primitive, where $D_{n,k}$ is the dimension of the subspace with weight n and index k. To compute $D_{n,k}$ for specific (and rather modest) values of n, k one considers the most general linear combination X of weight-n terms, with unknown coefficients, and solves $U_k(X) = 0$. The rank deficiency of this large system of linear equations is $D_{n,k}$. From this one subtracts the number $D_{n,k-1}$ of weight-n terms that are (k − 1)-primitive. By this means we obtained the first 7 rows of Tables 1 and 2, for the Hopf subalgebras $H_{\rm ladder}$ and $H_{CM}$, and inferred their generating principles. In the case of the full Hopf algebra $H_R$, bigraded in Table 3, data were much harder to obtain. Fortunately the generating principle is very distinctive.

3. Bigrading the Cocommutative Subalgebra

We first consider the cocommutative Hopf algebra $H_{\rm ladder}$ of rooted trees all of whose nodes have fertility less than 2, i.e. the Hopf algebra with linear basis $l_n = B_+^n(e)$, n ≥ 0. In this very simple case, the recursive definition (3) linearizes on the left, giving
$$\Delta(l_n) = \sum_{k=0}^{n} l_{n-k} \otimes l_k \tag{6}$$
for the unique n-node tree $l_n \in H_{\rm ladder}$. Thanks to our recent work in [8] we have an explicit construction of the weight-n 1-primitive $p_n \in H_{\rm ladder}$. First we compute the antipodes. In the cocommutative case, these are simply
$$S(l_n) = -\sum_{k=1}^{n} S(l_{n-k})\, l_k \tag{7}$$
with $l_0 = e$ and S(e) = e. To construct the 1-primitives, we use the star product $S \star Y$, where Y is the grading operator, giving $Y(l_k) = k\, l_k$. In general, a star product of operators is defined by $O_1 \star O_2 := m \circ (O_1 \otimes O_2) \circ \Delta$, where m merely multiplies entries on the left and right of a tensor product. The ladder 1-primitives are given by
$$p_n := \frac{1}{n} [S \star Y](l_n) = \sum_{k=0}^{n} \frac{k}{n}\, S(l_{n-k})\, l_k. \tag{8}$$
Clearly $p_1 = l_1$ and $p_2 = l_2 - \frac12 l_1^2$ are 1-primitive. It takes some time to show that
$$\begin{aligned} p_8 = {} & l_8 - l_7 l_1 - l_6 l_2 + l_6 l_1^2 - l_5 l_3 + 2 l_5 l_2 l_1 - l_5 l_1^3 - \tfrac12 l_4^2 + 2 l_4 l_3 l_1 + l_4 l_2^2 \\ & - 3 l_4 l_2 l_1^2 + l_4 l_1^4 + l_3^2 l_2 - \tfrac32 l_3^2 l_1^2 - 3 l_3 l_2^2 l_1 + 4 l_3 l_2 l_1^3 \\ & - l_3 l_1^5 - \tfrac14 l_2^4 + 2 l_2^3 l_1^2 - \tfrac52 l_2^2 l_1^4 + l_2 l_1^6 - \tfrac18 l_1^8 \end{aligned} \tag{9}$$
gives $\Delta(p_8) = p_8 \otimes e + e \otimes p_8$. We were able to compute this primitive with ease, using recursion (7) in the star product (8). From [8] we know that $[S \star Y](t)$ delivers a combination of diagrams whose singularity is a single pole, as d → 4, with a residue that determines the contribution of t to the anomalous dimension. Moreover 1-primitives have only single poles. However the converse is not true in the full Hopf algebra: noncocommutativity implies that not every $[S \star Y](t)$ is 1-primitive. Here, in the cocommutative subalgebra, there is a single 1-primitive for each weight n > 0. Hence $S \star Y$ delivers it.

Table 1. Dimensions $\bar H_{n,k}$ of the bigrading of the cocommutative subalgebra, $H_{\rm ladder}$ (rows n, columns k)

 n\k   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
   1   1
   2   1   1
   3   1   1   1
   4   1   2   1   1
   5   1   2   2   1   1
   6   1   3   3   2   1   1
   7   1   3   4   3   2   1   1
   8   1   4   5   5   3   2   1   1
   9   1   4   7   6   5   3   2   1   1
  10   1   5   8   9   7   5   3   2   1   1
  11   1   5  10  11  10   7   5   3   2   1   1
  12   1   6  12  15  13  11   7   5   3   2   1   1
  13   1   6  14  18  18  14  11   7   5   3   2   1   1
  14   1   7  16  23  23  20  15  11   7   5   3   2   1   1
  15   1   7  19  27  30  26  21  15  11   7   5   3   2   1   1
  16   1   8  21  34  37  35  28  22  15  11   7   5   3   2   1   1
  17   1   8  24  39  47  44  38  29  22  15  11   7   5   3   2   1   1
  18   1   9  27  47  57  58  49  40  30  22  15  11   7   5   3   2   1   1
  19   1   9  30  54  70  71  65  52  41  30  22  15  11   7   5   3   2   1   1

From examples such as (9) we inferred the general result of (8). The 1-primitive $p_n$ contains all possible multiplicative partitions $\prod_j l_j^{n_j}$ with weight $n = \sum_j n_j\, j$. The coefficient of each partition is $(-1)^{k-1}(k-1)!/\prod_j n_j!$, where $k = \sum_j n_j$ is the number of integers into which n has been partitioned. For example the partition 8 = 2 + 2 + 1 + 1 + 1 + 1, with k = 6, gives the coefficient −5!/2!4! = −5/2 of $l_2^2 l_1^4$ in (9). We have tested this Ansatz up to n = 20, where $p_{20}$ contains 627 terms. It is easy to understand the leading diagonal of Table 1: $l_1^n$ is n-primitive, but not (n − 1)-primitive. For 8 > n > k > 1 we used Reduce to prove the results of Table 1. The entry in the nth row and kth column is $\bar H_{n,k} = \bar D_{n,k} - \bar D_{n,k-1}$, where $\bar D_{n,k}$ is the number of undetermined coefficients when one solves $U_k(X) = 0$, with X taken as an unknown linear combination of forests $\prod_j l_j^{n_j}$ of weight $n = \sum_j n_j\, j$. Clearly the generating principle is extremely simple: $\bar H_{n,k}$ is the number of partitions of n into k positive integers. This simply reflects the fact that solving $U_k(X) = 0$ determines all and only the coefficients of partitions with $\sum_j n_j \le k$. Hence the kth column of Table 1 is generated by
$$\bar H_k(x) := \sum_n \bar H_{n,k}\, x^n = \prod_{j \le k} \frac{x}{1 - x^j} = \frac{x}{1 - x^k}\, \bar H_{k-1}(x) \tag{10}$$
which yields the recursion of the tabular entry A048789 of [15]:
$$\bar H_{n,k} = \bar H_{n-k,k} + \bar H_{n-1,k-1} \tag{11}$$
seeded by the empty tree, which gives $\bar H_{0,0} = 1$. We particularly note that for all j, k > 0,
$$\bar H_{j+k}(x) < \bar H_j(x)\, \bar H_k(x). \tag{12}$$
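The ladder construction of this section is small enough to reproduce directly. The sketch below is an illustration in Python, not the authors' Reduce code: it computes S(l_n) from the condition (S ⋆ id)(l_n) = 0 applied to coproduct (6), builds p_n from the star product (8), compares every coefficient with the Ansatz $(-1)^{k-1}(k-1)!/\prod_j n_j!$, and counts partitions with recursion (11). The encoding of a monomial as a tuple of indices and all function names are assumptions made for this example.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from functools import lru_cache
from math import factorial

# Assumed encoding: a monomial in l_1, l_2, ... is a non-increasing tuple of
# indices, e.g. (2, 2, 1, 1, 1, 1) for l_2^2 l_1^4; the empty tuple is e = l_0.

def times_l(poly, k, scale):
    """Return scale * poly * l_k for a polynomial {monomial: coefficient}."""
    out = defaultdict(Fraction)
    for mono, c in poly.items():
        new = mono if k == 0 else tuple(sorted(mono + (k,), reverse=True))
        out[new] += scale * c
    return out

def add_into(target, poly):
    for mono, c in poly.items():
        target[mono] += c

@lru_cache(maxsize=None)
def antipode(n):
    """S(l_n), solved from (S * id)(l_n) = 0 with coproduct (6)."""
    if n == 0:
        return {(): Fraction(1)}
    out = defaultdict(Fraction)
    for k in range(1, n + 1):
        add_into(out, times_l(antipode(n - k), k, -1))
    return dict(out)

def primitive(n):
    """p_n = (1/n)[S * Y](l_n); the k = 0 term of (8) vanishes."""
    out = defaultdict(Fraction)
    for k in range(1, n + 1):
        add_into(out, times_l(antipode(n - k), k, Fraction(k, n)))
    return {m: c for m, c in out.items() if c != 0}

def ansatz(mono):
    """(-1)^(k-1) (k-1)! / prod_j n_j! for a partition with k parts."""
    k = len(mono)
    denom = 1
    for multiplicity in Counter(mono).values():
        denom *= factorial(multiplicity)
    return Fraction((-1) ** (k - 1) * factorial(k - 1), denom)

@lru_cache(maxsize=None)
def partitions_into(n, k):
    """Recursion (11), seeded by the value 1 at n = k = 0."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0:
        return 0
    return partitions_into(n - k, k) + partitions_into(n - 1, k - 1)

p8 = primitive(8)
print(p8[(2, 2, 1, 1, 1, 1)])                        # coefficient of l_2^2 l_1^4
print(all(c == ansatz(m) for m, c in p8.items()))    # Ansatz check at n = 8
print([partitions_into(8, k) for k in range(1, 9)])  # row n = 8 of Table 1
```

Grouping the monomials of p_n by their number of factors gives the same counts as `partitions_into(n, k)`, which is one way to see why the rows of Table 1 sum to the number of partitions of n.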
4. Bigrading the Connes–Moscovici Subalgebra

To compute Table 2 we proceeded as above, now using the coproduct [2],
$$\Delta(\delta_n) = \delta_n \otimes e + e \otimes \delta_n + R_{n-1}, \tag{13}$$
$$R_n = [X \otimes e + e \otimes X + \delta_1 \otimes Y,\ R_{n-1}] + \delta_1 \otimes Y(\delta_n) \tag{14}$$
with $R_0 = 0$, $[X, \delta_n] = \delta_{n+1}$ increasing weight, and $[Y, \delta_n] = Y(\delta_n) = n\delta_n$ measuring weight. This is the noncocommutative coproduct of Connes and Moscovici [10], shown in [2] to give the closed Hopf subalgebra of $H_R$ that is realized by $\delta_n = N^{n-1}(l_1)$, where N is the natural growth operator, which appends a single node in all possible ways. Thus $\delta_1 = l_1$ and $\delta_2 = l_2$, while $\delta_3 = l_3 + B_+(l_1^2)$ differs from the ladder-algebra element $l_3$. Natural growth implies that $\delta_n$ is a sum over all weight-n trees in $H_R$, with nonzero Connes–Moscovici weights that we specified in [4], using an efficient recursive procedure. Computation of the first 7 rows of Table 2 took longer than for Table 1, because of the proliferation of product terms on the left of the noncocommutative coproduct. These scanty data presented us with a pretty puzzle, which the reader might like to try to solve, after covering up the rows of Table 2 with n > 7. What is the generating procedure? Recall that the sum of the nth row in Table 2 must agree with that in Table 1, since each gives the total number of ways of partitioning the integer n. In Table 1 this is achieved with great simplicity: the kth entry is the number of ways of partitioning n into k positive integers. In Table 2 it is achieved far more subtly, by the addition of only $1 + \lfloor n/2 \rfloor$ terms, since $\tilde H_{n,k}$ has support only for 2k ≥ n ≥ k.

Given merely data for n ≤ 7, the most interesting feature is the second subleading diagonal 1, 2, 4, 6 . . . . The leading diagonal is generated by $G_0 = 1/(1 - z)$, the first subleading diagonal by $G_1 = 1/(1 - z)^2$. The simplest Ansatz for the second is $G_2 = G_1/(1 - z^2)$, which requires $\tilde H_{8,6} = 9$. Then $\tilde H_{8,5} = 4$ is required, so that $1 + \tilde H_{8,5} + 9 + 7 + 1 = 22$ is the number of ways of partitioning 8. A Reduce program, running for 24 hours, proved that indeed $\tilde H_{8,5} = 4$. Next, the requirement $\tilde H_{9,6} = 7$ comes from $2 + \tilde H_{9,6} + 12 + 8 + 1 = 30$, for the partitions of 9, taking $\tilde H_{9,7} = 12$ from the hypothesis $G_2 = 1/((1 - z)^2 (1 - z^2))$ for the second subleading diagonal. Then the third subleading diagonal is revealed as 1, 2, 4, 7 . . . , which is nicely consistent with $G_3 = G_2/(1 - z^3)$. Finally, it is easy to check that the recurrence relation $G_k = G_{k-1}/(1 - z^k)$ for the diagonals makes the rows sum to the correct partitions. Later we shall prove this result by considering the Connes–Moscovici restriction of the filtration of the bigrading of the full Hopf algebra.

Table 2. Dimensions $\tilde H_{n,k}$ of the bigrading of the noncocommutative subalgebra, $H_{CM}$ (rows n, columns k; a dot marks a vanishing entry with 2k < n)

 n\k   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
   1   1
   2   1   1
   3   .   2   1
   4   .   1   3   1
   5   .   .   2   4   1
   6   .   .   1   4   5   1
   7   .   .   .   2   6   6   1
   8   .   .   .   1   4   9   7   1
   9   .   .   .   .   2   7  12   8   1
  10   .   .   .   .   1   4  11  16   9   1
  11   .   .   .   .   .   2   7  16  20  10   1
  12   .   .   .   .   .   1   4  12  23  25  11   1
  13   .   .   .   .   .   .   2   7  18  31  30  12   1
  14   .   .   .   .   .   .   1   4  12  27  41  36  13   1
  15   .   .   .   .   .   .   .   2   7  19  38  53  42  14   1
  16   .   .   .   .   .   .   .   1   4  12  29  53  67  49  15   1
  17   .   .   .   .   .   .   .   .   2   7  19  42  71  83  56  16   1
  18   .   .   .   .   .   .   .   .   1   4  12  30  60  94 102  64  17   1
  19   .   .   .   .   .   .   .   .   .   2   7  19  44  83 121 123  72  18   1

In words, the transformation is simple to state: the subleading diagonals of Table 2 are the partial sums of the columns of Table 1. This leads to the subtle recurrence relation
$$\tilde H_{n,k} = \tilde H_{k,2k-n} + \tilde H_{n-2,k-1} \tag{15}$$
for the bigrading of the Connes–Moscovici Hopf subalgebra. We particularly note that for j, k > 0 and j + k > 2,
$$\tilde H_{j+k}(x) < \tilde H_j(x)\, \tilde H_k(x), \tag{16}$$
while for j = k = 1 we have the equality $\tilde H_2(x) = \tilde H_1(x)\tilde H_1(x) = x^2(1 + x)^2$.

5. Bigrading the Full Hopf Algebra of Rooted Trees

Given how long it took to compute the data that eventually led to the generating principle for the Connes–Moscovici subalgebra, one might be daunted by the task of inferring the bigrading of the full Hopf algebra of undecorated rooted trees. In fact, we discovered this first, by mere consideration of the first 8 entries in the first column of Table 3. Thanks to [4] we had an extremely efficient Reduce implementation of the coproduct (3). The severity of the challenge of understanding the range and kernel of $U_k$, i.e. the difficulty of the computation of $\Delta^k$, increases drastically with k. At k = 1 it was possible to solve
$U_1(X) := (P \otimes P)(\Delta(X)) = 0$, for weights n ≤ 8, using a few hours of CPU time, notwithstanding the fact that at n = 8 the number of products of trees is $r_9 = 286$. The book-keeping was very simple, since the defining property of rooted trees is that every weight-n forest $F = \prod_j t_j$ is uniquely labelled by the tree $B_+(F)$ with weight n + 1. This clearly leads to the enumeration (2). More deeply, it shows that [2] $B_+$ is Hochschild closed, and hence that the apparently simplistic quantum-field-theory task of handling a single primitive divergence solves a universal problem in Hochschild cohomology.

Table 3. Dimensions $H_{n,k}$ of the bigrading of the Hopf algebra of rooted trees, $H_R$ (rows n, columns k)

 n\k     1     2     3     4     5     6     7     8     9    10    11    12    13
   1     1
   2     1     1
   3     1     2     1
   4     2     3     3     1
   5     3     6     6     4     1
   6     8    11    13    10     5     1
   7    16    26    27    24    15     6     1
   8    41    58    63    55    40    21     7     1
   9    98   142   148   132   100    62    28     8     1
  10   250   351   363   322   251   168    91    36     9     1
  11   631   890   912   804   635   444   266   128    45    10     1
  12  1646  2282  2330  2051  1625  1167   742   402   174    55    11     1
  13  4285  5948  6036  5304  4220  3072  2030  1184   585   230    66    12     1

Submitting 1, 1, 1, 2, 3, 8, 16, 41 to Neil Sloane’s superseeker [15], we learnt that it is generated by the first 8 terms of
$$H_1(x) = \frac{R(x) - x}{R(x)} = 1 - \prod_{n>0} (1 - x^n)^{r_n}. \tag{17}$$
At the time, Sloane had no idea that we were studying the bigrading of rooted trees and told us “it is pretty unlikely this is your sequence, but I thought I should pass this along just in case”. In fact, his superseeker discovery unlocked our puzzle. We knew that
$$\frac{R(x)}{x} = \sum_{k \ge 0} H_k(x), \tag{18}$$
where $H_0(x) := 1$ and $H_k(x) := \sum_{n \ge k} H_{n,k}\, x^n$ generates column k of Table 3. We then construed (17) as
$$\frac{R(x)}{x} = \frac{1}{1 - H_1(x)} = \sum_{k \ge 0} [H_1(x)]^k. \tag{19}$$
Comparison with (18) then led to the conjecture $H_k(x) = [H_1(x)]^k$, requiring that
$$H_{j+k}(x) = H_j(x)\, H_k(x). \tag{20}$$
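The content of (17) through (20) can be checked numerically against Table 3 with plain integer power series. The following Python sketch is illustrative only. It assumes the standard logarithmic-derivative recurrence for $r_n$ that follows from the product formula (2), which is not quoted in the text, and the helper names `rooted_trees`, `series_inverse`, `series_mul` and `bigrading` are hypothetical.

```python
def rooted_trees(nmax):
    """Return r[0..nmax] with r[n] the number of rooted trees on n nodes.

    Assumption: uses the recurrence
    n * r[n+1] = sum_{k=1..n} (sum_{d | k} d * r[d]) * r[n-k+1],
    obtained by taking the logarithmic derivative of equation (2).
    """
    r = [0, 1]
    for n in range(1, nmax):
        total = 0
        for k in range(1, n + 1):
            s = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += s * r[n - k + 1]
        r.append(total // n)
    return r

def series_inverse(a, N):
    """Coefficients b[0..N] with a(x) b(x) = 1 up to x^N, assuming a[0] == 1."""
    b = [1] + [0] * N
    for n in range(1, N + 1):
        b[n] = -sum(a[i] * b[n - i] for i in range(1, n + 1))
    return b

def series_mul(a, b, N):
    """Product of two series, truncated after x^N."""
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(N + 1)]

def bigrading(N):
    """Rows n = 1..N of H_{n,k} = [x^n] H_1(x)^k, i.e. assuming (20)."""
    r = rooted_trees(N + 1)
    a = [r[n + 1] for n in range(N + 1)]     # R(x)/x
    inv = series_inverse(a, N)               # x/R(x)
    h1 = [0] + [-c for c in inv[1:]]         # H_1 = 1 - x/R(x), equation (17)
    columns, power = [], [1] + [0] * N
    for k in range(1, N + 1):
        power = series_mul(power, h1, N)     # H_1(x)^k
        columns.append(power)
    return [[columns[k - 1][n] for k in range(1, n + 1)] for n in range(1, N + 1)]

table = bigrading(13)
print([row[0] for row in table[:8]])   # first column, cf. the superseeker input
print(table[7])                        # row n = 8, cf. Table 3
```

By (18), the entries of row n should sum to $r_{n+1}$, the number of forests of weight n, which gives a quick consistency check on both the recurrence and the table.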
To test this, we made intensive use of Reduce. At weight n = 9 we computed the 3214 × 719 matrix of integer contributions to the 3214 terms in Δ(X) produced by $r_{10} = 719$ weight-9 forests. The rank deficiency of the condition $U_1(X) = 0$ was proven to be $D_{9,1} = 98$, which is indeed the coefficient of $x^9$ in (17). We tested $H_2(x) = [H_1(x)]^2$ up to weight n = 8, where $\Delta^2(X)$ has 3651 terms in 286 unknowns. Here $U_2(X) = 0$ gave $D_{8,2} = 41 + 58 = 99$, where 41 and 58 are indeed the coefficients of $x^8$ in (17) and its square. Finally, we tested $H_3(x) = H_1(x)H_2(x)$ up to weight n = 7, where $\Delta^3(X)$ has 3168 terms in 115 unknowns, with $U_3(X) = 0$ giving $D_{7,3} = 16 + 26 + 27 = 69$, in agreement with the sum of the coefficients of $x^7$ in (17), its square and cube. Hence we obtained compelling evidence for the bigrading (1) of the Hopf algebra of rooted trees, determined by the circumstance (20) that it saturates all inequalities. First we derive these general inequalities, for any commutative graded Hopf algebra. Then we prove that they are saturated in $H_R$.

5.1. General inequalities. Let H be a commutative graded Hopf algebra with unit e. Let deg be the grading, with deg(X) ∈ N for all X ∈ H and deg(e) = 0. We assume that H is reduced to scalars by the counit $\bar e$. Let $H_k$ be the set of elements in the kernel of $U_k$ which are in the range of $U_{k-1}$, so that $U_k(X) = 0$ and $U_{k-1}(X) \neq 0$, for $X \in H_k$. Then we call k the degree of primitivity of X, writing $\deg_p(X) = k$. We let $H_0$ be the set of elements in the kernel of $P = U_0$, i.e. the scalars. The augmentation ideal fulfills $H_c = H/H_0 = \bigcup_{k=1}^{\infty} H_k$. To show that $\deg_p(X) \le \deg(X)$, suppose that
$$\Delta^{\deg(X)-1}(X) = \sum_i X_i^{(1)} \otimes \ldots \otimes X_i^{(\deg(X))} \tag{21}$$
has nonscalar entries in $H_c^{\otimes \deg(X)}$. Then they are all formed from 1-primitives, in $H_1$, since the coproduct is homogeneous in deg and any element X with deg(X) = 1 also has $\deg_p(X) = 1$. Hence $\deg_p$ is majorized by deg and is thus finite for each X ∈ H. We denote by $H_{n,k}$ the number of linearly inequivalent terms X with weight deg(X) = n and primitivity $\deg_p(X) = k$. Then the generators $H_k(x) := \sum_{n \ge k} H_{n,k}\, x^n$ satisfy
$$H_{j+k}(x) \le H_j(x)\, H_k(x), \qquad j, k > 0. \tag{22}$$
Proof. It is sufficient to show that an element $X \in H_{j+k}$ may be labelled by those terms in Δ(X) that are in $H_j \otimes H_k$. To prove this, suppose that $X_1$ and $X_2$ give the same terms in $H_j \otimes H_k$. Now observe that $U_{j,k} := U_{j-1} \otimes U_{k-1}$ projects onto $H_j \otimes H_k$, giving $U_{j,k} \circ \Delta(X_1 - X_2) = 0$. Finally, observe that coassociativity gives
$$0 = U_{j,k} \circ \Delta(X_1 - X_2) = P^{\otimes(j+k)} \circ \Delta^{j+k-1}(X_1 - X_2) = U_{k+j-1}(X_1 - X_2) \tag{23}$$
which shows that $X_1 - X_2$ is (j + k − 1)-primitive and hence that $X_1$ and $X_2$ are equivalent elements of $H_{j+k}$. In consequence of (22) we obtain
$$H_k(x) \le [H_1(x)]^k. \tag{24}$$
226
D. J. Broadhurst, D. Kreimer
This reflects the fact that the terms in k−1 (X) which belong to H1⊗k are sufficient to label elements X ∈ Hk . The remarkable feature of the Hopf algebra of rooted trees, to be proved below, is that all the elements of H1⊗k are necessary to label elements of Hk . As a further comment, we note that (24) may be strengthened if the Hopf algebra is cocommutative, since then the order of labels is immaterial. In the case of Hladder , with H 1 (x) = x/(1 − x), one thus obtains x H k (x) ≤ , (25) 1 − xj j ≤k
which is in fact saturated by Table 1.

5.2. Saturation. We now seek to prove that $H_k(x) = [H_1(x)]^k$ in the case that $\mathcal{H} = \mathcal{H}_R$ is the Hopf algebra of undecorated rooted trees. First we prove that $\deg_p(X_j X_k) = \deg_p(X_j) + \deg_p(X_k)$.

Proof. Suppose that $X_j \in \mathcal{H}_j$ and $X_k \in \mathcal{H}_k$. Then

$U_{j+k-1}(X_j X_k) = P^{\otimes(j+k)}\left(\Delta^{j+k-1}(X_j)\,\Delta^{j+k-1}(X_k)\right) \neq 0$   (26)

contains 1-primitives in all its slots, giving $U_{j+k}(X_j X_k) = 0$, by coassociativity.
It is instructive to see how this works out for the product $ZX$, when $Z$ is 1-primitive and $X$ is $k$-primitive. Then

$\Delta^k(Z) = \sum_{j=1}^{k+1} e \otimes \ldots \otimes Z|_{j\text{-th place}} \otimes \ldots \otimes e$   (27)

consists of all $k + 1$ terms with a single $Z$ and $k$ empty trees. As $X$ is $k$-primitive,

$\Delta^k(X) = \sum_{j=1}^{k+1} \sum_{i_j} X_{i_j}^{(1)} \otimes \ldots \otimes e|_{j\text{-th place}} \otimes \ldots \otimes X_{i_j}^{(k+1)} + \ldots$   (28)

with the final ellipsis denoting omission of terms that contain more than one $e$. The latter make no contribution to

$U_k(ZX) = \sum_{j=1}^{k+1} \sum_{i_j} X_{i_j}^{(1)} \otimes \ldots \otimes X_{i_j}^{(j-1)} \otimes Z|_{j\text{-th place}} \otimes X_{i_j}^{(j+1)} \otimes \ldots \otimes X_{i_j}^{(k+1)},$   (29)

where $Z$ replaces a single $e$. By construction (29) has all its entries, namely $Z$ or $X_{i_j}^{(r)}$, in $\mathcal{H}_1$. Hence $U_{k+1}(ZX) = 0$ and $\deg_p(ZX) = k + 1$.

Iterating this result one immediately concludes that $\deg_p(X_1 \ldots X_k) = k$, for 1-primitive elements $X_1, \ldots, X_k$. This does not, of itself, allow us to conclude that $H_k(x) = [H_1(x)]^k$, since the products are commutative. Thus there are fewer $k$-fold products of 1-primitives than there are $k$-primitives.

To appreciate what is needed in the next step, we pause to consider $\mathcal{H}_{CM}$. Its 1-primitives are $\delta_1$ and $\tilde\delta_2 = \delta_2 - \frac{1}{2}\delta_1^2$. From these we can form the 2-primitive products $\delta_1^2$, $\delta_1\tilde\delta_2$, and $\tilde\delta_2^2$. Table 2 shows that there is a further inequivalent 2-primitive, at weight $n = 3$. Direct computation shows that it may be taken as $\tilde\delta_3 = \delta_3 - \frac{1}{2}\delta_1^3$. Then we may form 6 inequivalent 3-primitive products, namely $\delta_1^3$, $\delta_1^2\tilde\delta_2$, $\delta_1\tilde\delta_3$, $\delta_1\tilde\delta_2^2$, $\tilde\delta_2^3$ and $\tilde\delta_2\tilde\delta_3$. Table 2 shows that there is only one more 3-primitive, at weight $n = 4$. It may be taken as $\tilde\delta_4 = \delta_4 - \frac{3}{4}\delta_1^4$. The absence of a further inequivalent 3-primitive at weight $n = 5$ means that $\widetilde{H}_3(x) < \widetilde{H}_1(x)\widetilde{H}_2(x)$. This exercise reveals the filtration of the bigrading of Table 2: the generator is

$\sum_{n,k} \widetilde{H}_{n,k}\, x^n y^k = \frac{1}{1 - xy} \prod_{k>0} \frac{1}{1 - x^{k+1} y^k}$   (30)

corresponding to products of $l_1$ and

$\tilde\delta_{k+1} = N^k(l_1) - \frac{k!}{2^k}\, l_1^{k+1}$   (31)

with $k$-primitivity achieved by a subtraction at weight $n = k + 1 > 1$. At $y = 1$, the filtration (30) agrees with the ladder filtration

$\sum_{n,k} \overline{H}_{n,k}\, x^n y^k = \prod_{k>0} \frac{1}{1 - x^k y}$   (32)

generated by products of the 1-primitives (8).

Now consider the highly nontrivial filtration of the bigrading of rooted trees. Let $P_{n,k}$ be the number of weight-$n$ elements of $\mathcal{H}_k$ that cannot be expressed as products of elements of $\{\mathcal{H}_j \mid j < k\}$. Then

$H(x, y) := \sum_{n,k} H_{n,k}\, x^n y^k = \prod_{n,k} \frac{1}{(1 - x^n y^k)^{P_{n,k}}}$   (33)

with $P_{n,k}$ telling us how many linearly independent combinations of weight-$n$ trees may be made $k$-primitive, but not $(k - 1)$-primitive, by suitable subtractions of products of trees of lesser weight. Setting $y = 1$, taking logs, and using the unique property (2) of the enumeration of rooted trees, we obtain

$\sum_n r_n \log(1 - x^n) = \log x - \log R(x) = \sum_{n,k} P_{n,k} \log(1 - x^n)$   (34)

and hence $r_n = \sum_k P_{n,k}$. Table 4 gives the filtration implied by $H_k(x) = [H_1(x)]^k$. The column generators are

$P_k(x) := \sum_n P_{n,k}\, x^n = \sum_{j|k} \frac{\mu(j)}{k} \left[ 1 - \prod_n (1 - x^{nj})^{r_n} \right]^{k/j},$   (35)

where the Möbius function $\mu(j)$ vanishes if $j$ is divisible by a square and is equal to $(-1)^p$ when $j$ is the product of $p$ distinct primes.

Table 4. Filtration $P_{n,k}$ of the bigrading of $\mathcal{H}_R$ (rows: weight $n$; columns: degree of primitivity $k$)

 n \ k     1     2     3     4    5    6    7    8   9  10  11  12
   1       1
   2       1
   3       1     1
   4       2     1     1
   5       3     3     2     1
   6       8     5     4     2    1
   7      16    13     9     6    3    1
   8      41    28    21    13    8    3    1
   9      98    71    49    33   20   10    4    1
  10     250   174   121    79   50   27   13    4   1
  11     631   445   304   201  127   74   38   16   5   1
  12    1646  1137   776   510  325  192  106   49  19   5   1
  13    4285  2974  2012  1326  844  512  290  148  65  23   6   1

To proceed, we use the Hochschild property of $B_+$, namely

$\Delta \circ B_+ = B_+ \otimes e + (\mathrm{id} \otimes B_+) \circ \Delta$   (36)

which follows from the action of the coproduct (3) on the trees produced by $B_+$, using $B_- \circ B_+ = \mathrm{id}$. Taking care to note that

$C := B_+ \circ B_- \neq B_- \circ B_+ = \mathrm{id}$   (37)

we obtain

$\Delta \circ B_- = (\mathrm{id} \otimes B_-) \circ \Delta \circ C$   (38)

by composition of (36) with $\mathrm{id} \otimes B_-$ on the left and $B_-$ on the right. It follows from (36) that if $X$ is $k$-primitive, then $B_+(X)$ has primitivity no greater than $k + 1$.

Proof. Suppose that $X \in \mathcal{H}_k$. Then repeated application of (36) gives

$U_{k+1} \circ B_+(X) = (P^{\otimes(k+1)} \otimes B_+) \circ \Delta^{k+1}(X) = 0$   (39)

since every term in $\Delta^{k+1}(X)$ contains at least two $e$'s, of which at most one is promoted to $l_1$ by $B_+$.

The presence of $C$ in (38) frustrates a parallel attempt to show that $B_-$ decreases primitivity. Rather, we found that the kernel of $B_-$ is an object of great interest. The action of $B_-$ on a nonempty tree $t$ is simple: it removes the root to produce, in general, a forest of rooted trees, each of whose roots was originally connected to the root of $t$ by a single edge. Since $B_-$ obeys the Leibniz rule

$B_-(X_1 X_2) = X_1 B_-(X_2) + X_2 B_-(X_1), \quad B_-(e) = 0,$   (40)

its action on forests is less trivial. The action of B− on a tree, t, is undone by B+ , giving C(t) := B+ (B− (t)) = t. On a forest of more than one tree, C does not degenerate to the identity map. It is this that makes the Hopf algebra of rooted trees such an amazingly
rich structure. Another important feature is that the kernels of B− and C coincide, since C := B+ ◦ B− and B− = B− ◦ C. Moreover, C is idempotent, since (C − id) ◦ C = B+ ◦ (B− ◦ B+ − id) ◦ B− = 0.
(41)
Hence there are two special types of object: trees, for which $C$ acts like the identity, and those linear combinations of forests that lie in the kernel of $C$. We shall show that the latter are the key to the filtration $P_{n,k}$ of Table 4. The first step is to prove that $C(X) = 0$ for every $X \in \mathcal{H}_1$ with weight $n > 1$.

Proof. The coproduct of tree $t$ has the form

$\Delta(t) = t \otimes e + B_-(t) \otimes l_1 + \ldots,$   (42)

where the ellipsis denotes terms with weight $n > 1$ on the right. Now consider a forest $F = \prod_j t_j$. The Leibniz rule (40) gives

$\Delta(F) = \prod_j \Delta(t_j) = F \otimes e + B_-(F) \otimes l_1 + \ldots$   (43)

and hence $\Delta(X)$ contains $B_-(X) \otimes l_1$, for all $X \in \mathcal{H}_R$. Now suppose that $X \in \mathcal{H}_1$ has no weight-1 term. Then $\Delta(X) = X \otimes e + e \otimes X$ requires that $B_-(X) = 0$ and hence that $X$ is in the kernel of $C$.

To get acquainted with the problem in hand, consider a pair of 1-primitives, $X_1$ and $X_2$. Their product is 2-primitive, giving

$U_1(X_1 X_2) = X_1 \otimes X_2 + X_2 \otimes X_1.$   (44)
For every such pair, we require another 2-primitive construct, say W (X1 , X2 ), giving U1 ◦ W (X1 , X2 ) = X1 ⊗ X2 − X2 ⊗ X1 .
(45)
This does not uniquely define $W(X_1, X_2)$, since we may add to any solution of (45) any combination of 1-primitives. The operative question is whether a solution exists, for each pair of distinct 1-primitives. This question does not arise in the ladder subalgebra, which is cocommutative. It is easily answered in the Connes–Moscovici subalgebra, where the asymmetry of

$U_1(\tilde\delta_3) = 3\delta_1 \otimes \tilde\delta_2 + \tilde\delta_2 \otimes \delta_1$   (46)

makes it simple to solve the single case of (45) by

$W(\delta_1, \tilde\delta_2) = \tilde\delta_3 - 2\tilde\delta_2\delta_1 = B_+(l_1^2) + l_3 - 2 l_2 l_1 + \frac{1}{2} l_1^3.$   (47)

More generally, the $k$-primitive nonproduct term $\tilde\delta_{k+1}$ accounts for the leading diagonal $P_{k+1,k} = 1$ of Table 4. In the full Hopf algebra, we must show the existence of $P_{n,2}$ asymmetric pairings enumerated by

$P_2(x) := \sum_n P_{n,2}\, x^n = \frac{1}{2}[H_1(x)]^2 - \frac{1}{2} H_1(x^2) = x^3 + x^4 + 3x^5 + 5x^6 + 13x^7 + 28x^8 + \ldots.$   (48)
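The enumerations quoted so far can be cross-checked with a few lines of exact integer arithmetic. The following is a minimal sketch, not the Reduce computation described in the text: it assumes the standard recurrence for the numbers $r_n$ of rooted trees, takes $H_1(x) = 1 - x/R(x)$, and evaluates the column generators through (35) in the equivalent form $k P_k(x) = \sum_{j|k} \mu(j) [H_1(x^j)]^{k/j}$. All function names are illustrative.

```python
def rooted_trees(N):
    """Return r with r[n] = number of rooted trees on n nodes, 1 <= n <= N."""
    r = [0, 1] + [0] * (N - 1)
    for n in range(1, N):
        total = 0
        for k in range(1, n + 1):
            s = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += s * r[n - k + 1]
        r[n + 1] = total // n
    return r

def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated at x^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N - i + 1):
                c[i + j] += ai * b[j]
    return c

def power(a, e, N):
    out = [1] + [0] * N
    for _ in range(e):
        out = mul(out, a, N)
    return out

def h1_series(N):
    """Coefficients of H_1(x) = 1 - x/R(x) up to x^N."""
    r = rooted_trees(N + 1)
    a = [r[m + 1] for m in range(N + 1)]          # R(x)/x
    inv = [1] + [0] * N                           # series inverse of R(x)/x
    for m in range(1, N + 1):
        inv[m] = -sum(a[i] * inv[m - i] for i in range(1, m + 1))
    return [0] + [-inv[m] for m in range(1, N + 1)]

def mobius(j):
    mu, p = 1, 2
    while p * p <= j:
        if j % p == 0:
            j //= p
            if j % p == 0:
                return 0
            mu = -mu
        p += 1
    return -mu if j > 1 else mu

def filtration(N):
    """P[n][k] for 1 <= n, k <= N, from the Moebius sum (35)."""
    h1 = h1_series(N)
    P = [[0] * (N + 1) for _ in range(N + 1)]
    for k in range(1, N + 1):
        acc = [0] * (N + 1)
        for j in range(1, k + 1):
            if k % j:
                continue
            sub = [0] * (N + 1)
            for m in range(N // j + 1):
                sub[m * j] = h1[m]                # H_1(x^j)
            term = power(sub, k // j, N)
            for n in range(N + 1):
                acc[n] += mobius(j) * term[n]
        for n in range(N + 1):
            assert acc[n] % k == 0
            P[n][k] = acc[n] // k
    return P

if __name__ == "__main__":
    N = 13
    h1, P = h1_series(N), filtration(N)
    h2, h3 = power(h1, 2, N), power(h1, 3, N)
    print("D_{9,1}, D_{8,2}, D_{7,3}:", h1[9], h1[8] + h2[8], h1[7] + h2[7] + h3[7])
    for n in range(1, N + 1):
        print(n, [P[n][k] for k in range(1, N + 1) if P[n][k]])
```

Each printed row can be compared with the corresponding row of Table 4, and the row sums with $r_n$, as required by (34).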
Part of what is required is clearly provided by

$W(l_1, X) = l_1 X - 2B_+(X)$   (49)

since (36) shows that

$U_1 \circ W(l_1, X) = l_1 \otimes X + X \otimes l_1 - 2(P \otimes B_+)(X \otimes e + e \otimes X) = l_1 \otimes X - X \otimes l_1$   (50)

has the desired antisymmetry. By this means we easily construct the elements of $\mathcal{H}_2$ with weight $n < 5$ from products of 1-primitives and the action of $B_+$ on 1-primitives. At weights $n \geq 5$ we need a further construction. There are $P_{5,2} = 3$ weight-5 nonproduct 2-primitives, but only $H_{4,1} = 2$ weight-4 1-primitives on which to act with $B_+$. We lack, thus far, a way of constructing $W(p_2, p_3)$, where $p_2 = l_2 - \frac{1}{2} l_1^2$ and $p_3 = l_3 - l_2 l_1 + \frac{1}{3} l_1^3$ are the 1-primitives at weights $n = 2, 3$, common to the cocommutative subalgebra $\mathcal{H}_{ladder}$. At weight $n = 6$, we lack $W(p_2, p_4)$ and $W(p_2, \tilde{p}_4)$, where

$p_4 = l_4 - l_3 l_1 - \frac{1}{2} l_2^2 + l_2 l_1^2 - \frac{1}{4} l_1^4,$   (51)

$\tilde{p}_4 = B_+\left(2 l_2 l_1 - B_+(l_1^2) - l_1^3\right) + l_1 B_+(l_1^2) - l_2^2,$   (52)

are the weight-4 ladder and nonladder 1-primitives enumerated by $H_{4,1} = 2$. It is simple to check that they are annihilated by the Leibniz action of $B_-$, using $B_- \circ B_+ = \mathrm{id}$ and $B_-(l_n) = l_{n-1}$ with $l_0 := e$ evaluating to unity. At this juncture, it is instructive to compare Tables 3 and 4, which reveal that

$P_{n,2} \leq 2H_{n-1,1},$   (53)

$P_{n,k} \leq H_{n-1,k-1}, \quad k > 2.$   (54)

In the Appendix, we show that these inequalities persist at large n, thanks to the fact that the Otter constant c := limn→∞ rn+1 /rn = 2.955765 . . . is slightly less than 3. Thus it is conceivable that for k > 2 the action of B+ might generate Pk (x) from xHk−1 (x), but it is quite impossible for it to do this job at k = 2. It appears from (53) that we need a second operator that increases n and k by unity. 5.3. Natural growth by a single node. There is a clear candidate for the second operator: the natural growth operator N , which appends a single node in all possible ways, and hence obeys a Leibniz rule. The commutators of N with B± are easily found, since we need only consider what is happening at the root. Defining the operator L by L(X) := l1 X, we obtain

$[N, B_+] = B_+ \circ L,$   (55)

$[B_-, N] = L \circ B_-,$   (56)

$[N, C] = [N, B_+] \circ B_- - B_+ \circ [B_-, N] = 0.$   (57)

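The commutators (55)-(57) lend themselves to a brute-force check on small forests. The sketch below is illustrative only: it assumes a representation in which a rooted tree is the sorted tuple of its child trees (so the single node $l_1$ is the empty tuple), a forest is a sorted tuple of trees, and an element of the algebra is a `Counter` mapping forests to integer coefficients. The function names are ours, not the paper's.

```python
from collections import Counter
from functools import lru_cache

L1 = ()   # the single-node tree l_1: a root with no children
E = ()    # the empty forest, i.e. the unit e (same tuple, different role)

def canon(trees):
    """Canonical (sorted) form of a multiset of trees."""
    return tuple(sorted(trees))

def linear(op):
    """Extend a map defined on basis forests to linear combinations."""
    def wrapped(X):
        out = Counter()
        for F, c in X.items():
            for G, m in op(F).items():
                out[G] += c * m
        return Counter({G: c for G, c in out.items() if c})
    return wrapped

@linear
def B_plus(F):          # connect every root of F to a new root
    return {(F,): 1}

@linear
def B_minus(F):         # remove one root, summed over trees: Leibniz rule (40)
    out = Counter()
    for i, t in enumerate(F):
        out[canon(F[:i] + F[i + 1:] + t)] += 1
    return out

def grow_tree(t):
    """All ways of appending a single leaf to one node of the tree t."""
    out = Counter({canon(t + (L1,)): 1})
    for i, child in enumerate(t):
        for grown, m in grow_tree(child).items():
            out[canon(t[:i] + (grown,) + t[i + 1:])] += m
    return out

@linear
def N(F):               # natural growth by a single node
    out = Counter()
    for i, t in enumerate(F):
        for grown, m in grow_tree(t).items():
            out[canon(F[:i] + (grown,) + F[i + 1:])] += m
    return out

@linear
def L(F):               # multiplication by l_1
    return {canon(F + (L1,)): 1}

def C(X):
    return B_plus(B_minus(X))

def sub(X, Y):
    out = Counter(X)
    for F, c in Y.items():
        out[F] -= c
    return Counter({F: c for F, c in out.items() if c})

@lru_cache(maxsize=None)
def forests(n):
    """All forests with n nodes; a tree on m nodes is a forest on m-1 nodes plus a root."""
    if n == 0:
        return frozenset({E})
    return frozenset(canon(F + (t,)) for m in range(1, n + 1)
                     for t in forests(m - 1) for F in forests(n - m))

def check(max_weight=5):
    for n in range(max_weight + 1):
        for F in forests(n):
            X = Counter({F: 1})
            assert sub(N(B_plus(X)), B_plus(N(X))) == B_plus(L(X))     # (55)
            assert sub(B_minus(N(X)), N(B_minus(X))) == L(B_minus(X))  # (56)
            assert N(C(X)) == C(N(X))                                  # (57)
            assert C(C(X)) == C(X)                                     # (41)
    return True

def p4_nonladder():
    """The nonladder weight-4 element of (52); its coefficients are integers."""
    l2, b_l1l1 = (L1,), (L1, L1)        # the trees l_2 and B_+(l_1^2)
    inner = Counter({canon((l2, L1)): 2, (b_l1l1,): -1, (L1, L1, L1): -1})
    X = B_plus(inner)
    X.update({canon((L1, b_l1l1)): 1, canon((l2, l2)): -1})
    return X

if __name__ == "__main__":
    print("commutators hold up to weight 5:", check())
    print("B_- annihilates (52):", not B_minus(p4_nonladder()))
```

In this encoding $B_+$ costs nothing, since the tuple of children of a tree is already the forest $B_-(t)$; only $N$ and the Leibniz extension of $B_-$ require any work.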
The natural growth operator is a wonderful thing: it commutes with B+ ◦B− , the operator that makes the Hopf algebra so structured; hence it preserves the kernel of B− ; like B− , it acts as a derivative; like B+ , it adds a node and increases the degree of primitivity;
finally, it identifies the unique [2] noncocommutative Hopf subalgebra $\mathcal{H}_{CM}$, with linear basis $\delta_n := N^{n-1}(l_1)$. Constructing $N(p_4)$ and $N(\tilde{p}_4)$, we verified that they are in the kernel of $U_2$ and the range of $U_1$. It might thus appear that some linear combination of them with $B_+(p_4)$, $B_+(\tilde{p}_4)$ and the product terms $\{p_1 p_4, p_1 \tilde{p}_4, p_2 p_3\}$ solves the problem of constructing $W(p_2, p_3)$. Remarkably, this turns out not to be the case. Rather, we find that application of

$S_1 := N + (B_+ - L) \circ Y$   (58)

to a 1-primitive gives a 1-primitive of higher weight. Here $Y$ is the grading operator, which multiplies each tree by its weight and operates on products by a Leibniz rule. Thus $N(p_4)$ and $N(\tilde{p}_4)$ are linear combinations of $\{B_+(p_4), B_+(\tilde{p}_4), p_1 p_4, p_1 \tilde{p}_4\}$ and 1-primitives. Instead of constructing the missing weight-5 nonproduct 2-primitive, we discovered how to generate all the 1-primitives with weight $n \leq 5$. We have $L(e) = p_1 = l_1$, at $n = 1$; $S_1(p_1) = 2p_2$, at $n = 2$; $S_1(p_2) = 3p_3$, at $n = 3$. At $n = 4$, we obtain $S_1(p_3) = 4p_4 - \tilde{p}_4$, to which we adjoin $p_4$, from the ladder construction (8) of Sect. 3. Then we obtain the 1-primitives at $n = 5$ as $p_5$, $S_1(p_4)$ and $S_1^2(p_3)$.

We then found a generalization of (58), which solves the problem of constructing $W(p_2, p_3)$. Operating on a weight-$n$ 1-primitive with

$S_k := S_1 - \frac{k-1}{2}\,(B_+ - L) \circ N^{k-1}$   (59)

we create a $k$-primitive of weight $n + k$. In particular,

$W(p_2, p_3) = \frac{8 S_2(p_3) - 7 N \circ S_1(p_3)}{12} - p_2 p_3$   (60)

completes the construction of weight-5 2-primitives. More generally, we found that

$W(p_2, X_n) = \frac{2\,O_2(X_n)}{n(n+1)} - p_2 X_n,$   (61)

$O_2 := S_2 \circ (Y + \mathrm{id}) - \frac{1}{2}\, N \circ S_1 \circ (Y + \mathrm{id})$   (62)

gives $U_1 \circ W(p_2, X_n) = p_2 \otimes X_n - X_n \otimes p_2$, where $X_n$ is a 1-primitive with weight $n$. We remark that (61) lies in the kernel of $C$, for all $n > 1$. However, it is not yet clear how to generalize this construction to obtain, for example, $W(p_3, p_4)$ and $W(p_3, \tilde{p}_4)$ at weight $n = 7$. The key to this issue is an extension [footnote 1] of the concept of natural growth.

5.4. Natural growth by appending sums of forests. Let $F$ be a forest. We define $N_F(X)$ to be the sum of forests obtained by appending $F$ to every node of $X$, in turn. To append $F = \prod_j t_j$ to a particular node, one connects the roots of all the $t_j$ to that node. We note that $N_F$ obeys a Leibniz rule, with $N_F(e) = 0$ and $N_F(l_1) = B_+(F)$. We have already encountered two examples, namely the grading operator $Y := N_e$, which merely counts nodes, and the simplest natural growth operator $N := N_{l_1}$, which appends a single node.

[Footnote 1] Our extension of natural growth allows a suitable extension of the Lie algebra dual to $\mathcal{H}_R$, as was observed by Alain Connes. This will be presented in a sequel to [7].

Finally, with $Z = F_1 + F_2$, we make $N_Z := N_{F_1} + N_{F_2}$ linear in its subscript, as well as its argument. The commutation relations (55,56) then generalize to

$[N_Z, B_+] = B_+ \circ L_Z,$   (63)

$[B_-, N_Z] = L_Z \circ B_-$   (64)

with $L_Z(X) := ZX$. Thus $[N_Z, C] = 0$ and $N_Z$ preserves the kernel of $C$ for all $Z \in \mathcal{H}_R$. The great virtue of this construct is that it gives

$U_1 \circ N_Z(X) = Z \otimes Y(X)$   (65)

when both $Z$ and $X$ are 1-primitive.

Proof. We use the shorthand notation $\Delta(X) = X \otimes e + e \otimes X + X' \otimes X''$ for any Hopf algebra element $X$, with the final term denoting a sum over tensor products containing no scalars. Let $Z$ be any 1-primitive. Then

$U_1 \circ N_Z(X) = N_Z(X') \otimes X'' + X' \otimes N_Z(X'') + (L_Z \otimes Y) \circ \Delta(X)$   (66)

consists of terms in which $X'$ or $X''$ grow naturally, with a final contribution where $Z$ is itself completely cut from any node to which it was connected by $N_Z$, with the grading operator $Y$ acting on the right, to count the number of cuts. The case with $Z = l_1$ was proven in [2], by an analysis of admissible cuts. Here, where $Z$ is 1-primitive, we obtain a result of the same form, since the internal cuts of $Z$ cancel when $U_1(Z) = 0$. (A more general formula, for arbitrary $Z$, can be given but is not required here.) When $X$ is 1-primitive, with $X' = X'' = 0$, we obtain (65) from $L_Z \otimes Y$ acting on the second term of $\Delta(X) = X \otimes e + e \otimes X$.

The result (65) immediately proves that $H_2(x) = [H_1(x)]^2$, since it shows that each pairing $N_{X_1}(X_2)$ of 1-primitives gives an element of $\mathcal{H}_2$ that is inequivalent to any other pairing. Hence (24) is saturated at $k = 2$. Now we define the iteration

$V_{k+1}(X_1, \ldots, X_k, X_{k+1}) := N_{V_k(X_1, \ldots, X_k)}(X_{k+1})$   (67)

for $k > 0$, with $V_1 := \mathrm{id}$. Then, for example, $V_2(X_1, X_2) := N_{X_1}(X_2)$ and

$V_3(X_1, X_2, X_3) := N_{N_{X_1}(X_2)}(X_3) \neq N_{X_1}\left(N_{X_2}(X_3)\right).$   (68)

We remark that a Hochschild boundary can be defined for maps $V_{k+1}: \mathcal{H}_R^{\otimes(k+1)} \to \mathcal{H}_R$. For this, it is sufficient to define terms of the form $V_k(X_1, \ldots, X_j X_{j+1}, \ldots, X_{k+1})$, where one argument is a product. Natural growth by forests supplies this. Consequences will be described in future work. For the present, we are content with the following result.

Theorem. The dimensions $H_{n,k}$ of the bigrading of the Hopf algebra of undecorated rooted trees, by weight $n$ and degree of primitivity $k$, are generated by (1).

Proof. Let $X_1, X_2, \ldots, X_k$ be 1-primitives, which need not be distinct. Then

$U_{k-1} \circ V_k(X_1, X_2, \ldots, X_k) = X_1 \otimes Y(X_2) \otimes \ldots \otimes Y(X_k)$   (69)

by coassociativity and iteration of the argument that led to (65). Thus $H_k(x) = [H_1(x)]^k$ saturates (24). Then $H(x, y) = 1/(1 - H_1(x)y)$ gives $R(x) = x/(1 - H_1(x))$, at $y = 1$. Solving for $H_1(x) = 1 - x/R(x)$, we obtain (1).
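The key identity (65) can be tested directly on the lowest ladder primitives. The following sketch is an illustration under stated assumptions rather than the authors' computation: trees are sorted tuples of child trees, forests are sorted tuples of trees, tensor products are pairs of forests, and the coproduct is generated recursively from multiplicativity and the Hochschild property (36). The helper names (`coproduct`, `U1`, `N_Z`, `Z_tensor_YX`) are ours.

```python
from collections import Counter
from fractions import Fraction

L1 = ()   # single-node tree l_1
E = ()    # empty forest, the unit e

def canon(trees):
    return tuple(sorted(trees))

def weight(F):
    """Number of nodes of a forest."""
    return sum(1 + weight(t) for t in F)

def coproduct(F):
    """Delta on a basis forest: multiplicative, with Delta(B+(G)) from (36)."""
    out = Counter({(E, E): 1})
    for t in F:
        dt = Counter({((t,), E): 1})                  # t (x) e
        for (a, b), c in coproduct(t).items():        # t = B+(G), with G = t read as a forest
            dt[(a, (b,))] += c                        # (id (x) B+) Delta(G)
        nxt = Counter()
        for (a1, b1), c1 in out.items():
            for (a2, b2), c2 in dt.items():
                nxt[(canon(a1 + a2), canon(b1 + b2))] += c1 * c2
        out = nxt
    return out

def U1(X):
    """(P (x) P) Delta: keep only terms with no scalar entry."""
    out = Counter()
    for F, c in X.items():
        for (a, b), m in coproduct(F).items():
            if a != E and b != E:
                out[(a, b)] += c * m
    return {k: v for k, v in out.items() if v}

def append_to_tree(t, G):
    """All ways of connecting the roots of the forest G to one node of t."""
    out = Counter({canon(t + G): 1})
    for i, child in enumerate(t):
        for grown, m in append_to_tree(child, G).items():
            out[canon(t[:i] + (grown,) + t[i + 1:])] += m
    return out

def N_Z(Z, X):
    """Natural growth of X by Z, linear in both Z and X."""
    out = Counter()
    for G, cz in Z.items():
        for F, cx in X.items():
            for i, t in enumerate(F):
                for grown, m in append_to_tree(t, G).items():
                    out[canon(F[:i] + (grown,) + F[i + 1:])] += cz * cx * m
    return {k: v for k, v in out.items() if v}

def Z_tensor_YX(Z, X):
    """Right-hand side of (65): Z (x) Y(X)."""
    return {(G, F): cz * cx * weight(F) for G, cz in Z.items() for F, cx in X.items()}

# Ladder 1-primitives p_1, p_2, p_3 as given in the text.
l2 = (L1,)
l3 = (l2,)
p1 = {(L1,): 1}
p2 = {(l2,): 1, (L1, L1): Fraction(-1, 2)}
p3 = {(l3,): 1, canon((l2, L1)): -1, (L1, L1, L1): Fraction(1, 3)}

if __name__ == "__main__":
    print("p2, p3 are 1-primitive:", U1(p2) == {} and U1(p3) == {})
    ok = all(U1(N_Z(Z, X)) == Z_tensor_YX(Z, X)
             for Z in (p1, p2, p3) for X in (p1, p2, p3))
    print("identity (65) holds on all nine pairings:", ok)
```

Because the pruned part of each cut is placed on the left and the part containing the root on the right, the asymmetry of (65) appears directly in the pair ordering.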
5.5. Comments on the main theorem. Four comments are in order. The first concerns the enumeration of the filtration. This follows from taking logs in (33), which gives

$\log H(x, y) = -\log(1 - H_1(x)y) = -\sum_{n,k} P_{n,k} \log(1 - x^n y^k).$   (70)

Equating coefficients of $y^j$, and setting $x = z^{1/j}$, we obtain

$[H_1(z^{1/j})]^j = \sum_{k|j} k\, P_k(z^{1/k})$   (71)

which is a classic problem in Möbius inversion, yielding (35), after use of (2).

Next, we remark on the number, $C_{n,k}$, of weight-$n$ elements of $\mathcal{H}_k$ that are in the kernel of $C := B_+ \circ B_-$. We have explicitly constructed a filtration of the bigrading, for weights $n < 7$, in which the only element with $C(X) \neq 0$ is $l_1$. The iteration (67) proves that there is no obstacle to continuing this process, since the only restriction imposed by $C \circ V_{k+1}(X_1, X_2, \ldots, X_{k+1}) = 0$ is $X_{k+1} \neq l_1$. Thus $\sum_n C_{n,k+1}\, x^n = [H_1(x)]^k (H_1(x) - x)$ and the generating function

$\sum_{n,k} C_{n,k}\, x^n y^k = \frac{(1 - xy) R(x)}{(1 - y) R(x) + xy}$   (72)

differs from (1) only by a factor of $1 - xy$, which removes $l_1$ from the filtration (33). In total, we have $C_n := \sum_k C_{n,k} = r_{n+1} - r_n$ weight-$n$ solutions to $C(X) = 0$. It is easy to see how that comes about: there are $r_{n+1}$ possible forests in $X$, subject to the $r_n$ conditions that the coefficient of every tree in $C(X)$ vanishes. The result $C_n = r_{n+1} - r_n$ proves the independence of these conditions. Hence an element $X$ of the kernel of $C$ is uniquely identified by the contribution $\widetilde{X}$ that contains no pure trees, since $X = \widetilde{X} - C(\widetilde{X})$. Finally, the filtration of the bigrading of the kernel of $C$ differs from that of the full Hopf algebra only by the absence of $l_1$. These distinctive features frustrate every attempt to decrease primitivity by the action of $B_-$ on any nonproduct element except the single-node tree. One may climb up the ladder of primitivity with great ease, yet descent is impossible, save in one trivial case. In a sense, the second grading is characterized by the profound difficulty of constructing its 1-primitives. At first meeting, this makes it difficult to fathom. Then one realizes that the structure is beautifully tuned to prevent casual construction.

Our third comment concerns the remarkable operator $O_2$ in (62), which provides a way of solving $U_1 \circ W(p_2, X) = p_2 \otimes X - X \otimes p_2$. A second way is provided by $N_{p_2}$. These solutions need not be the same; they may differ by a 1-primitive. In general, they will differ, since $N_{p_2}$ acts by a Leibniz rule, while $O_2$ does not. Hence

$T_2 := O_2 - N_{p_2} \circ (Y + \mathrm{id})$   (73)

provides a second shift operator that creates 1-primitives, when applied to 1-primitives. It gives information that is not provided by S1 in (58). For example, at weight n = 6 we already know how to construct 4 of the H6,1 = 8 primitives, by applying powers of S1 to the ladder primitives constructed in (8). Of the missing 4, the constructs T2 (p4 ) and T2 ◦S1 (p3 ) provide 2. For the remaining 2, which are now proven to exist, we laboriously solved U1 (X) = 0 at weight n = 6, working with tensor products of the 38 forests with up to 6 nodes. At first sight, one might hope to add a few more shift operators, to arrive at a set that is sufficient to construct 1-primitives up to some large weight, without having
to solve the fearsome explosion of linear equations required by the vanishing of all tensor products in $U_1(X) = 0$. This seems not to be the case; the construction of 1-primitives appears to be a deeply nontrivial challenge. Asymptotically, no more than a fraction $1/c$ of what is necessary may be provided by $S_1$, and no more than $1/c^2$ by $T_2$, which increases weight by 2 units. The number of similarly constructed operators that change weight by $n$ cannot exceed the number $H_{n,1}$ of weight-$n$ 1-primitives. Constructing a finite number of these, we obtain merely an asymptotic fraction $f < H_1(1/c) = 1 - 1/c < 1$ of what is needed. Hence we envisage no easy route to the construction of 1-primitives, short of solving the tensorial defining property $\Delta(X) = X \otimes e + e \otimes X$. Thereafter, the problem of constructing $k$-primitives is completely solved by (67), which shows that the 1-primitives of weight $n > 1$ are enumerated by those elements of the kernel of $C$ that cannot be generated by any process of natural growth acting on 1-primitives of lesser weight. This negative criterion appears even harder to implement than the tensorial definition $U_1(X) = 0$, which we were able to solve at $n = 9$, by explicit computation of the 98-dimensional kernel of a $3214 \times 719$ matrix of integers.

Finally, we remark that we have explicit constructions of the bigradings (30,32) of the Connes–Moscovici and ladder subalgebras. In the case of $\mathcal{H}_{CM}$ we have merely a pair of 1-primitives: $\delta_1 = l_1$ and $\tilde\delta_2 = N_{\delta_1}(\delta_1) - \frac{1}{2}\delta_1^2$. The only form of natural growth that we are allowed is by a single node: this is the defining restriction. Then we easily construct $\tilde\delta_{k+1} = N_{\delta_1}^k(\delta_1) - 2^{-k} k!\, \delta_1^{k+1}$ as a nonproduct $k$-primitive of weight $k + 1$. This completes the filtration, since any further term would make the number of weight-$n$ products of filtered elements greater than the number of weight-$n$ products of the linear basis. Hence the construction of the Connes–Moscovici bigrading is particularly simple.

In the case of $\mathcal{H}_{ladder}$ the cocommutativity of the ladder restriction (6) of the coproduct means that all $k$-primitives are products at $k > 1$. Here the problem of construction is more demanding, since it is not clear how to generate an infinite set of 1-primitives. Hence one sees that detailed study of ladder diagrams, most notably by Bob Delbourgo and colleagues [12–14], addresses a problem more severe than that posed by the Connes–Moscovici prolegomenon to noncommutative geometry: ladder diagrams are a nontrivial infinite subset of perturbative quantum field theory; even after subtractions of products they provide an infinite subset of 1-primitives, when their bigrading is analyzed. Fortunately, our recent work in [8] provides the explicit construction (8) of the ladder filtration. The reader may try to imagine what might be involved in giving an explicit construction of the 1-primitives of the full Hopf algebra of undecorated rooted trees. Then s/he should contemplate the true challenge of quantum field theory, by recalling that – in physical reality – every node of every rooted tree may be decorated in an infinite number of ways. After half a century, few physicists or mathematicians have even begun to grapple with the true legacy of Dyson, Feynman, Schwinger and Tomonaga.

6. Prospects

In this paper, we were content to study the bigrading of the Hopf algebra of undecorated [4, 8] rooted trees, by the number of nodes and a degree of primitivity analyzed by iterations of the coproduct. The extension of this bigrading to the decorated [3, 7] case is the obvious next step in our plan to decode the rich structure of mature quantum field theory. The present work makes it clear that the key feature will be the nontriviality of $C := B_+ \circ B_- \neq B_- \circ B_+ = \mathrm{id}$. In the undecorated case, we have shown that the bifiltration of the Hopf algebra is obtained by adjoining the single-node tree to the bifiltration of the kernel of $C$.
The proof of this lies in the powerful generalization (67) of
the concept of natural growth, which diagonalizes (69). First results for the commutator [B+ , B− ] = C − id of the decorated Hopf algebra of full quantum field theory were recently given in [9]. These increase our hopes that it will not take another 50 years to complete the characterization of the intricate interrelation of combinatorics and analysis that makes quantum field theory possible. We firmly believe that further elucidation of its structure has much to offer for wide areas of both physics and mathematics. Acknowledgements. This study began during the workshop Number Theory and Physics at the ESI in November 1999, where we enjoyed discussions with Pierre Cartier, Werner Nahm, Ivan Todorov and Jean-Bernard Zuber. Work with Alain Connes at the IHES supports the present paper. System management by Chris Wigglesworth enabled accumulation of crucial data, which Neil Sloane’s superseeker helped us to decode.
Appendix: Asymptotic Enumerations

Here we consider inequalities inferred from Tables 3 and 4 and show that they persist at large weights, thanks to the upper bound $c < 3$ on the Otter constant [17]. Asymptotically, the number of rooted trees is given by

$r_n = c^n n^{-3/2} (b + O(1/n))$   (74)

with Otter constants that we evaluated in [4]:

b = 0.4399240125710253040409033914345447647980854079401198576534935450226354004204764605379862197779782334...,
c = 2.9557652856519949747148175241231945883754923046635965953504724789059647331395749510866682836765813525... .

The asymptotic fraction of trees assigned to primitivity $k$ in the filtration of Table 4 is

$f_k := \lim_{n\to\infty} \frac{P_{n,k}}{r_n} = \frac{1}{c}\left(1 - \frac{1}{c}\right)^{k-1},$   (75)

while the asymptotic fraction of forests in Table 3 is

$g_k := \lim_{n\to\infty} \frac{H_{n,k}}{r_{n+1}} = \frac{k f_k}{c} = \frac{k}{c^2}\left(1 - \frac{1}{c}\right)^{k-1}.$

These follow by using [4] $|1 - R(x)|^2 = O(1 - cx)$, near $x = 1/c$. Numerically,

g_1 = 0.1144616788557279695...,
g_2 = 0.1514735822429146084...,
g_3 = 0.1503401379409753267...,
g_4 = 0.1326357110750687024...,
g_5 = 0.1097026887662558145...,
g_6 = 0.0871054456752243543...,
g_7 = 0.0672417311397409555...,
g_8 = 0.0508484386279160206...,
g_9 = 0.0378509630072558308...,   (76)

with $k = 2$ giving the largest fraction of forests at large $n$. This was not apparent until $n = 28$, where we found that $H_{28,2} = 20\,716\,895\,918$ exceeds $H_{28,3} = 20\,710\,700\,277$. The asymptotic results establish inequalities (53,54) at large $n$, where it is sufficient that $c < 3$. Amusingly, this upper bound and the condition $R(1/c) = 1$ produce a rather tight lower bound

$c = \exp\left( \sum_{k>0} \frac{R(c^{-k})}{k} \right) > \exp\left( 1 + \sum_{k>1} \frac{1}{(3^k - 1)k} \right) > 2.943$   (77)

from the rather loose lower bound $R(x) \geq R_{ladder}(x) = x/(1 - x)$.

References
1. Kreimer, D.: Adv. Theor. Math. Phys. 2, 303 (1998); q-alg/9707029
2. Connes, A., Kreimer, D.: Commun. Math. Phys. 199, 203 (1998); hep-th/9808042
3. Kreimer, D.: Commun. Math. Phys. 204, 669 (1999); hep-th/9810022
4. Broadhurst, D.J., Kreimer, D.: J. Symb. Comput. 27, 581 (1999); hep-th/9810087
5. Kreimer, D.: Adv. Theor. Math. Phys. 3.3, (1999); hep-th/9901099
6. Connes, A., Kreimer, D.: JHEP 9909, 24 (1999); hep-th/9909126
7. Connes, A., Kreimer, D.: Commun. Math. Phys. 210, 249 (2000); hep-th/9912092
8. Broadhurst, D.J., Kreimer, D.: Phys. Lett. B 475 63; hep-th/9912093
9. Kreimer, D.: Lett. Math. Phys. 51, 179 (2000); hep-th/9912290
10. Connes, A., Moscovici, H.: Hopf Algebras, cyclic Cohomology and the transverse Index Theorem. Commun. Math Phys. 198, 199 (1998); IHES/M/98/37; math.DG/9806109
11. Kreimer, D.: J. Knot Th. Ram. 6, 479 (1997); q-alg/9607022
12. Delbourgo, R., Kalloniatis, A., Thompson, G.: Phys. Rev. D 54, 5373 (1996); hep-th/9605107
13. Delbourgo, R., Elliott, D., McAnally, D.S.: Phys. Rev. D 55, 5230 (1997); hep-th/9611150
14. Kreimer, D., Delbourgo, R.: Phys. Rev. D 60, 105025 (1999); hep-th/9903249
15. Sloane, N.J.A.: On-line Encyclopedia of Integer Sequences, http://www.research.att.com/~njas/sequences
16. Hearn, A.C.: REDUCE User's Manual, Version 3.7, March 1999
17. Otter, R.: Annals Math. 49, 583 (1948)
Communicated by A. Jaffe
Commun. Math. Phys. 215, 237 – 238 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Erratum
The Holonomy of the Determinant of Cohomology of an Algebraic Bundle H. Gillet1 , C. Soulé2 1 Department of Mathematics, University of Illinois at Chicago, M/C 249, 851 S. Morgan,
Chicago, IL 60607-7045, USA. E-mail:
[email protected]
2 U.E.R. de Mathématiques, Université de Paris VII, Tour 45-55, 5eme Etage, 75252 Paris Cedex 05, France.
E-mail:
[email protected]. Received: 22 February 2000 / Accepted: 7 July 2000 Commun. Math. Phys. 131, 219–220 (1990)
1. Let $f : X \to Y$ be a smooth map between projective complex manifolds, and $E$ an algebraic vector bundle on $X$. Choose a Kähler form $\omega$ on $X$, hence a metric on the relative tangent space $T_{X/Y}$, and a smooth Hermitian metric on $E$. Let $\gamma : S^1 \to Y$ be a smooth loop on $Y$, and $\mu(\gamma)$ the holonomy along $\gamma$ of the connection associated to the Quillen metric on the determinant line bundle $\lambda(E) = \det Rf_*(E)$. In [BF], Bismut and Freed computed $\mu(\gamma)$ as an adiabatic limit

$\mu(\gamma) = \lim_{\varepsilon \to 0} \exp\left(2\pi i\, \bar\eta^{\varepsilon}(0)\right),$

where $\bar\eta^{\varepsilon}(0) = \eta^{\varepsilon}(0) + \dim \mathrm{Ker}(D^{\varepsilon})$, $D^{\varepsilon}$ is some twisted Dirac operator depending on a real parameter $\varepsilon > 0$, and $\eta^{\varepsilon}(0)$ is the êta invariant of $D^{\varepsilon}$ (see op.cit. for more details). In [GS2] and [GS1] Theorem 3.7, we claimed that $\bar\eta^{\varepsilon}(0)$ is independent of $\varepsilon$. As noticed by J.-M. Bismut, this assertion is incorrect. Indeed, one can find in [BF], Theorem 3.8, an equality expressing

$V = \frac{\partial}{\partial\varepsilon}\, \bar\eta^{\varepsilon}(0)\, \Big|_{\varepsilon=1}$

as some integral over the smooth manifold $f^{-1}(\gamma(S^1))$, and there are cases where this integral does not vanish.

2. An example where $V \neq 0$ can be obtained as follows. Let $X_1 \cong X_2 \cong P^1(\mathbb{C})$, let $X = X_1 \times X_2$ and let $f : X \to Y = P^1(\mathbb{C})$ be the second projection. Assume $E$ is trivial, and denote by $z = x + iy$ and $u$ the affine complex coordinates on $X_1$ and $X_2$ respectively. We let $\omega_0 = a\, dz\, d\bar{z} + b\, du\, d\bar{u}$,
where $a$ (resp. $b$) is a positive smooth function on $X_1$ (resp. $X_2$). Let $\varphi$ be a real smooth function on $X$ such that $\alpha = i\, \partial_{\bar z} \partial_u \varphi$ is real and not identically zero, and such that $\partial_u \partial_{\bar u} \varphi$ does not depend on $z$. We let $\omega = \omega_0 + t\, dd^c \varphi$, where $t$ is a small positive real number. If $t$ goes to zero, and if $A = \log(a)$ and $B = \log(b)$ are big with respect to $dd^c \varphi$, one can compute $V$ from the formula (3.42) in [BF], loc.cit., and one finds that $V$ is as close as one wants to a constant multiple of

$t^2 \int_{P^1(\mathbb{C})} a^{-1} \alpha^2 \left(\partial_x(A)^2 + \partial_y(A)\right)^2 dx\, dy$

multiplied by

$\int_{\gamma(S^1)} b^{-1} (d^c B).$

Therefore, when $a$, $b$ and $\varphi$ are general enough, $V$ is different from zero.

References
[BF] Bismut, J.-M., Freed, D.-S.: The analysis of elliptic families, II. Dirac operators, êta invariants and the holonomy theorem. Commun. Math. Phys. 107, 103–163 (1986)
[GS1] Gillet, H., Soulé, C.: Arithmetic Chow groups and differential characters. In: Algebraic K-theory: connections with geometry and topology, Jardine, J.F., Snaith, V.P. (eds.), Dordrecht: Kluwer Academic, 1989, pp. 30–68
[GS2] Gillet, H., Soulé, C.: The Holonomy of the Determinant of Cohomology of an Algebraic Bundle. Commun. Math. Phys. 131, 219–220 (1990)

Communicated by A. Jaffe
Commun. Math. Phys. 215, 239 – 244 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
More Operator Versions of the Schwarz Inequality Rajendra Bhatia1 , Chandler Davis2 1 Indian Statistical Institute, 7, S.J.S. Sansanwal Marg, New Delhi – 110 016, India.
E-mail:
[email protected]
2 Department of Mathematics, University of Toronto, Toronto M5S3G3, Canada.
E-mail:
[email protected] Received: 14 April 2000 / Accepted: 10 May 2000
Abstract: Some new operator versions of the Schwarz inequality are obtained. One of them is a counterpart of the variance-covariance inequality in the context of noncommutative probability.

The Schwarz inequality has appeared in several avatars. Some of these are its versions for operators [1, 2, 4, 7–9]. More are presented here.

1. A Variance–Covariance Inequality

Let $f$ and $g$ be random variables – elements of the space $L^2(X, \mu)$, where $\mu$ is a probability measure. The covariance between $f$ and $g$ is defined as

$\mathrm{cov}(f, g) = E(\bar{f} g) - \overline{Ef}\, Eg,$   (1)

where $Ef = \int f\, d\mu$ denotes the expectation of $f$. The variance of $f$ is defined as

$\mathrm{var}(f) = \mathrm{cov}(f, f) = E(|f|^2) - |Ef|^2.$   (2)

The inequality

$|\mathrm{cov}(f, g)|^2 \leq \mathrm{var}(f)\, \mathrm{var}(g),$   (3)

much used in statistics, is just the Schwarz inequality. A noncommutative analogue of variance and covariance is defined as follows. Let $B(\mathcal{H})$ be the space of all (bounded linear) operators on a (complex separable) Hilbert space $\mathcal{H}$. Let $\Phi$ be a unital completely positive linear map [10] on $B(\mathcal{H})$. We define the covariance between two operators $A$ and $B$ as

$\mathrm{cov}(A, B) = \Phi(A^* B) - \Phi(A)^* \Phi(B).$   (4)
240
R. Bhatia, C. Davis
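The variance of A is defined as
$$\operatorname{var}(A) = \operatorname{cov}(A, A) = \Phi(A^*A) - \Phi(A)^*\Phi(A). \tag{5}$$
The famed inequality of Kadison [7], extended in several directions by Choi [4], says that
$$\operatorname{var}(A) \ge 0. \tag{6}$$
[We write T ≥ 0 to mean that the operator T is positive (semidefinite).] This generalises the simple fact that var(f) is a nonnegative number. A good generalisation of the inequality (3) is given by the following theorem.

Theorem 1. The 2 × 2 operator matrix
$$\begin{pmatrix} \operatorname{var}(A) & \operatorname{cov}(A, B) \\ \operatorname{cov}(A, B)^* & \operatorname{var}(B) \end{pmatrix} \tag{7}$$
is positive.

Proof. We have to prove that
$$\begin{pmatrix} \Phi(A^*A) & \Phi(A^*B) \\ \Phi(B^*A) & \Phi(B^*B) \end{pmatrix} \ge \begin{pmatrix} \Phi(A)^*\Phi(A) & \Phi(A)^*\Phi(B) \\ \Phi(B)^*\Phi(A) & \Phi(B)^*\Phi(B) \end{pmatrix}. \tag{8}$$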
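First consider the special case when $\Phi$ is the map $\Phi(T) = V^*TV$, where V is an isometry; i.e., $V^*V = I$. Then the inequality (8) can be rewritten as
$$\begin{pmatrix} V^* & 0 \\ 0 & V^* \end{pmatrix} \begin{pmatrix} A^*A & A^*B \\ B^*A & B^*B \end{pmatrix} \begin{pmatrix} V & 0 \\ 0 & V \end{pmatrix} \ge \begin{pmatrix} V^* & 0 \\ 0 & V^* \end{pmatrix} \begin{pmatrix} A^*VV^*A & A^*VV^*B \\ B^*VV^*A & B^*VV^*B \end{pmatrix} \begin{pmatrix} V & 0 \\ 0 & V \end{pmatrix}.$$
This will follow from the inequality
$$\begin{pmatrix} A^*A & A^*B \\ B^*A & B^*B \end{pmatrix} \ge \begin{pmatrix} A^*VV^*A & A^*VV^*B \\ B^*VV^*A & B^*VV^*B \end{pmatrix}.$$
This, in turn, can be written as
$$\begin{pmatrix} A^* & 0 \\ B^* & 0 \end{pmatrix} \begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix} \ge \begin{pmatrix} A^* & 0 \\ B^* & 0 \end{pmatrix} \begin{pmatrix} VV^* & 0 \\ 0 & VV^* \end{pmatrix} \begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix}.$$
But as $VV^* \le I$, this is certainly true. We have proved (8) for the special $\Phi$. The general case follows from this via the Stinespring dilation theorem: there exists a Hilbert space K, an isometry V of H into K, and a ∗-homomorphism π of B(H) into B(K) such that $\Phi(A) = V^*\pi(A)V$.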
Remark 1. It is well-known [1, 4] that $\begin{pmatrix} R & T \\ T^* & S \end{pmatrix} \ge 0$ if and only if R, S are positive and $R \ge T S^{-1} T^*$. Here if S is not invertible $S^{-1}$ is understood to be its generalised inverse. (The same convention is followed in such contexts throughout the paper.) Theorem 1 is thus equivalent to the statement
$$\operatorname{var}(A) \ge \operatorname{cov}(A, B)\,[\operatorname{var}(B)]^{-1}\operatorname{cov}(A, B)^*. \tag{9}$$
This equivalence will be used repeatedly.
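As a numerical illustration of Theorem 1 in its simplest setting, the following Python sketch takes $\Phi$ to be a vector state, $\Phi(T) = \langle x, Tx\rangle$ with a unit vector x in $\mathbb{C}^2$, which has the form $V^*TV$ for the isometry $V: 1 \mapsto x$. In that case the matrix (7) is a 2 × 2 scalar matrix, and its positivity reduces to var(A) ≥ 0, var(B) ≥ 0 and $|\operatorname{cov}(A,B)|^2 \le \operatorname{var}(A)\operatorname{var}(B)$. The function names, the restriction to 2 × 2 matrices and the tolerance are assumptions made for this example only; a random test of this kind is a sanity check, not a proof.

```python
import random

def matmul(P, Q):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(P):
    """Conjugate transpose of a 2x2 matrix."""
    return [[P[j][i].conjugate() for j in range(2)] for i in range(2)]

def state(T, x):
    """Vector state phi(T) = <x, T x>, conjugate-linear in the first slot."""
    Tx = [sum(T[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(x[i].conjugate() * Tx[i] for i in range(2))

def cov(A, B, x):
    """cov(A, B) = phi(A* B) - phi(A)* phi(B), as in (4), for the vector state."""
    return state(matmul(adjoint(A), B), x) - state(A, x).conjugate() * state(B, x)

def random_matrix(rng):
    """Random 2x2 complex matrix with entries in the unit square."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
            for _ in range(2)]

def check_theorem1(trials=1000, seed=0, tol=1e-12):
    """Return True if the scalar form of (7) is positive in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        A, B = random_matrix(rng), random_matrix(rng)
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
        norm = sum(abs(c) ** 2 for c in v) ** 0.5
        x = [c / norm for c in v]          # unit vector defining the state
        var_a = cov(A, A, x).real
        var_b = cov(B, B, x).real
        c_ab = cov(A, B, x)
        # A Hermitian 2x2 scalar matrix is positive iff its diagonal
        # entries and its determinant are nonnegative.
        if var_a < -tol or var_b < -tol:
            return False
        if abs(c_ab) ** 2 > var_a * var_b + tol:
            return False
    return True
```

Calling `check_theorem1()` runs the random trials; the operator-valued case of Theorem 1 would need a positive-semidefiniteness test on block matrices, which this sketch does not attempt.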
Remark 2. The Schwarz inequality proved by Lieb and Ruskai [8] says that
$$\Phi(A^*A) \ge \Phi(A^*B)\,\Phi(B^*B)^{-1}\,\Phi(B^*A), \tag{10}$$
or, equivalently,
$$\begin{pmatrix} \Phi(A^*A) & \Phi(A^*B) \\ \Phi(B^*A) & \Phi(B^*B) \end{pmatrix} \ge 0. \tag{11}$$
The inequality (8) is a considerable strengthening of this result.

Remark 3. Say that a function f is in the Lieb class L if f(R) ≥ 0 whenever R ≥ 0, and $|f(T)|^2 \le f(R)f(S)$ whenever $\begin{pmatrix} R & T \\ T^* & S \end{pmatrix} \ge 0$. Several examples of such functions may be found in [2, pp. 268–270]. Many Schwarz-type inequalities for such f may thus be obtained from (7). For example, we have
$$|||\operatorname{cov}(A, B)|||^2 \le |||\operatorname{var}(A)|||\;|||\operatorname{var}(B)|||, \tag{12}$$
for every unitarily invariant norm. This gives a variety of good generalisations of (3).

It is often of interest to weaken the hypothesis that the map $\Phi$ be completely positive. We will comment below on this much studied class of maps [4]:

Definition. Let $\Phi$ be a linear map between C*-algebras. We say that $\Phi$ is n-positive in case the condition $[A_{ij}]_{ij} \ge 0$ on an n × n operator matrix $[A_{ij}]$ implies $[\Phi(A_{ij})]_{ij} \ge 0$. Thus ordinary positivity is 1-positivity; complete positivity is n-positivity for all n.

Remark 4. If $\Phi$ is assumed only to be a unital positive linear map then the inequality (6) is not always true. It is true under additional hypotheses such as self-adjointness of A. However, the inequalities (6) and (11) are true if $\Phi$ is just assumed to be 2-positive. We do not know whether we have (8) under this weaker condition. If $\Phi$, in addition to being 2-positive, has the averaging property [5] $\Phi(A\Phi(B)) = \Phi(A)\Phi(B)$, then the inequality (8) does hold. Another strengthening is given in Remark 7.

Remark 5. When $\Phi$ is the identity map, the inequality (10) reduces to
$$A^*A \ge A^*B(B^*B)^{-1}B^*A. \tag{13}$$
An easy proof of this is given in [9]. The operator $B(B^*B)^{-1}B^*$ is idempotent and Hermitian. Hence, $I \ge B(B^*B)^{-1}B^*$; and (13) follows at once.

Remark 6. The argument in the proof of Theorem 1 can be used to show that for any operators $A_1, \ldots, A_n$, the n × n block operator matrix $[\operatorname{cov}(A_i, A_j)]$ is positive.

Remark 7. The referee has pointed out to us an ingenious proof of (8) (that is, the conclusion of Theorem 1) under the hypothesis that $\Phi$ is unital and 4-positive. This will be sketched here. From the easily verified relation
$$\begin{pmatrix} A^*A & A^*B & A^* & A^* \\ B^*A & B^*B & B^* & B^* \\ A & B & I & I \\ A & B & I & I \end{pmatrix} \ge 0$$
and the 4-positivity of $\Phi$ follows
$$\begin{pmatrix} \Phi(A^*A) & \Phi(A^*B) & \Phi(A)^* & \Phi(A)^* \\ \Phi(B^*A) & \Phi(B^*B) & \Phi(B)^* & \Phi(B)^* \\ \Phi(A) & \Phi(B) & I & I \\ \Phi(A) & \Phi(B) & I & I \end{pmatrix} \ge 0.$$
Applying again the equivalence in Remark 1, this yields (8). The referee remarks that this improvement can be made equally to the generalisation in Remark 6; here one assumes of $\Phi$ only that it is unital and 2n-positive.

2. An Operator Version of the Wielandt Inequality

Let A be a positive operator on H. For any two vectors x, y in H, we have from the Schwarz inequality
$$|\langle x, Ay\rangle|^2 \le \langle x, Ax\rangle\,\langle y, Ay\rangle. \tag{14}$$
A well-known inequality of Wielandt [6, p. 443] gives a much improved inequality in the special case when x and y are orthogonal. If mI ≤ A ≤ MI, and x ⊥ y, then
$$|\langle x, Ay\rangle|^2 \le \left(\frac{M - m}{M + m}\right)^2 \langle x, Ax\rangle\,\langle y, Ay\rangle. \tag{15}$$
From this one can derive another well-known result called the Kantorovich inequality: for every unit vector x,
$$\langle x, Ax\rangle\,\langle x, A^{-1}x\rangle \le \frac{(M + m)^2}{4Mm}. \tag{16}$$
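The two scalar inequalities (15) and (16) can be checked by hand in the smallest nontrivial case. The Python sketch below assumes A = diag(m, M) acting on $\mathbb{R}^2$ with 0 < m < M, and sweeps the orthonormal pair x = (cos θ, sin θ), y = (−sin θ, cos θ); the function name and the choice of grid are hypothetical and serve only this illustration. It returns the smallest slack found in each inequality, which is (up to rounding) zero at θ = π/4, where both bounds are attained.

```python
import math

def wielandt_kantorovich_gaps(m, M, steps=360):
    """Smallest slack found in (15) and (16) for A = diag(m, M), 0 < m < M."""
    ratio = ((M - m) / (M + m)) ** 2          # constant in the Wielandt bound (15)
    kant = (M + m) ** 2 / (4 * M * m)         # constant in the Kantorovich bound (16)
    gap15 = gap16 = float("inf")
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        c, s = math.cos(theta), math.sin(theta)
        # x = (c, s) and y = (-s, c) are orthonormal in R^2.
        xAx = m * c * c + M * s * s           # <x, A x>
        yAy = m * s * s + M * c * c           # <y, A y>
        xAy = (M - m) * c * s                 # <x, A y>
        xAinvx = c * c / m + s * s / M        # <x, A^{-1} x>
        gap15 = min(gap15, ratio * xAx * yAy - xAy ** 2)
        gap16 = min(gap16, kant - xAx * xAinvx)
    return gap15, gap16
```

For instance, `wielandt_kantorovich_gaps(1.0, 4.0)` samples θ in steps of one degree; both returned gaps should be nonnegative up to floating-point error.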
See [6] for details. We discuss operator versions of these inequalities.

Let A be a positive operator and X, Y any two operators. Replacing the A and B in (13) by $A^{1/2}X$ and $A^{1/2}Y$, respectively, we obtain
$$X^*AY\,(Y^*AY)^{-1}\,Y^*AX \le X^*AX. \tag{17}$$
From this we get for every 2-positive linear map $\Phi$
$$\Phi(X^*AY)\,\Phi(Y^*AY)^{-1}\,\Phi(Y^*AX) \le \Phi(X^*AX). \tag{18}$$
(See Remark 1.) This is an operator version of (14). A similar extension of (15) is given below.

Theorem 2. Let A be a positive operator on H with mI ≤ A ≤ MI. Let X, Y be two partial isometries in H whose final spaces are orthogonal to each other. Let $\Phi$ be a 2-positive linear map on B(H). Then
$$\Phi(X^*AY)\,\Phi(Y^*AY)^{-1}\,\Phi(Y^*AX) \le \left(\frac{M - m}{M + m}\right)^2 \Phi(X^*AX). \tag{19}$$
Proof. An operator version of (16) is known [3, 9]. It says that for every positive unital linear map $\Phi$,
$$\Phi(A^{-1}) \le \frac{(M + m)^2}{4Mm}\,\Phi(A)^{-1}. \tag{20}$$
Now consider a direct sum decomposition $H = H_1 \oplus H_2$, and a corresponding block decomposition of A as
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.$$
Then we have
$$A^{-1} = \begin{pmatrix} (A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1} & * \\ * & * \end{pmatrix}.$$
See [6, p. 472]. If we put $\Phi(A) = A_{11}$, we get from (20)
$$(A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1} \le \frac{(M + m)^2}{4Mm}\,A_{11}^{-1}.$$
This is equivalent to
$$A_{11} - A_{12}A_{22}^{-1}A_{21} \ge \frac{4Mm}{(M + m)^2}\,A_{11},$$
and thereby to
$$A_{12}A_{22}^{-1}A_{21} \le \left(\frac{M - m}{M + m}\right)^2 A_{11}.$$
If X, Y are projections onto $H_1$ and $H_2$, respectively, then this inequality can be written as
$$X^*AY\,(Y^*AY)^{-1}\,Y^*AX \le \left(\frac{M - m}{M + m}\right)^2 X^*AX. \tag{21}$$
A minor argument shows that this remains true if X, Y are mutually orthogonal projections whose ranges do not span all of H. This proves the inequality (19) in the special case when $\Phi$ is the identity map. Let α = (M − m)/(M + m). As pointed out in Remark 1, the inequality (21) is equivalent to the statement
$$\begin{pmatrix} \alpha X^*AX & X^*AY \\ Y^*AX & \alpha Y^*AY \end{pmatrix} \ge 0.$$
From this we get the inequality (19) for every 2-positive linear map $\Phi$.
Remark 8. The referee has proved, under the stronger hypothesis that $\Phi$ is 3-positive, a somewhat stronger inequality than our (19).

Remark 9. The inequality (21) was proved recently in [11] by a different argument. In the scalar case generally the Wielandt inequality (15) is used to derive the Kantorovich inequality (16). In our proof for the operator version we have gone in the opposite direction. Similar ideas have been used by F. Zhang [12].

Acknowledgements. The second author thanks the Indian Statistical Institute and NSERC (Canada) for sponsoring his visit to Delhi in January 2000, when this work was done.
References
[1] Ando, T.: Concavity of certain maps on positive definite matrices and applications to Hadamard products. Linear Algebra and its Applications 26, 203–241 (1979)
[2] Bhatia, R.: Matrix Analysis. Berlin–Heidelberg–New York: Springer, 1997
[3] Bhatia, R., Davis, Ch.: A better bound on the variance. American Mathematical Monthly 107, 353–356 (2000)
[4] Choi, M.-D.: Some assorted inequalities for positive linear maps on C∗-algebras. Journal of Operator Theory 4, 271–285 (1980)
[5] Davis, Ch.: Various averaging operations onto subalgebras. Illinois J. Math. 3, 538–553 (1959)
[6] Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge: Cambridge University Press, 1985
[7] Kadison, R.V.: A generalized Schwarz inequality and algebraic invariants for C∗-algebras. Ann. of Math. 56, 494–503 (1952)
[8] Lieb, E.H., Ruskai, M.B.: Some operator inequalities of the Schwarz type. Advances in Math. 12, 269–273 (1974)
[9] Marshall, A.W., Olkin, I.: Matrix versions of Cauchy and Kantorovich inequalities. Aequationes Math. 40, 89–93 (1990)
[10] Stinespring, W.F.: Positive functions on C∗-algebras. Proc. Amer. Math. Soc. 6, 211–216 (1955)
[11] Wang, S.-G., Ip, W.-C.: A matrix version of the Wielandt inequality and its applications to statistics. Lin. Alg. Appl. 296, 171–181 (1999)
[12] Zhang, F.: Matrix inequalities in the Löwner ordering by means of block matrices and Schur complements. Preprint
Communicated by H. Araki
Commun. Math. Phys. 215, 245 – 250 (2000)
Absolute Continuity of the Floquet Spectrum for a Nonlinearly Forced Harmonic Oscillator
Sandro Graffi1, Kenji Yajima2
1 Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato 5, 40127 Bologna, Italy. E-mail: [email protected]
2 Department of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-7815, Japan. E-mail: [email protected]
Received: 23 March 2000 / Accepted: 24 May 2000
Abstract: We prove that the Floquet spectrum of the time periodic Schrödinger equation $i\,\partial u/\partial t = -\tfrac12 \Delta u + \tfrac12 x^2 u + 2\varepsilon(\sin t)x_1 u + \mu V(t, x)u$, corresponding to a mildly nonlinear resonant forcing, is purely absolutely continuous for µ suitably small.

1. Introduction and Statement of the Result

It is well known [HLS] that the spectrum of the Floquet operator of the resonant, linearly forced harmonic oscillator
$$i\frac{\partial u}{\partial t} = -\frac12 \Delta u + \frac12 x^2 u + 2\varepsilon(\sin t)x_1 u, \qquad x = (x_1, \ldots, x_n) \in \mathbb{R}^n,\ \varepsilon > 0$$
is purely absolutely continuous. We show in this paper that the absolute continuity of the Floquet spectrum persists under time-periodic perturbations growing no faster than linearly at infinity provided the resonance condition still holds. Thus we consider the time-dependent Schrödinger equation
$$i\frac{\partial u}{\partial t} = -\frac12 \Delta u + \frac12 x^2 u + 2\varepsilon(\sin t)x_1 u + \mu V(t, x)u \tag{1.1}$$
and suppose that V(t, x) is a real-valued smooth function of (t, x), 2π-periodic with respect to t, increasing at most linearly as |x| goes to infinity:
$$|\partial_x^\alpha V(t, x)| \le C_\alpha, \qquad |\alpha| \ge 1. \tag{1.2}$$
Partly supported by MURST, National Research Project “Sistemi dinamici” and by Università di Bologna, Funds for Selected Research Topics. Partly supported by the Grant-in-Aid for Scientific Research, The Ministry of Education, Science, Sports and Culture, Japan #11304006
Under this condition Eq. (1.1) generates a unique unitary propagator U(t, s) on the Hilbert space $L^2(\mathbb{R}^n)$. The Floquet operator is the one-period propagator U(2π, 0) and we are interested in the nature of its spectrum. It is well known that the long time behaviour of the solutions of (1.1) can be characterized by means of the spectral properties of the Floquet operator ([KY]). Our main result in this paper is the following theorem.

Theorem 1.1. Let V be as above. Then, for $|\mu| \sup_{t,x} |\partial_{x_1} V(t, x)| < \varepsilon$, the spectrum of the Floquet operator U = U(2π, 0) is purely absolutely continuous.

Remark. The above result can be understood in terms of the classical resonance phenomenon. If V = 0 all motions generated by the classical Hamiltonian $\tfrac12(p^2 + x^2) + 2\varepsilon x_1 \sin t$ undergo a resonance between the proper frequency of the harmonic motions and the frequency of the linear forcing term: as a consequence, for any given initial condition the classical motion eventually diverges to infinity by oscillations of linearly increasing amplitude. The quantum counterpart of this phenomenon is the absolute continuity of the Floquet spectrum [HLS]. One might ask whether this absolute continuity is stable under perturbations which destroy the linearity of the forcing potential. Theorem 1.1 establishes the stability under perturbations which make the forcing a non-linear one but do not destroy the resonance phenomenon because all initial conditions still diverge by oscillations to infinity. Therefore, the fact that all initial conditions for the corresponding classical motions satisfy the resonance condition seems an almost necessary condition for the spectral absolute continuity of the Floquet spectrum. Indeed it is known that the Schrödinger equation $i\,\partial u/\partial t = -\tfrac12 \Delta u + \varepsilon|x|^\alpha u + \mu V(\omega t, x)u$, with bounded V(t + 2π, x) = V(t, x), α > 2, ω ∈ R, whose classical counterpart admits but a dense set of resonant initial conditions, has no absolutely continuous part in its Floquet spectrum if $V \in C^2$ ([H]); moreover it has pure point spectrum for a large set of non-resonant ω for µ small and $V \in C^r$ (r suitably large), provided V satisfies a supplementary condition on its matrix elements ([DS]).

Notation. We use the vector notation: for the multiplication operator $X_j$ by the variable $x_j$ and the differential operator $D_j = \frac{1}{i}\frac{\partial}{\partial x_j}$, j = 1, . . . , n, we denote $X = (X_1, \ldots, X_n)$ and $D = (D_1, \ldots, D_n)$. For a measurable function W and a set of commuting selfadjoint operators $H = (H_1, \ldots, H_n)$, W(H) is the operator defined via functional calculus. We have the identity
$$U^* W(H) U = W(U^* H U) \tag{1.3}$$
for any unitary operator U.

2. Proof of the Theorem

It is well known ([Ya]) that the nature of the spectrum of the Floquet operator U is the same (apart from multiplicities) as that of the Floquet Hamiltonian formally given by
$$Ku = -i\frac{\partial u}{\partial t} - \frac12 \Delta u + \frac12 x^2 u + 2\varepsilon(\sin t)x_1 u + \mu V(t, x)u \tag{2.1}$$
on the Hilbert space $K = L^2(\mathbb{T}) \otimes L^2(\mathbb{R}^n)$, where $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ is the circle. More precisely, if K is the generator of the one-parameter strongly continuous unitary group U(σ), σ ∈ R, defined by
$$(U(\sigma)u)(t) = U(t, t - \sigma)u(t - \sigma), \qquad u = u(t, \cdot) \in K, \tag{2.2}$$
then $U(2\pi) = e^{-i2\pi K}$ is unitarily equivalent to 1 ⊗ U(2π, 0). We set $D \equiv C^\infty(\mathbb{T}, S(\mathbb{R}^n))$. It is easy to see that:
1. The function space D is dense in K.
2. D is invariant under the action of the group U(σ).
3. D is a subset of the domain D(K) of K and, for u ∈ D, Ku is given by the right-hand side of (2.1).
It follows that D is a core for K ([RS]) and K is the closure of the operator defined by (2.1) on D.

We introduce four unitary operators $U_0, \ldots, U_3$ on K and successively transform K by $U_j$ as follows. Write $H_0$ for the selfadjoint operator on $L^2(\mathbb{R}^n)$ defined by
$$H_0 = -\frac12 \Delta + \frac12 x^2 - \frac12$$
with the domain $D(H_0) = \{u \in L^2(\mathbb{R}^n) : D^2 u,\ x^2 u \in L^2(\mathbb{R}^n)\}$ and define
$$U_0 u(t, \cdot) = e^{-itH_0} u(t, \cdot), \qquad u \in K. \tag{2.3}$$

Proposition 2.1. (1) The operator $U_0$ is well defined on K and is unitary. (2) $U_0$ maps D onto itself. (3) For u ∈ D, $K_1 \equiv U_0^* K U_0$ is given by
$$K_1 u = -i\frac{\partial u}{\partial t} + 2\varepsilon \sin t\,(X_1 \cos t + D_1 \sin t)u + \mu V(t, X\cos t + D\sin t)u + \frac{u}{2}. \tag{2.4}$$
(4) D is a core of $K_1$.

Proof. It is well-known that $\sigma(H_0) = \{0, 1, \ldots\}$ and we have $e^{-2\pi n i H_0} = 1$. Hence (2.3) defines a unitary operator on K. We have $S(\mathbb{R}^n) = \cap_{k=1}^{\infty} D(H_0^k)$ and (2) follows. (3) follows from the identity (1.3) and the well-known formulae
$$e^{itH_0} X e^{-itH_0} = X\cos t + D\sin t, \qquad e^{itH_0} D e^{-itH_0} = -X\sin t + D\cos t.$$
Since D is a core of K and $U_0$ maps D onto itself, D is also a core for $K_1$.

Note that for any linear function aX + bD + c of X and D, and W satisfying (1.2), W(aX + bD + c) is the pseudo-differential operator with the Weyl symbol W(ax + bξ + c) ([Hö]). To eliminate the term $2\varepsilon X_1 \sin t \cos t$ from $K_1$, we define
$$U_1 u(t, x) = e^{i\varepsilon(\cos 2t)x_1/2} u(t, x). \tag{2.5}$$
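It is easy to see that $U_1$ maps D onto itself and we have
$$U_1^* \left(-i\frac{\partial}{\partial t}\right) U_1 = -i\frac{\partial}{\partial t} - \varepsilon(\sin 2t)X_1, \qquad U_1^* D U_1 = D + \frac{\varepsilon\cos 2t}{2}\,e_1,$$
on D. It follows that $K_2 \equiv U_1^* K_1 U_1$ is given by the closure of
$$K_2 u = -i\frac{\partial u}{\partial t} + 2\varepsilon(\sin^2 t)D_1 u + \varepsilon^2(\sin^2 t \cos 2t)u + \mu V\!\left(t, X\cos t + \sin t\left(D + \frac{\varepsilon\cos 2t}{2}\,e_1\right)\right)u + \frac{u}{2} \tag{2.6}$$
defined on D. We write $2\varepsilon(\sin^2 t)D_1 = \varepsilon D_1 - \varepsilon(\cos 2t)D_1$ in the right side of (2.6). Next, to eliminate the term $-\varepsilon(\cos 2t)D_1$, we define
$$U_2 u(t, x) = e^{i\varepsilon(\sin 2t)D_1/2} u(t, x) = u(t, x + \varepsilon(\sin 2t)e_1/2).$$
Then, $U_2$ maps D onto itself and we have on D,
$$U_2^* \left(-i\frac{\partial}{\partial t}\right) U_2 = -i\frac{\partial}{\partial t} + \varepsilon(\cos 2t)D_1, \qquad U_2^* X U_2 = X - \frac{\varepsilon\sin 2t}{2}\,e_1.$$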
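It follows, also with the help of the identity (1.3), that $K_3 \equiv U_2^* K_2 U_2$ is the closure of the operator given on D by
$$K_3 u = -i\frac{\partial u}{\partial t} + \varepsilon D_1 u + \varepsilon^2(\sin^2 t \cos 2t)u + \mu V\!\left(t, X\cos t + D\sin t - \frac{\varepsilon\sin t}{2}\,e_1\right)u + \frac{u}{2}.$$
Here we also used the obvious identity $\cos 2t \sin t - \cos t \sin 2t = -\sin t$. We write now
$$(\sin^2 t)\cos 2t = \frac12 \cos 2t - \frac14 \cos 4t - \frac14,$$
and define
$$U_3 u(t, x) = e^{-i\varepsilon^2(\sin 2t)/4 + i\varepsilon^2(\sin 4t)/16}\, u(t, x). \tag{2.7}$$
Again $U_3$ maps D onto itself and $L \equiv U_3^* K_3 U_3$ is the closure of the operator given on D by
$$Lu = -i\frac{\partial u}{\partial t} + \varepsilon D_1 u + \frac{(2 - \varepsilon^2)}{4}\,u + \mu V\!\left(t, X\cos t + D\sin t - \frac{\varepsilon e_1 \sin t}{2}\right)u. \tag{2.8}$$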
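The scalar bookkeeping in the last step can be checked numerically. The Python sketch below is only an illustration with hypothetical function names: it verifies the two trigonometric identities used above on a grid, and it checks that conjugating $-i\,\partial/\partial t$ by the phase factor $e^{i\theta(t)}$ of (2.7), which adds $\theta'(t)$, cancels the oscillating term $\varepsilon^2 \sin^2 t \cos 2t$ of $K_3$ and leaves the constant $(2 - \varepsilon^2)/4$ appearing in (2.8). The derivative is approximated by a central difference, so agreement is expected only up to the stated tolerance.

```python
import math

def phase(t, eps):
    """theta(t) with U3 = exp(i*theta): -eps^2 sin(2t)/4 + eps^2 sin(4t)/16."""
    return -eps**2 * math.sin(2 * t) / 4 + eps**2 * math.sin(4 * t) / 16

def residual_scalar_term(t, eps, h=1e-6):
    """Scalar term left after conjugating K3 by U3.

    Conjugating -i d/dt by exp(i*theta(t)) adds theta'(t); the scalar
    terms already present in K3 are eps^2 sin^2(t) cos(2t) + 1/2.
    """
    dtheta = (phase(t + h, eps) - phase(t - h, eps)) / (2 * h)  # central difference
    return dtheta + eps**2 * math.sin(t)**2 * math.cos(2 * t) + 0.5

def max_identity_errors(samples=200):
    """Largest deviation found in the two trigonometric identities of the text."""
    err1 = err2 = 0.0
    for i in range(samples):
        t = 2 * math.pi * i / samples
        # sin^2(t) cos(2t) = cos(2t)/2 - cos(4t)/4 - 1/4
        lhs1 = math.sin(t)**2 * math.cos(2 * t)
        rhs1 = 0.5 * math.cos(2 * t) - 0.25 * math.cos(4 * t) - 0.25
        # cos(2t) sin(t) - cos(t) sin(2t) = -sin(t)
        lhs2 = math.cos(2 * t) * math.sin(t) - math.cos(t) * math.sin(2 * t)
        err1 = max(err1, abs(lhs1 - rhs1))
        err2 = max(err2, abs(lhs2 + math.sin(t)))
    return err1, err2
```

For any ε and t, `residual_scalar_term(t, eps)` should agree with `(2 - eps**2) / 4` to within the finite-difference error.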
Thus, K is unitarily equivalent to L defined as the closure of the operator with domain D and action specified by the right side of (2.8).

Completion of the proof of the Theorem. We apply Mourre's theory of conjugate operators ([M]; see also [PSS]). We take the selfadjoint operator A defined by $Au(t, x) = x_1 u(t, x)$ with obvious domain as the conjugate operator for L, and verify that the conditions (a)–(e) of Definition 1 of [M] are satisfied.
(a) D ⊂ D(A) ∩ D(L) and hence D(A) ∩ D(L) is a core of L.

(b) It is clear that $e^{i\alpha A} = e^{i\alpha X_1}$ maps D onto D and that, for u ∈ D, we have
$$e^{-i\alpha A} L e^{i\alpha A} u - Lu = \varepsilon\alpha u - \mu V\!\left(t, X\cos t + D\sin t - \frac{\varepsilon e_1 \sin t}{2}\right)u + \mu V\!\left(t, X\cos t + D\sin t - \frac{(\varepsilon - 2\alpha)e_1 \sin t}{2}\right)u.$$
Since $V(x) - V(x + \alpha e_1 \sin t)$ is bounded with bounded derivatives, the right-hand side extends to a bounded operator on K and it is continuous with respect to α in the operator norm topology. It follows that $e^{i\alpha A}$ maps the domain of L into itself and $\sup_{|\alpha| \le 1} \|L e^{i\alpha A} u\|_K < \infty$ for any u ∈ D(L).

(c) Let us verify the conditions (c'), (i), (ii), (iii) of Proposition II.1 of [M] taking H = L, A = A and S = D. The verification of these conditions in turn implies (c). First remark that (i) and (ii) are a direct consequence of (a) and (b) above. Moreover for any u ∈ D we have
$$i[L, A]u = \varepsilon u + \mu \sin t \cdot \partial_{x_1} V\!\left(t, X\cos t + D\sin t - \frac{\varepsilon\sin t}{2}\,e_1\right)u. \tag{2.9}$$
The right-hand side extends to a bounded operator in K which, following [M], we denote $i[L, A]^\circ$. The boundedness implies a fortiori Condition (iii) and hence (c) is verified.

(d) By direct computation we have for u ∈ D,
$$i[[L, A]^\circ, A]u = \mu \sin^2 t\,(\partial_{x_1}^2 V)\!\left(t, X\cos t + D\sin t - \frac{\varepsilon\sin t}{2}\,e_1\right)u. \tag{2.10}$$
The right-hand side extends to a bounded operator on K. It follows that $[L, A]^\circ D(A) \subset D(A)$ and (2.10) holds for u ∈ D(A). Hence $[[L, A]^\circ, A]$ defined on D(L) ∩ D(A) is bounded and this yields (d).

(e) The operator norm of $u \mapsto \sin t \cdot \partial_{x_1} V\!\left(t, X\cos t + D\sin t - \frac{\varepsilon\sin t}{2}\,e_1\right)u$ is bounded by $\sup_{t,x} |\partial_{x_1} V(t, x)|$ by an abstract theorem of functional calculus ([RS]) or by noticing that the operator is unitarily equivalent to $\sin t \cdot \partial_{x_1} V(t, x - \varepsilon \sin t\, e_1/2)$ via the unitary operator $U_0$. Hence if $|\mu| \|\partial_{x_1} V\|_{L^\infty} < \varepsilon$, then we have $i[L, A]^\circ \ge c > 0$.

Thus the conditions of [M] are satisfied and we can conclude that $\sigma(K) = \sigma_{ac}(K)$ if $|\mu| \|\partial_{x_1} V\|_{L^\infty} < \varepsilon$ by Theorem and Proposition II.4 of [M].
References
[DS] Duclos, P., Stovicek, P.: Floquet Hamiltonians with pure point spectrum. Commun. Math. Phys. 177, 327–347 (1996)
[HLS] Hagedorn, G., Loss, M., Slawny, J.: Non-stochasticity of time-dependent quadratic Hamiltonians and the spectra of canonical transformations. J. Phys. A 19, 521–531 (1986)
[Hö] Hörmander, L.: The analysis of linear partial differential operators III. Berlin–Heidelberg–New York–Tokyo: Springer Verlag, 1985
[H] Howland, J.: Floquet operators with singular spectrum. II. Ann. Inst. H. Poincaré 50, 325–334 (1989)
[KY] Kitada, H., Yajima, K.: Bound states and scattering states for time periodic Hamiltonians. Ann. Inst. Henri Poincaré 39, 145–157 (1983)
[M] Mourre, E.: Absence of singular continuous spectrum for certain self-adjoint operators. Commun. Math. Phys. 78, 391–400 (1981)
[PSS] Perry, P., Sigal, I., Simon, B.: Spectral analysis of N-body Schrödinger operators. Ann. Math. 114, 519–567 (1981)
[RS] Reed, M., Simon, B.: Methods of Modern Mathematical Physics. New York–San Francisco–London: Academic Press, 1975
[Ya] Yajima, K.: Scattering theory for Schrödinger equations with potentials periodic in time. J. Math. Soc. Japan 29, 729–743 (1977)
[Ya-1] Yajima, K.: Resonances for the AC-Stark effect. Commun. Math. Phys. 87, 331–352 (1982)
Communicated by B. Simon
Commun. Math. Phys. 215, 251 – 290 (2000)
Interacting Fermi Liquid in Two Dimensions at Finite Temperature. Part I: Convergent Attributions
M. Disertori, V. Rivasseau
Centre de Physique Théorique, Ecole Polytechnique, 91128 Palaiseau Cedex, France
Received: 27 July 1999 / Accepted: 31 May 2000
Abstract: Using the method of a continuous renormalization group around the Fermi surface, we prove that a two-dimensional interacting system of Fermions at low temperature T is a Fermi liquid in the domain |λ|| log T | ≤ K, where K is some numerical constant. According to [S1], this means that it is analytic in the coupling constant λ, and that the first and second derivatives of the self energy obey uniform bounds in that range. This is also a step in the program of rigorous (non-perturbative) study of the BCS phase transition for many Fermion systems; it proves in particular that in dimension two the transition temperature (if any) must be non-perturbative in the coupling constant. The proof is organized into two parts: the present paper deals with the convergent contributions, and a companion paper (Part II) deals with the renormalization of dangerous two point subgraphs and achieves the proof.
1. Introduction

Conducting electrons in a metal at low temperature are well described by Fermi liquid theory. However we know that the Fermi liquid theory is not valid down to 0 temperature. Indeed below the BCS critical temperature the dressed electrons or holes which are the excitations of the Fermi liquid bind into Cooper pairs and the metal becomes superconducting. During the last ten years a program has been designed to investigate rigorously this phenomenon by means of field theoretic methods [BG, FT1-2, FMRT1-3, S2]. In particular the renormalization group of Wilson and followers has been extended to models with surface singularities such as the Fermi surface. The ultimate goal is to create a mathematically rigorous theory of the BCS transition and of similar phenomena of solid state physics. This is a long and difficult program which requires gluing together several ingredients, in particular renormalization group around the Fermi surface and spontaneous symmetry breaking.
A more accessible task is to make precise the mathematical status of the Fermi liquid theory itself. Fermi liquid theory is not valid at zero temperature because of the BCS instability. Even when the dominant electron interaction is repulsive, the Kohn–Luttinger instabilities prevent the Fermi liquid theory from being generically valid down to zero temperature. There are nevertheless two proposals for a mathematically rigorous Fermi liquid theory:
– one can block the BCS and Kohn–Luttinger instabilities by considering models in which the Fermi surface is not invariant under p → −p [FKLT]. In two dimensions it is possible to prove (even non-perturbatively) that in this case the Fermi liquid theory remains valid at zero temperature, and the corresponding program is well under way [FKLT]. However this program requires rigorous control of the stability of a nonspherical Fermi surface under the renormalization group flow, a difficult technical issue [FST];
– one can study the Fermi liquid theory at finite temperature above the BCS transition temperature. A system of weakly interacting electrons has an obviously stable thermodynamic limit at high enough temperature, since the temperature acts as an infrared cutoff on the propagator in the field theory description of the model. In this point of view, advocated by [S1], the non trivial theorem consists in showing that stability (i.e. summability of perturbation theory) holds for all temperatures higher than a certain critical temperature whose dependence in terms of the initial interaction should be as precise as possible, and that the first and second derivatives of the self-energy obey some uniform bounds. These bounds rule out in particular Luttinger liquid behavior; they do not hold in dimension 1, where Luttinger-liquid has been established rigorously [BGPS]–[BM].
It is this second program that we do here.
We prove an upper bound on any critical temperature for two dimensional systems of Fermions which is exponentially small in the coupling constant, hence invisible in perturbation theory, and we check the uniform derivative bounds on the self-energy in that domain. Our analysis relies on a renormalization group analysis around the Fermi surface. Renormalization group flows were studied perturbatively in the context of a spherical Fermi surface in [FT2]. A non perturbative study in 2 dimensions was performed in [FMRT1], but it was limited to so called “completely convergent graphs”. In this paper we rely heavily on the ideas introduced in [FMRT1], but we extend them to include non perturbative renormalization of the two point functions which allow the rigorous exponentially small upper bound. This extension is not trivial since renormalization in phase space in this context is complicated by the need for anisotropic sectors. Also we use (in contrast with [FMRT1]) a continuous renormalization group scheme around the Fermi surface (another idea advocated in [S1]). This scheme has been tested first in the simpler case of the Gross Neveu model (a field theory where there is no Fermi surface) in [DR1]. The next natural step in this program is to add the computation of coupling constant flows (i.e. renormalization of four point functions). It would allow computation of the optimal expected value Co of the constant c in our upper bound on the critical temperature of Fermions systems. A more difficult step is to glue this analysis to a kind of 1/N expansion and to a bosonic analysis to control the region at distance BCS e−Co /|λ| of the Fermi surface [FMRT2]. In two dimensions and finite temperature we cannot expect true symmetry breaking by the Mermin–Wagner theorem, but we can expect a Kosterlitz– Thouless phase for a two dimensional bosonic field in a rotation invariant effective potential. 
Finally at 0 temperature we have effectively a three-dimensional theory (two dimensions for space, one for imaginary time). Continuous symmetry breaking can
then occur, with the associated Goldstone boson. The last part of the analysis consists therefore in the non-perturbative control of the infrared divergences associated to this Goldstone boson, using Ward identities at the constructive level [FMRT3].

Our result has quite a long proof, which we organize therefore in two main parts. In this paper we introduce the model and prove the analyticity of the "convergent contributions" to the vertex functions, hence we reproduce the results of [FMRT1], but with the continuous renormalization group technique. In a companion paper [DR2] we consider the complete sum of all graphs, perform renormalization of the two point subgraphs and obtain our main theorem, with the bounds on the derivatives of the self-energy proved in a separate Appendix.

2. Model and Notations

The simplest free continuum model for interacting Fermions is the isotropic jellium model with a continuous rotation invariant ultraviolet cutoff. This model is rotation invariant, a feature which simplifies considerably the study of the renormalization group flows after introducing the interaction. In particular it has a spherical Fermi surface. It is a realistic model for instance in solid state physics in the limit of weak electron densities (where the Fermi surface becomes approximately spherical). The simplest Fermion interaction perturbing this free model is a local four body interaction. This is a realistic interaction for instance in a solid where the dominant interaction is not the Coulomb interaction but the electron-phonon interaction. After integrating out the phonon modes an effective four body interaction is obtained, which is not strictly local due to the non local phonon propagator. However at long distances it is well approximated by a local interaction.1 We use the formalism of non-relativistic field theory at the imaginary (periodic) time of [FT1-2, BG] to describe the interacting fermions at finite temperature. Our model is therefore similar to the Gross–Neveu model, but with a different, non-relativistic propagator.2

2.1. Propagator without ultraviolet cutoff. Using the Matsubara formalism, the propagator at temperature T, $C(x_0, \vec{x})$, is antiperiodic in the variable $x_0$ with antiperiod $\frac{1}{T}$. This means that the Fourier transform defined by
$$\hat{C}(k) = \frac12 \int_{-\frac{1}{T}}^{\frac{1}{T}} dx_0 \int d^2x\; e^{-ikx}\, C(x) \tag{2.1}$$
is not zero only for discrete values (called the Matsubara frequencies):
$$k_0 = \frac{2n + 1}{\beta}\,\pi, \qquad n \in \mathbb{Z}, \tag{2.2}$$
1 Interaction with non-local but well-decaying kernels can be added without much cost to our analysis.
2 However there are some important differences:
– in GN the infrared singularity lies at k = 0. Renormalization subtracts divergent functions at this point. In the Fermi liquid the singularity lies on the surface $k_0 = 0$, $|\vec{k}| = 1$, so renormalization is more complicated;
– in GN a natural infrared cut-off is given by the mass, in Fermi liquid it is given by the temperature;
– in GN we are interested in the ultraviolet limit, the low energy (renormalized) parameters being kept fixed; in the Fermi liquid we fix the ultraviolet cut-off and we want to deduce the long range properties from the microscopic theory.
where β = 1/T (we take ħ = k = 1). We remark that only odd frequencies appear, because of antiperiodicity.

Our convention is that a three dimensional vector is denoted by $x = (x_0, \vec{x})$, where $\vec{x}$ is the two dimensional spatial component. The scalar product is defined as $kx := -k_0 x_0 + \vec{k}\cdot\vec{x}$. By some slight abuse of notations we may write either $C(x - \bar{x})$ or $C(x, \bar{x})$, where the first point corresponds to the field and the second one to the antifield (using translation invariance of the corresponding kernel).

Actually $\hat{C}(k)$ is obtained from the real time propagator by changing $k_0$ into $ik_0$ and is equal to:
$$\hat{C}_{ab}(k) = \delta_{ab}\,\frac{1}{ik_0 - e(\vec{k})}, \qquad e(\vec{k}) = \frac{\vec{k}^2}{2m} - \mu, \tag{2.3}$$
where a, b ∈ {1, 2} are the spin indices. The vector $\vec{k}$ is two-dimensional. Since our theory has two spatial dimensions and one time dimension, there are really three dimensions. The parameters m and µ correspond to the effective mass and to the chemical potential (which fixes the Fermi energy). To simplify notation we put 2m = µ = 1, so that $e(\vec{k}) = \vec{k}^2 - 1$. Hence,
$$C_{ab}(x) = \frac{1}{(2\pi)^2 \beta} \sum_{k_0} \int d^2k\; e^{ikx}\, \hat{C}_{ab}(k). \tag{2.4}$$
The notation $\sum_{k_0}$ means really the discrete sum over the integer n in (2.2). When T → 0 (which means β → ∞) $k_0$ becomes a continuous variable, the corresponding discrete sum becomes an integral, and the corresponding propagator $C_0(x)$ becomes singular on the Fermi surface defined by $k_0 = 0$ and $|\vec{k}| = 1$. In the following to simplify notations we will write:
$$\int d^3k \equiv \frac{1}{\beta} \sum_{k_0} \int d^2k, \qquad \int d^3x \equiv \frac12 \int_{-\beta}^{\beta} dx_0 \int d^2x. \tag{2.5}$$
In determining the spatial decay we will need the following lemma:

Lemma 1. The function C defined in (2.4) can also be written as
$$C(x) = f(x_0, \vec{x}) := \sum_{m \in \mathbb{Z}} (-1)^m\, C_0\!\left(x_0 + \frac{m}{T}, \vec{x}\right), \tag{2.6}$$
where $C_0$ is the propagator at T = 0.

Proof. To prove this lemma we first prove that the function f is antiperiodic with antiperiod $\frac{1}{T}$. Since $\hat{f}(k) = \hat{C}(k)$ ∀k, the lemma holds.

In this paper we do not yet perform any renormalization, hence we do not introduce any counterterm, and the interaction is simply:
$$S_{\mathcal{V}} = \frac{\lambda}{2} \int_{\mathcal{V}} d^3x\, \Big(\sum_a \bar{\psi}_a \psi_a\Big)^2, \tag{2.7}$$
where $\mathcal{V} := [-\beta, \beta] \times V$ and V is an auxiliary volume cutoff in two dimensional space, that will be soon sent to infinity. We remark that in (2.2) $|k_0| \ge \pi/\beta \ne 0$, hence the denominator in C(k) can never be 0 at non zero temperature. This is why the temperature provides a natural infrared cut-off.
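The role of the temperature as an infrared cut-off can be made concrete with a small numerical sketch. The Python code below is illustrative only (the function names and the sampling of the Fermi circle are assumptions for this example): it lists the frequencies (2.2), evaluates the free propagator (2.3) with the convention 2m = µ = 1, and reports the largest modulus found on the circle $|\vec{k}| = 1$, where $e(\vec{k}) = 0$. Since $|k_0| \ge \pi/\beta$, that modulus should be close to β/π rather than infinite.

```python
import math

def matsubara_frequencies(beta, n_max):
    """Fermionic Matsubara frequencies k0 = (2n+1)*pi/beta for -n_max <= n < n_max."""
    return [(2 * n + 1) * math.pi / beta for n in range(-n_max, n_max)]

def propagator(k0, kx, ky):
    """Free propagator 1/(i*k0 - e(k)) with e(k) = k^2 - 1 (convention 2m = mu = 1)."""
    e = kx * kx + ky * ky - 1.0
    return 1.0 / complex(-e, k0)

def max_modulus_on_fermi_circle(beta, n_max=50, angles=64):
    """Largest |C(k)| over sampled momenta with |k| = 1, where e(k) vanishes."""
    best = 0.0
    for k0 in matsubara_frequencies(beta, n_max):
        for j in range(angles):
            phi = 2 * math.pi * j / angles
            best = max(best, abs(propagator(k0, math.cos(phi), math.sin(phi))))
    return best
```

For example, with `beta = 10.0` the smallest frequency in modulus is π/10, and `max_modulus_on_fermi_circle(10.0)` should be close to 10/π; the bound grows linearly in β, which reflects the singularity recovered as T → 0.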
2.2. Propagator with an ultraviolet cutoff. It is convenient to add a continuous ultraviolet cut-off (at a fixed scale $\Lambda_u$) to the propagator (2.3) for two reasons: first because it makes its Fourier transformed kernel in position space well defined, and second because a nonrelativistic theory does not make sense anyway at high energies. To preserve physical (or Osterwalder–Schrader) positivity one should introduce this ultraviolet cutoff only on spatial frequencies [FT2]. However for convenience we introduce this cutoff both on spatial and on Matsubara frequencies as in [FMRT1]; indeed the Matsubara cutoff could be lifted with little additional work. For technical reasons it is also convenient to introduce, as in [DR1], an auxiliary infrared cut-off at scale $\Lambda$, whose variation controls the renormalization group flow. At the end the limit $\Lambda \to 0$ is taken (we recall that the true infrared cutoff is the temperature, which is not taken to 0 in this paper). The propagator (2.3) equipped with these two cutoffs is called $C_\Lambda^{\Lambda_u}$. It is defined as:
$$C_\Lambda^{\Lambda_u}(k) := C(k)\left[u(r/\Lambda_u^2) - u(r/\Lambda^2)\right]\Big|_{r = k_0^2 + e^2(\vec k)}, \tag{2.8}$$
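where we fixed $\Lambda_u = 1$ (for simplicity), $0 \le \Lambda \le 1$ and the compactly supported function $u(r) \in C_0^\infty(\mathbb{R})$ satisfies:
$$u(r) = 0 \ \text{for } |r| > 1/2; \qquad u(r) = 1 \ \text{for } |r| < 1/4; \qquad \int u(r)\,dr = 3/4. \tag{2.9}$$
For later calculations it is useful to choose $u$ to be a Gevrey function$^3$. The propagator can be parametrized as:
$$C_\Lambda^{\Lambda_u}(k) = \int_{\Lambda_u^{-2}}^{\Lambda^{-2}} d\alpha\; C_\alpha(k), \tag{2.12}$$
where
$$C_\alpha(k) = C(k)\,\eta[\alpha r]\Big|_{r = k_0^2 + e^2(\vec k)}, \qquad \eta(\alpha r) = -r\,u'(\alpha r). \tag{2.13}$$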
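The parametrization (2.12)–(2.13) rests on the identity $\int_{\Lambda_u^{-2}}^{\Lambda^{-2}} d\alpha\,(-r\,u'(\alpha r)) = u(r/\Lambda_u^2) - u(r/\Lambda^2)$, which can be checked numerically. The sketch below assumes a hypothetical smooth cutoff `u` (built from $e^{-1/t}$, equal to 1 for $|r| \le 1/4$ and 0 for $|r| \ge 1/2$); it is not claimed to be the Gevrey function of the paper, and the derivative and the integral are approximated by a central difference and a midpoint rule.

```python
import math

def u(r):
    """Hypothetical smooth cutoff: 1 for |r| <= 1/4, 0 for |r| >= 1/2."""
    a = abs(r)
    if a <= 0.25:
        return 1.0
    if a >= 0.5:
        return 0.0
    t = (a - 0.25) / 0.25            # t runs over (0, 1) in the transition region
    g_up = math.exp(-1.0 / t)
    g_down = math.exp(-1.0 / (1.0 - t))
    return g_down / (g_up + g_down)

def du(r, h=1e-6):
    # Central finite difference for u'(r).
    return (u(r + h) - u(r - h)) / (2.0 * h)

def eta(alpha, r):
    # eta(alpha r) = -r u'(alpha r), cf. (2.13).
    return -r * du(alpha * r)

def midpoint(f, a, b, n=20000):
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

lam_u, lam = 1.0, 0.2
for r in (0.015, 0.05, 0.3):
    lhs = midpoint(lambda alpha: eta(alpha, r), lam_u ** -2, lam ** -2)
    rhs = u(r / lam_u ** 2) - u(r / lam ** 2)
    print(r, lhs, rhs)
```

The two printed columns agree up to the discretization error, since $-r\,u'(\alpha r) = -\frac{d}{d\alpha}u(\alpha r)$.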
As $u'(\alpha r) \ne 0$ only for $r \simeq 1/\alpha$, the propagator $C_\alpha(k)$ is non-zero only for $1/(2\sqrt{\alpha}) \le \sqrt{k_0^2 + e^2(\vec k)} \le 1/\sqrt{2\alpha}$, hence for momenta in the volume between two tori in $\mathbb{R}^3$ centered on the critical circle $|\vec k| = 1$, $k_0 = 0$ (see Fig. 1). In short, in the support of $C_\alpha(k)$ we have $\big||\vec k| - 1\big| \lesssim 1/\sqrt{\alpha}$ and $k_0 \lesssim 1/\sqrt{\alpha}$, but they cannot be simultaneously much smaller. Remark that the temperature cut-off implies that $C_\alpha = 0$ if $1/\sqrt{2\alpha} < \pi/\beta$, hence the real non-zero propagator is
$$C_\Lambda^{\Lambda_u}(k) := \int_{\Lambda_u^{-2}}^{\Lambda_T^{-2}} d\alpha\; C_\alpha(k) = C(k)\left[u(r/\Lambda_u^2) - u(r/\Lambda_T^2)\right]\Big|_{r = k_0^2 + e^2(\vec k)}, \tag{2.14}$$
where we defined
$$\Lambda_T := \max\left[\Lambda,\ \sqrt{2}\,\pi T\right]. \tag{2.15}$$
[Fig. 1. Support of $C_\alpha$. Two panels, labeled $k_1 = 0$ and $k_0 = 0$; axes $k_0$, $k_1$, $k_2$.]
3 A function $f \in C^\infty(\mathbb{R}^d)$ with compact support is in the Gevrey class of order $s$ if there exist two constants $A$ and $\mu$ such that
$$\forall n \ge 0, \qquad \|f^{(n)}\|_1 \le A\,\mu^{-n}\left(\frac{n}{e}\right)^{ns} \tag{2.10}$$
and its Fourier transform satisfies (see [G]):
$$\forall k \in \mathbb{R}^d, \qquad |\hat f(k)| \le A\, e^{-s\left(\frac{\mu}{\sqrt{d}}|k|\right)^{1/s}}. \tag{2.11}$$
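2.3. Vertex functions. The vertex functions are defined through the partition function:
$$Z_V^{\Lambda\Lambda_u}(\xi, \bar\xi) = \int d\mu_{C_\Lambda^{\Lambda_u}}(\psi, \bar\psi)\; e^{-S_V(\psi,\bar\psi) + \langle\bar\psi,\xi\rangle + \langle\bar\xi,\psi\rangle}, \qquad \langle\bar\psi, \xi\rangle := \int_V d^3x\; \bar\psi(x)\,\xi(x), \tag{2.16}$$
where $\xi$ is an external field. The $2p$-point vertex function is defined as:
$$\Gamma_{2p}^{\Lambda\Lambda_u}(\{y\},\{z\}) := \Gamma_{2p}^{\Lambda\Lambda_u}(y_1,\ldots,y_p,z_1,\ldots,z_p) = \lim_{V\to\infty} \frac{\delta^{2p}}{\delta\xi(z_1)\ldots\delta\xi(z_p)\,\delta\bar\xi(y_1)\ldots\delta\bar\xi(y_p)}\,\Big(\ln Z_V^{\Lambda\Lambda_u} - F\Big)\Big((C_\Lambda^{\Lambda_u})^{-1}(\xi)\Big)\bigg|_{\xi=0}, \tag{2.17}$$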
where $F(\xi) = \langle \xi, C_\Lambda^{\Lambda_u}\xi\rangle$ is the bare propagator. These functions are the coefficients of the effective action (expanded in powers of the external fields) at energy $\Lambda$. They are in fact distributions (as easily seen because there are graphs for which several external arguments hook to the same vertex, hence create $\delta$ functions). Therefore we will later smear the vertex functions $\Gamma$ with smooth test functions $\phi_1(y_1), \ldots, \phi_p(y_p), \phi_{p+1}(z_1), \ldots, \phi_{2p}(z_p)$. We consider external impulsions $k_{e_1}, \ldots, k_{e_{2p}}$ (with $\sum_{i=1}^{2p} k_{e_i} = 0$) fixed with a precision $\Lambda_T$. Then we choose the following test functions:
$$\hat\phi_i(k) := e^{-ikx_{e_i}}\,\frac{1}{\Lambda_\tau^3}\prod_{j=0}^{2} u\!\left(\frac{k_j - k_{e_i,j}}{\Lambda_\tau}\right), \quad i = 2, \ldots, 2p, \qquad \hat\phi_1(k) := e^{-ikx_{e_1}}, \tag{2.18}$$
where $u$ is the Gevrey function (2.9). The impulsion $k$ is then fixed in a box of size $\Lambda_\tau^3$ centered on $k_{e_i}$. We take $\Lambda_\tau = \Lambda_T$. In this paper we also treat explicitly the case of external legs at fixed impulsion. For that it is sufficient to let $\Lambda_\tau \to 0$ and $x_{e_i} \to 0$, as the test function tends to $\delta(k - k_{e_i})$. In position space these functions become
$$\phi_i(x) := e^{-ik_{e_i}(x - x_{e_i})}\prod_{j=0}^{2}\hat u\big((x - x_{e_i})_j\,\Lambda_T\big), \quad i = 2, \ldots, 2p, \qquad \phi_1(x) := \delta(x - x_{e_1}). \tag{2.19}$$
This means that $\phi_1$ is localized exactly on $x_{e_1}$, while $\phi_i$ ($i \ne 1$) has its maximum on $x_{e_i}$ and decays as
$$|\phi_i(x)| \le e^{-a\,|x - x_{e_i}|^{1/s}\,\Lambda_T^{1/s}}. \tag{2.20}$$
Any reasonable test function (for instance a test function with compact support in position space) can be treated with our method, but at the cost of a few technical complications. Expanding the exponential in $Z$ we have:
$$Z_V^{\Lambda\Lambda_u}(\xi) = \sum_{p=0}^{\infty}\frac{1}{p!^2}\sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\,\lambda^n\int_V d^3y_1\ldots d^3y_p\; d^3z_1\ldots d^3z_p\; d^3x_1\ldots d^3x_n\ \prod_{i=1}^{p}\xi(z_i)\bar\xi(y_i)\ \begin{pmatrix} y_1 \ldots y_p & x_1\, x_1 \ldots x_n\, x_n \\ z_1 \ldots z_p & x_1\, x_1 \ldots x_n\, x_n \end{pmatrix}, \tag{2.21}$$
where we used Cayley's notation for determinants:
$$\begin{pmatrix} u_{i,a} \\ v_{j,b} \end{pmatrix} = \det\big(C^{\Lambda_u}_{\Lambda,ab}(u_i - v_j)\big). \tag{2.22}$$
The determinant is the sum over all Feynman graph amplitudes, and the logarithm selects the sum over connected graphs. To obtain $\log Z$ without expanding completely the determinant we use a forest formula. Forest formulas are Taylor expansions with integral remainders which test links (here the propagators) between $n \ge 1$ points (here the vertices) and stop as soon as the final connected components are built. The result is a sum over forests, a forest being a set of disjoint trees. As in [DR1] we use the ordered Brydges–Kennedy Taylor formula, which states [AR1] that for any smooth function $H$ of the $n(n-1)/2$ variables $u_l$, $l \in P_n = \{(i,j)\,|\,i,j \in \{1,\ldots,n\},\ i \ne j\}$,
$$H\big|_{u_l = 1} = \sum_{o-F}\ \int_{0\le w_1\le\cdots\le w_k\le 1}\ \prod_{q=1}^{k} dw_q\ \left(\prod_{q=1}^{k}\frac{\partial}{\partial u_{l_q}}\, H\right)\big(w_l^F(w_q),\ l \in P_n\big), \tag{2.23}$$
where $o-F$ is any ordered forest, made of $0 \le k \le n-1$ links $l_1, \ldots, l_k$ over the $n$ points. To each link $l_q$, $q = 1, \ldots, k$, of $F$ is associated the parameter $w_q$, and to each pair $l = (i,j)$ is associated the weakening factor $w_l^F(w_q)$. These factors replace the variables $u_l$ as arguments of the derived function $\prod_{q=1}^{k}\frac{\partial}{\partial u_{l_q}} H$ in (2.23). These weakening factors $w_l^F(w)$ are themselves functions of the parameters $w_q$, $q = 1, \ldots, k$, through the formulas
$$w_{i,i}^F(w) = 1,$$
$$w_{i,j}^F(w) = \inf_{l_q \in P_{i,j}^F} w_q \quad \text{if } i \text{ and } j \text{ are connected by } F, \text{ where } P_{i,j}^F \text{ is the unique path in the forest } F \text{ connecting } i \text{ to } j,$$
$$w_{i,j}^F(w) = 0 \quad \text{if } i \text{ and } j \text{ are not connected by } F. \tag{2.24}$$
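The rule (2.24) for the weakening factors amounts to a small graph computation. The following sketch is an illustration only: the representation of a forest as a list of `(i, j, w)` triples and the function name `weakening_factor` are assumptions made here, not notation from the paper.

```python
def weakening_factor(i, j, forest):
    """Return w^F_{i,j} for a forest given as a list of links (a, b, w).

    Each link (a, b, w) joins points a and b and carries a parameter w in [0, 1].
    """
    if i == j:
        return 1.0
    # Build an adjacency list of the forest.
    neighbours = {}
    for a, b, w in forest:
        neighbours.setdefault(a, []).append((b, w))
        neighbours.setdefault(b, []).append((a, w))
    # Depth-first search from i, carrying the smallest w met along the path.
    # A forest has no cycles, so excluding the parent is enough to avoid loops.
    stack = [(i, None, 1.0)]
    while stack:
        node, parent, smallest = stack.pop()
        if node == j:
            return smallest
        for nxt, w in neighbours.get(node, []):
            if nxt != parent:
                stack.append((nxt, node, min(smallest, w)))
    return 0.0  # i and j are not connected by the forest

# Hypothetical forest on 5 points: a tree {1-2, 2-3} and a tree {4-5}.
forest = [(1, 2, 0.3), (2, 3, 0.7), (4, 5, 0.5)]
print(weakening_factor(1, 3, forest))  # infimum along the path 1-2-3
print(weakening_factor(1, 4, forest))  # different trees
```

For points in the same tree the factor is the smallest parameter along the unique connecting path; for points in different trees it vanishes, which is what decouples the connected components.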
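We apply this formula to the determinant in (2.21), inserting the interpolation parameter $u_l$ in the UV cut-off $\Lambda_u$ of the covariance $C_\Lambda^{\Lambda(u)}(x_i, x_j)$, when $i \ne j$. We define $\Lambda(u)$ by:
$$\Lambda(u) = u(\Lambda_u - \Lambda) + \Lambda; \qquad \Lambda(0) = \Lambda; \qquad \Lambda(1) = \Lambda_u. \tag{2.25}$$
Now the product in (2.23) becomes:
$$\left(\prod_{q=1}^{k}\frac{\partial}{\partial u_{l_q}}\, H\right)\big(w_l^F(w_q),\ l \in P_n\big) = \left(\prod_{q=1}^{k}\frac{\partial}{\partial w_q}\, C_\Lambda^{\Lambda(w_q)}(x_{l_q}, y_{l_q})\right)\det M, \tag{2.26}$$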
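which is the product of the forest line propagators and a remaining determinant which contains all possible contractions of loop lines. Actually, the elements of the matrix $M$ are the loop line propagators weakened by the forest formula. Now, taking the logarithm of $Z$ and including (as announced above) the smearing of external arguments by test functions we obtain a tree expansion for the vertex function similar to the one of [DR1]:
$$\Gamma_{2p}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p}) = \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\sum_{o-T}\sum_{E}\sum_{\Omega}\varepsilon(T,\Omega)\int d^3x_1\ldots d^3x_n\ \phi_1(x_{i_1})\ldots\phi_{2p}(x_{j_p})\int_{w_T\le w_1\le\cdots\le w_{n-1}\le 1}\ \prod_{q=1}^{n-1}\left[\frac{\partial}{\partial w_q}\, C_\Lambda^{\Lambda(w_q)}(x_{l_q}, \bar x_{l_q})\, dw_q\right]\det M(E)$$
$$= \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\sum_{o-T}\sum_{E}\sum_{\Omega}\varepsilon(T,\Omega)\int d^3x_1\ldots d^3x_n\ \phi_1(x_{i_1})\ldots\phi_{2p}(x_{j_p})\int_{\Lambda_T\le\Lambda_1\le\cdots\le\Lambda_{n-1}\le 1}\ \prod_{q=1}^{n-1}\left[\frac{\partial}{\partial\Lambda_q}\, C_\Lambda^{\Lambda_q}(x_{l_q}, \bar x_{l_q})\, d\Lambda_q\right]\det M(E), \tag{2.27}$$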
where $o-T$ is the set of ordered trees over $n$ vertices, and $E$ is the set of pairs $(\phi_j, v_j)$ which specifies which test function $\phi_j$ is hooked to which internal vertex $v_j$ for $j = 1, \ldots, 2p$ (see [DR1]). $\Omega$ specifies for each tree line whether it comes from a $\psi\bar\psi$ or $\bar\psi\psi$ contraction. $\varepsilon(T,\Omega)$ is a global $\pm$ sign whose exact (inessential) value is given in [AR2]. Finally $w_T$ is defined by $\Lambda(w_T) = \Lambda_T$. We remark that $w_T = 0$ if $\Lambda_T = \Lambda$, i.e. if $\Lambda \ge \sqrt{2}\,\pi T$, and $w_T > 0$ otherwise. The bound $\Lambda(w_i) \ge \Lambda_T$ $\forall i$ is due to (2.14–2.15). In the second equality of (2.27) we performed the change of variable $\Lambda_q := \Lambda(w_q)$ and we applied the following identities:
$$\frac{\partial}{\partial w_q}\, C_\Lambda^{\Lambda(w_q)}(x) = (\Lambda_u - \Lambda)\,\frac{\partial}{\partial\Lambda_q}\, C_\Lambda^{\Lambda_q}(x), \qquad dw_q = \frac{1}{(\Lambda_u - \Lambda)}\, d\Lambda_q. \tag{2.28}$$
In the following we always use the variable $\Lambda_q$ instead of $w_q$.
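The relation between the interpolation parameter $w$ and the scale $\Lambda(w)$, together with the thermal lower bound $\Lambda_T$, can be spelled out in a few lines. This is only a sketch of (2.15) and (2.25); the function names are chosen here for illustration and the numerical values are arbitrary.

```python
import math

def lambda_T(lam, temperature):
    # Effective infrared scale, cf. (2.15): Lambda_T = max(Lambda, sqrt(2) pi T).
    return max(lam, math.sqrt(2.0) * math.pi * temperature)

def scale_of(w, lam, lam_u=1.0):
    # Interpolated scale, cf. (2.25): Lambda(w) = w (Lambda_u - Lambda) + Lambda.
    return w * (lam_u - lam) + lam

def w_T(lam, temperature, lam_u=1.0):
    # Parameter w_T defined by Lambda(w_T) = Lambda_T (requires lam < lam_u).
    return (lambda_T(lam, temperature) - lam) / (lam_u - lam)

# Auxiliary cutoff above the thermal scale: w_T vanishes.
print(w_T(0.5, 0.05))
# Auxiliary cutoff below the thermal scale: w_T is strictly positive.
print(w_T(0.1, 0.05), scale_of(w_T(0.1, 0.05), 0.1), lambda_T(0.1, 0.05))
```

The change of variable $\Lambda_q = \Lambda(w_q)$ is affine, which is why the Jacobian in (2.28) is the constant $1/(\Lambda_u - \Lambda)$.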
2.4. Bands. The strategy to analyze (2.27) is similar to the one of [DR1]. The determinant is bounded by a Gram inequality (which gives no factorial)$^4$. Spatial integrals are performed using the spatial decay of the tree propagators $\big|\frac{\partial}{\partial\Lambda_q} C_\Lambda^{\Lambda_q}\big|$. To send the IR cut-off to zero without generating unwanted factorials, we need to perform some renormalization. These renormalizations, although more complicated than in the field theory case, still involve only two and four point subgraphs [FT1-2]. Therefore as in [DR1] we need to distinguish the so-called dangerous subgraphs, which means four-point and two-point quasi-local subgraphs. Recall that a subgraph is called quasi-local if all internal lines have energy higher than all external lines [R]. These contributions are decomposed into a renormalized part with improved power counting, and a localized part which in turn is absorbed into a flow of effective constants. To implement this renormalization group program, the first tool is to cut the momentum space into bands, which form a partition of unity. The ordering of the tree in the previous section cuts in a natural way the space of momenta into $n$ bands [DR1]. Indeed:
$$w_T \le w_1 \le w_2 \le \cdots \le w_{n-1} \le 1 \quad\to\quad \Lambda_T \le \Lambda_1 \le \Lambda_2 \le \cdots \le \Lambda_{n-1} \le 1. \tag{2.29}$$
The set of bands is called $B = \{1, \ldots, n\}$. The $q$-th band corresponds to scales between $\Lambda_{q-1}$ and $\Lambda_q$, where we adopt the convention $\Lambda_n = \Lambda_u = 1$ and $\Lambda_0 = \Lambda_T$. Then we can attribute each loop line to a well defined band. Real external lines, associated to test functions, have fixed impulsions. We take the convention to put them in an artificial band with index 0, that does not correspond to any energy range. Let us see how propagators for tree and loop lines look.
2.4.1. Tree propagators. The $q$-th tree line propagator is given by:
$$\frac{\partial}{\partial\Lambda_q}\, C_\Lambda^{\Lambda_q}(k) = \frac{\partial}{\partial\Lambda_q}\int_{\Lambda_q^{-2}}^{\Lambda^{-2}} d\alpha\; C_\alpha(k) = \frac{2}{\Lambda_q^3}\, C_\alpha(k)\Big|_{\alpha = \Lambda_q^{-2}} = \frac{2}{\Lambda_q^3}\,[ik_0 + e(\vec k)]\; u'\!\left(\frac{k_0^2 + e^2(\vec k)}{\Lambda_q^2}\right). \tag{2.30}$$
The derivative with respect to $\Lambda_q$ fixes the $\alpha$ parameter of the line on the top of the band $b_q$, and this tree line propagator is considered by convention to belong to the $q$-th band. In this way we have one tree line in each band, except the last one $b_n$ (see Fig. 2).
[Fig. 2. Band structure. Vertical axis: impulsions, with the scales $\Lambda_T$, $\Lambda_1$, ..., $\Lambda_{n-2}$, $\Lambda_{n-1}$, $\Lambda_u$ and the band indices 0, 1, ..., $n-1$, $n$; horizontal axis: positions; the links of the tree are drawn between the bands.]
4 The first example of combining a tree expansion with a Gram bound appears in [L]. We thank G. Gallavotti and C. Wieczerkowski for pointing out this reference to us.
2.4.2. Loop lines. Loop line propagators are the elements of the $(n+1-p)\times(n+1-p)$ matrix $M(E)$:
$$M_{fg} = C_\Lambda^{\Lambda(w^T_{f,g}(w))}(x_f, x_g). \tag{2.31}$$
The corresponding loop fields (respectively antifields) are labeled by the index $f$ (respectively $g$). Altogether they form a set $L$ labeled by an index $a \in L := \{1, \ldots, 2n+2-2p\}$, hence $a$ indexes both the rows and columns of the determinant in (2.27): $a(f_1) = 1, \ldots, a(f_{n+1-p}) = n+1-p$, $a(g_1) = n+2-p, \ldots, a(g_{n+1-p}) = 2n+2-2p$. Similarly to each tree line $l_i$ there correspond two half tree lines called $f_i$ and $g_i$. Each loop propagator can be written as a sum of propagators restricted to single bands:
$$C_\Lambda^{\Lambda(w^T_{f,g}(w))}(k) = \sum_{j=1}^{i^T_{f,g}}\int_{\Lambda_j^{-2}}^{\Lambda_{j-1}^{-2}} d\alpha\; C_\alpha(k) = C(k)\sum_{j=1}^{i^T_{f,g}} u_j(k), \tag{2.32}$$
where we define $i^T_{f,g}$ as the lowest index in the path $P^T_{f,g}$ (defined in Eq. (2.24)),
$$i^T_{f,g} = \inf\{q \,|\, l_q \in P^T_{f,g}\}, \tag{2.33}$$
and the function $u_j$ is the cutoff for the $j$-th band
$$u_j(k) := \left[u(r\,\Lambda_j^{-2}) - u(r\,\Lambda_{j-1}^{-2})\right]\Big|_{r = k_0^2 + e^2(\vec k)}. \tag{2.34}$$
By multi-linearity one can expand the determinant in (2.27) according to the different bands in the sum (2.32) for each row and column:
$$\det M(E) = \sum_{\mu}\det M(\mu, E), \tag{2.35}$$
where an attribution $\mu$ is a collection of band indices for each loop field $a \in L$:
$$\mu = \{\mu(f_1), \ldots, \mu(f_{n+1-p}), \mu(g_1), \ldots, \mu(g_{n+1-p})\}, \qquad \mu(a) \in B \ \text{ for } a = 1, \ldots, 2n+2-2p. \tag{2.36}$$
Now, for each attribution µ we need to exploit power counting. This requires notations for the various types of fields or half-lines which form the analogs of the quasi local subgraphs of [R] in our formalism (that is subgraphs with all internal lines higher than the external ones). For a loop half line (with index a) or an external line (with index j ) we call va or vj the vertex to which it hooks. Similarly, for tree half lines fi and gi , we call vfi or vgi the vertex to which they hook. We define as iv the band index of the highest tree line hooked to the vertex v, and, for each k ≥ 1: Tk = {li ∈ T | i ≥ k}.
(2.37)
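In particular we define $t_k$ as the unique connected component of $T_k$ containing the tree line $l_k$. We say that a vertex $v \in t_k$ if $i_v \ge k$ and $l_{i_v} \in t_k$. The matrix element of the determinant in (2.35) is then
$$M_{fg}(\mu)(x_f, x_g) = \delta_{\mu(f),\mu(g)}\,\frac{1}{(2\pi)^2}\int d^3k\; e^{ik(x_f - x_g)}\, C(k)\, u_{\mu(f)}(k)\, W^{\mu(f)}_{v_f, v_g}, \tag{2.38}$$
where
$$W^k_{v,v'} = \begin{cases} 1 & \text{if } v \text{ and } v' \text{ are connected by } T_k, \\ 0 & \text{otherwise.} \end{cases} \tag{2.39}$$
Now we define the quasi-local subgraph at level $k$, $g_k$, as
$$\begin{aligned}
g_k &= t_k \cup il_k, \\
et_k &= \{l_i \in T \,|\, v_i \in t_k,\ i < k\}, \\
il_k &= \{a \in L \,|\, v_a \in t_k,\ \mu(a) > A(k)\}, \\
el_k &= \{a \in L \,|\, v_a \in t_k,\ \mu(a) \le A(k)\}, \\
ee_k &= \{(\phi_j, v_j) \in E \,|\, v_j \in t_k\}, \\
eg_k &= et_k \cup el_k \cup ee_k, \\
V_k &= \{v \,|\, v \in t_k\},
\end{aligned} \tag{2.40}$$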
where $et_k$, $el_k$ and $ee_k$ are the tree, loop and real external half lines respectively, $il_k$ are the internal loop half lines, and $A(k)$ denotes the index of the highest tree external line of $g_k$. In defining internal and external loop half-lines we have observed that no new line connects to $t_k$ in the interval between $k$ and $A(k)$. Hence all loop half-lines connected to the vertices of $t_k$ with attributions between $k$ and $A(k)$ are in fact internal lines for the subgraph $g_k$, as they must contract between themselves. Therefore we have considered as external loop half lines only the ones with attributions $\mu(a) \le A(k)$. In the following, we will denote by $|A|$ the number of elements of a set $A$.
Tadpoles. We remark that $\mu(a) \le i_{v_a}$ always. Indeed we could have $\mu(a) > i_{v_a}$ only if the line $a$ belongs to a tadpole. But the contribution of a tadpole is zero$^5$, as proved by the following lemma:
Lemma 2. The amplitude of a tadpole with loop line in some band $i$ is zero $\forall i$.
Proof. The loop integral is:
$$\frac{1}{(2\pi)^2}\int d^3k\; C^{\Lambda_i}_{\Lambda_{i-1}}(k) = -\frac{1}{(2\pi)^2\beta}\sum_{k_0}\int d^2k\; \frac{ik_0 + e(\vec k)}{k_0^2 + e^2(\vec k)}\; U\big(k_0^2, e^2(\vec k)\big), \tag{2.41}$$
where
$$U\big(k_0^2, e^2(\vec k)\big) = u\!\left(\frac{k_0^2 + e^2}{\Lambda_i^2}\right) - u\!\left(\frac{k_0^2 + e^2}{\Lambda_{i-1}^2}\right). \tag{2.42}$$
By the properties of $u$, $U \ne 0$ only for $\Lambda_{i-1}^2/4 \le k_0^2 + e^2 \le \Lambda_i^2/2$. The integral reduces to
$$-\frac{1}{(2\pi)^2\beta}\sum_{k_0}\int d^2k\; \frac{e(\vec k)}{k_0^2 + e^2(\vec k)}\; U\big(k_0^2, e^2(\vec k)\big) \tag{2.43}$$
as the other term is odd under $k_0 \to -k_0$. Performing the change of variables $t = |\vec k|^2 - 1$ the spatial integral (for any $k_0$ fixed) becomes
$$\int_0^{2\pi} d\theta\int_{-1}^{\infty}\frac{dt}{2}\;\frac{t}{k_0^2 + t^2}\; U(k_0^2, t^2) = \pi\int_{-1}^{1} dt\;\frac{t}{k_0^2 + t^2}\; U(k_0^2, t^2) = 0 \tag{2.44}$$
by parity. We remark that the domain of $t$ can be reduced to $[-1, 1]$ since, for $t \ge 1$, $k_0^2 + t^2 \ge 1 > \Lambda_i^2/2$, hence $U = 0$.
2.4.3. Analyticity of convergent attributions. We call an attribution $\mu$ convergent if it satisfies $|eg_k| \ge 6$ for any $k > 1$. We remark that for $k = 1$, $|eg_1| = 2p$, and for $p \le 2$ we cannot require that this last subgraph has more than 4 external legs. The convergent part of the theory is defined by the functions
$$\Gamma_{2p,c.}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p}) = \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\sum_{o-T}\sum_{E}\sum_{\Omega}\varepsilon(T,\Omega)\int d^3x_1\ldots d^3x_n\ \phi_1(x_{i_1})\ldots\phi_{2p}(x_{j_p})\int_{\Lambda_T\le\Lambda_1\le\cdots\le\Lambda_{n-1}\le 1}\ \prod_{q=1}^{n-1}\left[\frac{\partial}{\partial\Lambda_q}\, C_\Lambda^{\Lambda_q}(x_{l_q}, \bar x_{l_q})\, d\Lambda_q\right]\sum_{\mu\ \mathrm{conv.}}\det M(\mu, E). \tag{2.45}$$
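The parity argument in the proof of Lemma 2 can be checked numerically on the $t$-integral of (2.44). The sketch below is an illustration under stated assumptions: `u` is a hypothetical smooth cutoff (not the paper's Gevrey function), and the values of $k_0$, $\Lambda_{i-1}$ and $\Lambda_i$ are arbitrary.

```python
import math

def u(r):
    """Hypothetical smooth cutoff: 1 for |r| <= 1/4, 0 for |r| >= 1/2."""
    a = abs(r)
    if a <= 0.25:
        return 1.0
    if a >= 0.5:
        return 0.0
    s = (a - 0.25) / 0.25
    g_up = math.exp(-1.0 / s)
    g_down = math.exp(-1.0 / (1.0 - s))
    return g_down / (g_up + g_down)

def band_window(k0, t, lam_low, lam_high):
    # U(k0^2, t^2) of (2.42), with e(k) replaced by t = |k|^2 - 1.
    r = k0 * k0 + t * t
    return u(r / lam_high ** 2) - u(r / lam_low ** 2)

def integrand(t, k0, lam_low, lam_high):
    return t / (k0 * k0 + t * t) * band_window(k0, t, lam_low, lam_high)

def midpoint(f, a, b, n=20000):
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

k0, lam_low, lam_high = 0.1, 0.3, 0.6
full = midpoint(lambda t: integrand(t, k0, lam_low, lam_high), -1.0, 1.0)
half = midpoint(lambda t: integrand(t, k0, lam_low, lam_high), 0.0, 1.0)
print("integral over [-1, 1]:", full)   # vanishes by parity
print("integral over [0, 1]: ", half)   # each half is non-zero on its own
```

The window $U$ depends on $t$ only through $t^2$, so the integrand is odd in $t$ and the two halves cancel, which is the content of (2.44).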
We start with a first theorem which essentially reproduces the result of [FMRT1] in our framework of continuous cutoffs. This theorem states that the infrared limit (i.e. the zero temperature limit) of the convergent part of the theory exists and is analytic in the bare coupling constant. The full theorem on the Fermi liquid, which includes renormalization and requires a finite temperature cutoff, is postponed to the companion paper (Part II).
Theorem 1. For fixed $\Lambda_u$ and $T \ge 0$, the limit $\Lambda \to 0$ of the function $\Gamma_{2p,c.}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p})$ exists and is analytic in $\lambda$ for any $|\lambda| \le c$, where $c$ is the convergence radius.
5 Tadpoles are exactly zero because we choose our ultraviolet cutoff small enough. Otherwise the tadpole would simply be very small, which would add some inessential complications.
This partial result is interesting because it isolates the constructive arguments from the computation of the renormalization group flow. We conjecture that the same theorem holds in three dimensions but have no proof so far (see however [MR] for a partial result in that direction). The rest of the paper is devoted to a proof of Theorem 1.
3. Further Expansion Steps
3.1. Chains. The decomposition into bands has a price: we have to perform the additional sum over convergent attributions $\mu$. As in [DR1] this sum might develop a factorial. In other words fixing the band index for each single half-line develops the determinant too much. To overcome this difficulty we remark that the attributions contain much more information than necessary, hence we can group attributions into packets to reduce the number of determinants to bound. This operation is based on four remarks. For each band index $i$ we analyze the subgraph $g_i$:
• for each $g_i$ nothing happens in the interval between $i$ and $A(i)$, as it contains just loop internal half lines that contract between themselves. Therefore we can regroup all the attributions in this interval;
• if $|eg_i| \le 10$ we want to know exactly which loop fields are external and which ones are internal;
• if $|eg_i| \ge 11$ and $|et_i| + |ee_i| < 11$ we just want to fix the attributions for $11 - |et_i| - |ee_i|$ loop fields, but we do not need to fix the attributions for the remaining loop fields;
• if $|eg_i| \ge 11$ and $|et_i| + |ee_i| \ge 11$ we do not fix the attributions for any loop field.
We remark that a subgraph is potentially divergent when it has two or four external lines. For this reason in [DR1] we selected at most five external lines to ensure convergence. Here we select at most eleven external lines because of additional technical difficulties due to the sector counting and renormalization, that will be explained in the following (see Sect. 4.4, (4.43) and Part II, Sect. 3.5, (3.28)).
As seen below this does not develop the determinant too much. Hence, instead of expanding the loop determinant over rows and columns as a sum over all attributions
$$\det M = \sum_{\mu}\det M(\mu), \tag{3.1}$$
we write it as a sum over a smaller set $P$ (called the set of packets). These packets are defined by means of a function
$$\phi : \{\mu\} \longrightarrow P, \qquad \mu \mapsto \mathcal{C} = \phi(\mu), \tag{3.2}$$
which to each attribution $\mu$ associates a class $\mathcal{C} = \phi(\mu)$, an element of $P$. For our resummation purpose, the function $\phi$ must have two crucial properties:
• $\#\{P\} \le K^n$ (this is critical for summation over packets);
• there exists a matrix $M'$ such that
$$\sum_{\mu\in\phi^{-1}(\mathcal{C})}\det M(\mu) = \det M'(\mathcal{C}) \tag{3.3}$$
and some form of Gram's inequality applies to $\det M'(\mathcal{C})$.
The construction of a function $\phi$ with these properties is developed in detail in [DR1]$^6$. We just recall the result: for each class $\mathcal{C}$, each loop field $a$ no longer belongs to a single band $\mu(a)$, but to a set of bands:
$$J_a(\mathcal{C}) = \{\mu(a) \,|\, m(a,\mathcal{C}) \le \mu(a) \le M(a,\mathcal{C}) \le i_{v_a}\}, \tag{3.4}$$
and the new matrix elements are
$$M'_{x_f,x_g}(\mathcal{C}) = \frac{1}{(2\pi)^2}\int d^3k\; e^{ik(x_f - x_g)}\, C(k)\sum_{q=1}^{n}\eta^q_{a(f)}\,\eta^q_{a(g)}\, u_q(k)\, W^q_{v_f,v_g}, \tag{3.5}$$
where $M'$ is a function of $\mathcal{C}$, and $\eta_a$ is the characteristic function of the set of bands attributed by $\mathcal{C}$ to the loop field $a$:
$$\eta_a(\mathcal{C}) : B \to \{0, 1\}, \qquad \eta^q_a(\mathcal{C}) = \begin{cases} 0 & \text{if } q \notin J_a(\mathcal{C}), \\ 1 & \text{if } q \in J_a(\mathcal{C}). \end{cases} \tag{3.6}$$
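Finally we remark that the construction of [DR1] groups convergent attributions $\mu$ into convergent classes $\mathcal{C}$ which form a subset of the set $P$. Therefore the convergent functions $\Gamma_{2p,c.}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p})$ can be rewritten as:
$$\Gamma_{2p,c.}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p}) = \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\sum_{o-T}\sum_{E}\sum_{\Omega}\sum_{\mathcal{C}\ c.}\varepsilon(T,\Omega)\int d^3x_1\ldots d^3x_n\ \phi_1(x_{i_1})\ldots\phi_{2p}(x_{j_p})\int_{\Lambda_T\le\Lambda_1\le\cdots\le\Lambda_{n-1}\le 1}\ \prod_{q=1}^{n-1}\left[C^{\Lambda_q}(\bar x_{l_q}, x_{l_q})\, d\Lambda_q\right]\det M'(\mathcal{C}, E) \tag{3.7}$$
and the definitions of internal and external lines for each subgraph $g_i$ can be generalized:
$$\begin{aligned}
il_i(\mathcal{C}) &:= \{a \in L \,|\, v_a \in t_i,\ M(a,\mathcal{C}) > A(i)\}, \\
el_i(\mathcal{C}) &:= \{a \in L \,|\, v_a \in t_i,\ M(a,\mathcal{C}) \le A(i)\}, \\
eg_i(\mathcal{C}) &:= et_i \cup el_i(\mathcal{C}) \cup ee_i.
\end{aligned} \tag{3.8}$$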
6 We need only to modify $\phi$ slightly to accommodate the expansion up to eleven external lines instead of five external lines. This has no other consequences than a larger constant $K$ for the first condition (the number $3^5$ in [DR,(IV.13)] is replaced by $3^{11}$).
[Fig. 3. Three panels: "A Clustering Tree Structure C"; "An ordinary tree T and an ordering σ"; "A CTS with labeling L corresponding to T and σ".]
Fig. 3. Left: A CTS and a tree, with an ordering; Right: The associated CTS with labeling induced. The vertices of the tree are named a, b, c, d, e, f, g; the lines are named by the pair of vertices they join; the ordering is indicated by the numbers 1, 2, 3, 4, 5, 6 on the lines of $T$. Finally on the right, the numbers $N_F(T, L)$ are shown on each line $F$
3.2. Partial ordering. We have seen that attributions contain much more information than necessary and that this affects the convergence of the series. Hence we have regrouped attributions into packets preserving only the information needed to perform power counting. Similarly the total ordering over tree line energies contains unnecessary information that makes power counting more complicated and less transparent. Indeed we are not interested in the relative ordering of tree lines that belong to mutually disjoint connected components $g_i$. Hence we reorganize the scale analysis according to a structure that we call Clustering Tree Structure (CTS), that contains the desired scale information and no more. This structure is closely related to the "Gallavotti–Nicolo" trees.
Definition. A Clustering Tree Structure (CTS) is an unlabeled rooted tree, with $2n-2$ lines and $2n-1$ vertices of two different types: $n-1$ crosses and $n$ dots, such that the root is a cross with coordination 2, each other cross has coordination 3 and each dot coordination 1 (see Fig. 3). Obviously
Lemma 3. The number of CTS at order $n$ is at most $3^{n-1}$.
Proof. We start from the cross root and climb in the structure. At each cross there are at most three choices for the two vertices immediately above: two dots, one dot and one cross, or two crosses. Hence, the number of crosses being $n-1$, the total number of choices is bounded by $3^{n-1}$ (this is only an upper bound because some choices may not lead to a structure made of $n-1$ crosses and $n$ dots).
3.2.1. Labeling. We want to relate a CTS at order $n$ to an ordinary tree $T$ with $n$ vertices. The $n-1$ lines of $T$ are labeled by an index $l$ and the $2n-2$ lines of the CTS are labeled by an index $F$ (they should not be confused). A labeling $L$ of the CTS is a one to one map between the set of vertices (crosses and dots) of the CTS and the vertices and lines of $T$, so that each cross of the CTS is labeled by a particular line of $T$, and each dot of the CTS by a particular vertex of $T$, satisfying a further constraint. For each $F$, let $T_F(L)$ be the subset of $T$ made of all lines and vertices of $T$ corresponding to all crosses and dots "above $F$" (that is such that the unique path in the CTS joining this cross or dot to the root passes through $F$). The constraint on the labeling $L$ is that $T_F(L)$ has to be connected for all $F$. We call $N_F(T, L)$ the number of external lines of $T$ hooked to $T_F(L)$,
$$L : \{\times, \circ\} \longrightarrow \{l, v\}, \qquad L(\times) = l, \qquad L(\circ) = v.$$
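The bound of Lemma 3 can be compared with an exact count for small $n$. The sketch below assumes that two structures differing only by exchanging the two subtrees above a cross are identified (the CTS is unlabeled), so that a CTS with $n$ dots is an unordered rooted binary tree with $n$ leaves; this identification, and the function name `count_cts`, are assumptions of the example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_cts(n_dots):
    """Number of unordered rooted binary trees with n_dots leaves (dots)."""
    if n_dots == 1:
        return 1  # a single dot, used only as a sub-structure
    total = 0
    # Split the dots between the two sub-structures above the lowest cross.
    for smaller in range(1, n_dots // 2 + 1):
        larger = n_dots - smaller
        if smaller < larger:
            total += count_cts(smaller) * count_cts(larger)
        else:
            # Equal sizes: unordered pairs of sub-structures, repetitions allowed.
            c = count_cts(smaller)
            total += c * (c + 1) // 2
    return total

for n in range(2, 11):
    print(n, count_cts(n), 3 ** (n - 1))
```

The exact count grows much more slowly than $3^{n-1}$, in line with the remark in the proof that the bound overcounts; only the exponential form $K^n$ matters for the summation.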
We consider in what follows only trees $T$ with coordination $N_v$ at each vertex $v$ bounded by 4 (since other trees cannot appear as subgraphs in the model we consider). We remark that a tree can be considered as the list $V = \{N_v\}$ of its coordination numbers plus the set of Wick contractions $W$ which associates together two by two the half lines or "fields" hooked to each vertex, subject to the constraint that the resulting graph is a tree.
Let $T$ be a tree with $n$ vertices, and $\sigma_T$ a total ordering of its lines. In [DR1] it is shown how to construct an associated CTS and a labeling $L$. We recall the rule: the first line in the ordering is the cross root. When cut, it separates $T$ into two ordered trees $T_1$ and $T_2$ (possibly reduced to a single vertex). The process is iterated in each subtree: in $T_1$ and $T_2$ the lowest lines give the label of the crosses immediately above the root and so on (see Fig. 3). When subtrees reduced to a single vertex are met, a dot appears instead of a cross.
Conversely for a given tree $T$, the same CTS and labeling $L$ can be obtained from many total orderings $\sigma_T$. Indeed CTS and $L$ induce only a partial ordering $\sigma_P$ on the lines of $T$: $l_i \ge_P l_j$ if the path from the cross with label $l_i$ to the root passes through the cross with label $l_j$. Every total ordering $\sigma_T$ compatible with this partial ordering gives the same CTS and labeling $L$. This is somehow a defect. Our new point of view resums all these total orderings to retain only the partial ordering $\sigma_P$ (which is the one relevant for scale analysis). Hence the sum over ordered trees can be written as
$$\sum_{o-T} = \sum_{u-T}\sum_{\sigma_T} = \sum_{u-T}\sum_{CTS}\sum_{L}\sum_{\sigma_T\to(CTS,L)} = \sum_{CTS}\sum_{u-T}\sum_{L}\sum_{\sigma_T\to(CTS,L)},$$
where $u-T$ is an unordered tree and $\sum_{\sigma_T\to(CTS,L)}$ is the sum over the set of total orderings that give the same couple $(CTS, L)$, for $u-T$ fixed. Now we observe that
$$\sum_{\sigma_T\to(CTS,L)}\int_{\Lambda_T\le\Lambda_1\le\cdots\le\Lambda_{n-1}\le 1} = \int_{\Lambda_T\le\Lambda_{A(i)}\le\Lambda_i\le 1,\ \forall i},$$
where the integration is now on the region of the $\Lambda$ parameters satisfying the partial ordering relations associated to $\sigma_P$. We call $\Lambda_r := \min_i \Lambda_i$ the parameter associated to the lowest tree line, that is the root of the CTS, and by convention we put $\Lambda_{A(r)} := \Lambda_T$. We remark that now for any $\Lambda_i$ we only know that
$$\min[\Lambda_{i'}, \Lambda_{i''}] \ge \Lambda_i \ge \Lambda_{A(i)}, \tag{3.9}$$
where $\Lambda_{i'}$ and $\Lambda_{i''}$ are the parameters associated to the two crosses above $i$ (if there is a dot instead we assume the corresponding parameter to be 1). In this new point of view the band $q$ corresponds to the energy interval $[\Lambda_q, \Lambda_{A(q)}]$ instead of $[\Lambda_q, \Lambda_{q-1}]$ and in (3.5), the new matrix element $W^q_{v_f,v_g}$ selects only the vertices connected by $t_q$, hence in (2.39) $T_k$ has to be replaced by $t_k$. The expression (3.7) for the vertex function becomes
$$\Gamma_{2p,c.}^{\Lambda\Lambda_u}(\phi_1, \ldots, \phi_{2p}) = \sum_{n=1}^{\infty}\frac{\lambda^n}{n!}\sum_{CTS}\sum_{u-T}\sum_{L}\sum_{E}\sum_{\Omega}\sum_{\mathcal{C}\ c.}\varepsilon(T,\Omega)\int d^3x_1\ldots d^3x_n\ \phi_1(x_{i_1})\ldots\phi_{2p}(x_{j_p})\int_{\Lambda_T\le\Lambda_{A(i)}\le\Lambda_i\le 1}\ \prod_{q=1}^{n-1}\left[C^{\Lambda_q}(x_{l_q}, \bar x_{l_q})\, d\Lambda_q\right]\det M'(\mathcal{C}, E). \tag{3.10}$$
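The rule recalled above for building a CTS with its labeling from a totally ordered tree is recursive and can be sketched directly. In the example below the data layout (lines as `(order, a, b)` triples, the CTS as nested tuples) and the function name `build_cts` are assumptions made for illustration; the test tree is a hypothetical path with four vertices, not the tree of Fig. 3.

```python
def build_cts(vertices, lines):
    """Build the labeled CTS of a tree.

    vertices: set of vertex names; lines: list of (order, a, b) with distinct orders.
    Returns ('dot', v) or ('cross', line_name, child, child), children in canonical order.
    """
    if not lines:
        (v,) = vertices            # a subtree reduced to a single vertex
        return ("dot", v)
    lowest = min(lines)            # the first line in the ordering labels the cross
    _, a, b = lowest
    rest = [line for line in lines if line is not lowest]
    # Collect the connected component containing a once the lowest line is cut.
    side = {a}
    changed = True
    while changed:
        changed = False
        for _, x, y in rest:
            if (x in side) != (y in side):
                side.update((x, y))
                changed = True
    lines_a = [line for line in rest if line[1] in side]
    lines_b = [line for line in rest if line[1] not in side]
    children = [build_cts(side, lines_a), build_cts(set(vertices) - side, lines_b)]
    children.sort(key=repr)        # the CTS does not distinguish left from right
    return ("cross", "".join(sorted((a, b))), children[0], children[1])

vertices = {"a", "b", "c", "d"}
ordering_1 = [(2, "a", "b"), (1, "b", "c"), (3, "c", "d")]
ordering_2 = [(3, "a", "b"), (1, "b", "c"), (2, "c", "d")]
ordering_3 = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d")]
print(build_cts(vertices, ordering_1))
print(build_cts(vertices, ordering_1) == build_cts(vertices, ordering_2))
print(build_cts(vertices, ordering_1) == build_cts(vertices, ordering_3))
```

The first two orderings differ only in the relative order of the lines `ab` and `cd`, which lie in disjoint components once `bc` is cut, so they yield the same labeled CTS; this is the redundancy that the partial ordering $\sigma_P$ removes.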
3.3. Sectors. Band decoupling is not enough to obtain correct power counting. Roughly speaking, this happens for two reasons.
1. The partition of unity for internal lines (tree and loop lines) is not fine enough, as the volume in phase space $\Delta x\,\Delta k$ depends on $\alpha$. Actually $\Delta x$ is given by the rate of spatial decay which is $1/\sqrt{\alpha}$ in all three directions. On the other hand $\Delta k$ is given by the band volume, proportional to $1/\alpha$. Then $\Delta x\,\Delta k \simeq \sqrt{\alpha}$. To obtain a phase space volume independent of $\alpha$ we must take a smaller volume in momentum space. For that, adapting to our continuous formalism the idea of [FMRT1], we cut the two-dimensional Fermi surface $|\vec k| = 1$ into angular sectors of size $1/\alpha_s^{1/4}$ $^7$. Now the volume in phase space of a single angular sector is $1/(\alpha\,\alpha_s^{1/4})$. The spatial decay rate is $1/\sqrt{\alpha}$ in two directions, and $1/\alpha_s^{1/4}$ in the third, tangential direction (provided $\alpha_s$ is not bigger than $\alpha$, as explained in [FMRT1]). Then the phase space volume becomes a constant independent of $\alpha$ and $\alpha_s$, as it should for a single "degree of freedom" of the theory.
2. Real external lines do not need any sector decomposition as their impulsion is fixed with a precision $\Lambda_T$.
3.3.1. Sector cutoffs. To introduce the angular sectors we insert in
$$C_\alpha(x) = \frac{1}{(2\pi)^2\beta}\sum_{k_0}\int_0^{\infty} d|\vec k|\; |\vec k|\int_0^{2\pi} d\theta\; e^{ikx}\, C_\alpha(k)$$
the identity
$$\frac{4}{3}\,\alpha_s^{1/4}\int_0^{2\pi} d\theta_s\; \chi^{\theta}_{\alpha_s}(\theta_s) = 1, \tag{3.11}$$
where $\chi^{\theta}_{\alpha_s}(\theta_s) = \chi^{\theta_s}_{\alpha_s}(\theta)$ selects a small angular sector centered on $\theta_s$. The factor $\frac{4}{3}\alpha_s^{1/4}$ is needed to normalize properly the integral (see (2.9)). Indeed to define $\chi$ we use again the Gevrey function $u : \mathbb{R} \to \mathbb{R}$ of the previous section:
$$\chi^{\theta}_{\alpha_s}(\theta_s) := u^p_{\alpha_s}\big[\alpha_s^{1/4}(\theta - \theta_s)\big], \tag{3.12}$$
7 $\alpha_s$ is not necessarily equal to $\alpha$, since we need to exploit momentum conservation of sectors at various intermediate scales between $\alpha$ and the ultraviolet scale. The power 1/4 is chosen as in [FMRT1], to avoid a logarithmic divergence related to "almost collapsed rhombuses".
where $u^p_{\alpha_s}$ is the periodic function of period $\tau = 2\pi\alpha_s^{1/4}$, obtained from $u$ by: $u^p_{\alpha_s}(y) = u(x)$ when $y = x + n\tau$ for some $x \in [-1/2, 1/2[$ and $n \in \mathbb{Z}$, and $u^p_{\alpha_s}(y) = 0$ otherwise. This definition satisfies the condition (3.11). A sector is defined as a couple $(\alpha_s, \theta_s)$. For a given sector $(\alpha_s, \theta_s)$, we define the support $I(\alpha_s, \theta_s)$ to be the support of the function $\chi^{\theta_s}_{\alpha_s}(\theta)$. Inside this support $|\theta - \theta_s| \le (1/2)\,\alpha_s^{-1/4}$. Now, in order to exploit momentum conservation at each vertex and subgraph, we need to decompose each half-line (either loop or tree) a certain number of times into sectors with different values of $\alpha_s$, starting from larger sizes (hence smaller $\alpha_s$) and then refining them into smaller ones. This process requires the definition of a sequence of scales for each line. These scales roughly speaking represent all scales $i$ for which the half line is external to the subgraph $g_i$ and $|eg_i(\mathcal{C})| \le 10$ (as we can exploit momentum conservation only in this case), plus a last scale, characteristic of the line and the class $\mathcal{C}$. The subgraph $g_r$ requires a particular treatment: its external lines are the only real external lines of the whole graph, hence we can always exploit momentum conservation, even if $2p = |eg_r| > 10$.
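The normalization (3.11) of the sector cutoffs follows from $\int u = 3/4$ in (2.9) and can be verified numerically. In the sketch below `u` is a hypothetical smooth cutoff whose integral equals $3/4$ by the symmetry of its transition region (it is not claimed to be the paper's Gevrey function), and the period of $u^p_{\alpha_s}$ is taken to be $2\pi\alpha_s^{1/4}$, so that $\chi$ is $2\pi$-periodic in the angle.

```python
import math

def u(r):
    """Hypothetical smooth cutoff: 1 for |r| <= 1/4, 0 for |r| >= 1/2, integral 3/4."""
    a = abs(r)
    if a <= 0.25:
        return 1.0
    if a >= 0.5:
        return 0.0
    t = (a - 0.25) / 0.25
    g_up = math.exp(-1.0 / t)
    g_down = math.exp(-1.0 / (1.0 - t))
    return g_down / (g_up + g_down)

def u_periodic(y, alpha_s):
    # Periodization of u with period tau = 2 pi alpha_s^(1/4) (needs tau >= 1).
    tau = 2.0 * math.pi * alpha_s ** 0.25
    x = (y + 0.5 * tau) % tau - 0.5 * tau   # representative of y in [-tau/2, tau/2)
    return u(x)

def chi(theta, theta_s, alpha_s):
    # Sector cutoff, cf. (3.12).
    return u_periodic(alpha_s ** 0.25 * (theta - theta_s), alpha_s)

def sector_normalization(theta, alpha_s, n=20000):
    # (4/3) alpha_s^(1/4) * integral over theta_s in [0, 2 pi] of chi, cf. (3.11).
    step = 2.0 * math.pi / n
    integral = step * sum(chi(theta, (k + 0.5) * step, alpha_s) for k in range(n))
    return (4.0 / 3.0) * alpha_s ** 0.25 * integral

for alpha_s in (16.0, 256.0):
    for theta in (1.0, 6.2):   # the second angle sits next to the 0 / 2 pi identification
        print(alpha_s, theta, sector_normalization(theta, alpha_s))
```

The angle close to $2\pi$ exercises the periodization: part of the sector wraps around to small $\theta_s$, and the integral still equals the same constant.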
3.4. Choice of scales $\alpha_s$ for each half-line. Let us introduce an index $h$ which parametrizes loop, tree and external half-lines. The sum over sector choices will be done inductively, from the root towards the leaves. We then choose as the root vertex the external vertex $x_{e_1}$, and as the root the test function $\phi_{e_1}$. Now we denote the two half-lines belonging to the tree line $l_i$ as $h_i^L$ ($h$ left) and $h_i^R$ ($h$ right) in such a way that $h_i^L \to h_i^R$ is oriented towards the root vertex. Hence we define $T_L$ and $T_R$ as the sets of tree half-lines of left and right type respectively. We remark that, for any subgraph $g_k$ with $k \neq r$ (as $|eg_r| = 2p$, $g_r$ has no tree external line), there is at most one tree half-line $h_i \in et_k \cap T_R$ (that we call $h_k^{root}$) going towards the root. If $e_1 \in ee_k$, all tree external half-lines belong to $T_L$ and we put $h_k^{root} = e_1$. The sector of this half-line is kept fixed in the sum over sector choices until scale 0. In the same way the sector of each tree right half-line $h_i^R$ is kept fixed in the sum over sector choices until scale $i$; by momentum conservation along the tree line $l_i$ this sector is then equal to that of $h_i^L$. Therefore for each tree line $l_i$ we perform sector decoupling and sector sums only for $h_i^L$ (as $h_i^R$ is automatically fixed by $h_i^L$). In the following, $h_i^R$ will appear only as $h^{root}$ for some subgraph $g_i$, hence to simplify notation we write simply $h_i$ for $h_i^L$.

Given the class $\mathcal{C}$ we define a natural scale $i(h)$ associated to each $h \in L \cup T_L$:
• For the left half-line belonging to the tree line $l_i$ obviously $i(h_i) = i$.
• For a loop half-line $h = a$ we choose $i(h) = M(a, \mathcal{C})$ (this choice avoids the "logarithmic divergence" associated to momentum conservation in 2 dimensions, see [FMRT1], Lemma 2).

We then introduce a growing sequence of indices $j_{h,1} = i(h), \dots, j_{h,n_h} = i_{v_h}$ such that each scale $j_{h,r}$ of the sequence corresponds to a refining of that half-line in sectors of size $1/\alpha_{j_{h,r}}^{1/4} = \Lambda_{j_{h,r}}^{1/2}$.

We remark that the lowest refining scale is $i(h)$. The choice of these indices is the following: a half-line $h \in T_L \cup L$ is refined at scale $j = i(h)$ and at all scales $j$ such that $h \in eg_i(\mathcal{C})$ for some level $i$ with $j = A(i)$ and such that $|eg_i(\mathcal{C})| \le 10$.
Interacting Fermi Liquid in Two Dimensions at Finite Temperature. I
This multiple decomposition has to be adapted to the different bounds satisfied by tree and loop lines.

3.4.1. Tree lines. As explained above, we introduce the multi-sector decomposition only for the left half-line of $l_i$, $h_i$. We must ensure that the spatial decay of the tree line $l_i$ depends only on the finest sector (at level $i$), hence we apply the identity (3.11) just one time, at the scale $i$. We then decompose each tree left half-line on larger sectors introducing the identity
\[
1 = \frac{4}{3}\,\alpha_{j_{h_i,r}}^{1/4} \int_0^{2\pi} d\theta_{h_i,r}\; \chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,r}}}(\theta_{h_i,r}).
\tag{3.13}
\]
This actually selects $\theta_{h_i,r}$ to be in a sector of size $\Lambda_{j_{h_i,r}}^{1/2}$ around $\theta_{h_i,1}$. Hence, for the half-tree line $h_i \in T_L$ the complete decomposition is:
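\[
\begin{aligned}
1 &= \Big[\tfrac{4}{3}\alpha_{j_{h_i,1}}^{1/4} \int_0^{2\pi} d\theta_{h_i,1}\, \chi^{\theta_i}_{\alpha_{j_{h_i,1}}}(\theta_{h_i,1})\Big] \prod_{r=2}^{n_{h_i}} \Big[\tfrac{4}{3}\alpha_{j_{h_i,r}}^{1/4} \int_0^{2\pi} d\theta_{h_i,r}\, \chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,r}}}(\theta_{h_i,r})\Big] \\
&= \tfrac{4}{3}\alpha_{j_{h_i,n_{h_i}}}^{1/4} \int_0^{2\pi} d\theta_{h_i,n_{h_i}}\;
\tfrac{4}{3}\alpha_{j_{h_i,n_{h_i}-1}}^{1/4} \int_{I_{j_{h_i,n_{h_i}}}} d\theta_{h_i,n_{h_i}-1} \cdots
\tfrac{4}{3}\alpha_{j_{h_i,2}}^{1/4} \int_{I_{j_{h_i,3}}} d\theta_{h_i,2}\;
\tfrac{4}{3}\alpha_{j_{h_i,1}}^{1/4} \int_{I_{j_{h_i,2}}} d\theta_{h_i,1} \\
&\qquad \times \Big[\prod_{r=2}^{n_{h_i}} \chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,r}}}(\theta_{h_i,r})\Big]\, \chi^{\theta_i}_{\alpha_{j_{h_i,1}}}(\theta_{h_i,1}),
\end{aligned}
\tag{3.14}
\]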
where we defined sectors twice as large as the previous ones:
\[
I_{j_{h,r}} := I(\alpha_{j_{h,r}}/2^4, \theta_{h,r}) \equiv \big\{ \theta \;\big|\; |\theta_{h,r} - \theta| \le \Lambda_{j_{h,r}}^{1/2} \big\}.
\tag{3.15}
\]
Indeed the integration domain for $\theta_{h_i,r}$, $r \ge 2$, can be restricted to $I_{j_{h_i,r+1}}$ if we observe that the product $\chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,r}}}(\theta_{h_i,r})\,\chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,r+1}}}(\theta_{h_i,r+1})$ can be non zero only if $\theta_{h_i,r} \in I_{j_{h_i,r+1}}$. This is also true for $r = 1$ since the single function $\chi^{\theta_{h_i,1}}_{\alpha_{j_{h_i,2}}}(\theta_{h_i,2})$ is non zero only if $|\theta_{h_i,1} - \theta_{h_i,2}| \le \frac{1}{2}\Lambda_{j_{h_i,2}}^{1/2}$, which implies $\theta_{h_i,1} \in I_{j_{h_i,2}}$. Finally we remark that for each $r \ge 1$, $\theta_i \in I_{j_{h_i,r}}$, where $\theta_i$ is the angular variable for the momentum of the propagator of line $i$.

3.4.2. Loop half-lines. Loop lines are not used in spatial decay, and there is no sector conservation along the line, as we do not know exactly which loop fields are contracted. Hence we can decompose them as we want. To simplify notation, we treat them exactly in the same way as the tree left half-lines.
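Hence the expression (3.10) for the convergent part of the vertex function becomes:
\[
\begin{aligned}
&\sum_{n=1}^{\infty} \frac{\lambda^n}{n!} \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{E} \sum_{\Omega} \sum_{\mathcal{C}_c} \varepsilon(\mathcal{T}, \Omega) \int_{\Lambda_T \le \Lambda_{A(i)} \le \Lambda_i \le 1} \prod_{q=1}^{n-1} d\Lambda_q \\
&\quad \prod_{h \in L \cup T_L} \Big\{ \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h}}^{-1/2} \int_0^{2\pi} d\theta_{h,n_h}\Big] \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h-1}}^{-1/2} \int_{I_{j_{h,n_h}}} d\theta_{h,n_h-1}\Big] \cdots \Big[\tfrac{4}{3}\Lambda_{j_{h,1}}^{-1/2} \int_{I_{j_{h,2}}} d\theta_{h,1}\Big] \prod_{r=2}^{n_h} \chi^{\theta_{h,1}}_{\alpha_{j_{h,r}}}(\theta_{h,r}) \Big\} \\
&\quad \int d^3x_1 \dots d^3x_n\; \phi_1(x_{i_1}, \theta_{e_1,1}) \dots \phi_{2p}(x_{j_p}, \theta_{e_{2p},1}) \Big[\prod_{q=1}^{n-1} C^{\Lambda_q}(x_q, \bar{x}_q, \theta_{h,1})\Big] \det M(\mathcal{C}, E, \{\theta_{a,1}\}),
\end{aligned}
\tag{3.16}
\]
where, in order to have the same notations for tree, loop and external half-lines, we denoted by $\theta_{e,1} = k_e$ the center of the box of size $\Lambda_T$ where the external impulsion is located (as there is no refinement, the index $n_e = 1$ always), and
\[
C^{\Lambda_q}(\bar{x}_q, x_q, \theta_{h,1}) := \frac{1}{(2\pi)^2} \int d^3k\; e^{ik(x_q - \bar{x}_q)}\, C^{\Lambda_q}(k)\, \chi^{\theta_{h,1}}_{\alpha_{j_{h,1}}}(\theta),
\tag{3.17}
\]
and the coefficients of the matrix $M(\mathcal{C}, E, \{\theta_{a,1}\})$ are
\[
M(\mathcal{C}, E, \{\theta_{a,1}\})_{x_f, x_g} := \frac{1}{(2\pi)^2} \int d^3k\; e^{ik(x_f - x_g)}\, C(k) \Big[\sum_{q=1}^{n} \eta^q_{a(f)}\, \eta^q_{a(g)}\, u^q(k)\, W^q_{v_f, v_g}\Big] \chi^{\theta_{a(f),1}}_{\alpha_{j_{a(f),1}}}(\theta)\; \chi^{\theta_{a(g),1}}_{\alpha_{j_{a(g),1}}}(\theta).
\tag{3.18}
\]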
We remark that the sums over sectors have been taken out of the determinant by multilinearity, and that we used $\chi^{\theta_1}_{\alpha_1}(\theta) = \chi^{\theta}_{\alpha_1}(\theta_1)$. Now we want to exploit momentum conservation. At each subgraph $g_i$ with $i \neq r$ and $|eg_i(\mathcal{C})| \le 10$ we refine all loop and tree external lines in sectors at the scale $A(i)$, except for the half-line $h_i^{root}$ which (if it is not $e_1$) is fixed in a sector of size $\Lambda_j^{1/2} \le \Lambda_{A(i)}^{1/2}$ (for some $0 \le j \le A(i)$). Actually the volume of integration for the new sectors is restricted by momentum conservation. To take into account these effects we insert in the expression above
\[
1 = \Upsilon\big(\theta_{h_i^{root}}, \{\theta_{h,r(i)}\}_{h \in eg_i^*}\big) + \Big[1 - \Upsilon\big(\theta_{h_i^{root}}, \{\theta_{h,r(i)}\}_{h \in eg_i^*}\big)\Big],
\tag{3.19}
\]
where we defined $r(i)$ as the number of refinements we have done on the half-line $h$ until $A(i)$ (this means $j_{h,r(i)} = A(i)$). We remark that $r(e) = 0$. We also set $eg_i^* := eg_i \setminus \{h_i^{root}\}$ and define the function $\Upsilon$ to be 0 if the set of selected sectors is forbidden by momentum conservation at this subgraph, and 1 otherwise. Therefore after insertion of (3.19) the term $1 - \Upsilon$, forbidden by momentum conservation, gives a zero contribution. Hence we can insert freely in (3.16) the product
\[
\prod_{g_i \,|\, i = r \text{ or } |eg_i(\mathcal{C})| \le 10} \Upsilon\big(\theta_{h_i^{root}}, \{\theta_{h,r(i)}\}_{h \in eg_i^*}\big).
\tag{3.20}
\]
In this way we exploit momentum conservation at each subgraph, but we still have to exploit it at each vertex. For that we need some additional notation. We call $H(v)$ the set of half-lines hooked to $v$ (and $|H(v)|$ its cardinal). We define $H^*(v) := H(v) \setminus h_v^{root}$, where $h_v^{root}$ is the half-line going towards the root. We remark that the scale $i_v$ is the largest scale of refinement for each of the tree and loop half-lines in $H^*(v)$: $i_v = j_{h,n_h}$, $\forall h \in H^*(v) \cap (L \cup T_L)$. Again we can insert the function $\Upsilon\big(\theta_{h_v^{root}}, \{\theta_{h,n_h}\}_{h \in H^*(v)}\big)$, which is zero when the sectors are not permitted by momentum conservation at vertex $v$. Hence, by the same argument as above, we can freely insert in (3.16)
\[
\prod_{v} \Upsilon\big(\theta_{h_v^{root}}, \{\theta_{h,n_h}\}_{h \in H^*(v)}\big).
\tag{3.21}
\]
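4. Main Result and Bounds

Now we have all the elements to perform the bounds. We insert absolute values inside the sums and integrals and obtain the inequality
\[
\begin{aligned}
|\Gamma^{\Lambda\Lambda_u}_{2p,c}| \le{} & \sum_{n=1}^{\infty} \frac{|\lambda|^n}{n!} \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{E} \sum_{\Omega} \sum_{\mathcal{C}_c} \int_{\Lambda_T \le \Lambda_{A(i)} \le \Lambda_i \le 1} \prod_{q=1}^{n-1} d\Lambda_q \\
& \prod_{h \in L \cup T_L} \Big\{ \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h}}^{-1/2} \int_0^{2\pi} d\theta_{h,n_h}\Big] \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h-1}}^{-1/2} \int_{I_{j_{h,n_h}}} d\theta_{h,n_h-1}\Big] \cdots \Big[\tfrac{4}{3}\Lambda_{j_{h,1}}^{-1/2} \int_{I_{j_{h,2}}} d\theta_{h,1}\Big] \prod_{r=2}^{n_h} \chi^{\theta_{h,1}}_{\alpha_{j_{h,r}}}(\theta_{h,r}) \Big\} \\
& \prod_{g_i \,|\, i \neq r \text{ and } |eg_i(\mathcal{C})| \le 10} \Upsilon\big(\theta_i^{root}, \{\theta_{h,r(i)}\}_{h \in eg_i^*}\big) \prod_{v} \Upsilon\big(\theta_{h_v^{root}}, \{\theta_{h,n_h}\}_{h \in H^*(v)}\big) \\
& \int d^3x_1 \dots d^3x_n\; |\phi_1(x_{i_1}, \theta_{e_1,1})| \dots |\phi_{2p}(x_{j_p}, \theta_{e_{2p},1})| \prod_{q=1}^{n-1} |C^{\Lambda_q}(x_q, \bar{x}_q, \theta_{h,1})|\; |\det M(\mathcal{C}, E, \{\theta_{a,1}\})|.
\end{aligned}
\tag{4.1}
\]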
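Actually we prove the following theorem (more precise than Theorem 1):

Theorem 2. Let $\varepsilon > 0$, $\Lambda_u = 1$ and $T \ge 0$ be fixed. The series (4.1) is absolutely convergent for $|\lambda| \le c$, $c$ small enough. This convergence is uniform in $\Lambda$, so the IR limit $\Gamma^{\Lambda_u}_{2p,c} = \lim_{\Lambda \to 0} \Gamma^{\Lambda\Lambda_u}_{2p,c}$ exists and satisfies the bounds:
\[
\begin{aligned}
|\Gamma^{\Lambda_u}_{2p>4,c}(\phi_1, \dots, \phi_{2p})| &\le K_0\, \frac{T^{(5-3p)/2}}{3p-5}\, [K_1(\varepsilon)]^p\, (p!)^2\, K(c)\; e^{-(1-\varepsilon)\,\Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})}, \\
|\Gamma^{\Lambda_u}_{4,c}(\phi_1, \dots, \phi_4)| &\le K_0\, T^{-1/2}\, K(c)\; e^{-(1-\varepsilon)\,\Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_4})}, \\
|\Gamma^{\Lambda_u}_{2,c}(\phi_1, \phi_2)| &\le K_0\, K(c)\; e^{-(1-\varepsilon)\,\Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, x_{e_2})},
\end{aligned}
\tag{4.2}
\]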
where $x_{e_i}$ is the position of the maximum of $\phi_i$, $K_1(\varepsilon)$ is a constant depending on $\varepsilon$, $K(c)$ is a function of $c$ that tends to zero when $c$ tends to zero, and $s$ is the Gevrey index of our cutoff function $u$ (we assume that $1 < s < 2$). Finally we defined
\[
d_T(x_1, \dots, x_{2p}) := \inf_{u-\mathcal{T}} \sum_{l \in \mathcal{T}} |\bar{x}_l - x_l|,
\tag{4.3}
\]
where in the definition of $d_T(x_1, \dots, x_{2p})$ (called the tree distance of $x_1, \dots, x_{2p}$) the infimum over $u-\mathcal{T}$ is taken over all unordered trees (with any number of vertices) connecting $x_1, \dots, x_{2p}$. These bounds are also true in the case of fixed impulsions, but without the exponential decay factor.
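Since the infimum in (4.3) runs over trees with any number of vertices, extra branching points are allowed, so a spanning tree built on the points $x_1, \dots, x_{2p}$ alone only gives an upper bound on $d_T$. The following sketch computes that upper bound with Prim's algorithm; it is an illustration of the definition, not a computation of $d_T$ itself, and the function name is hypothetical.

```python
import math

def mst_length(points):
    """Total Euclidean length of a minimal spanning tree (Prim's algorithm).

    This is an upper bound on the tree distance of (4.3), because the
    infimum there also ranges over trees with additional vertices.
    """
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge linking each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # pick the point outside the tree that is cheapest to attach
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total
```

For the four corners of a unit square this returns 3, whereas a tree with two additional branching points is shorter, which shows that the bound is not sharp in general.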
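4.1. Loop determinant. To bound the loop determinant we apply Gram's inequality, which states that if $M$ is an $n \times n$ matrix whose elements $M_{ij} = \langle f_i, g_j \rangle$ are scalar products of vectors $f_i$, $g_j$ in a Hilbert space, then $|\det M| \le \prod_{i=1}^{n} \|f_i\| \prod_{j=1}^{n} \|g_j\|$.

Lemma 4. The matrix $M(\mathcal{C})$ satisfies the following Gram inequality:
\[
|\det M(\mathcal{C})| \le \prod_f \|F_f\|_{\mathcal{C}} \prod_g \|G_g\|_{\mathcal{C}}
= \prod_f \Big[\frac{1}{(2\pi)^2} \int d^3k\; u^f_{\mathcal{C}}(k)\, |F_f(k)|^2\Big]^{1/2} \prod_g \Big[\frac{1}{(2\pi)^2} \int d^3k\; u^g_{\mathcal{C}}(k)\, |G_g(k)|^2\Big]^{1/2},
\tag{4.4}
\]
where the cut-off $u^a_{\mathcal{C}}(k)$ is defined by
\[
u^a_{\mathcal{C}}(k) := u\Big(\frac{k_0^2 + e^2(k)}{\Lambda^2_{M(a,\mathcal{C})}}\Big) - u\Big(\frac{k_0^2 + e^2(k)}{\Lambda^2_{A(m(a,\mathcal{C}))}}\Big).
\tag{4.5}
\]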
Proof. The proof is identical to that of Lemma 4 in [DR1]. The only difference is that here we have a partial order instead of the total order of [DR1]. We summarize it for completeness. We observe that the matrix element (3.18) can be written as
\[
\frac{1}{(2\pi)^2} \int d^3k\; F_f(k)\, G^*_g(k) \sum_{q=1}^{n} W^q_{v_f, v_g}\, u^q(k)\, \eta^q_{a(f)}\, \eta^q_{a(g)},
\tag{4.6}
\]
where we defined
\[
F_f(k) = e^{i x_f k}\, \chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta)\, \frac{1}{(k_0^2 + e^2(k))^{1/4}}, \qquad
G_g(k) = e^{i x_g k}\, \chi^{\theta_{g,1}}_{\alpha_{j_{g,1}}}(\theta)\, \frac{i k_0 + e(k)}{(k_0^2 + e^2(k))^{3/4}}.
\tag{4.7}
\]
We introduce the matrix
\[
\mathcal{W}^q_{v,a;v',b} := R^q_{a,b}\, W^q_{v,v'} := \eta^q_a\, \eta^q_b\, W^q_{v,v'}
\tag{4.8}
\]
for $v, v'$ belonging to the set of $n$ vertices, and $a, b$ to the set of $2n + 2 - 2p$ loop half-lines (fields and anti-fields). Both $R^q_{a,b}$ and $W^q_{v,v'}$ can be written (modulo permutation of field and vertex indices) as block diagonal positive matrices or sums of matrices of the type
\[
\begin{pmatrix} 1_k & 0 \\ 0 & 0 \end{pmatrix},
\tag{4.9}
\]
where $1_k$ is a $k \times k$ matrix with all elements equal to 1. Then $\mathcal{W}^q$ is positive, $\sum_q u^q \mathcal{W}^q$ is positive too and there exists a positive matrix $U$ defined by
\[
\sum_{w,c} U_{v,a;w,c}\, U_{w,c;v',b} := \sum_q u^q\, \mathcal{W}^q_{v,a;v',b}.
\tag{4.10}
\]
The matrix elements of the determinant can be written as the scalar product of two functions
\[
M_{fg} = \frac{1}{(2\pi)^2} \int d^3k \sum_{v',s} F^f_{v's}\, \big(G^g_{v's}\big)^* = \langle F^f, G^g \rangle,
\tag{4.11}
\]
where we defined
\[
F^f_{v's}(k) = F_f(k)\, U_{v',s;v(f),a(f)}, \qquad G^g_{v's}(k) = G_g(k)\, U_{v',s;v(g),a(g)}.
\tag{4.12}
\]
Applying the Gram inequality we obtain (4.4).
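Gram's inequality as stated at the beginning of this subsection can be illustrated on finite dimensional real vectors. The sketch below is not part of the proof: it uses the ordinary Euclidean scalar product instead of the weighted scalar product of (4.11), and the helper names are hypothetical.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    sign, result = 1.0, 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return sign * result

def gram_check(fs, gs):
    """Return (|det M|, prod ||f_i|| * prod ||g_j||) for M_ij = <f_i, g_j>."""
    m = [[sum(x * y for x, y in zip(f, g)) for g in gs] for f in fs]
    lhs = abs(det(m))
    rhs = (math.prod(math.hypot(*f) for f in fs)
           * math.prod(math.hypot(*g) for g in gs))
    return lhs, rhs
```

The first returned value never exceeds the second; equality holds, for instance, when both families coincide with the same orthonormal basis. The useful feature for the bound (4.4) is that the right-hand side contains no factorial in the size of the matrix.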
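With these definitions, the norms of $F_f$ and $G_g$ satisfy the bounds
\[
\|F_f\|_{\mathcal{C}} \le K\, \Lambda^{1/4}_{M(f,\mathcal{C})}\, \big[\Lambda_{M(f,\mathcal{C})} - \Lambda_{A(m(f,\mathcal{C}))}\big]^{1/2}, \qquad
\|G_g\|_{\mathcal{C}} \le K\, \Lambda^{1/4}_{M(g,\mathcal{C})}\, \big[\Lambda_{M(g,\mathcal{C})} - \Lambda_{A(m(g,\mathcal{C}))}\big]^{1/2}.
\tag{4.13}
\]
Indeed let us bound for instance the norm of $F_f$:
\[
\begin{aligned}
\|F_f\|^2_{\mathcal{C}} &= \frac{1}{(2\pi)^2} \int d^3k\; \frac{[\chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta)]^2}{[k_0^2 + e^2(k)]^{1/2}} \Big[ u\Big(\frac{k_0^2 + e^2(k)}{\Lambda^2_{M(f,\mathcal{C})}}\Big) - u\Big(\frac{k_0^2 + e^2(k)}{\Lambda^2_{A(m(f,\mathcal{C}))}}\Big) \Big] \\
&= \int_{\Lambda^{-2}_{M(f,\mathcal{C})}}^{\Lambda^{-2}_{A(m(f,\mathcal{C}))}} d\alpha\; \frac{1}{(2\pi)^2} \int d^3k\; [\chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta)]^2 \Big( -x^{1/2}\, u'[\alpha x] \Big)\Big|_{x = k_0^2 + e^2(k)} \\
&\le \int_{\Lambda^{-2}_{M(f,\mathcal{C})}}^{\Lambda^{-2}_{A(m(f,\mathcal{C}))}} d\alpha\; \frac{1}{\beta}\, |S|\, \sup_S \Big| \chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta) \Big( -x^{1/2}\, u'[\alpha x] \Big)\Big|_{x = k_0^2 + e^2(k)} \Big| \\
&\le K\, \Lambda^{1/2}_{M(f,\mathcal{C})} \int_{\Lambda^{-2}_{M(f,\mathcal{C})}}^{\Lambda^{-2}_{A(m(f,\mathcal{C}))}} d\alpha\; \alpha^{-3/2} \le K\, \Lambda^{1/2}_{M(f,\mathcal{C})}\, \big[\Lambda_{M(f,\mathcal{C})} - \Lambda_{A(m(f,\mathcal{C}))}\big],
\end{aligned}
\tag{4.14}
\]
where $K$ is some constant, $S$ is the set in momentum space selected by the cut-offs $\chi$ and $u'$, and we applied the bounds:
\[
[\chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta)]^2 \le \chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta), \qquad
\sup_S \Big| \chi^{\theta_{f,1}}_{\alpha_{j_{f,1}}}(\theta) \Big( -x^{1/2}\, u'[\alpha x] \Big)\Big|_{x = k_0^2 + e^2(k)} \Big| \le K \alpha^{-1/2}, \qquad
|S| \le \beta\, \Lambda^{1/2}_{M(f,\mathcal{C})}\, \alpha^{-1}.
\tag{4.15}
\]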
Finally the loop determinant is bounded by
\[
|\det M(\mathcal{C}, E, \{\theta_{a,1}\})| \le K^n \prod_{a \in L} \Lambda^{1/4}_{M(a,\mathcal{C})}\, \big[\Lambda_{M(a,\mathcal{C})} - \Lambda_{A(m(a,\mathcal{C}))}\big]^{1/2}.
\tag{4.16}
\]
This bound no longer depends on $\{\theta_{a,r}\}$ or $E$.

4.2. Spatial integrals. To perform spatial integration we use the decay of tree lines. The test functions are not used in spatial integration, except $\phi_1(x) = \delta(x - x_{e_1})$ which is used to perform the integration over the root $x_{i_1}$,
\[
\int d^3x_1 \dots d^3x_n\; |\phi_1(x_{i_1}, \theta_{e_1,1})| \dots |\phi_{2p}(x_{j_p}, \theta_{e_{2p},1})| \prod_{q=1}^{n-1} |C^{\Lambda_q}(x_q, \bar{x}_q, \theta_{h_q,1})|.
\tag{4.17}
\]
We now estimate the norms of the test functions and the spatial decay of the tree propagators.

4.2.1. Spatial decay of tree lines. We consider tree line propagators and prove that they decay as Gevrey functions of class $s$, where $s$ is the Gevrey index of our initial cutoff $u$,
\[
|C^{\Lambda_q}(\delta x_q, 0, \theta_{h_q,1})| \le K\, \frac{1}{\Lambda_q^3}\, \Lambda_q\, \Lambda_q^{5/2}\; e^{-a \big[ |(\delta x_q)_0 \Lambda_q|^{1/s} + |(\delta x_q)_r \Lambda_q|^{1/s} + |(\delta x_q)_t \Lambda_q^{1/2}|^{1/s} \big]},
\tag{4.18}
\]
where we applied translational invariance, $\delta x_q := x_q - \bar{x}_q$, $(\delta x_q)_r$ and $(\delta x_q)_t$ are the radial and tangential components of $\delta x_q$ relative to the sector center $\theta_{h,1}$, and $K$ and $a$ are some positive constants. We remark that the smallest sector governs the spatial decay rate. To prove this formula we study the propagator at $T = 0$, $C_0^{\Lambda_q}$. Using the properties of Gevrey functions with compact support, $C_0^{\Lambda_q}$ satisfies (4.18) too (see Appendix A). Then applying (2.6) completes the proof of (4.18).

4.2.2. Test functions. By the properties of the Gevrey functions, we can prove, as for the propagators, that
\[
|\phi_j(x_{i_j})| \le K\, e^{-a (|x_{i_j} - x_{e_j}| \Lambda_T)^{1/s}} \qquad \forall j = 2, \dots, 2p.
\tag{4.19}
\]
4.2.3. Bound. Now we can complete the bound on (4.17). But, before going on, we take out a fraction $(1 - \varepsilon)$ of the exponential decay (4.18) of each tree line and of the exponential decay (4.19) of each test function (except $\phi_1$). This factor is bounded by
\[
\prod_{q=1}^{n-1} e^{-a(1-\varepsilon) |(\delta x_q) \Lambda_q|^{1/s}} \prod_{q=2}^{2p} e^{-a(1-\varepsilon) (|x_{i_q} - x_{e_q}| \Lambda_T)^{1/s}} \le e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})}.
\tag{4.20}
\]
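We bound the remaining fraction of exponential decay for $\phi_i$ ($i \neq 1$) by one. On the other hand we keep the remaining fraction $\varepsilon$ of the decay of tree propagators to perform spatial integration:
\[
\begin{aligned}
\int d^3x_1\, d^3x_2 \dots d^3x_n\; \delta(x_{i_1} - x_{e_1}) &\prod_{q=1}^{n-1} e^{-a\varepsilon \big[ |(\delta x_q)_0 \Lambda_q|^{1/s} + |(\delta x_q)_r \Lambda_q|^{1/s} + |(\delta x_q)_t \Lambda_q^{1/2}|^{1/s} \big]} \\
&\le \prod_{q=1}^{n-1} \int d^3x\; e^{-a\varepsilon \big[ |x_0 \Lambda_q|^{1/s} + |x_r \Lambda_q|^{1/s} + |x_t \Lambda_q^{1/2}|^{1/s} \big]} \\
&\le \prod_{q=1}^{n-1} \frac{1}{\Lambda_q^{5/2}} \int d^3u\; e^{-a\varepsilon \big[ u_0^{1/s} + u_1^{1/s} + u_2^{1/s} \big]}
\le K \prod_{q=1}^{n-1} \frac{1}{\Lambda_q^{5/2}},
\end{aligned}
\tag{4.21}
\]
and Eq. (4.17) is bounded by
\[
K\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})} \prod_{q=1}^{n-1} \frac{1}{\Lambda_q^2}.
\tag{4.22}
\]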
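4.3. Sector sum. We still have to perform the sums over sector choices:
\[
\begin{aligned}
\prod_{h \in L \cup T_L} &\Big\{ \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h}}^{-1/2} \int_0^{2\pi} d\theta_{h,n_h}\Big] \Big[\tfrac{4}{3}\Lambda_{j_{h,n_h-1}}^{-1/2} \int_{I_{j_{h,n_h}}} d\theta_{h,n_h-1}\Big] \cdots \Big[\tfrac{4}{3}\Lambda_{j_{h,1}}^{-1/2} \int_{I_{j_{h,2}}} d\theta_{h,1}\Big] \Big\} \\
&\prod_{g_i \,|\, i = r \text{ or } |eg_i(\mathcal{C})| \le 10} \Upsilon\big(\theta_i^{root}, \{\theta_{h,r(i)}\}_{h \in eg_i^*}\big) \prod_{v} \Upsilon\big(\theta_{h_v^{root}}, \{\theta_{h,n_h}\}_{h \in H^*(v)}\big),
\end{aligned}
\tag{4.23}
\]
where the products $\prod_{h \in T_L \cup L} \prod_{r=2}^{n_h} \chi^{\theta_{h,1}}_{\alpha_{j_{h,r}}}(\theta_{h,r})$ have been bounded by one. We perform the sums for each half-line starting from the lowest scale $i(h)$ and going up towards the leaves (that means the vertices). There is no sum over the root sector, as this is an external line at fixed impulsion. The sums for different half-lines are mixed by the $\Upsilon$ function. For any band $i$ we consider the subgraph $g_i$. If $|eg_i(\mathcal{C})| \ge 11$ or $i = r$ there is no $\Upsilon$ function for this subgraph and only lines with $i(h) = A(i)$ are refined (in the particular case of $g_r$ there is no refinement at all). Hence we have to perform
\[
\prod_{\substack{h \in eg_i^*(\mathcal{C}) \setminus ee_i^*(\mathcal{C}) \\ j_{h,1} = i(h) = A(i)}} \Big[ \tfrac{4}{3}\Lambda_{j_{h,1}}^{-1/2} \int_{I_{j_{h,2}}} d\theta_{h,1}\; 1 \Big]
\le K^{\#\{h \in eg_i^*(\mathcal{C}) \setminus ee_i^*(\mathcal{C}) \,|\, j_{h,1} = i(h) = A(i)\}} \prod_{\substack{h \in eg_i^*(\mathcal{C}) \setminus ee_i^*(\mathcal{C}) \\ j_{h,1} = i(h) = A(i)}} \frac{\Lambda_{j_{h,2}}^{1/2}}{\Lambda_{j_{h,1}}^{1/2}},
\tag{4.24}
\]
where we defined $ee_i^*(\mathcal{C}) := ee_i(\mathcal{C}) \setminus e_1$ if $e_1 \in eg_i(\mathcal{C})$ and $ee_i^*(\mathcal{C}) := ee_i(\mathcal{C})$ otherwise. If $|eg_i(\mathcal{C})| \le 10$ and $i \neq r$ we have an $\Upsilon$ function expressing the momentum conservation at this subgraph, and all loop and tree external fields have been refined.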
Each field $h \in eg_i^* \setminus ee_i^*$ is refined at the scale $A(i) = j_{h,r(i)}$. Hence we have to perform
\[
\prod_{h \in eg_i^*(\mathcal{C}) \setminus ee_i^*(\mathcal{C})} \Big[ \tfrac{4}{3}\Lambda_{j_{h,r(i)}}^{-1/2} \int_{I_{j_{h,r(i)+1}}} d\theta_{h,r(i)} \Big]\; \Upsilon\big(\theta_i^{root}, \{\theta_{h,r(i)}\}_{h \in eg_i^*(\mathcal{C})}\big).
\tag{4.25}
\]
We know that the function $\Upsilon$ reduces the size of the integrals to perform. When $|ee_i^*| = 0$ we can apply Lemma 5 below, which states that once the sectors for $|eg_i(\mathcal{C})| - 2$ external lines have been fixed, the last two sectors are automatically fixed. This means that, since the sector $\theta_i^{root}$ is always fixed, we have to perform the sector sum only for $|eg_i(\mathcal{C})| - 3$ external lines. If $|ee_i^*| \neq 0$ Lemma 5 is always true, as real external impulsions are fixed with a precision $\Lambda_T$. Anyway, if $|ee_i^*| > 2$, the number of sector sums is at most $|eg_i(\mathcal{C})| - |ee_i^*| - 1$.

Lemma 5. Let $I_i := (\alpha^{-1/4}, \theta_i^s)$ for $i = 1, \dots, l$ be a set of $l \ge 2$ sectors on the Fermi surface centered on $\theta_i^s$ of size $\alpha^{-1/4}$. Let the sector center $\theta_1^s$ be fixed, and the other sector centers $\theta_i^s$ vary over intervals $\Omega_i$ […] $\alpha^{-1/4}$. We define the function $\Upsilon(\{\theta_i^s\})$ to be zero, unless there exists some set of momenta $k^1, \dots, k^l$ satisfying
\[
\big| |k^i| - 1 \big| \le 1/\sqrt{\alpha} \;\; \forall i; \qquad \sum_{i=1}^{l} k^i = 0; \qquad k^i \in I_i \;\; \forall i
\]
($k^i \in I_i$ in radial coordinates means $|\theta_i - \theta_i^s| \le \alpha^{-1/4}$). Then the integral over $\theta_i^s \in \Omega_i$ […] $\tfrac{4}{3}\alpha^{1/4}$ […] (4.26)
when $I$ is the subset of indices of the $l - 3$ largest intervals among $\Omega_2, \dots$ […] (4.27)
where, when $|ee_i^*| \le 2$, we define $I(i)$ as the set of $|eg_i(\mathcal{C})| - 3$ half-lines $h \in eg_i^* \setminus ee_i^*$ that have the largest sectors $I_{j_{h,r(i)+1}}$, and when $|ee_i^*| > 2$, we define $I(i)$ as the set of all $|eg_i(\mathcal{C})| - 1 - |ee_i^*(\mathcal{C})|$ half-lines $h \in eg_i^* \setminus ee_i^*$. In the particular case of $g_r$ we have no sector sum at all, as the only external half-lines are the real ones.
We still have to consider the sums over the largest sectors: they correspond to the vertices. Each vertex v ∈ V can be treated as a subgraph with |eg| ≤ 10, hence we can apply Lemma 5 with
h
h∈H ∗ (v), h∈L∪TL
0
(where inessential constants such as |
K 10 K 4n−2p , (4.29) gi |i=r and |egi (C)|≤10
where the last factor comes from the finest refinement for each internal field (there are at most 4n − 2p such fields). Now #{gi : |egi (C)| ≤ 10} ≤ n . This ends the proof. 4.4. Main bound. With all these elements, we can bound the sum (4.1): !!u |-2p c. | ≤ e
1
1
−a (1−ε) !Ts dTs (xe1 ,...,xe2p )
n=1
K0
∞ n c
n−1 !T ≤!A(i) ≤!i ≤1 i=1
d!i
gi | i=r, |egi (C)|≥11
1
h∈egi∗ \eei∗ jh,1 =i(h)=A(i)
n−1 i=1
n!
!j2h,2 1 !j2h,1
Kn
CT S u−T
(4.30)
L < E Cc
!A(m(a,C )) 2 1 43 1− ! !M(a,C ) !2i a∈L M(a,C )
1
gi | i=r, |egi (C)|≤10
1 2 ! jh,r(i)+1 − 21 !iv , 1 v∈V h∈I (i) !j2 h,r(i)
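where we have bounded $|\lambda| \le c$. Now we can take the limit $\Lambda \to 0$ and, after bounding $\big[ 1 - \Lambda_{A(m(a,\mathcal{C}))} / \Lambda_{M(a,\mathcal{C})} \big]^{1/2}$ by one, we factorize the integrals by performing the change of variable:
\[
\Lambda_i = \frac{1}{\beta_i}\, \Lambda_{A(i)}, \qquad 1 \le i \le n - 1.
\tag{4.31}
\]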
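By (3.9) we have the following bound for $\beta_i$:
\[
\beta_i \in \Big[ \frac{\Lambda_{A(i)}}{\min[\Lambda_{i'}, \Lambda_{i''}]},\; 1 \Big].
\tag{4.32}
\]
Now each $\Lambda_i$ can be written
\[
\Lambda_i = \Lambda_T \prod_{j \in C_i} \frac{1}{\beta_j},
\tag{4.33}
\]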
where we defined $C_i$ as the set of crosses on the chain joining the cross $i$ to the root. The Jacobian of this transformation is the determinant of the matrix
\[
M_{ij} = \frac{\partial \Lambda_i}{\partial \beta_j} = -\frac{1}{\beta_j}\, \Lambda_i\, \chi(j \in C_i),
\]
where $\chi(j \in C_i) = 1$ if $j \in C_i$ and 0 otherwise. If we order the rows and columns of $M_{ij}$ putting the root first, then the first layer of the CTS and so on, we see that $M_{ij}$ is a triangular matrix; hence its determinant is given by
\[
|\mathrm{Jac}| = \prod_{i=1}^{n-1} \Big| \frac{\partial \Lambda_i}{\partial \beta_i} \Big| = \Lambda_T^{n-1} \prod_{i=1}^{n-1} \frac{1}{\beta_i} \prod_{j \in C_i} \frac{1}{\beta_j} = \Lambda_T^{n-1} \prod_{i=1}^{n-1} \frac{1}{\beta_i}\, \frac{1}{\beta_i^{\,n_i - 1}},
\tag{4.34}
\]
where $n_i$ is the number of vertices in the subgraph $g_i$. Indeed $\beta_i$ appears in the chain $C_j$ exactly for all $j \ge_P i$, hence its exponent is the number of crosses above $i$, which is the number of tree lines in $g_i$, that is $n_i - 1$.
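The triangular structure behind (4.34) can be made concrete on a small example. In the sketch below the clustering tree structure is reduced to a hypothetical `parent` list for the crosses (root first, each cross listed after its parent), the chain $C_i$ is taken to contain the cross $i$ itself, as (4.31) and (4.33) require, and the count `above[i]` of chains containing $i$ is the quantity the text identifies with $n_i - 1$.

```python
import math

def chain(i, parent):
    """Crosses on the chain joining cross i to the root, both included."""
    out = []
    while i is not None:
        out.append(i)
        i = parent[i]
    return out

def scales(beta, parent, lam_T):
    """Lambda_i = lam_T * prod_{j in C_i} 1/beta_j, as in (4.33)."""
    return [lam_T / math.prod(beta[j] for j in chain(i, parent))
            for i in range(len(beta))]

def jacobian(beta, parent, lam_T):
    """Matrix M_ij = dLambda_i/dbeta_j = -Lambda_i/beta_j if j in C_i, else 0."""
    lam = scales(beta, parent, lam_T)
    m = len(beta)
    chains = [set(chain(i, parent)) for i in range(m)]
    return [[-lam[i] / beta[j] if j in chains[i] else 0.0 for j in range(m)]
            for i in range(m)]

def abs_jacobian_closed_form(beta, parent, lam_T):
    """Right-hand side of (4.34), with above[i] playing the role of n_i - 1."""
    m = len(beta)
    above = [sum(1 for j in range(m) if i in chain(j, parent)) for i in range(m)]
    return lam_T ** m * math.prod(b ** (-1 - above[i]) for i, b in enumerate(beta))
```

With the root-first ordering every entry above the diagonal vanishes, so the determinant is the product of the diagonal entries, which is what (4.34) expresses.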
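In the variables $\beta_i$ the bound (4.30) becomes
\[
\begin{aligned}
|\Gamma^{\Lambda_u}_{2p,c}| \le{} & K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})} \sum_{n=1}^{\infty} \frac{c^n}{n!} K^n \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{\Omega} \sum_{E} \sum_{\mathcal{C}_c} \prod_{i=1}^{n-1} \int_{\Lambda_T}^{1} d\beta_i\; \Lambda_T^{n-1} \prod_{i=1}^{n-1} \beta_i^{-1 + (1 - n_i)} \\
& \prod_{i=1}^{n-1} \Big[ \Lambda_T^{-2} \prod_{j \in C_i} \beta_j^{2} \Big] \prod_{a \in L} \Big[ \Lambda_T^{3/4} \prod_{j \in C_{M(a,\mathcal{C})}} \beta_j^{-3/4} \Big]
\prod_{g_i \,|\, i \neq r,\, |eg_i(\mathcal{C})| \ge 11}\; \prod_{\substack{h \in eg_i^* \setminus ee_i^* \\ j_{h,1} = i(h) = A(i)}}\; \prod_{j \in C_{r(i)+1} \setminus C_{r(i)}} \frac{1}{\beta_j^{1/2}} \\
& \prod_{g_i \,|\, i \neq r,\, |eg_i(\mathcal{C})| \le 10}\; \prod_{h \in I(i)}\; \prod_{j \in C_{r(i)+1} \setminus C_{r(i)}} \frac{1}{\beta_j^{1/2}} \prod_{v \in V} \Big[ \Lambda_T^{-1/2} \prod_{j \in C_{i_v}} \beta_j^{1/2} \Big],
\end{aligned}
\tag{4.35}
\]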
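where we have taken as integration domain for all $\beta_i$ the interval $[\Lambda_T, 1]$, that contains the exact integration domain, since $\Lambda_{A(i)} / \min[\Lambda_{i'}, \Lambda_{i''}] \ge \Lambda_{A(i)} \ge \Lambda_T$. We write the integrals over the different $\beta_i$ as a product $\prod_{i=1}^{n-1} \int_{\Lambda_T}^{1} d\beta_i\, \beta_i^{-1 + x_i}$. We have to find out the expression for $x_i$. We observe that
\[
\begin{aligned}
\prod_{i=1}^{n-1} \prod_{j \in C_i} \beta_j^{2} &= \prod_{i=1}^{n-1} \beta_i^{2(n_i - 1)}, \\
\prod_{a \in L} \prod_{j \in C_{M(a,\mathcal{C})}} \beta_j^{-3/4} &= \prod_{i=1}^{n-1} \beta_i^{-\frac{3}{4} \#\{a \in L \,|\, M(a,\mathcal{C}) \ge_P i\}} = \prod_{i=1}^{n-1} \beta_i^{-\frac{3}{4} |il_i(\mathcal{C})|}, \\
\prod_{v \in V} \prod_{j \in C_{i_v}} \beta_j^{1/2} &= \prod_{i=1}^{n-1} \beta_i^{\frac{1}{2} \#\{v \in V \,|\, i_v \ge_P i\}} = \prod_{i=1}^{n-1} \beta_i^{\frac{1}{2} n_i},
\end{aligned}
\]
and the remaining products over sector attributions are equal to $\prod_{i=1}^{n-1} \beta_i^{-y_i}$, where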
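\[
\begin{aligned}
y_i &= 0 && \text{if } |eg_i(\mathcal{C})| = 2 \text{ or } i = r, \\
y_i &\le \tfrac{1}{2}(|eg_i(\mathcal{C})| - 3) && \text{if } |eg_i(\mathcal{C})| \le 10 \text{ and } i \neq r, \\
y_i &\le \tfrac{1}{2}(|eg_i(\mathcal{C})| - 1) && \text{if } |eg_i(\mathcal{C})| > 10 \text{ and } i \neq r.
\end{aligned}
\tag{4.36}
\]
To obtain this bound we observe that the factor $\beta_i$ appears in the product with a power $-1/2$ each time there is a half-line $h \in T_L \cup L$ with $i \in C_{r(i)+1} \setminus C_{r(i)}$ for some $r$ and the corresponding factor appears in the sector counting. Now, for each subgraph $g_i$ ($i \neq r$) we have three situations:
• $|eg_i(\mathcal{C})| = 2$: then the factor $\beta_i$ does not appear, i.e. $y_i = 0$.
• $4 \le |eg_i(\mathcal{C})| \le 10$: all loop and tree external half-lines except $h^{root}$ are refined and the factor $\beta_i$ appears with power $-a_i/2$, where $a_i \le (|eg_i(\mathcal{C})| - 3)$.
• $|eg_i(\mathcal{C})| > 10$: only some of the loop and tree external half-lines of $g_i$ (other than $h^{root}$) are refined; therefore the factor $\beta_i$ appears with power $-\frac{1}{2} a_i$, where $a_i \le (|eg_i(\mathcal{C})| - 1)$ is the number of half-lines refined.

Now we can bound (4.35) (using that $|L| = 2(n + 1 - p)$):
\[
|\Gamma^{\Lambda_u}_{2p,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})} \sum_{n=1}^{\infty} \frac{c^n}{n!} K^n\, \Lambda_T^{(5-3p)/2} \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{\Omega} \sum_{E} \sum_{\mathcal{C}_c} \prod_{i=1}^{n-1} \int_{\Lambda_T}^{1} d\beta_i\; \beta_i^{-1 + x_i},
\tag{4.37}
\]
where
\[
x_i \ge (n_i - 1) - \tfrac{3}{4} |il_i(\mathcal{C})| + \tfrac{1}{2} n_i - \tfrac{1}{2} (|eg_i(\mathcal{C})| - 3)
\tag{4.38}
\]
when $i \neq r$ and $4 \le |eg_i(\mathcal{C})| \le 10$,
\[
x_i \ge (n_i - 1) - \tfrac{3}{4} |il_i(\mathcal{C})| + \tfrac{1}{2} n_i - \tfrac{1}{2} (|eg_i(\mathcal{C})| - 1)
\tag{4.39}
\]
when $i \neq r$ and $|eg_i(\mathcal{C})| > 10$, and finally
\[
x_r = (n - 1) - \tfrac{3}{4} |L| + \tfrac{1}{2} n.
\tag{4.40}
\]
The integrals over $\beta_i$ are well defined only if $x_i > 0$ $\forall i$. To check that this is true, we observe that
\[
(n_i - 1) - \tfrac{3}{4} |il_i(\mathcal{C})| + \tfrac{1}{2} n_i = \tfrac{1}{4} (3 |eg_i(\mathcal{C})| - 10),
\tag{4.41}
\]
where we applied the relation $|il_i(\mathcal{C})| = 2 n_i + 2 - |eg_i(\mathcal{C})|$. Hence, for $i \neq r$ and $4 \le |eg_i(\mathcal{C})| \le 10$ we have
\[
x_i = \tfrac{1}{4} (3 |eg_i(\mathcal{C})| - 10) - \tfrac{1}{2} (|eg_i(\mathcal{C})| - 3) = \tfrac{1}{4} (|eg_i(\mathcal{C})| - 4)
\tag{4.42}
\]
and when $i \neq r$ and $|eg_i(\mathcal{C})| > 10$ we have
\[
x_i \ge \tfrac{1}{4} (3 |eg_i(\mathcal{C})| - 10) - \tfrac{1}{2} (|eg_i(\mathcal{C})| - 1) = \tfrac{1}{4} (|eg_i(\mathcal{C})| - 8) \ge 1
\tag{4.43}
\]
by construction. Finally for $g_r$ we have
\[
x_r = \tfrac{1}{2} (3p - 5).
\tag{4.44}
\]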
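The arithmetic of (4.38)-(4.44) can be reproduced with exact fractions. This is a minimal sketch: the function names are hypothetical, `n_i` and `eg` stand for $n_i$ and $|eg_i(\mathcal{C})|$, and `refined` is the number of refined half-lines entering with power $-1/2$.

```python
from fractions import Fraction as Fr

def x_generic(n_i, eg, refined):
    """(n_i - 1) - 3/4 |il_i| + n_i/2 - refined/2, with |il_i| = 2 n_i + 2 - eg."""
    il = 2 * n_i + 2 - eg
    return (n_i - 1) - Fr(3, 4) * il + Fr(n_i, 2) - Fr(refined, 2)

def x_lower_bound(n_i, eg):
    """Right-hand sides of (4.38) and (4.39) for a subgraph g_i with i != r."""
    if eg == 2:
        return x_generic(n_i, eg, 0)        # no sector factor, y_i = 0
    if eg <= 10:
        return x_generic(n_i, eg, eg - 3)   # case (4.38)
    return x_generic(n_i, eg, eg - 1)       # case (4.39)

def x_root(n, p):
    """x_r of (4.40), using |L| = 2 (n + 1 - p)."""
    loops = 2 * (n + 1 - p)
    return (n - 1) - Fr(3, 4) * loops + Fr(n, 2)
```

The dependence on $n_i$ cancels in every case, so only the number of external lines matters: the values are $(|eg_i| - 4)/4$, $(|eg_i| - 8)/4$ and $(3p - 5)/2$, and $-1$ for a two point subgraph.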
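If $|eg_i(\mathcal{C})| = 4$ ($i \neq r$), $x_i = 0$ and the graph is logarithmic in the temperature:
\[
\int_{\Lambda_T}^{1} d\beta_i\; \beta_i^{-1} = -\log \Lambda_T = -\log\big(\sqrt{2}\,\pi T\big).
\tag{4.45}
\]
Finally if $|eg_i(\mathcal{C})| = 2$ ($i \neq r$), then
\[
x_i = \tfrac{1}{4} (3 |eg_i(\mathcal{C})| - 10) = -1
\tag{4.46}
\]
and the integral over $\beta_i$ is linearly divergent with the temperature $T$:
\[
\int_{\Lambda_T}^{1} d\beta_i\; \beta_i^{-1-1} = (\Lambda_T^{-1} - 1) = \frac{1}{\sqrt{2}\,\pi T} - 1.
\tag{4.47}
\]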
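The three regimes of the $\beta_i$ integral (convergent for $x_i > 0$, logarithmic for $x_i = 0$, linear for $x_i = -1$) follow from one closed form. The sketch below compares it with a direct quadrature; the function names are hypothetical and `lam_T` stands for $\Lambda_T$.

```python
import math

def beta_integral(x, lam_T):
    """Closed form of the integral of beta^(x - 1) over [lam_T, 1]."""
    if x == 0:
        return -math.log(lam_T)
    return (1.0 - lam_T ** x) / x

def beta_integral_numeric(x, lam_T, n=20000):
    """Midpoint rule in t = log(beta), where the integrand becomes exp(x t)."""
    a = math.log(lam_T)
    h = -a / n
    return h * sum(math.exp(x * (a + (k + 0.5) * h)) for k in range(n))
```

For $x > 0$ the closed form is bounded by $1/x$ uniformly in $\Lambda_T$, which is why a lower bound on $x_i$ is needed when the integrals are performed.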
Hence we have recovered the well known fact that the only divergent subgraphs are the four point and two point subgraphs [FT1-2–FMRT1]. We remark that, if $2p = 4$, the global graph is not logarithmically divergent. This happens because our bound over sector sums is not fine enough. Anyway, as we do not perform four point renormalization, this is not a problem. In this paper we restrict ourselves to convergent attributions, for which $x_i$ is always positive. However it is important (in order to bound later the sum over labelings) to check that we have a lower bound on $x_i$ which is proportional to the number of external tree lines of $g_i$:
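Lemma 7. For any subgraph $g_i$ ($i \neq r$) we have
\[
x_i \ge \frac{|et_i|}{72} > 0.
\tag{4.48}
\]
Proof. We distinguish several cases:
• if $|eg_i| \le 10$,
\[
\tfrac{1}{4} (|eg_i(\mathcal{C})| - 4) \ge \tfrac{1}{2} > 0
\tag{4.49}
\]
as for convergent attributions $|eg_i(\mathcal{C})| \ge 6$ (we cannot have $|eg_i(\mathcal{C})| = 5$ by parity). Now, if $|et_i| \ge 5$ we have
\[
\tfrac{1}{4} (|eg_i(\mathcal{C})| - 4) \ge \tfrac{1}{4} (|et_i| - 4) \ge \tfrac{1}{5 \cdot 4} |et_i|.
\tag{4.50}
\]
If $|et_i| \le 4$ we can write
\[
\tfrac{1}{4} (|eg_i(\mathcal{C})| - 4) \ge \tfrac{1}{2} \ge \tfrac{1}{8} |et_i|.
\tag{4.51}
\]
• if $|eg_i| > 10$ we have
\[
\tfrac{1}{4} (|eg_i(\mathcal{C})| - 8) \ge 1 > 0.
\tag{4.52}
\]
Repeating the same arguments as before for the cases $|et_i| \ge 9$ and $|et_i| < 9$ we obtain
\[
\tfrac{1}{4} (|eg_i(\mathcal{C})| - 8) \ge \tfrac{1}{4 \cdot 9} |et_i|.
\tag{4.53}
\]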
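This completes the proof of the lemma.

Now we can perform the integrals on the $\beta_i$, to obtain
\[
|\Gamma^{\Lambda_u}_{2p>4,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})}\; \frac{\Lambda_T^{(5-3p)/2}}{3p - 5} \sum_{n=1}^{\infty} \frac{c^n}{n!} K^n \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{\Omega} \sum_{E} \sum_{\mathcal{C}_c} \prod_{i \neq r} \frac{1}{|et_i|},
\tag{4.54}
\]
where the factor $\prod_i \big[ 1 - \Lambda_T^{x_i} \big]$ coming from the integrals over the variables $\beta_i$ has been bounded by one. For the particular case of four point and two point vertex functions we have
\[
|\Gamma^{\Lambda_u}_{4,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_4})}\; \Lambda_T^{-1/2} \sum_{n=1}^{\infty} \frac{c^n}{n!} K^n \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{\Omega} \sum_{E} \sum_{\mathcal{C}_c} \prod_{i \neq r} \frac{1}{|et_i|},
\tag{4.55}
\]
\[
|\Gamma^{\Lambda_u}_{2,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, x_{e_2})} \sum_{n=1}^{\infty} \frac{c^n}{n!} K^n \sum_{CTS} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \sum_{\Omega} \sum_{E} \sum_{\mathcal{C}_c} \prod_{i \neq r} \frac{1}{|et_i|}.
\tag{4.56}
\]

Fig. 4. Definition of $\ell_0^x$, $\ell_1^x$, $\ell_2^x$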
The sum $\sum_{\mathcal{C}_c}$ is over a set whose cardinal is bounded by $K^n$, so we can bound it with the supremum over the set. The sum over $\Omega$ runs over a set of at most $2^{n-1}$ elements. The sum over $E$ to attribute the $2p$ external lines to particular vertices runs over a set of at most $n^{2p}$ elements (this is an overestimate!). Hence
\[
\sum_{\mathcal{C}_c} \sum_{\Omega} \sum_{E} |F(\mathcal{C}_c, \Omega, E)| \le (p!)^2\, K^n \sup_{\mathcal{C}_c, \Omega, E} |F(\mathcal{C}_c, \Omega, E)|,
\]
where we applied the bound
\[
n^{2p} \le (2p)!\, e^n \le K^p\, (p!)^2\, e^n \qquad \forall n \ge 0.
\]
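Both inequalities in the last bound are elementary: the first keeps a single term of the Taylor series of $e^n$, and the second bounds the central binomial coefficient. The sketch below checks them for small values; the choice $K = 4$ in the second function is an assumption made for the illustration, the text only asserts that some constant $K$ exists.

```python
import math

def taylor_bound_holds(n, p):
    """n^(2p) <= (2p)! e^n, since e^n >= n^(2p) / (2p)! for n >= 0."""
    return n ** (2 * p) <= math.factorial(2 * p) * math.exp(n)

def central_binomial_bound_holds(p, K=4):
    """(2p)! <= K^p (p!)^2, i.e. binom(2p, p) <= K^p (K = 4 works)."""
    return math.factorial(2 * p) <= K ** p * math.factorial(p) ** 2
```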
We still have to perform the sum over the CTS and $\mathcal{L}$. For each cross $x$ of the CTS different from the root, there is one line $\ell_0^x$ going down (towards the root), and two lines $\ell_1^x$ and $\ell_2^x$ going up (see Fig. 4).

Lemma 8. For any cross $x$ different from the root:
\[
N_{\ell_0^x}(\mathcal{T}, \mathcal{L}) = N_{\ell_1^x}(\mathcal{T}, \mathcal{L}) + N_{\ell_2^x}(\mathcal{T}, \mathcal{L}) - 2.
\tag{4.57}
\]
Proof. The clusters $\mathcal{T}_{\ell_1^x}(\mathcal{L})$ and $\mathcal{T}_{\ell_2^x}(\mathcal{L})$ are joined by a single line in the tree $\mathcal{T}$, which is the label of the cross $x$. This line is counted once as an external line of $\mathcal{T}_{\ell_1^x}(\mathcal{L})$ and once as an external line of $\mathcal{T}_{\ell_2^x}(\mathcal{L})$, and is no longer an external line of $\mathcal{T}_{\ell_0^x}(\mathcal{L})$. This proves the lemma.

The following lemma is an improved version of [CR, Lemmas B4 and B5 (see also Lemma III.6)], adapted to this formalism of relative rather than total orderings.

Lemma 9. Let CTS be a fixed Clustering Tree Structure of order $n$. We have
\[
\frac{1}{n!} \sum_{\mathcal{T}} \sum_{\mathcal{L}} \prod_{\ell} \frac{1}{N_\ell(\mathcal{T}, \mathcal{L})} \le 4^n.
\tag{4.58}
\]
Proof. We decompose the sum over $\mathcal{T}$ and $\mathcal{L}$ into subsums. We call $\mathcal{L}_o$ the map which associates the dots of CTS to the vertices of $\mathcal{T}$ and $\mathcal{L}_x$ the map which associates the crosses of CTS to the lines of $\mathcal{T}$. By the previous lemma, once $\mathcal{L}_o$ and the collection $V = \{N_v\}$ of coordination numbers for each vertex $v$ of $\mathcal{T}$ are given, the numbers $N_\ell(\mathcal{T}, \mathcal{L})$ are all fixed, hence they do not depend on the particular contractions $W$ and on $\mathcal{L}_x$. This suggests splitting the sum over $\mathcal{T}$ and $\mathcal{L}$ as a sum over $W$ and $\mathcal{L}_x$ followed by a sum over $V$ and $\mathcal{L}_o$:
\[
\frac{1}{n!} \sum_{\mathcal{T}} \sum_{\mathcal{L}} \prod_{\ell} \frac{1}{N_\ell(\mathcal{T}, \mathcal{L})} = \frac{1}{n!} \sum_{\mathcal{L}_o} \sum_{V} \prod_{\ell} \frac{1}{N_\ell(V, \mathcal{L}_o)} \sum_{\mathcal{L}_x} \sum_{W} 1.
\tag{4.59}
\]
But the number of labelings $\mathcal{L}_x$ and contractions $W$ compatible with given $V$ and $\mathcal{L}_o$ is precisely $\prod_{\ell} N_\ell(V, \mathcal{L}_o)$. Indeed starting from the $n$ dots in CTS with their $N_v$ hooked fields, and going down towards the root, we can inductively build the contractions corresponding to each cross of CTS (this builds at the same time $W$ and $\mathcal{L}_x$). To count the possible contractions for a cross $x$, we have to choose one external field in $\mathcal{T}_{\ell_1^x}$ and one in $\mathcal{T}_{\ell_2^x}$, hence the number of choices is exactly $N_{\ell_1^x}(V, \mathcal{L}_o)\, N_{\ell_2^x}(V, \mathcal{L}_o)$, where $\ell_1^x$ and $\ell_2^x$ were introduced in the previous lemma. Multiplying over all crosses, we get:
\[
\frac{1}{n!} \sum_{\mathcal{T}} \sum_{\mathcal{L}} \prod_{\ell} \frac{1}{N_\ell(V, \mathcal{L}_o)} = \frac{1}{n!} \sum_{\mathcal{L}_o} \sum_{V} 1 = \sum_{V} 1 \le 4^n.
\tag{4.60}
\]
Indeed $n!$ is exactly the number of labelings $\mathcal{L}_o$ of the dots of CTS, and for each vertex $v$, $N_v$ is an integer between 1 and 4; hence the sum over $V$ is bounded by $4^n$ (this is an upper bound since we do not take into account the constraint $\sum_v N_v = 2n - 2$).

Applying the lemma above, we bound
\[
\frac{1}{n!} \sum_{u-\mathcal{T}} \sum_{\mathcal{L}} \prod_{i \neq r} \frac{1}{|et_i|} \le 4^n.
\tag{4.61}
\]
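Hence the vertex function is bounded by
\[
|\Gamma^{\Lambda_u}_{2p>4,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_{2p}})}\; \frac{\Lambda_T^{(5-3p)/2}}{3p - 5}\, K_1^p\, (p!)^2 \sum_{n=1}^{\infty} c^n K_2^n,
\tag{4.62}
\]
\[
|\Gamma^{\Lambda_u}_{4,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, \dots, x_{e_4})}\; \Lambda_T^{-1/2} \sum_{n=1}^{\infty} c^n K_2^n,
\tag{4.63}
\]
\[
|\Gamma^{\Lambda_u}_{2,c}| \le K_0\, e^{-a(1-\varepsilon)\, \Lambda_T^{1/s}\, d_T^{1/s}(x_{e_1}, x_{e_2})} \sum_{n=1}^{\infty} c^n K_2^n
\tag{4.64}
\]
for some constant $K_2$. This is convergent for $c < 1/K_2$ and completes the proof of Theorems 1 and 2. (We remark that we did not try to optimize the dependence of this bound on $2p$, the number of external points.)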
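Appendix A

Spatial decay. We prove that the $T = 0$ propagator $C_0$ decays as
\[
|C_0^{\Lambda_q}(x, 0, \theta_{h_q,1})| \le K\, \frac{1}{\Lambda_q^3}\, \Lambda_q\, \Lambda_q^{5/2}\; e^{-a \big[ |x_0 \Lambda_q|^{1/s} + |x_r \Lambda_q|^{1/s} + |x_t \Lambda_q^{1/2}|^{1/s} \big]}.
\tag{A.1}
\]
Lemma 10. Let $f \in C^\infty(\mathbb{R}^d)$ be such that its Fourier transform $\hat{f}$ has compact support of volume $V_f$ and satisfies
\[
\|\hat{f}^{(n_1, \dots, n_d)}\|_\infty := \Big\| \frac{\partial^{n_1}}{\partial p_1^{n_1}} \cdots \frac{\partial^{n_d}}{\partial p_d^{n_d}} \hat{f} \Big\|_\infty \le A_0 \prod_{i=1}^{d} (\alpha_i C)^{n_i}\, (n_i!)^s,
\tag{A.2}
\]
where $A_0$, $C$, $\alpha_1, \dots, \alpha_d$ are some constants and $s \ge 1$ is some constant. Then for some constants $K$, $\mu$ and $a$, one has
\[
|f(x)| \le K\, A_0\, V_f\; e^{-a \sum_{i=1}^{d} \big( |x_i| / \alpha_i \big)^{1/s}} \qquad \forall x \in \mathbb{R}^d.
\tag{A.3}
\]
Proof. By Stirling's formula the first equation can be written
\[
\|\hat{f}^{(n_1, \dots, n_d)}\|_\infty \le A_0\, K \prod_{i=1}^{d} \Big( \frac{\alpha_i}{\mu} \Big)^{n_i} \Big( \frac{n_i}{e} \Big)^{n_i s},
\tag{A.4}
\]
where $K$ and $\mu$ are some constants (possibly depending on $d$). Hence, for any $x$ we have:
\[
|f(x)| = \Big| \frac{1}{(i x_1)^{n_1} \cdots (i x_d)^{n_d}} \int e^{-ipx}\, \hat{f}^{(n_1, \dots, n_d)}(p) \Big|
\le V_f\, \frac{\|\hat{f}^{(n_1, \dots, n_d)}\|_\infty}{|x_1|^{n_1} \cdots |x_d|^{n_d}}
\le V_f\, K\, A_0 \prod_{i=1}^{d} \Big( \frac{\alpha_i}{\mu |x_i|} \Big)^{n_i} \Big( \frac{n_i}{e} \Big)^{n_i s}.
\tag{A.5}
\]
Optimizing to $n_i = \big( \mu |x_i| / \alpha_i \big)^{1/s}$, we obtain
\[
|f(x)| \le V_f\, A_0\, K \prod_{i=1}^{d} e^{-s \big( \mu |x_i| / \alpha_i \big)^{1/s}},
\tag{A.6}
\]
which ends the proof of the lemma, with $a = s \mu^{1/s}$.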
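The optimization step between (A.5) and (A.6) can be checked in one variable. The sketch below treats $n$ as a real number (in the proof it is an integer, which only changes constants) and works with the logarithm of $(\alpha/(\mu x))^{n} (n/e)^{ns}$; the function names are hypothetical.

```python
import math

def log_g(n, x, alpha, mu, s):
    """Logarithm of (alpha / (mu x))^n (n / e)^(n s) for real n > 0."""
    return n * math.log(alpha / (mu * x)) + n * s * (math.log(n) - 1.0)

def optimum(x, alpha, mu, s):
    """Minimizer n* = (mu x / alpha)^(1/s) and the minimum value -s n*."""
    n_star = (mu * x / alpha) ** (1.0 / s)
    return n_star, -s * n_star
```

At the minimizer the two $n \log$ terms cancel and only $-s\,n^*$ survives, which is the exponent $-s(\mu |x_i| / \alpha_i)^{1/s}$ of (A.6).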
Lemma 11. $C_0^{\Lambda_q}$ satisfies (A.1).

Proof. To prove (A.1) we write in momentum space:
\[
|C_0^{\Lambda_q}(k_r, k_t, \theta_{h_q,1})| = \Big| C_0^{\Lambda_q}(k)\, \chi^{\theta_{h_q,1}}_{\alpha_{j_{h_q,1}}}[\theta(k_r, k_t)] \Big| = \frac{1}{\Lambda_q^3}\, \Big| C_0^{\Lambda_q, \theta_{h_q,1}}(k_0, k_r, k_t) \Big|,
\tag{A.7}
\]
where the radial and tangential variables $k_r$ and $k_t$ are defined by:
\[
k_r = |k| \cos(\theta - \theta_{h_q,1}) - 1; \qquad k_t = |k| \sin(\theta - \theta_{h_q,1}).
\tag{A.8}
\]
The function to study is (since $j_{h_q,1} = q$):
\[
C_0^{\Lambda_q, \theta_{h_q,1}}(k_0, k_r, k_t) = u^p\big[\alpha_q^{1/4} (\theta - \theta_{h_q,1})\big]\; [i k_0 + e(|k|)]\; u'\big[\alpha_q (k_0^2 + e^2(|k|))\big],
\tag{A.9}
\]
and
\[
\theta - \theta_{h_q,1} = f_1(k_r, k_t) = \arctan \frac{k_t}{1 + k_r}, \qquad
e(|k|) = |k|^2 - 1 = f_2(k_r, k_t) = k_r^2 + k_t^2 + 2 k_r.
\tag{A.10}
\]
The propagator $C_0^{\Lambda_q, \theta_{h_q,1}}$ can be written as the product of three functions
\[
C_0^{\Lambda_q, \theta_{h_q,1}}(k_0, k_r, k_t) = F_1(k_r, k_t)\; F_2(k_0, k_r, k_t)\; F_3(k_0, k_r, k_t),
\tag{A.11}
\]
where
\[
F_1(k_r, k_t) := u^p\big[\alpha_q^{1/4} f_1(k_r, k_t)\big], \qquad
F_2(k_0, k_r, k_t) := [i k_0 + f_2(k_r, k_t)], \qquad
F_3(k_0, k_r, k_t) := u'\big[\alpha_q (k_0^2 + f_2^2(k_r, k_t))\big],
\tag{A.12}
\]
$f_1$, $f_2$ being defined in (A.10). Now we know that $u(x)$ is a Gevrey function of class $s$, with compact support on $[-\frac{1}{2}, \frac{1}{2}]$. The function $f_1$ takes values in the interval $[-\frac{1}{2}\alpha_q^{-1/4}, \frac{1}{2}\alpha_q^{-1/4}]$, hence $k_r \in [-\frac{1}{2}\alpha_q^{-1/2}, \frac{1}{2}\alpha_q^{-1/2}]$, $k_t \in [-\frac{1}{2}\alpha_q^{-1/4}, \frac{1}{2}\alpha_q^{-1/4}]$. By hand or using the standard rules for derivation, product and composition of Gevrey functions (see [G]) it is then easy to check that $C_0^{\Lambda_q, \theta_{h_q,1}}(k_0, k_r, k_t)$ is a Gevrey function with compact support of class $s$ and satisfies the bound:
\[
\Big\| \frac{\partial^{n_0}}{\partial k_0^{n_0}} \frac{\partial^{n_r}}{\partial k_r^{n_r}} \frac{\partial^{n_t}}{\partial k_t^{n_t}} C_0^{\Lambda_q, \theta_s} \Big\|_\infty \le \frac{1}{\sqrt{\alpha_q}}\; C^{n_0 + n_r + n_t}\, \big(\alpha_q^{1/2}\big)^{n_r + n_0}\, \big(\alpha_q^{1/4}\big)^{n_t}\, (n_0!\, n_r!\, n_t!)^s.
\tag{A.13}
\]
Hence, applying Lemma 10 with $A_0 = 1/\alpha_q^{1/2}$ and $V_f = \Lambda_q^{1/2} \Lambda_q^{2}$ proves (A.1).

Appendix B

Proof of the Sector Counting Lemma 5. We define $\bar{k}^i$ as the projection of $k^i$ on the Fermi surface, $\bar{k}^i = k^i / |k^i|$, and $r^i$ as the center of the sector $I_i$, with components $(1, \theta_i^s)$ in radial coordinates. Then, as in [FMRT1], we renumber $k^2, \dots, k^l$ so that $|r^l \cdot r^{l-1}|$ is the minimum of the set $\{ |r^i \cdot r^j| \;|\; i, j > 1 \}$. This means that the angle between $\bar{k}^l$ and $\bar{k}^{l-1}$, $\varphi := \angle(\bar{k}^{l-1}, \bar{k}^l)$, is as close as possible to $\pi/2$. All other angles $\angle(\bar{k}^i, \bar{k}^j)$ with $i, j \ge 2$ must be within $\varphi + O(\alpha^{-1/4})$ of either 0 or $\pi$. The proof is performed in two steps.
M. Disertori, V. Rivasseau
1. When 2−i ≤ |φ| ≤ 2−i+1 or 2−i ≤ |π − φ| ≤ 2−i+1 , for any i fixed, we have 1 2 1 2 1 2 s dθls dθl−1 ϒ({θis }i=1,...,l ) ≤ K0l 43 α 4 α− 4 ≤ K l , Nl := 43 α 4
(B.1) where K0 and K are some constants and the sector centers θ2 , . . . , θl−2 , are not integrated yet. The proof is shown below. 2. We now have to perform the remaining integrals, then sum over all possible values of i. Assuming (B.1) true, the sum over all sectors is bounded by l j =2
1 4 4 3α
<j
dθjs ϒ({θjs }j =1,...,l ) ≤ K l = K
1 4 4 3α
2−i
j ∈J (i)
2−i
l
α− 4
dθjs
α− 4
1
j ∈J (i)
1 4 4 3α
<j
j ∈J (i)
|<j |
1
j ∈J (i)
,
dθjs 1 (B.2)
where J (i) := {j |2−i ≤ |<j |, 1 < j < l − 1}. To perform the sum over all possible i we distinguish two situations, defining i0 such that 2−i0 ≤ |
α− 4
|<j | α− 4
1
1
i=i0 j ∈J (i)
j ∈J (i)
≤
∞ l−2 2−i |<j |
α− 4
α− 4
1
1
i=i0
j =3
=
l−2 2−i0 +1 |<j |
α
− 41
j =3
α
− 41
≤2
|
1
i∈I
.
(B.3)
• if 2−i > 2−i0 , then, once fixed the sectors of all the k i except the l th there can be at most one i consistent with k l falling in
l−2
(r j − k j ),
j =1
$$a + \varepsilon = k'_l + k'_{l-1} = -k_1 - \cdots - k_{l-2} + 2\,O(\alpha^{-1/2}), \tag{B.5}$$
hence
$$\varepsilon = \sum_{j=1}^{l-2}(r_j - k_j) + 2\,O(\alpha^{-1/2}). \tag{B.6}$$
We remark that $a$ is fixed once $I_1,\dots,I_{l-2}$ are fixed. We choose a coordinate system in which $r_2 = (1,0)$. Then, since $\theta_j = \angle(r_2,r_j)$ satisfies $|\theta_j| = O(2^{-i})$ or $|\pi-\theta_j| = O(2^{-i})$ $\forall j\ge 2$, the $x$ and $y$ coordinates of every $k_j$, $2\le j\le l$, obey
$$k_j = \Big(\pm[1+O(\alpha^{-1/2})]\cos O(2^{-i}),\ [1+O(\alpha^{-1/2})]\sin O(2^{-i})\Big) = \Big([\pm 1+O(2^{-2i})],\ O(2^{-i})\Big), \tag{B.7}$$
where we assumed $2^{-i}\ge\alpha^{-1/4}$ (otherwise all sectors are automatically fixed). On the other hand, the differences $k_j - r_j$ can be written
$$\begin{aligned} k_j - r_j &= k'_j - r_j + O(\alpha^{-1/2}) \\ &= \Big(\cos[\theta_j+O(\alpha^{-1/4})],\ \sin[\theta_j+O(\alpha^{-1/4})]\Big) - (\cos\theta_j,\sin\theta_j) + O(\alpha^{-1/2}) \\ &= \Big(|\sin\theta_j|\,O(\alpha^{-1/4}),\ |\cos\theta_j|\,O(\alpha^{-1/4})\Big) + O(\alpha^{-1/2}). \end{aligned} \tag{B.8}$$
For any $j\ge 2$ we know that $|\sin\theta_j| = O(2^{-i})$ and $|\cos\theta_j| = O(1)$. For $j=1$, since $k_1 = -\sum_{j=2}^{l} k_j$ we can check that $\max_{j\ge 2}\angle(k_1,k_j)\le l\,O(2^{-i})$, hence $|\sin\theta_1| = l\,O(2^{-i})$. Therefore we have
$$\begin{aligned} \forall j>1,\quad k_j - r_j &= \Big(O(2^{-i}\alpha^{-1/4}),\ O(\alpha^{-1/4})\Big), \\ k_1 - r_1 &= l\,\Big(O(2^{-i}\alpha^{-1/4}),\ O(\alpha^{-1/4})\Big). \end{aligned} \tag{B.9}$$
Inserting these results in the expressions for $a$ and $\varepsilon$ we have
$$\begin{aligned} \varepsilon &= l\,O\big(2^{-i}\alpha^{-1/4},\ \alpha^{-1/4}\big), \\ a &= N(2,0) + O\big(2^{-2i},2^{-i}\big) + l\,O\big(2^{-i}\alpha^{-1/4},\ \alpha^{-1/4}\big) = N(2,0) + l\,O\big(2^{-2i},2^{-i}\big), \end{aligned} \tag{B.10}$$
where $N\in\{1,0,-1\}$. Now we can bound $N_l$ in (B.1). We consider two cases. First, let $|N| = 1$. We rotate the coordinate system by $\pi\delta_{N,-1} + O(2^{-i})$ so as to make $a$ run along the positive $x$ axis (see Fig. 5). In the new coordinate system the coordinates of $\varepsilon$ obey, as before,
$$\varepsilon = l\,O\big(2^{-i}\alpha^{-1/4},\ \alpha^{-1/4}\big). \tag{B.11}$$
We remark that, calling $\psi$ the angle $\angle(k'_{l-1},a)$, we must have $\angle(k'_l,a) = \phi-\psi$, the two vectors lying on opposite sides of $a$. Then the two components of the equation
$$k'_{l-1} + k'_l = (\cos\psi,\sin\psi) + (\cos(\phi-\psi),\,-\sin(\phi-\psi)) = a + \varepsilon \tag{B.12}$$
are
$$\cos\psi + \cos(\phi-\psi) = |a| + l\,O(2^{-i}\alpha^{-1/4}), \qquad \sin\psi - \sin(\phi-\psi) = l\,O(\alpha^{-1/4}). \tag{B.13}$$
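[Fig. 5. Case 1, $|N| = 1$: the vectors $k'_l$, $k'_{l-1}$, $a$ and $\varepsilon$, with the angles $\phi$ and $\psi$.]
[Fig. 6. Case 2, $|N| = 0$: the vectors $k'_l$, $k'_{l-1}$, $a$ and $\varepsilon$, with the angles $\pi-\phi$ and $\psi$.]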
The $y$ component implies that
$$|2\psi-\phi| = l\,O(\alpha^{-1/4}), \qquad \psi = \frac{1}{2}\phi + l\,O(\alpha^{-1/4}), \tag{B.14}$$
then $\phi$ is determined with precision $O(\alpha^{-1/4})$ once $\psi$ has been fixed. We remark that this was not obvious since the maximal variation for $\phi$, without additional constraints, is $2^{-i}$ (remember $2^{-i}\le\phi\le 2^{-i+1}$). Therefore, for $r_{l-1}$ fixed, $\theta^s_l$ is restricted to an interval of width $l\,O(\alpha^{-1/4})$. Finally, we consider the $x$ component:
$$\begin{aligned} \cos\psi + \cos(\phi-\psi) &= \cos\psi + \cos\psi\cos(\phi-2\psi) - \sin\psi\sin(\phi-2\psi) \\ &= \big[2 + l^2 O(\alpha^{-1/2})\big]\cos\psi + l\,O(2^{-i}\alpha^{-1/4}). \end{aligned} \tag{B.15}$$
Then the angle $\psi$ is
$$\psi = \cos^{-1}\Big(\frac{|a|}{2}\Big) + l\,O\Big(\frac{2^{-i}\alpha^{-1/4}}{2^{-i}}\Big), \tag{B.16}$$
s must be integrated on an interval of width lO(α −1/4 ), instead of
$$\sin\frac{\pi-\phi}{2} = \frac{|a+\varepsilon|}{2} = \frac{|a|}{2} + l\,O(\alpha^{-1/4}). \tag{B.17}$$
Thus
$$\phi = \pi - 2\sin^{-1}\Big(\frac{|a|}{2}\Big) + l\,O(\alpha^{-1/4}) \tag{B.18}$$
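and $\theta^s_l$ is restricted to an interval of width $l\,O(\alpha^{-1/4})$, when $r_{l-1}$ is held fixed. To evaluate $\psi$ we apply the relation
$$\sin\Big(\psi-\frac{\phi}{2}\Big) = \frac{\varepsilon\cdot\big(\cos\frac{\pi-\phi}{2},\ \sin\frac{\pi-\phi}{2}\big)}{|a|} \le \frac{l\,O(2^{-i}\alpha^{-1/4})}{O(2^{-i}) - l\,O(\alpha^{-1/4})} \le l\,O(\alpha^{-1/4}), \tag{B.19}$$
where we applied the relation
$$|a+\varepsilon| = 2\sin\frac{\pi-\phi}{2} \ge O(2^{-i}) \tag{B.20}$$
that is proved with the hypothesis $2^{-i}\le\phi\le 2^{-i+1}$. Then
$$\psi = \frac{1}{2}\phi + l\,O(\alpha^{-1/4}), \tag{B.21}$$
hence $\theta^s_l$ is restricted to an interval of width $l\,O(\alpha^{-1/4})$. This ends the proof.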
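The elementary geometry behind (B.13), (B.14) and (B.20) can be checked numerically. The sketch below is only an illustration under simplifying assumptions: $\varepsilon$ is set to zero, the sum $k'_{l-1}+k'_l$ is placed along the $x$ axis, and the function name and sample angles are hypothetical.

```python
import math

def sum_of_last_two(phi, psi):
    """Components of k'_{l-1} + k'_l for two unit vectors enclosing the angle phi.

    k'_{l-1} makes the angle psi with the x axis and k'_l the angle phi - psi
    on the other side of it.
    """
    x = math.cos(psi) + math.cos(phi - psi)
    y = math.sin(psi) - math.sin(phi - psi)
    return x, y

phi = 0.3                                   # illustrative angle between the two vectors
x, y = sum_of_last_two(phi, phi / 2)        # symmetric configuration psi = phi / 2
norm = math.hypot(x, y)

delta = 1e-3                                # small deviation of psi from phi / 2
_, y_shifted = sum_of_last_two(phi, phi / 2 + delta)
```

In the symmetric configuration the $y$ component vanishes and the norm equals $2\sin\frac{\pi-\phi}{2}$, while a deviation $\delta$ of $\psi$ from $\phi/2$ produces a $y$ component $2\cos(\phi/2)\sin\delta$. This is why a $y$ component of size $l\,O(\alpha^{-1/4})$ pins $\psi$ to $\phi/2$ with the same precision.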
Acknowledgements. We thank C. Kopper and J. Magnen for many interesting discussions; in particular the use of partial orderings (rather than total orderings as in [DR1]) came from common work with C. Kopper. We are especially grateful to M. Salmhofer: not only did his paper [S1] inspire this work, but he also explained to us the meaning and physical importance of the uniform bounds on the derivatives of the self energy, which we had not included in a first version of this work.
References
[AR1] Abdesselam, A. and Rivasseau, V.: Trees, forests and jungles: A botanical garden for cluster expansions. In: Constructive Physics, ed. by V. Rivasseau, Lecture Notes in Physics 446, Berlin–Heidelberg–New York: Springer-Verlag, 1995
[AR2] Abdesselam, A. and Rivasseau, V.: Explicit Fermionic Cluster Expansion. Lett. Math. Phys. 44, 77–88 (1998)
[BG] Benfatto, G. and Gallavotti, G.: Perturbation theory of the Fermi surface in a quantum liquid. A general quasi particle formalism and one dimensional systems. J. Stat. Phys. 59, 541 (1990)
[BGPS] Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Commun. Math. Phys. 160, 93 (1994)
[BM] Bonetto, F., Mastropietro, V.: Commun. Math. Phys. 172, 57 (1995)
[CR] de Calan, C. and Rivasseau, V.: Local existence of the Borel transform in Euclidean $\phi^4_4$. Commun. Math. Phys. 82, 69 (1981)
[DR1] Disertori, M. and Rivasseau, V.: Continuous Constructive Fermionic Renormalization. Annales Henri Poincaré 1, 1 (2000)
[DR2] Disertori, M. and Rivasseau, V.: Interacting Fermi liquid in two dimensions at finite temperature, Part II: Renormalization. To appear
[FKLT] Feldman, J., Knörrer, H., Lehmann, D. and Trubowitz, E.: Fermi Liquids in Two Space Time Dimensions. In: Constructive Physics, ed. by V. Rivasseau, Lecture Notes in Physics 446, Berlin–Heidelberg–New York: Springer-Verlag, 1995
[FST] Feldman, J., Salmhofer, M. and Trubowitz, E.: Perturbation Theory around Non-nested Fermi Surfaces II. Regularity of the Moving Fermi Surface, RPA Contributions. Comm. Pure Appl. Math. 51, 1133 (1998); Regularity of the Moving Fermi Surface, The Full Selfenergy: To appear in Comm. Pure Appl. Math.
[FT1] Feldman, J. and Trubowitz, E.: Perturbation theory for Many Fermion Systems. Helv. Phys. Acta 63, 156 (1991)
[FT2] Feldman, J. and Trubowitz, E.: The flow of an Electron-Phonon System to the Superconducting State. Helv. Phys. Acta 64, 213 (1991)
[FMRT1] Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: An infinite Volume Expansion for Many Fermion Green’s Functions. Helv. Phys. Acta 65, 679 (1992)
[FMRT2] Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: An Intrinsic 1/N Expansion for Many Fermion System. Europhys. Lett. 24, 437 (1993)
[FMRT3] Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: Ward Identities and a Perturbative Analysis of a U(1) Goldstone Boson in a Many Fermion System. Helv. Phys. Acta 66, 498 (1993)
[G] Gevrey, M.: Sur la nature analytique des solutions des équations aux dérivées partielles. (Ann. Scient. Ec. Norm. Sup., 3e série, t. 35, pp. 129–190). In: Oeuvres de Maurice Gevrey, pp. 243, ed. CNRS, 1970
[L] Lesniewski, A.: Effective Action for the Yukawa$_2$ Quantum Field Theory. Commun. Math. Phys. 108, 437 (1987)
[MR] Magnen, J. and Rivasseau, V.: A Single Scale Infinite Volume Expansion for Three Dimensional Many Fermion Green’s Functions. Math. Phys. Electronic Journal, Volume 1, 1995
[R] Rivasseau, V.: From perturbative to constructive renormalization. Princeton, NJ: Princeton University Press, 1991
[S1] Salmhofer, M.: Continuous renormalization for Fermions and Fermi liquid theory. Commun. Math. Phys. 194, 249 (1998)
[S2] Salmhofer, M.: Improved Power Counting and Fermi Surface Renormalization. Rev. Math. Phys. 10, 553 (1998)
Communicated by D. Brydges
Commun. Math. Phys. 215, 291 – 341 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Interacting Fermi Liquid in Two Dimensions at Finite Temperature. Part II: Renormalization
M. Disertori, V. Rivasseau
Centre de Physique Théorique, Ecole Polytechnique, 91128 Palaiseau Cedex, France
Received: 27 July 1999 / Accepted: 31 May 2000
Abstract: This is a companion paper to [DR1]. Using the method of continuous renormalization group around the Fermi surface and the results of [DR1], we achieve the proof that a two-dimensional jellium system of interacting Fermions at low temperature T is a Fermi liquid above the BCS temperature. Following [S], this means proving analyticity in the coupling constant λ for |λ|| log T | ≤ K, where K is some numerical constant, and some uniform bounds on the derivatives of the self-energy.
1. Introduction For a general introduction we refer to the [DR1] paper. We assume all its results and notations. In [DR1] the “convergent contributions” to the vertex functions of a two-dimensional weakly interacting Fermi liquid were controlled, hence the results of [FMRT] were essentially reproduced but with a continuous renormalization group analysis, as advocated in [S]. In this paper we consider the complete sum of all graphs, perform renormalization of the two point subgraphs and obtain our main theorem. This is not a trivial extension of the methods of [FMRT] and [FT1-2], since renormalization has to be performed in phase space, not momentum space. This raises a delicate point: since angular sector decomposition has to be anisotropic [FMRT], it is not obvious that one gains anything by renormalizing in phase space, if the sector directions of the spanning tree used for spatial integration do not match the sector directions of the external legs. This non-trivial problem is solved here by a somewhat delicate one-particle irreducibility analysis for two point subgraphs that must respect the determinant structure of the Fermionic loop variables and Gram’s inequality. Here we go.
2. Renormalization We consider now the sum over all (not necessarily convergent) attributions. By [DR1], (Sect. 4, 45–47) the four point and two point subgraphs are convergent at finite temperature, but diverge logarithmically and linearly respectively when $T\to 0$. We remark that, as we keep $T\ge T_c>0$, we could avoid performing renormalization at all, but in this case the estimate of the convergence radius would be bad. Actually, we would have to bound a sum such as
$$\sum_{n=1}^{\infty}\ \sum_{n_4+n_2\le n} |\lambda|^n K_2^n\, |\log T|^{n_4}\, T^{-\frac{n_2}{2}}, \tag{2.1}$$
where $n_4$ and $n_2$ are the number of four point and two point subgraphs respectively. Since $n_2+n_4\le n$, it is easy to check that the convergence radius of this sum is defined by the upper bound on the critical temperature
$$T_c^{upper} = \max\big[T_c^{(4)}, T_c^{(2)}\big] = \frac{1}{\pi\sqrt 2}\max\Big[e^{-\frac{1}{|\lambda| 2K_2}},\ (|\lambda| 2K_2)\Big] = \frac{|\lambda| 2K_2}{\pi\sqrt 2}. \tag{2.2}$$
Actually one can do slightly better and find a bound in $|\lambda|^2$, because tadpoles vanish, so that one has effectively $n_2\le n/2$. But we see that without renormalization of the two point subgraphs, we cannot get an upper bound on the critical temperature of the non-perturbative form predicted by the theory of superconductivity$^1$, namely:
$$T_c^{true} \simeq C_1\, e^{-\frac{1}{C_2|\lambda|}}, \tag{2.3}$$
where $C_1$ and $C_2$ are two constants related to the physical parameters of the model such as the Debye frequency, the electron mass, the interatomic distance, and the particular crystalline lattice structure. Our goal in this paper is to prove an upper bound on $T_c$, i.e. give a value of $T_c^{upper}$ which is non-perturbative like (2.3) but with different constants $K_1$ and $K_2$. To obtain this behavior we need to perform renormalization, but only for two-point subgraphs, which amounts to a computation of the flow of the chemical potential only$^2$. Hence in this paper we will use the interacting action
$$S_V = \frac{\lambda}{2}\int_V d^3x\,\Big(\sum_a \bar\psi_a\psi_a\Big)^2 + \delta\mu^1\int_V d^3x\,\sum_a \bar\psi_a\psi_a, \tag{2.4}$$
where $\lambda$ is the bare coupling constant and $\delta\mu^1$ is the bare chemical potential counterterm, which is a function of the ultraviolet cut-off $\Lambda_u = 1$ and the infrared cut-off $\Lambda$. The free covariance is as usual
$$\hat C_{ab}(k) = \delta_{ab}\,\frac{1}{ik_0 - (k^2-\mu)}, \tag{2.5}$$
$^1$ We recall that in dimension $d = 2$ by the Mermin–Wagner theorem there is no continuous symmetry breaking at finite temperature, but there ought to be a critical temperature associated to a Kosterlitz–Thouless phase. At zero temperature, there are three non compact dimensions (space plus imaginary time) and there should be a continuous symmetry breaking with an associated Goldstone boson.
$^2$ To find the exact constant $K_1 = C_1$ in our Theorem 3 is trivial, but to find a bound with the exact constant $K_2 = C_2$ requires computing the flows of the coupling constant also. This is almost certainly also doable within the methods of this paper, but introduces some painful complications, since there are really infinitely many running coupling constants [FT2].
where µ = 1 is the renormalized chemical potential and we have taken 2m = 1. The BPHZ condition states u ˆ (2.6) δµren () = δµ = 2 (kF ) = d 3 xe−ikF x 2u (0, x) = 0, where 2u is the two point vertex function with u = 1 and with bare counterterm δµ1 , kF is some vector as near as possible to the Fermi surface, (the Fermi surface cannot be reached at finite temperature because of the antiperiodicity of Fermions) hence with |kF 0 | = πT and |k F | = 1. This function actually coincides with the 1PI one, as the Gevrey cut-off on internal lines fixes the 1PR contributions to zero. By rotation invariance, this condition does not depend on the angular part of k F . On the other hand, to conserve the parity in the imaginary time direction we should take the mean value ˆ (π T , k F ) + ˆ (−πT , k F )], but in our computations this is not necessary. The 1/2[ main result of our paper is u Theorem 1. The limit → 0 of 2p (φ1 , . . . φ2p ) is analytic in the bare coupling constant λ, for all values of λ ∈ C such that |λ| ≤ c, with c given by the equivalent relations
$$T = K_1\, e^{-\frac{1}{cK_2}}; \qquad c = \frac{1}{K_2\,|\log (T/K_1)|} \tag{2.7}$$
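The two relations in (2.7) are inverse to each other in the regime $T/K_1 < 1$, which the following minimal sketch illustrates. The function names and the numerical values chosen for $K_1$ and $K_2$ are placeholders for illustration only; the theorem asserts that such constants exist but does not give their values.

```python
import math

def analyticity_radius(T, K1, K2):
    """Radius c in the coupling constant at temperature T, second form of (2.7)."""
    if not 0.0 < T < K1:
        raise ValueError("the relation is stated only for 0 < T / K1 < 1")
    return 1.0 / (K2 * abs(math.log(T / K1)))

def temperature(c, K1, K2):
    """Temperature corresponding to the radius c, first form of (2.7)."""
    return K1 * math.exp(-1.0 / (c * K2))

# Placeholder constants, purely illustrative.
K1, K2 = 1.0, 2.0
T = 1e-3
c = analyticity_radius(T, K1, K2)
T_back = temperature(c, K1, K2)
```

Lowering $T$ shrinks $c$ only logarithmically, which is the statement $|\lambda||\log T|\le K$ of the abstract.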
for some constants K1 and K2 (this relation being limited to the interesting low temperature regime T /K1 < 1). Moreover the first and second derivatives3 of the self-energy ˆ 4 are uniformly bounded: (k) ∂ 2 ∂ 2 ˆ ˆ π (2.8) ∂k |k0 = β ,e(k)=0 ≤ K3 |λ| ; ∂k ∂k (k) ≤ K4 , i i j ∞ where i and j take values 0,1,2, and K3 , K4 are some constants. This theorem establishes that the jellium model of interacting Fermions in 2 dimensions is a Fermi liquid above the critical temperature of the BCS transition, in the sense of [S], and the remaining part of this section is devoted to its proof, which is a generalization of Theorem 1 of [DR1] with the additional difficulty of mass renormalization. We remark that one usually restricts to real values of the coupling constant because one wants a hermitian Hamiltonian and only real shifts in the chemical potential. With the new action (2.4) the expression to bound becomes:
λn δµ1 n = ε(T , () d 3 x1 . . . d 3 xn¯ n! n ! n≥1 ¯ o−T E ( n−1 ¯ ∂ q φ1 (xi1 ) . . . φ2p (xjp ) C (xlq , x¯lq )dq det M(E), ∂q T ≤1 ≤···≤n−1 ¯ ≤1 0 2p (φ1 , . . . φ2p )
q=1
(2.9)
$^3$ We remark that for $k_0$ the derivative means the discrete derivative $\frac{\partial f}{\partial k_0} = \frac{1}{2\pi T}\,[f(k_0+2\pi T) - f(k_0)]$, since Matsubara frequencies are discrete.
$^4$ Recall that the self-energy is the sum of all non-trivial one-particle-irreducible two point subgraphs.
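The discrete derivative of the footnote is a forward difference between two neighbouring fermionic Matsubara frequencies $(2n+1)\pi T$. A minimal sketch, with hypothetical function names and an arbitrary test function:

```python
import math

def matsubara_frequency(n, T):
    """Fermionic Matsubara frequency (2n + 1) * pi * T."""
    return (2 * n + 1) * math.pi * T

def discrete_derivative(f, k0, T):
    """Forward difference of f in k0 with step 2 * pi * T."""
    step = 2.0 * math.pi * T
    return (f(k0 + step) - f(k0)) / step

T = 0.05                                  # illustrative temperature
k0 = matsubara_frequency(0, T)            # lowest positive frequency, pi * T
next_k0 = k0 + 2.0 * math.pi * T          # the step lands on the next frequency
slope = discrete_derivative(lambda k: 3.0 * k + 1.0, k0, T)
```

For a linear function the discrete derivative reproduces the slope exactly, and the shifted argument is again an allowed frequency.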
where $n$ is the number of four point vertices (with coupling constant $\lambda$), $n'$ is the number of two point vertices (with coupling constant $\delta\mu^1$) and we defined $\bar n = n + n'$. Now, we can insert band attributions and classes exactly as in [DR1].
2.1. Extracting loop lines. Before introducing sectors, we must perform an additional expansion of the loop determinant. This is necessary for two reasons:
• to select the two-point subgraphs that really need renormalization;
• to optimize sector counting by reducing the number of possible sector choices, in order to perform renormalization.
We introduce some notation. For any class $C$ we define $D_C$ as the set of “potentially dangerous” two-point subgraphs $g_i$. They are defined by the following property: by cutting a single tree line on the path joining the two external vertices of $g_i$ we cannot separate $g_i$ into two disconnected subgraphs $g_j(C)$ and $g_{j'}(C)$, one of them, say $g_j(C)$, being a two point subgraph. This property is similar but not identical to 1PI (one particle irreducibility). In Fig. 1 there are some examples of subgraphs not belonging to $D_C$. By the relation of partial order in the CTS, $D_C$ has a forest structure (see [R]). This means that for any pair $g$ and $g'\in D_C$ we have $g\cap g' = \emptyset$ or $g\subseteq g'$ or $g'\subseteq g$. Now, for any $g\in D_C$, we define the set $A(g)$ of maximal subgraphs $g'\in D_C$, $g'\subset g$. The loop determinant is then factorized on the product of several terms: one for each set $il_j(C)$, $g_j\in A(g)$, one containing the remaining internal loop fields in $g_i$, and a last term containing all the other loop fields. Then, the good object to study is not $g$, but the reduced graph $g^r := g/D_C$, where each $g_j\in A(g)$ has been reduced to a single point (see Fig. 2). For each $g_i^r$ we denote the set of internal loop half-lines by $il_i^r$ and the set of vertices by $V_i^r$.
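The forest property and the set $A(g)$ of maximal strict subgraphs can be illustrated on a toy family. In this sketch subgraphs are modelled simply as sets of vertex labels, which is an assumption made for illustration; the function names and the example family are hypothetical.

```python
def is_forest(family):
    """True if any two members are either disjoint or nested."""
    return all(
        g.isdisjoint(h) or g <= h or h <= g
        for g in family
        for h in family
    )

def maximal_subgraphs(g, family):
    """Members strictly contained in g that are maximal for inclusion, i.e. A(g)."""
    inside = [h for h in family if h < g]
    return [h for h in inside if not any(h < other for other in inside)]

# Toy family: g contains g1, g2, g3, and g1 contains the smaller subgraph g4.
g = frozenset({1, 2, 3, 4, 5, 6})
g1 = frozenset({1, 2})
g2 = frozenset({3})
g3 = frozenset({4, 5})
g4 = frozenset({1})
family = [g, g1, g2, g3, g4]

forest_ok = is_forest(family)
A_g = maximal_subgraphs(g, family)
```

Here $A(g)$ consists of `g1`, `g2`, `g3` only, since `g4` is hidden inside `g1`; the reduced graph $g^r$ would contract each of these three to a point, in the spirit of Fig. 2.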
[Fig. 1. Examples of subgraphs not belonging to $D_C$; tree lines are solid and loop fields are wavy. Panels a) and b) show $g_j$ and $g_{j'}$ joined by the tree line $l_{A(j)=A(j')}$.]
[Fig. 2. A subgraph $g$ and the corresponding reduced subgraph $g^r$; $g_1$, $g_2$ and $g_3$ belong to $A(g)$.]
[Fig. 3. Example of $C_i^r$: the dashed lines belong to the chain.]
[Fig. 4. Reduction of $g_{j_{q+1}}$ in $g_{j_q}$.]
For each $g_i^r$, $g_i\in D_C$, we call $h_i^{(1)}$ the external half-line $h_i^{root}$ and $h_i^{(2)}$ the other external half-line. In the same way we define $v_i^{(1)}$ and $v_i^{(2)}$. With these definitions, we introduce the chain $C_i^r$ which joins the dot vertex $v_{h_i^{(2)}}$ to the cross vertex just above the cross $t(i)$ (see Fig. 3). On this chain we define the set $J_i$ of cross (and possibly one dot) indices $j$ corresponding to four-point subgraphs $|eg_j(C)| = 4$. We order them starting from the lowest index $j_1$ and going up to the highest $j_{|J_i|}$. We remark that, by definition of $D_C$, there is no index $j$ on the chain with $|eg_j(C)| = 2$. Again we introduce the reduced subgraphs $g^r_{j_q}(C) := g_{j_q}(C)/g_{j_{q+1}}(C)$ (see Fig. 4), the set of internal loop half-lines of $g^r_{j_q}$, $il^r_{j_q}$, and that of internal vertices, $V^r_{j_q}$. Then the corresponding loop determinant is factorized,
$$\det(il_{j_q}) = \det(il^r_{j_q})\,\det(il^r_{j_{q+1}}). \tag{2.10}$$
For the first step of the induction we define $j_0 = i$, $g_{j_0} := g_i^r$ (it is a two point subgraph!) and $g^r_{j_0} := g_{j_0}/g_{j_1}$. For each $g^r_{j_q}$, $q = 1,\dots,|J_i|$, we call $h^{(1)}_{j_q}$ the external tree half-line $h^{root}_{j_q}$, $l^{root}_{j_q}$ the corresponding tree line, $h^{(2)}_{j_q} = h^{(2)}_i$ (we remark that $h^{root}_{j_q}$ can never coincide with $h^{(2)}_i$ by construction), and $h^{(3)}_{j_q}$, $h^{(4)}_{j_q}$ the remaining two external half-lines. The line $l^{root}_{j_q}$ cuts the tree $t_i$ into two connected components. We call $T^L_{j_q}(i)$ the component that contains the vertex $x_i^{(1)}$, and $T^R_{j_q}(i)$ the other component that contains the vertex $x_i^{(2)}$. We remark that all vertices in $g^r_{j_q}$ belong to $T^R_{j_q}(i)$. For each $q = 1,\dots,|J_i|$ (starting from the lowest and going up) we test if there is some loop line $l_{fg}$ with $f,g\in il_i^r$ connecting $T^L_{j_q}(i)$ with $T^R_{j_q}(i)$. If for some $j_q\in J_i$ there is no loop line, $g_i$ is actually 1PR (one particle reducible) and, by momentum conservation, it does not need to be renormalized (as is shown below). On the other hand, if $\forall j\in J_i$ we can find a loop line, then $g_i$ is 1PI and it must be renormalized. We perform this test inductively. At each subgraph $g_{j_q}$ we define
$$\begin{aligned} L^R_{j_q}(i) &:= \{a\in il^r_{j_{q-1}} \,|\, a\in T^R_{j_q}(i),\ m(a,C)\le i(l^{root}_{j_q})\le A(j_q)\}, \\ L^L_{j_q}(i) &:= \{a\in il^r_{j_{q-1}} \,|\, a\in T^L_{j_q}(i),\ m(a,C)\le i(l^{root}_{j_q})\le A(j_q)\}, \end{aligned} \tag{2.11}$$
(where we recall that $A(j_q)$, defined in [DR1], is the index of the highest external tree line of $g^r_{j_q}$). Actually, $L^R_{j_q}(i)$ is the set of internal loop half-lines of $g^r_{j_{q-1}}$ which are hooked to $T^R_{j_q}(i)$ and may connect somewhere in $T^L_{j_q}(i)$. By construction, no internal loop half-line of $g^r_{j_q}$ and no external loop half-line of $g^r_{j_{q-1}}$ belongs to $L^L_{j_q}(i)\cup L^R_{j_q}(i)$. This is the main reason for which this expansion does not develop any new factorial. We distinguish three situations:
1. $h^{(3)}_{j_q}$ and $h^{(4)}_{j_q}\in L$ (see Fig. 7). Then $l^{root}_{j_q} = l_{A(j_q)}$, $L^R_{j_q}(i)$ is reduced to two elements and we develop the determinant to choose where they contract, applying two times the following formula:
$$\det M = \sum_a M_{h^{(3)}_{j_q},a}\ \varepsilon(h^{(3)}_{j_q},a)\ \det M_{red}, \tag{2.12}$$
where $\varepsilon(h^{(3)}_{j_q},a)$ is a sign and $\det M_{red}$ is the determinant of the reduced matrix obtained taking away a row and a column. If they contract together $g_i$ is 1PR. If not, we have $|L^L_{j_q}(i)|^2$ choices to contract them. We remark that if $h^{(3)}_{j_q}$ or $h^{(4)}_{j_q}$, or both, are external lines at some $g_{j_{q'}}$, with $q'<q$, then they have already been extracted from the determinant and we do not touch them.
2. $h^{(3)}_{j_q}\in L$ and $h^{(4)}_{j_q}\in t_i$ (see Fig. 8). If $h^{(3)}_{j_q}$ has not been already contracted at some lower scale, we develop the determinant as before to choose where $h^{(3)}_{j_q}$ contracts. If $h^{(3)}_{j_q}$ has already been contracted at some lower scale we do not touch it. In any case, if $h^{(3)}_{j_q}$ contracts with some element of $L^L_{j_q}(i)$, then 1PI is assured and we go to the step $q+1$. If not (Fig. 9a), we test the loop determinant in the following way:
$$\det M(C) = \det M(C)(0) + \int_0^1 ds_{j_q}\,\frac{d}{ds_{j_q}}\det M(C)(s_{j_q}), \tag{2.13}$$
where we defined
$$M_{x_f,x_g}(C)(s_{j_q}) = s_{j_q}\,M_{x_f,x_g}(C), \qquad s_{j_q}\in[0,1] \tag{2.14}$$
if $(f,g)$ or $(g,f)$ belong to $L^R_{j_q}(i)\times L^L_{j_q}(i)$ and
$$M_{x_f,x_g}(C)(s_{j_q}) = M_{x_f,x_g}(C) \tag{2.15}$$
otherwise. The term $s\neq 0$ extracts from the determinant the loop line we wanted (see Fig. 9a). The term $s = 0$ means that $g_i$ is 1PR. The number of choices is bounded by $|L^R_{j_q}(i)|^2\,|L^L_{j_q}(i)|$.
3. $h^{(3)}_{j_q}$ and $h^{(4)}_{j_q}\in t_i$. Then we apply directly the interpolation formulas (2.13)–(2.14). Again we distinguish the case $s = 0$, that corresponds to $g_i$ 1PR, and the case $s\neq 0$ that corresponds to $g_i$ 1PI and has at most $|L^L_{j_q}(i)|\,|L^R_{j_q}(i)|$ terms (see Fig. 10a).
Repeating the same procedure for all $j\in J_i$ we extract from the loop determinant at most $2|J_i|$ internal loop line propagators. For each class $C$, the process $J_i^0$ specifies the set of $j_q\in J_i$ for which one or two loop lines have been extracted simply developing the determinant, and $J_i^1$ specifies the set of $j_q\in J_i$ for which one loop line has been extracted applying (2.14). In the same way the processes $P_0$ and $P_1$ specify which loop fields are contracted in $J_i^0$ and $J_i^1$ for all $i$. Then
$$\det M(C) = \sum_J \sum_P \prod_{g_i\in D_C}\Bigg[\prod_{\substack{fg\in il_i^r \\ l_{fg}\in P}} M_{f,g}(C)\Bigg]\Bigg[\prod_{j_q\in J_i^1}\int_0^1 ds_{j_q}\Bigg]\det M(C)(\{s_{j_q}\}), \tag{2.16}$$
where $J$ defines the sets $J_i^0$ and $J_i^1$ for all $i$. For each loop line $l_{fg}$ extracted, the set of band indices accessible for both $f$ and $g$ is reduced to
$$M^r(f,C) = M^r(g,C) = \min[M(f,C),M(g,C)], \qquad m^r(f,C) = m^r(g,C) = \max[m(f,C),m(g,C)]. \tag{2.17}$$
We have to verify that the new matrix $M(C)(\{s_{j_q}\})$ still satisfies a Gram inequality and that the sum over processes does not develop a factorial. This is done in the following two lemmas. Remark that the sum over $J$ is not dangerous. Actually at each $j_q$ we have two choices, hence $|J|\le 2^{\bar n}$.
Lemma 1. $M(C)(\{s_{j_q}\})$ satisfies the same Gram inequality as $M(C)$ in [DR1], (4.4), which does not depend on the parameters $s_{fg}$.
Proof. The proof is identical to that of [DR1], Lemma 4. The only difference is that now $W^k_{v,a;v',a'}$ contains an additional $s$ dependent factor $S^k_{v,a;v',a'}$. By (2.10) or (2.19) below, we recall that the determinant for the set $L^R_{j_q}(i)\cup L^L_{j_q}(i)$ of fields and antifields which may be concerned by the $s_{j_q}$ interpolation step factorizes in the big loop determinant, so we need only to consider a single such factor $S^{k,j_q}_{v,a;v',a'}$, and prove that it is still a positive matrix. This is obvious if we reason on the index space for the vertices $v$ and $v'$ to which the fields and antifields hook (and not on the field or antifield indices $a$ and $a'$ themselves). Indeed $S^{k,j_q}_{v,a;v',a'}$ is $\chi^v_a\,\chi^{v'}_{a'}$ (the positive matrix which is 1 if $a$ hooks to $v$ and $a'$ hooks to $v'$, and 0 otherwise) times the combination with positive coefficients $s_{j_q}M_{v,v'} + (1-s_{j_q})N_{v,v'}$ of the positive matrix $M$ which has each coefficient equal to 1 and the positive block matrix $N$ which has $N_{v,v'} = 1$ if $v$ and $v'$ belong both to $T^R_{j_q}$ or both to $T^L_{j_q}$ and $N_{v,v'} = 0$ otherwise. Therefore the matrix $S^{k,j_q}_{v,a;v',a'}$ is positive in the big tensor space spanned by pairs of indices $v,a$, it has a diagonal bounded by 1, and we can complete the proof as in [DR1], Lemma 4. The conclusion is that the additional interpolation parameters $s_{j_q}$ do not change the Gram estimate and the norms of $F_f$ and $G_g$ given in [DR1].
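The positivity of the combination $sM + (1-s)N$ used in the proof can be made explicit on the vertex index space: its quadratic form is $s(\sum_v x_v)^2 + (1-s)\big[(\sum_{v\in T^R} x_v)^2 + (\sum_{v\in T^L} x_v)^2\big]\ge 0$. The sketch below checks this identity numerically; the assignment of vertices to the two components and all names are illustrative assumptions.

```python
import random

def interpolation_matrix(side, s):
    """Entries of s*M + (1-s)*N: 1 for vertices on the same side, s otherwise."""
    n = len(side)
    return [[1.0 if side[v] == side[w] else s for w in range(n)] for v in range(n)]

def quadratic_form(S, x):
    """Value of x^T S x."""
    n = len(x)
    return sum(x[v] * S[v][w] * x[w] for v in range(n) for w in range(n))

def closed_form(side, s, x):
    """s * (sum over all)^2 + (1 - s) * (sum over R)^2 + (1 - s) * (sum over L)^2."""
    total = sum(x)
    right = sum(xv for xv, sd in zip(x, side) if sd == "R")
    left = sum(xv for xv, sd in zip(x, side) if sd == "L")
    return s * total ** 2 + (1.0 - s) * (right ** 2 + left ** 2)

random.seed(0)
side = ["R", "R", "L", "R", "L"]          # hypothetical split of five vertices
checks = []
for _ in range(50):
    s = random.random()
    x = [random.uniform(-1.0, 1.0) for _ in side]
    value = quadratic_form(interpolation_matrix(side, s), x)
    checks.append((value, closed_form(side, s, x)))
```

Each pair in `checks` agrees up to rounding and is non-negative, and the diagonal entries equal 1, in line with the two properties invoked in the proof.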
Lemma 2. The cardinal of $P$ is bounded by $K^{\bar n}$ for some constant $K$.
Proof. The loop determinant is factorized on determinants restricted to each reduced two-point subgraph in $D_C$:
$$\det M = \prod_{g_i^r\in D_C/A(g_i)} \det M(il_i^r). \tag{2.18}$$
Each determinant $\det M(il(g_i^r))$ is in turn factorized on determinants restricted to internal loop fields for each reduced subgraph $g^r_{j_q}$, $q = 0,\dots,|J_i|$:
$$\det M(il(g_i^r)) = \prod_{q=0}^{|J_i|} \det M(il^r_{j_q}). \tag{2.19}$$
We have seen that for each $g_{j_q}$ the number of terms in $P$ is bounded by $|L^R_{j_q}(i)|^2\,|L^L_{j_q}(i)|$. Then
$$\begin{aligned} |P| &\le \prod_{g_i^r\in D_C/A(g_i)}\ \prod_{q=0}^{|J_i|} \big(|L^L_{j_q}(i)|\,|L^R_{j_q}(i)|^2\big) \\ &\le 2^{\bar n}\, e^{\sum_{g_i^r\in D_C/A(g_i)}\sum_{q=0}^{|J_i|}\big(|L^R_{j_q}(i)| + |L^L_{j_q}(i)|\big)} \\ &\le 2^{\bar n}\, e^{4\sum_{g_i^r\in D_C/A(g_i)}|V_i^r|} \le K^{\bar n}, \end{aligned} \tag{2.20}$$
where we applied
$$\sum_{q=0}^{|J_i|}\big(|L^R_{j_q}(i)| + |L^L_{j_q}(i)|\big)\le 4|V_i^r|; \qquad \sum_{g_i^r\in D_C/A(g_i)}|V_i^r|\le\bar n. \tag{2.21}$$
This completes the proof. □
Now we can insert sector decouplings exactly as we did in [DR1], but with a few additional operations.
2.2. Sector refinement. For each gi ∈ DC , 1PI and with hi ∈ E, we introduce one (2) more sector decomposition on hi , in order to optimize the bounds from renormalization 1
(Sect. 2.7). Actually, the finest sector of size 2
(2)
i(hi )
sector of size 1
1
j2 (2) := 2 hi
,1
(2)
i(hi )
is further decomposed in a smaller
zi ,
(2.22)
Interacting Fermi Liquid in Two Dimensions at Finite Temperature. II (2)
299
(2)
where i(hi ) ≤ A(i) is the band index of hi and 0 < zi ≤ 1 is a factor to be chosen. This sector is introduced applying the identity [DR1](3.11) with αs = αjh,1 −1
1
defined by αjh,14 := j2h,1 . All the other larger sectors are introduced through the identity [DR1](3.13). The effect of last refinement is an additional factor 1/zi from sector counting and spatial integration (this only if the refined line is a tree one), and a factor zi from the volume in impulsion space. Then, we are left with a global factor 1/zi . The 1
optimal value is zi = t2(i) , as will be explained at the end of Sect. 2.7. The expression to bound is then similar to [DR1] (3.16): u 2p (φ1 , . . . , φ2p )
n≥1 ¯
n−1 ¯
T ≤A(i) ≤i ≤1 q=1
[
λn δµ1 n = ε(T , () n! n !
1 4 −2 3 jh,n −1 h
]
ϒ
gi | i=r, |egi (C|)≤10
dq
h∈L∪TL
dθh,nh −1 . . . [
jh,n
h
CT S u−T
θiroot , {θh,r(i) }h∈egi∗
−1
2 ] [ 43 jh,n h
1 4 −2 3 jh,1
2π
0
L E( C J,P
dθh,nh
]
jh,2
dθh,1
n h r=2
v∈V ∪V
ϒ θhroot , {θh,nh }h∈H ∗ (v) v
d 3 x1 . . . d 3 xn¯ φ1 (xi1 , θe1 ,1 ) . . . φ2p (xjp , θe2p ,1 )
lf g ∈P
Mf,g (C, E, {θa,1 })
jq ∈J 1
1 0
θ χαjh,r (θh,1 ) h,r
n−1 ¯
C q (xq , x¯q , θh,1 )
q=1
dsjq det M (C, E, {θa,1 }, {sjq }),
where we defined $V$ and $V'$ as the sets of four point and two point vertices respectively. To perform renormalization we apply to the amplitude of each two point subgraph $g$ the operator $(1-\tau_g)+\tau_g$, where $\tau_g$ selects the linearly divergent term in $g$ giving a local counterterm for $\delta\mu$ that depends on the scale of the external lines of $g$. We start the renormalization from the leaves of the CTS (hence from the smallest subgraphs at highest scale) and go down.
2.3. Momentum space. The Taylor expansion of $\hat g(k)$ around a vector $k_F$ near the Fermi surface gives two possible sources of counterterms. The term of order 0 in the Taylor expansion is linearly divergent and gives rise to a chemical potential counterterm; the term of order 1 is logarithmic and would give rise to wave function counterterms (in fact proportional to $k_0$ and $k^2$), that we do not need to consider for our upper bound, Theorem 1. As we said, for this kind of bound we need only to perform the linearly divergent renormalization. Therefore we define the localization operator acting on a two-point function as:
$$\delta(k_1+k_2)\,\tau_g\,\hat g(k_2) = \delta(k_1+k_2)\,\hat g(k_F). \tag{2.23}$$
We remark that by rotational invariance there is no ambiguity in the choice of the spatial component of kF . For the temporal component we take kF 0 = π T , to simplify computations. This choice breaks parity in the imaginary time direction, but in our context this is not essential. 2.3.1. Subgraphs to renormalize. We do not need to renormalize all two point subgraphs. Actually, by momentum conservation it is easy to see that, if gi (C) is 1PR (in the sense explained in Sect. 2.1) and gj (C) is the two-point subgraph we obtain cutting one tree line of gi ,
$$\tau_{g_i(C)}\big(1-\tau_{g_j(C)}\big) = \big(1-\tau_{g_i(C)}\big)\,\tau_{g_j(C)} = 0, \tag{2.24}$$
hence the renormalization of gi (C) is ensured by that of gj (C). The set of 1PR subgraphs is the union of the set of two point subgraphs not in DC , for which we knew one particle reducibility from the start, and the set DC \D(C, P ), for which we learnt it after the loop extraction process. For any of such subgraphs one internal line lj must have the same momentum as the external line lA(i) . Then the internal and external scales of gi cannot be far; this imposes a constraint on the integral over the parameter i that allows to avoid renormalizing these subgraphs. But even two point 1PI subgraphs do not all need renormalization. To treat real external half-lines we define, for each ej ∈ E, with 1 < j ≤ 2p, ej =:
k02 + e2 (k)
|k=kej
.
(2.25)
Now, for each gi (C) with hi (2) = ej ∈ E with 1 < j ≤ 2p we have the following relation between A(i) and ej : √ 2ej ≤ A(i) ≤ 2ej .
(2.26)
Then we distinguish two situations. • If ej ≥ √i , then internal and external energies are near, and we can apply the same 2 2 argument as for 1PR subgraphs. • If ej < √i < i then we must renormalize. 2 2
Hence we define the set of renormalized subgraphs as D(C, P ) := gi ||egi (C)| = 2, gi 1PIand
(2) ifhi
= ej , 1 < j ≤ 2p, ej
! i < √ . 2 2 (2.27)
We remark that, by the relation of partial order in the CT S, D(C, P ) has a forest structure (see [R and DR2]). We denote by N D(C, P ) (not-dangerous . . . ) the set of two point subgraphs which are not renormalized.
2.4. Real space. The formulation of renormalization in momentum space is the one of [FT2] and is sufficient for perturbative results. In this formulation the localization operator is rotation invariant. However for constructive bounds we need a phase space analysis, hence a direct space “dual version” of this operator [R]. In the space of positions, the dual localization operator, which acts on the external lines of the subgraph, is never unique. In relativistic Euclidean field theory it depends on the choice of an arbitrary localization point (see [R]), a convenient choice being the position of one of the external vertices. Here, in condensed matter, this dual operator depends on an additional choice, namely a direction on the Fermi surface. A convenient choice is found thanks to the sector decomposition. Actually, before performing the sum over sector attributions, the two external propagators of a graph gi belong to well defined sectors (if they are not real external half-lines) (αj (1) , θ1 ) and (αj (2) , θ2 ) with sector center on the vectors (0, r k ), k = 1, 2, where j (1), j (2) ≤ A(i). Therefore we define the operator τg as a first order Taylor expansion around the momentum k2 = −r2 = (−π T , −r 2 ) (the minus sign corresponding to integration by parts). The dual x-space operator τg∗ acts on the product of external propagators Cθ1 (x1 , y1 )Cθ2 (x2 , y2 ) by τg∗ Cθ1 (x1 , y1 )Cθ2 (x2 , y2 ) = eir2 (x2 −x1 ) Cθ1 (x1 , y1 )Cθ2 (x1 , y2 ).
(2.28)
This formula does not coincide with the usual one (see [R]) and can be justified by observing that $C_{\theta_s}(x,y)$ is not a slowly varying function of $x$, but has a spatial momentum of order 1, hence oscillates wildly. The good slowly varying function to move is $C'_{\theta_s}(x,y)$ defined by:
$$C_{\theta_s}(x,y) = \frac{e^{ir_s(x-y)}}{(2\pi)^2}\int d^3k\, e^{i(k-r_s)(x-y)}\, C_{\theta_s}(k) := e^{ir_s(x-y)}\, C'_{\theta_s}(x,y). \tag{2.29}$$
The expression (2.28) can also be obtained by defining
$$\tau_g^*\, C'_{\theta_1}(x_1,y_1)\, C'_{\theta_2}(x_2,y_2) = C'_{\theta_1}(x_1,y_1)\, C'_{\theta_2}(x_1,y_2). \tag{2.30}$$
External test functions. When a test function $\phi_e$ is the moved half-line for a subgraph $g_i$, we apply the same formula as for the propagator but with $r_s$ defined as the projection of the external impulsion $k_e = \theta_{e,1}$ on the Fermi surface: $r_s = (\pi T, \mathbf{k}_e/|\mathbf{k}_e|)$. This choice is important in order to have an effective constant $\delta\mu$ independent of $k$. We remark that, with this choice, $\phi_e$ is a slowly varying function with phase $|k - r_s| \le \Lambda_e < \Lambda_{A(i)}/(2\sqrt{2}) < \Lambda_i$.

Choice of the reference vertex. The choice of $x_1$ as a fixed vertex instead of $x_2$ is arbitrary. In this paper we use the rule that most simplifies notations and calculations (not exactly the same as in [DR2]). For each $g_i \in D(C,P)$ we choose as reference vertex the one hooked to the half-line $h_i^{(1)} = h_i^{root}$, namely $v_i^{(1)}$ with position $x_i^{(1)}$. The moved vertex is then $x_i^{(2)}$. This rule implies that tree lines never have both ends moved, and that the root vertex $x_1$, which is essential in spatial integration, is always fixed. In the following we will denote by $D_t(C,P)$, $D_l(C,P)$, $D_e(C,P)$ the subgraphs in $D(C,P)$ for which the moved line is a tree, loop or external line respectively.
2.5. Effective constants. At each vertex $v$ we can now resum the series of all counterterms obtained by applying $\tau_g$ to all $g \in D(C,P)$ (for different classes $C$, processes $P$ and perturbation orders $\bar n$) that have the same set of external lines as $v$ itself. In this way we obtain an effective coupling constant which depends on the scale $i_v$ of the highest tree line hooked to the vertex $v$. This is automatically true for a two point vertex (and in fact would also be true as in [DR2] for a four point vertex because tadpoles are zero by [DR1], Lemma 2). Each counterterm is now a function
$$F_{\theta_1,\theta_2}(y_1,y_2) = \int d^3x_1\, C_{\theta_1}(x_1,y_1)\,C_{\theta_2}(x_1,y_2) \int d^3x_2\, g(x_1,x_2)\, e^{ir_2(x_2-x_1)} = \int d^3x_1\, C_{\theta_1}(x_1,y_1)\,C_{\theta_2}(x_1,y_2)\,\hat g(-r_2), \tag{2.31}$$
where we applied the translational invariance of $g$. Now remark that $\hat g(k)$ is invariant under rotations of the spatial component $\mathbf{k}$ of $k$, as the free propagator depends only on the absolute value of $\mathbf{k}$. Therefore
$$\hat g(-r_2) = \hat g(-\pi T, |\mathbf{r}_2|) = \hat g(-\pi T, 1) \tag{2.32}$$
is independent from θ1 and θ2 . Theorem 2. If we apply to each two point subgraph g ∈ D(C, P ), for any class C and process P , the operator (1 − τg ) + τg = Rg + τg , (2.23) can be written as u 2p (φ1 , . . . , φ2p ) =
n≥1 ¯
n−1 ¯
T ≤A(i) ≤i ≤1 q=1
[
λn ε(T , () n!n !
1 4 −2 3 jh,n −1 h
gi | i=r, |egi (C)|≤10
]
ϒ
dq
CT S u−T
h
θiroot , {θh,r(i) }h∈egi∗
−1
2 ] [ 43 jh,n h
h∈L∪TL
dθh,nh −1 . . . [
jh,n
L E( C
1 4 −2 3 jh,1
2π
0
dθh,nh n h
]
jh,2
v∈V ∪V
dθh,1
r=2
θ χαjh,r (θh,1 ) h,r
ϒ θhroot , {θh,nh }h∈H ∗ (v) v
d 3 x1 . . . d 3 xn¯ φ1 (xi1 , θe1 ,1 ) . . . φ2p (xjp , θe2p ,1 )
(2.33)
JP
v∈V
δµiv (λ)
¯ n−1 Rgi C q (xq , x¯q , θh,1 ) q=1 gi ∈D(C ,P ) 1 Mf,g (C, E, {θa,1 }) dsjq det M (C, E, {θa,1 }, {sjq }) , 1 0 lf g ∈P
jq ∈J
where δµq (λ), the effective constant defined by: 1 1 δµq (λ) = ˆ 2 q (−r2 ) = d 3 x2 2 q (0, x2 )eir2 x2 ,
(2.34)
is independent of the choice of the angular component of r 2 . This effective constant is the 1 vertex function 2 q for an effective theory with IR parameter q , and bare counterterm
δµ1 . Furthermore δµq (λ) is analytic in λ and is bounded by q δµ (λ) ≤ K|λ|(q − )
(2.35)
for some constant K. The renormalized δµren () is then the vertex function for an effective theory with IR parameter u = , δµren () = δµ (λ) = 0.
(2.36)
Finally the first and second derivatives of the self-energy $\hat\Sigma(k)$ are uniformly bounded:
$$\Big|\frac{\partial}{\partial k_i}\hat\Sigma(k)\Big|_{k_0=\pi/\beta,\ e(k)=0}\Big| \le K|\lambda|^2; \qquad \Big\|\frac{\partial^2}{\partial k_i\,\partial k_j}\hat\Sigma(k)\Big\|_\infty \le K, \tag{2.37}$$
where $i$ and $j$ take values 0, 1, 2, and $K$ is some constant. These bounds are proved in Appendix B.

Proof. The first part of the theorem actually consists in a reshuffling of perturbation theory, and can be proved by standard combinatorial arguments as in [R]. The only difficulty that is not in [R] is to prove that the parameter $\Lambda_q$ of the effective constant always corresponds to the highest tree line of the vertex: as we said above this is obvious for two point vertices. The second part of the theorem, that is the analyticity of $\delta\mu$ and the bound (2.35), corresponds in statistical mechanics to the problem of fixing the bare mass in such a way that the renormalized mass is zero. This is a standard problem, now well understood. For instance, a proof in the case of the critical $\phi^4_4$ model can be found in [FMRS and GK]. For completeness we recall the arguments of the proof in Appendix A. Finally the bound of the first and second derivatives of the self-energy allows a Taylor expansion around the Fermi surface which proves Fermi liquid behavior [S]; they would be false in d = 1, where Luttinger liquid behavior is known to occur [BGPS]–[BM]. □

2.6. Convergence of the effective expansion.

Theorem 3. Let $\varepsilon > 0$ and $\Lambda_u = 1$ be fixed. The series (2.33) is absolutely convergent for $|\lambda| \le c$ and
$$c \le \frac{1}{K_2\,|\log(T/K_1)|} \tag{2.38}$$
for some constants K1 , K2 . This convergence is uniform in , then the IR limits of the u u vertex functions 2p = lim→0 2p exist, they are analytic in λ in a disk of radius c, and they obey the bounds 5−3p
u 2p>4 (φ1 , . . . , φ2p )|
1
1
T 2 −(1−ε)Ts dTs (xe1 ,...,xe2p ) ≤ K0 , [K1 (ε)]p (p!)2 K(c, T )e 5 − 3p 1 s
1 s
|40 (φ1 , . . . , φ4 )| ≤ K0 (ε)T − 2 K(c, T )e−(1−ε)T dT (xe1 ,...,xe4 ) , 1
1 s
1 s
|20 (φ1 , φ2 )| ≤ K0 (ε)e K(c, T )e−(1−ε)T dT (xe1 ,xe2 ) , (2.39)
where xei is the maximum of the test function φi (x), K1 (ε), K0 (ε) and K0 (ε) are 2 + e2 (k ) for the external half-line e, d (x , . . . x ) functions of ε only, e = ke0 e e2p T e1 is defined as in [DR1], Theorem 2 (4.3), K(c, T ) is a function which tends to 0 when c → 0. The bounds are also valid in the case of external impulsions fixed, but without the exponential decay factor.
This theorem (that is a generalization of [DR1], Theorem 2) means that one can build in a constructive sense the infrared limit of the Fermi liquid at a finite temperature higher than some exponentially small function of the coupling constant simply by summing up perturbation theory. The rest of this paper is devoted to the proof of that theorem.
2.7. Lines interpolation. Before performing any bound we must study the action of $\prod_{g\in D_C} R_g^*$. For each $g_i \in D_C$ the action of $R_g^*$ on the external lines of $g_i$ is
$$\begin{aligned}
R^*_{g_i}\, C_{\theta_1}(x^{(1)},y^{(1)})\, C_{\theta_2}(x^{(2)},y^{(2)})
&= C_{\theta_1}(x^{(1)},y^{(1)}) \Big[ C_{\theta_2}(x^{(2)},y^{(2)}) - e^{ir_2(x^{(2)}-x^{(1)})}\, C_{\theta_2}(x^{(1)},y^{(2)}) \Big] \\
&= C_{\theta_1}(x^{(1)},y^{(1)})\, e^{ir_2x^{(2)}} \Big[ C_{\theta_2}(x^{(2)},y^{(2)})\, e^{-ir_2x^{(2)}} - C_{\theta_2}(x^{(1)},y^{(2)})\, e^{-ir_2x^{(1)}} \Big] \\
&= C_{\theta_1}(x^{(1)},y^{(1)})\, e^{ir_2x^{(2)}} \int_0^1 dt\, \frac{d}{dt}\Big[ C_{\theta_2}(x^{(2)}(t),y^{(2)})\, e^{-ir_2x^{(2)}(t)} \Big],
\end{aligned} \tag{2.40}$$
where we applied a first order development on $C_{\theta_2}(x^{(2)},y^{(2)})\,e^{-ir_2x^{(2)}}$ and $x^{(2)}(t)$ is any differentiable path with $x^{(2)}(0) = x^{(1)}$ and $x^{(2)}(1) = x^{(2)}$. The external line hooked to $x^{(2)}$ has then been hooked to the point $x(t)$ (see Fig. 5) and has now propagator:
$$C^m_{\theta_2}(x^{(2)}(t),y^{(2)}) := e^{ir_2x^{(2)}}\, \frac{d}{dt}\Big[ C_{\theta_2}(x^{(2)}(t),y^{(2)})\, e^{-ir_2x^{(2)}(t)} \Big]. \tag{2.41}$$
The easiest choice for the path is a linear interpolation between $x^{(1)}$ and $x^{(2)}$:
$$x^{(2)}(t) = x^{(1)} + t\,(x^{(2)} - x^{(1)}). \tag{2.42}$$

[Fig. 5. Line interpolation]
This is actually the kind of path we will take if the moved line is a loop or an external one. The interpolated line can then be written as
$$\begin{aligned}
C^m_{\theta_2}(x^{(2)}(t),y^{(2)}) &= e^{ir_2(x^{(2)}-x^{(2)}(t))}\,(x^{(2)}-x^{(1)})^\mu \Big[-ir_2 + \frac{\partial}{\partial x^{(2)}(t)}\Big]_\mu C_{\theta_2}(x^{(2)}(t),y^{(2)}) \\
&= \frac{e^{ir_2(x^{(2)}-x^{(2)}(t))}}{(2\pi)^2} \int d^3k\, e^{ik(x^{(2)}(t)-y^{(2)})}\; i\,(x^{(2)}-x^{(1)})(k-r_2)\; C_{\theta_2}(k).
\end{aligned} \tag{2.43}$$
When applied to a tree line, this interpolation does not “follow the tree” as the point $x(t)$ in general no longer hooks to some point on a segment corresponding to a tree line. This leads to some difficulties when integrating over spatial positions. To avoid this we take $x(t)$ as the path in the tree joining $x^{(2)}$ to $x^{(1)}$, as in [DR2]. This path has in general $q$ lines with vertices $x_0, \ldots, x_q$ with the conditions $x_0 = x^{(1)}$ and $x_q = x^{(2)}$. We remark that, with this rule, the renormalization at higher scales modifies the tree used for renormalization at lower scales. We will define below the modified tree by an induction process. The interpolated line can then be written as
$$e^{ir_2x^{(2)}} \Big[ e^{-ir_2x^{(2)}}\, C_{\theta_2}(x^{(2)},y^{(2)}) - e^{-ir_2x^{(1)}}\, C_{\theta_2}(x^{(1)},y^{(2)}) \Big] = \sum_{j=1}^{q} \int_0^1 dt\; C^m_{\theta_2}(x_j(t),y^{(2)}), \tag{2.44}$$
where we defined
$$\begin{aligned}
C^m_{\theta_2}(x_j(t),y^{(2)}) &= e^{ir_2x^{(2)}}\, \frac{d}{dt}\Big[ C_{\theta_2}(x_j(t),y^{(2)})\, e^{-ir_2x_j(t)} \Big] \\
&= e^{ir_2(x^{(2)}-x_j(t))}\,(x_j-x_{j-1})^\mu \Big[-ir_2 + \frac{\partial}{\partial x_j(t)}\Big]_\mu C_{\theta_2}(x_j(t),y^{(2)}) \\
&= \frac{e^{ir_2(x^{(2)}-x_j(t))}}{(2\pi)^2} \int d^3k\, e^{ik(x_j(t)-y^{(2)})}\; i\,(x_j-x_{j-1})(k-r_2)\; C_{\theta_2}(k)
\end{aligned} \tag{2.45}$$
and
$$x_j(t) = x_{j-1} + t\,(x_j - x_{j-1}). \tag{2.46}$$
2.7.1. Second order expansion. The renormalizing factor is $(k-r_2)(x_j-x_{j-1})$, or $(k-r_2)(x^{(2)}-x^{(1)})$. When the interpolated line is a tree or loop line, the size of $(k-r_2)$ is fixed by the cut-off of the propagator $C_{\theta_2}$:
$$(k-r_2)_0 \simeq \Lambda_{i_2} \le \Lambda_{A(i)}, \qquad (k-r_2)_{r(r_2)} \simeq \Lambda_{i_2} \le \Lambda_{A(i)}, \qquad (k-r_2)_{t(r_2)} \simeq \Lambda_{i_2}^{1/2}\, z_i \le \Lambda_{A(i)}^{1/2}\, z_i, \tag{2.47}$$
where (k − r2 )r(r2 ) is the spatial component on the direction r 2 and (k − r2 )t (r2 ) is the spatial component on the direction orthogonal to r 2 . We remark that the size of the tangential component (k − r2 )t (r2 ) is the size of the finest sector of the propagator Cθ2 ;
as we have said in the preceding subsection, we have cut its finest sector scale $\Lambda_{i_2}^{1/2}$ into a smaller sector, to improve the renormalizing factor. When the interpolated line is an external line we have
$$(k-r_2)_i \simeq \Lambda_e < \frac{\Lambda_{A(i)}}{2\sqrt{2}}, \qquad i = 0, 1, 2. \tag{2.48}$$
On the other hand $(x_j - x_{j-1})$ is bounded using a fraction of the exponential decay of tree line propagators and gives the scale factors (we will perform the detailed calculation in the following):
$$(x_j-x_{j-1})_0 \simeq \Lambda_{t(i)}^{-1}, \qquad (x_j-x_{j-1})_{r(r_j)} \simeq \Lambda_{t(i)}^{-1}, \qquad (x_j-x_{j-1})_{t(r_j)} \simeq \Lambda_{t(i)}^{-1/2}. \tag{2.49}$$
$(x^{(2)} - x^{(1)})$ gives the same factors, as it can be written as $\sum_j (x_j - x_{j-1})$. In the case of a real external line, we obtain immediately $|k-r_2|\,|x_j-x_{j-1}| \simeq \Lambda_{A(i)}\Lambda_{t(i)}^{-1}$, which is the factor we need to renormalize. In the case of a tree or loop half-line moved, we have to consider the different components. One sees immediately that the components $(k-r_2)_0(x_j-x_{j-1})_0$ and $(k-r_2)_{r(r_2)}(x_j-x_{j-1})_{r(r_2)}$ give the factor $\Lambda_{A(i)}\Lambda_{t(i)}^{-1}$ that we need to renormalize, but $(k-r_2)_{t(r_2)}(x_j-x_{j-1})_{t(r_2)}$ gives only $\Lambda_{A(i)}^{1/2}\, z_i\, \Lambda_{t(i)}^{-1/2}$, which is not sufficient. This is the main difficulty, announced in the Introduction, that we met in this paper: when trying to renormalize in phase space with anisotropic sectors, the internal decay of the tree does not necessarily match the external sector scales. To solve this problem we expand to second order, by $\int_0^1 dt\, F'(t) = F'(0) + \int_0^1 dt\,(1-t)\,F''(t)$, and we prove that the first order term which gives the bad power counting factor is actually zero. Then we optimize the bound obtained with respect to $z_i$. Indeed this second order Taylor formula gives for loop and external lines
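$$\begin{aligned}
\int_0^1 dt\, \frac{d}{dt}\Big[ C_{\theta_2}(x^{(2)}(t),y^{(2)})\, e^{-ir_2x^{(2)}(t)} \Big]
&= (x^{(2)}-x^{(1)})^\mu\, \frac{\partial}{\partial x^{(1)\mu}}\Big[ C_{\theta_2}(x^{(1)},y^{(2)})\, e^{-ir_2x^{(1)}} \Big] \\
&\quad + (x^{(2)}-x^{(1)})^\mu (x^{(2)}-x^{(1)})^\nu \int_0^1 dt\,(1-t)\, \frac{\partial}{\partial x^{(2)\mu}(t)} \frac{\partial}{\partial x^{(2)\nu}(t)} \Big[ C_{\theta_2}(x^{(2)}(t),y^{(2)})\, e^{-ir_2x^{(2)}(t)} \Big],
\end{aligned} \tag{2.50}$$
where we applied $x^{(2)}(0) = x^{(1)}$. For tree lines we have
$$\sum_{j=1}^{q} \int_0^1 dt\, \frac{d}{dt}\Big[ C_{\theta_2}(x_j(t),y^{(2)})\, e^{-ir_2x_j(t)} \Big] = \sum_{j=1}^{q} \frac{d}{dt}\Big[ C_{\theta_2}(x_j(t),y^{(2)})\, e^{-ir_2x_j(t)} \Big]_{t=0} + \sum_{j=1}^{q} \int_0^1 dt\,(1-t)\, \frac{d^2}{dt^2}\Big[ C_{\theta_2}(x_j(t),y^{(2)})\, e^{-ir_2x_j(t)} \Big]. \tag{2.51}$$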
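The second order Taylor formula used in (2.50) and (2.51), $\int_0^1 F'(t)\,dt = F'(0) + \int_0^1 (1-t)F''(t)\,dt$, can be checked numerically. The following is only a minimal sketch: the function `F` is a hypothetical smooth stand-in for $t \mapsto C_{\theta_2}(x(t),y)\,e^{-ir_2x(t)}$, and `simpson` is a helper written for this illustration, not something taken from the paper.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Hypothetical smooth test function and its first two derivatives.
F = lambda t: math.exp(-t) * math.cos(3 * t)
dF = lambda t: -math.exp(-t) * (math.cos(3 * t) + 3 * math.sin(3 * t))
d2F = lambda t: math.exp(-t) * (6 * math.sin(3 * t) - 8 * math.cos(3 * t))

# Left side: integral of F' over [0, 1].
lhs = simpson(dF, 0.0, 1.0)
# Right side: first order term F'(0) plus the second order remainder.
rhs = dF(0.0) + simpson(lambda t: (1 - t) * d2F(t), 0.0, 1.0)
# Both must equal F(1) - F(0).
exact = F(1.0) - F(0.0)

print(lhs, rhs, exact)
```

The point of writing the remainder this way is that the first order piece $F'(0)$ is isolated, so that Lemma 3 can be applied to it separately.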
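The last sum on the right-hand side of the equation is a second order term:
$$\sum_{j=1}^{q} (x_j-x_{j-1})^\mu (x_j-x_{j-1})^\nu \int_0^1 dt\,(1-t)\, \frac{\partial}{\partial x_j^\mu(t)} \frac{\partial}{\partial x_j^\nu(t)} \Big[ C_{\theta_2}(x_j(t),y^{(2)})\, e^{-ir_2x_j(t)} \Big]. \tag{2.52}$$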
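The first sum on the right-hand side of the equation contains a first order and a second order term:
$$\begin{aligned}
\sum_{j=1}^{q} (x_j-x_{j-1})^\mu\, \frac{\partial}{\partial x_{j-1}^\mu}\Big[ C_{\theta_2}(x_{j-1},y^{(2)})\, e^{-ir_2x_{j-1}} \Big]
&= (x^{(2)}-x^{(1)})^\mu\, \frac{\partial}{\partial x^{(1)\mu}}\Big[ C_{\theta_2}(x^{(1)},y^{(2)})\, e^{-ir_2x^{(1)}} \Big] \\
&\quad + \sum_{j=1}^{q} (x_j-x_{j-1})^\mu \sum_{k=1}^{j-1} \Big\{ \frac{\partial}{\partial x_k^\mu}\Big[ C_{\theta_2}(x_k,y^{(2)})\, e^{-ir_2x_k} \Big] - \frac{\partial}{\partial x_{k-1}^\mu}\Big[ C_{\theta_2}(x_{k-1},y^{(2)})\, e^{-ir_2x_{k-1}} \Big] \Big\},
\end{aligned} \tag{2.53}$$
where we applied $x_j(0) = x_{j-1}$. The first term is the same first order term we obtain for loop or external lines, while the second term can be written as: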
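$$\begin{aligned}
&\sum_{j=1}^{q} (x_j-x_{j-1})^\mu \sum_{k=1}^{j-1} \int_0^1 dt\, \frac{d}{dt} \frac{\partial}{\partial x_k^\mu(t)} \Big[ C_{\theta_2}(x_k(t),y^{(2)})\, e^{-ir_2x_k(t)} \Big] \\
&\qquad = \sum_{j=1}^{q} \sum_{k=1}^{j-1} (x_j-x_{j-1})^\mu (x_k-x_{k-1})^\nu \int_0^1 dt\, \frac{\partial}{\partial x_k^\mu(t)} \frac{\partial}{\partial x_k^\nu(t)} \Big[ C_{\theta_2}(x_k(t),y^{(2)})\, e^{-ir_2x_k(t)} \Big]
\end{aligned} \tag{2.54}$$
and gives a second order term that adds to (2.52).

Lemma 3. The contribution coming from the component orthogonal to $\mathbf{r}_2$ of the first order term
$$\int d^3x^{(1)}\, d^3x^{(2)}\; g(x^{(1)},x^{(2)})\, C_{\theta_1}(x^{(1)},y^{(1)})\; e^{ir_2(x^{(2)}-x^{(1)})}\, (x^{(2)}-x^{(1)})_{t(r_2)} \Big[-ir_2 + \frac{\partial}{\partial x^{(1)}}\Big]_{t(r_2)} C_{\theta_2}(x^{(1)},y^{(2)}) \tag{2.55}$$
is zero.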
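Proof. The complete first order term is
$$\begin{aligned}
&\int d^3x^{(1)}\, d^3x^{(2)}\; g(x^{(1)},x^{(2)})\, C_{\theta_1}(x^{(1)},y^{(1)})\; e^{ir_2(x^{(2)}-x^{(1)})}\, (x^{(2)}-x^{(1)})^\mu\; i\Big[-r_2 + \frac{1}{i}\frac{\partial}{\partial x^{(1)}}\Big]_\mu C_{\theta_2}(x^{(1)},y^{(2)}) \\
&\qquad = \int d^3x^{(1)}\, C_{\theta_1}(x^{(1)},y^{(1)}) \Big[-r_2 + \frac{1}{i}\frac{\partial}{\partial x^{(1)}}\Big]_\mu C_{\theta_2}(x^{(1)},y^{(2)})\;\; i\int d^3x^{(2)}\, g(0,x^{(2)})\, e^{ir_2x^{(2)}}\, x^{(2)\mu},
\end{aligned} \tag{2.56}$$
where we applied the translational invariance of $g(x^{(1)},x^{(2)})$. Now
$$i\int d^3x^{(2)}\, g(0,x^{(2)})\, e^{ir_2x^{(2)}}\, x^{(2)\mu} = \frac{\partial}{\partial r_{2\mu}} \int d^3x^{(2)}\, g(0,x^{(2)})\, e^{ir_2x^{(2)}} = -\frac{\partial}{\partial k_\mu}\hat g(k)\Big|_{k=-r_2}. \tag{2.57}$$
To compute the expression we take the two spatial axes on the directions parallel and orthogonal to $-\mathbf{r}_2$. Then $\mathbf{k} = -\mathbf{r}_2$ means $k_1 = 1$, $k_2 = 0$, or in radial coordinates $\rho = 1$ and $\theta = 0$. As we said before, $\hat g(k)$ depends only on the zero component $k_0$ and on the modulus $\rho$ of the spatial vector:
$$\frac{\partial}{\partial\theta}\hat g(k) = 0 \qquad \forall\theta. \tag{2.58}$$
Now, applying
$$\frac{\partial \hat g(k)}{\partial k_i} = \frac{\partial \hat g(k)}{\partial\rho}\frac{\partial\rho}{\partial k_i} + \frac{\partial \hat g(k)}{\partial\theta}\frac{\partial\theta}{\partial k_i} \tag{2.59}$$
for $i = 1, 2$ and the relations
$$\frac{\partial\rho}{\partial k_1} = \frac{k_1}{\rho}, \qquad \frac{\partial\rho}{\partial k_2} = \frac{k_2}{\rho}, \tag{2.60}$$
we obtain
$$\frac{\partial \hat g(k)}{\partial k_0}\Big|_{k=-r_2} \ne 0, \qquad \frac{\partial \hat g(k)}{\partial k_{r(r_2)}}\Big|_{k=-r_2} \ne 0, \qquad \frac{\partial \hat g(k)}{\partial k_{t(r_2)}}\Big|_{k=-r_2} = 0. \tag{2.61}$$
This ends the proof. □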
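The mechanism behind Lemma 3 is that a function depending on the spatial momentum only through its modulus $\rho$ has a vanishing derivative in the direction tangent to the Fermi circle. The sketch below illustrates this with central finite differences; `g_hat` is a hypothetical rotation-invariant profile chosen for the example and is not the actual subgraph amplitude of the paper.

```python
import math

def g_hat(k0, k1, k2):
    """Hypothetical profile depending on (k1, k2) only through rho = |(k1, k2)|."""
    rho = math.hypot(k1, k2)
    return math.exp(-k0 * k0) / (1.0 + (rho - 0.3) ** 2)

def partial(f, point, axis, h=1e-5):
    """Central finite difference of f along one coordinate axis."""
    up = list(point)
    dn = list(point)
    up[axis] += h
    dn[axis] -= h
    return (f(*up) - f(*dn)) / (2 * h)

# Axes chosen as in the proof: k1 along the radial direction, k2 tangential,
# evaluated at rho = 1, theta = 0 (k0 is an arbitrary illustrative value).
k_point = (-0.2, 1.0, 0.0)
d_k0 = partial(g_hat, k_point, 0)          # generally nonzero
d_radial = partial(g_hat, k_point, 1)      # generally nonzero
d_tangential = partial(g_hat, k_point, 2)  # vanishes by rotation invariance

print(d_k0, d_radial, d_tangential)
```

Only the tangential derivative is forced to vanish, which is exactly the component whose power counting was insufficient.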
Choice of zi . After putting the dangerous first order term to zero we are left with the bound 3 1 2 2A(i) A z t2(i) A(i) i A(i) zi2 (i) + 2 + + t (i) zi t (i) 2t (i) 2t (i) 1 2 1 zi A(i) 2 A(i) ≤ t2(i) , (2.62) + + t (i) zi t (i) t (i) where t (i) is the band index of the lowest tree line in the path joining the two external 1
vertices of gi , the global factor t2(i) comes from a gain in sector sum, as explained in Sect. 3.4 and in the second line we have extracted the renormalization factor and bounded [1 + t (i) /A(i) ] ≤ 2. To optimize the bound we study the function 2 zi + +c . (2.63) f (zi ) = zi b This function has a minimum at zi =
√ 1 √ 2b = 2t2(i) whose value is
√ 1 1 2 2 √ 2 2 + A(i) ≤ 1 2 2 + A(i) ≤ K . 1 1 1 1 t (i) t2(i) t2(i) t2(i) t2(i)
(2.64)
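The optimization over $z_i$ can be illustrated numerically. The sketch below assumes that (2.63) reads $f(z) = 2/z + z/b + c$, which is the form consistent with the stated minimizer $z = \sqrt{2b}$; the values of `b` and `c` are hypothetical, with `b` playing the role of the scale attached to $t(i)$.

```python
import math

def f(z, b, c):
    """Assumed reading of (2.63): f(z) = 2/z + z/b + c."""
    return 2.0 / z + z / b + c

b, c = 0.01, 0.5  # hypothetical illustrative values

# Stationary point from f'(z) = -2/z**2 + 1/b = 0.
z_star = math.sqrt(2 * b)
f_star = f(z_star, b, c)

# Independent brute-force scan around the candidate minimizer.
grid = [z_star * (0.2 + 0.001 * n) for n in range(1, 4000)]
z_scan = min(grid, key=lambda z: f(z, b, c))

print(z_star, z_scan, f_star)
```

Under this reading the minimum value is $2\sqrt{2}/\sqrt{b} + c$, so the optimized bound costs an inverse square root of the scale, in line with the remark that the bad factor has to be compensated by the sector sum.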
This bad factor is compensated by the gain on the sector sum.

3. Main Bound

Now we have all we need to perform the bound. We introduce absolute values inside the sums and integrals. As in [DR1], tree line propagators are used to perform spatial integrals and the loop propagator is bounded through a Gram inequality. The difference is that now some propagators (tree, loop or external) have been moved, and bear one or two derivatives, hence giving a different scaling factor. Furthermore some loop propagators have been taken out of the determinant, and there are some additional distance factors to bound, coming from the renormalization factors.

3.1. Loop lines. For each $g_i \in D_l(C,P)$ the interpolation (2.42) applies to the determinant, or to a matrix element that has been extracted. The distance factors and the integral over $t$ are taken out of the determinant by multi-linearity. Then we apply the Gram inequality as in Sect. IV.1 of [DR1]. Loop lines in $P$ are bounded by a Schwarz inequality,
$$\big|\langle F_f(x_f), G_g(x_g)\rangle\big| \le \|F_f\|\,\|G_g\|. \tag{3.1}$$
The interpolated half-line functions $F_f$ or $G_g$ will have some factors $(k-r_f)_\mu$ or $(k-r_g)_\mu$ (actually the two ends of a matrix element could be both interpolated), that modify the estimation of their norms $\|F_f\|$, $\|G_g\|$. For $f \in L$ being the interpolated
line for the subgraph gi , each (k − rf )0 and (k − rf )r(rf ) adds a factor (α − 2 )2 in the 2 1 1 2 2 integral [DR1](4.14), while each (k − rf )t (rf ) adds a factor M(f,C ) t (i) as we are 1
integrating |Ff |2 . Hence, for each gi ∈ Dl (C, P ) the contribution to the bound at the first order is 1 1 2 (2) (1) (2) (1) 3 3 4 |xi − xi |0 + |xi − xi |r(rai ) (2) (2) − . (3.2) (2) M(hi ,C )
M(hi ,C )
At the second order it is given by three terms: 1 4 (2) 5 (2) − 5 M(hi ,C )
for the distance factor
(2)
|xi
M(hi ,C )
(1)
(2)
− xi |0 + |xi
1 2
(1)
− xi |r(r2 )
(2) M(hi ,C )
1 3 (2) M(hi ,C )
−
(3.3)
A(m(h(2) i ,C ))
1 4
A(m(hi ,C ))
2
3
A(m(h(2) i ,C ))
2
,
1
(3.4) 1
2
(2) M(hi ,C )
t2(i)
for the distance factor (2) (1) (2) (1) (2) (1) |xi − xi |0 + |xi − xi |r(r2 ) |xi − xi |t (r2 ) , and
)
1
4
(2)
M(hi ,C )
M(a,C ) − A(m(a,C ))
*1 2
M(h(2) ,C ) t (i) i
(3.5)
(3.6)
(3.7)
for (2)
|xi
(1)
− xi |2t (r2 ) .
(3.8)
Then, the loop determinant times the product of extracted loop propagators is bounded by the usual term 1 3 A(m(a,C )) 2 4 1 − M(a, C) M(a,C ) a∈L
√ √ 3 5 (where we applied the relations 1−x 3 and 1−x 5 for x ≤ 1) times the 1−x ≤ 1−x ≤ terms coming from renormalization: 0 (2) (1) (2) (1) (3.9) |xi − xi |0 + |xi − xi |r(r2 ) A(i) gi ∈Dl (C ,P )
2 (2) (1) (2) (1) + |xi − xi |0 + |xi − xi |r(r2 ) 2A(i) 3 1 (2) (1) (2) (1) (2) (1) 2 2 + |xi − xi |0 + |xi − xi |r(r2 ) |xi − xi |t (r2 ) A t (i) (i) 1 (2) (1) +|xi − xi |2t (r2 ) A(i) t (i) 1 0 (2) (1) (2) (1) ≤ |xi − xi | A(i) + |xi − xi |2 A(i) t (i) . gi ∈Dl (C ,P )
3.2. External lines. It is easy to see that, when some external test function is moved, the bound (exponential decay) obtained in [DR1], Sect. 4.2.2, is multiplied by the factor
$$\prod_{g_i\in D_e(C,P)} \Big[\, |x_i^{(2)}-x_i^{(1)}|\,\Lambda_{A(i)} + |x_i^{(2)}-x_i^{(1)}|^2\,\Lambda^2_{A(i)} \Big]. \tag{3.10}$$

3.3. Tree lines. As we said, interpolated tree lines are moved along the connection between the external vertices of any graph provided by the tree. But, as the tree itself is modified by renormalization, this process has to be inductive, starting from the smallest graph and going down towards the biggest. We take for this construction the same rules as in [DR2], with some simplifications as we do not treat four point subgraphs. We remark that only the renormalization of subgraphs in $D_t(C,P)$ can modify the tree. Our induction progressively creates a new tree $T(J)$. To describe it, we number the subgraphs in $D_t(C,P)$ in the order we meet them, $g_1, \ldots, g_r$. At the stage $1 \le p \le r$, before renormalization of $g_p$, the tree is called $T(J_{p-1})$. Then we interpolate the external line of $g_p$ following the unique path in $T(J_{p-1})$ connecting the two external vertices of $g_p$. Then we update $J$ and $T$. We define $J_p = J_{p-1}$ for the first order term, as the propagator hooks to the reference vertex, and $J_p = J_{p-1} \cup \{j\} \cup \{k\}$ for the second order term, where $j$ and $k$ are the indices of the lines of $T(J_{p-1})$ chosen by renormalization. Finally we update the tree according to Fig. 6.

[Fig. 6. Three possible updatings of the tree (Case 1, Case 2, Case 3)]
In the following we will call $D_t^1(C,P)$ the set of subgraphs with the interpolated line fixed by $J$ on the reference vertex (hence giving a first order term), and $D_t^2(C,P)$ the set of subgraphs with the interpolated line fixed by $J$ on some tree line $x_k - x_{k-1}$, $k \le j$ (hence giving a second order term).

3.3.1. Spatial decay. It is easy to see that the interpolated line for the subgraph $g_i$ has the same spatial decay as the non interpolated one [DR1] (4.18), times a factor
$$\Big[\, |x_i^{(2)}-x_i^{(1)}|_0 + |x_i^{(2)}-x_i^{(1)}|_{r(r_2)} \Big]\,\Lambda_{A(i)} \tag{3.11}$$
if $g_i \in D_t^1(C,P)$. If $g_i \in D_t^2(C,P)$ we have three multiplying factors depending on the components of the scaling factors
$$|x_j-x_{j-1}|_\mu\, |x_k-x_{k-1}|_\nu. \tag{3.12}$$
If $\mu$ and $\nu \in (0, r(r_2))$ we have the multiplying factor $\Lambda^2_{A(i)}$. If $\mu$ or $\nu$ is $t(r_2)$ and the other belongs to $(0, r(r_2))$ we have the factor $\Lambda^{3/2}_{A(i)}\Lambda^{1/2}_{t(i)}$. Finally, if $\mu$ and $\nu = t(r_2)$ we have the factor $\Lambda_{A(i)}\Lambda_{t(i)}$. Before going on we take a fraction $(1-\varepsilon)$ of the exponential decay to ensure the decay between the maxima of the test functions of Theorem 3 as in [DR1] (4.20). Of the remaining decay a fraction $\varepsilon/2$ will be used to bound the distance factors and the other to perform spatial integrals.
3.3.2. Bounding distance factors. For each renormalized subgraph $g_i$ we have to bound one or two distance factors, depending on whether it belongs to $D^1(C,P)$ or $D^2(C,P)$, which are the subsets of subgraphs that give a first order or a second order term respectively. These sets can be cut in turn into $D_l^m(C,P)$, $D_e^m(C,P)$ and $D_t^m(C,P)$, $m = 1, 2$, for loop, external and tree lines respectively moved. Then we have to bound the quantity
$$A(x,J,T) = \prod_{g_i\in D^1(C,P)} |x_i^{(2)}-x_i^{(1)}| \prod_{g_i\in D_t^2(C,P)} |x_j-x_{j-1}|\,|x_k-x_{k-1}| \prod_{g_i\in D_l^2(C,P)\cup D_e^2(C,P)} |x_i^{(2)}-x_i^{(1)}|^2 \prod_{l\in T(J)} e^{-a\frac{\varepsilon}{2}\left(|\bar x_l-x_l|\,\Lambda_l\right)^{1/s}},$$
where we have taken the same spatial decay (actually the worst) for all directions. For each loop or external line the difference $|x_i^{(2)}-x_i^{(1)}|$ can be bounded, applying several triangular inequalities, by the sum over the tree lines on the unique path in $T(J)$ connecting $x_i^{(2)}$ to $x_i^{(1)}$. We observe that the same tree line $l_j$ can appear in several paths connecting different pairs of points $x_i^{(2)}$, $x_i^{(1)}$. Using the same fraction of its exponential decay many times might generate some unwanted factorials, as $\sup_x x^n \exp(-x) = (n/e)^n$. To avoid this problem we define $D_j$ as the set of subgraphs $g_i \in D(C,P)$ that use the tree distance $|\bar x_{l_j} - x_{l_j}|$ and we apply the relation
$$e^{-a\frac{\varepsilon}{2}\left(|\bar x_{l_j}-x_{l_j}|\,\Lambda_{l_j}\right)^{1/s}} \le \prod_{g_i\in D_j} e^{-a\frac{\varepsilon}{2}\,|\bar x_{l_j}-x_{l_j}|^{1/s}\left(\Lambda_{t(i)}^{1/s}-\Lambda_{A(i)}^{1/s}\right)}. \tag{3.13}$$
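The remark that reusing the same exponential decay $n$ times costs $\sup_x x^n e^{-x} = (n/e)^n$ can be checked directly. The following sketch compares the closed form with a grid scan and with $n!$; the helper names `peak` and `scanned_sup` are introduced here for illustration only.

```python
import math

def peak(n):
    """Value of x**n * exp(-x) at its maximizer x = n, i.e. (n/e)**n."""
    return n ** n * math.exp(-n)

def scanned_sup(n, points=20000):
    """Grid estimate of the supremum of x**n * exp(-x) on [0, 4n]."""
    xs = (4.0 * n * i / points for i in range(points + 1))
    return max(x ** n * math.exp(-x) for x in xs)

# Each row: (n, closed form, grid estimate, n!).
rows = [(n, peak(n), scanned_sup(n), math.factorial(n)) for n in (1, 2, 5, 10)]

for n, closed, scanned, fact in rows:
    print(n, closed, scanned, fact)
```

Since $(n/e)^n$ grows like $n!$ up to a power of $n$, using one decay factor for $n$ distance factors would produce factorial growth, which is why (3.13) assigns a separate piece of the decay to each subgraph.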
With this expression a different decay factor is used for each subgraph. Now applying this result and the inequality $x\,e^{-x^{1/s}} \le s!$ we prove the bound:
$$\sup_x |A(x,J,T)| \le K(s)^{\bar n} \prod_{g_i\in D^1(C,P)} \Big[1-\Big(\frac{\Lambda_{A(i)}}{\Lambda_{t(i)}}\Big)^{1/s}\Big]^{-s} \frac{1}{\Lambda_{t(i)}} \prod_{g_i\in D^2(C,P)} \Big[1-\Big(\frac{\Lambda_{A(i)}}{\Lambda_{t(i)}}\Big)^{1/s}\Big]^{-2s} \frac{1}{\Lambda^2_{t(i)}}, \tag{3.14}$$
where $K(s)$ is some function of $s$. The remaining differences are dangerous as they appear with a negative exponent. This happens because in this continuous formalism one has to perform renormalization even when the differences between internal and external scales of subgraphs are arbitrarily small. The solution of this problem is given by loop line factors. Indeed any renormalized subgraph has necessarily internal loop lines,
which give small factors when the differences between internal and external scales of subgraphs become arbitrarily small. By Lemma 9 in [DR2] we know that, for each $g_i \in D(C,P)$ there are at least two loop lines internal to $g_i$ which satisfy $\Lambda_{M(a,C)} \le \Lambda_{t(i)}$ and $\Lambda_{A(m(a,C))} \ge \Lambda_{A(i)}$. Then for each $g_i \in D^1(C,P)$, we have to bound
$$f_1(x) = \frac{[1-x]^2}{\big[1-x^{1/s}\big]^{s}}, \tag{3.15}$$
and for each $g_i \in D^2(C,P)$
$$f_2(x) = \frac{[1-x]^2}{\big[1-x^{1/s}\big]^{2s}}, \tag{3.16}$$
where we defined $x = \Lambda_{A(i)}/\Lambda_{t(i)}$. We remark that $f_1(x) \simeq (1-x)^{2-s}$ for $x \to 1$ while $f_2(x) \simeq (1-x)^{-2(s-1)}$. Therefore choosing $1 < s < 3/2$, $f_1$ is bounded near $x = 1$, and $f_2$ is integrable. We bound
$$f_1(x) \le \sup_{x\in[0,1]} f_1(x) \tag{3.17}$$
and we keep $f_2$ to be bounded when the integration over the parameters $w$ will be performed. Finally the factors $\big[1 - \Lambda_{A(m(a,C))}/\Lambda_{M(a,C)}\big]^{1/2}$ that are not used are bounded by 1.

3.3.3. Sum over $J$. We bound the sum over $J$ by taking the $\sup_J$ times the cardinal of $J$. In [DR2], Lemma 7, it is proved that $|J| \le K^{\bar n}$ for some constant $K$.

3.3.4. Spatial integration. To perform spatial integration we use the remaining tree line decay
$$\prod_{l\in T(J)} e^{-a\frac{\varepsilon}{2}\left[\,|(\delta x_l)_0\,\Lambda_l|^{1/s} + |(\delta x_l)_r\,\Lambda_l|^{1/s} + |(\delta x_l)_t\,\Lambda_l^{1/2}|^{1/s}\right]}. \tag{3.18}$$
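The behavior of the functions $f_1$ and $f_2$ of (3.15) and (3.16) near $x = 1$ can be checked numerically. The sketch below assumes the readings $f_1(x) = (1-x)^2/(1-x^{1/s})^s$ and $f_2(x) = (1-x)^2/(1-x^{1/s})^{2s}$, which reproduce the exponents $2-s$ and $-2(s-1)$ quoted in the text; the value `s = 1.25` is an arbitrary choice in the allowed window $1 < s < 3/2$.

```python
def f1(x, s):
    """Assumed reading of (3.15)."""
    return (1 - x) ** 2 / (1 - x ** (1 / s)) ** s

def f2(x, s):
    """Assumed reading of (3.16)."""
    return (1 - x) ** 2 / (1 - x ** (1 / s)) ** (2 * s)

s = 1.25     # any value with 1 < s < 3/2
eps = 1e-6   # distance to x = 1

# Near x = 1 one has 1 - x**(1/s) ~ (1 - x)/s, hence
# f1 ~ s**s * (1-x)**(2-s) and f2 ~ s**(2s) * (1-x)**(2-2s).
ratio1 = f1(1 - eps, s) / (s ** s * eps ** (2 - s))
ratio2 = f2(1 - eps, s) / (s ** (2 * s) * eps ** (2 - 2 * s))

# f1 stays bounded on [0, 1): grid estimate of its supremum.
sup_f1 = max(f1(i / 1000, s) for i in range(1000))

print(ratio1, ratio2, sup_f1)
```

The exponent $2 - 2s$ is larger than $-1$ exactly when $s < 3/2$, which is the integrability condition used for $f_2$.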
These lines depend in general on the interpolation parameters t. In [DR2] it is proved that spatial integration performed with interpolated tree lines does not depend on the interpolating factor t and give the same result as integration with the starting tree T . Summarizing the results, tree lines are used for several purposes: extracting the exponential decay between the test functions maxima, bounding distance factors and performing spatial integration. The resulting bound is: 1
e
1
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
n¯ 1 3q
q=1
gi
∈D 1 (C ,P )
1 t (i)
gi
∈D 2 (C ,P )
, - 1 −2s A(i) s 1 1− . t (i) 2t (i)
(3.19)
3.4. Sector sum. We still have to perform the sum over sector choices corresponding to [DR1], (4.23). We do it in the same way as in Sect. 4.3 of [DR1]. The only difference is that, for a two-point subgraph gi , by momentum conservation, there is no sector choice at all: 1 4 −2 dθh(2) ,r(i) ϒ θiroot , θh(2) ,r(i) ≤ K, (3.20) 3 j (2) hi
j
,r(i)
i
(2) hi ,r(i)+1
i
and, for each gi ∈ D(C, P ) we have to count the number of choices for the addi(2)
tional refinement for the half-line hi
1 2
(2)
(2)
i(hi )
− 21
1 2
i(hi )
1
from a sector of size 2
into a sector of size
t (i) . This costs a factor t (i) . This term is dangerous as it is on the denominator. 1
To compensate it we extract a factor t2(i) from the subgraphs gj of gi defined above. This factor is extracted inductively for j ∈ Cir . For each subgraph gj we distinguish two situations: 1
• if |egj (C)| > 4 we insert the identity 1 =
1
2 2 A(j ) j 1
1
j2
, where the second factor will
2 A(j )
be compensated by the convergent power counting of the subgraph gj ; • if |egj (C)| = 4 we observe (see Lemma 4 below) that we have counted one unneces1
1
2 2 sary sum over sector choices and we gain again a factor A (j ) /j . 1
Putting together all these terms we obtain the factor we want, namely t2(i) , times a factor 1
j2
gj |j ∈Cir ,|egj (C )|>4
2 A (j )
1
.
(3.21)
Lemma 4. Let the two point subgraph gi ∈ D(C, P ), and the four point subgraph gj , j ∈ Ji and |eej∗ | = 0, be fixed. Then the number of sector choices predicted by [DR1], Lemma 5, (4.26) must be modified: 4 m=2
1 4 −2 3 A(j )
j
(m) hj ,r(j )+1
, (root) dθh(m) ϒ θj , {θh(m) ,r(j ) }m=2,3,4 ≤ K j
j
(3.22)
for some constant $K$.

Proof. We observe that $\theta_{h_j^{(2)}}$ actually is fixed by the momentum conservation for the two external lines of $g_i$ on an interval of size $\Lambda^{1/2}_{A(i)}$. In the following we write explicitly the $q$ dependence: $j = j_q$. We distinguish then three possible situations.

1. $h^{(3)}_{j_q}$ and $h^{(4)}_{j_q}$ are both loop half-lines (see Fig. 7). Then they are contracted to some half-lines $h'^{(3)}_{j_q}$ and $h'^{(4)}_{j_q}$, that belong to some sector of size $\Lambda^{1/2}_{M^r(h'^{(m)}_{j_q},C)} \le \Lambda^{1/2}_{A(j_q)}$, $m = 3, 4$. Therefore by momentum conservation $\theta_{h^{(m)}_{j_q}}$ is restricted to the sector of $h'^{(m)}_{j_q}$, for $m = 3, 4$.

[Fig. 7. Tree lines are solid, loop lines are wavy; the arrows show the direction of the sector sum]

2. $h^{(3)}_{j_q} \in L$ and $h^{(4)}_{j_q} \in t_i$. Then we have two situations.
When $h^{(3)}_{j_q}$ contracts with some element of $L^L_{j_q}(i)$ (see Fig. 8), repeating the argument above, $\theta_{h^{(3)}_{j_q}}$ is restricted to an interval of width $\Lambda^{1/2}_{A(j_q)}$, and, by momentum conservation ([DR1], Appendix B), $\theta_{h^{(4)}_{j_q}}$ is restricted to an interval of the same size.
When $h^{(3)}_{j_q}$ contracts with some element of $L^R_{j_q}(i)$, there is a loop line $a$ connecting $L^L_{j_q}(i)$ with $L^R_{j_q}(i)$. This line is an external line of some subgraph in $T^R_{j_q}(i)$, say $g_{j'}$ (see Fig. 9a). Then we choose as root half-line for $g_{j'}$ the loop half-line $a$ instead of the tree root half-line $h^{root}_{j'}$ and, for all tree lines on the unique path connecting $v_{j'}$ to $v_{j_q}$, we can exchange $h^L$ and $h^R$ (see Fig. 9b; the new arrows show the direction towards this new root). Then $\theta_{h^{(4)}_{j_q}}$ is fixed in an interval of size $\Lambda^{1/2}_{A(j_q)}$.

[Fig. 8. $h^{(3)}_j$ contracts with some element of $L^R_j(i)$]

[Fig. 9. $h^{(3)}_{j_q}$ contracts with some element of $L^L_j(i)$ (panels a, b)]

3. $h^{(3)}_{j_q}$ and $h^{(4)}_{j_q} \in t_i$. We remark that $T^R_{j_q}(i)$ is separated into two subtrees, $T^{R(3)}_{j_q}(i)$ which is connected to $g_{j_q}$ through $l^{(3)}_j$ and $T^{R(4)}_{j_q}(i)$ which is connected to $g_{j_q}$ through $l^{(4)}_j$. There is a loop half-line $a$ hooked to $T^{R(3)}_{j_q}(i)$ or to $T^{R(4)}_{j_q}(i)$, contracting to some loop half-line in $L^L_{j_q}(i)$. Let us say $a$ is hooked in $T^{R(3)}_{j_q}(i)$ (see Fig. 10a). Then, repeating the same argument as above (see Fig. 10b), $\theta_{h^{(3)}_{j_q}}$ and $\theta_{h^{(4)}_{j_q}}$ are fixed in an interval of size $\Lambda^{1/2}_{A(j_q)}$.

[Fig. 10. $h^{(3)}_{j_q}$ and $h^{(4)}_{j_q}$ are tree half-lines, and there is a loop line connecting $T^{L}_{j_q}(i)$ with $T^{R}_{j_q}(i)$ (panels a, b)]

This ends the proof. □
We remark that this lemma is true also when |eej∗ | = 1 (it cannot be larger as |egi | = 2). The proof is even simpler, as the external line has fixed impulsion, then it consumes no sum.
3.5. Integration over the parameters i . Putting everything together, we can bound the sum (2.33): u |≤e |2p
1
1
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
n−1 ¯
T ≤A(i) ≤i ≤1 i=1
n−1 ¯
di
i=1
K0
cn¯ n¯ K n!n ! n≥1 ¯ CT S u−T L (E C J P
1 43 i v 2i a∈L M(a,C ) v∈V g | i=r,
i |egi (C)|≥11
1 2 A(i) jh,r(i)+1 − 21 iv 1 i gi | i=r, v∈V h∈I (i) j2 gi ∈D(C ,P ) h,r(i) 2≤|egi (C)|≤10 2 1 2 1 − A(i) j t (i) , 1 1 2s 2 A(i) s gi ∈D(C ,P ) j ∈Cir \Ji A(j ) gi ∈D 2 (C ,P ) 1−
1 2 jh,2 1 2 jh,1
h∈egi∗ jh,1 =i(h)=A(i)
(3.23)
t (i)
where we bounded δµiv (λ) ≤ K|λ|iv and |λ| ≤ c. We remark that sector counting for a vertex gives a factor depending from V only, as for a two point vertex no sum has to be paid. Now we can send to zero, and introduce the variables βi exactly as in Sect. 4.4 of [DR1] (see (4.31–33)) and obtain in these new coordinates: cn¯ n¯ K n!n ! n≥1 ¯ CT S u−T L (E C J P n−1 ¯ n−1 ¯ n−1 ¯ 3 3 − −1+(1−n¯ i ) ¯ dβi n−1 β βi2 −2 β 4 4
u |2p |
1
1
≤ K0 e
T
T i=1
v∈V
i=1
1
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
i
1 T βj
j ∈Civ
gi | i=r, 2≤|egi (C)|≤10
gi ∈D(C ,P )
βi
gi | i=r, |egi (C)|≥11
h∈I (i)
i=1
j ∈Cr(i)+1 \Cr(i)
gi ∈D(C ,P )
j ∈Cir \Ji
T
j ∈Ci
h∈egi∗ jh,1 =i(h)=A(i)
a∈L
j ∈Cr(i)+1 \Cr(i)
j j ∈CM(a,C)
T
1 1 βj2
1 21 − 21 βj T 1 v∈V j ∈Civ βj2
(3.24)
2 ( 1 − j ∈Ct (i) \CA(i) βj 1 , 1 1 2s s βj2 gi ∈D 2 (C ,P ) 1 − ( β j ∈Ct (i) \CA(i)
j
where n¯ i = ni + ni , and ni , ni are respectively the number of four points and two points vertex in gi . Now we compute power counting as in [DR1], Sect. 4.4, and we obtain the
same expressions, substituting n by n. ¯ The only different expressions are n−1 ¯ n−1 ¯ ni 1 −n βj2 = βi 2 βj−1 = βi i . v∈V
j ∈Civ
i=1
v∈V
j ∈Civ
(3.25)
i=1
Then we obtain u |2p | ≤ K0 e
¯ 1 n−1 T i=1
cn¯ n¯ K n!n ! n≥1 ¯ CT S u−T L (E C J P 2 ( 1 − j ∈Ct (i) \CA(i) βj . (3.26) ( 1 2s s gi ∈D 2 (C ,P ) 1 − j ∈Ct (i) \CA(i) βj 1
1
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
dβi βi−1+xi
5−3p
T 2
To compute xi we apply the following relation: |ili (C)| = 2ni + 2 − |egi (C)|.
(3.27)
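For all $g_i$ with $|eg_i(C)| > 4$, $i \neq r$, such that there exists some $g_{i'} \in D(C,P)$ with $i \in C^r_{i'}\setminus J_{i'}$, the factor $x_i$ is given by:
$$\begin{aligned}
x_i &= (\bar n_i - 1) - \tfrac34\,|il_i(C)| + \tfrac12\, n_i - n'_i - \tfrac12\,(|eg_i(C)| - 3) - \tfrac12 \\
&= \tfrac14\,(|eg_i(C)| - 6) \qquad \text{if } 4 < |eg_i(C)| \le 10, \\
x_i &\ge (\bar n_i - 1) - \tfrac34\,|il_i(C)| + \tfrac12\, n_i - n'_i - \tfrac12\,(|eg_i(C)| - 1) - \tfrac12 \\
&= \tfrac14\,(|eg_i(C)| - 10) \qquad \text{if } |eg_i(C)| > 10,
\end{aligned} \tag{3.28}$$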
where the last term $-1/2$ corresponds to the factor extracted to perform the sector sum in Sect. 3.4. For the remaining $g_i$ with $|eg_i(C)| \ge 4$, $i \neq r$, we have the usual power counting
$$\begin{aligned}
x_i &= (\bar n_i - 1) - \tfrac34\,|il_i(C)| + \tfrac12\, n_i - n'_i - \tfrac12\,(|eg_i(C)| - 3) \\
&= \tfrac14\,(|eg_i(C)| - 4) \qquad \text{if } 4 \le |eg_i(C)| \le 10, \\
x_i &\ge (\bar n_i - 1) - \tfrac34\,|il_i(C)| + \tfrac12\, n_i - n'_i - \tfrac12\,(|eg_i(C)| - 1) \\
&= \tfrac14\,(|eg_i(C)| - 8) \qquad \text{if } |eg_i(C)| > 10.
\end{aligned} \tag{3.29}$$
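The power counting above is plain rational arithmetic, so it can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes the coefficients 3/4, 1/2, 1 and 1/2 in front of $|il_i|$, $n_i$, $n'_i$ and the external-leg term, together with relation (3.27), and the function names are hypothetical.

```python
from fractions import Fraction


def loop_fields(n4, eg):
    """Relation (3.27): |il_i| = 2 n_i + 2 - |eg_i| (assumed reading)."""
    return 2 * n4 + 2 - eg


def x_usual(n4, n2, eg):
    """Usual power counting, as in (3.29), for a subgraph with n4 four-point
    vertices, n2 two-point vertices and eg external legs."""
    nbar = n4 + n2
    leg_offset = 3 if eg <= 10 else 1  # (|eg| - 3) up to 10 legs, (|eg| - 1) above
    return ((nbar - 1)
            - Fraction(3, 4) * loop_fields(n4, eg)
            + Fraction(1, 2) * n4
            - n2
            - Fraction(1, 2) * (eg - leg_offset))


def x_with_sector_sum(n4, n2, eg):
    """Power counting with the extra -1/2 paid for the sector sum, as in (3.28)."""
    return x_usual(n4, n2, eg) - Fraction(1, 2)


def x_two_point(n4, n2):
    """Renormalized two-point subgraph: no leg term, +1 from renormalization."""
    nbar = n4 + n2
    return ((nbar - 1)
            - Fraction(3, 4) * loop_fields(n4, 2)
            + Fraction(1, 2) * n4
            - n2
            + 1)
```

With these conventions a six-point subgraph has $x_i = 1/2$ in the usual case and $x_i = 0$ (logarithmic) once the sector-sum factor is paid, independently of the numbers of vertices.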
For the subgraph $g_r$, when $|eg_r| > 2$, there is no sector sum; then we have
$$x_r = \frac14\,(3|eg_r(C)| - 10) = \frac{3p - 5}{2}. \tag{3.30}$$
We remark that in the first situation six-point subgraphs become logarithmically divergent, while the other ones still have $x_i > 0$; this is a price to pay for our anisotropic analysis.
Interacting Fermi Liquid in Two Dimensions at Finite Temperature. II
However by Lemma 5 (see (3.34)) $x_i$ is still proportional to the number of tree external lines $|et_i|$, which is crucial to perform the sum over partial orders. This is the reason why, when introducing classes in [DR1], we have selected up to 11 external lines per subgraph. Finally we consider two-point subgraphs. For all $g_i \in D(C,P)$, $i \neq r$, we have
$$x_i = (\bar n_i - 1) - \tfrac34\,|il_i(C)| + \tfrac12\, n_i - n'_i + 1 = 1 - 1 = 0, \tag{3.31}$$
where the term 1 comes from renormalization. The corresponding power counting is logarithmic in T . In the particular case of i = r, the renormalizing factor is e /t (r) . Then the last integral gives e T
1 T
dβr βr−1−1+1 ≤
e ln T . T
(3.32)
We have still to consider the non-renormalized two-point subgraphs $g_i \in ND(C,P)$, which have $x_i = -1$. In the case $i \neq r$ and $g_i$ 1PR, the external momentum at scale $\Lambda_{A(i)}$ is equal to that of some internal line $l_j$. Since our Gevrey cutoffs have compact support, this forces a relation between external and internal scales, namely $\Lambda_i^{1/2} \le \sqrt2\,\Lambda_{A(i)}^{1/2}$. This means $\Lambda_i \le 2\Lambda_{A(i)}$, or equivalently $\beta_i \ge \frac12$. The corresponding integral is then bounded by a constant:
$$\int_{\frac12}^{1} d\beta_i\, \beta_i^{-1-\frac12} = 2(\sqrt2 - 1).$$
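The value $2(\sqrt2 - 1)$ quoted for this integral can be confirmed numerically. The sketch below assumes the integrand is $\beta_i^{-3/2}$ on $[1/2, 1]$ (the reading consistent with the stated value); the helper `midpoint_integral` is a hypothetical name, not something from the paper.

```python
import math


def midpoint_integral(f, a, b, steps=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# Integrand beta^(-1 - 1/2) on [1/2, 1], compared with the closed form.
numeric_value = midpoint_integral(lambda beta: beta ** -1.5, 0.5, 1.0)
closed_form = 2 * (math.sqrt(2) - 1)
```

The antiderivative is $-2\beta^{-1/2}$, so the closed form follows from evaluating it at $1$ and $1/2$.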
In the case i = r and gi 1PR, (hence for |egr | = 2p = 2) the external impulsion at scale e (which is not necessarily near T ) is √ equal to that of some internal line li . Then the internal scale is restricted to the interval 2e ≤ i ≤ 2e . This means that T r ≤ 2e , hence βr ≥ 2 . The last integral then gives e
1 T 2e
dβr βr−1−1 ≤
2e . T
(3.33)
Finally in the case of $g_i$ 1PI and $h_i^{(2)} = e_j$, we know that $\Lambda_e \ge \Lambda_i/2\sqrt2$. This imposes a constraint on $\beta_i \ge \Lambda_{A(i)}/2\sqrt2\,\Lambda_e$; then the integral over $\beta_i$ is bounded as in the equation above. Now Lemma 7 in [DR1] can be generalized:

Lemma 5. For any subgraph $g_i$ ($i \neq r$) with $|eg_i| > 6$ we have
$$x_i \ge \frac{|et_i|}{44}. \tag{3.34}$$
Proof. For $g_i$ with $|eg_i| > 4$ and such that there is no $g_{i'} \in D(C,P)$ with $i \in C^r_{i'}\setminus J_{i'}$, Lemma 7 of [DR1] applies directly. For $g_i$ with $|eg_i| > 6$ and such that there is some $g_{i'} \in D(C,P)$ with $i \in C^r_{i'}\setminus J_{i'}$, we have to bound $\frac14(|eg_i(C)| - 10)$ for $|eg_i| > 10$ or $\frac14(|eg_i(C)| - 6)$ for $|eg_i| \le 10$. Then we apply the same reasoning as in Lemma 7 of [DR1]. □
To complete the bound we must factorize the integrals over the $\beta$ parameters as in [DR1]. Some sets of $\beta_j$ are not independent yet:
$$\prod_{g_i\in D^2(C,P)}\ \prod_{j\in C_{t(i)}\setminus C_{A(i)}} \int_T^1 d\beta_j\, \beta_j^{-1+x_j}\ \frac{\Big[1-\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big]^2}{\Big[1-\Big(\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big)^{\frac1s}\Big]^{2s}}. \tag{3.35}$$
The mixed term has an integrable singularity at the point $\beta_j = 1$, $\forall j \in C_{t(i)}\setminus C_{A(i)}$. We decompose the integration domain of each $\beta_j$ into two subsets $[T,1] = I^1 \cup I^2$, where $I^1 = [T,1/2]$ and $I^2 = [1/2,1]$. The integral above, for a fixed $g_i \in D^2(C,P)$, is written as
$$\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\ \sum_{m_j=1,2}\ \int_{I^{m_j}} d\beta_j\, \beta_j^{-1+x_j}\ \frac{\Big[1-\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big]^2}{\Big[1-\Big(\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big)^{\frac1s}\Big]^{2s}}. \tag{3.36}$$
We distinguish two situations.
1. If $m_j = 1$ for some $j$, then some $\beta_j \le 1/2$, and the mixed term can be bounded by $1/[1-(1/2)^{\frac1s}]^{2s}$ and taken out of the integral. The integrals in (3.36) are then factorized.
2. If $m_j = 2$ $\forall j$, we have to compute
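$$\prod_{j\in C_{t(i)}\setminus C_i} \int_{\frac12}^1 d\beta_j\, \beta_j^{-1+x_j} \int_{\frac12}^1 d\beta_i\, \beta_i^{-1}\ \frac{\Big[1-\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big]^2}{\Big[1-\Big(\prod_{j\in C_{t(i)}\setminus C_{A(i)}}\beta_j\Big)^{\frac1s}\Big]^{2s}}, \tag{3.37}$$
where $\beta_i$ appears with exponent $-1$ because $g_i$ is a two point renormalized subgraph. Then $x_i = 0$. We perform the change of variable on $\beta_i$:
$$z := \beta_i c_i, \qquad c_i := \prod_{j\in C_{t(i)}\setminus C_i} \beta_j, \tag{3.38}$$
and the integral becomes:
$$\prod_{j\in C_{t(i)}\setminus C_i} \int_{\frac12}^1 d\beta_j\, \beta_j^{-1+x_j} \int_{\frac{c_i}{2}}^{c_i} dz\, z^{-1}\ \frac{[1-z]^2}{\big[1-z^{\frac1s}\big]^{2s}}. \tag{3.39}$$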
We observe that ci varies on the interval [2−|ci | , 1], where we defined |ci | as the number of βj in Ct (i) \Ci . To bound the integral over z and verify this bound does not depend on ci , we distinguish two cases. a: ci ≥
1 2
ci ci 2
then dzz
−1
[1 − z]2 ≤ 1 2s 1 − zs
1 2 ci 2
dzz
−1
+
ci 1 2
[1 − z]2 ≤ K; 1 2s 1 − zs
dzz−1
(3.40)
b: $c_i < \frac12$; then
$$\int_{\frac{c_i}{2}}^{c_i} dz\, z^{-1}\ \frac{[1-z]^2}{\big[1-z^{\frac1s}\big]^{2s}} \le K \int_{\frac{c_i}{2}}^{c_i} dz\, z^{-1} = K \log 2, \tag{3.41}$$
where K is a constant. In both cases the bound does not depend on ci and the integrals in (3.39) are factorized. Finally we can bound the integrals over the parameters βi :
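The key point in case b is that $\int_{c_i/2}^{c_i} dz\, z^{-1} = \log 2$ whatever the value of $c_i$, which is why the bound does not depend on $c_i$. A small numerical sketch of this scale invariance follows; the sample values of `c` and the helper name are illustrative assumptions.

```python
import math


def midpoint_integral(f, a, b, steps=2_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# c_i ranges over [2^(-|c_i|), 1]; sample a few values in that range.
sample_c = [2.0 ** -k for k in range(0, 8)]
log_integrals = [midpoint_integral(lambda z: 1.0 / z, c / 2, c) for c in sample_c]
max_deviation = max(abs(value - math.log(2)) for value in log_integrals)
```

Every entry of `log_integrals` agrees with $\log 2$ up to the quadrature error, as expected from the substitution $z \to z/c_i$.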
1
gi ∈/ND(C ,P ) T
dβi βi−1+xi ≤
| log T |
| log T |
gi ||egi |=4
gi ∈D(C ,P )
| log T |
j ∈Cir \Ji ||egj |=6
gi ∈D(C ,P )
gi ||egi |>6
1 . xi
(3.42)
Now, like in [DR1], Lemma 7, we can bound the vertex functions by u | |2p>4
1
≤ K0 e
1 |eti | i=r
1
5−3p
T 2 cn¯ n¯ K 3p − 5 n!n ! n≥1 ¯ CT S u−T L(E C J P I | log T | | log T | , (3.43)
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
gi ∈D(C ,P )
gi ∈D(C,P ) or gi ||egi |=4
j ∈Cir \Ji ||egj |=6
where the sum over I gives the choices of the integration domain of βi between I 1 and I 2 . The set I has then cardinal proportional to 2n¯ . If 2p = 4 or 2p = 2 we substitute an additional factor | log T | to the global factor 1/(3p − 5) in front of (3.43). Now we bound all the sums exactly as in [DR1] (the only difference being that we are working with n¯ instead of n). Finally we obtain u | |2p>4
1
≤ K0 e
5−3p
1
−a(1−ε)Ts dTs (xe1 ,...,xe2p )
1 T 2 p K1 (p!)2 (cK2 | log T |)n¯ , 3p − 5 n!n ! n≥1 ¯
|4u | ≤ K0 e |2u | ≤ K0 e
1 1 −a(1−ε)Ts dTs (xe1 ,...,xe4 )
1 1 −a(1−ε)Ts dTs (xe1 ,xe2 )
−1 T 2
1 (cK2 | log T |)n¯ . n!n ! n≥1 ¯
1 e (cK2 | log T |)n¯ . n!n !
(3.44)
n≥1 ¯
These sums are convergent for $cK_2|\log T| < 1$, which completes the proof of Theorem 3.

Appendix A: Flow of δµ

To study the flow of the chemical potential counterterm we introduce some definitions. We define $[\delta\mu, C]$ as the two point vertex function $\Gamma_2(\phi_1^0, \phi_2^0)$, 1PI and with at least one internal line, for a theory with bare chemical potential counterterm $\delta\mu$ and propagator $C$. The test functions are $\phi_1^0(x) = \delta(x)$ and $\phi_2^0(x) = e^{ik_F x}$, hence the external impulsion
is fixed to kF , as near as possible to the Fermi surface. The two fundamental equations are then 1 ], δµ1 (λ) = −[δµ1 (λ), C
1 1 1 δµ (λ) = δµ (λ) + [δµ (λ), C ]
≤ ≤ 1,
1 1 1 = [δµ1 (λ), C ] − [δµ (λ), C ].
(A.1)
These equations are consistent with the BPHZ condition δµ (λ) = 0. To study the flow 1 ] as an expression where the dependence from and is we write [δµ1 (λ), C explicit:
λn δµ1 n 1 1 ¯ [δµ (λ), C ] = ε(T , () (2)n−1 ! n! n CT S u−T
n+n ≥2 n≥1
n−1 ¯ T ≤A(i) ≤i ≤1 q=1 −1
2 ] . . . [ 43 jh,1
gi | i=r, |egi (C|)≤10
jh,2
dq
2 [
1 4 −2 3 jh,n
h
]
lf g ∈P
2π
0
h∈L∪TL n h θh,r dθh,1 χαjh,r (θh,1 ) r=2
dθ
1 4 −2 h,nh 3 jh,n −1 h
[
]
dθh,nh −1
jh,n
h
(A.2)
∗ (v) ϒ θiroot , {θh,r(i) }h∈egi∗ ϒ θhroot , {θ } h,n h∈H h v v∈V ∪V
d 3 x1 . . . d 3 xn¯ φ10 (xi1 , θe1 ,1 )φ20 (xj1 , θe2 ,1 )
L E( C J,P
Mf,g (C, E, {θa,1 })
n−1 ¯
C q (xq , x¯q , θh,1 )
q=1
jq ∈J 1
1 0
dsjq det M (C, E, {θa,1 }, {sjq }),
√ where T = for ≥ 2πT . We remark that appears only in the r integration and in some loop line propagators (those of the loop fields with m(a, C) = ). The power counting is performed as usual, passing to the variables βi defined by i = A(i) /βi . We want to prove, by induction, that the property H (), defined by (A.3) δµ (λ) ≤ K1 |λ|( − ) ∀ ≥ is true ∀ 0 ≤ ≤ 1. We suppose H () is true for a certain , then we prove Lemma 6 and 7, about the existence and the bound satisfied by the derivative. These lemmas ensure that H () is true for all . Indeed otherwise there exists m > 0 defined as m = inf {|H () is true}. ∈[0,1]
(A.4)
Then by Lemma 7 we can write a Taylor expansion at first order d |δµ−ε (λ)| ≤ |δµ (λ)| + δµ (λ) ε + o(ε) (A.5) d ≤ K1 |λ|( − ) + K3 |λ|ε + o(ε) ≤ K1 |λ|( − ( − ε))
Fig. 11. Two possible schemas (a, b) for the lowest band in a two point 1PI graph
for all ≥ . The same bound in the case − ε ≤ ≤ is proven in Lemma 8. This result contradicts the definition of m , therefore m = 0. Lemma 6. If H () is true then the derivative satisfies the bounds:
d 1 d δµ (λ),
≤ ≤ 1 exists and
d 1 d δµ (λ) ≤ K2 |λ| .
(A.6)
Proof. If the derivative exists, it satisfies the formal equation:
$$\frac{d}{d\Lambda}\,\delta\mu^1_\Lambda = A + \Big(\frac{d}{d\Lambda}\,\delta\mu^1_\Lambda\Big)\, B, \tag{A.7}$$
where $A$ is the expression for $\delta\mu^1_\Lambda$ with the derivative $d/d\Lambda$ applied to one propagator, and $B$ is the same expression for $\delta\mu^1_\Lambda$, but with one special two point vertex with value 1 instead of $\delta\mu^1_\Lambda$ (as the derivative of the corresponding factor has been taken out of the sum in (A.7)). Then, formally, the solution of (A.7) is
$$\frac{d}{d\Lambda}\,\delta\mu^1_\Lambda = \frac{A}{1-B}. \tag{A.8}$$
Now, if we can prove that A ≤ K λ and B ≤ 1/2, we obtain (A.6) with K2 = 2K . Bound on A. We remark that, as shown in Fig. 11 a,b, there is at least one loop line in the first band, obtained in the first case by 1PI, in the second case by parity of the number of external lines for any subgraph. Therefore the derivative d/d may apply only to a loop line propagator. Indeed, if it applies to the r integral, the first band width and the corresponding loop line amplitude are reduced to zero. The action of the derivative on the loop propagator is given by d M 2 C = − 3 Cα=−2 . d
(A.9)
Then, when performing the estimations we add the factor
1 M
≤
1 r
=
βr :
n−1 ¯ cn¯ n¯ n−1 ¯ K dq (2) n!n ! T ≤A(i) ≤i ≤1 q=1 n≥2 ¯ CT S u−T L (E C J P , 3 -, - n−1 ¯ −1 1 − 21 4 iv r iv M(a,C ) 2 i a∈L i=1 v∈V v∈V 1 1 2 2 A(i) jh,2 jh,r(i)+1 1 1 g | i=r, i gi | i=r, h∈egi∗ h∈I (i) j2 j2h,1 gi ∈D(C ,P ) i h,r(i)
ˆ ≤ K0 ||
|egi (C)|≥11
gi ∈D(C ,P )
2≤|egi (C)|≤10
jh,1 =i(h)=A(i)
1
i2
j ∈Cir \Ji
2 A (i)
1
gi ∈D 2 (C ,P )
1−
1−
A(i) 2 t (i)) 2s , 1 s A(i)
(A.10)
1
ts(i)
We remark that the set of renormalized subgraphs D(C, P ) does not contain the global graph gr . Passing to the variables βi = A(i) /i , we obtain 1 n−1 ¯ cn¯ βr n¯ ˆ ≤ K0 K dβi βi−1+xi , (A.11) || T n!n ! T A(i) n≥1 ¯
CT S u−T LC (E J P
i=1
where the integral limit is $\Lambda_{A(i)} \ge T$. We have not written the non-factorized terms, which appear in (3.28) and come from renormalized two point subgraphs, as their power counting is not modified at all. The factor $1/T$ cancels with the global factor $T$, giving a constant independent of $T$. The power counting of $\beta_r$ becomes logarithmic, instead of linearly divergent; this is the reason for which we can extract only one coupling constant $\lambda$. Indeed:
$$A \le |\lambda| \sum_{\bar n\ge 2} (|\lambda \ln\Lambda|)^{\bar n-1} \le |\lambda|\, \frac{|\lambda\ln\Lambda|}{1-|\lambda\ln\Lambda|} \le K_2|\lambda|. \tag{A.12}$$
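The bound on $A$ rests on summing a geometric series in the small parameter $|\lambda \ln\Lambda|$. The sketch below, with the hypothetical variable `x` standing for that parameter, compares the truncated sum with the closed form $x/(1-x)$ and checks that the closed form is at most $2x$ when $x \le 1/2$; the truncation order is an arbitrary choice.

```python
def tail_sum(x, terms=200):
    """Truncated sum over nbar >= 2 of x**(nbar - 1)."""
    return sum(x ** (nbar - 1) for nbar in range(2, terms + 2))


def closed_form(x):
    """Closed form x / (1 - x) of the full geometric sum, valid for 0 <= x < 1."""
    return x / (1.0 - x)


samples = [0.1, 0.25, 0.5]
series_errors = [abs(tail_sum(x) - closed_form(x)) for x in samples]
bounded_by_2x = [closed_form(x) <= 2 * x for x in samples]
```

For $x \le 1/2$ this gives a constant prefactor of at most 1 multiplying $|\lambda|$, in line with the final inequality of the estimate.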
Bound on B. The estimation for B is performed as that for δµ1 . The only difference is that, when a two point subgraph contains the special insertion, it is not renormalized, as the power counting is logarithmic instead of linearly divergent. This happens because there is one two point insertion (the special one) that is not compensated by the corresponding δµ scaling factor, then we have 1 dβi βi−1−1+1 ≤ | log |. (A.13)
Of course, the $\beta_r$ power counting becomes logarithmic too, as $g_r$ always contains the special insertion. The global factor is then cancelled by the global factor coming from the special insertion. Then
$$B \le |\lambda| \sum_{\bar n\ge 2} (|\lambda \ln\Lambda|)^{\bar n-1} \le |\lambda|\, \frac{|\lambda\ln\Lambda|}{1-|\lambda\ln\Lambda|} \le K_2|\lambda| \le \frac12 \tag{A.14}$$
for λ small enough.
Existence of the derivative. We still have to prove that the derivative exists. For that we apply the definition I,ε 1 d 1 1 δµ1 = lim [δµ1−ε , C−ε . ] − [δµ1 , C ] = lim ε→0 ε ε→0 ε d 1
(A.15)
The difference I1,ε can be written as 1 1 1 1 I1,ε = [δµ1−ε , C−ε ] − [δµ1 , C−ε ] + [δµ1 , C−ε ] − [δµ1 , C ] =
∞ p=1
I1,ε
p
Fp + A 1 + A 2 ,
(A.16)
1 , and A ] with one loop propagator C−ε where A1 is the expression for [δµ1 , C−ε 2 is the same expression, but this time with the first band r ≤ . Finally Fp is the 1 expression for [δµ1 , C−ε ] with p insertions of a special two point vertex, obtained by substituting the coefficient by 1. With the same kind of argument as before we can prove that |Fp | ≤ |λ|/p−1 , |A1 | ≤ ε|λ| and |A2 | ≤ ε 2 |λ|. Then we can prove that I1,ε exists and the derivative takes the form (A.7). !
Lemma 7. If H () is true then the derivative
d d δµ (λ) exists and satisfies the bound:
d d δµ (λ) ≤ K3 |λ|,
(A.17)
Proof. The proof is a direct consequence of Lemma 6. By the definition of δµ (λ) the derivative is given by d d δµ δµ1 (1 − F ), = d d
(A.18)
1 ] with one special insertion, that means one where F is the expression for [δµ1 , C two point vertex factor substituted by 1. As for B in Lemma 6, we can prove that
then
|F | ≤ |λ|,
(A.19)
d d δµ ≤ K2 |λ|(1 + |λ|)) ≤ K3 |λ|.
(A.20)
The existence of this derivative is a consequence of Lemma 6, as I ,ε
=
I1,ε
−
∞ p=1
where we can prove that |Fp | ≤ |λ|/p−1 .
!
I1,ε
p
Fp ,
(A.21)
Lemma 8. If the bound
δµ ≤ K1 |λ|( − )
(A.22)
is true for all ≥ 0 = + ε then it is true for 0 − ε for ε < ε small enough. Proof. For all ≥ 0 we can prove (by the same arguments as before) that d δµ d ≤ K2 |λ|. Then we can perform a first order Taylor expansion 0 −ε 0 d δµ ≤ δµ + δµ ε + o(ε ) d ≤ K2 |λ|(ε + ε ) + o(ε + ε ) ≤ K1 |λ|(ε − ε )
(A.23)
(A.24)
for ε small enough. We remark that we used the inequality 0 +ε +ε d ε + o(ε) ≤ K2 |λ|ε + o(ε). δµ = δµ ≤ δµ+ε + δµ d = =+ε (A.25) ! We remark that the differential RG equation (A.8) is simpler than its discretized counterpart (A.16). This is an advantage of the differential version of the RG. Appendix B: Study of the Selfenergy B 1. A loop expansion for extracting the self energy . (in collaboration with D. Iagolnitzer and J. Magnen) Our tree formula selects connected graphs. But the self energy is the sum over all non-trivial two point connected subgraphs which are furthermore 1PI-irreducible (with respect to the single channel between the two external points). In this appendix we apply an (unpublished) formula, due to D. Iagolnitzer and J. Magnen, which proves that this additional information of 1PI in a single channel can be extracted by expanding some loops out of the determinant without generating any factorials in the bounds. The arch expansion. We consider a two point connected graph with n vertices, equipped with its spanning tree T . If the two (amputated) external lines are hooked to the same vertex, we have a “generalized tadpole” which is automatically 1PI, hence belongs to the self energy. In that case no additional expansion is performed. Otherwise there is a unique non-empty linear path P1,2 made of p − 1 ≤ n − 1 lines in the tree T joining the two external vertices x (1) = x1 and x (2) = xp through the intermediate vertices x2 , . . . , xp−1 . The set V of vertices of G is then the disjoint union of the sets Vj , j = 1, . . . , p, where a vertex belongs to Vj if and only if the unique path in T joining it to the root x1 passes through xj but not through xj +1 (Fig. 12). We call F ({Ci,j }) the loop determinant of the remaining fields (it depends on the weakening w parameters, but this is completely irrelevant in what follows). Expanding
Fig. 12. Example of tree, with $p = 5$ vertices on $P_{1,2}$. Loop fields are dashed, tree lines are solid, and the amputated external lines are darker
completely $F$ would cost $n!$. But we just want to know if the graph is 1PI, which means 1PI with respect to the $p - 1$ lines of $P_{1,2}$, since by parity there cannot be 1 particle reducibility in a 0-2 channel. We perform an auxiliary expansion à la Brydges–Battle–Federbush, and we call it the “arch expansion”. This means that we first test if some vertex of $V_1$ is linked to some vertex of $V_{k_1}$, with $k_1 > 1$. This is done by introducing the interpolation parameter $0 \le s_1 \le 1$ and defining
$$C_{ij}(s_1) := \begin{cases} s_1 C_{ij} & \text{if } i \in V_1,\ j \notin V_1, \\ C_{ij} & \text{otherwise.} \end{cases} \tag{B.1}$$
Then we can write
$$F(\{C_{i,j}\}) = F(\{C_{i,j}(s_1)\})\Big|_{s_1=1} = F(\{C_{i,j}(s_1)\})\Big|_{s_1=0} + \int_0^1 ds_1\, \frac{d}{ds_1} F(s_1). \tag{B.2}$$
The first term $s_1 = 0$ means that the graph is 1PR (by cutting the first line of $P_{1,2}$, as no loop line connects $V_1$ to its complement). Otherwise we derive an explicit loop line out of the determinant, which connects a vertex of $V_1$ to a vertex of $V_{k_1}$, for some $k_1 > 1$ (see Fig. 13, where we have $k_1 = 3$). If $k_1 = p$ we are done since the full graph is 1PI. Otherwise we repeat the same procedure, but between $\cup_{l=1}^{k_1} V_l$ and its non-empty complement, introducing a second interpolation parameter $0 \le s_2 \le 1$:
$$C_{ij}(s_1, s_2) := \begin{cases} s_2 C_{ij}(s_1) & \text{if } i \in \cup_{l=1}^{k_1} V_l,\ j \notin \cup_{l=1}^{k_1} V_l, \\ C_{ij}(s_1) & \text{otherwise.} \end{cases} \tag{B.3}$$
Then we can write
$$F_1(\{C_{ij}(s_1)\}) = F_1(\{C_{i,j}(s_1, s_2)\})\Big|_{s_2=1} = F_1(\{C_{i,j}(s_1, s_2)\})\Big|_{s_2=0} + \int_0^1 ds_2\, \frac{d}{ds_2} F_1(s_1, s_2). \tag{B.4}$$
Fig. 13. Extraction of one loop line from $V_1$. The loop line is dashed

Fig. 14. Extraction of two loop lines

Once again the first term at $s_2 = 0$ means that the block $\cup_{l=1}^{k_1} V_l$ is not linked to its complement by any loop line, and the graph is 1PR across the line number $k_1$ of $P_{1,2}$. The second term corresponds to extracting a new loop line (see Fig. 14) and can be written again as
$$\int_0^1 ds_2\, \frac{d}{ds_2} F_1(s_1, s_2) = \sum_{i_2 \in \cup_{l=1}^{k_1} V_l}\ \sum_{k_2 > k_1;\ j_2 \in V_{k_2}} \int_0^1 ds_2\, \Big[\frac{\partial}{\partial s_2} C_{i_2 j_2}(s_1, s_2)\Big]\, \frac{\partial}{\partial C_{i_2 j_2}} F_1(s_1, s_2). \tag{B.5}$$
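We remark that
$$\frac{\partial}{\partial s_2} C_{i_2 j_2}(s_1, s_2) = \begin{cases} C_{i_2 j_2} & \text{if } i_2 \in \cup_{l=2}^{k_1} V_l, \\ s_1 C_{i_2 j_2} & \text{if } i_2 \in V_1. \end{cases} \tag{B.6}$$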
We repeat this procedure until we reach Vp with a loop line. Then, if the process stops at the q th step we have the expression ··· A(q, n) = 1
kq−1
iq ∈∪l=1 Vl ,jq ∈Vkq =Vp
∂ ∂ Ci1 j1 (s1 ) Ci j (s1 , s2 ) . . . ∂s1 ∂s2 2 2 0 0 ∂ q F1 (s1 , s2 , . . . sq ) ∂ Ciq jq (s1 , s2 , . . . , sq ) . ∂sq ∂Ci1 j1 ∂Ci2 j2 . . . ∂Ciq jq 1
ds1 · · ·
1
dsq
(B.7)
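The combinatorial content of the arch expansion can be illustrated on block indices alone. The following sketch is an illustration under simplifying assumptions, not the expansion itself: each loop line is reduced to the pair of blocks $V_a$, $V_b$ it joins, and at each step a single arch is chosen (the one reaching farthest), whereas the expansion above sums over all possible choices. The function name `arch_system` is hypothetical.

```python
def arch_system(p, loop_lines):
    """Greedy search for a chain of arches covering the tree path x_1 ... x_p.

    p          -- number of blocks V_1, ..., V_p along the path P_{1,2}
    loop_lines -- iterable of pairs (a, b) of block indices joined by a loop line

    Returns (arches, None) if the blocks can be covered, i.e. the graph is 1PI
    across every line of the path, or (None, k) if no loop line joins
    V_1 u ... u V_k to its complement, i.e. the graph is 1PR across line k.
    """
    reached = 1          # blocks V_1 .. V_reached are already bridged
    arches = []
    while reached < p:
        # Loop lines with one end in the covered blocks and one end beyond them.
        crossing = [(min(a, b), max(a, b))
                    for a, b in loop_lines
                    if min(a, b) <= reached < max(a, b)]
        if not crossing:
            return None, reached
        arch = max(crossing, key=lambda line: line[1])  # illustrative choice
        arches.append(arch)
        reached = arch[1]
    return arches, None
```

For example, with $p = 5$ and loop lines joining $(V_1, V_3)$ and $(V_2, V_5)$ two arches suffice, while loop lines joining only $(V_1, V_2)$ and $(V_4, V_5)$ leave the graph reducible across the second tree line.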
To extract the exact expression for the derived propagators we introduce some notations. We call $l(V)$ the number of loop fields hooked to the vertices in $V$, $W_i$ the set of vertices from which the line $l_i$ may start, and $m_i$ the number of loop fields where $l_i$ may contract without crossing more than one arch:
$$W_i = V_1 \cup V_2 \cup \cdots \cup V_{k_{i-1}}, \qquad m_i = l(W_i \setminus W_{i-1}). \tag{B.8}$$
We remark that m1 = l(V1 ) and m2 = l(V1 ∪ V2 ∪ · · · ∪ Vk1 \V1 ). Now we observe that the interpolated propagator is given by Cir jr (s1 , . . . sr ) = sr +1 sr +2 . . . sr Cir jr
(B.9)
if ir ∈ mr . Then the derivative just takes away the factor sr . We remark that the remaining determinant satisfies all properties of the initial one, in particular, as the interpolation respects positivity, a Gram inequality can be applied. Hence the functional F has been developed as follows: F ({Cij }) = AI , (B.10) LF I ∈LF
where LF is a set of subsets I of the path {x1 , . . . , xn } which form 1PI clusters and AI =
n i /2
A(q, ni ),
(B.11)
q=1
where ni ≤ n is the number of vertices belonging to I and q is the number of loop lines ensuring 1PI. Now we want to prove that AI ≤ K ni . As the functional and the propagators can always be bounded by a constant, the problem is to prove the following lemma: Lemma 9. The sum over all possible arch systems that connect p points in such a way to obtain a 1PI block does not develop a factorial, in other words: 1 p 1 ds1 · · · dsq a(s1 , . . . , sq , i1 , . . . , iq ) ≤ K n , q=1 1
jr ∈Vr r=1,...,q
0
0
ir ∈Wr r=1,...,q
(B.12) where a is the function we obtain after bounding the determinant and the propagators by a constant.
Proof. We start by observing that
$$\sum_{\substack{i_r \in W_r \\ r=1,\dots,q}} a(s_1, \dots, s_q, i_1, \dots, i_q) \le \prod_{r=1}^{q} a_r(s_1, \dots, s_{r-1}), \tag{B.13}$$
where ar is defined inductively by a1 = m1 and ar (s1 , . . . , sr−1 ) = mr + sr−1 ar−1 (s1 , . . . , sr−2 ). To see this we remark that we have m1 choices to choose i1 . In the same way, we have m2 choices to choose i2 if it does not hook to V1 . If it does hook to V1 , we have m1 = a1 choices, but we also have a factor s1 . We remark that this is an overestimate, as, once i1 is fixed we have only m1 − 1 choices for i2 . We have
$$\prod_{r=1}^{q} \int_0^1 ds_r\ \prod_{r=1}^{q} a_r(s_1, \dots, s_{r-1}) \le e^{\sum_{r=1}^{q} m_r}. \tag{B.14}$$
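The inequality (B.14) can be probed numerically for small $q$. The sketch below assumes the recursion $a_1 = m_1$, $a_r = m_r + s_{r-1} a_{r-1}$ stated in the proof, integrates the product of the $a_r$ over the unit cube with a midpoint rule, and compares with $e^{\sum_r m_r}$; the sample values of $m_r$ and the function names are illustrative assumptions.

```python
import itertools
import math


def a_factors(m, s):
    """a_1 = m_1 and a_r = m_r + s_{r-1} * a_{r-1} for r = 2, ..., q."""
    factors = [m[0]]
    for r in range(1, len(m)):
        factors.append(m[r] + s[r - 1] * factors[-1])
    return factors


def integrated_product(m, points=100):
    """Midpoint-rule estimate of the integral over [0,1]^(q-1) of prod_r a_r.

    The last parameter s_q does not appear in any a_r, so its integral is 1.
    """
    q = len(m)
    nodes = [(k + 0.5) / points for k in range(points)]
    total = 0.0
    for s in itertools.product(nodes, repeat=q - 1):
        total += math.prod(a_factors(m, s))
    return total / points ** (q - 1)


m_example = [2, 3, 1]
lhs = integrated_product(m_example)   # exact value for this example is 73/3
rhs = math.exp(sum(m_example))        # e^6
```

For `m_example` the left-hand side is about 24.3 while $e^6 \approx 403$, so the bound holds with a wide margin in this instance.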
1
Indeed this follows from the inductive use of 0. Now, as mr = l(Wr \Wr−1 ), we have q
mr ≤
r=1
+1
p
0
(as +b)ds ≤ (1/a)ea+b , for a > 0, b >
l(Vi ) < 4n.
(B.15)
i=1
Finally we prove that p/2
q=1 1
1 ≤ K n.
(B.16)
jr ∈Vr r=1,...,q
Actually jr ∈Vr r=1,...,q
1=
q
l(Vkr ) < 4n
(B.17)
r=1
and 1
Selfenergy. Now, we can apply the arch formula to the two point vertex function and extract the following expression for the selfenergy: (φ1 , φ2 )
n−1 ¯ λn δµ1 n = ε(T , () dq n! n ! T ≤A(i) ≤i ≤1 q=1 n≥1 ¯ CT S u−T L,C E,( J,P ,Le 2 1 |L 2π e| 1 1 4 −2 4 −2 [ 3 jh,n ] dsq dθh,nh [ 3 jh,n −1 ] dθh,nh −1 0 q=1
−1
2 ] . . . [ 43 jh,1
h
h∈L∪TL
ϒ
gi | i=r, |egi (C|)≤10
jh,2
dθh,1
n h r=2
h
0
jh,n
h
θ
χαjh,r (θh,1 ) h,r
θiroot , {θh,r(i) }h∈egi∗
v∈V ∪V
∗ ϒ θhroot , {θ } h,nh h∈H (v) v
d 3 x1 . . . d 3 xn¯ φ1 (xi1 , θe1 ,1 )φ2 (xj1 , θe2 ,1 ) n−1 ¯ C q (xq , x¯q , θh,1 ) Mf,g (C, E, {θa,1 }, {sq }) q=1
jq
∈J 1
lf g ∈P ∪Le
1 0
dsjq det M (C, E, {θa,1 }, {sjq }, {zq }),
(B.18)
where we took $\phi_1(x) = \delta(x)$ and $\phi_2(x) = e^{-ixk}$, to obtain $\hat\Sigma(k)$. $L_e$ is the set of loop lines extracted to ensure 1PI, $\{s_q\}$ is the set of interpolation parameters used to extract them, while $P$ is the set of loop lines extracted in Sect. 2.1. With this expression we can perform the same bound as for the vertex function, as the additional sums do not generate any factorial.

B 2. First derivative of selfenergy at the Fermi surface. In this section we prove that the first order derivative of the selfenergy computed at an impulsion $k_F$ on the Fermi surface is bounded by
$$\Big|\, \partial_{k_\alpha} \hat\Sigma(k)\big|_{k_F} \Big| \le |\lambda|^2 M_1 \tag{B.19}$$
for $\alpha = 0, 1, 2$ and for all $\lambda$ and $T$ satisfying $|\lambda \ln T| \le M_0$, where $M_0$ and $M_1$ are some constants. The derivative actually corresponds to the multiplication by a factor $(x - y)_\alpha$ in position space, for $\alpha = 1, 2$, and a factor $(e^{i(x_0 - y_0)T} - 1)/T$ for $\alpha = 0$, which is bounded, in absolute value, by $|x_0 - y_0|$. Therefore:
$$\Big|\, \partial_{k_\alpha} \hat\Sigma(k)\big|_{k_F} \Big| \le \int d^3x\, d^3y\, \delta(x)\, |x - y|_\alpha\, |\Sigma(x, y)|. \tag{B.20}$$
Then we can perform power counting as usual, the only difference being an additional factor 1/t (r) , where t (r) is the band index of the lowest tree line in the path joining
the two external points x and y. Nevertheless, as the two point function itself is not 1
renormalized, the factor t2(r) coming from loop contractions is not consumed. Then we are left with the factor 1 1 1 = βj2 1 . (B.21) 1 2 j ∈Ct (i) t (r) T2 We remark that, by 1PI, all j ∈ Ct (r) , except for the last one j = r, correspond to 1
a subgraph with at least four external legs. Then a factor βj2 just makes their power counting even more convergent. The last subgraph gives 1 1 1 dβr βr−1−1 βr2 1 ≤ K. (B.22) T T T2 Hence the derivative is bounded by ∞ ¯ ˆ ≤ ≤ |λ|2 M1 (K|λ|)n¯ | ln T |n−2 ∂kα (k) kF
(B.23)
n=2 ¯
for $|\lambda||\ln T| \le 1/K = M_0$. The extraction of two coupling constants from the sum does not affect the convergence, as there are at most $\bar n - 2$ logarithmically divergent subgraphs. Actually, there are $\bar n - 1$ subgraphs, and one of them, $g_r$, does not give a logarithm, as shown in the equation above. This ends the proof. □

This bound on the first derivative, which is the first inequality in (2.8), already proves that our system is not a Luttinger liquid [S, BGPS, BM].

B 3. Second derivative of the selfenergy. In this section we complete the last part of the proof of Theorem 1, namely the bound on the second derivative of the self energy, which is the second inequality in (2.8). It is this bound which really proves “Fermi liquid behavior” [S]. We prove that the second order derivative of the selfenergy computed at any impulsion $k$ (not necessarily on the Fermi surface) is bounded by
$$\big|\, \partial_{k_\alpha} \partial_{k_\beta} \hat\Sigma(k) \big| \le M_3 \tag{B.24}$$
for $\alpha, \beta = 0, 1, 2$ and for all $\lambda$ and $T$ satisfying $|\lambda \ln T| \le M_0$, where $M_0$ and $M_3$ are some constants. Applying a double derivative in impulsion space corresponds to multiplying by a factor $|x - y|^2$ in position space:
$$\big|\, \partial_{k_\alpha} \partial_{k_\beta} \hat\Sigma(k) \big| \le \int d^3x\, d^3y\, \delta(x)\, |x - y|_\alpha\, |x - y|_\beta\, |\Sigma(x, y)|. \tag{B.25}$$
2 This time we have to compensate the bad factor −2 t (r) , hence the factor t (r) extracted from sector sums is not enough. Actually pushing further the loop analysis we want to 1
extract a second factor t2(r) . It turns out that this is almost possible but not quite. Because 1 of Lemma 10 below, one can only extract t2(r) ln t (r) . This explains the absence of any λ in the final bound (B.29). One of the two λ’s is consumed by Lemma 10, and the other one by the power counting for the final graph gr which becomes logarithmic.
B 3.1. Outline of the proof of (B.24). When extracting loop lines in Sect. 2.1, we introduced the chain Cir (i = r in this case), joining the dot vertex vh(2) to the cross vertex just above t (r) (see Fig. 3), where h(1) is the root external half-line of the self-energy and h(2) is the other one. Now we introduce the equivalent chain C r for h(1) , joining the dot vertex vh(1) to the cross vertex just above t (r) (see Fig. 15). We remark that, for all j ∈ C r we must have |egj (C)| ≥ 4, by 1PI. On this chain, in analogy with the set Ji=r of Sect. 2.1, we introduce the set Jr of cross index (and eventually one dot index) corresponding to the four point subgraphs ordered from the (1) lowest index j1 and going up to the highest j|Jr | . For each gj , with j ∈ Jr , we call hj (2)
the real external half-line h(1) , hj the tree half-line going towards the external vertex (3)
(4)
(2)
vh(2) , hj and hj the two remaining external half-lines. The tree line lj (corresponding (2)
to hj ) cuts the tree in two connected components. By analogy with the definitions in
R Sect. 2.1, we call T L j the component containing the vertex vh(1) , and T j the component
containing the vertex vh(2) . We remark that gj belongs to T L j.
R Finally, for each subgraph gjq with jq ∈ Jr , we introduce the sets L L jq and L jq that L are the analogs on C r of the sets LR jq (i = r), Ljq (i = r) introduced in (2.11), but in which of course we take out the loop half lines already contracted by the induction on the first chain Cir .
Now we extract inductively the factor we need for j ∈ C r . For each subgraph gj we distinguish two cases: 1
1
2 2 • if |egj (C)| > 4 we extract the factor A (j ) /j from the convergent power counting of gj , and we pass to the cross above in the chain;
R • if |egj (C)| = 4 we test the number of loop lines connecting T L jq to T jq . We remark that this number must be always even, and cannot be zero by 1PI. We have to extract some of these loop lines from the determinant, using a tedious case by case analysis described in the next subsection, B3.2. When all lines are extracted, the last subsection R B3.3, performs the final bounds. In the case of two loop lines connecting T L jq to T jq 1
1
2 2 we gain at once the factor A (j ) ln A(j ) by Lemma 10, and the induction is stopped;
Fig. 15. Example of $C'_r$
1
2 2 in the case of four loop lines or more, by Lemma 11 we gain the factor A (j ) /j and we pass to the following cross in the chain. 1
At the end of the induction we have therefore obtained the factor t2(r) ln t (r) . This leads to the proof of (B.24). B 3.2. The extraction of loop lines. We consider the four point subgraph gj on the chain. We distinguish three situations. (3)
(4)
1. If hj and hj are both loop half-lines they must contract to T R j as selfcontractions are impossible by 1PI (see Sect. 2.1, Par. 1.). We extract them out of the determinant 2 (see (2.12)). The number of choices is bounded by |L R j | . No additional improvement on sector sums can be gained in that case, but Lemma 10 proves that a refinement from 1 anisotropic to isotropic sectors gives at once the factor 2 ln A(j ) . Therefore in A(j )
that case we can stop the induction. (3)
2. If hj
(4)
is a tree half-line and hj (4)
(4)
is a loop one, we contract hj , expanding the
determinant. If hj contracts to T R j , then we have to extract, applying several times the
R formulae (2.13-2.14), one or three additional loop lines joining T L j with T j (depending (4)
R L whether there are two or more loop lines joining T L j with T j ). If hj contracts to T j , then we have to extract two or four additional loop lines (depending whether there are R two or more loop lines joining T L j with T j ). In any case the number of choices for
4 L 5 these extractions is bounded by |L R j | |L j | . (3)
(3)
(4)
3. If hj and hj are both tree half-lines, then we call T j the subtree connected to gj (4)
(3)
(4)
through hj , and T j the one connected to gj through hj (see Fig. 17). In the same (3)
(4)
(3)
(4)
way we define L j and L j (L L j = L j ∪ L j ). Then we apply (2.13, 2.14) several
R times, until we extract two or four loop lines joining T L j with T j . Finally, if there are four loop lines we perform an additional analysis. If four or two loop lines extracted are (3) (4) hooked to T j , then we apply (2.13) once more to extract a loop line joining T j to (3)
T j or to T R j (there must be one by 1PI, and by the parity of the number of external (3)
(4)
5 lines). In any case the number of choices is bounded by |L j |5 |L j |5 |L L j| .
The reader might worry that some of the loop lines may have already been extracted r described in Sect. 2.1. Actually from the determinant by the induction on the chain Ci=r this may happen only for the lowest four point subgraph gj1 on the left chain C r , and the lowest four point subgraph gj1 on the right chain Crr as later extractions are performed in factorized components of the determinant (see (2.10)). Hence for all gjq with jq ∈ Jr and q > 1 all loop lines extracted by the procedure above are independent from loop lines extracted by the analog procedure on the right chain (Sect. 2.1). For the subgraph gj1 a detailed analysis must be performed and in some cases it is necessary to extract an additional loop line from the determinant and modify slightly the sector counting on the right chain of Lemma 4. Actually, it may happen that the loop line that has been used to gain a sector refinement sum for the subgraph gj1 on the right chain Crr is also needed to gain a factor for gj1 on the left chain. In this case we observe
Fig. 16. $g_j$ is the lowest four point subgraph on the left chain and $g_k$ the lowest four point subgraph on the right chain. The loop half-line $h_j^{(3)}$ hooks to $g_k$ but it cannot be used in Lemma 4 to gain a sector sum for $g_k$, therefore we extract the “extraloop” line
that there must be, by parity, a second loop line, from now on called the “extraloop”, connecting TjL with TjR on the right chain (see Fig. 16). If this loop line has not been already extracted from the determinant, we apply once again formulae (2.13-2.14) to L extract this “extraloop”. This costs a factor |LR j1 ||Lj1 |. Lemma 11 will show that, in this case, at least one of these two loop lines is not essential to gain a factor on the left chain. Therefore, in Lemma 4, we can use this loop line to gain a sector sum on the right chain. 1
1
This operation is crucial to assure that we can obtain the factor t2 (r) and t2(r) ln t (r) from the two chains independently.
B.3.3. The final bound. Lemmas 1 and 2 in Sect. 2.1 prove that the remaining determinant still satisfies a Gram inequality, and that the number of choices to extract the loop lines is bounded by $K^{\bar n}$. Now we prove the following lemmas.

Lemma 10. Let $g_j$ be a four point subgraph on the chain $C_r$. If there are only two loop lines $l^{(1)}_{lj}$ and $l^{(2)}_{lj}$, connecting $T^R_j$ to $T^L_j$, then the power counting has an additional factor (so-called “volume factor”) $A^{1/2}(j) \ln A(j)$.
Fig. 17. Example of $T_j^{(3)}$ and $T_j^{(4)}$

Fig. 18. Example of a non-quasi-local four point subgraph $G_j$
Proof. As there are only two loop lines, $T^L_j$ actually is a four point subgraph $G_j$ (but not necessarily quasi-local) with external lines $l_1(G_j) = h_j^{(1)}$, $l_2(G_j) = l_j^{(2)}$, $l_3(G_j) = l^{(1)}_{lj}$ and $l_4(G_j) = l^{(2)}_{lj}$. For an example see Fig. 18. No anisotropic sector counting improvement can be obtained here, since the two loop lines may be both external lines for the lowest four point subgraph $g_{j_1}$ on the right chain $C_r^r$ (see Fig. 19). Therefore we refine the sectors of $l_i(G_j)$, $i = 1, \ldots, 4$ (that may be of different sizes, each at most $A^{1/2}(j)$) in smaller sectors, multiplying their size by $A^{1/2}(j)$. We remark that, when the sector size of $l_i(G_j)$ equals $A^{1/2}(j)$ for all $i$, this means we are refining up to isotropic sectors.
Fig. 19. In this example one loop line is used to gain a sector sum on the right chain for $g_k$ and no loop line with an independent sector sum remains for $g_j$ on the left chain
By Lemma 2, Sect. II, in [FMRT] the new sector sum costs
$$\frac{1}{A^{1/2}(j)} \ln A(j). \tag{B.26}$$
We remark that the bad factor $A^{-1/2}(j)$ coming from the worse spatial decay of the tree line is compensated by the good factor $A^{1/2}(j)$ from the smaller volume in impulsion space. Loop lines are not used for spatial integration, therefore their smaller volume factor in impulsion space is not consumed. Each loop line gives therefore a net bonus $A^{1/2}(j_q)$. Finally we are left with the factor
$$A^{1/2}(j_q)\, A^{1/2}(j_q)\, A^{-1/2}(j_q) \ln A(j_q) = A^{1/2}(j_q) \ln A(j_q). \tag{B.27}$$
□

When there are at least four loop lines joining $T^L_{j_q}$ with $T^R_{j_q}$, the following lemma proves that we have paid one unnecessary sector refinement sum.
Lemma 11. Let the four point subgraph gj on the chain C r and the four loop lines llj1 ,
R . . . llj4 , joining T L j with T j , be fixed. Then the number of sector choices predicted by [DR1], Lemma 5, (4.26) must be modified to: 4 m=2
1 4 −2 3 A(j )
j
(m) hj ,r(j )+1
, (root) dθh(m) ϒ θj , {θh(m) ,r(j ) }m=2,3,4 ≤ K j
j
(B.28)
for some constant K. Proof. The proof is quite similar to that of Lemma 4, but slightly more complicated, as this time the sector of the external tree line on the path joining x (1) with x (2) must be summed. We distinguish two cases.
Fig. 20. (a) shows the usual sector counting and (b) shows the new sector counting taking $l^2_{lj}$ as root. In this example we suppose that $l^1_{lj}$ has been used to gain a sector sum on the right chain
1. If $h_j^{(3)}$ is a loop half-line and $h_j^{(4)}$ is a tree one, we know there are at least three loop lines (different from $h_j^{(3)}$), called $l^1_{lj}$, $l^2_{lj}$, $l^3_{lj}$, joining $T^R_j$ to $T^L_j$ (see Fig. 20a). One of these lines, say $l^1_{lj}$, may have been used to gain a sector sum on the other chain $C_r$ (in Lemma 4). The sector of at least one of the two remaining loop lines, say $l^2_{lj}$, has been summed independently in the sector counting lemmas. Then this line can be chosen as a new root to perform sector counting (see Fig. 20b). This permits us to fix the sector of $h_j^{(4)}$. The sector of $h_j^{(3)}$ is fixed by impulsion conservation along the loop line. This loop line is essential to fix the third sector for $g_j$, therefore it cannot be used to gain a sector sum in Lemma 4 for the right chain. This is not a problem as there is always the “extra loop” line that we can choose for this purpose (not necessarily equal to $l^1_{lj}$, $l^2_{lj}$ or $l^3_{lj}$, see Fig. 16).

2. If $h_j^{(3)}$ and $h_j^{(4)}$ are both tree half-lines, then we know there are four loop lines $l^i_{lj}$, $i = 1, \ldots, 4$, connecting $T^L_j$ with $T^R_j$. Now we have three situations, shown in Figs. 21–23.

• $l^1_{lj}$, $l^2_{lj}$ and $l^3_{lj}$ are hooked to $T_j^{(3)}$ and only $l^4_{lj}$ is hooked to $T_j^{(4)}$ (see Fig. 21a). One of the first three lines, say $l^1_{lj}$ (with sector $\theta_1$), may have been used to gain a sector sum on the other chain $C_r$. Then, among $l^2_{lj}$ and $l^3_{lj}$, we choose as the new root for sector counting in $T_j^{(3)}$ the one that has a sector sum independent from that of $\theta_1$ (there must be at least one by sector counting lemmas). In $T_j^{(4)}$ we choose as new root the unique loop line $l^4_{lj}$ (see Fig. 21b). We remark that this loop line is essential to fix the sector of $h_j^{(4)}$, therefore it cannot be used to gain a sector sum in Lemma 4 for the right chain. This is not a problem as there is always the “extra loop” line we can choose for this purpose.

• $l^1_{lj}$ and $l^2_{lj}$ are hooked to $T_j^{(3)}$ while $l^3_{lj}$ and $l^4_{lj}$ are hooked to $T_j^{(4)}$ (see Fig. 22a). One of these loop lines, say $l^4_{lj}$, may have been used to gain a sector sum on the other chain $C_r^r$. As the sector of $l^3_{lj}$ may not be independent of that of $l^4_{lj}$, we need to extract a third loop line $l^5_{lj}$ connecting $T_j^{(4)}$ to $T_j^{(3)}$ or $T_j^R$. This line must exist by parity. Then we can take either this line $l^5_{lj}$ or $l^3_{lj}$ as new root to repeat the argument above. On the other hand, as $l^1_{lj}$ and $l^2_{lj}$ are not needed for the other chain $C_r^r$ (only one loop line is necessary for this chain and we have supposed it is $l^4_{lj}$), we can choose any of them as new root, even if their sectors are related (see Fig. 22b).

Fig. 21. Three loop lines hook to $T_j^{(3)}$ and only one to $T_j^{(4)}$: in (a) usual sector counting is shown; in (b) sector counting is performed taking as roots $l^2_{lj}$ in $T_j^{(3)}$ and $l^4_{lj}$ in $T_j^{(4)}$

Fig. 22. Two loop lines hook to $T_j^{(3)}$ and two to $T_j^{(4)}$: in (a) usual sector counting is shown; in (b) sector counting is performed taking as roots $l^2_{lj}$ in $T_j^{(3)}$ and $l^5_{lj}$ in $T_j^{(4)}$
• All the four loop lines are hooked to $T_j^{(3)}$ (see Fig. 23a). A loop line, say $l^1_{lj}$, may have been used to gain a sector sum on the other chain $C_r^r$. Among the three remaining lines we can choose a new root for the sector sum on $T_j^{(3)}$. For $T_j^{(4)}$, there is no loop line, therefore we must extract a fifth loop line connecting $T_j^{(4)}$ to $T_j^{(3)}$ or $T_j^R$ (it must exist by parity) and we repeat the same argument above (see Fig. 23b).

Fig. 23. Four loop lines hook to $T_j^{(3)}$ and none to $T_j^{(4)}$: in (a) usual sector counting is shown; in (b) sector counting is performed taking as roots $l^2_{lj}$ in $T_j^{(3)}$ and $l^5_{lj}$ in $T_j^{(4)}$

In any case, the two sectors, for $h_j^{(3)}$ and $h_j^{(4)}$, are fixed. Then, as the sector of $h_j^{(1)}$ is always fixed, three sectors are known, hence the fourth one too, and there is no sector refinement to pay. Then we gain the corresponding factor and iterate the process. □
Final bound. Inserting all these results, and performing power counting we find the bound
$$\left|\partial_{k_\alpha} \partial_{k_\beta} \hat{\Sigma}(k)\right| \le \sum_{\bar n=2}^{\infty} (|\lambda \ln T|)^{\bar n} \le M_3 \tag{B.29}$$
for $|\lambda \ln T| \le 1/K = M_3$. We remark that, contrary to (B.23), there is no factor $\lambda^2$ since there can be two additional logarithms, one coming from the global power counting of the subgraph $g_r$, and the other coming from Lemma 10.

Acknowledgements. We thank V. Mastropietro for a careful reading of this paper which has led us to add the appendix on the flow of δµ. We are also extremely grateful to M. Salmhofer who explained to us the importance of proving the bounds on the derivatives of the self energy to distinguish between Fermi and Luttinger liquids. Finally we thank J. Magnen for useful discussions.
References

[BGPS] Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Commun. Math. Phys. 160, 93 (1994)
[BM] Bonetto, F., Mastropietro, V.: Commun. Math. Phys. 172, 57 (1995)
[DR1] Disertori, M. and Rivasseau, V.: Interacting Fermi liquid in two dimensions at finite temperature. Part I: Convergent attributions. Commun. Math. Phys. 215, 251–290 (2000)
[DR2] Disertori, M. and Rivasseau, V.: Continuous Constructive Fermionic Renormalization. Preprint (1998), To appear in Annales Henri Poincaré
[FMRS] Feldman, J., Magnen, J., Rivasseau, V. and Sénéor, R.: Construction of infrared $\phi^4_4$ by a phase space expansion. Commun. Math. Phys. 109, 437 (1987)
[FMRT] Feldman, J., Magnen, J., Rivasseau, V. and Trubowitz, E.: An infinite Volume Expansion for Many Fermion Green’s Functions. Helv. Phys. Acta 65, 679 (1992)
[FT1] Feldman, J. and Trubowitz, E.: Perturbation theory for Many Fermion Systems. Helv. Phys. Acta 63, 156 (1991)
[FT2] Feldman, J. and Trubowitz, E.: The flow of an Electron-Phonon System to the Superconducting State. Helv. Phys. Acta 64, 213 (1991)
[GK] Gawedzki, K., Kupiainen, A.: Massless $\phi^4_4$ theory: Rigorous control of a renormalizable asymptotically free model. Commun. Math. Phys. 99, 197 (1985)
[R] Rivasseau, V.: From perturbative to constructive renormalization. Princeton, NJ: Princeton University Press, 1991
[S] Salmhofer, M.: Continuous renormalization for Fermions and Fermi liquid theory. Commun. Math. Phys. 194, 249 (1998)

Communicated by D. Brydges
Commun. Math. Phys. 215, 343 – 356 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Center Manifold for Nonintegrable Nonlinear Schrödinger Equations on the Line Ricardo Weder, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Apartado Postal 20-726, México D.F. 01000. E-mail: [email protected] Received: 20 January 2000 / Accepted: 1 June 2000
Abstract: In this paper we study the following nonlinear Schrödinger equation on the line,
$$i\frac{\partial}{\partial t}u(t,x) = -\frac{d^2}{dx^2}u(t,x) + V(x)u(t,x) + f(x,|u|)\frac{u(t,x)}{|u(t,x)|}, \quad u(0,x) = \phi(x),$$
where $f$ is real-valued, and it satisfies suitable conditions on regularity, on growth as a function of $u$ and on decay as $x \to \pm\infty$. The generic potential, $V$, is real-valued and it is chosen so that the spectrum of $H := -\frac{d^2}{dx^2} + V$ consists of one simple negative eigenvalue and absolutely-continuous spectrum filling $[0, \infty)$. The solutions to this equation have, in general, a localized and a dispersive component. The nonlinear bound states, that bifurcate from the zero solution at the energy of the eigenvalue of $H$, define an invariant center manifold that consists of the orbits of time-periodic localized solutions. We prove that all small solutions approach a particular periodic orbit in the center manifold as $t \to \pm\infty$. In general, the periodic orbits are different for $t \to \pm\infty$. Our result implies also that the nonlinear bound states are asymptotically stable, in the sense that each solution with initial data near a nonlinear bound state is asymptotic as $t \to \pm\infty$ to the periodic orbits of nearby nonlinear bound states that are, in general, different for $t \to \pm\infty$.
1. Introduction

We study below the small solutions to the nonintegrable, nonlinear Schrödinger equation,
$$i\frac{\partial}{\partial t}u(t,x) = -\frac{d^2}{dx^2}u(t,x) + V(x)u(t,x) + f(x,|u|)\frac{u(t,x)}{|u(t,x)|}, \quad u(0,x) = \phi(x), \tag{1.1}$$
Fellow Sistema Nacional de Investigadores
Research partially supported by Proyecto PAPIITIN 1057 99
where $u$ is a complex-valued function defined for $t, x \in \mathbb{R}$. For each fixed $x \in \mathbb{R}$, $f(x,\cdot) \in C^1(\mathbb{R},\mathbb{R})$, $\frac{\partial}{\partial x}f(x,\cdot) \in C(\mathbb{R},\mathbb{R})$, $f(x,0) = 0$ and,
$$\left|\frac{\partial}{\partial u}f(x,u)\right| \le C|u|^{p-1}, \quad \left|\frac{\partial}{\partial x}f(x,u)\right| \le C|u|^{p}, \quad \text{for some } p > 2. \tag{1.2}$$
The potential, $V$, is a real-valued function. For any $\gamma \in \mathbb{R}$ we denote by $L^1_\gamma$ the Banach space of all complex-valued measurable functions, $\phi$, defined on $\mathbb{R}$ and such that
$$\|\phi\|_{L^1_\gamma} := \int |\phi(x)|\,(1+|x|)^{\gamma}\,dx < \infty. \tag{1.3}$$
If $V \in L^1_1$, $-\frac{d^2}{dx^2} + V$ has a unique self-adjoint realization in $L^2$, that we denote by $H$. Moreover, $H$ has no singular-continuous spectrum, and its absolutely-continuous spectrum is $[0, \infty)$. $H$ has no positive or zero eigenvalues and, in general, it has a finite number of negative eigenvalues that are simple. For these results see [5] and [26]. We assume that $H$ has only one negative eigenvalue, that we denote by $E_0$. We also suppose that
$$N(V) := \sup_{x \in \mathbb{R}} \int_x^{x+1} |V(y)|^2\,dy < \infty. \tag{1.4}$$
It follows from Theorem 2.7.1 in p. 35 of [20] that $D(H) = H_2$, where $H_n$, $n = 1, 2, \ldots$, denote the Sobolev spaces [1]. To explain our results let us consider first the associated linear Schrödinger equation with $f \equiv 0$,
$$i\frac{\partial}{\partial t}u(t,x) = -\frac{d^2}{dx^2}u(t,x) + V(x)u(t,x), \quad u(0,x) = \phi(x). \tag{1.5}$$
Let $\Psi_0$ be the eigenvector associated to the eigenvalue $E_0$, with $L^2$ norm equal to one. Equation (1.5) has an invariant center manifold given by,
$$M_0 := \left\{ r e^{i\theta}\Psi_0 : r \ge 0,\ 0 \le \theta < 2\pi \right\}. \tag{1.6}$$
The invariant manifold, $M_0$, consists of the orbits of periodic localized solutions to (1.5) of the form $\{e^{-itE_0} r e^{i\theta}\Psi_0 : r \ge 0,\ 0 \le \theta < 2\pi\}$. Every solution to (1.5), $u = e^{-itH}\phi$, with initial data $\phi \in L^2$, can be decomposed as follows,
$$e^{-itH}\phi = e^{-itE_0}P_0\phi + e^{-itH}P_c\phi, \tag{1.7}$$
where $P_0$ denotes the orthogonal projector in $L^2$ onto the one-dimensional subspace generated by $\Psi_0$, $P_0\phi := (\phi, \Psi_0)\Psi_0$. Moreover, $P_c := I - P_0$ is the projector onto the subspace of continuity of $H$, $H_c := P_c L^2$. The component $e^{-itH}P_c\phi$ is a scattering state that propagates out to infinity as $t \to \pm\infty$. In fact, it is an easy consequence of the RAGE theorem [19] that,
$$\lim_{t \to \pm\infty} \left\| e^{-itH}P_c\phi \right\|_{L^2_{-s}} = 0, \quad s > 0, \tag{1.8}$$
where $L^2_s$, $s \in \mathbb{R}$, denotes the weighted $L^2$ space consisting of all functions, $f$, that are locally in $L^2$ and such that $(1+|x|^2)^{s/2}f(x) \in L^2$, with norm,
$$\|f\|_{L^2_s} := \left\| (1+|x|^2)^{s/2} f(x) \right\|_{L^2}. \tag{1.9}$$
Equation (1.8) is an expression of the fact that, as the scattering state propagates to infinity, the local energies tend to zero. Note that (1.8) does not hold with $s = 0$ because $e^{-itH}$ is unitary in $L^2$. By (1.7) and (1.8) every solution, $u$, to (1.5) approaches a periodic orbit in the invariant center manifold in the sense that for any $s > 0$,
$$\lim_{t \to \pm\infty} \left\| u(t) - e^{-itE_0}P_0\phi \right\|_{L^2_{-s}} = 0. \tag{1.10}$$
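The decomposition (1.7) can be illustrated numerically with a finite-difference discretization of $H$. The following is only a rough sketch under assumptions that are not taken from the paper: the potential $V(x) = -\operatorname{sech}^2 x$ (a Pöschl–Teller well whose single bound state has energy $-(3-\sqrt5)/2 \approx -0.382$), the Dirichlet box $[-40, 40]$, the grid spacing, the Gaussian initial datum, and all function names are illustrative choices.

```python
import numpy as np

# Hypothetical setup (not from the paper): box, grid and potential.
x = np.linspace(-40.0, 40.0, 801)
dx = x[1] - x[0]
V = -1.0 / np.cosh(x) ** 2

# Second-order finite differences for H = -d^2/dx^2 + V with Dirichlet ends.
off_diagonal = np.full(x.size - 1, -1.0 / dx**2)
H = np.diag(2.0 / dx**2 + V) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)

energies, modes = np.linalg.eigh(H)
modes = modes / np.sqrt(dx)            # columns normalized in the discrete L^2(dx) norm
E0, psi0 = energies[0], modes[:, 0]    # discrete analogue of E_0 and its eigenvector


def inner(f, g):
    """Discrete L^2 inner product (f, g), antilinear in the second slot."""
    return dx * np.vdot(g, f)


def evolve(phi, t):
    """Apply exp(-i t H) through the eigen-decomposition of the matrix H."""
    coefficients = dx * (modes.T @ phi)
    return modes @ (np.exp(-1j * energies * t) * coefficients)


def weighted_norm(u, s):
    """Discrete analogue of the L^2_s norm (1.9); use negative s for local norms."""
    return np.sqrt(dx * np.sum((1.0 + x**2) ** s * np.abs(u) ** 2))


phi = np.exp(-x**2 / 2).astype(complex)   # illustrative initial datum
P0_phi = inner(phi, psi0) * psi0          # bound-state component
Pc_phi = phi - P0_phi                     # dispersive component
```

Calling `weighted_norm(evolve(Pc_phi, t), -1.0)` for increasing `t` gives a feel for the local decay in (1.8), as long as `t` stays small enough that reflections from the artificial box walls have not returned.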
For functions $u(t,x)$ defined for $t, x \in \mathbb{R}$, we denote $u(t)$ for $u(t,\cdot)$. Equation (1.10) gives us also a time-dependent characterization of the stability of the bound state $\Psi_0$. It tells us that given any initial state in $L^2$, $\Psi = r e^{i\theta}\Psi_0 + \phi$ with $\phi \in H_c$, the solution to (1.5) is the sum of the periodic orbit of the bound state, $e^{-itE_0} r e^{i\theta}\Psi_0$, and a dispersive solution, $e^{-itH}\phi$, that propagates out to infinity as $t \to \pm\infty$ and whose local energies tend to zero. As we show below, this situation persists in the nonlinear case. There is an invariant center manifold, consisting of the orbits of periodic localized solutions, such that all small solutions to (1.1) approach particular orbits in the center manifold as $t \to \pm\infty$. Invariant manifold theorems have been extensively used in the analysis of the time evolution of dissipative equations. See for example, [4] and [9]. Equation (1.1) is, however, dispersive. A periodic localized solution to (1.1) is a solution of the type $u(t,x) := e^{-itE}\Psi_E$, where the nonlinear bound state, $\Psi_E$, is a solution to the following nonlinear eigenvalue problem:
$$H\Psi_E + f(x,|\Psi_E|)\frac{\Psi_E}{|\Psi_E|} = E\Psi_E, \quad \Psi_E \in H_2. \tag{1.11}$$
It is a consequence of standard bifurcation theory (see Theorem 3.2.2 on p. 77 of [15]) that the set of nontrivial solutions to (1.11) near the trivial solution $\Psi_{E_0} = 0$ consists of exactly one continuous curve $\Psi_E$ such that $|E - E_0| < \varepsilon$ for some $\varepsilon > 0$, and,
$$\lim_{E \to E_0} \|\Psi_E\|_{H_2} = 0. \tag{1.12}$$
In other words, the nonlinear bound states bifurcate from the zero solutions at $E = E_0$. We prove in Appendix 1 (see (1.24) below) that for some constant $C$,
$$|\Psi_0(x)| \le Ce^{-\sqrt{|E_0|}\,|x|}; \quad |\Psi_E(x)| \le Ce^{-\sqrt{|E|}\,|x|}, \quad |E - E_0| < \varepsilon. \tag{1.13}$$
It follows from (1.13) that for any $s > 0$ there is a constant $C_s$ such that
$$\|\Psi_E\|_{L^2_s} \le C_s, \quad |E - E_0| < \varepsilon. \tag{1.14}$$
The invariant center manifold for the nonlinear Schrödinger equation (1.1) is given by
$$M := \left\{ e^{i\theta}\Psi_E : |E - E_0| < \varepsilon,\ 0 \le \theta < 2\pi \right\}. \tag{1.15}$$
$M$ consists of the orbits of periodic solutions to (1.1) of the form $e^{-itE}e^{i\theta}\Psi_E$. In the spirit of the central manifold theorem, [4, 9, 16], let us write $M$ as the graph of a function from $P_0L^2$ into its orthogonal complement, $H_c$. The proof given in [16] in the case of three or more dimensions applies also in one dimension, and it follows that there is a $\delta > 0$ and a function, $h$, from $\{z \in \mathbb{C} : |z| < \delta\}$ into $H_c \cap H_2 \cap L^2_s$, $s > 0$, such that,
$$M = \left\{ \Psi = z\Psi_0 + h(z) : |z| < \delta \right\}. \tag{1.16}$$
Moreover, $h$ is $C^1$ in the real sense, i.e., as a function from $\mathbb{C}$, as a two-dimensional real space, into $H_c \cap H_2 \cap L^2_s$. Furthermore, $h(0) = 0$, and $h(e^{i\theta}z) = e^{i\theta}h(z)$. See Proposition 2.2 of [16] and its proof. Let us denote by $F(x,u) := \int_0^u f(x,v)\,dv$. It follows from (1.2), and since $f(x,0) = 0$, that
$$F(x,|u|) \le C|u|^{p+1}, \tag{1.17}$$
for some constant $C$. It is a consequence of standard results (see [10] and [8]) that there is a $\rho > 0$ such that the initial value problem (1.1) has a unique solution in $C(\mathbb{R}, H_1)$ for every $\phi \in H_1$ such that $\|\phi\|_{H_1} < \rho$. If moreover,
$$F(x,|u|) \ge -C(|u|^2 + |u|^{q+1}), \quad \text{for some } 1 < q < 5, \tag{1.18}$$
then the initial value problem (1.1) has a unique solution in $C(\mathbb{R}, H_1)$ for every $\phi \in H_1$. In both cases the $L^2$ norm and the energy are conserved quantities,
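$$\|u(t)\|_{L^2} = \|\phi\|_{L^2}, \tag{1.19}$$
$$\int dx \left[ \frac{1}{2}\left|\frac{\partial}{\partial x}u(t,x)\right|^2 + V(x)|u(t,x)|^2 + F(x,|u|) \right] = \int dx \left[ \frac{1}{2}\left|\frac{d}{dx}\phi(x)\right|^2 + V(x)|\phi(x)|^2 + F(x,|\phi|) \right]. \tag{1.20}$$
Moreover, for any $\varepsilon > 0$ there is a $\nu > 0$ such that
$$\|u(t)\|_{H_1} < \varepsilon, \quad t \in \mathbb{R}, \quad \text{if } \|\phi\|_{H_1} < \nu. \tag{1.21}$$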
Before we state our main result we need to introduce some more standard notations. For any pair $u, v$ of solutions to the stationary Schrödinger equation:
$$-\frac{d^2}{dx^2}u + Vu = k^2u, \quad k \in \mathbb{C}, \tag{1.22}$$
let $[u, v]$ denote the Wronskian of $u, v$:
$$[u, v] := \left(\frac{d}{dx}u\right)v - u\,\frac{d}{dx}v. \tag{1.23}$$
Let $f_j(x,k)$, $j = 1, 2$, $k \ge 0$, be the Jost solutions to (1.22) (see [6, 7, 5] and [24]). A potential $V$ is said to be generic if $[f_1(x,0), f_2(x,0)] \ne 0$ and $V$ is said to be exceptional if $[f_1(x,0), f_2(x,0)] = 0$. If $V$ is exceptional there is a bounded solution to (1.22) with $k^2 = 0$ (a half-bound state or a zero-energy resonance). Note that the trivial potential, $V = 0$, is exceptional. We prove in Appendix 2 that the generic potentials are a dense open set in $L^1_1$. Our main result is the following theorem.
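Theorem 1.1. Suppose that $V \in L^1_2$, that $N(V) < \infty$, that $V$ is a generic potential and that $H := -\frac{d^2}{dx^2} + V(x)$ has only one negative eigenvalue. We assume that for each fixed $x \in \mathbb{R}$, $f(x,\cdot) \in C^1(\mathbb{R},\mathbb{R})$, $\frac{\partial}{\partial x}f(x,\cdot) \in C(\mathbb{R},\mathbb{R})$, $f(x,0) = 0$ and, for some $p > 2$,
$$\left|\frac{\partial}{\partial u}f(x,u)\right| \le q(x)|u|^{p-1}, \tag{1.24}$$
where $(1+|x|)^{2s+4\beta}q(x) \in L^\infty$, for some $s > 1$, and $1/2 < \beta \le 1$. Moreover,
$$\left|\frac{\partial}{\partial x}f(x,u)\right| \le C|u|^{p}. \tag{1.25}$$
Then there is an $\eta > 0$, such that for all $\phi \in H_1 \cap L^2_{s+2\beta}$ with $\|\phi\|_{H_1} < \eta$, there exist functions, $E(t)$ and $\theta(t)$, in $C^1(\mathbb{R},\mathbb{R})$, such that for some constant $C$,
$$\left\| u(t) - e^{-i\int_0^t E(\rho)\,d\rho}\,e^{i\theta(t)}\Psi_{E(t)} \right\|_{L^2_{-s-2\beta}} \le C(1+|t|)^{-1/2-\beta}\left\| P_c\phi - h((\phi,\Psi_0)) \right\|_{L^2_{s+2\beta}}, \tag{1.26}$$
where $u(t)$ is the solution to (1.1) with initial value $\phi$. Moreover, the following limits exist:
$$\lim_{t \to \pm\infty} E(t) := E_\pm; \quad \lim_{t \to \pm\infty} \theta(t) := \theta_\pm. \tag{1.27}$$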
Equation (1.26) tells us that $u$ tends to the periodic orbit of $e^{i\theta_\pm}\Psi_{E_\pm}$. In particular, solutions with initial data near a nonlinear bound state are asymptotic as $t \to \pm\infty$ to the periodic orbits of nearby nonlinear bound states. Note that the dispersive part, $u(t) - e^{-i\int_0^t E(\rho)\,d\rho}\,e^{i\theta(t)}\Psi_{E(t)}$, tends to zero in $L^2_{-s-2\beta}$, as $t \to \pm\infty$, with the same rate as the dispersive solutions to the associated linear Schrödinger equation (1.5) (see Theorem 1.2 below). A result as in Theorem 1.1 was proven in three or more dimensions in [21, 22] and [16]. For results on the asymptotic stability of stationary (time independent) solutions to nonlinear evolution equations see [11, 13, 12] and the references mentioned in these papers. For results on the asymptotic stability of solitons see [2] and [3]. For studies of the orbital stability of solitons see [27] and [28]. The proof of Theorem 1.1 is based on ideas from center manifold theory, [4, 9] and [16]. The basic dynamical input of the proof is the following $L^2_{s+2\beta} - L^2_{-s-2\beta}$ estimate. For any pair of Banach spaces, $X, Y$, $B(X,Y)$ denotes the Banach space of all bounded operators from $X$ into $Y$. In the theorem below we consider the general case where $H$ has a finite number of eigenvalues.

Theorem 1.2. Suppose that $V \in L^1_2$, and that $V$ is generic. Then, for any $s > 1$ and $0 \le \beta \le 1$, there is a constant $C$ such that,
$$\left\| e^{-itH}P_c \right\|_{B\left(L^2_{s+2\beta},\,L^2_{-s-2\beta}\right)} \le C(1+|t|)^{-1/2-\beta}. \tag{1.28}$$
A decay estimate like (1.28) was proven in Theorem 7.6 of [14] for potentials such that $V$ is a compact operator from $H_l$ into $L^2_\rho$ for some $l < 1$ and some $\rho > 4$. Murata’s condition roughly means that $V$ decays as $|x|^{-\rho}$ as $|x| \to \infty$, $\rho > 4$, whereas our condition, $V \in L^1_2$, requires $\rho > 3$. We give below a rather simple proof of Theorem 1.2, quite different from the one in [14], based on our results in [24]. The key issue of estimate (1.28) is that it is integrable for large times if $\beta > 1/2$. For exceptional potentials the decay rate is $(1+|t|)^{-1/2}$. We can prove this result with our methods, but we do not consider this issue here. See [14] on this point. The paper is organized as follows. We prove Theorem 1.1 in Sect. 2. Theorem 1.2 is proven in Sect. 3. In Appendix 1 we prove estimate (1.13) and in Appendix 2 we prove that the generic potentials are a dense open set in $L^1_1$.

2. The Center Manifold

In this section we prove Theorem 1.1. Let us project the solution to (1.1) along $P_0$ and $P_c$, i.e., $u(t) = u_p(t)\Psi_0 + u_c(t)$, where $u_c(t) := P_c u(t)$. Then, (1.1) is equivalent to the following system:
$$i\frac{d}{dt}u_p = E_0u_p + g_p(u_p,u_c); \quad i\frac{\partial}{\partial t}u_c = Hu_c + g_c(u_p,u_c), \tag{2.1}$$
where, denoting $g(x,u) := f(x,|u|)\frac{u}{|u|}$, we have that,
$$g_p(u_p,u_c) := P_0\,g(x,u_p\Psi_0 + u_c) = (g(x,u_p\Psi_0 + u_c),\Psi_0)\Psi_0; \quad g_c(u_p,u_c) := P_c\,g(x,u_p\Psi_0 + u_c). \tag{2.2}$$
In a similar way, any point on the center manifold is written as, $e^{i\theta}\Psi_E = u_p\Psi_0 + h(u_p)$, $h(u_p) \in H_c$, where $u_p, h(u_p)$ are the solution to the following system (see [16]):
$$E_0 - E = -\frac{g_p(u_p,h(u_p))}{u_p}; \quad h(u_p) = -(H-E)^{-1}g_c(u_p,h(u_p)). \tag{2.3}$$
We first prove that $u(t)$ approaches the center manifold. For this purpose, let us consider the vector in $M$ that has the same projection along $P_0$ as $u(t)$, i.e., $\Psi(t) := u_p(t)\Psi_0 + h(u_p(t))$. We prove below that the difference, $v(t) := u(t) - \Psi(t) = u_c(t) - h(u_p(t))$, satisfies the estimate,
$$\|v(t)\|_{L^2_{-s_1}} \le C(1+|t|)^{-1/2-\beta}\|v(0)\|_{L^2_{s_1}}, \tag{2.4}$$
where $s_1 := s + 2\beta$. By (2.1), $v(t)$ is a solution of the following equation:
$$i\frac{\partial}{\partial t}v(t) = Hv(t) + Q(u_p(t),v(t)), \tag{2.5}$$
where,
$$Q(u_p,v) := g_c(u_p,h(u_p)+v) - g_c(u_p,h(u_p)) - (Dh)(u_p)\left[g_p(u_p,h(u_p)+v) - g_p(u_p,h(u_p))\right], \tag{2.6}$$
with $(Dh)$ the Fréchet derivative of $h$. To verify (2.6) we must prove that
$$(Dh)(u_p)\left[E_0u_p + g_p(u_p,h(u_p))\right] = Hh(u_p) + g_c(u_p,h(u_p)). \tag{2.7}$$
We prove below this equation at $t = t_0$, for any $t_0 \in \mathbb{R}$. We denote, $E := E(u_p(t_0))$. Note that by (2.3), $[e^{-itE}u_p(t_0), h(e^{-itE}u_p(t_0))]$ is a solution to (2.1) (recall that $h(e^{-itE}u_p) = e^{-itE}h(u_p)$). Then, using the equation for $u_p$ in (2.1),
$$i\frac{\partial}{\partial t}h(e^{-itE}u_p(t_0)) = (Dh)(e^{-itE}u_p(t_0)) \cdot \left[E_0e^{-itE}u_p(t_0) + e^{-itE}g_p(u_p(t_0),h(u_p(t_0)))\right]. \tag{2.8}$$
Moreover, by the equation for $u_c$ in (2.1),
$$i\frac{\partial}{\partial t}h(e^{-itE}u_p(t_0)) = Hh(e^{-itE}u_p(t_0)) + e^{-itE}g_c(u_p(t_0),h(u_p(t_0))). \tag{2.9}$$
Equation (2.7) follows taking $t = 0$ in (2.8) and (2.9). By (1.24), $|g(x,u+v) - g(x,u)| \le Cq(x)(|u|^{(p-1)} + |v|^{(p-1)})|v|$, and we have that
$$\left\|g(x,u_p\Psi_0+h(u_p)) - g(x,u_p\Psi_0+h(u_p)+v)\right\|_{L^2_{s_1}} \le C\left\|(1+|x|)^{2s_1}q(x)\right\|_{L^\infty}\left(\|u_p\Psi_0+h(u_p)\|^{(p-1)}_{H_1} + \|v\|^{(p-1)}_{H_1}\right)\|v\|_{L^2_{-s_1}}, \tag{2.10}$$
where we used Sobolev’s [1] theorem to bound the $L^\infty$ norms by the $H_1$ norms. By (1.13) $P_0$ and $P_c = I - P_0$ are bounded operators on $L^2_s$, $s \in \mathbb{R}$, and it follows from (2.10) that
$$\left\|g_p(u_p,h(u_p)) - g_p(u_p,h(u_p)+v)\right\|_{L^2_{s_1}} \le C\left(\|u_p\Psi_0+h(u_p)\|^{(p-1)}_{H_1} + \|v\|^{(p-1)}_{H_1}\right)\|v\|_{L^2_{-s_1}}, \tag{2.11}$$
$$\left\|g_c(u_p,h(u_p)) - g_c(u_p,h(u_p)+v)\right\|_{L^2_{s_1}} \le C\left(\|u_p\Psi_0+h(u_p)\|^{(p-1)}_{H_1} + \|v\|^{(p-1)}_{H_1}\right)\|v\|_{L^2_{-s_1}}. \tag{2.12}$$
By (1.21) given any $\varepsilon_1 > 0$ we can take $\eta$ so small that if $\|\phi\|_{H_1} < \eta$, we have that $|u_p(t)| = |(u(t),\Psi_0)| \le \|u(t)\|_{H_1} < \varepsilon_1$. Moreover, since $h$ is $C^1$ and $h(0) = 0$,
$$\|h(u_p(t))\|_{H_1} \le C|u_p| \le C\varepsilon_1, \tag{2.13}$$
and we conclude that,
$$\|v(t)\|_{H_1} \le C\varepsilon_1. \tag{2.14}$$
By (2.6), (2.10), (2.11) and (2.12),
$$\|Q(u_p(t),v(t))\|_{L^2_{s_1}} \le C\varepsilon_1\|v(t)\|_{L^2_{-s_1}}, \quad \text{if } \|\phi\|_{H_1} < \eta. \tag{2.15}$$
We write (2.5) as an integral equation,
$$v(t) = e^{-itH}v(0) + \frac{1}{i}\int_0^t ds\; e^{-i(t-s)H}Q(u_p(s),v(s)). \tag{2.16}$$
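Let us denote $v_T := \max_{|t| \le T}(1+|t|)^{1/2+\beta}\|v(t)\|_{L^2_{-s_1}}$. By Theorem 1.2, and (2.15), for $|t| \le T$,
$$\|v(t)\|_{L^2_{-s_1}} \le C(1+|t|)^{-1/2-\beta}\|v(0)\|_{L^2_{s_1}} + C\varepsilon_1(\operatorname{sign} t)\int_0^t ds\,(1+|t-s|)^{-1/2-\beta}(1+|s|)^{-1/2-\beta}v_T \le C(1+|t|)^{-1/2-\beta}\left[\|v(0)\|_{L^2_{s_1}} + C\varepsilon_1v_T\right]. \tag{2.17}$$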
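The last inequality in (2.17) rests on the convolution $\int_0^t (1+|t-s|)^{-1/2-\beta}(1+|s|)^{-1/2-\beta}\,ds$ being bounded by a constant times $(1+|t|)^{-1/2-\beta}$, which is where the integrability condition $\beta > 1/2$ enters. A small numerical sketch can make this visible. The function name, the sample times and the trapezoidal quadrature below are illustrative assumptions, not part of the paper's argument.

```python
import numpy as np


def weighted_convolution(beta, t, n=200_001):
    """Return (1+t)^(1/2+beta) * int_0^t (1+t-s)^(-1/2-beta) (1+s)^(-1/2-beta) ds, t > 0.

    The integral is approximated with the composite trapezoidal rule.
    """
    a = 0.5 + beta
    s = np.linspace(0.0, t, n)
    integrand = (1.0 + (t - s)) ** (-a) * (1.0 + s) ** (-a)
    ds = s[1] - s[0]
    integral = ds * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return integral * (1.0 + t) ** a


times = [1.0, 10.0, 100.0, 1000.0]
# beta = 1 lies in the range 1/2 < beta <= 1 of Theorem 1.1: the ratio stays bounded.
ratios_integrable = [weighted_convolution(1.0, t) for t in times]
# beta = 1/4 violates beta > 1/2: the ratio keeps growing with t.
ratios_slow = [weighted_convolution(0.25, t) for t in times]
```

For $\beta = 1$ the rescaled integral stays below a fixed constant over three decades of $t$, while for $\beta = 1/4$ it grows roughly like $t^{1/4}$, so the bootstrap step leading to (2.18) would not close.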
Taking $\eta$ so small that $C\varepsilon_1 < 1/2$, we obtain that
$$v_T \le C\|v(0)\|_{L^2_{s_1}}, \tag{2.18}$$
and since the constant $C$ is independent of $T$, Eq. (2.4) follows. Equation (1.26) follows from (2.4) by the argument given in Sect. 4 of [16].

3. The $L^2_{s+2\beta} - L^2_{-s-2\beta}$ Estimate

The results on the spectral theorem for $H$ that we state below follow from the Weyl–Kodaira–Titchmarsh theory. See for example [5]. For a version of the Weyl–Kodaira–Titchmarsh theory adapted to our situation see Appendix 1 of [29] and also the proof of Theorem 6.1 in p. 78 of [29]. Let us denote for any $k \in \mathbb{R}$,
$$\Psi_+(x,k) := \begin{cases} \frac{1}{\sqrt{2\pi}}\,T(k)f_1(x,k), & k \ge 0, \\ \frac{1}{\sqrt{2\pi}}\,T(-k)f_2(x,-k), & k < 0, \end{cases} \tag{3.1}$$
and $\Psi_-(x,k) := \Psi_+(x,-k)$. Recall that $H_c$ is the subspace of continuity of $H$. Then the following limits:
$$\hat{\phi}_\pm(k) := \operatorname*{s-lim}_{N \to \infty}\int_{-N}^{N}\Psi_\pm(x,k)\,\phi(x)\,dx, \tag{3.2}$$
exist in the strong topology in $L^2$ for every $\phi \in H_c$ and the operators
$$(F_\pm\phi)(k) := \hat{\phi}_\pm(k), \tag{3.3}$$
are unitary from $H_c$ onto $L^2$. Moreover, the $F_\pm^*$ are given by
$$\left(F_\pm^*\phi\right)(x) = \operatorname*{s-lim}_{N \to \infty}\int_{-N}^{N}\Psi_\pm(x,k)\,\phi(k)\,dk, \tag{3.4}$$
where the limits exist in the strong topology in $L^2$. Furthermore, the operators $F_\pm^*F_\pm$ are the orthogonal projector onto $H_c$. For each eigenvalue of $H$, let $\Psi_j$, $j = 0, 1, 2, \cdots, N$, be the corresponding eigenfunction normalized to one, i.e. $\|\Psi_j\|_{L^2} = 1$. The operators
$$F_j\phi := (\phi,\Psi_j)\Psi_j, \quad j = 0, 1, 2, \cdots, N, \tag{3.5}$$
are unitary from the eigenspace generated by $\Psi_j$ onto $\mathbb{C}$. The following operators:
$$F^\pm = F_\pm \oplus_{j=0}^{N} F_j, \tag{3.6}$$
are unitary from $L^2$ onto $L^2 \oplus_{j=0}^{N}\mathbb{C}$ and for any $\phi \in D(H)$:
$$F^\pm H\phi = \left\{ k^2(F_\pm\phi)(k),\ E_0F_0\phi,\ \cdots,\ E_NF_N\phi \right\}. \tag{3.7}$$
Moreover, for any bounded Borel function, $\Phi$, defined on $\mathbb{R}$,
$$F^\pm\Phi(H)\phi = \left\{ \Phi(k^2)(F_\pm\phi)(k),\ \Phi(E_0)F_0\phi,\ \cdots,\ \Phi(E_N)F_N\phi \right\}. \tag{3.8}$$
The projector, $P_0$, onto the subspace of $L^2$ generated by the eigenvectors of $H$ is given by
$$P_0\phi := \sum_{j=0}^{N}(\phi,\Psi_j)\Psi_j. \tag{3.9}$$
Since $H$ has no singular-continuous spectrum the projector onto the continuous subspace of $H$ is given by $P_c := I - P_0$. It follows from (3.6) that
$$e^{-itH}P_c = F_\pm^*\,e^{-ik^2t}\,F_\pm. \tag{3.10}$$
Equation (3.10) is the starting point of our proof of Theorem 1.2.

Proof of Theorem 1.2. It follows from (3.10) that for any $\phi \in L^2 \cap L^1$,
$$e^{-itH}P_c\phi(x) = \int\Phi_t(x,y)\phi(y)\,dy, \tag{3.11}$$
where
$$\Phi_t(x,y) := \int e^{-ik^2t}\,\Psi_+(x,k)\Psi_+(y,k)\,dk. \tag{3.12}$$
The proofs of Lemmas 2.4 and 2.5 of [24] imply that
$$|\Phi_t(x,y)| \le C\frac{1}{\sqrt{|t|}}. \tag{3.13}$$
We prove below that
$$(1+|x|)^{-2}\,|\Phi_t(x,y)|\,(1+|y|)^{-2} \le C\frac{1}{|t|^{3/2}}. \tag{3.14}$$
By (3.13) and (3.14), for any $s > 1$ and $0 \le \beta \le 1$,
$$(1+|x|)^{-s-2\beta}\,|\Phi_t(x,y)|\,(1+|y|)^{-s-2\beta} \le C\frac{1}{|t|^{1/2+\beta}}. \tag{3.15}$$
Equation (1.28) follows from (3.15), from Schur’s criterion and from the unitarity in $L^2$ of $e^{-itH}P_c$. We now prove (3.14). Let $\chi_1 \in C_0^\infty(\mathbb{R})$ satisfy $\chi_1(k^2) = 1$ for $|k| \le 1$ and let us denote $\chi_2 := 1 - \chi_1$. Changing the variable of integration in (3.12) to $\lambda := k^2$ we obtain that
$$\Phi_t = \Phi_{1,t} + \Phi_{2,t}, \tag{3.16}$$
where
$$\Phi_{j,t}(x,y) := \int_0^\infty \frac{1}{\sqrt{\lambda}}\,\chi_j(\lambda)\,e^{-i\lambda t}\,\Psi_+(x,\sqrt{\lambda})\Psi_+(y,\sqrt{\lambda})\,d\lambda, \quad j = 1, 2, \tag{3.17}$$
where we used that $\Psi_+(x,-k) = \Psi_+(x,k)$. Let us denote,
$$h_j(\lambda,x,y) := \frac{1}{\sqrt{\lambda}}\,\chi_j(\lambda)\,\Psi_+(x,\sqrt{\lambda})\Psi_+(y,\sqrt{\lambda}), \quad j = 1, 2. \tag{3.18}$$
We first estimate $\Phi_{1,t}(x,y)$. The key issue is that in the generic case the transmission coefficient satisfies $T(k) = \alpha k + o(k)$, as $k \to 0$, where $\alpha \ne 0$, and the reflection coefficients satisfy $R_j(0) = -1$, $j = 1, 2$ (see [5]). Then, it follows from [24]: Eq. (2.5), Lemma 2.1, Eqs. (2.38) and (2.40–2.42) that
$$|h_1(\lambda,x,y)| \le C\frac{\sqrt{\lambda}}{1+\sqrt{\lambda}}; \quad \left|\frac{\partial}{\partial\lambda}h_1(\lambda,x,y)\right| \le C\frac{1}{\sqrt{\lambda}}(1+|x|)(1+|y|). \tag{3.19}$$
Let us extend $h_1(\lambda,x,y)$ to a function defined for $\lambda \in \mathbb{R}$ by setting $h_1(\lambda,x,y) := 0$ for $\lambda \le 0$. Then,
$$\Phi_{1,t}(x,y) = \frac{1}{2}\left[\int e^{-it\lambda}h_1(\lambda,x,y)\,d\lambda - \int e^{-it(\lambda-\pi/t)}h_1(\lambda,x,y)\,d\lambda\right] = \frac{1}{2}\int e^{-it\lambda}\left[h_1(\lambda,x,y) - h_1(\lambda+\pi/t,x,y)\right]d\lambda. \tag{3.20}$$
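The identity (3.20) uses only $e^{i\pi} = -1$ and a shift of the integration variable, so it can be checked numerically on any rapidly decaying integrand. The sketch below is an illustration under assumptions that are not in the paper: a Gaussian stands in for $h_1$ (the actual $h_1$ vanishes for $\lambda \le 0$ and depends on $x, y$), and the function names and the grid are hypothetical.

```python
import numpy as np


def oscillatory_integral(h, t, lam):
    """Riemann-sum approximation of int exp(-i t lam) h(lam) d lam on a uniform grid."""
    dlam = lam[1] - lam[0]
    return np.sum(np.exp(-1j * t * lam) * h(lam)) * dlam


def shifted_difference_integral(h, t, lam):
    """Right-hand side of (3.20): (1/2) int exp(-i t lam) [h(lam) - h(lam + pi/t)] d lam."""
    dlam = lam[1] - lam[0]
    difference = h(lam) - h(lam + np.pi / t)
    return 0.5 * np.sum(np.exp(-1j * t * lam) * difference) * dlam


def gaussian(lam):
    """Stand-in integrand (an assumption for this example, not the paper's h_1)."""
    return np.exp(-lam ** 2)


lam_grid = np.linspace(-30.0, 30.0, 60_001)
t_sample = 3.0
direct = oscillatory_integral(gaussian, t_sample, lam_grid)
shifted = shifted_difference_integral(gaussian, t_sample, lam_grid)
```

The point of rewriting the integral this way is that the difference $h_1(\lambda) - h_1(\lambda + \pi/t)$ is small when $t$ is large, which is what produces the extra decay in (3.22).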
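Hence,
$$|\Phi_{1,t}(x,y)| \le C\int|h_1(\lambda,x,y) - h_1(\lambda+\pi/t,x,y)|\,d\lambda. \tag{3.21}$$
If $t > 0$,
$$\begin{aligned}
\int|h_1(\lambda,x,y) - h_1(\lambda+\pi/t,x,y)|\,d\lambda &\le 2\int_0^{2\pi/t}|h_1(\lambda,x,y)|\,d\lambda + \int_{\pi/t}^{\infty}d\lambda\int_\lambda^{\lambda+\pi/t}\left|\frac{\partial}{\partial\rho}h_1(\rho,x,y)\right|d\rho \\
&\le \frac{C}{t^{3/2}}(1+|x|)(1+|y|) + \frac{\pi}{t}\int_{\pi/t}^{\infty}\left|\frac{\partial}{\partial\rho}h_1(\rho,x,y)\right|d\rho \\
&\le \frac{C}{t^{3/2}}(1+|x|)(1+|y|),
\end{aligned} \tag{3.22}$$
where we used (3.19). If $t < 0$ we change the variable of integration in (3.21) to $\lambda' := \lambda + \pi/t$, and we proceed as above.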
Let us denote, $m_1(x,k) := e^{-ikx}f_1(x,k)$ and $m_2(x,k) := e^{ikx}f_2(x,k)$. We designate, $\ddot{m}_j(x,k) := \frac{\partial^2}{\partial k^2}m_j(x,k)$, $j = 1, 2$, and $m_j'(x,k) := \frac{\partial}{\partial x}m_j(x,k)$, $j = 1, 2$. We prove below that if $V \in L^1_2$, then, for any $x_0 \in \mathbb{R}$,
$$|\ddot{m}_1(x,k)| \le C_{x_0}\frac{1+|k|}{|k|^2}, \quad x \ge x_0; \qquad |\ddot{m}_2(x,k)| \le C_{x_0}\frac{1+|k|}{|k|^2}, \quad x \le x_0, \tag{3.23}$$
$$\left|\ddot{m}_1'(x,k)\right| \le C_{x_0}\left(1 + \frac{1+|k|}{|k|^2}\right), \quad x \ge x_0; \qquad \left|\ddot{m}_2'(x,k)\right| \le C_{x_0}\left(1 + \frac{1+|k|}{|k|^2}\right), \quad x \le x_0. \tag{3.24}$$
Integrating by parts twice with respect to $\lambda$ in (3.17) with $j = 2$ and using (3.23), (3.24) and [24]: Eq. (2.5), Lemma 2.1, Eqs. (2.38) and (2.40–2.42), we obtain that
$$|\Phi_{2,t}(x,y)| \le C\frac{1}{t^2}(1+|x|)^2(1+|y|)^2. \tag{3.25}$$
Equation (3.14) follows from (3.16), (3.21), (3.22) and (3.25). We prove now (3.23) and (3.24) for $m_1$. The case of $m_2$ follows in a similar way. As is well known (see [5, 24]),
$$m_1(x,k) = \lim_{n \to \infty}m_{1,n}(x,k), \tag{3.26}$$
where $m_{1,0}(x,k) := 1$ and the $m_{1,n}$ satisfy the following equation for $n = 0, 1, \ldots$:
$$m_{1,n+1}(x,k) = 1 + \int_x^\infty D_k(y-x)V(y)m_{1,n}(y,k)\,dy, \tag{3.27}$$
where,
$$D_k(x) := \int_0^x e^{2iky}\,dy = \begin{cases} \frac{1}{2ik}\left(e^{2ikx} - 1\right), & k \ne 0, \\ x, & k = 0. \end{cases} \tag{3.28}$$
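As a quick sanity check of the closed form in (3.28), the kernel can be compared with a direct quadrature of $\int_0^x e^{2iky}\,dy$. The helper names and the sample arguments below are illustrative assumptions.

```python
import numpy as np


def D(k, x):
    """Closed form of D_k(x) from (3.28), with the k = 0 case handled separately."""
    if k == 0:
        return complex(x)
    return (np.exp(2j * k * x) - 1.0) / (2j * k)


def D_quadrature(k, x, n=20_001):
    """Composite trapezoidal approximation of int_0^x exp(2 i k y) dy."""
    y = np.linspace(0.0, x, n)
    values = np.exp(2j * k * y)
    dy = y[1] - y[0]
    return dy * (values.sum() - 0.5 * (values[0] + values[-1]))


closed_form = D(1.3, 2.0)
numerical = D_quadrature(1.3, 2.0)
near_zero = D(1e-8, 2.0)   # should be close to the k = 0 value x = 2
```

Since the integrand has modulus one, $|D_k(x)| \le |x|$ for real $k$, which is the elementary bound behind the $|y-x|$ weights on $V$ in the iteration (3.27).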
Clearly,
$$\left|\ddot{D}_k(x)\right| \le C\frac{(1+|x|)^2(1+|k|)}{|k|^2}. \tag{3.29}$$
By (3.27),
$$\ddot{m}_{1,n+1}(x,k) = \int_x^\infty D_k(y-x)V(y)\ddot{m}_{1,n}(y,k)\,dy + A_n(x,k), \tag{3.30}$$
where,
$$A_n(x,k) := 2\int_x^\infty \dot{D}_k(y-x)V(y)\dot{m}_{1,n}(y,k)\,dy + \int_x^\infty \ddot{D}_k(y-x)V(y)m_{1,n}(y,k)\,dy. \tag{3.31}$$
Then by (3.29) and [24]: Eqs. (2.10), (2.18) and (2.21) with $\gamma = 2$ for $|k| \le 1$ and $\gamma = 1$ for $|k| \ge 1$,
$$|A_n(x,k)| \le C_{x_0}\frac{1+|k|}{|k|^2}\|V\|_{L^1_2}, \quad x \ge x_0, \tag{3.32}$$
and it follows from (3.29) and (3.30) that
$$|\ddot{m}_{1,n+1}(x,k)| \le C_{x_0}\frac{1+|k|}{|k|^2} + C_{x_0}\frac{1+|k|}{|k|^2}\int_x^\infty|y-x|^2|V(y)||\ddot{m}_{1,n}(y,k)|\,dy. \tag{3.33}$$
Iterating (3.33) $n+1$ times we prove that
$$|\ddot{m}_{1,n+1}(x,k)| \le C_{x_0}\frac{1+|k|}{|k|^2}, \quad x \ge x_0, \tag{3.34}$$
and (3.23) for $m_1$ follows taking the limit as $n \to \infty$. Equation (3.24) follows from (3.23) and from [24]: Lemma 2.1 and Eq. (2.23).
4. Appendix 1

In this Appendix we prove Eq. (1.13). For this purpose it is enough to assume that $V \in L^1$, and that $N(V) < \infty$. Note that as $\Psi_E \in H_2$ it follows from Sobolev’s theorem [1] that $\|\Psi_E\|_{L^\infty} \le C\|\Psi_E\|_{H_2}$. Then $q_e := V + f(x,|\Psi_E|)/|\Psi_E| \in L^1$ and $N(q_e) < \infty$. By Theorem 2.7.1 on p. 35 of [20] the differential expression $-\frac{d^2}{dx^2} + q_e$ is essentially self-adjoint on $C_0^\infty$. We denote by $H_e$ the unique self-adjoint realization in $L^2$. Moreover, $D(H_e) = H_2$. Let $g_j\left(x, i\sqrt{|E|}\right)$, $j = 1, 2$, be the Jost solutions for the potential $q_e$ at energy $E$. They satisfy (see Lemma 1 of [5]),
$$\left|g_1\left(x, i\sqrt{|E|}\right)\right| \le Ce^{-\sqrt{|E|}\,|x|}, \quad x \ge 0; \qquad \left|g_2\left(x, i\sqrt{|E|}\right)\right| \le Ce^{-\sqrt{|E|}\,|x|}, \quad x \le 0. \tag{4.1}$$
But the differential expression $-\frac{d^2}{dx^2} + q_e$ is in the limit-point case at $\pm\infty$ [26]. Hence, $g_j(x, i\sqrt{|E|})$, $j = 1, 2$, are, respectively, the only independent solutions to the eigenvalue equation
$$-\frac{d^2}{dx^2}g + q_eg = Eg, \tag{4.2}$$
that are square integrable on a neighborhood of $\pm\infty$. By (1.11) $\Psi_E$ is a solution in $H_2$ to the linear eigenvalue equation (4.2). Then, $g_1$ and $g_2$ are linearly dependent, and
$$\Psi_E(x) = \alpha g_1\left(x, i\sqrt{|E|}\right) = \beta g_2\left(x, i\sqrt{|E|}\right), \tag{4.3}$$
for some constants $\alpha, \beta$. Equation (1.13) follows from (4.1) and (4.3). Note that the argument above also implies that
$$|\Psi_0(x)| \le Ce^{-\sqrt{|E_0|}\,|x|}. \tag{4.4}$$
5. Appendix 2
In this Appendix we prove that the generic potentials are a dense open set in $L^1_1$. For each fixed $x \in \mathbb{R}$ the functions $V \in L^1_1 \mapsto f_j(x,k)$, $j = 1, 2$, are continuous [5]. It follows that the set of generic potentials is an open set in $L^1_1$. Suppose that $V \in L^1_1$ and denote
\[ W(\lambda) := \left[ f_{1,\lambda}(x,0),\, f_{2,\lambda}(x,0) \right], \tag{5.1} \]
where $f_{j,\lambda}$, $j = 1, 2$, are the Jost solutions for $\lambda V$ given by [24]: Eqs. (2.11–2.13) for $j = 1$ and similar formulas for $j = 2$. $W(\lambda)$ is an entire analytic function of $\lambda$ and there are two possibilities.
(a) $W(\lambda)$ is not identically zero. Then the set of zeros of $W(\lambda)$ is discrete and there exists a sequence $\lambda_n$ with $W(\lambda_n) \neq 0$ and $\lim_{n\to\infty}\lambda_n = 1$. It follows that $\lambda_n V$ is generic and that $\lim_{n\to\infty}\lambda_n V = V$ strongly in $L^1_1$.
(b) $W(\lambda)$ is identically zero. In this case it follows from [24]: Eqs. (2.11–2.13) and (2.23) that
\[ \left.\frac{d}{d\lambda} W(\lambda)\right|_{\lambda=0} = -\int V(y)\,dy = 0. \tag{5.2} \]
Take any $q(x) \in L^1_1$ with $q(x) > 0$ and any sequence $\epsilon_n > 0$ with $\lim_{n\to\infty}\epsilon_n = 0$. As $V + \epsilon_n q$ does not satisfy (5.2) there are sequences $\lambda_{n,m}$ with $\lim_{m\to\infty}\lambda_{n,m} = 1$ such that $V_{n,m} := \lambda_{n,m}(V + \epsilon_n q)$ are generic. We can always take a subsequence of the $V_{n,m}$ that converges strongly to $V$ in $L^1_1$. It follows that the set of generic potentials is dense in $L^1_1$.
References
1. Adams, R. A.: Sobolev spaces. New York: Academic Press, 1975
2. Buslaev, V.S. and Perelman, G.S.: Scattering states for the nonlinear Schrödinger equation: states close to a soliton. Algebra i Analiz 4 (1992), 63–102 [English translation in St. Petersburg Math. J. 4 (1993), 1111–1142]
3. Buslaev, V.S. and Perelman, G.S.: On the stability of solitary waves for nonlinear Schrödinger equation. Am. Math. Soc. Transl. (2) 164, 75–89 (1995)
4. Carr, J.: Applications of centre manifold theory. Applied Mathematical Sciences 35, New York: Springer-Verlag, 1981
5. Deift, P. and Trubowitz, E.: Inverse scattering on the line. Commun. Pure Appl. Math. XXXII, 121–251 (1979)
6.
Faddeev, L.D.: Properties of the S matrix of the one-dimensional Schrödinger equation. Trudy Math. Inst. Steklov 73, 314–333 (1964) [English translation in Am. Math. Soc. Translation Series 2 65, 139–166 (1964)] 7. Faddeev, L.D.: Inverse problems of quantum scattering theory, II. Itogi Nauki i Tekhniki Sovremennye Problemy Matematiki 3, 93–180 (1974)[English translation in J. Soviet Math. 5, 334–396 (1976)] 8. Ginibre, J.: Introduction aux équations de Schrödinger non linéaires. Paris: Onze Édition, 1998 9. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Mathematics, 840, Berlin: Springer-Verlag, 1981 10. Kato, T.: Nonlinear Schrödinger equations. In: H. Holden and A. Jenssen (eds.) Schrödinger operators, Lecture Notes in Physics 345, Berlin: Springer-Verlag, 1989, pp. 218–263 11. Komech, A. and Vainberg, B.: On asymptotic stability of stationary solutions to nonlinear wave and Klein–Gordon equations. Arch. Rational Mechanics and Analysis 134, 227–248 (1996) 12. Komech, A.: On transitions to stationary states in Hamiltonian nonlinear wave equations. Phys. Lett. A 241, 311–322 (1998) 13. Komech, A., Spohn, H. and Kunze, M.: Long-time asymptotics for a classical particle interacting with a scalar field. Comm. Part. Diff. Equations 22, 307–335 (1997) 14. Murata, M.: Asymptotic expansions in time for solutions of Schrödinger-type equations. J. Funct. Analysis 49, 10–56 (1982)
15. Nirenberg, L.: Topics in nonlinear functional analysis. Courant Institute of Mathematical Sciences Lecture Notes, New York: New York University, 1974 16. Pillet, C.-A. and Wayne C.E.: Invariant manifolds for a class of dispersive, hamiltonian, partial differential equations. J. Differential Equations 141, 310–326 (1997) 17. Reed, M. and Simon, B.: Methods of modern mathematical physics I: functional analysis. New York: Academic Press, 1972 18. Reed, M. and Simon, B.: Methods of modern mathematical physics II: Fourier analysis, self-adjointness. New York: Academic Press, 1975 19. Reed, M. and Simon, B.: Methods of modern mathematical physics III: scattering theory. New York: Academic Press, 1978 20. Schechter, M.: Operator methods in quantum mechanics. New York: North Holland, 1981 21. Soffer, A. and Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations. Commun. Math. Phys. 133, 119–146 (1990) 22. Soffer, A. and Weinstein, M.I.: Multichannel nonlinear scattering for nonintegrable equations II. The case of anisotropic potentials and data. J. Differential Equations 98, 376–390 (1992) 23. Weder, R.: Inverse scattering for the nonlinear Schrödinger equation. Commun. Part. Diff. Equations 22, 2089–2103 (1997) 24. Weder, R.: Lp − Lp´ estimates for the Schrödinger equation on the line and inverse scattering for the nonlinear Schrödinger equation with a potential. J. Funct. Analysis 170, 37–68 (2000) 25. Weder, R.: The Wk,p -continuity of the Schrödinger wave operators on the line. Commun. Math. Phys. 208, 507–520 (1999) 26. Weidmann, J.: Spectral theory of ordinary differential operators. Lecture Notes in Mathematics 1258, Berlin: Springer-Verlag, 1987 27. Weinstein, M.I.: Modulation stability of ground states of nonlinear Schrödinger equations. SIAM J. Math. Anal. 16, 472–491 (1985) 28. Weinstein, M.I.: Lyapunov stability of ground states of nonlinear dispersive evolution equations. Commun. Pure Appl. Math. 39, 51–67 (1986) 29. 
Wilcox, C.H.: Sound propagation in stratified fluids. Applied Mathematical Sciences 50, New York: Springer-Verlag, 1984 Communicated by A. Kupiainen
Commun. Math. Phys. 215, 357 – 373 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
On Existence of Mini-Boson Stars
Piotr Bizoń¹, Arthur Wasserman²
¹ Institute of Physics, Jagellonian University, Kraków, Poland
² Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, USA
Received: 14 February 2000 / Accepted: 26 June 2000
Abstract: We prove the existence of a countable family of globally regular solutions of spherically symmetric Einstein–Klein–Gordon equations. These solutions, known as mini-boson stars, were discovered numerically many years ago.
1. Introduction
Boson stars are compact gravitationally bound soliton-like equilibrium configurations of bosonic fields. The simplest kind of boson star, which is made up of a self-gravitating free complex massive scalar field, was conceived over thirty years ago by Kaup [1] and Ruffini and Bonazzola [2] who found numerically the ground state solution to the spherically symmetric Einstein–Klein–Gordon (EKG) equations. A decade later the systematic numerical analysis of these equations was performed by Friedberg, Lee, and Pang [3] who rediscovered and extended the results of [1, 2]; in particular they found a countable sequence of excited states. The aim of this paper is to give a rigorous proof of existence of solutions found in [1–3]. In the physics literature these solutions are usually referred to as mini-boson stars (“mini” because they are tiny objects with mass $\sim 1/(Gm)$, where $m$ is the boson mass).
What are the motivations for studying such objects? Let us mention three possible reasons varying from physical to purely mathematical. First, most theories of elementary particles predict the existence of massive bosons which interact weakly with baryonic matter. To the extent one believes in these models, one should accept their consequences, like boson stars. From this standpoint, the recent surge of interest in boson stars is largely due to the suggestion that the dark matter could be bosonic since then some fraction of the missing mass of the universe would float around in the form of boson stars.
Second, even if massive scalar fields do not exist in nature, they provide one of the simplest fundamental matter sources for the Einstein equations and, as such, are ideal theoretical “laboratories” for studying the dynamics of gravitational collapse. Mathematically, these studies amount to the analysis of the Cauchy problem for the EKG equations. Boson stars
play an important role in this context as candidates for intermediate or final attractors of dynamical evolution. Finally, and admittedly most interestingly for us, mini-boson stars are non-perturbative solutions of the EKG equations in the sense that they have no regular flat-spacetime limit (one manifestation of this property is the fact mentioned above that the total mass of a mini-boson star is inversely proportional to the gravitational constant $G$). In this respect mini-boson stars are similar to the Bartnik–McKinnon solutions of the Einstein–Yang–Mills equations [4]. However, in contrast to the Bartnik–McKinnon solutions, the mini-boson stars are not static: although the metric and the stress-energy tensor of the scalar field are time-independent, the scalar field itself has the form of a standing wave $\phi(r,t) = e^{i\omega t}\tilde\phi(r)$. This fact has an important consequence at the ODE level, namely the lapse function does not decouple from the Klein–Gordon equation and the Hamiltonian constraint, which means that we have to deal with a 4-dimensional (nonautonomous) dynamical system.¹ Below we analyze this system using a shooting method which is similar in spirit (but quite different in implementation) to the proof of existence of the Bartnik–McKinnon solutions [5].
The paper is organized as follows. In Sect. 2 we derive the field equations together with the boundary conditions and discuss some basic properties of solutions. We also formulate the main theorem and sketch the heuristic idea of its proof. In Sect. 3 we prove the local existence of solutions near the origin. In Sects. 4 and 5 we discuss the limiting behavior of solutions for small and large values of the shooting parameter, respectively. In Sect. 6 we derive the asymptotics of globally regular solutions. Sect. 7 contains some technical results concerning the behavior of singular solutions. Finally, in Sect. 8, using the results of Sects. 4–7, we complete the proof of the main theorem by a shooting argument.
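2. Preliminaries
The action for the EKG system is given by
\[ S = \int d^4x\,\sqrt{-g}\left( \frac{R}{16\pi G} - \frac{1}{2}\,\partial_a\phi^*\,\partial^a\phi - \frac{1}{2}\,m^2\phi^*\phi \right), \tag{2.1} \]
where $R$ is the scalar curvature of the spacetime metric $g_{ab}$, $\phi$ is the complex scalar field, and $m$ is a real constant called the boson mass. The associated field equations are the Einstein equations
\[ R_{ab} - \frac{1}{2}\,g_{ab}R = 8\pi G\,T_{ab} \tag{2.2} \]
with the stress-energy tensor of the scalar field
\[ T_{ab} = \frac{1}{2}\left(\partial_a\phi^*\,\partial_b\phi + \partial_a\phi\,\partial_b\phi^*\right) - \frac{1}{2}\,g_{ab}\left(g^{cd}\,\partial_c\phi^*\,\partial_d\phi + m^2\phi^*\phi\right), \tag{2.3} \]
and the Klein–Gordon equation
\[ (\Box - m^2)\,\phi = 0, \tag{2.4} \]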
1 For comparison, the static spherically symmetric Einstein–Yang–Mills equations reduce (within the purely magnetic ansatz) to a 3-dimensional (nonautonomous) dynamical system.
where $\Box$ is the d’Alembertian operator associated with the metric $g_{ab}$. Now, we assume that the fields are spherically symmetric. We write the metric using areal radial coordinate and polar slicing
\[ ds^2 = -e^{-2\delta}A\,dt^2 + A^{-1}dr^2 + r^2 d\Omega^2, \tag{2.5} \]
where $d\Omega^2$ is the standard metric on the unit two-sphere, and $A$ and $\delta$ are functions of $(t, r)$. In this parametrization the (relevant components of) Einstein equations have the particularly simple form
\begin{align}
\partial_r A &= \frac{1-A}{r} - 8\pi G\,r\,T_{00}, \tag{2.6}\\
\partial_r \delta &= -4\pi G\,r\,A^{-1}(T_{00} + T_{11}), \tag{2.7}\\
\partial_t A &= -8\pi G\,r\,e^{-\delta}A\,T_{01}, \tag{2.8}
\end{align}
where the components of stress-energy tensor $T_{ab}$ are expressed in the orthonormal frame determined by the metric (2.5) ($e_0 = e^{\delta}A^{-1/2}\partial_t$, $e_1 = A^{1/2}\partial_r$). From (2.3) we obtain
\begin{align}
T_{00} &= \frac{1}{2}\left(A|\partial_r\phi|^2 + A^{-1}e^{2\delta}|\partial_t\phi|^2 + m^2|\phi|^2\right), \tag{2.9}\\
T_{11} &= \frac{1}{2}\left(A|\partial_r\phi|^2 + A^{-1}e^{2\delta}|\partial_t\phi|^2 - m^2|\phi|^2\right), \tag{2.10}\\
T_{01} &= \frac{1}{2}\,e^{\delta}\left(\partial_r\phi^*\,\partial_t\phi + \partial_r\phi\,\partial_t\phi^*\right). \tag{2.11}
\end{align}
The remaining components of Einstein’s equations are equivalent to the Klein–Gordon equation. For the scalar field $\phi$ we assume the standing wave ansatz $\phi(r,t) = \exp(i\omega t)\,\tilde\phi(r)$, where $\omega$ is a real constant. Then, due to the $U(1)$ symmetry of the action, the stress-energy tensor and the metric are time-independent. Moreover, $T_{01} = 0$ so Eq. (2.8) is trivially satisfied. In terms of the dimensionless variables
\[ x = mr, \qquad f(x) = \sqrt{4\pi G}\,\tilde\phi(r) \tag{2.12} \]
and the auxiliary variable
\[ C(x) = \frac{\omega}{m}\,A^{-1}e^{\delta}, \tag{2.13} \]
Eqs. (2.4), (2.6) and (2.7) reduce to the following system of ordinary differential equations (hereafter prime denotes $\frac{d}{dx}$):
\begin{align}
\left(\frac{x^2 f'}{C}\right)' &= \frac{x^2}{AC}\,(1 - AC^2)\,f, \tag{2.14a}\\
A' &= \frac{1-A}{x} - x\left(Af'^2 + AC^2f^2 + f^2\right), \tag{2.14b}\\
C' &= \frac{C}{xA}\left(A - 1 + x^2f^2\right). \tag{2.14c}
\end{align}
Instead of $A$, it is sometimes convenient to use the “mass” function $M(x)$ defined by $A(x) = 1 - 2M(x)/x$. From (2.14b) we have
\[ M(x) = \frac{1}{2}\int_0^x s^2\left(Af'^2 + AC^2f^2 + f^2\right)ds. \tag{2.15} \]
A spacetime is said to be asymptotically flat if $\delta(\infty)$ is finite and
\[ \lim_{x\to\infty} M(x) = M_\infty < \infty. \tag{2.16} \]
The limiting value $M_\infty$ is interpreted as the total mass of a solution (in our case it is measured in units $1/(Gm)$). In Sect. 6 we will show that the finiteness of mass implies that $C$ has a finite limit and $f$ decays exponentially as $x \to \infty$. Besides the singularity at infinity, the field equations (2.14) have the fixed singular point at $x = 0$ and a moving singularity at $\bar x$, where $A(\bar x) = 0$. Regularity of solutions at $x = 0$ requires the following behavior:
\[ f(x) = a + O(x^2), \qquad A(x) = 1 + O(x^2), \qquad C(x) = \alpha + O(x^2), \tag{2.17} \]
where $a = f(0)$ and $\alpha = C(0)$ are arbitrary parameters (assumed positive without loss of generality). In Sect. 3 we will show that these parameters determine uniquely a smooth local solution to Eqs. (2.14).
Definition 2.1. The solution of Eqs. (2.14) starting at $x = 0$ with the behavior (2.17) is called the a-orbit.
In the following whenever we write “a solution” we always mean the a-orbit. Also when we write that some property holds for all $x$ we always mean for all $x \ge 0$. We will frequently refer to the behavior of a-orbits in the $(f, f')$-plane; when we write, say, that the a-orbit enters the first quadrant (Q1 for brevity), we mean that the projection of the a-orbit in the $(f, f')$-plane does so.
Definition 2.2. The a-orbits which exist for all $x$ and are asymptotically flat are called globally regular.
Now, we are ready to formulate our main result:
Theorem 2.1. For each $\alpha > 1$, there is a decreasing sequence of parameters $a_n$ ($n = 0, 1, 2, \ldots$) such that the corresponding $a_n$-orbits are globally regular. The index $n$ labels the number of nodes of the function $f(x)$.
This theorem makes rigorous the numerical results obtained in [1–3]. Notice that although the a-orbits are determined by two parameters, only the parameter $a$ has to be fine-tuned so the shooting is essentially one-dimensional. In order to prepare the ground for the proof of Theorem 2.1 we discuss now some elementary global properties of a-orbits.
Lemma 2.2. $A(x) < 1$ for all $x > 0$ unless $f(x) \equiv 0$.
Proof. From (2.14b), $A'(x_0) < 0$ if $A(x_0) = 1$ so $A$ cannot cross 1 from below. Since $A(x) < 1$ for small $x$, the lemma follows.
Lemma 2.3. An a-orbit exists as long as $A(x) > 0$.
Proof. If $A(x) > 0$ for $x < \bar x < \infty$, then $\lim_{x\to\bar x} M(x)$ exists (because $M' > 0$ and $M(x) < x < \bar x$) so $\lim_{x\to\bar x} A$ exists as well. We will show that the orbit can be continued beyond $\bar x$ provided that $\lim_{x\to\bar x} A(x) > 0$. Since $0 < A < 1$, the only obstruction to extending the solution is the possibility that $C$, $f$, or $f'$ might be unbounded. To see that $f$ is bounded we note that $(xA)' < 1 - x^2Af'^2$. Choose $\epsilon > 0$ such that $A(x) > A(\bar x)/2$ for $\bar x - \epsilon < x < \bar x$; then $(xA)' < 1 - (\bar x - \epsilon)^2 A(\bar x) f'^2/2$ and integrating from $\bar x - \epsilon$ to $x < \bar x$ gives that $\int_{\bar x-\epsilon}^{\bar x} f'(x)^2 < \infty$ and hence, by the Cauchy–Schwarz inequality, $\int_{\bar x-\epsilon}^{\bar x} |f'(x)| < \infty$. Thus, $f$ is bounded. This implies by Eq. (2.14c) that $(\ln C)'$ is also bounded so both $C$ and $1/C$ are bounded. Now, (2.14a) says that $(x^2f'/C)'$ is bounded so $f'$ is bounded.
Remark. It follows from Lemma 2.3 that the only possible obstruction to extendability of a-orbits to arbitrarily large $x$ is $\lim_{x\to\bar x} A = 0$ for some $\bar x$. If that happens we will say that the solution crashes at $\bar x$.
Let us define the function $g = 1 - AC^2$. The following two properties of this function will play an important role in our discussion.
Lemma 2.4. We have (a) If $g(x) \le 0$, then $g'(x) > 0$; (b) If $g(x_0) \ge 0$, then $g(x) > 0$ for all $x > x_0$.
Proof. A simple calculation yields
\[ g' = C^2\left(\frac{1-A}{x} + xAf'^2 - xgf^2\right). \tag{2.18} \]
Part (a) follows immediately from (2.18). To prove Part (b) note that $g'(x_1) > 0$ if $g(x_1) = 0$, so $g$ cannot cross zero from above.
The restriction $\alpha > 1$ in Theorem 2.1 can be easily seen as follows. Suppose that there is a globally regular solution with $\alpha \le 1$. Since $g(0) = 1 - \alpha^2$, it follows from Lemma 2.4 that $g(x)$ is positive for all $x$. Multiplying Eq. (2.14a) by $f$ and integrating by parts we get that $ff' > 0$ for all $x$, hence $f^2$ is monotone increasing which is obviously impossible for globally regular solutions (in fact such solutions crash at finite $x$ as follows easily from Eq. (2.14b)). Thus we have
Lemma 2.5. There are no (nontrivial) globally regular solutions for $\alpha \le 1$.
Note that Lemma 2.5 implies in particular that there are no static ($\alpha = 0$) globally regular solutions. In view of Lemma 2.5 from now on we always assume that $\alpha > 1$.
Definition 2.3. The rotation function $\theta(x, a)$ of an a-orbit is defined by $\theta(0, a) = 0$, $\tan\theta(x, a) = -f'(x)/f(x)$ and $\theta(x, a)$ is continuous in $x$.
We will drop the second argument of $\theta$ if there is no danger of confusion. Now we list the basic properties of the rotation function of a-orbits which we will need below.
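The rotation function of Definition 2.3 can be evaluated from sampled values of $(f, f')$ by unwrapping the angle whose tangent is $-f'/f$. The sketch below is only an illustration under stated assumptions: the helper names `rotation` and `flat_orbit` are hypothetical, and the test orbit is the flat-space profile $f(x) = \sin(x)/x$, which corresponds to the decoupled equation of Sect. 4 with $\alpha^2 - 1 = 1$.

```python
import math

def rotation(samples):
    """Continuous angle theta with tan(theta) = -f'/f along a list of (f, f') samples.

    Assumes the samples are dense enough that theta changes by less than pi
    between neighbours and that the orbit starts on the positive f axis.
    """
    prev = math.atan2(-samples[0][1], samples[0][0])
    theta = prev
    for f, fp in samples[1:]:
        cur = math.atan2(-fp, f)
        # Wrap the increment into [-pi, pi) so that theta stays continuous.
        step = (cur - prev + math.pi) % (2 * math.pi) - math.pi
        theta += step
        prev = cur
    return theta

def flat_orbit(x_end, n=2000):
    """Samples of (f, f') for f(x) = sin(x)/x on [0, x_end] (illustrative test orbit)."""
    pts = [(1.0, 0.0)]
    for i in range(1, n + 1):
        x = x_end * i / n
        pts.append((math.sin(x) / x, (x * math.cos(x) - math.sin(x)) / x**2))
    return pts

if __name__ == "__main__":
    for x_end in (2.0, 4.0, 5.0):
        print(x_end, rotation(flat_orbit(x_end)) / math.pi)
```

With this convention the quadrant Q4 of the $(f, f')$-plane corresponds to $0 < \theta < \pi/2$ and Q3 to $\pi/2 < \theta < \pi$, which matches the way quadrants and bounds on $\theta$ are used interchangeably in the text.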
Lemma 2.6. For any nonnegative integer $n$ we have: (a) If $\theta(x_1) > (n + 1/2)\pi$ for some $x_1$, then $\theta(x) > (n + 1/2)\pi$ for all $x > x_1$. (b) If $\theta(x_1) < n\pi$ for some $x_1$ and $g(x_1) \ge 0$, then $\theta(x) < n\pi$ for all $x > x_1$. (c) There are at most two values of $x$ with $\theta(x) = n\pi$.
Proof. (a) We note that $\theta'(x) = (f'^2 - ff'')/(f^2 + f'^2)$, so $\theta'(x) = 1$ if $\theta(x) = (n + 1/2)\pi$. (b) If $x > x_1$ and $g(x_1) \ge 0$, then $g(x) > 0$ by Lemma 2.4. Next, we note that $\theta'(x) = -g(x)/A(x) < 0$ if $\theta(x) = n\pi$ and $g(x) > 0$. If $g(x) = 0$ then $\theta'(x) = 0$ but $\theta''(x) = -g'(x)/A(x) < 0$ since $g'(x) > 0$ when $g(x) = 0$ by Eq. (2.18). (c) The function $\theta(x) - n\pi$ changes sign at each zero for which $g(x) \neq 0$. From Lemma 2.4, $g$ changes sign at most once. Thus, for $n > 0$, $\theta(0) - n\pi < 0$ and at $x_1$, the first zero of $\theta(x) - n\pi$, if $g(x_1) \ge 0$ then by part (b) $\theta(x) - n\pi < 0$ for all $x > x_1$. If $g(x_1) < 0$ then $\theta(x) - n\pi$ changes sign at $x_1$, and hence, at $x_2$, the next zero of $\theta(x) - n\pi$, $g(x_2) \ge 0$ and hence $\theta(x) - n\pi < 0$ for all $x > x_2$. For $n = 0$, $\theta(0) - n\pi = 0$, $\theta(x) > 0$ near $x = 0$ and if $\theta(x_1) = 0$ then $g(x_1) \ge 0$, hence, $\theta(x) < 0$ for all $x > x_1$.
Before going into details of Sects. 3–8, let us outline the main idea of the proof of Theorem 2.1. According to this theorem there exists a countable family of globally regular solutions distinguished by nodal class. We first show (Sect. 3) that there is a continuous one-parameter family of local solutions depending on $a = f(0)$; we call these solutions a-orbits. In Sect. 6 we show that an a-orbit that has bounded rotation and that is defined for all $x$ is a globally regular solution, that is, it has the correct asymptotic behavior as $x \to \infty$. The existence of a-orbits with bounded rotation that are defined for all $x$ is proven in each nodal class by an inductive application of a shooting argument. The zeroth solution we construct has $\theta(x, a_0) < \pi/2$ for all $x$; the first solution has $\theta(x, a_1) < 3\pi/2$ (and greater than $\pi/2$ for large $x$), etc. This is shown in Fig. 1.
The crucial step of our argument is the control of behavior of a-orbits for large and small values of the parameter a. In Sect. 4 we show that for sufficiently small a the a-orbit has arbitrarily large rotation; more precisely, there is a number bn such that θ(x, a) > nπ for some x if a < bn . In contrast, we show in Sect. 5 that for a >> 1 the a-orbit exits Q4 directly to Q1 (see Fig. 1). Now, to prove the existence of a globally regular solution in the zeroth nodal class we let a0 = inf{a| θ (x, a) < π/2 for all x for which the a-orbit is defined}. Note that a0 ≥ b1 > 0. We then prove that the a0 -orbit is the globally regular solution in the zeroth nodal class. It is clear that the a0 -orbit has rotation θ(x, a0 ) ≤ π/2 for otherwise all nearby orbits would have rotation > π/2 which contradicts the definition of a0 . It is also easy to see that the a0 -orbit cannot exit Q4 to Q1 because again, nearby orbits would also do so which contradicts the definition of a0 . Hence, the a0 -orbit must stay in Q4; it either crashes or is defined for all x and is a globally regular solution in the zeroth nodal class. Thus, it remains to show that the a0 -orbit does not crash. The (technical) crash lemma of Sect. 7 shows that if an orbit crashes in Q4 then nearby orbits either crash in Q4 or exit Q4 to Q1. Thus the a0 -orbit cannot crash because nearby orbits would all be in {a|θ (x, a) < π/2 for all x for which the a-orbit is defined} and a0 would not be the infimum of that set. To show the existence of globally regular solutions in higher nodal classes we proceed as above. We let an = inf{a|θ (x, a) < (n + 1/2)π for all x for which the a-orbit is defined}. We then show that θ (x, an ) < (n + 1/2)π . We again use the crash lemma as
we did in the $n = 0$ case to show the $a_n$-orbit is defined for all $x$. The only difference is that we must show that $\theta(x, a_n) > n\pi$. That fact follows easily from Lemmas 2.6b and 6.3.
[Figure 1: phase-plane plot with axes $f/a$ and $f'/a$; orbits are drawn for $a = 0.445$, $a = a_0 = 0.4404104$, $a = 0.42$, $a = 0.35$, $a = a_1 = 0.3486769$, $a = 0.33$, $a = 0.32$, $a = a_2 = 0.3196460$ and $a = 0.31$.]
Fig. 1. The projection of a-orbits on the $(f, f')$ phase plane for several selected values of the shooting parameter $a$
3. Local Existence
Proposition 3.1. There exists a two-parameter family of local solutions of Eqs. (2.14) near $x = 0$ satisfying the initial conditions (2.17).
Proof. The proof is standard so we just outline it. We introduce new variables $w = f'$, $z = \ln(C)$, $B = (1 - A)/x$, and rewrite Eqs. (2.14) as the first order system
\begin{align}
f' &= w, \tag{3.1a}\\
(x^2w)' &= \frac{x^2}{A}\left[(xf^2 - B)\,w + f\,(1 - AC^2)\right], \tag{3.1b}\\
(x^2B)' &= x^2\left(Aw^2 + f^2AC^2 + f^2\right), \tag{3.1c}\\
z' &= \frac{xf^2 - B}{A}. \tag{3.1d}
\end{align}
We will use the sup norm throughout this discussion: $\|h\|$ means $\sup\{|h(x)| : 0 \le x \le r\}$. Consider the space $X$ of quadruples of functions $(f, w, B, z)$, where $\|f - a\| \le 1$, $\|w\| \le 1$, $\|B\| \le M$, and $\|z - \ln(\alpha)\| \le 1$ and each of the four functions is in $C^0([0, r])$, the space of continuous functions defined on the interval $0 \le x \le r$ with the sup norm. $X$ is a complete metric space if we take as metric the maximum of the four components. We define a map $T : X \to X$ by $T(f, w, B, z) = (T_1, T_2, T_3, T_4)$ where
\begin{align}
T_1 &= a + \int_0^x w\,ds, \tag{3.2a}\\
T_2 &= \frac{1}{x^2}\int_0^x \frac{s^2}{A}\left[(sf^2 - B)\,w + f\,(1 - AC^2)\right]ds, \tag{3.2b}\\
T_3 &= \frac{1}{x^2}\int_0^x s^2\left(Aw^2 + f^2AC^2 + f^2\right)ds, \tag{3.2c}\\
T_4 &= \ln\alpha + \int_0^x \frac{1}{A}\,(sf^2 - B)\,ds. \tag{3.2d}
\end{align}
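The local solution can also be followed numerically by integrating Eqs. (2.14) in the variables $(f, w = f', A, C)$. The following is a minimal sketch rather than the construction used in the proof: it assumes a classical fixed-step Runge–Kutta scheme, starts slightly away from the singular point $x = 0$ using the leading Taylor coefficients implied by (2.14) and (2.17), and stops if $A$ becomes small. The names `rhs`, `rk4_step`, `a_orbit` and the threshold `A_min` are illustrative assumptions.

```python
import math

def rhs(x, s):
    """Right-hand side of Eqs. (2.14) for the state s = (f, w, A, C), with w = f'."""
    f, w, A, C = s
    B = (1.0 - A) / x
    g = 1.0 - A * C * C
    dw = -2.0 * w / x + ((x * f * f - B) * w + f * g) / A   # from (3.1b)
    dA = B - x * (A * w * w + A * C * C * f * f + f * f)    # (2.14b)
    dC = C * (x * f * f - B) / A                            # (2.14c)
    return (w, dw, dA, dC)

def rk4_step(x, s, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(x, s)
    k2 = rhs(x + h / 2, tuple(si + h / 2 * ki for si, ki in zip(s, k1)))
    k3 = rhs(x + h / 2, tuple(si + h / 2 * ki for si, ki in zip(s, k2)))
    k4 = rhs(x + h, tuple(si + h * ki for si, ki in zip(s, k3)))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def a_orbit(a, alpha, x_end, x0=1e-3, n=2000, A_min=1e-3):
    """Integrate the a-orbit from x0 to x_end; returns (x, (f, w, A, C)).

    The start values are the O(x^2) Taylor coefficients at x = 0.
    Integration stops early if A drops below A_min (a numerical "crash").
    """
    s = (a + a * (1 - alpha**2) * x0**2 / 6,
         a * (1 - alpha**2) * x0 / 3,
         1 - a * a * (1 + alpha**2) * x0**2 / 3,
         alpha + alpha * a * a * (2 - alpha**2) * x0**2 / 6)
    h = (x_end - x0) / n
    x = x0
    for i in range(n):
        if s[2] < A_min:
            break
        s = rk4_step(x, s, h)
        x = x0 + (i + 1) * h
    return x, s

if __name__ == "__main__":
    a, alpha = 1e-4, 2.0
    x, (f, w, A, C) = a_orbit(a, alpha, 1.0)
    kappa = math.sqrt(alpha**2 - 1)
    print(f / a, math.sin(kappa * x) / (kappa * x), A)
```

For very small $a$ the computed $f/a$ should be close to $\sin(\sqrt{\alpha^2-1}\,x)/(\sqrt{\alpha^2-1}\,x)$, the profile obtained when gravity decouples, and $A$ should stay below 1 as in Lemma 2.2.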
One verifies easily that $T$ does in fact take $X$ to $X$ and that $T$ is a contracting map if $r$ is sufficiently small, and that a fixed point of $T$ is a solution to our equations. The proof that the solution depends continuously on $a$ is also routine.
4. Behavior of Solutions for Small a
In this section we show that the rotation $\theta(x, a)$ of the a-orbit is arbitrarily large if $a$ is sufficiently small and $x$ is sufficiently large.
Proposition 4.1. For any $n > 0$, there exists a $b_n$ such that for $a < b_n$ there is an $x$ with $\theta(x, a) > n\pi$.
Proof. Let $\tilde f = f/a$. Then, Eqs. (2.14) become
\begin{align}
\left(\frac{x^2\tilde f'}{C}\right)' &= \frac{x^2}{AC}\,(1 - AC^2)\,\tilde f, \tag{4.1a}\\
A' &= \frac{1-A}{x} - a^2x\left(A\tilde f'^2 + AC^2\tilde f^2 + \tilde f^2\right), \tag{4.1b}\\
C' &= \frac{C}{xA}\left(A - 1 + a^2x^2\tilde f^2\right) \tag{4.1c}
\end{align}
with the behavior at the origin
\[ \tilde f(0) = 1, \qquad A(0) = 1, \qquad C(0) = \alpha. \tag{4.2} \]
For $a = 0$ (decoupling of gravity) Eqs. (4.1b,c) with conditions (4.2) have constant flat-spacetime solutions $A \equiv 1$, $C \equiv \alpha$. Inserting these solutions into Eq. (4.1a) gives the Bessel equation
\[ (x^2\tilde f')' + x^2(\alpha^2 - 1)\,\tilde f = 0, \tag{4.3} \]
whose unique solution satisfying (4.2) is
\[ \tilde f(x) = \frac{\sin\left(\sqrt{\alpha^2 - 1}\,x\right)}{\sqrt{\alpha^2 - 1}\,x}. \tag{4.4} \]
This solution has infinite rotation as $x \to \infty$. If $x > n\pi/\sqrt{\alpha^2 - 1}$ then $\theta(x, 0) > n\pi$, so for $a$ close to 0, say $a < b_n$, we have $\theta(x, a) > n\pi$ because solutions of Eqs. (4.1) are continuous in $a$ and $x$. This concludes the proof of Proposition 4.1.
5. Behavior of Solutions for Large a
Proposition 5.1. The a-orbits with sufficiently large $a$ exit Q4 directly to Q1.
We define new variables
\[ y = ax, \qquad \tilde v(y) = a\,(f(x) - a), \qquad \tilde A(y) = A(x), \qquad \tilde C(y) = C(x). \tag{5.1} \]
Then, Eqs. (2.14) become (where now the prime denotes the derivative with respect to $y$)
\begin{align}
\left(\frac{y^2\tilde v'}{\tilde C}\right)' &= \frac{y^2}{\tilde A\tilde C}\,(1 - \tilde A\tilde C^2)\left(1 + \frac{\tilde v}{a^2}\right), \tag{5.2a}\\
\tilde A' &= \frac{1 - \tilde A}{y} - y\left[\frac{1}{a^2}\,\tilde A\tilde v'^2 + (1 + \tilde A\tilde C^2)\left(1 + \frac{\tilde v}{a^2}\right)^2\right], \tag{5.2b}\\
\tilde C' &= \frac{\tilde C}{y\tilde A}\left[\tilde A - 1 + y^2\left(1 + \frac{\tilde v}{a^2}\right)^2\right]. \tag{5.2c}
\end{align}
The initial conditions at $y = 0$ are
\[ \tilde A(0) = 1, \qquad \tilde C(0) = \alpha > 1, \qquad \tilde v(0) = 0, \qquad \tilde v'(0) = 0. \tag{5.3} \]
As $a \to \infty$, the solutions of Eqs. (5.2) tend uniformly on compact intervals to the solutions of the following limiting system:
\begin{align}
\left(\frac{y^2v'}{C}\right)' &= \frac{y^2}{AC}\,(1 - AC^2), \tag{5.4a}\\
A' &= \frac{1-A}{y} - y\,(1 + AC^2), \tag{5.4b}\\
C' &= \frac{C}{yA}\,(A - 1 + y^2), \tag{5.4c}
\end{align}
satisfying the same initial conditions as in (5.3). The rest of this section is devoted to the analysis of Eqs. (5.4). Our goal is to show that $v'(y)$ becomes positive at a point $y_1 < \bar y$. This would imply that $v(y)$ is bounded below, i.e., there is $d > 0$ such that $v(y) > -d$ for $y < y_1$, and therefore $\tilde v(y) > -d - 1$ if $a$ is sufficiently large. Then $f(x) > a - (d + 1)/a$ and $f'(x) > 0$ for $x = y_1/a$, hence, if $a > \sqrt{d + 1}$, the a-orbit exits Q4 to Q1 directly without entering Q3. Note that the function $v$ decouples from Eqs. (5.4b,c) for the metric coefficients; this fact considerably simplifies the analysis.
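Because $v$ decouples, the metric part (5.4b), (5.4c) of the limiting system can be explored numerically on its own. The sketch below is an illustration under stated assumptions, not part of the proof: it uses a fixed-step Runge–Kutta scheme, starts at a small $y_0 > 0$ with the leading Taylor coefficients implied by (5.4) and (5.3), and reports the first $y$ at which $A$ falls below a small threshold as an approximation to the point where $A$ vanishes. The names `limit_rhs`, `crash_point` and the parameters `h`, `y0`, `A_min` are hypothetical.

```python
import math

def limit_rhs(y, A, C):
    """Right-hand sides of (5.4b) and (5.4c)."""
    dA = (1.0 - A) / y - y * (1.0 + A * C * C)
    dC = C * (A - 1.0 + y * y) / (y * A)
    return dA, dC

def crash_point(alpha, h=1e-4, y0=1e-3, A_min=1e-2):
    """Approximate location where A reaches zero for the limiting system (5.4).

    Integration stops as soon as A <= A_min, so the returned y slightly
    underestimates the true zero of A. The guard y < 2 only prevents an
    endless loop if the threshold were never reached.
    """
    A = 1.0 - (1.0 + alpha**2) * y0**2 / 3.0
    C = alpha + alpha * (2.0 - alpha**2) * y0**2 / 6.0
    y = y0
    while A > A_min and y < 2.0:
        k1 = limit_rhs(y, A, C)
        k2 = limit_rhs(y + h / 2, A + h / 2 * k1[0], C + h / 2 * k1[1])
        k3 = limit_rhs(y + h / 2, A + h / 2 * k2[0], C + h / 2 * k2[1])
        k4 = limit_rhs(y + h, A + h * k3[0], C + h * k3[1])
        A += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        C += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        y += h
    return y

if __name__ == "__main__":
    print(crash_point(2.0), math.sqrt(3.0))
```

For $\alpha = 2$ the value returned can be compared with the upper bound $\sqrt{3}$ that follows from $(yA)' < 1 - y^2$.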
Lemma 5.2. The solution of Eqs. (5.4) crashes at some $\bar y$, that is, $A(\bar y) = \lim_{y\to\bar y} A(y) = 0$. Moreover, $1 < \bar y < \sqrt{3}$.
Proof. Note that $(yA)' < 1 - y^2$, so integrating gives $A < 1 - y^2/3$. Therefore, $A(\bar y) = 0$ for $\bar y < \sqrt{3}$. To show that $\bar y > 1$ assume $\bar y \le 1$. Then $(y/C)' = (1 - y^2)/(AC) \ge 0$, so if $0 < \tau < y < \bar y$ we have $y/C(y) > \tau/C(\tau)$ or $C(y) < C(\tau)/\tau$, so $C$ is bounded. Since $(AC)' = -yAC^3$, $(\ln(AC))' = -yC^2$ is bounded below. Thus, by integrating one concludes that $\lim_{y\to\bar y} AC(y) > 0$. But $(AC)^2 = A(AC^2)$; $AC^2 \le \alpha^2$ and $\lim_{y\to\bar y} A(y) = 0$, so $\lim_{y\to\bar y}(AC(y))^2 = 0$. This contradicts $\lim_{y\to\bar y} AC(y) > 0$, so we must have $\bar y > 1$.
Proof of Proposition 5.1. In order to prove that $v'(y)$ becomes positive at some point $y_1 < \bar y$, we will show that $v'(\bar y) > 0$. By Eq. (5.4a) we have $v'(y) = \frac{C}{y^2}\int_0^y \frac{z^2(1 - AC^2)}{AC}\,dz$, so we must show that $\int_0^{\bar y} \frac{y^2(1 - AC^2)}{AC}\,dy > 0$. The proof of this fact is divided into two cases: (i) $\bar y^2 \ge 3/2$, and (ii) $\bar y^2 < 3/2$. Before considering these cases we list some useful properties of the function $g = 1 - AC^2$.
Lemma 5.3. We have: (a) $g' = (1 - A - y^2g)\,C^2/y$; (b) if $g(y_0) \ge 0$, then $g(y) > 0$ for all $y > y_0$; (c) $g' > 0$ if $g \le 1/3$.
Proof. Part (a) is a calculation. For (b) note that $g' > 0$ when $g = 0$, so $g$ cannot cross zero from above. For (c) we have $(yA)' < 1 - y^2$, so integrating gives $1 - A > y^2/3$ and hence, $g' > y\,(1/3 - g)\,C^2$.
We return now to the proof that $\int_0^{\bar y} \frac{y^2g}{AC}\,dy > 0$. We first consider the case (i) $\bar y^2 > 3/2$. A calculation shows that $y^2g = (2y^3/3 + yA - y)'$, hence $\int_0^{\bar y} y^2g\,dy = 2\bar y^3/3 - \bar y > 0$ if $\bar y^2 > 3/2$. Since $g(0) = 1 - \alpha^2 < 0$, this implies that $g(\sigma) = 0$ for some $\sigma < \bar y$ and therefore $g(y) > 0$ for $y > \sigma$. Note that $AC$ is monotone decreasing because $(AC)' = -yAC^3 < 0$. Thus
\[ \frac{y^2g(y)}{A(y)C(y)} \ge \frac{y^2g(y)}{A(\sigma)C(\sigma)} \quad \text{for} \quad 0 \le y \le \bar y, \tag{5.5} \]
and therefore
\[ \int_0^{\bar y} \frac{y^2g}{AC}\,dy \ge \frac{1}{A(\sigma)C(\sigma)}\int_0^{\bar y} y^2g\,dy > 0. \tag{5.6} \]
Now we consider the case (ii) $\bar y^2 \le 3/2$.
Lemma 5.4. Define the function $p = 1 + y^2g - y^2$. If $y^2 \le 3/2$, then $p(y) > 0$.
Proof. Note that $p(0) = 1$. Let $y_1$ be the first zero of $p$, that is, $p(y_1) = 0$ and $p'(y_1) \le 0$. If $g(y_1) > 1/3$ then $p = y^2g + 1 - y^2 > y^2/3 + 1 - y^2 = 1 - 2y^2/3$. Thus $p$ can have a zero for $y_1^2 \le 3/2$ only if $g(y_1) \le 1/3$. Then, from Lemma 5.3, $g'(y) > 0$ for all $y \le y_1$. Define a function $k(y) = 2 - 3A - y^2$. A calculation gives $y^3g' = (y(k + p))'$, so by integrating we get $k(y_1) > 0$. On the other hand we have $p' = yC^2(k - p)$, so $k(y_1) \le 0$; contradiction.
To show that $\int_0^{\bar y} \frac{y^2g}{AC}\,dy > 0$, we rewrite it as
\[ \int_0^{\bar y} \frac{y^2g}{AC}\,dy = \int_0^{\bar y} \frac{p - 1 + y^2}{AC}\,dy = \int_0^{\bar y} \frac{p}{AC}\,dy - \int_0^{\bar y} \frac{1 - y^2}{AC}\,dy. \tag{5.7} \]
The first term on the right hand side of (5.7) is positive because $p$ is positive. To compute the second term, note that
\[ \left(\frac{y}{C}\right)' = \frac{1 - y^2}{AC}, \tag{5.8} \]
hence $L = \lim_{y\to\bar y}(y/C)$ exists and is finite since $\bar y > 1$ by Lemma 5.2. If $L > 0$ then $\lim_{y\to\bar y} C(y) = \bar y/L < \infty$, so $C$ is bounded. Since $\lim_{y\to\bar y} A(y) = 0$ we conclude that $\lim_{y\to\bar y} AC(y) = 0$. But $(\ln AC)' = -yC^2$ is bounded, so $\ln AC$ is bounded below and hence $\lim AC \neq 0$. This contradiction shows that $L = 0$. Thus, the second term on the right-hand side of (5.7) is zero. This concludes the proof of Proposition 5.1.
6. Asymptotics of Globally Regular Solutions
In this section we derive the leading asymptotic behavior of globally regular solutions. We use lim to denote $\lim_{x\to\infty}$.
Proposition 6.1. An a-orbit which exists for all $x$ and has bounded rotation is asymptotically flat. The leading asymptotic behavior for $x \to \infty$ is
\[ A(x) \sim 1 - \frac{2M_\infty}{x}, \qquad C(x) \sim C_\infty\,e^{2M_\infty/x}, \qquad f(x) \sim f_\infty\,e^{-bx}, \tag{6.1} \]
where $0 < M_\infty < \infty$, $0 < C_\infty < 1$, and $b = \sqrt{1 - C_\infty^2}$.
To prove this proposition we need several partial results.
Lemma 6.2. An a-orbit which exists for all $x$ and has bounded rotation is ultimately in the second (Q2) or fourth (Q4) quadrant.
Proof. If $\theta(x)$ is bounded above then there is an integer $n \ge 0$ such that $\theta(x) < (n + 1/2)\pi$ for all $x$ but $\theta(x_1) > (n - 1/2)\pi$ for some $x_1$ and hence, by Lemma 2.6a, for all $x > x_1$, $(n - 1/2)\pi < \theta(x) < (n + 1/2)\pi$. We next show that there is an $x_2$ such that for all $x > x_2$, $n\pi < \theta(x) < (n + 1/2)\pi$ (that is, the orbit is ultimately in Q2 or Q4). Note that, by Lemma 2.6c the orbit must satisfy either $n\pi < \theta(x) < (n + 1/2)\pi$ or $(n - 1/2)\pi < \theta(x) < n\pi$, that is the orbit must lie in Q3 or Q2 if $n$ is odd and in Q1 or Q4 if $n$ is even. We must rule out the possibility that the orbit is in Q1 or Q3. Assume that the orbit lies in Q1 or Q3 for all $x > x_1$. Then $f(x)f'(x) > 0$ for all $x > x_1$, so $f^2(x) \ge f^2(x_1)$ for all $x > x_1$. From Eq. (2.14b) we have $(xA)' = 1 - x^2Af'^2 - x^2f^2AC^2 - x^2f^2$, so $(xA)' < 1 - x^2f^2 < 1 - x^2f^2(x_1)$, and hence $A$ goes to zero in finite $x$. This contradiction concludes the proof.
Lemma 6.3. Under the assumptions of Proposition 6.1 the function $g = 1 - AC^2$ is eventually positive.
Proof. Suppose that $g(x) \le 0$ for all $x$. We claim that this implies $\lim A = 1$. To see this, suppose that $\liminf A = 1 - 4\epsilon$ for some $\epsilon > 0$. Let $-\beta = \lim g \le 0$, which exists because $g' > 0$. Note that $g(x) < -\beta$ for all $x$. Choose an $x_1$ such that $g(x_1) > -\beta - \epsilon$. If $A(x_2) < 1 - 3\epsilon$ for some $x_2 > x_1$, then by (2.18)
$$g'(x) > \frac{C^2(1 - A)}{x} > \frac{1 - A}{x} = \frac{x(1 - A)}{x^2} > \frac{x_2(1 - A(x_2))}{x^2} > \frac{3\epsilon x_2}{x^2}$$
for $x > x_2$, where the last but one inequality follows from the fact that $x(1 - A(x))$ is monotone increasing. Integrating this inequality from $x_2$ to $2x_2$, say, we get $g(2x_2) > g(x_2) + 3\epsilon/2 > -\beta - \epsilon + 3\epsilon/2 > -\beta$; contradiction. Thus, $\liminf A = 1$ and hence $\lim A = 1$. Since $\lim g = \lim(1 - AC^2)$ exists, $\lim C$ also exists and is finite. Next, from Lemma 6.2 we know that the a-orbit is ultimately in Q2 or in Q4. For concreteness we consider the case of Q4 (the proof of the Q2-case is identical), that is, $f(x) > 0$ and $f'(x) < 0$ for sufficiently large $x$. Then, from (2.14a), $\lim(x^2 f'/C)$ exists, so $\lim(x^2 f') = -\tau < 0$ exists as well (where $\tau$ might be infinite; the point is that $\tau \ne 0$). Now, by L'Hôpital's rule, $\lim xf = -\lim(x^2 f') = \tau$. But (2.14c) says $(\ln C)' > \tau^2/4x$, which implies $\lim C = \infty$, a contradiction.

Proof of Proposition 6.1. From the previous lemma we know that there exists an $x_1$ such that $g(x) > 0$ for $x > x_1$. Let $u = ACf/g$ for $x > x_1$. A calculation shows that $u' = -AC\left(fC^2(1 - A)/x - f'g + xff'^2\right)/g^2$, so $u' < 0$ if $g > 0$. Multiplying Eq. (2.14a) by $u$ we obtain
$$(x^2 Aff'/g)' = x^2 f^2 + x^2 f'u'/C. \tag{6.2}$$
The right-hand side is positive for $x > x_1$, so $x^2 Aff'/g$ is negative and increasing, hence it has a finite non-positive limit. This implies that $x^2 f^2$ is integrable. Similarly, multiplying Eq. (2.14a) by $f$ we obtain
$$(x^2 ff'/C)' = (x^2 f^2 g + Ax^2 f'^2)/(AC). \tag{6.3}$$
The right-hand side is positive for $x > x_1$, so $x^2 ff'/C$ is negative and increasing, hence it has a finite non-positive limit. This implies that $Ax^2 f'^2$ is integrable (recall that $AC$ is monotone decreasing). The integrability of $x^2 f^2$ and $Ax^2 f'^2$ implies via Eq. (2.15) that $\lim M = M_\infty < \infty$ exists. This concludes the proof that $A(x) \sim 1 - 2M_\infty/x$.

Having $\lim A = 1$ we can strengthen Lemma 6.3 by showing that $\lim g = g_\infty > 0$ exists. To see this choose an $x_1$ such that $g(x_1) > 0$. Then $AC^2(x_1) < 1$, hence $AC(x_1) < 1$. Since $AC$ is monotone decreasing, we have $AC(x) < AC(x_1)$ for $x > x_1$ and thus $\lim AC < 1$. Hence, $\lim AC^2 = (\lim AC)^2/\lim A < 1$. Since $g = 1 - AC^2$, $\lim g$ exists and $\lim g > 0$.

Now we have all we need to derive the asymptotics of $f$. Let $r = f'/f$. Then
$$r' = f''/f - r^2 = -r(1 + A - x^2 f^2)/(xA) + g/A - r^2 = g_\infty - r^2 + \epsilon(x),$$
where $\lim \epsilon = 0$. Let $\sigma(x_2) = \max(|\epsilon(x)|)$ for $x > x_2$ and assume that $x_2$ is sufficiently large so that $g_\infty > \sigma(x_2)$. If $r(x_2) > -\sqrt{g_\infty - \sigma(x_2)}$, then clearly $r$ becomes eventually positive, which contradicts that the orbit is eventually in Q2 or Q4. If $r(x_2) < -\sqrt{g_\infty + \sigma(x_2)}$, then $\lim r = -\infty$; this is impossible because then by L'Hôpital's rule $\lim r = \lim f''/f' = \lim g/r = 0$. Therefore $r(x_2)$ must be sandwiched in the interval $-\sqrt{g_\infty + \sigma(x_2)} < r(x_2) < -\sqrt{g_\infty - \sigma(x_2)}$. Since $x_2$ is arbitrarily large and $\lim \sigma = 0$, we conclude that $\lim r = -\sqrt{g_\infty}$. The asymptotics of $f$ given in (6.1) follows immediately from this. Finally, inserting the derived leading asymptotic behavior of $A$ and $f$ into Eq. (2.14c), we obtain $C'/C \sim -2M_\infty/x$, from which the asymptotics of $C$ follows trivially.
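The sandwich argument for $r = f'/f$ can be illustrated on the limiting equation $r' = g_\infty - r^2$, obtained by dropping the error term. The following is only a minimal sketch under assumptions that are not taken from the paper: the value $g_\infty = 1$, the plain Euler step, the step size and the blow-up cutoff are all illustrative choices. A start slightly above $-\sqrt{g_\infty}$ is driven to $+\sqrt{g_\infty}$, a start slightly below runs off towards $-\infty$ in finite $x$, and only $r = -\sqrt{g_\infty}$ avoids both.

```python
import math

def integrate_riccati(r0, g_inf=1.0, x_max=10.0, h=1e-3, cutoff=-100.0):
    """Euler integration of the model equation r' = g_inf - r**2.

    Returns (x, r) at the last step. The loop stops early once r drops
    below `cutoff`, which is treated here as blow-up towards -infinity.
    All default values are illustrative assumptions.
    """
    x, r = 0.0, r0
    while x < x_max and r > cutoff:
        r += h * (g_inf - r * r)
        x += h
    return x, r

root = math.sqrt(1.0)  # sqrt(g_inf) for the assumed g_inf = 1
x_up, r_up = integrate_riccati(-0.9 * root)      # starts above -sqrt(g_inf)
x_down, r_down = integrate_riccati(-1.1 * root)  # starts below -sqrt(g_inf)
print(f"start -0.9: stopped at x = {x_up:.2f} with r = {r_up:.4f}")
print(f"start -1.1: stopped at x = {x_down:.2f} with r = {r_down:.1f}")
```

In the actual proof the error term $\epsilon(x)$ is present, which is why the argument works with the shrinking window $\pm\sigma(x_2)$ rather than with the exact root.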
On Existence of Mini-Boson Stars
7. Solutions that Crash

Proposition 7.1. If the b-orbit crashes at some $\bar{x}$ then $g(x) > 0$ for $x$ near $\bar{x}$.

Proof. Suppose that $g(x) < 0$ for all $x < \bar{x}$, so $AC^2(x) > 1$ for all $x < \bar{x}$. We have from (2.18) that $g' > AC^2 xf'^2 > xf'^2$. Integrating this inequality from some $x_1 > 0$ to some $x_2 < \bar{x}$, we obtain
$$x_1 \int_{x_1}^{x_2} f'^2\,dx < \int_{x_1}^{x_2} xf'^2\,dx < g(x_2) - g(x_1) < \alpha^2 - 1, \tag{7.1}$$
which implies (by the Cauchy–Schwarz inequality) that $f$ is bounded. Next, $A(\bar{x}) = 0$, $AC^2 > 1$, implies that $\lim_{x\to\bar{x}^-} C = \infty$; moreover, by (2.14c), $(\ln C)' < xf^2/A$, hence $xf^2/A$ is not integrable near $\bar{x}$. Since $f$ is bounded, this shows that $1/A$ is not integrable near $\bar{x}$. But from (2.18), $g' > C^2(1 - A)/x = AC^2(1 - A)/(xA) > 1/(2xA)$, so $g'$ is not integrable near $\bar{x}$, which contradicts the fact that $g$ is a bounded function.

The importance of Proposition 7.1 derives from Lemma 2.6b, which says that if $g > 0$ then rotation stops. The main result of this section is the crash theorem, which states that if an orbit has bounded rotation and crashes, then nearby orbits also have similarly bounded rotation. The precise statement is given in Proposition 7.2. Since we consider more than one orbit in this section, we use the notation $A(x, a)$ to denote the value of $A$ at $x$ for the a-orbit, etc.

Proposition 7.2 (Crash Theorem). If the b-orbit crashes at $x = \bar{x}$ and
(a) if $(k - 1/2)\pi < \theta(x, b) < k\pi$, $k \ge 1$, for $x$ near $\bar{x}$, then nearby orbits have rotation $< k\pi$ for $x \ge \bar{x}$;
(b) if $k\pi < \theta(x, b) < (k + 1/2)\pi$, then nearby orbits have rotation $< (k + 1/2)\pi$.

Proof. Part (a): Suppose the b-orbit crashes in Q3 or Q1. By Proposition 7.1, $g(x_1, b) > 0$ for some $x_1 < \bar{x}$ with $(k - 1/2)\pi < \theta(x_1, b) < k\pi$; hence, for $a$ sufficiently near $b$ we have $g(x_1, a) > 0$ with $(k - 1/2)\pi < \theta(x_1, a) < k\pi$. By Lemma 2.6b, $\theta(x, a) < k\pi$ for all $x > x_1$.
Part (b): This case is much more difficult and will require several auxiliary results. It follows from part (a) that nearby orbits have rotation $< (k + 1)\pi$; we must prove the stronger result that nearby orbits have rotation $< (k + 1/2)\pi$.

Remark. It is clear from numerical observations that no a-orbit crashes in Q2 or Q4; however, that appears to be quite difficult to prove. Moreover, one can easily construct orbit segments that start, for example, at $x = 1$ with $f = 5$, $f' = 0$, $A = 0.2$, $C = 3$, say, that crash in Q4.
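The orbit segment mentioned in the Remark can be explored numerically. The sketch below is an illustration only, under a loudly stated assumption: the right-hand sides are the forms of (2.14a), (2.14b) and (2.14c) as they are quoted inside the proofs of this section, not a transcription of the equations from Section 2. The fixed-step RK4 scheme, the step size `h` and the stopping threshold `A_stop` (a proxy for the crash $A = 0$) are hypothetical choices.

```python
def rhs(x, f, fp, A, C):
    """Derivatives (f', f'', A', C') of the assumed first-order system.

    Assumed forms, as the equations are used in the proofs:
      x A f'' = x g f - (1 + A - x^2 f^2) f',   with g = 1 - A C^2,
      x A'    = 1 - A - x^2 f^2 - x^2 (A f'^2 + A C^2 f^2),
      x A C'  = C (A - 1 + x^2 f^2).
    """
    g = 1.0 - A * C * C
    fpp = (x * g * f - (1.0 + A - x * x * f * f) * fp) / (x * A)
    Ap = (1.0 - A - x * x * f * f
          - x * x * (A * fp * fp + A * C * C * f * f)) / x
    Cp = C * (A - 1.0 + x * x * f * f) / (x * A)
    return fp, fpp, Ap, Cp

def integrate_until_small_A(x=1.0, f=5.0, fp=0.0, A=0.2, C=3.0,
                            h=1e-6, A_stop=1e-3, x_max=1.1):
    """Classical RK4 steps until A <= A_stop or x >= x_max."""
    y = [f, fp, A, C]
    while y[2] > A_stop and x < x_max:
        k1 = rhs(x, *y)
        k2 = rhs(x + h / 2, *[yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = rhs(x + h / 2, *[yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = rhs(x + h, *[yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return (x, *y)

x, f, fp, A, C = integrate_until_small_A()
print(f"x = {x:.5f}, f = {f:.4f}, f' = {fp:.4f}, A = {A:.2e}, C = {C:.3e}")
```

With these initial data $A'$ is strongly negative from the start, so $A$ falls below the threshold almost immediately while $f > 0$ and $f' < 0$, that is, while the segment is in Q4.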
Such orbit segments have $\lim_{x\to\bar{x}^-} f'(x) = -\infty$. Nevertheless, the next lemma shows that $Af'^2$ remains bounded at crash.

Lemma 7.3. If an a-orbit is defined for $x < x_2$, $ff'(x) < 0$ for $x_1 < x < x_2$, $f^2(x_1) < B$, and $f'(x_1) = 0$, then $Af'^2(x) \le \max(B, \alpha^2/3)$. In particular, if an orbit crashes in Q2 or Q4, $\lim_{x\to\bar{x}^-} A(x)f'(x) = 0$.

Proof. We set $q = Af'^2$ and then compute that
$$xq' = -(3 + x^2 f'^2 + x^2 C^2 f^2)q - f'^2 + 2xff' + x^2 f^2 f'^2 - 2AC^2 xff'. \tag{7.2}$$
Note that $q \ge 0$ and all terms on the right side of (7.2) are negative except for the last two. If $q > B$, we combine the term $-qx^2 f'^2$ with $x^2 f^2 f'^2$; clearly, $x^2 f^2 f'^2 - qx^2 f'^2 = (f^2 - q)x^2 f'^2 \le 0$. Next, we combine the term $-qx^2 f^2 C^2$ with $-2xff'AC^2$ to get $-AC^2(y^2 - 2y)$, where $y = -xff'$; the maximum value of this expression occurs when $y = 1$ and that value is $AC^2 \le \alpha^2$ by Lemma 2.4. Hence, if $q \ge \alpha^2/3$, then $-q(x^2 f^2 C^2) - 3q - 2xff'AC^2 \le 0$. Thus, $q \ge \max(B, \alpha^2/3)$ implies that $q' < 0$; consequently, $Af'^2(x) \le \max(B, \alpha^2/3)$. Since $A \cdot Af'^2 = (Af')^2$, and $Af'^2$ is bounded and $\lim_{x\to\bar{x}^-} A(x) = 0$, we get $\lim_{x\to\bar{x}^-} (A(x)f'(x))^2 = 0$, hence $\lim_{x\to\bar{x}^-} A(x)f'(x) = 0$.

We can now discuss the strategy of the proof of part (b) of Proposition 7.2. We want to show that if an orbit is sufficiently close to an orbit that crashes in Q4 then it must either crash or exit Q4 to Q1 (the case in which the orbit crashes in Q2 is completely symmetric). To that end, let $v(x) = A(x)f'(x)$. We will prove that $v(x, a)$ goes to 0 if $a$ is sufficiently close to $b$ and $f(x, a) > 0$. This means either $f' = 0$ and hence the orbit is exiting Q4 to Q1, or $A = 0$, that is, the orbit is crashing in Q4. Note that
$$v'(x) = -(2Af' - xf + xAC^2 f + x^2 Af'^3 + x^2 f^2 f'AC^2)/x = -v(2 + x^2 f'^2 + x^2 f^2 C^2)/x + fg > fg.$$
We know that $v(x, b)$ goes to 0 at crash, so nearby orbits will also have $v$ small for $x$ near $\bar{x}$. We will show that $f$ and $g$ are both uniformly bounded away from 0 in an interval about $\bar{x}$. That is, the size of the interval and the bounds work for all $a$ near $b$. That is enough to force $v$ positive. The most technical part of the proof involves showing that nearby orbits stay in Q4 long enough to have $v$ go positive. Since $f'$ goes to $-\infty$ at crash, nearby orbits have $|f'|$ large also. Now, (2.14a) can be written as $xAf'' + (1 + A - x^2 f^2)f' - xgf = 0$; moreover, to get to Q3 orbits must pass through $xf(x) < 1$, which means that the coefficient of $f'$, $(1 + A - x^2 f^2)$, is positive. That is enough to bound $f'$. The details of the proof, especially Lemma 7.5, are tedious. We will restrict ourselves to an interval $0.99\bar{x} < x < 1.01\bar{x}$ and replace $x$ by $\bar{x}$ (whenever justified) in making estimates. We show next that if the b-orbit crashes at $x = \bar{x}$ with rotation $k\pi < \theta(x, b) < (k + 1/2)\pi$, then $|\bar{x}f(\bar{x})| \ge 1$.

Lemma 7.4. If the b-orbit crashes at $x = \bar{x}$ with $\theta(x, b) < (k + 1/2)\pi$ for all $x < \bar{x}$ and $\theta(x, b) > k\pi$ for $x$ near $\bar{x}$, then $|\bar{x}f(\bar{x})| \ge 1$, in particular $f(\bar{x}) \ne 0$.

Proof. The assumption on $\theta(x, b)$ tells us that the orbit lies in Q2 or Q4 for $x$ near $\bar{x}$. For simplicity of exposition we only discuss the case of Q4, i.e., $f(x) \ge 0$, $f'(x) \le 0$. In particular, $f$ is a monotone function and hence has a limit at $\bar{x}$. Thus, $h(x) = xf(x)$ is continuous; in particular, if we suppose that $\bar{x}f(\bar{x}) < 1$, then $h(x) < 1$ for $x$ near $\bar{x}$. Since $A(\bar{x}) = 0$, we get from (2.14c) that $xAC' = C(A - 1 + x^2 f^2) < 0$ for $x$ near $\bar{x}$. We conclude that $C$ is bounded above, hence $\lim_{x\to\bar{x}^-} AC^2 = 0$ and $\lim_{x\to\bar{x}^-} g = 1$. Since $g > 0$, the right hand side of Eq. (2.14a) is positive and hence $x^2 f'/C$ is bounded, and since $C$ is bounded we conclude that $f'$ is bounded; thus $\lim_{x\to\bar{x}^-} Af'^2 = 0$. Then, from (2.14b), $xA' = 1 - A - x^2 f^2 - x^2(Af'^2 + AC^2 f^2)$, we see that $A' > 0$ near $\bar{x}$, so there is no crash. This is a contradiction, so we conclude that $\bar{x}f(\bar{x}) \ge 1$ and hence $f(\bar{x}) > 0$.

Lemma 7.5. There is a $\gamma > 0$ such that $h(x, a) = xf(x, a) > 1/4$ for all $a$ sufficiently near $b$ and $\bar{x} < x < \bar{x} + \gamma$.
Proof. If the b-orbit crashes at $x = \bar{x}$ with rotation $\theta(x, b) > k\pi$, then there is a $y$ such that $\theta(y, b) = k\pi$. Let $B = (f(y, b) + 1)^2$. By Lemma 7.3, if $a$ is sufficiently close to $b$, $Af'^2$ (along the a-orbit) is bounded in Q4 by $D = \max(\alpha^2/3, B)$; $D$ is a uniform bound on $Af'^2$ in Q4 for all $a$ sufficiently near $b$. Next, choose $x_1$ such that $0.99\bar{x} < x_1 < \bar{x}$ and such that $A(x_1, b) < 0.01$, $g(x_1, b) = 2\tau > 0$, and $h(x_1, b) > 0.9$; this is possible by Lemma 7.4 and Proposition 7.1. Then, for $a$ sufficiently near $b$ we have $A(x_1, a) < 0.02$, $g(x_1, a) > \tau > 0$, $f(x_1, a) < f(x_1, b) + 0.01/\bar{x}$ and $h(x_1, a) > 3/4$. We shall find a $\gamma \in (0, 0.01\bar{x})$ that works for all $a$, that is, it satisfies $h(x, a) > 1/4$ for all $a$ sufficiently near $b$ and $\bar{x} < x < \bar{x} + \gamma$. So let $a$ satisfy: i) $Af'^2$ (along the a-orbit) is bounded by $D$, ii) $A(x_1, a) < 0.02$, iii) $h(x_1, a) > 3/4$, and iv) $g(x_1, a) > \tau > 0$. If $h(x, a) > 1/4$ for all $x < 1.01\bar{x}$ and all $a$ near $b$ we are done: let $\gamma = 0.01\bar{x}$. Otherwise, we define $x_2 = x_2(a)$, etc. by $h(x_2) = 3/4$, $h(x_3) = 1/2$, $h(x_4) = 1/4$, where $x_2$, $x_3$, and $x_4$ are the largest values of $x < 1.01\bar{x}$ with that property. For $x > x_2$ we have from (2.14a) $xAf'' = xgf - (1 + A - h^2)f' \ge -(1 + A - h^2)f' \ge -f'/4$ since $h \le 3/4$, so $f'' \ge -f'^3/(4xAf'^2) \ge -f'^3/(4 \cdot 1.01\,\bar{x}D)$, or $f''/f'^2 \ge -f'/(4.04\,\bar{x}D)$. We now integrate the above from $x_2$ to $x > x_3$ to get
$$\frac{-1}{f'(x)} \ge \frac{-1}{f'(x)} + \frac{1}{f'(x_2)} = \int_{x_2}^{x} \frac{f''}{f'^2}\,dx \ge \int_{x_2}^{x} \frac{-f'}{4.04\,\bar{x}D}\,dx = \frac{f(x_2) - f(x)}{4.04\,\bar{x}D} \ge \frac{f(x_2) - f(x)}{5\bar{x}D}. \tag{7.3}$$
Now, $f(x) \le f(x_3)$, so $-f'(x) \le 5\bar{x}D/(f(x_2) - f(x_3)) \approx 5\bar{x}^2 D/(h(x_2) - h(x_3)) = 20\bar{x}^2 D$. Using the uniform bound on $f'$ in the interval $x_3 \le x \le x_4$, we have $x_4 - x_3 = (f(x_4) - f(x_3))/f'(\xi)$ for some $\xi \in [x_3, x_4]$. But $(f(x_4) - f(x_3))/f'(\xi) \ge (h(x_4) - h(x_3))/(\bar{x}f'(\xi)) \ge 1/(80\bar{x}^3 D)$ and hence we may take $\gamma = 1/(80\bar{x}^3 D)$.

Lemma 7.6. In the interval $x_1 < x < \bar{x} + \gamma$, $g(x, a) > \min(\tau, 0.9/h^2(\bar{x}, b))$.

Proof. From (2.18) we have $xg' = C^2(1 - A + x^2 Af'^2 - x^2 gf^2) \ge C^2(1 - A - x^2 gf^2)$. Moreover, since $A(x_1, a) < 0.02$ and $xA' < 1$, $A(x, a) = A(x_1, a) + A'(z)(x - x_1) < 0.02 + (1/z)(0.02\bar{x}) < 0.04$, so if $g < 0.96/h^2(x)$ then $g' > 0$. Since $f(x_1, a) < f(x_1, b) + 0.01/\bar{x}$, we have $h(x, a) \le 1.01\,\bar{x}f(x_1, a) < 1.01(\bar{x}f(x_1, b) + 0.01) < 1.02\,\bar{x}f(x_1, b)$, and therefore $g' > 0$ if $g(x_1, a) < 0.9/h^2(\bar{x}, b)$. Thus, if $\tau < g(x_1, a) < 0.9/h^2(\bar{x}, b)$, then $g' > 0$ and $g(x, a) > \tau$ in the interval $x_1 < x < \bar{x} + \gamma$; if $g(x_1, a) > 0.9/h^2(\bar{x}, b)$, then $g(x, a) > 0.9/h^2(\bar{x}, b)$ for all $x$ in the interval $x_1 < x < \bar{x} + \gamma$ because $g$ cannot cross that value from above. Note that the above lower bound on $g$ is uniform: it applies to all $a$ satisfying the conditions i) $Af'^2$ (along the a-orbit) is bounded by $D$, ii) $A(x_1, a) < 0.02$, iii) $h(x_1, a) > 3/4$, and iv) $g(x_1, a) > \tau > 0$.

Lemma 7.7. For all $a$ sufficiently near $b$, $v(x, a)$ goes to 0 for some $x < \bar{x} + \gamma$.

Proof. To show that $v(x, a)$ goes to 0, we note that $h(x, a) \ge 1/4$ for all $a$ near $b$ and $\bar{x} < x < \bar{x} + \gamma$ by Lemma 7.5. Hence, $f(x, a) = h(x, a)/x > 1/(4\bar{x})$. By Lemma 7.6, $g(x, a) > \min(\tau, 0.9/h^2(\bar{x}, b))$, hence $v' \ge \frac{1}{4\bar{x}}\min(\tau, 0.9/h^2(\bar{x}, b)) = \eta > 0$ for $\bar{x} < x < \bar{x} + \gamma$. Thus, $v(\bar{x} + \gamma) - v(\bar{x}) = \int_{\bar{x}}^{\bar{x}+\gamma} v'\,dx \ge \int_{\bar{x}}^{\bar{x}+\gamma} \eta\,dx \ge \eta\gamma$. Let $x_1$ be chosen so that $v(x_1, b) > -\eta\gamma/2$. Then, if $a$ is sufficiently close to $b$, $|v(x_1, a) - v(x_1, b)| < \eta\gamma/2$, so $v(x_1, a) > -\eta\gamma$. For such $a$ we then have $v(\bar{x} + \gamma, a) > v(\bar{x}, a) + \eta\gamma$ and $v(\bar{x}, a) > v(x_1, a) > -\eta\gamma$ because $v' > fg > 0$; thus, $v(\bar{x} + \gamma, a) > 0$.
We now complete the proof of Proposition 7.2.

Proof of Proposition 7.2 b). Suppose that the b-orbit crashes at $x = \bar{x}$ with $\theta(x, b) < (k + 1/2)\pi$ for all $x < \bar{x}$ and $\theta(x, b) > k\pi$ for $x$ near $\bar{x}$. For $a$ near $b$ there is an $x < \bar{x} + \gamma$ with $v(x, a) = 0$ by Lemma 7.7. Since $x < \bar{x} + \gamma$, $h(x) > 1/4$, i.e., $f(x, a) > 0$, so the a-orbit crashes, $A(x, a) = 0$, or exits Q4 to Q1 (or Q2 to Q3), $f'(x, a) = 0$, never to return. In either case, the a-orbit has rotation $\theta(x, a) < (k + 1/2)\pi$.

8. Proof of the Main Theorem

Proof of Theorem 2.1. Let $X_n = \{a > 0 \mid \theta(x, a) < (n + 1/2)\pi$ for all $x$ for which the a-orbit is defined$\}$. Note that $X_{n-1} \subset X_n$ and $X_0 \ne \emptyset$ by Proposition 5.1 and hence $X_n \ne \emptyset$. Also note that $b_{n+1} > 0$ is a lower bound for $X_n$ by Proposition 4.1; hence, $X_n$ has a greatest lower bound $a_n = \inf(X_n) \ge b_{n+1} > 0$. We will show that the $a_n$-orbit is a globally regular solution and $n\pi < \theta(x, a_n) < (n + 1/2)\pi$ for large $x$.

We first show that $a_n \in X_n$, i.e., $a_n$ is the smallest element in $X_n$. If $\theta(x, a_n) > (n + 1/2)\pi$ for some $x$ then $\theta(x, a) > (n + 1/2)\pi$ for all $a$ near $a_n$, so $a \notin X_n$ for these $a$'s and this contradicts the fact that $a_n$ is the greatest lower bound of $X_n$. Thus, $a_n \in X_n$. In particular, the $a_n$-orbit has bounded rotation.

Next we show that the $a_n$-orbit does not crash. Recall from Proposition 7.1 that if the $a_n$-orbit crashes at $x = \bar{x}$ then $g(x, a_n) > 0$ for $x$ near $\bar{x}$. If the $a_n$-orbit crashes in Q1 or Q3, that is, if $\theta(x, a_n) < n\pi$ for $x$ near $\bar{x}$, then $\theta(x, a) < n\pi$ and $g(x, a) > 0$ for all $a$ near $a_n$, which implies by Lemma 2.6b that the a-orbit must have $\theta(x, a) < n\pi$ for all $x$. Thus, $a \in X_n$ for all $a$ near $a_n$ and this contradicts the fact that $a_n$ is the greatest lower bound of $X_n$. Similarly, if the $a_n$-orbit crashes in Q2 or Q4, that is, at some $\bar{x}$ with $(n + 1/2)\pi > \theta(\bar{x}, a_n) > n\pi$, then by the Crash Theorem (Proposition 7.2) $(n + 1/2)\pi > \theta(x, a)$ for all $x$ in the domain of definition of the a-orbit for all $a$ near $a_n$, and this contradicts the fact that $a_n$ is the greatest lower bound of $X_n$. Thus, the $a_n$-orbit is defined for all $x$ and hence is a globally regular solution by Proposition 6.1. Also, by Lemma 6.2, the $a_n$-orbit is in Q2 or Q4 for large $x$.

It remains to prove that $\theta(x, a_n) > n\pi$ for large $x$. Suppose that $\theta(x, a_n) < n\pi$ for large $x$. By Lemma 6.3 we have that $g(x, a_n) > 0$ for large $x$ and hence $g(x, a) > 0$ for all $a$ near $a_n$. Then, by Lemma 2.6b the a-orbit must have $\theta(x, a) < n\pi$ for all $x$ and thus $a \in X_n$, and this contradicts the fact that $a_n$ is the greatest lower bound for $X_n$. This completes the proof of Theorem 2.1.

Acknowledgement. We would like to thank the Mathematisches Forschungsinstitut in Oberwolfach for supporting this project under the Research in Pairs program. P. B. was supported in part by the KBN grant 2 P03B 010 16.
Communicated by H. Nicolai
Commun. Math. Phys. 215, 375 – 408 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Distribution of Resonances and Decay Rate of the Local Energy for the Elastic Wave Equation
Mourad Bellassoued
Université de Paris Sud, Mathématiques, Bât. 425, 91405 Orsay Cedex, France
Received: 22 February 2000 / Accepted: 28 June 2000
Abstract: We study resonances (scattering poles) associated to the elasticity operator in the exterior of an arbitrary obstacle with Neumann or Dirichlet boundary conditions. We prove that there exists an exponentially small neighborhood of the real axis free of resonances. Consequently we prove that for regular data, the energy for the elastic wave equation decays at least as fast as the inverse of the logarithm of time. According to Stefanov–Vodev ([SV1, SV2]), our results are optimal in the case of a Neumann boundary condition, even when the obstacle is a ball of $\mathbb{R}^3$. The main difference between our case and the case of the scalar Laplacian (see Burq [Bu]) is the phenomenon of Rayleigh surface waves, which are connected to the failure of the Lopatinskii condition.

1. Introduction and Main Results

Let $\Omega = \mathbb{R}^3 \setminus O$ be an exterior domain in $\mathbb{R}^3$ with $C^\infty$ and compact boundary $\partial\Omega$, and $g = (g_{ij})$ a $C^\infty$ positive definite symmetric matrix equal to the identity matrix outside a compact set. We consider the elastic wave equation with Neumann or Dirichlet boundary condition
$$\begin{cases} \partial_t^2 u - A_e(x, D_x)u = 0 & \text{in } \mathbb{R} \times \Omega, \\ B(x, D)u = 0 & \text{on } \mathbb{R} \times \partial\Omega, \\ u(0, x) = u_0(x);\ \partial_t u(0, x) = u_1(x) & \text{in } \Omega. \end{cases} \tag{1.1}$$
Here, $A_e(x, D_x)$ and $B(x, D_x)$ are of the forms
$$A_e(x, D_x) = \mu\Delta + (\mu + \lambda)\nabla(\operatorname{div}\,\cdot), \tag{1.2}$$
where $\mu > 0$, $3\lambda + 2\mu > 0$ and $\Delta$, $\nabla$ and $\operatorname{div}(\cdot)$ are the Laplacian, gradient and divergence operators defined by $\Delta u = \sum_{ij} \partial_i(g_{ij}\partial_j u)$; $\nabla u = \big(\sum_j (\sqrt{g})_{ij}\partial_j u\big)_i$ and $\operatorname{div} u = \operatorname{trace}(\nabla u)$,
$$B(x, D)u = \begin{cases} u & \text{(Dirichlet condition)}, \\ N(x, D)u = \sigma(u) \cdot n & \text{(Neumann condition)}, \end{cases} \tag{1.3}$$
where $n(x)$ is the unit outer-normal vector to $\Omega$ at $x \in \partial\Omega$. In the above
$$\sigma(u) = \lambda(\operatorname{div} u)I + 2\mu\varepsilon(u) \tag{1.4}$$
is the stress tensor, where $\varepsilon(u) = \frac{1}{2}(\nabla u + {}^t\nabla u)$ is the strain tensor. The aim of this work is to study the decay rate of the local energy and the distribution of resonances. For $R > 0$ we define the local energy $E(u, R, t)$ for the solution of (1.1) in $\Omega_R = \Omega \cap B(0, R)$ as
$$E(u, R, t) = \frac{1}{2}\int_{\Omega_R} \Big(\lambda|\operatorname{div} u(t, x)|^2 + \mu\sum_{ij}|\varepsilon_{ij}(u)|^2 + |\partial_t u(t, x)|^2\Big)\,dx. \tag{1.5}$$
We say that the problem (1.1) has the uniform local energy decay property when, for any $R > 0$, there exists a continuous function $f(t)$ satisfying $\lim f(t) = 0$ as $t \to \infty$ such that $E(u, R, t) \le f(t)E(u_0, u_1)$ holds for any $t \ge 0$. Shibata and Soga [SS] formulate the scattering theory for the elastic wave equation, which is analogous to the theory of Lax and Phillips [LP]. In particular they proved that for any initial data $u_0$, $u_1$ the local energy for the solution of (1.1) in $\Omega_R$ decays as $t$ tends to infinity. Therefore, it is interesting to investigate whether the problem (1.1) has a uniform rate of local energy decay. In the case of the scalar-valued wave equation with the Dirichlet or the Neumann boundary condition, or even the elastic wave equation with Dirichlet boundary condition, if the obstacle satisfies a non-trapping condition in some sense, we have the uniform local energy decay property. Furthermore, we can take the rate $f(t)$ as $e^{-\alpha t}$ (see Melrose [Me], Melrose–Sjöstrand [MS], Morawetz [Mo], Iwashita–Shibata [IS] and Yamamoto [Y]). In the case when the obstacle is trapping, Ralston's example (see [Ra]) proves that we cannot generally expect the local energy to decay uniformly. For the elastic wave equation with the Neumann condition, however, there is an interesting phenomenon. It is the existence of the Rayleigh surface wave, which seems to propagate along the boundary. Taylor [T1] gives a rigorous treatment of the singularity and he proves that there are three types of rays that carry singularities. The first two types are classical rays reflecting at the boundary according to the laws of geometrical optics, and the singularities propagate along them with speeds $c_1 = \sqrt{\mu}$, $c_2 = \sqrt{2\mu + \lambda}$. The third type of rays lie on the boundary and singularities propagate along them with a slower propagation speed $c_R > 0$ (the Rayleigh speed). Thus any obstacle is trapping for the problem (1.1) (even a ball of $\mathbb{R}^3$) from the point of view of propagation of singularities.
Consequently Ikeheta and Nakamura [IN] and Stefanov–Vodev [SV1] show that the problem with the Neumann condition does not have the uniform local energy decay property if the obstacle is a ball in $\mathbb{R}^3$, and the results are extended for any non-trapping obstacle by Kawashita [K1, K2]. But for regular data, we can set
$$p_{m,R}(t) = \sup\left\{\frac{E(u, R, t)}{\|(u_0, u_1)\|^2_{D(A^m)}};\ (u_0, u_1) \ne 0\right\}, \tag{1.6}$$
Local Energy for Elastic Wave Equation
where
$$A = \begin{pmatrix} 0 & -i\,\mathrm{Id} \\ -iA_e & 0 \end{pmatrix}, \tag{1.7}$$
and as for the rate $p_{m,R}(t)$ with $m > 0$, however, we can show $\lim_{t\to\infty} p_{m,R}(t) = 0$ by the method of Walker [Wa]. Indeed, his proof is based on the Rellich theorem and the local energy decay property, that is, $\lim_{t\to\infty} E(u, R, t) = 0$. An important problem in this direction is to know how fast $p_{m,R}(t)$ converges to zero as $t \to \infty$. Ikeheta and Nakamura [IN] show that for any $\alpha > 0$ we cannot get an estimate of the form $p_{m,R}(t) \le Ce^{-\alpha t}$ even if $\partial\Omega$ is a unit sphere in $\mathbb{R}^3$, and Kawashita [K] shows that $\lim_{t\to\infty} t^\gamma p_{m,R}(t) = +\infty$ holds for any $\gamma > 0$. Also, a second important problem is to know how fast $p_{m,R}(t)$ converges to zero if the obstacle is trapping. A surprising new discovery concerning the decay of the local energy for the scalar-valued wave equation in an exterior obstacle was made a few years ago by Burq [Bu], who proved that for regular data, the energy decays at least as fast as the inverse of the logarithm of time for an arbitrary obstacle. In the present paper, we show similar results for the solution of (1.1) independent of the geometry of the obstacle. More precisely, we prove that:

Theorem 1.1. For any $R_1, R_2 > 0$ and $m > 0$ there exists $C > 0$ such that for any data $(u_0, u_1) \in D(A^m)$ supported in $\Omega_{R_1}$ we have
$$E(u, R_2, t) \le \frac{C}{(\log(2 + t))^{2m}}\|(u_0, u_1)\|^2_{D(A^m)}, \tag{1.8}$$
where $u$ solves (1.1).

We have the previous result as a consequence of the existence of an exponentially small neighborhood of the real axis free from resonances (scattering poles). More precisely, the purpose of this paper is to give some information about the location of the poles of the outgoing resolvent $R(z)$ defined as the solution operator of
$$\begin{cases} (\Delta_e + z^2)v(x, z) = f & \text{in } \Omega, \\ Bv(x, z) = 0 & \text{on } \partial\Omega, \\ v(x, z) \text{ is outgoing.} \end{cases} \tag{1.9}$$
We say that the function $v$ is outgoing if $v(x, z)$ is the $L^2(\Omega)$-solution in $\operatorname{Im} z < 0$ and the analytic continuation of an $L^2$-solution in $\operatorname{Im} z < 0$ if $\operatorname{Im} z \ge 0$. It is known that the resolvent $R(z)$ acting on functions $v \in L^2_{\mathrm{comp}}(\Omega)$ in $H^2$ is a holomorphic function in $\operatorname{Im} z < 0$ and can be extended to a meromorphic function from $\operatorname{Im} z < 0$ to the whole complex plane $\mathbb{C}$ with possible poles in $\operatorname{Im} z > 0$. Let $\chi_{1,2} \in C_0^\infty$ be two cutoff functions equal to 1 near $\partial\Omega$. The poles of $\chi_1 R(z)\chi_2$ are called resonances. In the case of the Laplacian with Dirichlet boundary condition, it is well known (Melrose–Sjöstrand [MS]) that for non-trapping obstacles the resonances lie above logarithmic curves of the type $\operatorname{Im} z = \alpha\log(|\operatorname{Re} z|) - \beta$, $\alpha > 0$. For trapping obstacles Burq [Bu] shows the existence of an exponentially small neighborhood of the real axis containing no resonances. Stefanov and Vodev [SV] prove that for the elasticity operator with the Neumann boundary condition there exists a sequence of resonances tending exponentially to the real axis for an arbitrary obstacle. In this paper we show that:
Theorem 1.2. There exist $C_1, C_2, C_3 > 0$ such that the outgoing resolvent is holomorphic in the region
$$U = \left\{z \in \mathbb{C};\ \operatorname{Im} z < C_1 e^{-C_2|\operatorname{Re} z|}\right\} \cap \{|z| > C_3\}. \tag{1.10}$$
Moreover, there exist $C > 0$ and $\delta > 0$, $\chi_j$, $j = 1, 2$, such that in the region $U$ we have the estimate
$$\|\chi_1 R(z)\chi_2\|_{L(L^2, H^2)} \le Ce^{\delta|\operatorname{Re} z|}. \tag{1.11}$$
To prove Theorem 1.2, we will use an idea due to Lebeau and Robbiano [LR2] which has been adapted by Burq [Bu] for this kind of problem. It consists in using Carleman estimates to obtain information about the resolvent in a bounded domain. The cost is to use phase functions satisfying Hörmander's assumption and thus growing fast, far from the obstacle. The point is to show estimates for the solution of $(A_e + \tau^2)u = f$, where $\tau$ is a real parameter and $f$ is compactly supported. To our knowledge, there is not much literature concerning the case of systems, even without any additional conditions on the boundary. Indeed, no general method is available to solve such a problem, except for multiplying the system by the cofactor matrix and then using the machinery of scalar Carleman estimates (see Hörmander [H1]) for the determinant. Unfortunately this method needs high regularity assumptions on the coefficients, especially in the case of the boundary problem, since it increases the multiplicity of real characteristics near the boundary. Hence the Lopatinskii condition is not easily satisfied. D. Tataru [T] gives a rigorous study of the Lopatinskii condition and Carleman estimates. In fact Tataru proved the Carleman estimates in the general case for scalar operators under the Lopatinskii condition. But in the case of the elasticity system the situation is more complicated. Indeed, the operator has a $3 \times 3$ principal symbol matrix, and especially in the case of the Neumann boundary condition the phenomenon of Rayleigh waves is connected to the failure of the Lopatinskii condition. At this step our proof diverges completely from the proof of Burq [Bu]. Our approach consists in diagonalizing the system near the boundary. This is the main technical part of our work. Let $\Omega_0$ be a bounded domain with smooth boundary $\partial\Omega_0$ and let $A(x, D) = -A_e(x, D) - \tau^2$ with principal symbol $A(x, \xi) = -A_e(x, \xi) - \tau^2$.
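Let $\varphi(x) \in C^\infty(\mathbb{R}^3)$; we define $A(x, D, \tau) = e^{\tau\varphi}A(x, D_x)e^{-\tau\varphi}$ and denote by $a_\gamma(x, \xi) = {}^t\xi g\xi - \tau^2/\gamma$ for $\gamma \in \{\mu, 2\mu + \lambda\}$. We assume that $\varphi$ satisfies Hörmander's assumption
$$H_\varphi:\ \exists C > 0 \text{ such that } \{\operatorname{Re} a_\gamma(x, \xi + i\tau\varphi'), \operatorname{Im} a_\gamma(x, \xi + i\tau\varphi')\} > C \text{ whenever } a_\gamma(x, \xi + i\tau\varphi') = 0 \text{ and } \gamma \in \{\mu, 2\mu + \lambda\},$$
and we assume that $\left.\frac{\partial\varphi}{\partial n}\right|_{\partial\Omega_0} \ne 0$.

Theorem 1.3. Assume that $\frac{\partial\varphi}{\partial n} < -C_0$ (where $C_0 > 0$ is large enough) on $\Gamma_1 \subset \partial\Omega_0$. Then there exists $C > 0$ such that for any $u \in C^\infty(\overline{\Omega}_0)$ with $Bu = 0$ on $\Gamma_1$ we have
$$\int_{\Omega_0} |A(x, D, \tau)u|^2 + \tau\int_{\partial\Omega_0 \setminus \Gamma_1} (\tau^2|u|^2 + |\nabla u|^2)\,d\sigma \ge C\tau\int_{\Omega_0} (\tau^2|u|^2 + |\nabla u|^2) \tag{1.12}$$
for large enough $\tau$.

Remark 1. According to Ikeheta–Nakamura [IN] and Stefanov–Vodev [SV1], Theorem 1.1 and Theorem 1.2 with the Neumann boundary condition are optimal even when the obstacle is a ball of $\mathbb{R}^3$.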
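To get a feel for how slow the rate in Theorem 1.1 is, the following sketch evaluates the factor $(\log(2 + t))^{-2m}$ from (1.8) and inverts it to find the time needed to reach a given reduction. It is only an illustration: the constant $C$ and the data norm are left out, and the helper names are ours rather than the paper's.

```python
import math

def log_decay_factor(t, m):
    """The factor (log(2 + t))**(-2m) from the bound (1.8)."""
    return math.log(2.0 + t) ** (-2.0 * m)

def time_to_reach(factor, m):
    """Smallest t >= 0 with (log(2 + t))**(-2m) <= factor, for factor > 0."""
    t = math.exp(factor ** (-1.0 / (2.0 * m))) - 2.0
    return max(t, 0.0)

for t in (1e1, 1e3, 1e6, 1e12):
    print(f"t = {t:.0e}: m = 1 gives {log_decay_factor(t, 1):.5f}, "
          f"m = 2 gives {log_decay_factor(t, 2):.7f}")
print(f"time for a reduction by 1e-2 with m = 1: {time_to_reach(1e-2, 1):.3e}")
```

Increasing the regularity index $m$ improves the power of the logarithm but never turns the bound into a polynomial or exponential rate, in line with the optimality statement of Remark 1.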
Remark 2. Theorem 1.1 follows from Theorem 1.2. This is proved in the general case by Burq (see [Bu]) using essentially the spectral calculus for selfadjoint operators.

The paper is organized as follows. In Sect. 2 we study the outgoing solution far from the obstacle (outside a fixed ball $B(0, R_1)$ containing the obstacle) by using a spherical harmonics decomposition. The construction of phases satisfying Hörmander's condition and the proof of Theorem 1.2 are in Sect. 3. The Carleman estimates are proved in Sects. 4, 5, 6 and 7.

2. Spherical Harmonics

Our purpose in this section is to give some information about the outgoing solutions of $(\Delta_e + z^2)u = 0$ outside a ball $B(0, R_1)$ (where $\Delta_e = A_e$ in $B(0, R_1)^c$). Our analysis here is close to that of Ikeheta–Nakamura [IN] and Stefanov–Vodev [SV1] (see also Morse–Feshbach [MF]). We will keep some of the notation used in these papers.

2.1. Preliminaries. Let $(r, \theta, \varphi)$ be the polar coordinates in $\mathbb{R}^3$. Denote by $P_{mn}(z)$, $(n, m \in \mathbb{Z}_+,\ m \le n)$ the Ferrer's function defined by
$$P_{mn}(z) = (2^n n!)^{-1}(1 - z^2)^{m/2}\frac{\partial^{m+n}}{\partial z^{m+n}}(z^2 - 1)^n. \tag{2.1}$$
Set $Y_{mn}(\theta, \varphi) = e^{im\varphi}P_{mn}(\cos\theta)$ and define the vector spherical harmonics $\mathbf{P}_{mn}$, $\mathbf{B}_{mn}$, $\mathbf{C}_{mn}$ by
$$\mathbf{P}_{mn} = Y_{mn}(\theta, \varphi)\omega, \quad \mathbf{C}_{mn} = (n(n + 1))^{-1/2}\operatorname{curl}(\omega rY_{mn}(\theta, \varphi)), \quad \mathbf{B}_{mn} = \omega \times \mathbf{C}_{mn}, \tag{2.2}$$
where the symbol $(\times)$ denotes the exterior product and $\omega = x/R$ for $x \in S^2(0, R)$. It is known (see Morse–Feshbach [MF] pp. 1898) that $\mathbf{P}_{mn}$, $\mathbf{B}_{mn}$, $\mathbf{C}_{mn}$ form an orthogonal basis in $L^2(S^2)$. Denote by $h_n(z) = \left(\frac{\pi}{2z}\right)^{1/2}H_{n+\frac{1}{2}}(z)$ the spherical Hankel function of first order, where
$$H_\gamma(z) = \int_{-\infty}^{+\infty - i\pi} e^{z\sinh t - \gamma t}\,dt. \tag{2.3}$$
We have the following lemma for the properties of $H_\gamma(z)$ (see Lemmas 2.3 and 2.4 of Burq [Bu]).

Lemma 2.1.
1. For any $a < 1$, there exist $C > 0$, $\tau_0 > 0$ such that for any $|\tau| > \tau_0$ and $\gamma < |\tau|ar$, where $\tau = \operatorname{Re} k > \tau_0$, $|\operatorname{Im} k| \le 1$, we have
$$-\operatorname{Im}\frac{H'_\gamma(kr)}{H_\gamma(kr)} \ge C, \quad \text{and} \quad \left|\operatorname{Re}\frac{H'_\gamma(kr)}{H_\gamma(kr)}\right| \le \frac{C}{\sqrt{|\tau|}}. \tag{2.4}$$
2. For any $a < 1$ and $0 < R_1 < R_2$, there exist $C > 0$, $\varepsilon > 0$, $\tau_0 > 0$ such that for any $|k| > \tau_0$ and $\gamma > \tau a^{-1}R_1$, $|\operatorname{Im} k| \le 1$, we have
$$|H_\gamma(kR_2)| \le Ce^{-\varepsilon\gamma}|H_\gamma(kR_1)|, \qquad |H'_\gamma(kR_2)| \le Ce^{-\varepsilon\gamma}|H'_\gamma(kR_1)|. \tag{2.5}$$
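Define now $L_{mn}(r; k)$, $M_{mn}(r; k)$, $N_{mn}(r; k)$ as follows:
$$L_{mn}(r; k) = k^{-1}\operatorname{grad}(Y_{mn}(\theta, \varphi)h_n(kr)), \quad M_{mn}(r; k) = \operatorname{curl}(r\omega Y_{mn}(\theta, \varphi)h_n(kr)), \quad N_{mn}(r; k) = k^{-1}\operatorname{curl}(M_{mn}(r; k)). \tag{2.6}$$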
Then it can be easily seen that the following lemma holds (see Morse–Feshbach [MF]).

Lemma 2.2. We have the following properties:
1. $L_{mn}(r; k)$, $M_{mn}(r; k)$, $N_{mn}(r; k)$ are eigenfunctions of the vector Helmholtz equation
$$(\Delta + k^2)U = 0 \quad \text{in } \mathbb{R}^3. \tag{2.7}$$
2. Let $k_1 = z/\sqrt{\mu}$, $k_2 = z/\sqrt{2\mu + \lambda}$. Then $L_{mn}(r; k_2)$, $M_{mn}(r; k_1)$, $N_{mn}(r; k_1)$ are eigenfunctions of the equation
$$(\Delta_e + z^2)U = 0 \quad \text{in } \mathbb{R}^3. \tag{2.8}$$
3. $Y_{mn}(\theta, \varphi)h_n(kr)$ are eigenfunctions of the scalar Helmholtz equation $(\Delta + k^2)\phi = 0$ in $\mathbb{R}^3$ and we have
$$\left(k^{-2}\frac{d^2}{dr^2} + \frac{2}{k^2 r}\frac{d}{dr} + 1 - \frac{n(n + 1)}{k^2 r^2}\right)h_n(kr) = 0. \tag{2.9}$$
4. We have the following decomposition of $L_{mn}(r; k)$, $M_{mn}(r; k)$, $N_{mn}(r; k)$ as a linear combination of $\mathbf{P}_{mn}$, $\mathbf{B}_{mn}$ and $\mathbf{C}_{mn}$ (see Morse–Feshbach [MF], pp. 1865):
$$\begin{aligned} L_{mn}(r; k) &= \frac{1}{k}\frac{d}{dr}(h_n(kr))\,\mathbf{P}_{mn} + \sqrt{n(n + 1)}\,\frac{1}{kr}h_n(kr)\,\mathbf{B}_{mn}, \\ M_{mn}(r; k) &= \sqrt{n(n + 1)}\,h_n(kr)\,\mathbf{C}_{mn}, \\ N_{mn}(r; k) &= n(n + 1)\frac{1}{kr}h_n(kr)\,\mathbf{P}_{mn} + \sqrt{n(n + 1)}\,\frac{1}{kr}\frac{d}{dr}(rh_n(kr))\,\mathbf{B}_{mn}. \end{aligned} \tag{2.10}$$

Let $u(x, z)$ solve the following problem: $(\Delta_e + z^2)u = 0$ in the exterior of the ball $B(0, R_1)$ and $u$ is outgoing. The solution space can be decomposed into subspaces, which consist of longitudinal and transverse fields each satisfying the appropriate Helmholtz equation (see Kupradz [Ku]; Ikeheta–Nakamura [I.N])
$$(\Delta + k_2^2)u^{(p)} = 0, \qquad (\Delta + k_1^2)u^{(s)} = 0, \tag{2.11}$$
where $u = u^{(p)} + u^{(s)}$ and
$$u^{(p)} = \operatorname{grad}\phi, \ \phi \text{ satisfies } (\Delta + k_2^2)\phi = 0; \qquad u^{(s)} = \operatorname{curl}A \text{ with } \operatorname{div}A = 0, \ A \text{ satisfies } (\Delta + k_1^2)A = 0. \tag{2.12}$$
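Item 3 of Lemma 2.2 states that $r \mapsto h_n(kr)$ solves the spherical Bessel equation written in the variable $r$. The sketch below checks this numerically for $n = 0, 1$ with complex $k$, using the standard closed forms of the first-kind spherical Hankel functions. This choice of kind is an assumption made for illustration (it may differ from the normalization in (2.3)), but both kinds satisfy the same differential equation. The finite-difference step and the sample point are arbitrary.

```python
import cmath

def h0(z):
    """Order-0 spherical Hankel function (first kind): exp(iz)/(iz)."""
    return cmath.exp(1j * z) / (1j * z)

def h1(z):
    """Order-1 spherical Hankel function (first kind)."""
    return -cmath.exp(1j * z) * (z + 1j) / (z * z)

def radial_residual(h, n, k, r, step=1e-4):
    """Residual of k^-2 u'' + 2/(k^2 r) u' + (1 - n(n+1)/(k r)^2) u
    for u(r) = h(k r), with derivatives taken by central differences."""
    u = lambda s: h(k * s)
    d1 = (u(r + step) - u(r - step)) / (2 * step)
    d2 = (u(r + step) - 2 * u(r) + u(r - step)) / step ** 2
    return (d2 / k ** 2 + 2 * d1 / (k ** 2 * r)
            + (1 - n * (n + 1) / (k * r) ** 2) * u(r))

k = 2.0 - 0.5j   # sample wave number with |Im k| <= 1
r = 1.5          # sample radius
for n, h in ((0, h0), (1, h1)):
    print(f"n = {n}: |residual| = {abs(radial_residual(h, n, k, r)):.2e}")
```

The residuals are at the level of the discretization and rounding error, which is what one expects if the closed forms and the equation are consistent.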
2.2. Study of the outgoing solutions far from the obstacle. Since here we have two sound speeds, we have to consider the following five regions:
i) Hyperbolic region: $n(n + 1) < \tau_2^2 r^2$,
ii) Glancing region (I): $n(n + 1) = \tau_2^2 r^2$,
iii) Mixed region: $\tau_2^2 r^2 < n(n + 1) < \tau_1^2 r^2$,
iv) Glancing region (II): $n(n + 1) = \tau_1^2 r^2$,
v) Elliptic region: $n(n + 1) > \tau_1^2 r^2$.
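The five regions can be turned into a small classifier for a spherical-harmonic index $n$ at radius $r$. This is a sketch under an assumption that is not spelled out at this point of the text: we take $\tau_1 = \tau/\sqrt{\mu}$ and $\tau_2 = \tau/\sqrt{2\mu + \lambda}$, i.e. the real parts of $k_1$ and $k_2$ when $\operatorname{Re} z = \tau$. The function name and the tolerance used to detect the two glancing cases in floating point are ours.

```python
import math

def classify_region(n, tau, r, mu, lam, rel_tol=1e-9):
    """Classify the mode n at radius r into one of the five regions.

    Assumes tau1 = tau/sqrt(mu) and tau2 = tau/sqrt(2*mu + lam); the
    glancing cases are detected up to the relative tolerance rel_tol.
    """
    s = n * (n + 1)
    low = (tau / math.sqrt(2 * mu + lam)) ** 2 * r ** 2   # tau2^2 r^2
    high = (tau / math.sqrt(mu)) ** 2 * r ** 2            # tau1^2 r^2
    if math.isclose(s, low, rel_tol=rel_tol):
        return "glancing (I)"
    if math.isclose(s, high, rel_tol=rel_tol):
        return "glancing (II)"
    if s < low:
        return "hyperbolic"
    if s < high:
        return "mixed"
    return "elliptic"

# Example with mu = 1, lam = 2, so that tau1 = tau and tau2 = tau/2.
for n in (3, 5, 10):
    print(n, classify_region(n, tau=10.0, r=1.0, mu=1.0, lam=2.0))
```

Since $\mu > 0$ and $3\lambda + 2\mu > 0$ give $2\mu + \lambda > \mu$, one always has $\tau_2 < \tau_1$, so the mixed region is never empty as a range of values of $n(n + 1)$.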
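Now, we state the main result of this section.

Proposition 2.1. For any $R_2 > R_1 > 0$, there exist $C_1$, $C_2$, $\varepsilon$ and $\tau_0$ such that for any $z \in \mathbb{C}$ with $|\operatorname{Im} z| \le 1$, $|\operatorname{Re} z| = |\tau| > \tau_0$, the outgoing solution of $(\Delta_e + z^2)u = 0$ outside a ball $B(0, R_1)$ satisfies
$$\int_{r=R_2} (-\operatorname{Im} Nu\,\bar{u})\,d\sigma \ge |\tau|C_1\int_{r=R_2} (|u|^2 + |\tau^{-1}\nabla u|^2)\,d\sigma - C_2 e^{-\varepsilon|\tau|}\int_{r=R_1} (|u|^2 + |\tau^{-1}\nabla u|^2)\,d\sigma. \tag{2.13}$$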
Remark 3. We only prove the estimate (2.13) in the hyperbolic, mixed and elliptic regions; the glancing regions (I) and (II) can be easily incorporated in the mixed and elliptic regions respectively (see Lemmas 2.3, 2.4 and 2.5).

The next calculation is useful for expanding the outgoing solution of $(\Delta_e + z^2)u = 0$ and $Nu$ in terms of $L_{mn}$, $M_{mn}$, $N_{mn}$. Let $\phi$ and $A$ be a solution of (2.12); then $\phi$ can be rewritten as follows:
$$\phi = \sum k_2^{-1}\phi_{mn}(Y_{mn}(\theta, \varphi)h_n(k_2 r)). \tag{2.14}$$
Then we obtain by (2.10)
$$u^{(p)} = \operatorname{grad}\phi = \sum \phi_{mn}\left[h'_n(k_2 r)\,\mathbf{P}_{mn} + \frac{\sqrt{n(n + 1)}}{k_2 r}h_n(k_2 r)\,\mathbf{B}_{mn}\right]. \tag{2.15}$$
Since the boundary condition for $u^{(p)}$ can be represented in the form
$$Nu^{(p)} = 2\mu\partial_r u^{(p)} + \lambda\omega\operatorname{div}u^{(p)} = 2\mu\partial_r u^{(p)} - \lambda k_2^2\phi\,\omega, \tag{2.16}$$
we obtain from (2.15)
$$\begin{aligned} Nu^{(p)} = \sum \phi_{mn}\Big\{&\Big[-4\mu\frac{h'_n(k_2 r)}{r} + 2\mu n(n + 1)\frac{h_n(k_2 r)}{k_2 r^2} - (2\mu + \lambda)k_2 h_n(k_2 r)\Big]\mathbf{P}_{mn} \\ &+ 2\mu\sqrt{n(n + 1)}\Big[\frac{h'_n(k_2 r)}{r} - \frac{h_n(k_2 r)}{k_2 r^2}\Big]\mathbf{B}_{mn}\Big\}. \end{aligned} \tag{2.17}$$
Similarly, let $u^{(s)} = \operatorname{curl}A$, where
$$A = k_1^{-1}\sum (A^1_{mn}M_{mn} + A^2_{mn}N_{mn}). \tag{2.18}$$
Here we will omit the calculations which are tedious but elementary and will present only the final results. We get by (2.8) and (2.10), hn (k1 r) u(s) = A1mn n(n + 1) (2.19) Pmn + n(n + 1) k1 r h (k r) n 1 × + hn (k1 r) Bmn + A2mn n(n + 1)hn (k1 r)Cmn . k1 r On the other hand N u(s) = 2µ∂r u(s) + µω × curlu(s) = 2µ∂r u(s) + µk12 ω × A.
(2.20)
Hence, we have h (k r) h (k r) 1 n 1 A1mn 2n(n + 1) n Pmn − r k1 r 2 h (k1 r) + n(n + 1) − 2 n − k1 hn (k1 r) r n(n + 1) hn (k1 r) +2 B h (k r) − 2 n 1 mn k1 r 2 k1 r 2 hn (k1 r) +A2mn n(n + 1) k1 hn (k1 r) − Cmn . r
N u(s) = µ
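Taking into account (2.15), (2.17), (2.19) and (2.21) we obtain
$$-\mathrm{Im}\big(Nu^{(p)}\cdot\overline{u^{(p)}}\big)=(2\mu+\lambda)\sum|\phi_{mn}|^2\,\mathrm{Im}\big(k_2h_n(k_2r)\overline{h_n'(k_2r)}\big)\tag{2.22}$$
and
$$-\mathrm{Im}\big(Nu^{(s)}\cdot\overline{u^{(s)}}\big)=\mu\sum n(n+1)\Big\{|A^1_{mn}|^2\,\mathrm{Im}\big(k_1h_n(k_1r)\overline{h_n'(k_1r)}\big)+|A^2_{mn}|^2\,\mathrm{Im}\big(\bar k_1h_n(k_1r)\overline{h_n'(k_1r)}\big)+O\Big(\frac{n(n+1)}{|k_1|}|h_n(k_1r)|^2\Big)\Big\}.\tag{2.23}$$
Finally
$$-\mathrm{Im}\big(Nu^{(p)}\cdot\overline{u^{(s)}}+Nu^{(s)}\cdot\overline{u^{(p)}}\big)=-\sqrt{\mu(2\mu+\lambda)}\sum n(n+1)\,\mathrm{Im}\Big(\frac{z}{\bar z}\Big)\mathrm{Re}\big(A^1_{mn}\overline{\phi_{mn}}\,h_n(k_1r)\overline{h_n(k_2r)}\big).\tag{2.24}$$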
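Now we define $g_n$, $g_n^{(p)}$, $g_n^{(s)}$, $\alpha_n$ by $g_n = g_n^{(p)} + g_n^{(s)} + \alpha_n$, where
$$\begin{cases} -\mathrm{Im}\big(Nu^{(p)}\cdot\overline{u^{(p)}}\big) = \sum g_n^{(p)}(k_2r),\\ -\mathrm{Im}\big(Nu^{(s)}\cdot\overline{u^{(s)}}\big) = \sum g_n^{(s)}(k_1r),\\ -\mathrm{Im}\big(Nu^{(p)}\cdot\overline{u^{(s)}} + Nu^{(s)}\cdot\overline{u^{(p)}}\big) = \sum\alpha_n(k_1r,k_2r),\end{cases}\tag{2.25}$$
and $|u|^2 = \sum|u_n|^2 = \sum\big(|u_n^{(p)}|^2 + |u_n^{(s)}|^2 + 2\,\mathrm{Re}(u_n^{(p)}\cdot\overline{u_n^{(s)}})\big)$. The rest of this section is devoted to the proof of Proposition 2.1. We need the following three lemmas.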
Lemma 2.3 (Hyperbolic region). For any $a < 1$, there exist $C > 0$, $\tau_0 > 0$ such that for any $|\tau| > \tau_0$ and $n(n+1) \le \tau_2^2R_2^2a^2$ we have
$$g_n(k_1R_2, k_2R_2) \ge C\tau|u_n(R_2)|^2. \tag{2.26}$$
Proof. By Lemma 2.1 we have, for any $n(n+1) \le \tau_2^2R_2^2a^2$ and $\sigma\in\{k_1, k_2\}$,
Im(σ hn (σ R2 )hn (σ R2 )) ≥ Cτ |hn (R2 )|2 . Im(σ hn (σ R2 )hn (σ R2 )) ≥ Cτ |hn (R2 )|2
(2.27)
Therefore,
$$g_n^{(p)}(k_2R_2)\ge C\tau|u_n^{(p)}(R_2)|^2,\qquad g_n^{(s)}(k_1R_2)\ge C\tau|u_n^{(s)}(R_2)|^2.\tag{2.28}$$
Using this and
$$|\alpha_n(k_1r,k_2r)|\le\frac{C}{\tau}\big(g_n^{(p)}+g_n^{(s)}\big)\tag{2.29}$$
we obtain (2.26).
Now using Lemma 2.1 we can prove the following two lemmas about the elliptic and mixed regions.

Lemma 2.4 (Elliptic region). For any $a < 1$, there exist $C > 0$, $\tau_0 > 0$ and $\varepsilon > 0$ such that for any $|\tau| > \tau_0$ and $n(n+1) > \tau_1^2R_1^2a^{-2}$ we have
$$|g_n(k_1R_2,k_2R_2)|\le Ce^{-\varepsilon\sqrt{n(n+1)}}\,|u_n(R_1)|^2.\tag{2.30}$$
Lemma 2.5 (Mixed region). For any $a < 1$, there exist $C_1$, $C_2$, $\tau_0 > 0$ and $\varepsilon > 0$ such that for any $|\tau| > \tau_0$ and $\tau_2^2R_1^2a^{-2} < n(n+1) \le \tau_1^2R_2^2a^2$ we have
$$g_n(k_1R_2,k_2R_2)\ge C_1|u_n^{(s)}(k_1R_2)|^2-C_2e^{-\varepsilon\sqrt{n(n+1)}}\,|u_n^{(p)}(k_2R_1)|^2\ge C_1|u_n(R_2)|^2-C_2e^{-\varepsilon\sqrt{n(n+1)}}\,|u_n(R_1)|^2.\tag{2.31}$$
2.2.1. Proof of Proposition 2.1. Using Lemma 2.3 we have r=R2
− Im(N uu)dσ ¯ =
gn (k1 R2 , k2 R2 ) ≥ Cτ
|un (R2 )|2
n(n+1)≤a 2 τ22 R22
+
gn (k1 R2 , k2 R2 )
a 2 τ22 R22
−τ
n(n+1)≥a 2 τ12 R22
|un (R2 )|2 + |τ −1 gn (R2 )|2 ,
(2.32)
where gn define by |N u|2 = Lemma 2.5 and 2.4 we obtain
| gn |2 . We set a 2 =
r=R2
−
− Im(N uu) ≥ C1 τ
e
√ −ε n(n+1)
|un (R1 )|2 − C2 |τ |
n(n+1)≥a 2 τ12 R22
+ Cτ
then a 2 R22 = a −2 R12 . Using
|un (R2 )|2
n(n+1)≤a 2 τ12 R22
≥ −C |τ |
R1 R2
|un (R2 )|2 + |τ −1 gn (R2 )|2
n(n+1)≥a 2 τ12 R22
1+
n(n + 1) |un (R2 )|2 + |τ −1 gn (R2 )|2 τ2
√ n(n + 1) 2 −ε n(n+1) 1+ (R )| − e |un (R1 )|2 . |u n 2 τ2
(2.33)
As in the proof of Lemma 2.4, with (2.17) and (2.21) we obtain for $n(n+1) > a^2\tau_1^2R_2^2$ the following estimate:
$$|\tau^{-1}\tilde g_n(R_2)|^2\le Ce^{-\varepsilon\sqrt{n(n+1)}}\,|u_n(R_1)|^2.\tag{2.34}$$
Thus we get from (2.33) and Lemma 2.4,
$$\int_{r=R_2}(-\mathrm{Im}\,Nu\cdot\bar u)\,d\sigma\ge|\tau|C_1\int_{r=R_2}(|u|^2+|\tau^{-1}\nabla u|^2)\,d\sigma-C_2e^{-\varepsilon|\tau|}\int_{r=R_1}(|u|^2+|\tau^{-1}\nabla u|^2)\,d\sigma.\tag{2.35}$$
3. Proof of Theorem 1.2

3.1. Construction and properties of the phase function. Our analysis here is close to, and mostly inspired by, that of Burq (see [Bu]). The purpose of this section is to construct two phases $\varphi_1$, $\varphi_2$ which satisfy Hörmander's condition except in a finite number of balls, in such a way that, if $\varphi_1$ does not satisfy this condition in a certain ball, then $\varphi_2$ does and is strictly bigger than $\varphi_1$ in this ball, and vice versa. Moreover, far from the obstacle, the functions $\varphi_1$, $\varphi_2$ coincide, are radial and satisfy $\varphi'(r) = \kappa$, for $\kappa > 0$ arbitrarily small.

3.1.1. Construction of the phase in a bounded region. Let $R > 0$ be such that $B(0, R)$ contains the space perturbation and the operator perturbation, i.e. the obstacle $O \subset B(0, R)$ and $g = \mathrm{Id}$ outside $B(0, R)$.

Proposition 3.1 (see Burq [Bu]). There exist two functions $\psi_1, \psi_2 \in C^\infty(\Omega_R)$ satisfying $\partial_n\psi_i|_{\partial\Omega} < 0$ ($\partial_n$ stands for the unit outer normal vector field on $\partial\Omega$), having only nondegenerate critical points, such that on $S(0, R)$ we have $\nabla\psi_i\cdot x > 0$, and when $\nabla\psi_i = 0$ then $\nabla\psi_{i+1} \ne 0$ and $\psi_{i+1} > \psi_i$ ($\psi_3 = \psi_1$). Finally, the $\psi_i$ are radial and $\psi_1 = \psi_2$ in a neighborhood of $S(0, R)$.
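Then there exist a finite number of points $x_{ij}\in\Omega$, $i = 1, 2$, $j = 1,\dots,N_i$ and $\varepsilon > 0$ such that $B(x_{ij}, 2\varepsilon)\subset\Omega_R$, $B(x_{1,j}, 2\varepsilon)\cap B(x_{2,j}, 2\varepsilon) = \emptyset$ and $\psi_i < \psi_{i+1}$ in $B(x_{ij}, 2\varepsilon)$ (where $\psi_3 = \psi_1$). Denote $\Omega_i = \Omega_R\cap\big(\bigcup_j B(x_{ij},\varepsilon)\big)^c$. Let us search $\varphi_i$ in the form $e^{\beta\psi_i}$ for $\beta$ large enough. For $\gamma\in\{\mu, 2\mu+\lambda\}$ and $g = (g_{ij})$, we set
$$a_\gamma(x,\xi)={}^t\xi g\xi-\frac{\tau^2}{\gamma}\tag{3.1}$$
and
$$a_\gamma(x,\xi,\tau)=a_\gamma(x,\xi+i\tau\varphi')={}^t\xi g\xi-\tau^2\,{}^t\varphi'g\varphi'+2i\tau\,{}^t\xi g\varphi'-\frac{\tau^2}{\gamma}.\tag{3.2}$$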
If $a_\gamma(x,\xi,\tau) = 0$, a simple computation gives
$${}^t\xi g\varphi'=0\quad\text{and}\quad{}^t\xi g\xi=\tau^2\,{}^t\varphi'g\varphi'+\frac{\tau^2}{\gamma}.\tag{3.3}$$
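The two identities in (3.3) are the imaginary and real parts of (3.2). A short sketch, assuming that $\xi$, $\varphi'$ and $g$ are real and that $\tau \ne 0$:

```latex
\mathrm{Im}\,a_\gamma(x,\xi,\tau) = 2\tau\,{}^t\xi g\varphi' = 0
  \;\Longrightarrow\; {}^t\xi g\varphi' = 0,
\qquad
\mathrm{Re}\,a_\gamma(x,\xi,\tau)
  = {}^t\xi g\xi - \tau^2\,{}^t\varphi' g\varphi' - \frac{\tau^2}{\gamma} = 0
  \;\Longrightarrow\;
  {}^t\xi g\xi = \tau^2\,{}^t\varphi' g\varphi' + \frac{\tau^2}{\gamma}.
```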
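Taking into account (3.3) we obtain
$$\{\mathrm{Re}\,a_\gamma,\mathrm{Im}\,a_\gamma\}(x,\xi,\tau)=4\tau e^{3\beta\psi}\big(\beta^4({}^t\psi'g\psi')^2+O(\beta^3)\big).\tag{3.4}$$
Hence, where $\psi'\ne0$ and $g\ge C\,\mathrm{Id}$, we have that
$$\det A_e(x,\xi,\tau)=0\quad\text{implies}\quad\{\mathrm{Re}\,a_\gamma,\mathrm{Im}\,a_\gamma\}(x,\xi,\tau)\ge C.\tag{3.5}$$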
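3.1.2. Construction of the phase outside a ball. Let us now construct the phase $\varphi$ in $B(0, R)^c\cap B(0, R_1)$, for fixed $R_1 > R$, such that $\varphi_1$, $\varphi_2$ are radial and coincide in a small neighborhood of each $x$ with $R-\alpha < |x| < R$. Let us extend $\varphi$ as a radial function. Consider polar coordinates $(r,\rho;\theta,\eta)$ in $\mathbb R^3$, where $\theta\in S^2$; then the symbols $a_\gamma$ can be written
$$a_\gamma(x,\xi)=\rho^2+\frac{\eta^2}{r^2}-\frac{\tau^2}{\gamma}\quad\text{and}\quad a_\gamma(x,\xi,\tau)=\rho^2-\tau^2(\varphi')^2+\frac{\eta^2}{r^2}-\frac{\tau^2}{\gamma}+2i\tau\varphi'\rho.\tag{3.6}$$
It is easy to see that if $a_\gamma(x,\xi,\tau) = 0$ then we have
$$\rho=0\quad\text{and}\quad\frac{\eta^2}{r^2}=\tau^2\Big((\varphi')^2+\frac1\gamma\Big).\tag{3.7}$$
Taking into account (3.7) we obtain that $a_\gamma(x,\xi,\tau) = 0$ implies
$$\{\mathrm{Re}\,a_\gamma,\mathrm{Im}\,a_\gamma\}=4\tau^3\varphi'\Big(\varphi'\varphi''+\frac{\frac1\gamma+(\varphi')^2}{r}\Big).\tag{3.8}$$
Let $y_\gamma(r)$ solve the following ordinary differential equation, which is linear in $y^2$:
$$\begin{cases}y\,y'+\dfrac{1}{\gamma r}=0,\\ y(R)=\varphi'(R).\end{cases}\tag{3.9}$$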
Then we obtain
$$y_\gamma(r)=\Big(\varphi'(R)^2-\frac{2}{\gamma}\log\frac{r}{R}\Big)^{1/2}.\tag{3.10}$$
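Formula (3.10) follows by integrating (3.9) once, and the same computation shows why this choice of $\varphi'$ keeps the bracket in (3.8) positive. The following is a sketch only, reading (3.8) as $4\tau^3\varphi'\big(\varphi'\varphi''+\frac{1}{\gamma r}+\frac{(\varphi')^2}{r}\big)$ and assuming $\tau > 0$ and $y_\gamma > 0$ on the interval considered:

```latex
% (3.9) is linear in y^2:
y\,y' = \tfrac12\,(y^2)' = -\frac{1}{\gamma r}
  \;\Longrightarrow\;
  y_\gamma(r)^2 = \varphi'(R)^2 - \frac{2}{\gamma}\int_R^r \frac{ds}{s}
               = \varphi'(R)^2 - \frac{2}{\gamma}\log\frac{r}{R}.
% With \varphi' = y_\gamma, the terms \varphi'\varphi'' + 1/(\gamma r) cancel in (3.8):
\{\mathrm{Re}\,a_\gamma, \mathrm{Im}\,a_\gamma\}
  = 4\tau^3\varphi'\Big(\varphi'\varphi'' + \frac{1}{\gamma r} + \frac{(\varphi')^2}{r}\Big)
  = \frac{4\tau^3(\varphi')^3}{r} > 0 .
```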
Let now R1 = Reϕ (R) /2 then the constant R depends only on the value of ϕi in r = R, for a small parameter η and κ such that 0 < κ < η2 we take in R − α < r < R ϕ (R) 2 (3.11) ϕ (r) = y2µ+λ (r) in R ≤ r < R1 e−κ /2 2 κ in R1 e−κ /2 ≤ r ≤ R2 2
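and by the argument in Burq [Bu] we can extend this to a smooth phase $\varphi$.

3.2. End of the proof of Theorem 1.2. Let $u(x, z)$ solve the following problem:
$$\begin{cases}(A_e+z^2)u=f&\text{in }\Omega,\\ Bu=0&\text{on }\partial\Omega,\\ u\ \text{outgoing},\end{cases}\tag{3.12}$$
where $f$ is supported in $\Omega_{R_1} = \Omega\cap B(0, R_1)$. Let $\chi_i$, $i = 1, 2$, be two cutoff functions equal to 1 in $\big(\bigcup_j B(x_{ij}, 2\varepsilon)\big)^c$ and supported in $\big(\bigcup_j B(x_{ij},\varepsilon)\big)^c$. We apply Theorem 1.3 to the function $e^{\tau\varphi_i}\chi_iu$ in the domain $\Omega_i = \Omega_{R_2}\cap\big(\bigcup_j B(x_{ij},\varepsilon)\big)^c$; we get, for $\tau^2 = \mathrm{Re}(z^2)$ and $g_i = [A_e,\chi_i]u + \chi_if - i\,\mathrm{Im}(z^2)\chi_iu$,
$$\int_{\Omega_i}e^{2\tau\varphi_i}|g_i|^2+\tau\int_{r=R_2}(\tau^2|u|^2+|\nabla u|^2)e^{2\tau\varphi(R_2)}\ge C\tau\int_{\Omega_i}(\tau^2|\chi_iu|^2+|\nabla\chi_iu|^2)e^{2\tau\varphi_i}.\tag{3.13}$$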
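Using the properties of the phases, $\varphi_i > \varphi_{i+1}$ in $\bigcup_j B(x_{i+1,j}, 2\varepsilon)$, we have for any $\tau$ large enough,
$$\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(|f|^2+|\mathrm{Im}(z^2)u|^2)+\tau\int_{r=R_2}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(\tau^2|u|^2+|\nabla u|^2)\,d\sigma\ge C\tau\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(\tau^2|u|^2+|\nabla u|^2).\tag{3.14}$$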
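Combining the previous estimates with Proposition 2.1, we obtain
$$\begin{aligned}&\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)|f|^2+\tau^2\int_{r=R_2}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)\,\mathrm{Im}(Bu\cdot\bar u)\,d\sigma+\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)|\mathrm{Im}(z^2)u|^2\\&\qquad\ge C\tau\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(\tau^2|u|^2+|\nabla u|^2)-\big(e^{\tau(2\varphi_1(R_2)-\varepsilon)}+e^{\tau(2\varphi_2(R_2)-\varepsilon)}\big)\int_{r=R_1}(\tau^2|u|^2+|\nabla u|^2)\,d\sigma.\end{aligned}\tag{3.15}$$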
By the trace formula we get (see Burq [Bu] (3.6)) 2τ ϕ 2 2 2 2τ ϕ2 1 ≥ (τ |u| + |∇u| ) e +e |u|2 e2τ ϕ1 (R1 ) + e2τ ϕ2 (R1 ) . R2
r=R1
(3.16)
Taking into account that e u = −z2 u in a neighborhood of S(0, R1 ) we can get 2τ ϕ e 1 + e2τ ϕ2 (τ 2 |u|2 + |∇u|2 ) R2
r=R1
τ −2 |∇u|2 e2τ ϕ1 (R1 ) + e2τ ϕ2 (R2 ) dσ.
(3.17)
(τ 2 |u|2 + |∇u|2 ) e2τ ϕ1 + e2τ ϕ2 .
(3.18)
Then we have (τ 2 |u|2 + |∇u|2 ) e2τ ϕ1 + e2τ ϕ2 R2
τ −2
r=R1
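Hence, by the fact that $\kappa < 2\varepsilon$ and by (3.15), (3.17), we obtain
$$\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(|f|^2+|\mathrm{Im}(z^2)|^2|u|^2)+\tau^2\int_{r=R_2}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)|\mathrm{Im}(Bu\cdot\bar u)|\ge C\tau\int_{\Omega_{R_2}}\big(e^{2\tau\varphi_1}+e^{2\tau\varphi_2}\big)(\tau^2|u|^2+|\nabla u|^2).\tag{3.19}$$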
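On the other hand we get
$$\int_{\Omega_{R_2}}f\cdot\bar u=\int_{\Omega_{R_2}}(A_e+z^2)u\cdot\bar u=\int_{r=R_2}(Bu\cdot\bar u)\,d\sigma+\int_{\Omega_{R_2}}z^2|u|^2-E(u,u).\tag{3.20}$$
Then we have
$$\mathrm{Im}\int_{r=R_2}(Bu\cdot\bar u)=\int_{\Omega_{R_2}}\mathrm{Im}(f\cdot\bar u)-\mathrm{Im}(z^2)|u|^2,\tag{3.21}$$
and combining (3.21) with (3.19) we obtain
$$e^{C\tau}\int_{\Omega_{R_2}}\big(|f|^2+\tau^2|\mathrm{Im}(z^2)|^2|u|^2+\tau^2|f\cdot u|\big)\,dx\ge C\tau\int_{\Omega_{R_2}}\big(\tau^2|u|^2+|\nabla u|^2\big).\tag{3.22}$$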
If moreover $e^{C\tau}|\mathrm{Im}(z^2)|^2\le C_1$, which is equivalent to $\mathrm{Im}(z) < C_1e^{-C|\mathrm{Re}\,z|/2}$, then the term $|\mathrm{Im}(z^2)|^2|u|^2$ can be easily incorporated in the R.H.S. of (3.22) for large $\tau$. Finally, if $\mathrm{Im}\,z < e^{-C|\mathrm{Re}\,z|}$, we get the following estimate:
$$e^{C'\tau}\int_{\Omega_{R_2}}|f|^2\ge C\int_{\Omega_{R_2}}(\tau^2|u|^2+|\nabla u|^2).\tag{3.23}$$
This completes the proof of Theorem 1.2.
4. Proof of Carleman Estimates

This section is devoted to proving estimates of Carleman type near the boundary for solutions of boundary value problems of the form
$$\begin{cases}A(x,D)u=f&\text{in }\Omega_0,\\ B(x,D)u=g&\text{on }\partial\Omega_0,\end{cases}\tag{4.1}$$
where $A(x, D)$ is a partial differential operator with principal symbol given by
$$A(x,\xi)=\mu\,{}^t\xi\xi+(\mu+\lambda)\xi\,{}^t\xi-\tau^2\,\mathrm{Id}\tag{4.2}$$
and $B(x, D)u$ is defined by (1.3). Here we remark that the phenomenon of Rayleigh waves is connected to the failure of the Lopatinskii condition, and our analysis is completely different from the scalar case treated by Burq [Bu]. D. Tataru [T] was the first to consider Carleman estimates and the uniform Lopatinskii condition for scalar operators; here we shall use the method developed in [T] for the construction of the symmetrizer.
4.1. Reduction of the problems.

4.1.1. Reduction of the Laplacian. Let $\Omega_0$ be a bounded smooth domain of $\mathbb R^n$ with boundary $\partial\Omega_0$ of class $C^\infty$. In a neighborhood of a given $x_0\in\partial\Omega_0$ we denote by $x = (x', x_n)$ the system of normal geodesic coordinates, where $x'\in\partial\Omega_0$ and $x_n\in\mathbb R$ are characterized by
$$|x_n|=\mathrm{dist}(x,\partial\Omega_0);\qquad\Omega_0=\{x_n>0\};\qquad\mathrm{dist}(x',x)=\mathrm{dist}(x,\partial\Omega_0).$$
In this system of coordinates the principal symbol of the Laplace operator takes the form
$$a(x,\xi)={}^tG(x,\xi)G(x,\xi)=\xi_n^2+r(x,\xi'),\tag{4.3}$$
where $r(x,\xi')$ is a quadratic form such that there exists $C > 0$ with
$$r(x,\xi')\ge C|\xi'|^2\quad\text{for any }x\in K,\ \xi'\in T^*(\partial\Omega_0).\tag{4.4}$$
We set
$$G(x,\xi)=G_0(x)\xi_n+G_1(x,\xi'),\tag{4.5}$$
then we have
$${}^tG_1G_1=r(x,\xi'),\qquad{}^tG_0G_1=0.\tag{4.6}$$
Denote $K = \{x\in\mathbb R^n_+;\ |x|\le r_0\}$. Let $\varphi(x)$ be a $C^\infty(\mathbb R^n)$ function with values in $\mathbb R$, defined in a neighborhood of $K$. We define the operator
$$a(x,D,\tau)=e^{\tau\varphi}a(x,D)e^{-\tau\varphi}:=\mathrm{op}(a).\tag{4.7}$$
Denote by
$$a(x,\xi,\tau)=a(x,\xi+i\tau\varphi')\tag{4.8}$$
the principal symbol of the operator, and we set
$$\mathrm{op}(\tilde q_2)=\tfrac12\big(\mathrm{op}(a)+\mathrm{op}(a)^*\big);\qquad\mathrm{op}(\tilde q_1)=\tfrac1{2i}\big(\mathrm{op}(a)-\mathrm{op}(a)^*\big),\tag{4.9}$$
its real and imaginary parts. Then we have
$$\mathrm{op}(a)=\mathrm{op}(\tilde q_2)+i\,\mathrm{op}(\tilde q_1),\qquad\tilde q_2=\xi_n^2+q_2(x,\xi',\tau);\qquad\tilde q_1=2\tau\xi_n\varphi'_{x_n}+2q_1(x,\xi',\tau),\tag{4.10}$$
where $q_1$ and $q_2$ are tangential symbols of order 1 and 2, respectively, given by
$$q_2(x,\xi',\tau)=r(x,\xi')-\tau^2(\varphi'_{x_n})^2-\tau^2r(x,\varphi'_{x'}),\qquad q_1(x,\xi',\tau)=\tilde r(x,\xi',\tau\varphi'_{x'}),\tag{4.11}$$
and $\tilde r(x,\xi',\eta')$ is the bilinear form attached to the quadratic form $r(x,\xi')$.

4.1.2. Reduction of the elasticity system. In the system of normal geodesic coordinates the principal symbol of the elasticity operator can be written as
$$A_e(x,\xi)=\mu\,{}^tG(x,\xi)G(x,\xi)\,\mathrm{Id}+(\lambda+\mu)\,G(x,\xi)\,{}^tG(x,\xi),\tag{4.12}$$
where $G(x,\xi)$ is defined by (4.5) and $G(x,\xi)\,{}^tG(x,\xi)$ is the orthogonal projection onto the space spanned by $G(x,\xi)$. We set
$$A(x,D,\tau)=e^{\tau\varphi}A_e(x,D)e^{-\tau\varphi}-\tau^2\,\mathrm{Id}.\tag{4.13}$$
The principal symbol of $A(x, D,\tau)$ is given by
$$A(x,\xi,\tau)=\sum_{j=0}^2A_{2-j}(x,\xi',\tau)\,\xi_n^j,\tag{4.14}$$
where $A_j(x,\xi',\tau)$ are tangential symbols in $S^j(\mathbb R^n\times\mathbb R^{n-1})$ defined by
$$\begin{aligned}A_0&=\mu\,\mathrm{Id}+(\mu+\lambda)\,G_0\,{}^tG_0,\\ A_1&=2i\tau\varphi'_{x_n}A_0+(\mu+\lambda)\big(G_0\,{}^tG_1+G_1\,{}^tG_0\big),\\ A_2&=\mu({}^tG_1G_1)\,\mathrm{Id}+i\tau\varphi'_{x_n}A_1(x,\xi')-(\tau\varphi'_{x_n})^2A_0+(\mu+\lambda)\,G_1\,{}^tG_1-\tau^2\,\mathrm{Id}.\end{aligned}\tag{4.15}$$
For fixed $(x,\xi')\in T^*(\partial\Omega_0)$ let $\alpha(x,\xi',\tau)\in\mathbb C$ be such that
$$a(x,\xi,\tau)=(\xi_n+i\tau\varphi'_{x_n}+i\alpha)(\xi_n+i\tau\varphi'_{x_n}-i\alpha);\tag{4.16}$$
then we have also by (4.6)
$$r(x,\xi'+i\tau\varphi'_{x'})=-(i\alpha)^2={}^tG_1(x,\xi'+i\tau\varphi'_{x'})\,G_1(x,\xi'+i\tau\varphi'_{x'}).\tag{4.17}$$
Here we assume that $\alpha(x,\xi',\tau)\ne0$. Let $(\omega_1,\dots,\omega_{n-2})$ be a basis of $\{G_0, (i\alpha)^{-1}G_1\}^\perp$. We can define the smooth matrix $H = (\omega_1,\dots,\omega_{n-2}, G_0, (i\alpha)^{-1}G_1)$, homogeneous of degree zero in $(\xi',\tau)$, such that,
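on a conic neighborhood of $(x,\xi',\tau)$ we get
$$\hat A=H^{-1}AH=\begin{pmatrix}(\mu a-\tau^2)\,\mathrm{Id}_{n-2}&0&0\\ 0&(2\mu+\lambda)a-\tau^2+(\mu+\lambda)(i\alpha)^2&-(\mu+\lambda)(i\alpha)(\xi_n+i\tau\varphi'_{x_n})\\ 0&(\mu+\lambda)(i\alpha)(\xi_n+i\tau\varphi'_{x_n})&\mu a-\tau^2-(\mu+\lambda)(i\alpha)^2\end{pmatrix}.\tag{4.18}$$
We set now
$$G(x,\xi,\tau)=(\xi_n+i\tau\varphi'_{x_n})G_0+G_1(x,\xi',\tau),\qquad\tilde G(x,\xi,\tau)=(\xi_n+i\tau\varphi'_{x_n})(i\alpha)^{-1}G_1+(i\alpha)G_0;\tag{4.19}$$
then we have ${}^tG\,\tilde G = 0$ and we can decouple the system by $P = (\omega_1,\dots,\omega_{n-2},\tilde G, G)$. We obtain
$$\tilde A=P^{-1}AP=\begin{pmatrix}(\mu a-\tau^2)\,\mathrm{Id}_{n-1}&0\\ 0&(2\mu+\lambda)a-\tau^2\end{pmatrix},\tag{4.20}$$
therefore
$$\det(A(x,\xi,\tau))=\mu^{n-1}(2\mu+\lambda)\,(a_\mu(x,\xi,\tau))^{n-1}\,a_{2\mu+\lambda}(x,\xi,\tau),$$
where
$$a_\gamma(x,\xi,\tau)=a(x,\xi,\tau)-\frac{\tau^2}{\gamma}\tag{4.21}$$
and $\gamma\in\{\mu, 2\mu+\lambda\}$.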
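The block form (4.20) and the determinant formula can be checked directly on the columns of $P$. The sketch below writes $p = \xi_n + i\tau\varphi'_{x_n}$ and $\tilde G$ for the second vector in (4.19) (both names are shorthand introduced here), and assumes ${}^tG_0G_0 = 1$ and ${}^tG_0\omega_j = {}^tG_1\omega_j = 0$ in addition to (4.6) and (4.17):

```latex
% A = (\mu a - \tau^2)\,\mathrm{Id} + (\mu+\lambda)\,G\,{}^tG,
%   with  G = pG_0 + G_1,  \tilde G = p(i\alpha)^{-1}G_1 + (i\alpha)G_0.
{}^tG\,G = p^2 + {}^tG_1G_1 = p^2 - (i\alpha)^2 = a, \qquad
{}^tG\,\tilde G = p(i\alpha) + p(i\alpha)^{-1}\,{}^tG_1G_1 = p(i\alpha) - p(i\alpha) = 0, \qquad
{}^tG\,\omega_j = 0 .
% Hence each column of P is an eigenvector of A:
AG = \big((2\mu+\lambda)a - \tau^2\big)G, \qquad
A\tilde G = (\mu a - \tau^2)\tilde G, \qquad
A\omega_j = (\mu a - \tau^2)\omega_j ,
% and the determinant factorizes as stated:
\det A = (\mu a - \tau^2)^{n-1}\big((2\mu+\lambda)a - \tau^2\big)
       = \mu^{n-1}(2\mu+\lambda)\,a_\mu^{\,n-1}\,a_{2\mu+\lambda}.
```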
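4.1.3. Study of the eigenvalues. The proof of the Carleman estimate relies on a cutoff argument based on the nature of the roots with respect to $\xi_n$ of $a_\gamma(x,\xi',\xi_n,\tau)$. Let us now introduce the following micro-local regions:

(i) $E^+=\Big\{(x,\xi',\tau)\in K\times S^{n-1};\ q_2-\dfrac{\tau^2}{\mu}+\dfrac{q_1^2}{(\tau\varphi'_{x_n})^2}>0\Big\}$,
(ii) $Z_\gamma=\Big\{(x,\xi',\tau)\in K\times S^{n-1};\ q_2-\dfrac{\tau^2}{\gamma}+\dfrac{q_1^2}{(\tau\varphi'_{x_n})^2}=0\Big\}$,
(iii) $E^-=\Big\{(x,\xi',\tau)\in K\times S^{n-1};\ q_2-\dfrac{\tau^2}{2\mu+\lambda}+\dfrac{q_1^2}{(\tau\varphi'_{x_n})^2}<0\Big\}$,
(iv) $M=\Big\{(x,\xi',\tau)\in K\times S^{n-1};\ q_2-\dfrac{\tau^2}{\mu}+\dfrac{q_1^2}{(\tau\varphi'_{x_n})^2}<0<q_2-\dfrac{\tau^2}{2\mu+\lambda}+\dfrac{q_1^2}{(\tau\varphi'_{x_n})^2}\Big\}$.

And for fixed $(x,\xi',\tau)$ let $\alpha_\gamma(x,\xi',\tau)\in\mathbb C$ be such that
$$a_\gamma(x,\xi,\tau)=a(x,\xi,\tau)-\frac{\tau^2}{\gamma}=(\xi_n+i\tau\varphi'_{x_n}+i\alpha_\gamma)(\xi_n+i\tau\varphi'_{x_n}-i\alpha_\gamma).\tag{4.22}$$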
Taking into account (4.16) and (4.17), we have
$$(i\alpha_\gamma)^2=(i\alpha)^2+\frac{\tau^2}{\gamma}.\tag{4.23}$$
Then we get by (4.12) and (4.17)
$$\alpha_\gamma^2=(\tau\varphi'_{x_n})^2+q_2-\frac{\tau^2}{\gamma}+2iq_1.\tag{4.24}$$
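Both (4.23) and (4.24) come from comparing the two factorizations (4.16) and (4.22). The following is a sketch, writing $p = \xi_n + i\tau\varphi'_{x_n}$ (shorthand introduced here) and taking $q_1$ as in (4.11), so that it already contains the factor $\tau$. The last line recovers the defining equation of $Z_\gamma$ as the condition for a real root, assuming $\tau\varphi'_{x_n}\ne0$:

```latex
a = p^2 - (i\alpha)^2, \qquad
a_\gamma = a - \frac{\tau^2}{\gamma} = p^2 - (i\alpha_\gamma)^2
  \;\Longrightarrow\; (i\alpha_\gamma)^2 = (i\alpha)^2 + \frac{\tau^2}{\gamma}.
% By (4.17) and (4.11):
\alpha^2 = r(x,\xi' + i\tau\varphi'_{x'}) = q_2 + (\tau\varphi'_{x_n})^2 + 2iq_1
  \;\Longrightarrow\;
  \alpha_\gamma^2 = (\tau\varphi'_{x_n})^2 + q_2 - \frac{\tau^2}{\gamma} + 2iq_1 .
% A root \xi_n = -i\tau\varphi'_{x_n} \pm i\alpha_\gamma is real iff
% \alpha_\gamma = \pm(\tau\varphi'_{x_n} + ib) with b real; then q_1 = \tau\varphi'_{x_n} b and
\mathrm{Re}\,\alpha_\gamma^2 = (\tau\varphi'_{x_n})^2 - \frac{q_1^2}{(\tau\varphi'_{x_n})^2}
  \;\Longleftrightarrow\;
  q_2 - \frac{\tau^2}{\gamma} + \frac{q_1^2}{(\tau\varphi'_{x_n})^2} = 0 .
```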
For fixed $(x,\xi',\tau)$, decompose $a_\gamma(x,\xi,\tau)$ as a polynomial in $\xi_n$. Then we have the following lemma.

Lemma 4.1. We have the following:
1. For any $(x,\xi',\tau)\in E^+$ the roots of $a_\mu$ and $a_{2\mu+\lambda}$, denoted by $z_1^\pm$, $z_2^\pm$, satisfy $\pm\mathrm{Im}\,z_k^\pm > 0$.
2. For any $(x,\xi',\tau)\in Z_\gamma$, one of the roots of $a_\gamma$ is real.
3. For any $(x,\xi',\tau)\in E^-$ the roots of $a_\gamma$ are in the upper half-plane if $\varphi'_{x_n} < 0$ (resp. in the lower half-plane if $\varphi'_{x_n} > 0$).
4. For any $(x,\xi',\tau)\in M$ the roots of $a_{2\mu+\lambda}$ satisfy $\pm\mathrm{Im}\,z_2^\pm > 0$ and the roots of $a_\mu$ satisfy 3).

4.2. Carleman estimates in the region $Z_\mu$. Define the following weighted norms in $H^1(\Omega_0)$, respectively in $H^1(\partial\Omega_0)$:
$$\|u\|^2_{1,\tau}=\sum_{j=0}^1\tau^{2(1-j)}\|u\|^2_{H^j(\Omega_0)},\qquad|u|^2_{1,\tau}=\sum_{j=0}^1\tau^{2(1-j)}|u|^2_{H^j(\partial\Omega_0)},\tag{4.25}$$
and define the norm
$$|u|^2_{1,0,\tau}=|u|^2_{1,\tau}+\Big|\frac{\partial u}{\partial n}\Big|^2_{L^2}.\tag{4.26}$$
The purpose of this section is to get the Carleman estimates in the region $Z_\mu$. Precisely, we take a cutoff function $\chi_0(x,\xi',\tau)$, homogeneous of degree zero, in the region $Z_\mu$. Our purpose here is to prove the following proposition:

Proposition 4.1. There exists $C > 0$ such that for any large enough $\tau$ we have
$$\|A(x,D,\tau)u\|^2+\tau|u|^2_{1,0,\tau}+\|u\|^2_{1,\tau}\ge C\tau\|\mathrm{op}(\chi_0)u\|^2_{1,\tau}\tag{4.27}$$
whenever $u\in C_0^\infty(K)$. Furthermore, if we assume that $\varphi'_{x_n} > C_0$ ($C_0 > 0$ large enough) on $\{x_n = 0\}\cap\mathrm{supp}\,\chi_0$, then we have
$$\|A(x,D,\tau)u\|^2+\|u\|^2_{1,\tau}+|u|^2_{1,\tau}+\tau|B(x,D,\tau)u|^2_{1-\mathrm{ord}B,\tau}\ge C\big(\tau\|\mathrm{op}(\chi_0)u\|^2_{1,\tau}+\tau|\mathrm{op}(\chi_0)u|^2_{1,\tau}\big)\tag{4.28}$$
whenever $u\in C_0^\infty(K)$.
4.2.1. Scalar estimates. In this section we give a scalar Carleman estimate for the operator with principal symbol $a_\gamma(x,\xi,\tau)$, where $\gamma\in\{\mu, 2\mu+\lambda\}$. These estimates have been proved essentially in Lebeau–Robbiano [LR1] and [LR2]. Furthermore we assume that $\varphi$ satisfies the following assumption:
$$(H_\varphi):\quad\begin{cases}\varphi'_{x_n}\ne0\ \text{for any }x\in K,\\ \{\mathrm{Re}\,a_\gamma,\mathrm{Im}\,a_\gamma\}>0\ \text{whenever }a_\gamma(x,\xi,\tau)=0,\ (x,\xi,\tau)\in K\times\mathbb R^{n+1}.\end{cases}$$

Lemma 4.2. There exists $C > 0$ such that for any large enough $\tau$ we have
$$\|\mathrm{op}(a_\mu)v\|^2+\tau|v|^2_{1,0,\tau}+\|v\|^2_{1,\tau}\ge C\tau\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}\tag{4.29}$$
whenever $v\in C_0^\infty(K)$. If we assume that $\varphi'_{x_n} > 0$ and $D_nv-\mathrm{op}(k_1)v = g_1$ on $\{x_n = 0\}$, where $k_1(x,\xi',\tau)$ is a tangential symbol of order 1, then for large enough $\tau$ we have
$$\|\mathrm{op}(a_\mu)v\|^2+\|v\|^2_{1,\tau}+\tau|g_1|^2\ge C\big(\tau\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}+\tau|\mathrm{op}(\chi_0)v|^2_{1,\tau}\big)\tag{4.30}$$
whenever $v\in C_0^\infty(K)$.

Lemma 4.3. There exists $C > 0$ such that for any $\tau$ large enough we have
$$\|\mathrm{op}(a_{2\mu+\lambda})v\|^2+\tau|v|^2_{1,0,\tau}+\|v\|^2_{1,\tau}\ge C\tau^2\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}\tag{4.31}$$
whenever $v\in C_0^\infty(K)$. If we assume that $D_nv-\mathrm{op}(k_2)v = g_2$ on $\{x_n = 0\}$ with $k_2(x,\xi',\tau) = z_2^+(x,\xi',\tau)$ for any $(x,\xi',\tau)\in\mathrm{Supp}\,\chi_0$, we have
$$\|\mathrm{op}(a_{2\mu+\lambda})v\|^2+\|v\|^2_{1,\tau}+\tau|g_2|^2\ge C\big(\tau^2\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}+\tau|\mathrm{op}(\chi_0)v|^2_{1,\tau}\big)\tag{4.32}$$
whenever $v\in C_0^\infty(K)$.

Remark 4. We have a similar lemma if we assume the Dirichlet condition on the boundary, $v = g_1$ on $x_n = 0$.

4.2.2. Estimation for $\tilde A$. By applying Lemma 4.3 and Lemma 4.2 we get the following estimate of $\tilde A$:

Lemma 4.4. There exists $C > 0$ such that for any large enough $\tau$ we have
$$\|\mathrm{op}(\tilde A)v\|^2+\tau|v|^2_{1,0,\tau}+\|v\|^2_{1,\tau}\ge C\tau\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}\tag{4.33}$$
whenever $v\in C_0^\infty(K)$. Furthermore, if we assume that $\varphi'_{x_n} > 0$ and $D_nv-\mathrm{op}(k)v = g$ on $\{x_n = 0\}$, where $k_n(x,\xi',\tau) = z_2^+(x,\xi',\tau)$ for any $(x,\xi',\tau)\in\mathrm{Supp}\,\chi_0$, we have
$$\|\mathrm{op}(\tilde A)v\|^2+\|v\|^2_{1,\tau}+\tau|g|^2\ge C\big(\tau\|\mathrm{op}(\chi_0)v\|^2_{1,\tau}+\tau|\mathrm{op}(\chi_0)v|^2_{1,\tau}\big)\tag{4.34}$$
whenever $v\in C_0^\infty(K)$.
In the following, we build an approximate solution of $Pv = u$, with an appropriate boundary condition. Let us now introduce some notation. Let $\mathrm{op}(L)$ be a differential operator with symbol $L(x,\xi,\tau)$ and assume that $\mathrm{op}(L)$ can be written in local coordinates as
$$\mathrm{op}(L)=\sum_{j=0}^m\mathrm{op}(L_j(x,\xi',\tau))\,D_n^j,\tag{4.35}$$
where $L_j(x,\xi',\tau)$ are tangential symbols. Let $u^0$ denote the function which coincides with $u$ in $x_n\ge0$ and is zero if $x_n < 0$. Let $\gamma_j(u)$ be the trace of $D_n^ju$ on $x_n = 0$. Then $\mathrm{op}(L)u^0-(\mathrm{op}(L)u)^0$ is supported on $\partial\Omega$ and we set
$$\mathrm{op}(L_c)u:=\mathrm{op}(L)u^0-(\mathrm{op}(L)u)^0=\frac1i\sum_{l+k+1\le m}L_{l+k+1}(x,D',\tau)\,\gamma_l(u)\otimes D_n^k\delta,\tag{4.36}$$
where $\delta$ is the Dirac measure and $\mathrm{op}(L_c)$ is called the Calderón operator (see Hörmander [H]). If we assume that $\mathrm{op}(L)$ is an elliptic operator, let $\mathrm{op}(Q)$ be a parametrix of $\mathrm{op}(L)$. Then we have by (4.36)
$$u^0=\mathrm{op}(Q)[(\mathrm{op}(L)u)^0]+\mathrm{op}(Q)(\mathrm{op}(L_c)u)+\mathrm{op}(R)u^0\tag{4.37}$$
for a regular operator $R$. Now for $g\in C^\infty(\bar\Omega)$ we write $\mathrm{op}(Q)g = (\mathrm{op}(Q)g^0)|_{x_n>0}$. Then we have by (4.37),
$$u=\mathrm{op}(Q)(\mathrm{op}(L)u)+\sum_{l=0}^{m-1}\mathrm{op}(T_l)\gamma_l(u)+\mathrm{op}(R)u,\tag{4.38}$$
where $\mathrm{op}(T_l)$ is a tangential operator with principal symbol
$$T_l(x,\xi',\tau)=\frac{1}{2i\pi}\int_{\mathcal L^+}e^{ix_n\xi_n}\Big(L^{-1}(x,\xi,\tau)\sum_{l+k+1\le m}L_{l+k+1}(x,\xi',\tau)\,\xi_n^k\Big)d\xi_n,\tag{4.39}$$
where $\mathcal L^+$ is a contour in the half-plane $\mathrm{Im}\,\xi_n > 0$ surrounding the roots of $\det(L(x,\xi,\tau))$.

Now formally, let $Pv = u$, where $P = P_0D_n + P_1$. Then we have $Pv^0 = u^0 + P_0\gamma_0(v)\otimes\delta$. Applying the parametrix $Q$ of $P$ to the previous equation we get $v = Qu + \mathrm{op}(T)\gamma_0(v)$, where $T$ is a tangential symbol given by $T(x,\xi',\tau) = \frac{1}{2i\pi}\big(\int_{\mathcal L^+}e^{ix_n\xi_n}P^{-1}(x,\xi,\tau)\,d\xi_n\big)P_0$. More precisely, let $\{\chi_j,\ j = 1,\dots,N\}$ be a family of cutoff functions homogeneous of degree zero in the region $Z_\mu$, let $P_j$ be a family of operators satisfying (4.20), and let $\tilde\chi_j$ be equal to 1 in $\mathrm{Supp}\,\chi_j$ and supported in the domain of $P_j$. Finally let $\hat\chi_j$ be equal to 1 in $\mathrm{Supp}\,\tilde\chi_j$ and supported in the domain of $P_j$.

Lemma 4.5. Let $v = (\mathrm{op}(Q_j\chi_j)u^0)|_{x_n>0}$; then $v$ satisfies
$$\begin{cases}\mathrm{op}(P_j\tilde\chi_j)v=\mathrm{op}(\chi_j)u+\mathrm{op}(r)u&\text{in }x_n>0,\\ \mathrm{op}(\chi_j)v_{n-1}=\mathrm{op}(\chi_j)v_n+\mathrm{op}(r')u&\text{on }x_n=0,\end{cases}\tag{4.40}$$
where $\mathrm{op}(r)$, $\mathrm{op}(r')$ are negligible operators of order $-\infty$.
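Proof. We have
$$\mathrm{op}(P_j\tilde\chi_j)v^0=\mathrm{op}(\chi_j)u^0+P_{j,0}\,\mathrm{op}(\tilde\chi_j)\gamma_0(v)\otimes\delta+\mathrm{op}(r)u^0.\tag{4.41}$$
By applying $\mathrm{op}(Q_j\hat\chi_j)$ to (4.41), we get
$$\mathrm{op}(\tilde\chi_j)v=\mathrm{op}(Q_j\chi_j)u+\mathrm{op}(T_j)\gamma_0(v)+\mathrm{op}(r)u,\tag{4.42}$$
where the principal symbol of $\mathrm{op}(T_j)$ is given by
$$T_j(x,\xi',\tau)=\frac{1}{2i\pi}\Big(\int_{\mathcal L^+}e^{ix_n\xi_n}P_j^{-1}(x,\xi,\tau)\,\tilde\chi_j\,d\xi_n\Big)P_{j,0},\tag{4.43}$$
where once more $\mathcal L^+$ is a contour in the half-plane $\mathrm{Im}\,\xi_n > 0$ that contains in its interior the poles of $P_j^{-1}$. On one hand the symbol $P_j$ is given by
$$P_j(x,\xi,\tau)=H_j(x,\xi',\tau)\,\hat P_j(x,\xi,\tau),\tag{4.44}$$
where
$$\hat P_j(x,\xi,\tau)=\begin{pmatrix}\mathrm{Id}\,\langle\xi',\tau\rangle&0&0\\ 0&i\alpha(x,\xi',\tau)&\xi_n+i\tau\varphi'_{x_n}\\ 0&\xi_n+i\tau\varphi'_{x_n}&i\alpha(x,\xi',\tau)\end{pmatrix}\tag{4.45}$$
and $H_j$ is defined by (4.18). This implies that the principal symbol $T_j$ can be given by
$$T_j(x,\xi',\tau)=\frac{1}{2i\pi}\Big(\int_{\mathcal L^+}e^{ix_n\xi_n}\hat P_j^{-1}(x,\xi,\tau)\,\tilde\chi_j\,d\xi_n\Big)\hat P_{j,0}(x,\xi',\tau).\tag{4.46}$$
Furthermore, the poles of $\hat P_j^{-1}$ are $r_\pm = -i\tau\varphi'_{x_n}\pm i\alpha$. Hence,
$$\frac{1}{2i\pi}\int_{\mathcal L^+}e^{ix_n\xi_n}\hat P_j^{-1}(x,\xi,\tau)\,\tilde\chi_j\,d\xi_n=\frac12\,\tilde\chi_j\,e^{-x_nr_+}\begin{pmatrix}0&0&0\\ 0&1&-1\\ 0&-1&1\end{pmatrix}.\tag{4.47}$$
On the other hand by (4.42) we can get
$$\mathrm{op}(T_j)\gamma_0(v)=\mathrm{op}(r)u,\tag{4.48}$$
where $\mathrm{op}(r)$ is a negligible operator. Combining (4.48) with (4.47), we obtain $\mathrm{op}(\tilde\chi_j)v_{n-1} = \mathrm{op}(\tilde\chi_j)v_n + \mathrm{op}(r)u$. This concludes the proof of Lemma 4.5.

Lemma 4.6. There exists $C > 0$ such that, with the notation of Lemma 4.5, the following inequality holds:
$$\|\mathrm{op}(\tilde A\tilde\chi_j)v\|^2+\tau^{-1}|u|^2_{1,0,\tau}+\|u\|^2\ge C\tau\|\mathrm{op}(Q_j\chi_j)u\|^2_{1,\tau}\tag{4.49}$$
for each large enough $\tau$.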
Proof. According to Lemma 4.5, we have
$$\begin{cases}\mathrm{op}(P_j\tilde\chi_j)v=\mathrm{op}(\chi_j)u+\mathrm{op}(r)u&\text{in }x_n>0,\\ \mathrm{op}(\chi_j)v_{n-1}=\mathrm{op}(\chi_j)v_n+\mathrm{op}(r')u&\text{on }x_n=0.\end{cases}\tag{4.50}$$
Then, according to the following symbolic calculus:
$$\mathrm{op}(H_j^{-1}\hat\chi_j)\,\mathrm{op}(P_j\tilde\chi_j)=\mathrm{op}(H_j^{-1}\hat\chi_j)\,\mathrm{op}(H_j\hat P_j\tilde\chi_j)=\mathrm{op}(\hat P_j\tilde\chi_j)+\mathrm{op}(r_0),\tag{4.51}$$
where $\mathrm{op}(r_0)$ is a tangential operator of order zero (because, in $\hat P_j$, the coefficient of $\xi_n$ is one), we have
$$\mathrm{op}(\hat P_j\tilde\chi_j)v=[\mathrm{op}(H_j^{-1}\chi_j)u+\mathrm{op}(r_{-1})u]+[\mathrm{op}(r_0)v+\mathrm{op}(r)v]\quad\text{in }x_n>0,\tag{4.52}$$
where once again $\mathrm{op}(r_{-1})$ is a tangential operator of order $-1$. For the first $(n-2)$ terms of the system (4.52) we apply Lemma 4.2, where $v = \mathrm{op}(\tilde\chi_j)v_l$ with $l\in\{1,\dots,n-2\}$. Denote $\tilde v = (v_{n-1}, v_n)$. We have by (4.50),
$$D_n\tilde v-\mathrm{op}(k)\tilde v=\tilde u+z_0,\tag{4.53}$$
where $z_0 = \mathrm{op}(r_0)v + \mathrm{op}(r)v$, $\tilde u = \mathrm{op}(H_j^{-1}\chi_j)u$, and $\mathrm{op}(k)$ is given by $k(x,\xi',\tau) = \mathrm{diag}(r_-(x,\xi',\tau), r_-(x,\xi',\tau))$. According to Lemma 4.2 we obtain
$$\|\mathrm{op}(a_\mu)v_{n-1}\|^2+\tau|v_{n-1}|^2_{1,0,\tau}+\|u\|^2\ge C\tau\|\mathrm{op}(\tilde\chi_j)v_{n-1}\|^2_{1,\tau}\tag{4.54}$$
and by Lemma 4.3 we obtain
$$\|\mathrm{op}(a_{2\mu+\lambda})v_n\|^2+\tau|u|^2_{0,\tau}+\|u\|^2\ge C\tau^2\|\mathrm{op}(\tilde\chi_j)v_n\|^2_{1,\tau}+\tau|\mathrm{op}(\tilde\chi_j)v_n|^2_{1,0,\tau}.\tag{4.55}$$
Using now the fact that $\mathrm{op}(\tilde\chi_j)v_{n-1} = \mathrm{op}(\tilde\chi_j)v_n + \mathrm{op}(r)u$ on $x_n = 0$, we get by (4.55) the following estimate:
$$\tau|v_{n-1}|^2_{1,0,\tau}\le C\big(\|\mathrm{op}(a_{2\mu+\lambda})v_n\|^2+\tau|u|^2_{0,\tau}+\|u\|^2\big).\tag{4.56}$$
Combining (4.56), (4.55) and (4.54) we obtain (4.49).

Now we prove a simpler estimate which completely neglects the boundary condition.

Lemma 4.7. There exists $C > 0$ such that for any large enough $\tau$ we have
$$C\tau\|\mathrm{op}(\chi_0)u\|^2_{1,\tau}\le\|\mathrm{op}(A)u\|^2+\tau|u|^2_{1,0,\tau}+\|u\|^2_{1,\tau}\tag{4.57}$$
whenever $u\in C_0^\infty(K)$.
Proof. According to Lemma 4.6 and (4.40) it is enough to prove that χj )v||2 ≤ || op(A)u||2−1 + τ −1 |u|21,0,τ + ||u||2 C|| op(Aχ
(4.58)
for τ large enough. Writing op(A) op(Pj χ j ) =
3 k=0
op(Ck (x, ξ , τ ))Dnk ,
(4.59)
where Ck (x, ξ , τ ) is a tangential pseudo-differential of order (3−k), we obtain by(4.35) op(A) op(Pj χ j )v 0 = f 0 +
1 op(Cl+k+1 )γl (v) ⊗ Dnk δ i
(4.60)
l+k+1
1 1 = f 0 + w0 ⊗ δ − w1 ⊗ δ − w2 ⊗ δ , i i where
w0 = op(C1 )γ0 (v) + op(C2 )γ1 (v) + op(C3 )γ0 (v) w1 = op(C2 )γ0 (v) + op(C3 )γ1 (v) . w = op(C )γ (v) 2 3 0
(4.61)
Now applying op(Pj−1 χj ) to (4.60), we obtain χj )v = op(P −1 χj )f + op(Aχ j
2
op(Tk )wk + op(r0 )u,
(4.62)
k=0
where r0 ∈ S 0 and op(Tk ) is a tangential pseudo-differential operator with principal symbol 1 Tk (x, ξ , τ ) = eixn ξn Pj−1 (x, ξ, τ ) χj ξnk dξn )Pj,0 . (4.63) ( 2iπ L + A simple computation shows that χj )v = op(P −1 χj )f + op(Aχ j where Sk is defined by
2
op(Sk )γk (v) + op(r0 )u,
(4.64)
k=0
S0 = T0 C1 + T1 C2 + T2 C3 S1 = T0 C2 + T1 C3 . S = T C 2 0 3
(4.65)
From (4.53) it is easy to see that op( χj )Dn v = op( χj K1 )v + op(r0 )u + op(r0 )v,
(4.66)
where op(r0 ), op(r0 ) are two tangential operator with symbols of order zero, and the principal symbol of op( χj K1 ) is given by
−iτ ϕx n −iα . (4.67) K1 (x, ξ , τ ) = −iα −iτ ϕx n Combined with (4.52), this yields χj r− )v + op(r0 )u, op( χj )Dn v = op(
(4.68)
and by the first line of (4.50) we get 2 χj r− )v + op(r1 )u + op(r0 )Dn u op( χj )Dn2 v = op(
on xn = 0
(4.69)
C |Dn u|20 . τ
(4.70)
op(Sk )γk (v) = op(Q)γ0 (v) + [op(r1 )u + op(r0 )Dn u],
(4.71)
with the following estimates: || op(r1 )u||2 ≤
C 2 |u| τ 1,τ
and
|| op(r0 )u||2 ≤
Then we have 2 k=0
where
2S Q = S0 + r− S1 + r− 2
|| op(r1 )u||2 ≤
C 2 τ |u|1,τ ,
|| op(r0 )u||2 ≤
C 2 τ |Dn u|0 .
(4.72)
But
$$Q=T_0Q_0=\hat T_0\hat Q_0,\tag{4.73}$$
where $\hat Q_0 = \hat C_1 + (r_+ + r_-)\hat C_2 + (r_+^2 + r_+r_- + r_-^2)\hat C_3$. Using $(r_+ + r_-) = -2i\tau\varphi'_{x_n}$ and $(r_+^2 + r_+r_- + r_-^2) = -3(\tau\varphi'_{x_n})^2 + (i\alpha)^2$ we get by a simple calculation
$$\hat Q_0=-\tau^2\begin{pmatrix}0&1\\ 1&0\end{pmatrix}.$$
Then, by (4.50), $\mathrm{op}(Q)v = \mathrm{op}(r)u$ on $x_n = 0$, where $r\in S^0$. Combining this last equality with (4.71), (4.64) and (4.72), we get (4.58).
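The two symmetric-function identities used for $\hat Q_0$ follow directly from $r_\pm = -i\tau\varphi'_{x_n}\pm i\alpha$. A short check, given only as a sketch:

```latex
r_+ + r_- = -2i\tau\varphi'_{x_n}, \qquad
r_+ r_- = (-i\tau\varphi'_{x_n})^2 - (i\alpha)^2
        = -(\tau\varphi'_{x_n})^2 - (i\alpha)^2,
\qquad
r_+^2 + r_+ r_- + r_-^2 = (r_+ + r_-)^2 - r_+ r_-
  = -4(\tau\varphi'_{x_n})^2 + (\tau\varphi'_{x_n})^2 + (i\alpha)^2
  = -3(\tau\varphi'_{x_n})^2 + (i\alpha)^2 .
```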
5. End of the Proof of Proposition 4.1

The purpose of this section is to prove Proposition 4.1. The essential ingredient in the proof is to estimate the traces of $u$ by the operators $A$ and $B$; this is the content of the following proposition.

Proposition 5.1. There exist $C_0 > 0$ and $C > 0$ such that if $\varphi'_{x_n} > C_0$, then for any large enough $\tau$ we have
$$\|A(x,D,\tau)u\|^2+\tau|B(x,D,\tau)u|^2_{1-\mathrm{ord}B,\tau}+\|u\|^2_{1,\tau}+|u|^2_{1,\tau}\ge C\tau|\mathrm{op}(\chi_0)u|^2_{1,0,\tau}\tag{5.1}$$
whenever $u\in C_0^\infty(K)$.
5.1. Preliminaries. Let u ∈ C0∞ (K) denote u = op(χ0 )u
and
f = op(A) u,
(5.2)
where op(A) the differential operator with principal symbol A(x, ξ, τ ) = A0 (x)ξn2 + A1 (x, ξ , τ )ξn + A2 (x, ξ , τ ).
(5.3)
It is easy to see that
op(A) u = f in op(B) u= g0 on
xn > 0 , xn = 0
(5.4)
where f = op(χ0 )f + [op(A); op(χ0 )]u. Let us reduce the problem (5.4) to a first order system. Put v = t (< D , τ > u, ˜ Dn u). ˜ Then the system (5.4) is reduced to the following first order system: Dn v − op(A)v = F in xn > 0 , (5.5) op(B)v = g0 on xn = 0 where the principal symbol of op(A) is given by
0 − < ξ , τ > I dn A= − < ξ , τ >−1 A−1 −A−1 0 A2 0 A1
(5.6)
and B = (< ξ , τ >−1 B1 , B0 )
(5.7)
with B(x, ξ, τ ) = B0 (x)ξn + B1 (x, ξ , τ ) and F = t (0, f˜).
(5.8)
Further, det (ξn − A(x, ξ , τ )) = det (A(x, ξ, τ )) (5.9) n−1 n−1 = µ (2µ + λ)(aµ (x, ξ, τ )) (a2µ+λ (x, ξ, τ )). Let (x0 , ξ0 , τ0 ) be fixed in Supp(χ0 ). In this case the eigenvalues of A are z1± = −iτ ϕx n ± iαµ and z2± = −iτ ϕx n ± iα2µ+λ with ± Im(z2± ) > 0 and z1+ ∈ R. De ± + − − note s(x0 , ξ0 , τ0 ) = (s + 1 , . . . , s n , s 1 , . . . , s n ), where s j j =1,...,n form a basis of the to the generalized eigenspace of A(x0 , ξ0 , τ0 ), corresponding eigenvalues with either positive or negative imaginary part. Let for γ ∈ µ, 2µ + λ , 1 ± Pγ (x, ξ , τ ) = (ξn − A(x, ξ , τ ))−1 dξn , (5.10) 2iπ Cγ± where Cγ± is a small circle with center −iτ ϕx n ± iαγ . Using this projection operator, ± ± ± we put sj± = Pµ± s ± j , j = 1, . . . , n − 1 and sn = P2µ+λ s n , where < ξ , τ >= 1. Also we define the smooth positive homogeneous function of degree zero s(x, ξ , τ ) = (s1+ , . . . , sn+ , s1− , . . . , sn− ), and finally we define a pseudo-differential S(x, Dx , τ ) with principal symbol s(x, ξ , τ ).
Then by the argument in Taylor [T] (see also Yamamoto [Y]) there exists a pseudodifferential operator $K(x, D_x,\tau)$ of order $-1$ such that the boundary value problem (5.5) is reduced to the following:
$$\begin{cases}D_nw-\mathrm{op}(\mathcal H)w=\tilde F&\text{in }x_n>0,\\ \mathrm{op}(\tilde B)w=\tilde g_0&\text{on }x_n=0,\end{cases}\tag{5.11}$$
where $w = (I+K)^{-1}S^{-1}v$, $\tilde F = (I+K)^{-1}S^{-1}F$, $\tilde B = BS(I+K)^{-1}$ and $\mathcal H = \mathrm{diag}(\mathcal H^+,\mathcal H^-)$. Moreover the eigenvalues of the principal symbol of $\mathcal H^-$ have negative imaginary parts. Denote the boundary operator $\tilde B$ of (5.11) by $(\tilde B^+,\tilde B^-)$ and by $E^+$ the subspace generated by $(s_1^+,\dots,s_n^+)$. Then we have
$$E^+=E^+(x,\xi',\tau)=[\mathrm{Ker}(z_1^+-\mathcal A)]\oplus[\mathrm{Ker}(z_2^+-\mathcal A)].\tag{5.12}$$
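5.2. Proof of Proposition 5.1.

Lemma 5.1. Let $R = \mathrm{diag}(0, -\rho\,\mathrm{Id}_n)$, $\rho > 0$; then there exists $C > 0$ such that
$$\mathrm{Im}(R\mathcal H)=\mathrm{diag}(0,e(x,\xi',\tau))\tag{5.13}$$
and
i) $e(x,\xi',\tau)\ge C\langle\xi',\tau\rangle I_n$ in $\mathrm{supp}\,\chi_0$,
ii) $-R+\tilde B^*\tilde B\ge C\,\mathrm{diag}(0,\mathrm{Id})$ on $\{x_n = 0\}\cap\mathrm{supp}\,\chi_0$.

Proof of Proposition 5.1. Define the function $G(x_n) = \frac{d}{dx_n}(\mathrm{op}(R)w, w)_{L^2(\mathbb R^{n-1})}$. Taking into account (5.10), we have
$$G(x_n)=-2\,\mathrm{Im}(\mathrm{op}(R)\mathrm{op}(\mathcal H)w,w)+(\mathrm{op}(R'_{x_n})w,w)+2\,\mathrm{Im}(\mathrm{op}(R)w,\tilde F).\tag{5.14}$$
The integration in the normal direction gives
$$(\mathrm{op}(R)w,w)_0=2\,\mathrm{Im}\int_0^\infty(\mathrm{op}(R)\mathrm{op}(\mathcal H)w,w)\,dx_n-2\,\mathrm{Im}\int_0^\infty(\mathrm{op}(R)w,\tilde F)-\int_0^\infty(\mathrm{op}(R'_{x_n})w,w)\,dx_n.\tag{5.15}$$
Then according to Lemma 5.1 and the Gårding inequality we obtain, for $w = (w^+, w^-)$ and large $\tau$,
$$\mathrm{Im}(\mathrm{op}(R)\mathrm{op}(\mathcal H)w,w)\ge C\tau\|w^-\|^2,\tag{5.16}$$
and further, for any $\varepsilon > 0$,
$$\int_0^\infty|(\mathrm{op}(R)w,\tilde F)|\,dx_n\le\varepsilon C\tau\|w^-\|^2+\frac{C_\varepsilon}{\tau}\|\tilde f\|^2.\tag{5.17}$$
By applying Lemma 5.1 ii),
$$-(\mathrm{op}(R)w,w)+C|\tilde Bw|^2\ge C|w|^2.\tag{5.18}$$
Combining (5.18), (5.17), (5.16) with (5.15) we get
$$C\tau\|w^-\|^2+C|w|^2_0\le\frac{C}{\tau}\|\tilde f\|^2_0+|\tilde Bw|^2.\tag{5.19}$$
This implies (5.1).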
5.3. Proof of Lemma 5.1. First we prove that for any $(x,\xi',\tau)\in\mathrm{Supp}\,\chi_0$, the restriction $\tilde B^+$ of $\tilde B$ to $E^+(x,\xi',\tau)$ is an isomorphism. The eigenvalues of $\mathcal A$ are $z_1^\pm = -i\tau\varphi'_{x_n}\pm i\alpha_\mu$ and $z_2^\pm = -i\tau\varphi'_{x_n}\pm i\alpha_{2\mu+\lambda}$, with multiplicity $(n-1)$ and 1, respectively. Now let $X = (X_1, X_2)\in\mathbb C^n\oplus\mathbb C^n$ be an eigenvector of $\mathcal A$ associated to $z_0$. Then $X$ satisfies
$$\begin{cases}\langle\xi',\tau\rangle X_2=z_0X_1,\\ A(z_0)X_1=0.\end{cases}\tag{5.20}$$

a) Calculus of the eigenvectors associated to $z_1^+$: Denote by $(\omega_j^+)_{j=1,\dots,n-2}$ a basis of $\{G_0, G_1\}^\perp$. Then
$$A(z_1^+)\omega_j^+=0\quad\text{for }j\in\{1,\dots,n-2\},\tag{5.21}$$
where $A(z_1^+) = (\mu+\lambda)(G_1+i\alpha_\mu G_0)\,{}^t(G_1+i\alpha_\mu G_0)$. Now we set the following vector in $\mathbb C^n$:
$$\omega^+_{n-1}=\langle\xi',\tau\rangle^{-1}\big((i\alpha)^2G_0+i\alpha_\mu G_1\big).\tag{5.22}$$
Then, by a simple computation,
$$A(z_1^+)\omega^+_{n-1}=0.\tag{5.23}$$

b) Calculus of the eigenvectors associated to $z_2^+$: We get
$$A(z_2^+)=-\frac{\mu+\lambda}{2\mu+\lambda}\,\tau^2\,\mathrm{Id}+(\mu+\lambda)(G_1+i\alpha_{2\mu+\lambda}G_0)\,{}^t(G_1+i\alpha_{2\mu+\lambda}G_0).\tag{5.24}$$
Let $\omega_n^+$ be defined by
$$\omega_n^+=(i\alpha_{2\mu+\lambda})G_0+G_1.\tag{5.25}$$
Hence $A(z_2^+)\omega_n^+ = 0$. Using (5.20), we denote
$$\begin{cases}s_j^+=(\omega_j^+;\ \langle\xi',\tau\rangle^{-1}z_1^+\omega_j^+),&j\in\{1,\dots,n-1\},\\ s_n^+=(\omega_n^+;\ \langle\xi',\tau\rangle^{-1}z_2^+\omega_n^+),\end{cases}\tag{5.26}$$
and the principal symbol of $\tilde B$ by $(b_1^+,\dots,b_n^+, b_1^-,\dots,b_n^-)$.

i) Dirichlet condition: In this case $B = (\langle\xi',\tau\rangle^{-1}\mathrm{Id}_n, 0)$; then $BS = (\omega_1^+,\dots,\omega_n^+,\omega_1^-,\dots,\omega_n^-) = (\tilde B^+,\tilde B^-)$. We obtain
$$\tilde B^+=(\omega_1^+,\dots,\omega_n^+)=(b_1^+,\dots,b_n^+).\tag{5.27}$$
Therefore $\tilde B^+$ is an isomorphism if $\omega^+_{n-1}$ and $\omega_n^+$ are linearly independent. With (5.23) and (5.26), we must see that for any $(x,\xi',\tau)\in\mathrm{Supp}\,\chi_0$,
$$(i\alpha)^2-(i\alpha_\mu)(i\alpha_{2\mu+\lambda})\ne0.\tag{5.28}$$
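We may show this by contradiction. If we assume that $(i\alpha)^2-(i\alpha_\mu)(i\alpha_{2\mu+\lambda}) = 0$ then we have
$$(i\alpha)^2=-\frac{\tau^2}{3\mu+\lambda}\tag{5.29}$$
and by (4.23), we get
$$(i\alpha_\mu)^2=\frac{\tau^2}{\mu}-\frac{\tau^2}{3\mu+\lambda}=\frac{2\mu+\lambda}{\mu(3\mu+\lambda)}\,\tau^2>0.\tag{5.30}$$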
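Identities (5.29) and (5.30) can be recovered by squaring the assumed relation and using (4.23). A sketch, writing $X = (i\alpha)^2$ (shorthand introduced here) and assuming $\tau\ne0$, $\mu > 0$ and $3\mu+\lambda > 0$:

```latex
X^2 = (i\alpha_\mu)^2\,(i\alpha_{2\mu+\lambda})^2
    = \Big(X + \frac{\tau^2}{\mu}\Big)\Big(X + \frac{\tau^2}{2\mu+\lambda}\Big)
\;\Longrightarrow\;
0 = X\tau^2\,\frac{3\mu+\lambda}{\mu(2\mu+\lambda)} + \frac{\tau^4}{\mu(2\mu+\lambda)}
\;\Longrightarrow\;
X = -\frac{\tau^2}{3\mu+\lambda},
\qquad
(i\alpha_\mu)^2 = X + \frac{\tau^2}{\mu}
  = \tau^2\,\frac{(3\mu+\lambda) - \mu}{\mu(3\mu+\lambda)}
  = \frac{2\mu+\lambda}{\mu(3\mu+\lambda)}\,\tau^2 .
```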
Then we get $(i\alpha_\mu)\in\mathbb R$. This implies that $\mathrm{Im}(z_1^+) = \mathrm{Im}(z_1^-) = -\tau\varphi'_{x_n}$, which is contradictory since $(x,\xi',\tau)\in Z_\mu$. Hence $\tilde B^+$ is an isomorphism.

ii) Neumann condition: The proof of this part is postponed to Sect. 6 (see Lemma 6.3).

For the second part of Lemma 5.1, we get
$$\mathrm{Im}(R\mathcal H)=\mathrm{diag}(0,-\rho\,\mathrm{Im}(\mathcal H^-));\tag{5.31}$$
then we obtain i). Let now $w = (w^+, w^-)\in\mathbb C^{2n} = \mathbb C^n\oplus\mathbb C^n$. We have $\tilde Bw = \tilde B^+w^+ + \tilde B^-w^-$. Taking into account that $\tilde B^+$ is an isomorphism, there exists $C > 0$ such that
$$|w^+|^2\le C\big(|w^-|^2+|\tilde Bw|^2\big).\tag{5.32}$$
This shows that
$$-(Rw,w)=\rho|w^-|^2\ge C|w|^2-C'|\tilde Bw|^2\tag{5.33}$$
for large $\rho$. This concludes the proof of Lemma 5.1.

6. Carleman Estimates in Elliptic Region $E^+$

In this section, our goal is to show the following Carleman estimate in the elliptic region:
$$E^+=\Big\{(x,\xi',\tau)\in K\times S^{n-1};\ q_2-\frac{\tau^2}{\mu}+\frac{q_1^2}{(\tau\varphi'_{x_n})^2}>0\Big\}.\tag{6.1}$$
Let $\chi$ be a cutoff function homogeneous of degree zero in the region $E^+$.

Lemma 6.1. There exist $C_0 > 0$ and $C_1 > 0$ such that if $\varphi'_{x_n} > C_0$, then for any large enough $\tau$ we have
$$\|A(x,D,\tau)u\|^2+\tau|B(x,D,\tau)u|^2_{0,\tau}+\|u\|^2_{1,\tau}+|u|^2_{1,\tau}\ge C_1\tau\big(\|\mathrm{op}(\chi)u\|^2_{1,\tau}+\tau|\mathrm{op}(\chi)u|^2_{1,\tau}\big)\tag{6.2}$$
whenever $u\in C_0^\infty(K)$.
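Denote $\tilde u = \mathrm{op}(\chi)u$ and $\tilde f = \mathrm{op}(A)\tilde u$. By a similar argument to the one of Sect. 5, we have
$$D_n w - \mathrm{op}(H)w = \tilde F \quad \text{in } x_n > 0, \tag{6.3}$$
$$\mathrm{op}(\tilde B)w = g_0 \quad \text{on } x_n = 0, \tag{6.4}$$
where $w = (I + K)^{-1}S^{-1}v$, $\tilde F = (I + K)^{-1}S^{-1}F$, $\tilde B = BS(I + K)^{-1}$, $H = \operatorname{diag}(H^+, H^-)$ and $H^\pm$ is an $n \times n$ square matrix whose components are pseudodifferential operators of order 1 with principal symbol given by $\operatorname{diag}(z_1^\pm\,\mathrm{Id}_{n-1}, z_2^\pm)$. The proof of Lemma 6.1 is based on the next lemma.

Lemma 6.2. There exist a hermitian matrix $R(x, \xi', \tau)$ and constants $C_1, C_2 > 0$ such that
$$\operatorname{Im}(RH) = e(x, \xi', \tau), \tag{6.5}$$
furthermore, we have the following properties:
i) $e(x, \xi', \tau) \ge C_1 I$ in $\operatorname{supp}(\chi)$,
ii) $-R(x, \xi', \tau) + C_2\tilde B^*\tilde B \ge CI$ on $\{x_n = 0\} \cap \operatorname{supp}(\chi)$.

Proof of Lemma 6.1. Let $G$ be the function defined by
$$G(x_n) = \frac{d}{dx_n}(\mathrm{op}(R)w, w)_{L^2(\mathbb{R}^{n-1})}. \tag{6.6}$$
Using the fact that $D_n w - \mathrm{op}(H)w = \tilde F$, we obtain
$$G(x_n) = -2\operatorname{Im}(\mathrm{op}(R)\,\mathrm{op}(H)w, w) + 2\operatorname{Im}(\mathrm{op}(R)w, \tilde F) + (\mathrm{op}(R'_{x_n})w, w). \tag{6.7}$$
The integration in the normal direction gives
$$(\mathrm{op}(R)w, w)_0 = \int_0^\infty 2\operatorname{Im}(\mathrm{op}(R)\,\mathrm{op}(H)w, w)\,dx_n - 2\operatorname{Im}\int_0^\infty (\mathrm{op}(R)w, \tilde F)\,dx_n - \int_0^\infty (\mathrm{op}(R'_{x_n})w, w)\,dx_n. \tag{6.8}$$
Taking into account Lemma 6.2 i), we have
$$\operatorname{Im}(\mathrm{op}(R)\,\mathrm{op}(H)w, w) \ge C\tau\|w\|^2, \tag{6.9}$$
and by (ii) we obtain
$$(\mathrm{op}(R)w, w)_0 \le C|\tilde Bw|^2 - C'|w|^2. \tag{6.10}$$
Combining (6.9) and (6.10) with (6.8) we get
$$C\tau\|w\|^2 + C'|w|_0^2 \le \|\tilde F\|_0^2 + \tau|\tilde Bw|^2. \tag{6.11}$$
This completes the proof of Lemma 6.1.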
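The step from (6.6) to (6.7) can be written out as follows. This is only a sketch under assumptions that are not stated in this section: the convention $D_n = -i\,\partial_{x_n}$, an inner product that is linear in its first argument, and $\mathrm{op}(R)$ treated as exactly self-adjoint (lower order terms are ignored).

```latex
\begin{align*}
G(x_n) &= (\mathrm{op}(R'_{x_n})w, w)
        + (\mathrm{op}(R)\,\partial_{x_n}w, w)
        + (\mathrm{op}(R)w, \partial_{x_n}w), \\
\partial_{x_n}w &= i D_n w = i\,\mathrm{op}(H)w + i\tilde F, \\
(\mathrm{op}(R)\, i\,\mathrm{op}(H)w, w) + (\mathrm{op}(R)w, i\,\mathrm{op}(H)w)
  &= i\big[(\mathrm{op}(R)\mathrm{op}(H)w, w)
       - \overline{(\mathrm{op}(R)\mathrm{op}(H)w, w)}\big]
   = -2\operatorname{Im}(\mathrm{op}(R)\mathrm{op}(H)w, w), \\
(\mathrm{op}(R)\, i\tilde F, w) + (\mathrm{op}(R)w, i\tilde F)
  &= i\big[\overline{(\mathrm{op}(R)w, \tilde F)} - (\mathrm{op}(R)w, \tilde F)\big]
   = 2\operatorname{Im}(\mathrm{op}(R)w, \tilde F).
\end{align*}
```

Adding the three contributions reproduces (6.7). Since $w$ vanishes for large $x_n$, integrating $G$ over $(0, \infty)$ gives $-(\mathrm{op}(R)w, w)_0$, which is (6.8).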
Proof of Lemma 6.2. Let $w = (w^+, w^-) \in \mathbb{C}^{2n} = \mathbb{C}^n \oplus \mathbb{C}^n$. We then have $\tilde Bw = \tilde B^+w^+ + \tilde B^-w^-$. By the argument of Sect. 5 we prove that for any $(x, \xi', \tau)$ with $\tau \neq 0$, $\tilde B^+$ is an isomorphism in the case of the Dirichlet boundary condition. For the Neumann boundary condition see Lemma 6.3. But in the region $E^+$ the parameter $\tau$ can degenerate to zero.

Study near $\tau = 0$. For $\tau = 0$ the eigenvalues of $A$ are $\pm(i\alpha)$. Let $\{\omega_j^+\}$, $j = 1, \dots, n-2$, be a basis of $\{G_0, G_1\}^\perp$ and put
$$\omega_{n-1}^+ = R^{-1}(G_1 + (i\alpha)G_0) \quad \text{with} \quad R = \langle \xi', \tau \rangle. \tag{6.12}$$
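Then we have
$$A(i\alpha)\,\omega_{n-1}^+ = 0. \tag{6.13}$$
For $j \in \{1, \dots, n-1\}$ we set
$$\mathbf{s}_j^\pm = (\omega_j^\pm;\ \pm(i\alpha)R^{-1}\omega_j^\pm) \tag{6.14}$$
and let
$$\mathbf{s}_n^+ = ((\lambda + 3\mu)(G_1 - (i\alpha)G_0);\ (i\alpha)R^{-1}(3\lambda + 5\mu)G_1 + (\lambda - \mu)(i\alpha)G_0). \tag{6.15}$$
Then we have
$$(A + (i\alpha))\,\mathbf{s}_n^+ = 2(\mu + \lambda)\,\mathbf{s}_n^- \in \operatorname{Ker}(A - (i\alpha)). \tag{6.16}$$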
Then $\{\mathbf{s}_j^\pm\}$, $j = 1, \dots, n$, are linearly independent. Let
$$P_\pm(x, \xi', \tau) = \frac{1}{2i\pi}\int_{C^\pm}(\xi_n - A(x, \xi', \tau))^{-1}\,d\xi_n, \tag{6.17}$$
where $C^\pm$ is a small circle with the center $\pm(i\alpha)$. We define $s_j^\pm$ by
$$s_j^\pm(x, \xi', \tau) = P_\pm(x, \xi', \tau)\,\mathbf{s}_j^\pm, \tag{6.18}$$
then $s(x, \xi', \tau) = (s_1^+, \dots, s_n^+, s_1^-, \dots, s_n^-)$ is a homogeneous regular function of degree zero. On the one hand, for the Dirichlet condition we have $\tilde B^+(x_0, \xi'_0, 0) = (b_1^+, \dots, b_n^+)$, where
$$b_j^+ = \omega_j^+, \quad j \in \{1, \dots, n-1\}, \qquad b_n^+ = (3\mu + \lambda)(G_1 - (i\alpha)G_0), \tag{6.19}$$
so $\det(\tilde B^+(x_0, \xi'_0, 0)) = 0$ if and only if $(3\mu + \lambda) = 0$. On the other hand, for the Neumann condition, by an easy computation we have $B = (R^{-1}B_1, B_0)$, where
$$B_0 = \mu\,\mathrm{Id} + (\mu + \lambda)G_0^tG_0, \qquad B_1 = \mu G_1^tG_0 + \lambda G_0^tG_1, \tag{6.20}$$
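and $\tilde B^+ = (b_1^+, \dots, b_n^+)$, where
$$b_j^+ = \mu z_1^+\omega_j^+, \qquad b_{n-1}^+ = 2\mu(i\alpha)((i\alpha)G_0 + G_1), \qquad b_n^+ = 2\mu(\mu + \lambda)(i\alpha)(-i\alpha G_0 + G_1), \tag{6.21}$$
so $\det(\tilde B^+(x_0, \xi'_0, 0)) = 0$ if and only if $(\mu + \lambda) = 0$.

To prove the second part of Lemma 6.2, let us search for $R$ in the form
$$R = \operatorname{diag}(\mathrm{Id}_n, -\rho\,\mathrm{Id}_n). \tag{6.22}$$
For such an $R$, we have
$$\operatorname{Im}(RH) = \operatorname{diag}(\operatorname{Im}(H^+), -\rho\operatorname{Im}(H^-)) = e(x, \xi', \tau), \tag{6.23}$$
where $e(x, \xi', \tau) \ge C\,\mathrm{Id}_{2n}$. Now let $w = (w^+, w^-) \in \mathbb{C}^{2n} = \mathbb{C}^n \oplus \mathbb{C}^n$. Then we have $\tilde Bw = \tilde B^+w^+ + \tilde B^-w^-$ and there exists $C > 0$ such that for any $w \in \mathbb{C}^{2n}$ we get
$$|w^+|^2 \le C\big(|w^-|^2 + |\tilde Bw|^2\big). \tag{6.24}$$
Writing $-|w^+|^2 = |w^+|^2 - 2|w^+|^2$ and applying (6.24) to the last term, we deduce
$$-(Rw, w) = -|w^+|^2 + \rho|w^-|^2 \ge |w^+|^2 + (\rho - 2C)|w^-|^2 - 2C|\tilde Bw|^2. \tag{6.25}$$
As a consequence we have the desired estimate (ii) for large $\rho$.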
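Lemma 6.3. For any $(x, \xi', \tau)$ in $Z_\mu \cup E^+$ with $\tau \neq 0$, the operator $\tilde B^+(x, \xi', \tau)$ is an isomorphism under the assumption $\varphi'_{x_n} > C_0$ for large $C_0 > 0$.

Proof. We will keep some of the notation of Sect. 4. Let $B = (R^{-1}B_1, B_0)$ be the principal symbol of the Neumann operator, where
$$B_0 = \mu\,\mathrm{Id} + (\mu + \lambda)G_0^tG_0, \qquad B_1 = i\tau\varphi'_{x_n}B_0 + \mu G_1^tG_0 + \lambda G_0^tG_1. \tag{6.26}$$
Denote $\tilde B^+ = (b_1^+, \dots, b_n^+)$; then we have by elementary calculations
$$\begin{aligned}
b_j^+ &= \mu(i\alpha_\mu)R^{-1}\omega_j^+, \quad j = 1, \dots, n-2,\\
b_{n-1}^+ &= R^{-3}[2\mu(i\alpha_\mu)](i\alpha)^2G_0 + R^{-3}[2\mu(i\alpha)^2 + \tau^2]G_1,\\
b_n^+ &= R^{-2}[2\mu(i\alpha)^2 + \tau^2]G_0 + R^{-2}[2\mu(i\alpha_{2\mu+\lambda})]G_1.
\end{aligned} \tag{6.27}$$
Then we get
$$\det(\tilde B^+) = (\mu(i\alpha_\mu))^{n-1}\,\tau^4R^{-4}\,\mathcal{R}\Big(\frac{\sqrt{\mu}}{\tau}\,\alpha(x, \xi', \tau)\Big), \tag{6.28}$$
where $\mathcal{R}$ is the function given by
$$\mathcal{R}(s) = (1 - 2s^2)^2 - 4s^2\sqrt{s^2 - 1}\,\sqrt{s^2 - \frac{\mu}{2\mu + \lambda}}. \tag{6.29}$$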
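It is well known that there is only one simple root $s = s_0$ of $\mathcal{R}(s) = 0$, $s > 1$ (see Taylor [T1]) and we can prove $\mathcal{R}$ has no roots in $\operatorname{Re} z > 0$. Let $\Sigma_1$ be the characteristic manifold defined by
$$\Sigma_1 = \Big\{(x', \xi', \tau) \in T^*\partial\Omega;\ \alpha^2 = \frac{\tau^2}{c_R^2}\Big\} = \Big\{(x', \xi', \tau) \in T^*\partial\Omega;\ r(x', \xi') - \tau^2r(x', \varphi'_{x'}) + 2i\tau q_1 - \frac{\tau^2}{c_R^2} = 0\Big\}, \tag{6.30}$$
where $c_R = \sqrt{\mu}/s_0$ is the Rayleigh speed. Therefore $\tilde B^+$ is elliptic outside $\Sigma_1$, but for $(x', \xi', \tau) \in Z_\mu \cup E^+$ we have
$$r(x', \xi') - \tau^2r(x', \varphi'_{x'}) + \frac{q_1^2}{(\varphi'_{x_n})^2} \ge \frac{\tau^2}{\mu} + (\tau\varphi'_{x_n})^2.$$
So for $\varphi'_{x_n} > C_0$ we have
$$\Sigma_1 \cap (Z_\mu \cup E^+) = \emptyset. \tag{6.31}$$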
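The root $s_0$ and the Rayleigh speed $c_R = \sqrt{\mu}/s_0$ are easy to compute numerically. The sketch below is illustrative only: it assumes unit density, Lamé parameters with $\mu > 0$ and $\mu + \lambda > 0$, and the Rayleigh function in the form $(1 - 2s^2)^2 - 4s^2\sqrt{s^2 - 1}\sqrt{s^2 - \mu/(2\mu + \lambda)}$. The function names and the bracketing interval are hypothetical choices, not part of the paper.

```python
import math


def rayleigh_function(s, lam, mu):
    """Rayleigh function for s >= 1: (1 - 2s^2)^2 - 4s^2 sqrt(s^2-1) sqrt(s^2-kappa)."""
    kappa = mu / (2.0 * mu + lam)  # squared ratio of shear speed to pressure speed
    return (1.0 - 2.0 * s * s) ** 2 - 4.0 * s * s * math.sqrt(s * s - 1.0) * math.sqrt(s * s - kappa)


def rayleigh_root(lam, mu, s_max=50.0, tol=1e-12):
    """Bisection for the root s0 > 1; s_max is an assumed upper bracket."""
    lo, hi = 1.0, s_max
    if not (rayleigh_function(lo, lam, mu) > 0.0 > rayleigh_function(hi, lam, mu)):
        raise ValueError("no sign change on [1, s_max]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rayleigh_function(mid, lam, mu) > 0.0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies to the left of mid
    return 0.5 * (lo + hi)


lam, mu = 1.0, 1.0            # illustrative values (lam = mu)
s0 = rayleigh_root(lam, mu)
c_R = math.sqrt(mu) / s0      # Rayleigh speed for unit density
print(s0, c_R)
```

For $\lambda = \mu$ the Rayleigh equation reduces to a cubic in $1/s^2$ whose relevant root is $1/s_0^2 = 2 - 2/\sqrt{3}$, which gives an independent check of the bisection result.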
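7. Study of the Region $M$ and $Z_{2\mu+\lambda}$

Our goal in this section is to prove the Carleman estimates in the mixed region $M$ and $Z_{2\mu+\lambda}$.

Lemma 7.1. Assume that $\varphi'_{x_n} > 0$. Then there exists $C > 0$ such that for any large enough $\tau$ we have
$$C\tau|\mathrm{op}(\chi)u|^2_{1,0,\tau} \le \|A(x, D, \tau)u\|^2 + \tau|B(x, D, \tau)u|^2_{1-\mathrm{ord}B,\tau} + \|u\|^2_{1,\tau} \tag{7.1}$$
whenever $u \in C_0^\infty(K)$.

Proof. Denote $v = {}^t(\langle D', \tau\rangle\tilde u, D_n\tilde u)$, where $\tilde u = \mathrm{op}(\chi)u$. In the mixed region $M$ the roots of $a_\mu$, denoted by $z_\mu^-$ and $\tilde z_\mu^-$, are in the lower half-plane $\operatorname{Im}\xi_n < 0$ and the roots of $a_{2\mu+\lambda}$ satisfy $\pm\operatorname{Im}(z^\pm_{2\mu+\lambda}) > 0$. By a similar argument to the one of Sect. 5, we have
$$D_n w - \mathrm{op}(H)w = \tilde F \quad \text{in } x_n > 0, \qquad \mathrm{op}(\tilde B)w = g \quad \text{on } x_n = 0, \tag{7.2}$$
where $w = (I + K)^{-1}S^{-1}v$, $\tilde F = (I + K)^{-1}S^{-1}F$, $\tilde B = BS(I + K)^{-1}$, and $\mathrm{op}(H)$ is a pseudodifferential operator of order 1 with principal symbol equal to
$$H = \operatorname{diag}(H^-;\ z^+_{2\mu+\lambda}), \tag{7.3}$$
where $\operatorname{Im}(H^-) < 0$. Let
$$\begin{aligned}
s_j &= (\omega_j;\ R^{-1}z_\mu^-\omega_j), \quad j = 1, \dots, n-2, & s_{n-1} &= (\omega_{n-1};\ R^{-1}z_\mu^-\omega_{n-1}),\\
s_{n-1+j} &= (\omega_j;\ R^{-1}\tilde z_\mu^-\omega_j), \quad j = 1, \dots, n-2, & s_{2n-2} &= (\omega_{n-1};\ R^{-1}\tilde z_\mu^-\omega_{n-1}),\\
s_{2n-1} &= (\omega_n;\ R^{-1}z^-_{2\mu+\lambda}\omega_n), & s_{2n} &= (\omega_n;\ R^{-1}z^+_{2\mu+\lambda}\omega_n).
\end{aligned}$$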
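Since $B = (R^{-1}B_1, B_0)$ with
$$B_0 = \mu\,\mathrm{Id} + (\mu + \lambda)G_0\,{}^tG_0, \qquad B_1 = \mu G_1\,{}^tG_0 + \lambda G_0\,{}^tG_1 + i\tau\varphi'_{x_n}B_0, \tag{7.4}$$
from the definition of $s_j$ it follows that $\tilde B = (b_1, \dots, b_{2n})$, where
$$\begin{aligned}
b_j &= \mu R^{-1}(i\alpha_\mu)\omega_j \quad \text{for } j = 1, \dots, n-2,\\
b_{n-1} &= \mu R^{-1}[(i\alpha)^2 + (i\alpha_\mu)^2]G_1 + 2\mu R^{-1}(i\alpha_\mu)(i\alpha)^2G_0,\\
b_{n-1+j} &= -\mu R^{-1}(i\alpha_\mu)\omega_j,\\
b_{2n-2} &= \mu R^{-1}[(i\alpha)^2 - (i\alpha_\mu)^2]G_1 - (2\mu + \lambda)(i\alpha_\mu)(i\alpha_{2\mu+\lambda})G_0,\\
b_{2n-1} &= R^{-1}[-(2\mu + \lambda)(i\alpha_{2\mu+\lambda})^2 - \lambda(i\alpha)^2]G_0,\\
b_{2n} &= 2\mu R^{-1}(i\alpha_{2\mu+\lambda})G_1 + R^{-1}[-\lambda(i\alpha)^2 + (2\mu + \lambda)(i\alpha_{2\mu+\lambda})^2]G_0.
\end{aligned} \tag{7.5}$$
As a consequence
$$\tilde B = (b_1, \dots, b_{n-2}, b_{n-1}, -b_1, \dots, -b_{n-2}, b_{2n-2}, b_{2n-1}, b_{2n}). \tag{7.6}$$
Let $C = (b_1, \dots, b_{n-2}, b_{2n-1}, b_{2n})$, and define the elliptic pseudodifferential operator $C(x, D_x, \tau)$ of order 0 with the principal symbol $C_0(x, \xi', \tau)$. In (7.2) we may assume that the boundary operator is $\tilde B_0 = C^{-1}\tilde B$. Then we have
$$\tilde B_0 = (e_1, \dots, e_{n-2}, C_1, -e_1, \dots, -e_{n-2}, C_2, e_{n-1}, e_n) \tag{7.7}$$
and
$$D_n w - \mathrm{op}(H)w = \tilde F \quad \text{in } x_n > 0, \qquad \mathrm{op}(\tilde B_0)w = \mathrm{op}(C^{-1})g = \tilde g \quad \text{on } x_n = 0. \tag{7.8}$$
Now let $K_0 = (0, \dots, 0, e_n)$. By (7.8), $w$ satisfies the following system:
$$D_n w - \mathrm{op}(H)w = \tilde F \quad \text{in } x_n > 0, \qquad w_n = g_0 \quad \text{on } x_n = 0, \tag{7.9}$$
where
$$g_0 = \mathrm{op}(K_0)\tilde g - \sum_{j=1}^{n-1}\beta_jw_j = \mathrm{op}(K_0)\tilde g - \mathrm{op}(\beta)w', \tag{7.10}$$
$w = (w', w_n)$ and $\mathrm{op}(\beta)$ is a pseudodifferential operator of order 0. Let $R = \operatorname{diag}(-\lambda I_{2n-1}, 0)$ and define the function $G(x_n)$ by
$$G(x_n) = \frac{d}{dx_n}(\mathrm{op}(R)w, w). \tag{7.11}$$
Then we have by integration in the normal direction
$$(\mathrm{op}(R)w, w)_0 = \int_0^\infty 2\operatorname{Im}(\mathrm{op}(R)\,\mathrm{op}(H)w, w)\,dx_n + \int_0^\infty \operatorname{Im}(\mathrm{op}(R)w, \tilde F). \tag{7.12}$$
Taking into account (7.9) and (7.11) we get
$$\tau|w|_0^2 \le C\|\tilde F\|^2 + \tau|g|^2. \tag{7.13}$$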
Finally the elliptic region E − can be handled as E + , and we prove that we have a better estimate (without boundary terms). Combining Proposition 4.1, Lemma 6.1 and Lemma 7.1 we have Theorem 1.3.
A. Appendix

In this section we study the outgoing resolvent on the real axis. We prove that there are no real poles. Let $z \in \mathbb{R}^*$ and let $u(x, z)$ solve the following problem: $(\Delta_e + z^2)u = 0$ in $\Omega$, $Bu = 0$ on $\partial\Omega$, and $u$ is outgoing. Our purpose is to prove that $u$ is identically equal to zero. Assume that $z > 0$ and let $a > 0$ be such that the obstacle $O \subset B(0, a)$ and $g = \mathrm{Id}$ in ${}^cB(0, a)$. Let $R \ge a$; then we have
$$\int_{\Omega_R}\big(-E(u, u) + z^2|u|^2\big)\,dx + \int_{S(0,R)}Bu \cdot \bar u\,d\sigma = 0. \tag{A.1}$$
Thus we have
$$\lim_{R\to\infty}\int_{S^2}\operatorname{Im}(Bu \cdot \bar u)(R, \sigma)R^2\,d\sigma = 0. \tag{A.2}$$
Keeping the same notation as in paragraph 2, we set $u = u^{(p)} + u^{(s)}$. Then we have
$$\begin{aligned}
\int_{S^2}(Bu \cdot \bar u)R^2\,d\sigma &= \int_{S^2}(Bu^{(p)} + Bu^{(s)})(\bar u^{(p)} + \bar u^{(s)})R^2\,d\sigma\\
&= \int_{S^2}(Bu^{(p)} - i(2\mu + \lambda)k_1u^{(p)})(\bar u^{(p)} + \bar u^{(s)})R^2\,d\sigma + \int_{S^2}(Bu^{(s)} - i\mu k_2u^{(s)})(\bar u^{(p)} + \bar u^{(s)})R^2\,d\sigma\\
&\quad + \int_{S^2}i(2\mu + \lambda)k_1|u^{(p)}|^2R^2\,d\sigma + \int_{S^2}i\mu k_2|u^{(s)}|^2R^2\,d\sigma\\
&\quad + \int_{S^2}\big(i(2\mu + \lambda)k_1u^{(p)}\bar u^{(s)} + i\mu k_2u^{(s)}\bar u^{(p)}\big)R^2\,d\sigma.
\end{aligned} \tag{A.3}$$
Since $u$ is outgoing, $u$ satisfies the radiation condition. According to Theorem 2.9 in Kupradze [Ku] (pp. 127) we obtain
$$\lim_{R\to\infty}\Big[\int_{S^2}(2\mu + \lambda)k_1|u^{(p)}|^2R^2\,d\sigma + \int_{S^2}\mu k_2|u^{(s)}|^2R^2\,d\sigma\Big] = 0. \tag{A.4}$$
Then we have
$$\lim_{R\to\infty}\int_{S^2}|u^{(p)}|^2R^2\,d\sigma = \lim_{R\to\infty}\int_{S^2}|u^{(s)}|^2R^2\,d\sigma = 0. \tag{A.5}$$
It is well known that if $(\Delta + z^2)u = 0$ and $u$ satisfies the radiation condition then $u \equiv 0$ for $z \in \mathbb{R}$ (see Burq [Bu], Morawetz [Mo]). This argument completes the proof.

Acknowledgements. Professor L. Robbiano played an essential part in the conception of this work. The author wishes to express his gratitude to him especially for his patient and numerous readings of this paper.
References

[Bu] Burq, N.: Décroissance de l’énergie locale de l’équation des ondes pour le problème extérieur et absence de résonance au voisinage du réel. Acta Math. 180, n.1, 1–29 (1998)
[H1] Hörmander, L.: The analysis of linear partial differential operators, I–III. Berlin–Heidelberg–New York: Springer Verlag
[IN] Ikehata, N. and Nakamura, G.: Decaying and nondecaying properties of the local energy of an elastic wave outside an obstacle. Japan J. App. Math. 6, 83–95 (1989)
[IS] Iwashita, H. and Shibata, Y.: On the analyticity of spectral functions for exterior boundary value problems. Glas. Math. Ser. III 23 (43), 291–313 (1988)
[K1] Kawashita, M.: On the local-energy decay property for the elastic wave equation with the Neumann boundary conditions. Duke Math. J. 67, 333–351 (1992)
[K2] Kawashita, M.: On a region free from the poles of resolvent and decay rate of the local energy for the elastic wave equation. Indiana Univ. Math. J. 43, 1013–1043 (1994)
[Ku] Kupradze, V.: Three-dimensional problems of the mathematical theory of elasticity and thermoelasticity. Amsterdam: North Holland, 1979
[LP] Lax, P.D. and Phillips, R.S.: Scattering theory. New York: Academic Press, 1967
[LR1] Lebeau, G. and Robbiano, L.: Contrôle exact de l’équation de la chaleur. Comm. Part. Diff. Eq. 20, 335–356 (1995)
[LR2] Lebeau, G. and Robbiano, L.: Stabilisation de l’équation des ondes par le bord. Duke Math. J. 86, n.3, 465–491 (1997)
[Me] Melrose, R.B.: Singularities and energy decay in acoustical scattering. Duke Math. J. 46, 43–59 (1979)
[MF] Morse, P. and Feshbach, H.: Methods of theoretical physics. New York: McGraw-Hill, 1953
[MS] Melrose, R.B. and Sjöstrand, J.: Singularities of boundary value problems I. Comm. Pure. App. Math. 31, 593–617 (1978)
[Mo] Morawetz, C.S.: The decay of solutions of the exterior initial-boundary value problem for the wave equation. Comm. Pure. App. Math. 28, 229–264 (1975)
[Ra] Ralston, J.V.: Solutions of the wave equation with localized energy. Comm. Pure. App. Math. 22, 807–823 (1969)
[SS] Shibata, Y. and Soga, H.: Scattering theory for the elastic wave equation. Publ. RIMS Kyoto Univ. 25, 861–887 (1989)
[SV1] Stefanov, P. and Vodev, G.: Distribution of the resonances for the Neumann problem in linear elasticity in the exterior of a ball. Ann. Inst. H. Poincaré, Phys. Th. 60, 303–321 (1994)
[SV2] Stefanov, P. and Vodev, G.: Distribution of the resonances for the Neumann problem in linear elasticity outside a strictly convex body. Duke Math. J. 78 n.3, 677–714 (1995)
[SV3] Stefanov, P. and Vodev, G.: Neumann resonances in linear elasticity for an arbitrary body. Commun. Math. Phys. 176, 645–659 (1996)
[T] Tataru, D.: Carleman estimates and unique continuation for solutions to boundary value problems. J. Math. Pures Appl. (9) 75, no. 4, 367–408 (1996)
[T1] Taylor, M.: Rayleigh waves in linear elasticity as a propagation of singularities phenomenon. In: Proceedings of the conference on Partial Diff. Equa. Geo., New York: Marcel Dekker, 1979, pp. 273–291
[T2] Taylor, M.: Reflection of singularities of solution to systems of differential equations. Comm. Pure. Appl. Math. 29, 1–38 (1976)
[Y] Yamamoto, K.: Singularities of solutions to the boundary value problems for elastic and Maxwell’s equations. Japan J. Math. 14 n.1, 119–163 (1988)
Communicated by B. Simon
Commun. Math. Phys. 215, 409 – 432 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Geometric Quantization of Vector Bundles and the Correspondence with Deformation Quantization
Eli Hawkins
Center for Gravitational Physics and Geometry, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]
Received: 16 November 1998 / Accepted: 29 June 2000
Abstract: I repeat my definition for quantization of a vector bundle. For the cases of the Toeplitz and geometric quantizations of a compact Kähler manifold, I give a construction for quantizing any smooth vector bundle, which depends functorially on a choice of connection on the bundle. Using this, the classification of formal deformation quantizations, and the formal, algebraic index theorem, I give a simple proof as to which formal deformation quantization (modulo isomorphism) is derived from a given geometric quantization.

1. Introduction

Traditionally, “quantization” has meant some sort of process that, given a classical, symplectic phase space, produces a noncommutative algebra of quantum observables. There are two principal mathematical notions of quantization. Both share as a starting point the idea (from physics) that the product of functions on a manifold is deformed with a parameter $\hbar$ in such a way that the commutator is given to leading order by the Poisson bracket as
$$[f, g]_- = -i\hbar\{f, g\} + \mathcal{O}^2(\hbar). \tag{1.1}$$
One theory, geometric quantization, gives concrete procedures for constructing a C∗-algebra for each (allowed) value of $\hbar$. In the limit as $\hbar \to 0$, each of these algebras can be linearly identified with the ordinary algebra of continuous functions. It is in this approximate sense that the elements of the algebra can be thought of as being fixed while the product changes and satisfies Eq. (1.1).

In the other theory, (formal) deformation quantization (see [1, 23]), Eq. (1.1) is taken to suggest an expansion in powers of $\hbar$. The $\hbar$-dependent product is expressed as a power series in $\hbar$. This power series does not, however, converge for most smooth functions; hence, $\hbar$ can only be taken as a formal parameter and cannot be given a specific, nonzero value.
Both these theories were originally intended to address the physical problem of quantizing the phase space of a physical system. Physically the value of $\hbar$ is not variable (in fact $\hbar \approx 10^{-27}$ g cm²/sec), so deformation quantization can never be used to fully describe what it was originally intended to. However, deformation quantization has proven fruitful as a mathematical subject. For instance, interesting classification results have been achieved in this abstract setting (see [18]).

The concept of noncommutative geometry (see [4]) suggests that a noncommutative algebra can be thought of as the algebra of functions on a “noncommutative space”, so perhaps geometric quantization could be made into a way of constructing a noncommutative geometry from a classical geometry. However, as it stands, quantization is only a procedure for constructing an algebra. Since the algebra of continuous (or smooth) functions contains only the information of the point-set (or differential) topology of a space, this is merely the quantization of topology. It would be desirable to extend quantization to a theory of quantization of geometry.

Beyond topology, vector bundles are arguably the second most fundamental structure in geometry, so a plausible first step towards a theory of quantizing geometry would be a theory of quantizing vector bundles. I began constructing such a theory in [11] by giving a definition of vector bundle quantization and a procedure for quantizing the equivariant vector bundles over coadjoint orbits of compact, semi-simple Lie groups. I continue the story in this paper by giving a procedure for quantizing arbitrary smooth vector bundles over compact Kähler manifolds. The construction depends only on the structures used to quantize the manifold, the vector bundle itself, and a connection on the vector bundle.
As to using deformation quantization for something like its intended purpose, I believe that it should be interpreted as describing the asymptotic behavior of a more concrete structure, such as that produced by geometric quantization. Any result concerning deformation quantization should then have implications for this concrete version of quantization. Essential to exploiting this is an understanding of the relationship between deformation quantization and geometric quantization. In principle, any geometric quantization can be viewed in terms of an $\hbar$-dependent product which can then be asymptotically expanded to yield a deformation quantization. Since this procedure is laborious at best, a shortcut to understanding what deformation quantization this gives is desirable. This is achieved as a byproduct of the quantization of vector bundles by taking advantage of an index theorem in deformation quantization theory.

This paper previously appeared in two parts as [12] and [13].

2. Generalities

Most of the symbols defined in this section will be defined again through constructions in Sects. 3 and 4. The theorems of Sect. 4 will show that these constructions actually do satisfy the original definitions.

Recall that a continuous field of C∗-algebras (see [5, 14]) is the natural notion of a bundle of C∗-algebras. The fibers are all C∗-algebras, the space of (continuous) sections is a C∗-algebra, and for each point of the base space there is an evaluation map, a ∗-homomorphism of the algebra of sections onto the fiber algebra.

By the most general definition (see [15, 16, 20]), a strict deformation quantization of a (Poisson) manifold $M$ consists of a continuous field of C∗-algebras, $A_{\hat I}$, and a (total)
quantization map. Conventionally, the base space $\hat I$ of the continuous field is the set of possible values of $\hbar$; more generally, it is just some set containing an accumulation point $\infty \in \hat I$ which plays the role of $\hbar = 0$. The fiber of the continuous field at this “classical limit” point is $C(M)$, the C∗-algebra of continuous functions on $M$.

Definition. $A$ is the C∗-algebra of continuous sections of $A_{\hat I}$. $P : A \twoheadrightarrow C(M)$ is the evaluation homomorphism at $\infty \in \hat I$.

In most of this paper the quantization map is a map $Q : C(M) \to A$; more generally, the domain of $Q$ may be only a dense subalgebra of $C(M)$, but it must contain the smooth functions $C^\infty(M)$. The composition $P \circ Q$ is required to be the identity map; that is, applying the quantization map to a function $f \in C(M)$ gives a continuous section of $A_{\hat I}$ whose value at $\infty \in \hat I$ is $f$. Finally, $Q$ is required to commute with the involution (∗-structure) and satisfy a commutation relation corresponding to Eq. (1.1).

Specifying the total quantization map, $Q$, is equivalent to specifying, for each point $i \in I := \hat I \setminus \{\infty\}$, a map $Q_i$ to $A_i$ which is just $Q$ composed with the evaluation at $i$. It may be possible to reconstruct the continuous field $A_{\hat I}$ from this system of quantization maps. The total quantization map $Q$ can be reconstructed as the direct product of the $Q_i$’s. The codomain of this reconstructed quantization map is superficially the C∗-algebraic direct product $\prod_{i \in I}A_i$; however, the image of $Q$ is actually contained in $A \subset \prod_{i \in I}A_i$ and will usually generate $A$ (as a C∗-algebra). Given a proposed system of quantization maps $Q_i$, it is a nontrivial convergence condition that the image of $Q$ consists of sections of some continuous field.

For some purposes, including defining quantization of a vector bundle, the continuous field $A_{\hat I}$ is the only important structure. For this reason, in [11], I gave the following:

Definition. A general quantization of $M$ is a continuous field $A_{\hat I}$ with $C(M)$ as the fiber over $\infty$.
This is just enough structure to define whether or not a sequence of operators converges to a given function on $M$. Note that this is not concerned with the commutation relation (1.1).

The context of this paper is the geometric quantization of compact Kähler manifolds. In this case $\hat I = \hat{\mathbb{N}} := \{1, 2, \dots, \infty\}$, the 1-point compactification of the positive integers. The algebras $A_N$ for each $N \in \mathbb{N}$ are (finite-dimensional) matrix algebras, and if $M$ is connected, they are simple (i. e., “full”) matrix algebras.

Assume for simplicity that $M$ is connected. Then, all the information of the general quantization is contained in $A$ and $P$. The index set $\hat{\mathbb{N}}$ can be recovered as the spectrum of the center of $A$. The ideal $A_0 := \ker P$ is the algebra of sections of $A_{\hat{\mathbb{N}}}$ that vanish at $\infty$. Equivalently, $A_0$ consists of those sections over $\mathbb{N}$ for which the sequence of norms converges to 0. Since $\mathbb{N}$ is discrete, $A_0$ is just the C∗-algebraic direct sum $\bigoplus_{N \in \mathbb{N}}A_N$.

I shall be concerned with two choices of quantization maps here. The first are the Toeplitz quantization maps $T_N$; these are manifestly completely positive and thus defined on all of $C(M)$. The second type are the geometric quantization maps $Q_N$; as I shall show in Thm. 6.2, these correspond to the same general quantization as the Toeplitz quantization maps do.

Note that a general quantization can be phrased as an extension
$$0 \to A_0 \longrightarrow A \xrightarrow{\ P\ } C(M) \to 0. \tag{2.1}$$
The total Toeplitz quantization map $T : C(M) \to A$ gives a completely positive splitting of (2.1).

2.1. Quantized vector bundles. The category equivalence of vector bundles over $M$ with finitely generated projective (finite-projective) modules of $C(M)$ is well known. The $C(M)$-module corresponding to a vector bundle $V$ over $M$ is the space of continuous sections $\Gamma(M, V)$. This suggests the following definition (see [11]).

Definition. Given a general quantization, expressed as $P : A \twoheadrightarrow C(M)$, a quantization of a vector bundle $V$ is any finite-projective $A$-module, $\mathbb{V}$, such that the push-forward by $P$ is $P_*(\mathbb{V}) = \Gamma(M, V)$.

For every $i \in \hat I$, pushing $\mathbb{V}$ forward by the evaluation homomorphism gives a module $\mathbb{V}_i$ of $A_i$. The $A$-module $\mathbb{V}$ is equivalent to a bundle of modules over $\hat I$ whose fiber over $i$ is $\mathbb{V}_i$.

It is not obvious a priori that any quantization of $V$ will exist, or that it will be at all unique. To investigate these issues, it is helpful to consider K-theory; the group $K^0(M)$ classifies vector bundles; the group $K_0(A)$ classifies finite-projective $A$-modules (which are the quantized vector bundles). The short exact sequence (2.1) leads, as usual, to a six-term, periodic exact sequence in K-theory; incorporating the identity $K_*[C(M)] = K^*(M)$, this reads,
$$\begin{array}{ccccc}
K_0(A_0) & \xrightarrow{\ \beta\ } & K_0(A) & \longrightarrow & K^0(M)\\
{\scriptstyle\alpha}\big\uparrow & & & & \big\downarrow\\
K^1(M) & \longleftarrow & K_1(A) & \longleftarrow & K_1(A_0).
\end{array} \tag{2.2}$$
Assume that $I$ is discrete, and the $A_i$’s are full matrix algebras. Then $A_0$ is just the direct sum $\bigoplus_iA_i$, and the K-groups are direct sums $K_*(A_0) = \bigoplus_iK_*(A_i)$. Since each $A_i$ is a full matrix algebra, its K-theory is $K_0(A_i) \cong \mathbb{Z}$ and $K_1(A_i) = 0$. This gives that $K_0(A_0) \cong \mathbb{Z}^{\oplus\infty}$ and $K_1(A_0) = 0$. The algebraic direct sum $\mathbb{Z}^{\oplus\infty}$ is the set of sequences of integers with finitely many nonzero terms. This corresponds to the fact that a finite-projective module of $A_0$ is a direct sum of (finitely generated) modules of finitely many $A_i$’s.

Any $A_0$-module is also an $A$-module; the finite-projective $A_0$-modules are precisely those $A$-modules which are finite-dimensional as vector spaces. This identification corresponds to the map $\beta$ in (2.2), and shows that $\beta$ must be injective, which, by exactness, shows that the map $\alpha$ is zero. With this, the exact sequence (2.2) breaks down into the isomorphism $K_1(P) : K_1(A) \xrightarrow{\ \sim\ } K^1(M)$ and the short exact sequence
$$0 \to \mathbb{Z}^{\oplus\infty} \longrightarrow K_0(A) \xrightarrow{\ K_0(P)\ } K^0(M) \to 0. \tag{2.3}$$
This $K_0(P)$ maps the K-class of a quantization of a vector bundle, $V$, to the K-class of $V$. The exact sequence (2.3) shows that $K_0(P)$ is surjective, which suggests that any vector bundle can be quantized.
Theorem 2.1. For a general quantization $A_{\hat I}$ such that $I$ is discrete and the fibers over $I$ are matrix algebras, every vector bundle can be quantized.

Proof. Any vector bundle can be realized as the image of some idempotent matrix of continuous functions $e \in M_m[C(M)]$. The ideal $A_0$ consists of all compact operators in $A$, so any element of the preimage $P^{-1}(e)$ is “essentially” idempotent (i. e., modulo compacts). There therefore exists an idempotent $\tilde e$ such that $P(\tilde e) = e$. The right image $A^m\tilde e$ is a quantization of the vector bundle in question.

The short exact sequence (2.3) also shows that $\ker K_0(P) = \mathbb{Z}^{\oplus\infty}$. This means that the K-class of a quantization of $V$ is uniquely determined by $V$, modulo $\mathbb{Z}^{\oplus\infty}$. This suggests:

Theorem 2.2. With the hypothesis of Theorem 2.1, quantization of a vector bundle is unique modulo finite-dimensional modules.

Proof. We need to prove that if $\mathbb{V}$ and $\mathbb{V}'$ are quantizations of $V$, then there exists a module homomorphism from $\mathbb{V}$ to $\mathbb{V}'$ whose kernel and cokernel are finite-dimensional (in other words, a Fredholm homomorphism). Any finite-projective module can be realized as the (right) image of an idempotent matrix over $A$. So, identify $\mathbb{V}$ and $\mathbb{V}'$ with the images of idempotents $e, e' \in M_m(A)$; that is, $\mathbb{V} = A^me$ and $\mathbb{V}' = A^me'$. These idempotents can be chosen so that $P(e) = P(e')$; therefore, $e - e' \in M_m(A_0)$. Multiplication by $e'$ (respectively, $e$) gives a homomorphism $\varphi : \mathbb{V} \xrightarrow{\cdot e'} \mathbb{V}'$ (resp., $\varphi' : \mathbb{V}' \xrightarrow{\cdot e} \mathbb{V}$). Let $k$ be the self-adjoint idempotent whose (right) image is $\ker\varphi' \circ \varphi$. Since this is a subspace of $\mathbb{V}$, $k$ satisfies $ke = k$, and since this is the kernel of $\varphi' \circ \varphi$, $k$ satisfies $ke'e = 0$. A priori $k$ is not necessarily in $M_m(A)$. However, the entries of $k$ are in the C∗-algebraic direct product of the $A_i$’s; i. e., bounded sections of $A_{\hat I}$ over $I$. $A_0$ is an ideal in this algebra, so $k = k(e - e')e \in M_m(A_0)$. Therefore, $\ker\varphi' \circ \varphi$ is a finite-projective module of $A_0$ and is thus finite-dimensional. By an identical argument, $\ker\varphi \circ \varphi'$ is also finite-dimensional.
This implies that the kernel and cokernel of $\varphi$ are finite-dimensional.

The converse is clearly also true: If $\mathbb{V}$ is a quantization of $V$, and $\mathbb{V}'$ is isomorphic to $\mathbb{V}$ modulo finite-dimensional modules, then $\mathbb{V}'$ is also a quantization of $V$.

3. Toeplitz Quantization

Again, let $M$ be a compact, connected Kähler manifold. Now, let $L$ be a Hermitian line bundle with curvature given by the symplectic form as $\nabla^2 = -i\omega$ and $L_0$ a holomorphic line bundle with an inner product on sections (i. e., a pre-Hilbert structure on $\Gamma(M, L_0)$).

Definition. For each $N \in \mathbb{N}$, $L_N := L_0 \otimes L^{\otimes N}$, $H_N := \Gamma_{\mathrm{hol}}(M, L_N)$ (the space of holomorphic sections of $L_N$), and $A_N := \operatorname{End}H_N$ (matrices over $H_N$).

The inner products on sections of $L_0$ and fibers of $L$ combine to give an inner product on sections of $L_N$. This makes $H_N$ into a Hilbert space; it does not need to be completed, since it is finite-dimensional. The Hilbert space structure of $H_N$ makes $A_N$ a C∗-algebra.

The connections and inner products must be compatible. For convenience, assume that $L_1$ (and thus any $L_N$, $N \in \mathbb{N}$) is “positive” (if not, just reparameterize $N$). This guarantees that $A_N$ is nontrivial for all $N \in \mathbb{N}$. The simplest choice of $L_0$ is just the trivial line
bundle with the trivial connection and the inner product given by integrating with the canonical volume form $\omega^n/n!$ ($n := \dim_{\mathbb{C}}M$).

The space $H_N$ is naturally a Hilbert subspace of $L^2(M, L_N)$ which is a subspace of the Hilbert space of $(0, *)$-forms with coefficients in $L_N$.

Definition. Let $\Pi_N$ be the self-adjoint projection onto $H_N$. The Toeplitz quantization map $T_N : C(M) \to A_N$ is given by
$$T_N(f) := \Pi_Nf. \tag{3.1}$$
In other words, the action of $T_N(f)$ on an element of $H_N$ (a holomorphic section of $L_N$) is given by first multiplying by $f$ (giving a non-holomorphic section) and then projecting back to $H_N$ by $\Pi_N$. This $T_N$ is automatically a unital and (completely) positive map; therefore, it is norm-contracting.
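As a concrete illustration of the Toeplitz map, the sketch below works in the simplest standard model, which is an assumption made for this example and is not taken from the text: $M = \mathbb{CP}^1$ with the Fubini–Study form, $L_0$ trivial and $L = O(1)$, so that $H_N$ is spanned by the monomials $z^0, \dots, z^N$. In the variable $t = |z|^2$ the monomials are orthogonal with $\langle z^k, z^k\rangle \propto \int_0^\infty t^k(1+t)^{-(N+2)}\,dt$, a Beta integral, and the height function is $x_3 = (t-1)/(t+1)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def beta(a, b):
    """Euler Beta function B(a, b) for positive integers, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def toeplitz_height_diagonal(N):
    """Diagonal entries of T_N(x3) in the monomial basis z^0, ..., z^N.

    T_N(x3) is diagonal in this basis because x3 is rotation invariant.
    """
    entries = []
    for k in range(N + 1):
        norm_sq = beta(k + 1, N - k + 1)  # <z^k, z^k>, up to a common constant
        # <z^k, x3 z^k> = int t^k (t - 1) (1 + t)^(-(N + 3)) dt, same constant
        pairing = beta(k + 2, N - k + 1) - beta(k + 1, N - k + 2)
        entries.append(pairing / norm_sq)
    return entries


def operator_norm(diagonal):
    """Operator norm of a diagonal matrix."""
    return max(abs(d) for d in diagonal)


for N in (1, 5, 50):
    print(N, operator_norm(toeplitz_height_diagonal(N)))
```

In this model the entries work out to $(2k - N)/(N + 2)$, so the operator norm is $N/(N + 2)$, which never exceeds the sup norm $\|x_3\|_\infty = 1$, as the norm-contracting property requires.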
3.1. Vector bundles. Suppose that we are given a smooth vector bundle $V$ with a specific connection. I would like to construct from $V$ a sequence of $A_N$ modules.

The algebra $A_N$ can be written as $A_N = \operatorname{End}H_N = \operatorname{Hom}(H_N, H_N)$ and can be thought of as consisting of square matrices of height and width $H_N$. Any module of $A_N$ can be written as $\operatorname{Hom}(E, H_N)$ and thought of as consisting of rectangular matrices of height $H_N$ and width $E$. Any construction for $E$ should generalize that of $H_N$.

Thanks to the Kodaira vanishing theorem (see the appendix and e. g., [9]) and the assumption that $L_1$ is positive, $H_N = \Gamma_{\mathrm{hol}}(M, L_N)$ can also be realized as the kernel of the $L_N$-twisted Dolbeault operator which acts on $\Omega^{0,*}(M, L_N)$. In order to generalize $H_N$ appropriately, we will need:

Definition. $D_V := \nabla_{\bar\partial} + (\nabla_{\bar\partial})^* = i\gamma^\mu\nabla_\mu$ is the $V^* \otimes L_N$-twisted Dolbeault operator, a Dirac-type operator acting on the smooth $(0, *)$-forms, $\Omega^{0,*}(M, V^* \otimes L_N)$.

Here, $\nabla$ is the connection, and the Dirac matrices satisfy $[\gamma^\mu, \gamma^\nu]_+ = \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = g^{\mu\nu}$, which differs from the usual convention by a factor of 2. A number of inequalities related to Dolbeault operators will prove useful, but the proofs of these are relegated to the appendix.

Natural generalizations of $A_N$ and $T_N$ are,

Definition. $\tilde V_N := \operatorname{Hom}(\tilde E_N^V, H_N)$, where $\tilde E_N^V = \ker D_V$. The map $T_N^V : \Gamma(M, V) \to \tilde V_N$ is given by
$$T_N^V(v) := \Pi_Nv. \tag{3.2}$$
In this, multiplication must be understood to mean contraction of $V$ with $V^*$. Multiplying an element of $\tilde E_N^V \subset \Omega^{0,*}(M, V^* \otimes L_N)$ by $v$ gives an element of $\Omega^{0,*}(M, L_N)$; $\Pi_N$ then projects this down to $H_N$. If $V$ is trivial (i. e., $V = \mathbb{C} \times M$ with the trivial connection), then $T_N^V$ reduces to $T_N$. The tildes will be dispensed with by the end of the next section.
4. Convergence

If $V$ has an inner product (or if we assign one) then there is a natural operator norm on $\tilde V_N$ which generalizes the norm on $A_N$. This corresponds to the norm $\|v\| := \sup_{x \in M}\|v(x)\|$ on sections. None of the constructions here will require an inner product on $V$; however, several of the proofs will make use of one, which can be taken arbitrarily. The main property of the maps $T_N$ and $T_N^V$ that is needed to prove convergence of the quantization of both the algebra and vector bundles is

Lemma 4.1. For any function $f \in C(M)$ and any section $v \in \Gamma(M, V)$,
$$\lim_{N\to\infty}\|T_N(f)T_N^V(v) - T_N^V(fv)\| = 0.$$

Proof. Let $D$ be the $L_N$-twisted Dolbeault operator, so that $H_N = \ker D$. We can approximate $\Pi_N$ by $(1 + \alpha D^2)^{-1}$ with $\alpha$ a positive real number; in fact, by Lemma A.1 (in the appendix) there exists a constant $C < 1$ such that
$$\operatorname{Spec}D^2 \subset \{0\} \cup [N - C, \infty), \tag{4.1}$$
so the error in $\Pi_N \approx (1 + \alpha D^2)^{-1}$ is bounded as
$$\|\Pi_N - (1 + \alpha D^2)^{-1}\| \le (1 + \alpha[N - C])^{-1} \le \alpha^{-1}(N - C)^{-1}.$$
Now, for $f \in C^\infty(M)$ any smooth function, the commutator $[f, \Pi_N]_-$ is approximated by
$$[f, (1 + \alpha D^2)^{-1}]_- = \alpha(1 + \alpha D^2)^{-1}[D^2, f]_-(1 + \alpha D^2)^{-1} = i\alpha(1 + \alpha D^2)^{-1}[D, \gamma^\mu f_{|\mu}]_+(1 + \alpha D^2)^{-1}.$$
Another consideration of (4.1) shows that
$$\|(1 + \alpha D^2)^{-1}D\| \le \frac{(N - C)^{1/2}}{1 + \alpha(N - C)} \le \alpha^{-1}(N - C)^{-1/2}.$$
So,
$$\|[f, (1 + \alpha D^2)^{-1}]_-\| \le 2(N - C)^{-1/2}\|\gamma^\mu f_{|\mu}\| = \sqrt{2}\,(N - C)^{-1/2}\|\nabla f\|.$$
This gives that
$$\|[f, \Pi_N]_-\| \le \sqrt{2}\,(N - C)^{-1/2}\|\nabla f\| + 2\alpha^{-1}(N - C)^{-1}\|f\|,$$
for any $\alpha > 0$, and therefore,
$$\|[f, \Pi_N]_-\| \le \sqrt{2}\,(N - C)^{-1/2}\|\nabla f\|.$$
With a slight abuse of notation,
$$T_N(f)T_N^V(v) - T_N^V(fv) = T_N^V(f\Pi_Nv - fv) = T_N^V([f, \Pi_N]_-v).$$
By construction, $T_N^V$ is norm-contracting; thus,
$$\|T_N(f)T_N^V(v) - T_N^V(fv)\| \le \sqrt{2}\,(N - C)^{-1/2}\|\nabla f\|\,\|v\|.$$
So $\|T_N(f)T_N^V(v) - T_N^V(fv)\| \to 0$ as $N \to \infty$. Since $T_N$ and $T_N^V$ are norm-contracting, they are continuous, and since $C^\infty(M) \subset C(M)$ is a dense subalgebra, the conclusion holds for all $f \in C(M)$.
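The decay in Lemma 4.1 can be observed exactly in a small model. The sketch below assumes the same standard example as a test case (this choice is not taken from the text): $M = \mathbb{CP}^1$, $L = O(1)$, $L_0$ and $V$ trivial, so that $T_N^V = T_N$ and $H_N$ is spanned by $z^0, \dots, z^N$ with $\langle z^k, z^k\rangle \propto \int_0^\infty t^k(1+t)^{-(N+2)}\,dt$ in the variable $t = |z|^2$. It takes $f = v = x_3 = (t-1)/(t+1)$ and computes $\|T_N(f)T_N(v) - T_N(fv)\|$ in exact rational arithmetic. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def beta(a, b):
    """Euler Beta function B(a, b) for positive integers, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def toeplitz_diagonal(N, m):
    """Diagonal of T_N(x3**m) in the basis z^0, ..., z^N, with x3 = (t-1)/(t+1)."""
    out = []
    for k in range(N + 1):
        norm_sq = beta(k + 1, N - k + 1)
        # Expand (t - 1)^m binomially; every term is a Beta integral
        # int t^(k+j) (1 + t)^(-(N + 2 + m)) dt = B(k+j+1, N+m-k-j+1).
        pairing = sum(
            comb(m, j) * (-1) ** (m - j) * beta(k + j + 1, N + m - k - j + 1)
            for j in range(m + 1)
        )
        out.append(pairing / norm_sq)
    return out


def multiplicativity_defect(N):
    """Operator norm of T_N(x3) T_N(x3) - T_N(x3^2); all three are diagonal."""
    d1 = toeplitz_diagonal(N, 1)
    d2 = toeplitz_diagonal(N, 2)
    return max(abs(a * a - b) for a, b in zip(d1, d2))


for N in (10, 100, 400):
    print(N, float(multiplicativity_defect(N)))
```

In this model the defect equals $1/(N + 3)$ for even $N$, so it decays like $1/N$. That is faster than the $(N - C)^{-1/2}$ rate in the proof, which is an upper bound valid for every smooth $f$ rather than a sharp rate for this particular function.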
Definition. The total Toeplitz quantization map $T : C(M) \to \prod_{N \in \mathbb{N}}A_N$ is the direct product of the $T_N$’s. Also $A_0 := \bigoplus_{N \in \mathbb{N}}A_N$ is the C∗-algebraic direct sum, and $A := \operatorname{Im}T + A_0$.

Lemma 4.2. $A$ is a C∗-algebra, and $T$ induces an isomorphism $C(M) \xrightarrow{\ \sim\ } A/A_0$.

Proof. Lemma 4.1 is in this case equivalent to the statement that for any functions $f, g \in C(M)$,
$$T(f)T(g) - T(fg) \in A_0. \tag{4.2}$$
The direct sum $A_0$ is an ideal in the direct product $\prod_{N \in \mathbb{N}}A_N$, so (4.2) shows that $A$ is algebraically closed. Since $T$ is norm-contracting, $\operatorname{Im}T$ is norm-closed, and so $A$ is norm-closed. Hence, $A$ is a C∗-algebra.

Equation (4.2) also shows that $T$ induces (by composition with the quotient map $A \twoheadrightarrow A/A_0$) a ∗-homomorphism $C(M) \to A/A_0$. This is surjective because of the definition of $A$. We need to verify that it is injective.

Since $A$ lies inside the direct product of the $A_N$’s, there is for each $N \in \mathbb{N}$ an obvious “evaluation” homomorphism $P_N : A \twoheadrightarrow A_N$. Define the normalized partial traces $\widetilde{\operatorname{tr}}_N : A \to \mathbb{C}$ by $\widetilde{\operatorname{tr}}_Na := \operatorname{tr}[P_N(a)]/\dim H_N$, so that $\widetilde{\operatorname{tr}}_N1 = 1$. The normalized trace is norm-contracting, so any $a \in A$ satisfies $|\widetilde{\operatorname{tr}}_Na| \le \|P_N(a)\|$; therefore,
$$a \in A_0 \implies \lim_{N\to\infty}\widetilde{\operatorname{tr}}_Na = 0. \tag{4.3}$$
Note that for any $f \in C(M)$, the (unnormalized) trace of $T_N(f)$ can be expressed as
$$\operatorname{tr}[T_N(f)] = \operatorname{Tr}[\Pi_Nf] = \lim_{t\to\infty}\operatorname{Tr}[e^{-tD^2}f].$$
Using the asymptotic expansion for $e^{-tD^2}$ (the “heat kernel expansion”, see [8]), this can be evaluated explicitly as a polynomial in the curvatures of $TM$ and $L_N$. The result is a polynomial in $N$ with leading order term
$$\Big(\frac{N}{2\pi}\Big)^n\int_Mf\,\frac{\omega^n}{n!}.$$
This, with (4.3), shows that $\widetilde{\operatorname{tr}}_\infty := \lim_{N\to\infty}\widetilde{\operatorname{tr}}_N$ is well-defined on $A$, vanishes on $A_0$ and satisfies
$$\widetilde{\operatorname{tr}}_\infty[T(f)] = \frac{1}{\operatorname{vol}M}\int_Mf\,\frac{\omega^n}{n!}.$$
Suppose that some function $f$ is in the kernel of the induced homomorphism $C(M) \to A/A_0$, or equivalently that $T(f) \in A_0$. The kernel of a ∗-homomorphism is spanned by its positive elements, so we can assume without loss of generality that $f \ge 0$. This implies that $0 = \widetilde{\operatorname{tr}}_\infty[T(f)] \propto \int_Mf\,\omega^n/n!$, but since $\omega^n$ is nonvanishing this implies $f = 0$. So, the homomorphism is injective and thus an isomorphism.

Definition. $P : A \twoheadrightarrow C(M)$ is the composition of the natural surjection $A \twoheadrightarrow A/A_0$ with the inverse of the isomorphism induced by $T$.

The following shows that $A$ indeed gives a general quantization of $M$.
Theorem 4.3. There is a continuous field of C$^*$-algebras over $\hat{\mathbb{N}}$ such that the fiber over $N \in \mathbb{N}$ is $A_N$, the fiber over $\infty$ is $C(M)$, and the algebra of continuous sections is $\mathbb{A}$.

Proof. Let $\mathcal{P}_N : \mathbb{A} \to A_N$ denote the evaluation map at $N$. Most of the axioms given in [5] for a continuous field of C$^*$-algebras are easily verified. The nontrivial axiom is the requirement that for any $a \in \mathbb{A}$ the norms $\|\mathcal{P}_N(a)\|$ define a continuous function on $\hat{\mathbb{N}}$. Since continuity is only an issue at $\infty \in \hat{\mathbb{N}}$, this reduces to the requirement that $\|\mathcal{P}_N(a)\| \to \|\mathcal{P}(a)\|$ when $N \to \infty$. It is sufficient to prove this on $\operatorname{Im} T$; in other words, we need to show that for any $f \in C(M)$,
$$\lim_{N\to\infty} \|T_N(f)\| = \|f\|. \tag{4.4}$$
The spectrum $\hat{\mathbb{A}}$ (of irreducible representations, see 3.2.2 of [5]) is a non-Hausdorff union of $M$ and $\mathbb{N}$, although it maps continuously onto $\hat{\mathbb{N}}$. According to Prop. 3.3.2 of [5], the function on $\hat{\mathbb{A}}$ defined by the norms of the images of any $a \in \mathbb{A}$ is lower semi-continuous. This means that for any $x \in M$,
$$\liminf_{N\to\infty} \|T_N(f)\| \ge |f(x)|, \quad\text{and hence}\quad \liminf_{N\to\infty} \|T_N(f)\| \ge \|f\|.$$
On the other hand, because $T_N$ is norm-contracting,
$$\limsup_{N\to\infty} \|T_N(f)\| \le \sup_{N\in\mathbb{N}} \|T_N(f)\| \le \|f\|.$$

Each of the maps $T_N$ is surjective (Prop. 4.2 of [2]), so this is clearly the smallest continuous field such that $N \mapsto T_N(f)$ defines a continuous section. In fact, $\operatorname{Im} T$ generates $\mathbb{A}$ as a C$^*$-algebra.

Definition. $T^V : \Gamma(M,V) \to \prod_{N\in\mathbb{N}} \tilde{\mathbb{V}}_N$ is the direct product of the $T^V_N$'s, $\mathbb{V} := \mathbb{A}\cdot\operatorname{Im} T^V$ is the $\mathbb{A}$-module generated by $\operatorname{Im} T^V$, $\mathbb{V}_N$ is the restriction of $\mathbb{V}$ to an $A_N$-module, and $\mathcal{P}^V : \mathbb{V} \twoheadrightarrow \mathbb{V}/\mathbb{A}_0\mathbb{V} = \mathcal{P}_*(\mathbb{V})$ is the natural surjection.

The following lemma shows that the analytic condition $\|v_N\| \to 0$ can be expressed algebraically.

Lemma 4.4.
$$\mathbb{A}_0\mathbb{V} = \Bigl\{ v \in \prod_{N\in\mathbb{N}} \mathbb{V}_N \;\Bigm|\; \lim_{N\to\infty} \|v_N\| = 0 \Bigr\},$$
and $T^V$ induces a homomorphism of $C(M)$-modules, $\mathcal{P}^V \circ T^V : \Gamma(M,V) \twoheadrightarrow \mathcal{P}_*(\mathbb{V})$.
Proof. As I have mentioned, $T^V_N$ is norm-contracting; thus $T^V(v)$ is bounded. Because of this, the sequence of norms coming from any element of $\mathbb{A}_0\mathbb{V} = \mathbb{A}_0 \operatorname{Im} T^V$ must converge to 0. Conversely, $\mathbb{A}_0\mathbb{V}$ is norm-closed and contains all sequences in $\prod_{N\in\mathbb{N}} \mathbb{V}_N$ with finitely many nonzero terms. This proves the first claim. With this, Lemma 4.1 shows that for all $f \in C(M)$ and $v \in \Gamma(M,V)$, $T(f)T^V(v) - T^V(fv) \in \mathbb{A}_0\mathbb{V}$, which proves the second claim.

Lemma 4.5. For $N$ sufficiently large, $\mathbb{V}_N = \tilde{\mathbb{V}}_N$.

Proof. Equivalently, for sufficiently large $N$, the image of $T^V_N$ generates $\tilde{\mathbb{V}}_N$ as an $A_N$-module. If not, then $\operatorname{Im} T^V_N$ must lie inside a proper submodule of $\tilde{\mathbb{V}}_N$, and so there must exist $\psi \in \tilde{E}^V_N$ such that, for any $\varphi \in \mathcal{H}_N$ and $v \in \Gamma(M,V)$, $\langle\varphi|T^V_N(v)|\psi\rangle = 0$.

Take any nonzero $\varphi \in \mathcal{H}_N$ and $\psi \in \tilde{E}^V_N$. Let $\psi_0 \in \Gamma^\infty(M, V^*\otimes L_N)$ be the component of $\psi$ in degree 0. Assume $N$ to be sufficiently large that $\psi_0$ is guaranteed by Corollary A.2 not to vanish. Using the fact that $M$ is connected, the zeros of $\varphi$ must form a proper subvariety of $M$, and $\psi_0$ must be nonzero on an open set; therefore, there exists $y \in M$ where $\varphi(y), \psi_0(y) \ne 0$. If $v \in \Gamma(M,V)$ approximates the distribution $\varphi(y)\bar\psi_0(y)\delta(x,y)$, then $\langle\varphi|T^V_N(v)|\psi\rangle = \langle\varphi|v|\psi\rangle$ will approximate $\|\varphi(y)\|^2\|\psi_0(y)\|^2$ and must be nonzero for a sufficiently close approximation.

4.1. Category. It remains to be proven that $\mathbb{V}$ is a finitely generated, projective (finite-projective) module and that its push-forward is $\mathcal{P}_*(\mathbb{V}) = \Gamma(M,V)$. To do this, it will be helpful to make the correspondence $V \mapsto \mathbb{V}$ into a functor. Since the module $\mathbb{V}$ is not constructed from the vector bundle $V$ alone but from $V$ accompanied by a connection, the domain of this functor must be a category of vector bundles with connections.

We need to identify those bundle homomorphisms which will lead naturally to module homomorphisms. A bundle homomorphism naturally defines a map of sections. It also naturally gives (by tensor product with the identity map) homomorphisms of the tensor products with any other bundle (such as 1-forms). For simplicity, I will denote all these trivially derived maps by the same symbol as the original homomorphism.

Definition. A morphism of bundles with connections, $\phi : V \to W$, is a smooth bundle homomorphism such that for any smooth section $v \in \Gamma^\infty(M,V)$,
$$\phi(\nabla_V v) = \nabla_W \phi(v); \tag{4.5}$$
in other words, $\phi$ is covariantly constant.

With these morphisms, vector bundles with connections form an Abelian category. Clearly, the identity homomorphism on any bundle satisfies the above property, and the composition of two such morphisms does as well. Also, the kernel and cokernel of such a morphism inherit natural connections.

Now, let us try to construct a functor $Q$ from this category of bundles with connections to the category of $\mathbb{A}$-modules, such that $Q(V) = \mathbb{V}$.
Definition / Theorem 4.6. Any morphism $\phi : V \to W$ of bundles with connections induces a homomorphism $Q(\phi) : \mathbb{V} \to \mathbb{W}$ of $\mathbb{A}$-modules which satisfies
$$T^W \circ \phi = Q(\phi) \circ T^V. \tag{4.6}$$

Proof. $\phi$ gives an adjoint map on the dual bundles in the opposite direction, and in turn maps the spaces of forms $\phi^* : \Omega^{0,*}(M, W^*\otimes L_N) \to \Omega^{0,*}(M, V^*\otimes L_N)$. Because $\phi$ intertwines connections, the map $\phi^*$ intertwines Dolbeault operators. If $\psi \in \ker D_W = \tilde{E}^W_N$, then $D_V\phi^*(\psi) = \phi^*(D_W\psi) = 0$; this means that the restriction of $\phi^*$ to $\tilde{E}^W_N$ maps $\phi^* : \tilde{E}^W_N \to \tilde{E}^V_N$. This induces a homomorphism $\phi_* : \tilde{\mathbb{V}}_N \supset \mathbb{V}_N \to \tilde{\mathbb{W}}_N$. Put these maps together to define $Q(\phi)$.

A priori, $Q(\phi)$ maps an element of $\mathbb{V}$ to some sequence of elements of $\tilde{\mathbb{W}}_N$. We need to prove that the image of $Q(\phi)$ in fact lies inside $\mathbb{W}$. For any $v \in \Gamma(M,V)$ and $\psi \in \tilde{E}^W_N$,
$$T^W_N[\phi(v)]\psi = \Pi_N\,\phi(v)\psi = \Pi_N\, v\,\phi^*(\psi) = \phi_*\bigl[T^V_N(v)\bigr]\psi.$$
So, $T^W_N \circ \phi = \phi_* \circ T^V_N$ and hence, Eq. (4.6). Since $\mathbb{W}$ is defined to be generated by $\operatorname{Im} T^W$, this shows that indeed $Q(\phi) : \mathbb{V} \to \mathbb{W}$.

$Q$ is an additive functor. It respects identity maps, compositions, and sums of morphisms. Because of this, $Q$ must respect finite direct sums; this property is also easily seen from the construction of $Q$.

The category of bundles with connections actually behaves somewhat trivially. Because a morphism is covariantly constant, it can be specified completely by its action at a single point. As a result, this category behaves somewhat like the category of finite-dimensional vector spaces. Any short exact sequence splits. Because of this, any additive functor (such as $Q$) on this category is exact.

Of course, not all bundle homomorphisms are morphisms of bundles with connections. We will need some module homomorphisms that do not come from $Q$. The following result shows that an isomorphism of vector bundles can be used to construct a homomorphism of modules which is an isomorphism modulo finite-dimensional modules.

Lemma 4.7. Let $V$ and $W$ be isomorphic bundles with different connections. Then there exists a Fredholm homomorphism $u : \mathbb{V} \to \mathbb{W}$ (compare Thm. 2.2), which satisfies
$$\mathcal{P}^W \circ u \circ T^V = \mathcal{P}^W \circ T^W. \tag{4.7}$$
Proof. The homomorphism $u$ is specified by giving, for each $N$, a homomorphism $u_N : \mathbb{V}_N \to \mathbb{W}_N$ of $A_N$-modules. Define $\Pi^V_N$ to be the spectral projection at 0 for the $V^*\otimes L_N$-twisted Dolbeault operator (likewise with $W$); that is, $\Pi^V_N$ is an idempotent with $\operatorname{Im}\Pi^V_N = \ker D_V$ and $\ker\Pi^V_N = \operatorname{Im} D_V$.

The isomorphism of $V$ and $W$ gives a natural (isometric) inclusion $\iota : \tilde{E}^W_N \hookrightarrow \Omega^{0,*}(M, V^*\otimes L_N)$. Composing this with $\Pi^V_N$ gives $\Pi^V_N\iota : \tilde{E}^W_N \to \tilde{E}^V_N$, and $\Pi^W_N\iota$ is the identity on $\tilde{E}^W_N$. According to Lemma A.3,
$$\lim_{N\to\infty} \bigl\|\Pi^V_N - \Pi^W_N\bigr\| = 0; \tag{A.4}$$
therefore, for $N$ sufficiently large, $\|\Pi^V_N - \Pi^W_N\| < 1$. When this is so, $\Pi^V_N\iota$ is injective, because if $\psi \in \tilde{E}^W_N$ is nonzero then
$$\bigl\|\Pi^V_N\iota\psi\bigr\| = \bigl\|(\Pi^V_N - \Pi^W_N)\iota\psi + \psi\bigr\| \ge \bigl(1 - \|\Pi^V_N - \Pi^W_N\|\bigr)\|\psi\| > 0.$$
The existence of a similar injection in the opposite direction establishes that $\Pi^V_N\iota$ is bijective.

Recall from Lemma 4.5 that for $N$ sufficiently large $\mathbb{V}_N = \tilde{\mathbb{V}}_N$ (and likewise with $W$). When $N$ is sufficiently large, we can define $u_N : \mathbb{V}_N \to \mathbb{W}_N$ to be the bijection given by $u_N(v_N) = v_N\Pi^V_N\iota$. For small $N$, it doesn't matter what $u_N$ is. Now assemble the $u_N$'s into $u$. The kernel and cokernel of $u$ come entirely from the finitely many $u_N$'s which are not bijective, and thus are finite-dimensional. In other words, $u$ is Fredholm.

But does the image in fact lie inside $\mathbb{W}$? Using Eq. (A.4) again shows that, for any $v \in \Gamma(M,V)$,
$$\bigl\|T^V_N(v)\Pi^V_N\iota - T^W_N(v)\bigr\| \le \bigl\|T^V_N(v)\bigr\|\,\bigl\|\Pi^V_N - \Pi^W_N\bigr\| \to 0$$
as $N \to \infty$; therefore, by Lemma 4.4, $u[T^V(v)] - T^W(v) \in \mathbb{A}_0\mathbb{W} \subset \mathbb{W}$.
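Theorem 4.8. $\mathbb{V} = Q(V)$ is a quantization of $V$ by the definition in Sect. 2.1.

Proof. For any vector bundle $V$, there exists another vector bundle $W$ such that the direct sum is some trivial bundle $V \oplus W \cong \mathbb{C}^m \times M$. Choose an arbitrary connection on $W$. As noted above, $Q$ respects finite direct sums, so $Q(V \oplus W) = \mathbb{V} \oplus \mathbb{W}$. By Lemma 4.7 there exists an $\mathbb{A}$-module homomorphism $u : \mathbb{V} \oplus \mathbb{W} \to \mathbb{A}^m$ whose kernel and cokernel are finite-dimensional and thus projective. All the terms of the exact sequence
$$0 \to \ker u \longrightarrow \mathbb{V} \oplus \mathbb{W} \xrightarrow{\;u\;} \mathbb{A}^m \longrightarrow \operatorname{coker} u \to 0,$$
other than $\mathbb{V} \oplus \mathbb{W}$, are now seen to be finite-projective modules. This sequence is thus split, and a diagram chase shows that $\mathbb{V}$ is projective.

It remains to prove that $\mathcal{P}_*(\mathbb{V}) = \mathbb{V}/\mathbb{A}_0\mathbb{V} = \Gamma(M,V)$. Lemma 4.4 showed that $\mathcal{P}^V \circ T^V : \Gamma(M,V) \twoheadrightarrow \mathcal{P}_*(\mathbb{V})$ is a $C(M)$-module homomorphism, and it is clearly surjective by the definition of $\mathbb{V}$. We need to prove that the kernel of $\mathcal{P}^V \circ T^V$ is trivial.

Let $\phi$ denote the natural inclusion $\phi : V \hookrightarrow V \oplus W$ (as bundles with connections) and $\varphi$ the equivalent inclusion $\varphi : V \hookrightarrow \mathbb{C}^m \times M$ (as a bundle). If $v \in \ker[\mathcal{P}^V \circ T^V]$, then $T^V(v) \in \mathbb{A}_0\mathbb{V}$, so $\mathcal{P} \circ u \circ Q(\phi) \circ T^V(v) = 0$, since $u \circ Q(\phi)$ is an $\mathbb{A}$-module homomorphism. However, by Eqs. (4.6) and (4.7),
$$\mathcal{P} \circ u \circ Q(\phi) \circ T^V = \mathcal{P} \circ u \circ T^{V\oplus W} \circ \phi = \mathcal{P} \circ T \circ \varphi = \varphi,$$
which is injective. Therefore, $\ker[\mathcal{P}^V \circ T^V] \subseteq \ker\varphi = 0$, and $\mathcal{P}^V \circ T^V : \Gamma(M,V) \xrightarrow{\;\sim\;} \mathcal{P}_*(\mathbb{V})$ is indeed an isomorphism.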
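The diagram chase in the first part of this proof can be spelled out. What follows is a sketch of the standard splitting argument, assuming (as in the proof) that $\ker u$ and $\operatorname{coker} u$ are finite-projective; $\cong$ denotes isomorphism of $\mathbb{A}$-modules.

```latex
\begin{align*}
\mathbb{A}^m &\cong \operatorname{Im} u \oplus \operatorname{coker} u
  && \text{(the surjection onto the projective module } \operatorname{coker} u \text{ splits)}, \\
\mathbb{V}\oplus\mathbb{W} &\cong \ker u \oplus \operatorname{Im} u
  && \text{(the surjection onto the projective module } \operatorname{Im} u \text{ splits)}.
\end{align*}
```

The first line exhibits $\operatorname{Im} u$ as a direct summand of the free module $\mathbb{A}^m$, so it is finite-projective; the second then shows that $\mathbb{V}\oplus\mathbb{W}$ is a direct sum of finite-projective modules, and $\mathbb{V}$, being a direct summand of it, is finite-projective as well.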
5. The Holomorphic Case

Recall that a holomorphic vector bundle is a bundle with a connection whose curvature is of type (1, 1).

Theorem 5.1. If $V$ is a holomorphic vector bundle, then for all $N \in \mathbb{N}$, $\mathbb{V}_N = \operatorname{Hom}(E^V_N, \mathcal{H}_N)$, where
$$E^V_N = \Gamma_{\mathrm{hol}}(M, V^*\otimes L_N),$$
and $\Gamma_{\mathrm{hol}}$ means holomorphic sections.

Proof. This is much the same as the proof of Lemma 4.5. In the holomorphic case, $D^2$ respects the $\mathbb{Z}$-grading, so $\tilde{E}^V_N$ (and thence $\tilde{\mathbb{V}}_N$) is $\mathbb{Z}$-graded. Sections of $V$, and thus $\operatorname{Im} T^V_N$, are entirely of degree 0, so $\mathbb{V}_N \subseteq \operatorname{Hom}(E^V_N, \mathcal{H}_N)$.

If the statement were false, then there would exist a nonzero $\psi \in E^V_N$ such that for any $\varphi \in \mathcal{H}_N$ and $v \in \Gamma(M,V)$, $\langle\varphi|v|\psi\rangle = 0$. However, if $\varphi \ne 0$ then the zero sets of $\varphi$ and $\psi$ will be proper subvarieties of $M$; therefore, there exists $y \in M$ where $\varphi(y) \ne 0$ and $\psi(y) \ne 0$. So, if $v(x)$ approximates the distribution $\varphi(y)\bar\psi(y)\delta(x,y)$, then $\langle\varphi|v|\psi\rangle$ will approximate $\|\varphi(y)\|^2\|\psi(y)\|^2$ and thus be nonzero for a sufficiently close approximation.

6. Geometric Quantization

Definition. The standard geometric quantization maps (see [24]) $Q_N : C^\infty(M) \to A_N$ are defined on smooth functions by (with a slight abuse of notation)
$$Q_N(f) := \Pi_N\bigl(f - \tfrac{i}{N}\pi^{\mu\nu} f_{|\mu}\nabla_\nu\bigr) = T_N\bigl(f - \tfrac{i}{N}\pi^{\mu\nu} f_{|\mu}\nabla_\nu\bigr). \tag{6.1}$$
Here $\pi$ is the Poisson bivector, defined by $\pi^{\mu\nu}\omega_{\lambda\nu} = \delta^\mu_\lambda$, and $\nabla$ is again the connection. Following $T^V_N$, there is an obvious generalization of $Q_N$ for vector bundles.

Definition. $Q^V_N : \Gamma^\infty(M,V) \to \tilde{\mathbb{V}}_N$ is given by
$$Q^V_N(v) := T^V_N\bigl(v - \tfrac{i}{N}\pi^{\mu\nu} v_{|\mu}\nabla_\nu\bigr).$$

Lemma 6.1. For any smooth section $v \in \Gamma^\infty(M,V)$,
$$\lim_{N\to\infty} \bigl\|T^V_N(v) - Q^V_N(v)\bigr\| = 0.$$
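Proof. Let $w^\mu$ be any tangent vector with components in $V$, and use $D$ to denote both the $L_N$-twisted and $V^*\otimes L_N$-twisted Dolbeault operators. Then,
$$-i[D, \gamma_\mu w^\mu]_+ = [\gamma^\nu\nabla_\nu, \gamma_\mu w^\mu]_+ = [\gamma^\nu, \gamma_\mu w^\mu]_+\nabla_\nu + \gamma^\nu[\nabla_\nu, \gamma_\mu w^\mu]_- = w^\mu\nabla_\mu + \gamma^\nu\gamma_\mu w^\mu{}_{|\nu}.$$
Because the argument of $T^V_N$ acts between the kernels of the Dolbeault operators, this gives the identity
$$0 = T^V_N\bigl([D, \gamma_\mu w^\mu]_+\bigr) = i\,T^V_N\bigl(w^\mu\nabla_\mu + \gamma^\nu\gamma_\mu w^\mu{}_{|\nu}\bigr).$$
Now, setting $w^\mu = -\tfrac{i}{N}\pi^{\nu\mu} v_{|\nu}$ gives
$$Q^V_N(v) - T^V_N(v) = \tfrac{i}{N}\,T^V_N\bigl(\gamma^\nu\gamma_\mu\pi^{\lambda\mu} v_{|\lambda\nu}\bigr). \tag{6.2}$$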
Since $T^V_N$ is norm-contracting, for any smooth $v$, the norm of the difference (6.2) converges to 0 as $N \to \infty$.

Equation (6.2) is related to a formula due to Tuynman [21]. Namely, for any smooth function $f \in C^\infty(M)$,
$$Q_N(f) = T_N\Bigl(f + \frac{1}{2N}\Delta(f)\Bigr), \tag{6.3}$$
where $\Delta = -\nabla^2$ is the scalar Laplacian. Since $\Delta$ is a positive operator, this shows that $Q_N$, like $T_N$, is (completely) positive, which means that, after all, $Q_N$ can be uniquely, continuously defined on all of $C(M)$.

As with Toeplitz quantization, we can assemble the $Q_N$'s into a direct-product map $Q : C^\infty(M) \to \prod_{N\in\mathbb{N}} A_N$.

Theorem 6.2. $Q : C(M) \to \mathbb{A}$ and $\mathcal{P} \circ Q = \mathrm{id}$.

Proof. By Lemma 6.1, for any smooth function $f$, $T(f) - Q(f) \in \mathbb{A}_0$. This shows that $\mathcal{P}[Q(f)] = f$. Since $\mathbb{A} = \mathcal{P}^{-1}[C(M)]$, this shows that $\operatorname{Im} Q \subset \mathbb{A}$.

This shows that the general quantization constructed by geometric quantization is exactly the same as that constructed by Toeplitz quantization. Analogous to the construction of $\mathbb{V}$, define $\mathcal{V}$ to be the $\mathbb{A}$-module generated by the image of $Q^V$.

Theorem 6.3. This $\mathcal{V}$ is a quantization of $V$.

Proof. It is sufficient to prove that $\mathcal{V}$ is isomorphic to $\mathbb{V}$ modulo finite-dimensional modules. Choose some set of smooth sections of $V$ such that their images by $T^V$ generate $\mathbb{V}$, and hence their images by $T^V_N$ generate $\mathbb{V}_N$. For sufficiently large $N$, their images by $Q^V_N$ will be close enough to those by $T^V_N$ to generate $\tilde{\mathbb{V}}_N = \mathbb{V}_N$. Therefore, for $N$ sufficiently large $\mathcal{V}_N = \mathbb{V}_N$.

7. Further Structures

Because $\mathbb{V}$ has been produced constructively from $V$ and its connection, essentially any additional structure that is consistent with the connection on $V$ should lift to $\mathbb{V}$. (This is equally true for $\mathcal{V}$.)

If there is a group $G$ acting on $M$, and $V$ is a $G$-equivariant vector bundle with an equivariant connection, then there will be a natural representation of $G$ on $\mathbb{V}$, and $T^V$ will be $G$-invariant. See also [11].

If $V$ has a given inner product and a compatible connection, then $\mathbb{V}$ will have a natural inner product, corresponding to the inner product of sections integrated against the volume form $\omega^n/n!$.
8. Formal Deformation Quantization

This section serves to fix notation and summarize some facts about deformation quantization. It contains no original results. For this section, $M$ is a symplectic manifold, not necessarily Kähler or compact.

A (formal) deformation quantization of $M$ (see [23]) is an algebra $\mathbb{A}_\hbar$ which (as a vector space) is identified with $C^\infty(M)[[\hbar]]$, the space of formal power series in $\hbar$ with coefficients in the smooth functions over $M$. Denote the $\mathbb{A}_\hbar$-product by $*_\hbar$ and the $C^\infty(M)[[\hbar]]$-product by apposition (e.g., $fg$). The product $*_\hbar$ is given by a formal power series
$$f *_\hbar g = fg + \sum_{k=1}^{\infty} (-i\hbar)^k \varphi_k(f,g). \tag{8.1}$$
This is required to be associative and $\mathbb{C}[[\hbar]]$-linear. It is also required to satisfy $f *_\hbar 1 = f$ and $f^* *_\hbar g^* = (g *_\hbar f)^*$, where the complex conjugate on series is such that $\hbar^* = \hbar$. The only condition involving the symplectic form is the restatement of Eq. (1.1),
$$f *_\hbar g - g *_\hbar f \equiv -i\hbar\{f,g\} \mod \hbar^2,$$
or equivalently $\varphi_1(f,g) - \varphi_1(g,f) = \{f,g\}$, where $\{f,g\}$ is the Poisson bracket. Finally there is a (perhaps unnecessary) locality condition that each $\varphi_k$ is a bidifferential operator.

The archetypal example of a deformation quantization is the Moyal–Weyl deformation on a symplectic vector space $M = \mathbb{R}^{2n}$. Let $m : C^\infty(\mathbb{R}^{2n}) \otimes C^\infty(\mathbb{R}^{2n}) \to C^\infty(\mathbb{R}^{2n})$ be the (ordinary) multiplication map, and $\pi$ the Poisson bivector, regarded as a differential operator acting on $C^\infty(\mathbb{R}^{2n}) \otimes C^\infty(\mathbb{R}^{2n})$ (so that $m \circ \pi(f \otimes g) = \{f,g\}$). The Weyl product is
$$f *_\hbar g := m \circ \exp\bigl[-\tfrac{i\hbar}{2}\pi\bigr](f \otimes g) = fg - \tfrac{i\hbar}{2}\{f,g\} + \dots. \tag{8.2}$$
The formal Weyl algebra $W_\hbar$ is related to this, essentially by taking germs of functions about $0 \in \mathbb{R}^{2n}$. It is constructed using formal power series $\mathbb{C}[[\mathbb{R}^{2n}]]$ in place of smooth functions; in other words, $W_\hbar$ is $\mathbb{C}[[\mathbb{R}^{2n}, \hbar]]$ with the product (8.2).

Over any manifold, we can construct a bundle $\mathbb{C}[[TM]]$ of formal power series over each fiber of the tangent bundle. A Leibniz connection over a bundle of algebras, such as this, is one satisfying the Leibniz (product) rule with respect to the product of sections. The constant sections of $\mathbb{C}[[TM]]$ with respect to a flat Leibniz connection are naturally identified with the smooth functions on $M$. The value of a section at some $x \in M$ is a Taylor expansion about $x$ of the corresponding function.

Every fiber of the tangent bundle of a symplectic manifold is a symplectic vector space. From this, we can construct a bundle $W_\hbar M$ of formal Weyl algebras such that the fiber over $x \in M$ is the formal Weyl algebra constructed on $T_xM$. Note that the order $\hbar^0$ part is just $W_\hbar M/\hbar = \mathbb{C}[[TM]]$. The structure Lie algebra for a Leibniz connection on $W_\hbar M$ is $\operatorname{der} W_\hbar$, the derivations of the typical fiber.

A flat, Leibniz connection $\nabla$ on $W_\hbar M$ is known as a Fedosov connection. The algebra $\mathbb{A}_\hbar$ of $\nabla$-constant sections of $W_\hbar M$ is a deformation quantization of $C^\infty(M)$. Fedosov connections always exist [6], and, in fact, any deformation quantization can be constructed in this way [17].
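As a quick check of the sign conventions in Eqs. (8.1) and (8.2), consider linear coordinate functions. This is a sketch assuming $M = \mathbb{R}^2$ with coordinates $q, p$ (names chosen here for illustration); because $q$ and $p$ are linear, all terms of order $\hbar^2$ and higher in (8.2) vanish.

```latex
\begin{align*}
q *_\hbar p &= qp - \tfrac{i\hbar}{2}\{q,p\}, \\
p *_\hbar q &= pq - \tfrac{i\hbar}{2}\{p,q\} = pq + \tfrac{i\hbar}{2}\{q,p\}, \\
q *_\hbar p - p *_\hbar q &= -i\hbar\,\{q,p\}.
\end{align*}
```

For the Weyl product the commutation condition therefore holds exactly on coordinates, and comparing with (8.1) gives $\varphi_1(f,g) = \tfrac12\{f,g\}$, so that $\varphi_1(f,g) - \varphi_1(g,f) = \{f,g\}$ as required.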
The space $\mathfrak{g} := \hbar^{-1}W_\hbar$ (series with an order $\hbar^{-1}$ term allowed) is a Lie algebra with the commutator as a Lie bracket. Indeed, $\mathfrak{g}$ acts on $W_\hbar$ by derivations and in fact gives all derivations of $W_\hbar$. It is thus a central extension
$$0 \to \hbar^{-1}\mathbb{C}[[\hbar]] \to \mathfrak{g} \to \operatorname{der} W_\hbar \to 0. \tag{8.3}$$
Since the Fedosov connection $\nabla$ is a $\operatorname{der} W_\hbar$-connection, it can be lifted to a $\mathfrak{g}$-connection $\tilde\nabla$ using Eq. (8.3). The flatness of $\nabla$ implies that the curvature of $\tilde\nabla$ is central, that is $\tilde\nabla^2 \in \hbar^{-1}\Omega^2(M)[[\hbar]]$. However, the lifting of $\nabla$ is not unique, so the curvature of $\tilde\nabla$ is not uniquely determined by $\nabla$. Fortunately, the ambiguity is only modulo exact forms, so we can define $\theta := [\tilde\nabla^2]/2\pi i \in \hbar^{-1}H^2_{\mathrm{dR}}(M)[[\hbar]]$, where single brackets denote the de Rham cohomology class. To leading order in $\hbar$, this is given by the symplectic form as
$$\theta = \frac{[\omega]}{2\pi\hbar} + \dots. \tag{8.4}$$
The group of arbitrary automorphisms of $\mathbb{A}_\hbar$ decomposes as the direct product of the subgroup of automorphisms preserving $\hbar$, with the group of formal $\hbar$ reparameterizations. The group of $\hbar$-preserving automorphisms is itself an extension of the group of symplectomorphisms ($\omega$-preserving diffeomorphisms) by the group of inner automorphisms.

The cohomology class $\theta$ turns out [18] to classify deformation quantizations modulo inner automorphisms and small (connected component) symplectomorphisms. The class $\theta$, modulo "large" symplectomorphisms and formal $\hbar$ reparameterization, therefore classifies $\mathbb{A}_\hbar$ modulo all isomorphisms.

For a given deformation quantization $\mathbb{A}_\hbar$ of $M$, there exists a natural trace (see [7, 18, 22]) $\operatorname{Tr} : \mathbb{A}_\hbar \to \hbar^{-n}\mathbb{C}[[\hbar]]$ ($2n = \dim M$). This is given to leading order in $\hbar$ by
$$\operatorname{Tr} f = \int_M \frac{f\,\omega^n}{(2\pi\hbar)^n\, n!} + \dots. \tag{8.5}$$
Using $\theta$ and this trace, a formal index theorem can be formulated (see [17] for the original, [19, 22] for clarity, and also [7, 18]). In the case of compact $M$, this reads:

Theorem 8.1. If $e = e^2 \in M_m[C^\infty(M)]$ and $\mathbf{e} = \mathbf{e}^2 \in M_m[\mathbb{A}_\hbar]$ are idempotents such that $\mathbf{e} \equiv e \mod \hbar$, then
$$\operatorname{Tr}\mathbf{e} = \int_M \operatorname{ch} e \wedge \hat{A}(TM) \wedge e^\theta. \tag{8.6}$$
Here $\operatorname{ch} e$ is the Chern character of the bundle determined by $e$.

9. Growth of Modules

Since (for any $N$) the algebra $A_N = \operatorname{End}\mathcal{H}_N$ is a full matrix algebra, its modules are classified (modulo isomorphism) by the positive integers. To be precise, any $A_N$-module can be written in the form $\operatorname{Hom}(E, \mathcal{H}_N)$, where $A_N$ acts only on $\mathcal{H}_N$, and $E$ may be any finite-dimensional vector space; the integer corresponding to this module is:

Definition. $\operatorname{rk}[\operatorname{Hom}(E, \mathcal{H}_N)] := \dim E$.
This also gives a natural isomorphism $\operatorname{rk} : K_0(A_N) \xrightarrow{\;\sim\;} \mathbb{Z}$.

Theorem 9.1. Let $\mathbb{V}$ be any quantization of a vector bundle $V$, and $\mathbb{V}_N$ the restriction of $\mathbb{V}$ to an $A_N$-module. For all sufficiently large values of $N$,
$$\operatorname{rk}\mathbb{V}_N = \int_M \operatorname{ch} V \wedge \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}. \tag{9.1}$$

Proof. The uniqueness result of Thm. 2.2 implies that any analytic formula for $\operatorname{rk}\mathbb{V}_N$ for large $N$ must apply to any quantization of $V$. It is therefore sufficient to look at the specific quantization constructed in Sect. 4.

By Lemma 4.5, for $N$ sufficiently large, $\mathbb{V}_N = \tilde{\mathbb{V}}_N = \operatorname{Hom}(\tilde{E}^V_N, \mathcal{H}_N)$; hence, $\operatorname{rk}\mathbb{V}_N = \dim\tilde{E}^V_N$. This is the kernel of the $V^*\otimes L_N$-twisted Dolbeault operator, and has the same dimension as the kernel of the $V\otimes\bar{L}_N$-twisted anti-Dolbeault operator. By Corollary A.2, this is entirely of even degree (again, for $N$ sufficiently large); hence, $\operatorname{rk}\mathbb{V}_N$ is the index of this anti-Dolbeault operator. Equation (9.1) then follows from the Riemann–Roch–Atiyah–Singer theorem if we note that $\operatorname{ch}\bar{L}_N = e^{-c_1(L_N)} = e^{N\omega/2\pi - c_1(L_0)}$.

This gives some interesting qualitative results. Again writing $n := \dim_{\mathbb{C}} M$, the right-hand side of Eq. (9.1) is a polynomial in $N$ of degree $n$. The coefficients of this polynomial give $n + 1$ components of the Chern character of $V$. The growth of $\operatorname{rk}\mathbb{V}_N$ thus gives some, but not in general all, topological information about the bundle $V$. Evidently, the sequence of modules $\mathbb{V}_N$ does not carry all the information of $\mathbb{V}$; there is important information contained in the way these modules fit together as $N \to \infty$.

Since $\mathbb{A}$ is a quantization of the trivial line bundle, Eq. (9.1) implies the formula
$$\dim\mathcal{H}_N = \int_M \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}, \tag{9.2}$$
which, thanks to the Kodaira vanishing theorem, holds for all $N > 0$. Comparing (9.1) with (9.2) shows that $\operatorname{rk}\mathbb{V}_N \approx \operatorname{rk} V \cdot \dim\mathcal{H}_N$, with corrections of order $N^{n-1}$ ($\operatorname{rk} V$ is the fiber dimension). A trivial vector bundle over $M$ can be quantized to a free module of $\mathbb{A}$. In that case, $\operatorname{rk}\mathbb{V}_N$ must be an integer multiple of $\dim\mathcal{H}_N$, but in general the deviation from this reflects the nontriviality of a vector bundle.

It is especially interesting to quantize a spinor bundle of $M$. Since $M$ is symplectic, it is even dimensional, and spinors decompose as $S = S^+ \oplus S^-$ into left and right handed parts. The Dirac operator is odd; that is, it maps left spinors to right spinors and vice-versa. A "quantized" Dirac operator should act on the quantized spinor bundle, i.e., $D_N : \mathbb{S}^+_N \to \mathbb{S}^-_N$. If oddness of the Dirac operator is preserved, and if $\mathbb{S}^+_N$ and $\mathbb{S}^-_N$ are of different size, then the quantized Dirac operator will necessarily have a kernel.

Typically, $\mathbb{S}^+_N$ and $\mathbb{S}^-_N$ are different. In fact $\operatorname{rk}\mathbb{S}^+_N - \operatorname{rk}\mathbb{S}^-_N$ is independent of $N$ and equal to the Euler characteristic $\chi(M)$. The dimension of an $A_N$-module is equal to its rank times $\dim\mathcal{H}_N$, so the dimension of the kernel of a quantized Dirac operator for a manifold of nonzero Euler characteristic must be at least
$$\dim\ker D_N \ge |\chi(M)|\,\dim\mathcal{H}_N. \tag{9.3}$$
This may have dire consequences for the existence of quantized Dirac operators. I hope to discuss this further in a future paper.

Theorem 9.1 can also be reexpressed in a form precisely analogous to Eq. (8.6). Write $\operatorname{tr}_N$ for the trace of the evaluation at $N$, i.e., $\operatorname{tr}_N = \operatorname{tr} \circ \mathcal{P}_N$.
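For a concrete instance of Eqs. (9.1) and (9.2), take $n = 1$. The following is a sketch under assumptions stated here rather than in the text: $M$ is a compact Riemann surface of genus $g$; $d := \int_M \omega/2\pi$, $\ell_0 := \int_M c_1(L_0)$, $r := \operatorname{rk} V$, and $k := \int_M \operatorname{ch}_1(V)$, where $\operatorname{ch}_1$ is the degree-2 part of $\operatorname{ch} V$; and the degree-2 part of $\operatorname{td} TM$ integrates to the Todd genus $1 - g$.

```latex
\begin{align*}
\dim \mathcal{H}_N
  &= \int_M \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}
   = Nd - \ell_0 + 1 - g, \\
\operatorname{rk} \mathbb{V}_N
  &= \int_M \operatorname{ch} V \wedge \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}
   = r\,(Nd - \ell_0 + 1 - g) + k, \\
\operatorname{rk} \mathbb{V}_N - r \dim \mathcal{H}_N &= k .
\end{align*}
```

For the sphere with $d = 1$ and $L_0$ trivial this gives $\dim\mathcal{H}_N = N + 1$ and $\operatorname{rk}\mathbb{V}_N = r(N+1) + k$. The correction $k$ is of order $N^{n-1} = N^0$, and it vanishes for a trivial bundle, in line with the remarks following Eq. (9.2).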
Corollary 9.2. Let $e \in M_m[C(M)]$ and $\tilde{e} \in M_m(\mathbb{A})$ be idempotents such that $\mathcal{P}(\tilde{e}) = e$. For $N$ sufficiently large,
$$\operatorname{tr}_N\tilde{e} = \int_M \operatorname{ch} e \wedge \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}. \tag{9.4}$$

Proof. The idempotent $e$ defines a vector bundle $V$. The module $\mathbb{V} := \mathbb{A}^m\tilde{e}$ is a quantization of $V$. We have $\mathbb{V}_N = A_N^m\,\mathcal{P}_N(\tilde{e})$. This gives $\operatorname{rk}\mathbb{V}_N = \operatorname{tr}_N\tilde{e}$.

10. Comparison

As suggested in Sect. 1, we can try to construct a deformation quantization from a geometric quantization. Let $Q_N$ represent either the geometric or Toeplitz quantization maps.

Suppose that we choose a sequence of approximately inverse maps $Q^{\mathrm{inv}}_N : A_N \hookrightarrow C^\infty(M)$ such that $Q_N \circ Q^{\mathrm{inv}}_N = \mathrm{id}$ and $Q^{\mathrm{inv}}_N \circ Q_N \to \mathrm{id}$ as $N \to \infty$. Using such maps, we can pull the product on $A_N$ back to $C(M)$ and define
$$f *_N g := Q^{\mathrm{inv}}_N[Q_N(f)Q_N(g)].$$
Under the product $*_N$, $C^\infty(M)$ is an associative (but not unital) algebra. It is in terms of a variable product such as this that the commutation relation Eq. (1.1) should be considered.

Definition. The deformation quantization, $\mathbb{A}_\hbar$, associated to $Q_N$ and $Q^{\mathrm{inv}}_N$ is the space $C^\infty(M)[[\hbar]]$ with the product $*_\hbar$, where for $f, g \in C^\infty(M)$,
$$f *_\hbar g \sim f *_N g \tag{10.1}$$
is the asymptotic expansion as $N = \hbar^{-1} \to \infty$.

This $\mathbb{A}_\hbar$ will only exist as an algebra if $Q_N$ and $Q^{\mathrm{inv}}_N$ are chosen well. The requirement that $\mathbb{A}_\hbar$ be algebraically closed under $*_\hbar$ is equivalent to the requirement that the image of each $\varphi_j$ in Eq. (8.1) is in $C^\infty(M)$. Associativity of $*_\hbar$ is automatic.

Recall from Sect. 2 and Thm. 4.3 that the algebra $\mathbb{A}$ is the space of continuous sections of a continuous field of C$^*$-algebras over $\hat{\mathbb{N}}$. It is useful to refine $\mathbb{A}$ to a space of smooth sections. Differentiability is only an issue at the accumulation point $\infty \in \hat{\mathbb{N}}$, so smooth sections are those admitting an asymptotic expansion as $N \to \infty$.

Definition / Lemma 10.1. The asymptotic expansion map, $\iota$, is defined by the condition, for $a \in \mathbb{A}$ and $f \in \mathbb{A}_\hbar$, that $f = \iota(a)$ if $a \sim Q(f)$. That is, if for all $k$,
$$\lim_{N\to\infty} N^k\,\bigl\|a_N - Q_N(f_{(k)})\bigr\| = 0, \tag{10.2}$$
where $f_{(k)}$ is the order $k$ partial sum of $f$ with the substitution $\hbar = N^{-1}$, and $a_N$ is the evaluation of $a$ at $N$. Let $\mathbb{A}^S \subset \mathbb{A}$ be the domain of definition of the asymptotic expansion map $\iota : \mathbb{A}^S \to \mathbb{A}_\hbar$. That is, the set of $a \in \mathbb{A}$ such that $\iota(a) \in \mathbb{A}_\hbar$ exists.
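As an illustration of condition (10.2), the first terms of the asymptotic expansion of a Toeplitz operator can be read off from Tuynman's formula (6.3). This is a sketch assuming that $Q$ denotes the geometric quantization map, that $f$ is smooth, and that $T_N$ is norm-contracting as in Sect. 4.

```latex
\begin{align*}
Q_N\!\bigl(f - \tfrac{1}{2N}\Delta f\bigr)
  &= T_N\!\bigl(f - \tfrac{1}{2N}\Delta f + \tfrac{1}{2N}\Delta f - \tfrac{1}{4N^2}\Delta^2 f\bigr)
   = T_N(f) - \tfrac{1}{4N^2}\,T_N(\Delta^2 f), \\
N\,\bigl\| T_N(f) - Q_N\!\bigl(f - \tfrac{1}{2N}\Delta f\bigr) \bigr\|
  &\le \tfrac{1}{4N}\,\|\Delta^2 f\| \;\longrightarrow\; 0 .
\end{align*}
```

So, whenever $\iota[T(f)]$ exists, condition (10.2) at orders $k = 0$ and $k = 1$ gives $\iota[T(f)] = f - \tfrac{\hbar}{2}\Delta f + O(\hbar^2)$.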
Proof. To check that $\iota$ is well-defined, we need to check that $f$ satisfying Eq. (10.2) is unique (if it exists). Suppose that $f$ and $g$ both satisfy Eq. (10.2). If $f_{(k-1)} = g_{(k-1)}$, then $N^k(f_{(k)} - g_{(k)})$ is constant with respect to $N$. At order $k$, Eq. (10.2) implies
$$\lim_{N\to\infty} N^k\,\bigl\|Q_N(f_{(k)} - g_{(k)})\bigr\| = 0.$$
Using the property that $Q_N$ is norm-preserving in the limit (see Eq. (4.4) and Lemma 6.1), this gives $f_{(k)} = g_{(k)}$. Trivially, $f_{(-1)} = g_{(-1)} = 0$, so by induction, $f = g$.

The necessity of making a (somewhat arbitrary) choice of $Q^{\mathrm{inv}}_N$ in the construction of $*_\hbar$ is a rather unpleasant feature; fortunately, it can be eliminated.

Lemma 10.2. If $*_\hbar$ exists, then it is determined by $Q_N$ alone, $\mathbb{A}^S \subset \mathbb{A}$ is a holomorphically closed subalgebra, and $\iota : \mathbb{A}^S \to \mathbb{A}_\hbar$ is a homomorphism.

Proof. Suppose for $a \in \mathbb{A}$ that $Q^{\mathrm{inv}}_N(a_N) \sim f$. By the definition of asymptotic equivalence,
$$\lim_{N\to\infty} N^k\,\bigl\|Q^{\mathrm{inv}}_N(a_N) - f_{(k)}\bigr\| = 0.$$
Because the quantization maps are norm contracting, we can apply $Q_N$ and get Eq. (10.2). So, $f = \iota(a)$. Applying this to Eq. (10.1) gives
$$f *_\hbar g = \iota[Q(f)Q(g)]. \tag{10.3}$$
This clearly determines $*_\hbar$ without reference to $Q^{\mathrm{inv}}_N$.

Multiplicativity of $\iota$ is immediate from Eq. (10.3). The algebraic closure of $\mathbb{A}_\hbar$ under $*_\hbar$ then implies the algebraic closure of $\mathbb{A}^S$. That is, if $a, b \in \mathbb{A}^S$ then $\iota(a), \iota(b) \in \mathbb{A}_\hbar$, so $\iota(ab) = \iota(a) *_\hbar \iota(b) \in \mathbb{A}_\hbar$; thus $ab \in \mathbb{A}^S$.

To verify that $\mathbb{A}^S$ is holomorphically closed, it is sufficient to check that if $a \in \mathbb{A}^S$ and $F : \mathbb{C} \to \mathbb{C}$ is holomorphic on the closed disc of radius $\|a\|$, then $\iota[F(a)] = F[\iota(a)]$.

It is proven in [2] and [10] that a $*_\hbar$-product satisfying Eq. (10.3) does exist and defines a deformation quantization, although only for the (slightly) restricted case of $L_0$ trivial.

We can think of $\hbar = N^{-1}$ as a function on $\hat{\mathbb{N}}$ which vanishes at $\infty \in \hat{\mathbb{N}}$. Let $\mathbb{A}^S_0 := \mathbb{A}^S \cap \mathbb{A}_0$ be the ideal of sections vanishing at $\infty$. This is generated by $\hbar$, as $\mathbb{A}^S_0 = \hbar\,\mathbb{A}^S$.

Theorem 10.3. $\mathbb{A}_\hbar$ is the $\hbar$-adic completion of $\mathbb{A}^S$. That is, $\mathbb{A}_\hbar$ can be constructed from $\mathbb{A}^S$ as the algebraic inverse limit
$$\mathbb{A}_\hbar = \varprojlim\, \mathbb{A}^S/(\mathbb{A}^S_0)^k,$$
using the natural projections $\mathbb{A}^S/(\mathbb{A}^S_0)^{k+1} \twoheadrightarrow \mathbb{A}^S/(\mathbb{A}^S_0)^k$.
Proof. $\iota$ induces a homomorphism $\mathbb{A}^S/(\mathbb{A}^S_0)^k \to \mathbb{A}_\hbar/\hbar^k$. This is in fact an isomorphism, since $Q$ induces the inverse map. Let $B$ be any algebra and $\psi_k : B \to \mathbb{A}^S/(\mathbb{A}^S_0)^k$ some maps compatible with the inverse system. We want a map $\psi : B \to \mathbb{A}_\hbar$. Composing $\psi_k$ with the isomorphism induced by $\iota$ gives a map $B \to \mathbb{A}_\hbar/\hbar^k$, which determines the part of $\psi$ up to degree $k - 1$. By this construction, $\mathbb{A}_\hbar$ satisfies the definition of inverse limit.

With this construction of $\mathbb{A}_\hbar$, $\iota$ can still be recovered canonically from the natural projections $\mathbb{A}^S \twoheadrightarrow \mathbb{A}^S/(\mathbb{A}^S_0)^k$. On the other hand, the identification of $\mathbb{A}_\hbar$ with $C^\infty(M)[[\hbar]]$ does depend on the $Q_N$'s (but not on $Q^{\mathrm{inv}}_N$). Because of Eq. (6.3), $\mathbb{A}^S \subset \mathbb{A}$ is the same for both the geometric and Toeplitz quantization maps.

In light of the classification of deformation quantizations by cohomology classes, the obvious question now is: If a deformation quantization can be successfully constructed from a geometric quantization, then what is $\theta$? This question is easily answered by comparing Corollary 9.2 and Thm. 8.1.

Theorem 10.4. For the deformation quantization derived from the geometric quantization of a compact, Kähler manifold $M$, the classifying cohomology class is
$$\theta = \frac{[\omega]}{2\pi\hbar} - c_1(L_0) - \tfrac12 c_1(TM). \tag{10.4}$$

Proof. Corollary 9.2 shows that $\operatorname{tr}_N 1$ grows as (only) a polynomial in $N$. With the inequality $|\operatorname{tr}_N a| \le \|a_N\|\operatorname{tr}_N 1$, this implies that if $a \in \ker\iota$ (that is, $a \sim 0$), then $\operatorname{tr}_N a \sim 0$. This shows that the asymptotic expansion of $\operatorname{tr}_N$ gives a well defined, $\mathbb{C}[\hbar]$-linear trace on $\operatorname{Im}\iota \subset \mathbb{A}_\hbar$. Because of Thm. 10.3, this extends uniquely to all of $\mathbb{A}_\hbar$ and must therefore be proportional to the canonical trace, $\operatorname{Tr}$ (by the uniqueness of $\operatorname{Tr}$, see [18]). To be precise, for any $a \in \mathbb{A}^S$, the asymptotic expansion of $\operatorname{tr}_N a$ as $N = \hbar^{-1} \to \infty$ must be
$$\operatorname{tr}_N a \sim \beta\operatorname{Tr}[\iota(a)],$$
where $\beta \in \mathbb{C}[[\hbar]]$ is independent of $a$. Of course, this extends to matrices over $\mathbb{A}^S$.

Let $e \in M_m[C^\infty(M)]$ be any smooth idempotent. Choose any $a \in M_m[\mathbb{A}^S]$ such that $\mathcal{P}(a) = e$. This is idempotent modulo $\mathbb{A}^S_0$ in the sense that $a^2 - a \in M_m[\mathbb{A}^S_0]$. Since $\mathbb{A}^S$ is holomorphically closed, we can use a standard contour integral trick to construct from $a$ an idempotent $\tilde{e} \in M_m[\mathbb{A}^S]$; because contour integration commutes with the homomorphism $\mathcal{P}$, we have $\mathcal{P}(\tilde{e}) = e$, which is the hypothesis of Corollary 9.2. Equation (9.4) then gives an exact polynomial expression for $\operatorname{tr}_N\tilde{e}$ for $N$ sufficiently large; this polynomial is the asymptotic expansion of $\operatorname{tr}_N\tilde{e}$.

The idempotent $\mathbf{e} := \iota(\tilde{e})$ satisfies the hypothesis of Thm. 8.1 that $\mathbf{e} \equiv e \mod \hbar$, so $\operatorname{Tr}[\iota(\tilde{e})]$ is given by Eq. (8.6). Combining these results gives
$$\beta\int_M \operatorname{ch} e \wedge \hat{A}(TM) \wedge e^\theta = \int_M \operatorname{ch} e \wedge \operatorname{td} TM \wedge e^{\omega/2\pi\hbar - c_1(L_0)}.$$
Recall that the $\hat{A}$ and Todd classes are related by $\operatorname{td} TM = e^{-\frac12 c_1(TM)} \wedge \hat{A}(TM)$, where $c_1$ is the first Chern class. Now, noting that the possible values of $\operatorname{ch}(e)$ span $H^*_{\mathrm{dR}}(M)$, and that $\hat{A}(TM)$ is invertible, this gives
$$\theta + \ln\beta = \frac{[\omega]}{2\pi\hbar} - c_1(L_0) - \tfrac12 c_1(TM).$$
All terms of this equation are of cohomological degree 2 except for $\ln\beta$ which is of degree 0; therefore, $\ln\beta = 0$.

Corollary 10.5. For any $a \in \mathbb{A}^S$, there is the asymptotic equivalence $\operatorname{tr}_N a \sim \operatorname{Tr}[\iota(a)]$ as $N = \hbar^{-1} \to \infty$.
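Corollary 10.5 can be checked to leading order on the unit $a = 1$, for which $\iota(1) = 1$. This is a sketch using only Eqs. (8.5) and (9.2), with $\operatorname{tr}_N$ the unnormalized trace $\operatorname{tr} \circ \mathcal{P}_N$ of Sect. 9 and $\hbar = N^{-1}$.

```latex
\begin{align*}
\operatorname{tr}_N 1 = \dim \mathcal{H}_N
  &= \int_M \operatorname{td} TM \wedge e^{N\omega/2\pi - c_1(L_0)}
   = \Bigl(\frac{N}{2\pi}\Bigr)^{\!n} \int_M \frac{\omega^n}{n!} + O(N^{n-1}), \\
\operatorname{Tr} 1
  &= \int_M \frac{\omega^n}{(2\pi\hbar)^n\, n!} + \dots
   = \Bigl(\frac{N}{2\pi}\Bigr)^{\!n} \int_M \frac{\omega^n}{n!} + \dots
\end{align*}
```

The leading terms agree, as they must when $\beta = 1$.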
The geometric quantization constitutes what Fedosov [7] calls an “asymptotic operator representation” of the corresponding deformation quantization. His integrality condition (7.1.9) seems to be for the special case $\theta \propto \omega$; this condition is satisfied by the $\theta$ in Eq. (10.4). More generally, there is a sort of converse to Thm. 10.4.
Theorem 10.6. Let $(M, \omega)$ be a compact, symplectic manifold. Suppose that $A_\hbar$ is a deformation quantization of $M$ corresponding to some general quantization of $M$ by matrix algebras $A_N$ indexed by $N \in \mathbb{N}$. Then, with a possible rescaling of $\omega$ and $\hbar$-reparameterization, $A_\hbar$ is equivalent to a deformation quantization with characteristic class of the form
\[ \theta = \frac{[\omega]}{2\pi\hbar} + \alpha, \tag{10.5} \]
where $\alpha \in H^2_{dR}(M)$, and both $\omega$ and $\alpha + c_1(TM)$ are integral (for any almost-complex structure on $TM$).
Proof. By reparameterizing $\hbar$, we can arrange the correspondence $\hbar = N^{-1}$. We again have the asymptotic proportionality of the traces. Keeping the notation of the last proof, the numbers $\operatorname{tr}_N \tilde e$ must be integers, so the series
\[ \beta \operatorname{Tr} e = \beta \int_M \operatorname{ch} e \wedge \hat A(TM) \wedge e^{\theta} \tag{10.6} \]
must be integer-valued for $\hbar^{-1} \in \mathbb{N}$. With Eq. (8.4), this implies it is a polynomial in $\hbar^{-1}$. So, $\beta e^{\theta}$ must be a (degree $n$) polynomial in $\hbar^{-1}$. Therefore $\beta$ is constant and $\theta$ is linear. This implies that $\theta$ has the form (10.5), and the integrality of (10.6) implies the integrality conditions.
In the Kähler case, this shows that any deformation quantization that can be constructed from some general quantization can be constructed from the geometric or Toeplitz quantizations.
A. Spectral Inequalities
The line bundles $L_N$ continue to be as defined in Sect. 3. Specifically, $L_N = L^{\otimes N} \otimes L_0$, and $L_1$ is assumed to be positive.
Lemma A.1. If $V$ has a compatible connection and inner product, then the $V^* \otimes L_N$-twisted Dolbeault operator, $D_V$, is (essentially) self-adjoint and there exists a constant, $C$, such that
\[ \operatorname{Spec} D_V^2 \subset \{0\} \cup [N - C, \infty). \tag{A.1} \]
Moreover, for the trivial bundle $V = \mathbb{C} \times M$, we can take $C < 1$.
E. Hawkins
Proof. Let Latin indices denote holomorphic and barred Latin indices antiholomorphic directions in the tangent bundle. Using the Kähler identity $\omega_{i\bar\jmath} = i g_{i\bar\jmath}$, the Weitzenböck formula in this case takes the form
\[ D_V^2 = -g^{i\bar\jmath} \nabla_i \nabla_{\bar\jmath} + N\delta + \hat K, \tag{A.2} \]
where $\delta$ is the grading operator on $\Omega^{0,*}(M, V^* \otimes L_N)$, $\hat K = i\gamma^{\bar\imath}\gamma^{j} K_{\bar\imath j} + \tfrac{i}{2}\gamma^{i}\gamma^{j} K_{ij} + \tfrac{i}{2}\gamma^{\bar\imath}\gamma^{\bar\jmath} K_{\bar\imath\bar\jmath}$, and $K$ is the curvature of $V^* \otimes L_0$. The operator $D_V^2$ always preserves the $\mathbb{Z}_2$-grading of $\Omega^{0,*}(M, V^* \otimes L_N)$ into even and odd parts, although it may not respect the full $\mathbb{Z}$-grading. With respect to the $\mathbb{Z}_2$-grading, $D_V$ decomposes into $D_+ + D_-$, where $D_+$ maps even to odd and $D_-$ maps odd to even. The first term of (A.2) is a positive operator, and $\delta \ge 1$ when restricted to the odd subspace; therefore,
\[ D_+ D_- \ge N - C, \tag{A.3} \]
where $C = \|\hat K\|$ is sufficient. This proves that any eigenvalue of $D_+ D_-$ (the spectrum consists entirely of eigenvalues) is greater than $N - C$. Let $\psi$ be an eigenvector of $D_- D_+$ with eigenvalue $\lambda \ne 0$. This implies that $D_+\psi \ne 0$. Now, $D_+ D_- (D_+\psi) = D_+ \lambda\psi = \lambda (D_+\psi)$, so $\lambda$ is an eigenvalue of $D_+ D_-$. Therefore, $\lambda \ge N - C$. For $V$ trivial, the assumption that $L_1$ is positive implies that $\delta + \hat K$ is strictly positive. This means that $D^2 > (N - 1)\delta$, and so we can take $C < 1$ in (A.3).
Corollary A.2. Let $V$ be an arbitrary vector bundle with a connection, and $D_V$ the $V^* \otimes L_N$-twisted Dolbeault operator. For $N$ sufficiently large, $\ker D_V$ is entirely of even degree, and for any nonzero $\psi \in \ker D_V$, the degree 0 component of $\psi$ is nonvanishing. Identical results hold for the $V \otimes \bar L_N$-twisted anti-Dolbeault operator.
Proof. Assign an arbitrary inner product to $V$. The given connection on $V$ can be decomposed into a connection compatible with the inner product and a self-adjoint potential. Correspondingly, the Dolbeault operator decomposes as $D_V = D_0 + iB$, where $D_0$ is a self-adjoint Dolbeault operator and $B$ is a self-adjoint and bounded Dirac matrix. Using Eq. (A.2) again gives
\[ \operatorname{Re} D_V^2 = D_0^2 - B^2 \ge N\delta - C - \|B\|^2. \]
Now assume that $N > C + \|B\|^2$. If $\psi$ is of strictly positive degree (i.e. $\psi_0 = 0$) then $D_V^2 \psi \ne 0$, which implies $\psi \notin \ker D_V$. Because $D_V$ respects the $\mathbb{Z}_2$-grading, $\ker D_V$ must be the sum of even and odd parts. However, if $\psi$ is of strictly odd degree, then it is of strictly positive degree. Hence, $\ker D_V$ can have no odd part.
Note that if $V$ is trivial, then the first statement can be strengthened to the classical Kodaira vanishing theorem, namely the fact that $\ker D$ is entirely of degree 0 and thus is simply $\Gamma_{\mathrm{hol}}(M, L_N)$, a fact which was used in Sect. 4. Recall that in the proof of Lemma 4.7, $\Pi^V_N$ was defined as the idempotent such that $\operatorname{Im} \Pi^V_N = \ker D_V$ and $\ker \Pi^V_N = \operatorname{Im} D_V$.
Lemma A.3. If $V$ and $W$ are the same bundle, but with different connections, then
\[ \lim_{N \to \infty} \|\Pi^V_N - \Pi^W_N\| = 0. \tag{A.4} \]
Proof. Suppose initially that the $W$ connection is compatible with the inner product. This means that the associated Dolbeault operator $D_W$ and idempotent $\Pi^W_N$ will be self-adjoint. Since different connections on the same bundle only differ by a potential, the difference $A := D_V - D_W$ of the Dolbeault operators is bounded. The idempotent $\Pi^V_N$ can be expressed in terms of $D_V$ as
\[ \Pi^V_N = \frac{1}{2\pi i} \oint_{\mathcal{C}} (z - D_V)^{-1}\, dz, \]
where the contour of integration $\mathcal{C}$ encloses 0 but no other eigenvalue of $D_V$. An identical formula holds for $\Pi^W_N$ in terms of $D_W$. The difference of these expressions gives
\[ \Pi^V_N - \Pi^W_N = \frac{1}{2\pi i} \oint_{\mathcal{C}} (z - D_V)^{-1} A\, (z - D_W)^{-1}\, dz. \tag{A.5} \]
Expanding $(z - D_V)^{-1} = (z - D_W - A)^{-1}$ as a power series in $A$ and taking the norm gives
\[ \|(z - D_V)^{-1}\| \le \left( \|(z - D_W)^{-1}\|^{-1} - \|A\| \right)^{-1}. \]
Since $D_W$ is self-adjoint, the norm of $(z - D_W)^{-1}$ is just the reciprocal of the distance from $z$ to $\operatorname{Spec} D_W$. Equation (A.1) implies that (for some $C$)
\[ \operatorname{Spec} D_W \subset \left(-\infty, -\sqrt{N - C}\,\right] \cup \{0\} \cup \left[\sqrt{N - C}, \infty\right). \]
If we let the contour $\mathcal{C}$ be the circle about 0 of radius $\tfrac{1}{2}\sqrt{N - C}$ (which is a good contour if $N > C + 4\|A\|^2$, so that the radius exceeds $\|A\|$), then for $z \in \mathcal{C}$, $\|(z - D_W)^{-1}\| \le 2(N - C)^{-1/2}$. Taking the norm of (A.5) now gives
\[ \|\Pi^V_N - \Pi^W_N\| \le 2\|A\| (N - C)^{-1/2} \left[ \tfrac{1}{2}(N - C)^{1/2} - \|A\| \right]^{-1}. \]
This clearly goes to 0 as $N \to \infty$, thus proving the claim in this special case. Idempotents constructed from two connections incompatible with the inner product can both be compared with one constructed from a connection that is compatible with the inner product; thus, this special case implies the more general result.
Acknowledgements. I wish to thank Nigel Higson for extensive discussions, as well as Boris Tsygan, Ranee Brylinski, Jean-Luc Brylinski, and Nicolaas Landsman. This material is based upon work supported in part under a National Science Foundation Graduate Fellowship. Also supported in part by NSF grant PHY95-14240 and by the Eberly Research Fund of the Pennsylvania State University.
References
1. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation Theory and Quantization. Ann. Phys. 111, 61–151 (1977)
2. Bordemann, M., Meinrenken, E., Schlichenmaier, M.: Toeplitz Quantization of Kähler Manifolds and gl(N), N → ∞ Limits. E-print: hep-th/9309134; Commun. Math. Phys. 165, 281–296 (1994)
3. Cartan, H., Eilenberg, S.: Homological Algebra. Princeton, NJ: Princeton University Press, 1956
4. Connes, A.: Noncommutative Geometry. New York: Academic Press, 1994
5. Dixmier, J.: C∗-algebras. Amsterdam: North Holland, 1982
6. Fedosov, B. V.: A simple geometrical construction of deformation quantization. J. Diff. Geom. 40 no. 2, 213–238 (1994)
7. Fedosov, B. V.: Deformation quantization and index theory. Math. Top., 9. Berlin: Akademie Verlag, 1996
8. Gilkey, P. B.: Invariance theory, the heat equation, and the Atiyah–Singer index theorem. Boca Raton, FL: CRC Press, 1995
9. Griffiths, P., Harris, J.: Principles of Algebraic Geometry. New York: Wiley, 1978
10. Guillemin, V.: Star Products on Compact Pre-quantizable Symplectic Manifolds. Lett. Math. Phys. 35, 85–89 (1995)
11. Hawkins, E.: Quantization of Equivariant Vector Bundles. E-print: q-alg/9708030; Commun. Math. Phys. 202, 517–546 (1999)
12. Hawkins, E.: Geometric Quantization of Vector Bundles. E-print: math.QA/9808116
13. Hawkins, E.: The Correspondence Between Geometric Quantization and Formal Deformation Quantization. E-print: math.QA/9811049
14. Kirchberg, E., Wassermann, S.: Operations on Continuous Bundles of C∗-algebras. Math. Ann. 303, 677–697 (1995)
15. Landsman, N. P.: Strict Quantization of Coadjoint Orbits. E-print: math-ph/9807027; J. Math. Phys. 39, 6372 (1998)
16. Landsman, N. P.: Mathematical Topics Between Classical and Quantum Mechanics. New York: Springer, 1998
17. Nest, R., Tsygan, B.: Algebraic Index Theorem. Commun. Math. Phys. 172 2, 223–262 (1995)
18. Nest, R., Tsygan, B.: Algebraic Index Theorem for Families. Adv. Math. 113 2, 151–205 (1995)
19. Nest, R., Tsygan, B.: Formal Versus Analytic Index Theorems. IMRN 11 (1996)
20. Rieffel, M. A.: Quantization and C∗-algebras. Contemp. Math. 167, 67–97 (1994)
21. Tuynman, G. M.: Quantization: Towards a Comparison Between Methods. J. Math. Phys. 28, 2829–2840 (1987)
22. Rosenberg, J. M.: Review of [17, 18]. Math. Rev. 96j:58163a
23. Weinstein, A.: Deformation Quantization. Séminaire Bourbaki, Vol. 1993/94. Astérisque No. 227, Exp. No. 789, 5, 389–409 (1995)
24. Woodhouse, N. M. J.: Geometric Quantization. Oxford: Oxford University Press, 1991
Communicated by A. Connes
Commun. Math. Phys. 215, 433 – 441 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Linearity of Space-Time Transformations Without the One-to-One, Line-onto-Line, or Constancy-of-Speed-of-Light Assumptions
Alexander Chubarev, Iosif Pinelis
Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, USA
Received: 3 May 1999 / Accepted: 7 July 2000
Abstract: Using a version of the fundamental theorem of geometry without the 1-to-1 assumption, recently obtained by the authors, the following is proved: Let $n \ge 2$ and $T$ be a mapping of $\mathbb{R}^n$ onto itself which maps every timelike line $\ell$ into an arbitrary line so that the image of every future ray of $\ell$ contains at least two distinct points or the same holds for every past ray of $\ell$. Then $T$ is affine. A version of the Pappus theorem under minimal assumptions is also given, which is then used as a tool in this paper. Related results have been obtained by Borchers and Hegerfeldt.
1. Introduction
The special theory of relativity is usually derived based on three principles: the linearity [more exactly, affinity] of the reference frame (r.f.) change transformations, isotropy of space, and the constancy of the speed of light. Of these three principles, the constancy of the speed of light seems to be the least intuitive. Instead of the constancy of the speed of light one may use the less restrictive requirement of the existence of a universal upper bound on the speeds of all uniform rectilinear motions (u.r.m.’s) or, alternatively, that of the existence of a u.r.m. whose speed is the same with respect to all “inertial systems”. However, neither of the latter two requirements seems much more intuitive than the constancy of the speed of light. Moreover, they are so strong by themselves that either of them already implies the linearity [1, 2, 9]; cf. also, e.g., [13, 11]. As early as 1910 it was shown by Ignatowsky, and by Frank and Rothe, that the principle of the constancy of the speed of light may be replaced by the requirement that the r.f. change transformations form a one-parameter homogeneous group; see, e.g., [12]. Alternatively, Dixon [6] used isotropy in place of group assumptions. It seems therefore natural to try to deduce the linearity of r.f. change transformations from general principles, without the constancy of the speed of light or related
assumptions. The easiest way would be to refer to the Fundamental Theorem of [Affine] Geometry [10], which says that for n ≥ 2, any 1-to-1 mapping of Rn onto itself which maps every (straight) line onto a line is affine. There is however one problem with this approach: the requirement that every line be mapped onto a line means, generally speaking, that one is allowed to consider u.r.m.’s whose speed may be finite relative to one r.f. and infinite relative to another r.f. It is thus desirable to require only that, for instance, the lines in the Minkowski space that are the world-lines of sufficiently slow u.r.m.’s be mapped onto lines. More generally and in view of testing purposes, one might want to require a priori only that the lines whose directions are close enough to a fixed direction be mapped onto lines; let us refer for brevity to such directions as allowable. Let us emphasize that the directions of the image-lines are not required here to be allowable or to be otherwise specified at all, so that this requirement has nothing to do with such conditions as the ones of the constancy of the speed of light, boundedness of the speeds of all “signals”, etc. Results addressing this concern were obtained by Borchers and Hegerfeldt in the remarkable series of papers [8, 2, 3]. Another concern is the following. While ideally one should think of any two r.f.’s as being mutually interchangeable, in practice one r.f. may be better equipped than another, so that events, found distinct in one r.f., may seem indistinguishable or almost so in another. This brings forward the question whether it is possible to reduce, or entirely remove, the requirement that the r.f. change transformations be 1-to-1 as imposed a priori. The theorem mentioned in the abstract addresses both of the mentioned concerns at once; moreover, the line-onto-line condition may be relaxed to the line-into-line one. 
The proof is based on the recent result by these authors [5], who have been able to remove the 1-to-1 condition in the Fundamental Theorem of Geometry; see Theorem G below. On the other hand, weaker conditions are imposed in [2, 3] on the cone of the allowable directions. We present here an amazingly simple [at least for $n \ge 3$, and using Theorem G below] proof of the result cited in the Abstract.
2. Notation and Statement of Results
For brevity, let us refer to a $k$-dimensional affine subspace of $\mathbb{R}^n$ as a $k$-plane; then, 1-planes will be referred to simply as lines. Let $T$ be a mapping of $\mathbb{R}^n$ into $\mathbb{R}^n$. The following theorem is immediate from, say, Corollary 2 of [5].
Theorem G (Non-1-to-1 fundamental theorem of geometry). Suppose that $n \ge 2$ and that $T$ is onto and maps every line in $\mathbb{R}^n$ into a line. Then $T$ is affine.
As indicated in [5], the conditions of Theorem G are minimal in the sense that none of them, (i) $T$ being onto, (ii) $T$ preserving collinearity, (iii) $n \ge 2$, may be dropped.
In what follows, capital Roman letters $A, B, \ldots$, possibly with indices, will always denote points of $\mathbb{R}^n$, unless otherwise stated; $\ell$, possibly with indices, will denote lines. We shall set $A' := T(A)$ for all $A \in \mathbb{R}^n$. We shall say that two $k$-planes $\pi_1$ and $\pi_2$ are parallel and write $\pi_1 \parallel \pi_2$ if $\pi_1 \cap \pi_2 = \emptyset$ and $\dim(\pi_1 \cup \pi_2) = k + 1$. Let $C \ne \emptyset$ be an open convex cone in $\mathbb{R}^n \setminus \{0\}$; $C$ being a cone means in this paper that $\mathbb{R}_+ C = C$, where $\mathbb{R}_+$ is the set of all strictly positive real numbers. Let us write $P < Q$ or, equivalently, $Q > P$ if $Q - P \in C$.
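To make the order relation just introduced concrete, here is a minimal sketch under an assumption that is not part of the paper: we take $n = 2$ and let $C$ be the open future light cone $\{(t, x) : t > |x|\}$, whereas the text allows any non-empty open convex cone not containing 0. The function names `in_cone` and `precedes` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def in_cone(v):
    """Membership in the ASSUMED cone C = {(t, x) : t > |x|} in R^2.

    The paper only requires C to be a non-empty open convex cone
    not containing 0; this particular cone is an illustrative choice.
    """
    t, x = v
    return t > abs(x)

def precedes(p, q):
    """P < Q in the sense of the text: the difference Q - P lies in C."""
    return in_cone((q[0] - p[0], q[1] - p[1]))

# Sample points (t, x), exact rational coordinates.
P = (Fraction(0), Fraction(0))
Q = (Fraction(2), Fraction(1))
R = (Fraction(5), Fraction(-1))
S = (Fraction(0), Fraction(3))   # neither before nor after P

print(precedes(P, Q), precedes(Q, R), precedes(P, R))
print(precedes(P, P), precedes(P, S), precedes(S, P))
```

Because $C$ is convex and closed under positive scaling, $P < Q$ and $Q < R$ force $P < R$; because $0 \notin C$, no point precedes itself. The sample points above exercise both facts, and the pair $P$, $S$ shows that the relation is only a partial order.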
Let us call a pair $\{P, Q\}$ timelike if either $P < Q$ or $P > Q$. Note that if $\{P, Q\}$ is timelike, then $P \ne Q$, since $0 \notin C$. A line $\ell$ in $\mathbb{R}^n$ will be called timelike if there exists a timelike pair of points of $\ell$; then, obviously, every pair of distinct points on $\ell$ is timelike. For any $Q \in \ell$, let us call the sets of the form $\{P \in \ell : P > Q\}$ and $\{P \in \ell : P < Q\}$ the future and past rays in $\ell$, respectively. Let us call a timelike line $\ell$ good (under $T$) if the image under $T$ of every future ray in $\ell$ contains at least two distinct points or this is true for every past ray in $\ell$. Since the cone $C$ is open, a timelike line $\ell$ is good iff for every $P \in \mathbb{R}^n$, there exists a point $Q \in \ell$ such that $Q > P$ and $Q' \ne P'$, or for every $P \in \mathbb{R}^n$, there exists a point $Q \in \ell$ such that $Q < P$ and $Q' \ne P'$. Yet another equivalent definition is the following: a timelike line $\ell$ is good iff for every $A \in \mathbb{R}^n$ and for every future ray $r$ in $\ell$, there exists $P \in r$ such that $P' \ne A'$, or this is true for every $A \in \mathbb{R}^n$ and for every past ray $r$ in $\ell$.
Let us now state the main result of this paper.
Theorem (Main). Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line $\ell$ into a line $\ell'$ [$\ell'$ is not a priori required to be timelike]. Suppose also that every timelike line is good. Then $T$ is affine.
One might ask whether the condition in the Theorem (Main) about timelike lines being good is essential. The positive answer to this question is seen from the following simple example.
Example. For any $n \ge 2$, let $\Pi$ be any $(n-1)$-plane in $\mathbb{R}^n$ not containing timelike lines [such an $(n-1)$-plane always exists since $C$ is an open convex cone in $\mathbb{R}^n \setminus \{0\}$]; let $Q$ be any point in $\mathbb{R}^n$. Let $T$ be such that $T(\mathbb{R}^n \setminus \Pi) = \{Q\}$ and $T(\Pi) = \mathbb{R}^n \setminus \{Q\}$. Then $T$ is not affine, while it is onto $\mathbb{R}^n$ and maps every timelike line onto a two-point set.
Let a non-1-to-1 mapping $T$ be interpreted as a transformation describing the correspondence between some “perfect” r.f. $f$ and another, “less than perfect” r.f. $f'$. Suppose that every timelike line may be represented as the world-line of a particle as viewed from the “perfect” r.f. $f$. From the viewpoint of r.f. $f'$, the particle may be observable not at all time moments, although we do assume that the 4-tuples of the time-space coordinates of all the events associated with the particle and measured in $f'$ correspond to collinear points in the Minkowski space. The condition of being good just means that to the “observer” $f'$, the particle cannot disappear forever so that no new events pertaining to the particle can be registered in $f'$ after some time moment [“after some time moment” is understood here from the viewpoint of $f$] or the particle could not have appeared to the “observer” $f'$ out of nowhere. Obviously, this picture is much less perfect than that of an ideal 1-to-1 and line-onto-line correspondence. Nonetheless, Theorem (Main) says that the imperfection of r.f. $f'$ can be rectified, so that $T$ must be a posteriori affine.
With the linearity (affinity) of the r.f. change transformations established, considerations of isotropy or group symmetry such as mentioned in the Introduction may be used to obtain, once again a posteriori, Lorentz-like properties of the r.f. change transformations.
The following corollary is immediate from the Theorem (Main).
Corollary 1. Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line into a line. Suppose also that for every timelike line $\ell$, the restriction of $T$ to some ray of $\ell$ is one-to-one. Then $T$ is affine.
Corollary 1 in turn implies
Corollary 2. Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line into a line. Suppose also that for every timelike line $\ell$, the restriction of $T$ to $\ell$ is one-to-one. Then $T$ is affine.
Further specialization of Corollary 2 yields
Corollary 3. Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line into a line and every timelike pair onto a timelike pair. Then $T$ is affine.
By yet another round of specialization, one has
Corollary 4. Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line into a line. Suppose also that $T$ preserves the order in the sense that $P < Q$ implies $P' < Q'$ [or reverses the order in the sense that $P < Q$ implies $P' > Q'$]. Then $T$ is affine.
Finally, another specialization of Corollary 2:
Corollary 5. Suppose that $n \ge 2$ and that $T$ is onto and maps every timelike line into a line. Suppose also that $T$ is one-to-one. Then $T$ is affine.
The last corollary may be viewed as a generalization of the result of [8], where it was additionally assumed that $T$ maps every timelike line onto a line.
In the proof of Theorem (Main), we will have to distinguish between two cases: $n = 2$ and $n \ge 3$. The first case is the harder one, and the reason is that the smaller the dimension, the “fewer” lines there are to operate with; the same kind of phenomenon is noticeable in [13], where the result is true for $n \ge 3$ but not for $n = 2$, in [2, 3], and in the fundamental theorem of geometry itself, which is not true for $n = 1$.
It is in the proof of Theorem (Main) for the case $n = 2$ that we use the following version of the Pappus theorem, the distinguishing feature of which is the minimality of the assumptions; cf., e.g., [4, 7]. We would have very much appreciated having this theorem when working on the proof of Theorem (Main) for $n = 2$, and we believe that it could be useful elsewhere.
In what follows, $A - B - C$ will mean that $A$, $B$, and $C$ are collinear.
Theorem (Pappus). Suppose that $n = 2$, $A_1 - A_2 - A_3$, $B_1 - B_2 - B_3$, and for all permutations $(i, j, k)$ of the set $\{1, 2, 3\}$, one has $C_k - A_i - B_j$. Then $C_1 - C_2 - C_3$ unless one of the following two exceptional situations takes place:
Exception 1. Points $A_1, A_2, A_3, B_1, B_2, B_3$ are all collinear, and there exist two distinct elements $i$ and $j$ of the set $\{1, 2, 3\}$ such that $A_i = B_j$ and $A_j = B_i$;
Exception 2. $A_1, A_2, A_3, B_1, B_2, B_3$ are not all collinear, and there exists a permutation $(i, j, k)$ of $\{1, 2, 3\}$ such that $A_i = B_j = B_k$ or $B_i = A_j = A_k$.
These exceptions are minimal in the following sense. For any points $A_1, A_2, A_3, B_1, B_2, B_3$ satisfying the conditions of Exception 1 or Exception 2 and such that $A_1 - A_2 - A_3$ and $B_1 - B_2 - B_3$, there exist three non-collinear points $C_1, C_2, C_3$ such that $C_k - A_i - B_j$ for all permutations $(i, j, k)$ of the set $\{1, 2, 3\}$.
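The statement of Theorem (Pappus) can be checked on small instances in exact arithmetic. The following is a minimal sketch, not part of the paper: points are written in homogeneous integer coordinates, lines and intersections are computed with cross products, and all function names (`cross`, `collinear`, `pappus_points`, `hypotheses_hold`, `proportional`) as well as the sample points are our own hypothetical choices. It checks one generic configuration, the coordinates used for the generic case in the proof in Section 3 with the sample values α = 2, β = 3, and one configuration falling under Exception 1.

```python
from itertools import permutations

def cross(u, v):
    """Cross product of two homogeneous triples.

    For two points it gives the line through them; for two lines,
    their intersection point. Equal (or proportional) inputs give (0, 0, 0).
    """
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def collinear(u, v, w):
    """Three points are collinear iff the 3x3 determinant vanishes."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2] == 0

def proportional(u, v):
    """Two homogeneous triples represent the same projective point."""
    return cross(u, v) == (0, 0, 0)

def pappus_points(A, B):
    """C_k = (A_i B_j) ∩ (A_j B_i) for {i, j, k} = {0, 1, 2}.

    Only meaningful when the two lines are well defined and distinct.
    """
    C = [None] * 3
    for k in range(3):
        i, j = [m for m in range(3) if m != k]
        C[k] = cross(cross(A[i], B[j]), cross(A[j], B[i]))
    return C

def hypotheses_hold(A, B, C):
    """C_k - A_i - B_j for every permutation (i, j, k)."""
    return all(collinear(C[k], A[i], B[j])
               for i, j, k in permutations(range(3)))

# Generic configuration: A's on the line y = 0, B's on the line y = 1.
A = [(0, 0, 1), (1, 0, 1), (3, 0, 1)]
B = [(0, 1, 1), (2, 1, 1), (5, 1, 1)]
C = pappus_points(A, B)
generic_ok = hypotheses_hold(A, B, C) and collinear(*C)

# Coordinates of the generic case in the proof, with sample alpha, beta.
alpha, beta = 2, 3
Ap = [(1, 0, 0), (0, 1, 0), (1, alpha, 0)]
Bp = [(0, 0, 1), (1, 1, 1), (1, 1, beta)]
Cp = pappus_points(Ap, Bp)

# Exception 1: all six points on y = 0 with A_1 = B_2 and A_2 = B_1.
# C_1, C_2 are distinct points on that line, C_3 is off it.
Ae = [(0, 0, 1), (1, 0, 1), (2, 0, 1)]
Be = [Ae[1], Ae[0], (3, 0, 1)]
Ce = [(5, 0, 1), (7, 0, 1), (0, 1, 1)]
exception_ok = hypotheses_hold(Ae, Be, Ce) and not collinear(*Ce)

print(generic_ok, collinear(*Cp), exception_ok)
```

Integer homogeneous coordinates keep every test an exact comparison with zero, so no floating-point tolerance is involved. In the Exception 1 instance the lines $A_1B_2$ and $A_2B_1$ degenerate to single points, which is exactly why $C_3$ is unconstrained there.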
3. Proofs
Here is some more notation that we shall use in this section: $AB$ will denote the line through $A$ and $B$, and this notation will imply that the points $A$ and $B$ have been assumed or proved to be distinct or that this is not difficult to see; $\ell_1 \nparallel \ell_2$ will be used to mean that the lines $\ell_1$ and $\ell_2$ have exactly one common point.
In this section, we shall first give a proof for the simpler case $n \ge 3$, preceded by Lemma 1, and then, a proof for $n = 2$, preceded by the proof of Theorem (Pappus) and by Lemma 2.
Lemma 1 (Timelike-plane-into-plane). Suppose that the conditions of Theorem (Main) hold and a 2-plane $\pi$ in $\mathbb{R}^n$ contains a timelike line $\ell$ [such a plane may be called timelike]. Then $\dim T(\pi) \le 2$.
Proof. Without loss of generality (w.l.o.g.), there exists a point $A \in \pi$ such that $A' \notin \ell'$ [otherwise, there is nothing to prove]. Let $\pi'$ be the 2-plane containing $A'$ and $\ell'$. Let $X \in \pi$. It suffices to show that $X' \in \pi'$. Choose $Q \in \ell$ so that $Q' \ne A'$ and $AQ$ is timelike. Note that $AQ \nparallel \ell$, because $A' \notin \ell'$ and hence $A \notin \ell$, while $Q \in \ell$. Next, choose $Q_1 \in AQ$ so that $Q_1' \ne Q'$, $XQ_1$ is timelike, and there exists a unique point $D \in \ell \cap XQ_1$ [which is possible since $AQ \nparallel \ell$]. Then $Q_1' \notin \ell'$ [otherwise, $A' \in Q'Q_1' = \ell'$]. Hence, $Q_1' \ne D'$, and so, $X' \in Q_1'D' \subset \pi'$ [since $Q_1' \in A'Q' \subset \pi'$ and $D' \in \ell' \subset \pi'$].
Proof of Theorem (Main), case $n \ge 3$. Let $A - B - C$. By Theorem G, it suffices to show that $A' - B' - C'$. Assume that the latter is false. Then $A, B, C$ are distinct and the line $AB$ is not timelike. Let $\ell_1$ be a timelike line through $A$, and $\pi_1$ the 2-plane containing $A$, $B$, $C$, and $\ell_1$. By Lemma 1 (timelike-plane-into-plane), there exists a 2-plane $\pi_1'$ such that $T(\pi_1) \subseteq \pi_1'$. Since $T$ is onto $\mathbb{R}^n$ and $n \ge 3$, there exists a point $E$ such that $E' \notin \pi_1'$. Then $E \notin \ell_1$. Choose now a point $Q_1 \in \ell_1$ so that $EQ_1$ is timelike. Then choose $Q_2 \in EQ_1$ so that $Q_2' \ne Q_1'$ and $\ell_2 := AQ_2$ is timelike; note that $Q_2' \ne Q_1'$ implies $Q_2' \notin \pi_1'$ [otherwise, $E' \in Q_1'Q_2' \subset \pi_1'$]. Let $\pi_2$ be the 2-plane containing $A$, $B$, $C$, and $\ell_2$. Again by Lemma 1 (timelike-plane-into-plane), there exists a 2-plane $\pi_2'$ such that $T(\pi_2) \subseteq \pi_2'$. Note that $\pi_2' \ne \pi_1'$, because $Q_2' \in \pi_2' \setminus \pi_1'$. Thus, the points $A', B', C'$ belong to the intersection of the two different 2-planes $\pi_1'$ and $\pi_2'$ and hence are collinear.
Proof of Theorem (Pappus). Let $\ell_A$ and $\ell_B$ be any lines containing the $A_i$’s and the $B_i$’s, respectively. In view of the symmetry of the statement of the theorem with respect to all permutations of the indices $\{1, 2, 3\}$ and the interchangeability of the roles of the $A_i$’s with those of the $B_i$’s and excluding the situations described in Exception 1 and Exception 2, it suffices to consider the cases listed below; after that, we shall demonstrate the minimality of the exceptions.
Let us point out that the only case actually needed in the proof of Theorem (Main) is II.1.1, so that the reader interested primarily in Theorem (Main) here may skip all the other cases. It is meant below that, say, II.2.3 is a subcase of II.2, so that all the conditions defining II.2 apply to II.2.3 as well:
I. $A_1 = B_2$, $A_2 = B_1$, and $A_1, A_2, A_3, B_1, B_2, B_3$ are not all collinear.
Here, $A_3 \ne A_1 = A_2 = B_1 = B_2 \ne B_3$, whence $C_1 = C_2$, and so, $C_1 - C_2 - C_3$.
II. $A_i \ne B_j$ or $A_j \ne B_i$, for every pair of distinct $i$ and $j$.
II.1. $A_1, A_2, A_3$ are distinct and $A_1, A_2, A_3, B_1, B_2, B_3$ are not all collinear.
II.1.1. $\{B_1, B_2, B_3\} \cap \ell_A = \emptyset$.
II.1.1.1. $B_1 \ne B_2$ and $\{A_1, A_2\} \cap \ell_B = \emptyset$. This is the “non-trivial” and well-known case, at least if $A_1, A_2, A_3, B_1, B_2, B_3$ are all distinct; we shall verify it separately below for the reader’s convenience.
II.1.1.2. $B_1 \ne B_2$ and $A_1 \in \ell_B$. Here, $C_3 = B_1 = C_2$.
II.1.1.3. $B_1 = B_2 = B_3$. Here, $C_1 = C_2 = C_3$.
II.1.2. $B_1 \in \ell_A$, $\{B_2, B_3\} \cap \ell_A = \emptyset$.
II.1.2.1. $B_1 \notin \{A_2, A_3\}$. Here, $C_2 = A_1 = C_3$.
II.1.2.2. $B_1 = A_2$. Here, $C_1 = B_2$, $C_2 = A_1$, and so, $C_3 \in A_1B_2 = C_2C_1$.
II.1.3. $B_1 = B_2 \in \ell_A \setminus \{A_3\}$, $B_3 \notin \ell_A$. Here, $\{C_1, C_2, C_3\} \subset \ell_A$.
II.2. $A_1, A_2, A_3$ are not distinct, $B_1, B_2, B_3$ are not distinct, and $A_1, A_2, A_3, B_1, B_2, B_3$ are not all collinear.
II.2.1. $A_1 \ne A_2 = A_3$, $B_1 = B_2 = B_3 \notin \ell_A$. Here, $C_2 = B_1 = C_3$.
II.2.2. $A_1 \ne A_2 = A_3$, $B_1 \ne B_2 = B_3$.
II.2.2.1. $B_2 = B_3 \in \ell_A \setminus \{A_1\}$, $B_1 \notin \ell_A$. Here, $C_2 = A_3 = A_2 = C_3$.
II.2.2.2. $B_2 = B_3 \notin \ell_A$, $B_1 \in \ell_A \setminus \{A_2, A_3\}$ [$= \ell_A \setminus \{A_2\}$]. Here, $C_2 = A_1 = C_3$.
II.2.2.3. $B_2 = B_3 \notin \ell_A$, $B_1 \notin \ell_A$. Here, $C_2 = C_3$.
II.2.3. $A_1 \ne A_2 = A_3$, $B_1 = B_2 \ne B_3$.
II.2.3.1. $B_1 = B_2 \in \ell_A \setminus \{A_2, A_3\}$, $B_3 \notin \ell_A$. Here, $C_1 = A_2$, $C_2 = A_1$, and $C_3 \in A_2B_1 = \ell_A$.
II.2.3.2. $B_1 = B_2 \notin \ell_A$, $B_3 \in \ell_A$.
II.2.3.2.1. $B_3 = A_1$. Here, $C_1 = A_3$, $C_3 = B_1$, and so, $C_2 \in A_3B_1 = C_1C_3$.
II.2.3.2.2. $B_3 \ne A_1$. Here, $C_2 = A_3$, $C_3 = B_1$, and so, $C_1 \in A_3B_2 = A_3B_1 = C_2C_3$.
II.2.3.3. $B_1 = B_2 \notin \ell_A$, $B_3 \notin \ell_A$. Here, $C_3 = B_1$, $C_1 \in A_3B_2 = A_3B_1$, $C_2 \in A_3B_1$, and so, $\{C_1, C_2, C_3\} \subset A_3B_1$.
II.3. $A_1, A_2, A_3, B_1, B_2, B_3$ lie all on one line, say $\ell$. Here, $\{C_1, C_2, C_3\} \subset \ell$.
Of all the above cases, it remains to verify only II.1.1.1. Here, w.l.o.g., the $A_i$’s and the $B_i$’s may be identified with the triples of projective coordinates as follows: $A_1 = (1, 0, 0)$, $A_2 = (0, 1, 0)$, $A_3 = (1, \alpha, 0)$, $B_1 = (0, 0, 1)$, $B_2 = (1, 1, 1)$, $B_3 = (1, 1, \beta)$ for some real numbers $\alpha \ne 0$ and $\beta$. One can then easily check the equations of the lines:
$A_1B_2$: $y - z = 0$, $A_2B_1$: $x = 0$,
$A_2B_3$: $\beta x - z = 0$, $A_3B_2$: $\alpha x - y + (1 - \alpha)z = 0$,
$A_1B_3$: $\beta y - z = 0$, $A_3B_1$: $\alpha x - y = 0$,
in the projective coordinates $(x, y, z)$, and the coordinates of the points: $C_3 = (0, 1, 1)$, $C_1 = (1, \beta(1 - \alpha) + \alpha, \beta)$, $C_2 = (1, \alpha, \alpha\beta)$. Thus, $C_1 - C_2 - \beta(1 - \alpha)C_3 = 0$ as an identity between coordinate triples, and so, the points $C_1, C_2, C_3$ are collinear. [The condition $\alpha \ne 0$ was not used here.]
Let us finally demonstrate the minimality of the exceptions. Indeed, if the conditions of Exception 1 are satisfied, one can take $\ell_A = \ell_B$ and then take $C_i$ and $C_j$ to be any two distinct points on $\ell_A = \ell_B$ and $C_k$ to be any point not on $\ell_A = \ell_B$, where $\{k\} = \{1, 2, 3\} \setminus \{i, j\}$. In the other case, when the conditions of Exception 2 are satisfied,
so that, for instance, $B_i \notin \ell_A$ and $A_i = B_j = B_k$ for some distinct $i, j, k$, one can take $C_j := A_k$, $C_k := B_i$, and any $C_i \in \ell_A \setminus \{A_k\}$.
Lemma 2 (Breeding-image-points). Suppose that the conditions of Theorem (Main) hold, $n = 2$, and $\ell$ is any timelike line in $\mathbb{R}^2$. Then the image under $T$ of any ray of $\ell$ contains infinitely many points.
Proof. Suppose that, on the contrary, there exists a ray $r_0$ of $\ell$ such that $\operatorname{card} T(r_0) = m \in \{1, 2, \ldots\}$. Let $\ell'$ stand for the line containing $T(\ell)$. Since $T$ is onto and $\operatorname{card} T(r_0) < \infty$, one can successively construct points $D_1, D_2, \ldots$ such that $D_1' \notin \ell'$ and $D_i' \notin \ell' \cup \bigcup \{D_j'U' : U \in r_0,\ j \in \{1, \ldots, i - 1\}\}$ for all $i \in \{2, 3, \ldots\}$. Then $D_i' \notin \ell'$ and $D_i'U' \ne D_j'U'$ for all $U \in r_0$ and all distinct $i$ and $j$. Hence, $D_i \notin \ell$ $\forall i$. At least one of the two closed half-planes of $\mathbb{R}^2$ bounded by the line $\ell$ will contain infinitely many points of the set $\{D_1, D_2, \ldots\}$; denote any of such half-planes by $H$. W.l.o.g., one has $\{D_1, \ldots, D_{m+1}\} \subset H \setminus \ell$.
Let us now take $r$ to be such a closed ray that $r \subseteq r_0$ and the lines $D_iU$ are timelike for all $U \in r$ and all $i \in \{1, \ldots, m + 1\}$. Let $Q_1$ be the boundary point of $r$ in $\ell$, so that $r$ emanates from $Q_1$. Let $V$ be any point on $r$ different from $Q_1$. Note that $D_iQ_1 \ne D_jQ_1$ if $i \ne j$, since $D_i'U' \ne D_j'U'$ for all $U \in r_0$ and all distinct $i$ and $j$.
W.l.o.g., it is the angle $\angle D_1Q_1V$ that is the largest among the angles $\{\angle D_iQ_1V : i \in \{1, \ldots, m + 1\}\}$. Here, $\angle ABC$ stands for the value of the angle between the vectors $A - B$ and $C - B$ that is in the interval $[0, \pi]$. Then one of the two closed half-planes bounded by the line $D_1Q_1$ contains $\{D_1, \ldots, D_{m+1}\} \cup r$; denote this half-plane by $G$.
Choose $S \in D_1Q_1$ so that $S' \ne Q_1'$ and the angles $\angle D_iSD_1$ are small enough for all $i \in \{2, \ldots, m + 1\}$. Then $\forall i \in \{2, \ldots, m + 1\}$ $\exists Q_i \in r$ such that $SD_i \cap \ell = \{Q_i\}$, because the line through $D_i$ that is parallel to $D_1Q_1$ lies in the half-plane $G$, for every $i \in \{2, \ldots, m + 1\}$. Observe also that $S' \notin \ell'$ [otherwise, $D_1' \in S'Q_1' = \ell'$]. It remains to notice that the points $Q_1', \ldots, Q_{m+1}'$ are all distinct. Indeed, if $Q_i' = Q_j'$ for some distinct $i$ and $j$ in $\{1, \ldots, m + 1\}$, then $D_i' \in S'Q_i' = S'Q_j' = D_j'Q_j'$, which contradicts the construction of the $D_i$’s.
Proof of Theorem (Main), case $n = 2$. Let $A_1, A_2, A_3$ be points on some line $\ell_A$. In view of Theorem G, it suffices to prove that $A_1' - A_2' - A_3'$. W.l.o.g., $\ell_A$ is not timelike and $A_1', A_2', A_3'$ are distinct [whence $A_1, A_2, A_3$ are distinct]. Let $\ell_1, \ell_2, \ell_3$ be three parallel timelike lines through $A_1, A_2, A_3$. Let $\ell_1', \ell_2', \ell_3'$ be the lines containing $T(\ell_1), T(\ell_2), T(\ell_3)$, respectively. Then, $A_i \notin \ell_j$ if $i \ne j$ [otherwise, $\ell_A = A_iA_j = \ell_j$ is timelike]. W.l.o.g., $\ell_2' \ne \ell_1'$ [$\ell_1' = \ell_2' = \ell_3'$ would imply $\{A_1', A_2', A_3'\} \subset \ell_1'$]. Henceforth, the line $\ell_3$ will no longer be used.
Using Lemma 2 (Breeding-image-points), construct successively points $C_1, C_2, C_3$ so that the following is true:
$C_1 \in \ell_1$, $C_1' \notin \{A_1', A_2', A_3'\} \cup (\ell_1' \cap \ell_2')$ [note that $\operatorname{card}(\ell_1' \cap \ell_2') \le 1$ since $\ell_2' \ne \ell_1'$], and the lines $C_1A_2$ and $C_1A_3$ are timelike;
$C_2 \in \ell_2$, $C_2' \notin \{A_1', A_2', A_3', C_1'\} \cup (\ell_1' \cap \ell_2') \cup (C_1'A_3' \cap \ell_2')$ [note that $\operatorname{card}(C_1'A_3' \cap \ell_2') \le 1$ since $C_1' \notin \ell_2'$], the lines $C_2C_1$, $C_2A_1$, and $C_2A_3$ are timelike, and there exists a point $B_3$ uniquely determined by the condition $B_3 \in C_1A_2 \cap C_2A_1$; this is possible because $C_1A_2 \nparallel \ell_2$ and the angle $d(C_2A_1, \ell_2)$ between the lines $C_2A_1$ and $\ell_2$ may be made arbitrarily small; the latter condition will also imply that $B_3$ is close to $C_1$; here and in the sequel, $d(\cdot, \cdot)$ denotes the value of the angle between two lines that is in the interval $[0, \pi/2]$.
C3 ∈ C1 C2 , C3 ∈ {A1 , A2 , A3 , C1 , C2 } ∪ (1 ∩ 2 ), the lines C3 A1 and C3 A2 are timelike, and there exist two points: B1 uniquely determined by B1 ∈ C2 A3 ∩ C3 A2 [which is possible because C2 ∈ C1 A3 implies C2 A3 /| C1 C2 and because the angle d(C3 A2 , C1 C2 ) may be made arbitrarily small] and B2 uniquely determined by B2 ∈ C1 A3 ∩ C3 A1 [which is possible because C2 ∈ C1 A3 implies C1 A3 /| C1 C2 and because the angle d(C3 A1 , C1 C2 ) may be made arbitrarily small]. Thus, the points C1 , C2 , C3 , A1 , A2 , A3 are all distinct, whence C1 , C2 , C3 , A1 , A2 , A3 are so. Therefore, by Theorem (Pappus), B1 −B2 −B3 . Alternatively, B1 −B2 −B3 follows from Theorem (Pappus) because {A1 , A2 , A3 } ∩ C1 C2 = ∅, which follows from {A1 , A2 , A3 } ∩ C1 C2 = ∅. [Note that A1 ∈ C1 C2 would imply C2 ∈ A1 C1 = 1 , A2 ∈ C1 C2 would imply C1 ∈ A2 C2 = 2 , and A3 ∈ C1 C2 would imply C2 ∈ C1 A3 ; in any of these three events, one would have a contradiction with the above construction.] Having {A1 , A2 , A3 }∩C1 C2 = ∅, it is also easy to see that {B1 , B2 , B3 }∩C1 C2 = ∅. [Indeed, now B1 ∈ C1 C2 would imply B1 ∈ (C1 C2 ∩C2 A3 )∩(C1 C2 ∩C3 A2 ) = {C2 }∩ {C3 } = ∅, contradiction; similarly, {B2 , B3 } ∩ C1 C2 = ∅ would imply a contradiction.] If one could now show that the collinear points B1 , B2 , B3 lie on a timelike line, then one would have B1 −B2 −B3 , and so, using Theorem (Pappus) once again, one would obtain A1 −A2 −A3 . Thus, it suffices to show that one can further specialize the above construction so that B3 = B1 and the line B3 B1 is timelike. Let us fix C1 – as well as A1 , A2 , A3 , 1 , and 2 – and let C2 and C3 vary so that, in addition to all the other conditions that they have been assumed to satisfy, one has |C2 − A2 | → ∞ and |C3 − C2 |/|C2 − A2 | → ∞. Then the angles d(C1 C2 , 2 ) → 0 and d(C3 A2 , C1 C2 ) = d(C3 A2 , C3 C2 ) → 0, and so, d(C3 A2 , 2 ) → 0. 
Hence, d(A2 B1 , 2 ) = d(C3 A2 , 2 ) → 0 [note that B1 = A2 ; otherwise, A2 ∈ A3 C2 , and so, A3 ∈ A2 C2 = 2 ]. This implies |B1 − A2 | → ∞: otherwise, w.l.o.g., ∃B1,∞ ∈ 2 B1 → B1,∞ , and so, C2 → B1,∞ [since A3 ∈ 2 and {C2 } = 2 ∩ A3 B1 ], which contradicts |C2 − A2 | → ∞. As we noted in the paragraph introducing C2 , one has B3 → C1 . Hence, |B3 − A2 | → |C1 − A2 |. This and |B1 − A2 | → ∞ imply B1 = B3 [eventually] and d(B3 B1 , A2 B1 ) → 0; recalling now that d(A2 B1 , 2 ) → 0, one obtains d(B3 B1 , 2 ) → 0, whence B3 B1 is timelike.
References
1. Alexandrov, A. D.: On Lorentz transformations. Usp. Mat. Nauk 5, No. 3(37), 187 (1950); proofs may be found in Alexandrov, A. D.: Mapping of affine spaces with systems of cones. Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov (LOMI) 27, 7–16 (1972)
2. Borchers, H. J., Hegerfeldt, G. C.: The structure of the space-time transformations. Commun. Math. Phys. 28, 259–266 (1972)
3. Borchers, H. J., Hegerfeldt, G. C.: Über ein Problem der Relativitätstheorie: Wann sind Punktabbildungen des Rn linear? Nachr. Akad. Wiss. Göttingen Math.-Phys. Kl. II, 205–229 (1972)
4. Bumcrot, R. J.: Modern Projective Geometry. New York: Holt, Rinehart and Winston, 1969
5. Chubarev, A., Pinelis, I.: Fundamental theorem of geometry without the 1-to-1 assumption. Proc. AMS 127, 2735–2744 (1999)
6. Dixon, W. G.: Special Relativity. Cambridge: Cambridge University Press, 1978
7. Hawrylycz, M.: A geometric identity for Pappus’ theorem. Proc. Nat. Acad. Sci. USA 91, 2909 (1994)
8. Hegerfeldt, G. C.: The Lorentz transformations: Derivation of linearity and scale factor. Il Nuovo Cim. A 10, 257–267 (1972)
9. Lenard, A.: A characterization of Lorentz transformations. J. Math. Phys. 19, 157 (1978)
10. Lester, J. A.: Distance preserving transformations. In: Handbook of Incidence Geometry, Amsterdam: North Holland, 1995, pp. 921–944
11. Naber, G. L.: The Geometry of Minkowski Spacetime. Applied Mathematical Sciences, Vol. 92. New York: Springer, 1992
12. Pauli, W.: Theory of Relativity. Oxford: Pergamon Press, 1958
13. Zeeman, E. C.: Causality implies the Lorentz group. J. Math. Phys. 5, 490–493 (1964)
Communicated by A. Jaffe
Commun. Math. Phys. 215, 443 – 476 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Existence and Properties of p-tupling Fixed Points Henri Epstein Institut des Hautes Etudes Scientifiques, 91440 Bures-sur-Yvette, France Received: 10 April 2000 / Accepted: 7 July 2000
Abstract: We prove the existence of fixed points of p-tupling renormalization operators for interval and circle mappings having a critical point of arbitrary real degree r > 1. Some properties of the resulting maps are studied: analyticity, univalence, behavior as r tends to infinity.
1. Introduction
Two problems have a strong resemblance, and have both found their origin in the theory of period doubling for maps of the interval [F1, F2, CT]. The first is to prove the existence and properties of solutions of the (p + 1)-Cvitanović–Feigenbaum functional equation, i.e. fixed points of the (p + 1)-tupling operator $R_{p+1}$:
$$g(x) = (R_{p+1}\, g)(x) = -\frac{1}{\lambda}\, g^{p+1}(-\lambda x), \qquad g(0) = 1.$$ (1.1)
Here g is required to be an even, $C^1$ map of [−1, 1] into itself, strictly decreasing on [0, 1], and $\lambda = -g^{p+1}(0)$ is required to be in (0, 1). More precisely, the restrictions $g_+$ and $g_-$ of g to [0, 1] and [−1, 0], respectively, must satisfy
$$g_+ = -\frac{1}{\lambda}\, g_-^{\,p-1} \circ g_+^{\,2} \circ (\lambda), \qquad g(0) = 1.$$ (1.2)
Denoting u the inverse function of $g_+$, and ǔ(z) = u(−z), this can be reexpressed as
$$u = \frac{1}{\lambda}\, u \circ \check{u}^{\,p} \circ \lambda.$$ (1.3)
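The passage from (1.2) to (1.3) can be made explicit by inverting each factor of the composition. The following is only a sketch: it uses the definitions $u = g_+^{-1}$ and $\check{u}(z) = u(-z)$, together with the evenness of g, which gives $g_-^{-1} = -u$; powers denote iterates.

```latex
\begin{align*}
g_+ &= \left(-\tfrac{1}{\lambda}\right) \circ g_-^{\,p-1} \circ g_+ \circ g_+ \circ (\lambda) \\
\Longrightarrow\quad
u &= \left(\tfrac{1}{\lambda}\right) \circ u \circ u \circ (-u)^{p-1} \circ (-\lambda) \\
  &= \tfrac{1}{\lambda}\; u \circ \bigl(u \circ (-1)\bigr)^{p} \circ \lambda
   \;=\; \tfrac{1}{\lambda}\; u \circ \check{u}^{\,p} \circ \lambda .
\end{align*}
```

The last step only regroups the sign changes: $(-u)^{p-1} \circ (-1) = (-1) \circ (u \circ (-1))^{p-1}$.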
We shall also require $g_+$ to have the form
$$g_+(x) = f(x^r) \quad \forall x \in [0, 1],$$ (1.4)
where r > 1 is a real number, and f is real-analytic on [0, 1], with f′(x) < 0 on this closed interval. The class of those g having the property (1.4) for a fixed r is left invariant by $R_{p+1}$. The second problem is to prove the existence and properties of solutions of the system
$$\eta = -\frac{1}{\lambda}\, \eta^{p} \circ \xi \circ (-\lambda), \qquad \xi = -\frac{1}{\lambda}\, \eta \circ (-\lambda), \qquad \xi(0) = 1, \qquad \lambda = -\eta(0) \in (0, 1).$$ (1.5)
Here ξ is a real $C^1$, strictly increasing function defined on a certain interval [−L, 0] of the negative real axis (L > 1) and satisfies ξ(x) > x on this interval. Again ξ is required to be of the form
$$\xi(x) = f(|x|^r) \quad \forall x \in [-L, 0],$$ (1.6)
where r > 1 is a real number, and f is real analytic without critical points on $[0, L^r]$. Let −u be the inverse function of ξ, and ǔ(z) = u(−z). Then (1.5) implies
$$u = \frac{1}{\lambda^{2}}\, u \circ \lambda \circ \check{u}^{\,p} \circ \lambda.$$ (1.7)
The system (1.5) is part of the theory, initiated in [FKS] and [ORSS], of critical circle mappings whose rotation number has the continued fraction expansion [p, p, . . . p, . . . ]. It is natural to attempt a unified treatment of the two functional equations (1.3) and (1.7) by introducing an interpolating parameter ν ∈ [1, 2] and considering the functional equation
$$u = \frac{1}{\lambda^{\nu}}\, u \circ \lambda^{\nu-1} \circ \check{u}^{\,p} \circ \lambda.$$ (1.8)
As a device for avoiding repetitions, this works rather well for p = 1, ([EE, E2, E3]). It is much less effective, as we shall see, for p > 1. It is also of some interest to consider the case when ν < 1. The history of this subject is long, even if restricted to rigorous results (see e.g. [L1, L2]), and the literature has experienced a veritable explosion in recent times. For the case of interval maps, the paper of M. Lyubich [Ly] (a kind of culminating point) contains a historical note and references to which I refer the reader. For the case of circle maps, the reader is referred to the paper of M. Yampolsky [Y] and to references therein. However the literature has tended to concentrate on the case of integer r, with notable exceptions such as [CEL, JR, M, MO]. Another, most important exception is the whole theory of “real a priori bounds” (see [dMvS, S1, S2, Sw, dFdM] and other references given in [Ly,Y]). In this paper, we look for solutions of the functional equations (1.8), for arbitrary real r, which are subjected to some additional constraints (see Sect. 3). All the available theoretical and numerical evidence indicates that, for each ν ∈ [1, 2], each r > 1, and each p ≥ 1, there is one and only one solution obeying all the constraints. This suggests that the solution (and in particular λ) must depend analytically on the parameters ν, r, and p. For p = 1, it has been proved in [E3] that solutions exist for all ν ∈ [1, 2] and all r > 1, and the proof extends without any change to the case ν ∈ (0, 1] provided rν > 1. In the case ν = 1, the existence of solutions for all p and all r > 1 has been proved by M. Martens [M], whose results go much farther since they include all possible periodic points and kneading sequences. In this paper, the existence of solutions will be proved, by another method, in the case 1 ≤ ν ≤ 2, for all r > 1 and all (integer) p ≥ 1. 
It will be seen that in the case 0 < ν ≤ 1, the condition rν − 1 − (p − 1)(1 − ν) > 0 is necessary and sufficient for the existence of solutions.
This work had remained unfinished for a long time¹ when I belatedly became aware of the paper of B. Mestel and A. Osbaldestin [MO], devoted to the proof of the existence of a period 2 point of the doubling operator for non-even maps (with arbitrary r > 1). One of the ideas in that paper allowed me to finish the proof of existence in the case ν > 1 (see Subsect. 6.2). Section 2 collects some notations and well-known or straightforward facts (see [D, V]). Sections 3–6 contain the proofs of existence. In Sect. 7 some properties of the solutions are derived (univalence, boundedness). In Sect. 8 it is shown that for ν ∈ (0, 1], when r tends to infinity the solutions behave similarly to those of the case p = 1 (see [EW, E1, EE]).
2. Notations and Preliminaries
1. We denote $\mathbb{C}_+ = -\mathbb{C}_- = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$. A function f is a Herglotz or Pick function [D] (and −f is an anti-Herglotz function) if it is holomorphic in $\mathbb{C}_+ \cup \mathbb{C}_-$, $f(z^*) = f(z)^*$, and f maps $\mathbb{C}_+$ (resp. $\mathbb{C}_-$) into its closure, $f(\mathbb{C}_\pm) \subset \overline{\mathbb{C}_\pm}$. If f is also holomorphic on a real non-empty open segment (a, b), then, for each x ∈ (a, b), and each N ∈ ℕ the N × N matrix M with components $M_{jk} = D^{j+k+1} f(x)/(j + k + 1)!$, (0 ≤ j, k < N), is positive. This follows immediately from the Herglotz integral representation theorem ([D], pp. 20 ff.). The case N = 2 shows that if f is not a constant, then f′(x) > 0 ∀x ∈ (a, b), and f has non-negative Schwarzian derivative $Sf = (f''/f')' - (f''/f')^2/2$ in (a, b). Denote $v = f''/f'$ and suppose a < x < y < b. If v does not vanish in [x, y],
$$\frac{1}{v(x)} - \frac{1}{v(y)} \ge \frac{y - x}{2}.$$ (2.1)
If v(x) > 0 then v(y) > 0 since v is increasing, hence v(x) ≤ 2/(y − x), which also holds if v(x) ≤ 0. Similarly v(y) ≥ −2/(y − x). Letting y tend to b or x tend to a, we find:
$$-\frac{2}{z - a} \le \frac{f''(z)}{f'(z)} \le \frac{2}{b - z} \quad \forall z \in (a, b).$$ (2.2)
If f((a, b)) has a finite upper bound, then f|(a, b) extends continuously to (a, b] with f(b) = sup f((a, b)), and similarly if there is a finite lower bound. If f maps [a, b] into (a, b), it has a fixed point in (a, b) which (by Schwarz’s lemma) is unique and attractive; in this case every subinterval of (a, b) which contains the fixed point is mapped into itself by f. If F is an increasing function with non-negative Schwarzian on (0, +∞) (in particular if F is a Herglotz function holomorphic in $\mathbb{C}_+ \cup \mathbb{C}_- \cup (0, +\infty)$), then its restriction to (0, +∞) is concave as a special case of (2.2). The following corollary will be needed later:
Lemma 2.1. Let A < a < b < B be real numbers, and f be a Herglotz function holomorphic in $\mathbb{C}_+ \cup \mathbb{C}_- \cup (A, B)$. Then, for each z ∈ (a, b),
$$f(z) \ge \frac{(B - b)(z - a) f(b) + (B - a)(b - z) f(a)}{(b - a)(B - z)}.$$ (2.3)
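As a quick numerical illustration of Lemma 2.1, the sketch below evaluates both sides of (2.3) on a grid. The test function f(z) = 1/(c − z) is an assumption chosen for illustration (it is Herglotz and holomorphic for real z < c); the names `rhs_2_3` and `make_f` are hypothetical helpers, not part of the paper.

```python
def rhs_2_3(f, a, b, B, z):
    """Right-hand side of inequality (2.3)."""
    num = (B - b) * (z - a) * f(b) + (B - a) * (b - z) * f(a)
    return num / ((b - a) * (B - z))


def make_f(c):
    """f(z) = 1/(c - z): Herglotz, holomorphic for real z < c (illustrative choice)."""
    return lambda z: 1.0 / (c - z)


# A < a < b < B as in Lemma 2.1 (A only fixes the domain of holomorphy).
A, a, b, B = -1.0, 0.0, 1.0, 2.0
grid = [a + (b - a) * k / 20 for k in range(1, 20)]

# Pole at 3 > B: the inequality is strict on (a, b).
f_strict = make_f(3.0)
gaps_strict = [f_strict(z) - rhs_2_3(f_strict, a, b, B, z) for z in grid]

# Pole exactly at B: F(w) = f(B - 1/w) = w is linear, so (2.3) is an equality.
f_equal = make_f(B)
gaps_equal = [f_equal(z) - rhs_2_3(f_equal, a, b, B, z) for z in grid]
```

The equality case shows that the bound cannot be improved without further assumptions on f.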
1 The case ν = 1, all r and p, was presented at the Meeting on New Developments in Mathematical Physics and Neuroscience (Hunziker-Hepp Fest), ETH, Zurich, 21–23/9/1995.
Proof. We define $F(z) = f(B - z^{-1})$, i.e. f(z) = F(1/(B − z)). Then F is a Herglotz function holomorphic in $\mathbb{C}_+ \cup \mathbb{C}_- \cup (1/(B - A), +\infty)$. Setting now a′ = 1/(B − a), b′ = 1/(B − b) and z′ = 1/(B − z), with z ∈ (a, b), the concavity of F implies
$$F(z') \ge \frac{z' - a'}{b' - a'}\, F(b') + \frac{b' - z'}{b' - a'}\, F(a'),$$ (2.4)
which translates back into (2.3).
2. Let A, B, A′, B′ be strictly positive real numbers. Then the homographic function
$$z \mapsto m(z; A, B, A', B') = \frac{z \left( \frac{1}{A} + \frac{1}{B} \right)}{z \left( \frac{1}{AB'} - \frac{1}{BA'} \right) + \frac{1}{A'} + \frac{1}{B'}}$$ (2.5)
is a bijection of $\mathbb{C}_+ \cup \mathbb{C}_- \cup [-A, B]$ onto $\mathbb{C}_+ \cup \mathbb{C}_- \cup [-A', B']$, and fixes 0. Its derivative at 0 is
$$m'(0; A, B, A', B') = \frac{A' B' (A + B)}{A B (A' + B')}.$$ (2.6)
Hence if f is a holomorphic map of $\mathbb{C}_+ \cup \mathbb{C}_- \cup (-A, B)$ into $\mathbb{C}_+ \cup \mathbb{C}_- \cup (-A', B')$ which fixes 0, it follows from Schwarz’s lemma applied to $m^{-1} \circ f$, that |f′(0)| ≤ m′(0; A, B, A′, B′).
3. Let b, s be real numbers, with 0 < b < 1, and s > 1. Then the homographic function
$$z \mapsto \chi_{b,s}(z) = 1 + b^{s-1}\, \frac{b(1 + b)(z - 1)}{1 + b - b^2 (z - 1)}$$ (2.7)
is holomorphic in $\mathbb{C}_+ \cup \mathbb{C}_- \cup (-1/b, 1/b^2)$, Herglotzian, and
$$\chi_{b,s}(-1/b) > 0, \qquad \chi_{b,s}(1/b^2) < 1/b^2, \qquad \chi_{b,s}(1) = 1, \qquad \chi'_{b,s}(1) = b^s.$$ (2.8)
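The endpoint and derivative properties stated for the two homographic maps in (2.5)–(2.8) are easy to check numerically. The following is a minimal sketch; the function names and the sample parameter values are illustrative assumptions, and `A1`, `B1` stand for A′, B′.

```python
def m(z, A, B, A1, B1):
    """Homographic map (2.5); A1 and B1 stand for A' and B'."""
    return z * (1 / A + 1 / B) / (z * (1 / (A * B1) - 1 / (B * A1)) + 1 / A1 + 1 / B1)


def m_prime_at_0(A, B, A1, B1):
    """Derivative (2.6) of m at 0."""
    return A1 * B1 * (A + B) / (A * B * (A1 + B1))


def chi(z, b, s):
    """Homographic map (2.7)."""
    return 1 + b ** (s - 1) * b * (1 + b) * (z - 1) / (1 + b - b ** 2 * (z - 1))


def central_difference(fun, x, h=1e-6):
    """Symmetric finite-difference approximation of fun'(x)."""
    return (fun(x + h) - fun(x - h)) / (2 * h)


# Sample values (arbitrary, for illustration only).
A, B, A1, B1 = 2.0, 3.0, 1.0, 4.0
b, s = 0.5, 2.0

m_slope = central_difference(lambda z: m(z, A, B, A1, B1), 0.0)
chi_slope = central_difference(lambda z: chi(z, b, s), 1.0)
```

With these values m sends −A to −A′ and B to B′, and a short computation from (2.7) gives χ_{b,s}(−1/b) = 1 − b^{s−1}, which is positive as claimed in (2.8).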
4. In the sequel, if s and t > s are real numbers, we shall denote %(s, t) the domain %(s, t) = C+ ∪ C− ∪ (s, t).
(2.9)
If u− and u+ are two real numbers in (0, 1), we denote E0(u−, u+) the space of functions ψ, holomorphic and anti-Herglotzian in %(−1/u−, 1/u+), and such that ψ(0) = 1, ψ(1) = 0. Such a function has an integral representation
$$\log \psi(z) = \int_{\mathbb{R} \setminus (-1/u_-,\, 1)} \sigma(t) \left[ \frac{1}{t} - \frac{1}{t - z} \right] dt$$ for all z ∈ %(−1/u−, 1). (2.10)
Here σ is an $L^\infty$ function with 0 ≤ σ ≤ 1 and σ(t) = 1 for all t ∈ [1, 1/u+]. It follows that ψ satisfies the following inequalities:
$$\frac{\psi(z)(1 - u_+ z)}{1 - z} \le 1 \le \frac{\psi(z)(1 + u_- z)}{1 - z}$$ for all z ∈ (0, 1/u+) \ {1}, reversed for z ∈ (−1/u−, 0), (2.11)
$$\frac{1 - u_+}{(1 - z)(1 - u_+ z)} \le -\frac{\psi'(z)}{\psi(z)} \le \frac{1 + u_-}{(1 - z)(1 + u_- z)}$$ for all z ∈ (−1/u−, 1/u+) \ {1}. (2.12)
Suppose now that ψ ∈ E0(u−, u+) has a finite upper bound M on (u−, u+). (By (2.11), M must satisfy M ≥ ψ(−1/u−) ≥ (1 + u−)/(u+ + u−).) Then
$$\int_{\mathbb{R} \setminus (-1/u_-,\, 1)} \frac{\sigma(t)\, dt}{t (1 + u_- t)} \le \log M.$$ (2.13)
Therefore, if −1/u− < z < 0,
$$-\frac{\psi'(z)}{\psi(z)} = \int_{\mathbb{R} \setminus (-1/u_-,\, 1)} \frac{\sigma(t)}{t (1 + u_- t)}\, \frac{t (1 + u_- t)}{(t - z)^2}\, dt \le (\log M) \max_{t \notin (-1/u_-,\, 1)} \frac{t (1 + u_- t)}{(t - z)^2} \le \frac{\log M}{(-4z)(1 + u_- z)}.$$ (2.14)
3. More Precise Statement of the Problem
We begin with a few heuristic considerations. It is easy to see that if u is a solution of (1.8), it will analytically extend to the real interval (−1/λ, 1). Moreover, since the function f is analytic without critical point in an open real interval containing 0, its inverse function, denoted U, will be analytic, with strictly negative derivative, in an open interval containing 1. The functions u and U are related by $U(z) = u(z)^r$. Thus
$$U(z) = \frac{1}{\lambda^{r\nu}}\, U(\lambda^{\nu-1}\, \check{u}^{\,p}(\lambda z))$$ (3.1)
should hold wherever both sides are analytic. The main condition which we impose on the solutions we seek is that u and U be anti-Herglotz functions. It is in fact sufficient to impose this condition on u. Indeed denote
$$\varphi = \lambda^{\nu-1}\, \check{u}^{\,p} \circ \lambda.$$ (3.2)
Then ϕ is Herglotzian and Eq. (3.1), rewritten as
$$U(z) = \frac{1}{\lambda^{r\nu}}\, U(\varphi(z))$$ (3.3)
shows first that ϕ(1) must be a zero of U, i.e. that ϕ(1) = 1, and then that U is a linearizer of ϕ at 1, the multiplier ϕ′(1) being equal to $\lambda^{r\nu} < 1$. Therefore U is also anti-Herglotzian, and is holomorphic in the basin of attraction of 1 for ϕ. The reasons for imposing the Herglotz condition have been given e.g. in [EE, E2]. It is more convenient to work with ψ = U/U(0) rather than with U, and we denote $z_1/\lambda^{\nu-1}$ the quantity u(0). This implies $z_1 = \lambda^{\nu-1}\, \check{u}(0) \le \varphi(0) < 1$. We thus adopt the following definition: Given two real numbers ν ∈ (0, 2], r > 1, and an integer p ≥ 1, a solution associated with these values consists of two functions ψ and u and two real numbers λ ∈ (0, 1) and z1 ∈ (0, 1) with the following properties:
(1) ψ is an anti-Herglotzian function holomorphic in %(−1/λ, 1/a) for some a ∈ (0, 1) with ψ(1) = 0 and ψ(0) = 1.
(2) u is holomorphic and anti-Herglotzian in %(−1/λ, 1), and is given there by
$$u(z) = \frac{z_1}{\lambda^{\nu-1}}\, \psi(z)^{1/r}.$$ (3.4)
(3) The identity
$$\psi(z) = \frac{1}{\lambda^{r\nu}}\, \psi(\lambda^{\nu-1}\, \check{u}^{\,p}(\lambda z))$$ (3.5)
holds for all z ∈ %(−1/λ, 1/a), where again u(z) ˇ = u(−z). The following theorem will be proved. Theorem 3.1. (i) For any integer p ≥ 1 and any real ν ∈ (0, 1], there exist solutions if and only if r satisfies rν − 1 − (p − 1)(1 − ν) > 0.
(3.6)
(ii) For any integer p ≥ 1 and any real ν ∈ [1, 2], there exist solutions for all real r > 1.
The necessity of the condition (3.6) will be shown in the next section, but it is easy to see that the conditions we have imposed require rν > 1. It suffices to consider the case 0 < ν ≤ 1. In this case, we must have
$$\psi(-1/\lambda) = \frac{1}{\lambda^{r\nu}}\, \psi(\varphi(-1/\lambda)) \le \frac{1}{\lambda^{r\nu}} \ \Rightarrow\ u(-1/\lambda) \le \frac{z_1}{\lambda^{2\nu-1}} < \frac{1}{\lambda},$$ (3.7)
from which it follows that $\varphi = \lambda^{\nu-1}\, \check{u}^{\,p} \circ \lambda$ is analytic in %(−1/λ, 1/λ²) and that:
$$\varphi(-1/\lambda) \ge 0, \qquad \varphi(1) = 1, \qquad \varphi(1/\lambda^2) \le \frac{z_1}{\lambda^{\nu}}.$$ (3.8)
We can apply Remark 2 of Sect. 2, i.e. Schwarz’s lemma as in (2.6) to bound ϕ′(1), with A = 1 + 1/λ, B = 1/λ² − 1, A′ = 1, $B' = \lambda^{-\nu} - 1$ and find
$$\varphi'(1) = \lambda^{r\nu} \le \frac{\lambda (1 - \lambda^{\nu})}{1 - \lambda^2} \le \frac{\lambda}{1 + \lambda}.$$ (3.9)
This implies $\lambda^{r\nu-1} < 1$, i.e. rν > 1. In the case ν > 1, it is well-known (see [JR]), and easy to verify, that for r = 1, p ≥ 1, there is a solution such that ψ(z) = 1 − z, all functions u, ϕ, etc. are affine, $z_1 = \lambda^{\nu-1}$, and λ is the unique solution in (0, 1) of
$$\lambda^{\nu} + p\, \lambda^{\nu-1} - 1 = 0.$$ (3.10)
Moreover it is proved in [JR] that (in the case p = 1, ν = 2) there exist solutions for all sufficiently small r − 1 > 0. The proof of Theorem 3.1 will occupy Sect. 4–6. Many repetitions occur in these sections, since variations of the same method apply to several cases. But avoiding the repetitions would produce more obscurity than brevity.
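The affine solution at r = 1 described around (3.10) can be checked directly. The sketch below (helper names are illustrative assumptions) solves (3.10) by bisection for ν > 1 and then evaluates the residual of the identity (3.5) with ψ(z) = 1 − z, $z_1 = \lambda^{\nu-1}$, hence u(z) = 1 − z and ǔ(z) = 1 + z.

```python
def solve_lambda(p, nu, iters=200):
    """Root in (0, 1) of lam**nu + p*lam**(nu-1) - 1 = 0, eq. (3.10); assumes nu > 1."""
    def g(lam):
        return lam ** nu + p * lam ** (nu - 1) - 1

    lo, hi = 1e-12, 1.0          # g(lo) < 0 < g(hi) when nu > 1 and p >= 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def residual_affine(z, p, nu):
    """Left side minus right side of (3.5) for the affine solution at r = 1."""
    lam = solve_lambda(p, nu)
    psi = lambda t: 1.0 - t
    u_check = lambda t: 1.0 + t   # u(-t) with u(t) = 1 - t
    x = lam * z
    for _ in range(p):            # p-fold iterate of u_check
        x = u_check(x)
    return psi(z) - psi(lam ** (nu - 1) * x) / lam ** nu   # r*nu = nu when r = 1


golden = solve_lambda(1, 2.0)     # p = 1, nu = 2: lam**2 + lam - 1 = 0
```

For p = 1 and ν = 2 the root is the inverse golden mean (√5 − 1)/2, since (3.10) then reads λ² + λ − 1 = 0.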
4. Existence for r > 1, p ≥ 1 and ν ∈ (0, 1]
In this section, r > 1 and ν ∈ (0, 1] are fixed real numbers such that rν > 1, and p ≥ 1 is a fixed integer. The real number b ∈ (0, 1) is also fixed, but its value will be chosen later (as a function of r). For any two s, t ∈ (0, 1), we denote
$$h_{s,t}(z) = \frac{z(s + 1)}{z(s - t) + 1 + t}, \qquad \text{so that } 0 \mapsto 0, \quad 1 \mapsto 1, \quad -1/s \mapsto -1/t.$$ (4.1)
Obviously $h_{s,t} = h_{t,s}^{-1}$. We denote Q0(b, rν) the space of all functions Φ with the following properties:
(Q1) Φ is a Pick function holomorphic in the domain
%(−1/b, 1/b²) = $\mathbb{C}_+ \cup \mathbb{C}_- \cup (-1/b,\ 1/b^2)$, (4.2)
and maps this domain into itself.
(Q2) Φ(z) ≥ 0 for all z ∈ (−1/b, 1/b²),
(Q3) Φ(1) = 1, and $0 < \Phi'(1) \le b^{r\nu}$.
We regard Q0(b, rν) as a subset of the real Fréchet space of the self-conjugated functions holomorphic in %(−1/b, 1/b²), equipped with the topology of uniform convergence on compact subsets. We shall define a continuous operator B(b, r, p, ν) on this space by describing its action on an arbitrary Φ0 ∈ Q0(b, rν). Given Φ0 ∈ Q0(b, rν), we denote $\lambda = \Phi_0'(1)^{1/r\nu}$. By (Q3), 0 < λ ≤ b. We define a function ϕ0 by
$$\varphi_0 = h_{b,\lambda} \circ \Phi_0 \circ h_{b,\lambda}^{-1}.$$ (4.3)
If λ = b, $h_{b,\lambda}$ is the identity. Otherwise, since λ < b, its pole is below −1/b and $h_{b,\lambda}$ maps %(−1/b, 1/b²) onto %(−1/λ, 1/a0(λ)) where
$$\frac{1}{a_0(\lambda)} = h_{b,\lambda}\!\left( \frac{1}{b^2} \right), \qquad b^2 \le a_0(\lambda) = b - \lambda(1 - b) < b.$$ (4.4)
The function ϕ0 possesses the following properties:
(Q′1) ϕ0 is a Pick function holomorphic in %(−1/λ, 1/a0(λ)), and maps this domain into itself.
(Q′2) ϕ0(z) ≥ 0 for all z ∈ (−1/λ, 1/a0(λ)).
(Q′3) ϕ0(1) = 1, and $\varphi_0'(1) = \lambda^{r\nu}$.
We denote ψ the linearizer of ϕ0 normalized by the condition ψ(0) = 1, i.e. the unique function, holomorphic in %(−1/λ, 1/a0(λ)), such that
$$\psi(z) = \frac{1}{\lambda^{r\nu}}\, \psi(\varphi_0(z))$$ for all z ∈ %(−1/λ, 1/a0(λ)), ψ(1) = 0, ψ(0) = 1. (4.5)
This is an anti-Herglotz function given, as it is well known ([Mi, V]), by
$$\psi(z) = h(z)/h(0), \qquad h(z) = \lim_{n \to \infty} \frac{1}{\lambda^{n r \nu}}\, (\varphi_0^{\,n}(z) - 1).$$ (4.6)
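The limit in (4.6) can be illustrated on a case where the linearizer is known in closed form. The sketch below is an illustration only, not the construction used in the proof: it takes ϕ0 to be the homographic map $\chi_{b,s}$ of (2.7), whose multiplier at 1 is $b^s$, so that $b^s$ plays the role of $\lambda^{r\nu}$. To avoid cancellation near the fixed point the map is iterated in the shifted variable w = z − 1. All function names are hypothetical.

```python
def chi_shifted(w, b, s):
    """w -> chi_{b,s}(1 + w) - 1, i.e. the map (2.7) in the variable w = z - 1."""
    return b ** s * (1 + b) * w / (1 + b - b ** 2 * w)


def h_numeric(z, b, s, n=80):
    """Approximation of h(z) = lim mu**(-n) * (phi0^n(z) - 1), with mu = b**s."""
    mu = b ** s
    w = z - 1.0
    for _ in range(n):
        w = chi_shifted(w, b, s)
    return w / mu ** n


def h_closed(z, b, s):
    """Closed form for a Moebius map with fixed points 1 and z2 (h'(1) = 1)."""
    z2 = 1 + (1 + b - b ** s * (1 + b)) / b ** 2   # second fixed point of chi_{b,s}
    return (z - 1) * (z2 - 1) / (z2 - z)


def psi(z, b, s):
    """Linearizer normalized by psi(0) = 1, as in (4.6)."""
    return h_numeric(z, b, s) / h_numeric(0.0, b, s)


b, s = 0.5, 2.0
mu = b ** s
samples = [-1.5, 0.5, 2.0]        # points of the real interval (-1/b, 1/b**2)
```

For these parameters the second fixed point is 5.5 and ψ(z) = 5.5(1 − z)/(5.5 − z), which is decreasing on (−2, 4) with ψ(0) = 1 and ψ(1) = 0, in line with the properties stated for ψ.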
The limit converges uniformly on compact subsets of %(−1/λ, 1/a0(λ)), which is a basin of attraction of 1 for ϕ0. It is easy to check that ψ depends continuously on ϕ0. On [−1/λ, 1/a0(λ)), ψ is strictly decreasing and, because 0 ≤ ϕ0(−1/λ) < 1,
$$\psi(-1/\lambda) = \frac{1}{\lambda^{r\nu}}\, \psi(\varphi_0(-1/\lambda)) \le \frac{1}{\lambda^{r\nu}}.$$ (4.7)
ψ satisfies the inequalities (2.11) and (2.12), with u− = λ and u+ = a0(λ). We also define
$$v(z) = (\psi(-z))^{1/r} \quad \forall z \in \mathbb{C}_+ \cup \mathbb{C}_- \cup (-1, 1/\lambda).$$ (4.8)
v is a Pick function which extends to a strictly increasing continuous function on [−1, 1/λ]. It satisfies:
$$v(-1) = 0, \qquad v(0) = 1, \qquad v(1/\lambda) \le \lambda^{-\nu}.$$ (4.9)
We now show that there is a unique z1 ∈ (0, 1) such that
$$\left( z_1 \lambda^{1-\nu} v \right)^{p}(\lambda) = \lambda^{1-\nu}.$$ (4.10)
As a consequence of the inequality in (4.9), the function $(s\lambda^{1-\nu} v)^k$ is defined on [−1, 1/λ] for every s ∈ [0, 1] and every integer k ≥ 0. The functions $s \mapsto x_k(s) = (s\lambda^{1-\nu} v)^k(\lambda)$ are thus defined, continuous, and strictly increasing in s for k > 0 and s ∈ [0, 1]. For $s_* = \lambda^{\nu}/v(\lambda)$, $x_k(s_*) = \lambda$ for all k. For $s > s_*$, $x_1(s) > x_1(s_*) = \lambda = x_0(s)$, hence $x_0(s) < x_1(s) < \ldots < x_{p+1}(s)$. Since $x_p(1) > x_1(1) = \lambda^{1-\nu} v(\lambda) > \lambda^{1-\nu}$, there exists a unique $z_1 \in (s_*, 1)$ such that $x_p(z_1) = \lambda^{1-\nu}$. In the case p = 1, (4.10) reduces to z1 = 1/v(λ). The derivative $x_p'(z_1)$ is strictly positive. Therefore, if v is allowed to change slightly, z1 can be computed by a Cauchy integral along a small circle which remains fixed. Thus z1 depends continuously (in fact analytically) on v, hence on Φ0. We denote $\zeta_j = (z_1 \lambda^{1-\nu} v)^j(\lambda)$, (0 ≤ j ≤ p + 1). By the preceding argument,
$$\lambda = \zeta_0 < \zeta_1 < \cdots < \zeta_p = \lambda^{1-\nu} < \zeta_{p+1} = z_1 \lambda^{1-\nu} v(\lambda^{1-\nu}).$$ (4.11)
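The monotonicity argument that pins down z1 in (4.10) translates directly into a bisection on s between $s_* = \lambda^{\nu}/v(\lambda)$ and 1. The sketch below uses a toy function $v(z) = (1 + z)^{1/r}$, which is an assumption made purely for illustration: it is increasing with v(−1) = 0 and v(0) = 1, but it is not the v produced by an actual linearizer. The helper names are hypothetical.

```python
def iterate_xp(s, lam, nu, p, v):
    """x_p(s) = (s * lam**(1 - nu) * v)^p (lam): p-fold iterate started at lam."""
    x = lam
    for _ in range(p):
        x = s * lam ** (1 - nu) * v(x)
    return x


def find_z1(lam, nu, p, v, iters=200):
    """Bisection for the s in (s_*, 1) with x_p(s) = lam**(1 - nu), cf. (4.10)."""
    target = lam ** (1 - nu)
    lo = lam ** nu / v(lam)      # s_*: here x_k(s_*) = lam for every k
    hi = 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if iterate_xp(mid, lam, nu, p, v) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy data (assumed, for illustration only).
r, lam, nu = 2.0, 0.3, 1.0
v = lambda z: (1.0 + z) ** (1.0 / r)

s_star = lam ** nu / v(lam)
z1_p2 = find_z1(lam, nu, 2, v)
z1_p1 = find_z1(lam, nu, 1, v)
```

For p = 1 the bisection reproduces the explicit value z1 = 1/v(λ) noted in the text.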
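Since $v(\lambda^{1-\nu}) \le \lambda^{-\nu}$,
$$z_1 > \lambda^{\nu}, \qquad z_1 \lambda^{1-\nu} > \lambda.$$ (4.12)
Note also that, since v(λ) > v(0) = 1,
$$\zeta_1 > z_1 \lambda^{1-\nu}.$$ (4.13)
The last inequality in (4.11), the upper bound on $\psi(-\lambda^{1-\nu})$ from (2.11), and ν ≤ 1 give
$$z_1 \ge \left( \frac{1 - \lambda^{2-\nu}}{1 + \lambda^{1-\nu}} \right)^{1/r} > \frac{1 - \lambda}{2}.$$ (4.14)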
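We can now define a new function ϕ by:
$$\varphi(z) = \lambda^{\nu-1} \left( z_1 \lambda^{1-\nu} v \right)^{p}(\lambda z)$$ for z ∈ %(−1/λ, 1/λ²). (4.15)
In this domain, ϕ is a Pick function, which extends continuously to the ends of its real interval of definition, and
$$\varphi(-1/\lambda) = 0 \quad \text{if } p = 1,$$ (4.16)
$$\varphi(-1/\lambda) = \lambda^{\nu-1} \left( z_1 \lambda^{1-\nu} v \right)^{p-1}(0) \ge z_1 \quad \text{if } p \ge 2,$$ (4.17)
$$\varphi(1) = 1, \qquad \varphi(1/\lambda^2) = \lambda^{\nu-1} \left( z_1 \lambda^{1-\nu} v \right)^{p}(1/\lambda) \le z_1/\lambda^{\nu}.$$ (4.18)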
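The domain %(−1/λ, 1/λ²) is thus a basin of attraction of the fixed point 1 of ϕ. This domain contains %(−1/λ, 1/a0(λ)) since a0(λ) ≥ λ² (see (4.4)). We now use Schwarz’s lemma, as mentioned in Sect. 2, to obtain an upper bound for ϕ′(1). If p ≥ 2,
$$\varphi'(1) \le \frac{A' B' (A + B)}{A B (A' + B')} \quad \text{with} \quad A = 1 + \frac{1}{\lambda}, \quad B = \frac{1}{\lambda^2} - 1, \quad A' = 1 - z_1, \quad B' = \frac{z_1}{\lambda^{\nu}} - 1.$$
This gives
$$\varphi'(1) \le \frac{\lambda (1 - z_1)(z_1 - \lambda^{\nu})}{z_1 (1 - \lambda^{\nu})(1 - \lambda^2)}.$$ (4.19)
When $z_1 \in (\lambda^{\nu}, 1)$ this expression is maximum at $z_1 = \lambda^{\nu/2}$, so that
$$\varphi'(1) \le \frac{\lambda}{(1 + \lambda)(1 + \sqrt{\lambda})^2} = \frac{\lambda}{Z_1(\lambda)} < \frac{1}{8} \quad \text{if } p \ge 2.$$ (4.20)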
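The step from (4.19) to (4.20) can be spelled out as follows. This is a sketch of the computation under the standing assumptions 0 < λ < 1 and 0 < ν ≤ 1, with the auxiliary function q introduced here only for illustration.

```latex
\begin{align*}
q(z_1) &:= \frac{(1 - z_1)(z_1 - \lambda^{\nu})}{z_1}
        = 1 + \lambda^{\nu} - z_1 - \frac{\lambda^{\nu}}{z_1},
\qquad
q'(z_1) = -1 + \frac{\lambda^{\nu}}{z_1^{2}} = 0
\iff z_1 = \lambda^{\nu/2}, \\
\max q &= \bigl(1 - \lambda^{\nu/2}\bigr)^{2}
\quad\Longrightarrow\quad
\varphi'(1) \le \frac{\lambda \bigl(1 - \lambda^{\nu/2}\bigr)^{2}}{(1 - \lambda^{\nu})(1 - \lambda^{2})}
 = \frac{\lambda \bigl(1 - \lambda^{\nu/2}\bigr)}{\bigl(1 + \lambda^{\nu/2}\bigr)(1 - \lambda^{2})} \\
&\le \frac{\lambda \bigl(1 - \sqrt{\lambda}\bigr)}{\bigl(1 + \sqrt{\lambda}\bigr)(1 - \lambda)(1 + \lambda)}
 = \frac{\lambda}{(1 + \lambda)\bigl(1 + \sqrt{\lambda}\bigr)^{2}} .
\end{align*}
```

The last inequality uses that x ↦ (1 − x)/(1 + x) is decreasing and that $\lambda^{\nu/2} \ge \sqrt{\lambda}$ when ν ≤ 1; the bound 1/8 is the value of $\lambda/Z_1(\lambda)$ at λ = 1.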
Therefore, if p ≥ 2 and we choose $b \ge b_0(r\nu) = (1/8)^{1/r\nu}$, then $\varphi'(1) < b^{r\nu}$. For a slightly better choice of b, we note that $\lambda \mapsto \lambda/Z_1(\lambda)$ is increasing on (0, 1), so that λ ≤ b ⇒ ϕ′(1) ≤ b/Z1(b). This will be less than $b^{r\nu}$ if b ≥ b1(rν), where $s \mapsto b_1(s)$ is the solution of $b_1^{\,s} = b_1/Z_1(b_1)$, i.e. the inverse function (defined on (1, ∞)) of
$$b \mapsto 1 + \frac{\log Z_1(b)}{\log(1/b)} = 1 + \frac{\log\bigl((1 + b)(1 + \sqrt{b})^2\bigr)}{\log(1/b)}.$$ (4.21)
This last function is strictly increasing on (0, 1), and tends to 1 as b tends to 0 and to +∞ as b tends to 1. Obviously b1(s) ≤ b0(s). A useful inequality (proved in the Appendix) is:
$$\frac{\log\bigl((1 + b)(1 + \sqrt{b})^2\bigr)}{\log(1/b)} > \frac{2b}{1 - b} \quad \forall b \in (0, 1),$$ (4.22)
i.e.
$$b = b_1(s),\ s > 1 \ \Rightarrow\ s > \frac{1 + b}{1 - b}.$$ (4.23)
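Since the function in (4.21) is strictly increasing on (0, 1), b1(s) can be computed by bisection. The following sketch (helper names are illustrative assumptions) does this and lets one compare the result with b0(s) and with the consequence (4.23).

```python
import math


def Z1(b):
    """Z_1(b) = (1 + b) * (1 + sqrt(b))**2, as in (4.20)."""
    return (1 + b) * (1 + math.sqrt(b)) ** 2


def exponent_of(b):
    """The increasing function (4.21): b -> 1 + log Z1(b) / log(1/b)."""
    return 1 + math.log(Z1(b)) / math.log(1 / b)


def b1(s, iters=200):
    """Inverse of (4.21) at s > 1, i.e. the solution of b**s = b / Z1(b)."""
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if exponent_of(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def b0(s):
    """The cruder choice b0(s) = (1/8)**(1/s)."""
    return (1 / 8) ** (1 / s)


b_at_2 = b1(2.0)
```

At s = 2 the bisection gives a value slightly above 0.31, below b0(2) ≈ 0.354, consistent with the remark that b1(s) ≤ b0(s).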
If p = 1,
$$\varphi'(1) \le \frac{\lambda (1 - \lambda^{\nu})}{1 - \lambda^2} \le \frac{\lambda}{1 + \lambda} \le \frac{1}{2} \qquad (p = 1).$$ (4.24)
Thus, if b is chosen at least equal to $b_2(r\nu) = (1/2)^{1/r\nu}$ or to b3(rν), where b3 is the inverse function of b → 1 + log(1 + b)/log(1/b), it follows from λ ≤ b that $\varphi'(1) \le b^{r\nu}$. We now define the action of the operator B(b, r, p, ν) on Φ0 by
$$B(b, r, p, \nu)\, \Phi_0 = \Phi = h_{b,\lambda}^{-1} \circ \varphi \circ h_{b,\lambda}.$$ (4.25)
This definition implies that if Φ0 is a fixed point of B(b, r, p, ν), i.e. Φ = Φ0, then the functions ϕ0 and ϕ constructed above coincide, and λ, z1, ψ, and $u(z) = z_1 \lambda^{1-\nu} \psi(z)^{1/r}$ provide a solution to the problem set in Sect. 3. Conversely, given a solution to the problem, the function Φ given by Eq. (4.25) (with $\lambda = \Phi'(1)^{1/r\nu}$) is a fixed point of B(b, r, p, ν) for any b ∈ [λ, 1). The preceding estimates show that if p ≥ 2 and b ≥ b0(rν) or b ≥ b1(rν), or if p = 1 and b ≥ b2(rν) or b ≥ b3(rν), then
$$B(b, r, p, \nu)\, Q_0(b, r\nu) \subset Q_0(b, r\nu).$$ (4.26)
The same estimates show that, for any solution of our problem, the inequalities $\lambda < b_j(r\nu)$ must hold (j = 0, 1 for p ≥ 2, j = 2, 3 for p = 1). The set Q0(b, rν) is not compact: we have to guard against λ tending to zero, i.e. to find a reproducing lower bound for λ. This will be feasible only under certain restrictions on r, ν, and p. We first show that such restrictions are unavoidable. ϕ′(1) is given by
$$\varphi'(1) = \lambda^{\nu} \prod_{j=0}^{p-1} z_1 \lambda^{1-\nu} v'(\zeta_j) = \lambda^{\nu} \prod_{j=0}^{p-1} \frac{\zeta_{j+1}\, v'(\zeta_j)}{v(\zeta_j)} = \prod_{j=0}^{p-1} \frac{\zeta_j\, v'(\zeta_j)}{v(\zeta_j)} = \prod_{j=0}^{p-1} \frac{-\zeta_j\, \psi'(-\zeta_j)}{r\, \psi(-\zeta_j)}.$$ (4.27)
Here we have used $\zeta_p/\zeta_0 = \lambda^{-\nu}$. The upper bound in (2.12) (with u− = λ) gives
$$\varphi'(1) \le \prod_{j=0}^{p-1} \frac{\zeta_j (1 + \lambda)}{r (1 + \zeta_j)(1 - \lambda \zeta_j)} \le \frac{\lambda}{r (1 - \lambda^2)} \left( \frac{\lambda^{1-\nu} (1 + \lambda)}{r (1 + \lambda^{1-\nu})(1 - \lambda^{2-\nu})} \right)^{p-1} \le \lambda^{1 + (p-1)(1-\nu)}\, \bigl( r (1 - \lambda) \bigr)^{-p}.$$ (4.28)
If we suppose p ≥ 2 and λ ≤ b1(rν), then r(1 − λ) ≥ 1 + λ > 1 by the inequality (4.23). Therefore a fixed point can exist only if
$$r\nu - 1 - (p - 1)(1 - \nu) > 0.$$ (4.29)
Using (4.27) and the lower bound in (2.12), with u+ = a0(λ) < b gives
$$\varphi'(1) \ge \prod_{j=0}^{p-1} \frac{\zeta_j (1 - b)}{r (1 + \zeta_j)(1 + b \zeta_j)},$$ (4.30)
and using $1 \ge \zeta_j \ge z_1 \lambda^{1-\nu}$ (for j > 0), and z1 ≥ (1 − b)/2 (see (4.14)), we find
$$\varphi'(1) \ge c^{p}\, \lambda^{1 + (p-1)(1-\nu)}, \qquad c = \frac{(1 - b)^2}{4r(1 + b)}.$$ (4.31)
Assume now that λ ≥ λ0 > 0. Then a sufficient condition for $\varphi'(1) \ge \lambda_0^{\,r\nu}$ to hold is that
$$\lambda_0^{\,r\nu - 1 - (p-1)(1-\nu)} \le c^{p}.$$ (4.32)
If the condition (4.29) holds, we can take
$$\lambda_0 = \lambda_0(p, r, \nu) = \left( \frac{(1 - b)^2}{4r(1 + b)} \right)^{\frac{p}{r\nu - 1 - (p-1)(1-\nu)}} \in (0, 1).$$ (4.33)
Assume that the inequality (4.29) holds. Let, for definiteness, b(rν) = b1(rν) if p ≥ 2, and b(rν) = b3(rν) if p = 1. We observe that
$$Q_1(p, r, \nu) = Q_0(b(r\nu), r\nu) \cap \{\Phi : \Phi'(1) \ge \lambda_0(p, r, \nu)^{r\nu}\}$$ (4.34)
is not empty. Indeed the function $\chi_{b,s}$ (see (2.8)) with b = b(rν) and s = rν belongs to Q0(b(rν), rν) and $\chi'_{b,s}(1) = b^{r\nu}$. Therefore $\Phi = B(b(r\nu), r, p, \nu)\, \chi_{b,s}$ also belongs to Q0(b(rν), rν), and the preceding estimates show that $b^{r\nu} \ge \Phi'(1) \ge b^{1 + (p-1)(1-\nu)} c^{p}$ with c as in (4.31), and b = b(rν). Hence b(rν) ≥ λ0(p, r, ν), in particular $\chi_{b,s} \in Q_1(p, r, \nu)$. (This is not really essential since we could have redefined λ0(p, r, ν) to be less than b(rν).) The continuous map B(b(rν), r, p, ν) maps the compact convex nonempty set Q1(p, r, ν) into itself. Therefore it has a fixed point there by the Schauder–Tikhonov theorem. As noted before, if Φ0 = Φ is such a fixed point, the functions ϕ0 and ϕ constructed as above coincide, and ψ and $u(z) = z_1 \lambda^{1-\nu} \psi(z)^{1/r}$ provide a solution to our problem. Note that here again any solution must satisfy λ ≥ λ0(p, r, ν), since it must satisfy $\varphi'(1) = \lambda^{r\nu} \ge c^{p} \lambda^{1 + (p-1)(1-\nu)}$, with c as in (4.31). Thus any solution is associated to a fixed point of B(b(rν), r, p, ν) in Q1(p, r, ν).
5. Case p = 1 and ν ∈ [1, 2]
This case has been dealt with in [E3]. It will be shown in this section that the method of the preceding section also applies to this case with minor modifications. Let r > 1 and ν ∈ [1, 2] be fixed reals. We define the space Q(b, rν) and the operator B(b, r, 1, ν) in the same way as in the preceding section. In particular, starting from Φ0 ∈ Q(b, rν), the functions ϕ0, ψ and v are defined by the same formulae and have the same properties, in particular
$$v(-1) = 0, \qquad v(0) = 1, \qquad v(1/\lambda) \le \lambda^{-\nu},$$ (5.1)
but we note that now $\lambda^{-\nu} \ge 1/\lambda$. We define
$$z_1 = 1/v(\lambda) \in (0, 1).$$ (5.2)
It follows from (5.1) that
$$z_1 > \lambda^{\nu}$$ (5.3)
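and from the upper bound (2.11) on ψ(−λ) that
$$z_1 \ge (1 - \lambda)^{1/r} > 1 - \lambda.$$ (5.4)
The function
$$z \mapsto \varphi(z) = z_1 v(\lambda z)$$ (5.5)
is again Herglotzian, holomorphic in %(−1/λ, 1/λ²), continuous on [−1/λ, 1/λ²] with
$$\varphi(-1/\lambda) = 0, \qquad \varphi(1) = 1, \qquad \varphi(1/\lambda^2) = z_1 v(1/\lambda) \le z_1/\lambda^{\nu} < 1/\lambda^2.$$ (5.6)
Schwarz’s lemma can be again applied as in the preceding section, but now with
$$A = 1 + \frac{1}{\lambda}, \qquad B = B' = \frac{1}{\lambda^2} - 1, \qquad A' = 1.$$ (5.7)
This gives
$$\varphi'(1) \le \lambda,$$ (5.8)
which is not sufficient for our purposes. We therefore use the bound (2.14) with u− = λ and $M = 1/\lambda^{r\nu}$, to get
$$\frac{z\, v'(z)}{v(z)} = -\frac{z\, \psi'(-z)}{r\, \psi(-z)} \le \frac{\nu \log(1/\lambda)}{4(1 - \lambda z)} \quad \forall z \in (0, 1/\lambda),$$ (5.9)
hence
$$\varphi'(1) = \frac{\lambda\, v'(\lambda)}{v(\lambda)} \le \frac{\log(1/\lambda)}{2(1 - \lambda^2)}.$$ (5.10)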
The r.h.s. of this inequality is a decreasing function of λ, tending to +∞ when λ → 0, and to 1/4 when λ → 1. For any choice of b ∈ (0, 1), if $\lambda < b^{r\nu}$, then $\varphi'(1) < b^{r\nu}$ by (5.8). If $\lambda \ge b^{r\nu}$, then, by (5.10), a sufficient condition for $\varphi'(1) < b^{r\nu}$ is that $b^{r\nu} > \mu$, where µ is the unique zero, in (0, 1), of the increasing function
$$x \mapsto x - \frac{\log(1/x)}{2(1 - x^2)}.$$ (5.11)
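The zero µ of the function in (5.11) is easy to locate numerically. The sketch below uses plain bisection; the function name `gap` and the bracketing interval [0.1, 0.9] are choices made here for illustration.

```python
import math


def gap(x):
    """The increasing function (5.11): x - log(1/x) / (2 * (1 - x**2))."""
    return x - math.log(1 / x) / (2 * (1 - x * x))


def find_mu(lo=0.1, hi=0.9, iters=200):
    """Bisection for the unique zero of gap in (0, 1); gap(lo) < 0 < gap(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mu = find_mu()
```

The computed zero lies between 0.478 and 0.479, so a threshold of 0.479 sits just above it.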
One finds µ < 0.479, and we choose, from now on, $b = b_5(r\nu) = (0.479)^{1/r\nu}$. Note that for any solution, $\varphi'(1) = \lambda^{r\nu}$ must satisfy (5.10), and since the r.h.s. of this inequality is decreasing, the function defined in (5.11) must be negative at $x = \lambda^{r\nu}$, i.e. λ ≤ b5(rν). The lower bound in (2.12), with u+ = a0(λ) < b, gives
$$\varphi'(1) = -\frac{\lambda\, \psi'(-\lambda)}{r\, \psi(-\lambda)} \ge \frac{\lambda (1 - a_0(\lambda))}{r (1 + \lambda)(1 + \lambda a_0(\lambda))} \ge \frac{\lambda (1 - b)}{r (1 + b)(1 + b^2)}.$$ (5.12)
If λ ≥ λ0 > 0 then $\varphi'(1) \ge \lambda_0^{\,r\nu}$ provided λ0 ≤ λ0(r, ν) with
$$\lambda_0(r, \nu) = \left( \frac{1 - b}{r (1 + b)(1 + b^2)} \right)^{1/(r\nu - 1)}, \qquad b = b_5(r\nu).$$ (5.13)
Therefore the operator B(b5(rν), r, 1, ν) preserves the compact convex set
$$Q_0(b_5(r\nu), r\nu) \cap \{\Phi : \Phi'(1) \ge \lambda_0(r, \nu)\}.$$ (5.14)
Again any solution must satisfy λ ≥ λ0(r, ν).
6. Case p ≥ 2 and ν ∈ [1, 2]
In this section, r > 1 and ν ∈ [1, 2] are fixed real numbers, and p ≥ 2 is a fixed integer. The real number b ∈ [1/2, 1) is also fixed, but its value will be chosen later (as a function of r). We shall need the function a : [0, 1] → [0, 1] given by
$$a(t) = \min\left\{ \frac{2t}{1 - t},\ \frac{1 + t}{2} \right\} = \begin{cases} \dfrac{2t}{1 - t}, & \text{if } 0 \le t \le \sqrt{5} - 2, \\[2mm] \dfrac{1 + t}{2}, & \text{if } \sqrt{5} - 2 \le t \le 1. \end{cases}$$ (6.1)
t → a(t) is continuous and strictly increasing in [0, 1]. (Note that √5 − 2 ≈ 0.236 < 1/4.) We denote $\tilde{Q}_0(b, \nu r)$ the space of all functions Φ with the following properties:
(Q1) Φ is a Pick function holomorphic in the domain
%(−1/b, 1/a(b)) = $\mathbb{C}_+ \cup \mathbb{C}_- \cup (-1/b,\ 1/a(b))$, a(b) = (1 + b)/2, (6.2)
and maps this domain into itself.
(Q2) Φ(z) ≥ 0 for all z ∈ (−1/b, 1/a(b)),
(Q3) Φ(1) = 1, and $0 < \Phi'(1) \le b^{\nu r}$.
$\tilde{Q}_0(b, \nu r)$ is a convex subset of the real Fréchet space of all self-conjugated functions holomorphic in %(−1/b, 1/a(b)). It is not empty since it contains the function $\chi_{b,\nu r}$ (see Sect. 2).
6.1. The operator B(b, r, p, ν). We shall define a continuous operator B(b, r, p, ν) on the space $\tilde{Q}_0(b, \nu r)$ by describing its action on an arbitrary element Φ0. Given $\Phi_0 \in \tilde{Q}_0(b, \nu r)$, we denote $\lambda = \Phi_0'(1)^{1/r\nu}$. Note that λ ≤ b. We define a function ϕ0 by
$$\varphi_0 = h_{b,\lambda} \circ \Phi_0 \circ h_{b,\lambda}^{-1}.$$ (6.3)
Here $h_{b,\lambda}$ is the homographic function defined in Sect. 4 (see (4.1)). It maps the domain %(−1/b, 1/a(b)) onto %(−1/λ, 1/a1(λ)), where $1/a_1(\lambda) = h_{b,\lambda}(1/a(b))$, i.e.
$$a_1(\lambda) = \frac{1 + \lambda}{2} + \frac{b - \lambda}{1 + b} \ge a(b) \ge a(\lambda), \qquad a_1(\lambda) \le a_1(0) = \frac{1 + 3b}{2(1 + b)}.$$ (6.4)
The function ϕ0 possesses the following properties:
(Q̃1) ϕ0 is a Pick function holomorphic in %(−1/λ, 1/a1(λ)), and maps this domain into itself.
(Q̃2) ϕ0(z) ≥ 0 for all z ∈ (−1/λ, 1/a1(λ)).
(Q̃3) ϕ0(1) = 1, and $\varphi_0'(1) = \lambda^{\nu r}$.
As in previous sections we denote ψ the linearizer of ϕ0, normalized by the condition ψ(0) = 1, i.e.
$$\psi(z) = \frac{1}{\lambda^{\nu r}}\, \psi(\varphi_0(z))$$ for all z ∈ %(−1/λ, 1/a1(λ)), ψ(1) = 0, ψ(0) = 1. (6.5)
ψ is anti-Herglotz, holomorphic in %(−1/λ, 1/a1(λ)), and satisfies the inequalities (2.11) and (2.12), with u− = λ and u+ = a1(λ). Also,
$$\psi(-1/\lambda) = \frac{1}{\lambda^{\nu r}}\, \psi(\varphi_0(-1/\lambda)) \le \frac{1}{\lambda^{\nu r}}.$$ (6.6)
We again define $v(z) = (\psi(-z))^{1/r}$ for all z ∈ %(−1, 1/λ). This is a Pick function which extends to a strictly increasing continuous function on [−1, 1/λ]. It satisfies:
$$v(-1) = 0, \qquad v(0) = 1, \qquad v(1/\lambda) \le \lambda^{-\nu}.$$ (6.7)
We now show that there is a unique z1 ∈ (0, 1) such that
$$\left( \frac{z_1}{\lambda^{\nu-1}}\, v \right)^{p}(\lambda) = \frac{1}{\lambda^{\nu-1}}.$$ (6.8)
For real s ≥ 0 let $x_0(s) = \lambda$, $x_1(s) = s\lambda^{1-\nu} v(\lambda)$. The function $s \mapsto x_1(s)$ is strictly increasing on $\mathbb{R}_+$ and takes the values λ at $s_* = \lambda^{\nu}/v(\lambda)$ and 1/λ at $s_1 = \lambda^{\nu-2}/v(\lambda)$. By induction we can construct a strictly decreasing infinite sequence $s_1 > \cdots > s_j > \cdots > s_*$ such that, for j ≥ 2, $s \mapsto x_j(s) = (s\lambda^{1-\nu} v)^j(\lambda)$ is continuous and strictly increasing on $[s_*, s_{j-1}]$, $x_0(s) < \cdots < x_j(s)$ in $(s_*, s_{j-1}]$, $x_j(s_*) = \lambda$, and $x_j(s_j) = 1/\lambda$. Indeed it follows that $x_{j+1}(s) = s\lambda^{1-\nu} v(x_j(s))$ is defined, continuous, and strictly increasing on $[s_*, s_j]$ and $x_{j+1}(s) > s\lambda^{1-\nu} v(x_{j-1}(s)) = x_j(s)$ for all $s \in (s_*, s_j]$. Since $x_{j+1}(s_j) > x_j(s_j) = 1/\lambda$ and $x_{j+1}(s_*) = \lambda$, $s_{j+1}$ exists in $(s_*, s_j)$. In particular $x_p(s_{p-1}) > 1/\lambda$. Therefore there is a unique $z_1 \in (s_*, s_{p-1})$ such that $x_p(z_1) = \lambda^{1-\nu}$. It must satisfy z1 < 1 since $z_1 v(x_{p-1}(z_1)) = 1$, and $v(x_{p-1}(z_1)) > 1$. Note also that for $s \in (s_*, s_{p-1})$, there exists a unique $x_{-1}(s) < x_0(s) = \lambda$ such that $s\lambda^{1-\nu} v(x_{-1}(s)) = \lambda$. The function $z \mapsto s_* \lambda^{1-\nu} v(z)$ maps %(−1, 1/λ) into %(0, $\lambda^{1-\nu}/v(\lambda)$), so that it has a unique and attractive fixed point at λ by Schwarz’s lemma. Hence $s_* \lambda^{1-\nu} v(x) \ge x$ for all x ∈ [−1, λ]. When $s > s_*$, $s\lambda^{1-\nu} v(x) > x$ for all x ∈ [−1, λ]. Since this includes $[x_{-1}(s), x_0(s)]$, it follows that $s\lambda^{1-\nu} v(x) > x$ for all $x \in [-1, x_p(s)]$, for all $s \in (s_*, z_1]$. The function $x_p$ is analytic, with a strictly positive derivative, on $(s_*, s_{p-1})$. Therefore z1 depends continuously on v, hence on Φ0. We denote $\zeta_j = (z_1 \lambda^{1-\nu} v)^j(\lambda)$, (0 ≤ j ≤ p + 1):
$$\lambda = \zeta_0 < \zeta_1 < \ldots < \zeta_p = \frac{1}{\lambda^{\nu-1}} < \zeta_{p+1} = \frac{z_1}{\lambda^{\nu-1}}\, v(1/\lambda^{\nu-1}).$$ (6.9)
Since $v(1/\lambda^{\nu-1}) \le \lambda^{-\nu}$,
$$z_1 > \lambda^{\nu}, \qquad \frac{z_1}{\lambda^{\nu-1}} > \lambda,$$ (6.10)
and since v(λ) > v(0) = 1,
$$\zeta_1 > \frac{z_1}{\lambda^{\nu-1}}.$$ (6.11)
We have seen above that
$$\frac{z_1}{\lambda^{\nu-1}}\, v(x) > x \quad \forall x \in [-1, 1/\lambda^{\nu-1}].$$ (6.12)
Applying this to x = 1 gives z1 /λν−1 ≥ 1/v(1), and, using (2.11),
z1 λν−1
≥
1−λ 2
1 r
>
1−λ . 2
(6.13)
The function ϕ is defined by: ϕ(z) = λν−1
z p 1 v (λz), z ∈ C+ ∪ C− ∪ (−1/λ, ζ1 /λ). λν−1
(6.14)
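In this domain, ϕ is a Pick function, which extends continuously to the ends of its real interval of definition, and
\[ \begin{aligned} \varphi(1) &= 1, \qquad \varphi(-1/\lambda) = \lambda^{\nu-1}\left(\frac{z_1}{\lambda^{\nu-1}}\, v\right)^{p-1}(0) \ge z_1 \ge \lambda^{\nu} \ge \lambda^{2},\\ \varphi(\zeta_1/\lambda) &= z_1\, v(1/\lambda^{\nu-1}) \le \frac{z_1}{\lambda^{\nu}} < \zeta_1/\lambda. \end{aligned} \tag{6.15} \]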
(Note that the first inequality in (6.15) has used p ≥ 2.) The domain %(−1/λ, ζ1/λ) is a basin of attraction of the fixed point 1 of ϕ, hence ϕ′(1) < 1 by Schwarz’s lemma. For a better upper bound on this derivative, we shall need a better lower bound for ζ1/λ. This is provided by
Lemma 6.1. The inequality
\[ \frac{z_1}{\lambda^{\nu-1}}\, v(z) \ge \frac{z(1-2\lambda)+\lambda}{1-\lambda z} \tag{6.16} \]
holds for all z ∈ [0, 1].
Proof. This is simply the result of applying Lemma 2.1 of Sect. 2, with f = z1 λ^{1−ν} v, and a = 0, b = 1, B = 1/λ. This function satisfies f(0) = z1 λ^{1−ν} ≥ λ by (6.10), and f(1) ≥ 1 by (6.12). For z = ζ0 = λ, this gives ζ1 ≥ 2λ/(1 + λ). Since we also have the lower bounds (6.11) and (6.13),
\[ \frac{\zeta_1}{\lambda} \ge \max\left\{\frac{2}{1+\lambda},\ \frac{1-\lambda}{2\lambda}\right\} = \frac{1}{a(\lambda)}. \tag{6.17} \]
This is the reason for our original definition of the function a in (6.1). We conclude that the domain %(−1/λ, ζ1/λ), where ϕ is holomorphic, and which it maps into itself, certainly contains the domain of analyticity %(−1/λ, 1/a1(λ)) of ϕ0 in view of (6.4). We now use Schwarz’s lemma, as mentioned in Sect. 2, to obtain an upper bound for ϕ′(1):
\[ \varphi'(1) \le \frac{A'B'(A+B)}{AB(A'+B')} \quad\text{with}\quad A = 1 + \frac{1}{\lambda}, \quad B = B' = \frac{1}{a(\lambda)} - 1, \quad A' = 1 - \lambda^{2}. \tag{6.18} \]
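This gives
\[ \varphi'(1) \le \frac{(1-\lambda^{2})(a(\lambda)+\lambda)}{(1+\lambda)(1-a(\lambda)\lambda^{2})} \le Z(\lambda) = \frac{1+3\lambda}{2+2\lambda+\lambda^{2}} \le Z_{\max} = \frac{9}{4+2\sqrt{13}} < 0.803. \tag{6.19} \]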
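The chain of inequalities (6.19) can be checked numerically. The following Python sketch is only a sanity check and not part of the argument; it assumes that a(λ) = min{(1 + λ)/2, 2λ/(1 − λ)}, which is how (6.17) reads, and all function names are illustrative.

```python
import math

def a_of_lambda(lam):
    # Assumed form of a(lambda), read off from (6.17):
    # 1/a = max{2/(1+lam), (1-lam)/(2*lam)}.
    return min((1.0 + lam) / 2.0, 2.0 * lam / (1.0 - lam))

def schwarz_bound(lam, a):
    # Middle expression of (6.19).
    return (1.0 - lam**2) * (a + lam) / ((1.0 + lam) * (1.0 - a * lam**2))

def Z(lam):
    # Majorant Z(lambda) of (6.19).
    return (1.0 + 3.0 * lam) / (2.0 + 2.0 * lam + lam**2)

# Z'(lambda) = 0  <=>  3*lam^2 + 2*lam - 4 = 0; this is its root in (0, 1).
LAM_STAR = (math.sqrt(13.0) - 1.0) / 3.0
Z_MAX = 9.0 / (4.0 + 2.0 * math.sqrt(13.0))

if __name__ == "__main__":
    print("Z at its critical point:", Z(LAM_STAR))
    print("closed form Z_max      :", Z_MAX)
```

Since the middle expression is increasing in a, replacing a(λ) by its upper bound (1 + λ)/2 gives exactly Z(λ), which is what the sketch compares against.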
Therefore choosing b ≥ b6(rν) = m0^{1/rν}, m0 = 0.803 ensures that ϕ′(1) < b^{rν}. We define the operator B(b, r, p, ν) by
, = B(b, r, p, ν),0 = h−1 b,λ ◦ ϕ ◦ hb,λ .
(6.20)
It then follows from the preceding estimates that, if b ≥ b6(rν),
B(b, r, p, ν) Q 0 (b, νr) ⊂ Q 0 (b, νr). (6.21)
In the remainder of this section, it will always be understood that b = b6(rν).
6.1.1. Lower bound for ϕ′(1). We use
\[ \varphi'(1) = \prod_{j=0}^{p-1} \frac{-\zeta_j\, \psi'(-\zeta_j)}{r\, \psi(-\zeta_j)}. \tag{6.22} \]
The lower bound in (2.12) gives, for ζ ∈ [0, 1/λ),
\[ \frac{-\zeta\, \psi'(-\zeta)}{\psi(-\zeta)} \ge \frac{\zeta(1-c)}{(1+\zeta)(1+c\zeta)}. \tag{6.23} \]
Here c = a1(λ), where a1(λ) is given by (6.4) and satisfies
\[ a(\lambda) \le a(b) \le a_1(\lambda) < a_1(0) = \frac{1+3b}{2(1+b)}. \tag{6.24} \]
However it will be convenient to suppose only, at first, that (6.23) holds for a certain c satisfying a(λ) ≤ c < 1. For j > 0, ζj ≥ ζ1 ≥ λ/a(λ) hence ζj ∈ [λ/c, 1/λ]. When ζ varies in this interval, the second expression in (6.23) is minimum at ζ = 1/λ. Therefore
\[ \varphi'(1) \ge \lambda^{p}\, \frac{1-c}{r(1+\lambda)(1+c\lambda)} \left[\frac{1-c}{r(1+\lambda)(\lambda+c)}\right]^{p-1}. \tag{6.25} \]
It is easy to verify that the rhs of this inequality is decreasing in c and increasing in λ provided c ≥ λ² (note that a(λ) > λ). Setting now c = a1(λ), and using the inequalities (6.24) and λ ≤ b, this gives
\[ \varphi'(1) \ge \lambda^{p} \left(\frac{1-b}{16r}\right)^{p}. \tag{6.26} \]
Supposing λ ≥ λ0 > 0, the last inequality will imply ϕ′(1) ≥ λ0^{rν} if λ0 satisfies
\[ \lambda_0^{r\nu-p} \le \left(\frac{1-b}{16r}\right)^{p}, \tag{6.27} \]
and this is possible only if rν − p > 0. In this case we can choose
\[ \lambda_0 = \lambda_0(r,\nu) = \left(\frac{1-b}{16r}\right)^{p/(r\nu-p)}, \tag{6.28} \]
and obtain the existence of a fixed point in the same way as in the preceding sections. Recall that in these formulae, b stands for b6(rν) = m0^{1/rν}. It is easy to verify that λ0(r, ν) → 1 when r → ∞. The condition rν − p > 0 is just a limitation of the present method. The inadequacy of the estimate (6.26) is due to the fact that a1(λ) does not tend to 0 as λ tends to 0. By contrast, in the case of fixed points, the lower bound on λ can be improved. Indeed, since ψ and ϕ are holomorphic in %(−1/λ, 1/a(λ)), the bound (6.23) and consequently (6.25) hold with c replaced by a(λ) (instead of a1(λ)). Assume λ ≤ 1/7. We can then set c = 2λ/(1 − λ) in (6.25) and obtain
\[ \varphi'(1) \ge \lambda\, \frac{1-3\lambda}{r(1+\lambda)(1-\lambda+2\lambda^{2})} \left[\frac{1-3\lambda}{r(1+\lambda)(3-\lambda)}\right]^{p-1} \ge \lambda\,(6r)^{-p}. \tag{6.29} \]
Therefore the lower bound
\[ \lambda = \varphi'(1)^{1/r\nu} \ge \min\{1/7,\ (6r)^{-p/(r\nu-1)}\} \tag{6.30} \]
holds for all fixed points. This fact suggests the use of another operator instead of B(b, r, p, ν), and this will be done in the next subsection.
6.2. The operator N(b, r, p, ν, λ1). In this subsection, we define a new operator N(b, r, p, ν, λ1) on the space Q 0 (b, νr). This construction closely follows an idea of Mestel and Osbaldestin [MO]. It consists in replacing the operator B(b, r, p, ν) (which is analytic on Q 0 (b, νr)) by a “truncated version” N(b, r, p, ν, λ1) which is only continuous, but maps Q 0 (b, νr) into a compact subset. This operator depends on an additional real parameter λ1 ∈ (0, 1/2). It will be shown later that for small values of this parameter, any fixed point of N(b, r, p, ν, λ1) is a fixed point of B(b, r, p, ν). The notations are the same as in the preceding subsection unless explicitly mentioned. In particular ν ∈ [1, 2] and r > 1 are fixed and b will stand for b6(rν) = m0^{1/rν}, m0 = 0.803. We denote τ1 = λ1^r. We define N(b, r, p, ν, λ1) by its action on an arbitrary element ,0 of Q 0 (b, νr). Let σ^ν = ,0′(1). Recall that, by the definition of Q 0 (b, νr), σ^ν ≤ b^{rν}. If σ^ν ≥ τ1^ν, we define
N(b, r, p, ν, λ1) ,0 = B(b, r, p, ν) ,0 , (σ^ν ≥ τ1^ν). (6.31)
If σ ν < τ1ν , we define λ = λ1 (so that λ ≤ b), and define ϕ0 , as before, by ϕ0 = hb,λ ◦ ,0 ◦ h−1 b,λ .
(6.32)
The function ϕ0 is holomorphic and Herglotzian in the domain %(−1/λ, 1/a1(λ)), which it maps into itself. Here a1(λ) = a1(λ1) is given by (6.4). ϕ0 possesses the same properties as in Subsect. 6.1, except for
ϕ0′(1) = σ^ν. (6.33)
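The linearizer ψ1 is the unique function holomorphic in %(−1/λ, 1/a1(λ)) such that
\[ \psi_1(z) = \frac{1}{\sigma^{\nu}}\, \psi_1(\varphi_0(z)) \quad \forall z \in \mathbf{C}_+ \cup \mathbf{C}_- \cup (-1/\lambda,\ 1/a_1(\lambda)), \qquad \psi_1(0) = 1, \quad \psi_1(1) = 0. \tag{6.34} \]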
It is anti-Herglotzian and satisfies
\[ \psi_1(-1/\lambda) = \frac{1}{\sigma^{\nu}}\, \psi_1(\varphi_0(-1/\lambda)) \le \frac{1}{\sigma^{\nu}}. \tag{6.35} \]
In the preceding subsection, much depended on the bound ψ(−1/λ) ≤ λ^{−rν}. To restore an analogous situation we define a new function ψ as
\[ \psi = \theta_{\sigma^{-\nu},\, \tau_1^{-\nu}} \circ \psi_1, \tag{6.36} \]
where θ_{σ^{−ν}, τ1^{−ν}} denotes the homographic function which fixes 0 and 1, and sends σ^{−ν} to τ1^{−ν}:
\[ \theta_{\sigma^{-\nu},\, \tau_1^{-\nu}}(z) = \frac{z(1-\sigma^{\nu})}{z(\tau_1^{\nu}-\sigma^{\nu}) + 1 - \tau_1^{\nu}}. \tag{6.37} \]
This function is Herglotzian and has a pole at a negative value temporarily denoted k. As a consequence ψ is holomorphic and anti-Herglotzian in %(−1/λ, 1/a2), where 1/a2 = ψ1^{−1}(k) if k ∈ ψ1((1, 1/a1(λ))), and 1/a2 = 1/a1(λ) otherwise. For z > 1, ψ1(z) < 0 and (using the inequalities (2.11)),
\[ \psi_1(z) \ge \psi_2(z) = \frac{1-z}{1-a_1(\lambda)\, z}. \tag{6.38} \]
If y = ψ1−1 (k) < 1/a1 , we have, since ψ2 is decreasing, k = ψ1 (y) ≥ ψ2 (y),
ψ2−1 (k) ≤ y.
(6.39)
Thus ψ is holomorphic in %(−1/λ, 1/4), where 1/4 = ψ2^{−1}(k) = ψ2(k). This gives:
4 = (τ1^ν − σ^ν + (1 − τ1^ν) a1(λ)) / (1 − σ^ν), (6.40)
a(λ) ≤ a1(λ) ≤ 4 < a3(λ) = τ1^ν + (1 − τ1^ν) a1(λ). (6.41)
The function ψ has been defined so as to satisfy ψ(−1/λ) ≤ λ^{−rν}. We now proceed to define v, z1, ϕ, etc. exactly as in the preceding subsection and obtain the same inequalities with the single exception that, in the lower bound (6.25), c must be replaced by a3(λ). Since λ = λ1, we find
\[ \varphi'(1) \ge l(\lambda_1) = \lambda_1^{p}\, \frac{1-a_3(\lambda_1)}{r(1+\lambda_1)(1+a_3(\lambda_1)\lambda_1)} \left[\frac{1-a_3(\lambda_1)}{r(1+\lambda_1)(\lambda_1+a_3(\lambda_1))}\right]^{p-1}. \tag{6.42} \]
Recall that ϕ is holomorphic in %(−1/λ, 1/a(λ)) and maps this domain into itself, with a(λ) given by (6.1). The bound (6.42) also holds in the cases when λ > λ1 since then a1(λ) < a3(λ1).
Finally we define N (b, r, p, ν, λ1 ),0 = h−1 b,λ ◦ ϕ ◦ hb,λ .
(6.43)
The operator N(b, r, p, ν, λ1) maps the domain Q 0 (b, νr) into Q 0 (b, νr) ∩ {, : ,′(1) ≥ l(λ1)}, which is compact and convex, hence it has fixed points there. Our task is now to prove that if λ1 has been chosen sufficiently small, any fixed point of N(b, r, p, ν, λ1) is actually a fixed point of B(b, r, p, ν). We assume, from now on, that λ1 ≤ 1/8. Let ,0 be a fixed point of N(b, r, p, ν, λ1). If σ^ν = ,0′(1) ≥ τ1^ν, there is nothing to prove. Otherwise, we have λ = λ1 and ϕ0 = ϕ, so that ϕ0 and ψ1 are now holomorphic in %(−1/λ1, 1/a(λ1)). Thus ψ is now holomorphic in %(−1/λ1, 1/a4(λ1)), with
\[ a(\lambda_1) < a_4(\lambda_1) = \frac{\tau_1^{\nu}-\sigma^{\nu}+(1-\tau_1^{\nu})\,a(\lambda_1)}{1-\sigma^{\nu}} < \tau_1^{\nu}+(1-\tau_1^{\nu})\,a(\lambda_1). \tag{6.44} \]
Recalling that λ1 ≤ 1/8, we find
\[ a_4(\lambda_1) \le \lambda_1^{r\nu} + 2\lambda_1\, \frac{1-\lambda_1^{r\nu}}{1-\lambda_1} \le \frac{3\lambda_1}{1-\lambda_1}. \tag{6.45} \]
Inserting this in the lower bound obtained by setting λ = λ1 and c = a4(λ1) in (6.25) gives
\[ \varphi'(1) \ge \lambda_1\, \frac{1-4\lambda_1}{r(1+\lambda_1)(1-\lambda_1+3\lambda_1^{2})} \left[\frac{1-4\lambda_1}{r(1+\lambda_1)(4-\lambda_1)}\right]^{p-1}, \tag{6.46} \]
and, using λ1 ≤ 1/8,
\[ \varphi'(1) \ge \lambda_1 (9r)^{-p}, \tag{6.47} \]
and since λ1 ≥ (ϕ′(1))^{1/rν},
\[ \varphi'(1) \ge (9r)^{-pr\nu/(r\nu-1)}. \tag{6.48} \]
If we assume that λ1 has been chosen so that
\[ \lambda_1 < (9r)^{-p/(r\nu-1)}, \tag{6.49} \]
the inequality (6.48) contradicts our hypothesis that ,0′(1) < λ1^{rν}. Therefore ,0 is a fixed point of B(b, r, p, ν).
7. Properties of Solutions
This section is devoted to some properties of the solutions, i.e. of functions ψ and u, and numbers ν ∈ (0, 1], p ≥ 2, r > 1, (rν > 1 + (p − 1)(1 − ν) if ν < 1), λ, z1, satisfying the requirements of Sect. 3. These properties are extensions of those established for p = 1 in [EE, EL, E2]. We do not consider the case p = 1.
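We denote ϕ = ϕ0, v, ζ0, ..., ζ_{p+1}, the objects constructed from ψ as in the definition of B(b, r, p, ν). We also denote τ = λ^r, and
\[ \begin{aligned} u(z) &= \check{u}(-z) = \frac{z_1}{\lambda^{\nu-1}}\, v(-z) = \frac{z_1}{\lambda^{\nu-1}}\, \psi(z)^{1/r} = U(z)^{1/r}, && z \in \mathbf{C}_+\cup\mathbf{C}_-\cup(-1/\lambda,\ 1),\\ U(z) &= \left(\frac{z_1}{\lambda^{\nu-1}}\right)^{r} \psi(z), && z \in \mathbf{C}_+\cup\mathbf{C}_-\cup(-1/\lambda,\ \zeta_1/\lambda). \end{aligned} \tag{7.1} \]
Recall that it has been shown in Sect. 4 that
\[ \frac{1}{\tau^{\nu}} \ge \frac{1}{\lambda}(1+\lambda)(1+\sqrt{\lambda})^{2} > 8, \qquad \lambda \le b_1(r\nu), \qquad \text{if } 0<\nu\le 1,\ p\ge 2, \tag{7.2} \]
and
\[ r\nu \ge \frac{1+\lambda}{1-\lambda}, \qquad \lambda^{r\nu-1-(p-1)(1-\nu)} \le (1+\lambda)^{-p} \qquad \text{if } 0<\nu\le 1,\ p\ge 2. \tag{7.3} \]
Moreover (4.33) and rν ≥ (1 + b)/(1 − b) give
\[ \lambda \ge (4r^{3}\nu^{2})^{-p/(r\nu-1-(p-1)(1-\nu))} \qquad \text{if } 0<\nu\le 1,\ p\ge 2. \tag{7.4} \]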
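For 1 < ν ≤ 2, it was shown in Sect. 6 that
\[ \lambda \le b_6(r\nu) = m_0^{1/r\nu}, \quad m_0 = 0.803, \qquad \frac{\zeta_1}{\lambda} \ge \frac{1}{a(\lambda)}, \qquad (1<\nu\le 2,\ p\ge 2), \tag{7.5} \]
where a(λ) is defined in (6.1), and that
\[ \lambda \ge \min\{1/7,\ (6r)^{-p/(r\nu-1)}\}, \qquad (1<\nu\le 2,\ p\ge 2), \tag{7.6} \]
\[ \lambda \ge \left(\frac{1-m_0^{1/r\nu}}{16r}\right)^{p/(r\nu-p)}, \qquad (1<\nu\le 2,\ 2\le p< r\nu). \tag{7.7} \]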
The function u has an angular derivative at infinity equal to zero (i.e. u(z)/z tends to 0 as z → ∞ in non-real directions) because u(z) = U(z)^{1/r}, U is anti-Herglotzian, and r > 1. Similarly v and ϕ have zero angular derivative at infinity.
7.1. Analyticity. The function ϕ is holomorphic in %(−1/λ, ξmax), where ξmax = λ^{−2} if ǔ(1/λ) ≤ 1/λ (as is the case for ν ≤ 1, since λǔ(1/λ) ≤ z1 λ^{2−2ν} < 1). In this case,
\[ \varphi(\xi_{\max}) = \varphi(\lambda^{-2}) = z_1\, v(\check{u}^{p-1}(\lambda^{-1})) \le \frac{z_1}{\lambda^{\nu}} < \lambda^{-2}. \tag{7.8} \]
If ǔ(1/λ) > 1/λ, we denote ξp = 1/λ² and λξ_{p−1} = ǔ^{−1}(1/λ). We construct by a descending induction the strictly increasing sequence ξ1, ..., ξp satisfying ǔ^j(λξ_{p−j}) = λξp = 1/λ. Supposing ξ_{p−j} < ... < ξp already constructed for a certain j < p − 1, we have ǔ^{j+1}(λξ_{p−j}) = ǔ(1/λ) > 1/λ, while ǔ^{j+1}(λ) = ζ_{j+1} < 1/λ. Hence λξ_{p−j−1} = ǔ^{−(j+1)}(1/λ) exists in (λ, λξ_{p−j}). We set ξmax = ξ1 so that ǔ^{p−1}(λξmax) = 1/λ. Recalling that ǔ^{p−1}(ζ1) = λ^{1−ν}, we find:
\[ \frac{\zeta_1}{\lambda} \le \xi_{\max} = \xi_1 < \xi_2 < \cdots < \xi_p = \lambda^{-2}. \tag{7.9} \]
The first inequality here is replaced by the equality ξmax = ζ1/λ when ν = 2 (and, of course, p > 1). More generally ǔ^{p−j}(ζj) = λ^{1−ν} implies ζj ≤ λξj for all j ∈ [1, p−1], equality holding when ν = 2. Note (see (6.9) and (6.11)) that z1/λ^{ν−1} < ζ1 < 1/λ^{ν−1}, and
\[ \varphi(\xi_{\max}) = z_1\, v(\lambda^{-1}) \le \frac{z_1}{\lambda^{\nu}} < \zeta_1/\lambda. \tag{7.10} \]
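In both cases the whole domain %(−1/λ, ξmax) is a basin of attraction of the fixed point 1 of ϕ, hence the domain of ψ is also %(−1/λ, ξmax), and
\[ \psi(z) = \frac{1}{\lambda^{\nu r}}\, \psi(\varphi(z)), \qquad \varphi(z) = \lambda^{\nu-1}\, \check{u}^{p}(\lambda z), \tag{7.11} \]
hold for all z ∈ %(−1/λ, ξmax). Also
\[ u(z) = \frac{1}{\lambda^{\nu}}\, u(\lambda^{\nu-1}\, \check{u}^{p}(\lambda z)), \qquad z \in \mathbf{C}_+\cup\mathbf{C}_-\cup(-1/\lambda,\ 1). \tag{7.12} \]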
7.2. Univalence for p ≥ 2. We prove in this subsection that ψ and ϕ are univalent in %(−1/λ, ξmax ). We temporarily denote φj (z) = −uˇ j (λz), for 0 ≤ j ≤ p − 1, φp (z) = ϕ(z).
(7.13)
Let c be fixed with 1 < c < min{1/λ, ζ1/λ}. We first verify that each φj, 0 ≤ j ≤ p, maps the interval (−1/λ, c) into an open interval Xj with closure contained in (−1/λ, c). This is clear in the case j = p, since φp = ϕ. For j = 0, φ0(−1/λ) = 1, and φ0(c) = −λc > −1. If 1 ≤ j ≤ p−1, φj is decreasing, φj(c) < φj(−1/λ) ≤ 0 and φj(c) > −ǔ^j(ζ1) = −ζ_{j+1} ≥ −1/λ. Let X be the convex hull of X0 ∪ ... ∪ Xp. This is an open interval with closure contained in (−1/λ, c), such that, for all j = 0, ..., p, φj((−1/λ, c)) ⊂ X. Suppose that w′ and w″ are distinct points in %(−1/λ, ξmax) such that ψ(w′) = ψ(w″). This implies that w′ and w″ are not real, and have imaginary parts of the same sign. We inductively construct a sequence of triples {w′_n, w″_n, j_n}_{0≤n<∞}, where w′_0 = w′, w″_0 = w″, and, for all n ≥ 0, w′_n ≠ w″_n are non-real, ψ(w′_n) = ψ(w″_n), and 0 ≤ j_n ≤ p is such that w′_{n+1} = φ_{j_n}(w′_n), w″_{n+1} = φ_{j_n}(w″_n). Assuming that w′_n and w″_n have already been constructed, it follows from (7.11) and the definition (7.13) of the functions φj that there is a unique j_n in [0, p] such that φ_{j_n}(w′_n) ≠ φ_{j_n}(w″_n) and either j_n = p or φ_{j_n+1}(w′_n) = φ_{j_n+1}(w″_n). This implies that ψ(φ_{j_n}(w′_n)) = ψ(φ_{j_n}(w″_n)), and we take w′_{n+1} = φ_{j_n}(w′_n), w″_{n+1} = φ_{j_n}(w″_n). It is easy to see (as e.g. in [E2]) that, as n tends to infinity, the Poincaré distances, relative to C+ ∪ C− ∪ X, of w′_n and w″_n to the segment X tend to 0. Therefore as n becomes sufficiently large, the points w′_n and w″_n enter a complex neighborhood of the real segment X so thin that ψ is injective there, producing a contradiction. Thus ψ, and therefore also u and ϕ are univalent in their respective domains.
7.3. Boundary values of u. We show in this subsection that the restriction of u to the upper half-plane C+ extends to a continuous bounded injective function on the closed upper half-plane C+. The same, of course, holds in the lower half-plane, since u(z) = u∗(z∗). We rewrite (7.12) as
\[ u(z) = F(u(-\lambda z)), \tag{7.14} \]
\[ F(z) = \frac{1}{\lambda^{\nu}}\, u(\lambda^{\nu-1}\, \check{u}^{p-1}(z)). \tag{7.15} \]
The function F is anti-Herglotzian, holomorphic and univalent in %(−1, ζ1) and vanishes at ζ1 = u(−λ). It has a fixed point at u(0) = z1 λ^{1−ν} with F′(z1 λ^{1−ν}) = −1/λ, and (since it is strictly decreasing) no other fixed point in the real interval [−1, ζ1]. Equation (7.14) can be rewritten as F = u ∘ (−λ^{−1}) ∘ u^{−1} on the intersection of the domain of F with the range of u. This range includes the real segment (0, u(−λ^{−1})) and hence (0, ζ1], since u(−λ^{−1}) > u(−λ) = ζ1. Any periodic orbit of F in [−1, ζ1] must be contained in (0, ζ1], so that {u(0)} is the only such orbit. Therefore the Herglotz function F² also has u(0) as its unique real fixed point, with (F²)′(u(0)) = λ^{−2}. Both F and F² have zero angular derivative at ∞ since z → F(z)^r is anti-Herglotzian and z → F²(z)^r is Herglotzian. As in [EL, E2], the theory of iterations of maps of C+ into itself (Wolff–Denjoy–Valiron Theorem, see [V, Mi]) shows that, uniformly on every compact subset of C+, F^{2n} tends to a finite constant c as n → ∞. In particular for all z in a fixed compact subset of C−, u(λ^{−2n} z) = F^{2n}(u(z)) tends uniformly to c, i.e. c = u(−i∞). For n = 0, 1, 2, ..., we denote In the closed real interval
\[ I_n = (-\lambda)^{-n}\,[1,\ \lambda^{-2}]. \tag{7.16} \]
We first consider the case when ξmax = λ−2 (in particular the case 0 < ν ≤ 1), for which the argument of the case p = 1 [EL, E2] can be repeated almost verbatim. Let z follow I0 − i0. Then u(z) follows the segment τ0 = eiπ/r [0, |U (λ−2 )|1/r ].
(7.17)
If z crosses the interior of I0 into C+, u gets continued by v0 ≡ e^{2πi/r} u. The image of C+ given by v0 is contained in {z : π/r < arg z < 2π/r}. It is contained in C+ if and only if r ≥ 2 (in particular if r is an integer). Let V0 denote the open set {z ∈ C+ : v0(z) ∈ C+}, and, for n ≥ 1, Vn = −λ^{−1} V∗_{n−1} (so that Vn = C+ when r ≥ 2). If z follows I1 − i0, then −λz follows I0 + i0 and, by (7.14), u(z) follows the analytic arc τ1 = F(τ0∗) which lies entirely in C+ except for its starting point, u(−λ^{−1}) = F(0). An easy induction shows that when z follows In − i0, (n ≥ 1), u(z) follows an analytic arc τn lying entirely in C+ for n > 1, and u can be continued across the interior of In into C+ by a function vn holomorphic in Vn, with vn(Vn) ⊂ C+ and
\[ \tau_n = F(\tau_{n-1}^{*}), \qquad v_n(z) = F(v_{n-1}(-\lambda z^{*})^{*}). \tag{7.18} \]
The starting point of τn+2 is the end of τn , and τn+2 = F 2 (τn ). Hence the arcs τn tend to the point c. Thus u|C+ extends to a continuous bounded function on C+ . This function is injective. Indeed at each step of its inductive construction, a new extension is obtained by composing copies of the previously constructed extension and scalars.
We now consider the case when ξmax < λ^{−2} which occurs if ǔ(λ^{−1}) > λ^{−1}. (In particular for ν = 2, ζp = λ^{−1} < ζ_{p+1} = ǔ(λ^{−1}).) Recall that we denote ξj, for 1 ≤ j ≤ p, the unique number in [ζ1/λ, λ^{−2}] such that
\[ \check{u}^{p-j}(\lambda\, \xi_j) = \lambda^{-1}, \tag{7.19} \]
and that
\[ \xi_{\max} = \xi_1 < \cdots < \xi_p = \lambda^{-2}, \qquad \zeta_j \le \lambda \xi_j \quad \forall j \in [1, p]. \tag{7.20} \]
Suppose z follows [1, ξmax] ∓ i0. Then u(z) = e^{±iπ/r} |U(z)|^{1/r} follows the segment e^{±iπ/r} [0, |U(ξmax)|^{1/r}]. Hence if z follows (−1/λ)[1, ξmax] − i0, u(z) is given by (7.14), and follows an analytic arc entirely contained in C+ except for its starting point ǔ(1/λ). Thus z → u(z − i0) now has a continuous, non-real extension to [−ξmax/λ, −1/λ). The extension thus obtained of u|C− to C− ∪ [−ξ1/λ, ξ1] is also obviously injective, as well as the conjugate extension of u|C+. Recall also that
\[ 1/\lambda < \check{u}(1/\lambda) \le z_1 \lambda^{1-2\nu} < \zeta_1/\lambda^{\nu} \le \xi_{\max}/\lambda. \tag{7.21} \]
Hence
\[ \check{u}(1/\lambda) \in (1/\lambda,\ \xi_{\max}/\lambda), \tag{7.22} \]
and
\[ 1 \le \lambda^{\nu-2} < \lambda^{\nu-1}\, \check{u}(1/\lambda) < \zeta_1/\lambda. \tag{7.23} \]
This shows that λ^{ν−1} ǔ(1/λ) is in the domain of analyticity of U and U is negative there. We assume inductively that there exists, for a certain j ∈ [1, p − 1], a continuous injective extension, temporarily denoted u_(j), of u|C− to C− ∪ [−ξj/λ, ξj], such that u_(j)(z − i0) ∈ C+ if z ∈ [−ξj/λ, −1/λ) or if z ∈ (1, ξj]. This implies of course a symmetrical situation for u|C+. By abuse of notation we also denote u_(j)(z + i0) = u_(j)(z∗ − i0)∗. We also assume that (7.12) holds with u replaced by u_(j) in the domain of the latter. In order to prove the same for j + 1, we denote
\[ u_{(j+1/2)}(z \mp i0) = \lambda^{-\nu}\, u_{(j)}\big(\lambda^{\nu-1}\, \check{u}_{(j)}^{\,j}(\check{u}_{(j)}^{\,p-j}(\lambda z \mp i0))\big). \tag{7.24} \]
Note that the rhs of the above equation is equal to u_(j)(z ∓ i0) in the domain of u_(j) by the induction hypothesis. Thus u_(j+1/2) is a new extension of u|C−, which is injective wherever it is defined. If z increases along (ξj, ξ_{j+1}], ǔ_(j)^{p−j}(λz − i0) moves along (1/λ, ǔ(1/λ)] − i0. This, by the induction hypothesis and (7.22), is within the domain of the already constructed ǔ_(j)^{j}, and ǔ_(j)^{j}(ǔ_(j)^{p−j}(λz − i0)) moves along an arc entirely contained in C−, and u_(j+1/2)(z − i0) moves along an arc entirely contained in C+. A little more detail is needed, since ǔ(1/λ) is real, when z moves in a small real interval containing ξj so that ǔ^{p−j}(λz − i0) moves along a small real interval containing 1/λ. If j = 1, then u_(1) is continuous and non-real at λ^{ν−1} ǔ(1/λ) ± i0 since, as noted above, this is a point of analyticity of U. If j > 1, ǔ_(j)²(1/λ ± i0) is defined and non-real by (7.22). Denote now u_(j+1)(z − i0) = λ^{−ν} u_(j+1/2)(λ^{ν−1} ǔ_(j+1/2)^{p}(λz − i0)). This is a continuous injective extension of u|C− to C− ∪ [−ξ_{j+1}/λ, ξ_{j+1}] since the arc u_(j+1/2)([ξj, ξ_{j+1}] ∓ i0) is entirely contained in C±. The construction makes it obvious that (7.12) holds with u replaced by u_(j+1) in the domain of the latter.
We conclude that u|C− has a continuous injective extension to C− ∪ [−λ^{−3}, λ^{−2}] which takes real values only on [−1/λ, 1]. It maps I0 − i0 onto a union τ0 of p consecutive arcs contained in C+ except for the point u(1) = 0: τ0 = τ00 ∪ ... ∪ τ0(p−1), with τ00 = u([1, ξ1] − i0) and τ0j = u([ξj, ξ_{j+1}] − i0) for 1 ≤ j < p. It maps I1 − i0 onto another finite union τ1 = τ10 ∪ ... ∪ τ1(p−1), with τ1j = F(τ0j∗), contained in C+ except for the point u(−1/λ). As in the previous case, u|C− extends to a continuous injective function on C−. The images τn = u(In − i0) all lie in C+ for n > 1. The sequence of the τn = F(τ∗_{n−1}) = F²(τ_{n−2}) tends to the point c. Let I00 = [1, ξ1], I0j = [ξj, ξ_{j+1}] for 1 ≤ j < p, and Inj = (−λ)^{−n} I0j for n ∈ N, 0 ≤ j < p. If z crosses (1, ξmax) from C− into C+, u(z) gets continued by v00(z) = e^{2iπ/r} u(z). If z crosses (−ξmax/λ, −1/λ) from C− into C+, u(z) gets continued by v10(z) = F(v00(−λz∗)∗), holomorphic in V10 = V1. If z crosses (ξj, ξ_{j+1}) from C− into C+ (with 1 ≤ j < p), u(z) gets continued by v0j(z) = λ^{−ν} u(λ^{ν−1} ǔ^{j−1}(v10(−ǔ^{p−j}(λz∗))∗)). If z crosses (−ξ_{j+1}/λ, −ξj/λ) from C− into C+, u(z) gets continued by v1j(z) = F(v0j(−λz∗)∗). If z crosses the interior of Inj from C− into C+, u(z) gets continued by vnj(z) = F(v(n−1)j(−λz∗)∗). If r ≥ 2, all the functions vnj are holomorphic in C+ and map it into itself. Note that, in all cases, the extension of u to C− (resp. C+) takes real values only on [−λ^{−1}, 1]. The function F|C+ (resp. F|C−) also has a continuous injective extension to C+ (resp. C−) which takes real values only on the real segment [−1, ζ1]. The point c cannot be real. Indeed if we suppose it is and let w0 ∈ C+, wn = F^{2n}(w0) for n ∈ N, the sequences {wn} and {F²(wn)} both tend to c, so that, by the continuity of the extensions of F to C±, F²(c + i0) = c.
Since this is real, F (c + i0), hence also c, must belong to [−1, ζ1 ], and c is a fixed point of F 2 , i.e. c coincides with z1 λ1−ν . But the latter is repulsive, contradicting the attractive property of c. Hence c is in C+ and is a fixed point of F 2 . It is attractive and unique by Schwarz’s lemma applied to F 2 |C+ . Therefore the compact sets τn converge geometrically to c. It follows that the functions u, ϕ, ψ are all bounded. 7.4. Commutativity for ν = 2. The following is a transcription into the notations of this paper of (a special case of) a result due to O. Lanford. This will prove that the properties of u and ψ recalled at the beginning of this section suffice to imply, in the case ν = 2, a form of commutativity for the functions ξ and η given by ξ = (−u)−1 , η = −λξ ◦ (−1/λ), η−1 = λuˇ ◦ (1/λ).
(7.25)
Recall that the functional equations (7.11), (7.12), and ν = 2, imply that ξ and η satisfy the system (1.5). With the notations of the beginning of this section, we have Lemma 7.1. (Lanford) For every solution with ν = 2, ψ(λu(−z/λ)) = −λr ψ(u(z)/λ) for all z ∈ %(−λ, 1).
(7.26)
Proof. The domains of the two anti-Herglotzian functions F1 = ψ ◦ λ ◦ u ◦ (−1/λ), F2 = −λr ψ ◦ (1/λ) ◦ u,
(7.27)
are equal to %(−λ, 1). Indeed the function z → λu(−z/λ) has this domain and maps −λ to 0, and 1 to z1 v(1/λ) ≤ z1 /λ2 < ζ1 /λ, hence it maps %(−λ, 1) into the domain
of ψ. The function (1/λ)u is holomorphic in %(−1/λ, 1). It maps 1 to 0 and −λ to ζ1/λ, hence it also maps %(−λ, 1) into the domain of ψ (and F2 has a branch point at −λ since ψ has one at ζ1/λ). We now substitute for u, in the equation for F1, the r.h.s. of (7.12), and substitute for ψ, in the equation for F2, the r.h.s. of the first equation in (7.11). This gives
\[ F_1 = -\frac{1}{\lambda^{r}}\, F_2 \circ G_0, \qquad F_2 = -\frac{1}{\lambda^{r}}\, F_1 \circ G_0, \tag{7.28} \]
\[ G_0(z) = \lambda\, \check{u}^{p}(-z) = \varphi(-z/\lambda). \tag{7.29} \]
Since the functional equations (7.12) and (7.11) hold with domains, so does the system (7.28). In fact the anti-Herglotzian function G0 maps the domain %(−λ, 1) into itself, since G0 is holomorphic in %(−ζ1 , 1) and satisfies: G0 (1) = ϕ(−1/λ) ≥ 0, G0 (0) = ϕ(0) < 1, G0 (−λ) = ϕ(1) = 1.
(7.30)
Since G0 is strictly decreasing on [−λ, 1], it has there a unique fixed point x̄ ∈ (0, 1) which, by Schwarz’s lemma, is attractive and has %(−λ, 1) as a basin of attraction. Let κ = −G0′(x̄) ∈ (0, 1), and let h be the linearizer of G0 at x̄, normalized by h′(x̄) = 1. This is a function holomorphic in %(−λ, 1), and satisfying, in this domain, h = (−1/κ) h ∘ G0 (in particular h(x̄) = 0). The point x̄ is also the unique fixed point of the function G0² in %(−λ, 1) and its normalized linearizer is also h. On the other hand, because G0 maps %(−λ, 1) into itself, the equation obtained by substituting the second equation in (7.28) into the first,
\[ F_1 = \frac{1}{\lambda^{2r}}\, F_1 \circ G_0^{2} \tag{7.31} \]
holds in %(−λ, 1). Therefore F1 = c1 h with c1 = F1′(x̄), and κ = λ^r. The second equation in (7.28) now reads F2 = (−c1/κ) h ∘ G0 = c1 h. Therefore F1 and F2 coincide, which is the assertion of the lemma. In particular, for z = −λ, this gives 1 = ψ(0) = −λ^r ψ(ζ1/λ), i.e. ψ(ζ1/λ) = −1/λ^r. Both sides of (7.26) must vanish at x̄, hence ǔ(x̄/λ) = 1/λ, u(x̄) = λ so that x̄ = λζ_{p−1} = −ζ_{−1}. Since F2(1) = −λ^r, the common range of F1(z) and F2(z) as z varies in [−λ, 1] is [−λ^r, 1]. The identity (7.26) continues to hold if ψ is replaced with U = (z1/λ)^r ψ on both sides. In order to translate this identity in terms of ξ and η, we denote
\[ q(z) = |z|^{r}\, \mathrm{sign}(z) \quad \forall z \in \mathbf{R}, \qquad \tilde{u}(z) = q^{-1} \circ U(z) \quad \forall z \in [-1/\lambda,\ \zeta_1/\lambda]. \tag{7.32} \]
The function ũ is strictly decreasing, with range containing [−z1/λ², 1/λ], coincides with u on [−1/λ, 1], and satisfies
\[ \begin{aligned} \tilde{u} &= (1/\lambda^{2})\, \tilde{u} \circ \lambda \check{u}^{p} \circ \lambda && \text{on } [-1/\lambda,\ \zeta_1/\lambda],\\ \tilde{u} \circ \lambda u \circ (-1/\lambda) &= -\lambda\, \tilde{u} \circ (1/\lambda)\, u && \text{on } [-\lambda,\ 1]. \end{aligned} \tag{7.33} \]
Let
\[ \tilde{\xi} = (-\tilde{u})^{-1}, \qquad \tilde{\eta} = -\lambda\, \tilde{\xi} \circ (-1/\lambda). \tag{7.34} \]
Then ξ̃ is an extension of ξ to an interval containing (−1/λ, z1/λ²), η̃ is an extension of η, and
\[ \begin{aligned} &\tilde{\xi} = (1/\lambda^{2})\, \tilde{\eta}^{p} \circ \tilde{\xi} \circ \lambda^{2}, \qquad \tilde{\xi} = (-1/\lambda)\, \tilde{\eta} \circ (-\lambda), && \text{on } (-1/\lambda,\ z_1/\lambda^{2}),\\ &\tilde{\eta} \circ \tilde{\xi} = \tilde{\xi} \circ \tilde{\eta} && \text{on } (-z_1/\lambda,\ z_1). \end{aligned} \tag{7.35} \]
8. Behavior of Fixed Points as r → ∞, 0 < ν ≤ 1 In this section we consider only the cases when 0 < ν ≤ 1 and p ≥ 2 (and, of course, rν −1−(p −1)(1−ν) > 0). In the case p = 1, the behavior of solutions as r → ∞ was first elucidated by Eckmann and Wittwer in [EW], and also studied in [E1] (for ν = 1) and [EE] (for 1 ≤ ν ≤ 2), and the method of [E1,EE] extends trivially to 0 < ν ≤ 1. The case p ≥ 2 requires some additional work. 8.1. The functions V and W . The functional equation implies ψ(z) = V (ψ(−λz)) = W (ψ(λ2 z)),
∀z ∈ C+ ∪ C− ∪ (−1/λ, 1/λ2 ),
(8.1)
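where
\[ V(\zeta) = f(\zeta^{1/r}), \qquad f(z) = \frac{1}{\lambda^{r\nu}}\, \psi(\lambda^{\nu-1}\, \check{u}^{p-1}(z_1 \lambda^{1-\nu} z)), \qquad W = V \circ V. \tag{8.2} \]
Recall that
\[ \check{u}(z) = z_1 \lambda^{1-\nu}\, \psi(-z)^{1/r}. \tag{8.3} \]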
Recall that
The function f is anti-Herglotzian and holomorphic in %(−1/z1 λ1−ν , 1/z1 λ2−ν ). We denote ζmax = (1/z1 λ2−ν )r . These functions satisfy 1 V (1) = − , λ
V (1) = f (1) = 1,
r f (1) = − . λ
(8.4)
Since ψ(1) = 0, V vanishes at α = ψ(−λ), and f vanishes at v(λ) = (z1λ^{1−ν})^{−1} ζ1. We also define
\[ \widetilde{V}(\zeta) = 1 - V(1-\zeta), \qquad \widetilde{W} = \widetilde{V} \circ \widetilde{V}. \tag{8.5} \]
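Since the functional equations (8.1) hold for all z in the domain of ψ, the real ranges of V and W contain that of ψ. The following estimates follow [EE] and [E1]. In the domain of V,
\[ -\frac{V''(\zeta)}{V'(\zeta)} = \frac{1}{r\zeta}\left[r - 1 - \frac{z f''(z)}{f'(z)}\right], \qquad z = \zeta^{1/r}. \tag{8.6} \]
For real ζ ∈ (0, ζmax),
\[ -\frac{V''(\zeta)}{V'(\zeta)} \ge \frac{1}{r\zeta}\left[r - 1 - \frac{2z}{1/(\lambda^{2-\nu} z_1) - z}\right] = \frac{1}{r\zeta}\left[r - \frac{1+\lambda^{2-\nu} z_1 z}{1-\lambda^{2-\nu} z_1 z}\right]. \tag{8.7} \]
Recalling the bound rν ≥ (1 + λ)/(1 − λ), we find that
\[ -\frac{V''(\zeta)}{V'(\zeta)} \ge \frac{1-\nu}{\zeta} \qquad \text{for } 0 < \zeta \le (z_1\lambda^{1-\nu})^{-r}. \tag{8.8} \]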
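This is in particular satisfied if ζ = α = ((z1λ^{1−ν})^{−1} ζ1)^r, since ζ1 ≤ λ^{1−ν} ≤ 1. Integrating the inequality (8.8) from 1 to ζ > 1, using V′(1) = −1/λ and V(1) = 1, gives
\[ V(\zeta) > 1 - \frac{1}{\lambda\nu}(\zeta^{\nu} - 1) \quad\Rightarrow\quad \alpha > (1+\lambda\nu)^{1/\nu} \ge (1+\lambda). \tag{8.9} \]
It follows similarly from (8.6) that
\[ -\frac{V''(\zeta)}{V'(\zeta)} \le \frac{1}{r\zeta}\left[r - 1 + \frac{2z}{z + \lambda^{\nu-1}/z_1}\right] = \frac{1}{\zeta}\left[1 - \frac{1 - z_1\lambda^{1-\nu} z}{r(1 + z_1\lambda^{1-\nu} z)}\right], \tag{8.10} \]
so that
\[ -\frac{V''(\zeta)}{V'(\zeta)} \le \frac{1}{\zeta} \qquad \forall \zeta \in (0, \alpha). \tag{8.11} \]
Integrating this from 1 to ζ > 1 gives
\[ -V'(\zeta) \ge \frac{1}{\lambda\zeta} \qquad \forall \zeta \in (1, \alpha), \tag{8.12} \]
\[ V(\zeta) \le 1 - \frac{1}{\lambda}\log\zeta \quad \forall \zeta \in (1, \alpha) \quad\Rightarrow\quad \alpha \le e^{\lambda}. \tag{8.13} \]
Since V = f ∘ q^{−1}, where q^{−1}(ζ) = ζ^{1/r}, the Schwarzian derivative SV of V satisfies, for real ζ in the domain of V,
\[ SV(\zeta) \ge Sq^{-1}(\zeta) = \frac{1 - r^{-2}}{2\zeta^{2}}. \tag{8.14} \]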
The function W is Herglotzian and holomorphic in %(0, α), where α = ψ(−λ) = V^{−1}(0) (since V(0) ≤ λ^{−rν} < ζmax). It has a repelling fixed point at 1 with multiplier λ^{−2}. W̃ is Herglotzian and holomorphic in %(1 − α, 1) and has a fixed point at 0. By (8.14),
\[ SW(\zeta) \ge \frac{1 - r^{-2}}{2}\left[\frac{V'(\zeta)^{2}}{V(\zeta)^{2}} + \frac{1}{\zeta^{2}}\right]. \tag{8.15} \]
Lower bound for W̃ in [0, 1]. For 0 < ζ < 1, the convexity of V implies:
\[ -V'(\zeta) \ge \frac{V(\zeta) - 1}{1 - \zeta}, \tag{8.16} \]
hence
\[ -\frac{V(\zeta)}{V'(\zeta)} \le 1 - \zeta - \frac{1}{V'(\zeta)} \le 1 - \zeta + \lambda. \tag{8.17} \]
It follows that
\[ 2\,SW(\zeta) \ge (1 - r^{-2})\left[\frac{1}{(1 - \zeta + \lambda)^{2}} + \frac{1}{\zeta^{2}}\right], \tag{8.18} \]
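and hence
\[ 2\,S\widetilde{W}(\zeta) \ge (1 - r^{-2})\left[\frac{1}{(\zeta + \lambda)^{2}} + \frac{1}{(1 - \zeta)^{2}}\right]. \tag{8.19} \]
In (0, 1), the r.h.s. has a minimum at ζ = (1 − λ)/2, and, using the bound on r ≥ (1 + λ)/(1 − λ), we get
\[ \frac{d}{d\zeta}\, \frac{\widetilde{W}''(\zeta)}{\widetilde{W}'(\zeta)} \ge S\widetilde{W}(\zeta) \ge s(\lambda) \equiv \frac{16\lambda}{(1+\lambda)^{4}}. \tag{8.20} \]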
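The passage from (8.19) to (8.20) can be illustrated numerically. The Python sketch below is an assumption-laden check, not part of the proof: it takes r at the smallest value allowed by r ≥ (1 + λ)/(1 − λ), evaluates half of the right-hand side of (8.19) on a grid, and compares the minimum with s(λ). Function names are illustrative.

```python
def rhs_819(zeta, lam, r):
    # Half of the right-hand side of (8.19): a lower bound for the Schwarzian.
    return 0.5 * (1.0 - r**-2) * (1.0 / (zeta + lam)**2 + 1.0 / (1.0 - zeta)**2)

def s(lam):
    # s(lambda) as defined in (8.20).
    return 16.0 * lam / (1.0 + lam)**4

def min_on_grid(lam, n=2000):
    # Grid minimum over zeta in (0, 1), with r at its smallest admissible value.
    r = (1.0 + lam) / (1.0 - lam)
    return min(rhs_819(i / n, lam, r) for i in range(1, n))

if __name__ == "__main__":
    for lam in (0.1, 0.3, 0.5, 0.9):
        r = (1.0 + lam) / (1.0 - lam)
        print(lam, rhs_819((1.0 - lam) / 2.0, lam, r), s(lam), min_on_grid(lam))
```

At ζ = (1 − λ)/2 both squared terms equal ((1 + λ)/2)², and 1 − r^{−2} = 4λ/(1 + λ)² for the extremal r, which reproduces 16λ/(1 + λ)⁴.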
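By (8.11) and
\[ \frac{\widetilde{W}''(\zeta)}{\widetilde{W}'(\zeta)} = \frac{\widetilde{V}''(\widetilde{V}(\zeta))}{\widetilde{V}'(\widetilde{V}(\zeta))}\, \widetilde{V}'(\zeta) + \frac{\widetilde{V}''(\zeta)}{\widetilde{V}'(\zeta)}, \tag{8.21} \]
it follows that
\[ \frac{\widetilde{W}''(0)}{\widetilde{W}'(0)} = -\left[\frac{1}{\lambda} - 1\right] \frac{\widetilde{V}''(0)}{\widetilde{V}'(0)} = \left[\frac{1}{\lambda} - 1\right] \frac{V''(1)}{V'(1)} \ge -\left[\frac{1}{\lambda} - 1\right]. \tag{8.22} \]
Hence,
\[ \frac{\widetilde{W}''(\zeta)}{\widetilde{W}'(\zeta)} \ge \frac{\widetilde{W}''(0)}{\widetilde{W}'(0)} + s(\lambda)\zeta \ge -\left[\frac{1}{\lambda} - 1\right] + s(\lambda)\zeta, \tag{8.23} \]
\[ \log \widetilde{W}'(\zeta) \ge 2\log(1/\lambda) - \left[\frac{1}{\lambda} - 1\right]\zeta + s(\lambda)\zeta^{2}/2 \ge \left[2\log(1/\lambda) - \left(\frac{1}{\lambda} - 1\right)\right] + s(\lambda)\zeta^{2}/2. \tag{8.24} \]
As a function of λ in (0, 1), the first bracket in the last expression has a unique maximum at 1/2 and vanishes at 1. Since it is positive at e^{−1}, it is non-negative in [e^{−1}, 1]. Hence, for λ ≥ e^{−1} and 0 ≤ ζ < 1,
\[ \widetilde{W}'(\zeta) \ge 1 + s(\lambda)\zeta^{2}/2, \tag{8.25} \]
\[ \widetilde{W}(\zeta) \ge \zeta\left(1 + \frac{s(\lambda)}{6}\, \zeta^{2}\right), \tag{8.26} \]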
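and we note that, for λ ≥ 1/4, s(λ) ≥ 1. On the other hand W̃ is Pick with 0 angular derivative at infinity in C+ ∪ C− ∪ (1 − α, 1), and vanishes at 0. Hence there is a positive measure ρ with support in R∖(1 − α, 1) such that
\[ \widetilde{W}(\zeta) = \int_{\mathbf{R}\setminus(1-\alpha,\,1)} \left[\frac{1}{t-\zeta} - \frac{1}{t}\right] d\rho(t), \qquad \int_{\mathbf{R}\setminus(1-\alpha,\,1)} \frac{d\rho(t)}{t^{2}} = \frac{1}{\lambda^{2}}. \tag{8.27} \]
Hence, for 0 ≤ ζ < 1,
\[ \widetilde{W}(\zeta) \ge \frac{\zeta}{\lambda^{2}} \inf_{t \notin (1-\alpha,\,1)} \frac{t}{t-\zeta} = \frac{\zeta(\alpha-1)}{\lambda^{2}(\alpha-1+\zeta)} \ge \frac{\zeta}{\lambda(1+\lambda)}. \tag{8.28} \]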
Here we have used the lower bound (4.14) for α. For λ ≤ 1/2, this implies W̃(ζ) ≥ 4ζ/3 ≥ ζ(1 + ζ²/6), so that, for all λ and all ζ ∈ (0, 1),
\[ \widetilde{W}(\zeta) \ge \zeta(1 + c'\zeta^{2}), \qquad c' = 1/6. \tag{8.29} \]
Remark 8.1. Let ζ, y, a′, and m be strictly positive real numbers such that
\[ 0 < \zeta(1 + a'\zeta^{2}) \le y \le m. \tag{8.30} \]
Then
\[ \zeta \le y(1 - a y^{2}), \qquad a = \frac{a'}{1 + 3a'm^{2}}. \tag{8.31} \]
Indeed, note first that am² ≤ 1/3 < 1. Moreover ζ ≤ z for any z such that a′z³ + z − y ≥ 0, and inserting z = y(1 − ay²) in this expression gives y³[a′(1 − ay²)³ − a] ≥ y³[a′(1 − 3am²) − a] = 0.
This remark (with m = 1) and the lower bound (8.29) imply that W̃^{−1} is defined on [0, 1), and that, for all y ∈ [0, 1),
\[ \widetilde{W}^{-1}(y) \le y(1 - c y^{2}), \qquad c = \frac{c'}{1 + 3c'} = 1/9. \tag{8.32} \]
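Remark 8.1 lends itself to a direct numerical check. The Python sketch below is illustrative only (the function names are not from the paper): it computes a from a′ and m as in (8.31), and scans the extremal case y = ζ(1 + a′ζ²) of (8.30), which is the worst case because y(1 − ay²) is increasing in y on [0, m].

```python
import math

def inverse_constant(a_prime, m):
    # a = a' / (1 + 3 a' m^2), as in (8.31).
    return a_prime / (1.0 + 3.0 * a_prime * m * m)

def worst_margin(a_prime, m, n=2000):
    # Smallest value of y(1 - a y^2) - zeta over a grid of zeta,
    # taking y = zeta(1 + a' zeta^2) and keeping only points with y <= m.
    a = inverse_constant(a_prime, m)
    margin = math.inf
    for i in range(1, n + 1):
        zeta = m * i / n
        y = zeta * (1.0 + a_prime * zeta**2)
        if y > m:
            break
        margin = min(margin, y * (1.0 - a * y * y) - zeta)
    return margin

if __name__ == "__main__":
    # The case used for (8.32): a' = c' = 1/6, m = 1.
    print(inverse_constant(1.0 / 6.0, 1.0), worst_margin(1.0 / 6.0, 1.0))
    # The case used for (8.40): a' = 1/(6e), m = 2e - 1.
    m2 = 2.0 * math.e - 1.0
    print(inverse_constant(1.0 / (6.0 * math.e), m2), worst_margin(1.0 / (6.0 * math.e), m2))
```

The two parameter choices correspond to the two places where the text applies the remark; a non-negative margin is consistent with (8.31).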
Lower bound for W in [1, α]. For 1 ≤ ζ ≤ α, the inequalities (8.15) and (8.12), together with 0 ≤ V(ζ) ≤ 1, give
\[ SW(\zeta) \ge \frac{1}{2\zeta^{2}}(1 - r^{-2})(\lambda^{-2} + 1) \ge \frac{1}{\zeta^{2}}\, \frac{2}{(1+\lambda)^{2}}(\lambda^{-1} + \lambda). \tag{8.33} \]
The last inequality follows from the lower bound on r already used above. The last expression is decreasing in λ, so that, finally,
\[ SW(\zeta) \ge \frac{1}{\zeta^{2}} \qquad \forall \zeta \in (1, \alpha). \tag{8.34} \]
Since W″(1)/W′(1) = −W̃″(0)/W̃′(0) ≥ 0 (see (8.22)), for 1 ≤ ζ ≤ α,
\[ \frac{W''(\zeta)}{W'(\zeta)} \ge \int_{1}^{\zeta} t^{-2}\, dt = (\zeta - 1)/\zeta \ge (\zeta - 1)/e, \tag{8.35} \]
by using the upper bound α ≤ e, and hence
\[ W'(\zeta) \ge \lambda^{-2}(1 + (\zeta-1)^{2}/2e), \qquad W(\zeta) - 1 \ge (\zeta - 1)(1 + k'(\zeta-1)^{2}), \qquad k' = 1/6e \qquad \forall \zeta \in (1, \alpha). \tag{8.36} \]
The function W (ζ ) = W (ζ + 1) − 1 is thus defined on [0, α − 1], where it satisfies W (ζ ) ≥ ζ (1 + k ζ 2 ).
(8.37)
We note that W(α) = W(ψ(−λ)) = ψ(−λ^{−1}), hence the range of W|(1, α) contains in particular ψ(−1). We wish to apply Remark 8.1 to the inverse function W^{−1} restricted to [0, ψ(−1) − 1], and we first obtain an upper bound for ψ(−1). We use the representation (2.10):
\[ \log\psi(-1) - \log\psi(-\lambda) = \int_{\mathbf{R}\setminus(-\lambda^{-1},\,1)} \sigma(t)\left[\frac{1}{t+\lambda} - \frac{1}{t+1}\right] dt \le \int_{\mathbf{R}\setminus(-\lambda^{-1},\,1)} \left[\frac{1}{t+\lambda} - \frac{1}{t+1}\right] dt = \log 2 \tag{8.38} \]
which yields (using (8.13))
\[ \psi(-1) \le 2\psi(-\lambda) \le 2e^{\lambda} < 2e. \tag{8.39} \]
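The value log 2 of the last integral in (8.38) does not depend on λ, which can be confirmed by quadrature. The Python sketch below is a sanity check under stated choices: the two unbounded pieces are mapped to bounded intervals by the substitutions t = 1/s and t = −1/s (introduced here for convenience, not taken from the paper), and a midpoint rule is used.

```python
import math

def right_tail(lam, n=2000):
    # Integral over (1, inf) of 1/(t+lam) - 1/(t+1).
    # With t = 1/s the integrand becomes (1-lam)/((1+lam*s)(1+s)) on (0, 1).
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (1.0 - lam) / ((1.0 + lam * s) * (1.0 + s))
    return total * h

def left_tail(lam, n=2000):
    # Integral over (-inf, -1/lam) of the same integrand.
    # With t = -1/s the integrand becomes (1-lam)/((1-s)(1-lam*s)) on (0, lam).
    h = lam / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += (1.0 - lam) / ((1.0 - s) * (1.0 - lam * s))
    return total * h

if __name__ == "__main__":
    for lam in (0.2, 0.5, 0.8):
        print(lam, right_tail(lam) + left_tail(lam), math.log(2.0))
```

In closed form the two pieces are log(2/(1 + λ)) and log(1 + λ), whose sum is log 2.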
Thus (8.37) and Remark 8.1, with m = 2e − 1, show that W −1 (y) ≤ y(1 − ky 2 ),
k = 1/(6e + 3(2e − 1)2 ),
∀y ∈ [0, ψ(−1) − 1]. (8.40)
Note that we have obtained the following bounds:
\[ 1 + \lambda \le \psi(-\lambda) \le e^{\lambda}, \qquad \psi(-1) \le 2\psi(-\lambda) \le 2e^{\lambda}. \tag{8.41} \]
This provides upper and lower bounds for y0 = z1^r. Indeed from ζ1 = z1λ^{1−ν} v(λ) ≤ λ^{1−ν} = ζp, and z1λ^{1−ν} v(1) ≥ z1λ^{1−ν} v(ζp) ≥ ζp, it follows z1 ≤ 1/v(λ) and z1 ≥ 1/v(1), hence
\[ (2e)^{-1} \le y_0 \le (1+\lambda)^{-1}. \tag{8.42} \]
8.2. The functions H±. We define
\[ H_{\pm}(w) = \psi(\pm e^{\beta w}), \qquad \beta = \log(1/\lambda), \qquad \widetilde{H}_{\pm} = 1 - H_{\pm}. \tag{8.43} \]
H+ is holomorphic in the cut strip F+ (λ) = {w ∈ C : | Im w| < π/β} \ (2 + R+ ).
(8.44)
It maps points in C± into C∓ . It is decreasing on the reals, tends to 1 at −∞, and vanishes at 0. H− is holomorphic in the cut strip F− (λ) = {w ∈ C : | Im w| < π/β} \ (1 + R+ ),
(8.45)
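maps points in $\mathbf{C}_\pm$ into $\mathbf{C}_\pm$, is increasing on the reals and tends to 1 at $-\infty$. They satisfy
\[ H_\pm(w) = V(H_\mp(w - 1)) = W(H_\pm(w - 2)), \quad \widehat{H}_\pm(w) = \widehat{V}(\widehat{H}_\mp(w - 1)) = \widehat{W}(\widehat{H}_\pm(w - 2)). \tag{8.46} \]
Moreover
\[ \frac{H''_\pm(w)}{H'_\pm(w)} = \frac{\widehat{H}''_\pm(w)}{\widehat{H}'_\pm(w)} = \beta\left(1 + \frac{z\psi''(z)}{\psi'(z)}\right), \quad z = \pm e^{\beta w}. \tag{8.47} \]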
Existence and Properties of p-tupling Fixed Points
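Since (for $0 < \nu \le 1$) $\psi$ is anti-Herglotzian in $\mathbf{C}_+ \cup \mathbf{C}_- \cup (-1/\lambda, 1/\lambda^2)$, the inequalities (2.2) imply, for $0 < z = e^{\beta w} < 1/\lambda$, i.e. for all $w \in (-\infty, 1)$,
\[ \frac{H''_+(w)}{H'_+(w)} \ge \beta\left(1 - \frac{2\lambda z}{1 + \lambda z}\right) \ge 0. \tag{8.48} \]
For $0 < -z = e^{\beta w} < 1/\lambda$, i.e. again for all $w \in (-\infty, 1)$, we find similarly that
\[ \frac{H''_-(w)}{H'_-(w)} \ge \beta\left(1 + \frac{2\lambda^2 z}{1 - \lambda^2 z}\right) \ge 0. \tag{8.49} \]
In other words, $H_-$ and $\widehat{H}_+$ are increasing and convex, $H_+$ is decreasing and concave. In particular, for $w < 0$, using (8.32),
\[ 2\widehat{H}'_+(w) \ge \widehat{H}_+(w) - \widehat{H}_+(w - 2) = \widehat{H}_+(w) - \widehat{W}^{-1}(\widehat{H}_+(w)) \ge c\,\widehat{H}_+(w)^3. \tag{8.50} \]
Integrating this with the initial condition $\widehat{H}_+(0) = 1$ gives
\[ \widehat{H}_+(w) \le (1 - cw)^{-1/2}, \quad H_+(w) \ge 1 - (1 - cw)^{-1/2} \quad \forall w \in \mathbf{R}_- \ (c = 1/9). \tag{8.51} \]
Similarly, defining $\widetilde{H}_-(w) = H_-(w) - 1$, recalling that $H_-(0) = \psi(-1)$, $H_-(-1) = \psi(-\lambda)$, we obtain, using (8.40),
\[ \widetilde{H}'_-(w) \ge k\widetilde{H}_-(w)^3/2, \quad H'_-(w) \ge k(H_-(w) - 1)^3/2 \quad \forall w \in \mathbf{R}_- \ (k = 1/(6e + 3(2e - 1)^2)). \tag{8.52} \]
We will need a lower bound for $H'_-(w)/H_-(w)$ in the interval $w \in [-1, 0]$. This is provided by the lower bound $H_-(-1) = \psi(-\lambda) \ge 1 + \lambda$, and by (8.52):
\[ \frac{H'_-(w)}{H_-(w)} \ge \frac{k(H_-(w) - 1)^3}{2H_-(w)} \ge \frac{k(H_-(-1) - 1)^3}{2H_-(-1)} \ge \frac{k\lambda^3}{2(1 + \lambda)} \quad \forall w \in [-1, 0]. \tag{8.53} \]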
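8.3. Lower bound on $\tau$. Recall that the function $\varphi$ satisfies
\[ \varphi(z) = \lambda^{\nu-1}\check{u}^p(\lambda z) \quad \forall z \in \mathbf{C}_+ \cup \mathbf{C}_- \cup (-1/\lambda, 1/\lambda^2), \quad \varphi(1) = 1, \]
\[ \varphi'(1) = \tau^{\nu} = \lambda^{\nu}\prod_{j=0}^{p-1}\check{u}'(\zeta_j), \quad \lambda \le \zeta_j = \check{u}^j(\lambda) \le \lambda^{1-\nu}. \tag{8.54} \]
Let $T(w) = e^{\beta w}$, $\beta = \log(1/\lambda)$. Then the function
\[ X = T^{-1} \circ \varphi \circ T \tag{8.55} \]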
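is given by
\[ X(w) = -\nu + 1 + Y^p(w - 1) \quad \forall w \in (-\infty, 2), \]
\[ Y(w) = T^{-1} \circ \check{u} \circ T(w) = \frac{\log y_0}{\log(1/\tau)} + \nu - 1 + \frac{1}{\log(1/\tau)}\log H_-(w) \quad \forall w \in (-\infty, 1). \tag{8.56} \]
It satisfies $X(0) = 0$ and
\[ X'(0) = \tau^{\nu} = \prod_{j=0}^{p-1} Y'(w_j) = \prod_{j=0}^{p-1}\frac{1}{\log(1/\tau)}\frac{H'_-(w_j)}{H_-(w_j)}, \tag{8.57} \]
where
\[ -1 \le w_j = \frac{\log\zeta_j}{\log(1/\lambda)} \le \nu - 1. \tag{8.58} \]
Hence by (8.53),
\[ \tau^{\nu/p - 3/r}\log(1/\tau) \ge \frac{k}{4}, \quad k = 1/(6e + 3(2e - 1)^2). \tag{8.59} \]
When $r > 3p/\nu$, this provides a lower bound for $\tau$. We may e.g. rewrite (8.59) as
\[ y\log(1/y) \ge (\nu/p - 3/r)k/4, \quad y = \tau^{\nu/p - 3/r}. \tag{8.60} \]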
8.4. Limiting fixed points. The preceding subsections have shown that, for any solution, the associated functions have the following properties:
(1) The function $V$ is holomorphic and anti-Herglotzian in $\mathbf{C}_+ \cup \mathbf{C}_- \cup (0, \zeta_{\max})$, where $\zeta_{\max} \ge \tau^{\nu-2} \ge 8^{(2-\nu)/\nu}$. It satisfies $V(1) = 1$ and $V'(1) = -1/\lambda$.
(2) The function $W = V \circ V$ is holomorphic and Herglotzian in $\mathbf{C}_+ \cup \mathbf{C}_- \cup (0, \alpha)$, where $(1 + \lambda) \le \alpha = V^{-1}(0) \le e^{\lambda}$.
(3) The function $H_+$ is holomorphic in the cut strip $F_+(\lambda)$ (see (8.44)), maps points in $\mathbf{C}_\pm$ into $\mathbf{C}_\mp$, vanishes at 0, and satisfies the bound (8.51).
(4) The function $H_-$ is holomorphic in the cut strip $F_-(\lambda)$ (see (8.45)), maps points in $\mathbf{C}_\pm$ into $\mathbf{C}_\pm$, and satisfies $H_-(-1) = \alpha$ and the bounds (8.52) and (8.53).
(5) $\tau = \lambda^r$ is bounded above by $\tau \le 8^{-1/\nu}$. For sufficiently large $r$, it is bounded below by (8.59), and for all $r$ by $\lambda_0(p, r, \nu)^r$ (see (4.33)).
(6) $y_0 = z_1^r$ satisfies (8.42).
As a consequence every infinite sequence of solutions, with fixed $\nu$ and $p$, such that $r \to \infty$, contains an infinite subsequence such that $\tau$ and $y_0$ have limits in $(0, 1)$, and that the functions $V$, $W$, $H_\pm$ tend, uniformly over compact sets, to non-constant functions, holomorphic in cut planes. Meanwhile, $\lambda$ and $z_1 > \lambda^{\nu}$ tend to 1 (see (7.4)), $\psi$ and $u$ tend to 1, uniformly over compact subsets of $\mathbf{C}_+ \cup \mathbf{C}_- \cup (-1, 1)$ (see (8.51)). However the functions
\[ S_\pm(\zeta) = U(\pm\zeta^{1/r}) = y_0\tau^{1-\nu}H_\pm\left(\frac{\log\zeta}{\log(1/\tau)}\right) \tag{8.61} \]
have non-trivial limits and obey the functional equation:
\[ S_\pm(\zeta) = \frac{1}{\tau^{\nu}}S_+^{p-1}(\tau^{\nu-1}S_- \circ S_\mp(\tau\zeta)). \tag{8.62} \]
Appendix. Proof of the Inequality (4.22)
This inequality is equivalent to
\[ (1 - x^2)\log((1 + x^2)(1 + x)^2) + 4x^2\log(x) > 0 \quad \forall x \in (0, 1), \tag{A.1} \]
or to
\[ f_1(x) - 4xf_2(x) > 0 \quad \forall x \in (0, 1), \tag{A.2} \]
where
\[ f_1(x) = \log((1 + x^2)(1 + x)^2), \quad f_2(x) = \frac{x\log(1/x)}{1 - x^2}. \tag{A.3} \]
The derivative $f_2'(x)$ has the sign of
\[ -2\log(x) - 2\,\frac{1 - x^2}{1 + x^2} = -\log(t) - \frac{4}{1 + t} + 2, \quad t = x^2. \tag{A.4} \]
The last expression vanishes at 1 and has negative derivative in $t$ on $(0, 1)$. Hence $f_2$ is increasing on $(0, 1)$. It tends to 1/2 as $x$ tends to 1, so that $f_2 < 1/2$ on $(0, 1)$. It now suffices to prove that $f_1(x) - 2x > 0$ for all $x \in (0, 1)$. This quantity vanishes for $x = 0$, and
\[ f_1'(x) - 2 = \frac{2x^2(1 - x)}{(1 + x^2)(1 + x)} > 0 \quad \forall x \in (0, 1). \tag{A.5} \]
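As a sanity check on the chain (A.1) to (A.5), the three quantities used in the proof can be evaluated on a grid. The Python sketch below is an illustration added here and proves nothing; the grid and the names `a1_lhs`, `f1`, `f2` are choices made for this example.

```python
import math

def a1_lhs(x):
    """Left-hand side of (A.1)."""
    return (1 - x**2) * math.log((1 + x**2) * (1 + x)**2) + 4 * x**2 * math.log(x)

def f1(x):
    return math.log((1 + x**2) * (1 + x)**2)

def f2(x):
    return x * math.log(1 / x) / (1 - x**2)

# Sample the open interval (0, 1) at 999 interior points.
grid = [i / 1000 for i in range(1, 1000)]
min_lhs = min(a1_lhs(x) for x in grid)       # smallest sampled value of (A.1)
max_f2 = max(f2(x) for x in grid)            # should stay below 1/2
min_gap = min(f1(x) - 2 * x for x in grid)   # should stay positive

if __name__ == "__main__":
    print(min_lhs, max_f2, min_gap)
```

The margins are smallest near the endpoints: $f_1(x) - 2x$ behaves like $2x^3/3$ near 0, and $f_2$ approaches 1/2 as $x \to 1$.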
Acknowledgement. I wish to thank Oscar Lanford and Michael Yampolsky for many helpful discussions. I am also indebted to O. Lanford for his kind permission to include the contents of Subsect. 7.4.
References
[CEL] Collet, P., Eckmann, J.-P., and Lanford III, O.E.: Universal properties of maps on the interval. Commun. Math. Phys. 76, 211–54 (1980)
[CT] Coullet, P., and Tresser, C.: Itération d’endomorphismes et groupe de renormalisation. J. de Physique Colloque C 539, C5–25 (1978). CRAS Paris 287 A, (1978)
[dFdM] de Faria, E., and de Melo, W.: Rigidity of critical circle mappings I. J. Eur. Math. Soc. (JEMS) 1, 339–392 (1999)
[dMvS] de Melo, W., and van Strien, S.: One-Dimensional Dynamics. New York: Springer Verlag, 1993
[D] Donoghue, Jr., W.F.: Monotone matrix functions and analytic continuation. Berlin: Springer Verlag, 1974
[EE] Eckmann, J.-P., and Epstein, H.: On the existence of fixed points of the composition operator for circle maps. Commun. Math. Phys. 107, 213–231 (1986)
[EW] Eckmann, J.-P., and Wittwer, P.: Computer methods and Borel summability applied to Feigenbaum’s equation. Lecture Notes in Physics 227. Berlin: Springer Verlag, 1985
[E1] Epstein, H.: New proofs of the existence of the Feigenbaum functions. Commun. Math. Phys. 106, 395–426 (1986)
[E2] Epstein, H.: Fixed points of composition operators. In: Non-linear Evolution and Chaotic Phenomena, Zweifel, P., Gallavotti, G., and Anile, M., eds., New York: Plenum, 1988
[E3] Epstein, H.: Fixed points of composition operators II. Nonlinearity 2, 305–310 (1989) (reprinted in: Cvitanović, P. (ed): Universality in Chaos, 2d edition. Bristol: Adam Hilger, 1989)
[EL] Epstein, H., and Lascoux, J.: Analyticity properties of the Feigenbaum function. Commun. Math. Phys. 81, 437–53 (1981)
[F1] Feigenbaum, M.J.: Quantitative universality for a class of non-linear transformations. J. Stat. Phys. 19, 25–52 (1978)
[F2] Feigenbaum, M.J.: Universal metric properties of non-linear transformations. J. Stat. Phys. 21, 669–706 (1979)
[FKS] Feigenbaum, M.J., Kadanoff, L.P., and Shenker, S.J.: Quasi-periodicity in dissipative systems: A renormalization group analysis. Physica D 5, 370–386 (1982)
[JR] Jonker, L., and Rand, D.: Universal properties of maps of the circle with I-singularities. Commun. Math. Phys. 90, 273–292 (1983)
[L1] Lanford III, O.E.: Remarks on the accumulation of period-doubling bifurcations. In Mathematical problems in Theoretical Physics, Lecture Notes in Physics vol. 116, Berlin: Springer Verlag, 1980, pp. 340–342
[L2] Lanford III, O.E.: A computer-assisted proof of the Feigenbaum conjectures. Bull. Am. Math. Soc., New Series 6, 127 (1984)
[Ly] Lyubich, M.: Feigenbaum–Coullet–Tresser universality and Milnor’s Hairiness Conjecture. Annals of Math. 149, 319–420 (1999)
[M] Martens, M.: The periodic points of renormalization. Ann. Math., II. Ser. 147, No.3, 543–584 (1998)
[MO] Mestel, B., and Osbaldestin, A.: Feigenbaum theory for unimodal maps with asymmetric critical point: Rigorous results. Commun. Math. Phys. 197, 211–228 (1998)
[Mi] Milnor, J.: Dynamics in one complex variable. Wiesbaden: Vieweg, 1999
[ORSS] Ostlund, S., Rand, D., Sethna, J., and Siggia, E.: Universal properties of the transition from quasiperiodicity to chaos in dissipative systems. Physica D 8, 303–342 (1983)
[S1] Sullivan, D.: Quasiconformal homeomorphisms in dynamics, topology, and geometry. Proc. ICM-86 Berkeley II, Providence, RI.: AMS, pp. 1216–1228 (1987)
[S2] Sullivan, D.: Bounds, quadratic differentials, and renormalization conjectures. AMS Centennial Publications II, Mathematics into Twenty-First Century, (1992), pp. 417–466
[Sw] Świątek, G.: Rational rotation numbers for maps of the circle. Commun. Math. Phys. 119, 109–128 (1988)
[V] Valiron, G.: Fonctions Analytiques. Paris: Presses Universitaires de France, 1954
[Y] Yampolsky, M.: Hyperbolicity of renormalization of critical circle maps. (Preprint IHES/M/00/50, to appear)
Communicated by A. Jaffe
Commun. Math. Phys. 215, 477 – 486 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Quantum Ergodicity of Eisenstein Series for Arithmetic 3-Manifolds
Shin-ya Koyama
Department of Mathematics, Keio University, 3 Hiyoshi, Kohoku, Yokohama 223-8522, Japan. E-mail: [email protected]
Received: 20 April 2000 / Accepted: 10 July 2000
Abstract: We prove the quantum ergodicity for Eisenstein series for $PSL(2, O_K)$, where $O_K$ is the integer ring of an imaginary quadratic field $K$ of class number one.
1. Introduction
Luo and Sarnak [LS] proved the quantum ergodicity of Eisenstein series for $PSL(2, \mathbf{Z})$. It is stated as follows:
Theorem 1.1. Let $A$, $B$ be compact Jordan measurable subsets of $PSL(2, \mathbf{Z})\backslash H^2$, then
\[ \lim_{t\to\infty}\frac{\mu_t(A)}{\mu_t(B)} = \frac{\mathrm{Vol}(A)}{\mathrm{Vol}(B)}, \]
where $\mu_t = |E(z, \tfrac{1}{2} + it)|^2\,dV$ with $E(z, s)$ being the Eisenstein series for $PSL(2, \mathbf{Z})$, and $dV$ is the volume element of the upper half plane $H^2$.
In this paper we will generalize Theorem 1.1 to three dimensional cases $X = PSL(2, O_K)\backslash H^3$, where $O_K$ is the integer ring of an imaginary quadratic field $K$ of class number one, and $H^3$ is the three dimensional upper half space. Our main theorem is analogously described as follows:
Theorem 1.2. Let $A$, $B$ be compact Jordan measurable subsets of $X$, then
\[ \lim_{t\to\infty}\frac{\mu_t(A)}{\mu_t(B)} = \frac{\mathrm{Vol}(A)}{\mathrm{Vol}(B)}, \]
where $\mu_t = |E(v, 1 + it)|^2\,dV$ with $E(v, s)$ being the Eisenstein series for $X$, and $dV$ is the volume element of $H^3$.
Partially supported by Japan Association for Mathematical Sciences
Indeed we show that as $t \to \infty$,
\[ \mu_t(A) \sim \frac{2\,\mathrm{Vol}(A)}{\zeta_K(2)}\log t, \]
where $\zeta_K(s)$ is the Dedekind zeta function. In two dimensional cases numerical examples [HR] suggested that the quantum ergodicity would hold. For higher dimensional cases no numerical examples are known. Theorem 1.2 is the first result along this direction.
2. Three-Dimensional Settings
In this section we introduce some notation on the three-dimensional hyperbolic space. A point in the hyperbolic three-dimensional space $H^3$ is denoted by $v = z + yj$, $z = x_1 + x_2 i \in \mathbf{C}$, $y > 0$. We fix an imaginary quadratic field $K$ whose class number is one. Denote its discriminant by $D_K$ and integer ring $O = O_K$. Put $D = |D_K|$. We often regard $O$ as a lattice in $\mathbf{R}^2$, which is denoted by $L$ with the fundamental domain $F_L \subset \mathbf{R}^2$. Also put $\omega = \omega_K = D^{-1/2}$, the inverse different of $K$. The group $\Gamma = PSL(2, O)$ acts on $H^3$ and the quotient space $X = \Gamma\backslash H^3$ is a three dimensional arithmetic hyperbolic orbifold. The Laplacian on $X$ is defined by
\[ \Delta = -y^2\left(\frac{d^2}{dx_1^2} + \frac{d^2}{dx_2^2} + \frac{d^2}{dy^2}\right) + y\frac{d}{dy}. \]
It has a self-adjoint extension on $L^2(X)$. It is known that the spectrum of $\Delta$ is composed of both a discrete and a continuous part. An eigenfunction belonging to the discrete spectrum is called a cusp form. We denote it by $\phi_j(v)$ with eigenvalue $\lambda_j$ ($0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots$). We put $\lambda_j = 1 + r_j^2$. We shall assume the $\phi_j(v)$'s to be chosen so that they are eigenfunctions of the ring of Hecke operators and are $L^2$-normalized. The Fourier development of $\phi_j(v)$ is given in [S] (2.20):
\[ \phi_j(v) = \sum_{n \in O^*/\sim}\rho_j(n)\,y\,K_{ir_j}(2\pi|n|y)\,e(\langle n, z\rangle), \tag{2.1} \]
where $n \sim m$ means that they generate the same ideal in $O$, and $\langle n, z\rangle$ is the standard inner product in $\mathbf{R}^2$ with $K_\nu$ being the K-Bessel function. For a Maass-Hecke cusp form $\phi_j(v)$ with its Fourier development given by (2.1), we have the Rankin-Selberg convolution L-function $L(s, \phi_j \times \phi_j)$ and the second symmetric power L-function $L^{(2)}(s, \phi_j)$ which satisfy the following:
\[ L(s, \phi_j \times \phi_j) = \zeta_K(2s)\sum_{n \in O^*/\sim}\frac{|\lambda_j(n)|^2}{N(n)^s}, \]
\[ L^{(2)}(s, \phi_j) = \sum_{n \in O^*/\sim}\frac{c_j(n)}{N(n)^s} = \zeta_K(s)^{-1}L(s, \phi_j \times \phi_j), \]
with $\rho_j(n) = \sqrt{\frac{\sinh\pi r_j}{r_j}}\,v_j(n)$, $v_j(n) = v_j(1)\lambda_j(n)$ and $c_j(n) = \sum_{l^2k = n}\lambda_j(k^2)$. It is known that both functions converge in $\mathrm{Re}(s) > 1$. The functional equation of $L(s, \phi_j \times \phi_j)$ is inherited from the Eisenstein series by unfolding the integral. We compute that
\[ \int_X|\phi_j(v)|^2E(v, 2s)\,dv = |\rho_j(1)|^2\,\frac{L(s, \phi_j \times \phi_j)}{\zeta_K(2s)}\,\frac{\Gamma(s + ir_j)\Gamma(s - ir_j)\Gamma(s)^2}{8\pi^{2s}\Gamma(2s)} \]
is invariant under changing the variable $s$ to $1 - s$. We normalize such that $\|\phi_j\| = 1$ with respect to the Petersson inner product
\[ \langle f, g\rangle = \frac{1}{\mathrm{vol}(X)}\int_X f(v)\overline{g(v)}\,dv. \]
The residue $R_j$ of $L(s, \phi_j \times \phi_j)$ at its unique simple pole $s = 1$ is equal to
\[ \frac{8\pi\zeta_K(2)}{|v_j(1)|^2}\,\mathrm{Res}_{s=2}E(v, s) = \frac{8\pi\zeta_K(2)}{|v_j(1)|^2}\,\frac{\mathrm{Vol}(F_L)}{\mathrm{Vol}(X)}, \tag{2.2} \]
where $\mathrm{Res}_{s=2}E(v, s) = \mathrm{Vol}(F_L)/\mathrm{Vol}(X)$ is known by Sarnak [S], Lemma 2.15.
3. Proofs
In this section we prove Theorem 1.2. We first define the Eisenstein series by
\[ E(v, s) = \sum_{\gamma \in \Gamma_\infty\backslash\Gamma} y(\gamma v)^s, \tag{3.1} \]
where $y(v) = y$ for $v = z + jy \in H^3$ and $\mathrm{Re}(s) > 2$. Here the group $\Gamma_\infty$ is given by
\[ \Gamma_\infty = \left\{\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} : n \in O\right\}. \]
The Fourier development of $E(v, s)$ is known by Asai [A] and Elstrodt et al. [E]:
\[ E(v, s) = y^s + \frac{\xi_K(s - 1)}{\xi_K(s)}\,y^{2-s} + \frac{2y}{\xi_K(s)}\sum_{n \in O^*/\sim}|n|^{s-1}\sigma_{2(1-s)}(n)\,e^{4\pi i\,\mathrm{Re}(n\omega z)}K_{s-1}(4\pi|n\omega|y), \tag{3.2} \]
where $\sigma_s(n) = \sum_{d|n}|d|^s$ and $\xi_K(s) = \left(\frac{\sqrt{D}}{2\pi}\right)^s\Gamma(s)\zeta_K(s)$.
Our goal is to prove the equidistribution of the measure $\mu_t = |E(v, 1 + it)|^2\,dV(v)$, where $dV(v) = \frac{dx_1\,dx_2\,dy}{y^3}$. We consider its inner product with various functions spanning $L^2(X)$. We begin with inner products with Maass cusp forms $\phi_j$.
Proposition 3.1. For any fixed $\phi_j$,
\[ \lim_{t\to\infty}\int_X\phi_j\,d\mu_t = 0. \]
Proof. Set
\[ J_j(t) = \int_X\phi_j\,d\mu_t = \int_X\phi_j(v)E(v, 1 + it)E(v, 1 - it)\,\frac{dx_1\,dx_2\,dy}{y^3} \tag{3.3} \]
with $z = x_1 + x_2 i$. To investigate this we first consider
\[ I_j(s) = \int_X\phi_j(v)E(v, 1 + it)E(v, s)\,\frac{dx_1\,dx_2\,dy}{y^3}. \tag{3.4} \]
All of the above integrals converge since $\phi_j$ is a cusp form. We unfold the integral (3.4) to get
\[ I_j(s) = \int_0^\infty\int_{F_L}\phi_j(v)E(v, 1 + it)\,y^s\,\frac{dx_1\,dx_2\,dy}{y^3}. \tag{3.5} \]
Denote the conjugate of $v = z + yj \in H^3$ by $\bar{v} = \bar{z} - yj$. As is well-known in the two dimensional case, the space of the Maass cusp forms is expressed as a direct sum of spaces of even and odd cusp forms. Here even (resp. odd) cusp forms are ones satisfying $\phi_j(-\bar{v}) = \epsilon\phi_j(v)$ with $\epsilon = 1$ (resp. $-1$). Since $E(v, s) = E(-\bar{v}, s)$, it follows that $I_j(s) \equiv 0$ if $\phi_j$ is odd. So we may assume that $\phi_j$ is even. In this case the Fourier development (2.1) is written as
\[ \phi_j(v) = y\sum_{n \in O^*/\sim}\rho_j(n)K_{ir_j}(2\pi|n|y)\cos(2\pi\langle n, z\rangle), \tag{3.6} \]
where $1 + r_j^2 = \lambda_j$. Normalizing the coefficients by $\rho_j(n) = \rho_j(1)\lambda_j(n)$, the multiplicative relations are satisfied by $\lambda_j(n)$. These amount to
\[ L(\phi_j, s) := \sum_{n \in O^*/\sim}\frac{\lambda_j(n)}{N(n)^s} = \prod_{(p):\ \text{prime ideal}}\left(1 - \frac{\lambda_j(p)}{N(p)^s} + \frac{1}{N(p)^{2s}}\right)^{-1}. \tag{3.7} \]
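By substituting (3.2) and (3.6) into (3.5) we have
\[ I_j(s) = \int_0^\infty\int_{F_L} y\sum_{n \in O^*/\sim}\rho_j(n)K_{ir_j}(2\pi|n|y)\cos(2\pi\langle n, z\rangle)\left(y^{1+it} + y^{1-it}\frac{\xi_K(it)}{\xi_K(1 + it)} + \frac{2y}{\xi_K(1 + it)}\sum_{m \in O^*/\sim}|m|^{it}\sigma_{-2it}(m)\,e^{4\pi i\,\mathrm{Re}(m\omega z)}K_{it}(4\pi|m|\omega y)\right)y^s\,\frac{dx_1\,dx_2\,dy}{y^3}. \tag{3.8} \]
Now we have
\[ \int_{F_L}\cos(2\pi\langle n\omega, z\rangle)\,dv = \begin{cases} 0 & n \in O - \{0\}, \\ 1 & n = 0. \end{cases} \]
In the expansion of (3.8), we appeal to the formula $\cos x\cos y = \frac{1}{2}(\cos(x + y) + \cos(x - y))$. Only the terms with $n = m$ remain as follows:
\[ I_j(s) = \frac{2}{\xi_K(1 + it)}\int_0^\infty\sum_{n \in O^*/\sim}|n|^{it}\sigma_{-2it}(n)K_{it}(2\pi|n|y)\rho_j(n)K_{ir_j}(2\pi|n|y)\,y^s\,\frac{dy}{y} = \frac{2}{\xi_K(1 + it)}\sum_{n \in O^*/\sim}\frac{|n|^{it}\sigma_{-2it}(n)\rho_j(n)}{|n|^s}\int_0^\infty K_{it}(2\pi y)K_{ir_j}(2\pi y)\,y^s\,\frac{dy}{y}. \]
An evaluation of the integral involving Bessel functions [GR] yields
\[ I_j(s) = \frac{2\pi^{-s}}{\xi_K(1 + it)}\,\frac{\Gamma\left(\frac{s + ir_j + it}{2}\right)\Gamma\left(\frac{s + ir_j - it}{2}\right)\Gamma\left(\frac{s - ir_j + it}{2}\right)\Gamma\left(\frac{s - ir_j - it}{2}\right)}{\Gamma(s)}\,R(s) \]
with
\[ R(s) = \sum_{n \in O^*/\sim}\frac{|n|^{it}\sigma_{-2it}(n)\rho_j(n)}{|n|^s}. \]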
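We compute $R(s)$ as follows:
\begin{align*}
R(s) &= \frac{1}{\rho_j(1)}\prod_{(p):\ \text{prime ideal}}\sum_{k=0}^\infty\frac{\lambda_j(p^k)|p|^{ikt}\sigma_{-2it}(p^k)}{|p|^{ks}} \\
&= \frac{1}{\rho_j(1)}\prod_{(p)}\sum_{k=0}^\infty\frac{\lambda_j(p^k)|p|^{ikt}}{|p|^{ks}}\sum_{l=0}^k|p|^{-2itl} \\
&= \frac{1}{\rho_j(1)}\prod_{(p)}\sum_{k=0}^\infty\frac{\lambda_j(p^k)|p|^{ikt}}{|p|^{ks}}\,\frac{1 - |p|^{-2it(k+1)}}{1 - |p|^{-2it}} \\
&= \frac{1}{\rho_j(1)}\prod_{(p)}\frac{1}{1 - |p|^{-2it}}\left(\sum_{k=0}^\infty\lambda_j(p^k)|p|^{-k(s-it)} - |p|^{-2it}\sum_{k=0}^\infty\lambda_j(p^k)|p|^{-k(s+it)}\right) \\
&= \frac{1}{\rho_j(1)}\prod_{(p)}\frac{1}{1 - |p|^{-2it}}\left(\frac{1}{1 - \lambda_j(p)|p|^{-(s-it)} + |p|^{-2(s-it)}} - \frac{|p|^{-2it}}{1 - \lambda_j(p)|p|^{-(s+it)} + |p|^{-2(s+it)}}\right) \\
&= \frac{1}{\rho_j(1)}\prod_{(p)}\frac{1 - |p|^{-2s}}{(1 - \lambda_j(p)|p|^{-(s-it)} + |p|^{-2(s-it)})(1 - \lambda_j(p)|p|^{-(s+it)} + |p|^{-2(s+it)})} \\
&= \frac{1}{\rho_j(1)}\,\frac{L(\phi_j, \frac{s-it}{2})\,L(\phi_j, \frac{s+it}{2})}{\zeta_K(s)}. \tag{3.9}
\end{align*}
Therefore
\[ J_j(t) = I_j(1 - it) = \frac{2\pi^{-1+it}}{\xi_K(1 + it)}\,\frac{\Gamma\left(\frac{1 + ir_j}{2}\right)\Gamma\left(\frac{1 + ir_j - 2it}{2}\right)\Gamma\left(\frac{1 - ir_j}{2}\right)\Gamma\left(\frac{1 - ir_j - 2it}{2}\right)}{\Gamma(1 - it)}\,R(1 - it). \tag{3.10} \]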
By Stirling’s formula $|\Gamma(\sigma + it)| \sim e^{-\pi t/2}|t|^{\sigma - \frac{1}{2}}$, we see
\[ \text{the gamma factors in (3.10)} \ll |t|^{-1} \tag{3.11} \]
as $t \to \infty$. It is known that the Dedekind zeta function in (3.10) is estimated as
\[ t^{-\epsilon} \ll |\zeta_K(1 + it)| \ll t^{\epsilon}. \tag{3.12} \]
Estimating the automorphic L-functions in (3.10) was recently done successfully by Sarnak and Petridis [SP]. They proved there exists $\delta > 0$ such that for any $\epsilon > 0$,
\[ L(\phi_j, \tfrac{1}{2} + it) \ll_{j,\epsilon} |t|^{1 - \delta + \epsilon} \tag{3.13} \]
as $|t| \to \infty$. The estimates (3.11)–(3.13) yield
\[ J_j(t) \ll |t|^{-\delta + \epsilon}. \tag{3.14} \]
This implies Proposition 3.1.
We now turn to inner products of $\mu_t$ with incomplete Eisenstein series. Let $h(y)$ be a rapidly decreasing function at 0 and $\infty$, that is $h(y) = O_N(y^N)$ as $y \to \infty$ or 0 and $N \in \mathbf{Z}$. Let $H(s)$ be its Mellin transform
\[ H(s) = \int_0^\infty h(y)\,y^{-s}\,\frac{dy}{y}. \]
Clearly $H(s)$ is entire in $s$ and is of Schwartz class in $t$ for each vertical line $\sigma + it$. The inversion formula gives
\[ h(y) = \frac{1}{2\pi i}\int_{(\sigma)}H(s)\,y^s\,ds \]
for any $\sigma \in \mathbf{R}$. For such an $h$ we form the convergent series
\[ F_h(v) = \sum_{\gamma \in \Gamma_\infty\backslash\Gamma}h(y(\gamma v)) = \frac{1}{2\pi i}\int_{(3)}H(s)E(v, s)\,ds, \]
which we call incomplete Eisenstein series.
Proposition 3.2. For incomplete Eisenstein series $F(v)$, we have
\[ \int_X F(v)\,d\mu_t(v) \sim \frac{2}{\zeta_K(2)}\int_X F(v)\,dV(v)\,\log t \]
as $t \to \infty$.
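Proof. Incomplete Eisenstein series decrease rapidly as $y \to \infty$ and belong to $C^\infty(X)$. Hence
\begin{align*}
\int_X F_h(v)\,d\mu_t(v) &= \int_X F_h(v)|E(v, 1 + it)|^2\,\frac{dz\,dy}{y^3} \\
&= \frac{1}{2\pi i}\int_X\int_{(3)}H(s)E(v, s)\,ds\,|E(v, 1 + it)|^2\,\frac{dz\,dy}{y^3} \\
&= \frac{1}{2\pi i}\int_0^\infty\int_{F_L}\int_{(3)}H(s)\,y^s\,ds\,|E(v, 1 + it)|^2\,\frac{dz\,dy}{y^3} \\
&= \frac{1}{2\pi i}\int_0^\infty\int_{(3)}H(s)\,y^s\,ds\left(\left|y^{1+it} + y^{1-it}\frac{\xi_K(it)}{\xi_K(1 + it)}\right|^2 + \left|\frac{2y}{\xi_K(1 + it)}\right|^2\sum_{n \in O^*/\sim}|\sigma_{-2it}(n)K_{it}(4\pi|n|\omega y)|^2\right)\frac{dy}{y^3} \\
&= F_1(t) + F_2(t),
\end{align*}
where we put
\[ F_1(t) = \frac{1}{2\pi i}\int_0^\infty\int_{(3)}H(s)\,y^s\,ds\left|y^{1+it} + y^{1-it}\frac{\xi_K(it)}{\xi_K(1 + it)}\right|^2\frac{dy}{y^3}. \]
Since $\left|\frac{\xi_K(it)}{\xi_K(1 + it)}\right| = 1$, we have
\[ F_1(t) = 2\int_0^\infty h(y)\,\frac{dy}{y} + (\text{a rapidly decreasing function of } t), \tag{3.15} \]
whereas
\[ F_2(t) = \frac{2}{\pi i|\xi_K(1 + it)|^2}\int_{(3)}H(s)\sum_{n \in O^*/\sim}\frac{|\sigma_{-2it}(n)|^2}{|n|^s}\int_0^\infty|K_{it}(4\pi\omega y)|^2\,y^s\,\frac{dy}{y}\,ds. \tag{3.16} \]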
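The series is computed as follows:
\begin{align*}
\sum_{n \in O^*/\sim}\frac{|\sigma_a(n)|^2}{|n|^s} &= \prod_{(p):\ \text{prime ideal}}\sum_{k=0}^\infty\frac{\sigma_a(p^k)\sigma_{-a}(p^k)}{|p|^{ks}} \\
&= \prod_{(p)}\sum_{k=0}^\infty\frac{1}{|p|^{ks}}\,\frac{1 - |p|^{a(k+1)}}{1 - |p|^a}\,\frac{1 - |p|^{-a(k+1)}}{1 - |p|^{-a}} \\
&= \prod_{(p)}\frac{1}{(1 - |p|^a)(1 - |p|^{-a})}\sum_{k=0}^\infty\left(2|p|^{-ks} - |p|^{(a-s)k+a} - |p|^{(-a-s)k-a}\right) \\
&= \prod_{(p)}\frac{1}{(1 - |p|^a)(1 - |p|^{-a})}\left(\frac{2}{1 - |p|^{-s}} - \frac{|p|^a}{1 - |p|^{a-s}} - \frac{|p|^{-a}}{1 - |p|^{-a-s}}\right) \\
&= \prod_{(p)}\frac{1 + |p|^{-s}}{(1 - |p|^{-s})(1 - |p|^{-(s-a)})(1 - |p|^{-(s+a)})} \\
&= \frac{\zeta_K(\frac{s}{2})^2\,\zeta_K(\frac{s-a}{2})\,\zeta_K(\frac{s+a}{2})}{\zeta_K(s)}. \tag{3.17}
\end{align*}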
(3)
H (s)
function as before. We obtain
|σ−2it (n)|2 |n|s
n∈O ∗ /∼ H (s)ζK ( 2s )2 |ζK ( 2s
∞ 0
|Kit (4π ωy)|2 y s
dy ds y
+ it) ( 2s + it)|2 ( 2s )2 ds (4π ω)s ζK (s) (s)
2 πi|ξK (1 + it)|2 (3) 2 = B(s)ds, πi|ξK (1 + it)|2 (3)
=
(3.18)
(3.19) where we put B(s) =
H (s)ζK ( 2s )2 |ζK ( 2s + it) ( 2s + it)|2 ( 2s )2 . (4π ω)s ζK (s) (s)
(3.20)
By Stirling’s formula to estimate the gamma factors and from the fact that $H(\sigma + it)$ is rapidly decreasing in $t$, we can shift the integral in (3.18) to $\mathrm{Re}(s) = 1$:
\[ F_2(t) = \frac{4\,\mathrm{Res}_{s=2}B(s)}{|\xi_K(1 + it)|^2} + \frac{2}{\pi i|\xi_K(1 + it)|^2}\int_{(1)}B(s)\,ds. \tag{3.21} \]
The second term in (3.21) is evaluated by Heath-Brown [H] as
\[ \zeta_K\left(\tfrac{1}{2} + it\right) \ll t^{\frac{1}{3} + \epsilon} \]
for any fixed $\epsilon > 0$. We find that
\[ \frac{2}{\pi i|\xi_K(1 + it)|^2}\int_{(1)}B(s)\,ds \ll_\epsilon t^{-\frac{1}{3} + \epsilon}. \]
This corresponds to the bound (3.14). Next we deal with the residue term in (3.21), which is more complicated. Write $B(s)$ as $\zeta_K(\frac{s}{2})^2G(s)$ where $G(s)$ is holomorphic at $s = 2$. Put
\[ \zeta_K(s/2) = \frac{A_{-1}}{s - 2} + A_0 + O(s - 2) \quad (s \to 2). \]
In the expansion of
\[ B(s) = \left(\frac{A_{-1}}{s - 2} + A_0 + O(s - 2)\right)^2\left(G(2) + G'(2)(s - 2) + O((s - 2)^2)\right), \]
the coefficient of $(s - 2)^{-1}$ gives the residue
\[ \mathrm{Res}_{s=2}B(s) = G(2)A_{-1}\left(2A_0 + A_{-1}\frac{G'}{G}(2)\right). \]
A simple calculation gives
\[ G(2) = \frac{H(2)\,|\zeta_K(1 + it)\Gamma(1 + it)|^2\,\Gamma(1)^2}{(4\pi\omega)^2\,\zeta_K(2)} = \frac{H(2)\,|\xi_K(1 + it)|^2}{4\zeta_K(2)} \]
and
\[ \frac{G'}{G}(2) = \frac{H'}{H}(2) + \frac{\zeta_K'(1 + it)}{2\zeta_K(1 + it)} + \frac{\zeta_K'(1 - it)}{2\zeta_K(1 - it)} + \frac{\Gamma'(1 + it)}{2\Gamma(1 + it)} + \frac{\Gamma'(1 - it)}{2\Gamma(1 - it)} + C \]
with $C$ being independent of $t$. By the Weyl–Hadamard–De La Vallée Poussin bound [T, (6.15.3)] and its generalization to Dirichlet L-functions by Landau, we have
\[ \frac{\zeta_K'(1 + it)}{\zeta_K(1 + it)} \ll \frac{\log t}{\log\log t}. \]
This together with $\frac{\Gamma'}{\Gamma}(1 + it) \sim \log t$ gives
\[ \mathrm{Res}_{s=2}B(s) = \frac{H(2)\,|\xi_K(1 + it)|^2}{2\zeta_K(2)}\left(\log t + O\left(\frac{\log t}{\log\log t}\right)\right). \]
Finally the first term of (3.21) is evaluated as
\[ \frac{4\,\mathrm{Res}_{s=2}B(s)}{|\xi_K(1 + it)|^2} = \frac{2H(2)}{\zeta_K(2)}\log t + O(1). \]
Taking into account that
\[ H(2) = \int_0^\infty h(y)\,\frac{dy}{y^3} = \int_X F_h(z)\,\frac{dz\,dy}{y^3}, \]
we reach the conclusion.
Proposition 3.3. Let $F$ be a continuous function of compact support in $X$. Then
\[ \int_X F(v)\,d\mu_t(v) \sim \frac{2}{\zeta_K(2)}\int_X F(v)\,dV(v)\,\log t \]
as $t \to \infty$.
Proof. The space of all incomplete Eisenstein series and cusp forms is dense in the space of continuous functions vanishing in the cusp. For any $\epsilon > 0$, we can find $G = G_1 + G_2$ with $G_1$ a finite sum of cusp forms and $G_2$ in the space of incomplete Eisenstein series, such that $\|G - F\|_\infty < \epsilon$. The difference $H = G - F$ is sufficiently small and rapidly decreasing in the cusp. Namely, it is majorized in terms of another incomplete Eisenstein series
\[ H_1(v) = \sum_{\gamma \in \Gamma_\infty\backslash\Gamma}h_1(y(\gamma v)) \]
as
\[ H_1(v) \ge |H(v)| \]
satisfying
\[ \int_X H_1(v)\,dV(v) < C(K)\epsilon \]
with some constant $C(K)$ depending only on the field $K$. Hence the conclusion.
Proposition 3.3 implies Theorem 1.2 by standard approximation arguments.
Acknowledgements. The author would like to express his thanks to Professor Peter Sarnak, who introduced the author to the subject.
References
[A] Asai, T.: On a certain function analogous to log |η(z)|. Nagoya Math. J. 40, 193–211 (1970)
[E] Elstrodt, J., Grunewald, F. and Mennicke, J.: Eisenstein series for imaginary quadratic number fields. Contemporary Math. 53, 97–117 (1986)
[GR] Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series, and products. New York–London: Academic Press, 1994
[H] Heath-Brown, D.R.: The growth rate of the Dedekind zeta-function on the critical line. Acta Arith. 49, 323–339 (1988)
[HR] Hejhal, D. and Rackner, B.: On the topography of Maass waveforms for PSL(2, Z). Experimental Math. 1, 275–305 (1992)
[LS] Luo, W. and Sarnak, P.: Quantum ergodicity of eigenfunction on PSL2(Z)\H². To appear
[S] Sarnak, P.: The arithmetic and geometry of some hyperbolic three manifolds. Acta math. 151, 253–295 (1983)
[SP] Sarnak, P. and Petridis, Y.: Quantum unique ergodicity for SL2(O)\H³ and estimates for L-functions. Preprint 2000
[T] Titchmarsh, E.C.: The theory of the Riemann zeta function. Oxford, 1951
Communicated by P. Sarnak
Commun. Math. Phys. 215, 487 – 515 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Continued Fractions and the d-Dimensional Gauss Transformation D. M. Hardcastle1 , K. Khanin1,2,3,4 1 Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, UK.
E-mail: [email protected]
2 Isaac Newton Institute for Mathematical Sciences, 20 Clarkson Road, Cambridge CB3 0EH, UK.
E-mail: [email protected]
3 BRIMS, Hewlett-Packard Laboratories, Stoke Gifford, Bristol BS12 6QZ, UK 4 Landau Institute for Theoretical Physics, Kosygina Str., 2, Moscow 117332, Russia
Received: 4 February 2000 / Accepted: 23 May 2000
Abstract: In this paper we study a multidimensional continued fraction algorithm which is related to the Modified Jacobi–Perron algorithm considered by Podsypanin and Schweiger. We demonstrate that this algorithm has many important properties which are natural generalisations of properties of one-dimensional continued fractions. For this reason, we call the transformation associated to the algorithm the d-dimensional Gauss transformation. We construct a coordinate system for the natural extension which reveals its symmetries and allows one to give an explicit formula for the density of its invariant measure. We also discuss the ergodic properties of this invariant measure.
1. Introduction The theory of one-dimensional continued fractions has a rich and long history. They originated in Euclid’s algorithm and their theory was later developed by Gauss, Hurwitz, Legendre, Lagrange and many others. One of the most important contributions made by Gauss was the discovery of an explicit formula for the invariant measure of the transformation associated to one-dimensional continued fractions; this measure is now known as the Gauss measure. A generalisation of the one-dimensional continued fraction algorithm to two dimensions was first considered by Jacobi in the 1830s. This work was published posthumously in 1868 [10]. Perron later performed a detailed study of Jacobi’s algorithm in arbitrary dimension [17]; for this reason, the algorithm is now known as the Jacobi–Perron algorithm (JPA). In fact, the study of the JPA inspired Perron to develop his famous theory of positive matrices. The JPA has been widely studied since then and in particular F. Schweiger has considered its ergodic and metrical properties [21]. The ergodic properties of the Jacobi–Perron transformation and other similar maps have also been studied by Gordin [5], Mayer [15], and Ito and Yuri [9].
Since the development of the JPA, many other multidimensional continued fraction algorithms have been proposed; in particular we mention the algorithms of Poincaré [19], Brun [3] and Selmer [23]. Podsypanin introduced a two-dimensional algorithm which is closely related to the algorithm of Brun [18]. Later, Schweiger considered a multidimensional modification of Podsypanin’s algorithm called the Modified Jacobi–Perron algorithm and gave an explicit formula for its invariant density [22]. In this paper we study an algorithm which is equivalent to the modified JPA. We demonstrate that this algorithm has many properties which are natural generalisations of properties of one-dimensional continued fractions. To the best of our knowledge, it is the only algorithm which possesses these properties. We find it natural to call it the d-dimensional Gauss algorithm, especially since the invariant measure is a generalisation of the Gauss measure.
The structure of the paper is as follows. In Sect. 2 we give a geometrical description of the one-dimensional continued fraction algorithm and briefly discuss some of its most important properties. In Sect. 3 we describe two different geometrical schemes for producing a sequence of vectors of rational numbers simultaneously approximating an irrational vector. These two schemes are based on the concepts of time-ordering and space-ordering. We briefly describe the Jacobi–Perron algorithm, which is based on the time-ordering concept, and two other algorithms which are related to the idea of space-ordering. One of these algorithms leads to the d-dimensional Gauss transformation which is the subject of the rest of the paper. Section 4 is concerned with finding a good coordinate system for the natural extension of the d-dimensional Gauss transformation. In Sect. 5, various important properties of the natural extension are proved.
In particular, using the symmetries of the natural extension, we find an explicit formula for the density of its invariant measure.
2. One-Dimensional Continued Fractions
In this section we discuss the approximation of irrationals by rationals in the classical one-dimensional case. The theory of one-dimensional continued fractions is one of the most beautiful examples of the applications of ergodic theory. We realise that the theory of continued fractions is classical (see [12]) and that the reader is well aware of this theory. Nevertheless we wish to spend some time on a formal introduction of the Gauss transformation and a discussion of its ergodic properties and its connection with the theory of one-dimensional continued fractions. This introduction will be useful in the next section where we will discuss multidimensional generalisations of this theory. In this section we will also formulate the most important properties of the one-dimensional case. We will see later that only one of the multidimensional generalisations inherits these nice properties.
We will start with a geometrical approach to the problem of finding a sequence of rational approximations to an irrational number $\omega \in [0, 1]$. This geometrical scheme is based on the following picture. A point $\omega$ is approximated by a sequence of nested intervals which contain $\omega$. These intervals are constructed inductively. Suppose that on the $n$th step one has an interval $\Delta_n$ which contains $\omega$ and which has rational end points $\frac{p_n}{q_n}$, $\frac{p_n'}{q_n'}$. The point $\omega$, like any point in the interval $\Delta_n$, can be written in the form
\[ \omega = \alpha_0\frac{p_n}{q_n} \oplus \alpha_1\frac{p_n'}{q_n'} \tag{1} \]
for some $\alpha_0 \ge \alpha_1 \ge 0$. Here $\oplus$ denotes Farey addition, i.e.
\[ \alpha\frac{p_1}{q_1} \oplus \beta\frac{p_2}{q_2} = \frac{\alpha p_1 + \beta p_2}{\alpha q_1 + \beta q_2}. \]
Note that we can regard $(\alpha_0, \alpha_1)$ as an element of $\mathbf{R}P^1$ since the representation (1) is unique up to multiplication by a scalar factor. Also note that the order of the end points $\frac{p_n}{q_n}$, $\frac{p_n'}{q_n'}$ in (1) is governed by the relation $\alpha_0 \ge \alpha_1$, rather than by the natural order of the end points on the real line. In the next step of the scheme, we produce an interval $\Delta_{n+1}$ which has end points $\frac{p_n}{q_n}$ and
\[ m_{n+1}\frac{p_n}{q_n} \oplus \frac{p_n'}{q_n'}, \quad \text{where } m_{n+1} \in \mathbf{N}. \]
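To consider how to choose the integer $m_{n+1}$ we rewrite (1) as
\[ \omega = \frac{p_n}{q_n} \oplus \omega^{(n)}\frac{p_n'}{q_n'}, \quad \text{where } \omega^{(n)} = \frac{\alpha_1}{\alpha_0}. \]
Then
\[ \omega = \frac{1}{\omega^{(n)}}\frac{p_n}{q_n} \oplus \frac{p_n'}{q_n'} = \left[\frac{1}{\omega^{(n)}}\right]\frac{p_n}{q_n} \oplus \frac{p_n'}{q_n'} \oplus \left\{\frac{1}{\omega^{(n)}}\right\}\frac{p_n}{q_n}, \]
where $[x]$ and $\{x\}$ denote the integer and fractional parts of a real number $x$ respectively. We let $m_{n+1} = \left[\frac{1}{\omega^{(n)}}\right]$,
\[ \frac{p_{n+1}}{q_{n+1}} = m_{n+1}\frac{p_n}{q_n} \oplus \frac{p_n'}{q_n'} = \frac{m_{n+1}p_n + p_n'}{m_{n+1}q_n + q_n'} \quad \text{and} \quad \frac{p_{n+1}'}{q_{n+1}'} = \frac{p_n}{q_n}. \]
Then $\Delta_{n+1}$, which is the closed interval with end points $\frac{p_{n+1}}{q_{n+1}}$ and $\frac{p_{n+1}'}{q_{n+1}'}$, contains $\omega$ and
\[ \omega = \frac{p_{n+1}}{q_{n+1}} \oplus \omega^{(n+1)}\frac{p_{n+1}'}{q_{n+1}'}, \quad \text{where } \omega^{(n+1)} = \left\{\frac{1}{\omega^{(n)}}\right\}. \]
We see that our geometric scheme has led to the Gauss transformation $T(\omega) = \{\frac{1}{\omega}\}$. To make this scheme work it is necessary to specify the interval $\Delta_0$. Take $p_0 = 0$, $q_0 = 1$, $p_0' = 1$ and $q_0' = 0$ so that
\[ \omega = \frac{p_0}{q_0} \oplus \omega^{(0)}\frac{p_0'}{q_0'}, \quad \text{where } \omega^{(0)} = \omega. \]
Hence $\Delta_0$ is associated with the semi-infinite interval $[0, \infty)$. Notice that $\omega^{(n)} = T^n\omega^{(0)}$ gives the projective coordinate of the point $\omega$ inside the interval $\Delta_n$. If we chose $m_{n+1}$ to be an integer greater than $[\frac{1}{\omega^{(n)}}]$ then the interval $\Delta_{n+1}$ would not contain $\omega$. This guarantees that the approximation given by continued fractions is the best possible. One can show that for any rational $\frac{p}{q} \in \mathrm{Int}(\Delta_n)$, $q > \max\{q_n, q_n'\}$ and in fact $q \ge q_n + q_n'$. The rational approximations $\frac{p_n}{q_n}$ defined above are called the convergents of the irrational number $\omega$. We now describe an easy way to calculate them.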
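Before turning to that calculation, the interval scheme itself can be sketched in a few lines of Python. This is an illustration added here, not part of the original paper; the function name `farey_scheme` and the use of floating point for $\omega$ are assumptions of this example, so only the first few steps are reliable. It starts from $p_0/q_0 = 0/1$ and $p_0'/q_0' = 1/0$ and applies the update $m_{n+1} = [1/\omega^{(n)}]$, $p_{n+1} = m_{n+1}p_n + p_n'$, $q_{n+1} = m_{n+1}q_n + q_n'$, $(p_{n+1}', q_{n+1}') = (p_n, q_n)$.

```python
import math

def farey_scheme(omega, steps):
    """Run the nested-interval scheme for a float omega in (0, 1).
    Returns one tuple (m, p, q, p_prime, q_prime) per step."""
    p, q = 0, 1               # p_0 / q_0
    pp, qq = 1, 0             # p'_0 / q'_0, the point at infinity
    w = omega                 # projective coordinate omega^(n)
    out = []
    for _ in range(steps):
        m = math.floor(1 / w)                       # m_{n+1} = [1 / omega^(n)]
        p, q, pp, qq = m * p + pp, m * q + qq, p, q
        w = 1 / w - m                               # omega^(n+1) = {1 / omega^(n)}
        out.append((m, p, q, pp, qq))
    return out

if __name__ == "__main__":
    for step in farey_scheme(math.sqrt(2) - 1, 6):
        print(step)
```

For $\omega = \sqrt{2} - 1$ every $m_n$ equals 2, and the pairs $(p_n, q_n)$ run through the familiar convergents of that number.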
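The map $T$ is expanding and has infinitely many inverse branches. Each of these is characterised by an integer $m \in \mathbf{N}$. For each $m \in \mathbf{N}$, let $T_m^{-1}$ denote the branch of $T^{-1}$ given by
\[ T_m^{-1}(\omega) = \frac{1}{m + \omega}. \]
Notice that $\omega \in T_m^{-1}[0, 1]$ if and only if $[\frac{1}{\omega}] = m$. The trajectory of $\omega$ under $T$ gives the sequence of integers produced in the continued fraction expansion:
\[ m_n = \left[\frac{1}{T^{n-1}\omega}\right]. \]
Take $0 = \frac{0}{1}$ as an approximation to $\omega^{(n)}$. One can easily show that
\[ \frac{p_n}{q_n} = T_{m_1}^{-1} \circ \cdots \circ T_{m_n}^{-1}(0). \]
It is convenient to present this using matrix multiplication. Firstly, note that
\[ \frac{p'}{q'} = T_{m_n}^{-1}\left(\frac{p}{q}\right) \quad \text{if and only if} \quad \begin{pmatrix} q' \\ p' \end{pmatrix} = \begin{pmatrix} m_n & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} q \\ p \end{pmatrix}. \]
Thus
\[ \begin{pmatrix} q_n \\ p_n \end{pmatrix} = \begin{pmatrix} m_1 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} m_2 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} m_n & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
For $n \in \mathbf{N}$, let
\[ A_n = \begin{pmatrix} m_n & 1 \\ 1 & 0 \end{pmatrix}. \]
Then $q_n = \langle A_1 \cdots A_n e_1, e_1\rangle = \langle e_1, A_n \cdots A_1 e_1\rangle$ and $p_n = \langle A_1 \cdots A_n e_1, e_2\rangle = \langle e_1, A_n \cdots A_1 e_2\rangle$, where $\{e_1, e_2\}$ is the standard basis of $\mathbf{R}^2$: $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Notice that
\[ \begin{pmatrix} q_n' \\ p_n' \end{pmatrix} = \begin{pmatrix} q_{n-1} \\ p_{n-1} \end{pmatrix} = A_1 \cdots A_{n-1}e_1 = A_1 \cdots A_n e_2. \]
We next discuss the Gauss automorphism, which is the natural extension of the Gauss transformation. Each irrational $\omega \in [0, 1]$ has a unique symbolic representation $(m_1, m_2, \ldots)$, where $m_n = [\frac{1}{T^{n-1}\omega}]$. We write $[m_1, m_2, \ldots]$ for the point $\omega$ corresponding to $(m_1, m_2, \ldots)$. The Gauss transformation $T : [0, 1] \to [0, 1]$ is conjugate to the unit shift $U$ on the space of one-sided sequences in $\mathbf{N}$:
\[ U((m_1, m_2, \ldots)) = (m_2, m_3, \ldots). \]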
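The matrix formula for $(q_n, p_n)$ above can be compared with the composition of inverse branches $T_{m_1}^{-1} \circ \cdots \circ T_{m_n}^{-1}(0)$ in exact arithmetic. The Python sketch below is an illustration added here; the function names are hypothetical, and the digit list is an arbitrary example.

```python
from fractions import Fraction

def convergent_from_matrices(digits):
    """Return (p_n, q_n) by applying A_n, ..., A_1 in turn to e_1 = (1, 0),
    where A_m = [[m, 1], [1, 0]] acts on the column vector (q, p)."""
    q, p = 1, 0
    for m in reversed(digits):
        q, p = m * q + p, q
    return p, q

def convergent_from_branches(digits):
    """Return T_{m_1}^{-1} o ... o T_{m_n}^{-1}(0) as an exact fraction."""
    value = Fraction(0)
    for m in reversed(digits):
        value = 1 / (m + value)       # inverse branch T_m^{-1}
    return value

if __name__ == "__main__":
    digits = [7, 15, 1, 292]
    print(convergent_from_matrices(digits), convergent_from_branches(digits))
```

Both routes perform the same recursion; the matrix form simply keeps numerator and denominator as separate integers.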
The Gauss measure
\[ \mu(A) = \frac{1}{\log 2} \int_A \frac{1}{1 + \omega}\, d\omega, \tag{2} \]
which is the unique absolutely continuous $T$-invariant probability measure, is transformed by this conjugacy to an invariant Gibbs measure $\nu$ on the space $\mathbb{N}^{\mathbb{N}}$ of one-sided sequences in $\mathbb{N}$. The natural extension $\tilde{T}$ of $T$ is metrically isomorphic to the unit shift $\tilde{U}$ on the space $\mathbb{N}^{\mathbb{Z}}$ of two-sided sequences with an invariant measure $\tilde{\nu}$. The measure $\tilde{\nu}$ is the unique measure on $\mathbb{N}^{\mathbb{Z}}$ which is $\tilde{U}$-invariant and whose projection onto $\mathbb{N}^{\mathbb{N}}$ coincides with $\nu$. However, there is a better coordinate system for the natural extension $\tilde{T}$ of $T$. Given a two-sided sequence $(m_n)_{n \in \mathbb{Z}}$ we can produce $(y, x) \in [0, 1]^2$ by defining
\[ y = [m_0, m_{-1}, \ldots], \qquad x = [m_1, m_2, \ldots]. \]
In 1977 Nakada, Ito and Tanaka [16] observed that the shift $\tilde{U} : \mathbb{N}^{\mathbb{Z}} \to \mathbb{N}^{\mathbb{Z}}$ is conjugate to the map $\tilde{T} : [0, 1]^2 \to [0, 1]^2$ given by
\[ \tilde{T}(y, x) = \left( \frac{1}{[1/x] + y}, \left\{ \frac{1}{x} \right\} \right) \]
and that $\tilde{T}$ has an invariant measure given by the density
\[ \frac{1}{\log 2}\, \frac{1}{(1 + xy)^2}. \tag{3} \]
Clearly, projection onto $x$ produces the Gauss measure (2):
\[ \int_0^1 \frac{1}{(1 + xy)^2}\, dy = \frac{1}{1 + x}. \]
The transformation $\tilde{T}$ has the important property of reversibility. One can readily see that $\tilde{T}^{-1} = S \tilde{T} S$, where $S$ is the involution $S(y, x) = (x, y)$. Notice that $S$ corresponds to the reversing of the orientation of a two-sided sequence $(m_n)_{n \in \mathbb{Z}}$. We will see below that in the $d$-dimensional case, both the $d$-dimensional Gauss transformation and its natural extension have invariant measures which generalise the above formulae.

The $y$ coordinate for $\tilde{T}$ was obtained through the continued fraction expansion written in reverse order. These reverse order continued fractions appear quite naturally in the theory of continued fractions. Consider the sequence $\rho_n = q_{n-1}/q_n$, $n \ge 1$. It is easy to see that
\[ \frac{1}{\rho_n} = \frac{q_n}{q_{n-1}} = m_n + \frac{q_{n-2}}{q_{n-1}} = m_n + \rho_{n-1}. \]
This means that
\[ T(\rho_n) = \rho_{n-1} \quad \text{and} \quad m_n = \left[ \frac{1}{\rho_n} \right]. \tag{4} \]
Iterating one has
\[ \rho_n = \frac{q_{n-1}}{q_n} = [m_n, m_{n-1}, \ldots, m_1]. \tag{5} \]
This formula expresses the ratio of the denominators in terms of the first $n$ entries of the continued fraction of $\omega$ written in reverse order. The numbers $\rho_n$ are important since the quality of approximation can be expressed through them. Indeed
\[ \frac{1}{2 q_{n+1}} \le |\omega q_n - p_n| \le \frac{1}{q_{n+1}} \]
and
\[ \frac{1}{q_{n+1}} = \prod_{i=1}^{n+1} \rho_i. \]
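Formulas (4) and (5) and the product formula can be checked numerically. The sketch below is only an illustration: the helper names `denominators` and `cf_value` and the digit string $(4, 3, 2)$ are assumptions made for the example, and the recursion $q_n = m_n q_{n-1} + q_{n-2}$ with $q_0 = 1$, $q_{-1} = 0$ is the one implied by the identity $1/\rho_n = m_n + \rho_{n-1}$.

```python
from fractions import Fraction


def denominators(digits):
    """Return [q_0, q_1, ..., q_n] using q_k = m_k q_{k-1} + q_{k-2}, q_0 = 1, q_{-1} = 0."""
    qs = [1]
    q_prev, q = 0, 1
    for m in digits:
        q_prev, q = q, m * q + q_prev
        qs.append(q)
    return qs


def cf_value(digits):
    """Evaluate the finite continued fraction [d_1, d_2, ..., d_k] exactly."""
    value = Fraction(0)
    for m in reversed(digits):
        value = 1 / (m + value)
    return value


digits = [4, 3, 2]                      # hypothetical partial quotients m_1, m_2, m_3
qs = denominators(digits)

# rho_n = q_{n-1} / q_n for n = 1, ..., len(digits)
rhos = [Fraction(qs[n - 1], qs[n]) for n in range(1, len(qs))]

# Formula (5): rho_n equals the reversed continued fraction [m_n, ..., m_1].
reversed_values = [cf_value(list(reversed(digits[:n]))) for n in range(1, len(digits) + 1)]

# Product formula: rho_1 * ... * rho_n = 1 / q_n.
product = Fraction(1)
for rho in rhos:
    product *= rho
```

Here `rhos` and `reversed_values` agree entry by entry, and `product` is the reciprocal of the last denominator.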
We will see later that (4) and (5) can be generalised to higher dimensions. In some sense they give a basic insight into what sort of coordinates one should use for the natural extension.

We end this section with a simple result for one-dimensional continued fractions. We will see later that this result has a multidimensional generalisation which is much less trivial. Consider the finite sequence $m_1, m_2, \ldots, m_n$. When we read it in the forward direction it corresponds to the matrix
\[ C_n = \begin{pmatrix} m_n & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} m_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} m_1 & 1 \\ 1 & 0 \end{pmatrix}. \]
In the opposite direction the fraction $[m_n, \ldots, m_1]$ produces
\[ \tilde{C}_n = \begin{pmatrix} m_1 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} m_{n-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} m_n & 1 \\ 1 & 0 \end{pmatrix}. \]
Obviously, $C_n^t = \tilde{C}_n$ since the matrices $A_n$ are symmetric. It follows from this trivial observation that $[m_1, \ldots, m_n]$ and $[m_n, \ldots, m_1]$ have the same denominator. We will see later that in the $d$-dimensional case the matrices $C_n^t$ and $\tilde{C}_n$ coincide up to a change in the order of the rows and columns.

3. Multidimensional Jacobi–Perron Type Algorithms

In this section we describe a geometric approach to the construction of rational approximations to an irrational vector. This approach leads to many different generalisations of one-dimensional continued fractions. These algorithms can be called Jacobi–Perron type algorithms, since the transformations which are used are similar to the Jacobi–Perron transformation. We will see later that only one Jacobi–Perron type algorithm inherits the nice properties which we discussed in Sect. 2. We will call the corresponding transformation the $d$-dimensional Gauss transformation.

Let $\omega = (\omega_1, \ldots, \omega_d) \in [0, 1]^d$. A geometrical scheme for approximating $\omega$ is based on a nested sequence of $d$-dimensional simplices, each of which contains $\omega$. Each simplex in the sequence has vertices which are given by rational vectors of the form
\[ \left( \frac{p_1}{q}, \ldots, \frac{p_d}{q} \right). \]
Given a simplex in the sequence, one forms the next simplex by deleting one of the vertices and replacing it by a Farey combination of the existing vertices. In this Farey combination, each vertex has an integer coefficient. Moreover, the deleted vertex has coefficient 1. Let $\Delta_n$ denote the simplex which was obtained at the $n$th step. In order to define an algorithm for producing the next simplex $\Delta_{n+1}$, it is necessary to order the vertices of $\Delta_n$. The vertices can be ordered in two different ways. They can be put into time-order or space-order.

We consider the time-ordering approach first. The $d + 1$ vertices of $\Delta_n$ are denoted
\[ \frac{p_n^{(0)}}{q_n^{(0)}}, \frac{p_n^{(1)}}{q_n^{(1)}}, \ldots, \frac{p_n^{(d)}}{q_n^{(d)}}, \]
where, for $0 \le i \le d$, $p_n^{(i)} \in \mathbb{Z}_+^d$ and $q_n^{(i)} \in \mathbb{N}$. We order the vertices according to the times of their appearance in the nested sequence of simplices. So $p_n^{(d)}/q_n^{(d)}$ is the vertex which appeared at the $n$th step, $p_n^{(d-1)}/q_n^{(d-1)}$ is the vertex which appeared at the $(n-1)$st step, and so on down to $p_n^{(0)}/q_n^{(0)}$. The Jacobi–Perron algorithm is based on the following procedure. One deletes the oldest vertex $p_n^{(0)}/q_n^{(0)}$ and adds the vertex
\[ \frac{p_{n+1}^{(d)}}{q_{n+1}^{(d)}} = \frac{p_n^{(0)}}{q_n^{(0)}} \oplus \bigoplus_{i=1}^{d} m_i \frac{p_n^{(i)}}{q_n^{(i)}}, \]
where the $m_i$ are integers which will be specified later. In the formula above, $\oplus$ stands for Farey addition:
\[ \alpha \frac{p}{q} \oplus \beta \frac{p'}{q'} = \frac{\alpha p + \beta p'}{\alpha q + \beta q'}. \]
Since $\omega \in \Delta_n$ we can write
\[ \omega = \alpha_{d+1} \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \alpha_d \frac{p_n^{(d-1)}}{q_n^{(d-1)}} \oplus \cdots \oplus \alpha_1 \frac{p_n^{(0)}}{q_n^{(0)}}, \]
where $\alpha_{d+1}, \alpha_d, \ldots, \alpha_1 \ge 0$. The representation of $\omega$ by $(\alpha_{d+1}, \alpha_d, \ldots, \alpha_1)$ is unique up to multiplication by a scalar, i.e. $(\alpha_{d+1}, \alpha_d, \ldots, \alpha_1) \in \mathbb{R}P^d$. It is convenient to take a representative of $(\alpha_{d+1}, \alpha_d, \ldots, \alpha_1)$ which has first coordinate 1. So
\[ \omega = \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \omega_d^{(n)} \frac{p_n^{(d-1)}}{q_n^{(d-1)}} \oplus \cdots \oplus \omega_1^{(n)} \frac{p_n^{(0)}}{q_n^{(0)}}, \qquad \text{where } \omega_i^{(n)} = \frac{\alpha_i}{\alpha_{d+1}}. \]
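Then
\[
\begin{aligned}
\omega &= \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=0}^{d-1} \omega_{i+1}^{(n)} \frac{p_n^{(i)}}{q_n^{(i)}} \\
&= \frac{1}{\omega_1^{(n)}} \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=0}^{d-1} \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \frac{p_n^{(i)}}{q_n^{(i)}} \\
&= \frac{1}{\omega_1^{(n)}} \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=1}^{d-1} \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \frac{p_n^{(i)}}{q_n^{(i)}} \oplus \frac{p_n^{(0)}}{q_n^{(0)}} \\
&= \left[ \frac{1}{\omega_1^{(n)}} \right] \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=1}^{d-1} \left[ \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \right] \frac{p_n^{(i)}}{q_n^{(i)}} \oplus \frac{p_n^{(0)}}{q_n^{(0)}} \oplus \left\{ \frac{1}{\omega_1^{(n)}} \right\} \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=1}^{d-1} \left\{ \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \right\} \frac{p_n^{(i)}}{q_n^{(i)}}.
\end{aligned}
\]
The first three terms define the new vertex $p_{n+1}^{(d)}/q_{n+1}^{(d)}$, i.e.
\[ \frac{p_{n+1}^{(d)}}{q_{n+1}^{(d)}} = \left[ \frac{1}{\omega_1^{(n)}} \right] \frac{p_n^{(d)}}{q_n^{(d)}} \oplus \bigoplus_{i=1}^{d-1} \left[ \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \right] \frac{p_n^{(i)}}{q_n^{(i)}} \oplus \frac{p_n^{(0)}}{q_n^{(0)}}, \]
and
\[ \frac{p_{n+1}^{(i)}}{q_{n+1}^{(i)}} = \frac{p_n^{(i+1)}}{q_n^{(i+1)}}, \qquad 0 \le i \le d - 1. \]
We get
\[ \omega = \frac{p_{n+1}^{(d)}}{q_{n+1}^{(d)}} \oplus \bigoplus_{i=0}^{d-1} \omega_{i+1}^{(n+1)} \frac{p_{n+1}^{(i)}}{q_{n+1}^{(i)}}, \]
where
\[ \omega_i^{(n+1)} = \left\{ \frac{\omega_{i+1}^{(n)}}{\omega_1^{(n)}} \right\}, \quad 1 \le i \le d - 1, \qquad \text{and} \qquad \omega_d^{(n+1)} = \left\{ \frac{1}{\omega_1^{(n)}} \right\}. \]
Denote $\omega^{(n+1)} = (\omega_1^{(n+1)}, \ldots, \omega_d^{(n+1)})$. Then
\[ \omega^{(n+1)} = JP_d(\omega^{(n)}) = \left( \left\{ \frac{\omega_2^{(n)}}{\omega_1^{(n)}} \right\}, \ldots, \left\{ \frac{\omega_d^{(n)}}{\omega_1^{(n)}} \right\}, \left\{ \frac{1}{\omega_1^{(n)}} \right\} \right). \]
Clearly, $JP_d$ is a map from $I^d$ into itself. This map is an exact endomorphism which has a unique absolutely continuous invariant probability measure [21, 15, 9].

We now consider the space-ordering approach. We denote the $d + 1$ vertices of the $n$th step by
\[ \frac{p(n, 0)}{q(n, 0)}, \frac{p(n, 1)}{q(n, 1)}, \ldots, \frac{p(n, d)}{q(n, d)}, \]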
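where $p(n, i) \in \mathbb{Z}_+^d$ and $q(n, i) \in \mathbb{N}$ for $0 \le i \le d$. In this approach we order the vertices according to their contribution to the expansion
\[ \omega = \bigoplus_{i=0}^{d} \alpha_i \frac{p(n, i)}{q(n, i)}. \tag{6} \]
More precisely, the ordering in (6) is such that $\alpha_0 \ge \alpha_1 \ge \cdots \ge \alpha_d$. Again we will normalise the representation of $\omega$ so that
\[ \omega = \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \omega_i^{(n)} \frac{p(n, i)}{q(n, i)}, \qquad \text{where } \omega_i^{(n)} = \frac{\alpha_i}{\alpha_0}. \]
Notice that $1 \ge \omega_1^{(n)} \ge \omega_2^{(n)} \ge \cdots \ge \omega_d^{(n)} \ge 0$. In order to produce the next approximation one must decide which vertex to delete. This vertex may be any one except the first. We will consider two extreme cases. The first case is when we delete the vertex $p(n, d)/q(n, d)$ and the second case is when we delete $p(n, 1)/q(n, 1)$.

In the first case we have
\[
\begin{aligned}
\omega &= \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \omega_i^{(n)} \frac{p(n, i)}{q(n, i)} \\
&= \frac{1}{\omega_d^{(n)}} \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \frac{\omega_i^{(n)}}{\omega_d^{(n)}} \frac{p(n, i)}{q(n, i)} \\
&= \left[ \frac{1}{\omega_d^{(n)}} \right] \frac{p(n, 0)}{q(n, 0)} \oplus \frac{p(n, d)}{q(n, d)} \oplus \bigoplus_{i=1}^{d-1} \left[ \frac{\omega_i^{(n)}}{\omega_d^{(n)}} \right] \frac{p(n, i)}{q(n, i)} \oplus \left\{ \frac{1}{\omega_d^{(n)}} \right\} \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d-1} \left\{ \frac{\omega_i^{(n)}}{\omega_d^{(n)}} \right\} \frac{p(n, i)}{q(n, i)}.
\end{aligned}
\]
We get the new vertex
\[ \frac{p(n + 1, 0)}{q(n + 1, 0)} = \left[ \frac{1}{\omega_d^{(n)}} \right] \frac{p(n, 0)}{q(n, 0)} \oplus \frac{p(n, d)}{q(n, d)} \oplus \bigoplus_{i=1}^{d-1} \left[ \frac{\omega_i^{(n)}}{\omega_d^{(n)}} \right] \frac{p(n, i)}{q(n, i)} \]
and the transformation
\[ \omega^{(n+1)} = (\omega_1^{(n+1)}, \ldots, \omega_d^{(n+1)}) = \mathrm{ord}\left( \left\{ \frac{1}{\omega_d^{(n)}} \right\}, \left\{ \frac{\omega_1^{(n)}}{\omega_d^{(n)}} \right\}, \ldots, \left\{ \frac{\omega_{d-1}^{(n)}}{\omega_d^{(n)}} \right\} \right). \tag{7} \]
Here $\mathrm{ord}(\alpha_1, \ldots, \alpha_d)$ is an ordering of $(\alpha_1, \ldots, \alpha_d)$. In other words $\mathrm{ord}(\alpha_1, \ldots, \alpha_d) = (\alpha_{\pi(1)}, \ldots, \alpha_{\pi(d)})$, where $\pi$ is a permutation of $\{1, 2, \ldots, d\}$ such that $\alpha_{\pi(1)} \ge \alpha_{\pi(2)} \ge \cdots \ge \alpha_{\pi(d)}$.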
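Obviously, the permutation $\pi$ depends on $(\alpha_1, \ldots, \alpha_d)$. The vertices of the simplex $\Delta_{n+1}$ have to be ordered according to the ordering in (7). Notice that (7) defines a transformation of the simplex $\Delta^d = \{(\omega_1, \ldots, \omega_d) \in [0, 1]^d : \omega_1 \ge \omega_2 \ge \cdots \ge \omega_d\}$ into itself.

We now consider the second choice of the vertex which is to be deleted, namely $p(n, 1)/q(n, 1)$. We have
\[
\begin{aligned}
\omega &= \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \omega_i^{(n)} \frac{p(n, i)}{q(n, i)} \\
&= \frac{1}{\omega_1^{(n)}} \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \frac{\omega_i^{(n)}}{\omega_1^{(n)}} \frac{p(n, i)}{q(n, i)} \\
&= \left[ \frac{1}{\omega_1^{(n)}} \right] \frac{p(n, 0)}{q(n, 0)} \oplus \frac{p(n, 1)}{q(n, 1)} \oplus \left\{ \frac{1}{\omega_1^{(n)}} \right\} \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=2}^{d} \frac{\omega_i^{(n)}}{\omega_1^{(n)}} \frac{p(n, i)}{q(n, i)}.
\end{aligned}
\]
The formula above gives a new vertex
\[ \frac{p(n + 1, 0)}{q(n + 1, 0)} = \left[ \frac{1}{\omega_1^{(n)}} \right] \frac{p(n, 0)}{q(n, 0)} \oplus \frac{p(n, 1)}{q(n, 1)} \]
and a transformation $T_d : \Delta^d \to \Delta^d$ such that $T_d(\omega^{(n)}) = \omega^{(n+1)}$. In coordinates $T_d$ is given by
\[ T_d(\omega_1, \ldots, \omega_d) = \mathrm{ord}\left( \left\{ \frac{1}{\omega_1} \right\}, \frac{\omega_2}{\omega_1}, \ldots, \frac{\omega_d}{\omega_1} \right). \tag{8} \]
The transformation $T_d$ is the main subject of the rest of this paper.

Definition 1. The transformation $T_d : \Delta^d \to \Delta^d$ is called the $d$-dimensional Gauss transformation.

Strictly speaking, for all geometric schemes one has to specify the initial simplex $\Delta_0$. For both the space-ordering schemes above the initial simplex is given by
\[
\begin{aligned}
p(0, 0) &= (0, \ldots, 0), & q(0, 0) &= 1, \\
p(0, 1) &= (1, 0, \ldots, 0), & q(0, 1) &= 0, \\
&\;\;\vdots \\
p(0, d) &= (0, \ldots, 0, 1), & q(0, d) &= 0.
\end{aligned}
\]
By interpreting $\frac{0}{0}$ as 0 and $\frac{1}{0}$ as infinity, we can regard $\Delta_0$ as a semi-infinite simplex which coincides with the positive quadrant of $\mathbb{R}^d$, $\{\omega = (\omega_1, \ldots, \omega_d) \in \mathbb{R}^d : \omega_i \ge 0\}$.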
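As an illustration of formula (8), the following Python sketch applies one step of the $d$-dimensional Gauss transformation to a point with rational coordinates and then undoes it. It rests on the reading of (8) in which only the first entry, $\{1/\omega_1\}$, is a fractional part. The function names, the returned pair $(m, j)$ (integer part $[1/\omega_1]$ and the position at which $\{1/\omega_1\}$ lands after ordering) and the sample point are illustrative assumptions; the inverse step simply reverses the ordering and rescaling.

```python
import math
from fractions import Fraction


def gauss_transformation(omega):
    """One step of T_d on omega = (w_1 >= ... >= w_d), w_1 > 0.

    Returns (T_d(omega), m, j) with m = [1/w_1] and j the (1-based) position
    of the fractional part {1/w_1} inside the ordered image.
    """
    w1 = omega[0]
    inverse = 1 / w1
    m = math.floor(inverse)
    frac = inverse - m
    rest = [w / w1 for w in omega[1:]]          # already in decreasing order
    j = 1 + sum(1 for w in rest if w > frac)    # slot where {1/w_1} belongs
    image = rest[:j - 1] + [frac] + rest[j - 1:]
    return tuple(image), m, j


def inverse_step(omega, m, j):
    """Undo one step: take out the j-th coordinate, divide everything by m + w_j."""
    scale = m + omega[j - 1]
    others = omega[:j - 1] + omega[j:]
    return (1 / scale,) + tuple(w / scale for w in others)


# Hypothetical sample point in the ordered simplex, d = 3.
omega = (Fraction(3, 4), Fraction(1, 2), Fraction(1, 5))
image, m, j = gauss_transformation(omega)
recovered = inverse_step(image, m, j)
```

Exact fractions are used so that the round trip `inverse_step(gauss_transformation(omega))` returns the starting point without rounding error.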
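In a well-defined number of steps one reaches a bounded simplex. This happens when all the vertices with a 0 denominator are removed.

We now describe a straightforward algebraic method of calculating the vectors
\[ \frac{p(n, 0)}{q(n, 0)}, \ldots, \frac{p(n, d)}{q(n, d)} \]
produced by the $d$-dimensional Gauss transformation $T_d$. From now on we will write $T$ instead of $T_d$. Define $m : \Delta^d \to \mathbb{N}$ by $m(\omega) = [1/\omega_1]$, where $\omega = (\omega_1, \ldots, \omega_d) \in \Delta^d$. The ordering in (8) consists of placing $\{1/\omega_1\}$ in the correct position. Let $j(\omega)$ denote this position, i.e. $j(\omega) = i$, where the $i$th coordinate of $T(\omega)$ is $\{1/\omega_1\}$. For each pair $(m, j) \in \mathbb{N} \times \{1, 2, \ldots, d\}$ there is a corresponding branch of $T^{-1}$. The branch of $T^{-1}$ associated to $(m, j)$, denoted $T_{(m,j)}^{-1}$, is given by
\[ T_{(m,j)}^{-1}(\omega_1, \ldots, \omega_d) = \left( \frac{1}{m + \omega_j}, \frac{\omega_1}{m + \omega_j}, \ldots, \frac{\omega_{j-1}}{m + \omega_j}, \frac{\omega_{j+1}}{m + \omega_j}, \ldots, \frac{\omega_d}{m + \omega_j} \right). \]
For each pair $(m, j) \in \mathbb{N} \times \{1, 2, \ldots, d\}$ we define a matrix $\tilde{A}_{(m,j)} \in GL(d + 1, \mathbb{Z})$. The first row of $\tilde{A}_{(m,j)}$ has only two nonzero entries:
\[ \tilde{a}_{1,1} = m, \qquad \tilde{a}_{1,j+1} = 1. \]
All other rows have only one nonzero entry, which is equal to 1. More precisely, $\tilde{a}_{i,i-1} = 1$ for $i = 2, \ldots, j + 1$ and $\tilde{a}_{i,i} = 1$ for $i = j + 2, \ldots, d + 1$. In short,
\[
\tilde{A}_{(m,j)} =
\begin{pmatrix}
m & 0 & \cdots & 0 & 1 & 0 & \cdots & 0 & 0 \\
1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & \cdots & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}.
\tag{9}
\]
It is easy to check that
\[ T_{(m,j)}^{-1}\left( \frac{p_1}{q}, \ldots, \frac{p_d}{q} \right) = \left( \frac{p_1'}{q'}, \ldots, \frac{p_d'}{q'} \right) \quad \text{if and only if} \quad \begin{pmatrix} q' \\ p_1' \\ \vdots \\ p_d' \end{pmatrix} = \tilde{A}_{(m,j)} \begin{pmatrix} q \\ p_1 \\ \vdots \\ p_d \end{pmatrix}. \]
We also define $A_{(m,j)} = \tilde{A}_{(m,j)}^t$. Notice that in the one-dimensional case $A_m = \tilde{A}_m$ since $\tilde{A}_m$ is symmetric.

Let $\omega$ be the point of $\Delta^d$ that we wish to approximate. We can produce the vectors
\[ \frac{p(n, i)}{q(n, i)} = \frac{1}{q(n, i)} \bigl( p_1(n, i), \ldots, p_d(n, i) \bigr) \]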
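by the method described above. These vertices $p(n, i)/q(n, i)$ form a matrix $D(\Delta_n) \in GL(d + 1, \mathbb{Z})$, namely
\[
D(\Delta_n) =
\begin{pmatrix}
q(n, 0) & p_1(n, 0) & \cdots & p_d(n, 0) \\
q(n, 1) & p_1(n, 1) & \cdots & p_d(n, 1) \\
\vdots & \vdots & & \vdots \\
q(n, d) & p_1(n, d) & \cdots & p_d(n, d)
\end{pmatrix}.
\]
Consider the trajectory of $\omega$ under $T$:
\[ \omega = \omega^{(0)} \xrightarrow{\;T\;} \omega^{(1)} \xrightarrow{\;T\;} \cdots \xrightarrow{\;T\;} \omega^{(n)}. \]
Associated to this trajectory is the sequence $(m_1, j_1), \ldots, (m_n, j_n)$, where
\[ m_i = m(T^{i-1}\omega), \qquad j_i = j(T^{i-1}\omega). \]
Let
\[ \tilde{C}_n = (\tilde{c}_{i,k}^{(n)})_{1 \le i,k \le d+1} = \tilde{A}_{(m_1,j_1)} \tilde{A}_{(m_2,j_2)} \cdots \tilde{A}_{(m_n,j_n)} \]
and
\[ C_n = (c_{i,k}^{(n)})_{1 \le i,k \le d+1} = \tilde{C}_n^t = A_{(m_n,j_n)} \cdots A_{(m_2,j_2)} A_{(m_1,j_1)}. \]
It can be shown that $C_n = D(\Delta_n)$ (see [6]). This obviously implies that
\[ \frac{p(n, i)}{q(n, i)} = \left( \frac{c_{i+1,2}^{(n)}}{c_{i+1,1}^{(n)}}, \ldots, \frac{c_{i+1,d+1}^{(n)}}{c_{i+1,1}^{(n)}} \right), \qquad 0 \le i \le d. \]
Also, for the first vertex
\[ \frac{p(n, 0)}{q(n, 0)} = T_{(m_1,j_1)}^{-1} \circ T_{(m_2,j_2)}^{-1} \circ \cdots \circ T_{(m_n,j_n)}^{-1} \left( \frac{0}{1}, \ldots, \frac{0}{1} \right). \]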
Remark. In the case d = 1, the three geometric schemes considered above all lead to the same transformation, namely the Gauss transformation. This is because, in the one-dimensional case, the earlier vertex always gives a smaller contribution to the decomposition (6). It seems to be natural to get rid of the vertex which gives the smallest contribution to (6). However, the natural generalisation of the Gauss transformation arises from a different strategy: one has to delete the vertex which gives the second largest contribution to the decomposition (6).
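The algebraic method of Sect. 3 can be sketched in Python as follows. The code builds $\tilde{A}_{(m,j)}$ entry by entry from the rule $\tilde{a}_{1,1} = m$, $\tilde{a}_{1,j+1} = 1$, $\tilde{a}_{i,i-1} = 1$ ($2 \le i \le j+1$), $\tilde{a}_{i,i} = 1$ ($i \ge j+2$), accumulates $C_n = A_{(m_n,j_n)} \cdots A_{(m_1,j_1)}$ along an orbit, and then checks one consequence of $C_n = D(\Delta_n)$: the Farey combination of the rows of $C_n$ with weights $(1, \omega_1^{(n)}, \ldots, \omega_d^{(n)})$ should be proportional to $(1, \omega_1, \ldots, \omega_d)$. Function names, the floating-point sample point and the choice $n = 5$ are assumptions made for the illustration.

```python
import math


def gauss_step(omega):
    """One step of the d-dimensional Gauss map; returns (image, m, j)."""
    w1 = omega[0]
    inverse = 1.0 / w1
    m = math.floor(inverse)
    frac = inverse - m
    rest = [w / w1 for w in omega[1:]]
    j = 1 + sum(1 for w in rest if w > frac)
    return rest[:j - 1] + [frac] + rest[j - 1:], m, j


def a_tilde(m, j, d):
    """The (d+1) x (d+1) integer matrix A~_(m,j), written with 0-based indices."""
    a = [[0] * (d + 1) for _ in range(d + 1)]
    a[0][0] = m
    a[0][j] = 1                         # entry (1, j+1)
    for i in range(2, j + 2):           # rows i = 2, ..., j+1: entry (i, i-1)
        a[i - 1][i - 2] = 1
    for i in range(j + 2, d + 2):       # rows i = j+2, ..., d+1: entry (i, i)
        a[i - 1][i - 1] = 1
    return a


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def vertex_matrix(omega, n):
    """Return (C_n, omega^(n)) with C_n = A_(m_n,j_n) ... A_(m_1,j_1)."""
    d = len(omega)
    c = [[int(i == k) for k in range(d + 1)] for i in range(d + 1)]
    w = list(omega)
    for _ in range(n):
        w, m, j = gauss_step(w)
        c = matmul(transpose(a_tilde(m, j, d)), c)
    return c, w


omega = (0.83, 0.47, 0.21)              # hypothetical point with w1 > w2 > w3
c, w_n = vertex_matrix(omega, 5)

# Farey combination of the rows (q(n,i), p(n,i)) with weights (1, omega^(n)).
weights = [1.0] + w_n
combo = [sum(weights[i] * c[i][k] for i in range(len(c))) for k in range(len(c))]
recovered = [combo[k] / combo[0] for k in range(1, len(c))]
```

Up to floating-point rounding, `recovered` reproduces `omega`, which is what the representation of $\omega$ in terms of the vertices of $\Delta_n$ predicts.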
4. The d-Dimensional Gauss Transformation and its Natural Extension

It was shown by Schweiger [22] that the $d$-dimensional Gauss transformation $T$ has an ergodic invariant probability measure $\mu$ given by
\[ \mu(d\omega) = \frac{1}{K} \rho(\omega)\, d\omega, \]
\[ \rho(\omega) = \sum_{\pi \in S_d} \frac{1}{1 + \omega_{\pi(1)}} \cdot \frac{1}{1 + \omega_{\pi(1)} + \omega_{\pi(2)}} \cdots \frac{1}{1 + \omega_{\pi(1)} + \omega_{\pi(2)} + \cdots + \omega_{\pi(d)}}, \]
where $K = \int_{\Delta^d} \rho(\omega)\, d\omega$ and $S_d$ is the group of permutations of $\{1, 2, \ldots, d\}$. It can also be shown that, for almost all $\omega$, the approximations generated by the $d$-dimensional Gauss transformation are exponentially convergent to $\omega$ in the weak or directional sense (see [6]). This means that for $\mu$-almost every $\omega \in \Delta^d$ the diameter of $\Delta_n$ tends to 0 exponentially fast as $n \to \infty$. Weak convergence implies that, after the removal of a set of measure 0 from $\Delta^d$, the map $\Phi$ which associates to $\omega$ the sequence $(m_n, j_n)_{n \in \mathbb{N}} = (m(T^{n-1}\omega), j(T^{n-1}\omega))_{n \in \mathbb{N}}$ is a bijection. We will write $[(m_1, j_1), (m_2, j_2), \ldots]$ for the vector $\omega$ corresponding to the sequence $((m_1, j_1), (m_2, j_2), \ldots)$. The invariant measure $\mu$ is projected by the transformation $\Phi$ onto a stationary measure $\nu$ on the space of one-sided sequences in $\mathbb{N} \times \{1, 2, \ldots, d\}$. Clearly, the dynamical system $(\Delta^d, T, \mu)$ is metrically isomorphic to the unit shift on the space of one-sided sequences in $\mathbb{N} \times \{1, 2, \ldots, d\}$ with stationary measure $\nu$. There exists a unique stationary extension of $\nu$ onto the space of two-sided sequences. We denote this extension by $\tilde{\nu}$ as in the one-dimensional case. The natural extension of $(\Delta^d, T, \mu)$ is isomorphic to the unit shift on the space of two-sided sequences with the invariant measure $\tilde{\nu}$. However, our aim is to find a good coordinate system for the natural extension. One can naively try to mimic the one-dimensional strategy by defining
\[ x = [(m_1, j_1), (m_2, j_2), \ldots], \qquad y = [(m_0, j_0), (m_{-1}, j_{-1}), \ldots]. \]
It turns out that this is not a good way of proceeding. Before we give a formal definition of the right coordinates we offer the following motivation.

4.1. The backwards Gauss transformation. In the one-dimensional case we had the important property that the ratios of the denominators are connected by the backward Gauss transformation with the same integer entry $m$ as for the forward Gauss transformation. More precisely,
\[ T(\omega^{(n-1)}) = \omega^{(n)}, \qquad T\left( \frac{q_{n-1}}{q_n} \right) = \frac{q_{n-2}}{q_{n-1}} \qquad \text{and} \qquad m(\omega^{(n-1)}) = m\left( \frac{q_{n-1}}{q_n} \right). \]
We will see below that a similar property holds in the $d$-dimensional case. The vectors generated by the ratios of the denominators of the vertices are related by the $d$-dimensional Gauss transformation. However, in the $d$-dimensional case there are two numbers related to the Gauss transformation, namely $m(\omega)$ and $j(\omega)$. It turns out that while the parameter $m$ for the forward and backward transformation is the same, the parameter $j$ changes. This change of $j$ leads to the appearance of additional discrete structure in the natural extension.

Consider the simplex $\Delta_n$ which is the $n$th approximation to $\omega$. Each vertex of $\Delta_n$ is a rational vector with a certain denominator. Thus there are $d + 1$ denominators associated
to $\Delta_n$. We put these denominators into their chronological order $q_n^{(0)}, q_n^{(1)}, \ldots, q_n^{(d)}$, where $q_n^{(i)}$ appeared more recently than $q_n^{(i+1)}$. It is easy to see that the denominator of a new vertex is greater than or equal to all previous denominators. Hence
\[ q_n^{(0)} \ge q_n^{(1)} \ge \cdots \ge q_n^{(d)}. \]
It is natural to compare the sequence of denominators in chronological order with the sequence in its space order. Recall that
\[ \omega = \frac{p(n, 0)}{q(n, 0)} \oplus \bigoplus_{i=1}^{d} \omega_i^{(n)} \frac{p(n, i)}{q(n, i)}. \]
It follows from the construction that $q(n, 0)$ corresponds to the most recent vertex, i.e. $q(n, 0) = q_n^{(0)}$. However, $q(n, 1), \ldots, q(n, d)$ appear in an arbitrary order. Let $\Pi_n \in S_d$ be the permutation which reflects this order, i.e.
\[ q(n, i) = q_n^{(\Pi_n(i))} \qquad \text{for } 1 \le i \le d. \]
Denote
\[ \varphi_n = \left( \frac{q_n^{(1)}}{q_n^{(0)}}, \ldots, \frac{q_n^{(d)}}{q_n^{(0)}} \right) \in \Delta^d. \]

Lemma 4.1. $\varphi_{n-1} = T(\varphi_n)$, $m(\varphi_n) = m(\omega^{(n-1)}) = m_n$, $j(\varphi_n) = \Pi_{n-1}(1)$.

Proof. The space-ordering of the denominators of the vertices of $\Delta_{n-1}$ is connected to their chronological order by the permutation $\Pi_{n-1}$. More precisely,
\[ q(n - 1, i) = q_{n-1}^{(\Pi_{n-1}(i))} \qquad \text{for } 1 \le i \le d. \]
Clearly,
\[ q_n^{(0)} = q(n, 0) = m_n q(n - 1, 0) + q(n - 1, 1) = m_n q_{n-1}^{(0)} + q_{n-1}^{(\Pi_{n-1}(1))} \]
and
\[ q_n^{(i)} = \begin{cases} q_{n-1}^{(i-1)} & \text{if } 1 \le i \le \Pi_{n-1}(1); \\ q_{n-1}^{(i)} & \text{if } i > \Pi_{n-1}(1). \end{cases} \]
Note that
\[ \left\{ \frac{q_n^{(0)}}{q_n^{(1)}} \right\} = \left\{ \frac{m_n q_{n-1}^{(0)} + q_{n-1}^{(\Pi_{n-1}(1))}}{q_{n-1}^{(0)}} \right\} = \frac{q_{n-1}^{(\Pi_{n-1}(1))}}{q_{n-1}^{(0)}} \]
and
\[ \left[ \frac{q_n^{(0)}}{q_n^{(1)}} \right] = \left[ \frac{m_n q_{n-1}^{(0)} + q_{n-1}^{(\Pi_{n-1}(1))}}{q_{n-1}^{(0)}} \right] = m_n. \]
This implies that
\[
\begin{aligned}
T(\varphi_n) &= T\left( \frac{q_n^{(1)}}{q_n^{(0)}}, \ldots, \frac{q_n^{(d)}}{q_n^{(0)}} \right) \\
&= \mathrm{ord}\left( \frac{q_{n-1}^{(\Pi_{n-1}(1))}}{q_{n-1}^{(0)}}, \frac{q_{n-1}^{(1)}}{q_{n-1}^{(0)}}, \ldots, \frac{q_{n-1}^{(\Pi_{n-1}(1)-1)}}{q_{n-1}^{(0)}}, \frac{q_{n-1}^{(\Pi_{n-1}(1)+1)}}{q_{n-1}^{(0)}}, \ldots, \frac{q_{n-1}^{(d)}}{q_{n-1}^{(0)}} \right) \\
&= \varphi_{n-1},
\end{aligned}
\]
and $j(\varphi_n) = \Pi_{n-1}(1)$, $m(\varphi_n) = m_n$.

Lemma 4.1 demonstrates that $\varphi_n, \varphi_{n-1}, \varphi_{n-2}, \ldots$ is indeed a trajectory of the $d$-dimensional Gauss transformation $T$. However, $j(\varphi_n) = \Pi_{n-1}(1)$ and in general $j(\varphi_n) \ne j_n$. This means that the inverse branches connecting $\varphi_{n-1}$ and $\varphi_n$, and $\omega^{(n)}$ and $\omega^{(n-1)}$ are different. Instead of $j_n$ one has to use $l_n = \Pi_{n-1}(1)$. Then
\[ \varphi_n = T_{(m_n,l_n)}^{-1} \varphi_{n-1} \qquad \text{and} \qquad \omega^{(n-1)} = T_{(m_n,j_n)}^{-1} \omega^{(n)}. \]

Remark. Notice that the permutations $\Pi_n$ are not defined for small $n$. This is because, for sufficiently small $n$, several vertices of $\Delta_n$ have the same denominator. However, there exists a random variable $n(\omega)$ such that, for $n \ge n(\omega)$, the denominators are ordered and $\Pi_n$ is defined (see [6]).

4.2. Combinatorial properties and symmetry. In the previous section we introduced three variables: $j_n, l_n \in \{1, 2, \ldots, d\}$ and $\Pi_n \in S_d$. In this section we discuss the connections between them. We have already seen that $l_n = \Pi_{n-1}(1)$. For $1 \le i \le d$, let $\sigma_i = (\sigma_i(1), \ldots, \sigma_i(d))$ denote the permutation $(2, 3, \ldots, i, 1, i + 1, \ldots, d)$. Let $S_d(1) = \{\Pi \in S_d : \Pi(1) = 1\}$. Define $P : S_d \to S_d(1)$ by
\[ (P\Pi)(i) = \begin{cases} 1 & \text{if } i = 1; \\ \Pi(i) & \text{if } i > 1 \text{ and } \Pi(i) > \Pi(1); \\ \Pi(i) + 1 & \text{if } i > 1 \text{ and } \Pi(i) < \Pi(1). \end{cases} \tag{10} \]
It is easy to check that $P$ can be represented as multiplication by the permutation $\sigma_{\Pi(1)}$, namely $P\Pi = \sigma_{\Pi(1)} \cdot \Pi$. Here we adopt the convention that permutations are to be composed from right to left. More precisely, if $\hat{\Pi}$ is a bijection from $\{1, 2, \ldots, d\}$ to itself associated to the permutation $\Pi$, i.e. $\hat{\Pi} : i \mapsto \Pi(i)$, then $\widehat{\Pi_1 \cdot \Pi_2} = \hat{\Pi}_1 \circ \hat{\Pi}_2 = \hat{\Pi}_1(\hat{\Pi}_2)$. Define a permutation valued function
\[ E(\Pi, j) = (P\Pi) \cdot \sigma_j = \sigma_{\Pi(1)} \cdot \Pi \cdot \sigma_j. \]
Notice that multiplication by $\sigma_j$ transforms $P\Pi$ in the following way: the entry 1 moves from the first to the $j$th position.
Lemma 4.2. (i) $\Pi_n = E(\Pi_{n-1}, j_n)$, $l_n = \Pi_{n-1}(1)$.
(ii) Let $\bar{\Pi} = E(\Pi, j)$. Then $j$ is uniquely determined by $\bar{\Pi}$, in fact $j = (\bar{\Pi})^{-1}(1)$.
(iii) Let $\bar{\Pi} = E(\Pi, j)$, where $j = (\bar{\Pi})^{-1}(1)$. Denote $\tau = \Pi^{-1}$, $\bar{\tau} = (\bar{\Pi})^{-1}$. Then $\tau = E(\bar{\tau}, l)$, where $l = \Pi(1) = \tau^{-1}(1)$.
(iv) For all fixed $\bar{\Pi}$ and $l$ there exists a unique $\Pi$ such that $\Pi(1) = l$ and $\bar{\Pi} = E(\Pi, j)$, where $j = (\bar{\Pi})^{-1}(1)$. Moreover,
\[ \Pi = \left( E\left( (\bar{\Pi})^{-1}, l \right) \right)^{-1}. \]

Proof. (i) Notice that $\Pi_{n-1}$ and $j_n$ uniquely determine $\Pi_n$. It is easy to see that the definition of the function $E(\Pi, j)$ exactly corresponds to the process of determining $\Pi_n$ from $\Pi_{n-1}$ and $j_n$. The permutation $P\Pi$ corresponds to the order of the denominators when the new denominator is added and the denominator $q(n - 1, 1)$ is deleted. The permutation $(P\Pi_{n-1}) \cdot \sigma_{j_n}$ appears after the denominator $q(n - 1, 0)$ is placed in the $j_n$th position. Hence the first relation holds. The second relation is a trivial consequence of Lemma 4.1.
(ii) Obviously $\bar{\Pi}(j) = 1$. Hence $j = (\bar{\Pi})^{-1}(1)$.
(iii) Note that
\[ \bar{\Pi} = \sigma_{\Pi(1)} \cdot \Pi \cdot \sigma_j \qquad \text{and} \qquad \bar{\tau} = (\bar{\Pi})^{-1} = \sigma_j^{-1} \cdot \Pi^{-1} \cdot \sigma_{\Pi(1)}^{-1}. \]
Hence $\tau = \Pi^{-1} = \sigma_j \cdot \bar{\tau} \cdot \sigma_{\Pi(1)}$. Since $j = \bar{\tau}(1)$, it follows that $\tau = E(\bar{\tau}, l)$, where $l = \Pi(1) = \tau^{-1}(1)$.
(iv) The uniqueness of $\Pi$ follows from (iii). Indeed $\bar{\Pi} = E(\Pi, j)$ implies that $\Pi^{-1} = E\left( (\bar{\Pi})^{-1}, l \right)$, where $l = \Pi(1)$. Hence
\[ \Pi = \left( E\left( (\bar{\Pi})^{-1}, l \right) \right)^{-1}. \tag{11} \]
It is easy to check that $\Pi$ given by (11) satisfies $\bar{\Pi} = E(\Pi, j)$, $\Pi(1) = l$.

Consider a two-sided sequence $(m_n, j_n)_{n \in \mathbb{Z}}$. We will suppose that this sequence is typical with respect to the invariant measure of the natural extension of $T$. In particular this means that for any finite sequence $(m^{(1)}, j^{(1)}), (m^{(2)}, j^{(2)}), \ldots, (m^{(k)}, j^{(k)})$ there are infinitely many positive and negative integers $n$ such that
\[ (m_{n+s}, j_{n+s}) = (m^{(s)}, j^{(s)}), \qquad s = 1, \ldots, k. \]
This property is a consequence of Birkhoff's Ergodic Theorem, since for any finite sequence $(m^{(1)}, j^{(1)}), (m^{(2)}, j^{(2)}), \ldots, (m^{(k)}, j^{(k)})$,
\[ \tilde{\nu}\left( \{ (m_n, j_n)_{n \in \mathbb{Z}} : (m_s, j_s) = (m^{(s)}, j^{(s)}),\ s = 1, \ldots, k \} \right) > 0. \]
Denote a two-sided sequence $(j_n)_{n \in \mathbb{Z}}$ by $\mathcal{J}$ and let $\mathcal{E}$ denote a two-sided sequence of permutations $(\Pi_n)_{n \in \mathbb{Z}}$.

Definition 2. The sequence $\mathcal{E}$ is said to be compatible with $\mathcal{J}$ if, for any $n \in \mathbb{Z}$, $\Pi_n = E(\Pi_{n-1}, j_n)$.
[Figure 1: diagram of the permutations $\Pi_{n_0}, \Pi_{n_0+1}, \ldots, \Pi_{n_0+6}$, stacked from bottom to top and obtained by applying successively $E(\cdot, 3)$, $E(\cdot, 4)$, $E(\cdot, 2)$, $E(\cdot, 4)$, $E(\cdot, 4)$, $E(\cdot, 1)$; arrows follow the itinerary of each entry.]

Fig. 1. The orbit of the permutation $\Pi_{n_0} = (2, 4, 3, 1)$ under repeated applications of $E(\cdot, j_s)$. Numbers with a bar over them denote elements of the itineraries of the elements of $\Pi_{n_0}$
To establish the existence and uniqueness of a sequence $\mathcal{E}$ which is compatible with $\mathcal{J}$ we will need the following lemma:

Lemma 4.3. Suppose that $n_0 < n$ and the finite sequence $j_{n_0+1}, j_{n_0+2}, \ldots, j_{n-1}, j_n$ contains at least $d - 1$ entries $d$. For an arbitrary permutation $\Pi_{n_0}$ define
\[ \Pi_s = E(\Pi_{s-1}, j_s) \qquad \text{for } n_0 + 1 \le s \le n. \tag{12} \]
Then $\Pi_n$ depends only on the sequence $(j_s)_{n_0+1 \le s \le n}$ and it does not depend on $\Pi_{n_0}$.

Proof. The lemma has a purely combinatorial nature. We shall consider (12) as the iteration of a sequence of mappings $E(\cdot, j_s)$ acting on permutations with initial point $\Pi_{n_0}$. Each entry of the permutation $\Pi_s$ except the first one gets mapped into some entry of $\Pi_{s+1}$ which is either to the left of it or just above it (see Fig. 1). $E(\cdot, j_s)$ also produces one entry 1 in the $j_s$th position of $\Pi_s$ and terminates the first entry of $\Pi_{s-1}$. The whole process of iteration produces itineraries which originate either at one of the entries of the original permutation $\Pi_{n_0}$ or at one of the new ones. Notice that the itinerary of every newly produced element is independent of $\Pi_{n_0}$ and depends only on the future sequence of $j_s$'s. Hence the resulting permutation $\Pi_n$ is independent of the original permutation $\Pi_{n_0}$ if all the itineraries which start at the 0th level (i.e. the entries of $\Pi_{n_0}$) get terminated before $n$. Notice that if $j_s = d$ then all existing itineraries move one unit to the left, except the one which gets terminated. Thus after $d - 1$ iterations of $E(\cdot, d)$, all the itineraries which start at the 0th level will reach their leftmost position and will be terminated. Notice that because of monotonicity the last itinerary which will be terminated is the one which starts in the rightmost element of $\Pi_{n_0}$.

Let $\mathcal{D}$ denote the set of two-sided sequences $\mathcal{J} = (j_n)_{n \in \mathbb{Z}}$ for which there are infinitely many positive and negative integers $n$ such that $j_n = d$.

Proposition 4.4. If $\mathcal{J} \in \mathcal{D}$ then there exists a unique sequence $\mathcal{E} = (\Pi_n)_{n \in \mathbb{Z}}$ which is compatible with $\mathcal{J}$.

Proof. Uniqueness follows immediately from the previous lemma. To prove existence we consider a sequence of one-sided sequences $(\Pi_{-s}^{(s)}, \Pi_{-s+1}^{(s)}, \ldots)$, where $\Pi_{-s}^{(s)}$ is an arbitrary
permutation and $\Pi_n^{(s)} = E(\Pi_{n-1}^{(s)}, j_n)$, $n > -s$. It follows from the lemma that for any $n \in \mathbb{Z}$
\[ \Pi_n^{(s)} \to \Pi_n \qquad \text{as } s \to \infty, \]
which simply means that $\Pi_n^{(s)} = \Pi_n$ for $s$ large enough. Obviously, $\mathcal{E} = (\Pi_n)_{n \in \mathbb{Z}}$ is a sequence which is compatible with $\mathcal{J}$.

We can now give the definition of the compatibility of a sequence $\mathcal{L} = (l_n)_{n \in \mathbb{Z}}$ with $\mathcal{J}$. This definition follows from the relation $l_n = \Pi_{n-1}(1)$.

Definition 3. A sequence $\mathcal{L} = (l_n)_{n \in \mathbb{Z}}$ is said to be compatible with $\mathcal{J}$ if there exists a sequence $\mathcal{E} = (\Pi_n)_{n \in \mathbb{Z}}$ which is compatible with $\mathcal{J}$ and for which $l_n = \Pi_{n-1}(1)$ for all $n \in \mathbb{Z}$.

Proposition 4.5. For an arbitrary sequence $\mathcal{J} \in \mathcal{D}$ the sequence $\mathcal{L}$ which is compatible with $\mathcal{J}$ also belongs to $\mathcal{D}$.

Proof. Consider the itinerary of the entry of $\Pi_{n_0}$ which is equal to $d$ (see Lemma 4.3). It does not change its value, but can only change its position. It moves one unit to the left every time we apply $E(\cdot, d)$. After at most $d - 1$ applications of $E(\cdot, d)$ the itinerary of the entry $d$ will reach the leftmost position. Hence for any finite sequence $j_s$, $n_0 \le s \le n$, which contains at least $d - 1$ entries $d$, there is at least one $s$ for which $\Pi_s(1) = d$. This implies that $l_{s+1} = d$.

Recall that the sequence $\mathcal{L} = (l_n)_{n \in \mathbb{Z}}$ labels a backward sequence of the $d$-dimensional Gauss transformation. We can give the definition of the compatibility of a sequence of permutations $\mathcal{T} = (\tau_n)_{n \in \mathbb{Z}}$ with the sequence $\mathcal{L}$, and hence with $\mathcal{J}$. This definition is analogous to Definition 2.

Definition 4. (i) The sequence $\mathcal{T}$ is compatible with $\mathcal{L}$ if, for any $n \in \mathbb{Z}$, $\tau_n = E(\tau_{n+1}, l_{n+1})$.
(ii) The sequence $\mathcal{T}$ is compatible with $\mathcal{J}$ if there exists a sequence $\mathcal{L}$ which is compatible with $\mathcal{J}$ such that $\mathcal{T}$ is compatible with $\mathcal{L}$.

If $\mathcal{J} \in \mathcal{D}$ then $\mathcal{L} \in \mathcal{D}$ and hence by Proposition 4.4 there exist unique $\mathcal{E}$, $\mathcal{T}$ which are compatible with $\mathcal{J}$. The compatibility of $\mathcal{J}$, $\mathcal{E}$, $\mathcal{L}$ and $\mathcal{T}$ is presented graphically in Fig. 2.
\[
\begin{array}{c}
\cdots \xrightarrow{\;E(\cdot, j_n)\;} \Pi_n \xrightarrow{\;E(\cdot, j_{n+1})\;} \Pi_{n+1} \xrightarrow{\;E(\cdot, j_{n+2})\;} \cdots \\[4pt]
\cdots \xleftarrow{\;E(\cdot, l_n)\;} \tau_n \xleftarrow{\;E(\cdot, l_{n+1})\;} \tau_{n+1} \xleftarrow{\;E(\cdot, l_{n+2})\;} \cdots
\end{array}
\]
Fig. 2. The compatibility of $(\mathcal{E}, \mathcal{J})$ and $(\mathcal{T}, \mathcal{L})$
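The loss of memory asserted in Lemma 4.3, which underlies the uniqueness in Proposition 4.4, can be observed directly. The sketch below takes $d = 4$ and the sequence $3, 4, 2, 4, 4, 1$ of values $j_s$ used in Fig. 1 (it contains $d - 1 = 3$ entries equal to $d$), starts the recursion (12) from every permutation in $S_4$ and collects the resulting final permutations. The encoding of permutations as tuples of values and all function names are illustrative assumptions.

```python
from itertools import permutations


def sigma(i, d):
    """sigma_i = (2, 3, ..., i, 1, i+1, ..., d)."""
    return tuple(list(range(2, i + 1)) + [1] + list(range(i + 1, d + 1)))


def compose(p1, p2):
    """Right-to-left product: (p1 . p2)(i) = p1(p2(i))."""
    return tuple(p1[p2[k] - 1] for k in range(len(p2)))


def E(perm, j):
    """E(perm, j) = sigma_{perm(1)} . perm . sigma_j."""
    d = len(perm)
    return compose(compose(sigma(perm[0], d), perm), sigma(j, d))


def iterate(perm, js):
    """Apply the recursion (12): perm_s = E(perm_{s-1}, j_s) for each j_s in js."""
    for j in js:
        perm = E(perm, j)
    return perm


d = 4
js = [3, 4, 2, 4, 4, 1]                    # the sequence of Fig. 1
finals = {iterate(start, js) for start in permutations(range(1, d + 1))}

# The orbit of the permutation shown in Fig. 1, listed from n_0 to n_0 + 6.
orbit = [(2, 4, 3, 1)]
for j in js:
    orbit.append(E(orbit[-1], j))
```

All 24 starting permutations lead to the same final permutation, so `finals` has a single element; `orbit` lists the intermediate permutations for the starting point of Fig. 1.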
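Theorem 1. Suppose $\mathcal{J} \in \mathcal{D}$. Let $\mathcal{T} = (\tau_n)_{n \in \mathbb{Z}}$ and $\mathcal{E} = (\Pi_n)_{n \in \mathbb{Z}}$ be compatible with $\mathcal{J}$. Then, for any $n \in \mathbb{Z}$, $\tau_n = \Pi_n^{-1}$.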
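Proof. The sequence $\mathcal{T}$ is uniquely defined by the condition of compatibility and thus it is enough to check that the sequence $(\Pi_n^{-1})_{n \in \mathbb{Z}}$ is indeed compatible with $\mathcal{L}$. Since
\[ \Pi_{n+1}^{-1} = (E(\Pi_n, j_{n+1}))^{-1} = (\sigma_{\Pi_n(1)} \cdot \Pi_n \cdot \sigma_{j_{n+1}})^{-1} = \sigma_{j_{n+1}}^{-1} \cdot \Pi_n^{-1} \cdot \sigma_{\Pi_n(1)}^{-1} \]
we have
\[ \Pi_n^{-1} = \sigma_{j_{n+1}} \cdot \Pi_{n+1}^{-1} \cdot \sigma_{\Pi_n(1)}. \]
Notice that $j_{n+1} = \Pi_{n+1}^{-1}(1)$ and $l_{n+1} = \Pi_n(1)$. Thus
\[ \Pi_n^{-1} = \sigma_{\Pi_{n+1}^{-1}(1)} \cdot \Pi_{n+1}^{-1} \cdot \sigma_{l_{n+1}} = E(\Pi_{n+1}^{-1}, l_{n+1}). \]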
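The single step used in this proof, namely that $\bar{\Pi} = E(\Pi, j)$ forces $j = \bar{\Pi}^{-1}(1)$ and $\Pi^{-1} = E(\bar{\Pi}^{-1}, \Pi(1))$, is a finite statement and can be checked exhaustively for small $d$. The following Python sketch does so for $d = 4$; the helper names and the tuple encoding of permutations are illustrative assumptions.

```python
from itertools import permutations


def sigma(i, d):
    """sigma_i = (2, 3, ..., i, 1, i+1, ..., d)."""
    return tuple(list(range(2, i + 1)) + [1] + list(range(i + 1, d + 1)))


def compose(p1, p2):
    """Right-to-left product: (p1 . p2)(i) = p1(p2(i))."""
    return tuple(p1[p2[k] - 1] for k in range(len(p2)))


def inverse(perm):
    """Inverse permutation, again as a tuple of values."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inv[value - 1] = position
    return tuple(inv)


def E(perm, j):
    """E(perm, j) = sigma_{perm(1)} . perm . sigma_j."""
    d = len(perm)
    return compose(compose(sigma(perm[0], d), perm), sigma(j, d))


d = 4
j_is_recovered = True
backward_step_holds = True
for perm in permutations(range(1, d + 1)):
    for j in range(1, d + 1):
        perm_bar = E(perm, j)
        l = perm[0]
        # j is read off from perm_bar as the position of the entry 1.
        j_is_recovered &= (inverse(perm_bar)[0] == j)
        # The inverse permutations are linked by E with the label l = perm(1).
        backward_step_holds &= (inverse(perm) == E(inverse(perm_bar), l))
```

Both flags remain `True` over all $24 \times 4$ pairs $(\Pi, j)$, in agreement with the relation $\tau_n = E(\tau_{n+1}, l_{n+1})$ for $\tau_n = \Pi_n^{-1}$.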
Consider a representation of the group $S_d$ by permutation matrices. Namely, for any $\Pi \in S_d$ consider a $d$-dimensional permutation matrix $V(\Pi)$ which has 1 in the positions $(\Pi(1), 1), (\Pi(2), 2), \ldots, (\Pi(d), d)$ and 0's elsewhere. Let $Q(\Pi)$ be the $(d + 1)$-dimensional matrix
\[ \begin{pmatrix} 1 & 0 \\ 0 & V(\Pi) \end{pmatrix}. \]
Notice that $Q(\Pi)$ gives a $(d + 1)$-dimensional representation of the group $S_d$, i.e. $Q(\Pi \cdot \bar{\Pi}) = Q(\Pi) Q(\bar{\Pi})$ and $Q(\Pi^{-1}) = (Q(\Pi))^{-1}$. Since the matrices $Q(\Pi)$ are orthogonal, we also have $Q^t(\Pi) = (Q(\Pi))^{-1} = Q(\Pi^{-1})$. If $\Pi(1) = 1$ then $V(\Pi)$ is of the form
\[ \begin{pmatrix} 1 & 0 \\ 0 & W(\Pi) \end{pmatrix}, \]
where $W(\Pi)$ is a $(d - 1)$-dimensional permutation matrix which has entries 1 in the positions $(\Pi(i + 1) - 1, i)$, $1 \le i \le d - 1$. Again we have $W(\Pi \cdot \bar{\Pi}) = W(\Pi) W(\bar{\Pi})$ and $(W(\Pi))^{-1} = W(\Pi^{-1}) = W^t(\Pi)$ assuming that $\Pi(1) = \bar{\Pi}(1) = 1$. Recall the definition of the matrix $\tilde{A}_{(m,j)}$ (see Eq. (9) in Sect. 3), and that $A_{(m,j)} = \tilde{A}_{(m,j)}^t$.

Proposition 4.6. (i) For arbitrary $m$ and $\Pi$,
\[ \tilde{A}_{(m,\Pi(1))} Q(\Pi) = \begin{pmatrix} m & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & W(P\Pi) \end{pmatrix}, \tag{13} \]
where $P\Pi = \sigma_{\Pi(1)} \cdot \Pi$.
(ii) For arbitrary $\Pi$ and $j$,
\[ P\Pi = (P\bar{\tau})^{-1}, \tag{14} \]
where $\bar{\tau} = (\bar{\Pi})^{-1}$ and $\bar{\Pi} = E(\Pi, j)$.
(iii) For arbitrary $m$, $j$ and $\Pi$,
\[ A_{(m,j)} = Q^{-1}(\bar{\Pi}) \tilde{A}_{(m,l)} Q(\Pi), \tag{15} \]
where $\bar{\Pi} = E(\Pi, j)$ and $l = \Pi(1)$.
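Proof. (i) Clearly the first column of $\tilde{A}_{(m,\Pi(1))} Q(\Pi)$ coincides with the first column of $\tilde{A}_{(m,\Pi(1))}$. For $2 \le i \le d + 1$, the $i$th column of $\tilde{A}_{(m,\Pi(1))} Q(\Pi)$ is equal to the $(\Pi(i - 1) + 1)$th column of $\tilde{A}_{(m,\Pi(1))}$. This immediately implies that (13) is correct for the second column. Also, if $i > 2$ then the $i$th column has only one non-zero entry, which is the entry 1 in the row $(\Pi(i - 1) + 2)$ if $\Pi(i - 1) < \Pi(1)$ or in the row $(\Pi(i - 1) + 1)$ if $\Pi(i - 1) > \Pi(1)$. Using (10) we find that the entry 1 is located in the $((P\Pi)(i - 1) + 1)$th row. Now consider the minor corresponding to the last $d - 1$ rows and columns of $\tilde{A}_{(m,\Pi(1))} Q(\Pi)$. Take $k = i - 2$ and consider the $k$th column. The entry 1 is located in the $((P\Pi)(k + 1) + 1 - 2) = ((P\Pi)(k + 1) - 1)$th row. This implies (13).
(ii) Using $j = (\bar{\Pi})^{-1}(1)$ we get
\[
\begin{aligned}
(P\bar{\tau})^{-1} &= (\sigma_{\bar{\tau}(1)} \cdot \bar{\tau})^{-1} = (\bar{\tau})^{-1} \cdot \sigma_{\bar{\tau}(1)}^{-1} = \bar{\Pi} \cdot \sigma_{\bar{\tau}(1)}^{-1} \\
&= E(\Pi, j) \cdot \sigma_{\bar{\tau}(1)}^{-1} = \sigma_{\Pi(1)} \cdot \Pi \cdot \sigma_j \cdot \sigma_{\bar{\tau}(1)}^{-1} = \sigma_{\Pi(1)} \cdot \Pi \cdot \sigma_j \cdot \sigma_{(\bar{\Pi})^{-1}(1)}^{-1} = \sigma_{\Pi(1)} \cdot \Pi = P\Pi.
\end{aligned}
\]
(iii) It is enough to show that $Q(\bar{\Pi}) A_{(m,j)} = \tilde{A}_{(m,l)} Q(\Pi)$. Using (13) we get
\[ \tilde{A}_{(m,l)} Q(\Pi) = \tilde{A}_{(m,\Pi(1))} Q(\Pi) = \begin{pmatrix} m & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & W(P\Pi) \end{pmatrix}. \]
We also have
\[
\begin{aligned}
Q(\bar{\Pi}) A_{(m,j)} &= \left( A_{(m,j)}^t Q^t(\bar{\Pi}) \right)^t = \left( \tilde{A}_{(m,j)} Q\left( (\bar{\Pi})^{-1} \right) \right)^t = \left( \tilde{A}_{(m,(\bar{\Pi})^{-1}(1))} Q\left( (\bar{\Pi})^{-1} \right) \right)^t \\
&= \begin{pmatrix} m & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & W\left( P(\bar{\Pi})^{-1} \right) \end{pmatrix}^t = \begin{pmatrix} m & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & W^t\left( P(\bar{\Pi})^{-1} \right) \end{pmatrix} \\
&= \begin{pmatrix} m & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & W\left( \left( P(\bar{\Pi})^{-1} \right)^{-1} \right) \end{pmatrix}.
\end{aligned}
\]
Using (14) we have $P\Pi = \left( P(\bar{\Pi})^{-1} \right)^{-1}$ which implies (15).

We now formulate a theorem which relates the product of the matrices $A_{(m_n,j_n)}$ to the product of the matrices $A_{(m_n,l_n)}$.

Theorem 2. Suppose $\mathcal{E}$ and $\mathcal{L}$ are compatible with $\mathcal{J}$. Then for an arbitrary sequence $\mathcal{M} = (m_n)_{n \in \mathbb{Z}}$ and arbitrary $a < b$ we have
\[ \left( A_{(m_a,l_a)} \cdots A_{(m_b,l_b)} \right)^t = Q(\Pi_b) A_{(m_b,j_b)} \cdots A_{(m_a,j_a)} Q^{-1}(\Pi_{a-1}). \tag{16} \]

Proof. It follows from Proposition 4.6 and the compatibility of $\mathcal{E}$, $\mathcal{L}$ and $\mathcal{J}$ that for any $n$
\[ Q(\Pi_n) A_{(m_n,j_n)} Q^{-1}(\Pi_{n-1}) = \tilde{A}_{(m_n,l_n)}. \]
Taking the product over $a \le n \le b$ we get (16).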
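The one-step identity behind Theorem 2, $Q(\bar{\Pi}) A_{(m,j)} = \tilde{A}_{(m,l)} Q(\Pi)$ with $\bar{\Pi} = E(\Pi, j)$ and $l = \Pi(1)$, involves only integer matrices and can be verified by brute force. The Python sketch below does this for $d = 3$ and a few values of $m$. It assumes the conventions recalled above: $\tilde{A}_{(m,j)}$ as in (9), $A_{(m,j)}$ its transpose, and $V(\Pi)$ with entries 1 in the positions $(\Pi(i), i)$. All function names are illustrative.

```python
from itertools import permutations


def sigma(i, d):
    """sigma_i = (2, 3, ..., i, 1, i+1, ..., d)."""
    return tuple(list(range(2, i + 1)) + [1] + list(range(i + 1, d + 1)))


def compose(p1, p2):
    """Right-to-left product: (p1 . p2)(i) = p1(p2(i))."""
    return tuple(p1[p2[k] - 1] for k in range(len(p2)))


def E(perm, j):
    """E(perm, j) = sigma_{perm(1)} . perm . sigma_j."""
    d = len(perm)
    return compose(compose(sigma(perm[0], d), perm), sigma(j, d))


def a_tilde(m, j, d):
    """The matrix of formula (9), written with 0-based indices."""
    a = [[0] * (d + 1) for _ in range(d + 1)]
    a[0][0] = m
    a[0][j] = 1
    for i in range(2, j + 2):
        a[i - 1][i - 2] = 1
    for i in range(j + 2, d + 2):
        a[i - 1][i - 1] = 1
    return a


def Q(perm):
    """Block matrix diag(1, V(perm)), V having a 1 in each position (perm(i), i)."""
    d = len(perm)
    q = [[0] * (d + 1) for _ in range(d + 1)]
    q[0][0] = 1
    for i in range(1, d + 1):
        q[perm[i - 1]][i] = 1
    return q


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def one_step_identity_holds(perm, j, m):
    d = len(perm)
    perm_bar, l = E(perm, j), perm[0]
    lhs = matmul(Q(perm_bar), transpose(a_tilde(m, j, d)))   # Q(perm_bar) A_(m,j)
    rhs = matmul(a_tilde(m, l, d), Q(perm))                  # A~_(m,l) Q(perm)
    return lhs == rhs


all_hold = all(one_step_identity_holds(p, j, m)
               for p in permutations(range(1, 4))
               for j in (1, 2, 3)
               for m in (1, 2, 5))
```

Chaining this identity along a compatible sequence is what produces (16): the factors $Q^{-1}(\Pi_{n-1}) Q(\Pi_{n-1})$ between consecutive steps cancel.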
Remark. As we have seen above, the product of the matrices $A_{(m_n,j_n)}$ produces the approximations corresponding to the $d$-dimensional Gauss transformation. Theorem 2 says that forward iteration of the $d$-dimensional Gauss transformation and backward iteration produce the same matrix up to transposition and a change in the order of the rows and the columns. Notice that $Q^{-1}(\Pi_{a-1}) = Q(\tau_{a-1})$. One can say that $\Pi_{a-1}$ determines the correct order of the rows and $\tau_b$ the correct order of the columns.

Let us give one more definition which we shall use below. Let $N$ be an arbitrary subset of $\mathbb{Z}$. Denote $\mathcal{E}_N = (\Pi_n)_{n \in N}$, $\mathcal{L}_N = (l_n)_{n \in N}$, $\mathcal{T}_N = (\tau_n)_{n \in N}$.

Definition 5. A configuration $\mathcal{E}_N$ (respectively $\mathcal{L}_N$, $\mathcal{T}_N$) is said to be compatible with $\mathcal{J}$ if there exists $\mathcal{E}$ (respectively $\mathcal{L}$, $\mathcal{T}$) which is compatible with $\mathcal{J}$ and is such that $\mathcal{E}|_N = \mathcal{E}_N$ (respectively $\mathcal{L}|_N = \mathcal{L}_N$, $\mathcal{T}|_N = \mathcal{T}_N$).

4.3. Coordinates for the natural extension. The aim of this section is to define new coordinates for the natural extension of the $d$-dimensional Gauss transformation. Instead of using a two-sided sequence $(m_n, j_n)_{n \in \mathbb{Z}}$, we use a two-sided sequence $\mathcal{M} = (m_n)_{n \in \mathbb{Z}}$ and two one-sided sequences $\mathcal{L}_- = (l_n)_{n \le 0}$ and $\mathcal{J}_+ = (j_n)_{n \ge 1}$, where $\mathcal{L}_-$ is a subsequence of the unique sequence $\mathcal{L}$ which is compatible with $\mathcal{J}$. We also use a discrete coordinate $\Pi_0 \in S_d$ which is the 0th entry of the sequence $\mathcal{E}$ which is compatible with $\mathcal{J}$. As we have seen above, $\mathcal{L}_-$ and $\Pi_0$ are uniquely determined by $\mathcal{J}$ if $\mathcal{J} \in \mathcal{D}$. The converse is also true: for arbitrary $(\mathcal{L}_-, \mathcal{J}_+, \Pi_0)$ there exists a unique $\mathcal{J}$ such that $\Pi_0$ and $\mathcal{L}_-$ are compatible with $\mathcal{J}$. Let $\mathcal{D}_+$ (respectively $\mathcal{D}_-$) denote the set of one-sided sequences $\mathcal{J}_+ = (j_n)_{n \ge 1}$ (respectively $\mathcal{L}_- = (l_n)_{n \le 0}$) which contain infinitely many entries equal to $d$.

Proposition 4.7. If $\mathcal{L}_- \in \mathcal{D}_-$ and $\mathcal{J}_+ \in \mathcal{D}_+$ then for any $\Pi_0$ there exists a unique sequence $\mathcal{J} = (j_n)_{n \in \mathbb{Z}} \in \mathcal{D}$ which coincides with $\mathcal{J}_+$ for $n \ge 1$ and is such that $\Pi_0$ and $\mathcal{L}_-$ are compatible with $\mathcal{J}$.

Proof. It follows from Theorem 1 that
\[ \Pi_{n-1}^{-1} = E(\Pi_n^{-1}, l_n). \]
Applying this formula repeatedly to $\Pi_0$ and the sequence $(l_0, l_{-1}, l_{-2}, \ldots)$ we can define the sequence $(\Pi_{-1}, \Pi_{-2}, \ldots)$. Hence we can determine $(j_0, j_{-1}, j_{-2}, \ldots)$ using $j_n = \Pi_n^{-1}(1)$. Obviously, $\mathcal{J}$ is the only sequence that can be compatible with $\mathcal{L}_-$, $\mathcal{J}_+$ and $\Pi_0$. To see that it is indeed compatible it is enough to show that $\mathcal{J} \in \mathcal{D}$. This easily follows from the argument used in Proposition 4.5.

Propositions 4.4, 4.5 and 4.7 imply that the mapping from $\{(\mathcal{M}, \mathcal{J}) : \mathcal{J} \in \mathcal{D}\}$ into $\{(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+, \Pi_0) : \mathcal{L}_- \in \mathcal{D}_-, \mathcal{J}_+ \in \mathcal{D}_+\}$ is a bijection. Let $\nu_{-,+}$ denote the measure on $\{(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+, \Pi_0) : \mathcal{L}_- \in \mathcal{D}_-, \mathcal{J}_+ \in \mathcal{D}_+\}$ which is the image of the natural extension's invariant measure $\tilde{\nu}$ under this bijection. Denote the projection of $\nu_{-,+}$ onto $\{(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+) : \mathcal{L}_- \in \mathcal{D}_-, \mathcal{J}_+ \in \mathcal{D}_+\}$ by $\bar{\nu}_{-,+}$.

Next we associate two vectors $x = (x_1, \ldots, x_d)$, $y = (y_1, \ldots, y_d) \in \Delta^d$ to the sequences $(\mathcal{M}_+, \mathcal{J}_+)$, $(\mathcal{M}_-, \mathcal{L}_-)$ (where $\mathcal{M}_- = (m_n)_{n \le 0}$, $\mathcal{M}_+ = (m_n)_{n \ge 1}$). We do this by regarding the sequences as symbolic representations of $y$ and $x$ corresponding to the $d$-dimensional Gauss transformation. More precisely,
\[ y = [(m_0, l_0), (m_{-1}, l_{-1}), (m_{-2}, l_{-2}), \ldots], \qquad x = [(m_1, j_1), (m_2, j_2), \ldots]. \]
D. M. Hardcastle, K. Khanin
We will show that this mapping from $\{(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+) : \mathcal{L}_- \in D_-, \mathcal{J}_+ \in D_+\}$ into $\{(y, x) : y, x \in \Delta^d\}$ is well-defined on a set of full $\bar\nu_{-,+}$-measure. Let $\hat\Phi$ denote the inverse mapping which associates $((\mathcal{M}_-, \mathcal{L}_-), (\mathcal{M}_+, \mathcal{J}_+)) = (\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+)$ to $(y, x)$:
$$\hat\Phi(y, x) = ((\mathcal{M}_-, \mathcal{L}_-), (\mathcal{M}_+, \mathcal{J}_+)) = (\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+).$$
Clearly, $\hat\Phi$ is well-defined if $x$ and $y$ and their orbits $(T^n x)_{n \ge 1}$, $(T^n y)_{n \ge 1}$ under the Gauss transformation do not belong to the boundary of $\Delta^d$.

Proposition 4.8. $\hat\Phi$ is a bijection between a set of full Lebesgue measure in $\Delta^d \times \Delta^d$ and a set of full $\bar\nu_{-,+}$-measure in $\{(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+) : \mathcal{L}_- \in D_-, \mathcal{J}_+ \in D_+\}$.

Proof. Let $\mathfrak{M}$ denote the set of $(\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+)$ for which there are infinitely many positive $n$'s such that
$$(m_{n+s}, j_{n+s}) = (1, d), \qquad 0 \le s \le 2d - 1,$$
and infinitely many negative $n$'s such that
$$(m_{n+s}, l_{n+s}) = (1, d), \qquad 0 \le s \le 2d - 1.$$
Let $Z = \hat\Phi^{-1}(\mathfrak{M})$, i.e. $Z$ is the preimage of $\mathfrak{M}$ under $\hat\Phi$. It follows from [6] that $Z$ has full Lebesgue measure and that $\hat\Phi$ is a bijection between $Z$ and $\mathfrak{M}$. To show that $\mathfrak{M}$ has full $\bar\nu_{-,+}$-measure, consider a set $\mathfrak{N}$ of sequences $(\mathcal{M}, \mathcal{J})$ such that for infinitely many positive and negative $n$'s,
$$(m_{n+s}, j_{n+s}) = (1, d), \qquad 0 \le s \le 3d - 1.$$
Clearly $\hat\nu(\mathfrak{N}) = 1$. Notice that if $(\mathcal{M}, \mathcal{J})$ has a piece of length $3d$ consisting of $(1, d)$'s then $(\mathcal{M}, \mathcal{L})$ has a corresponding piece of length $2d$ which consists of $(1, d)$'s. This implies that $\bar\nu_{-,+}(\mathfrak{M}) \ge \hat\nu(\mathfrak{N})$ and hence $\bar\nu_{-,+}(\mathfrak{M}) = 1$.

We are now ready to define an automorphism $\hat T$ which is metrically isomorphic to the natural extension of $T$. The definition below describes the unit shift on the space of two-sided sequences in terms of the coordinates $(y, \varepsilon, x)$. For $x, y \in \Delta^d$ and $\varepsilon \in S_d$ let $\hat T(y, \varepsilon, x) = (y', \varepsilon', x')$, where
(i) $x' = T(x)$,
(ii) $\varepsilon' = E(\varepsilon, j(x))$,
(iii) $y' = T^{-1}_{(m(x), l)}(y)$, where $l = \varepsilon(1)$, i.e.
$$y' = \left( \frac{1}{m(x) + y_l}, \frac{y_1}{m(x) + y_l}, \dots, \frac{y_{l-1}}{m(x) + y_l}, \frac{y_{l+1}}{m(x) + y_l}, \dots, \frac{y_d}{m(x) + y_l} \right).$$

Although this definition appears to be a bit complicated, it does indeed correspond to the forward and backward dynamics of the d-dimensional Gauss transformation. We hope that the properties which we will describe in the next section will provide ample motivation for our definition of $\hat T$.
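The forward and backward updates (i) and (iii) can be sketched numerically. This is only an illustration under stated assumptions: the forward map is taken to insert $1/x_1 - m(x)$ at position $j(x)$ among the decreasingly ordered ratios $x_i/x_1$, ties are broken towards the smaller index (our choice), and the permutation update (ii) is omitted because $E$ is defined earlier in the paper. The function names `gauss_step` and `backward_step` are hypothetical.

```python
import math

def gauss_step(x):
    """One forward step on a decreasing vector x with x[0] > 0.

    Returns (x_new, m, j) with m = floor(1/x_1) and j the 1-based
    position at which 1/x_1 - m is inserted (assumed ordering rule).
    """
    x1 = x[0]
    m = math.floor(1.0 / x1)
    frac = 1.0 / x1 - m
    rest = [xi / x1 for xi in x[1:]]           # x_2/x_1, ..., x_d/x_1
    j = 1 + sum(1 for r in rest if r > frac)   # tie-breaking is our choice
    x_new = rest[: j - 1] + [frac] + rest[j - 1:]
    return x_new, m, j

def backward_step(y, m, l):
    """Backward update (iii): divide by m + y_l and move 1/(m + y_l) to the front."""
    denom = m + y[l - 1]
    return [1.0 / denom] + [yi / denom for i, yi in enumerate(y, start=1) if i != l]

x = [0.7, 0.4, 0.2]
y = [0.6, 0.3, 0.1]
x_new, m, j = gauss_step(x)
y_new = backward_step(y, m, l=2)   # l = eps(1) is supplied by hand here
```

In this sketch the quantity $x'_j + 1/y'_1$ equals $1/x_1 + y_l$, since $x'_j = 1/x_1 - m(x)$ and $1/y'_1 = m(x) + y_l$.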
Remarks. (i) The transformation $\hat T$ is well-defined when $x$ does not belong to the boundary of $\Delta^d$.
(ii) In the one-dimensional case, the transformation $\hat T$ coincides with the natural extension defined in Sect. 2. In this one-dimensional setting, the discrete coordinate $\varepsilon$ is absent.
(iii) In the two-dimensional case, $\varepsilon$ can be only one of two permutations: (1,2) and (2,1). We will say that $\varepsilon = 1$ in the first case and $\varepsilon = 2$ in the second case. With this notation, Lemma 4.2 can be simplified. It is easy to see that $\varepsilon_n = j_n$ and $l_n = \varepsilon_{n-1}$. Thus $l_n = j_{n-1}$, i.e. the sequence of $j$'s is just the unit shift of the sequence of $l$'s. In this two-dimensional case, it is especially easy to see that $\mathcal{L}_-$, $\mathcal{J}_+$ and $\varepsilon_0$ allow one to construct the whole sequence of $j$'s. Indeed $j_0 = \varepsilon_0$ and $j_n = l_{n+1}$ for $n \le -1$.
(iv) Let $x^{(0)} \in \Delta^d$ and $\varepsilon_0 \in S_d$ be arbitrary, and let $y^{(0)} = (0, \dots, 0)$. Define $(y^{(n)}, \varepsilon_n, x^{(n)}) = \hat T^n(y^{(0)}, \varepsilon_0, x^{(0)})$. Then
$$y^{(n)} = \left( \frac{q_n^{(1)}}{q_n^{(0)}}, \dots, \frac{q_n^{(d)}}{q_n^{(0)}} \right).$$
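Remark (iv) can be illustrated in the one-dimensional case, where the backward update reduces to $y \mapsto 1/(m + y)$. The sketch below assumes that for $d = 1$ the quantity $q_n^{(0)}$ is the classical continued fraction denominator $q_n$ and that $q_n^{(1)} = q_{n-1}$; this identification is our assumption and is not stated in the text. The helper names are hypothetical.

```python
from fractions import Fraction

def backward_orbit(partial_quotients):
    """Iterate y -> 1/(m + y) from y = 0 with exact rational arithmetic."""
    y = Fraction(0)
    orbit = []
    for m in partial_quotients:
        y = 1 / (m + y)
        orbit.append(y)
    return orbit

def denominators(partial_quotients):
    """Classical recurrence q_n = m_n * q_{n-1} + q_{n-2}, with q_{-1} = 0, q_0 = 1."""
    q_prev, q = 0, 1
    qs = [q]
    for m in partial_quotients:
        q_prev, q = q, m * q + q_prev
        qs.append(q)
    return qs

quotients = [2, 3, 4]
orbit = backward_orbit(quotients)   # y^(1), y^(2), y^(3)
qs = denominators(quotients)        # q_0, q_1, q_2, q_3
```

Under the stated assumption, each $y^{(n)}$ in `orbit` is the ratio $q_{n-1}/q_n$ of consecutive entries of `qs`.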
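5. Properties of the d-Dimensional Gauss Transformation

In this section we formulate and prove the most important properties of $\hat T$. Let us denote the cardinality of a set $S$ by $\#(S)$. We define a measure $\chi$ on $\Delta^d \times S_d \times \Delta^d$ by
$$\chi(A_1 \times S \times A_2) = \left( \int_{A_1} dy \right) \left( \int_{A_2} dx \right) (\#(S))$$
for Borel subsets $A_1$ and $A_2$ of $\Delta^d$ and $S \subset S_d$, i.e. $\chi$ is the direct product of Lebesgue measure on each copy of $\Delta^d$ and the counting measure on $S_d$. Denote $\tilde\Delta^d = \{x \in \Delta^d : T^n x \in \mathrm{Int}(\Delta^d) \text{ for all } n \ge 0\}$. Obviously, $\chi(\tilde\Delta^d \times S_d \times \tilde\Delta^d) = \chi(\Delta^d \times S_d \times \Delta^d)$, i.e. $\tilde\Delta^d \times S_d \times \tilde\Delta^d$ is a set of full $\chi$-measure in $\Delta^d \times S_d \times \Delta^d$.

Proposition 5.1. $\hat T$ is a bijection from $\tilde\Delta^d \times S_d \times \tilde\Delta^d$ to itself.

Proof. It is easy to see that $y' = T^{-1}_{(m(x), l)}(y) \in \mathrm{Int}\,\Delta^d$ whenever $y \in \mathrm{Int}\,\Delta^d$. Hence, $\hat T$ maps $\tilde\Delta^d \times S_d \times \tilde\Delta^d$ into itself. The invertibility of $\hat T$ on $\tilde\Delta^d \times S_d \times \tilde\Delta^d$ follows immediately from its definition. Indeed, given $(y', \varepsilon', x') \in \tilde\Delta^d \times S_d \times \tilde\Delta^d$ define $y = T y'$, $\varepsilon = \left( E((\varepsilon')^{-1}, l) \right)^{-1}$ and $x = T^{-1}_{(m, j)} x'$, where $l = j(y')$, $m = m(y')$ and $j = (\varepsilon')^{-1}(1)$. Then it is easy to check that $(y, \varepsilon, x)$ is the unique point in $\tilde\Delta^d \times S_d \times \tilde\Delta^d$ such that $\hat T(y, \varepsilon, x) = (y', \varepsilon', x')$.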
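We now consider an invariant measure for $\hat T$. Let $\hat\mu$ be the probability measure on $\Delta^d \times S_d \times \Delta^d$ which, with respect to $\chi$, has the density
$$\frac{d\hat\mu}{d\chi}(y, \varepsilon, x) = \frac{1}{C} \, \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}},$$
where $C$ is a normalising constant:
$$C = \int_{\Delta^d \times S_d \times \Delta^d} \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}} \, \chi(dy, d\varepsilon, dx).$$
We will also use the notation
$$f_\varepsilon(y, x) = \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}}.$$

Theorem 3. $\hat\mu$ is an invariant measure for $\hat T$.

Proof. Consider a set $A_y \times \{\varepsilon\} \times A_x$, where $A_y, A_x \subset \Delta^d$ and $\varepsilon \in S_d$. Denote
$$A_x^j = \{x \in A_x : j(x) = j\}, \qquad 1 \le j \le d.$$
Then
$$\hat\mu(A_y \times \{\varepsilon\} \times A_x) = \frac{1}{C} \sum_{j=1}^d \int_{A_y \times A_x^j} f_\varepsilon(y, x) \, dy \, dx.$$
Let $\hat T_\varepsilon : \Delta^d \times \Delta^d \to \Delta^d \times \Delta^d$ be the restriction of $\hat T$ to the variables $(y, x)$ with $\varepsilon$ fixed. Then the measure of $\hat T(A_y \times \{\varepsilon\} \times A_x)$ is given by
$$\begin{aligned}
\hat\mu\left(\hat T(A_y \times \{\varepsilon\} \times A_x)\right) &= \frac{1}{C} \sum_{j=1}^d \int_{\hat T_\varepsilon(A_y \times A_x^j)} f_{\varepsilon_j}(y', x') \, dy' \, dx' \\
&= \frac{1}{C} \sum_{j=1}^d \int_{A_y \times A_x^j} f_{\varepsilon_j}\left(\hat T_\varepsilon(y, x)\right) |\mathrm{Jac}_\varepsilon(y, x)| \, dy \, dx,
\end{aligned}$$
where $\varepsilon_j = E(\varepsilon, j)$ and $\mathrm{Jac}_\varepsilon$ denotes the Jacobian of the transformation $\hat T_\varepsilon$. In order to prove that $\hat\mu$ is an invariant measure one has to show that
$$f_\varepsilon(y, x) = f_{\varepsilon_j}\left(\hat T_\varepsilon(y, x)\right) |\mathrm{Jac}_\varepsilon(y, x)|$$
for all $x$ such that $j(x) = j$. This can be shown directly. Indeed a simple calculation shows that
$$|\mathrm{Jac}_\varepsilon(y, x)| = \left( \frac{1}{(y_{\varepsilon(1)} + m(x)) \, x_1} \right)^{d+1}.$$
Thus
$$f_{\varepsilon_j}(y', x') \, |\mathrm{Jac}_\varepsilon(y, x)| = \frac{1}{\left(1 + \sum_{i=1}^d x'_i y'_{\varepsilon_j(i)}\right)^{d+1}} \left( \frac{1}{(y_{\varepsilon(1)} + m(x)) \, x_1} \right)^{d+1},$$
where $(y', x') = \hat T_\varepsilon(y, x)$. Since $x'_j = \frac{1}{x_1} - m(x)$, $\varepsilon_j(j) = 1$ and $y'_1 = \frac{1}{y_{\varepsilon(1)} + m(x)}$ we have
$$\begin{aligned}
&\frac{1}{\left(1 + \sum_{i=1}^d x'_i y'_{\varepsilon_j(i)}\right)^{d+1}} \left( \frac{1}{(y_{\varepsilon(1)} + m(x)) \, x_1} \right)^{d+1} \\
&\quad = \Big( y_{\varepsilon(1)} x_1 + m(x) x_1 + x'_j y'_{\varepsilon_j(j)} (y_{\varepsilon(1)} + m(x)) x_1 + \sum_{i=1,\, i \ne j}^d x'_i y'_{\varepsilon_j(i)} (y_{\varepsilon(1)} + m(x)) x_1 \Big)^{-(d+1)} \\
&\quad = \Big( y_{\varepsilon(1)} x_1 + m(x) x_1 + \Big( \frac{1}{x_1} - m(x) \Big) y'_1 (y_{\varepsilon(1)} + m(x)) x_1 + \sum_{i=1,\, i \ne j}^d x'_i y'_{\varepsilon_j(i)} (y_{\varepsilon(1)} + m(x)) x_1 \Big)^{-(d+1)} \\
&\quad = \Big( 1 + x_1 y_{\varepsilon(1)} + \sum_{i=1,\, i \ne j}^d x'_i y'_{\varepsilon_j(i)} (y_{\varepsilon(1)} + m(x)) x_1 \Big)^{-(d+1)}.
\end{aligned}$$
Notice that for $i < j$,
$$x'_i = \frac{x_{i+1}}{x_1} \quad \text{and} \quad y'_{\varepsilon_j(i)} = \frac{y_{\varepsilon(i+1)}}{y_{\varepsilon(1)} + m(x)},$$
and for $i > j$,
$$x'_i = \frac{x_i}{x_1} \quad \text{and} \quad y'_{\varepsilon_j(i)} = \frac{y_{\varepsilon(i)}}{y_{\varepsilon(1)} + m(x)}.$$
Hence
$$\Big( 1 + x_1 y_{\varepsilon(1)} + \sum_{i=1,\, i \ne j}^d x'_i y'_{\varepsilon_j(i)} (y_{\varepsilon(1)} + m(x)) x_1 \Big)^{-(d+1)} = \Big( 1 + \sum_{i=1}^d x_i y_{\varepsilon(i)} \Big)^{-(d+1)}.$$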
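The identity $f_\varepsilon(y, x) = f_{\varepsilon_j}(\hat T_\varepsilon(y, x)) \, |\mathrm{Jac}_\varepsilon(y, x)|$ can be spot-checked numerically. The sketch below is an illustration, not part of the proof: it does not construct $\varepsilon_j$ explicitly but uses only the relations quoted in the proof ($\varepsilon_j(j) = 1$ and the formulas for $y'_{\varepsilon_j(i)}$ with $i \ne j$). Permutations are stored as tuples with `eps[k-1]` equal to $\varepsilon(k)$; the function names and the tie-breaking rule for $j(x)$ are our assumptions.

```python
import math

def density(y, eps, x):
    """f_eps(y, x) = (1 + sum_i x_i * y_{eps(i)})^(-(d+1))."""
    d = len(x)
    s = sum(x[i] * y[eps[i] - 1] for i in range(d))
    return (1.0 + s) ** (-(d + 1))

def transported_density(y, eps, x):
    """f_{eps_j}(y', x') * |Jac_eps(y, x)|, using the relations from the proof."""
    d = len(x)
    x1 = x[0]
    m = math.floor(1.0 / x1)
    frac = 1.0 / x1 - m
    rest = [xi / x1 for xi in x[1:]]
    j = 1 + sum(1 for r in rest if r > frac)    # assumed tie-breaking
    x_new = rest[: j - 1] + [frac] + rest[j - 1:]
    denom = y[eps[0] - 1] + m                   # y_{eps(1)} + m(x)
    s = 0.0
    for i in range(1, d + 1):
        if i == j:
            y_term = 1.0 / denom                # y'_1, since eps_j(j) = 1
        elif i < j:
            y_term = y[eps[i] - 1] / denom      # y_{eps(i+1)} / (y_{eps(1)} + m)
        else:
            y_term = y[eps[i - 1] - 1] / denom  # y_{eps(i)} / (y_{eps(1)} + m)
        s += x_new[i - 1] * y_term
    jac = (1.0 / (denom * x1)) ** (d + 1)
    return (1.0 + s) ** (-(d + 1)) * jac

y = [0.6, 0.3, 0.1]
lhs = density(y, (2, 3, 1), [0.7, 0.4, 0.2])
rhs = transported_density(y, (2, 3, 1), [0.7, 0.4, 0.2])
```

For these sample points `lhs` and `rhs` agree up to floating-point rounding, which is what the last display of the proof asserts.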
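Remarks. (i) It is easy to see that for the probability measure $\hat\mu$ the conditional distributions on $\Delta^d \times \Delta^d$ under $\varepsilon$ fixed are given by
$$\hat\mu(dy, dx \mid \varepsilon) = \frac{1}{C(\varepsilon)} \, \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}} \, dy \, dx,$$
where the $C(\varepsilon)$ are normalising constants:
$$C(\varepsilon) = \int_{\Delta^d \times \Delta^d} \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}} \, dy \, dx.$$
Obviously,
$$\sum_{\varepsilon \in S_d} C(\varepsilon) = C.$$
(ii) Let $\kappa$ denote the marginal distribution of the measure $\hat\mu$ on $S_d$, i.e. $\hat\mu|_{S_d} = \kappa$. Then
$$\kappa(\varepsilon) = \frac{1}{C} \int_{\Delta^d \times \Delta^d} \frac{1}{\left(1 + \sum_{i=1}^d x_i y_{\varepsilon(i)}\right)^{d+1}} \, dy \, dx = \frac{C(\varepsilon)}{C}.$$

Theorem 4. (i) The automorphism $(\hat T, \hat\mu)$ on $\Delta^d \times S_d \times \Delta^d$ is metrically isomorphic to the natural extension of the d-dimensional Gauss transformation.
(ii) $(\hat T, \hat\mu)$ is a K-automorphism.
(iii) $\hat T$ is reversible with respect to the involution $S(y, \varepsilon, x) = (x, \varepsilon^{-1}, y)$, i.e. $\hat T^{-1} = S \hat T S$.

Proof. (i) By Proposition 4.8, for Lebesgue almost all $x, y \in \Delta^d$ there exists a unique symbolic representation of $(y, x)$:
$$((\mathcal{M}_-, \mathcal{L}_-), (\mathcal{M}_+, \mathcal{J}_+)) = (\mathcal{M}, \mathcal{L}_-, \mathcal{J}_+) = \hat\Phi(y, x),$$
where $\mathcal{L}_- \in D_-$ and $\mathcal{J}_+ \in D_+$. From Proposition 4.7, there exists a unique $\mathcal{J} \in D$ such that $\mathcal{J}_+$ is the restriction of $\mathcal{J}$ onto $n \ge 1$, and $\mathcal{L}_-$, $\varepsilon_0$ are compatible with $\mathcal{J}$. Moreover, the transformation of $(\mathcal{L}_-, \varepsilon_0, \mathcal{J}_+)$ onto $\mathcal{J} \in D$ is also one-to-one. Together these two facts imply that there is a one-to-one transformation $\psi$ from a set of $(y, \varepsilon, x)$ of full $\chi$-measure onto the space of two-sided sequences $(m_n, j_n)_{n \in \mathbb{Z}}$, where $\mathcal{J} = (j_n)_{n \in \mathbb{Z}} \in D$. It follows easily from the construction that $\hat T$ is conjugated by $\psi$ to the unit shift of the sequence $(m_n, j_n)_{n \in \mathbb{Z}}$. Denote the image of $\hat\mu$ under $\psi$ by $\hat\nu$, i.e. $\hat\nu = \psi \hat\mu$. Obviously, $\hat\nu$ is the measure corresponding to the natural extension; indeed it is translation invariant and its projection onto the space of one-sided sequences $(m_n, j_n)_{n \in \mathbb{N}}$ coincides with $\nu$, which proves (i).

(ii) $T$ is an exact endomorphism with respect to the invariant measure $\mu$ (see [9] or [15]). Hence its natural extension is a K-automorphism [20].

(iii) The property of reversibility easily follows from (i) and Theorem 1. Indeed, the unit shift of a two-sided sequence is always reversible with respect to the involution corresponding to the reflection $n \to -n$. Using Theorem 1 it is easy to see that this involution gives $S$ in the coordinates $(y, \varepsilon, x)$. However, we will give another proof of (iii) which is based on a direct calculation. Suppose $x, y \in \mathrm{Int}(\Delta^d)$. Then $\hat T(y, \varepsilon, x) = (y', \varepsilon', x')$, where $y' \in \mathrm{Int}(\Delta^d)$, $y = T y'$, $j(y') = l = \varepsilon(1)$, $m(y') = m(x) = m$ and $\varepsilon' = \sigma_{\varepsilon(1)} \cdot \varepsilon \cdot \sigma_j = \sigma_l \cdot \varepsilon \cdot \sigma_j$. Hence
$$\begin{aligned}
S \hat T S \hat T(y, \varepsilon, x) &= S \hat T S(y', \varepsilon', x') = S \hat T(x', (\varepsilon')^{-1}, y') \\
&= S\left( T^{-1}_{(m, j)}(x'), \; \sigma_j \cdot \sigma_j^{-1} \cdot \varepsilon^{-1} \cdot \sigma_l^{-1} \cdot \sigma_l, \; y \right) \\
&= S(x, \varepsilon^{-1}, y) = (y, \varepsilon, x).
\end{aligned}$$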
Similarly, $\hat T S \hat T S(y, \varepsilon, x) = (y, \varepsilon, x)$. This proves (iii).

Corollary 1. The involution $S$ preserves the invariant measure $\hat\mu$.

Proof. It follows from the reversibility of $\hat T$ that $S\hat\mu$ is also an invariant measure for $\hat T$. Since $\hat T$ is ergodic and both $\hat\mu$ and $S\hat\mu$ are absolutely continuous with respect to $\chi$ we get that $S\hat\mu = \hat\mu$.

Consider the trajectory $(T^n x)_{n \ge 0}$ of an arbitrary $x \in \Delta^d$ under the endomorphism $T$ and the corresponding sequence of permutations $\varepsilon_n(x)$. We have seen in Sect. 4 that, for almost all $x$, $\varepsilon_n(x)$ is well-defined for $n$ large enough. We shall show that the stationary distribution for $\varepsilon_n(x)$ is given by $\kappa$.

Corollary 2. For any $\varepsilon_0 \in S_d$ and Lebesgue almost every $x \in \Delta^d$,
$$\frac{\#\{1 \le n \le N : \varepsilon_n(x) = \varepsilon_0\}}{N} \to \kappa(\varepsilon_0) \quad \text{as } N \to \infty.$$

Proof. Consider an observable
$$g_{\varepsilon_0}(y, \varepsilon, x) = \delta_{\varepsilon_0}(\varepsilon) = \begin{cases} 1 & \text{if } \varepsilon = \varepsilon_0; \\ 0 & \text{if } \varepsilon \ne \varepsilon_0. \end{cases}$$
For $\hat\mu$-almost all $(y, \varepsilon, x)$ we have
$$\frac{1}{N} \sum_{n=0}^{N-1} g_{\varepsilon_0}\left(\hat T^n(y, \varepsilon, x)\right) \to \int_{\Delta^d \times S_d \times \Delta^d} g_{\varepsilon_0} \, d\hat\mu = \kappa(\varepsilon_0).$$
Let $\chi_{\varepsilon_0}(\varepsilon)$ denote the characteristic function of $\varepsilon_0$. Since for Lebesgue almost all $x$, the sequence $g_{\varepsilon_0}(\hat T^n(y, \varepsilon, x))$ does not depend on $y$ or $\varepsilon$ for $n$ large enough, and is equal to $\chi_{\varepsilon_0}(\varepsilon_n(x))$, we get the required result.

6. Conclusions

We have shown that the d-dimensional Gauss transformation and its natural extension have many of the important ergodic and dynamic properties which are valid in the one-dimensional situation. We summarise these similarities below:
(i) The invariant measures for both the d-dimensional Gauss transformation and its natural extension are given by explicit formulae. It is quite obvious that the explicit formula for the invariant measure of the natural extension is a generalisation of the corresponding formula in the one-dimensional case.
(ii) The vectors $\varphi_n = \left( \frac{q_n^{(1)}}{q_n^{(0)}}, \dots, \frac{q_n^{(d)}}{q_n^{(0)}} \right)$ are connected by the backwards d-dimensional Gauss transformation, i.e. $\varphi_{n-1} = T^{-1} \varphi_n$.
(iii) The matrix $C_n(x) = A_{(m_n, j_n)} \cdots A_{(m_1, j_1)}$ gives the vertices of the simplex $\Delta_n(x)$ which is the nth approximation to $x$, and also, after taking its transpose and a suitable rearrangement of the rows and columns, the vertices of the simplex $\Delta_n(\varphi_n)$.
Although there are many multidimensional generalisations of continued fractions, the d-dimensional Gauss transformation is the only one we know which enjoys the properties (i)–(iii). We believe that there is a connection between the existence of explicit formulas for the invariant density and the symmetry of the natural extension. These symmetries are “hidden”, i.e. non-obvious, in the d-dimensional case. One of the manifestations of the symmetry is the existence of an “almost” first integral. Define $F(y, \varepsilon, x) = y_{\varepsilon(1)} + \frac{1}{x_1}$. It is easy to see that $F(S \hat T(y, \varepsilon, x)) = F(y, \varepsilon, x)$. Hence $F$ is a first integral for $S \hat T$. In the one-dimensional case, the existence of $F$ allows one to construct an $S$-symmetric invariant absolutely continuous measure for $S \hat T$ in the regular way, which gives a unique absolutely continuous invariant measure for $\hat T$ (see [11]). We believe that such a construction can be carried out in the d-dimensional case as well.

Despite the many similarities between one-dimensional continued fractions and the d-dimensional Gauss algorithm, there do exist significant differences. The main difference is the presence of a discrete coordinate $\varepsilon$ in the natural extension and the non-trivial dependence of the sequences $\mathcal{L}$ and $\mathcal{J}$. The sequences $\mathcal{E}$, $\mathcal{J}$ and $\mathcal{L}$ are completely absent in the one-dimensional case. In fact the first really non-trivial case is $d = 3$. In the case $d = 2$, $\varepsilon$ belongs to the commutative group $\mathbb{Z}_2$ and as a consequence the sequences $\mathcal{J}$ and $\mathcal{L}$ are related in an elementary way: $\mathcal{L}$ is the unit shift of the sequence $\mathcal{J}$.

Another beautiful and important aspect of the classical theory of continued fractions is a deep connection between the one-dimensional Gauss automorphism and the geodesic flow on a surface of constant negative curvature. This connection was studied by R. Adler and L. Flatto [1], C. Series [24, 25] and recently by M. Kontsevich and Yu. Suhov [13]. It would be very interesting to find a similar geometrical interpretation in the d-dimensional case.
In this paper we have not discussed the convergence of the approximations provided by the d-dimensional Gauss algorithm. In fact the explicit forms for the invariant measure make it possible to give computer assisted proofs of almost everywhere strong convergence in dimensions 2 and 3 [7, 4, 6]. However we hope that the hidden symmetries which we have discussed here will eventually contribute to a conceptual proof of almost everywhere strong convergence which is currently an open problem. Acknowledgements. The authors are grateful to the European Science Foundation for the opportunity to participate in their PRODYN (Probabilistic methods in non-hyperbolic dynamics) programme. The first author also wishes to thank the Engineering and Physical Sciences Research Council of the UK for financial support.
References
1. Adler, R. and Flatto, L.: Cross section maps for geodesic flows. In: Ergodic Theory and Dynamical Systems, Progress in Mathematics 2, ed. A. Katok, Boston: Birkhäuser, 1980
2. Bernstein, L.: The Jacobi–Perron Algorithm – Its Theory and Application. Lecture Notes in Mathematics 207, Berlin–Heidelberg–New York: Springer-Verlag, 1971
3. Brun, V.: Algorithmes euclidiens pour trois et quatre nombres. In: 13 ième Congre. Math. Scand., Helsinki (1957), pp. 45–64
4. Fujita, T., Ito, S., Keane, M. and Ohtsuki, M.: On almost everywhere exponential convergence of the modified Jacobi–Perron algorithm: A corrected proof. Ergod. Th. and Dyn. Sys. 16, 1345–1352 (1996)
5. Gordin, M.I.: Exponentially rapid mixing. Dokl. Akad. Nauk SSSR 196, 1255–1258 (1971); English translation: Soviet Math. Dokl. 12, 331–335 (1970)
6. Hardcastle, D.M. and Khanin, K.: On almost everywhere strong convergence of multidimensional continued fraction algorithms. To appear in Ergod. Th. and Dyn. Sys.
7. Ito, S., Keane, M. and Ohtsuki, M.: Almost everywhere exponential convergence of the modified Jacobi–Perron algorithm. Ergod. Th. and Dyn. Sys. 13, 319–334 (1993)
8. Ito, S. and Nakada, H.: On natural extensions of transformations related to Diophantine approximations. In: Number Theory and Combinatorics, Singapore: World Scientific, 1985, pp. 185–207
9. Ito, S. and Yuri, M.: Number theoretical transformations with finite range structure and their ergodic properties. Tokyo J. Math. 10, 1–32 (1987)
10. Jacobi, C.G.J.: Allgemeine Theorie der Kettenbruchähnlichen Algorithmen, in welchen jede Zahl aus drei vorhergehenden gebildet wird. J. Reine Angew. Math. 69, 29–64 (1868)
11. Khalatnikov, I.M., Lifshitz, E.M., Khanin, K.M., Shchur, L.N. and Sinai, Ya.G.: On the stochasticity in relativistic cosmology. J. Stat. Phys. 38, 97–114 (1985)
12. Khinchin, A.Ya.: Continued Fractions. Chicago, Ill: University of Chicago Press, 1964
13. Kontsevich, M.L. and Suhov, Yu.M.: Statistics of Klein polyhedra and multidimensional continued fractions. In: Pseudoperiodic Topology, eds. V. Arnold, M. L. Kontsevich and A. Zorich, Amer. Math. Soc. Transl. Series 2 197, 9–28 (1999)
14. Lagarias, J.C.: The quality of the Diophantine approximations found by the Jacobi–Perron algorithm and related algorithms. Mh. Math. 115, 299–328 (1993)
15. Mayer, D.H.: Approach to equilibrium for locally expanding maps in $\mathbb{R}^k$. Commun. Math. Phys. 95, 1–15 (1984)
16. Nakada, H., Ito, S. and Tanaka, S.: On the invariant measure for the transformations associated with some real continued-fractions. Keio Eng. Rep. 30, 159–175 (1977)
17. Perron, O.: Grundlagen für eine Theorie des Jacobischen Kettenbruchalgorithmus. Math. Ann. 64, 1–76 (1907)
18. Podsypanin, E.V.: A generalization of the algorithm for continued fractions related to the algorithm of Viggo Brunn. Zap. Naucn. Sem. Leningrad Otdel. Mat. Inst. Steklov 67, 184–194 (1977); English translation: J. Soviet Math. 16, 885–893 (1981)
19. Poincaré, H.: Sur une généralisation des fractions continues. C. R. Acad. Sci. Paris 99, 1014–16 (1884)
20. Rohlin, V.A.: Exact endomorphisms of a Lebesgue space. Amer. Math. Soc. Transl. Series 2 39, 1–36 (1964)
21. Schweiger, F.: The Metrical Theory of the Jacobi–Perron Algorithm. Lecture Notes in Mathematics 334, Berlin–Heidelberg–New York: Springer-Verlag, 1973
22. Schweiger, F.: A modified Jacobi–Perron algorithm with explicitly given invariant measure. In: Ergodic Theory, Proceedings Oberwolfach, Germany 1978, Lecture Notes in Mathematics 729, Berlin–Heidelberg–New York: Springer-Verlag, 1979, pp. 199–202
23. Selmer, E.: Om flerdimensjonal Kjede brøk. Nordisk Mat. Tidskr. 9, 37–43 (1961)
24. Series, C.: On coding geodesics with continued fractions. Enseign. Math. 29, 67–76 (1980)
25. Series, C.: The modular surface and continued fractions. J. London Math. Soc. (2) 31, 69–80 (1985)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 215, 517 – 557 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Vertex Algebras and Mirror Symmetry Lev A. Borisov Department of Mathematics, Columbia University, 2990 Broadway, Mailcode 4432, New York, NY 10027, USA. E-mail: [email protected] Received: 11 June 1999 / Accepted: 12 June 2000
Abstract: Mirror Symmetry for Calabi–Yau hypersurfaces in toric varieties is by now well established. However, previous approaches to it did not uncover the underlying reason for mirror varieties to be mirror. We are able to calculate explicitly vertex algebras that correspond to holomorphic parts of A and B models of Calabi–Yau hypersurfaces and complete intersections in toric varieties. We establish the relation between these vertex algebras for mirror Calabi–Yau manifolds. This should eventually allow us to rewrite the whole story of toric Mirror Symmetry in the language of sheaves of vertex algebras. Our approach is purely algebraic and involves simple techniques from toric geometry and homological algebra, as well as some basic results of the theory of vertex algebras. Ideas of this paper may also be useful in other problems related to maps from curves to algebraic varieties. This paper could also be of interest to physicists, because it contains an explicit description of holomorphic parts of A and B models of Calabi–Yau hypersurfaces and complete intersections in terms of free bosons and fermions.

1. Introduction

The first example of Mirror Symmetry was discovered by physicists in [8]. It relates the A model on one Calabi–Yau variety with the B model on another one. Unfortunately, the definition of A and B models was given by physicists in terms of integrals over the set of all maps from Riemann surfaces to a given Calabi–Yau variety, see [22]. While physicists have developed a good intuitive understanding of the behavior of these integrals, they are ill-defined mathematically. Nevertheless, physicists came up with predictions of numbers of rational curves of given degree in Calabi–Yau manifolds, a quintic threefold being the most prominent example. Kontsevich has introduced spaces of stable maps (see [14]) which allowed him to define mathematically virtual numbers of rational curves on a quintic.
Givental proved in [11] that these virtual numbers agree with physical predictions. Because of its hard
calculations, Givental’s paper is a source of controversy, see [16] by Lian, Liu and Yau. In some sense, however, Givental’s approach does not clarify the origins of mirror symmetry. His proof is a beautiful and tricky calculation which has little to do with mirror involution. The goal of this paper is to present a completely different approach to toric Mirror Symmetry which should eventually lead to conceptual understanding of mirror involution in purely mathematical terms. To do this, one has to employ the theory of vertex algebras, which is a very well developed purely algebraic theory. Malikov, Schechtman and Vaintrob have recently suggested an algebraic approach to A models (see [17]) that involves the chiral de Rham complex which is a certain sheaf of vertex algebras. In my personal opinion, their paper is one of the most important mathematics papers on mirror symmetry written to date, even though it does not deal directly with mirror symmetry. In this paper we attempt to calculate A and B model vertex algebras for mirror families of Calabi–Yau complete intersections. We define (quasi-)loop-coherent sheaves over any algebraic variety X, and we show that if their sections on any affine open subset have vertex algebra structure, then the cohomology of the sheaf has this structure as well. It is hoped that the techniques of this paper will prove to be more important than the paper itself, after all, they should allow mathematicians to do rigorously what physicists have been doing half-rigorously for quite a while and with a lot of success. As far as applications to conformal field theory are concerned, this paper suggests a way of defining A and B models for varieties with Gorenstein toroidal singularities that does not use any resolutions of such singularities. The paper is organized as follows. Section 2 is devoted to (quasi-)loop-coherent sheaves, which is a generalization of the notion of (quasi-)coherent sheaves. 
It serves as a useful framework for the whole paper, and perhaps may have other applications. Sections 3 and 4 contain mostly background material. The only apparently new result there is Proposition 3.7, which had actually been suggested in [17]. Section 5 contains a calculation of chiral de Rham complex of a hypersurface in a smooth variety in terms of chiral de Rham complex of the corresponding line bundle. Section 6 contains a calculation which is in a sense a mirror image of the calculation of Sect. 5. Sections 7 and 8 combine results of previous sections to describe A and B models of Calabi–Yau hypersurfaces in smooth nef-Fano toric varieties. Sections 9 and 10 attempt to extend these results to singular varieties and complete intersections. While good progress is made there, a few details need to be further clarified. Section 11 is largely a speculation. We state there some open questions related to our construction, as well as some possible applications of the results and techniques of this paper. We use the book of Kac [13] as the standard reference for vertex algebras.
2. Loop-Coherent Sheaves The goal of this section is to develop foundations of the theory of (quasi-)loop-coherent sheaves over algebraic varieties. These are rather peculiar objects that nevertheless behave very much like usual coherent and quasi-coherent sheaves. For simplicity, we only concern ourselves with algebras over complex numbers. This is mostly just a formalization of the localization calculation of [17] but it provides us with a nice framework for our discussion. The idea is somehow to work with sheaves over the loop space of an algebraic variety without worrying much about infinities. Only the future can tell if this is a truly useful concept or just an annoying technicality.
Definition 2.1. Let $R$ be a commutative algebra over $\mathbb{C}$ with a unit. An $R$-loop-module is a vector space $V$ over $\mathbb{C}$ together with the following set of data. First of all, $V$ is graded, $V = \oplus_{l \ge 0} V_l$. We assume that values of $l$ are integers, although it takes little effort to modify our definitions to allow any real $l$. We denote by $L[0]$ the grading operator, that is $L[0]v = kv$ for all $v \in V_k$. In addition, for every element $r \in R$ and every integer $l$ there is given a linear operator $r[l] : V \to V$ such that the following conditions hold:
(a) $1[k] = \delta_k^0$;
(b) all $r[l]$ commute with each other;
(c) $r[l]V_k \subseteq V_{k-l}$;
(d) for every two ring elements $r_1, r_2$ there holds
$$\sum_k r_1[k] z^{-k} \sum_l r_2[l] z^{-l} = \sum_k (r_1 r_2)[k] z^{-k}.$$
This equation makes sense because at any given power of $z$, when applied to any given element, only a finite number of terms on the left-hand side are non-zero. This follows from (b), (c), and $L[0] \ge 0$.

Remark 2.2. $R$-loop-modules are usually not $R$-modules. Indeed, one has
$$(r_1 r_2)[0] = r_1[0] r_2[0] + \sum_{k \ne 0} r_1[k] r_2[-k]$$
as opposed to just $(r_1 r_2)[0] = r_1[0] r_2[0]$. However, the extra term is locally nilpotent, that is, a sufficient power of it annihilates $V_l$ for any given $l$. This is what makes it possible to localize loop-modules analogously to localization of the usual modules. The following proposition will also serve as a definition.

Proposition 2.3. Let $S$ be a multiplicative system in $R$. Given a loop-module $V$ over $R$ denote by $V_S$ its localization by the multiplicative system $S_{loop}$ generated by $s[0]$ for all $s \in S$. We claim that $V_S$ has a natural structure of $R_S$-loop-module. Moreover, $\rho : V \to V_S$ is a universal morphism in the sense that for any map to an $R_S$-loop-module $\rho_1 : V \to V_1$ that is compatible with $R \to R_S$ there exists a unique $R_S$-loop-module map $\rho_2 : V_S \to V_1$ such that $\rho_1 = \rho_2 \circ \rho$.

Proof. First of all, let us provide $V_S$ with an $R_S$-loop-module structure. Grading is clearly unaffected by localization. For every $v \in V_S$ we have $v = w / \prod s_i[0]$ and every $a \in R_S$ looks like $a = b / \prod s_j$. We define
$$\begin{aligned}
a(z) v &= \frac{1}{\prod s_i[0]} \, a(z) w, \\
a(z) w &= b(z) \prod \Big( \frac{1}{s_j} \Big)(z) \, w, \\
\Big( \frac{1}{s_j} \Big)(z) \, w &= \frac{1}{s_j[0] + \sum_{k \ne 0} s_j[k] z^{-k}} \, w = \sum_{l=0}^{\infty} (-1)^l s_j[0]^{-l-1} \Big( \sum_{k \ne 0} s_j[k] z^{-k} \Big)^l w,
\end{aligned}$$
which gives an element in $V_S[z, z^{-1}]$. Even though it seems that infinite sums appear when one applies $(1/s_j)(z)$ several times, for any given $w$ most $s_j[k]$ with positive $k$ could be safely ignored, since they annihilate $w$ and could always be pushed through other $s_*[*]$ by commutativity. One, of course, needs to check that the above definition is self-consistent. It is clear that if you change $v$ to $s[0]v/s[0]$, the result stays the same. Changing $a$ to $sa/s$ requires a certain calculation, but does not present any major difficulties either. For every map $V \to V_1$, where $V_1$ is an $R_S$-loop-module, notice that $s[0]$ is invertible on $V_1$ for all $s \in S$. Indeed, $s[0] \, s^{-1}[0] = 1 + \text{locally nilpotent}$, so $s[0]$ is invertible by the same trick. Therefore, the map $V \to V_1$ can be naturally pushed through $V_S$.

Remark 2.4. Because of Remark 2.2, it is enough to localize by $s[0]$ for those $s$ that generate $S$.

Proposition 2.3 allows us to define a quasi-loop-coherent sheaf on any complex algebraic variety $X$ as follows.

Definition 2.5. A sheaf $\mathcal{V}$ of vector spaces over $\mathbb{C}$ is called quasi-loop-coherent if for every affine subset $\mathrm{Spec}(R) \subset X$ the sections $\Gamma(\mathrm{Spec}(R), \mathcal{V})$ form an $R$-loop-module and restriction maps are precisely the localization maps of Proposition 2.3.

Many results about quasi-loop-coherent sheaves could be deduced from the standard results about quasi-coherent sheaves due to the following proposition.

Proposition 2.6. For every $R$-loop-module $V$ consider the following filtration:
$$F^l V = \sum_{i, s_1, \dots, s_i, k_1, \dots, k_i} \prod_i s_i[k_i] V_{\le l}.$$
We have $F^0 V \subseteq F^1 V \subseteq \dots$, and $F^{l+1} V / F^l V$ has a natural structure of an $R$-module. Moreover, this filtration commutes with localizations.

Proof. Locally nilpotent operators $(s_1 s_2)[0] - s_1[0] s_2[0]$ push $F^{l+1} V$ to $F^l V$, which provides the quotient with the structure of an $R$-module. Filtration commutes with localization, because $s[0]$ commute with $L[0]$.

As a result of this proposition every quasi-loop-coherent sheaf $\mathcal{V}$ is filtered by other quasi-loop-coherent sheaves $F^l \mathcal{V}$ and all quotients are quasi-coherent. It is also worth mentioning that the above filtration is finite on every $\mathcal{V}_k$, which prompts the following definition.

Definition 2.7. A quasi-loop-coherent sheaf is called loop-coherent, or loco, if the quasi-coherent sheaves $F^{l+1} \mathcal{V} \cap \mathcal{V}_k / F^l \mathcal{V} \cap \mathcal{V}_k$ are coherent for all $k$ and $l$. From now on we also use the abbreviation quasi-loco in place of quasi-loop-coherent.

Remark 2.8. The zero component $\mathcal{V}_0$ of a (quasi-)loco sheaf is (quasi-)coherent.
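As a hedged aside, the step used in the proof of Proposition 2.6 can be unpacked by comparing coefficients in condition (d) of Definition 2.1. The index $n$ and the splitting into positive and negative modes below are our own bookkeeping, not notation from the text.

```latex
% Coefficient of z^{-n} on both sides of condition (d):
(r_1 r_2)[n] \;=\; \sum_{k \in \mathbb{Z}} r_1[k]\, r_2[n-k].
% For n = 0 this is the formula of Remark 2.2. Using commutativity (b)
% to move the mode with positive index to the right, the extra term is
(r_1 r_2)[0] - r_1[0]\, r_2[0]
  \;=\; \sum_{k > 0} r_2[-k]\, r_1[k] \;+\; \sum_{k > 0} r_1[-k]\, r_2[k].
```

By (c), a mode with index $k > 0$ maps $V_{\le l+1}$ into $V_{\le l}$, and by (b) it can be commuted past any product $\prod_i s_i[k_i]$. Each summand therefore sends $\prod_i s_i[k_i] V_{\le l+1}$ into $F^l V$, which is the sense in which the extra term pushes $F^{l+1} V$ to $F^l V$.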
Proposition 2.9. For any affine variety X and quasi-loco sheaf V on it cohomology spaces H i (X, V) are zero for i ≥ 1. For any projective variety X all cohomology groups of a loco sheaf are finite dimensional for each eigenvalue of L[0]. Proof. For both statements, one considers a specific eigenvalue k of L[0] and then applies an induction on l in F l V ∩ Vk . Remark 2.10. There is a one-to-one functorial correspondence between R-loop-modules and quasi-loco sheaves over SpecR. Remark 2.11. One can modify the definition of quasi-loco sheaves to allow negative eigenvalues of L[0], as long as there is some bound L[0] ≥ −N on them. 3. Sheaves of Vertex Algebras We follow [13] in our definition of a vertex algebra. We only consider vertex algebras over C. Definition 3.1 ([13]). A vertex algebra V is first of all a super vector space over C, that is V = V0 ⊕ V1 , where elements of V0 are called bosonic or even and elements from V1 are called fermionic or odd. In addition, there given a bosonic vector |0 called vacuum vector. The last part of the data that defines a vertex algebra is the so-called state-field correspondence which is a parity preserving linear map from V to EndV [[z, z−1 ]] a → Y (a, z) = a(n) z−n−1 n∈Z
such that for fixed v and a all a(n) v are zero for n >> 0. This state-field correspondence must satisfy the following axioms. • translation covariance: {T , Y (a, z)}− = ∂z Y (a, z), where {, }− denotes the usual commutator and T is defined by T (a) = a(−2) |0 ; • vacuum: Y (|0 , z) = 1V , Y (a, z)|0 z=0 = a; • locality: (z − w)N {Y (a, z), Y (b, w)}∓ = 0 for all sufficiently big N , where ∓ is + if and only if both a and b are fermionic. The equality is understood as an identity of formal power series in z and w. We will often write a(z) instead of Y (a, z). Linear operators a(k) will be referred to as modes of a. Every two fields a(z), b(w) of a vertex algebra have operator product expansion N ci (w) a(z)b(w) = + : a(z)b(w) :, (z − w)i i=1
1 where the meaning of the symbols (z−w) i and :: in the above formulas is clarified in Chapter 2 of [13]. We only remark here that operator product expansion contains information about all super-commutators of the modes of a and b, and the sum on the right-hand side is finite due to the locality axiom. The sum is called the singular part and the :: term is called the regular part. When there is no singular part, the OPE is called non-singular and it means that all modes of the two fields in question super-commute. In this paper we will only use graded vertex algebras.
L. A. Borisov
Definition 3.2 ([13]). A vertex algebra V is called graded if there is given an even diagonalizable operator H on V such that
$$\{H, Y(a, z)\}_- = z\,\partial_z Y(a, z) + Y(Ha, z).$$
When Ha = ha we rewrite Y(a, z) in the form
$$Y(a, z) = \sum_{n \in -h+\mathbb{Z}} a[n]\, z^{-n-h}$$
and again call a[n] modes of a. The number h is usually called the conformal weight of a or a(z). We use brackets to denote modes, which differs from the notations of [13] and [17]. Modes can hardly be confused with commutators, but we use the $\{\,,\}_\pm$ notation for the latter just in case.

Definition 3.3 ([13]). A graded vertex algebra V is called conformal if there is chosen an even vector a such that the corresponding field $Y(a, z) = L(z) = \sum_n L[n] z^{-n-2}$ satisfies
$$L(z)L(w) = \frac{c/2}{(z - w)^4} + \frac{2L(w)}{(z - w)^2} + \frac{\partial L(w)}{z - w} + \mathrm{reg}.$$
We also require L[−1] = T and L[0] = H. The number c here is called the central charge or rank of the algebra.

We now combine the theories of quasi-loco sheaves and vertex algebras to define sheaves of graded vertex algebras over an algebraic variety X. Abusing the notations, we denote the grading operator H by L[0] even if the algebra V is not conformal.

Definition 3.4. Let R be a commutative algebra over C. A graded vertex algebra V is called a vertex R-algebra if R is mapped to the L[0] = 0 component of V so that the images of all r ∈ R are bosonic, all modes r[n] commute with each other, and $Y(r_1, z)Y(r_2, z) = Y(r_1 r_2, z)$. In addition, we assume that L[0] has only non-negative eigenvalues.

The definition of a graded algebra implies that r[n] decreases eigenvalues of L[0] by n. Thus any vertex R-algebra has the structure of an R-loop-module.

Proposition 3.5. Let S be a multiplicative system in R and let V be a vertex R-algebra. Then the localization $V_S$ defined in Proposition 2.3 has a natural structure of a vertex $R_S$-algebra.

Proof. For any element a ∈ V and any set of elements $s_i \in S$ we need to define the field $Y(a/\prod s_i[0], z)$ on $V_S$. First, we do it for any element of the form $|0\rangle/\prod s_i[0]$. The corresponding field is defined, of course, as $\prod_i Y(s_i, z)^{-1}$, in agreement with Proposition 2.3. Let us check the vacuum axiom of vertex algebras for this field.
First of all, when we apply it to the vacuum, which has L[0] eigenvalue zero in any graded vertex algebra, only non-positive modes of $s_i$ survive. As a result, at z = 0 we obtain precisely $|0\rangle/\prod s_i[0]$. Operator T extends naturally, because it commutes with s[0]. The second part of the vacuum axiom also holds, because
$$\Bigl\{T, \prod s_i^{-1}(z)\Bigr\}_- = \Bigl\{T, 1/\bigl(\textstyle\prod s_i\bigr)(z)\Bigr\}_- = \prod s_i^{-1}(z)\,\Bigl\{\prod s_i(z), T\Bigr\}_-\,\prod s_i^{-1}(z) = -\prod s_i^{-2}(z)\Bigl(\partial_z \prod s_i(z)\Bigr) = \partial_z \prod s_i^{-1}(z).$$
Vertex Algebras and Mirror Symmetry
In addition, one can show that the new fields are mutually local with all old fields Y(b, z) and with each other. We will now construct a field for every element in $V_S$. We will do it by induction on l in the filtration $F^l V$ of Proposition 2.6. For an arbitrary element $a/\prod s_i[0]$ of grading h we define
$$Y\bigl(a/\textstyle\prod s_i[0], z\bigr) = {:}\,Y\bigl(|0\rangle/\textstyle\prod s_i[0], z\bigr)\, Y(a, z)\,{:} = \sum_{k \le 0} \bigl(|0\rangle/\textstyle\prod s_i[0]\bigr)[k]\, z^{-k} \sum_{i} a[i]\, z^{-i-h} + \sum_{i} a[i]\, z^{-i-h} \sum_{k \ge 1} \bigl(|0\rangle/\textstyle\prod s_i[0]\bigr)[k]\, z^{-k}.$$
One can see that this expression is well defined as an element in $\mathrm{End}\,V[[z, z^{-1}]]$. When we apply this field to the vacuum, the second term does not contribute, and we get $(\prod s_i^{-1})[0]a$. This is not the same as $a/\prod s_i[0]$, but the difference lies in the deeper part of the filtration. So we have now constructed a field for each $a \in V_S$. To show that this definition is compatible with changes from a to s[0]a/s[0], notice that all constructed fields are mutually local and satisfy the second part of the vacuum axiom. Then the argument of the uniqueness theorem of Sect. 4.4 of [13] works and allows us to conclude that for any two ways of writing an element in $V_S$ the corresponding fields are the same.

The above proposition allows us to define sheaves of vertex algebras over any algebraic variety X.

Definition 3.6. A (quasi-)loco sheaf V of vector spaces over C is said to be a (quasi-)loco sheaf of vertex algebras if for every affine subset Spec(R) ⊂ X the sections Γ(Spec(R), V) form a vertex R-algebra and the restriction maps are precisely the localization maps of Proposition 3.5.

Our goal now is to provide $H^*(V)$ with a structure of vertex algebra. Certainly, operators T and L[0], as well as the vacuum, behave well under localizations and are therefore globally defined. For every affine set in X and every integer n we may consider the operation (n) that maps (a, b) to $a_{(n)} b$. It also commutes with localization, which gives us the map (n) : V ⊗ V → V. This map induces a cup product on the cohomology of V, and we shall show that combining all these maps together yields a vertex algebra structure on $H^*(V)$.

Proposition 3.7. The cohomology of a quasi-loco sheaf of vertex algebras V has a natural structure of vertex algebra. Moreover, if sections of V over Zariski open sets are given the structure of conformal algebras that is compatible with localization maps, then $H^*(V)$ has a natural conformal structure.

Proof. We will use the equivalent set of axioms of vertex algebras, see [13], Sect. 4.8.
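This set consists of the partial vacuum identity
$$Y(|0\rangle, z) = 1, \qquad a_{(-1)}|0\rangle = a$$
and the Borcherds identity
$$\sum_{j=0}^{\infty} \binom{m}{j} \bigl(a_{(n+j)} b\bigr)_{(m+k-j)} c = \sum_{j=0}^{\infty} (-1)^j \binom{n}{j}\, a_{(m+n-j)} \bigl(b_{(k+j)} c\bigr) - (-1)^{\mathrm{parity}(a)\,\mathrm{parity}(b)} \sum_{j=0}^{\infty} (-1)^{j+n} \binom{n}{j}\, b_{(n+k-j)} \bigl(a_{(m+j)} c\bigr)$$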
for all a, b, c ∈ V and all k, m, n ∈ Z. When we have a graded algebra with L[0] ≥ 0, for any given L[0]-eigenvalues of a, b and c, the sums in the Borcherds identity are finite. Therefore, the Borcherds identity is just a collection of identities for maps (n) between components $V_r$ of V. Therefore, they induce identities on the cohomology of V when we replace (n) with the corresponding cup products. A careful examination of signs shows that the Borcherds identity holds on the cohomology of V if we define parity on $H^s(V)$ as the sum of s and parity on V. The partial vacuum identity on cohomology also follows from the vacuum identity on V and the fact that $|0\rangle$ is globally defined. If V has conformal structure, then $a = L[-2]|0\rangle$ is globally defined, so it lies in $H^0$ and provides cohomology with a conformal structure of the same central charge.

We finish this section with a discussion of BRST cohomology. Let V be a vertex algebra and let a be an element of V such that $a_{(0)}^2 = 0$. Consider the cohomology of V with respect to $a_{(0)}$, which is called BRST cohomology. The operator $a_{(0)}$ and the field Y(a, z) are called the BRST operator and the BRST field respectively. The following proposition is standard.

Proposition 3.8. BRST cohomology of V with respect to $a_{(0)}$ has a natural structure of vertex algebra.

Proof. One has the following identity ([13], Eq. 4.6.9):
$$\{a_{(0)}, Y(b, z)\}_\pm = Y(a_{(0)} b, z).$$
Therefore, if b is annihilated by $a_{(0)}$, then Y(b, z) commutes with $a_{(0)}$ and preserves the kernel and the image of the BRST operator. This provides us with a set of mutually local fields for BRST cohomology, and it remains to employ the uniqueness theorem of Sect. 4.4 of [13].

In particular, if a is an odd element of V such that all modes $a_{(n)}$ anticommute with each other, then $a_{(0)}$ could serve as a BRST operator. All major results of this paper involve BRST cohomology by operators of this type.

4. Chiral de Rham Complex as a Sheaf of Vertex Algebras Over a Smooth Variety

In this section we review and summarize results of the extremely important paper of Malikov, Schechtman and Vaintrob [17]. Notations of our paper follow closely those of [17]. We assume some familiarity with the vertex algebras of free bosons and fermions. The reader is referred to [13] or pretty much any conformal field theory textbook.

For every smooth variety X the authors of [17] define a loco sheaf of vertex algebras MSV(X), which is called the chiral de Rham complex of X. It is described in local coordinates $x^1, \dots, x^{\dim X}$ as follows. There are given 2 dim X fermionic fields $\varphi^i(z)$, $\psi_i(z)$
and 2 dim X bosonic fields $a_i(z)$, $b^i(z)$, where the index i is allowed to run from 1 to dim X. The non-trivial super-commutators between modes are given by
$$\{a_i[k], b^j[l]\}_- = \delta_i^j\, \delta_{k+l}^0, \qquad \{\psi_i[k], \varphi^j[l]\}_+ = \delta_i^j\, \delta_{k+l}^0,$$
and the fields are defined as
$$a_i(z) = \sum_k a_i[k]\, z^{-k-1}, \qquad b^i(z) = \sum_k b^i[k]\, z^{-k}, \qquad \varphi^i(z) = \sum_k \varphi^i[k]\, z^{-k}, \qquad \psi_i(z) = \sum_k \psi_i[k]\, z^{-k-1}.$$
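As a consistency check, the mode relations above can be repackaged as operator product expansions. The sketch below assumes |z| > |w| and uses that the modes $a_i[k]$ and $\psi_i[k]$ with k ≥ 0 annihilate the vacuum, so that only these modes need to be moved past the second field; everything else is collected into terms that are regular at z = w.

```latex
% Only the modes a_i[k] with k >= 0 have to be commuted past b^j(w).
a_i(z)\, b^j(w)
  = \sum_{k \ge 0} \sum_{l \in \mathbb{Z}} \{a_i[k], b^j[l]\}_-\; z^{-k-1} w^{-l} + \mathrm{reg.}
  = \delta_i^j \sum_{k \ge 0} z^{-k-1} w^{k} + \mathrm{reg.}
  = \frac{\delta_i^j}{z - w} + \mathrm{reg.}

% The same computation with the anticommutator \{\psi_i[k], \varphi^j[l]\}_+ gives
\psi_i(z)\, \varphi^j(w) = \frac{\delta_i^j}{z - w} + \mathrm{reg.}
```

The geometric series converges exactly in the region |z| > |w|, which is why the expansion is stated there.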
There is defined a Fock space generated from the vacuum vector $|0\rangle$ by non-positive modes of b and ϕ and by negative modes of a and ψ. To obtain sections of MSV(X) over a neighborhood U of x = 0, one considers the tensor product V of the above Fock space with the ring of functions over U, with $b^i[0]$ plugged in place of $x^i$. The tensor product is taken over the ring C[b[0]]. Grading on this space is defined as the opposite of the sum of mode numbers.

Certainly one needs to specify how elements of V change under a change of local coordinates. For each new set of coordinates $\tilde{x}^i = g^i(x)$, $x^j = f^j(\tilde{x})$ this is accomplished in [17] by the formulas
$$\tilde{b}^i(z) = g^i(b(z)), \qquad \tilde{\varphi}^i(z) = g^i_j(b(z))\,\varphi^j(z),$$
$$\tilde{a}_i(z) = {:}a_j(z)\, f^j_i(b(z)){:} + {:}\psi_k(z)\, f^k_{i,l}(b(z))\, g^l_r(b(z))\, \varphi^r(z){:}, \qquad \tilde{\psi}_i(z) = \psi_j(z)\, f^j_i(b(z)),$$
where
$$g^i_j = \partial g^i/\partial x^j, \qquad f^i_j = (\partial f^i/\partial \tilde{x}^j) \circ g, \qquad f^i_{j,k} = (\partial^2 f^i/\partial \tilde{x}^j \partial \tilde{x}^k) \circ g,$$
and the normal ordering :: is defined by pushing all positive modes of a, b, ψ and ϕ to the right, multiplying by (−1) whenever two fermionic modes are switched.

For any choice of local coordinates, one introduces the fields
$$L(z) = {:}\partial_z b^i(z)\, a_i(z){:} + {:}\partial_z \varphi^i(z)\, \psi_i(z){:}, \qquad J(z) = {:}\varphi^i(z)\psi_i(z){:}, \qquad G(z) = \partial_z b^i(z)\,\psi_i(z), \qquad Q(z) = a_i(z)\,\varphi^i(z).$$
The field L(z) is invariant under the change of coordinates, which provides MSV(X) with the structure of a sheaf of conformal vertex algebras. The L[0] = 0 part is naturally isomorphic to the usual de Rham complex on X, with grading given by eigenvalues of J[0] and differential given by Q[0] (both modes are globally defined). If X is Calabi–Yau, all four
of the above fields are well-defined, which provides MSV(X) with the structure of a sheaf of topological vertex algebras. This means that spaces of sections over affine subsets are equipped with structures of topological vertex algebras, in a manner consistent with localization. This structure is analogous to the conformal structure but requires a choice of four fields Q, G, J and L that satisfy certain OPEs, see [17].

It was suggested in [17] that the cohomology of MSV(X) has a structure of vertex algebra which describes the holomorphic part of the A model of X, see [22]. Since we now know how to provide the cohomology of the loco sheaf of vertex algebras MSV(X) with such a structure, we can state the following definition.

Definition 4.1. Let X be a smooth algebraic variety over C. We define the A model topological vertex algebra of X to be $H^*(\mathrm{MSV}(X))$ with the structure of vertex algebra on it defined in Proposition 3.7. This algebra also possesses the conformal structure, since L(z) is globally defined, as well as the structure of the topological algebra, with operators given by the formulas of [17], in the case when X is Calabi–Yau.

If X is a Calabi–Yau variety, one can also talk about the B model topological vertex algebra of X. As a vertex algebra, it is identical to the A model, but the additional structure of the topological algebra differs.

Definition 4.2. Let X be a smooth Calabi–Yau manifold. We define the B model topological vertex algebra of X as follows. As a vector space, it coincides with the A model topological vertex algebra of X. The operator T is also the same, and so are the fields of the algebra. The topological structure of the B model vertex algebra is related to the topological structure of the A model algebra by the mirror involution
$$Q_B = G_A, \qquad G_B = Q_A, \qquad J_B = -J_A, \qquad L_B = L_A - \partial J_A.$$
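A quick sanity check, offered here only as an illustration, is that the substitution of Definition 4.2 really is an involution. Writing $(Q', G', J', L')$ for the result of applying the same rule to the B model fields (the primed names are introduced just for this check), one finds the A model fields again:

```latex
% Apply the rule (Q, G, J, L) -> (G, Q, -J, L - \partial J) to the B model fields.
Q' = G_B = Q_A, \qquad
G' = Q_B = G_A, \qquad
J' = -J_B = J_A,
\qquad
L' = L_B - \partial J_B = (L_A - \partial J_A) + \partial J_A = L_A .
```

So the A and B structures determine each other, which is consistent with treating them as two topological structures on one and the same vertex algebra.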
In what follows we will often abuse the notations and call these algebras simply A and B model vertex algebras, but it should always be understood that they are considered together with their extra structure. Notice that the B model is ill-defined for varieties X that are not Calabi–Yau, even as a conformal vertex algebra, because $L_B$ is ill-defined for them.

The first goal of our paper is to calculate A and B model vertex algebras for Calabi–Yau hypersurfaces in smooth toric nef-Fano varieties. The second goal (which is only partially achieved) is to generalize our results to toric varieties with Gorenstein singularities.

5. Vertex Algebras of Line Bundles and Zeros of Their Sections

In this section we study the chiral de Rham complex of a line bundle L over a smooth variety P. Given a section of the dual line bundle, we are able to calculate the chiral de Rham complex of its zero set in terms of the push-forward of the chiral de Rham complex of the line bundle.

We denote the projection to the base by π : L → P. The line bundle structure is locally described by the fact that there is one special coordinate $x^1$ such that the allowed local changes of coordinates are compositions of changes in $x^2, \dots, x^{\dim L}$ and changes
$$\tilde{x}^1 = x^1 h(x^2, \dots, x^{\dim L}), \qquad \tilde{x}^i = x^i, \quad i \ge 2.$$
Proposition 5.1. The field $b^1(z)\psi_1(z)$ depends only on the line bundle structure of L.

Proof. This field is clearly unaffected by any changes of coordinates on the base that leave $x^1$ intact. In addition, for the change of coordinates as above, we have
$$\tilde{b}^1 = b^1 h(b^2, \dots, b^{\dim L}), \qquad \tilde{\psi}_1 = \psi_1 / h(b^2, \dots, b^{\dim L}).$$
As a result, the field $b^1(z)\psi_1(z)$ is independent of the choice of local coordinates that are compatible with the given line bundle structure.

The following lemma is clear.

Lemma 5.2. The line bundle L has trivial canonical class if and only if L is the canonical line bundle on P.

Remark 5.3. Since MSV(L) is a loco sheaf, its cohomology spaces $H^*(\mathrm{MSV}(L))$ are isomorphic to the cohomology $H^*(\pi_*\mathrm{MSV}(L))$ of its push-forward to P, which is the sheaf we are mostly interested in.

Remark 5.4. One may also consider the bundle ΠL obtained by declaring the coordinate on L odd. It turns out that the corresponding sheaf of vertex algebras is roughly the same as the corresponding sheaf for the even bundle $L^{-1}$. More precisely, the push-forwards of both bundles to P coincide. Locally the isomorphism is obtained by mapping $(b^1, \varphi^1, a_1, \psi_1)$ for ΠL to $(\psi_1, a_1, \varphi^1, b^1)$ for $L^{-1}$. It has been observed in [20] that mirror symmetry for Calabi–Yau complete intersections may be formulated in terms of odd bundles on ambient projective varieties.

Let us additionally assume that we have at our disposal a section µ of the dual line bundle $L^{-1}$. This amounts to having a global function on L which is linear on fibers. We will also assume that the zeros of µ form a reduced non-singular divisor X on P. The goal of the rest of the section is to describe MSV(X) in terms of $\pi_*\mathrm{MSV}(L)$.

The following lemma is easily checked by a calculation in local coordinates. In fact, it holds for any global function on any smooth variety.

Lemma 5.5. The fields µ(z) and Dµ(z) that are locally defined as
$$\mu(z) = \mu(b^1, \dots, b^{\dim L})(z), \qquad D\mu(z) = \sum_i \varphi^i(z)\, \frac{\partial \mu}{\partial b^i}(z)$$
are independent of the choice of coordinates and are therefore globally defined. In particular, the operator $\mathrm{BRST}_\mu = \oint D\mu(z)\,dz$ is globally defined.

It is now time to state the main result of this section.

Theorem 5.6. Let X be a smooth hypersurface in a smooth variety P defined as above by a section µ of the line bundle $L^{-1}$. Then the sheaf of vertex algebras MSV(X) is isomorphic to the BRST cohomology of the sheaf $\pi_*\mathrm{MSV}(L)$ with respect to the operator $\mathrm{BRST}_\mu$. Here cohomology is understood in the sense of sheaves, that is, as a sheafification of the BRST cohomology presheaf.
Proof. Clearly, $\mathrm{BRST}_\mu$ is a differential, because its anticommutator with itself is zero. We denote the BRST cohomology of $\pi_*\mathrm{MSV}(L)$ by $\widetilde{\mathrm{MSV}}(X)$, and it is our goal to construct an isomorphism between $\widetilde{\mathrm{MSV}}(X)$ and MSV(X). It is enough to construct this isomorphism locally for any point p ∈ P, provided that our construction withstands a change of coordinates. We use the Hausdorff topology on P rather than the Zariski topology. The point p may or may not lie in X, so our discussion splits into two cases.

Case 1. p ∉ X. In this case for any sufficiently small neighborhood U ⊂ P of p we can choose a coordinate system $(x^1, x^2, \dots, x^{\dim L})$ on $\pi^{-1}U$ such that $x^1$ is the special line bundle variable and $\mu = x^1$. As a result,
$$\mathrm{BRST}_\mu = \oint \varphi^1(z)\,dz = \varphi^1[1].$$
A simple calculation on the flat space then shows that the cohomology by $\mathrm{BRST}_\mu$ is zero on sections of $\pi_*\mathrm{MSV}(L)$ for any sufficiently small U. As a result, $\widetilde{\mathrm{MSV}}(X)$ is supported on X, which is, of course, true of MSV(X).

Case 2. p ∈ X. For any sufficiently small neighborhood U ⊂ P of p we can choose a system of coordinates $(x^1, x^2, \dots, x^{\dim L})$ on $\pi^{-1}U$ that agrees with the line bundle structure, such that $\mu = x^1 x^2$ and $x^3, \dots, x^{\dim L}$ form a system of local coordinates on X ∩ U. We then have
$$\mathrm{BRST}_\mu = \oint (b^1\varphi^2 + b^2\varphi^1)(z)\,dz = \sum_{k \in \mathbb{Z}} \bigl(b^1[k]\,\varphi^2[-k+1] + b^2[k]\,\varphi^1[-k+1]\bigr).$$
The Fock space $\Gamma(U, \pi_*\mathrm{MSV}(L))$ is a tensor product of spaces $\mathrm{Fock}_{1,2}$ and $\mathrm{Fock}_{\ge 3}$, which are the spaces generated by modes of $a_i, b^i, \varphi^i, \psi_i$ for i ∈ {1, 2} and i ∈ {3, …, dim L} respectively. Since $\mathrm{BRST}_\mu$ acts on the first component of this tensor product, its cohomology is isomorphic to the tensor product of $\mathrm{Fock}_{\ge 3}$ and the cohomology of $\mathrm{Fock}_{1,2}$ with respect to $\mathrm{BRST}_\mu$. We claim that the cohomology of $\mathrm{Fock}_{1,2}$ with respect to $\mathrm{BRST}_\mu$ is one-dimensional and is generated by the image of the vacuum vector $|0\rangle$. We do not multiply by $\Gamma(U, \mathcal{O}_P)$, which does not alter the argument.

Notice first that $\mathrm{Fock}_{1,2}$ is a restricted tensor product (that is, almost all factors are 1) of the following infinite set of vector spaces:

• $\bigoplus_{l \ge 0} \mathbb{C}(a_1[-k])^l \oplus \bigoplus_{l \ge 0} \mathbb{C}(a_1[-k])^l \varphi^2[-k+1]$, for all k > 0,
• $\bigoplus_{l \ge 0} \mathbb{C}(a_2[-k])^l \oplus \bigoplus_{l \ge 0} \mathbb{C}(a_2[-k])^l \varphi^1[-k+1]$, for all k > 0,
• $\bigoplus_{l \ge 0} \mathbb{C}(b^1[-k])^l \oplus \bigoplus_{l \ge 0} \mathbb{C}(b^1[-k])^l \psi_2[-k-1]$, for all k ≥ 0,
• $\bigoplus_{l \ge 0} \mathbb{C}(b^2[-k])^l \oplus \bigoplus_{l \ge 0} \mathbb{C}(b^2[-k])^l \psi_1[-k-1]$, for all k > 0,
• $\bigoplus_{l \ge 0} R(b^2[-k])^l \oplus \bigoplus_{l \ge 0} R(b^2[-k])^l \psi_1[-k-1]$, for all k > 0.
In the last formula R means the ring of functions on a disc. We assume here that the neighborhood U is a product of $|x^2| \le c$ and some $U_{x^3, \dots, x^{\dim L}}$. The vacuum vector, of course, corresponds to the product of all 1. The Fock space is graded by the eigenvalues of $L_{1,2}[0]$, that is, by the opposite of the total sum of indices. The operator $\mathrm{BRST}_\mu$ shifts this grading by −1. If we consider elements with bounded grading, it is enough to consider only a product of a finite number of the above spaces. For each such product, $\mathrm{BRST}_\mu$ is a sum of anticommuting operators on each component. One can then show that the cohomology is a tensor product of cohomologies for each component
by induction on the number of components. On each step of the induction we are using a spectral sequence for a stupid filtration of the tensor product complex, with grading given by eigenvalues of $L_{1,2}[0]$. As a result, to show that the cohomology space is one-dimensional, it is enough to show that for each of the spaces above the cohomology is one-dimensional and is given by the image of 1. It is sufficient to consider the first, third, and fifth types only. For a space of the first type, the kernel of $\mathrm{BRST}_\mu$ is
$$\mathbb{C}1 \oplus \bigoplus_{l \ge 0} \mathbb{C}(a_1[-k])^l \varphi^2[-k+1]$$
and its image is
$$\bigoplus_{l \ge 0} \mathbb{C}(a_1[-k])^l \varphi^2[-k+1],$$
so the image of 1 generates the cohomology. For a space of the third type, the kernel is
$$\bigoplus_{l \ge 0} \mathbb{C}(b^1[-k])^l$$
and the image is
$$\bigoplus_{l \ge 1} \mathbb{C}(b^1[-k])^l,$$
which gives the same result. For the space of the fifth type, we use R/xR = C.

So we managed to show that for a given choice of coordinates on $\pi^{-1}U$, there is an isomorphism between sections of $\widetilde{\mathrm{MSV}}(X)$ and MSV(X). The proof is not over yet, because we need to show that these locally defined isomorphisms can be glued together. This amounts to the demonstration that the isomorphism just constructed commutes with any changes of coordinates on $\pi^{-1}U$ that preserve our setup. Every such coordinate change can be written in the form
$$\tilde{x}^1 = x^1 \cdot h(x^2, \dots, x^{\dim L}), \qquad \tilde{x}^2 = x^2 / h(x^2, \dots, x^{\dim L}),$$
$$\tilde{x}^i = f^i(x^3, \dots, x^{\dim L}) + x^2 g^i(x^2, \dots, x^{\dim L}), \qquad i \ge 3.$$
It is clear that when h = 1 and $g^i = 0$ the corresponding splitting of the Fock space is unaffected and the resulting isomorphism precisely matches the change of variables on X. As a result, we only need to show that the isomorphism commutes with coordinate changes such that $f^i(x) = x^i$. One can show that in this case the fields $\tilde{a}_i(z), \tilde{b}^i(z), \tilde{\varphi}^i(z), \tilde{\psi}_i(z)$ for i ≥ 3 act on the cohomology in the same way as the operators $a_i(z), b^i(z), \varphi^i(z), \psi_i(z)$, because the difference lies in the image of $\mathrm{BRST}_\mu$. This finishes the proof.

It is clear that our isomorphism commutes with the structures of sheaves of vertex algebras. We also have the following corollary, which will be very useful later.

Proposition 5.7. For any affine subset U ⊂ P the $\mathrm{BRST}_\mu$ cohomology space of $\Gamma(U, \pi_*\mathrm{MSV}(L))$ is isomorphic to Γ(U, MSV(X)).

Proof. The sheaf $\pi_*\mathrm{MSV}(L)$ is a quasi-loco sheaf, and $\mathrm{BRST}_\mu$ is a map of a quasi-loco sheaf into itself. For any affine subset it is induced from the map of the corresponding loop-modules, and then everything follows from Remark 2.10.

We are especially interested in the case where L has a non-degenerate top form. In this case, by Lemma 5.2, L is a canonical line bundle, and a section µ of $L^{-1}$ produces a Calabi–Yau divisor X on P. Our goal here is to calculate the global fields G(z) and Q(z) on MSV(X) in terms of some global fields on $\pi_*\mathrm{MSV}(L)$.
Proposition 5.8. When L is a canonical bundle on P, the field $G_X(z)$ is the image of the field $G_L(z) - (b^1(z)\psi_1(z))'$. The field $Q_X(z)$ is the image of the field $Q_L(z)$.

Proof. Because of Proposition 5.1, all fields in question are defined globally, so a local calculation is sufficient. We assume the notations of the proof of Theorem 5.6. Then we have
$$Q_X(z) - Q_L(z) = -a_1(z)\varphi^1(z) - a_2(z)\varphi^2(z),$$
$$G_X(z) - G_L(z) + (b^1(z)\psi_1(z))' = -b^2(z)'\,\psi_2(z) + b^1(z)\,\psi_1(z)',$$
and we need to show that the right-hand sides of these equations are, up to sign, commutators of $\mathrm{BRST}_\mu$ and some fields. This goal is accomplished by the fields $-a_1(z)a_2(z)$ and $-\psi_1(z)'\psi_2(z)$ respectively.

6. BRST Description of Vertex Algebra in Logarithmic Coordinates

This section is in a sense a mirror of the previous one. It contains a local calculation of the chiral de Rham complex of a smooth toric variety as the BRST cohomology of some MSV-like space defined in terms of local coordinates.

We introduce some notations which will stay with us for the rest of the paper. Let M be a free abelian group of rank dim M and let N = Hom(M, Z) be its dual. The vector space (M ⊕ N) ⊗ C has dimension 2 dim M and it is equipped with a standard bilinear form denoted by “·”. This allows us to construct 2 dim M bosonic and 2 dim M fermionic fields. Indeed, one can always construct k bosonic and k fermionic fields starting from a vector space of dimension k with a non-degenerate bilinear form on it, see for example [13], so our purpose here is to fix notations. For every m ∈ M and n ∈ N we have
$$m \cdot B(z) = \sum_{k \in \mathbb{Z}} m \cdot B[k]\, z^{-k-1}, \qquad n \cdot A(z) = \sum_{k \in \mathbb{Z}} n \cdot A[k]\, z^{-k-1},$$
$$m \cdot \Phi(z) = \sum_{k \in \mathbb{Z}} m \cdot \Phi[k]\, z^{-k}, \qquad n \cdot \Psi(z) = \sum_{k \in \mathbb{Z}} n \cdot \Psi[k]\, z^{-k-1}.$$
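Notice that the moding of B also has $z^{-k-1}$ in it, in contrast to the moding of $b^i$ in the previous section. The non-zero super-commutators are
$$\{m \cdot B[k], n \cdot A[l]\}_- = (m \cdot n)\, k\, \delta_{k+l}^0\, \mathrm{id}, \qquad \{m \cdot \Phi[k], n \cdot \Psi[l]\}_+ = (m \cdot n)\, \delta_{k+l}^0\, \mathrm{id}.$$
Our battlefield will be the following space, whose construction is standard as well:
$$\mathrm{Fock}_{M \oplus N} \;=_{\mathrm{def}}\; \bigoplus_{m \in M,\, n \in N} \;\bigotimes_{k \ge 1} \mathbb{C}[B[-k]] \;\bigotimes_{k \ge 1} \mathbb{C}[A[-k]] \;\bigotimes_{l \ge 0} (\mathbb{C} + \mathbb{C}\Phi[-l]) \;\bigotimes_{l \ge 1} (\mathbb{C} + \mathbb{C}\Psi[-l])\, |m, n\rangle.$$
Here ⊗ means the restricted tensor product over C, where only finitely many factors are not equal to 1. The vectors $|m, n\rangle$ are annihilated by positive modes of A, B, and Φ, and by non-negative modes of Ψ. Also,
$$A[0]|m, n\rangle = m|m, n\rangle, \qquad B[0]|m, n\rangle = n|m, n\rangle.$$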
This Fock space possesses a structure of vertex algebra, see for example [13]. Among the fields of this algebra an important role is played by the so-called vertex operators
$${:}e^{\int (m \cdot B(z) + n \cdot A(z))}{:}$$
which are defined as follows:
$${:}e^{\int (m \cdot B(z) + n \cdot A(z))}{:}\; \prod A[\dots] \prod B[\dots] \prod \Phi[\dots] \prod \Psi[\dots]\, |m_1, n_1\rangle$$
$$= C(m, n, m_1, n_1)\, z^{m \cdot n_1 + n \cdot m_1} \prod_{k<0} e^{-(m \cdot B[k] + n \cdot A[k]) \frac{z^{-k}}{k}} \prod_{k>0} e^{-(m \cdot B[k] + n \cdot A[k]) \frac{z^{-k}}{k}} \prod A[\dots] \prod B[\dots] \prod \Phi[\dots] \prod \Psi[\dots]\, |m + m_1, n + n_1\rangle.$$
The cocycle $C(m, n, m_1, n_1)$ here equals $(-1)^{m \cdot n_1}$. It is used to make vertex operators purely bosonic. Our notation suppresses this cocycle, which is a bit unusual but should not lead to any confusion. Vertex operators obey the following OPEs:
$${:}e^{\int (m \cdot B(z) + n \cdot A(z))}{:}\;{:}e^{\int (m_1 \cdot B(w) + n_1 \cdot A(w))}{:} = (z - w)^{m \cdot n_1 + n \cdot m_1}\; {:}e^{\int (m \cdot B(z) + n \cdot A(z))}\, e^{\int (m_1 \cdot B(w) + n_1 \cdot A(w))}{:}\,,$$
where putting both fields under the same :: sign means that we move all negative modes to the left and all positive modes to the right, as in the definition of vertex operators above. Of course, this OPE could be expanded by the Taylor formula, and the resulting fields are normal ordered products of vertex operators, free bosons, and their derivatives. In general, all fields of the vertex algebra $\mathrm{Fock}_{M \oplus N}$ are normal ordered products of various B, A, Ψ, Φ and their derivatives times one (perhaps trivial) vertex operator.

Remark 6.1. The vertex algebra $\mathrm{Fock}_{M \oplus N}$ possesses a conformal structure, given by
$$L_{M \oplus N}(z) = {:}B(z) \cdot A(z){:} + {:}\partial_z \Phi(z) \cdot \Psi(z){:}.$$
The corresponding grading operator $L_{M \oplus N}[0]$ assigns grading m · n to the vector $|m, n\rangle$. Elements $\prod_i m_i \cdot \Phi[0]|m, n\rangle$ have the same eigenvalue, but for every other element with the same A[0] and B[0] eigenvalues, the grading is strictly larger than m · n. Under this conformal structure the moding of the fields A, B, Ψ and Φ is as above.

The goal of this section is to construct a flat space vertex algebra in terms of A, B, Φ, Ψ. We first look at the case of dimension one. In this case M and N are one-dimensional, and A, B, Φ, Ψ are no longer vector-valued. Consider the following fields:
$$b(z) = e^{\int B(z)}, \qquad \varphi(z) = \Phi(z)\, e^{\int B(z)}, \qquad \psi(z) = \Psi(z)\, e^{-\int B(z)},$$
$$a(z) = {:}A(z)\, e^{-\int B(z)}{:} + {:}\Phi(z)\Psi(z)\, e^{-\int B(z)}{:}.$$
Proposition 6.2. The operator product expansions of a, b, ϕ, ψ are
$$a(z)b(w) = \frac{1}{z - w} + \mathrm{reg.}, \qquad \varphi(z)\psi(w) = \frac{1}{z - w} + \mathrm{reg.},$$
and all other OPEs are non-singular.

Proof. It is a standard calculation of OPEs that include vertex operators, which we omit. This is not too surprising, since the fields in question are given by the formulas of [17] applied to the exponential change of variables $\tilde{x} = \exp(x)$.
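The omitted calculation is easiest to see for the fermionic pair. The following is a minimal sketch in rank one (so m · n = 1), assuming the vertex operator OPE and the free-fermion relation $\{\Phi[k], \Psi[l]\}_+ = \delta_{k+l}^0$ stated earlier in this section, and suppressing the cocycle signs exactly as the text does.

```latex
% Vertex-operator OPE with (m, n) = (1, 0) and (m_1, n_1) = (-1, 0):
% the exponent m \cdot n_1 + n \cdot m_1 vanishes, so there is no singular factor.
e^{\int B(z)}\, e^{-\int B(w)} = {:}e^{\int B(z) - \int B(w)}{:} = 1 + O(z - w).

% Free-fermion contraction: positive modes of \Phi annihilate the vacuum, hence
\Phi(z)\Psi(w) = \sum_{k \ge 1} z^{-k} w^{k-1} + {:}\Phi(z)\Psi(w){:}
               = \frac{1}{z - w} + {:}\Phi(z)\Psi(w){:}.

% Combining the two, since \Phi and \Psi commute with all modes of B:
\varphi(z)\psi(w) = \Phi(z)\, e^{\int B(z)}\; \Psi(w)\, e^{-\int B(w)}
                  = \frac{1}{z - w} + \mathrm{reg.}
```

The a(z)b(w) expansion follows the same pattern, except that there the simple pole comes from the factor $(z - w)^{-1}$ in the OPE of $e^{-\int B(z)}$ with the operators built from A.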
Proposition 6.3. The modes of the fields a, b, ϕ, ψ generate a vertex algebra which is isomorphic to the global sections of the chiral de Rham complex of a one-dimensional affine space.

Proof. First, all OPEs are correct due to Proposition 6.2. Since a and b are bosonic and ϕ and ψ are fermionic, this implies that the super-commutators of their modes are correct. Notice that the conformal weight of b is zero, so it is moded correctly. One can also show that positive modes of b, ϕ and non-negative modes of a, ψ annihilate $|0, 0\rangle$. The rest follows from the fact that the Fock representation of the algebra of modes is irreducible.

The following calculation is extremely useful.

Proposition 6.4. We define L(z), J(z), Q(z), and G(z) for a, b, ϕ, ψ as usual, see [17]. Then in terms of A, B, Φ, Ψ we have
$$Q(z) = A(z)\Phi(z) - \partial_z \Phi(z), \qquad G(z) = B(z)\Psi(z),$$
$$J(z) = {:}\Phi(z)\Psi(z){:} + B(z), \qquad L(z) = {:}B(z)A(z){:} + {:}\partial_z \Phi(z)\Psi(z){:}.$$

Proof. It is a standard calculation, which is again omitted. Observe that Q and J acquire extra terms under the exponential change of variables, as one can expect from Theorem 4.2 of [17].

We now define $\mathrm{Fock}_{M \oplus N_{\ge 0}}$ as a subalgebra of $\mathrm{Fock}_{M \oplus N}$ characterized by the condition that eigenvalues of B[0] are non-negative. This amounts to only allowing $|m, n\rangle$ with n ≥ 0. We will now show that the vertex algebra generated by a, b, ϕ, ψ can be obtained as a certain BRST cohomology of the vertex algebra $\mathrm{Fock}_{M \oplus N_{\ge 0}}$.

Theorem 6.5. The vertex algebra of a, b, ϕ, ψ is isomorphic to the BRST cohomology of $\mathrm{Fock}_{M \oplus N_{\ge 0}}$ with respect to the operator
$$\mathrm{BRST}_g = \oint \mathrm{BRST}_g(z)\,dz = \oint g\,\Psi(z)\, e^{\int A(z)}\,dz,$$
where g is an arbitrary non-zero complex number.

Proof. First of all, notice that all modes of a, b, ϕ, ψ commute with $\mathrm{BRST}_g$. Indeed, all these fields except a(z) give non-singular OPEs with $\mathrm{BRST}_g(w)$, and
$$a(z)\,\mathrm{BRST}_g(w) = \frac{g\,{:}A(z)\Psi(w)\, e^{-\int B(z) + \int A(w)}{:}}{z - w} + g\,\frac{e^{-\int B(z) + \int A(w)}}{z - w} \Bigl(-\frac{\Psi(z)}{z - w} + O(z - w)\Bigr) + \mathrm{reg.}$$
$$= \frac{-g\,\Psi(z)\,{:}e^{\int (A(z) - B(z))}{:}}{(z - w)^2} + \mathrm{reg.},$$
which implies $\{a(z), \mathrm{BRST}_g\}_- = 0$.

The space $\mathrm{Fock}_{M \oplus N_{\ge 0}}$ is graded by eigenvalues of B[0], and $\mathrm{BRST}_g$ shifts them by one. We first show that $\mathrm{BRST}_g$ has no cohomology for eigenvalues of B[0] that are positive.
Indeed, we can look at the operator $R(z) = \Phi(z)\, e^{-\int A(z)}$. A similar calculation shows that
$$\{R(z), \mathrm{BRST}_g\}_+ = g \cdot \mathrm{id}$$
and therefore the anticommutator of the zeroth mode of R(z) and $\mathrm{BRST}_g$ is the identity. Thus we have found a homotopy operator, which ensures that there is no cohomology at positive eigenvalues of B[0]. Fortunately, the above operator shoots out of $\mathrm{Fock}_{M \oplus N_{\ge 0}}$ from the zero eigen-space of B[0].

So we found that the cohomology is isomorphic to the kernel of $\mathrm{BRST}_g$ on the zero eigen-space of B[0]. To show that all elements of this space can be obtained by applying modes of a, b, ϕ, ψ to $|0, 0\rangle$, we employ the result of Proposition 6.4. More precisely, L[0] has non-negative eigenvalues. Moreover, its zero eigen-space is $\bigoplus_{m \in \mathbb{Z}} (\mathbb{C} \oplus \mathbb{C}\Phi[0])|m, 0\rangle$. Since L[0] commutes with $\mathrm{BRST}_g$, it induces a grading on the kernel. We prove by induction on eigenvalues of L[0] that all elements of the kernel of $\mathrm{BRST}_g$ with zero eigenvalue of B[0] are obtained by applying modes of a, b, ϕ, ψ to $|0, 0\rangle$.

For L[0] = 0 notice that the cohomology is graded by eigenvalues of A[0]. An explicit calculation then shows that for k < 0 the elements $\mathrm{BRST}_g|k, 0\rangle$ and $\mathrm{BRST}_g\Phi[0]|k, 0\rangle$ are linearly independent. In addition, $\mathrm{BRST}_g\Phi[0]|0, 0\rangle$ is non-zero (it is proportional to $|0, 1\rangle$), and the rest is generated by modes of b and ϕ.

If L[0]v = lv with l > 0, notice that
$$L[0] = \sum_{k<0} k\, a[k]\, b[-k] + \sum_{k>0} k\, b[-k]\, a[k] - \sum_{k<0} k\, \psi[k]\, \varphi[-k] + \sum_{k>0} k\, \varphi[-k]\, \psi[k].$$
When applied to v, only finitely many terms survive. So we have
$$v = \frac{1}{l} \sum_i p_i q_i v.$$
Since v is in the kernel of $\mathrm{BRST}_g$, and $p_i$ commutes with $\mathrm{BRST}_g$, $p_i v$ is in the kernel for each i. Also $p_i v$ has a strictly lower eigenvalue of L[0], so it is generated by modes of a, b, ϕ, ψ due to the induction assumption. Therefore, v is also generated by modes of a, b, ϕ, ψ, which finishes the proof.

We can extend this theorem to lattices of any dimension as follows. Consider a primitive cone $K^*$ in the lattice N. Primitive here means that it is generated by a basis $n_1, \dots, n_{\dim N}$ of N. The dual basis is denoted by $m_1, \dots, m_{\dim M}$. We denote by $\mathrm{Fock}_{M \oplus K^*}$ the subalgebra of $\mathrm{Fock}_{M \oplus N}$ where eigenvalues of B[0] are allowed to lie in $K^*$. We consider the vertex algebra of flat space that is generated by the fields
$$b^i(z) = e^{\int m_i \cdot B(z)}, \qquad \varphi^i(z) = (m_i \cdot \Phi(z))\, e^{\int m_i \cdot B(z)}, \qquad \psi_i(z) = (n_i \cdot \Psi(z))\, e^{-\int m_i \cdot B(z)},$$
$$a_i(z) = {:}(n_i \cdot A(z))\, e^{-\int m_i \cdot B(z)}{:} + {:}(m_i \cdot \Phi(z))(n_i \cdot \Psi(z))\, e^{-\int m_i \cdot B(z)}{:}$$
for all i = 1, …, dim M.
Theorem 6.6. The vertex algebra of $a_i$, $b^i$, $\varphi^i$, $\psi_i$ is isomorphic to the BRST cohomology of $\mathrm{Fock}_{M\oplus K^*}$ with respect to the operator
\[ \mathrm{BRST}_g = \oint \mathrm{BRST}_g(z)\,dz = \oint \sum_i g_i\,(n_i\cdot\Psi(z))\,e^{\int n_i\cdot A(z)}\,dz, \]
where $g_1,\dots,g_{\dim M}$ are arbitrary non-zero complex numbers. Moreover, the operators $L$, $J$, $G$ and $Q$ are given by
\[ Q(z) = A(z)\cdot\Phi(z) - \mathrm{deg}\cdot\partial_z\Phi(z), \qquad G(z) = B(z)\cdot\Psi(z), \]
\[ J(z) = {:}\Phi(z)\cdot\Psi(z){:} + \mathrm{deg}\cdot B(z), \qquad L(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:}, \]
where "deg" is an element in $M$ that equals 1 on all generators of $K^*$.
Proof. The result follows immediately from Theorem 6.5. Indeed, $\mathrm{Fock}_{M\oplus K^*}$ is a tensor product of $\dim M$ spaces discussed there. We grade each space by eigenvalues of $m_i\cdot B[0]$, and $\mathrm{BRST}_{g_i}$ becomes a degree one differential. The operator $\mathrm{BRST}_g$ is a total differential on the corresponding total complex, which finishes the proof. Another option is to go through the proof of Theorem 6.5 with minor changes due to higher dimension.
7. Smooth Toric Varieties and Hypersurfaces
The result of the previous section could be interpreted as a calculation of the chiral de Rham complex for a smooth affine toric variety given by a cone $K^*\subset N$. The first objective of this section is to learn how to glue these objects together to get the chiral de Rham complex of a smooth toric variety (or rather a canonical line bundle over it). Then we employ the result of Theorem 5.6 to calculate vertex algebras of Calabi–Yau hypersurfaces in toric varieties.
Let us recall the set of data that defines a smooth toric variety. For the general theory of toric varieties see [9, 10, 19]. The paper of Batyrev [1] may also be helpful. A toric variety $\mathbf{P}_\Sigma$ is given by a fan $\Sigma$, which is a collection of rational polyhedral cones in $N$ with vertex at 0 such that
(a) for any cone $C^*$ of $\Sigma$, $C^*\cap(-C^*)=\{0\}$;
(b) if two cones intersect, their intersection is a face in both of them;
(c) if a cone lies in $\Sigma$, then all its faces lie in $\Sigma$; this also includes the vertex (zero-dimensional face) of the cone.
A toric variety is smooth if and only if all cones in $\Sigma$ are basic, that is, generated by a part of a basis of $N$. For every cone $C^*\in\Sigma$, one considers the dual cone $C\subset M$ defined by $C=\{m\in M \text{ s.t. } m\cdot C^*\ge 0\}$ and the corresponding affine variety $A_C=\mathrm{Spec}(\mathbb{C}[C])$. We employ a multiplicative notation and denote elements of $\mathbb{C}[M]$ by $x^m$ for all $m\in M$. If $C_1^*$ is a face of $C_2^*$, then $\mathbb{C}[C_1]$ is a localization of $\mathbb{C}[C_2]$ by all $x^m$ for which $m\in C_2$, $m\cdot C_1^*=0$. This allows us to construct inclusion maps between affine varieties $A_{C_i}$ and then glue them all together to form a toric variety $\mathbf{P}=\mathbf{P}_\Sigma$.
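The smoothness criterion above (every cone is basic) can be tested mechanically for a concrete fan. The following is a minimal sketch, not part of the original text: it assumes $N=\mathbb{Z}^d$ with cone generators given as integer tuples, and it uses the standard fact that integer vectors extend to a basis of $\mathbb{Z}^d$ exactly when the gcd of the maximal minors of their matrix is 1. The function names `int_det` and `is_basic` are illustrative only.

```python
from itertools import combinations
from math import gcd


def int_det(rows):
    """Exact determinant of a square integer matrix (cofactor expansion)."""
    size = len(rows)
    if size == 0:
        return 1
    if size == 1:
        return rows[0][0]
    total = 0
    for col, entry in enumerate(rows[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in rows[1:]]
        total += (-1) ** col * entry * int_det(minor)
    return total


def is_basic(generators):
    """Return True if the integer vectors form part of a basis of Z^d.

    Assumption (hypothetical helper): `generators` lists the generators
    of the one-dimensional faces of a single cone, as tuples of ints.
    """
    gens = [list(g) for g in generators]
    r = len(gens)
    if r == 0:
        return True
    d = len(gens[0])
    if r > d:
        return False
    g = 0
    # gcd of all r x r minors of the r x d generator matrix
    for cols in combinations(range(d), r):
        minor = [[row[c] for c in cols] for row in gens]
        g = gcd(g, abs(int_det(minor)))
    return g == 1


print(is_basic([(1, 0), (0, 1)]))   # True: the cone of the affine plane
print(is_basic([(1, 1), (-1, 1)]))  # False: determinant 2, an orbifold cone
```

The cofactor expansion is exponential in the dimension, which is acceptable for the small lattices that appear in examples; a fraction-free elimination would be the natural replacement for larger inputs.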
We already know how to describe sections of the chiral de Rham complex on a flat space that corresponds to a cone $K^*$ of maximum dimension. Our next step is to construct a vertex algebra that corresponds to a face of such a cone. Namely, let $C_1^*$ be generated by $n_1,\dots,n_r$, where $n_1,\dots,n_{\dim N}$ form a basis of $N$ and generate $C^*$. Then we can consider the BRST operator
\[ \mathrm{BRST}_g = \oint \sum_{i=1}^{r} g_i\,(n_i\cdot\Psi(z))\,e^{\int n_i\cdot A(z)}\,dz \]
that acts on $\mathrm{Fock}_{M\oplus C_1^*}$. The corresponding BRST cohomology will be denoted by $\mathrm{VA}_{C_1,g}$ and sections of the chiral de Rham complex on $A_C$ will be denoted by $\mathrm{VA}_{C,g}$. We have a natural surjective map $\rho:\mathrm{Fock}_{M\oplus C^*}\to\mathrm{Fock}_{M\oplus C_1^*}$ which commutes with $\mathrm{BRST}_g$. Here we, of course, abuse the notation a little bit by using two different definitions of $\mathrm{BRST}_g$ for $C^*$ and $C_1^*$. However, we assume that the $g_i$ there are the same for $i=1,\dots,r$.
For every $m\in C$ and every $l\in\mathbb{Z}$ the element $e^{\int m\cdot B}[l]$ acts on both $\mathrm{VA}_{C,g}$ and $\mathrm{VA}_{C_1,g}$. Indeed, its action on $\mathrm{Fock}_{M\oplus C^*}$ commutes with $\mathrm{BRST}_g$, because $m\cdot C^*\ge 0$. Consider the multiplicative system $S$ generated by elements $e^{\int m\cdot B(z)}[0]$ with $m\in C$, $m\cdot C_1^*=0$.
Proposition 7.1. The map $\rho$ induces a map $\rho_{\mathrm{BRST}}:\mathrm{VA}_{C,g}\to\mathrm{VA}_{C_1,g}$ which is precisely the localization map of the $\mathbb{C}[C]$-loop-module $\mathrm{VA}_{C,g}$ with respect to the multiplicative system $S$.
Proof. First we show that this map is the localization map of the corresponding vector spaces with the action of the multiplicative system. For this it is enough to show that $\rho$ is the localization map. This amounts to showing that any element $v$ of $\mathrm{Fock}_{M\oplus C^*}$ with eigenvalues of $B[0]$ equal to $n$, where $n\notin C_1^*$, is annihilated by some element in $S$. Since $n\notin C_1^*$, there exists an element $m\in C$ such that $m\cdot C_1^*=0$ and $m\cdot n>0$. It is easy to see that a power of $s[0]=e^{\int m\cdot B}[0]$ annihilates $v$. Indeed, it does not change the $L[0]$ eigenvalue of $v$, but on the other hand it increases its $A[0]\cdot B[0]$ eigenvalue by an arbitrary positive multiple of $m\cdot n$. So for big $l$ the $L[0]$ eigenvalue of $s[0]^l v$ is too small to fit into the subspace based on $|m_{\mathrm{original}}+lm,\,n\rangle$, see Remark 6.1.
Since $\mathrm{VA}_{C,g}$ has the structure of a loop-module over $\mathbb{C}[C]$, its localization has the structure of a loop-module over $\mathbb{C}[C_1]$. One can also show that the vertex algebra structure on $\mathrm{VA}_{C_1,g}$ is the localization of the structure on $\mathrm{VA}_{C,g}$.
Remark 7.2. It is interesting to observe that a surjective map on Fock spaces leads to an injective map on BRST cohomology.
Remark 7.3. Even though $e^{\int m\cdot B(z)}$ is invertible, its zero mode is not. This seems to contradict the calculations of Sect. 2, but the reason is the presence of negative $L[0]$ eigenvalues in $\mathrm{Fock}_{M\oplus K^*}$.
So we now have at our disposal a way of calculating sections of the chiral de Rham complex of a smooth toric variety on any toric affine subset of it. This allows us to calculate cohomology of the chiral de Rham complex. We will be most concerned with a calculation of cohomology of the chiral de Rham complex for the canonical line bundle $L$ over a complete toric variety $\mathbf{P}$. To get the fan of $L$ from the fan of $\mathbf{P}$ one adds an extra dimension to $N$ and then lifts the fan of $\mathbf{P}$ to height one, as illustrated by the following figure:
[Figure: the fan of $L$; rays from the origin $(0,0)$ pass through the points $(n_{\mathrm{old}},1)$ and $(0,1)$ on the line at height one.]
We adjust our notations to denote the whole new lattice by $N=N_1\oplus\mathbb{Z}$ and the new fan by $\Sigma$. An element in $M$ that defines the last coordinate in $N$ is denoted by "deg". Notice that for every cone $C^*\in\Sigma$ it is the same as "deg" from Proposition 6.6. We also denote by $K^*$ the union of all cones in $\Sigma$, which may or may not be convex.
As was noticed in Remark 5.3, we can consider the quasi-loco sheaf $\pi_*\mathrm{MSV}(L)$ on $\mathbf{P}$. By Proposition 2.9, its cohomology could be calculated as the Čech cohomology that corresponds to the covering of $L$ by open affine subsets $A_C$, where we only consider the cones $C^*$ that contain $(0,1)$. So we need to consider the Čech complex
\[ 0\to\oplus_{C_0^*}\mathrm{VA}_{C_0,g}\to\oplus_{(C_0^*,C_1^*)}\mathrm{VA}_{C_{01},g}\to\cdots\to\oplus_{(C_0^*,\dots,C_r^*)}\mathrm{VA}_{C_{0\dots r},g}\to 0, \]
where $C_{0\dots k}$ is the dual of the intersection $C^*_{0\dots k}$ of $C_0^*,\dots,C_k^*$. Here we have chosen non-zero numbers $g_n$ for all generators $n$ of one-dimensional cones in $\Sigma$. We know that each $\mathrm{VA}_{C,g}$ is the BRST cohomology of the corresponding Fock space, and our goal is to write the cohomology of the Čech complex as a certain BRST cohomology.
Proposition 7.4. Consider the following double complex
\[
\begin{array}{ccccccccc}
 & & 0 & & \cdots & & 0 & & \\
 & & \downarrow & & \cdots & & \downarrow & & \\
0 & \to & \oplus_{C_0^*}\bigl(\mathrm{Fock}_{M\oplus C_0^*}\bigr)_{\mathrm{deg}\cdot B[0]=0} & \to & \cdots & \to & \oplus_{(C_0^*,\dots,C_r^*)}\bigl(\mathrm{Fock}_{M\oplus C^*_{0\dots r}}\bigr)_{\mathrm{deg}\cdot B[0]=0} & \to & 0 \\
 & & \downarrow & & \cdots & & \downarrow & & \\
0 & \to & \oplus_{C_0^*}\bigl(\mathrm{Fock}_{M\oplus C_0^*}\bigr)_{\mathrm{deg}\cdot B[0]=1} & \to & \cdots & \to & \oplus_{(C_0^*,\dots,C_r^*)}\bigl(\mathrm{Fock}_{M\oplus C^*_{0\dots r}}\bigr)_{\mathrm{deg}\cdot B[0]=1} & \to & 0 \\
 & & \downarrow & & \cdots & & \downarrow & & \\
 & & \cdots & & \cdots & & \cdots & &
\end{array}
\]
where vertical arrows are $\mathrm{BRST}_g$ operators and horizontal arrows are sums of surjective maps of Fock spaces as dictated by the definition of Čech cohomology. We also multiply vertical differentials in odd-numbered columns by $(-1)$ to assure anticommutation of small squares. Then the $p$-th cohomology of the total complex is equal to $H^p(\pi_*\mathrm{MSV}(L),\mathbf{P})$. Here again we only consider cones $C^*$ that contain $(0,\mathbb{R}_{\ge 0})$.
Proof. Proposition 6.6 tells us that cohomology along vertical lines happens only at the top (zeroth) row, where it becomes the Čech complex for the sheaf $\pi_*\mathrm{MSV}(L)$. So the spectral sequence of one stupid filtration degenerates and converges to the cohomology of $\pi_*\mathrm{MSV}(L)$.
Our next step is to calculate the cohomology of the total complex using the other stupid filtration. Let us see what happens if we take cohomology of horizontal maps of our double complex first. It could be done separately for each lattice element $n\in N$. If $\mathrm{deg}\cdot n=l$ then we are dealing with the $l$-th row. The part of the complex that we care about is a constant space $\mathrm{Fock}_{M\oplus n}$ multiplied by a certain finite complex of vector spaces. That complex calculates the cohomology of the simplex based on all indices $i$ such that the cones $C_i^*$ contain $(0,\mathbb{R}_{\ge 0})$ and $n$. If $n\notin K^*$ then the set is empty, and the cohomology is zero. However, if $n\in K^*$ the cohomology is $\mathbb{C}$ and is located at the zeroth column. As a result, horizontal cohomology is zero except at the zeroth column. Therefore, cohomology could be calculated by means of the restriction of the $\mathrm{BRST}_g$ operator applied to kernels of horizontal maps from the zeroth column. The following theorem describes this space.
Theorem 7.5. Consider the following degeneration of the vertex algebra structure on $\mathrm{Fock}_{M\oplus K^*}$. In the definition of the vertex operator $e^{\int(m\cdot B+n\cdot A)(z)}$, when it is applied to $\dots|m_1,n_1\rangle$, the result is set to zero unless there is a cone in $\Sigma$ that contains both $n_1$ and $n$. This does provide a consistent set of operator product expansions, and the new algebra is denoted by $\mathrm{Fock}^\Sigma_{M\oplus K^*}$. We denote by $\Delta^*$ the set of all generators of one-dimensional cones of $\Sigma$. We construct a BRST operator on $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ by the formula
\[ \mathrm{BRST}_g = \oint\mathrm{BRST}_g(z)\,dz = \oint\sum_{n\in\Delta^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\,dz. \]
Then we claim that $\oplus_p H^p(\pi_*\mathrm{MSV}(L))$ equals the BRST cohomology of $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ with respect to $\mathrm{BRST}_g$.
Proof. In view of Proposition 7.4, it is enough to show that the horizontal cohomology of the double complex of 7.4 at the zeroth column and the corresponding vertical differential coincide with $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ and $\mathrm{BRST}_g$. The kernel of the horizontal map consists of collections of elements of $\mathrm{Fock}_{M\oplus C^*}$ that agree with restrictions. This can certainly be identified with $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ as follows. For every point $n\in N$ we take the corresponding $n$-part of the above collection of elements, since it is the same no matter which $C^*\ni n$ we choose. In the opposite direction, for each cone $C^*$ we take a sum of $n$-parts for all $n$ that belong to $C^*$. When we apply vertical arrows to such collections of elements, for each $C^*$ we use only the part of $\mathrm{BRST}_g$ that contains $B[0]$ eigenvalues from that $C^*$. Under our identification this is precisely the action of the whole $\mathrm{BRST}_g$ on $\mathrm{Fock}^\Sigma_{M\oplus K^*}$, because as a result of that action for any $n\in C^*$ the only terms that survive and have a non-trivial projection back to $C^*$ come from applying the part of $\mathrm{BRST}_g$ with $n$ in $C^*$.
We also want to show that the structure of vertex algebra induced on the BRST cohomology of $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ coincides with the vertex algebra structure on the cohomology of $\pi_*\mathrm{MSV}(L)$ defined in Proposition 3.7.
Proposition 7.6. Two structures of vertex algebra on $H^*(\pi_*\mathrm{MSV}(L))$ coincide.
Proof. The cup-product $(n)$ is induced on Čech cohomology by the following product on Čech cochains. To define the Čech differential we have chosen an order on the set of all cones. If $\alpha\in\mathrm{VA}_{C_{0\dots k},g}$, where $C_0<C_1<\dots<C_k$, and $\beta\in\mathrm{VA}_{C'_{0\dots l},g}$, where $C'_0<C'_1<\dots<C'_l$, then their $(n)$-product $\alpha_{(n)}\beta$ is zero unless
\[ C_0<\dots<C_k=C'_0<\dots<C'_l, \]
in which case it is defined as the $(n)$-product of the restrictions of $\alpha$ and $\beta$ to $\mathrm{VA}_{C_{0\dots k}C'_{1\dots l},g}$. We extend this construction to define $\alpha_{(n)}\beta$ for any pair of elements of the double complex of Theorem 7.5 by replacing the $(n)$-product in $\mathrm{VA}_{C_{0\dots k}C'_{1\dots l},g}$ by the $(n)$-product in $\mathrm{Fock}_{M\oplus(C^*_{0\dots k}\cap C'^*_{1\dots l})}$. We now observe that for the differential $d$ of the total complex we have
\[ d(\alpha_{(n)}\beta) = (d\alpha)_{(n)}\beta + (-1)^{\mathrm{parity}(\alpha)+\mathrm{column}(\alpha)}\,\alpha_{(n)}(d\beta). \]
To check this we again use Eq. 4.6.9 of [13]. The product $(n)$ induces the product on the cohomology of vertical maps that is precisely the $(n)$ product on Čech cochains of $\pi_*\mathrm{MSV}(L)$. It also induces a cup product on the cohomology of horizontal maps. A map between the two repeated cohomologies could be seen on the level of cochains as an addition of a coboundary, which is therefore compatible with $(n)$. It remains to notice that the $(n)$ product on the zeroth column of our double complex simply acts as an independent application of $(n)$ products for every cone $C^*\in\Sigma$. So it coincides on the cohomology of the horizontal maps with the $(n)$ product of the vertex algebra structure of $\mathrm{Fock}^\Sigma_{M\oplus K^*}$.
Remark 7.7. It is important to keep in mind that the operations $(n)$ do not define the structure of a vertex algebra on the whole double complex; they only induce this structure on cohomology. This is analogous to the fact that the usual cup-product is not supercommutative on the level of cochains. Also, we cannot really define a quasi-loco sheaf of vertex algebras over $\mathbf{P}$ whose sections over $A_C$ are $\mathrm{Fock}_{M\oplus C^*}$, because eigenvalues of $L[0]$ are not bounded from below. Perhaps it is just a matter of definitions, but localization might indeed behave poorly in this case.
Remark 7.8. It is clear that the fields $L(z)$, $J(z)$, $G(z)$ and $Q(z)$ are still given by
\[ Q(z) = A(z)\cdot\Phi(z) - \mathrm{deg}\cdot\partial_z\Phi(z), \qquad G(z) = B(z)\cdot\Psi(z), \]
\[ J(z) = {:}\Phi(z)\cdot\Psi(z){:} + \mathrm{deg}\cdot B(z), \qquad L(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:}. \]
Remark 7.9. Similar results can be obtained for cohomology of the chiral de Rham complex for an arbitrary smooth toric variety, for example for a projective space. However, in this paper we are mostly concerned with the line bundle case because of its applications to Mirror Symmetry.
So now we have a good description of cohomology of the chiral de Rham complex on a canonical bundle of a smooth toric variety. Our next step is to use the results of Sect. 5 to obtain a similar result for a Calabi–Yau hypersurface inside a smooth toric nef-Fano variety.
Certain combinatorial conditions on $\Sigma$ are necessary to ensure that we have a Calabi–Yau hypersurface. Details could be found in the original paper of Batyrev [1]. The set $\Delta^*$ of all $n$ should be the set of all lattice points of a convex polytope, which we also denote by $\Delta^*$, abusing notations slightly. We do not require that all $n$ except $(0,1)$ are vertices of $\Delta^*$, which geometrically means that the opposite of the canonical divisor on $\mathbf{P}$ is nef but not necessarily ample. We also have a polytope $\Delta$ in $M$ defined as follows. The decomposition $N=N_1\oplus\mathbb{Z}$ implies $M=M_1\oplus\mathbb{Z}$. We define $\Delta=K\cap\{M_1,1\}$.
All vertices of the polytope $\Delta$ belong to $M$. This is not entirely obvious, but follows from the fact that all cones of $\Sigma$ are basic and therefore $\Delta$ is reflexive in $M_1$. A Calabi–Yau hypersurface in $\mathbf{P}$ is given by a section of the negative canonical line bundle of $\mathbf{P}$. Any such section is given by a set of numbers $f_m$, one for each lattice point $m$ in $\Delta$. If $f$ is generic, the resulting hypersurface in $\mathbf{P}$ is smooth and Calabi–Yau.
In what follows we will denote $(0,1)$ by $\mathrm{deg}^*$. Let us fix a generic section $\mu$ of the anti-canonical line bundle of $\mathbf{P}$ and hence a function $f:\Delta\to\mathbb{C}$.
Proposition 7.10. The operator $\mathrm{BRST}_\mu(z)$ from Lemma 5.5 is given by
\[ \mathrm{BRST}_\mu(z) = \sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)}. \]
Proof. It is enough to consider a section $\mu=f_m x^m$. For any maximum cone of $\Sigma$ with basis $(n_1,\dots,n_{\dim N})$ we see that
\[ \mathrm{BRST}_\mu(z) = f_m\sum_i \varphi^i\,(m\cdot n_i)\,e^{\int(m-m_i)\cdot B(z)} = f_m\sum_i (m_i\cdot\Phi)(z)\,(m\cdot n_i)\,e^{\int m\cdot B(z)} = f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)}. \]
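The two equalities at the end of the proof of Proposition 7.10 can be unpacked as follows. This is only a sketch under stated assumptions: it takes $(m_i)$ to be the basis of $M$ dual to $(n_i)$, it assumes the free-field form $\varphi^i(z)=(m_i\cdot\Phi(z))\,e^{\int m_i\cdot B(z)}$ (the form that reappears in Proposition 9.3), and it suppresses normal ordering and cocycle signs.

```latex
% Assumption: m_i \cdot n_j = \delta_{ij}, i.e. (m_i) is dual to (n_i).
\begin{align*}
\varphi^i(z)\, e^{\int (m-m_i)\cdot B(z)}
  &= (m_i\cdot\Phi)(z)\, e^{\int m_i\cdot B(z)}\, e^{\int (m-m_i)\cdot B(z)}
   = (m_i\cdot\Phi)(z)\, e^{\int m\cdot B(z)}, \\
\sum_i (m\cdot n_i)\, m_i = m
  &\quad\Longrightarrow\quad
  \sum_i (m\cdot n_i)\,(m_i\cdot\Phi)(z) = (m\cdot\Phi)(z).
\end{align*}
```

The first line gives the middle expression of the proof; the second line, which is just the expansion of $m$ in the dual basis, gives the final one.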
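From now on we denote $\mathrm{BRST}_\mu$ by $\mathrm{BRST}_f$. We also denote by $X$ the Calabi–Yau hypersurface in $\mathbf{P}$ which is given by $f$. We will denote by $X_C$ the intersection of $X$ with $A_C$.
Proposition 7.11. For every cone $C^*\in\Sigma$ sections of $\mathrm{MSV}(X)$ on $X_C$ are given by the BRST cohomology of $\mathrm{Fock}_{M\oplus C^*}$ by the operator $\mathrm{BRST}_{f,g}=\oint\mathrm{BRST}_{f,g}(z)\,dz$, where
\[ \mathrm{BRST}_{f,g}(z) = \mathrm{BRST}_f(z)+\mathrm{BRST}_g(z) = \sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n\in\Delta^*\cap C^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}. \]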
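Proof. One easily computes that all modes of $\mathrm{BRST}_f(z)$ and $\mathrm{BRST}_g(z)$ anti-commute with each other. Also, Proposition 5.7 implies that sections of $\mathrm{MSV}(X)$ are the cohomology with respect to $\mathrm{BRST}_f$ of the cohomology of $\mathrm{Fock}_{M\oplus C^*}$ with respect to $\mathrm{BRST}_g$. Consider the following double complex:
\[
\begin{array}{ccccccccc}
\cdots & & 0 & & 0 & & 0 & & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & \mathrm{Fock}_{-1,0} & \to & \mathrm{Fock}_{0,0} & \to & \mathrm{Fock}_{1,0} & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & \mathrm{Fock}_{-1,1} & \to & \mathrm{Fock}_{0,1} & \to & \mathrm{Fock}_{1,1} & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\cdots & \to & \mathrm{Fock}_{-1,2} & \to & \mathrm{Fock}_{0,2} & \to & \mathrm{Fock}_{1,2} & \to & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & \cdots & & \cdots & & \cdots & &
\end{array}
\]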
where $\mathrm{Fock}_{k,l}$ is a shorthand for the part of $\mathrm{Fock}_{M\oplus C^*}$ where $(\mathrm{deg}^*\cdot A)[0]$ and $(\mathrm{deg}\cdot B)[0]$ equal $k$ and $l$ respectively. Horizontal maps are $\mathrm{BRST}_f$ and vertical maps are $\mathrm{BRST}_g$. We already know that columns of this double complex are exact everywhere except the zeroth row. A standard diagram chase then implies that the horizontal cohomology of the zeroth kernels of vertical maps is isomorphic to the cohomology of the total complex. However, the total complex and the differential on it are precisely $\mathrm{Fock}_{M\oplus C^*}$ and $\mathrm{BRST}_{f,g}$.
Proposition 7.12. In the above proposition all cohomology of the total complex is trivial, except for the zeroth one.
Proof. The grading by $\mathrm{deg}^*\cdot A[0]$ on $\pi_*\mathrm{MSV}(L)$ corresponds to counting $\#(b^1)+\#(\varphi^1)-\#(a_1)-\#(\psi_1)$, where $x^1$ is the special coordinate of the line bundle. Since this count is zero for $\mathrm{MSV}(X)$, the cohomology of $\mathrm{BRST}_f$ is concentrated at the zeroth column.
Remark 7.13. The above identification is also compatible with vertex algebra structures. Indeed, this structure is induced from that of $\mathrm{Fock}_{M\oplus C^*}$ both for the repeated and for the single use of BRST cohomology.
Now we are in a position to calculate the cohomology of the chiral de Rham complex of Calabi–Yau hypersurfaces in toric Fano varieties.
Theorem 7.14. The BRST cohomology of $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ with respect to the BRST operator $\mathrm{BRST}_{f,g}$ equals $H^*(X,\mathrm{MSV}(X))$.
Proof. The argument is completely analogous to that of Proposition 7.4. We construct a double complex similar to that of 7.4, but with $\mathrm{BRST}_g$ changed to $\mathrm{BRST}_{f,g}$ and $\mathrm{deg}\cdot B[0]$ changed to $\mathrm{deg}\cdot B[0]+\mathrm{deg}^*\cdot A[0]$. Proposition 7.12 assures that the spectral sequence of this double complex degenerates.
It is now a technical matter to calculate the fields $L(z)$, $J(z)$, $G(z)$ and $Q(z)$.
Proposition 7.15. The fields $L$, $J$, $G$, and $Q$ on $H^*(\mathrm{MSV}(X))$ are induced from the following fields on $\mathrm{Fock}^\Sigma_{M\oplus K^*}$:
\[ Q(z) = A(z)\cdot\Phi(z) - \mathrm{deg}\cdot\partial_z\Phi(z), \qquad G(z) = B(z)\cdot\Psi(z) - \mathrm{deg}^*\cdot\partial_z\Psi(z), \]
\[ J(z) = {:}\Phi(z)\cdot\Psi(z){:} + \mathrm{deg}\cdot B(z) - \mathrm{deg}^*\cdot A(z), \]
\[ L(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:} - \mathrm{deg}^*\cdot\partial_z A(z). \]
Proof. By a standard application of the Wick theorem, one observes that the above operators satisfy the OPEs of the topological algebra of dimension $(\dim N-2)$. Then it remains to show that $G(z)$ and $Q(z)$ are correct. This follows from Proposition 5.8 and Remark 7.8.
Remark 7.16. Notice that the OPEs of the topological algebra hold exactly in $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ even though we only need them to hold modulo the image of $\mathrm{BRST}_{f,g}$. This algebra was discovered almost three years ago as a lucky guess motivated by Mirror Symmetry.
The above algebra behaves exactly like the holomorphic part of $N=(2,2)$ theory under mirror involution, see for example [22]. This becomes apparent when we undo the topological twist and consider the $N=2$ super-conformal algebra with
\[ L_{N=2}(z) = {:}B(z)\cdot A(z){:} + {:}\tfrac12\bigl(\partial_z\Phi(z)\cdot\Psi(z)-\Phi(z)\cdot\partial_z\Psi(z)\bigr){:} - \tfrac12\,\mathrm{deg}^*\cdot\partial_z A(z) - \tfrac12\,\mathrm{deg}\cdot\partial_z B(z). \]
Mirror Symmetry in this setup means switching $M$ and $N$, $\Delta$ and $\Delta^*$, $\mathrm{deg}$ and $\mathrm{deg}^*$, $f$ and $g$. We want to show that the A model vertex algebra of a Calabi–Yau hypersurface equals the B model algebra of its mirror. Certainly the formulas above show that this is true for the operators $G$, $Q$, $J$ and $L$. However, there are some obstacles that prevent us from making such a statement. The easiest objection is that the cocycle used to make vertex operators bosonic does not look symmetric. However, that could be fixed by noticing that multiplication of $|m,n\rangle$ by $(-1)^{m\cdot n}$ is equivalent to switching the roles of $M$ and $N$ in the definition of the cocycle. We also notice that when we switch $M$ and $N$, and consequently $\Psi$ and $\Phi$, the moding of the operators changes slightly. Besides, the Fock space we consider is based on $M\oplus K^*$, rather than on $M\oplus N$ or $K\oplus K^*$. Both these objections will be addressed successfully in the next section. However, there is one more difficulty that cannot be resolved: we use the subdivision of the cone $K^*$ but we do not subdivide the lattice $M$ at all. We believe this difference is due to the instanton corrections.
8. Transition to the Whole Lattice
The goal of this section is to show that the cone $K^*$ in Theorem 7.14 could be replaced by the whole lattice $N$. This is certainly far from obvious. We construct and use certain homotopy operators whose anticommutators with $\mathrm{BRST}_{f,g}$ are identity plus operators that "push elements closer to $K^*$".
Before we start, notice that the decomposition of $K^*$ into cones can be extended to a decomposition of the lattice $N$ by adding arbitrary multiples of $\mathrm{deg}^*$ to all cones. This allows us to define the vertex algebra $\mathrm{Fock}^\Sigma_{M\oplus N}$ analogously to the definition of the vertex algebra $\mathrm{Fock}^\Sigma_{M\oplus K^*}$.
Proposition 8.1. For every vertex $m_0$ of $\Delta$ there is an operator $R_{m_0}$ such that
\[ R_{m_0}\mathrm{BRST}_{f,g} + \mathrm{BRST}_{f,g}R_{m_0} = 1+\alpha, \]
where $\alpha$ strictly increases eigenvalues of $m_0\cdot B[0]$ and does not decrease eigenvalues of $m\cdot B[0]$ for any other $m\in\Delta$.
Proof. Consider the graded ring $\mathbb{C}[K]$. Pick a basis $n_1,\dots,n_{\dim N}$ of $N$. It was proved in [6] that for general values of $f_m$ the elements $\sum_m f_m x^m(m\cdot n_i)$ form a regular sequence. In particular, the quotient ring is Artinian, and for a sufficiently big $k$ ($k=\dim M$ is in fact always enough) the element $x^{km_0}$ lies in the ideal generated by the above regular sequence. So we have
\[ x^{km_0} = \sum_i h_i(x^\Delta)\sum_m f_m x^m (m\cdot n_i). \]
We now consider the following field:
\[ R_{m_0}(z) = e^{-\int km_0\cdot B(z)}\sum_i h_i\bigl(e^{\int\Delta\cdot B}\bigr)\,n_i\cdot\Psi(z). \]
We have the operator product expansion
\[ R_{m_0}(z)\,\mathrm{BRST}_f(w) \sim \frac{1}{z-w}\,e^{-\int km_0\cdot B}\sum_i h_i\bigl(e^{\int\Delta\cdot B}\bigr)\sum_m f_m\,e^{\int m\cdot B}(m\cdot n_i) = \frac{1}{z-w}. \]
With the usual abuse of notation, we introduce $R_{m_0}=\oint R_{m_0}(z)\,dz$. We now argue that this operator satisfies the claim of this proposition. First of all, the above OPE shows us that $R_{m_0}\mathrm{BRST}_f+\mathrm{BRST}_f R_{m_0}=1$. Let us look at its anticommutator with $\mathrm{BRST}_g$. The operator product expansion of $R_{m_0}(z)$ with $e^{\int n\cdot A(w)}\,n\cdot\Psi(w)$ is non-singular if $m_0\cdot n=0$. Otherwise, fields in the OPE shift eigenvalues of $m_0\cdot B[0]$ positively (more precisely by $m_0\cdot n$). Of course, for any other $m\in\Delta$ eigenvalues of $m\cdot B[0]$ are shifted by $m\cdot n$, which is non-negative.
The above proposition provides us with the necessary tools to prove the main result of this section.
Proposition 8.2. The cohomology of $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ with respect to $\mathrm{BRST}_{f,g}$ is isomorphic to the cohomology of $\mathrm{Fock}^\Sigma_{M\oplus N}$ with respect to $\mathrm{BRST}_{f,g}$.
Proof. It is clear that $\mathrm{BRST}_{f,g}\,\mathrm{Fock}^\Sigma_{M\oplus K^*}\subseteq\mathrm{Fock}^\Sigma_{M\oplus K^*}$. Then one needs to prove the following two inclusions:
• $\mathrm{Ker}(\mathrm{Fock}^\Sigma_{M\oplus K^*})\cap\mathrm{Im}(\mathrm{Fock}^\Sigma_{M\oplus N})\subseteq\mathrm{Im}(\mathrm{Fock}^\Sigma_{M\oplus K^*})$;
• $\mathrm{Ker}(\mathrm{Fock}^\Sigma_{M\oplus K^*})+\mathrm{Im}(\mathrm{Fock}^\Sigma_{M\oplus N})\supseteq\mathrm{Ker}(\mathrm{Fock}^\Sigma_{M\oplus N})$.
Notice that the opposite inclusions are obvious.
First inclusion. Assume that there exists an element $v\in\mathrm{Fock}^\Sigma_{M\oplus K^*}$ such that $v=\mathrm{BRST}_{f,g}v_1$, where $v_1$ does not lie in $\mathrm{Fock}^\Sigma_{M\oplus K^*}$. Moreover, of all such $v_1$ we pick the one which is "the closest to $\mathrm{Fock}^\Sigma_{M\oplus K^*}$". The distance is defined as follows. We look at all codimension one faces of $K^*$ or, equivalently, all vertices of $\Delta$. For every vertex $m$ of $\Delta$ we look at the maximum eigenvalue of $-m\cdot B[0]$ on components of $v_1$. We call the maximum of this number and zero the $m$-distance from $v_1$ to $\mathrm{Fock}^\Sigma_{M\oplus K^*}$. Then the total distance from $v_1$ to $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ is the sum of $m$-distances for all vertices $m$ of $\Delta$. So we pick $v_1$ with a minimum distance, and our goal is to show that this distance is zero. If the distance is not zero, then for one of the vertices $m$ there is a component of $v_1$ with a negative eigenvalue of $m\cdot B[0]$. We now apply the result of Proposition 8.1. Consider the operator $R_m$. We have
\[ (R_m\mathrm{BRST}_{f,g}+\mathrm{BRST}_{f,g}R_m)v = v+\alpha v. \]
So
\[ v = \mathrm{BRST}_{f,g}(R_m v+\alpha v_1), \]
because $\alpha$ commutes with $\mathrm{BRST}_{f,g}$. Notice now that $R_m v\in\mathrm{Fock}^\Sigma_{M\oplus K^*}$ and the distance from $\alpha v_1$ to $\mathrm{Fock}^\Sigma_{M\oplus K^*}$ is strictly less than the distance from $v_1$ to it. Indeed, the $m$-distance is smaller, and the $m_1$-distance is not bigger for any other vertex $m_1$ of $\Delta$. This contradicts the minimality of $v_1$.
Second inclusion. Our argument here is similar. Let $v$ be an element of $\mathrm{Fock}^\Sigma_{M\oplus N}$ such that $\mathrm{BRST}_{f,g}v=0$. Then for every vertex $m\in\Delta$ we have
\[ \mathrm{BRST}_{f,g}R_m v = v+\alpha_m v \]
and therefore $v\equiv-\alpha_m v\ (\mathrm{mod}\ \mathrm{Im}(\mathrm{Fock}^\Sigma_{M\oplus N}))$. By applying $\alpha_m$ for different $m$ sufficiently many times, we can again push $v$ into $\mathrm{Fock}^\Sigma_{M\oplus K^*}$.
We now combine Propositions 8.2 and 7.15 with Theorem 7.14 to formulate one of the main results of this paper.
Theorem 8.3. Let $X$ be a Calabi–Yau hypersurface in a smooth toric nef-Fano variety, given by $f:\Delta\to\mathbb{C}$ and a fan $\Sigma$. Then the cohomology of the chiral de Rham complex of $X$ equals the BRST cohomology of $\mathrm{Fock}^\Sigma_{M\oplus N}$ by the operator
\[ \mathrm{BRST}_{f,g} = \oint\Bigl(\sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n\in\Delta^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\Bigr)dz \]
with any choice of non-zero numbers $g_n$. The additional structure of the topological vertex algebra is given by the formulas of Proposition 7.15.
We have therefore addressed one of the questions posed at the end of the last section. Another obstacle for Mirror Symmetry stated there was the fact that the modes of $\Phi$ and $\Psi$ are defined differently. However, this is precisely what happens when we go from the A model to the B model. Because $J={:}\Phi\cdot\Psi{:}+\mathrm{deg}\cdot B-\mathrm{deg}^*\cdot A$, the moding of $\Psi$ and $\Phi$ changes when we go from $L_{\text{A-model}}[0]$ to $L_{\text{B-model}}[0]=L_{\text{A-model}}[0]+J[0]$. Indeed, while $\Phi[k]$ and $\Psi[k]$ change, the true modes $\Phi_{(k)}$ and $\Psi_{(k)}$ are not affected by the switch of the roles of $M$ and $N$.
It remains to address the following question. What is the real meaning of going from $\mathrm{Fock}_{M\oplus N}$ to $\mathrm{Fock}^\Sigma_{M\oplus N}$? It turns out that in the case when $\Sigma$ admits a convex piece-wise linear function (which geometrically means that $\mathbf{P}$ is projective) this vertex algebra is a degeneration of the vertex algebra $\mathrm{Fock}_{M\oplus N}$. The degeneration we are about to describe is completely analogous to the one discussed in [6] but is now performed for the whole Fock space.
Let $h:N_{\mathbb{R}}\to\mathbb{R}$ be a continuous function which is linear on every cone of $\Sigma$ and satisfies $h(x+y)\le h(x)+h(y)$ with equality achieved if and only if $x$ and $y$ lie in the same cone of $\Sigma$. Then we get ourselves a complex parameter $t$ and start changing the basis of $\mathrm{Fock}_{M\oplus N}$ by assigning $|m,n\rangle_t=t^{h(n)}|m,n\rangle$. To preserve the definition of the vertex algebra we also multiply $e^{\int n\cdot A(z)}$ by $t^{h(n)}$. Now if we let $t$ go to zero, the structure of the vertex algebra of $\mathrm{Fock}_{M\oplus N}$ will go to the structure of the vertex algebra of $\mathrm{Fock}^\Sigma_{M\oplus N}$. When the structure of $\mathrm{Fock}^\Sigma_{M\oplus N}$ is defined via this limit, we can also get the action of $\mathrm{BRST}_{f,g}$ on it as a limit of
\[ \mathrm{BRST}_{f,g}(t) = \oint\Bigl(\sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n\in\Delta^*} g_n\,t^{h(n)}\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\Bigr)dz. \]
This prompts the following definition.
Definition 8.4. We define the Master Family of vertex algebras that corresponds to the pair of reflexive polytopes $\Delta$ and $\Delta^*$ as the BRST quotient of the vertex algebra $\mathrm{Fock}_{M\oplus N}$ by the operator
\[ \mathrm{BRST}_{f,g} = \oint\Bigl(\sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n\in\Delta^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\Bigr)dz, \]
where $f$ and $g$ are parameters of the theory. The additional structure of the topological vertex algebra is given by the formulas of Proposition 7.15.
Conjecture 8.5. Vertex algebras that appear in Mirror Symmetry for hypersurfaces defined by $\Delta$ and $\Delta^*$ are elements of the Master Family of vertex algebras.
Remark 8.6. The large complex structure limit (see [18]) in our language is most likely the degeneration of the Master Family where $M$ is subdivided. The large Kähler structure limit is the degeneration of the Master Family where $N$ is subdivided. The difference between the BRST quotients of $\mathrm{Fock}^\Sigma_{M\oplus N}$ and $\mathrm{Fock}_{M\oplus N}$ should be somehow seen in terms of instanton corrections. In particular, the construction of [17] recovers the large Kähler structure limit of the physical theory.
Our discussion so far has been focused around reflexive polytopes $\Delta^*$ that admit a unimodular triangulation and therefore yield smooth $\mathbf{P}$'s. This is a very important class of examples, which includes the famous quintic in $\mathbb{P}^4$, but most reflexive polytopes do not fall into this category. The next two sections will be devoted to the treatment of singular $\mathbf{P}$'s. We can no longer use the definition of [17], but many of our results still hold in that generality under appropriate definitions.
9. Vertex Algebras of Gorenstein Toric Varieties
The goal of this section is to define an analog of the chiral de Rham complex for an arbitrary Gorenstein toric variety. It is again a loco sheaf of conformal vertex algebras. Sections of this sheaf over any toric affine chart admit a structure of the topological vertex algebra which may or may not be compatible with the localization. However, $J[0]$ and $Q[0]$ are globally defined, which allows us to introduce a string de Rham complex and to propose a definition of string cohomology vector spaces. Recall that dimensions of these spaces were rigorously defined by Batyrev and Dais in [5], but the spaces themselves have never been constructed mathematically.
We are working in the following setup. There are dual lattices $M$ and $N$ with a primitive element "deg" fixed in $M$. There is a fan $\Sigma$ in $N$ such that all generators $n_i$ of its one-dimensional faces satisfy $\mathrm{deg}\cdot n_i=1$. A set $\Delta^*$ consists of some lattice points of degree one inside the union of all cones of $\Sigma$. We do not generally require that $\Delta^*$ includes all such points, or that it is the set of all lattice points inside a convex polytope. However, we do demand that it contains generators of all one-dimensional cones of $\Sigma$. Finally, we have a generic set of numbers $g_n$ for all $n\in\Delta^*$.
Definition 9.1. For each cone $C^*\in\Sigma$ we denote by $V_g(C)$ the BRST cohomology of the vertex algebra $\mathrm{Fock}_{M\oplus C^*}$ with respect to the BRST operator
\[ \mathrm{BRST}_g = \oint\sum_{n\in\Delta^*\cap C^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\,dz. \]
We also provide this algebra with the structure of a topological algebra by introducing the operators
\[ Q(z) = A(z)\cdot\Phi(z) - \mathrm{deg}\cdot\partial_z\Phi(z), \qquad G(z) = B(z)\cdot\Psi(z), \]
\[ J(z) = {:}\Phi(z)\cdot\Psi(z){:} + \mathrm{deg}\cdot B(z), \qquad L(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:}. \]
Remark 9.2. The above definition does not guarantee that the resulting vertex algebra has no negative eigenvalues of $L[0]$. To prove this and much more we first consider the case of a simplicial cone $C^*$. Then we extend our results to the general case by looking at the degeneration of $\mathrm{Fock}_{M\oplus C^*}$ that corresponds to a subdivision of $C^*$ into simplicial cones.
First of all, we consider the case of orbifold singularities, in which case we can give an explicit description of BRST cohomology similar to that of Proposition 6.6. When we talk about orbifold singularities we implicitly assume that not only are the cones $C^*$ simplicial, but also the $g_n$ are zero, except for the generators of one-dimensional faces of $C^*$.
Proposition 9.3. Let $C^*$ be a simplicial cone of dimension $\dim N$. Its faces of dimension one are generated by $n_1,n_2,\dots,n_{\dim C^*}$. Denote by $N_{\mathrm{small}}$ the sublattice of $N$ generated by the $n_i$. Denote by $M_{\mathrm{big}}$ the lattice dual to $N_{\mathrm{small}}$, which contains $M$. Let the dual of $C^*$ in $M_{\mathrm{big}}$ be generated by $m_1,\dots,m_{\dim M}$. For every $i$ we define
\[ b^i(z) = e^{\int m_i\cdot B(z)}, \qquad \varphi^i(z) = (m_i\cdot\Phi(z))\,e^{\int m_i\cdot B(z)}, \qquad \psi_i(z) = (n_i\cdot\Psi(z))\,e^{-\int m_i\cdot B(z)}, \]
\[ a_i(z) = {:}(n_i\cdot A(z))\,e^{-\int m_i\cdot B(z)}{:} + {:}(m_i\cdot\Phi(z))(n_i\cdot\Psi(z))\,e^{-\int m_i\cdot B(z)}{:} \]
for all $i=1,\dots,\dim M$. These fields generate a vertex subalgebra $\mathrm{VA}_{C^*,M_{\mathrm{big}}}$ inside $\mathrm{Fock}_{M_{\mathrm{big}}\oplus 0}$. Consider all fields from $\mathrm{VA}_{C^*,M_{\mathrm{big}}}$ whose $A[0]$ eigenvalues lie in $M$. Denote the resulting algebra by $\mathrm{VA}_{C^*,M}$. Let $\mathrm{Box}(C^*)$ be the set of all elements $n\in C^*$ such that $(n-n_i)\notin C^*$ for all $i$. For every $n\in\mathrm{Box}(C^*)$ consider the following set of elements of $\mathrm{Fock}_{M\oplus n}$. For every $v_0=\prod A[\dots]\prod B[\dots]\prod\Phi[\dots]\prod\Psi[\dots]\,|m,0\rangle$ that lies in $\mathrm{VA}_{C^*,M}$ consider $v=\prod A[\dots]\prod B[\dots]\prod\Phi[\dots]\prod\Psi[\dots]\,|m,n\rangle$, which is obtained by applying the same modes of $A$, $B$, $\Phi$ and $\Psi$ to $|m,n\rangle$ instead of $|m,0\rangle$. We denote this space by $\mathrm{VA}^{(n)}_{C^*,M}$. Then we claim that
\[ V_g(C) = \oplus_{n\in\mathrm{Box}(C^*)}\mathrm{VA}^{(n)}_{C^*,M}. \]
Proof. First of all, the argument of Proposition 6.5 shows that
\[ V_g(C) = \oplus_{n\in\mathrm{Box}(C^*)}\mathrm{Ker}\bigl(\mathrm{BRST}_g:\mathrm{Fock}_{M\oplus n}\to\mathrm{Fock}_{M\oplus C^*}\bigr). \]
Then we notice that changing $|m,n\rangle$ to $|m,0\rangle$ commutes with the action of $\mathrm{BRST}_g$, so it is enough to concern ourselves with the case $n=0$. Then the kernel of $\mathrm{BRST}_g$ on $\mathrm{Fock}_{M\oplus 0}$ is the intersection of $\mathrm{Fock}_{M\oplus 0}$ with the kernel of $\mathrm{BRST}_g$ on $\mathrm{Fock}_{M_{\mathrm{big}}\oplus 0}$. It remains to apply Proposition 6.6.
Remark 9.4. The corresponding space $A_C$ is a quotient of a flat space by an abelian group. The part at $n=0$ is precisely the invariant part of the flat space algebra, while other $n$ correspond to "twisted sectors".
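The set $\mathrm{Box}(C^*)$ that indexes the twisted sectors can be enumerated directly for a small simplicial cone. The sketch below is illustrative and not from the original text: it assumes $N=\mathbb{Z}^d$, a full-dimensional cone whose $d$ generators are linearly independent integer tuples, and it uses the description $n=\sum_i\alpha_i n_i$ with $0\le\alpha_i<1$ that appears in the proof of Proposition 9.5. The helper names `solve_rational` and `box` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def solve_rational(generators, target):
    """Solve sum_i alpha_i * generators[i] = target exactly over Q.

    Assumption: `generators` are d linearly independent vectors in Z^d.
    """
    d = len(target)
    # Augmented matrix whose columns are the generators.
    aug = [[Fraction(generators[j][i]) for j in range(d)] + [Fraction(target[i])]
           for i in range(d)]
    for col in range(d):
        pivot = next(r for r in range(col, d) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(d):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][d] for i in range(d)]


def box(generators):
    """Lattice points n = sum_i alpha_i n_i with 0 <= alpha_i < 1."""
    d = len(generators)
    # Coordinate-wise bounding box of the half-open parallelepiped.
    ranges = []
    for i in range(d):
        lo = sum(min(g[i], 0) for g in generators)
        hi = sum(max(g[i], 0) for g in generators)
        ranges.append(range(lo, hi + 1))
    points = []
    for n in product(*ranges):
        alpha = solve_rational(generators, n)
        if all(0 <= a < 1 for a in alpha):
            points.append(n)
    return points


# Cone generated by (1, 1) and (-1, 1): the quotient of the plane by Z/2.
print(box([(1, 1), (-1, 1)]))  # [(0, 0), (0, 1)]
```

In this example the point $(0,0)$ is the untwisted sector and $(0,1)=\tfrac12 n_1+\tfrac12 n_2$ is the single twisted sector, in line with Remark 9.4. The brute-force scan of the bounding box is only meant for small examples.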
Proposition 9.5. For a simplicial cone $C^*$ of dimension $\dim N$ the eigenvalues of $L[0]$ on $V_g(C)$ are non-negative. Eigenvalues of $A[0]$ on the zero eigenspace of $L[0]$ lie in $C$. Besides, for any $d>0$, eigenvalues of $A[0]$ on the $L[0]=d$ eigenspace lie in $C-D(d)\,\mathrm{deg}$, where $D(d)$ is some constant which depends only on $d$ and the dimension of $N$.
Proof. Consider $n\in\mathrm{Box}(C^*)$ given by
\[ n = \sum_i\alpha_i n_i. \]
Let us consider all elements $v_0$ from $\mathrm{VA}_{C^*,M_{\mathrm{big}}}$ and the corresponding elements $v$ obtained by changing $|m,0\rangle$ to $|m,n\rangle$. Such a change incurs a change in $L[0]$ which is equal to $m\cdot n$, and is therefore linear in $m$. Hence, we can assess the new $L[0]$ by adding $\alpha_i$ for each occurrence of $b^i$ or $\varphi^i$ and subtracting $\alpha_i$ for each occurrence of $a_i$ or $\psi_i$. So to show that $L[0]$ has no negative eigenvalues, one has to show that each mode of $a_i$, $b^i$, $\varphi^i$ or $\psi_i$ contributes non-negatively. According to Proposition 6.6, we need to consider non-positive modes of $b$ and $\varphi$ and negative modes of $a$ and $\psi$. The resulting contributions are collected in the following table:
\[
\begin{array}{c|cccc}
\text{mode} & a_i[-k] & b^i[-k] & \varphi^i[-k] & \psi_i[-k] \\ \hline
\Delta L[0] & k-\alpha_i & k+\alpha_i & k+\alpha_i & k-\alpha_i
\end{array}
\]
Since $0\le\alpha_i<1$, all entries are non-negative, which proves the first part of the proposition. Moreover, the only time a zero entry can occur is when $\alpha_i=0$ and we are looking at $b^i[0]$ or $\varphi^i[0]$, which proves the second part. Finally, for a fixed $d$ there are only finitely many ways to combine $a[-k]$, $b[-k]$, $\varphi[-k]$ and $\psi[-k]$ to get $L[0]=d$, up to arbitrary extra $b^i[0]$ and $\varphi^i[0]$. This finishes the proof.
We now extend these results to simplicial cones of dimension smaller than $\dim N$.
Proposition 9.6. For any simplicial cone $C^*$ the eigenvalues of $L[0]$ on $V_g(C)$ are non-negative. Eigenvalues of $A[0]$ on the zero eigenspace of $L[0]$ lie in $C$. Besides, for any $d>0$, eigenvalues of $A[0]$ on the $L[0]=d$ eigenspace lie in $C-D(d)\,\mathrm{deg}$, where $D(d)$ is some constant which depends only on $d$ and the dimension of $N$.
Proof. Consider an arbitrary simplicial cone $C_1^*$ of maximum dimension whose one-dimensional faces are generated by elements in $\Delta^*$ such that $C^*$ is a face of $C_1^*$. Proposition 9.5 implies that $V_{C_1}$ is a $\mathbb{C}[C_1]$-loop-module. The proof of Proposition 7.1 is applicable in this more general situation and allows us to show that $V_C$ is a localization of $V_{C_1}$ with respect to the multiplicative system $S=\{x^m,\ m\cdot C^*=0,\ m\cdot C_1^*\ge 0\}$. The grading operator $L[0]$ is still non-negative on the localization, and the $L[0]=0$ part of $V_C$ is the localization of the $L[0]=0$ part of $V_{C_1}$. It remains to observe that for every $x^m\in S$ we have $-m\in C$, so localization does not push $A[0]$ eigenvalues out of $C$.
We are now in a position to drop the simpliciality assumption on the cone $C^*$. This will require a careful investigation of the degeneration of vertex algebras given by a triangulation of a non-simplicial cone.
Proposition 9.7. For any cone $C^*$ and a generic choice of $g$, all eigenvalues of $L[0]$ on $V_C$ are non-negative.
Proof. Consider an arbitrary regular triangulation of $C^* \cap \Delta^*$ and the corresponding decomposition of $C^*$ into a union of simplicial cones. Denote by $\mathrm{Fock}^{triang}_{M\oplus C^*}$ the degeneration of the vertex algebra $\mathrm{Fock}_{M\oplus C^*}$ as in Theorem 7.5. As in Proposition 7.4 we consider the double complex
\[
\begin{array}{ccccccccc}
 & & 0 & & \cdots & & 0 & & \\
 & & \downarrow & & & & \downarrow & & \\
0 & \to & \bigoplus_{C_0^*} (\mathrm{Fock}_{M\oplus C_0^*})_{\deg\cdot B[0]=0} & \to & \cdots & \to & \bigoplus_{(C_0^*,\dots,C_r^*)} (\mathrm{Fock}_{M\oplus C_{0\dots r}^*})_{\deg\cdot B[0]=0} & \to & 0 \\
 & & \downarrow & & & & \downarrow & & \\
0 & \to & \bigoplus_{C_0^*} (\mathrm{Fock}_{M\oplus C_0^*})_{\deg\cdot B[0]=1} & \to & \cdots & \to & \bigoplus_{(C_0^*,\dots,C_r^*)} (\mathrm{Fock}_{M\oplus C_{0\dots r}^*})_{\deg\cdot B[0]=1} & \to & 0 \\
 & & \downarrow & & & & \downarrow & & \\
 & & \vdots & & & & \vdots & &
\end{array}
\]
where $C_0^*$ are all cones in the triangulation. Again we consider two spectral sequences associated to this double complex. When one takes horizontal maps first, the only non-trivial cohomology appears in the zeroth column. Moreover, when one applies vertical cohomology to the zeroth kernels of the horizontal maps, one gets the BRST cohomology of $\mathrm{Fock}^{triang}_{M\oplus C^*}$ by $\mathrm{BRST}_g$. This shows that the cohomology of the total complex is isomorphic to the BRST cohomology of $\mathrm{Fock}^{triang}_{M\oplus C^*}$. The other stupid filtration implies that there exists a spectral sequence that converges to the BRST cohomology of $\mathrm{Fock}^{triang}_{M\oplus C^*}$ and starts with $\bigoplus_{C_{0\dots k}} V_g(C_{0\dots k})$. Everything in this picture is additionally graded by eigenvalues of L[0]. Proposition 9.6 shows that $V_g(C_{0\dots k})$ have no negative eigenvalues of L[0], therefore the BRST cohomology of $\mathrm{Fock}^{triang}_{M\oplus C^*}$ cannot have any negative eigenvalues either.

Now it remains to go from the BRST cohomology of $\mathrm{Fock}^{triang}_{M\oplus C^*}$ to that of $\mathrm{Fock}_{M\oplus C^*}$. Notice that A[0], L[0] and J[0] commute with each other and with $\mathrm{BRST}_g$, so we can consider separately the parts of $\mathrm{Fock}^{triang}_{M\oplus C^*}$ and $\mathrm{Fock}_{M\oplus C^*}$ that have fixed eigenvalues of A[0], J[0] and L[0]. We can show that these spaces are finite-dimensional. Indeed, for a fixed eigenvalue m of A[0] we can find an integer r such that m + r deg lies in C. Then we claim that all eigen-spaces of L[0] + (r + 1)J[0] on $\mathrm{Fock}_{m\oplus C^*}$ are finite-dimensional. For each n we start with at least deg · n, and almost all modes of A, B, Φ and Ψ can only increase this eigenvalue. The exception is a few modes of Φ or Ψ, but they are fermionic and can only appear in a finite number of combinations. This proves that all eigen-spaces of L[0] + (r + 1)J[0] are finite-dimensional, and so are all spaces with fixed A[0], L[0] and J[0] eigenvalues.

Now $\mathrm{BRST}_g$ for $\mathrm{Fock}^{triang}_{M\oplus C^*}$ can be seen as the true limit of operators $\mathrm{BRST}_{g(t)}$ as discussed right before Definition 8.4. For families of operators on finite dimensional spaces, dimensions of kernels jump at special points and dimensions of images decrease, so dimensions of cohomology jump. Since for all negative L[0] there is no cohomology for the degenerate map, there is no cohomology for the original $\mathrm{Fock}_{M\oplus C^*}$ for generic (that is, outside of countably many Zariski closed subsets of codimension one) choices of g.

We can use similar arguments to extend the rest of the results of Propositions 9.5 and 9.6 to arbitrary Gorenstein cones $C^*$.

Proposition 9.8. For any cone $C^*$, the eigenvalues of A[0] on the zero eigen-space of L[0] lie in C. Besides, for any d > 0, eigenvalues of A[0] on the L[0] = d eigen-space lie in C − D(d) deg, where D(d) is some constant which depends only on d and the dimension of N.
Proof. For L[0] = 0, it is enough to show for every generator n of a one-dimensional face of C ∗ that eigenvalues of n · A[0] are non-negative on the zero eigen-space of L[0]. Argument of the above proposition shows that it is enough to produce a regular triangulation of C ∗ such that all its maximum cones contain n. Really, we can then apply the second part of Proposition 9.6 and the same degeneration argument works. Similarly, to show that for L[0] = d all eigenvalues of A[0] lie in C − D(d) deg means to show that n · A[0] ≥ −D(d) for every generator n of a one-dimensional face of C ∗ . Again if we can produce a regular triangulation such that all of its maximum cones contain n, then we can use the same degeneration argument and Proposition 9.6. To construct such a triangulation, we do the following. We consider the polytope P = C ∗ ∩ @∗ . For every vertex ni of P we move it slightly away from n along the line from n to ni . For small generic perturbation of this type, the resulting (non-integer) points ni and n will still be a set of vertices of a convex polytope P . All faces of P that do not contain n will be simplicial, and we will call the union of these faces the outer surface of P . Then we consider the following function on h1 : P → R. For every point p ∈ P we draw a line l from n to p and define h(p) as the ratio of the distances from n to p and from n to the outer boundary of P . This function will be piecewise linear on the triangulation of P that is obtained by intersecting P with Conv(n, F ) for all faces F of the outer boundary of P . Moreover, it will be strictly convex in a sense that for each two p1 , p2 ∈ P and each α ∈ (0, 1) αh(p1 ) + (1 − α)h(p2 ) ≤ h(αp1 + (1 − α)p2 ) with equality satisfied if and only if there exists a simplex of the triangulation that contains both p1 and p2 . We then extend h from P to C ∗ by putting hC ∗ (p) = h(p)(deg · p). 
This function will be strictly convex on the triangulation of $C^*$ such that all its simplices contain n, which finishes the proof.

Remark 9.9. Looking back, we really had to work hard to prove the last two propositions. It would be very interesting to find a direct proof not based on results of Proposition 6.6.

Proposition 9.10. If $C_1^*$ is a face of $C^*$ then the surjective map $\mathrm{Fock}_{M\oplus C^*} \to \mathrm{Fock}_{M\oplus C_1^*}$ induces a map $V_C \to V_{C_1}$ which is a localization map of the loop-module $V_C$ over C[C] by the multiplicative system $S = \{x^m,\ m\cdot C_1^* = 0,\ m\cdot C^* \ge 0\}$.

Proof. The hard part was to show that the spaces in question are loop-modules, that is to show that L[0] is non-negative. Then the argument of Proposition 7.1 extends to this more general situation.

For the rest of this section we no longer assume that deg ∈ M is fixed.

Proposition 9.11. A toric variety P defined by fan Σ is Gorenstein if for every cone $C^* \in \Sigma$ all generators n of one-dimensional cones of C satisfy $\deg_C \cdot n = 1$, where $\deg_C$ is a lattice point in M.

Proof. See [1].

The following definition is made possible by the results of Propositions 9.10 and 9.7.

Definition 9.12. Let P be a Gorenstein toric variety, given by fan Σ in N. Fix a generic set of numbers $g_n$ for all points of degree one in each cone of Σ (the notion of degree may vary from cone to cone). Then the (generalized) chiral de Rham complex MSV(P) is
defined as a quasi-loco sheaf over it such that for any affine subspace $A_C$ of P sections of MSV(P) are BRST cohomology of $\mathrm{Fock}_{M\oplus C^*}$ by operator
\[
\mathrm{BRST}_g = \oint \sum_{n\in C^*,\ \deg_C\cdot n=1} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)}\,dz.
\]
Remark 9.13. Notice that while the choice of g is irrelevant in the smooth or even orbifold case, it is very important in general.

Theorem 9.14. The quasi-loco sheaf MSV(P) is in fact a loco sheaf. Recall that the grading is given by L[0], where $L(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:}$.

Proof. This is essentially a local statement, so it is enough to work with one cone $C^*$ of maximum dimension. We need a few preliminary lemmas.

Lemma 9.15. The action of deg · B[0] could be pushed down to the BRST cohomology of $\mathrm{Fock}_{M\oplus K^*}$. BRST cohomology of $\mathrm{Fock}_{M\oplus K^*}$ has only eigenvalues of deg · B[0] in a certain range (from 0 to $D_1$).

Proof of the lemma. Our BRST operator increases deg · B[0] eigenvalues by one, so the action of deg · B[0] could be pushed down to the cohomology. As before, we can notice that the above statement is true for any simplicial subcone and then do the spectral sequence and degeneration trick as in Proposition 9.7. We must mention that as a result of the spectral sequence the bound $D_1$ could jump a bit, but we only need to know that such a bound exists.

We will need some general theory of funny objects which we call almost-modules.

Definition 9.16. Let R be a Noetherian ring. An abelian group V is called an almost-module over R if
• there is defined a map R × V → V which is bilinear but not necessarily associative;
• there exists a finite filtration of V compatible with the multiplication map above such that the quotients of the filtration are modules over R.

Lemma 9.17. Let V be an almost-module over R. The following conditions are equivalent:
(1) There exists a filtration of V as above such that all quotients are finitely generated.
(2) For any filtration of V as above all quotients are finitely generated.
(3) Any ascending chain of sub-almost-modules terminates.
If these conditions hold, then V is called a Noetherian almost-module.

Proof of the lemma. (3) ⇒ (2). If one of the quotients is not finitely generated, then there will be an ascending non-terminating chain of submodules inside it. It will give rise to an ascending non-terminating chain of sub-almost-modules of V. (1) ⇒ (3). If
\[
V = F^kV \supset F^{k-1}V \supset \cdots \supset F^0V = 0
\]
is the filtration on V then for any ascending chain $\{V_j\}$ one can look at $V_j \cap F^1V$. At some point it will stabilize. Then we can look at the image of $V_j \cap F^2V$ in $F^2V/F^1V$, and it will stabilize too. This will imply that $V_j \cap F^2V$ is stabilized. Eventually we will have $V_j \cap F^kV = V_j$ stabilized.

As a corollary of the above lemma, all submodules and quotients of Noetherian almost-modules are Noetherian.

Lemma 9.18. For any d, D and any finite set I ⊆ N the L[0] = d part of the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{(K-D\deg)\oplus I}$ is a Noetherian almost-module over C[K]. The multiplication is given by $e^{\int m\cdot B}[0]$.
Proof of the lemma. Consider the filtration by the number of A. When $e^{\int m\cdot B}[0]$ acts on the quotients of this filtration, it amounts to acting on $|m_1,n_1\rangle$ directly, because one can push through and the extra commutators are in the lower part of the filtration. We can also do it for one fixed n, because there are only finitely many of them. Then we see that the multiplication gives zero unless m · n = 0, and for those m it is simply a shift of A[0] eigenvalues. Now it remains to notice that there are finitely many (linearly independent) choices of extra A, B, Φ, Ψ to get the desired L[0] eigenvalue. Also, there are finitely many choices for m · n in $|m,n\rangle$ because it can't be too big. And for each m · n = const we have a finitely generated module over C[K], which finishes the proof of the lemma.

We are now ready to complete the proof of Theorem 9.14. What we need to show is that the L[0] = d component of the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{M\oplus K^*}$ is a Noetherian almost-module over C[K]. Because of Proposition 9.8 and Lemma 9.15, it is enough to consider $\mathrm{Fock}_{K-D\deg,\ \{\deg\cdot\,\bullet\,\le D_1\}}$. By the above lemma, this is a Noetherian almost-module, and the observation after Lemma 9.17 finishes the proof of the theorem.

The above theorem together with Proposition 2.9 leads to the following corollary.

Corollary 9.19. If P is compact, then $H^*(\mathrm{MSV}(P))$ is a graded vertex algebra with finite dimensional graded components.

It is worthwhile to mention that MSV(P) is also loop-coherent with respect to the grading by the B model Virasoro operator $L_B[0]$, which is the zeroth mode of
\[
L_B(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Psi(z)\cdot\Phi(z){:} - \deg\cdot\partial_z B(z).
\]

Theorem 9.20. MSV(P) is loop-coherent with respect to the grading by $L_B[0]$.

Proof. The proof is completely analogous to the proof of Theorem 9.14. We simply follow the chain of propositions of this section, with the following change. In the proof of Proposition 9.5, in addition to the table
\[
\begin{array}{c|cccc}
\text{mode} & a_i[-k] & b^i[-k] & \varphi^i[-k] & \psi_i[-k] \\ \hline
\Delta L_B[0] & k - \alpha_i & k + \alpha_i & k + 1 + \alpha_i & k - 1 - \alpha_i
\end{array}
\]
we need to consider the extra shift by $\deg\cdot\alpha = \sum_i \alpha_i$ due to the extra term in $L_B(z)$. This shift cures possible negative contributions of $\psi_i[-1]$ (fortunately, these are fermionic modes so they cannot repeat).
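The role of the shift can be made explicit with a short computation. The following is only a sketch under assumptions not spelled out in the text: that the shift $\deg\cdot\alpha=\sum_i\alpha_i$ enters $L_B[0]$ with a positive sign (as the word "cures" suggests), and that $S$, a name introduced here for illustration, denotes the set of indices $i$ for which $\psi_i[-1]$ occurs in a given monomial.

```latex
% Each \psi_i[-1] contributes k - 1 - \alpha_i = -\alpha_i (case k = 1).
% Being fermionic, it occurs at most once, so the total is bounded below:
\[
  \underbrace{\sum_{i \in S} (-\alpha_i)}_{\psi_i[-1],\ i \in S}
  \;+\; \underbrace{\sum_{i} \alpha_i}_{\text{shift } \deg\cdot\alpha}
  \;=\; \sum_{i \notin S} \alpha_i \;\ge\; 0 .
\]
```

The remaining table entries are non-negative for the modes considered ($k \ge 1$ for $a$ and $k \ge 2$ for $\psi$, $k \ge 0$ for $b$ and $\varphi$), since $0 \le \alpha_i < 1$.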
We will now address the problem of string-theoretic cohomology vector spaces. Recall that string cohomology numbers were constructed by Batyrev and Dais in [5], and it was proved in [4] that they comply with predictions of Mirror Symmetry for Calabi–Yau complete intersections in toric varieties. Unfortunately, until now it was not known how to construct string cohomology vector spaces whose dimensions are the string cohomology numbers above. The analysis of this paper suggests the following definition, at least for toric varieties.

Definition 9.21. String-differential forms on P are the L[0] = 0 part of MSV(P). By Remark 2.8 it is a coherent sheaf.

The following proposition provides us with a much more practical definition of this sheaf that does not refer explicitly to sheaves of vertex algebras.

Proposition 9.22. For each $C^* \in \Sigma$ consider the $\mathbb{C}[C\oplus C^*]$-module $V_C$ defined as
\[
V_C = \bigoplus_{m\in C,\ n\in C^*,\ m\cdot n=0} \mathbb{C}\,x^m y^n,
\]
where the action of $x^k$ and $y^l$ is defined to be zero if the result violates m · n = 0. There is defined a differential $\mathrm{BRST}^{(0)}_g$ on $V_C \otimes_{\mathbb{C}} (\Lambda^* M_{\mathbb{C}})$ given by
\[
\mathrm{BRST}^{(0)}_g = \sum_{n\in C^*\cap\Delta^*} g_n\, y^n\, \mathrm{contr}(n),
\]
where contr(n) indicates contraction by n on $\Lambda^* M$. Then the cohomology of $\mathrm{BRST}^{(0)}_g$ is a finitely generated C[C]-module isomorphic to the L[0] = 0 component of $V_C$. Moreover, the grading by J[0] on it is defined as "degree in $\Lambda^* M$ plus degree of n", and the differential d is defined as $d(w\,x^m y^n) = (w\wedge m)\,x^m y^n$.

Proof. Due to Proposition 9.8, the L[0] = 0 part of $V_C$ could be obtained by applying $\mathrm{BRST}_g$ to the L[0] = 0 part of $\mathrm{Fock}_{C\oplus C^*}$. For every $|m,n\rangle$ from this space we already have m · n = 0, so all elements from this space are obtained by multiplying $|m,n\rangle$ by products of $\Phi_i[0]$, that is by $\Lambda^* M$. Then we only need to calculate the action of $\mathrm{BRST}_g$ on this space, as well as the actions of J[0] and Q[0]. This is accomplished by a direct calculation.

If desired, one can use the above proposition as a definition of the space of sections of the sheaf of string-differential forms over P. It is not hard to show that it is coherent directly.

Remark 9.23. Notice that the sheaf of string-differential forms is not locally free; it reflects singularities of P.

Remark 9.24. Another peculiar feature of the above description is that grading by eigenvalues of J[0] on the space of differential forms seems to be ill-defined, since J[0] varies with the cone. Nevertheless, this is not a problem, because the notion of J[0] behaves well under the localization, so string cohomology spaces do have an expected double grading. Besides, the de Rham operator $\oint Q(z)\,dz$ is clearly well-defined for string-differential forms on P.
Remark 9.25. The fiber of the sheaf of string-differential forms over the most singular point of AC is obtained by considering only the m = 0 part of the above space. It is easily seen to coincide with the prediction of [6]. Remark 9.26. It is not entirely clear if one should consider the cohomology of MSV(P) or the hypercohomology of it under the Q(0) operator. On one hand, hypercohomology might be a smaller and nicer object, but on the other hand taking hypercohomology may complicate the relation between A and B models. So our definitions below should be considered only provisional. Definition 9.27. String cohomology vector space is the hypercohomology of the complex of string-differential forms. It is very likely that our definition reproduces correctly the numbers of [5], but clearly more work is necessary. We would like to formulate this as a vague conjecture. Conjecture 9.28. For every variety X with only Gorenstein toroidal singularities there exists a loco sheaf MSV(X) which is locally isomorphic to the product of MSV(open ball) and MSV(singularity) defined above. This construction depends on the choice of parameters gn and perhaps on some other structures yet to be determined. The sheaf MSV(X) is provided with the structure of a sheaf of conformal vertex algebras, and with N = 2 structure if X is Calabi–Yau. The L[0] = 0 component of this sheaf has a natural grading and differential which generalizes de Rham differential. The hypercohomology of this complex has dimensions prescribed by [5] and possesses a pure Hodge structure if X is projective. 10. Hypersurfaces in Gorenstein Toric Fano Varieties: General Case Even though we are unable to construct the chiral de Rham complex for an arbitrary variety with Gorenstein toroidal singularities, the situation is somewhat better in the special case of a hypersurface X in a Gorenstein toric variety. We can use the formulas for the smooth case applied now to arbitrary cones. 
The resulting sheaf turns out to be loop-coherent. We are mostly interested in the case when the ambient variety P is Fano and the hypersurface X is Calabi–Yau and generic, but most statements hold true for any generic hypersurfaces.

We will try to extend the calculation of Sects. 6–8. We use the same notations Δ, Δ*, Σ, $M = M_1 \oplus \mathbb{Z}$, $N = N_1 \oplus \mathbb{Z}$, deg, deg* as in Sects. 7 and 8. We again consider a projective variety P, a line bundle L on it, and a hypersurface X in P given by f : Δ → C. We define MSV(X) as follows.

Definition 10.1. Let f : Δ → C be a set of coefficients that defines X and g : Δ* → C be a generic set of parameters. Then for any cone $C^* \in \Sigma$ that contains deg*, sections of the quasi-loco sheaf MSV(X) over the affine chart $A_C$ are defined as BRST cohomology of $\mathrm{Fock}_{M\oplus C^*}$ with BRST operator:
\[
\mathrm{BRST}_{f,g} = \oint \Big( \sum_{m\in\Delta} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n\in\Delta^*} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)} \Big)\,dz.
\]
For the above definition to make sense, we should show that the spaces of sections constructed above are compatible with localization. Moreover, we must show that they are loop-modules over the structure ring of X, which means that they are annihilated by f .
Proposition 10.2. The above definition indeed defines a quasi-loco sheaf of vertex algebras over X. It is provided with the structure of topological algebra by formulas of Proposition 7.15.

Proof. To prove compatibility with localizations, we need to show that for any cone $C^*$ of maximum dimension the $\mathrm{BRST}_{f,g}$ cohomology of $\mathrm{Fock}_{M\oplus C^*}$ is non-negatively graded with respect to L[0]. Then the argument of Proposition 7.1 shows the compatibility. The field L(z) here is given by the formulas of Proposition 7.15; in particular, it differs slightly from L(z) of Sect. 9. To avoid confusion we will call the operator given in 7.15 by $L_{X,A}[0]$. This notation is chosen to indicate that we are dealing with the Virasoro algebra of the A model on the hypersurface X. Explicitly,
\[
L_{X,A}(z) = {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:} - \deg^*\cdot\partial_z A(z),
\]
so $L_{X,A}[0]$ counts the opposite of the sum of mode numbers of A, B, Φ, Ψ plus m · n plus m · deg*.

One can split $\mathrm{BRST}_{f,g}$ as a sum of $\mathrm{BRST}_f$ and $\mathrm{BRST}_g$ as usual. Then we will have a spectral sequence as in Proposition 7.11. It is easily shown to be convergent because of Lemma 9.15. As a result, it is enough to show that the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{M\oplus C^*}$ has nonnegative $L_{X,A}[0]$ eigenvalues.

Since $C^*$ contains deg* and P is Gorenstein, the cone C has some special properties. One of the generators of its one-dimensional faces is a vertex $m_0$ of Δ, and all other generators lie in $M_1$. Notice that $\mathrm{Fock}_{M\oplus C^*}$ naturally splits as a tensor product of $\mathrm{Fock}_{M_1\oplus C_1^*}$ and $\mathrm{Fock}_{\mathbb{Z}m_0\oplus\mathbb{Z}_{\ge 0}\deg^*}$. Here $C_1^*$ is the cone in $N_1$ obtained by projecting $C^*$ there along deg*. Moreover, it is easy to see that the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{M\oplus C^*}$ is the tensor product of the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{M_1\oplus C_1^*}$ and the $\mathrm{BRST}_g$ cohomology of $\mathrm{Fock}_{\mathbb{Z}m_0\oplus\mathbb{Z}_{\ge 0}\deg^*}$. The BRST operator on the first space is defined precisely as in Sect. 9, and the BRST operator on the second space is
\[
\mathrm{BRST}(z) = \deg^*\cdot\Psi(z)\,e^{\int \deg^*\cdot A(z)}.
\]
Moreover, $L_{X,A}[0]$ is the sum of L[0] from Sect. 9 applied to $M_1\oplus C_1^*$ and
\[
L_2[0] = {:}(m_0\cdot B)(\deg^*\cdot A){:}[0] + {:}(\partial\, m_0\cdot\Phi)(\deg^*\cdot\Psi){:}[0] + \deg^*\cdot A[0].
\]
Proposition 9.7 assures that the BRST cohomology of $\mathrm{Fock}_{M_1\oplus C_1^*}$ does not have negative eigenvalues of $L_{X,A}[0]$. Explicit calculation of the BRST cohomology of $\mathrm{Fock}_{\mathbb{Z}m_0\oplus\mathbb{Z}_{\ge 0}\deg^*}$ given in Theorem 6.5 allows us to conclude that $L_2[0]$ also has nonnegative eigenvalues. This assures that $L_{X,A}[0]$ has only non-negative eigenvalues.

However, we have only shown so far that MSV(X) is a sheaf of vertex algebras over P. We also need to prove that the structure of the C[C]-loop-module induced from $\mathrm{Fock}_{M\oplus C^*}$ naturally gives the structure of a $\mathbb{C}[C\cap M_1]/r_f$-loop-module, where $r_f$ is the local equation of the hypersurface. It is enough to show this for a cone $C^*$ of maximum dimension. Locally the element $r_f$ is
\[
r_f = \sum_{m\in\Delta} f_m\, x^{m-m_0}.
\]
Notice that the corresponding field
\[
r_f(z) = \sum_{m\in\Delta} f_m\, e^{\int (m-m_0)\cdot B(z)}
\]
in $\mathrm{Fock}_{M\oplus C^*}$ could be expressed as an anti-commutator of $\mathrm{BRST}_{f,g}$ and
\[
R(z) = \deg^*\cdot\Psi(z)\,e^{-\int m_0\cdot B(z)}.
\]
Indeed, the OPE of this field with $\mathrm{BRST}_g(w)$ is non-singular, because either $n \ne \deg^*$ and $-m_0\cdot n = 0$, or $n = \deg^*$, which gives $-m_0\cdot n = -1$. However, in the latter case, we will also have $\deg^*\cdot\Psi(z)\,\deg^*\cdot\Psi(w)$, which is O(z − w), so overall the OPE is still non-singular. Hence, all modes of $r_f(z)$ act trivially on the cohomology by $\mathrm{BRST}_{f,g}$, which finishes the proof of the proposition.

We remark that it is plausible that all modes of $e^{\int m_0\cdot B(z)}$ act trivially as well, but we do not need to prove it, because $\mathbb{C}[C\cap M_1]$ is embedded in C[C].

It seems certain that MSV(X) is loco with respect to $L_{X,A}[0]$, but we do not have a proof of it yet. It would follow from any reasonable solution of Conjecture 9.28. On the other hand, we can easily show that MSV(X) is loco with respect to the grading $L_{X,B}[0]$ that comes from the B model Virasoro field. This field is given by
\[
L_{X,B}(z) = {:}B(z)\cdot A(z){:} - {:}\Phi(z)\cdot\partial\Psi(z){:} - \deg\cdot\partial_z B(z).
\]

Theorem 10.3. MSV(X) is a loop-coherent sheaf with respect to the grading $L_{X,B}[0]$.

Proof. The question is local and it is enough to consider a cone $C^*$ of maximum dimension. We are working in the set-up of the previous proposition. By Theorem 9.20, the cohomology of $\mathrm{Fock}_{M\oplus C^*}$ with respect to $\mathrm{BRST}_g$ has graded components that are Noetherian almost-modules over C[C]. Indeed, the expressions for $L_B$ and $L_{X,B}$ are identical (which is not the case for the A model). Notice that for a sufficiently big integer k, we can express $e^{\int k m_0\cdot B(z)}$ as an anticommutator of some field and $\mathrm{BRST}_{f,g}$. That field is similar to the one in the proof of Proposition 8.1, but without the extra $e^{-\int k m_0\cdot B}$. The spectral sequence from the $\mathrm{BRST}_f$ cohomology of the $\mathrm{BRST}_g$ cohomology to the $\mathrm{BRST}_{f,g}$ cohomology degenerates by Lemma 9.15, so the $\mathrm{BRST}_{f,g}$ cohomology has graded components that are Noetherian almost-modules over $\mathbb{C}[C]/(x^{k m_0})$. Since we have already shown that $r_f$ acts trivially, these spaces are Noetherian over the structure ring of X.

Corollary 10.4.
For a fixed pair of eigenvalues of L[0] and J [0], the corresponding eigen-spaces of H ∗ (MSV(X)) are finite-dimensional. It is our firm belief that after Conjecture 9.28 is successfully proved, the sheaf MSV(X) could be identified as a (generalized) chiral de Rham complex of X as implied by this notation. However, it is still well defined as a sheaf of vertex algebras and one may ask how to calculate its cohomology. We can also ask whether the analog of Proposition 7.11 still holds. For orbifold singularities Proposition 7.11 still holds the way it is stated, but the proof must be different, because vertical cohomology of the double complex considered there is nonzero for more than one row. In the case of orbifold singularities spectral sequence still degenerates, because we may split the picture according to the eigenvalues of B[0] modulo the lattice spanned by generators of one-dimensional faces of C ∗ . This spectral sequence might degenerate for every C ∗ , but we can not prove it.
Unfortunately, Proposition 7.12 does not hold even for orbifold singularities. It is also not clear that the double complex of Theorem 7.14 gives degenerate spectral sequences for arbitrary Gorenstein toric Fano varieties. On the other hand, the $\mathrm{BRST}_{f,g}$ cohomology of $\mathrm{Fock}_{M\oplus K^*}$ could still be the correct vertex algebra to consider, once the relation to physicists' A and B models becomes more clear. Section 8 never uses the fact that P is non-singular and generalizes to any P. To sum it up, it is plausible that Theorem 8.3 holds for any toric Gorenstein Fano varieties, but we can only prove it in the smooth case.

To complete the discussion we must mention that Mirror Symmetry for Calabi–Yau complete intersections in Gorenstein toric Fano varieties can be adequately treated by the methods of this paper. It is appropriate to state the final conjecture that covers all examples of "toric" Mirror Symmetry. See [3] for the relation between complete intersection examples of Mirror Symmetry and pairs of dual reflexive Gorenstein cones.

Conjecture 10.5. Let M and N be dual lattices with dual cones K and K* in them. We assume that K and K* are reflexive Gorenstein, which means that K ⊕ K* is Gorenstein in M ⊕ N. We denote the corresponding degree elements by deg and deg*. We are also provided with generic numbers $g_n$ and $f_m$ for all elements in K and K*. Then, if the reflexive cones come from Calabi–Yau complete intersections, vertex algebras of these Calabi–Yau manifolds are degenerations of the BRST cohomology of $\mathrm{Fock}_{M\oplus N}$ by operator
\[
\mathrm{BRST}_{f,g} = \oint \Big( \sum_{m} f_m\,(m\cdot\Phi)(z)\,e^{\int m\cdot B(z)} + \sum_{n} g_n\,(n\cdot\Psi)(z)\,e^{\int n\cdot A(z)} \Big)\,dz.
\]
The degeneration is provided by fans that define the corresponding toric varieties. The structure of topological algebras of dimension dim M − 2 deg · deg* is given by
\begin{align*}
Q(z) &= A(z)\cdot\Phi(z) - \deg\cdot\partial_z\Phi(z), \\
G(z) &= B(z)\cdot\Psi(z) - \deg^*\cdot\partial_z\Psi(z), \\
J(z) &= {:}\Phi(z)\cdot\Psi(z){:} + \deg\cdot B(z) - \deg^*\cdot A(z), \\
L(z) &= {:}B(z)\cdot A(z){:} + {:}\partial_z\Phi(z)\cdot\Psi(z){:} - \deg^*\cdot\partial_z A(z).
\end{align*}
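As a quick sanity check of the dimension count, one can specialize to the hypersurface setting of Sect. 10. This is an illustrative sketch only; it assumes that $M = M_1 \oplus \mathbb{Z}$ with $\operatorname{rank} M_1 = d$ (the symbol $d$ is introduced here) and that $\deg\cdot\deg^* = 1$ in that setting, neither of which is stated explicitly in the conjecture.

```latex
% Hypersurface case: rank M = d + 1 and deg . deg^* = 1 (assumed).
\[
  \dim M - 2\,\deg\cdot\deg^* \;=\; (d+1) - 2\cdot 1 \;=\; d - 1 ,
\]
% which is the dimension of a hypersurface X in a d-dimensional toric variety;
% for instance d = 4 gives a Calabi--Yau threefold.
```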
11. Open Questions and Concluding Remarks

In this section we point out important questions that were not addressed in this paper as well as possible applications of our results and techniques.
• It remains to show that deformations of the Master Family of vertex algebras are flat in the appropriate sense. For instance, we would love to say that dimensions of L[0] eigen-spaces are preserved under these deformations.
• One can generalize the construction of Conjecture 10.5 to go from K ⊕ K* in M ⊕ N to any Gorenstein self-dual cone in a lattice with inner product by using the BRST field
\[
\mathrm{BRST}(z) = \sum_{n} g_n\,(n\cdot\mathrm{fermion})(z)\,e^{\int n\cdot\mathrm{boson}(z)}.
\]
These theories still have conformal structure, with L given as $L_{flat} - \deg\cdot\partial\,\mathrm{boson}$. Do these theories have any nice properties or physical significance?
• It would be great to provide vertex algebras of Mirror Symmetry with unitary structure. Inequalities on eigenvalues of L[0] and J[0] seem to suggest its existence.
• It is extremely important to use results of this paper to get actual correlators of the corresponding conformal field theories and thus to draw a connection to the calculation of [8].
• There must be a connection between calculations of this paper and moduli spaces of stable maps defined by Kontsevich. It remains a mystery at this time.
• It would be interesting to see how the GKZ hypergeometric system enters into our picture. Solutions of the GKZ system are known to give cohomology of Calabi–Yau hypersurfaces, see for example [12, 21].
• One must also do something about the antiholomorphic part of the N = (2, 2) superconformal algebra. Perhaps this problem is not too hard and has its roots in the author's ignorance.
• One should construct generalized chiral de Rham complexes as suggested in Conjecture 9.28. This seems to be a realistic project since all we really need to do is to extend the automorphisms of toroidal singularities to the suggested local descriptions of chiral de Rham complexes. It is also interesting to see which of the standard properties of cohomology of smooth varieties generalize to string-theoretic cohomology.
• One should define string cohomology for all Gorenstein, and perhaps Q-Gorenstein singularities. See [2] for the definition of string cohomology numbers in this generality. This may also shed some light on the generalized McKay correspondence.
• Our results and techniques may have applications to hyperbolicity. Indeed, for a smooth X there is a loco subsheaf $\mathrm{MSV}_{b,\varphi}$ of MSV(X) which is generated by modes of $b^i$ and $\varphi^i$ only. This part of MSV(X) is contravariant, so for any map from a line to X it could be pulled back to it. Then global sections of $\mathrm{MSV}_{b,\varphi}$ could give restrictions on possible maps to X. On the other hand, the rich structure of the whole MSV(X) might help to show that there are plenty of such sections.
• Finally, cohomology of MSV(X) is graded by L[0] and J[0], and one can show that $\mathrm{Trace}(q^{L[0]} w^{J[0]})$ has some modular properties. It is directly related to the elliptic genus of X. This issue is addressed in the upcoming joint paper with Anatoly Libgober [7].

Acknowledgements. This project began in 1995 while I was a Sloan predoctoral fellow at the University of Michigan. I thank Martin Halpern and Christoph Schweigert who helped me learn the basics of conformal field theory. Konstantin Styrkas has answered a couple of my questions regarding vertex algebras, which has greatly improved my understanding of the subject.
References 1. Batyrev, V.V.: Dual polyhedra and mirror symmetry for Calabi–Yau hypersurfaces in toric varieties. J. Algebraic Geom. 3, 493–535 (1994) 2. Batyrev, V.V.: Stringy Hodge numbers and Virasoro algebra. Math. Res. Lett. 7, no. 2–3, 155–164 (2000) 3. Batyrev, V.V., Borisov, L.A.: Dual Cones and Mirror Symmetry for Generalized Calabi–Yau Manifolds. Mirror Symmetry II, 1995. B. Greene and S.-T. Yau, eds., Cambridge: International Press, 1997, pp. 65–80 4. Batyrev, V.V., Borisov, L.A.: Mirror Duality and String-theoretic Hodge Numbers. Invent. Math. 126, Fasc. 1, 183–203 (1996) 5. Batyrev, V.V., Dais, D.I.: Strong McKay Correspondence, String-theoretic Hodge Numbers and Mirror Symmetry. Topology 35, 901–929 (1996) 6. Borisov, L.A.: String Cohomology of a Toroidal Singularity. J. Alg. Geom. 9, 289–300 (2000) 7. Borisov, L.A., Libgober, A.: Elliptic Genera of Toric Varieties and Applications to Mirror Symmetry. Invent. Math. 140, no. 2, 453–485 (2000) 8. Candelas, P., de la Ossa, X.C., Green, P.S. and Parkes, L.: A pair of Calabi–Yau manifolds as an exactly soluble superconformal theory. Nuclear Phys. B 359, 21–74 (1991) 9. Danilov, V.I.: The Geometry of Toric Varieties. Russian Math. Surveys 33, 97–154 (1978) 10. Fulton, W.: Introduction to toric varieties. Princeton, NJ: Princeton University Press, 1993
11. Givental, A.B.: Equivariant Gromov–Witten Invariants. Internat. Math. Res. Notices 13, 613–663 (1996) 12. Hosono, S.: GKZ Systems, Gröbner Fans and Moduli Spaces of Calabi–Yau Hypersurfaces. In: Topological field theory, primitive forms and related topics (Kyoto, 1996), 239–265, Progr. Math. 160, Boston, MA: Birkhäuser Boston, 1998 13. Kac, V.: Vertex algebras for beginners. University Lecture Series, 10, Providence, RI: American Mathematical Society, 1997 14. Kontsevich, M.: Enumeration of rational curves via torus actions. In: The moduli space of curves (Texel Island, 1994), Progr. Math. 129, Boston, MA: Birkhäuser Boston, 1995, pp. 335–368 15. Lerche, W., Vafa, C., Warner, P.: Chiral rings in N=2 superconformal theories. Nucl. Phys. B 324, 427–474 (1989) 16. Lian, B., Liu, K., Yau, S.-T.: Mirror Principle I. Asian J. Math. 1, 729–763 (1997) 17. Malikov, F., Schechtman, V., Vaintrob, A.: Chiral de Rham complex. Commun. Math. Phys. 204, 439–473 (1999) 18. Morrison, D.R.: Making Enumerative Predictions by Means of Mirror Symmetry. In: Mirror Symmetry II B. Greene and S.-T. Yau, eds., Cambridge, MA: International Press, 1997, pp. 457–482 19. Oda, T.: Convex Bodies and Algebraic Geometry – An Introduction to the Theory of Toric Varieties. Ergeb. Math. Grenzgeb. (3), vol. 15, Berlin–Heidelberg–NewYork–London–Paris–Tokyo: Springer-Verlag, 1988 20. Schwarz, A.: Sigma-models having supermanifolds as target spaces. Lett. Math. Phys. 38, 91 (1996) 21. Stienstra, J.: Resonant Hypergeometric Systems and Mirror Symmetry. In: Integrable systems and algebraic geometry (Kobe/Kyoto, 1997), River Edge, NJ: World Sci. Publishing, 1998, pp. 412–452 22. Witten, E.: Mirror manifolds and topological field theory. In: Essays on Mirror Manifolds, S.-T. Yau, ed., Hong Kong: International Press, 1992, pp. 120–159 Communicated by R. H. Dijkgraaf
Commun. Math. Phys. 215, 559 – 581 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
On Spherically Symmetric Solutions of the Compressible Isentropic Navier–Stokes Equations
Song Jiang1, Ping Zhang2
1 Institute of Applied Physics and Computational Mathematics, P.O. Box 8009, Beijing 100088, P. R. China.
E-mail: [email protected]
2 Institute of Mathematics, Academia Sinica, Beijing 100080, P. R. China.
E-mail: [email protected] Received: 17 January 2000 / Accepted: 3 July 2000
Dedicated to Professor Rolf Leis on the occasion of his 70th birthday

Abstract: We prove the global existence of weak solutions to the Cauchy problem for the compressible isentropic Navier–Stokes equations in $\mathbb{R}^n$ ($n = 2, 3$) when the Cauchy data are spherically symmetric. The proof is based on the exploitation of the one-dimensional feature of symmetric solutions and use of a new (multidimensional) property induced by the viscous flux. The present paper extends Lions' existence theorem [15] to the case $1 < \gamma < \gamma_n$ for spherically symmetric initial data, where $\gamma$ is the specific heat ratio in the pressure, $\gamma_n = 3/2$ for $n = 2$ and $\gamma_n = 9/5$ for $n = 3$.

1. Introduction

We prove the global existence of spherically symmetric weak solutions to the Cauchy problem for the compressible isentropic Navier–Stokes equations in two or three space dimensions. The spherically symmetric motion of a compressible viscous, isentropic fluid in $\mathbb{R}^n$ ($n = 2, 3$) is described by the system of equations in Eulerian coordinates:
\[
\begin{aligned}
&\rho_t + (\rho u)_x + \frac{m\rho u}{x} = 0,\\
&(\rho u)_t + \bigl(\rho u^2 + P(\rho)\bigr)_x + \frac{m\rho u^2}{x} = \mu u_{xx} + \mu m\Bigl(\frac{u}{x}\Bigr)_x
\end{aligned}
\tag{1.1}
\]
together with initial and boundary conditions
\[
\rho(0,x) = \rho_0(x),\quad (\rho u)(0,x) = m_0(x),\quad x \ge 0;\qquad u(t,0) = 0,\quad t \ge 0,
\tag{1.2}
\]
Corresponding author
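As a brief consistency check (a sketch added for illustration, not taken from the paper), the geometric term $m\rho u/x$ in the first equation of (1.1) can be recovered from the divergence of a radial field in $\mathbb{R}^n$. The symbols $r = |\mathbf{x}|$ and $\mathbf{U} = u\,\mathbf{x}/r$ are notation assumed here; the paper writes $x$ for $r$.

```latex
% Sketch: radial reduction of div(rho U) in R^n, with m = n - 1.
% Assumptions: rho = rho(t,r), U(t,x) = u(t,r) x / r, r = |x| > 0, smooth fields.
\begin{align*}
\operatorname{div}(\rho\mathbf{U})
  &= \sum_{i=1}^{n}\partial_{x_i}\Bigl(\rho u\,\frac{x_i}{r}\Bigr)
   = (\rho u)_r\sum_{i=1}^{n}\frac{x_i^2}{r^2}
     + \rho u\sum_{i=1}^{n}\Bigl(\frac{1}{r} - \frac{x_i^2}{r^3}\Bigr) \\
  &= (\rho u)_r + \frac{(n-1)\,\rho u}{r}
   = (\rho u)_r + \frac{m\,\rho u}{r}.
\end{align*}
```

Under these assumptions, $\rho_t + \operatorname{div}(\rho\mathbf{U}) = 0$ reduces to the first equation of (1.1).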
where $m = n - 1$ ($n = 2, 3$), $P(\rho) = a\rho^\gamma$; $\rho$, $u$ and $P(\rho)$ are the density, the velocity and the pressure respectively, and $\gamma \ge 1$ (the specific heat ratio), $a > 0$, $\mu > 0$ are constants.

Initial boundary value problems and the initial value problem for the compressible Navier–Stokes equations (of heat-conducting flow) have been studied by a great many authors. In one dimension, it is well known that global (smooth and weak) solutions exist for large initial data and are time-asymptotically stable. In more than one dimension, Matsumura and Nishida proved the existence of global smooth solutions and obtained the decay rates of solutions for sufficiently small initial data. See [18–20]; see also [4, 5, 27, 23, 10, 13] and the references cited therein for more results on small data. For large initial data, the global existence and large-time behavior of solutions to the Navier–Stokes equations for compressible heat-conducting flow have been obtained in the spherically symmetric case. See, for example, [21, 3, 2, 9, 11], among others.

Concerning the global existence for general large initial data in general domains, little was known until Lions' work [14–16] (also see [25, 26, 28, 17] for other cases). In [15, 16] Lions (also cf. [8]) used the weak convergence method and showed the existence of global weak solutions to the Navier–Stokes equations for compressible isentropic flow under the assumption that the specific heat ratio $\gamma$ in the pressure law $P = a\rho^\gamma$ satisfies $\gamma \ge \gamma_n$, where $\gamma_n = 3/2$ for $n = 2$ and $\gamma_n = 9/5$ for $n = 3$. Unfortunately, this assumption excludes, for example, the interesting case $\gamma = 1.4$ (air, etc.). Recently, under a condition on $\gamma$ similar to that of Lions [15], Feireisl, Matušů-Nečasová, Petzeltová and Straškraba [6, 7] studied the large-time behavior of weak solutions and the existence of weak periodic solutions by applying techniques similar to those of Lions [15]. The aim of this paper is to study the existence of global weak solutions for $1 < \gamma < \gamma_n$.
We will prove that if the initial data are spherically symmetric, then the Cauchy problem (i.e. the problem (1.1), (1.2)) for the compressible isentropic Navier–Stokes equations possesses a global weak solution for any $\gamma > 1$. We mention that when $\gamma = 1$, the existence of a weak solution for BV-data has been proved by Hoff [3].

Now, let us recall the definition of weak solutions of (1.1), (1.2). The notation appearing below will be defined at the end of this section.

Definition 1.1. We call $(\rho(t,x), u(t,x))$ a global weak solution of (1.1), (1.2), if

1) $\rho \ge 0$ a.e., and for any $T > 0$,
\[
\begin{aligned}
&\rho \in L^\infty([0,T], L^\gamma(\mathbb{R}^+)),\quad \rho u^2 \in L^\infty([0,T], L^1(\mathbb{R}^+)),\quad u_x,\ u/x \in L^2([0,T], L^2(\mathbb{R}^+)),\\
&\rho \in C^0([0,T], L^\gamma_{loc}(\mathbb{R}^+_0) - w),\quad \rho u \in C^0([0,T], L^{\frac{2\gamma}{\gamma+1}}_{loc}(\mathbb{R}^+_0) - w),\\
&(\rho, \rho u)(0,x) = (\rho_0, m_0)(x) \ \text{weakly in } L^\gamma_{loc}(\mathbb{R}^+_0) \times L^{\frac{2\gamma}{\gamma+1}}_{loc}(\mathbb{R}^+_0).
\end{aligned}
\tag{1.3}
\]

2) For any $t_2 \ge t_1 \ge 0$ and any $\varphi \in C_0^1(\mathbb{R} \times \mathbb{R}^+_0)$, there holds
\[
\int_0^\infty \rho\varphi\, x^m dx\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}\int_0^\infty (\rho\varphi_t + \rho u\varphi_x)\, x^m dx\,dt = 0.
\tag{1.4}
\]

3) For any $t_2 \ge t_1 \ge 0$ and any $\varphi \in C_0^1(\mathbb{R} \times \mathbb{R}^+_0)$ satisfying $\varphi(t,0) = 0$, there holds
\[
\int_0^\infty \rho u\varphi\, x^m dx\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}\int_0^\infty \Bigl[\rho u\varphi_t + \rho u^2\varphi_x + P(\rho)\Bigl(\varphi_x + \frac{m\varphi}{x}\Bigr)\Bigr] x^m dx\,dt
= -\mu\int_{t_1}^{t_2}\int_0^\infty \Bigl(u_x\varphi_x + \frac{m u\varphi}{x^2}\Bigr) x^m dx\,dt.
\tag{1.5}
\]
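The viscous term on the right-hand side of (1.5) can be checked by a short integration by parts. The following is only a sketch for smooth $u$ and a test function $\varphi$ with compact support and $\varphi(t,0) = 0$, so that all boundary terms vanish; it is not part of the original text.

```latex
% Sketch: weak form of mu*(u_x + m u/x)_x against varphi * x^m on (0, infinity).
% Assumptions: u smooth, varphi compactly supported, varphi(t,0) = 0.
\begin{align*}
\mu\int_0^\infty\Bigl(u_x + \frac{m u}{x}\Bigr)_x\varphi\,x^m\,dx
 &= -\mu\int_0^\infty\Bigl(u_x + \frac{m u}{x}\Bigr)\bigl(\varphi_x x^m + m\varphi x^{m-1}\bigr)\,dx \\
 &= -\mu\int_0^\infty\Bigl(u_x\varphi_x + \frac{m^2 u\varphi}{x^2}\Bigr)x^m\,dx
    - \mu m\int_0^\infty (u\varphi)_x\,x^{m-1}\,dx \\
 &= -\mu\int_0^\infty\Bigl(u_x\varphi_x + \frac{m^2 - m(m-1)}{x^2}\,u\varphi\Bigr)x^m\,dx \\
 &= -\mu\int_0^\infty\Bigl(u_x\varphi_x + \frac{m\,u\varphi}{x^2}\Bigr)x^m\,dx .
\end{align*}
```

Here $\mu u_{xx} + \mu m (u/x)_x = \mu(u_x + m u/x)_x$ is the right-hand side of the second equation of (1.1).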
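The main result of this paper reads:

Theorem 1.1. Let $\gamma > 1$, $0 \le \rho_0 \in L^1(\mathbb{R}^+) \cap L^\gamma(\mathbb{R}^+)$, $m_0 \in L^{\frac{2\gamma}{\gamma+1}}(\mathbb{R}^+)$, $m_0^2/\rho_0 \in L^1(\mathbb{R}^+)$. Then there exists a global weak solution $(\rho, u)$ of (1.1), (1.2), such that
\[
\int_0^\infty \Bigl[\frac{\rho u^2}{2} + \psi(\rho)\Bigr](t,x)\, x^m dx + \mu\int_0^t\int_0^\infty \Bigl(u_x^2 + \frac{m u^2}{x^2}\Bigr) x^m dx\,d\tau
\le \int_0^\infty \Bigl[\frac{m_0^2}{2\rho_0} + \psi(\rho_0)\Bigr](x)\, x^m dx \quad \forall\, t \ge 0,
\tag{1.6}
\]
where $\psi(\rho) = \frac{a}{\gamma-1}\rho^\gamma$. Moreover, for any $T > 0$ and any $\varphi_1 \in C_0^\infty(\mathbb{R}^+_0)$ with $\varphi_1(x) = O(x^n)$ as $x \to 0$, there holds
\[
\int_0^T\int_0^\infty \rho^{2\gamma}(t,x)\varphi_1^4(x)\, dx\,dt \le C.
\]
If, in addition, $\gamma > n/2$, then for any $T > 0$ and any $\varphi_2 \in C_0^\infty(\mathbb{R}^+_0)$ with $\varphi_2(x) = O(x^m)$ as $x \to 0$, there holds
\[
\int_0^T\int_0^\infty \rho^{\gamma+\theta}(t,x)\varphi_2(x)\, dx\,dt \le C \quad \forall\, \theta < \frac{2\gamma}{n} - 1.
\]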
Remark 1.1. (i) For the case $\lim_{x\to\infty}\rho_0(x) = \rho_\infty > 0$, $\lim_{x\to\infty}u_0(x) = u_\infty$, we could use cut-off function arguments similar to those used in [15, Section 5.6] and obtain a similar existence theorem. (ii) Define $\rho(t,\mathbf{x}) := \rho(t,x)$, $\mathbf{U}(t,\mathbf{x}) := u(t,x)\,\mathbf{x}/x$ for $t \ge 0$ and $\mathbf{x} \in \mathbb{R}^n$, where $x = |\mathbf{x}|$, $\mathbf{x} = (x_1, \cdots, x_n)$. Then it is easy to see that $(\rho(t,\mathbf{x}), \mathbf{U}(t,\mathbf{x}))$ is a weak solution of the Cauchy problem for the compressible isentropic Navier–Stokes equations in $\mathbb{R}^n$ (cf. the proof of Theorem 5.7 in [3]).

The proof of Theorem 1.1 is based on a basic energy estimate and passing to the limit for the approximate solutions. To pass to the limit, we mainly use and adapt the idea of Lions in the study of the compressible isentropic Navier–Stokes equations [15]. As mentioned before, in Lions' arguments, $\gamma \ge \gamma_n$ has to be assumed because, in order to estimate the commutators and to exclude the concentration of mass at the origin, the higher integrabilities of $\rho$ (for example $L^2$-integrability) are needed. To overcome the difficulties induced by $\gamma < \gamma_n$, here we exploit the one-dimensional feature of symmetric solutions, use integrals instead of commutators and employ a new property induced by the viscous flux $a\rho^\gamma - \mu u_x - \mu m u/x$ (cf. Remark 3.1), which shows the $L^{2/\theta}$-integrability of $\rho^\theta - \overline{\rho^\theta}$ near the origin for some small constant $\theta > 0$ (cf. Lemma 3.2), where $\overline{\rho^\theta}$ is the weak limit of the $\theta$th power of the approximate density sequence. This $L^{2/\theta}$-integrability of $\rho^\theta - \overline{\rho^\theta}$, though not the $L^2$-integrability of $\rho$ itself, is sufficient to conclude that the concentration of mass cannot develop at the origin. (It should be remarked here that this $L^{2/\theta}$-integrability seems to be a multidimensional property, cf. Remark 3.2.)

This paper is organized as follows. In Sect. 2 we construct an approximate solution sequence and derive a priori estimates for the approximate solutions. In Sect. 3 we give the proof of Theorem 1.1.

Notation used throughout this paper. Let $\Omega$ be a domain in $\mathbb{R}$. Let $m$ be an integer and let $1 \le p \le \infty$. By $W^{m,p}(\Omega)$ ($W_0^{m,p}(\Omega)$) we denote the usual Sobolev space defined over $\Omega$. $W^{m,2}(\Omega) \equiv H^m(\Omega)$ ($W_0^{m,2}(\Omega) \equiv H_0^m(\Omega)$), $W^{0,p}(\Omega) \equiv L^p(\Omega)$ with norm $\|\cdot\|_{L^p(\Omega)}$. We define the weighted space $\mathcal{L}^p(\Omega) := \{f \in L^1_{loc}(\Omega);\ \int_\Omega |f(x)|^p x^m dx < \infty\}$ with norm $\|\cdot\|_{\mathcal{L}^p(\Omega)} := (\int_\Omega |\cdot|^p x^m dx)^{1/p}$. $\mathcal{L}^p_{loc}(\Omega)$ and $\mathcal{H}^1_{loc}(\Omega)$ are defined similarly to $L^p_{loc}(\Omega)$ and $H^1_{loc}(\Omega)$, respectively. For simplicity we also use the following abbreviations: $\|\cdot\|_{L^p} \equiv \|\cdot\|_{L^p(\mathbb{R}^+)}$, $\mathbb{R}^+ := (0,\infty)$, $\mathbb{R}^+_0 := [0,\infty)$.

$L^p(I,B)$ resp. $\|\cdot\|_{L^p(I,B)}$ denotes the space of all strongly measurable, $p$th-power integrable (essentially bounded if $p = \infty$) functions from $I$ to $B$ resp. its norm, $I \subset \mathbb{R}$ an interval, $B$ a Banach space. $C^0(I, B - w)$ is the space of all functions which are in $L^\infty(I,B)$ and continuous in $t$ with values in $B$ endowed with the weak topology. The same letter $C$ (sometimes used as $C(X)$ to emphasize the dependence of $C$ on $X$) will denote various positive constants which do not depend on $\varepsilon$.

2. Approximate Solutions and a Priori Estimates

We will use the approximate solutions constructed by Hoff in [3]. First, we mollify the initial data as follows:
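\[
\begin{aligned}
\rho_0^\varepsilon(x) &:= x^{-\frac{m}{\gamma}}\bigl[(x^{\frac{m}{\gamma}}\rho_0) * j_{\varepsilon/2}\bigr](x) + \varepsilon,\\
m_0^\varepsilon(x) &:= x^{-\frac{m}{2}}\sqrt{\rho_0^\varepsilon}\,\Bigl[\Bigl(\chi^\varepsilon x^{\frac{m}{2}}\frac{m_0}{\sqrt{\rho_0}}\Bigr) * j_{\varepsilon/2}\Bigr](x),\qquad u_0^\varepsilon(x) := \frac{m_0^\varepsilon}{\rho_0^\varepsilon}(x),
\end{aligned}
\]
where $j_{\varepsilon/2}$ is the standard Friedrichs mollifier, $\chi^\varepsilon$ is a $C_0^\infty(\mathbb{R}^+)$-function satisfying $\chi^\varepsilon(x) = 0$ for $x \le \varepsilon$ and $\chi^\varepsilon(x) = 1$ for $x \ge 2\varepsilon$. Then, it is not difficult to prove that for any $N > 0$,
\[
\int_\varepsilon^N \Bigl(|\rho_0^\varepsilon - \rho_0 - \varepsilon|^\gamma + \Bigl|\sqrt{\rho_0^\varepsilon}\,u_0^\varepsilon - \frac{m_0}{\sqrt{\rho_0}}\Bigr|^2\Bigr) x^m dx + \int_\varepsilon^\infty |\rho_0^\varepsilon u_0^\varepsilon - m_0|^{\frac{2\gamma}{\gamma+1}} x^m dx \to 0
\tag{2.1}
\]
as $\varepsilon \to 0$, and moreover,
\[
\int_\varepsilon^\infty |\rho_0^\varepsilon - \rho_0 - \varepsilon|\, x^m dx \le \Bigl(\frac{N^n}{n}\Bigr)^{\frac{\gamma-1}{\gamma}}\Bigl(\int_\varepsilon^N |\rho_0^\varepsilon - \rho_0 - \varepsilon|^\gamma x^m dx\Bigr)^{1/\gamma} + C\int_{N-1}^\infty \rho_0\, x^m dx \to 0
\quad\text{as } \varepsilon \to 0,\ N \to \infty.
\tag{2.2}
\]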
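Denote
\[
\psi_\varepsilon(\rho) := \frac{a\rho^\gamma}{\gamma-1} - \frac{a\varepsilon^{\gamma-1}\rho}{\gamma-1} - a\varepsilon^{\gamma-1}\rho + a\varepsilon^\gamma.
\tag{2.3}
\]
Then, it is easy to see by (2.1) and (2.2) that
\[
\int_\varepsilon^\infty \Bigl(\Bigl|\rho_0^\varepsilon(u_0^\varepsilon)^2 - \frac{m_0^2}{\rho_0}\Bigr| + \bigl|\psi_\varepsilon(\rho_0^\varepsilon) - \psi(\rho_0)\bigr|\Bigr) x^m dx \to 0
\quad\text{as } \varepsilon \to 0.
\tag{2.4}
\]
Now, we consider the following approximate problem of (1.1), (1.2):
\[
\begin{aligned}
&[\rho_\varepsilon]_t + [\rho_\varepsilon u_\varepsilon]_x + \frac{m\rho_\varepsilon u_\varepsilon}{x} = 0,\\
&[\rho_\varepsilon u_\varepsilon]_t + \bigl[\rho_\varepsilon(u_\varepsilon)^2 + P(\rho_\varepsilon)\bigr]_x + \frac{m\rho_\varepsilon(u_\varepsilon)^2}{x} = \mu[u_\varepsilon]_{xx} + \mu m\Bigl[\frac{u_\varepsilon}{x}\Bigr]_x
\end{aligned}
\tag{2.5}
\]
together with initial and boundary conditions
\[
\rho_\varepsilon(0,x) = \rho_0^\varepsilon(x),\quad u_\varepsilon(0,x) = u_0^\varepsilon(x),\quad x \ge \varepsilon;\qquad u_\varepsilon(t,\varepsilon) = 0,\quad t \ge 0.
\tag{2.6}
\]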
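Then, by Theorem 4.1 of [3] (also see Theorem 5.6 of [3]), the problem (2.5), (2.6) has a global weak solution $(\rho_\varepsilon(t,x), u_\varepsilon(t,x))$ on $[0,\infty) \times [\varepsilon,\infty)$ with positive $\rho_\varepsilon$, such that $\rho_\varepsilon \in C^0([0,\infty), L^\gamma_{loc}([\varepsilon,\infty)))$, $u_\varepsilon \in C^0([0,\infty), L^2(\varepsilon,\infty)) \cap C^0((0,\infty), H_0^1(\varepsilon,\infty))$, $\rho_\varepsilon(t,x)$ is pointwise bounded from above, and
\[
\int_\varepsilon^\infty \Bigl[\frac12\rho_\varepsilon u_\varepsilon^2 + \psi_\varepsilon(\rho_\varepsilon)\Bigr](t,x)\, x^m dx + \mu\int_0^t\int_\varepsilon^\infty \Bigl([u_\varepsilon]_x^2 + m\frac{u_\varepsilon^2}{x^2}\Bigr) x^m dx\,d\tau
\le \int_\varepsilon^\infty \Bigl[\frac12\rho_0^\varepsilon(u_0^\varepsilon)^2 + \psi_\varepsilon(\rho_0^\varepsilon)\Bigr] x^m dx \le C \quad \forall\, t \ge 0,
\tag{2.7}
\]
where $\psi_\varepsilon$ is the same as in (2.3) and we have used (2.4). Recalling the definition of $\psi_\varepsilon$, we get from (2.7) and Young's inequality that for any $t \ge 0$ and any $N > \varepsilon$,
\[
\int_\varepsilon^N \rho_\varepsilon^\gamma(t,x)\, x^m dx \le C N^n\varepsilon^\gamma + C\int_\varepsilon^\infty \Bigl[\frac{\rho_0^\varepsilon(u_0^\varepsilon)^2}{2} + \psi_\varepsilon(\rho_0^\varepsilon)\Bigr] x^m dx.
\tag{2.8}
\]
In order to obtain Theorem 1.1 we have to prove the precompactness of the approximate solution sequence $(\rho_\varepsilon, u_\varepsilon)$. For this purpose, we need some higher space-time integration estimates for the density $\rho_\varepsilon$, which will be derived in the following two lemmas. For the sake of simplicity we will omit the subscript $\varepsilon$ from now on until the end of this section.

Lemma 2.1. For any $T > 0$ and any $\varphi \in C_0^\infty([0,\infty))$, $\varphi(x) = O(x^n)$ as $x \to 0$, there exists a positive constant $C$, independent of $\varepsilon$, such that
\[
\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}(t,x)\varphi^4(x)\, dx\,dt \le C.
\]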
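Proof. We multiply $(2.5)_2$ by $\varphi$ and integrate over $(x,\infty)$ ($x \in [\varepsilon,\infty)$) to obtain
\[
\partial_t\int_x^\infty \varphi\rho u\, dy - \int_x^\infty (\rho u^2 + a\rho^\gamma)\varphi_y\, dy + \int_x^\infty \frac{m\rho u^2}{y}\varphi\, dy
= (a\rho^\gamma + \rho u^2)\varphi - \mu u_x\varphi - \mu m\frac{u}{x}\varphi - \mu\int_x^\infty \Bigl(u_y + \frac{m u}{y}\Bigr)\varphi_y\, dy,
\tag{2.9}
\]
which yields
\[
\begin{aligned}
a\rho^{2\gamma}\varphi ={}& \mu\rho^\gamma\Bigl(u_x + m\frac{u}{x}\Bigr)\varphi - \rho^{1+\gamma}u^2\varphi + \rho^\gamma\partial_t\int_x^\infty \varphi\rho u\, dy
- \rho^\gamma\int_x^\infty (\rho u^2 + a\rho^\gamma)\varphi_y\, dy\\
&+ \rho^\gamma\int_x^\infty \frac{m\rho u^2}{y}\varphi\, dy + \mu\rho^\gamma\int_x^\infty \Bigl(u_y + \frac{m u}{y}\Bigr)\varphi_y\, dy.
\end{aligned}
\tag{2.10}
\]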
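For orientation, the three integrations behind (2.9) can be written out as follows. This is a sketch under the assumption that all functions are smooth enough for the manipulations and that $\varphi$ has compact support, so the boundary terms at $y = \infty$ vanish.

```latex
% Sketch: multiply the momentum equation (second line of (2.5)) by varphi(y), integrate over (x, infinity).
% Assumption: smooth rho, u; varphi compactly supported; subscript epsilon omitted.
\begin{align*}
\int_x^\infty \varphi\,(\rho u)_t\,dy
  &= \partial_t\int_x^\infty \varphi\rho u\,dy,\\
\int_x^\infty \varphi\,(\rho u^2 + a\rho^\gamma)_y\,dy
  &= -\bigl[(\rho u^2 + a\rho^\gamma)\varphi\bigr](x) - \int_x^\infty (\rho u^2 + a\rho^\gamma)\varphi_y\,dy,\\
\mu\int_x^\infty \varphi\Bigl(u_y + \frac{m u}{y}\Bigr)_y dy
  &= -\mu\Bigl[\Bigl(u_x + \frac{m u}{x}\Bigr)\varphi\Bigr](x) - \mu\int_x^\infty \Bigl(u_y + \frac{m u}{y}\Bigr)\varphi_y\,dy .
\end{align*}
```

Adding the untouched term $\int_x^\infty (m\rho u^2/y)\varphi\,dy$ and moving the pointwise terms to the right-hand side gives (2.9); multiplying by $\rho^\gamma$ then gives (2.10).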
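Mollifying the first equation of (2.5), we find that
\[
\partial_t\rho^\delta + \partial_x(u\rho^\delta) + \frac{m\rho^\delta u}{x} = r^\delta,
\tag{2.11}
\]
where $\rho^\delta(t,x) = \rho(t,\cdot) * j_\delta$ and
\[
r^\delta(t,x) = (u\rho^\delta)_x - (\rho u)_x * j_\delta + \frac{m\rho^\delta u}{x} - m\frac{\rho u}{x} * j_\delta.
\]
By virtue of Lemma 2.3 of [14], $r^\delta \to 0$ in $L^1_{loc}(\mathbb{R}^+ \times \mathbb{R}^+)$ as $\delta \to 0$. Thus, multiplying (2.11) by $\gamma(\rho^\delta)^{\gamma-1}$, we have
\[
\partial_t(\rho^\delta)^\gamma + \partial_x\bigl[u(\rho^\delta)^\gamma\bigr] + \frac{m\gamma(\rho^\delta)^\gamma u}{x} = \gamma(\rho^\delta)^{\gamma-1}r^\delta + (1-\gamma)u_x(\rho^\delta)^\gamma.
\tag{2.12}
\]
Since $\rho \in L^\infty((0,T) \times (\varepsilon,\infty))$, we get by taking $\delta \to 0$ in (2.12),
\[
\partial_t\rho^\gamma + \partial_x(u\rho^\gamma) + \frac{m\gamma\rho^\gamma u}{x} = (1-\gamma)u_x\rho^\gamma
\tag{2.13}
\]
in $\mathcal{D}'((0,T) \times (\varepsilon,\infty))$. Then
\[
\begin{aligned}
\rho^\gamma\partial_t\int_x^\infty \varphi\rho u\, dy
&= -\partial_t\rho^\gamma\int_x^\infty \varphi\rho u\, dy + \partial_t\Bigl[\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr]\\
&= \partial_t\Bigl[\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr] + \Bigl[\partial_x(u\rho^\gamma) + (\gamma-1)\rho^\gamma u_x + \frac{m\gamma\rho^\gamma u}{x}\Bigr]\int_x^\infty \varphi\rho u\, dy\\
&= \partial_t\Bigl[\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr] + \partial_x\Bigl[u\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr]
+ \Bigl[(\gamma-1)\rho^\gamma u_x + \frac{m\gamma\rho^\gamma u}{x}\Bigr]\int_x^\infty \varphi\rho u\, dy + u^2\rho^{1+\gamma}\varphi.
\end{aligned}
\tag{2.14}
\]
Inserting (2.14) into (2.10), we find
\[
\begin{aligned}
a\rho^{2\gamma}\varphi ={}& \mu\rho^\gamma u_x\varphi + \mu m\frac{\rho^\gamma u\varphi}{x} + \partial_t\Bigl[\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr] + \partial_x\Bigl[u\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr]\\
&+ \Bigl[(\gamma-1)\rho^\gamma u_x + \frac{m\gamma\rho^\gamma u}{x}\Bigr]\int_x^\infty \varphi\rho u\, dy - \rho^\gamma\int_x^\infty (\rho u^2 + a\rho^\gamma)\varphi_y\, dy\\
&+ \rho^\gamma\int_x^\infty \frac{m\rho u^2}{y}\varphi\, dy + \mu\rho^\gamma\int_x^\infty u_y\varphi_y\, dy + \mu m\rho^\gamma\int_x^\infty \frac{u}{y}\varphi_y\, dy.
\end{aligned}
\tag{2.15}
\]
Now, if we multiply (2.15) by $\varphi^3$ and integrate over $(0,T) \times (\varepsilon,\infty)$, we get
\[
\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt = \frac{1}{a}\int_0^T\int_\varepsilon^\infty \{\text{R.H.S. of (2.15)}\}\,\varphi^3\, dx\,dt.
\tag{2.16}
\]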
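Next we estimate each term on the right-hand side of (2.16). The first term on the right-hand side of (2.16) can be bounded as follows using (2.7):
\[
\int_0^T\int_\varepsilon^\infty \rho^\gamma u_x\varphi^4\, dx\,dt \le \delta\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt + \frac{1}{\delta}\int_0^T\int_\varepsilon^\infty u_x^2\varphi^4\, dx\,dt
\le \frac{C}{\delta} + \delta\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt \quad (0 < \delta < 1).
\tag{2.17}
\]
A similar estimate can be obtained for the term $\int_0^T\int_\varepsilon^\infty \rho^\gamma\frac{u}{x}\varphi^4\, dx\,dt$ in the same manner. It is easy to see that by (2.7) and (2.8),
\[
\Bigl|\int_x^\infty \rho u\varphi\, dy\Bigr| \le \frac12\int_\varepsilon^\infty \rho(u^2 + 1)\varphi\, dy \le C,
\tag{2.18}
\]
whence by (2.1),
\[
\Bigl|\int_0^T\int_\varepsilon^\infty \partial_t\Bigl[\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr]\varphi^3\, dx\,dt\Bigr|
\le C + \Bigl|\int_\varepsilon^\infty \rho^\gamma\varphi^3(T,x)\int_x^\infty \rho u\varphi\, dy\,dx\Bigr|
\le C + C\int_\varepsilon^\infty \varphi^3\rho^\gamma(T,x)\, dx \le C.
\tag{2.19}
\]
Using the boundary condition $(2.6)_2$, (2.18) and (2.7), we deduce that
\[
\begin{aligned}
\Bigl|\int_0^T\int_\varepsilon^\infty \varphi^3\,\partial_x\Bigl[u\rho^\gamma\int_x^\infty \varphi\rho u\, dy\Bigr] dx\,dt\Bigr|
&= 3\Bigl|\int_0^T\int_\varepsilon^\infty \varphi^2\varphi_x\rho^\gamma u\int_x^\infty \varphi\rho u\, dy\,dx\,dt\Bigr|\\
&\le C\int_0^T\int_\varepsilon^\infty \varphi^2|\varphi_x|\rho^\gamma|u|\, dx\,dt\\
&\le \delta\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt + \frac{C}{\delta}\int_0^T\int_\varepsilon^\infty u^2\varphi_x^2\, dx\,dt\\
&\le \frac{C}{\delta} + \delta\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt.
\end{aligned}
\tag{2.20}
\]
The terms $\int_0^T\int_\varepsilon^\infty u_x\rho^\gamma\varphi^3\int_x^\infty \varphi\rho u\, dy\,dx\,dt$ and $\int_0^T\int_\varepsilon^\infty \frac{\rho^\gamma u\varphi^3}{x}\int_x^\infty \varphi\rho u\, dy\,dx\,dt$ can be estimated in exactly the same manner.

Take $R$ such that $\operatorname{supp}\varphi \subset [0,R]$. We employ (2.7) and (2.8) to deduce that
\[
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\gamma\varphi^3\int_x^\infty (\rho u^2 + a\rho^\gamma)\varphi_y\, dy\,dx\,dt\Bigr|
\le C\int_0^T\int_\varepsilon^\infty \rho^\gamma\varphi^3\int_x^R (\rho u^2 + \rho^\gamma)y^m\, dy\,dx\,dt \le C.
\tag{2.21}
\]
Analogously to (2.21), we can estimate the terms $\int_0^T\int_\varepsilon^\infty \rho^\gamma\varphi^3\int_x^\infty \frac{m\rho u^2}{y}\varphi\, dy\,dx\,dt$, $\int_0^T\int_\varepsilon^\infty \rho^\gamma\varphi^3\int_x^\infty u_y\varphi_y\, dy\,dx\,dt$ and $\int_0^T\int_\varepsilon^\infty \rho^\gamma\varphi^3\int_x^\infty \frac{u}{y}\varphi_y\, dy\,dx\,dt$. Summing up (2.16), (2.17), (2.19)–(2.21) and the related estimates, we finally get
\[
\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt \le C + 5\delta\int_0^T\int_\varepsilon^\infty \rho^{2\gamma}\varphi^4\, dx\,dt.
\]
By taking $\delta = 1/10$ in the above inequality, we obtain the lemma.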
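The final absorption step can be spelled out in one line. The abbreviation $X$ below is introduced only for this sketch; $X$ is finite for each fixed $\varepsilon$ because $\rho_\varepsilon$ is pointwise bounded and $\varphi$ has compact support.

```latex
% Sketch: absorbing the right-hand side, with
% X := \int_0^T \int_\varepsilon^\infty \rho^{2\gamma} \varphi^4 \, dx \, dt  (hypothetical shorthand).
\begin{align*}
X \le C + 5\delta X,\quad \delta = \tfrac{1}{10}
\;\Longrightarrow\; X \le C + \tfrac12 X
\;\Longrightarrow\; X \le 2C .
\end{align*}
```

Since $C$ does not depend on $\varepsilon$, neither does the resulting bound.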
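Remark 2.1. By applying (2.12), we can make some seemingly formal arguments in (2.14) rigorous. For simplicity, we will omit it here.

In the following lemma we prove a higher space-time estimate (regularity) in $\mathbb{R}^n$ for the density around the origin.

Lemma 2.2. Let $\gamma > n/2$. Then, for any $T > 0$ and any $\psi \in C_0^\infty([0,\infty))$ with $\psi \ge 0$, $\psi(x) = O(x^m)$ as $x \to 0$, there is a positive constant $C$, independent of $\varepsilon$, such that
\[
\int_0^T\int_\varepsilon^\infty \rho^{\gamma+\theta}(t,x)\psi(x)\, dx\,dt \le C
\]
for any $0 < \theta < 2\gamma/n - 1$.

Proof. The proof is long, so we break it up into several steps. First, without loss of generality, we can assume further that $\psi(x) = x^m$ for $0 \le x \le 1$.

Step 1. A representation of $\int_0^T\int_\varepsilon^\infty \rho^{\gamma+\theta}\psi\, dx\,dt$. Let $\phi \in C_0^\infty([0,\infty))$ with $\phi = 1$ on $\operatorname{supp}\psi$. Multiplying $(2.5)_2$ by $\phi$ and then integrating the resulting equation over $(x,\infty)$, we obtain
\[
(\rho u^2 + a\rho^\gamma)\phi - \partial_t\int_x^\infty \rho u\phi\, dy - m\int_x^\infty \frac{\rho u^2}{y}\phi\, dy
= \mu\Bigl(u_x + m\frac{u}{x}\Bigr)\phi + \int_x^\infty \Bigl(\mu u_y + \mu m\frac{u}{y} - \rho u^2 - a\rho^\gamma\Bigr)\phi_y\, dy.
\]
Now, let $\theta$ be a positive number. Multiplying the above equation by $\rho^\theta\psi$, we find
\[
\begin{aligned}
a\rho^{\gamma+\theta}\psi ={}& \Bigl(\mu u_x + \mu m\frac{u}{x} - \rho u^2\Bigr)\rho^\theta\psi + \rho^\theta\psi\,\partial_t\int_x^\infty \rho u\phi\, dy\\
&+ m\rho^\theta\psi\int_x^\infty \frac{\rho u^2}{y}\phi\, dy + \rho^\theta\psi\int_x^\infty \Bigl(\mu u_y + \mu m\frac{u}{y} - \rho u^2 - a\rho^\gamma\Bigr)\phi_y\, dy.
\end{aligned}
\tag{2.22}
\]
When $\theta \ge 1$, we can easily show by the same arguments as in the proof of (2.13) that
\[
\partial_t\rho^\theta + \partial_x(u\rho^\theta) + \frac{m\theta\rho^\theta u}{x} = (1-\theta)u_x\rho^\theta.
\tag{2.23}
\]
For $\theta < 1$, set $\beta_\nu(\rho) = (\rho + \nu)^\theta$. Then, multiplying (2.11) with $\beta_\nu'(\rho^\delta)$, we arrive at
\[
\partial_t\beta_\nu(\rho^\delta) + \partial_x[u\beta_\nu(\rho^\delta)] + [\beta_\nu'(\rho^\delta)\rho^\delta - \beta_\nu(\rho^\delta)]u_x + \frac{m\beta_\nu'(\rho^\delta)\rho^\delta u}{x} = \beta_\nu'(\rho^\delta)r^\delta.
\]
Taking $\delta \to 0$ in the above equation, one gets
\[
\partial_t\beta_\nu(\rho) + \partial_x[u\beta_\nu(\rho)] + [\beta_\nu'(\rho)\rho - \beta_\nu(\rho)]u_x + \frac{m\beta_\nu'(\rho)\rho u}{x} = 0.
\tag{2.24}
\]
Note that
\[
(\rho+\nu)^\theta - \rho^\theta = \int_0^1 \frac{\theta\nu}{(\rho + \tau\nu)^{1-\theta}}\, d\tau \le \theta\nu^\theta\int_0^1 \frac{d\tau}{\tau^{1-\theta}} \le \nu^\theta.
\tag{2.25}
\]
On the other hand,
\[
\beta_\nu'(\rho)\rho - \theta\rho^\theta = \theta[(\rho+\nu)^{\theta-1}\rho - \rho^\theta]
= \theta[(\rho+\nu)^{\theta-1}\rho - (\rho+\nu)^\theta] + \theta[(\rho+\nu)^\theta - \rho^\theta]
\tag{2.26}
\]
and
\[
|(\rho+\nu)^{\theta-1}\rho - (\rho+\nu)^\theta| = \nu(\rho+\nu)^{\theta-1} \le \nu^\theta.
\tag{2.27}
\]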
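The bound (2.25) is elementary; the following sketch records the intermediate step, assuming only $\rho \ge 0$, $\nu > 0$ and $0 < \theta < 1$.

```latex
% Sketch: the estimate (2.25), valid for rho >= 0, nu > 0, 0 < theta < 1.
\begin{align*}
(\rho+\nu)^\theta - \rho^\theta
 &= \int_0^1 \frac{d}{d\tau}(\rho+\tau\nu)^\theta\,d\tau
  = \int_0^1 \frac{\theta\nu}{(\rho+\tau\nu)^{1-\theta}}\,d\tau \\
 &\le \int_0^1 \frac{\theta\nu}{(\tau\nu)^{1-\theta}}\,d\tau
  = \theta\nu^\theta\int_0^1 \tau^{\theta-1}\,d\tau
  = \nu^\theta .
\end{align*}
```

The inequality uses $\rho + \tau\nu \ge \tau\nu$ together with $1 - \theta > 0$; the last integral equals $1/\theta$.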
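Thus, letting $\nu \to 0$ in (2.24) and utilizing (2.25)–(2.27), we find that for $\theta < 1$, (2.23) remains valid. Therefore, (2.23) holds for any $\theta > 0$. With the help of (2.23) and (2.22), following a procedure similar to that used for (2.15), we obtain
\[
\begin{aligned}
a\rho^{\gamma+\theta}\psi ={}& \mu\Bigl(u_x + m\frac{u}{x}\Bigr)\rho^\theta\psi + \partial_t\Bigl[\rho^\theta\psi\int_x^\infty \rho u\phi\, dy\Bigr] + \partial_x\Bigl[u\rho^\theta\psi\int_x^\infty \rho u\phi\, dy\Bigr]\\
&- \Bigl[u\rho^\theta\psi_x + (1-\theta)\rho^\theta u_x\psi - m\theta\rho^\theta\psi\frac{u}{x}\Bigr]\int_x^\infty \rho u\phi\, dy\\
&+ m\rho^\theta\psi\int_x^\infty \frac{\rho u^2}{y}\phi\, dy + \rho^\theta\psi\int_x^\infty \Bigl(\mu u_y + \mu m\frac{u}{y} - \rho u^2 - a\rho^\gamma\Bigr)\phi_y\, dy\\
\equiv{}& I_1 + I_2 + I_3 + I_4 + I_5 + I_6.
\end{aligned}
\tag{2.28}
\]
In the sequel, we derive bounds for $\int_0^T\int_\varepsilon^\infty I_j\, dx\,dt$ ($1 \le j \le 6$) on the right-hand side of (2.28).

Step 2. Estimate of $\int_0^T\int_\varepsilon^\infty I_1\, dx\,dt$ and $\int_0^T\int_\varepsilon^\infty I_2\, dx\,dt$. It follows from (2.7) and Young's inequality that
\[
\int_0^T\int_\varepsilon^\infty \rho^\theta u_x\psi\, dx\,dt \le \delta\int_0^T\int_\varepsilon^\infty \rho^{2\theta}\psi\, dx\,dt + \frac{1}{\delta}\int_0^T\int_\varepsilon^\infty u_x^2\psi\, dx\,dt
\le \frac{C}{\delta} + \delta\int_0^T\int_\varepsilon^\infty \rho^{\gamma+\theta}\psi\, dx\,dt \quad\text{for } \theta \le \gamma.
\tag{2.29}
\]
The term $\int_0^T\int_\varepsilon^\infty \rho^\theta u x^{-1}\psi\, dx\,dt$ can be bounded in the same manner. For $x \ge 1$ we easily obtain
\[
\Bigl|\int_x^\infty \rho u\phi\, dy\Bigr| \le \frac12\int_1^\infty (\rho + \rho u^2)\phi\, x^m dx \le C,
\tag{2.30}
\]
where we have used (2.7) and (2.8). Hence,
\[
\Bigl|\int_\varepsilon^\infty \rho^\theta\psi\int_x^\infty \rho u\phi\, dy\,dx\Bigr| \le \int_\varepsilon^1 \rho^\theta\psi\int_x^1 \rho|u|\, dy\,dx + C\int_\varepsilon^\infty \rho^\theta\psi\, dx
\tag{2.31}
\]
and
\[
\int_\varepsilon^1 \rho^\theta\psi\int_x^1 \rho|u|\, dy\,dx = \int_\varepsilon^1 \rho^\theta x^m\int_x^1 \rho|u|\, dy\,dx \le \int_\varepsilon^1 \rho^\theta x^{\alpha_1}\int_x^1 \rho|u|y^{m-\alpha_1}\, dy\,dx,
\tag{2.32}
\]
where $\alpha_1 \le m$ is a non-negative constant which will be determined later on. From Hölder's inequality, (2.7) and (2.8) we get
\[
\int_x^1 \rho|u|y^{m-\alpha_1}\, dy \le \|\rho u^2\|^{1/2}_{L^1(\varepsilon,1)}\|\rho\|^{1/2}_{L^\gamma(\varepsilon,1)}\|y^{-\alpha_1}\|_{L^q(\varepsilon,1)} \le C,
\tag{2.33}
\]
where $q$, $\alpha_1$ satisfy $\frac12 + \frac{1}{2\gamma} + \frac1q = 1$, $q \ge 2$ and $q\alpha_1 - m < 1$, that is, recalling $m = n - 1$,
\[
\alpha_1 < \frac{n}{2}\Bigl(1 - \frac{1}{\gamma}\Bigr).
\tag{2.34}
\]
Thus, substituting (2.33) into (2.32), we find that
\[
\int_\varepsilon^1 \rho^\theta\psi\int_x^1 \rho|u|\, dy\,dx \le C\int_\varepsilon^1 \rho^\theta x^{\alpha_1}\, dx \le C\|\rho^\theta\|_{L^p(\varepsilon,1)}\|x^{\alpha_1 - m}\|_{L^q(\varepsilon,1)} \le C,
\tag{2.35}
\]
where $q$, $p$ satisfy $1/p + 1/q = 1$, $(m - \alpha_1)q < m + 1$ and $\theta p \le \gamma$, that is,
\[
\theta < \gamma - \frac{\gamma}{q} < \frac{\gamma}{n} + \frac{\gamma}{n}\alpha_1 < \Bigl(\frac1n + \frac12\Bigr)\gamma - \frac12.
\]
On the other hand, when $\theta < (\frac1n + \frac12)\gamma - \frac12$, we easily deduce by (2.8) that
\[
\int_\varepsilon^\infty \rho^\theta\psi\, dx \le C\int_\varepsilon^\infty (1 + \rho^\gamma)\psi\, dx \le C,
\]
which together with (2.31) and (2.35) implies that for $0 < \theta < (\frac1n + \frac12)\gamma - \frac12$,
\[
\Bigl|\int_\varepsilon^\infty \rho^\theta\psi\int_x^\infty \rho u\phi\, dy\,dx\Bigr| \le C.
\tag{2.36}
\]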
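Step 3. Estimate of $\int_0^T\int_\varepsilon^\infty I_4\, dx\,dt$ for $1 < \gamma \le \frac{2n}{4-n}$. By (2.30), we see that
\[
\begin{aligned}
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\theta u\psi_x\int_x^\infty \rho u\phi\, dy\,dx\,dt\Bigr|
&= \Bigl|\int_0^T\int_\varepsilon^\infty \rho u\phi\int_\varepsilon^y \rho^\theta u\psi_x\, dx\,dy\,dt\Bigr|\\
&\le \int_0^T\int_\varepsilon^1 \rho|u|\int_\varepsilon^y \rho^\theta|u|x^{m-1}\, dx\,dy\,dt + C\int_0^T\int_\varepsilon^\infty \rho^\theta|u|\,|\psi_x|\, dx\,dt.
\end{aligned}
\tag{2.37}
\]
For $y \in [\varepsilon,1]$, one has by applying Hölder's inequality and (2.8) that
\[
\int_\varepsilon^y \rho^\theta|u|x^{m-1}\, dx \le y^{\alpha_2}\int_\varepsilon^y \rho^\theta\frac{|u|}{x}x^{-\alpha_2+m}\, dx
\le y^{\alpha_2}\|\rho\|^\theta_{L^\gamma(\varepsilon,1)}\Bigl\|\frac{u}{x}\Bigr\|_{L^2(\varepsilon,1)}\|x^{-\alpha_2}\|_{L^q(\varepsilon,1)}
\le Cy^{\alpha_2}\Bigl\|\frac{u}{x}\Bigr\|_{L^2(\varepsilon,1)},
\tag{2.38}
\]
where $\theta/\gamma + 1/2 + 1/q = 1$, $\alpha_2 \ge 0$ and $q\alpha_2 - m < 1$, that is,
\[
\alpha_2 < \frac{n}{q} = n\Bigl(\frac12 - \frac{\theta}{\gamma}\Bigr).
\tag{2.39}
\]
Obviously, by (2.7),
\[
x^{m-1}u^2(t,x) = \int_\varepsilon^x \partial_y[y^{m-1}u^2(t,y)]\, dy \le C\int_\varepsilon^\infty \Bigl(u_y^2 + \frac{u^2}{y^2}\Bigr)y^m\, dy \equiv A^2(t),
\]
with $A(t) \in L^2(0,T)$, which results in
\[
|u(t,x)| \le CA(t)\,x^{-(m-1)/2},\quad x \ge \varepsilon,\ t \in [0,T].
\tag{2.40}
\]
Therefore by (2.38), (2.40), (2.7) and (2.8),
\[
\begin{aligned}
\int_0^T\int_\varepsilon^1 \rho|u|\int_\varepsilon^y \rho^\theta|u|x^{m-1}\, dx\,dy\,dt
&\le C\int_0^T A(t)\Bigl\|\frac{u}{x}\Bigr\|_{L^2(\varepsilon,1)}\int_\varepsilon^1 \rho\, y^{\alpha_2 - \frac{m-1}{2}}\, dy\,dt\\
&\le C\sup_{[0,T]}\int_\varepsilon^1 \rho(\cdot,y)\, y^{\alpha_2 - \frac{m-1}{2}}\, dy\\
&\le C\sup_{[0,T]}\|\rho(\cdot)\|_{L^\gamma(\varepsilon,1)}\Bigl(\int_\varepsilon^1 y^{-\frac{(3m - 2\alpha_2 - 1)\gamma}{2(\gamma-1)} + m}\, dy\Bigr)^{\frac{\gamma-1}{\gamma}} \le C,
\end{aligned}
\tag{2.41}
\]
where $\alpha_2$ satisfies $-\frac{(3m - 2\alpha_2 - 1)\gamma}{2(\gamma-1)} + m > -1$, i.e.
\[
\alpha_2 > \frac{n}{\gamma} - \frac{4-n}{2},
\]
which together with (2.39) shows that $\theta$ should satisfy
\[
\theta < \frac{2}{n}\gamma - 1.
\tag{2.42}
\]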
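The way the two constraints on $\alpha_2$ produce (2.42) can be made explicit. The sketch below only rearranges the upper bound (2.39) and the lower bound obtained from (2.41); no new assumptions are involved.

```latex
% Sketch: an admissible alpha_2 exists exactly when the lower bound lies below the upper bound.
\begin{align*}
\frac{n}{\gamma} - \frac{4-n}{2} < n\Bigl(\frac12 - \frac{\theta}{\gamma}\Bigr)
\;\Longleftrightarrow\;
\frac{n\theta}{\gamma} < \frac{n}{2} + \frac{4-n}{2} - \frac{n}{\gamma} = 2 - \frac{n}{\gamma}
\;\Longleftrightarrow\;
\theta < \frac{2\gamma}{n} - 1 .
\end{align*}
```

For $\gamma \le 2n/(4-n)$ the lower bound $n/\gamma - (4-n)/2$ is non-negative, so the requirement $\alpha_2 \ge 0$ imposes no further restriction in this step.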
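Recalling Young's inequality and the fact that (2.42) gives $2\theta \le \gamma$, we deduce for $\theta < 2\gamma/n - 1$ that
\[
\int_0^T\int_\varepsilon^\infty \rho^\theta|u|\,|\psi_x|\, dx\,dt \le C\int_0^T\int_\varepsilon^\infty \Bigl(\frac{u^2}{x^2} + \rho^{2\theta}\Bigr)x|\psi_x|\, dx\,dt
\le C\int_0^T\int_\varepsilon^\infty \Bigl(\frac{u^2}{x^2} + 1 + \rho^\gamma\Bigr)x|\psi_x|\, dx\,dt \le C.
\]
Inserting the above inequality and (2.41) into (2.37), we obtain
\[
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\theta u\psi_x\int_x^\infty \rho u\phi\, dy\,dx\,dt\Bigr| \le C.
\tag{2.43}
\]
The terms $\int_0^T\int_\varepsilon^\infty \rho^\theta u_x\psi\int_x^\infty \rho u\phi\, dy\,dx\,dt$ and $\int_0^T\int_\varepsilon^\infty \rho^\theta u x^{-1}\psi\int_x^\infty \rho u\phi\, dy\,dx\,dt$ in $I_4$ can be estimated following the same arguments as used in the derivation of (2.43).

Step 4. Estimate of $\int_0^T\int_\varepsilon^\infty I_4\, dx\,dt$ for $\gamma > \frac{2n}{4-n}$. First we notice that the estimate (2.37) still holds. To derive bounds for the first term on the right-hand side of (2.37), we apply Hölder's inequality to arrive at
\[
\begin{aligned}
\int_\varepsilon^y \rho^\theta\frac{|u|}{x}x^m\, dx &\le y^{\alpha_4}\int_\varepsilon^y \rho^\theta\frac{|u|}{x}x^{-\alpha_4}x^m\, dx\\
&\le Cy^{\alpha_4}\|\rho\|^\theta_{L^{\gamma+\theta}(\varepsilon,1)}\Bigl\|\frac{u}{x}\Bigr\|_{L^2(\varepsilon,1)}\|x^{-\alpha_4}\|_{L^q(\varepsilon,1)}\\
&\le Cy^{\alpha_4}\|\rho\|^\theta_{L^{\gamma+\theta}(\varepsilon,1)}\Bigl\|\frac{u}{x}\Bigr\|_{L^2(\varepsilon,1)},\quad y \in [0,1],
\end{aligned}
\tag{2.44}
\]
where $\frac{\theta}{\gamma+\theta} + \frac12 + \frac1q = 1$ (i.e. $q = \frac{2(\gamma+\theta)}{\gamma-\theta} > 2$) and $q\alpha_4 < m + 1$, which means that $\alpha_4$ should satisfy
\[
\alpha_4 < \frac{n}{2} - \frac{n\theta}{\gamma+\theta}.
\tag{2.45}
\]
Similarly (also by (2.7) and (2.8)),
\[
\begin{aligned}
\int_\varepsilon^1 \rho|u|y^{\alpha_4}\, dy &\le CA^{\frac2q}(t)\int_\varepsilon^1 \rho^{\frac12 - \frac1q}|u|^{1 - \frac2q}\rho^{\frac12 + \frac1q}\,y^{\alpha_4 - \frac{m-1}{q} - m}\,y^m\, dy\\
&\le CA^{\frac2q}(t)\|\rho^{\frac12}u\|^{1 - \frac2q}_{L^2(\varepsilon,1)}\|\rho\|^{\frac12 + \frac1q}_{L^\gamma(\varepsilon,1)}\|y^{\alpha_4 - \frac{m-1}{q} - m}\|_{L^p(\varepsilon,1)}\\
&\le CA^{\frac2q}(t),
\end{aligned}
\tag{2.46}
\]
where $\frac12(1 - \frac2q) + \frac1\gamma(\frac12 + \frac1q) + \frac1p = 1$ and $p(m + \frac{m-1}{q} - \alpha_4) < m + 1$. Therefore,
\[
n - 1 + \frac{n-2}{q} - n\Bigl(1 - \frac1\gamma\Bigr)\Bigl(\frac12 + \frac1q\Bigr) < \alpha_4,
\]
which combined with (2.45) and $q = \frac{2(\gamma+\theta)}{\gamma-\theta}$ implies that $\theta < 2\gamma/n - 1$. Combining (2.44) with (2.46), utilizing Young's inequality, (2.7) and recalling $\psi(x) = x^m$ for $x \in [0,1]$, we obtain under (2.42) that
\[
\int_0^T\int_\varepsilon^1 \rho|u|\int_\varepsilon^y \rho^\theta|u|x^{m-1}\, dx\,dy\,dt \le \frac{\delta}{2}\int_0^T\int_\varepsilon^1 \rho^{\gamma+\theta}\psi\, dx\,dt + C(\delta).
\tag{2.47}
\]
From (2.40) it follows that
\[
\begin{aligned}
\int_0^T\int_\varepsilon^\infty \rho^\theta|u|\,|\psi_x|\, dx\,dt &\le m\int_0^T\int_\varepsilon^1 \frac{|u|}{x}\rho^\theta\psi\, dx\,dt + C\int_0^T A(t)\int_1^\infty \rho^\theta|\psi_x|\, dx\,dt\\
&\le C\int_0^T\int_\varepsilon^1 \Bigl(\rho^{2\theta} + \frac{u^2}{x^2}\Bigr)\psi\, dx\,dt + C\int_0^T A(t)\int_1^\infty (1 + \rho^\gamma)|\psi_x|\, dx\,dt\\
&\le \frac{\delta}{2}\int_0^T\int_\varepsilon^1 \rho^{\gamma+\theta}\psi\, dx\,dt + C(\delta).
\end{aligned}
\]
Inserting the above estimate and (2.47) into (2.37), we conclude
\[
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\theta u\psi_x\int_x^\infty \rho u\phi\, dy\,dx\,dt\Bigr| \le \delta\int_0^T\int_\varepsilon^1 \rho^{\gamma+\theta}\psi\, dx\,dt + C(\delta).
\tag{2.48}
\]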
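We can use the same arguments as used in the derivation of (2.48) to bound the terms $\int_0^T\int_\varepsilon^\infty \rho^\theta u_x\psi\int_x^\infty \rho u\phi\, dy\,dx\,dt$ and $\int_0^T\int_\varepsilon^\infty \rho^\theta u x^{-1}\psi\int_x^\infty \rho u\phi\, dy\,dx\,dt$.

Step 5. Estimate of $\int_0^T\int_\varepsilon^\infty I_5\, dx\,dt$ and $\int_0^T\int_\varepsilon^\infty I_6\, dx\,dt$. We make use of (2.40) to arrive at
\[
\begin{aligned}
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\theta\psi\int_x^\infty \frac{\rho u^2}{y}\phi\, dy\,dx\,dt\Bigr|
&= \Bigl|\int_0^T\int_\varepsilon^\infty \frac{\rho u^2}{y}\phi\int_\varepsilon^y \rho^\theta\psi\, dx\,dy\,dt\Bigr|\\
&\le C\int_0^T A^2(t)\int_\varepsilon^\infty \frac{\rho\phi}{y^m}\int_\varepsilon^y \rho^\theta\psi\, dx\,dy\,dt\\
&\le C\sup_{t\in[0,T]}\int_\varepsilon^1 \frac{\rho}{y^m}\int_\varepsilon^y \rho^\theta x^m\, dx\,dy + C\sup_{t\in[0,T]}\int_1^\infty \rho\phi\int_\varepsilon^y \rho^\theta\psi\, dx\,dy.
\end{aligned}
\tag{2.49}
\]
Here the right-hand side of (2.49) can be bounded as follows, using (2.8) and (2.42):
\[
\int_1^\infty \rho\phi\int_\varepsilon^y \rho^\theta\psi\, dx\,dy \le C\int_1^\infty \rho\phi\, dy\int_\varepsilon^\infty (1 + \rho^\gamma)\psi\, dx \le C,
\tag{2.50}
\]
and
\[
\begin{aligned}
\int_\varepsilon^1 \frac{\rho}{y^m}\int_\varepsilon^y \rho^\theta x^m\, dx\,dy &\le \int_\varepsilon^1 \rho\, y^{\alpha_3 - m}\, dy\int_\varepsilon^1 \rho^\theta x^{-\alpha_3 + m}\, dx\\
&\le C\|\rho\|^{\theta+1}_{L^\gamma(\varepsilon,1)}\|y^{\alpha_3 - 2m}\|_{L^{\frac{\gamma}{\gamma-1}}(\varepsilon,1)}\|x^{-\alpha_3}\|_{L^{\frac{\gamma}{\gamma-\theta}}(\varepsilon,1)} \le C,
\end{aligned}
\tag{2.51}
\]
where $\alpha_3$ satisfies $\frac{(\alpha_3 - 2m)\gamma}{\gamma-1} + m > -1$ and $-\frac{\alpha_3\gamma}{\gamma-\theta} + m > -1$, that is,
\[
n - 2 + \frac{n}{\gamma} < \alpha_3 < n - \frac{n\theta}{\gamma},
\]
which gives (2.42). Therefore, under (2.42) we conclude by (2.49)–(2.51) that
\[
\Bigl|\int_0^T\int_\varepsilon^\infty \rho^\theta\psi\int_x^\infty \frac{\rho u^2}{y}\phi\, dy\,dx\,dt\Bigr| \le C.
\tag{2.52}
\]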
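As in Step 3, the admissible window for the auxiliary exponent is non-empty exactly under (2.42). The sketch below only rearranges the two conditions on $\alpha_3$ stated after (2.51).

```latex
% Sketch: the window n - 2 + n/gamma < alpha_3 < n - n*theta/gamma is non-empty iff (2.42) holds.
\begin{align*}
n - 2 + \frac{n}{\gamma} < n - \frac{n\theta}{\gamma}
\;\Longleftrightarrow\;
\frac{n\theta}{\gamma} < 2 - \frac{n}{\gamma}
\;\Longleftrightarrow\;
\theta < \frac{2\gamma}{n} - 1 .
\end{align*}
```

In particular a positive $\theta$ of this kind exists only if $\gamma > n/2$, which is the hypothesis of Lemma 2.2.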
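Finally, we derive bounds for the integral of $I_6$. Using (2.7) and (2.8), taking into account $\phi_x(x) = 0$ for $x \in [0,1]$, we easily find that
\[
\begin{aligned}
&\int_0^T\int_\varepsilon^\infty \rho^\theta\psi\int_x^\infty \Bigl(\rho u^2 + \rho^\gamma + |u_y| + \frac{|u|}{y}\Bigr)|\phi_y|\, dy\,dx\,dt\\
&\quad\le C\int_0^T\int_1^\infty \Bigl(1 + \rho u^2 + \rho^\gamma + u_y^2 + \frac{u^2}{y^2}\Bigr)|\phi_y|\, dy\int_\varepsilon^\infty (1 + \rho^\gamma)\psi\, dx\,dt \le C.
\end{aligned}
\tag{2.53}
\]
Integrating (2.28) over $(\varepsilon,\infty) \times (0,T)$, summing up (2.28), (2.29), and (2.36), (2.43), (2.48), (2.52), (2.53) and the related estimates for the other terms of $I_j$ ($1 \le j \le 6$), we obtain the lemma by choosing $\delta$ appropriately small.

3. Proof of Theorem 1.1

In this section we extract a limiting solution $(\rho, u)$ from the approximate solution sequence $(\rho_\varepsilon, u_\varepsilon)$ of (2.5), (2.6), and thus obtain a global weak solution of (1.1), (1.2). First we extend both $u_\varepsilon(t,x)$ and $\rho_\varepsilon(t,x)$ to be zero for $0 \le x \le \varepsilon$. For simplicity, we still denote by $(u_\varepsilon(t,x), \rho_\varepsilon(t,x))$ this extension. In view of (2.7), (2.8) and Lemma 2.1, we can extract a subsequence of $(\rho_\varepsilon, u_\varepsilon)$, still denoted by $(\rho_\varepsilon, u_\varepsilon)$, such that
\[
\begin{aligned}
&\rho_\varepsilon \rightharpoonup \rho \ \text{weak-}{*} \text{ in } L^\infty([0,T], L^\gamma_{loc}(\mathbb{R}^+)) \text{ and weakly in } L^{2\gamma}([0,T], L^{2\gamma}_{loc}(\mathbb{R}^+)),\\
&u_\varepsilon \rightharpoonup u \ \text{weakly in } L^2([0,T], H^1_{loc}(\mathbb{R}^+)).
\end{aligned}
\tag{3.1}
\]
Moreover, from the lower semicontinuity, (2.4), (2.7) and (2.8), we get
\[
\rho \in L^\infty([0,T], L^\gamma(\mathbb{R}^+)) \cap L^{2\gamma}([0,T], L^{2\gamma}_{loc}(\mathbb{R}^+)),\qquad u_x,\ u/x \in L^2([0,T], L^2(\mathbb{R}^+)).
\tag{3.2}
\]
In the sequel, we show that $(\rho, u)$ obtained in (3.1) is indeed a weak solution of (1.1), (1.2). By virtue of (2.7), (2.8), Lemma 2.1 and Hölder's inequality, $\rho_\varepsilon^\theta\partial_x u_\varepsilon \in L^{\frac{2\gamma}{\theta+\gamma}}([0,T], L^{\frac{2\gamma}{\theta+\gamma}}_{loc}(\mathbb{R}^+))$. So for any $0 < \theta < \gamma$ we can extract a subsequence of $\rho_\varepsilon^\theta\partial_x u_\varepsilon$ such that it is weakly convergent in $L^{\frac{2\gamma}{\theta+\gamma}}([0,T], L^{\frac{2\gamma}{\theta+\gamma}}_{loc}(\mathbb{R}^+))$. Similarly, $\rho_\varepsilon^{\gamma+\theta}$, $\rho_\varepsilon^\gamma$, $\rho_\varepsilon^\theta$ are weakly convergent in $L^{\frac{2\gamma}{\theta+\gamma}}([0,T], L^{\frac{2\gamma}{\theta+\gamma}}_{loc}(\mathbb{R}^+))$, in $L^2([0,T], L^2_{loc}(\mathbb{R}^+))$ and in $L^{\frac{2\gamma}{\theta}}([0,T], L^{\frac{2\gamma}{\theta}}_{loc}(\mathbb{R}^+))$, respectively. For the sake of convenience we denote by $\overline{f(\rho)}$ the weak limit of $f(\rho_\varepsilon)$ (in the sense of distributions) as $\varepsilon \to 0$. Now, we prove

Lemma 3.1. For any $0 < \theta < \gamma$, we have
\[
a\,\overline{\rho^{\gamma+\theta}} - \mu Q = a\,\overline{\rho^\gamma}\;\overline{\rho^\theta} - \mu\,\overline{\rho^\theta}\,u_x,
\]
where $Q$ is the weak limit of $\rho_\varepsilon^\theta\partial_x u_\varepsilon$.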
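Proof. Let $\phi \in C_0^\infty(\mathbb{R}^+)$. Multiplying $(2.5)_2$ by $\phi$ and then integrating the resulting equation over $(x,\infty)$, using (2.23) and following the same procedure as in the proof of (2.15) (cf. (2.28)), we find that
\[
\begin{aligned}
\Bigl(a\rho_\varepsilon^\gamma - \mu\partial_x u_\varepsilon - \mu m\frac{u_\varepsilon}{x}\Bigr)\rho_\varepsilon^\theta\phi ={}& \partial_t\Bigl[\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy\Bigr] + \partial_x\Bigl[u_\varepsilon\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy\Bigr]\\
&+ \Bigl[m\theta\rho_\varepsilon^\theta\frac{u_\varepsilon}{x} - (1-\theta)\rho_\varepsilon^\theta\partial_x u_\varepsilon\Bigr]\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy - \rho_\varepsilon^\theta\int_x^\infty (\rho_\varepsilon u_\varepsilon^2 + a\rho_\varepsilon^\gamma)\phi_y\, dy\\
&+ m\rho_\varepsilon^\theta\int_x^\infty \frac{\rho_\varepsilon u_\varepsilon^2}{y}\phi\, dy + \mu\rho_\varepsilon^\theta\int_x^\infty \Bigl(\partial_y u_\varepsilon + m\frac{u_\varepsilon}{y}\Bigr)\phi_y\, dy.
\end{aligned}
\tag{3.3}
\]
Let $r_1$, $r_2$ be arbitrary positive numbers with $r_1 < r_2$ and denote by $I$ the open interval $(r_1, r_2)$. By (2.7), (2.8) and Lemma 2.1, $u_\varepsilon \in L^2([0,T], H^1(I))$ and $\rho_\varepsilon \in L^\infty([0,T], L^\gamma(I)) \cap L^2([0,T], L^2(I))$, while by $(2.5)_1$ one sees that
\[
\partial_t\rho_\varepsilon \in L^\infty([0,T], W^{-1,\frac{2\gamma}{\gamma+1}}(I)) + L^\infty([0,T], L^{\frac{2\gamma}{\gamma+1}}(I)).
\tag{3.4}
\]
Hence, applying Lemma 5.1 of [15], we obtain
\[
\rho_\varepsilon u_\varepsilon \rightharpoonup \rho u \quad\text{in } \mathcal{D}'((0,T) \times I),
\tag{3.5}
\]
and due to $\rho_\varepsilon u_\varepsilon \in L^2([0,T], L^\gamma(I))$, (3.5) in fact holds weakly in $L^2([0,T], L^\gamma(I))$. Moreover, $\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \in L^2([0,T], W^{1,\gamma}(I))$. From (2.9) we get
\[
\partial_t\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \in L^\infty((0,T) \times I) + L^2((0,T) \times I) + L^\infty([0,T], L^1(I)).
\]
Thus, by the classical Lions–Aubin Lemma, we obtain
\[
\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \to \int_x^\infty \rho u\phi\, dy \quad\text{strongly in } L^2([0,T], L^p(I)),\quad p < \infty \text{ if } \gamma = 1, \text{ and } p = \infty \text{ if } \gamma > 1.
\]
Consequently,
\[
\begin{aligned}
&\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \rightharpoonup \overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy \quad\text{in } \mathcal{D}'((0,T) \times I),\\
&\rho_\varepsilon^\theta\partial_x u_\varepsilon\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \rightharpoonup Q\int_x^\infty \rho u\phi\, dy \quad\text{in } \mathcal{D}'((0,T) \times I),
\end{aligned}
\tag{3.6}
\]
and due to $\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \in L^2((0,T) \times I)$, $(3.6)_1$ in fact holds weakly in $L^2((0,T) \times I)$. Analogously to (3.5), we can prove that
\[
\rho_\varepsilon u_\varepsilon^2 = \rho_\varepsilon u_\varepsilon \cdot u_\varepsilon \rightharpoonup \rho u^2 \quad\text{in } \mathcal{D}'((0,T) \times I),
\tag{3.7}
\]
and (3.7) in fact holds weakly in $L^{\frac{4\gamma}{2\gamma+1}}([0,T], L^{\frac{4\gamma}{2\gamma+1}}(I))$ because of $\rho_\varepsilon u_\varepsilon^2 \in L^{\frac{4\gamma}{2\gamma+1}}([0,T], L^{\frac{4\gamma}{2\gamma+1}}(I))$, which can be shown as follows using Lemma 2.1 and (2.40):
\[
\int_0^T\int_I \rho_\varepsilon^{\frac{4\gamma}{2\gamma+1}}u_\varepsilon^{\frac{8\gamma}{2\gamma+1}}\, dx\,dt \le C\int_0^T\int_I (\rho_\varepsilon^{2\gamma} + A^2(t)\rho_\varepsilon u_\varepsilon^2)\, dx\,dt \le C.
\tag{3.8}
\]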
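By virtue of (3.7) and Lebesgue's dominated convergence theorem we easily see that $\int_x^\infty \frac{\rho_\varepsilon u_\varepsilon^2}{y}\phi\, dy \rightharpoonup \int_x^\infty \frac{\rho u^2}{y}\phi\, dy$ weakly or weak-$*$ in $L^p((0,T) \times I)$ for any $p \in (1,\infty]$, while by (2.23), (2.7), (2.8), Lemma 2.1 and (2.40), we have $\partial_t\rho_\varepsilon^\theta \in L^1((0,T) \times I) + L^2([0,T], W^{-1,2\gamma/\theta}(I))$. Thus, we get in the same manner as in the proof of (3.5) that
\[
\begin{aligned}
&\rho_\varepsilon^\theta\int_x^\infty \frac{\rho_\varepsilon u_\varepsilon^2}{y}\phi\, dy,\quad \rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon^2\phi_y\, dy,\quad \rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon^\gamma\phi_y\, dy,\quad \rho_\varepsilon^\theta\int_x^\infty \partial_y u_\varepsilon\phi_y\, dy,\quad \rho_\varepsilon^\theta\int_x^\infty \frac{u_\varepsilon}{y}\phi_y\, dy\\
&\rightharpoonup\ \overline{\rho^\theta}\int_x^\infty \frac{\rho u^2}{y}\phi\, dy,\quad \overline{\rho^\theta}\int_x^\infty \rho u^2\phi_y\, dy,\quad \overline{\rho^\theta}\int_x^\infty \overline{\rho^\gamma}\phi_y\, dy,\quad \overline{\rho^\theta}\int_x^\infty u_y\phi_y\, dy,\quad \overline{\rho^\theta}\int_x^\infty \frac{u}{y}\phi_y\, dy,
\end{aligned}
\tag{3.9}
\]
respectively, in $\mathcal{D}'((0,T) \times I)$.

From (3.3), Lemma 2.1 and (2.40) we get $\partial_t[\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy] \in L^1((0,T) \times I) + L^2((0,T), W^{-1,2\gamma/\theta}(I))$. Thus, applying $(3.6)_1$ and Lemma 5.1 in [15], we find that
\[
u_\varepsilon\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \rightharpoonup u\,\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy \quad\text{in } \mathcal{D}'((0,T) \times I).
\tag{3.10}
\]
In the same way we obtain
\[
\frac{u_\varepsilon}{x}\rho_\varepsilon^\theta\int_x^\infty \rho_\varepsilon u_\varepsilon\phi\, dy \rightharpoonup \frac{u}{x}\,\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy \quad\text{in } \mathcal{D}'((0,T) \times I).
\tag{3.11}
\]
Letting $\varepsilon \to 0$ in (3.3), and employing (3.6) and (3.9)–(3.11), we arrive at
\[
\begin{aligned}
\Bigl(a\,\overline{\rho^{\gamma+\theta}} - \mu Q - \mu m\,\overline{\rho^\theta}\frac{u}{x}\Bigr)\phi ={}& \Bigl[\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy\Bigr]_t + \Bigl[u\,\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy\Bigr]_x\\
&+ \Bigl[m\theta\,\overline{\rho^\theta}\frac{u}{x} - (1-\theta)Q\Bigr]\int_x^\infty \rho u\phi\, dy - \overline{\rho^\theta}\int_x^\infty (\rho u^2 + a\,\overline{\rho^\gamma})\phi_y\, dy\\
&+ m\,\overline{\rho^\theta}\int_x^\infty \frac{\rho u^2}{y}\phi\, dy + \mu\,\overline{\rho^\theta}\int_x^\infty \Bigl(u_y + m\frac{u}{y}\Bigr)\phi_y\, dy
\end{aligned}
\tag{3.12}
\]
in the sense of distributions. On the other hand, with the help of (3.5), (3.7) and (3.9), we have by taking $\varepsilon \to 0$ in (2.5) and (2.23) that
\[
\begin{aligned}
&\partial_t\rho + (\rho u)_x + m\frac{\rho u}{x} = 0,\\
&(\rho u)_t + \bigl(\rho u^2 + a\,\overline{\rho^\gamma}\bigr)_x + \frac{m\rho u^2}{x} = \mu u_{xx} + \mu m\Bigl(\frac{u}{x}\Bigr)_x,
\end{aligned}
\tag{3.13}
\]
\[
\partial_t\overline{\rho^\theta} + \partial_x(u\,\overline{\rho^\theta}) + \frac{m\theta\,\overline{\rho^\theta}u}{x} = (1-\theta)Q
\tag{3.14}
\]
in $\mathcal{D}'((0,T) \times \mathbb{R}^+)$. If we multiply Eq. $(3.13)_2$ by $\phi$, then integrate over $(x,\infty)$ and use (3.14), we obtain by arguments similar to those used for (3.3) (also cf. the derivation of (2.15)) that
\[
\begin{aligned}
\Bigl(a\,\overline{\rho^\gamma}\;\overline{\rho^\theta} - \mu\,\overline{\rho^\theta}u_x - \mu m\,\overline{\rho^\theta}\frac{u}{x}\Bigr)\phi ={}& \Bigl[\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy\Bigr]_t + \Bigl[u\,\overline{\rho^\theta}\int_x^\infty \rho u\phi\, dy\Bigr]_x\\
&+ \Bigl[m\theta\,\overline{\rho^\theta}\frac{u}{x} - (1-\theta)Q\Bigr]\int_x^\infty \rho u\phi\, dy - \overline{\rho^\theta}\int_x^\infty (\rho u^2 + a\,\overline{\rho^\gamma})\phi_y\, dy\\
&+ m\,\overline{\rho^\theta}\int_x^\infty \frac{\rho u^2}{y}\phi\, dy + \mu\,\overline{\rho^\theta}\int_x^\infty \Bigl(u_y + m\frac{u}{y}\Bigr)\phi_y\, dy.
\end{aligned}
\tag{3.15}
\]
Comparing (3.12) with (3.15), we infer that $(a\,\overline{\rho^\theta}\;\overline{\rho^\gamma} - \mu\,\overline{\rho^\theta}u_x)\phi = (a\,\overline{\rho^{\gamma+\theta}} - \mu Q)\phi$, which proves the lemma.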
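Remark 3.1. As noted by Hoff in [3–5], the viscous flux $F_\varepsilon := a\rho_\varepsilon^\gamma - \mu\partial_x u_\varepsilon - \mu m\,u_\varepsilon/x$ is a good term, which embodies smoothing properties of parabolic parts in the system (1.1). This can be seen by the limit $F_\varepsilon\rho_\varepsilon^\theta \rightharpoonup F\,\overline{\rho^\theta}$, which follows easily from Lemma 3.1 and the proof of Lemma 3.1.

The following lemma shows that although we do not know whether $\rho$ is $L^2$-integrable for $\gamma < \gamma_n$, we can prove the $L^{2/\theta}$-integrability of $\rho^\theta - \overline{\rho^\theta}$, which suffices to exclude the concentration of mass at the origin (cf. (3.26)).

Lemma 3.2. Let $0 < \theta < 1$ satisfy $\frac12(1 - \theta + \sqrt{1 + 6\theta + \theta^2}) \le \gamma$. Then
\[
\rho^\theta - \overline{\rho^\theta} \in L^{2/\theta}([0,T], L^{2/\theta}(0,1)).
\]
Proof. By Lemma 3.1 we have
\[
a(\overline{\rho^{\gamma+\theta}} - \overline{\rho^\gamma}\;\overline{\rho^\theta}) = \mu(Q - \overline{\rho^\theta}u_x).
\tag{3.16}
\]
By virtue of convexity, $\overline{\rho^{\gamma+\theta}} \ge \overline{\rho^\gamma}^{\frac{\gamma+\theta}{\gamma}}$, $\overline{\rho^\gamma} \ge \rho^\gamma$ and $\rho^\theta \ge \overline{\rho^\theta}$. Hence,
\[
\overline{\rho^{\gamma+\theta}} - \overline{\rho^\gamma}\;\overline{\rho^\theta} \ge \overline{\rho^\gamma}^{\frac{\gamma+\theta}{\gamma}} - \overline{\rho^\gamma}\;\overline{\rho^\theta}
= \overline{\rho^\gamma}\Bigl(\overline{\rho^\gamma}^{\frac{\theta}{\gamma}} - \overline{\rho^\theta}\Bigr) \ge \rho^\gamma(\rho^\theta - \overline{\rho^\theta}) \ge 0.
\tag{3.17}
\]
It is easy to see that by (2.7) and (2.8), $\rho_\varepsilon^\theta\partial_x u_\varepsilon$, $\overline{\rho^\theta}u_x \in L^{\frac{2\gamma}{\gamma+2\theta}}([0,T], L^{\frac{2\gamma}{\gamma+2\theta}}(0,1))$. Therefore, $Q$ and $\overline{\rho^\theta}u_x \in L^{\frac{2\gamma}{\gamma+2\theta}}([0,T], L^{\frac{2\gamma}{\gamma+2\theta}}(0,1))$, which together with (3.16) and (3.17) gives
\[
\rho^\gamma(\rho^\theta - \overline{\rho^\theta}) \in L^{\frac{2\gamma}{\gamma+2\theta}}([0,T], L^{\frac{2\gamma}{\gamma+2\theta}}(0,1)).
\tag{3.18}
\]
Recalling $2 - \frac{2\gamma\theta}{\gamma+2\theta} \le \frac{2\gamma^2}{\gamma+2\theta}$ when $\theta$ satisfies $\frac12(1 - \theta + \sqrt{1 + 6\theta + \theta^2}) \le \gamma$, using (3.18) and Young's inequality, we thus obtain
\[
\begin{aligned}
(\rho^\theta - \overline{\rho^\theta})^{\frac2\theta} &= (\rho^\theta - \overline{\rho^\theta})^{\frac{2\gamma}{\gamma+2\theta}}(\rho^\theta - \overline{\rho^\theta})^{\frac2\theta - \frac{2\gamma}{\gamma+2\theta}}\\
&\le C(\rho^\theta - \overline{\rho^\theta})^{\frac{2\gamma}{\gamma+2\theta}}\rho^{2 - \frac{2\gamma\theta}{\gamma+2\theta}}\\
&\le C(\rho^\theta - \overline{\rho^\theta})^{\frac{2\gamma}{\gamma+2\theta}}\bigl(1 + \rho^{\frac{2\gamma^2}{\gamma+2\theta}}\bigr)\\
&\le C\Bigl(1 + \rho^\gamma + \bigl[(\rho^\theta - \overline{\rho^\theta})\rho^\gamma\bigr]^{\frac{2\gamma}{\gamma+2\theta}}\Bigr) \in L^1([0,T], L^1(0,1)),
\end{aligned}
\]
which proves the lemma.

Remark 3.2. In the proof of Lemma 3.2 we did not directly use any one-dimensional feature of the symmetric form (1.1). Thus, this $L^{2/\theta}$-integrability of $\rho^\theta - \overline{\rho^\theta}$ could be a multidimensional property.

Lemma 3.3. The estimate (1.6) in Theorem 1.1 holds for $(\rho, u)$, and $\rho$, $\overline{\rho^\theta}^{1/\theta} \in L^\infty([0,T], L^1(\mathbb{R}^+))$.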
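The threshold on $\gamma$ in Lemma 3.2 is exactly the condition under which the exponent inequality used in its proof holds. The following sketch derives it by clearing denominators (all quantities are positive) and solving a quadratic in $\gamma$.

```latex
% Sketch: origin of the condition (1/2)(1 - theta + sqrt(1 + 6 theta + theta^2)) <= gamma in Lemma 3.2.
% Assumptions: gamma > 0 and 0 < theta < 1.
\begin{align*}
2 - \frac{2\gamma\theta}{\gamma+2\theta} \le \frac{2\gamma^2}{\gamma+2\theta}
 &\iff (\gamma + 2\theta) - \gamma\theta \le \gamma^2
  \iff \gamma^2 - (1-\theta)\gamma - 2\theta \ge 0 \\
 &\iff \gamma \ge \tfrac12\Bigl(1 - \theta + \sqrt{(1-\theta)^2 + 8\theta}\Bigr)
  = \tfrac12\Bigl(1 - \theta + \sqrt{1 + 6\theta + \theta^2}\Bigr).
\end{align*}
```

The right-hand side tends to $1$ as $\theta \to 0$, so for every $\gamma > 1$ a sufficiently small $\theta \in (0,1)$ satisfies the hypothesis of Lemma 3.2.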
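Proof. Taking into account that $a\rho_\varepsilon^\gamma/(\gamma - 1) \le \psi_\varepsilon(\rho_\varepsilon) + C\varepsilon^\gamma(1 + \rho_\varepsilon^\gamma)$, we use (3.7), (3.1), and (2.7), (2.4) and (2.8) to deduce that for any $\delta, N > 0$,
\[
\begin{aligned}
&\int_\delta^N \Bigl[\frac{\rho u^2}{2} + \psi(\rho)\Bigr](t,x)\, x^m dx + \mu\int_0^t\int_\delta^N \Bigl(u_x^2 + m\frac{u^2}{x^2}\Bigr)x^m\, dx\,d\tau\\
&\quad\le \liminf_{\varepsilon\to0}\Bigl\{\int_\delta^N \Bigl[\frac{\rho_\varepsilon u_\varepsilon^2}{2} + \frac{a\rho_\varepsilon^\gamma}{\gamma-1}\Bigr]x^m dx + \mu\int_0^t\int_\delta^N \Bigl((u_\varepsilon)_x^2 + m\frac{u_\varepsilon^2}{x^2}\Bigr)x^m\, dx\,d\tau\Bigr\}\\
&\quad\le \int_0^\infty \Bigl[\frac{m_0^2}{2\rho_0} + \psi(\rho_0)\Bigr]x^m dx + \liminf_{\varepsilon\to0} C\varepsilon^\gamma\int_\delta^N (1 + \rho_\varepsilon^\gamma)x^m dx\\
&\quad= \int_0^\infty \Bigl[\frac{m_0^2}{2\rho_0} + \psi(\rho_0)\Bigr]x^m dx,
\end{aligned}
\]
which, by taking $\delta \to 0$ and $N \to \infty$, gives (1.6).

Let $\phi_k \in C_0^\infty(\mathbb{R}^+)$ satisfy $\phi_k(x) = 0$ for $x \le 1$ or $x \ge k + 1$ and $\phi_k(x) = 1$ for $2 \le x \le k$. Using (3.4) and (2.8), applying Lemma C.1 of [14, Appendix C], we obtain $\rho \in C^0([0,T], L^\gamma_{loc}(\mathbb{R}^+) - w)$. Hence, if we multiply $(3.13)_1$ by $x^m\phi_k^2$, integrate the resulting equation over $(0,t) \times \mathbb{R}^+$ and integrate by parts, we find that
\[
\begin{aligned}
\int_0^\infty \rho(t,x)\phi_k^2 x^m dx &\le \int_0^\infty \rho_0\phi_k^2 x^m dx + C\int_0^t\int_0^\infty \phi_k|\partial_x\phi_k|\,\rho\,|u|x^m\, dx\,ds\\
&\le C + \int_0^t\int_0^\infty \rho\phi_k^2 x^m\, dx\,ds + C\int_0^t\int_1^\infty \rho u^2 x^m\, dx\,ds\\
&\le C + \int_0^t\int_0^\infty \rho\phi_k^2 x^m\, dx\,ds,
\end{aligned}
\]
which, by applying Gronwall's inequality, implies $\int_0^\infty \rho\phi_k^2 x^m dx \le C$, where $C$ is a positive constant independent of $k$. So, letting $k \to \infty$, we get $\int_2^\infty \rho(t,x)x^m dx \le C$ for any $t \in [0,T]$. On the other hand, $\int_0^2 \rho(t,x)x^m dx \le C\int_0^2 (1 + \rho^\gamma)x^m dx \le C$. Thus, $\rho \in L^\infty([0,T], L^1(\mathbb{R}^+))$. By convexity we have $\overline{\rho^\theta}^{1/\theta} \le \rho$, which proves the lemma.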
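Now, we are in a position to give the proof of Theorem 1.1.

Proof of Theorem 1.1. Using Eqs. (2.5), recalling (3.8) and (3.4), we have
\[
[\rho_\varepsilon]_t \in L^\infty\big([0,T], W^{-1,\frac{2\gamma}{\gamma+1}}_{loc}(\mathbb{R}^+)\big), \qquad
[\rho_\varepsilon u_\varepsilon]_t \in L^{\frac{4\gamma}{2\gamma+1}}\big([0,T], W^{-1,\frac{4\gamma}{2\gamma+1}}_{loc}(\mathbb{R}^+)\big).
\]
So, we obtain by Lemma C.1 in [14, Appendix C] that
\[
\begin{aligned}
\lim_{\varepsilon\to0}\int_0^\infty \rho_\varepsilon(t,x)\phi(x)\,x^m\,dx &= \int_0^\infty \rho(t,x)\phi(x)\,x^m\,dx \in C^0([0,T]), \\
\lim_{\varepsilon\to0}\int_0^\infty (\rho_\varepsilon u_\varepsilon)(t,x)\phi(x)\,x^m\,dx &= \int_0^\infty (\rho u)(t,x)\phi(x)\,x^m\,dx \in C^0([0,T])
\end{aligned}
\tag{3.19}
\]
for any $\phi \in C_0^\infty(\mathbb{R}^+)$. By a density argument we find that (3.19) holds for any $\phi \in L^{\frac{\gamma}{\gamma-1}}(\mathbb{R}^+)$ or $\phi \in L^{\frac{2\gamma}{\gamma-1}}(\mathbb{R}^+)$ with $\operatorname{supp}\phi \subset\subset \mathbb{R}^+_0$. Therefore, $(1.3)_3$ and $(1.3)_4$ are satisfied by taking into account (2.1). Thus, in view of (3.1), (3.2), Lemmas 2.1, 2.2 and 3.3, we see that to complete the proof of Theorem 1.1, it remains to prove only that (1.4), (1.5) hold for $(\rho,u)$.

Let $\theta\in(0,1)$ satisfy the condition of Lemma 3.2. First notice that by convexity,
\[
\overline{\rho^{\gamma+\theta}}^{\,\frac{\gamma}{\gamma+\theta}} \ge \overline{\rho^\gamma}
\quad\text{and}\quad
\overline{\rho^{\gamma+\theta}}^{\,\frac{\theta}{\gamma+\theta}} \ge \overline{\rho^\theta}.
\]
Hence, we deduce from Lemma 3.1 that
\[
Q \ge \overline{\rho^\theta}\,u_x. \tag{3.20}
\]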
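The convexity step behind the two inequalities preceding (3.20) can be made explicit. What follows is only a sketch, under the assumption (a convention taken from [14], not restated in this section) that an overline denotes the weak limit of the corresponding sequence in $\varepsilon$; the auxiliary symbols $g_\varepsilon$ and $p$ are introduced here for illustration.

```latex
% Assumption: g_eps >= 0, with g_eps -> \overline{g} and g_eps^p -> \overline{g^p}
% weakly in L^1_loc. For 0 < p < 1 the map s -> s^p is concave on [0, infinity),
% so the weak limits satisfy
\[
  \overline{g^{p}} \;\le\; \bigl(\overline{g}\bigr)^{p} \qquad \text{a.e.}
\]
% Apply this with g_eps = rho_eps^{gamma+theta} and two choices of p:
\[
  p=\tfrac{\gamma}{\gamma+\theta}:\quad
  \overline{\rho^{\gamma}} \le \bigl(\overline{\rho^{\gamma+\theta}}\bigr)^{\frac{\gamma}{\gamma+\theta}},
  \qquad
  p=\tfrac{\theta}{\gamma+\theta}:\quad
  \overline{\rho^{\theta}} \le \bigl(\overline{\rho^{\gamma+\theta}}\bigr)^{\frac{\theta}{\gamma+\theta}} .
\]
% The two exponents add up to 1, so multiplying the inequalities gives
\[
  \overline{\rho^{\gamma}}\;\overline{\rho^{\theta}} \;\le\; \overline{\rho^{\gamma+\theta}} .
\]
```

The same concavity argument with $g_\varepsilon = \rho_\varepsilon$ and $p = \theta$ gives $\overline{\rho^\theta} \le \rho^\theta$, that is, $\overline{\rho^\theta}^{\,1/\theta} \le \rho$ as in Lemma 3.3.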
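Using the first equation of (3.13), we get, analogously to (2.13) (also see the proof of (2.23)), that
\[
\partial_t\rho^\theta + \partial_x(u\rho^\theta) + \frac{m\theta\rho^\theta u}{x} = (1-\theta)u_x\rho^\theta. \tag{3.21}
\]
Thus, subtracting (3.14) from (3.21) and employing (3.20), we get
\[
\partial_t(\rho^\theta - \overline{\rho^\theta}) + \partial_x\big(u(\rho^\theta - \overline{\rho^\theta})\big) + \frac{m\theta u}{x}(\rho^\theta - \overline{\rho^\theta}) \le (1-\theta)(\rho^\theta - \overline{\rho^\theta})u_x.
\]
Now, we multiply the above inequality by $x^{m\theta}$ to get
\[
\partial_t f + \partial_x(uf) \le (1-\theta)f u_x, \tag{3.22}
\]
where $f \equiv f(t,x) := x^{m\theta}(\rho^\theta - \overline{\rho^\theta})$. Note that $f\ge0$ by convexity. Next, we claim that
\[
\partial_t f^{1/\theta} + \partial_x(uf^{1/\theta}) \le 0 \quad\text{in } \mathcal{D}'((0,T)\times\mathbb{R}^+). \tag{3.23}
\]
In fact, applying the mollifier to (3.22), we have (cf. (2.11))
\[
\partial_t f_\delta + \partial_x(uf_\delta) \le (1-\theta)u_x f_\delta + r_\delta, \tag{3.24}
\]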
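where $f_\delta(t,x) = f(t,\cdot)*j_\delta$ and $r_\delta(t,x) = [uf_\delta]_x - [uf]_x * j_\delta + (1-\theta)[(u_xf)*j_\delta - u_xf_\delta]$. By Lemma 2.1, $f \in L^{2\gamma/\theta}_{loc}([0,T]\times\mathbb{R}^+)$. So, we apply Lemma 2.3 of [14] to conclude that $r_\delta \to 0$ in $L^q_{loc}([0,T]\times\mathbb{R}^+)$ as $\delta\to0$, with $\frac1q = \frac12 + \frac{\theta}{2\gamma}$. Now, multiplying (3.24) by $\frac1\theta f_\delta^{1/\theta-1}$, we infer that
\[
\partial_t f_\delta^{1/\theta} + \partial_x(uf_\delta^{1/\theta}) \le \frac1\theta\, r_\delta f_\delta^{1/\theta-1},
\]
which, by taking $\delta\to0$ and noticing that $f^{1/\theta-1} \in L^{\frac{2\gamma}{1-\theta}}_{loc}([0,T]\times\mathbb{R}^+)$, yields (3.23).

Using Eqs. (3.14) and (3.21), recalling
\[
u_xf^{1/\theta},\quad f^{1/\theta-1}x^{m\theta}(\rho^\theta u_x - Q) \in L^{\frac{2\gamma}{\gamma+1}}_{loc}([0,T]\times\mathbb{R}^+)
\]
because of $\rho^\theta u_x,\ Q \in L^{\frac{2\gamma}{\gamma+\theta}}_{loc}([0,T]\times\mathbb{R}^+)$, we have similarly to (3.23) that
\[
\partial_t f^{1/\theta} + \partial_x(uf^{1/\theta}) = (1-\theta^{-1})\big[u_xf^{1/\theta} - f^{1/\theta-1}x^{m\theta}(\rho^\theta u_x - Q)\big],
\]
which implies $\partial_t f^{1/\theta} \in L^{\frac{2\gamma}{\gamma+1}}\big([0,T], W^{-1,\frac{2\gamma}{\gamma+1}}_{loc}(\mathbb{R}^+)\big)$. Hence, Lemma C.1 of [14, Appendix C] yields
\[
f^{1/\theta} \in C^0([0,T], L^{2\gamma}_{loc}(\mathbb{R}^+) - w). \tag{3.25}
\]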
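The passage from (3.24) to (3.23) rests on a chain-rule computation in which the terms containing $u_x f^{1/\theta}$ cancel. The following formal calculation is a sketch for a smooth, positive $f$ (it is applied in the text to the mollified $f_\delta$); it is meant to illustrate the cancellation, not to replace the mollification argument.

```latex
% Multiply each term of  d_t f + d_x(u f) <= (1-theta) u_x f + r  by (1/theta) f^{1/theta - 1}.
\begin{align*}
  \tfrac1\theta f^{\frac1\theta-1}\,\partial_t f
    &= \partial_t f^{\frac1\theta}, \\
  \tfrac1\theta f^{\frac1\theta-1}\,\partial_x(uf)
    &= u\,\partial_x f^{\frac1\theta} + \tfrac1\theta\,u_x f^{\frac1\theta}
     = \partial_x\bigl(u f^{\frac1\theta}\bigr) + \bigl(\tfrac1\theta-1\bigr)u_x f^{\frac1\theta}, \\
  \tfrac1\theta f^{\frac1\theta-1}\,(1-\theta)\,u_x f
    &= \bigl(\tfrac1\theta-1\bigr)u_x f^{\frac1\theta}.
\end{align*}
% The term (1/theta - 1) u_x f^{1/theta} appears on both sides and cancels, leaving
\[
  \partial_t f^{\frac1\theta} + \partial_x\bigl(u f^{\frac1\theta}\bigr)
  \;\le\; \tfrac1\theta\, r\, f^{\frac1\theta-1}.
\]
```

Running the same computation on the equality obtained from (3.14) and (3.21), with right-hand side $(1-\theta)x^{m\theta}(\rho^\theta u_x - Q)$ in place of $(1-\theta)u_x f + r$, leaves the factor $(1-\theta^{-1})$ in front of both remaining terms, which is the identity stated just before (3.25).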
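Now, take $\phi_\varepsilon \in C_0^\infty(\mathbb{R}^+)$ satisfying
\[
\phi_\varepsilon(x) = \begin{cases} 0, & x \le \varepsilon/2 \ \text{or}\ x \ge \varepsilon^{-1}+1, \\ 1, & \varepsilon \le x \le \varepsilon^{-1}, \end{cases}
\]
and $|\partial_x\phi_\varepsilon(x)| \le C\varepsilon^{-1}$ for $x\le\varepsilon$ and $|\partial_x\phi_\varepsilon(x)| \le C$ for $x \ge \varepsilon^{-1}$. Noting that $f(0,x) = 0$, we multiply (3.23) with $\phi_\varepsilon$, integrate it over $(0,t)\times(0,\infty)$ and utilize (3.25) to find that
\[
\int_0^\infty f^{1/\theta}(t,x)\phi_\varepsilon(x)\,dx \le \frac{C}{\varepsilon}\int_0^t\!\!\int_0^\varepsilon |u|f^{1/\theta}\,dx\,ds + C\int_0^t\!\!\int_{1/\varepsilon}^\infty |u|f^{1/\theta}\,dx\,ds. \tag{3.26}
\]
From Lemmas 3.2 and 3.3, (2.40) and the Cauchy–Schwarz inequality it follows that $x^{-1}uf^{1/\theta} \in L^1((0,T)\times(0,1))$, $uf^{1/\theta} \in L^1((0,T)\times(1,\infty))$. Hence, letting $\varepsilon\to0$ in (3.26), we obtain for any $T>0$ that $f^{1/\theta} = 0$ a.e. on $(0,T)\times\mathbb{R}^+_0$. This implies
\[
\rho^\theta(t,x) = \overline{\rho^\theta}(t,x), \quad \text{a.e. } (t,x) \in \mathbb{R}^+\times\mathbb{R}^+.
\]
Hence by the convexity, the Young measure associated with $\{\rho_\varepsilon(t,x)\}$ is the Dirac mass (see [24, 12, 22]), and by Proposition 3.1.7 in [12] and Lemma 2.1, we have
\[
\rho_\varepsilon \to \rho \quad\text{strongly in } L^p_{loc}(\mathbb{R}^+_0\times\mathbb{R}^+) \quad \forall\, p < 2\gamma. \tag{3.27}
\]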
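The limit $\varepsilon\to0$ in (3.26) can be spelled out as follows. This is a sketch that uses only the two integrability statements quoted after (3.26), together with the additional assumption (natural for a cutoff, but not stated in the text) that $0 \le \phi_\varepsilon \le 1$.

```latex
% Near the origin: 1/eps <= 1/x for 0 < x <= eps, so
\[
  \frac{C}{\varepsilon}\int_0^t\!\!\int_0^{\varepsilon} |u|\,f^{1/\theta}\,dx\,ds
  \;\le\; C\int_0^t\!\!\int_0^{\varepsilon} x^{-1}|u|\,f^{1/\theta}\,dx\,ds
  \;\longrightarrow\; 0 \qquad (\varepsilon\to0),
\]
% because x^{-1} u f^{1/theta} is integrable on (0,T) x (0,1) and the domain shrinks.
% At infinity, integrability of u f^{1/theta} on (0,T) x (1,infinity) gives
\[
  C\int_0^t\!\!\int_{1/\varepsilon}^{\infty} |u|\,f^{1/\theta}\,dx\,ds
  \;\longrightarrow\; 0 \qquad (\varepsilon\to0).
\]
% Since f >= 0 and phi_eps -> 1 pointwise on (0,infinity), Fatou's lemma yields
\[
  0 \;\le\; \int_0^\infty f^{1/\theta}(t,x)\,dx
  \;\le\; \liminf_{\varepsilon\to0}\int_0^\infty f^{1/\theta}(t,x)\,\phi_\varepsilon(x)\,dx
  \;\le\; 0 ,
\]
% hence f^{1/theta}(t, .) = 0 almost everywhere.
```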
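Noting that $\rho_\varepsilon, \rho \in L^\infty([0,T], L^\gamma(0,N))$ and $\rho_\varepsilon u_\varepsilon, \rho u \in L^\infty([0,T], L^{\frac{2\gamma}{\gamma+1}}(0,N))$ uniformly in $\varepsilon$ for any $N>0$, we have by (3.27), (3.5) and Young's inequality that for any $N>0$,
\[
\rho_\varepsilon \to \rho \ \text{ in } L^1([0,T], L^1(0,N)), \qquad
\rho_\varepsilon u_\varepsilon \rightharpoonup \rho u \ \text{ in } L^\infty([0,T], L^{\frac{2\gamma}{\gamma+1}}(0,N)). \tag{3.28}
\]
Since the weak solution $(\rho_\varepsilon, u_\varepsilon)$ of (2.5), (2.6) satisfies (1.4) in $(0,\infty)\times(\varepsilon,\infty)$ (also see [3]), we infer, recalling $\rho_\varepsilon(t,x) = 0$ for $x\le\varepsilon$, that
\[
\int_0^\infty \rho_\varepsilon\varphi\,x^m\,dx\,\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2}\!\!\int_0^\infty (\rho_\varepsilon\varphi_t + \rho_\varepsilon u_\varepsilon\varphi_x)\,x^m\,dx\,dt = 0 \tag{3.29}
\]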
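for any $\varphi \in C_0^1(\mathbb{R}\times\mathbb{R}^+_0)$. We take $\varepsilon\to0$ in (3.29), and use (3.28) and (3.19) to see that (1.4) is satisfied.

Next, we use the method of shielding test functions (cf. [3, 1]) to show that $(\rho,u)$ satisfies (1.5). To this end, we take a cut-off function $\chi^h \in C_0^\infty(\mathbb{R}^+)$ satisfying $\chi^h(x) = 0$ for $0\le x\le h$ and $\chi^h(x) = 1$ for $x \ge 2h$. Recalling $(1.3)_2$, we multiply $(3.13)_2$ by $\chi^h(x)\varphi(t,x)x^m$ in $L^2((t_1,t_2)\times\mathbb{R}^+)$, where $\varphi \in C_0^\infty(\mathbb{R}\times\mathbb{R}^+_0)$ with $\varphi(t,0) = 0$, integrate by parts and use (3.27) to deduce that
\[
\begin{aligned}
&\int_0^\infty \rho u\chi^h\varphi\,x^m\,dx\,\Big|_{t_1}^{t_2}
- \int_{t_1}^{t_2}\!\!\int_0^\infty \Big[\rho u\varphi_t + \rho u^2\varphi_x + a\rho^\gamma\Big(\varphi_x + \frac{m\varphi}{x}\Big)\Big]\chi^h\,x^m\,dx\,dt \\
&\qquad - \int_{t_1}^{t_2}\!\!\int_0^\infty (\rho u^2 + a\rho^\gamma)\,\varphi\,\partial_x\chi^h\,x^m\,dx\,dt \\
&\quad = -\mu\int_{t_1}^{t_2}\!\!\int_0^\infty \Big[\Big(u_x\varphi_x + \frac{mu\varphi}{x^2}\Big)\chi^h + u_x\varphi\,\partial_x\chi^h\Big]x^m\,dx\,dt.
\end{aligned}
\tag{3.30}
\]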
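Since $\varphi(t,0) = 0$, $|\varphi(t,x)\partial_x\chi^h(x)| \le Ch^{-1}|\varphi(t,x)| \le C$ for any $0\le x\le 2h$ and any $t$. Thus, with the help of Lebesgue's dominated convergence theorem and (1.6), we find that
\[
\int_{t_1}^{t_2}\!\!\int_0^\infty (\rho u^2 + \rho^\gamma + |u_x|)\,|\varphi\,\partial_x\chi^h|\,x^m\,dx\,dt
\le C\int_{t_1}^{t_2}\!\!\int_h^{2h} (\rho u^2 + \rho^\gamma + |u_x|)\,x^m\,dx\,dt \to 0, \quad\text{as } h\to0.
\]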
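The uniform bound on $\varphi\,\partial_x\chi^h$ used above follows from the boundary condition $\varphi(t,0)=0$. A short sketch, assuming (as is standard for such cut-offs, though not stated explicitly here) that $|\partial_x\chi^h| \le C/h$ with $\partial_x\chi^h$ supported in $[h,2h]$:

```latex
% phi is C^1 with phi(t,0) = 0, so by the fundamental theorem of calculus
\[
  |\varphi(t,x)| = \Bigl|\int_0^x \varphi_x(t,y)\,dy\Bigr|
  \;\le\; x\,\sup|\varphi_x| \;\le\; 2h\,\sup|\varphi_x|,
  \qquad 0\le x\le 2h .
\]
% Combined with |d_x chi^h| <= C/h this gives a bound independent of h:
\[
  |\varphi(t,x)\,\partial_x\chi^h(x)| \;\le\; \frac{C}{h}\cdot 2h\,\sup|\varphi_x| \;=\; 2C\,\sup|\varphi_x| .
\]
% The integral over the strip h <= x <= 2h then tends to 0 as h -> 0, since
% (rho u^2 + rho^gamma + |u_x|) x^m is integrable near x = 0 by (1.6)
% (for |u_x| x^m use the Cauchy-Schwarz inequality and the bound on u_x^2 x^m).
```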
Taking $h\to0$ in (3.30) and employing the above estimate, we see that $(\rho,u)$ satisfies (1.5) for any $\varphi \in C_0^\infty(\mathbb{R}\times\mathbb{R}^+_0)$ with $\varphi(t,0) = 0$. Moreover, a density argument shows that (1.5) still holds for any $\varphi \in C_0^1(\mathbb{R}\times\mathbb{R}^+_0)$ with $\varphi(t,0) = 0$. This completes the proof of Theorem 1.1.

Acknowledgement. Ping Zhang would like to thank Professor D. Hoff for sending him the papers [3] and [4]. This work was supported by the 973 Project (No. G1999032801), the NNSF, the Climbing Project, the Ministry of Education and the CAEP of China.
References
1. Evans, L.C.: Weak Convergence Methods for Nonlinear Partial Differential Equations. CBMS 74, Providence, R.I.: AMS, 1990
2. Fujita-Yashima, H. and Benabidallah, R.: Equation à symétrie sphérique d'un gaz visqueux et calorifère avec la surface libre. Ann. Mat. Pura Appl. CLXVIII, 75–117 (1995)
3. Hoff, D.: Spherically symmetric solutions of the Navier–Stokes equations for compressible, isothermal flow with large discontinuous initial data. Indiana Univ. Math. J. 41, 1225–1302 (1992)
4. Hoff, D.: Global solutions of the Navier–Stokes equations for multidimensional compressible flow with discontinuous initial data. J. Diff. Eqs. 120, 215–254 (1995)
5. Hoff, D.: Strong convergence to global solutions for multidimensional flows of compressible, viscous fluids with polytropic equations of state and discontinuous initial data. Arch. Rational Mech. Anal. 132, 1–14 (1995)
6. Feireisl, E., Matušů-Nečasová, Š., Petzeltová, H. and Straškraba, I.: On the motion of a viscous compressible flow driven by a time-periodic external force. Arch. Rational Mech. Anal. 149, 69–96 (1999)
7. Feireisl, E. and Petzeltová, H.: Large-time behaviour of solutions to the Navier–Stokes equations of compressible flow. Arch. Rational Mech. Anal. 150, 77–96 (1999)
8. Feireisl, E. and Petzeltová, H.: On compactness of solutions to the Navier–Stokes equations of compressible flow. J. Diff. Eqs. 163, 57–75 (2000)
9. Jiang, S.: Global spherically symmetric solutions to the equations of a viscous polytropic ideal gas in an exterior domain. Commun. Math. Phys. 178, 339–374 (1996)
10. Jiang, S.: Global solutions of the Cauchy problem for the equations of a viscous polytropic ideal gas. Ann. Scuola Norm. Sup. Pisa 26, 47–76 (1998)
11. Jiang, S.: Large-time behavior of solutions to the equations of a viscous polytropic ideal gas. Ann. Mat. Pura Appl. CLXXV, 253–275 (1998)
12. Joly, J.L., Métivier, G. and Rauch, J.: Focusing at a point and absorption of nonlinear oscillations. Trans. Am. Math. Soc. 347, 3921–3970 (1995)
13. Kobayashi, T. and Shibata, Y.: Decay estimates of solutions to the equations of motion of compressible viscous and heat-conductive gases in an exterior domain in R^3. Commun. Math. Phys. 200, 621–659 (1999)
14. Lions, P.-L.: Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models. Oxford: Oxford Science Publications, Clarendon Press, 1996
15. Lions, P.-L.: Mathematical Topics in Fluid Mechanics, Vol. 2, Compressible Models. Oxford: Oxford Science Publications, Clarendon Press, 1998
16. Lions, P.-L.: Bornes sur la densité pour les équations de Navier–Stokes compressibles isentropiques avec conditions aux limites de Dirichlet. C.R. Acad. Sci. Paris, Ser. I 328, 659–662 (1999)
17. Lu, M., Kazhikhov, A.V. and Ukai, S.: Global solutions to the Cauchy problem of the Stokes approximation equations for two-dimensional compressible flows. Comm. PDEs 23, 985–1006 (1998)
18. Matsumura, A. and Nishida, T.: The initial value problem for the equations of motion of compressible viscous and heat-conductive fluids. Proc. Japan Acad. Ser. A 55, 337–342 (1979)
19. Matsumura, A. and Nishida, T.: The initial value problem for the equations of motion of viscous and heat-conductive gases. J. Math. Kyoto Univ. 20, 67–104 (1980)
20. Matsumura, A. and Nishida, T.: Initial boundary value problems for the equations of motion of compressible viscous and heat-conductive fluids. Commun. Math. Phys. 89, 445–464 (1983)
21. Nikolaev, V.B.: On the solvability of mixed problem for one-dimensional axisymmetrical viscous gas flow. Dinamicheskie zadachi Mekhaniki sploshnoj sredy 63, Sibirsk. Otd. Acad. Nauk SSSR, Inst. Gidrodinamiki, 1983 (Russian)
22. Schonbek, M.: Convergence of solutions to nonlinear dispersive equations. Comm. PDEs 7, 959–1000 (1982)
23. Solonnikov, A.V.: The solvability of the initial boundary-value problem for equations of motion of a viscous compressible barotropic liquid in the space $W_2^{l+1,l/2+1}(Q_T)$. J. Math. Sci. 77, 3250–3255 (1995)
24. Tartar, L.: Compensated compactness and applications to partial differential equations. Nonlinear Analysis and Mechanics, Heriot-Watt Sympos. IV, Knops, R.J. ed., New York: Pitman, 1979, pp. 136–212
25. Vaigant, V.A.: An example of nonexistence globally in time of a solution of the Navier–Stokes equations for a compressible viscous barotropic fluid. Russian Acad. Sci. Dokl. Math. 50, 397–399 (1995)
26. Vaigant, V.A. and Kazhikhov, A.V.: On existence of global solutions to the two-dimensional Navier–Stokes equations for a compressible viscous fluid. Siberian J. Math. 36, 1283–1316 (1995)
27. Valli, A.: Mathematical results for compressible flows. Mathematical Topics in Fluid Mechanics, Pitman Research Notes in Math. Ser. 274, Rodrigues, J.F. and Sequeira, A. ed., New York: John Wiley, 1992, pp. 193–229
28. Xin, Z.: Blow-up of smooth solutions to the compressible Navier–Stokes equations with compact density. Comm. Pure Appl. Math. 51, 229–240 (1998)

Communicated by H. Araki
Commun. Math. Phys. 215, 583 – 589 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Poles of Zeta and Eta Functions for Perturbations of the Atiyah–Patodi–Singer Problem G. Grubb Copenhagen Univ., Math. Dept., Universitetsparken 5, 2100 Copenhagen, Denmark. E-mail: [email protected] Received: 4 October 1999 / Accepted: 7 July 2000
Dedicated to Professor Norio Shimakura on the occasion of his sixtieth birthday.

Abstract: The zeta and eta functions of a differential operator of Dirac-type on a compact $n$-dimensional manifold, provided with a well-posed pseudodifferential boundary condition, have been shown in [G99] to be meromorphic on $\mathbb{C}$ with simple or double poles on the real axis. Extending results from [G99] we show how perturbations of the boundary condition of order $-J$ affect the poles; in particular they preserve a possible regularity of zeta at 0 and a possible simple pole of eta at 0 when $J \ge n$. This applies to perturbations of spectral boundary conditions, also when the structure is non-product and the problem is non-selfadjoint.

Let $D$ be a first-order differential operator (e.g. a Dirac-type operator) from $C^\infty(X,E_1)$ to $C^\infty(X,E_2)$ ($E_1$ and $E_2$ Hermitian $N$-dimensional vector bundles over a compact $n$-dimensional $C^\infty$ manifold $X$ with boundary $\partial X = X'$), and let $D_B$ be the $L_2$-realization defined by a well-posed zero-order pseudodifferential boundary condition $B(u|_{X'}) = 0$. For $\Delta_1 = D_B^*D_B$ and $\Delta_2 = D_BD_B^*$, the following expansions were shown in [G99]:
\[
\operatorname{Tr}(\Delta_i - \lambda)^{-m} \sim \sum_{-n\le k<0} \tilde a_{i,k}(-\lambda)^{-\frac k2-m} + \sum_{k\ge0}\big(\tilde a'_{i,k}\log(-\lambda) + \tilde a''_{i,k}\big)(-\lambda)^{-\frac k2-m}, \tag{1}
\]
\[
\operatorname{Tr} e^{-t\Delta_i} \sim \sum_{-n\le k<0} a_{i,k}t^{\frac k2} + \sum_{k\ge0}\big(-a'_{i,k}\log t + a''_{i,k}\big)t^{\frac k2}, \tag{2}
\]
\[
\Gamma(s)\operatorname{Tr}\Delta_i^{-s} \sim \sum_{-n\le k<0}\frac{a_{i,k}}{s+\frac k2} - \frac{\nu_0(\Delta_i)}{s} + \sum_{k\ge0}\Big(\frac{a'_{i,k}}{(s+\frac k2)^2} + \frac{a''_{i,k}}{s+\frac k2}\Big). \tag{3}
\]
In (1), $m > \frac n2$ and $\lambda\to\infty$ on rays in $\mathbb{C}\setminus\mathbb{R}_+$; in (2), $t\to0+$. Equation (3) means that $\Gamma(s)\operatorname{Tr}\Delta_i^{-s}$, defined in a standard way for $\operatorname{Re}s > \frac n2$, extends meromorphically to $\mathbb{C}$ with the pole structure indicated in the right-hand side. Here $\nu_0$ is the dimension of the nullspace (on which $\Delta_i^{-s}$ is taken to be zero). $\operatorname{Tr}\Delta_i^{-s}$ is also known as the zeta function $\zeta(\Delta_i,s) = \sum_{\text{eigenv. }\lambda>0}\lambda^{-s}$. The three expansions (1)–(3) are essentially equivalent, cf. [GS96], the $k$th coefficients being interrelated by universal constants.

A fundamental example is the Atiyah–Patodi–Singer problem [APS75], where $D$ is a Dirac operator with product structure near $X'$ and $B$ is taken as the orthogonal projection $\Pi_\ge$ onto the nonnegative eigenspace for an associated selfadjoint operator $A$ on $X'$ (the spectral boundary condition). For the general case without assumptions on product structure near $X'$, an expansion up to $k<1$ was shown in [G92] with $a'_{i,0} = 0$, and a full expansion was shown in the joint work with Seeley [GS95].

It is important in applications to know whether the coefficient $a'_{i,0}$ vanishes. Since $\Gamma(s)$ has a pole at 0, $a'_{i,0} = 0$ means that $\zeta(\Delta_i,s)$ is regular at 0. Then the derivative $-\zeta'(\Delta_i,0)$ is well-defined; it equals the “logarithm of the determinant” of $\Delta_i$. (For the connection with determinants, note that $-\zeta'(\Delta_i,s) = \sum_{\text{eigenv. }\lambda>0}\lambda^{-s}\log\lambda$; if $\Delta_i$ is replaced by a positive matrix $T$, this equals $\log\det T$ for $s=0$.)

In an interesting recent paper [W99], Wojciechowski studies the regularity at 0 of the zeta and eta functions of $D_B$ in the case where $D$ is a selfadjoint Dirac-type operator with product structure near $X'$, $B$ is a pseudodifferential projection differing from the Calderón projector $C^+$ by an operator of order $-\infty$, and $D_B$ is selfadjoint (with proof details for $n$ odd). He does mention our paper [G99] in preprint form, but only with a vague statement that “at the moment, the problem of explicit computation of the coefficients in the expansion is open”. This is not so for the particular coefficient $a'_{i,0}$, since our Theorem 9.4 (showing that order $-\infty$ perturbations of the boundary condition do not change the values $a'_{i,k}$), implies that $a'_{i,0} = 0$ in the case considered in [W99], as stated in our Corollary 9.5. This covers the result on $\zeta_{D_P^2}(s)$ in [W99, Th. 0.2] (also for $n$ even).
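The link between the double-pole coefficient at $s=0$ and the regularity of the zeta function can be made explicit with the Laurent expansion of $\Gamma$. The computation below is a sketch; it writes $a'_{i,0}$ and $a''_{i,0}$ for the coefficients of the double and the simple pole at $s=0$ in (3) (a notational assumption made here), and $\gamma_E$ for Euler's constant.

```latex
% Near s = 0 the only singular terms of (3) are those with k = 0 and the nullspace term:
\[
  \Gamma(s)\,\zeta(\Delta_i,s) = \frac{a'_{i,0}}{s^{2}} + \frac{a''_{i,0}-\nu_0(\Delta_i)}{s} + O(1).
\]
% Gamma(s) = 1/s - gamma_E + O(s), hence 1/Gamma(s) = s + gamma_E s^2 + O(s^3), and
\[
  \zeta(\Delta_i,s) = \frac{a'_{i,0}}{s}
  + \bigl(a''_{i,0} - \nu_0(\Delta_i) + \gamma_E\,a'_{i,0}\bigr) + O(s).
\]
% So zeta is regular at 0 exactly when a'_{i,0} = 0, and then
% zeta(Delta_i, 0) = a''_{i,0} - nu_0(Delta_i).
%
% Finite-dimensional model: T a positive matrix with eigenvalues lambda_1,...,lambda_N.
\[
  \zeta(T,s) = \sum_{j=1}^{N}\lambda_j^{-s},
  \qquad
  -\zeta'(T,0) = \sum_{j=1}^{N}\log\lambda_j = \log\det T .
\]
```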
The purpose of the present note is to account for the consequences of [G99] and to extend the analysis to perturbations of arbitrary finite negative order, showing which of the coefficients in (1)–(3) are left unaffected. We establish similar results for eta functions, and include some improved details on the use of the polyhomogeneous calculus of [GS95].

The realization $D_B$ of $D$ and its adjoint $(D_B)^*$ (acting as $D^*$ with a certain boundary condition $B^{(*)}u|_{X'} = 0$) are imbedded in the larger elliptic system
\[
\mathcal{D}_{\mathcal{B}} = \begin{pmatrix} 0 & -D_B^* \\ D_B & 0 \end{pmatrix}, \quad\text{with}\quad
\mathcal{R}_\mu = (\mathcal{D}_{\mathcal{B}} + \mu)^{-1} = \begin{pmatrix} \mu(\Delta_1+\mu^2)^{-1} & D_B^*(\Delta_2+\mu^2)^{-1} \\ -D_B(\Delta_1+\mu^2)^{-1} & \mu(\Delta_2+\mu^2)^{-1} \end{pmatrix}, \tag{4}
\]
for µ ∈ C \ iR; the resolvents Ri,µ = (i + µ2 )−1 can be retrieved from this. For two choices B1 and B2 of B, let B = B2 − B1 . Denoting Bj = Bj Bj(∗) , D+µ for j = 1, 2, and B = B2 − B1 , we have that the inverses ( Rj,µ Kj,µ ) of Bj γ0 for µ ∈ C \ iR satisfy +µ R K ( R2,µ K2,µ ) = ( R1,µ K1,µ ) D B1 γ0 ( 2,µ 2,µ ) = ( R1,µ K1,µ ) −B γI0 R2,µ I −B γ00 K2,µ ;
(5)
here γ0 u = u|X . It is shown in [G99, Cor. 8.3] (to which we refer for notation) that the µ,+ − Kµ+ S Bj γ0 Q µ,+ and Kj,µ = Kµ+ S , operators have the structure Rj,µ = Q j,µ j,µ
where the Sj,µ denote particular right inverses of Bj Cµ+ ; they are weakly polyhomogeneous with symbols in S 0,0 , whereas the other µ-dependent factors are strongly polyhomogeneous. Then µ,+ − Kµ+ S2,µ µ,+ ). R2,µ − R1,µ = −K1,µ B γ0 R2,µ = −Kµ+ S1,µ B γ0 (Q B2 γ0 Q (6)
Denoting 1,j = DBj ∗ DBj and 2,j = DBj DBj ∗ , with resolvents Ri,j,µ , we have in view of (4), R1,2,µ − R1,1,µ = µ−1 ( 1 0 ) (R2,µ − R1,µ ) 01 , (7) DB2 R1,2,µ − DB1 R1,1,µ = − ( 0 1 ) (R2,µ − R1,µ ) 01 , with similar formulas for i = 2. The second expression has a similar structure as in (6), and the first one has it with an extra factor µ−1 . We can likewise find the explicit structures of m−1 m−2 m−1 m m R1,2,µ − R1,1,µ = (R1,2,µ − R1,1,µ )(R1,2,µ + R1,2,µ R1,1,µ + · · · + R1,1,µ ), m−2 m−2 m m m DB2 R1,2,µ − DB1 R1,1,µ = DB2 R1,2,µ (R1,2,µ − R1,1,µ )(R1,2,µ + · · · + R1,1,µ ) (8) m−1 , + (DB2 R1,2,µ − DB1 R1,1,µ )R1,2,µ
for higher powers m. Theorem 1. Let B = B1 − B2 be of order −J for some 1 ≤ J ≤ ∞, and let m ≥ max{n − J, 1}. Then, with ϕ denoting a morphism from E2 to E1 , there are expansions m m Tr(Ri,2,µ − Ri,1,µ )∼ c˜i,k µ−2m−k + (c˜i,k log µ + c˜i,k )µ−2m−k , (9) n−J
m Tr(ϕDB2 R1,2,µ
m − ϕDB1 R1,1,µ )
∼
n−J
k≥0
d˜k µ
1−2m−k
+
k≥0
(d˜k log µ + d˜k )µ1−2m−k , (10)
for µ → ∞ in C \ iR; the c˜i,k and d˜k vanish when k ≤ J − n. Proof. First consider (9). We can let i = 1. The operator R1,2,µ − R1,1,µ is of the form µ−1 Kµ Sµ Tµ , where Kµ and Tµ are a strongly polyhomogeneous Poisson resp. trace operator of order 0 resp. −1, and Sµ is a weakly polyhomogeneous ψdo on X with symbol in S −J,0 , in the calculus introduced in [GS95]. If J ≥ n − 1, the operator is trace-class and has the same trace as the ψdo on X obtained by circular perturbation, µ−1 Sµ Tµ Kµ . Since Tµ Kµ is a strongly polyhomogeneous ψdo on X of order −1, hence has symbol in S −1,0 ∩ S 0,−1 , the composed expression is a weakly polyhomogeneous ψdo on X with symbol in S −1−J,−1 ∩ S −J,−2 . If J < n − 1, we need to consider a power as in (8) with m ≥ n − J , to get a trace class operator. The operators R1,j,µ are of the form Qµ,+ − Kµ Sµ Tµ with Kµ and Tµ as above, Qµ a strongly polyhomogeneous ψdo of degree −2 and Sµ having symbol in S 0,−1 . Then by circular perturbation, m−1 m−1 m m Tr X (R1,2,µ − R1,1,µ ) = Tr X (µ−1 Kµ Sµ Tµ (R1,2,µ + · · · + R1,1,µ ))
= Tr X (Sµ ),
m−1 m−1 Sµ = µ−1 Sµ Tµ (R1,2,µ + · · · + R1,1,µ )Kµ . (11)
The various composition rules explained in [GS95] show that $S_\mu$ is a composite of weakly polyhomogeneous and strongly polyhomogeneous ψdo's on $X'$; its symbol lies in $S^{-m-J,-m}\cap S^{-J,-2m}$, since each factor $R_{1,j,\mu}$ results in multiplying the symbol space by $S^{-1,-1}\cap S^{0,-2}$. Now [GS95, Th. 2.1] can be applied to the resulting ψdo on the manifold $X'$ of dimension $n-1$. Since the total order is $-2m-J$ and the $d$-index is $-2m$, the formula [GS95, (2.1)] gives an expansion (9). For the log-coefficients we use the additional information from [GS95, Th. 2.1] stating that the contribution to the coefficient $\tilde c'_{1,k}$ of $\mu^{-2m-k}\log\mu$ when $k\ge0$ comes entirely from the homogeneous symbol of $S_\mu$ of degree $(1-n)-2m-k$. Since the highest degree of homogeneity occurring in the symbol is $-2m-J$, these $\tilde c'_{1,k}$'s vanish for $k+n-1<J$, i.e., for $k\le J-n$. This shows the statements on (9).

For (10), we have that $\varphi D_{B_2}R_{1,2,\mu} - \varphi D_{B_1}R_{1,1,\mu}$ is of the form $K_\mu S_\mu T_\mu$, where $K_\mu$, $S_\mu$ and $T_\mu$ are as above. A circular perturbation gives a weakly polyhomogeneous ψdo on $X'$, now with symbol in $S^{-1-J,0}\cap S^{-J,-1}$. For $J\ge n-1$ we can pass directly to an application of [GS95, Th. 2.1] as above. For $J<n-1$ we take a high enough $m$ and find, very similarly to the above considerations, that circular perturbation gives a ψdo on $X'$ with symbol in $S^{-m-J,1-m}\cap S^{-J,1-2m}$. Then an application of [GS95, Th. 2.1] gives (10).

The cases $J=1$ and $J=\infty$ were treated in [G99, Th. 9.4 ff.]. (The considerations above on composite expressions are very similar to those in [G99, Th. 9.1]. The deduction given here in terms of resolvent powers is slightly more direct than that indicated in [G99] via $\mu$-derivatives; the present considerations can also be used for the passage from (9.1) to (9.9)–(9.11) in [G99].)

Theorem 1 is carried over to a result on heat traces and zeta and eta functions by application of the transition rules [GS96, Cor. 2.10 and Prop. 5.1]:

Corollary 2.
Under the hypotheses of Theorem 1, one has: k k Tr(e−ti,2 − e−ti,1 ) ∼ −ci,k log t + ci,k t2, ci,k t 2 + n−J
Tr(ϕDB2 e−t1,2 − ϕDB1 e−t1,1 ) ∼
k≥0
n−J
dk t
k−1 2
+
(12)
k≥0
k−1 −dk log t + dk t 2 , (13)
(s) Tr((i,2 ) ∼
−s
n−J
−s
− (i,1 ) ) ∼ ci,k
k 2 −s
s+
+
ci,k ν0 (i,2 ) − ν0 (i,1 ) ci,k , + + s (s + 2k )2 s + 2k k≥0
(s) Tr(ϕDB2 (1,2 ) − ϕDB1 (1,1 )−s ) ∼ dk dk dk ∼ + + , 2 s + k−1 (s + k−1 s + k−1 2 2 ) 2 n−J
(14)
(15)
where the $c'_{i,k}$ and $d'_k$ vanish for $k\le J-n$.

Now consider operators of Dirac-type (notation of [G99]); on a collar neighborhood $U$ of $X'$ they are of the form:
\[
D = \sigma\Big(\frac{\partial}{\partial x_n} + A + x_nP_1 + P_0\Big), \tag{16}
\]
where $x_n$ is a normal coordinate, $A$ is a selfadjoint elliptic first-order differential operator in $L_2(X', E_1|_{X'})$, the $P_j$ are differential operators of order $\le j$, and $\sigma$ is a unitary morphism from $E_1|_{X'}$ to $E_2|_{X'}$. ($U$ is identified with $X'\times[0,c]$ and the $E_i$ are liftings of $E_i|_{X'}$ here.) The product case is the case where, moreover, the $P_j$ are 0 on $U$.

The coefficients in (1)–(3) were determined from the zeta and eta expansions of $D$ and $A$ in [GS96, Cor. 2.7–2.8] in the product case with $B = \Pi_\ge$ (the orthogonal projection onto the nonnegative eigenspace for $A$); then in particular all the $\tilde a'_{i,k}$ and $a'_{i,k}$ with $k\ge0$ vanish when $n$ is odd; the $\tilde a'_{i,k}$ and $a'_{i,k}$ with $k$ even $\ge0$ vanish when $n$ is even. (The remaining coefficients are nonzero in general, cf. Gilkey–Grubb [GG98].) Combining this with Corollary 2, one finds that the same holds for perturbations $B_2 = \Pi_\ge + S$ of $B_1 = \Pi_\ge$ with $S$ of order $-\infty$; this is formulated in [G99, Cor. 9.5]. This result includes the case $B_2 = C^+ + S'$, $S'$ of order $-\infty$, since $C^+ - \Pi_\ge$ is of order $-\infty$ in the product case by [G99, Prop. 4.1] ($C^+$ is the Calderón projector for $D$). With the present accounting of the effect of perturbations of order $-J$, we get the following extended information:

Corollary 3. Consider $D_B$ in the product case with $B = \Pi_\ge + S$, $S$ of order $-J$ (then $B - C^+$ is likewise of order $-J$). If $n$ is odd, all $\tilde a'_{i,k}$ and $a'_{i,k}$ with $0\le k\le J-n$ vanish in (1)–(3). If $n$ is even, all the $\tilde a'_{i,k}$ and $a'_{i,k}$ with $0\le k\le J-n$ and $k$ even vanish in (1)–(3).

We formulate some consequences for the zeroth coefficient in detail.

Corollary 4. In (1)–(3), the coefficients $\tilde a'_{i,0}$ and $a'_{i,0}$ vanish in the following cases:
(a) $D$ is a Dirac-type operator and $B - \Pi_\ge$ is a ψdo of order $\le -n$.
(b) $D$ is a Dirac-type operator in the product case, and $B - C^+$ is a ψdo of order $\le -n$.
(c) $D$ is a Dirac-type operator, $B - C^+$ is a ψdo of order $\le -n$, and the structure near $X'$ is so close to the product case that $\Pi_\ge - C^+$ is of order $-n$.
Hence in these cases, the zeta function ζ (i , s) is regular at s = 0. Proof. The result in case (a) follows from Corollary 2 together with the fact that ai,0 vanishes for Dirac-type operators with the boundary condition ≥ u|X = 0 ([G92]). The result in case (b) follows from Corollary 3, and the result in case (c) is an immediate consequence of the preceding ones. The reason that we do not claim that (a) holds with ≥ replaced by C + is that C + will in general differ from ≥ by an operator that is merely of order −1. Let us also consider the eta function. We have from [G99, (9.9)–(9.11)]: k−1 b˜k (−λ)− 2 −m Tr[ϕDB (1 − λ)−m ] ∼ −n
+
k≥0
Tr[ϕDB e−t1 ] ∼
k−1 b˜k log(−λ) + b˜k (−λ)− 2 −m ,
k
bk t 2 +
−n
(s) Tr[ϕDB −s 1 ]∼
−n
k≥0
bk s+
(17)
k−1 2
+
k−1 −bk log t + bk t 2 ,
k≥0
bk (s +
k−1 2 2 )
+
(18) bk
s+
k−1 2
,
(19)
with similar formulas where DB and DB ∗ are interchanged. When E1 = E2 , the eta −s
+1
function is defined by η(DB , s ) = Tr(DB 1 2 ), so in (19), s corresponds to s = s +1 2 , and the expansion with ϕ = I takes the form 4bk 2bk 2bk ( s +1 + + . (20) 2 )η(DB , s ) ∼ s + k (s + k)2 s + k −n
k≥0
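The coefficients $2$ and $4$ in (20) come from the change of variable between (19) and the eta function. The termwise computation below is a sketch; it assumes the reading $\eta(D_B,s') = \operatorname{Tr}(D_B\Delta_1^{-(s'+1)/2})$ and $s = (s'+1)/2$ of the definitions above, and writes $b_k$, $b'_k$, $b''_k$ for the coefficients of the simple poles with $k<0$, the double poles, and the simple poles with $k\ge0$ in (19) (notation assumed here).

```latex
% Substitute s = (s'+1)/2 into the pole positions of (19):
\[
  s + \frac{k-1}{2} = \frac{s'+1}{2} + \frac{k-1}{2} = \frac{s'+k}{2}.
\]
% Each simple pole therefore picks up a factor 2 and each double pole a factor 4:
\[
  \frac{b_k}{s+\frac{k-1}{2}} = \frac{2\,b_k}{s'+k},
  \qquad
  \frac{b'_k}{\bigl(s+\frac{k-1}{2}\bigr)^{2}} = \frac{4\,b'_k}{(s'+k)^{2}},
  \qquad
  \frac{b''_k}{s+\frac{k-1}{2}} = \frac{2\,b''_k}{s'+k}.
\]
% The prefactor Gamma(s) becomes Gamma((s'+1)/2), which is regular and nonzero at s' = 0
% (Gamma(1/2) = sqrt(pi)), so a double pole of the product at s' = 0 is a double pole of eta.
```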
The coefficients in (17)–(20) were determined from the zeta and eta expansions of $D$ and $A$ in [GS96, Cor. 4.5] in the product case with $B = \Pi_\ge$; in particular, the $b'_k$ vanish for $k\ge0$ if $n$ is odd, and they vanish for $k$ even $>0$ if $n$ is even. For $k=0$ and $n$ even, $b'_0$ is proportional to the residue of $\operatorname{Tr}_{X'}(\sigma A|A|^{-s-1})$ at $s=0$. It vanishes e.g. if $\sigma A = -A\sigma$, for in this case $\operatorname{Tr}_{X'}(\sigma A|A|^{-s-1}) \equiv 0$ (see also [GS96, Cor. 2.4]). Then Corollary 2 implies:

Corollary 5. Consider the product case with $E_1 = E_2$ and $B = \Pi_\ge + S$, $S$ of order $-J$ (so also $B - C^+$ is of order $-J$). When $n$ is odd, all $b'_k$ with $0\le k\le J-n$ vanish in (20). When $n$ is even, all the $b'_k$ with $0<k\le J-n$ and $k$ even vanish in (20); if $\sigma A = -A\sigma$, also $b'_0 = 0$.

Note that in the product case, $D$ is selfadjoint if $\sigma^* = -\sigma$ and $\sigma A = -A\sigma$.

Since $\Gamma(s)$ is regular at $\frac12$, $b'_0$ has to vanish in order for $\eta(D_B,s')$ to have a simple pole at $s'=0$. In the case of a simple pole, our analysis shows that perturbations of the boundary condition by operators of order $\le -n$ still give a simple pole. The pole vanishes under more restrictive circumstances, cf. Douglas and Wojciechowski [DW91], [GS96, Cor. 4.6], [W99].

The results in Corollary 4 extend to $b'_0$ in view of the following theorem.

Theorem 6. Consider the realization $D_B$ of a Dirac-type operator $D$ (cf. (16)) with $B = \Pi_\ge$. The coefficients $\tilde b'_0$ and $b'_0$ in (17)–(19) are the same as these coefficients in the expansions for $D^0_B$, where $D^0 = \chi\sigma(\frac{\partial}{\partial x_n} + A) + (1-\chi)D$, with $\chi = 1$ near $X'$ and supported in $X'\times[0,c[$.

Proof. Observe that we are here dealing with a perturbation of the interior operator, not of the boundary condition as in Theorem 1. This is advantageous, for we can then use the result of [G92, Th. 4.4], showing that the resolvent $(\Delta_1-\lambda)^{-1}$ has the form
\[
(\Delta_1-\lambda)^{-1} = Q_{\lambda,+} + \chi G^0_\lambda\chi + G^{(1)}_\lambda, \tag{21}
\]
where $Q_{\lambda,+}$ is the restriction to $X$ of a parametrix $Q_\lambda$ of $D^*D-\lambda$ defined on a neighborhood of $X$, $G^0_\lambda$ is the singular Green part of the resolvent in the product situation where $D$ is replaced by $\sigma(\frac{\partial}{\partial x_n}+A)$ and $X$ is replaced by $X'\times[0,\infty[$, and $G^{(1)}_\lambda$ is a singular Green operator of order $-3$, class 0 and regularity 0. Then
\[
(m-1)!\,D(\Delta_1-\lambda)^{-m} = D\partial_\lambda^{m-1}(\Delta_1-\lambda)^{-1}
= D\partial_\lambda^{m-1}Q_{\lambda,+} + D\partial_\lambda^{m-1}(\chi G^0_\lambda\chi) + D\partial_\lambda^{m-1}G^{(1)}_\lambda. \tag{22}
\]
In the calculation of the trace, the first term gives no logarithms, the second term gives the expansion known for the product case, where the log-terms start with $\tilde b'_0(-\lambda)^{\frac12-m}\log(-\lambda)$, and the third term gives an expansion with pure powers up to and including the term $c(-\lambda)^{\frac12-m}$ (the method of [G92] gives an $O((-\lambda)^{\frac18-m})$ after this, and the method of [GS95] gives a full expansion with log-terms for higher $k$). So the first log-term for $D(\Delta_1-\lambda)^{-m}$ is the same as that for $D^0((D^0_B)^*D^0_B-\lambda)^{-m}$.
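The first equality in (22) is the usual relation between powers of a resolvent and its $\lambda$-derivatives. A brief sketch of the induction, written for a generic closed operator $\Delta$ and $\lambda$ in its resolvent set (the symbol $\Delta$ stands in for $\Delta_1$):

```latex
% Base step: differentiate the resolvent once in lambda.
\[
  \partial_\lambda(\Delta-\lambda)^{-1} = (\Delta-\lambda)^{-2}.
\]
% Induction: if  d_lambda^{j} (Delta-lambda)^{-1} = j! (Delta-lambda)^{-(j+1)},  then
\[
  \partial_\lambda^{\,j+1}(\Delta-\lambda)^{-1}
  = j!\,\partial_\lambda(\Delta-\lambda)^{-(j+1)}
  = (j+1)!\,(\Delta-\lambda)^{-(j+2)} .
\]
% Taking j = m-1 gives the identity used in (22):
\[
  \partial_\lambda^{\,m-1}(\Delta-\lambda)^{-1} = (m-1)!\,(\Delta-\lambda)^{-m}.
\]
```

Because $D$ does not depend on $\lambda$, it can be applied on the left of each term of (21) after differentiating, which gives the three-term decomposition.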
One could also have shown this by arguments as in the last paragraph of the proof of [GS95, Th. 3.13], but the notation would be quite heavy. Corollary 7. Let E1 = E2 and assume that σ A = −Aσ if n is even. In the three cases (a)–(c) listed in Corollary 4, the coefficient b0 in (20) vanishes, in other words the eta function η(i , s ) has at most a simple pole at s = 0. Proof. The product case with B = ≥ has b0 = 0, as accounted for in Corollary 5. Hence so has the non-product case with B = ≥ , by Theorem 6. The statements for the cases (a) and (b) then follow from Corollary 2 resp 5, and case (c) is an immediate consequence. Note that we do not assume selfadjointness of DB as in [W99]. References [APS75] Atiyah, M.F., Patodi, V.K. and Singer, I.M.: Spectral asymmetry and Riemannian geometry, I. Math. Proc. Camb. Phil. Soc. 77, 43–69 (1975) [DW91] Douglas, R.G. and Wojciechowski, K.P.: Adiabatic limits of the η-invariants, the odd dimensional Atiyah–Patodi–Singer problem. Commun. Math. Phys. 142, 139–168 (1991) [GG98] Gilkey, P.B. and Grubb, G.: Logarithmic terms in asymptotic expansions of heat operator traces. Comm. Part. Diff. Eq. 23, 777–792 (1998) [G92] Grubb, G.: Heat operator trace expansions and index for general Atiyah–Patodi–Singer problems. Comm. P. D. E. 17, 2031–2077 (1992) [G99] Grubb, G.: Trace expansions for pseudodifferential boundary problems for Dirac-type operators and more general systems. Arkiv f. Mat. 37, 45–86 (1999) [GS95] Grubb, G. and Seeley, R.: Weakly parametric pseudodifferential operators and Atiyah–Patodi– Singer boundary problems. Invent. Math.121, 481–529 (1995) [GS96] Grubb, G. and Seeley, R.: Zeta and eta functions for Atiyah–Patodi–Singer operators. J. Geom. An. 6, 31–77 (1996) [W99] Wojciechowski, K.: The ζ -determinant and the additivity of the η-invariant on the smooth, selfadjoint Grassmannian. Commun. Math. Phys. 201, 423–444 (1999) Communicated by A. Jaffe
Commun. Math. Phys. 215, 591 – 608 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Morse Theory and Infinite Families of Harmonic Maps Between Spheres
Kevin Corlette1, Robert M. Wald2
1 Department of Mathematics, University of Chicago, 5734 South University Avenue, Chicago, IL 60637, USA. E-mail: [email protected]
2 Enrico Fermi Institute and Department of Physics, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637, USA. E-mail: [email protected]
Received: 10 December 1999 / Accepted: 7 July 2000

The first author was supported by NSF grant DMS-9971727 during the course of this work.
The second author was supported by NSF grant PHY 95-14726.

Abstract: Existence of an infinite sequence of harmonic maps between spheres of certain dimensions was proven by Bizoń and Chmaj. This sequence shares many features of the Bartnik–McKinnon sequence of solutions to the Einstein–Yang–Mills equations as well as sequences of solutions that have arisen in other physical models. We apply Morse theoretic methods to prove existence of the harmonic map sequence and to prove certain index and convergence properties of this sequence. In addition, we generalize the result of Bizoń and Chmaj to produce infinite sequences of harmonic maps not previously known. The key features “responsible” for the existence and properties of the sequence are thereby seen to be the presence of a reflection ($\mathbb{Z}_2$) symmetry and the existence of a singular harmonic map of infinite index which is invariant under this symmetry.

1. Introduction

A countably infinite sequence $(h_i)$ of harmonic maps from an $(m+1)$-dimensional sphere, $S^{m+1}$, into itself for $2\le m\le5$ was discovered and analyzed by Bizoń [2] and Bizoń and Chmaj [4]. In these references, existence of this sequence was proven via a shooting argument, and analytic arguments and/or numerical evidence also was presented that this sequence satisfies the following properties: (1) as $i\to\infty$, we have $h_i\to h_\infty$ pointwise except at the poles, where $h_\infty$ denotes a singular harmonic map which maps all of $S^{m+1}$ into the equator of $S^{m+1}$, (2) the sequence of energies, $(E_i)$, is monotone increasing and converges to the energy, $E_\infty$, of $h_\infty$ and (3) the index of $h_i$ is $i$.

The above properties bear a remarkable resemblance to the properties of the Bartnik–McKinnon sequence and the colored black hole sequences of Einstein–Yang–Mills theory (see Volkov and Gal'tsov [14] for a review). Here one considers static, spherically symmetric, asymptotically flat, nonsingular solutions of the Einstein–Yang–Mills equations where the static Killing field either remains strictly timelike everywhere (the Bartnik–McKinnon case) or becomes null on a regular event horizon (the colored black hole case). The Bartnik–McKinnon solutions are labeled by a positive integer, $i$, whereas the colored black holes are labeled by $i$ and a positive real number, $r_0$, corresponding to the radius of the event horizon. As $i\to\infty$, the Bartnik–McKinnon sequence converges (in the sense described in Sect. 3.1 of [14]) to the extreme Reissner–Nordström solution with unit magnetic charge. At fixed $r_0$, the colored black hole sequence converges (in the sense described in Sect. 4.1 of [14]) to a unit magnetically charged Reissner–Nordström solution. The mass of both the Bartnik–McKinnon and colored black hole sequences increases monotonically with $i$ and converges to the mass of the limiting Reissner–Nordström solution. Finally, numerical evidence indicates that – if one suitably restricts the function space so that only “even parity” perturbations (involving only variables that are nonzero in the background) are considered – the index of both the $i$th Bartnik–McKinnon and the $i$th colored black hole solution is $i$. Sequences of solutions with similar properties have also been found in a number of other models, in particular, static, spherically symmetric solutions in Yang–Mills-dilaton theory [1] and self-similar wave maps from Minkowski spacetime into $S^3$ (or, equivalently, harmonic maps from the hyperboloid, $H^3$, into $S^3$) [3].

The fact that very similar sequences of solutions exist in the quite different contexts of harmonic map theory and Einstein–Yang–Mills theory suggests that there should be an explanation of the existence and properties of these sequences of solutions that depends only on some general properties of the equations, not on their detailed form. An early attempt to provide such an explanation (made prior to the discovery of the harmonic map sequence) was given in [13].
In that reference, it was proposed that a key property related to the existence of the Bartnik–McKinnon and colored black hole sequences is the presence of a symmetry that ensures a “degeneracy” among solutions, i.e., that if a solution of a given mass exists, then other solutions (obtained by action of the symmetry on this solution) of the same mass also must exist. The following heuristic argument for the existence of the Bartnik–McKinnon sequence was given. (A similar argument also was given for the colored black hole sequence.) Suppose that on the phase space, , appropriate to the problem there exists a “mass flow” vector field (satisfying suitable smoothness properties and invariant under the symmetries), such that the mass monotonically decreases along the integral curves of this vector field, and such that these integral curves always asymptotically approach a critical point of mass (corresponding to a stationary solution of the Einstein–Yang–Mills equations [13]). By the positive mass theorem, the global minimum of mass is flat spacetime with a pure gauge Yang–Mills field, as well as copies of this solution under the symmetry. However, in addition to these global minima of mass (and, possibly, other local minima of mass), there should exist other critical points of mass: the integral curves of the mass flow vector field should not be able to bifurcate discontinuously between the different local minima, so there should exist points of phase space that do not flow to any local minimum. Consequently, these must flow to critical points of nonzero index. Indeed, if the set of points, 1 , that do not flow to a local minimum of mass has the structure of a hypersurface of co-dimension 1, then the critical point of minimum mass within 1 would have index 1, corresponding to the first Bartnik–McKinnon solution. 
The argument can now be repeated, replacing by 1 , to argue for existence of a critical point of index 1 within 1 and, hence, index 2 within the original phase space . It was proposed in [13] that this would account for the second Bartnik–McKinnon solution; continued iteration of this argument should generate the entire Bartnik–McKinnon sequence. This argument would naturally account
Morse Theory and Infinite Families of Harmonic Maps
for the fact that the index of the i th Bartnik–McKinnon solution is i, as well as for the fact that the masses in the sequence increase monotonically. However, in addition to its heuristic nature and its major gaps concerning the existence of a suitable mass flow vector field, the manifold nature of i , etc., the argument given in [13] suffers from the following three serious deficiencies: First, the relevant symmetry was identified in [13] as being the “large gauge transformations” of the Yang–Mills field. However, no analog of this symmetry exists in the harmonic map problem, so if a common explanation is sought, this could not be the relevant symmetry in the Einstein–Yang–Mills context. Second, no argument was given as to why 1 (or any of the higher i ) should be connected or – if not connected – why the symmetry should map any connected component of 1 into itself. If the symmetry fails to map a connected component of 1 into itself, no bifurcation of the mass flow on 1 need occur, and the above argument for additional critical points breaks down. Note that this difficulty would leave intact the argument for a critical point of index 1 (i.e., the first Bartnik–McKinnon solution) since is connected, but if the relevant symmetry is taken to be the large gauge transformations, there is no reason to expect that any higher members of the Bartnik– McKinnon sequence need exist. Third, the argument given in [13] does not account for any of the convergence properties of the Bartnik–McKinnon sequence as i → ∞. In this paper, we will give a Morse theoretic proof of the existence and properties of a generalization of the sequence of harmonic maps between spheres found in [2] and [4]. By doing so, we will – in the context of the harmonic map problem rather than the Einstein–Yang–Mills problem – in effect, cure all of the deficiencies as well as close all of the gaps in the general argument sketched in [13]. 
In our proofs, the relevant symmetry on the space of maps between spheres will be seen to be a Z2 symmetry corresponding to composing a given map between the spheres with the reflection isometry about the equatorial plane of the image sphere. It will be seen that an additional fact playing a crucial role in our proofs is the existence of a harmonic map (namely, h∞ ) which is invariant under this symmetry and which has infinite index. At a heuristic level, the presence of h∞ ensures that the heat flow has the appropriate bifurcation properties to obtain an infinite sequence of solutions. However, a number of technical difficulties would arise if we attempted to use heat flow arguments in our proofs. The heat flow for maps into a sphere can develop singularities in finite time, which makes it difficult to regard it as a flow on the space of maps. As is typical in Morse theoretic arguments on infinite-dimensional spaces, we replace the heat flow by an energy-decreasing flow which is defined in a more ad hoc manner, but has the two advantages that it is defined for all time and its flow lines converge to limiting maps as time tends to infinity. It then becomes possible to apply the essential observation of Morse theory, which is that nontrivial topology in the space of maps forces the existence of critical points for the energy which are not minima. More precisely, one can apply a minimax argument for each homology class in the configuration space: the minimum of the collection of numbers E such that the chosen homology class can be realized in the space of maps with energy no larger than E is a critical value. However, the space of maps in our case is parametrized by real-valued functions on the real line, so has no nontrivial topology. 
We avoid this problem by (1) showing that all critical points have energy bounded by the energy of the unique singular critical point (corresponding to a map which collapses the domain sphere to the equator in the target sphere) and (2) exploiting the Z2 symmetry. This leads to a configuration space with nontrivial homology classes in infinitely many dimensions. These homology classes are the essential explanation for the existence of the infinite sequence of harmonic maps.
Several technical complications arise in the course of the proof which may obscure the main ideas, so we will outline a naive version of the argument here. The first step is to reduce the harmonic map equation to an ordinary differential equation by prescribing the map along codimension one slices in the domain sphere. This leads to the condition that the map along such slices must be an eigenmap, i.e. a harmonic map with constant energy density. By an appropriate choice of coordinates, we can then reduce the problem to that of finding critical points of an energy functional E on a Hilbert space H of functions h : R → R. There is an involution of H given by multiplication by −1 which preserves E. It has a unique fixed point, corresponding to the function h∞ which is identically zero, and corresponding to a singular harmonic map which collapses the domain to the equator of the target. We can show that all other critical points of E correspond to smooth harmonic maps with energy strictly less than E(h∞ ). Under suitable conditions, h∞ has infinite index as a critical point of E. Ideally, we would exploit this as follows. Consider the punctured Hilbert space H ∗ = H − {h∞ }, and divide by the involution to obtain a space H ∗ with the homotopy type of RP ∞ . In particular, the homology of H ∗ is Z2 [x], where x is a class of degree 1. In the cases where h∞ has infinite index, we can realize all of the homology of H ∗ in the portion consisting of functions with energy less than E(h∞ ). If Morse theory could be applied naively, it would tell us that there must be an infinite sequence of critical points, with at least one critical point of index k for every k ≥ 0. The idea of finding harmonic maps between spheres by reducing to an ordinary differential equation is an old one, first used by Smith [10, 11] and used more recently by Ding [6], Eells–Ratto [7] and Pettinati–Ratto [9]. 
To the best of our knowledge, most of the work done in this direction has focused on showing that harmonic maps exist, without trying to show that they exist in profusion. The work of Bizoń and Bizoń–Chmaj appears to be the first in this direction.

It should be noted that the static, spherically symmetric Einstein–Yang–Mills equations also possess a $\mathbb{Z}_2$ symmetry given by the transformation $w \to -w$ in the notation of [14] (see Eqs. (2.50)–(2.53) of that reference). In addition, the Reissner–Nordström solutions with unit magnetic charge are invariant under this symmetry and have infinite index. This suggests that it might be possible to give a similar Morse theoretic proof of the existence and properties of the Bartnik–McKinnon and colored black hole sequences; the main problem would be to show that the relevant functional satisfies the Palais–Smale condition. However, we shall not pursue this issue here.

The contents of this paper are as follows. In Sect. 2, we describe the basic framework for studying special classes of harmonic maps which can be interpreted as solutions of ordinary differential equations. We establish certain basic properties of these solutions, including the bound on the energy mentioned above. In Sect. 3, we study the index of the singular map which collapses the domain sphere onto the equator of the target, and show that it is infinite under certain circumstances. In the final section, we apply Morse theory for functions on compact convex sets to establish the existence of an infinite sequence of harmonic maps.

Both authors would like to thank Piotr Bizoń for helpful discussions related to this paper. The second author would particularly like to acknowledge conversations with Bizoń several years ago which provided the initial impetus for this project.
2. Preliminaries

Let $S^n$ be the $n$-dimensional sphere with the Riemannian metric induced from its identification with the unit sphere in Euclidean space. We identify the $(n+1)$-sphere with north and south poles removed with
\[
  \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \times S^n \tag{1}
\]
with the metric
\[
  d\theta^2 + \cos^2\theta \, d\psi^2, \tag{2}
\]
where $\theta$ is the coordinate along $(-\frac{\pi}{2}, \frac{\pi}{2})$ and $d\psi^2$ is the round metric on $S^n$. If we fix a map $F : S^m \to S^n$, then a function $f : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-\frac{\pi}{2}, \frac{\pi}{2}]$ with $f(\pm\frac{\pi}{2}) = \pm\frac{\pi}{2}$ gives a map $\tilde f : S^{m+1} \to S^{n+1}$ via $\tilde f(\theta, \psi) = (f(\theta), F(\psi))$. We are interested in harmonic maps which can be written in this form with $m \ge 2$. In order for this to be reasonable, we must impose a condition on $F$.

Definition 1. $F : S^m \to S^n$ is an eigenmap if it is harmonic and the energy density $|dF|^2 = \omega$ is constant as a function on $S^m$. In this situation, $\omega$ is called the eigenvalue of the map.

One of the simplest examples of an eigenmap is the identity map $S^m \to S^m$; the corresponding eigenvalue is $m$. Other examples are given by the Hopf maps $S^3 \to S^2$, $S^7 \to S^4$ and $S^{15} \to S^8$; these have eigenvalues 8, 16 and 32, respectively. Any eigenmap $S^m \to S^n$ is obtained from a collection of $n+1$ eigenfunctions $\xi_1, \ldots, \xi_{n+1}$ of the Laplacian on $S^m$ satisfying $\sum_i \xi_i^2 = 1$. There is no general classification of eigenmaps between spheres, but a number of examples and partial results are known. For example, there are eigenmaps produced by what is known as the Hopf construction, generalizing the classical Hopf maps. There are also eigenmaps associated with harmonic eiconals. These are harmonic polynomials on $\mathbb{R}^{n+1}$ whose gradients have unit length along the unit sphere; the gradients then give harmonic maps $S^n \to S^n$. Consult Eells–Ratto [7], Chapter VIII for more information.

Once we assume that $F$ is an eigenmap, the condition for $\tilde f$ to be a harmonic map reduces to a differential equation for $f$:
\[
  f'' - m \tan\theta \, f' + \frac{\omega}{2} \sec^2\theta \, \sin 2f = 0. \tag{3}
\]
This is the Euler–Lagrange equation for the following energy functional:
\[
  J(f) = \frac{1}{2} \int_{-\pi/2}^{\pi/2} \left[ (f')^2 + \frac{\omega \cos^2 f}{\cos^2\theta} \right] \cos^m\theta \, d\theta, \tag{4}
\]
which gives the energy of $\tilde f$ in the usual sense. It will often be useful to make the change of variables $x = \log\left(\tan \frac{1}{2}\left(\theta + \frac{\pi}{2}\right)\right)$ and to set $h(x) = f\left(2\tan^{-1}(e^x) - \frac{\pi}{2}\right)$. Then the energy becomes
\[
  E(h) = \frac{1}{2} \int_{-\infty}^{\infty} \left[ (h')^2 + \omega \cos^2 h \right] \operatorname{sech}^{m-1} x \, dx, \tag{5}
\]
while the Euler–Lagrange equation becomes
\[
  h'' - (m-1) \tanh x \, h' + \frac{\omega}{2} \sin 2h = 0. \tag{6}
\]
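The passage from (3)–(4) to (5)–(6) rests on two elementary identities for the change of variables: with $\theta = 2\tan^{-1}(e^x) - \frac{\pi}{2}$ one has $\cos\theta = \operatorname{sech} x$ (hence $d\theta = \operatorname{sech} x \, dx$ and $f'(\theta) = h'(x)\cosh x$) and $\tan\theta = \sinh x$. The following is a minimal numerical sanity check of these two identities, not part of the original argument; the function names and the sample grid are illustrative choices.

```python
import math

def theta_of_x(x):
    """Inverse of x = log(tan((theta + pi/2)/2)): theta = 2*atan(e^x) - pi/2."""
    return 2.0 * math.atan(math.exp(x)) - math.pi / 2.0

def sech(x):
    return 1.0 / math.cosh(x)

def max_identity_error(xs):
    """Largest deviation from cos(theta) = sech(x) and tan(theta) = sinh(x) over xs."""
    worst = 0.0
    for x in xs:
        theta = theta_of_x(x)
        worst = max(worst,
                    abs(math.cos(theta) - sech(x)),
                    abs(math.tan(theta) - math.sinh(x)))
    return worst

# Illustrative grid on [-3, 3]; the printed deviation should be at rounding level.
sample_points = [k / 10.0 for k in range(-30, 31)]
print(max_identity_error(sample_points))
```

Substituting $d\theta = \operatorname{sech} x \, dx$ and $f' = h' \cosh x$ into (4) turns the weight $\cos^m\theta \, d\theta$ into $\operatorname{sech}^{m-1} x \, dx$ for both terms of the integrand, which is the form (5).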
We will denote the map of spheres associated to $h$ by $\tilde h$. If $v : \mathbb{R} \to \mathbb{R}$ is $C^2$ with compact support, then the formula for the second variation of $E$ is
\[
\begin{aligned}
  \frac{d^2}{dt^2} E(h + tv)\Big|_{t=0}
  &= \int_{-\infty}^{\infty} \left[ (v')^2 - \omega \cos(2h) \, v^2 \right] \operatorname{sech}^{m-1} x \, dx \\
  &= -\int_{-\infty}^{\infty} \left[ \left( v' \operatorname{sech}^{m-1} x \right)' + \omega \cos(2h) \, v \operatorname{sech}^{m-1} x \right] v \, dx \\
  &= -\int_{-\infty}^{\infty} \left[ v'' - (m-1) \tanh x \, v' + \omega \cos(2h) \, v \right] v \operatorname{sech}^{m-1} x \, dx.
\end{aligned}
\tag{7}
\]
The first of these integrals will be abbreviated as $Q(v, v)$. Define a weighted Sobolev space $H$ to be the completion of the space of smooth functions $h : \mathbb{R} \to \mathbb{R}$ satisfying
\[
  \int_{-\infty}^{\infty} \left[ (h')^2 + h^2 \right] \operatorname{sech}^{m-1} x \, dx < \infty \tag{8}
\]
with respect to the norm $\|h\|_H^2$ defined by the integral above. Define $E : H \to \mathbb{R}$ by the same formula as in the previous paragraph. $E$ is easily seen to be a smooth function on $H$, with critical points given by functions satisfying the Euler–Lagrange equation given above. We are interested in critical points of $E$ lying in the closed convex set $C \subset H$ given by
\[
  C = \left\{ h \in H \;\Big|\; |h(x)| \le \frac{\pi}{2},\ x \in \mathbb{R} \right\}. \tag{9}
\]
The fact that this set is closed follows from the Sobolev embedding theorem for ordinary Sobolev spaces on a finite interval. We begin by proving a few simple properties of these critical points. Most, but not all, of this can be found in [2, 4]. In fact, we will need to study critical points of functionals which are perturbations of the energy, so define
\[
  E_\nu(h) = \frac{1}{2} \int_{-\infty}^{\infty} \left[ (h')^2 + \omega (1 + \nu) \cos^2 h \right] \operatorname{sech}^{m-1} x \, dx, \tag{10}
\]
where $\nu$ is a $C^2$ function on $\mathbb{R}$ with $|\nu| < 1$ and compact support. The Euler–Lagrange equation for this functional is
\[
  h'' - (m-1) \tanh x \, h' + \frac{\omega}{2} (1 + \nu) \sin 2h = 0. \tag{11}
\]
Given a solution $h \in C$ of this equation, define
\[
  W(x) = \frac{1}{2} (h')^2 + \frac{\omega}{2} (1 + \nu) \sin^2 h. \tag{12}
\]
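We calculate that
\[
\begin{aligned}
  \frac{dW}{dx}
  &= h' h'' + \omega (1 + \nu) \sin h \cos h \, h' + \frac{\omega}{2} \nu' \sin^2 h \\
  &= \left[ h'' + \frac{\omega}{2} (1 + \nu) \sin 2h \right] h' + \frac{\omega}{2} \nu' \sin^2 h \\
  &= (m-1) \tanh x \, (h')^2 + \frac{\omega}{2} \nu' \sin^2 h.
\end{aligned}
\tag{13}
\]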
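Where $\nu' = 0$, identity (13) reduces to $W' = (m-1)\tanh x \, (h')^2$, which is nonnegative for $x > 0$. As a hedged illustration, the sketch below integrates (11) with $\nu = 0$ by a classical fourth-order Runge–Kutta scheme and records $W$ along the trajectory. The parameters ($m = 3$, $\omega = 3$, i.e. the identity map of $S^3$), the initial data $h(0) = 0$, $h'(0) = 0.5$ and the step count are arbitrary choices for the example; the resulting trajectory is a generic solution and need not lie in $C$.

```python
import math

def rhs(x, h, hp, m, omega):
    """h'' from (11) with nu = 0: h'' = (m-1) tanh(x) h' - (omega/2) sin(2h)."""
    return (m - 1) * math.tanh(x) * hp - 0.5 * omega * math.sin(2.0 * h)

def rk4_path(m, omega, h0, hp0, x_end, n_steps):
    """Integrate from x = 0 to x_end with classical RK4; return lists of x, h, h'."""
    dx = x_end / n_steps
    x, h, hp = 0.0, h0, hp0
    xs, hs, hps = [x], [h], [hp]
    for _ in range(n_steps):
        k1h, k1p = hp, rhs(x, h, hp, m, omega)
        k2h = hp + 0.5 * dx * k1p
        k2p = rhs(x + 0.5 * dx, h + 0.5 * dx * k1h, hp + 0.5 * dx * k1p, m, omega)
        k3h = hp + 0.5 * dx * k2p
        k3p = rhs(x + 0.5 * dx, h + 0.5 * dx * k2h, hp + 0.5 * dx * k2p, m, omega)
        k4h = hp + dx * k3p
        k4p = rhs(x + dx, h + dx * k3h, hp + dx * k3p, m, omega)
        h += dx * (k1h + 2.0 * k2h + 2.0 * k3h + k4h) / 6.0
        hp += dx * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        x += dx
        xs.append(x)
        hs.append(h)
        hps.append(hp)
    return xs, hs, hps

def w_values(hs, hps, omega):
    """W from (12) with nu = 0, evaluated along the computed trajectory."""
    return [0.5 * hp * hp + 0.5 * omega * math.sin(h) ** 2
            for h, hp in zip(hs, hps)]

# Illustrative run: W should be nondecreasing on x >= 0, up to rounding.
xs, hs, hps = rk4_path(3, 3.0, 0.0, 0.5, 2.0, 4000)
W = w_values(hs, hps, 3.0)
print(min(b - a for a, b in zip(W, W[1:])))
```

The same integrator run on $x \le 0$ (by the substitution $x \mapsto -x$) would show $W$ decreasing toward the origin, which is the other half of the monotonicity statement.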
Thus, $W$ is increasing when $x \gg 0$, and decreasing when $x \ll 0$. If $W(x_0) \ge \frac{\omega}{2}$ for some $x_0 \gg 0$, then either $h$ is constant with value $\pm\frac{\pi}{2}$ or $h'(x) > \varepsilon > 0$ for all $x > x_0$. The latter is impossible, since $h \in C$. A similar argument applies if $x_0 \ll 0$. Hence, $W(x) \le \frac{\omega}{2}$ for all $x$ sufficiently far from zero. As a consequence, the limits
\[
  \lim_{x \to \pm\infty} W(x) = L_\pm \tag{14}
\]
both exist, and
\[
  \lim_{x \to \pm\infty} W'(x) = 0. \tag{15}
\]
The latter implies that $h'$ approaches zero as $x$ tends to $\pm\infty$. Taken together, the fact that both $W$ and $h'$ have limits at infinity implies that
\[
  \lim_{x \to \pm\infty} \sin^2 h = \frac{2 L_\pm}{\omega}, \tag{16}
\]
and therefore $h$ itself has limits at $\pm\infty$. The Euler–Lagrange equation now implies that $h''$ also has limits at $\pm\infty$, and since $\lim_{x \to \pm\infty} h'(x) = 0$, that limit must be zero. The Euler–Lagrange equation then implies that
\[
  \lim_{x \to \pm\infty} \sin 2h = 0, \tag{17}
\]
which means that $h$ approaches $0$ or $\pm\frac{\pi}{2}$ as $x$ approaches $\pm\infty$. If the limit is zero at either extreme, then $W$ approaches zero as well. Since it is increasing when $x \gg 0$ and decreasing when $x \ll 0$, and nonnegative everywhere, this implies that it must be zero for $x \gg 0$ or $x \ll 0$. The latter implies that $h = 0$ on an open set, so is zero everywhere. Thus, unless $h = h_\infty$,
\[
  \lim_{x \to \pm\infty} h(x) = \pm\frac{\pi}{2}. \tag{18}
\]
This implies that the corresponding map $\tilde h$ between spheres is continuous at the poles of $S^{m+1}$. The Euler–Lagrange equation for $E_\nu$ corresponds to the harmonic map equation in a neighborhood of each pole, so by regularity of continuous solutions of the harmonic map equation, $\tilde h$ must be a smooth map.

It is also possible to compare the perturbed energy of $h$ with that of $h_\infty$. When $m \le 1$, $h_\infty$ has infinite energy, so from here on we will assume $m \ge 2$. Integrating by parts, and using the fact that $h'$ tends to zero at $\pm\infty$, we find that
\[
\begin{aligned}
  E_\nu(h)
  &= \frac{1}{2} \int_{-\infty}^{\infty} \left[ -\left( h' \operatorname{sech}^{m-1} x \right)' h + \omega (1 + \nu) \cos^2 h \, \operatorname{sech}^{m-1} x \right] dx \\
  &= \frac{1}{2} \int_{-\infty}^{\infty} \left[ \frac{1}{2} h \sin 2h + \cos^2 h \right] \omega (1 + \nu) \operatorname{sech}^{m-1} x \, dx.
\end{aligned}
\tag{19}
\]
The function $\frac{1}{2} h \sin 2h + \cos^2 h$ is bounded by 1 when $h \in C$, and is identically equal to 1 only if $h = h_\infty$. Thus, $E_\nu(h) \le E_\nu(h_\infty)$, with equality only if $h = h_\infty$. We summarize this in the following

Proposition 2. Assume that $m \ge 2$. Any critical point $h \in C$ of $E_\nu$ is either the singular map $h_\infty$ or satisfies
1. $\lim_{x \to \infty} h(x) = \pm\frac{\pi}{2}$, $\lim_{x \to -\infty} h(x) = \pm\frac{\pi}{2}$,
2. $E_\nu(h) < E_\nu(h_\infty)$,
3. The index and nullity of the Hessian of $E_\nu$ at $h$ are finite.

Proof. The only remaining issue is the proof of (3). The critical points of $E_\nu$ correspond to harmonic maps with a potential function from $S^{m+1}$ to $S^{n+1}$. As such, they satisfy a quasilinear system of elliptic PDE which, in a neighborhood of each pole, agrees with the usual harmonic map equation. One can calculate the Hessian of the perturbed energy functional on the space of all maps $S^{m+1} \to S^{n+1}$; it corresponds to an elliptic operator which, aside from a zeroth order term deriving from the potential, agrees with the corresponding operator for the usual energy. This is an operator of Laplacian type, so the assertions about the index and nullity of the Hessian follow easily.

It is of interest to consider functions satisfying certain symmetry conditions: either
\[
  h(-x) = h(x) \tag{20}
\]
or
\[
  h(-x) = -h(x). \tag{21}
\]
If $h$ satisfies Condition 2 in the previous proposition, then (20) implies that the corresponding map between spheres is homotopically trivial, while (21) implies that the map between spheres is in the homotopy class of the suspension of $F$. We will let $H^+$ be the set of functions in $H$ satisfying (20), and $H^-$ the set of functions in $H$ satisfying (21); similarly, $C^+ = C \cap H^+$ and $C^- = C \cap H^-$. Notice that, if $\nu(-x) = \nu(x)$ (as we shall assume henceforth), then the left-hand side of (11) satisfies (20) if $h$ does, and it satisfies (21) if $h$ does. This implies that any critical point of $E_\nu$, regarded as a function on $H^+$ or $H^-$, is actually a critical point of $E_\nu$ regarded as a function on $H$.
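The energy comparison in Proposition 2 uses the pointwise bound $\frac{1}{2} h \sin 2h + \cos^2 h \le 1$ for $|h| \le \frac{\pi}{2}$, with equality only at $h = 0$. The short sketch below samples this function on a uniform grid as a numerical illustration of the bound; it is not a proof, and the helper names and grid size are assumptions made for the example.

```python
import math

def phi(h):
    """Integrand factor from (19): (1/2) h sin(2h) + cos(h)^2."""
    return 0.5 * h * math.sin(2.0 * h) + math.cos(h) ** 2

def sample_phi(n):
    """Values of phi on a uniform grid of 2n+1 points in [-pi/2, pi/2]."""
    return [phi(-math.pi / 2 + k * math.pi / (2 * n)) for k in range(2 * n + 1)]

values = sample_phi(1000)
# The maximum should be attained at the grid midpoint (h = 0) and equal 1 up to rounding.
print(max(values), phi(0.0))
```

Analytically, the derivative of this function is $h \cos 2h - \frac{1}{2} \sin 2h$, which is negative on $(0, \frac{\pi}{2})$, and the function is even, so the maximum over $[-\frac{\pi}{2}, \frac{\pi}{2}]$ is attained only at $h = 0$.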
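3. The Index of the Singular Map

In this section, we will show that the index of the singular map corresponding to $h_\infty$ is infinite in certain cases. The Hessian of $E_\nu$ at $h$ is given by
\[
  Q_\nu(v, v) = \int_{-\infty}^{\infty} \left[ (v')^2 - \omega (1 + \nu) \cos(2h) \, v^2 \right] \operatorname{sech}^{m-1} x \, dx \tag{22}
\]
for any $v \in H$. Let
\[
  w = v \operatorname{sech}^{\frac{m-1}{2}} x. \tag{23}
\]
Notice that, for $v$ and $w$ related by (23),
\[
\begin{aligned}
  \int_{-\infty}^{\infty} \left[ (w')^2 + w^2 \right] dx
  &= \int_{-\infty}^{\infty} \left[ (v')^2 + v^2 \right] \operatorname{sech}^{m-1} x \, dx \\
  &\quad + \int_{-\infty}^{\infty} \left[ \frac{(m-1)^2}{4} \operatorname{sech}^{m-1} x \, \tanh^2 x \, v^2 - (m-1) \operatorname{sech}^{m-1} x \, \tanh x \, v v' \right] dx \\
  &= \int_{-\infty}^{\infty} \left[ (v')^2 + v^2 \right] \operatorname{sech}^{m-1} x \, dx
     - \frac{m-1}{2} \int_{-\infty}^{\infty} \left( \operatorname{sech}^{m-1} x \, \tanh x \, v^2 \right)' dx \\
  &\quad + \int_{-\infty}^{\infty} \left[ \frac{m-1}{2} \left[ \operatorname{sech}^{m-1} x \, \tanh x \right]' v^2 + \frac{(m-1)^2}{4} \operatorname{sech}^{m-1} x \, \tanh^2 x \, v^2 \right] dx.
\end{aligned}
\tag{24}
\]
The middle term in the last expression vanishes, while the third is comparable to
\[
  \int_{-\infty}^{\infty} v^2 \operatorname{sech}^{m-1} x \, dx, \tag{25}
\]
so $v \in H$ if and only if $w \in L^2_1$, the usual Sobolev space of $L^2$ functions on $\mathbb{R}$ whose first derivatives are also in $L^2$. A calculation shows that the integral defining the Hessian becomes
\[
\begin{aligned}
  &\int_{-\infty}^{\infty} \left[ \left( w' + \frac{m-1}{2} \tanh x \, w \right)^2 - \omega (1 + \nu) \cos(2h) \, w^2 \right] dx \\
  &\quad = \int_{-\infty}^{\infty} \left[ (w')^2 + (m-1) \tanh x \, w w' + \left( \frac{(m-1)^2}{4} \tanh^2 x - \omega (1 + \nu) \cos(2h) \right) w^2 \right] dx \\
  &\quad = \int_{-\infty}^{\infty} \left[ (w')^2 + \frac{m-1}{2} \left( w^2 \tanh x \right)' - \frac{m-1}{2} w^2 \operatorname{sech}^2 x + \left( \frac{(m-1)^2}{4} \tanh^2 x - \omega (1 + \nu) \cos(2h) \right) w^2 \right] dx \\
  &\quad = \int_{-\infty}^{\infty} \left[ (w')^2 + \left( \frac{(m-1)^2}{4} - \left( \frac{(m-1)^2}{4} + \frac{m-1}{2} \right) \operatorname{sech}^2 x - \omega (1 + \nu) \cos(2h) \right) w^2 \right] dx.
\end{aligned}
\tag{26}
\]
Now if we set $h = h_\infty = 0$, we find that the Hessian at $h_\infty$ is given by
\[
  Q_\nu(w, w) = \int_{-\infty}^{\infty} \left[ (w')^2 + \left( \frac{(m-1)^2}{4} - \left( \frac{(m-1)^2}{4} + \frac{m-1}{2} \right) \operatorname{sech}^2 x - \omega (1 + \nu) \right) w^2 \right] dx. \tag{27}
\]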
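The equality of the two forms (22) and (27) of the Hessian at $h_\infty$ can be checked numerically for a concrete test function. The sketch below takes $\nu = 0$ and $w(x) = e^{-x^2}$ (so that $v = w \cosh^{(m-1)/2} x$), and compares the two integrals with a composite trapezoid rule. The test function, the truncation interval $[-8, 8]$, the grid size and the sample values of $m$ and $\omega$ are illustrative assumptions, not taken from the paper.

```python
import math

def sech(x):
    return 1.0 / math.cosh(x)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule for f on [a, b] with n subintervals."""
    dx = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * dx)
    return total * dx

def hessian_in_v(m, omega, lo=-8.0, hi=8.0, n=16000):
    """Form (22) at h = 0, nu = 0, for v = exp(-x^2) * cosh(x)^((m-1)/2)."""
    p = 0.5 * (m - 1)
    def integrand(x):
        v = math.exp(-x * x) * math.cosh(x) ** p
        dv = v * (-2.0 * x + p * math.tanh(x))  # derivative of v by the product rule
        return (dv * dv - omega * v * v) * sech(x) ** (m - 1)
    return trapezoid(integrand, lo, hi, n)

def hessian_in_w(m, omega, lo=-8.0, hi=8.0, n=16000):
    """Form (27) with nu = 0, for w = exp(-x^2)."""
    q = 0.25 * (m - 1) ** 2
    def integrand(x):
        w = math.exp(-x * x)
        dw = -2.0 * x * w
        potential = q - (q + 0.5 * (m - 1)) * sech(x) ** 2 - omega
        return dw * dw + potential * w * w
    return trapezoid(integrand, lo, hi, n)

# Illustrative comparison for the identity maps of S^3 and S^5 (omega = m).
for m, omega in [(3, 3.0), (5, 5.0)]:
    print(m, hessian_in_v(m, omega), hessian_in_w(m, omega))
```

In the $w$ variable the weight $\operatorname{sech}^{m-1} x$ disappears and the Hessian takes the form of a one-dimensional Schrödinger quadratic form whose potential tends to $\frac{(m-1)^2}{4} - \omega$ as $|x| \to \infty$ (where $\nu$ vanishes), so the sign of that constant governs the behavior at infinity.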
Theorem 3. If $\frac{(m-1)^2}{4} < \omega$, then there are finite-dimensional subspaces of $H$ of arbitrarily large dimension on which the Hessian of $E_\nu$ at $h_\infty$ is negative definite.

Proof. Under the assumption that $\frac{(m-1)^2}{4} < \omega$, we can find an $\varepsilon > 0$ and a $K > 0$ such that
\[
  V(x) = \frac{(m-1)^2}{4} - \left( \frac{(m-1)^2}{4} + \frac{m-1}{2} \right) \operatorname{sech}^2 x - \omega (1 + \nu) < -\varepsilon \tag{28}
\]
whenever $|x| > K$. Choose $a, c > 0$ and consider the piecewise linear test function
\[
  F(x) =
  \begin{cases}
    0, & x \notin [c, c + 2a] \\
    x - c, & x \in [c, c + a] \\
    c + 2a - x, & x \in (c + a, c + 2a].
  \end{cases}
  \tag{29}
\]
Then $F$ is in the Sobolev space $L^2_1$, and it is easy to calculate that
\[
  Q_\nu(F, F) = 2a + \int_{-\infty}^{\infty} V(x) F(x)^2 \, dx. \tag{30}
\]
Choosing $c > K$ (so that $V(x) < -\varepsilon$ throughout the support of $F$), we find that
\[
  Q_\nu(F, F) < 2a - \varepsilon \int_{-\infty}^{\infty} F(x)^2 \, dx = 2a - \frac{2 \varepsilon a^3}{3}. \tag{31}
\]
Thus, if $a^2 > 3 \varepsilon^{-1}$, then $Q_\nu(F, F) < 0$. Given some $a$ satisfying this condition, for any positive integer $i$, define $F_i$ as above with $c = K + 2ai$. Then $Q_\nu$ is negative definite on any subspace of $L^2_1$ generated by any finite collection of the $F_i$.

The identity map satisfies the hypothesis of this result for $2 \le m \le 5$. The Hopf maps for which it holds are $S^3 \to S^2$ and $S^7 \to S^4$. Among other maps produced by the Hopf construction, there are maps $S^5 \to S^4$ and $S^9 \to S^8$ which satisfy the hypothesis. There are also maps associated to harmonic eiconals to which the result applies. For example, there are harmonic eiconals of polynomial degree 3 on $\mathbb{R}^5$, $\mathbb{R}^8$, $\mathbb{R}^{14}$ and $\mathbb{R}^{26}$, corresponding to harmonic self-maps of $S^4$, $S^7$, $S^{13}$ and $S^{25}$ with Brouwer degrees 0, 2, 2 and 2 and eigenvalues 18, 27, 45 and 81, respectively. The first three satisfy the hypothesis.

Notice that $h_\infty \in H^+, H^-$. It can be shown that the index of $h_\infty$ as a critical point of $E_\nu$ on either of these spaces continues to be infinite under the same hypothesis on $m$ and $\omega$. This follows by a small modification of the argument above, where the piecewise linear function $F$ given there is replaced by $F(x) + F(-x)$ in the case of $H^+$, and $F(x) - F(-x)$ in the case of $H^-$.

4. Morse Theory Applied

It is now possible to apply the elements of Morse theory on convex sets to the $E_\nu$. As basic references on this subject, we take Chang [5] and Struwe [12]. We first recall the notion of a critical point of a function on a convex set ([5], Definition 6.4 or [12], II,1.3).
Definition 4. $h_0 \in C$ is a $C$-critical point of $E_\nu$ if
\[
  dE_\nu(h_0)(h - h_0) \ge 0 \tag{32}
\]
whenever $h \in C$. Equivalently, let
\[
  g_\nu(h_0) = \inf \, dE_\nu(h_0)(h - h_0), \tag{33}
\]
where $h$ ranges over elements of $C$ with $\|h - h_0\|_H < 1$. Then $h_0$ is a $C$-critical point for $E_\nu$ iff $g_\nu(h_0) = 0$.

It should be noted that $g_\nu$ is a continuous function on $C$. We can define the notions of $C^+$-critical and $C^-$-critical points of $E_\nu$ on $C^+ = C \cap H^+$ and $C^- = C \cap H^-$; it is simply necessary to let $h$ range over elements of $C^\pm$ rather than elements of $C$. It is not hard to see that any $C^\pm$-critical point is also $C$-critical, since the differential of $E_\nu$ at $h$ satisfies the same symmetry condition as $h$. We now compare the $C$-critical points with ordinary ones.

Lemma 5. Any $C$-critical point $h_0$ of $E_\nu$ is a critical point for $E_\nu$ as a function on $H$.

Proof. If $h_0$ is not identically equal to $\pm\frac{\pi}{2}$, then there is a nonempty open set of $\mathbb{R}$ on which it takes values in $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. On this open set, we can test $h_0$ by smooth variations with compact support; when these are sufficiently small, we do not leave $C$. Hence, $h_0$ satisfies the Euler–Lagrange equation for $E_\nu$ on the open set where it is not equal to $\pm\frac{\pi}{2}$. Near any point $x_0 \in \mathbb{R}$ where $h(x_0) = \pm\frac{\pi}{2}$, $h$ only satisfies a variational inequality. The condition given in the definition above implies that
\[
  \int_{-\infty}^{\infty} \left[ h' v' - \frac{\omega}{2} (1 + \nu) \sin(2h) \, v \right] \operatorname{sech}^{m-1} x \, dx \ge 0 \tag{34}
\]
for any smooth $v$ with compact support which is nonpositive near $h^{-1}\left(\frac{\pi}{2}\right)$ and nonnegative near $h^{-1}\left(-\frac{\pi}{2}\right)$. This implies that, as a distribution,
\[
  h'' - (m-1) \tanh x \, h' + \frac{\omega}{2} (1 + \nu) \sin 2h = \mu_+ - \mu_-, \tag{35}
\]
where $\mu_+$, $\mu_-$ are positive Radon measures supported on $h^{-1}\left(\frac{\pi}{2}\right)$, $h^{-1}\left(-\frac{\pi}{2}\right)$, respectively. $\mu_+$, $\mu_-$ are the distributional derivatives of two monotone functions $F_+$, $F_-$ which are locally constant outside of $h^{-1}\left(\frac{\pi}{2}\right)$, $h^{-1}\left(-\frac{\pi}{2}\right)$, respectively. This implies that
\[
  h'(x) = C + \int_0^x \left[ (m-1) \tanh t \, h' - \frac{\omega}{2} (1 + \nu) \sin(2h) \right] dt + F_+ - F_-. \tag{36}
\]
Thus, $h'$ is continuous, except possibly for jump discontinuities on $h^{-1}\left(\frac{\pi}{2}\right)$, $h^{-1}\left(-\frac{\pi}{2}\right)$, with upward jumps on the former and downward jumps on the latter. If $x_0 \in h^{-1}\left(\frac{\pi}{2}\right)$, then $x < x_0$ implies
\[
  \frac{h(x) - h(x_0)}{x - x_0} \ge 0, \tag{37}
\]
while $x > x_0$ implies
\[
  \frac{h(x) - h(x_0)}{x - x_0} \le 0. \tag{38}
\]
The only way this can be compatible with the fact that $h'$ is only allowed upward jumps on $h^{-1}\left(\frac{\pi}{2}\right)$ is if $h'$ is continuous at $x_0$ with $h'(x_0) = 0$. A similar argument applies if $x_0 \in h^{-1}\left(-\frac{\pi}{2}\right)$. Hence, $h'$ is continuous on $\mathbb{R}$. Choose a maximal interval $I$ contained in $h^{-1}\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. If $I$ is not all of $\mathbb{R}$, then there is some endpoint $x_0$ contained in either $h^{-1}\left(\frac{\pi}{2}\right)$ or $h^{-1}\left(-\frac{\pi}{2}\right)$. On $I$, $h$ coincides with a smooth solution of the Euler–Lagrange equation for $E_\nu$, and extends to $x_0$ as a $C^1$ function with $h(x_0) = \pm\frac{\pi}{2}$ and $h'(x_0) = 0$. But the only solution with this property is constant, contradicting the assumption that $h$ does not attain $\pm\frac{\pi}{2}$ as values in $I$. Hence, $I = \mathbb{R}$, and $h$ satisfies the Euler–Lagrange equation everywhere.

In order to apply Morse theory, the following result is needed.

Proposition 6. If $m > 1$, the functional $E_\nu : C \to \mathbb{R}$ satisfies the Palais–Smale condition, i.e. if $(h_i) \subset C$ is a sequence with $E_\nu(h_i)$ uniformly bounded and
\[
  \lim_{i \to \infty} g_\nu(h_i) = 0, \tag{39}
\]
then a subsequence of the $h_i$ converges strongly to a critical point of $E_\nu$ in $C$.

Proof. $E_\nu$ is a smooth function on $H$. The fact that $E_\nu(h_i)$ is uniformly bounded implies that
\[
  \int_{-\infty}^{\infty} (h_i')^2 \operatorname{sech}^{m-1} x \, dx \tag{40}
\]
is uniformly bounded. Since $h_i \in C$ and $m > 1$, this implies that $\|h_i\|_H$ is uniformly bounded so, by passing to a subsequence if necessary, we can assume that $h_i$ converges weakly in $H$ to some $h \in C$. By Rellich's Lemma, the restriction of $(h_i)$ to any bounded interval $[-a, a]$ is precompact in $L^2([-a, a])$. Set $h_{0,i} = h_i$. Then, for each positive integer $k$, we can choose a subsequence $(h_{k,i})$ of $(h_{k-1,i})$ so that $h_{k,i}$ converges in $L^2([-k, k])$ as $i \to \infty$. Replacing $(h_i)$ by the diagonal subsequence, we may assume that $(h_i)$ converges in $L^2$ on any bounded interval. On the other hand, if $k$ is large enough, then
\[
  \int_{|x| \ge k} |h_i|^2 \operatorname{sech}^{m-1} x \, dx \tag{41}
\]
is as small as we like, since $|h_i| \le \frac{\pi}{2}$. This implies that
\[
  \int_{-\infty}^{\infty} |h_i - h_j|^2 \operatorname{sech}^{m-1} x \, dx \to 0 \tag{42}
\]
as $i, j \to \infty$. We can write
\[
  dE_\nu(h_i)(h_i - h_j) - dE_\nu(h_j)(h_i - h_j) \tag{43}
\]
\[
  = 2 \int_{-\infty}^{\infty} \left[ (h_i' - h_j')^2 - \omega (1 + \nu) (\sin h_i \cos h_i - \sin h_j \cos h_j)(h_i - h_j) \right] \operatorname{sech}^{m-1} x \, dx. \tag{44}
\]
The second term is bounded in absolute value by
\[
  2 \omega \int_{-\infty}^{\infty} (1 + \nu) |h_i - h_j|^2 \operatorname{sech}^{m-1} x \, dx, \tag{45}
\]
so tends to zero as $i, j \to \infty$. The fact that $g_\nu(h_i)$ tends to zero implies that the expression in (43) is bounded by arbitrarily small positive numbers for $i, j \gg 0$, which implies that
\[
  \int_{-\infty}^{\infty} (h_i' - h_j')^2 \operatorname{sech}^{m-1} x \, dx \tag{46}
\]
tends to zero as $i, j \to \infty$. This implies that the subsequence converges to $h$ in $H$.
Recall that a function $F : M \to \mathbb{R}$ on a Hilbert manifold is said to be a Morse function if its critical points are isolated and have nondegenerate Hessians. Define
\[
  \hat H = \left\{ h \in H \;\Big|\; h \not\equiv k\pi,\ h \not\equiv \frac{\pi}{2} + k\pi,\ k \in \mathbb{Z} \right\}. \tag{47}
\]
Theorem 1.1 of [8] implies that, for a generic set of $\nu$ in the space of compactly supported $C^2$ functions on the line, $E_\nu$ is a Morse function on $\hat H$. In particular, we can choose a sequence $(\nu_j)$ converging uniformly to zero so that $E_j = E_{\nu_j}$ is a Morse function on $\hat H$ for each $j$. When $|\nu| < 1$, it is straightforward to see that $h \equiv \frac{\pi}{2} + k\pi$ gives a global minimum for $E_\nu$ on $H$ and that the Hessian is positive definite, so for $j \gg 0$, each $E_{\nu_j}$ is a Morse function on the enlarged Hilbert manifold given by
\[
  \left\{ h \in H \;\big|\; h \not\equiv k\pi,\ k \in \mathbb{Z} \right\}. \tag{48}
\]
Note that $h_\infty$ is a critical point for any $E_\nu$, and by the result in the previous section, has infinite index iff it has infinite index as a critical point for $E$. It will be important for our purposes to use the symmetry $E_\nu(h) = E_\nu(-h)$. To exploit this, we will work on the space $\tilde C = (C - \{h_\infty\})/\pm$. $\tilde C$ is a locally convex set in the sense of [5], Definition 6.2, and each $E_\nu$ descends to a smooth function on $\tilde C$ satisfying the Palais–Smale condition on the subset of $\tilde C$ where $E_\nu < E_\nu(h_\infty)$. The basic deformation lemma of Morse theory, adapted to our context, is the following.

Lemma 7. Fix $\nu$ as before, some $\lambda < E_\nu(h_\infty)$ and $\bar\varepsilon > 0$. Let $C_\lambda = E_\nu^{-1}(-\infty, \lambda)$, and $\tilde C_\lambda = C_\lambda/\pm$. Let $K_\lambda$ be the set of critical points of $E_\nu$ in $\tilde C$ with $E_\nu(h) = \lambda$, and let $N$ be an open neighborhood of $K_\lambda$ in $\tilde C$. Then there exist $\varepsilon \in (0, \bar\varepsilon)$ and a continuous map $\Phi : [0, 1] \times \tilde C \to \tilde C$ such that
1. $\Phi(t, h) = h$ if either $t = 0$, $|E_\nu(h) - \lambda| \ge \bar\varepsilon$ or $h$ is a critical point of $E_\nu$;
2. $E_\nu(\Phi(t, h))$ is nonincreasing as a function of $t$ for any $h$;
3. $\Phi(1, \tilde C_{\lambda+\varepsilon} - N) \subset \tilde C_{\lambda-\varepsilon}$; and
4. $\Phi(1, \tilde C_{\lambda+\varepsilon}) \subset \tilde C_{\lambda-\varepsilon} \cup N$.

Proof. This is essentially [12], II,1.9. The only difference is that Struwe works there with an actual convex set rather than the kind of quotient we are dealing with. Thus, one has to ensure that the construction of the map $\Phi$ is invariant under $\pm$, which is straightforward. Compare [5], Theorem 3.3 and Sect. 6.2.
The other result from Morse theory we will need is concerned with the way the topology of $\tilde C_\lambda$ changes as $\lambda$ passes a critical value of $E_\nu$. Before we can apply this, we need to verify the assumption in [12], II, 3.3. Let $h \in \tilde C - \{h_\infty\}$ be a critical point for $E_\nu$, and let $Q_h$ be the Hessian of $E_\nu$ at $h$. As mentioned previously, $H$ decomposes into a direct sum
\[
  H = H_+ \oplus H_0 \oplus H_-, \tag{49}
\]
corresponding to the subspaces on which $Q_h$ is positive-definite, zero and negative-definite, and the latter two subspaces are finite-dimensional. In fact, the dimension of $H_0$ is at most 1. That the dimension is at most two follows from the fact that the relevant differential equation has a 2-dimensional space of solutions locally; that it is at most 1 follows from the fact that the two points at infinity fall into the limit point case of Weyl's classification of singular points for a second order differential equation. The assumption we need in order to apply the theory of [12] is verified by the following.

Lemma 8. If $h \in C - \{h_\infty\}$ is a critical point of $E_\nu$, then there is an open neighborhood $U$ of $0 \in H_-$ such that $h + U \subset C$.

Proof. For any $v, w \in H$,
\[
  Q_h(v, w) = \int_{-\infty}^{\infty} \left[ v' w' - \omega (1 + \nu) \cos(2h) \, v w \right] \operatorname{sech}^{m-1} x \, dx. \tag{50}
\]
Fixing $v$ and letting $w$ vary over $H$, we obtain a bounded linear functional on $H$ so there is a bounded linear operator $A : H \to H$ such that
\[
  Q_h(v, w) = \langle Av, w \rangle_H. \tag{51}
\]
$A$ is a symmetric operator, and $H_-$ is a direct sum of the eigenspaces for $A$ corresponding to negative eigenvalues. Suppose that $Av = \lambda v$ with $\lambda < 0$. Then the fact that $Q_h(v, w) = \lambda \langle v, w \rangle_H$ for all smooth compactly supported $w$ implies that
\[
  \left( v' \operatorname{sech}^{m-1} x \right)' = (\lambda - 1)^{-1} \left[ \lambda + \omega (1 + \nu) \cos 2h \right] v \operatorname{sech}^{m-1} x. \tag{52}
\]
When $x$ is sufficiently large in absolute value, $\lambda + \omega (1 + \nu) \cos 2h$ is negative. This implies that $v' \operatorname{sech}^{m-1} x$ is increasing when $v$ is positive, and is decreasing when $v$ is negative. Hence, $v$ cannot be zero when $x$ is large in absolute value. We can choose a basis $v_1, \ldots, v_N$ for $H_-$ consisting of eigenfunctions corresponding to eigenvalues $\lambda_1, \ldots, \lambda_N$, satisfying the condition that $v_i > 0$ for $x \gg 0$. Consider $v = \sum_i \varepsilon_i v_i$ and suppose that $\lim_{x \to \infty} h(x) = \frac{\pi}{2}$. We need to show that $h + v \le \frac{\pi}{2}$ for all sufficiently small $\varepsilon_i$. This is true on any bounded interval, so we need only show it is so when $x \gg 0$. Let $g = \frac{\pi}{2} - h$, $w = \sum_i v_i$ and define the following Wronskian-like quantity:
\[
  W(x) = \left[ g'(x) w(x) - g(x) w'(x) \right] \operatorname{sech}^{m-1} x. \tag{53}
\]
Then
\[
\begin{aligned}
  W'(x)
  &= \left( g'(x) \operatorname{sech}^{m-1} x \right)' w(x) - g(x) \left( w'(x) \operatorname{sech}^{m-1} x \right)' \\
  &= -\frac{\omega}{2} \sin 2h \, \operatorname{sech}^{m-1} x \, w(x) - g(x) \sum_i (\lambda_i - 1)^{-1} \left[ \lambda_i + \omega (1 + \nu) \cos 2h \right] v_i \operatorname{sech}^{m-1} x.
\end{aligned}
\tag{54}
\]
This is negative when $x \gg 0$. On the other hand, $\lim_{x \to \infty} W(x) = 0$, so $W(x) > 0$ for $x \gg 0$. This implies that $(w/g)' < 0$. Hence, $(w/g) < C$ for some $C > 0$ and $x \gg 0$, so the required condition holds when $\varepsilon_i < C^{-1}$ for each $i$. A similar argument applies when $\lim_{x \to \infty} h(x) = -\frac{\pi}{2}$ or $x$ tends to $-\infty$.

With this in place, we can state the second result from Morse theory.

Theorem 9. Suppose $\lambda$ is a critical value of $E_j : \tilde C \to \mathbb{R}$ with $\lambda < E_j(h_\infty)$, where, as defined above, $E_j = E_{\nu_j}$. There are finitely many critical points $p_1, \ldots, p_N$ in $E_j^{-1}(\lambda)$. If the indices of these critical points are $i_1, \ldots, i_N$, respectively, then, for any sufficiently small $\varepsilon > 0$, $\tilde C_{\lambda+\varepsilon}$ is homotopy equivalent to $\tilde C_{\lambda-\varepsilon}$ with disks of dimensions $i_1, \ldots, i_N$ attached along their boundaries. (If $i_k = 0$, then we add a point to $\tilde C_{\lambda-\varepsilon}$ as a disjoint component.)

Proof. The fact that there are only finitely many critical points follows from the fact that each critical point of $E_j$ (except possibly $h_\infty$) is isolated together with the Palais–Smale condition. The rest is II, Theorem 3.6 in [12].

We are now in a position to prove our main result. Define the extended index of a critical point $h$ of $E_\nu$ to be the sum of the dimensions of $H_-$ and $H_0$. Since $\dim H_0 \le 1$, the extended index of $h$ is either $i$ or $i + 1$, where $i$ is the index of $h$.

Theorem 10. Suppose that $F : S^m \to S^n$ is an eigenmap with eigenvalue $\omega$, $m > 1$, and
\[
  \frac{(m-1)^2}{4} < \omega. \tag{55}
\]
There is an infinite sequence $(h_k) \subset \tilde C$ of critical points for $E$ such that
1. the extended index of $h_k$ is at least $k$, whereas the index of $h_k$ is at most $k$, and
2. the $h_k$ converge strongly to $h_\infty$.

It should be remembered that each $h_k$ corresponds to a pair of harmonic maps $S^{m+1} \to S^{n+1}$. If $F$ is the identity map, then the eigenvalue is $m$, and the assumption reduces to $2 \le m \le 5$, which gives back the results of [2, 4].

Proof. $C$ is a contractible space, as is $C - \{h_\infty\}$. This implies that $\tilde C$ has the homotopy type of the classifying space for $\mathbb{Z}_2$, i.e. that of an infinite-dimensional real projective space. It follows that the cohomology ring of $\tilde C$ is the polynomial ring $\mathbb{Z}_2[x]$, where $x$ has degree 1. Since $h_\infty$ has infinite index as a critical point of $E_j$ when $j \gg 0$, we can choose, for any positive integer $k$, a $(k+1)$-dimensional subspace $V$ of $H$ on which the Hessian of $E_j$ at $h_\infty$ is negative definite. Let $S$ be a small sphere in $V$ centered at the origin; when $S$ is sufficiently small, $E_j$ takes values strictly less than $E_j(h_\infty)$ on $h_\infty + S$. This implies that the nontrivial homology class in $\tilde C$ of degree $k$ exists already in some $\tilde C_\lambda$ with $\lambda < E_j(h_\infty)$. As described in the previous theorem, the homotopy type of $\tilde C_\lambda$ is obtained by taking a point (corresponding to the unique global minimum of $E_j$ in $\tilde C$) and attaching disks of dimensions determined by the indices of critical points with energies less than $\lambda$. The only way to create a homology class of degree $k$ is by attaching a disk of dimension $k$. Hence, $E_j$ must have at least one critical point of every possible index, for each $j \gg 0$.
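Now choose a critical point $f_j$ of $E_j$ of index k for each $j \gg 0$. We will show that $(f_j)$ satisfies the hypothesis of the Palais–Smale condition for E. We know that
$$ E_j(f_j) < E_j(h_\infty) = \frac{\omega}{2}\int_{-\infty}^{\infty} (1 + \nu_j)\operatorname{sech}^{m-1} x\, dx \le \omega \int_{-\infty}^{\infty} \operatorname{sech}^{m-1} x\, dx, \tag{56} $$
so $E_j(f_j)$ is bounded independent of j. On the other hand, the fact that $\nu_j$ converges uniformly to zero implies that
$$ E_j(f_j) = \frac{1}{2}\int_{-\infty}^{\infty} \left[(f_j')^2 + \omega(1+\nu_j)\cos^2 f_j\right]\operatorname{sech}^{m-1} x\, dx \tag{57} $$
$$ \phantom{E_j(f_j)} \ge \frac{1}{4}\int_{-\infty}^{\infty} \left[(f_j')^2 + \omega\cos^2 f_j\right]\operatorname{sech}^{m-1} x\, dx = \frac{1}{2}E(f_j) \tag{58} $$
when $j \gg 0$. Thus, E is uniformly bounded on the sequence $(f_j)$. On the other hand, $f_j$ satisfies the Euler–Lagrange equation
$$ f_j'' - (m-1)\tanh x\, f_j' + \frac{\omega}{2}(1+\nu_j)\sin 2f_j = 0, \tag{59} $$
which implies that
$$ dE(f_j)(v) = -2\int_{-\infty}^{\infty}\left[f_j'' - (m-1)\tanh x\, f_j' + \frac{\omega}{2}\sin 2f_j\right] v \operatorname{sech}^{m-1} x\, dx \tag{60} $$
$$ \phantom{dE(f_j)(v)} = -\omega\int_{-\infty}^{\infty} \nu_j\, v \sin 2f_j \operatorname{sech}^{m-1} x\, dx. \tag{61} $$
The integral on the last line is bounded in absolute value by
$$ \omega\, \|\nu_j\|_{C^0}\int_{-\infty}^{\infty} |v| \operatorname{sech}^{m-1} x\, dx. \tag{62} $$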
This tends to zero as $j \to \infty$, so $(f_j)$ satisfies the Palais–Smale condition. We can thus choose a subsequence which converges in H to some critical point $h_k$ of E in C. We need to show that $h_k \ne h_\infty$, and that its extended index is at least k. Define $c_k$ to be the infimum of all λ such that the nontrivial homology class of degree k in $\tilde{C}$ can be represented by a cycle in $E^{-1}((-\infty, \lambda))$. From the fact that the class can be represented as described above by an embedding of a real projective space of dimension k in $\tilde{C}$ along which the energy is everywhere less than $E(h_\infty)$, it follows that $c_k < E(h_\infty)$. The fact that
$$ |E_j(h) - E(h)| = \frac{1}{2}\left|\int_{-\infty}^{\infty} \nu_j \cos^2 h \operatorname{sech}^{m-1} x\, dx\right| \le C \sup_{x\in\mathbb{R}} |\nu_j(x)| \tag{63} $$
implies that $E_j$ converges uniformly to E on H. Thus, for any $\epsilon \in (0, \tfrac{1}{4}(E(h_\infty) - c_k))$, $j \gg 0$ implies that
$$ E^{-1}\left(\left(-\infty, c_k + \frac{\epsilon}{2}\right)\right) \subset E_j^{-1}((-\infty, c_k + \epsilon)) \tag{64} $$
and $c_k + \epsilon < E_j(h_\infty)$. But this means that $E_j^{-1}((-\infty, c_k + \epsilon))$ must contain some critical point of index k. We can therefore assume that $E_j(f_j) < c_k + \epsilon$, which would imply that $E(h_k) < c_k + \epsilon$. This shows that $h_k \ne h_\infty$.
To see that the extended index of $h_k$ is at least k, we can look at the difference between the Hessians of $E_j$ and E at $f_j$ and $h_k$, respectively. We find
$$ D^2E_j(f_j)(v, w) - D^2E(h_k)(v, w) = \frac{\omega}{2}\int_{-\infty}^{\infty} [\cos 2h_k - (1+\nu_j)\cos 2f_j]\, v w \operatorname{sech}^{m-1} x\, dx. \tag{65} $$
This tends to zero as $j \to \infty$, uniformly in v, w as they range over any bounded set in H. This implies that the Hessian of $E_j$ at $f_j$ converges to that of E at $h_k$. The extended index is upper semicontinuous on the space of continuous quadratic forms on H, so the extended index of $h_k$ is at least k. Similarly, the index is lower semicontinuous, so the index cannot be greater than k. Finally, the sequence of $h_k$ satisfies the hypothesis of the Palais–Smale condition, so converges to some critical point of E. By an argument similar to the one just given, the limit must have infinite index, so the limiting critical point must be $h_\infty$.

The analogous argument can be carried out for $C^+$ and $C^-$. This leads to the following conclusion.

Theorem 11. Suppose that $F : S^m \to S^n$ is an eigenmap with eigenvalue ω, m > 1, and
$$ \frac{(m-1)^2}{4} < \omega. \tag{66} $$
There are infinite sequences of critical points for E in $C^+$ and $C^-$, each of which converges strongly to $h_\infty$.

These are the generalizations of the infinite sequences of degree 0 and degree 1 harmonic maps found in [2, 4]. It is of interest to ask which homotopy classes of maps between spheres can be represented as suspensions of eigenmaps of spheres. As mentioned previously, the Hopf maps $S^3 \to S^2$ and $S^7 \to S^4$ are eigenmaps and satisfy the hypothesis of Theorem 2.1. Therefore, the homotopy classes of their suspensions contain infinitely many harmonic representatives. In the case of the map $S^3 \to S^2$, we obtain a map representing the nontrivial class in $\pi_4(S^3) = \mathbb{Z}_2$. In the case of $S^7 \to S^4$, we obtain a generator of $\pi_8(S^5) = \mathbb{Z}_{24}$. The maps $S^5 \to S^4$ and $S^9 \to S^8$ mentioned in Sect. 2 produce infinite families of harmonic maps of the form $S^6 \to S^5$ and $S^{10} \to S^9$. The relevant homotopy groups are again isomorphic to $\mathbb{Z}_2$, but we do not know whether the suspensions of the two original maps represent the nontrivial class. The maps associated to the cubic harmonic eiconals on $\mathbb{R}^8$ and $\mathbb{R}^{14}$ give infinite sequences of harmonic self-maps of degrees 0 and 2 defined on $S^8$ and $S^{14}$.

As we have already mentioned, there are other settings where similar ideas may apply. We will briefly summarize the characteristics of the problem discussed here which make the argument above possible.
1. The configuration space is contractible, being in this case a Hilbert space.
2. The energy functional satisfies the Palais–Smale condition.
3. There is a reflection symmetry of the configuration space preserving the energy functional. There is a unique fixed point for this symmetry, corresponding to a critical point for the energy.
4. The index of the fixed point is infinite.
5. All critical points with energy less than that of the fixed point have finite index.
6. Possibly after small perturbations of the energy, the critical points with energy less than that of the fixed point are nondegenerate.

Of course, there are variations of these conditions which may be treated along similar lines.

References
1. Bizoń, P.: Saddle-point solutions in Yang-Mills-dilaton theory. Phys. Rev. D47, 1656–1663 (1993)
2. Bizoń, P.: Harmonic maps between three-spheres. Proc. Roy. Soc. London Ser. A 451, 779–793 (1995)
3. Bizoń, P.: Equivariant self-similar wave maps from Minkowski spacetime into the 3-sphere. math-ph/9910026
4. Bizoń, P. and Chmaj, T.: Harmonic maps between spheres. Proc. Roy. Soc. London Ser. A 453, 403–415 (1997)
5. Chang, K.-c.: Infinite-dimensional Morse theory and multiple solution problems. Boston: Birkhäuser Boston Inc., 1993
6. Ding, W. Y.: Symmetric harmonic maps between spheres. Commun. Math. Phys. 118, 641–649 (1988)
7. Eells, J. and Ratto, A.: Harmonic maps and minimal immersions with symmetries. Methods of ordinary differential equations applied to elliptic variational problems. Princeton, NJ: Princeton University Press 1993
8. Motreanu, D.: Generic existence of Morse functions on infinite-dimensional Riemannian manifolds and applications. In: Global differential geometry and global analysis (Berlin, 1990), (Ferus, D., Pinkall, U., Simon, U., and B. Wegner, eds.) Berlin: Springer 1991, pp. 175–184
9. Pettinati, V. and Ratto, A.: Existence and nonexistence results for harmonic maps between spheres. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 17, 273–282 (1990)
10. Smith, R. T.: Harmonic mappings of spheres. Bull. Amer. Math. Soc. 78, 593–596 (1972)
11. Smith, R. T.: Harmonic mappings of spheres. Amer. J. Math. 97, 364–385 (1975)
12. Struwe, M.: Plateau's problem and the calculus of variations. Princeton, NJ: Princeton University Press 1988
13. Sudarsky, D. and Wald, R. M.: Extrema of mass, stationarity, and staticity, and solutions to the Einstein-Yang-Mills equations. Phys. Rev. D (3) 46, 1453–1474 (1992)
14. Volkov, M. and Gal'tsov, D.: Gravitating non-abelian solitons and black holes with Yang-Mills fields. Physics Reports 319, 1–83 (1999)

Communicated by A. Jaffe
Commun. Math. Phys. 215, 609 – 629 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Semiclassical Limit for the Schrödinger Equation with a Short Scale Periodic Potential Frank Hövermann, Herbert Spohn, Stefan Teufel Zentrum Mathematik, Technische Universität München, 80290 München, Germany. E-mail: [email protected]; [email protected] Received: 7 February 2000 / Accepted: 7 July 2000
Abstract: We consider the dynamics generated by the Schrödinger operator $H = -\frac{1}{2}\Delta + V(x) + W(\varepsilon x)$, where V is a lattice periodic potential and W an external potential which varies slowly on the scale set by the lattice spacing. We prove that in the limit $\varepsilon \to 0$ the time dependent position operator and, more generally, semiclassical observables converge strongly to a limit which is determined by the semiclassical dynamics.

1. Introduction

A basic problem of solid state physics is to understand the motion of electrons in the periodic potential which is generated by the ionic cores. While this problem is quantum mechanical, many electronic properties of solids can be understood already in the semiclassical approximation [2, 16, 26, 19]. One argues that if the wave packet spreads over many lattice spacings, the kinetic energy $(\hbar k)^2/2m$ is modified to the nth band energy $E_n(k)$. Otherwise the electron responds to external fields, $E_{\mathrm{ex}}$, $B_{\mathrm{ex}}$, as in the case of vanishing periodic potential. Thus the semiclassical equations of motion are
$$ \dot r = v_n(k) = \nabla_k E_n(k), \qquad \hbar \dot k = e\left(E_{\mathrm{ex}}(r) + v_n(k) \wedge B_{\mathrm{ex}}(r)\right), \tag{1} $$
where r is the position and k the quasimomentum of the electron. Note that there is a semiclassical evolution for each band separately. The goal of our paper is to understand on a mathematical level how these semiclassical equations arise from the underlying Schrödinger equation. We consider only the case where $B_{\mathrm{ex}} = 0$. The setup is rather obvious. We start from the Schrödinger equation
$$ i\frac{\partial}{\partial t}\psi = H\psi \tag{2} $$
with Hamiltonian
$$ H = -\frac{1}{2}\Delta + V(x) + W(\varepsilon x). \tag{3} $$
The electron moves in $\mathbb{R}^d$ and the solution to (2) defines the unitary time evolution $U^\varepsilon(t)\psi(x) = e^{-itH}\psi(x) = \psi(x, t)$ in $L^2(\mathbb{R}^d)$. We have chosen units such that $\hbar = 1$ and the mass of the particle m = 1. V(x) is a periodic potential with average lattice spacing a. The precise conditions on V will be spelled out in the following section, where we also describe the direct fiber integral decomposition for periodic Schrödinger operators. The lattice spacing a defines the microscopic spatial scale. $W(\varepsilon x)$ is an external electrostatic potential with dimensionless scale parameter ε, $\varepsilon \ll 1$, which means that W is slowly varying on the scale of the lattice. For real metals the condition of slow variation is satisfied even for the strongest external electrostatic fields available, cf. [2], Chapter 13. The external forces due to W are of order ε and therefore have to act over a time of order $\varepsilon^{-1}$ to produce finite changes, which defines the macroscopic time scale.

We will mostly work in the microscopic coordinates (x, t) of (2). For the sake of comparison we note that the macroscopic space-time scale $(x', t')$ is defined through $x = \varepsilon^{-1}x'$ and $t = \varepsilon^{-1}t'$. With this scale change Eqs. (2), (3) read
$$ i\varepsilon\frac{\partial}{\partial t'}\psi^\varepsilon = H^\varepsilon\psi^\varepsilon, \qquad H^\varepsilon = -\frac{\varepsilon^2}{2}\Delta_{x'} + V(x'/\varepsilon) + W(x') \tag{4} $$
with initial conditions $\psi^\varepsilon(x') = \varepsilon^{-d/2}\psi(x'/\varepsilon)$. If V = 0, Eq. (4) is the usual semiclassical limit with ε set equal to $\hbar$. Thus our problem is to understand how an additional periodic, but rapidly oscillating potential modifies the standard picture.

The two scale problem (2), (3) can be attacked along several routes. A first choice would be time dependent WKB [5, 6, 10, 12]. In the limit $\varepsilon \to 0$, for each energy band separately, one obtains a Hamilton–Jacobi equation for the phase and a transport equation for the amplitude of the wave function ψ(x, t). As a main drawback of this method, generically, the solution to the Hamilton–Jacobi equation develops singularities after some finite macroscopic time. If V = 0, it is well understood how to go beyond such caustics by introducing new coordinates on the Lagrangian manifold. For (2), (3) a corresponding program has not yet been attempted. The results [5, 6, 10, 12] are valid only over a finite macroscopic time span with a duration depending on the initial wave function.

Another variant is to establish the semiclassical limit through the convergence of Wigner functions. In our context one defines a band Wigner function $W_n^\varepsilon(r, k, t)$ depending on the band index n and as a function of the position and quasimomentum. One then wants to prove that in the limit $\varepsilon \to 0$, $W_n^\varepsilon(t)$ converges to $W_n(t)$, which is the initial band Wigner function $W_n(0)$ evolved according to the semiclassical flow (1). Such a result is established in [9, 18] for the case of zero external potential, the general case being left open as a challenging problem.

A third approach to the semiclassical limit is the strong convergence of Heisenberg operators [1, 4, 23]. We briefly recall its main features for V = 0. We define, as unbounded
operators on $L^2(\mathbb{R}^d)$,
$$ x(t) := e^{itH} x\, e^{-itH}, \qquad p(t) := e^{itH} p\, e^{-itH}, \qquad p = -i\nabla_x, $$
where H is the Hamiltonian in (3) with V = 0. The goal is to establish the strong limit of $x^\varepsilon(t)\psi = \varepsilon x(\varepsilon^{-1}t)\psi$, $p^\varepsilon(t)\psi = p(\varepsilon^{-1}t)\psi$ as $\varepsilon \to 0$ with ψ in a suitable domain. In the trivial case of free motion, W = 0, this amounts to the strong convergence of $x^\varepsilon(t)\psi = (\varepsilon x + pt)\psi$, $p^\varepsilon(t)\psi = p\psi$, which yields $\lim_{\varepsilon\to 0} x^\varepsilon(t) = pt$, $\lim_{\varepsilon\to 0} p^\varepsilon(t) = p$. The general case requires more work [22]. One obtains the strong limits
$$ \lim_{\varepsilon\to 0} x^\varepsilon(t) = r(p, t), \qquad \lim_{\varepsilon\to 0} p^\varepsilon(t) = u(p, t). \tag{5} $$
Here r(p, t), u(p, t) are solutions of
$$ \dot r = u, \qquad \dot u = -\nabla W(r) \tag{6} $$
with initial conditions $r_0 = 0$, $u_0 = p$. The initial condition $r_0 = 0$ reflects that $|\psi|^2$ looks like δ(r) on the macroscopic scale, provided that $\|\psi\|^2 = 1$. For general initial conditions, $r_0 \ne 0$, we would have to shift the initial ψ by $\varepsilon^{-1}r_0$.

The strong operator convergence may look slightly abstract, but all the desired physical information can be deduced. E.g., the initial ψ defines the momentum distribution $|\hat\psi(k)|^2$ independent of ε and the δ(r) spatial distribution in the limit $\varepsilon \to 0$. Then, according to (5), for small ε the position distribution at time t is given by
$$ \int_{\mathbb{R}^d} f(x)\,|\psi^\varepsilon(x, t)|^2\, dx = (\psi, f(x^\varepsilon(t))\psi) \approx (\psi, f(r(p, t))\psi) = \int |\hat\psi(k)|^2 f(r(k, t))\, dk, $$
which means that the phase space distribution $\delta(r)|\hat\psi(k)|^2\, dr\, dk$ is transported according to the semiclassical flow (6). The spatial marginal of this distribution at time t is the desired approximation to the true position distribution $|\psi^\varepsilon(x, t)|^2$. $|\psi^\varepsilon(x, t)|^2$ may oscillate rapidly on small scales and some averaging, as embodied by the test function f, is needed.

In this paper we investigate the semiclassical limit (2), (3) through the strong convergence of the position operator $x^\varepsilon(t)$. We will show that, in the limit $\varepsilon \to 0$, $x^\varepsilon(t)$ is diagonal with respect to the band index and in each band the structure is analogous to (5) with p replaced by the quasimomentum k and (6) replaced by (1). More generally we will consider the semiclassical limit of the Weyl quantized operators $a^W(\varepsilon x, p)$, whose classical symbol is periodic in p.

To give a short outline: In the following section we collect some properties of periodic Schrödinger operators. In Sect. 3 we state our main results, which are proved in Sects. 5, 6, 7 and 8, respectively. In Sect. 4 we discuss some implications for the position and quasimomentum distributions, and, more generally, for the band Wigner functions. The difficulties arising from band crossings are explained in Sect. 9.
2. Periodic Schrödinger Operators

For the periodic potential V we will need only some rather minimal assumptions, which we state as

Condition ($C_{\mathrm{per}}$). Let $\Gamma \cong \mathbb{Z}^d$ be the lattice generated by the basis $\{\gamma_1, \dots, \gamma_d\}$, $\gamma_i \in \mathbb{R}^d$. Then $V(x + \gamma) = V(x)$ for all $x \in \mathbb{R}^d$, $\gamma \in \Gamma$. Furthermore, we assume V to be infinitesimally operator bounded with respect to $H_0$.

The last condition is satisfied, e.g., if $V \in L^p(M)$, where M is the fundamental domain of Γ, and p = 2 for $d \le 3$ and $p > d/2$ for d > 3, respectively. ($C_{\mathrm{per}}$) will be assumed throughout. We recall the Bloch-Floquet theory for the spectral representation of
$$ H_{\mathrm{per}} = \frac{1}{2}p^2 + V(x). \tag{7} $$
The reciprocal lattice $\Gamma^*$ is defined as the lattice generated by the dual basis $\{\gamma_1^*, \dots, \gamma_d^*\}$ determined by $\gamma_i \cdot \gamma_j^* = 2\pi\delta_{ij}$, i, j = 1, ..., d. The fundamental domain of Γ is denoted by M, the one of $\Gamma^*$ by $M^*$. $M^*$ is usually referred to as the first Brillouin zone. If we identify opposite edges of M, respectively $M^*$, then it becomes a flat d-torus denoted by $\mathbb{T} = \mathbb{R}^d/\Gamma$, respectively $\mathbb{T}^* = \mathbb{R}^d/\Gamma^*$. Let us introduce the Bloch-Floquet transformation, which should be viewed as a discrete Fourier transform, through
$$ (\mathcal{U}\psi)(k, x) := \sum_{\gamma\in\Gamma} e^{-i(x+\gamma)\cdot k}\,\psi(x + \gamma), \qquad (k, x) \in \mathbb{R}^{2d}, $$
for $\psi \in \mathcal{S}(\mathbb{R}^d)$. Clearly,
$$ (\mathcal{U}\psi)(k, x + \gamma) = (\mathcal{U}\psi)(k, x), \qquad (\mathcal{U}\psi)(k + \gamma^*, x) = e^{-ix\cdot\gamma^*}(\mathcal{U}\psi)(k, x). \tag{8} $$
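The two relations in (8) can be checked numerically. The following is a minimal sketch, assuming d = 1, the lattice Γ = Z (so that γ* = 2π), a truncated lattice sum, and a Gaussian wave packet as a stand-in for a Schwartz function; the function names and parameter values are illustrative and not part of the paper.

```python
import numpy as np

def bloch_floquet(psi, k, x, n_terms=40):
    """Truncated lattice sum (U psi)(k, x) = sum_gamma exp(-i (x + gamma) k) psi(x + gamma)
    for the one-dimensional lattice Gamma = Z (an assumption of this sketch)."""
    gammas = np.arange(-n_terms, n_terms + 1)
    shifted = x + gammas
    return np.sum(np.exp(-1j * shifted * k) * psi(shifted))

def psi(x):
    # Rapidly decaying test function, standing in for an element of S(R).
    return np.exp(-0.5 * x**2) * np.exp(0.7j * x)

k, x = 0.9, 0.3
u = bloch_floquet(psi, k, x)

# Periodicity in x with respect to the lattice: (U psi)(k, x + 1) = (U psi)(k, x).
periodic_error = abs(bloch_floquet(psi, k, x + 1.0) - u)

# Quasi-periodicity in k: (U psi)(k + 2 pi, x) = exp(-2 pi i x) (U psi)(k, x).
quasi_error = abs(bloch_floquet(psi, k + 2 * np.pi, x) - np.exp(-2j * np.pi * x) * u)

print(periodic_error, quasi_error)
```

Both errors should be at the level of floating-point round-off, since the Gaussian tail makes the truncation of the lattice sum negligible.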
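Therefore it suffices to specify $\mathcal{U}\psi$ on the set $M^* \times M$ and, if needed, extend it to all of $\mathbb{R}^{2d}$ by (8). The linear map $\mathcal{U} : L^2(\mathbb{R}^d) \supset \mathcal{S}(\mathbb{R}^d) \to \mathcal{H} := \int^{\oplus}_{M^*} L^2(M)\, dk$, with dk the normalized Lebesgue measure on $M^*$, has norm one and can thus be extended to all of $L^2(\mathbb{R}^d)$ by continuity. $\mathcal{U}$ is surjective as can be seen from the inverse mapping
$$ (\mathcal{U}^{-1}\phi)(x) := \int_{M^*} e^{ix\cdot k}\,\phi(k, x)\, dk, $$
which has norm one. Thus $\mathcal{U} : L^2(\mathbb{R}^d) \to \mathcal{H}$ is unitary. To transform $H_{\mathrm{per}}$ under $\mathcal{U}$, we first note that $\tilde p = \mathcal{U}p\,\mathcal{U}^{-1} = D_x + k$, with $D_x = -i\nabla_x$. Therefore
$$ \tilde H_{\mathrm{per}} := \mathcal{U}H_{\mathrm{per}}\mathcal{U}^{-1} = \int^{\oplus}_{M^*} H_{\mathrm{per}}(k)\, dk, $$
and
$$ H_{\mathrm{per}}(k) = \frac{1}{2}(D_x + k)^2 + V(x), \qquad k \in \mathbb{R}^d. $$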
$H_{\mathrm{per}}(k)$ acts on $L^2(M)$ with k-independent domain $D := H^2(\mathbb{T})$. $H_{\mathrm{per}}(k)$ is a semibounded self-adjoint operator, since by condition ($C_{\mathrm{per}}$) V is infinitesimally operator bounded with respect to $-\Delta$ [7]. Since the resolvent of $H_0(k) = \frac{1}{2}(D_x + k)^2$ is compact, the resolvent $R_\lambda(H_{\mathrm{per}}(k)) := (H_{\mathrm{per}}(k) - \lambda)^{-1}$, $\lambda \notin \sigma(H_{\mathrm{per}}(k))$, is also compact, and $H_{\mathrm{per}}(k)$ has a complete set of (normalized) eigenfunctions $\varphi_n(k) \in H^2(\mathbb{T})$, $n \in \mathbb{N}$, called Bloch functions. The corresponding eigenvalues $E_n(k)$, $n \in \mathbb{N}$, accumulate at infinity and we enumerate them according to their magnitude and multiplicity, $E_1(k) \le E_2(k) \le \dots$. $E_n(k)$ is called the nth band function. We note that $H_{\mathrm{per}}(k) = e^{-ix\cdot\gamma^*} H_{\mathrm{per}}(k + \gamma^*)\, e^{ix\cdot\gamma^*}$. Therefore $E_n(k)$ is periodic with respect to $\Gamma^*$. If $E_{n-1}(k) < E_n(k) < E_{n+1}(k)$ for all $k \in M^*$ (in particular $E_n(k)$ is nondegenerate), then the nth band is isolated. In this case $E_n$ and the corresponding projection operator are real analytic functions as a consequence of analytic perturbation theory [15]. We denote by $I \subset \mathbb{N}$ the set of indices of isolated bands.

It will be convenient to have also a notation for the spectral subspaces. Let $P_n(k) : L^2(M) \to L^2(M)$ denote the orthogonal projection onto the nth eigenspace of $H_{\mathrm{per}}(k)$. Similarly, we set $Q_n(k) = 1 - P_n(k)$. Their direct fiber integral is denoted by
$$ \tilde P_n = \int^{\oplus}_{M^*} P_n(k)\, dk. $$
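To make the band functions concrete, the fiber operator can be diagonalized numerically. The sketch below assumes d = 1, the lattice Γ = Z, and the hypothetical potential V(x) = 2v cos(2πx); in the plane-wave basis $e^{2\pi i m x}$ the operator $\frac{1}{2}(D_x + k)^2$ is diagonal and V couples neighbouring modes with strength v. The truncation `m_max` and the value of `v` are illustrative choices, not taken from the paper.

```python
import numpy as np

def band_energies(k, v=0.5, m_max=10):
    """Eigenvalues of a truncated matrix for H_per(k) = (D_x + k)^2 / 2 + V(x),
    with the assumed potential V(x) = 2 v cos(2 pi x), in the basis exp(2 pi i m x)."""
    m = np.arange(-m_max, m_max + 1)
    h = np.diag(0.5 * (2 * np.pi * m + k) ** 2)
    off = v * np.ones(len(m) - 1)
    h = h + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(h)  # sorted in ascending order: E_1(k) <= E_2(k) <= ...

k = 1.1
lowest = band_energies(k)[:3]
lowest_shifted = band_energies(k + 2 * np.pi)[:3]

# Periodicity of the band functions with respect to the reciprocal lattice 2 pi Z.
periodicity_error = np.max(np.abs(lowest - lowest_shifted))

# Gap between the first two bands at the zone boundary k = pi.
boundary = band_energies(np.pi)
gap_at_zone_boundary = boundary[1] - boundary[0]

print(periodicity_error, gap_at_zone_boundary)
```

For this choice of potential the first band stays separated from the second one at the zone boundary, which is the situation described above as an isolated band.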
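$\tilde P_n$ projects onto the nth band subspace in $\mathcal{H}$ and $P_n = \mathcal{U}^{-1}\tilde P_n\,\mathcal{U}$ projects onto the nth band subspace in $L^2(\mathbb{R}^d)$. We have
$$ (\tilde P_n\psi)(k, \cdot) = P_n(k)\psi(k, \cdot) = (\varphi_n(k), \psi(k))_{L^2(M)}\,\varphi_n(k, \cdot) =: \psi_n(k)\,\varphi_n(k, \cdot). \tag{9} $$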
The coefficient functions $\psi_n \in L^2(M^*)$ are called the Bloch coefficients in the nth band subspace. For the index set $I \subset \mathbb{N}$ of isolated bands we set $P_I = \sum_{n\in I} P_n$.

Remark 1. To have a concise notation, we will use a tilde for operators acting on $\mathcal{H}$. Thus if A is an operator on $L^2(\mathbb{R}^d)$, then $\tilde A = \mathcal{U}A\,\mathcal{U}^{-1}$. If $\tilde A$ has a direct fiber decomposition, then $\tilde A = \int^{\oplus}_{M^*} A(k)\, dk$ with A(k) acting on the fiber $L^2(M)$ of $\mathcal{H}$.

3. Main Results

For the potentials we assume ($C_{\mathrm{per}}$) for V and in addition

Condition ($C_{\mathrm{ex}}$). The external potential $W \in \mathcal{S}(\mathbb{R}^d)$.

To state the semiclassical limit, we first have to explain the classical dynamics which will serve as a comparison. For each $n \in I$ the classical phase space is $\mathbb{R}^d \times \mathbb{T}^*$, where $\mathbb{T}^* = \mathbb{R}^d/\Gamma^*$. As nth band Hamiltonian we have
$$ h_n(r, k) = E_n(k) + W(r), \qquad (r, k) \in \mathbb{R}^d \times \mathbb{T}^*, $$
and the classical dynamics in the nth band is governed by
$$ \dot r_n = \nabla_k E_n(k_n), \qquad \dot k_n = -\nabla_r W(r_n). \tag{10} $$
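The band dynamics (10) is an ordinary Hamiltonian flow and can be integrated numerically. The following is a minimal sketch, assuming d = 1, a hypothetical tight-binding band $E_n(k) = -\cos k$, a Gaussian external potential, and a hand-written fourth-order Runge-Kutta step; none of these choices come from the paper. Conservation of the band Hamiltonian $h_n(r, k) = E_n(k) + W(r)$ along the trajectory serves as a consistency check.

```python
import numpy as np

# Illustrative model data (assumptions of this sketch).
def band(k):        # E_n(k), periodic with respect to 2 pi Z
    return -np.cos(k)

def band_grad(k):   # derivative of E_n with respect to k
    return np.sin(k)

def pot(r):         # external potential W(r)
    return 0.3 * np.exp(-0.5 * r**2)

def pot_grad(r):    # derivative of W with respect to r
    return -0.3 * r * np.exp(-0.5 * r**2)

def rhs(y):
    r, k = y
    # Equations (10): dr/dt = grad_k E_n(k), dk/dt = -grad_r W(r).
    return np.array([band_grad(k), -pot_grad(r)])

def flow(k0, t_final, steps=2000):
    """Integrate (10) with initial condition r(0) = 0, k(0) = k0 by classical RK4."""
    y = np.array([0.0, k0])
    dt = t_final / steps
    for _ in range(steps):
        s1 = rhs(y)
        s2 = rhs(y + 0.5 * dt * s1)
        s3 = rhs(y + 0.5 * dt * s2)
        s4 = rhs(y + dt * s3)
        y = y + dt / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
    return y

k0 = 0.8
r_t, k_t = flow(k0, t_final=5.0)
energy_drift = abs((band(k_t) + pot(r_t)) - (band(k0) + pot(0.0)))
print(r_t, k_t, energy_drift)
```

The initial condition r(0) = 0, k(0) = k matches the one used below to define the pair $(r_n(t; k), k_n(t; k))$.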
Since we want to prove the strong convergence of the position operator, as in the case $V \equiv 0$, we have to lift (10) to operators on $\mathcal{H}$. For this purpose we solve (10) with initial condition $r_n(0) = 0$, $k_n(0) = k$. We denote the solution by $(r_n(t; k), k_n(t; k))$, regarded as functions of $k \in \mathbb{T}^*$. For $\psi \in \mathcal{H}$, we define
$$ (R(t)\psi)(k, x) := \sum_{n\in I} r_n(t; k)\,(\tilde P_n\psi)(k, x), $$
and analogously, for later use,
$$ (K(t)\psi)(k, x) := \sum_{n\in I} k_n(t; k)\,(\tilde P_n\psi)(k, x). $$
Theorem 1. Let the conditions ($C_{\mathrm{per}}$), ($C_{\mathrm{ex}}$) be satisfied. Let $x^\varepsilon(t) := \varepsilon\, U^\varepsilon(-t/\varepsilon)\, x\, U^\varepsilon(t/\varepsilon)$. Then for every $\psi \in \mathrm{Ran}P_I \cap D(|x|) \cap H^2$, with $H^2$ the second Sobolev space, and $T < \infty$ there is a $c < \infty$ such that for $t \in [0, T]$,
$$ \left\| \left(x^\varepsilon(t) - \mathcal{U}^{-1}R(t)\,\mathcal{U}\right)\psi \right\| \le c\,\varepsilon. $$

Theorem 1 will be proved in several steps. First we show that in the semiclassical limit transitions from and to isolated band subspaces are suppressed on the level of the unitary groups. We define $H^n_{\mathrm{diag}} := P_n H P_n + Q_n H Q_n$ and $U^{\varepsilon,n}_{\mathrm{diag}}(t) := \exp(-itH^n_{\mathrm{diag}})$. In Sect. 5 we will prove

Theorem 2. For any $n \in I$ and $T < \infty$ there is a $c < \infty$ such that for all $t \in [0, T]$,
$$ \left\| U^\varepsilon(t/\varepsilon) - U^{\varepsilon,n}_{\mathrm{diag}}(t/\varepsilon) \right\|_{\mathcal{B}(H^1, L^2)} \le c\,\varepsilon, $$
where $H^1$ is the first Sobolev space.

The position operator is not diagonal with respect to the nth band subspace and we define its diagonal part by $x^n_{\mathrm{diag}} := P_n x P_n + Q_n x Q_n$ with the time evolution
$$ x^{\varepsilon,n}_{\mathrm{diag}}(t) := \varepsilon\, U^{\varepsilon,n}_{\mathrm{diag}}(-t/\varepsilon)\, x^n_{\mathrm{diag}}\, U^{\varepsilon,n}_{\mathrm{diag}}(t/\varepsilon). $$
Our second step is to prove that the off-diagonal part of $x^\varepsilon(t)$ vanishes in the limit $\varepsilon \to 0$.

Theorem 3. For $n \in I$ and $T < \infty$ there is a $c < \infty$ such that for all $t \in [0, T]$,
$$ \left\| x^\varepsilon(t) - x^{\varepsilon,n}_{\mathrm{diag}}(t) \right\|_{\mathcal{B}(H^2, L^2)} \le c\,\varepsilon. \tag{11} $$

By construction we have $[x^{\varepsilon,n}_{\mathrm{diag}}(t), P_n] = 0$ and it suffices to study the dynamics in the nth band subspace. This subspace is isomorphic to $L^2(\mathbb{T}^*)$ and, up to errors of higher order, $x^{\varepsilon,n}_{\mathrm{diag}}(t)$ can be replaced by $x^{\varepsilon,n}_{\mathrm{sc}}(t)$ whose time evolution is governed by a Hamiltonian of the form
$$ H^{\varepsilon,n}_{\mathrm{sc}} = E_n(k) + W(i\varepsilon\nabla_k). $$
At this stage we can apply the standard machinery of semiclassics, except that formally the roles of position and momentum have been interchanged and the new position space is the flat torus rather than $\mathbb{R}^d$.
So far we focused on the position operator, since the electronic density is the most accessible quantity experimentally and it corresponds in essence to a suitable function of the position. On more general grounds one would like to characterize a wider class of semiclassical observables. One further obvious candidate is the momentum p. In the Bloch-Floquet basis we have $\tilde p = k + D_x$. k is semiclassical, being canonically conjugate to $i\nabla_k$:

Theorem 4. Let $k^\varepsilon(t) := U^\varepsilon(-t/\varepsilon)\,\mathcal{U}^{-1} k\,\mathcal{U}\, U^\varepsilon(t/\varepsilon)$. Then for every $\psi \in \mathrm{Ran}P_I$ and $T < \infty$ there is a $c < \infty$ such that for all $t \in [0, T]$,
$$ \left\| \left(k^\varepsilon(t) - \mathcal{U}^{-1}K(t)\,\mathcal{U}\right)\psi \right\| \le c\,\varepsilon. \tag{12} $$

On the other hand, $D_x$ is unbalanced because there is no extra factor of ε. Thus p(t/ε) has a limit only when averaged over time (compare with Sect. 6).

It is relatively easy to see that Theorems 1 and 4 imply the semiclassical limit also for bounded functions of $x^\varepsilon(t)$ respectively of $k^\varepsilon(t)$ (cf. Lemma 8). Next note that for $\Gamma^*$-periodic functions g, $g(\cdot + \gamma^*) = g(\cdot)$ for all $\gamma^* \in \Gamma^*$, we have $\mathcal{U}g(p)\,\mathcal{U}^{-1} = g(k)$ and hence, by the functional calculus for self-adjoint operators, $g(p^\varepsilon(t)) = g(k^\varepsilon(t))$. Therefore we introduce the set $O(0) \subset C^\infty(\mathbb{R}^d \times \mathbb{R}^d, \mathbb{R})$ of bounded and smooth semiclassical symbols with the following properties: a function a(x, k) belongs to O(0), if the function and all its partial derivatives are bounded, if it is $\Gamma^*$-periodic in its second argument and vanishes as the first argument approaches infinity. For $a \in O(0)$ we introduce its Weyl quantization
$$ (a^W\psi)(x) = \frac{1}{(2\pi)^d}\int a\!\left(\frac{x+y}{2}, \xi\right) e^{i(x-y)\cdot\xi}\,\psi(y)\, d\xi\, dy \tag{13} $$
as a bounded operator on $L^2(\mathbb{R}^d)$. The operator corresponding to the symbol $a(\varepsilon x, \xi)$ will be denoted by $a^{W,\varepsilon}$ and we set, as before,
$$ a^{W,\varepsilon}(t) := U^\varepsilon(-t/\varepsilon)\, a^{W,\varepsilon}\, U^\varepsilon(t/\varepsilon). \tag{14} $$

Theorem 5. Let the conditions ($C_{\mathrm{per}}$), ($C_{\mathrm{ex}}$) be satisfied and $a \in O(0)$. Then for every $\psi \in \mathrm{Ran}P_I$ and $T < \infty$ there is a $c < \infty$ such that for all $t \in [0, T]$,
$$ \left\| \left(a^{W,\varepsilon}(t) - \mathcal{U}^{-1}a(R(t), K(t))\,\mathcal{U}\right)\psi \right\| \le c\,\varepsilon. $$

4. Semiclassical Distributions

Theorems 1 and 5 tell us how the quantum distributions behave in the semiclassical limit. Let us first consider the initial $\psi \in P_I\mathcal{H}$. Its scaled position distribution is $\varepsilon^{-d}|\psi(x/\varepsilon)|^2$ which converges to δ(x) as a measure. The quasimomentum distribution $\sum_{n\in I}|\psi_n(k)|^2$ is independent of ε. Thus it is natural to choose
$$ \rho(dr\, dk) = \sum_{n\in I}\delta(r)\,|\psi_n(k)|^2\, dr\, dk = \sum_{n\in I}\rho_n(dr\, dk) \tag{15} $$
as the initial distribution for the semiclassical flow (10). We could consider more general initial measures at the expense of making ψ itself ε-dependent. For example the shifted initial measure $\sum_{n\in I}\delta(r - r_0)\,|\psi_n(k)|^2\, dr\, dk$ is approximated by $\psi(x - \varepsilon^{-1}r_0)$. Under (10) $\rho(dr\, dk)$ evolves to $\rho(dr\, dk, t) = \sum_{n\in I}\rho_n(dr\, dk, t)$. Each $\rho_n$ satisfies weakly the transport equation
$$ \frac{\partial}{\partial t}\rho_n = -\nabla E_n(k)\cdot\nabla_r\rho_n + \nabla W(r)\cdot\nabla_k\rho_n \tag{16} $$
with initial condition $\rho_n(dr\, dk, 0) = \rho_n(dr\, dk)$. We define the position and quasimomentum marginals through
$$ \rho(dr, t) = \int_{M^*}\rho(dr\, dk, t), \qquad \rho(dk, t) = \int_{\mathbb{R}^d}\rho(dr\, dk, t). \tag{17} $$
To connect with the quantum evolution we consider the quantum mechanical position distribution
$$ \rho^\varepsilon(dx, t) = \varepsilon^{-d}\,|\psi(x/\varepsilon, t/\varepsilon)|^2\, dx \tag{18} $$
as a probability measure on $\mathbb{R}^d$. From Theorem 1 and Lemma 8 we conclude that
$$ \lim_{\varepsilon\to 0} f(x^\varepsilon(t))\psi = \mathcal{U}^{-1}f(R(t))\,\mathcal{U}\psi \tag{19} $$
for $f \in C_\infty(\mathbb{R}^d)$. In particular,
$$ \lim_{\varepsilon\to 0}\int\rho^\varepsilon(dx, t)\, f(x) = \lim_{\varepsilon\to 0}(\psi, f(x^\varepsilon(t))\psi) = (\mathcal{U}\psi, f(R(t))\,\mathcal{U}\psi), \tag{20} $$
and we only have to compute the expression on the right-hand side. Using that
$$ (\mathcal{U}\psi)(k, x) = \sum_{n\in I}\psi_n(k)\,\varphi_n(k, x), $$
we have
$$ (\mathcal{U}\psi, f(R(t))\,\mathcal{U}\psi) = \sum_{n\in I}\int_{M^*}|\psi_n(k)|^2 f(r_n(t; k))\, dk = \sum_{n\in I}\int_{\mathbb{R}^d}\rho_n(dr, t)\, f(r). \tag{21} $$
Thus the positional distribution $\rho^\varepsilon(dx, t)$ converges weakly as a measure to the incoherent sum $\sum_{n\in I}\rho_n(dr, t)$.

By the same reasoning, if g is a $\Gamma^*$-periodic function, then by Theorem 4 and Lemma 8,
$$ \lim_{\varepsilon\to 0} g(p(t/\varepsilon))\psi = \mathcal{U}^{-1}g(K(t))\,\mathcal{U}\psi. \tag{22} $$
Therefore, if $\rho^\varepsilon(k, t)\, dk$ denotes the spectral measure for the quasimomentum operator at time t/ε, we have
$$ \lim_{\varepsilon\to 0}\rho^\varepsilon(k, t)\, dk = \sum_{n\in I}\rho_n(dk, t) \tag{23} $$
weakly as measures.
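More generally for $\psi \in L^2$ we define the scaled Wigner function by
$$ W^\varepsilon(x, k, t) = \varepsilon^{-d}\sum_{\gamma\in\Gamma}\psi(x/\varepsilon - \gamma/2, t/\varepsilon)\,\psi^*(x/\varepsilon + \gamma/2, t/\varepsilon)\, e^{ik\cdot\gamma} \tag{24} $$
with $x \in \mathbb{R}^d$, $k \in M^*$. We think of $W^\varepsilon$ as a signed, bounded measure over $\mathbb{R}^d \times M^*$. The Wigner function yields expectations of Weyl quantized operators through
$$ \left(\psi, e^{iH^\varepsilon t/\varepsilon}\, a^{W,\varepsilon}\, e^{-iH^\varepsilon t/\varepsilon}\psi\right) = \int_{\mathbb{R}^d\times M^*} W^\varepsilon(x, k, t)\, a(x, k)\, dx\, dk \tag{25} $$
with a $\Gamma^*$-periodic in its second argument. From Theorem 5 we therefore deduce that
$$ \lim_{\varepsilon\to 0} W^\varepsilon(r, k, t)\, dr\, dk = \rho(dr\, dk, t) \tag{26} $$
weakly as measures. The limits (20) and (23) are the particular cases, where either a(x, k) = f(x) or a(x, k) = g(k).

5. Convergence of the Unitary Groups

By definition, the time evolution generated by $H_{\mathrm{per}}$ leaves invariant the band subspaces $\mathrm{Ran}(P_n)$ for all $n \in \mathbb{N}$. However, $W^\varepsilon(x) = W(\varepsilon x)$ does not respect the Bloch decomposition and it will induce transitions between different bands. Since $W^\varepsilon$ is of slow variation, we expect such transitions to have a small amplitude as stated in Theorem 2. $W^\varepsilon$ transforms under $\mathcal{U}$ as
$$ (\mathcal{U}W^\varepsilon\psi)(k, x) = \sum_{\gamma\in\Gamma} e^{-i(x+\gamma)\cdot k}\, W(\varepsilon(x+\gamma))\,\psi(x+\gamma) $$
$$ = \sum_{\gamma\in\Gamma} e^{-i(x+\gamma)\cdot k}\,(2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat W(p)\, e^{i\varepsilon(x+\gamma)\cdot p}\, dp\;\psi(x+\gamma) $$
$$ = (2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat W(p)\,(\mathcal{U}\psi)(k - \varepsilon p, x)\, dp $$
$$ =: (2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat W^\varepsilon(p)\,(\mathcal{U}\psi)(k - p, x)\, dp =: (\tilde W^\varepsilon\,\mathcal{U}\psi)(k, x), \tag{27} $$
and we adopt the quasiperiodic extension (8). Since $\hat W \in \mathcal{S}(\mathbb{R}^d)$, the integral (27) is well-defined and $\tilde W^\varepsilon = \mathcal{U}W^\varepsilon\mathcal{U}^{-1}$ acts on $\mathcal{H}$ as convolution with $\hat W^\varepsilon := \varepsilon^{-d}\hat W(p/\varepsilon)$ in the fiber parameter k. $\hat W^\varepsilon$ approximates a Dirac delta in the limit $\varepsilon \to 0$ and the shift in (27) becomes the identity operator.

In the Bloch-Floquet representation the full Hamiltonian (3) becomes
$$ (\tilde H\psi)(k, \cdot) = H_{\mathrm{per}}(k)\,\psi(k, \cdot) + (\tilde W^\varepsilon\psi)(k, \cdot). $$
We expect the diagonal part of $W^\varepsilon$ to be dominant with the off-diagonal piece as a small correction. For such a decomposition it turns out to be convenient to fix the index n of an isolated band and to project along $P_n$ and its complement $Q_n = 1 - P_n$. For $n \in I$ we define the diagonal part $H^n_{\mathrm{diag}}$ of H as
$$ H^n_{\mathrm{diag}} = P_n H P_n + Q_n H Q_n, $$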
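and the off-diagonal part of the external potential as
$$ W^{\varepsilon,n}_{\mathrm{od}} = Q_n W^\varepsilon P_n + P_n W^\varepsilon Q_n. $$
Then
$$ H = H^n_{\mathrm{diag}} + W^{\varepsilon,n}_{\mathrm{od}} = (H_{\mathrm{per}} + W^{\varepsilon,n}_{\mathrm{diag}}) + W^{\varepsilon,n}_{\mathrm{od}}. $$
We note that $W^{\varepsilon,n}_{\mathrm{diag}}$ and $W^{\varepsilon,n}_{\mathrm{od}}$ are bounded operators and set
$$ U^\varepsilon(t) = e^{-itH}, \qquad U^{\varepsilon,n}_{\mathrm{diag}}(t) = e^{-itH^n_{\mathrm{diag}}}. $$
To prove Theorem 2 we start by writing the difference of the two unitary groups in the Bloch representation as
$$ \tilde U^\varepsilon(t/\varepsilon) - \tilde U^{\varepsilon,n}_{\mathrm{diag}}(t/\varepsilon) = -i\varepsilon\int_0^{t/\varepsilon}\tilde U^\varepsilon(\varepsilon^{-1}t - s)\,\varepsilon^{-1}\tilde W^{\varepsilon,n}_{\mathrm{od}}\,\tilde U^{\varepsilon,n}_{\mathrm{diag}}(s)\, ds \tag{28} $$
and we have to investigate the operator $\tilde W^{\varepsilon,n}_{\mathrm{od}}$. By definition, for $\psi \in \mathcal{H}$, we have
$$ (\tilde Q_n\tilde W^\varepsilon\tilde P_n\psi)(k) = (2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat W^\varepsilon(p)\, Q_n(k)P_n(k - p)\,\psi(k - p)\, dp $$
$$ = (2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat W(p)\, Q_n(k)P_n(k - \varepsilon p)\,\psi(k - \varepsilon p)\, dp, $$
which vanishes strongly in the limit $\varepsilon \to 0$, since $\hat W^\varepsilon$ localizes around p = 0. To control the long times in (28) we would need uniform convergence of order $O(\varepsilon^2)$, however. In order to identify the leading order term of $\tilde W^{\varepsilon,n}_{\mathrm{od}}$ we do a Taylor expansion of $P_n(k - \varepsilon p)$ around $P_n(k)$, leading, as we will show, to
$$ (\tilde Q_n\tilde W^\varepsilon\tilde P_n\psi)(k) = -\varepsilon\,(2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat F(p)\cdot Q_n(k)\nabla_k P_n(k)\,\psi(k - \varepsilon p)\, dp + O(\varepsilon^2). \tag{29} $$
Here $\hat F$ is the Fourier transform of $F(x) = (D_x W)(x)$ and we will associate to F the operator $\tilde F^\varepsilon$ as in the case of W,
$$ (\tilde F^\varepsilon\psi)(k) = (2\pi)^{-d/2}\int_{\mathbb{R}^d}\hat F(p)\,\psi(k - \varepsilon p)\, dp. $$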
(30)
where Rλ (H ) = (H − λ)−1 is the resolvent of H . Thus Pn (·) ∈ C ∞ (M ∗ ; B(L2 (M))).
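Formula (30) can be tested in a finite-dimensional toy model, where it reduces to first-order perturbation theory for spectral projections. The sketch below is only an analogue under stated assumptions: a hypothetical 6×6 model with a diagonal matrix `D` standing in for $D_x$, a small random Hermitian matrix `V` standing in for the potential, and a central finite difference as the reference derivative. The reduced resolvent $Q_n R_{E_n} Q_n$ is built from the spectral decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
D = np.diag(np.arange(N, dtype=float))      # hypothetical stand-in for D_x
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
V = 0.02 * (A + A.conj().T)                 # small Hermitian "potential"

def H(k):
    M = D + k * np.eye(N)
    return 0.5 * M @ M + V                  # analogue of (D_x + k)^2 / 2 + V

def projector(k, n):
    _, vecs = np.linalg.eigh(H(k))
    u = vecs[:, [n]]
    return u @ u.conj().T                   # independent of the eigenvector phase

def dP_formula(k, n):
    """Right-hand side of (30) in the toy model."""
    vals, vecs = np.linalg.eigh(H(k))
    P = vecs[:, [n]] @ vecs[:, [n]].conj().T
    # Reduced resolvent Q (H - E_n)^{-1} Q from the spectral decomposition.
    R = sum(vecs[:, [m]] @ vecs[:, [m]].conj().T / (vals[m] - vals[n])
            for m in range(N) if m != n)
    dH = D + k * np.eye(N)                  # derivative of H(k) with respect to k
    return -R @ dH @ P - P @ dH @ R

k, n, h = 0.3, 0, 1e-5
dP_fd = (projector(k + h, n) - projector(k - h, n)) / (2 * h)
formula_error = np.max(np.abs(dP_fd - dP_formula(k, n)))
print(formula_error)
```

The toy spectrum is nondegenerate for the chosen parameters, which mirrors the assumption $n \in I$ that the band is isolated.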
Proof. Using contour integrals we write
$$ \nabla_k P_n(k) = -\frac{1}{2\pi i}\oint_{c_n(k)}\nabla_k R_\lambda(H_{\mathrm{per}}(k))\, d\lambda, $$
where $c_n(k)$ is a closed rectifiable curve in the complex spectral plane which encircles $E_n(k)$ only. From
$$ 0 = \nabla_k 1 = \nabla_k\left[(H_{\mathrm{per}}(k) - \lambda)R_\lambda(H_{\mathrm{per}}(k))\right] = (D_x + k)R_\lambda(H_{\mathrm{per}}(k)) + (H_{\mathrm{per}}(k) - \lambda)\nabla_k R_\lambda(H_{\mathrm{per}}(k)), $$
we infer $\nabla_k R_\lambda(H_{\mathrm{per}}(k)) = -R_\lambda(H_{\mathrm{per}}(k))(D_x + k)R_\lambda(H_{\mathrm{per}}(k))$. Hence we get
$$ Q_n(k)\nabla_k P_n(k) = Q_n(k)\nabla_k P_n(k)\,(P_n(k) + Q_n(k)) $$
$$ = \frac{1}{2\pi i}\oint_{c_n(k)} Q_n(k)R_\lambda(H_{\mathrm{per}}(k))(D_x + k)R_\lambda(H_{\mathrm{per}}(k))P_n(k)\, d\lambda $$
$$ = \frac{1}{2\pi i}\oint_{c_n(k)}\frac{1}{E_n(k) - \lambda}\,R_\lambda(H_{\mathrm{per}}(k))Q_n(k)\, d\lambda\;(D_x + k)P_n(k) $$
$$ = -R_{E_n(k)}(H_{\mathrm{per}}(k))Q_n(k)(D_x + k)P_n(k), \tag{31} $$
where the term $Q_n(k)\nabla_k P_n(k)Q_n(k)$ vanishes, since in this case the integrand is an analytic function on the whole interior of $c_n(k)$. Note that $P_n(k)$ projects onto a subspace of finite energy, on which $D_x + k$ is bounded. The statement about continuity for this term then follows from the continuity of $P_n(k)$, $E_n(k)$ and the assumption that $E_n(k)$ is isolated from the remainder of the spectrum. An analogous computation for $P_n(k)\nabla_k P_n(k)$ leads to the second term in (30). Finally, $P_n(\cdot) \in C^\infty(M^*; \mathcal{B}(L^2(M)))$ follows by induction. □

From $Q_n(k) + P_n(k) = 1$ we conclude that $Q_n(k)$ is differentiable as well and that $\nabla_k Q_n(k) = -\nabla_k P_n(k)$.

Lemma 2. Let $n \in I$. Then
ε n ∇k P n ∇k Q n + P n · F + O(ε 2 ) ε,n = −ε Q W od n := ⊕∗ ∇k Pn (k) dk. in B(H, H), where ∇k P M
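Proof. By Lemma 1 we have
\[
P_n(k - \varepsilon p) = P_n(k) - \varepsilon p \cdot (\nabla_k P_n)(k) + \varepsilon^2\, p \cdot \mathrm{H}(P_n)(k'(k, \varepsilon p)) \cdot p,
\]
where the last term is the Lagrangian remainder with $\mathrm{H}$ denoting the Hessian. Hence
\begin{align}
(\widetilde W^{\varepsilon} \widetilde P_n \psi)(k) &= (2\pi)^{-d/2} \int_{\mathbb{R}^d} \widehat W(p)\, P_n(k - \varepsilon p)\, \psi(k - \varepsilon p)\, dp \notag\\
&= (2\pi)^{-d/2} \int_{\mathbb{R}^d} \widehat W(p) \bigl( P_n(k) - \varepsilon p \cdot \nabla_k P_n(k) \bigr) \psi(k - \varepsilon p)\, dp \tag{32}\\
&\quad + (2\pi)^{-d/2} \varepsilon^2 \int_{\mathbb{R}^d} \widehat W(p)\; p \cdot \mathrm{H}(P_n)(k'(k, \varepsilon p)) \cdot p\; \psi(k - \varepsilon p)\, dp. \tag{33}
\end{align}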
F. Hövermann, H. Spohn, S. Teufel
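Since
\begin{align*}
\Bigl\| \int_{\mathbb{R}^d} \widehat W(p)\; p \cdot \mathrm{H}(P_n)(k'(\cdot, \varepsilon p)) \cdot p\; \psi(\cdot - \varepsilon p)\, dp \Bigr\|_{\mathcal{H}}
&\le \sup_{k \in M^*} \| \mathrm{H}(P_n)(k) \| \int_{\mathbb{R}^d} |\widehat W(p)|\, p^2\, \| \psi(\cdot - \varepsilon p) \|_{\mathcal{H}}\, dp \\
&\le c\, \| \widehat W \|_{L^1} \| \psi \|_{\mathcal{H}},
\end{align*}
(33) is $O(\varepsilon^2)$ and multiplying (32) with $Q_n(k)$ from the left shows that
\[
\widetilde Q_n \widetilde W^{\varepsilon} \widetilde P_n = -\varepsilon\, \widetilde Q_n \nabla_k \widetilde P_n \cdot \widetilde F^{\varepsilon} + O(\varepsilon^2).
\]
Clearly
\[
\widetilde P_n \widetilde W^{\varepsilon} \widetilde Q_n = -\varepsilon\, \widetilde P_n \nabla_k \widetilde Q_n \cdot \widetilde F^{\varepsilon} + O(\varepsilon^2)
\]
follows analogously. For later use we add that we also showed that $[\widetilde W^{\varepsilon}, \widetilde P_n] = O(\varepsilon)$. □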
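As a consequence of Lemma 2 the difference of the two unitary groups in Eq. (28) can be written as
\[
\widetilde U^{\varepsilon}(t/\varepsilon) - \widetilde U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) = i\varepsilon \int_0^{t/\varepsilon} \widetilde U^{\varepsilon}(\varepsilon^{-1} t - s) \bigl( \widetilde Q_n \nabla_k \widetilde P_n + \widetilde P_n \nabla_k \widetilde Q_n \bigr) \cdot \widetilde F^{\varepsilon}\; \widetilde U_{\rm diag}^{\varepsilon,n}(s)\, ds + O(\varepsilon). \tag{34}
\]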
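The passage from Lemma 2 to (34) is a Duhamel (variation of constants) step. The following is only a formal sketch, under the assumptions (suggested by the notation but not stated in this excerpt) that the two groups are $e^{-isH}$ and $e^{-isH_d}$ with $H - H_d = W_{\rm od}$; tildes and indices are dropped, and $G$ and $R$ are shorthand introduced here for illustration.

```latex
% Assumed: U(s) = e^{-isH}, U_d(s) = e^{-isH_d}, H - H_d = W_od. Domain questions are ignored.
\begin{align*}
\frac{d}{ds}\bigl[U(\tau-s)\,U_d(s)\bigr]
  &= i\,U(\tau-s)\,(H-H_d)\,U_d(s) = i\,U(\tau-s)\,W_{\rm od}\,U_d(s),\\
U_d(\tau)-U(\tau)
  &= i\int_0^{\tau}U(\tau-s)\,W_{\rm od}\,U_d(s)\,ds .
\end{align*}
% Insert Lemma 2 in the form W_od = -\varepsilon G + R,
% with G = (Q\nabla P + P\nabla Q)\cdot F and \|R\| \le c\,\varepsilon^2:
\begin{align*}
U(\tau)-U_d(\tau)
  &= i\varepsilon\int_0^{\tau}U(\tau-s)\,G\,U_d(s)\,ds
     \;-\; i\int_0^{\tau}U(\tau-s)\,R\,U_d(s)\,ds,\\
\Bigl\|\int_0^{\tau}U(\tau-s)\,R\,U_d(s)\,ds\Bigr\|
  &\le \tau\,c\,\varepsilon^{2} = c\,t\,\varepsilon
  \quad\text{for } \tau = t/\varepsilon .
\end{align*}
```

The remainder is therefore $O(\varepsilon)$ on macroscopic times, while the leading integral still combines a prefactor $\varepsilon$ with an interval of length $t/\varepsilon$, which is why a further argument is required.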
We have to estimate the integral without losing one order of $\varepsilon$ from the integration over time. As in the proof in [3] of the adiabatic theorem the idea is to rewrite the integrand as a time derivative, i.e. as a commutator of $\widetilde H_{\rm diag}^{n}$ with an appropriately chosen operator $\widetilde A$, at least up to an unavoidable error $O(\varepsilon)$. Let us define for $n \in \mathcal{I}$,
\[
B_n(k) = R^2_{E_n(k)}(H_{\rm per}(k))\, Q_n(k) (D_x + k) P_n(k).
\]
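Lemma 3. For $n \in \mathcal{I}$ we have
\[
\widetilde Q_n \nabla_k \widetilde P_n + \widetilde P_n \nabla_k \widetilde Q_n = [\widetilde B_n + \widetilde B_n^{*}, \widetilde H_{\rm per}].
\]
Proof. Using the spectral decomposition and recalling $Q_n(k) \nabla_k P_n(k) = -R_{E_n(k)}(H_{\rm per}(k)) Q_n(k) (D_x + k) P_n(k)$ from Lemma 1, one directly computes
\begin{align*}
B_n(k) H_{\rm per}(k) - H_{\rm per}(k) B_n(k) &= -(H_{\rm per}(k) - E_n(k))\, R^2_{E_n(k)}(H_{\rm per}(k))\, Q_n(k) (D_x + k) P_n(k) \\
&= -R_{E_n(k)}(H_{\rm per}(k))\, Q_n(k) (D_x + k) P_n(k) = Q_n(k) \nabla_k P_n(k).
\end{align*}
The lemma then follows from $\widetilde P_n \nabla_k \widetilde Q_n = -(\widetilde Q_n \nabla_k \widetilde P_n)^{*}$. □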
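The last step of the proof of Lemma 3 uses two short identities that can be spelled out. This is a sketch, assuming that $P_n(k)$ and $H_{\rm per}(k)$ are self-adjoint for real $k$ (so that each component of $\nabla_k P_n$ is self-adjoint as well); the index $n$ and the argument $k$ are dropped.

```latex
% P = P^* = P^2, Q = 1 - P, H = H^*, B as defined before Lemma 3.
\begin{align*}
\nabla P = \nabla(P^2) &= (\nabla P)\,P + P\,\nabla P
  \;\Longrightarrow\; P\,\nabla P = (\nabla P)(1-P) = (\nabla P)\,Q,\\
P\,\nabla Q = -P\,\nabla P &= -(\nabla P)\,Q = -\bigl(Q\,\nabla P\bigr)^{*},\\
[B^{*},H] = B^{*}H - HB^{*} &= -\bigl(BH-HB\bigr)^{*}
  = -\bigl(Q\,\nabla P\bigr)^{*} = P\,\nabla Q .
\end{align*}
```

Adding $[B, H] = Q\nabla P$ from the proof gives the statement of the lemma.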
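Lemma 4. $[\widetilde B_n + \widetilde B_n^{*}, \widetilde W_{\rm diag}^{\varepsilon,n}] = O(\varepsilon)$ in $\mathcal{B}(\mathcal{H}, \mathcal{H})$ as $\varepsilon$ tends to zero.

Proof. To have a concise notation in the following, expressions like $\widetilde W_{\rm diag}^{\varepsilon,n} P_n(k)$ are understood in the sense that $\widetilde W_{\rm diag}^{\varepsilon,n}$ acts on all $k$-depending objects on its right-hand side. We recall that $\widetilde W_{\rm diag}^{\varepsilon,n} = \widetilde P_n \widetilde W^{\varepsilon} \widetilde P_n + \widetilde Q_n \widetilde W^{\varepsilon} \widetilde Q_n$. Hence
\[
[B_n(k), \widetilde W_{\rm diag}^{\varepsilon,n}] = Q_n(k) \bigl[ R^2_{E_n(k)}(H_{\rm per}(k)) Q_n(k) (D_x + k) P_n(k),\ \widetilde W^{\varepsilon} \bigr] P_n(k).
\]
We now examine the commutators $[P_n(k), \widetilde W^{\varepsilon}]$, $[D_x + k, \widetilde W^{\varepsilon}]$ and $[R^2_{E_n(k)} Q_n(k), \widetilde W^{\varepsilon}]$ one by one. It follows from the proof of Lemma 2 that $[P_n(k), \widetilde W^{\varepsilon}] = O(\varepsilon)$, and $[R^2_{E_n(k)} Q_n(k), \widetilde W^{\varepsilon}] = O(\varepsilon)$ can be shown to hold by a similar argument. Thus it remains to discuss the commutator $[D_x + k, \widetilde W^{\varepsilon}]$. For $\psi \in H^1(\mathbb{R}^d)$ we compute
\begin{align*}
(2\pi)^{d/2} \bigl( [D_x + k, \widetilde W^{\varepsilon}]\, \mathcal{U}\psi \bigr)(k)
&= \int_{\mathbb{R}^d} \widehat W^{\varepsilon}(p) \bigl( ((D_x + k) - (D_x + k - p))\, \mathcal{U}\psi \bigr)(k - p)\, dp \\
&= \varepsilon \int_{\mathbb{R}^d} \widehat W^{\varepsilon}(p)\, \varepsilon^{-1} p\, (\mathcal{U}\psi)(k - p)\, dp \\
&= \varepsilon\, (\widetilde F^{\varepsilon}\, \mathcal{U}\psi)(k),
\end{align*}
which is clearly $O(\varepsilon)$ uniformly for $\psi \in L^2$ as $\varepsilon \to 0$, since $F \in \mathcal{S}(\mathbb{R}^d, \mathbb{R}^d)$. □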
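The reduction of $[B_n, W_{\rm diag}]$ to a commutator with the full $W$ at the start of the proof of Lemma 4 uses only the block structure of $B_n$. A short check, with indices, tildes and arguments dropped:

```latex
% B = R^2 Q (D_x + k) P satisfies QB = B and BP = B, since R commutes with Q.
% Also PQ = QP = 0 and W_diag = PWP + QWQ.
\begin{align*}
B\,W_{\rm diag} &= B\,(PWP+QWQ) = (BP)\,W\,P = Q\,B\,W\,P,\\
W_{\rm diag}\,B &= (PWP+QWQ)\,B = Q\,W\,(QB) = Q\,W\,B\,P,\\
[B,W_{\rm diag}] &= Q\,(BW-WB)\,P = Q\,[B,W]\,P .
\end{align*}
```

The terms $B\,QWQ$ and $PWP\,B$ vanish because $BQ = 0$ and $PB = 0$.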
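In summary we have shown that
\[
\bigl( \widetilde Q_n \nabla_k \widetilde P_n + \widetilde P_n \nabla_k \widetilde Q_n \bigr) \cdot \widetilde F^{\varepsilon} = [\widetilde B_n + \widetilde B_n^{*}, \widetilde H_{\rm diag}^{n}] \cdot \widetilde F^{\varepsilon} + O(\varepsilon)
\]
and it remains to check

Lemma 5. $[\widetilde H_{\rm diag}^{n}, \widetilde F^{\varepsilon}] = O(\varepsilon)$ in $\mathcal{B}(\mathcal{U}H^1, \mathcal{H})$ as $\varepsilon$ tends to zero.

Proof. The commutator
\[
[H_{\rm per}, F^{\varepsilon}] = -\tfrac{1}{2} \varepsilon^2 (\Delta F^{\varepsilon}) - \tfrac{1}{2} \varepsilon (\nabla F^{\varepsilon}) \cdot \nabla - \tfrac{1}{2} \varepsilon (\nabla \cdot F^{\varepsilon}) \nabla
\]
is $O(\varepsilon)$ in $\mathcal{B}(H^1, L^2)$ as $\varepsilon \to 0$. The commutator $[\widetilde W_{\rm diag}^{\varepsilon,n}, \widetilde F^{\varepsilon}]$ is $O(\varepsilon)$ in $\mathcal{B}(\mathcal{H}, \mathcal{H})$, since the commutators of $\widetilde P_n$ and $\widetilde Q_n$ with $\widetilde F^{\varepsilon}$ are both of order $O(\varepsilon)$ in $\mathcal{B}(\mathcal{H}, \mathcal{H})$ and $[\widetilde W^{\varepsilon}, \widetilde F^{\varepsilon}]$ vanishes identically. □

Defining
\[
\widetilde A_n = (\widetilde B_n + \widetilde B_n^{*}) \cdot \widetilde F^{\varepsilon},
\]
it follows that the integrand in (34) can be written as
\[
\bigl( \widetilde Q_n \nabla_k \widetilde P_n + \widetilde P_n \nabla_k \widetilde Q_n \bigr) \cdot \widetilde F^{\varepsilon} = [\widetilde A_n, \widetilde H_{\rm diag}^{n}] + O(\varepsilon),
\]
where $O(\varepsilon)$ is in the norm of $\mathcal{B}(\mathcal{U}H^1, \mathcal{H})$. (Note that for $A \in \mathcal{B}(L^2, L^2)$ we have $\|A\|_{\mathcal{B}(H^1, L^2)} \le \|A\|_{\mathcal{B}(L^2, L^2)}$.)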
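We are now ready for the

Proof of Theorem 2. Since $U_{\rm diag}^{\varepsilon,n}(t) : H^1 \to H^1$ is bounded uniformly in $t$ and $\varepsilon$ (cf. Sect. 6), we obtain for the difference (34) of the unitary groups,
\[
\widetilde U^{\varepsilon}(t/\varepsilon) - \widetilde U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) = -i\varepsilon \int_0^{t/\varepsilon} \widetilde U^{\varepsilon}(\varepsilon^{-1} t - s)\, [\widetilde A_n, \widetilde H_{\rm diag}^{n}]\, \widetilde U_{\rm diag}^{\varepsilon,n}(s)\, ds + O(\varepsilon). \tag{35}
\]
Abbreviating $X_n(s) = \widetilde U^{\varepsilon}(-s)\, \widetilde U_{\rm diag}^{\varepsilon,n}(s)$ and $A_n(s) = \widetilde U_{\rm diag}^{\varepsilon,n}(-s)\, \widetilde A_n\, \widetilde U_{\rm diag}^{\varepsilon,n}(s)$, we get, using partial integration in (35),
\begin{align*}
& -i\varepsilon\, \widetilde U^{\varepsilon}(t/\varepsilon) \int_0^{t/\varepsilon} X_n(s)\, \widetilde U_{\rm diag}^{\varepsilon,n}(-s)\, [\widetilde A_n, \widetilde H_{\rm diag}^{n}]\, \widetilde U_{\rm diag}^{\varepsilon,n}(s)\, ds \\
&\quad = \varepsilon\, \widetilde U^{\varepsilon}(t/\varepsilon) \int_0^{t/\varepsilon} X_n(s) \Bigl( \frac{d}{ds} A_n(s) \Bigr) ds \\
&\quad = \varepsilon \bigl( \widetilde A_n\, \widetilde U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) - \widetilde U^{\varepsilon}(t/\varepsilon)\, \widetilde A_n \bigr) - \varepsilon\, \widetilde U^{\varepsilon}(t/\varepsilon) \int_0^{t/\varepsilon} \Bigl( \frac{d}{ds} X_n(s) \Bigr) A_n(s)\, ds \\
&\quad = \varepsilon \bigl( \widetilde A_n\, \widetilde U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) - \widetilde U^{\varepsilon}(t/\varepsilon)\, \widetilde A_n \bigr) - i\varepsilon\, \widetilde U^{\varepsilon}(t/\varepsilon) \int_0^{t/\varepsilon} \widetilde U^{\varepsilon}(-s)\, \widetilde W_{\rm od}^{\varepsilon,n}\, \widetilde A_n\, \widetilde U_{\rm diag}^{\varepsilon,n}(s)\, ds.
\end{align*}
For $\varepsilon \to 0$ the first term is $O(\varepsilon)$ since $\widetilde A_n$ is bounded and the second term is $O(\varepsilon)$, since $W_{\rm od}^{\varepsilon,n}$ is $O(\varepsilon)$ according to Lemma 2. □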
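It may help to see why the commutator form is needed: a direct norm bound on (35) yields only $\varepsilon \cdot (t/\varepsilon) \cdot O(1) = O(1)$. The two derivatives behind the partial integration can be sketched as follows, assuming formally that $U(s) = e^{-isH}$ and $U_d(s) = e^{-isH_d}$ with $H - H_d = W_{\rm od}$ (decorations dropped, domain questions ignored).

```latex
% A(s) = U_d(-s) A U_d(s), X(s) = U(-s) U_d(s).
\begin{align*}
\frac{d}{ds}A(s) &= i\,U_d(-s)\,[H_d,A]\,U_d(s) = -i\,U_d(-s)\,[A,H_d]\,U_d(s),\\
\frac{d}{ds}X(s) &= i\,U(-s)\,(H-H_d)\,U_d(s) = i\,U(-s)\,W_{\rm od}\,U_d(s).
\end{align*}
% The integrand of (35) is thus a total derivative, up to a term containing W_od = O(\varepsilon):
\[
\varepsilon\int_0^{t/\varepsilon} X\,\frac{dA}{ds}\,ds
 = \varepsilon\bigl[X A\bigr]_0^{t/\varepsilon}
   - \varepsilon\int_0^{t/\varepsilon}\frac{dX}{ds}\,A\,ds ,
\qquad
\Bigl\|\varepsilon\int_0^{t/\varepsilon}\frac{dX}{ds}\,A\,ds\Bigr\|
 \le \varepsilon\cdot\frac{t}{\varepsilon}\cdot\|W_{\rm od}\|\,\|A\| = O(\varepsilon).
\]
```

The boundary term carries an explicit factor $\varepsilon$, and the remaining integral gains its factor $\varepsilon$ from $W_{\rm od}$ rather than from the prefactor, so the long time interval no longer costs an order.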
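6. Convergence of the Position Operator

In this section we will study the asymptotics of the position operator $x^{\varepsilon}(t)$. As in the case of the unitaries we have to establish that the off-diagonal contributions to $x^{\varepsilon}(t)$ vanish in the limit $\varepsilon \to 0$.

Proof (of Theorem 3). Let $\psi \in D(|x|) \cap H^2$ and $n \in \mathcal{I}$. Then
\begin{align}
\bigl\| \bigl( x^{\varepsilon}(t) - x_{\rm diag}^{\varepsilon,n}(t) \bigr) \psi \bigr\|
&\le \bigl\| \bigl( x^{\varepsilon}(t) - U_{\rm diag}^{\varepsilon,n}(-t/\varepsilon)\, x^{\varepsilon}\, U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) \bigr) \psi \bigr\| \tag{36}\\
&\quad + \bigl\| \bigl( U_{\rm diag}^{\varepsilon,n}(-t/\varepsilon)\, x^{\varepsilon}\, U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) - x_{\rm diag}^{\varepsilon,n}(t) \bigr) \psi \bigr\|. \tag{37}
\end{align}
In order to estimate (36), note that we have
\[
x^{\varepsilon}(t) \psi = \varepsilon x \psi + \varepsilon \int_0^{t/\varepsilon} U^{\varepsilon}(-s)\, D_x\, U^{\varepsilon}(s) \psi\, ds \tag{38}
\]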
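and
\begin{align*}
U_{\rm diag}^{\varepsilon,n}(-t/\varepsilon)\, x^{\varepsilon}\, U_{\rm diag}^{\varepsilon,n}(t/\varepsilon) \psi
&= \varepsilon x \psi + \varepsilon \int_0^{t/\varepsilon} U_{\rm diag}^{\varepsilon,n}(-s) \bigl( D_x + i [W_{\rm diag}^{\varepsilon,n}, x] \bigr) U_{\rm diag}^{\varepsilon,n}(s) \psi\, ds \\
&= \varepsilon x \psi + \varepsilon \int_0^{t/\varepsilon} U_{\rm diag}^{\varepsilon,n}(-s)\, D_x\, U_{\rm diag}^{\varepsilon,n}(s) \psi\, ds + O(\varepsilon).
\end{align*}
The last equality holds, since $[W_{\rm diag}^{\varepsilon,n}, x] = O(\varepsilon)$ in $\mathcal{B}(L^2)$, as follows immediately from the fact that $[W^{\varepsilon}, P_n] = O(\varepsilon)$ and $[W^{\varepsilon}, Q_n] = O(\varepsilon)$, cf. proof of Lemma 2. Hence, using (38), the remaining term from (36) is
\begin{align}
& \varepsilon \int_0^{t/\varepsilon} \bigl( U^{\varepsilon}(-s) D_x U^{\varepsilon}(s) - U_{\rm diag}^{\varepsilon,n}(-s) D_x U_{\rm diag}^{\varepsilon,n}(s) \bigr) \psi\, ds \notag\\
&\quad = \int_0^{t} \bigl( U^{\varepsilon}(-s/\varepsilon) - U_{\rm diag}^{\varepsilon,n}(-s/\varepsilon) \bigr) D_x\, U_{\rm diag}^{\varepsilon,n}(s/\varepsilon) \psi\, ds \tag{39}\\
&\qquad + \int_0^{t} U^{\varepsilon}(-s/\varepsilon)\, D_x \bigl( U^{\varepsilon}(s/\varepsilon) - U_{\rm diag}^{\varepsilon,n}(s/\varepsilon) \bigr) \psi\, ds. \tag{40}
\end{align}
Using the fact that $V$ and $W$ are infinitesimally operator bounded with respect to $-\tfrac{1}{2}\Delta$ and that $\psi \in H^2$, we get for $\psi(s) := U_{\rm diag}^{\varepsilon,n}(s/\varepsilon) \psi$,
\begin{align*}
\| D_x^2 \psi(s) \| &\le \| H_{\rm diag}^{\varepsilon,n} \psi(s) \| + \| (V + W_{\rm diag}^{\varepsilon,n}) \psi(s) \| \\
&\le \| H_{\rm diag}^{\varepsilon,n} \psi \| + c_1 \| D_x^2 \psi(s) \| + c_2 \| \psi \|,
\end{align*}
with $c_1 < \tfrac{1}{2}$ and $c_2 < \infty$. Hence $\| D_x U_{\rm diag}^{\varepsilon,n}(s/\varepsilon) \psi \|_{H^1} \le c\, \| \psi \|_{H^2}$ with $c$ independent of $s$ and $\varepsilon$ and we can apply Theorem 2 to conclude that the operator acting on $\psi$ in (39) is $O(\varepsilon)$ in $\mathcal{B}(H^2, L^2)$ as $\varepsilon \to 0$.

We come to (40). Let $\psi(s) = (U^{\varepsilon}(s/\varepsilon) - U_{\rm diag}^{\varepsilon,n}(s/\varepsilon)) \psi$, then, by Cauchy-Schwarz,
\[
\| D_x \psi(s) \|^2 = \langle \psi(s), D_x^2 \psi(s) \rangle \le \| \psi(s) \|\, \| D_x^2 \psi(s) \|.
\]
The first factor is $O(\varepsilon)$ by Theorem 2 whereas the second is uniformly bounded by the same argument as in the treatment of (39) a few lines above.

Next we rewrite (37) as
\[
\varepsilon\, U_{\rm diag}^{\varepsilon,n}(-t/\varepsilon)\, x_{\rm od}^{n}\, U_{\rm diag}^{\varepsilon,n}(t/\varepsilon)
\]
with $x_{\rm od}^{n} := Q_n x P_n + P_n x Q_n$. This is certainly of order $O(\varepsilon)$ as $\varepsilon \to 0$ if $x_{\rm od}^{n}$ can be shown to be a bounded operator. To see this, note that in Bloch representation $x$ acts as $i \nabla_k$. Hence
\[
(\mathcal{U} Q_n x P_n \psi)(k) = i Q_n(k) \nabla_k P_n(k) (\mathcal{U}\psi)(k) = i Q_n(k) (\nabla_k P_n(k)) (\mathcal{U}\psi)(k)
\]
and thus $Q_n x P_n = i\, \widetilde Q_n \nabla_k \widetilde P_n$ in Bloch representation. Finally also $P_n x Q_n$ is bounded, since it is the adjoint of $Q_n x P_n$. □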
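The boundedness of $Q_n x P_n$ rests on the product rule in the Bloch representation. A short sketch, assuming as stated in the proof that $x$ acts as $i\nabla_k$ there, and writing $\phi = \mathcal{U}\psi$ (a shorthand introduced here):

```latex
\begin{align*}
\bigl(Q_n\,(i\nabla_k)\,P_n\,\phi\bigr)(k)
 &= i\,Q_n(k)\,\nabla_k\bigl(P_n(k)\,\phi(k)\bigr)\\
 &= i\,Q_n(k)\bigl(\nabla_kP_n(k)\bigr)\phi(k) + i\,Q_n(k)\,P_n(k)\,\nabla_k\phi(k)\\
 &= i\,Q_n(k)\bigl(\nabla_kP_n(k)\bigr)\phi(k).
\end{align*}
% The last step uses Q_n(k) P_n(k) = 0: the unbounded derivative of \phi drops out,
% and what remains is multiplication by the bounded operator of Lemma 1.
```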
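7. Semiclassical Equations of Motion for the Position Operator

As we have shown, on the macroscopic scale the position and quasimomentum operators commute with the projection on isolated bands. Thus it remains to investigate the semiclassical limit for each isolated band separately. For this purpose we note that any $\psi \in \widetilde P_n \mathcal{H}$ is of the form $\psi_n(k) \varphi_n(x, k)$ with $\psi_n \in L^2(M^*)$. Since $\varphi_n$ already satisfies (8), we have to extend the Bloch coefficients periodically and hence consider $\psi_n \in L^2(\mathbb{T}^*)$ from now on.

First we determine how $H_{\rm diag}^{\varepsilon,n}$ acts on $L^2(\mathbb{T}^*)$. We have $[\widetilde H_{\rm per}, \widetilde P_n] = 0$ and $\widetilde H_{\rm per}$ acts as multiplication by $E_n(k)$. For $W_{\rm diag}^{\varepsilon,n}$ we have
\begin{align}
\bigl( \widetilde P_n \widetilde W^{\varepsilon} \widetilde P_n\, \mathcal{U}\psi \bigr)(k, x)
&= (2\pi)^{-d/2} \int_{\mathbb{R}^d} \widehat W^{\varepsilon}(p)\, \langle \varphi_n(k), \varphi_n(k - p) \rangle_{L^2(M)}\, \psi_n(k - p)\, dp \;\varphi_n(k, x) \notag\\
&=: \bigl( W^{\varepsilon,n} \psi_n \bigr)(k)\, \varphi_n(x, k). \tag{41}
\end{align}
Thus $H_{\rm diag}^{\varepsilon,n}$ restricted to $\widetilde P_n \mathcal{H}$ is unitarily equivalent to $H^{\varepsilon,n} := E_n(k) + W^{\varepsilon,n}$. To be able to use techniques from semiclassics we next approximate $W^{\varepsilon,n}$ by the operator $W_{\rm sc}^{\varepsilon,n} = W(i\varepsilon \nabla_k)$ acting on $L^2(\mathbb{T}^*)$, i.e., $\nabla_k$ is defined with periodic boundary conditions.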
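Lemma 6. For any $n \in \mathcal{I}$ there is a $c < \infty$ such that
\[
\bigl\| W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n} \bigr\|_{\mathcal{B}(L^2(\mathbb{T}^*))} \le c\, \varepsilon^2. \tag{42}
\]
Proof. By definition we have
\[
\bigl( W_{\rm sc}^{\varepsilon,n} \psi \bigr)(k) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} \widehat W^{\varepsilon}(p)\, \psi(k - p)\, dp,
\]
and therefore
\[
\bigl( (W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n}) \psi \bigr)(k) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} \widehat W^{\varepsilon}(p) \bigl( \langle \varphi_n(k), \varphi_n(k - p) \rangle_{L^2(M)} - 1 \bigr) \psi(k - p)\, dp. \tag{43}
\]
As will be shown, there exists a constant $c$ such that
\[
\bigl| \langle \varphi_n(k), \varphi_n(k - p) \rangle_{L^2(M)} - 1 \bigr| \le c\, |p|^2 \tag{44}
\]
for Lebesgue almost all $k$. Therefore we conclude
\begin{align*}
\bigl\| (W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n}) \psi \bigr\|_{L^2(\mathbb{T}^*)}
&\le c\, \varepsilon^2\, \Bigl\| \int_{\mathbb{R}^d} |\widehat W^{\varepsilon}(p)|\, \frac{|p|^2}{\varepsilon^2}\, |\psi(\cdot - p)|\, dp \Bigr\|_{L^2(\mathbb{T}^*)} \\
&\le \tilde c\, \varepsilon^2\, \| \psi \|_{L^2(\mathbb{T}^*)}.
\end{align*}
To show (44) note that one must choose $\varphi_n(k)$ such that the map $k \mapsto \varphi_n(k) \in L^2(M)$ is smooth. This is possible because according to Lemma 1 the projections $P_n(k)$ depend smoothly on $k$ and hence one can locally define $\varphi_n(k) = P_n(k) \varphi_n(k_0) / \| P_n(k) \varphi_n(k_0) \|$. Now we cover $\mathbb{T}^*$ by finitely many open sets $U_i$ such that $\varphi_n^{i}(k)$ is defined on the closure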
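of each $U_i$ in the way described above. One obtains a family $\varphi_n^{i}(k)$ of eigenfunctions which can be connected to a smooth function $\varphi_n(k)$ on all of $M$. Taylor expansion yields
\[
\varphi_n(k - p) = \varphi_n(k) - p \cdot \nabla_k \varphi_n(k) + \tfrac{1}{2}\, p \cdot \mathrm{H}(\varphi_n)(k')\, p,
\]
where $\mathrm{H}(\varphi_n)$ denotes the Hessian and $\tfrac{1}{2}\, p \cdot \mathrm{H}(\varphi_n)(k')\, p$ is the Lagrangian remainder. In view of $\langle \varphi_n(k), \nabla_k \varphi_n(k) \rangle_{L^2(M)} = 0$, which follows from comparing (30) with
\[
(\nabla_k P_n \psi)(k) = \langle \varphi_n(k), \psi(\cdot, k) \rangle\, \nabla_k \varphi_n(k) + \langle \nabla_k \varphi_n(k), \psi(\cdot, k) \rangle\, \varphi_n(k),
\]
we obtain
\[
\bigl| \langle \varphi_n(k), \varphi_n(k - p) \rangle_{L^2(M)} - 1 \bigr| \le c(k)\, |p|^2 .
\]
Here $c(k) = \tfrac{1}{2} \sum_{i,j} | \langle \varphi_n(k'), \partial_{k_i} \partial_{k_j} \varphi_n(k') \rangle |$. However, $c(k)$ is bounded uniformly in $k$, since $\varphi_n(k)$ is smooth on each compact $\bar U_i$. □

We define now the semiclassical Hamiltonian $H_{\rm sc}^{\varepsilon,n}$,
\[
H_{\rm sc}^{\varepsilon,n} = E_n(k) + W(i\varepsilon \nabla_k) \tag{45}
\]
acting on $L^2(\mathbb{T}^*)$. Then Lemma 6 shows that the difference $H^{\varepsilon,n} - H_{\rm sc}^{\varepsilon,n}$ is of order $O(\varepsilon^2)$ uniformly in $\mathcal{B}(L^2(\mathbb{T}^*))$ and hence (cf. Sect. 5) the difference of the corresponding unitary groups is $O(\varepsilon)$.

Corollary 1. Let $U_{\rm sc}^{\varepsilon,n}(t) = e^{-itH_{\rm sc}^{\varepsilon,n}}$ and $U^{\varepsilon,n}(t) = e^{-itH^{\varepsilon,n}}$. For $n \in \mathcal{I}$ and $T < \infty$ there is a $c < \infty$ such that for all $t \in [0, T]$,
\[
\bigl\| U^{\varepsilon,n}(t/\varepsilon) - U_{\rm sc}^{\varepsilon,n}(t/\varepsilon) \bigr\|_{\mathcal{B}(L^2(\mathbb{T}^*))} \le c\, \varepsilon.
\]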
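Corollary 1 follows from Lemma 6 by a Duhamel estimate. A sketch, assuming $H^{\varepsilon,n}$ and $H_{\rm sc}^{\varepsilon,n}$ are self-adjoint on a common domain so that the integral formula below is justified:

```latex
\begin{align*}
U^{\varepsilon,n}(\tau)-U^{\varepsilon,n}_{\rm sc}(\tau)
 &= -i\int_0^{\tau}U^{\varepsilon,n}(\tau-s)\,
    \bigl(H^{\varepsilon,n}-H^{\varepsilon,n}_{\rm sc}\bigr)\,
    U^{\varepsilon,n}_{\rm sc}(s)\,ds,\\
\bigl\|U^{\varepsilon,n}(t/\varepsilon)-U^{\varepsilon,n}_{\rm sc}(t/\varepsilon)\bigr\|
 &\le \frac{t}{\varepsilon}\,\bigl\|H^{\varepsilon,n}-H^{\varepsilon,n}_{\rm sc}\bigr\|
 \le \frac{T}{\varepsilon}\,c\,\varepsilon^{2} = c\,T\,\varepsilon .
\end{align*}
```

Because the difference of the generators is bounded and of order $\varepsilon^2$, no partial integration is needed here, in contrast to the proof of Theorem 2.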
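The semiclassical limit for $U_{\rm sc}^{\varepsilon,n}(t/\varepsilon)$ on $L^2(\mathbb{T}^*)$ is well studied. We refer to [8, 14, 22]. As a consequence the strong limits
\begin{align}
\lim_{\varepsilon \to 0} U_{\rm sc}^{\varepsilon,n}(-t/\varepsilon)\, (i\varepsilon \nabla_k)\, U_{\rm sc}^{\varepsilon,n}(t/\varepsilon) &= r_n(t; k), \tag{46}\\
\lim_{\varepsilon \to 0} U_{\rm sc}^{\varepsilon,n}(-t/\varepsilon)\, k\, U_{\rm sc}^{\varepsilon,n}(t/\varepsilon) &= k_n(t; k) \tag{47}
\end{align}
exist on $H^1(\mathbb{T}^*)$ and the errors are of order $O(\varepsilon)$. $r_n$ and $k_n$ act as multiplication operators and are defined as in (10) with initial conditions $(r_n(0), k_n(0)) = (0, k)$.

Since the restriction of $\varepsilon x_{\rm diag}^{n}$ to the $n$th band subspace is unitarily equivalent to $i\varepsilon \nabla_k$ on $L^2(\mathbb{T}^*)$, we can, in view of Theorem 3, conclude the proof of Theorem 1 by showing

Lemma 7. For $n \in \mathcal{I}$ and $T < \infty$ there is a $c < \infty$ such that for $t \in [0, T]$ we have that
\[
\bigl\| U^{\varepsilon,n}(-t/\varepsilon) (i\varepsilon \nabla_k) U^{\varepsilon,n}(t/\varepsilon) - U_{\rm sc}^{\varepsilon,n}(-t/\varepsilon) (i\varepsilon \nabla_k) U_{\rm sc}^{\varepsilon,n}(t/\varepsilon) \bigr\|_{\mathcal{B}(L^2(\mathbb{T}^*))} \le c\, \varepsilon. \tag{48}
\]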
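Proof. The proof of (48) is analogous to the proof of Theorem 3 in Sect. 6, however, simpler. As in (38) we have
\[
U_{\rm sc}^{\varepsilon,n}(-t/\varepsilon) (i\varepsilon \nabla_k) U_{\rm sc}^{\varepsilon,n}(t/\varepsilon) = i\varepsilon \nabla_k + \varepsilon \int_0^{t/\varepsilon} U_{\rm sc}^{\varepsilon,n}(-s)\, [i\nabla_k, H_{\rm sc}^{\varepsilon,n}]\, U_{\rm sc}^{\varepsilon,n}(s)\, ds
\]
and
\begin{align*}
U^{\varepsilon,n}(-t/\varepsilon) (i\varepsilon \nabla_k) U^{\varepsilon,n}(t/\varepsilon)
&= i\varepsilon \nabla_k + \varepsilon \int_0^{t/\varepsilon} U^{\varepsilon,n}(-s)\, [i\nabla_k, H^{\varepsilon,n}]\, U^{\varepsilon,n}(s)\, ds \\
&= i\varepsilon \nabla_k + \varepsilon \int_0^{t/\varepsilon} U^{\varepsilon,n}(-s) \bigl( [i\nabla_k, H_{\rm sc}^{\varepsilon,n}] + [i\nabla_k, W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n}] \bigr) U^{\varepsilon,n}(s)\, ds.
\end{align*}
Now $[i\nabla_k, H_{\rm sc}^{\varepsilon,n}] = i\nabla_k E_n(k)$ is bounded, and (48) follows from Corollary 1 if we can show that $[i\nabla_k, W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n}] = O(\varepsilon)$ in $\mathcal{B}(L^2(\mathbb{T}^*))$. Noting that $((W^{\varepsilon,n} - W_{\rm sc}^{\varepsilon,n}) \psi)(k)$ is given by (43), this can be shown by an argument similar to the one in Lemma 6. □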
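8. Semiclassical Equations of Motion for General Observables

We proceed to more general semiclassical observables. First note that Theorem 4 follows immediately from the results obtained so far (Theorem 2, Corollary 1 and (47)), since multiplication with $k$ in Bloch representation is bounded. Hence we now have that
\[
\bigl\| \bigl( x^{\varepsilon}(t) - \mathcal{U}^{-1} R(t)\, \mathcal{U} \bigr) \psi \bigr\| = O(\varepsilon) \tag{49}
\]
for all $\psi \in \operatorname{Ran} P_{\mathcal{I}} \cap D(|x|) \cap H^2$ and that
\[
\bigl\| \bigl( k^{\varepsilon}(t) - \mathcal{U}^{-1} K(t)\, \mathcal{U} \bigr) \psi \bigr\| = O(\varepsilon) \tag{50}
\]
for all $\psi \in \operatorname{Ran} P_{\mathcal{I}}$. We next consider bounded continuous functions of $x^{\varepsilon}(t)$ and $k^{\varepsilon}(t)$:

Lemma 8. Let $f \in C_\infty(\mathbb{R}^d)$ and $g \in C(\mathbb{T}^*)$. Then for all $\psi \in \operatorname{Ran} P_{\mathcal{I}}$ and $T < \infty$ there is a $c < \infty$ such that for $t \in [0, T]$ we have
\[
\bigl\| \bigl( f(x^{\varepsilon}(t)) - \mathcal{U}^{-1} f(R(t))\, \mathcal{U} \bigr) \psi \bigr\| \le c\, \varepsilon \tag{51}
\]
and
\[
\bigl\| \bigl( g(k^{\varepsilon}(t)) - \mathcal{U}^{-1} g(K(t))\, \mathcal{U} \bigr) \psi \bigr\| \le c\, \varepsilon. \tag{52}
\]
Proof. We will sketch the proof for $x^{\varepsilon}(t)$. First note that $\bar R(t) := \mathcal{U}^{-1} R(t)\, \mathcal{U}$ is a self-adjoint operator that commutes with $P_{\mathcal{I}}$. Hence the sets $D_\pm := (\bar R(t) \pm i)(\operatorname{Ran} P_{\mathcal{I}} \cap D(|x|) \cap H^2)$ are dense in $\operatorname{Ran} P_{\mathcal{I}}$. (Since $R$ and $x^{\varepsilon}$ are vectors of operators in $\mathbb{R}^d$, note that this and the following statements hold componentwise.) For $\psi \in D_\pm$ we have
\[
\bigl( (x^{\varepsilon}(t) \pm i)^{-1} - (\bar R(t) \pm i)^{-1} \bigr) \psi = (x^{\varepsilon}(t) \pm i)^{-1} \bigl( \bar R(t) - x^{\varepsilon}(t) \bigr) \varphi \tag{53}
\]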
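for $\varphi = (\bar R(t) \pm i)^{-1} \psi \in \operatorname{Ran} P_{\mathcal{I}} \cap D(|x|) \cap H^2$. Thus, by Theorem 2, (53) is of order $O(\varepsilon)$ (strongly) as $\varepsilon \to 0$ and, since $D_\pm$ are dense in $\operatorname{Ran} P_{\mathcal{I}}$, $(x^{\varepsilon}(t) \pm i)^{-1}$ strongly approach $(\bar R(t) \pm i)^{-1}$ on $\operatorname{Ran} P_{\mathcal{I}}$ with an error of order $O(\varepsilon)$.

Using the fact that polynomials in $(x_j \pm i)^{-1}$, $j = 1, \ldots, d$, are dense in $C_\infty(\mathbb{R}^d)$ one concludes that the convergence $x^{\varepsilon}(t) \to \bar R(t)$ on $\operatorname{Ran} P_{\mathcal{I}}$ in the "strong resolvent sense" implies
\[
\lim_{\varepsilon \to 0} \bigl\| \bigl( f(x^{\varepsilon}(t)) - f(\bar R(t)) \bigr) \psi \bigr\| = O(\varepsilon)
\]
for all $f \in C_\infty(\mathbb{R}^d)$ and $\psi \in \operatorname{Ran} P_{\mathcal{I}}$ (cf. Theorem VIII.20 in [20]). However, by the functional calculus for self-adjoint operators we have $f(\mathcal{U}^{-1} R(t)\, \mathcal{U}) = \mathcal{U}^{-1} f(R(t))\, \mathcal{U}$ and (51) follows. Clearly (52) follows analogously. □

Proof (of Theorem 5). Let $a \in O(0)$. Referring again to the general Stone-Weierstraß theorem we can uniformly approximate $a(x, \xi)$ by a sum of products, i.e. $a(x, \xi) = \sum_{i=0}^{\infty} a_i f_i(x) g_i(\xi)$ with $f_i \in C_\infty(\mathbb{R}^d)$, $g_i \in C(\mathbb{T}^*)$, $\sum_i |a_i| < \infty$ and $\sup_{i \in \mathbb{N},\, x \in \mathbb{R}^d,\, \xi \in \mathbb{T}^*} |f_i(x) g_i(\xi)| < \infty$. Hence in order to prove Theorem 5 we are left to show that for arbitrary $f \in C_\infty(\mathbb{R}^d)$ and $g \in C(\mathbb{T}^*)$,
\[
(f(x) g(\xi))^{W,\varepsilon}(t) \to \mathcal{U}^{-1} f(R(t))\, g(K(t))\, \mathcal{U} \tag{54}
\]
strongly on $\operatorname{Ran} P_{\mathcal{I}}$. To see this recall the so-called product rule for quantum observables (cf. [22]). It states, in particular, that for two symbols $a, b \in O(0)$,
\[
\bigl\| \bigl( (ab)^{W,\varepsilon} - a^{W,\varepsilon} b^{W,\varepsilon} \bigr) \psi \bigr\| = O(\varepsilon).
\]
Applied to our case this yields
\[
(f(x) g(\xi))^{W,\varepsilon}(t) \to \bigl( f(x)^{W,\varepsilon} g(\xi)^{W,\varepsilon} \bigr)(t) = f(x^{\varepsilon}(t))\, g(k^{\varepsilon}(t)).
\]
Finally, since $f$ and $g$ are bounded, Lemma 8 implies (54) and thus Theorem 5. □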
9. Band Crossings We proved the semiclassical limit for isolated bands only. In principle, there are two distinct mechanisms of how this assumption could be violated. First of all a band could be isolated but have a constant multiplicity larger than one. This occurs, e.g., for the Dirac equation where because of spin the electron and positron bands are both twofold degenerate. A systematic study is only recent [9, 24] and leads to a matrix valued symplectic structure for the semiclassical dynamics. For periodic potentials degeneracies are the exception. They form a real analytic subvariety of the Bloch variety B = {(k, λ) ∈ Rd × R | ∃f ∈ L2 (M) : Hper (k)f = λf } and have a dimension at least one less than the dimension of B [17, 25]. Thus points of band crossings have a k-Lebesgue measure zero. From the study of band structures in solids one knows that band crossings indeed occur. Thus it is of interest to understand the extra complications coming from band crossings. There are two types of band crossings. The first one is removable through a proper analytic continuation of the bands. In a way, removable band crossings correspond to a
wrong choice of the fundamental domain. E.g. for $V = 0$ we may artificially introduce a lattice $\Gamma$. The bands touch then at the boundary of $M^*$. Upon analytic continuation we recover the single band $E_1(k) = k^2/2$ with $M^* = \mathbb{R}^d$. In one dimension all band crossings can be removed [21]. Thus, with the adjustment discussed, our result fully covers the case $d = 1$.

For $d \ge 2$ generically band crossings cannot be removed. It is then of great physical interest to understand how a wave packet tunnels into a neighboring band through points of degeneracy (or almost degeneracy). For a careful asymptotic analysis in particular model systems we refer to the monumental work of G. Hagedorn [13]. Gerard [11] considers a model system with two bands in two dimensions, i.e., the role of $-\tfrac{1}{2}\Delta + V$ is taken by
\[
\begin{pmatrix} k_1 & k_2 \\ k_2 & -k_1 \end{pmatrix}.
\]
He investigates the semiclassical limit and proves that the particle may tunnel to the other band with a probability which depends on how well the initial wave packet is concentrated near a semiclassical orbit hitting the singularity.

Acknowledgement. FH gratefully acknowledges the financial support by the Deutsche Forschungsgemeinschaft via the Graduiertenkolleg Mathematik im Bereich ihrer Wechselwirkung mit der Physik at the LMU München.
References

1. Asch, J., Knauf, A.: Motion in Periodic Potentials. Nonlinearity 11, 175–200 (1998)
2. Ashcroft, N.W., Mermin, N.D.: Solid State Physics. New York: Saunders, 1976
3. Avron, J.E., Elgart, A.: Adiabatic Theorem without a Gap Condition. Commun. Math. Phys. 203, 445–463 (1999)
4. Avron, J.E., Seiler, R., Yaffe, L.G.: Adiabatic Theorems and Applications to the Quantum Hall Effect. Commun. Math. Phys. 110, 33–49 (1987)
5. Buslaev, V.: Semiclassical Approximation for Equations with Periodic Coefficients. Russ. Math. Surveys 42, No. 6, 97–125 (1987)
6. Buslaev, V., Grigis, A.: Imaginary Parts of Stark–Wannier Resonances. J. Math. Phys. 39, No. 5, 2520–2550 (1998)
7. Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger Operators. Berlin–Heidelberg–New York: Springer, 1987
8. Folland, G.B.: Harmonic Analysis in Phase Space. Princeton, NJ: Princeton University Press, 1989
9. Gerard, P., Markowich, P.A., Mauser, N.J., Poupaud, F.: Homogenization Limits and Wigner Transforms. Commun. Pure Appl. Math. 50, 323–380 (1997)
10. Gerard, C., Martinez, A., Sjöstrand, J.: A Mathematical Approach to the Effective Hamiltonian in Perturbed Periodic Problems. Commun. Math. Phys. 142, 217–244 (1991)
11. Gerard, P.: Semiclassical Limits. Talk at Nonlinear Equations in Many-Particle Systems, Oberwolfach, 1999
12. Guillot, J.C., Ralston, J., Trubowitz, E.: Semi-Classical Asymptotics in Solid State Physics. Commun. Math. Phys. 116, 401–415 (1988)
13. Hagedorn, G.A.: Molecular Propagation through Electron Energy Level Crossings. Memoirs Am. Math. Soc. 111, (1994)
14. Hövermann, F.: Quantum Motion in Periodic Potentials. Dissertation, LMU München, 1999
15. Kato, T.: Perturbation Theory for Linear Operators. Berlin–Heidelberg–New York: Springer, 1980
16. Kohn, W.: Theory of Bloch Electrons in a Magnetic Field: The Effective Hamiltonian. Phys. Rev. 115, No. 6, 1460–1478 (1959)
17. Kuchment, P.: Floquet Theory for Partial Differential Equations. Basel–Boston: Birkhäuser, 1993
18. Markowich, P.A., Mauser, N.J., Poupaud, F.: A Wigner-function Theoretic Approach to (Semi)-Classical Limits: Electrons in a Periodic Potential. J. Math. Phys. 35, No. 3, 1066–1094 (1994)
19. Nenciu, G.: Dynamics of Band Electrons in Electric and Magnetic Fields: Rigorous Justification of the Effective Hamiltonians. Rev. Mod. Phys. 63, No. 1, 91–127 (1991)
20. Reed, M., Simon, B.: Methods of Modern Mathematical Physics I. New York: Academic Press, 1972
21. Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV. New York: Academic Press, 1978
22. Robert, D.: Autour de l’Approximation Semi-Classique. Basel–Boston: Birkhäuser, 1987
23. Spohn, H.: Long Time Asymptotics for Quantum Particles in a Periodic Potential. Phys. Rev. Lett. 77, No. 7, 1198–1201 (1996)
24. Spohn, H.: Semiclassical Limit of the Dirac Equation and Spin Precession. Annals of Physics 282, 420–431 (2000)
25. Wilcox, C.H.: Theory of Bloch Waves. J. Anal. Math. 33, 146–167 (1978)
26. Zak, J.: Dynamics of Electrons in Solids in External Fields. Phys. Rev. 168, No. 3, 686–695 (1968)

Communicated by A. Jaffe
Commun. Math. Phys. 215, 631 – 682 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Integrable Highest Weight Modules over Affine Superalgebras and Appell’s Function
Victor G. Kac1, Minoru Wakimoto2
1 Department of Mathematics, M.I.T., Cambridge, MA 02139, USA.
E-mail: [email protected]
2 Graduate School of Mathematics, Kyushu University, Fukuoka 812-81, Japan.
E-mail: [email protected] Received: 17 April 2000 / Accepted: 7 July 2000
Abstract: We classify integrable irreducible highest weight representations of nontwisted affine Lie superalgebras. We give a free field construction in the level 1 case. The analysis of this construction shows, in particular, that in the simplest case of the $s\ell(2|1)$ level 1 affine superalgebra the characters are expressed in terms of the Appell elliptic function. Our results demonstrate that the representation theory of affine Lie superalgebras is quite different from that of affine Lie algebras.

0. Introduction

In this paper we continue the study of integrable irreducible highest weight modules over affine superalgebras that we began in [KW].

First, let us recall the definition of an integrable module over an ordinary affine Kac–Moody algebra $\hat{\mathfrak{g}}$ [K3]. Let $\mathfrak{g}$ be a finite-dimensional simple or abelian Lie algebra over $\mathbb{C}$ with a symmetric invariant bilinear form $(.|.)$. Recall that the associated affine algebra is
\[
\hat{\mathfrak{g}} = (\mathbb{C}[t, t^{-1}] \otimes_{\mathbb{C}} \mathfrak{g}) \oplus \mathbb{C}K \oplus \mathbb{C}d \tag{0.1}
\]
with the following commutation relations ($a, b \in \mathfrak{g}$; $m, n \in \mathbb{Z}$ and $a(m)$ stands for $t^m \otimes a$):
\[
[a(m), b(n)] = [a, b](m + n) + m \delta_{m,-n} (a|b) K, \quad [d, a(m)] = -m\, a(m), \quad [K, \hat{\mathfrak{g}}] = 0. \tag{0.2}
\]
We identify $\mathfrak{g}$ with the subalgebra $1 \otimes \mathfrak{g}$. The bilinear form $(.|.)$ extends from $\mathfrak{g}$ to a symmetric invariant bilinear form on $\hat{\mathfrak{g}}$ by:
\[
(a(m)|b(n)) = \delta_{m,-n} (a|b), \quad (\mathbb{C}[t, t^{-1}] \otimes \mathfrak{g} \,|\, \mathbb{C}K + \mathbb{C}d) = 0, \quad (K|K) = (d|d) = 0, \quad (K|d) = -1. \tag{0.3}
\]
Supported in part by NSF grant DMS-9970007.
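The normalization $(K|d) = -1$ in (0.3) is forced by invariance together with the sign convention $[d, a(m)] = -m\,a(m)$ in (0.2). A short check, assuming $(.|.)$ is not identically zero on $\mathfrak{g}$:

```latex
% Invariance ([x,y]|z) = (x|[y,z]) with x = d, y = a(m), z = b(-m), m \neq 0.
\begin{align*}
([d,a(m)]\,|\,b(-m)) &= -m\,(a(m)\,|\,b(-m)) = -m\,(a|b),\\
(d\,|\,[a(m),b(-m)]) &= (d\,|\,[a,b](0)) + m\,(a|b)\,(d|K) = m\,(a|b)\,(K|d).
\end{align*}
% The term (d|[a,b](0)) vanishes by the second relation in (0.3).
% Equating both sides for some a, b with (a|b) \neq 0 gives (K|d) = -1.
```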
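Choose a Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ and let $\mathfrak{g} = \mathfrak{h} \oplus (\oplus_{\alpha \in \Delta}\, \mathfrak{g}_\alpha)$ be the root space decomposition, where $\mathfrak{g}_\alpha$ denotes the root space attached to a root $\alpha \in \Delta \subset \mathfrak{h}^*$. Let
\[
\hat{\mathfrak{h}} = \mathfrak{h} + \mathbb{C}K + \mathbb{C}d \tag{0.4}
\]
be the Cartan subalgebra of $\hat{\mathfrak{g}}$, and, as before, let $\mathfrak{g}_\alpha(m) = t^m \otimes \mathfrak{g}_\alpha$. A $\hat{\mathfrak{g}}$-module $V$ is called integrable if the following two properties hold [K3]:
\[
\hat{\mathfrak{h}} \text{ is diagonalizable on } V, \tag{0.5}
\]
\[
\text{all } \mathfrak{g}_\alpha(m)\ (\alpha \text{ a root of } \mathfrak{g},\ m \in \mathbb{Z}) \text{ are locally finite on } V. \tag{0.6}
\]
(Property (0.6) means that $\dim U(\mathfrak{g}_\alpha(m)) v < \infty$ for any $v \in V$.) It is easy to show that these two properties imply
\[
\mathfrak{g} \text{ is locally finite on } V \text{ (i.e., } \dim U(\mathfrak{g}) v < \infty \text{ for any } v \in V). \tag{0.7}
\]
Here and further $U(\mathfrak{a})$ denotes the universal enveloping algebra of a Lie (super)algebra $\mathfrak{a}$. Note also that condition (0.6) is vacuous if $\mathfrak{g}$ is abelian.

Let now $\mathfrak{g} = \mathfrak{g}_{\bar 0} + \mathfrak{g}_{\bar 1}$ be a finite-dimensional Lie superalgebra over $\mathbb{C}$ with an even symmetric invariant bilinear form $(.|.)$ (for a background on Lie superalgebras see [K1]). Recall that "even" means that $(\mathfrak{g}_{\bar 0}|\mathfrak{g}_{\bar 1}) = 0$, "symmetric" means that $(.|.)$ is symmetric on $\mathfrak{g}_{\bar 0}$ and skewsymmetric on $\mathfrak{g}_{\bar 1}$, and "invariant" means that $([a, b]|c) = (a|[b, c])$, $a, b, c \in \mathfrak{g}$. We shall assume, in addition, that $\mathfrak{g}_{\bar 0}$ is reductive:
\[
\mathfrak{g}_{\bar 0} = \oplus_{j=0}^{N}\, \mathfrak{g}_{\bar 0 j}, \tag{0.8}
\]
where $\mathfrak{g}_{\bar 0 0}$ is abelian and $\mathfrak{g}_{\bar 0 j}$ with $j \ge 1$ are simple Lie algebras.

The affine superalgebra $\hat{\mathfrak{g}}$ associated to the Lie superalgebra $\mathfrak{g}$ and the bilinear form $(.|.)$ is defined in exactly the same way as in the Lie algebra case by relations (0.2). Likewise, the invariant even symmetric bilinear form $(.|.)$ on $\hat{\mathfrak{g}}$ is defined by (0.3), and the Cartan subalgebra $\hat{\mathfrak{h}}$ is defined by (0.4) after a choice of a Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}_{\bar 0}$. Note that for each $j \in \{0, 1, \ldots, N\}$, the superalgebra $\hat{\mathfrak{g}}$ contains an affine Kac–Moody algebra $\hat{\mathfrak{g}}_{\bar 0 j}$ associated to $\mathfrak{g}_{\bar 0 j}$.

We shall see that condition (0.6) of integrability is too strong in the superalgebra case, as for most of the affine superalgebras it allows only trivial highest weight modules. This forces us to consider weaker conditions (cf. [KW]):

Definition 0.1. Given a subset $J \subset \{1, \ldots, N\}$, a $\hat{\mathfrak{g}}$-module $V$ is called $J$-integrable if it satisfies conditions (0.5) and (0.7) and if it is integrable as $\hat{\mathfrak{g}}_{\bar 0 j}$-module for all $j \in J$.

Let $\mathfrak{g} = \mathfrak{h} \oplus (\oplus_{\alpha \in \Delta}\, \mathfrak{g}_\alpha)$ be a root space decomposition of the Lie superalgebra $\mathfrak{g}$ with respect to a Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}_{\bar 0}$. Choose a set of positive roots $\Delta_+$ in $\Delta$ and let $\mathfrak{n}_+ = \oplus_{\alpha \in \Delta_+} \mathfrak{g}_\alpha$. For each $\Lambda \in \hat{\mathfrak{h}}^*$ one defines an irreducible highest weight module $L(\Lambda)$ over $\hat{\mathfrak{g}}$ as the (unique) irreducible $\hat{\mathfrak{g}}$-module for which there exists a non-zero vector $v_\Lambda$ such that
\[
h v_\Lambda = \Lambda(h) v_\Lambda \text{ for } h \in \hat{\mathfrak{h}}, \quad \mathfrak{n}_+ v_\Lambda = 0, \quad \mathfrak{g}(m) v_\Lambda = 0 \text{ for } m > 0, \tag{0.9}
\]
where, as before, $\mathfrak{g}(m) = t^m \otimes \mathfrak{g}$. The number $k = \Lambda(K)$ is called the level of $L(\Lambda)$ and of $\Lambda$. Note that $K = kI$ on $L(\Lambda)$ and that $\bar L(\Lambda) := U(\mathfrak{g}) v_\Lambda$ is an irreducible highest weight module over $\mathfrak{g}$.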
In Sect. 1 we describe a general approach to the classification of irreducible integrable highest weight modules over arbitrary Kac–Moody superalgebras, and in Sects. 2 and 6 give their complete classification in the affine (non-twisted) case, using Serganova’s odd reflections.

In Sect. 3 we give a free field realization of all level 1 integrable highest weight modules over $\widehat{g\ell}(m|n)$, which leads to a "quasiparticle" character formula for these modules and to a "theta function" type character formula. This construction may be viewed as a generalization of the classical boson-fermion correspondence based on the oscillator algebra $\widehat{g\ell}(1)$ and of the super boson-fermion correspondence based on $\widehat{g\ell}(1|1)$ [KL]. The former produces the classical vertex operators and relates representation theory of $\widehat{g\ell}(1)$ to the denominator identity for $\widehat{s\ell}(2)$, while the latter produces vertex operators for the symplectic bosons and relates representation theory of $\widehat{g\ell}(1|1)$ to the denominator identity for $\widehat{s\ell}(2|1)$ (see [K4]).

In Sect. 4 we show that the "theta function" type character formula for $\widehat{s\ell}(m|1)$ ($m \ge 2$) is a product of a theta function, a power of the eta function, and a more "exotic" function, called a multivariable Appell function. The classical Appell function appeared in the 1880’s in the papers by Appell [A] and by Hermite in their study of elliptic functions. Most recently this function has been discussed in [P]. The study of asymptotics of Appell’s functions gives the high temperature asymptotics of integrable level 1 $\widehat{s\ell}(m|1)$-characters. We also derive here formulas for branching functions for integrable level 1 $\widehat{s\ell}(m|1)$-modules restricted to the even subalgebra. They turn out to be certain "half" modular functions.

In Sect. 5 we relate integrable level 1 modules over $\widehat{g\ell}(m|n)$ to the denominator identity for $\widehat{s\ell}(m + 1|n)$, and as a result, we derive for these modules yet another, a Weyl type, character formula.

In Sect. 7 we give a free field realization of the two level 1 integrable highest weight modules over $\widehat{osp}(m|n)$, which generalizes the constructions for $\widehat{so}(m)$ and $\widehat{sp}(n)$ from [KP1, F] and [FF]. These lead to character formulas and high temperature asymptotics of the characters.

In Sect. 8 we show that integrability is a necessary condition for an irreducible highest weight $\hat{\mathfrak{g}}$-module to be a module over the associated vertex algebra, and that in the level 1 case this condition is sufficient. We thus get examples of rational vertex algebras for which the $\mathbb{C}$-span of normalized (super)characters is not $SL(2, \mathbb{Z})$-invariant. The latter property was proved in [Z] under certain additional assumptions, and it was generally believed that these assumptions were superfluous.

In Sect. 9 we discuss some open problems.

It is interesting to note that in the "super" case a number of new interesting phenomena occur. The level gets quantized by the integrability condition, but in almost all cases the number of integrable modules is infinite. This is the case for the lowest, level 1, integrable $\widehat{s\ell}(m|n)$-modules which apparently causes the specialized characters and branching functions to lose their customary modularity properties, which are so ubiquitous in the affine Lie algebra case [KP2, K3]. However, in the cases when the number of characters of given level is finite, like, for example, the $k = 1$ $\widehat{osp}(m|n)$ case, the specialized normalized characters are still modular, though their $\mathbb{C}$-span is no longer $SL(2, \mathbb{Z})$-invariant as in the affine Lie algebra case.

It is also interesting to note that while the characters of affine Lie algebras are global sections of line bundles on abelian varieties, the characters of affine Lie superalgebras are related to global sections of rank 2 vector bundles on abelian varieties, as the work of Polishchuk [P] on Appell’s function apparently indicates.
1. Integrability of Highest Weight Modules over Kac–Moody Superalgebras

Consider the following data: $D = \{\mathfrak{h}, I, I_1, \Pi^\vee, \Pi\}$, where $\mathfrak{h}$ is a vector space, $I$ is an index set, $I_1$ is a subset of $I$, $\Pi^\vee = \{\alpha_i^\vee\}_{i \in I}$ and $\Pi = \{\alpha_i\}_{i \in I}$ are linearly independent sets of vectors in $\mathfrak{h}$ and $\mathfrak{h}^*$ respectively indexed by $I$. One associates to these data a Lie superalgebra $\mathfrak{g}(D)$ defined as the quotient of the Lie algebra on generators $e_i, f_i$ ($i \in I$) and $\mathfrak{h}$, the generators $e_i$ and $f_i$ for $i \in I_1$ being odd and all other generators being even, and the standard relations ($i, j \in I$, $h \in \mathfrak{h}$):
\[
[\mathfrak{h}, \mathfrak{h}] = 0, \quad [e_i, f_j] = \delta_{ij} \alpha_i^\vee, \quad [h, e_i] = \langle \alpha_i, h \rangle e_i, \quad [h, f_i] = -\langle \alpha_i, h \rangle f_i,
\]
by the maximal graded with respect to the root space decomposition ideal intersecting $\mathfrak{h}$ trivially (cf. [K1, K3]).

The commutative ad-diagonalizable subalgebra $\mathfrak{h}$ of $\mathfrak{g}(D)$ is called the Cartan subalgebra, $\Pi$ and $\Pi^\vee$ are called the sets of simple roots and coroots respectively, elements $e_i$ and $f_i$ ($i \in I$) are called Chevalley generators, etc. One defines the notions of roots and root spaces in the usual way (cf. [K1, K3]). Let $\mathfrak{n}_+$ (resp. $\mathfrak{n}_-$) denote the subalgebra of $\mathfrak{g}$ generated by the $e_i$’s (resp. $f_i$’s). Then, as usual, one has the triangular decomposition: $\mathfrak{g} = \mathfrak{n}_- + \mathfrak{h} + \mathfrak{n}_+$.

Let $a_{ij} = \langle \alpha_j, \alpha_i^\vee \rangle$. The matrix $A = (a_{ij})_{i,j \in I}$ is called the Cartan matrix of the data $D$ (and of $\mathfrak{g}(D)$). A root of $\mathfrak{g}(D)$ is called even (resp. odd) if the attached root space is even (resp. odd). For example a simple root $\alpha_s$ is called odd iff $s \in I_1$. An odd simple root $\alpha_s$ (and the coroot $\alpha_s^\vee$) is called isotropic if $a_{ss} = 0$. In what follows we let
\[
p_{ij} = \begin{cases} -1 & \text{if both } \alpha_i \text{ and } \alpha_j \text{ are odd}, \\ \phantom{-}1 & \text{otherwise}. \end{cases}
\]
Note that $\mathfrak{g}(D)$ has an anti-involution $\omega$ defined by $\omega(e_i) = f_i$, $\omega(f_i) = e_i$, $\omega|_{\mathfrak{h}} = I$. For that reason properties of the $e_i$’s automatically hold for the $f_i$’s.

Lemma 1.1. (a) An odd simple root $\alpha_i$ is isotropic iff $[e_i, e_i] = 0$.
(b) If $i \ne j$, then $[e_i, e_j] = 0$ iff $a_{ij} = a_{ji} = 0$.

Proof. It is clear that $[f_j, [e_i, e_i]] = 0$ if $j \ne i$, and one has: $[f_i, [e_i, e_i]] = 2 a_{ii} e_i$, which proves (a). The proof of (b) is similar.

It is straightforward to check the following relation ($i, j \in I$, $i \ne j$):
\[
[[e_i, e_j], [f_i, f_j]] = p_{ij} (a_{ij} \alpha_j^\vee - p_{ii} a_{ji} \alpha_i^\vee). \tag{1.1}
\]
Further on we shall always assume the following property of the Cartan matrix $A$:
\[
a_{ij} = 0 \text{ iff } a_{ji} = 0. \tag{1.2}
\]
Given $s \in I_1$ such that $a_{ss} = 0$ (i.e., $\alpha_s$ is an odd isotropic simple root), define a new data
\[
r_s(D) = \{\mathfrak{h}, I, r_s(I_1), r_s(\Pi^\vee), r_s(\Pi)\}
\]
and new Chevalley generators rs (ei ), rs (fi ) of g(D) as follows (cf. [S, PS, KW]): i ∈ rs (I1 ) iff i ∈ I1 in case asi = 0, i ∈ rs (I1 ) iff i ∈ I1 otherwise; rs (αs∨ ) = −αs∨ , rs (αs ) = −αs , rs (αi∨ ) = αi∨ +
ais ∨ α and rs (αi ) = αi + αs if asi = 0, asi s
rs (αi∨ ) = αi∨ and rs (αi ) = αi in all other cases; rs (es ) = fs , rs (fs ) = −es , 1 [fs , fi ] if asi = 0, rs (ei ) = [es , ei ] and rs (fi ) = psi asi rs (ei ) = ei ,
rs (fi ) = fi in all other cases.
Denote by \(r_s(\mathfrak{n}_+)\) (resp. \(r_s(\mathfrak{n}_-)\)) the subalgebra of \(\mathfrak{g}(D)\) generated by the \(r_s(e_i)\)'s (resp. \(r_s(f_i)\)'s). The transformation \(r_s\) is called an odd reflection (with respect to \(\alpha_s\)).
Lemma 1.2. (a) The data \(r_s(D)\) satisfy (1.2). (b) The new Chevalley generators satisfy the standard relations and together with \(\mathfrak{h}\) generate \(\mathfrak{g}(D)\), so that \(\mathfrak{g}(r_s(D)) \cong \mathfrak{g}(D)\). (c) One has the new triangular decomposition \(\mathfrak{g}(D) = r_s(\mathfrak{n}_-) + \mathfrak{h} + r_s(\mathfrak{n}_+)\). (d) The data \(r_s(r_s(D))\) coincide with \(D\), and the Chevalley generators \(r_s(r_s(e_i))\) (resp. \(r_s(r_s(f_i))\)) coincide, up to a non-zero factor, with \(e_i\) (resp. \(f_i\)).
Proof. It is straightforward using (1.1) and the relation \([a, [a, b]] = \tfrac12 [[a, a], b]\) if \(a\) is an odd element.
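The effect of an odd reflection on the roots, coroots and parities can be illustrated with a short computation. The following is a minimal sketch, not part of the original text: it assumes roots and coroots are given as coordinate vectors in dual bases (so the pairing is a dot product), uses 0-based indices, and takes as sample input the data of gl(2|2) described in Section 2. The function names are hypothetical.

```python
from fractions import Fraction

def pair(root, coroot):
    """Pairing <root, coroot>, computed as a dot product in dual bases."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(root, coroot))

def odd_reflection(roots, coroots, odd, s):
    """Return (r_s(Pi), r_s(Pi^vee), r_s(I_1)) for an odd isotropic index s."""
    if s not in odd or pair(roots[s], coroots[s]) != 0:
        raise ValueError("alpha_s must be an odd isotropic simple root")
    new_roots, new_coroots, new_odd = [], [], set()
    for i in range(len(roots)):
        a_si = pair(roots[i], coroots[s])  # a_si = <alpha_i, alpha_s^vee>
        a_is = pair(roots[s], coroots[i])  # a_is = <alpha_s, alpha_i^vee>
        if i == s:
            # alpha_s and its coroot change sign; alpha_s stays odd.
            new_roots.append([-Fraction(x) for x in roots[s]])
            new_coroots.append([-Fraction(x) for x in coroots[s]])
            new_odd.add(s)
        elif a_si != 0:
            # Neighbours of alpha_s are shifted and their parity flips.
            new_roots.append([Fraction(x) + y for x, y in zip(roots[i], roots[s])])
            new_coroots.append([Fraction(x) + (a_is / a_si) * y
                                for x, y in zip(coroots[i], coroots[s])])
            if i not in odd:
                new_odd.add(i)
        else:
            # All other simple roots are unchanged.
            new_roots.append([Fraction(x) for x in roots[i]])
            new_coroots.append([Fraction(x) for x in coroots[i]])
            if i in odd:
                new_odd.add(i)
    return new_roots, new_coroots, new_odd

def is_weyl_vector(rho, roots, coroots):
    """Check <rho, alpha_i^vee> = a_ii / 2 for all i."""
    return all(pair(rho, coroots[i]) == pair(roots[i], coroots[i]) / 2
               for i in range(len(roots)))

# Sample data (assumption): gl(2|2), alpha_i = eps_i - eps_{i+1}, odd root at index 1.
ROOTS = [[1, -1, 0, 0], [0, 1, -1, 0], [0, 0, 1, -1]]
COROOTS = [[1, -1, 0, 0], [0, 1, 1, 0], [0, 0, 1, -1]]
ODD = {1}
```

In this example, applying `odd_reflection` twice at the same index returns the original data, in line with Lemma 1.2(d), and all three simple roots become odd after one reflection.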
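An element \(\rho \in \mathfrak{h}^*\) such that
\[ \langle \rho, \alpha_i^\vee \rangle = \tfrac12 a_{ii} \quad\text{for all } i \in I \]
is called a Weyl vector for \(\Pi^\vee\).
Lemma 1.3. If \(\rho\) is a Weyl vector for \(\Pi^\vee\), then \(\rho + \alpha_s\) is a Weyl vector for \(r_s(\Pi^\vee)\).
Proof. It suffices to check that in the case \(a_{is} \neq 0\) one has
\[ \langle \rho + \alpha_s,\; a_{si}\alpha_i^\vee + a_{is}\alpha_s^\vee \rangle = \tfrac12 \langle \alpha_i + \alpha_s,\; a_{si}\alpha_i^\vee + a_{is}\alpha_s^\vee \rangle, \]
which is immediate.
Recall that for each \(\Lambda \in \mathfrak{h}^*\) one defines an irreducible highest weight module \(L(\Lambda)\) over \(\mathfrak{g}(D)\) as the (unique) irreducible \(\mathfrak{g}(D)\)-module for which there exists a non-zero vector \(v_\Lambda\) such that
\[ h v_\Lambda = \Lambda(h) v_\Lambda \ \text{ for } h \in \mathfrak{h}, \qquad \mathfrak{n}_+ v_\Lambda = 0. \tag{1.3} \]
The vector \(v_\Lambda\), called a highest weight vector (with respect to \(\mathfrak{n}_+\)), is determined uniquely up to a (non-zero) constant factor by the condition \(\mathfrak{n}_+ v_\Lambda = 0\) (cf. [K3]). The linear function \(\Lambda\) is called the highest weight (with respect to \(\mathfrak{n}_+\)) of \(L(\Lambda)\).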
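For the reader's convenience, the identity used in the proof of Lemma 1.3 can be checked term by term. The following worked computation is a sketch that uses only \(\langle \rho, \alpha_j^\vee \rangle = \tfrac12 a_{jj}\), \(\langle \alpha_j, \alpha_i^\vee \rangle = a_{ij}\) and \(a_{ss} = 0\):
\[ \langle \rho + \alpha_s,\; a_{si}\alpha_i^\vee + a_{is}\alpha_s^\vee \rangle = a_{si}\bigl(\tfrac12 a_{ii} + a_{is}\bigr) + a_{is}\bigl(\tfrac12 a_{ss} + a_{ss}\bigr) = \tfrac12 a_{si}a_{ii} + a_{si}a_{is}, \]
\[ \tfrac12 \langle \alpha_i + \alpha_s,\; a_{si}\alpha_i^\vee + a_{is}\alpha_s^\vee \rangle = \tfrac12\bigl(a_{si}(a_{ii} + a_{is}) + a_{is}(a_{si} + a_{ss})\bigr) = \tfrac12 a_{si}a_{ii} + a_{si}a_{is}. \]
Both sides agree, and since the right-hand side is half the diagonal Cartan entry attached to the (rescaled) new coroot, \(\rho + \alpha_s\) has the Weyl vector property on it.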
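Lemma 1.4. Let \(\alpha_s\) be an odd isotropic simple root and let \(\mathfrak{n}'_+ = r_s(\mathfrak{n}_+)\).
(a) If \(\langle \Lambda, \alpha_s^\vee \rangle = 0\), then \(v'_\Lambda = v_\Lambda\) is a highest weight vector with respect to \(\mathfrak{n}'_+\), so that the highest weight remains the same: \(\Lambda' = \Lambda\).
(b) If \(\langle \Lambda, \alpha_s^\vee \rangle \neq 0\), then \(v'_\Lambda = f_s v_\Lambda\) is a highest weight vector with respect to \(\mathfrak{n}'_+\), so that the highest weight becomes \(\Lambda' = \Lambda - \alpha_s\).
Proof. It is straightforward using the facts that \(f_s^2 v_\Lambda = \tfrac12 [f_s, f_s] v_\Lambda = 0\), and \(f_s v_\Lambda = 0\) iff \(\langle \Lambda, \alpha_s^\vee \rangle = 0\).
As an immediate corollary of Lemmas 1.3 and 1.4 we obtain the following very useful formulas (cf. [KW]):
\[ \begin{aligned} \Lambda' + \rho' &= \Lambda + \rho && \text{if } \langle \Lambda, \alpha_s^\vee \rangle\ (= \langle \Lambda + \rho, \alpha_s^\vee \rangle) \neq 0,\\ \Lambda' + \rho' &= \Lambda + \rho + \alpha_s && \text{if } \langle \Lambda, \alpha_s^\vee \rangle\ (= \langle \Lambda + \rho, \alpha_s^\vee \rangle) = 0. \end{aligned} \tag{1.4} \]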
Let α ∈ h∗ be a positive even root of g(D) such that there exist root vectors e attached to α and f attached to −α satisfying the following conditions: (i) ad f is locally nilpotent on g(D), (ii) [e, f ] = α ∨ ∈ h, [α ∨ , e] = 2e,
[α ∨ , f ] = −2f .
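Then we call \(f\) an integrable element of \(\mathfrak{g}(D)\). The following lemma is well-known (cf. [K3]).
Lemma 1.5. Let \(f\) be an integrable element attached to a negative root \(-\alpha\). (a) If \(f\) is locally nilpotent on \(L(\Lambda)\), then \(\langle \Lambda, \alpha^\vee \rangle \in \mathbb{Z}_+\). (b) Provided that \(\alpha\) is a simple root, \(f\) is locally nilpotent on \(L(\Lambda)\) iff \(\langle \Lambda, \alpha^\vee \rangle \in \mathbb{Z}_+\).
Let \(\beta = \alpha_s\) be an odd isotropic simple root. It will be convenient to use the notation \(r_\beta\) in place of \(r_s\). Consider a sequence of roots \(\beta_0, \beta_1, \ldots, \beta_k\) such that \(\beta_0\) is an odd isotropic simple root from \(\Pi^{(0)} := \Pi\), \(\beta_1\) is an odd isotropic simple root from \(\Pi^{(1)} = r_{\beta_0}(\Pi^{(0)})\), ..., \(\beta_k\) is an odd isotropic simple root from \(\Pi^{(k)} = r_{\beta_{k-1}}(\Pi^{(k-1)})\). Given \(\Lambda \in \mathfrak{h}^*\), let \(\Lambda^{(0)} = \Lambda\) be the highest weight of \(L(\Lambda)\) with respect to \(\mathfrak{n}_+^{(0)} := \mathfrak{n}_+\), let \(\Lambda^{(1)}\) be the highest weight of \(L(\Lambda)\) with respect to \(\mathfrak{n}_+^{(1)} := r_{\beta_0}(\mathfrak{n}_+)\), ..., and let \(\Lambda^{(k)}\) be the highest weight of \(L(\Lambda)\) with respect to \(\mathfrak{n}_+^{(k)} = r_{\beta_{k-1}}(\mathfrak{n}_+^{(k-1)})\). Let \(\rho^{(k)}\) be a Weyl vector for \(\Pi^{(k)}\).
Proposition 1.1. Let \(\alpha\) be a positive root of \(\mathfrak{g}(D)\) and let \(f\) be an integrable root element attached to \(-\alpha\). Given \(\Lambda \in \mathfrak{h}^*\), let \(S = \{ i \in [0, 1, \ldots, k-1] \mid \langle \Lambda^{(i)}, \beta_i^\vee \rangle = 0 \}\). Suppose that \(\alpha \in \Pi^{(k)}\). Then the element \(f\) is locally nilpotent on \(L(\Lambda)\) if and only if
\[ \Bigl\langle \Lambda + \rho + \sum_{i \in S} \beta_i,\; \alpha^\vee \Bigr\rangle \in \mathbb{N} = \{1, 2, \ldots\}. \]
Proof. It follows from (1.4) that
\[ \Lambda^{(k)} + \rho^{(k)} = \Lambda + \rho + \sum_{i \in S} \beta_i. \]
Since \(\langle \Lambda^{(k)} + \rho^{(k)}, \alpha^\vee \rangle = \langle \Lambda^{(k)}, \alpha^\vee \rangle + 1\), the proposition follows from Lemma 1.5b.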
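Proposition 1.2. If, under the assumptions of Proposition 1.1, one has \(\langle \Lambda + \rho, \alpha^\vee \rangle \in \mathbb{N}\), then \(f\) is integrable on \(L(\Lambda)\).
Proof. Due to Proposition 1.1, Proposition 1.2 holds if \(S = \emptyset\). Let \(N = \langle \Lambda, \alpha^\vee \rangle\). It is well-known (cf. [K3]) that \(f\) is integrable on \(L(\Lambda)\) iff
\[ f^{N+1} v_\Lambda \text{ lies in a maximal submodule of the Verma module } M(\Lambda). \tag{1.5} \]
But we have just shown that (1.5) holds for a Zariski open set of \(\lambda\) on the hyperplane \(\langle \lambda, \alpha^\vee \rangle = N\). Since (1.5) is a polynomial condition, we conclude that it holds for all \(\lambda\) on this hyperplane.
Proposition 1.3. If, under the assumptions of Proposition 1.1, \(f\) is integrable on \(L(\Lambda)\) and
\[ \langle \Lambda + \rho, \beta_i^\vee \rangle \neq 0 \quad\text{for } i = 0, 1, \ldots, s\ (\leq k), \]
then
\[ \Bigl\langle \Lambda - \sum_{i=0}^{s} \beta_i,\; \alpha^\vee \Bigr\rangle \in \mathbb{Z}_+ . \]
Proof. We have \(\langle \Lambda, \beta_0^\vee \rangle = \langle \Lambda + \rho, \beta_0^\vee \rangle \neq 0\), hence, by (1.4), \(\Lambda + \rho = \Lambda^{(1)} + \rho^{(1)}\), etc. Thus \(\Lambda^{(i)} + \rho^{(i)} = \Lambda + \rho\) for \(i = 1, \ldots, s\). Therefore, by Lemma 1.4b, we have
\[ \Lambda^{(s)} = \Lambda - \sum_{i=0}^{s} \beta_i . \]
Now the proposition follows from Lemma 1.5a.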
The calculation of coroots is facilitated by the following simple fact.
Proposition 1.4. (a) There exists a non-degenerate symmetric bilinear form \((\cdot|\cdot)\) on \(\mathfrak{h}\) such that, identifying \(\mathfrak{h}\) and \(\mathfrak{h}^*\) via this form, we have
\[ \alpha_i^\vee = \nu_i \alpha_i, \quad\text{where } \nu_i \in \mathbb{C}^\times, \tag{1.6} \]
if and only if
\[ A = \operatorname{diag}(\nu_i)_{i \in I}\, B, \quad\text{where } B = (b_{ij}) \text{ is a symmetric matrix}. \tag{1.7} \]
One then has \((\alpha_i|\alpha_j) = b_{ij}\).
(b) Let \(\Pi'^\vee = \{\alpha_i'^\vee\} = r_s(\Pi^\vee)\) and \(\Pi' = \{\alpha'_i\} = r_s(\Pi)\), where \(r_s\) is an odd reflection, and suppose that (1.6) holds. Then \(\alpha_i'^\vee = \nu'_i \alpha'_i\).
(c) Provided that (1.7) holds and \(a_{ii} = 2\) or \(0\) for all \(i \in I\), one has for any non-isotropic root \(\alpha\) which is obtained from a simple root by a sequence of odd reflections: \(\alpha^\vee = 2\alpha/(\alpha|\alpha)\).
Proof. (a) is proved in [K3]; (b) and (c) are easily checked.
Remark 1.1. A natural question is which of the Lie superalgebras \(\mathfrak{g}(D)\) are of "Kac–Moody" type? The most natural answer, in our opinion, is that they should satisfy the following conditions: (i) \(\mathfrak{g}(D)_{\bar 0}\) is a (generalized) Kac–Moody algebra, (ii) the \(\mathfrak{g}(D)_{\bar 0}\)-module \(\mathfrak{g}(D)_{\bar 1}\) is integrable. This definition covers the basic classical finite-dimensional Lie superalgebras and the associated affine superalgebras (including the twisted ones). Unfortunately, a well developed theory of generalized Kac–Moody superalgebras (see [B, R] and references there) does not cover most of the latter superalgebras (because of the crucial assumption on the Cartan matrix that its off-diagonal entries are non-positive).

2. Classification of Integrable Irreducible Highest Weight Modules over \(\hat{\mathfrak{gl}}(m|n)\)

Consider the Lie superalgebra \(\mathfrak{gl}(m|n)\), where \(m, n \geq 1\) (see [K1]). Let \(e_{ij}\) (\(1 \leq i, j \leq m+n\)) denote its standard basis. Denote by \(\mathfrak{h}\) the Cartan subalgebra of \(\mathfrak{gl}(m|n)\) consisting of all diagonal matrices. Let \(\varepsilon_i\) (\(1 \leq i \leq m+n\)) be the basis of \(\mathfrak{h}^*\) dual to the basis \(u_i := e_{ii}\) of \(\mathfrak{h}\). Then \(\mathfrak{gl}(m|n) = \mathfrak{g}(D)\) for the following data \(D = \{\mathfrak{h}, I, I_1, \Pi^\vee, \Pi\}\) (cf. [K1]). We let \(I = \{1, 2, \ldots, m+n-1\}\), \(I_1 = \{m\}\); \(\alpha_i^\vee = u_i - u_{i+1}\) for \(i \in I \setminus I_1\), \(\alpha_m^\vee = u_m + u_{m+1}\), \(\alpha_i = \varepsilon_i - \varepsilon_{i+1}\) for all \(i \in I\). Its Cartan matrix is the following \((m+n-1) \times (m+n-1)\) matrix:
\[ A = \begin{pmatrix} 2 & -1 & & & & & & \\ -1 & 2 & -1 & & & & & \\ & \ddots & \ddots & \ddots & & & & \\ & & -1 & 2 & -1 & & & \\ & & & -1 & 0 & 1 & & \\ & & & & -1 & 2 & -1 & \\ & & & & & \ddots & \ddots & \ddots \\ & & & & & & -1 & 2 \end{pmatrix}, \]
where the row with entries \((-1, 0, 1)\) is the \(m\)-th row. The Chevalley generators are as follows:
\[ e_i = e_{i,i+1}, \qquad f_i = e_{i+1,i} \qquad (i = 1, \ldots, m+n-1). \]
Note that \(\alpha_m\) is the only odd simple root, and it is isotropic. Consider the supertrace form on \(\mathfrak{gl}(m|n)\): \((a|b) = \operatorname{str} ab\). This is a non-degenerate invariant supersymmetric bilinear form on \(\mathfrak{gl}(m|n)\) whose restriction to \(\mathfrak{h}\) is non-degenerate and symmetric. Identifying \(\mathfrak{h}\) and \(\mathfrak{h}^*\) via this bilinear form, we have \(\varepsilon_i = u_i\) for \(i = 1, \ldots, m\) and \(\varepsilon_i = -u_i\) for \(i = m+1, \ldots, m+n\). Hence we have
\[ \alpha_i^\vee = \alpha_i \ \text{ for } i = 1, \ldots, m, \qquad \alpha_i^\vee = -\alpha_i \ \text{ for } i = m+1, \ldots, m+n-1, \tag{2.1} \]
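and we may use Proposition 1.4. In particular,
\[ ((\alpha_i|\alpha_j))_{i,j \in I} = \operatorname{diag}(\underbrace{1, \ldots, 1}_{m}, -1, \ldots, -1)\, A . \]
Likewise, the affine superalgebra \(\hat{\mathfrak{gl}}(m|n)\) is isomorphic to \(\mathfrak{g}(\hat D)\), where the data \(\hat D = \{\hat{\mathfrak{h}}, \hat I, \hat I_1, \hat\Pi^\vee, \hat\Pi\}\) is an extension of the data \(D\) for \(\mathfrak{gl}(m|n)\) defined as follows (cf. [K3]). The space \(\hat{\mathfrak{h}}\) is defined by (0.4), \(\hat I = I \cup \{0\}\), \(\hat I_1 = \{m, 0\}\), \(\hat\Pi^\vee = \Pi^\vee \cup \{\alpha_0^\vee\}\), \(\hat\Pi = \Pi \cup \{\alpha_0\}\). Here the \(\alpha_i \in \Pi\) are extended from \(\mathfrak{h}\) to \(\hat{\mathfrak{h}}\) by letting \(\alpha_i(K) = \alpha_i(d) = 0\), and \(\alpha_0 = \delta - \theta\), \(\alpha_0^\vee = K - \theta^\vee\), where \(\delta|_{\mathfrak{h} + \mathbb{C}K} = 0\), \(\langle \delta, d \rangle = 1\), \(\theta = \varepsilon_1 - \varepsilon_{m+n}\) is the highest root of \(\mathfrak{gl}(m|n)\), and \(\theta^\vee = u_1 + u_{m+n}\). We extend the bilinear form \((\cdot|\cdot)\) from \(\mathfrak{gl}(m|n)\) to \(\hat{\mathfrak{gl}}(m|n)\) by (0.3). Identifying \(\hat{\mathfrak{h}}\) with \(\hat{\mathfrak{h}}^*\) via this symmetric bilinear form, we get
\[ \theta = \theta^\vee, \qquad K = \delta, \qquad \alpha_0 = \alpha_0^\vee . \tag{2.2} \]
We have the following expression of \(\delta = K\) in terms of simple roots and coroots:
\[ \delta = K = \sum_{i=0}^{m+n-1} \alpha_i = \sum_{i=0}^{m} \alpha_i^\vee - \sum_{j=m+1}^{m+n-1} \alpha_j^\vee . \tag{2.3} \]
The Cartan matrix for \(\hat D\) is
\[ \hat A = \begin{pmatrix} 0 & -1 & 0 & \cdots & 0 & 1 \\ -1 & & & & & \\ 0 & & & & & \\ \vdots & & & A & & \\ 0 & & & & & \\ -1 & & & & & \end{pmatrix}. \]
As above, we have
\[ ((\alpha_i|\alpha_j))_{i,j \in \hat I} = \operatorname{diag}(\underbrace{1, \ldots, 1}_{m+1}, -1, \ldots, -1)\, \hat A . \]
The even part of \(\mathfrak{gl}(m|n)\) is \(\mathfrak{gl}(m) \oplus \mathfrak{gl}(n)\), hence the even part of \(\hat{\mathfrak{gl}}(m|n)\) is the sum \(\hat{\mathfrak{gl}}(m) + \hat{\mathfrak{gl}}(n)\) with a common central element \(K\) and a common scaling element \(d\). Note that the restriction of the supertrace form to \(\mathfrak{gl}(m)\) (resp. \(\mathfrak{gl}(n)\)) is the normalized (resp. negative of the normalized) invariant form, i.e., \((\alpha|\alpha) = 2\) (resp. \((\alpha|\alpha) = -2\)) for any root \(\alpha\). The set of simple roots for \(\hat{\mathfrak{gl}}(m)\) (resp. \(\hat{\mathfrak{gl}}(n)\)) is empty if \(m = 1\) (resp. \(n = 1\)), and for \(m \geq 2\) (resp. \(n \geq 2\)) it is as follows:
\[ \hat\Pi' = \{\alpha'_0 = \delta - \theta', \alpha_1, \ldots, \alpha_{m-1}\} \quad (\text{resp. } \hat\Pi'' = \{\alpha''_0 = \delta - \theta'', \alpha_{m+1}, \ldots, \alpha_{m+n-1}\}), \]
where
\[ \theta' = \sum_{i=1}^{m-1} \alpha_i, \qquad \theta'' = \sum_{i=m+1}^{m+n-1} \alpha_i . \]
Assuming that \(m \geq 2\), we have \((\theta'|\theta') = 2\), hence \(\theta' = \theta'^\vee\), and we have
\[ \alpha'_0 = \alpha_0'^\vee = \alpha_0 + \sum_{i=m}^{m+n-1} \alpha_i = \alpha_0^\vee + \alpha_m^\vee - \sum_{i=m+1}^{m+n-1} \alpha_i^\vee . \tag{2.4} \]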
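The Cartan matrices \(A\) and \(\hat A\) above can be recomputed directly from the pairing \(a_{ij} = \langle \alpha_j, \alpha_i^\vee \rangle\). The sketch below is an illustration only: it assumes coordinates in which the first \(m+n\) entries refer to the \(\varepsilon_i\) (for roots) or \(u_i\) (for coroots), followed by one entry for \(K\) and one for \(d\), so that the pairing is a plain dot product. The function names are hypothetical.

```python
def gl_data(m, n):
    """Simple roots and coroots of gl(m|n) in the epsilon- and u-bases."""
    size = m + n
    roots, coroots = [], []
    for i in range(1, size):
        root = [0] * size
        root[i - 1], root[i] = 1, -1            # alpha_i = eps_i - eps_{i+1}
        coroot = [0] * size
        coroot[i - 1] = 1
        coroot[i] = 1 if i == m else -1         # alpha_m^vee = u_m + u_{m+1}
        roots.append(root)
        coroots.append(coroot)
    return roots, coroots

def affine_gl_data(m, n):
    """Extend the data by alpha_0 = delta - theta and alpha_0^vee = K - theta^vee."""
    size = m + n
    roots, coroots = gl_data(m, n)
    roots = [r + [0, 0] for r in roots]         # alpha_i(K) = alpha_i(d) = 0
    coroots = [c + [0, 0] for c in coroots]
    alpha0 = [0] * (size + 2)
    alpha0[0], alpha0[size - 1] = -1, 1         # -theta = -eps_1 + eps_{m+n}
    alpha0[size + 1] = 1                        # <delta, d> = 1
    alpha0_vee = [0] * (size + 2)
    alpha0_vee[0], alpha0_vee[size - 1] = -1, -1  # -theta^vee = -u_1 - u_{m+n}
    alpha0_vee[size] = 1                        # coefficient of K
    return [alpha0] + roots, [alpha0_vee] + coroots

def cartan_matrix(roots, coroots):
    """a_ij = <alpha_j, alpha_i^vee>."""
    return [[sum(x * y for x, y in zip(roots[j], coroots[i]))
             for j in range(len(roots))]
            for i in range(len(coroots))]
```

For example, `cartan_matrix(*gl_data(2, 2))` reproduces the pattern of \(A\) with the row \((-1, 0, 1)\) in position \(m = 2\), and `cartan_matrix(*affine_gl_data(2, 2))` adds the first row \((0, -1, 0, 1)\) and first column \((0, -1, 0, -1)^T\) of \(\hat A\).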
A \(\hat{\mathfrak{gl}}(m|n)\)-module \(L(\Lambda)\) is called integrable if its restriction to \(\hat{\mathfrak{gl}}(m)\) is integrable and its restriction to \(\mathfrak{gl}(m|n)\) is locally finite. In this section we shall classify all such modules. As usual, define fundamental weights \(\Lambda_i \in \hat{\mathfrak{h}}^*\) (\(i = 0, 1, \ldots, m+n-1\)) by \(\langle \Lambda_i, \alpha_j^\vee \rangle = \delta_{ij}\), \(j = 0, \ldots, m+n-1\), \(\langle \Lambda_i, d \rangle = 0\), and labels of a weight \(\Lambda\) by \(k_i = \langle \Lambda, \alpha_i^\vee \rangle\). The following necessary conditions of integrability of \(L(\Lambda)\) follow from Lemma 1.5a:
\[ k_i \in \mathbb{Z}_+ \quad\text{for } i = 1, \ldots, m-1, m+1, \ldots, m+n-1, \tag{2.5} \]
\[ k' := k_0 + k_m - \sum_{i=m+1}^{m+n-1} k_i \in \mathbb{Z}_+ . \tag{2.6} \]
We assume in (2.6) that \(m \geq 2\) and use (2.4). We call \(k'\) the partial level of \(\Lambda\) since, using (2.3), we see that the level \(k := \langle \Lambda, K \rangle\) is given by
\[ k = \sum_{i=1}^{m-1} k_i + k' . \tag{2.7} \]
Hence, provided that \(m \geq 2\), the level of an integrable \(\hat{\mathfrak{gl}}(m|n)\)-module is a non-negative integer.
Lemma 2.1. Assume that \(m \geq 2\). Then conditions (2.5) and (2.6) along with the condition
\[ k' \geq n \tag{2.8} \]
are sufficient for integrability of the \(\hat{\mathfrak{gl}}(m|n)\)-module \(L(\Lambda)\).
Proof. The lemma follows from Lemma 1.5 applied to the simple roots \(\alpha_i\), \(i = 1, \ldots, m-1\), and Proposition 1.2 applied to \(\alpha^\vee = \alpha_0'^\vee\), since, due to (2.4), we have
\[ \langle \rho, \alpha_0'^\vee \rangle = -n + 1 . \tag{2.9} \]
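The two formulas (2.7) and (2.9) follow from short computations, which can be spelled out as follows. This is a worked sketch using (2.3), (2.4), the Weyl vector property \(\langle \rho, \alpha_i^\vee \rangle = \tfrac12 a_{ii}\), and the diagonal entries \(a_{00} = a_{mm} = 0\), \(a_{ii} = 2\) for \(i > m\) of \(\hat A\):
\[ k = \langle \Lambda, K \rangle = \sum_{i=0}^{m} k_i - \sum_{j=m+1}^{m+n-1} k_j = \sum_{i=1}^{m-1} k_i + \Bigl( k_0 + k_m - \sum_{j=m+1}^{m+n-1} k_j \Bigr) = \sum_{i=1}^{m-1} k_i + k', \]
\[ \langle \rho, \alpha_0'^\vee \rangle = \tfrac12 a_{00} + \tfrac12 a_{mm} - \tfrac12 \sum_{i=m+1}^{m+n-1} a_{ii} = 0 + 0 - (n-1) = -n + 1 . \]
Consequently \(\langle \Lambda + \rho, \alpha_0'^\vee \rangle = k' - n + 1\), which lies in \(\mathbb{N}\) exactly when \(k' \geq n\); this is the hypothesis of Proposition 1.2 used in Lemma 2.1.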
Lemma 2.2. Let \(L(\Lambda)\) be an integrable \(\hat{\mathfrak{gl}}(m|n)\)-module such that \(k' < n\), and let \(m \geq 2\). Then the following complementary condition holds:
(*) there exist \(r, s \in \mathbb{Z}_+\) such that
(i) \(k' = r + s\),
(ii) \(k_0 - k_{m+n-1} - k_{m+n-2} - \cdots - k_{m+n-r} - r = 0\),
(iii) \(k_m - k_{m+1} - k_{m+2} - \cdots - k_{m+s} - s = 0\).
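Proof. Consider the following two sequences of roots of \(\hat{\mathfrak{gl}}(m|n)\):
\[ \beta_0 = \alpha_0,\ \beta_1 = \alpha_0 + \alpha_{m+n-1},\ \beta_2 = \alpha_0 + \alpha_{m+n-1} + \alpha_{m+n-2},\ \ldots,\ \beta_{n-1} = \alpha_0 + \alpha_{m+n-1} + \cdots + \alpha_{m+1}; \]
\[ \beta'_0 = \alpha_m,\ \beta'_1 = \alpha_m + \alpha_{m+1},\ \beta'_2 = \alpha_m + \alpha_{m+1} + \alpha_{m+2},\ \ldots,\ \beta'_{n-1} = \alpha_m + \cdots + \alpha_{m+n-1}. \]
It is clear by Proposition 1.4 that \(\beta_i^\vee = \beta_i\) and \(\beta_i'^\vee = \beta'_i\). Note that \(\langle \Lambda + \rho, \beta_r^\vee \rangle\) (resp. \(\langle \Lambda + \rho, \beta_s'^\vee \rangle\)) is equal to the left-hand side of (ii) (resp. (iii)). Note that
\[ \langle \beta_i, \alpha_0'^\vee \rangle = 1 = \langle \beta'_i, \alpha_0'^\vee \rangle, \qquad i = 0, \ldots, n-1 . \tag{2.10} \]
If \(\langle \Lambda + \rho, \beta_i^\vee \rangle \neq 0\) for all \(i\), then using (2.6) and (2.10) we would conclude, by Proposition 1.3, that \(k' - n \geq 0\), in contradiction with the assumption of the lemma. Hence (ii) holds for some non-negative integer \(r\) (\(< n\)). Similarly, (iii) holds for some non-negative integer \(s\) (\(< n\)). Similarly, applying Proposition 1.3 to the union of the sequences \(\beta_i\) and \(\beta'_i\), we conclude that
\[ r + s \leq k' . \tag{2.11} \]
Hence, adding up (ii) and (iii), we get
\[ k' + \sum_{i=m+s+1}^{m+n-r-1} k_i = r + s . \tag{2.12} \]
Now (i) follows from (2.5), (2.11) and (2.12).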
Remark 2.1. Condition (*) on \(\Lambda\) is equivalent to the following condition: there exists a non-negative integer \(s \leq k'\) (\(\leq n-1\)) such that
\[ k_m = k_{m+1} + \cdots + k_{m+s} + s \quad\text{and}\quad k_{m+s+1} = \cdots = k_{m+s+n-k'-1} = 0 . \]
This condition implies that \(\Lambda\) lies in a union of \(k'+1\) hyperplanes of dimension \(k'+m-1\). Equivalently, there exists a non-negative integer \(r \leq k'\) (\(\leq n-1\)) such that
\[ k_0 = k_{m+n-1} + k_{m+n-2} + \cdots + k_{m+n-r} + r \quad\text{and}\quad k_{m+n-r-1} = \cdots = k_{m+k'-r+1} = 0 . \]
Theorem 2.1. (a) A \(\hat{\mathfrak{gl}}(1|n)\)-module \(L(\Lambda)\) is integrable iff \(k_2, \ldots, k_n \in \mathbb{Z}_+\). (b) Provided that \(m \geq 2\), a \(\hat{\mathfrak{gl}}(m|n)\)-module \(L(\Lambda)\) is integrable iff conditions (2.5), (2.6) hold and, in the case \(k' < n\), the complementary condition (*) holds.
Proof. In the case \(m = 1\), the only condition of integrability is local finiteness of \(\mathfrak{gl}(1|n)\) on \(L(\Lambda)\), which is equivalent to \(k_2, \ldots, k_n \in \mathbb{Z}_+\) due to Lemma 1.5b. It follows from Lemma 2.2 that in the case \(m \geq 2\), the conditions listed by Theorem 2.1b are necessary. In view of Lemma 1.5b, it remains to show that these conditions are sufficient for local nilpotency of \(e_{-\alpha'_0}\). Due to Lemma 2.1, we may assume that
\[ k' \leq n - 1 . \tag{2.13} \]
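Consider the sequence of odd roots \(\beta_0, \ldots, \beta_{n-1}\) introduced in the proof of Lemma 2.2 and let \(\Pi^{(0)} = \hat\Pi\), \(\Pi^{(1)} = r_{\beta_0}(\Pi^{(0)})\), ..., \(\Pi^{(n)} = r_{\beta_{n-1}}(\Pi^{(n-1)})\), and notice that \(\alpha'_0 \in \Pi^{(n)}\). Let \(\Lambda^{(n)}\) be the highest weight of \(L(\Lambda)\) with respect to \(\mathfrak{n}_+^{(n)} = r_{\beta_{n-1}} \cdots r_{\beta_0}(\hat{\mathfrak{n}}_+)\). Due to Lemma 1.5b, it remains to show that the conditions listed by Theorem 2.1b imply that
\[ \langle \Lambda^{(n)}, \alpha_0'^\vee \rangle \in \mathbb{Z}_+ . \tag{2.14} \]
Recall that by (1.4) we have
\[ \Lambda^{(n)} + \rho^{(n)} = \Lambda + \rho + \sum_{i \in S} \beta_i , \tag{2.15} \]
where \(S = \{ i \in [0, \ldots, n-1] \mid \langle \Lambda^{(i)}, \beta_i^\vee \rangle = 0 \}\). Let \(t_i = \langle \Lambda^{(i)}, \beta_i^\vee \rangle\) for short. Then condition (*) gives, for some \(r \in \mathbb{Z}_+\), \(r < n\), that \(t_r = 0\). In view of Remark 2.1, we have
\[ t_r = t_{r+1} = \cdots = t_{n-s-1} = 0 . \tag{2.16} \]
Hence, due to (2.15), (2.9), (2.10) and (2.16), we get
\[ \langle \Lambda^{(n)} + \rho^{(n)}, \alpha_0'^\vee \rangle = k' + (1 - n) + |S| \geq k' + (1 - n) + (n - s - r) \geq 1, \]
proving (2.14), since \(\langle \rho^{(n)}, \alpha_0'^\vee \rangle = 1\).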
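The criterion of Theorem 2.1(b) is easy to test on explicit labels. The following is a minimal sketch, assuming \(m \geq 2\) and integer labels \(k_0, \ldots, k_{m+n-1}\) passed as a Python list; the helper names are hypothetical and not taken from the source.

```python
def partial_level(labels, m, n):
    """k' = k_0 + k_m - (k_{m+1} + ... + k_{m+n-1}), cf. (2.6)."""
    return labels[0] + labels[m] - sum(labels[m + 1:m + n])

def level(labels, m, n):
    """k = k_1 + ... + k_{m-1} + k', cf. (2.7)."""
    return sum(labels[1:m]) + partial_level(labels, m, n)

def condition_star(labels, m, n):
    """Complementary condition (*) of Lemma 2.2 (meant for 0 <= k' < n)."""
    kp = partial_level(labels, m, n)
    for r in range(kp + 1):
        s = kp - r                                    # (i): k' = r + s
        lhs_ii = labels[0] - sum(labels[m + n - r:m + n]) - r
        lhs_iii = labels[m] - sum(labels[m + 1:m + s + 1]) - s
        if lhs_ii == 0 and lhs_iii == 0:
            return True
    return False

def is_integrable(labels, m, n):
    """Theorem 2.1(b): (2.5), (2.6), and (*) whenever k' < n."""
    if m < 2 or len(labels) != m + n:
        raise ValueError("expects m >= 2 and m + n integer labels")
    if any(labels[i] < 0 for i in range(1, m + n) if i != m):
        return False                                  # (2.5) fails
    kp = partial_level(labels, m, n)
    if kp < 0:
        return False                                  # (2.6) fails
    return kp >= n or condition_star(labels, m, n)
```

For instance, with \(m = n = 2\) the labels `[2, 0, 0, 1]` (the weight \(2\Lambda_0 + \Lambda_3\)) pass the test with \(r = 1\), \(s = 0\), whereas `[1, 0, 0, 1]` has \(k' = 0\) but \(k_0 \neq 0\) and is rejected.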
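Remark 2.2. It follows from Theorem 2.1 that when \(m \geq 2\), the only integrable \(\hat{\mathfrak{gl}}(m|n)\)-modules \(L(\Lambda)\) of level \(k = 0\) are those for which all labels are 0, in which case \(\dim L(\Lambda) = 1\).
Remark 2.3. If \(m \geq 2\) and \(n \geq 2\), then the only \(L(\Lambda)\) which are integrable with respect to the whole even subalgebra are 1-dimensional. (It is because the \(\hat{\mathfrak{gl}}(m)\)-integrability implies \(k \geq 0\) and \(\hat{\mathfrak{gl}}(n)\)-integrability implies \(k \leq 0\).)
Remark 2.4. Define \(\varepsilon \in \hat{\mathfrak{h}}^*\) by letting \(\varepsilon|_{\mathfrak{h}} = \) supertrace, \(\varepsilon(K) = \varepsilon(d) = 0\). It follows from Theorem 2.1 that when \(m \geq 2\), the complete list of highest weights of integrable \(\hat{\mathfrak{gl}}(m|n)\)-modules of level 1, up to adding an arbitrary linear combination of \(\varepsilon\) and \(\delta\), is as follows:
\[ \Lambda_s\ (1 \leq s \leq m-1), \qquad (a+1)\Lambda_m + a\Lambda_{m+1}\ (a \in \mathbb{Z}_+), \qquad (a+1)\Lambda_0 + a\Lambda_{m+n-1}\ (a \in \mathbb{Z}_+). \]
Remark 2.5. Consider the sequence of the sets of simple roots \(\Pi^{(0)} = \hat\Pi, \ldots, \Pi^{(n)} = \{\alpha'_0, \ldots, \alpha'_{m+n-1}\}\) introduced in the proof of Theorem 2.1. One has:
\[ \alpha'_0 = \alpha_0 + \alpha_1,\quad \alpha'_1 = \alpha_2,\ \ldots,\ \alpha'_{m-2} = \alpha_{m-1},\quad \alpha'_{m-1} = \alpha_m + \alpha_{m+1} + \cdots + \alpha_{m+n-1} + \alpha_0, \]
\[ \alpha'_m = -(\alpha_{m+1} + \cdots + \alpha_{m+n-1} + \alpha_0),\qquad \alpha'_j = \alpha_j \ \text{ for } m+1 \leq j \leq m+n-1 . \]
Let \(\Lambda'_j\) be the fundamental weights with respect to \(\Pi^{(n)}\). Given a weight \(\Lambda\), denote by \(\Lambda^{(n)}\) the highest weight of \(L(\Lambda)\) with respect to \(\Pi^{(n)}\) (or rather \(\mathfrak{n}_+^{(n)}\)). Using Lemma 1.4, it is easy to see that the weights listed in Remark 2.4 get changed under the map \(\Lambda \mapsto \Lambda^{(n)}\) as follows:
\[ \Lambda_j^{(n)} = \Lambda_j \quad (1 \leq j \leq m), \]
\[ ((a+1)\Lambda_0 + a\Lambda_{m+n-1})^{(n)} = (a+1)\Lambda_0 + a\Lambda_{m+n-1} - \alpha_0, \]
\[ ((a+1)\Lambda_m + a\Lambda_{m+1})^{(n)} = (a+1)\Lambda_m + a\Lambda_{m+1} + \alpha'_m \quad (a > 0). \]
In terms of the fundamental weights \(\Lambda'_j\) the map \(\Lambda \mapsto \Lambda^{(n)}\) looks as follows:
\[ \Lambda_j \mapsto \Lambda'_{j-1} \quad (1 \leq j \leq m), \]
\[ (a+1)\Lambda_0 + a\Lambda_{m+n-1} \mapsto (a+2)\Lambda'_0 + (a+1)\Lambda'_{m+n-1} \quad (a \in \mathbb{Z}_+), \]
\[ (a+1)\Lambda_m + a\Lambda_{m+1} \mapsto a\Lambda'_m + (a-1)\Lambda'_{m+1} \quad (a \in \mathbb{N}). \]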
It follows that all weights of level 1 listed by Remark 2.4 are conjugate to each other by odd reflections.

3. Free Field Realization of Level 1 Integrable Modules over \(\hat{\mathfrak{gl}}(m|n)\)

Fix non-negative integers \(m\) and \(n\) such that \(m + n \geq 1\) and denote by \(F\) the vertex algebra generated by \(m\) pairs of odd fields \(\psi^i(z), \psi^{i*}(z)\) (\(i = 1, \ldots, m\)) and \(n\) pairs of even fields \(\varphi^j(z), \varphi^{j*}(z)\) (\(j = 1, \ldots, n\)), all pairwise local, subject to the following operator product expansions (as usual, we list only the non-trivial OPE):
\[ \psi^i(z)\psi^{j*}(w) \sim \frac{\delta_{ij}}{z-w}, \qquad \psi^{i*}(z)\psi^{j}(w) \sim \frac{\delta_{ij}}{z-w}, \]
\[ \varphi^i(z)\varphi^{j*}(w) \sim -\frac{\delta_{ij}}{z-w}, \qquad \varphi^{i*}(z)\varphi^{j}(w) \sim \frac{\delta_{ij}}{z-w}. \]
This is called a free fermionic vertex algebra in the book [K4], to which we refer for foundations of the vertex algebra theory. This vertex algebra has a family of Virasoro fields [K4], from which it is convenient to choose the following one:
\[ L(z) \equiv \sum_{k \in \mathbb{Z}} L_k z^{-k-2} = \frac12 \sum_{i=1}^{m} \bigl( :\!\partial\psi^i(z)\psi^{i*}(z)\!: + :\!\partial\psi^{i*}(z)\psi^i(z)\!: \bigr) + \frac12 \sum_{j=1}^{n} \bigl( :\!\partial\varphi^j(z)\varphi^{j*}(z)\!: - :\!\partial\varphi^{j*}(z)\varphi^j(z)\!: \bigr). \tag{3.1} \]
With respect to \(L(z)\) the fields \(\psi^i(z)\), \(\psi^{i*}(z)\), \(\varphi^j(z)\) and \(\varphi^{j*}(z)\) are primary of conformal weight \(1/2\). We therefore write all these fields in the form \(x^i(z) = \sum_{k \in \frac12 + \mathbb{Z}} x_k^{(i)} z^{-k-1/2}\), where \(x = \psi, \psi^*, \varphi\) or \(\varphi^*\), and we have the following conditions on the vacuum \(|0\rangle\):
\[ \psi_k^{(i)}|0\rangle = 0, \qquad \psi_k^{(i)*}|0\rangle = 0, \qquad \varphi_k^{(i)}|0\rangle = 0, \qquad \varphi_k^{(i)*}|0\rangle = 0 \qquad\text{for } k > 0 . \]
The operator \(L_0\) is called the energy operator (or Hamiltonian) and its eigenvalues are called the energies of the corresponding eigenvectors. The energy can be calculated from the following relations:
\[ \text{energy}\,|0\rangle = 0, \qquad \text{energy}\,(\psi_k^{(i)}, \psi_k^{(i)*}, \varphi_k^{(j)}, \varphi_k^{(j)*}) = -k . \tag{3.2} \]
The second relation means that \(\psi_k^{(i)}\), etc., changes the energy by \(-k\), i.e., \(\text{energy}\,(\psi_k^{(i)} v) = \text{energy}\,(v) - k\), etc. Next, for each pair \(i, j\) that may occur introduce the following fields of conformal weight 1:
\[ a^{ij+}(z) = :\!\psi^i(z)\psi^{j*}(z)\!:, \qquad a^{ij-}(z) = :\!\varphi^i(z)\varphi^{j*}(z)\!:, \]
\[ E^{ij+}(z) = :\!\psi^i(z)\varphi^{j*}(z)\!:, \qquad E^{ij-}(z) = :\!\varphi^i(z)\psi^{j*}(z)\!: . \]
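Proposition 3.1. (a) Consider the affine superalgebra \(\hat{\mathfrak{gl}}(m|n)\) and let \(A(z) = \sum_{k \in \mathbb{Z}} (t^k \otimes A) z^{-k-1}\) for \(A \in \mathfrak{gl}(m|n)\). Then the linear map \(\sigma\) given by
\[ e_{ij}(z) \mapsto a^{ij+}(z), \quad e_{i+m,j+m}(z) \mapsto a^{ij-}(z), \quad e_{i,j+m}(z) \mapsto E^{ij+}(z), \quad e_{i+m,j}(z) \mapsto E^{ij-}(z), \quad K \mapsto 1, \quad d \mapsto L_0 \]
defines a representation of \(\hat{\mathfrak{gl}}(m|n)\) (of level 1) in the space \(F\).
(b) Consider the standard \(\mathfrak{gl}(m|n)\)-module \(\mathbb{C}^{m|n}\) and its contragredient module \(\mathbb{C}^{m|n*}\). Consider the corresponding \(\mathbb{C}[t, t^{-1}] \otimes \mathfrak{gl}(m|n)\)-modules \(\mathbb{C}[t, t^{-1}] \otimes \mathbb{C}^{m|n}\) and \(\mathbb{C}[t, t^{-1}] \otimes \mathbb{C}^{m|n*}\), and let \(v(z) = \sum_{k \in \mathbb{Z}} (t^k \otimes v) z^{-k-1}\) for \(v \in \mathbb{C}^{m|n}\) or \(\mathbb{C}^{m|n*}\). Then the linear maps \(\nu\) and \(\nu^*\) given by (\(i = 1, \ldots, m\); \(j = 1, \ldots, n\))
\[ v_i(z) \mapsto \psi^i(z), \quad v_{j+m}(z) \mapsto \varphi^j(z) \qquad\text{and}\qquad v_i^*(z) \mapsto \psi^{i*}(z), \quad v_{j+m}^*(z) \mapsto \varphi^{j*}(z) \]
are equivariant, i.e., they have the following property:
\[ \nu(A(z)v(w)) = [\sigma(A(z)), \nu(v(w))], \quad v \in \mathbb{C}^{m|n}, \]
\[ \nu^*(A(z)v^*(w)) = [\sigma(A(z)), \nu^*(v^*(w))], \quad v^* \in \mathbb{C}^{m|n*}. \]
Proof. Both statements follow from the corresponding OPE, which are easily derived from Wick's formula. Below we give the less trivial OPE needed for the proof of (a):
\[ E^{ij+}(z)E^{k\ell-}(w) \sim \frac{\delta_{jk}\, a^{i\ell+}(w) + \delta_{i\ell}\, a^{kj-}(w)}{z-w} + \frac{\delta_{i\ell}\delta_{jk}}{(z-w)^2}, \]
\[ a^{ij\pm}(z)E^{k\ell\pm}(w) \sim \frac{\delta_{jk}\, E^{i\ell\pm}(w)}{z-w}, \qquad a^{ij\pm}(z)E^{k\ell\mp}(w) \sim \frac{-\delta_{i\ell}\, E^{kj\mp}(w)}{z-w}, \]
\[ a^{ij\pm}(z)a^{k\ell\pm}(w) \sim \frac{\delta_{jk}\, a^{i\ell\pm}(w) - \delta_{i\ell}\, a^{kj\pm}(w)}{z-w} \pm \frac{\delta_{i\ell}\delta_{jk}}{(z-w)^2}, \qquad a^{ij\pm}(z)a^{k\ell\mp}(w) \sim 0 . \]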
Introduce the total charge operator
\[ a_0 = \sigma(I), \quad\text{where } I = \sum_{i=1}^{m+n} e_{ii} \in \mathfrak{gl}(m|n). \]
Its eigenvalues are called charges of the corresponding eigenvectors. It is clear from Proposition 3.1 that the total charge can be calculated from the following relations:
\[ \text{charge}\,|0\rangle = 0, \qquad \text{charge}\,(\psi_k^{(i)}, \varphi_k^{(j)}) = 1, \qquad \text{charge}\,(\psi_k^{(i)*}, \varphi_k^{(j)*}) = -1 . \tag{3.3} \]
Consider the charge decomposition of \(F\), i.e., its decomposition in eigenspaces of \(a_0\):
\[ F = \oplus_{s \in \mathbb{Z}} F_s . \tag{3.4} \]
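As a small illustration of the rules (3.2) and (3.3), consider (assuming \(m, n \geq 1\)) the vector \(v = \psi^{(1)}_{-1/2}\,\varphi^{(1)*}_{-3/2}|0\rangle\). The choice of \(v\) is ours and serves only as an example:
\[ \text{energy}\,(v) = \tfrac12 + \tfrac32 = 2, \qquad \text{charge}\,(v) = 1 + (-1) = 0, \]
so \(v\) lies in \(F_0\) and is an eigenvector of \(L_0\) with eigenvalue 2.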
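Since \(a_0\) commutes with \(\sigma(\hat{\mathfrak{gl}}(m|n))\), we conclude that (3.4) is a decomposition in a direct sum of \(\hat{\mathfrak{gl}}(m|n)\)-modules. It is clear that \(L_0\) commutes with \(a_0\), hence each \(F_s\) is \(L_0\)-invariant, and since all eigenvalues of \(L_0\) in \(F\) lie in \(\tfrac12 \mathbb{Z}_+\), the same holds for eigenvalues of \(L_0\) in \(F_s\), \(s \in \mathbb{Z}\). Note also that \(L_0\) commutes with \(\sigma(\mathfrak{gl}(m|n))\). It is because all fields \(a^{ij\pm}(z)\) and \(E^{ij\pm}(z)\) have conformal weight 1. It follows that each eigenspace of \(L_0\) in \(F_s\) is a \(\mathfrak{gl}(m|n)\)-module. The following proposition describes the lowest energy subspace \(F_s^{\mathrm{low}}\) and the highest weight vector \(|s\rangle\) in each \(F_s\).
Proposition 3.2. (a) Let \(s \in \mathbb{Z}_+\). Then, as a \(\mathfrak{gl}(m|n)\)-module, \(F_s^{\mathrm{low}}\) is isomorphic to \(\Lambda^s \mathbb{C}^{m|n}\). Furthermore, any highest weight vector of \(\hat{\mathfrak{gl}}(m|n)\) in \(F_s\) lies in \(F_s^{\mathrm{low}}\) and is proportional to the vector
\[ |s\rangle = \psi^{(1)}_{-\frac12} \cdots \psi^{(s)}_{-\frac12}|0\rangle \quad\text{with weight } \Lambda_0 + \varepsilon_1 + \cdots + \varepsilon_s - \tfrac{s}{2}\delta, \]
provided that \(s \leq m\), and to the vector
\[ |s\rangle = \bigl(\varphi^{(1)}_{-\frac12}\bigr)^{s-m}|m\rangle \quad\text{with weight } \Lambda_0 + \varepsilon_1 + \cdots + \varepsilon_m + (s-m)\varepsilon_{m+1} - \tfrac{s}{2}\delta, \]
provided that \(s \geq m\).
(b) Let \(-s \in \mathbb{Z}_+\). Then, as a \(\mathfrak{gl}(m|n)\)-module, \(F_s^{\mathrm{low}}\) is isomorphic to \(\Lambda^{-s}(\mathbb{C}^{m|n})^*\). Furthermore, any highest weight vector of \(\hat{\mathfrak{gl}}(m|n)\) in \(F_s\) lies in \(F_s^{\mathrm{low}}\) and is proportional to the vector
\[ |s\rangle = \bigl(\varphi^{(n)*}_{-\frac12}\bigr)^{-s}|0\rangle \quad\text{with weight } \Lambda_0 + s\varepsilon_{m+n} + \tfrac{s}{2}\delta . \]
Proof. It is clear that, if \(s\) (resp. \(-s\)) \(\in \mathbb{Z}_+\), then \(F_s^{\mathrm{low}}\) consists of homogeneous polynomials of degree \(|s|\) in the anticommuting operators \(\psi^{(i)}_{-\frac12}\) (resp. \(\psi^{(i)*}_{-\frac12}\)) and the commuting operators \(\varphi^{(j)}_{-\frac12}\) (resp. \(\varphi^{(j)*}_{-\frac12}\)), applied to \(|0\rangle\). This proves (a) (resp. (b)), due to Proposition 3.1.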
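Remark 3.1. The lowest energy in \(F_s\) is \(\tfrac12 |s|\) and the spectrum of \(L_0\) in \(F_s\) is \(\tfrac12 |s| + \mathbb{Z}_+\).
Remark 3.2. Denote by \(\Lambda_{(s)}\) the weight of \(|s\rangle\). When restricted to \(\hat{\mathfrak{sl}}(m|n)\), \(\Lambda_{(s)}\) is given by the following formulas:
\[ \Lambda_{(s)} = \begin{cases} \Lambda_s - \frac{s}{2}\delta & \text{if } 0 \leq s \leq m,\\ (1 + s - m)\Lambda_m + (s - m)\Lambda_{m+1} - \frac{s}{2}\delta & \text{if } s \geq m,\\ (1 - s)\Lambda_0 - s\Lambda_{m+n-1} + \frac{s}{2}\delta & \text{if } s \leq 0. \end{cases} \]
We identify here \(\Lambda_{m+1}\) with \(-\Lambda_0\) in the case \(n = 1\).
The following theorem is the central result of this section.
Theorem 3.1. Suppose that \(m \geq 1\). Then each \(\hat{\mathfrak{gl}}(m|n)\)-module \(F_s\), \(s \in \mathbb{Z}\), is an irreducible integrable highest weight module of level 1.
Remark 3.3. The \(\hat{\mathfrak{gl}}(0|n)\)-modules \(F_s\) are not irreducible. For example, one can show that in the case \((m, n) = (0, 2)\), one has the following decomposition as \(\hat{\mathfrak{gl}}(2)\)-modules (in the standard notation of [K3]):
\[ \operatorname{ch} F_s = \sum_{j=0}^{\infty} \operatorname{ch} L\bigl(-(1 + 2j + |s|)\Lambda_0 + (2j + |s|)\Lambda_1\bigr)\, q^{\,j^2 + (|s|+1)j + |s|/2} . \]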
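E. Frenkel informed one of us that he had found this decomposition too.
The proof of Theorem 3.1 is based on the (super) boson-fermion correspondence, which we shall now recall (cf. [K4]). For each \(i = 1, \ldots, m\) there exists a unique invertible odd operator \(e^{\varepsilon_i}\) with inverse \(e^{-\varepsilon_i}\) satisfying the following three properties:
\[ [e^{\varepsilon_i}, \psi^j(z)] = 0 \ \text{ if } i \neq j, \qquad [e^{\varepsilon_i}, \varphi^j(z)] = 0 \ \text{ for all } j, \tag{3.5} \]
\[ e^{\varepsilon_i} \psi_k^{(i)} e^{-\varepsilon_i} = \psi_{k-1}^{(i)}, \qquad e^{\varepsilon_i} \psi_k^{(i)*} e^{-\varepsilon_i} = \psi_{k+1}^{(i)*}, \tag{3.6} \]
\[ e^{\varepsilon_i}|0\rangle = \psi^{(i)}_{-\frac12}|0\rangle, \qquad e^{-\varepsilon_i}|0\rangle = \psi^{(i)*}_{-\frac12}|0\rangle . \tag{3.7} \]
It is easy to see that \(e^{\varepsilon_i} e^{\varepsilon_j} = -e^{\varepsilon_j} e^{\varepsilon_i}\) if \(i \neq j\). We let for short (\(i = 1, \ldots, m\); \(j = 1, \ldots, n\)):
\[ \varepsilon^i(z) = a^{ii+}(z) = \sum_{k \in \mathbb{Z}} \varepsilon_k^{(i)} z^{-k-1}, \qquad \varepsilon^{j+m}(z) = a^{jj-}(z) = \sum_{k \in \mathbb{Z}} \varepsilon_k^{(j+m)} z^{-k-1}. \]
Then we have
\[ [\varepsilon_k^{(i)}, e^{\varepsilon_j}] = \delta_{ij}\delta_{k0}\, e^{\varepsilon_j}, \qquad i = 1, \ldots, m+n; \quad j = 1, \ldots, m . \]
For each \(i = 1, \ldots, m+n\) introduce the following fields:
\[ \Gamma^+_{\varepsilon_i}(z) = \exp\Bigl( \sum_{k=1}^{\infty} \frac{z^k}{k}\, \varepsilon^{(i)}_{-k} \Bigr), \qquad \Gamma^-_{\varepsilon_i}(z) = \exp\Bigl( -\sum_{k=1}^{\infty} \frac{z^{-k}}{k}\, \varepsilon^{(i)}_{k} \Bigr), \tag{3.8} \]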
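and for a linear combination with integer coefficients \(\alpha = \sum_{i=1}^{m} s_i \varepsilon_i\) we let \(\Gamma^\pm_\alpha(z) = \prod_i (\Gamma^\pm_{\varepsilon_i}(z))^{s_i}\) (recall that all \(\varepsilon_k^{(i)}\) commute and all \(\varepsilon_{-k}^{(i)}\) commute for \(k \geq 1\), see Proposition 3.1a). The central fact of the classical boson-fermion correspondence is the following formula, see e.g. [K4] (\(i = 1, \ldots, m\)):
\[ \psi^i(z) = e^{\varepsilon_i} z^{\varepsilon_0^{(i)}} \Gamma^+_{\varepsilon_i}(z)\Gamma^-_{\varepsilon_i}(z), \qquad \psi^{i*}(z) = e^{-\varepsilon_i} z^{-\varepsilon_0^{(i)}} \Gamma^+_{-\varepsilon_i}(z)\Gamma^-_{-\varepsilon_i}(z). \tag{3.9} \]
The key formulas of the super boson-fermion correspondence are the following [KL, K4] (\(j = 1, \ldots, n\)):
\[ \varphi^j(z) = z^{\varepsilon_0^{(i)}} e^{\varepsilon_i}\, \Gamma^+_{\varepsilon_i}(z) E^{ji-}(z) \Gamma^-_{\varepsilon_i}(z), \qquad \varphi^{j*}(z) = z^{-\varepsilon_0^{(i)}} e^{-\varepsilon_i}\, \Gamma^+_{-\varepsilon_i}(z) E^{ij+}(z) \Gamma^-_{-\varepsilon_i}(z), \tag{3.10} \]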
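for each \(i = 1, \ldots, m\) (we assume here that \(m \geq 1\)).
Proof of Theorem 3.1. Since the eigenspaces of \(L_0\) in \(F_s\) are finite-dimensional and \(L_0\) commutes with \(\mathfrak{gl}(m|n)\), it follows that \(F_s\) is a direct sum of finite-dimensional \(\mathfrak{gl}(m|n)\)-modules, hence \(\mathfrak{gl}(m|n)\) acts locally finitely on \(F_s\). Furthermore, we have \(F = F^{\mathrm{fermi}} \otimes F^{\mathrm{bose}}\), where \(F^{\mathrm{fermi}}\) (resp. \(F^{\mathrm{bose}}\)) is the vertex algebra generated by the \(\psi^i(z), \psi^{i*}(z)\) (resp. \(\varphi^j(z), \varphi^{j*}(z)\)), and the subalgebra \(\hat{\mathfrak{gl}}(m)\) of \(\hat{\mathfrak{gl}}(m|n)\) acts on \(F\) via \(\pi \otimes 1\), where the representation \(\pi\) of \(\hat{\mathfrak{gl}}(m)\) on \(F^{\mathrm{fermi}}\) is known to be integrable of level 1 (see [KP1]). Thus, the representation of \(\hat{\mathfrak{gl}}(m|n)\) in each \(F_s\) is integrable. The irreducibility of \(F_s\), provided that \(m \geq 1\), is proved using (3.9) and (3.10) in exactly the same fashion as the proof of Theorem 5.8a from [K4].
Remark 3.4. We have got along the way the following vertex operator construction of \(\hat{\mathfrak{gl}}(m|n)\). For each \(\alpha = \sum_{i=1}^{m} s_i \varepsilon_i\), \(s_i \in \mathbb{Z}\), introduce the usual vertex operator \(\Gamma_\alpha = e^{\alpha} z^{\alpha_0} \Gamma^+_\alpha \Gamma^-_\alpha\). Then the following map defines an irreducible integrable highest weight \(\hat{\mathfrak{gl}}(m|n)\)-module of level 1 in each \(F_s\):
\[ e_{ii}(z) \mapsto \varepsilon^i(z) \ (i = 1, \ldots, m), \qquad K \mapsto 1, \qquad e_{ij}(z) \mapsto \Gamma_{\varepsilon_i - \varepsilon_j}(z) \ (i, j = 1, \ldots, m), \]
\[ e_{i+m,j+m}(z) \mapsto\ :\!\varphi^i(z)\varphi^{j*}(z)\!: \ (i, j = 1, \ldots, n), \]
\[ e_{i,j+m}(z) \mapsto \Gamma_{\varepsilon_i}(z)\varphi^{j*}(z), \qquad e_{j+m,i}(z) \mapsto \Gamma_{-\varepsilon_i}(z)\varphi^{j}(z) \qquad (i = 1, \ldots, m;\ j = 1, \ldots, n). \]
Next, we give a standard derivation of a "quasiparticle" character formula for the \(\hat{\mathfrak{gl}}(m|n)\)-modules \(F_s\), \(s \in \mathbb{Z}\). Given \(a = (a_1, \ldots, a_m)\), \(b = (b_1, \ldots, b_m) \in \mathbb{Z}_+^m\) and \(c = (c_1, \ldots, c_n)\), \(d = (d_1, \ldots, d_n) \in \mathbb{Z}_+^n\), denote by \(F(a, b, c, d)\) the linear span of vectors in \(F\) obtained from the vacuum vector \(|0\rangle\) by applying all monomials in the \(\psi_k^{(i)}, \psi_k^{(i)*}, \varphi_k^{(i)}, \varphi_k^{(i)*}\) which contain \(a_1\) factors of the form \(\psi_k^{(1)}\), \(k \in \tfrac12 + \mathbb{Z}\), ..., \(a_m\) factors of the form \(\psi_k^{(m)}\), \(b_1\) factors of the form \(\psi_k^{(1)*}\), ..., \(b_m\) factors of the form \(\psi_k^{(m)*}\), \(c_1\) factors of the form \(\varphi_k^{(1)}\), ..., \(d_n\) factors of the form \(\varphi_k^{(n)*}\). These states lie in \(F_s\) iff the following condition holds:
\[ |a| - |b| + |c| - |d| = s, \tag{3.11} \]
where \(|a| = \sum_i a_i\), etc. It is clear that the state of minimal energy in \(F(a, b, c, d)\) is (up to a constant factor) the following vector:
\[ \begin{aligned} v(a, b, c, d) ={}& \bigl(\psi^{(1)}_{-(a_1 - \frac12)} \cdots \psi^{(1)}_{-\frac32}\psi^{(1)}_{-\frac12}\bigr) \cdots \bigl(\psi^{(m)}_{-(a_m - \frac12)} \cdots \psi^{(m)}_{-\frac12}\bigr) \\ &\times \bigl(\psi^{(1)*}_{-(b_1 - \frac12)} \cdots \psi^{(1)*}_{-\frac12}\bigr) \cdots \bigl(\psi^{(m)*}_{-(b_m - \frac12)} \cdots \psi^{(m)*}_{-\frac12}\bigr) \\ &\times \bigl(\varphi^{(1)}_{-\frac12}\bigr)^{c_1} \cdots \bigl(\varphi^{(n)}_{-\frac12}\bigr)^{c_n} \bigl(\varphi^{(1)*}_{-\frac12}\bigr)^{d_1} \cdots \bigl(\varphi^{(n)*}_{-\frac12}\bigr)^{d_n} |0\rangle . \end{aligned} \]
All other basis elements from \(F(a, b, c, d)\) are obtained from \(v(a, b, c, d)\) by adding to the lower indices of the factors arbitrary non-negative integers. Hence we have (since \(\text{weight}\,|0\rangle = \Lambda_0\)):
\[ \operatorname{ch} F(a, b, c, d) = e^{\text{weight}\,(v(a,b,c,d))} / \Pi(q), \tag{3.12a} \]
where
\[ \Pi(q) = (q)_{a_1} \cdots (q)_{a_m} (q)_{b_1} \cdots (q)_{b_m} (q)_{c_1} \cdots (q)_{c_n} (q)_{d_1} \cdots (q)_{d_n} . \tag{3.12b} \]
Here and further we use the usual notation and assumptions: \((q)_a = (1 - q) \cdots (1 - q^a)\), \(q = e^{-\delta}\) and \(|q| < 1\). Noticing that
\[ \text{weight}\,(\psi_k^{(i)}) = \varepsilon_i + k\delta, \quad \text{weight}\,(\psi_k^{(i)*}) = -\varepsilon_i + k\delta, \quad \text{weight}\,(\varphi_k^{(i)}) = \varepsilon_{m+i} + k\delta, \quad \text{weight}\,(\varphi_k^{(i)*}) = -\varepsilon_{m+i} + k\delta, \tag{3.13} \]
we obtain from (3.11) and (3.12) the "quasiparticle" character formula for \(F_s\):
\[ \operatorname{ch} F_s = e^{\Lambda_0} \sum_{\substack{a, b \in \mathbb{Z}_+^{m+n} \\ |a| - |b| = s}} \frac{ e^{\sum_{i=1}^{m+n} (a_i - b_i)\varepsilon_i}\; q^{\frac12 \sum_{i=1}^{m} (a_i^2 + b_i^2) + \frac12 \sum_{i=m+1}^{m+n} (a_i + b_i)} }{ \prod_{i=1}^{m+n} (q)_{a_i} (q)_{b_i} } . \tag{3.14} \]
Another formula, which we call a theta function type character formula, is derived as follows. Let
\[ \operatorname{ch} F = \sum_{s \in \mathbb{Z}} z^s \operatorname{ch} F_s . \]
Using (3.3) and (3.13), we obtain:
\[ \operatorname{ch} F = e^{\Lambda_0} \prod_{k=1}^{\infty} \frac{ \prod_{i=1}^{m} (1 + z e^{\varepsilon_i} q^{k - 1/2})(1 + z^{-1} e^{-\varepsilon_i} q^{k - 1/2}) }{ \prod_{j=1}^{n} (1 - z e^{\varepsilon_{m+j}} q^{k - 1/2})(1 - z^{-1} e^{-\varepsilon_{m+j}} q^{k - 1/2}) } . \tag{3.15} \]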
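The agreement between the quasiparticle sum (3.14) and the product formula (3.15) can be checked numerically after the specialization \(e^{\varepsilon_i} \mapsto 1\), \(e^{\Lambda_0} \mapsto 1\). The following sketch is only an illustration under these assumptions: it truncates all sums and products, and it extracts the coefficient of \(z^s\) from the product by a discrete Fourier sum over the unit circle. All function names and cutoffs are hypothetical choices.

```python
import cmath

def qpoch(q, a):
    """(q)_a = (1 - q)(1 - q^2)...(1 - q^a)."""
    out = 1.0
    for j in range(1, a + 1):
        out *= 1.0 - q ** j
    return out

def series_product(series_list):
    """Multiply Laurent series in z stored as {power: coefficient} dicts."""
    total = {0: 1.0}
    for ser in series_list:
        new = {}
        for c1, v1 in total.items():
            for c2, v2 in ser.items():
                new[c1 + c2] = new.get(c1 + c2, 0.0) + v1 * v2
        total = new
    return total

def quasiparticle_character(m, n, s, q, fcut=12, bcut=60):
    """Specialized right-hand side of (3.14), truncated."""
    ferm = {a: q ** (a * a / 2) / qpoch(q, a) for a in range(fcut + 1)}
    bose = {c: q ** (c / 2) / qpoch(q, c) for c in range(bcut + 1)}
    ferm_star = {-a: v for a, v in ferm.items()}   # charge -1 partners
    bose_star = {-c: v for c, v in bose.items()}
    factors = [ferm, ferm_star] * m + [bose, bose_star] * n
    return series_product(factors).get(s, 0.0)

def product_character(m, n, s, q, kmax=60, npts=256):
    """Coefficient of z^s in the specialized product (3.15), via a Fourier sum."""
    total = 0.0
    for t in range(npts):
        z = cmath.exp(2 * cmath.pi * 1j * t / npts)
        val = 1.0
        for k in range(1, kmax + 1):
            x = q ** (k - 0.5)
            val *= ((1 + z * x) * (1 + x / z)) ** m / ((1 - z * x) * (1 - x / z)) ** n
        total += val * z ** (-s)
    return (total / npts).real
```

For moderate \(q\) (for example \(q = 0.2\)) the two functions should agree up to the truncation and rounding errors, for any charge \(s\) well below the cutoffs.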
In order to compute the coefficient of \(z^s\), we use the Jacobi triple product identity
\[ \prod_{k=1}^{\infty} (1 + z q^{k - \frac12})(1 + z^{-1} q^{k - \frac12}) = \frac{1}{\varphi(q)} \sum_{m \in \mathbb{Z}} z^m q^{\frac12 m^2}, \tag{3.16} \]
and also the following well-known identity, which can be derived from the super boson-fermion correspondence [K4]:
\[ \prod_{k=1}^{\infty} (1 + z q^{k - \frac12})^{-1} (1 + z^{-1} q^{k - \frac12})^{-1} = \frac{1}{\varphi(q)^2} \sum_{m \in \mathbb{Z}} \frac{(-1)^m q^{\frac12 m(m+1)}}{1 + z q^{m + \frac12}} = \varphi(q)^{-2} \Bigl( \sum_{m, k \geq 0} - \sum_{m, k < 0} \Bigr) (-1)^{m+k} z^k q^{\frac12 m(m+1) + (m + \frac12)k} . \tag{3.17} \]
Here and further
\[ \varphi(q) = \prod_{j=1}^{\infty} (1 - q^j). \]
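Both identities (3.16) and (3.17) are easy to confirm numerically. The sketch below is an illustration only: it uses truncated sums and products with real test values \(q\) and \(z\) chosen (by assumption) away from the poles \(z = -q^{-(m + 1/2)}\), and it checks only the single-sum form of (3.17). Function names and cutoffs are hypothetical.

```python
def phi(q, kmax=200):
    """Euler product phi(q) = prod_{j>=1} (1 - q^j), truncated."""
    out = 1.0
    for j in range(1, kmax + 1):
        out *= 1.0 - q ** j
    return out

def triple_product(z, q, kmax=200):
    """prod_{k>=1} (1 + z q^{k-1/2})(1 + z^{-1} q^{k-1/2}), truncated."""
    out = 1.0
    for k in range(1, kmax + 1):
        x = q ** (k - 0.5)
        out *= (1 + z * x) * (1 + x / z)
    return out

def theta_sum(z, q, mmax=60):
    """sum_{m in Z} z^m q^{m^2/2}, truncated; right-hand side of (3.16) times phi(q)."""
    return sum(z ** m * q ** (m * m / 2) for m in range(-mmax, mmax + 1))

def lerch_sum(z, q, mmax=60):
    """sum_{m in Z} (-1)^m q^{m(m+1)/2} / (1 + z q^{m+1/2}), truncated."""
    total = 0.0
    for m in range(-mmax, mmax + 1):
        total += (-1) ** m * q ** (m * (m + 1) // 2) / (1 + z * q ** (m + 0.5))
    return total
```

With these helpers, (3.16) reads `triple_product(z, q) == theta_sum(z, q) / phi(q)` and (3.17) reads `1 / triple_product(z, q) == lerch_sum(z, q) / phi(q) ** 2`, both up to rounding.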
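Substituting (3.16) and (3.17) in (3.15), we get:
\[ \operatorname{ch} F = \frac{e^{\Lambda_0}}{\varphi(q)^{m+2n}} \sum_{k \in \mathbb{Z}^m} \Bigl( \sum_{p_1, a_1 \geq 0} - \sum_{p_1, a_1 < 0} \Bigr) \cdots \Bigl( \sum_{p_n, a_n \geq 0} - \sum_{p_n, a_n < 0} \Bigr) (-1)^{|a|} z^{|k| + |p|}\, e^{\sum_i k_i \varepsilon_i + \sum_j p_j \varepsilon_{m+j}}\; q^{\frac12 \sum_i k_i^2 + \sum_j \left( \frac12 a_j(a_j + 1) + p_j (a_j + \frac12) \right)}, \]
where \(k = (k_1, \ldots, k_m) \in \mathbb{Z}^m\), \(p = (p_1, \ldots, p_n)\), \(a = (a_1, \ldots, a_n) \in \mathbb{Z}^n\), and \(|k| = \sum_i k_i\). The coefficient of \(z^s\) is a rather complicated expression for \(\operatorname{ch} F_s\), which, after letting \(r = p + a \in \mathbb{Z}^n\), can be written as follows:
\[ \operatorname{ch} F_s = \frac{e^{\Lambda_0 + s\varepsilon_1} q^{s^2/2}}{\varphi(q)^{m+2n}} \sum_{k \in \mathbb{Z}^{m-1}} \Bigl( \sum_{r_1 \geq p_1 \geq 0} - \sum_{r_1 < p_1 < 0} \Bigr) \cdots \Bigl( \sum_{r_n \geq p_n \geq 0} - \sum_{r_n < p_n < 0} \Bigr) (-1)^{|r + p|}\, e^{\sum_i k_i(\varepsilon_i - \varepsilon_1) + \sum_j p_j(\varepsilon_{m+j} - \varepsilon_1)}\; q^{\frac12 |k|^2 + \frac12 \sum_i k_i^2 + \frac12 \sum_j r_j(r_j + 1) + \sum_{i<j} p_i p_j + |k||p| - s(|k| + |p|)}, \tag{3.18} \]
where \(k = (k_2, \ldots, k_m) \in \mathbb{Z}^{m-1}\), \(p, r \in \mathbb{Z}^n\). We rewrite (3.18) using translation operators \(t_\alpha\), \(\alpha \in \mathfrak{h}^*\), defined by (\(\lambda \in \hat{\mathfrak{h}}^*\)):
\[ t_\alpha(\lambda) = \lambda + (\lambda|\delta)\alpha - \bigl( \tfrac12 (\alpha|\alpha)(\lambda|\delta) + (\lambda|\alpha) \bigr)\delta . \tag{3.19} \]
Let \(M^\# = \sum_{i=1}^{m-1} \mathbb{Z}\alpha_i\); recall that \(M^\#\) acts on \(\hat{\mathfrak{h}}^*\) via \(\alpha \mapsto t_\alpha\) and the image of this action is the translation subgroup of the Weyl group of \(\hat{\mathfrak{sl}}(m)\). It is straightforward to show that (3.18) can be rewritten as follows:
\[ \operatorname{ch} F_s = \frac{1}{\varphi(q)^{m+2n}} \Bigl( \sum_{r_1 \geq p_1 \geq 0} - \sum_{r_1 < p_1 < 0} \Bigr) \cdots \Bigl( \sum_{r_n \geq p_n \geq 0} - \sum_{r_n < p_n < 0} \Bigr) (-1)^{|r| + |p|}\; q^{\frac12 \sum_j r_j(r_j + 1) + \sum_{i<j} p_i p_j - |p| + a_s} \sum_{\alpha \in M^\#} e^{t_\alpha\left( \Lambda_{(s)} - \sum_{j=1}^{n} p_j(\varepsilon_1 - \varepsilon_{m+j}) \right)}, \tag{3.20} \]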
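where \(\Lambda_{(s)}\) is the weight of \(|s\rangle\) and \(a_s = s(r_n - p_n + 1) + |p|\) if \(s \leq 0\), \(a_s = 0\) if \(0 < s \leq m\), and \(a_s = (s - m)(r_1 - p_1)\) if \(s \geq m\). In the case \(n = 1\) formula (3.18) can be simplified by making use of the following lemma.
Lemma 3.1. Let \(a, b \in \mathbb{Z}\). Then (\(j, k, n \in \mathbb{Z}\)):
(a) \[ \Bigl( \sum_{k \geq j \geq a} - \sum_{k < j < a} \Bigr) (-1)^{j+k} x^j q^{bk + k(k+1)/2} = \frac{x\,\varphi(q)}{1 + x} \prod_{n=1}^{\infty} (1 + x^{-1} q^{n-b-1})(1 + x q^{n+b}), \]
(b) \[ \prod_{n=1}^{\infty} (1 + x^{-1} q^{n-b-1})(1 + x q^{n+b}) = x^{-b} q^{-b(b+1)/2} \prod_{n=1}^{\infty} (1 + x^{-1} q^{n-1})(1 + x q^{n}). \]
Proof. If \(k \geq a\) (resp. \(k < a\)), we have:
\[ \sum_{j=a}^{k} (-1)^{j+k} x^j \quad \Bigl(\text{resp. } -\sum_{j=k+1}^{a-1} (-1)^{j+k} x^j \Bigr) = \frac{x^{k+1} + (-1)^{k+a} x^a}{1 + x} . \]
Hence the LHS of (a) is equal to
\[ \frac{1}{1+x} \sum_{k \in \mathbb{Z}} x^{k+1} q^{bk + k(k+1)/2} + \frac{(-1)^a x^a}{1+x} \sum_{k \in \mathbb{Z}} (-1)^k q^{bk + k(k+1)/2} . \]
Noticing that the second summand is zero (the terms with indices \(k\) and \(-2b - 1 - k\) cancel) and applying to the first summand the Jacobi triple product identity (3.16), we obtain (a). In the proof of (b) we assume that \(b > 0\), the case \(b < 0\) being similar:
\[ \prod_{n=1}^{\infty} (1 + x^{-1} q^{n-b-1}) = \prod_{m \geq 1-b} (1 + x^{-1} q^{m-1}) = \prod_{m=1}^{\infty} (1 + x^{-1} q^{m-1}) \prod_{m=1-b}^{0} (1 + x^{-1} q^{m-1}). \]
The second product on the RHS is equal to
\[ \prod_{n=1}^{b} (1 + x^{-1} q^{-n}) = \prod_{n=1}^{b} x^{-1} q^{-n} (1 + x q^n) = x^{-b} q^{-b(b+1)/2} \prod_{n=1}^{b} (1 + x q^n). \]
Next, we have:
\[ \prod_{n=1}^{\infty} (1 + x q^{n+b}) = \prod_{m=1+b}^{\infty} (1 + x q^m) = \prod_{m=1}^{\infty} (1 + x q^m) \Big/ \prod_{m=1}^{b} (1 + x q^m). \]
These equalities prove (b).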
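Let now \(n = 1\). Then (3.18) reads:
\[ \operatorname{ch} F_s = \frac{e^{\Lambda_0 + s\varepsilon_1} q^{s^2/2}}{\varphi(q)^{m+2}} \sum_{k \in \mathbb{Z}^{m-1}} \psi(k)\, e^{\sum_i k_i(\varepsilon_i - \varepsilon_1)}\, q^{\frac12 |k|(|k| - 2s) + \frac12 \sum_i k_i^2}, \tag{3.21} \]
where
\[ \psi(k) = \Bigl( \sum_{t \geq j \geq 0} - \sum_{t < j < 0} \Bigr) (-1)^{j+t} \bigl( q^{|k| - s} e^{\varepsilon_{m+1} - \varepsilon_1} \bigr)^j q^{t(t+1)/2} . \]
By Lemma 3.1a for \(x = q^{|k| - s} e^{\varepsilon_{m+1} - \varepsilon_1}\), \(b = 0\), we have:
\[ \psi(k) = \frac{q^{|k| - s} e^{\varepsilon_{m+1} - \varepsilon_1}\, \varphi(q)}{1 + q^{|k| - s} e^{\varepsilon_{m+1} - \varepsilon_1}} \prod_{n=1}^{\infty} \bigl( 1 + e^{\varepsilon_1 - \varepsilon_{m+1}} q^{n - |k| + s - 1} \bigr) \bigl( 1 + e^{\varepsilon_{m+1} - \varepsilon_1} q^{n + |k| - s} \bigr). \]
By Lemma 3.1b for \(x = e^{\varepsilon_{m+1} - \varepsilon_1}\), \(b = |k| - s\), we rewrite this as follows:
\[ \psi(k) = \frac{x^{-b} q^{-b(b-1)/2}\, \varphi(q)}{1 + q^b x} \prod_{n=1}^{\infty} (1 + x^{-1} q^n)(1 + x q^{n-1}). \]
Substituting this in (3.21), we obtain:
\[ \operatorname{ch} F_s = \frac{e^{\Lambda_0 + s\varepsilon_{m+1}} q^{-\frac{s}{2}}}{\varphi(q)^{m+1}} \prod_{n=1}^{\infty} (1 + e^{\varepsilon_1 - \varepsilon_{m+1}} q^n)(1 + e^{\varepsilon_{m+1} - \varepsilon_1} q^{n-1}) \times \sum_{k \in \mathbb{Z}^{m-1}} \frac{ e^{\sum_{i=2}^{m} k_i(\varepsilon_i - \varepsilon_{m+1})}\, q^{\frac12 \sum_{i=2}^{m} k_i(k_i + 1)} }{ 1 + q^{|k| - s} e^{\varepsilon_{m+1} - \varepsilon_1} } . \tag{3.22} \]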
This formula agrees with the one obtained in [KL] (see also [K4]) for \(\hat{\mathfrak{gl}}(1|1)\). In the next section we use this formula in the case \(m \geq 2\) in order to derive character formulas for all integrable level 1 \(\hat{\mathfrak{sl}}(m|1)\)-modules in terms of the theta function and the (multivariable) Appell's functions, and to obtain their high temperature asymptotics.

4. Theta Function Type Character Formula for Integrable Level 1 \(\hat{\mathfrak{sl}}(m|1)\)-Modules and Appell's Function

Recall that Appell's function is defined by the following series (cf. [A] and [P]):
\[ A(x, z, q) = \sum_{k \in \mathbb{Z}} \frac{q^{\frac12 k^2} z^k}{1 + x q^k}, \]
which converges to a meromorphic function in the domain \(x, z, q \in \mathbb{C}\), \(|q| < 1\). The classical theta function in one variable is a special case of this function: \(\Theta(z) \equiv \Theta(z; q) = A(0, z, q)\). Note that by (3.16) we have a product expansion:
\[ \Theta(z q^{1/2}; q) = \varphi(q) \prod_{k=1}^{\infty} (1 + z q^k)(1 + z^{-1} q^{k-1}). \tag{4.1} \]
We shall need also the following multivariable generalization of Appell's function. Let \(B\) be an \(N \times N\) symmetric matrix such that \(\operatorname{Re} B\) is positive definite and let \(\ell\) be a linear function on \(\mathbb{C}^N\). We define the series
\[ A_{B,\ell}(x; z_1, \ldots, z_N; q) = \sum_{k \in \mathbb{Z}^N} \frac{q^{\frac12 k^T B k}\, z_1^{k_1} \cdots z_N^{k_N}}{1 + x q^{\ell(k)}}, \]
which converges to a meromorphic function provided that \(|q| < 1\). Again, letting \(x = 0\), we get the multivariable theta function.
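The one-variable Appell function and its theta-function specialization are straightforward to evaluate by truncating the defining series. The sketch below is illustrative only: it assumes real test values with \(0 < q < 1\) and \(x\) away from the poles \(x = -q^{-k}\), and the function names and cutoffs are hypothetical.

```python
def phi(q, kmax=200):
    """Euler product phi(q) = prod_{j>=1} (1 - q^j), truncated."""
    out = 1.0
    for j in range(1, kmax + 1):
        out *= 1.0 - q ** j
    return out

def appell(x, z, q, kmax=60):
    """Truncated series A(x, z, q) = sum_k q^{k^2/2} z^k / (1 + x q^k)."""
    return sum(q ** (k * k / 2) * z ** k / (1 + x * q ** k)
               for k in range(-kmax, kmax + 1))

def theta(z, q, kmax=60):
    """Theta function as the x = 0 specialization of Appell's function."""
    return appell(0.0, z, q, kmax)

def theta_product(z, q, kmax=200):
    """Right-hand side of (4.1): phi(q) prod_k (1 + z q^k)(1 + z^{-1} q^{k-1})."""
    out = phi(q, kmax)
    for k in range(1, kmax + 1):
        out *= (1 + z * q ** k) * (1 + q ** (k - 1) / z)
    return out
```

The product expansion (4.1) then corresponds to `theta(z * q ** 0.5, q)` agreeing with `theta_product(z, q)` up to rounding.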
Consider now the g(m|1)-modules Fs , s ∈ Z, and m ≥ 2. We have (see Remark 3.2): s − 2s δ (s) = −(s − m)0 + (1 + s − m)m − 2s δ (1 − s) + s + s δ 0 m 2
assume in this section that if 0 ≤ s ≤ m if s ≥ m . if s ≤ 0
For m ≥ 2, we have: g(m|1) = s(m|1) + g(1) (sum of ideals), hence: chFs = chL((s) )ϕ(q)−1 , where L((s) ) denotes the irreducible s(m|1)-module with highest weight (s) . Hence formula (3.22) gives us the following expression for chL((s) ) in terms of the theta function A(z; q) (we use (4.1)) and the (multivariable) Appell’s function: s
1 1 1 e0 +s2m+1 q − 2 A(z1 q 2 ; q)AI, (z1−1 q −s ; z2 q 2 , . . . , zm q 2 ; q), chL((s) ) = m+1 ϕ(q)
(4.2)
2i −2m+1 (i = 1, . . . , m), B = I is the (m − 1) × (m − 1) identity matrix where zi = e and (k) = i ki , k ∈ Zm−1 . (Note that in the simplest case m = 2 we get in this expression the classical Appell’s function.) Next, we derive yet another character formula for the s(m|1)-module L(0 ) in the case m ≥ 2, in terms of classical theta functions and certain “half” modular forms. We use for this (3.20) for n = 1: 1 1 chL(0 ) = − (−1)r+p etα (0 −p(21 −2m+1 )) q 2 r(r+1) . ϕ(q)m+1 r≥p≥0 r
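Introduce the following elements of $M^\#$:
$$\beta_k = k\epsilon_1 - \sum_{i=1}^{k}\epsilon_i, \qquad k = 1, \dots, m,$$
and the element $\mu = \epsilon_{m+1} - \frac{1}{m}(\epsilon_1 + \cdots + \epsilon_m) \in h^*$, which is orthogonal to $M^\#$. The even part of $s\ell(m|1)$ is a sum of ideals $s\ell(m)$ and $(\mathbb{C}\mu)$. Write $p = jm - k$, where $j \in \mathbb{Z}$, $1 \le k \le m$, and note that
$$\Lambda_0 - p(\epsilon_1 - \epsilon_{m+1}) + j\beta_m - \beta_k = \dot\Lambda_k + p\mu, \tag{4.4}$$
where $\dot\Lambda_k$ denote fundamental weights of $\widehat{s\ell}(m)$ and we identify $\dot\Lambda_m$ with $\dot\Lambda_0$. Adding to $\alpha$ the element $j\beta_m - \beta_k$ in (4.3), and using (4.4), we rewrite (4.3) as follows:
$$\mathrm{ch}\,L(\Lambda_0) = \frac{1}{\varphi(q)^{m+1}} \sum_{k=1}^{m}\Big(\sum_{\substack{j,r\in\mathbb{Z}\\ r+k\ge jm\\ j>0}} - \sum_{\substack{j,r\in\mathbb{Z}\\ r+k<jm\\ j\le 0}}\Big) (-1)^{r+k+jm} \sum_{\alpha\in M^\#} e^{t_\alpha(\dot\Lambda_k) + (jm-k)\mu}\, q^{\frac{r(r+1)}{2} - \frac{(jm-k)(jm-k-j)}{2} - \frac{k(j-1)}{2}}.$$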
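The bookkeeping behind (4.4) and the $q$-exponent in the rewritten form of (4.3) can be verified with exact rational arithmetic. The sketch below rests on assumptions that are not spelled out in this section: the form $(\epsilon_i|\epsilon_j) = \delta_{ij}$ for $i \le m$ and $(\epsilon_{m+1}|\epsilon_{m+1}) = -1$, the level 1 translation rule $t_\gamma(\lambda) = \lambda + \gamma - ((\lambda|\gamma) + \frac12(\gamma|\gamma))\delta$, and $q = e^{-\delta}$. All function names are hypothetical.

```python
from fractions import Fraction

def form(u, v):
    """Assumed form: (eps_i|eps_j) = delta_ij for i <= m, (eps_{m+1}|eps_{m+1}) = -1."""
    m = len(u) - 1
    return sum(u[i] * v[i] for i in range(m)) - u[m] * v[m]

def beta(k, m):
    """beta_k = k*eps_1 - (eps_1 + ... + eps_k), as coordinates in eps_1..eps_{m+1}."""
    vec = [Fraction(0)] * (m + 1)
    vec[0] += k
    for i in range(k):
        vec[i] -= 1
    return vec

def check(m, j, k):
    """Check (4.4) (finite part) and the q-exponent for p = j*m - k."""
    p = j * m - k
    lam = [Fraction(0)] * (m + 1)      # finite part of Lambda_0 - p(eps_1 - eps_{m+1})
    lam[0], lam[m] = Fraction(-p), Fraction(p)
    bm, bk = beta(m, m), beta(k, m)
    gamma = [j * bm[i] - bk[i] for i in range(m + 1)]
    # q-exponent contributed by translating by gamma at level 1
    shift = form(lam, gamma) + Fraction(1, 2) * form(gamma, gamma)
    claimed = -Fraction(p * (p - j), 2) - Fraction(k * (j - 1), 2)
    # finite part of (4.4): lam + gamma - p*mu should be the k-th fundamental weight of sl(m)
    mu = [Fraction(-1, m)] * m + [Fraction(1)]
    lhs = [lam[i] + gamma[i] - p * mu[i] for i in range(m + 1)]
    fund = [(1 if i < k else 0) - Fraction(k, m) for i in range(m)] + [Fraction(0)]
    return shift == claimed and lhs == fund

if __name__ == "__main__":
    print(all(check(m, j, k) for m in range(2, 6)
              for j in range(-3, 4) for k in range(1, m + 1)))
```

For $k = m$ the fundamental weight computed here is zero, in line with the identification of $\dot\Lambda_m$ with $\dot\Lambda_0$.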
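Denoting by $L(\dot\Lambda, a)$ the irreducible $\widehat{s\ell}(m) + \widehat{(\mathbb{C}\mu)}$-module with highest weight $\dot\Lambda + a\mu$, and recalling that ([K3], Proposition 12.13):
$$\mathrm{ch}\,L(\dot\Lambda_k, a) = \frac{1}{\varphi(q)^m} \sum_{\alpha\in M^\#} e^{t_\alpha(\dot\Lambda_k) + a\mu}, \tag{4.5}$$
we obtain:
$$\mathrm{ch}\,L(\Lambda_0) = \sum_{k=1}^{m} \sum_{\substack{p\in\mathbb{Z}\\ m\,\mid\,p+k}} b_{k,p}(q)\, \mathrm{ch}\,L(\dot\Lambda_k, p), \tag{4.6}$$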
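where
$$b_{k,p}(q) = \frac{q^{-(\frac12 - \frac{1}{2m})p^2 - \frac{1}{2m}k^2 + \frac12 k}}{\varphi(q)} \times \sum_{r\ge p} (-1)^{r-p} q^{\frac12 r(r+1)} \quad \Big(\text{resp. } \times \sum_{r<p} (-1)^{r-p-1} q^{\frac12 r(r+1)}\Big),$$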
if $p \ge 0$ (resp. $p < 0$). Thus, the branching functions $b_{k,p}(q)$ are "half" modular functions, in sharp contrast with the case of affine Lie algebras [K3]. Recalling that the series $\sum_{\alpha\in M^\#} e^{t_\alpha(\Lambda)}$ converges to a classical theta function [K3], we see that the character of the "basic" $\widehat{s\ell}(m|1)$-module is a finite linear combination of classical theta functions with coefficients "half" modular functions. The basic specialization of (3.22) gives the specialized character formulas for $\widehat{s\ell}(m|1)$-modules $L(\Lambda_{(s)})$:
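$$\mathrm{tr}_{L(\Lambda_{(s)})}\, q^{L_0} = 2q^{-\frac{s}{2}}\, \frac{\varphi(q^2)^2}{\varphi(q)^{m+2}} \sum_{k\in\mathbb{Z}^{m-1}} \frac{q^{\frac12\sum_i k_i(k_i+1)}}{1 + q^{|k|-s}}, \tag{4.7}$$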
where, as before, $|k| = \sum_i k_i$. In the remainder of this section we discuss asymptotics of (4.7). Given a positive definite quadratic form $B(x)$ on $\mathbb{R}^N$, an affine linear function $\ell(x)$ on $\mathbb{R}^N$ and an element $\alpha$ of $\mathbb{R}^N$, consider the following series, where $q = e^{2\pi i\tau}$:
$$f_{B,\ell,\alpha}(\tau) = \sum_{\gamma\in\mathbb{Z}^N+\alpha} \frac{q^{\frac12 B(\gamma)}}{1 + q^{\ell(\gamma)}}.$$
This series converges on the upper half plane $\mathrm{Im}\,\tau > 0$ to a specialization of the multivariable Appell's function. From the transformation properties of theta series one gets (see e.g. [K3]):
$$f_{B,0,\alpha}(i\beta) = \tfrac12 (\det B)^{-1/2}\beta^{-N/2} + o(\beta) \quad \text{as } \beta \to 0\ (\mathrm{Re}\,\beta > 0). \tag{4.8}$$
In order to get the asymptotics of the functions $f_{B,\ell,\alpha}(\tau)$, let
$$f^{\pm}_{B,\ell,\alpha}(\tau) = \frac12 \sum_{\substack{\gamma\in\mathbb{Z}^N+\alpha\\ \pm\ell(\gamma)>0}} q^{\frac12 B(\gamma)}.$$
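The leading term in (4.8) is easy to observe numerically in the simplest case. The sketch below assumes $N = 1$ and $B(x) = x^2$ (so $\det B = 1$), and compares $f_{B,0,\alpha}(i\beta) = \frac12\sum_{\gamma\in\mathbb{Z}+\alpha} e^{-\pi\beta\gamma^2}$ with $\frac12\beta^{-1/2}$; the function names and the truncation are illustrative.

```python
import math

def f_B0(beta, alpha=0.0, terms=200):
    """f_{B,0,alpha}(i*beta) for N = 1, B(x) = x^2, truncated to |n| <= terms."""
    return 0.5 * sum(math.exp(-math.pi * beta * (n + alpha) ** 2)
                     for n in range(-terms, terms + 1))

def leading_term(beta):
    """(1/2) (det B)^{-1/2} beta^{-N/2} with N = 1, det B = 1."""
    return 0.5 * beta ** -0.5

if __name__ == "__main__":
    for b in (0.2, 0.1, 0.05):
        print(b, f_B0(b) / leading_term(b), f_B0(b, alpha=0.3) / leading_term(b))
```

In this one-dimensional case the ratio approaches 1 exponentially fast as $\beta \to 0$, which follows from Poisson summation.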
It is easy to derive from (4.8) by induction on $N$ the following asymptotics:
$$f^{\pm}_{B,\ell,\alpha}(i\beta) = \tfrac14 (\det B)^{-1/2}\beta^{-N/2} + p_{\pm}(\beta^{-1/2}) + o(\beta), \tag{4.9}$$
as $\beta \to 0$, $\beta \in \mathbb{R}_+$, where $p_{\pm}(x)$ is a polynomial in $x$ of degree strictly less than $N$. The idea of the following lemma is due to A. Polishchuk.

Lemma 4.1. $|f_{B,\ell,\alpha}(i\beta) - f_{B,0,\alpha}(i\beta)| < p(\beta^{-1/2})$ for $\beta \in \mathbb{R}$, $0 < \beta < a$, where $a$ is a positive number and $p(x)$ is a polynomial in $x$ of degree strictly less than $N$.

Proof. Let $g(\beta) = f_{B,\ell,\alpha}(i\beta) - f_{B,0,\alpha}(i\beta)$. We have: $g(\beta) = g^+(\beta) - g^-(\beta)$, where
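$$g^{\pm}(\beta) = \frac12 \sum_{\substack{\gamma\in\mathbb{Z}^N+\alpha\\ \pm\ell(\gamma)>0}} e^{-\pi\beta B(\gamma)}\, \frac{1 - e^{\mp 2\pi\beta\ell(\gamma)}}{1 + e^{\mp 2\pi\beta\ell(\gamma)}}.$$
Furthermore, we have:
$$0 \le g^{\pm}(\beta) \le \frac12 \sum_{\substack{\gamma\in\mathbb{Z}^N+\alpha\\ \pm\ell(\gamma)>0}} e^{-\pi\beta B(\gamma)} - \frac12 \sum_{\substack{\gamma\in\mathbb{Z}^N+\alpha\\ \pm\ell(\gamma)>0}} e^{-\pi\beta(B(\gamma) \pm 2\ell(\gamma))}.$$
The first sum on the right is just $f^{\pm}_{B,\ell,\alpha}(i\beta)$, which has asymptotics (4.9). But the second sum on the right has asymptotics of this form too, since it can be written as a product of a power of $q$ and a function $f^{\pm}_{B,\ell',\alpha'}(i\beta)$ for some other affine linear function $\ell'$, by "completing the squares". The lemma is proved.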
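We shall write $f(\tau) \sim g(\tau)$ if $\lim_{\beta\to 0,\ \beta\in\mathbb{R}_+} f(i\beta)/g(i\beta) = 1$. Lemma 4.1 and (4.8) imply:
$$f_{B,\ell,\alpha}(\tau) \sim \tfrac12 (\det B)^{-1/2}\beta^{-N/2}. \tag{4.10}$$
Since
$$\eta(\tau) \sim \beta^{-1/2} e^{-\pi/12\beta}, \tag{4.11}$$
we deduce from (4.10) and (4.7) the following asymptotics along the imaginary axis $\tau = i\beta$, $\beta \in \mathbb{R}_+$:
$$\mathrm{tr}_{L(\Lambda_{(s)})}\, q^{L_0} \sim \tfrac12\, \beta^{\frac12}\, e^{\frac{\pi}{12\beta}(m+1)}. \tag{4.12}$$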
5. A Weyl Type Character Formula for Integrable Level 1 $\widehat{g\ell}(m|n)$-Modules

In this section we derive a Weyl type character formula (5.12) for principal integrable level 1 modules over $\widehat{g\ell}(m|n)$ provided that $m \ge n$. We use for that formula (3.15) for $\mathrm{ch}\,F$ and the denominator identity for $\widehat{s\ell}(m+1|n)$. In order to compare these two formulas, we consider the labelings of simple roots of $\widehat{g\ell}(m|n)$ and $\widehat{s\ell}(m+1|n)$ given below:

[Figure: Dynkin diagrams with the labelings of simple roots: nodes $0, 1, 2, \dots, m-1, m, \dots, m+n-1$ for $\widehat{g\ell}(m|n)$, and nodes $0, *, 1, 2, \dots, m-1, m, \dots, m+n-1$ for $\widehat{s\ell}(m+1|n)$.]

Putting
$$z = -e^{-\epsilon_1} q^{-\frac12} y \quad \text{and} \quad y = e^{-\alpha_*}, \tag{5.1}$$
we can rewrite formula (3.15) as follows:
$$e^{-\Lambda_0}\,\mathrm{ch}\,F = \prod_{k=1}^{\infty} \frac{\prod_{i=0}^{m-1}\big(1 - e^{\alpha_*+\alpha_1+\cdots+\alpha_i}q^k\big)\big(1 - e^{-\alpha_*-\alpha_1-\cdots-\alpha_i}q^{k-1}\big)}{\prod_{j=0}^{n-1}\big(1 + e^{\alpha_*+\alpha_1+\cdots+\alpha_{m+j}}q^k\big)\big(1 + e^{-\alpha_*-\alpha_1-\cdots-\alpha_{m+j}}q^{k-1}\big)}. \tag{5.2}$$
Denote by $W^\#$ (resp. $\tilde W^\#$) the subgroup of the Weyl group of $\widehat{g\ell}(m|n)$ (resp. $\widehat{s\ell}(m+1|n)$) generated by reflections $r_\alpha$ in roots $\alpha = \alpha_1, \dots, \alpha_{m-1}$ (resp. $\alpha_*, \alpha_1, \dots, \alpha_{m-1}$) and by $M^\#$ (resp. $\tilde M^\#$) the groups generated by translations $t_\alpha$ in integral linear combinations of these roots. Let $\tilde\rho$ denote a Weyl vector for $\widehat{s\ell}(m+1|n)$ and let $R$ denote the denominator for $\widehat{g\ell}(m|n)$. It is clear by (5.2) that the denominator $\tilde R$ of $\widehat{s\ell}(m+1|n)$ is given by
$$\tilde R = e^{-\Lambda_0} R\,\mathrm{ch}\,F. \tag{5.3}$$
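In order to write down the denominator identity for $\widehat{s\ell}(m+1|n)$, introduce the roots $\beta_{ij} = \alpha_i + \alpha_{i+1} + \cdots + \alpha_j$, $1 \le i \le j \le m+n-2$ (here $\beta_{ii} = \alpha_i$), and let $\beta_i = \beta_{m-n+i,\,m+i-1}$, $i = 1, \dots, n$. Let
$$\tilde\rho' = \tilde\rho + \sum_{\substack{m-n+2\le i\le m\le j\le m+n-2\\ j-i\le n-2}} \beta_{ij}. \tag{5.4}$$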
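Then the denominator identity for $\widehat{s\ell}(m+1|n)$ looks as follows:
$$e^{-\Lambda_0} e^{\tilde\rho} R\,\mathrm{ch}\,F = \sum_{w\in\tilde W^\#\ltimes\tilde M^\#} \epsilon(w)\, w\, \frac{e^{\tilde\rho'}}{\prod_{j=1}^{n}(1 + e^{-\beta_j})}. \tag{5.5}$$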
This identity can be derived from the denominator identity given in [KW] by making use of odd reflections as follows.
The denominator identity for $\widehat{s\ell}(m+1|n)$ in [KW] is given for the choice of the set of simple roots with a maximal number of grey nodes:

[Figure: Dynkin diagram of $\widehat{s\ell}(m+1|n)$ with $2n$ grey nodes.]

Let $\gamma_1, \dots, \gamma_n$ be the (unique) subset of the set of simple roots without $\alpha_0$ such that $(\gamma_i|\gamma_j) = 0$ for all $i, j$, and let $\tilde\rho'$ be its Weyl vector. Then the identity reads:
$$e^{\tilde\rho'}\tilde R' = \sum_{w\in\tilde W^\#\ltimes\tilde M^\#} \epsilon(w)\, w\, \frac{e^{\tilde\rho'}}{\prod_{j=1}^{n}(1 + e^{-\gamma_j})}. \tag{5.6}$$
In order to derive (5.5) from (5.6), we apply a sequence of odd reflections which transforms the initial diagram with two grey nodes to the above final diagram with $2n$ grey nodes. In order to explain this sequence, denote by $\Pi^{(i,j)}$ a set of simple roots containing $\beta_{i,m+j}$ and by $\Pi^{(i,j+1)}$ the set of simple roots obtained from it by the odd reflection in $\beta_{i,m+j}$. Denoting by $\tilde\Pi$ and $\tilde\Pi'$ the initial and the final sets of simple roots, we have the following sequence:
$$\begin{aligned} \tilde\Pi = \Pi^{(m,0)} &\to \Pi^{(m,1)} \to \cdots \to \Pi^{(m,n-2)} \to \Pi^{(m,n-1)}\\ = \Pi^{(m-1,0)} &\to \Pi^{(m-1,1)} \to \cdots \to \Pi^{(m-1,n-2)}\\ = \Pi^{(m-2,0)} &\to \Pi^{(m-2,1)} \to \cdots \to \Pi^{(m-2,n-3)}\\ = \Pi^{(m-3,0)} &\to \Pi^{(m-3,1)} \to \cdots\\ \cdots = \Pi^{(m-n+2,0)} &\to \Pi^{(m-n+2,1)} = \tilde\Pi'. \end{aligned}$$
Using Lemma 1.3, one sees that $\tilde\rho$ and $\tilde\rho'$ are related by formula (5.4), and using Lemma 1.4a, we see that
$$\Lambda_0' = \Lambda_0. \tag{5.7}$$
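Using the decompositions
$$\tilde W^\# = W^\# \cup \Big(\bigcup_{j=0}^{m-1} W^\# r_{\alpha_*+\alpha_1+\cdots+\alpha_j}\Big) \quad \text{and} \quad \tilde M^\# = \mathbb{Z}\alpha_* + M^\#,$$
we obtain from (5.5):
$$e^{-\Lambda_0} R\,\mathrm{ch}\,F = \mathrm{I}_0 - \sum_{i=0}^{m-1} \mathrm{II}_i, \tag{5.8}$$
where
$$\mathrm{I}_0 = e^{-\tilde\rho} \sum_{k\in\mathbb{Z}}\sum_{\alpha\in M^\#}\sum_{w\in W^\#} \epsilon(w)\, t_{k\alpha_*} t_\alpha w\, \frac{e^{\tilde\rho'}}{\prod_{j=1}^{n}(1 + e^{-\beta_j})},$$
$$\mathrm{II}_i = e^{-\tilde\rho} \sum_{k\in\mathbb{Z}}\sum_{\alpha\in M^\#}\sum_{w\in W^\#} \epsilon(w)\, t_{k\alpha_*} t_\alpha w\, r_{\alpha_*+\alpha_1+\cdots+\alpha_i}\, \frac{e^{\tilde\rho'}}{\prod_{j=1}^{n}(1 + e^{-\beta_j})}.$$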
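In order to compute $\mathrm{ch}\,L(\Lambda_0)$ of the $\widehat{g\ell}(m|n)$-module $L(\Lambda_0)$ we compare the constant terms (i.e., $y^0$-terms) in the expansion of both sides of (5.8) as series in powers of $y$. We will show that
$$\text{constant term of } \mathrm{II}_i = 0 \text{ for all } i, \tag{5.9}$$
$$\text{constant term of } \mathrm{I}_0 = \sum_{w\in W^\#\ltimes M^\#} \epsilon(w)\, w\, \frac{e^{\tilde\rho'}}{\prod_{j=1}^{n}(1 + e^{-\beta_j})}. \tag{5.10}$$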
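Using that $\tilde\rho'$ when restricted to the Cartan subalgebra of $\widehat{g\ell}(m|n)$ coincides with $\Lambda_0 + \rho'$, where $\rho'$ is related to the Weyl vector $\rho$ of $\widehat{g\ell}(m|n)$ by (5.4) with $\sim$ removed, we obtain from (5.5), (5.9) and (5.10):
$$e^{\rho} R\,\mathrm{ch}\,L(\Lambda_0) = \sum_{w\in W^\#\ltimes M^\#} \epsilon(w)\, w\, \frac{e^{\Lambda_0+\rho'}}{\prod_{j=1}^{n}(1 + e^{-\beta_j})}. \tag{5.11}$$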
Applying to this formula odd reflections as above, we obtain an equivalent character formula for the choice of the diagram with $2n$ grey nodes as above (where $\rho$ is a Weyl vector for this choice of the diagram):
$$e^{\rho} R\,\mathrm{ch}\,L(\Lambda_0) = \sum_{w\in W^\#\ltimes M^\#} \epsilon(w)\, w\, \frac{e^{\Lambda_0+\rho}}{\prod_{j=1}^{n}(1 + e^{-\alpha_{m-n+2j-1}})}. \tag{5.12}$$
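The proof of (5.9) and (5.10) is straightforward, and we explain it in the case $n = 1$. We have:
$$e^{-\tilde\rho}\, t_{-k\alpha_*} r_{\alpha_*+\alpha_1+\cdots+\alpha_{m-1}}\, \frac{e^{\tilde\rho}}{1 + e^{-\alpha_m}} = \frac{e^{-mk\alpha_*}\, e^{-m(\alpha_*+\alpha_1+\cdots+\alpha_{m-1})}\, q^{k^2 m - k + mk}}{1 + e^{-(\alpha_*+\alpha_1+\cdots+\alpha_m)} q^k}.$$
Hence:
$$e^{-\tilde\rho} \sum_{k\in\mathbb{Z}} t_{-k\alpha_*} r_{\alpha_*+\alpha_1+\cdots+\alpha_{m-1}}\, \frac{e^{\tilde\rho}}{1 + e^{-\alpha_m}} = \Big(\sum_{k,s\ge 0} - \sum_{k,s<0}\Big) (-1)^s y^{m(k+1)+s} (\dots),$$
where $(\dots)$ doesn't involve $y$. But the constant term of the last expression is 0 since $m(k+1)+s > 0$ if $k, s \ge 0$ and $m(k+1)+s \le s < 0$ if $k, s < 0$. Thus, the constant term of $\mathrm{II}_{m-1}$ is 0. Furthermore, we have for $0 \le i \le m-2$:
$$e^{-\tilde\rho}\, t_{k\alpha_*} r_{\alpha_*+\alpha_1+\cdots+\alpha_i}\, \frac{e^{\tilde\rho}}{1 + e^{-\alpha_m}} = y^{-mk+i+1} (\dots).$$
Since $-mk+i+1 \ne 0$ if $0 \le i \le m-2$, we see that the constant term of $\mathrm{II}_i$ is 0 for all $i$. Finally:
$$e^{-\tilde\rho}\, t_{k\alpha_*}\, \frac{e^{\tilde\rho}}{1 + e^{-\alpha_m}} = y^{-mk}\, \frac{q^{k^2 m + k}}{1 + e^{-\alpha_m}},$$
hence the constant term of this expression is equal to $\delta_{k,0}(1 + e^{-\alpha_m})^{-1}$, which proves (5.10).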
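The vanishing argument for $n = 1$ only uses elementary facts about the exponents of $y$, which can be confirmed by brute force. The following sketch checks them over a finite window; the function names and the window size `bound` are illustrative assumptions.

```python
def ii_exponents_never_vanish(m, bound=20):
    """Check that no y-exponent occurring in II_i (0 <= i <= m-1) is zero."""
    # II_{m-1}: exponents m(k+1)+s over k, s >= 0 ...
    for k in range(0, bound):
        for s in range(0, bound):
            if m * (k + 1) + s == 0:
                return False
    # ... and over k, s < 0
    for k in range(-bound, 0):
        for s in range(-bound, 0):
            if m * (k + 1) + s == 0:
                return False
    # II_i with 0 <= i <= m-2: exponents -m*k + i + 1
    for i in range(0, m - 1):
        for k in range(-bound, bound + 1):
            if -m * k + i + 1 == 0:
                return False
    return True

def i0_constant_term_indices(m, bound=20):
    """Values of k for which the exponent -m*k appearing in I_0 vanishes."""
    return [k for k in range(-bound, bound + 1) if -m * k == 0]

if __name__ == "__main__":
    print(all(ii_exponents_never_vanish(m) for m in range(2, 8)))
    print(i0_constant_term_indices(3))
```

Only $k = 0$ contributes to the constant term of $\mathrm{I}_0$, matching the factor $\delta_{k,0}$ above.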
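Using odd reflections one may derive from (5.11) or (5.12) the Weyl-type character formulas for all other level 1 principal integrable modules. For example, in the case $g = s\ell(m|1)$, we let $k = mj + s \in \mathbb{Z}_+$, where $j \in \mathbb{Z}_+$ and $0 \le s \le m-1$; then
$$e^{\rho} R\,\mathrm{ch}\,L((k+1)\Lambda_0 - k\Lambda_m) = \sum_{w\in W} \epsilon(w)\, w\, \frac{e^{\Lambda+\rho}}{1 + e^{-j\delta-\alpha_{m-s}-\cdots-\alpha_m}}, \tag{5.13}$$
$$e^{\rho} R\,\mathrm{ch}\,L(-k\Lambda_0 + (k+1)\Lambda_m) = \sum_{w\in W} \epsilon(w)\, w\, \frac{e^{\Lambda+\rho}}{1 + e^{-j\delta-\alpha_0-\cdots-\alpha_s}}, \tag{5.14}$$
and for $1 \le j \le m-1$ we have:
$$e^{\rho} R\,\mathrm{ch}\,L(\Lambda_j) = \sum_{w\in W} \epsilon(w)\, w\, \frac{e^{\Lambda_j+\rho}}{1 + e^{-\alpha_m}}. \tag{5.15}$$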
6. Classification of Integrable Highest Weight Modules over Affine Superalgebras

In this section we consider affine superalgebras of type $A(m,n)$, $B(m,n)$, $C(n)$, $D(m,n)$, $D(2,1;a)$, $F(4)$, $G(3)$. We shall exclude from consideration the well understood case of $B(0,n)$ (see [K2] and Sect. 9.5). In all cases except for $A(n,n)$ these are the affine superalgebras $\hat g$ defined by (0.1), (0.2) for $g = A(m,n)$ ($m \ne n$), $B(m,n)$, $C(n)$, $D(m,n)$, $D(2,1;a)$, $F(4)$ and $G(3)$ respectively (see [K1] for a construction of the simple finite-dimensional Lie superalgebras $g$). In the $A(n,n)$ case it is more convenient to take $g = s\ell(n+1|n+1)$ in order not to lose the most interesting modules. The Lie superalgebra $g$ carries a unique, up to a constant factor, non-zero invariant bilinear form $(.|.)$. This form extends to $\hat g$ by formula (0.3) and it is normalized by the values of $(\alpha_i|\alpha_i)$ given in Table 6.1 (see below).

It is convenient to depict Cartan matrices of affine (super)algebras by (generalized) Dynkin diagrams (cf. [K1]). We shall assume that the diagonal entries of a Cartan matrix are always 2 or 0 (one can achieve this by rescaling simple coroots). The Dynkin diagram of $\hat A$ is a graph whose nodes label the index set $\hat I = \{0, 1, 2, \dots\}$ and are of the form $\circ$, $\otimes$ or $\bullet$, corresponding to the cases $a_{ii} = 2$, $i \notin \hat I_1$; $a_{ii} = 0$ (then $i \in \hat I_1$); and $a_{ii} = 2$, $i \in \hat I_1$, respectively. These nodes are called white, grey and black respectively, so that $\hat I_1$ consists of non-white nodes. We let $I = \hat I\setminus\{0\}$, $I_1 = \hat I_1\setminus\{0\}$. As usual, $I$ labels simple roots $\alpha_1, \alpha_2, \dots$ of $g$, $I_1$ labels odd simple roots of $g$, and $\alpha_0 = \delta - \theta$, where $\theta$ is the highest root of $g$. In the cases $a_{ij} = a_{ji} = 0$, $i \ne j$, the $i$th and $j$th nodes are not connected. In the cases $a_{ii} = a_{jj} = 2$, $i \ne j$, the nodes are, as usual, connected by $|a_{ij}a_{ji}|$ edges with an arrow pointing to the $j$th node if $|a_{ji}| > 1$. In the remaining cases the nodes are joined as follows:
$$\otimes\!-\!\circ = \begin{pmatrix} 0 & a\\ -1 & 2 \end{pmatrix}, \quad \otimes\!\Rightarrow\!\circ = \begin{pmatrix} 0 & a\\ -2 & 2 \end{pmatrix}, \quad \otimes\!-\!\otimes = \begin{pmatrix} 0 & a\\ b & 0 \end{pmatrix}, \quad \circ\!\Rightarrow\!\bullet = \begin{pmatrix} 2 & -1\\ -2 & 2 \end{pmatrix}.$$
In Table 6.1 below we list the Dynkin diagrams of the symmetrizable Cartan matrices of the affine Lie superalgebras g under consideration. The labels against the nodes i are (αi |αi ), and the labels against the edges connecting i and j are (αi |αj ). Recall that
$\alpha_i^\vee = 2\alpha_i/(\alpha_i|\alpha_i)$ if $a_{ii} \ne 0$; we let $\alpha_i^\vee = \alpha_i$ if $a_{ii} = 0$. We also give the coefficients of the decomposition of the root $\delta$ in terms of simple roots. The nodes are numbered by $\hat I = \{0, 1, \dots\}$ in increasing order from left to right, except when it is impossible to do so, in which case nodes are numbered by the subscripts of their labels.
Table 6.1. [Dynkin diagrams with the labels $(\alpha_i|\alpha_i)$ and $(\alpha_i|\alpha_j)$: figure content not reproduced; only the decomposition of $\delta$ is listed.]

g                    δ
A(m, n)              $\sum_{i=0}^{m+n+1}\alpha_i$
B(0, n)              $\alpha_0 + 2\sum_{i=1}^{n}\alpha_i$
B(m, n), m > 0       $\alpha_0 + 2\sum_{i=1}^{m+n}\alpha_i$
C(n)                 $\alpha_0 + \alpha_1 + 2\sum_{i=2}^{n-1}\alpha_i + \alpha_n$
D(m, n)              $\alpha_0 + 2\sum_{i=1}^{m+n-2}\alpha_i + \alpha_{m+n-1} + \alpha_{m+n}$
D(2, 1; a)           $\alpha_0 + 2\alpha_1 + \alpha_2 + \alpha_3$
F(4)                 $\alpha_0 + 2\alpha_1 + 3\alpha_2 + 2\alpha_3 + \alpha_4$
G(3)                 $\alpha_0 + 2\alpha_1 + 4\alpha_2 + 2\alpha_3$
Remark 6.1. Recall the definition of the orthosymplectic Lie superalgebra $osp(M|N)$ [K1]. Let $V = V_{\bar 1} \oplus V_{\bar 0}$ be a superspace, where $\dim V_{\bar 1} = M$, $\dim V_{\bar 0} = N$, and let $(.|.)$ be a non-degenerate bilinear form on $V$ such that $(V_{\bar 0}|V_{\bar 1}) = 0$, the restriction of $(.|.)$ to $V_{\bar 1}$ is symmetric and to $V_{\bar 0}$ is skewsymmetric, so that $N = 2n$ is even; let $m = [M/2]$. Then ($\alpha = 0, 1$):
$$osp(M|N)_{\bar\alpha} = \{a \in g\ell(M|N)_{\bar\alpha} \mid (a(x)|y) + (-1)^{\alpha p(x)}(x|a(y)) = 0,\ x, y \in V\}.$$
For the definition in a matrix form, consider the following $(M+N)\times(M+N)$ matrices:
$$C = \begin{pmatrix} C_1 & 0\\ 0 & C_2 \end{pmatrix}, \qquad F = \begin{pmatrix} I_M & 0\\ 0 & -I_N \end{pmatrix},$$
where $C_1$ (resp. $C_2$) is an $M\times M$ (resp. $N\times N$) symmetric (resp. skewsymmetric) matrix. Then
$$osp(M|N)_{\bar\alpha} = \{a \in g\ell(M|N)_{\bar\alpha} \mid F^{\alpha} a^{T} C + Ca = 0\}, \qquad \alpha = 0, 1.$$
Recall that $B(m,n) = osp(2m+1|2n)$, $C(n) = osp(2|2n)$ and $D(m,n) = osp(2m|2n)$. The invariant bilinear form on $osp(M|N)$ that is used in Table 6.1 and throughout the paper is
$$(a|b) = \tfrac12\, \mathrm{str}\, ab.$$
Table 6.2.

g             $g_{\bar 0}$
A(m, n)       $A_m + A_n + \mathbb{C}$
B(m, n)       $B_m + C_n$
C(n)          $\mathbb{C} + C_n$
D(m, n)       $D_m + C_n$
D(2, 1; a)    $D_2 + A_1$
F(4)          $B_3 + A_1$
G(3)          $G_2 + A_1$
The even parts $g_{\bar 0}$ of the Lie superalgebras $g$ are listed in Table 6.2. In the case of $D(2,1;a)$, the subalgebra $D_2$ corresponds to $\alpha_2$ and $\alpha_3$ (see Table 6.1). We denote by $g'_{\bar 0}$ (resp. $g''_{\bar 0}$) the first (resp. second) non-zero summand of $g_{\bar 0}$ in the decomposition of Table 6.2. Note that the invariant bilinear form $(.|.)$ (which can be read off from Table 6.1) is normalized in such a way that it is positive definite on $g'_{\bar 0}$ and negative definite on $g''_{\bar 0}$ (except that for $D(2,1;a)$ we should assume that $a \in \mathbb{R}_{>0}$), and the maximal square length of a root is 2, except for the cases $B(1,n)$ when it is 1 and $D(2,1;a)$ when it is $\max(2, 2a)$. If the Killing form on $g$ is non-degenerate, then the form $(.|.)$ is a positive (resp. negative) multiple of the Killing form in the cases $g \cong s\ell(m|n)$ with $m > n$, $osp(m|n)$ with $m > n+2$, $F(4)$ and $G(3)$ (resp. $s\ell(m|n)$ with $m < n$ and $osp(m|n)$ with $m < n+2$).

An irreducible highest weight module over $\hat g$ is called a principal (resp. subprincipal) integrable module if it is integrable with respect to $\hat g'_{\bar 0}$ (resp. $\hat g''_{\bar 0}$) and locally finite with respect to $g$ (cf. Definition 0.1). As we shall see, the non-trivial principal (resp. subprincipal) highest weight modules have positive (resp. negative) level, except for the cases $g = A(0,n)$ and $C(n)$ (resp. $A(n,0)$). It is easy to see that in these cases the only conditions of integrability are $k_i \in \mathbb{Z}_+$ if $i \in \hat I\setminus\hat I_1$; we shall exclude these cases from further considerations.

Let $\theta'$ be the highest root of $g'_{\bar 0}$; in cases $D(2,n)$ and $D(2,1;a)$, which are the only cases when $g'_{\bar 0}$ is not simple, we have: $g'_{\bar 0} = A_1 + A_1$, and the highest roots are $\theta'_+ = \alpha_{n+1}$
and $\theta'_- = \alpha_{n+2}$ (where $n = 1$ for $D(2,1;a)$). The root $\theta'$ (resp. roots $\theta'_\pm$) gives rise to a simple root $\alpha'_0 = \delta - \theta'$ (resp. simple roots $\alpha'_{0\pm} = \delta - \theta'_\pm$) of $\hat g'_{\bar 0}$. The corresponding coroot is $\alpha'^\vee_0 = 2\alpha'_0/(\alpha'_0|\alpha'_0)$. In all cases except for $A(m,n)$ and $C(n)$ there is a (unique) simple root $\alpha''$ of $g''_{\bar 0}$ (which is a simple Lie algebra), which is not a simple root of $g$ [K1]. As we have seen in Sect. 2, the principal integrability in the case of $A(m,n)$ follows from local nilpotency of the root vector $f'$ attached to the root $-\alpha'_0$. In all other cases one has to check in addition the local nilpotency of the root vector $f''$ attached to the root $-\alpha''$ (in order to ensure the local finiteness with respect to $g''_{\bar 0}$; that with respect to $g'_{\bar 0}$ follows automatically from the integrability with respect to $\hat g'_{\bar 0}$). For that reason, as we have seen in Sect. 1, it is important to introduce the following numbers, where $\rho$ is a Weyl vector for $\hat g$:
$$b' = -\langle\rho, \alpha'^\vee_0\rangle, \qquad b'_\pm = -\langle\rho, \alpha'^\vee_{0\pm}\rangle, \qquad b'' = -\langle\rho, \alpha''^\vee\rangle.$$
The values of the numbers $b'$ and $b'_\pm$ (resp. $b''$) are given in Table 6.3 (resp. 6.4), the first line for $D(2,\dots)$ in Table 6.3 being for $b'_+$ and the second for $b'_-$. Table 6.3 contains also the formula for the partial levels ($k'_\pm$ being on the same line as $b'_\pm$)
$$k' = \langle\Lambda, \alpha'^\vee_0\rangle, \qquad k'_\pm = \langle\Lambda, \alpha'^\vee_{0\pm}\rangle,$$
and the level $k = \langle\Lambda, K\rangle$ of a weight $\Lambda$ in terms of its labels $k_i$ and $k'$, $k'_\pm$. Table 6.4 contains also a formula for
$$k'' = \langle\Lambda, \alpha''^\vee\rangle$$
and the level $k$ in terms of the $k_i$ and $k''$.

Theorem 6.1. For an affine superalgebra $\hat g$ from Table 6.1 (recall that $A(0,n)$, $B(0,n)$ and $C(n)$ are excluded) the labels $\{k_i\}_{i\in\hat I}$ of the highest weight $\Lambda$ of a principal integrable irreducible highest weight module $L(\Lambda)$ are characterized by the following four series of conditions:
(1) $k_i \in \mathbb{Z}_+$ if $i \in \hat I\setminus\hat I_1$,
(2) $k'$ (resp. $k'_\pm$ in case $D(2,\dots)$) $\in \mathbb{Z}_+$, $k'' \in \mathbb{Z}_+$ (see Tables 6.3 and 6.4),
(3) if $k'$ (resp. one of the $k'_\pm$) $\le b'$ (see Table 6.3), there are the supplementary conditions:
$A(m,n)$, $m \ge 1$: there exists $s \in \mathbb{Z}_+$, $s \le k'$, such that: $k_{m+1} = k_{m+2} + \cdots + k_{m+1+s} + s$, $k_{m+s+2} = \cdots = k_{m+s+1+n-k'} = 0$,
$B(m,n)$, $m \ge 2$, and $D(m,n)$, $m \ge 3$: one of the four possibilities holds:
(i) there exist $r, s \in \mathbb{Z}_+$, $r < s$, such that $k' = r + s$, $k_j = 0$ for $r+1 \le j \le m+n$ and $j \ne s$, $k_s = 1$,
(ii) there exist $r, s \in \mathbb{Z}_+$, $r \le s$, such that $k' = r + s$, $k_j = 0$ for $r+1 \le j \le m+n$, $k_r \ne 0$,
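Table 6.3.

g                    k' (resp. k'_+, k'_−)                                   b' (resp. b'_±)    k
A(m, n) (m ≥ 1)      $k_0 + k_{m+1} - \sum_{i=m+2}^{m+n+1} k_i$               $n$                $k' + \sum_{i=1}^{m} k_i$
B(1, n)              $4k_n + k_{n+1} - 4\sum_{i=0}^{n-1} k_i$                 $4n-1$             $\frac12(k' + k_{n+1})$
B(m, n) (m ≥ 2)      $2k_n + k_{n+1} - 2\sum_{i=0}^{n-1} k_i$                 $2n-1$             $k' + k_{n+1} + 2\sum_{i=n+2}^{n+m-1} k_i + k_{n+m}$
D(2, n)              $k'_+ = 2k_n + k_{n+1} - 2\sum_{i=0}^{n-1} k_i$          $2n-1$             $k'_+ + k_{n+2}$
                     $k'_- = 2k_n + k_{n+2} - 2\sum_{i=0}^{n-1} k_i$          $2n-1$             $k'_- + k_{n+1}$
D(m, n)              $2k_n + k_{n+1} - 2\sum_{i=0}^{n-1} k_i$                 $2n-1$             $k' + k_{n+1} + 2\sum_{i=n+2}^{m+n-2} k_i + k_{n+m-1} + k_{n+m}$
D(2, 1; a)           $k'_+ = -(a+1)k_0 + 2k_1 + ak_3$                         $1$                $k'_+ + k_2$
                     $k'_- = a^{-1}(-(a+1)k_0 + 2k_1 + k_2)$                  $1$                $a(k'_- + k_3)$
F(4)                 $-\frac32 k_0 + 2k_1 + \frac12 k_2$                      $1$                $k' + k_2 + 2k_3 + k_4$
G(3)                 $-\frac43 k_0 + 2k_1 + \frac13 k_2$                      $1$                $k' + k_2 + k_3$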
Table 6.4.

g                    k''                                                               b''              k
B(m, n), m ≥ 1       $-k_n - \sum_{i=n+1}^{n+m-1} k_i - \frac12 k_{m+n}$               $m - \frac12$    $-2(k'' + \sum_{i=0}^{n-1} k_i)$
C(n)                 $-\frac12(k_0 + k_1)$                                             $0$              $-2(k'' + \sum_{i=2}^{n} k_i)$
D(m, n)              $-k_n - \sum_{i=n+1}^{n+m-2} k_i - \frac12(k_{m+n-1} + k_{m+n})$  $m - 1$          $-2(k'' + \sum_{i=0}^{n-1} k_i)$
D(2, 1; a)           $-(a+1)^{-1}(2k_1 + k_2 + ak_3)$                                  $1$              $-(a+1)(k'' + k_0)$
F(4)                 $-\frac43 k_1 - k_2 - \frac43 k_3 - \frac23 k_4$                  $3$              $-\frac32(k'' + k_0)$
G(3)                 $-\frac32 k_1 - k_2 - \frac32 k_3$                                $\frac52$        $-\frac43(k'' + k_0)$
(iii) there exists $r \in \mathbb{Z}_+$ such that $k' = n + r$, $k_j = 0$ for $r+1 \le j \le n-1$, $k_n \ne 0$, $k_n + k_{n+1} + 1 = 0$,
(iv) there exists $r \in \mathbb{Z}_+$ such that $k' \ge n + r$, $k_j = 0$ for $j \ge r+1$, $k_r \ne 0$.
$B(1,n)$: the same as for $B(m,n)$ with $m > 1$ with the following changes: $k'$ is replaced by $\frac12 k'$ everywhere and $k_{n+1}$ is replaced by $\frac12 k_{n+1}$ in (iii),
$D(2,n)$: the same as for $D(m,n)$ with $m > 2$ with the following additions: $k'_+ = k'_-$, $k_n + k_{n+2} + 1 = 0$ in (iii),
$D(2,1;a)$: one of the four possibilities holds:
(i) $k'_+ = k'_- = 0$, then all $k_i = 0$,
(ii) $a \in \mathbb{Q}_{>0}$, $a, a^{-1} \notin \mathbb{N}$ and one of $k'_\pm$ equals 1, then $k'_+ = k'_- = 1$ and one has:
(*) $k_0 = -(a+1)^{-1}r - 1$, $k_1 = -r$, $k_2 = r - 1$, $k_3 = a^{-1}r - 1$ for some $r \in \mathbb{N}\cap a\mathbb{N}$,
(iii) $a^{-1} \in \mathbb{N}$ and $k'_+ = 1$, then either (*) holds or $k_0 = -(a+1)^{-1}$, $k_1 = k_2 = k_3 = 0$,
(iv) $a \in \mathbb{N}$ and $k'_- = 1$, then either (*) holds or $k_0 = -a(a+1)^{-1}$, $k_1 = k_2 = k_3 = 0$,
$F(4)$: one of the two possibilities holds:
(i) $k' = 0$, then all $k_i = 0$,
(ii) $k' = 1$, then $k_0 = -\frac23$, $k_1 = k_2 = k_3 = k_4 = 0$,
$G(3)$: one of the two possibilities holds:
(i) $k' = 0$, then all $k_i = 0$,
(ii) $k' = 1$, then $k_0 = -\frac34$, $k_1 = k_2 = k_3 = 0$,
(4) if $k'' \le b''$ (see Table 6.4), there are the supplementary conditions:
$B(m,n)$: $k_j = 0$ for all $j \ge n + k'' + 1$,
$D(m,n)$: one of the two possibilities holds:
(i) $k'' \le m - 2$ and $m > 2$ (resp. $m = 2$), then $k_j = 0$ for all $j \ge n + k'' + 1$ (resp. $j \ge n$),
(ii) $k'' = m - 1$, then $k_{m+n-1} = k_{m+n}$,
$D(2,1;a)$: one of the two possibilities holds:
(i) $k'' = 0$, then $k_1 = k_2 = k_3 = 0$,
(ii) $k'' = 1$, then $a \in \mathbb{Q}$ and $k_2 + 1 = |a|(k_3 + 1)$,
$F(4)$: one of the three possibilities holds:
(i) $k'' = 0$, then $k_1 = k_2 = k_3 = k_4 = 0$,
(ii) $k'' = 2$, then $k_2 = k_4 = 0$,
(iii) $k'' = 3$, then $k_2 = 2k_4 + 1$,
$G(3)$: one of the two possibilities holds:
(i) $k'' = 0$, then $k_1 = k_2 = k_3 = 0$,
(ii) $k'' = 2$, then $k_2 = 0$.
Proof. In the case $g = A(m,n)$, the theorem follows from Theorem 2.1. In general, the proof is based on similar arguments. Below we shall give details in the case $g = B(1,n)$; in the rest of the cases the arguments are the same.

The even part of $B(1,n)$ is $A_1 + C_n$ and its simple roots are $\alpha_{n+1}$ for $A_1$ and $\{\alpha_1, \alpha_2, \dots, \alpha_{n-1}, \alpha'' = 2(\alpha_n + \alpha_{n+1})\}$ for $C_n$. The simple roots of $\hat A_1$ are $\{\alpha'_0 = \delta - \alpha_{n+1}, \alpha_{n+1}\}$. Due to Lemma 1.5a, the local finiteness (resp. integrability) with respect to $C_n$ (resp. $\hat A_1$) implies that $k_1, \dots, k_{n-1}, k'' \in \mathbb{Z}_+$ (resp. $k_{n+1}, k' \in \mathbb{Z}_+$). Hence, conditions (1) and (2) are necessary. Furthermore, it follows from Proposition 1.2 that in the cases $k' > b' = 4n - 1$ (resp. $k'' > b'' = \frac12$) the element $f'$ (resp. $f''$) is locally nilpotent. It remains to show that in the case of the inequality
$$k' \le 4n - 1 \tag{6.1}$$
the element $f'$ is locally nilpotent iff condition (3) holds, and in the case $k'' = 0$, $f''$ is locally nilpotent iff (4) holds. We shall concentrate on the first claim, the second being easier (cf. also [K1]). Introduce the following isotropic roots:
$$\beta_j = \sum_{i=n-j}^{n} \alpha_i \ (j = 0, \dots, n), \qquad \beta_{n+j} = \beta_n + \sum_{i=1}^{j} \alpha_i \ (j = 1, \dots, n-1).$$
We have: $\beta_j^\vee = \beta_j$ for all $j$, and
$$(\beta_i|\alpha_{n+1}) = -1 \text{ for all } i, \qquad (\beta_i|\beta_{i+1}) = \begin{cases} 1 & \text{if } 0 \le i \le n-2,\\ 2 & \text{if } i = n-1. \end{cases} \tag{6.2}$$
Let $\Pi^{(0)} = \Pi$, then $\beta_0 \in \Pi^{(0)}$ and we let $\Pi^{(1)} = r_{\beta_0}\Pi^{(0)}$. Similarly $\beta_1 \in \Pi^{(1)}$ and we let $\Pi^{(2)} = r_{\beta_1}\Pi^{(1)}, \dots, \Pi^{(2n)} = r_{\beta_{2n-1}}\Pi^{(2n-1)}$. We have:
$$\alpha'_0 \in \Pi^{(2n)}, \qquad \alpha'^\vee_0 = 2\alpha'_0. \tag{6.3}$$
Let $\Lambda^{(j)}$ denote the highest weight of $L(\Lambda)$ with respect to $n_+^{(j)}$. It can be computed by making use of Lemma 1.4. Introduce the following numbers:
$$u_j = \langle\Lambda^{(j)}, \alpha'^\vee_0\rangle \ (j = 0, \dots, 2n), \qquad t_j = \langle\Lambda^{(j)}, \beta_j\rangle \ (j = 0, \dots, 2n-1).$$
Using (6.2), $\langle\Lambda, \alpha'^\vee_0\rangle = k'$ and Lemma 1.4, we get the following recurrent formula for the $u_j$'s:
$$u_0 = k', \qquad u_{j+1} = u_j - 2 \ (\text{resp.} = u_j) \text{ if } t_j \ne 0 \ (\text{resp.} = 0). \tag{6.4}$$
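Read as "subtract 2 whenever $t_j \ne 0$", the recurrence (6.4) says that $u_{2n}$ equals $k'$ minus twice the number of nonzero $t_j$. A minimal sketch of this bookkeeping follows; the function name `final_u` and the sample inputs are hypothetical, and the values $t_j$ are taken as given rather than computed from the labels.

```python
def final_u(k_prime, t):
    """Return u_{2n} from (6.4): u_0 = k', and u_{j+1} = u_j - 2 iff t_j != 0."""
    u = k_prime
    for t_j in t:
        if t_j != 0:
            u -= 2
    return u

def count_nonzero(t):
    """v = number of indices j with t_j != 0."""
    return sum(1 for t_j in t if t_j != 0)

if __name__ == "__main__":
    t = [0, -1, -2, 0]          # a sample sequence t_0, ..., t_{2n-1} with n = 2
    print(final_u(5, t), 5 - 2 * count_nonzero(t))
```

In particular, when $k' \ge 4n$ the result is non-negative whatever the $t_j$ are, since at most $2n$ of them can be nonzero.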
In view of Lemma 1.5, the local nilpotency of $f'$ follows from $u_{2n} \in \mathbb{Z}_+$. This, clearly, holds if $k' \ge 4n$ (by (6.4)), which again shows that in this case there are no supplementary conditions. From now on we may assume that $k' \le 4n - 1$. We may also assume that conditions (1), (2) and (4) hold. We shall derive a recurrent formula for the $t_i$. Using that, by Lemma 1.4, $\Lambda^{(i+1)} = \Lambda^{(i)} - \beta_i$ (resp. $= \Lambda^{(i)}$) if $t_i \ne 0$ (resp. $= 0$) and that $\alpha_{n-i-1} \in \Pi^{(i)}$, we obtain:
$$t_{i+1} = \begin{cases} t_i - k_{n-i-1} - 1 & \text{if } t_i \ne 0\\ t_i - k_{n-i-1} & \text{if } t_i = 0 \end{cases} \quad (0 \le i \le n-2), \tag{6.5a}$$
$$t_n = t_{n-1} - 2k_0 - 2 \ (\text{resp.} = t_{n-1} - 2k_0) \text{ if } t_{n-1} \ne 0 \ (\text{resp.} = 0). \tag{6.5b}$$
For $t_j$, $j \ge n$, the recurrent formula involves the numbers $s_i = (\Lambda^{(n)}|\alpha_i)$ for $1 \le i \le n-1$, $s_n = (\Lambda^{(n)}|\alpha_n + \alpha_{n+1})$, which, using the above arguments, can be expressed in terms of the labels of $\Lambda$ as follows:
$$s_i = \begin{cases} -k_i & \text{if } t_{n-i-1}t_{n-i} \ne 0 \text{ or } t_{n-i-1} = t_{n-i} = 0,\\ -k_i - 1 & \text{if } t_{n-i-1} \ne 0,\ t_{n-i} = 0,\\ -k_i + 1 & \text{if } t_{n-i-1} = 0,\ t_{n-i} \ne 0, \end{cases} \quad (1 \le i \le n-1), \tag{6.6a}$$
$$s_n = -k'' + 1 \ (\text{resp.} -k''), \text{ where } k'' = -k_n - \tfrac12 k_{n+1}, \text{ if } k_n \ne 0 \ (\text{resp.} = 0). \tag{6.6b}$$
Then we have ($0 \le i \le n-1$):
$$t_{n+i+1} = t_{n+i} + s_{i+1} - 1 \ (\text{resp. } t_{n+i} + s_{i+1}) \text{ if } t_{n+i} \ne 0 \ (\text{resp.} = 0), \tag{6.7a}$$
where we let
$$t_{2n} = \tfrac12 u_{2n}. \tag{6.7b}$$
Note that $t_0 = k_n \le 0$ since $k'' = -k_n - \frac12 k_{n+1} \in \mathbb{Z}_+$ and $k_{n+1} \in \mathbb{Z}_+$. Since $k_i \in \mathbb{Z}_+$ for $i \ne 0, n$, formulae (6.5) imply
$$0 \ge t_0 \ge t_1 \ge \cdots \ge t_{n-1}. \tag{6.8}$$
Furthermore, we have
$$t_n \ge t_{n+1} \ge \cdots \ge t_{2n} = \tfrac12 u_{2n}. \tag{6.9}$$
In order to show this, it suffices to prove that $s_i \le 0$, $1 \le i \le n$. But, due to (6.6a), $s_i > 0$ can take place for $1 \le i \le n-1$ only when $t_{n-i-1} = 0$, $t_{n-i} \ne 0$, $k_i = 0$, which is impossible by (6.5a). Also, $s_n = -k'' + 1$ (resp. $-k''$) if $k_n \ne 0$ (resp. $= 0$) cannot be positive since in this case $k'' = 0$, which implies that $k_n = 0$ (see supplementary conditions (4)).

Suppose now that $f'$ is locally nilpotent. Then $u_{2n} \ge 0$ (by Lemma 1.5), hence we have from (6.4):
$$t_j = 0 \text{ for some } 0 \le j \le 2n-1. \tag{6.10}$$
Due to (6.10), (6.8) and (6.9), we have the following three possibilities for some $0 \le i_0 \le n-1$ and $n \le j_0 \le 2n-1$:
($\alpha$) $t_0 = \cdots = t_{i_0} = 0$, $t_{i_0+1} \ne 0, \dots, t_{2n-1} \ne 0$,
($\beta$) $t_0 \ne 0, \dots, t_{j_0-1} \ne 0$, $t_{j_0} = \cdots = t_{2n} = 0$,
($\gamma$) $t_0 = \cdots = t_{i_0} = 0$, $t_{i_0+1} \ne 0, \dots, t_{j_0-1} \ne 0$, $t_{j_0} = \cdots = t_{2n} = 0$.
The possibilities (i), (ii), (iii) and (iv) of the supplementary conditions (3) correspond respectively to the following cases:
(i) ($\gamma$) when $i_0 + j_0 < 2n - 1$, where we put $i_0 = n - s - 1$, $j_0 = n + r$,
(ii) ($\gamma$) when $i_0 + j_0 \ge 2n - 1$, where we put $i_0 = n - r - 1$, $j_0 = n + s$,
(iii) ($\beta$), where we put $j_0 = n + r$,
(iv) ($\alpha$), where we put $i_0 = n - r - 1$.

We consider in detail only case (ii), the treatment of all other cases being similar. We have: $t_0 = \cdots = t_{n-r-1} = 0$, $t_{n-r} \ne 0, \dots, t_{n+s-1} \ne 0$, $t_{n+s} = \cdots = t_{2n-1} = 0$ for some integers $r, s$ such that $0 \le r \le s \le n-1$. Hence, by (6.5a) we have:
$$k_r \ne 0, \qquad k_{r+1} = \cdots = k_n = 0, \tag{6.11}$$
and, by (6.6a), we have:
$$s_i = -k_i \text{ if } i \ne r,\ 1 \le i \le n-1, \qquad s_r = -k_r + 1, \qquad s_n = 0.$$
The recurrent formulas (6.5) and (6.7) can be now rewritten respectively as follows:
$$\begin{aligned} t_{n-r} &= -k_r,\\ t_{n-r+1} &= t_{n-r} - k_{r-1} - 1,\\ &\cdots\\ t_{n-1} &= t_{n-2} - k_1 - 1,\\ t_n &= t_{n-1} - 2k_0 - 2,\\ t_{n+1} &= t_n - k_1 - 1,\\ &\cdots\\ t_{n+r-1} &= t_{n+r-2} - k_{r-1} - 1,\\ t_{n+r} &= t_{n+r-1} - k_r,\\ t_{n+r+1} &= t_{n+r} - k_{r+1} - 1,\\ &\cdots\\ 0 = t_{n+s} &= t_{n+s-1} - k_s - 1 + \delta_{r,s}. \end{aligned}$$
Summing up these equalities, we get: $2\sum_{i=0}^{r} k_i = -(r+s)$, which, in view of (6.11), implies $k' = 2(r+s)$.

Suppose now that conditions (1)–(4) hold. We have to show that $u_{2n} \ge 0$. As before, we may assume that $k' \le 4n - 1$, hence, due to (3), (6.10) holds. Since (6.8) and (6.9) hold, we again have only the possibilities ($\alpha$), ($\beta$) and ($\gamma$). In cases ($\beta$) and ($\gamma$), $u_{2n} = 2t_{2n} = 0$, hence only case ($\alpha$) remains. This case corresponds to (3)(iv), when we have: $t_0 = k_n = 0$, $t_1 = t_0 - k_{n-1} = 0, \dots, t_{n-r-1} = t_{n-r-2} - k_{r+1} = 0$. Hence $v := \#\{0 \le j \le 2n-1 \mid t_j \ne 0\} \le n + r$ and $u_{2n} = u_0 - 2v \ge k' - 2(n+r) \ge 0$ (we have used here (6.4) and (3)(iv)).
Theorem 6.2. For an affine superalgebra $\hat g$ from Table 6.4 the labels $\{k_i\}_{i\in\hat I}$ of the highest weight $\Lambda$ of a subprincipal integrable irreducible highest weight module are characterized by the following conditions:
(1) $k_i \in \mathbb{Z}_+$ if $i \in \hat I\setminus\hat I_1$, $k'' \in \mathbb{Z}_+$,
(2) if $k'' \le b''$ (see Table 6.4), there are supplementary conditions described by (4) of Theorem 6.1, and also in the $C(n)$ case: $k_0 = k_1 = 0$.
Proof. The only simple root of $\hat g''_{\bar 0}$ which is not simple for $\hat g$ is $\alpha''$. Hence the proof of Theorem 6.1 proves Theorem 6.2 as well.

Remark 6.2. It follows from Theorem 6.1 that the level $k$ of a principal integrable $\hat g$-module $L(\Lambda)$ is a non-negative number which is an integer in all cases, except for $B(1,n)$, when it is a half-integer; moreover, if $k = 0$, then all labels of $\Lambda$ are 0, hence $L(\Lambda)$ is 1-dimensional; also, if $k > 0$, then $k \ge 1$.

Remark 6.3. It is easy to see that, when restricted to the derived subalgebra $[\hat g, \hat g]$ of $\hat g$, the module $L(\Lambda)$ remains irreducible. Two $\hat g$-modules are called essentially equivalent if they are equivalent as $[\hat g, \hat g]$-modules. For example, the modules $L(\Lambda)$ and $L(\Lambda + a\delta)$ are essentially equivalent for any $a \in \mathbb{C}$. Theorem 6.1 gives the following complete list of principal integrable modules of level 1 up to essential equivalence:
(1) $A(m,n)$, $m \ge 1$: $\Lambda_s$ ($1 \le s \le m$), $(a+1)\Lambda_{m+1} + a\Lambda_{m+2}$ ($a \in \mathbb{Z}_+$), and $(a+1)\Lambda_0 + a\Lambda_{m+n+1}$ ($a \in \mathbb{Z}_+$),
(2) $B(m,1)$ and $D(m,1)$: $-\frac12\Lambda_0$ and $-\frac32\Lambda_0 - \Lambda_1$,
(3) $B(m,n)$ and $D(m,n)$, $n \ge 2$: $-\frac12\Lambda_0$ and $-\frac32\Lambda_0 + \Lambda_1$,
(4) $D(2,1;a)$, $a^{-1} \in \mathbb{N}$: $-(a+1)^{-1}\Lambda_0$ and $-\frac{a+2}{a+1}\Lambda_0 - \Lambda_1 + \frac{1-a}{a}\Lambda_3$,
(5) $F(4)$: $-\frac23\Lambda_0$,
(6) $G(3)$: $-\frac34\Lambda_0$.
One can show (cf. Remark 2.5) that in all cases, all weights are conjugate to each other by odd reflections. Thus, for each of the affine superalgebras $A(m,n)$ ($m \ge 1$), $B(m,n)$, $D(m,n)$, $D(2,1;a)$ ($a \in \mathbb{Q}_{>0}$), $F(4)$ and $G(3)$ all, up to essential equivalence, principal integrable modules of level 1 can be obtained from one of them by making different choices of the set of positive roots. Note also that in all cases the "basic" module $L(u\Lambda_0)$, where $u$ is such that $u\Lambda_0$ has level 1, is a principal integrable module.

Remark 6.4. Using the symmetry of $A(m,n)$ which exchanges the subalgebras $\hat A_m$ and $\hat A_n$, one gets the classification of the subprincipal integrable modules $L(\Lambda)$ for this affine superalgebra: $k_i \in \mathbb{Z}_+$ for $i \in \hat I\setminus\hat I_1$,
$$k'' := -\sum_{i=0}^{m+1} k_i \in \mathbb{Z}_+,$$
and there exists $s \in \mathbb{Z}_+$, $s \le k''$, such that $k_0 + k_1 + \cdots + k_s + s = 0$ and $k_{s+1} = \cdots = k_{s+m-k''+1} = 0$. One has:
$$k = -\Big(k'' + \sum_{i=m+2}^{m+n+1} k_i\Big).$$

Remark 6.5. All principal integrable highest weights of level 2 (up to essential equivalence) for $B(1,2)$ are $-(1+a)\Lambda_0 + a\Lambda_1$, where $a \in \mathbb{Z}_+$. Thus, in sharp contrast to the level 1 case, there are infinitely many essentially inequivalent principal integrable highest weight modules of level $\ge 2$.
Remark 6.6. It follows from Theorem 6.2 and Remark 6.3 that the level $k$ of a subprincipal integrable $\hat g$-module $L(\Lambda)$ is a non-positive number, provided that $a > -1$ for $D(2,1;a)$; moreover, $\dim L(\Lambda) = 1$ if $k = 0$. Thus, in view of Theorem 6.1, the only $L(\Lambda)$ over $\hat g = A(m,0)$, $A(0,n)$ or $C(n)$ which are integrable over $\hat g_{\bar 0}$ are 1-dimensional.

Remark 6.7. Using the same arguments, one can show that the non-symmetrizable "twisted" affine superalgebra of type $Q$ (which is the universal central extension of the Lie superalgebra $\sum_{n\in\mathbb{Z}}(Q(n)_{\bar 0}t^{2n} + Q(n)_{\bar 1}t^{2n+1})$), with the Cartan matrix
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & -1\\ -1 & & & & \\ 0 & & A_n & & \\ \vdots & & & & \\ 0 & & & & \\ -1 & & & & \end{pmatrix},$$
has no non-trivial integrable (with respect to its even part) highest weight modules.

Remark 6.8. Consider the $\mathbb{Z}/2\mathbb{Z}$-gradation of $F(4)$ of type $(0,0,0,1,0)$ and that of $G(3)$ of type $(0,0,0,1)$, cf. Table 6.1 and [K3]. The 0th piece in the first (resp. second) case is isomorphic to $D(2,1;1/2) \oplus A_1$ (resp. to $D(2,1;1/3)$), and its representation on the 1st piece is the module $\mathbb{C}^{10}\otimes\mathbb{C}^2$ (resp. $\mathbb{C}^{14}$), where $\mathbb{C}^{10}$ (resp. $\mathbb{C}^{14}$) is the lowest-dimensional non-trivial module over $D(2,1;1/2)$ (resp. $D(2,1;1/3)$). This reduces to some extent the construction of the principal integrable level 1 modules over $F(4)$ and $G(3)$ to that of $D(2,1;a)$. The free field construction of the principal integrable level 1 modules over $\widehat{osp}(m|n)$ (covering the $B$, $C$, $D$ cases) will be given in Sect. 7.

7. Free Field Realization of Level 1 Integrable Modules over $\widehat{osp}(M|N)$

Let $V$ be the superspace and let $(.|.)$ be the bilinear form on $V$ considered in Remark 6.1. Recall an equivalent definition of $osp(M|N)$ via the Clifford superalgebra:
$$C_V = T(V)/\langle [x,y] - (x|y)1 \mid x, y \in V\rangle.$$
The Lie superalgebra $osp(M|N)$ is identified with the $\mathbb{C}$-span of all quadratic elements of $C_V$ of the form:
$$:\!\alpha\beta\!: \ \equiv \alpha\beta + (-1)^{p(\alpha)p(\beta)}\beta\alpha, \qquad \text{where } \alpha, \beta \in V.$$
Such an element is identified with an operator from $osp(M|N)$ by the formula:
$$(:\!\alpha\beta\!:)v = [:\!\alpha\beta\!:, v], \qquad v \in V. \tag{7.1}$$
Denote by $O_V$ the vertex algebra generated by pairwise local fields $\gamma(z)$, where $\gamma \in V_{\bar 0}\cup V_{\bar 1}$ and $\gamma(z)$ is even (resp. odd) if $\gamma \in V_{\bar 0}$ (resp. $V_{\bar 1}$), subject to the following OPE:
$$\gamma(z)\gamma'(w) \sim \frac{(\gamma|\gamma')}{z - w}.$$
This is called the vertex algebra of free superfermions in [K4].
Remark 7.1. The vertex algebra F considered in Sect. 3 is isomorphic to OV , where dim V0¯ = 2n, dim V1¯ = 2m and the bilinear form is given by: (ϕ i∗ |ϕ j ) = −(ϕ i |ϕ j ∗ ) = δij (i, j = 1, . . . , n), (ψ i∗ |ψ j ) = (ψ i |ψ j ∗ ) = δij (i, j = 1, . . . , m), all other inner products = 0. Furthermore, in the case when dim V1¯ = 2m + 1 the vertex algebra OV is isomorphic to F ⊗ O, where O is a vertex algebra generated by one odd field ψ(z) with the OPE 1 ψ(z)ψ(w) ∼ z−w . This corresponds to adding an odd vector ψ with (ψ|ψ) = 1 orthogonal to all the above basis vectors. As in Sect. 3, we construct the Virasoro field L(z) ≡ j ∈Z Lj z−j −2 with respect to which all γ (z) are primary of conformal weight 1/2. Choose a basis ϕ i , ϕ i∗ (i = 1, . . . , n) of V0¯ , and a basis ψ i , ψ i∗ (i = 1, . . . , m) and ψ if M is odd, with inner products described by Remark 7.1. Then L(z) is given by formula (3.1) if M is even. In the case M is odd, one should add to the expression (3.1) the term 21 : ∂ψ(z)ψ(z) :. As in Sect. 3, we shall write γ (z) = γk z−k−1/2 , γ ∈ V0¯ ∪ V1¯ . k∈ 21 +Z
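We shall need also the following well-known fact (see e.g. [K4], formula (5.1.5)).

Lemma 7.1. Let $\psi^{+}, \psi^{-} \in V_{\bar 1}$ be such that $(\psi^{\pm}|\psi^{\pm}) = 0$, $(\psi^{+}|\psi^{-}) = 1$. Let $\alpha(z) \equiv \sum_{n\in\mathbb{Z}} \alpha_n z^{-n-1} = \,:\!\psi^{+}(z)\psi^{-}(z)\!:$. Then one has:

$$:\!\alpha(z)\alpha(z)\!: \;=\; :\!\partial\psi^{+}(z)\psi^{-}(z)\!: + :\!\partial\psi^{-}(z)\psi^{+}(z)\!: .$$

Consequently, the fields $\psi^{\pm}(z)$ are primary of conformal weight 1/2 with respect to the Virasoro field $\ell(z) \equiv \sum_{n\in\mathbb{Z}} \ell_n z^{-n-2} = \frac12 :\!\alpha(z)\alpha(z)\!:$. In particular, we have

$$[\ell_0, \psi^{\pm}_n] = -n\psi^{\pm}_n. \tag{7.2}$$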
Note that $\gamma_j|0\rangle = 0$ for j > 0, γ ∈ V. Hence $O_V$ is obtained by applying polynomials in the $\gamma_{-j}$, γ ∈ V, j > 0, to the vacuum vector $|0\rangle$. We have the decomposition

$$O_V = O_V^{+} \oplus O_V^{-}, \tag{7.3}$$

where $O_V^{+}$ (resp. $O_V^{-}$) is obtained by applying even (resp. odd) degree polynomials in the $\gamma_{-j}$ to $|0\rangle$.

Theorem 7.1. a) Consider the affine superalgebra osp(M|N) and let $:\!\alpha\beta\!:(z) = \sum_{k\in\mathbb{Z}} (t^{k} \otimes :\!\alpha\beta\!:) z^{-k-1}$ for $:\!\alpha\beta\!: \in$ osp(M|N). Then the linear map σ given by (α, β ∈ V):

$$:\!\alpha\beta\!:(z) \mapsto\, :\!\alpha(z)\beta(z)\!:, \qquad K \mapsto 1, \qquad d \mapsto L_0$$

defines a principal integrable representation of osp(M|N) of level 1 in the space $O_V$ for which $O_V^{+}$ and $O_V^{-}$ are submodules.
b) The osp(M|N)-modules $O_V^{+}$ and $O_V^{-}$ are irreducible highest weight modules isomorphic to $L(-\frac12\Lambda_0)$ and $L(-\frac12\Lambda_0 - \frac12\alpha_0)$ respectively, provided that (M, N) ≠ (1, 0) or (2, 0).

Proof. The proof that σ is a representation is, as usual, a straightforward use of Wick's formula. The proof of integrability of σ is the same as in the proof of Theorem 3.1. This establishes (a).

Note that, as before, $L_0$ commutes with osp(M|N), and the spectrum of $L_0$ on $O_V^{+}$ (resp. $O_V^{-}$) is $\mathbb{Z}_+$ (resp. $\frac12 + \mathbb{Z}_+$), the lowest eigenvalue eigenspace being $S^{+} = \mathbb{C}|0\rangle$ (resp. $S^{-} = \{\gamma_{-1/2}|0\rangle \mid \gamma \in V\}$), which is the trivial 1-dimensional (resp. the standard) representation of osp(M|N). Provided that $O_V^{\pm}$ are irreducible osp(M|N)-modules, (b) follows.

In order to prove irreducibility of $O_V^{\pm}$, pick elements $\psi^{+}, \psi^{-} \in V_{\bar 1}$ as in Lemma 7.1 and define the field $\ell(z)$ as in that lemma. Let $\psi \in V_{\bar 1}$ be an element orthogonal to both $\psi^{+}$ and $\psi^{-}$, and consider the field $\beta(z) = \,:\!\psi^{+}(z)\psi(z)\!: \,\equiv \sum_{n\in\mathbb{Z}} \beta_n z^{-n-1}$, so that $\beta_n = \sum_{j\in\frac12+\mathbb{Z}} :\!\psi^{+}_j\psi_{n-j}\!:$. Since $\ell_0$ commutes with ψ(z), we have by (7.2):

$$[\ell_0, \beta_n] = -\sum_{j\in\frac12+\mathbb{Z}} j :\!\psi^{+}_j\psi_{n-j}\!: . \tag{7.4}$$

Let $U \subset O_V^{\pm}$ be a subspace invariant with respect to osp(M|N). It follows from Lemma 7.1 and (7.4) that v ∈ U implies that $((\mathrm{ad}\,\ell_0)^{s}\beta_n)v \in U$, $s \in \mathbb{Z}_+$. Hence U is invariant with respect to all operators $\psi^{+}_j\psi_k$, where $\psi^{+}, \psi \in V$ are such that $(\psi^{+}|\psi^{+}) = 0 = (\psi^{+}|\psi)$ and $j, k \in \frac12 + \mathbb{Z}$. Hence, provided that M ≥ 3, U contains a non-zero purely bosonic element, i.e., an element obtained by applying a polynomial in the $\gamma_j$ ($\gamma \in V_{\bar 0}$) to $|0\rangle$. Thus we reduced the problem to the purely bosonic case, i.e., the case when M = 0. In this case the irreducibility was proved in [L] using the character formula for modular invariant representations of $C_n$ from [KW1] and formula (12.13) from [K3] (the reference to (13.13) in [L] is a misprint).

The remaining cases, when M = 1 or 2 and N = 2n is even ≥ 2, can be reduced again to the purely bosonic case by a direct calculation. We give below details in the M = 1 case, the M = 2 case being similar. The simple root vectors of osp(1, N) = B(0, n) are as follows:

$$e_0 = (\varphi^{1*}(z)\varphi^{1*}(z))_1 = \sum_{s\in\mathbb{Z}} \varphi^{(1)*}_{-s+1/2}\varphi^{(1)*}_{s+1/2},$$
$$e_i = (\varphi^{i}(z)\varphi^{i+1*}(z))_0 = \sum_{s\in\mathbb{Z}} \varphi^{(i)}_{-s-1/2}\varphi^{(i+1)*}_{s+1/2} \quad (i = 1, \dots, n-1),$$
$$e_n = (\varphi^{n}(z)\psi(z))_0 = \sum_{s\in\mathbb{Z}} \varphi^{(n)}_{-s-1/2}\psi_{s+1/2}.$$

Then the simple root vectors of sp(N) = $C_n$ are $e_0, e_1, \dots, e_{n-1}$ and $e_n' = [e_n, e_n] = \sum_{s\in\mathbb{Z}} \varphi^{(n)}_{-s-1/2}\varphi^{(n)}_{s+1/2}$. Any vector v of $O_V$ can be uniquely written in the form:

$$v = \sum_{i_1 < \cdots < i_k} \psi_{i_1} \cdots \psi_{i_k} u_{i_1,\dots,i_k},$$
where $u_{i_1,\dots,i_k}$ are purely bosonic elements (i.e., obtained by applying polynomials in the ϕ's to $|0\rangle$). Now, if v is a singular vector, i.e., $e_i v = 0$ for all i = 0, . . . , n, then, in particular, $e_n' v = 0$, and since all $e_1, \dots, e_{n-1}, e_n'$ commute with the ψ's, we get:

$$\sum_{i_1 < \cdots < i_k} \psi_{i_1} \cdots \psi_{i_k}(e_i u_{i_1,\dots,i_k}) = 0, \qquad \sum_{i_1 < \cdots < i_k} \psi_{i_1} \cdots \psi_{i_k}(e_n' u_{i_1,\dots,i_k}) = 0.$$

It follows that all $u_{i_1,\dots,i_k}$ are purely bosonic singular vectors with respect to $C_n$, hence, due to irreducibility of $O_V^{\pm}$ for M = 0 mentioned above, we obtain that all $u_{i_1,\dots,i_k}$ are linear combinations of elements $|0\rangle$ and $\varphi^{(1)}_{-1/2}|0\rangle$. Hence

$$v = \sum_{i_1 < \cdots < i_k} a_{i_1,\dots,i_k}\psi_{i_1} \cdots \psi_{i_k}|0\rangle + \sum_{i_1 < \cdots < i_k} b_{i_1,\dots,i_k}\psi_{i_1} \cdots \psi_{i_k}\varphi^{(1)}_{-1/2}|0\rangle.$$

Using that $e_n v = 0$, we obtain:

$$\sum_{r=1}^{k}\sum_{i_1 < \cdots < i_k} (-1)^{r-1} a_{i_1,\dots,i_k}\psi_{i_1} \cdots \widehat{\psi}_{i_r} \cdots \psi_{i_k}\varphi^{(n)}_{i_r}|0\rangle + \sum_{r=1}^{k}\sum_{i_1 < \cdots < i_k} (-1)^{r-1} b_{i_1,\dots,i_k}\psi_{i_1} \cdots \widehat{\psi}_{i_r} \cdots \psi_{i_k}\varphi^{(n)}_{i_r}\varphi^{(1)}_{-1/2}|0\rangle = 0,$$
which implies that $a_{i_1,\dots,i_k}$ (resp. $b_{i_1,\dots,i_k}$) = 0 if k > 0. Thus, the only singular vectors in $O_V^{+}$ (resp. $O_V^{-}$) are scalar multiples of $|0\rangle$ (resp. $\varphi^{(1)}_{-1/2}|0\rangle$).

To conclude that the B(0, n)-modules $O_V^{\pm}$ are irreducible, note that $O_V$ carries a unique non-degenerate Hermitian form H(., .) such that the square length of $|0\rangle$ is 1 and the adjoint operators of $\varphi^{(j)}_k$ and $\psi_k$ are $\varphi^{(j)*}_{-k}$ and $\psi_{-k}$, respectively. The absence of non-trivial singular vectors in $O_V^{+}$ (resp. $O_V^{-}$) implies that the B(0, n)-submodule of $O_V^{+}$ (resp. $O_V^{-}$) generated by $|0\rangle$ (resp. $\varphi^{(1)}_{-1/2}|0\rangle$) is irreducible, hence the restriction of H to it is non-degenerate. Hence the orthogonal complement to this submodule in $O_V^{+}$ (resp. $O_V^{-}$) is a complementary submodule which has no non-zero singular vectors, hence it is zero, and $O_V^{\pm}$ are irreducible.

Remark 7.2. The irreducibility in the purely fermionic case was established in [KP1] by making use of the Weyl-Kac character formula. An argument, using Virasoro operators, was given in [F]. The method of using Virasoro operators to prove irreducibility apparently works only in the presence of fermions (cf. Remark 3.3). It is shown in [L] that the irreducibility claims of [FF], based on the use of Virasoro operators, are false for the constructions of $A^{(2)}_{2\ell-1}$ and $A^{(2)}_{2\ell}$-modules.

Using Theorem 7.1, it is straightforward to write down the characters and supercharacters for the integrable level 1 osp(M|N)-modules. We have:

$$\mathrm{ch}\,O_V^{+} \pm \mathrm{ch}\,O_V^{-} = e^{-\frac12\Lambda_0}\prod_{k=1}^{\infty} \frac{(1 \pm q^{k-1/2})^{p(M)}\prod_{i=1}^{m}(1 \pm e^{\epsilon_i}q^{k-1/2})(1 \pm e^{-\epsilon_i}q^{k-1/2})}{\prod_{j=1}^{n}(1 \mp e^{\epsilon_{j+m}}q^{k-1/2})(1 \mp e^{-\epsilon_{j+m}}q^{k-1/2})}, \tag{7.5}$$
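where p(M) = 0 (resp. 1) if M is even (resp. odd). A similar formula for supercharacters is obtained by reversing signs in the numerator of the right-hand side of (7.5). Letting all $\epsilon_i$ and $\Lambda_0$ equal 0 in (7.5), we obtain:

$$\mathrm{tr}_{O_V^{+}}\, q^{L_0} \pm \mathrm{tr}_{O_V^{-}}\, q^{L_0} = \prod_{k=1}^{\infty} \frac{(1 \pm q^{k-\frac12})^{M}}{(1 \mp q^{k-\frac12})^{N}}. \tag{7.6}$$

Noticing that

$$\prod_{k=1}^{\infty}(1 - q^{k-\frac12}) = \frac{\varphi(q^{\frac12})}{\varphi(q)} \quad\text{and}\quad \prod_{k=1}^{\infty}(1 + q^{k-\frac12}) = \frac{\varphi(q)^{2}}{\varphi(q^{\frac12})\varphi(q^{2})} \tag{7.7}$$

and using the asymptotics (4.11) of η(τ), we obtain the following asymptotics as τ ↓ 0:

$$\mathrm{tr}_{O_V^{\pm}}\, q^{L_0} \sim \frac{1}{2^{n+1}}\, e^{\frac{\pi i}{12\tau}\left(\frac12 M + N\right)}. \tag{7.8}$$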
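The two product identities in (7.7) can be sanity-checked numerically. The following is a minimal sketch, assuming the standard convention φ(q) = ∏_{n≥1}(1 − q^n) and truncating every infinite product; the helper names `phi` and `half_integer_product` and the sample value q = 0.3 are illustrative choices, not notation from the text.

```python
def phi(q, terms=400):
    """Truncated Euler product phi(q) = prod_{n>=1} (1 - q^n) (assumed convention)."""
    product = 1.0
    for n in range(1, terms + 1):
        product *= 1.0 - q ** n
    return product


def half_integer_product(q, sign, terms=400):
    """Truncated product prod_{k>=1} (1 + sign * q^(k - 1/2)), with sign = +1 or -1."""
    product = 1.0
    for k in range(1, terms + 1):
        product *= 1.0 + sign * q ** (k - 0.5)
    return product


q = 0.3  # any real 0 < q < 1 works; this value is arbitrary

# First identity of (7.7): prod (1 - q^(k-1/2)) = phi(q^(1/2)) / phi(q)
lhs_minus = half_integer_product(q, -1)
rhs_minus = phi(q ** 0.5) / phi(q)

# Second identity of (7.7): prod (1 + q^(k-1/2)) = phi(q)^2 / (phi(q^(1/2)) phi(q^2))
lhs_plus = half_integer_product(q, +1)
rhs_plus = phi(q) ** 2 / (phi(q ** 0.5) * phi(q ** 2))

print(abs(lhs_minus - rhs_minus), abs(lhs_plus - rhs_plus))
```

Both printed differences should be at the level of floating-point rounding, since the neglected tail of each product is far below machine precision for this q.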
Remark 7.3. The right-hand side of (7.6) multiplied by $q^{(N-M)/48}$ is a modular function equal to a product of powers of the functions $\eta(\frac12\tau)/\eta(\tau)$ and $\eta(\tau)^{2}/(\eta(\frac12\tau)\eta(2\tau))$, and the same holds if we replace tr by str. It is well known (and easy to see) that the above two modular functions along with the modular function η(2τ)/η(τ) are transitively permuted (with some constant factors) under the action of SL(2, Z). Thus, the characters and supercharacters of integrable level one osp(M|N)-modules, normalized by $q^{(N-M)/48}$, are modular functions, but their C-span is not SL(2, Z)-invariant.

8. On Classification of Modules over the Associated Vertex Algebras

Define numbers u and h∨ (the dual Coxeter number) by

$$\mathrm{level}(ku\Lambda_0) = k, \qquad \mathrm{level}(\rho) = h^{\vee}. \tag{8.1}$$

Their values for all affine superalgebras are given in Table 8.1.

Table 8.1.
g     A(m, n)   B(m, n)        C(n)    D(m, n)        D(2, 1; a)     F(4)   G(3)
u     1         −1/2           −1/2    −1/2           −(a + 1)^{−1}  −2/3   −3/4
h∨    m − n     2(m − n) − 1   n − 1   2(m − n − 1)   0              3      2

The following proposition is an immediate corollary of Theorems 6.1 and 6.2.

Proposition 8.1. a) If g = A(m, n) with m ≥ 1, B(m, n) with m ≥ 2, D(m, n), F(4) or G(3), then the g-module $L(ku\Lambda_0)$ is principal integrable iff $k \in \mathbb{Z}_+$. The B(1, n)-module $L(ku\Lambda_0)$ is principal integrable iff $k \in \mathbb{Z}_+ \cup \{2n - \frac12 + \mathbb{Z}_+\}$. The D(2, 1; a)-module $L(ku\Lambda_0)$ is principal integrable iff $k \in \mathbb{Z}_+ \cap a\mathbb{Z}_+$.
b) If g = B(m, n), C(n), D(m, n), D(2, 1; a), F(4) or G(3), then the g-module $L(k_0\Lambda_0)$ is subprincipal integrable iff $k_0 \in \mathbb{Z}_+$.
Recall that the g-module Vk := L(ku0 ) has a canonical structure of a vertex algebra for any k ∈ C (see e.g. [K4]). It is well known that any irreducible Vk -module is one of the (irreducible) g-modules L() of level k, and it is an important problem of vertex algebra theory to find out which of these L() are actually Vk -modules. A necessary condition is given by Proposition 8.2. Suppose that k is such that L(ku0 ) is a principal (resp. subprincipal) integrable g-module. If a g-module L() of level k is a Vk -module, then it must be a principal (resp. subprincipal) integrable. Proof. Denote by g0 the subalgebra g0¯ (resp. g0¯ ) of g (see Sect. 6). This is an affine Lie 0 0 g )vku0 of Vk . Since, by definition, algebra. Denote by V the vertex subalgebra U ( V 0 is an integrable g0 -module, it follows that it is g0 -irreducible [K3], hence V 0 is a simple affine vertex algebra of non-negative integral level. But one knows [Z] that all irreducible modules over such a vertex algebra are integrable g0 -modules. Using the 0 complete reducibility of g -modules [K3], we deduce that any V -module, viewed as a V 0 -module, is a direct sum of irreducible integrable g0 -modules, which proves the proposition. Let g+ = C[t] ⊗C g + Cd and consider a 1-dimensional module Ck (k ∈ C) over g+ + CK on which g+ acts trivially and K = k. Then L(ku0 ) is a quotient of the induced g-module V˜k = U ( g) ⊗U ( g) applied to 1 ⊗ 1. g+ +CK) Ck by a left ideal Ik of U ( Suppose that k is such that L(ku0 ) is a principal integrable g-module. As we have seen in the proof of Proposition 8.2, viewed as a g0 -module, L(ku0 ) is a direct sum of irreducible integrable highest weight modules. All these modules have the same level (resp. + , − when g0¯ has two simple components) given in terms of k as follows: = + = k if g = B(1, n), − = k if g = D(2, n),
= 2k if g = B(1, n),
− = a −1 k if g = D(2, 1; a).
In particular, Ik contains the element
e−θ (1)+1 ( resp. elements e−θ+ (1)+ +1 and e−θ− (1)− +1 ).
(8.2)
g-module L() of level k If elements (8.2) generate the left ideal Ik , it follows that a +1
+1
+ − is a Vk -module iff the field e−θ (z)+1 (resp. fields e−θ (z) and e−θ (z)) annihilate +
−
L(). The latter property implies that, viewed as a g0¯ -module, L() is a direct sum of irreducible integrable modules and therefore L() is a principal integrable g-module. We thus established a sufficient condition for a g-module L() to be a Vk -module: g-module and Proposition 8.3. Let k be such that L(ku0 ) is a principal integrable suppose that the left ideal Ik is generated by (8.2). Let L() be a principal integrable g-module of level k. Then L() is a Vk -module. Proposition 8.4. Let k be such that L(ku0 ) is a principal integrable g-module. g-module a) Suppose that the highest weight ku0 is the only singular weight of the V˜k which is principal integrable. Then elements (8.2) generate the left ideal Ik .
b) The assumption of (a) holds if

$$k + h^{\vee} \neq 0, \tag{8.3}$$

and for any principal integrable weight Λ of level k one has:

$$\Lambda - ku\Lambda_0 \notin Q \setminus \mathbb{Z}\delta, \tag{8.4}$$

where $Q = \sum_{i\in I} \mathbb{Z}\alpha_i$ is the root lattice of g.
Proof. Let $I_k'$ ($\subset I_k$) denote the left ideal of U(g) generated by elements (8.2). Then the g-module $V_k' = \tilde{V}_k/(I_k'(1 \otimes 1))$ is principal integrable, hence each of its singular weights is integrable. Hence, if the condition of (a) holds, the g-module $V_k'$ is irreducible, and therefore $I_k' = I_k$.

Furthermore, obviously, $\Lambda - ku\Lambda_0 \in Q$, hence (8.4) implies that $\Lambda = ku\Lambda_0 + j\delta$ for some j ∈ Z. Using the Casimir operator [K3], we obtain:

$$(ku\Lambda_0 + \rho \mid ku\Lambda_0 + \rho) = (ku\Lambda_0 + \rho - j\delta \mid ku\Lambda_0 + \rho - j\delta),$$

which is equivalent to j(k + h∨) = 0. But then (8.3) implies that j = 0, proving (b).

Theorem 8.1. Let g be one of the affine superalgebras A(m, n) with m ≥ 1, B(m, n) with m ≥ 1, D(m, n), D(2, 1; a) with $a^{-1} \in \mathbb{N}$, F(4) or G(3). Then all integrable g-modules L(Λ) of level 1 are $V_1$-modules (the complete list of these Λ's is given by Remark 6.3).

Proof. Note that in the A(m, n) case $V_1$ is a subalgebra of the vertex subalgebra $F_0$ of F (constructed in Sect. 3), while the highest component of the $F_0$-module $F_s$ restricted to $V_1$ is $L(\Lambda_{(s)})$. Since the $\Lambda_{(s)}$ exhaust all integrable highest weights of level 1, by Proposition 8.1, they give a complete list of irreducible $V_1$-modules.

In the B(m, n) and D(m, n) cases we note that $V_1$ is isomorphic to the vertex algebra $O_V^{+}$ (see Theorem 7.1), $O_V^{-}$ is its irreducible module, and these two modules produce all integrable highest weights of level 1.

The cases F(4) and G(3) are obvious since $V_1$ is the only irreducible integrable module of level 1 (see Remark 6.3).

It remains to show that $L\left(-\frac{a+2}{a+1}\Lambda_0 - \Lambda_1 + \frac{1-a}{a}\Lambda_3\right)$ is a $V_1$-module in the D(2, 1; a) case. But

$$-\frac{a+2}{a+1}\Lambda_0 - \Lambda_1 + \frac{1-a}{a}\Lambda_3 = -\frac{1}{a+1}\Lambda_0 - \frac12\alpha_0 + \frac{a-1}{2a}\alpha_3,$$

hence the difference of this weight and $u\Lambda_0$ does not lie in the root lattice; we also have: k = 1 and h∨ = 0. Hence we may apply Propositions 8.4 and 8.3.

Remark 8.1. The lowest energy D(2, 1; a)-submodule of the module $L\left(-\frac{a+2}{a+1}\Lambda_0 - \Lambda_1 + \frac{1-a}{a}\Lambda_3\right)$ is the module $L(-\bar\Lambda_1 + (a^{-1} - 1)\bar\Lambda_3)$. It has dimension $4a^{-1} + 2$. For a = 1 this is the defining module of D(2, 1); for $a = \frac12$ (resp. $\frac13$) this is the 10- (resp. 14-) dimensional module mentioned in Remark 6.8. As a D(2, 1; a)-module, the even (resp. odd) part of this module is isomorphic to the irreducible sℓ(2) + sℓ(2) + sℓ(2)-module $\mathbb{C} \otimes \mathbb{C}^{a^{-1}} \otimes \mathbb{C}^{2}$ (resp. $\mathbb{C}^{2} \otimes \mathbb{C}^{a^{-1}+1} \otimes \mathbb{C}$).
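The dimension count in Remark 8.1 is simple arithmetic that can be spelled out. The sketch below assumes that the even and odd parts are the triple tensor products C ⊗ C^{a^{-1}} ⊗ C^2 and C^2 ⊗ C^{a^{-1}+1} ⊗ C, as the remark appears to state; the function name `remark_8_1_dimensions` and its argument `a_inverse` (standing for the positive integer a^{-1}) are illustrative.

```python
def remark_8_1_dimensions(a_inverse):
    """Return (even_dim, odd_dim, total_dim) for the module of Remark 8.1.

    Assumes even part = C (x) C^(a_inverse) (x) C^2 and
            odd part  = C^2 (x) C^(a_inverse + 1) (x) C,
    where a_inverse is the positive integer 1/a.
    """
    even_dim = 1 * a_inverse * 2
    odd_dim = 2 * (a_inverse + 1) * 1
    return even_dim, odd_dim, even_dim + odd_dim


# a = 1, 1/2, 1/3 correspond to a_inverse = 1, 2, 3
dimension_table = {n: remark_8_1_dimensions(n) for n in (1, 2, 3)}
for n, (even_dim, odd_dim, total) in dimension_table.items():
    print(f"a = 1/{n}: even {even_dim}, odd {odd_dim}, total {total}")
```

Under these assumptions the totals agree with the stated dimension 4a^{-1} + 2, giving the 6-, 10- and 14-dimensional modules for a = 1, 1/2 and 1/3.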
Remark 8.2. Let V be a vertex algebra with a conformal vector such that $L_0$ is diagonalizable with finite-dimensional eigenspaces and rational eigenvalues. It is a general belief that if V has finitely many irreducible modules, then the character $\mathrm{tr}_M\, q^{L_0}$ of each of these modules M becomes a modular function when normalized, i.e., multiplied by a suitable power of q. The example of the vertex algebra $V_1$ for B(m, n) and D(m, n) confirms this conjecture and leads one to believe that the same is true for D(2, 1; a) (with $a^{-1} \in \mathbb{N}$), F(4) and G(3).

The vertex algebra V is called rational if $L_0$ has integral eigenvalues, the number of irreducible V-modules is finite and any V-module is completely reducible. It follows from the above discussion that the vertex algebra $V_1$ for B(m, n), D(m, n), D(2, 1; a), F(4) and G(3) is a rational vertex algebra, and that, moreover, the corresponding Zhu algebra [Z] is finite-dimensional semisimple (and even 1-dimensional in the F(4) and G(3) cases). It was proved by Zhu [Z] under certain technical assumptions that the C-span of normalized characters of irreducible modules over a rational vertex algebra is SL(2, Z)-invariant, and it was believed by many that the technical assumptions may be removed. However, the above mentioned rational vertex algebra $V_1$ shows that this is not the case.

Remark 8.3. There are only two cases where there exists only a finite number of essentially inequivalent subprincipal integrable g-modules of a given non-zero level k: g = F(4), $k = -\frac32$ and g = G(3), $k = -\frac43$. In both cases the only subprincipal integrable g-module is $L(\Lambda_0)$. In both cases the associated vertex algebra is rational with a unique irreducible module and the Zhu algebra is 1-dimensional.
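The two levels in Remark 8.3 can be cross-checked against Table 8.1 with exact rational arithmetic. The sketch below rests on two readings of the text that should be treated as assumptions: by (8.1), level(kuΛ0) = k, so the level of Λ0 is 1/u; and by (9.2) of Sect. 9.2, k = h∨(u^{-1} − 1) for a positive integer, here called `u_int` to keep it apart from the number u of Table 8.1. The dictionary `TABLE_8_1` and the function names are illustrative.

```python
from fractions import Fraction

# (u, dual Coxeter number) as read off Table 8.1 for the two exceptional cases
TABLE_8_1 = {
    "F(4)": (Fraction(-2, 3), 3),
    "G(3)": (Fraction(-3, 4), 2),
}


def level_of_lambda0(u):
    """level(k u Lambda_0) = k, hence level(Lambda_0) = 1/u."""
    return 1 / u


def level_from_9_2(h_dual, u_int):
    """k = h_dual * (1/u_int - 1), the form (9.2) with a positive integer u_int."""
    return h_dual * (Fraction(1, u_int) - 1)


levels = {name: level_of_lambda0(u) for name, (u, _) in TABLE_8_1.items()}
levels_9_2 = {
    "F(4)": level_from_9_2(TABLE_8_1["F(4)"][1], 2),  # u_int = 2, as in Sect. 9.2
    "G(3)": level_from_9_2(TABLE_8_1["G(3)"][1], 3),  # u_int = 3, as in Sect. 9.2
}
print(levels, levels_9_2)
```

Both computations give −3/2 for F(4) and −4/3 for G(3), which is consistent with these being the levels of L(Λ0) named in the remark.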
9. Some Remarks and Open Problems

9.1. The calculation of characters of integrable highest weight modules of arbitrary level k over affine superalgebras seems to be a very difficult problem. One may expect that the case of the "critical" level k = −h∨ should be rather different from other levels (as for the affine Lie algebras). However, the construction of level 1 integrable modules over osp(m|n) given in Sect. 7 is the same for all values m and n though 1 is the critical level iff m − n = 1.

Formula (5.12) leads us to believe in the following conjecture: Consider a principal integrable highest weight module L(Λ) over an affine superalgebra g and suppose that one can choose a set of simple roots Π such that it contains a maximal Λ + ρ-isotropic subset S of roots (i.e., all roots from S are pairwise orthogonal and orthogonal to Λ + ρ [KW]). Let $W^{\#}$ be the Weyl group of the integrable part of $\mathfrak{g}_{\bar 0}$. We conjecture that the following character formula holds:

$$e^{\rho}R\,\mathrm{ch}\,L(\Lambda) = \sum_{w\in W^{\#}} \epsilon(w)\, w\, \frac{e^{\Lambda+\rho}}{\prod_{\beta\in S}(1 + e^{-\beta})}. \tag{9.1}$$

Note that the assumptions of this conjecture exclude the critical level, and include the level 1 integrable modules over exceptional affine superalgebras.
9.2. In the papers [KW1] and [KW2] we proved character formulas for a class of modules L(Λ) over affine Lie algebras, called admissible modules, which includes integrable modules. These character formulas imply that the normalized specialized characters of admissible modules are modular functions (and we conjecture that this property characterizes admissible modules). Of course, these character formulas break down in the Lie superalgebra case. However, in certain exceptional situations, when the character admits a simple product expansion in the Lie algebra case (see [KW2], Theorem 3.2), it seems that a similar product formula holds in the Lie superalgebra case as well. Concretely, let u be a positive integer and let

$$k = h^{\vee}(u^{-1} - 1) \tag{9.2}$$

(recall that in general the level k of an admissible module is ≥ h∨(u^{−1} − 1)). Let y be an automorphism of the root lattice Q such that all roots $\gamma_i = y((u - 1)\delta_{i0}K + \alpha_i^{\vee})$ are positive (i ∈ I). The weights of the form $y.k\Lambda_0$, where, as usual, y.λ = y(λ + ρ) − ρ, are called admissible. We conjecture that the following analog of formula (3.3) from [KW2] holds:

$$\mathrm{ch}\,L(y.(k\Lambda_0)) = e^{y.(k\Lambda_0)}\left(\frac{\varphi(q^{u})}{\varphi(q)}\right)^{\ell} \prod_{\alpha\in\bar\Delta_{\bar 0}}\prod_{n\in\mathbb{N}} \frac{1 - q^{un}e^{y.\alpha}}{1 - q^{n}e^{\alpha}} \Bigg/ \prod_{\alpha\in\bar\Delta_{\bar 1}}\prod_{n\in\mathbb{N}} \frac{1 + q^{un}e^{y.\alpha}}{1 + q^{n}e^{\alpha}}, \tag{9.3}$$

where ℓ is the rank of g, $\bar\Delta_{\bar 0}$, $\bar\Delta_{\bar 1}$ are the sets of even and odd roots of g, and $q = e^{-\delta}$.

This conjecture agrees with formula (7.5) in the case g = osp(2|2) = sℓ(2|1). In this case k = −1/2 and h∨ = 1, so that (9.2) holds for u = 2. All the admissible weights of level $-\frac12$ are as follows: $-\frac12\Lambda_i$ (i = 0, 1, 2), $-\frac12\Lambda_0 - \frac12\alpha_0$, where the Dynkin diagram is chosen such that α1 and α2 are odd roots (and α0 is even). Character formula (7.5) gives, with the upper sign for $\mathrm{ch}(-\frac12\Lambda_0)$ and the lower sign for $\mathrm{ch}(-\frac12\Lambda_0 - \frac12\alpha_0)$:

$$\mathrm{ch} = \frac12 e^{-\frac12\Lambda_0}\left(\frac{Q(u^{-1}vq^{\frac12}; q)}{Q(-uvq^{\frac12}; q)} \pm \frac{Q(-u^{-1}vq^{\frac12}; q)}{Q(uvq^{\frac12}; q)}\right), \tag{9.4}$$

where $u = e^{-\frac12\alpha_1}$, $v = e^{-\frac12\alpha_2}$ and

$$Q(z; q) = \prod_{k=1}^{\infty}(1 + zq^{k-1})(1 + z^{-1}q^{k}), \tag{9.5}$$

whereas formula (9.3) gives:

$$\mathrm{ch}(-\tfrac12\Lambda_0) = e^{-\frac12\Lambda_0}\, \frac{Q(u^{2}q; q^{2})Q(v^{2}q; q^{2})\varphi(q^{2})^{2}}{Q(-u^{2}v^{2}q; q^{2})\varphi(q)^{2}}, \tag{9.6a}$$

$$\mathrm{ch}(-\tfrac12\Lambda_0 - \tfrac12\alpha_0) = e^{-\frac12\Lambda_0 - \frac12\alpha_0}\, \frac{Q(u^{2}; q^{2})Q(v^{2}; q^{2})\varphi(q^{2})^{2}}{Q(-u^{2}v^{2}q; q^{2})\varphi(q)^{2}}. \tag{9.6b}$$
However, the seemingly different expressions in the right-hand sides of (9.4) and (9.6) actually coincide due to one of the addition theta function formulas (cf. [M], formula (6.6) and notation on p. 17):

$$\theta_{00}(\tau, z_1)\theta_{00}(\tau, z_2) = \theta_{00}(2\tau, z_1 + z_2)\theta_{00}(2\tau, z_1 - z_2) + \theta_{10}(2\tau, z_1 + z_2)\theta_{10}(2\tau, z_1 - z_2), \tag{9.7}$$

if we let $u = e^{2\pi i z_1}$, $v = e^{2\pi i z_2}$. Using [KW], formula (6.1), it is immediate to show that the span of supercharacters of the four admissible sℓ(2|1)-modules of level −1/2 is SL(2, Z)-invariant. Thus, it is natural to conjecture that this modular invariance property of admissible characters holds for any affine superalgebra g and any k given by (9.2). Two other very interesting examples are provided by Remark 8.3: g = F(4) with u = 2, y = 1 and g = G(3) with u = 3, y = 1.

9.3. The first case not covered in Sect. 5, that of m = n − 1, is very interesting. It connects the level 1 modules over gℓ(n − 1|n) (or, equivalently, the "critical" level −1 modules over gℓ(n|n − 1)) to the denominator identity for sℓ(n|n), which is unknown. Analyzing this connection, we arrived at the following sℓ(2|2) denominator identity:

$$e^{\rho}R = \sum_{w\in W^{\#}} \epsilon(w)\, w\, \frac{e^{\rho}}{(1 + e^{-\alpha_0})\prod_{j=1}^{\infty}(1 + q^{j}e^{\alpha_2})(1 + q^{j-1}e^{-\alpha_2})}, \tag{9.8}$$

where, as before, $q = e^{-\delta}$ and $W^{\#} = \langle r_1, t_{\alpha_1}\rangle$. Here we use the Dynkin diagram with the grey nodes α0 and α2. If all four nodes are grey we get the same identity with ρ replaced by 0; in yet another form (9.8) can be written as follows (where $W^{\#} = \langle r_{\alpha_1+\alpha_2}, t_{\alpha_1+\alpha_2}\rangle$):

$$\frac{R}{\varphi(q)^{2}}\prod_{n=1}^{\infty}(1 - q^{2n})(1 + q^{2n-1}e^{\alpha_1+\alpha_3})(1 + q^{2n-1}e^{-\alpha_1-\alpha_3}) = \sum_{w\in W^{\#}} \epsilon(w)\, w\left(\prod_{n=1}^{\infty}(1 + q^{n}e^{\alpha_1})(1 + q^{n}e^{\alpha_3})(1 + q^{n-1}e^{-\alpha_1})(1 + q^{n-1}e^{-\alpha_3})\right)^{-1}. \tag{9.9}$$

The latter identity is equivalent to the following identity in $u = e^{-\alpha_1}$, $x = e^{-\alpha_2}$, $v = e^{-\alpha_3}$ and q (where Q is defined by (9.5)):

$$Q(uvq; q^{2})Q(-ux; q)Q(-vx; q) = Q(uv^{-1}q; q^{2})Q(x; q)Q(uvx; q) - xQ(uvx^{2}q; q^{2})Q(u; q)Q(v; q).$$

In the notation of [M] this identity can be rewritten in terms of theta functions as follows (if we let $u = e^{2\pi i z_1}$, $v = e^{2\pi i z_2}$, $x = e^{2\pi i z_3}$):

$$\theta_{00}(2\tau, z_1 + z_2)\theta_{11}(\tau, z_1 + z_3)\theta_{11}(\tau, z_2 + z_3) + \theta_{00}(2\tau, z_1 - z_2)\theta_{10}(\tau, z_3)\theta_{10}(\tau, z_1 + z_2 + z_3) = \theta_{00}(2\tau, z_1 + z_2 + 2z_3)\theta_{10}(\tau, z_1)\theta_{10}(\tau, z_2). \tag{9.10}$$

Identity (9.10) can be derived from (9.7) as follows. Replacing $z_i$ by $z_i + \frac12\tau$ (resp. $z_i + \frac12(1 + \tau)$) in (9.7), we obtain:

$$\theta_{10}(\tau, z_1)\theta_{10}(\tau, z_2) = \theta_{00}(2\tau, z_1 + z_2)\theta_{10}(2\tau, z_1 - z_2) + \theta_{10}(2\tau, z_1 + z_2)\theta_{00}(2\tau, z_1 - z_2), \tag{9.11a}$$
$$\theta_{11}(\tau, z_1)\theta_{11}(\tau, z_2) = \theta_{00}(2\tau, z_1 + z_2)\theta_{10}(2\tau, z_1 - z_2) - \theta_{10}(2\tau, z_1 + z_2)\theta_{00}(2\tau, z_1 - z_2). \tag{9.11b}$$
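The addition formulas (9.7), (9.11a), (9.11b) can be checked numerically by summing the theta series directly. The sketch below assumes the series convention θ_{ab}(τ, z) = Σ_n exp(πi(n + a/2)²τ + 2πi(n + a/2)(z + b/2)), which is how the notation of [M] is read here; the function names and sample points are illustrative. The last residual tests the three-variable identity in the form θ00(2τ, z1+z2) θ11(τ, z1+z3) θ11(τ, z2+z3) + θ00(2τ, z1−z2) θ10(τ, z3) θ10(τ, z1+z2+z3) = θ00(2τ, z1+z2+2z3) θ10(τ, z1) θ10(τ, z2), which is what substituting (9.11b) and (9.11a) produces.

```python
import cmath
import math


def theta(a, b, tau, z, cutoff=12):
    """Truncated series theta_{ab}(tau, z) with characteristics a, b in {0, 1}.

    Assumed convention:
    sum_n exp(pi i (n + a/2)^2 tau + 2 pi i (n + a/2)(z + b/2)).
    """
    total = 0j
    for n in range(-cutoff, cutoff + 1):
        m = n + a / 2
        total += cmath.exp(1j * math.pi * (m * m * tau + 2 * m * (z + b / 2)))
    return total


def residuals(tau, z1, z2, z3):
    """Left-hand side minus right-hand side for (9.7), (9.11a), (9.11b), (9.10)."""
    t00 = lambda t, z: theta(0, 0, t, z)
    t10 = lambda t, z: theta(1, 0, t, z)
    t11 = lambda t, z: theta(1, 1, t, z)
    s, d = z1 + z2, z1 - z2
    r_97 = t00(tau, z1) * t00(tau, z2) - (
        t00(2 * tau, s) * t00(2 * tau, d) + t10(2 * tau, s) * t10(2 * tau, d))
    r_911a = t10(tau, z1) * t10(tau, z2) - (
        t00(2 * tau, s) * t10(2 * tau, d) + t10(2 * tau, s) * t00(2 * tau, d))
    r_911b = t11(tau, z1) * t11(tau, z2) - (
        t00(2 * tau, s) * t10(2 * tau, d) - t10(2 * tau, s) * t00(2 * tau, d))
    r_910 = (t00(2 * tau, s) * t11(tau, z1 + z3) * t11(tau, z2 + z3)
             + t00(2 * tau, d) * t10(tau, z3) * t10(tau, s + z3)
             - t00(2 * tau, s + 2 * z3) * t10(tau, z1) * t10(tau, z2))
    return r_97, r_911a, r_911b, r_910


# Arbitrary sample point with Im(tau) > 0
sample = residuals(0.3 + 0.9j, 0.17 + 0.05j, -0.31 + 0.12j, 0.08 - 0.2j)
max_residual = max(abs(r) for r in sample)
print(max_residual)
```

With this convention all four residuals should vanish up to rounding error, which supports the signs and arguments used in the derivation that follows.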
Substituting (9.11b) (resp. (9.11a)) in the first (resp. second) summand of the left-hand side of (9.10), we obtain the product of $\theta_{00}(2\tau, z_1 + z_2 + 2z_3)$ and the right-hand side of (9.11a), and, substituting its left-hand side, we obtain the right-hand side of (9.10).

We also have a conjectural formula for an sℓ(3|3) denominator identity, but it is too cumbersome to be reproduced here. We have no conjectures as to how the sℓ(n|n) denominator identity should look for n > 3.

Using the connection of the sℓ(2|2) denominator identity to level 1 modules over sℓ(2|1) we deduce from (9.8) ($k \in \mathbb{Z}_+$):

$$\mathrm{ch}\,L(k\Lambda_0 - (k + 1)\Lambda_1) = \frac{\varphi(q)}{e^{\rho}R} \sum_{w\in\langle r_0\rangle} \epsilon(w)\, w \sum_{j\in\mathbb{Z}_+} t_{j\alpha_0}\, \frac{e^{\Lambda+\rho}}{\prod_{n=1}^{\infty}(1 + q^{n-1}e^{-\alpha_2})(1 + q^{n}e^{\alpha_2})}. \tag{9.12}$$
Here the Dynkin diagram is chosen in such a way that α0 is even and α1, α2 are odd simple roots.

9.4. Let k be such that $V_k = L(uk\Lambda_0)$ is a (principal or subprincipal) integrable g-module of level k. Is it always true that any integrable g-module of level k can be extended to a module over the vertex algebra $V_k$? Of course, this question is closely related to the description of generators of the left ideal $I_k$. In the principal integrable case $I_k$ contains elements (8.2), and the answer to the above question in this case would be positive if $I_k$ were generated by these elements. Is it true that the normalized (by a power of q) characters $\mathrm{tr}\, q^{L_0}$ (where $L_0$ is given by the Sugawara construction [K4]) of irreducible $V_k$-modules are modular functions, provided that there are finitely many of them and k + h∨ ≠ 0? Is it true that for k of the form (9.2), all $V_k$-modules are admissible modules?

9.5. A few examples that we have worked out in the paper indicate that the theory of integrable highest weight modules over affine Lie superalgebras is dramatically different from that in the Lie algebra case. The only exception is the case of g = B(0, n). The integrability conditions are (see Table 6.1 for its Dynkin diagram):

$$k_i \in \mathbb{Z}_+ \text{ for all } i, \qquad k_n \in 2\mathbb{Z}_+,$$

hence the level $k = 2k_0 + \dots + 2k_{n-1} + k_n$ is a non-negative even integer and the number of integrable highest weight B(0, n)-modules is finite for each k. Moreover, each of these modules extends to an irreducible $V_k$-module since $I_k$ is generated by $e_{-\theta}(1)^{k+1}$ for each k, and these are all irreducible modules over the vertex algebra $V_k$ ($k \in 2\mathbb{Z}_+$). Furthermore, the character formula for all integrable B(0, n)-modules L(Λ) is known (see [K2]), and it is given by the same expression as that for the twisted affine algebra $A^{(2)}_{2n}$ (replacing the black node by a white one). In order to derive the transformation formula of B(0, n) supercharacters from that of $A^{(2)}_{2n}$ characters, we need to go from the $A^{(2)}_{2n}$ coordinates, which we call A-coordinates, to the B(0, n) coordinates, which we will call B-coordinates. This calculation is explained below.
The B-coordinates (τ, zB , uB ) of h ∈ h are defined by h = 2πi(−τ 21 0 + zB + uB δ), where zB ∈ h. ∨ Let β = 21 n−1 j =0 (n − j )αj . Then tβ (i ) = i for i = 1, . . . , n − 1 and t−β ( 21 0 ), αi∨ = δ0n , hence we may take n = t−β ( 21 0 ) for the 0th fundamen(2) tal weight of A2n . Hence the A-coordinates are expressed via B-coordinates by h˜ = t−β (h) = 2π i(−τ n + zA + uA δ).
(9.13)
Recall that SL(2, Z) acts on functions in τ, z, u by the formula [K3, Chapter 13]: aτ + b z c(z|z) −n , ,u − . F (τ, z, u)| a b = j (τ ) F cτ + d cτ + d 2(cτ + d) c d Furthermore, defining a new function F α,β by F α,β (h) = F (tβ (h) + 2π iα − π i(α|β)δ), we have [KP2]: F α,β | a
b c d
= (F | a
b c d
)dα−bβ,aβ−cα .
(9.14)
We shall use the following connection between supercharacters of B(0, n) and char(2) acters of A2n , which follows from (9.13) and definitions: ˜ −β,β = (−1)nk/2 chL()(h) ˜ β,β . schL()(h) = chL()(h)
(9.15)
(2)
Recall that the normalized A2n character χ˜ and the normalized B(0, n) supercharacter χ are defined by: χ˜ = q m˜ chL(), χ = q m schL(), where in B-coordinates: m =
|ρ|2 ck | + ρ|2 ( + 2ρ|) − − , = ∨ ∨ ∨ 2(k + h ) 2h 2(k + h ) 24
ck =
k sdim B(0, n) , k + h∨
and m ˜ is defined by a similar formula in A-coordinates. (Hence χ (τ, 0, 0) = strL() L −c 0 k /24 , as it should be.) q ∈ SL(2, Z). We shall denote by SA (resp. SB ) the action of S in ALet S = 01 −1 0 (resp. B-coordinates). We have by (9.15): −β,β
χ |SB = χ˜
= (χ˜ |SA )β,β ,
where the last equality holds due to (9.14). But one has (see [KP2], [K3, Theorem 13.8a]): SM χ˜ M , χ˜ |SA = M∈P+k mod Cδ
(9.16)
where (SM ) is an explicitly known matrix. Hence, continuing the calculation (9.16), we get, using (9.15): χ |SB =
M∈P+k
β,β
SM χ˜ M = (−1)nk/2
mod Cδ
M
−β,β
SM χ˜ M
.
Using again (9.15), we obtain the final transformation formula:
χ |SB = (−1)nk/2
SM χM .
(9.17)
M∈P+k mod Cδ
It is clear from the above calculation that the SL(2, Z)-invariance of normalized β,0 0,β B(0, n) characters χ˜ does not hold, but the span of {χ˜ , χ˜ , χ˜ }∈P k mod Cδ is + SL(2, Z)-invariant.
9.6. We use this opportunity to make some corrections to [KW]. Due to computer error the following lines disappeared from the paper: (1) Bottom of page 418: and 4(n + 1)2 , respectively (given by Theorem 4.2; see also Examples 5.3 and (2) Bottom of page 421: (this is independent of the choice of B), and let W # denote the subgroup of W generated by reflections rα with respect to all α ∈ #0 . Denote by (.|.) the even Also, the diagrams B(0, n), D(m, n) and D(2, 1; α) on p. 429 should be as follows: B(0, n)
( − ( − ... − ( − ( ⇒
D(m, n), m ≥ 2 ( − ( − . . . −
D(2, 1; α)
(1 2
•
(1 2 − (2 − . . . − (2 (1
(1
Furthermore, the following corrections should be made: – – – – – – –
page 417, line 13↑: (−q) ¯ 0 |α ⊥ S }, page 432, line 12↓: M := {α ∈ page 434, lines 5,7↓: α2 should be replaced by α1 , page 435, line 6↓: ( j >0 t j ⊗ g), page 438, line 1↑: : (α1 |α1 ) = 0, page 449, line 3↑: (α2 |α2 ) = 2, = Rm $∞ . . . , where Rm is the denominator of Am , not the page 450, line 4↑: R n=1 one defined by (7.1),
– page 453: Theorem 8.1(a) as stated holds for the subprincipal integrable modules (cf. Theorem 6.2 of the present paper). It is appropriate to mention here that the specialization (7.2) of Conjecture 7.2 has been proved recently independently by S. Milne (by combinatorial methods) and by D. Zagier (using cusp forms). D. Zagier also proved Conjecture 7.2 in the first unknown case m = 21 . In a slightly different form than in [KW], Conjecture 7.2 reads: s 2s 1 − q 2n 1 − q 2n eα ∞ $n=1 = $α∈ chL ki γi , Am 1 − q 2n−1 1 − q 2n−1 eα n1 ,... ,ns ≥0 k1 ≥...≥ks ≥0
s
m+1
× q i=1
i=1
ki (2ni +1)+(m−2i+2)ni
.
and {γ1 , . . . , γs } is the set of positive Here is the set of roots of Am , s = 2 pairwise orthogonal roots, γ1 being the highest root. Acknowledgement. We are grateful to A. Polishchuk for giving us the idea of Lemma 4.1 and other very useful remarks. The first named author wishes to thank ENS, IHES and MSRI for their hospitality.
References
[A] Appell, M.P.: Sur les fonctions doublement périodiques de troisième espèce. Annals Sci. l'Ecole Norm. Sup., 3e série, t1, p. 135, t2, p. 9, t3, p. 9 (1884–1886)
[B] Borcherds, R.: Monstrous moonshine and monstrous Lie superalgebras. Invent. Math. 109, 405–444 (1992)
[F] Frenkel, I.B.: Two constructions of affine Lie algebra representations and boson-fermion correspondence in quantum field theory. J. Funct. Anal. 44, 259–327 (1981)
[FF] Feingold, A.J. and Frenkel, I.B.: Classical affine algebras. Adv. Math. 56, 117–172 (1985)
[K1] Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8–96 (1977)
[K2] Kac, V.G.: Infinite-dimensional algebras, Dedekind's η-function, . . . . Adv. in Math. 30, 85–136 (1978)
[K3] Kac, V.G.: Infinite dimensional Lie algebras. 3rd edition, Cambridge: Cambridge University Press, 1990
[K4] Kac, V.G.: Vertex algebras for beginners. University lecture series, Vol. 10, Providence, RI: AMS, 1996; Second edition, 1998
[KL] Kac, V.G. and van de Leur, J.W.: Super boson-fermion correspondence. Ann. de l'Institute Fourier 37, 99–137 (1987)
[KP1] Kac, V.G. and Peterson, D.H.: Spin and wedge representations of infinite-dimensional Lie algebras and groups. Proc. Natl. Acad. Sci. USA 78, 3308–3312 (1981)
[KP2] Kac, V.G. and Peterson, D.H.: Infinite-dimensional Lie algebras, theta functions and modular forms. Adv. Math. 53, 125–264 (1984)
[KW1] Kac, V.G. and Wakimoto, M.: Modular invariant representations of infinite-dimensional Lie algebras and superalgebras. Proc. Natl. Acad. Sci. USA 85, 4956–4960 (1988)
[KW2] Kac, V.G. and Wakimoto, M.: Classification of modular invariant representations of affine algebras. Advanced Ser. Math. Phys. 7, Singapore: World Sci., 1989, pp. 138–177
[KW] Kac, V.G. and Wakimoto, M.: Integrable highest weight modules over affine superalgebras and number theory. Progress in Math. 123, Birkhäuser, Boston, 1994, pp. 415–456
[L] Lu, S.-R.: Some results on modular invariant representations. Adv. Ser. in Math. Phys. 7, Singapore: World Sci., 1989, pp. 235–253
[M] Mumford, D.: Tata lectures on theta I. Progress in Math. 28, Basel–Boston: Birkhäuser, 1983
[PS] Penkov, I. and Serganova, V.: Representations of classical Lie superalgebras of type I. Indag. Math. N.S. 3(4), 419–456 (1992)
[P] Polishchuk, A.: M.P. Appell's function and vector bundles of rank 2 on elliptic curves. Preprint math. AG/9810084
[R] Ray, U.: A character formula for generalized Kac–Moody superalgebras. Preprint
[S] Serganova, V.: Automorphisms of complex simple Lie superalgebras and affine Kac–Moody algebras. Thesis, Leningrad State University, 1988
[Z] Zhu, Y.: Modular invariance of characters of vertex operator algebras. J. AMS 9, 237–302 (1996)

1 Added in proof: In the meantime, D. Zagier proved this conjecture in the case of arbitrary m in his paper "A proof of the Kac–Wakimoto affine denominator formula for the strange series".
Communicated by A. Jaffe
Commun. Math. Phys. 215, 683 – 705 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Universality of the Local Spacing Distribution in Certain Ensembles of Hermitian Wigner Matrices
Kurt Johansson
Department of Mathematics, Royal Institute of Technology, 100 44 Stockholm, Sweden. E-mail: [email protected]
Received: 21 June 2000 / Accepted: 26 July 2000
Abstract: Consider an N × N hermitian random matrix with independent entries, not necessarily Gaussian, a so-called Wigner matrix. It has been conjectured that the local spacing distribution, i.e. the distribution of the distance between nearest neighbour eigenvalues in some part of the spectrum is, in the limit as N → ∞, the same as that of hermitian random matrices from GUE. We prove this conjecture for a certain subclass of hermitian Wigner matrices.

1. Introduction and Main Results

Consider a probability measure $P_N$ on the space of all N × N hermitian matrices. We will be interested in the statistical properties of the spectrum as N becomes large, in particular in features that are insensitive to the details of the particular sequence of probability measures we are considering. It is believed, on the basis of numerical simulations, that for many types of hermitian random matrix ensembles, i.e. choices of $P_N$, the local statistical properties of the eigenvalues are the same as for the Gaussian Unitary Ensemble (GUE), where $dP_N(M) = Z_N^{-1}\exp(-\frac{N}{2}\mathrm{Tr}\, M^{2})\,dM$. Here dM is Lebesgue measure on the space $H_N \cong \mathbb{R}^{N^{2}}$ of all N × N hermitian matrices. The asymptotic eigenvalue density as N → ∞ (density of states) is given by the Wigner semicircle law $\rho(t) = \frac{1}{2\pi}\sqrt{(4 - t^{2})_+}$. Let $\rho_N(x_1, \dots, x_N)$ be the induced probability density on the eigenvalues. The semicircle law is the limit of the one-dimensional marginal density as N → ∞. The m-point correlation function

$$R_m^{(N)}(x_1, \dots, x_m) = \frac{N!}{(N - m)!}\int_{\mathbb{R}^{N-m}} \rho_N(x)\,dx_{m+1} \dots dx_N, \tag{1.1}$$

is given by [21, Ch. 5], [31],

$$R_m^{(N)}(x_1, \dots, x_m) = \det(K_N(x_i, x_j))_{i,j=1}^{m}, \tag{1.2}$$
where the kernel $K_N(x,y)$ is given by
$$K_N(x,y)=\frac{\kappa_{N-1}}{\kappa_N}\,\frac{p_N(x)p_{N-1}(y)-p_{N-1}(x)p_N(y)}{x-y}\,e^{-N(x^2+y^2)/4}. \tag{1.3}$$
Here $p_N(x)=\kappa_Nx^N+\dots$ are the normalized orthogonal polynomials with respect to the weight function $\exp(-Nx^2/2)$ on $\mathbb{R}$ (rescaled Hermite polynomials). From these formulas, and Plancherel-Rotach asymptotics for the Hermite polynomials it follows that
$$\lim_{N\to\infty}\frac{1}{(N\rho(u))^m}R_m^{(N)}\Big(u+\frac{t_1}{N\rho(u)},\dots,u+\frac{t_m}{N\rho(u)}\Big)=\det\Big(\frac{\sin\pi(t_i-t_j)}{\pi(t_i-t_j)}\Big)_{i,j=1}^m \tag{1.4}$$
if $\rho(u)>0$. It has been proved, [23, 8, 3], that this is also true in other invariant ensembles of the form $dP_N(M)=Z_N^{-1}\exp(-N\operatorname{Tr}V(M))\,dM$. The orthogonal polynomials in (1.3) are then replaced by polynomials orthogonal with respect to $\exp(-NV(x))$ on $\mathbb{R}$. That the ensemble is invariant means that the probability measure is invariant under the conjugation $M\to U^{-1}MU$, with a unitary matrix $U$. Sufficient control of the limit (1.4) for all $m\ge1$ makes it possible to determine the asymptotic spacing distribution, i.e. distances between nearest neighbour eigenvalues, see [8]. More precisely, let $\{t_N\}$ be a sequence such that $t_N\to\infty$ but $t_N/N\to0$ as $N\to\infty$ and define, [18, 8], $S_N(s,x)$, $s\ge0$, $x\in\mathbb{R}^N$, to be the symmetric function, which for $x_1<\dots<x_N$ is defined by
$$S_N(s,x)=\frac{1}{2t_N}\,\#\Big\{1\le j\le N-1\;;\;x_{j+1}-x_j\le\frac{s}{N\rho(u)},\ |x_j-u|\le\frac{t_N}{N\rho(u)}\Big\}. \tag{1.5}$$
Given an hermitian matrix $M$ let $x_1(M)<\dots<x_N(M)$ be its eigenvalues; we write $x(M)=(x_1(M),\dots,x_N(M))$. Then it is proved in [8] that
$$\lim_{N\to\infty}E_N[S_N(s,x(M))]=\int_0^sp(\sigma)\,d\sigma, \tag{1.6}$$
for a large class of invariant ensembles. Here $p(\sigma)$ is the density of the $\beta=2$ local spacing distribution, the Gaudin distribution, given by the probability density
$$p(s)=\frac{d^2}{ds^2}\det(I-K)_{L^2(0,s)}, \tag{1.7}$$
where $K$ is the operator on $L^2(0,s)$ with kernel $K(t,s)=\sin\pi(t-s)/\pi(t-s)$, the sine kernel, see [21]. The aim of the present paper is to extend (1.4) and (1.6) to other, non-invariant ensembles. It is conjectured, see [21], p. 9, that (1.4) and (1.6) should hold also for so-called Wigner matrices where the elements are independent but not necessarily Gaussian variables. In this case the probability measure is not invariant under conjugation by unitary matrices. For other results on Wigner matrices see for example [2, 19, 20, 26, 29] and [30]. In particular, in [28] the universality of the fluctuations of the largest eigenvalues is established. To be more precise, consider the complex random variables $w_{jk}$, $1\le j\le k$, with independent laws $P_{jk}=P^R_{jk}\otimes P^I_{jk}$, where $P^I_{jj}=\delta_0$. If $w_{kj}=\bar w_{jk}$, then $W=(w_{jk})_{j,k=1}^N$ is an $N\times N$ hermitian Wigner matrix. Let $\mathcal{W}^p$, a class of Wigner ensembles, denote the class of all $\{P_{jk}\}_{1\le j\le k}$ which satisfy
$$\int z\,dP_{jk}(z)=0,\qquad\int|z|^2\,dP_{jk}(z)=\sigma^2 \tag{1.8}$$
for all $1\le j\le k$, and furthermore
$$\sup_{j,k}\int|z|^p\,dP_{jk}(z)<\infty. \tag{1.9}$$
Fix $a>0$ and let $\varphi_a(t)=(\pi a^2)^{-1/2}\exp(-t^2/a^2)$ be a Gaussian density function. Define $Q^{R,I}_{jk}=\varphi_a*P^{R,I}_{jk}$, $1\le j<k$, $Q^R_{jj}=\varphi_{a\sqrt2}*P^R_{jj}$, $j\ge1$, and $Q^I_{jj}=\delta_0$. Then $Q$ is also a Wigner ensemble and we let $\mathcal{W}^p_a$ denote the subclass of $\mathcal{W}^p$ obtained in this way. Note that although $\mathcal{W}^p_a$ does not contain all Wigner ensembles it does contain cases where the distributions of the matrix elements have very different shapes, so in this sense it is rather broad, and proving universality in $\mathcal{W}^p_a$ clearly shows that the universality is not restricted to the invariant ensembles. Another way to describe this ensemble of random matrices is as follows. Let $V$ be a GUE-matrix with the probability measure $Z_N^{-1}\exp(-\frac12\operatorname{Tr}V^2)\,dV$ and let $W$ be an $N\times N$ Wigner matrix with distribution $P\in\mathcal{W}^p$, i.e. the law of $w_{jk}$ is $P_{jk}$. We will assume that the variance $\sigma^2=1/4$, which can always be achieved by rescaling. Then $W+aV$ has the distribution $Q$, and we write
$$M=\frac{1}{\sqrt N}(W+aV).$$
We can think of this in terms of Dyson's Brownian motion model, [9]: $W+aV$ is obtained from $W$ by letting the matrix elements execute a Brownian motion for a time $a^2$, see Sect. 2. If $P\in\mathcal{W}^p$ and $W$ is an $N\times N$ hermitian matrix we let $P^{(N)}$ denote the distribution of $H=W/\sqrt N=(h_{jk})$, i.e.
$$dP^{(N)}(H)=\prod_{1\le j\le k\le N}dP_{jk}\big(\sqrt N\,h_{jk}\big).$$
The matrix $M$ has the distribution $Q^{(N)}$, which is given by
$$dQ^{(N)}(M)=2^{-N/2}\Big(\frac{N}{\pi a^2}\Big)^{N^2/2}\Big(\int_{\mathcal{H}_N}e^{-\frac{N}{2a^2}\operatorname{Tr}(M-H)^2}\,dP^{(N)}(H)\Big)\,dM, \tag{1.10}$$
and this is the measure we will study. The asymptotic distribution of the eigenvalues $x_1,\dots,x_N$ of $M$ is the semicircle law
$$\rho(u)=\frac{2}{\pi(1+4a^2)}\sqrt{(1+4a^2-u^2)_+}. \tag{1.11}$$
Matrices of the form $H_0+aN^{-1/2}V$ with a fixed non-random matrix $H_0$ are considered by Brézin and Hikami in [5–7], and the present paper is inspired by their work. The following proposition will be proved in Sect. 2 using an argument from [5, 6].
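As a quick sanity check on the semicircle law (1.11), the sketch below integrates $\rho$ numerically and compares its second moment with $\sigma^2+a^2=1/4+a^2$, the value one expects when an independent Gaussian part of variance $a^2$ is added to entries of variance $1/4$. The function names and the midpoint rule are illustrative choices, not part of the paper.

```python
import math

def rho(u, a):
    """Semicircle density (1.11), supported on |u| <= sqrt(1 + 4 a^2)."""
    r2 = 1.0 + 4.0 * a * a
    return 2.0 / (math.pi * r2) * math.sqrt(max(r2 - u * u, 0.0))

def moments(a, n=200_000):
    """Midpoint-rule estimates of the total mass and second moment of rho."""
    r = math.sqrt(1.0 + 4.0 * a * a)
    h = 2.0 * r / n
    mass = 0.0
    second = 0.0
    for i in range(n):
        u = -r + (i + 0.5) * h
        w = rho(u, a) * h
        mass += w
        second += u * u * w
    return mass, second

if __name__ == "__main__":
    a = 0.5
    mass, second = moments(a)
    print("mass:", mass, "second moment:", second, "expected:", 0.25 + a * a)
```

The total mass should be close to 1 and the second moment close to $1/4+a^2$, up to the quadrature error.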
Proposition 1.1. The symmetrized eigenvalue measure on $\mathbb{R}^N$ induced by $Q^{(N)}$ has a density
$$\rho_N(x)=\int_{\mathcal{H}_N}\rho_N(x;y(H))\,dP^{(N)}(H), \tag{1.12}$$
where
$$\rho_N(x;y)=\Big(\frac{N}{2\pi a^2}\Big)^{N/2}\frac{\Delta_N(x)}{\Delta_N(y)}\det\big(e^{-\frac{N}{2a^2}(x_j-y_k)^2}\big)_{j,k=1}^N \tag{1.13}$$
and $\Delta_N(x)=\prod_{1\le i<j\le N}(x_i-x_j)$ is the Vandermonde determinant.
The main result of the present paper is that for Wigner ensembles from $\mathcal{W}^p_a$ we can prove (1.4) and (1.6), and thus extend the universality to a rather broad class of Wigner matrices.

Theorem 1.2. Fix $a>0$ and assume that $|u|\le\sqrt{1/2+2a^2}$. Let $R_m^{(N)}(x_1,\dots,x_m)$ be the correlation functions, defined by (1.1), of the eigenvalue measure $\rho_N$, (1.12), for $Q^{(N)}$, (1.10). Let $f\in L_c^\infty(\mathbb{R}^m)$, the set of all $L^\infty$ functions on $\mathbb{R}^m$ with compact support, and set for $x\in\mathbb{R}^N$,
$$(Sf)(x)=\sum_{i_1,\dots,i_m}f(x_{i_1},\dots,x_{i_m}),$$
where the sum is over all distinct indices from $\{1,\dots,N\}$. If $Q\in\mathcal{W}^p_a$ with $p>2(m+2)$, then
$$\lim_{N\to\infty}\int_{\mathcal{H}_N}(Sf)\big(N\rho(u)(x_1(M)-u),\dots,N\rho(u)(x_N(M)-u)\big)\,dQ^{(N)}(M) \tag{1.14}$$
$$=\lim_{N\to\infty}\int_{\mathbb{R}^m}f(t_1,\dots,t_m)\frac{1}{(N\rho(u))^m}R_m^{(N)}\Big(u+\frac{t_1}{N\rho(u)},\dots,u+\frac{t_m}{N\rho(u)}\Big)\,d^mt$$
$$=\int_{\mathbb{R}^m}f(t_1,\dots,t_m)\det\Big(\frac{\sin\pi(t_i-t_j)}{\pi(t_i-t_j)}\Big)_{i,j=1}^m\,d^mt.$$
The condition on $u$ is made just to simplify the saddle-point argument in Sect. 3; the result should hold for any $u$ with $\rho(u)>0$. We can also prove that the spacing distribution is the same as for GUE.

Theorem 1.3. Fix any $a>0$ and assume that $Q\in\mathcal{W}^{6+\varepsilon}_a$, $\varepsilon>0$. Let $S_N(s,x)$ be defined by (1.5) with the same conditions on $u$ as above. Then, for any $s\ge0$,
$$\lim_{N\to\infty}\int_{\mathcal{H}_N}S_N(s,x(M))\,dQ^{(N)}(M)=\int_0^sp(\sigma)\,d\sigma, \tag{1.15}$$
where $p(s)$ is given by (1.7). The theorems will be proved in Sect. 4 after the preparatory work in Sects. 2 and 3.
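To make the limiting object in (1.4) and Theorem 1.2 concrete, the following sketch evaluates the determinant of the sine kernel at given points $t_1,\dots,t_m$. It is only an illustration: the helper names are hypothetical, and the determinant is computed by plain Gaussian elimination rather than by any method used in the paper.

```python
import math

def sine_kernel(t, s):
    """sin(pi (t - s)) / (pi (t - s)), with the value 1 on the diagonal."""
    x = math.pi * (t - s)
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-300:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return d

def limiting_correlation(ts):
    """Right-hand side of (1.4): det(sin pi(t_i - t_j) / pi(t_i - t_j))."""
    return det([[sine_kernel(ti, tj) for tj in ts] for ti in ts])

if __name__ == "__main__":
    for gap in (0.0, 0.25, 0.5, 1.0):
        print(gap, limiting_correlation([0.0, gap]))
```

For $m=2$ the determinant reduces to $1-(\sin\pi t/\pi t)^2$ with $t=t_1-t_2$, which vanishes as $t\to0$; this is the level repulsion reflected in the Gaudin density (1.7).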
2. The Correlation Functions

We will start by proving Proposition 1.1 using the Harish-Chandra/Itzykson–Zuber formula following [5, 6]. After that we will give a formula for the correlation functions of $\rho_N(x;y)$, which is very close to the formula in [7], but our derivation will be different. A central role will be played by non-intersecting one-dimensional Brownian motions and we will use the formulas of Karlin and McGregor, [17]. Also we will discuss the relation to Dyson's Brownian motion model. This connection can be found in [10] and we will only give an outline.

Proof. Let $F(x)$ be a continuous symmetric function on $\mathbb{R}^N$. By Fubini's theorem
$$\int_{\mathcal{H}_N}F(x(M))\,dQ^{(N)}(M)=c_N^{(1)}\int_{\mathcal{H}_N}\Big(\int_{\mathcal{H}_N}F(x(M))e^{-\frac{N}{2a^2}\operatorname{Tr}(M-H)^2}\,dM\Big)\,dP^{(N)}(H) \tag{2.1}$$
with $c_N^{(1)}=2^{-N/2}(N/\pi a^2)^{N^2/2}$. In the right-hand side of (2.1) we make the substitution $M=U^{-1}RU$, with $U\in U(N)$ and $R\in\mathcal{H}_N$, and then integrate over $U(N)$. If we use Fubini's theorem again, we obtain
$$c_N^{(1)}\int_{\mathcal{H}_N}\Big(\int_{\mathcal{H}_N}F(x(R))\Big(\int_{U(N)}e^{-\frac{N}{2a^2}\operatorname{Tr}(U^{-1}RU-H)^2}\,dU\Big)\,dR\Big)\,dP^{(N)}(H).$$
Here we have also used the fact that $dM=dR$. The integral over $U(N)$ can now be evaluated using the Harish-Chandra/Itzykson–Zuber formula, [12, 14], see also [21, A.5]. We obtain the integral
$$c_N^{(1)}c_N^{(2)}\int_{\mathcal{H}_N}\Big(\int_{\mathcal{H}_N}F(x)\frac{1}{\Delta_N(x)\Delta_N(y)}\det\big(e^{-N(x_j-y_k)^2/2a^2}\big)_{j,k=1}^N\,dR\Big)\,dP^{(N)}(H),$$
where $y_1,\dots,y_N$ are the eigenvalues of $H$ and $c_N^{(2)}=(a^2/N)^{N(N-1)/2}\prod_{j=1}^Nj!$. The integrand in the middle integral depends only on the eigenvalues $x$ of $R$ and hence we can integrate out the other degrees of freedom in the standard way, [21, Ch. 3], and obtain, after using Fubini's theorem,
$$\int_{\mathcal{H}_N}F(x(M))\,dQ^{(N)}(M)=c_N^{(1)}c_N^{(2)}c_N^{(3)}\int_{\mathcal{H}_N}\Big(\int_{\mathbb{R}^N}F(x)\frac{\Delta_N(x)}{\Delta_N(y)}\det\big(e^{-N(x_j-y_k)^2/2a^2}\big)_{j,k=1}^N\,d^Nx\Big)\,dP^{(N)}(H) \tag{2.2}$$
with $c_N^{(3)}=\pi^{N(N-1)/2}\prod_{j=1}^N(j!)^{-1}$. We see that $c_N^{(1)}c_N^{(2)}c_N^{(3)}=(N/2\pi a^2)^{N/2}$ and since (2.2) holds for arbitrary bounded, continuous, symmetric $F(x)$ we have proved that the symmetrized eigenvalue measure is given by (1.12). This proves Proposition 1.1.

Let $p_t(x,y)$ be the transition probability of a Markov process $X(t)$ on $\mathbb{R}$ with continuous paths. Consider $N$ independent copies of the process $(X_1(t),\dots,X_N(t))$ and assume that this is a strong Markov process in $\mathbb{R}^N$. Suppose that the particles start at positions $y_1<\dots<y_N$ at time 0. The probability density that they are at positions $x_1<\dots<x_N$ at time $S$ given that their paths have not intersected anytime during the time interval $[0,S]$ is, by a theorem of Karlin and McGregor, [17], $\det(p_S(y_j,x_k))_{j,k=1}^N$. Hence, the conditional probability density that the particles are at positions $y_1<\dots<y_N$ at time 0, at positions $x_1<\dots<x_N$ at time $S$, at positions $z_1<\dots<z_N$ at time $S+T$, given that their paths have not intersected in the time interval $[0,S+T]$ is
$$q_{S,T}(x;y;z)\doteq\frac{1}{Z_N}\det(p_S(y_j,x_k))_{j,k=1}^N\det(p_T(x_j,z_k))_{j,k=1}^N, \tag{2.3}$$
where
$$Z_N=\int_{x_1<\dots<x_N}\det(p_S(y_j,x_k))_{j,k=1}^N\det(p_T(x_j,z_k))_{j,k=1}^N\,d^Nx.$$
Note that the expression (2.3) is a symmetric function of $x_1,\dots,x_N$, so we can regard it as a probability measure on $\mathbb{R}^N$. Our next lemma shows that we can obtain $\rho_N(x;y)$ defined by (1.13) as a limit of the measure in (2.3) in the case of Brownian motion.

Lemma 2.1. Let $z_j=j-1$, $1\le j\le N$, and let $p_t(x,y)=(2\pi t)^{-1/2}\exp(-(x-y)^2/2t)$ be the transition probability for Brownian motion. Then, for any $x\in\mathbb{R}^N$ and $y_1<\dots<y_N$,
$$\lim_{T\to\infty}q_{S,T}(x;y;z)=\frac{1}{(2\pi S)^{N/2}}\frac{\Delta_N(x)}{\Delta_N(y)}\det\big(e^{-(x_j-y_k)^2/2S}\big)_{j,k=1}^N\doteq q_S(x;y). \tag{2.4}$$
Note that $\rho_N(x;y)=q_{a^2/N}(x;y)$.

Proof. Write
$$\det(p_S(y_j,x_k))_{j,k=1}^N\det(p_T(x_j,z_k))_{j,k=1}^N=\frac{1}{(2\pi)^N(TS)^{N/2}}\det\big(e^{-(x_j-y_k)^2/2S}\big)_{j,k=1}^N\prod_{j=1}^Ne^{-\frac{x_j^2+z_j^2}{2T}}\det\big(e^{x_jz_k/T}\big)_{j,k=1}^N. \tag{2.5}$$
Note that $Z_N$ is the conditional probability density of going from $y_1<\dots<y_N$ to $z_1<\dots<z_N$ without collisions, i.e.
$$Z_N=\det(p_{S+T}(y_j,z_k))_{j,k=1}^N=\frac{1}{(2\pi)^{N/2}(S+T)^{N/2}}\prod_{j=1}^Ne^{-\frac{y_j^2+z_j^2}{2(S+T)}}\det\big(e^{y_jz_k/(S+T)}\big)_{j,k=1}^N. \tag{2.6}$$
Now, since zj = j − 1, we have two Vandermonde determinants in (2.5) and (2.6). If we evaluate these, take the quotient between (2.5) and (2.6) and then take the limit T → ∞, we obtain the right-hand side of (2.4).
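The limit in Lemma 2.1 can be checked numerically in the smallest non-trivial case $N=2$, $z=(0,1)$. The sketch below evaluates $q_{S,T}$ from the Karlin–McGregor determinants, using $Z_N=\det(p_{S+T}(y_j,z_k))$, and compares it with the right-hand side of (2.4) for a large but finite $T$. The sample points, the value of $T$ and the function names are arbitrary illustrative choices.

```python
import math

def p(t, x, y):
    """Brownian transition density p_t(x, y) = (2 pi t)^(-1/2) exp(-(x-y)^2 / 2t)."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def q_finite(x, y, S, T):
    """q_{S,T}(x; y; z) from (2.3) for N = 2 and z = (0, 1)."""
    z = (0.0, 1.0)
    d_s = det2([[p(S, y[j], x[k]) for k in range(2)] for j in range(2)])
    d_t = det2([[p(T, x[j], z[k]) for k in range(2)] for j in range(2)])
    z_n = det2([[p(S + T, y[j], z[k]) for k in range(2)] for j in range(2)])
    return d_s * d_t / z_n

def q_limit(x, y, S):
    """Right-hand side of (2.4) for N = 2."""
    vandermonde_ratio = (x[0] - x[1]) / (y[0] - y[1])
    d = det2([[math.exp(-(x[j] - y[k]) ** 2 / (2.0 * S)) for k in range(2)]
              for j in range(2)])
    return vandermonde_ratio * d / (2.0 * math.pi * S)

if __name__ == "__main__":
    x, y, S = (-0.3, 0.8), (-0.5, 0.4), 0.7
    for T in (1e2, 1e4, 1e6):
        print(T, q_finite(x, y, S, T), q_limit(x, y, S))
```

The two values should agree up to a relative error of order $1/T$.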
Proposition 1.1 and Lemma 2.1 establish a link between the eigenvalue distribution of $M=(W+aV)/\sqrt N$ and the non-intersecting Brownian paths. If we set $S=a^2/N$, then the right-hand side of (2.4) and (1.13) are identical; $y_1<\dots<y_N$ are the eigenvalues of $H=W/\sqrt N$. This relation can also be seen in another way, which we will now outline. Let $X(t)=(x_{jk}(t))_{j,k=1}^N$ be an $N\times N$ hermitian matrix, where $\operatorname{Re}x_{jk}(t)$, $\operatorname{Im}x_{jk}(t)$, $j\le k$, are independent Brownian motions with variance $(1+\delta_{jk})/2$. Assume that $X(0)=H$ is distributed according to $P^{(N)}$. Then the distribution of $X(a^2/N)$ is the same as that of $M=(W+aV)/\sqrt N$. Following Dyson, [9], see also [24], it is possible to derive a stochastic differential equation for the eigenvalues $\lambda_1(t),\dots,\lambda_N(t)$ of $X(t)$,
$$d\lambda_i=dB_i+\sum_{k\ne i}\frac{1}{\lambda_i-\lambda_k}\,dt, \tag{2.7}$$
where $B_i$ are independent standard Brownian motions on $\mathbb{R}$, and with the initial conditions $\lambda_i(0)=y_i$, $1\le i\le N$. We can also consider the problem of non-intersecting Brownian motions in a different way than that of Karlin and McGregor. Namely, let $K=\{x\in\mathbb{R}^N\,;\,x_1<\dots<x_N\}$ and consider Brownian motion in $\mathbb{R}^N$ starting at $y\in K$ and conditioned to remain in $K$ forever. As proved in [10], see also [13, 25], if $\lambda_i$ are the components of the $N$-dimensional conditioned Brownian motion they satisfy the stochastic differential equation (2.7) with the same initial conditions. This gives another way to obtain (1.13) without using the Harish-Chandra/Itzykson–Zuber formula. Actually, we can turn the argument around and give a proof of this formula.

We turn now to the computation of the correlation functions of the right-hand side of (2.4), but we start more generally with (2.3). This can be analyzed using the techniques of [31], compare the analysis of the Schur measure, [22], in [15], and see also [16]. For completeness, let us outline the result we need from [31]. See [27] for more results related to Proposition 2.2 below. Related ideas were also used in [4]. Let $(\Lambda,\mu)$ be a measure space. Assume that $\varphi_j,\psi_j\in L^2(\Lambda,\mu)$, $1\le j\le N$, and $f\in L^\infty(\Lambda,\mu)$. Set
$$Z_N[f]=\frac{1}{N!}\int_{\Lambda^N}\det(\varphi_j(x_k))_{j,k=1}^N\det(\psi_j(x_k))_{j,k=1}^N\prod_{j=1}^Nf(x_j)\,d\mu(x_j)$$
and
$$A=\Big(\int_\Lambda\varphi_j(x)\psi_k(x)\,d\mu(x)\Big)_{j,k=1}^N.$$

Proposition 2.2. ([31]) Assume that $Z_N[1]\ne0$. Then $A$ is invertible and we can define
$$K_N(t,s)=\sum_{j,k=1}^N\psi_k(t)(A^{-1})_{kj}\varphi_j(s). \tag{2.8}$$
Then, for any $g\in L^\infty(\Lambda,\mu)$,
$$\frac{Z_N[1+g]}{Z_N[1]}=\det(I+K_Ng)_{L^2(\Lambda)}. \tag{2.9}$$
If we define a measure on $\Lambda^N$ by
$$u_N(x)\prod_{j=1}^Nd\mu(x_j)=\frac{1}{N!\,Z_N[1]}\det(\varphi_j(x_k))_{j,k=1}^N\det(\psi_j(x_k))_{j,k=1}^N\prod_{j=1}^Nd\mu(x_j), \tag{2.10}$$
then it has determinantal correlation functions
$$\frac{N!}{(N-m)!}\int_{\Lambda^{N-m}}u_N(x)\,d\mu(x_{m+1})\dots d\mu(x_N)=\det(K_N(x_i,x_j))_{i,j=1}^m\prod_{j=1}^md\mu(x_j). \tag{2.11}$$

Proof. We will indicate the main steps in the proof of (2.9) of which (2.11) is a consequence, see [31]. Set
$$B=\Big(\int_\Lambda\varphi_j(x)\psi_k(x)g(x)\,d\mu(x)\Big)_{j,k=1}^N.$$
We have the formula
$$Z_N[f]=\det\Big(\int_\Lambda\varphi_j(x)\psi_k(x)f(x)\,d\mu(x)\Big)_{j,k=1}^N,$$
which goes all the way back to [1], and which is not difficult to prove by expanding the determinants in $Z_N[f]$. From this we see that $\det A=Z_N[1]\ne0$, so $A$ is invertible and
$$\frac{Z_N[1+g]}{Z_N[1]}=\frac{\det(A+B)}{\det A}=\det(I+A^{-1}B). \tag{2.12}$$
Now,
$$(A^{-1}B)_{jk}=\int_\Lambda\psi_k(x)\sum_{\ell=1}^N(A^{-1})_{j\ell}\varphi_\ell(x)g(x)\,d\mu(x),$$
and we define $T:\mathbb{C}^N\to L^2(\Lambda,d\mu)$ and $S:L^2(\Lambda,d\mu)\to\mathbb{C}^N$ by the kernels $T(x,k)=\psi_k(x)$ and $S(j,x)=\sum_{\ell=1}^N(A^{-1})_{j\ell}\varphi_\ell(x)g(x)$. Then, by (2.12) and a determinant identity,
$$\frac{Z_N[1+g]}{Z_N[1]}=\det(I+ST)_{\mathbb{C}^N}=\det(I+TS)_{L^2(\Lambda,d\mu)}=\det(I+K_Ng)_{L^2(\Lambda,d\mu)},$$
with $K_N$ given by (2.8). Note that $K_Ng$, which means first multiplication by $g$ and then application of the operator on $L^2(\Lambda,d\mu)$ with kernel $K_N$, is a finite rank operator.
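The determinant formula $Z_N[f]=\det(\int\varphi_j\psi_kf\,d\mu)$ used in the proof above can be tested directly on a small discrete measure space. The sketch below takes $N=2$, a four-point space with arbitrary weights, and hypothetical choices of $\varphi_j$, $\psi_j$ and $f$ (none of which come from the paper), and compares the $N$-fold sum defining $Z_N[f]$ with the $2\times2$ determinant.

```python
import math
from itertools import product

# Hypothetical finite measure space: four points with positive weights.
POINTS = [0.0, 1.0, 2.0, 3.0]
WEIGHTS = [0.4, 0.3, 0.2, 0.1]

# Hypothetical test functions phi_1, phi_2 and psi_1, psi_2.
PHI = [lambda x: 1.0, lambda x: x]
PSI = [lambda x: 1.0, lambda x: x * x]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def z_direct(f):
    """Z_N[f] for N = 2, computed from its definition as a double sum."""
    total = 0.0
    for i1, i2 in product(range(len(POINTS)), repeat=2):
        xs = (POINTS[i1], POINTS[i2])
        d_phi = det2([[PHI[j](xs[k]) for k in range(2)] for j in range(2)])
        d_psi = det2([[PSI[j](xs[k]) for k in range(2)] for j in range(2)])
        total += d_phi * d_psi * f(xs[0]) * f(xs[1]) * WEIGHTS[i1] * WEIGHTS[i2]
    return total / 2.0  # the factor 1 / N!

def gram(f):
    """The matrix (integral of phi_j psi_k f dmu) for j, k = 1, 2."""
    return [[sum(PHI[j](x) * PSI[k](x) * f(x) * w for x, w in zip(POINTS, WEIGHTS))
             for k in range(2)] for j in range(2)]

def z_determinant(f):
    """Z_N[f] computed as a single 2 x 2 determinant."""
    return det2(gram(f))

if __name__ == "__main__":
    f = lambda x: 1.0 + 0.5 * math.cos(x)
    print(z_direct(f), z_determinant(f))
```

With $f=1$ the matrix returned by `gram` is the matrix $A$, and with $f=1+g$ it is $A+B$, so the same check also covers the first equality in (2.12).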
Observe now that if we take $\Lambda=\mathbb{R}$, $d\mu(x)=dx$, $\varphi_j(x)=p_T(x,z_j)$ and $\psi_j(x)=p_S(y_j,x)$, then (2.3) is a probability density of the form (2.10) and we can apply the proposition. Note that
$$(A)_{jk}=\int_{\mathbb{R}}p_T(x,z_j)p_S(y_k,x)\,dx=p_{S+T}(y_k,z_j).$$
The kernel which gives the correlation functions is
$$K_N^{S,T}(u,v)=\sum_{k=1}^Np_S(y_k,u)\Big(\sum_{j=1}^N(A^{-1})_{jk}\,p_T(v,z_j)\Big).$$
Let $A_k(v)$ be the matrix we obtain from $A$ by replacing column $k$ by $(p_T(v,z_1)\,\cdots\,p_T(v,z_N))^T$. Then, by Cramer's rule,
$$K_N^{S,T}(u,v)=\sum_{k=1}^Np_S(y_k,u)\frac{\det A_k(v)}{\det A}. \tag{2.13}$$
This formula and Proposition 2.2 are the basis for the next proposition. The result is closely related to the result derived in [7] by different methods.

Proposition 2.3. The correlation functions for $q_S(x;y)$ defined by (2.4) are given by
$$R_m^N(x_1,\dots,x_m;y)\doteq\frac{N!}{(N-m)!}\int_{\mathbb{R}^{N-m}}q_S(x;y)\,dx_{m+1}\dots dx_N=\det(K_N^S(x_i,x_j;y))_{i,j=1}^m, \tag{2.14}$$
where
$$K_N^S(u,v;y)=\frac{e^{(v^2-u^2)/2S}}{(v-u)S(2\pi i)^2}\int_\gamma dz\int_\Gamma dw\,\big(1-e^{(v-u)z/S}\big)\prod_{j=1}^N\frac{w-y_j}{z-y_j}\times\frac1z\Big(w+z-v-S\sum_j\frac{y_j}{(w-y_j)(z-y_j)}\Big)e^{(w^2-2vw-z^2+2uz)/2S}. \tag{2.15}$$
Here $\gamma$ is the union of the curves $t\to-t+i\omega$, $t\in\mathbb{R}$, and $t\to t-i\omega$, $t\in\mathbb{R}$, with a fixed $\omega>0$, and $\Gamma:\mathbb{R}\ni t\to it$.

Proof. We have to show that with $p_t(u,v)=(2\pi t)^{-1/2}\exp(-(u-v)^2/2t)$ and $z_j=j-1$ the limit of the right-hand side of (2.13) as $T\to\infty$ can be written as (2.15). The result then follows from Lemma 2.1, Proposition 2.2 and the dominated convergence theorem. We see that
$$\det A=\frac{1}{(2\pi(S+T))^{N/2}}\prod_{j=1}^Ne^{-\frac{z_j^2+y_j^2}{2(S+T)}}\prod_{1\le i<j\le N}\big(e^{\frac{y_j}{S+T}}-e^{\frac{y_i}{S+T}}\big) \tag{2.16}$$
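by the formula for a Vandermonde determinant. Let $\Gamma_M^*$ be the curve $t\to t+iM$, $t\in\mathbb{R}$, $M$ fixed. Then
$$p_T(z_j,v)=\frac{1}{\sqrt{2\pi T}}e^{-\frac{z_j^2}{2(S+T)}-\frac{v^2}{2T}}\frac{1}{\sqrt{2\pi}}\int_{\Gamma_M^*}e^{-\frac{\tau^2}{2}+z_j\big(\frac vT+i\tau\sqrt{\frac{S}{T(S+T)}}\big)}\,d\tau.$$
Hence,
$$\det A_k(v)=\frac{1}{\sqrt{2\pi T}}\frac{1}{(2\pi(S+T))^{(N-1)/2}}\prod_{j=1}^Ne^{-\frac{z_j^2}{2(S+T)}}\prod_{j\ne k}e^{-\frac{y_j^2}{2(S+T)}}\times e^{-\frac{v^2}{2T}}\frac{1}{\sqrt{2\pi}}\int_{\Gamma_M^*}e^{-\frac{\tau^2}{2}}\det\tilde A_k(v)\,d\tau,$$
where $\tilde A_k(v)$ is the matrix we get from $\big(\exp\big(\frac{z_jy_k}{S+T}\big)\big)_{j,k=1}^N$ by replacing column $k$ by $\big(\exp\big(z_j\big(\frac vT+i\tau\sqrt{\frac{S}{T(S+T)}}\big)\big)\big)_{j=1}^N$. Since $z_j=j-1$ we have a Vandermonde determinant and we obtain
$$\det A_k(v)=\sqrt{\frac{S+T}{T}}\frac{1}{(2\pi(S+T))^{N/2}}\prod_{j=1}^Ne^{-\frac{z_j^2}{2(S+T)}}\prod_{j\ne k}e^{-\frac{y_j^2}{2(S+T)}}\times e^{-\frac{v^2}{2T}}\frac{1}{\sqrt{2\pi}}\int_{\Gamma_M^*}e^{-\frac{\tau^2}{2}}\prod_{1\le i<j\le N}\big(e^{\frac{y_j}{S+T}}-e^{\frac{y_i}{S+T}}\big)\,d\tau, \tag{2.17}$$
where $y_k$ should be replaced by $(S+T)\big(\frac vT+i\tau\sqrt{\frac{S}{T(S+T)}}\big)$. Take the quotient of (2.16) and (2.17) and let $T\to\infty$. This gives
$$\lim_{T\to\infty}\frac{\det A_k(v)}{\det A}=\frac{1}{\sqrt{2\pi}}\int_{\Gamma_M^*}e^{-\frac{\tau^2}{2}}\prod_{j\ne k}\frac{v+i\sqrt S\tau-y_j}{y_k-y_j}\,d\tau.$$
Choose $M$ so that $v-\sqrt SM=L$, where $L$ is given, and make the change of variables $w=v+i\sqrt S\tau$. Then
$$\lim_{T\to\infty}\frac{\det A_k(v)}{\det A}=\frac{1}{i\sqrt{2\pi S}}\int_{\Gamma_L}e^{\frac{(w-v)^2}{2S}}\prod_{j\ne k}\frac{w-y_j}{y_k-y_j}\,dw,$$
where $\Gamma_L:t\to L+it$, $t\in\mathbb{R}$. Thus, using (2.13),
$$K_N^S(u,v;y)=\frac{1}{2\pi iS}\sum_{k=1}^Ne^{-(y_k-u)^2/2S}\int_{\Gamma_L}e^{\frac{(w-v)^2}{2S}}\prod_{j\ne k}\frac{w-y_j}{y_k-y_j}\,dw.$$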
Let $\gamma$ be a curve surrounding $y_1,\dots,y_N$ and choose $L$ so large that $\gamma$ and $\Gamma_L$ do not intersect. The residue theorem gives
$$\frac{1}{2\pi i}\int_\gamma\frac{e^{-(z-u)^2/2S}}{w-z}\prod_{j=1}^N\frac{w-y_j}{z-y_j}\,dz=\sum_{k=1}^Ne^{-(y_k-u)^2/2S}\prod_{j\ne k}\frac{w-y_j}{y_k-y_j}$$
for all $w\in\Gamma_L$. Thus,
$$K_N^S(u,v;y)=\frac{e^{\frac{v^2-u^2}{2S}}}{(2\pi i)^2S}\int_\gamma dz\int_{\Gamma_L}dw\,e^{\frac{1}{2S}(w^2-2vw-z^2+2uz)}\frac{1}{w-z}\prod_{j=1}^N\frac{w-y_j}{z-y_j}. \tag{2.18}$$
In (2.18) we make the change of variables $z\to bz$, $w\to bw$ with $b\in\mathbb{R}$ close to 1. This will modify the contours but we can use Cauchy's theorem to deform back to $\gamma$ and $\Gamma_L$. Now, take the derivative with respect to $b$ and then put $b=1$. This gives the equation
$$0=K_N^S(u,v;y)+\frac{e^{\frac{v^2-u^2}{2S}}}{(2\pi i)^2S^2}\int_\gamma dz\int_{\Gamma_L}dw\,e^{\frac{1}{2S}(w^2-2vw-z^2+2uz)}\frac{1}{w-z}\prod_{j=1}^N\frac{w-y_j}{z-y_j}\times\Big[w^2-z^2+uz-vw+S\Big(\sum_{j=1}^N\frac{w}{w-y_j}-\sum_{j=1}^N\frac{z}{z-y_j}\Big)\Big].$$
This can be written
$$\frac{\partial}{\partial u}\big((u-v)K_N^S(u,v;y)\big)=-\frac{e^{\frac{v^2-u^2}{2S}}}{(2\pi i)^2S^2}\int_\gamma dz\int_{\Gamma_L}dw\,e^{(w^2-2vw-z^2)/2S}e^{uz/S}\Big(w+z-v-S\sum_{j=1}^N\frac{y_j}{(w-y_j)(z-y_j)}\Big)\prod_{j=1}^N\frac{w-y_j}{z-y_j},$$
and integration of this formula gives (2.15). In this last formula we can choose $L$ arbitrarily and take $\gamma$ to be the curve in the proposition by using Cauchy's theorem. This completes the proof.

We now take $S=a^2/N$ and set
$$K_N(u,v;y)=e^{\frac{N}{2a^2}(u^2-v^2)+\omega(u-v)}K_N^{a^2/N}(u,v;y), \tag{2.19}$$
where $\omega$ is a constant that will be specified later. Note that we can replace $K_N^{a^2/N}$ with $K_N$ in (2.14) without changing the correlation functions, so we can just as well work with $K_N$. Set
$$f_N(z)=\frac{1}{2a^2}(z^2-2uz)+\frac1N\sum_{j=1}^N\log(z-y_j),$$
$$g_N(z,w)=\frac{1}{a^2z}\Big(w+z-u-\frac{a^2}{N}\sum_{j=1}^N\frac{y_j}{(w-y_j)(z-y_j)}\Big),$$
$$h_N(z,w)=\frac{e^{\omega(u-v)}}{N\rho(u)(v-u)}\big(e^{N(u-v)w/a^2}-e^{N(u-v)(w-z)/a^2}\big),$$
so that
$$\frac{K_N(u,v;y)}{N\rho(u)}=N\int_\gamma\frac{dz}{2\pi i}\int_\Gamma\frac{dw}{2\pi i}\,h_N(z,w)g_N(z,w)e^{N(f_N(w)-f_N(z))}. \tag{2.20}$$
These are the formulas we will use in the asymptotic analysis. A straightforward computation shows that
$$g_N(z,w)=\frac1zf_N'(w)+\frac{f_N'(z)-f_N'(w)}{z-w}. \tag{2.21}$$
3. Asymptotics

The eigenvalues $y_1,\dots,y_N$ of the Wigner matrix $H$ converge to the semicircle law
$$\sigma(t)=\frac2\pi\sqrt{1-t^2},\qquad|t|\le1. \tag{3.1}$$
In order to be able to perform the saddle point analysis of (2.20) we need uniform control of the convergence of $f_N(z)$ to its limit
$$f(z)=\frac{1}{2a^2}(z^2-2uz)+\int_{-1}^1\log(z-t)\sigma(t)\,dt. \tag{3.2}$$
In order to show this we must start with some probability estimates. Write $\Lambda_{R,\eta}=\{z\in\mathbb{C}\,;\,|\operatorname{Re}z|\le R,\ \eta\le|\operatorname{Im}z|\le R\}$.

Lemma 3.1. Let $F\in L^\infty(\mathbb{R}^N)$ be symmetric and let $\eta>0$ and $R>0$ be given. Assume that $P\in\mathcal{W}^p$, $p>4$ and $0<\xi<\min(\frac12-\frac2p,\frac{1}{16})$. Then, there is a probability measure $\tilde P^{(N)}$ on $\mathcal{H}_N$ such that
$$\Big|\int_{\mathcal{H}_N}F(x(H))\,dP^{(N)}(H)-\int_{\mathcal{H}_N}F(x(H))\,d\tilde P^{(N)}(H)\Big|\le N^{2-p(\frac12-\xi)}\|F\|_\infty, \tag{3.3}$$
and
$$\sup_{z\in\Lambda_{R,\eta}}\Big|\frac1N\operatorname{Tr}\log(z-H)-\int_{-1}^1\log(z-t)\sigma(t)\,dt\Big|\le CN^{-\xi} \tag{3.4}$$
a.s. with respect to $\tilde P^{(N)}$.

Proof. Given $P\in\mathcal{W}^p$ we introduce a cut-off $L>0$ and define a new probability measure $P_L\in\mathcal{W}^p$ by
$$dP^{R,I}_{L,jk}(t)=\frac{1}{d_{L,jk}}\chi_{[-L,L]}(t)\,dP^{R,I}_{jk}(t),\qquad1\le j\le k,$$
where $d_{L,jk}$ is a normalization constant. Note that $P_{L,jk}$ is supported in $K=[-L,L]^2$. Set $d_L^{(N)}=\prod_{1\le j\le k\le N}d_{L,jk}$. Then,
$$\Big|\int_{\mathcal{H}_N}F(x(H))\,dP^{(N)}(H)-\int_{\mathcal{H}_N}F(x(H))\,dP_L^{(N)}(H)\Big|\le\|F\|_\infty\big(1-d_L^{(N)}\big)\Big(1+\frac{1}{d_L^{(N)}}\Big)\le\frac{CN^2}{L^p}\|F\|_\infty \tag{3.5}$$
for some constant $C$. The last estimate follows from
$$1-d_L^{(N)}=P[\text{some }|W_{jk}|\ge L]\le N^2\sup_{1\le j\le k}\frac{E[|W_{jk}|^p]}{L^p}\le\frac{CN^2}{L^p} \tag{3.6}$$
by (1.9). Set $D_N=\Lambda_{R,\eta}\cap\frac1N\mathbb{Z}^2$ and note that $\#D_N\le CN^2$ for some constant $C$ that only depends on $R,\eta$. For a given function $f$ set
$$A_N(f;\delta)=\Big\{H\in\mathcal{H}_N\,;\,\Big|\frac1N\operatorname{Tr}(f(H))-\int_{-1}^1f(t)\,d\sigma(t)\Big|\le\delta\Big\},$$
where $\sigma(t)$ is the semicircle law (3.1). Set
$$A_N(\delta)=\bigcap_{z\in D_N}A_N(f_z,\delta), \tag{3.7}$$
where $f_z(t)=\log(z-t)$ (principal branch). To estimate the probability of $A_N(\delta)$ under $P^{(N)}$ we will use a result of Guionnet and Zeitouni, [11]. Let
$$|f|_L=\sup_{x,y\in\mathbb{R}}\frac{|f(x)-f(y)|}{|x-y|},$$
and $\|f\|_L=\|f\|_\infty+|f|_L$. Then, by [11, Corollary 1.6a)], and the discussion before this corollary, given $\varepsilon>0$, there are positive constants $C_0(\varepsilon)$, $C_1$ and $C_2$ such that if we write
$$\delta_1(N)=C_1L^2|f|_LN^{-1}+C_2(\varepsilon)\|f\|_LN^{-1/4+\varepsilon}, \tag{3.8}$$
then
$$P_L^{(N)}\Big[\Big|\frac1N\operatorname{Tr}f(H)-\int_{-1}^1f(t)\sigma(t)\,dt\Big|\ge\delta\Big]\le4\exp\Big(-\frac{C_2N^2}{L^4|f|_L}(\delta-\delta_1(N))^2\Big) \tag{3.9}$$
for any $\delta>\delta_1(N)$. Since under $P_L^{(N)}$ all $|H_{jk}|\le\sqrt2(L/\sqrt N)$, the spectral radius is $\le2L$. Thus, the left-hand side of (3.9) is unchanged if we replace $f=f_z$ with $f=f_z^L(t)$, where $f_z^L(t)=\log(z-t)$ if $|t|\le2L$, $f_z^L(t)=\log(z-2L)$ if $t>2L$ and $f_z^L(t)=\log(z+2L)$ if $t<-2L$. Now, $f_z^L(t)$ is Lipschitz and there is a constant $C_3$, independent of $L$, such that $|f_z^L(t)|_L\le C_3$ and $\|f_z^L(t)\|_L\le C_3(1+\log L)$ for all $z\in\Lambda_{R,\eta}$. Take $L=L_N=N^{1/2-\xi}$ and $\varepsilon=1/6$ in (3.8). Then $\delta_1(N)\le CN^{-2\xi}$ and if we choose $\delta=N^{-\xi}$ in (3.9) we obtain
$$P_{L_N}^{(N)}\Big[\Big|\frac1N\operatorname{Tr}f_z(H)-\int_{-1}^1f_z(t)\sigma(t)\,dt\Big|\ge N^{-\xi}\Big]\le c_1\exp(-c_2N^{2\xi}) \tag{3.10}$$
for some positive constants $c_1,c_2$. If we use (3.10) we see that the probability of the complement of the event in (3.7) can be estimated as
$$P_{L_N}^{(N)}[A_N(N^{-\xi})^c]\le CN^2e^{-c_2N^{2\xi}}. \tag{3.11}$$
Set
$$d\tilde P^{(N)}(H)=\big(P_{L_N}^{(N)}[A_N(N^{-\xi})]\big)^{-1}\chi_{A_N(N^{-\xi})}(H)\,dP_{L_N}^{(N)}(H).$$
Note that $N^2/L_N^p=N^{2-p(1/2-\xi)}$, so combining (3.5), (3.7) and (3.11) we obtain the estimate (3.3). From the definition of $A_N(\delta)$ we see that (3.4) holds for $z\in D_N$, but then a straightforward approximation argument extends it to all $z\in\Lambda_{R,\eta}$. This completes the proof of Lemma 3.1.

We now come to the central asymptotic result.

Lemma 3.2. Let $\Lambda_{R,\eta}$ be as above, let $\xi\in(0,1/2]$ and let $K$ be a compact subset of $\mathbb{R}$. Also let $u_N$ be a sequence such that $u_N\to u$ as $N\to\infty$. Furthermore, let $Y_{R,\eta}$ be the set of all $y\in\mathbb{R}^N$ such that
$$\sup_{z\in\Lambda_{R,\eta}}\Big|\frac1N\sum_{j=1}^N\log(z-y_j)-\int_{-1}^1\log(z-t)\sigma(t)\,dt\Big|\le CN^{-\xi} \tag{3.12}$$
for some constant $C$ and all $N\ge1$, where $\sigma(t)$ is given by (3.1). Then, we can find $R_0>0$, $\eta_0>0$ and a constant $C$ such that for all $y\in Y_{R_0,\eta_0}$, $\tau\in K$, $|u|\le\sqrt{1/2+2a^2}$ and $N\ge1$,
$$\Big|\frac{1}{N\rho(u)}K_N\Big(u_N,u_N+\frac{\tau}{N\rho(u)};y\Big)-\frac{\sin\pi\tau}{\pi\tau}\Big|\le C(|u-u_N|+N^{-\xi}), \tag{3.13}$$
where $\rho(u)$ is given by (1.11).

Proof. It follows from the formula (2.20) that
$$\frac{1}{N\rho(u)}K_N\Big(u_N,u_N+\frac{\tau}{N\rho(u)};y\Big)=N\int_\gamma\frac{dz}{2\pi i}\int_\Gamma\frac{dw}{2\pi i}\,h(z,w)g_N(z,w)e^{N(f_N(w)-f_N(z))}, \tag{3.14}$$
where $g_N(z,w)$ is given by (2.21),
$$f_N(z)=\frac{1}{2a^2}(z^2-2u_Nz)+\frac1N\sum_{j=1}^N\log(z-y_j)$$
and
$$h(z,w)=\frac{e^{\omega_0\tau}}{\tau}\big(e^{-\tau w/a^2\rho(u)}-e^{-\tau(w-z)/a^2\rho(u)}\big).$$
We have taken $\omega=\omega_0/N\rho(u)$, where $\omega_0$ is given by (3.23) below. The integral in (3.14) will be analyzed using a saddle point argument. It follows from (3.12) and Cauchy's integral formula that there is a constant $C$ such that for all $N\ge1$, $\tau\in K$, $y\in Y_{R/2,2\eta}$ and $|u|\le\sqrt{1/2+2a^2}$,
$$|f_N'(z)-f'(z)|\le C(N^{-\xi}+|u-u_N|),\qquad|f_N''(z)-f''(z)|\le CN^{-\xi}. \tag{3.15}$$
A computation shows that
$$f'(z)=\frac{1}{a^2}(z-u)+2\big(z-\sqrt{z^2-1}\big).$$
Set $S(w)=(w+1/w)/2$ with inverse $S^{-1}(z)=z+\sqrt{z^2-1}$, where $\sqrt{z^2-1}=\sqrt{z-1}\sqrt{z+1}$ (principal argument). The function $S$ maps $\{|w|>1\}$ to $\mathbb{C}\setminus[-1,1]$ and $|w|=1$ is mapped to $[-1,1]$. Note that
$$f'(S(w))=\frac{w}{2a^2}+\Big(2+\frac{1}{2a^2}\Big)\frac1w-\frac{u}{a^2}.$$
Write $u=\sqrt{1+4a^2}\cos\theta_c$, where $\theta_c\in[0,\pi]$. Our assumption on $u$ means that $|\cos\theta_c|\le1/\sqrt2$. Note that $f'(S(w))=0$ has the solutions $w_c^\pm=\sqrt{1+4a^2}\exp(\pm i\theta_c)$. Hence the critical points for $f$ are $z_c^\pm=S(w_c^\pm)$.

We will now define some contours that we will use. Pick $\delta>0$ (small), see below. Set, for some $\varepsilon>0$ (small),
$\gamma_1^+(t)=S(\sqrt{1+4a^2}e^{i\delta}-t)$, $-\infty<t\le0$,
$\gamma_2^+(t)=S(\sqrt{1+4a^2}e^{it})$, $\delta\le t\le\theta_c-\varepsilon$,
$\gamma_3^+(t)=S(\sqrt{1+4a^2}e^{it})$, $\theta_c-\varepsilon\le t\le\theta_c+\varepsilon$,
$\gamma_4^+(t)=S(\sqrt{1+4a^2}e^{it})$, $\theta_c+\varepsilon\le t\le\pi-\delta$, and
$\gamma_5^+(t)=S(\sqrt{1+4a^2}e^{i(\pi-\delta)}-t)$, $0\le t<\infty$.
Also, set $\gamma_j^-(t)=\overline{\gamma_j^+(t)}$, $1\le j\le5$. Then, we can take $\gamma=\sum_{j=1}^5(\gamma_j^+-\gamma_j^-)=\gamma^+-\gamma^-$ in (3.14). Let $t_0\in(1/\sqrt{1+4a^2},1)$ be such that $\operatorname{Im}S(t_0w_c^+)=\eta$, and write $\alpha=\operatorname{Re}S(t_0w_c^+)$. Set, for some $\varepsilon>0$ (small),
$\Gamma_1^+(t)=\alpha+it$, $0\le t\le\eta$,
$\Gamma_2^+(t)=S(tw_c^+)$, $t_0\le t\le1-\varepsilon$,
$\Gamma_3^+(t)=S(tw_c^+)$, $1-\varepsilon\le t\le1+\varepsilon$, and
$\Gamma_4^+(t)=S(tw_c^+)$, $1+\varepsilon\le t$.
Also, set $\Gamma_j^-(t)=\overline{\Gamma_j^+(t)}$, $1\le j\le4$. We can then take $\Gamma=\sum_{j=1}^4(\Gamma_j^+-\Gamma_j^-)=\Gamma^+-\Gamma^-$ in (3.14). Set
$$L_N^{bd}(\tau;y)=N\int_{\gamma_3^b}\frac{dz}{2\pi i}\int_{\Gamma_3^d}\frac{dw}{2\pi i}\,h(z,w)g_N(z,w)e^{N(f_N(w)-f_N(z))}, \tag{3.16}$$
where $b,d\in\{+,-\}$ and write $L_N=L_N^{++}-L_N^{+-}-L_N^{-+}+L_N^{--}$.
Claim. We can choose $R_0>0$, $\eta_0>0$ and $\varepsilon,\delta>0$, so that $\gamma_3^++\gamma_3^-+\Gamma_3^++\Gamma_3^-$ lies in a neighbourhood of $z_c^\pm$ which is included in $\Lambda_{R_0/2,2\eta_0}$ and for all $N\ge1$, $\tau\in K$, $y\in Y_{R/2,2\eta}$ and $|u|\le\sqrt{1/2+2a^2}$,
$$\Big|\frac{1}{N\rho(u)}K_N\Big(u_N,u_N+\frac{\tau}{N\rho(u)};y\Big)-L_N(\tau;y)\Big|\le Ce^{-cN} \tag{3.17}$$
with $c>0$.

The claim will be proved below. We will now use the claim to finish the proof of Lemma 3.2. It follows from (3.15) that there are critical points $z_N^\pm=S(w_N^\pm)$ for $f_N(z)$ such that
$$|z_N^\pm-z_c^\pm|\le C(N^{-\xi}+|u-u_N|). \tag{3.18}$$
We can deform $\gamma_3^\pm$ ($\Gamma_3^\pm$) into contours $\gamma_N^\pm$ ($\Gamma_N^\pm$) such that the endpoints are unchanged, $\gamma_N^\pm(0)=\Gamma_N^\pm(0)=z_N^\pm$ and $\gamma_N^\pm$ ($\Gamma_N^\pm$) have $C^1$-distance $\le C(N^{-\xi}+|u-u_N|)$ to $\gamma_3^\pm$ ($\Gamma_3^\pm$). We can also assume that these contours are chosen so that $\gamma_N^\pm(t)=S(w_N^\pm e^{\pm it})$ and $\Gamma_N^\pm(t)=S(w_N^\pm(1+t))$ for $|t|\ll\varepsilon$. We can now proceed in the standard way with a local saddle point argument in (3.16) and prove that there is a constant $C$ such that
$$\Bigg|L_N^{bd}(\tau;y)-h(z_N^b,z_N^d)g_N(z_N^b,z_N^d)\frac{2\pi}{(2\pi i)^2}\frac{(\gamma_N^b)'(0)(\Gamma_N^d)'(0)\,e^{N(f_N(z_N^d)-f_N(z_N^b))}}{\sqrt{f_N''(z_N^b)(\gamma_N^b)'(0)^2}\sqrt{-f_N''(z_N^d)(\Gamma_N^d)'(0)^2}}\Bigg|\le\frac{C}{\sqrt N} \tag{3.19}$$
for all $N\ge1$, $\tau\in K$, $y\in Y_{R_0,\eta_0}$ and $|u|\le\sqrt{1/2+2a^2}$. Note that $z_N^+=\overline{z_N^-}$ and $f_N(z_N^+)-f_N(z_N^-)$ is purely imaginary. Now, $(\gamma_N^b)'(0)=biS'(w_N^b)w_N^b$, $(\Gamma_N^b)'(0)=S'(w_N^b)w_N^b$ and a computation shows that
$$f_N''(z_N^b)(\gamma_N^b)'(0)^2=-f_N''(z_N^b)(\Gamma_N^b)'(0)^2=-f_N''(z_N^b)S'(w_N^b)^2(w_N^b)^2,$$
which has a positive real part by (3.15) and the fact that $-f''(z_c^b)S'(w_c^b)^2(w_c^b)^2$ has a positive real part. From (2.21) we see that $g_N(z_N^b,z_N^d)=0$ if $b\ne d$ and $g_N(z_N^b,z_N^b)=f_N''(z_N^b)$. It follows that
$$\frac{g_N(z_N^b,z_N^b)(\gamma_N^b)'(0)(\Gamma_N^b)'(0)}{\sqrt{f_N''(z_N^b)(\gamma_N^b)'(0)^2}\sqrt{-f_N''(z_N^b)(\Gamma_N^b)'(0)^2}}=-bi.$$
Also, from (3.18) it follows that $|h(z_N^b,z_N^b)-h(z_c^b,z_c^b)|\le C(N^{-\xi}+|u-u_N|)$, and thus (3.19) yields
$$|L_N^{bd}(\tau;y)|\le\frac{C}{\sqrt N} \tag{3.20}$$
if $b\ne d$ and
$$\Big|L_N^{bb}(\tau;y)+\frac{b\,h(z_c^b,z_c^b)}{2\pi i}\Big|\le C(N^{-\xi}+|u-u_N|). \tag{3.21}$$
Combining (3.16), (3.20) and (3.21) we obtain
$$\Big|L_N(\tau;y)+\frac{h(z_c^+,z_c^+)-h(z_c^-,z_c^-)}{2\pi i}\Big|\le C(N^{-\xi}+|u-u_N|). \tag{3.22}$$
Now
$$h(z_c^\pm,z_c^\pm)=\frac{e^{\omega_0\tau}}{\tau}\big(e^{-\tau z_c^\pm/a^2\rho(u)}-1\big),$$
and a computation shows that
$$\frac{z_c^\pm}{a^2\rho(u)}=\pi\frac{1+2a^2}{2a^2}\cot\theta_c\pm\pi i\doteq\omega_0\pm\pi i. \tag{3.23}$$
Thus (3.22) becomes
$$\Big|L_N(\tau;y)-\frac{\sin\pi\tau}{\pi\tau}\Big|\le C(N^{-\xi}+|u-u_N|).$$
If we combine this estimate with (3.17) we see that the lemma is proved.
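Two of the computations used in the proof of Lemma 3.2 lend themselves to a direct numerical check: that $z_c^\pm=S(\sqrt{1+4a^2}e^{\pm i\theta_c})$ are critical points of $f$, and that $z_c^\pm/a^2\rho(u)$ has real part $\pi\frac{1+2a^2}{2a^2}\cot\theta_c$ and imaginary part $\pm\pi$, as in (3.23). The sketch below assumes the formulas for $f'$, $S$ and $\rho$ as written in this section; the sample values of $a$ and $\theta_c$ are arbitrary.

```python
import cmath
import math

def S(w):
    """The map S(w) = (w + 1/w) / 2."""
    return 0.5 * (w + 1.0 / w)

def f_prime(z, u, a):
    """f'(z) = (z - u)/a^2 + 2 (z - sqrt(z^2 - 1)), sqrt(z^2-1) = sqrt(z-1) sqrt(z+1)."""
    root = cmath.sqrt(z - 1.0) * cmath.sqrt(z + 1.0)
    return (z - u) / a ** 2 + 2.0 * (z - root)

def rho(u, a):
    """Semicircle density (1.11)."""
    r2 = 1.0 + 4.0 * a * a
    return 2.0 / (math.pi * r2) * math.sqrt(max(r2 - u * u, 0.0))

def critical_point(theta_c, a, sign=1):
    """z_c^{+/-} = S(sqrt(1 + 4 a^2) exp(+/- i theta_c))."""
    return S(math.sqrt(1.0 + 4.0 * a * a) * cmath.exp(sign * 1j * theta_c))

def omega_0(theta_c, a):
    """Real part in (3.23): pi (1 + 2 a^2) / (2 a^2) cot(theta_c)."""
    return math.pi * (1.0 + 2.0 * a * a) / (2.0 * a * a) / math.tan(theta_c)

if __name__ == "__main__":
    a, theta_c = 0.6, 1.1
    u = math.sqrt(1.0 + 4.0 * a * a) * math.cos(theta_c)
    for sign in (1, -1):
        zc = critical_point(theta_c, a, sign)
        print(sign, abs(f_prime(zc, u, a)), zc / (a * a * rho(u, a)), omega_0(theta_c, a))
```

With $\operatorname{Im}(z_c^\pm/a^2\rho(u))=\pm\pi$ the difference $h(z_c^+,z_c^+)-h(z_c^-,z_c^-)$ equals $-2i\sin(\pi\tau)/\tau$, which is how the sine kernel emerges from (3.22).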
It remains to prove the claim.

Proof. Let $\gamma_*^\pm=\sum_{j\ne3}\gamma_j^\pm$ and $\Gamma_*^\pm=\sum_{j\ne3}\Gamma_j^\pm$. We have to estimate
$$I_1^{bd}=N\int_{\gamma_*^b}|dz|\int_{\Gamma^d}|dw|\,|h(z,w)||g_N(z,w)|e^{N\operatorname{Re}(f_N(w)-f_N(z_c^d))-N\operatorname{Re}(f_N(z)-f_N(z_c^b))},$$
and
$$I_2^{bd}=N\int_{\gamma^b}|dz|\int_{\Gamma_*^d}|dw|\,|h(z,w)||g_N(z,w)|e^{N\operatorname{Re}(f_N(w)-f_N(z_c^d))-N\operatorname{Re}(f_N(z)-f_N(z_c^b))},$$
where $b,d\in\{+,-\}$. Note that $f_N(z_c^+)-f_N(z_c^-)$ is purely imaginary. We will concentrate on $I_1^{++}$ since the other cases are similar. Using the inequality
$$\Big|\frac{w-y_j}{z-y_j}\Big|=\Big|1+\frac{w-z}{z-y_j}\Big|\le1+C(|w|+|z|)$$
it is not difficult to see that there are constants $C_1$ and $C_2$ such that
$$|h(z,w)||g_N(z,w)|e^{N\operatorname{Re}(f_N(w)-f_N(z))}\le C_1e^{C_2N(|z|+|w|)+N(\operatorname{Re}(w^2-2uw)-\operatorname{Re}(z^2-2uz))/2a^2} \tag{3.24}$$
for all $y\in\mathbb{R}^N$, $\tau\in K$ and $|u|\le\sqrt{1/2+2a^2}$. Note that $|\operatorname{Im}z|\ge c>0$ for all $z\in\gamma$. (The constant $c$ depends on the $\delta$ in the definition of $\gamma$, but as we will see below $\delta$ depends only on the parameter $a$ in the problem.) From the estimate (3.24) it follows that by picking $R=R_0$ sufficiently large, the contribution to $I_1^{++}$ from $z$ and/or $w$ outside $\Lambda_{R_0,0}$ is $\le e^{-N}$. Thus we can assume that $z,w\in\Lambda_{R_0,0}$. Next, we will derive the other estimates we will need to prove the claim. Assume that $z\in\Lambda_{R_0,\eta}$ and $w\in\Gamma_1^+$. Then,
$$|g_N(z,w)e^{Nf_N(w)}|\le C\Big(1+\frac1N\sum_{k=1}^N\frac{1}{|w-y_k|}\Big)\prod_{j=1}^N|w-y_j|\,e^{N\operatorname{Re}(w^2-2uw)/2a^2}$$
$$\le C\Big(1+\frac1N\sum_{j=1}^N\frac{1}{|\alpha+i\eta-y_j|}\Big)\prod_{j=1}^N|\alpha+i\eta-y_j|\,e^{N\operatorname{Re}(w^2-2uw)/2a^2}$$
$$\le Ce^{N[\operatorname{Re}f_N(\alpha+i\eta)+(\operatorname{Re}(w^2-2uw)-\operatorname{Re}((\alpha+i\eta)^2-2u(\alpha+i\eta)))/2a^2]}.$$
If we use (3.12) and the definition of $f_N$ we obtain
$$|g_N(z,w)e^{N(f_N(w)-f_N(z_c^+))}|\le Ce^{cN(N^{-\xi}+|u-u_N|)+N\eta^2/2a^2+N\operatorname{Re}(f(\alpha+i\eta)-f(z_c^+))/2a^2} \tag{3.25}$$
for $z\in\Lambda_{R_0,\eta}$ and $w\in\Gamma_1^+$. We will now compute how $\operatorname{Re}f(z)$ changes along $\gamma$. Assume that $\theta_c\ge0$, the other case is analogous. Consider $\gamma(\theta)=S(\sqrt{1+4a^2}e^{i\theta})$, $\delta\le\theta\le\pi-\delta$. A computation, using the fact that $f'(\gamma(\theta_c))=0$, gives
$$\operatorname{Re}\frac{d}{d\theta}f(\gamma(\theta))=\frac{1+2a^2}{2a^2}\sin\theta(\cos\theta_c-\cos\theta).$$
From this we see that there is a constant $c_0>0$ such that
$$\operatorname{Re}\big(f(S(\sqrt{1+4a^2}e^{i\theta}))-f(z_c^+)\big)\ge c_0(\theta-\theta_c)^2. \tag{3.26}$$
Next, consider $\gamma_1(t)=S(\sqrt{1+4a^2}e^{i\delta}-t)$, $t\le0$. If we write $\omega_\delta=\sqrt{1+4a^2}e^{i\delta}$, then
$$\frac{d}{dt}f(\gamma_1(t))=-\frac{1}{4a^2}\Big[\omega_\delta-t-2u+\frac{1+4a^2}{\omega_\delta-t}\Big]\Big[1-\frac{1}{(\omega_\delta-t)^2}\Big].$$
Set $\omega_\delta-t=s(t)e^{i\theta(t)}$. A computation shows that
$$\operatorname{Re}\frac{d}{dt}f(\gamma_1(t))=-\frac{1}{4a^2\sqrt{1+4a^2}}\Big[\Big(\Big(s(t)+\frac{1}{s(t)}\Big)\cos\theta(t)-2\cos\theta_c\Big)\Big(1+4a^2-\frac{1}{s(t)^2}\cos2\theta(t)\Big)-\frac{1}{s(t)^2}\sin2\theta(t)\Big(s(t)-\frac{1}{s(t)}\Big)\sin\theta(t)\Big]. \tag{3.27}$$
Note that $\sin\theta(t)=s(t)^{-1}\sqrt{1+4a^2}\sin\delta$. It follows that the right-hand side of (3.27) equals
$$-\frac{1}{4a^2\sqrt{1+4a^2}}\Big[\Big(\Big(s(t)+\frac{1}{s(t)}\Big)\Big(1+4a^2-\frac{1}{s(t)^2}+2\frac{(1+4a^2)\sin^2\delta}{s(t)^4}\Big)-2\frac{(1+4a^2)\sin^2\delta}{s(t)^4}\Big(s(t)-\frac{1}{s(t)}\Big)\Big)\cos\theta(t)-2\Big(1+4a^2-\frac{1}{s(t)^2}+2\frac{(1+4a^2)\sin^2\delta}{s(t)^4}\Big)\cos\theta_c\Big] \tag{3.28}$$
and this is
$$\le-\frac{1}{4a^2\sqrt{1+4a^2}}\Big(1+4a^2-\frac{1}{s(t)^2}\Big)\Big[\Big(s(t)+\frac{1}{s(t)}\Big)\cos\theta(t)-2\Big(1+\frac{1+4a^2}{2a^2}\sin^2\delta\Big)\cos\theta_c\Big],$$
since $s(t)\ge1$. Choose $\delta\le\theta_c/4$ so that
$$\Big(1+\frac{1+4a^2}{2a^2}\sin^2\delta\Big)\cos\theta_c\le\cos\frac{\theta_c}{2}.$$
Since s(t) + 1/s(t) ≥ 2 and θ(t) ≤ δ we see that there is a constant c0 > 0 such that Re
d f (γ1 (t)) ≤ −c0 . dt
(3.29)
√ For γ5 (t) = S( 1 + 4a 2 ei(π−δ) − t), t ≥ 0, we still have the formula (3.28) with γ1 (t) replaced by γ5 (t) and, since π − δ ≤ θ(t) ≤ π , we see that the right-hand side is ≥√
1 1 + 4a 2
[(s(t) +
1 ) cos(π − θ(t)) + 2 cos θc ] s(t)
(3.30)
and consequently there is a constant c0 > 0 such that Re
d f (γ5 (t)) ≥ c0 . dt
(3.31)
Consider now how Re f (w) changes along ? + . Set ?(t) = S(twc+ ), t ≥ t0 . A computation gives d 1−t 1 f (S(twc )) = 2 2 [1 + t (1 + 4a 2 ) − (t 2 (1 + 4a 2 ) + ) cos 2θc ]. dt 2a t t Now, since |u| ≤ 1/2 + 2a 2 , it follows that cos 2θc ≤ 0 and thus Re
d f (S(twc )) ≥ dt d Re f (S(twc )) ≤ dt
Re
1−t (1 + t (1 + 4a 2 )) 2a 2 t 2 1−t (1 + t (1 + 4a 2 )) 2a 2 t 2
if t0 ≤ t ≤ 1, if t ≥ 1.
(3.32)
The first of these estimates can be used to show that if we pick $\eta = \eta_0$ sufficiently small, then $\eta^2 + \mathrm{Re}(f(\alpha + i\eta) - f(z_c^+)) \le -c_0$ for some positive $c_0$. If we use this in (3.25) we obtain
\[ |g_N(z,w)\,e^{N(f_N(w) - f_N(z_c^+))}| \le Ce^{-c_0N} \tag{3.33} \]
for some positive $c_0$. We can now use (3.26), (3.29), (3.31), (3.32) and (3.33) to estimate $I_1^{++}$ and see that it is $\le Ce^{-cN}$ for some positive $c$.

4. Proof of the Theorems

We start with the proof of Theorem 1.2.

Proof. By Proposition 1.1 and Fubini's theorem the integral in the left-hand side of (1.14) can be written
\[ \int_{\mathcal{H}_N} \int_{\mathbb{R}^N} \rho_N(x, y(H))\,(Sf)\bigl(N\rho(u)(x_1 - u), \dots, N\rho(u)(x_N - u)\bigr)\,d^Nx\,dP^{(N)}(H). \tag{4.1} \]
Note that $\|S(f)\|_\infty \le N^m\|f\|_\infty$. Since $\rho_N(x,\cdot)$ is a probability density on $\mathbb{R}^N$ we can use Lemma 3.1 to replace the expression in (4.1) by
\[ \int_{\mathcal{H}_N} \int_{\mathbb{R}^N} \rho_N(x, y(H))\,(Sf)\bigl(N\rho(u)(x_1 - u), \dots, N\rho(u)(x_N - u)\bigr)\,d^Nx\,d\tilde{P}^{(N)}(H) \tag{4.2} \]
with an error $\le CN^m\|f\|_\infty N^{2-p(1/2-\xi)} = o(1)$, since $p > 2(m+2)$, provided we choose $\xi$ small enough. Now, since $\rho_N(x,\cdot)$ is symmetric it follows from (1.13), (2.4), (2.14) and (2.19) that the expression in (4.2) can be written
\[ \int_{\mathcal{H}_N} \int_{\mathbb{R}^m} f(t_1, \dots, t_m) \times \det\Bigl( \frac{1}{N\rho(u)} K\Bigl(u + \frac{t_i}{N\rho(u)}, u + \frac{t_j}{N\rho(u)}; y(H)\Bigr) \Bigr)_{i,j=1}^m d^mt\,d\tilde{P}^{(N)}(H). \tag{4.3} \]
Since $f$ has compact support and we know that (3.4) holds a.s. $[\tilde{P}^{(N)}]$ it follows from Lemma 3.2, with $u_N = u + t_i/N\rho(u)$, $\tau = t_j - t_i$, that
\[ \Bigl| \frac{1}{N\rho(u)} K\Bigl(u + \frac{t_i}{N\rho(u)}, u + \frac{t_j}{N\rho(u)}; y(H)\Bigr) - \frac{\sin\pi(t_i - t_j)}{\pi(t_i - t_j)} \Bigr| \le CN^{-\xi}, \]
for a.a. $[\tilde{P}^{(N)}]$ and all $(t_1, \dots, t_m)$ in the support of $f$. Thus we can take the limit as $N \to \infty$ in (4.3) and obtain the right-hand side of (1.14). This completes the proof.

Before proving Theorem 1.3 we need some preliminary results on the level spacing distribution. Let $\rho_N(x)$ be a symmetric probability density on $\mathbb{R}^N$ with correlation functions defined by (1.1). Assume that $R_1^{(N)}/N \to \rho(t)$ (weakly) as $N \to \infty$, so that $\rho(t)$ is the asymptotic density. Let $u$ be a given point such that $\rho(u) > 0$, and let $t_N$ be a sequence such that $t_N \to \infty$ but $t_N/N \to 0$ as $N \to \infty$. Set, for $|r| \le 1/2$,
\[ \mathcal{R}_m^{(N)}(\sigma_1, \dots, \sigma_m; r) = \frac{1}{(N\rho(u))^m} R_m^{(N)}\Bigl(u + \frac{2t_Nr + \sigma_1}{N\rho(u)}, \dots, u + \frac{2t_Nr + \sigma_m}{N\rho(u)}\Bigr) \]
and let $R_m(\sigma_1, \dots, \sigma_m)$ be the limiting correlation functions, which we assume are continuous, symmetric and translation invariant. Assume that, for each $s \ge 0$,
\[ D_N(s) = \sum_{m=N+1}^\infty \frac{s^m}{m!} \sup_{|\sigma_j| \le s} |R_m(\sigma_1, \dots, \sigma_m)| < \infty. \tag{4.4} \]
Set
\[ H(s) = \sum_{m=0}^\infty \frac{(-1)^m}{m!} \int_{[0,s]^m} R_m(\sigma_1, \dots, \sigma_m)\,d^m\sigma \]
(the probability of no particle in $[0,s]$), which is well defined by (4.4). Also, set
\[ \epsilon_m^{(N)} = \sup_{|\sigma_j| \le s,\ |r| \le 1/2} |\mathcal{R}_m^{(N)}(\sigma_1, \dots, \sigma_m; r) - R_m(\sigma_1, \dots, \sigma_m)|. \tag{4.5} \]
Proposition 4.1. Let $S_N(s,x)$ be defined by (1.5). Then
\[ \Bigl| \int_{\mathbb{R}^N} S_N(s,x)\rho_N(x)\,d^Nx - \int_0^s H(u)\,du \Bigr| \le D_N(s) + \sum_{m=2}^N \frac{s^{m-1}}{(m-1)!}\,\epsilon_m^{(N)}. \tag{4.6} \]
Proof. We first show that
s
H (u)du =
0
N m=2
s m−1 (m − 1)!
[0,s]m−1
Rm (0, τ2 , . . . , τm )dτ2 . . . dτm ,
(4.7)
see [8]. Since Rm is translation invariant and symmetric by assumption, we have ∞ 1 (−1)m H (u) = lim Rm (x1 , . . . , xm )d m x m \[0,u]m 0→0 0 m! [−0,u] m=0 ∞ (−1)m 1 m = lim m Rm (x1 , . . . , xm )d x 0→0 m! 0 [−0,0]×[0,u]m−1 m=0 ∞ (−1)m = Rm (0, x2 , . . . , xm )d m−1 x, (4.8) (m − 1)! [0,u]m−1 m=1
where we have also used (4.4) and the continuity of Rm . Continuing in the same way we see that H (u) is actually a C ∞ function, in particular H (u) is well defined and continuous. From (4.8) we get ∞ (−1)m H (s) = −Rm (0) + Rm (0, x2 , . . . , xm )d m−1 x. (m − 1)! [0,s]m−1 m=2
H (0)
Hence = −Rm (0) and we see that the right-hand side of (4.7) equals H (u) − H (0), which is what we wanted to prove. It is proved in [8], using a result from [18], that SN (s, x)ρN (x)d N x RN
=
1/2 N (−1)m m−1 dr R(N) σ. m (0, σ2 , . . . , σm ; r)d m−1 (m − 1)! −1/2 [0,min(s,(1−2r)tN )]
m=2
Hence, the estimate (4.6) follows from (4.4), (4.5) and (4.7).
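We turn now to the proof of Theorem 1.3.

Proof. Just as in the proof of Theorem 1.2 above we see that since $P \in W^{6+\epsilon}$ and $\|S_N\|_\infty \le N/2t_N$,
\[ \Bigl| \int_{\mathcal{H}_N} S_N(s, x(M))\,dQ^{(N)}(M) - \int_{\mathcal{H}_N} \int_{\mathbb{R}^N} S_N(s,x)\rho_N(x; y(H))\,d^Nx\,d\tilde{P}^{(N)}(H) \Bigr| \le C\,\frac{N}{t_N}\,N^{2-(6+\epsilon)(1/2-\xi)} \le \frac{C}{t_N}, \tag{4.9} \]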
if we take $\xi$ sufficiently small, and also that (3.4) holds. From Proposition 1.1, (2.4) and Proposition 2.3 we know the correlation functions of $\rho_N(x; y)$, and if we take $u_N = u + (2t_Nr + \sigma_i)(N\rho(u))^{-1}$ in Lemma 3.2 we see that
\[ \Bigl| \frac{1}{N\rho(u)} K\Bigl(u + \frac{2t_Nr + \sigma_i}{N\rho(u)}, u + \frac{2t_Nr + \sigma_j}{N\rho(u)}; y(H)\Bigr) - \frac{\sin\pi(\sigma_i - \sigma_j)}{\pi(\sigma_i - \sigma_j)} \Bigr| \le C\Bigl(\frac{t_N}{N} + N^{-\xi}\Bigr) \doteq \omega_N \tag{4.10} \]
for a.a. $H$ $[\tilde{P}^{(N)}]$. Thus, the limiting correlation functions are
\[ R_m(\sigma_1, \dots, \sigma_m) = \det\Bigl( \frac{\sin\pi(\sigma_i - \sigma_j)}{\pi(\sigma_i - \sigma_j)} \Bigr)_{i,j=1}^m. \]
Since the matrix in the determinant is positive definite it follows from the Hadamard inequality that
\[ D_N(s) \le \sum_{m=N+1}^\infty \frac{s^m}{m!}. \]
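The Hadamard step can be illustrated numerically. The sketch below is not part of the original argument: it builds the sine-kernel matrix for a handful of arbitrarily chosen sample points and checks that its determinant lies in $[0,1]$, which is the bound $\sup |R_m| \le 1$ used to estimate $D_N(s)$. The helper names `sinc_pi`, `sine_kernel_matrix` and `det` are hypothetical, and the determinant routine is a plain Gaussian elimination chosen only to keep the example self-contained.

```python
import math

def sinc_pi(x):
    """sin(pi x)/(pi x), with the removable singularity at x = 0 filled in."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def sine_kernel_matrix(points):
    """Matrix with entries sin(pi(s_i - s_j)) / (pi(s_i - s_j))."""
    return [[sinc_pi(a - b) for b in points] for a in points]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

# Sample points are an arbitrary illustration, not taken from the paper.
sample_points = [0.0, 0.3, 1.1, 1.7, 2.6]
d = det(sine_kernel_matrix(sample_points))
hadamard_bound = 1.0  # product of the diagonal entries, all equal to 1
print(d, 0.0 <= d <= hadamard_bound)
```

Since every diagonal entry equals 1, Hadamard's inequality for a positive semidefinite matrix bounds the determinant by 1 whatever the points are; the code only makes this visible for one sample.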
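Also, since
\[ \mathcal{R}_m^{(N)}(\sigma_1, \dots, \sigma_m; y) = \det\Bigl( \frac{1}{N\rho(u)} K\Bigl(u + \frac{2t_Nr + \sigma_i}{N\rho(u)}, u + \frac{2t_Nr + \sigma_j}{N\rho(u)}; y\Bigr) \Bigr)_{i,j=1}^m \]
it follows from (4.10), the multilinearity of the determinant and Hadamard's inequality that
\[ |\mathcal{R}_m^{(N)}(\sigma; y) - R_m(\sigma)| \le m(1 + \omega_N)^{m-1}\omega_N m^{m/2}, \]
and hence $\epsilon_m^{(N)} \le m(1 + \omega_N)^{m-1}\omega_N m^{m/2}$. Now, by Proposition 4.1, Stirling's formula and the fact that $\omega_N \to 0$,
\[ \Bigl| \int_{\mathbb{R}^N} S_N(s,x)\rho_N(x; y(H))\,d^Nx - \int_0^s H(u)\,du \Bigr| \le \sum_{m=N+1}^\infty \frac{s^m}{m!} + \omega_N \sum_{m=2}^N \frac{s^m}{(m-1)!}(1 + \omega_N)^{m-1} m^{(m+2)/2} = o(1) \tag{4.11} \]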
as N → ∞, for a.a. H [P˜ (N) ]. If we combine (4.9) and (4.11) we see that the theorem is proved. Note added in proof. A different approach to the formula (1.13) using supersymmetry techniques is given in T. Guhr, “Transitions toward Quantum Chaos: With Supersymmetry from Poisson to Gauss”, Ann. Phys. 250 (1996), 145–192.
References

1. Andréief, C.: Note sur une relation les integrales définies des produits de fonctions. Mém. de la Soc. Bordeaux 2, 1–14 (1883)
2. Bai, Z.D.: Methodologies in spectral analysis of large dimensional random matrices: A review. Statistica Sinica 9, 611–661 (1999)
3. Bleher, P., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann–Hilbert problem, and universality in the matrix model. Ann. Math. 150, 185–266 (1999)
4. Borodin, A.: Biorthogonal ensembles. Nucl. Phys. B 536, 704–732 (1999)
5. Brézin, E., Hikami, S.: Correlations of nearby levels induced by a random potential. Nucl. Phys. B 479, 697–706 (1996)
6. Brézin, E., Hikami, S.: Spectral form factor in a random matrix theory. Phys. Rev. E 55, 4067–4083 (1997)
7. Brézin, E., Hikami, S.: An extension of level-spacing universality. cond-mat/9702213
8. Deift, P., Kriecherbauer, T., McLaughlin, K.T.-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure. Appl. Math. 52, 1335–1425 (1999)
9. Dyson, F.J.: A Brownian-motion Model for the Eigenvalues of a Random Matrix. J. Math. Phys. 3, 1191–1198 (1962)
10. Grabiner, D.J.: Brownian motion in a Weyl chamber, non-colliding particles and random matrices. Ann. Inst. H. Poincaré 35, 177–204 (1999)
11. Guionnet, A., Zeitouni, O.: Concentration of the spectral measure for large matrices. Preprint (2000)
12. Harish-Chandra: Differential operators on a semisimple Lie algebra. Amer. J. Math. 79, 87–120 (1957)
13. Hobson, D.G., Werner, W.: Non-colliding Brownian motions on the circle. Bull. London Math. Soc. 28, 543–650 (1996)
14. Itzykson, C., Zuber, J.-B.: The planar approximation II. J. Math. Phys. 21, 411–421 (1980)
15. Johansson, K.: Random growth and Random matrices. To appear in the Proceedings of the third European Congress of Mathematics
16. Johansson, K.: Non-intersecting paths, random tilings and random matrices. In preparation
17. Karlin, S., McGregor, G.: Coincidence probabilities. Pacific J. Math. 9, 1141–1164 (1959)
18. Katz, N.M., Sarnak, P.: Random Matrices, Frobenius Eigenvalues and Monodromy. AMS Colloquium Publications, Vol. 45, 1999
19. Khorunzhy, A.: On smoothed density of states for Wigner random matrices. Random Oper. and Stoch. Equ. 5, 147–162 (1997)
20. Khorunzhy, A., Khoruzhenko, B.A., Pastur, L.A.: On asymptotic properties of large random matrices with independent entries. J. Math. Phys. 37, 5033–5060 (1996)
21. Mehta, M.L.: Random Matrices. 2nd ed., San Diego: Academic Press, 1991
22. Okounkov, A.: Infinite wedge and measures on partitions. math.RT/9907127
23. Pastur, L.A., Shcherbina, M.: Universality of the local eigenvalue statistics for a class of unitary invariant random matrix ensembles. J. Stat. Phys. 86, 109–147 (1997)
24. Pauwels, E.J., Rogers, L.G.G.: Skew-product decompositions of Brownian motions. In: Geometry of Random Motion, R. Durrett, M. A. Pinsky, eds., Providence, RI: AMS Contemporary Mathematics, Vol. 73, 1988
25. Pinsky, R.G.: On the convergence of diffusion processes conditioned to remain in a bounded region for large time to limiting positive recurrent diffusion processes. Ann. Prob. 13, 363–378 (1985)
26. Porter, C.E., ed.: Statistical Theories of spectra: Fluctuations. New York: Academic Press, 1965
27. Rains, E.: Correlation functions for symmetrized increasing subsequences. math.CO/0006097 (2000)
28. Soshnikov, A.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207, 697–733 (1999)
29. Sinai, Ya., Soshnikov, A.: Central limit theorem for traces of large random symmetric matrices with independent matrix elements. Bol. Soc. Brasil. Mat. 29, 1–24 (1998)
30. Sinai, Ya., Soshnikov, A.: A refinement of Wigner's semicircle law in a neighborhood of the spectrum edge for symmetric matrices. Funct. Anal. Appl. 32, 114–131 (1998)
31. Tracy, C.A., Widom, H.: Correlation Functions, Cluster Functions, and Spacing Distributions for Random Matrices. J. Statist. Phys. 92, 809–835 (1998)

Communicated by P. Sarnak
Commun. Math. Phys. 215, 707 (2001)
Communications in
Mathematical Physics
© Springer-Verlag 2001
Erratum
Homotopy Classes for Stable Periodic and Chaotic Patterns in Fourth-Order Hamiltonian Systems W. D. Kalies1 , J. Kwapisz2 , J. B. VandenBerg3 , R. C. A. M. VanderVorst3,4, 1 Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA.
E-mail: [email protected]
2 Department of Mathematical Sciences, Montana State University-Bozeman, Bozeman, MT 59717-2400,
USA. E-mail: [email protected] 3 Department of Mathematical Sciences, University of Leiden, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands. E-mail: [email protected]; [email protected] 4 CDSNS, Georgia Institute of Technology, Atlanta, GA 30332, USA Received: 30 November 2000 / Accepted: 4 December 2000 Commun. Math. Phys. 214, 573–592 (2000)
Due to an unfortunate error the bibliographical cross-references in the text were incorrectly represented. Following this page the complete paper is printed again.
This work was supported by grants ARO DAAH-0493G0199 and NIST G-06-605.
Commun. Math. Phys. 214, 573 – 592 (2000)
Communications in
Mathematical Physics
© Springer-Verlag 2000
Homotopy Classes for Stable Periodic and Chaotic Patterns in Fourth-Order Hamiltonian Systems W. D. Kalies1 , J. Kwapisz2 , J. B. VandenBerg3 , R. C. A. M. VanderVorst3,4, 1 Department of Mathematical Sciences, Florida Atlantic University, Boca Raton, FL 33431, USA.
E-mail: [email protected]
2 Department of Mathematical Sciences, Montana State University-Bozeman, Bozeman, MT 59717-2400,
USA. E-mail: [email protected]
3 Department of Mathematical Sciences, University of Leiden, Niels Bohrweg 1, 2333 CA Leiden,
The Netherlands. E-mail: [email protected]; [email protected]
4 CDSNS, Georgia Institute of Technology, Atlanta, GA 30332, USA
Received: 6 April 1999 / Accepted: 2 May 2000
Abstract: We investigate periodic and chaotic solutions of Hamiltonian systems in $\mathbb{R}^4$ which arise in the study of stationary solutions of a class of bistable evolution equations. Under very mild hypotheses, variational techniques are used to show that, in the presence of two saddle-focus equilibria, minimizing solutions respect the topology of the configuration plane punctured at these points. By considering curves in appropriate covering spaces of this doubly punctured plane, we prove that minimizers of every homotopy type exist and characterize their topological properties.

1. Introduction

This work is a continuation of [7] where we developed a constrained minimization method to study heteroclinic and homoclinic local minimizers of the action functional
\[ J_I[u] = \int_I j(u, u', u'')\,dt = \int_I \Bigl( \frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 + F(u) \Bigr)\,dt, \tag{1.1} \]
which are solutions of the equation
\[ \gamma u'''' - \beta u'' + F'(u) = 0 \tag{1.2} \]
with $\gamma, \beta > 0$. This equation with a double-well potential $F$ has been proposed in connection with certain models of phase transitions. For brevity we will omit a detailed background of this problem and refer only to those sources required in the proofs of the results. A more extensive history and reference list are provided in [7], to which we refer the interested reader. The above equation is Hamiltonian with
\[ H = -\gamma u'u''' + \frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 - F(u). \tag{1.3} \]
This work was supported by grants ARO DAAH-0493G0199 and NIST G-06-605.
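As a consistency check on (1.3), one can differentiate $H$ along a solution of (1.2). The short computation below is a sketch that assumes $u \in C^4$ and uses only the product rule; the two terms $\mp\gamma u''u'''$ cancel and the remainder factors through the left-hand side of (1.2).

```latex
\begin{align*}
\frac{dH}{dt}
  &= -\gamma u''u''' - \gamma u'u'''' + \gamma u''u''' + \beta u'u'' - F'(u)\,u' \\
  &= -u'\bigl(\gamma u'''' - \beta u'' + F'(u)\bigr) = 0 .
\end{align*}
```

Hence $H$ is constant along solutions of (1.2).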
The configuration space of the system is the (u, u )-plane, and solutions to (1.2) can be represented as curves in this plane. Initially these curves do not appear to be restricted in any way. However, the central idea presented here is that, when (±1, 0) are saddle-foci, the minimizers of J respect the topology of this plane punctured at these two points, which allows for a rich set of minimizers to exist. Using the topology of the doublypunctured plane and its covering spaces, we describe the structure of all possible types of minimizers, including those which are periodic and chaotic. Since the action of the minimizers of these latter types is infinite, a different notion of minimizer is required that is reminiscent of the minimizing (Class A) geodesics of Morse [11]. Such minimizers have been intensively studied in the context of geodesic flows on compact manifolds or the Aubry–Mather theory (see e.g. [1] for an introduction). A crucial difference is that we are dealing with a non-mechanical system on a non-compact space. Nevertheless, we are able to emulate many of Morse’s original arguments about how the minimizers can intersect with themselves and each other. For a precise statement of the main results we refer to Theorem 3.2 and Theorem 5.8. For related work on mechanical Hamiltonian systems we refer to [2, 12] and the references therein. Another important aspect of the techniques employed here and in [7] is the mildness of the hypotheses. In particular, our approach requires no transversality or non-degeneracy conditions, such as those found in other variational methods and dynamical systems theory, see [7]. Specifically, we will assume the following hypothesis on F : (H): F ∈ C 2 (R), F (±1) = F (±1) = 0, F (±1) > 0, and F (u) > 0 for u = ±1. Moreover there are constants c1 and c2 such that F (u) ≥ −c1 + c2 u2 . We will also assume for simplicity of the formulation that F is even, but many analogous results will hold for nonsymmetric potentials, cf. [7]. 
Finally, we assume that the parameters γ and β are such that u = ±1 are saddle-foci, i.e. 4γ /β 2 > 1/F (±1). An example of a nonlinearity satisfying these conditions is F (u) = (u2 − 1)2 /4, in which case (1.2) is the stationary version of the so-called extended Fisher–Kolmogorov (EFK) equation. In [7] we classify heteroclinic and homoclinic minimizers of J by a finite sequence of even integers which represent the number of times a minimizer crosses u = ±1. In order to classify more general minimizers we must consider infinite and bi-infinite sequences, as we now describe. A function u : R → R can be represented as a curve in the (u, u )−plane, and the associated curve will be denoted by (u). Removing the equilibrium points (±1, 0) from the (u, u )−plane (the configuration space) creates a space with nontrivial topology, denoted by P = R2 \{(±1, 0)}. In P we can represent functions u which have the property that u = 0 when u = ±1, and various equivalence classes of curves can be distinguished. For example, in [7] we considered classes of curves that terminate at the equilibrium points (±1, 0). Another important class consists of closed curves in P, which represent periodic functions. We now give a systematic description of all classes to be considered. Definition 1.1. A type is a sequence g = (gi )i∈I with gi ∈ 2N ∪ {∞}, where ∞ acts as a terminator. To be precise, g satisfies one of the following conditions: i) I = Z, and g ∈ 2NZ is referred to as a bi-infinite type. ii) I = {0} ∪ N, and g = (∞, g1 , g2 , . . . ) with gi ∈ 2N for all i ≥ 1, or I = −N ∪ {0}, and g = (. . . , g−2 , g−1 , ∞) with gi ∈ 2N for all i ≤ −1. In these cases g is referred to as a semi-terminated type.
iii) I = {0, . . . , N + 1} with N ≥ 0, and g = (∞, g1 , . . . , gN , ∞) with gi ∈ 2N. In this case g is referred to as a terminated type. These types will define function classes using the vector g to count the crossings of u at the levels u = ±1. Since there are two equilibrium points, we introduce the notion of parity denoted by p, which will be equal to either 0 or 1. 2 (R) is in the class M(g, p) if there are nonempty sets Definition 1.2. A function u ∈ Hloc {Ai }i∈I such that i) u−1 (±1) = i∈I Ai , ii) #Ai = gi for i ∈ I, iii) max Ai < min Ai+1 , i+p+1 , and iv) u(A i ) = (−1) v) i∈I Ai consists of transverse crossings of ±1, i.e., u (x) = 0 for x ∈ Ai .
Note that by Definition 1.1, a function u in any class M(g, p) has infinitely many crossings of ±1. Definition 1.2 is similar to the definition of the class M(g) in [7] except that here it is assumed that all crossings of ±1 are transverse. Only finitely many crossings are assumed to be transverse in [7] so that the classes M(g) would be open subsets of χ +H 2 (R). Since we will not directly minimize over M(g, p), we now require transversality of all crossings of ±1 to guarantee that (u) ∈ P. However, note that the minimizers found in [7] are indeed contained in classes M(g, p) as defined above, where the types g are terminated. The classes M(g, p) are nonempty for all pairs (g, p). Conversely, any function 2 (R) is contained in the closure of some class M(g, p) with respect to the u ∈ Hloc −i 2 (R) given by ρ(u, v) = complete metric on Hloc i 2 min{1, u − vH 2 (−i,i) }, cf. 2 [13]. That is, if we define M(g, p) := {u ∈ Hloc (R) | ∃un ∈ M(g, p), with un → u 2 (R)}, then H 2 (R) = ∪ in Hloc (g,p) M(g, p). Note that the functions in ∂M(g, p) := loc M(g, p) \ int(M(g, p)) have tangencies at u = ±1 and thus are limit points of more than one class. In the case of an infinite type, shifts of g can give rise to the same function class. Therefore certain infinite types need to be identified. Let σ be the shift map defined by σ (g)i = gi+1 and the map τ : {0, 1} → {0, 1} be defined by τ (p) = (p + 1)mod 2 = |p−1|. Two infinite types (g, p) and (g , p ) are equivalent if g = σ n (g) and p = τ n (p) for some n ∈ Z, and this implies M(g, p) = M(g , p ). A bi-infinite type g is periodic if there exists an integer n such that σ n (g) = g. When the domain of integration is R, the action J [u] given in (1.1) is well-defined only for terminated types g and u ∈ M(g, p) ∩ {χ p + H 2 (R)}, where χ p is a smooth function from (−1)p+1 to (−1)p . For semi-terminated types or infinite types the action J is infinite for every u ∈ M(g, p). In Sect. 
2, we will define an alternative notion of minimizer in order to overcome this difficulty. The primary goal of this paper is to prove the following theorem, but we also prove additional results about the structure and relationships between various types of minimizers. Theorem 1.3. If F satisfies Hypothesis (H) and is even, then for any type g and parity p there exists a minimizer of J in M(g, p) in the sense of Definition 2.1. Moreover, if g is periodic, then there exists a periodic minimizer in M(g, p). In Sects. 5 and 6 we show that other properties of the symbol sequences, such as symmetry, are reflected in the corresponding minimizers. The classification of minimizers by
symbol sequences has other properties in common with symbolic dynamics; for example, if a type is asymptotically periodic in both directions, then there exists a minimizer of that type which is a heteroclinic connection between two periodic minimizers. The minimizers discussed here all lie in the 3-dimensional “energy-manifold” M0 = {(u, u , u , u ) | H ((u, u , u , u ) = 0}. Exploiting certain properties of minimizers that are established in this paper, we can deduce various linking and knotting characteristics when they are represented as smooth curves in M0 [4, 5]. The minimizers found in this paper are also used in [16] to construct stable patterns for the evolutionary EFK equation on a bounded interval, and the dynamics of the evolutionary EFK is discussed in [9]. Some notation used in this paper was previously introduced in [7]. While we have attempted to present a self-contained analysis, we have avoided reproducing details (particularly in Sect. 5.1) which are not central to the ideas presented here, and which are thoroughly explained in [7]. 2. Definition of Minimizer For every compact interval I ⊂ R the restricted action JI is well-defined for all types. When we restrict u to an interval I , we can define its type and parity relative to I , which we denote by (g(u|I ), p(u|I )). Namely, let u ∈ M(g, p). It is clear that (u, u )|∂I ∈ (±1, 0) for any bounded interval I . Then g(u|I ) is defined to be the finite-dimensional vector which counts the consecutive instances of u|I = ±1, and p(u|I ) is defined such that the first time u|I = ±1 in I happens at (−1)p+1 . Note that the components of g(u|I ) are not necessarily all even, since the first and the last entries may be odd. We are now ready to state the definition of a (global) minimizer in M(g, p). Definition 2.1. 
A function u ∈ M(g, p) is called a minimizer for J over M(g, p) if and only if for every compact interval I the number JI [u|I ] minimizes JI [v|I ] over all functions v ∈ M(g, p) and all compact intervals I such that (v, v )|∂I = (u, u )|∂I and (g(v|I ), p(v|I )) = (g(u|I ), p(u|I )). The pair (g(u|I ), p(u|I )) defines a homotopy class of curves in P with fixed end points (u, u )|∂I . The above definition says that a function u, represented as a curve (u) in P, is a minimizer if and only if for any two points P1 and P2 on (u), the segment (P1 , P2 ) ⊂ (u) connecting P1 and P2 is the most J -efficient among all connections (P1 , P2 ) between P1 and P2 that are induced by a function v and are of the same homotopy type as (P1 , P2 ), regardless of the length of the interval needed to parametrize the curve (P1 , P2 ). As we mentioned in the introduction, this is analogous to the length minimizing geodesics of Morse and Hedlund and the minimizers in the Aubry–Mather theory. The set of all (global) minimizers in M(g, p) will be denoted by CM(g, p). Lemma 2.2. Let u ∈ M(g, p) be a minimizer, then u ∈ C 4 (R) and u satisfies Eq. (1.2). Moreover, u satisfies the relation H (u, u , u , u ) = 0, i.e. the associated orbit lies on the energy level H = 0. Proof. From the definition of M(g, p), on any bounded interval I ⊂ R there exists #0 (I ) > 0 sufficiently small such that u + φ ∈ M(g, p) for all φ ∈ H02 (I ), with φH 2 < # ≤ #0 . Therefore JI [u + φ] ≥ JI [u] for all such functions φ, which implies that dJI [u] = 0 for any bounded interval I ⊂ R, and thus u satisfies (1.2).
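To prove the second statement we argue as follows. Since $u \in M(g,p)$, there exists a bounded interval $I$ such that $u'|_{\partial I} = 0$. Introducing the rescaled variable $s = t/T$ with $T = |I|$ and $v(s) = u(t)$, we have
\[ J_I[u] = J[T,v] \equiv \int_0^1 \Bigl( \frac{1}{T^3}\frac{\gamma}{2}|v''|^2 + \frac{1}{T}\frac{\beta}{2}|v'|^2 + TF(v) \Bigr)\,ds, \tag{2.1} \]
which decouples $u$ and $T$. Since $u'|_{\partial I} = 0$ we see from Definition 2.1 that $J[T \pm \epsilon, v] \ge J_T[u] = J[T,v]$. The smoothness of $J$ in the variable $T > 0$ implies that $\frac{\partial}{\partial\tau} J[\tau,v]\big|_{\tau=T} = 0$. Differentiating yields
\[ \begin{aligned} \frac{\partial}{\partial\tau} J[\tau,v] &= \int_0^1 \Bigl( -\tau^{-4}\frac{3}{2}\gamma|v''|^2 - \tau^{-2}\frac{\beta}{2}|v'|^2 + F(v) \Bigr)\,ds \\ &= \tau^{-1}\int_0^\tau \Bigl( -\frac{3}{2}\gamma|u''|^2 - \frac{\beta}{2}|u'|^2 + F(u) \Bigr)\,dt \\ &= -\tau^{-1}\int_0^\tau H(u,u',u'',u''')\,dt \equiv -E. \end{aligned} \]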
Thus E = 0, and H (u, u , u , u ) = 0 for t ∈ I . This immediately implies that H = 0 for all t ∈ R. The minimizers for J found in [7] also satisfy Definition 2.1, and we restate one of the main results of [7]. Proposition 2.3. Suppose F is even and satisfies (H), and β, γ > 0 are chosen such that u = ±1 are saddle-focus equilibria. Then for any terminated type g with parity either 0 or 1 there exists a minimizer u ∈ M(g, p) of J . From Definition 1.2, the crossings of u ∈ M(g, p) with ±1 are transverse and hence isolated. We adapt from [7], the notion of a normalized function with a few minor changes to reflect the fact that we now require every crossing of ±1 to be transverse. Definition 2.4. A function u ∈ M(g, p) is normalized if, between each pair u(a) and u(b) of consecutive crossings of ±1, the restriction u|(a,b) is either monotone or u|(a,b) has exactly one local extremum. Clearly, the case of u|(a,b) being monotone can occur only between two crossings at different levels ±1, in which case we say that u has a transition on [a, b]. Lemma 2.5. If u ∈ CM(g, p), then u is normalized. Proof. Since u ∈ M(g, p), all crossings of u = ±1 are transverse, i.e. u = 0. Thus for any critical point t0 ∈ R, u(t0 ) = ±1, and the Hamiltonian relation from Lemma 2.2 and (1.3) implies that γ u (t0 )2 /2 = F (u(t0 )) > 0. Therefore u is a Morse function, and between any two consecutive crossings of ±1 there are only finitely many critical points. Now on any interval between consecutive crossings where u is not normalized, the clipping lemmas of Sect. 3 in [7] can be applied to obtain a more J -efficient function, which contradicts the fact that u is a minimizer.
3. Minimizers of Arbitrary Type In this section we will introduce a notion of convergence of types which will be used in Sect. 5.2 to establish the existence of minimizers in every class M(g, p) by building on the results proved in [7].
Definition 3.1. Consider a sequence of types (gn , pn ) = (gin )i∈In , pn and a type (g, p) = (gi )i∈I , p . The sequence (gn , pn ) limits to the type (g, p) if and only if n there exist numbers Nn ∈ 2Z such that gi+N → gi for all i ∈ I as n → ∞. We n n +p −p n n will abuse notation and write (g , p ) → (g, p). We should point out that a sequence of types can limit to more than one type. For
n , 0) = (∞, 2, 2, n, 4, 4, 4, 4, n, 2, 2, 2, . . . ), 0 limits to the example the sequence (g
types (∞, 2, 2, ∞), 0 , (∞, 4, 4, 4, 4, ∞), 1 and (∞, 2, 2, 2, . . . ), 0 . Theorem 3.2. Let (gn , pn ) → (g, p) and un ∈ CM(gn , pn ) with un 1,∞ ≤ C for 4 (R), all n. Then there exists a subsequence unk such that unk → u ∈ M(g, p) in Cloc and u is a minimizer in the sense of Definition 2.1, i.e. u ∈ CM(g, p). Proof. This proof requires arguments developed in [7] to which the reader is referred for certain details. The idea is to take the limit of un restricted to bounded intervals. We define the numbers Nn as in Definition 3.1, and we denote the convex hull of Ai by Ii = conv(Ai ). Due to translation invariance we can pin the functions un so that un (0) = (−1)p+1 , which is the beginning of the transition between INn n +pn −p and n . Due to the assumed a priori bound and interpolation estimates which can I1+N n n +p −p be found in the appendix to [10], there is enough regularity to yield a limit function u 4 –limit of u , after perhaps passing to a subsequence. Moreover u satisfies the as a Cloc n differential equation (1.2) on R. The question that remains is whether u ∈ M(g, p). To simplify notation we will now assume that Nn = 0 and pn = p = 0. Fixing a small δ > 0, we define Iin (δ) ⊃ Iin as the smallest interval containing Iin such that u|∂Iin (δ) = (−1)i+1 − (−1)i+1 δ (if g is a (semi-)terminated type then Iin (δ) may be n (δ) is denoted by Ln (δ). a half-line). The interval of transition between Iin (δ) and Ii+1 i To see that u ∈ M(g, p), one has to to eliminate the two possibilities that a priori may lead to the loss or creation of crossings in the limit so that u ∈ M(g, p): the distance between two consecutive crossings in un could grow without bound or u could possess tangencies at u = ±1. Due to the a priori estimates in W 1,∞ we have the following bounds on J : J [un |Iin (δ) ] ≤ C
and
J [un |Lni (δ) ] ≤ C ,
(3.1)
where C and C are independent of n and i. Indeed, note that for n large enough the homotopy type of un on the intervals Iin (δ) is constant by the definition of convergence of types. Since the functions un are minimizers, J [un |Iin (δ) ] is less than the action of any test function of this homotopy type satisfying the a priori bounds on u and u on ∂Iin (δ) (see [7, Sect. 6] for a similar test function argument). The estimate |Lni (δ)| ≤ C(δ) is immediately clear from Lemma 5.1 of [7]. We now need to show that the distance between two crossings of (−1)i+1 within the interval Iin (δ) cannot tend to infinity. First we will deal with the case when gin is finite for all n. Suppose that the distance between consecutive crossings of (−1)i+1 in Iin (δ) tends to infinity as n → ∞. Due to Inequality (3.1) and Lemma 2.5, minimizers have exactly one extremum between
crossings of (−1)i+1 for any # > 0, and hence there exist subintervals Kn ⊂ Iin (δ) with |Kn | → ∞, such that 0 < |un − (−1)qn | < # on Kn , where qn ∈ {0, 1}, and |u |∂Kn | < #. Taking a subsequence we may assume that qn is constant. We begin by considering the case where qn = i + 1. Now # can be chosen small enough, so that the local theory in [7] is applicable in Kn . If |Kn | becomes too large then un can be replaced by a function with lower action and with many crossings of (−1)i+1 . Subsequently, redundant crossings can be clipped out, thereby lowering the action. This implies that un is not a minimizer in the sense of Definition 2.1, a contradiction. The case where qn = i must be dealt with in a different manner. First, there are unique points tn ∈ Kn such that un (tn ) = 0, and for these points un (tn ) → (−1)i as |Kn | → ∞. Let un (sn ) be the first crossing of (−1)i+1 to the left of Kn . Taking the limit (along subsequences) of un (t − sn ) we obtain a limit function u which is a solution of (1.2). If |tn − sn | is bounded then u has a tangency to u = (−1)i at some t∗ ∈ R. All un lie in {H = 0} (see (1.3)) and so does u, hence u (t∗ ) = 0. Moreover u (t∗ ) = 0, because u(t∗ ) is an extremum. By uniqueness of the initial value problem this implies that u ≡ (−1)i , contradicting the fact that u(0) = (−1)i+1 . If |tn − sn | → ∞, then u is a monotone function on [0, ∞), tending to (−1)i as x → ∞, and its derivatives tend to zero (see Lemma 3 in [14] or Lemma 1, Part (ii) in [10] for details). This contradicts the saddle-focus nature of the equilibrium point. In the case that gin = ∞ we remark that (3.1) also holds when Iin is a half-line. It follows from the estimates in Lemma 5.1 in [7] that uni → (−1)i+1 as x → ∞ or x → −∞ (whichever is applicable). From the local theory in Sect. 4 of [7] and the fact that un is a minimizer, it follows that the derivatives of un tend to zero. 
The analysis above concerning the intervals Kn and the clipping of redundant oscillations now goes on unchanged. We have shown that the distance between two crossings of ±1 is bounded from above. Next we have to show that the limit function has only transverse crossings of ±1. This ensures that no crossings are lost in the limit. If u were tangent to (−1)i+1 in Ii , then we could construct a function in v ∈ M(g, p) in the same way as demonstrated in [7] by replacing tangent pieces by more J -efficient local minimizers and by clipping. The function v still has a lower action than u on a slightly larger interval (the limit function u also obeys (3.1), so that the above clipping arguments still apply). Since un → u in 4 it follows that J [u ] → J [u] on bounded intervals I . This then implies that for n Cloc I n I large enough the function un is not a minimizer in the sense of Definition 2.1, which is a contradiction. The limit function u could also be tangent to (−1)i for some t0 ∈ Ii . As before, such tangencies satisfy u(t0 ) − (−1)i = u (t0 ) = u (t0 ) = u (t0 ) = 0, which leads to a contradiction of the uniqueness of the initial value problem. Finally, crossings of u = ±1 cannot accumulate since this would imply that at the accumulation point all derivatives would be zero, leading to the same contradiction as above. In particular, if gin → ∞ for some i, then |Iin | → ∞ and the crossings in Anj for j > i move off to infinity and do not show in u, which is compatabile with the convergence of types. 4 –limit of minimizers, We have now proved that u ∈ M(g, p) and, since u is the Cloc u is also a minimizer in the sense of Definition 2.1. Remark 3.3. It follows from the estimates in Theorem 3 of [10] that in the theorem above we in fact only need an L∞ -bound on the sequence un .
W.D. Kalies, J. Kwapisz, J.B. VandenBerg, R.C.A.M. VanderVorst
Remark 3.4. It follows from the proof of Theorem 3.2 that there exists a constant $\delta_0 > 0$ such that for all uniformly bounded minimizers $u(t)$ it holds that $|u(t) - (-1)^{i+p}| > \delta_0$ for all $t \in I_i$ and all $i \in I$. This means that the uniform separation property discussed in [7] is uniformly satisfied by all minimizers.

4. Periodic Minimizers

A bi-infinite type $g$ is periodic if there exists an integer $n$ such that $\sigma^n(g) = g$. The (natural) definition of the period of $g$ is $\min\{n \in 2\mathbb{N} \mid \sigma^n(g) = g\}$. We will write $g = \langle r \rangle$, where $r = (g_1, \dots, g_n)$ and $n$ is even. Cyclic permutations of $r$ with possibly a flip of $p$ give rise to the same function class $M(\langle r \rangle, p)$. In reference to the type $\langle r \rangle$ with parity $p$ we will use the notation $(r,p)$. Any such type pair $(r,p)$ can formally be associated with a homotopy class in $\pi_1(\mathcal{P},0)$ in the following way. Let $e_0$ and $e_1$ be the clockwise oriented circles of radius one centered at $(1,0)$ and $(-1,0)$ respectively, so that $[e_0]$ and $[e_1]$ are generators for $\pi_1(\mathcal{P},0)$. Defining $\theta(r,p) = e_{\tau^n(p)}^{r_n/2} \cdots e_{p}^{r_1/2}$, the map $\theta : \bigcup_{k \ge 1} 2\mathbb{N}^{2k} \times \{0,1\} \to \pi_1(\mathcal{P},0)$ is an injection, and we define $\pi_1^+(\mathcal{P},0)$ to be the image of $\theta$ in $\pi_1(\mathcal{P},0)$. Powers of a type pair $(r,p)^k$ for $k \ge 1$ are defined by concatenation of $r$ with itself $k$ times, which is equivalent to $(r,p)^k = \theta^{-1}((\theta(r,p))^k)$.

Definition 4.1. Two pairs $(r,p)$ and $(\tilde r, \tilde p)$ are equivalent if there are numbers $p, q \in \mathbb{N}$ such that $(r,p)^p = (\tilde r, \tilde p)^q$ up to cyclic permutations. This relation, $(r,p) \sim (\tilde r, \tilde p)$, is an equivalence relation.
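Definition 4.1 can be checked mechanically for concrete type pairs. The following is a minimal sketch and not part of the original argument: the helper names `power`, `cyclic_shifts` and `equivalent` are hypothetical, and the sketch assumes that a cyclic shift of $r$ by one entry flips the parity $p$ (one reading of "cyclic permutations of $r$ with possibly a flip of $p$"). Since all entries are repeated periodically, it is enough to compare the two pairs at the least common multiple of their lengths.

```python
from math import gcd

def power(r, k):
    """Concatenate the finite type r with itself k times."""
    return tuple(r) * k

def cyclic_shifts(r, p):
    """Yield every cyclic permutation of r together with its parity.

    Assumption (not stated in this form in the text): each unit shift flips p.
    """
    r = tuple(r)
    for s in range(len(r)):
        yield r[s:] + r[:s], (p + s) % 2

def equivalent(pair_a, pair_b):
    """Test whether some powers of the two pairs agree up to cyclic permutation."""
    (ra, pa), (rb, pb) = pair_a, pair_b
    n, m = len(ra), len(rb)
    common = n * m // gcd(n, m)      # least common multiple of the two lengths
    wa = power(ra, common // n)
    wb = power(rb, common // m)
    return any(w == wb and q == pb for w, q in cyclic_shifts(wa, pa))

a = ((2, 4, 2, 4), 0)
b = ((4, 2, 4, 2, 4, 2), 1)
print(equivalent(a, b))  # prints True: (r,p)^3 and (r~,p~)^2 differ by one shift
```

Under the assumed parity convention the same second type with parity 0 would not be equivalent to the first pair, because only odd shifts map one word onto the other.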
Example. If $(r,p) = ((2,4,2,4), 0)$ and $(\tilde r, \tilde p) = ((4,2,4,2,4,2), 1)$, then $\theta(r,p)^3 = \theta(\tilde r, \tilde p)^2$.

The equivalence class of $(r,p)$ is denoted by $[r,p]$. A type $(r,p)$ is a minimal representative for $[r,p]$ if for each $(\tilde r, \tilde p) \in [r,p]$ there is $k \ge 1$ such that $(\tilde r, \tilde p) = (r,p)^k$ up to cyclic permutations. A minimal representative is unique up to cyclic permutations. It is clear that in the representation of a periodic type $g = \langle r \rangle$, the type $r$ is minimal if the length of $r$ is the minimal period. Due to the above equivalences we now have that $M(\langle r \rangle, p) = M(\langle \tilde r \rangle, \tilde p)$ for all $(\tilde r, \tilde p) \in [r,p]$.

It is not a priori clear that minimizers in $M(\langle r \rangle, p)$ are periodic. However, we will see that among these minimizers, periodic minimizers can always be found. For a given periodic type $\langle r \rangle$ we consider the subset of periodic functions in $M(\langle r \rangle, p)$,
\[ M_{\mathrm{per}}(\langle r \rangle, p) = \{ u \in M(\langle r \rangle, p) \mid u \text{ is periodic} \}. \]
For any $u \in M_{\mathrm{per}}(\langle r \rangle, p)$ and a period $T$ of $u$, $\Gamma(u|_{[0,T]})$ is a closed loop in $\mathcal{P}$ whose homotopy type corresponds to a nontrivial element of $\pi_1^+(\mathcal{P},0)$. In this correspondence there is no natural choice of a basepoint. For specificity, we will describe how to make the correspondence with the origin as the basepoint and thereafter omit it from the notation. Translate $u$ so that $u(0) = 0$. Let $\gamma : [0,1] \to \mathcal{P}$ be the line from $0$ to $(0, u'(0))$, and let $\gamma^*(t) = \gamma(1-t)$. Then $\tilde\Gamma(u|_{[0,T]}) = \gamma^* \circ \Gamma(u|_{[0,T]}) \circ \gamma$, and $[\tilde\Gamma(u|_{[0,T]})] \in \pi_1^+(\mathcal{P},0)$. Now define $[\Gamma(u|_{[0,T]})] \equiv [\tilde\Gamma(u|_{[0,T]})]$. Thus there exists a pair $\theta^{-1}([\Gamma(u|_{[0,T]})]) = (\tilde r, \tilde p) \in [r,p]$, with $\tilde r = r^k$ for some $k \ge 1$. Therefore we define for any $(\tilde r, \tilde p) \in [r,p]$,
\[ M_{\mathrm{per}}(\tilde r, \tilde p) = \{ u \in M_{\mathrm{per}}(\langle r \rangle, p) \mid [\Gamma(u|_{[0,T]})] \sim \theta(\tilde r, \tilde p) \in \pi_1(\mathcal{P}) \text{ for a period } T \text{ of } u \}. \]
Homotopy Classes for Stable Periodic and Chaotic Patterns
The type $\tilde r = g(u|_{[0,T]})$, with $g = \langle \tilde r \rangle$, is the homotopy type of $u$ relative to a period $T$. This type has an even number of entries. It follows that $M_{\mathrm{per}}(r,p) \subset M_{\mathrm{per}}(\tilde r, \tilde p)$ for all $(\tilde r, \tilde p) = (r,p)^k$, $k \ge 1$. Furthermore $M_{\mathrm{per}}(\langle r \rangle, p) = \bigcup_{(\tilde r, \tilde p) \in [r,p]} M_{\mathrm{per}}(\tilde r, \tilde p)$. In order to get a better understanding of periodic minimizers in $M(\langle r \rangle, p)$ we consider the following minimization problem:
\[ J_{\mathrm{per}}(r,p) = \inf_{u \in M_{\mathrm{per}}(r,p)} J_T[u] = \inf_{\substack{u \in M^T_{\mathrm{per}}(r,p) \\ T \in \mathbb{R}^+}} J_T[u], \tag{4.1} \]
where $J_T$ is the action given in (1.1) integrated over one period of length $T$, and $M^T_{\mathrm{per}}(r,p)$ is the set of $T$-periodic functions $u \in M_{\mathrm{per}}(r,p)$ for which $g(u|_{[0,T]}) = r$. Note that $T$ is not necessarily the minimal period, unless $r$ is a minimal representative for $[r]$. It is clear that for $\gamma, \beta > 0$ the infima $J_{\mathrm{per}}(r,p)$ are well-defined and are nonnegative for any homotopy type $r$. At this point it is not clear, however, that the infima $J_{\mathrm{per}}(r,p)$ are attained for all homotopy types $r$. We will prove in Sect. 5 that existence of minimizers for (4.1) can be obtained using the existence of homoclinic and heteroclinic minimizers already established in [7].
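Because the period $T$ in (4.1) is itself a minimization variable, stationarity with respect to $T$ carries information beyond the Euler-Lagrange equation. The following computation is a sketch under stated assumptions: it assumes that the action (1.1) has the integrand $\frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 + F(u)$ and that the conserved quantity in (1.3) is $H = -\gamma u'u''' + \frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 - F(u)$. Neither formula is restated in this section, so both should be checked against (1.1) and (1.3). The rescaled function $v(s) = u(Ts)$ is introduced only for this illustration.

```latex
\begin{align*}
J_T[u] &= \int_0^T \Big( \tfrac{\gamma}{2}|u''|^2 + \tfrac{\beta}{2}|u'|^2 + F(u) \Big)\,dt
        = \int_0^1 \Big( \tfrac{\gamma}{2T^3}|v''|^2 + \tfrac{\beta}{2T}|v'|^2 + T\,F(v) \Big)\,ds, \\
\frac{\partial J_T}{\partial T}
       &= \int_0^1 \Big( -\tfrac{3\gamma}{2T^4}|v''|^2 - \tfrac{\beta}{2T^2}|v'|^2 + F(v) \Big)\,ds
        = -\frac{1}{T}\int_0^T \Big( \tfrac{3\gamma}{2}|u''|^2 + \tfrac{\beta}{2}|u'|^2 - F(u) \Big)\,dt, \\
\int_0^T H\,dt
       &= \int_0^T \Big( \tfrac{3\gamma}{2}|u''|^2 + \tfrac{\beta}{2}|u'|^2 - F(u) \Big)\,dt,
        \qquad\text{since}\quad -\int_0^T \gamma\,u'u'''\,dt = \int_0^T \gamma\,|u''|^2\,dt
        \ \text{ for $T$-periodic $u$.}
\end{align*}
```

Along a solution of (1.2) the quantity $H$ is constant, so under these assumptions $\partial J_T / \partial T = -H$. A function that is minimal with respect to $T$ must therefore lie on the level set $H = 0$, which is the energy condition stated in Lemma 4.2.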
Lemma 4.2. If $J_{\mathrm{per}}(r,p)$ is attained for some $u \in M_{\mathrm{per}}(r,p)$, then $u \in C^4(\mathbb{R})$ and satisfies (1.2). Moreover, since $u$ is minimal with respect to $T$ we have $H(u,u',u'',u''') = 0$, i.e. the associated periodic orbit lies in the energy surface $H = 0$.

Proof. Since $J_{\mathrm{per}}(r,p)$ is attained by some $u \in M_{\mathrm{per}}(r,p)$ for some period $T$, we have that $J_T[u+\varphi] - J_T[u] \ge 0$ for all $\varphi \in H^2(S^1,T)$ with $\|\varphi\|_{H^2} \le \epsilon$, sufficiently small. This implies that $dJ_T[u] = 0$, and thus $u$ satisfies (1.2). The second part of this proof is analogous to the proof of Lemma 2.2.

We now introduce the following notation:
\[ CM(\langle r \rangle, p) = \{ u \in M(\langle r \rangle, p) \mid u \text{ is a minimizer according to Definition 2.1} \}, \]
\[ CM_{\mathrm{per}}(\langle r \rangle, p) = \{ u \in CM(\langle r \rangle, p) \mid u \text{ is periodic} \}, \]
\[ CM_{\mathrm{per}}(r,p) = \{ u \in M_{\mathrm{per}}(r,p) \mid u \text{ is a minimizer for } J_{\mathrm{per}}(r,p) \}. \]

4.1. Existence of periodic minimizers of type $r = (2,2)^k$. If we seek periodic minimizers of type $r = (2,2)^k$, the uniform separation property for minimizing sequences (see Sect. 5 in [7]) is satisfied in the class $M_{\mathrm{per}}(r)$. Note that the parity is omitted because it does not distinguish different homotopy types here. The uniform separation property as defined in [7] prevents minimizing sequences from crossing the boundary of the given homotopy class. For any other periodic type the uniform separation property is not a priori satisfied. For the sake of simplicity we begin with periodic minimizers of type $(2,2)$ and minimize $J$ in the class $M_{\mathrm{per}}((2,2))$. Minimizing sequences can be chosen to be normalized due to the following lemma, which we state without proof. The proof is analogous to Lemma 3.5 in [7].

Lemma 4.3. Let $u \in M_{\mathrm{per}}((2,2))$ and $T$ be a period of $u$. Then for every $\epsilon > 0$ there exists a normalized function $w \in M_{\mathrm{per}}((2,2))$ with period $T' \le T$ such that $J_{T'}[w] \le J_T[u] + \epsilon$.
The goal of this subsection is to prove that when $F$ satisfies (H) and $\beta, \gamma > 0$ are such that $u = \pm 1$ are saddle-foci, then $J_{\mathrm{per}}((2,2))$ is attained, by Theorem 4.5 below. The proof relies on the local structure of the saddle-focus equilibria $u = \pm 1$ and is a modification of arguments in [7]; hence we will provide only a brief argument. The reader is referred to [7] for further details.

In preparation for the proof of Theorem 4.5, we fix $\tau_0 > 0$, $\epsilon_0 > 0$, and $\delta > 0$ so that the conclusion of Theorem 4.2 of [7] holds, i.e. the characterization of the oscillatory behavior of solutions near the saddle-focus equilibria $u = \pm 1$ holds. Let $u \in M^T_{\mathrm{per}}((2,2))$ be normalized, and let $t_0$ be such that $u(t_0) = 0$. Then $t_0$ is part of a transition from $\mp 1$ to $\pm 1$. Assume without loss of generality that this transition is from $-1$ to $1$. Define $t_- = \sup\{t < t_0 : |u(t)+1| < \delta\}$ and $t_+ = \inf\{t > t_0 : |u(t)-1| < \delta\}$. Then let $S(u) = \{t : |u(t) \pm 1| < \delta\}$ and $B[u,T] = |S(u) \cap [t_+, t_- + T]|$, and note that $[t_0, t_0+T] = (S(u) \cap [t_+, t_- + T]) \cup (S(u)^c \cap [t_0, t_0+T])$. With these definitions we can establish the following estimate (cf. Lemma 5.4 in [7]). For all $u \in M_{\mathrm{per}}((2,2))$ with $J_T[u] \le J_{\mathrm{per}}((2,2)) + \epsilon_0$,
\[ \|u\|^2_{H^2} \le C\big(1 + J_{\mathrm{per}}((2,2)) + B[u,T]\big). \tag{4.2} \]
First, $\|u'\|^2_{H^1} \le C(J_{\mathrm{per}}((2,2)) + \epsilon_0)$, and second if $|u \pm 1| > \delta$, then $F(u) \ge \eta^2 u^2$, which implies that
\[ \|u\|^2_{L^2} \le \frac{1}{\eta^2} \int_{t_0}^{t_0+T} F(u)\,dt + (1+\delta)^2 B[u,T] \le C\big(J_T[u] + B[u,T]\big). \]
Combining these two estimates proves (4.2).

For functions $u \in M^T_{\mathrm{per}}((2,2))$ that satisfy $J_T[u] \le J_{\mathrm{per}}((2,2)) + 1$, it follows from Lemma 5.1 of [7] that there exist (uniform in $u$) constants $T_1$ and $T_2$ such that $T_2 \ge |S(u)^c \cap [t_0, t_0+T]| \ge T_1 > 0$ and thus $T > T_1$. The next step is to give an a priori upper bound on $T$ by considering the minimization problem (cf. Sect. 5 in [7])
\[ B_\epsilon = \inf\{ B[u,T] \mid u \in M^T_{\mathrm{per}}((2,2)) \text{ normalized},\ T \in \mathbb{R}^+,\ \text{and } J_T[u] \le J_{\mathrm{per}}((2,2)) + \epsilon \}. \]
Lemma 4.4. There exists a constant $K = K(\tau_0) > 0$ such that $B_\epsilon \le K$ for all $0 < \epsilon < \epsilon_0$. Moreover, if $T_0 \equiv K + T_2$, then for any $0 < \epsilon < \epsilon_0$, there is a normalized $u \in M^T_{\mathrm{per}}((2,2))$ with $J_T[u] \le J_{\mathrm{per}}((2,2)) + 2\epsilon$ and $T_1 < T \le T_0$.

Proof. Let $(u_n, T_n) \in M^{T_n}_{\mathrm{per}}((2,2)) \times \mathbb{R}^+$ be a minimizing sequence for $B_\epsilon$, with normalized functions $u_n$. As in the proof of Theorem 5.5 of [7], in the weak limit this yields a pair $(\hat u, \hat T)$ such that $B[\hat u, \hat T] \le B_\epsilon$. We now define $K((2,2), \tau_0) = 8((2\tau_0 + 2) + 2)$. This gives two possibilities for $B[\hat u, \hat T]$, either $B[\hat u, \hat T] > K$ or $B[\hat u, \hat T] \le K$. If the former is true then we can construct (see Theorem 5.5 of [7]) a pair $(\hat v, \hat T') \in M^{\hat T'}_{\mathrm{per}}((2,2)) \times \mathbb{R}^+$, with $\hat v$ normalized, such that
\[ J_{\hat T'}[\hat v] < J_{\hat T}[\hat u] \le J_{\mathrm{per}}((2,2)) + \epsilon \quad\text{and}\quad B[\hat v, \hat T'] < B[\hat u, \hat T] \le B_\epsilon, \]
which is a contradiction excluding the first possibility. In the second case, where $B[\hat u, \hat T] \le K$, we can construct a pair $(\hat v, \hat T')$ with $\hat v$ normalized such that
\[ J_{\hat T'}[\hat v] < J_{\hat T}[\hat u] + \epsilon \le J_{\mathrm{per}}((2,2)) + 2\epsilon \quad\text{and}\quad B[\hat v, \hat T'] < B[\hat u, \hat T] \le K, \]
which implies that $T_1 < \hat T' < \hat T \le K + T_2 = T_0$ and concludes the proof. For details concerning these constructions, see Theorem 5.5 in [7].
Theorem 4.5. Suppose that $F$ satisfies (H) and $\beta, \gamma > 0$ are such that $u = \pm 1$ are saddle-foci. Then $J_{\mathrm{per}}((2,2)^k)$ is attained for any $k \ge 1$. Moreover, the projection of any minimizer in $CM_{\mathrm{per}}((2,2))$ onto the $(u,u')$-plane is a simple closed curve.

Proof. By Lemma 4.4, we can choose a minimizing sequence $(u_n, T_n) \in M^{T_n}_{\mathrm{per}}((2,2)) \times \mathbb{R}^+$, with $u_n$ normalized and with the additional properties that $\|u_n\|_{H^2} \le C$ and $T_1 < T_n \le T_0$. Since the uniform separation property is satisfied for the type $(2,2)$ this leads to a minimizing pair $(\hat u, \hat T)$ for (4.1) by following the proof of Theorem 2.2 in [7]. As for the existence of periodic minimizers of type $r = (2,2)^k$ the uniform separation property is automatically satisfied and the above steps are identical. Lemma 2.5 yields that minimizers are normalized functions and the projection of a normalized function in $M_{\mathrm{per}}((2,2))$ is a simple closed curve in the $(u,u')$-plane.
We would like to have the same theorem for arbitrary periodic types $r$. For homotopy types that satisfy the uniform separation property the analogue of Theorem 4.5 can be proved. However, in Sect. 5 we will prove a more general result using the information about the minimizers with terminated types (homoclinic and heteroclinic minimizers) which was obtained in [7].

Remark 4.6. The existence of a $(2,2)$-type minimizer is proved here in order to obtain a priori $W^{1,\infty}$-estimates for all minimizers (Sect. 5). However, if $F$ satisfies the additional hypothesis that $F(u) \sim |u|^s$, $s > 2$ as $|u| \to \infty$, then such estimates are automatic (cf. [6, 10]). In that case the existence of a minimizer of type $(2,2)$ follows from Theorem 4.14 below. To prove existence of minimizers of arbitrary type $r$ we will use an analogue of Theorem 4.14 (see Lemma 5.7 and Theorem 5.8 below).

4.2. Characterization of minimizers of type $g = \langle (2,2) \rangle$. Periodic minimizers associated with $[e_0]$ or $[e_1]$ are the constant solutions $u = -1$ and $u = 1$ respectively. The simplest nontrivial periodic minimizers are those of type $r = (2,2)^k$, i.e. $r \in [(2,2)]$. These minimizers are crucial to the further analysis of the general case. The type $r = (2,2)$ is a minimal type (associated with $[e_1 e_0]$), and we want to investigate the relation between minimizers in $M(\langle (2,2) \rangle)$ and periodic minimizers of type $(2,2)^k$. Considering curves in the configuration space $\mathcal{P}$ is a convenient method for studying minimizers of type $(2,2)$. For example, minimizers in $CM(\langle (2,2) \rangle)$ and $CM_{\mathrm{per}}((2,2))$ all satisfy the property that they do not intersect the line segment $L = (-1,1) \times \{0\}$ in $\mathcal{P}$. If other homotopy types $r$ are considered, i.e. $r \notin [(2,2)]$, then minimizers represented as curves in $\mathcal{P}$ necessarily have self-intersections and they must intersect the segment $L$, which makes their comparison more complicated. We will come back to this problem in Sect. 5.
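The homotopy bookkeeping for curves in the $(u,u')$-plane can be explored numerically. The sketch below is purely illustrative and not part of the proofs: it samples the curve $t \mapsto (u(t), u'(t))$ for the test function $u(t) = A\cos t$ (chosen for convenience, not claimed to be a minimizer) and computes winding numbers around the two punctures $(\pm 1, 0)$ by summing signed angle increments. The names `sample_curve` and `winding_number` are hypothetical. With the usual counterclockwise-positive convention, a clockwise loop has winding number $-1$.

```python
import math

def sample_curve(amplitude, n=400):
    """Sample (u, u') for the test function u(t) = amplitude * cos(t) over one period."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        pts.append((amplitude * math.cos(t), -amplitude * math.sin(t)))
    return pts

def winding_number(curve, point):
    """Winding number of a closed polygonal curve around a point not on the curve."""
    px, py = point
    total = 0.0
    n = len(curve)
    for k in range(n):
        x0, y0 = curve[k][0] - px, curve[k][1] - py
        x1, y1 = curve[(k + 1) % n][0] - px, curve[(k + 1) % n][1] - py
        # signed angle between consecutive position vectors
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return int(round(total / (2.0 * math.pi)))

big = sample_curve(2.0)     # encloses both punctures, traversed clockwise
small = sample_curve(0.5)   # stays between the punctures
print(winding_number(big, (1.0, 0.0)), winding_number(big, (-1.0, 0.0)))
print(winding_number(small, (1.0, 0.0)), winding_number(small, (-1.0, 0.0)))
```

The large loop winds once, clockwise, around each puncture and does not meet the segment $L$, which is the qualitative picture described above for curves of type $(2,2)$. The small loop has winding number zero around both punctures and crosses $L$.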
Note that for a $C^1$-function $u$ the associated curve $\Gamma(u)$ is a closed loop if and only if $u$ is a periodic function.

Lemma 4.7. For any non-periodic minimizer $u \in CM(\langle (2,2) \rangle)$ and any bounded interval $I$ the curve $\Gamma[u|_I]$ has only a finite number of self-intersections. For periodic minimizers $u \in CM_{\mathrm{per}}(\langle (2,2) \rangle)$ this property holds when the length of $I$ is smaller than the minimal period.

Proof. Fix a time interval $I = [0,T]$. If $u$ is periodic, $T$ should be chosen smaller than the minimal period of $u$. Let $P = (u_0, u_0')$ be an accumulation point of self-intersections of $\Gamma(u|_I)$. Then $P$ is a self-intersection point, and there exists a monotone sequence of times $\tau_n \in I$ converging to $t_0$ such that $\Gamma(u(\tau_n))$ are self-intersection points and $\Gamma(u(t_0)) = P$.
Also there exists a corresponding sequence $\sigma_n \in I$ with $\sigma_n \ne \tau_n$ such that $\Gamma(u(\tau_n)) = \Gamma(u(\sigma_n))$. Choosing a subsequence if necessary, $\sigma_n \to s_0$ monotonically. Since $u$ is a minimizer in $CM(\langle (2,2) \rangle)$, the intervals $[\sigma_n, \tau_n]$ must contain a transition, and hence $|\tau_n - \sigma_n| > T_0 > 0$. Therefore, $s_0 \ne t_0$, and we will assume that $s_0 < t_0$ (otherwise change labels). The homotopy type of $\Gamma(u|_{[s_0,t_0]})$ is $(2,2)^k$ for some $k \ge 1$ (since $I$ is bounded). Assume that $\sigma_n$ and $\tau_n$ are increasing; the other case is similar. Using the times $\sigma_n < s_0 < \tau_n < t_0$, the curve $\Gamma^* = \Gamma[u|_{[\sigma_n - \delta, t_0 + \delta]}]$, for $\delta$ sufficiently small, can be decomposed as $\Gamma^* = a \circ \gamma_2 \circ \gamma \circ \gamma_1 \circ b$, where $b = \Gamma(u|_{[\sigma_n - \delta, \sigma_n]})$, $\gamma_1 = \Gamma(u|_{[\sigma_n, s_0]})$, $\gamma = \Gamma(u|_{[s_0, \tau_n]})$, $\gamma_2 = \Gamma(u|_{[\tau_n, t_0]})$, and $a = \Gamma(u|_{[t_0, t_0 + \delta]})$. For $n$ sufficiently large, $\gamma_1$ and $\gamma_2$ have the same homotopy type, and $\gamma_1 \ne \gamma_2$, since otherwise $u$ would be periodic with period smaller than $t_0 - \sigma_n < T$. We can now construct two more paths
\[ \Gamma_1 = a \circ \gamma_1 \circ \gamma \circ \gamma_1 \circ b \quad\text{and}\quad \Gamma_2 = a \circ \gamma_2 \circ \gamma \circ \gamma_2 \circ b \]
which have the same homotopy type for $n$ sufficiently large. Since $J[\Gamma^*]$ is minimal, $J[\Gamma_1] \ge J[\Gamma^*]$ and $J[\Gamma_2] \ge J[\Gamma^*]$, and thus $J[\gamma_1] \ge J[\gamma_2]$ and $J[\gamma_2] \ge J[\gamma_1]$ which implies that $J[\gamma_1] = J[\gamma_2]$. Therefore $J[\Gamma^*] = J[\Gamma_1] = J[\Gamma_2]$, and $\Gamma_1$, $\Gamma_2$ and $\Gamma^*$ are all distinct minimizers with the same homotopy type and same boundary conditions. Since these curves all coincide along $\gamma$, the uniqueness of the initial value problem is contradicted.

An argument very similar to the one above is also used in the proof of Lemma 4.12 and is demonstrated in Fig. 4.1.

Lemma 4.8. If $r = (2,2)^k$ with $k > 1$, then $CM_{\mathrm{per}}(r) = CM_{\mathrm{per}}((2,2))$ and $J_{\mathrm{per}}(r) = k \cdot J_{\mathrm{per}}((2,2))$.

Proof. Let $u \in CM_{\mathrm{per}}(r)$ with $r = (2,2)^k$ for $k > 1$, and let $T$ be the period$^1$ such that the associated curve in $\mathcal{P}$, $\Gamma(u|_{[0,T]})$, has the homotopy class of $\theta((2,2)^k)$. First we will prove that $\Gamma(u|_{[0,T]})$ is a simple closed curve in $\mathcal{P}$, and hence $u \in M_{\mathrm{per}}((2,2))$. Suppose not, then by Lemma 4.7 the curve $\Gamma(u|_{[0,T]})$ can be fully decomposed into $k$ distinct simple closed curves $\Gamma_i$ for $i = 1, \dots, k$ (just call the inner loop $\Gamma_1$, cut it out, and call the new inner loop $\Gamma_2$, and so on). Denote by $J_i$ the action associated with loop $\Gamma_i$, then $\sum_i J_i = J_T[u]$. Let $v_i \in M_{\mathrm{per}}((2,2)^k)$ be the function obtained by pasting together $k$ copies of $u$ restricted to the loop $\Gamma_i$. If $v_i$ were a minimizer in $M_{\mathrm{per}}((2,2)^k)$, then by Lemma 4.2 the functions $u$ and $v_i$ would be distinct solutions to the differential equation (1.2) which coincide over an interval. This would contradict the uniqueness of solutions of the initial value problem, and hence $v_i$ is not a minimizer, i.e. $J_T[v_i] = k \cdot J_i > J_{\mathrm{per}}((2,2)^k)$. Consequently $J_{\mathrm{per}}((2,2)^k) = \sum_i J_i > J_{\mathrm{per}}((2,2)^k)$, which is a contradiction. Thus $u \in M_{\mathrm{per}}((2,2))$ and $\Gamma(u|_{[0,T]})$ is a simple loop traversed $k$ times.

Now we will show that $u \in CM_{\mathrm{per}}((2,2))$. Since $\Gamma(u)$ is the projection of a function into the $(u,u')$-plane, $u$ traverses the loop once over the interval $[0, T/k]$, and $J_{\mathrm{per}}((2,2)^k) = k \cdot J_{T/k}[u]$. Suppose $J_{T/k}[u] > J_{\mathrm{per}}((2,2))$. Then we can construct a function in $M_{\mathrm{per}}((2,2)^k)$ with action less than $J_T[u] = J_{\mathrm{per}}((2,2)^k)$ by gluing together $k$ copies of a minimizer in $M_{\mathrm{per}}((2,2))$, which is a contradiction.

Lemma 4.9. For any $k \ge 1$, $CM_{\mathrm{per}}((2,2)^k) = CM_{\mathrm{per}}((2,2)) = CM_{\mathrm{per}}(\langle (2,2) \rangle)$.

$^1$ One may assume without loss of generality that $T$ is a minimal period.
Proof. We have already shown in Lemma 4.8 that $CM_{\mathrm{per}}((2,2)^k) = CM_{\mathrm{per}}((2,2))$. We first prove that $CM_{\mathrm{per}}((2,2)) \subset CM_{\mathrm{per}}(\langle (2,2) \rangle)$. Let $u \in CM_{\mathrm{per}}((2,2))$ have period $T$. Suppose $u \notin CM_{\mathrm{per}}(\langle (2,2) \rangle)$. Then there exist two points $\Gamma(u(t_1)) = P_1$ and $\Gamma(u(t_2)) = P_2$ on $\Gamma(u)$ such that the curve $\gamma$ between $P_1$ and $P_2$ obtained by following $\Gamma(u)$ is not minimal. Replacing $\gamma$ by a curve with smaller action and the same homotopy type yields a function $v \in M_{\mathrm{per}}((2,2))$ for which $J_{[t_1,t_2]}[v] \le J_{[t_1,t_2]}[u]$. Choose $k \ge 0$ such that $kT > t_2 - t_1$. Then $u$ is not a minimizer in $CM_{\mathrm{per}}((2,2)^k) = CM_{\mathrm{per}}((2,2))$, which is a contradiction.

To finish the proof of the lemma we show that $CM_{\mathrm{per}}(\langle (2,2) \rangle) \subset CM_{\mathrm{per}}((2,2))$. Let $u \in CM_{\mathrm{per}}(\langle (2,2) \rangle)$ have period $T$. Let $\Gamma(u|_{[0,T]})$ be the associated closed curve in $\mathcal{P}$ and $\omega$ its winding number with respect to the segment $L$. Suppose $J_T[u] > J_{\mathrm{per}}((2,2)^\omega) = \omega \cdot J_{\mathrm{per}}((2,2))$. This implies the existence of a function $v \in M_{\mathrm{per}}((2,2)^\omega)$ and a period $T'$ such that $J_{T'}[v] < J_T[u]$. Choose a time $t_0 \in [0,T]$ such that $u(t_0) = 1$ and $u'(t_0) > 0$. Let $P_0 = (1, u'(t_0)) \in \mathcal{P}$. There exists a $\delta > 0$ sufficiently small such that $u(t_0 \pm \delta) > 0$, $u'(t_0 \pm \delta) > 0$, and $u$ does not cross $\pm 1$ in $[t_0 - \delta, t_0 + \delta]$ except at $t_0$. Let $P_1$ and $P_2$ denote the points $(u(t_0 \mp \delta), u'(t_0 \mp \delta))$ respectively. Let $\gamma$ denote the piece of the curve $\Gamma(u)$ from $P_1$ to $P_2$ and $\gamma^*$ the curve tracing $\Gamma(u)$ backward in time from $P_2$ to $P_1$. Now choose a point $P_3$ on $\Gamma(v)$ for which $v = 1$ and $v' > 0$. We can easily construct cubic polynomials $p_1$ and $p_2$ for which the curve $\Gamma(p_1)$ connects $P_1$ to $P_3$ and the curve $\Gamma(p_2)$ connects $P_3$ to $P_2$ in $\mathcal{P}$. These curves $\Gamma(p_i)$ are monotone functions, and hence the loop $\Gamma(p_1) \circ \Gamma(p_2) \circ \gamma^*$ has trivial homotopy type in $\mathcal{P}$. Therefore $\Gamma(u|_{[0,T]})^k \circ \gamma \sim \Gamma(p_2) \circ \Gamma(v|_{[0,T']})^k \circ \Gamma(p_1)$ in $\mathcal{P}$ for any $k \ge 1$, and from Definition 2.1
\[ J[\Gamma(u|_{[0,T]})^k \circ \gamma] \le J[\Gamma(p_2) \circ \Gamma(v|_{[0,T']})^k \circ \Gamma(p_1)]. \]
Thus, $k \cdot J_T[u] + J[\gamma] \le J[p_1] + J[p_2] + k \cdot J_{T'}[v]$, which implies $0 \le k(J_T[u] - J_{T'}[v]) \le J[p_1] + J[p_2] - J[\gamma]$. These estimates lead to a contradiction for $k$ sufficiently large.
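The contradiction in the last step can be made quantitative. In the sketch below, $\Delta$ and $C$ are shorthand introduced only for this illustration, and $T'$ denotes the period of the competitor $v$; both quantities are independent of $k$.

```latex
\begin{align*}
\Delta &:= J_T[u] - J_{T'}[v] > 0, \qquad C := J[p_1] + J[p_2] - J[\gamma], \\
k\,\Delta &\le C \ \text{ for all } k \ge 1
\quad\Longrightarrow\quad
\Delta \le \lim_{k \to \infty} \frac{C}{k} = 0 .
\end{align*}
```

This is incompatible with $\Delta > 0$. Explicitly, the inequality already fails for every integer $k > C/\Delta$.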
Lemma 4.10. For any two distinct minimizers $u_1$ and $u_2$ in $CM_{\mathrm{per}}((2,2))$, the associated curves $\Gamma(u_i)$ do not intersect.

Proof. Suppose $\Gamma(u_1)$ and $\Gamma(u_2)$ intersect at a point $P \in \mathcal{P}$. Translate $u_1$ and $u_2$ so that $\Gamma(u_1(0)) = \Gamma(u_2(0)) = P$. Define the function $u \in M_{\mathrm{per}}((2,2)^2)$ as the periodic extension of
\[ u(t) = \begin{cases} u_1(t) & \text{for } t \in [0, T_1], \\ u_2(t - T_1) & \text{for } t \in [T_1, T_1 + T_2], \end{cases} \]
where $T_i$ is the minimal period of $u_i$. Then $J_{T_1+T_2}[u] = 2 J_{\mathrm{per}}((2,2)) = J_{\mathrm{per}}((2,2)^2)$. By Lemma 4.8 we have $u \in CM_{\mathrm{per}}((2,2))$, which contradicts the fact that $u_1$ and $u_2$ are distinct minimizers with $\Gamma(u_1) \ne \Gamma(u_2)$.

As a direct consequence of this lemma, the periodic orbits in $CM_{\mathrm{per}}((2,2))$ are ordered in the sense that $\Gamma(u_1)$ lies either strictly inside or outside the region enclosed by $\Gamma(u_2)$. The ordering will be denoted by $>$.

Theorem 4.11. There exists a largest and a smallest periodic orbit in $CM_{\mathrm{per}}((2,2))$ in the sense of the above ordering, which we will denote by $u_{\max}$ and $u_{\min}$ respectively. Moreover $1 < \|u_{\min}\|_{1,\infty} \le \|u_{\max}\|_{1,\infty} \le C_0$, and $u_{\min} < u < u_{\max}$ for every $u \in CM_{\mathrm{per}}((2,2))$. In particular the set $CM_{\mathrm{per}}((2,2))$ is compact.
Proof. Either the number of periodic minimizers is finite, in which case there is nothing to prove, or the set of minimizers is infinite. Let $U = \{\Gamma(u) \mid u \in CM_{\mathrm{per}}((2,2))\} \subset \mathcal{P}$, and let $A = U \cap \{(u,u') \mid u' = 0,\ u > 0\}$. Every minimizer in $CM_{\mathrm{per}}((2,2))$ intersects the positive $u$-axis transversely exactly once. Moreover distinct minimizers cross this axis at distinct points by Lemma 4.10. Thus we can use $A$ as an index set and label the minimizers as $u_\alpha$ for $\alpha \in A$. Due to the a priori upper bound on minimizers (Lemma 5.1 in [7]), $A$ is a bounded set. The set $A$ is contained in the $u$-axis and hence has an ordering induced by the real numbers. This order corresponds to the order on minimizers, i.e. $\alpha < \beta$ in $A$ if and only if $u_\alpha < u_\beta$ as minimizers. Suppose $\alpha_*$ is an accumulation point of $A$. Then there exists a sequence $\alpha_n$ converging to $\alpha_*$. From Theorem 3.2 (the a priori $L^\infty$-bound on $u_{\alpha_n}$ is sufficient by Remark 3.3) we see that there exists $u \in CM(\langle (2,2) \rangle)$ which is a solution to Eq. (1.2) such that $u_{\alpha_n} \to u$ in $C^1_{\mathrm{loc}}(\mathbb{R})$. Since $u_{\alpha_n}$ is periodic and the $C^1_{\mathrm{loc}}$-limit of a sequence of periodic functions with uniformly bounded periods (compare with the proof of Theorem 3.2 to find a uniform bound on the periods) is periodic, $u \in CM_{\mathrm{per}}(\langle (2,2) \rangle)$. By Lemma 4.9, $u \in CM_{\mathrm{per}}((2,2))$. Furthermore $u$ corresponds to $u_{\alpha_*}$, and hence $A$ is compact. Consequently $A$ contains maximal and minimal elements. Let $u_{\max}$ and $u_{\min}$ be the periodic minimizers through the maximal and minimal points of $A$ respectively. This proves the theorem.

The above lemmas characterize periodic minimizers in $CM(\langle (2,2) \rangle)$. Now we turn our attention to non-periodic minimizers. We conclude this subsection with a theorem that gives a complete description of the set $CM(\langle (2,2) \rangle)$. Let $u \in CM(\langle (2,2) \rangle)$ be non-periodic. Suppose that $P$ is a self-intersection point of $\Gamma(u)$. Then there exist times $t_1 < t_2$ such that $\Gamma(u(t_1)) = \Gamma(u(t_2)) = P$, and $\Gamma(u|_{[t_1,t_2]})$ is a closed loop.
By Lemma 4.7 there are only finitely many self-intersections on $[t_1, t_2]$. Without loss of generality we may therefore assume that $\gamma$ is a simple closed loop, i.e., we need only consider the case where $P = \Gamma(u(t_1)) = \Gamma(u(t_2))$ and $\Gamma(u|_{[t_1,t_2]})$ is a simple closed loop. We now define $\Gamma_+ = \Gamma(u|_{(t_1,\infty)})$ and $\Gamma_- = \Gamma(u|_{(-\infty,t_2)})$. We will refer to $\Gamma_\pm$ as the forward and backward orbits of $u$ relative to $P$.

Lemma 4.12. Let $u \in CM(\langle (2,2) \rangle)$ be a non-periodic minimizer with at least one self-intersection. Let $P$ and $\Gamma_\pm$ be defined as above. Then the forward and backward orbits $\Gamma_\pm$ relative to $P$ do not intersect themselves. Furthermore, $P$ and $\Gamma_\pm$ are unique, and the curve $\Gamma(u)$ passes through any point in $\mathcal{P}$ at most twice.

Proof. We will prove the result for $\Gamma_+$; the argument for $\Gamma_-$ is similar. Suppose that $\Gamma_+$ has self-intersections. Define $t_* = \min\{t > t_1 \mid \Gamma(u(t)) = \Gamma(u(\tau)) \text{ for some } \tau \in (t_1, t)\}$. The minimum $t_*$ is attained by Lemma 4.7, and $t_* > t_2$ since $\gamma \equiv \Gamma(u|_{[t_1,t_2]})$ is a simple closed loop. Let $t_0 \in (t_1, t_*)$ be the point such that $\Gamma(u(t_0)) = \Gamma(u(t_*))$. This point is unique by the definition of $t_*$, and $\tilde\gamma \equiv \Gamma(u|_{[t_0,t_*]})$ is a simple closed loop. For small positive $\delta$ we define $Q = \Gamma(u(t_*))$, $B = \Gamma(u(t_1 - \delta))$, $E = \Gamma(u(t_* + \delta))$ and $\Gamma^* = \Gamma(u|_{[t_1 - \delta, t_* + \delta]})$, see Fig. 4.1. We can decompose this curve into five parts; $\Gamma^* = \sigma_3 \circ \tilde\gamma \circ \sigma_2 \circ \gamma \circ \sigma_1$, where $\sigma_1$ joins $B$ to $P$, $\sigma_2$ joins $P$ to $Q$, $\sigma_3$ joins $Q$ to $E$, and $\gamma$ and $\tilde\gamma$ are simple closed loops based at $P$ and $Q$ respectively, see Fig. 4.1. The simple closed curves $\gamma$ and $\tilde\gamma$ go around $L$ exactly once and thus have the same homotopy type. Moreover, $\gamma \ne \tilde\gamma$ since $u$ is non-periodic.
Besides $\Gamma^*$ we can construct two other distinct paths from $B$ to $E$:
\[ \Gamma_1 = \sigma_3 \circ \sigma_2 \circ \gamma \circ \gamma \circ \sigma_1 \quad\text{and}\quad \Gamma_2 = \sigma_3 \circ \tilde\gamma \circ \tilde\gamma \circ \sigma_2 \circ \sigma_1 . \]
It is not difficult to see that $\Gamma_1$, $\Gamma_2$ and $\Gamma^*$ all have the same homotopy type. Since $J[\Gamma^*]$ is minimal in the sense of Definition 2.1 we have, by the same reasoning as in Lemma 4.7, that $J[\Gamma_1] \ge J[\Gamma^*]$ and $J[\Gamma_2] \ge J[\Gamma^*]$, which implies that $J[\tilde\gamma] \ge J[\gamma]$ and $J[\gamma] \ge J[\tilde\gamma]$. Hence $J[\gamma] = J[\tilde\gamma]$. Therefore $J[\Gamma_1] = J[\Gamma_2] = J[\Gamma^*]$ which gives that $\Gamma_1$, $\Gamma_2$ and $\Gamma^*$ are all distinct minimizers of the same type as curves joining $B$ to $E$. Since these curves all contain the paths $\sigma_1$, $\sigma_2$ and $\sigma_3$, and are solutions to (1.2), the uniqueness of the initial value problem is contradicted.

Finally, the curve $\Gamma(u)$ can pass through a point at most twice because it is a union of $\Gamma_+$ and $\Gamma_-$, each visiting a point at most once. Moreover, points in $\Gamma(u|_{(t_1,t_2)})$, common to both $\Gamma_+$ and $\Gamma_-$, are passed exactly once. It now follows that if there is another self-intersection besides $P$, say at $R = \Gamma(u(s_1)) = \Gamma(u(s_2))$, then $s_1 < t_1$ and $t_2 < s_2$. We conclude that the curve $\Gamma(u|_{(s_1,s_2)})$ contains $\Gamma(u|_{[t_1,t_2]})$ and therefore it is not a simple closed curve. Thus $P$ is a unique self-intersection that cuts off a simple loop.
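Self-intersections of a sampled planar curve can be counted directly, which is sometimes useful for inspecting numerically computed orbits in the $(u,u')$-plane. The following is only a sketch under stated assumptions: the curve is given as a closed polygon, only transverse crossings between non-adjacent edges are counted, and the function names are hypothetical. The test curves (a figure eight and a circle) are generic planar curves chosen because their crossing counts are known; they are not minimizers and the figure eight is not of the form $(u,u')$.

```python
import math

def orient(a, b, c):
    """Twice the signed area of triangle (a, b, c); the sign gives the turn direction."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(a, b, c, d):
    """True if segments ab and cd cross transversely at an interior point of both."""
    return (orient(a, b, c) * orient(a, b, d) < 0 and
            orient(c, d, a) * orient(c, d, b) < 0)

def count_self_intersections(points):
    """Count transverse crossings between non-adjacent edges of a closed polygon."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share the vertex points[0]
            if segments_cross(points[i], points[(i + 1) % n],
                              points[j], points[(j + 1) % n]):
                count += 1
    return count

n = 200
# half-step offset keeps sample points away from the crossing at the origin
times = [2.0 * math.pi * (k + 0.5) / n for k in range(n)]
figure_eight = [(math.sin(t), math.sin(2.0 * t)) for t in times]
circle = [(math.cos(t), math.sin(t)) for t in times]
print(count_self_intersections(figure_eight), count_self_intersections(circle))
```

The quadratic loop over edge pairs is adequate for a few hundred sample points; a sweep-line method would be the natural replacement for finer discretizations.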
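Fig. 4.1. The forward orbit $\Gamma_+$ starting at $P$ with a self-intersection at the point $Q$. Lemma 4.12 implies that this cannot happen for non-periodic $u \in CM(\langle (2,2) \rangle)$. [Figure labels: points $B$, $P$, $Q$, $E$; arcs $\sigma_1$, $\sigma_2$, $\sigma_3$; loop $\tilde\gamma$; segment $L$ between $(-1,0)$ and $(1,0)$.]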
Lemma 4.13. Let $u \in CM(\langle (2,2) \rangle)$ be non-periodic. Suppose that $u \in L^\infty(\mathbb{R})$. Then $u$ is a connecting orbit between two periodic minimizers $u_-, u_+ \in CM_{\mathrm{per}}((2,2))$, i.e. there are sequences $t_n^-, t_n^+ \to \infty$ such that $u(t - t_n^-) \to u_-(t)$ and $u(t + t_n^+) \to u_+(t)$ in $C^4_{\mathrm{loc}}(\mathbb{R})$.

Proof. Lemma 4.12 implies that $\Gamma_+$ is a spiral which intersects the positive $u$-axis at a bounded, monotone sequence of points $(\alpha_n, 0)$ in $\mathcal{P}$ converging to a point $(\alpha_*, 0)$. Let $t_n$ be the sequence of consecutive times such that $u(t_n) = \alpha_n$ and $u'(t_n) = 0$. Consider the sequence of minimizers in $CM(\langle (2,2) \rangle)$ defined by $u_n(t) = u(t + t_n)$. By Theorem 3.2 there exists a $C^1_{\mathrm{loc}}$-limit $u_+ \in CM(\langle (2,2) \rangle)$. If $u_+$ is periodic, there is nothing more to prove. Thus suppose $u_+$ is non-periodic. Then the curve $\Gamma(u_+)$ crosses the $u$-axis infinitely many times. On the other hand, from the $C^1_{\mathrm{loc}}$ convergence $\Gamma(u_+)$ crosses this axis only at $\alpha_*$. By Lemma 4.12, $\Gamma(u_+)$ can intersect $\alpha_*$ at most twice, which is a contradiction. The $C^4_{\mathrm{loc}}$-convergence follows from regularity (as in the proof of Theorem 4.2). The proof of the existence of $u_-$ is similar.

Theorem 4.14. Let $u \in CM(\langle (2,2) \rangle)$. Either $u$ is unbounded, $u$ is periodic and $u \in CM_{\mathrm{per}}((2,2))$, or $u$ is a connecting orbit between periodic minimizers in $CM_{\mathrm{per}}((2,2))$.

Proof. Let $u \in CM(\langle (2,2) \rangle)$ be bounded, then $u$ is either periodic or non-periodic. In the case that $u$ is periodic it follows from Lemma 4.9 that $u \in CM_{\mathrm{per}}((2,2))$. Otherwise if $u$ is not periodic it follows from Lemma 4.13 that $u$ is a connecting orbit between two minimizers $u_-, u_+ \in CM_{\mathrm{per}}((2,2))$.

In Sect. 5.2 we give analogues of the above theorems for arbitrary homotopy types $r$. Notice that the option of $u \in CM(\langle (2,2) \rangle)$ being unbounded in the above theorem does not occur when $F(u) \sim |u|^s$, $s > 2$ as $|u| \to \infty$.

5. Properties of Minimizers

In Sect. 4, we proved the existence of minimizers in $M_{\mathrm{per}}((2,2))$, which will provide a priori bounds on the minimizers of arbitrary type. These bounds and Theorem 3.2 will establish the existence of such minimizers. In this section we will also prove that certain properties of a type $g$ are often reflected in the associated minimizers. The most important examples are the periodic types $g = \langle r \rangle$. Although there are minimizers in every class $M(\langle r \rangle, p)$, it is not clear a priori that among these minimizers there are also periodic minimizers. In order to prove existence of periodic minimizers for every periodic type $r$ we use the theory of covering spaces.

5.1. Existence. The periodic minimizers of type $(2,2)$ are special for the following reason. For a normalized $u \in M_{\mathrm{per}}((2,2))$, define $D(u)$ to be the closed disk in $\mathbb{R}^2$ such that $\partial D(u) = \Gamma(u)$.

Theorem 5.1. i) If $u \in CM(\langle r \rangle, p)$, then $\Gamma(u) \subset D(u_{\min})$ for any periodic type $r \ne (2,2)$. ii) If $u \in CM(g,p)$, then $\Gamma(u) \subset D(u_{\min})$ for any terminated type $g$.

Proof.
i) If $r \ne (2,2)$ then every $u \in CM(\langle r \rangle, p)$ has the property that $\Gamma(u)$ intersects the $u$-axis between $u = \pm 1$. Suppose that $\Gamma(u)$ does not lie inside $D(u_{\min})$. Then $\Gamma(u)$ must intersect $\Gamma(u_{\min})$ at least twice, and let $P_1$ and $P_2$ be distinct intersection points with the property that the curve $\Gamma_1$ obtained by following $\Gamma(u)$ from $P_1$ to $P_2$ lies entirely outside of $D(u_{\min})$. Let $\Gamma_2 \subset \Gamma(u_{\min})$ be the curve from $P_1$ to $P_2$ following $u_{\min}$, such that $\Gamma_1$ and $\Gamma_2$ are homotopic (traversing the loop $\Gamma(u_{\min})$ as many times as necessary) and thus $J[\Gamma_1] = J[\Gamma_2]$ is minimal. Replacing $\Gamma_1$ by $\Gamma_2$ leads to a minimizer in $CM(\langle r \rangle, p)$ which partially agrees with $u$. This contradicts the uniqueness of the initial value problem for (1.2).

ii) As in the previous case the associated curve $\Gamma(u)$ either intersects $\Gamma(u_{\min})$ at least twice or lies completely inside $D(u_{\min})$, and the proof is identical.

Corollary 5.2. For all minimizers in the above theorem, $\|u\|_{1,\infty} \le \|u_{\min}\|_{1,\infty} \le C_0$.

In order to prove existence of minimizers in every class we now use the above theorem in combination with an existence result from [7].
Theorem 5.3. For any given type $g$ and parity $p$ there exists a (bounded) minimizer $u \in CM(g,p)$. Moreover $\|u\|_{1,\infty} \le C_0$, independent of $(g,p)$.

Proof. Given a type $g$ we can construct a sequence $g_n$ of terminated types such that $g_n \to g$ as $n \to \infty$. For any terminated type $g_n$ there exists a minimizer $u_n \in CM(g_n, p)$ by Proposition 2.3 (Theorem 1.3 of [7]). Clearly such a sequence $u_n$ satisfies $\|u_n\|_{1,\infty} \le C$ by Corollary 5.2. Applying Theorem 3.2 completes the proof.
5.2. Covering spaces and the action of the fundamental group. The fundamental group of $\mathcal{P}$ is isomorphic to the free group on two generators $e_0$ and $e_1$ which represent loops (traversed clockwise) around $(1,0)$ and $(-1,0)$ respectively with basepoint $(0,0)$. Indeed, $\mathcal{P}$ is homotopic to a bouquet of two circles $X = S^1 \vee S^1$. The universal covering of $X$, denoted by $\tilde X$, can be represented by an infinite tree whose edges cover either $e_0$ or $e_1$ in $X$, see Fig. 5.1. The universal covering of $\mathcal{P}$, denoted by $\wp : \tilde{\mathcal{P}} \to \mathcal{P}$, can then be viewed by thickening the tree $\tilde X$ so that $\tilde{\mathcal{P}}$ is homeomorphic to an open disk in $\mathbb{R}^2$.
Fig. 5.1. The universal cover $\tilde X$ of $X$ is a tree. Its origin is denoted by $O$. For $\theta = e_0 e_1 e_0$, the quotient space $\tilde X_\theta = \tilde X / \langle \theta \rangle$ is also a covering space over $X$, and $\tilde X_\theta \sim S^1$.
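Elements of the free group on $e_0, e_1$ that arise from type pairs use only positive powers of the generators, so they can be handled as plain words. The sketch below is illustrative only: the function names `theta` and `conjugate` are hypothetical, and it assumes that the entry $r_i$ is attached to the generator with index $(p + i - 1) \bmod 2$, so that $r_1$ goes with $e_p$ and the generators alternate. Words with only positive exponents are cyclically reduced, and for such words conjugacy in a free group is the same as equality up to cyclic rotation.

```python
def theta(r, p):
    """Word of the pair (r, p) as a tuple of generator indices, read left to right.

    Assumed convention: entry r_i contributes r_i/2 copies of the generator
    with index (p + i - 1) mod 2, and the factor for r_1 stands rightmost.
    """
    blocks = []
    for i, ri in enumerate(r):              # i = 0 corresponds to r_1
        if ri <= 0 or ri % 2:
            raise ValueError("entries of a type must be positive and even")
        blocks.append(((p + i) % 2,) * (ri // 2))
    return tuple(g for block in reversed(blocks) for g in block)

def conjugate(w1, w2):
    """Conjugacy test for positive words: equality up to cyclic rotation."""
    if len(w1) != len(w2):
        return False
    return any(w1[s:] + w1[:s] == w2 for s in range(max(len(w1), 1)))

a = theta((2, 4, 2, 4), 0)
b = theta((4, 2, 4, 2, 4, 2), 1)
print(a * 3 == theta((2, 4, 2, 4) * 3, 0))   # powers correspond to concatenation
print(conjugate(a * 3, b * 2))               # the two pairs from the Example in Sect. 4
```

Under the assumed convention the cube of the first word and the square of the second are cyclic rotations of one another, which matches the phrase "up to cyclic permutations" in Definition 4.1.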
An important property of the universal covering is that the fundamental group π1 (P) in a natural way, via the lifting of paths in P to paths in induces a left group action on P We will not reproduce P. This action will be denoted by θ · p for θ ∈ π1 (P) and p ∈ P. the construction of this action here, and the reader is referred to an introductory book on algebraic topology such as [3]. However, we will utilize the structure of the quotient
590
W.D. Kalies, J. Kwapisz, J.B. VandenBerg, R.C.A.M. VanderVorst
spaces of $\tilde{P}$ obtained from this action, which are again coverings of $P$. These quotient spaces will be the natural spaces in which to consider the lifts of curves $\Gamma(u)$ which lie in more complicated homotopy classes than those in the case of $u \in M_{\mathrm{per}}((2,2))$. A periodic type $g = \bar{r}$ is generated by a finite type $r$, which together with the parity $p$ determines an element of $\pi_1(P)$ of the form $\theta(r) = e_{|p-1|}^{r_{2n}} \cdot \ldots \cdot e_p^{r_1}$. Since we only consider curves in $P$ which are of the form $\Gamma(u) = (u(t), u'(t))$, the numbers $r_i$ are all positive. The infinite cyclic subgroup generated by any such element $\theta$ will be denoted by $\langle\theta\rangle \subset \pi_1(P)$. The quotient space $\tilde{P}_\theta = \tilde{P}/\langle\theta\rangle$ is obtained by identifying points $p$ and $q$ in $\tilde{P}$ for which $q = \theta^k \cdot p$ for some $k \in \mathbb{Z}$. The resulting space $\tilde{P}_\theta$ is homotopic to an annulus, and $\wp_\theta : \tilde{P}_\theta \to P$ is a covering space. Figure 5.1 illustrates the situation for $X$, since it is easier to draw, and for $P$ the reader should imagine that the edges in the picture are thin strips. The lift of the path $\theta = e_0 e_1 e_0$ to $\tilde{X}$ based at $O$ is shown by the dashed line. This piece of the tree becomes a circle in the quotient space $\tilde{X}_\theta$. Note that infinitely many edges in $\tilde{X}$ are identified with this circle. The dashed lines in both $\tilde{X}$ and $\tilde{X}_\theta$ are strong deformation retracts of $\tilde{X}$ and $\tilde{X}_\theta$ respectively, and hence $\tilde{X}_\theta$ is homotopic to a circle. Thickening $\tilde{X}_\theta$ gives that $\tilde{P}_\theta$ is homotopic to an annulus. Thus $\pi_1(\tilde{P}_\theta)$ is generated by a simple closed loop in $\tilde{P}_\theta$ which will be denoted by $\zeta(r)$. Note that for convenience we suppress the dependence of $\theta$ and $\zeta$ on the parity $p$.
Remark 5.4. If we define the shift operator $\sigma$ on finite types $r$ to be a cyclic permutation, then $M_{\mathrm{per}}(r, p) = M_{\mathrm{per}}(\sigma^k(r), \tau^k(p))$ for all $k \in \mathbb{Z}$. Functions in $M_{\mathrm{per}}(r, p)$ have a unique lift to a simple closed curve in $\tilde{P}_\theta$, $\theta = \theta(r)$. However, functions in the shifted class $M_{\mathrm{per}}(\sigma^k(r), \tau^k(p))$ are not simple closed curves in $\tilde{P}_\theta$. In order for such functions to be lifted to a unique simple closed curve we need to consider the covering space $\tilde{P}_{\theta_k}$, where $\theta_k = \theta(\sigma^k(r), \tau^k(p))$.
5.3. Characterization of minimizers of type r. In Sect. 5.2 we characterized minimizers in $CM((2,2))$ by studying the properties of their projections into $P$. What was special about the types $(2,2)^k$ was that the projected curves were a priori contained in $P \setminus L$, which is topologically an annulus. The $J$-efficiency of minimizing curves restricts the possibilities for their self and mutual intersections. In particular, we showed that all periodic minimizers in $CM((2,2))$ project onto simple closed curves in $P \setminus L$ and that no two such minimizing curves intersect. These two properties, coupled with the simple topology of the annulus, already force the minimizing periodic curves to have the structure of a family of nested simple loops. Such a simple picture in the configuration plane $P$ cannot be expected for minimizers in $CM((\bar{r}, p))$ with $r \neq (2,2)$. The simple intersection properties (of Lemma 5.9 and 5.11) no longer hold; in fact, periodic minimizing curves must have self-intersections in $P$, as do any curves in $P$ representing the homotopy class of $(\bar{r}, p)$. However, by lifting minimizing curves into the annulus $\tilde{P}_\theta$, we can remove exactly these necessary self-intersections and put ourselves in a position to emulate the discussion for the types $(2,2)^k$. More precisely, for a minimal type $(r, p)$ and any $u \in M_{\mathrm{per}}((r,p)^k)$ with period $T$ such that $\theta^{-1}[\Gamma(u|_{[0,T]})] = (r,p)^k$, there are infinitely many lifts of the closed loop $\Gamma(u|_{[0,T]})$ into $\tilde{P}_{\theta(r)}$ (see the above remark), but there is exactly one lift, denoted $\Gamma_\theta(u|_{[0,T]})$, that is a closed loop homotopic to $\zeta^k(r)$ in $\tilde{P}_{\theta(r)}$. We can repeat all of the arguments in Sect. 4 by identifying intersections between the curves $\Gamma_\theta(u|_{[0,T]})$ in $\tilde{P}_{\theta(r)}$ instead of intersections between the curves $\Gamma(u|_{[0,T]})$ in $P \setminus L$.
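To make the identification $q = \theta^k \cdot p$ concrete, the following Python sketch models only the lifts of the base point $O$ in the universal covering of the graph $X$. It assumes (this is an assumption of the sketch, not a construction taken from the text) that $X$ is a wedge of two circles with loops $e_0$, $e_1$, so that these lifts correspond to reduced words in the free group on $e_0$, $e_1$ and $\theta$ acts by left multiplication. The helper names `reduce_word`, `multiply`, `power` and `orbit_exponent` are hypothetical, and $\theta$ is assumed to be reduced and cyclically reduced.

```python
# A letter is (generator index, sign): (0, 1) is e0, (0, -1) is its inverse.

def reduce_word(word):
    """Freely reduce a word by cancelling adjacent inverse letters."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()
        else:
            out.append((gen, sign))
    return tuple(out)

def inverse(word):
    """Inverse of a word: reverse the letters and flip the signs."""
    return tuple((g, -s) for g, s in reversed(word))

def multiply(a, b):
    """Product a * b in the free group, returned as a reduced word."""
    return reduce_word(tuple(a) + tuple(b))

def power(theta, k):
    """theta^k for any integer k (k < 0 uses the inverse)."""
    base = theta if k >= 0 else inverse(theta)
    result = ()
    for _ in range(abs(k)):
        result = multiply(result, base)
    return result

def orbit_exponent(theta, p, q):
    """Return k with q = theta^k * p, or None if p and q are not identified.

    Assumes theta is nonempty, reduced and cyclically reduced, so that
    theta^k has length |k| * len(theta).
    """
    assert theta and reduce_word(theta) == tuple(theta)
    assert theta[0] != (theta[-1][0], -theta[-1][1])
    w = multiply(q, inverse(p))  # q = theta^k p  <=>  q p^{-1} = theta^k
    if len(w) % len(theta) != 0:
        return None
    k = len(w) // len(theta)
    for cand in (k, -k):
        if power(theta, cand) == w:
            return cand
    return None

theta = ((0, 1), (1, 1), (0, 1))    # theta = e0 e1 e0, the path used in the text
p = ((1, 1),)                       # an arbitrary lift, here the word e1
q = multiply(power(theta, 2), p)    # q = theta^2 * p
print(orbit_exponent(theta, p, q))  # 2
```

Two words are identified in the quotient exactly when `orbit_exponent` returns an integer; the words $\theta^k$ themselves form the line in the tree that closes up to a circle.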
Of course, when gluing together pieces of curves, the values of $u$ and $u'$ come from the projections into $P$. In particular,
the arguments of Lemma 4.9 show that $\Gamma_\theta(u|_{[0,T]})$ must be a simple loop traced $k$ times, which leads to the following:
Lemma 5.5. For any periodic type $r$ and any $k \ge 1$ it holds that $CM_{\mathrm{per}}((r,p)^k) = CM_{\mathrm{per}}(r,p) = CM_{\mathrm{per}}(\bar{r}, p)$.
The proof of the next theorem is a slight modification of Theorem 4.11.
Theorem 5.6. For any periodic type $r$ the set $CM_{\mathrm{per}}(r,p)$ is compact and totally ordered (in $\tilde{P}_\theta$).
The following lemma is analogous to Lemma 4.13. Note however that by Theorem 5.1 we do not need to assume that the minimizer is uniformly bounded.
Lemma 5.7. Let $u \in CM(\bar{r}, p)$ for some periodic type $r \neq (2,2)$. Either $u$ is periodic and $u \in CM_{\mathrm{per}}(r,p)$, or $u$ is a connecting orbit between two periodic minimizers $u_-, u_+ \in CM_{\mathrm{per}}(r,p)$, i.e. there are sequences $t_n^-, t_n^+ \to \infty$ such that $u(t - t_n^-) \to u_-(t)$ and $u(t + t_n^+) \to u_+(t)$ in $C^4_{\mathrm{loc}}(\mathbb{R})$.
Combining Theorem 5.3 and Lemma 5.7 we obtain the existence of periodic minimizers in every class with a periodic type (this result can also be obtained in a way analogous to Theorem 4.5).
Theorem 5.8. For any periodic type $r$ the set $CM_{\mathrm{per}}(r,p)$ is nonempty.
The classification of functions by type has some properties in common with symbolic dynamics. For example, if a type $g$ is asymptotic to two different periodic types, i.e. $\sigma^n(g) \to r_+$ and $\sigma^{-n}(g) \to r_-$ as $n \to \infty$, with $r_+ \neq r_-$, then any minimizer $u \in CM(g,p)$ is a connecting orbit between two periodic minimizers $u_- \in CM_{\mathrm{per}}(r_-,p)$ and $u_+ \in CM_{\mathrm{per}}(r_+,p)$, i.e. there exist sequences $t_n^-, t_n^+ \to \infty$ such that $u(t - t_n^-) \to u_-(t)$ and $u(t + t_n^+) \to u_+(t)$ in $C^4_{\mathrm{loc}}(\mathbb{R})$. This result follows from Cantor's diagonal argument using Theorem 3.2 and Lemma 5.7, and hence we have used the symbol sequences to conclude the existence of heteroclinic and homoclinic orbits connecting any two types of periodic orbits. Symmetry properties of types $g$ are also often reflected in the corresponding minimizers.
For example, define the map $B_{i_0}$ on infinite types by $B_{i_0}(g) = (g_{2i_0 - i})_{i \in \mathbb{Z}}$, and consider types that satisfy $B_{i_0}(g) = g$ for some $i_0$. Moreover assume that $g$ is periodic. In this case we can prove that the corresponding periodic minimizers are symmetric and satisfy Neumann boundary conditions.
Theorem 5.9. Let $g = \bar{r}$ satisfy $B_{i_0}(\bar{r}) = \bar{r}$ for some $i_0$. Then for any $u \in CM_{\mathrm{per}}(r,p)$ there exists a shift $\tau$ such that $u_\tau(x) = u(x - \tau)$ satisfies
i) $u_\tau(x) = u_\tau(T - x)$ for all $x \in [0,T]$, where $T$ is the period of $u$,
ii) $u_\tau'(0) = u_\tau'''(0) = 0$ and $u_\tau'(T) = u_\tau'''(T) = 0$, and
iii) $u_\tau$ is a local minimizer for the functional $J_T[u]$ on the Sobolev space $H_n^2(0,T) = \{u \in H^2(0,T) \mid u'(0) = u'(T) = 0\}$.
Proof. Without loss of generality we may assume that $i_0 = 1$ and $g = (g_1, \ldots, g_N)$ for some $N \in 2\mathbb{N}$. We can choose a point $t_0$ in the convex hull of $A_1$ such that $u'(t_0) = u'(t_0 + T) = 0$ and $g(u|_{[t_0, t_0+T]}) = (g_1/2, g_2, \ldots, g_N, g_1/2)$. We now define $v(t) = u(t_0 + T - t)$. Then by the symmetry assumptions on $g$ we have that $g(v|_{[t_0, t_0+T]}) = g(u|_{[t_0, t_0+T]})$. Since $J_{[t_0, t_0+T]}(v) = J_{[t_0, t_0+T]}(u)$ and $\Gamma(u(t_0)) = \Gamma(u(t_0 + T)) = \Gamma(v(t_0)) = \Gamma(v(t_0 + T))$, we conclude from the uniqueness of the initial value problem that $u(t) = v(t)$ for all $t \in [t_0, t_0 + T]$, which proves the first statement. The second statement follows immediately from i). The third property follows from the definition of minimizer.
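The reflection $B_{i_0}$ acts on a periodic type through its finite generator, so the symmetry hypothesis of Theorem 5.9 can be checked mechanically. The Python sketch below assumes the generator is stored as a tuple `r = (r_1, ..., r_N)` with indices read modulo $N$ and with $r_1$ even; the function names are hypothetical, the parity $p$ is ignored, and the example type is chosen only for illustration.

```python
def reflect(r, i0):
    """Generator of B_{i0}(g) for the periodic type g generated by r.

    Entry i (1-based) of the result is r_{2*i0 - i}, indices taken modulo N.
    """
    n = len(r)
    return tuple(r[(2 * i0 - i - 1) % n] for i in range(1, n + 1))

def symmetry_centers(r):
    """All i0 in 1..N for which B_{i0} leaves the periodic type unchanged."""
    return [i0 for i0 in range(1, len(r) + 1) if reflect(r, i0) == tuple(r)]

def split_at_first(r):
    """Type of one period started in the middle of the first block.

    This is the tuple (r_1/2, r_2, ..., r_N, r_1/2) used in the proof
    for i0 = 1; r_1 is assumed to be even.
    """
    half = r[0] // 2
    return (half,) + tuple(r[1:]) + (half,)

r = (6, 2, 4, 2)                 # illustrative generator, not from the paper
print(symmetry_centers(r))       # [1, 3]
s = split_at_first(r)
print(s, s == s[::-1])           # (3, 2, 4, 2, 3) True
```

When $B_1$ fixes the type, the split tuple reads the same in both directions, which is the combinatorial counterpart of the identity $g(v|_{[t_0, t_0+T]}) = g(u|_{[t_0, t_0+T]})$ used in the proof.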
References
1. Bangert, V.: Mather Sets for Twist Maps and Geodesics on Tori. Volume 1 of Dynamics Reported. Oxford: Oxford University Press, 1988
2. Boyland, P. and Golé, C.: Lagrangian systems on hyperbolic manifolds. Ergodic Theory Dynam. Systems 19, 1157–1173 (1999)
3. Fulton, W.: Algebraic Topology: A First Course. Berlin–Heidelberg–New York: Springer-Verlag, 1995
4. Ghrist, R., VandenBerg, J.B. and VanderVorst, R.C.A.M.: Braided closed characteristics in fourth-order twist systems. Preprint 2000
5. Ghrist, R., VandenBerg, J.B. and VanderVorst, R.C.A.M.: Morse theory on the space of braids and Lagrangian dynamics. In preparation
6. Hulshof, J., VandenBerg, J.B. and VanderVorst, R.C.A.M.: Traveling waves for fourth order parabolic equations. To appear in SIAM J. Math. Anal. (1999)
7. Kalies, W.D., Kwapisz, J. and VanderVorst, R.C.A.M.: Homotopy classes for stable connections between Hamiltonian saddle-focus equilibria. Commun. Math. Phys. 193, 337–371 (1998)
8. Kalies, W.D. and VanderVorst, R.C.A.M.: Multitransition homoclinic and heteroclinic solutions of the extended Fisher–Kolmogorov equation. J. Diff. Eq. 131, 209–228 (1996)
9. Kalies, W.D., VanderVorst, R.C.A.M. and Wanner, T.: Slow motion in higher-order systems and Γ-convergence in one space dimension. To appear in Nonlin. Anal. TMA
10. Kwapisz, J.: Uniqueness of the stationary wave for the extended Fisher–Kolmogorov equation. J. Diff. Eq. 165, 235–253 (2000)
11. Morse, M.: A fundamental class of geodesics on any closed surface of genus greater than one. Trans. Am. Math. Soc. 26, 25–60 (1924)
12. Rabinowitz, P.H.: Heteroclinics for a Hamiltonian system of double pendulum type. Top. Meth. Nonlin. Anal. 9, 41–76 (1997)
13. Schecter, E.: Handbook of analysis and its foundations. San Diego–New York–Boston: Acad. Press, 1997
14. VandenBerg, J.B.: The phase-plane picture for a class of fourth-order conservative differential equations. J. Diff. Eq. 161, 110–153 (2000)
15. VandenBerg, J.B.: Uniqueness of solutions for the extended Fisher–Kolmogorov equation. Comptes Rendus Acad. Sci. Paris (Série I) 326, 447–452 (1998)
16. VandenBerg, J.B. and VanderVorst, R.C.A.M.: Stable patterns for fourth order parabolic equations. Preprint (2000)
Communicated by Ya. G. Sinai