Commun. Math. Phys. 267, 1–12 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0033-1
Communications in
Mathematical Physics
Simple Waves and a Characteristic Decomposition of the Two Dimensional Compressible Euler Equations
Jiequan Li¹, Tong Zhang², Yuxi Zheng³
¹ Department of Mathematics, Capital Normal University, Beijing, 100037, P.R. China
² Institute of Mathematics, Chinese Academy of Sciences, Beijing, 100080, P.R. China
³ Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA
Received: 15 September 2005 / Accepted: 27 December 2005
Published online: 5 May 2006 – © Springer-Verlag 2006
Abstract: We present a characteristic decomposition of the potential flow equation in the self-similar plane. The decomposition allows for a proof that any wave adjacent to a constant state is a simple wave for the adiabatic Euler system. This result is a generalization of the well-known result on 2-d steady potential flow and a recent similar result on the pressure gradient system.

1. Introduction

The one-dimensional wave equation
\[ u_{tt} - c^2 u_{xx} = 0 \tag{1} \]
with constant speed $c$ has an interesting decomposition
\[ (\partial_t + c\partial_x)(\partial_t - c\partial_x)u = 0, \tag{2} \]
or
\[ (\partial_t - c\partial_x)(\partial_t + c\partial_x)u = 0, \tag{3} \]
known from elementary text books. One can rewrite them as
\[ \partial_+\partial_- u = 0, \quad\text{or}\quad \partial_-\partial_+ u = 0, \tag{4} \]
where $\partial_\pm = \partial_t \pm c\partial_x$. Sometimes, the same fact is written in Riemann invariants
\[ \partial_t R + c\partial_x R = 0, \qquad \partial_t S - c\partial_x S = 0 \tag{5} \]
Research partially supported by NSF of China with No. 10301022, NSF from Beijing Municipality, Fok Ying Tong Educational Foundation, and the Key Program from Beijing Educational Commission with no. KZ200510028018. Research partially supported by NSF-DMS-0305497, 0305114.
for the Riemann invariants
\[ R := (\partial_t - c\partial_x)u, \qquad S := (\partial_t + c\partial_x)u. \tag{6} \]
For a system of a pair of hyperbolic conservation laws
\[ \begin{pmatrix} u \\ v \end{pmatrix}_t + \begin{pmatrix} f(u,v) \\ g(u,v) \end{pmatrix}_x = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{7} \]
it is known that a pair of Riemann invariants exist so that the system can be rewritten as
\[ \begin{aligned} \partial_t R + \lambda_1(u,v)\,\partial_x R &= 0,\\ \partial_t S + \lambda_2(u,v)\,\partial_x S &= 0, \end{aligned} \tag{8} \]
where $(R, S)$ are the Riemann invariants and the $\lambda$'s are the two eigenvalues of the system. These decompositions and Riemann invariants are useful in the construction of solutions, for example, the construction of the D'Alembert formula, and proof of development of singularities ([5]). An example of the system is the system of isentropic irrotational steady two-dimensional Euler equations for compressible ideal gases
\[ \begin{aligned} (c^2 - u^2)u_x - uv(u_y + v_x) + (c^2 - v^2)v_y &= 0,\\ u_y - v_x &= 0, \end{aligned} \tag{9} \]
supplemented by Bernoulli's law
\[ \frac{c^2}{\gamma - 1} + \frac{u^2 + v^2}{2} = k^2, \tag{10} \]
where $\gamma > 1$ is the gas constant while $k > 0$ is an integration constant. This system has two unknowns $(u, v)$, and by the existence of Riemann invariants, any solution adjacent to a constant state is a simple wave. A simple wave means a solution $(u, v)$ that depends on a single parameter rather than the pair of parameters $(x, y)$. Since explicit expressions are lacking, the concept of Riemann invariants plays a limited role in a much broader sense, e.g., to treat the full Euler equations. In recent years, the pressure gradient system
\[ \begin{aligned} u_t + p_x &= 0,\\ v_t + p_y &= 0,\\ E_t + (up)_x + (vp)_y &= 0, \end{aligned} \tag{11} \]
where $E = p + (u^2 + v^2)/2$, has been known to have "simple waves" adjacent to a constant state $(u, v, p)$ in the self-similar variable plane $(\xi, \eta) = (x/t, y/t)$. This system has three equations and no Riemann invariants have been found. But the equation for the unknown variable $p$ in the $(\xi, \eta)$ plane
\[ (p - \xi^2)p_{\xi\xi} - 2\xi\eta\,p_{\xi\eta} + (p - \eta^2)p_{\eta\eta} + \frac{(\xi p_\xi + \eta p_\eta)^2}{p} - 2(\xi p_\xi + \eta p_\eta) = 0 \tag{12} \]
allows for a decomposition
\[ \partial_+\partial_- p = m_+\,\partial_- p, \qquad m_+ := \frac{r^4\lambda_+ p_r}{2p^2}, \tag{13} \]
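where $(r, \theta)$ denotes the polar coordinates of the $(\xi, \eta)$ plane and
\[ \partial_+ = \partial_\theta + \lambda_+^{-1}\partial_r; \qquad \partial_- = \partial_\theta + \lambda_-^{-1}\partial_r; \qquad \lambda_\pm = \pm\sqrt{\frac{p}{r^2(r^2 - p)}}. \tag{14} \]
For convenience of verification we state that the $p$ equation in polar coordinates takes the form
\[ (p - r^2)p_{rr} + \frac{p}{r^2}p_{\theta\theta} + \frac{p}{r}p_r + \frac{1}{p}(r p_r)^2 - 2r p_r = 0. \tag{15} \]
The characteristics are defined by
\[ \frac{d\theta}{dr} = \lambda_\pm. \tag{16} \]
In addition, we know that
\[ \partial_\pm\lambda_\pm = n_\pm\,\partial_\pm p \tag{17} \]
for some nice factors $n_\pm$. These facts allow for expressions
\[ \partial_\mp(\partial_\pm\lambda_\pm) = (\partial_\pm\lambda_\pm)\,f_\pm \tag{18} \]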
for some nice factors $f_\pm$. This decomposition leads directly to the fact that

Proposition 1. A state adjacent to a constant state for the pressure gradient system must be a simple wave in which $p$ is constant along characteristics of a plus (or minus) family.

These lead to the desire to consider the pseudo-steady isentropic irrotational Euler system which has three equations with source terms,
\[ \begin{aligned} (\rho U)_\xi + (\rho V)_\eta &= -2\rho,\\ (\rho U^2 + p(\rho))_\xi + (\rho U V)_\eta &= -3\rho U,\\ (\rho U V)_\xi + (\rho V^2 + p(\rho))_\eta &= -3\rho V, \end{aligned} \tag{19} \]
where $(\xi, \eta) = (x/t, y/t)$, $(U, V) = (u - \xi, v - \eta)$ is the pseudo-velocity, and the pressure $p = p(\rho)$ is a function of the density $\rho$. It turns out that we are unable to find explicit forms of the Riemann invariants, but decompositions similar to $\partial_+\partial_-\lambda_- = m\,\partial_-\lambda_-$ hold for some $m$, presented in Sect. 4. We use the characteristic decomposition of Sect. 4 to establish in Sect. 5 that adjacent to a constant state a wave must be a simple wave for the pseudo-steady irrotational isentropic Euler system. A simple wave for this case is such that one family of wave characteristics are straight lines and the physical quantities (velocity, speed of sound, pressure, and density) are constant along the wave characteristics. Further, using the fact that entropy and vorticity are constant along the pseudo-flow characteristics (the pseudo-flow lines), our irrotational result extends to the adiabatic full Euler system, see Sect. 5.
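2. Two-by-Two System

Consider a $2 \times 2$ hyperbolic system in the Riemann invariants
\[ \partial_t R + \lambda_1\partial_x R = 0, \qquad \partial_t S + \lambda_2\partial_x S = 0. \tag{20} \]
So we find that
\[ \partial_2\lambda_2 := (\partial_t + \lambda_2\partial_x)\lambda_2 = \lambda_{2,R}\,\partial_2 R, \tag{21} \]
where $\lambda_{2,R} := \partial_R\lambda_2$. We go on to find
\[ \partial_1\partial_2 R = \frac{\partial_1\lambda_2 - \partial_2\lambda_1}{\lambda_2 - \lambda_1}\,\partial_2 R, \tag{22} \]
and so
\[ \partial_1\left(\frac{1}{\lambda_{2,R}}\,\partial_2\lambda_2\right) = \frac{\partial_1\lambda_2 - \partial_2\lambda_1}{\lambda_2 - \lambda_1}\,\frac{\partial_2\lambda_2}{\lambda_{2,R}}, \tag{23} \]
which implies
\[ \partial_1\partial_2\lambda_2 = \left(\frac{\partial_1\lambda_2 - \partial_2\lambda_1}{\lambda_2 - \lambda_1} + \frac{\partial_1\lambda_{2,R}}{\lambda_{2,R}}\right)\partial_2\lambda_2. \tag{24} \]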
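The step leading to (22) is stated without detail. One way to see it, sketched here under the assumption that $\partial_1 := \partial_t + \lambda_1\partial_x$ is defined in parallel with $\partial_2$ in (21), is through the commutator of the two directional derivatives, using $\partial_1 R = 0$ from (20):

```latex
\begin{aligned}
\partial_1\partial_2 - \partial_2\partial_1
  &= (\partial_1\lambda_2 - \partial_2\lambda_1)\,\partial_x,
  \qquad
  \partial_x = \frac{\partial_2 - \partial_1}{\lambda_2 - \lambda_1},\\
\partial_1\partial_2 R
  &= \partial_2(\partial_1 R)
   + \frac{\partial_1\lambda_2 - \partial_2\lambda_1}{\lambda_2 - \lambda_1}\,(\partial_2 R - \partial_1 R)
   = \frac{\partial_1\lambda_2 - \partial_2\lambda_1}{\lambda_2 - \lambda_1}\,\partial_2 R .
\end{aligned}
```

Substituting $\partial_2 R = \partial_2\lambda_2/\lambda_{2,R}$ from (21) then gives (23), and expanding the derivative of the quotient gives (24).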
The elegant form is undermined by the dependence on the Riemann invariant $R$ via the term $\lambda_{2,R}$. It is not very useful when the explicit form of the Riemann invariants is not known. But we think it is worth mentioning. For example, (24) can be applied directly to show that all characteristics in a wave adjacent to a constant state are straight and thus such a wave is a simple wave.

3. Steady Euler System

Let us build explicitly the characteristic decomposition for the steady Euler system for isentropic irrotational flow (9)(10) in the absence of the explicit form of the Riemann invariants. The same technique can be extended to the case of pseudo-steady flows in Sect. 4. We write the system in the form
\[ \begin{pmatrix} u \\ v \end{pmatrix}_x + \begin{pmatrix} \dfrac{-2uv}{c^2 - u^2} & \dfrac{c^2 - v^2}{c^2 - u^2} \\ -1 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}_y = 0. \tag{25} \]
The matrix has eigenvalues
\[ \frac{dy}{dx} = \lambda_\pm = \frac{uv \pm \sqrt{c^2(u^2 + v^2 - c^2)}}{u^2 - c^2}, \tag{26} \]
which are solutions to the characteristic equation
\[ \lambda^2 + \frac{2uv}{c^2 - u^2}\,\lambda + \frac{c^2 - v^2}{c^2 - u^2} = 0. \tag{27} \]
We have the left eigenvectors
\[ \ell_\pm = [1, \lambda_\mp], \tag{28} \]
where we have used the relation
\[ \lambda_\pm\lambda_\mp = \frac{c^2 - v^2}{c^2 - u^2}. \tag{29} \]
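As a quick sanity check on (26), (27) and (29), the following minimal sketch evaluates the two eigenvalues at one supersonic state and confirms numerically that they solve the characteristic equation and have the stated product. The function names and the sample values of $(u, v, c)$ are illustrative assumptions, not taken from the paper.

```python
import math

def steady_eigenvalues(u, v, c):
    """Return (lambda_plus, lambda_minus) from (26); needs u^2 + v^2 >= c^2 and u^2 != c^2."""
    disc = c * c * (u * u + v * v - c * c)
    if disc < 0:
        raise ValueError("subsonic state: the characteristics are not real")
    root = math.sqrt(disc)
    denom = u * u - c * c
    return (u * v + root) / denom, (u * v - root) / denom

def characteristic_residual(lam, u, v, c):
    """Left-hand side of the characteristic equation (27)."""
    return (lam * lam
            + 2.0 * u * v / (c * c - u * u) * lam
            + (c * c - v * v) / (c * c - u * u))

# Illustrative supersonic state (assumed values).
u, v, c = 2.0, 0.5, 1.0
lam_p, lam_m = steady_eigenvalues(u, v, c)
print("residuals of (27):",
      characteristic_residual(lam_p, u, v, c),
      characteristic_residual(lam_m, u, v, c))
print("product vs (29):", lam_p * lam_m, (c * c - v * v) / (c * c - u * u))
```

Both residuals should vanish up to rounding, and the two numbers on the last line should agree.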
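The characteristic form of the system is therefore
\[ \ell_\pm\begin{pmatrix} u \\ v \end{pmatrix}_x + \lambda_\pm\,\ell_\pm\begin{pmatrix} u \\ v \end{pmatrix}_y = 0, \tag{30} \]
which is equivalent to
\[ \partial_\pm u + \lambda_\mp\,\partial_\pm v = 0. \tag{31} \]
We then have
\[ \partial_-\lambda_- = \partial_x\lambda_- + \lambda_-\partial_y\lambda_- = \partial_u\lambda_-\,\partial_- u + \partial_v\lambda_-\,\partial_- v = (\partial_u\lambda_- - \partial_v\lambda_-/\lambda_+)\,\partial_- u. \tag{32} \]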
We shall ignore the similar calculation for $\partial_+\lambda_+$ for simplicity of notation. Now that the term $\partial_-\lambda_-$ differs from $\partial_- u$ by a lower-order factor, we shall focus our attention on $\partial_- u$. First we see that we can derive a second-order equation for $u$, i.e.,
\[ u_{yy} - \left(\frac{2uv}{c^2 - v^2}\,u_y - \frac{c^2 - u^2}{c^2 - v^2}\,u_x\right)_x = 0, \tag{33} \]
or equivalently
\[ u_{xx} - \frac{2uv}{c^2 - u^2}\,u_{xy} + \frac{c^2 - v^2}{c^2 - u^2}\,u_{yy} = \frac{c^2 - v^2}{c^2 - u^2}\left[\left(\frac{2uv}{c^2 - v^2}\right)_x u_y - \left(\frac{c^2 - u^2}{c^2 - v^2}\right)_x u_x\right]. \tag{34} \]
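We now compute the ordered derivative $\partial_+\partial_- u$ to find
\[ \partial_+\partial_- u = u_{xx} + (\lambda_+ + \lambda_-)u_{xy} + \lambda_+\lambda_- u_{yy} + \partial_+\lambda_-\,u_y. \tag{35} \]
We find that
\[ \partial_+\lambda_- = \partial_+ u\,(\partial_u\lambda_- - \partial_v\lambda_-/\lambda_-). \tag{36} \]
Thus we obtain
\[ \partial_+\partial_- u = \frac{c^2 - v^2}{c^2 - u^2}\left[\left(\frac{2uv}{c^2 - v^2}\right)_x u_y - \left(\frac{c^2 - u^2}{c^2 - v^2}\right)_x u_x\right] + (u_x + \lambda_+ u_y)\,u_y\,(\partial_u\lambda_- - \partial_v\lambda_-/\lambda_-). \tag{37} \]
We notice that the above right-hand side is a quadratic form in $(u_x, u_y)$, once we substitute $v_x$ by $u_y$. So we compute further. We use Bernoulli's law to find
\[ (c^2)_x = -(\gamma - 1)(u u_x + v u_y). \tag{38} \]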
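For completeness, (38) follows in one line from Bernoulli's law (10) and the irrotationality condition in (9). The sketch below only uses those two relations:

```latex
\begin{aligned}
\partial_x\!\left(\frac{c^2}{\gamma-1} + \frac{u^2+v^2}{2}\right) = \partial_x k^2 = 0
\;&\Longrightarrow\;
\frac{(c^2)_x}{\gamma-1} + u\,u_x + v\,v_x = 0,\\
v_x = u_y
\;&\Longrightarrow\;
(c^2)_x = -(\gamma-1)\,(u\,u_x + v\,u_y).
\end{aligned}
```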
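So we find
\[ \begin{aligned} \left(\frac{2uv}{c^2 - v^2}\right)_x &= \frac{2}{(c^2 - v^2)^2}\left[v u_x(c^2 - u^2 - v^2 + \gamma u^2) + u u_y(c^2 + \gamma v^2)\right],\\ \left(\frac{c^2 - u^2}{c^2 - v^2}\right)_x &= \frac{-1}{(c^2 - v^2)^2}\left[u u_x(2c^2 - v^2 - u^2 + \gamma u^2 - \gamma v^2) - v u_y(2c^2 - v^2 + \gamma v^2 - u^2 - \gamma u^2)\right]. \end{aligned} \tag{39} \]
We now compute the factor $\partial_u\lambda_- - \lambda_-^{-1}\partial_v\lambda_-$. We use Bernoulli's law to find
\[ (c^2)_u = -(\gamma - 1)u, \qquad (c^2)_v = -(\gamma - 1)v. \tag{40} \]
We use the characteristic equation
\[ (c^2 - u^2)\lambda^2 + 2uv\lambda + c^2 - v^2 = 0 \tag{41} \]
to obtain
\[ \partial_u\lambda_- = \frac{\lambda_-^2(\gamma + 1)u - 2v\lambda_- + (\gamma - 1)u}{2\lambda_-(c^2 - u^2) + 2uv}, \qquad \partial_v\lambda_- = \frac{\lambda_-^2(\gamma - 1)v - 2u\lambda_- + (\gamma + 1)v}{2\lambda_-(c^2 - u^2) + 2uv}. \tag{42} \]
We then simply compute to find
\[ \partial_u\lambda_- - \lambda_-^{-1}\partial_v\lambda_- = \frac{\gamma + 1}{c^2\lambda_-}\,\frac{(u\lambda_- - v)^3}{2\lambda_-(c^2 - u^2) + 2uv}. \tag{43} \]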
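The closed form (43) can be checked numerically. The sketch below treats $\lambda_-$ as a function of $(u, v)$ alone, with $c^2$ eliminated through Bernoulli's law (10), and compares a central finite difference of the left-hand side with the right-hand side. The values of $\gamma$, $k^2$, the sample state, and all function names are assumptions chosen for illustration.

```python
import math

GAMMA = 1.4   # assumed ratio of specific heats
K2 = 3.0      # assumed value of the Bernoulli constant k^2 in (10)

def sound_speed_sq(u, v):
    """c^2 obtained from Bernoulli's law (10)."""
    return (GAMMA - 1.0) * (K2 - 0.5 * (u * u + v * v))

def lam_minus(u, v):
    """lambda_- from (26), with c^2 tied to (u, v) through (10)."""
    c2 = sound_speed_sq(u, v)
    return (u * v - math.sqrt(c2 * (u * u + v * v - c2))) / (u * u - c2)

def factor_closed_form(u, v):
    """Right-hand side of (43)."""
    c2 = sound_speed_sq(u, v)
    lam = lam_minus(u, v)
    return ((GAMMA + 1.0) / (c2 * lam)
            * (u * lam - v) ** 3
            / (2.0 * lam * (c2 - u * u) + 2.0 * u * v))

def factor_finite_difference(u, v, h=1e-6):
    """Left-hand side of (43), with the partial derivatives taken by central differences."""
    d_u = (lam_minus(u + h, v) - lam_minus(u - h, v)) / (2.0 * h)
    d_v = (lam_minus(u, v + h) - lam_minus(u, v - h)) / (2.0 * h)
    return d_u - d_v / lam_minus(u, v)

# Illustrative supersonic state (assumed values).
u0, v0 = 1.5, 0.5
print(factor_closed_form(u0, v0), factor_finite_difference(u0, v0))
```

The two printed numbers should agree to several digits; the comparison rests on the identity $c^2(1 + \lambda^2) = (u\lambda - v)^2$, which is (41) rearranged.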
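Coming back to our equation for $\partial_+\partial_- u$, we have
\[ \begin{aligned} (c^2 - u^2)(c^2 - v^2)\,\partial_+\partial_- u ={}& u_x^2\,u(2c^2 - u^2 - v^2 + \gamma u^2 - \gamma v^2)\\ &+ u_x u_y(-v u^2 - v^3 + 3\gamma v u^2 - \gamma v^3 + Q)\\ &+ u_y^2\,[2u(c^2 + \gamma v^2) + \lambda_+ Q], \end{aligned} \tag{44} \]
where we have introduced the notation
\[ Q := \frac{(c^2 - u^2)(c^2 - v^2)}{2\lambda_-(c^2 - u^2) + 2uv}\,\frac{\gamma + 1}{c^2\lambda_-}\,(u\lambda_- - v)^3 = \frac{(\gamma + 1)(c^2 - v^2)(uH - vc)^3}{2(c^2 - u^2)H(cH - uv)}, \tag{45} \]
where
\[ H := \sqrt{u^2 + v^2 - c^2}. \tag{46} \]
We then factorize the quadratic form to find finally
\[ \partial_+\partial_- u = \frac{u(2c^2 - u^2 - v^2 + \gamma u^2 - \gamma v^2)}{(c^2 - u^2)(c^2 - v^2)}\,(\partial_x u + \alpha\,\partial_y u)\,\partial_- u, \tag{47} \]
where
\[ \alpha = \frac{2u(c^2 + \gamma v^2) + \lambda_+ Q}{\lambda_- u(2c^2 - u^2 - v^2 + \gamma u^2 - \gamma v^2)}. \tag{48} \]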
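Proposition 2. There holds the identity
\[ \partial_+\partial_- u = m(\partial_x u + \alpha\,\partial_y u)\,\partial_- u \tag{49} \]
for $\alpha(u, v)$ given in (48) and $m$ given by
\[ m = \frac{u(2c^2 - u^2 - v^2 + \gamma u^2 - \gamma v^2)}{(c^2 - u^2)(c^2 - v^2)}. \tag{50} \]

We use the relation
\[ \partial_- u = \partial_-\lambda_-/(\partial_u\lambda_- - \partial_v\lambda_-\,\lambda_+^{-1}) \tag{51} \]
to go back to $\partial_+\partial_-\lambda_-$. We find
\[ \partial_u\lambda_- - \partial_v\lambda_-\,\lambda_+^{-1} = -\frac{[4c^2 + (\gamma - 3)(u^2 + v^2)]\,[v\lambda_-(c^2 + u^2) + (c^2 - v^2)u]}{(c^2 - v^2)(c^2 - u^2)[2\lambda_-(c^2 - u^2) + 2uv]} =: G. \tag{52} \]
So we have, writing $\partial_\alpha u := \partial_x u + \alpha\,\partial_y u$ as in (49),
\[ \partial_+\left(\frac{1}{G}\,\partial_-\lambda_-\right) = m\,\partial_\alpha u\,\frac{\partial_-\lambda_-}{G}, \tag{53} \]
or
\[ \partial_+\partial_-\lambda_- = (m\,\partial_\alpha u + \partial_+(\ln|G|))\,\partial_-\lambda_-. \tag{54} \]

Proposition 3. There holds the identity
\[ \partial_+\partial_-\lambda_- = m\,\partial_-\lambda_- \tag{55} \]
for some $m = m(u, v)(\partial_x u + \beta(u, v)\,\partial_y u)$. A similar identity holds for $\partial_-\partial_+\lambda_+$.

We remark that in the application on simple waves, the equation for $u$ is sufficient and the equations for $\lambda_\pm$ are not needed.

4. Pseudo-Steady Euler

We consider the two-dimensional isentropic irrotational ideal flow in the self-similar plane $(\xi, \eta) = (x/t, y/t)$. Bernoulli's law holds:
\[ \frac{c^2}{\gamma - 1} + \frac{U^2 + V^2}{2} = -\varphi, \tag{56} \]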
where $c$ is the speed of sound, $(U, V) = (u - \xi, v - \eta)$ is the pseudo-velocity, while $(u, v)$ is the physical velocity, and $\varphi$ is the pseudo-potential such that
\[ \varphi_\xi = U, \qquad \varphi_\eta = V. \tag{57} \]
The equations of motion can be written as
\[ \begin{aligned} (c^2 - U^2)U_\xi - UV(U_\eta + V_\xi) + (c^2 - V^2)V_\eta &= -2c^2 + U^2 + V^2,\\ V_\xi - U_\eta &= 0. \end{aligned} \tag{58} \]
We can rewrite the equations of motion in a new form
\[ \begin{pmatrix} u \\ v \end{pmatrix}_\xi + \begin{pmatrix} \dfrac{-2UV}{c^2 - U^2} & \dfrac{c^2 - V^2}{c^2 - U^2} \\ -1 & 0 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}_\eta = 0 \tag{59} \]
to draw as much parallelism to the steady case as possible. We emphasize the mixed use of the variables $(U, V)$ and $(u, v)$, i.e., $(U, V)$ is used in the coefficients while $(u, v)$ is used in differentiation. This way we obtain zero on the right-hand side for the system. The eigenvalues are similar as before:
\[ \frac{d\eta}{d\xi} = \Lambda_\pm = \frac{UV \pm \sqrt{c^2(U^2 + V^2 - c^2)}}{U^2 - c^2}. \tag{60} \]
The left eigenvectors are
\[ \ell_\pm = [1, \Lambda_\mp]. \tag{61} \]
And we have similarly
\[ \partial_\pm u + \Lambda_\mp\,\partial_\pm v = 0. \tag{62} \]
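To see why the mixed form (59) has zero on its right-hand side, one can substitute the definition of the pseudo-velocity into the first equation of (58). The short computation below is a sketch using only $(U, V) = (u - \xi, v - \eta)$:

```latex
\begin{aligned}
U_\xi &= u_\xi - 1, \quad U_\eta = u_\eta, \quad V_\xi = v_\xi, \quad V_\eta = v_\eta - 1,\\
(c^2-U^2)U_\xi + (c^2-V^2)V_\eta
  &= (c^2-U^2)u_\xi + (c^2-V^2)v_\eta - 2c^2 + U^2 + V^2 .
\end{aligned}
```

The last three terms cancel against the source term in (58), leaving $(c^2-U^2)u_\xi - UV(u_\eta + v_\xi) + (c^2-V^2)v_\eta = 0$ and $v_\xi - u_\eta = 0$, which is (59) after division by $c^2 - U^2$.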
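Our $\Lambda_\pm$ now depend on more than $(U, V)$. But, let us regard $\Lambda_\pm$ as a simple function of three variables $\Lambda_\pm = \Lambda_\pm(U, V, c^2)$ as given in (60). Thus we need to build differentiation laws for $c^2$. We can directly obtain
\[ \left(\frac{c^2}{\gamma - 1}\right)_\xi + U u_\xi + V v_\xi = 0, \qquad \left(\frac{c^2}{\gamma - 1}\right)_\eta + U u_\eta + V v_\eta = 0. \tag{63} \]
We have
\[ \partial_\pm c^2 = -(\gamma - 1)(U\,\partial_\pm u + V\,\partial_\pm v). \tag{64} \]
So we move on to compute
\[ \begin{aligned} \partial_\pm\Lambda_\pm &= \partial_U\Lambda_\pm\,\partial_\pm U + \partial_V\Lambda_\pm\,\partial_\pm V + \partial_{c^2}\Lambda_\pm\,\partial_\pm c^2\\ &= \partial_U\Lambda_\pm(\partial_\pm u - 1) + \partial_V\Lambda_\pm(\partial_\pm v - \Lambda_\pm) + \partial_{c^2}\Lambda_\pm\,\partial_\pm c^2\\ &= \partial_U\Lambda_\pm\,\partial_\pm u + \partial_V\Lambda_\pm\,\partial_\pm v + \partial_{c^2}\Lambda_\pm\,\partial_\pm c^2 - \partial_U\Lambda_\pm - \partial_V\Lambda_\pm\,\Lambda_\pm. \end{aligned} \tag{65} \]
We need to handle the term $\partial_U\Lambda_\pm + \partial_V\Lambda_\pm\,\Lambda_\pm$. We show it is zero. Recalling that
\[ (c^2 - U^2)\Lambda^2 + 2UV\Lambda + c^2 - V^2 = 0, \tag{66} \]
and regarding that $\Lambda$ depends on the three quantities $(U, V, c^2)$ independently, we can easily find
\[ \Lambda_U = \frac{\Lambda(U\Lambda - V)}{\Lambda(c^2 - U^2) + UV}, \qquad \Lambda_V = -\frac{U\Lambda - V}{\Lambda(c^2 - U^2) + UV}. \tag{67} \]
Thus
\[ \Lambda_U + \Lambda\Lambda_V = 0. \tag{68} \]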
Therefore we end up with
\[ \partial_\pm\Lambda_\pm = \left[\partial_U\Lambda_\pm - \Lambda_\mp^{-1}\,\partial_V\Lambda_\pm - (\gamma - 1)\,\partial_{c^2}\Lambda_\pm\,(U - \Lambda_\mp^{-1}V)\right]\partial_\pm u. \tag{69} \]
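Thus, if one of the quantities $(u, v, c^2, \Lambda_-)$ is a constant along $\Lambda_-$, so are all the rest. So far the properties are very similar to the steady case. We derive an equation for $\partial_- u$. We have a similar second-order equation for $u$,
\[ u_{\eta\eta} = \left(\frac{2UV}{c^2 - V^2}\,u_\eta - \frac{c^2 - U^2}{c^2 - V^2}\,u_\xi\right)_\xi. \tag{70} \]
We have similarly
\[ \begin{aligned} \partial_+\partial_- u &= u_{\xi\xi} + (\Lambda_+ + \Lambda_-)u_{\xi\eta} + \Lambda_-\Lambda_+ u_{\eta\eta} + \partial_+\Lambda_-\,u_\eta\\ &= \frac{c^2 - V^2}{c^2 - U^2}\left[\left(\frac{2UV}{c^2 - V^2}\right)_\xi u_\eta - \left(\frac{c^2 - U^2}{c^2 - V^2}\right)_\xi u_\xi\right] + \partial_+\Lambda_-\,u_\eta. \end{aligned} \tag{71} \]
We compute
\[ \begin{aligned} \partial_+\Lambda_- &= \partial_U\Lambda_-\,\partial_+ U + \partial_V\Lambda_-\,\partial_+ V + \partial_{c^2}\Lambda_-\,\partial_+ c^2\\ &= \left[\partial_U\Lambda_- - \frac{1}{\Lambda_-}\partial_V\Lambda_- - (\gamma - 1)\,\partial_{c^2}\Lambda_-\left(U - \frac{1}{\Lambda_-}V\right)\right]\partial_+ u - (\partial_U\Lambda_- + \Lambda_+\partial_V\Lambda_-). \end{aligned} \tag{72} \]
We continue to find
\[ \begin{aligned} \partial_+\partial_- u ={}& \frac{c^2 - V^2}{c^2 - U^2}\Bigg\{\frac{u_\eta}{(c^2 - V^2)^2}\Big[2V_\xi U(c^2 - V^2) + 2V U_\xi(c^2 - V^2) - 2VU\big((c^2)_\xi - 2V V_\xi\big)\Big]\\ &\quad - \frac{u_\xi}{(c^2 - V^2)^2}\Big[\big((c^2)_\xi - 2U U_\xi\big)(c^2 - V^2) - (c^2 - U^2)\big((c^2)_\xi - 2V V_\xi\big)\Big]\Bigg\}\\ &+ \partial_+ u\,u_\eta\left[\partial_U\Lambda_- - \frac{1}{\Lambda_-}\partial_V\Lambda_- - (\gamma - 1)\,\partial_{c^2}\Lambda_-\left(U - \frac{1}{\Lambda_-}V\right)\right]\\ &- u_\eta(\partial_U\Lambda_- + \Lambda_+\partial_V\Lambda_-). \end{aligned} \tag{73} \]
We apply the rule $U_\xi = u_\xi - 1$, $V_\xi = v_\xi = u_\eta$ to find
\[ \begin{aligned} \partial_+\partial_- u ={}& \frac{c^2 - V^2}{c^2 - U^2}\Bigg\{\frac{u_\eta}{(c^2 - V^2)^2}\Big[2u_\eta U(c^2 - V^2) + 2V u_\xi(c^2 - V^2) + 2VU\big((\gamma - 1)U u_\xi + (\gamma + 1)V u_\eta\big)\Big]\\ &\quad - \frac{u_\xi}{(c^2 - V^2)^2}\Big[-\big((\gamma + 1)U u_\xi + (\gamma - 1)V u_\eta\big)(c^2 - V^2) + (c^2 - U^2)\big((\gamma - 1)U u_\xi + (\gamma + 1)V u_\eta\big)\Big]\Bigg\}\\ &+ \partial_+ u\,u_\eta\left[\partial_U\Lambda_- - \frac{1}{\Lambda_-}\partial_V\Lambda_- - (\gamma - 1)\,\partial_{c^2}\Lambda_-\left(U - \frac{1}{\Lambda_-}V\right)\right]\\ &+ \frac{-2V v_\xi - 2U u_\xi}{c^2 - U^2} - u_\eta(\partial_U\Lambda_- + \Lambda_+\partial_V\Lambda_-). \end{aligned} \tag{74} \]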
We note that terms appear which are linear in the derivatives of $(u, v)$ in addition to the pure quadratic form as in the steady case. The pure quadratic form is identical to the steady case, so we do not need to handle it further. The linear form can be handled as follows. First we use the derivatives $(\Lambda_U, \Lambda_V)$ to compute
\[ \partial_U\Lambda_- + \Lambda_+\partial_V\Lambda_- = \partial_U\Lambda_- + \frac{1}{\Lambda_-}\,\frac{c^2 - V^2}{c^2 - U^2}\,\partial_V\Lambda_- = \frac{2(U\Lambda_- - V)}{c^2 - U^2}. \tag{75} \]
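Relations (68) and (75) lend themselves to a quick numerical check. The sketch below treats $\Lambda_\pm$ as a function of the three independent variables $(U, V, c^2)$, as the text prescribes, and approximates the partial derivatives in $U$ and $V$ by central differences at fixed $c^2$. The sample state and the function names are assumptions for illustration only.

```python
import math

def Lam(U, V, c2, sign):
    """Lambda_plus (sign=+1) or Lambda_minus (sign=-1) from (60), as a function of (U, V, c^2)."""
    return (U * V + sign * math.sqrt(c2 * (U * U + V * V - c2))) / (U * U - c2)

def partials(U, V, c2, sign, h=1e-6):
    """Central-difference partial derivatives in U and V, holding c^2 fixed."""
    dU = (Lam(U + h, V, c2, sign) - Lam(U - h, V, c2, sign)) / (2.0 * h)
    dV = (Lam(U, V + h, c2, sign) - Lam(U, V - h, c2, sign)) / (2.0 * h)
    return dU, dV

# Illustrative pseudo-supersonic state (assumed values).
U, V, c2 = 1.5, 0.5, 0.7
lam_p = Lam(U, V, c2, +1)
lam_m = Lam(U, V, c2, -1)
dU_m, dV_m = partials(U, V, c2, -1)

residual_68 = dU_m + lam_m * dV_m             # should vanish by (68)
lhs_75 = dU_m + lam_p * dV_m                  # left-hand side of (75)
rhs_75 = 2.0 * (U * lam_m - V) / (c2 - U * U) # right-hand side of (75)
print(residual_68, lhs_75, rhs_75)
```

The first printed number should be close to zero and the last two should agree, up to finite-difference error.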
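Then we have
\[ \frac{-2V v_\xi - 2U u_\xi}{c^2 - U^2} - u_\eta(\partial_U\Lambda_- + \Lambda_+\partial_V\Lambda_-) = -\frac{2U}{c^2 - U^2}\,\partial_- u. \tag{76} \]
Thus the linear form is also in the direction of $\Lambda_-$. Combining the steps we end up with

Theorem 4. There holds
\[ \partial_+\partial_- u = \frac{U(2c^2 - U^2 - V^2 + \gamma U^2 - \gamma V^2)}{(c^2 - U^2)(c^2 - V^2)}\,(\partial_\xi u + A\,\partial_\eta u)\,\partial_- u - \frac{2U}{c^2 - U^2}\,\partial_- u, \tag{77} \]
where
\[ A := \frac{2U(c^2 + \gamma V^2) + \Lambda_+\tilde{Q}}{\Lambda_- U(2c^2 - U^2 - V^2 + \gamma U^2 - \gamma V^2)}, \tag{78} \]
and
\[ \tilde{Q} := \frac{(c^2 - U^2)(c^2 - V^2)}{2\Lambda_-(c^2 - U^2) + 2UV}\,\frac{\gamma + 1}{c^2\Lambda_-}\,(U\Lambda_- - V)^3. \tag{79} \]

We then have

Theorem 5. There holds
\[ \partial_+(\partial_-\Lambda_-) = m\,\partial_-\Lambda_-. \tag{80} \]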
Similarly $\partial_-(\partial_+\Lambda_+) = n\,\partial_+\Lambda_+$ holds.

5. Application: Simple Waves

For a system of hyperbolic conservation laws in one space dimension, a centered rarefaction wave is a simple wave, in which one family of characteristics are straight lines and the dependent variables are constant along a characteristic. See any textbook on systems of conservation laws, e.g., Courant and Friedrichs [2], p. 59, and others [9, 3]. Simple waves for the two-dimensional steady Euler system are similar, i.e., one family of characteristics are straight and the velocity is constant along the characteristics. For the two-dimensional self-similar pressure gradient system, see [4], simple waves can be defined similarly, i.e., one nonlinear family of characteristics are straight and the pressure is constant along them. We note that we do not require the velocity to be constant. This way, by the characteristic decomposition, we find that a wave adjacent to a constant state is a simple wave.
In the construction of solutions to the two-dimensional Riemann problem for the Euler system, see any of the sources [7–9], it is important to know how to construct solutions adjacent to a constant state in addition to the constructions of the interaction of rarefaction waves ([6]), subsonic solutions, and transonic shock waves. "Simple waves play a fundamental role in describing and building up solutions of flow problems." (See pp. 59–60, [2]). In [1], Čanić and Keyfitz explored simple waves further and generalized Courant and Friedrichs' theorem by allowing the coefficients of a $2 \times 2$ system (say (7)) to depend on the independent variables $(t, x)$ linearly as well as the dependent variables $(u, v)$. Here the characteristic decomposition $\partial_1\partial_2\lambda_2 = m\,\partial_2\lambda_2$ allows us to conclude that

Theorem 6. Adjacent to a constant state in the self-similar plane of the potential flow system is a simple wave in which the physical variables $(u, v, c)$ are constant along a family of characteristics which are straight lines.
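5.1. Simple waves for full Euler. Consider the adiabatic Euler system for an ideal fluid
\[ \begin{aligned} \rho_t + \nabla\cdot(\rho\mathbf{u}) &= 0,\\ (\rho\mathbf{u})_t + \nabla\cdot(\rho\mathbf{u}\otimes\mathbf{u} + pI) &= 0,\\ (\rho E)_t + \nabla\cdot(\rho E\mathbf{u} + p\mathbf{u}) &= 0, \end{aligned} \tag{81} \]
where
\[ E := \frac{1}{2}|\mathbf{u}|^2 + e, \]
where $e$ is the internal energy. For a polytropic gas, there holds
\[ e = \frac{1}{\gamma - 1}\,\frac{p}{\rho}, \]
where $\gamma > 1$. In the self-similar plane and for smooth solutions, the system takes the form
\[ \begin{aligned} \frac{1}{\rho}\,\partial_s\rho + u_\xi + v_\eta &= 0,\\ \partial_s u + \frac{1}{\rho}\,p_\xi &= 0,\\ \partial_s v + \frac{1}{\rho}\,p_\eta &= 0,\\ \frac{1}{\gamma p}\,\partial_s p + u_\xi + v_\eta &= 0, \end{aligned} \tag{82} \]
where $\partial_s := (u - \xi)\partial_\xi + (v - \eta)\partial_\eta$, which we call pseudo-flow directions, as opposed to the other two characteristic directions, called (pseudo-)wave characteristics. We easily derive
\[ \partial_s(p\rho^{-\gamma}) = 0, \tag{83} \]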
and
\[ \omega_t + (u\omega)_x + (v\omega)_y + \left(\frac{p_y}{\rho}\right)_x - \left(\frac{p_x}{\rho}\right)_y = 0 \tag{84} \]
for the vorticity $\omega := v_x - u_y$. So entropy $p\rho^{-\gamma}$ is constant along the pseudo-flow lines. For a region whose pseudo-flow lines come from a constant state, we see that the entropy is constant in the region. For the isentropic region, vorticity has zero source of production since $(p_y/\rho)_x - (p_x/\rho)_y = 0$. Thus vorticity $t\omega = v_\xi - u_\eta$ (setting $t = 1$ then) satisfies
\[ \partial_s(\omega/\rho) = 0. \tag{85} \]
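The entropy relation (83) is stated as easily derived; a short sketch of the computation, using only the first and fourth equations of (82), is:

```latex
\partial_s \ln\!\left(p\rho^{-\gamma}\right)
  = \frac{\partial_s p}{p} - \gamma\,\frac{\partial_s \rho}{\rho}
  = \gamma\left[\frac{1}{\gamma p}\,\partial_s p - \frac{1}{\rho}\,\partial_s \rho\right]
  = \gamma\left[-(u_\xi + v_\eta) + (u_\xi + v_\eta)\right] = 0 .
```

Since $p\rho^{-\gamma} > 0$, the vanishing of the logarithmic derivative is equivalent to $\partial_s(p\rho^{-\gamma}) = 0$.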
Hence, for a region whose pseudo-flow lines come from a constant state, the vorticity must be zero everywhere. So the region is irrotational and isentropic. Thus our formulas for the potential flow apply. We have

Theorem 7. Adjacent to a constant state in the self-similar plane of the adiabatic Euler system is a simple wave in which the physical variables $(u, v, c, p, \rho)$ are constant along a family of wave characteristics which are straight lines, provided that the region is such that its pseudo-flow characteristics extend into the constant state.

Note added in proof: Lax has a concept of Riemann invariants for large systems, see "Hyperbolic systems of conservation laws. II." Comm. Pure Appl. Math. 10 (1957), pp. 537–566. For the steady irrotational Euler system (9), a pair of Riemann invariants was pointed out to us by Marshall Slemrod as $W_\pm(\theta, q) = \theta \pm \int qc\,(q^2 - c^2)^{-1/2}\,dq$ for $u = q\cos\theta$, $v = q\sin\theta$, where $c^2$ depends on $q$ via Bernoulli's law (10).

Acknowledgements. Y. Zheng would like to thank the mathematics department at Capital Normal University for its hospitality during his visit when this work was done. J. Li thanks Matania Ben-Artzi for his interest.
References
1. Čanić, S., Keyfitz, B.L.: Quasi-one-dimensional Riemann problems and their role in self-similar two-dimensional problems. Arch. Rat. Mech. Anal. 144, 233–258 (1998)
2. Courant, R., Friedrichs, K.O.: Supersonic flow and shock waves. New York: Interscience, 1948
3. Dafermos, C.: Hyperbolic conservation laws in continuum physics (Grundlehren der mathematischen Wissenschaften), Berlin Heidelberg New York: Springer, 2000, pp. 443
4. Dai, Zihuan; Zhang, Tong: Existence of a global smooth solution for a degenerate Goursat problem of gas dynamics. Arch. Rational Mech. Anal. 155, 277–298 (2000)
5. Lax, P.: Development of singularities of solutions of nonlinear hyperbolic partial differential equations. J. Math. Phys. 5, 611–613 (1964)
6. Li, Jiequan: On the two-dimensional gas expansion for compressible Euler equations. SIAM J. Appl. Math. 62, 831–852 (2001)
7. Li, Jiequan; Zhang, Tong; Yang, Shuli: The two-dimensional Riemann problem in gas dynamics. Pitman monographs and surveys in pure and applied mathematics 98. London-New York: Addison Wesley Longman limited, 1998
8. Zhang, Tong; Zheng, Yuxi: Conjecture on the structure of solution of the Riemann problem for two-dimensional gas dynamics systems. SIAM J. Math. Anal. 21, 593–630 (1990)
9. Zheng, Yuxi: Systems of conservation laws: Two-dimensional Riemann problems. 38 PNLDE, Boston: Birkhäuser, 2001

Communicated by P. Constantin
Commun. Math. Phys. 267, 13–23 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0062-9
The Cohomology Algebra of the Semi-Infinite Weil Complex
Andrew R. Linshaw
Department of Mathematics, Brandeis University, Waltham, MA 02454, USA
Received: 17 September 2005 / Accepted: 28 March 2006
Published online: 11 July 2006 – © Springer-Verlag 2006
Abstract: In 1993, Lian-Zuckerman constructed two cohomology operations on the BRST complex of a conformal vertex algebra with central charge 26. They gave explicit generators and relations for the cohomology algebra equipped with these operations in the case of the $c = 1$ model. In this paper, we describe another such example, namely, the semi-infinite Weil complex of the Virasoro algebra. The semi-infinite Weil complex of a tame $\mathbb{Z}$-graded Lie algebra was defined in 1991 by Feigin-Frenkel, and they computed the linear structure of its cohomology in the case of the Virasoro algebra. We build on this result by giving an explicit generator for each non-zero cohomology class, and describing all algebraic relations in the sense of Lian-Zuckerman, among these generators.

1. Introduction

The BRST cohomology of a conformal vertex algebra of central charge 26 is a special case of the semi-infinite cohomology of a tame $\mathbb{Z}$-graded Lie algebra $\mathfrak{g}$ (in this case the Virasoro algebra) with coefficients in a $\mathfrak{g}$-module $M$. The theory of semi-infinite cohomology was developed by Feigin and Frenkel-Garland-Zuckerman [3, 7], and is an analogue of classical Lie algebra cohomology. In general, there is an obstruction to the semi-infinite differential being square-zero which arises as a certain cohomology class in $H^2(\mathfrak{g}, \mathbb{C})$. The semi-infinite Weil complex of $\mathfrak{g}$ is obtained by taking $M$ to be the module of "adjoint semi-infinite symmetric powers" of $\mathfrak{g}$ [4, 1]. In this case, an anomaly cancellation ensures that the differential is always square-zero. The semi-infinite Weil complex is a vertex algebra and its differential arises as the zeroth Fourier mode of a vertex operator, a fact which is useful for doing computations.

This paper is organized as follows. First, we define vertex algebras and their modules, which have been discussed from various different points of view in the literature [2, 6, 8, 9, 15, 10, 13, 16]. We will follow the formalism introduced in [13].
We describe the main examples we need, and then define the BRST complex of a conformal vertex algebra
$\mathcal{A}$ with central charge 26. We then recall the two cohomology operations introduced in [12] on the BRST cohomology $H^*(\mathcal{A})$, namely, the dot product and the bracket. We examine in detail the case where $\mathcal{A} = \mathcal{S}$, i.e., the $\beta\gamma$-ghost system associated to a one-dimensional vector space. This coincides with the module of adjoint semi-infinite symmetric powers of the Virasoro algebra, so the BRST complex of $\mathcal{S}$ is exactly the semi-infinite Weil complex of the Virasoro algebra. Finally, we prove our main result, which is a complete description of the algebraic structure of $H^*(\mathcal{S})$ in the sense of Lian-Zuckerman.

Theorem 1.1. Let $Vir_+$ denote the Lie subalgebra of the Virasoro algebra generated by $L_n$, $n \ge 0$. As a Lie superalgebra with respect to the bracket, $H^*(\mathcal{S})$ is isomorphic to the semi-direct product of $Vir_+$ with its adjoint module. As an associative algebra with respect to the dot product, $H^*(\mathcal{S})$ is a polynomial algebra on one even variable and one odd variable.

2. Vertex Algebras

Let $V = V_0 \oplus V_1$ be a super vector space over $\mathbb{C}$, and let $z, w$ be formal variables. By $QO(V)$, we mean the space of all linear maps
\[ V \to V((z)) = \Big\{ \sum_{n\in\mathbb{Z}} v(n)z^{-n-1} \;\Big|\; v(n) \in V;\ v(n) = 0 \text{ for } n \gg 0 \Big\}. \tag{2.1} \]
Each element $a \in QO(V)$ can be uniquely represented as a power series $a(z) = \sum_{n\in\mathbb{Z}} a(n)z^{-n-1} \in (\mathrm{End}\,V)[[z, z^{-1}]]$, although the latter space is clearly much larger than $QO(V)$. We refer to $a(n)$ as the $n$-th Fourier mode of $a(z)$. Each $a \in QO(V)$ is assumed to be of the shape $a = a_0 + a_1$, where $a_i : V_j \to V_{i+j}((z))$ for $i, j \in \mathbb{Z}/2$, and we write $|a_i| = i$. On $QO(V)$ there is a set of non-associative bilinear operations, $\circ_n$, indexed by $n \in \mathbb{Z}$, which we call the $n$-th circle products. They are defined by
\[ a(w) \circ_n b(w) = \mathrm{Res}_z\, a(z)b(w)\,\iota_{|z|>|w|}(z - w)^n - (-1)^{|a||b|}\,\mathrm{Res}_z\, b(w)a(z)\,\iota_{|w|>|z|}(z - w)^n. \]
Here $\iota_{|z|>|w|} f(z, w) \in \mathbb{C}[[z, z^{-1}, w, w^{-1}]]$ denotes the power series expansion of a rational function $f$ in the region $|z| > |w|$. Note that $\iota_{|z|>|w|}(z - w)^n \neq \iota_{|w|>|z|}(z - w)^n$ for $n < 0$. We usually omit the symbol $\iota_{|z|>|w|}$ and just write $(z - w)^n$ to mean the expansion in the region $|z| > |w|$, and write $(-1)^n(w - z)^n$ to mean the expansion in $|w| > |z|$. It is easy to check that $a(w) \circ_n b(w)$ above is a well-defined element of $QO(V)$. The non-negative circle products are connected through the operator product expansion (OPE) formula ([13], Prop. 2.3). For $a, b \in QO(V)$, we have
\[ a(z)b(w) = \sum_{n\ge 0} a(w) \circ_n b(w)\,(z - w)^{-n-1} + {:}a(z)b(w){:} \tag{2.2} \]
as formal power series in $z, w$. Here
\[ {:}a(z)b(w){:} \;=\; a(z)_-\,b(w) + (-1)^{|a||b|}\,b(w)\,a(z)_+, \]
where $a(z)_- = \sum_{n<0} a(n)z^{-n-1}$ and $a(z)_+ = \sum_{n\ge 0} a(n)z^{-n-1}$. Equation (2.2) is customarily written as
\[ a(z)b(w) \sim \sum_{n\ge 0} a(w) \circ_n b(w)\,(z - w)^{-n-1}, \]
where $\sim$ means equal modulo the term ${:}a(z)b(w){:}$. Note that ${:}a(z)b(z){:}$ is a well-defined element of $QO(V)$. It is called the Wick product of $a$ and $b$, and it coincides with $a(z) \circ_{-1} b(z)$. The other negative circle products are related to this by
\[ n!\; a(z) \circ_{-n-1} b(z) = {:}(\partial^n a(z))\,b(z){:}, \]
where $\partial$ denotes the formal differentiation operator $\frac{d}{dz}$. For $a_1(z), \ldots, a_k(z) \in QO(V)$, the $k$-fold iterated Wick product is defined to be
\[ {:}a_1(z)a_2(z)\cdots a_k(z){:} \;=\; {:}a_1(z)b(z){:}, \]
where $b(z) = {:}a_2(z)\cdots a_k(z){:}$. From the definition, we see that
\[ a(z) \circ_0 b(z) = [a(0), b(z)], \tag{2.3} \]
where the bracket denotes the graded commutator. It follows that $a\circ_0$ is a graded derivation of every circle product [13].

The set $QO(V)$ is a nonassociative algebra with the operations $\circ_n$ and a unit 1. We have $1 \circ_n a = \delta_{n,-1}a$ for all $n$, and $a \circ_n 1 = \delta_{n,-1}a$ for $n \ge -1$. We are interested in subalgebras $\mathcal{A} \subset QO(V)$, that is, linear subspaces of $QO(V)$ containing 1, which are closed under the circle products. In particular $\mathcal{A}$ is closed under formal differentiation $\partial$ since $\partial a = a \circ_{-2} 1$. Following [13], we call such a subalgebra a quantum operator algebra (QOA). Many formal algebraic notions are immediately clear: a QOA homomorphism is just a linear map which sends 1 to 1 and preserves all circle products; a module over $\mathcal{A}$ is a vector space $M$ equipped with a QOA homomorphism $\mathcal{A} \to QO(M)$, etc. A subset $S = \{a_i \mid i \in I\}$ of $\mathcal{A}$ is said to generate $\mathcal{A}$ if any element $a \in \mathcal{A}$ can be written as a linear combination of nonassociative words in the letters $a_i$, $\circ_n$, for $i \in I$ and $n \in \mathbb{Z}$.

Remark 2.1. Fix a nonzero vector $\mathbf{1} \in V$ and let $a, b \in QO(V)$ such that $a(z)_+\mathbf{1} = b(z)_+\mathbf{1} = 0$ for $n \ge 0$. Then it follows immediately from the definition of the circle products that $(a \circ_p b)_+(z)\mathbf{1} = 0$ for all $p$. Thus if a QOA $\mathcal{A}$ is generated by elements $a(z)$ with the property that $a(z)_+\mathbf{1} = 0$, then every element in $\mathcal{A}$ has this property. In this case the vector $\mathbf{1}$ determines a linear map
\[ \chi : \mathcal{A} \to V, \qquad a \mapsto a(-1)\mathbf{1} = \lim_{z\to 0} a(z)\mathbf{1} \]
(called the creation map in [13]), having the following basic properties:
\[ \chi(1) = \mathbf{1}, \qquad \chi(a \circ_n b) = a(n)b(-1)\mathbf{1}, \qquad \chi(\partial^p a) = p!\; a(-p-1)\mathbf{1}. \tag{2.4} \]
Next, we define the notion of commutativity in a QOA.

Definition 2.2. We say that $a, b \in QO(V)$ circle commute if $(z - w)^N[a(z), b(w)] = 0$ for some $N \ge 0$. If $N$ can be chosen to be 0, then we say that $a, b$ commute. A QOA is said to be commutative if its elements pairwise circle commute.
The notion of a commutative QOA is abstractly equivalent to the notion of a vertex algebra (see [8]). Briefly, every commutative QOA $\mathcal{A}$ is itself a faithful $\mathcal{A}$-module, called the left regular module. Define
\[ \rho : \mathcal{A} \to QO(\mathcal{A}), \qquad a \mapsto \hat{a}, \qquad \hat{a}(\zeta)b = \sum_{n\in\mathbb{Z}} (a \circ_n b)\,\zeta^{-n-1}. \]
It can be shown (see [14] and [11]) that $\rho$ is an injective QOA homomorphism, and the quadruple of structures $(\mathcal{A}, \rho, 1, \partial)$ is a vertex algebra in the sense of [8]. Conversely, if $(V, Y, \mathbf{1}, D)$ is a vertex algebra, the subspace $Y(V) \subset QO(V)$ is a commutative QOA. We will refer to a commutative QOA simply as a vertex algebra throughout the rest of this paper.

Remark 2.3. Let $\mathcal{A}'$ be the vertex algebra generated by $\rho(\mathcal{A})$ inside $QO(\mathcal{A})$. Since $\hat{a}(n)1 = a(z) \circ_n 1 = 0$ for all $a \in \mathcal{A}$ and $n \ge 0$, it follows from Remark 2.1 that for every $\alpha \in \mathcal{A}'$, we have $\alpha_+ 1 = 0$. The creation map $\chi : \mathcal{A}' \to \mathcal{A}$ sending $\alpha \mapsto \alpha(-1)1$ is clearly a linear isomorphism since $\chi \circ \rho = \mathrm{id}$. It is often convenient to pass between $\mathcal{A}$ and its image $\mathcal{A}'$ in $QO(\mathcal{A})$. For example, we shall often denote the Fourier mode $\hat{a}(n)$ simply by $a(n)$. When we say that a vertex operator $b(z)$ is annihilated by the Fourier mode $a(n)$ of a vertex operator $a(z)$, we mean that $a \circ_n b = 0$. Here we are regarding $b$ as an element of the state space $\mathcal{A}$, while $a$ operates on the state space, and the map $a \mapsto \hat{a}$ is the state-operator correspondence.

For later use, we write down a formula, valid in any vertex algebra, which measures the non-associativity of the Wick product.

Lemma 2.4. Let $\mathcal{A}$ be a vertex algebra, and let $a, b, c \in \mathcal{A}$. Then
\[ {:}({:}ab{:})c{:} \;-\; {:}abc{:} \;=\; \sum_{n\ge 0} \frac{1}{(n+1)!}\Big( {:}(\partial^{n+1}a)(b \circ_n c){:} + (-1)^{|a||b|}\,{:}(\partial^{n+1}b)(a \circ_n c){:} \Big). \tag{2.5} \]
Note that this sum is finite by circle commutativity. In particular, we see that ${:}({:}ab{:})c{:}$ and ${:}abc{:}$ differ by terms of the form ${:}(\partial^i a)X{:}$ and ${:}(\partial^i b)Y{:}$, where $i \ge 1$ and $X, Y \in \mathcal{A}$.

Proof. By the preceding remark, it suffices to show that $\hat{a}, \hat{b}, \hat{c}$ satisfy this identity, which can be checked by applying the creation map to both sides and then using (2.4).
2.1. Virasoro elements. Many vertex algebras $\mathcal{A}$ have a Virasoro element, that is, a vertex operator $L(z) = \sum_{n\in\mathbb{Z}} L(n)z^{-n-1}$ satisfying the OPE
\[ L(z)L(w) \sim \frac{k}{2}(z - w)^{-4} + 2L(w)(z - w)^{-2} + \partial L(w)(z - w)^{-1}, \tag{2.6} \]
where the constant $k$ is called the central charge. It is customary to write $L(z) = \sum_{n\in\mathbb{Z}} L_n z^{-n-2}$, where $L_n := L(n + 1)$. The Fourier modes $\{L_n \mid n \in \mathbb{Z}\}$ together with a central element $\kappa$ then generate a copy of the Virasoro Lie algebra $Vir$:
\[ [L_n, L_m] = (n - m)L_{n+m} + \frac{1}{12}(n^3 - n)\,\delta_{n,-m}\,\kappa. \]
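As an illustration of the bracket just stated, the following minimal sketch encodes the relations on basis elements with exact rational arithmetic and checks the Jacobi identity on a finite range of indices. The dictionary representation, the label `KAPPA`, and the helper names are illustrative choices, not notation from the paper.

```python
from fractions import Fraction
from itertools import product

KAPPA = "kappa"  # label for the central element; integer labels n stand for L_n

def bracket(x, y):
    """Lie bracket of two elements stored as {label: coefficient} dictionaries."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        if a == KAPPA or b == KAPPA:
            continue  # kappa is central, so every bracket with it vanishes
        coeff = ca * cb
        # [L_a, L_b] = (a - b) L_{a+b} + (a^3 - a)/12 * delta_{a,-b} * kappa
        out[a + b] = out.get(a + b, 0) + coeff * (a - b)
        if a + b == 0:
            out[KAPPA] = out.get(KAPPA, 0) + coeff * Fraction(a**3 - a, 12)
    return {k: c for k, c in out.items() if c != 0}

def add(*elements):
    """Sum of several elements, dropping zero coefficients."""
    total = {}
    for e in elements:
        for k, c in e.items():
            total[k] = total.get(k, 0) + c
    return {k: c for k, c in total.items() if c != 0}

def L(n):
    """The basis element L_n."""
    return {n: Fraction(1)}

def jacobi_defect(n, m, p):
    """[x,[y,z]] + [y,[z,x]] + [z,[x,y]] for x = L_n, y = L_m, z = L_p."""
    x, y, z = L(n), L(m), L(p)
    return add(bracket(x, bracket(y, z)),
               bracket(y, bracket(z, x)),
               bracket(z, bracket(x, y)))

defects = [jacobi_defect(n, m, p)
           for n in range(-4, 5) for m in range(-4, 5) for p in range(-4, 5)]
print(all(d == {} for d in defects))
```

An empty dictionary represents the zero element, so the final line reports whether the Jacobi identity holds for all index triples in the tested range.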
Often we require further that $L_0$ be diagonalizable and $L_{-1}$ acts on $\mathcal{A}$ by formal differentiation. In this case, $(\mathcal{A}, L)$ is called a conformal vertex algebra. An element $a \in \mathcal{A}$ which is an eigenvector of $L_0$ with eigenvalue $\Delta \in \mathbb{C}$ is said to have conformal weight $\Delta$. In any conformal vertex algebra, the operation $\circ_n$ is homogeneous of conformal weight $-n - 1$. In particular, the Wick product $\circ_{-1}$ is homogeneous of conformal weight zero.

2.2. $\beta\gamma$- and $bc$-ghost systems. Let $V$ be a finite-dimensional vector space. Regard $V \oplus V^*$ as an abelian Lie algebra. Then its loop algebra has a one-dimensional central extension by $\mathbb{C}\tau$,
\[ \mathfrak{h} = \mathfrak{h}(V) = (V \oplus V^*)[t, t^{-1}] \oplus \mathbb{C}\tau, \]
which is known as a Heisenberg algebra. Its bracket is given by
\[ [(x, x')t^n, (y, y')t^m] = (\langle y', x\rangle - \langle x', y\rangle)\,\delta_{n+m,0}\,\tau, \]
for $x, y \in V$ and $x', y' \in V^*$. Let $\mathfrak{b} \subset \mathfrak{h}$ be the subalgebra generated by $\tau$, $(x, 0)t^n$, and $(0, x')t^m$, for $n \ge 0$ and $m > 0$, and let $\mathbb{C}$ be the one-dimensional $\mathfrak{b}$-module on which $(x, 0)t^n$ and $(0, x')t^m$ act trivially and the central element $\tau$ acts by the identity. Denote the linear operators representing $(x, 0)t^n$, $(0, x')t^n$ on $U\mathfrak{h} \otimes_{U\mathfrak{b}} \mathbb{C}$ by $\beta^x(n)$, $\gamma^{x'}(n - 1)$, respectively, for $n \in \mathbb{Z}$. The power series
\[ \beta^x(z) = \sum_{n\in\mathbb{Z}} \beta^x(n)z^{-n-1}, \qquad \gamma^{x'}(z) = \sum_{n\in\mathbb{Z}} \gamma^{x'}(n)z^{-n-1} \;\in\; QO(U\mathfrak{h} \otimes_{U\mathfrak{b}} \mathbb{C}) \]
generate a vertex algebra $\mathcal{S}(V)$ inside $QO(U\mathfrak{h} \otimes_{U\mathfrak{b}} \mathbb{C})$, and the generators satisfy the OPE relations
\[ \beta^x(z)\gamma^{x'}(w) \sim \langle x', x\rangle (z - w)^{-1}, \qquad \beta^x(z)\beta^y(w) \sim 0, \qquad \gamma^{x'}(z)\gamma^{y'}(w) \sim 0. \]
This algebra was introduced in [9] and is known as a $\beta\gamma$-ghost system, or a semi-infinite symmetric algebra. The creation map $\chi : \mathcal{S}(V) \to U\mathfrak{h} \otimes_{U\mathfrak{b}} \mathbb{C}$, which sends $a(z) \mapsto a(-1)(1 \otimes 1)$, is easily seen to be a linear isomorphism. By the Poincaré-Birkhoff-Witt theorem, the vector space $U\mathfrak{h} \otimes_{U\mathfrak{b}} \mathbb{C}$ has the structure of a polynomial algebra with generators given by the negative Fourier modes $\beta^x(n)$, $\gamma^{x'}(n)$, $n < 0$, which are linear in $x \in V$ and $x' \in V^*$. It follows from (2.4) that $\mathcal{S}(V)$ is spanned by the collection of iterated Wick products of the form
\[ \mu = {:}\partial^{n_1}\beta^{x_1} \cdots \partial^{n_s}\beta^{x_s}\,\partial^{m_1}\gamma^{x'_1} \cdots \partial^{m_t}\gamma^{x'_t}{:}. \]
$\mathcal{S}(V)$ has a natural $\mathbb{Z}$-grading which we call the $\beta\gamma$-ghost number. Fix a basis $x_1, \ldots, x_n$ for $V$ and a corresponding dual basis $x'_1, \ldots, x'_n$ for $V^*$. Define the $\beta\gamma$-ghost number to be the eigenvalue of the diagonalizable operator $[B, -]$, where $B$ is the zeroth Fourier mode of the vertex operator
\[ \sum_{i=1}^n {:}\beta^{x_i}\gamma^{x'_i}{:}. \]
Clearly $B$ is independent of our chosen basis of $V$, and $\beta^x$, $\gamma^{x'}$ have βγ-ghost numbers $-1$, $1$ respectively.

We can also regard $V \oplus V^*$ as an odd abelian Lie (super)algebra, and consider its loop algebra and a one-dimensional central extension by $\mathbb{C}\tau$ with bracket
\[ [(x, x')t^n, (y, y')t^m] = (\langle y', x\rangle + \langle x', y\rangle)\,\delta_{n+m,0}\,\tau. \]
Call this Lie algebra $\mathfrak{j} = \mathfrak{j}(V)$, and form the induced module $U\mathfrak{j} \otimes_{U\mathfrak{a}} \mathbb{C}$. Here $\mathfrak{a}$ is the subalgebra of $\mathfrak{j}$ generated by $\tau$, $(x, 0)t^n$, and $(0, x')t^m$, for $n \ge 0$ and $m > 0$, and $\mathbb{C}$ is the one-dimensional $\mathfrak{a}$-module on which $(x, 0)t^n$ and $(0, x')t^m$ act trivially and $\tau$ acts by 1. There is a vertex algebra $E(V)$, analogous to $S(V)$, which is generated by the odd vertex operators
\[ b^x(z) = \sum_{n\in\mathbb{Z}} b^x(n) z^{-n-1}, \qquad c^{x'}(z) = \sum_{n\in\mathbb{Z}} c^{x'}(n) z^{-n-1} \in QO(U\mathfrak{j} \otimes_{U\mathfrak{a}} \mathbb{C}), \]
which satisfy the OPE relations
\[ b^x(z)c^{x'}(w) \sim \langle x', x\rangle (z-w)^{-1}, \qquad b^x(z)b^y(w) \sim 0, \qquad c^{x'}(z)c^{y'}(w) \sim 0. \]
This vertex algebra is known as a bc-ghost system, or a semi-infinite exterior algebra. Again the creation map $E(V) \to U\mathfrak{j} \otimes_{U\mathfrak{a}} \mathbb{C}$, $a(z) \mapsto a(-1)(1 \otimes 1)$, is a linear isomorphism. As in the symmetric case, the vector space $U\mathfrak{j} \otimes_{U\mathfrak{a}} \mathbb{C}$ has the structure of an odd polynomial algebra with generators given by the negative Fourier modes $b^x(n)$, $c^{x'}(n)$, $n < 0$, which are linear in $x \in V$ and $x' \in V^*$. As above, it follows that $E(V)$ is spanned by the collection of all iterated Wick products of the vertex operators $\partial^k b^x$ and $\partial^k c^{x'}$, for $k \ge 0$. $E(V)$ has a $\mathbb{Z}$-grading which we call the bc-ghost number (or fermion number). It is given by the eigenvalue of the diagonalizable operator $[F, -]$, where $F$ is the zeroth Fourier mode of the vertex operator
\[ -\sum_{i=1}^{n} :b^{x_i}c^{x'_i}: . \]
$F$ is independent of our choice of basis for $V$, and $b^x$, $c^{x'}$ have bc-ghost numbers $-1$, $1$ respectively. We will denote the bc-ghost number of a homogeneous element $a \in E(V)$ by $|a|$. Note that this coincides with our earlier notation for the $\mathbb{Z}/2$-grading on $E(V)$ coming from its vertex superalgebra structure. This causes no difficulty; since $b^x$, $c^{x'}$ are odd vertex operators, the mod 2 reduction of the bc-ghost number coincides with this $\mathbb{Z}/2$-grading.

Let us specialize to the case where $V$ is a one-dimensional vector space. In this case, $S(V)$ coincides with the module of adjoint semi-infinite symmetric powers of the Virasoro algebra [4]. Fix a basis element $x$ of $V$ and a dual basis element $x'$ of $V^*$. We denote $S(V)$ by $S$, and we denote the generators $\beta^x$, $\gamma^{x'}$ by $\beta$, $\gamma$, respectively. Similarly, we denote $E(V)$ by $E$, and we denote the generators $b^x$, $c^{x'}$ by $b$, $c$, respectively. For a fixed scalar $\lambda \in \mathbb{C}$, define
\[ L^S_\lambda = (\lambda - 1) :\partial\beta\,\gamma: + \lambda :\beta\,\partial\gamma: \; \in S. \tag{2.7} \]
An OPE calculation shows that
\[ L^S_\lambda(z)\beta(w) \sim \lambda\beta(w)(z-w)^{-2} + \partial\beta(w)(z-w)^{-1}, \]
\[ L^S_\lambda(z)\gamma(w) \sim (1-\lambda)\gamma(w)(z-w)^{-2} + \partial\gamma(w)(z-w)^{-1}, \]
\[ L^S_\lambda(z)L^S_\lambda(w) \sim \frac{k}{2}(z-w)^{-4} + 2L^S_\lambda(w)(z-w)^{-2} + \partial L^S_\lambda(w)(z-w)^{-1}, \]
where $k = 12\lambda^2 - 12\lambda + 2$. Hence $L^S_\lambda$ is a Virasoro element of central charge $k$, and $(S, L^S_\lambda)$ is a conformal vertex algebra in which $\beta$, $\gamma$ have conformal weights $\lambda$, $1-\lambda$ respectively. Similarly, define
\[ L^E_\lambda = (1-\lambda) :\partial b\, c: - \lambda :b\,\partial c: \; \in E. \tag{2.8} \]
A calculation shows that $L^E_\lambda$ is a Virasoro element with central charge $k = -12\lambda^2 + 12\lambda - 2$, $(E, L^E_\lambda)$ is a conformal vertex algebra, and $b$, $c$ have conformal weights $\lambda$, $1-\lambda$ respectively.

3. BRST Cohomology
Observe that if $A$, $A'$ are conformal vertex algebras with Virasoro elements $L^A$, $L^{A'}$ of central charges $k$, $k'$, respectively, then $A \otimes A'$ is a conformal vertex algebra with Virasoro element $L^{A \otimes A'} = L^A + L^{A'}$ (i.e., $L^A \otimes 1 + 1 \otimes L^{A'}$) of central charge $k + k'$. To simplify notation, the ordered product $ab = a(z)b(z)$ of two vertex operators $a$, $b$ in the same formal variable $z$ will always denote the Wick product. Fix $\lambda = 2$ in (2.8), and denote the corresponding Virasoro element $L^E_2 = -\partial b\, c - 2b\,\partial c \in E$ by $L^E$. With this choice, $(E, L^E)$ is a conformal vertex algebra of central charge $-26$. For any conformal vertex algebra $(A, L^A)$ of central charge $k$, let $C^*(A) = E \otimes A$. Denote the Virasoro element $L^E + L^A$ by $L^C$. The conformal weight and bc-ghost number are given, respectively, by the eigenvalues of the operators $[L^C_0, -]$ and $[F \otimes 1, -]$ on $C^*(A)$.

Definition 3.1. Let $J_A$ be the following element of $C^*(A)$:
\[ J_A = \left(L^A + \tfrac{1}{2}L^E\right)c + \tfrac{3}{4}\partial^2 c. \tag{3.1} \]
A calculation shows that
\[ J_A(z) \circ_0 b(z) = L^C(z). \tag{3.2} \]
We will denote the zeroth Fourier mode $J_A(0)$ by $Q$, so we may rewrite this equation as $[Q, b(z)] = L^C(z)$ by (2.3). Note that the operator $[Q, -]$ preserves conformal weight and raises bc-ghost number by 1.

Lemma 3.2. $Q^2 = 0$ iff $k = 26$. In this case, we can consider $C^*(A)$ to be a cochain complex graded by bc-ghost number, with differential $[Q, -]$. Its cohomology is called the BRST cohomology associated to $A$, and will be denoted by $H^*(A)$.

Proof. First, note that $Q^2 = \frac{1}{2}[Q, Q] = \frac{1}{2}\mathrm{Res}_w\, J_A(w) \circ_0 J_A(w)$. Computing the OPE of $J_A(z)J_A(w)$ and extracting the coefficient of $(z-w)^{-1}$, we find that
\[ J_A(w) \circ_0 J_A(w) = \frac{3}{2}\,\partial\big(\partial^2 c(w)\, c(w)\big) + \frac{k-26}{12}\,\partial^3 c(w)\, c(w). \]
Since the residue of a total derivative is zero, only the second term contributes, and it follows that $\mathrm{Res}_w\, J_A(w) \circ_0 J_A(w) = 0$ iff $k = 26$.
From now on, we will only consider the case where k = 26.
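The central charge bookkeeping behind the condition $k = 26$ can be spot-checked numerically. The sketch below simply evaluates the two formulas $k = 12\lambda^2 - 12\lambda + 2$ (for $S$) and $k = -12\lambda^2 + 12\lambda - 2$ (for $E$) quoted in Section 2.2; the function names `k_S`, `k_E`, and `total_central_charge` are hypothetical.

```python
from fractions import Fraction


def k_S(lam):
    """Central charge of L^S_lambda, as quoted in Section 2.2."""
    return 12 * lam**2 - 12 * lam + 2


def k_E(lam):
    """Central charge of L^E_lambda, as quoted after (2.8)."""
    return -12 * lam**2 + 12 * lam - 2


def total_central_charge(lam_E, lam_S):
    """Central charge of L^E + L^S on E tensor S (charges add under tensor products)."""
    return k_E(lam_E) + k_S(lam_S)


# With lambda = 2 on both factors the bc-ghost charge -26 is cancelled by +26.
print(k_E(2), k_S(2), total_central_charge(2, 2))
```

With $\lambda = 2$ in both factors the total central charge of $E \otimes S$ is zero, which is the situation used later for the semi-infinite Weil complex. Since $k_S(\lambda) = 26$ is a quadratic equation, $\lambda = -1$ gives the same central charge; this reflects the symmetry of the formula under $\lambda \mapsto 1 - \lambda$.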
4. Algebraic Structure of $H^*(A)$

In this section, we recall without proof some facts from [12] on the algebraic structure of the BRST cohomology. We first note that any cohomology class can be represented by an element $u(z)$ of conformal weight 0, since (3.2) implies that $[Q, b(1)] = L^C_0$. Since $[Q, -]$ acts by derivation on each of the products $\circ_n$ on $C^*(A)$, each $\circ_n$ descends to a product on $H^*(A)$. Since $\circ_n$ lowers conformal weight by $n + 1$, all these products are trivial except for the one induced by $\circ_{-1}$ (the Wick product), which we call the dot product. We write the dot product of $u$ and $v$ as $uv$. The cohomology $H^*(A)$ has another bilinear operation known as the bracket. First, we define the bracket on the space $C^*(A)$.

Definition 4.1. Given $u(z), v(z) \in C^*(A)$, let
\[ \{u(z), v(z)\} = (-1)^{|u|}\big(b(z) \circ_0 u(z)\big) \circ_0 v(z). \tag{4.1} \]
The equivalence between this definition and the one given in [12] is shown in [13]. From this description, it is easy to see that the bracket descends to $H^*(A)$, inducing a well-defined bilinear operation.

Theorem 4.2. The following algebraic identities hold on $H^*(A)$:
\[ uv = (-1)^{|u||v|}\, vu, \tag{4.2} \]
\[ (uv)t = u(vt), \tag{4.3} \]
\[ \{u, v\} = -(-1)^{(|u|-1)(|v|-1)}\{v, u\}, \tag{4.4} \]
\[ (-1)^{(|u|-1)(|t|-1)}\{u, \{v, t\}\} + (-1)^{(|t|-1)(|v|-1)}\{t, \{u, v\}\} + (-1)^{(|v|-1)(|u|-1)}\{v, \{t, u\}\} = 0, \tag{4.5} \]
\[ \{u, vt\} = \{u, v\}t + (-1)^{(|u|-1)|v|}\, v\{u, t\}, \tag{4.6} \]
\[ b \circ_1 \{u, v\} = \{b \circ_1 u, v\} + (-1)^{|u|-1}\{u, b \circ_1 v\}, \tag{4.7} \]
\[ \{\,,\} : H^p \times H^q \to H^{p+q-1}. \tag{4.8} \]
Equations (4.2)–(4.3) say that $H^*(A)$ is an associative, graded commutative algebra with respect to the dot product. Equations (4.4)–(4.5) say that under the bracket, $H^*(A)$ is a Lie superalgebra with respect to the grading (bc-ghost number $- 1$). Also, note that $H^1(A)$ is an ordinary Lie algebra under the bracket. Taking $p = 1$ in (4.8), we see that for every $q$, $H^q(A)$ is a module over $H^1(A)$.
5. The Semi-infinite Weil Complex of the Virasoro Algebra

In this section, we give a complete description of the algebraic structure of $H^*(A)$ in the case $A = S$, with the choice $\lambda = 2$ in (2.7). In this case, the Virasoro element $L^S = L^S_2 = \partial\beta\,\gamma + 2\beta\,\partial\gamma \in S$ has central charge 26.

Definition 5.1. $C^*(S) = E \otimes S$, equipped with the differential $[Q, -]$ and the Virasoro element $L^W = L^E + L^S$, is called the semi-infinite Weil complex associated to $Vir$, and will be denoted by $W$.

Note that $W$ is naturally triply graded. In addition to the conformal weight and bc-ghost number, $W$ is graded by the βγ-ghost number, which is the eigenvalue of $[1 \otimes B, -]$. Note that $[Q, -]$ preserves the βγ-ghost number. Let $W^{i,j} \subseteq W$ denote the conformal weight zero subspace of bc-ghost number $i$, βγ-ghost number $j$, and let $Z^{i,j}, B^{i,j} \subseteq W^{i,j}$ denote the cocycles and coboundaries, respectively, with respect to $[Q, -]$. Let $H^{i,j} = Z^{i,j}/B^{i,j}$. Note that $H^i(S)$ decomposes as the direct sum $\bigoplus_{j\in\mathbb{Z}} H^{i,j}$. In [4] and [5], Feigin-Frenkel computed the linear structure of $H^*(S)$, namely, the dimension of each of the spaces $H^{i,j}$.

Theorem 5.2. For all $j \ge 0$, $\dim H^{0,j} = \dim H^{1,j} = 1$. For all other values of $i, j$, $\dim H^{i,j} = 0$.

This was proved by using the Friedan-Martinec-Shenker bosonization [9] to express $S$ as a submodule of a direct sum of Feigin-Fuchs modules over $Vir$, and then using known results on the structure of these modules. We will assume the results in [4, 5], and use them to describe the algebraic structure of $H^*(S)$. Our first step is to find a canonical generator for each non-zero cohomology class. Recall that $W$ has a basis consisting of the monomials
\[ \partial^{n_1}b \cdots \partial^{n_i}b\; \partial^{m_1}c \cdots \partial^{m_j}c\; \partial^{s_1}\beta \cdots \partial^{s_k}\beta\; \partial^{t_1}\gamma \cdots \partial^{t_l}\gamma \tag{5.1} \]
with $n_1 > \cdots > n_i \ge 0$, $m_1 > \cdots > m_j \ge 0$ and $s_1 \ge \cdots \ge s_k \ge 0$, $t_1 \ge \cdots \ge t_l \ge 0$. Let $D \subset W$ be the subspace spanned by monomials which contain at least one derivative, i.e., at least one of the numbers $n_1, \dots, n_i, m_1, \dots, m_j, s_1, \dots, s_k, t_1, \dots, t_l$ above is positive.

Lemma 5.3. The image of $[Q, -]$ is contained in $D$.

Proof. A straightforward calculation shows that
\[ [Q, b] = L^W, \qquad [Q, c] = c\,\partial c, \qquad [Q, \beta] = c\,\partial\beta + 2\,\partial c\,\beta, \qquad [Q, \gamma] = c\,\partial\gamma - \partial c\,\gamma. \]
The claim follows by applying the graded derivation $[Q, -]$ to a monomial of the form (5.1), and then using (2.5) to express the result as a linear combination of standard monomials of the form (5.1). Note that for any vertex operators $a, b, c \in W$, the expression $:(:ab:)c: - :abc:$, which measures the non-associativity of the Wick product, always lies in $D$.

Lemma 5.4. Let $x = \beta\gamma^2 - bc\gamma + \frac{3}{2}\partial\gamma$. Then $x \in Z^{0,1}$ and $x \notin B^{0,1}$, so $x$ represents a non-zero cohomology class. Since $H^{0,1}$ is 1-dimensional, $x$ generates $H^{0,1}$. Similarly, let $y = c\beta\gamma + \frac{3}{2}\partial c$. Then $y \in Z^{1,0}$ and $y \notin B^{1,0}$, so $y$ generates $H^{1,0}$.
Proof. The proof that $x \in Z^{0,1}$ and $y \in Z^{1,0}$ is a straightforward calculation. Since $x$ contains a monomial with no derivatives, $x \notin D$. By Lemma 5.3, $B^{0,1} \subset D$, so $x \notin B^{0,1}$. Similarly, $y \notin B^{1,0}$.

Our main result is the following.

Theorem 5.5. For each integer $k \ge 0$, $x^k$ represents a non-zero cohomology class in $H^{0,k}$, and $yx^k$ represents a non-zero class in $H^{1,k}$. By Theorem 5.2, these are all the non-zero classes in $H^*(S)$.

Proof. It is clear from the derivation property of $[Q, -]$ that $x^k \in Z^{0,k}$ and $yx^k \in Z^{1,k}$, so it suffices to show that $x^k \notin B^{0,k}$ and $yx^k \notin B^{1,k}$. For each integer $k \ge 0$, define
\[ x_k = \beta^k\gamma^{2k} - k\, bc\,\beta^{k-1}\gamma^{2k-1}, \qquad y_k = c\,\beta^{k+1}\gamma^{2k+1}. \]
We claim that
\[ x^k = x_k + D_k, \tag{5.2} \]
\[ yx^k = y_k + D'_k, \tag{5.3} \]
for some $D_k \in D$ and $D'_k \in D$. Since $x_k$ and $y_k$ have no derivatives, it follows that $x^k$ and $yx^k$ do not lie in $D$. Now we can apply Lemma 5.3 to conclude that $x^k \notin B^{0,k}$ and $yx^k \notin B^{1,k}$. We begin with (5.2) and proceed by induction. The cases $k = 0$ and $k = 1$ are obvious, so assume the statement true for $k - 1$. Then
\[ x^k = \left(\beta\gamma^2 - bc\gamma + \tfrac{3}{2}\partial\gamma\right)x^{k-1} \tag{5.4} \]
\[ \phantom{x^k} = (\beta\gamma^2 - bc\gamma)\big(\beta^{k-1}\gamma^{2k-2} - (k-1)\,bc\,\beta^{k-2}\gamma^{2k-3} + D_{k-1}\big) + E_0, \]
where $E_0 = \frac{3}{2}\partial\gamma\, x^{k-1}$. We expand this product and apply Lemma 2.4 repeatedly:
\[ (bc\gamma)\big((k-1)\,bc\,\beta^{k-2}\gamma^{2k-3}\big) = (k-1)\,b^2c^2\beta^{k-2}\gamma^{2k-2} + E_1 = E_1, \tag{5.5} \]
since $b$, $c$ are anti-commuting variables,
\[ (bc\gamma)(\beta^{k-1}\gamma^{2k-2}) = bc\,\beta^{k-1}\gamma^{2k-1} + E_2, \tag{5.6} \]
\[ (\beta\gamma^2)\big((k-1)\,bc\,\beta^{k-2}\gamma^{2k-3}\big) = (k-1)\,bc\,\beta^{k-1}\gamma^{2k-1} + E_3, \tag{5.7} \]
\[ (\beta\gamma^2)(\beta^{k-1}\gamma^{2k-2}) = \beta^k\gamma^{2k} + E_4, \tag{5.8} \]
where $E_i \in D$ for $i = 0, 1, 2, 3, 4$. It is easy to see that $(bc\gamma)D_{k-1} \in D$ and $(\beta\gamma^2)D_{k-1} \in D$. Equation (5.2) follows by collecting terms from (5.5)–(5.8). Finally, the same argument proves (5.3).
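The derivative-free parts of (5.2) and (5.3) can be checked mechanically. The sketch below works in a simplified model, assumed here for illustration only: all terms in $D$ are discarded, so the Wick product reduces to multiplication in a supercommutative polynomial ring with even generators $\beta, \gamma$ and odd generators $b, c$. A monomial $b^{e_b}c^{e_c}\beta^p\gamma^q$ is stored as the key `(e_b, e_c, p, q)`; the helper names are hypothetical.

```python
ONE = {(0, 0, 0, 0): 1}


def mul(u, v):
    """Product in the supercommutative model; b and c are odd, beta and gamma even."""
    out = {}
    for (b1, c1, p1, q1), k1 in u.items():
        for (b2, c2, p2, q2), k2 in v.items():
            if b1 + b2 > 1 or c1 + c2 > 1:
                continue  # b^2 = c^2 = 0
            # Reordering b^b1 c^c1 b^b2 c^c2 costs a sign when c moves past b.
            sign = -1 if (c1 == 1 and b2 == 1) else 1
            key = (b1 + b2, c1 + c2, p1 + p2, q1 + q2)
            out[key] = out.get(key, 0) + sign * k1 * k2
    return {key: coeff for key, coeff in out.items() if coeff != 0}


def power(u, k):
    """k-fold product of u with itself (k >= 0)."""
    result = ONE
    for _ in range(k):
        result = mul(u, result)
    return result


# Derivative-free parts of x and y from Lemma 5.4.
x0 = {(0, 0, 1, 2): 1, (1, 1, 0, 1): -1}   # beta gamma^2 - b c gamma
y0 = {(0, 1, 1, 1): 1}                      # c beta gamma


def x_k(k):
    """beta^k gamma^{2k} - k b c beta^{k-1} gamma^{2k-1}."""
    terms = {(0, 0, k, 2 * k): 1}
    if k > 0:
        terms[(1, 1, k - 1, 2 * k - 1)] = -k
    return terms


def y_k(k):
    """c beta^{k+1} gamma^{2k+1}."""
    return {(0, 1, k + 1, 2 * k + 1): 1}
```

In this model `power(x0, k)` should equal `x_k(k)` and `mul(y0, power(x0, k))` should equal `y_k(k)`, which mirrors the collection of terms from (5.5)–(5.8). The model says nothing about the correction terms $D_k$, $D'_k$, $E_i$ themselves; it only tracks what survives modulo $D$.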
Using Theorem 5.5, we can now describe the algebraic structure of $H^*(S)$. Let $Vir_+ \subseteq Vir$ be the Lie subalgebra generated by $L_n$, $n \ge 0$. The Cartan subalgebra $\mathfrak{h}$ of $Vir_+$ is generated by $L_0$.
Corollary 5.6. As an associative algebra with respect to the dot product, $H^*(S)$ is a polynomial algebra in one even variable, $x$, and one odd variable, $y$. In other words, $H^*(S)$ is isomorphic to the classical Weil algebra associated to $\mathfrak{h}$.

It is easy to check from the definition of the bracket (4.1) that $\{y, x\} = -x$. Using the graded derivation property of the bracket with respect to the dot product, we can write down all the bracket relations in $H^*(S)$. For any $n, m \ge 0$,
\[ \{x^n, x^m\} = 0, \qquad \{yx^n, x^m\} = (n-m)\,x^{n+m}, \qquad \{yx^n, yx^m\} = (n-m)\,yx^{n+m}. \tag{5.9} \]
It follows that as a Lie algebra, $H^{1,*}$ is isomorphic to $Vir_+$ under the isomorphism $yx^k \mapsto L_k$, $k \ge 0$. As an $H^{1,*}$-module, $H^{0,*}$ is isomorphic to the adjoint representation of $Vir_+$. Finally, we obtain

Corollary 5.7. As a Lie superalgebra with respect to the bracket, $H^*(S)$ is isomorphic to the semi-direct product of $Vir_+$ with its adjoint module.

Acknowledgement. I would like to thank Bong H. Lian for many helpful conversations we have had during the course of this work.
References

1. Akman, F.: A characterization of the differential in semi-infinite cohomology. J. Alg. 162(1), 194–209 (1993)
2. Borcherds, R.: Vertex operator algebras, Kac-Moody algebras and the Monster. Proc. Nat. Acad. Sci. USA 83, 3068–3071 (1986)
3. Feigin, B.: The semi-infinite homology of Lie, Kac-Moody and Virasoro algebras. Russ. Math. Surv. 39(2), 195–196 (1984)
4. Feigin, B., Frenkel, E.: Semi-Infinite Weil complex and the Virasoro algebra. Commun. Math. Phys. 137, 617–639 (1991)
5. Feigin, B., Frenkel, E.: Determinant formula for the free field representations of the Virasoro and Kac-Moody algebras. Phys. Lett. B 286, 71–77 (1992)
6. Frenkel, I.B., Huang, Y.Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Amer. Math. Soc. 104(494), viii+64 (1993)
7. Frenkel, I.B., Garland, H., Zuckerman, G.J.: Semi-infinite cohomology and string theory. Proc. Natl. Acad. Sci. USA 83(22), 8442–8446 (1986)
8. Frenkel, I.B., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. New York: Academic Press, 1988
9. Friedan, D., Martinec, E., Shenker, S.: Conformal invariance, supersymmetry and string theory. Nucl. Phys. B271, 93–165 (1986)
10. Li, H.: Local systems of vertex operators, vertex superalgebras and modules. http://arxiv.org/list//hep-th/9406185, 1994
11. Lian, B., Linshaw, A.: Chiral equivariant cohomology I. http://arxiv.org/list//math.DG/0501084, 2005, to appear in Adv. Math.
12. Lian, B., Zuckerman, G.J.: New perspectives on the BRST-algebraic structure of string theory. Commun. Math. Phys. 154, 613–646 (1993)
13. Lian, B., Zuckerman, G.J.: Commutative quantum operator algebras. J. Pure Appl. Alg. 100(1–3), 117–139 (1995)
14. Lian, B.: Lecture notes on circle algebras. Preprint, 1994
15. Kac, V.G.: Vertex Algebras for Beginners. AMS Univ. Lecture Series, 10, 2nd corrected ed., Providence, RI: Amer. Math. Soc., 2001
16. Moore, G., Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)

Communicated by L. Takhtajan
Commun. Math. Phys. 267, 25–64 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0073-6
Communications in
Mathematical Physics
Quasi-Linear Quantum Field Theories for Maps to Groups and Their Quotients Clifford H. Taubes Department of Mathematics, Harvard University, Cambridge, MA 02138, USA. E-mail:
[email protected] Received: 2 October 2005 / Accepted: 14 February 2006 Published online: 26 July 2006 – © Springer-Verlag 2006
Abstract: I describe a functional integral for maps from $\mathbb{R} \times \mathbb{R}^n$ to a Lie group or its quotient which has a simple renormalization that leads to a quantum field theory for maps from $\mathbb{R}^n$ into the Lie group or its quotient whose Hamiltonian is the time translation generator for a unitary action of the $n+1$ dimensional Poincaré group on the quantum Hilbert space. I also explain how the renormalization provides a functional integral for maps from a Riemann surface to a compact Lie group or its quotient that exhibits many conformal field theoretic properties.

Contents
1. The Construction
2. Renormalization
3. Quantum field theories
4. The action of the Poincaré group
5. Field theories for quotient spaces
6. When the domain is a Riemann surface
7. Remarks on conformal field theories
8. Free field theories
Introduction

I described in a previous paper [T] a construction of a measure on spaces of maps from a topological space to a smooth manifold, and my purpose here is to explore variants of the construction from [T] in the case when the range space is a compact Lie group or its quotient by a compact subgroup. In particular, I describe how a simple renormalization of [T]'s measure when the domain is $\mathbb{R} \times \mathbb{R}^n$ leads to a quantum field theory of maps to the Lie group or its quotient whose Hamiltonian is the time translation generator for a unitary action of the $n + 1$ dimensional Poincaré group on the quantum Hilbert space. I also explain how the renormalization provides a functional integral for maps from a Riemann surface to a compact Lie group or its quotient that exhibits many conformal field theoretic properties. The second to last section here describes what is lacking for a complete conformal field theory. The final subsection describes a Gaussian field theory for maps to a compact Lie group or its quotient.

Support in part by a grant from the National Science Foundation.

1. The Construction

This section constitutes a digression of sorts to summarize the measure that is constructed in [T]. The summary begins by setting notation. To this end, let $M$ denote the domain space, and let $a : M \times M \to [0, \infty)$ denote a continuous function that defines a non-negative definite kernel in the following sense: Fix any positive integer $N$; and choose any $N$ distinct points $\{z_1, \dots, z_N\}$ and $N$ real numbers $\{\eta_1, \dots, \eta_N\}$. Then
\[ \sum_{1 \le i, j \le N} a(z_i, z_j)\,\eta_i\eta_j \ge 0. \tag{1.1} \]
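To make condition (1.1) concrete, here is a minimal numerical sketch. It reads the sum in (1.1) as running over all pairs $1 \le i, j \le N$, as in (1.4), and takes $M = \mathbb{R}$ with two sample kernels chosen purely for illustration: the Gaussian $e^{-(z-w)^2}$, which is a standard example of a non-negative definite kernel, and $|z - w|$, which is non-negative as a function but fails (1.1). Neither kernel comes from the paper.

```python
import math
import random


def quadratic_form(kernel, points, eta):
    """Full double sum  sum_{i,j} a(z_i, z_j) eta_i eta_j  from (1.1)."""
    return sum(kernel(zi, zj) * ei * ej
               for zi, ei in zip(points, eta)
               for zj, ej in zip(points, eta))


def gaussian_kernel(z, w):
    """Illustrative kernel on M = R with values in (0, 1]."""
    return math.exp(-(z - w) ** 2)


def distance_kernel(z, w):
    """Non-negative function that is NOT a non-negative definite kernel."""
    return abs(z - w)


def worst_case(kernel, trials=200, seed=0):
    """Smallest value of the quadratic form over randomly sampled points and weights."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        n = rng.randint(1, 8)
        points = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        eta = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        worst = min(worst, quadratic_form(kernel, points, eta))
    return worst
```

Random sampling can only fail to find a violation; it does not prove that a kernel satisfies (1.1). The distance kernel shows why pointwise non-negativity of $a$ is not enough: with $z = (0, 1)$ and $\eta = (1, -1)$ the form equals $-2$.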
Let $X$ denote the range space, a smooth manifold with a given Riemannian metric. In the cases under consideration, $X$ is assumed to be both compact and oriented. I use $d$ to denote the dimension of $X$. Let $\pi : P \to X$ denote a compact fiber bundle with the following additional data:
• A set, $\{\partial_1, \dots, \partial_d\}$, of $d$ vector fields that generate $TP/\mathrm{kernel}(\pi_*)$ at each point.
• A volume form, $dp$, with total mass 1 and such that each vector field from this set has zero divergence. (1.2)
Note that the symbol $dp$ is also used in what follows to indicate the product volume form on products of $P$. A bundle such as $P$ always exists; as a last resort, one can take $P$ to be the bundle of oriented, orthonormal frames over $X$.

Suppose next that a positive integer $N$ has been chosen. In what follows, $\delta_N$ is used to denote the Dirac delta function on $\times_N P$ with support along the full diagonal. To be precise, $\delta_N\, dp$ is the measure on $\times_N P$ that sends a continuous function $F$ to
\[ \int_{\times_N P} F\,\delta_N\, dp \equiv \int_P F(p, \dots, p)\, dp. \tag{1.3} \]
Here and below, measures, even those singular with respect to Lebesgue measure, are written as if they were honest functions times the volume form. Thus, the volume form is always present in the notation for integration. To continue with the notation, for each $i \in \{1, \dots, N\}$ and $a \in \{1, \dots, d\}$, the symbol $\partial^i_a$ denotes the vector field on $\times_N P$ that differentiates according to the basis vector $\partial_a$ along the $i$-th factor of $P$. Suppose now that $z = (z_1, \dots, z_N) \in \times_N M$ is a chosen point. This point defines the differential operator
\[ A_z \equiv \sum_{1 \le i, j \le N} a(z_i, z_j) \sum_{1 \le a \le d} \partial^i_a \partial^j_a \tag{1.4} \]
on $C^\infty(\times_N P)$.
By virtue of (1.1) and (1.2), this operator is negative semi-definite and symmetric. As a consequence, there exists a measure valued solution to the heat equation on $\times_N P$ that is characterized as follows: This solution, $K_z$, defines a continuous map from $[0, \infty)$ to the space of Borel measures on $\times_N P$ whose value at 0 is the measure $\delta_N$. Moreover, $K_z$ is such that when $F$ is twice differentiable, then the pairing $\langle F, K_z \rangle$ is differentiable on $(0, \infty)$ where it obeys
\[ \frac{d}{ds} \int_{\times_N P} F\, K_z|_s\, dp = \int_{\times_N P} (A_z F)\, K_z|_s\, dp. \tag{1.5} \]
A theorem of Hörmander [H] guarantees that $K_z$ is smooth for $s > 0$ if the bilinear form in (1.1) is non-degenerate, and if the set of higher order Lie brackets of the vector fields in the set $\{\partial_a\}$ span $TP$ at each point. In general,
• $K_z \ge 0$.
• $\int_P K_{(z_1, z_2, \dots, z_N)}(p_1, \dots, p_{N-1}, p)\, dp = K_{(z_1, z_2, \dots, z_{N-1})}(p_1, \dots, p_{N-1})$.
• Let $N$ and $N'$ be positive integers with $N' \le N$. If the final $N - N' + 1$ entries of $z \in \times_N M$ are identical, then $K_z(p_1, \dots, p_N) = \delta_{N-N'+1}\, K_{(z_1, \dots, z_{N'-1}, z_N)}(p_1, \dots, p_{N'})$.
• Let $\sigma$ simultaneously denote an element in the group of permutations of $\{1, \dots, N\}$ and the action of this element on both $\times_N M$ and $\times_N P$. Then $K_z = \sigma^*(K_{\sigma(z)})$. (1.6)

Let $P^M$ denote the space of all maps (continuous or not) from $M$ to $P$. The collection of all such $K_z$ can be used to define a measure on $P^M$ as follows: The measure in question is defined on the σ-algebra that is generated by 'cylinder' sets that are jointly labelled by a positive integer, $N$, together with a collection of $N$ pairs $\{(z_i, U_i)\}_{1 \le i \le N}$ such that $z = (z_1, \dots, z_N) \in \times_N M$ has distinct entries and each $U_j$ is an open subset of $P$. The set labelled by the data $(N, \{(z_j, U_j)\})$ consists of the maps that send each $z_j$ to its partnered set $U_j$. The measure of this set is deemed equal to
\[ \int_{\times_{1 \le j \le N} U_j} K_z|_{s=1}\, dp. \tag{1.7} \]
A theorem of Kolmogorov (see Theorem 1.10 in [SV]) guarantees that the just asserted rules define a bona fide measure on $P^M$. A push-forward measure is induced on $X^M$ from the measure just described on $P^M$. This push-forward measure is defined by its values on certain cylinder sets of maps from $M$ to $X$. Such a set is jointly labelled by a positive integer, $N$, together with a collection of $N$ pairs $\{(z_j, V_j)\}_{1 \le j \le N}$, where $z$ is as above and where $V_j \subset X$ is an open set. The set with this label consists of those $\phi \in X^M$ such that $\phi(z_j) \in V_j$ for all $1 \le j \le N$. The measure of this set is given by the version of (1.7), where each $U_j$ is taken to be the inverse image in $P$ of the corresponding $V_j$.

2. Renormalization

The task here is that of salvaging something from the preceding construction in the case where $M$ is a smooth manifold, but where the function $a$ that appears in (1.1) diverges towards $\infty$ as the distance between its arguments tends to zero. In this case, the operator
that appears in (1.4) has no meaning. A strategy is described below for proceeding in certain instances. Note in this regard that the quantum field theory story that follows uses a divergent version of the function $a$.

To motivate the focus in the quantum field theory on divergent versions of $a$, first recall from [T] that the measure in the case where $M = \mathbb{R} \times Y$ is 'reflection positive' when the function $a$ in (1.1) is the Green's function for a differential operator of the form
\[ -\frac{d^2}{dt^2} + L; \tag{2.1} \]
here $L$ is a non-positive, self-adjoint elliptic differential operator on $Y$ whose order is greater than twice the dimension of $Y$. This lower bound on the degree of $L$ guarantees the continuity of the Green's function on the whole of $M \times M$. The reflection positivity property is used in [T] to construct a Hamiltonian quantum field theory using ideas of Osterwalder-Schrader [OS]. These quantum field theories are of little interest to physics in part because $L$ is not a second order operator. For example, a reasonable action of the $(n + 1)$ dimensional Poincaré group on the quantum Hilbert space in the case $Y = \mathbb{R}^n$ would seem to require a second order version of $L$. Meanwhile, a second order version of $L$ has a Green's function that is singular on the diagonal when $\dim(Y) \ge 1$.

What follows describes a strategy for dealing with a singular version of the function $a$ that appears in (1.1). To start, fix a bounded, continuous function, $c : M \to \mathbb{R}$. This done, define $a_c(z, z')$ to equal $a(z, z')$ when $z \ne z'$ and to equal $c(z)$ when $z = z'$. Thus, $a_c$ is continuous on the complement of the diagonal in $M \times M$, but not continuous on the whole of $M \times M$. Let $R \equiv (r_1, \dots, r_N) \in \mathbb{R}^N$ and replace $A_z$ in (1.4) with
\[ A_{z,R} \equiv \sum_{1 \le j \le N} r_j \sum_{1 \le a \le d} \partial^j_a \partial^j_a + \sum_{1 \le i, j \le N} a_c(z_i, z_j) \sum_{1 \le a \le d} \partial^i_a \partial^j_a. \tag{2.2} \]
The latter operator is negative semi-definite if each $r_j$ is sufficiently positive; lower bounds are determined by the chosen point $z = (z_1, \dots, z_N)$. In any event, if each $r_j$ is sufficiently positive, then there is an analog, $K_{z,R}$, to the measure $K_z$ that appears in (1.5) and (1.6). Note that $K_{z,R}$ depends implicitly on the chosen function $c$ for the values of $a_c$ on the diagonal in $M \times M$. Of course, one can always take $c \equiv 0$, but as is illustrated below, there may be better choices. A physicist might view the choice of $c$ as the choice of a 'normalization scale'.

The plan now is to look for a suite of operators,
\[ \{L_{N,R} : N \in \{1, \dots\} \text{ and } R \equiv (r_1, \dots, r_N) \in \mathbb{R}^N\}, \tag{2.3} \]
where any given $L_{N,R}$ provides a linear map from a subspace in $C^0(\times_N P)$ to $C^0(\times_N P)$. This suite of operators should have the following properties:
• For fixed $N$, all $\{L_{N,R}\}$ are defined on the same dense domain in $C^0(\times_N P)$ and the latter should include the constant function 1.
• Suppose that $F \in C^0(\times_N P)$ is a function from the common domain of $\{L_{N,R}\}$. If $z \in \times_N M$ has distinct entries, and if $R$ is such that $K_{z,R}$ is well defined, then
\[ \int_{\times_N P} (L_{N,R} F)\, K_{z,R}|_{s=1}\, dp \]
is independent of $R$ when all $r_j$ are sufficiently large.
• If $N > 1$ and if $F$ is independent of $p_N$, and if each $r_j$ is sufficiently large, then $L_{N,R}F$ is also independent of $p_N$ and its value at any given $(p_1, \dots, p_{N-1})$ is that of $L_{N-1,R'}F'$, where $F' \equiv F(p_1, \dots, p_{N-1}, \cdot)$ and $R' \equiv (r_1, \dots, r_{N-1})$. (2.4)

In what follows, $T_N \subset C^0(\times_N P)$ denotes the common domain of the set $\{L_{N,R}\}$. A collection of the desired sort of operators is exhibited below in the case where $P$ is a compact Lie group. It is not likely that a collection $\{L_{N,R}\}$ as just described exists in the generic case, even with the second point replaced by the weakened version that demands only the existence of a reasonable limit in (2.4) as one or more of the $r_j$ tend to infinity.

In any event, assume in what follows that $\{L_{N,R}\}$ does exist as described in (2.4). Fix an integer $N > 1$ and suppose that $F$ is a function on $\times_N P$. A point $z \in \times_N M$ and $F$ together define a function on the space of maps from $M$ to $P$ by assigning to a map $\phi$ the value of $F$ at $(\phi(z_1), \dots, \phi(z_N))$. The latter function is denoted by $\Phi_{(z,F)}$. Introduce $\mathcal{F}$ to denote the vector space over $\mathbb{C}$ whose elements are finite linear combinations of the constant function and functions such as $\Phi_{(z,F)}$ with $z$ having distinct entries and with $F$ taken from the domain $T_N$. Distinct versions of $\Phi_{(\cdot)}$ that appear in any given $\Phi$ from $\mathcal{F}$ need not use the same integer $N$. Associate to each $\Phi_{(z,F)}$ the number
\[ \langle \Phi_{(z,F)} \rangle \equiv \lim_{r_1, \dots, r_N \to \infty} \int_{\times_N P} (L_{N,R} F)\, K_{z,R}|_{s=1}\, dp. \tag{2.5} \]
Use these assignments as the basis for the definition of a linear functional on $\mathcal{F}$. The value of the latter on any given $\Phi \in \mathcal{F}$ is denoted by $\langle \Phi \rangle$ in what follows. Keep in mind that there is an implicit dependence on the chosen function $c : M \to \mathbb{R}$.

A measure on a space defines, by integration, a linear functional on some subset of the continuous functions. Conversely, certain sorts of linear functionals are guaranteed to arise from measures (see, e.g. Chapter 14 in [R]). Even so, no claim is made here that the linear functional $\langle \cdot \rangle$ comes from a measure on the set of maps from $M$ to $P$. The functional itself serves the purposes at hand.

The case of interest in this article, and the case assumed without further notice in all that follows, takes $P$ to be a compact Lie group. Meanwhile, $\{\partial_a\}$ is taken to be an orthonormal basis of left invariant vector fields on $P$ for a fixed, bi-invariant metric; and the integration on $P$ is defined with the volume form from this same metric. Here is a collection $\{L_{N,R}\}$ to use in this Lie group case: Let $T_N$ denote the vector space of finite linear combinations of functions that send $p = (p_1, \dots, p_N)$ to $f_1(p_1) \cdots f_N(p_N)$, where each $f_j$ is an eigenfunction of the Laplacian on $P$, this being the operator $\sum_{1 \le a \le d} \partial_a \partial_a$. For $F \in T_N$, set
\[ L_{N,R} F \equiv \exp\Big( - \sum_{1 \le j \le N} r_j \sum_{1 \le a \le d} \partial^j_a \partial^j_a \Big) F. \tag{2.6} \]
Note that the exponentiated operator that appears on the right-hand side of (2.6) has the 'wrong' sign; it is the kernel operator to the backwards heat equation on $\times_N P$. As such, it can be defined only on a rather small domain in $C^0(\times_N P)$. However, it is defined for any choice of $\{r_j\}$ on the domain $T_N$. The fact that the operator $\sum_{1 \le a \le d} \partial_a \partial_a$ on a
compact Lie group commutes with each version of ∂a guarantees that the conditions in (2.4) are obeyed. This same fact about commutators implies a simple relationship between the respective versions of · that arise from two different choices for the function c on M. To state the relation, suppose that the function a on M × M has been specified, and then use · c to denote the version of · that is defined by a given function c : M → R. Let c and c both denote functions on M. Given a positive integer N and a point z = (z 1 , . . . , z N ) ∈ × N M, define R(z) ∈ × N R to so that its k th component is c(z k ) − c (z k ). When F ∈ T N and when z has pairwise distinct entries, set F (z) to equal L N ,R(z) F. Granted this notation, then the c and c versions of · are related as follows: When F ∈ T N , and z has pairwise distinct entries,
$$\langle\Phi_{(z,F)}\rangle_c = \langle\Phi_{(z,F'(z))}\rangle_{c'}. \tag{2.7}$$
As will now be explained, the functional $\langle\cdot\rangle$ has a certain finite dimensional flavor in the case at hand. To start, fix a positive integer $N$ and suppose that $V_1, \ldots, V_N$ are each a finite direct sum of eigenspaces of the Laplacian on $P$. Let $\mathcal{V} \equiv V_1 \times \cdots \times V_N$ denote the corresponding vector subspace of $C^\infty(\times_N P)$. Thus, a function from $\mathcal{V}$ is a linear combination of functions that send any given $p = (p_1, \ldots, p_N)$ to $f_1(p_1) \cdots f_N(p_N)$, where each $f_k$ is from the corresponding $V_k$. To finish setting the stage, let $z \in \times_N M$ denote a point with pairwise distinct components, and let $R \equiv (r_1, \ldots, r_N) \in \times_N \mathbb{R}$. The first remark is that the operator $A_{z,R}$ that is depicted in (2.3) maps $\mathcal{V}$ to itself. This is also true for the operator
$$\hat{A}_z \equiv \sum_{1\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^i \partial_a^j. \tag{2.8}$$
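Both statements hold because each $k \in \{1, \ldots, N\}$ version of $\sum_{1\le a\le d}\partial_a^k\partial_a^k$ commutes with $\hat{A}_z$ and preserves $\mathcal{V}$. Granted the preceding observation, let $\exp(\hat{A}_z)_{\mathcal{V}} \in \mathrm{End}(\mathcal{V})$ denote the exponential of the restriction to $\mathcal{V}$ of the operator in (2.8). The relevance of $\exp(\hat{A}_z)_{\mathcal{V}}$ stems from the following observation: If $F \in \mathcal{V}$, then
$$\langle\Phi_{(z,F)}\rangle = \int_P \bigl(\exp(\hat{A}_z)_{\mathcal{V}} F\bigr)(p, \ldots, p)\,dp. \tag{2.9}$$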
There is one further ‘commutation’ property that plays a key role in what follows. This is stated as

Lemma 2.1. Let $N$ denote a positive integer. Then each $a \in \{1, \ldots, d\}$ version of $\mathfrak{f}_a \equiv \sum_{1\le k\le N}\partial_a^k$ commutes with all of the $N^2$ operators on $\times_N P$ from the set $\bigl\{\sum_{1\le b\le d}\partial_b^i\partial_b^j\bigr\}_{1\le i,j\le N}$. In addition,
$$\int_P (\mathfrak{f}_a U)(p, \ldots, p)\,dp = 0 \tag{2.10}$$
for all $U \in C^\infty(\times_N P)$.

Proof of Lemma 2.1. The collection $\{\mathfrak{f}_a\}_{1\le a\le d}$ generates the diagonal action of $P$ on $\times_N P$ that simultaneously multiplies each factor on the right by the same group element. The fact that each $\mathfrak{f}_a$ commutes with each operator of the form $\sum_{1\le b\le d}\partial_b^i\partial_b^j$
follows because each of the latter is invariant under this diagonal action of $P$. Meanwhile, the identity in (2.10) follows by virtue of the fact that the full diagonal in $\times_N P$ is mapped to itself by this same $P$ action.

Note that (2.9) and the preceding lemma lead to a simple formula for $\langle\cdot\rangle$ in the case $N = 2$ or $N = 3$. Consider first the case $N = 2$, and take $F = f_1(p_1) f_2(p_2)$ with $f_1$ and $f_2$ eigenfunctions of the Laplacian. Then $\langle\Phi_{(z,F)}\rangle = 0$ unless $f_1$ and $f_2$ have the same eigenvalue. In the latter case,
$$\langle\Phi_{((z_1,z_2),\, f_1 f_2)}\rangle = e^{(2a(z_1,z_2) - c(z_1) - c(z_2))\hat{E}} \int_P f_1(p) f_2(p)\,dp, \tag{2.11}$$
where $\hat{E}$ denotes the common eigenvalue of $-\sum_{1\le a\le d}\partial_a\partial_a$. In the case that $N = 3$, take $F = f_1(p_1) f_2(p_2) f_3(p_3)$ with each $f_k$ being an eigenfunction of the Laplacian. Now
$$\langle\Phi_{((z_1,z_2,z_3),\, f_1 f_2 f_3)}\rangle = e^{-\alpha_1\hat{E}_1 - \alpha_2\hat{E}_2 - \alpha_3\hat{E}_3} \int_P f_1(p) f_2(p) f_3(p)\,dp, \tag{2.12}$$
where $-\hat{E}_1$ is $f_1$'s eigenvalue, $\alpha_1 = c(z_1) + a(z_2, z_3) - a(z_1, z_2) - a(z_1, z_3)$; and $\hat{E}_2$, $\hat{E}_3$, $\alpha_2$ and $\alpha_3$ are defined analogously.

This section ends here with an analysis of the behavior of $\langle\Phi_{(z,F)}\rangle$ as the distance between two components of $z$ shrinks to zero. The upcoming Proposition 2.2 summarizes the story. A three part digression follows directly to set the stage.

Part 1. Suppose that $N \ge 2$ is given, and that a function $F \in C^\infty(\times_N P)$ has been specified that sends any given point $(p_1, \ldots, p_N)$ to $f_1(p_1) \cdots f_N(p_N)$, where each $f_k$ is an eigenfunction of the Laplacian on $P$. Suppose that $z = (z_1, z_2, \ldots, z_N) \in \times_N M$ is such that $z_3, \ldots, z_N$ are pairwise distinct. Meanwhile, $z_1 \ne z_2$ and both are very near a point $z_0 \in M$ that is distinct from each of $z_3, \ldots, z_N$. I use $z' \in \times_{N-1} M$ to denote the point $(z_0, z_3, \ldots, z_N)$.

Part 2. Let $V_1$ and $V_2$ denote the respective eigenspaces for the Laplacian on $P$ that contain $f_1$ and $f_2$. The representation theory for the diagonal action of $P$ on $\times_2 P$ can be used to decompose the space $V_1 \times V_2$ as a finite direct sum of spaces, $\oplus_v W_v$, where the restriction to the diagonal in $P \times P$ identifies $W_v$ with a direct sum of copies of a single eigenspace of the Laplacian on $P$. This notation is such that pairs $W_v$ and $W_{v'}$ with $v \ne v'$ map via this restriction homomorphism to distinct eigenspaces of the Laplacian on $P$. I use $\varphi$ in what follows to denote this restriction homomorphism from $V_1 \times V_2$ to $C^\infty(P)$. Thus, the image of $\varphi$ on any given $W_v$ is a particular eigenspace for the Laplacian on $P$. This understood, let $\hat{E}_v$ denote the eigenvalue of $-\sum_{1\le a\le d}\partial_a\partial_a$ on $\varphi(W_v)$.

Part 3. For each $k \ge 3$, let $V_k$ denote the eigenspace of the Laplacian on $P$ that contains $f_k$, and set $\mathcal{V} = V_1 \times \cdots \times V_N$. This done, use $\varphi_v$ to denote the homomorphism from $\mathcal{V}$ to $\varphi(W_v) \times V_3 \times \cdots \times V_N$ that is defined as follows: Let $G = g_1 g_2 \cdots g_N$, where each $g_k \in V_k$, and let $(g_1 g_2)_v$ denote the component of $g_1 g_2$ in $W_v$. Now set $\varphi_v(G) = \varphi((g_1 g_2)_v)\, g_3 \cdots g_N$. With the stage now set, here is the promised
Proposition 2.2. With the point $z' = (z_0, z_3, \ldots, z_N) \in \times_{N-1} M$ fixed, then
$$\langle\Phi_{(z,F)}\rangle = \sum_v e^{(a(z_1,z_2) - c(z_0))(\hat{E}_1 + \hat{E}_2 - \hat{E}_v)} \bigl(\langle\Phi_{(z',\varphi_v(F))}\rangle + o(1)\bigr) \tag{2.13}$$
as $z_1$ and $z_2$ converge to $z_0$. In particular, if $a(\cdot,\cdot)$ is positive when its two entries are close and diverges as their distance converges to zero, then
$$\langle\Phi_{(z,F)}\rangle = e^{(a(z_1,z_2) - c(z_0))(\hat{E}_1 + \hat{E}_2 - \hat{E}_0)} \bigl(\langle\Phi_{(z',\varphi_0(F))}\rangle + o(1)\bigr); \tag{2.14}$$
here the notation uses $\hat{E}_0$ to denote the smallest of the numbers in the set $\{\hat{E}_v\}$, and $\varphi_0$ to denote the corresponding version of $\varphi_v$.

By the way, the leading order term on the right-hand side of (2.14) can vanish. This is the case, for example, when zero is the only constant function in the $C^\infty(P)$ image of $\varphi(W_0) \times V_3 \times \cdots \times V_N$ via the homomorphism that comes by restricting to the full diagonal in $\times_{N-1} P$. Note as well that the $o(1)$ term that appears in (2.13) is bounded by a constant times the distance between $z_1$ and $z_2$ when the function $a$ in (1.1) and the function $c$ are both smooth. However, in general, the $o(1)$ term that appears in (2.14) need not be on the order of the distance between $z_1$ and $z_2$. In particular, the $o(1)$ term in (2.14) will shrink slower than this distance when the function $a$ has a reasonably mild divergence on approach to the diagonal. Such will be the case, for example, when the function $a$ diverges as a small, positive multiple of the absolute value of the log of the distance between its two entries.

Proof of Proposition 2.2. Any $k > 2$ version of either $a(z_1, z_k)$ or $a(z_2, z_k)$ can be written as $a(z_0, z_k) + o(1)$. Likewise, both $c(z_1)$ and $c(z_2)$ can be written as $c(z_0) + o(1)$. This then allows the endomorphism $\hat{A}_z$ from (2.8) and (2.9) to be written as
$$\hat{A}_z = 2\bigl(a(z_1, z_2) - c(z_0)\bigr) \sum_{1\le a\le d}\partial_a^1\partial_a^2 + \hat{A}' + \hat{o}, \tag{2.15}$$
where the notation has
$$\hat{A}' \equiv c(z_0) \sum_{1\le a\le d} (\partial_a^1 + \partial_a^2)(\partial_a^1 + \partial_a^2) + 2\sum_{k\ge 3} a(z_0, z_k) \sum_{1\le a\le d} (\partial_a^1 + \partial_a^2)\partial_a^k + \sum_{3\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^j; \tag{2.16}$$
and where $\hat{o}$ is an endomorphism of $\mathcal{V}$ of size $o(1)$.

The plan now is to study $\hat{A}_z$ by viewing $\hat{o}$ as a small perturbation of the first two terms on the right-hand side of (2.15). In this regard, note that $\sum_{1\le a\le d}\partial_a^1\partial_a^2$ commutes with $\hat{A}'$, and this makes the story for their sum relatively straightforward. To start this story, note that the homomorphism $\varphi$ intertwines any given version of $\partial_a^1 + \partial_a^2$ with the corresponding $\partial_a$. This implies that the endomorphism $\sum_{1\le a\le d}\partial_a^1\partial_a^2$ acts on $W_v$ as multiplication by the constant $\tfrac{1}{2}(\hat{E}_1 + \hat{E}_2 - \hat{E}_v)$. Meanwhile, the endomorphism $\hat{A}'$ preserves any given $W_v \times V_3 \times \cdots \times V_N$. To describe its action here, write $W_v$ as a direct sum of subspaces that are each mapped isomorphically by $\varphi$ onto $\varphi(W_v)$. Then, the action of $\hat{A}'$
on any one of these subspaces is identical to that of the $z' = (z_0, z_3, \ldots, z_N) \in \times_{N-1} M$ version of $\hat{A}_{z'}$ on $\varphi(W_v) \times V_3 \times \cdots \times V_N$. Granted the preceding, a straightforward application of finite dimensional perturbation theory to $\exp(\hat{A}_z)_{\mathcal{V}}$ finds that $\langle\Phi_{(z,F)}\rangle$ has the form that is depicted in (2.13) and (2.14) when $z_1$ is very close to $z_2$.

3. Quantum Field Theories

Consider now the case where $M = \mathbb{R} \times Y$ and $P$ is a compact Lie group. Take the vector fields $\{\partial_a\}$ and integration as in the previous section. Meanwhile, take the function $a$ on $(\mathbb{R} \times Y) \times (\mathbb{R} \times Y)$ to be a positive, constant multiple of a Green's function for an operator on $\mathbb{R} \times Y$ that has the form depicted in (2.1). To keep the story relatively short, I assume in what follows that $L$ is a second order, elliptic operator that is self-adjoint with dense domain in $L^2(Y)$. I also assume that the $L^2$ kernel of $L$ is either trivial or consists of the constant functions. To be more specific about the function $a$ in the case where $Y$ is compact, introduce an orthonormal basis, $\{\eta_\alpha\}$, of eigenfunctions of $L$; here $L\eta_\alpha = -E_\alpha^2 \eta_\alpha$ with $E_\alpha \ge 0$. Write a point $z \in \mathbb{R} \times Y$ as $z = (t, y)$. Granted this notation, suppose $z = (t, y)$ and $z' = (t', y')$ are distinct points in $\mathbb{R} \times Y$ and set
• $a((t, y), (t', y')) = \sigma \sum_\alpha \frac{1}{2E_\alpha} e^{-E_\alpha |t - t'|} \eta_\alpha(y)\eta_\alpha(y')$ if all eigenvalues $E_\alpha$ are non-zero.
• $a((t, y), (t', y')) = -\sigma \frac{1}{2\,\mathrm{Vol}(Y)} |t - t'| + \sigma \sum_{\{\alpha : E_\alpha > 0\}} \frac{1}{2E_\alpha} e^{-E_\alpha |t - t'|} \eta_\alpha(y)\eta_\alpha(y')$ if $L$ annihilates the constants. (3.1)

Here, $\sigma$ is a positive constant. The only noncompact example considered below is that where $Y = \mathbb{R}^n$ and $L$ is either the Laplacian on $L^2(\mathbb{R}^n)$ or differs from the latter by a negative constant. In all cases, the function $c$ that defines the values of $a_c$ on the diagonal is taken to be independent of the $\mathbb{R}$ factor in $\mathbb{R} \times Y$.

My purpose here is to explain how the Osterwalder-Schrader construction takes as input the functional $\langle\cdot\rangle$ and returns a Hilbert space with a strongly continuous, unitary action of $\mathbb{R}$ whose generator is a self-adjoint, positive semi-definite operator.

To begin the Osterwalder-Schrader construction, reintroduce the vector space $\mathcal{F}$, and let $\mathcal{F}^0 \subset \mathcal{F}$ denote the set of functions that consists of the constant function and those of the form $\Phi_{(z,F)}$, where $z \in \times_N M$ has pairwise distinct entries and $F \in C^\infty(\times_N P)$ decomposes as a product of eigenfunctions for the Laplacian on $P$. Thus, $F$ takes any given $p = (p_1, \ldots, p_N)$ to $f_1(p_1) \cdots f_N(p_N)$, where each $f_k$ is an eigenfunction for $\sum_{1\le a\le d}\partial_a\partial_a$. Here, $N$ can be any positive integer. Now let $\mathcal{F}_{++} \subset \mathcal{F}$ denote the subspace that is generated by the constant function and by those elements of $\mathcal{F}^0$ that are labelled by $(z, F)$, where each entry of $z$ has positive first coordinate. This is to say that $z = (z_1, \ldots, z_N)$ and each $z_k = (t_k, y_k)$, where $t_k \in \mathbb{R}$ is positive. Define $\mathcal{F}_{--} \subset \mathcal{F}$ to denote the analogous subspace whose generators consist of the constant function and those $\Phi_{(z,F)}$ where each entry of $z$ has negative first component. Note that when $\Phi_-$ is in $\mathcal{F}_{--}$ and $\Phi_+$ is in $\mathcal{F}_{++}$, then $\Phi_-\Phi_+$ is in $\mathcal{F}$. As a consequence, multiplication of functions defines a vector space homomorphism from $\mathcal{F}_{--} \otimes \mathcal{F}_{++}$ to $\mathcal{F}$.

The abelian group $\mathbb{R}$ acts on $\times_N(\mathbb{R} \times Y)$ by simultaneously translating the $\mathbb{R}$-coordinate of each entry. The image of a given point $z$ under the action of $\tau \in \mathbb{R}$ is denoted in what follows by $\tau \cdot z$.
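The translation and reflection symmetries of the function in (3.1) can be checked numerically in a toy case. The following is a minimal sketch, not taken from the text, under these assumptions: $Y$ is the circle of circumference $2\pi$, $L = d^2/dy^2 - \mu^2$ with $\mu > 0$ (so $L$ has trivial kernel and the first line of (3.1) applies), and the eigenfunction sum is truncated at `n_max`. The names `green_a`, `translate`, `reflect`, `mu` and `n_max` are hypothetical.

```python
import math

def green_a(z, zp, sigma=1.0, mu=0.5, n_max=200):
    """Truncated first line of (3.1), assuming Y = circle of length 2*pi
    and L = d^2/dy^2 - mu^2, so that E_n = sqrt(n^2 + mu^2)."""
    (t, y), (tp, yp) = z, zp
    dt = abs(t - tp)
    # Constant mode: eta = 1/sqrt(2*pi), E = mu.
    total = math.exp(-mu * dt) / (2.0 * mu) / (2.0 * math.pi)
    for n in range(1, n_max + 1):
        e_n = math.sqrt(n * n + mu * mu)
        # cos(ny)cos(ny') + sin(ny)sin(ny') = cos(n(y - y')); each mode carries 1/pi.
        total += math.exp(-e_n * dt) / (2.0 * e_n) * math.cos(n * (y - yp)) / math.pi
    return sigma * total

def translate(tau, z):
    """The R action: shift the first coordinate by tau."""
    return (z[0] + tau, z[1])

def reflect(z):
    """The involution *: change the sign of the first coordinate."""
    return (-z[0], z[1])

z, zp = (0.3, 1.0), (1.1, 2.5)
base = green_a(z, zp)
shifted = green_a(translate(2.5, z), translate(2.5, zp))
mirrored = green_a(reflect(z), reflect(zp))
```

In this sketch `shifted` and `mirrored` agree with `base` up to rounding, because the truncated sum depends on the first coordinates only through $|t - t'|$.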
For example, if (t, y) ∈ R×Y , then τ · (t, y) = (t + τ, y). This R
action can be used to define an $\mathbb{R}$ action on $\mathcal{F}^0$; this is the action whereby $\tau \in \mathbb{R}$ sends any given $F_z$ to $F_{\tau\cdot z}$. This action extends by linearity to an action on the algebra $\mathcal{F}$. In the latter guise, the action of $\tau$ is denoted by $R_\tau$. Note that this action induces an action of the semi-group $[0, \infty) \subset \mathbb{R}$ on the subalgebra $\mathcal{F}_{++} \subset \mathcal{F}$.

The notion of reflection positivity as used here refers to a certain anti-linear involution, $*$, on $\mathcal{F}$ that is defined from the involution on $\times_N(\mathbb{R} \times Y)$ whose effect is to change the sign of the $\mathbb{R}$ factor of each entry. The latter involution is also denoted by $*$. For example, in the case that $(t, y) \in \mathbb{R} \times Y$, then $*(t, y) = (-t, y)$. The involution just defined on $\times_N(\mathbb{R} \times Y)$ is used to define an anti-linear involution on $\mathcal{F}^0$ that sends any given $F_z$ to the complex conjugate of the function $F_{*z}$. This anti-linear involution on $\mathcal{F}^0$ then extends as the desired anti-linear involution of the algebra $\mathcal{F}$. The latter is also denoted as $*$. Note that it maps $\mathcal{F}_{++}$ to $\mathcal{F}_{--}$.

Proposition 3.1. The linear functional $\langle\cdot\rangle$ has the following two properties:
• It is translation invariant in that $\langle R_\tau\Phi\rangle = \langle\Phi\rangle$ for any given $\tau \in \mathbb{R}$ and $\Phi \in \mathcal{F}$.
• It is ‘reflection positive’ in the sense that $\langle(*\Phi)\Phi\rangle$ is non-negative for any $\Phi \in \mathcal{F}_{++}$.

This proposition is proved below. Accept it for the moment to see where it leads. Let $Q(\cdot,\cdot)$ denote the bilinear form on $\mathcal{F}_{++}$ that is defined by associating the number $\langle(*\Phi)\Phi'\rangle$ to any given $\Phi$ and $\Phi'$ from $\mathcal{F}_{++}$. Proposition 3.1 asserts that $Q$ is a positive semi-definite Hermitian form. Let $\ker(Q)$ denote the subspace of vectors $\Phi \in \mathcal{F}_{++}$ with the property that $Q(\Phi, \Phi') = 0$ for all $\Phi' \in \mathcal{F}_{++}$. The form $Q$ thus descends to $\mathcal{F}_{++}/\ker(Q)$ as a positive definite Hermitian form; this is denoted by $Q$ also. In what follows, $\mathcal{H}$ denotes the Hilbert space completion of $\mathcal{F}_{++}/\ker(Q)$ using $Q$.

As noted above, the semi-group $\{R_\tau : \tau \ge 0\}$ maps $\mathcal{F}_{++}$ to itself. Moreover, this action is such that
$$Q(\Phi, R_\tau\Phi') = Q(R_\tau\Phi, \Phi'). \tag{3.2}$$
Indeed, (3.2) is a direct consequence of the fact that the value of the function depicted in (3.1) is unchanged when both $t$ and $t'$ are translated by the same amount. Note that (3.2) implies that $R_\tau$ preserves $\ker(Q)$, and so the assignment $\tau \to R_\tau$ descends to define a semi-group on $\mathcal{F}_{++}/\ker(Q)$.

Theorem 3.2. The Hilbert space $\mathcal{H}$ has a strongly continuous, self-adjoint, one-parameter contraction semi-group whose time $\tau > 0$ member maps the image of any given $\Phi$ in $\mathcal{F}_{++}$ to the image of $R_\tau\Phi$.

The proof of Theorem 3.2 appears below. The generator of Theorem 3.2's semi-group is minus 1 times a self-adjoint, non-negative operator on $\mathcal{H}$ (see, e.g., [HP]). The latter operator is called the Hamiltonian.

There is one more point to make here, this regarding the dependence of Theorem 3.2's Hilbert space and semi-group on the choice for the function $c : Y \to \mathbb{R}$ that is used to define $\langle\cdot\rangle$ by giving the values for $a_c$ on the diagonal in $(\mathbb{R} \times Y) \times (\mathbb{R} \times Y)$. For this purpose, reintroduce the notation that is used in (2.7). Granted this notation, the assignments $\Phi_{(z,F)} \to \Phi_{(z,F'(z))}$ extend by linearity to give an invertible, linear map from $\mathcal{F}_{++}$ to itself. This map is denoted here by $J$. As a consequence of (2.7), the corresponding $c$ and $c'$ versions of $Q$ are such that
$$Q_c(\Phi, \Phi') = Q_{c'}(J\Phi, J\Phi'). \tag{3.3}$$
Thus, $J$ descends to define an isometry from the $c$ version of $\mathcal{H}$ to the $c'$ version. Moreover, as $J$ commutes with the action on $\mathcal{F}_{++}$ of the 1-parameter semigroup $\{R_\tau : \tau \ge 0\}$,
so this isometry intertwines the $c$ and $c'$ versions of the semi-group and its self-adjoint generator.

A change from $c$ to $c'$ might be viewed by a physicist as a change in a choice of renormalization scale for the quantum field theory. As just demonstrated, two such choices lead to equivalent Hilbert spaces and Hamiltonians.

Proof of Proposition 3.1. It is sufficient to establish the reflection positivity condition solely for the linear combinations of functions from $\mathcal{F}^0$ that are all defined using the same integer $N$. Indeed, this is a consequence of (2.4). With the preceding understood, fix $N$ and let $\Lambda$ denote in what follows a finite set of distinct pairs of the form $(z, F)$, where $z$ and $F$ are as follows: First, $z$ is a point in $\times_N(\mathbb{R} \times Y)$ whose entries have positive first coordinate. Meanwhile, $F$, a function on $\times_N P$, has the form $F(p_1, \ldots, p_N) = f_1(p_1) \cdots f_N(p_N)$, where each $f_k$ is an eigenfunction of the Laplacian on $P$. Here, $N = N(\Lambda)$ is a positive integer that depends only on $\Lambda$. When $F$ and $F'$ are functions on $\times_N P$, the function on $(\times_N P) \times (\times_N P)$ that sends any given $(p, p')$ to $F(p)F'(p')$ is denoted in what follows by $F \times F'$. Since $\Lambda$ is finite, all functions on $(\times_N P) \times (\times_N P)$ of the form $\bar{F} \times F'$, where $F$ and $F'$ come from $\Lambda$, lie in some $N' = 2N$ version of the subspace $\mathcal{V} \subset C^\infty(\times_{2N} P)$ that appears in (2.9). As a consequence, the reflection positivity condition for $\Lambda$ is obeyed if and only if
$$\sum_{(z,F),(z',F')\in\Lambda} \int_P \bigl(\exp(\hat{A}_{(*z,z')})_{\mathcal{V}} \cdot (\bar{F} \times F')\bigr)(p, \ldots, p)\,dp \ge 0. \tag{3.4}$$
Three cases will be considered below. The first is that where $Y$ is compact and $L$ has trivial kernel. The second case is that where $Y$ is compact and $L$ annihilates the constant functions. The third case is that where $Y = \mathbb{R}^n$ and $L = -\Delta + m^2$ with $m \ge 0$.

Case 1. This is the case where $a$ is given in the top line of (3.1). The translation invariance required in the first point of Proposition 3.1 follows directly from the fact that the chosen function $c$ is independent of $\mathbb{R}$ and the function $a$ in (3.1) is unchanged if both $t$ and $t'$ are translated the same amount.

The arguments from Sect. 5 of [T] can be used here in this case to prove (3.4). In fact, the arguments in this case are simpler by virtue of the fact that $\exp(\hat{A}_{(*z,z')})_{\mathcal{V}}$ is an endomorphism of a finite dimensional vector space. What follows is a brief summary of the arguments for the case at hand. To start, the operator $\hat{A}_{(*z,z')}$ that appears in (3.4) can be written as
$$\hat{A}_{(*z,z')} = \hat{A}_z + \hat{A}'_{z'} + \sigma \sum_{1\le a\le d}\sum_\alpha t_{\alpha,a}(z)\, t'_{\alpha,a}(z'), \tag{3.5}$$
where the notation is as follows: First, $\hat{A}_z$ depends only on $z$ and acts only on $\bar{F}$, while $\hat{A}'_{z'}$ depends only on $z'$ and acts only on $F'$. Second, the sum indexed by the Greek letter $\alpha$ is the same sum over an orthonormal eigenbasis for the operator $L$ that appears in (3.1). Finally,
$$t_{\alpha,a}(z) = \sum_{1\le i\le N} \frac{1}{\sqrt{2E_\alpha}}\, e^{-E_\alpha t_i}\, \eta_\alpha(y_i)\, \partial_a^i, \tag{3.6}$$
and $t'_{\alpha,a}(z')$ is identical save that $z'$ is used and that $\partial_a^{\prime i}$ replaces $\partial_a^i$. Thus, $t_{\alpha,a}(z)$ acts solely on $\bar{F}$ and $t'_{\alpha,a}(z')$ acts solely on $F'$.
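The following worked equation is a sketch of why the last term in (3.5) reproduces the cross terms built from $a(*z_i, z'_j)$. It rests on two assumptions that are not spelled out above: the prefactor in (3.6) is $(2E_\alpha)^{-1/2}$, and the eigenfunctions $\eta_\alpha$ are real valued. Since all of the $t_i$ and $t'_j$ are positive, $|(-t_i) - t'_j| = t_i + t'_j$, and so the exponential in the first line of (3.1) factors.

```latex
\begin{aligned}
\sigma \sum_{\alpha} t_{\alpha,a}(z)\, t'_{\alpha,a}(z')
 &= \sigma \sum_{1\le i,j\le N} \Bigl( \sum_{\alpha} \frac{1}{2E_\alpha}\,
      e^{-E_\alpha t_i}\, e^{-E_\alpha t'_j}\,
      \eta_\alpha(y_i)\,\eta_\alpha(y'_j) \Bigr)\,
      \partial_a^{i}\, \partial_a^{\prime j} \\
 &= \sigma \sum_{1\le i,j\le N} \Bigl( \sum_{\alpha} \frac{1}{2E_\alpha}\,
      e^{-E_\alpha |(-t_i) - t'_j|}\,
      \eta_\alpha(y_i)\,\eta_\alpha(y'_j) \Bigr)\,
      \partial_a^{i}\, \partial_a^{\prime j} \\
 &= \sum_{1\le i,j\le N} a(*z_i, z'_j)\, \partial_a^{i}\, \partial_a^{\prime j}.
\end{aligned}
```

The point of the factorization is that each summand is a product of an operator that sees only $z$ and one that sees only $z'$; this product structure is what the positivity argument exploits.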
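The next observation is that for the purposes of establishing (3.4), all terms in (3.5) and (3.6) can be viewed as elements in $\mathrm{End}(\mathcal{V})$. Note in particular that the sum in (3.6) is absolutely convergent as an element in $\mathrm{End}(\mathcal{V})$. This understood, the decomposition in (3.6) is used to write $\exp(\hat{A}_{(*z,z')})_{\mathcal{V}}$ as an absolutely convergent sum that has the schematic form
$$\begin{aligned}
M \times M' &+ \sigma \sum_{1\le a\le d}\sum_\alpha \int_0^1 \bigl(M^{1-s}\, t_{\alpha,a}\, M^{s}\bigr) \times \bigl(M'^{\,1-s}\, t'_{\alpha,a}\, M'^{\,s}\bigr)\,ds \\
&+ \sigma^2 \sum_{1\le a,b\le d}\sum_{\alpha,\beta} \int_0^1\!\!\int_0^{s_1} \bigl(M^{1-s_1}\, t_{\alpha,a}\, M^{s_1-s_2}\, t_{\beta,b}\, M^{s_2}\bigr) \times \bigl(M'^{\,1-s_1}\, t'_{\alpha,a}\, M'^{\,s_1-s_2}\, t'_{\beta,b}\, M'^{\,s_2}\bigr)\,ds_2\,ds_1 + \cdots.
\end{aligned} \tag{3.7}$$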
Here, the unprimed terms are defined by $z$ and operate only on $\bar{F}$, while the primed terms are defined by $z'$ and operate only on $F'$. Also, $M = \exp(\hat{A}_z)$ and $M' = \exp(\hat{A}'_{z'})$, each viewed as an endomorphism of a particular finite dimensional subspace of $C^\infty(\times_N P)$. Note that any given term in (3.7) is a sum of integrals of endomorphisms where each endomorphism sends $\bar{F} \times F'$ to a function that has the form $U_z\bar{F} \times U_{z'}F'$, with the assignment $z \to U_z$ indicating a certain endomorphism valued function on the portion of $\times_N(\mathbb{R} \times Y)$ where all the components have positive $\mathbb{R}$ coordinate. As a consequence, each term in (3.7) makes a non-negative contribution to (3.4). Indeed, such is the case because
$$\sum_{(z,F),(z',F')\in\Lambda} \int_P (U_z\bar{F} \times U_{z'}F')(p, \ldots, p)\,dp = \int_P \Bigl| \sum_{(z,F)\in\Lambda} (U_z F)(p, \ldots, p) \Bigr|^2 dp. \tag{3.8}$$

Case 2. The translation invariance required by Proposition 3.1 follows by virtue of the fact that the function $c$ on $\mathbb{R} \times Y$ comes from $Y$ and the function that is depicted in the second line of (3.1) is left unchanged when both $t$ and $t'$ are translated by the same amount.

The argument for (3.4) in this case uses the following strategy: With $\Lambda$ fixed, and a positive real number, $m$, chosen, each version of the $\hat{A}_{(*z,z')} \in \mathrm{End}(\mathcal{V})$ that appears in (3.4) is replaced by an endomorphism, $\hat{A}^m_{(*z,z')} \in \mathrm{End}(\mathcal{V})$, with two key properties. First, the argument from Case 1 proves the version of (3.4) that has $\hat{A}^m_{(*z,z')}$ replacing $\hat{A}_{(*z,z')}$. Second, given $\varepsilon > 0$, then
$$\Bigl| \int_P \bigl(\exp(\hat{A}^m_{(*z,z')})_{\mathcal{V}} (\bar{F} \times F')\bigr)(p, \ldots, p)\,dp - \int_P \bigl(\exp(\hat{A}_{(*z,z')})_{\mathcal{V}} (\bar{F} \times F')\bigr)(p, \ldots, p)\,dp \Bigr| < \varepsilon \tag{3.9}$$
when $m$ is sufficiently small.

To obtain $\hat{A}^m_{(*z,z')}$, let $\hat{a}^m(z, z')$ denote the version of the function in the first line of (3.1) that appears when $L$ is replaced by $L - m^2$. With $\hat{a}^m$ understood, define
$$\hat{A}^m_{(*z,z')} = \hat{A}^m_z + \hat{A}'^{\,m}_{z'} + \sigma \sum_{1\le i,j\le N} \hat{a}^m(*z_i, z'_j) \sum_{1\le a\le d} \partial_a^i\, \partial_a^{\prime j}, \tag{3.10}$$
where
$$\hat{A}^m_z \equiv \sum_{1\le i\le N} \Bigl(\frac{\sigma}{2m} + c(z_i)\Bigr) \sum_{1\le a\le d} \partial_a^i\partial_a^i + \sigma \sum_{1\le i\ne j\le N} \hat{a}^m(z_i, z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^j, \tag{3.11}$$
and where $\hat{A}'^{\,m}_{z'}$ is defined similarly using $z'$ and with all versions of $\partial_a^k$ replaced by their primed counterparts. The argument from Case 1 proves the reflection positivity condition in (3.4) for any given positive $m$. The proof that (3.9) holds has four steps.
Step 1. Return to the milieu of (2.7) and (2.8) and let $N'$ and $\mathcal{V}$ be as described in the associated discussion. Now, let $\{a_{ij}\}_{1\le i,j\le N'}$ denote a set of $(N')^2$ real numbers. Any operator of the form $\hat{A} \equiv \sum_{1\le i,j\le N'} a_{ij} \sum_{1\le a\le d}\partial_a^i\partial_a^j$ maps $\mathcal{V}$ to itself, and so $\exp(\hat{A})$ is well defined as a linear map from $\mathcal{V}$ to $\mathcal{V}$. The latter is denoted as $\exp(\hat{A})_{\mathcal{V}}$. Suppose now that $\{a_{ij}\}$ has been given and also some $\varepsilon > 0$. Then there exists $\delta > 0$ with the following significance: If $\{a'_{ij}\}$ is a second set of real numbers with $\sum_{1\le i,j\le N'} |a_{ij} - a'_{ij}| < \delta$, then $|\exp(\hat{A})_{\mathcal{V}} - \exp(\hat{A}')_{\mathcal{V}}| \le \varepsilon$.

Step 2. The collection $\{\hat{a}^m\}_{m>0}$ has the following key property: Fix $\delta > 0$ and let $O_\delta \subset \times_2(\mathbb{R} \times Y)$ denote the set of pairs $((t, y), (t', y'))$ such that $|t - t'| > \delta$ and $|t| + |t'| < 1/\delta$. Then
$$\Bigl\{\hat{a}^m - \frac{1}{2m}\Bigr\}_{m>0} \text{ converges uniformly to } \hat{a} \text{ on } O_\delta \text{ as } m \to 0. \tag{3.12}$$
Hold onto this last observation for a moment.

Now, let $z$ and $z'$ denote points in $\times_N(\mathbb{R} \times Y)$ that come from $\Lambda$, and define the $N'^2 \equiv 4N^2$ numbers $\{a_{ij}\}$ so that the endomorphism $\hat{A}$ on $\mathcal{V}$ from Step 1 is $\hat{A}_{(*z,z')}$. Meanwhile, let $\{a'_{ij}\}$ denote the $4N^2$ numbers that arise when the function $\hat{a}$ that is used to define $\hat{A}_{(*z,z')}$ is replaced by some very small but positive $m$ version of $\hat{a}^m - \frac{1}{2m}$. Let $\hat{A}'$ denote the corresponding endomorphism of $\mathcal{V}$. The conclusions of Step 1 with (3.12) imply the following: Given $\varepsilon > 0$, then $|\exp(\hat{A}_{(*z,z')})_{\mathcal{V}} - \exp(\hat{A}')_{\mathcal{V}}| < \varepsilon$ when $m$ is sufficiently small.

Step 3. Let $a^m_{ij}$ denote $a'_{ij} + \frac{\sigma}{2m}$. This set of numbers defines the operator $\hat{A}^m_{(*z,z')}$ that appears in (3.10). This step argues that
$$\exp(\hat{A}^m_{(*z,z')})_{\mathcal{V}} = \exp(\hat{A}')_{\mathcal{V}}\, \exp\Bigl(\frac{\sigma}{2m} \sum_{1\le a\le d} \mathfrak{f}_a\mathfrak{f}_a\Bigr)_{\mathcal{V}}. \tag{3.13}$$
To establish (3.13), note first that $\hat{A}^m_{(*z,z')} = \hat{A}' + \frac{\sigma}{2m}\sum_{1\le a\le d}\mathfrak{f}_a\mathfrak{f}_a$. Thus, (3.13) follows from the claim that $\hat{A}'$ and $\sum_{1\le a\le d}\mathfrak{f}_a\mathfrak{f}_a$ commute. Meanwhile, the latter claim follows from Lemma 2.1.
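The linear algebra behind (3.13) is the fact that the exponential of a sum of two commuting endomorphisms is the product of their exponentials. The following is a generic sketch of that fact with small matrices; it does not construct the operators of the text. The matrices `a_prime` and `stand_in_ff` are hypothetical stand-ins (the second is chosen as a polynomial in the first so that the two commute), and `mat_exp` is a plain truncated Taylor series.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_exp(a, terms=40):
    """Matrix exponential via the Taylor series, truncated after `terms` terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = mat_add(result, term)
    return result

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Commuting pair: the second matrix is the square of the first.
a_prime = [[0.3, 1.0], [1.0, -0.2]]
stand_in_ff = mat_mul(a_prime, a_prime)
coeff = 0.7  # plays the role of sigma / (2 m)
scaled_ff = [[coeff * x for x in row] for row in stand_in_ff]
commuting_gap = max_abs_diff(
    mat_exp(mat_add(a_prime, stand_in_ff, coeff)),
    mat_mul(mat_exp(a_prime), mat_exp(scaled_ff)),
)

# Non-commuting pair, for contrast: the factorization fails.
x = [[0.0, 0.0], [1.0, 0.0]]
y = [[0.0, 1.0], [0.0, 0.0]]
noncommuting_gap = max_abs_diff(
    mat_exp(mat_add(x, y)), mat_mul(mat_exp(x), mat_exp(y))
)
```

Here `commuting_gap` is at the level of rounding error, while `noncommuting_gap` is of order one, which is why the commutation statement from Lemma 2.1 is needed before (3.13) can be written.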
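Step 4. Now let $V \in \mathcal{V}$. Then
$$\int_P \bigl(\exp(\hat{A}^m)_{\mathcal{V}} V\bigr)(p, \ldots, p)\,dp = \int_P \bigl(\exp(\hat{A}')_{\mathcal{V}} V\bigr)(p, \ldots, p)\,dp \tag{3.14}$$
by virtue of (3.13) and (2.10). Granted (3.14), then (3.9) follows from the conclusions of Step 2.

Case 3. Here, $Y = \mathbb{R}^n$. In the cases where $n > 1$ or where $n = 1$ and $m > 0$, the function $a(\cdot,\cdot)$ on $(\mathbb{R} \times \mathbb{R}^n) \times (\mathbb{R} \times \mathbb{R}^n)$ can be written as $\sigma\hat{a}$, where $\hat{a}$ is given by the Fourier integral:
$$\hat{a}\bigl((t, y), (t', y')\bigr) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{1}{2E(k)}\, e^{-|t-t'|E(k)}\, e^{ik(y-y')}\,dk, \tag{3.15}$$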
where $E(k) \equiv (|k|^2 + m^2)^{1/2}$. The convention here is to take $m \ge 0$. The $n = 1$ and $m = 0$ case has $a = \sigma\hat{a}$, where $\hat{a}((t, y), (t', y'))$ is the function $-\frac{1}{4\pi}\ln\bigl((t - t')^2 + (y - y')^2\bigr)$. In all cases, $\sigma$ is a positive constant.

The translation invariance is again due to the fact that the function $c$ is independent of the $\mathbb{R}$ factor in $\mathbb{R} \times \mathbb{R}^n$ and the relevant versions of $a(\cdot,\cdot)$ are unchanged when both $t$ and $t'$ are simultaneously translated by the same amount.

The arguments that are used in Case 1 above can be applied to prove the second point in Proposition 3.1 after one preliminary step. To describe the preliminary step, consider first the situation when $n > 1$ or when $n = 1$ and $m > 0$. In this case, $\hat{A}_{(*z,z')} \in \mathrm{End}(\mathcal{V})$ has a form that is very similar to the one depicted in (3.5):
$$\hat{A}_{(*z,z')} = \hat{A}_z + \hat{A}'_{z'} + \sigma \sum_{1\le a\le d} \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \frac{1}{2E(k)}\, t_{k,a}(z)\, t'_{k,a}(z')\,dk, \tag{3.16}$$
where $E(k) \equiv (|k|^2 + m^2)^{1/2}$ and
$$t_{k,a}(z) \equiv \sum_{1\le i\le N} e^{-E(k)t_i}\, e^{-ik\cdot y_i}\, \partial_a^i. \tag{3.17}$$
The important point here is that the integral that appears in (3.16) is absolutely convergent and defines an operator in $\mathrm{End}(\mathcal{V})$. This understood, the argument used for Case 1 can be repeated in an essentially verbatim fashion after changing certain sums to integrals.

Turn next to the case where $n = 1$ and $m = 0$. This case requires the lemma that follows.

Lemma 3.3. When $m > 0$, let $\hat{a}^m$ be the corresponding $n = 1$ version of (3.15). Given $\varepsilon > 0$ and $\rho > 1$, then all sufficiently small but positive $m$ versions of $\hat{a}^m$ enjoy the following property: If $z, z' \in \mathbb{R} \times \mathbb{R}$ are points such that $1/\rho < |z - z'| < \rho$, then
$$\Bigl| \hat{a}^m(z, z') - \kappa + \frac{1}{2\pi}\ln|z - z'| \Bigr| < \varepsilon. \tag{3.18}$$
Here, $\kappa$ is a constant that depends only on $\varepsilon$ and $\rho$.
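Lemma 3.3 can be checked numerically. The sketch below is not from the text; it assumes the standard closed form $\hat{a}^m(z, z') = \frac{1}{2\pi}K_0(m|z - z'|)$ for the $n = 1$ version of (3.15), where $K_0$ is the modified Bessel function with integral representation $K_0(x) = \int_0^\infty e^{-x\cosh u}\,du$, together with the small argument behavior $K_0(x) \approx -\ln(x/2) - \gamma$. In this sketch the constant $\kappa$ is taken to be $-\frac{1}{2\pi}(\ln(m/2) + \gamma)$, which is fixed once $m$ has been chosen. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def bessel_k0(x, h=0.01, u_max=25.0):
    """K_0(x) = int_0^inf exp(-x cosh u) du, evaluated by the trapezoid rule."""
    n = int(u_max / h)
    total = 0.5 * math.exp(-x)  # half weight at the endpoint u = 0
    for j in range(1, n + 1):
        total += math.exp(-x * math.cosh(j * h))
    return h * total

def a_hat_m(m, dist):
    """Assumed closed form of the n = 1 version of (3.15) at separation dist."""
    return bessel_k0(m * dist) / (2.0 * math.pi)

def kappa(m):
    """Candidate for the constant in (3.18), from K_0(x) ~ -ln(x/2) - gamma."""
    return -(math.log(m / 2.0) + EULER_GAMMA) / (2.0 * math.pi)

def lemma_gap(m, dist):
    """Left-hand side of (3.18) for the chosen kappa."""
    return abs(a_hat_m(m, dist) - kappa(m) + math.log(dist) / (2.0 * math.pi))

m_small = 1.0e-3
gaps = [lemma_gap(m_small, d) for d in (0.5, 1.0, 2.0)]
```

With $m = 10^{-3}$ and separations between $1/2$ and $2$, each entry of `gaps` is far smaller than $10^{-5}$, consistent with (3.18) for $\rho = 2$.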
The proof of this lemma is straightforward and left to the reader. To see where Lemma 3.3 leads, remark that with $\Lambda$ fixed in (3.4), there exists some number $\rho \gg 1$ such that the conditions in the lemma hold for all points that appear as a component of any $z = (z_1, \ldots, z_N)$ from any pair in $\Lambda$. This understood, suppose that $\varepsilon > 0$ has been chosen and that $\hat{a}^m$ is given as in Lemma 3.3 with $m > 0$ but very small. Let $a^m \equiv \sigma\hat{a}^m$. Given a pair $(z, F)$ and $(z', F')$ from $\Lambda$, use $\hat{A}^m_{(*z,z')}$ to denote the element in $\mathrm{End}(\mathcal{V})$ that is defined by
$$\begin{aligned}
&\sum_{1\le i\le N} \Bigl[ (c(z_i) + \sigma\kappa) \sum_{1\le a\le d} \partial_a^i\partial_a^i + (c(z'_i) + \sigma\kappa) \sum_{1\le a\le d} \partial_a^{\prime i}\partial_a^{\prime i} \Bigr] \\
&\quad + \sum_{1\le i\ne j\le N} \Bigl[ a^m(z_i, z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^j + a^m(z'_i, z'_j) \sum_{1\le a\le d} \partial_a^{\prime i}\partial_a^{\prime j} \Bigr] \\
&\quad + \sum_{1\le i,j\le N} a^m(*z_i, z'_j) \sum_{1\le a\le d} \partial_a^i\partial_a^{\prime j}.
\end{aligned} \tag{3.19}$$
With $\hat{A}^m_{(*z,z')}$ as above, introduce the corresponding $\exp(\hat{A}^m_{(*z,z')})_{\mathcal{V}}$. Now there are two key points. First, the arguments for the $n = 1$ and $m > 0$ case just given prove that the $\hat{A}^m_{(\cdot)}$ version of (3.4) holds. Second, with $\varepsilon > 0$ given and fixed, any sufficiently small $m$ version of $\hat{A}^m_{(\cdot)}$ obeys (3.9). Indeed, Lemma 3.3 implies that (3.9) holds if $\hat{A}^m_{(*z,z')}$ is replaced by $\hat{A}' \equiv \hat{A}^m_{(*z,z')} - \sigma\kappa \sum_{1\le a\le d}\mathfrak{f}_a\mathfrak{f}_a$, where $\mathfrak{f}_a \equiv \sum_{1\le i\le N}(\partial_a^i + \partial_a^{\prime i})$. Meanwhile, Lemma 2.1 implies that such a replacement does not affect the integral that appears in (3.9).

Proof of Theorem 3.2. The issue here is whether the one parameter semigroup on $\mathcal{F}_{++}$ given by the set of transformations $\{R_\tau\}_{\tau>0}$ descends to $\mathcal{H}$ as a strongly continuous, self-adjoint, contraction semigroup. As noted previously, each such $R_\tau$ descends to $\mathcal{H}$ as a symmetric operator with dense domain $\mathcal{F}_{++}/\ker(Q)$. As is argued momentarily,
$$Q(R_\tau\Phi, R_\tau\Phi) \le Q(\Phi, \Phi) \tag{3.20}$$
for all $\Phi \in \mathcal{F}_{++}$ and $\tau \ge 0$. This implies that $R_\tau$ extends to $\mathcal{H}$ as a bounded, self-adjoint operator. Given that $R_\tau R_{\tau'} = R_{\tau+\tau'}$ on $\mathcal{F}_{++}$, standard arguments (see, e.g., [Ka] or [HP]) prove that these extensions define a strongly continuous, 1-parameter, self-adjoint, contraction semigroup.

The argument for (3.20) is given momentarily. The proof depends on the following:

Lemma 3.4. Let $\Phi \in \mathcal{F}_{++}$. Then there exists $\upsilon \ge 0$ such that $Q(R_\tau\Phi, R_\tau\Phi) \le \upsilon$ for all $\tau \ge 0$.

The proof of this lemma is given below. Here is how to prove (3.20): Note first that $Q(R_\tau\Phi, R_\tau\Phi) = Q(\Phi, R_{2\tau}\Phi)$ and thus is no greater than the square root of $Q(R_{2\tau}\Phi, R_{2\tau}\Phi)\,Q(\Phi, \Phi)$. Now, bound $Q(R_{2\tau}\Phi, R_{2\tau}\Phi)$ in terms of $Q(R_{4\tau}\Phi, R_{4\tau}\Phi)$ using this same inequality but with $\tau$ replaced by $2\tau$. Continue in this vein to conclude that
$$\begin{aligned}
Q(R_\tau\Phi, R_\tau\Phi) &\le Q(R_{2\tau}\Phi, R_{2\tau}\Phi)^{1/2}\, Q(\Phi, \Phi)^{1/2} \le Q(R_{4\tau}\Phi, R_{4\tau}\Phi)^{1/4}\, Q(\Phi, \Phi)^{3/4} \\
&\le \cdots \le \lim_{k\to\infty} Q(R_{2^k\tau}\Phi, R_{2^k\tau}\Phi)^{2^{-k}}\, Q(\Phi, \Phi)^{1-2^{-k}}.
\end{aligned} \tag{3.21}$$
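The limit in (3.21) can be illustrated with numbers. The sketch below is not from the text; it evaluates the right-hand side of the $k$th inequality under two hypothetical assumptions on the size of $Q(R_{2^k\tau}\Phi, R_{2^k\tau}\Phi)$: a uniform bound $\upsilon$, and a bound that grows like a power of $2^k\tau$. In both cases the $2^{-k}$ power drives the first factor to 1, so the product tends to $Q(\Phi, \Phi)$. All names and sample values are illustrative.

```python
import math

def iterated_bound(q0, upsilon, k):
    """Right side of the k-th step of (3.21), assuming
    Q(R_{2^k tau} Phi, R_{2^k tau} Phi) <= upsilon and Q(Phi, Phi) = q0 > 0."""
    w = 2.0 ** (-k)
    return upsilon ** w * q0 ** (1.0 - w)

def iterated_bound_power_growth(q0, const, power, tau, k):
    """Same, but assuming the bound const * (2^k tau)^power, which grows with k.
    Logarithms are used so that large k does not overflow."""
    w = 2.0 ** (-k)
    log_upper = math.log(const) + power * (k * math.log(2.0) + math.log(tau))
    return math.exp(w * log_upper) * q0 ** (1.0 - w)

q0 = 3.0
uniform = [iterated_bound(q0, 1.0e6, k) for k in (1, 5, 60)]
growing = [iterated_bound_power_growth(q0, 10.0, 2.0, 1.5, k) for k in (1, 10, 80)]
```

The entries of `uniform` decrease toward `q0`, and the last entry of `growing` is also within rounding of `q0`, which reflects the fact that $k\,2^{-k}$ tends to zero as $k \to \infty$.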
Lemma 3.4 guarantees that the limit on the right is no greater than $Q(\Phi, \Phi)$.

Proof of Lemma 3.4. The existence of a $\tau$-independent bound can be deduced by first writing $\Phi = \sum_\alpha b_\alpha\Phi_\alpha$, where $b_\alpha \in \mathbb{C}$ and where each $\Phi_\alpha$ is either constant or has the form $\Phi_{(z,F)}$. Writing $\Phi$ in this way exhibits the fact that it is sufficient to find a $\tau$-independent bound for any given $(z, F)$ version of $Q(R_\tau\Phi_{(z,F)}, R_\tau\Phi_{(z,F)})$. The existence of the latter bound is argued next.

Consider first the cases where $Y$ is compact and $L$ has no kernel, where $Y = \mathbb{R}^n$ with $n > 1$, and where $Y = \mathbb{R}$ and $m > 0$. To start, fix some very large number $r$, and use $R \in \times_{2N}\mathbb{R}$ to denote the diagonal point $(r, \ldots, r)$. When $r$ is sufficiently large, then
$$Q(R_\tau\Phi_{(z,F)}, R_\tau\Phi_{(z,F)}) = \int_{(\times_N P)\times(\times_N P)} \bigl(L_{2N,R}(\bar{F} \times F)\bigr)\, K_{(*\tau\cdot z,\,\tau\cdot z),R}\,dp. \tag{3.22}$$
Meanwhile, the supremum norm of $L_{2N,R}(\bar{F} \times F)$ is bounded by some constant multiple of $\upsilon^r$, where $\upsilon > 1$ depends only on $F$ and $z$. It thus follows from (3.22) that this same $\upsilon^r$ bounds $Q(R_\tau\Phi_{(z,F)}, R_\tau\Phi_{(z,F)})$, since the function $K_{(\cdot)}$ is non-negative and its integral over $\times_{2N} P$ is equal to 1.

This understood, here is the key observation: When $Y$ is compact and $L$ has trivial kernel, or when $Y = \mathbb{R}^n$ with $n > 1$, or when $Y = \mathbb{R}$ and $m > 0$, then the same constant $r$ can be used for all $\tau \ge 0$. Indeed, this can be deduced using the top line in (3.1) or (3.15) as the case may be. The key point is that both expressions lead to functions $a(\cdot,\cdot)$ with the following property: The assignment of $a\bigl((-t - \tau, y), (t' + \tau, y')\bigr)$ to $\tau \in [0, \infty)$ for fixed $(t, y)$ and $(t', y')$ defines a bounded function on $[0, \infty)$ if both $t$ and $t'$ are positive.

Note that this is not true of the function that appears in the second line of (3.1), nor is it true for the Green's function in the case $Y = \mathbb{R}$ and $m = 0$. To elaborate on the story in these last cases, remark that when the lower line in (3.1) is relevant, the large $\tau$ versions of $r$ must be at least some constant times $\tau$. In the case when $Y = \mathbb{R}$ and the Green's function is $-\frac{1}{2\pi}\ln|z - z'|$, then the large $\tau$ versions of $r$ must be at least some constant times $\ln(\tau)$. By the way, the chain of inequalities in (3.21) leads to the desired bound $Q(R_\tau\Phi_{(z,F)}, R_\tau\Phi_{(z,F)}) \le Q(\Phi_{(z,F)}, \Phi_{(z,F)})$ in the case that the large $\tau$ versions of $r$ are linear in $\ln(\tau)$. This is because the $k \to \infty$ limit of $k\,2^{-k}$ is zero.

To finish the proof of Lemma 3.4, suppose now that either $Y$ is compact and $L$ has a kernel, or else $Y = \mathbb{R}$ and $m = 0$. To start the argument for these cases, remark that the version of (2.9) that gives the integral in (3.22) is defined using an endomorphism, $\hat{A}_{(*\tau\cdot z,\,\tau\cdot z)}$, of $\mathcal{V}$ that has the form
$$\sum_{1\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^j + \sum_{1\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^{\prime i}\partial_a^{\prime j} + \sum_{1\le i,j\le N} a(*\tau\cdot z_i, \tau\cdot z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^{\prime j}. \tag{3.23}$$
This is to say that the integral on the right-hand side of (3.22) is the same as
$$\int_P \bigl(\exp(\hat{A}_{(*\tau\cdot z,\,\tau\cdot z)})_{\mathcal{V}} (\bar{F} \times F)\bigr)(p, \ldots, p)\,dp. \tag{3.24}$$
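To continue, observe that $\tau$ only appears in the far right term in (3.23). Moreover, it follows from the form of $\hat{a}(\cdot,\cdot)$ that the large $\tau$ behavior of this term can be written as
$$-\upsilon(\tau) \sum_{1\le i,j\le N}\sum_{1\le a\le d} \partial_a^i\partial_a^{\prime j} + \sum_{1\le i,j\le N} b_{ij}(\tau) \sum_{1\le a\le d} \partial_a^i\partial_a^{\prime j}, \tag{3.25}$$
where $\upsilon(\tau) = \frac{\sigma}{\mathrm{Vol}(Y)}\tau$ or $\frac{2\sigma}{2\pi}\ln(\tau)$, and where $b_{ij}(\tau)$ is bounded as $\tau \to \infty$. Here, the case with $\upsilon(\tau)$ linear in $\tau$ arises when $Y$ is compact and $L$ annihilates the constants; and the other case arises when $Y = \mathbb{R}$ and $m = 0$. What follows is the crucial point: The leftmost term in (3.25) is
$$-\frac{1}{2}\upsilon(\tau) \sum_{1\le a\le d} \mathfrak{f}_a\mathfrak{f}_a + \frac{1}{2}\upsilon(\tau) \sum_{1\le i,j\le N}\sum_{1\le a\le d} \bigl(\partial_a^i\partial_a^j + \partial_a^{\prime i}\partial_a^{\prime j}\bigr), \tag{3.26}$$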
where $\mathfrak{f}_a \equiv \sum_{1\le i\le N}(\partial_a^i + \partial_a^{\prime i})$. This understood, Lemma 2.1 implies that the leftmost term in (3.26) can be dropped without affecting the integral in (3.24). This is to say that (3.24) is valid with $\hat{A}_{(*\tau\cdot z,\,\tau\cdot z)}$ replaced by
$$\begin{aligned}
\hat{A}' \equiv{}& \sum_{1\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^i\partial_a^j + \sum_{1\le i,j\le N} a_c(z_i, z_j) \sum_{1\le a\le d} \partial_a^{\prime i}\partial_a^{\prime j} \\
&+ \sum_{1\le i,j\le N} b_{ij}(\tau) \sum_{1\le a\le d} \partial_a^i\partial_a^{\prime j} + \frac{1}{2}\upsilon(\tau) \sum_{1\le i,j\le N}\sum_{1\le a\le d} \bigl(\partial_a^i\partial_a^j + \partial_a^{\prime i}\partial_a^{\prime j}\bigr).
\end{aligned} \tag{3.27}$$
Since $\upsilon(\tau) > 0$ for large $\tau$, the operator at the far right in (3.27) is negative semi-definite. Granted that the $b_{ij}(\tau)$ are bounded functions of $\tau$, this last fact implies that the right-hand side of (3.24) enjoys a $\tau$-independent bound.

4. The Action of the Poincaré Group

The purpose of this section is to describe a unitary action of the $n + 1$ dimensional Poincaré group on the Hilbert space $\mathcal{H}$ from Theorem 3.2 in a case where $Y = \mathbb{R}^n$. In particular, this case takes the function $a$ that appears in (1.1) to be a positive multiple of the Green's function for the operator
$$\frac{d^2}{dt^2} + \Delta - m^2, \tag{4.1}$$
where $\Delta$ is the (negative definite) Laplacian on $\mathbb{R}^n$ and where $m \ge 0$. In particular, $a = \sigma\hat{a}$, where $\hat{a}$ is given by (3.15) when $n > 1$ and when $n = 1$ and $m > 0$. In the case $n = 1$ and $m = 0$, then $\hat{a} = -\frac{1}{4\pi}\ln\bigl((t - t')^2 + (y - y')^2\bigr)$. Meanwhile, the constant function on $\mathbb{R} \times \mathbb{R}^n$ is used for the definition of the function $a_c$ that appears in (2.2) and implicitly in (2.9).

The notation used below for the Poincaré group writes the latter as the semi-direct product of the group of translations, $\mathbb{R} \times \mathbb{R}^n$, with the Lorentz group $SO(1, n)$. The ‘time’ translations are those of the 1-parameter subgroup along the $\mathbb{R}$ factor in the translation group $\mathbb{R} \times \mathbb{R}^n$. Meanwhile, the ‘spatial translations’ are those from the $\mathbb{R}^n$ factor. The subgroup $SO(n) \subset SO(1, n)$ is identified here as the subgroup that fixes the left-hand $\mathbb{R}$ factor in $\mathbb{R} \times \mathbb{R}^n$.
The first point to make is that $\langle\cdot\rangle$ is invariant under the action on $\mathcal{F}$ of the semi-direct product of $\mathbb{R} \times \mathbb{R}^n$ and $SO(n + 1)$ that is induced by the latter's action on $\mathbb{R} \times \mathbb{R}^n$. To elaborate, let $N$ be a positive integer and let $z \in \times_N(\mathbb{R} \times \mathbb{R}^n)$. Let $b \in \mathbb{R} \times \mathbb{R}^n$ and let $U \in SO(n + 1)$, and use $g$ to denote the point $(b, U)$ in the semi-direct product group. Write $g \cdot z$ to designate the point that is obtained from $z$ by acting by $g$ simultaneously on each of its factors. Then
$$\langle\Phi_{(g\cdot z,F)}\rangle = \langle\Phi_{(z,F)}\rangle \tag{4.2}$$
for all pairs $(z, F)$ with $z \in \times_N(\mathbb{R} \times \mathbb{R}^n)$ and with $F \in C^\infty(\times_N P)$ such that $z$'s entries are pairwise distinct and $F$ decomposes as a product $f_1 \cdots f_N$, where $f_k$ is an eigenfunction of the Laplacian on the $k$th factor of $\times_N P$. Indeed, (4.2) follows by virtue of the fact that the function $a_c$ is a function only of the Euclidean distance in $\mathbb{R} \times \mathbb{R}^n$ between its two arguments.

The action of the subgroup $\mathbb{R}^n \rtimes SO(n)$ preserves $\mathcal{F}_{++}$ and commutes with the involution $*$ that defines the quadratic form $Q$. As a consequence, the action of this subgroup descends as a unitary group action on the Hilbert space $\mathcal{H}$. This subgroup's action also commutes with Theorem 3.2's contraction semi-group because the $\mathbb{R}^n \rtimes SO(n)$ action on $\mathcal{F}$ commutes with the action of $\mathbb{R}$ that sends $\tau \in \mathbb{R}$ and $\Phi_{(z,F)} \in \mathcal{F}^0$ to $\Phi_{(\tau\cdot z,F)}$.

As noted above, the generator of Theorem 3.2's contraction semi-group is self-adjoint. As a consequence, the square root of $-1$ times this operator generates a 1-parameter, unitary group action on $\mathcal{H}$ that commutes with the action of $\mathbb{R}^n \rtimes SO(n)$. The latter 1-parameter unitary group together with the aforementioned action of $\mathbb{R}^n \rtimes SO(n)$ supply a unitary action on the Hilbert space of the semi-direct product $(\mathbb{R} \times \mathbb{R}^n) \rtimes SO(n)$. This last group appears directly as a subgroup in the Poincaré group and is identified now with the latter.
Doing so specifies the desired action of much of the Poincaré group; all that is left as yet to define are the ‘Lorentz boosts’, these group elements that act on $\mathbb{R} \times \mathbb{R}^n$ so as to mix the time and space directions. A ‘pure’ Lorentz boost as defined here is determined by a vector $v \in \mathbb{R}^n$ with Euclidean norm less than 1. The pure Lorentz boost defined by $v$ fixes the point $(0, y)$ when $y$ is orthogonal to $v$ and it sends any given $(t, 0)$ to $\gamma(t, vt)$, where $\gamma = (1 - v \cdot v)^{-1/2}$. Here, $v \cdot v$ is used to denote the inner product of $v$ with itself using the Euclidean inner product on $\mathbb{R}^n$. Meanwhile, it sends $(0, v)$ to $\gamma(v \cdot v, v)$. To fill out the whole Poincaré group, it is sufficient to first find unitary operators on $\mathcal{H}$ that correspond to the pure Lorentz boosts, and then verify that the pure boost operators multiply one against another and against the $(\mathbb{R} \times \mathbb{R}^n) \rtimes SO(n)$ operators according to the multiplication law for the Poincaré group. As might be expected, the pure Lorentz boosts are defined using the part of the $SO(n + 1)$ action on $\mathcal{F}$ that comes from elements that mix the $\mathbb{R}$ and $\mathbb{R}^n$ factors in $\mathbb{R} \times \mathbb{R}^n$. Such an element in $SO(n + 1)$ is deemed a ‘Euclidean boost’ in what follows. A ‘pure’ Euclidean boost as defined here is determined by a vector $v \in \mathbb{R}^n$. This element in $SO(n + 1)$ fixes points of the form $(0, y)$ when $y$ is orthogonal to $v$, sends $(t, 0)$ to $\lambda(t, vt)$, where $\lambda \equiv (1 + v \cdot v)^{-1/2}$, and it sends $(0, v)$ to the point $\lambda(-v \cdot v, v)$. Note that the pure boost defined by $v$ sends any point $(t, y)$ with $t > |v||y|$ to a point with positive first coordinate. However, given $v \neq 0$ in $\mathbb{R}^n$, there exist points $(t, y)$ with $t > 0$ that are mapped by $v$’s boost to points with negative first coordinate. Thus, pure boosts do not preserve $\mathcal{F}_{++}$. This last point is what makes the story interesting. When $v \in \mathbb{R}^n$, let $\hat O_v \in SO(n + 1)$ denote the pure boost that is defined by $v$. With $r \equiv |v|$ fixed, there exists a subspace, $U_r \subset \mathcal{F}_{++}$, that is mapped to $\mathcal{F}_{++}$ by the action of $\hat O_v$.
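The description of a pure Euclidean boost can be made concrete with a small numerical sketch. The code below is only an illustration, not part of the construction: the function names `euclidean_boost` and `doubled` are hypothetical, and the formula inside is the rotation of $\mathbb{R} \times \mathbb{R}^n$ in the plane spanned by the time axis and $v$ that matches the three defining properties stated above. It also lets one see that applying the boost for $v$ twice is the boost for $2v/(1 - v \cdot v)$ when $|v| < 1$, which is the double-angle law for the tangent.

```python
import math


def euclidean_boost(v, t, y):
    """Apply the pure Euclidean boost determined by v to the point (t, y).

    v and y are sequences of n floats. The map is the rotation that fixes
    (0, y) for y orthogonal to v, sends (t, 0) to lam*(t, v*t) and sends
    (0, v) to lam*(-v.v, v), where lam = (1 + v.v)**(-1/2).
    """
    vv = sum(c * c for c in v)
    if vv == 0.0:
        return t, list(y)  # v = 0 gives the identity map
    lam = 1.0 / math.sqrt(1.0 + vv)
    vy = sum(a * b for a, b in zip(v, y))
    new_t = lam * (t - vy)
    # Only the component of y along v changes: it is scaled by lam and
    # picks up the contribution lam*t*v from the time coordinate.
    coef = lam * t + (lam - 1.0) * vy / vv
    new_y = [yi + coef * vi for yi, vi in zip(y, v)]
    return new_t, new_y


def doubled(v):
    """Return 2v / (1 - v.v); assumes v.v < 1."""
    vv = sum(c * c for c in v)
    return [2.0 * c / (1.0 - vv) for c in v]


if __name__ == "__main__":
    v = [0.5, -0.2]
    print(euclidean_boost(v, 2.0, [0.0, 0.0]))   # image of (t, 0)
    print(euclidean_boost(v, 0.0, v))            # image of (0, v)
    # A point with t > 0 whose image has negative first coordinate.
    print(euclidean_boost(v, 0.1, [5.0, -2.0]))
```

The last call illustrates why pure boosts fail to preserve the positive-time condition: the point has $t = 0.1 > 0$ but violates $t > |v||y|$.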
Moreover, these $\mathbb{R}^n$ labelled subspaces can be defined so as to have the following properties:
\[ \text{If } r > r' \text{ then } U_r \subset U_{r'}, \text{ and } \bigcup_{r > 0} U_r = \mathcal{F}_{++}. \tag{4.3} \]
Indeed, take Ur to be the subspace that is generated by the constant function and functions (z,F) , where each of the N factors of z ∈ × N (R × Rn ) have the form (t, y) with t > r |y|. To continue, remark that the pure boosts have the following property with respect to the involution, ∗, on R × Rn that changes the sign of the R coordinate: ∗ Oˆ v = Oˆ −v ∗ .
(4.4)
Since Oˆ −v = Oˆ v−1 , it thus follows from (4.2) that Q( , Oˆ v ) = Q Oˆ v ,
(4.5)
whenever and are both in Ur with r > |v|. This last equation has two implications: First, the operator Oˆ v annihilates the kernel of Q. Second, if |v| < 1, then Oˆ v descends to H as a symmetric operator on the domain Ur / ker(Q) for r > 2|v|/(1 − |v|2 ). Here, this last bound on r follows from the fact that Oˆ v Oˆ v = Oˆ v with v = 2v/(1 − v · v). To make the next point, fix a vector ∈ F++ , then Oˆ v is in F++ when |v| is sufficiently small. For such v, the pairing Q( , Oˆ v ) is well defined for any given
in the Hilbert space. Moreover, the value of the resulting function v → Q( , Oˆ v ) extends from a neighborhood of 0 in Rn as a holomorphic function on a neighborhood of 0 in Cn . This follows from the form of aˆ using (2.9) and some standard, finite dimensional perturbation theory. An immediate consequence is that the assignment v → Oˆ v defines a holomorphic map from a neighborhood of 0 in Cn to the Hilbert space H. Of interest with regards to defining a Lorentz boost are the vectors Oˆ v when v is purely imaginary with Hermitian norm less than 1. Thus, where v = iw with w ∈ Rn and w · w < 1. In this regard, the operator Oˆ iw is defined on the image in H of Ur when r > |w|. It is a consequence of (4.4) and ( 4.5) that Oˆ iw is a unitary operator on this domain. Were the Ur dense in H, then Oˆ iw would extend by continuity over the whole of H as a unitary operator. Such an extension would serve as a pure Lorentz boost for the desired Poincaré group action. Since Ur does not have dense image in H, a larger domain must be found. For this purpose, fix and in F++ and remark that the analyticity near 0 in Cn of the map v → Q( , Oˆ v ) has the following consequence: Take v to be real with length 1. Then the assignment to ∈ F++ of the θ → 0 limit of θ1 − Oˆ tan(θ)v defines a symmetric operator on H with domain F++ / ker(Q). Indeed, this follows because d Q( , Oˆ tan(θ)v )|θ=0 when , ∈ F++ . • limθ→0 Q , θ1 ( − Oˆ tan(θ)v ) = dθ • limθ→0 Q( θ1 ( − Oˆ tan(θ)v ), θ1 ( − Oˆ tan(θ)v )) = when ∈ F++ .
d2 dθ 2
Q(, Oˆ tan(θ)v )|θ=0 (4.6)
Use L v in what follows to denote the operator just defined. As is argued next, L v has a self-adjoint extension. To start the argument, introduce hv to denote the hermitian form on F++ / ker(Q) that is defined by polarizing the quadratic, non-negative functional that sends to Q(L v , L v ). Thus, Q(L v , L v ) =
d2 Q(, Oˆ tan(θ)v )|θ=0 . dθ 2
(4.7)
According to Theorem 1.27 in Chapter VI.5 of [Ka], the form hv is closable on the domain F++ / ker(Q). This implies that L 2v has a Friedrichs extension; this a non-negative self adjoint operator whose dense domain contains a core in F++ / ker(Q). Now, let |L v | = (L 2v )1/2 . The domain of |L v | is the domain of the closure of the form hv . In particular, F++ / ker(Q) is a core for |L v |. On F++ / ker(Q), the operator T = L v + 2 |L v | is non-negative and symmetric. The arguments just invoked to define the self-adjoint extension of L 2v can be used to endow T with a self-adjoint extension whose domain is that of |L v |. This understood, T − 2|L v | extends L v from F++ / ker(Q) as a closed and self-adjoint operator on H. Standard constructions (see, e.g. Chapter IX.1.2 in [Ka]) now provide a strongly continuous, 1-parameter group of unitary operators on H with generator i · L v . When τ ∈ R, the corresponding version of exp(iτ L v ) is denoted by Uˆ tanh(τ )v . Note that in the case that ∈ F++ / ker(Q) and τ has small absolute value, then Uˆ tanh(τ )v = Oˆ tan(iτ )v . Indeed, such is the case by virtue of the fact that Uˆ − tanh(τ )v Oˆ tan(iτ )v has zero derivative with respect to τ . The 1-parameter group τ → Uˆ tanh(τ )v is to be identified with the 1-parameter group of pure Lorentz boosts as defined by multiples of the unit vector v. To verify that the multiplication between w = w operators Uˆ w and Uˆ w is as required for Lorentz group elements, it turns out that it is sufficient to consider the issue on the domain Ur / ker(Q) for r sufficiently large. This point is explained momentarily. In the meantime, note that the operators Uˆ w and Uˆ w on any sufficiently large r version of Ur / ker(Q) are analytic continuations of corresponding versions of Oˆ (·) ; and this implies that they obey the desired multiplication law. 
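The identification of the Lorentz boost for $\tanh(\tau)v$ with the Euclidean boost for $\tan(i\tau)v$ rests on elementary identities that are easy to check numerically. The sketch below does this for collinear boosts, where the parameter can be treated as a scalar; the function names are hypothetical and the code is only a sanity check of the scalar identities, not of the operator statements. It shows that $\tan(i\tau) = i\tanh(\tau)$, that the Euclidean normalization $(1 + v \cdot v)^{-1/2}$ evaluated at $v = iw$ is the Lorentz factor $(1 - w \cdot w)^{-1/2}$, and that the tangent addition law continues to the relativistic velocity addition law.

```python
import cmath
import math


def continued_parameter(tau):
    """Return tan(i*tau), which analytically equals i*tanh(tau)."""
    return cmath.tan(1j * tau)


def euclidean_lambda(v):
    """(1 + v*v)**(-1/2) for a possibly complex scalar parameter v."""
    return 1.0 / cmath.sqrt(1.0 + v * v)


def lorentz_gamma(w):
    """(1 - w*w)**(-1/2) for a real scalar w with |w| < 1."""
    return 1.0 / math.sqrt(1.0 - w * w)


def compose_euclidean(v1, v2):
    """Parameter of the composite of two collinear Euclidean boosts."""
    return (v1 + v2) / (1.0 - v1 * v2)


def compose_lorentz(w1, w2):
    """Parameter of the composite of two collinear Lorentz boosts."""
    return (w1 + w2) / (1.0 + w1 * w2)


if __name__ == "__main__":
    tau = 0.3
    w = math.tanh(tau)
    print(continued_parameter(tau), 1j * w)
    print(euclidean_lambda(1j * w), lorentz_gamma(w))
    print(compose_euclidean(1j * 0.2, 1j * 0.5), 1j * compose_lorentz(0.2, 0.5))
```

Each printed pair should agree up to rounding, which is the scalar shadow of the statement that the Lorentz multiplication law is the analytic continuation of the Euclidean one.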
A similar argument shows that the various versions of Uˆ (·) have the desired properties with regards to the already defined action of the (R × Rn ) S O(n) subgroup of the Poincaré group. To argue that it is enough to consider the Uˆ w Uˆ w on a large r version of Ur / ker(Q), suppose that ∈ F++ has been specified. Then Rτ is in Ur for all sufficiently large τ . Suppose that Uˆ w Uˆ w = Uˆ on Ur / ker(Q), where Uˆ denotes the operator on H that corresponds to the appropriate element along some other 1-parameter subgroup of the Lorentz group. Thus Q( , (Uˆ w Uˆ w − Uˆ )Rτ ) = 0 for all τ sufficiently large and all
in H. However, if
is any fixed vector in H , then the assignment τ → Q(
, Rτ ) defines a real analytic function on (0, ∞) with continuous extension to τ = 0 (see, e.g. Chapter IX.1.6 in [Ka].) Thus, Q( , (Uˆ w Uˆ w − Uˆ )) = 0 for all ∈ H. This then implies that Uˆ w Uˆ w = Uˆ on the dense domain F++ / ker(Q); and thus Uˆ w Uˆ w = Uˆ on the whole of H. 5. Field Theories for Quotient Spaces Suppose that P is a compact Lie group as in the previous sections. To keep things simple, I will assume that P is a simple Lie group. Note, however, that what transpires below holds in greater generality. As before, take {∂a } to be an orthonormal basis of left invariant vector fields on P with respect to a chosen bi-invariant metric. Integration on P is defined using this same metric’s volume form. Now, suppose that G ⊂ P is a subgroup and use G\P to denote the quotient of P by the action of G via multiplication on the left. My purpose here is to explain how the constructions in the previous sections can be used to obtain a quantum field theory for the space of maps from a manifold, Y , into G\P.
For this purpose, suppose first that M is a given manifold and reintroduce the vector space F as defined in Sect. 2 for the pair M and P. Thus, F is generated by the constant function from M to P and by functions of the form (z,F) , where z ∈ × N M has pairwise discrete entries and F is in the domain T N . Here N can be any positive integer. The domain T N consists of finite linear combinations of functions that decompose so as to send any given p = ( p1 , . . . , p N ) ∈ × N P to f 1 ( p1 ) . . . f N ( p N ), where each f k is an eigenfunction for the Laplacian on P . As is explained next, it is a consequence of the Peter-Weyl theorem that T N contains functions that are pulled up from × N G\P. To see that T N has functions from × N G\P, note first that the Peter-Weyl theorem asserts that the eigenspaces of the Laplacian on P are in 1-1 correspondence with the irreducible representations of P. To make this correspondence explicit, suppose that V is an irreducible representation of P with a hermitian metric that makes for a unitary P action. Let ρV : P → U (V ) denote the corresponding homomorphism of Lie groups. If η, ν are any two elements in V , then the function p → η† ρV ( p)ν
(5.1)
is an eigenfunction for the Laplacian on $P$. With $V$ fixed, all such eigenfunctions have the same eigenvalue and these eigenfunctions span the corresponding eigenspace. This understood, remark that the representation $V$ decomposes as a direct sum of irreducible representations of the subgroup $G$. Functions of the form depicted in (5.1) in the case that $\eta$ is in the trivial $G$-representation are functions that are pulled up from $G\backslash P$ via the projection map. Granted the preceding, let $\mathcal{F}^G \subset \mathcal{F}$ denote the subspace that is generated by the constant function and those labelled by pairs $(z,F)$ with $z$ as before and with $F$ decomposing as $f_1 \ldots f_N$, where each $f_k$ is a $G$-invariant eigenfunction of the Laplacian on $P$. Let $\mathcal{F}^G_{++}$ denote the corresponding subspace of $\mathcal{F}_{++}$. The quadratic form $Q$ restricts to $\mathcal{F}^G_{++}$ as a Hermitian, non-negative form. Let $\ker^G(Q)$ denote the kernel of this restricted form, and let $\mathcal{H}^G$ denote the completion of $\mathcal{F}^G_{++} / \ker^G(Q)$ using $Q$. The action of the semigroup $\{R_\tau : \tau \ge 0\}$ on $\mathcal{F}_{++}$ preserves both $\mathcal{F}^G_{++}$ and $\ker^G(Q)$, and so descends as an action on the dense domain $\mathcal{F}^G_{++} / \ker^G(Q)$ in $\mathcal{H}^G$. The argument given for Theorem 3.2 works as well here to prove that this semigroup action extends to an action on $\mathcal{H}^G$ of a strongly continuous, self-adjoint 1-parameter contraction semi-group. This semi-group is generated by a non-negative, self-adjoint operator. In addition, the argument given in the previous section can be applied here in the case where $Y = \mathbb{R}^n$ to prove that $\mathcal{H}^G$ has an action of the $(n + 1)$-dimensional Poincaré group whose time translation subgroup is generated by the square root of $-1$ times the generator of the afore-mentioned contraction semi-group. The structure just described can be viewed as a quantum field theory for the space of maps from $Y$ to $G\backslash P$.

6. When the Domain is a Riemann Surface

My purpose in this section is to exhibit certain properties of $\langle\cdot\rangle$ in the case when $M$ is a compact Riemann surface or one with some number of punctures.
To motivate the ensuing discussion, note that physicists have generally agreed on a list of properties that would allow, were they satisfied, the collection of dim(M) = 2 versions of · to be deemed the normalized correlation functions for a conformal field theory. Such a list of properties was first summarized by Segal [Se1, Se2]. The discussion that follows
refers to Gawedzki’s presentation of the list in his second lecture from [Ga]. Some of the properties on the list refer only to a single Riemann surface, and the forthcoming Propositions 6.1–6.4 assert that the latter sort are satisfied by $\langle\cdot\rangle$.

To set the stage, I treat a surface with $n$ punctures as a compact surface, $M$, together with a set, $\vartheta$, of $n$ distinct points in $M$; these are the missing points in the original unpunctured surface. With it understood that $M$ is compact, assume that $M$ has a given Riemannian metric and take the function $a$ that appears in (1.1) to be a positive multiple of a certain Green’s function for the Laplacian on $M$. To be precise here, I use $\hat a(z, z')$ to denote the value of this Green’s function at points $z \neq z'$ in $M - \vartheta$. With $z' \in M - \vartheta$ fixed, $\hat a_{z'}(\cdot) \equiv \hat a(\cdot, z')$ is a distributional solution on $M$ to the equation
\[ -\Delta \hat a_{z'} = \delta_{z'} - \frac{1}{2}\Bigl(\sum_{w \in \vartheta} \delta_w + (2 - n)\,\frac{1}{\operatorname{area}(M)}\Bigr); \tag{6.1} \]
here $\Delta$ is the negative definite Laplacian, $\delta_u$ is the Dirac delta function with mass 1 at the point $u \in M$, and the area of $M$ is computed using the metric’s area measure. In particular, $\hat a_{z'}$ is the unique solution to (6.1) whose integral over the whole of $M$ is zero when computed using the metric’s area measure. By way of an example, suppose that $M$ is the round 2-sphere with area $4\pi$. Write $M = \mathbb{C} \cup \{\infty\}$ and take $\vartheta = \{0, \infty\}$; then
\[ \hat a(z, z') = -\frac{1}{2\pi} \ln\Bigl(\frac{|z - z'|}{|z|^{1/2}\,|z'|^{1/2}}\Bigr). \tag{6.2} \]
With $\hat a$ understood, set $a(z, z') \equiv \sigma \hat a(z, z')$ with $\sigma$ a positive constant.

An important point in what follows is that $\hat a(\cdot, z')$ diverges as $-\frac{1}{2\pi} \ln(\operatorname{dist}(\cdot, z'))$ near the point $z'$. In fact,
\[ \lim_{z \to z'} \Bigl(\hat a(z, z') + \frac{1}{2\pi} \ln \operatorname{dist}(z, z')\Bigr) \tag{6.3} \]
exists and defines a function of the point $z'$ that varies smoothly over $M - \vartheta$. Indeed, this follows from the fact that any metric on a surface is locally conformally flat. In what follows, $c$ denotes the function from $M - \vartheta$ to $\mathbb{R}$ that is depicted in (6.3). Use this same function $c$ to define the function $a_c$ that appears in (2.2).

In the propositions that follow, $N$ is a positive integer, $z$ is a point in $\times_N (M - \vartheta)$ with pairwise distinct entries and $F$ is a function on $\times_N P$ that sends any given $p = (p_1, \ldots, p_N)$ to $f_1(p_1) \ldots f_N(p_N)$, where each $f_k$ is a real valued eigenfunction of the Laplacian on $P$. Note that $\hat E_k$ is used to denote the absolute value of the eigenvalue of $f_k$. To make contact with the axioms in Gawedzki’s article [Ga], let us agree that what Gawedzki calls a ‘primary field’ is any real valued eigenfunction of the Laplacian on $P$. Thus, the collection $\{f_k\}$ constitutes a set of primary fields. Note that there is no nontrivial analog here of Gawedzki’s partition function $Z$. This is because our functional $\langle\cdot\rangle$ gives normalized correlation functions by virtue of the fact that it assigns the value 1 to the element $1 \in \mathcal{F}$.

The first proposition here describes how $\langle\cdot\rangle$ is affected by a diffeomorphism. It verifies that $\langle\cdot\rangle$ satisfies the axioms that are given by Eqs. (2.2) and (2.6) in [Ga].

Proposition 6.1. Let $\langle\cdot\rangle'$ denote the version of (2.5) that is defined using the pull-back of the original metric on $M$ via a given diffeomorphism. Let $\vartheta'$ denote the inverse image of $\vartheta$ under this diffeomorphism. When $z \in \times_N (M - \vartheta)$, let $z' \in \times_N (M - \vartheta')$ denote the inverse image of $z$ via the induced diffeomorphism of $\times_N M$. Then $\langle\cdot\rangle'$ takes the same value on the element of $\mathcal{F}$ labelled by $(z', F)$ as $\langle\cdot\rangle$ takes on the element labelled by $(z, F)$.
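The round-sphere example (6.2) lends itself to a quick numerical sanity check. The sketch below assumes the reading $\hat a(z, z') = -\frac{1}{2\pi}\ln\bigl(|z - z'| / (|z|^{1/2}|z'|^{1/2})\bigr)$ of (6.2); the names `a_hat` and `flat_laplacian` are hypothetical. Because the Laplacian of a conformally flat metric is a positive multiple of the flat one, and because the constant term in (6.1) drops out when $n = 2$, the function $\hat a(\cdot, z')$ should be harmonic in the coordinate $z$ away from $z'$, $0$ and $\infty$. The code tests this with a five-point stencil, and also checks the symmetry in the two arguments and the invariance under a common rescaling $z \mapsto \kappa z$.

```python
import math


def a_hat(z, zp):
    """Formula (6.2) as read here, for distinct nonzero complex z and zp."""
    ratio = abs(z - zp) / math.sqrt(abs(z) * abs(zp))
    return -math.log(ratio) / (2.0 * math.pi)


def flat_laplacian(f, z, h=1e-3):
    """Five-point finite-difference Laplacian of f at the complex point z."""
    total = f(z + h) + f(z - h) + f(z + 1j * h) + f(z - 1j * h) - 4.0 * f(z)
    return total / (h * h)


if __name__ == "__main__":
    zp = -0.3 + 0.2j
    z = 1.0 + 0.5j
    # Should be small: a_hat(., zp) is harmonic away from zp, 0 and infinity.
    print(flat_laplacian(lambda p: a_hat(p, zp), z))
    # Should agree: symmetry and invariance under a common complex rescaling.
    print(a_hat(z, zp), a_hat(zp, z), a_hat((2.5 - 1j) * z, (2.5 - 1j) * zp))
```

The rescaling invariance is the coordinate form of translation invariance along the cylinder once $\mathbb{C} - 0$ is identified with $\mathbb{R} \times S^1$. Note that the check of the logarithmic divergence in the test uses the coordinate distance $|z - z'|$, not the round metric's distance, so it probes the coefficient $-\frac{1}{2\pi}$ only and not the function $c$ of (6.3).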
The proof of Proposition 6.1 and the subsequent propositions are given at the end of this section. The next proposition describes how · is affected by a conformal change to the given metric on M. For this purpose, suppose that u is a smooth function on M and that the conformal change is such that the norm on T M given by the new metric is eu/2 times the norm as defined by the original metric. Let · denote the original version of (2.5) and let · u denote the version that is defined by the new metric. Proposition 6.2. Let u be a smooth function on M. Then ˆ (z,F) u = e− E k σ u(z k )/4π (z,F) .
(6.4)
1≤k≤N
Note that this proposition verifies that · satisfies the axiom that is given by Eq. (2.4) in [Ga]. By the way, the requirement that is stated in Eq. (2.32) of [Ga] follows from Eqs. (2.2) and (2.4) of [Ga]; thus it is also satisfied by · . The next two propositions describe the change in · with a first order change in the metric that is neither a conformal change nor one that is tangent to the metric’s diffeomorphism group To make this notion precise, let m denote a given symmetric orbit. section of T ∗ M T ∗ M, and consider the 1-parameter family of metrics parametrized by a small real number, τ , and defined so that the square of the τ version norm is obtained from the original by adding τ m(·, ·). Such a 1-parameter family of metrics provides the corresponding 1-parameter families of versions of (2.5), and a τ version is denoted in what follows as · τ . I know no general formula for the τ -dependence of these linear forms. However, with z and F fixed, then the function τ → (z,F) τ is an analytic function of τ near τ = 0 when the support of m is disjoint from the set of points that define the components of z. This follows from (2.8) and (2.9) using perturbation theory to analyze the τ dependence of the Green’s function for the Laplacian on M. In particular, the derivatives of (z,F) τ with respect to τ can be readily computed at τ = 0 for such m. The axioms for a conformal field theory make demands on the behavior of these derivatives as the support of m approaches some component of z. To compare the singular behavior for the first derivative with that required for a conformal field theory, it proves d useful to write the derivative dτ (z,F) τ |τ =0 as if its value was that of the L 2 inner product between the symmetric tensor m and a certain section over M − ϑ ∪ {z k }1≤k≤N of T ∗ M T ∗ M that extends to all of M as a distribution. 
The latter section is symmetric and traceless, and so determined by its type (1, 0)2 portion; this as defined by the complex structure on M that comes from the τ = 0 metric. The type (1, 0)2 part of this 1 1 distribution at w ∈ M is denoted in what follows by 4π t(w)(z,F) − 4π t(w) (z,F) . 1 The factors of 4π are traditional in the conformal field theory literature.
Proposition 6.3. The assignment (w, z) → t(w), (z,F) − t(w) (z,F) defines a holomorphic section over the complement of the diagonals in(M − ϑ) × (× N (M − ϑ)) of the pull-back via projection to the first factor of M of 2 T 1,0 M. Moreover, this section is equivariant with respect to the action of the group of orientation preserving diffeomorphisms of M. The following is also true: Let k ∈ {1, . . . , N } and let x denote a holomorphic coordinate for a neighborhood of z k in M such that x = 0 is mapped to 2 2 z k and such that the given
back as |d x| to order |x| . Then the pull-back metric pulls at x of the section w → t(w), (z,F) − t(w) (z,F) has the form
1 ˆ 1 1 ∂ σ Ek 2 + 4π x x ∂z k
(z,F) (d x)2 + . . . ,
(6.5)
where the three dots indicate terms that are bounded as x → 0. This proposition verifies that · is compatible with the conditions that are required by Eqs. (2.26) and (2.28)–(2.31) in [Ga]. The axioms for a conformal field theory also make demands on the
τ second derivatives and higher order derivatives at τ = 0 of the function τ → (z,F) . These can also be verified in the present case. The proposition that follows summarizes the story for the second derivatives. To conform with convention, the second derivative at τ = 0 of the τ function τ → (z,F) is written as if it were obtained by integrating m ⊗ m against a tensor valued distribution on M × M. The latter is determined by its (1, 0)2 ⊗ (1, 0)2 and (1, 0)2 ⊗ (0, 1)2 parts. The (1, 0)2 ⊗ (1, 0)2 part at a point (w, w ) ∈ M × M is denoted by the rather cumbersome 2
1 t(w)t(w )(z,F) − t(w) t(w )(z,F) − t(w ) t(w)(z,F) 4π
+ t(w) t(w ) (z,F) . (6.6) Meanwhile, the (1, 0)2 ⊗ (0, 1)2 part at a point (w, w ) is denoted by the analog of (6.6) that is obtained by replacing each occurrence of t(w ) with ¯t(w ). (The notation here best conforms to that used by Gawedzki.) The axioms of conformal field theory specifically require certain singularities in these distributions along the diagonal in M × M, these implied by Eq. (2.27) in [Ga]. The next proposition asserts that the desired singularities do exist. Proposition 6.4. Suppose that w ∈ M is distinct from all components of z. Let x denote a holomorphic coordinate on a neighborhood of w such that x = 0 is mapped to w and such that the given metric pulls back as |d x|2 + O |x|2 . Then the pull-back at x = 0 of the section depicted in (6.6) has the form
2 1 ∂ t (·) (z,F) − t(·) (z,F) |w (d x)2 + . . . , + (6.7) 2 x x ∂w where the three dots indicate terms that are bounded as x → 0. Meanwhile, the pull-back of the joint t(w) and ¯t(w ) version of (6.6) has no singularity at x = 0. The following are true and not hard to prove: The section that is denoted by (6.6) is jointly holomorphic with respect to both w and w . Meanwhile, the joint t(w) and ¯t(w ) version of (6.6) describes a section that is holomorphic with respect to w and anti-holomorphic with respect to w . The remainder of this section is occupied with the proofs of the preceding four propositions. Proof of Proposition 6.1. The assertion follows because the pull-back of aˆ is the solution to the pull-back metric’s version of ( 6.1) as defined using the set ϑ . Proof of Proposition 6.2. The proof begins by describing how the Green’s function, a, ˆ changes when the metric on M is changed by a conformal transformation. For this purpose, let aˆ denote the Green’s function for the original metric and let aˆ u denote the
Green’s function for the new metric. These two functions on $M \times M$ are related as follows:
\[ \hat a^u(z, z') = \hat a(z, z') + q^u(z) + q^u(z'), \tag{6.8} \]
where
\[ q^u(z) = \frac{1}{2}\,\frac{2 - n}{\operatorname{area}(M)} \int_M \hat a(z, z') \Bigl(1 - \frac{\operatorname{area}(M)}{\operatorname{area}^u(M)}\, e^{u(z')}\Bigr)\, dz' + k^u \tag{6.9} \]
with $k^u$ a suitable constant. Here, the integration measure $dz'$ is that defined by the original metric. Meanwhile, $\operatorname{area}^u(M) \equiv \int_M e^u\, dz$ denotes the area of $M$ as computed using the conformally transformed metric.
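A short consistency check may help when reading (6.9). It assumes the reading $q^u(z) = \frac{1}{2}\frac{2-n}{\operatorname{area}(M)}\int_M \hat a(z, z')\bigl(1 - \frac{\operatorname{area}(M)}{\operatorname{area}^u(M)}e^{u(z')}\bigr)dz' + k^u$, which is a reconstruction from the extracted text and should be compared with the published formula. Under that reading, the density that is integrated against $\hat a$ has total integral zero, as any source term for the Laplacian on a compact surface must:

```latex
\[
\int_M \Bigl(1 - \frac{\operatorname{area}(M)}{\operatorname{area}^u(M)}\, e^{u(z')}\Bigr)\, dz'
  = \operatorname{area}(M)
    - \frac{\operatorname{area}(M)}{\operatorname{area}^u(M)} \int_M e^{u(z')}\, dz'
  = \operatorname{area}(M) - \operatorname{area}(M)
  = 0 .
\]
```

The middle step uses only the definition $\operatorname{area}^u(M) \equiv \int_M e^u\, dz$ stated above.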
For each k ∈ {1, . . . , N }, let Vk denote the eigenspace for the Laplacian on P that contains the function f k ; then set V ≡ V 1 × . . . × V N . With V understood, agree now that Aˆ z and Aˆ uz denote the respective old and new versions of the endomorphism of V that appears in (2.8) and (2.9). The claimed relation between · and · u asserts that exp Aˆ z F ( p, . . . , p)dp V P ˆ − E k σ u(z k )/4π exp Aˆ uz = e F ( p, . . . , p)dp. V
P
1≤k≤N
(6.10) To establish (6.10), note that (6.10) and the formula for c given by (6.3) imply that
Aˆ uz = Aˆ z + 2σ
1≤a≤d
+
1≤i≤N
q u (z i )∂ai
1≤i≤N
1 σ u(z i ) 4π
∂ak
1≤k≤N
∂ai ∂ai .
(6.11)
1≤a≤d
Now, keep in mind that each version of ∂ai ∂ai commutes with any given ∂ak . As a consequence, the rightmost term on the right-hand side of the equality in (6.7) can be replaced by − 1≤k≤N Eˆ k σ u(z k )/4π . This replacement accounts for the factor that multiplies the integral on the right-hand side of (6.10). This understood, (6.8) follows with a proof that the left-hand side of (6.10) is unchanged when Aˆ z is replaced by Aˆ z ≡ Aˆ z + 2σ
1≤a≤d
q u (z i )∂ai
1≤i≤N
1≤k≤N
To prove that such is the case, reintroduce the collection
fa ≡
1≤k≤N
∂ak
1≤a≤d
∂ak .
(6.12)
of operators that appear in Lemma 2.1, but viewed as endomorphisms of V. To keep the formula that follows relatively uncluttered, introduceas shorthand ba to denote each u i a ∈ {1, . . . , d} version of 2 1≤a≤d 1≤i≤N q (z i )∂a . Then 1 exp( Aˆ z )V = exp( Aˆ z )V + 2σ exp (1 − τ ) Aˆ z ba fa exp(τ Aˆ z )V dτ. V
0
1≤a≤d
(6.13) Now, each fa commutes with the corresponding ba , and with both Aˆ z and Aˆ z . This understood, then (6.13) implies that exp Aˆ z F ( p, . . . , p)dp V P = exp Aˆ z F ( p, . . . , p)dp + fa Ua dp, (6.14) P
where
1
Ua ≡ 2σ 0
V
1≤a≤d
P
exp (1 − τ ) Aˆ z ba exp(τ Aˆ z )V F ( p, . . . , p)dτ. V
(6.15)
The proof ends by using (2.10) to prove that the rightmost sum in (6.14) is zero.
Proof of Proposition 6.3. To derive a formula for t(·)(z,F) − t(·) (z,F) , digress for a minute to study the τ dependence of the Green’s function on M. For this purpose, note that the small τ metric’s version of the Laplacian on M has the form 1 τ = − τ m¯ − ∗ m 0 (·, ) + O(τ 2 ). (6.16) 2 The notation here uses to denote the τ = 0 Laplacian, the τ = 0 covariant derivative and ∗ its adjoint. Thus, = − ∗ . Meanwhile, m¯ and m 0 are the trace of m and the trace free part of m, these as defined with respect to the τ = 0 metric. Assume in what follows that m = m 0 ; this because the trace of m defines a deformation of the τ = 0 metric in its conformal equivalence class. By virtue of (6.16), the small τ metric’s Green’s function can be written as " # τ
m|w , (d 2 a)| ˆ z )+τ ˆ (z,w) ⊗ d 1 aˆ |(w,z ) dw + O(τ 2 ), (6.17) aˆ (z, z ) = a(z, M ∗ T M. where the notation uses ·, · to denote the τ = 0 metric’s inner product on T ∗ M Meanwhile, d 1 denotes the exterior derivative along the first factor in M × M while d 2 denotes the exterior derivative along the second. It follows as a consequence of (6.17) that
t(·)(z,F) − t(·) (z,F) %(1,0)2 $ d 2 a| = 4π σ ˆ (zi ,w) ⊗ d 1 a| ˆ (w,z k ) 1≤i,k≤N
× P
here
2 [·](1,0)
1
exp(1 − τ ) Aˆ z
0
denotes the type
V
(1, 0)2
∂ai ∂ak exp(τ Aˆ z )V F ( p, . . . , p)dτ dp,
1≤a≤d
part of the given section of
T ∗M
(6.18) T ∗ M.
With (6.18) in hand, the first point to make is that with z fixed, then (6.18) defines a holomorphic section over M − ϑ ∪ {z k }1≤k≤N of the square of the holomorphic cotangent bundle. To see why such is the case, note that the exterior derivative on M can ¯ where ∂ is the projection of the exterior derivative onto the (1, 0) be written as ∂ + ∂, summand in T ∗ MC and ∂¯ is the projection to the (0, 1) summand. Furthermore, if h is any function on M , then the 2-form that is obtained by multiplying the area form by ¯ h is equal to 4∂∂h. This understood, it then follows that $ %(1.0)2 1 ¯∂ d 2 a| ˆ (zi ,w) ⊗ d a| ˆ (w,z k ) =
1 2 a| ˆ (zi ,w) ⊗∂ 1 a| ˆ (w,z k ) +∂ 2 a| ˆ (zi, w) ⊗1 a| ˆ (w,z k ) . 4
(6.19)
Here, ∂ 1 aˆ and ∂ 2 aˆ denote the projections of d 1 aˆ and d 2 aˆ onto the (1, 0) parts of the cotangent bundle of the relevant factor of M. Granted (6.1), this last equation implies that
∂¯ t(·)(z,F) − t(·) (z,F) |w % $ 1 ∂ 2 a| = π σ (1 − n) ˆ (z k ,w) + ∂ 1 a| ˆ (w,z k ) 2 1≤k≤N 1 exp(1−τ ) Aˆ z × fa ∂ak exp(τ Aˆ z )V F ( p, . . . , p)dτ dp, (6.20) P 0
V 1≤a≤d
where $f_a \equiv \sum_{1 \le i \le N} \partial_a^i$. Since $f_a$ commutes with each $\partial_a^i$ and also with $\hat A_z$, it follows that the expression on the right-hand side of (6.20) can be written as $\sum_{1 \le a \le d} \int_P (f_a U_a)(p, \ldots, p)\, dp$, where each $U_a$ is a function on $\times_N P$ with values in $T^{1,0} M$. This understood, the vanishing of the right-hand side of (6.20) follows from Lemma 2.1.

To establish (6.5), view $\hat a(z_1, w)$ with $z_1$ fixed and $w$ near $z_1$ as a function, $\hat a_1$, of $x$. Likewise, view $\hat a(w, z_1)$ as a function, $\hat a_2$, of $x$. Of course both differ from $-\frac{1}{2\pi} \ln(|x|)$ by a function that is smooth near $x = 0$. Note, however, that the smooth terms in $\hat a_1$ and $\hat a_2$ need not be the same. In any event, the exterior derivative of $\hat a_1$ is what appears as $d^2 \hat a$ in (6.18), while that of $\hat a_2$ appears in (6.18) as $d^1 \hat a$. These derivatives have the form
\[ d\hat a_{\nu = 1, 2} = -\Bigl(\frac{1}{4\pi x} - \alpha_\nu\Bigr)\, dx - \Bigl(\frac{1}{4\pi \bar x} - \bar\alpha_\nu\Bigr)\, d\bar x, \tag{6.21} \]
where $\alpha_1$ and $\alpha_2$ are smooth in a neighborhood of $x = 0$. It follows directly from (6.21) that the most singular portion of the $x \to 0$ limit of the expression in (6.18) is given by
1 1 ˆ σ E 1 (z,F) 2 (d x)2 . 4π x
(6.22)
To identify the lower order but divergent term that appears in (6.18) in the limit as w → z 1 , note that the terms α1 and α2 that appear in (6.21) sum at x = 0 to give the differential of c/σ at the point z 1 . Granted this identification, the remaining divergences in (6.18) have the form of 1 υ(d x)2 , (6.23) x
where υ is defined to be ∂ 1 a| ˆ (z 1 ,z k ) + ∂ 2 a| ˆ (z k ,z 1 ) Eˆ 1 ∂c − σ 2≤k≤N
1
× P
0
exp(1 − τ ) Aˆ z
V
∂a1 ∂ak exp(τ Aˆ z )V F ( p, . . . , p)dτ dp.
1≤a≤d
(6.24) Here, ∂c denotes the projection of the exterior derivative of c at z 1 onto the (1, 0) portion of T ∗ M, and ∂ 1 aˆ and ∂ 2 aˆ are, as before, the analogous projections for the forms d 1 aˆ and d 2 a. ˆ It is left as an exercise for the reader to verify that the expression in ( 6.24) is identical to the (1, 0) portion of the differential at z 1 of (z,F) when the latter is considered as a function of the first component of z with the remaining components fixed. Proof of Proposition 6.4. The appearance of the required singular behavior near the diagonal comes from the second of a τ at τ = 0 that appear via the chain derivatives
τ rule when computing those of (z,F) . In particular, they come from the τ -derivative & '(1,0)2 at τ = 0 of the expression d 2 aˆ τ |(z 1 ,w) ⊗ d 1 aˆ τ |(w,z k ) ; this is the a τ analog of the term that appears before the integral in (6.18). This understood, (6.17) can be used to compute the latter derivative. Algebraic manipulation of the resulting expression then gives (6.7). The assertion about the t and ¯t version of (6.6) is proved with this same τ derivative calculation after an appeal to (6.1). The detailed manipulations are reasonably straightforward in both cases and left to the reader. 7. Remarks on Conformal Field Theories The remaining properties for a conformal field theory make demands that require a Hilbert space assignment to a surface with boundary. It is not clear whether these other properties are satisfied. For example, a conformal field theory must assign a vector in Theorem 3.2’s Hilbert space to a surface with a connected boundary and a marked boundary point. It is not clear that the right sort of assignment exists in the case at hand. A conformal field theory also requires that Theorem 3.2’s contraction semi-group consists of trace class operators at positive times. It is not known whether this last condition is satisfied. Note that these are not the only questionable properties. The first subsection that follows elaborates on this manifold with boundary issue. The subsequent subsection initiates a study of the spectrum of the generator of Theorem 3.2’s contraction semi-group. On the positive side of the ledger, there is a pair of unitary S L(2; R) actions on Theorem 3.2’s Hilbert space that fits appropriately into the conformal field theory story. These actions are briefly described in the final subsection. a) Manifolds with boundary. Suppose that M0 is a compact Riemann surface with connected boundary whose metric is flat on some neighborhood of the boundary. 
I shall also assume that the boundary circle is totally geodesic and has length 2π . Suppose that a fiducial point has been specified in the boundary circle. This data is supposed to yield a vector in Theorem 3.2’s Hilbert space via a prescription that is described momentarily.
As remarked at the outset, it is not clear at present whether this prescription can be implemented here. The description of the desired vector assignment follows in four parts. Part 1. The given manifold with boundary can be viewed in a canonical way as the complement of disk in a compact manifold. To see how this comes about, note that there exists a positive number, ε, and coordinates, (t, θ ), on a neighborhood of the boundary with t ∈ (−ε, 0] and θ ∈ R/(2π Z) with the following properties: First, the boundary is the t = 0 slice. Second, the metric appears in these coordinates as dt 2 + dθ 2 . Finally, the point θ = 0 corresponds to the fiducial boundary point. Note that these coordinates are canonically associated with the given data. Now identify this neighborhood of the boundary with the portion of C where the holomorphic coordinate, u, has norm greater than or equal to 1 and less than eε by the map that sends (t, θ ) to u = e−(t+iθ) . Use this identification to glue the |u| ≤ 1 disk in the Riemann sphere C to M0 . Let M denote the resulting compact manifold and w ∈ M the point that corresponds to the origin in the attached disk. I use D ⊂ M in what follows to denote the attached disk. Take the metric on D to be flat and such that the boundary of M is the circle of radius 1. Extend this metric over the rest of M so as to be conformal on M0 to the given metric. Part 2. Suppose next that N is a positive integer and z ∈ × N (R × S 1 ) is a point with pairwise distinct entries and such that each entry has positive first coordinate. The gluing of the |u| ≤ 1 portion of C to M0 identifies z with a point in × N (D − w), and thus with a point in × N (M − w). Meanwhile, let F ∈ C ∞ (× N P) denote a function of the usual sort, one that assigns a given point p = ( p1 , . . . , p N (w) ) to f 1 ( p1 ) . . . f N ( p N ), where each f k is an eigenfunction for the Laplacian on P. The pair z and F defines a function, (z,F) in M’s version of the linear space F. 
Of course, with $z$ viewed in $\times_N(\mathbb{R} \times S^1)$, the pair $(z, F)$ also defines an element in Theorem 3.2’s space $F_{++}$.

Part 3. Let $\hat a^M$ denote the Green’s function on $M$ for the data given by the metric and the one marked point, $w$. Thus, $\hat a^M$ obeys the following version of (6.1):
$$\Delta \hat a^M_z = \delta_z - \frac{1}{2}\Big(\delta_w + \frac{1}{\operatorname{area}(M)}\Big). \tag{7.1}$$
Suppose that both $z$ and $z'$ are distinct points in $D - w$. This understood, both points can be viewed simultaneously as points in the $|u| < 1$ part of $\mathbb{C}$ and so $\hat a^M(z, z')$ can be written as
$$\hat a^M(z, z') = -\frac{1}{2\pi}\ln\frac{|u - u'|}{|u|^{1/2}\,|u'|^{1/2}} - \frac{1}{4\operatorname{area}(M)}\big(|u|^2 + |u'|^2\big) + s(u, u'). \tag{7.2}$$
Here, $u$ and $u'$ are the respective images in $\mathbb{C}$ of $z$ and $z'$ and $s$ is the real part of a symmetric, holomorphic function on the radius 1 polydisk in $\mathbb{C} \times \mathbb{C}$. As can be seen from (7.2), the function $c : M \to \mathbb{R}$ that defines M’s function $a_c$ in (2.9) pulls back to $\mathbb{C} - 0$ from $D - w$ so as to send the holomorphic coordinate $u$ to
$$\frac{1}{2\pi}\ln|u| - \frac{1}{2\operatorname{area}(M)}\,|u|^2 + s(u, u). \tag{7.3}$$
C. H. Taubes
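Part 4. Let $\langle\cdot\rangle^M$ denote the version of the linear form in (2.5). Meanwhile, let $\hat E_k$ denote the value of $-\sum_{1\le a\le d}\partial_a\partial_a$ on the function $f_k$. The assignment of
$$\prod_{1\le k\le N} e^{c(z_k)\hat E_k}\,\big\langle (z,F) \big\rangle^{M} \tag{7.4}$$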
to the pair $(z, F)$ extends as a linear functional on Theorem 3.2’s vector space $F_{++}$. This functional is denoted by $J$ in what follows. Pretend for the moment that $J$ annihilates the kernel of Theorem 3.2’s bilinear form $Q$. Then $J$ defines a linear functional on a dense domain in the Hilbert space $\mathcal{H}$. If this functional on $\mathcal{H}$ is bounded, then it defines a vector in $\mathcal{H}$. The latter is the required conformal field theory vector. The issue of whether $J$ annihilates the kernel of $Q$ requires understanding the effect of the $s(u, u')$ term in (7.2) on the endomorphism $\exp(\hat A_z)_V$ that appears in (2.9).

b) The spectrum of the Hamiltonian. This section initiates a study of the Hamiltonian from Theorem 3.2 for the case where $Y = S^1$, the function $a$ has the form $a = \sigma\hat a$, where $\hat a$ is Green’s function for the Laplacian on $\mathbb{R} \times S^1$, and the function $c$ is given by (6.3). The conclusion here is that the spectrum of the Hamiltonian is not as simple as pure point with no accumulations.

To begin, identify $\mathbb{R} \times S^1$ with $\mathbb{C} - 0$ as in the previous subsection by first writing $S^1$ as $\mathbb{R}/(2\pi\mathbb{Z})$ and then mapping a given $(t, \theta) \in \mathbb{R} \times \mathbb{R}/(2\pi\mathbb{Z})$ to the point $u = e^{-(t+i\theta)}$ in $\mathbb{C}$. This identifies the Green’s function, $\hat a$, with the function on $\times_2(\mathbb{C} - 0)$ that is depicted in (6.2). The function $c$ in this case is identically zero. This version of $a$ and $c \equiv 0$ are used in what follows to define $\langle\cdot\rangle$ for $\mathbb{R} \times S^1$ and thus the Hilbert space in the corresponding version of Theorem 3.2. This Hilbert space is denoted below as $\mathcal{H}$ and the Hamiltonian by $H$. Thus, $-H$ generates Theorem 3.2’s contraction semi-group. Supposing that $\tau \ge 0$, the action on $\mathcal{H}$ of $\exp(-\tau H)$ is such that
$$Q\big((z,F),\ \exp(-\tau H)\,(z',F')\big) = \int_P \Big(\exp\big(\hat A_{(*z,\,qz')}\big)_V\,(\bar F \times F')\Big)(p, \ldots, p)\,dp, \tag{7.5}$$
where $q \equiv e^{-\tau}$ and where $q\cdot z' \equiv (qz'_1, \ldots, qz'_N)$. Here, $z$ and $z'$ are points in $\times_N(\mathbb{C} - 0)$ whose components have norm less than 1 and are pairwise distinct. Meanwhile, $F$ and $F'$ are, as usual, functions on $\times_N P$ that decompose as a product of eigenfunctions of the Laplacian from each factor.

To say more about the $q$-dependence of the right-hand side of (7.5), let $\{f_k\}_{1\le k\le N}$ denote the eigenfunctions on $P$ that define $F$ and, for each $k$, let $V_k$ denote the eigenspace that contains $f_k$. As usual, let $V_F$ denote $V_1 \times \ldots \times V_N$. Use $F'$ to define the analogous $f'_k$, $V'_k$ and $V_{F'}$. Thus, $V = V_F \times V_{F'}$. This done, note that
$$\hat A_{(*z,\,qz')} = \hat A_z + \hat A_{z'} - \frac{\sigma}{2\pi}\sum_{1\le i,j\le N} \ln\Big|(qz'_j\bar z_i)^{1/2} - (qz'_j\bar z_i)^{-1/2}\Big| \sum_{1\le a\le d}\partial^i_a\,\partial'^{\,j}_a, \tag{7.6}$$
where the notation is such that $\hat A_z$ and the unprimed versions of $\partial^i_a$ involve only the factor of $V_F$ in $V$; meanwhile $\hat A_{z'}$ and all $\partial'^{\,i}_a$ involve only the factor of $V_{F'}$ in $V$.
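To continue, note that the sums with $q$ that appear in (7.6) can be written as
$$\begin{aligned}
&\sum_{1\le i,j\le N}\ln(1 - qz'_j\bar z_i)\sum_{1\le a\le d}\partial^i_a\,\partial'^{\,j}_a + \sum_{1\le i,j\le N}\ln(1 - \bar q\bar z'_j z_i)\sum_{1\le a\le d}\partial^i_a\,\partial'^{\,j}_a \\
&\quad - \frac{1}{2}\sum_{1\le a\le d}\Big(\sum_{1\le i\le N}\ln|z_i|\,\partial^i_a\Big)\Big(\sum_{1\le j\le N}\partial'^{\,j}_a\Big) - \frac{1}{2}\sum_{1\le a\le d}\Big(\sum_{1\le i\le N}\partial^i_a\Big)\Big(\sum_{1\le j\le N}\ln|z'_j|\,\partial'^{\,j}_a\Big) \\
&\quad - \frac{1}{2}\ln(q\bar q)\sum_{1\le a\le d}\Big(\sum_{1\le i\le N}\partial^i_a\Big)\Big(\sum_{1\le j\le N}\partial'^{\,j}_a\Big).
\end{aligned} \tag{7.7}$$
The introduction of $\bar q$ is for reference later; $q = \bar q$ for now. Note that the natural logs that appear in the two leftmost terms in (7.7) are defined by the power series expansion that writes $\ln(1 - b) = -\sum_{n=1,2,\ldots} n^{-1}b^n$ for points $b \in \mathbb{C}$ with norm less than 1. This understood, the part of (7.6) that is not analytic in $q$ is a term that has the form
$$\frac{\sigma}{4\pi}\ln(q\bar q)\sum_{1\le a\le d}\Big(\sum_{1\le i\le N}\partial^i_a\Big)\Big(\sum_{1\le j\le N}\partial'^{\,j}_a\Big). \tag{7.8}$$
This endomorphism of $V$ is denoted in what follows as $\hat O$.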
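The power series that defines the logarithms in (7.7) can be checked numerically. What follows is a minimal sketch in Python and is not part of the argument; the function name `log_one_minus` and the sample point `b` are illustrative choices. It compares a partial sum of the series with the principal branch of ln(1 − b), and it checks that the series in b plus the series in the conjugate of b is the real number 2 ln|1 − b|, which is the sort of pairing of q with its conjugate that appears in (7.7).

```python
import cmath
import math

def log_one_minus(b, terms=200):
    """Partial sum of -sum_{n>=1} b**n / n, the series for ln(1 - b) when |b| < 1."""
    total = 0j
    power = 1 + 0j
    for n in range(1, terms + 1):
        power *= b            # power now holds b**n
        total -= power / n
    return total

# An arbitrary sample point with norm 0.3 < 1.
b = 0.3 * cmath.exp(0.7j)

series_value = log_one_minus(b)
principal_value = cmath.log(1 - b)
print(abs(series_value - principal_value))

# The series in b plus the series in conj(b) is real and equals 2 ln|1 - b|.
paired = log_one_minus(b) + log_one_minus(b.conjugate())
print(paired.real - 2 * math.log(abs(1 - b)), paired.imag)
```

The truncation at 200 terms is an arbitrary choice; for points closer to the unit circle more terms are needed.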
As it turns out, $\hat O$ commutes with $\hat A_z$, $\hat A_{z'}$ and the two leftmost terms in (7.7). However, $\hat O$ does not commute with all of (7.7). This last conclusion has certain ramifications that will now be explored. To start, imagine for the moment that $H$ has pure point spectrum with finite multiplicities and with no accumulation points. Were this the case, then (7.5) with $q$ real could be written as a convergent series with each term a power of $q$, thus as
$$\sum_E n_E\,q^{-E}. \tag{7.9}$$
Moreover, the sum in (7.9) would be indexed by the set of eigenvalues of $H$; and each version of $n_E$ would be a $q$-independent constant. Just such a sum would arise were the operator $\hat O$ to commute with all of the terms in (7.7). To see this, first expand each of the natural logarithms that appear in the two leftmost terms of (7.7) as respective power series in $q$ and $\bar q$. This expansion allows the operator in (7.6) to be written as
$$\hat A_z + \hat A_{z'} + b(q, \bar q) + \hat O, \tag{7.10}$$
where $b$ is the sum of a convergent power series in $q$ and one in $\bar q$. Use standard perturbation theory to expand the exponential of the latter as an infinite power series in $b$ whose zeroth order term is $\exp(\hat A_z)_V \exp(\hat A_{z'})_V \exp(\hat O)$. With $\hat O$ lacking, the resulting expansion has the form $\sum_{0\le n,n'} m_{n,n'}\,q^{-n}\bar q^{-n'}$, where the sum is over non-negative integer pairs $(n, n')$ and where the coefficients are independent of $q$ and $\bar q$. The presence of a commuting version of $\hat O$ replaces this last expansion with
$$\sum_{0\le n,n'}\sum_{\lambda} u_{n,n',\lambda}\,q^{-(n+\lambda)}\,\bar q^{-(n'+\lambda)}, \tag{7.11}$$
where $\lambda$ ranges over the set of eigenvalues of $\hat O$ on $V$. The real $q$ version of (7.11) has the form of the sum in (7.9). Now consider what happens when $\hat O$ does not commute with $b$. As is explained next, this leads to a version of (7.11) where certain coefficients $u_{(\cdot)}$ are functions of $\tau$ that have convergent expansions at large $\tau$ in powers of $1/\tau$. To see how this comes about, again write the exponential of the endomorphism in (7.10) as a convergent power series in $b$. The first two terms in this expansion are
$$\exp(\hat A_z + \hat A_{z'} + \hat O) + \int_0^1 \exp\big((1 - s)(\hat A_z + \hat A_{z'} + \hat O)\big)\, b\, \exp\big(s(\hat A_z + \hat A_{z'} + \hat O)\big)\,ds. \tag{7.12}$$
Of interest here is the rightmost term. In particular the latter can intertwine distinct eigenspaces of $\hat A_z + \hat A_{z'} + \hat O$. Suppose that such is the case and let $\mu_1$ and $\mu_2$ denote the corresponding eigenvalues of $\hat A_z + \hat A_{z'} + \hat O$ on the respective initial and final eigenspaces. The integral in (7.12) has the form of a power series in $q$ and $\bar q$ times
$$\frac{e^{\mu_2} - e^{\mu_1}}{\mu_2 - \mu_1}. \tag{7.13}$$
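The factor in (7.13) is what the s-integral in (7.12) produces when the two exponentials act as scalars on the initial and final eigenspaces, for then the integrand is exp((1 − s)µ2) exp(sµ1). The following is a small numerical sketch of this identity in Python; the helper names and the sample values of µ1 and µ2 are hypothetical and chosen only for illustration.

```python
import math

def intertwining_integral(mu1, mu2, steps=2000):
    """Composite Simpson approximation of the integral over [0, 1] of
    exp((1 - s) * mu2) * exp(s * mu1); steps must be even."""
    h = 1.0 / steps

    def integrand(s):
        return math.exp((1 - s) * mu2 + s * mu1)

    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(i * h)
    return total * h / 3

def closed_form(mu1, mu2):
    """The expression in (7.13), valid for mu1 != mu2."""
    return (math.exp(mu2) - math.exp(mu1)) / (mu2 - mu1)

mu1, mu2 = -1.3, 0.4   # sample eigenvalues, assumed distinct
print(intertwining_integral(mu1, mu2), closed_form(mu1, mu2))
```

When µ1 = µ2 the integrand is constant and the integral is exp(µ1), which is the limit of (7.13) as µ2 → µ1.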
Granted this, suppose that the initial and final eigenspaces have distinct $\hat O$ eigenvalues. In this regard, remember that $\hat O$ commutes with $\hat A_z$ and $\hat A_{z'}$. Write the initial $\hat O$ eigenvalue as $\tau\lambda_1$ and the final $\hat O$ eigenvalue as $\tau\lambda_2$ with both $\lambda_1$ and $\lambda_2$ independent of $\tau$. With $\tau$ large, (7.13) can be written as
$$\frac{1}{\tau}\,\frac{\alpha_2\,q^{\lambda_2} - \alpha_1\,q^{\lambda_1}}{\lambda_2 - \lambda_1}\,(1 + \ldots), \tag{7.14}$$
where the missing terms are $O(\tau^{-2})$. Here, $\alpha_1$ and $\alpha_2$ are constants that are determined by the eigenvalues of $\hat A_z + \hat A_{z'}$ on the respective initial and final eigenspaces. The power of $1/\tau$ in (7.14) indicates that the spectrum of $H$ is not pure point without accumulations.

By the way, the existence of accumulations in the spectrum of $H$ is suggested by the following observation: Consider the $Y = S^1$ case and $a$ has the form depicted in the top line of (3.1) where the indexing set for the sum is the integers, each $\eta_\alpha(y)$ is $\exp(i\alpha y)$, each non-zero version of $E_\alpha$ is $|\alpha|$, and $E_0 = m$ with $m$ positive but very small. The arguments just given prove that the function $\tau \to Q\big((z,F),\ \exp(-\tau H)\,(z',F')\big)$ for these $m > 0$ cases has a large $\tau$ expansion as depicted in (7.9), where each $n_E$ is constant. Moreover, the values of $E$ that appear have the form of a sum of the elements from some finite set of elements chosen from the set $\{E_\alpha\}_{\alpha\in\mathbb{Z}}$. In particular, finite sets with any given number of $E_0$’s can appear with a fixed number of nonzero $\alpha$ versions of $E_\alpha$. As a consequence, the set of such $E$ sees eigenvalues accumulate as $m \to 0$.

To tie up a loose end, note that the introduction of $\bar q$ in (7.7) makes sense in the context of the circle action on $\mathcal{H}$ that is induced by the group of rotations of the $S^1$ factor of $\mathbb{R} \times S^1$. To be precise here, the action of $\varphi \in \mathbb{R}/(2\pi\mathbb{Z})$ on any given (z,F) $\in F_{++}$ sends the latter to the vector (z',F), where $z' \in \times_N(\mathbb{C} - 0)$ is obtained from $z$ by multiplying the latter’s components by $e^{-i\varphi}$. This action preserves $F_{++}$ and is isometric with respect to the bilinear form $Q$. In particular, it maps the kernel of $Q$ to itself and so it descends as a circle action on $\mathcal{H}$. As $Q$ is preserved by the circle action on $F_{++}$, the
resulting action on H is unitary. As the action is strongly continuous, Stone’s theorem [St] finds it generated by an operator of the form −iP, where P is a self-adjoint operator on H. Note that P commutes with H because the circle action on F++ commutes with the action of the semi-group {Rτ }τ ≥0 . Supposing that τ > 0, then Q((z,F) , exp(−τ H − iϕP)(z ,F ) ) is given by the right-hand side of (7.5), where q is now e−(τ +iϕ) . Note that (7.6) holds now as does (7.7) but with q as just described. Thus q need not equal q. ¯ In principle, the expression in (7.11) can be used to study the spectrum of H ± P. c) A pair of group actions on H. What follows is meant as a brief description of a pair of actions on H of the universal covering group of S L(2; R). One of these actions is a bonafide S L(2; R) action; the other is not. The story starts with the group S L(2; C) and its standard action on the Riemann sphere, C ∪ ∞, as the group of conformal diffeomorphisms of the round metric. Let z ∈ R × S 1 denote a point with positive first component. As explained previously, z can be viewed as a point in C∗ with norm less than 1. This understood, there exists an open neighborhood, U ⊂ PSL(2; C), of the identity element such that each element in U maps z to a point in C∗ with norm less than 1. The neighborhood U depends only on |z|. When q ∈ U , use qz to denote the point where q sends z. Now let z ∈ (z 1 , . . . , z N ) ∈ × N (R × S 1 ) be a point with pairwise distinct components, each with norm less than 1 when R × S 1 is viewed as C∗ . Set Uz ⊂ S L(2; C) to denote the intersection of the versions of U that are defined by the components of z. When q ∈ Uz , use q · z to denote the point (qz 1 , . . . , qz N ). Meanwhile, let F ∈ C ∞ (× N P) denote the usual sort of function, one that decomposes so as to send p = ( p1 , . . . , p N ) to f 1 ( p1 ) . . . f N ( p N ), where each f k is an eigenfunction of the Laplacian on P . Let Eˆ k denote −1 times f k ’s eigenvalue. 
When $q \in U_z$, set
$$T[q]\cdot(z,F) \equiv \prod_{1\le k\le N} e^{-\hat E_k\,\sigma u(z_k)/4\pi}\,(q\cdot z,\,F). \tag{7.15}$$
Here, $u(\cdot)$ is determined by $q$ as follows: Write
$$q = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{7.16}$$
and $u(w) = \ln(|aw - b| \cdot |cw - d|) - \ln|w|$. Now let $z' \in \times_N(\mathbb{R} \times S^1)$ be a second point with pairwise distinct components, each with norm less than 1; and let $F' \in C^\infty(\times_N P)$ denote a function of the same sort as $F$. If $q \in U_z \cap U_{z'}$, then it follows from Proposition 6.2 that
$$Q\big(T[q^*]\,(z',F'),\ T[q]\,(z,F)\big) = Q\big((z',F'),\ (z,F)\big), \tag{7.17}$$
where $q^*$ is
$$q^* \equiv \begin{pmatrix} \bar d & \bar c \\ \bar b & \bar a \end{pmatrix} \tag{7.18}$$
when $q$ is given by (7.16). Granted (7.18), agree to identify $SL(2;\mathbb{R}) \subset SL(2;\mathbb{C})$ with the subset of matrices as depicted in (7.16), where $b = \bar c$, $d = \bar a$. Elements of this sort have $q^* = q$. According to (7.17), the elements near the identity in $SL(2;\mathbb{R})$ act unitarily on a domain in $\mathcal{H}$.
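The two special families of matrices that enter here can be checked with a few lines of arithmetic. The sketch below is in Python and rests on one stated assumption: it reads (7.18) as the matrix with rows $(\bar d, \bar c)$ and $(\bar b, \bar a)$, which is the reading for which $b = \bar c$, $d = \bar a$ gives $q^* = q$. The names `star`, `matmul` and the sample matrices are hypothetical.

```python
import cmath
import math

def star(q):
    """Assumed reading of (7.18): rows (conj d, conj c) and (conj b, conj a)."""
    (a, b), (c, d) = q
    return ((d.conjugate(), c.conjugate()), (b.conjugate(), a.conjugate()))

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

# A determinant-one sample with b = conj(c), d = conj(a); expect star(q) == q.
a = math.cosh(0.5) * cmath.exp(0.3j)
b = math.sinh(0.5) * cmath.exp(1.1j)
q_unitary_type = ((a, b), (b.conjugate(), a.conjugate()))
print(star(q_unitary_type) == q_unitary_type)

# A determinant-one sample with b = -conj(c) and a, d real; expect star(q) to be
# the inverse of q, so that q times star(q) is the identity up to rounding.
c = 0.3 + 0.4j
q_hermitian_type = ((1.5 + 0j, -c.conjugate()), (c, 0.5 + 0j))
product = matmul(q_hermitian_type, star(q_hermitian_type))
print(product)
```

The first family is the one identified with $SL(2;\mathbb{R})$ in the text; the second is the family with $q^* = q^{-1}$.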
Moreover, the intersection of these domains is dense in $\mathcal{H}$. This understood, an argument much like the one in Sect. 4 finds a unitary action of $SL(2;\mathbb{R})$ on $\mathcal{H}$ with the operator $iP$ as one of its generators. Meanwhile, there is an embedded $\mathbb{R} \times S^2$ in $SL(2;\mathbb{C})$, where $q^* = q^{-1}$; this is the subset of matrices with $b = -\bar c$ with both $a$ and $d$ being real. Any such element near the identity in $SL(2;\mathbb{C})$ acts in a hermitian fashion on a domain in $\mathcal{H}$. Arguing as in Sect. 4, the square root of $-1$ times the tangent space at 1 to this $\mathbb{R} \times S^2$ generates a unitary action of the universal cover of $SL(2;\mathbb{R})$ on $\mathcal{H}$. One of the generators of this action is $iH$. These two actions are those mentioned at the start of this section.

8. Free Field Theories

The purpose of this final section is to describe a linear functional on F that comes directly from a Gaussian measure on the space of maps from a given manifold $M$ to the Lie algebra of $P$. This Gaussian measure can then be used to define a ‘free quantum field’ theory whose Hilbert space is a quotient of certain versions of the space $F_{++}$ that appears in Theorem 3.2. The ensuing discussion has seven parts.

Part 1. Fix a manifold $M$ and the group $P$. As before, use $\mathfrak{p}$ to denote the Lie algebra of $P$. A Gaussian measure on Maps$(M; \mathfrak{p})$ is defined by a positive definite, integral kernel, $q$, on $M \times M$ with possibly singular behavior on the diagonal. Integrals with respect to the corresponding Gaussian measure are defined initially by their values on functions from Maps$(M; \mathfrak{p})$ to $\mathbb{C}$ of the following sort: Let $h$ denote a continuous, complex valued, compactly supported map to $\mathfrak{p}$. Now associate to $h$ the function on Maps$(M; \mathfrak{p})$ that sends a given map $\phi$ to $\exp\big(\int_{\mathbb{R}\times Y}\sum_{1\le a\le d} h_a(z)\phi_a(z)\,dz\big)$. Here, $\{h_a\}_{1\le a\le d}$ are the components of $h$ with respect to the basis of $\mathfrak{p}$ that corresponds to the chosen basis, $\{\partial_a\}_{1\le a\le d}$, of left invariant vector fields on $P$. The integral of this function using the Gaussian measure is defined to be
$$\exp\Big(\int_{M\times M}\sum_{1\le a_1,a_2\le d} q(z_1, z_2)_{a_1,a_2}\,h_{a_1}(z_1)\,h_{a_2}(z_2)\,dz\Big). \tag{8.1}$$
The integrals so defined are extended linearly to the vector space of finite linear combinations of such functions. These integrals are denoted below by (·). An example of interest for quantum field theory aficionados is the case $M = \mathbb{R} \times Y$ and $q$ equal to $\sigma\hat a$, where $\hat a$ is the Green’s function that is used in Sect. 3 to define $\langle\cdot\rangle$, and $\sigma$ is a positive constant.

Part 2. Let $z = (z_1, \ldots, z_N) \in \times_N M$ denote a point with pairwise distinct entries. Meanwhile, let $(a_1, \ldots, a_N) \in \times_N\{1, \ldots, d\}$. As it turns out, functions on Maps$(M; \mathfrak{p})$ of the form
$$\phi \to \phi_{a_1}(z_1)\cdots\phi_{a_N}(z_N) \tag{8.2}$$
are integrable. In fact, let $f$ denote the function on $\times_N\mathfrak{p}$ that sends a given $N$-tuple $\vec\tau \equiv (\tau_1, \ldots, \tau_N)$ to $\tau_{1a_1}\cdots\tau_{Na_N}$. Then
$$\big(\phi_{a_1}(z_1)\cdots\phi_{a_N}(z_N)\big) = \Big(\exp\Big(\sum_{1\le i\le j\le N}\ \sum_{1\le a_1,a_2\le d} q(z_i, z_j)_{a_1,a_2}\,\partial^i_{a_1}\partial^j_{a_2}\Big) f\Big)\Big|_{\vec\tau=0}. \tag{8.3}$$
If $q$ is Hölder continuous across the diagonal, then the equality in (8.3) makes good sense even when components of $z$ are not pairwise distinct (see, e.g. Theorem 3.2 in [T].) In this case, the formula on the right-hand side of (8.3) defines the Gaussian measure’s integral of any function on Maps$(M; \mathfrak{p})$ that sends $\phi$ to $f(\phi(z_1), \ldots, \phi(z_N))$ with $f$ a smooth, bounded function on $\times_N\mathfrak{p}$. In particular, (8.3) makes good sense when $f$ is the pull-back via a smooth map from $\times_N\mathfrak{p}$ to $\times_N P$. This last observation leads to the ‘Gaussian measure’ on Maps$(M; P)$ that is described in [T].

Part 3. Of particular interest here is the case where $q$ is singular on the diagonal in $M \times M$. A strategy employed in this case replaces $q$ with $q_\gamma$, where $q_\gamma(z, z') = q(z, z')$ when $z \ne z'$ and $q_\gamma(z, z) \equiv \gamma(z)$ with $\gamma$ a continuous function on $M$. As the operator
$$h_{\gamma,z} \equiv \sum_{1\le i\le j\le N}\ \sum_{1\le a_1,a_2\le d} q_\gamma(z_i, z_j)_{a_1a_2}\,\partial^i_{a_1}\partial^j_{a_2} \tag{8.4}$$
is unbounded from below when $z$ has a pair of components that are close in $M$, the $q_\gamma$ version of the exponential that appears in (8.3) cannot be defined for such $z$ as the time 1 element in a contraction semi-group on $L^2(M; \times_N\mathfrak{p})$. This issue is often avoided by defining $\exp(h_{\gamma,z})f$ by the series $\sum_{n=0,1,\ldots}\frac{1}{n!}h^n_{\gamma,z}f$. Of course, the power series definition makes sense only for a very prescribed set of functions $f$ since the series in question does not generally converge. For example, the definition makes sense when $f$ is a polynomial in the various components of $\vec\tau$. For such $f$, the series $\sum_{n=0,1,\ldots}\frac{1}{n!}h^n_{\gamma,z}f$ has but a finite set of non-zero terms. For a second example, note that the series in question converges when $f(\vec\tau) = \exp\big(\sum_{1\le i\le N}\sum_{1\le a\le d}k_{ia}\tau_{ia}\big)$, with each $k_i = (k_{i1}, \ldots, k_{id})$ a fixed vector in $\mathfrak{p}$. The series in this case sums to
$$\exp\Big(\sum_{1\le i\le j\le N}\ \sum_{1\le a_1,a_2\le d} q_\gamma(z_i, z_j)_{a_1a_2}\,k_{ia_1}k_{ja_2}\Big). \tag{8.5}$$
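The remark that the exponential series terminates on polynomials can be seen in the simplest possible setting. The following is a minimal sketch in Python under two simplifying assumptions that are mine and not the text’s: there is a single point (N = 1) and the Lie algebra is one-dimensional (d = 1), so that the operator in (8.4) reduces to γ d²/dτ². Polynomials are stored as coefficient lists, and the names `second_derivative` and `exp_h` are hypothetical.

```python
from fractions import Fraction

def second_derivative(coeffs):
    """Second derivative of a polynomial; coeffs[k] is the coefficient of tau**k."""
    return [Fraction(k * (k - 1)) * c for k, c in enumerate(coeffs)][2:]

def exp_h(coeffs, gamma):
    """Sum of the series sum_n (gamma * d^2/dtau^2)^n f / n! for a polynomial f.

    The loop stops once the current term is the zero polynomial, which happens
    after finitely many steps because each step lowers the degree by two.
    """
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]   # the n = 0 term, f itself
    n = 0
    while term:
        for k, c in enumerate(term):
            result[k] += c
        n += 1
        term = [gamma * c / n for c in second_derivative(term)]
    return result

# f(tau) = tau**4 with gamma = 1/2.
coefficients = exp_h([0, 0, 0, 0, 1], Fraction(1, 2))
print([int(c) for c in coefficients])   # -> [3, 0, 6, 0, 1]
```

With these choices the result is τ⁴ + 6τ² + 3, and its value at τ = 0 is 3. In general the constant term for f = τ⁴ is 12γ², which agrees with the fourth moment of a centered Gaussian variable of variance 2γ.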
A definition of the Gaussian integral of $f(\phi(z_1), \ldots, \phi(z_N))$ using $q_\gamma$ is what is called a ‘normal ordering’ prescription.

Part 4. There are yet other functions $f$ for which sense can be made of the right-hand side of (8.3), these relevant to Maps$(M; P)$. To say more, suppose that $f$ is a smooth, bounded function on $\times_N\mathfrak{p}$. When $\varepsilon > 0$, the function $\vec\tau \to \exp(-\varepsilon|\vec\tau|^2)f(\vec\tau)$ has a Fourier transform, a function on $\times_N\mathfrak{p}$ that is denoted by $t_\varepsilon f$. Now, fix a point $z \in \times_N M$ with pairwise distinct entries. Also, fix $r > 0$ and consider
$$\int_{\vec k\in\times_N\mathfrak{p}\,:\,|\vec k|\le r}\exp\Big(\sum_{1\le i\le j\le N}\ \sum_{1\le a_1,a_2\le d} q_\gamma(z_i, z_j)_{a_1a_2}\,k_{ia_1}k_{ja_2}\Big)\,(t_\varepsilon f)(\vec k)\,d\vec k. \tag{8.6}$$
If the $\varepsilon \to 0$ limit of $t_\varepsilon f$ defines a distribution with compact support, then the $\varepsilon \to 0$ limit of (8.6) is independent of $r$ when $r$ is large. In this case, the ordered limit of (8.6) with $\varepsilon \to 0$ first and then $r \to \infty$ makes good sense and so defines the Gaussian integral of $f(\phi(z_1), \ldots, \phi(z_N))$. The sort of functions to consider with regard to (8.6) are described in the next lemma.
Lemma 8.1. Fix a positive integer, N , and for each k ∈ {1, . . . , N }, fix an eigenfunction, f k , for the Laplacian on P. For fixed p ∈ P, define f p to be the function on × N p that sends τ ≡ (τ1 , . . . , τ N ) to f p ( τ ) ≡ f i ( p · exp(τ1 )) . . . f N ( p · exp(τ N )) dτ.
(8.7)
Then, the ε → 0 limit of the Fourier transform of the positive ε versions of the function on × N p that sends to exp(−ε | τ |2 ) f p (τ ) is a compactly supported distribution on × N p. This lemma is proved at the end of this section for the case when P = SU (2). The proof for the general case follows along the same lines and is omitted. Assume for now the conclusions of the lemma. With N a positive integer, z ∈ × N M a point with pairwise distinct entries, and ( f 1 , . . . f N ) a set of eigenfunctions of the Laplacian on P, reintroduce (z,F) to denote the function p on Maps(M; P) that sends any given map φ to f 1 (φ(z 1 )) . . . f N (φ(z N )). Use f z to denote the ε → 0 and the r → ∞ limit of (8.6) and then set p (z,F) ≡ f z dp. (8.8) P
The assignment of (z,F) to (z,F) extends by linearity to define a functional on the same space, F, where · is defined from a given pair of function, a, on M × M and c : M → R. The functional defined by (·) is what I will call a Gaussian expectation on F. The linear functions (·) and · on F will not agree unless P is abelian even with q and γ respectively equal to the functions a and c that define · . This can be seen directly in the case where M is a compact Riemann surface with a and c as in Sect. 6. For example, the behavior of (·) with respect to conformal changes in the metric is rather more complicated than that described in Proposition 6.1 for · . Part 5. Proposition 3.1 and Theorem 3.2 have the following Gaussian analog: Proposition 8.2. Suppose that M = R × Y and that q = σ0 a, ˆ where aˆ is given by the top line in (3.1 ) when Y is compact, by (3.15) when Y = Rn and n > 1, and by (3.15) with m positive when Y = R. Here, σ0 is a positive constant. Meanwhile, take γ to be a function that is pulled up from Y . In all these cases, the Gaussian expectation on F has the following properties: • It is translation invariant in that (Rτ ) = () for any given τ ∈ R and ∈ F, • It is ‘reflection positive’ in the sense that ((∗) ) is non-negative for any ∈ F++ . Use the bilinear form on F++ that is defined so as to send a given Q G to denote pair , to (∗) . The action on F++ of the 1-parameter semi-group {Rτ }τ ≥0 descends as a strongly continuous, self-adjoint, contraction semigroup to the Hilbert space that is obtained by completing F++ / ker(Q G ) using the norm that is defined by QG . This proposition follows from what is known about Gaussian measures. For example, it can be deduced using the results in Chapter 6 of [G-J]. Note that Proposition 8.2 says nothing about the cases where aˆ is given by either the bottom line in (3.1) or by the Green’s function 2 2 1 − ln t − t + y − y
4π
when $Y = \mathbb{R}$. To say more about this absence, remark that the proof of Proposition 3.1 for these cases uses (2.10) at a key juncture. The steps that use (2.10) for the proof of Proposition 3.1 require here a different condition; this is the vanishing of each $a \in \{1, \ldots, d\}$ version of
$$\sum_{1\le i\le N}\frac{\partial}{\partial\tau_{i,a}}\int_P f_1(p\exp(\tau_1))\cdots f_N(p\exp(\tau_N))\,dp. \tag{8.9}$$
However, the expression in (8.9) is not in general equal to zero. As a parenthetical remark, one can try to circumvent this problem with the use of an alternate definition of (·) that builds in the required analog of (8.9). In particular, one can imagine a definition that starts by replacing (8.6) with
$$\int_{\vec k\in\times_N\mathfrak{p}\,:\,|\vec k|\le r}\exp\Big(\sum_{1\le i\le j\le N}\ \sum_{1\le a_1,a_2\le d} q_\gamma(z_i, z_j)_{a_1a_2}\,k_{ia_1}k_{ja_2}\Big)\,\delta_0\Big(\sum_{1\le i\le N}k_i\Big)\,(t_\varepsilon f)(\vec k)\,d\vec k, \tag{8.10}$$
where $\delta_0(\cdot)$ denotes the Dirac delta function with mass 1 at the origin in $\mathfrak{p}$. However, I do not know whether the $\varepsilon \to 0$ limit of the distribution $\vec k \to \delta_0\big(\sum_{1\le i\le N}k_i\big)(t_\varepsilon f)(\vec k)$ is itself a distribution when $f$ is any given $f^p$ from (8.7) or the average over $P$ of the family of such functions.

Part 6. The Hamiltonians that arise via Proposition 8.2 have much in common with the analogous ones from Theorem 3.2. In some sense, this is no surprise as the same data define both. What follows describes an example. Let $Y = S^1$, and with $m > 0$ take the case where
$$a\big((t, y), (t', y')\big) = \sum_{n\in\mathbb{Z}}\frac{1}{2E_n}\,e^{-E_n|t-t'|}\,e^{in(y-y')} \tag{8.11}$$
with $E_n \ne 0$ for all $n$. For example, one can consider $E_n = (n^2 + m^2)^{1/2}$, or $E_n$ equal to $|n|$ when $n \ne 0$ and $E_0 = m$. In any event, assume that $\{E_n\}_{n\in\mathbb{Z}}$ has no accumulation points. Take the function $c$ to be identically zero. Suppose that $N$ is a positive integer and that $\{f_1, \ldots, f_N\}$ is a set of eigenfunctions for the Laplacian on $P$. As usual $F$ denotes the function on $\times_N P$ that sends $(p_1, \ldots, p_N)$ to $f_1(p_1)\cdots f_N(p_N)$. Let $z \in \times_N(\mathbb{R} \times S^1)$ denote a point with distinct components, each with positive $\mathbb{R}$ factor. The resulting function (z,F) on Maps$(M; P)$ is in $F_{++}$ and so defines an element in Theorem 3.2’s Hilbert space. Let $-H$ again denote the generator of Theorem 3.2’s contraction semigroup. As noted for an example at the end of Sect. 7b, there exists a large $\tau$ expansion of the form
$$Q\big((z,F),\ \exp(-\tau H)\,(z,F)\big) = \sum_E n_E\,e^{-E\tau}, \tag{8.12}$$
where the values of $E$ are the sums of the elements in some finite set chosen from $\{E_\alpha\}_{\alpha\in\mathbb{Z}}$. Meanwhile, let $H_G$ denote the Hamiltonian for the case described by Proposition 8.2 where $q = a$ and $\gamma = 0$. An argument very similar to that used in Sect. 7b finds that the function $\tau \to Q_G\big((z,F),\ \exp(-\tau H_G)\,(z,F)\big)$ on $[0, \infty)$ has a large $\tau$, asymptotic
expansion that also takes the form that is depicted in (8.12). Moreover, the values of $E$ that appear for the Gaussian case come from the same set as those that appear in the case from Theorem 3.2. Note, however, that corresponding versions of $n_E$ can differ.

Part 7. This last part of the section contains the $P = SU(2)$ version of the

Proof of Lemma 8.1. Fix an irreducible representation, $V$, for the group $P$. Such a representation determines an eigenspace for the Laplacian on $SU(2)$; functions in this eigenspace have the form given in (5.1). The $\psi_p$ pull-back of (5.1) is
$$\tau \to \eta^{p\dagger}\rho_V(\exp(\tau))\,\nu, \tag{8.13}$$
here $\eta^p$ is shorthand for $\rho_V(p)\eta$. Lemma 8.1 follows from the claim that the $\varepsilon \to 0$ limit of positive $\varepsilon$ versions of the function
$$k \to \int_{\mathfrak{p}}\exp\Big(i\sum_{1\le a\le d}k_a\tau_a - \varepsilon|\tau|^2\Big)\,\eta^{p\dagger}\rho_V(\exp(\tau))\,\nu\,d\tau \tag{8.14}$$
is a compactly supported distribution on $\mathfrak{p}$. Since the function in (8.13) is smooth and bounded on $\mathfrak{p}$, the $\varepsilon \to 0$ limit of the function in (8.14) defines a distribution on $\mathfrak{p}$. This understood, the issue is whether the distribution has compact support. The arguments that follow establish that such is the case when $P = SU(2)$.

To start, suppose that $k \ne 0$, and let $\hat o_3 \in su(2)$ denote a unit vector that is tangent to $k$. Now introduce $\hat o_+$ and $\hat o_-$ to denote vectors in the complexified Lie algebra with the following properties: They are orthogonal to $\hat o_3$, they have norm squared equal to $\frac{1}{2}$, they are such that $\hat o_+^\dagger = \hat o_-$, and $[\hat o_3, \hat o_\pm] = \pm 2i\,\hat o_\pm$. Use $r \in \mathbb{R}$ and $w \in \mathbb{C}$ to parameterize $su(2)$ via the map that sends a given $(r, w)$ to $r\hat o_3 + w\hat o_+ - \bar w\hat o_-$. Consider now when $V = \mathbb{C}^2$ is the fundamental representation. The representation sends $\hat o_\pm$ to matrices with square 0. As a consequence,
$$\rho_{\mathbb{C}^2}(\exp(\tau)) = \exp(ir\hat o_3)\Big(1 + \int_0^1 ds\,\big(we^{-irs}\hat o_+ - \bar we^{irs}\hat o_-\big) + \ldots\Big), \tag{8.15}$$
where the $(2n+1)$st term is
$$|w|^{2n}\int_{s\in\Delta_{2n+1}}\Big(we^{-ir(s_1-s_2+s_3-\ldots+s_{2n+1})}\,\hat o_+\hat o_-\cdots\hat o_+ - \bar we^{ir(s_1-s_2+s_3-\ldots+s_{2n+1})}\,\hat o_-\hat o_+\cdots\hat o_-\Big)\,ds_1\cdots ds_{2n+1} \tag{8.16}$$
and the $2n$th term is
$$|w|^{2n}\int_{s\in\Delta_{2n}}\Big(e^{-ir(s_1-s_2+s_3-\ldots-s_{2n})}\,\hat o_+\hat o_-\cdots\hat o_- - e^{ir(s_1-s_2+s_3-\ldots-s_{2n})}\,\hat o_-\hat o_+\cdots\hat o_+\Big)\,ds_1\cdots ds_{2n}. \tag{8.17}$$
Here, $\Delta_k$ is the $k$-dimensional simplex where $0 \le s_k \le \ldots \le s_1 \le 1$.
The terms depicted in (8.16) contribute zero to (8.14). Meanwhile, the term in (8.17) contributes
$$\begin{aligned}
&\frac{\pi^{3/2}\,n!}{\varepsilon^{n+3/2}}\int_{s\in\Delta_{2n}}\exp\Big(-\frac{1}{4\varepsilon}(\kappa + 1 - s_1 + s_2 - \ldots + s_{2n})^2\Big)\,ds_1\cdots ds_{2n}\ \eta^{p\dagger}\hat\pi_+\nu \\
&\quad + \frac{\pi^{3/2}\,n!}{\varepsilon^{n+3/2}}\int_{s\in\Delta_{2n}}\exp\Big(-\frac{1}{4\varepsilon}(\kappa - 1 + s_1 - s_2 + \ldots - s_{2n})^2\Big)\,ds_1\cdots ds_{2n}\ \eta^{p\dagger}\hat\pi_-\nu.
\end{aligned} \tag{8.18}$$
Here, $\kappa$ is defined so that $k = \kappa\hat o_3$, and $\hat\pi_\pm$ are the respective projections onto the $\pm i$ eigenspaces of $\hat o_3$. To continue, suppose that $\kappa \ge 2$. In this case, $(\kappa \pm (1 - s_1 + s_2 - \ldots + s_{2n}))$ has size at least 1. Under this condition, any given version of (8.18) has limit zero as $\varepsilon \to 0$. Even so, it is important to consider the convergence of their sums. For this purpose, remark that the integrals in (8.18) are no greater than a $\kappa$ and $n$ independent multiple of
$$(\kappa - 1)^{-(2n+3)}\,n!\,(n + 3/2)^n\,e^{-n}/(2n)!. \tag{8.19}$$
Here, the factor $1/(2n)!$ is the volume of $\Delta_{2n}$. Meanwhile, Stirling’s approximation finds the expression in (8.19) on the order of $2^{-n}(\kappa - 1)^{-2n-3}$ for large $n$. What with any given integer $n$ version of (8.18) limiting to zero as $\varepsilon \to 0$, this last observation implies that (8.14) limits uniformly to zero as $\varepsilon \to 0$ in both the $C^0$ and $L^1$ topology on the complement of the radius 2 ball in $su(2)$.

The case where $\dim(V) > 2$ can be handled using the following device: View $\mathbb{C}^2$ as the defining representation of $SU(2)$. Fix $m \ge 1$ and let $W \equiv \otimes_m\mathbb{C}^2$. View $W$ as a representation of $SU(2)$ via the simultaneous action of any given element on all of the $m$ factors. This representation is not irreducible, but any given irreducible representation is contained as a summand in some large $m$ version of $W$. Granted this, it follows that the $\varepsilon \to 0$ limit of (8.14) has compact support as a distribution in the variable $k$ if such is the case for the analogous limit where $\rho_V$ is replaced by $\rho_W$ in the case where $W$ contains $V$ as a summand. To see that such is the case, it is enough to consider the case where $\eta$ and $\psi$ are decomposable, thus $\eta = \eta_1 \otimes \ldots \otimes \eta_m$ and $\psi = \psi_1 \otimes \ldots \otimes \psi_m$. For such $\eta$ and $\psi$,
$$\exp(-\varepsilon|\tau|^2)\,\eta^{p\dagger}\rho_W(\exp(\tau))\,\psi = \prod_{1\le i\le m}\exp\Big(-\frac{\varepsilon|\tau|^2}{m}\Big)\,\eta_i^{p\dagger}\rho_{\mathbb{C}^2}(\exp(\tau))\,\psi_i. \tag{8.20}$$
This implies that the Fourier transform of the function on the left-hand side of (8.20) is an $m$-fold convolution of Fourier transforms of the sort just computed for the representation $\mathbb{C}^2$. As the latter have $\varepsilon \to 0$ limit equal to zero in both the $C^0$ and $L^1$ topologies in the complement of the radius 2 ball in $su(2)$, it follows that the Fourier transform of the function on the left-hand side of (8.20) limits to zero as $\varepsilon \to 0$ in both the $C^0$ and $L^1$ topologies in the complement of the radius $2(m + 1)$ ball in $su(2)$.

References

[Ga] Gawedzki, K.: Lectures on conformal field theory and strings. In: Quantum Fields and Strings: A Course for Mathematicians, Providence, RI: Amer. Math. Soc., 1999
[GJ] Glimm, J., Jaffe, A.: Quantum Physics. New York: Springer-Verlag, 1981
[H] Hörmander, L.: Pseudo-differential operators and hypoelliptic equations. Proc. Symp. Pure Math. 10, Providence, RI: Amer. Math. Soc., 1966, pp. 138–183
[HP] Hille, E., Phillips, R.S.: Functional Analysis and Semigroups. AMS Colloq. Publ. 31, Providence, RI: Amer. Math. Soc., 1957
[Ka] Kato, T.: Perturbation Theory of Linear Operators. New York: Springer-Verlag, 1984
[OS] Osterwalder, K., Schrader, R.: Axioms for Euclidean Green’s functions I. Commun. Math. Phys. 42, 83–112 (1973)
[R] Royden, H.L.: Real analysis. Third edition. New York: Macmillan Publishing Company, 1988
[Se1] Segal, G.: Two-dimensional conformal field theories and modular functors. In: IXth International Congress on Mathematical Physics (Swansea 1988), Bristol: Adam Hilger Pub., 1989, pp. 22–37
[Se2] Segal, G.: The definition of conformal field theory. In: Topology, geometry and quantum field theory, London Math. Soc. Lecture Note Ser., 308, Cambridge: Cambridge Univ. Press, 2004, pp. 421–577
[St] Stone, M.H.: On one parameter unitary groups in Hilbert space. Ann. Math. 33, 643–648 (1932)
[SV] Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. New York: Springer-Verlag, 1979
[T] Taubes, C.H.: Constructions of measures and quantum field theories on mapping spaces. J. Diff. Geom. 70, 23–57 (2005)
Communicated by A. Connes
Commun. Math. Phys. 267, 65–92 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0049-6
Communications in
Mathematical Physics
Quantum States on Harmonic Lattices Norbert Schuch, J. Ignacio Cirac, Michael M. Wolf Max-Planck-Institut für Quantenoptik, Hans-Kopfermann-Str. 1, 85748 Garching, Germany. E-mail:
[email protected] Received: 5 October 2005 / Accepted: 23 March 2006 Published online: 15 July 2006 – © Springer-Verlag 2006
Abstract: We investigate bosonic Gaussian quantum states on an infinite cubic lattice in arbitrary spatial dimensions. We derive general properties of such states as ground states of quadratic Hamiltonians for both critical and non-critical cases. Tight analytic relations between the decay of the interaction and the correlation functions are proven and the dependence of the correlation length on band gap and effective mass is derived. We show that properties of critical ground states depend on the gap of the point-symmetrized rather than on that of the original Hamiltonian. For critical systems with polynomially decaying interactions logarithmic deviations from polynomially decaying correlation functions are found. 1. Introduction The importance of bosonic Gaussian states arises from two facts. First, they provide a very good description for accessible states of a large variety of physical systems. In fact, every ground and thermal state of a quadratic bosonic Hamiltonian is Gaussian and remains so under quadratic time evolutions. In this way quadratic approximations naturally lead to Gaussian states. Hence, they are ubiquitous in quantum optics as well as in the description of vibrational modes in solid states, ion traps or nanomechanical oscillators. The second point for the relevance of Gaussian states is that they admit a powerful phase space description which enables us to solve quantum many-body problems which are otherwise (e.g., for spin systems) hardly tractable. In particular, the phase space dimension, and with it the complexity of many tasks, scales linearly rather than exponentially in the number of involved subsystems. For this reason quadratic Hamiltonians and the corresponding Gaussian states also play a paradigmatic role as they may serve as an exactly solvable toy model from which insight into other quantum systems may be gained. 
Exploiting the symplectic tools of the phase space description, exact solutions have been found for various problems in quantum information theory as well as in quantum
N. Schuch, J. I. Cirac, M. M. Wolf
statistical mechanics. In fact, many recent works form a bridge between these two fields as they address entanglement questions for asymptotically large lattices of quadratically coupled harmonic oscillators: the entropic area law [1–3] has been investigated as well as entanglement statics [4–6], dynamics [7–9] and frustration [10, 11]. In the present paper we analytically derive general properties of ground states of translationally invariant quadratic Hamiltonians on a cubic lattice. We start by giving an outlook and a non-technical summary of the main results. The results on the asymptotic scaling of ground state correlations are summarized in Table 1. We note that related investigations of correlation functions were recently carried out in [12, 13] for finite dimensional spin systems and in [1, 14] for generic harmonic lattices with non-critical finite range interactions. Quadratic Hamiltonians. In Sect. 2, we start by introducing some basic results on quadratic Hamiltonians together with the used notation. Translationally invariant systems. In Sect. 3, we show first that every pure translational invariant Gaussian state is point symmetric. This implies that the spectral gap of the symmetrized rather than the original Hamiltonian determines the characteristic properties of the ground state. We provide a general formula for the latter and express its covariance matrix in terms of a product of the inverse of the Fourier transformed spectral function and the Hamiltonian matrix. Non-critical systems. Section 4 shows that if the Hamiltonian is gapped, then the correlations decay according to the interaction: a (super) polynomial decay of the interaction leads to the same (super) polynomial decay for the correlations, and (following Ref. [1]) finite range interactions lead to exponentially decaying correlations. Correlation length and gap. Section 5 gives an explicit formula for the correlation length for gapped 1D-Hamiltonians with finite range interactions. 
Table 1. Summary of the bounds derived in the paper on the asymptotic scaling of ground state correlations, depending on the scaling of the interaction (left column). Here $n$ is the distance between two points (harmonic oscillators) on a cubic lattice of dimension $d$. $O$ denotes upper bounds, $O^*$ tight upper bounds, and $\Theta$ the exact asymptotics. The table shows the results for generic interactions; special cases are discussed in the text.

interaction | non-critical | critical
local | $O^*(e^{-n/\xi})$; $d = 1$: $\xi \sim 1/\sqrt{m^*\Delta}$ | $d = 1$: $O^*(1/n^2)$; $d > 1$: $O(\log n / n^{d+1})$
$O(n^{-\infty})$ | $O(n^{-\infty})$ | same as for local interactions
$O(1/n^{\alpha})$, $\alpha > 2d+1$ | $O(1/n^{\nu-d})$, $\alpha > \nu \in \mathbb{N}$ | same as for local interactions
$c/n^{\alpha}$, $d = 1$ | $\alpha \ge 2$: $\Theta(1/n^{\alpha})$ | $\alpha = 3$: $\Theta(1/n^2)$ for $c > 0$, $\Theta(\sqrt{\log n}/n^2)$ for $c < 0$; $\alpha > 3$: $\Theta(1/n^2)$

The correlation length $\xi$ is expressed in terms of the dominating zero of the complex spectral function, which close
Quantum States on Harmonic Lattices
to a critical point is in turn determined by the spectral gap $\Delta$ and the effective mass $m^*$ at the band gap via $\xi \sim (m^*\Delta)^{-1/2}$. When the change in the Hamiltonian is given by a global scaling of the interactions this proves the folk theorem $\xi \sim 1/\Delta$.

Critical systems. Section 6 shows that for generic $d$-dimensional critical systems the correlations decay as $1/n^{d+1}$, where $n$ is the distance between two points on the lattice. Whereas for sufficiently fast decreasing interactions in $d = 1$ the asymptotic bound is exactly polynomial, it contains an additional logarithmic correction for $d \ge 2$. Similarly, for $d = 1$ a logarithmic deviation is found if the interaction decays exactly like $-1/n^3$.

2. Quadratic Hamiltonians and their Ground States

Consider a system of $N$ bosonic modes which are characterized by $N$ pairs of canonical operators $(Q_1, P_1, \dots, Q_N, P_N) =: R$. The canonical commutation relations (CCR) are governed by the symplectic matrix $\sigma$ via
\[ [R_k, R_l] = i\sigma_{kl}, \qquad \sigma = \bigoplus_{n=1}^{N} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \]
and the system may be equivalently described in terms of bosonic creation and annihilation operators $a_l = (Q_l + iP_l)/\sqrt{2}$. Quadratic Hamiltonians are of the form
\[ \mathcal{H} = \frac{1}{2} \sum_{kl} H_{kl} R_k R_l, \]
where the Hamiltonian matrix $H$ is real and positive semidefinite due to the Hermiticity and lower semi-boundedness of the Hamiltonian $\mathcal{H}$. Without loss of generality we neglect linear and constant terms since they can easily be incorporated by a displacement of the canonical operators and a change of the energy offset.

Before we discuss the general case we mention some important special instances of quadratic Hamiltonians: a well studied 1D example of this class is the case of nearest neighbor interactions in the position operators of harmonic oscillators on a chain with periodic boundary conditions
\[ \mathcal{H}_\kappa = \frac{1}{2} \sum_{i=1}^{N} \left( Q_i^2 + P_i^2 - \kappa\, Q_i Q_{i+1} \right), \qquad \kappa \in [-1, 1]. \tag{1} \]
This kind of spring-like interaction was studied in the context of information transfer [7], entanglement statics [4–6] and entanglement dynamics [9]. Moreover, it can be considered as the discretization of a massive bosonic continuum theory given by the Klein-Gordon Hamiltonian
\[ \mathcal{H}_{KG} = \frac{1}{2} \int_{-L/2}^{L/2} \left[ \dot\phi(x)^2 + \big(\nabla\phi(x)\big)^2 + m^2 \phi(x)^2 \right] dx, \]
where the coupling $\kappa$ is related to the mass $m$ by $\kappa^{-1} = 1 + \frac{1}{2}\left(\frac{mL}{N}\right)^2$ [5]. Other finite range quadratic Hamiltonians appear as limiting cases of finite range spin Hamiltonians via the Holstein–Primakoff approximation [15]. In this way the $xy$-spin model with transverse magnetic field can for instance be mapped onto a quadratic bosonic Hamiltonian in the limit of strong polarization where $a \simeq (\sigma_x + i\sigma_y)/2$. Longer range interactions appear naturally for instance in 1D systems of trapped ions. These can either be implemented as
Coulomb crystals in Paul traps or in arrays of ion microtraps. When expanding around the equilibrium positions, the interaction between two ions at positions $i$ and $j \ne i$ is, in harmonic approximation, of the form $c\,Q_i Q_j / |i-j|^3$, where $c > 0$ ($c < 0$) if $Q_i$, $Q_j$ are position operators in radial (axial) direction [16].

Let us now return to the general case and briefly recall the normal mode decomposition [17]: every Hamiltonian matrix can be brought to a diagonal normal form by a congruence transformation with a symplectic matrix $S \in Sp(2N, \mathbb{R}) = \{S \mid S\sigma S^T = \sigma\}$:$^1$
\[ S H S^T = \bigoplus_{i=1}^{I} \begin{pmatrix} \varepsilon_i & 0 \\ 0 & \varepsilon_i \end{pmatrix} \oplus \bigoplus_{j=1}^{J} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \varepsilon_i > 0, \tag{2} \]
where the symplectic eigenvalues $\varepsilon_i$ are the square roots of the duplicate nonzero eigenvalues of $\sigma H \sigma^T H$. The diagonalizing symplectic transformation $S$ has a unitary representation $U_S$ on Hilbert space which transforms the Hamiltonian according to
\[ U_S \mathcal{H} U_S^\dagger = \frac{1}{2} \sum_{i=1}^{I} \left( Q_i^2 + P_i^2 \right) \varepsilon_i + \frac{1}{2} \sum_{j=1}^{J} P_j^2 = \sum_{i=1}^{I} \left( a_i^\dagger a_i + \tfrac{1}{2} \right) \varepsilon_i + \frac{1}{2} \sum_{j=1}^{J} P_j^2. \tag{3} \]
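Hence, by Eq. (3) the ground state energy $E_0$ and the energy gap $\Delta$ can easily be expressed in terms of the symplectic eigenvalues of the Hamiltonian matrix:
\[ E_0 = \frac{1}{2} \sum_{i=1}^{I} \varepsilon_i, \qquad \Delta = \begin{cases} \min_i \varepsilon_i, & J = 0, \\ 0, & J > 0. \end{cases} \tag{4} \]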
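As a numerical illustration of Eqs. (2) and (4), the following minimal sketch computes the symplectic eigenvalues of the chain Hamiltonian of Eq. (1) from $\sigma H \sigma^T H$ and compares them with the closed form $\sqrt{1 - \kappa\cos(2\pi k/N)}$ that follows from diagonalizing the circulant coupling matrix. The block ordering $R = (Q_1,\dots,Q_N,P_1,\dots,P_N)$, the helper name `symplectic_spectrum`, and the values of `N` and `kappa` are choices made for this example only, not part of the original text.

```python
import numpy as np

def symplectic_spectrum(H, sigma):
    """Symplectic eigenvalues of H: square roots of the (doubly degenerate)
    nonzero eigenvalues of sigma H sigma^T H (hypothetical helper)."""
    eig = np.linalg.eigvals(sigma @ H @ sigma.T @ H).real
    eig = np.sort(eig[eig > 1e-12])
    return np.sqrt(eig[::2])  # keep one representative of each degenerate pair

N, kappa = 8, 0.5  # illustrative values

# Block ordering R = (Q_1..Q_N, P_1..P_N): H = V (+) 1 for the chain of Eq. (1),
# with V_ii = 1 and V_{i,i+1} = V_{i+1,i} = -kappa/2 (periodic boundary).
shift = np.roll(np.eye(N), 1, axis=1)
V = np.eye(N) - 0.5 * kappa * (shift + shift.T)
Z = np.zeros((N, N))
H = np.block([[V, Z], [Z, np.eye(N)]])
sigma = np.block([[Z, np.eye(N)], [-np.eye(N), Z]])

eps = symplectic_spectrum(H, sigma)
E0 = 0.5 * eps.sum()   # ground state energy, Eq. (4)
gap = eps.min()        # energy gap (J = 0 here)

# Closed form from the Fourier transform of the circulant matrix V
eps_exact = np.sort(np.sqrt(1 - kappa * np.cos(2 * np.pi * np.arange(N) / N)))
print(E0, gap)
```

For this chain the gap obtained numerically agrees with $\sqrt{1-|\kappa|}$, so it closes as $|\kappa| \to 1$.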
The case of a vanishing energy gap $\Delta = 0$ is called critical and the respective ground states are often qualitatively different from those of non-critical Hamiltonians. For the Hamiltonian $\mathcal{H}_\kappa$, Eq. (1), this happens in the strong coupling limit $|\kappa| = 1 - \Delta^2 \to 1$, and in the case of 1D Coulomb crystals a vanishing energy gap in the radial modes can be considered as the origin of a structural phase transition where the linear alignment of the ions becomes unstable and changes to a zig-zag configuration [18–20]. Needless to say, these phase transitions appear as well in higher dimensions and for various different configurations [21].

Ground and thermal states of quadratic Hamiltonians are Gaussian states, i.e., states having a Gaussian Wigner distribution in phase space. In the mathematical physics literature they are known as bosonic quasi-free states [22, 23]. These states are completely characterized by their first moments $d_k = \mathrm{tr}[\rho R_k]$ (which are w.l.o.g. set to zero in our case) and their covariance matrix (CM)
\[ \gamma_{kl} = \mathrm{tr}\left[ \rho \left\{ R_k - d_k, R_l - d_l \right\}_+ \right], \tag{5} \]
where $\{\cdot,\cdot\}_+$ is the anticommutator. The CM satisfies $\gamma \ge i\sigma$, which expresses Heisenberg's uncertainty relation and is equivalent to the positivity of the corresponding density operator $\rho \ge 0$. In order to find the ground state of a quadratic Hamiltonian, observe that
\[ \frac{1}{2} \sum_i \varepsilon_i \overset{(4)}{=} E_0 = \inf_\rho \mathrm{tr}[\rho \mathcal{H}] \overset{(5)}{=} \frac{1}{4} \inf_\gamma \mathrm{tr}[\gamma H]. \tag{6} \]
1 Note that we disregard systems where the Hamiltonian contains irrelevant normal modes.
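By virtue of Eqs. (2,3) the infimum is attained for the ground state covariance matrix
\[ \gamma = \lim_{s \to \infty} S^T \left[ \bigoplus_{i=1}^{I} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \oplus \bigoplus_{j=1}^{J} \begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix} \right] S, \tag{7} \]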
which reduces to $\gamma = S^T S$ in the non-critical case. Note that the ground state is unique as long as $H$ does not contain irrelevant normal modes [which we have neglected from the very beginning in Eq. (2)].

In many cases it is convenient to change the order of the canonical operators such that $R = (Q_1, \dots, Q_N, P_1, \dots, P_N)$. Then the covariance matrix as well as the Hamiltonian matrix can be written in block form
\[ H = \begin{pmatrix} H_Q & H_{QP} \\ H_{QP}^T & H_P \end{pmatrix}. \]
In this representation a quadratic Hamiltonian is particle number preserving iff $H_Q = H_P$ and $H_{QP} = -H_{QP}^T$, that is, the Hamiltonian contains only terms of the kind $a_i^\dagger a_j + a_j^\dagger a_i$. In quantum optics terms of the form $a_i^\dagger a_j^\dagger$, which are not number preserving, are neglected within the framework of the rotating wave approximation. The resulting Hamiltonians have particularly simple ground states:

Theorem 1a. The ground state of any particle number preserving Hamiltonian is the vacuum with $\gamma = \mathbb{1}$, and the corresponding ground state energy is given by $E_0 = \frac{1}{4} \mathrm{tr}\, H$.

Proof. Number preserving Hamiltonians are most easily expressed in terms of creation and annihilation operators. For this reason we change to the respective complex representation via the transformation
\[ H \mapsto \Omega H \Omega^T = \begin{pmatrix} 0 & X \\ \bar{X} & 0 \end{pmatrix}, \qquad \Omega = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbb{1} & -i\mathbb{1} \\ \mathbb{1} & i\mathbb{1} \end{pmatrix}. \]
In this basis $H$ is transformed to normal form via a block diagonal unitary transformation $U \oplus \bar{U}$ which in turn corresponds to an element of the orthogonal subgroup of the symplectic group $Sp(2N, \mathbb{R}) \cap SO(2N) \simeq U(N)$ [24]. Hence, the diagonalizing $S$ in Eqs. (2,7) is orthogonal and since $J = 0$ due to particle number conservation, we have $\gamma = S^T S = \mathbb{1}$. $E_0$ then follows immediately from Eq. (6).

Another important class of quadratic Hamiltonians for which the ground state CM takes on a particularly simple form corresponds to the case $H_{QP} = 0$ where there is no coupling between the momentum and position operators:

Theorem 1b.
For a quadratic Hamiltonian with Hamiltonian matrix $H = H_Q \oplus H_P$ the ground state energy and the ground state CM are given by
\[ E_0 = \frac{1}{2} \mathrm{tr} \sqrt{H_Q H_P}, \qquad \gamma = X \oplus X^{-1}, \qquad X = H_Q^{-1/2} \left( H_Q^{1/2} H_P H_Q^{1/2} \right)^{1/2} H_Q^{-1/2}. \tag{8} \]

Proof. Since $\sigma H \sigma^T H = H_P H_Q \oplus H_Q H_P$, the symplectic eigenvalues of $H$ are given by the eigenvalues of $\sqrt{H_Q H_P}$ and thus $E_0 = \frac{1}{2} \mathrm{tr} \sqrt{H_Q H_P}$. Moreover, by the uniqueness of the ground state and the fact that $E_0 = \frac{1}{4} \mathrm{tr}[\gamma H]$ with $\gamma$ from Eq. (8) we know that $\gamma$ is the ground state CM (as it is an admissible pure state CM by construction).
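The statement of Theorem 1b can be checked numerically. The sketch below is only an illustration under stated assumptions: it draws random positive definite blocks $H_Q$, $H_P$ (so that all inverses exist), builds $X$ as in Eq. (8), and compares the three expressions for $E_0$; it also evaluates $(\gamma\sigma)^2 + \mathbb{1}$, which vanishes for a pure state CM. The helper names `sqrtm_psd` and `random_spd` and the matrix size are hypothetical choices for this example.

```python
import numpy as np

def sqrtm_psd(A):
    """Square root of a symmetric positive definite matrix via eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * np.sqrt(w)) @ U.T

rng = np.random.default_rng(0)
N = 5  # illustrative number of modes

def random_spd(n):
    """Random symmetric positive definite matrix (assumption of this example)."""
    B = rng.standard_normal((n, n))
    return B @ B.T + n * np.eye(n)

HQ, HP = random_spd(N), random_spd(N)
HQ_half = sqrtm_psd(HQ)
HQ_inv_half = np.linalg.inv(HQ_half)
M_half = sqrtm_psd(HQ_half @ HP @ HQ_half)   # (H_Q^{1/2} H_P H_Q^{1/2})^{1/2}

X = HQ_inv_half @ M_half @ HQ_inv_half       # Eq. (8)
Z = np.zeros((N, N))
gamma = np.block([[X, Z], [Z, np.linalg.inv(X)]])
H = np.block([[HQ, Z], [Z, HP]])
sigma = np.block([[Z, np.eye(N)], [-np.eye(N), Z]])

E0_formula = 0.5 * np.trace(M_half)                         # (1/2) tr sqrt(H_Q H_P)
E0_from_gamma = 0.25 * np.trace(gamma @ H)                  # (1/4) tr[gamma H], Eq. (6)
E0_symplectic = 0.5 * np.sqrt(np.linalg.eigvals(HQ @ HP).real).sum()
purity_residual = np.linalg.norm(gamma @ sigma @ gamma @ sigma + np.eye(2 * N))
print(E0_formula, E0_from_gamma, E0_symplectic, purity_residual)
```

The trace of $\sqrt{H_Q H_P}$ is evaluated through the similar symmetric matrix $H_Q^{1/2} H_P H_Q^{1/2}$, which has the same eigenvalues as $H_Q H_P$.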
Finally we give a general formula for the ground state CM in cases where the blocks in the Hamiltonian matrix can be diagonalized simultaneously. This is of particular importance as it applies to all translational invariant Hamiltonians discussed in the following sections.

Theorem 1c. Consider a quadratic Hamiltonian for which the blocks $H_Q$, $H_P$, $H_{QP}$ of the Hamiltonian matrix can be diagonalized simultaneously and in addition $H_{QP} = H_{QP}^T$. Then with
\[ \hat{\mathcal{E}} = \left( H_Q H_P - H_{QP}^2 \right)^{1/2} \tag{9} \]
we have
\[ E_0 = \frac{1}{2} \mathrm{tr}[\hat{\mathcal{E}}], \qquad \Delta = \lambda_{\min}(\hat{\mathcal{E}}), \qquad \gamma = (\hat{\mathcal{E}} \oplus \hat{\mathcal{E}})^{-1} \sigma H \sigma^T. \tag{10} \]

Proof. Since $\sigma H \sigma^T H = \hat{\mathcal{E}}^2 \oplus \hat{\mathcal{E}}^2$ we have indeed $E_0 = \frac{1}{2} \mathrm{tr}[\hat{\mathcal{E}}]$ and $\Delta = \lambda_{\min}(\hat{\mathcal{E}})$. Positivity $\gamma \ge 0$ is implied by $H \ge 0$ such that we can safely talk about the symplectic eigenvalues of $\gamma$. The latter are, however, all equal to one due to $(\gamma\sigma)^2 = -\mathbb{1}$ so that $\gamma$ is an admissible pure state CM. Moreover it belongs to the ground state since $\frac{1}{4} \mathrm{tr}[H\gamma] = E_0$.

3. Translationally Invariant Systems

Let us now turn towards translationally invariant systems. We consider cubic lattices in $d$ dimensions with periodic boundary conditions. For simplicity we assume that the size of the lattice is $N^d$. The system is again characterized by a Hamiltonian matrix $H_{kl}$, where the indices $k$, $l$, which correspond to two points (harmonic oscillators) on the lattice, are now $d$-component vectors in $\mathbb{Z}_N^d$. Translational invariance is then reflected by the fact that any matrix element $A_{kl}$, $A \in \{H_Q, H_P, H_{QP}\}$, depends only on the relative position $k - l$ of the two points on the lattice, and we will therefore often write $A_{k-l} = A_{kl}$. Note that due to the periodic boundary conditions $k - l$ is understood modulo $N$ in each component. Matrices of this type are called circulant, and they are all simultaneously diagonalized via the Fourier transform
\[ F_{\alpha\beta} = \frac{1}{\sqrt{N}}\, e^{\frac{2\pi i}{N} \alpha\beta}, \qquad \alpha, \beta \in \mathbb{Z}_N, \]
such that
\[ \hat{A} := F^{\otimes d} A F^{\dagger \otimes d} = \mathrm{diag}\left( \sum_{n \in \mathbb{Z}_N^d} A_n\, e^{-\frac{2\pi i}{N} m \cdot n} \right)_m, \]
where $m \cdot n$ is the usual scalar product in $\mathbb{Z}_N^d$. It follows immediately that all circulant matrices mutually commute.

In the following, we will show that we can without loss of generality restrict ourselves to point-symmetric Hamiltonians, i.e., those for which $H_{QP} = H_{QP}^T$ (which means that $\mathcal{H}$ contains only pairs $Q_k P_l + Q_l P_k$). For dimension $d = 1$ this is often called reflection symmetry.

Theorem 2. Any translationally invariant pure state CM is point symmetric.

Proof. For the proof, we use that any pure state covariance matrix can be written as
\[ \gamma = \begin{pmatrix} \Gamma_Q & \Gamma_{QP} \\ \Gamma_{QP}^T & \Gamma_P \end{pmatrix} = \begin{pmatrix} X & XY \\ YX & X^{-1} + YXY \end{pmatrix}, \]
where $X \ge 0$ and $Y$ is real and symmetric [25]. From translational invariance, it follows that all blocks and thus $X$ and $Y$ have to be circulant and therefore commute. Hence, $\Gamma_{QP} = XY = YX = \Gamma_{QP}^T$, i.e., $\gamma$ is point symmetric.

Let $\mathcal{P}: \mathbb{Z}_N^d \to \mathbb{Z}_N^d$ be the reflection on the lattice and define the symmetrization operation $\mathcal{S}(A) = \frac{1}{2}(A + \mathcal{P} A \mathcal{P})$ such that by the above theorem $\mathcal{S}(\gamma) = \gamma$ for every translational invariant pure state CM. Then due to the cyclicity of the trace we have for any translational invariant Hamiltonian
\[ \inf_\gamma \mathrm{tr}[H\gamma] = \inf_\gamma \mathrm{tr}[\mathcal{S}(H)\gamma]. \]
Hence, the point-symmetrized Hamiltonian $\mathcal{S}(H)$, which differs from $H$ by the off-diagonal block $\mathcal{S}(H_{QP}) = \frac{1}{2}(H_{QP} + H_{QP}^T)$, has both the same ground state energy and the same ground state as $H$. Together with Theorem 1c this leads us to the following:

Theorem 3. Consider any translationally invariant quadratic Hamiltonian. With $\hat{\mathcal{E}} = \left[ H_Q H_P - \frac{1}{4}(H_{QP} + H_{QP}^T)^2 \right]^{1/2}$ the ground state CM and the corresponding ground state energy are given by
\[ E_0 = \frac{1}{2} \mathrm{tr}[\hat{\mathcal{E}}], \qquad \gamma = (\hat{\mathcal{E}} \oplus \hat{\mathcal{E}})^{-1} \sigma\, \mathcal{S}(H)\, \sigma^T. \tag{11} \]

It is important to note that the energy gaps of $H$ and $\mathcal{S}(H)$ will in general be different. In particular $H$ might be gapless while $\mathcal{S}(H)$ is gapped. However, as we will see in the following sections, the properties of $\gamma$ depend on the gap $\Delta = \lambda_{\min}(\hat{\mathcal{E}})$ of the symmetrized Hamiltonian rather than on that of the original $H$. For this reason we will in the following for simplicity assume $H_{QP} = H_{QP}^T$. By Thm. 3 all results can then also be applied to the general case without point symmetry if one only keeps in mind that $\Delta$ is the gap corresponding to $\mathcal{S}(H)$.

Note that the eigenvalues of $\hat{\mathcal{E}}$ are the symplectic eigenvalues of $\mathcal{S}(H)$, i.e., $\mathcal{E} = F^{\otimes d} \hat{\mathcal{E}} F^{\dagger \otimes d}$ is the excitation spectrum of the Hamiltonian. This is the reason for the notation where $\mathcal{E}$ resides in Fourier space and $\hat{\mathcal{E}}$ in real space, which differs from the normal usage of the hat throughout the paper.

Correlation functions. According to Eqs. (9,10,11) we have to compute the entries of functions of matrices in order to learn about the entries of the covariance matrix. This is most conveniently done by a double Fourier transformation, where one uses that $\widehat{f(M)} = f(\hat{M})$, and we find
\[ [f(M)]_{nm} = \frac{1}{N^d} \sum_{r,s} e^{-\frac{2\pi i}{N} n \cdot r}\, [f(\hat{M})]_{rs}\, e^{\frac{2\pi i}{N} s \cdot m}. \tag{12} \]
As we consider translationally invariant systems, $M$ is circulant and thus $\hat{M}$ is diagonal. We define the function
\[ \hat{M}(\phi) = \sum_{n \in \mathbb{Z}_N^d} M_n\, e^{-i n \cdot \phi} \tag{13} \]
such that $\hat{M}(2\pi r/N) = \hat{M}_{r,r}$. As $f(M)$ is solely determined by its first row, we can write
\[ [f(M)]_n = \frac{1}{N^d} \sum_{r \in \mathbb{Z}_N^d} e^{2\pi i\, n \cdot r/N}\, f\big(\hat{M}(2\pi r/N)\big). \tag{14} \]
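Equation (14) says that the first row of a function of a circulant matrix is an inverse discrete Fourier transform of the function applied to the symbol. The following sketch illustrates this in $d = 1$ for $f(x) = x^{-1/2}$ and the coupling matrix $V$ of the chain in Eq. (1), and compares the result with a dense eigendecomposition. The helper name `circulant_function_row` and the parameter values are assumptions of this example; the helper additionally assumes a symmetric first row, so that the symbol is real.

```python
import numpy as np

def circulant_function_row(first_row, f):
    """First row of f(M) for a symmetric circulant M, following Eq. (14).

    np.fft.fft evaluates the symbol M^(2*pi*r/N) = sum_n M_n exp(-2*pi*i*n*r/N);
    np.fft.ifft supplies the factor 1/N and the phases exp(+2*pi*i*n*r/N).
    """
    m_hat = np.fft.fft(first_row).real   # real because first_row[n] == first_row[-n]
    return np.fft.ifft(f(m_hat)).real

N, kappa = 16, 0.6                        # illustrative values
row = np.zeros(N)
row[0] = 1.0
row[1] = row[-1] = -0.5 * kappa           # V_0 = 1, V_{+1} = V_{-1} = -kappa/2

def inv_sqrt(x):
    return x ** -0.5

fast = circulant_function_row(row, inv_sqrt)

# Dense reference: build V explicitly and apply f through its eigendecomposition
idx = np.arange(N)
V = row[(idx[:, None] - idx[None, :]) % N]
w, U = np.linalg.eigh(V)
dense = (U * inv_sqrt(w)) @ U.T
print(np.max(np.abs(fast - dense[0])))
```

The Fourier route costs $O(N \log N)$ operations instead of the $O(N^3)$ of a dense diagonalization, which is why it is convenient for studying the decay of entries at large distance.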
In the following we will use the index $n \in \mathbb{Z}^d$ for the relative position of two points on the lattice. Their distance will be measured either by the $l_1$, $l_2$ or $l_\infty$ norm. Since we are considering finite dimensional lattices these are all equivalent for our purpose and we will simply write $\|n\|$. In the thermodynamic limit $N \to \infty$, the sum in Eq. (14) converges to the integral
\[ [f(M)]_n = \frac{1}{(2\pi)^d} \int_{T^d} d\phi\, f\big(\hat{M}(\phi)\big)\, e^{i n \cdot \phi} \quad \text{with} \quad \hat{M}(\phi) = \sum_{n \in \mathbb{Z}^d} M_n\, e^{-i n \cdot \phi}, \tag{15} \]
where $T^d$ is the $d$-dimensional torus, i.e., $[0, 2\pi]^d$ with periodic boundary conditions. The convergence holds as soon as $\sum_n |M_n| < \infty$ [which holds e.g. for $M_n = O(\|n\|^{-\alpha})$ with some $\alpha > d$] and $f$ is continuous on an open interval which contains the range of $\hat{M}$.

From the definition (15) of $\hat{M}$, it follows that $\hat{M} \in C^k(T^d)$ (the $k$ times continuously differentiable functions on $T^d$) whenever the entries $M_n$ decay at least as fast as $\|n\|^{-\alpha}$ for some $\alpha > k + d$, since then the sum of the derivatives converges uniformly. Particularly, if the entries of $M$ decay faster than any polynomial, then $\hat{M} \in C^\infty(T^d)$.

In the following the most important function of the type $f \circ \hat{M}$ will be the spectral function
\[ \mathcal{E}(\phi) = \left( \sum_{n \in \mathbb{Z}^d} e^{-i n \cdot \phi} \left( [H_Q H_P]_n - [H_{QP}^2]_n \right) \right)^{1/2}. \tag{16} \]

Asymptotic notation. As the main issue of this paper is the asymptotic scaling of correlations, we use the Landau symbols $o$, $O$, and $\Theta$, as well as the symbol $O^*$ for tight bounds:
– $f(x) = o(g(x))$ means $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 0$, i.e., $f$ vanishes strictly faster than $g$ for $x \to \infty$;
– $f(x) = O(g(x))$, if $\limsup_{x \to \infty} \left| \frac{f(x)}{g(x)} \right|$ is finite, i.e., $f$ vanishes at least as fast as $g$;
– $f(x) = \Theta(g(x))$, if $f(x) = O(g(x))$ and $g(x) = O(f(x))$ (i.e., exact asymptotics);
– $f(x) = O^*(g(x))$, if $f(x) = O(g(x))$ but $f(x) \ne o(g(x))$, i.e., $g$ is a tight bound on $f$.$^2$

If $f$ is taken from a set (e.g., those functions consistent with the assumptions of a theorem) we will write $f = O^*(g)$ if $g$ is a tight bound for at least one $f$ (i.e., the best possible universal bound under the given assumptions). If talking about Hamiltonians, the scaling is meant to hold for all blocks, e.g., if the interaction vanishes as $O(\|n\|^{-\alpha})$ for $n \to \infty$, this holds for all the blocks $H_Q$, $H_P$, and $H_{QP} = H_{PQ}^T$. The same holds for covariance matrices in the non-critical case. By the shorthand notation $f(n) = o(\|n\|^{-\infty})$, we mean that $f(n) = o(\|n\|^{-\alpha})$ $\forall \alpha > 0$. Note finally that the Landau symbols are also used in (Taylor) expansions around a point $x_0$ where the considered limit is $x \to x_0$ rather than $x \to \infty$.

4. Non-Critical Systems

In this section, we analyze the ground state correlations of non-critical systems, i.e., those which exhibit an energy gap $\Delta > 0$ between the ground and the first excited state.

2 In order to see the difference to $\Theta$, take an $f(x) = g(x)$ for even $x$, $f(x) = 0$ for odd $x$, $x \in \mathbb{N}$. Although $f$ does not bound $g$, so that $f(x) \ne \Theta(g(x))$, the bound $g$ is certainly tight. A situation like this is met, e.g., in Theorem 5, where the correlations oscillate within an exponentially decaying envelope.
Simply speaking, we will show that the decay of correlations reflects the decay of the interaction. While local (super-polynomially decaying) interactions imply exponentially (super-polynomially) decaying correlations, a polynomial decay of interactions will lead to the same polynomial law for the correlations.

According to Theorem 3, we will consider a translationally invariant system with a point-symmetric Hamiltonian ($H_{QP} = H_{QP}^T$). Following (10,11), we have to determine the entries of $(\hat{\mathcal{E}}^{-1} \oplus \hat{\mathcal{E}}^{-1}) \sigma H \sigma^T$, with $\hat{\mathcal{E}} = (H_Q H_P - H_{QP}^2)^{1/2}$ as in Eq. (9). In Lemma 1 we will first show that it is possible to consider the two contributions independently, and as the asymptotics of $\sigma H \sigma^T$ is known, we only have to care about the entries of $\hat{\mathcal{E}}^{-1}$, i.e., we have to determine the asymptotic behavior of the integral
\[ (\hat{\mathcal{E}}^{-1})_n = \frac{1}{(2\pi)^d} \int_{T^d} d\phi\, \mathcal{E}^{-1}(\phi)\, e^{i n \cdot \phi} \quad \text{where} \quad \mathcal{E} = \left( \hat{H}_Q \hat{H}_P - \hat{H}_{QP}^2 \right)^{1/2}. \]

Lemma 1. Given two asymptotic circulant matrices $A$, $B$ in $d$ dimensions with polynomially decaying entries, $A_n = O(\|n\|^{-\alpha})$, $B_n = O(\|n\|^{-\beta})$, $\alpha, \beta > d$. Then $(AB)_n = O^*(\|n\|^{-\mu})$, $\mu := \min\{\alpha, \beta\}$.

Proof. With $Q_\eta(n) := \min\{1, \|n\|^{-\eta}\}$, we know that $|A_n| = O(Q_\alpha)$ and $|B_n| = O(Q_\beta)$, and
\[ |(AB)_n| = \Big| \sum_j A_{0,j} B_{j,n} \Big| \le \sum_j |A_j|\,|B_{n-j}| = O\Big( \sum_j Q_\alpha(j)\, Q_\beta(n-j) \Big). \tag{17} \]
We consider only one half-space $\|j\| \le \|n - j\|$, where we bound $Q_\beta(n - j) \le Q_\beta(n/2)$. As $Q_\alpha(j)$ is summable, the contribution of this half-space is $O(Q_\beta(n/2))$. The other half-space gives the same result with $\alpha$ and $\beta$ interchanged, which proves the bound, while tightness follows by taking all $A_n$, $B_n$ positive.

We now determine the asymptotics of $(\hat{\mathcal{E}}^{-1})_n$ for different types of Hamiltonians.

Lemma 2. For non-critical systems with rapidly decaying interactions, i.e., as $o(\|n\|^{-\infty})$, the entries of $\hat{\mathcal{E}}^{-1}$ decay rapidly as well. That is, $\Delta > 0 \Rightarrow (\hat{\mathcal{E}}^{-1})_n = o(\|n\|^{-\infty})$.

Proof. As the interactions decay as $o(\|n\|^{-\infty})$, $\hat{H}_\bullet \in C^\infty(T^d)$ ($\bullet = Q, P, PQ$), and thus $\mathcal{E}^2 = \hat{H}_Q \hat{H}_P - \hat{H}_{QP}^2 \in C^\infty(T^d)$. Since the system is gapped, i.e., $\mathcal{E} \ge \Delta > 0$, it follows that also $g := \mathcal{E}^{-1} \in C^\infty(T^d)$. For the proof, we need to bound
\[ (\hat{\mathcal{E}}^{-1})_n = \frac{1}{(2\pi)^d} \int_{T^d} d\phi\, g(\phi)\, e^{i n \cdot \phi} \]
by $\|n\|^{-\kappa}$ for all $\kappa \in \mathbb{N}$. First, let us have a look at the one-dimensional case. By integration by parts, we get
\[ (\hat{\mathcal{E}}^{-1})_n = \frac{1}{2\pi} \left[ \frac{1}{in}\, g(\phi)\, e^{in\phi} \right]_{\phi = -\pi}^{\pi} - \frac{1}{2\pi i n} \int_{-\pi}^{\pi} d\phi\, g'(\phi)\, e^{in\phi}, \]
where the first part vanishes due to the periodicity of $g$. As $g \in C^\infty(T^1)$, the integration by parts can be iterated arbitrarily often and all the brackets vanish, such that after $\kappa$ iterations,
\[ (\hat{\mathcal{E}}^{-1})_n = \frac{1}{2\pi (in)^\kappa} \int_{-\pi}^{\pi} d\phi\, g^{(\kappa)}(\phi)\, e^{in\phi}. \]
As $g^{(\kappa)}(\phi)$ is continuous, the integral can be bounded by $\int |g^{(\kappa)}(\phi)|\, d\phi =: C_\kappa < \infty$, such that finally
\[ |(\hat{\mathcal{E}}^{-1})_n| \le \frac{C_\kappa}{n^\kappa} \qquad \forall \kappa \in \mathbb{N}, \]
which completes the proof of the one-dimensional case.

The extension to higher dimensions is straightforward. For a given $n = (n_1, \dots, n_d)$, integrate by parts with respect to the $\phi_i$ for which $|n_i| = \|n\|_\infty$; we assume $i = 1$ without loss of generality. As $g(\cdot, \phi_2, \dots, \phi_d) \in C^\infty(S^1)$, the same arguments as in the 1D case show
\[ |(\hat{\mathcal{E}}^{-1})_n| \le \frac{1}{(2\pi)^d\, |n_1|^\kappa} \int_{T^d} \left| \frac{\partial^\kappa}{\partial \phi_1^\kappa}\, g(\phi) \right| d\phi = \frac{C_\kappa}{\|n\|_\infty^\kappa}. \]

For systems with local interactions, a stronger version of Lemma 2 can be obtained:

Lemma 3. For a system with finite range interaction, the entries of $\hat{\mathcal{E}}^{-1}$ decay exponentially.

This has been proven in [1] for Hamiltonians of the type $H = V \oplus \mathbb{1}$, exploiting a result on functions of banded matrices [26]. Following Eqs. (9,11) the generalization to arbitrary translational invariant Hamiltonians is straightforward by replacing $V$ with $H_Q H_P - H_{QP}^2$. In fact, it has been shown recently that the result even extends to non translational invariant Hamiltonians of the form in Theorem 1b [14].

Finally, we consider systems with polynomially decaying interaction.

Lemma 4. For a 1D lattice with $H = V \oplus \mathbb{1} > 0$ and an exactly polynomially decaying interaction
\[ V_{ij} = \begin{cases} a, & i = j, \\ \dfrac{b}{|i-j|^\nu}, & i \ne j, \end{cases} \qquad 2 \le \nu \in \mathbb{N}, \]
$\hat{\mathcal{E}}^{-1}$ decays polynomially with the same exponent, $(\hat{\mathcal{E}}^{-1})_n = (V^{-1/2})_n = \Theta(|n|^{-\nu})$.

Hamiltonians of this type appear, e.g., for the vibrational degrees of freedom of ions in a linear trap, where $\nu = 3$.

Proof. We need to estimate $(\hat{\mathcal{E}}^{-1})_n \overset{(9)}{=} (V^{-1/2})_n = \frac{1}{2\pi} \int_0^{2\pi} \hat{V}^{-1/2}(\phi)\, e^{in\phi}\, d\phi$. Note that
\[ \hat{V}(\phi) = a + 2b \sum_{n=1}^{\infty} \frac{\cos(n\phi)}{n^\nu} = a + 2b\, \mathrm{Re}\, \mathrm{Li}_\nu(e^{i\phi}) > 0, \tag{18} \]
where $\mathrm{Li}_\nu(z) = \sum_{n \ge 1} z^n / n^\nu$ is the polylogarithm. The polynomial decay of coefficients implies $\hat{V} \in C^{\nu-2}(S^1)$, and as the system is non-critical, $\hat{V}^{-1/2} \in C^{\nu-2}(S^1)$.
As $\mathrm{Li}_\nu$ has an analytic continuation to $\mathbb{C} \setminus [1; \infty)$, $\hat{V} \in C^\infty((0; 2\pi))$ and thus $\hat{V}^{-1/2} \in C^\infty((0; 2\pi))$. We can therefore integrate by parts $\nu - 1$ times, and as all brackets vanish due to periodicity, we obtain
\[ (\hat{\mathcal{E}}^{-1})_n = \frac{1}{2\pi (in)^{\nu-1}} \int_0^{2\pi} \left[ \frac{d^{\nu-1}}{d\phi^{\nu-1}}\, \hat{V}^{-1/2}(\phi) \right] e^{in\phi}\, d\phi, \tag{19} \]
and
\[ \frac{d^{\nu-1}}{d\phi^{\nu-1}}\, \hat{V}^{-1/2}(\phi) = -\frac{\hat{V}^{(\nu-1)}(\phi)}{2 \hat{V}(\phi)^{3/2}} + \frac{3(\nu-2)\, \hat{V}^{(\nu-2)}(\phi)\, \hat{V}^{(1)}(\phi)}{4 \hat{V}(\phi)^{5/2}} + g(\phi). \tag{20} \]
Note that the second term only appears if $\nu \ge 3$, and $g$ only if $\nu \ge 4$. As $g(\phi) \in C^1(S^1)$, its Fourier coefficients vanish as $O(n^{-1})$, as can be shown by integrating by parts. The second term can be integrated by parts as well, the bracket vanishes due to continuity, and we remain with
\[ \frac{1}{in} \int_0^{2\pi} \left( \frac{3(\nu-2)\, \hat{V}^{(\nu-1)}(\phi)\, \hat{V}^{(1)}(\phi)}{4 \hat{V}(\phi)^{5/2}} + h(\phi) \right) e^{in\phi}\, d\phi, \]
with $h \in C(S^1)$. [For $\nu = 3$, a factor 2 appears as $(\hat{V}^{(1)})' = \hat{V}^{(\nu-1)}$.] As we will show later, $\hat{V}^{(\nu-1)}$ is absolutely integrable, hence the integral exists, and thus the Fourier coefficients of the second term in Eq. (20) vanish as $O(n^{-1})$ as well. Finally, it remains to bound
\[ \int_0^{2\pi} \frac{\hat{V}^{(\nu-1)}(\phi)}{2 \hat{V}(\phi)^{3/2}}\, e^{in\phi}\, d\phi. \tag{21} \]
As $\mathrm{Li}_\nu'(x) = \mathrm{Li}_{\nu-1}(x)/x$, it follows from Eq. (18) that
\[ \hat{V}^{(\nu-1)}(\phi) = 2b\, \mathrm{Re}\left[ i^{\nu-1}\, \mathrm{Li}_1(e^{i\phi}) \right] = 2b\, \mathrm{Re}\left[ -i^{\nu-1} \log(1 - e^{i\phi}) \right], \]
where the last step is from the definition of $\mathrm{Li}_1$. We now distinguish two cases. First, assume that $\nu$ is even. Then,
\[ \hat{V}^{(\nu-1)}(\phi) \propto \mathrm{Im} \log(1 - e^{i\phi}) = \arg(1 - e^{i\phi}) = \frac{\phi - \pi}{2} \]
on $(0; 2\pi)$, hence the integrand in Eq. (21) is bounded and has a bounded derivative, and by integration by parts, the integral Eq. (21) is $O(n^{-1})$. In case $\nu$ is odd we have
\[ \hat{V}^{(\nu-1)}(\phi) \propto \mathrm{Re} \log(1 - e^{i\phi}) = \log\left| 1 - e^{i\phi} \right| = \log(2 \sin(\phi/2)) \]
on $(0; 2\pi)$. With $h(\phi) := \hat{V}^{-3/2}(\phi)/2$, the integrand in Eq. (21) can be written as
\[ \hat{V}^{(\nu-1)}(\phi)\, h(\phi) \propto \log(2 \sin(\phi/2))\, h(0) + \log(2 \sin(\phi/2))\, [h(\phi) - h(0)]. \]
The first term gives a contribution proportional to
\[ \frac{1}{2\pi} \int_0^{2\pi} \log(2 \sin(\phi/2)) \cos(n\phi)\, d\phi = -\frac{1}{2n} \tag{22} \]
since $\log(2 \sin(\phi/2)) = -\sum_{n \ge 1} \cos(n\phi)/n$. For the second term, note that $h \in C^1(S^1)$ for $\nu \ge 3$ and thus $h(\phi) - h(0) = h'(0)\phi + o(\phi)$ by Taylor's theorem. Therefore, the log singularity vanishes, and we can once more integrate by parts. The derivative is
\[ \frac{1}{2} \cot(\phi/2)\, [h(\phi) - h(0)] + \log(2 \sin(\phi/2))\, h'(\phi). \]
In the left part, the $1/\phi$ singularity of $\cot(\phi/2)$ is cancelled out by $h(\phi) - h(0) = O(\phi)$, and the second part is integrable as $h' \in C(S^1)$, so that the contribution of the integral (21) is $O(n^{-1})$ as well.

In order to show that $n^{-\nu}$ is also a lower bound on $(\hat{V}^{-1/2})_n$, one has to analyze the asymptotics more carefully. Using the Riemann-Lebesgue lemma (which says that the Fourier coefficients of absolutely integrable functions are $o(1)$) one finds that all terms in (19) vanish as $o(1/n^\nu)$, except for the integral (21). Now for even $\nu$, (21) can be integrated by parts, and while the brackets give a $\Theta(n^{-\nu})$ term, the remaining integral is $o(n^{-\nu})$, which proves that $(\hat{V}^{-1/2})_n = \Theta(n^{-\nu})$. For odd $\nu$, on the other hand, the first part of (22) gives exactly a polynomial decay, while the contribution from the second part vanishes as $o(n^{-\nu})$, which proves $(\hat{V}^{-1/2})_n = \Theta(n^{-\nu})$ for odd $\nu$ as well.

Generalizations of Lemma 4. The preceding lemma can be extended to non-integer exponents $\alpha \notin \mathbb{N}$: if $V_n \propto n^{-\alpha}$, $n \ne 0$, then $(\hat{\mathcal{E}}^{-1})_n = O(n^{-\alpha})$. For the proof, define $\alpha = \nu + \varepsilon$, $\nu \in \mathbb{N}$, $0 < \varepsilon < 1$. Then $\hat{V} \in C^{\nu-1}(S^1)$, $\hat{V} \in C^\infty((0; 2\pi))$, and one can integrate by parts $\nu$ times, where all brackets vanish. What remains is to bound the Fourier integral of the $\nu$-th derivative of $\hat{V}^{-1/2}$ by $n^{-\varepsilon}$. An upper bound can be established by noting that $|\hat{V}^{(\nu)}(\phi)| \le |\mathrm{Li}_\varepsilon(e^{i\phi})| = O(\phi^{\varepsilon-1})$ and $|\hat{V}^{(\nu+1)}(\phi)| = O(\phi^{\varepsilon-2})$. It follows that all contributions in the Fourier integral except the singularity from $\hat{V}^{(\nu)}$ lead to $o(1/n)$ contributions, as can be shown by another integration by parts. In order to bound the Fourier integral of the $O(\phi^{\varepsilon-1})$ term, split the Fourier integral at $\frac{1}{n}$. The integral over $[0; \frac{1}{n}]$ can be directly bounded by $n^{-\varepsilon}$, while for $[\frac{1}{n}; 1]$, an equivalent bound can be established after integration by parts, using $\hat{V}^{(\nu+1)} = O(\phi^{\varepsilon-2})$. This method is discussed in more detail in the proof of Theorem 10, following Eq. (44). The proof that $n^{-\varepsilon}$ is also a lower bound to $(\hat{\mathcal{E}}^{-1})_n$ is more involved. From a series expansion of $\hat{V}$ and its derivatives, it can be seen that it suffices to bound the sine and cosine Fourier coefficients of $\phi^{\varepsilon-1}$ from below. As in the proof of Theorem 9, this is accomplished by splitting the integral into single oscillations of the sine or cosine and bounding each part by the derivative of $\phi^{\varepsilon-1}$.

For polynomially bounded interactions $V_n = O(n^{-\alpha})$, $\alpha > 1$, not very much can be said without further knowledge. With $\nu < \alpha$, $\nu \in \mathbb{N}$ the largest integer strictly smaller than $\alpha$, we know that $\hat{V} \in C^{\nu-1}(S^1)$. Thus, one can integrate by parts $\nu - 1$ times, the brackets vanish, and the remaining Fourier integral is $o(1)$ using the Riemann-Lebesgue lemma. It follows that $(\hat{\mathcal{E}}^{-1})_n = o(n^{-(\nu-1)})$. In contrast to the case of an exactly polynomial decay, this can be extended to higher spatial dimensions $d > 1$ by replacing $\nu - 1$ with $\nu - d$, which yields $(\hat{\mathcal{E}}^{-1})_n = o(\|n\|^{-(\nu-d)})$.

We now use the preceding lemmas about the entries of $\hat{\mathcal{E}}^{-1}$ (Lemmas 2–4) to derive corresponding results on the correlations of ground states of non-critical systems.

Theorem 4. For systems with $\Delta > 0$, the following holds:
(i) If the Hamiltonian $H$ has finite range, the ground state correlations decay exponentially.
(ii) If $H$ decays as $o(\|n\|^{-\infty})$, the ground state correlations decay as $o(\|n\|^{-\infty})$ as well.
(iii) For a 1D system with $H = V \oplus \mathbb{1}$, where $V$ decays with a power law $|n|^{-\nu}$, $\nu \ge 2$, the ground state correlations decay as $\Theta(|n|^{-\nu})$.

Proof. In all cases, we have to find the scaling of the ground state $\gamma$ which is the product $\gamma = (\hat{\mathcal{E}}^{-1} \oplus \hat{\mathcal{E}}^{-1}) \sigma H \sigma^T$, Eq. (10). Part (i) follows directly from Lemma 3, as multiplying with a finite-range $\sigma H \sigma^T$ does not change the exponential decay, while (ii) follows from Lemma 2, the $o(\|n\|^{-\infty})$ decay of $\sigma H \sigma^T$, and Lemma 1. To show (iii), note that for $H = V \oplus \mathbb{1}$ the ground state is $\gamma = V^{-1/2} \oplus V^{1/2}$, and from Lemma 4, $O(n^{-\nu})$ follows. For $\hat{V}^{-1/2}$, Lemma 4 also includes that the bound is exact, while for $\hat{V}^{1/2}$, it can be shown by transferring the proof of the lemma one-to-one.

Note that a simple converse of Theorem 4 always holds: for each translationally invariant pure state CM $\gamma$, there exists a Hamiltonian $H$ with the same asymptotic behavior as $\gamma$ such that $\gamma$ is the ground state of $H$. This can be trivially seen by choosing $H = \sigma \gamma \sigma^T$.

5. Correlation Length and Gap

In this section, we consider one-dimensional chains with local gapped Hamiltonians. We compute the correlation length for these systems and use this result to derive a relation between correlation length and gap.

Theorem 5. Consider a non-critical 1D chain with a local Hamiltonian. Define the complex extension of the spectral function $\mathcal{E}(\phi) = \left[ \sum_{n=0}^{L} c_n \cos(n\phi) \right]^{1/2}$ in Eq. (16) as
\[ g(z) := \sum_{n=0}^{L} c_n\, \frac{z^n + z^{-n}}{2}, \]
such that $g(e^{i\phi}) = \mathcal{E}^2(\phi) \overset{(9)}{=} \hat{H}_Q(\phi) \hat{H}_P(\phi) - \hat{H}_{QP}^2(\phi)$, and let $\tilde{z}$ be the zero of $g$ with the largest magnitude smaller than one. Then, the correlation length
\[ \xi = -\frac{1}{\log |\tilde{z}|} \]
determines the asymptotic scaling of the correlations, which is given by
– $O^*(e^{-n/\xi}/\sqrt{n})$, if $\tilde{z}$ is a zero of order one,
– $O^*(e^{-n/\xi})$, if $\tilde{z}$ is a zero of even order,
– $o(e^{-n/(\xi+\varepsilon)})$ for all $\varepsilon > 0$, if $\tilde{z}$ is a zero of odd order larger than one.

For the nearest neighbor interaction Hamiltonian $\mathcal{H}_\kappa$ from Eq. (1) one has for instance $\mathcal{E}(\phi) = \sqrt{1 - \kappa \cos(\phi)}$, so that $g$ has simple zeros at $z_0 = (1 \pm \sqrt{1 - \kappa^2})/\kappa$. Therefore $\tilde{z} = (1 - \sqrt{1 - \kappa^2})/\kappa$, and the correlations decay as $\Theta(e^{-n/\xi}/\sqrt{n})$, where $\xi = -1/\log |\tilde{z}|$.
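The nearest neighbor example can be checked numerically. The sketch below, under assumptions chosen for illustration only (a finite ring of `N = 256` sites standing in for the infinite chain, and `kappa = 0.8`), evaluates the entries of $\hat{\mathcal{E}}^{-1}$ by the discrete version of the Fourier integral and rescales them by $\sqrt{n}\, e^{n/\xi}$; if the stated $e^{-n/\xi}/\sqrt{n}$ law holds, the rescaled values should be nearly constant in $n$.

```python
import numpy as np

kappa, N = 0.8, 256   # illustrative coupling and ring size

# Zero of g(z) = 1 - kappa (z + 1/z)/2 inside the unit circle, and correlation length
z_tilde = (1 - np.sqrt(1 - kappa**2)) / kappa
xi = -1.0 / np.log(abs(z_tilde))

# Spectral function of H_kappa sampled on the ring, and the entries of E^-1
phi = 2 * np.pi * np.arange(N) / N
E = np.sqrt(1 - kappa * np.cos(phi))
corr = np.fft.ifft(1.0 / E).real        # (E^-1)_n for n = 0 .. N-1

# Remove the predicted exp(-n/xi)/sqrt(n) envelope
n = np.arange(10, 31)
rescaled = corr[n] * np.sqrt(n) * np.exp(n / xi)
spread = rescaled.std() / rescaled.mean()
print(xi, spread)
```

The range of $n$ is kept well below $N/2$ so that the periodic images on the finite ring and floating point round-off do not affect the comparison.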
Proof. For local Hamiltonians, the correlations decay as the matrix elements of $\hat E^{-1}$ [Eq. (10)]. By Fourier transforming (9), $E(\phi) = \sqrt{g(e^{i\phi})}$, with $g(e^{i\phi}) = \hat H_Q(\phi)\hat H_P(\phi) - \hat H_{QP}^2(\phi) = \sum_{n=0}^{L} c_n \cos(n\phi)$ an even trigonometric polynomial (we assume $c_L \neq 0$ without loss of generality), and $\min(g(e^{i\phi})) = \Delta^2$. We have to compute
$$(\hat E^{-1})_n = \frac{1}{2\pi}\int_0^{2\pi} \frac{1}{E(\phi)}\,e^{in\phi}\,d\phi = \frac{1}{2\pi i}\int_{S^1} \frac{z^{n-1}}{\sqrt{g(z)}}\,dz, \tag{23}$$
where $S^1$ is the unit circle. The function $g(z)$ has a pole of order $L$ at zero and $2L$ zeros altogether. Since $\min(g(e^{i\phi})) = \Delta^2 > 0$, $g$ has no zeros on the unit circle. As $g(z) = g(1/z)$, the zeros come in pairs, and $L$ of them are inside the unit circle. Also, the conjugate of a zero is a zero as well. From each zero with odd multiplicity emerges a branch cut of $\sqrt{g(z)}$. We arrange all the branch cuts inside the unit circle such that they go straight to the middle where they annihilate with another cut. In case $L$ is odd, the last cut is annihilated by the singularity of $\sqrt{g(z)}$ at 0. If two zeros lie on a line, one cut curves slightly. A sample arrangement is shown in Fig. 1.

Following Cauchy's theorem, the integral can be decomposed into integrals along the different branch cuts and around the residues of $1/\sqrt{g}$, and one has to estimate the contributions from the different types of zeros of $g$. The simplest case is given by zeros $z_0$ with even multiplicity $2m$. In that case, define $h(z) := g(z)/(z - z_0)^{2m}$ which has no zero around $z_0$. The contribution from $z_0$ to the correlations is then given by the residue at $z_0$ and is
$$\frac{1}{(m-1)!}\,\frac{d^{m-1}}{dz^{m-1}}\,\frac{z^{n-1}}{\sqrt{h(z)}}\,\bigg|_{z=z_0} \propto z_0^{\,n-(m-1)}$$
for $n - (m-1) > 0$, i.e., it scales as $|z_0|^n$. Note that for $z_0 \notin \mathbb{R}$, the imaginary parts originating from $z_0$ and its conjugate $\bar z_0$ exactly cancel out, but the scaling is still given by $|z_0|^n = e^{n\log|z_0|}$, i.e., $\xi = -1/\log|z_0|$ is the corresponding correlation length.

Fig. 1. Sample arrangement of branch cuts and poles of $\sqrt{g}$ inside the unit circle. From each odd order zero of $g$, a branch cut emerges. All cuts go to 0 where they cancel with another cut. In case their number is odd, there is an additional branch point at 0 cancelling the last cut. In case two zeros are on a line to the origin, the cuts are chosen curved. The integral of $\sqrt{g}$ around the unit circle is equal to the integral around the cuts, plus integrals around the residues which originate from the even order zeros of $g$.

If $z_0$ is a simple zero of $g(z)$, we have to integrate around the branch cut. Assume first that the cut goes to zero in a straight line, and consider a contour with distance $\varepsilon$ to the slit. Both the contribution from the $\varepsilon$ region around zero and the $\varepsilon$ semicircle at $z_0$ vanish as $\varepsilon \to 0$, and the total integral is therefore given by twice the integral along the cut,
$$\frac{1}{\pi i}\int_0^{z_0} \frac{z^{n-1}}{\sqrt{z - z_0}\,\sqrt{h(z)}}\,dz,$$
where again $h(z) = g(z)/(z - z_0)$. Intuitively, for growing $n$ the part of the integral close to $z_0$ becomes more and more dominating, i.e., the integral is well approximated by the modified integral where $h(z)$ has been replaced by $h(z_0)$. After rotating it onto the real axis, this integral reads, up to a phase,
$$\frac{1}{\pi\sqrt{|h(z_0)|}}\int_0^{|z_0|} \frac{r^{n-1}}{\sqrt{|z_0| - r}}\,dr = \frac{|z_0|^{n-1/2}\,\Gamma(n)}{\sqrt{\pi}\,\sqrt{|h(z_0)|}\,\Gamma(n + \tfrac12)}, \tag{24}$$
which for large $n$ is
$$\frac{1}{\sqrt{\pi\,|z_0 h(z_0)|}}\,\frac{|z_0|^n}{\sqrt{n}} + O\!\left(\frac{|z_0|^n}{n^{3/2}}\right). \tag{25}$$
In order to justify the approximation $h(z) \approx h(z_0)$, consider the difference of the two respective integrals. It is bounded by
$$\int_0^{z_0} \frac{|z|^{n-1}}{\sqrt{|z - z_0|}}\;\underbrace{\left|\frac{1}{\sqrt{h(z)}} - \frac{1}{\sqrt{h(z_0)}}\right|}_{(*)}\,|dz|.$$
On $[z_0/2, z_0]$, $h(z)$ is analytic and has no zeros, thus $|h(z)^{-1/2} - h(z_0)^{-1/2}| < C|z - z_0|$, where $C$ is the maximum of the derivative of $h(z)^{-1/2}$ on $[z_0/2, z_0]$. On $[0, z_0/2]$, the same bound is obtained by choosing $C$ the supremum of $|h(z)^{-1/2} - h(z_0)^{-1/2}|/|z_0/2|$ on $[0, z_0/2]$. Together, $(*) \le C|z - z_0|$, and the above integral is bounded by
$$C\int_0^{|z_0|} r^{n-1}\sqrt{|z_0| - r}\,dr = C\,\frac{\sqrt{\pi}\,|z_0|^{n+1/2}\,\Gamma(n)}{2\,\Gamma(n + \tfrac32)} = O\!\left(\frac{|z_0|^n}{n^{3/2}}\right),$$
i.e., it vanishes by $1/n$ faster than the asymptotics derived in Eq. (25), which justifies fixing $h(z)$ at $h(z_0)$.

From Eq. (25), it follows that the scaling is $e^{-n/\xi}/\sqrt{n}$, where the correlation length is again $\xi = -1/\log|z_0|$. The same scaling behavior can be shown to hold for appropriately chosen curved branch cuts from $z_0$ to 0 by relating the curved to a straight integral.

The situation gets more complicated if zeros of odd order $> 1$ appear. In order to get an estimate which holds in all scenarios, we apply Cauchy's theorem to contract the unit
circle in the integration (23) to a circle of radius $r > |z_0|$, where $z_0$ is the largest zero inside the unit circle. Then, the integrand can be bounded by $C_r r^{n-1}$ (where $C_r < \infty$ is the supremum of $1/\sqrt{g}$ on the circle), and this gives a bound $2\pi C_r r^{n-1}$ for the integral. This holds for all $r > |z_0|$, i.e., the correlations decay faster than $e^{n\log r}$ for all $r > |z_0|$. This does not imply that the correlations decay as $e^{n\log|z_0|}$, but it is still reasonable to define $-1/\log|z_0|$ as the correlation length.

Theorem 6. Consider a 1D chain together with a family of Hamiltonians $H(\Delta)$ with gap $\Delta > 0$, where $H(\Delta)$ is continuous for $\Delta \to 0$ in the sense that all entries of $H$ converge. Then, the ground state correlations scale exponentially, and for sufficiently small $\Delta$ the correlation length is
$$\xi \sim \frac{1}{\sqrt{m^*\Delta}}.$$
Here, $m^* = \left(\dfrac{d^2E(\phi)}{d\phi^2}\Big|_{\phi=\phi_\Delta}\right)^{-1}$ is the effective mass at the band gap.
For the discretized Klein-Gordon field (1), for example, we have $\Delta = \sqrt{1 - |\kappa|}$, $m^* = 2\sqrt{1 - |\kappa|}/|\kappa|$, and for small $\Delta$ (corresponding to $|\kappa|$ close to 1), one obtains $\xi \sim \sqrt{|\kappa|/2(1 - |\kappa|)} \sim 1/(\sqrt{2}\,\Delta)$. Hence, the $\xi \propto 1/\Delta$ law holds if the coupling is increased relative to the on-site energy (in which case $m^* \propto \Delta$). More generally, if we expand the spectral function [Eq. (16)] around the band gap$^3$ we are generically led to the dispersion relation $E(k) \sim \sqrt{\Delta^2 + v^2k^2}$ ($k \equiv \phi$). By the definition of the effective mass and Theorem 6 this leads exactly to the folk theorem
$$\xi \sim \frac{v}{\Delta}. \tag{26}$$

$^3$ This makes the natural assumption that the minimum under the square root is quadratic. In fact, if it is of higher order, then $m^* = \infty$ and thus $\xi = 0$, which is consistent with the findings of the following section. An example of such a behavior is given by so called 'quadratic interactions' [2] for which $H = V \oplus \mathbb{1}$, where $V$ is the square of a banded matrix.

Proof. According to Theorem 5, what remains to be done is to determine the position of the largest zero $\tilde z$ of $g$ in the unit circle. Due to the restriction on $H(\Delta)$, the coefficients of the polynomial $g(z)z^L$ and thus also the zeros of $g$ continuously depend on $\Delta$, i.e., for sufficiently small $\Delta$, the zero closest to the unit circle is the one closest to the gap. In order to determine the position of this zero, we will expand $g$ around the gap. We only discuss the generic case where the gap appears only for one angle $\phi_0$, $g(e^{i\phi_0}) = \Delta^2$. In the case of multiple occurrences of the gap in the spectrum, one will pick the gap which gives the zero closest to the unit circle, i.e., the largest correlation length. Furthermore, we assume $\phi_0 = 0$ without loss of generality. Otherwise, one considers $g(ze^{-i\phi_0})$ instead of $g(z)$, which on the unit circle coincides with the (rotated) spectrum.

The knowledge on $g =: u + iv$ (with $u, v : \mathbb{C} \to \mathbb{R}$) which will be used in the proof is
$$u(1) = \Delta^2,\quad u_\phi(1) = 0,\quad u_{\phi\phi}(1) = \frac{2\Delta}{m^*} > 0,\qquad v(1) = 0,\quad v_\phi(1) = 0,\quad v_{\phi\phi}(1) = 0, \tag{27}$$
where the subscripts denote the partial derivative with respect to the respective subscript (in Euclidean coordinates $z \equiv x + iy$, in polar coordinates $z \equiv re^{i\phi}$). Note that $z = 1$ is the point where the gap appears, and that $g(e^{i\phi}) = E(\phi)^2$ is real. Therefore, the derivatives of the imaginary part $v$ along the circle vanish, while the derivatives of the real part $u$ are found to be $u(1) = E(0)^2 = \Delta^2$, $u_\phi(1) = 2E(0)E'(0) = 0$, and $u_{\phi\phi}(1) = 2E'(0)^2 + 2E(0)E''(0) = 2\Delta/m^*$, where $m^* = 1/E''(0)$ is the effective mass at the band gap. We need to exploit the relation between Euclidean and polar coordinates,
$$g_x(1) = g_r(1);\quad g_y(1) = g_\phi(1),\qquad g_{xx}(1) = g_{rr}(1);\quad g_{yy}(1) = g_{\phi\phi}(1) + g_r(1),$$
and the Cauchy–Riemann equations $u_x = v_y$, $u_y = -v_x$, and $g_{xx} + g_{yy} = 0$, which together with the information (27) lead to
$$u(1) = \Delta^2;\quad v(1) = 0;\quad u_x(1) = u_y(1) = v_x(1) = v_y(1) = 0;$$
$$u_{xx}(1) = -\frac{2\Delta}{m^*};\quad u_{yy}(1) = \frac{2\Delta}{m^*};\quad v_{xx}(1) = 0;\quad v_{yy}(1) = 0.$$
Note that it is not possible to derive information about the mixed second derivatives using only the information (27). However, as long as $v_{xy}$ does not vanish at 1, $v$ will only stay zero in direction of $x$ or $y$, but not diagonally. Since $\Delta^2 > 0$ and $2\Delta/m^* > 0$, the closest zero is, to second order, approximately located along the $x$ axis. By intersecting with the parabola $\Delta^2 - \frac{\Delta}{m^*}(x - 1)^2$, one finds that the zero is located at $x_0 \approx 1 - \sqrt{m^*\Delta}$. For small $\Delta$, the correlations thus decay with correlation length $\xi \approx -1/\log(1 - \sqrt{m^*\Delta}) \approx 1/\sqrt{m^*\Delta}$.

6. Critical Systems

In the following, we discuss critical systems, i.e., systems without an energy gap, $\Delta = 0$.$^4$ In that case, the Hamiltonian will get singular and some entries of the ground state covariance matrix will diverge, which leads to difficulties and ambiguities in the description of the asymptotic behavior of correlations. We will therefore restrict to Hamiltonians of the type $H = V \oplus \mathbb{1}$, for which the ground state CM is $\gamma = V^{-1/2} \oplus V^{1/2}$. While the $Q$ part diverges, the entries of the $P$-block stay finite. Following Thm. 1(b) the extension to interactions of the form $H = H_Q \oplus H_P$ is straightforward. In order to compute the correlations we have to determine the asymptotics of $V^{1/2}$, i.e.,
$$(V^{1/2})_n = \frac{1}{(2\pi)^d}\int_{T^d} \sqrt{\hat V(\phi)}\,e^{in\phi}\,d\phi.$$

$^4$ Note that there are different meanings of the notion criticality referring either to a vanishing energy gap or to an algebraic decay of correlations. In this section we discuss in which cases these two properties are equivalent.
We will restrict to the cases in which the excitation spectrum $E = \sqrt{\hat V}$ has only a finite number of zeros, i.e., finitely many points of criticality. In addition, we will also consider the special case in which the Hamiltonian exhibits a tensor product structure.

We proceed as follows. First, we consider one-dimensional critical chains and show that the correlations decay typically as $O(n^{-2})$ and characterize those special cases where the correlations decay more rapidly. The practically important case of exactly cubic decaying interactions will be investigated in greater detail. Depending on the sign of the interaction this case will lead to a logarithmic deviation from the $n^{-2}$ behavior. Then, we turn to higher dimensional systems and show that generically the correlations decay as $n^{-(d+1)}\log n$, where $d$ is the spatial dimension of the lattice.

6.1. One dimension. First, we prove a lemma which shows that although taking the square root of a smooth function destroys its differentiability, the derivatives will stay bounded.

Lemma 5. Let $f \in C^m([-1; 1])$, $f(x) \ge 0$ with the only zero at $x = 0$, and let $2\nu \le m$ be the order of the minimum at $x = 0$, i.e., $f^{(k)}(0) = 0\ \forall k < 2\nu$, $f^{(2\nu)}(0) > 0$. Define $g(x) := \sqrt{f(x)}$. Then, the following holds:
– For odd $\nu$, $g \in C^{\nu-1}([-1; 1])$, and $g \in C^{m-\nu}([-1; 0])$, $g \in C^{m-\nu}([0; 1])$, i.e., the first $m - \nu$ derivatives (for $x \neq 0$) are bounded.
– For even $\nu$, $g \in C^{m-\nu}([-1; 1])$.

Proof. Using the Taylor expansion $f(x) = \sum_{k=2\nu}^{m} c_k x^k + \rho(x)$, $\rho^{(k)}(x) = o(x^{m-k})$ for $k \le m$, we express $g$ as $g(x) = (\operatorname{sgn} x)^\nu x^\nu r(x)$ with
$$r(x) = \sqrt{\sum_{k=2\nu}^{m} c_k x^{k-2\nu} + \frac{\rho(x)}{x^{2\nu}}},$$
where we used that $(\operatorname{sgn} x)^\nu x^\nu = \sqrt{x^{2\nu}}$. Let us now consider the derivatives of $r(x)$. While the sum leads to a $O(1)$ contribution, the $k$-th derivative of the remainder behaves as $o(1)/x^{2\nu-m+k}$. Together, this leads to
$$r^{(k)}(x) = O(1) \quad\text{for } 2\nu - m + k \le 0, \qquad r^{(k)}(x) = o(1)/x^{2\nu-m+k} \quad\text{for } 2\nu - m + k \ge 1.$$
Now consider the $k$-th derivative of $g(x)$ for $x \neq 0$,
$$g^{(k)}(x) = (\operatorname{sgn} x)^\nu \sum_{l=0}^{k} \binom{k}{l}\,\underbrace{\left(\frac{d^l}{dx^l}x^\nu\right) r^{(k-l)}(x)}_{s_l}.$$
Assume first $k \le \nu$. Then, $s_l \propto O(1)x^{\nu-l}$ for $2\nu - m + k - l \le 0$, and $s_l \propto o(1)x^{m-\nu-k}$ for $2\nu - m + k - l \ge 1$, and as $m \ge 2\nu$, it follows that $g^{(k)} = O(x)$ for $k < \nu$, which cancels the discontinuity originating from $\operatorname{sgn} x$. For $k = \nu$, on the contrary, $s_k = O(1)$, and $\operatorname{sgn} x$ introduces a discontinuity on $g^{(k)}$, yet it remains bounded and piecewise differentiable on $[-1; 0]$ and $[0; 1]$. The first non-bounded $s_l$ is found as soon as $m - \nu - k = -1$, and $g \in C^{m-\nu}([0; 1])$ directly follows. This also implies that for $m - \nu - k \ge 0$, $g(x)/(\operatorname{sgn} x)^\nu \in C^{m-\nu}([-1; 1])$, i.e., the discontinuity is only due to $(\operatorname{sgn} x)^\nu$. Since, however, this is only discontinuous for odd $\nu$, it follows that $g \in C^{m-\nu}([-1; 1])$ if $\nu$ is even.
Theorem 7. Consider a one-dimensional critical chain with Hamiltonian $H = V \oplus \mathbb{1}$, where $V_n = O(n^{-\alpha})$, $\alpha > 4$ and where $\hat V$ has a finite number of critical points which are all quadratic minima of $\hat V$. Then, $(\gamma_P)_n = O^*(n^{-2})$. For $V_n \propto n^{-\alpha}$, $\alpha > 3$ it even follows that $(\gamma_P)_n = \Theta(n^{-2})$.

Note that for $V_n \propto n^{-\alpha}$, the extrema of $\hat V$ are always quadratic.

Proof. We want to estimate
$$(V^{1/2})_n = \frac{1}{2\pi}\int_{S^1} g(\phi)\,e^{in\phi}\,d\phi, \tag{28}$$
where $g = \hat V^{1/2}$. Under both assumptions, $\hat V \in C^2(S^1)$, and all critical points are minima of order 2. It follows from Lemma 5 that $g$ is continuous with bounded derivative. Therefore, we can integrate by parts, the bracket vanishes, and we obtain
$$(V^{1/2})_n = -\frac{1}{2\pi i n}\int_0^{2\pi} g'(\phi)\,e^{in\phi}\,d\phi.$$
Now, split $S^1$ at the zeros of $g$ into closed intervals $I_j$, $\bigcup_j I_j = S^1$, and rewrite the above integral as a sum of integrals over $I_j$. As $g' \in C(I_j)$ (and differentiable on the inner of $I_j$), one can once more integrate by parts which yields
$$(V^{1/2})_n = -\frac{1}{2\pi(in)^2}\sum_j\left(\Big[g'(\phi)\,e^{in\phi}\Big]_{I_j} - \int_{I_j} g''(\phi)\,e^{in\phi}\,d\phi\right). \tag{29}$$
Neither of the terms will vanish, but since $g' \in C(I_j)$, the bracket is bounded. In case $V_n \in O(n^{-\alpha})$, $\alpha > 4$, we have $\hat V \in C^3(S^1)$, therefore $g''$ is bounded (Lemma 5), and the integrals vanish as $o(1)$. Unless the contributions of the brackets for the different $I_j$ cancel out, the $n^{-2}$ bound is tight, $(V^{1/2})_n = O^*(n^{-2})$. The tightness of the bound is also illustrated by the example which follows the proof.

For the case of an exactly polynomial decay, we additionally have to show that $g''$ is absolutely integrable for $3 < \alpha \le 4$. Then, the exactness of the bound holds because the bracket in Eq. (29) does not oscillate (the critical point is either at $\phi = 0$ or at $\phi = \pi$), and because the integral is $o(1)$ for $g'' \in L^1(S^1)$. In case the critical point is at $\phi = \pi$, the latter holds since $\hat V \in C^\infty((0; 2\pi))$ implies that $g''$ is bounded at $\pi$, and $\hat V \in C^2(S^1)$ that $g \in C^2((-\pi, \pi))$, which together proves that $g''$ is bounded on $S^1$. In case the critical point is at $\phi = 0$, the situation is more involved (and for $\alpha = 3$, a logarithmic correction appears, cf. Theorem 9). Since $\hat V^{(3)}(\phi) = -\operatorname{Im}\operatorname{Li}_{\alpha-3}(e^{i\phi}) = O(\phi^{\alpha-4})$, we have
$$\hat V''(\phi) = \hat V''(0) + O(\phi^{\alpha-3}),\qquad \hat V'(\phi) = \hat V''(0)\,\phi + O(\phi^{\alpha-2}),\qquad \hat V(\phi) = \tfrac12\hat V''(0)\,\phi^2 + O(\phi^{\alpha-1}).$$
With this information,
$$g''(\phi) = \frac{2\hat V(\phi)\hat V''(\phi) - \hat V'(\phi)^2}{4\hat V(\phi)^{3/2}} = O(\phi^{\alpha-4}),$$
which indeed proves that $g'' \in L^1(S^1)$, and thus $(V^{1/2})_n = \Theta(n^{-2})$.
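The $n^{-2}$ law of Theorem 7 can be illustrated numerically. The sketch below is an assumption-laden example, not taken from the paper: it uses a hypothetical critical chain with nearest and next-nearest neighbor couplings, $\hat V(\phi) = (1 - \cos\phi) + b\,(1 - \cos 2\phi)$ with $b > 0$, whose only critical point is a quadratic minimum at $\phi = 0$. Keeping only the boundary term of Eq. (29) and using that $g$ is even suggests the leading behavior $(V^{1/2})_n \approx -g'(0^+)/(\pi n^2)$ with $g'(0^+) = \sqrt{(1+4b)/2}$. The function names and the Simpson quadrature are likewise illustrative choices.

```python
import math


def sqrt_spectrum(phi, b):
    """g(phi) = sqrt(V(phi)) for V(phi) = (1 - cos phi) + b * (1 - cos 2 phi).

    Written as 2*sin(phi/2)^2 + 2*b*sin(phi)^2 to avoid cancellation near phi = 0.
    """
    return math.sqrt(2.0 * math.sin(phi / 2.0) ** 2 + 2.0 * b * math.sin(phi) ** 2)


def simpson(f, lower, upper, intervals):
    """Composite Simpson rule with an even number of intervals."""
    if intervals % 2:
        intervals += 1
    step = (upper - lower) / intervals
    total = f(lower) + f(upper)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * f(lower + k * step)
    return total * step / 3.0


def fourier_coefficient(n, b):
    """(V^{1/2})_n = (1/2pi) * integral over [0, 2pi] of g(phi) * cos(n*phi).

    The kink of g sits at the endpoints phi = 0 and 2*pi, so the integrand is
    smooth on the closed interval and Simpson's rule is accurate.
    """
    intervals = max(2000, 200 * n)
    integral = simpson(lambda phi: sqrt_spectrum(phi, b) * math.cos(n * phi),
                       0.0, 2.0 * math.pi, intervals)
    return integral / (2.0 * math.pi)


def leading_coefficient(b):
    """Predicted limit of n^2 * (V^{1/2})_n from the boundary term of Eq. (29)."""
    return -math.sqrt((1.0 + 4.0 * b) / 2.0) / math.pi


if __name__ == "__main__":
    b = 0.5
    for n in (10, 50, 100):
        print(n, n * n * fourier_coefficient(n, b), leading_coefficient(b))
```

Under these assumptions the rescaled coefficients $n^2 (V^{1/2})_n$ should settle at the predicted constant, with corrections of relative order $1/n^2$ because $g$ is smooth away from the critical point.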
As an example, consider again the discretized Klein-Gordon field of Eq. (1) which is critical for $\kappa = \pm 1$, corresponding to $\hat V(\phi) = 1 \mp \cos\phi$. The Fourier integral is solvable and yields
$$(\gamma_P)_n = -\frac{2\sqrt{2}}{\pi}\,\frac{(\operatorname{sgn}\kappa)^n}{4n^2 - 1} = \Theta(n^{-2}).$$

Generalizations of Theorem 7. Using Lemma 5, several generalizations for the 1D critical case can be found. In the following, we mention some of them. In all cases $H = V \oplus \mathbb{1}$ is critical.

Critical points of even order. If $V_n = o(n^{-\infty})$ and the critical points are minima of order $2\nu$, $\nu$ even, the correlations decay as $(\gamma_P)_n = o(n^{-\infty})$. This is the case, e.g., if $V = X^2$ with $X$ itself rapidly decaying.

Critical points of higher order. If $\hat V$ has critical points of order at least $2\nu$, $\nu$ odd, and $V_n = O(n^{-\alpha})$, $\alpha > 2\nu + 2$, then $(\gamma_P)_n = O(n^{-(\nu+1)})$.

Minima of different orders. If $\hat V$ has minima of different orders $2\nu_i$, in general the minimum with the lowest odd $\nu_i \equiv \nu_1$ will determine the asymptotics, $(\gamma_P)_n = O(n^{-(\nu_1+1)})$. As $\hat V \in C^{(2\max\{\nu_i\})}(S^1)$ is required anyway, the piecewise differentiability of $\hat V^{1/2}$ is guaranteed.

Weaker requirements on $V$. It is possible to ease the requirements imposed on $V$ in Theorem 7 to $V_n = O(n^{-\alpha})$, $\alpha > 3$ or $\hat V \in C^2(S^1)$, respectively. The price one has to pay is that one gets an additional log correction as in the multidimensional critical case, Theorem 10. The method to bound $g''$ is the same which is used there to derive (39).

The above proof does not cover the case of the relevant $1/n^3$ interaction, which for instance appears for the motional degrees of freedom of trapped ions. In the following, we separately discuss this case. It will turn out that the scaling will depend on the sign of the coupling: while a positive sign (corresponding to the radial degrees of freedom) again gives a $\Theta(1/n^2)$ scaling as before, for the negative sign (corresponding to the axial degree of freedom) one gets $\Theta(\sqrt{\log n}/n^2)$.

Theorem 8.
Consider a critical 1D chain with a $1/n^3$ coupling with positive sign, i.e., $H = V \oplus \mathbb{1}$, $V_n = c/n^3$, $V_0 = 3c\zeta(3)/2$, $c > 0$, with $\zeta$ the Riemann zeta function. Then, the ground state correlations scale as $(\gamma_P)_n = \Theta(1/n^2)$.

Proof. We take w.l.o.g. $c = 1/2$. For this sign of the coupling, the critical point is at $\pi$, $\hat V(\pi) = 0$. From the proof of Lemma 4, we know that $\hat V \in C^1(S^1)$, $\hat V \in C^\infty((0; 2\pi))$, and that $\hat V''(\phi) = \log(2\sin(\phi/2))$ on $(0; 2\pi)$. With $g := \hat V^{1/2}$, it follows from Lemma 5 that $g \in C(S^1)$, $g \in C^1([-\pi; \pi])$, and $g \in C^\infty((0; \pi])$, $g \in C^\infty([-\pi; 0))$. This means that all derivatives $g^{(k)}$, $k \ge 1$ can exhibit jumps at the critical point $\pi$ but they all remain bounded. In contrast, around $\phi = 0$, $g'$ is continuous but $g''$ has a log divergence. Thus, the Fourier integral
$$(V^{1/2})_n = \frac{1}{2\pi}\int_{S^1} g(\phi)\,e^{in\phi}\,d\phi$$
can be split at 0 and $\pi$, and then integrated by parts twice. The brackets of the first integration cancel out due to continuity of $g$, and, consistently with Eq. (29), one remains with
$$(V^{1/2})_n = -\frac{1}{\pi(in)^2}\left(\Big[g'(\phi)\cos(n\phi)\Big]_0^\pi - \int_0^\pi g''(\phi)\cos(n\phi)\,d\phi\right),$$
where we used the symmetry of $g$. One finds $[g'(\phi)\cos(n\phi)]_0^\pi = -\sqrt{\tfrac{\log 2}{2}}\,(-1)^n$, and since $g''$ is integrable, the integral is $o(1)$ due to the Riemann-Lebesgue lemma. Together, this proves $(\gamma_P)_n = \Theta(1/n^2)$.

Theorem 9. Consider a critical 1D chain with a $1/n^3$ coupling with negative sign, i.e., $H = V \oplus \mathbb{1}$, $V_n = -c/n^3$, $V_0 = 2c\zeta(3)$, $c > 0$, with $\zeta$ the Riemann zeta function. Then, the ground state correlations scale as $(\gamma_P)_n = \Theta(\sqrt{\log n}/n^2)$.

Proof. Again, take w.l.o.g. $c = 1/2$. For the negative sign of the interaction, the critical point is at $\phi = 0$. Since at this point $\hat V''$ diverges, Lemma 5 cannot be applied, and the situation gets more involved. As in the previous proof, we use that $\hat V \in C^1(S^1)$, $\hat V \in C^\infty((0; 2\pi))$, and thus $\hat V^{1/2} \in C(S^1)$, $\hat V^{1/2} \in C^\infty((0; 2\pi))$. Further, $\hat V''(\phi) = -\log(2\sin(\phi/2))$ on $(0; 2\pi)$, cf. the proof of Lemma 4, and with $\sin x = x(1 + O(x^2))$ we have $\hat V''(\phi) = -\log(\phi) + O(\phi^2)$ for $\phi \to 0$ (and similarly for $\phi \to 2\pi$), and therefore
$$\hat V'(\phi) = \phi(1 - \log\phi) + O(\phi^3),\qquad \hat V(\phi) = \tfrac14\phi^2(3 - 2\log\phi) + O(\phi^4). \tag{30}$$
As $\hat V^{1/2} \in C(S^1)$, we can integrate by parts one time,
$$(V^{1/2})_n = \frac{1}{2\pi}\int_{S^1} \hat V^{1/2}(\phi)\,e^{in\phi}\,d\phi = -\frac{1}{\pi n}\int_0^\pi g'(\phi)\sin(n\phi)\,d\phi, \tag{31}$$
where we exploited the symmetry of $\hat V$, and with $g := \hat V^{1/2}$. Then, from (30),
$$g'(\phi) = \frac{1 - \log\phi}{\sqrt{3 - 2\log\phi}} + O\!\left(\frac{\phi^2}{\sqrt{|\log\phi|}}\right),\qquad g''(\phi) = \frac{-2 + \log\phi}{\phi\,(3 - 2\log\phi)^{3/2}} + O\!\left(\frac{\phi}{\sqrt{|\log\phi|}}\right),$$
and after another round of approximation,
$$g'(\phi) = \frac{\sqrt{|\log\phi|}}{\sqrt{2}} + O\!\left(\frac{1}{\sqrt{|\log\phi|}}\right),\qquad g''(\phi) = -\frac{1}{2^{3/2}}\,\frac{1}{\phi\sqrt{|\log\phi|}} + O\!\left(\frac{1}{\phi\,|\log\phi|^{3/2}}\right).$$
This shows that the remainder $g'(\phi) - \sqrt{|\log\phi|}/\sqrt{2}$ is continuous with an absolutely integrable derivative, and by integration by parts it follows that it only leads to a contribution $O(1/n)$ in the integral (31). Thus, it remains to investigate the asymptotics of the sine Fourier coefficients of $h(\phi) = \sqrt{|\log\phi|}$. For convenience, we split the integral (31) at 1, and $[1; \pi]$ only contributes with $O(1/n)$, as $h$ is continuous with absolutely integrable derivative on $[1; \pi]$. On $[0; 1]$, we have to compute the asymptotics of
$$I = \int_0^1 \sqrt{-\log\phi}\,\sin(n\phi)\,d\phi. \tag{32}$$
Therefore, split the integral at $1/n$. The left integral can be bounded directly, and the right after integration by parts [cf. the treatment of Eq. (44)]. One gets
$$I \le \int_0^{1/n} \sqrt{-\log\phi}\,d\phi + \frac{\sqrt{\log n}}{n} + \frac{1}{n}\int_{1/n}^{1} \frac{1}{2\phi\sqrt{-\log\phi}}\,d\phi = O\!\left(\frac{\sqrt{\log n}}{n}\right).$$
In order to prove that this is also a lower bound for the asymptotics, it suffices to show this for the integral (32) as all other contributions vanish more quickly. To this end, split
the integral (32) into single oscillations of the sine, $J_k = [\frac{2\pi k}{n}, \frac{2\pi(k+1)}{n}]$, $k \ge 0$. As $\sqrt{-\log\phi}$ has negative slope on $(0; 1)$, each of the $J_k$ gives a positive contribution to $I$, and thus we can truncate the integral at $\tfrac12$,
$$I \ge \sum_{\frac{2\pi(k+1)}{n} \le \frac12}\ \int_{J_k} \sqrt{-\log\phi}\,\sin(n\phi)\,d\phi. \tag{33}$$
On $[0; \tfrac12]$, $\sqrt{-\log\phi}$ has a positive curvature, and thus, each of the integrals can be estimated by linearly approximating $\sqrt{-\log\phi}$ at the middle of each $J_k$ but with the slope at $\frac{2\pi(k+1)}{n}$, which gives
$$\int_{J_k} \sqrt{-\log\phi}\,\sin(n\phi)\,d\phi \ge \frac{\pi}{n^2}\,\frac{1}{\frac{2\pi(k+1)}{n}}\,\frac{1}{\sqrt{-\log\frac{2\pi(k+1)}{n}}}.$$
Now, we plug this into the sum (33) and bound the sum by the integral from $\frac{2\pi}{n}$ to $\frac12$ (the integrand is monotonically decreasing), which indeed gives a lower bound $\frac{1}{n}\big(\sqrt{\log\frac{n}{2\pi}} - \sqrt{\log 2}\big)$ on $I$ and thus proves the $\Theta(\sqrt{\log n}/n^2)$ scaling.
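The $\sqrt{\log n}/n$ behavior of the integral (32) can also be probed numerically. The following sketch is an illustration under stated assumptions rather than part of the proof: it substitutes $\phi = e^{-u^2}$, which turns $I = \int_0^1 \sqrt{-\log\phi}\,\sin(n\phi)\,d\phi$ into $\int_0^\infty 2u^2 e^{-u^2}\sin(n e^{-u^2})\,du$ with a smooth integrand, and evaluates it with a composite Simpson rule. The substitution, the cutoff `u_max`, the step count and the function names are choices made here for the example.

```python
import math


def simpson(f, lower, upper, intervals):
    """Composite Simpson rule with an even number of intervals."""
    if intervals % 2:
        intervals += 1
    step = (upper - lower) / intervals
    total = f(lower) + f(upper)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * f(lower + k * step)
    return total * step / 3.0


def sine_coefficient(n, u_max=7.0):
    """I(n) = integral over [0, 1] of sqrt(-log phi) * sin(n*phi), Eq. (32).

    With phi = exp(-u^2) one has sqrt(-log phi) = u and d(phi) = -2*u*phi du,
    so the integrand becomes 2*u^2*phi*sin(n*phi) on u in [0, infinity).
    The tail beyond u_max is negligible because phi decays like exp(-u^2).
    """
    def integrand(u):
        phi = math.exp(-u * u)
        return 2.0 * u * u * phi * math.sin(n * phi)

    # The integrand oscillates with a rate proportional to n, so the number
    # of Simpson intervals is scaled with n.
    return simpson(integrand, 0.0, u_max, max(2000, 60 * n))


def rescaled(n):
    """n * I(n) / sqrt(log n), expected to stay of order one."""
    return n * sine_coefficient(n) / math.sqrt(math.log(n))


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, sine_coefficient(n), rescaled(n))
```

If the scaling derived above holds, the rescaled values should stay of order one as $n$ grows; the convergence is slow because the corrections are suppressed only by powers of $\log n$.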
6.2. Higher dimensions. For more than one dimension, the situation is more involved. First of all, it is clear by taking many uncoupled copies of the one-dimensional chain that there exist cases where the correlations will decay as $n^{-2}$. However, these are very special examples corresponding to Hamiltonians with a tensor product structure $H_{i_1 i_2, j_1 j_2} = H_{i_1, j_1} H_{i_2, j_2}$. In contrast, we show that for generic systems the correlations in the critical case decay as $O(n^{-(d+1)}\log n)$, where $d$ is the dimension of the lattice. The requirement is again that the energy spectrum $E(\phi)$ has only a finite number of zeros, i.e., finitely many critical points.

Note that the case of a Hamiltonian with a tensor product structure can also be solved, as in that case $\hat V$ becomes a product of terms depending on one $\phi_i$ each and thus the integral factorizes. Interestingly, although the correlations along the axes decay as $n^{-2}$, the correlations in a fixed diagonal direction will decay as $n_1^{-2} \cdots n_d^{-2} \propto n^{-2d}$ and thus even faster than in the following theorem. The $O(\|\mathbf{n}\|^{-(d+1)}\log\|\mathbf{n}\|)$ decay of the theorem holds isotropically, i.e., independent of the direction of $\mathbf{n}$.

Theorem 10. Consider a $d$-dimensional bosonic lattice with a critical Hamiltonian $H = V \oplus \mathbb{1}$. Then the $P$-correlations of the ground state decay as $O(\|\mathbf{n}\|^{-(d+1)}\log\|\mathbf{n}\|)$ if the following holds: $\hat V \in C^{d+1}$ [e.g., the interactions decay as $O(\|\mathbf{n}\|^{-(2d+1+\varepsilon)})$, $\varepsilon > 0$], and $\hat V$ has only a finite number of zeros which are quadratic minima, i.e., the Hessian $\Big(\frac{\partial^2 \hat V(\phi)}{\partial\phi_i\,\partial\phi_j}\Big)_{ij}$ is positive definite at all zeros.
Proof. We have to evaluate the asymptotic behavior of the integral
$$(V^{1/2})_{\mathbf{n}} = \frac{1}{(2\pi)^d}\int_{T^d} d^d\phi\,\sqrt{\hat V(\phi)}\,\cos[\mathbf{n}\phi].$$
Let us first briefly sketch the proof. We start by showing that it suffices to analyze each critical point separately. To this end, we show that it is possible to smoothly cut out some environment of each critical point which reproduces the asymptotic behavior. Then, we rotate the coordinate system such that we always look at the correlations in a fixed direction, and integrate by parts, which surprisingly can be carried out as often as $\hat V$ is differentiable, as all the brackets vanish. Therefore, the information about the asymptotics is contained in the remaining integral, and after a properly chosen number of partial integrations, we will attempt to estimate this term.

Let now $\zeta_i$, $i = 1, \ldots, I$ be the zeros of $\hat V$. Clearly, these will be the only points which contribute to the asymptotics as everywhere else $\sqrt{\hat V}$ is $C^{d+1}$. In order to separate the contributions coming from the different $\zeta_i$, we will make use of so-called neutralizers [27]. For our purposes, these are functions $N_{\xi_0,r} \in C^\infty(\mathbb{R}^d \to [0; 1])$ which satisfy
$$N_{\xi_0,r}(\xi) = \begin{cases} 1 & : \|\xi - \xi_0\| \le r/2 \\ 0 & : \|\xi - \xi_0\| \ge r \end{cases}$$
and are rotationally symmetric (cf. [27] for an explicit construction). For each $\zeta_i$, there exists an $r_i$ such that the balls $B_{r_i}(\zeta_i)$ do not intersect. We now define the functions
$$f_i(\phi) := \sqrt{\hat V(\phi)}\,N_{\zeta_i,r_i}(\phi),\qquad \rho(\phi) := \sqrt{\hat V(\phi)} - \sum_{i=1}^{I} f_i(\phi).$$
Clearly, $\rho$ is $C^{d+1}$, and so is each $f_i$ except at $\zeta_i$. Furthermore, each $f_i$ is still the square root of a $C^{d+1}$ function. By definition,
$$(V^{1/2})_{\mathbf{n}} = \frac{1}{(2\pi)^d}\sum_{i=1}^{I}\int_{T^d} d^d\phi\, f_i(\phi)\cos[\mathbf{n}\phi] + \frac{1}{(2\pi)^d}\int_{T^d} d^d\phi\, \rho(\phi)\cos[\mathbf{n}\phi], \tag{34}$$
i.e., it suffices to look at the asymptotics of each $f_i$ separately. The contribution of $\rho$ is $O(\|\mathbf{n}\|^{-(d+1)})$ as can be shown by successive integrations by parts just as for the non-critical lattice (cf. the proof of Lemma 2).

Let us now analyze the integrals
$$I_i = \int_{B_{r_i}(\zeta_i)} d^d\phi\, f_i(\phi)\cos[\mathbf{n}\phi].$$
The integration range can be restricted to $B_{r_i}(\zeta_i)$ as $f_i$ vanishes outside the ball. By a rotation, this can be mapped to an integral where $\mathbf{n} = (n, 0, \ldots, 0)$, whereas $f_i$ is rotated to another function $\tilde f_i$ with the same properties,
$$I_i = \int_{B_{r_i}(\zeta_i)} d^d\phi\, \tilde f_i(\phi)\cos[n\phi_1].$$
Since the integrand is continuous and thus bounded, it is absolutely integrable, and from Fubini's theorem, one finds
$$I_i = \int_{B_{r_i}(\tilde\zeta_i)} d^{d-1}\tilde\phi\ \underbrace{\int_{\zeta_{i,1}-r_i}^{\zeta_{i,1}+r_i} d\phi_1\, \tilde f_i(\phi_1, \tilde\phi)\cos[n\phi_1]}_{J_i(\tilde\phi)},$$
where we separated out the integration over the first component. The vector $\tilde\phi$ denotes the components $2 \ldots d$ of $\phi$. The extension of the integration range to a cylinder is possible as $\tilde f_i$ vanishes outside $B_{r_i}(\zeta_i)$.

Let us now require $\tilde\phi \neq \tilde\zeta_i$. This does not change the integral since the excluded set is of measure zero, but it ensures that $\tilde f_i$ is in $C^{d+1}$. This allows us to integrate the inner integral $J_i(\tilde\phi)$ by parts up to $d + 1$ times, and each of the brackets
$$\left[\frac{1}{n^k}\,\tilde f_i^{(k)}(\phi_1, \tilde\phi)\cos(n\phi_1 - k\pi/2)\right]_{\phi_1=\zeta_{i,1}-r_i}^{\zeta_{i,1}+r_i}$$
appearing in the $k$-th integration step vanishes. Here, $\tilde f_i^{(d)}(\phi_1, \tilde\phi) = \partial^d \tilde f_i(\phi_1, \tilde\phi)/\partial\phi_1^d$ is the $d$-th partial derivative with respect to the first argument. After integrating by parts $d$ times, we obtain
$$I_i = \frac{1}{n^d}\int_{B_{r_i}(\tilde\zeta_i)} d^{d-1}\tilde\phi \int_{\zeta_{i,1}-r_i}^{\zeta_{i,1}+r_i} d\phi_1\, \tilde f_i^{(d)}(\phi_1, \tilde\phi)\cos[n\phi_1 - d\pi/2]. \tag{35}$$
Now we proceed as follows: first, we show that the order of integration can be interchanged, and second, we show that for the function obtained after integrating $\tilde f_i^{(d)}$ over $\tilde\phi$, the Fourier coefficients vanish as $\log(n)/n$.

The central issue for what follows is to find suitable bounds on $|\tilde f_i^{(k)}|$. Therefore, define $\tilde f_i^2 =: h_i \in C^{d+1}$. By virtue of Taylor's theorem, and as $h_i(\zeta_i) = 0$ is a minimum,
$$h_i(\phi) = \tfrac12(\phi - \zeta_i)\cdot(D^2h_i(\zeta_i))(\phi - \zeta_i) + o(\|\phi - \zeta_i\|^2)$$
with $D^2$ the second derivative. As the first term is bounded by $\tfrac12\|D^2h_i(\zeta_i)\|_\infty\|\phi - \zeta_i\|^2$ and the second vanishes faster than $\|\phi - \zeta_i\|^2$, we can find $\varepsilon_i > 0$ and $C_1 > 0$ such that
$$|h_i(\phi)| \le C_1\|\phi - \zeta_i\|^2 \quad \forall\,\|\phi - \zeta_i\| < \varepsilon_i. \tag{36}$$
By looking at the Taylor series of $h_i' \equiv \partial h_i/\partial\phi_1$ up to the first order we also find that there are $\varepsilon_i > 0$ and $C_2 > 0$ such that
$$|h_i'(\phi)| \le C_2\|\phi - \zeta_i\| \quad \forall\,\|\phi - \zeta_i\| < \varepsilon_i. \tag{37}$$
In addition to these upper bounds, we will also need a lower bound on $|h_i|$. Again, by the Taylor expansion of $h_i$ around $\zeta_i$, we find
$$|h_i(\phi)| \ge \tfrac12\lambda_{\min}\big[D^2h_i(\zeta_i)\big]\,\|\phi - \zeta_i\|^2 - o(\|\phi - \zeta_i\|^2),$$
and as all the zeros are quadratic minima, i.e., $\lambda_{\min}[D^2h_i(\zeta_i)] > 0$, there exist $\varepsilon_i > 0$, $C_3 > 0$ such that
$$|h_i(\phi)| \ge C_3\|\phi - \zeta_i\|^2 \quad \forall\,\|\phi - \zeta_i\| < \varepsilon_i. \tag{38}$$
Clearly, $\varepsilon_i$ can be chosen equal in Eqs. (36–38). Note that the bounds can be chosen to be invariant under rotation of $h_i$ and thus of $\tilde f_i$. This holds in particular for the $\varepsilon_i$ as the remainders of Taylor series vanish uniformly. Thus, the bound we will obtain for the correlation function indeed only depends on $\|\mathbf{n}\|$ and not on the direction of $\mathbf{n}$.

Now, we use the conditions (36–38) to derive bounds on $|\tilde f_i^{(k)}|$. Therefore, note that from $\tilde f_i \equiv \sqrt{h_i}$ it follows that
$$\tilde f_i^{(k)} = \frac{\displaystyle\sum_{\substack{j_1+\cdots+j_k=k \\ j_\nu=0,1,2,\ldots}} c_{j_1\ldots j_k}\, h_i^{(j_1)} \cdots h_i^{(j_k)}}{h_i^{(2k-1)/2}}.$$
One can easily check that for each term in the numerator, the number $K_0$ of zeroth derivatives and the number $K_1$ of first derivatives of $h_i$ satisfy $2K_0 + K_1 \ge k$. By bounding all higher derivatives of $h_i$ from above by constants, we find that the modulus of each summand in the numerator, and thus the modulus of the numerator itself, can be bounded above by $C'\|\phi - \zeta_i\|^k$ in the ball $B_{\varepsilon_i}(\zeta_i)$ with some $C' > 0$. On the other hand, it follows directly from (38) that the modulus of the denominator is bounded below by $C''\|\phi - \zeta_i\|^{2k-1}$, $C'' > 0$, such that in total
$$|\tilde f_i^{(k)}(\phi)| \le C\,\frac{1}{\|\phi - \zeta_i\|^{k-1}};\quad 1 \le k \le d + 1. \tag{39}$$
Note that this holds not only inside $B_{\varepsilon_i}(\zeta_i)$ but in the whole domain of $f_i$, as outside $B_{\varepsilon_i}(\zeta_i)$, $f_i$ is $C^{d+1}$ and thus all the derivatives are bounded.

Equation (39) is the key result for the remaining part of the proof. First, it can be used to bound the integrand in (35) by an integrable singularity (this is most easily seen in spherical coordinates, where $1/r^{d-1}$ is integrable in a $d$-dimensional space). Hence, the order of integration in (35) can be interchanged, and it remains to investigate the asymptotics of the integral
$$I_i = \frac{1}{n^d}\int_{\zeta_{i,1}-r_i}^{\zeta_{i,1}+r_i} d\phi_1\, g_i(\phi_1)\cos[n\phi_1 - d\pi/2], \quad\text{with} \tag{40}$$
$$g_i(\phi_1) \equiv \int_{B_{r_i}(\tilde\zeta_i)} d^{d-1}\tilde\phi\ \tilde f_i^{(d)}(\phi_1, \tilde\phi). \tag{41}$$
From (39), we now derive bounds on $g_i(\phi_1)$ and its first derivative. Again, we may safely fix $\phi_1 \neq \zeta_{i,1}$ as this has measure zero. Then, using (39) we find that
$$|g_i(\phi_1)| \le \int_0^{r_i} S_{d-1}\,r^{d-2}\,\frac{C}{\big((\phi_1 - \zeta_{i,1})^2 + r^2\big)^{(d-1)/2}}\,dr,$$
where we have transformed into spherical coordinates [$S_{d-1}$ is the surface of the $(d-1)$-dimensional unit sphere] and assumed the $l_2$-norm. Since $(\phi_1 - \zeta_{i,1})^2 + r^2 \ge r^2$, the integrand can be bounded once again, and we find
$$|g_i(\phi_1)| \le \int_0^{r_i} \frac{C\,S_{d-1}}{\big((\phi_1 - \zeta_{i,1})^2 + r^2\big)^{1/2}}\,dr = C'\Big[-\log|\phi_1 - \zeta_{i,1}| + \log\Big(r_i + \sqrt{r_i^2 + (\phi_1 - \zeta_{i,1})^2}\Big)\Big] \le -C''\log|\phi_1 - \zeta_{i,1}|, \tag{42}$$
where in the last step we used that in (40) $|\phi_1 - \zeta_{i,1}| < r_i$ and that $r_i$ can be chosen sufficiently small.

Next, we derive a bound on $g_i'(\phi_1)$. As we fix $\phi_1 \neq \zeta_{i,1}$, the integrand in (41) is $C^1$ and we can take the differentiation into the integral,
$$g_i'(\phi_1) = \int_{B_{r_i}(\tilde\zeta_i)} d^{d-1}\tilde\phi\ \tilde f_i^{(d+1)}(\phi_1, \tilde\phi).$$
Again, we bound the integrand by virtue of Eq. (39) and obtain
$$|g_i'(\phi_1)| \le \int_0^{r_i} \frac{C\,S_{d-1}}{(\phi_1 - \zeta_{i,1})^2 + r^2}\,dr = C'\,\frac{\arctan\frac{r_i}{|\phi_1 - \zeta_{i,1}|}}{|\phi_1 - \zeta_{i,1}|} \le \frac{C''}{|\phi_1 - \zeta_{i,1}|}. \tag{43}$$
Finally, these two bounds will allow us to estimate (40) and thus the asymptotics of the correlations in the lattice. We consider one half of the integral (40),
$$\int_{\zeta_{i,1}}^{\zeta_{i,1}+r_i} d\phi_1\, g_i(\phi_1)\cos[n\phi_1 - d\pi/2], \tag{44}$$
as both halves contribute equally to the asymptotics. We then split the integral at $\zeta_{i,1} + r_i/n$. The left part gives
$$\left|\int_{\zeta_{i,1}}^{\zeta_{i,1}+r_i/n} d\phi_1\, g_i(\phi_1)\cos[n\phi_1 - d\pi/2]\right| \overset{(42)}{\le} C\int_{\zeta_{i,1}}^{\zeta_{i,1}+r_i/n} d\phi_1\,\big(-\log|\phi_1 - \zeta_{i,1}|\big) = C\,\frac{r_i - r_i\log r_i + r_i\log n}{n}. \tag{45}$$
The right part of the split integral (44) can be estimated by integration by parts,
$$\left|\int_{\zeta_{i,1}+r_i/n}^{\zeta_{i,1}+r_i} d\phi_1\, g_i(\phi_1)\cos[n\phi_1 - d\pi/2]\right| \le \left|\left[\frac{1}{n}\,g_i(\phi_1)\cos[n\phi_1 - (d+1)\pi/2]\right]_{\zeta_{i,1}+r_i/n}^{\zeta_{i,1}+r_i}\right| + \frac{1}{n}\int_{\zeta_{i,1}+r_i/n}^{\zeta_{i,1}+r_i} d\phi_1\,|g_i'(\phi_1)| \overset{(42,43)}{\le} C\,\frac{|\log r_i|}{n} + C'\,\frac{\log n}{n}. \tag{46}$$
Thus, both halves [Eqs. (45), (46)] give a $\log n/n$ bound for the integral (44), and thus the integral $I_i$ is asymptotically bounded by $\log n/n^{d+1}$ following Eq. (40). As the number of such integrals in (34) is finite, this proves that the correlations of the ground state decay at least as $\log n/n^{d+1}$.

Acknowledgements. We would like to thank Jens Eisert, Otfried Gühne, David Pérez García, Diego Porras, Tommaso Roscilde, Frank Verstraete, and Karl Gerd Vollbrecht for helpful discussions and comments. This work has been supported by the EU IST projects QUPRODIS and COVAQIAL.
References
1. Plenio, M.B., Eisert, J., Dreissig, J., Cramer, M.: Entropy, entanglement, and area: analytical results for harmonic lattice systems. Phys. Rev. Lett. 94, 060503 (2005)
2. Cramer, M., Eisert, J., Plenio, M.B., Dreissig, J.: An entanglement-area law for general bosonic harmonic lattice systems. Phys. Rev. A 73, 012309 (2005)
3. Wolf, M.M.: Violation of the entropic area law for fermions. Phys. Rev. Lett. 96, 010404 (2005)
4. Audenaert, K., Eisert, J., Plenio, M.B., Werner, R.F.: Entanglement properties of the harmonic chain. Phys. Rev. A 66, 042327 (2002)
5. Botero, A., Reznik, B.: Spatial structures and localization of vacuum entanglement in the linear harmonic chain. Phys. Rev. A 70, 052329 (2004)
6. Asoudeh, M., Karimipour, V.: Entanglement of bosonic modes in symmetric graphs. Phys. Rev. A 72, 0332339 (2005)
7. Plenio, M.B., Semiao, F.L.: High efficiency transfer of quantum information and multi-particle entanglement generation in translation invariant quantum chains. New J. Phys. 7, 73 (2005)
8. Plenio, M.B., Hartley, J., Eisert, J.: Dynamics and manipulation of entanglement in coupled harmonic systems with many degrees of freedom. New J. Phys. 6, 36 (2004)
9. Eisert, J., Plenio, M.B., Bose, S., Hartley, J.: Towards mechanical entanglement in nano-electromechanical devices. Phys. Rev. Lett. 93, 190402 (2004)
10. Wolf, M.M., Verstraete, F., Cirac, J.I.: Entanglement and frustration in ordered systems. Int. J. Quant. Inf. 1, 465 (2003)
11. Wolf, M.M., Verstraete, F., Cirac, J.I.: Entanglement frustration for Gaussian states on symmetric graphs. Phys. Rev. Lett. 92, 087903 (2004)
12. Nachtergaele, B., Sims, R.: Lieb-Robinson bounds and the exponential clustering theorem. Commun. Math. Phys. 265, 119 (2006)
13. Hastings, M.B., Koma, T.: Spectral gap and exponential decay of correlations. http://arxiv.org/list/math-ph/0507008, 2005
14. Cramer, M., Eisert, J.: Correlations and spectral gap in harmonic quantum systems on generic lattices. New J. Phys. 8, 71 (2006)
15. Auerbach, A.: Interacting electrons and quantum magnetism. New York: Springer Verlag, 1994
16. James, D.F.V.: Quantum dynamics of cold trapped ions, with application to quantum computation. Appl. Phys. B 66, 181 (1998)
17. Williamson, J.: Amer. J. Math. 58, 141 (1936)
18. Birkl, G., Kassner, S., Walther, H.: Multiple-shell structures of laser-cooled Mg ions in a quadrupole storage ring. Nature 357, 310 (1992)
19. Dubin, D.H.E.: Theory of structural phase transitions in a Coulomb crystal. Phys. Rev. Lett. 71, 2753 (1993)
20. Enzer, D.G., Schauer, M.M., Gomez, J.J., Gulley, M.S., et al.: Observation of power-law scaling for phase transitions in linear trapped ion crystals. Phys. Rev. Lett. 85, 2466 (2000)
21. Mitchell, T.B., Bollinger, J.J., Dubin, D.H.E., Huang, X.-P., Itano, W.M., Baugham, R.H.: Direct observations of structural phase transitions in planar crystallized ion plasmas. Science 282, 1290 (1998)
22. Manuceau, J., Verbeure, A.: Quasi-free states of the C.C.R.–Algebra and Bogoliubov transformations. Commun. Math. Phys. 9, 293 (1968)
23. Holevo, A.S.: Quasi-free states on the C*-algebra of CCR. Theor. Math. Phys. 6, 1 (1971)
24. Arvind, Dutta, B., Mukunda, N., Simon, R.: The real symplectic groups in quantum mechanics and optics. Pramana 45, 471 (1995)
25. Wolf, M.M., Giedke, G., Krüger, O., Werner, R.F., Cirac, J.I.: Gaussian entanglement of formation. Phys. Rev. A 69, 052320 (2004)
26. Benzi, M., Golub, G.H.: BIT Numerical Mathematics 39, 417 (1999)
27. Bleistein, N., Handelsman, R.A.: Asymptotic expansions of integrals. New York: Dover Publication, 1986

Communicated by M.B. Ruskai
Commun. Math. Phys. 267, 93–115 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0069-2
A Geometric Approach to the Classification of the Equilibrium Shapes of Self-Gravitating Fluids
Alvaro Pelayo1, Daniel Peralta-Salas2
1 Department of Mathematics, University of Michigan, 530 Church Street, Ann Arbor, Michigan 48109-1043, USA. E-mail: [email protected]
2 Departamento de Física Teórica II, Universidad Complutense, 28040 Madrid, Spain. E-mail: [email protected]
Received: 10 October 2005 / Accepted: 3 March 2006
Published online: 20 July 2006 – © Springer-Verlag 2006
Abstract: The classification of the equilibrium shapes that a self-gravitating fluid can take in a Riemannian manifold is a classical problem in Mathematical Physics. In this paper it is proved that the equilibrium shapes are isoparametric submanifolds. Some geometric properties of them are also obtained, e.g. classification and existence for some Riemannian spaces and relationship with the isoperimetric problem and the group of isometries of the manifold. Our approach to the problem is geometrical and allows one to study the equilibrium shapes on general Riemannian spaces.
1. Introduction
Let $(M, g)$ be an analytic, complete and connected (without boundary) Riemannian n-manifold and $\Omega$ an open connected subset of $M$ (bounded or not) occupied by a mass of fluid. We say that a fluid is self-gravitating if the only significant forces are its interior pressure and its own gravitation. A fluid in these conditions represents a simplified stellar model of a fluid-composed star. Depending on whether the gravitational field is modelled by the Poisson or the Einstein equations we say that the fluid is Newtonian or relativistic. An important problem in Fluid Mechanics consists in studying the shape that a self-gravitating fluid will take when it reaches the equilibrium state. By the shape of a fluid we mean the topological and geometrical properties of the boundary $\partial\Omega$. The mathematical description of this kind of fluid involves only three physical quantities: the gravitational potential, which is a function $f_1 : M \to \mathbb{R}$ (constant on $\partial\Omega$), and the density and pressure, which are two functions $f_2, f_3 : \Omega \to \mathbb{R}$. The set of partial differential equations fulfilled by the functions $(f_1, f_2, f_3)$ is of free-boundary type because the domain $\Omega$ is an unknown of the problem.
In the relativistic case the metric tensor $g$ is also an unknown and it must satisfy the coupling condition
$R_{ab} = f_1^{-1} f_{1;ab} + 4\pi (f_2 - f_3) g_{ab}$, (1)
$R_{ab}$ standing for the Ricci tensor and the semicolon standing for the covariant derivative.
The standard approaches to the problem of classifying the equilibrium shapes of $\partial\Omega$ generally employ analytical techniques. In the Newtonian case maximum principles for elliptic equations are used in order to prove the existence of symmetries of the solutions $(f_1, f_2, f_3)$. Lichtenstein [1] and later on Lindblom [2] proved the existence of spherical symmetry (i.e. $\partial\Omega$ is a round sphere) when $M$ is the Euclidean $\mathbb{R}^3$, $\Omega$ is bounded and the functions $(f_1, f_2, f_3)$ satisfy some physical constraints. In the relativistic case arguments involving the positive mass theorem are used for obtaining the conformal flatness of the metric tensor. As a consequence of this technique Beig & Simon [3] and Lindblom & Masood-ul-Alam [4] proved again spherical symmetry when $\Omega$ is bounded and the solutions verify certain physical hypotheses. Despite these important results many questions remain open: What about Newtonian fluids on Riemannian manifolds? What about unbounded domains $\Omega$? Do the same results hold if we drop the physical assumptions?
In this paper we approach the problem in a different, more geometrical way. Indeed, since the gravitational potential $f_1$ is constant on $\partial\Omega$, the study of the geometric properties of the level sets of $f_1$ is an effective procedure to classify the equilibrium shapes that the fluid can take, thus connecting our problem with the philosophy of the geometric theory of PDEs. The literature on this field is essentially focused on the study of general properties of the level sets of the solutions to differential equations, e.g. critical levels [5, 6], convexity or starshapedness of the levels [7, 8], order of vanishing and measure for level sets [9, 10], and symmetries of the levels due to overdetermined boundary conditions [11–15]. This work provides another contribution to the abundant literature on geometric theory of PDEs, and as far as we know, the techniques that we develop for classifying the shapes of $\partial\Omega$ (e.g. the analytic representation property) are new and independent of the other approaches to similar problems. It is interesting to observe that in Serrin's paper [11, 12] it is proved that solutions to certain boundary problems exist only when the domain is a sphere (for related results see the literature on overdetermined boundary problems, e.g. [15] and references therein). However Serrin's method, somehow related to the classical approaches to Newtonian self-gravitating fluids by Lichtenstein and Lindblom (i.e. moving plane technique and maximum principles), bears no similarity with our methods (in fact the physical motivation of Serrin's problem is not related to static fluids).
Let us summarize the organization of this paper in three blocks.
– In Sects. 2 and 3 we establish the notation of the paper, prove some elementary lemmas and formulate the problem. The equations that we study include the Newtonian and the relativistic (without taking into account Eq. (1)) as particular cases. In Sect. 4 we prove that the function $f_1$ is analytically representable across $\partial\Omega$ and as a consequence of this remarkable property $f_1$ is proved to satisfy the equilibrium condition. This result is particularly relevant from the physical viewpoint since it allows us to classify the equilibrium shapes of a self-gravitating fluid on any Riemannian manifold. Our techniques are different from the previous ones which appeared in the literature on self-gravitating fluids, and could be of interest to PDE theorists working on (overdetermined) free-boundary problems.
– In the second block we study the geometrical consequences of the equilibrium condition. In Sect. 5 we prove that the (regular) level sets of the function $f_1$ are isoparametric submanifolds, thus connecting the shapes of static fluids with these classical objects of differential geometry, and in Sect. 6 we obtain classification theorems for certain spaces $(M, g)$. In Sect.
7 necessary and sufficient conditions for the existence of certain equilibrium shapes are obtained and the 3-manifolds admitting
fluid-composed stars are classified. In Sect. 8 the relationship between the equilibrium shapes, the isoperimetric problem and the Killing vector fields of $M$ is studied. Apart from the physical relevance (most of our statements have not been obtained through the classical approaches) our results could be of interest to differential geometers.
– In Sects. 9 and 10 some open problems are discussed, the most important being the extension of our techniques to general relativistic fluids.
2. Notation and Preliminary Results
Let $f$ be a smooth, real-valued function on $M$. For each $c \in f(M) \subset \mathbb{R}$ we have a pre-image $f^{-1}(c)$. Let $V_c^i$ be the $i$-th connected component of $f^{-1}(c)$. The partition induced on $M$ by $f$ is defined as
$\beta_M(f) = \bigcup_{i,c} V_c^i$.
Analogously if we have an open set $U \subset M$ the partition induced on $U$ by $f|_U$ is called $\beta_U(f)$. Each $V_c^i$ is called a leaf of the partition. In general the dimension of the leaves of the partition is not constant because there can exist singular fibres ($\nabla f = 0$) and therefore $\beta_M(f)$ is a singular foliation. We say that a function $f$ represents a given partition of $U$ if $\beta_U(f)$ coincides with it. If $f$ is real analytic ($C^\omega$) then the partition is said to admit an analytic representation. Analogously we call $f$ analytically representable on $U \subset M$ if $\beta_U(f)$ admits an analytic representation. We will say that a family $\{f_i : M \to \mathbb{R}\}$ agrees fibrewise if $\beta_M(f_i) = \beta_M(f_j)$ for all $i, j$. The following extension lemma will be useful in forthcoming sections.
Lemma 1. Let $f$, $g$ be real analytic functions on $M$. Let $U \subset M$ be an open set. Then $\beta_U(f) = \beta_U(g) \Longrightarrow \beta_M(f) = \beta_M(g)$.
Proof. Since $\beta_U(f) = \beta_U(g)$ we have that $\mathrm{rank}(df, dg) \le 1$ in $U$. Since $U$ is open and $f$, $g$ are analytic functions this condition extends to the whole $M$, i.e. $\mathrm{rank}(df, dg) \le 1$ in $M$. This implies [16] that $f$ and $g$ are functionally dependent and hence there exists an analytic function $Q : \mathbb{R}^2 \to \mathbb{R}$ such that $Q(f, g) = 0$ in $M$, thus showing that the partitions induced by $f$ and $g$ agree.
Let $S$ be a codimension one orientable submanifold in $M$. The metric induced by $g$ on $S$ is given by $\beta_{ab} = g_{ab} - n_a n_b$ [17], where $n^a$ is the unit normal vector field to $S$ on $M$. The extrinsic curvature or second fundamental form of $S$ is defined as $H_{ab} = \beta_a^{\ c}\beta_b^{\ d} n_{d;c} = \frac{1}{2} L_n(\beta_{ab})$ [17], $L_n$ standing for the Lie derivative with respect to the unit normal vector field. If $S$ is given as the zero-set of the function $f$, i.e. $S = \{p \in M : f(p) = 0\}$ and $df|_S \neq 0$, then the mean curvature $H$ of $S$ (the trace of the second fundamental form) is given by the following expression:
$H = \mathrm{div}\left(\frac{\nabla f}{|\nabla f|}\right)$, (2)
div and $\nabla$ standing for the divergence and gradient operators on the Riemannian manifold.
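As a quick sanity check of Eq. (2), the following sketch evaluates $H = \mathrm{div}(\nabla f/|\nabla f|)$ by central finite differences for the round sphere of radius 2 in Euclidean $\mathbb{R}^3$, written as the zero set of $f(x) = |x|^2 - 4$. The flat ambient metric, the radius, the step sizes and all function names are assumptions chosen for illustration; they do not come from the text. With this sign convention the value should be close to $(n-1)/r = 1$.

```python
import math

def grad(f, p, h=1e-5):
    """Central-difference gradient of f at the point p (Euclidean metric assumed)."""
    g = []
    for k in range(len(p)):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        g.append((f(hi) - f(lo)) / (2 * h))
    return g

def unit_normal(f, p):
    """grad f / |grad f| at p."""
    g = grad(f, p)
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def mean_curvature(f, p, h=1e-4):
    """H = div(grad f / |grad f|), Eq. (2), by central differences."""
    total = 0.0
    for k in range(len(p)):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        total += (unit_normal(f, hi)[k] - unit_normal(f, lo)[k]) / (2 * h)
    return total

def sphere(p, radius=2.0):
    """Illustrative level-set function: its zero set is the sphere of the given radius."""
    return sum(c * c for c in p) - radius ** 2

point = (2.0 / math.sqrt(3),) * 3      # a point on the sphere of radius 2
H = mean_curvature(sphere, point)      # compare with (n - 1) / r = 1
```

The same routine can be pointed at any other smooth level-set function with non-vanishing gradient, as long as the ambient space is flat.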
If $\tilde{R}^a_{\ bcd}$ is the Riemann curvature tensor induced by $R^a_{\ bcd}$ on $S$ then the Gauss theorem implies [17] that
$\tilde{R} = R - 2R_{ab} n^a n^b + (H^a_{\ a})^2 - H_{ab} H^{ab}$. (3)
From this expression we deduce the relationship between the intrinsic sectional curvature of $S$ ($\tilde{K}$), the sectional curvature of $M$ restricted to $S$ ($K$) and the Gauss curvature of $S$ ($\bar{K}$), namely $\tilde{K} = K + \bar{K}$ [17]. Let us finish this section with the following lemma [18].
Lemma 2. Let $f$ be a smooth function on an open set $U \subseteq M$ saturated by level sets of $f$. If $|df| \ge m > 0$ on $U$ then $f$ is a (locally trivial) fibre bundle on $U$.
Proof. The normal vector field $X = \frac{\nabla f}{(\nabla f)^2}$ is smooth on $U$ and a symmetry of $f$ because $L_X(f) = 1$ [19]. $X$ is complete because $(M, g)$ is complete and $|X| \le \frac{1}{m}$, thus defining a global 1-parameter group of diffeomorphisms [20].
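The mechanism behind the proof of Lemma 2 can be illustrated numerically: since $L_X(f) = 1$ for $X = \nabla f/(\nabla f)^2$, the flow of $X$ carries the level set $f = c$ onto the level set $f = c + t$. The sketch below integrates this flow with a classical fourth-order Runge–Kutta scheme for the sample function $f(x, y) = x^2 + y^2$ on the Euclidean plane away from the origin. The sample function, the integrator, the starting point and all names are illustrative assumptions, not part of the text.

```python
def f(p):
    """Sample function on the Euclidean plane (an assumption for illustration)."""
    x, y = p
    return x * x + y * y

def X(p):
    """Normal field X = grad f / (grad f)^2, which satisfies L_X f = 1."""
    x, y = p
    gx, gy = 2 * x, 2 * y
    norm_sq = gx * gx + gy * gy
    return (gx / norm_sq, gy / norm_sq)

def flow(p, t, steps=1000):
    """Integrate dp/dt = X(p) for time t with the classical RK4 scheme."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = X((x, y))
        k2 = X((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = X((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = X((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

start = (1.0, 0.5)
elapsed = 0.75
end = flow(start, elapsed)
drift = f(end) - f(start)   # should agree with `elapsed` up to integration error
```

Because the increment of $f$ along the flow depends only on the elapsed time, every point of a leaf is moved to the same leaf, which is the local triviality used in the lemma.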
3. Statement of the Problem
The problem (P) in which we are interested is a system of PDEs defined on $M$. Its form and additional regularity assumptions are inspired by the equations modelling static self-gravitating fluids, both Newtonian and relativistic [21]. The boundary of the fluid region, $\partial\Omega$, is assumed to be a codimension one analytic submanifold (connected or not). The functions $f_2$ and $f_3$ are required to be analytic in $\bar{\Omega}$ and constant on $\partial\Omega$, and $f_3$ is not allowed to be constant in the whole $\Omega$. With these assumptions in mind we now state the equations of problem (P) ($\Delta$ is the Laplace–Beltrami operator on the manifold). In this paper we consider only classical solutions.
$\Delta f_1 = F(f_1, f_2, f_3)$ in $\Omega$, (4)
$H(f_1)\nabla f_3 + G(f_2, f_3)\nabla f_1 = 0$ in $\Omega$, (5)
$f_1 = c$, $c \in \mathbb{R}$, $\nabla f_1 \neq 0$ and $f_1 \in C_t^2$ on $\partial\Omega$, (6)
$\Delta f_1 = 0$ in $M - \bar{\Omega}$, (7)
where $F$, $G$ and $H$ are (not identically zero) analytic functions. We also impose that $G$ is not a constant. The symbol $C_t^2$ in (6) means that $f_1$ is $C^1$ on the boundary and its tangential second derivatives $f_{1,ij} t^j$ are continuous for any (local) vector field $t = t^i \partial_i$ tangent to the boundary. This assumption is analogous to Synge's junction condition for the metric tensor in General Relativity [22]. The normal components of the second derivatives will not be continuous in general. Note that the constant $c$ in (6) is not a priori prescribed and that the domain $\Omega$ is an unknown of (P). For the sake of simplicity we have assumed that $\nabla f_1 \neq 0$ on $\partial\Omega$, but all the results of this paper hold only requiring that $\nabla f_1$ is not identically zero on the boundary. It is interesting to observe that one only needs to assume that $\partial\Omega$ is smooth enough, its analyticity following from general properties of elliptic free-boundary equations [23, 24].
Remark 1. The solutions to problem (P) are the functions $(f_1, f_2, f_3)$ and the domains $\Omega$ satisfying all the required conditions. For certain values of $F$, $G$ and $H$ or certain manifolds $(M, g)$ problem (P) could have no solutions at all. Since we are not interested in the existence problem we will suppose that solutions to (P) exist and characterize the structure of the level sets of these solutions. This is in strong contrast to the classical approaches, where the problem of existence and uniqueness is considered first and the geometrical restrictions arise afterwards.
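To make the abstract system (4)–(7) concrete, here is how the classical Newtonian model (Poisson's equation coupled with hydrostatic equilibrium) appears to fit the pattern, writing $\Omega$ for the fluid region and reading $f_1$ as the gravitational potential, $f_2$ as the density and $f_3$ as the pressure. The normalization in which the gravitational constant equals one, and the particular choice of $F$, $G$ and $H$, are assumptions made for this illustration rather than statements taken from the text.

```latex
\begin{align*}
\Delta f_1 &= 4\pi f_2 \quad \text{in } \Omega,
  & F(f_1, f_2, f_3) &= 4\pi f_2, \\
\nabla f_3 + f_2 \nabla f_1 &= 0 \quad \text{in } \Omega,
  & H(f_1) &= 1, \qquad G(f_2, f_3) = f_2, \\
\Delta f_1 &= 0 \quad \text{in } M - \bar{\Omega}.
\end{align*}
```

With these choices $G$ is non-constant and $H$ is not identically zero, so the structural requirements stated after Eq. (7) are met.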
Let us prove three important (elementary) lemmas which will be used in the following sections. The first one is a consequence of [25–27], and is stated without proof.
Lemma 3. The solutions $f_1$ to (P) are analytic on $M - \partial\Omega$.
Lemma 4. The solutions $f_1$, $f_2$ and $f_3$ of (P) agree fibrewise in $\Omega$.
Proof. From (5) it follows that $d\left(\frac{df_3}{G(f_2, f_3)}\right) = -d\left(\frac{df_1}{H(f_1)}\right) = 0$ and hence $G_{,f_2}\, df_2 \wedge df_3 = 0$ (the subscript denotes, as usual, partial differentiation). Therefore $\nabla f_2$ and $\nabla f_3$ are linearly dependent. Also from (5) $df_3 = -\frac{G(f_2, f_3)}{H(f_1)}\, df_1$ and taking the exterior derivative in this equation we get $(G_{,f_2}\, df_2 + G_{,f_3}\, df_3) \wedge df_1 = 0$. The linear dependence of $\nabla f_2$ and $\nabla f_3$ implies the linear dependence of all $\nabla f_1$, $\nabla f_2$ and $\nabla f_3$. This fact and the analyticity of the functions imply that they agree fibrewise in the whole $\Omega$.
For any point $p \in \partial\Omega$ consider a small enough open neighborhood $U \subset M$. Define the open sets $U_{in} = U \cap \Omega \neq \emptyset$ and $U_{out} = U \cap (M - \bar{\Omega}) \neq \emptyset$.
Lemma 5. $f_2$ and $f_3$ are functions of $f_1$ in $U_{in}$.
Proof. Suppose that $\nabla f_1$ does not vanish in $U$ (it is always possible by continuity and the fact that $(\nabla f_1)|_{\partial\Omega} \neq 0$). If we cover $U$ with a local coordinate system $(x_1, \dots, x_n)$ we can assume, without loss of generality, that $f_{1,x_1} \neq 0$. The implicit function theorem guarantees the following steps in $U_{in}$:
$x_1 = X_1(f_1, x_2, \dots, x_n)$
$\Longrightarrow f_2 = f_2(X_1(f_1, x_2, \dots, x_n), x_2, \dots, x_n) \equiv F_2(f_1, x_2, \dots, x_n)$
$\Longrightarrow f_3 = f_3(X_1(f_1, x_2, \dots, x_n), x_2, \dots, x_n) \equiv F_3(f_1, x_2, \dots, x_n)$.
It is easy to check that $F_{i,x_2} = \dots = F_{i,x_n} = 0$, $i = 2, 3$. One only has to take into account the implicit function theorem and Lemma 4. Hence in $U_{in}$ we get that $f_2 = F_2(f_1)$ and $f_3 = F_3(f_1)$, where $F_2$ and $F_3$ are analytic functions of their argument.
Since the pressure, the density and the gravitational potential induce the same partition in $U_{in}$ we can reduce our study to $\beta_M(f_1)$, which has the advantage of being defined on the whole $M$. The following sections are devoted to understanding the topological and geometrical properties of this partition, which will only depend on the base manifold $(M, g)$.
4. Analytic Representation and Proof of the Classification Theorem
When working with partitions of $U \subseteq M$ it is reasonable to try to represent them by functions as good as possible, e.g. analytic functions. The general programme would be to substitute the "pathological" function $g : U \to \mathbb{R}$, which represents the partition, by an analytic function $f$ satisfying $\beta_U(g) = \beta_U(f)$. In general a partition cannot be analytically represented. Even in the case in which the function $g$ is analytic in the whole $M$ except for the fibre $g^{-1}(0)$ (as happens with the potential $f_1$ solution to (P)), an analytic representation need not exist. The following examples illustrate the difficulties which may arise.
Example 1. Let $f : \mathbb{R}^n \to \mathbb{R}$ be defined as $f(x) = |x|^2 - 1$ ($|\cdot|$ standing for the Euclidean norm) and let $h : \mathbb{R} \to \mathbb{R}$ be given by $h(t) = t\exp(-1/t^2)$ if $t \neq 0$ and $h(0) = 0$. Then the function $g = f + h \circ f$ is $C^\infty$ on $\mathbb{R}^n$, analytic on $\mathbb{R}^n - g^{-1}(0)$ and agrees fibrewise with $f$. So $f$ is an analytic representation of $g$.
Example 2. Let $g : \mathbb{R}^n \to \mathbb{R}$ be a smooth function defined in coordinates $(x_1, \dots, x_n)$ as
$g(x_1, \dots, x_n) = x_1\left(1 + x_2\exp(-1/x_1^2)\right)$ if $x_1 > 0$, and $g(x_1, \dots, x_n) = x_1$ if $x_1 \le 0$.
It is not difficult to prove that $g$ does not admit analytic representation in any neighborhood of the fibre $\{x_1 = 0\}$. Similar examples can be constructed when the "pathological" fibre is compact.
In spite of these results, the gravitational potential $f_1$, which is generally not analytic on the boundary, has a remarkable property: its partition can be represented by analytic functions in a neighborhood of $\partial\Omega$. Before proving this result let us comment on a physical reason supporting this property. Since $\partial\Omega$ is an analytic submanifold and $(\nabla f_1)|_{\partial\Omega} \neq 0$ then there exists an analytic function $I : M \to \mathbb{R}$, arbitrarily close to $f_1$ (in the $C^\infty$ strong topology), such that $I^{-1}(0) = \partial\Omega$ [28]. This proves that a small perturbation of the gravitational potential makes its partition analytically representable. Since two arbitrarily close partitions are indistinguishable from the physical viewpoint, this suggests that $\beta_M(f_1)$ itself admits an analytic representation. The rigorous proof of this result is a modification of a technique developed by Lindblom in [29].
Theorem 1 (Analytic representation property). Let $f_1$ be any solution to (P), then $\beta_M(f_1) = \beta_M(I)$ where $I : M \to \mathbb{R}$ is a function analytic on the whole $M$.
Proof. Let $U$ be a small enough neighborhood of $p \in \partial\Omega$. On account of Lemma 5 we can write the equations defining (P) in terms of just $f_1$:
$\Delta f_1 = \hat{F}(f_1)$ in $U_{in}$, (8)
$f_1 = c$, $c \in \mathbb{R}$, $\nabla f_1 \neq 0$ and $f_1 \in C_t^2$ on $\partial\Omega \cap U$, (9)
$\Delta f_1 = 0$ in $U_{out}$. (10)
Consider a vector field $\xi = \xi^i \partial_i$ which is a symmetry of $f_1$ in $U_{in}$, i.e. $\xi(f_1) = 0$. Note that $\xi$ can always be chosen analytic. The assumption that $f_2$ and $f_3$ are analytic in $\bar{\Omega}$ implies that $\hat{F}$ in Eq. (8) is an analytic function of $f_1$. It follows that the interior solution $f_1$, satisfying Eq. (8) and the boundary condition $f_1 = c$, is analytically continuable across the boundary [30], although its continuation does not generally coincide with the exterior solution. Consequently $\xi^i$ and $\xi^i_{,j}$ can be extended to $\partial\Omega$ as analytic functions and in fact $\xi(f_1)|_{\partial\Omega} = 0$. Now let us extend the vector field $\xi$ beyond the free-boundary in such a way that the extension is analytic in the whole $U$. The components of the new vector field $\hat{\xi}$ in $U$ are defined by the following boundary problem:
$\Delta\hat{\xi}^k + \frac{2 D_\mu D_k f_1}{f_{1,k}} (D^\mu \hat{\xi}^k) + \frac{R_{ik} D^i f_1}{f_{1,k}} \hat{\xi}^k = 0$, (11)
$k = 1, \dots, n$, provided with the boundary conditions on $\partial\Omega$ given by the values $(\hat{\xi}^i)|_{\partial\Omega} = (\xi^i)|_{\partial\Omega}$ and $(\hat{\xi}^i_{,j})|_{\partial\Omega} = (\xi^i_{,j})|_{\partial\Omega}$ of the interior symmetry. Note that the symbol $D$ stands for the covariant derivative and $R_{ik}$ is the Ricci curvature. In order that (11) be well defined assume, without loss of generality, that in $U$ all the components of $\nabla f_1$ are non zero. Since all the terms of (11) are analytic in $U_{in}$ and $U_{out}$, the boundary conditions and $\partial\Omega$ are also analytic, and the equation is linear and elliptic, there exists a (unique) extension which defines an analytic vector field in $U$ [25, 26] (Cauchy–Kowalewsky theorem). This analytic vector field has the property of being a symmetry of $f_1$ in $U$. Indeed, the first step consists in the following computation:
$\Delta(\hat{\xi}^k f_{1,k}) = D_\mu D^\mu (\hat{\xi}^k f_{1,k}) = f_{1,k}\left[\Delta\hat{\xi}^k + \frac{2 D_\mu D_k f_1}{f_{1,k}} (D^\mu \hat{\xi}^k) + \frac{D^\mu D_\mu D_k f_1}{f_{1,k}} \hat{\xi}^k\right]$. (12)
Note now that $D_\mu D^\mu D_k f_1 = D_k(\Delta f_1) + R_{ik} D^i f_1$ and since $f_1$ is a solution of (P) we get from Eqs. (11) and (12) that
$\Delta(\hat{\xi}^k f_{1,k}) = \hat{F}'(f_1)\, \hat{\xi}^k f_{1,k}$ in $U_{in}$, (13)
$\Delta(\hat{\xi}^k f_{1,k}) = 0$ in $U_{out}$. (14)
The boundary conditions are the following. First note that $f_{1,k}\hat{\xi}^k$ is $C^1$ on $\partial\Omega$. Indeed, recall that on $\partial\Omega$ $f_1$ is $C^1$ and its tangential second derivatives are continuous, therefore $f_{1,k}\hat{\xi}^k$ and $\partial_i(f_{1,k}\hat{\xi}^k) = f_{1,ik}\hat{\xi}^k + f_{1,k}\hat{\xi}^k_{,i}$ are also continuous on the boundary. The same applies to $f_{1,k}\xi^k$. Since $f_{1,k}\xi^k = 0$ in $U_{in}$ then $(f_{1,k}\xi^k)|_{\partial\Omega} = 0$ and $\partial_i(f_{1,k}\xi^k)|_{\partial\Omega} = 0$, and hence the boundary conditions are $(f_{1,k}\hat{\xi}^k)|_{\partial\Omega} = 0$ and $\partial_i(f_{1,k}\hat{\xi}^k)|_{\partial\Omega} = 0$. Note now that Eq. (13) is analytic in $\bar{U}_{in}$ because the interior solutions $f_1$, $f_2$, $f_3$ can be analytically continued across the boundary. Holmgren's theorem for linear analytic elliptic equations [31] implies that the (unique) $C^1$ solutions to (13) and (14) provided with the boundary conditions are $\hat{\xi}^k f_{1,k} = 0$ in $U_{in}$ and $\hat{\xi}^k f_{1,k} = 0$ in $U_{out}$, thus showing that $\hat{\xi}^k f_{1,k} = 0$ in the whole $U$. Summarizing, we have a (local) Lie algebra of $n - 1$ independent analytic symmetries of $f_1$ in $U$. From this Lie algebra we can reconstruct the partition $\beta_U(f_1)$ via the Frobenius theorem, the analyticity of $\hat{\xi}$ implying that the partition is analytic [32]. This ensures the existence of an analytic function $\hat{I} : U \to \mathbb{R}$ such that $\beta_U(f_1) = \beta_U(\hat{I})$. By Lemma 5 it follows that $\hat{I} = F(f_1)$, thus showing that $\hat{I}$ extends to a saturated neighborhood of $\partial\Omega$, $N(\partial\Omega)$, as an analytic function $\hat{I} : N(\partial\Omega) \to \mathbb{R}$. This result and the analyticity of $f_1$ in $M - \partial\Omega$ prove that $\beta_M(f_1)$ is analytically representable across any leaf, and therefore there exists an analytic extension $I : M \to \mathbb{R}$ of $\hat{I}$ such that $\beta_M(f_1) = \beta_M(I)$, thus proving the claim.
Concerning the physical meaning of the analytic representation property (ARP) we must say the following. From the proof of Theorem 1 it follows that the interior symmetries propagate across the free-boundary and remain symmetries of the exterior solution. This implies that a physical matching on a free-boundary (at least in static situations) guarantees not only the continuity of the gravitational field but also the dependence between the external and internal properties of the fluid.
Definition 1. An analytic function $I : M \to \mathbb{R}$ is of equilibrium on $U \subseteq M$ if $I$, $(\nabla I)^2$ and $\Delta I$ agree fibrewise on $U$. A partition induced by an equilibrium function is called an equilibrium partition.
Let us now prove the classification theorem.
Theorem 2 (Classification theorem). If $f_1$ is a solution to the problem (P) then $\beta_M(f_1)$ is an equilibrium partition.
Proof. If $p \in \partial\Omega$ and $U$ is a small neighborhood of $p$ then $f_2$ and $f_3$ can be expressed as functions of $f_1$ in $U_{in}$ (see Lemma 5), thus implying that $\Delta f_1 = \hat{F}(f_1)$ in $U_{in}$. In $U_{out}$ the potential satisfies $\Delta f_1 = 0$. Theorem 1 ensures the existence of an analytic function $I$ agreeing fibrewise with $f_1$. The same technique as in Lemma 5 can be applied in order to show that $f_1 = R(I)$ in $U$. From Eq. (7) it follows that $\Delta R(I) = 0$ in $U_{out}$, which is equivalent to
$R''(I)(\nabla I)^2 + R'(I)\Delta I = 0$, (15)
and therefore $\frac{\Delta I}{(\nabla I)^2} = \Phi(I)$ in $U_{out}$. Since $I$ is analytic so are $\Delta I$ and $(\nabla I)^2$ and hence $\Phi$ is analytic in $U$. Let $\hat{\Phi}$ be the analytic continuation of $\Phi$ to $U$ (which indeed exists because of the analyticity of $I$ and $\frac{\Delta I}{(\nabla I)^2}$). Then $\frac{\Delta I}{(\nabla I)^2} = \hat{\Phi}(I)$ in $U$ and in particular in $U_{in}$. On the other hand
$R''(I)(\nabla I)^2 + R'(I)\Delta I = \hat{F}(I)$ (16)
in $U_{in}$ and together with (15) implies that $\Delta I$ and $(\nabla I)^2$ depend only on $I$. The argument applies to the whole of $U$ by analyticity and in fact the property of agreeing fibrewise extends (Lemma 1) to the whole $M$ (although generally $\Delta I$ and $(\nabla I)^2$ can be written as functions of $I$ only locally). Since $I$ and $f_1$ agree fibrewise we have that $\beta_M(f_1)$ is an equilibrium partition.
Theorem 2 provides a complete characterization of the level sets of the solutions to problem (P). In the following sections we will explore the geometrical and topological meaning of the equilibrium condition. This theorem applies to Newtonian self-gravitating fluids, thus characterizing their equilibrium shapes on any Riemannian manifold. It also works for relativistic fluids without coupling between matter and geometry (see Eq. (1)). This kind of relativistic model, where the base space is predetermined, is used in some applications of interest [33]. In Sect. 9 it will be discussed how to extend our techniques to general relativistic fluids.
Remark 2. It is interesting to observe that we do not impose additional assumptions in order to characterize the structure of the level sets of the potential. In the literature additional hypotheses are usually considered: $|\nabla f_1|$ is a function of $f_1$ [34], existence of a "reference spherical model" [3] or physical constraints, e.g. positivity of the density and pressure, asymptotic structure of the potential and existence of a state equation [1, 2, 4].
5. General Geometric Properties of the Equilibrium Shapes
Theorem 2 reduces the original problem involving a difficult system of PDEs to a purely geometrical problem: the classification of equilibrium partitions on different spaces. For
instance, in Sects. 7 and 8, we will show examples of manifolds which do not admit equilibrium functions of certain types, thus obtaining by geometrical arguments an existence result: (P) cannot have these types of solutions on these spaces. Although the definition of equilibrium partition involves a particular function $I$ the concept is mainly geometrical, as the following proposition shows.
Proposition 1. Any analytic function representing an equilibrium partition is an equilibrium function.
Proof. By hypothesis there exists an equilibrium function $I$ representing the partition. Consider another function $\hat{I}$ representing the same partition. By using the same argument as in Lemma 5 it is immediate to prove that $\hat{I} = F(I)$ for a certain open set $U$, $F$ being an analytic function. Since $I$ is of equilibrium we have that locally (by the same argument) $(\nabla I)^2$ and $\Delta I$ are functions of $I$. Now a straightforward computation yields that $(\nabla\hat{I})^2 = F'(I)^2(\nabla I)^2$ and $\Delta\hat{I} = F''(I)(\nabla I)^2 + F'(I)\Delta I$. This implies that locally $(\nabla\hat{I})^2$ and $\Delta\hat{I}$ are functions of $\hat{I}$, thus proving that $\hat{I}$ is a local equilibrium function. The globalization of this property follows from Lemma 1.
Let us now prove a remarkable result which relates the equilibrium condition to the well known isoparametric property. Recall that a smooth function $f : M \to \mathbb{R}$ is called isoparametric if $(\nabla f)^2 = F(f)$ and $\Delta f = G(f)$ in $M$, $F$ and $G$ being differentiable functions [35]. A regular level set of an isoparametric function is called an isoparametric submanifold and the union of level sets is called an isoparametric family. This concept was first introduced by Levi-Civita [36], Cartan [37, 38] and Segre [39] in a purely geometrical context. Two good surveys on this topic are the works of Nomizu [40] and Thorbergsson [41]. Note that in the literature another definition of an isoparametric family is sometimes considered [42], which is not equivalent to the one considered in this paper.
Before proving the theorem let us fix some notation. Assume (without loss of generality) that the equilibrium function $I$ has $N$ different critical values $\{c_i\}_1^N$ (since $I$ is analytic the set of critical values is discrete in $\mathbb{R}$) and that $I(M) = (-\infty, +\infty)$. The manifold $M$ is therefore stratified as follows, $M = \bigcup_{i=1}^{N+1} M_i \cup C(I)$, $C(I)$ standing for the critical set of $I$ and $M_i = I^{-1}(c_{i-1}, c_i)$ ($c_0 = -\infty$ and $c_{N+1} = +\infty$). Each set $M_i$ is possibly made up of several connected components $M_i^j$.
Theorem 3. An equilibrium function is isoparametric on each $M_i^j$ and hence its regular level sets are isoparametric submanifolds.
Proof. In the open regions $M_i^j$ the equilibrium function $I$ is submersive and by Lemma 2 the partition is globally trivial. In Proposition 1 it was proved that $(\nabla I)^2$ and $\Delta I$ are functions of $I$ in a certain open subset of $M_i^j$. The globalization of this property to the whole $M_i^j$ stems from the existence of a global transversal (non-closed) curve to the fibres of $I$, which is a consequence of the triviality of the partition (note that $I$ can be adapted to a coordinate system in $M_i^j$ [18]).
It is interesting to observe that the isoparametric character of an equilibrium function is generally only local, the following example illustrating this fact.
Example 3. The analytic function $I(x, y) = \cos\sqrt{x^2 + y^2}$ in $(\mathbb{R}^2, \delta)$ is of equilibrium type (its partition is formed by concentric circles). On the contrary it is not a global
isoparametric function because $(\nabla I)^2 = 1 - I^2$ but $\Delta I$ cannot be globally expressed as a function of $I$ due to the existence of the critical fibres $r = i\pi$, $i \in \mathbb{N} \cup \{0\}$. Anyway, as proved in Theorem 3, $\Delta I$ is a well defined function of $I$ in the domains $M_i = \{i\pi < r < (i+1)\pi\}$ and a straightforward computation yields
$\Delta I = -I - \frac{(-1)^i \sqrt{1 - I^2}}{i\pi + \arccos((-1)^i I)}$.
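The closed form in Example 3 can be checked numerically. The sketch below compares finite-difference values of $(\nabla I)^2$ and $\Delta I$ for $I = \cos\sqrt{x^2 + y^2}$ with $1 - I^2$ and with the expression stated above on the annuli $i\pi < r < (i+1)\pi$. The sample points, the step size and the function names are assumptions made for illustration only.

```python
import math

def I(x, y):
    """Equilibrium function of Example 3 on the Euclidean plane."""
    return math.cos(math.hypot(x, y))

def grad_sq_and_laplacian(x, y, h=1e-4):
    """Central-difference estimates of (grad I)^2 and Laplacian(I)."""
    ix = (I(x + h, y) - I(x - h, y)) / (2 * h)
    iy = (I(x, y + h) - I(x, y - h)) / (2 * h)
    lap = (I(x + h, y) + I(x - h, y) + I(x, y + h) + I(x, y - h)
           - 4 * I(x, y)) / h ** 2
    return ix * ix + iy * iy, lap

def laplacian_formula(value, i):
    """Stated expression for Laplacian(I) in terms of I on i*pi < r < (i+1)*pi."""
    sign = (-1) ** i
    return -value - sign * math.sqrt(1 - value ** 2) / (
        i * math.pi + math.acos(sign * value))

def max_errors(annuli=(0, 1, 2, 3), offset=1.0, angle=0.7):
    """Largest deviations over one sample point per annulus."""
    worst_grad, worst_lap = 0.0, 0.0
    for i in annuli:
        r = i * math.pi + offset            # radius strictly inside the i-th annulus
        x, y = r * math.cos(angle), r * math.sin(angle)
        value = I(x, y)
        grad_sq, lap = grad_sq_and_laplacian(x, y)
        worst_grad = max(worst_grad, abs(grad_sq - (1 - value ** 2)))
        worst_lap = max(worst_lap, abs(lap - laplacian_formula(value, i)))
    return worst_grad, worst_lap

grad_error, lap_error = max_errors()
```

Both deviations should be at the level of the finite-difference error, which illustrates that the same function $I$ needs a different expression for $\Delta I$ on each annulus.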
The most remarkable feature of Theorem 3 is that the idea of isoparametric submanifold, which was introduced in Differential Geometry several decades ago, naturally arises in the physical context of Fluid Mechanics. It is important to note that other authors have also employed the isoparametric condition in order to study the partitions induced by the solutions of certain PDEs [43–45], but the techniques that we use are different to these authors’, specifically the analytic representation property (Theorem 1). It is worth mentioning the interesting paper of Shklover [46], where it is shown that the overdetermined Neumann and Dirichlet problems on certain manifolds admit solution if the boundaries are assumed to be isoparametric. The converse, i.e. the existence of solution implies that the boundary is isoparametric, is not proved. The literature on the isoparametric property is extensive, Theorems 2 and 3 connect it with the problem of classifying the shapes of static self-gravitating fluids. In the following sections, for its relevance in this context, we will state without proofs, some well known results about isoparametric submanifolds. Several other statements will be obtained for which we provide demonstrations because, to the best of our knowledge, they are new or at least not explicitly stated in any reference that we have consulted. The following theorem characterizes the general properties that all the equilibrium partitions must satisfy on any Riemannian manifold. It is a well known result to experts in isoparametric families, but we provide a proof for the sake of completeness and because we believe that it is unknown to most people working on Mathematical Physics. Theorem 4. The partition induced by any equilibrium function I on M has a trivial j fibre bundle structure on each Mi , each (regular) leaf has constant mean curvature and locally the (regular) leaves are geodesically parallel. j
Proof. The first statement has been proved in Theorem 3. In $M_i^j$ we have that $(\nabla I)^2 = F(I)$ and $\Delta I = G(I)$. The expression of the mean curvature (Eq. (2)) is the following:
$$H = \operatorname{div}\left(\frac{\nabla I}{|\nabla I|}\right) = \frac{G(I)}{\sqrt{F(I)}} + \frac{-F'(I)}{2\sqrt{F(I)}},$$
thus implying that H is constant on each (regular) leaf of the partition. Consider now the following equalities
$$D_X\, g(\nabla I, \nabla I) = D_X F(I) = F'(I)(\nabla I)_j X^j, \qquad (17)$$
$$D_X\, g(\nabla I, \nabla I) = 2g(D_X \nabla I, \nabla I) = 2(\nabla I)^j_{\;;k}(\nabla I)^k X_j, \qquad (18)$$
$D_X$ standing for the covariant derivative with respect to the vector field X. Identifying (17) and (18) we obtain that
$$g(D_{\nabla I}\nabla I, X) = g\left(\frac{F'(I)}{2}\nabla I, X\right) \;\Longrightarrow\; D_{\nabla I}\nabla I = \frac{F'(I)}{2}\nabla I,$$
which is the condition on the integral curves of $\nabla I$ to be tangent to geodesics. Call λ the parameter of the flow induced by $\nabla I$. Then
$$\frac{dI}{d\lambda} = (\nabla I)^2 = F(I) \;\Longrightarrow\; \int_{c_1}^{c_2}\frac{dI}{F(I)} = \lambda_2 - \lambda_1,$$
which depends exclusively on the fibres of I and not on the path chosen. Since the arc length is related to λ by $ds = \sqrt{F(I(\lambda))}\,d\lambda$, we get that the (regular) leaves of the partition in $M_i^j$ are geodesically parallel.
It is interesting to observe that the geodesic parallelism of the equilibrium partitions implies that they are Riemannian (singular) foliations [47, 48], a property which will be important in forthcoming sections. In fact Theorem 4 can be locally expressed as an equivalence.
Proposition 2. A Riemannian codimension 1 (singular) foliation whose non-singular leaves have constant mean curvature is locally an equilibrium partition.
Proof. Consider an open subset U ⊂ M small enough so that the foliation in U (which we assume regular) can be represented by a function I. If M is compact and simply connected and the foliation has trivial holonomy then I is defined on the whole of M [49]. Since the leaves are parallel let us prove that I must satisfy $(\nabla I)^2 = F(I)$ in U. Indeed it is immediate to see that $X((\nabla I)^2) = 2g(D_X\nabla I, \nabla I) = 2g(D_{\nabla I}\nabla I, X)$ for any vector field X on U. Since the foliation is geodesically parallel, the gradient lines of I are tangent to geodesics, that is $D_{\nabla I}\nabla I = \psi\nabla I$ for a certain real-valued function ψ on U. Combining these identities, we get that $X((\nabla I)^2) = 2\psi X(I)$ and therefore the symmetries of I are also symmetries of $(\nabla I)^2$, thus implying, via the Frobenius theorem, that $(\nabla I)^2$ is (locally) a function of I. The constancy of the mean curvature H on the leaves is expressed as $H = \operatorname{div}(\nabla I/|\nabla I|) = G(I)$. After some computations, and taking into account that $(\nabla I)^2 = F(I)$, one readily gets that $\Delta I$ is also a function of I in U, thus proving the (local) equilibrium property.
If we assume that the foliation is locally trivial (this is the case when M is compact and there are no dense leaves [47, 48]) then we can extend U to a saturated set in such a way that the first integral I is well defined on the whole of it. In general it will be defined on any globally trivial saturated set. In the following section we show further properties of the equilibrium shapes for certain particularly relevant spaces. The more we want to characterize an equilibrium partition the more we have to restrict the topology and geometry of the base space.
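The step "after some computations" in the proof of Proposition 2 can be sketched as follows. This is only a worked derivation under the assumption $(\nabla I)^2 = F(I)$ with $F > 0$ on regular leaves, writing $|\nabla I| = \sqrt{F(I)}$; it is not taken verbatim from the text.

```latex
\begin{align*}
H &= \operatorname{div}\!\left(\frac{\nabla I}{|\nabla I|}\right)
   = \frac{\Delta I}{|\nabla I|}
     - \frac{\nabla I\cdot\nabla|\nabla I|}{|\nabla I|^{2}},\\[2pt]
\nabla|\nabla I| &= \nabla\sqrt{F(I)}
   = \frac{F'(I)}{2\sqrt{F(I)}}\,\nabla I
   \;\Longrightarrow\;
   \nabla I\cdot\nabla|\nabla I| = \tfrac12 F'(I)\sqrt{F(I)},\\[2pt]
H &= \frac{\Delta I}{\sqrt{F(I)}} - \frac{F'(I)}{2\sqrt{F(I)}}
   \;\Longrightarrow\;
   \Delta I = \sqrt{F(I)}\,H + \tfrac12 F'(I).
\end{align*}
```

Hence, once $(\nabla I)^2$ is a function of I, the mean curvature H is a function of I exactly when $\Delta I$ is, which is consistent with the mean curvature formula used in the proof of Theorem 4.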
6. Classification of the Equilibrium Shapes on Certain Spaces

Isoparametric submanifolds in the Euclidean space $(\mathbb{R}^n, \delta)$ are classified: they possess constant principal curvatures and hence must be globally isometric to $\mathbb{R}^{n-1}$, $S^{n-1}$ or $S^{n-1-k} \times \mathbb{R}^k$ ($1 \le k \le n-2$) with their respective canonical metrics [39, 41]. From this result it is straightforward to obtain the classification of equilibrium partitions on $(\mathbb{R}^n, \delta)$.
Proposition 3. The equilibrium partitions on the Euclidean space $(\mathbb{R}^n, \delta)$ are given by concentric spheres $S^{n-1}$, parallel hyperplanes $\mathbb{R}^{n-1}$ or parallel coaxial cylinders $S^{n-1-k} \times \mathbb{R}^k$ ($1 \le k \le n-2$).
Proof. Theorems 3 and 4 and Cartan's classification of isoparametric submanifolds in $(\mathbb{R}^n, \delta)$ prove the claim on each $M_i^j$. The globalization follows from Lemma 1.
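As a quick sanity check of Proposition 3, the two equilibrium conditions $(\nabla I)^2 = F(I)$ and $\Delta I = G(I)$ can be verified by hand for one representative function per family. The particular functions I below, and the choice of which coordinates they involve, are illustrative assumptions and do not come from the text.

```latex
\begin{align*}
\text{hyperplanes:}\quad & I = x_1,
  & (\nabla I)^2 &= 1, & \Delta I &= 0,\\
\text{spheres:}\quad & I = \tfrac12\sum_{a=1}^{n} x_a^2,
  & (\nabla I)^2 &= 2I, & \Delta I &= n,\\
\text{cylinders:}\quad & I = \tfrac12\sum_{a=1}^{n-k} x_a^2,
  & (\nabla I)^2 &= 2I, & \Delta I &= n-k .
\end{align*}
```

In each case both quantities depend on the point only through I, and the regular level sets are respectively $\mathbb{R}^{n-1}$, $S^{n-1}$ and $S^{n-1-k}\times\mathbb{R}^k$.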
Accordingly, a static self-gravitating fluid on the Euclidean space must take the shape of a round sphere, a cylinder or a region bounded by parallel hyperplanes, thus recovering the classical results of Lichtenstein and Lindblom on compact Newtonian fluids in (R3 , δ) [1, 2]. Corollary 1. If Rn is provided with a conformally flat metric, i.e. g = e2φ δ, and I is an equilibrium function which agrees fibrewise with φ, then the equilibrium partitions βRn (I ) are the same as in the Euclidean case. Proof. A straightforward computation yields that I is also an equilibrium function in (Rn , δ), and hence Proposition 3 applies. Since φ and I agree fibrewise then the partitions induced by I on (Rn , g) and (Rn , δ) are globally isometric, thus proving the claim. Note that the assumption of I and φ agreeing fibrewise is usually considered in the literature on relativistic fluids [21], where the geometry of the base space is coupled with the matter. Isoparametric submanifolds are also classified in the hyperbolic space Hn [37] and therefore a result analogous to Proposition 3 can be obtained, i.e. a detailed classification of the equilibrium partitions of Hn . Proceeding as in Proposition 3 it is immediate to prove the following claim. Proposition 4. The equilibrium partitions on the hyperbolic space Hn are given by concentric spheres S n−1 , parallel hyperplanes Hn−1 or parallel coaxial cylinders S n−1−k × Hk (1 ≤ k ≤ n − 2). A complete classification of the isoparametric submanifolds on the sphere S n has not yet been accomplished, see e.g. [41]. The partition formed by concentric spheres S n−1 is of equilibrium, but this does not exhaust all the possibilities, although from the physical viewpoint this is indeed the most relevant situation (see the following section for details). Apart from the canonical constant curvature manifolds, a geometric characterization of the equilibrium partitions can also be obtained for other Riemannian spaces. 
A particularly interesting case is when (M, g) is non-compact and has non-negative Ricci curvature.
Proposition 5. If $I : M \to \mathbb{R}$ is a submersive equilibrium function on (M, g) then the leaves of $\beta_M(I)$ are totally geodesic submanifolds and $\nabla I$ is a Killing vector field.
Proof. Since I is a submersion, it is a global isoparametric function and hence $(\nabla I)^2 = F(I) \neq 0$ and $\Delta I = G(I)$. Without loss of generality assume that $F(I) = 1$; this is equivalent to representing the partition $\beta_M(I)$ by the function $\hat I = \int dI/\sqrt{F(I)}$. Note that $L_{\nabla I} I = \dot I = 1$ and therefore $\nabla I$ is complete, thus implying that $I(M) = \mathbb{R}$. Bochner's formula [50] for the function I reads as
$$\tfrac12 \Delta(\nabla I)^2 = \nabla I \cdot \nabla \Delta I + \mathrm{Ricci}(\nabla I, \nabla I) + \|D^2 I\|^2,$$
where the left-hand side vanishes because $(\nabla I)^2 = 1$ and $\nabla I\cdot\nabla\Delta I = G'(I)$, and therefore $\|D^2 I\|^2 = -\mathrm{Ricci}(\nabla I, \nabla I) - G'(I)$. Taking into account that Ricci ≥ 0 and the well known inequality $\|D^2 I\|^2 \ge \frac{1}{n-1}(\Delta I)^2$ we get
$$G'(I) \le -\frac{G(I)^2}{n-1}.$$
One can now see via integration that this differential inequality is satisfied by a function G verifying $G \le \frac{n-1}{c_1 + I} + c_2$, $c_1, c_2 \in \mathbb{R}$. Since I takes any real value we would have $\lim_{I \to -c_1^-} G(I) = -\infty$, which is a contradiction, and hence the only global solution to the inequality is $G(I) = 0$. Substituting this expression into Bochner's formula it is concluded that $\|D^2 I\|^2 = 0$, which is the condition for the fibres of I to be totally geodesic. Let us now prove that $\nabla I$ is a Killing vector field of (M, g). Recalling the expression for the second fundamental form of the fibres of I (Sect. 2) and taking into account that it is zero we get $L_{\nabla I}(g_{ab}) - L_{\nabla I}((\nabla I)_a(\nabla I)_b) = 0$. A rather long, although not difficult, computation shows that the second term of this expression vanishes (just assume that $(\nabla I)^2 = 1$), thus proving the claim.
It is interesting to observe that Proposition 5 implies that M splits isometrically as $M = N \times \mathbb{R}$, where N is a fibre of I, thus recovering the Cheeger-Gromoll splitting theorem [51]. In fact this theorem was originally proved without assuming the existence of a submersive equilibrium function, but showing that it always exists (Busemann function), although it is not generally analytic.
To end this section let us focus on 3-manifolds $(M^3, g)$. The notation that we will use is explained in Sect. 2. The following elementary lemma will be useful.
Lemma 6. Let S be a codimension 1 submanifold of $M^3$ such that the scalar curvature R of $M^3$, the intrinsic scalar curvature of S and $R_{ab} n^a n^b$ are constant on S. Then the Gauss curvature of S is also constant.
Proof. Let u and v be two orthonormal vectors tangent to S at the point p ∈ S. The sectional curvature K of $M^3$ restricted to S is given by
$$K = R_{ab}(u^a u^b + v^a v^b) - \frac{R}{2}.$$
The expression $u^a u^b + v^a v^b$ is a projection tensor onto S and therefore $u^a u^b + v^a v^b = \beta^{ab}$, thus implying that $K = \frac{R}{2} - R_{ab} n^a n^b$. The intrinsic sectional curvature of S equals one half of the intrinsic scalar curvature of S. The assumptions of the lemma yield that both sectional curvatures are constant on S and hence, by Gauss theorem, the claim follows.
When $M^3$ is flat, conformally flat or locally symmetric, and there exists some relationship between the geometry and the equilibrium function I, then further geometrical properties of the equilibrium partition $\beta_{M^3}(I)$ can be obtained.
Proposition 6. Let $I : M^3 \to \mathbb{R}$ be an equilibrium function on a 3-manifold satisfying either of the following:
1. $M^3$ is conformally flat, i.e. $g = e^{2\phi}\delta$, and $\beta_{M^3}(I) = \beta_{M^3}(\phi)$. If $M^3$ is flat this assumption is not necessary.
2. $M^3$ is locally symmetric and each fibre of I has parallel second fundamental form.
Then the (regular) leaves of the equilibrium partition $\beta_{M^3}(I)$ have constant principal curvatures.
Proof. First consider the conformally flat case. Let S be a regular leaf of $\beta_{M^3}(I)$. The Ricci tensor and the scalar curvature of $(M^3, e^{2\phi}\delta)$ are given by
$$R_{ab} = \phi_{ab} - \phi_a\phi_b + \delta_{ab}\left(\Delta_E\phi + (\nabla_E\phi)_E^2\right), \qquad (19)$$
$$R = e^{-2\phi}\left(4\Delta_E\phi + 2(\nabla_E\phi)_E^2\right), \qquad (20)$$
the subscript E meaning that the corresponding operator has the Euclidean form. It is immediate to check that
$$(\nabla\phi)^2 = e^{-2\phi}(\nabla_E\phi)_E^2, \qquad (21)$$
$$\Delta\phi = e^{-2\phi}\left((\nabla_E\phi)_E^2 + \Delta_E\phi\right). \qquad (22)$$
Since the partitions of I and φ agree fibrewise we have that φ is an equilibrium function. Now from Eqs. (20), (21) and (22) it is evident that R is constant on S. Note by looking at Eq. (3) that if $H_{ab}H^{ab}$ and $R_{ab}n^a n^b$ are both constant on S then the intrinsic scalar curvature of S is also constant. The following computation is immediate:
$$H_{ab}H^{ab} = \tfrac14 (L_n\beta_{ab})(L_n\beta^{ab}) = \tfrac14\Big[\,3(L_n e^{2\phi})^2 - (L_n e^{2\phi})(L_n n_a)n^a - (L_n e^{2\phi})(L_n n_b)n^b - (L_n e^{2\phi})(L_n n_a)n^a + (L_n n_a)(L_n n^a) + (L_n n_a)n^a (L_n n_b)n^b - (L_n e^{2\phi})(L_n n_b)n_a\delta^{ab} + (L_n n_a)n^a(L_n n_b)n^b + (L_n n_b)(L_n n^b)\Big].$$
As $(L_n n_a)n^a = \tfrac12 L_n(n_a n^a) = 0$ the above expression simplifies to
$$H_{ab}H^{ab} = \tfrac14\Big[\,3(L_n e^{2\phi})^2 + 2(L_n n_a)(L_n n^a)\Big]. \qquad (23)$$
The vector field n normal to S is defined as $n = \nabla I/|\nabla I|$, and hence it is immediate to check that the first summand in Eq. (23) is constant on S. For the second summand notice the following computation:
$$(L_n n_a)(L_n n^a) = |\nabla I|^2\left(L_n\frac{1}{|\nabla I|}\right)^2 + \frac{1}{|\nabla I|}\left(L_n\frac{1}{|\nabla I|}\right)(L_n|\nabla I|^2) + \frac{1}{|\nabla I|^2}(L_n(\nabla I)_a)(L_n(\nabla I)^a).$$
One readily gets that the first and the second summands are constant on S. The third one requires more computations. Indeed
$$(L_n(\nabla I)_a)(L_n(\nabla I)^a) = \frac{1}{|\nabla I|^2}(\nabla I)^b(\nabla I)^c\,\frac{\partial(\nabla I)_a}{\partial x_b}\frac{\partial(\nabla I)^a}{\partial x_c} = \frac{1}{|\nabla I|^2}\,e^{-4\phi}\,\frac{\partial I}{\partial x_b}\frac{\partial I}{\partial x_c}\frac{\partial^2 I}{\partial x_a\partial x_b}\left(\frac{\partial e^{-2\phi}}{\partial x_c}\frac{\partial I}{\partial x_a} + e^{-2\phi}\frac{\partial^2 I}{\partial x_c\partial x_a}\right).$$
On the other hand $\frac{\partial^2 I}{\partial x_a\partial x_b}\frac{\partial I}{\partial x_a} = \tfrac12\frac{\partial}{\partial x_b}(\nabla_E I)_E^2$ and φ agrees fibrewise with I. Taking these facts into account and after some computations it is obtained that $(L_n(\nabla I)_a)(L_n(\nabla I)^a)$ is constant on S and hence $H_{ab}H^{ab}$ is also constant. Similar computations show the constancy of $R_{ab}n^a n^b$ on S, thus proving that the intrinsic scalar curvature of S is also constant. Note now that Lemma 6 can be applied to conclude that the Gauss curvature of S is constant. Since S has dimension 2 and its mean curvature is also constant the proposition for conformally flat manifolds follows.
Now let us focus on the locally symmetric spaces, i.e. $R_{abcd;m} = 0$. It is immediate that R is constant on S. Denote by || the induced covariant derivative on S. The following equation is readily obtained:
$$0 = \beta^{ab}_{\;\;\,||e} = -(n^a n^b)_{||e}.$$
Therefore $(R_{ab}n^a n^b)_{||e} = R_{ab}(n^a n^b)_{||e} = 0$, which means that $R_{ab}n^a n^b$ is constant on S. Since the second fundamental form of S is parallel (i.e. $H_{ab||e} = 0$) we have that $(H_{ab}H^{ab})_{||e} = 0$, thus concluding that the intrinsic scalar curvature of S, and hence the Gauss curvature, is constant on S.
As we mentioned concerning Corollary 1, the assumption of coupling between the equilibrium function and the underlying geometry (e.g. the conformal factor φ) is usual in the general relativistic setting. We are not aware whether equilibrium functions not fulfilling this hypothesis exist and, if so, what geometrical properties they have.
Remark 3. It would be interesting to obtain a full geometrical description of the equilibrium partitions for the 8 canonical 3-dimensional geometries [52]. The results on constant curvature and locally symmetric spaces obtained in this section apply to 5 of these manifolds, i.e. $\mathbb{R}^3$, $\mathbb{H}^3$, $S^3$, $S^2 \times \mathbb{R}$ and $\mathbb{H}^2 \times \mathbb{R}$.
The geometrical properties of equilibrium partitions obtained in this section are relevant from the physical viewpoint because they give rise to the possible shapes of a static self-gravitating fluid on different Riemannian spaces.

7. Existence of Equilibrium Shapes and Fluid-Composed Stars

In general it is a difficult task to know whether an equilibrium function exists on a Riemannian manifold. This problem is interesting not only from the mathematical viewpoint but also from the physical one. Indeed, if a certain space does not admit equilibrium partitions then a self-gravitating fluid would never reach static equilibrium, a non-existence result for the set (P) of PDEs. It would be desirable to classify the spaces admitting equilibrium functions, which would be the suitable spaces for doing relevant physics. In this section we will obtain some restrictions to the existence of equilibrium partitions. These restrictions are of geometrical or topological type.
A mass of fluid generally encloses a contractible domain, e.g. think of a fluid-composed star, and hence the equipotential sets are contractible to an interior point. Let us prove that in this case the equilibrium partition in the fluid domain has only one focal point (recall that the focal set is the set of points where the lines of $\nabla I$ intersect each other, i.e. the critical points). Since I is analytic in the fluid domain and the critical set of an analytic function does not possess endpoints [53], the only possibility for the (interior) focal set is that it is formed by a single point, since otherwise the fluid domain would not be contractible (recall that the leaves of the distance function are tubes around the focal sets [35]). Accordingly it is physically relevant to study the manifolds (M, g) which admit equilibrium partitions possessing an isolated focal point.
Recall that the space (M, g) is harmonic with respect to p ∈ M [54] if the determinant of the metric in normal Riemann coordinates is a function of the geodesic distance to p. If G is the determinant of the metric in polar Riemann coordinates then it is well known that the determinant of the metric in normal Riemann coordinates is $\tilde G = G\,r^{2-2n}\,(\theta)^{-2}$, where $(\theta)^2$ is the determinant of the metric of the round sphere $S^{n-1}$ in spherical coordinates. Therefore the condition of harmonicity with respect to p can be expressed as $\tilde G = F(r)^2$ or $G = F(r)^2 r^{2n-2}(\theta)^2$. Set $A(r) = F(r)\,r^{n-1}$.
Proposition 7. A local equilibrium partition with an isolated focal point p ∈ M exists on (M, g) if and only if the space is harmonic with respect to p. In this case the equilibrium partition is locally formed by geodesic spheres.
Proof. Recall that in polar Riemann coordinates centered at p ∈ M the metric tensor is locally expressed as $ds^2 = dr^2 + G_{ij}(r, \theta)\,d\theta^i d\theta^j$. The sufficiency condition stems from the fact that the function $I = \tfrac12 r^2$ is of equilibrium. Indeed $(\nabla I)^2 = r^2 = 2I$ and $\Delta I = \frac{\partial_r(rA(r))}{A(r)}$, which is an analytic function of r because $A(r) = r^{n-1} + O(r^n)$. Therefore $r^2$ induces a local equilibrium partition (the geodesic spheres) whose focal point is p. Conversely, if one has an equilibrium partition with a focal set formed by the point p then the geodesic parallelism of the leaves implies that the partition must be formed by geodesic spheres centered at p. This stems from the fact that the focal varieties of Riemannian (singular) foliations are smooth submanifolds of M and the regular leaves of the partition are tubes (constant distance) over either of the focal varieties [35]. On account of Proposition 1 the function $I = \tfrac12 r^2$ representing the same partition must be of equilibrium. The condition of $\Delta I$ being a function of r is expressed as $\partial_r \ln(r\sqrt{G}) = C(r)$, and a straightforward integration yields that $G = A(r)^2 B(\theta)^2$. Since $\tilde G = \frac{A(r)^2 B(\theta)^2}{r^{2n-2}(\theta)^2}$ and $\tilde G = 1$ at p (r = 0) we obtain that $B(\theta) = (\theta)$ (note that $A(r) = r^{n-1}(1 + O(r))$), thus implying that $\tilde G = F(r)^2$.
Since polar Riemann coordinates are only local, the partition formed by concentric geodesic spheres might not be globally defined (at least as an analytic partition). The globalization exists, for example, when the exponential map is a global diffeomorphism, e.g. if the space is simply connected and the sectional curvature is non-positive (Cartan-Hadamard's theorem). Important examples of harmonic spaces with respect to p are provided by manifolds which are rotationally symmetric around p; in fact in dimension 2 this is always the case ($G = A(r)^2(\theta)^2$ ⇐⇒ rotational symmetry with respect to p).
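To illustrate the sufficiency part of the proof, the formula $\Delta I = \partial_r(rA(r))/A(r)$ for $I = \tfrac12 r^2$ can be evaluated on the constant curvature model spaces. The explicit radial factors $A(r)$ used below (unit curvature normalisation) are standard facts supplied here as assumptions for the sketch; they are not written out in the text.

```latex
\begin{align*}
\mathbb{R}^n:&\quad A(r) = r^{\,n-1},
  &\Delta I &= \frac{\partial_r\!\left(r^{\,n}\right)}{r^{\,n-1}} = n,\\[2pt]
S^n:&\quad A(r) = \sin^{\,n-1} r,
  &\Delta I &= 1 + (n-1)\,r\cot r,\\[2pt]
\mathbb{H}^n:&\quad A(r) = \sinh^{\,n-1} r,
  &\Delta I &= 1 + (n-1)\,r\coth r .
\end{align*}
```

In all three cases $\Delta I$ depends on r alone, hence on $I = \tfrac12 r^2$, and it is analytic at $r = 0$ since $r\cot r$ and $r\coth r$ are even analytic functions of r near the origin. Together with $(\nabla I)^2 = 2I$ this exhibits the geodesic spheres as a local equilibrium partition.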
If (M, g) is harmonic with respect to any point then it is called a harmonic manifold. Particular cases of harmonic manifolds are the canonical constant curvature spaces $S^n$, $\mathbb{R}^n$ and $\mathbb{H}^n$, where the local partitions can be globalized (in $S^n$ a second focal point appears). A remarkable physical consequence is that in all these manifolds (static) fluid-composed stars can exist. Another consequence is that spaces whose metric does not satisfy the assumption of harmonicity with respect to any point will not have (local) equilibrium partitions with isolated focal points. In these spaces static fluid-composed stars cannot exist and hence they are not physically admissible.
Remark 4. A physically realistic 3-space (the universe) must allow the existence of static contractible fluid domains around any point (the positions of fluid-composed stars should not be privileged!). This implies that $M^3$ is harmonic, which in dimension 3 can be proved [54] to be equivalent to being two-point homogeneous. Two-point homogeneous manifolds are classified [55]; in dimension 3 they are (up to local isometry) $\mathbb{R}^3$, $\mathbb{H}^3$ and $S^3$. Therefore the existence of static fluid stars implies that the universe must be a constant curvature manifold, thus justifying the main hypothesis of the standard cosmological models.
It is interesting to observe that for any Riemannian manifold (M, g) and any given point p ∈ M there exists a smooth conformal factor [56] such that in the new metric the geodesic spheres centered at p form locally an equilibrium partition, thus allowing the existence of contractible fluid domains in equilibrium.
The following proposition establishes a topological constraint on M in order that a submersive equilibrium function exists.
Proposition 8. Let I be an equilibrium function such that $dI \neq 0$. Then $M \cong N \times \mathbb{R}$.
Proof. Since $dI \neq 0$, then $M_i^j = M$ and therefore Theorem 4 implies that I is a globally trivial fibre bundle with a global transversal curve. Since the lines of a gradient vector field are not closed, this curve is diffeomorphic to $\mathbb{R}$, thus proving the claim. Of course $N \cong I^{-1}(0)$.
Proposition 8 shows that submersive equilibrium functions do not exist if M is not diffeomorphic to $N \times \mathbb{R}$. It would be interesting to get topological restrictions ensuring the non-existence of any equilibrium function. In Sect. 8 we will prove that surfaces without Killing vector fields do not admit equilibrium partitions (Proposition 9). Note also that when M has non-negative Ricci curvature the claim of Proposition 8 can be improved, as was shown in Proposition 5.

8. Equilibrium Shapes, Killing Vector Fields and Isoperimetric Domains

The results of Sect. 7 suggest that equilibrium partitions are usually linked to certain geometric structures of the manifold. If these geometric structures fail to exist then equilibrium partitions do not exist. For example, consider the equilibrium partitions of the Euclidean space (Proposition 3). These partitions have the remarkable property of being generated by isometries of $(\mathbb{R}^n, \delta)$. Furthermore, as a consequence of Proposition 7, the equilibrium partitions with p as focal point are induced by isometric group actions on M whenever the space is rotationally symmetric around p. These facts indicate that the isometries of the manifold are somehow related to the equilibrium functions. The following proposition establishes the equivalence between both concepts for surfaces.
Proposition 9. Let (M, g) be a 2-dimensional Riemannian space. Then the equilibrium partitions are 1-dimensional (singular) foliations generated by Killing vector fields of (M, g), and conversely, any Killing vector field whose orbits are closed in M is tangent to the level sets of an equilibrium function.
Proof. Let ξ be a Killing vector field of (M, g).
Since its orbits are closed it follows that the action of ξ on M is proper. Theorem 5 proves that the foliation defined by ξ is of equilibrium. Consider now an orthogonal (local) coordinate system on $M_i^j$ defined by the functions (I, J), I being an equilibrium function on M. Recall that on $M_i^j$ the function I is isoparametric and hence $(\nabla I)^2 = F(I)^2$ and $\Delta I = G(I)$. Since $\partial_I \cdot \partial_I = \frac{1}{F(I)^2}$ and $\partial_I \cdot \partial_J = 0$, the expression of the metric tensor in the new coordinates is $ds^2 = F(I)^{-2}dI^2 + N(I, J)^2 dJ^2$. A straightforward computation yields that $G(I) = \Delta I = F(I)^2\,\partial_I(\log NF)$, and hence $N(I, J)^2 = A(I)B(J)$. If we define $d\hat J = \sqrt{B(J)}\,dJ$ we get that the metric in coordinates $(I, \hat J)$ is a warped product $ds^2 = F(I)^{-2}dI^2 + A(I)\,d\hat J^2$, thus implying that the vector field $\partial_{\hat J}$ (tangent to the level sets of I) is a local Killing vector field. The analyticity of the metric implies that this Killing vector field (which is also analytic) globalizes [57].
If I is a submersive equilibrium function then (Proposition 8) $M \cong \mathbb{R}^2$ or $S^1 \times \mathbb{R}$, and therefore the warped product expression in Proposition 9 can be globalized to $ds^2 = dI^2 + A(I)\,d\hat J^2$, where it has been assumed without loss of generality that $(\nabla I)^2 = 1$.
Remark 5. As a consequence of Proposition 9 equilibrium functions do not exist on surfaces not admitting Killing vector fields, e.g. negative curvature tori of genus g ≥ 2 [50], this being the “generic” situation.
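The "straightforward computation" in the proof of Proposition 9 can be sketched with the coordinate expression of the Laplacian. The derivation below assumes only the orthogonal form $ds^2 = F(I)^{-2}dI^2 + N(I,J)^2dJ^2$ stated in the proof; the auxiliary functions $a$ and $b$ are labels introduced for this sketch.

```latex
\begin{align*}
\sqrt{\det g} &= \frac{N}{F}, \qquad g^{II} = F^2, \qquad g^{IJ} = 0,\\[2pt]
\Delta I &= \frac{1}{\sqrt{\det g}}\,
            \partial_I\!\left(\sqrt{\det g}\; g^{II}\right)
          = \frac{F}{N}\,\partial_I (N F)
          = F^2\,\partial_I \log (N F),\\[2pt]
\Delta I = G(I)
 &\;\Longrightarrow\;
 \partial_I \log N = \frac{G(I)}{F(I)^2} - \frac{F'(I)}{F(I)}
 \;\Longrightarrow\;
 \log N = a(I) + b(J).
\end{align*}
```

Since the right-hand side of $\partial_I \log N$ depends on I alone, N factorizes into a function of I times a function of J, which is what allows the J-dependence to be absorbed into a new coordinate $\hat J$ and yields the warped product form.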
Part of Proposition 9 can be generalized to higher dimension, as we prove in the next theorem.
Theorem 5. Let $\{\xi_1, \dots, \xi_p\}$, p ≥ n − 1, be a Lie algebra of Killing vector fields of (M, g) such that $\operatorname{rank}(\xi_1, \dots, \xi_p) = n - 1$ in M, up to a null measure set, and which generates a closed subgroup of the group of isometries. Then the (singular) foliation induced by this Lie algebra is an equilibrium partition.
Proof. The Lie algebra generates an isometric group action G on M. G is connected, simply connected (take the universal covering) and closed in the group of isometries (by assumption). This defines a proper group action on the manifold and therefore M can be divided into two connected components [58]: the principal part $M^*$, which is open and dense in M, and the singular part, which is formed by totally geodesic submanifolds. $M^*$ is foliated by codimension 1 closed submanifolds of M; in fact this foliation is a Riemannian submersion from $M^*$ to $M^*/G$ [48]. Note that $M^*/G$ is a differentiable Hausdorff 1-manifold, and therefore diffeomorphic to $\mathbb{R}$ or $S^1$. The submersion is analytic because we always assume in this paper that (M, g) is analytic, and therefore so are the Killing vector fields. Call f the function representing the foliation in $M^*$; since it is a Riemannian submersion, f satisfies $(\nabla f)^2 = F(f)$ in the whole of $M^*$, as proved in Proposition 2. Since the action of G is transitive on each leaf (the leaves are extrinsically homogeneous, that is, homogeneous by isometries of the ambient space) the mean curvature must be constant at all points of the leaf. This follows from the fact that the second fundamental forms at two different points connected by an isometry correspond through this isometry. In terms of f this condition is expressed as $H = \operatorname{div}\left(\frac{\nabla f}{|\nabla f|}\right) = H(f)$. The following computation,
$$\operatorname{div}\left(\frac{\nabla f}{|\nabla f|}\right) = \frac{\Delta f}{|\nabla f|} - \frac{\nabla f\cdot\nabla(|\nabla f|)}{(\nabla f)^2},$$
readily implies that $\Delta f = G(f)$. Since the non-principal set is nowhere dense the isoparametric condition extends to the whole of M and therefore the foliation is of equilibrium. Note that the extended f is a function over $\mathbb{R}$ or $S^1$ and it could fail to be analytic in the singular set.
Remark 6. In general it is necessary to require that G is closed in the group of isometries. For example, take the flat 2-torus S 1 × S 1 and consider the action by the real line which is given by an irrational translation. This induces a Killing vector field, but the group generated is not closed in the isometry group of the torus, which we know is compact (it is O(2) × O(2)). In fact this action is not proper since the orbits are not (properly) embedded. Similar examples can be constructed in greater dimension. Note that Theorem 5 generalizes, in the Riemannian setting for arbitrary dimension, Theorem 1 in [59]. In general the converse of this theorem is true only for 2-dimensional manifolds (Proposition 9). Indeed consider a manifold which is not rotationally symmetric with respect to the point p but it is harmonic with respect to it. Then the geodesic spheres around p are equilibrium submanifolds but they are not induced by an isometric group action, thus showing that the converse theorem does not generally hold. It would be interesting to find conditions in order that the equilibrium partitions of a manifold be (singular) foliations induced by isometric group actions. All these results show the deep relationship between isometries and equilibrium and suggest that physically relevant spaces should possess enough Killing vector fields. Consequently an effective procedure in order to obtain equilibrium partitions, and hence equilibrium configurations of self-gravitating fluids, is to compute the Killing vector fields of the space. It is probable that spaces which just admit a few isometries (or even no one) do not admit equilibrium functions either (as in dimension 2), lacking static configurations. Let us illustrate Theorem 5 with an example.
Example 4. Consider the space $\mathbb{H}^2 \times \mathbb{R}$ endowed with the metric $ds^2 = \frac{dx^2 + dy^2}{F^2} + dz^2$, where $F = \frac{2 - x^2 - y^2}{2}$ and $\mathbb{H}^2 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 2\}$. The Killing vector fields of this manifold are: $X_1 = (F + y^2)\partial_x - xy\,\partial_y$, $X_2 = -xy\,\partial_x + (F + x^2)\partial_y$, $X_3 = -y\,\partial_x + x\,\partial_y$ and $X_4 = \partial_z$. A straightforward computation yields that $(\nabla f)^2 = F^2(f_x^2 + f_y^2) + f_z^2$ and $\Delta f = F^2(f_{xx} + f_{yy}) + f_{zz}$. Some easy, although long, computations show that the codimension 1 partitions (up to null measure set) induced by the Killing vector fields are:
– $\{X_3, X_4\} \Longrightarrow f = x^2 + y^2$, which is an equilibrium function.
– $\{X_i, X_j\}$, $i \neq j = 1, 2, 3 \Longrightarrow f = z$, which is an equilibrium function.
– $\{X_1, X_4\} \Longrightarrow f = \frac{x^2 + y^2 - 2}{y}$, which is an equilibrium function (with a singular set).
– $\{X_2, X_4\} \Longrightarrow f = \frac{x^2 + y^2 - 2}{x}$, which is an equilibrium function (with a singular set).
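The third item of the list can be checked numerically. The sketch below is only an illustration, not the authors' computation: it uses central finite differences (step `h`, an arbitrary choice) and the formulas for $(\nabla f)^2$ and $\Delta f$ quoted in Example 4. The closed forms $(\nabla f)^2 = 2f^2 + f^4/4$ and $\Delta f = f^3/2$ that it tests against were derived by hand for this sketch and are not stated in the text; the helper names and sample points are hypothetical.

```python
def conformal_factor(x, y):
    # F = (2 - x^2 - y^2) / 2, as in Example 4 (valid for x^2 + y^2 < 2).
    return (2.0 - x * x - y * y) / 2.0


def f(x, y, z):
    # Function attached to the pair {X1, X4} in Example 4 (independent of z).
    return (x * x + y * y - 2.0) / y


def first_and_second_derivatives(func, x, y, z, h=1e-4):
    # Central finite differences for first partials and pure second partials.
    f0 = func(x, y, z)
    xp, xm = func(x + h, y, z), func(x - h, y, z)
    yp, ym = func(x, y + h, z), func(x, y - h, z)
    zp, zm = func(x, y, z + h), func(x, y, z - h)
    first = ((xp - xm) / (2 * h), (yp - ym) / (2 * h), (zp - zm) / (2 * h))
    second = ((xp - 2 * f0 + xm) / h**2,
              (yp - 2 * f0 + ym) / h**2,
              (zp - 2 * f0 + zm) / h**2)
    return first, second


def residuals(point):
    # Returns the absolute residuals of three identities at the given point:
    #   (grad f)^2 - (2 f^2 + f^4 / 4),  Laplacian f - f^3 / 2,  X1 f.
    x, y, z = point
    (fx, fy, fz), (fxx, fyy, fzz) = first_and_second_derivatives(f, x, y, z)
    c = conformal_factor(x, y)
    grad_sq = c**2 * (fx**2 + fy**2) + fz**2      # formula quoted in Example 4
    laplacian = c**2 * (fxx + fyy) + fzz          # formula quoted in Example 4
    x1_f = (c + y * y) * fx - x * y * fy          # X1 = (F + y^2) d_x - x y d_y
    v = f(x, y, z)
    return (abs(grad_sq - (2 * v**2 + v**4 / 4)),
            abs(laplacian - v**3 / 2),
            abs(x1_f))


# Hypothetical sample points inside the disc x^2 + y^2 < 2, away from y = 0.
SAMPLE_POINTS = [(0.3, 0.5, 0.0), (-0.4, 0.7, 1.2), (0.1, -0.9, -2.0), (0.8, 0.6, 0.5)]
ALL_RESIDUALS = [residuals(p) for p in SAMPLE_POINTS]
```

Small residuals at every sample point are consistent with f being constant along $X_1$ and $X_4$ and with both $(\nabla f)^2$ and $\Delta f$ being functions of f alone, which is the equilibrium property. The singular set mentioned in the example corresponds to $y = 0$, where f is not defined.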
From the physical viewpoint it is reasonable to compare the shapes of a compact self-gravitating fluid with the isoperimetric domains. By the term isoperimetric we mean the sets which minimize the area for variations which leave the volume fixed. In the Euclidean space the only compact equilibrium submanifold is the round sphere, which is exactly the solution to the isoperimetric problem. The physical meaning is clear: fluid-composed stars would minimize their surfaces in order to achieve equilibrium. For general Riemannian manifolds an equilibrium submanifold does not solve the isoperimetric problem. The most general result that can be proved is the following.
Proposition 10. Let S be a compact equilibrium codimension 1 submanifold. Then S is a critical point of the (n − 1)-area A(t) for all variations $S_t$ that leave constant the n-volume V(t) enclosed by S.
Proof. S is the level set of an analytic function and therefore it has no endpoints [53]. Since it is compact it encloses a finite volume. The equilibrium condition implies that the mean curvature is a constant H. Let $S_t$, $t \in (-\epsilon, \epsilon)$ and $S_0 = S$, be a variation of S. The first variation of the area at t = 0 is given by [60]
$$A'(0) = -(n-1)H\int_S f\,dS,$$
where f is the normal component of the variation vector of $S_t$ and dS is the (n − 1)-area element of S. Since the variation is volume preserving then $V'(0) = \int_S f\,dS = 0$ and therefore we get that $A'(0) = 0$.
This result cannot be improved in general. We can find manifolds for which equilibrium shapes are minimizers of the area and other manifolds for which they are maximizers or saddle points. Even the weaker condition of being stable, that is $A''(0) \ge 0$, is not generally verified. It would be interesting to classify all the spaces whose compact equilibrium submanifolds are stable. The following list gives some of them:
– Constant curvature simply connected manifolds. The geodesic spheres are the only stable submanifolds [60]. They are also of equilibrium on account of Proposition 7.
– Rotationally symmetric planes with decreasing curvature from the origin. The geodesic circles are stable and enclose isoperimetric domains [61]; they are also of equilibrium.
– Rotationally symmetric spheres with curvature increasing from the equator and equatorial symmetry. The geodesic circles are stable and enclose isoperimetric domains [61]; they are also of equilibrium.
– Rotationally symmetric cylinders with decreasing curvature from one end and finite area. The circles of revolution are stable, enclose isoperimetric domains [61] and a straightforward computation yields that they are also of equilibrium.
It is not difficult to construct examples of manifolds with equilibrium partitions whose leaves are not stable. For instance consider the plane with the following metric tensor in polar coordinates: $ds^2 = dr^2 + r^2(1 + r^2)^2 d\theta^2$. The function $I = \tfrac12 r^2$ is of equilibrium; it induces the equilibrium partition given by the geodesic circles. Now, if we set $f(r) = r(1 + r^2)$, so that $f' = 1 + 3r^2$ and $f'' = 6r$, the expression $f'^2 - f f'' = 1 + 3r^4$ is greater than 1 when r > 0. This implies [61] that no stable curves exist. Other similar examples in dimension 2 can be found in the work of Ritoré. Another interesting example is given by the symmetric spaces of rank 1. The geodesic spheres are transitivity hypersurfaces of the group of isometries and therefore they are equilibrium submanifolds (Theorem 5). However not all the geodesic spheres are stable [60].

9. Equilibrium Shapes of Relativistic Fluids

The free-boundary problem (P) includes, as particular cases, the equations ruling Newtonian and relativistic fluids on Riemannian manifolds. In the relativistic case Einstein's equations give rise to the additional constraint
$$R_{ab} = f_1^{-1} f_{1;ab} + 4\pi(f_2 - f_3)\,g_{ab}, \qquad (24)$$
which expresses the coupling between the geometry of $(M, g)$ and the potential $f_1$. The metric tensor $g$ can be proved to be analytic in $M - \partial\Omega$ [62] and $C_t^2$ on $\partial\Omega$ (Synge's junction condition [22]). If Eq. (24) is not taken into account (relativistic fluid model on a fixed space) then all the results obtained in this paper apply. When Eq. (24) is considered, Theorem 1 does not hold (its proof makes use of the analyticity of the metric on $\partial\Omega$), and therefore it is not possible to provide a full classification of the equilibrium shapes. Even in the case in which ARP could be proved, the proof of Theorem 2 would fail in general. This obstacle can be overcome when the metric is assumed to be conformally flat, i.e. $g = e^{2\phi}\delta$, and the conformal factor $\phi$ agrees fibrewise with $f_1$ (this assumption is common in the literature, see e.g. [21]). In this case it is not difficult to show, proceeding as in Sect. 4, that ARP implies Theorem 2. Under mild physical assumptions it can be proved that the manifold $M$ is diffeomorphic to $\mathbb{R}^n$ [63], and therefore Corollary 1 would imply (without imposing other physical constraints) that the equipotential sets are concentric spheres $S^{n-1}$, parallel hyperplanes $\mathbb{R}^{n-1}$ or parallel coaxial cylinders $S^{n-1-k} \times \mathbb{R}^k$ ($1 \le k \le n - 2$). This would generalize, up to ARP, a theorem of Lindblom asserting that in dimension 3, and for a bounded domain $\Omega$, conformal flatness implies spherical symmetry [64]. Note that the hypothesis of conformal flatness is proved in [3, 4] under several physical assumptions. It would be interesting to prove ARP and Theorem 2 for general relativistic fluids. This would yield a complete classification of the equilibrium shapes without taking into account physical restrictions. Furthermore, it would allow one to detect the spaces on which $\partial\Omega$ is a round sphere and those on which it is not. Note that the approaches of Beig & Simon and Lindblom & Masood-ul-Alam are adapted to prove the spherical symmetry of the equipotential sets, thus failing in more general situations.
Our conjecture is that Theorems 1 and 2 remain true in the relativistic setting, without additional hypotheses. A possible proof may involve the concept of analytic representation of a metric. This requirement is rather natural since $g$, as $f_1$, is an unknown of Eqs. (P) extending across the free-boundary. The question is how to define the analytic representation of a metric and how to prove that the metrics which are solutions to Eqs. (P) and (24) are analytically representable. If $f_1$ is shown to induce equilibrium partitions
then Kunzle's work [34] (where his strong assumption $(\nabla f_1)^2 = F(f_1)$ would arise as a consequence of the equilibrium property) would imply the spherical symmetry of the fluid, just by taking into account some mild physical hypotheses.

10. Final Remarks and Open Problems

This work has shown a connection between the level sets of the solutions to certain free-boundary problems and the equilibrium (isoparametric) condition. A remarkable consequence of this result has been the classification of the shapes of static fluids on manifolds. This suggests the interest, both mathematical and physical, of studying the structure of isoparametric submanifolds on Riemannian spaces. In this line many problems remain open, e.g. a complete classification in constant curvature or symmetric manifolds, and a better understanding of the interplay between geometry, topology and isoparametricity. An interesting consequence of our work is that we have provided a technique for characterizing the shapes of fluids on different spaces which is independent of whether $\partial\Omega$ is a round sphere or not. Up to now all the techniques available in the literature are adapted to the spherical symmetry situation. Consequently a deeper relationship between the geometrical and topological properties of $(M, g)$ and the shapes of the fluid region has been established. We have shown that recovering the physical intuition that we have in the Euclidean space, e.g. the existence of contractible fluid domains, the connection with the isometries of $(M, g)$ and the stability of the fluid regions, requires restricting the base manifold. It would be interesting to ascertain whether techniques similar to the ones developed in this paper could be useful in other situations where the most interesting properties are of geometrical type, e.g. shapes of self-gravitating fluids in rotation, propagating interfaces, burning flames or computer vision [65].

Acknowledgements. A.P. thanks Maxim Kazarian for many helpful discussions and Ralf Spatzier for suggestions for improvement on a very preliminary version. A.P.'s research was partially supported by an EPSRC grant at the University of Warwick (UK) during the academic year 2001-2002. The latter part of his research was partially supported by Professor Peter Scott's NSF grant at the University of Michigan. D.P.-S. is very grateful to Robert Beig, Rolando Magnanini, Mario Micallef, Stefano Montaldo, Renato Pedrosa, Cesar Rosales, Urs Schaudt and Janos Szenthe for their interesting comments on different parts of this paper. He is also indebted to Alberto Enciso for his encouragement during the course of this work and the careful reading of the manuscript. Finally he is also grateful to Antti Kupiainen and the referee of the paper for their useful criticisms on previous versions of the article. D.P.-S.'s research was supported by FPI and FPU grants from UCM and MEC (Spain).
References
1. Lichtenstein, L.: Gleichgewichtsfiguren Rotierender Flüssigkeiten. Berlin: Springer, 1933
2. Lindblom, L.: Mirror planes in Newtonian stars with stratified flows. J. Math. Phys. 18, 2352 (1977)
3. Beig, R., Simon, W.: On the uniqueness of static perfect fluid solutions in general relativity. Commun. Math. Phys. 144, 373 (1992)
4. Lindblom, L., Masood-ul-Alam, A.K.M.: On the spherical symmetry of static stellar models. Commun. Math. Phys. 162, 123 (1994)
5. Bar, C.: Zero sets of solutions to semilinear elliptic systems of first order. Invent. Math. 138, 183 (1999)
6. Hardt, R. et al.: Critical sets of solutions to elliptic equations. J. Diff. Geom. 51, 359 (1999)
7. Caffarelli, L.A., Friedman, A.: Convexity of solutions of semilinear elliptic equations. Duke Math. J. 52, 431 (1985)
8. Cosner, C., Schmitt, K.: On the geometry of level sets of positive solutions of semilinear elliptic equations. Rocky Mount. J. Math. 18, 277 (1988)
9. Lin, F.H.: Nodal sets of solutions of elliptic and parabolic equations. Comm. Pure Appl. Math. 44, 287 (1991)
10. Kukavica, I.: Level sets for the stationary solutions of the Ginzburg-Landau equation. Calc. Var. Part. Diff. Eqs. 5, 511 (1997)
11. Serrin, J.: A symmetry problem in potential theory. Arch. Rat. Mech. Anal. 43, 304 (1971)
12. Weinberger, H.F.: Remark on the preceding paper of Serrin. Arch. Rat. Mech. Anal. 43, 319 (1971)
13. Garofalo, N., Lewis, J.L.: A symmetry result related to some overdetermined boundary value problems. Amer. J. Math. 111, 9 (1989)
14. Henrot, A., Philippin, G.A.: Some overdetermined boundary value problems with elliptical free boundaries. SIAM J. Math. Anal. 29, 309 (1998)
15. Schaefer, P.W.: On nonstandard overdetermined boundary value problems. Nonlinear Anal. 47, 2203 (2001)
16. Newns, W.F.: Functional dependence. Amer. Math. Month. 74, 911 (1967)
17. Spivak, M.: A comprehensive introduction to differential geometry, 5 vols. Berkeley: Publish or Perish, 1979
18. Gascon, F.G., Peralta-Salas, D.: On the construction of global coordinate systems in Euclidean spaces. Nonlinear Anal. 57, 723 (2004)
19. Gascon, F.G.: Non-wandering points of vector fields and invariant sets of functions. Phys. Lett. A 240, 147 (1998)
20. Avez, A.: Differential Calculus. Chichester: Wiley, 1986
21. Lindblom, L.: On the symmetries of equilibrium stellar models. Phil. Trans. R. Soc. Lond. A 340, 353 (1992)
22. Synge, J.L.: Relativity, the general theory. Amsterdam: North-Holland, 1966
23. Kinderlehrer, D., Nirenberg, L., Spruck, J.: Regularity in elliptic free-boundary problems. J. Anal. Math. 34, 86 (1978)
24. Karp, L., Margulis, A.S.: Newtonian potential theory for unbounded sources and applications to free-boundary problems. J. Anal. Math. 70, 1 (1996)
25. Morrey, C.B., Nirenberg, L.: On the analyticity of the solutions of linear elliptic systems of partial differential equations. Comm. Pure Appl. Math. 10, 271 (1957)
26. Morrey, C.B.: Multiple Integrals and the Calculus of Variations. Berlin: Springer, 1966
27. Browder, F.E.: The zeros of solutions of elliptic partial differential equations with analytic coefficients. Arch. Math. 19, 183 (1968)
28. Acquistapace, F., Broglia, F.: More about signatures and approximation. Geom. Dedicata 50, 107 (1994)
29. Lindblom, L.: Stationary stars are axisymmetric. Astrophys. J. 208, 873 (1976)
30. Morrey, C.B.: On the analyticity of the solutions of analytic non-linear elliptic systems of partial differential equations: analyticity at the boundary. Amer. J. Math. 80, 219 (1958)
31. Hormander, L.: Linear Partial Differential Operators. Berlin: Springer, 1964
32. Sussmann, H.J.: Orbits of families of vector fields and integrability of distributions. Trans. Amer. Math. Soc. 180, 171 (1973)
33. Noble, S.C., Choptuik, M.W.: Collapse of relativistic fluids. Work in progress, available at http:laplace.physics.ubc.ca/∼scn/fluad, 2002
34. Kunzle, H.P.: On the spherical symmetry of a static perfect fluid. Commun. Math. Phys. 20, 85 (1971)
35. Wang, Q.: Isoparametric functions on Riemannian manifolds. Math. Ann. 277, 639 (1987)
36. Levi-Civita, T.: Famiglie di superficie isoparametriche nell'ordinario spazio euclideo. Rend. Accad. Naz. Lincei 26, 355 (1937)
37. Cartan, E.: Familles de surfaces isoparametriques dans les espaces a courbure constante. Ann. Mat. Pura Appl. 17, 177 (1938)
38. Cartan, E.: Sur quelques familles remarquables d'hypersurfaces. C. R. Congres Math. Liege 1, 30 (1939)
39. Segre, B.: Famiglie di ipersuperficie isoparametriche negli spazi euclidei ad un qualunque numero di dimensioni. Rend. Acc. Naz. Lincei 27, 203 (1938)
40. Nomizu, K.: Elie Cartan's work on isoparametric families of hypersurfaces. Proc. Symp. Pure Math. 27, 191 (1975)
41. Thorbergsson, G.: Handbook of Differential Geometry, Vol. I, Amsterdam: North-Holland, 2000, pp. 963-995
42. Baird, P.: Harmonic maps with symmetry, harmonic morphisms and deformations of metrics. Res. Notes Math. 87, 1 (1983)
43. Serrin, J.: The form of interfacial surfaces in Korteweg's theory of phase equilibria. Quart. Appl. Math. 41, 357 (1983)
44. Sakaguchi, S.: When are the spatial level surfaces of solutions of diffusion equations invariant with respect to the time variable. J. Anal. Math. 78, 219 (1999)
45. Alessandrini, G., Magnanini, R.: Symmetry and non-symmetry for the overdetermined Stekloff eigenvalue problem II. In Nonlinear Problems in Applied Mathematics. Philadelphia: SIAM, 1995
46. Shklover, V.E.: Schiffer problem and isoparametric hypersurfaces. Rev. Mat. Iberoamericana 16, 529 (2000)
47. Tondeur, P.: Foliations on Riemannian Manifolds. New York: Springer, 1988
48. Molino, P.: Riemannian Foliations. Boston: Birkhauser, 1988
49. Hector, G., Hirsch, U.: Introduction to the geometry of foliations. Braunschweig: Friedr. Vieweg and Sons, 1986
50. Bochner, S.: Vector fields and Ricci curvature. Bull. Amer. Math. Soc. 52, 776 (1946)
51. Cheeger, J., Gromoll, D.: The splitting theorem for manifolds of non-negative Ricci curvature. J. Diff. Geom. 6, 119 (1971)
52. Scott, P.: The geometries of 3-manifolds. Bull. London Math. Soc. 15, 401 (1983)
53. Sullivan, D.: Combinatorial invariants of analytic spaces. Proceedings of Liverpool Singularities Symposium I, Berlin: Springer, 1971, p. 165
54. Besse, A.L.: Manifolds all of whose Geodesics are Closed. Berlin: Springer, 1978
55. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. Providence: Amer. Math. Soc., 2001
56. Cao, J.: The existence of generalized isothermal coordinates for higher dimensional Riemannian manifolds. Trans. Amer. Math. Soc. 324, 901 (1991)
57. Nomizu, K.: On local and global existence of Killing vector fields. Ann. Math. 72, 105 (1960)
58. Palais, R.S.: On the existence of slices for actions of non-compact Lie groups. Ann. Math. 73, 295 (1961)
59. Szenthe, J.: On generalization of Birkhoff's theorem. Preprint (2004)
60. Barbosa, J.L., DoCarmo, M., Eschenburg, J.: Stability of hypersurfaces of constant mean curvature in Riemannian manifolds. Math. Z. 197, 123 (1988)
61. Ritoré, M.: Constant geodesic curvature curves and isoperimetric domains in rotationally symmetric surfaces. Commun. Anal. Geom. 9, 1093 (2001)
62. Muller zum Hagen, H.: On the analyticity of static vacuum solutions of Einstein's equations. Proc. Cambridge Phil. Soc. 67, 415 (1970)
63. Masood-ul-Alam, A.K.M.: The topology of asymptotically Euclidean static perfect fluid space-time. Commun. Math. Phys. 108, 193 (1987)
64. Lindblom, L.: Some properties of static general relativistic stellar models. J. Math. Phys. 21, 1455 (1980)
65. Sethian, J.A.: Level Set Methods and Fast Marching Methods. Cambridge: Cambridge University Press, 1999

Communicated by A. Kupiainen
Commun. Math. Phys. 267, 117–139 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0021-5
Communications in
Mathematical Physics
On Hamiltonian Perturbations of Hyperbolic Systems of Conservation Laws, II: Universality of Critical Behaviour
Boris Dubrovin
SISSA, Via Beirut 2–4, 34014 Trieste, Italy. E-mail: [email protected]
Received: 12 October 2005 / Accepted: 10 November 2005
Published online: 28 April 2006 – © Springer-Verlag 2006
Abstract: Hamiltonian perturbations of the simplest hyperbolic equation $u_t + a(u)u_x = 0$ are studied. We argue that the behaviour of solutions to the perturbed equation near the point of gradient catastrophe of the unperturbed one should be essentially independent of the choice of generic perturbation and of the choice of generic solution. Moreover, this behaviour is described by a special solution to an integrable fourth order ODE.
1. Introduction

In the present work we continue the study of Hamiltonian perturbations of hyperbolic PDEs initiated by the paper [10]. We consider here the simplest case of a single equation in one spatial dimension,
\[ u_t + a(u)u_x + \epsilon\left[b_1(u)u_{xx} + b_2(u)u_x^2\right] + \epsilon^2\left[b_3(u)u_{xxx} + b_4(u)u_x u_{xx} + b_5(u)u_x^3\right] + \cdots = 0. \tag{1.1} \]
Here $\epsilon$ is a small parameter; the coefficient of $\epsilon^k$ is a graded homogeneous polynomial in the derivatives $u_x$, $u_{xx}$, … of total degree $(k + 1)$, where $\deg u^{(n)} = n$, $n > 0$. The unperturbed equation
\[ u_t + a(u)u_x = 0 \tag{1.2} \]
can be considered as the simplest example of a nonlinear hyperbolic system; the smooth functions $b_1(u)$, $b_2(u)$, etc. determine the structure of the perturbation.
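The grading rule above fixes which monomials may appear at each order: with $\deg u^{(n)} = n$, the monomials of total degree $k+1$ correspond to the partitions of $k+1$. The following is a minimal sketch of that count, not part of the original argument; the function names are illustrative, and each listed monomial is understood to carry its own arbitrary coefficient function $b_i(u)$.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def jet_name(order):
    """Name of the order-th x-derivative of u, e.g. order 2 -> 'u_xx'."""
    return "u_" + "x" * order


def monomials(degree):
    """Monomials in u_x, u_xx, ... of total degree `degree`, with deg u^(n) = n."""
    return [" ".join(jet_name(k) for k in part) for part in partitions(degree)]


# Order eps^1 needs degree 2, order eps^2 needs degree 3, as in (1.1).
for k in (1, 2, 3):
    print(k, monomials(k + 1))
```

For $k = 1$ and $k = 2$ the lists reproduce the two and three monomials written out in (1.1); at order $\epsilon^3$ five coefficient functions would be needed.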
Such expansions arise, e.g., in the study of the long wave (also called dispersionless) approximations of evolutionary PDEs; see Sect. 5 below for other mechanisms that yield perturbed equations of the form (1.1). The unperturbed equation (1.2) admits a Hamiltonian description of the form
\[ u_t + \{u(x), H_0\} \equiv u_t + \partial_x \frac{\delta H_0}{\delta u(x)} = 0, \qquad H_0 = \int f(u)\,dx, \quad f''(u) = a(u), \tag{1.3} \]
\[ \{u(x), u(y)\} = \delta'(x - y). \tag{1.4} \]
The perturbed equations of the form (1.1) are considered up to equivalences defined by Miura-type transformations [9] of the form
\[ u \mapsto u + \sum_{k \ge 1} \epsilon^k F_k\left(u; u_x, \ldots, u^{(k)}\right), \tag{1.5} \]
where $F_k(u; u_x, \ldots, u^{(k)})$ is a graded homogeneous polynomial in the derivatives $u_x$, $u_{xx}$, … of degree $\deg F_k = k$. Using results of [15] (see also [6, 9]) one can show that any Hamiltonian perturbation of Eq. (1.2) can be reduced to the form
\[ u_t + \partial_x \frac{\delta H}{\delta u(x)} = 0, \qquad H = H_0 + \epsilon H_1 + \epsilon^2 H_2 + \cdots, \qquad H_k = \int h_k\left(u; u_x, \ldots, u^{(k)}\right) dx, \quad \deg h_k\left(u; u_x, \ldots, u^{(k)}\right) = k. \tag{1.6} \]
Recall that for $H = \int h(u; u_x, u_{xx}, \ldots)\,dx$,
\[ \frac{\delta H}{\delta u(x)} = E\,h, \]
where
\[ E = \frac{\partial}{\partial u} - \partial_x \frac{\partial}{\partial u_x} + \partial_x^2 \frac{\partial}{\partial u_{xx}} - \cdots \]
is the Euler–Lagrange operator. The following well-known property of the Euler–Lagrange operator will be often used in this paper: $E\,h = 0$ iff there exists $h_1 = h_1(u; u_x, \ldots)$ such that $h = \mathrm{const} + \partial_x h_1$. Note that we do not specify here the class of functions $u(x)$. The Hamiltonians $H = H[u]$ can be ill defined (e.g., a divergent integral) but the evolutionary PDE (1.6) makes sense. The crucial point for the subsequent considerations is the following statement (see, e.g., [7]): for two commuting Hamiltonians
\[ \{H, F\} = 0 \iff E\left(\frac{\delta H}{\delta u(x)}\,\partial_x \frac{\delta F}{\delta u(x)}\right) = 0 \]
the evolutionary PDEs
\[ u_t + \partial_x \frac{\delta H}{\delta u(x)} = 0 \quad \text{and} \quad u_s + \partial_x \frac{\delta F}{\delta u(x)} = 0 \]
commute, $(u_t)_s = (u_s)_t$.

For sufficiently small $\epsilon$ one expects to see no major differences in the behaviour of solutions to the perturbed and unperturbed Eqs. (1.1) and (1.2) within the regions where the $x$-derivatives are bounded. However, the differences become quite serious near the critical point (also called the point of gradient catastrophe) where the derivatives of the solution to the unperturbed equation tend to infinity. Although the case of small viscosity perturbations has been well studied and understood (see [3] and references therein), the critical behaviour of solutions to general conservative perturbations (1.6) has, to the best of our knowledge, not been investigated (see the papers [32, 12, 17–19, 23–25, 28] for the study of various particular cases).

The main goal of this paper is to formulate the Universality Conjecture about the behaviour of a generic solution to the general perturbed Hamiltonian equation near the point of gradient catastrophe of the unperturbed solution. We argue that, up to shifts, Galilean transformations and rescalings, this behaviour essentially depends neither on the choice of solution nor on the choice of the equation (provided certain genericity assumptions hold). Moreover, this behaviour near the point $(x_0, t_0, u_0)$ is given by
\[ u \simeq u_0 + a\,\epsilon^{2/7}\, U\!\left(b\,\epsilon^{-6/7}\left(x - a_0(t - t_0) - x_0\right);\; c\,\epsilon^{-4/7}(t - t_0)\right) + O\!\left(\epsilon^{4/7}\right), \tag{1.7} \]
where $U = U(X; T)$ is the unique real smooth for all $X \in \mathbb{R}$ solution to the fourth order ODE,
\[ X = T\,U - \left[\frac{1}{6}U^3 + \frac{1}{24}\left(U'^2 + 2UU''\right) + \frac{1}{240}U^{IV}\right], \qquad U' = \frac{dU}{dX}, \text{ etc.}, \tag{1.8} \]
depending on the parameter $T$. Here $a$, $b$, $c$ are some constants that depend on the choice of the equation and the solution, $a_0 = a(v_0)$. Equation (1.8) appeared in [4] (for the particular value of the parameter $T = 0$) in the study of the double scaling limit for the matrix model with the multicritical index $m = 3$. It was observed that generic solutions to (1.8) blow up at some point of the real line; the conjecture about existence of a unique smooth solution has been formulated. To the best of our knowledge, this conjecture remains open, although there is some supporting evidence [20].

The present paper is organized as follows. In Sect. 2 we classify all Hamiltonian perturbations up to the order $\epsilon^4$. They are parametrized by two arbitrary functions $c(u)$, $p(u)$. For the simplest example the perturbations of the Riemann wave equation $u_t + u u_x = 0$ read
\[ u_t + u u_x + \frac{\epsilon^2}{24}\left[2c\,u_{xxx} + 4c'\,u_x u_{xx} + c''\,u_x^3\right] + \epsilon^4\left[2p\,u_{xxxxx} + 2p'\left(5u_{xx}u_{xxx} + 3u_x u_{xxxx}\right) + p''\left(7u_x u_{xx}^2 + 6u_x^2 u_{xxx}\right) + 2p'''\,u_x^3 u_{xx}\right] = 0. \tag{1.9} \]
For $c(u) = \mathrm{const}$, $p(u) = 0$ this is nothing but the Korteweg–de Vries (KdV) equation; for other choices of the functions $c(u)$, $p(u)$ it seems not to be an integrable PDE. Remarkably, for an arbitrary choice of the functional parameters the perturbed equation possesses an infinite family of approximate symmetries (see [2, 9, 22, 30] for discussion of approximate symmetries). In principle our approach can be applied to classifying the Hamiltonian perturbations of higher orders. However, higher order terms do not affect the type of critical behaviour. In Sect. 3 we establish an important property of quasitriviality of all perturbations (cf. [9, 10, 27]). The quasitriviality is given by a substitution
\[ u \mapsto u + \epsilon^2 K_2(u; u_x, u_{xx}, u_{xxx}) + \epsilon^4 K_4\left(u; u_x, \ldots, u^{(6)}\right) \tag{1.10} \]
that transforms, modulo $O(\epsilon^6)$, the unperturbed equation (1.2) to (1.6). Here the functions $K_2$ and $K_4$ depend rationally on the $x$-derivatives. We also formulate the first part of our Main Conjecture that says that, for sufficiently small $\epsilon$, the solution to the perturbed system exists at least on the same domain of the $(x, t)$-plane where the unperturbed solution is defined. In Sect. 4 we briefly discuss existence of a bihamiltonian structure compatible with the perturbation (see also Appendix below). Some examples of perturbed Hamiltonian equations are described in Sect. 5. In Sect. 6 we recollect some properties of the ODE (1.8) and we formulate the second part of the Main Conjecture describing the special function $U(X; T)$ in (1.7) as a particular solution to (1.8). Finally, in Sect. 7 we give the precise formulation of the Universality Conjecture (Main Conjecture, Part 3) and give some evidence supporting it.¹ For lack of space we do not consider the numerical evidence supporting the idea of Universality; it will be given in a subsequent publication (see also [16]).
In the last section we outline the programme of further research towards understanding of universality phenomena of critical behaviour in general Hamiltonian perturbations of hyperbolic systems.

2. Hamiltonian Perturbations of the Riemann Wave Equation

Let us start with the simplest case of Hamiltonian perturbations of the equation
\[ v_t + v\,v_x = 0 \iff v_t + \{v(x), H_0\} = 0, \qquad \{v(x), v(y)\} = \delta'(x - y), \quad H_0 = \int \frac{v^3}{6}\,dx. \tag{2.1} \]
Lemma 2.1. Up to the order $O(\epsilon^4)$, all Hamiltonian perturbations of (2.1) can be reduced to the form
\[ u_t + \partial_x \frac{\delta H}{\delta u(x)} = 0, \qquad H = \int \left[\frac{u^3}{6} - \epsilon^2 \frac{c(u)}{24} u_x^2 + \epsilon^4\left(p(u)u_{xx}^2 + s(u)u_x^4\right)\right] dx, \tag{2.2} \]
where $c(u)$, $p(u)$, $s(u)$ are arbitrary functions. Moreover, the function $s(u)$ can be eliminated by a Miura-type transform.

¹ Perhaps, only this Part 3 deserves the name of the Main Conjecture. However, the precise formulation of it depends on the first two parts.
Proof. The Hamiltonian must have the form
\[ H = H_0 + \epsilon H_1 + \cdots + \epsilon^4 H_4, \]
where the density of $H_k$ is a graded homogeneous polynomial of degree $k$. So, the density of $H_1$ is a total derivative:
\[ H_1 = \int \alpha(u)u_x\,dx, \qquad \alpha(u)u_x = \partial_x A(u), \quad A'(u) = \alpha(u). \]
The density of the Hamiltonian $H_2$ modulo total derivatives must have the form
\[ -\frac{c(u)}{24}u_x^2 \]
for some function $c(u)$. Similarly, $H_3$ must have the form
\[ H_3 = \int c_1(u)u_x^3\,dx. \]
Here $c_1(u)$ is another arbitrary function. Let us show that $H_3$ can be eliminated by a Miura-type transform. Let us look for it in the form
\[ u \mapsto u + \epsilon\{u(x), F\} + \frac{\epsilon^2}{2}\{\{u(x), F\}, F\} + \cdots, \tag{2.3} \]
choosing
\[ F = \epsilon^2 \int \alpha(u)u_x^2\,dx. \]
Such a transformation preserves the Poisson bracket. The change of the Hamiltonian $H$ will be given by
\[ \delta H = \epsilon\{F, H\} + O(\epsilon^4). \]
At the order $\epsilon^3$ one has
\[ \delta H = \epsilon^3 \int \left[\frac{1}{2}\alpha'(u)u_x^2 - \partial_x(\alpha u_x)\right] u\,u_x\,dx = \frac{\epsilon^3}{2}\int \alpha(u)u_x^3\,dx. \]
So, choosing $\alpha(u) = -2c_1(u)$ we kill the terms cubic in $\epsilon$. The rest of the proof is obvious: in order $\epsilon^4$ all the Hamiltonians have the form
\[ H_4 = \int \left[p(u)u_{xx}^2 + s(u)u_x^4\right] dx \]
for some functions $p(u)$, $s(u)$. The last term can be killed by the canonical transformation of the form (2.3) generated by the Hamiltonian
\[ F = -\frac{\epsilon^3}{2}\int s(u)u_x^3\,dx. \]
The lemma is proved.
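As a check on the normalisation in Lemma 2.1, the $\epsilon^2$ term of the perturbed equation (1.9) can be recovered from the second term of the density in (2.2). The short computation below is a worked sketch using only the definition of the Euler–Lagrange operator $E$ from Sect. 1; the symbol $h_2$ is introduced here for the $\epsilon^2$ coefficient of the density and is not notation from the original text.

```latex
\begin{align*}
h_2 &= -\frac{c(u)}{24}\,u_x^2, \\
E\,h_2 &= \frac{\partial h_2}{\partial u} - \partial_x \frac{\partial h_2}{\partial u_x}
        = -\frac{c'}{24}\,u_x^2 + \frac{1}{12}\,\partial_x\!\left(c\,u_x\right)
        = \frac{1}{24}\left(c'\,u_x^2 + 2c\,u_{xx}\right), \\
\partial_x E\,h_2 &= \frac{1}{24}\left(2c\,u_{xxx} + 4c'\,u_x u_{xx} + c''\,u_x^3\right).
\end{align*}
```

The last line is the bracket multiplying $\epsilon^2/24$ in (1.9); for constant $c$ only the dispersive term $c\,u_{xxx}/12$ survives.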
Choosing $s(u) = 0$ one obtains the family (1.9) of Hamiltonian perturbations of the Riemann wave equation depending on two arbitrary functions $c = c(u)$, $p = p(u)$.

We will now compare the symmetries of (2.1) and those of the perturbed system (2.2). It is easy to see that the Hamiltonian equation
\[ v_s + a(v)v_x = 0 \iff v_s + \{v(x), H_f^0\} = 0, \qquad H_f^0 = \int f(v)\,dx, \quad f''(v) = a(v) \tag{2.4} \]
is a symmetry of (2.1) for any $a(v)$,
\[ (v_t)_s = (v_s)_t. \]
Moreover, the Hamiltonians $H_f^0$ commute pairwise,
\[ \{H_f^0, H_g^0\} = 0 \quad \forall f = f(u),\ \forall g = g(u). \]
This family of commuting Hamiltonians is complete in the following sense.

Lemma 2.2. The family of commuting Hamiltonians $H_f^0$ is maximal, i.e., if
\[ H = \int h(u; u_x, u_{xx}, \ldots)\,dx \]
commutes with all functionals of the form $H_f^0$ then
\[ h(u; u_x, u_{xx}, \ldots) = g(u) + \partial_x(\ldots) \]
for some function $g(u)$.

We will now construct a perturbation of the Hamiltonians $H_f^0$ preserving the commutativity modulo $O(\epsilon^6)$. As in Lemma 2.1 one can easily check that all the perturbations up to the order $\epsilon^4$ must have the form
\[ H_f = \int \left[f(u) - \epsilon^2 \frac{c_f(u)}{24}u_x^2 + \epsilon^4\left(p_f(u)u_{xx}^2 + s_f(u)u_x^4\right)\right] dx \]
for some functions $c_f(u)$, $p_f(u)$, $s_f(u)$. To ensure commutativity one has to choose these functions as follows.

Lemma 2.3. For any $f = f(u)$ the Hamiltonian flow
\[ u_s + \partial_x \frac{\delta H_f}{\delta u(x)} = 0, \qquad H_f = \int h_f\,dx, \]
\[ h_f = f - \frac{\epsilon^2}{24}\,c\,f'''\,u_x^2 + \epsilon^4\left[\left(p\,f''' + \frac{c^2 f^{(4)}}{480}\right)u_{xx}^2 - \left(\frac{c\,c''f^{(4)}}{1152} + \frac{c\,c'f^{(5)}}{1152} + \frac{c^2 f^{(6)}}{3456} + \frac{p'f^{(4)}}{6} + \frac{p\,f^{(5)}}{6} - s\,f'''\right)u_x^4\right] \tag{2.5} \]
is a symmetry, modulo $O(\epsilon^6)$, of (2.2). Moreover, the Hamiltonians $H_f$ commute pairwise: $\{H_f, H_g\} = O(\epsilon^6)$ for arbitrary two functions $f(u)$ and $g(u)$.
Proof. One has to check the identity
\[ E\left(\frac{\delta H_f}{\delta u(x)}\,\partial_x \frac{\delta H_g}{\delta u(x)}\right) = 0, \]
where $E$ is the Euler–Lagrange operator. We leave this calculation as an exercise for the reader.

Observe that for $f = \frac{u^3}{6}$ the Hamiltonian $H_f$ coincides with (2.2). Also for $f = u$ (the Casimir of the Poisson bracket) and $f = \frac{u^2}{2}$ (the momentum) the perturbation is trivial, $H_f = H_f^0$. We do not know under what conditions on the functional parameters $c(u)$, $p(u)$ higher order perturbations can be added to the Hamiltonians (2.5) preserving the commutativity. The examples of Sect. 5 show that this can be done at least for some particular choices of the functions. However, the remark at the end of Sect. 4 suggests that the answer is not always affirmative.
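The identity in the proof of Lemma 2.3 can be tested mechanically in a simple special case. The sketch below is only an illustration under stated assumptions: it takes constant $c(u) = c_0$ and $p = s = 0$ (the KdV case), the two densities $f = u^3/6$ and $g = u^4/24$, and checks the order $\epsilon^2$ part of the integrand only. The tiny differential-polynomial arithmetic (dictionaries keyed by exponent tuples) and all function names are constructs of this sketch, not anything from the paper.

```python
from fractions import Fraction

# A differential polynomial in u with rational coefficients is a dict mapping a
# monomial (e0, e1, e2, ...) = u^e0 * u_x^e1 * u_xx^e2 * ... to its coefficient.


def trim(mono):
    mono = list(mono)
    while mono and mono[-1] == 0:
        mono.pop()
    return tuple(mono)


def add_term(poly, mono, coeff):
    mono = trim(mono)
    value = poly.get(mono, Fraction(0)) + coeff
    if value:
        poly[mono] = value
    else:
        poly.pop(mono, None)


def add(p, q):
    out = dict(p)
    for mono, coeff in q.items():
        add_term(out, mono, coeff)
    return out


def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            n = max(len(m1), len(m2))
            a = list(m1) + [0] * (n - len(m1))
            b = list(m2) + [0] * (n - len(m2))
            add_term(out, [x + y for x, y in zip(a, b)], c1 * c2)
    return out


def partial(poly, k):
    """Partial derivative with respect to the jet variable u^(k)."""
    out = {}
    for mono, coeff in poly.items():
        if k < len(mono) and mono[k] > 0:
            new = list(mono)
            new[k] -= 1
            add_term(out, new, coeff * mono[k])
    return out


def dx(poly):
    """Total x-derivative: sum over k of u^(k+1) * d/du^(k)."""
    out = {}
    for mono, coeff in poly.items():
        for k, e in enumerate(mono):
            if e:
                new = list(mono) + [0]
                new[k] -= 1
                new[k + 1] += 1
                add_term(out, new, coeff * e)
    return out


def euler(poly):
    """Euler-Lagrange operator E = sum over k of (-d_x)^k d/du^(k)."""
    order = max((len(m) for m in poly), default=0)
    out = {}
    for k in range(order):
        term = partial(poly, k)
        for _ in range(k):
            term = dx(term)
        sign = -1 if k % 2 else 1
        for mono, coeff in term.items():
            add_term(out, mono, sign * coeff)
    return out


c0 = Fraction(1)  # assumed constant value of c(u); eps is set to 1 throughout

# eps^0 and eps^2 parts of the densities h_f, h_g from (2.5).
f0, f2 = {(3,): Fraction(1, 6)}, {(0, 2): -c0 / 24}    # f = u^3/6,  f''' = 1
g0, g2 = {(4,): Fraction(1, 24)}, {(1, 2): -c0 / 24}   # g = u^4/24, g''' = u

A0, A2 = euler(f0), euler(f2)
B0, B2 = euler(g0), euler(g2)

# eps^2 part of (dH_f/du) * d_x (dH_g/du); it must be a total derivative.
integrand = add(mul(A0, dx(B2)), mul(A2, dx(B0)))
commutator_density = euler(integrand)

# Flux derivative of the f-flow: u u_x + (c0/12) u_xxx, i.e. the KdV equation.
kdv_flux = dx(add(A0, A2))
print(commutator_density, kdv_flux)
```

Here an empty dictionary for `commutator_density` means that the order $\epsilon^2$ integrand is a total $x$-derivative, which is what commutativity modulo higher orders requires in this special case. Extending the check to non-constant $c(u)$ or to order $\epsilon^4$ would need coefficient functions of $u$ and is not attempted in this sketch.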
3. Solutions to the Perturbed Equations. Quasitriviality

We address now the problem of existence of solutions to the perturbed equation for $t < t_C$. We will construct a formal asymptotic solution to (2.2) (and also to all commuting flows (2.5)) valid on the entire interval $t < t_C$. The basic idea is to find a substitution
\[ v \mapsto u = v + O(\epsilon) \]
that transforms all solutions to all unperturbed equations of the form (2.4) to solutions to the corresponding perturbed equations (2.5).

Quasitriviality Theorem. There exists a transformation
\[ v \mapsto u = v + \sum_{k=1}^{4} \epsilon^k F_k\left(u; u_x, \ldots, u^{(n_k)}\right), \tag{3.1} \]
where $F_k$ are rational functions in the derivatives, homogeneous of degree $k$, independent of $f = f(u)$, that transforms all monotone solutions of (2.4) to solutions, modulo $O(\epsilon^6)$, of (2.5) and vice versa.

The general quasitriviality theorem for evolutionary PDEs admitting a bihamiltonian description was obtained in [10].² As we do not assume a priori existence of a bihamiltonian structure (see, however, the next section), we will give here a direct proof of quasitriviality for the family of commuting Hamiltonians (2.5). For convenience we choose
\[ s(u) = \frac{c(u)\,c'''(u)}{3456}. \]

² In a very recent paper [27] the quasitriviality result was proved, in all orders in $\epsilon$, for an arbitrary perturbation of the Riemann wave equation $v_t + v\,v_x = 0$. It has also been shown that the same transformation trivializes all symmetries of the perturbed equation as well.
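Theorem 3.1. Introduce the following Hamiltonian
\[ K = \int \left[\frac{\epsilon}{24}\,c(u)\,u_x \log u_x + \epsilon^3\left(\frac{c^2(u)}{5760}\frac{u_{xx}^3}{u_x^3} - \frac{p(u)}{4}\frac{u_{xx}^2}{u_x}\right)\right] dx. \]
Then the canonical transformation
\[ u \mapsto v = u + \epsilon\{u(x), K\} + \frac{\epsilon^2}{2}\{\{u(x), K\}, K\} + \cdots \]
satisfies
\[ H_f = \int f(v)\,dx + O(\epsilon^6) \quad \forall f(u). \]
The inverse transformation is the needed quasitriviality. It is generated by the Hamiltonian
\[ -K = -\int \left[\frac{\epsilon}{24}\,c(v)\,v_x \log v_x + \epsilon^3\left(\frac{c^2(v)}{5760}\frac{v_{xx}^3}{v_x^3} - \frac{p(v)}{4}\frac{v_{xx}^2}{v_x}\right)\right] dx, \]
that is
\[
\begin{aligned}
v \mapsto u &= v - \epsilon\{v(x), K\} + \frac{\epsilon^2}{2}\{\{v(x), K\}, K\} + \cdots \\
&= v + \frac{\epsilon^2}{24}\,\partial_x\left(c\,\frac{v_{xx}}{v_x} + c'\,v_x\right) + \epsilon^4\,\partial_x\Bigg[c^2\left(\frac{v_{xx}^3}{360\,v_x^4} - \frac{7\,v_{xx}v_{xxx}}{1920\,v_x^3} + \frac{v_{xxxx}}{1152\,v_x^2}\right)_x \\
&\quad + c\,c'\left(\frac{47\,v_{xx}^3}{5760\,v_x^3} - \frac{37\,v_{xx}v_{xxx}}{2880\,v_x^2} + \frac{5\,v_{xxxx}}{1152\,v_x}\right) + c'^2\left(\frac{v_{xxx}}{384} - \frac{v_{xx}^2}{5760\,v_x}\right) \\
&\quad + c\,c''\left(\frac{v_{xxx}}{144} - \frac{v_{xx}^2}{360\,v_x}\right) + \frac{1}{1152}\left(7\,c'c''\,v_x v_{xx} + c''^2\,v_x^3 + 6\,c\,c'''\,v_x v_{xx} + c'c'''\,v_x^3 + c\,c^{(4)}\,v_x^3\right) \\
&\quad + p\left(\frac{v_{xx}^3}{2\,v_x^3} - \frac{v_{xx}v_{xxx}}{v_x^2} + \frac{v_{xxxx}}{2\,v_x}\right) + p'\,v_{xxx} + p''\,\frac{v_x v_{xx}}{2}\Bigg].
\end{aligned} \tag{3.2}
\]
In this formula $c = c(v)$, $p = p(v)$.

Main Conjecture, Part 1. Let $v = v(x, t)$ be a smooth solution to the unperturbed equation $v_t + a(v)\,v_x = 0$ defined for all $x \in \mathbb{R}$ and $0 \le t < t_0$, monotone in $x$ for any $t$. Then there exists a solution $u = u(x, t; \epsilon)$ to the perturbed equation
\[ u_t + \partial_x \frac{\delta H_f}{\delta u(x)} = 0, \qquad f''(u) = a(u), \]
defined on the same domain in the $(x, t)$-plane with the asymptotics at $\epsilon \to 0$ of the form (3.2).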
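The leading term of (3.2) can be recovered by hand from the first term of the Hamiltonian $K$ of Theorem 3.1. The computation below is a sketch under the convention $\{u(x), H\} = \partial_x\,\delta H/\delta u(x)$ taken from (1.3); the symbol $K_1$, denoting the order $\epsilon$ part of $K$, is introduced here for convenience and is not notation from the paper.

```latex
\begin{align*}
K_1 &= \frac{\epsilon}{24}\int c(u)\,u_x \log u_x\,dx, \\
\frac{\delta K_1}{\delta u(x)}
  &= \frac{\epsilon}{24}\left[c'\,u_x\log u_x - \partial_x\bigl(c\,(\log u_x + 1)\bigr)\right]
   = -\frac{\epsilon}{24}\left(c'\,u_x + c\,\frac{u_{xx}}{u_x}\right), \\
v &= u + \epsilon\,\partial_x\frac{\delta K_1}{\delta u(x)} + O(\epsilon^4)
   = u - \frac{\epsilon^2}{24}\,\partial_x\left(c\,\frac{u_{xx}}{u_x} + c'\,u_x\right) + O(\epsilon^4).
\end{align*}
```

Inverting to this order gives $u = v + \frac{\epsilon^2}{24}\partial_x\left(c\,v_{xx}/v_x + c'\,v_x\right) + O(\epsilon^4)$, which is the first correction in (3.2). The appearance of $v_x$ in the denominator is the reason the statement is restricted to monotone solutions.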
4. Are All Hamiltonian Perturbations Also Bihamiltonian?

All unperturbed equations
\[ v_s + a(v)\,v_x = 0 \]
are bihamiltonian w.r.t. the Poisson pencil (see the definition in [9])
\[ \{v(x), v(y)\}_1 = \delta'(x - y), \qquad \{v(x), v(y)\}_2 = q(v(x))\,\delta'(x - y) + \frac{1}{2}\,q'(v)\,v_x\,\delta(x - y) \tag{4.1} \]
for an arbitrary function $q(u)$,
\[ v_s + \{v(x), H_1\}_1 = v_s + \{v(x), H_2\}_2 = 0, \qquad H_1 = \int f_1(v)\,dx, \quad H_2 = \int f_2(v)\,dx, \]
\[ f_1''(v) = a(v) = q(v)\,f_2''(v) + \frac{1}{2}\,q'(v)\,f_2'(v). \]
To show that (4.1) is a Poisson pencil it suffices to observe that the linear combination
\[ \{v(x), v(y)\}_2 - \lambda\,\{v(x), v(y)\}_1 = \left(q(v(x)) - \lambda\right)\delta'(x - y) + \frac{1}{2}\,q'(v)\,v_x\,\delta(x - y) \tag{4.2} \]
is the Poisson bracket associated [11] with the flat metric
\[ ds^2 = \frac{dv^2}{q(v) - \lambda}. \]

Theorem 4.1. For $c(u) \ne 0$ the commuting Hamiltonians (2.5) admit a unique bihamiltonian structure obtained by a deformation of (4.1) with $q(u)$ satisfying
\[ p(u) = \frac{c^2}{960}\left(5\,\frac{c'}{c} - \frac{q''}{q'}\right), \qquad s(u) = 0. \tag{4.3} \]
The proof of this result along with the explicit formula for the deformed bihamiltonian structure is sketched in the Appendix below. The assumption $c \ne 0$ is essential: one can check that for $c(u) \equiv 0$ the Hamiltonians (2.5) commute, modulo $O(\epsilon^6)$, only w.r.t. the standard Poisson bracket (1.4). On the other hand, it turns out that for this particular choice of the functional parameters the deformation of commuting Hamiltonians cannot be extended to the order $O(\epsilon^8)$.

5. Examples

Example 1. For $c(u) = c_0 = \mathrm{const}$, $p(u) = s(u) = 0$ one obtains from (2.2) the KdV equation
\[ u_t + u\,u_x + c_0\,\frac{\epsilon^2}{12}\,u_{xxx} = 0. \]
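Choosing in (2.5)
\[ f(u) = \frac{u^{k+2}}{(k + 2)!} \]
one obtains the Hamiltonians of the KdV hierarchy
\[ \frac{\partial u}{\partial t_k} + \partial_x \frac{\delta H_k}{\delta u(x)} = 0, \qquad H_k = \int h_k\,dx, \quad k \ge 0, \]
\[ h_k = \frac{u^{k+2}}{(k + 2)!} - \epsilon^2\,\frac{c_0}{24}\,\frac{u^{k-1}}{(k - 1)!}\,u_x^2 + \epsilon^4\,\frac{c_0^2}{96}\left[\frac{u^{k-2}}{5\,(k - 2)!}\,u_{xx}^2 - \frac{u^{k-4}}{36\,(k - 4)!}\,u_x^4\right] + O(\epsilon^6). \]
The quasitriviality transformation (3.2) takes the form [2, 9]
\[ v \mapsto u = v + \partial_x^2\left[\frac{\epsilon^2}{24}\,c_0 \log v_x + c_0^2\,\epsilon^4\left(\frac{v_{xx}^3}{360\,v_x^4} - \frac{7\,v_{xx}v_{xxx}}{1920\,v_x^3} + \frac{v_{xxxx}}{1152\,v_x^2}\right)\right] + O(\epsilon^6). \tag{5.1} \]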
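Example 2. The Volterra lattice
\[ \dot q_n = q_n\,(q_{n+1} - q_{n-1}) \tag{5.2} \]
(also called difference KdV) has the following bihamiltonian structure [13]:
\[ \{q_n, q_m\}_1 = 2\,q_n q_m\,(\delta_{n+1,m} - \delta_{n,m+1}), \qquad \dot q_n = \{q_n, H_1\}_1, \quad H_1 = \frac{1}{2}\sum \log q_n, \tag{5.3} \]
\[ \{q_n, q_m\}_2 = q_n q_m\left[\left(\frac{q_n + q_m}{2} - 2\right)\left(\delta_{n,m+1} - \delta_{n,m-1}\right) + \frac{1}{2}\,\delta_{n,m+2} - \frac{1}{2}\,\delta_{n,m-2}\right], \qquad \dot q_n = \{q_n, H_2\}_2, \quad H_2 = \sum q_n. \tag{5.4} \]
After the substitution $q_n = e^{v(n\epsilon)}$ and division by $4\epsilon$ one arrives at the following bihamiltonian structure:
\[ \{v(x), v(y)\}_1 = \frac{1}{2\epsilon}\left[\delta(x - y + \epsilon) - \delta(x - y - \epsilon)\right] = \delta'(x - y) + \frac{\epsilon^2}{6}\,\delta'''(x - y) + \cdots, \tag{5.5} \]
\[
\begin{aligned}
\{v(x), v(y)\}_2 &= \left(1 - e^{v(x)}\right)\delta'(x - y) - \frac{1}{2}\,e^{v}v_x\,\delta(x - y) \\
&\quad + \epsilon^2\Big[\frac{1}{12}\left(2 - 5\,e^{v}\right)\delta'''(x - y) - \frac{5}{8}\,e^{v}v_x\,\delta''(x - y) \\
&\quad - \frac{3}{8}\,e^{v}\left(v_{xx} + v_x^2\right)\delta'(x - y) - \frac{1}{12}\,e^{v}\left(v_{xxx} + 3v_x v_{xx} + v_x^3\right)\delta(x - y)\Big] + O(\epsilon^4).
\end{aligned} \tag{5.6}
\]
To compare this bihamiltonian structure with the one obtained in Theorem 4.1 the Poisson bracket (5.5) must be reduced to the standard form
\[ \{u(x), u(y)\}_1 = \delta'(x - y) \tag{5.7} \]
by means of the transformation
\[ u = \sqrt{\frac{\epsilon\,\partial_x}{\sinh \epsilon\,\partial_x}}\;v = v - \frac{\epsilon^2}{12}\,v_{xx} + \frac{\epsilon^4}{160}\,v_{xxxx} + O(\epsilon^6). \]
After the transformation the second bracket takes the form
\[
\begin{aligned}
\{u(x), u(y)\}_2 &= \left(1 - e^{u(x)}\right)\delta'(x - y) - \frac{1}{2}\,e^{u}u_x\,\delta(x - y) \\
&\quad - \epsilon^2 e^{u(x)}\Big[\frac{1}{4}\,\delta'''(x - y) + \frac{3}{8}\,u_x\,\delta''(x - y) + \frac{1}{24}\left(7u_{xx} + 5u_x^2\right)\delta'(x - y) \\
&\quad + \frac{1}{24}\left(2u_{xxx} + 4u_x u_{xx} + u_x^3\right)\delta(x - y)\Big] + O(\epsilon^4).
\end{aligned} \tag{5.8}
\]
We leave it as an exercise for the reader to compute the terms of order $\epsilon^4$ and to verify that the Poisson bracket (5.8) is associated with the functional parameters chosen as follows:
\[ c(u) = 2, \qquad p(u) = -\frac{1}{240}, \qquad q(u) = 1 - e^{u}, \qquad s(u) = \frac{1}{4320}. \]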
Example 3. The Camassa–Holm equation [5] (see also [14])
\[ v_t - \epsilon^2 v_{xxt} = \frac{3}{2}\,v\,v_x - \epsilon^2\left[v_x v_{xx} + \frac{1}{2}\,v\,v_{xxx}\right] \tag{5.9} \]
admits a bihamiltonian description (cf. [21]) after doing the following Miura-type transformation
\[ u = v - \epsilon^2 v_{xx}. \tag{5.10} \]
The bihamiltonian structure reads
\[ \{u(x), u(y)\}_1 = \delta'(x-y) - \epsilon^2\,\delta'''(x-y), \tag{5.11} \]
\[ \{u(x), u(y)\}_2 = u(x)\,\delta'(x-y) + \frac{1}{2}\,u_x\,\delta(x-y). \tag{5.12} \]
The Casimir $H_{-1}$ of the first Poisson bracket analytic in $\epsilon$ has the form
\[ H_{-1} = \int h_{-1}\,dx, \qquad h_{-1} = u(x). \]
Applying the bihamiltonian recursion procedure one obtains a sequence of commuting Hamiltonians $H_k = \int h_k\,dx$ of the hierarchy,
\[ h_0 = \frac{1}{2}\,u\,v, \qquad h_1 = \frac{1}{8}\left[v^3 + u\,v^2\right], \ \dots. \]
The corresponding Hamiltonian flows
\[ u_{t_k} = \{u(x), H_k\}_1 \equiv (1 - \epsilon^2\partial_x^2)\,\partial_x \frac{\delta H_k}{\delta u(x)} \]
read
\[ u_{t_0} = u_x, \qquad u_{t_1} = \frac{3}{2}\,v\,v_x - \epsilon^2\left[v_x v_{xx} + \frac{1}{2}\,v\,v_{xxx}\right], \ \dots. \]
The last equation reduces to (5.9) after the substitution (5.10). To compare the commuting Hamiltonians with those given in (2.5) one must first reduce the first Poisson bracket to the standard form $\{\tilde u(x), \tilde u(y)\}_1 = \delta'(x-y)$ by the transformation
\[ \tilde u = \left(1 - \epsilon^2\partial_x^2\right)^{-1/2} u = u + \frac{1}{2}\,\epsilon^2 u_{xx} + \frac{3}{8}\,\epsilon^4 u_{xxxx} + \cdots. \]
After the transformation the Camassa–Holm equation will read
\[ \tilde u_t = \frac{3}{2}\,\tilde u\,\tilde u_x + \epsilon^2\left(2\,\tilde u_x\tilde u_{xx} + \tilde u\,\tilde u_{xxx}\right) + \epsilon^4\left(5\,\tilde u_{xx}\tilde u_{xxx} + 3\,\tilde u_x\tilde u_{xxxx} + \tilde u\,\tilde u_{xxxxx}\right) + \cdots. \]
It is easy to see that the commuting Hamiltonians of Camassa–Holm hierarchy are obtained from (2.5) by the specialization
\[ c(u) = 8u, \qquad p(u) = \frac{u}{3}, \qquad q(u) = u, \qquad s(u) = 0. \]
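The coefficients 1/2 and 3/8 in the expansion of the operator (1 − ε²∂x²)^(−1/2) are just the binomial series coefficients of (1 − z)^(−1/2) with z standing for ε²∂x². The sketch below recomputes them in exact rational arithmetic and also checks that the series squares to the geometric series of (1 − z)^(−1), which is why this transformation reduces the first bracket to the standard form. The helper names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def binomial_series_coeffs(exponent, n_terms):
    """Coefficients a_k in (1 - z)**exponent = sum_k a_k z**k, as exact fractions."""
    coeffs = [Fraction(1)]
    for k in range(1, n_terms):
        # Ratio of consecutive coefficients: a_k / a_{k-1} = (k - 1 - exponent) / k.
        coeffs.append(coeffs[-1] * (k - 1 - exponent) / k)
    return coeffs

def multiply_truncated(a, b):
    """Cauchy product of two power series, truncated to the shorter length."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

# z stands for eps^2 d_x^2, so a_1 multiplies eps^2 u_xx and a_2 multiplies eps^4 u_xxxx.
inv_sqrt = binomial_series_coeffs(Fraction(-1, 2), 4)
squared = multiply_truncated(inv_sqrt, inv_sqrt)
```

Here `inv_sqrt[1]` and `inv_sqrt[2]` reproduce the coefficients quoted in the transformation, and `squared` consists of ones, the series of (1 − z)^(−1).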
6. Introducing a Special Function

Let us recall some properties of the differential equation
\[ X = T\,U - \left[\frac{1}{6}\,U^3 + \frac{1}{24}\left(U'^2 + 2\,U\,U''\right) + \frac{1}{240}\,U^{IV}\right] \tag{6.1} \]
often considered as a 4th order analogue of the classical Painlevé-I equation. First, it can be interpreted as a monodromy preserving deformation of the following linear differential operator with polynomial coefficients
\[ \frac{\partial\psi}{\partial z} = W\psi, \tag{6.2} \]
where the matrix $W$ reads
\[ W = -\frac{1}{120}\begin{pmatrix} 12\,UU' + 8zU' + U''' & 2\left(16z^2 + 8zU + 6U^2 + U'' - 60\,T\right) \\ 2\,w_{21} & -12\,UU' - 8zU' - U''' \end{pmatrix}, \]
where
\[ w_{21} = 32z^3 - 16z^2U - 2z\left(2U^2 + U'' + 60\,T\right) + 8U^3 + 2\,UU'' - U'^2 + 120\,X. \]
Indeed, it coincides with the compatibility conditions
\[ W_X - \mathcal{U}_z + [W, \mathcal{U}] = 0 \]
of the linear system (6.2) with
\[ \frac{\partial\psi}{\partial X} = \mathcal{U}\psi, \qquad \mathcal{U} = \begin{pmatrix} 0 & -1 \\ 2U - 2z & 0 \end{pmatrix}. \tag{6.3} \]
Moreover, the dependence of (6.2) on $T$ is isomonodromic iff the function $U(X)$ depends also on the parameter $T$ according to the KdV equation
\[ U_T + U\,U' + \frac{1}{12}\,U''' = 0. \tag{6.4} \]
This is the spelling of the compatibility condition of the linear system (6.2), (6.3) with
\[ \frac{\partial\psi}{\partial T} = \mathcal{V}\psi, \qquad \mathcal{V} = \frac{1}{6}\begin{pmatrix} U' & 2U + 4z \\ 8z^2 - 4zU - 4U^2 - U'' & -U' \end{pmatrix}. \tag{6.5} \]
The Painlevé property readily follows from the isomonodromicity: singularities in the complex $(X, T)$-plane of general solution to (6.1), (6.4) are poles [20].

Main Conjecture, Part 2 (cf. [4]). The ODE (6.1) has unique solution $U = U(X; T)$ smooth for all real $X \in \mathbb{R}$ for all real values of the parameter $T$.

Note that due to the uniqueness the solution in question satisfies the KdV equation (6.4). For $T \ll 0$ the solution of interest is very close to the unique root of the cubic equation
\[ X \simeq T\,U - \frac{U^3}{6}, \]
that is,
\[ U \simeq (-T)^{1/2}\left[w + (-T)^{-7/2}\,\frac{3w^2 - 2}{3\,(w^2+2)^4} - (-T)^{-7}\,w\,\frac{189w^4 - 972w^2 + 436}{9\,(w^2+2)^9} + O\!\left((-T)^{-21/2}\right)\right], \qquad X = -(-T)^{3/2}\left(w + \frac{1}{6}\,w^3\right). \tag{6.6} \]
Same is true for any $T$ for $|X| \gg 0$. For $T \gg 0$ the solution develops oscillations typical for dispersive waves [32] within a region around the origin; one can use the Whitham method to approximate $U(X; T)$ by modulated elliptic functions within the oscillatory zone [18, 29]. Thus the solution in question interpolates between the two types of asymptotic behaviour (cf. [23] where the role of the special solution $U(X; T)$ in the KdV theory was discussed).
The solutions to the fourth order ODE (6.1) can be parametrized [20] by the monodromy data (i.e., the collection of Stokes multipliers) of the linear differential operator (6.3) with coefficients polynomial in $z$. The solution corresponding to given Stokes multipliers can be reconstructed by solving a certain Riemann–Hilbert problem. The particular values of the Stokes multipliers associated with the smooth solution in question have been conjectured in [20].

7. Local Galilean Symmetry and Critical Behaviour

We will now proceed to discussing the universality problem. Consider the perturbed PDE
\[ u_t + \{u(x), H_f\} = u_t + a(u)\,u_x + O(\epsilon^2) = 0, \qquad f''(u) = a(u). \tag{7.1} \]
Let us apply the transformation (3.2) to the unperturbed solution $v = v(x, t)$ of
\[ v_t + a(v)\,v_x = 0 \tag{7.2} \]
obtained by the method of characteristics:
\[ x = a(v)\,t + b(v) \tag{7.3} \]
for some smooth function $b(v)$. Let the solution arrive at the point of gradient catastrophe for some $x = x_0$, $t = t_0$, $v = v_0$. At this point one has
\[ x_0 = a(v_0)\,t_0 + b(v_0), \qquad 0 = a'(v_0)\,t_0 + b'(v_0), \qquad 0 = a''(v_0)\,t_0 + b''(v_0) \tag{7.4} \]
(inflection point). Let us assume the following genericity assumption:
\[ \kappa := -\left(a'''(v_0)\,t_0 + b'''(v_0)\right) \neq 0. \tag{7.5} \]
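The conditions (7.4), (7.5) can be evaluated mechanically once a(v) and b(v) are given. The following is a minimal sketch, assuming the hypothetical choice a(v) = v and b(v) = −(v − 1/2) − (v − 1/2)³ (a decreasing initial profile), with all derivatives supplied by hand. It eliminates t0 from the second equation of (7.4) and solves the third one for v0 by bisection; none of the names below come from the text.

```python
def bisect(g, lo, hi, tol=1e-14):
    """Root of g on [lo, hi] by bisection; assumes g changes sign on the interval."""
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    if g_lo * g(hi) > 0:
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

# Hypothetical example data: a(v) = v and b(v) = -(v - 1/2) - (v - 1/2)^3.
a = lambda v: v
a1 = lambda v: 1.0   # a'
a2 = lambda v: 0.0   # a''
a3 = lambda v: 0.0   # a'''
b = lambda v: -(v - 0.5) - (v - 0.5) ** 3
b1 = lambda v: -1.0 - 3.0 * (v - 0.5) ** 2   # b'
b2 = lambda v: -6.0 * (v - 0.5)              # b''
b3 = lambda v: -6.0                          # b'''

def break_time(v):
    """t0 as a function of v0, from 0 = a'(v0) t0 + b'(v0)."""
    return -b1(v) / a1(v)

def inflection(v):
    """Left-hand side of 0 = a''(v0) t0 + b''(v0) with t0 eliminated."""
    return a2(v) * break_time(v) + b2(v)

v0 = bisect(inflection, 0.0, 1.2)
t0 = break_time(v0)
x0 = a(v0) * t0 + b(v0)
kappa = -(a3(v0) * t0 + b3(v0))
```

For this choice one finds v0 = 1/2, t0 = 1, x0 = 1/2 and κ = 6, so the genericity assumption (7.5) holds and the product κ a'(v0) is positive.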
Let us first recall the universality property for the critical behaviour of the unperturbed solutions: up to shifts, Galilean transformations and rescalings a generic solution to (7.2) near $(x_0, t_0)$ behaves like the cubic root function. We will present this well known statement in the following form. Introduce the new variables
\[ \bar x = x - a_0\,(t - t_0) - x_0, \qquad \bar t = t - t_0, \qquad \bar v = v - v_0. \]
Let us do the following scaling transformation
\[ \bar x \mapsto \lambda\,\bar x, \qquad \bar t \mapsto \lambda^{2/3}\,\bar t, \qquad \bar v \mapsto \lambda^{1/3}\,\bar v. \tag{7.6} \]

Lemma 7.1. After the rescaling (7.6) any generic solution to (7.2) at the limit $\lambda \to 0$ for $t < t_0$ goes to the solution of the cubic equation
\[ \bar x = a_0'\,\bar v\,\bar t - \kappa\,\frac{\bar v^3}{6}. \tag{7.7} \]
In these formulae $a_0 = a(v_0)$, $a_0' = a'(v_0)$. Note that the inequality
\[ \kappa\,a_0' > 0 \tag{7.8} \]
must hold true in order to have the solution well defined for $t < t_0$ near the point of generic gradient catastrophe (7.4). To prove the lemma it suffices to observe that, after the rescaling (7.6) and division by $\lambda$, Eq. (7.3) yields
\[ \bar x = a_0'\,\bar v\,\bar t - \kappa\,\frac{\bar v^3}{6} + O\!\left(\lambda^{1/3}\right). \]
The parameter $\kappa$ can be eliminated from (7.7) by a rescaling. The resulting cubic function can be interpreted as the universal unfolding of the $A_2$ singularity [1]. Our basic observation we are going to explain now is that, after a Hamiltonian perturbation the $A_2$ singularity transforms to the special solution of (1.8) described above.

Let us look for a solution to the perturbed PDE (7.1) in the form of a formal power series
\[ u = u(x, t; \epsilon) = v(x, t) + \sum_{k \ge 1} \epsilon^k\,v_k(x, t) \tag{7.9} \]
with $v(x, t)$ given by (7.3) satisfying (7.1) modulo $O(\epsilon^5)$. We will say that such a solution is monotone at the point $x = x_0$, $t = t_0$ if $u_x(x_0, t_0; 0) \equiv v_x(x_0, t_0) \neq 0$. According to the results of Sect. 3 all monotone solutions of the form (7.9) can be obtained by applying the transformation (3.2) to the nonperturbed solution (7.2) (more precisely, one has to allow $\epsilon$-dependence of the function $b(u)$).

Lemma 7.2. Let us perform the rescaling (7.6) along with
\[ \epsilon \mapsto \lambda^{7/6}\,\epsilon \tag{7.10} \]
in the quasitriviality transformation (3.2). Then the resulting solution to the perturbed PDE will be equal to
\[ u = v_0 + \lambda^{1/3}\left\{\bar v + \partial_x^2\left[\epsilon^2\,\frac{c_0}{24}\,\log \bar v_x + c_0^2\,\epsilon^4\left(\frac{\bar v_{xxxx}}{1152\,\bar v_x^2} - \frac{7\,\bar v_{xx}\bar v_{xxx}}{1920\,\bar v_x^3} + \frac{\bar v_{xx}^3}{360\,\bar v_x^4}\right)\right]\right\} + O\!\left(\lambda^{2/3}\right) \tag{7.11} \]
(cf. (5.1)) where $c_0 = c(v_0)$, $\bar v = \bar v(\bar x, \bar t)$ is the solution to the cubic equation (7.7). Proof is straightforward.
(7.12)
It remains to identify (7.11) with the formal asymptotic solution (6.6) to the ODE (6.1). This can be done by a direct substitution. An alternative way is to observe that, near the point of gradient catastrophe the perturbed PDE acquires an additional Galilean symmetry. Indeed, according to the previous lemma, locally one can replace the functions $c(u)$, $p(u)$ by constants $c_0 = c(v_0)$, $p_0 = p(v_0)$ (the constant $p_0$, however, does not enter in the leading term of the asymptotic expansion in powers of $\lambda^{1/3}$). Let us show that in this situation any solution to the perturbed PDE of the form (7.9) satisfies also a fourth order ODE.

Lemma 7.3. Let $c(u) = c_0$, $p(u) = p_0$. Then for any solution $u(x, t; \epsilon)$ of the form (7.9) monotone at the point $(x_0, t_0)$ there exists a formal series
\[ g(u; \epsilon) = g_0(u) + \sum_{k \ge 1} \epsilon^k\,g_k(u) \]
such that for arbitrary $x$, $t$ sufficiently close to $x_0$, $t_0$ the function $u(x, t; \epsilon)$ satisfies, modulo $O(\epsilon^5)$, the following fourth order ODE:
\[ x = t\,\frac{\delta H_{f'}}{\delta u(x)} + \frac{\delta H_g}{\delta u(x)}. \tag{7.13} \]
Here $g_0'(u) = b(u)$.

Proof. It is easy to see that the flow
\[ u_\tau = 1 - t\,\partial_x \frac{\delta H_{f'}}{\delta u(x)} \tag{7.14} \]
is a symmetry of (7.1). Combining this symmetry with one of the commuting flows
\[ u_s + \partial_x \frac{\delta H_g}{\delta u(x)} = 0 \]
one obtains another symmetry. The set of stationary points of this combination
\[ \partial_x\left(t\,\frac{\delta H_{f'}}{\delta u(x)} + \frac{\delta H_g}{\delta u(x)} - x\right) = 0 \]
is therefore invariant for the t-flow. Considering the limit → 0 it is easy to see that the integration constant vanishes on the solution (3.2), (7.2). The lemma is proved. The ODE for the function u(x) is closely related to the so-called string equation known in matrix models and topological field theory (see, e.g., [9]). Explicitly 2 x = t a(u) + b(u) + c0 t 2 a u x x + a u 2x + 2 b u x x + b u 2x 24
1 2 4 c0 t a + b u x x x x 2 p0 t a + b + + 240
1 c02 t a I V + b I V u x x x u x + 4 p0 t a + b + 120
11 2 V c0 t a + b V u x x u 2x + 4 p0 t a I V + b I V + 1440
1 2 VI 1 p0 t a V + b V + c0 t a + b V I u 4x + O( 5 ). + (7.15) 2 1152
Let us call the solution generic if, along with the condition $\kappa := -\left(a'''(v_0)\,t_0 + b'''(v_0)\right) \neq 0$ it also satisfies
\[ c_0 := c(v_0) \neq 0. \tag{7.16} \]

Main Conjecture, Part 3. The generic solution described in the Main Conjecture, Part 1 can be extended up to $t = t_0 + \delta$ for sufficiently small positive $\delta = \delta(\epsilon)$; near the point $(x_0, t_0)$ it behaves in the following way:
\[ u \simeq v_0 + \left(\frac{\epsilon^2 c_0}{\kappa^2}\right)^{1/7} U\!\left(\frac{x - a_0\,(t - t_0) - x_0}{(\kappa\,c_0^3\,\epsilon^6)^{1/7}};\ \frac{a_0'\,(t - t_0)}{(\kappa^3 c_0^2\,\epsilon^4)^{1/7}}\right) + O\!\left(\epsilon^{4/7}\right). \tag{7.17} \]
Here $U = U(X; T)$ is the solution to the ODE (1.8) specified in the Main Conjecture, Part 2.

To arrive at the asymptotic formula (7.17) we do in (7.15) the rescaling of the form (7.6) along with (7.10). After substitution to Eq. (7.15) and division by $\lambda$, one obtains
\[ \bar x = a_0'\,\bar u\,\bar t - \kappa\left[\frac{\bar u^3}{6} + \frac{\epsilon^2}{24}\,c_0\left(\bar u_x^2 + 2\,\bar u\,\bar u_{xx}\right) + \frac{\epsilon^4}{240}\,c_0^2\,\bar u_{xxxx}\right] + O\!\left(\lambda^{1/3}\right). \]
In derivation of this formula we use that the monomial of the form
\[ \epsilon^k\,u_x^{i_1}\,u_{xx}^{i_2}\,u_{xxx}^{i_3} \cdots \]
after the rescaling will be multiplied by $\lambda^D$ with
\[ D = \frac{1}{6}\,k + \frac{1}{3}\,(i_1 + i_2 + \cdots) \]
due to the degree condition $i_1 + 2\,i_2 + 3\,i_3 + \cdots = k$. Adding the terms of higher order $k > 4$ will not change the leading term. Choosing
\[ \lambda = \epsilon^{6/7}\,c_0^{3/7} \]
we arrive at the needed asymptotic formula. Clearly the above arguments require existence and uniqueness of the solution to (1.8) smooth on the real line described in the Main Conjecture, Part 2.
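The exponent bookkeeping behind the degree D and the fractional powers in (7.17) is easy to re-derive by machine. The sketch below is only an illustration under stated assumptions: monomials in κ, c0 and ε are encoded as exponent triples, the slowly varying factor a0' is ignored, and every helper name is hypothetical. It checks that each term of the rescaled equation balances against the scale of x.

```python
from fractions import Fraction as F

def lambda_degree(k, multiplicities):
    """Power of lambda multiplying eps^k u_x^{i1} u_xx^{i2} ... under (7.6) and (7.10).

    multiplicities = (i1, i2, ...); the degree condition i1 + 2*i2 + ... = k is enforced.
    """
    weight = sum(order * i for order, i in enumerate(multiplicities, start=1))
    if weight != k:
        raise ValueError("degree condition i1 + 2*i2 + ... = k violated")
    # eps contributes 7/6 per power, each factor of u contributes 1/3,
    # and each x-derivative contributes -1.
    return F(7, 6) * k + F(1, 3) * sum(multiplicities) - weight

def mono_mul(*monos):
    """Multiply monomials kappa^a c0^b eps^c given as exponent triples (a, b, c)."""
    return tuple(sum(column) for column in zip(*monos))

def mono_pow(mono, power):
    return tuple(e * power for e in mono)

KAPPA, C0, EPS = (F(1), F(0), F(0)), (F(0), F(1), F(0)), (F(0), F(0), F(1))

# Scales read off from (7.17): amplitude of U, scale of X, scale of T (without a0').
amplitude = mono_pow(mono_mul(mono_pow(EPS, 2), C0, mono_pow(KAPPA, -2)), F(1, 7))
x_scale = mono_pow(mono_mul(KAPPA, mono_pow(C0, 3), mono_pow(EPS, 6)), F(1, 7))
t_scale = mono_pow(mono_mul(mono_pow(KAPPA, 3), mono_pow(C0, 2), mono_pow(EPS, 4)), F(1, 7))

# Each term of the rescaled equation must carry the same scale as x itself.
cubic_term = mono_mul(KAPPA, mono_pow(amplitude, 3))
eps2_term = mono_mul(KAPPA, mono_pow(EPS, 2), C0, mono_pow(amplitude, 2), mono_pow(x_scale, -2))
eps4_term = mono_mul(KAPPA, mono_pow(EPS, 4), mono_pow(C0, 2), amplitude, mono_pow(x_scale, -4))
linear_term = mono_mul(amplitude, t_scale)
```

For instance, `lambda_degree(4, (0, 0, 0, 1))` equals 1 for the monomial ε⁴ u_xxxx, which therefore survives in the leading order, while `lambda_degree(4, (1, 0, 1))` equals 4/3 for ε⁴ u_x u_xxx, which drops out.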
8. Concluding Remarks

We have presented arguments supporting the conjectural universality of critical behaviour of solutions to generic Hamiltonian perturbations of a hyperbolic equation of the form (1.2). In subsequent publications we will study the Main Conjecture in more detail. The possibilities of using the idea of Universality in numerical algorithms for dealing with oscillatory behaviour of solutions to Hamiltonian PDEs will be explored. We will also proceed to the study of singularities of generic solutions to integrable Hamiltonian hyperbolic systems of conservation laws
\[ u^i_t + \partial_x\left(\eta^{ij}\,\frac{\partial h(u)}{\partial u^j}\right) = 0, \qquad \eta^{ji} = \eta^{ij}, \qquad \det(\eta^{ij}) \neq 0. \tag{8.1} \]
Recall that, according to the results of [31] the system (8.1) is integrable if it diagonalizes in a system of curvilinear coordinates $v^k = v^k(u)$, $k = 1, \dots, n$ for the Euclidean/pseudo-Euclidean metric
\[ ds^2 = \eta_{ij}\,du^i\,du^j = \sum_{k=1}^{n} g_k(v)\,(dv^k)^2, \qquad (\eta_{ij}) := (\eta^{ij})^{-1}, \]
\[ v^k_t + \lambda^k(v)\,v^k_x = 0, \qquad k = 1, \dots, n \]
(in this formula there is no summation over repeated indices!). All Hamiltonian perturbations of the hyperbolic system (8.1) can be written in the form
\[ u^i_t + \partial_x\,\eta^{ij}\,\frac{\delta H}{\delta u^j(x)} = 0, \qquad H = \int\Big[h(u) + \sum_{k \ge 1} \epsilon^k\,h_k(u; u_x, \dots, u^{(k)})\Big]\,dx, \qquad \deg h_k = k. \]
We plan to study symmetries of the perturbed Hamiltonian hyperbolic systems. In particular, we will classify the perturbations preserving integrability and study the correspondence between the types of critical behaviour of the perturbed and unperturbed systems. The next step would be to extend our approach to Hamiltonian perturbations of spatially multidimensional hyperbolic systems (cf. [8]).

Appendix: Bihamiltonian Structures Associated with the Perturbations of the Riemann Wave Hierarchy

Theorem A.1. For arbitrary two functions $c = c(u) \neq 0$, $q = q(u)$ the family of Hamiltonians (2.5) with
\[ p(u) = \frac{c^2}{960}\left[5\,\frac{c'}{c} - \frac{q''}{q'}\right], \qquad s(u) = 0 \tag{A.1} \]
is commutative
\[ \{H_f, H_g\}_{1,2} = 0 \ \ \mathrm{mod}\ O(\epsilon^6) \qquad \forall\, f = f(u),\ \forall\, g = g(u) \tag{A.2} \]
with respect to the Poisson pencil of the form
\[ \{u(x), u(y)\}_1 = \delta'(x-y), \]
\[ \{u(x), u(y)\}_2 = \{u(x), u(y)\}_2^{[0]} + \epsilon^2\,\{u(x), u(y)\}_2^{[2]} + \epsilon^4\,\{u(x), u(y)\}_2^{[4]} + O(\epsilon^6). \]
Here the terms of order 0:
\[ \{u(x), u(y)\}_2^{[0]} = q(u)\,\delta'(x-y) + \frac{1}{2}\,q'(u)\,u_x\,\delta(x-y). \]
All terms of higher orders are uniquely determined from the bicommutativity (A.2) provided validity of the constraint (A.1). Namely, the terms of order 2:
cq 3 δ (x − y) + cq u x δ (x − y) {u(x), u(y)}[2] 2 = 8 16
c q c q 5cq c q u x x 7cq u x x 2 + + + ux + + δ (x − y) 16 6 48 16 48 cq c q c q cq (4) 3 1 + + c q + cq u x u x x + u x x x δ(x−y). + ux + 48 24 48 12 24 The terms of order 4: 1 5 2 {u(x), u(y)}[4] 3cc q + c2 q δ V (x − y)+ 3cc q +c q u x δ I V (x − y) 2 = 192 384 3c c q cc q 3c 2 q 5cc q cc q 2 u x 2 + + + − + 32 32 32 48 240q c2 q 3 19cc q 3c2 q q c2 q (4) + + + − ux 2 192 640q 64 480q 2 c2 q 2 19c2 q 3c 2 q 3cc q 17cc q + + − + + u x x δ (x − y) 64 64 192 480q 960 3c 2 q c c q cc(4) q 19c c q 23cc q 5cc q (4) 7cc q + + + + + + + 128 32 128 128 384 64 64 c 2 q 2 cc q 2 cc q 3 c2 q 4 17cc q q c2 q (5) 3c 2 q + − − + − − 96 32 160q 160q 640q 80q 2 160q 3 3cc q 21c2 q 2 q 9c2 q 2 9c2 q q (4) 9c c q 3 + + − − + u x 1280q 1280q 64 64 1280q 2 +
3cc q 2 3c2 q 3 69cc q 11c 2 q 13cc q + − + + 64 64 160q 320 320q 2 13c2 q q 3c2 q (4) + − ux uxx 640q 80 c 2 q cc q 13cc q c2 q 2 c2 q + + − + + u x x x δ (x − y) 32 32 192 320q 60
+
+
c 2 q c c q cc(4) q c c q 2 cc q 2 c 2 q 3 cc q 3 + + − − + + 48 32 96 160q 480q 160q 2 160q 2 − + +
cc q 4 80q 3
+
c2 q 5 160q 4
11cc q 2 q 320q 2
−
+
9c 2 q q 35c c q 5cc q 9cc q q + − − 384 128 640q 640q
13c2 q 3 q 640q 3
−
cc q 2 19c2 q q 2 17c 2 q (4) + + 64q 384 1280q 2
cc q q (4) 17c2 q 2 q (4) 5cc q (4) 11c2 q q (4) 35cc q (5) − + − + 96 64q 1280q 1152 1920q 2
11c2 q q (5) c2 q (6) − + 3840q 288 +
4
ux +
3c 2 q c c q cc(4) q 91c c q + + + 128 32 128 384
c 2 q 2 37cc q cc q 2 cc q 3 c2 q 4 59c 2 q 53cc q − + − + − + 384 60q 60q 320 240 30q 2 60q 3
47cc q q 173c2 q 2 q 77c2 q 2 169cc q (4) 77c2 q q (4) + − + − 640q 3840q 960 3840q 3840q 2 cc q 2 73c2 q (5) 3c c q cc q 5c 2 q cc q + + + − + u x 2u x x + 2880 128 128 96 16 80q 5c2 q q 31c2 q (4) c2 q 3 157cc q − + + + uxx 2 2 1920 384q 1920 160q 3c c q cc q c 2 q 3cc q cc q 2 + + + − + 64 64 12 32 60q 11c2 q q 11c2 q (4) c2 q 3 19cc q − + + + ux uxxx 160 640q 480 120q 2 c 2 q cc q 11cc q c2 q 2 17c2 q + + − + + u x x x x δ (x − y) 128 128 384 320q 1920 −
+
c c q q c 2 q c c q cc(4) q cc q q c 2 q 2 q + + − − + 192 128 384 640q 1920q 640q 2 +
cc q 2 q 640q 2
− +
c2 q 2 q 2 320q 3
cc q 2 q (4) 320q 2
− + −
cc q 3 q 320q 3 c2 q 3 1280q 2
+
c2 q 3 q (4) 640q 3
+
c2 q 4 q 640q 4
−
c 2 q 2 cc q 2 3cc q q 2 − + 640q 640q 640q 2
c 2 q q (4) 7c c q (4) cc q (4) cc q q (4) + − − 384 128 640q 640q 2
−
3cc q q (4) 13c2 q q q (4) c2 q (4) + − 2 640q 1280q 3840q
cc q q (5) c2 q 2 q (5) 17c 2 q (5) 5cc q (5) c2 q q (5) 5cc q (6) + − + − + 2304 576 640q 960q 1152 1280q 2 2 2 (6) 2 (7) (4) c q cc q cc q c c q 2 c q c q q 5 + + − + + − u x 3840q 2304 64 48 192 160q
+
−
cc q 2 c 2 q 3 cc q 3 cc q 4 c2 q 5 97c c q + + − + + 480q 960 160q 2 160q 2 80q 3 160q 4
+
c 2 q q 13cc q cc q q 19cc q 2 q 11c2 q 3 q − − + − 320 60q 60q 480q 2 480q 3
−
cc q q (4) 11c2 q 2 q (4) cc q 2 3c2 q q 2 19c 2 q (4) 67cc q (4) + − + + + 48q 320 960 48q 160q 2 960q 2
c2 q q (4) 131cc q (5) c2 q q (5) c2 q (6) − + + − u x 3u x x 80q 2880 240q 180 7c c q 7cc q 7c 2 q 2 7cc q 2 7cc q 3 7c2 q 4 + − + − + − 128 384 960q 960q 480q 2 960q 3 59c 2 q 23cc q 3c2 q 2 131cc q (4) cc q q 13c2 q 2 q + − + + − 960 320 30q 320q 1920 640q 2 3c c q cc q c 2 q 2 cc q 2 3c2 q q (4) 31c2 q (5) 2 + − + u + − − u x x x 320q 2880 64 64 160q 160q
+
+
cc q 3 80q 2
47c 2 q 13cc q 13cc q q c2 q 2 q + + − 960 240 480q 160q 3 60q 2 7c2 q q (4) 23c2 q (5) 49cc q (4) − + + u x 2u x x x 960 960q 2880
−
c2 q 4
+
7c2 q 2 960q cc q 2 5c 2 q 5cc q c2 q 3 3cc q + − + + + 2 192 192 96q 64 192q c 2 q cc q c2 q q c2 q (4) cc q 2 c2 q 3 − + − + u + + u x x x x x 96q 96 64 64 160q 320q 2 c2 q q c2 q (4) 9cc q − + + ux uxxxx 320 160q 160 cc q c2 q 2 c2 q − + + u x x x x x δ(x − y). 192 960q 480 −
To prove the theorem one has to analyze the commutativity conditions
\[ E\left(\frac{\delta H_f}{\delta u(x)}\,L\,\frac{\delta H_g}{\delta u(x)}\right) = 0 \]
for arbitrary two functions $f(u)$, $g(u)$. Here
\[ L = q\,\partial_x + \frac{1}{2}\,q'\,u_x - \frac{\epsilon^2}{8}\,c\,q'\,\partial_x^3 + \cdots \]
is the Hamiltonian differential operator associated with the second Hamiltonian structure. To prove validity of Jacobi identity one has to check that the $\epsilon$-terms in the second Hamiltonian structure can be eliminated by the quasitriviality transformation described in Sect. 3. We will omit the calculations.

Observe that the family of bihamiltonian structures given in Theorem A.1 depends on two arbitrary functions $c = c(u)$, $q = q(u)$, in agreement with the results of [26]. It is understood that the Jacobi identity for the Poisson pencil holds true identically in $\lambda$ modulo terms of the order $O(\epsilon^6)$.

Acknowledgements. This work is partially supported by European Science Foundation Programme “Methods of Integrable Systems, Geometry, Applied Mathematics" (MISGAM), Marie Curie RTN “European Network
in Geometry, Mathematical Physics and Applications" (ENIGMA), and by Italian Ministry of Universities and Researches (MIUR) research grant PRIN 2004 “Geometric methods in the theory of nonlinear waves and their applications".
References 1. Arnold, V.I., Gusein-Zade, S.M., Varchenko, A.N.: Singularities of differentiable maps. Vol. I. The classification of critical points, caustics and wave fronts. Monographs in Mathematics 82. Boston, MA: Birkhäuser Boston, Inc., 1985 2. Baikov, V.A., Gazizov, R.K., Ibragimov, N.Kh.: Approximate symmetries and formal linearization. PMTF 2, 40–49 (1989) (In Russian) 3. Bressan, A.: One dimensional hyperbolic systems of conservation laws. In: Current developments in mathematics, 2002, Somerville, MA: Int. Press, 2003, pp. 1–37 4. Brézin, É., Marinari, E., Parisi, G.: A nonperturbative ambiguity free solution of a string model. Phys. Lett. B 242, 35–38 (1990) 5. Camassa, R., Holm, D.D.: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 (1993) 6. Degiovanni, L., Magri, F., Sciacca,V.: On deformation of Poisson manifolds of hydrodynamic type. Commun. Math. Phys. 253, no. 1, 1–24 (2005) 7. Dickey, L.A.: Soliton equations and Hamiltonian systems. Second edition. Advanced Series in Mathematical Physics 26. River Edge, NJ: World Scientific Publishing Co., Inc., 2003 8. Dobrokhotov, S., Pankrashkin, K., Semenov, E.: On Maslov’s conjecture on the structure of weak point singularities of the shallow water equations. Dokl. Akad. Nauk 379, no. 2, 173–176 (2001); English translation: Doklady Math. 64, 127–130 (2001) 9. Dubrovin, B., Zhang, Y.: Normal forms of integrable PDEs, Frobenius manifolds and Gromov-Witten invariants. http://arxiv.org/list/math.DG/0108160, 2001 10. Dubrovin, B., Liu, S.-Q., Zhang, Y.: On hamiltonian perturbations of hyperbolic systems of conservation laws, I: quasitriviality of bihamiltonian perturbations. Comm. Pure and Appl. Math. 59, 559–615 (2006) 11. Dubrovin, B., Novikov, S.P.: Hamiltonian formalism of one-dimensional systems of the hydrodynamic type and the Bogolyubov-Whitham averaging method. Dokl. Akad. Nauk SSSR 270, no. 4, 781–785 (1983); English translation: Soviet Math. Dokl. 
27, 665–669 (1983) 12. El, G.A.: Resolution of a shock in hyperbolic systems modified by weak dispersion. Chaos 15, 037103 (2005) 13. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian methods in the theory of solitons. Springer Series in Soviet Mathematics, Berlin: Springer-Verlag, 1987 14. Fokas, A.S.: On a class of physically important integrable equations. Physica D 87, 145–150 (1995) 15. Getzler, E.: A Darboux theorem for Hamiltonian operators in the formal calculus of variations. Duke Math. J. 111, 535–560 (2002) 16. Grava, T., Klein, C.: Numerical solution of the small disperion limit of the KdV equation and Whitham equations. http://arxiv.org/list/math-ph/0511011, 2005 17. Gurevich, A., Meshcherkin, A.: Expanding self-similar discontinuities and shock waves in dispersive hydrodynamics. Sov. Phys. JETP 60, 732–740 (1984) 18. Gurevich, A., Pitaevski, L.: Nonstationary structure of a collisionless shock wave. Sov. Phys. JETP Lett. 38, 291–297 (1974) 19. Hou, T.Y., Lax, P.D.: Dispersive approximations in fluid dynamics. Comm. Pure Appl. Math. 44, 1–40 (1991) 20. Kapaev, A.A.:Weakly nonlinear solutions of the equation P12 . Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 187 (1991), Differentsialnaya Geom. Gruppy Li i Mekh. 12, 88–109, 172–173, 175; translation in J. Math. Sci. 73, no. 4, 468–481 (1975) 21. Khesin, B., Misiołek, G.: Euler equations on homogeneous spaces and Virasoro orbits. Adv. Math. 176, 116–144 (2003) 22. Kodama, Y., Mikhailov, A.: Obstacles to asymptotic integrability. In: Algebraic aspects of integrable systems, Progr. Nonlinear Differential Equations Appl. 26, Boston, MA: Birkhäuser, 1997, pp. 173–204 23. Kudashev, V., Suleimanov, B.: A soft mechanism for the generation of dissipationless shock waves. Phys. Lett. A 221, 204–208 (1996) 24. Lax, P., Levermore, D.: The small dispersion limit of the Korteweg-de Vries equation. I, II, III. Comm. Pure Appl. Math. 36, 253–290, 571–593, 809–829 (1983) 25. Lax, P. 
D., Levermore, C.D., Venakides, S.: The generation and propagation of oscillations in dispersive initial value problems and their limiting behavior. In: Important developments in soliton theory, Springer Ser. Nonlinear Dynam., Berlin: Springer, 1993, pp. 205–241
26. Liu, S.Q., Zhang, Y.: Deformations of semisimple bihamiltonian structures of hydrodynamic type. J. Geom. Phys. 54, 427–453 (2005) 27. Liu, S.Q., Zhang, Y.: On quasitriviality of a class of scalar evolutionary PDEs. J. Geom. Phys., 2006, to appear. http://arxiv.org/list/ nlin.SI/0510019, 2005 28. Lorenzoni, P.: Deformations of bihamiltonian structures of hydrodynamic type. J. Geom. Phys. 44, 331– 375 (2002) 29. Potëmin, G.: Algebro-geometric construction of self-similar solutions of the Whitham equations. Uspekhi Mat. Nauk 43, no. 5(263), 211–212 (1988); translation in Russ. Math. Surv. 43, 252–253 (1988) 30. Strachan, I.A.B.: Deformations of the Monge/Riemann hierarchy and approximately integrable systems. J. Math. Phys. 44, 251–262 (2003) 31. Tsarëv, S.P.: The geometry of Hamiltonian systems of hydrodynamic type. The generalized hodograph method, Izv. Akad. Nauk SSSR Ser. Mat. 54, no. 5, 1048–1068 (1990); English translation in Math. USSR-Izv. 37, 397–419 (1991) 32. Zabusky, N.J., Kruskal, M.D.: Interaction of “solitons" in a collisionless plasma and the recurrence of initial states. Phys. Rev. Lett. 15, 240–243 (1965) Communicated by P. Constantin
Commun. Math. Phys. 267, 141–157 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0023-3
Communications in
Mathematical Physics
Dissipative Quasi-Geostrophic Equation for Large Initial Data in the Critical Sobolev Space Hideyuki Miura Mathematical Institute, Tohoku University Sendai, 980-8578, Japan. E-mail:
[email protected] Received: 13 October 2005 / Accepted: 9 December 2005 Published online: 9 May 2006 – © Springer-Verlag 2006
Abstract: The critical and super-critical dissipative quasi-geostrophic equations are investigated in $\mathbb{R}^2$. We prove local existence of a unique regular solution for arbitrary initial data in $H^{2-2\alpha}$ which corresponds to the scaling invariant space of the equation. We also consider the behavior of the solution near $t = 0$ in the Sobolev space.

1. Introduction

Let us consider the two dimensional dissipative quasi-geostrophic equation:
\[ \begin{cases} \dfrac{\partial\theta}{\partial t} + (-\Delta)^{\alpha}\theta + u\cdot\nabla\theta = 0 & \text{in } \mathbb{R}^2\times(0,\infty), \\ u = (-R_2\theta,\ R_1\theta) & \text{in } \mathbb{R}^2\times(0,\infty), \\ \theta|_{t=0} = \theta_0 & \text{in } \mathbb{R}^2, \end{cases} \tag{DQG$_\alpha$} \]
where the scalar function $\theta$ and the vector field $u$ denote the potential temperature and the fluid velocity, respectively, and $\alpha$ is a non-negative constant. $R_i = \frac{\partial}{\partial x_i}(-\Delta)^{-1/2}$ $(i = 1, 2)$ represents the Riesz transform. We are concerned with the initial value problem for this equation. It is known that (DQG$_\alpha$) is an important model in geophysical fluid dynamics. Indeed, it is derived from general quasi-geostrophic equations in the special case of constant potential vorticity and buoyancy frequency. Since there are a number of applications to the theory of oceanography and meteorology, much mathematical research has been devoted to the equation. The case $\alpha = 1/2$ is called critical since its structure is quite similar to that of the 3-dimensional Navier-Stokes equations. The case $\alpha > 1/2$ is called sub-critical and $\alpha < 1/2$ is called super-critical, respectively. In the sub-critical cases, Constantin and Wu [5], Wu [15] proved global existence of a unique regular solution. However, in the critical and super-critical cases, global well-posedness for large initial data is still open. In the critical case, Constantin, Cordoba and Wu [4] constructed a
global regular solution for the initial data in $H^1$ with small $L^\infty$ norm. In both critical and super-critical cases, Chae and Lee [2] and Ju [9] proved global existence of a unique regular solution for the initial data in the Besov space $B^{2-2\alpha}_{2,1}$ and $H^{2-2\alpha}$ under the smallness assumption of each homogeneous norm, respectively. For large initial data, Cordoba-Cordoba [6] proved local existence of a regular solution for the initial data in $H^s$ with $s > 2 - \alpha$. Ju [9, 10] improved the admissible exponent up to $s > 2 - 2\alpha$. In this paper we show local existence of a unique regular solution with initial data in $H^{2-2\alpha}$ for both critical and super-critical cases. Ju [10] conjectured the existence of a local $H^1$ solution in the critical case without smallness assumption on the initial data. Our theorem gives a positive answer to his question. Moreover, our theorem enlarges the class of initial data for which the local regular solution can be constructed. Indeed, $H^{2-2\alpha}$ is larger than $H^s$ $(s > 2 - 2\alpha)$. See Remark 1 below.

Here the exponent $2 - 2\alpha$ is important, because this is the borderline case with respect to the scaling. We observe that if $\theta(x, t)$ is the solution of (DQG$_\alpha$), then $\theta_\lambda(x, t) \equiv \lambda^{2\alpha-1}\theta(\lambda x, \lambda^{2\alpha}t)$ is also a solution of (DQG$_\alpha$). Then the homogeneous space $\dot H^{2-2\alpha}$ is called scaling invariant, since $\|\theta_\lambda(\cdot, 0)\|_{\dot H^{2-2\alpha}} = \|\theta(\cdot, 0)\|_{\dot H^{2-2\alpha}}$ holds for all $\lambda > 0$. The scaling invariant spaces play an important role in the theory of nonlinear partial differential equations. When an equation has a scaling invariance, the scaling invariant space is expected to be the most suitable one in which to construct a unique regular solution. (See, e.g. Danchin [7], Koch-Tataru [11].)

We now sketch the idea of our proof. In contrast with other equations, it seems to be difficult to prove the local existence of regular solutions by the classical approach such as Fujita-Kato’s argument [8]. As is pointed out in [2], it is difficult to find an appropriate space $E$ which yields the following continuous bilinear estimate of the Duhamel term:
\[ \left\|\int_0^{\cdot} e^{-(\cdot - s)(-\Delta)^{\alpha}}\,(u\cdot\nabla\theta)(s)\,ds\right\|_E \le C\,\|\theta\|_E^2. \]
For $\alpha \le 1/2$, we see the linear part $(-\Delta)^{\alpha}\theta$ in (DQG$_\alpha$) is too weak to control the nonlinear term $u\cdot\nabla\theta$. In fact, the smoothing property of the semigroup $e^{-t(-\Delta)^{\alpha}}$ is not enough to overcome the loss of derivatives in the nonlinear term. To avoid this difficulty, in [2, 9] they applied the cancellation property of the equation to construct the small global solution. However, it seems to be difficult to adopt their method to deal with the large initial data. So, in this paper we introduce a modified version of Fujita-Kato’s argument. To be precise, we derive a family of integral inequalities on the Littlewood-Paley decomposition of the solution, which makes it possible to utilize the cancellation property of the equation. In the usual Fujita-Kato argument, such cancellation property seems to be unavailable. In order to apply the cancellation property, we establish a new commutator estimate associated with the Littlewood-Paley operator in the Sobolev space. Such an inequality plays a crucial role in estimating the nonlinear term. Combining the cancellation property with the commutator estimate we obtain a priori estimates in the scaling invariant spaces. Thus we construct the local solution for large initial data in $H^{2-2\alpha}$. As a byproduct of our approach, we can obtain weighted (in time) estimates of the solution near $t = 0$ in higher order Sobolev spaces.

The paper is organized as follows. In Sect. 2, we define some function spaces and give the precise statement of our theorem. Section 3 is devoted to establishing some useful estimates such as the commutator estimate. Finally in Sect. 4 we prove the theorem.
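The scaling invariance described in the Introduction amounts to counting powers of λ. The following minimal sketch, with hypothetical helper names, records the power of λ produced by each term of the equation under θ_λ(x, t) = λ^(2α−1) θ(λx, λ^(2α) t) and by the homogeneous Sobolev norm in two dimensions. It relies on two standard facts that are assumptions of this illustration: the Riesz transforms are operators of order zero, and the norm of f(λ·) in the homogeneous Sobolev space of order s on R² equals λ^(s−1) times the norm of f.

```python
from fractions import Fraction

def term_exponents(alpha):
    """Powers of lambda carried by the three terms of the equation after rescaling."""
    amplitude = 2 * alpha - 1  # prefactor of theta_lambda
    return {
        "time_derivative": amplitude + 2 * alpha,    # d/dt acts on lambda^{2 alpha} t
        "dissipation": amplitude + 2 * alpha,        # (-Laplacian)^alpha has order 2 alpha
        "transport": amplitude + (amplitude + 1),    # u scales like theta, the gradient adds 1
    }

def sobolev_norm_exponent(alpha, s, dimension=2):
    """Power of lambda in the homogeneous Sobolev norm of order s of theta_lambda(., 0)."""
    return (2 * alpha - 1) + s - Fraction(dimension, 2)

sample_alphas = [Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)]
term_checks = [len(set(term_exponents(al).values())) == 1 for al in sample_alphas]
norm_checks = [sobolev_norm_exponent(al, 2 - 2 * al) == 0 for al in sample_alphas]
```

All three terms carry the common power 4α − 1, so θ_λ solves the same equation, and the norm exponent vanishes exactly at s = 2 − 2α.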
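2. Definitions and the Statement of the Theorem

In this section we define some function spaces and then state the main theorem. Throughout this paper we deal with the two-dimensional space $\mathbb{R}^2$. Let us first recall the definition of the Sobolev space. We define $\mathcal{Z}'$ as the topological dual space of $\mathcal{Z}$ defined by
\[ \mathcal{Z} \equiv \Big\{ f \in \mathcal{S};\ \int x^{\alpha} f(x)\,dx = 0 \ \text{for all } \alpha \in \mathbb{N}^2 \Big\}, \]
where $\mathcal{S}$ denotes the space of Schwartz functions. Let $\{\phi_j\}_{j=-\infty}^{\infty}$ be the Littlewood-Paley decomposition of unity, i.e. $\hat\phi \in C_0^{\infty}(\mathbb{R}^2 \setminus \{0\})$, $\mathrm{supp}\,\hat\phi \subset \{\xi \in \mathbb{R}^2;\ 1/2 \le |\xi| \le 2\}$ and $\sum_{j=-\infty}^{\infty} \hat\phi(2^{-j}\xi) \equiv 1$ except $\xi = 0$. We define the Littlewood-Paley operator $\Delta_j$ as $\Delta_j = \phi_j *$, where $\mathcal{F}(\phi_j)(\xi) = \hat\phi(2^{-j}\xi)$. For $1 < p < \infty$, we define the homogeneous and inhomogeneous Sobolev spaces $\dot H^{s,p}$ and $H^{s,p}$ by
\[ \dot H^{s,p} \equiv \Big\{ f \in \mathcal{Z}';\ \|f\|_{\dot H^{s,p}} \equiv \Big\| \Big( \sum_{j\in\mathbb{Z}} \big(2^{sj}|\Delta_j f|\big)^2 \Big)^{1/2} \Big\|_{p} < \infty \Big\} \quad \text{for } s \in \mathbb{R}, \]
and
\[ H^{s,p} \equiv \big\{ f \in \mathcal{S}';\ \|f\|_{H^{s,p}} \equiv \|f\|_{L^p} + \|f\|_{\dot H^{s,p}} < \infty \big\} \quad \text{for } s > 0, \]
respectively. We abbreviate $\dot H^{s,2} = \dot H^s$ and $H^{s,2} = H^s$.

Remark. Let $\mathcal{P}$ be the set of all polynomials. Then $\mathcal{Z}' \cong \mathcal{S}'/\mathcal{P}$ holds. Since we cannot distinguish zero from other polynomials in $\mathcal{S}'/\mathcal{P}$, $\dot H^{s,p}$ seems not to be appropriate as function spaces to treat equations. Fortunately, if the exponents $s$ and $p$ satisfy the condition $s < 2/p$, then $\dot H^{s,p}$ can be regarded as a subspace of $\mathcal{S}'$. Indeed, for $s < 2/p$, we have
\[ \dot H^{s,p} \cong \Big\{ f \in \mathcal{S}';\ \|f\|_{\dot H^{s,p}} < \infty \ \text{and}\ f = \sum_{j\in\mathbb{Z}} \Delta_j f \ \text{in } \mathcal{S}' \Big\}. \]
For the details, see, e.g. Kozono-Yamazaki [12].

Now we state the main theorem of this paper.

Theorem 1. Let $0 < \alpha \le 1/2$. Suppose that the initial data $\theta_0 \in H^{2-2\alpha}$. Then there exist a positive constant $T$ and a unique solution $\theta$ of (DQG$_\alpha$) in $L^{\infty}(0, T; H^{2-2\alpha}) \cap L^2(0, T; \dot H^{2-\alpha})$. Moreover such a solution $\theta$ belongs to $C([0, T); H^{2-2\alpha})$ and it satisfies the following estimate:
\[ \sup_{0<t<T} t^{\frac{\beta}{2\alpha}}\,\|\theta(t)\|_{\dot H^{2-2\alpha+\beta}} < \infty \quad \text{for } 0 \le \beta < 2\alpha. \tag{2.1} \]
In particular, we have
\[ \lim_{t\to 0} t^{\frac{\beta}{2\alpha}}\,\|\theta(t)\|_{\dot H^{2-2\alpha+\beta}} = 0 \quad \text{for } 0 < \beta < 2\alpha. \tag{2.2} \]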
Remark 1. (i) Ju [9, 10] proved the local existence of a unique solution for initial data in $H^s$ with $s > 2-2\alpha$. Theorem 1 improves his result with respect to the space of initial data; indeed, $H^{2-2\alpha}$ is larger than $H^s$ for $s > 2-2\alpha$.
(ii) In contrast with Chae-Lee [2] and Ju [9], we use a Fujita-Kato type argument to construct the solution. This approach provides the weighted estimate (2.1) of the solution in higher order Sobolev spaces.
(iii) Ju [9] proved global existence of a solution for initial data in $H^{2-2\alpha}$ with small homogeneous norm. Theorem 1 can be regarded as the local version of his result. In fact, by the argument of our proof, one can also prove a similar global existence theorem:

Corollary 1. There exists a positive constant $\varepsilon$ such that if the initial data $\theta_0 \in H^{2-2\alpha}$ satisfies $\|\theta_0\|_{\dot H^{2-2\alpha}} < \varepsilon$, then one can take $T = \infty$ in Theorem 1.

3. Littlewood-Paley Operator and the Commutator Estimate

In this section we recall several estimates related to the Littlewood-Paley operator. Throughout this paper we denote by $C$ (or $C'$, etc.) a positive constant whose value may differ from one occasion to another. On the other hand, we denote by $C_i$ ($i = 1, 2, \dots$) certain fixed constants. We recall Bernstein's inequality.

Lemma 1. (i) Let $s \in \mathbb R$, $1 \le p \le \infty$. Then there exist positive constants $C = C(s,p)$ and $C' = C'(s,p)$ such that
\[ C 2^{js}\|\Delta_j f\|_{L^p} \le \|(-\Delta)^{s/2}\Delta_j f\|_{L^p} \le C' 2^{js}\|\Delta_j f\|_{L^p} \]
holds for all $j \in \mathbb Z$.
(ii) Let $1 \le p \le q \le \infty$. Then there exists a positive constant $C = C(p,q)$ such that
\[ \|\Delta_j f\|_{L^q} \le C 2^{(2/p-2/q)j}\|\Delta_j f\|_{L^p} \]
holds for all $j \in \mathbb Z$.

We prepare various product estimates in the Sobolev space. For this purpose we recall the paraproduct formula introduced by Bony [1]. The paraproduct operators are defined by
\[ T_f g \equiv \sum_{j\in\mathbb Z} S_j f\,\Delta_j g, \qquad R(f,g) \equiv \sum_{|i-j|\le 2} \Delta_i f\,\Delta_j g, \]
where $S_j f \equiv \sum_{k\le j-3}\Delta_k f$. Then we have the formal expression for the product:
\[ fg = T_f g + T_g f + R(f,g). \]
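The decomposition $fg = T_f g + T_g f + R(f,g)$ is purely a regrouping of the pairs $(\Delta_i f, \Delta_j g)$, and this can be checked numerically on a toy model. The sketch below is not the setting of the paper: it assumes a periodic one-dimensional grid, sharp dyadic cut-offs in place of smooth Littlewood-Paley functions, and finitely many blocks; all function names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Plain discrete Fourier transform (O(n^2), fine for a toy example)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def dyadic_blocks(x):
    """Split x into blocks: block 0 holds frequency 0, block j >= 1 holds
    the frequencies with 2^(j-1) <= |k| < 2^j (sharp cut-off, an assumption)."""
    n = len(x)
    X = dft(x)
    nblocks = (n // 2).bit_length() + 1
    spectra = [[0j] * n for _ in range(nblocks)]
    for k in range(n):
        freq = k if k <= n // 2 else k - n  # signed frequency
        spectra[abs(freq).bit_length()][k] = X[k]
    return [idft(S) for S in spectra]


def paraproduct(fb, gb):
    """T_f g = sum_j S_j f * Delta_j g with S_j f = sum_{k <= j-3} Delta_k f."""
    n = len(fb[0])
    out = [0j] * n
    for j, gj in enumerate(gb):
        for k in range(max(j - 2, 0)):  # k = 0, ..., j-3
            for m in range(n):
                out[m] += fb[k][m] * gj[m]
    return out


def remainder(fb, gb):
    """R(f, g) = sum over |i - j| <= 2 of Delta_i f * Delta_j g."""
    n = len(fb[0])
    out = [0j] * n
    for i, fi in enumerate(fb):
        for j, gj in enumerate(gb):
            if abs(i - j) <= 2:
                for m in range(n):
                    out[m] += fi[m] * gj[m]
    return out


random.seed(0)
n = 32
f = [random.uniform(-1.0, 1.0) for _ in range(n)]
g = [random.uniform(-1.0, 1.0) for _ in range(n)]
fb, gb = dyadic_blocks(f), dyadic_blocks(g)
Tfg, Tgf, R = paraproduct(fb, gb), paraproduct(gb, fb), remainder(fb, gb)

# Largest pointwise deviation between f*g and T_f g + T_g f + R(f, g).
bony_error = max(abs(f[m] * g[m] - (Tfg[m] + Tgf[m] + R[m])) for m in range(n))
```

The three index sets $k \le j-3$, $j \le k-3$ and $|i-j| \le 2$ partition all pairs of blocks, so `bony_error` is at the level of rounding error regardless of the cut-off used.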
The following estimates are fundamental properties of the paraproduct operators. For the proof see, e.g., Runst-Sickel [13].

Lemma 2. (i) Let $s < 1$, $t \in \mathbb R$. Then there exists a positive constant $C = C(s,t)$ such that
\[ \|T_f g\|_{\dot H^{s+t-1}} \le C\|f\|_{\dot H^s}\|g\|_{\dot H^t} \]
holds for $f \in \dot H^s$ and $g \in \dot H^t$.
(ii) Let $s + t > 0$. Then there exists a positive constant $C = C(s,t)$ such that
\[ \|R(f,g)\|_{\dot H^{s+t-1}} \le C\|f\|_{\dot H^s}\|g\|_{\dot H^t} \]
holds for $f \in \dot H^s$ and $g \in \dot H^t$.

A direct consequence is the following product estimate in the Sobolev space:

Proposition 1. Let $s, t < 1$ and $s + t > 0$. Then there exists a positive constant $C = C(s,t)$ such that
\[ \|fg\|_{\dot H^{s+t-1}} \le C\|f\|_{\dot H^s}\|g\|_{\dot H^t} \]
holds for $f \in \dot H^s$ and $g \in \dot H^t$.

Finally, we state the commutator estimate associated with the operator $\Delta_j$, which plays an important role in the estimate of the nonlinear term.

Proposition 2. Let $1 \le s < 2$, $t < 1$ with $s + t > 1$. Then there exists a positive constant $C = C(s,t)$ such that
\[ \|[f,\Delta_j]g\|_{L^2} \le C 2^{-(s+t-1)j} c_j \|f\|_{\dot H^s}\|g\|_{\dot H^t} \]
holds for $j \in \mathbb Z$, $f \in \dot H^s$ and $g \in \dot H^t$ with $\sum_{j\in\mathbb Z} c_j^2 = 1$. Here we denote
\[ [f,\Delta_j]g = f\Delta_j g - \Delta_j(fg). \]

Proof. Let us decompose the commutator $[f,\Delta_j]g$ by the paraproduct formula as follows:
\[ [f,\Delta_j]g = [T_f,\Delta_j]g + R(f,\Delta_j g) - \Delta_j R(f,g) + T_{\Delta_j g}f - \Delta_j T_g f. \]
We estimate the five terms on the right-hand side separately. By the definition of the paraproduct and localization in frequency, we have
\[ [T_f,\Delta_j]g = \sum_{|k-j|\le 3} [S_k f,\Delta_j]\Delta_k g. \]
Applying the mean value theorem, we see that the right-hand side is equal to
\[ \sum_{|k-j|\le 3} \int\!\!\int_0^1 \phi_j(y)\,\big(y\cdot(S_k\nabla f)(x-\tau y)\big)\,\Delta_k g(x-y)\,d\tau\,dy
 = 2^{-j}\sum_{|k-j|\le 3} \int\!\!\int_0^1 \phi(y)\,\big(y\cdot(S_k\nabla f)(x-2^{-j}\tau y)\big)\,\Delta_k g(x-2^{-j}y)\,d\tau\,dy. \]
Since $\int |y||\phi(y)|\,dy < \infty$, we have
\[ \|[T_f,\Delta_j]g\|_{L^2} \le C 2^{-j}\sum_{|k-j|\le 3} \|S_k\nabla f\|_{L^p}\|\Delta_k g\|_{L^{p^*}}, \]
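where we have taken $p < \infty$ such that $s' \equiv s + 2/p < 2$ and $1/p + 1/p^* = 1/2$. We can choose such $p$ by the assumption on $s$. Then Hölder's inequality yields
\[ |S_k\nabla f| = \Big|\sum_{l\le k-3} 2^{(2-s')l}\,2^{(s'-2)l}\Delta_l\nabla f\Big| \le C 2^{(2-s')k}\Big(\sum_{l\in\mathbb Z}\big(2^{(s'-2)l}|\Delta_l\nabla f|\big)^2\Big)^{1/2}. \]
Hence we have
\[ \|S_k\nabla f\|_{L^p} \le C 2^{-(s'-2)k}\Big\|\Big(\sum_{l\in\mathbb Z}\big(2^{(s'-2)l}|\Delta_l\nabla f|\big)^2\Big)^{1/2}\Big\|_{L^p} \le C 2^{-(s'-2)k}\|f\|_{\dot H^{s'-1,p}} \le C 2^{-(s'-2)k}\|f\|_{\dot H^s}. \]
Since the sum over $k$ is finite, we can estimate as follows:
\[ \begin{aligned}
\|[T_f,\Delta_j]g\|_{L^2} &\le C 2^{-(s'-1)j}\|f\|_{\dot H^s}\sum_{|k-j|\le 3}\|\Delta_k g\|_{L^{p^*}} \\
&\le C 2^{-(s'-1)j}\|f\|_{\dot H^s}\|\Delta_j g\|_{L^{p^*}} \\
&\le C 2^{-(s+t-1)j}\|f\|_{\dot H^s}\,2^{j(s-s'+t)}\|\Delta_j g\|_{L^{p^*}} \\
&\le C 2^{-(s+t-1)j}\|f\|_{\dot H^s}\,2^{jt}\|\Delta_j g\|_{L^2} \\
&\le C 2^{-(s+t-1)j} c_j \|f\|_{\dot H^s}\|g\|_{\dot H^t},
\end{aligned} \]
where we define $c_j = (2^{jt}\|\Delta_j g\|_{L^2})/\|g\|_{\dot H^t}$. Thus, we obtain the estimate for the first term.

Let $\tilde\Delta_k = \sum_{|k-j|\le 2}\Delta_j$. Then we observe that
\[ R(f,\Delta_j g) = \sum_{|k-j|\le 2} \tilde\Delta_k f\,\Delta_k\Delta_j g, \]
which yields the estimate of the second term:
\[ \begin{aligned}
\|R(f,\Delta_j g)\|_{L^2} &\le \sum_{|k-j|\le 2}\|\tilde\Delta_k f\,\Delta_k\Delta_j g\|_{L^2} \\
&\le \sum_{|k-j|\le 2}\|\tilde\Delta_k f\|_{L^p}\|\Delta_k\Delta_j g\|_{L^{p^*}} \\
&\le 2^{-(s+t-1)j}\sum_{|k-j|\le 2} 2^{k(s'-1)}\|\tilde\Delta_k f\|_{L^p}\,2^{(s-s'+t)k}\|\Delta_k\Delta_j g\|_{L^{p^*}} \\
&\le C 2^{-(s+t-1)j} c_j \|f\|_{\dot H^s}\|g\|_{\dot H^t},
\end{aligned} \]
where $p$, $s'$ and $c_j$ are chosen as above. Since $s + t > 0$, we can apply Lemma 2 to the third term:
\[ \|\Delta_j R(f,g)\|_{L^2} \le C c_j' 2^{-(s+t-1)j}\|f\|_{\dot H^s}\|g\|_{\dot H^t} \]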
with $c_j' = (2^{(s+t-1)j}\|\Delta_j R(f,g)\|_{L^2})/\|fg\|_{\dot H^{s+t-1}}$. For the fourth term, we observe that
\[ T_{\Delta_j g}f = \sum_{k\ge j-2} S_k\Delta_j g\,\Delta_k f, \]
which yields
\[ \begin{aligned}
\|T_{\Delta_j g}f\|_{L^2} &\le C\Big\|M(\Delta_j g)\sum_{k\ge j-2}|\Delta_k f|\Big\|_{L^2} \\
&\le \|\Delta_j g\|_{L^{p^*}}\Big\|\sum_{k\ge j-2}|\Delta_k f|\Big\|_{L^p} \\
&\le C 2^{-(s-s'+t)j} c_j \|g\|_{\dot H^t}\Big\|\sum_{k\ge j-2}|\Delta_k f|\Big\|_{L^p}.
\end{aligned} \]
In the above inequalities, we have used the $L^{p^*}$-boundedness of the Hardy-Littlewood maximal operator $M$, where
\[ Mf(x) \equiv \sup_{r>0}\frac{1}{|B(x,r)|}\int_{B(x,r)}|f(y)|\,dy. \]
Since $s' = s + 2/p > 1$, we have
\[ \sum_{k\ge j-2}|\Delta_k f| = \sum_{k\ge j-2} 2^{-(s'-1)k}\,2^{(s'-1)k}|\Delta_k f| \le C 2^{-(s'-1)j}\Big(\sum_{k\ge j-2} 2^{2(s'-1)k}|\Delta_k f|^2\Big)^{1/2}. \]
Thus we can estimate the fourth term. Finally, since $t < 1$, Lemma 2 shows that
\[ \|\Delta_j T_g f\|_{L^2} \le C c_j'' 2^{-(s+t-1)j}\|f\|_{\dot H^s}\|g\|_{\dot H^t} \]
with $c_j'' = (2^{(s+t-1)j}\|\Delta_j T_g f\|_{L^2})/\|fg\|_{\dot H^{s+t-1}}$.

4. Proof of Theorem

4.1. Linear estimates. In this subsection, we consider the linear dissipative equation. The following lemma is closely related to Chemin [3, Prop. 2.1], which characterizes the evolution of the solution to the linear equation.
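Lemma 3. Let $e^{-t(-\Delta)^\alpha}a \equiv \mathcal F^{-1}(e^{-t|\cdot|^{2\alpha}}\hat a)$, where $\mathcal F^{-1}$ denotes the inverse Fourier transform. Then there exist positive constants $\lambda$ and $\lambda'$ ($\lambda < \lambda'$) depending only on $\alpha > 0$ such that
\[ e^{-2^{2\alpha j}\lambda' t}\|\Delta_j a\|_{L^2} \le \|e^{-t(-\Delta)^\alpha}\Delta_j a\|_{L^2} \le e^{-2^{2\alpha j}\lambda t}\|\Delta_j a\|_{L^2} \]
for all $t > 0$.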
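To see where constants such as $\lambda$ and $\lambda'$ can come from, here is a sketch on the Fourier side. It assumes the common convention that $\widehat{\Delta_j a}$ is supported in the annulus $2^{j-1} \le |\xi| \le 2^{j+1}$; the text above does not fix the support, so the explicit values below are only illustrative.

```latex
% Plancherel (unitary normalization of the Fourier transform assumed):
\|e^{-t(-\Delta)^\alpha}\Delta_j a\|_{L^2}^2
  = \int_{2^{j-1}\le|\xi|\le 2^{j+1}} e^{-2t|\xi|^{2\alpha}}\,|\widehat{\Delta_j a}(\xi)|^2\,d\xi,
\qquad
e^{-2t\,2^{2\alpha(j+1)}} \le e^{-2t|\xi|^{2\alpha}} \le e^{-2t\,2^{2\alpha(j-1)}}
\quad \text{on the annulus}.
% Taking square roots, one admissible choice under this convention is
\lambda = 2^{-2\alpha}, \qquad \lambda' = 2^{2\alpha}.
```

The energy argument in the proof gives the same kind of two-sided bound without using the explicit form of the multiplier.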
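Proof. Let $u(t) \equiv e^{-t(-\Delta)^\alpha}\Delta_j a$. Then $u$ satisfies
\[ \partial_t u + (-\Delta)^\alpha u = 0 \ \text{ in } \mathbb R^2\times(0,\infty), \qquad u|_{t=0} = \Delta_j a \ \text{ in } \mathbb R^2. \]
Taking the inner product in $L^2$ of the first equation with $u$, we have
\[ \frac12\frac{d}{dt}\|u\|_{L^2}^2 + \|(-\Delta)^{\alpha/2}u\|_{L^2}^2 = 0. \]
By Lemma 1, there exist positive constants $\lambda$ and $\lambda'$ ($\lambda < \lambda'$) such that
\[ \frac12\frac{d}{dt}\|u\|_{L^2}^2 + \lambda 2^{2\alpha j}\|u\|_{L^2}^2 \le 0 \]
and
\[ \frac12\frac{d}{dt}\|u\|_{L^2}^2 + \lambda' 2^{2\alpha j}\|u\|_{L^2}^2 \ge 0. \]
Dividing the above inequalities by $\|u\|_{L^2}$ and then integrating over the interval $(0,t)$, we have
\[ e^{-2^{2\alpha j}\lambda' t}\|u(0)\|_{L^2} \le \|u(t)\|_{L^2} \le e^{-2^{2\alpha j}\lambda t}\|u(0)\|_{L^2}. \]
By the definition of $u$, we obtain the desired result.

Now we state the smoothing estimates.

Proposition 3. For $\alpha > 0$ and $s \ge 0$, there exists a positive constant $C = C(s,\alpha)$ such that
\[ \sup_{t>0} t^{s/(2\alpha)}\|e^{-t(-\Delta)^\alpha}a\|_{\dot H^s} \le C\|a\|_{L^2} \tag{4.1} \]
for all $a \in L^2$. In particular, we have
\[ \lim_{t\to 0} t^{s/(2\alpha)}\|e^{-t(-\Delta)^\alpha}a\|_{\dot H^s} = 0 \tag{4.2} \]
for all $a \in L^2$. Moreover, if $0 \le s \le \alpha$, then we have
\[ \|e^{-t(-\Delta)^\alpha}a\|_{L^{2\alpha/s}(0,\infty;\dot H^s)} \le C\|a\|_{L^2} \tag{4.3} \]
for all $a \in L^2$.

Proof. We have
\[ \|e^{-t(-\Delta)^\alpha}a\|_{\dot H^s} = \Big(\sum_{j\in\mathbb Z} 2^{2js}\|e^{-t(-\Delta)^\alpha}\Delta_j a\|_{L^2}^2\Big)^{1/2}. \]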
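On the other hand, it follows from the previous lemma that
\[ \|e^{-t(-\Delta)^\alpha}\Delta_j a\|_{L^2}^2 \le e^{-2^{2\alpha j+1}\lambda t}\|\Delta_j a\|_{L^2}^2. \]
Here, we observe that
\[ \sup_{j\in\mathbb Z} 2^{2js}e^{-\lambda t 2^{2\alpha j+1}} \le C t^{-s/\alpha}, \]
which yields
\[ \sup_{0<t} t^{s/(2\alpha)}\|e^{-t(-\Delta)^\alpha}a\|_{\dot H^s} \le C\Big(\sum_{j\in\mathbb Z}\|\Delta_j a\|_{L^2}^2\Big)^{1/2} \le C\|a\|_{L^2}. \]
To prove (4.2), for any $\varepsilon > 0$ we choose a function $a_\varepsilon \in C_0^\infty$ satisfying $\|a - a_\varepsilon\|_{L^2} < \varepsilon/2$. Then it follows from (4.1) that
\[ t^{s/(2\alpha)}\|e^{-t(-\Delta)^\alpha}a\|_{\dot H^s} \le t^{s/(2\alpha)}\big(\|e^{-t(-\Delta)^\alpha}(a-a_\varepsilon)\|_{\dot H^s} + \|e^{-t(-\Delta)^\alpha}a_\varepsilon\|_{\dot H^s}\big) < \|a-a_\varepsilon\|_{L^2} + t^{s/(2\alpha)}\|a_\varepsilon\|_{\dot H^s}. \]
Let $T = (\varepsilon/(2\|a_\varepsilon\|_{\dot H^s}))^{2\alpha/s}$. Then the left-hand side of the above inequality is bounded by $\varepsilon$ if $t < T$. This proves (4.2).

Next we prove (4.3). Let $v_j(t) \equiv e^{-2^{2\alpha j}\lambda t}\|\Delta_j a\|_{L^2}$, so that $v_j$ satisfies
\[ \partial_t v_j + \lambda 2^{2\alpha j}v_j = 0 \]
for $t > 0$ and $j \in \mathbb Z$. Multiplying this identity by $v_j^{2\alpha/s-1}$ and then integrating in time, we have
\[ \lambda 2^{2\alpha j}\int_0^\infty v_j(t)^{2\alpha/s}\,dt = C v_j(0)^{2\alpha/s}, \]
that is,
\[ 2^{sj}\|v_j\|_{L^{2\alpha/s}} = C v_j(0). \]
Taking the $l^2$-norm on both sides of this estimate, we obtain
\[ \Big(\sum_{j\in\mathbb Z}\big(2^{sj}\|v_j\|_{L^{2\alpha/s}}\big)^2\Big)^{1/2} \le C\Big(\sum_{j\in\mathbb Z} v_j^2(0)\Big)^{1/2}. \]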
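Since $\alpha/s \ge 1$, the left-hand side is estimated from below as follows:
\[ \Big(\sum_{j\in\mathbb Z} 2^{2sj}\|v_j\|_{L^{2\alpha/s}}^2\Big)^{1/2} = \Big(\sum_{j\in\mathbb Z} 2^{2sj}\|v_j^2\|_{L^{\alpha/s}}\Big)^{1/2} \ge \Big\|\sum_{j\in\mathbb Z} 2^{2sj}v_j^2\Big\|_{L^{\alpha/s}}^{1/2} = \Big\|\Big(\sum_{j\in\mathbb Z} 2^{2sj}v_j^2\Big)^{1/2}\Big\|_{L^{2\alpha/s}}. \]
So we have
\[ \Big\|\Big(\sum_{j\in\mathbb Z} 2^{2sj}v_j^2\Big)^{1/2}\Big\|_{L^{2\alpha/s}} \le C\Big(\sum_{j\in\mathbb Z} v_j^2(0)\Big)^{1/2}. \]
From Lemma 3, we obtain (4.3).

4.2. Proof of Theorem 1

Step 1. A priori estimates. We first show an a priori estimate in $L^3(0,T;\dot H^{2-4\alpha/3})$. More precisely, we will prove that there exist a positive constant $C_1$ and a bounded function $I_1 = I_1(T)$ with
\[ I_1(T) \le C\|\theta_0\|_{\dot H^{2-2\alpha}} \quad\text{and}\quad \lim_{T\to 0} I_1(T) = 0 \tag{4.4} \]
such that
\[ \|\theta\|_{L^3_T\dot H^{2-4\alpha/3}} \le I_1(T) + C_1\|\theta\|_{L^3_T\dot H^{2-4\alpha/3}}^2 \tag{4.5} \]
holds for all solutions $\theta$ of (DQG$_\alpha$). Here we write the space $L^p(0,T;\dot H^s)$ as $L^p_T\dot H^s$. Applying the operator $\Delta_j$ to (DQG$_\alpha$), we obtain
\[ \partial_t\theta_j + (-\Delta)^\alpha\theta_j = -\Delta_j(u\cdot\nabla\theta), \]
where we denote $\theta_j \equiv \Delta_j\theta$. Adding $u\cdot\nabla\Delta_j\theta$ on both sides, we have
\[ \partial_t\theta_j + (-\Delta)^\alpha\theta_j + u\cdot\nabla\Delta_j\theta = [u,\Delta_j]\nabla\theta. \]
Taking the inner product of the above equation with $\theta_j$, and then applying Lemma 1, we obtain from the divergence free condition that
\[ \frac12\frac{d}{dt}\|\theta_j\|_{L^2}^2 + \lambda 2^{2\alpha j}\|\theta_j\|_{L^2}^2 \le \|[u,\Delta_j]\nabla\theta\|_{L^2}\|\theta_j\|_{L^2}. \]
Dividing both sides by $\|\theta_j\|_{L^2}$, we have
\[ \frac{d}{dt}\|\theta_j\|_{L^2} + \lambda 2^{2\alpha j}\|\theta_j\|_{L^2} \le \|[u,\Delta_j]\nabla\theta\|_{L^2}. \]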
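Applying Proposition 2 with $s = 2 - 4\alpha/3$ and $t = 1 - 4\alpha/3$ and the Calderón-Zygmund inequality, we obtain
\[ \frac{d}{dt}\|\theta_j\|_{L^2} + \lambda 2^{2\alpha j}\|\theta_j\|_{L^2} \le \|[u,\Delta_j]\nabla\theta\|_{L^2} \le C c_j 2^{-(2-8\alpha/3)j}\|u\|_{\dot H^{2-4\alpha/3}}\|\nabla\theta\|_{\dot H^{1-4\alpha/3}} \le C c_j 2^{-(2-8\alpha/3)j}\|\theta\|_{\dot H^{2-4\alpha/3}}^2. \]
Integrating both sides in time over the interval $(0,t)$, we have
\[ \|\theta_j(t)\|_{L^2} \le e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2} + C c_j 2^{-(2-8\alpha/3)j}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta(s)\|_{\dot H^{2-4\alpha/3}}^2\,ds. \]
Multiplying the above inequality by $2^{(2-4\alpha/3)j}$ and then taking the $l^2$-norm with respect to $j$, we can estimate the $\dot H^{2-4\alpha/3}$ norm of $\theta$ as
\[ \begin{aligned}
\|\theta(t)\|_{\dot H^{2-4\alpha/3}} &\le \Big(\sum_{j\in\mathbb Z} 2^{2(2-4\alpha/3)j}e^{-2^{2\alpha j+1}\lambda t}\|\theta_j(0)\|_{L^2}^2\Big)^{1/2} \\
&\quad + C\Big(\sum_{j\in\mathbb Z}\Big(c_j 2^{4\alpha j/3}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta(s)\|_{\dot H^{2-4\alpha/3}}^2\,ds\Big)^2\Big)^{1/2} \\
&\equiv I + II.
\end{aligned} \]
In order to show (4.5), we need to estimate the $L^3_T$ norm of the right-hand side. According to Lemma 3 and (4.3), we see that the first term is estimated as
\[ \Big\|\Big(\sum_{j\in\mathbb Z}\big(2^{(2-4\alpha/3)j}e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2}\big)^2\Big)^{1/2}\Big\|_{L^3_T} \le C\|\theta_0\|_{\dot H^{2-2\alpha}}. \]
Let
\[ I_1(T) \equiv \Big\|\Big(\sum_{j\in\mathbb Z}\big(2^{(2-4\alpha/3)j}e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2}\big)^2\Big)^{1/2}\Big\|_{L^3_T}. \]
Then absolute continuity of the integral yields (4.4). Since
\[ \sup_j 2^{4\alpha j/3}e^{-2^{2\alpha j}\lambda(t-s)} < C(t-s)^{-2/3}, \]
we can estimate the second term as
\[ \begin{aligned}
\|II\|_{L^3_T} &= C\Big\|\Big(\sum_{j\in\mathbb Z}\Big(c_j 2^{4\alpha j/3}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta(s)\|_{\dot H^{2-4\alpha/3}}^2\,ds\Big)^2\Big)^{1/2}\Big\|_{L^3_T} \\
&\le C\Big\|\int_0^t (t-s)^{-2/3}\|\theta(s)\|_{\dot H^{2-4\alpha/3}}^2\,ds\Big\|_{L^3_T} \\
&\le C\|\theta\|_{L^3_T\dot H^{2-4\alpha/3}}^2,
\end{aligned} \]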
where we used the Hardy-Littlewood-Sobolev inequality in the last inequality. Therefore we obtain the a priori estimate (4.5). Similarly to the previous arguments, we can also show that there exists a bounded function $I_2 = I_2(T)$ with
\[ I_2(T) \le C\|\theta_0\|_{\dot H^{2-2\alpha}} \quad\text{and}\quad \lim_{T\to 0} I_2(T) = 0 \]
satisfying
\[ \|\theta\|_{L^2_T\dot H^{2-\alpha}} \le I_2(T) + C\|\theta\|_{L^3_T\dot H^{2-4\alpha/3}}^2. \tag{4.6} \]
Moreover, we have
\[ \|\theta\|_{L^\infty_T\dot H^{2-2\alpha}} \le \|\theta_0\|_{\dot H^{2-2\alpha}} + C\|\theta\|_{L^2_T\dot H^{2-\alpha}}^2. \]
Combining the above estimates with the maximum principle [6], $\|\theta(t)\|_{L^2} \le \|\theta_0\|_{L^2}$, we obtain the following estimate:
\[ \|\theta\|_{L^\infty_T H^{2-2\alpha}} \le \|\theta_0\|_{H^{2-2\alpha}} + C\|\theta\|_{L^2_T\dot H^{2-\alpha}}^2. \tag{4.7} \]
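The exponents in the estimate of $II$ in Step 1 can be checked by a short bookkeeping computation. The sketch below uses the one-dimensional Hardy-Littlewood-Sobolev inequality in its standard form; the auxiliary symbols $x$, $\tau$, $\gamma$, $p$, $q$ and $g$ are introduced only for this illustration.

```latex
% Kernel bound: with x = 2^{2\alpha j} and \tau = t - s > 0,
\sup_{x>0} x^{2/3}e^{-\lambda\tau x} = \Big(\frac{2}{3e\lambda\tau}\Big)^{2/3} = C\,\tau^{-2/3}.
% One-dimensional Hardy-Littlewood-Sobolev inequality:
% for 0 < \gamma < 1 and 1 < p < q < \infty with 1 + 1/q = 1/p + \gamma,
\Big\|\int_0^t (t-s)^{-\gamma}g(s)\,ds\Big\|_{L^q(0,T)} \le C\,\|g\|_{L^p(0,T)}.
% Here \gamma = 2/3, q = 3, p = 3/2 (indeed 1 + 1/3 = 2/3 + 2/3), and
g(s) = \|\theta(s)\|_{\dot H^{2-4\alpha/3}}^2,
\qquad
\|g\|_{L^{3/2}(0,T)} = \|\theta\|_{L^3_T\dot H^{2-4\alpha/3}}^2.
```

This is why the space $L^3_T\dot H^{2-4\alpha/3}$ closes the quadratic estimate (4.5): the time exponent $3$ and the gain of $2\alpha/3$ derivatives match the singularity $(t-s)^{-2/3}$ of the kernel.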
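Step 2. Convergence of approximation sequences. To construct the solution, we consider the following successive approximation:
\[ \partial_t\theta^0 + (-\Delta)^\alpha\theta^0 = 0 \ \text{ in } \mathbb R^2\times\mathbb R_+, \qquad \theta^0|_{t=0} = \theta_0 \ \text{ in } \mathbb R^2, \]
and
\[ \begin{cases}
\partial_t\theta^{n+1} + (-\Delta)^\alpha\theta^{n+1} + u^n\cdot\nabla\theta^{n+1} = 0 & \text{in } \mathbb R^2\times\mathbb R_+, \\
u^n = (-R_2\theta^n, R_1\theta^n) & \text{in } \mathbb R^2\times\mathbb R_+, \\
\theta^{n+1}|_{t=0} = \theta_0 & \text{in } \mathbb R^2,
\end{cases} \]
for $n = 0, 1, 2, \dots$. We will establish uniform estimates on $\theta^n$. Similarly to the arguments in Step 1, we can show that there exists a bounded function $I_1$ with $\lim_{T\to 0} I_1(T) = 0$ such that
\[ \|\theta^0\|_{L^3_T\dot H^{2-4\alpha/3}} \le I_1(T), \qquad \|\theta^{n+1}\|_{L^3_T\dot H^{2-4\alpha/3}} \le I_1(T) + C_1\|\theta^n\|_{L^3_T\dot H^{2-4\alpha/3}}\|\theta^{n+1}\|_{L^3_T\dot H^{2-4\alpha/3}} \]
for $n = 0, 1, 2, \dots$. Taking $T_0 > 0$ so small that $I_1(T_0) \le 1/(4C_1)$, we have
\[ \|\theta^n\|_{L^3_T\dot H^{2-4\alpha/3}} \le 2I_1(T) \quad\text{for } T < T_0. \]
By (4.6), we can also show that there exists a bounded function $I_2$ with $\lim_{T\to 0} I_2(T) = 0$ such that
\[ \|\theta^n\|_{L^2_T\dot H^{2-\alpha}} \le I_2(T) + C(I_1(T))^2 \quad\text{for } T < T_0. \tag{4.8} \]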
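The uniform bound $\|\theta^n\|_{L^3_T\dot H^{2-4\alpha/3}} \le 2I_1(T)$ in Step 2 follows by induction from the recursive inequality $x_{n+1} \le I + C_1 x_n x_{n+1}$ with $x_0 \le I$ and $I \le 1/(4C_1)$: if $x_n \le 2I$, then $C_1 x_n \le 1/2$ and hence $x_{n+1} \le 2I$. The following sketch illustrates this with the extremal sequence in which equality holds at every step; the function name and the numerical values of `C1` and `I` are hypothetical.

```python
def worst_case_sequence(I, C1, steps):
    """Iterate x_{n+1} = I / (1 - C1 * x_n), i.e. equality in
    x_{n+1} <= I + C1 * x_n * x_{n+1}, starting from x_0 = I."""
    xs = [I]
    for _ in range(steps):
        xs.append(I / (1.0 - C1 * xs[-1]))
    return xs


C1 = 3.0                 # hypothetical value of the constant C_1
I = 1.0 / (4.0 * C1)     # borderline smallness I_1(T) = 1/(4 C_1)

xs = worst_case_sequence(I, C1, 50)

# The extremal sequence increases but never exceeds 2*I.
uniform_bound_holds = all(x <= 2.0 * I * (1.0 + 1e-12) for x in xs)
is_increasing = all(a <= b for a, b in zip(xs, xs[1:]))
```

In the borderline case the extremal sequence increases towards $2I$, which suggests that the factor $2$ cannot be improved by this argument alone.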
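Moreover, (4.7) yields
\[ \|\theta^n\|_{L^\infty_T H^{2-2\alpha}} \le \|\theta_0\|_{H^{2-2\alpha}} + C(I_3(T))^2 \quad\text{for } T < T_0, \tag{4.9} \]
where we write $I_3(T) \equiv I_2(T) + C(I_1(T))^2$. Using (4.8), we will prove the convergence of the sequence $\theta^n$ in $L^4_T\dot H^{3/4}$. Let $\delta\theta^{n+1} = \theta^{n+1} - \theta^n$, $\delta u^{n+1} = u^{n+1} - u^n$, $\delta\theta^0 = \theta^0$ and $\delta u^0 = u^0$. Then we have the following equations for the differences:
\[ \begin{cases}
\partial_t\delta\theta^{n+1} + (-\Delta)^\alpha\delta\theta^{n+1} + u^n\cdot\nabla\delta\theta^{n+1} + \delta u^n\cdot\nabla\theta^n = 0 & \text{in } \mathbb R^2\times\mathbb R_+, \\
\delta u^n = (-R_2\delta\theta^n, R_1\delta\theta^n) & \text{in } \mathbb R^2\times\mathbb R_+, \\
\delta\theta^{n+1}|_{t=0} = 0 & \text{in } \mathbb R^2,
\end{cases} \]
for $n = 0, 1, 2, \dots$. Similarly to the arguments in Step 1, we have
\[ \frac12\frac{d}{dt}\|\delta\theta_j^{n+1}\|_{L^2}^2 + \lambda 2^{2\alpha j}\|\delta\theta_j^{n+1}\|_{L^2}^2 \le -\big\langle \Delta_j(u^n\cdot\nabla\delta\theta^{n+1}) + \Delta_j(\delta u^n\cdot\nabla\theta^n),\ \delta\theta_j^{n+1}\big\rangle, \]
where $\delta\theta_j^{n+1} \equiv \Delta_j\theta^{n+1} - \Delta_j\theta^n$. Since $\operatorname{div} u^n = 0$, we have
\[ \big\langle u^n\cdot\nabla\delta\theta_j^{n+1},\ \delta\theta_j^{n+1}\big\rangle = 0. \]
By Hölder's inequality, we have
\[ \frac{d}{dt}\|\delta\theta_j^{n+1}\|_{L^2} + \lambda 2^{2\alpha j}\|\delta\theta_j^{n+1}\|_{L^2} \le \|[u^n,\Delta_j]\nabla\delta\theta^{n+1}\|_{L^2} + \|\Delta_j(\delta u^n\cdot\nabla\theta^n)\|_{L^2}, \]
which yields
\[ \|\delta\theta_j^{n+1}(t)\|_{L^2} \le C\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\big(\|[u^n,\Delta_j]\nabla\delta\theta^{n+1}\|_{L^2} + \|\Delta_j(\delta u^n\cdot\nabla\theta^n)\|_{L^2}\big)\,ds. \tag{4.10} \]
Taking $s = 2-\alpha$ and $t = -1/4$ in Proposition 2, we have
\[ \|[u^n,\Delta_j]\nabla\delta\theta^{n+1}\|_{L^2} \le C 2^{-(3/4-\alpha)j}c_j\|u^n\|_{\dot H^{2-\alpha}}\|\nabla\delta\theta^{n+1}\|_{\dot H^{-1/4}} \le C 2^{-(3/4-\alpha)j}c_j\|\theta^n\|_{\dot H^{2-\alpha}}\|\delta\theta^{n+1}\|_{\dot H^{3/4}}. \]
On the other hand, by Proposition 1, we have
\[ \|\Delta_j(\delta u^n\cdot\nabla\theta^n)\|_{L^2} \le C 2^{-(3/4-\alpha)j}c_j'\|\delta u^n\cdot\nabla\theta^n\|_{\dot H^{3/4-\alpha}} \le C 2^{-(3/4-\alpha)j}c_j'\|\delta u^n\|_{\dot H^{3/4}}\|\theta^n\|_{\dot H^{2-\alpha}}, \]
where $\sum_j c_j'^2 = 1$. Multiplying (4.10) by $2^{3j/4}$, and then taking the $l^2$-norm with respect to $j$, we have
\[ \|\delta\theta^{n+1}(t)\|_{\dot H^{3/4}} \le C\Big(\sum_{j\in\mathbb Z}\Big(\int_0^t 2^{\alpha j}e^{-2^{2\alpha j}\lambda(t-s)}\|\theta^n\|_{\dot H^{2-\alpha}}\big(c_j\|\delta\theta^{n+1}\|_{\dot H^{3/4}} + c_j'\|\delta u^n\|_{\dot H^{3/4}}\big)\,ds\Big)^2\Big)^{1/2}, \]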
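which yields
\[ \begin{aligned}
\|\delta\theta^{n+1}\|_{L^4_T\dot H^{3/4}} &\le C\big(\|\theta^n\|_{L^2_T\dot H^{2-\alpha}}\|\delta\theta^{n+1}\|_{L^4_T\dot H^{3/4}} + \|\delta\theta^n\|_{L^4_T\dot H^{3/4}}\|\theta^n\|_{L^2_T\dot H^{2-\alpha}}\big) \\
&\le C_2\|\theta^n\|_{L^2_T\dot H^{2-\alpha}}\big(\|\delta\theta^{n+1}\|_{L^4_T\dot H^{3/4}} + \|\delta\theta^n\|_{L^4_T\dot H^{3/4}}\big).
\end{aligned} \]
By (4.8), there exists $T_1 > 0$ such that $\|\theta^n\|_{L^2_{T_1}\dot H^{2-\alpha}} < 1/(3C_2)$ for all $n = 0, 1, 2, \dots$. Hence we have
\[ \|\delta\theta^{n+1}\|_{L^4_{T_1}\dot H^{3/4}} \le \frac12\|\delta\theta^n\|_{L^4_{T_1}\dot H^{3/4}} \le \frac{1}{2^{n+1}}\|\theta^0\|_{L^4_{T_1}\dot H^{3/4}} \le \frac{C}{2^{n+1}}\|\theta_0\|_{\dot H^{3/4-\alpha/2}} \le \frac{C}{2^{n+1}}\|\theta_0\|_{H^{2-2\alpha}}. \]
This shows the existence of a function $\theta \in L^4_{T_1}\dot H^{3/4}$ satisfying $\lim_{n\to\infty}\theta^n = \theta$ in $L^4_{T_1}\dot H^{3/4}$. Furthermore, the uniform estimates (4.8) and (4.9) show that $\theta$ also belongs to $L^\infty_{T_1}H^{2-2\alpha} \cap L^2_{T_1}\dot H^{2-\alpha}$. We can also prove the uniqueness by arguments similar to those above. Here we can easily check that $\theta$ satisfies (DQG$_\alpha$).

We next prove continuity of the solution with values in $H^{2-2\alpha}$. By the standard bootstrap argument, it suffices to show the right continuity at $t = 0$. For this purpose, we first prove continuity of the solution with values in $H^r$ for $0 \le r < 1-\alpha$. Indeed, since $u$ and $\theta$ satisfy
\[ u, \theta \in L^2(0,T_1;H^{2-\alpha}) \quad\text{and}\quad \partial_t\theta = -(-\Delta)^\alpha\theta - u\cdot\nabla\theta, \]
we easily see that the right-hand side of the above identity belongs to $L^1_{T_1}H^r$ for $0 \le r < 1-\alpha$, which yields that $\theta \in C([0,T_1);H^r)$. From the fact that $\theta$ belongs to $L^\infty_{T_1}H^{2-2\alpha}$, Lemma 1.4 in [14, Chap. 3] shows that $\theta \in C_w([0,T);H^{2-2\alpha})$. By (4.7), we have
\[ \|\theta(t)-\theta_0\|_{H^{2-2\alpha}}^2 = \|\theta(t)\|_{H^{2-2\alpha}}^2 + \|\theta_0\|_{H^{2-2\alpha}}^2 - 2\langle\theta(t),\theta_0\rangle_{H^{2-2\alpha}} \le 2\|\theta_0\|_{H^{2-2\alpha}}^2 - 2\langle\theta(t),\theta_0\rangle_{H^{2-2\alpha}} + C\|\theta\|_{L^2_t H^{2-2\alpha}}^2, \]
where $\langle\cdot,\cdot\rangle_{H^{2-2\alpha}}$ is the inner product of $H^{2-2\alpha}$. Since $\theta$ is weakly continuous, the second term converges to $-2\|\theta_0\|_{H^{2-2\alpha}}^2$ as $t$ tends to 0. On the other hand, the third term converges to 0 as $t$ tends to 0 because of absolute continuity of the $L^2_t$-norm on $t > 0$. This shows continuity of the solution at $t = 0$ with values in $H^{2-2\alpha}$.

Step 3. Weighted estimates. For the proof of (2.1) and (2.2), it suffices to show
\[ \lim_{t\to 0}\sup_{n\ge 0} t^{\beta/(2\alpha)}\|\theta^n(t)\|_{\dot H^{2-2\alpha+\beta}} = 0 \tag{4.11} \]
for $0 < \beta < 2\alpha$. We divide the proof into two cases $0 < \beta < \alpha$ and $\alpha \le \beta < 2\alpha$.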
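The split at $\beta = \alpha$ reflects the integrability of the time kernels that appear when the weighted norms are inserted into the Duhamel-type integral. The following computation with the Beta function $B(a,b)$ is a sketch of this bookkeeping; the parameters $a$ and $b$ are introduced only for this illustration.

```latex
% Beta-function identity for a, b > 0:
\int_0^t (t-s)^{a-1}s^{b-1}\,ds = t^{a+b-1}B(a,b).
% Case 1: a = \beta/(2\alpha), b = 1 - \beta/\alpha, so b > 0 iff \beta < \alpha, and
t^{\beta/(2\alpha)}\int_0^t (t-s)^{-1+\beta/(2\alpha)}\,s^{-\beta/\alpha}\,ds
  = B\big(\tfrac{\beta}{2\alpha},\,1-\tfrac{\beta}{\alpha}\big).
% Case 2: a = b = \tfrac12 - \tfrac{\beta}{4\alpha}, which is positive iff \beta < 2\alpha, and
t^{\beta/(2\alpha)}\int_0^t (t-s)^{-\frac12-\frac{\beta}{4\alpha}}\,s^{-\frac12-\frac{\beta}{4\alpha}}\,ds
  = B(a,a).
```

In both cases the power of $t$ cancels exactly, so the resulting bounds are uniform in $0 < t < T$; the weight $s^{-\beta/\alpha}$ ceases to be integrable at $s = 0$ once $\beta \ge \alpha$, which is why the second case passes through an intermediate Sobolev exponent.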
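Case 1. We prove (4.11) for $0 < \beta < \alpha$. For $n = 0$, (4.1) shows that
\[ \sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^0(t)\|_{\dot H^{2-2\alpha+\beta}} \le C\|\theta_0\|_{\dot H^{2-2\alpha}}. \tag{4.12} \]
In particular, it follows from (4.2) that $J_1(T) \equiv \sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^0(t)\|_{\dot H^{2-2\alpha+\beta}}$ satisfies
\[ \lim_{T\to 0} J_1(T) = 0. \]
For $n \ge 0$, $\theta_j^{n+1}$ satisfies
\[ \frac{d}{dt}\|\theta_j^{n+1}\|_{L^2} + \lambda 2^{2\alpha j}\|\theta_j^{n+1}\|_{L^2} \le \|[u^n,\Delta_j]\nabla\theta^{n+1}\|_{L^2}. \tag{4.13} \]
Applying Proposition 2 with $s = 2-2\alpha+\beta$ and $t = 1-2\alpha+\beta$, we have
\[ \|[u^n,\Delta_j]\nabla\theta^{n+1}\|_{L^2} \le C c_j 2^{-(2-4\alpha+2\beta)j}\|\theta^n\|_{\dot H^{2-2\alpha+\beta}}\|\theta^{n+1}\|_{\dot H^{2-2\alpha+\beta}}. \]
Hence from (4.13) we obtain
\[ \|\theta_j^{n+1}(t)\|_{L^2} \le e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2} + C c_j 2^{-(2-4\alpha+2\beta)j}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta^n(s)\|_{\dot H^{2-2\alpha+\beta}}\|\theta^{n+1}(s)\|_{\dot H^{2-2\alpha+\beta}}\,ds. \]
Similarly to the arguments in Step 1, we have
\[ \begin{aligned}
t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}} &\le t^{\beta/(2\alpha)}\Big(\sum_{j\in\mathbb Z}\big(2^{(2-2\alpha+\beta)j}e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2}\big)^2\Big)^{1/2} \\
&\quad + C t^{\beta/(2\alpha)}\Big(\sum_{j\in\mathbb Z}\Big(c_j 2^{(2\alpha-\beta)j}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta^n(s)\|_{\dot H^{2-2\alpha+\beta}}\|\theta^{n+1}(s)\|_{\dot H^{2-2\alpha+\beta}}\,ds\Big)^2\Big)^{1/2} \\
&\equiv I + II.
\end{aligned} \]
The first term is estimated as in (4.12). Indeed, Lemma 3 and Proposition 3 yield
\[ I \le \sup_{0<t<T} t^{\beta/(2\alpha)}\Big(\sum_{j\in\mathbb Z}\big(2^{(2-2\alpha+\beta)j}e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2}\big)^2\Big)^{1/2} \le C J_1(T). \]
Since
\[ \sup_{j\in\mathbb Z} 2^{(2\alpha-\beta)j}e^{-2^{2\alpha j}\lambda(t-s)} < C(t-s)^{-1+\beta/(2\alpha)}, \]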
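we can estimate the second term as follows:
\[ \begin{aligned}
II &\le C t^{\beta/(2\alpha)}\int_0^t (t-s)^{-1+\beta/(2\alpha)}\|\theta^n(s)\|_{\dot H^{2-2\alpha+\beta}}\|\theta^{n+1}(s)\|_{\dot H^{2-2\alpha+\beta}}\,ds \\
&\le C\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^n(t)\|_{\dot H^{2-2\alpha+\beta}}\Big)\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}}\Big)\times t^{\beta/(2\alpha)}\int_0^t (t-s)^{-1+\beta/(2\alpha)}s^{-\beta/\alpha}\,ds \\
&\le C\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^n(t)\|_{\dot H^{2-2\alpha+\beta}}\Big)\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}}\Big)
\end{aligned} \]
for $0 < t < T$, where we have used the assumption $0 < \beta < \alpha$ in the last line. Thus we have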
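\[ \sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}} \le C_3 J_1(T) + C_4\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^n(t)\|_{\dot H^{2-2\alpha+\beta}}\Big)\Big(\sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}}\Big). \]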
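Taking $T_2 > 0$ sufficiently small, we obtain $J_1(T) < 1/(4C_3C_4)$ for $T < T_2$. Hence we conclude that
\[ \sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^n(t)\|_{\dot H^{2-2\alpha+\beta}} \le 2J_1(T) \quad\text{for } T < T_2 \text{ and } n = 0, 1, 2, \dots, \]
which yields (4.11).

Case 2. We next prove (4.11) for $\alpha \le \beta < 2\alpha$. For $n = 0$, again by Proposition 3, there exists a bounded function $J_2 = J_2(T)$ with
\[ J_2(T) \le C\|\theta_0\|_{\dot H^{2-2\alpha}} \quad\text{and}\quad \lim_{T\to 0} J_2(T) = 0 \]
such that
\[ \sup_{0<t<T} t^{\beta/(2\alpha)}\|\theta^0(t)\|_{\dot H^{2-2\alpha+\beta}} \le J_2(T). \tag{4.14} \]
For $n \ge 0$, we apply Proposition 2 with $s = 2-3\alpha/2+\beta/4$ and $t = 1-3\alpha/2+\beta/4$ to the right-hand side of (4.13), and it holds that
\[ \frac{d}{dt}\|\theta_j^{n+1}\|_{L^2} + \lambda 2^{2\alpha j}\|\theta_j^{n+1}\|_{L^2} \le C c_j 2^{-(2-3\alpha+\beta/2)j}\|\theta^n\|_{\dot H^{2-3\alpha/2+\beta/4}}\|\theta^{n+1}\|_{\dot H^{2-3\alpha/2+\beta/4}}. \]
Similarly to the previous arguments, we have
\[ \begin{aligned}
t^{\beta/(2\alpha)}\|\theta^{n+1}(t)\|_{\dot H^{2-2\alpha+\beta}} &\le t^{\beta/(2\alpha)}\Big(\sum_{j\in\mathbb Z}\big(2^{(2-2\alpha+\beta)j}e^{-2^{2\alpha j}\lambda t}\|\theta_j(0)\|_{L^2}\big)^2\Big)^{1/2} \\
&\quad + C t^{\beta/(2\alpha)}\Big(\sum_{j\in\mathbb Z}\Big(c_j 2^{(\alpha+\beta/2)j}\int_0^t e^{-2^{2\alpha j}\lambda(t-s)}\|\theta^n(s)\|_{\dot H^{2-3\alpha/2+\beta/4}}\|\theta^{n+1}(s)\|_{\dot H^{2-3\alpha/2+\beta/4}}\,ds\Big)^2\Big)^{1/2} \\
&\equiv I + II.
\end{aligned} \]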
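The first term $I$ is estimated as in (4.14). So we need to treat only the second term $II$. Since
\[ \sup_{j\in\mathbb Z} 2^{(\alpha+\beta/2)j}e^{-2^{2\alpha j}\lambda(t-s)} < C(t-s)^{-\frac12-\frac{\beta}{4\alpha}}, \]
we have
\[ \begin{aligned}
II &\le C t^{\beta/(2\alpha)}\int_0^t (t-s)^{-\frac12-\frac{\beta}{4\alpha}}\|\theta^n(s)\|_{\dot H^{2-3\alpha/2+\beta/4}}\|\theta^{n+1}(s)\|_{\dot H^{2-3\alpha/2+\beta/4}}\,ds \\
&\le C\Big(\sup_{0<t<T} t^{\frac14+\frac{\beta}{8\alpha}}\|\theta^n(t)\|_{\dot H^{2-3\alpha/2+\beta/4}}\Big)\Big(\sup_{0<t<T} t^{\frac14+\frac{\beta}{8\alpha}}\|\theta^{n+1}(t)\|_{\dot H^{2-3\alpha/2+\beta/4}}\Big)
\end{aligned} \]
for $0 < t < T$. Writing $2-3\alpha/2+\beta/4 = 2-2\alpha+\beta'$ with $\beta' = \alpha/2+\beta/4$, we have $0 < \beta' < \alpha$ and $\beta'/(2\alpha) = 1/4+\beta/(8\alpha)$, so it follows from the previous case that
\[ \sup_{0<t<T} t^{\frac14+\frac{\beta}{8\alpha}}\|\theta^n(t)\|_{\dot H^{2-3\alpha/2+\beta/4}} \le 2J_1(T) \quad\text{for } T < T_2. \]
Hence the second term is bounded by $4C(J_1(T))^2$ for $T < T_2$. From the above estimates, we see that estimate (4.11) holds for $\alpha \le \beta < 2\alpha$.

Acknowledgements. The author would like to express deep gratitude to Professor Hideo Kozono for valuable suggestions and encouragement. He would also like to express sincere thanks to Professors Dongho Chae, Kenji Nakanishi, Takayoshi Ogawa, and Yoshio Tsutsumi for useful discussions. He is also grateful to Doctor Jun-ichi Segata and the referee for numerous suggestions on the manuscript.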
References
1. Bony, J.-M.: Calcul symbolique et propagation des singularités pour les équations aux dérivées partielles non linéaires. Ann. Sci. École Norm. Sup. 14, 209–246 (1981)
2. Chae, D., Lee, J.: Global well-posedness in the super-critical dissipative quasi-geostrophic equations. Commun. Math. Phys. 233, 297–311 (2003)
3. Chemin, J.-Y.: Théorèmes d’unicité pour le système de Navier-Stokes tridimensionnel. J. Anal. Math. 77, 27–50 (1999)
4. Constantin, P., Cordoba, D., Wu, J.: On the critical dissipative quasi-geostrophic equation. Indiana Univ. Math. J. 50, 97–107 (2001)
5. Constantin, P., Wu, J.: Behavior of solutions of 2D quasi-geostrophic equations. SIAM J. Math. Anal. 30, 937–948 (1999)
6. Cordoba, A., Cordoba, D.: A maximum principle applied to quasi-geostrophic equations. Commun. Math. Phys. 249, 511–528 (2004)
7. Danchin, R.: Local theory in critical spaces for compressible viscous and heat-conductive gases. Comm. Partial Differ. Eq. 26, 1183–1233 (2001)
8. Fujita, H., Kato, T.: On the Navier-Stokes initial value problem I. Arch. Rat. Mech. Anal. 16, 269–315 (1964)
9. Ju, N.: Existence and uniqueness of the solution to the dissipative 2D quasi-geostrophic equations in the Sobolev space. Commun. Math. Phys. 251, 365–376 (2004)
10. Ju, N.: On the two dimensional quasi-geostrophic equations. Indiana Univ. Math. J. 54, 897–926 (2005)
11. Koch, H., Tataru, D.: Well-posedness for the Navier-Stokes equations. Adv. Math. 157, 22–35 (2001)
12. Kozono, H., Yamazaki, M.: Semilinear heat equations and the Navier-Stokes equation with distributions in new function spaces as initial data. Comm. Partial Differ. Eq. 19, 959–1014 (1994)
13. Runst, T., Sickel, W.: Sobolev spaces of fractional order, Nemytskij operators, and nonlinear partial differential equations. de Gruyter Series in Nonlinear Analysis and Applications 3, Berlin: Walter de Gruyter & Co., 1996
14. Temam, R.: Navier-Stokes equations. Theory and numerical analysis. Providence, RI: AMS Chelsea Publishing, 2001
15. Wu, J.: Dissipative quasi-geostrophic equations with $L^p$ data. Electron. J. Differ. Eq. 56, 1–13 (2001)

Communicated by P. Constantin
Commun. Math. Phys. 267, 159–180 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0022-4
Communications in
Mathematical Physics
Spectral Theory of the Klein–Gordon Equation in Pontryagin Spaces Heinz Langer1 , Branko Najman , Christiane Tretter2 1 Institut für Analysis und Scientific Computing, Technische Universität Wien, Wiedner Hauptstr. 8–10,
1040 Wien, Austria. E-mail:
[email protected]
2 FB 3 – Mathematik, Universität Bremen, Bibliothekstr. 1, 28359 Bremen, Germany.
E-mail:
[email protected] Received: 24 October 2005 / Accepted: 4 November 2005 Published online: 16 May 2006 – © Springer-Verlag 2006
Abstract: In this paper we investigate an abstract Klein–Gordon equation by means of indefinite inner product methods. We show that, under certain assumptions on the potential which are more general than in previous works, the corresponding linear operator $A$ is self-adjoint in the Pontryagin space $\mathcal K$ induced by the so-called energy inner product. The operator $A$ possesses a spectral function with critical points, the essential spectrum of $A$ is real with a gap around 0, and the non-real spectrum consists of at most finitely many pairs of complex conjugate eigenvalues of finite algebraic multiplicity; the number of these pairs is related to the ‘size’ of the potential. Moreover, $A$ generates a group of bounded unitary operators in the Pontryagin space $\mathcal K$. Finally, the conditions on the potential required in the paper are illustrated for the Klein–Gordon equation in $\mathbb R^n$; they include potentials consisting of a Coulomb part and an $L^p$-part with $n \le p < \infty$.

1. Introduction

The motion of a relativistic spinless particle of mass $m$ and charge $e$ in an electrostatic field with potential $q$ is described by the Klein–Gordon equation
\[ \Big(\Big(\frac{\partial}{\partial t} - ieq\Big)^2 - \Delta + m^2\Big)\psi = 0, \tag{1.1} \]
where the velocity of light has been normalized to 1; here $\psi$ is a complex-valued function of $t \in \mathbb R$ and of $x \in \mathbb R^n$. An abstract model for this equation is obtained if we replace the strictly positive self-adjoint operator generated by the differential expression $-\Delta + m^2$ in the function space $L^2(\mathbb R^n)$ by a strictly positive self-adjoint operator $H_0$ in a Hilbert space $\mathcal H$ with scalar product $(\cdot,\cdot)$ and the operator of multiplication by the function $eq$ in $L^2(\mathbb R^n)$ by a symmetric operator $V$ in $\mathcal H$:
\[ \Big(\Big(\frac{d}{dt} - iV\Big)^2 + H_0\Big)u = 0; \tag{1.2} \]

Deceased; formerly University of Zagreb, Bijenička 30, 41000 Zagreb, Croatia
here $u$ is a function of $t$ with values in $\mathcal H$. The abstract Klein–Gordon equation (1.2) can be transformed into a first order differential equation for a vector function $\mathbf x$ with two components in an appropriate product Hilbert space $\mathcal G$ and a linear operator $A$ in $\mathcal G$:
\[ \frac{d\mathbf x}{dt} = iA\mathbf x. \tag{1.3} \]
This can be achieved by different substitutions leading to different operators $A$; however, in general this is not possible with a self-adjoint operator $A$ in a Hilbert space $\mathcal G$. The operator considered in the present paper arises from the abstract Klein–Gordon equation (1.2) by means of the substitution
\[ x = u, \qquad y = -i\frac{d}{dt}u, \tag{1.4} \]
which leads to a first order differential equation for $\mathbf x = (x\ y)^{\mathrm t}$ of the form
\[ \frac{d\mathbf x}{dt} = i\widehat A\mathbf x, \qquad \widehat A = \begin{pmatrix} 0 & I \\ H_0 - V^2 & 2V \end{pmatrix}. \tag{1.5} \]
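The passage from (1.2) to (1.5) via the substitution (1.4) is a short formal computation. The sketch below assumes that $V$ and $H_0$ do not depend on $t$ and ignores all domain questions, which the paper treats separately; primes denote derivatives with respect to $t$.

```latex
% Expand (1.2):
u'' - 2iVu' - V^2u + H_0u = 0.
% With x = u and y = -iu' (so that u' = iy):
x' = u' = i\,y,
\qquad
y' = -iu'' = -i\big(2iVu' + V^2u - H_0u\big) = i\big((H_0 - V^2)x + 2Vy\big).
% In matrix form:
\frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix}
  = i\begin{pmatrix} 0 & I \\ H_0 - V^2 & 2V \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.
```

In the scalar case $H_0 = h > 0$, $V = v \in \mathbb R$, the matrix has the eigenvalues $v \pm \sqrt h$, which are exactly the frequencies $\omega$ for which $u(t) = e^{i\omega t}$ solves (1.2).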
Since both operators $H_0$ and $V$ are in general unbounded, the block operator matrix $\widehat A$ in (1.5) may be neither densely defined nor closed. To this end, suitable assumptions have to be imposed on the potential $V$ so that we can associate a closed operator $A$ with the block operator matrix $\widehat A$. If the potential $V$ is not small, $\widehat A$ does not exhibit symmetry in any Hilbert space. However, formally, if we introduce the so-called energy inner product $\langle\cdot,\cdot\rangle$ which, for suitable elements $\mathbf x = (x\ y)^{\mathrm t}$, $\mathbf x' = (x'\ y')^{\mathrm t}$ of $\mathcal H \oplus \mathcal H$, is given by
\[ \langle\mathbf x,\mathbf x'\rangle = \left(\begin{pmatrix} H_0 - V^2 & 0 \\ 0 & I \end{pmatrix}\mathbf x,\ \mathbf x'\right) = \big((H_0 - V^2)x, x'\big) + (y, y'), \tag{1.6} \]
then it is not difficult to see that $\widehat A$ is symmetric with respect to $\langle\cdot,\cdot\rangle$:
\[ \langle\widehat A\mathbf x,\mathbf x'\rangle = \left(\begin{pmatrix} 0 & H_0 - V^2 \\ H_0 - V^2 & 2V \end{pmatrix}\mathbf x,\ \mathbf x'\right). \]
The inner product $\langle\cdot,\cdot\rangle$ is in general indefinite; under our assumptions on the potential $V$, it is negative definite on a subspace of finite dimension so that the space $\mathcal G$ equipped with $\langle\cdot,\cdot\rangle$ becomes a so-called Pontryagin space.

For the Klein–Gordon equation in $\mathbb R^n$, the operator $\widehat A$ in the energy inner product $\langle\cdot,\cdot\rangle$ has been studied in a number of papers, see, e.g., [SSW40, Lun73a, Lun73b, Eck76, Eck80, Sch76, Kak76, Wed77, Wed78, Jon79, Naj80a, Naj80b, Naj83, Bac04], and the unpublished manuscript [LN96]$^1$; some of these works also consider the corresponding abstract operator $\widehat A$ in a Pontryagin space, but under more restrictive assumptions on the potential $V$.

The operator $A$ and the energy inner product $\langle\cdot,\cdot\rangle$ studied in this paper are related to other operators associated with the abstract Klein–Gordon equation (see [LNT06]). They arise from the second order differential equation (1.2) by means of the substitution
\[ x = u, \qquad y = \Big(-i\frac{d}{dt} - V\Big)u, \tag{1.7} \]

$^1$ This manuscript was the starting point for the present paper; unfortunately, Professor Branko Najman died in August 1996.
which leads to a first order differential equation (1.3) for $\mathbf x = (x\ y)^{\mathrm t}$ of the form
\[ \frac{d\mathbf x}{dt} = i\begin{pmatrix} V & I \\ H_0 & V \end{pmatrix}\mathbf x. \tag{1.8} \]
The operator $A_1$, for example, is obtained from (1.8) as the closure of the block operator matrix therein in the Hilbert space $\mathcal G_1 = \mathcal H \oplus \mathcal H$; it turns out to be symmetric with respect to the so-called charge inner product $[\cdot,\cdot]$, which is defined on elements $\mathbf x = (x\ y)^{\mathrm t}$, $\mathbf x' = (x'\ y')^{\mathrm t}$ of $\mathcal G_1 = \mathcal H \oplus \mathcal H$ by a relation of the form
\[ [\mathbf x,\mathbf x'] = \left(\begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}\mathbf x,\ \mathbf x'\right) = (x, y') + (y, x'). \tag{1.9} \]
Independently of the potential $V$, the charge inner product is in general negative on an infinite dimensional subspace and hence leads to a so-called Krein space. The energy inner product is related to the charge inner product as follows:
\[ \langle\mathbf x,\mathbf x'\rangle = [A_1W\mathbf x, W\mathbf x'], \qquad W = \begin{pmatrix} I & 0 \\ -V & I \end{pmatrix}, \tag{1.10} \]
for suitable elements $\mathbf x$, $\mathbf x'$ of the Pontryagin space $(\mathcal G,\langle\cdot,\cdot\rangle)$; under our assumptions on $V$, the operator $W : \mathcal G \to \mathcal G_1$ is bounded. The spectral properties of the operator $A_1$ and of another operator $A_2$ associated with (1.8) in the charge inner product and their relations to the operator $A$ are investigated in a separate paper (see [LNT06]).

The present paper is organized as follows: In Sect. 2 we briefly review results from the theory of self-adjoint operators in Pontryagin spaces. In Sect. 3 we associate the operator $A$ with (1.5); it acts in the space $\mathcal G = \mathcal H_{1/2} \oplus \mathcal H$, where $\mathcal H_{1/2}$ is the Hilbert space given by $\mathcal D(H_0^{1/2})$ with norm $\|H_0^{1/2}\cdot\|$. We show that if
(i) $\mathcal D(H_0^{1/2}) \subset \mathcal D(V)$ (i.e., $S = VH_0^{-1/2}$ is bounded) and
(ii) $I - S^*S$ is boundedly invertible,
then the operator
\[ A = \begin{pmatrix} 0 & I \\ H & 2V \end{pmatrix}, \qquad \mathcal D(A) = \mathcal D(H) \oplus \mathcal H_{1/2}, \]
is closed and boundedly invertible in $\mathcal G$; here $H$ is the self-adjoint operator in $\mathcal H$ given by $H = H_0^{1/2}(I - S^*S)H_0^{1/2}$. In Sect. 4 we introduce the indefinite inner product $\langle\cdot,\cdot\rangle$ on $\mathcal G$ and we prove that under the above assumptions (i) and (ii) the space $\mathcal G$ equipped with this inner product is a Krein space $\mathcal K$ and $A$ is a self-adjoint operator in $\mathcal K$ with non-empty resolvent set. In addition, we study the relation (1.10) of the energy inner product $\langle\cdot,\cdot\rangle$ with the operator $A_1$ and the corresponding charge inner product $[\cdot,\cdot]$. Section 5 contains the main result about the spectral properties of $A$. Under the additional assumption
(iii) $S = VH_0^{-1/2} = S_0 + S_1$ with $\|S_0\| < 1$ and a compact operator $S_1$,
we show that $\mathcal K$ is a Pontryagin space of index $\kappa$, where $\kappa$ is the number of negative eigenvalues of $I - S^*S$, the operator $A$ possesses a spectral function with at most finitely many critical points, the non-real spectrum of $A$ consists of at most $\kappa$ pairs of complex conjugate eigenvalues, and the essential spectrum of $A$ is real and has a gap of size at
least $2(1 - \|S_0\|)m$ around 0. Moreover, the operator $A$ generates a strongly continuous group $(\exp(iAt))_{t\in\mathbb R}$ of unitary operators in the Pontryagin space $\mathcal K$ and hence the Cauchy problem
\[ \frac{d\mathbf x}{dt} = iA\mathbf x, \qquad \mathbf x(0) = \mathbf x_0, \]
has a unique solution for all initial values $\mathbf x_0 \in \mathcal H_{1/2} \oplus \mathcal H$. Since $\infty$ is not a critical point for a self-adjoint operator in a Pontryagin space, the group $(\exp(iAt))_{t\in\mathbb R}$ is uniformly bounded in $\mathcal K$; therefore the time-asymptotic behaviour of the solution $\mathbf x$ and hence of the solution of the abstract Klein–Gordon equation (1.2) is the same as in a Hilbert space. This is not the case for the self-adjoint operator $A_1$ in the Krein space $\mathcal G_1$ since there $\infty$ is a critical point (see [LNT06]).

Finally, in Sect. 6, we consider the Klein–Gordon equation in $\mathbb R^n$ and present sufficient conditions for the above assumptions. In particular, we show that our results apply to potentials $V$ of the form $V = V_0 + V_1$ with a Coulomb part $V_0(x) = \gamma/|x|$, $x \in \mathbb R^n\setminus\{0\}$, with $\gamma < (n-2)/2$ and $V_1 \in L^p(\mathbb R^n)$ with $n \le p < \infty$.

2. Preliminaries

1. Notations and definitions from spectral theory. For a closed linear operator $A$ in a Hilbert space $\mathcal G$ with domain $\mathcal D(A)$ we denote by $\rho(A)$, $\sigma(A)$, and $\sigma_{\mathrm p}(A)$ its resolvent set, spectrum, and point spectrum (or set of eigenvalues), respectively. For $\lambda \in \sigma_{\mathrm p}(A)$ the algebraic eigenspace of $A$ at $\lambda$ is denoted by $\mathcal L_\lambda(A)$. The operator $A$ is called Fredholm if its kernel is finite dimensional and its range is finite codimensional (and hence closed), see, e.g., [GGK90, Chapter IV, §5.1]. The essential spectrum of $A$ is defined by
\[ \sigma_{\mathrm{ess}}(A) := \{\lambda \in \mathbb C : A - \lambda \text{ is not Fredholm}\}. \]
An eigenvalue $\lambda_0 \in \sigma_{\mathrm p}(A)$ is called of finite type if $\lambda_0$ is isolated (i.e., a punctured neighbourhood of $\lambda_0$ belongs to $\rho(A)$) and $A - \lambda_0$ is Fredholm or, equivalently, the corresponding Riesz projection is finite dimensional.

2. Linear spaces with inner products. A Krein space $(\mathcal K,[\cdot,\cdot])$ is a linear space $\mathcal K$ which is equipped with an (indefinite) inner product (i.e., a hermitian sesquilinear form) $[\cdot,\cdot]$ such that $\mathcal K$ can be written as
\[ \mathcal K = \mathcal G_+ [\dotplus] \mathcal G_-, \tag{2.1} \]
where $(\mathcal G_\pm, \pm[\cdot,\cdot])$ are Hilbert spaces and $[\dotplus]$ means that the sum of $\mathcal G_+$ and $\mathcal G_-$ is direct and $[\mathcal G_+,\mathcal G_-] = \{0\}$. The norm topology on a Krein space $\mathcal K$ is the norm topology of the orthogonal sum of the Hilbert spaces $\mathcal G_\pm$ in (2.1). It can be shown that this norm topology is independent of the particular decomposition (2.1); all topological notions in $\mathcal K$ refer to this norm topology and $\|\cdot\|$ denotes any of the equivalent norms. Krein spaces often arise as follows: In a given Hilbert space $(\mathcal G,(\cdot,\cdot))$, every bounded self-adjoint operator $G$ in $\mathcal G$ with $0 \in \rho(G)$ induces an inner product
\[ [x,y] := (Gx,y), \quad x, y \in \mathcal G, \tag{2.2} \]
such that $(\mathcal G,[\cdot,\cdot])$ becomes a Krein space; here, in the decomposition (2.1), we can choose $\mathcal G_+$ as the spectral subspace of $G$ corresponding to the positive spectrum of
$G$ and $\mathcal G_-$ as the spectral subspace of $G$ corresponding to the negative spectrum of $G$. A subspace $\mathcal L$ of a linear space $\mathcal K$ with inner product $[\cdot,\cdot]$ is called non-degenerated if there exists no $x \in \mathcal L$, $x \ne 0$, such that $[x,\mathcal L] = 0$, otherwise $\mathcal L$ is called degenerated; note that a Krein space $\mathcal K$ is always non-degenerated, but it may have degenerated subspaces. An element $x \in \mathcal K$ is called positive (non-negative, negative, non-positive, neutral, respectively) if $[x,x] > 0$ ($\ge 0$, $< 0$, $\le 0$, $= 0$, respectively); a subspace of $\mathcal K$ is called positive (non-negative, etc., respectively), if all its nonzero elements are positive (non-negative, etc., respectively). For the definition and simple properties of Krein spaces and linear operators therein we refer to [Bog74, Lan82, AI89].

3. Self-adjoint operators in Krein spaces. For a closed linear operator $A$ in a Krein space $\mathcal K$ with dense domain $\mathcal D(A)$, the (Krein space) adjoint $A^+$ of $A$ is the densely defined operator in $\mathcal K$ given by
\[ \mathcal D(A^+) := \{y \in \mathcal K : [A\,\cdot\,, y] \text{ is a continuous linear functional on } \mathcal D(A)\} \]
and the relation
\[ [Ax,y] = [x,A^+y], \quad x \in \mathcal D(A),\ y \in \mathcal D(A^+). \]
The operator $A$ is called symmetric if $A \subset A^+$ and self-adjoint if $A = A^+$. The spectrum of a self-adjoint operator $A$ in a Krein space $\mathcal K$ is always symmetric to the real axis; note that the spectrum $\sigma(A)$ or the resolvent set $\rho(A)$ may be empty. An orthogonal projection $P$ in a Krein space $\mathcal K$ is a self-adjoint projection in $\mathcal K$; note that orthogonal projections in a Krein space may have norm $> 1$. If for a self-adjoint operator $A$ in a Krein space $\mathcal K$ with $\lambda_0 \in \sigma_{\mathrm p}(A)$ all the eigenvectors at $\lambda_0$ are positive (negative, respectively), then $\lambda_0$ is called an eigenvalue of positive (negative, respectively) type. A positive or negative eigenvector $x_0$ of $A$ at $\lambda_0$ does not have any associated vectors. Consequently, if for an eigenvector $x_0$ at $\lambda_0$ there exists an element $x_1$ such that $(A - \lambda_0)x_1 = x_0$, then $x_0$ is neutral.

4. Self-adjoint operators in Pontryagin spaces.
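As a finite-dimensional illustration of self-adjointness with respect to an indefinite inner product, consider $\mathbb C^2$ with $[x,y] = (Gx,y)$ for a Gram matrix $G$ having one positive and one negative eigenvalue, so that the negative component is one-dimensional. The matrices `G` and `A` below are hypothetical examples chosen for this sketch and are not operators from the paper.

```python
def ip(G, x, y):
    """Indefinite inner product [x, y] = (Gx, y): linear in x, conjugate-linear in y."""
    n = len(x)
    Gx = [sum(G[i][k] * x[k] for k in range(n)) for i in range(n)]
    return sum(Gx[i] * y[i].conjugate() for i in range(n))


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


G = [[0, 1], [1, 0]]    # eigenvalues +1 and -1: one negative square
A = [[0, 1], [-1, 0]]   # hypothetical operator with G A = A* G

# Self-adjointness in the indefinite inner product: [Ax, y] = [x, Ay].
vectors = [[1 + 0j, 0j], [0j, 1 + 0j], [1 + 0j, 1j], [2 - 1j, 3j]]
is_selfadjoint = all(
    abs(ip(G, apply(A, x), y) - ip(G, x, apply(A, y))) < 1e-12
    for x in vectors for y in vectors
)

# A has the non-real eigenvalue pair i, -i with eigenvectors v, w.
v = [1 + 0j, 1j]    # A v = i v
w = [1 + 0j, -1j]   # A w = -i w
v_is_eigenvector = all(abs(a - 1j * b) < 1e-12 for a, b in zip(apply(A, v), v))
w_is_eigenvector = all(abs(a + 1j * b) < 1e-12 for a, b in zip(apply(A, w), w))

neutrality = abs(ip(G, v, v))   # eigenvectors at non-real eigenvalues are neutral
linking = abs(ip(G, v, w))      # but [v, w] does not vanish
```

The spectrum $\{i, -i\}$ is symmetric to the real axis and consists of a single non-real pair, matching the one negative square of $G$; each eigenvector is neutral, while the two eigenspaces pair non-trivially with each other.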
If in some decomposition (2.1) one of the components $\mathcal{G}_\pm$ is of finite dimension, it is of the same dimension in all such decompositions, and the Krein space $(\mathcal{K}, [\cdot,\cdot])$ is called a Pontryagin space. For the Pontryagin spaces $\mathcal{K}$ occurring in this paper, the negative component $\mathcal{G}_-$ is of finite dimension, say $\kappa$; in this case, $\mathcal{K}$ is called a Pontryagin space with negative index $\kappa$. If $\mathcal{K}$ arises from a Hilbert space $\mathcal{G}$ by means of a self-adjoint operator $G$ with inner product (2.2), then $\mathcal{K}$ is a Pontryagin space with negative index $\kappa$ if and only if the negative spectrum of the invertible operator $G$ consists of exactly $\kappa$ eigenvalues, counted according to their multiplicities. In a Pontryagin space $\mathcal{K}$ with negative index $\kappa$ each non-positive subspace is of dimension $\le \kappa$, and a non-positive subspace is maximal non-positive (that is, it is not properly contained in another non-positive subspace) if and only if it is of dimension $\kappa$. If $\mathcal{L}$ is a non-degenerated linear space with inner product $[\cdot,\cdot]$ such that for a $\kappa$-dimensional subspace $\mathcal{L}_-$ we have $[x,x] < 0$, $x \in \mathcal{L}_-$, $x \neq 0$, but there is no $(\kappa+1)$-dimensional subspace with this property, then there exists a Pontryagin space $\mathcal{K}$ with negative index $\kappa$ such that $\mathcal{L}$ is a dense subset of $\mathcal{K}$. This means
H. Langer, B. Najman, C. Tretter
that $\mathcal{L}$ can be completed to a Pontryagin space in a similar way as a pre-Hilbert space can be completed to a Hilbert space. The spectrum of a self-adjoint operator in a Pontryagin space is real with the possible exception of at most $\kappa$ non-real pairs of eigenvalues $\lambda, \bar\lambda$ of finite type; this estimate can be improved by taking multiplicities into account (see (2.3) below). According to a theorem of Pontryagin, a self-adjoint operator $A$ in a Pontryagin space with negative index $\kappa$ has a $\kappa$-dimensional invariant non-positive subspace $\mathcal{L}_-^{\max}$:
\[ \mathcal{L}_-^{\max} \subset D(A), \qquad A\,\mathcal{L}_-^{\max} \subset \mathcal{L}_-^{\max}; \]
the subspace $\mathcal{L}_-^{\max}$ can be chosen such that $\operatorname{Im} \sigma(A|\mathcal{L}_-^{\max}) \ge 0$. Then the points of $\sigma(A|\mathcal{L}_-^{\max})$ are the eigenvalues of $A$ in the closed upper half plane with a non-positive eigenvector. We denote the set of all eigenvalues of $A$ with a non-positive eigenvector by $\sigma_0(A)$; for a point $\lambda \in \sigma_0(A)$, the maximal dimension of a non-positive subspace of $\mathcal{L}_\lambda(A)$ is denoted by $\kappa_\lambda^-(A)$. Concerning the non-real spectrum of $A$, the closed linear span of all the algebraic eigenspaces $\mathcal{L}_\lambda(A)$ corresponding to the eigenvalues $\lambda$ of $A$ in the open upper (or lower) half plane is a neutral subspace of $\mathcal{K}$; for all such points $\lambda$ the algebraic eigenspaces $\mathcal{L}_\lambda(A)$, $\mathcal{L}_{\bar\lambda}(A)$ are skewly linked, that is, to each nonzero $x \in \mathcal{L}_\lambda(A)$ there exists a $y \in \mathcal{L}_{\bar\lambda}(A)$ such that $[x,y] \neq 0$, and to each nonzero $y \in \mathcal{L}_{\bar\lambda}(A)$ there exists an $x \in \mathcal{L}_\lambda(A)$ such that $[x,y] \neq 0$. In particular, $\dim \mathcal{L}_\lambda(A) = \dim \mathcal{L}_{\bar\lambda}(A)$ and the Jordan structure of $A$ in $\mathcal{L}_\lambda(A)$ and in $\mathcal{L}_{\bar\lambda}(A)$ is the same. Further, the relation
\[ \kappa = \sum_{\lambda \in \sigma_0(A) \cap \mathbb{R}} \kappa_\lambda^-(A) + \sum_{\lambda \in \sigma(A) \cap \mathbb{C}^+} \dim \mathcal{L}_\lambda(A) \tag{2.3} \]
holds, which yields estimates for the number of points of $\sigma_0(A)$. All (real) points $\lambda \in \sigma(A) \setminus \sigma_0(A)$ are spectral points of positive type, by which we mean that they are either eigenvalues of positive type or, if they belong to the continuous spectrum, that for each sequence $(x_n) \subset D(A)$,
\[ \|x_n\| = 1, \quad (A - \lambda)x_n \to 0 \quad \Longrightarrow \quad \liminf_{n \to \infty}\, [x_n, x_n] > 0. \]
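The statements about non-real eigenvalues can be checked by hand in the smallest possible case. The following is a minimal sketch, not taken from the paper: it assumes the two-dimensional space $\mathbb{C}^2$ with the indefinite inner product $[x,y] = x_1\bar y_1 - x_2\bar y_2$ (a Pontryagin space with $\kappa = 1$) and a hypothetical matrix `A` chosen only for illustration. The helper names `inner` and `apply` are likewise assumptions.

```python
# Illustrative finite-dimensional sketch (not from the paper):
# C^2 with the indefinite inner product [x, y] = x1*conj(y1) - x2*conj(y2),
# which is a Pontryagin space with negative index kappa = 1.

def inner(x, y):
    """Indefinite inner product [x, y] with fundamental symmetry diag(1, -1)."""
    return x[0] * y[0].conjugate() - x[1] * y[1].conjugate()

def apply(a, x):
    """Apply a 2x2 matrix, given as a tuple of rows, to a vector in C^2."""
    return (a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1])

# Hypothetical example operator: self-adjoint with respect to [., .],
# but with the non-real eigenvalue pair +i, -i.
A = ((0j, 1 + 0j), (-1 + 0j, 0j))
x_plus = (1 + 0j, 1j)     # A x_plus  =  i * x_plus
x_minus = (1 + 0j, -1j)   # A x_minus = -i * x_minus

# Symmetry [Au, v] = [u, Av] checked on the standard basis.
basis = [(1 + 0j, 0j), (0j, 1 + 0j)]
symmetric = all(inner(apply(A, u), v) == inner(u, apply(A, v))
                for u in basis for v in basis)

print("A symmetric w.r.t. [.,.]:", symmetric)
print("[x+, x+] =", inner(x_plus, x_plus))    # eigenvector at +i is neutral
print("[x+, x-] =", inner(x_plus, x_minus))   # nonzero: eigenspaces are skewly linked
```

In this toy case the right-hand side of (2.3) consists of the single term $\dim \mathcal{L}_{\mathrm{i}}(A) = 1$, which matches $\kappa = 1$.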
5. Spectral functions of self-adjoint operators in Pontryagin spaces. If $q$ denotes the minimal polynomial or the characteristic polynomial of the restriction $A|\mathcal{L}_-^{\max}$, and $q^*$ is the polynomial given by $q^*(z) = \overline{q(\bar z)}$, $z \in \mathbb{C}$, then the polynomial $q^* q$ is independent of the particular choice of the invariant subspace $\mathcal{L}_-^{\max}$, and it is not hard to show that $[q^*(A)q(A)x, x] \ge 0$, $x \in D(A^{2\kappa})$. As a consequence, a self-adjoint operator $A$ in a Pontryagin space possesses a spectral function with possible critical points (see [KL63] and also [Lan82]). In order to introduce it, we call a bounded or unbounded real interval $\Delta \subset \mathbb{R}$ admissible for the operator $A$ if the end points of $\Delta$ do not belong to $\sigma_0(A)$. Then, for every admissible interval $\Delta$, there exists an orthogonal projection $E(\Delta)$ in $\mathcal{K}$ such that the range $E(\Delta)\mathcal{K}$ is invariant under $A$ and
\[ \sigma\big(A|E(\Delta)\mathcal{K}\big) \subset \overline{\Delta}, \qquad \sigma\big(A|(I - E(\Delta))\mathcal{K}\big) \cap \mathbb{R} \subset \overline{\mathbb{R} \setminus \Delta}. \]
Moreover, the mapping $\Delta \mapsto E(\Delta)$ from the semiring $\mathcal{R}_A$ of all admissible intervals into the space of all bounded linear operators in $\mathcal{K}$ is a homomorphism, that is, for $\Delta_1, \Delta_2 \in \mathcal{R}_A$,
\[ E(\Delta_1 \cap \Delta_2) = E(\Delta_1)E(\Delta_2), \qquad E(\Delta_1 \cup \Delta_2) = E(\Delta_1) + E(\Delta_2) - E(\Delta_1 \cap \Delta_2), \]
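and
\[ E(\emptyset) = 0, \qquad E(\mathbb{R})\mathcal{K} = \Big( \operatorname{span}\{ \mathcal{L}_\lambda(A) : \lambda \in \sigma(A) \setminus \mathbb{R} \} \Big)^{[\perp]}; \]
here $[\perp]$ denotes the orthogonal complement with respect to the indefinite inner product. The critical points of the spectral function $E$ are those points $\lambda \in \mathbb{R}$ for which the inner product $[\cdot,\cdot]$ is indefinite on $E(\Delta)\mathcal{K}$ for each $\Delta \in \mathcal{R}_A$ containing $\lambda$; all critical points of $E$ belong to $\sigma_0(A)$. If an interval $\Delta \in \mathcal{R}_A$ does not contain points of $\sigma_0(A)$, then the range $E(\Delta)\mathcal{K}$ is a positive subspace of $\mathcal{K}$ and hence a Hilbert space. Therefore, with the exception of the points of $\sigma_0(A) \cap \mathbb{R}$, the spectral behaviour of $A$ is that of a self-adjoint operator in a Hilbert space. In particular, for an admissible interval $\Delta$ with $\Delta \cap \sigma_0(A) = \emptyset$,
\[ A E(\Delta) = \int_\Delta \lambda\, E(d\lambda); \]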
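here, if $A$ is an unbounded operator and $\Delta$ is an unbounded interval, the expressions on either side coincide as unbounded operators. Given a point $\lambda_0 \in \sigma_0(A) \cap \mathbb{R}$, we choose an admissible interval $\Delta = [\alpha, \beta]$ such that $[\alpha, \beta] \cap \sigma_0(A) = \{\lambda_0\}$. If $\mathcal{L}_{\lambda_0}(A)$ is non-degenerated (e.g., if $\lambda_0$ is an eigenvalue of negative type), then the strong limits
\[ \lim_{\mu \uparrow \lambda_0} E([\alpha, \mu]), \qquad \lim_{\mu \downarrow \lambda_0} E([\mu, \beta]) \]
exist. They can be considered as spectral projections $E([\alpha, \lambda_0))$ and $E((\lambda_0, \beta])$ of $A$ corresponding to the intervals $[\alpha, \lambda_0)$ and $(\lambda_0, \beta]$, respectively, and the decomposition
\[ E(\Delta)\mathcal{K} = E([\alpha, \lambda_0))\mathcal{K} \;[\dotplus]\; \mathcal{L}_{\lambda_0}(A) \;[\dotplus]\; E((\lambda_0, \beta])\mathcal{K} \tag{2.4} \]
holds. If, however, $\mathcal{L}_{\lambda_0}(A)$ is degenerated, then at least one of the quantities
\[ \limsup_{\mu \uparrow \lambda_0} \|E([\alpha, \mu])\| \quad \text{or} \quad \limsup_{\mu \downarrow \lambda_0} \|E([\mu, \beta])\| \]
is infinite and the subspace $\mathcal{L}_{\lambda_0}(A)$ cannot be split off as in (2.4). If $A$ is an unbounded self-adjoint operator in a Pontryagin space $\mathcal{K}$, we choose a bounded admissible interval $\Delta$ which contains all the real points of $\sigma_0(A)$ and we consider the space
\[ \mathcal{L}_1 := E(\Delta)\mathcal{K} \;[\dotplus]\; \operatorname{span}\{ \mathcal{L}_\lambda(A) : \lambda \in \sigma(A) \setminus \mathbb{R} \}. \]
It is a Pontryagin space with negative index $\kappa$ that reduces $A$, and the restriction $A_1 := A|\mathcal{L}_1$ is a bounded operator. The orthogonal complement $\mathcal{L}_0$ of $\mathcal{L}_1$ in $\mathcal{K}$ is a Hilbert space with respect to the inner product $[\cdot,\cdot]$, and the decomposition $\mathcal{K} = \mathcal{L}_1 [\dotplus] \mathcal{L}_0$ yields a corresponding orthogonal decomposition of the operator $A$:
\[ A = A_1 \;[\dotplus]\; A_0. \tag{2.5} \]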
Here $A_1$ is a bounded self-adjoint operator in the Pontryagin space $\mathcal{L}_1$ with negative index $\kappa$ and $A_0$ is a self-adjoint operator in the Hilbert space $\mathcal{L}_0$. Thus, the study of an unbounded self-adjoint operator in a Pontryagin space can always be reduced to the study of a bounded self-adjoint operator in a Pontryagin space and of an unbounded self-adjoint operator in a Hilbert space. A bounded operator $U$ in a Pontryagin space $\mathcal{K}$ is called unitary if it maps $\mathcal{K}$ onto itself and $[Ux, Uy] = [x, y]$, $x, y \in \mathcal{K}$. Using the decomposition (2.5), it readily follows that a self-adjoint operator $A$ in a Pontryagin space generates a group $(\exp(\mathrm{i}tA))_{t \in \mathbb{R}}$ of unitary operators in $\mathcal{K}$ and that this group is exponentially bounded, that is, $\|\exp(\mathrm{i}tA)\| \le C\, e^{\gamma |t|}$, $t \in \mathbb{R}$, with positive constants $C$ and $\gamma$. This was first proved by M.A. Naĭmark in [Naĭ66].

3. An Operator Associated with the Abstract Klein–Gordon Equation
Let $(\mathcal{H}, (\cdot,\cdot))$ be a Hilbert space with corresponding norm $\|\cdot\|$, $H_0$ a strictly positive self-adjoint operator in $\mathcal{H}$, $H_0 \ge m^2 > 0$, and $V$ a symmetric operator in $\mathcal{H}$. By means of the operator $H_0$ we introduce the Hilbert space $(\mathcal{H}_{1/2}, (\cdot,\cdot)_{1/2})$ as
\[ \mathcal{H}_{1/2} := D\big(H_0^{1/2}\big), \qquad (x, y)_{1/2} := \big(H_0^{1/2}x, H_0^{1/2}y\big), \quad x, y \in \mathcal{H}_{1/2}. \tag{3.1} \]
In the orthogonal sum $\mathcal{G} := \mathcal{H}_{1/2} \oplus \mathcal{H}$ with norm $\|\mathbf{x}\|_{\mathcal{G}} = \big(\|H_0^{1/2}x\|^2 + \|y\|^2\big)^{1/2}$, $\mathbf{x} = (x\ y)^{\mathrm{t}} \in \mathcal{G}$, we consider the block operator matrix $\widehat{A}$, formally given by
\[ \widehat{A} := \begin{pmatrix} 0 & I \\ H_0 - V^2 & 2V \end{pmatrix}, \tag{3.2} \]
which arises from the differential equation (1.2) by means of the substitution (1.4) (see (1.5)). In order to associate a well-defined operator with the entry $H_0 - V^2$ in $\widehat{A}$, we make the following assumption:

Ass. (i) $D\big(H_0^{1/2}\big) \subset D(V)$;

this condition implies that the operator
\[ S := V H_0^{-1/2} \tag{3.3} \]
is everywhere defined and bounded on $\mathcal{H}$. In the next section we need that the operator associated with the formal expression $H_0 - V^2$ is boundedly invertible. In order to assure this we also assume

Ass. (ii) $1 \in \rho(S^*S)$, that is, the operator $I - S^*S$ is boundedly invertible.
In this case the operator $H_0^{-1/2}(I - S^*S)^{-1}H_0^{-1/2}$ is everywhere defined, injective, bounded and self-adjoint in $\mathcal{H}$; therefore the operator
\[ H := H_0^{1/2}(I - S^*S)H_0^{1/2}, \qquad D(H) = \big\{ x \in \mathcal{H}_{1/2} : (I - S^*S)H_0^{1/2}x \in \mathcal{H}_{1/2} \big\} \tag{3.4} \]
is self-adjoint and boundedly invertible in $\mathcal{H}$. The operator $H$ can also be considered as a densely defined closed operator from $\mathcal{H}_{1/2}$ to $\mathcal{H}$, for which we use the same symbol $H$: it is densely defined because $D(H)$ is dense in $\mathcal{H}$ and the inclusion $\mathcal{H}_{1/2} \hookrightarrow \mathcal{H}$ is continuous; it is closed since the middle factor is closed in $\mathcal{H}$, the left factor is boundedly invertible in $\mathcal{H}$ and the right factor is boundedly invertible as an operator from $\mathcal{H}_{1/2}$ to $\mathcal{H}$ (see [Kat66, Sect. III.5.2]).

Remark 3.1. The operator $H$ in $\mathcal{H}$ (from $\mathcal{H}_{1/2}$ to $\mathcal{H}$, respectively) can also be defined by means of quadratic forms if we replace the conditions (i) and (ii) by

Ass. (i′) $V$ is $H_0^{1/2}$-bounded with relative bound less than 1.

In fact, Assumption (i) is equivalent to the fact that $V$ is $H_0^{1/2}$-bounded, that is, $D\big(H_0^{1/2}\big) \subset D(V)$ and there exist constants $a, b \ge 0$ such that
\[ \|Vx\| \le a\|x\| + b\big\|H_0^{1/2}x\big\|, \quad x \in D\big(H_0^{1/2}\big). \tag{3.5} \]
In Assumption (i′) it is required, in addition, that (3.5) holds with $b < 1$, or, equivalently (see [Kat82, Sect. V.4.1]), there exist constants $a', b' \ge 0$, $b' < 1$, such that
\[ \|Vx\|^2 \le a'^2\|x\|^2 + b'^2\big\|H_0^{1/2}x\big\|^2, \quad x \in D\big(H_0^{1/2}\big). \tag{3.6} \]
If we introduce the forms
\[ h[x, y] := \big(H_0^{1/2}x, H_0^{1/2}y\big), \quad x, y \in D\big(H_0^{1/2}\big), \qquad v_2[x, y] := (Vx, Vy), \quad x, y \in D(V), \]
then (3.6) (and hence (i′)) implies that the form $v_2$ is $h$-bounded with relative form-bound less than 1. Then, according to [Kat82, Theorem VI.3.9], the form sum $h + v_2$ is closed and symmetric, and the entry $H_0 - V^2$ in (3.7) can be defined by means of the self-adjoint operator in $\mathcal{H}$ induced by the form sum $h + v_2$. Our choice of the conditions (i) and (ii) rather than (i′) is due to the fact that Assumption (ii) is needed in the next section for other reasons.

With the formal matrix $\widehat{A}$ in (3.2) we now associate the block operator matrix $A$ in $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$ defined by
\[ A = \begin{pmatrix} 0 & I \\ H & 2V \end{pmatrix}, \qquad D(A) = D(H) \oplus D\big(H_0^{1/2}\big). \tag{3.7} \]

Lemma 3.2. If $D\big(H_0^{1/2}\big) \subset D(V)$ and $1 \in \rho(S^*S)$, then the operator $A$ from (3.7) is boundedly invertible, and hence closed in $\mathcal{G}$, with
\[ A^{-1} = \begin{pmatrix} -2H^{-1}V & H^{-1} \\ I & 0 \end{pmatrix}. \tag{3.8} \]
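Proof. By the assumptions, $H$ is a boundedly invertible operator from $\mathcal{H}_{1/2}$ to $\mathcal{H}$. Hence formally the inverse of $A$ is given by (3.8). It remains to be shown that $A^{-1}$ is a bounded operator in $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$. This follows from the facts that $H^{-1}$ is a bounded operator from $\mathcal{H}$ to $\mathcal{H}_{1/2}$, the identity $I$ is bounded as an operator from $\mathcal{H}_{1/2}$ to $\mathcal{H}$ since the inclusion $\mathcal{H}_{1/2} \hookrightarrow \mathcal{H}$ is continuous, and $H^{-1}V = H_0^{-1/2}(I - S^*S)^{-1}H_0^{-1/2}V$ is a bounded operator in $\mathcal{H}_{1/2}$. For the latter, we observe that $V$ is bounded from $\mathcal{H}_{1/2}$ to $\mathcal{H}$ by the first assumption, $(I - S^*S)^{-1}H_0^{-1/2}$ is bounded in $\mathcal{H}$ by the second assumption, and $H_0^{-1/2}$ is bounded from $\mathcal{H}$ to $\mathcal{H}_{1/2}$.

The operator $A$ is related to another operator associated with the Klein–Gordon equation (1.2) which is formally given by (1.8) and arises from the substitution (1.7): In the orthogonal sum $\mathcal{G}_1 := \mathcal{H} \oplus \mathcal{H}$ we consider the operator
\[ \widehat{A}_1 := \begin{pmatrix} V & I \\ H_0 & V \end{pmatrix} \tag{3.9} \]
with domain
\[ D(\widehat{A}_1) := \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H} \oplus \mathcal{H} : x \in D(V) \cap D(H_0),\ y \in D(V) \right\}. \]
It has been shown in [LNT06, Thm. 3.1] that Assumption (i) implies that $\widehat{A}_1$ is closable with closure $A_1$ given by
\[ D(A_1) = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H} \oplus \mathcal{H} : x \in D\big(H_0^{1/2}\big),\ H_0^{1/2}x + S^*y \in D\big(H_0^{1/2}\big) \right\}, \tag{3.10} \]
\[ A_1 \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} Vx + y \\ H_0^{1/2}\big(H_0^{1/2}x + S^*y\big) \end{pmatrix}. \tag{3.11} \]
In order to establish the relation between $A$ and $A_1$, we introduce the unbounded operator $W$ from $\mathcal{G}_1 = \mathcal{H} \oplus \mathcal{H}$ to $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$ as
\[ W := \begin{pmatrix} I & 0 \\ V & I \end{pmatrix}, \qquad D(W) := \mathcal{H}_{1/2} \oplus \mathcal{H}. \]
Its inverse
\[ W^{-1} = \begin{pmatrix} I & 0 \\ -V & I \end{pmatrix} \]
(denoted by W in (1.10)) is a bounded operator from $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$ to $\mathcal{G}_1 = \mathcal{H} \oplus \mathcal{H}$ since $V$ is a bounded operator from $\mathcal{H}_{1/2}$ to $\mathcal{H}$ by Assumption (i).

Lemma 3.3. If Assumptions (i) and (ii) are satisfied, then
\[ A = W A_1 W^{-1}. \tag{3.12} \]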
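Proof. Using the description of the domain of $A_1$ from (3.10) and the fact that for $y \in \mathcal{H}_{1/2} \subset D(V)$ we have $S^*y = H_0^{-1/2}V^*y = H_0^{-1/2}Vy \in \mathcal{H}_{1/2}$, we find
\[ \begin{aligned} D(W A_1 W^{-1}) &= \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H}_{1/2} \oplus \mathcal{H} : \begin{pmatrix} x \\ -Vx + y \end{pmatrix} \in D(A_1),\ A_1 \begin{pmatrix} x \\ -Vx + y \end{pmatrix} \in \mathcal{H}_{1/2} \oplus \mathcal{H} \right\} \\ &= \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H}_{1/2} \oplus \mathcal{H} : H_0^{1/2}x + S^*(-Vx + y) \in D\big(H_0^{1/2}\big),\ Vx - Vx + y \in \mathcal{H}_{1/2} \right\} \\ &= \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H}_{1/2} \oplus \mathcal{H} : H_0^{1/2}x - S^*V H_0^{-1/2}H_0^{1/2}x \in D\big(H_0^{1/2}\big),\ y \in \mathcal{H}_{1/2} \right\} \\ &= \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in \mathcal{H}_{1/2} \oplus \mathcal{H} : (I - S^*S)H_0^{1/2}x \in D\big(H_0^{1/2}\big),\ y \in \mathcal{H}_{1/2} \right\} \\ &= D(H) \oplus \mathcal{H}_{1/2} = D(A). \end{aligned} \]
That the operators $A$ and $W A_1 W^{-1}$ coincide is seen as follows: for $x \in D(H)$ and $y \in \mathcal{H}_{1/2} = D\big(H_0^{1/2}\big)$ we have, observing (3.11) and (3.4),
\[ \begin{aligned} \begin{pmatrix} I & 0 \\ V & I \end{pmatrix} A_1 \begin{pmatrix} x \\ -Vx + y \end{pmatrix} &= \begin{pmatrix} I & 0 \\ V & I \end{pmatrix} \begin{pmatrix} Vx - Vx + y \\ H_0^{1/2}\big(H_0^{1/2}x + S^*(-Vx + y)\big) \end{pmatrix} \\ &= \begin{pmatrix} y \\ H_0^{1/2}\big(H_0^{1/2}x + S^*(-Vx + y)\big) + Vy \end{pmatrix} \\ &= \begin{pmatrix} y \\ Hx + H_0^{1/2}S^*y + Vy \end{pmatrix} = \begin{pmatrix} y \\ Hx + 2Vy \end{pmatrix} = A \begin{pmatrix} x \\ y \end{pmatrix}, \end{aligned} \]
where we have used that $y \in \mathcal{H}_{1/2} \subset D(V)$ and $H_0^{1/2}S^* = \big(S H_0^{1/2}\big)^* = V^* \supset V$.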
4. Indefinite Inner Products

In this section we always suppose that Assumptions (i) and (ii) are satisfied and we consider the operator $A$ from (3.7). Obviously, $A$ is not symmetric with respect to the Hilbert space inner product of $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$. However, it exhibits symmetry with respect to another inner product which is, in general, indefinite. This so-called energy inner product on $\mathcal{G}$ is defined as
\[ \langle \mathbf{x}, \mathbf{x}' \rangle := \big(H_0^{1/2}x, H_0^{1/2}x'\big) - (Vx, Vx') + (y, y') \]
for $\mathbf{x} = (x\ y)^{\mathrm{t}}$, $\mathbf{x}' = (x'\ y')^{\mathrm{t}} \in \mathcal{G}$, which, using $S = V H_0^{-1/2}$, can also be written as
\[ \langle \mathbf{x}, \mathbf{x}' \rangle = \big((I - S^*S)H_0^{1/2}x, H_0^{1/2}x'\big) + (y, y'). \tag{4.1} \]

Lemma 4.1. Under Assumptions (i) and (ii), the space $\mathcal{K} := (\mathcal{G}, \langle\cdot,\cdot\rangle)$ is a Krein space. If, additionally, the number of negative eigenvalues of the operator $I - S^*S$ in $\mathcal{H}$ is finite, say $\kappa$, then $\mathcal{K}$ is a Pontryagin space with negative index $\kappa$.
Proof. Due to Assumptions (i) and (ii), the operator $I - S^*S$ is bounded and self-adjoint in $\mathcal{H}$ with $0 \in \rho(I - S^*S)$. Hence $\mathcal{H}$ equipped with the inner product $\big((I - S^*S)\,\cdot\,, \cdot\big)$ is a Krein space (see Sect. 2.2); if, in addition, the number of negative eigenvalues of $I - S^*S$ in $\mathcal{H}$ is finite, say $\kappa$, it is a Pontryagin space with negative index $\kappa$. Now the claim follows since $H_0^{1/2} : \mathcal{H}_{1/2} \to \mathcal{H}$ is an isomorphism.

Remark 4.2. If $\|S\| < 1$, that is, $\kappa = 0$, then $\mathcal{K}$ is a Hilbert space.

Theorem 4.3. Suppose that $D\big(H_0^{1/2}\big) \subset D(V)$ and that $1 \in \rho(S^*S)$. Then $A$ is a self-adjoint operator in the Krein space $\mathcal{K}$ with $\rho(A) \neq \emptyset$.

Proof. For $\mathbf{x} = (x\ y)^{\mathrm{t}} \in D(A) = D(H) \oplus D\big(H_0^{1/2}\big)$, we obtain, using (4.1),
\[ \begin{aligned} \langle A\mathbf{x}, \mathbf{x} \rangle &= \left\langle \begin{pmatrix} y \\ Hx + 2Vy \end{pmatrix}, \begin{pmatrix} x \\ y \end{pmatrix} \right\rangle = \big((I - S^*S)H_0^{1/2}y, H_0^{1/2}x\big) + (Hx + 2Vy, y) \\ &= (y, Hx) + (Hx, y) + 2(Vy, y), \end{aligned} \]
which is real. Thus the operator $A$ is symmetric in $\mathcal{K}$. In order to prove that $A$ is self-adjoint in $\mathcal{K}$, it remains to be shown that $\rho(A)$ contains a real point $\mu$ (then $(A - \mu)^{-1}$ is bounded and symmetric in $\mathcal{K}$ and hence self-adjoint in $\mathcal{K}$). Hence, to complete the proof, it suffices to show that $0 \in \rho(A)$. For $\mathbf{f} = (f\ g)^{\mathrm{t}} \in \mathcal{H}_{1/2} \oplus \mathcal{H}$, the equation $A\mathbf{x} = \mathbf{f}$ with $\mathbf{x} = (x\ y)^{\mathrm{t}} \in D(A) = D(H) \oplus \mathcal{H}_{1/2}$ is equivalent to
\[ y = f, \qquad Hx + 2Vy = g. \]
Since $1 \in \rho(S^*S)$, $H$ is boundedly invertible (see (3.4)) and so the second equation with $y = f$ has a unique solution $x \in D(H)$, whence $A\mathbf{x} = \mathbf{f}$ has the unique solution
\[ \mathbf{x} = \begin{pmatrix} H^{-1}(-2Vf + g) \\ f \end{pmatrix} \in D(H) \oplus \mathcal{H}_{1/2} = D(A), \]
which proves that $0 \in \rho(A)$.
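As a sanity check of (3.7), (3.8) and (4.1), one can look at the simplest possible case. The following minimal sketch assumes a scalar toy model that is not part of the paper: $\mathcal{H} = \mathbb{C}$, $H_0 = h > 0$ and $V = v$ real, so that $S = v/\sqrt{h}$, Assumption (ii) reads $v^2 \neq h$, and $H = h - v^2$. The function names `model`, `matmul`, `apply` and `energy` and the values of `h` and `v` are hypothetical.

```python
# Scalar toy model (an assumption for illustration): H = C, H0 = h > 0, V = v real.
# Then S = v / sqrt(h), Assumption (ii) reads v*v != h, and H = h - v*v.

def model(h, v):
    """Return the scalar H, the matrix A from (3.7) and the inverse claimed in (3.8)."""
    H = h - v * v
    A = ((0.0, 1.0), (H, 2.0 * v))
    A_inv = ((-2.0 * v / H, 1.0 / H), (1.0, 0.0))
    return H, A, A_inv

def matmul(a, b):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def apply(a, x):
    """Apply a 2x2 matrix to a vector (x1, x2)."""
    return (a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1])

def energy(H, x, y):
    """Energy inner product (4.1) for real vectors: H*x1*y1 + x2*y2."""
    return H * x[0] * y[0] + x[1] * y[1]

h, v = 4.0, 3.0                      # v*v > h, so the energy form is indefinite
H, A, A_inv = model(h, v)
product = matmul(A, A_inv)           # should be the identity up to rounding

x, y = (1.0, 2.0), (-3.0, 0.5)
lhs = energy(H, apply(A, x), y)      # <A x, y>
rhs = energy(H, x, apply(A, y))      # <x, A y>

print("A * A_inv =", product)
print("<Ax, y> - <x, Ay> =", lhs - rhs)
print("<e1, e1> =", energy(H, (1.0, 0.0), (1.0, 0.0)))  # negative vector
```

With $v^2 > h$ the vector $(1\ 0)^{\mathrm{t}}$ is negative, so this toy space has negative index 1, while $A$ remains symmetric with respect to the energy inner product.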
The energy inner product $\langle\cdot,\cdot\rangle$ on $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$ is related to the so-called charge inner product $[\cdot,\cdot]$ on $\mathcal{G}_1 = \mathcal{H} \oplus \mathcal{H}$ which is given by
\[ [\mathbf{x}, \mathbf{x}'] := (x, y') + (y, x') = (G\mathbf{x}, \mathbf{x}') \tag{4.2} \]
with
\[ G := \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}. \]
Obviously, the space $\mathcal{K}_1 := (\mathcal{G}_1, [\cdot,\cdot])$ is a Krein space for which the positive and negative components in the decomposition (2.1) have the same dimension; in particular, if $\mathcal{H}$ is infinite dimensional (as is the case for the Klein–Gordon equation), then both components are infinite dimensional.
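The operator $\widehat{A}_1$ given by (3.9) is symmetric with respect to the charge inner product in $\mathcal{K}_1$ since, for $\mathbf{x} = (x\ y)^{\mathrm{t}} \in D(\widehat{A}_1) = D(H_0) \oplus D(V)$,
\[ [\widehat{A}_1\mathbf{x}, \mathbf{x}] = (G\widehat{A}_1\mathbf{x}, \mathbf{x}) = (H_0x, x) + (Vy, x) + (x, Vy) + (y, y), \]
which is real. Moreover, it has been shown in [LNT06] that, under Assumption (i), the closure $A_1$ of $\widehat{A}_1$ given by (3.10), (3.11) is self-adjoint in $\mathcal{K}_1$ with $\rho(A_1) \neq \emptyset$.

Proposition 4.4. Between the indefinite inner products $\langle\cdot,\cdot\rangle$ of $\mathcal{G}$ and $[\cdot,\cdot]$ of $\mathcal{G}_1$ the following relations hold:
i) $\langle \mathbf{x}, \mathbf{x}' \rangle = [A_1 W^{-1}\mathbf{x}, W^{-1}\mathbf{x}']$, $\mathbf{x} \in W D(A_1)$, $\mathbf{x}' \in \mathcal{G}$,
ii) $\langle A\mathbf{x}, \mathbf{x}' \rangle = [A_1^2 W^{-1}\mathbf{x}, W^{-1}\mathbf{x}']$, $\mathbf{x} \in D(A)$, $A\mathbf{x} \in W D(A_1)$, $\mathbf{x}' \in \mathcal{G}$.

Proof. i) Let $\mathbf{x} = (x\ y)^{\mathrm{t}} \in W D(A_1)$, $\mathbf{x}' = (x'\ y')^{\mathrm{t}} \in \mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$, and set
\[ \mathbf{u} = \begin{pmatrix} u \\ v \end{pmatrix} := W^{-1}\mathbf{x} = \begin{pmatrix} x \\ -Vx + y \end{pmatrix} \in D(A_1). \]
Then we have
\[ \mathbf{x} = W\mathbf{u} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} u \\ Vu + v \end{pmatrix}, \]
and the left-hand side of i) becomes
\[ \langle \mathbf{x}, \mathbf{x}' \rangle = \big(H_0^{1/2}x, H_0^{1/2}x'\big) - (Vx, Vx') + (y, y') = \big(H_0^{1/2}u, H_0^{1/2}x'\big) - (Vu, Vx') + (Vu + v, y'). \]
Using (3.11) and the fact that $x' \in \mathcal{H}_{1/2}$, we can rewrite the right-hand side of i) as
\[ \begin{aligned} [A_1 W^{-1}\mathbf{x}, W^{-1}\mathbf{x}'] &= \left[ A_1 \begin{pmatrix} u \\ v \end{pmatrix}, \begin{pmatrix} x' \\ -Vx' + y' \end{pmatrix} \right] = \left[ \begin{pmatrix} Vu + v \\ H_0(u + T^*v) \end{pmatrix}, \begin{pmatrix} x' \\ -Vx' + y' \end{pmatrix} \right] \\ &= (Vu + v, -Vx' + y') + \big(H_0^{1/2}(u + T^*v), H_0^{1/2}x'\big) \\ &= (Vu + v, -Vx' + y') + \big(H_0^{1/2}u, H_0^{1/2}x'\big) + \big(H_0^{1/2}T^*v, H_0^{1/2}x'\big). \end{aligned} \]
Since $T = V H_0^{-1}$, the last summand equals $(v, T H_0 x') = (v, Vx')$ and i) follows.
ii) Let $\mathbf{x} \in D(A)$ be such that $A\mathbf{x} \in W D(A_1)$ and hence $W^{-1}A\mathbf{x} \in D(A_1)$. Since, by the operator equality (3.12), we have $W^{-1}A\mathbf{x} = A_1 W^{-1}\mathbf{x}$, it follows that $A_1 W^{-1}\mathbf{x} \in D(A_1)$ and further, by i),
\[ \langle A\mathbf{x}, \mathbf{x}' \rangle = [A_1 W^{-1}A\mathbf{x}, W^{-1}\mathbf{x}'] = [A_1^2 W^{-1}\mathbf{x}, W^{-1}\mathbf{x}'] \]
for arbitrary $\mathbf{x}' \in \mathcal{G}$.

Lemma 4.5. Let $D\big(H_0^{1/2}\big) \subset D(V)$. Then the set $W D(A_1)$ is dense in $\mathcal{G}$.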
Proof. By (3.10), we have
\[ W D(A_1) = \left\{ \begin{pmatrix} x \\ Vx + y \end{pmatrix} : x \in D(V),\ x + T^*y \in D(H_0) \right\}. \]
Hence if $(x_0\ y_0)^{\mathrm{t}} \in \mathcal{G}$ is orthogonal to $W D(A_1)$ with respect to the Hilbert space inner product in $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$, then
\[ \left( \begin{pmatrix} x \\ Vx + y \end{pmatrix}, \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} \right)_{\mathcal{G}} = \big(H_0^{1/2}x, H_0^{1/2}x_0\big) + (Vx + y, y_0) = 0 \tag{4.3} \]
for all $(x\ y)^{\mathrm{t}} \in W D(A_1)$. If we choose $x = 0$ and $y \in D(V)$, then $T^*y = H_0^{-1}V^*y = H_0^{-1}Vy \in D(H_0)$ and hence $(0\ y)^{\mathrm{t}} \in W D(A_1)$. Now (4.3) shows that $y_0$ is orthogonal in $\mathcal{H}$ to the dense subset $D(V)$ and thus $y_0 = 0$. If we choose $x \in D(H_0)$ and $y = 0$, then $(x\ 0)^{\mathrm{t}} \in W D(A_1)$ because $D(H_0) \subset D\big(H_0^{1/2}\big) \subset D(V)$ by assumption. Since $H_0$ is bijective, (4.3) implies that $x_0$ is orthogonal to $\mathcal{H}$ and hence $x_0 = 0$.

Remark 4.6. If the operator $I - S^*S$ has only finitely many, say $\kappa$, negative eigenvalues, the Pontryagin space $\mathcal{K}$ and the operator $A$ can also be introduced by means of the operator $A_1$ in the space $\mathcal{G}_1$ as follows. By Proposition 4.4 i), the indefinite inner product $\langle\cdot,\cdot\rangle$ is defined on the dense subset $W D(A_1)$ of $\mathcal{G}$. Since $I - S^*S$ has only $\kappa$ negative eigenvalues, the form $[A_1\,\cdot\,, \cdot]$ on $D(A_1) \subset \mathcal{G}_1$, and hence $\langle\cdot,\cdot\rangle$ on $W D(A_1)$, has $\kappa$ negative squares (see [LNT06]). Therefore $\mathcal{K}$ is the Pontryagin space completion of $W D(A_1) \subset \mathcal{G}$ with respect to the inner product $\langle\cdot,\cdot\rangle$; the operator $A$ can now be defined by the relation in Proposition 4.4 ii).

5. Spectral Properties of the Operator A

In this section we exploit the self-adjointness of the operator $A$ with respect to the indefinite inner product $\langle\cdot,\cdot\rangle$. We show that $A$ possesses a spectral function with at most finitely many critical points, we investigate the structure of the spectrum of $A$ and consider the solvability of an abstract Cauchy problem for $A$ (and hence for the Klein–Gordon equation). In order to guarantee that $\mathcal{K}$ is a Pontryagin space, in addition to the Assumptions (i) and (ii), we suppose that

Ass. (iii) $S = V H_0^{-1/2} = S_0 + S_1$ with $\|S_0\| < 1$ and a compact operator $S_1$.

To study the spectrum and essential spectrum of $A$ under Assumption (iii), the following lemma for the particular case $\|S\| < 1$ is useful.

Lemma 5.1. Suppose that $D\big(H_0^{1/2}\big) \subset D(V)$ and $1 \in \rho(S^*S)$. Define the quadratic pencil $L$ of bounded operators in $\mathcal{H}$ by
\[ L(\lambda) := I - S^*S + \lambda\big(S^*H_0^{-1/2} + H_0^{-1/2}S\big) - \lambda^2 H_0^{-1}, \quad \lambda \in \mathbb{C}. \tag{5.1} \]
If $\|S\| < 1$, then $\rho(A) = \rho(L)$ and, with $\alpha := (1 - \|S\|)m$, $(-\alpha, \alpha) \subset \rho(A)$.
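Proof. By Lemma 3.2, the operator $A$ has a bounded inverse $A^{-1}$. By the spectral mapping theorem (see [EE87, Thm. IX.2.3]), we have
\[ \lambda \in \rho(A) \iff \mu := \lambda^{-1} \in \rho\big(A^{-1}\big). \]
If $\|S\| < 1$, then the operator $Q := I - S^*S$ is uniformly positive. Hence, by (3.4), we can write $H = H_0^{1/2}Q^{1/2}Q^{1/2}H_0^{1/2}$, $H^{-1} = H_0^{-1/2}Q^{-1/2}Q^{-1/2}H_0^{-1/2}$, and thus we can factorize the inverse $A^{-1}$ given by (3.8) as
\[ A^{-1} = \begin{pmatrix} -2H^{-1}V & H^{-1} \\ I & 0 \end{pmatrix} = \begin{pmatrix} H_0^{-1/2}Q^{-1/2} & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} -2Q^{-1/2}H_0^{-1/2}V & Q^{-1/2}H_0^{-1/2} \\ I & 0 \end{pmatrix}; \]
here the right factor is an operator from $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$ to $\mathcal{H} \oplus \mathcal{H}$ and the left factor is an operator from $\mathcal{H} \oplus \mathcal{H}$ back to $\mathcal{G} = \mathcal{H}_{1/2} \oplus \mathcal{H}$. If we exchange the order of the factors and define the auxiliary operator $B$ in $\mathcal{H} \oplus \mathcal{H}$ by
\[ \begin{aligned} B &:= \begin{pmatrix} -2Q^{-1/2}H_0^{-1/2}V & Q^{-1/2}H_0^{-1/2} \\ I & 0 \end{pmatrix} \begin{pmatrix} H_0^{-1/2}Q^{-1/2} & 0 \\ 0 & I \end{pmatrix} \\ &= \begin{pmatrix} -Q^{-1/2}S^*H_0^{-1/2}Q^{-1/2} - Q^{-1/2}H_0^{-1/2}SQ^{-1/2} & Q^{-1/2}H_0^{-1/2} \\ H_0^{-1/2}Q^{-1/2} & 0 \end{pmatrix}, \end{aligned} \]
then $\rho\big(A^{-1}\big) \setminus \{0\} = \rho(B) \setminus \{0\}$; here we have used that $H_0^{-1/2}VH_0^{-1/2} = S^*H_0^{-1/2} = H_0^{-1/2}S$. For $\mu \in \mathbb{C}$, $\mu \neq 0$, we have $\mu \in \rho(B)$ if and only if for every $(f\ g)^{\mathrm{t}} \in \mathcal{H} \oplus \mathcal{H}$ there exists a unique $(x\ y)^{\mathrm{t}} \in \mathcal{H} \oplus \mathcal{H}$ such that
\[ \begin{aligned} \big(-Q^{-1/2}S^*H_0^{-1/2}Q^{-1/2} - Q^{-1/2}H_0^{-1/2}SQ^{-1/2} - \mu\big)x + Q^{-1/2}H_0^{-1/2}y &= f, \\ H_0^{-1/2}Q^{-1/2}x - \mu y &= g. \end{aligned} \]
If we divide both equations by $\mu$ ($\neq 0$) and insert the second into the first, we see that this is equivalent to
\[ \begin{aligned} -Q^{-1/2}\Big( Q + \frac{1}{\mu}\big(S^*H_0^{-1/2} + H_0^{-1/2}S\big) - \frac{1}{\mu^2}H_0^{-1} \Big)Q^{-1/2}x &= \frac{1}{\mu}f + \frac{1}{\mu^2}Q^{-1/2}H_0^{-1/2}g, \\ y &= \frac{1}{\mu}\big(H_0^{-1/2}Q^{-1/2}x - g\big). \end{aligned} \]
Since $Q^{-1/2}$ is bounded and boundedly invertible, the latter is equivalent to the fact that $\mu$ belongs to the resolvent set of the operator pencil given by
\[ Q + \frac{1}{\mu}\big(S^*H_0^{-1/2} + H_0^{-1/2}S\big) - \frac{1}{\mu^2}H_0^{-1} \]
or, equivalently, $\lambda = \mu^{-1} \in \rho(L)$ with $L$ given by (5.1). This completes the proof of $\rho(A) = \rho(L)$. Finally, for $\lambda \in \mathbb{R}$, $|\lambda| < (1 - \|S\|)m$, the estimate
\[ \big\| S^*S - \lambda\big(S^*H_0^{-1/2} + H_0^{-1/2}S\big) + \lambda^2H_0^{-1} \big\| < \|S\|^2 + 2\big(1 - \|S\|\big)m\,\|S\|\,\frac{1}{m} + \big(1 - \|S\|\big)^2m^2\,\frac{1}{m^2} = 1 \]
and (5.1) show that $\lambda \in \rho(L) = \rho(A)$.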
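To see what Lemma 5.1 says in the simplest situation, the following worked computation assumes a scalar toy model that is not part of the paper: $\mathcal{H} = \mathbb{C}$, $H_0 = h \ge m^2$ and $V = v \in \mathbb{R}$ with $|v| < \sqrt{h}$, so that $S = v/\sqrt{h}$ and $\|S\| < 1$.

```latex
% Scalar sketch (assumption: H = C, H_0 = h >= m^2, V = v real, |v| < sqrt(h)).
\[
  L(\lambda) = 1 - \frac{v^2}{h} + \frac{2\lambda v}{h} - \frac{\lambda^2}{h}
             = \frac{h - (\lambda - v)^2}{h},
  \qquad
  L(\lambda) = 0 \iff \lambda = \lambda_\pm := v \pm \sqrt{h}.
\]
% The same two points are the eigenvalues of the 2x2 matrix A from (3.7):
\[
  \det\begin{pmatrix} -\lambda & 1 \\ h - v^2 & 2v - \lambda \end{pmatrix}
  = \lambda^2 - 2v\lambda - (h - v^2) = (\lambda - v)^2 - h .
\]
% Spectral gap: both zeros stay outside (-alpha, alpha).
\[
  |\lambda_\pm| \ge \sqrt{h} - |v| = \big(1 - |S|\big)\sqrt{h}
  \ge \big(1 - |S|\big)\, m = \alpha .
\]
```

In this toy case the zeros of the pencil and the eigenvalues of $A$ coincide, in line with $\rho(A) = \rho(L)$, and the gap $(-\alpha, \alpha)$ is attained exactly when $h = m^2$.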
Theorem 5.2. Suppose that $D\big(H_0^{1/2}\big) \subset D(V)$, that $S = V H_0^{-1/2} = S_0 + S_1$ with $\|S_0\| < 1$ and a compact operator $S_1$, and that $1 \in \rho(S^*S)$. Then we have:
i) $\mathcal{K}$ is a Pontryagin space with finite negative index $\kappa$, where $\kappa$ is the number of negative eigenvalues of the operator $I - S^*S$.
ii) The self-adjoint operator $A$ in $\mathcal{K}$ has a spectral function with at most finitely many critical points.
iii) The non-real spectrum of $A$ is symmetric with respect to the real axis and consists of at most $\kappa$ pairs of eigenvalues $\lambda, \bar\lambda$ of finite type; the algebraic eigenspaces corresponding to $\lambda$ and $\bar\lambda$ are isomorphic.
iv) The linear span of all the algebraic eigenspaces corresponding to the eigenvalues of $A$ in the upper (or lower) half plane is a neutral subspace of $\mathcal{K}$ and
\[ \kappa = \sum_{\lambda \in \sigma_0(A) \cap \mathbb{R}} \kappa_\lambda^-(A) + \sum_{\lambda \in \sigma(A) \cap \mathbb{C}^+} \dim \mathcal{L}_\lambda(A); \]
here $\sigma_0(A)$ denotes the set of all eigenvalues of $A$ with non-positive eigenvector.
v) The points of $\sigma(A) \setminus \sigma_0(A)$ (which are all real) are spectral points of positive type.
vi) The essential spectrum $\sigma_{\mathrm{ess}}(A)$ is real and $\sigma_{\mathrm{ess}}(A) \cap (-\alpha, \alpha) = \emptyset$, where $\alpha := (1 - \|S_0\|)\,m$.
vii) The operator $A$ generates a strongly continuous group $\big(\exp(\mathrm{i}At)\big)_{t \in \mathbb{R}}$ of unitary operators in $\mathcal{K}$ and hence the Cauchy problem
\[ \frac{d\mathbf{x}}{dt} = \mathrm{i}A\mathbf{x}, \qquad \mathbf{x}(0) = \mathbf{x}_0, \]
has the unique solution $\mathbf{x}(t) = \exp(\mathrm{i}At)\,\mathbf{x}_0$, $t \in \mathbb{R}$, for all initial values $\mathbf{x}_0 \in \mathcal{K}$.

Proof. i) Assumption (iii) on $S$ implies that $I - S^*S = I - S_0^*S_0 + K$, where $I - S_0^*S_0$ is uniformly positive and $K$ is a compact operator in $\mathcal{H}$. Thus $I - S^*S$ has only a finite number $\kappa$ of negative eigenvalues and hence $\mathcal{K}$ is a Pontryagin space of finite negative index $\kappa$ by Lemma 4.1.
ii), iii), iv), v), and vii) are immediate consequences of Theorem 4.3 and of i) by [Lan82] (see also Sects. 2.4 and 2.5).
vi) We define an operator $A_0$ in $\mathcal{G}$ by
\[ A_0 := \begin{pmatrix} 0 & I \\ H_0^{1/2}(I - S_0^*S_0)H_0^{1/2} & 2V \end{pmatrix}. \tag{5.2} \]
By the spectral mapping theorem (see [EE87, Thm. IX.2.3]), we have
\[ \lambda \in \sigma_{\mathrm{ess}}(A) \iff \mu := \lambda^{-1} \in \sigma_{\mathrm{ess}}\big(A^{-1}\big), \qquad \lambda \in \sigma_{\mathrm{ess}}(A_0) \iff \mu := \lambda^{-1} \in \sigma_{\mathrm{ess}}\big(A_0^{-1}\big). \tag{5.3} \]
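The difference $A^{-1} - A_0^{-1}$ is compact since $S_1$ is compact by assumption; in fact, since $S_0^*S_0 - S^*S = -(S_1^*S_0 + S_0^*S_1 + S_1^*S_1)$,
\[ A_0^{-1} - A^{-1} = A^{-1}(A - A_0)A_0^{-1} = -A^{-1} \begin{pmatrix} 0 & 0 \\ 0 & H_0^{1/2} \end{pmatrix} \begin{pmatrix} 0 & 0 \\ S_1^*S_0 + S_0^*S_1 + S_1^*S_1 & 0 \end{pmatrix} \begin{pmatrix} H_0^{1/2} & 0 \\ 0 & 0 \end{pmatrix} A_0^{-1}, \]
which is compact since $S_1^*S_0 + S_0^*S_1 + S_1^*S_1$ is compact and
\[ A^{-1} \begin{pmatrix} 0 & 0 \\ 0 & H_0^{1/2} \end{pmatrix} = \begin{pmatrix} 0 & H^{-1}H_0^{1/2} \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & H_0^{-1/2}(I - S^*S)^{-1} \\ 0 & 0 \end{pmatrix}, \]
\[ \begin{pmatrix} H_0^{1/2} & 0 \\ 0 & 0 \end{pmatrix} A_0^{-1} = \begin{pmatrix} -2(I - S_0^*S_0)^{-1}H_0^{-1/2}V & (I - S_0^*S_0)^{-1}H_0^{-1/2} \\ 0 & 0 \end{pmatrix} \]
are bounded. By iii), $\sigma(A)$ has empty interior as a subset of $\mathbb{C}$. In addition, part iii) applied to $A$ and $A_0$ shows that each of the at most two components of $\mathbb{C} \setminus \sigma(A)$ contains a point in $\rho(A_0)$. Hence, by [RS78, Lemma XIII.4], $\sigma_{\mathrm{ess}}\big(A^{-1}\big) = \sigma_{\mathrm{ess}}\big(A_0^{-1}\big)$ and thus, by (5.3),
\[ \sigma_{\mathrm{ess}}(A) = \sigma_{\mathrm{ess}}(A_0). \tag{5.4} \]
Now Lemma 5.1 applied to $A_0$ shows that $(-\alpha, \alpha) \subset \rho(A_0)$ and, consequently, $\sigma_{\mathrm{ess}}(A) \cap (-\alpha, \alpha) = \sigma_{\mathrm{ess}}(A_0) \cap (-\alpha, \alpha) = \emptyset$.

The special cases that $S$ is compact or that $\|S\| < 1$ in Theorem 5.2 have been considered before (see, e.g., [LN96] and [Naj79], respectively):

Remark 5.3. Suppose that $D\big(H_0^{1/2}\big) \subset D(V)$ and $1 \in \rho(S^*S)$.
i) If $V$ is $H_0^{1/2}$-compact, then
\[ \sigma_{\mathrm{ess}}(A) = \big\{ \lambda \in \mathbb{C} : \lambda^2 \in \sigma_{\mathrm{ess}}(H_0) \big\}. \]
ii) If $\|V H_0^{-1/2}\| < 1$, then $\kappa = 0$, $\mathcal{K}$ is a Hilbert space, $A$ is self-adjoint in this Hilbert space, and
\[ \sigma(A) \cap (-\alpha, \alpha) = \emptyset \quad \text{with} \quad \alpha = \big(1 - \|V H_0^{-1/2}\|\big)\,m. \]

Proof. i) If $V$ is $H_0^{1/2}$-compact, we can choose $S_0 = 0$ and $S_1 = V H_0^{-1/2}$ in Assumption (iii). Then (5.4), (5.3), and (5.2) show that
\[ \lambda \in \sigma_{\mathrm{ess}}(A) \iff \lambda^{-1} \in \sigma_{\mathrm{ess}}\big(A_0^{-1}\big) = \sigma_{\mathrm{ess}} \begin{pmatrix} -2H_0^{-1}V & H_0^{-1} \\ I & 0 \end{pmatrix}. \]
If we define
\[ D := \begin{pmatrix} 0 & H_0^{-1} \\ I & 0 \end{pmatrix}, \]
then
\[ A_0^{-1} - D = \begin{pmatrix} -2H_0^{-1}V & H_0^{-1} \\ I & 0 \end{pmatrix} - \begin{pmatrix} 0 & H_0^{-1} \\ I & 0 \end{pmatrix} = \begin{pmatrix} -2H_0^{-1}V & 0 \\ 0 & 0 \end{pmatrix} \]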
is compact by assumption. Moreover, by Theorem 5.2 iii) and v), $\sigma\big(A_0^{-1}\big)$ has empty interior as a subset of $\mathbb{C}$ and $\mathbb{C} \setminus \sigma\big(A_0^{-1}\big)$ consists of only one component containing points in $\rho(D)$ (e.g., all non-real points of $\mathbb{C} \setminus \sigma\big(A_0^{-1}\big)$). Hence [RS78, Lemma XIII.4] shows that $\sigma_{\mathrm{ess}}\big(A_0^{-1}\big) = \sigma_{\mathrm{ess}}(D)$. Now the fact that
\[ \lambda^{-1} \in \sigma_{\mathrm{ess}}(D) \iff \lambda^{-2} \in \sigma_{\mathrm{ess}}\big(H_0^{-1}\big), \]
which is not difficult to check (see, e.g., [HM01]), completes the proof.
ii) is immediate from Theorem 5.2 v) if we choose $S_0 = V H_0^{-1/2}$ and $S_1 = 0$ in Assumption (iii).

Remark 5.4. If, under Assumption (iii), Assumption (ii) is not satisfied, that is, $1 \in \sigma_p(S^*S)$, then 0 is an isolated eigenvalue of finite multiplicity of the self-adjoint operator $H$ and, with $\mathcal{N}_0 := \ker H = \ker\big((I - S^*S)H_0^{1/2}\big)$, the subspace $\mathcal{N}_0 \oplus \{0\}$ is the isotropic subspace of the inner product space $(\mathcal{K}, \langle\cdot,\cdot\rangle)$. Then the factor space $\widetilde{\mathcal{K}} = \mathcal{K}/\mathcal{N}_0$ is a Pontryagin space with negative index again given by the number of negative eigenvalues of $I - S^*S$. Since $\ker A = \mathcal{N}_0 \oplus \{0\}$, the operator $A$ induces a self-adjoint operator $\widetilde{A}$ in this Pontryagin space $\widetilde{\mathcal{K}}$. Then all claims of Theorem 5.2 remain true for $\widetilde{A}$.

6. Assumptions for the Klein–Gordon Equation in $\mathbb{R}^n$

In this section we consider the example of the Klein–Gordon equation in $\mathbb{R}^n$ for which $\mathcal{H} = L^2(\mathbb{R}^n)$ with norm $\|\cdot\|_2$ and scalar product $(\cdot,\cdot)_2$, $H_0 = -\Delta + m^2$, and $V$ stands for the operator of multiplication by a function $V : \mathbb{R}^n \to \mathbb{R}$. In this case, sufficient conditions on the potential $V$ will be established that guarantee the assumptions

Ass. (i) $D\big(H_0^{1/2}\big) \subset D(V)$ or, equivalently, $V : \mathcal{H}_{1/2} \to \mathcal{H}$ is bounded,
Ass. (iii) $S = V H_0^{-1/2} = S_0 + S_1$ with $\|S_0\| < 1$ and a compact operator $S_1$,

which were used in the previous sections. Obviously, (iii) is stronger than (i). Note that, according to Remark 5.4, Assumption (ii), that is, $1 \in \rho(S^*S)$, is not an essential restriction and thus will not be considered here. It is well-known (see [Tri92, Sects. 1.3.1, 1.3.2]) that for $H_0 = -\Delta + m^2$ the space $\mathcal{H}_{1/2} = D\big(H_0^{1/2}\big)$ is the Sobolev space of order 1 associated with $L^2(\mathbb{R}^n)$: $\mathcal{H}_{1/2} = W_2^1(\mathbb{R}^n)$. Hence Assumption (i) holds if and only if $W_2^1(\mathbb{R}^n) \subset D(V)$ or, equivalently, if there exist constants $a, b \ge 0$ such that
\[ \|Vu\|_2 \le a\|u\|_2 + b\big\|(-\Delta + m^2)^{1/2}u\big\|_2, \quad u \in W_2^1(\mathbb{R}^n); \tag{6.1} \]
this is equivalent to the $(-\Delta + m^2)$-form boundedness of $V^2$, that is, $(V^2u, u)_2 \le a(u, u)_2 + b\big((-\Delta + m^2)u, u\big)_2$, $u \in W_2^1(\mathbb{R}^n)$. Assumption (iii) holds if $V = V_0 + V_1$, where $W_2^1(\mathbb{R}^n) \subset D(V_i)$ for $i = 0, 1$, $V_0$ satisfies (6.1) with $a, b \ge 0$ such that
\[ \frac{a}{m} + b < 1, \tag{6.2} \]
that is, $S_0 = V_0(-\Delta + m^2)^{-1/2}$ is a strict contraction, and $S_1 = V_1(-\Delta + m^2)^{-1/2}$ is compact. Many different sufficient conditions for the relative boundedness as well as for the relative compactness of a multiplication operator with respect to $(-\Delta + m^2)^{1/2}$ have been established (see, e.g., [Kat82, RS75, Sim71] and the more specialized references therein). In the following we formulate two well-known sufficient conditions in terms of $L^p$-spaces and Rollnik classes. We start with a well-known relative compactness result, which, for $p = 3$, goes back to Brezis and Kato (see [BK79]).

Theorem 6.1. If $n \ge 3$ and $V \in L^p(\mathbb{R}^n)$ with $n \le p < \infty$, then $V$ is $(-\Delta + m^2)^{1/2}$-compact.²

Proof. The operator of multiplication with $V$ in $L^2(\mathbb{R}^n)$ is $(-\Delta + m^2)^{1/2}$-compact if and only if $V(-\Delta + m^2)^{-1/2}$ is compact, that is, if $(u_m)$ is a sequence in the form domain $W_2^1(\mathbb{R}^n)$ of $-\Delta + m^2$ that converges weakly to 0, then $(Vu_m)$ converges strongly to 0 in $L^2(\mathbb{R}^n)$. Now let $n \le p < \infty$ and $W \in L^p(\mathbb{R}^n)$, and set $q := p/(p - 2)$. By Hölder's inequality and the boundedness of the embedding of the Sobolev space $W_2^1(\mathbb{R}^n)$ into $L^{2q}(\mathbb{R}^n)$ (which holds since $p \ge n$, see [EE87, Theorem V.3.7]), we have
\[ \|Wu\|_2^2 \le \|W\|_p^2\,\|u\|_{2q}^2 \le c^2\,\|W\|_p^2\,\|u\|_{2,1}^2, \quad u \in W_2^1(\mathbb{R}^n); \tag{6.3} \]
here $c$ is the norm of the embedding of $W_2^1(\mathbb{R}^n)$ into $L^{2q}(\mathbb{R}^n)$. Assume now that $(u_m) \subset W_2^1(\mathbb{R}^n)$ converges weakly to 0 and let $\varepsilon > 0$. Since $C_0^\infty(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)$ is dense, there exists a function $V_\varepsilon \in C_0^\infty(\mathbb{R}^n)$ such that $\|V - V_\varepsilon\|_p < \varepsilon$. Let $\Omega_\varepsilon := \operatorname{supp} V_\varepsilon$ and choose $C_\varepsilon \ge 0$ such that $|V_\varepsilon| \le C_\varepsilon$. Then we obtain, using (6.3) with $W = V - V_\varepsilon$,
\[ \|Vu_m\|_2 \le \|(V - V_\varepsilon)u_m\|_2 + \|V_\varepsilon u_m\|_2 \le c\,\varepsilon\,\|u_m\|_{2,1} + C_\varepsilon\,\big\|u_m|_{\Omega_\varepsilon}\big\|_2. \tag{6.4} \]
Since $(u_m)$ converges weakly in $W_2^1(\mathbb{R}^n)$, it is a bounded sequence in $W_2^1(\mathbb{R}^n)$. Hence, choosing $\varepsilon$ sufficiently small, the first term can be made arbitrarily small. The second term becomes arbitrarily small for sufficiently large $m$: in fact, $W_2^1(\Omega_\varepsilon)$ is compactly embedded in $L^2(\Omega_\varepsilon)$ since $\Omega_\varepsilon$ is bounded (see [EE87, Theorem V.3.7]) and thus $(u_m|_{\Omega_\varepsilon})$ converges to 0 strongly in $L^2(\Omega_\varepsilon)$.

For $n = 3$, a criterion for the relative form-boundedness of $V^2$ with respect to $-\Delta + m^2$ can be formulated in terms of Rollnik potentials, see [RS75], [Sim71]: A measurable function $W : \mathbb{R}^3 \to \mathbb{R}$ is said to belong to the class $R$ of Rollnik potentials if
\[ \|W\|_R^2 := \int_{\mathbb{R}^3}\!\int_{\mathbb{R}^3} \frac{|W(x)|\,|W(y)|}{|x - y|^2}\, dx\, dy < \infty. \]

Theorem 6.2. If $n = 3$ and $V : \mathbb{R}^3 \to \mathbb{R}$ is a measurable function such that $V^2 \in R + L^\infty(\mathbb{R}^3)$, then $V$ is $(-\Delta + m^2)^{1/2}$-bounded (with relative bound 0). In particular, if $V^2 \in R$, we have
\[ \big\|V(-\Delta + m^2)^{-1/2}\big\|^2 \le \frac{\|V^2\|_R}{4\pi}. \]

² We thank W.D. Evans for communicating this result to us.
Proof. The first statement may be found in [RS75, Theorem X.19], for the proof see [Sim71, Theorem I.21]. For the second claim, we note that, by [Sim71, (I.13)], 1/2
|V 2 (x)| e−m|x−y| |V 2 (y)| 1 |V |(− + m 2 )−1 |V |u, u 2 ≤ dx dy u 22 4π |x − y|2 R3 R3 1 ≤ V 2 R u 22 , u ∈ D(|V |) = D(V ), 4π which implies that (− + m 2 )−1/2 |V | ≤ V 2 R /(4π ). Hence the densely defined operator (− + m 2 )−1/2 |V | is bounded and V (− + m 2 )−1/2 = |V |(− + m 2 )−1/2 = (|V |(− + m 2 )−1/2 )∗ V 2 R = (− + m 2 )−1/2 |V | ≤ , 4π follows.
Remark 6.3. For $n = 3$, Theorem 6.1 shows that every $V \in L^p(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$ with $3 \le p < \infty$ is $(-\Delta + m^2)^{1/2}$-bounded (with relative bound 0). This condition is more restrictive than the condition $V^2 \in R + L^\infty(\mathbb{R}^3)$ in Theorem 6.2. Indeed, $V \in L^p(\mathbb{R}^3) + L^\infty(\mathbb{R}^3)$ with $p \ge 3$ implies that $V^2 \in L^{p/2}(\mathbb{R}^3) + L^p(\mathbb{R}^3) + L^\infty(\mathbb{R}^3) \subset R + L^\infty(\mathbb{R}^3)$ since $L^q(\mathbb{R}^3) + L^\infty(\mathbb{R}^3) \subset R + L^\infty(\mathbb{R}^3)$ for $q \ge 3/2$ (see [Sim71, Corollary I.2]).

The Coulomb potential $V(x) = \gamma/|x|$, $x \in \mathbb{R}^n \setminus \{0\}$, does not have relative bound 0 with respect to $(-\Delta + m^2)^{1/2}$; therefore neither Theorem 6.1 nor Theorem 6.2 applies to it. In this case, however, Assumption (i) is an immediate consequence of the Hardy inequality.

Proposition 6.4. The Coulomb potential $V(x) = \gamma/|x|$, $x \in \mathbb{R}^n \setminus \{0\}$, with $\gamma \in \mathbb{R}$ satisfies Assumption (i) for $n \ge 3$; in fact,
\[
\bigl\|V(-\Delta + m^2)^{-1/2}\bigr\| \le \frac{2|\gamma|}{n-2}.
\]

Proof. The classical Hardy inequality (see [HLP88, Theorem 330]) shows that, for $u \in W_2^1(\mathbb{R}^n)$,
\[
\|Vu\|_2^2 \le \frac{4\gamma^2}{(n-2)^2}\,\|\nabla u\|_2^2 \le \frac{4\gamma^2}{(n-2)^2}\,\bigl\|(-\Delta + m^2)^{1/2}u\bigr\|_2^2,
\]
which yields the desired estimate.
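The second inequality in the Hardy estimate uses only $m^2 \ge 0$. As a short sketch (assuming the standard identification of $\|(-\Delta + m^2)^{1/2}u\|_2^2$ with the quadratic form of $-\Delta + m^2$ on $W_2^1(\mathbb{R}^n)$), the step and its consequence for $n = 3$ can be written out as follows:

```latex
\[
\bigl\|(-\Delta + m^2)^{1/2}u\bigr\|_2^2
  = \|\nabla u\|_2^2 + m^2\|u\|_2^2
  \;\ge\; \|\nabla u\|_2^2 ,
  \qquad u \in W_2^1(\mathbb{R}^n).
\]
% Special case n = 3 of the bound 2|\gamma|/(n-2):
\[
\bigl\|V(-\Delta + m^2)^{-1/2}\bigr\| \le 2|\gamma| ,
\qquad\text{so the bound is } < 1 \text{ whenever } |\gamma| < \tfrac12 .
\]
```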
As a consequence of Theorems 6.1, 6.2, and Proposition 6.4, we obtain:

Example 6.5. Let $n \ge 3$. Assumption (iii) (and hence (i)) is satisfied if $V = V_0 + V_1$, where $V_1 \in L^p(\mathbb{R}^n)$ with $n \le p < \infty$, and for $V_0$ one of the following holds:
i) $V_0 \in L^\infty(\mathbb{R}^n)$ with $\|V_0\|_\infty < m$,
ii) $V_0(x) = \gamma/|x|$, $x \in \mathbb{R}^n \setminus \{0\}$, with $\gamma \in \mathbb{R}$ such that $|\gamma| < (n-2)/2$,
and, in the particular case $n = 3$,
iii) $V_0^2 \in R$ with $\|V_0^2\|_R < 4\pi$.

Note that the admission of the relatively compact part $V_1$ of $V$, which is not subject to any relative norm bound, gives rise to complex eigenvalues. This was avoided in earlier papers by assuming that $V_1 = 0$ (see, e.g., [SSW40] for case i) and [Ves83] for case ii)).

Acknowledgements. The first and the last author gratefully acknowledge the support of Deutsche Forschungsgemeinschaft, DFG, under Grant No. TR368/6-1. We also thank the referee for valuable suggestions which led to the present form of the paper.
References
[AI89] Azizov, T. Y., Iokhvidov, I. S.: Linear operators in spaces with an indefinite metric. In: Pure and Applied Mathematics (New York). Chichester: John Wiley & Sons Ltd., 1989; Translated from the Russian by E. R. Dawson, A Wiley-Interscience Publication
[Bac04] Bachelot, A.: Superradiance and scattering of the charged Klein–Gordon field by a step-like electrostatic potential. J. Math. Pures Appl. (9), 83(10), 1179–1239 (2004)
[BK79] Brézis, H., Kato, T.: Remarks on the Schrödinger operator with singular complex potentials. J. Math. Pures Appl. (9), 58(2), 137–151 (1979)
[Bog74] Bognár, J.: Indefinite inner product spaces. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 78, New York: Springer-Verlag, 1974
[Eck76] Eckardt, K.-J.: On the existence of wave operators for the Klein–Gordon equation. Manuscripta Math. 18(1), 43–55 (1976)
[Eck80] Eckardt, K.-J.: Scattering theory for the Klein–Gordon equation. Funct. Approx. Comment. Math. 8, 13–42 (1980)
[EE87] Edmunds, D. E., Evans, W. D.: Spectral theory and differential operators. Oxford Mathematical Monographs. New York: The Clarendon Press Oxford University Press, 1987
[GGK90] Gohberg, I., Goldberg, S., Kaashoek, M. A.: Classes of linear operators. Vol. I, Volume 49 of Oper. Theory Adv. Appl., Basel: Birkhäuser Verlag, 1990
[HLP88] Hardy, G. H., Littlewood, J. E., Pólya, G.: Inequalities. Cambridge Mathematical Library. (Reprint of the 1952 edition). Cambridge: Cambridge University Press, 1988
[HM01] Hardt, V., Mennicken, R.: On the spectrum of unbounded off-diagonal 2 × 2 operator matrices in Banach spaces. In: Recent advances in operator theory (Groningen, 1998), Volume 124 of Oper. Theory Adv. Appl., Basel: Birkhäuser, 2001, pp. 243–266
[Jon79] Jonas, P.: On local wave operators for definitizable operators in Krein space and on a paper by T. Kako. Preprint P-46/79 Zentralinstitut für Mathematik und Mechanik der AdW DDR, Berlin, 1979
[Kak76] Kako, T.: Spectral and scattering theory for the J-selfadjoint operators associated with the perturbed Klein–Gordon type equations. J. Fac. Sci. Univ. Tokyo Sect. IA Math. 23(1), 199–221 (1976)
[Kat66] Kato, T.: Perturbation theory for linear operators. Die Grundlehren der mathematischen Wissenschaften, Band 132. New York: Springer-Verlag, 1966
[Kat82] Kato, T.: A short introduction to perturbation theory for linear operators. New York: Springer-Verlag, 1982
[KL63] Kreĭn, M. G., Langer, G. K.: On the spectral function of a self-adjoint operator in a space with indefinite metric. Dokl. Akad. Nauk SSSR 152, 39–42 (1963)
[Lan82] Langer, H.: Spectral functions of definitizable operators in Krein spaces. In: Functional analysis (Dubrovnik, 1981), Volume 948 of Lecture Notes in Math., Berlin: Springer-Verlag, 1982, pp. 1–46
[LN96] Langer, H., Najman, B.: A Krein space approach to the Klein–Gordon equation. Unpublished manuscript, 1996
[LNT06] Langer, H., Najman, B., Tretter, C.: Spectral theory of the Klein–Gordon equation in Kreĭn spaces. Submitted, 2006
[Lun73a] Lundberg, L.-E.: Relativistic quantum theory for charged spinless particles in external vector fields. Commun. Math. Phys. 31, 295–316 (1973)
[Lun73b] Lundberg, L.-E.: Spectral and scattering theory for the Klein–Gordon equation. Comm. Math. Phys. 31, 243–257 (1973)
[Naĭ66] Naĭmark, M. A.: Analog of Stone’s theorem for a space with an indefinite metric. Dokl. Akad. Nauk SSSR, 170, 1259–1261 (1966)
[Naj79] Najman, B.: Solution of a differential equation in a scale of spaces. Glas. Mat. Ser. III 14(34)(1), 119–127 (1979)
[Naj80a] Najman, B.: Spectral properties of the operators of Klein–Gordon type. Glas. Mat. Ser. III, 15(35)(1), 97–112 (1980)
[Naj80b] Najman, B.: Localization of the critical points of Klein–Gordon type operators. Math. Nachr. 99, 33–42 (1980)
[Naj83] Najman, B.: Eigenvalues of the Klein–Gordon equation. Proc. Edinburgh Math. Soc. (2), 26(2), 181–190 (1983)
[RS75] Reed, M., Simon, B.: Methods of modern mathematical physics. II. Fourier analysis, self-adjointness. New York: Academic Press [Harcourt Brace Jovanovich Publishers], 1975
[RS78] Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York: Academic Press [Harcourt Brace Jovanovich Publishers], 1978
[Sch76] Schechter, M.: The Klein–Gordon equation and scattering theory. Ann. Phys. 101(2), 601–609 (1976)
[Sim71] Simon, B.: Quantum mechanics for Hamiltonians defined as quadratic forms. Princeton, NJ: Princeton University Press, 1971
[SSW40] Schiff, L., Snyder, H., Weinberg, J.: On the existence of stationary states of the mesotron field. Phys. Rev. 57, 315–318 (1940)
[Tri92] Triebel, H.: Theory of function spaces. II, Volume 84 of Monographs in Mathematics. Basel: Birkhäuser Verlag, 1992
[Ves83] Veselić, K.: On the nonrelativistic limit of the bound states of the Klein–Gordon equation. J. Math. Anal. Appl. 96(1), 63–84 (1983)
[Wed77] Weder, R.: Selfadjointness and invariance of the essential spectrum for the Klein–Gordon equation. Helv. Phys. Acta 50(1), 105–115 (1977)
[Wed78] Weder, R. A.: Scattering theory for the Klein–Gordon equation. J. Funct. Anal. 27(1), 100–117 (1978)

Communicated by B. Simon
Commun. Math. Phys. 267, 181–225 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0040-2
Communications in
Mathematical Physics
On Motives Associated to Graph Polynomials
Spencer Bloch (1), Hélène Esnault (2), and Dirk Kreimer (3,4)
1 Dept. of Mathematics, University of Chicago, Chicago, IL 60637, USA. E-mail: [email protected]
2 Mathematik, Universität Duisburg-Essen, FB6, Mathematik, 45117 Essen, Germany. E-mail: [email protected]
3 IHES, 91440 Bures sur Yvette, France. E-mail: [email protected]
4 Boston University, Boston, MA 02215, USA
Received: 26 October 2005 / Accepted: 4 January 2006 Published online: 23 May 2006 – © Springer-Verlag 2006
Abstract: The appearance of multiple zeta values in anomalous dimensions and β-functions of renormalizable quantum field theories has given evidence towards a motivic interpretation of these renormalization group functions. In this paper we start to hunt the motive, restricting our attention to a subclass of graphs in four dimensional scalar field theory which give scheme independent contributions to the above functions.

0. Introduction

Calculations of Feynman integrals arising in perturbative quantum field theory [4, 5] reveal interesting patterns of zeta and multiple zeta values. Clearly, these are motivic in origin, arising from the existence of Tate mixed Hodge structures with periods given by Feynman integrals. We are far from a detailed understanding of this phenomenon. An analysis of the problem leads via the technique of Feynman parameters [12] to the study of motives associated to graph polynomials. By the seminal work of Belkale and Brosnan [3], these motives are known to be quite general, so the question becomes under what conditions on the graph does one find mixed Tate Hodge structures and multiple zeta values.

The purpose of this paper is to give an expository account of some general mathematical aspects of these “Feynman motives” and to work out in detail the special case of wheel and spoke graphs. We consider only scalar field theory, and we focus on primitively divergent graphs. (A connected graph $\Gamma$ is primitively divergent if $\#\mathrm{Edge}(\Gamma) = 2h_1(\Gamma)$, where $h_1$ is the Betti number of the graph; and if further for any connected proper subgraph the number of edges is strictly greater than twice the first Betti number.) From a motivic point of view, these play the role of “Calabi-Yau” objects in the sense that they have unique periods. Physically, the corresponding periods are renormalization scheme independent.

Graph polynomials are introduced in Sects. 1 and 2 as special cases of discriminant polynomials associated to configurations.
They are homogeneous polynomials written
in a preferred coordinate system with variables corresponding to edges of the graph. The corresponding hypersurfaces in projective space are graph hypersurfaces.

Section 3 studies coordinate linear spaces contained in the graph hypersurface. The normal cones to these linear spaces are linked to graph polynomials of sub and quotient graphs. Motivically, the chain of integration for our period meets the graph hypersurface along these linear spaces, so the combinatorics of their blowups is important. (It is curious that arithmetically interesting periods seem to arise frequently (cf. multiple zeta values [11] or the study of periods associated to Mahler measure in the non-expansive case [8]) in situations where the polar locus of the integrand meets the chain of integration in combinatorially interesting ways.)

Section 4 is not used in the sequel. It exhibits a natural resolution of singularities $\mathbb{P}(N) \to X$ for a graph hypersurface $X$. $\mathbb{P}(N)$ is a projective bundle over projective space, and the fibres $\mathbb{P}(N)/X$ are projective spaces.

Section 5 introduces Feynman quadrics. The period of interest is interpreted as an integral (5.3) over $\mathbb{P}^{2r-1}(\mathbb{R})$. The integrand has simple poles along $r$ distinct quadrics. When these quadrics are associated to a graph $\Gamma$, the period is shown to be convergent precisely when $\Gamma$ is primitively divergent as above.

Section 6 reinterprets the above period as a relative period (6.10) associated to the graph hypersurface. This is the Schwinger trick [12].

Section 7 presents the graph motive in detail. Let $X \subset \mathbb{P}^{2n-1}$ be the graph hypersurface associated to a primitive divergent graph. Let $\Delta \subset \mathbb{P}^{2n-1}$ be the coordinate simplex (union of $2n$ coordinate hyperplanes). An explicit sequence of blowups in $\mathbb{P}^{2n-1}$ of linear spaces is described. Write $f : P \to \mathbb{P}^{2n-1}$ for the resulting variety. Let $Y \subset P$ be the strict transform of $X$, and let $B := f^{-1}(\Delta)$ be the total inverse image. Then the motive is
\[
H^{2n-1}(P \setminus Y,\; B \setminus B \cap Y). \tag{0.1}
\]
Section 8 considers what can be said directly about the motive of a graph hypersurface $X$ using elementary projection techniques. The main tool is a theorem of C. L. Dodgson about determinants, published in 1866.

Section 9 describes what the theory of motivic cohomology suggests about graph motives in cases [5] where the period is related to a zeta value.

Section 10 considers the Schwinger trick from a geometric point of view. The main result is that in middle degree, the primitive cohomology of the graph hypersurface is supported on the singular set.

Sections 11 and 12 deal with wheel and spoke graphs. Write $X_n \subset \mathbb{P}^{2n-1}$ for the hypersurface associated to the graph which is a wheel with $n$ spokes. The main results are
\[
H_c^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \cong \mathbb{Q}(-2), \tag{0.2}
\]
\[
H^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \cong \mathbb{Q}(-2n+3). \tag{0.3}
\]
Further, the de Rham cohomology $H_{DR}^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n)$ in this case is generated by the integrand of our graph period (7.1). Note that nonvanishing of the graph period, which is clear by considerations of positivity, only implies that the integrand gives a nonzero cohomology class in $H_{DR}^{2n-1}(P \setminus Y, B \setminus B \cap Y)$. It does not a priori imply nonvanishing in $H_{DR}^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n)$.

Finally, Sect. 13 discusses various issues which remain to be understood, including the question of when the motive (0.1) admits a framing, the curious role of triangles in graphs whose period is known to be related to a ζ value, and the possibility of constructing a Hopf algebra $H$ of graphs such that assigning to a primitive divergent graph
its motive would give rise to a Hopf algebra map from $H$ to the Hopf algebra $MZV$ of mixed zeta values.

From a physics viewpoint, our approach starts with a linear algebra analysis of the configurations given by a graph and its relations imposed by the edges on the vertices, illuminating the structure of the graph polynomial. An all important notion then is the one of a subgraph, and the clarification of the correspondence between linear subvarieties and subgraphs is our next achievement. We then introduce the Feynman integral assigned to a Feynman graph based on the usual quadrics provided by the scalar propagators of free field theory. The map from that Feynman integral to an integration over the inverse square of the graph polynomial proceeds via the Schwinger trick [12], which we discuss in detail. We next discuss the motive by relating chains of coordinate linear subspaces of the graph hypersurfaces with chains of subgraphs. This allows for a rather systematic stratification of the graph hypersurface which can be carried through for the wheel graphs, but fails in general. We give an example of such a failure. The wheels are then subjected to a formidable computation of their middle dimensional cohomology, a feat which we are at the time of writing unable to repeat for even the next most simple class of graphs, the zig-zag graphs of [4], which, at each loop order, evaluate indeed to a rational multiple of the wheel at the same loop order. After collecting our results for the de Rham class in the wheels case, we finish the paper with some outlook on how to improve the situation.

1. Polynomials Associated with Configurations

Let $K$ be a field and let $E$ be a finite set. Write $K[E]$ for the $K$-vector space spanned by $E$. A configuration is simply a linear subspace $i_V : V \hookrightarrow K[E]$. The space $K[E]$ is self-dual in an evident way, so for $e \in E$ we may consider the functional $e^\vee \circ i_V : V \to K$. Fix a basis $v_1, \ldots, v_d$ for $V$, and let $M_e$ be the $d \times d$ symmetric matrix associated to the rank 1 quadratic form $(e^\vee \circ i_V)^2$ on $V$. Define a polynomial
\[
\Psi_V(A) = \det\Bigl(\sum_{e \in E} A_e M_e\Bigr). \tag{1.1}
\]
$\Psi_V$ is homogeneous of degree $d$. Note that changing the basis of $V$ only changes $\Psi_V$ by a unit in $K^\times$.

Remark 1.1. Write $\iota_V : \mathbb{P}(V) \to \mathbb{P}^{\#E-1}$ for the evident embedding on projective spaces of lines. View the quadratic forms $(e^\vee \circ i_V)^2$ as sections in $\Gamma(\mathbb{P}(V), \mathcal{O}(2))$. Then $\iota_V$ is defined by the possibly incomplete linear series spanned by these sections, and $\Psi_V$ is naturally interpreted as defining the dual hypersurface in $\mathbb{P}^{\#E-1,\vee}$ of sections of this linear system which define singular hypersurfaces in $\mathbb{P}(V)$, cf. Sect. 4.

Lemma 1.2. Each $A_e$ appears with degree $\le 1$ in $\Psi_V$.

Proof. The matrix $M_e$ has rank $\le 1$. If $M_e = 0$ then of course $A_e$ doesn’t appear and there is nothing to prove. If $\operatorname{rank} M_e$ is 1, then multiplying on the left and right by invertible matrices (which only changes $\Psi_V$ by an element in $K^\times$) we may assume $M_e$ is the matrix with 1 in position $(1, 1)$ and zeroes elsewhere. In this case
\[
\Psi_V = \det \begin{pmatrix} A_e + m_{ee} & \cdots \\ \vdots & \ddots \end{pmatrix}, \tag{1.2}
\]
where $A_e$ appears only in entry $(1, 1)$. The assertion of the lemma follows by expanding the determinant along the first row.

As a consequence, we can write
\[
\Psi_V(A) = \sum_{\{e_1, \ldots, e_d\}} c_{e_1, \ldots, e_d}\, A_{e_1} A_{e_2} \cdots A_{e_d}. \tag{1.3}
\]
Lemma 1.3. With notation as above, write $M_{e_1, \ldots, e_d}$ for the matrix (with respect to the chosen basis of $V$) of the composition
\[
V \to K[E] \xrightarrow{\; e \mapsto 0,\ e \ne e_i \;} K e_1 \oplus \cdots \oplus K e_d. \tag{1.4}
\]
Then $c_{e_1, \ldots, e_d} = \det M_{e_1, \ldots, e_d}^2$.

Proof. As a consequence of Lemma 1.2, $c_{e_1, \ldots, e_d}$ is obtained from $\Psi_V$ by setting $A_{e_i} = 1$, $1 \le i \le d$ and $A_e = 0$ otherwise, i.e. $c_{e_1, \ldots, e_d} = \det\bigl(\sum_i M_{e_i}\bigr)$. With respect to the chosen basis of $V$ we may write $e^\vee \circ i_V = \sum_i a_{e,i} v_i^\vee : V \to K$. Then $M_e = (a_{e,i} a_{e,j})_{ij}$ so
\[
M_{e_1, \ldots, e_d} = (a_{e,i}); \qquad \sum_e M_e = (a_{e,i})(a_{j,e})^t = M_{e_1, \ldots, e_d}\, M_{e_1, \ldots, e_d}^t. \tag{1.5}
\]
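Lemma 1.3 is an instance of the Cauchy–Binet expansion, and it can be checked numerically on a small configuration. The following is a minimal sketch, not taken from the paper: the names `det`, `psi_det`, `psi_minors` and the sample data `a`, `A` are hypothetical, with `a[e][i]` standing for the coefficient $a_{e,i}$ of $e^\vee \circ i_V$ in the chosen basis.

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return sign * result

def psi_det(a, A):
    """det(sum_e A_e M_e) with M_e = (a[e][i] * a[e][j])_{ij}, as in (1.1)."""
    d = len(a[0])
    m = [[sum(A[e] * a[e][i] * a[e][j] for e in range(len(a)))
          for j in range(d)] for i in range(d)]
    return det(m)

def psi_minors(a, A):
    """Sum over d-subsets T of E of (d x d minor on rows T)^2 * prod_{e in T} A_e."""
    d = len(a[0])
    total = Fraction(0)
    for T in combinations(range(len(a)), d):
        minor = det([a[e] for e in T])
        term = minor * minor
        for e in T:
            term *= A[e]
        total += term
    return total

# Hypothetical configuration: d = 2, #E = 4, evaluated at A = (2, 3, 5, 7).
a = [[1, 0], [0, 1], [1, 1], [1, -1]]
A = [2, 3, 5, 7]
print(psi_det(a, A), psi_minors(a, A))
```

Both evaluations agree on this example, which is what the lemma predicts for the coefficients in (1.3); the check is at one rational point only and is not a proof.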
Corollary 1.4. The coefficients of $\Psi_V$ are the squares of the Plücker coordinates for $K[E] \twoheadrightarrow W$. More precisely, the coefficient of $\prod_{e \notin T} A_e$ is $\text{Plücker}_T(W)^2$.

Remark 1.5. Let $G$ denote the Grassmann of all $V_d \subset K[E]$. $G$ carries a line bundle $\mathcal{O}_G(1) \cong \det(\mathcal{V})^\vee$, where $\mathcal{V} \subset K[E] \otimes_K \mathcal{O}_G$ is the universal subbundle. Sections of $\mathcal{O}_G(1)$ arise from the dual map $\bigwedge^d K[E] \cong \Gamma(G, \det \mathcal{V}^\vee)$. Lemma 1.3 can be interpreted universally as defining a section
\[
\Psi \in \Gamma\bigl(G \times \mathbb{P}(K[E]),\; \mathcal{O}_G(2) \boxtimes \mathcal{O}_{\mathbb{P}}(1)\bigr). \tag{1.6}
\]

Define $W = K[E]/V$ to be the cokernel of $i_V$. Dualizing yields an exact sequence
\[
0 \to W^\vee \xrightarrow{\; i_{W^\vee} \;} K[E] \to V^\vee \to 0 \tag{1.7}
\]
and hence a polynomial $\Psi_{W^\vee}(A)$ which is homogeneous of degree $\#E - d$.

Proposition 1.6. We have the functional equation
\[
\Psi_V(A) = c \cdot \prod_{e \in E} A_e \cdot \Psi_{W^\vee}(A^{-1}); \qquad c \in K^\times. \tag{1.8}
\]
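Proof. For $T \subset E$ with $\#T = \#E - d$, consider the diagram
\[
\begin{array}{ccccccccc}
 & & & & 0 & & & & \\
 & & & & \downarrow & & & & \\
 & & & & K[T] & \xrightarrow{\;\beta_T\;} & W & & \\
 & & & & \downarrow & & \| & & \\
0 & \to & V & \to & K[E] & \to & W & \to & 0 \\
 & & \| & & \downarrow & & & & \\
 & & V & \xrightarrow{\;\alpha_{E-T}\;} & K[E - T] & & & & \\
 & & & & \downarrow & & & & \\
 & & & & 0 & & & &
\end{array} \tag{1.9}
\]
Fix bases for $V$ and $W$ so the isomorphism $\det K[E] \cong \det V \otimes \det W$ (canonical up to $\pm 1$) is given by $c \in K^\times$. Then $c = \det \alpha_{E \setminus T} \det \beta_T^{-1}$. By the above, the coefficient in $\Psi_V$ of $\prod_{e \notin T} A_e$ is $\det \alpha_{E \setminus T}^2$ while the coefficient of $\prod_{e \in T} A_e$ in $\Psi_{W^\vee}$ is $(\det \beta_T^t)^2$. The proposition follows immediately.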
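A small worked instance of the functional equation (1.8) may help. The configuration below is a hypothetical example chosen for illustration (it is not taken from the paper): $E = \{1, 2, 3\}$ and $V \subset K[E]$ spanned by $v = e_1 + e_2 + e_3$, so $d = 1$; the basis $w_1, w_2$ of $W^\vee$ is likewise an arbitrary choice.

```latex
% V = K(e_1 + e_2 + e_3): each e^\vee \circ i_V sends v to 1, so M_e = (1) and
\[
\Psi_V(A) = A_1 + A_2 + A_3 .
\]
% W^\vee = \{(a,b,c) : a + b + c = 0\} \subset K[E], with basis
% w_1 = (1,-1,0), \; w_2 = (0,1,-1). The functionals e^\vee take the values
% (1,0), (-1,1), (0,-1) on (w_1, w_2), hence
\[
\Psi_{W^\vee}(A)
 = \det\begin{pmatrix} A_1 + A_2 & -A_2 \\ -A_2 & A_2 + A_3 \end{pmatrix}
 = A_1A_2 + A_1A_3 + A_2A_3 .
\]
% Functional equation with c = 1:
\[
A_1A_2A_3 \cdot \Psi_{W^\vee}(A^{-1})
 = A_1A_2A_3\Bigl(\tfrac{1}{A_1A_2} + \tfrac{1}{A_1A_3} + \tfrac{1}{A_2A_3}\Bigr)
 = A_3 + A_2 + A_1 = \Psi_V(A).
\]
```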
Remark 1.7. Despite the simple relation between $\Psi_V$ and $\Psi_{W^\vee}$ it is useful to have both. When we apply this machinery in the case of graphs, $\Psi_{W^\vee}$ admits a much more concrete description. On the other hand, $\Psi_V$ is more closely related to the Feynman integrals and periods of motives.

Remark 1.8. Let $K[E] \twoheadrightarrow W$ be as above, and suppose $W$ is given with a basis. Then the matrix $\sum_e A_e M_e$ associated to $i_{W^\vee} : W^\vee \to K[E]$ is canonical as well. In fact, a situation which arises in the study of graph polynomials is an exact sequence $K[E] \to W \to K \to 0$. In this case, the matrix $\sum A_e M_e$ has zero determinant. Define $W^0 := \operatorname{Image}(K[E] \to W)$. It is easy to check that the graph polynomial for $i_{W^{0\vee}} : W^{0\vee} \to K[E]$ is obtained from $\sum A_e M_e$ by removing the first row and column and taking the determinant.

2. Graph Polynomials

A finite graph $\Gamma$ is given with edges $E$ and vertices $V$. We orient the edges. Thus each vertex of $\Gamma$ has entering edges and exiting edges. For a given vertex $v$ and a given edge $e$, we define $\operatorname{sign}(v, e)$ to be $-1$ if $e$ enters $v$ and $+1$ if $e$ exits $v$. We associate to $\Gamma$ a configuration (defined over $\mathbb{Z}$) via the homology sequence
\[
0 \to H_1(\Gamma, \mathbb{Z}) \to \mathbb{Z}[E] \xrightarrow{\;\partial\;} \mathbb{Z}[V] \to H_0(\Gamma, \mathbb{Z}) \to 0, \tag{2.1}
\]
where the boundary map is $\mathbb{Z}$-linear and defined by $\partial(e) = \sum_{v \in V} \operatorname{sign}(v, e) \cdot v$. Then $\partial$ depends on the chosen orientation but $H_i(\Gamma, \mathbb{Z})$ do not.

When $\Gamma$ is connected, we write $\mathbb{Z}[V]^0 := \ker(\mathbb{Z}[V] \xrightarrow{\deg} \mathbb{Z})$. We define the graph polynomial of $\Gamma$,
\[
\Psi_\Gamma := \Psi_{H_1(\Gamma, \mathbb{Z})}. \tag{2.2}
\]
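Recall a tree is a connected and simply connected graph. A tree $T \subset \Gamma$ is said to be a spanning tree for the connected graph $\Gamma$ if every vertex of $\Gamma$ lies in $T$. (If $\Gamma$ is not connected, we can extend the notion of spanning tree $T \subset \Gamma$ by simply requiring that $T \cap \Gamma_i$ be a spanning tree in $\Gamma_i$ for each connected component $\Gamma_i \subset \Gamma$.)

Lemma 2.1. Let $T$ be a subgraph of a connected graph $\Gamma$. Let $E = E_\Gamma$ be the set of edges of $\Gamma$ and let $E_T \subset E$ be the edges of $T$. Then $T$ is a spanning tree if and only if one has an exact homology diagram as indicated:
\[
\begin{array}{ccccccccccc}
 & & & & 0 & & 0 & & & & \\
 & & & & \downarrow & & \downarrow & & & & \\
 & & & & \mathbb{Z}[E_T] & \xrightarrow[\;\cong\;]{\;\beta\;} & \mathbb{Z}[V]^0 & & & & \\
 & & & & \downarrow & & \downarrow & & & & \\
0 & \to & H_1(\Gamma) & \to & \mathbb{Z}[E] & \xrightarrow{\;\partial\;} & \mathbb{Z}[V] & \to & \mathbb{Z} & \to & 0 \\
 & & \| & & \downarrow & & \downarrow & & \| & & \\
0 & \to & H_1(\Gamma) & \xrightarrow[\;\cong\;]{\;\alpha\;} & \mathbb{Z}[E \setminus E_T] & \xrightarrow{\;0\;} & \mathbb{Z} & \xrightarrow{\;\cong\;} & \mathbb{Z} & \to & 0 \\
 & & & & \downarrow & & \downarrow & & & & \\
 & & & & 0 & & 0 & & & &
\end{array} \tag{2.3}
\]

Proof. Straightforward.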
Proposition 2.2. With notation as above, we have
\[
\Psi_\Gamma(A) = \sum_{T \text{ span tr.}} \; \prod_{e \notin T} A_e. \tag{2.4}
\]

Proof. Fix a basis $h_j$ for $H_1(\Gamma)$. Then
\[
\Psi_\Gamma(A) = \det\Bigl(\sum_e A_e\, e^\vee(h_j)\, e^\vee(h_k)\Bigr). \tag{2.5}
\]
Let $B \subset E$ have $b$ elements, and let $E' = E \setminus B$. The coefficient of the monomial $\prod_{e \in B} A_e$ in $\Psi_\Gamma(A)$ is computed by setting $A_e = 0$ for $e \in E'$. The coefficient is non-zero iff the determinant (1.1) is non-zero under this specialization, and this is true iff we get a diagram as in (2.3), i.e. iff $E' = E_T$ for a spanning tree $T$. The coefficient of this monomial is $1 = \det(\alpha \alpha^t)$, where $\alpha$ is as in the bottom row of (2.3).
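The spanning-tree formula (2.4) lends itself to a brute-force computation on small graphs. The sketch below is an illustration under stated assumptions, not the authors' method: the function names and the encoding of a monomial as the frozenset of edge indices not in the tree are hypothetical choices, and the enumeration is exponential in the number of edges.

```python
from itertools import combinations

def is_spanning_tree(vertices, edges, chosen):
    """True if the chosen edge indices form a spanning tree of a connected graph.

    Uses union-find: |V| - 1 edges without a cycle are automatically connected.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in chosen:
        u, w = edges[i]
        ru, rw = find(u), find(w)
        if ru == rw:          # adding this edge would close a cycle
            return False
        parent[ru] = rw
    return len(chosen) == len(vertices) - 1

def graph_polynomial(vertices, edges):
    """Monomials of Psi as in (2.4): one frozenset of edge indices NOT in T per spanning tree T."""
    all_edges = set(range(len(edges)))
    return {frozenset(all_edges - set(T))
            for T in combinations(range(len(edges)), len(vertices) - 1)
            if is_spanning_tree(vertices, edges, T)}

# Triangle: Psi = A_0 + A_1 + A_2.
triangle = graph_polynomial([0, 1, 2], [(0, 1), (1, 2), (2, 0)])
# Wheel with 3 spokes (the complete graph on 4 vertices).
wheel3 = graph_polynomial([0, 1, 2, 3],
                          [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
print(sorted(map(sorted, triangle)), len(wheel3))
```

Each monomial has degree $h_1(\Gamma)$ and coefficient 1, in line with Lemma 1.2 and the proof above.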
Remark 2.3. If $\Gamma = \coprod \Gamma_i$ with $\Gamma_i$ connected, then
\[
\Psi_\Gamma = \prod_i \Psi_{\Gamma_i}, \tag{2.6}
\]
as both the free abelian group on edges and $H_1$ are additive in $i$. If we define spanning “trees” in disconnected graphs as suggested above, Proposition 2.2 carries over to the disconnected case.
Corollary 2.4. The coefficients of $\Psi_\Gamma$ are all either 0 or $+1$.

Definition 2.5. The graph hypersurface $X_\Gamma \subset \mathbb{P}^{\#(E_\Gamma)-1}$ is the hypersurface cut out by $\Psi_\Gamma = 0$.

Properties 2.6. We list certain evident properties of $\Psi_\Gamma$:
1. $\Psi_\Gamma$ is a sum of monomials with coefficient $+1$.
2. No variable $A_i$ appears with degree $> 1$ in any monomial.
3. Let $\Gamma_1$ and $\Gamma_2$ be graphs, and fix vertices $v_i \in \Gamma_i$. Define $\Gamma := \Gamma_1 \amalg \Gamma_2 / \{v_1 \sim v_2\}$. Thus, $E_\Gamma = E_{\Gamma_1} \amalg E_{\Gamma_2}$ and $H_1(\Gamma) = H_1(\Gamma_1) \oplus H_1(\Gamma_2)$. Writing $A^{(i)}$ for the variables associated to edges of $\Gamma_i$, we see that $\Psi_\Gamma = \Psi_{\Gamma_1}(A^{(1)})\, \Psi_{\Gamma_2}(A^{(2)})$. Geometrically, the graph hypersurface $X_\Gamma : \Psi_\Gamma = 0$ is simply the join of the graph hypersurfaces $X_{\Gamma_i}$. (Recall, if $P_i \subset \mathbb{P}^N$ are linear subsets of projective space such that $P_1 \cap P_2 = \emptyset$ and $\dim P_1 + \dim P_2 = N - 1$, and $X_i \subset P_i$ are closed subvarieties, then the join $X_1 * X_2$ is simply the union of all lines joining points of $X_1$ to points of $X_2$.) In particular, if $\Gamma_2$ is a tree, so $\Psi_{\Gamma_2} = 1$, then $X_\Gamma$ is a cone over $X_{\Gamma_1}$.
4. Defining $\Psi_\Gamma$ via spanning trees (2.4) can lead to confusion in degenerate cases. For example, if $\Gamma$ has only a single vertex (tadpole graph) and $n$ edges, then $H_1(\Gamma) \cong \mathbb{Z}[E_\Gamma] \cong \mathbb{Z}^n$. Thus $\Psi_\Gamma = \prod_1^n A_i$, but there are no spanning trees.

3. Linear Subvarieties of Graph Hypersurfaces

Let $\Gamma$ be a graph with $n = \#E_\Gamma$ edges. For convenience we take $\Gamma$ to be connected. It will be convenient to use the notation $h_1(\Gamma) := \operatorname{rank} H_1(\Gamma)$. In talking about subgraphs of a given graph $\Gamma$, we will frequently not distinguish between the subgraph and the collection of its edges. (In particular, we will not permit isolated vertices.) Recall we have associated to $\Gamma$ a hypersurface $X_\Gamma \subset \mathbb{P}^{n-1}$. Our projective space has a distinguished set of homogeneous coordinates $A_e \leftrightarrow e \in E_\Gamma$, so we get a dictionary:
\[
\begin{aligned}
&\text{Subgraphs } G \subset \Gamma \;\leftrightarrow\; \text{coordinate linear subspaces } L \subset \mathbb{P}^{n-1}, \\
&G \mapsto L(G) : A_e = 0,\ e \in G, \\
&L : A_e = 0,\ e \in S \subset E_\Gamma \;\mapsto\; G(L) = \bigcup_{e \in S} e \subset \Gamma.
\end{aligned} \tag{3.1}
\]
The Feynman period is the integral of a differential form on $\mathbb{P}^{n-1}$ with poles along $X_\Gamma$ over a chain which meets $X_\Gamma$ along the non-negative real loci of coordinate linear spaces contained in $X_\Gamma$. To give motivic meaning to this integral, it will be necessary to blow up such linear spaces. The basic combinatorial observation is

Proposition 3.1. With notation as above, a coordinate linear space $L$ is contained in $X_\Gamma$ if and only if $h_1(G(L)) > 0$.

Proof. Suppose $L : A_e = 0$, $e \in S$. Then $L \subset X_\Gamma$ if and only if every monomial in $\Psi_\Gamma$ is divisible by $A_e$ for some $e \in S$. In other words, iff no spanning tree of $\Gamma$ contains $S$. The assertion now follows from

Lemma 3.2. Let $S \subset \Gamma$ be a (not necessarily connected) subgraph. Then $S$ is contained in some spanning tree for $\Gamma$ iff $h_1(S) = 0$.
Proof of Lemma. Consider the diagram
\[
\begin{array}{ccccccccc}
0 & \to & H_1(S) & \to & \mathbb{Z}[E_S] & \xrightarrow{\;c\;} & \mathbb{Z}[V_\Gamma]^0 & & \\
 & & \downarrow i & & \downarrow b & & \| & & \\
0 & \to & H_1(\Gamma) & \xrightarrow{\;a\;} & \mathbb{Z}[E_\Gamma] & \to & \mathbb{Z}[V_\Gamma]^0 & \to & 0.
\end{array} \tag{3.2}
\]
Note that the map $i$ is always injective. $S$ is itself a spanning tree iff $c$ is surjective and $a$ and $b$ have disjoint images. If we simply assume disjoint images with $c$ not surjective, we can find $e \in E_\Gamma$ such that $e \notin \operatorname{im}(a) + \operatorname{im}(b)$. Then $S' = S \cup \{e\}$ still satisfies $h_1(S') = 0$. Continuing in this way, eventually $c$ must be surjective. Since the images of $a$ and $b$ remain disjoint, $c$ will be an isomorphism, and the resulting subgraph of $\Gamma$ will be a spanning tree.
This completes the proof of the proposition.
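The criterion of Proposition 3.1 is easy to evaluate mechanically, since $h_1$ of an edge set equals the number of edges minus the number of successful merges in a union-find pass. The following is a minimal sketch; the function names and the example graph are hypothetical illustrations, not part of the paper.

```python
def h1(edges):
    """First Betti number of the subgraph spanned by the given edges.

    h1 = #edges - #vertices + #components, which equals the number of edges
    that close a cycle when edges are added one at a time.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    merges = 0
    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[ru] = rw
            merges += 1
    return len(edges) - merges

def contained_in_hypersurface(S):
    """Criterion of Proposition 3.1: L : A_e = 0 (e in S) lies in X iff h1(S) > 0."""
    return h1(S) > 0

# Example inside the wheel with 3 spokes (complete graph on 4 vertices).
triangle_edges = [(0, 1), (1, 2), (0, 2)]   # contains a loop
star_edges = [(0, 1), (0, 2), (0, 3)]       # a spanning tree, no loop
print(contained_in_hypersurface(triangle_edges), contained_in_hypersurface(star_edges))
```

In this example the linear space cut out by the three triangle variables lies in the graph hypersurface, while the one cut out by the three star variables does not, because the star is itself a spanning tree.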
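Let $\Gamma$ be a connected graph as above, and let $G \subset \Gamma$ be a subgraph. It will be convenient not to assume $G$ connected. In particular, $\Psi_G$ and $X_G$ will be defined as in Remark 2.3. We define a modified quotient graph
\[
\Gamma /\!/ G \tag{3.3}
\]
by identifying the connected components $G_i$ of $G$ to vertices $v_i \in \Gamma /\!/ G$ (but not identifying $v_i \sim v_j$). If $G$ is connected, this is the standard quotient in topology. One gets a diagram with exact rows and columns
\[
\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & H_1(G) & \to & \mathbb{Z}[E_G] & \to & \mathbb{Z}[V_G]^0 & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & H_1(\Gamma) & \to & \mathbb{Z}[E_\Gamma] & \to & \mathbb{Z}[V_\Gamma]^0 & \to & 0 \\
 & & \downarrow \pi & & \downarrow & & \downarrow & & \\
0 & \to & H_1(\Gamma /\!/ G) & \to & \mathbb{Z}[E_{\Gamma /\!/ G}] & \to & \mathbb{Z}[V_{\Gamma /\!/ G}]^0 & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0. & &
\end{array} \tag{3.4}
\]
Note with this modified quotient the map labeled $\pi$ is surjective.

Our objective now is to relate the graph hypersurfaces $X_\Gamma$, $X_G$, $X_{\Gamma /\!/ G}$. To this end, we first consider the relation between spanning trees for the three graphs. If $T \subset \Gamma$ is a spanning tree, then $h_1(T \cap G) = 0$, but $T \cap G$ is not necessarily connected. In particular it is not necessarily a spanning tree for $G$. There is an evident lifting from subgraphs $V \subset \Gamma /\!/ G$ to subgraphs $\widetilde{V} \subset \Gamma$ such that $\widetilde{V}$ and $G$ have no common edges.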
Lemma 3.3. Let $U \subset G$ be a spanning tree (cf. Remark 2.3). Then the association
\[
V \mapsto T := \widetilde{V} \amalg U \tag{3.5}
\]
induces a 1 to 1 correspondence between spanning trees $V$ of $\Gamma /\!/ G$ and spanning trees $T$ of $\Gamma$ such that $U \subset T$.

Proof. Let $T$ be a spanning tree for $\Gamma$ and assume $U \subset T$. Necessarily, $G \cap T = U$. Indeed, $U \subset G \cap T$ and $h_1(G \cap T) = 0$. Since $U$ is already a spanning tree, it follows from (2.3) that $G \cap T$ cannot be strictly larger than $U$. By (3.4), $\pi(T) \cong T /\!/ U \subset \Gamma /\!/ G$ is connected and $h_1(\pi(T)) = 0$. It follows that $\pi(T)$ is a spanning tree for $\Gamma /\!/ G$. We have $T = \widetilde{\pi(T)} \amalg U$, so the association $T \mapsto \pi(T)$ is injective. Finally, if $V \subset \Gamma /\!/ G$ is a spanning tree, then since
\[
V \cong (\widetilde{V} \amalg U) /\!/ U,
\]
it follows from (3.4) that $h_1(\widetilde{V} \amalg U) = 0$. One easily checks that this subgraph is connected and contains all the vertices of $\Gamma$, so it is a spanning tree.

Proposition 3.4. Let $\Gamma$ be a connected graph, and let $G \subset \Gamma$ be a subgraph. Assume $h_1(G) = 0$. Let $X_\Gamma \subset \mathbb{P}(E_\Gamma)$ be the graph hypersurface, and let $L(G) : A_e = 0$, $e \in G$ be the linear subspace of $\mathbb{P}(E_\Gamma)$ corresponding to $G$. Then $L(G)$ is naturally identified with $\mathbb{P}(E_{\Gamma /\!/ G})$, and under this identification, $X_{\Gamma /\!/ G} = X_\Gamma \cap L(G)$.

Proof. In this case, Lemma 3.3 implies that spanning trees for $\Gamma /\!/ G$ are in 1 to 1 correspondence with spanning trees for $\Gamma$ containing $G$. It follows from Proposition 2.2 that $\Psi_{\Gamma /\!/ G} = \Psi_\Gamma |_{A_e = 0,\, e \in G}$.
Proposition 3.5. Let $G \subset \Gamma$ be a subgraph, and suppose $h_1(G) > 0$. Then $L(G) : A_e = 0$, $e \in G$ is contained in $X_\Gamma$. Let $P \to \mathbb{P}(E_\Gamma)$ be the blowup of $L(G) \subset \mathbb{P}(E_\Gamma)$, and let $F \subset P$ be the exceptional locus. Let $Y \subset P$ be the strict transform of $X_\Gamma$ in $P$. Then we have canonical identifications
\[
F \cong \mathbb{P}(E_G) \times \mathbb{P}(E_{\Gamma /\!/ G}), \tag{3.6}
\]
\[
Y \cap F = X_G \times \mathbb{P}(E_{\Gamma /\!/ G}) \,\cup\, \mathbb{P}(E_G) \times X_{\Gamma /\!/ G}. \tag{3.7}
\]

Proof. Let $T \subset \Gamma$ be a spanning tree. We have $h_1(T \cap G) = 0$ so $T \cap G$ is contained in a spanning tree for $G$ by Lemma 3.2. In particular, $\#(T \cap G) \le \#E_G - h_1(G)$, with equality if and only if $T \cap G$ is a spanning tree for $G$. The normal bundle for $L(G) \subset \mathbb{P}(E_\Gamma)$ is $\bigoplus_{e \in G} \mathcal{O}(1)$, from which it follows that $F \cong L(G) \times \mathbb{P}(E_G)$. Also, of course, $L(G) \cong \mathbb{P}(E_\Gamma \setminus E_G) \cong \mathbb{P}(E_{\Gamma /\!/ G})$. We have $L(G) \subset X_\Gamma$ by Proposition 3.1. The intersection $F \cap Y$ is the projectivized normal cone of this inclusion. Algebraically, we identify
\[
K[A_e]_{e \in \Gamma /\!/ G} \otimes K[A_e]_{e \in G} \tag{3.8}
\]
with the tensor of the homogeneous coordinate rings for $\mathbb{P}(E_{\Gamma /\!/ G})$ and $\mathbb{P}(E_G)$. Our cone is the hypersurface in this product defined by the sum of terms in $\Psi_\Gamma = \sum_{T \subset \Gamma} \prod_{e \notin T} A_e$ of minimal degree in the normal variables $A_e$, $e \in G$. These correspond to spanning trees $T$ with $\#G \cap T$ maximal. By the above discussion, these are the $T$ such that $T \cap G$ is a spanning tree for $G$. It now follows from Lemma 3.3 that in fact the cone is defined by
\[
\Psi_{\Gamma /\!/ G}(A_e)_{e \in \Gamma /\!/ G} \cdot \Psi_G(A_e)_{e \in G} \in K[A_e]_{e \in \Gamma /\!/ G} \otimes K[A_e]_{e \in G}. \tag{3.9}
\]
The proposition is now immediate.
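Remark 3.6. The set $F \cap Y$ above can also be interpreted as the exceptional fibre for the blowup of $L(G) \subset X_\Gamma$.

Example 3.7. Fix an edge $e_0 \in \Gamma$ and take $G = \Gamma \setminus e_0$. Then $L(G) =: p$ is a single point. If $p \notin X_\Gamma$, then $h_1(G) = 0$ and Proposition 3.4 implies that $X_{\Gamma /\!/ G} = \emptyset$. If $p \in X_\Gamma$, then $F \cong \mathbb{P}(E_\Gamma \setminus e_0)$ and the exceptional divisor for the blowup of $p \in X_\Gamma$ is $X_{\Gamma \setminus e_0}$. Algebraically, this all amounts to the identity
\[
\Psi_\Gamma = A_{e_0} \Psi_{\Gamma \setminus e_0} + \Psi_{\Gamma / e_0}, \tag{3.10}
\]
where the two graph polynomials on the right do not involve $A_{e_0}$.

4. Global Geometry

In this section, for a vector bundle $E$ over a variety $X$ we write $\mathbb{P}(E)$ for the projective bundle of hyperplane sections, so $a_* \mathcal{O}_{\mathbb{P}(E)}(1) = E$, with $a : \mathbb{P}(E) \to X$. In particular, a surjection of vector bundles $E \twoheadrightarrow F$ gives rise to a closed immersion $\mathbb{P}(F) \hookrightarrow \mathbb{P}(E)$. Consider projective space $\mathbb{P}^r$ and its dual $(\mathbb{P}^r)^\vee$. One has the Euler sequence
\[
0 \to \mathcal{O}_{\mathbb{P}^r} \xrightarrow{\;e\;} \mathcal{O}_{\mathbb{P}^r}(1) \otimes \Gamma((\mathbb{P}^r)^\vee, \mathcal{O}(1)) \to T_{\mathbb{P}^r} \to 0, \tag{4.1}
\]
where $T$ is the tangent bundle. Writing $T_0, \ldots, T_r$ for a basis of $\Gamma(\mathbb{P}^r, \mathcal{O}(1))$ and $\frac{\partial}{\partial T_i} \in \Gamma((\mathbb{P}^r)^\vee, \mathcal{O}(1))$ for the dual basis, we have
\[
e(1) = \sum_i T_i \otimes \frac{\partial}{\partial T_i} \in \Gamma\bigl(\mathbb{P}^r, \mathcal{O}_{\mathbb{P}^r}(1) \otimes \Gamma((\mathbb{P}^r)^\vee, \mathcal{O}(1))\bigr). \tag{4.2}
\]
Geometrically, we can think of $e(1)$ as a homogeneous form of degree $(1, 1)$ on $\mathbb{P}^r \times (\mathbb{P}^r)^\vee$ whose zeroes define $\mathbb{P}(T_{\mathbb{P}^r}) \hookrightarrow \mathbb{P}^r \times (\mathbb{P}^r)^\vee$. The fibre in $\mathbb{P}(T_{\mathbb{P}^r})$ over a point $\frac{\partial}{\partial T_i} = a_i$ in $(\mathbb{P}^r)^\vee$ is the hyperplane cut out by $\sum a_i T_i$ in $\mathbb{P}^r$. For $V \hookrightarrow \mathbb{P}^r$ a closed subvariety, define $p_V$ to be the composition $p_V : \mathbb{P}(T_{\mathbb{P}^r}|_V) \hookrightarrow \mathbb{P}(T_{\mathbb{P}^r}) \to (\mathbb{P}^r)^\vee$, and the fibre over $\frac{\partial}{\partial T_i} = a_i$ is $V \cap \{\sum a_i T_i = 0\}$. Assuming $V$ smooth, we have the normal bundle sequence
\[
0 \to T_V \to T_{\mathbb{P}^r}|_V \to N_{V/\mathbb{P}^r} \to 0. \tag{4.3}
\]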
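Proposition 4.1. Assume $V \hookrightarrow \mathbb{P}^r$ is a smooth, closed subvariety. Consider the diagram
\[
\begin{array}{ccc}
\mathbb{P}(N_{V/\mathbb{P}^r}) & \hookrightarrow & \mathbb{P}(T_{\mathbb{P}^r}|_V) \\
\downarrow & & \downarrow p_V \\
(\mathbb{P}^r)^\vee & = & (\mathbb{P}^r)^\vee.
\end{array} \tag{4.4}
\]
We have
\[
\mathbb{P}(N_{V/\mathbb{P}^r}) \cap p_V^{-1}(a) = \Bigl(V \cap \bigl\{\textstyle\sum a_i T_i = 0\bigr\}\Bigr)_{\mathrm{sing}}, \tag{4.5}
\]
the singular points of the corresponding hypersurface section.

Proof. Let $x \in V \subset \mathbb{P}^r$ be a point. To avoid confusion we write $dT_i$ for the dual basis to $\frac{\partial}{\partial T_i}$. To a sum $\sum a_i\, dT_i$ and a point $x \in V$ we can associate a point of $\mathbb{P}(T_{\mathbb{P}^r}|_V)$. Suppose $x \in p_V^{-1}(a)$. Then $x$ is singular in this fibre if and only if $\sum a_i\, dT_i$ kills $T_{V,x} \subset T_{\mathbb{P}^r,x}$, and this is true if and only if $\sum a_i\, dT_i \in \mathbb{P}(N_{V/\mathbb{P}^r})$.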
Suppose now $V = \mathbb{P}^k$ and the embedding $\mathbb{P}^k \hookrightarrow \mathbb{P}^r$ is defined by a sublinear system in $\Gamma(\mathbb{P}^k, \mathcal{O}(2))$ spanned by quadrics $q_0, \ldots, q_k$. The fibres of the map $p : T_{\mathbb{P}^r}|_{\mathbb{P}^k} \to (\mathbb{P}^r)^\vee$ are the degree 2 hypersurfaces $\sum a_i q_i = 0 \subset \mathbb{P}^k$. Note that the singular set in such a hypersurface is a projective space of dimension $= k - \operatorname{rank}(\sum a_i M_i)$, where the $M_i$ are $(k+1) \times (k+1)$ symmetric matrices associated to the quadrics $q_i$. We conclude

Proposition 4.2. With notation as above, define
\[
X = \Bigl\{ a \in (\mathbb{P}^r)^\vee \;\Big|\; \operatorname{rank} \sum a_i M_i < k + 1 \Bigr\}. \tag{4.6}
\]
Then writing $N = N_{\mathbb{P}^k/\mathbb{P}^r}$, the map $\mathbb{P}(N) \to X$ is a resolution of singularities of $X$. The fibres of this map are projective spaces, with general fibre $\mathbb{P}^0 = $ point.

5. Quadrics

Let $K \subset \mathbb{R}$ be a real field. (For the application to Feynman quadrics, $K = \mathbb{Q}$.) We will be interested in homogeneous quadrics
\[
Q_i : q_i(Z_1, \ldots, Z_{2r}) = 0, \quad 1 \le i \le r \tag{5.1}
\]
in $\mathbb{P}^{2r-1}$ with homogeneous coordinates $Z_1, \ldots, Z_{2r}$. The union $\cup_1^r Q_i$ of the quadrics has then degree $2r$. It implies that $\Gamma\bigl(\mathbb{P}^{2r-1}, \omega(\sum_1^r Q_i)\bigr) = K[\eta]$ for a generator $\eta$ which, on the affine open $Z_{2r} \ne 0$ with affine coordinates $z_i = \frac{Z_i}{Z_{2r}}$, $i = 1, \ldots, (2r-1)$, is
\[
\eta|_{Z_{2r} \ne 0} = \frac{dz_1 \wedge \ldots \wedge dz_{2r-1}}{\tilde{q}_1 \cdots \tilde{q}_r}, \quad \text{with } \tilde{q}_i = \frac{q_i}{Z_{2r}^2}.
\]
By (standard) abuse of notations, we write
\[
\eta = \frac{\Omega_{2r-1}}{q_1 \cdots q_r}; \qquad \Omega_{2r-1} := \sum_{i=1}^{2r} (-1)^i Z_i\, dZ_1 \wedge \cdots \widehat{dZ_i} \cdots \wedge dZ_{2r}. \tag{5.2}
\]
The transcendental quantity of interest is the period
\[
P(Q) := \int_{\mathbb{P}^{2r-1}(\mathbb{R})} \eta = \int_{z_1, \ldots, z_{2r-1} = -\infty}^{\infty} \frac{dz_1 \wedge \cdots \wedge dz_{2r-1}}{\tilde{q}_1 \cdots \tilde{q}_r}. \tag{5.3}
\]
The integral is convergent and the period well defined, e.g. when the quadrics are all positive definite. Suppose now $r = 2n$ above, so we consider quadrics in $\mathbb{P}^{4n-1}$. Let $H \cong K^n$ be a vector space of dimension $n$, and identify $\mathbb{P}^{4n-1} = \mathbb{P}(H^4)$. For $\ell : H \to K$ a linear functional, $\ell^2$ gives a rank 1 quadratic form on $H$. A Feynman quadric is a rank 4 positive semi-definite form on $\mathbb{P}^{4n-1}$ of the form $q = q_\ell = (\ell^2, \ell^2, \ell^2, \ell^2)$. We will be interested in quadrics $Q_i$ of this form (for a fixed decomposition $K^{4n} = H^4$). In other words, we suppose given linear forms $\ell_i$ on $H$, $1 \le i \le 2n$, and we consider the corresponding period $P(Q)$, where $q_i = (\ell_i^2, \ell_i^2, \ell_i^2, \ell_i^2)$. For $\ell : H \to K$ a linear form, write $\lambda = \ker(\ell)$, $\Lambda = \mathbb{P}(\lambda, \lambda, \lambda, \lambda) \subset \mathbb{P}(H^4) = \mathbb{P}^{4n-1}$. The Feynman quadric $q_\ell$ associated to $\ell$ is then a cone over the codimension 4 linear space $\Lambda$. For a suitable choice of homogeneous coordinates $Z_1, \dots, Z_{4n}$ we have $q_\ell = Z_1^2 + \cdots + Z_4^2$. Let $q_1, \dots, q_{2n}$ be Feynman quadrics, and let $\Lambda_i$ be the linear space associated to $q_i$ as above. As $K$ is a real field, $\mathbb{P}^{4n-1}(\mathbb{R})$ meets $Q_i(\mathbb{C})$ only on $\Lambda_i(\mathbb{R})$.

Lemma 5.1. With notation as above, for $I = \{i_1, \dots, i_p\} \subset \{1, \dots, 2n\}$, write $r(I) = \operatorname{codim}_H(\lambda_{i_1} \cap \dots \cap \lambda_{i_p})$. The integral (5.2) converges if and only if $\sup_I \{p(I) - 2r(I)\} < 0$. Here the sup is taken over all $I \subset \{1, \dots, 2n\}$ and $p(I) = \#I$.

Proof. Suppose $\lambda_1 \cap \dots \cap \lambda_p$ has codimension $r$, with $2r \le p$. We can choose local coordinates $x_j$ so that $\bigcap_{i=1}^{p} \Lambda_i : x_1 = \cdots = x_{4r} = 0$, and then make the blowup $y_j = x_j / x_{4r}$, $1 \le j \le 4r - 1$, $y_j = x_j$, $j \ge 4r$. Then
\[
\frac{d^{4n-1}x}{q_1(x) \cdots q_{2n}(x)} = \frac{x_{4r}^{4r-1} \, d^{4n-1}y}{x_{4r}^{2p} \, \tilde q_1(y) \cdots \tilde q_{2n}(y)} \tag{5.4}
\]
for suitable $\tilde q_i(y)$ which are regular in the $y$-coordinates. Since $|\tilde q_i^{-1}| \ge C > 0$, it follows that the integral over a neighborhood of $0 \in \mathbb{R}^{4n-1}$ diverges if $(4r - 2p) \le 0$.

Suppose conversely that $\sup_I \{p(I) - 2r(I)\} < 0$. Note if $n = 1$, the quadrics are smooth and positive definite so the integrand has no pole along the integration chain and convergence is automatic. Assume $n > 1$. The above argument shows that blowing up an intersection of the $\Lambda_i$ does not introduce a pole in the integrand along the exceptional divisor. Further, the strict transforms of the quadrics continue to have degree $\le 2$ in the natural local coordinates and to be cones over the strict transforms of the $\Lambda_i$. One knows that after a finite number of such blowups, the strict transforms of the $\Lambda_i$ will meet transversally (see [10] for a minimal way to do it). All blowups and coordinates will be defined over $K \subset \mathbb{R}$, and one is reduced to checking convergence for an integral of the form
\[
\int_U \frac{d^{4n-1}x}{(x_1^2 + \cdots + x_4^2) \cdots (x_{4n-7}^2 + \cdots + x_{4n-4}^2)} \tag{5.5}
\]
with $U$ a neighborhood of $0 \in \mathbb{R}^{4n-1}$. The change of variables $x_i = t y_i$, $i \le (4n - 4)$, introduces a $t^{4n-5-2n+2} = t^{2n-3}$ factor. Since $n \ge 2$, convergence is clear.
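The criterion of Lemma 5.1 is finite and can be tested mechanically. The following is a minimal sketch, not part of the original argument, under two assumptions of ours: each linear form $\ell_i$ is given by its coefficient vector in some basis of $H$, and the sup is read over non-empty subsets $I$ with $r(I) < n$ (the case in which the projective intersection of the $\Lambda_i$ is non-empty, as used in the proof). The function names `rank` and `converges` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a list of rational row vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def converges(forms):
    """Check p(I) - 2 r(I) < 0 for all non-empty I with r(I) < n (assumed reading)."""
    n = len(forms[0])  # dimension of H
    for p in range(1, len(forms) + 1):
        for subset in combinations(range(len(forms)), p):
            r = rank([forms[i] for i in subset])
            if r < n and p - 2 * r >= 0:
                return False
    return True


# Edge functionals of the wheel with 3 spokes, in a basis of three loops.
wheel3 = [(1, 0, -1), (-1, 1, 0), (0, -1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
# Two "bubbles" glued at a vertex: the forms come in proportional pairs.
bubbles = [(1, 0), (-1, 0), (0, 1), (0, -1)]
```

With these inputs, `converges(wheel3)` holds while `converges(bubbles)` fails, because the two proportional forms in `bubbles` give a subset with $p(I) = 2$ and $r(I) = 1$.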
Let $\Gamma$ be a graph with $N$ edges and $n$ loops. Associated to $\Gamma$ we have the configuration of $N$ hyperplanes in the $n$-dimensional vector space $H = H_1(\Gamma)$, (2.1). As above, we consider the Feynman quadrics $q_i = (\ell_i^2, \ell_i^2, \ell_i^2, \ell_i^2)$ on $\mathbb{P}^{4n-1}$, $1 \le i \le N$. The graph $\Gamma$ is said to be convergent (resp. logarithmically divergent) if $N > 2n$ (resp. $N = 2n$). When $\Gamma$ is logarithmically divergent, the form
\[
\omega_\Gamma := \frac{d^{4n-1}x}{q_1 \cdots q_{2n}} \tag{5.6}
\]
has poles only along $\bigcup Q_i$, and we define the period
\[
P(\Gamma) := \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \omega_\Gamma \tag{5.7}
\]
as in (5.3).

Proposition 5.2. Let $\Gamma$ be a logarithmically divergent graph with $n$ loops and $2n$ edges. The period $P(\Gamma)$ converges if and only if every subgraph $G \subsetneq \Gamma$ is convergent, i.e. if and only if $\Gamma$ is primitive log divergent in the sense discussed in Sect. 0.

Proof. Let $G \subset \Gamma$ be a subgraph with $m$ loops and $M$ edges, and assume $M \le 2m$. Let $I \subset \{1, \dots, 2n\}$ be the edges not in $G$. Note $H_1(G) \subset H_1(\Gamma)$ has codimension $n - m$ and is defined by the $2n - M$ linear functionals corresponding to edges in $I$. By Lemma 5.1, the fact that $2(n - m) \le 2n - M$ implies that the period integral $P(\Gamma)$ is divergent. Conversely, if the period integral is divergent, there will exist an $I$ with $p(I) - 2r(I) \ge 0$. Let $G \subset \Gamma$ be the union of the edges not in $I$. Then $G$ has $2n - p(I)$ edges. Also $H_1(G) \subset H_1(\Gamma)$ is defined by the vanishing of functionals associated to edges in $I$, so $G$ has $n - r(I)$ loops. It follows that $G$ is not convergent.
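The graph-theoretic condition in Proposition 5.2 can be checked by brute force for small graphs. The sketch below is our own illustration, not the authors' method: a graph is a list of vertex pairs, a subgraph is a non-empty proper subset of edges, and the loop number is computed as (number of edges) minus (rank of the edge set) with a union-find. The names `h1` and `is_primitive_log_divergent` are hypothetical.

```python
from itertools import combinations


def h1(edges):
    """Loop number of the graph spanned by `edges` (a sequence of vertex pairs)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    merges = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            merges += 1
    # Each edge that does not merge two components closes a new loop.
    return len(edges) - merges


def is_primitive_log_divergent(edges):
    """N = 2n, and M > 2m for every non-empty proper edge subset."""
    if len(edges) != 2 * h1(edges):
        return False
    for k in range(1, len(edges)):
        for sub in combinations(edges, k):
            if len(sub) <= 2 * h1(sub):
                return False
    return True


wheel3 = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 1)]
two_bubbles = [(0, 1), (0, 1), (1, 2), (1, 2)]
```

The wheel with three spokes passes the test, while `two_bubbles` fails because a doubled edge is a subgraph with $M = 2 = 2m$.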
6. The Schwinger Trick

Let $Q_i : q_i(Z_1, \dots, Z_{4n}) = 0$, $1 \le i \le 2n$, be quadrics in $\mathbb{P}^{4n-1}$. We assume the period integral (5.3) converges. Let $M_i$ be the $4n \times 4n$ symmetric matrix corresponding to $q_i$, and write
\[
\Phi(A_1, \dots, A_{2n}) := \det(A_1 M_1 + \cdots + A_{2n} M_{2n}). \tag{6.1}
\]
The Schwinger trick relates the period integral $P(Q)$ (5.3) to an integral on $\mathbb{P}^{2n-1}$,
\[
\int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}(Z)}{q_1 \cdots q_{2n}} = C \int_{\sigma^{2n-1}(\mathbb{R})} \frac{\Omega_{2n-1}(A)}{\sqrt{\Phi}}. \tag{6.2}
\]
Here $\sigma^{2n-1}(\mathbb{R}) \subset \mathbb{P}^{2n-1}(\mathbb{R})$ is the locus of all points $s = [s_1, \dots, s_{2n}]$ such that the projective coordinates $s_i \ge 0$. $C$ is an elementary constant, and the $\Omega$'s are as in (5.2). Note the homogeneity is such that the integrands make sense.

Lemma 6.1. With notation as above, define
\[
g(A) = \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}}{(A_1 q_1 + \cdots + A_{2n} q_{2n})^{2n}}. \tag{6.3}
\]
Then
\[
g(A) \sqrt{\Phi} = c \pi^{-2n}; \qquad c \in \overline{\mathbb{Q}}^\times, \ [\mathbb{Q}(c) : \mathbb{Q}] \le 2. \tag{6.4}
\]
If $\Phi = \Xi^2$ for a polynomial $\Xi \in \mathbb{Q}[A_1, \dots, A_{2n}]$, then $c \in \mathbb{Q}^\times$.
Proof. By analytic continuation, we may suppose that $Q_a : \sum A_i q_i = 0$ is smooth. The integral is then the period associated to $H^{4n-1}(\mathbb{P}^{4n-1} \setminus Q_a)$. As generator for the homology we may either take $\mathbb{P}^{4n-1}(\mathbb{R})$ or the tube $\tau \subset \mathbb{P}^{4n-1} \setminus Q_a$ lying over the difference of two rulings $\ell_1 - \ell_2$ in the even dimensional smooth quadric $Q_a$. (More precisely, let $S \subset N \xrightarrow{p} X$ be the sphere bundle for some metric on the normal bundle $N$ of $X$, where $X \subset \mathbb{P}^{2n-1}$ is defined by $\Phi = 0$. Take $\tau = p^{-1}(\ell_1 - \ell_2)$.) The two generators differ by a rational scale factor $c$. Integrating over $\tau$ shows that $g(A)$ is defined up to a scale factor $\pm 1$ on $\mathbb{P}^{2n-1} \setminus X$. The monodromy arises because the rulings $\ell_i$ on $Q_a$ can be interchanged as $a$ winds around $X$. It follows easily that the left-hand side in (6.4) is homogeneous of degree 0 and single-valued on $\mathbb{P}^{2n-1} \setminus X$. To study its behavior near $X$ we restrict to a general line in $\mathbb{P}^{2n-1}$. In affine coordinates, we can then assume the family of quadrics looks like $\sum_{1}^{4n-1} x_i^2 - t = 0$, where $t$ is a parameter on the line. The integral then becomes
\[
\int_\gamma \frac{dx_1 \wedge \dots \wedge dx_{4n-1}}{\left(\sum x_i^2 - t\right)^{2n}} = \mathrm{const} \cdot t^{-\frac{1}{2}} \tag{6.5}
\]
for a suitable cycle $\gamma$. The change of variable $x_i = y_i \sqrt{t}$ gives the value $\mathrm{const} \cdot t^{-\frac{1}{2}}$, from which one sees that $g(A)\sqrt{\Phi}$ is constant. Since $H^{4n-1}(\mathbb{P}^{4n-1} \setminus Q_a) \cong \mathbb{Q}(-2n)$ as Hodge structure, $g(A)\sqrt{\Phi} = c_0 \pi^{-2n}$ for some $c_0 \in \overline{\mathbb{Q}}^\times$, and the lemma follows.
With notation as above, define
\[
f(A) := \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}(Z)}{(A_1 q_1 + \cdots + A_{2n} q_{2n}) \, q_2 q_3 \cdots q_{2n}}. \tag{6.6}
\]
Note that $f(A)$ is defined for $q_i$ positive definite and $A_j \ge 0$ but not all $A_j = 0$. We have
\[
g(A) = \frac{-1}{(2n-1)!} \, \frac{\partial^{2n-1}}{\partial A_2 \cdots \partial A_{2n}} f(A). \tag{6.7}
\]
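Write $a_i = \frac{A_i}{A_1}$, $2 \le i \le 2n$, and define $F(a_2, \dots, a_{2n}) := A_1 f(A)$. Note the various partials $\partial^{i-1} / \partial a_2 \cdots \partial a_i \, F(a)$ vanish as $a_i \to +\infty$ with $a_j \ge 0$, $\forall j$. Also $\Omega_{2n-1} = -A_1^{2n} \, da_2 \wedge \dots \wedge da_{2n}$. Thus
\[
\begin{aligned}
\int_{\sigma^{2n-1}(\mathbb{R})} g(A) \, \Omega_{2n-1}(A)
&= -\int_{\sigma^{2n-1}(\mathbb{R})} A_1^{2n} g(A) \, da_2 \wedge \dots \wedge da_{2n} \\
&= \frac{1}{(2n-1)!} \int_{a_2, \dots, a_{2n} = 0}^{+\infty} \frac{\partial^{2n-1}}{\partial a_2 \cdots \partial a_{2n}} F(a) \, da_2 \wedge \dots \wedge da_{2n} \\
&= \frac{-1}{(2n-1)!} F(0, \dots, 0) = \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}(Z)}{q_1 q_2 \cdots q_{2n}} = P(Q).
\end{aligned} \tag{6.8}
\]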
This identity holds by analytic extension in the q’s where both integrals are defined. Combining (6.8) with Lemma 6.1 we conclude
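Proposition 6.2. With notation as above, assuming the integral defining $P(Q)$ is convergent, we have
\[
P(Q) := \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}(Z)}{q_1 q_2 \cdots q_{2n}} = \frac{c}{\pi^{2n}} \int_{\sigma^{2n-1}(\mathbb{R})} \frac{\Omega_{2n-1}(A)}{\sqrt{\Phi}}. \tag{6.9}
\]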
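Corollary 6.3. Let $\Gamma$ be a graph with $n$ loops and $2n$ edges. Assume every proper subgraph of $\Gamma$ is convergent, and let $q_1, \dots, q_{2n}$ be the Feynman quadrics associated to $\Gamma$ (cf. Sect. 5). The symmetric matrices $M_i$ (6.1) in this case are block diagonal
\[
M_i = \begin{pmatrix} N_i & 0 & 0 & 0 \\ 0 & N_i & 0 & 0 \\ 0 & 0 & N_i & 0 \\ 0 & 0 & 0 & N_i \end{pmatrix}
\]
and $\Phi = \Psi_\Gamma^4$, where $\Psi_\Gamma = \det(A_1 N_1 + \dots + A_{2n} N_{2n})$ is the graph polynomial (2.2). The Schwinger trick yields (cf. (5.7))
\[
P(\Gamma) := \int_{\mathbb{P}^{4n-1}(\mathbb{R})} \frac{\Omega_{4n-1}(Z)}{q_1 q_2 \cdots q_{2n}} = \frac{c}{\pi^{2n}} \int_{\sigma^{2n-1}(\mathbb{R})} \frac{\Omega_{2n-1}(A)}{\Psi_\Gamma^2} \tag{6.10}
\]
for $c \in \mathbb{Q}^\times$.

7. The Motive

We assume as in Sect. 5 that the ground field $K \subset \mathbb{R}$ is real. Let $\Gamma$ be a graph with $n$ loops and $2n$ edges and assume every proper subgraph of $\Gamma$ is convergent. Our objective in this section is to consider the motive with period
\[
\int_{\sigma^{2n-1}(\mathbb{R})} \frac{\Omega_{2n-1}(A)}{\Psi_\Gamma^2}. \tag{7.1}
\]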
We consider $\mathbb{P}^{2n-1}$ with fixed homogeneous coordinates $A_1, \dots, A_{2n}$ associated with the edges of $\Gamma$. Linear spaces $L \subset \mathbb{P}^{2n-1}$ defined by vanishing of subsets of the $A_i$ will be referred to as coordinate linear spaces. For such an $L$, we write $L(\mathbb{R}^{\ge 0})$ for the subset of real points with non-negative coordinates.

Lemma 7.1. $X_\Gamma(\mathbb{C}) \cap \sigma^{2n-1}(\mathbb{R}) = \bigcup_{L \subset X_\Gamma} L(\mathbb{R}^{\ge 0})$, where the union is taken over all coordinate linear spaces $L \subset X_\Gamma$.

Proof. We know by Corollary 2.4 that $\Psi_\Gamma$ is a sum of monomials with coefficients $+1$. The lemma is clear for the zero set of any polynomial with coefficients $> 0$.
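To see Lemma 7.1 at work on a small case, one can enumerate the monomials of a graph polynomial explicitly. The sketch below assumes the standard spanning-tree expansion $\Psi_\Gamma = \sum_T \prod_{e \notin T} A_e$ (an assumption here, since the definition (2.2) is not reproduced in this section), and uses the wheel with three spokes as a test graph of our choosing. The helper names are hypothetical.

```python
from itertools import combinations
from math import prod


def is_spanning_tree(num_vertices, edges, subset):
    """True if the edges indexed by `subset` form a spanning tree on 0..num_vertices-1."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i in subset:
        u, v = edges[i]
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return len(subset) == num_vertices - 1


def graph_polynomial(num_vertices, edges):
    """Psi as {monomial: coefficient}; a monomial is the frozenset of edge indices NOT in the tree."""
    all_edges = set(range(len(edges)))
    poly = {}
    for tree in combinations(range(len(edges)), num_vertices - 1):
        if is_spanning_tree(num_vertices, edges, tree):
            mono = frozenset(all_edges - set(tree))
            poly[mono] = poly.get(mono, 0) + 1
    return poly


def evaluate(poly, values):
    """Evaluate the polynomial at A_i = values[i]."""
    return sum(c * prod(values[i] for i in mono) for mono, c in poly.items())


wheel3 = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3), (3, 1)]
psi = graph_polynomial(4, wheel3)
```

Every coefficient of `psi` is $+1$, so at a point with non-negative coordinates the polynomial vanishes only if every monomial does. Setting the three rim variables (indices 3, 4, 5, a subgraph with $h_1 > 0$) to zero kills `psi`, whereas setting only two of them to zero does not.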
Remark 7.2. (i) The assertion of the lemma is true for any graph polynomial. We do not need hypotheses about numbers of edges or loops. (ii) By Proposition 3.1, coordinate linear spaces $L \subset X_\Gamma$ correspond to subgraphs $G \subset \Gamma$ such that $h_1(G) > 0$.
Proposition 7.3. Let $\Gamma$ be as above. Define
\[
\eta = \eta_\Gamma = \frac{\Omega_{2n-1}(A)}{\Psi_\Gamma^2} \tag{7.2}
\]
as in (5.2). There exists a tower
\[
P = P_r \xrightarrow{\pi_{r,r-1}} P_{r-1} \xrightarrow{\pi_{r-1,r-2}} \cdots \xrightarrow{\pi_{2,1}} P_1 \xrightarrow{\pi_{1,0}} \mathbb{P}^{2n-1}, \qquad \pi = \pi_{1,0} \circ \cdots \circ \pi_{r,r-1}, \tag{7.3}
\]
where $P_i$ is obtained from $P_{i-1}$ by blowing up the strict transform of a coordinate linear space $L_i \subset X_\Gamma$ and such that
(i) $\pi^* \eta_\Gamma$ has no poles along the exceptional divisors associated to the blowups.
(ii) Let $B \subset P$ be the total transform in $P$ of the union of coordinate hyperplanes $\Delta^{2n-2} : A_1 A_2 \cdots A_{2n} = 0$ in $\mathbb{P}^{2n-1}$. Then $B$ is a normal crossings divisor in $P$. No face (= non-empty intersection of components) of $B$ is contained in the strict transform $Y$ of $X_\Gamma$ in $P$.
(iii) The strict transform of $\sigma^{2n-1}(\mathbb{R})$ in $P$ does not meet $Y$.

Proof. Our algorithm to construct the blowups will be the following. Let $S$ denote the set of coordinate linear spaces $L \subset \mathbb{P}^{2n-1}$ which are maximal, i.e. $L \in S$, $L \subset L' \subset X_\Gamma \Rightarrow L = L'$. Define
\[
\mathcal{F} = \Big\{ L \subset X_\Gamma \text{ coordinate linear space} \ \Big|\ L = \bigcap L^{(i)}, \ L^{(i)} \in S \Big\}. \tag{7.4}
\]
Let $\mathcal{F}_{\min} \subset \mathcal{F}$ be the set of minimal elements in $\mathcal{F}$. Note that elements of $\mathcal{F}_{\min}$ are disjoint. Define $P_1 \xrightarrow{\pi_{1,0}} \mathbb{P}^{2n-1}$ to be the blowup of elements of $\mathcal{F}_{\min}$. Now define $\mathcal{F}_1$ to be the collection of strict transforms in $P_1$ of elements in $\mathcal{F} \setminus \mathcal{F}_{\min}$. Again elements in $\mathcal{F}_{1,\min}$ are disjoint, and we define $P_2$ by blowing up elements in $\mathcal{F}_{1,\min}$. Then $\mathcal{F}_2$ is the set of strict transforms in $P_2$ of $\mathcal{F}_1 \setminus \mathcal{F}_{1,\min}$, etc. This process clearly terminates.

Note that to pass from $P_i$ to $P_{i+1}$ we blow up strict transforms of coordinate linear spaces $L$ contained in $X_\Gamma$. There will exist an open set $U \subset \mathbb{P}^{2n-1}$ such that $P_i \times_{\mathbb{P}^{2n-1}} U \cong U$ and such that $L \cap U \ne \emptyset$. It follows that to calculate the pole orders of $\pi^* \eta_\Gamma$ along exceptional divisors arising in the course of our algorithm it suffices to consider the simple blowup of a coordinate linear space $L \subset X_\Gamma$ on $\mathbb{P}^{2n-1}$. Suppose $L : A_1 = \dots = A_p = 0$. By assumption, the subgraph $G = \{e_1, \dots, e_p\} \subset \Gamma$ is convergent, i.e. $p > 2h_1(G)$. As in Proposition 3.5, if $I = (A_1, \dots, A_p) \subset K[A_1, \dots, A_{2n}]$, then $\Psi_\Gamma \in I^{h_1(G)} - I^{h_1(G)+1}$, so the denominator of $\eta$ contributes a pole of order $2h_1(G)$ along the exceptional divisor.
On the other hand, writing $a_i = \frac{A_i}{A_{2n}}$, a typical open in the blowup will have coordinates $a_i' = \frac{a_i}{a_p}$, $i < p$, together with $a_p, \dots, a_{2n-1}$, and the exceptional divisor will be defined by $a_p = 0$. Thus
\[
\begin{aligned}
da_1 \wedge \dots \wedge da_{2n-1} &= d(a_p a_1') \wedge \dots \wedge d(a_p a_{p-1}') \wedge da_p \wedge \dots \\
&= a_p^{p-1} \, da_1' \wedge \dots \wedge da_{p-1}' \wedge da_p \wedge \dots .
\end{aligned} \tag{7.5}
\]
Finally, $\pi^* \eta$ will vanish to order $p - 1 - 2h_1(G) \ge 0$ on the exceptional divisor, so the algorithm will imply (i). Here we observe that at least on the strata for which $p$ is even, $\pi^* \eta$ not only is regular along the exceptional divisor, but indeed really vanishes to order $\ge 1$. Recall the dictionary (3.1) between subgraphs $G = G(L) \subset \Gamma$ and coordinate linear spaces $L = L(G)$.
Lemma 7.4. Let $\mathcal{F}$ be as above, and let $\emptyset \ne L_1 \subsetneq L_2 \subsetneq \dots \subsetneq L_r$ be a chain of faces in $\mathcal{F}$ which is saturated in the sense that it cannot be made longer using elements of $\mathcal{F}$. Let $G_r \subsetneq G_{r-1} \subsetneq \dots \subsetneq G_1 \subsetneq G_0 := \Gamma$ be the chain of subgraphs. Then $h_1(G_j) = r + 1 - j$. In particular, $n = h_1(\Gamma) = r + 1$. For $j \ge 1$ and any $e \in G_j \setminus G_{j+1}$ we have $h_1(G_j \setminus e) = h_1(G_j) - 1 = h_1(G_{j+1})$.

Proof of Lemma 7.4. Let $G \subset \Gamma$ be a (not necessarily connected) subgraph. Consider the property
\[
\forall e \in G, \quad h_1(G \setminus e) < h_1(G). \tag{7.6}
\]
I claim we can write $G = \bigcup G^{(i)}$, where the $G^{(i)}$ have the same minimality property and in addition $h_1(G^{(i)}) = 1$. We argue by induction on $h = h_1(G)$. If $h = 1$ we can just take $G$. If $h > 1$, then for every $e \in G$ we can find a $G_e \subset G$ such that $e \in G_e$, $h_1(G_e) = 1$, and $G_e$ is minimal. Indeed, since $h_1(G \setminus e) < h_1(G)$, we can find a connected subgraph $G' \subset G$ such that $e \in G'$, $h_1(G') = 1$, and $h_1(G' \setminus e) = 0$. Now just remove $e' \ne e$ from $G'$ until the resulting subgraph is minimal. Since $e \in G_e$ we have $G = \bigcup_e G_e$ as desired. Applying our dictionary, $L(G) = \bigcap_e L(G_e)$. Note the $L(G_e) \subset X_\Gamma$ are maximal. We conclude that $L(G) \in \mathcal{F}$ for any $G \subset \Gamma$ satisfying (7.6).

Conversely, if $G = \bigcup G^{(i)}$ with $L(G^{(i)})$ maximal in $X_\Gamma$, then every vertex in $G$ lies on at least 2 edges (because this holds for the $G^{(i)}$). If for some $e \in G$ we had $h_1(G) = h_1(G \setminus e)$, we would then necessarily have that $G \setminus e$ was disconnected. If $e \in G^{(1)} \subset G$, then since $G^{(1)}$ has no external edges, it would follow that $G^{(1)} \setminus e$ was disconnected. This would imply $h_1(G^{(1)} \setminus e) = h_1(G^{(1)})$, a contradiction. We conclude that $L \in \mathcal{F}$ iff $G(L)$ satisfies (7.6).

The lemma now is purely graph-theoretic, concerning the existence of chains of subgraphs satisfying (7.6). Basically the condition is that the $G_i$ have no external edges and are "1-particle irreducible" in the physicist's sense. (Note of course that we cannot assume this for $G_0 = \Gamma$, which is given.) To construct such a chain one simply takes $G_r \subset \Gamma$ minimal such that $h_1(G_r) = 1$ and $G_{r-i}$ minimal such that $G_{r-i+1} \subset G_{r-i}$ and $h_1(G_{r-i+1}) < h_1(G_{r-i})$. Note the $G_j$ are not necessarily connected.
We now prove (ii). Let $\pi : P \to \mathbb{P}^{2n-1}$ be constructed as above, using the $\mathcal{F}_{i,\min}$. 0-faces of $B \subset P$ will be referred to as vertices (not to be confused with vertices of the graph). It will suffice to show that no vertex lies in the strict transform $Y$. Let $v \in P$ be a vertex. The question of whether $v \in Y$ is local around $v$, so we may localize our tower (7.3), replacing $P_i$ with $\operatorname{Spec}(\mathcal{O}_{P_i, v_i})$, where $v_i \in P_i$ is the image of $v$. In particular, $\mathbb{P}^{2n-1}$ is replaced by $\operatorname{Spec}(\mathcal{O}_{\mathbb{P}^{2n-1}, v_0})$, where $v_0 \in \mathbb{P}^{2n-1}$ is the image of $v$. Note the image $v_i$ of $v$ in $P_i$ is always a vertex. We modify the tower by throwing out the steps for which $\operatorname{Spec}(\mathcal{O}_{P_i, v_i}) \to \operatorname{Spec}(\mathcal{O}_{P_{i-1}, v_{i-1}})$ are isomorphisms. For convenience, we don't change notation. All our $P_i$ are now local. Let $E_1, \dots, E_r \subset P$ be the exceptional divisors, where $E_i$ comes by pullback from $P_i$. Write $L_i := \pi(E_i) \subset P_0 := \operatorname{Spec}(\mathcal{O}_{\mathbb{P}^{2n-1}, v_0})$. We claim that $v_0 \in L_1$, and $L_1 \subsetneq L_2 \subsetneq \dots \subsetneq L_r$ is precisely the sort of saturated chain in $\mathcal{F}$ considered in Lemma 7.4 above. Indeed, at each stage, $v$ maps to the exceptional divisor from the stage before. (If $v$ does not map to the exceptional divisor in $P_i$, then the local rings at the image of $v$ in $P_i$ and $P_{i-1}$ are isomorphic, and this arrow is dropped under localization.)

Our task now will be to compute $Y \cap \bigcap_{i=1}^{r} E_i$. We will do this step by step. (We drop the assumption that our chain is saturated.) Suppose first $r = 1$, i.e. there is only one blowup. Let $L_1 \subset \mathbb{P}^{2n-1}$ be the linear space being blown up and suppose $L_1$ has codimension $p_1$. Then by Proposition 3.5, if we write $G_1 = G(L_1) \subset \Gamma$ and $\Gamma /\!/ G_1$ for the quotient identifying each connected component of $G_1$ to a point, we have $E_1 \cong L_1 \times \mathbb{P}^{p_1 - 1}$ and
\[
Y_1 \cap E_1 = (X_{\Gamma /\!/ G_1} \times \mathbb{P}^{p_1 - 1}) \cup (L_1 \times X_{G_1}). \tag{7.7}
\]
Now suppose we have $L_1 \subset L_2$ and we want to compute $Y_2 \cap E_1 \cap E_2 \subset P_2$. (We write abusively $E_1$ for the pullback to $P_2$ of $E_1$. $Y_i \subset P_i$ is the strict transform of $X$.) Locally at $v_0$ let $L_i : a_1 = \dots = a_{p_i} = 0$ with $p_1 > p_2$. Let $f$ be a local defining equation for $X$ near $v_0$ and write
\[
f = \sum c_{I,J} \, (a_1, \dots, a_{p_2})^I (a_{p_2+1}, \dots, a_{p_1})^J \tag{7.8}
\]
with evident multi-index notation. Write $|I|$, $|J|$ for the total degree of a multi-index. We are interested in points of $P_1$ where the strict transform of $L_2$ meets $E_1$. Typical local coordinates at such points look like
\[
a_i' := a_i / a_{p_1}, \ 1 \le i < p_1, \quad a_{p_1}' = a_{p_1}, \ \dots \ (\text{coords. not involving the } a\text{'s}). \tag{7.9}
\]
To compute the intersection of the strict transform with the two exceptional divisors on $P_2$, we let $\nu := \min(|I| + |J|)$ in (7.8), and write
\[
f_1 = \sum (a_{p_1}')^{|I| + |J| - \nu} \, c_{I,J} \, (a_1', \dots, a_{p_2}')^I (a_{p_2+1}', \dots, a_{p_1 - 1}')^J. \tag{7.10}
\]
This is the equation for $Y_1 \subset P_1$. We then take the image in the cone for the second blowup by taking the sum only over those terms with $|I| = |I|_{\min}$ minimal:
\[
\tilde f_1 = \sum_{\substack{I,J \\ |I| = |I|_{\min}}} (a_{p_1}')^{|I| + |J| - \nu} \, c_{I,J} \, (a_1', \dots, a_{p_2}')^I (a_{p_2+1}', \dots, a_{p_1 - 1}')^J. \tag{7.11}
\]
Notice that a priori $a_{p_1}'$ might divide $\tilde f_1$. We claim in fact that it does not, i.e. that there exists $I, J$ such that $c_{I,J} \ne 0$ and both $|I|$ and $|I| + |J|$ are minimum. To see this, note
\[
|I|_{\min} = h_1(G_2); \qquad \min(|I| + |J|) = h_1(G_1). \tag{7.12}
\]
Assuming $L_1 \subset L_2$ is part of a saturated tower, we have as in Lemma 7.4 that $h_1(G_1) = h_1(G_2) + 1$. If no nonzero term in $f$ has both $|I|$ and $|I| + |J|$ minimal, then every term with $|I| + |J|$ minimal must have $|I| = |I|_{\min} + 1$ and $|J| = 0$. But this would mean that the graph polynomial for $G_1$ would not involve the variables $A_{p_2+1}, \dots, A_{p_1}$. Since the $G_i$ have no external edges and $h_1(G_i \setminus e) < h_1(G_i)$, there are spanning trees (disjoint unions of spanning trees if $G_i$ is not connected) avoiding any given edge, so this is a contradiction. In general, if we have $L_1 \subset \dots \subset L_r$ saturated we write
\[
f = \sum_{I_1, \dots, I_r} c_{I_1, \dots, I_r} \, (a_1, \dots, a_{p_r})^{I_r} (a_{p_r + 1}, \dots, a_{p_{r-1}})^{I_{r-1}} \cdots (a_{p_2 + 1}, \dots, a_{p_1})^{I_1}. \tag{7.13}
\]
We have
\[
\min(|I_r|) = \min(|I_{r-1}| + |I_r|) - 1 = \dots = \min(|I_r| + \cdots + |I_1|) - r + 1. \tag{7.14}
\]
We claim there exist spanning trees $T$ for $G_1$ such that $T$ does not contain any $G_i \setminus G_{i+1}$. This will mean there exist $c_{I_1, \dots, I_r} \ne 0$ such that $\sum_{1}^{r} |I_j|$ is minimum but $|I_j| \ne 0$ for any $j$. By (7.14), this in turn implies for such a monomial that $\sum_{i=q}^{r} |I_i|$ is minimal for all $q$. To show the existence of $T$, choose $e_i \in G_i \setminus G_{i+1}$ for $1 \le i \le r - 1$ and $e_r \in G_r$. It suffices to show that $h_0(G_1 \setminus \{e_1, \dots, e_r\}) = h_0(G_1)$. We have $h_0(G_1 \setminus e_1) = h_0(G_1)$ (since $h_1$ drops). Mayer-Vietoris yields an exact sequence
\[
\dots \to H_1(G_1 \setminus e_1) \to H_0(G_2 \setminus \{e_2, \dots, e_r\}) \to H_0(G_2) \oplus H_0(G_1 \setminus \{e_1, \dots, e_r\}) \to H_0(G_1 \setminus e_1) \to 0. \tag{7.15}
\]
We have inductively $H_0(G_2 \setminus \{e_2, \dots, e_r\}) \cong H_0(G_2)$ and we deduce
\[
H_0(G_1 \setminus \{e_1, \dots, e_r\}) \cong H_0(G_1 \setminus e_1) \cong H_0(G_1). \tag{7.16}
\]
Let $f$ be as in (7.13) and assume there exists $c_{I_1, \dots, I_r} \ne 0$ as above. We claim that $Y \cap E_1 \cap \dots \cap E_r$ can be computed as follows. For clarity, it is convenient to change notation a bit and write $D_i \subset P_i$ for the exceptional divisor. Abusively, $E_i$ will denote any pullback of $D_i$ to a $P_j$ for $j > i$. Take the strict transform $Y_1$ to $P_1$ and intersect with $D_1$. Now take the strict transform $\mathrm{st}_{2,1}(Y_1 \cap D_1)$ of $Y_1 \cap D_1$ to $P_2$ and intersect with $D_2$. Continue in this fashion. The assertion is
\[
Y \cap \bigcap_{1}^{r} E_i = E_r \cap \mathrm{st}_{r,r-1}\Big( D_{r-1} \cap \mathrm{st}_{r-1,r-2}\big( D_{r-2} \cap \dots \mathrm{st}_{2,1}(D_1 \cap Y_1) \dots \big) \Big). \tag{7.17}
\]
This is just an elaboration on (7.11), (7.13). The left-hand side amounts to taking the terms with $|I_1| + \dots + |I_r|$ minimal, removing appropriate powers of defining equations for the exceptional divisors, and then restricting; while the right-hand side takes those terms with $\sum_{j=q}^{r} |I_j|$ minimum for $q = 1, \dots, r - 1$. By what we have seen, these yield the same answer. It remains to see that the intersection (7.17) doesn't contain the vertex $v$. We have seen (7.7) that $D_1 \cap Y_1$ is a union of the pullbacks of graph hypersurfaces for $G_1$ and $\Gamma /\!/ G_1$. We have a cartesian diagram $E_1 \cap D_2 \cong L_1 \times \mathbb{P}^{p_2 - 1} \times \mathbb{P}^{p_1 - p_2 - 1} \longrightarrow$
D1 ∼ = L 1 × P p1 −1
P2 −→ B L(λ2
ρ1
−→ P1 −→
⊂ P p1 −1 ) −→ P p2 −1 , (7.18)
P p1 −1
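where $\lambda_2 \cong \mathbb{P}^{p_2 - 1}$ corresponds to $L_2 \supset L_1$, and the strict transform in $P_1$ is the pullback $L_2' = \rho_1^{-1}(\lambda_2)$. Of course the picture continues in this fashion all the way up. In the end, we get
\[
L_1 \times \mathbb{P}^{p_r - 1} \times \mathbb{P}^{p_{r-1} - p_r - 1} \times \dots \times \mathbb{P}^{p_1 - p_2 - 1}. \tag{7.19}
\]
The strict transform of $X$ here, by (7.17), is the union of pullbacks of graph hypersurfaces
\[
\mathrm{pr}_{L_1}^{-1} X_{\Gamma /\!/ G_1} \cup \mathrm{pr}_r^{-1} X_{G_r} \cup \mathrm{pr}_{r-1}^{-1} X_{G_{r-1} /\!/ G_r} \cup \dots \cup \mathrm{pr}_1^{-1} X_{G_1 /\!/ G_2}. \tag{7.20}
\]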
Now each of the graphs involved has $h_1 = 1$, so each of the graph hypersurfaces is linear. As we have seen, they involve all the edge variables so they do not vanish at any of the vertices. This completes the proof of Proposition 7.3(ii). Finally, the proof of (iii) is straightforward from (ii). One uses the existence of local coordinates as in (7.13) with respect to which the defining equation of the strict transform is a sum of monomials with coefficients $> 0$, and elements in the strict transform $\tilde\sigma$ of $\sigma^{2n-1}(\mathbb{R})$ have coordinates $\ge 0$. (Points in $Y \cap \tilde\sigma$ could be specialized to vertices.)
We are now in a position to make explicit the motive (0.1) associated to a primitive divergent graph $\Gamma \subset \mathbb{P}^{2n-1}$. Let $P \xrightarrow{\pi} \mathbb{P}^{2n-1}$ be as in Proposition 7.3. Let $\Delta \subset \mathbb{P}^{2n-1}$ be the union of the $2n$ coordinate hyperplanes. Let $B := \pi^* \Delta$ and let $Y \subset P$ be the strict transform of the graph hypersurface $X = X_\Gamma$. Consider the motive (0.1):
\[
H := H^{2n-1}(P \setminus Y, B \setminus B \cap Y). \tag{7.21}
\]
Proposition 7.5. By construction, the divisor $B \subset P$ has normal crossings. The Hodge structure on the Betti realization $H_B$ has the following properties:
(i) $H_B$ has weights in $[0, 4n - 2]$. $W_0 H_B \cong \mathbb{Q}(0)$.
(ii) The strict transform $\tilde\sigma$ of the chain $\sigma^{2n-1}(\mathbb{R})$ in Proposition 7.3(iii) represents a homology class in $H_{2n-1}(P \setminus Y, B \setminus B \cap Y)$. The composition
\[
W_0 H_B \to H_B \xrightarrow{\ \int_{\tilde\sigma}\ } \mathbb{Q}
\]
is a vector space isomorphism.

Proof. We have the exact sequence
\[
0 \to H^{2n-2}(B \setminus Y \cap B) / H^{2n-2}(P \setminus Y) \to H \to H^{2n-1}(P - Y). \tag{7.22}
\]
Write $B = \bigcup B_i$, $B^{(r)} = \coprod B_{i_1} \cap \dots \cap B_{i_r}$. We have a spectral sequence of Hodge structures
\[
E_1^{p,q} = H^q(B^{(p+1)} \setminus B^{(p+1)} \cap Y) \Rightarrow H^{p+q}(B \setminus B \cap Y). \tag{7.23}
\]
From known properties of weights for open smooth varieties, we get an exact sequence
\[
H^0(B^{(2n-2)}) \to H^0(B^{(2n-1)}) \to W_0 H \to 0. \tag{7.24}
\]
An analogous calculation with $B$ replaced by $\Delta \subset \mathbb{P}^{2n-1}$ yields $\mathbb{Q}(0)$ as cokernel. It is easy to see that blowing up strict transforms of linear spaces doesn't change this cokernel. This proves (i). Assertion (ii) is straightforward.
An optimist might hope for a bit more. Whether for all primitive divergent graphs, or for an identifiable subset of them, one would like that the maximal weight piece of $H_B$ should be Tate,
\[
\mathrm{gr}^W_{\max} H_B = \mathbb{Q}(-p)^{\oplus r}. \tag{7.25}
\]
Further one would like that there should be a rank 1 sub-Hodge structure $\iota : \mathbb{Q}(-p) \hookrightarrow \mathrm{gr}^W_{\max} H_B$ such that the image of $\eta_\Gamma \in H_{DR}$ in $\mathrm{gr}^W_{\max} H_{DR}$ spans $\iota(\mathbb{Q}(-p))_{DR}$. Our main result is that this is true for wheel and spoke graphs (Sects. 11, 12).
8. The Motive II

In this section we consider the class of the graph hypersurface $[X_\Gamma]$ in the Grothendieck group $K_{\mathrm{mot}}$ of quasi-projective varieties over $k$ with the relation $[X] = [Y] + [X \setminus Y]$ for $Y$ closed in $X$. We assume $\Gamma$ has $N$ edges and $n$ loops. The basic result of [3] is that $[X_\Gamma]$ can be quite general. In particular, the motive of $X_\Gamma$ is not in general mixed Tate. From the physicists' point of view, of course, one is primarily interested in the period (6.10). Results in [3] do not exclude the possibility of some mixed Tate submotive yielding this period. The methods of [3] seem to require graphs with physically unrealistic numbers of edges, so it is worth looking more closely at $[X_\Gamma]$. In this section we pursue a naive projection technique based on the fact that graph and related polynomials have degree $\le 1$ in each variable. We stratify $X_\Gamma$ and examine whether the strata are mixed Tate. For $N = 2n \ge 12$, we identify a possible non-mixed Tate stratum. Curiously, the stratum we consider turns out to be mixed Tate in "most" cases, but with a computer it is not difficult to generate cases where it may not be. We give such an example with 12 edges. Note however that Stembridge [13] has shown that all graphs with $\le 12$ edges are mixed Tate, so the particular example we give must in fact be mixed Tate.

Techniques and results in this section should be compared with [13], which predates our work. The basic observation of Kontsevich is that for $X_\Gamma$ mixed Tate, there will exist a polynomial $P_\Gamma$ with $\mathbb{Z}$-coefficients such that for any finite field $\mathbb{F}_q$ we have $\# X_\Gamma(\mathbb{F}_q) = P_\Gamma(q)$. Stembridge has implemented a computer algorithm for checking this. It might be of interest to try some of our examples to see if they satisfy Kontsevich's condition.

If we fix an edge $e$, by (3.10) we can write the graph polynomial
\[
\Psi_\Gamma = A_e \cdot \Psi_{\Gamma \setminus e} + \Psi_{\Gamma / e}. \tag{8.1}
\]
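The deletion-contraction formula (8.1) can be verified combinatorially on a small graph. The sketch below again assumes the standard spanning-tree expansion of $\Psi_\Gamma$ (each monomial is the set of edges outside a spanning tree, with coefficient 1) and that $e$ is not a self-loop; the graph, the edge labels, and all function names are our own illustrative choices.

```python
from itertools import combinations


def find(parent, v):
    """Root of v in a union-find forest stored as a dict."""
    while parent[v] != v:
        v = parent[v]
    return v


def tree_monomials(vertices, edges):
    """Monomials of Psi: one frozenset of edge labels (the complement) per spanning tree."""
    labels = sorted(edges)
    monomials = set()
    for tree in combinations(labels, len(vertices) - 1):
        parent = {v: v for v in vertices}
        acyclic = True
        for lab in tree:
            ru = find(parent, edges[lab][0])
            rv = find(parent, edges[lab][1])
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:
            monomials.add(frozenset(labels) - frozenset(tree))
    return monomials


def delete(edges, e):
    """Edge set of the graph with e removed."""
    return {lab: uv for lab, uv in edges.items() if lab != e}


def contract(vertices, edges, e):
    """Vertices and edges of the graph with e contracted (e assumed not a self-loop)."""
    u, v = edges[e]
    merged = {lab: tuple(u if w == v else w for w in uv)
              for lab, uv in edges.items() if lab != e}
    return vertices - {v}, merged


def check_deletion_contraction(vertices, edges, e):
    """Compare Psi with A_e * Psi(delete e) + Psi(contract e), monomial by monomial."""
    psi = tree_monomials(vertices, edges)
    psi_del = tree_monomials(vertices, delete(edges, e))
    v_con, e_con = contract(vertices, edges, e)
    psi_con = tree_monomials(v_con, e_con)
    return psi == {m | {e} for m in psi_del} | psi_con


V = {0, 1, 2, 3}
E = {1: (0, 1), 2: (0, 2), 3: (0, 3), 4: (1, 2), 5: (2, 3), 6: (3, 1)}
```

For the wheel with three spokes, `check_deletion_contraction(V, E, e)` holds for every edge label `e`. The monomials containing $A_e$ come from spanning trees avoiding $e$, and the remaining ones from spanning trees through $e$.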
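Projecting from the point $v_e$ defined by $A_e(v_e) = 1$, $A_{e'}(v_e) = 0$, $e' \ne e$, yields $\mathrm{pr}_e : \mathbb{P}^{N-1} \setminus \{v_e\} \to \mathbb{P}^{N-2}$ and
\[
X_\Gamma \setminus \mathrm{pr}_e^{-1}(X_{\Gamma \setminus e}) \cap X_\Gamma \xrightarrow{\ \cong\ } \mathbb{P}^{N-2} \setminus X_{\Gamma \setminus e}. \tag{8.2}
\]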
One might hope to stratify $X_\Gamma$ and try to analyse its motive in this way. We know, however, by [3] that in general this motive is very rich, and such elementary techniques will not suffice to understand it. Indeed, we have
\[
\mathrm{pr}_e^{-1}(X_{\Gamma \setminus e}) \cap X_\Gamma = \mathrm{pr}_e^{-1}(X_{\Gamma \setminus e} \cap X_{\Gamma / e}), \tag{8.3}
\]
so already at the second step we must analyse an intersection of two graph hypersurfaces. What is amusing is that, in fact, one can continue a bit further, and the process gives some indication of where motivic complications might first arise.

Lemma 8.1. Assume $\Gamma$ has $n$ loops and $2n$ edges. Enumerate the edge variables $A_1, \dots, A_{2n}$ in such a way that $A_1 A_2 \cdots A_n$ appears with coefficient 1 in $\Psi_\Gamma$. Then we can write
\[
\Psi_\Gamma = \det(m_{ij} + \delta_{ij} A_i)_{1 \le i,j \le n}; \qquad m_{ij} = m_{ij}(A_{n+1}, \dots, A_{2n}). \tag{8.4}
\]
In other words, the first $n$ variables appear only on the diagonal.
Proof. Let $T \subset \Gamma$ be the subgraph with edges $e_{n+1}, \dots, e_{2n}$. Our assumption implies that $T$ is a spanning tree, so $\mathbb{Z}[E_\Gamma] \cong H_1(\Gamma) \oplus \mathbb{Z} e_{n+1} \oplus \dots \oplus \mathbb{Z} e_{2n}$. The linear functionals $e_i^\vee$ thus induce an isomorphism
\[
(e_1^\vee, \dots, e_n^\vee) : H_1(\Gamma) \cong \mathbb{Z}^n. \tag{8.5}
\]
With respect to this basis of $H_1(\Gamma)$ the rank 1 quadratic forms $(e_i^\vee)^2$ correspond to the matrices with 1 in position $(i, i)$ and zeroes elsewhere, for $1 \le i \le n$. Define $(m_{ij})$ to be the symmetric matrix associated to the quadratic form $\sum_{n+1}^{2n} A_i (e_i^\vee)^2$. The assertion of the lemma is now clear.
Lemma 8.2 ([9]). Let $\psi = \det(m_{ij} + \delta_{ij} A_i)_{1 \le i,j \le n}$, where the $m_{ij}$ are independent of $A_1, \dots, A_n$. For $1 \le k \le n$ write $\psi^k := \frac{\partial}{\partial A_k} \psi$ and $\psi_k := \psi|_{A_k = 0}$. For $I, J \subset \{1, \dots, n\}$ with $\#I = \#J$, define $\psi(I, J)$ to be the determinant as above with the rows in $I$ and the columns in $J$ removed. Let $1 \le k, \ell \le n$ be distinct integers and assume $k, \ell \notin I \cup J$. Then
\[
\psi(I, J)^{k\ell} \, \psi(I, J)_{k\ell} - \psi(I, J)^{k}_{\ell} \, \psi(I, J)^{\ell}_{k} = \pm \psi(I \cup \{k\}, J \cup \{\ell\}) \times \psi(I \cup \{\ell\}, J \cup \{k\}). \tag{8.6}
\]
The two factors on the right have degrees $\le 1$ in $A_i$ for $i \le n$.

Proof. We can drop the rows in $I$ and the columns in $J$ to begin with and ignore the $A_\nu$ for $\nu \notin \{k, \ell\}$. In this way, we reduce to the following assertion. Let $M$ be an $n \times n$ matrix with coefficients in a commutative ring. Assume $n \ge 2$. Write $M(S, T)$ for the matrix with rows in $S$ and columns in $T$ deleted. Then
\[
\det M(\{1, 2\}, \{1, 2\}) \cdot \det M - \det M(\{1\}, \{1\}) \cdot \det M(\{2\}, \{2\}) = -\det M(\{1\}, \{2\}) \cdot \det M(\{2\}, \{1\}). \tag{8.7}
\]
(By convention, the determinant of a $0 \times 0$ matrix is 1.) This is a straightforward exercise.

We attempt to stratify our graph hypersurface $X_\Gamma$ using the above lemmas. To fix ideas, we assume $\Gamma$ has $2n$ edges and $n$ loops.

Step 1. We order the edges so $\Psi_\Gamma$ admits a description as in Lemma 8.1.

Step 2. Project as in (8.2) with $e = e_1$, to conclude
\[
\begin{aligned}
[X_\Gamma] &= [\mathbb{P}^{2n-2}] + [\mathrm{Cone}(X_{\Gamma \setminus e_1} \cap X_{\Gamma / e_1})] - [X_{\Gamma \setminus e_1} \cap X_{\Gamma / e_1}] \\
&= [\mathbb{P}^{2n-2}] + 1 + ([\mathbb{A}^1] - 1)[X_{\Gamma \setminus e_1} \cap X_{\Gamma / e_1}].
\end{aligned} \tag{8.8}
\]
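The determinant identity (8.7) underlying Lemma 8.2 is easy to test on random integer matrices. The following sketch is only a numerical sanity check, not a proof; it uses exact integer arithmetic with a naive Laplace expansion, which is adequate for the small sizes tried here. The helper names `minor`, `det` and `identity_8_7_holds` are our own.

```python
import random


def minor(m, rows, cols):
    """Matrix m with the given row and column indices (0-based) deleted."""
    return [[x for j, x in enumerate(row) if j not in cols]
            for i, row in enumerate(m) if i not in rows]


def det(m):
    """Exact determinant by Laplace expansion; the 0x0 determinant is 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, [0], [j]))
               for j in range(len(m)))


def identity_8_7_holds(m):
    """Check det M(12,12) det M - det M(1,1) det M(2,2) = -det M(1,2) det M(2,1)."""
    lhs = (det(minor(m, [0, 1], [0, 1])) * det(m)
           - det(minor(m, [0], [0])) * det(minor(m, [1], [1])))
    rhs = -det(minor(m, [0], [1])) * det(minor(m, [1], [0]))
    return lhs == rhs


random.seed(0)
samples = [[[random.randint(-5, 5) for _ in range(n)] for _ in range(n)]
           for n in range(2, 6) for _ in range(10)]
```

For $n = 2$ the identity reduces to $(ad - bc) - ad = -bc$, and `identity_8_7_holds` returns `True` on every matrix in `samples`.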
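Step 3. Using (3.10), we can write (with notation as in Lemma 8.2 and $\psi = \Psi_\Gamma$)
\[
\begin{aligned}
\Psi_{\Gamma \setminus e_1} &= \frac{\partial}{\partial A_1} \Psi_\Gamma = A_2 \Psi_{\Gamma \setminus \{e_1, e_2\}} + \Psi_{(\Gamma \setminus e_1)/e_2} = A_2 \Psi^{12} + \Psi^{1}_{2}, \\
\Psi_{\Gamma / e_1} &= \Psi_\Gamma|_{A_1 = 0} = A_2 \Psi_{(\Gamma / e_1) \setminus e_2} + \Psi_{\Gamma / \{e_1, e_2\}} = A_2 \Psi^{2}_{1} + \Psi_{12}.
\end{aligned} \tag{8.9}
\]
Eliminating $A_2$, we conclude that projection from $\mathbb{P}^{2n-2}$ onto $\mathbb{P}^{2n-3}$ with coordinates $A_3, \dots, A_{2n}$ carries $X_{\Gamma \setminus e_1} \cap X_{\Gamma / e_1}$ onto the hypersurface defined by $\Psi^{1}_{2} \Psi^{2}_{1} - \Psi^{12} \Psi_{12} = 0$. By Lemma 8.2,
\[
\Psi^{1}_{2} \Psi^{2}_{1} - \Psi^{12} \Psi_{12} = \Psi(1, 2) \Psi(2, 1) = \Psi(1, 2)^2. \tag{8.10}
\]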
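(The right-hand identity holds because $\psi = \Psi_\Gamma$ is the determinant of a symmetric matrix.)

Step 4. Write $\mathcal{V}(I)$ for the locus of zeroes of a homogeneous ideal $I$. The projection in Step 3 blows up on $\mathcal{V}(\Psi^{1}_{2}, \Psi^{2}_{1}, \Psi^{12}, \Psi_{12})$, and we conclude
\[
\begin{aligned}
[X_{\Gamma \setminus e_1} \cap X_{\Gamma / e_1}] &= [X(1, 2)] + [\mathrm{Cone}\, \mathcal{V}(\Psi^{1}_{2}, \Psi^{2}_{1}, \Psi^{12}, \Psi_{12})] - [\mathcal{V}(\Psi^{1}_{2}, \Psi^{2}_{1}, \Psi^{12}, \Psi_{12})] \\
&= [X(1, 2)] + 1 + ([\mathbb{A}^1] - 1)[\mathcal{V}(\Psi^{1}_{2}, \Psi^{2}_{1}, \Psi^{12}, \Psi_{12})].
\end{aligned} \tag{8.11}
\]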
Step 5. One could try to study the motive of $\mathcal{V}(\Psi^{1}_{2}, \Psi^{2}_{1}, \Psi^{12}, \Psi_{12})$, but the elimination theory gets complicated, so instead we focus on $[X(1, 2)]$. Since $\Psi(1, 2)$ has degree $\le 1$ in $A_3$ we may project onto $\mathbb{P}^{2n-4}$ with coordinates $A_4, \dots, A_{2n}$. It might seem that we could repeat the argument starting from Step 2 above, but there is a problem. Writing $\Psi = \det M$ with $M$ symmetric, we have $\Psi(1, 2) = \det M(1, 2)$, where $M(1, 2)$ is obtained from $M$ by deleting the first row and the second column. This matrix is no longer symmetric. Just as in (8.2), the projection $X(1, 2) \to \mathbb{P}^{2n-4}$ blows up over $\mathcal{V}(\Psi(1, 2)^3, \Psi(1, 2)_3)$.

Step 6. Just as in Step 3, we project $\mathcal{V}(\Psi(1, 2)^3, \Psi(1, 2)_3)$ to $\mathbb{P}^{2n-5}$ with coordinates $A_5, \dots, A_{2n}$. When we eliminate $A_4$ we find the image of the projection is given by the zeroes of
\[
\Psi(1, 2)^{34} \, \Psi(1, 2)_{34} - \Psi(1, 2)^{3}_{4} \, \Psi(1, 2)^{4}_{3} \ \overset{\text{Lemma 8.2}}{=} \ \Psi(\{1, 3\}, \{2, 4\}) \cdot \Psi(\{1, 4\}, \{2, 3\}). \tag{8.12}
\]
Step 7. At this point something new has happened. The right-hand side in (8.12) is not a square. Although both factors have degree $\le 1$ in $A_5$, we will at the next stage in our motivic stratification have to deal with
\[
\mathcal{V}\big(\Psi(\{1, 3\}, \{2, 4\}), \Psi(\{1, 4\}, \{2, 3\})\big). \tag{8.13}
\]
Here Lemma 8.2 no longer applies. We find by example that eliminating $A_5$, the resulting hypersurface in $\mathbb{P}^{2n-6}$ in general no longer factors into factors with degrees $\le 1$ in $A_6$. Projection then is no longer an isomorphism at the generic point, and the argument is blocked.

Example 8.3. The computer yields the following example of a graph with 6 loops and 12 edges for which the projection (8.13) has an irreducible factor with degree 2 in $A_6$. Take 7 vertices labeled $1, 2, \dots, 7$ and connect them with edges as indicated:
\[
(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 2), (7, 3), (6, 4), (5, 1), (5, 3), (4, 1). \tag{8.14}
\]
Note, though, that this graph is mixed Tate by explicit computation, which finds it $\sim \zeta(3)\zeta(5)$.
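The basic numerical data of Example 8.3 (7 vertices, 12 edges, 6 loops) can be confirmed directly from the edge list (8.14). The short sketch below is our own check: it computes the loop number as edges minus vertices plus connected components, using a union-find; the function name `loop_data` is hypothetical.

```python
def loop_data(edges):
    """Return (number of vertices, number of components, loop number h_1)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    vertices = len(parent)
    components = len({find(v) for v in parent})
    return vertices, components, len(edges) - vertices + components


example_8_3 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7),
               (7, 2), (7, 3), (6, 4), (5, 1), (5, 3), (4, 1)]
num_vertices, num_components, h1 = loop_data(example_8_3)
```

The graph is connected, so $h_1 = 12 - 7 + 1 = 6$ and the graph is logarithmically divergent in the sense of Sect. 5 ($N = 2n$).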
9. General Remarks

Let $\Gamma$ be a graph with $n$ loops and $2n$ edges. We assume all subgraphs of $\Gamma$ are convergent so the period $P(\Gamma)$ is defined (Proposition 5.2). The Schwinger trick (Corollary 6.3) relates $P(\Gamma)$ to an integral computed in Schwinger coordinates in $\mathbb{P}^{2n-1}$. To avoid confusion, we write $P_{\mathrm{quadric}}(\Gamma)$ for the period (5.7) of the configuration of Feynman quadrics associated to $\Gamma$ and $P_{\mathrm{graph}}(\Gamma)$ for the graph period. We have by (6.10),
\[
P_{\mathrm{quadric}}(\Gamma) \in \mathbb{Q}^\times \pi^{-2n} P_{\mathrm{graph}}(\Gamma). \tag{9.1}
\]
Proposition 7.3 shows that there is a suitable birational transformation $\pi : P \to \mathbb{P}^{2n-1}$ defined over $\mathbb{Q}$, such that the integrand $\eta \in \Gamma(\mathbb{P}^{2n-1}, \omega(2X_\Gamma))$ keeps poles only along the strict transform $Y$ of the discriminant hypersurface $X_\Gamma$, that is $\pi^*(\eta) \in \Gamma(P, \omega(2Y))$. Thus, denoting by $B$ the total transform of the union of coordinate hyperplanes $A_i = 0$, the form $\eta$ yields a class
\[
\pi^* \eta \in \Gamma(P, \omega(2Y)) \to H^{2n-1}_{DR}(P \setminus Y, B \setminus B \cap Y) \tag{9.2}
\]
in relative de Rham cohomology. On the other hand, Proposition 7.3 shows that the strict transform $\tilde\sigma^{2n-1}(\mathbb{R})$ of the cycle of integration is disjoint from $Y$. Thus it yields a relative homology class
\[
\tilde\sigma^{2n-1}(\mathbb{R}) \in H_{2n-1}(P \setminus Y, B \setminus B \cap Y) = H^{2n-1}_{\mathrm{Betti}}(P \setminus Y, B \setminus B \cap Y)^\vee \tag{9.3}
\]
in Betti cohomology. More precisely

Claim 9.1. The period integral (5.3) $P_{\mathrm{quadric}}(\Gamma) \in \pi^{-2n} \mathbb{Q}^\times \cdot P_{\mathrm{graph}}(\Gamma)$, where $P_{\mathrm{graph}}(\Gamma)$ is a period of the cohomology $H^{2n-1}(P \setminus Y, B \setminus B \cap Y)$.

By period here we mean the integral of an algebraic de Rham form $\pi^* \eta$ defined over $\mathbb{Q}$ against a $\mathbb{Q}$-homology chain $\tilde\sigma^{2n-1}$. Suppose now, as has been established in a number of cases [5], that the period is related to a zeta value: $P_{\mathrm{quadric}}(\Gamma) \in \pi^{\mathbb{Z}} \mathbb{Q}^\times \zeta(p)$. Then the general guideline for what we wish to understand is the following.

One has now a good candidate for a triangulated category of mixed motives over $\mathbb{Q}$, defined by Voevodsky, Levine and Hanamura ([6], Sect. 1 and references there for the discussion here). One further considers the triangulated subcategory spanned by $\mathbb{Q}(n)$, $n \in \mathbb{Z}$. In this category, one has
\[
\mathrm{Hom}^j(\mathbb{Q}(0), \mathbb{Q}(p)) = \begin{cases} \mathbb{Q} & p = j = 0, \\ K_{2p-1}(\mathbb{Q}) \otimes \mathbb{Q} & p \ge 1, \ j = 1, \\ 0 & \text{else}. \end{cases} \tag{9.4}
\]
The iterated extensions of $\mathbb{Q}(n)$ form an abelian subcategory which is the heart of a t-structure. Borel's work on the K-theory of number fields [2, 14] tells us that $K_{2p-1}(\mathbb{Q}) \otimes \mathbb{Q} \cong \mathbb{Q}$ for $p = 2n - 3$, $n \ge 3$, so there is a one dimensional space of motivic extensions of $\mathbb{Q}(0)$ by $\mathbb{Q}(p)$. We want to understand their periods. Let $E$ be a nontrivial such extension. We write $E_{DR} = \mathbb{Q} \cdot e_0 \oplus \mathbb{Q} \cdot e_p$, with $F^0 E_{DR} = \mathbb{Q} e_0$. The Betti realization is $E_{\mathbb{C}} = \mathbb{C} \cdot e_0 \oplus \mathbb{C} \cdot e_p$ and $E_{\mathbb{Q}} = \mathbb{Q} \cdot (2\pi i)^p e_p \oplus \mathbb{Q} \cdot (e_0 + \beta e_p)$ for a suitable $\beta$. The corresponding Hodge structures on the $\mathbb{Q}(i)$ are
\[
(\mathbb{Q}(0)_{DR} = \mathbb{Q} \cdot \epsilon_0, \ \mathbb{Q}(0)_{\mathbb{Q}} = \mathbb{Q} \cdot \epsilon_0), \qquad (\mathbb{Q}(p)_{DR} = \mathbb{Q} \cdot \epsilon_p, \ \mathbb{Q}(p)_{\mathbb{Q}} = \mathbb{Q} \cdot (2\pi i)^p \epsilon_p). \tag{9.5}
\]
We have an exact sequence 0 → Q( p) → E → Q(0) → 0
(9.6)
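given by $\varepsilon_p \mapsto e_p$, $e_0 \mapsto \varepsilon_0$. The ambiguity here is that we can replace $e_0 + \beta e_p$ by $e_0 + (\beta + c(2\pi i)^p)e_p$ for $c \in \mathbb{Q}$ as a basis element for $E_{\mathbb{Q}}$, so $\beta \in \mathbb{C}/(2\pi i)^p\mathbb{Q}$ is well defined. In fact, $\mathrm{Ext}^1_{MHS}(\mathbb{Q}(0), \mathbb{Q}(p)) = \mathbb{C}/(2\pi i)^p\mathbb{Q}$ and $\beta$ is the class of $E$.

To compute the period, consider the dual object $E^\vee$, with $E^\vee_{DR} = \mathbb{Q}e_0^\vee \oplus \mathbb{Q}e_p^\vee$ and $E^\vee_{\mathbb{Q}} = \mathbb{Q}e_0^\vee \oplus \mathbb{Q}(2\pi i)^{-p}(e_p^\vee - \beta e_0^\vee)$. By definition, the period is obtained by pairing $F^0E_{DR}$ against a lifting in $E^\vee_{\mathbb{Q}}$ of the generator $(2\pi i)^{-p}e_p^\vee \in \mathbb{Q}(-p)_{\mathbb{Q}} = E^\vee_{\mathbb{Q}}/\mathbb{Q}(0)_{\mathbb{Q}}$. This yields
$$\langle e_0, (2\pi i)^{-p}(e_p^\vee - \beta e_0^\vee)\rangle = -(2\pi i)^{-p}\beta. \tag{9.7}$$
It is better from the period viewpoint to dualize and consider the period of $E^\vee$, which is an extension of $\mathbb{Q}(-p)$ by $\mathbb{Q}(0)$. This yields
$$\langle e_p^\vee, e_0 + \beta e_p\rangle = \beta. \tag{9.8}$$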
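The two pairings can be checked numerically. The following is a minimal sketch, assuming the coordinate conventions written in the comments; the function names and the sample values of `p` and `beta` are illustrative choices, not taken from the text.

```python
import cmath

def pair(vector, covector):
    """Evaluate a covector on a vector; both are coordinate pairs."""
    return sum(v * c for v, c in zip(vector, covector))

def extension_periods(p, beta):
    """Return the Gram matrix of E_Q against its dual lattice and the two pairings.

    Vectors are written in the basis (e_0, e_p), covectors in the dual
    basis (e_0^vee, e_p^vee). Both p and beta are hypothetical sample inputs.
    """
    tau = (2j * cmath.pi) ** p                        # (2*pi*i)^p
    e0 = (1, 0)
    ep_dual = (0, 1)
    lattice = [(0, tau), (1, beta)]                   # basis of E_Q
    dual_lattice = [(-beta / tau, 1 / tau), (1, 0)]   # basis of the dual lattice
    # If the second list really is the dual basis, this is the identity matrix.
    gram = [[pair(v, w) for w in dual_lattice] for v in lattice]
    period_97 = pair(e0, dual_lattice[0])   # pairing of e_0 with the lifted generator
    period_98 = pair(lattice[1], ep_dual)   # pairing of e_p^vee with e_0 + beta*e_p
    return gram, period_97, period_98
```

The Gram matrix being the identity confirms that the displayed basis of the dual lattice is dual to the basis of $E_{\mathbb{Q}}$, after which the two periods are read off directly.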
For $E$ a non-split motivic extension of $\mathbb{Q}(0)$ by $\mathbb{Q}(p)$, $p$ odd, $\ge 3$, let $\beta \in \mathbb{C}/(2\pi i)^p\mathbb{Q}$ be the extension class. Note $\mathrm{Im}(\beta) \in \mathbb{R}$ is well defined. One knows by the Borel regulator theory [2, 14] that $\zeta(p) \in \mathrm{Im}(\beta)\mathbb{Q}^\times$. Now consider our graph $\Gamma$ with period related to $\zeta(p)$. The motive $H^{2n-1}(P \setminus Y, B \setminus B \cap Y)$ has lowest weight piece $\mathbb{Q}(0)$, so we might expect to find inside it a subquotient motive of rank 2 which is an extension of $\mathbb{Q}(-p)$ by $\mathbb{Q}(0)$. By the above discussion, we would then hope
$$P_{graph}(\Gamma) \in \zeta(p)\mathbb{Q}^\times. \tag{9.9}$$
By (6.10) this would yield $P_{quadric}(\Gamma) \in \pi^{-2n}\zeta(p)\mathbb{Q}^\times$. For example, take $\Gamma = \Gamma_n$ to be the wheel with $n$ spokes. Then $p = 2n-3$ and we expect, if indeed the $\zeta$-values computed in [5] are motivic, to find
$$P_{graph}(\Gamma_n) \in \zeta(2n-3)\mathbb{Q}^\times; \qquad P_{quadric}(\Gamma_n) \in \pi^{-2n}\zeta(2n-3)\mathbb{Q}^\times. \tag{9.10}$$
The aim of the next sections is to show for the wheel and spoke family of examples what can be done motivically. We will show in particular
$$H^{2n-1}(\mathbb{P}^{2n-1} \setminus X) = \mathbb{Q}(-2n+3). \tag{9.11}$$
Moreover, $H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X)$ is spanned by $\eta$. Even in this special case, we are not able to find a suitable rank 2 subquotient motive of $H^{2n-1}(P \setminus Y, B \setminus B \cap Y)$.
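10. Correspondences

We will assume in this section that $\Gamma$ has $n$ loops and $2n$ edges. So one has $2n$ Feynman quadrics which we denote by $q_e$, of Eq. $Q_e$, see Sect. 5. Recall concretely that to an edge $e$, one associates coordinates $x_e(i)$, $i = 1, \dots, 4 = j$. Given an orientation of $\Gamma$, to a vertex $v$, one associates the relation $\sum_e \mathrm{sign}(v, e)x_e(i) = 0$ for all $i = 1, \dots, j = 4$. Then $q_e =: q_e^{4=j}$ is defined by $Q_e^j := \sum_{a=1}^{j} x_e(a)^2 = 0$ in $\mathbb{P}^{jn-1}$. One defines $\mathcal{Q} = \mathcal{Q}^j \subset \mathbb{P}^{jn-1} \times \mathbb{P}^{2n-1}$ by the equation $\sum_e A_e Q_e^j = 0$. This defines a correspondence
$$\begin{array}{ccc} \mathbb{P}^{2n-1} \times \mathbb{P}^{jn-1} \setminus \mathcal{Q}^j & \xrightarrow{\ \mathbb{A}^{2n-1}\text{-fibration}\ } & \mathbb{P}^{jn-1} \setminus \bigcap_{e=1}^{2n} q_e^j \\ {\scriptstyle \pi_j}\downarrow & & \\ \mathbb{P}^{2n-1} & & \end{array} \tag{10.1}$$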
P2n−1 We discuss now this correspondence for the Feynman quadrics, i.e. j = 4. On the other hand, we can consider all the definitions above for other j, and we discuss the resulting correspondence (10.1) for j = 1 and j = 2 as well. For j = 1, we rather consider the projection proj : Q1 → P2n−1 . Let us denote by ⊂ Q1 the closed subscheme with proj−1 (x) ∩ = Sing(proj−1 (x)). Then → X is the desingularization P(N ) → X studied in Proposition 4.2. We assume now j = 2. Recall that if Z ⊂ P2N +1 is a smooth even dimensional quadric, then % 0 j = 2N j 2N +1 Hc (P \ Z) = , (10.2) Q(−N )[1 − 2 ] j = 2N where i are the 2 rulings of Z . We define
X i = (A) ∈ P2n−1 , rk
Ae Q 1e
(10.3)
e
So X = X 0 , and X i+1 is the singular locus of X i . We denote by j = j0 : P2n−1 \ X → j P2n−1 , ji : X i−1 \ X i → X i−1 . Over X i , the quadric e Ae qe is a cone over a smooth j quadric e Ae qe ⊂ P j (n−i)−1 , thus by homotopy invariance and base change for R(π j )! ([7]), one obtains Proposition 10.1.
j! Q(−2n + 1) ( j ) Q(−2n − 1) 1 ! R i (π4 )! Q = . . . ( j ) Q(−2n + 1 − 2a) a !
i = 4n − 1 i = 4n + 3 , ... i = 4n + 4a
(10.4)
j! Q(−n + 1) ( j ) Q(−n) 1 ! R i (π2 )! Q = ... ( j ) Q(−2n + 1 − a) a !
i = 2n − 1 i = 4n + 1 . ... i = 2n + 2a
(10.5)
We draw now two consequences from this computation.
Proposition 10.2. One has maps
$$H_c^{2n-1}(\mathbb{P}^{2n-1} \setminus X) \to H_c^{2n}\Big(\mathbb{P}^{4n-1} \setminus \bigcap_{e=1}^{2n} q_e^4\Big) \to H_c^{4n-1}\Big(\mathbb{P}^{4n-1} \setminus \bigcup_{e=1}^{2n} q_e^4\Big),$$
in particular dually
$$H^{4n-1}\Big(\mathbb{P}^{4n-1} \setminus \bigcup_{e=1}^{2n} q_e^4\Big)(2n) \to H^{2n-1}(\mathbb{P}^{2n-1} \setminus X). \tag{10.6}$$

Proof. By (10.4), the term $E_2^{2n-1,4n-1} = H_c^{2n-1}(\mathbb{P}^{2n-1} \setminus X)(-2n+1)$ of the Leray spectral sequence for $\pi_4$ maps to $H_c^{2n-1+4n-1}(\mathbb{P}^{2n-1} \times \mathbb{P}^{4n-1} \setminus \mathcal{Q}^4)$, which in turn is equal to $H_c^{2n}(\mathbb{P}^{4n-1} \setminus \bigcap_{e=1}^{2n} q_e^4)(-2n+1)$ by homotopy invariance. The second map comes from the Mayer-Vietoris spectral sequence for $\bigcup_{e=1}^{2n} q_e^4$.

Remark 10.3. We will see in Sect. 11 on the wheel with $n$ spokes that for $n = 3$, the first map is an isomorphism, but in general, we do not control it.

Proposition 10.4. Assume $\bigcap_{e=1}^{2n} q_e^2 = \emptyset$, for example for the wheel with $n$ spokes (see Sect. 11). Then $H_c^{2n-1}(\mathbb{P}^{2n-1} \setminus X) = H^{2n-2}(X)/H^{2n-2}(\mathbb{P}^{2n-1})$ is supported along $X_a$ for some $a \ge 1$.

Proof. By homotopy invariance again and by assumption, we have
$$H_c^{2n-1+2n-1}(\mathbb{P}^{2n-1} \times \mathbb{P}^{2n-1} \setminus \mathcal{Q}^2) = H_c^0\Big(\mathbb{P}^{2n-1} \setminus \bigcap_{e=1}^{2n} q_e^2\Big) = 0. \tag{10.7}$$
So the Leray spectral sequence for $\pi_2$ together with (10.4) imply that $E_\infty^{2n-1,2n-1} = 0$, with $E_2^{2n-1,2n-1} = H_c^{2n-1}(\mathbb{P}^{2n-1} \setminus X)(-n+1)$. So, since $R^i(\pi_2)_!$ is supported in lower strata of $X$, this shows the proposition.

Remark 10.5. We will see in Sect. 11 on the wheel with $n$ spokes that for $n = 3$, the Leray spectral sequence will equate $H^0(X_1)(-1) \xrightarrow{\ \cong\ } H^4(X)/H^4(\mathbb{P}^5)$.

11. Wheel and Spokes

The purpose of this section is to compute the middle dimensional cohomology for a graph polynomial in a non-trivial case. The geometry we will be using involves only projections, homotopy invariance and Artin vanishing theorem. Consequently, our cohomology computation holds for Betti or étale cohomology, and would for motivic cohomology if one had Artin vanishing. To unify notations, we denote this cohomology as $H(?, \mathbb{Q})$ rather than $\mathbb{Q}_\ell$ in the $\ell$-adic case.

Fix $n \ge 3$ and let $\Gamma = WS_n$ be the graph which is a wheel with $n$ spokes. $WS_n$ has vertices $\{0, 1, \dots, n\}$ and edges $e_i = (0, i)$, $1 \le i \le n$ and $e_j = (j-n,\ j-n+1 \bmod n)$, $n+1 \le j \le 2n$. Suitably oriented, $\ell_i = e_i + e_{i+n} - e_{i+1 \bmod n}$, $1 \le i \le n$ form a basis for the loops. The following is straightforward.

Lemma 11.1. $\Gamma$ has $n$ loops and $2n$ edges. Every proper subgraph is convergent so the period $P(\Gamma)$ is defined (see Proposition 5.2).

Proof. Omitted.
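Let $T_i$, $1 \le i \le 2n$ be variables. The graph polynomial of $\Gamma$ can be written
$$\Psi_\Gamma(T) = \det\Big(\sum_{i=1}^{2n} T_i M^{(i)}\Big), \tag{11.1}$$
where
$$M^{(i)} = (M^{(i)}_{pq})_{1 \le p,q \le n}; \qquad M^{(i)}_{pq} = e_i^\vee(\ell_p)\,e_i^\vee(\ell_q). \tag{11.2}$$
It follows easily that
$$\Psi_\Gamma = \det\begin{pmatrix} T_1+T_2+T_{n+1} & -T_2 & 0 & \dots & 0 & -T_1 \\ -T_2 & T_2+T_3+T_{n+2} & -T_3 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \dots & \vdots & \vdots \\ -T_1 & 0 & 0 & \dots & -T_n & T_n+T_1+T_{2n} \end{pmatrix}. \tag{11.3}$$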
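The passage from (11.2) to (11.3) can be reproduced mechanically. The sketch below builds $\sum_i T_i M^{(i)}$ from the loop basis $\ell_p = e_p + e_{p+n} - e_{p+1 \bmod n}$; the helper names `det` and `wheel_matrix` are hypothetical. As a sanity check it uses the standard fact that a graph polynomial evaluated at $T_i = 1$ counts spanning trees, so for $n = 3$ (the complete graph on four vertices) the value should be 16.

```python
def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

def wheel_matrix(n, T):
    """Return the n x n matrix sum_i T[i] * M^(i) for the wheel with n spokes.

    T maps the edge indices 1..2n to numbers. coeff[p][i] is the coefficient
    of the edge e_i in the loop l_p, i.e. the value e_i^vee(l_p).
    """
    coeff = [[0] * (2 * n + 1) for _ in range(n + 1)]
    for p in range(1, n + 1):
        nxt = p % n + 1              # the index p+1 taken modulo n in 1..n
        coeff[p][p] += 1
        coeff[p][p + n] += 1
        coeff[p][nxt] -= 1
    return [[sum(T[i] * coeff[p][i] * coeff[q][i] for i in range(1, 2 * n + 1))
             for q in range(1, n + 1)]
            for p in range(1, n + 1)]

ones = {i: 1 for i in range(1, 7)}
print(det(wheel_matrix(3, ones)))    # 16 spanning trees of the complete graph K4
```

Replacing `ones` by distinct values makes it easy to compare individual entries with the matrix in (11.3).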
It will be convenient to make the change of variables
$$B_i = T_{i+1} + T_{i+2} + T_{i+1+n}, \qquad A_i = -T_{i+2}, \tag{11.4}$$
where all the indices are counted modulo $n$ and taken in $[0, \dots, n]$. Write
$$\Delta_n = \Delta_n(A, B) = \det\begin{pmatrix} B_0 & A_0 & 0 & \dots & \dots & A_{n-1} \\ A_0 & B_1 & A_1 & \dots & \dots & 0 \\ 0 & A_1 & B_2 & A_2 & \dots & 0 \\ \vdots & \vdots & \vdots & \vdots & \dots & \vdots \\ A_{n-1} & 0 & \dots & \dots & A_{n-2} & B_{n-1} \end{pmatrix}. \tag{11.5}$$
The graph hypersurface in the $A, B$-coordinates is given by
$$\mathbb{P}^{2n-1} \supset X_n : \Delta_n(A, B) = 0. \tag{11.6}$$
Define $H^*(X_n, \mathbb{Q})_{prim} := \mathrm{coker}(H^*(\mathbb{P}^{2n-1}, \mathbb{Q}) \to H^*(X_n, \mathbb{Q}))$. We formulate now our main theorem.

Theorem 11.2. Let $X_n \subset \mathbb{P}^{2n-1}$ be the graph polynomial hypersurface for the wheel with $n \ge 3$ spokes. Then one has $H^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \cong \mathbb{Q}(-2n+3)$ or equivalently, via duality, $H^{2n-2}(X_n, \mathbb{Q})_{prim} \cong \mathbb{Q}(-2)$. In particular, $H^{2n-2}(X_n, \mathbb{Q})_{prim}$ is independent of $n \ge 3$.
Proof. The proof is quite long and involves several geometric steps. We first define homogeneous polynomials $Q_{n-1}$ and $K_n$ as indicated:
$$\Delta_n = B_0\,Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) + K_n(B_1, \dots, B_{n-1}, A_0, \dots, A_{n-1}). \tag{11.7}$$
Here
$$Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) = \det\begin{pmatrix} B_1 & A_1 & 0 & \dots & \dots & 0 \\ A_1 & B_2 & A_2 & 0 & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots & \dots \\ 0 & \dots & \dots & \dots & A_{n-2} & B_{n-1} \end{pmatrix}. \tag{11.8}$$

Lemma 11.3. One has inductive formulae:
$$\begin{aligned} Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) &= B_1 Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2}) - A_1^2 Q_{n-3}(B_3, \dots, B_{n-1}, A_3, \dots, A_{n-2}) \\ &= B_{n-1} Q_{n-2}(B_1, \dots, B_{n-2}, A_1, \dots, A_{n-3}) - A_{n-2}^2 Q_{n-3}(B_1, \dots, B_{n-3}, A_1, \dots, A_{n-4}); \end{aligned} \tag{11.9}$$
and
$$\begin{aligned} K_n(B_1, \dots, B_{n-1}, A_0, \dots, A_{n-1}) = {} & -A_0^2 Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2}) \\ & - A_{n-1}^2 Q_{n-2}(B_1, \dots, B_{n-2}, A_1, \dots, A_{n-3}) + 2(-1)^{n-1} A_0 \cdots A_{n-1}. \end{aligned} \tag{11.10}$$

Proof. Straightforward.
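The formulae (11.7), (11.9) and (11.10) are polynomial identities, so they can be spot-checked on random integer inputs. The following is a minimal sketch; the helper names (`det`, `tridiag`, `Q`, `Delta`, `K`, `check_lemma_11_3`) are hypothetical, and a passing check on random samples is evidence for the identities, not a proof.

```python
import random

def det(m):
    """Integer determinant by cofactor expansion; the empty matrix has det 1."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

def tridiag(B, A):
    """Symmetric tridiagonal matrix with diagonal B and off-diagonal A."""
    m = len(B)
    M = [[0] * m for _ in range(m)]
    for i in range(m):
        M[i][i] = B[i]
    for i in range(m - 1):
        M[i][i + 1] = M[i + 1][i] = A[i]
    return M

def Q(B, A):
    """Q_m as in (11.8): diagonal B (length m), off-diagonal A (length m-1)."""
    return det(tridiag(B, A))

def Delta(B, A):
    """Delta_n as in (11.5): B = (B_0..B_{n-1}), A = (A_0..A_{n-1}), n >= 3."""
    n = len(B)
    M = tridiag(B, A[:n - 1])
    M[0][n - 1] = M[n - 1][0] = A[n - 1]      # the corner entries
    return det(M)

def K(B, A):
    """K_n as given by the right-hand side of (11.10)."""
    n = len(B)
    prod = 1
    for a in A:
        prod *= a
    return (-A[0] ** 2 * Q(B[2:], A[2:n - 1])
            - A[n - 1] ** 2 * Q(B[1:n - 1], A[1:n - 2])
            + 2 * (-1) ** (n - 1) * prod)

def check_lemma_11_3(n, rng):
    """Test (11.7), (11.9), (11.10) on one random integer sample."""
    B = [rng.randint(-9, 9) for _ in range(n)]
    A = [rng.randint(-9, 9) for _ in range(n)]
    Bq, Aq = B[1:], A[1:n - 1]                # the arguments of Q_{n-1}
    q = Q(Bq, Aq)
    first = Bq[0] * Q(Bq[1:], Aq[1:]) - Aq[0] ** 2 * Q(Bq[2:], Aq[2:])
    last = Bq[-1] * Q(Bq[:-1], Aq[:-1]) - Aq[-1] ** 2 * Q(Bq[:-2], Aq[:-2])
    return q == first == last and Delta(B, A) == B[0] * q + K(B, A)
```

The convention $Q_0 = 1$ is encoded by letting the empty determinant equal 1, which is what makes the case $n = 3$ work without special handling.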
The following lemma is a direct application of Artin's vanishing theorem [1], Théorème 3.1, and homotopy invariance, and will be the key ingredient to the computation.

Lemma 11.4. Let $V \subset \mathbb{P}^N$ be a hypersurface which is a cone over the hypersurface $W \subset \mathbb{P}^a$. Then one has $H^i(\mathbb{P}^N \setminus V) = 0$ for $i > a$ or equivalently $H_c^j(\mathbb{P}^N \setminus V) = 0$ for $j < 2N - a$.

Proof. The projection $\mathbb{P}^N \setminus V \to \mathbb{P}^a \setminus W$ is an $\mathbb{A}^{N-a}$-fibration. By homotopy invariance, $H_c^j(\mathbb{P}^N \setminus V) = H_c^{j-2(N-a)}(\mathbb{P}^a \setminus W)(-(N-a))$ and by Artin's vanishing $H_c^{j-2(N-a)}(\mathbb{P}^a \setminus W) = 0$ for $j - 2(N-a) < a$, i.e. for $j < 2N - a$.
For a homogeneous ideal $I$ or a finite set $F_1, F_2, \dots$ of homogeneous polynomials, we write $\mathcal{V}(I)$ or $\mathcal{V}(F_1, F_2, \dots)$ for the corresponding projective scheme. We will need to pass back and forth via various projections. In confusing situations we will try to specify the ambient projective space. A superscript $(i)$ will mean the ambient projective space is $\mathbb{P}^i$. In the following lemma, $\mathbb{P}^{2n-1}$ has coordinates $(B_0 : \dots : B_{n-1} : A_0 : \dots : A_{n-1})$ and $\mathbb{P}^{2n-2}$ drops the $B_0$.
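Lemma 11.5. We have
$$H^{2n-2}(X_n, \mathbb{Q}) \cong H^{2n-4}\big(\mathcal{V}(Q_{n-1}, K_n)^{(2n-2)}, \mathbb{Q}(-1)\big). \tag{11.11}$$

Proof. By (11.7), one has
$$X_n \cap \mathcal{V}(Q_{n-1}) = \mathcal{V}(Q_{n-1}, K_n)^{(2n-1)}. \tag{11.12}$$
Let $p = (1, 0, \dots, 0) \in \mathbb{P}^{2n-1}$. Projection from $p$ gives an isomorphism (use (11.7) to solve for $B_0$)
$$\pi_p : X_n \setminus X_n \cap \mathcal{V}(Q_{n-1}) \cong \mathbb{P}^{2n-2} \setminus \mathcal{V}(Q_{n-1}). \tag{11.13}$$
We get a long exact sequence
$$H_c^{2n-2}(\mathbb{P}^{2n-2} \setminus \mathcal{V}(Q_{n-1})) \to H^{2n-2}(X_n) \to H^{2n-2}(\mathcal{V}(K_n, Q_{n-1})^{(2n-1)}) \to H_c^{2n-1}(\mathbb{P}^{2n-2} \setminus \mathcal{V}(Q_{n-1})). \tag{11.14}$$
Since the polynomial $Q_{n-1}$ does not involve $A_0$ or $A_{n-1}$, we can apply Lemma 11.4 with $N = 2n-2$ and $a = 2n-4$ to deduce
$$H_c^i(\mathbb{P}^{2n-2} \setminus \mathcal{V}(Q_{n-1})) = (0), \quad i < 2n. \tag{11.15}$$
We conclude
$$H^{2n-2}(X_n) \cong H^{2n-2}(\mathcal{V}(K_n, Q_{n-1})^{(2n-1)}). \tag{11.16}$$
The projection $\pi_p$ is an $\mathbb{A}^1$-fibration, $\mathcal{V}(K_n, Q_{n-1})^{(2n-1)} - p \to \mathcal{V}(K_n, Q_{n-1})^{(2n-2)}$, and we obtain
$$H^{2n-2}(\mathcal{V}(K_n, Q_{n-1})^{(2n-1)}) \cong H_c^{2n-2}(\mathcal{V}(K_n, Q_{n-1})^{(2n-1)} - p) \cong H^{2n-4}(\mathcal{V}(K_n, Q_{n-1})^{(2n-2)})(-1), \tag{11.17}$$
where the first isomorphism holds because $2n - 2 > 0$.

We now consider the line $\ell$ with coordinate functions $A_0, A_{n-1}$,
$$\ell \subset \mathbb{P}^{2n-2}(B_1 : \dots : B_{n-1} : A_0 : \dots : A_{n-1}), \qquad \ell : B_1 = \dots = B_{n-1} = A_1 = \dots = A_{n-2} = 0. \tag{11.18}$$
One has $\ell \subset \mathcal{V}(Q_{n-1}, K_n)^{(2n-2)}$. The sequence
$$0 \to H_c^{2n-4}(\mathcal{V}(Q_{n-1}, K_n)^{(2n-2)} \setminus \ell) \to H^{2n-4}(\mathcal{V}(Q_{n-1}, K_n)^{(2n-2)}) \to H^{2n-4}(\ell) \tag{11.19}$$
together with the previous lemma implies
$$H_c^{2n-4}(\mathcal{V}(Q_{n-1}, K_n)^{(2n-2)} \setminus \ell)(-1) \cong H^{2n-2}(\tilde{X}_n, \mathbb{Q}), \tag{11.20}$$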
where $H^{2n-2}(X_n) = H^{2n-2}(\tilde{X}_n)$ for $n > 3$, and for $n = 3$, $H^4(\tilde{X}_3) = \ker(H^4(X_3) \to H^2(\ell)(-1)) \cong H^4(X_3)_{prim}$.

The next step is now motivated by the shape of the matrix (11.5). If we wish to induct on $n$, we have to find the geometry which gets rid of the corner term $A_{n-1}$ in the matrix. We project further to $\mathbb{P}^{2n-4} = \mathbb{P}^{2n-4}(B_1 : \dots : B_{n-1} : A_1 : \dots : A_{n-2})$. Let
$$r : \mathcal{V}(Q_{n-1}, K_n)^{(2n-2)} \setminus \ell \to \mathcal{V}(Q_{n-1})^{(2n-4)} \tag{11.21}$$
be the projection with center $\ell$. It is clear from (11.10) that the fibres of $r$ are conics in the variables $A_0, A_{n-1}$ with discriminant
$$\delta_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) := Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2}) \cdot Q_{n-2}(B_1, \dots, B_{n-2}, A_1, \dots, A_{n-3}) - (A_1 \cdots A_{n-2})^2. \tag{11.22}$$
We show that in fact the situation is degenerate:

Lemma 11.6. One has
$$\delta_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) = Q_{n-3}(B_2, \dots, B_{n-2}, A_2, \dots, A_{n-3}) \cdot Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}). \tag{11.23}$$
In particular, the general fibre of $r$ in (11.21) is a double line (so $\{Q_{n-1} = K_n = 0\}$ is non-reduced).

Proof. We compute in the ring
$$K\Big[B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}, \frac{1}{Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2})}\Big]. \tag{11.24}$$
One has
$$B_1 = \frac{A_1^2\,Q_{n-3}(B_3, \dots, B_{n-1}, A_3, \dots, A_{n-2})}{Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2})} + \frac{Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2})}{Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2})}. \tag{11.25}$$
This yields
$$\begin{aligned} \delta_{n-1} = {} & A_1^2\Big[\delta_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2}) - Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2}) \cdot Q_{n-4}(B_3, \dots, B_{n-2}, A_3, \dots, A_{n-3})\Big] \\ & + Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}) \cdot Q_{n-3}(B_2, \dots, B_{n-2}, A_2, \dots, A_{n-3}). \end{aligned} \tag{11.26}$$
We now argue by induction starting with $n = 3$:
$$\delta_{3-1} = B_1B_2 - A_1^2 = Q_2(B_1, B_2, A_1) \cdot 1. \tag{11.27}$$
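The factorization (11.23) can be spot-checked numerically. The sketch below evaluates both sides on random integers; the helper names (`det`, `Q`, `delta`, `check_lemma_11_6`) are hypothetical, and the lists `B`, `A` stand for $(B_1, \dots, B_{n-1})$ and $(A_1, \dots, A_{n-2})$.

```python
import random

def det(m):
    """Integer determinant by cofactor expansion; the empty matrix has det 1."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

def Q(B, A):
    """Tridiagonal determinant with diagonal B (length m), off-diagonal A (length m-1)."""
    m = len(B)
    M = [[0] * m for _ in range(m)]
    for i in range(m):
        M[i][i] = B[i]
    for i in range(m - 1):
        M[i][i + 1] = M[i + 1][i] = A[i]
    return det(M)

def delta(B, A):
    """Discriminant (11.22) for B = (B_1..B_m), A = (A_1..A_{m-1}), m = n-1 >= 2."""
    prod = 1
    for a in A:
        prod *= a
    return Q(B[1:], A[1:]) * Q(B[:-1], A[:-1]) - prod ** 2

def check_lemma_11_6(m, rng):
    """Compare both sides of (11.23) on one random integer sample."""
    B = [rng.randint(-9, 9) for _ in range(m)]
    A = [rng.randint(-9, 9) for _ in range(m - 1)]
    return delta(B, A) == Q(B[1:-1], A[1:-1]) * Q(B, A)
```

For $m = 2$ the inner factor is the empty determinant $Q_0 = 1$, which reproduces the base case (11.27).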
From Lemma 11.6 we see that the reduced scheme $\mathcal{V}(Q_{n-1}, K_n)_{red} \setminus \ell$ is fibred over $\mathcal{V}(Q_{n-1})^{(2n-4)} \subset \mathbb{P}^{2n-4}$ with general fibre $\mathbb{A}^1$. The fibres jump to $\mathbb{A}^2$ over the closed set
$$Z_{n-1} : \mathcal{V}\big(Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}),\ Q_{n-2}(B_1, \dots, B_{n-2}, A_1, \dots, A_{n-3}),\ Q_{n-2}(B_2, \dots, B_{n-1}, A_2, \dots, A_{n-2})\big). \tag{11.28}$$
As a consequence, we get an exact sequence
$$H^{2n-9}(Z_{n-1})(-3) \to H_c^{2n-6}(\mathcal{V}(Q_{n-1})^{(2n-4)} \setminus Z_{n-1})(-2) \to H^{2n-2}(\tilde{X}_n) \to H^{2n-8}(Z_{n-1})(-3) \tag{11.29}$$
with the tilde as in (11.20).

Lemma 11.7.
(i) The restriction map $H^i(\mathbb{P}^{2n-4}) \to H^i(Z_{n-1})$ is surjective for $i < 2n-7$.
(ii) $Z_2 = \emptyset$.
(iii) For $n \ge 4$ we have $H^{2n-7}(Z_{n-1}) \cong H_c^{2n-6}(\{Q_{n-1} = 0\}^{(2n-4)} \setminus Z_{n-1})$.

Proof. (i) $Z_{n-1}$ is defined by 3 equations, thus by Artin's vanishing theorem $H_c^i(\mathbb{P}^{2n-4} \setminus Z_{n-1})$ vanishes for $i < 2n-6$.
(ii) One has $Z_2 : B_1B_2 - A_1^2 = B_1 = B_2 = 0$ in $\mathbb{P}^2(B_1 : B_2 : A_1)$, so $Z_2 = \emptyset$.
(iii) For $n \ge 4$ we have
$$H^{2n-7}(\mathcal{V}(Q_{n-1})^{(2n-4)}) \to H^{2n-7}(Z_{n-1}) \to H_c^{2n-6}(\mathcal{V}(Q_{n-1})^{(2n-4)} \setminus Z_{n-1}) \to H^{2n-6}(\mathcal{V}(Q_{n-1})^{(2n-4)}) \to H^{2n-6}(Z_{n-1}). \tag{11.30}$$
Since $H^i(\mathbb{P}^{2n-4}) \twoheadrightarrow H^i(\mathcal{V}(Q_{n-1})^{(2n-4)})$ for $i \le 2n-6$, the lemma follows.

Now we may put together Lemma 11.7 and (11.29) to deduce

Lemma 11.8. We have
$$H^{2n-7}(Z_{n-1})(-2) \cong H^{2n-2}(X_n)/H^{2n-2}(\mathbb{P}^{2n-1});\ n \ge 4, \qquad H^4(X_3)/H^4(\mathbb{P}^5) \cong H^0(\mathcal{V}(Q_2)^{(2)})(-2) = \mathbb{Q}(-2). \tag{11.31}$$

In order to prove Theorem 11.2 it will therefore suffice to prove

Theorem 11.9. Let
$$Z_n := \mathcal{V}\big(Q_n(B_1, \dots, B_n, A_1, \dots, A_{n-1}),\ Q_{n-1}(B_1, \dots, B_{n-1}, A_1, \dots, A_{n-2}),\ Q_{n-1}(B_2, \dots, B_n, A_2, \dots, A_{n-1})\big). \tag{11.32}$$
Then, for $n \ge 3$ we have $H^{2n-5}(Z_n, \mathbb{Q}) \cong \mathbb{Q}(0)$.
Q p (i) := Q p (Bi , . . . , Bi+ p−1 , Ai , . . . , Ai+ p−2 ).
(11.33)
Given a closed subvariety V ⊂ P N , write (V ) ≥ r if the restriction maps H i (P N ) → H i (V ) are surjective for all i ≤ r . (It is equivalent to require these maps to be an isomorphism for i ≤ min(2 dim V, r ).) For V = V(I ) it is convenient to write (I ) := (V(I )). For example a linear subspace has = ∞. A disjoint union of 2 points has = −1. In what follows, the term variety is used loosely to mean a reduced (but not necessarily irreducible) algebraic scheme over a field. We begin with some elementary properties of . Lemma 11.10. Let L ⊂ P N be a linear subspace of dimension p. Let π : P N \ L → P N − p−1 be the projection with center L. For V ⊂ P N − p−1 a closed subvariety, write (abusively) π −1 (V ) ⊂ P N for the cone over V . Then (π −1 (V )) = (V ) + 2( p + 1). Proof. π: P N \ L → P N − p−1 is an A p+1 -bundle. By homotopy invariance, we have a commutative diagram i+2( p+1)
(P N \ L) ∼
=
Hc
i+2( p+1)
−−−−→ Hc
surj.
H i (P N − p−1 )(− p − 1) −−−−→
(π −1 (V )\ L) ∼
=
(11.34)
H i (V )(− p − 1).
The bottom horizontal map is surjective for i ≤ (V ), so the top map is surjective in that range as well. Now consider the diagram 0 −−−−→
j
Hc (P N \ L) surj.
−−−−→
a
H j (P N ) −−−−→ H j (L) −−−−→ 0 c (11.35)
b
0 −−−−→ Hc (π −1 (V )\ L) −−−−→ H j (π −1 V ) −−−−→ H j (L) −−−−→ 0 j
Note the maps a, b are surjective in all degrees, so we get short-exact sequences for all j. The left-hand vertical map is surjective if and only if the central map c is surjective. Since the left-hand map is surjective for j ≤ (V ) + 2( p + 1) by (11.34), the lemma follows.
Lemma 11.11. Let V, W ⊂ P N be closed subvarieties. If V ∩ W = ∅, then
(V ∪ W ) ≥ min (V ), (W ), 2 dim(V ∩ W ), (V ∩ W ) + 1 .
(11.36)
Proof. We use Mayer-Vietoris, H i−1 (V ) ⊕ H i−1 (W ) → H i−1 (V ∩ W ) → H i (V ∪ W ) g
→ H i (V ∩ W ). → H i (V ) ⊕ H i (W ) − (11.37) Note in general if we have A ⊂ B ⊂ P N , then H i (B) H i (A) for i ≤ (A). Thus, for i ≤ (V ∩ W ) + 1 we get g
→ H i (V ∩ W ). 0 → H i (V ∪ W ) → H i (V ) ⊕ H i (W ) − For i ≤ min((W ), 2 dim(V ∩ W )) the map g above is injective on 0 ⊕ dim H i (V ∪ W ) ≤ dim H i (V ) and the lemma follows.
(11.38) H i (W ),
so
The proof of Theorem 11.9 proceeds by writing Z n = V(Q n (1), Q n−1 (1)) ∩ V(Q n (1), Q n−1 (2))
(11.39)
from (11.32). We remark that the automorphism of projective space given by B1 → Bn , B2 → Bn−1 , . . . A1 → An−1 , . . . An−1 → A1
(11.40)
carries Q n (1) → Q n (1) and Q n−1 (1) → Q n−1 (2) so the varieties on the right in (11.39) are isomorphic. Lemma 11.12. We have (Q 2 (1), Q 1 (2)) = (Q 2 (1), Q 1 (1)) = (Q 2 (1)) = ∞.
(11.41)
(Q n (1), Q n−1 (2)), (Q n (1), Q n−1 (1)), (Q n (1)) ≥ 2n − 3.
(11.42)
For n ≥ 3,
Proof. We write an := (Q n (1)), bn := (Q n (1), Q n−1 (2)).
(11.43)
(Using the automorphism (11.40), we need only consider these.) We have Q 2 (1) = B1 B2 − A21 , Q 1 (i) = Bi
(11.44)
from which the lemma is immediate in the case n = 2. For n = 3 we have the exact sequence Hci (P3 \ V(Q 2 (2))(3) ) → H i (V(Q 3 (1))) → H i (V(Q 3 (1), Q 2 (2)))
(11.45)
(cf. (11.48) below). Since (V(Q 2 (2))) = ∞, the group on the left vanishes for i < 6. On the other hand V(Q 3 (1), Q 2 (2)) = {B3 = A2 = 0} ∪ {A1 = B2 B3 − A22 = 0} ⊂ P4 (B1 , B2 , B3 , A1 , A2 ).
(11.46)
Each of the two pieces on the right has = ∞. Their intersection is the linear space L :={A2 = A1 = B3 = 0} which is a line. Lemma 11.11 gives b3 := (Q 3 (1), Q 2 (2)) ≥ 2, but we can consider directly the situation for H 3 , . . . H 2 (L) → H 3 (V(Q 3 (1), Q 2 (2))) → 0 ⊕ 0,
(11.47)
and conclude a3 ≥ b3 ≥ 3 = max(3, 2 · 4 − 5). The proof of the lemma for n ≥ 4 is recursive. We have, projecting from the point B1 = 1, Bi = A j = 0 using (11.9),
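$$H_c^i\big(\mathcal{V}(Q_n(1)) \setminus \mathcal{V}(Q_n(1), Q_{n-1}(2))\big) \to H^i\big(\mathcal{V}(Q_n(1))\big) \to H^i\big(\mathcal{V}(Q_n(1), Q_{n-1}(2))\big), \tag{11.48}$$
where the group on the left is isomorphic to $H_c^i(\mathbb{P}^{2n-3} \setminus \mathcal{V}(Q_{n-1}(2))^{(2n-3)})$.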
Dropping the variable $A_1$, $\mathbb{P}^{2n-3} \setminus \mathcal{V}(Q_{n-1}(2))^{(2n-3)}$ becomes an $\mathbb{A}^1$-bundle over $\mathbb{P}^{2n-4} \setminus \mathcal{V}(Q_{n-1}(2))^{(2n-4)}$, so
$$H_c^i\big(\mathcal{V}(Q_n(1)) \setminus \mathcal{V}(Q_n(1), Q_{n-1}(2))\big) = 0 \quad \text{for } i \le a_{n-1} + 3. \tag{11.49}$$
We conclude from (11.36) that
$$a_n \ge \min(a_{n-1} + 3, b_n). \tag{11.50}$$
As a consequence of (11.9),
$$(Q_n(1), Q_{n-1}(2)) = \big(B_1Q_{n-1}(2) - A_1^2Q_{n-2}(3),\ Q_{n-1}(2)\big) = \big(A_1^2Q_{n-2}(3),\ Q_{n-1}(2)\big). \tag{11.51}$$
In terms of $\mathcal{V}$ this reads
$$\mathcal{V}(Q_n(1), Q_{n-1}(2)) = \mathcal{V}(Q_{n-2}(3), Q_{n-1}(2))^{(2n-2)} \cup \mathcal{V}(Q_{n-1}(2), A_1)^{(2n-2)}. \tag{11.52}$$
The varieties on the right are cones with fibres of dimensions 2 and 1 respectively. From Lemmas 11.10 and 11.11 we conclude
$$b_n \ge \min\big(b_{n-1} + 4,\ a_{n-1} + 2,\ 2\dim\mathcal{V}(Q_{n-1}(2), Q_{n-2}(3)) + 2,\ b_{n-1} + 3\big) = \min(a_{n-1} + 2,\ b_{n-1} + 3,\ 4n - 10). \tag{11.53}$$
Starting with $a_3, b_3 \ge 3$ and plugging recursively into (11.53) and (11.50), the inequalities of the lemma, $a_n, b_n \ge 2n - 3$, follow.
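The recursive plugging can be carried out explicitly. The sketch below propagates the lower bounds by treating (11.53) and (11.50) as equalities for the worst case, which is legitimate because both right-hand sides are non-decreasing in $a_{n-1}$, $b_{n-1}$ and $b_n$; the function name `lower_bounds` is hypothetical.

```python
def lower_bounds(n_max):
    """Propagate the lower bounds for a_n and b_n from n = 3 up to n_max.

    a[n] and b[n] hold guaranteed lower bounds, starting from a_3, b_3 >= 3.
    """
    a = {3: 3}
    b = {3: 3}
    for n in range(4, n_max + 1):
        b[n] = min(a[n - 1] + 2, b[n - 1] + 3, 4 * n - 10)   # bound (11.53)
        a[n] = min(a[n - 1] + 3, b[n])                       # bound (11.50)
    return a, b

a, b = lower_bounds(10)
print([(n, a[n], b[n]) for n in sorted(a)])
```

In every step the minimum in (11.53) is attained by $a_{n-1} + 2$, so both bounds grow by exactly 2 and stay equal to $2n - 3$.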
We return now to the proof of Theorem 11.9.

Lemma 11.13. We have the decompositions
$$\mathcal{V}(Q_n(1), Q_{n-1}(2)) = \mathcal{V}(A_1, Q_{n-1}(2)) \cup \mathcal{V}(A_2, Q_{n-2}(3)) \cup \dots \cup \mathcal{V}(A_{n-1}, B_n), \tag{11.54}$$
$$\mathcal{V}(Q_n(1), Q_{n-1}(1)) = \mathcal{V}(A_{n-1}, Q_{n-1}(1)) \cup \mathcal{V}(A_{n-2}, Q_{n-2}(1)) \cup \dots \cup \mathcal{V}(A_1, B_1), \tag{11.55}$$
$$\mathcal{V}(Q_n(1), Q_{n-1}(1)) \cup \mathcal{V}(Q_n(1), Q_{n-1}(2)) = \mathcal{V}(A_1, Q_n(1)) \cup \mathcal{V}(A_2, Q_n(1)) \cup \dots \cup \mathcal{V}(A_{n-1}, Q_n(1)) = \mathcal{V}\Big(\prod_{i=1}^{n-1} A_i,\ Q_n(1)\Big). \tag{11.56}$$

Proof. For (11.54), we appeal repeatedly to (11.9),
$$\mathcal{V}(Q_n(1), Q_{n-1}(2)) = \mathcal{V}(A_1, Q_{n-1}(2)) \cup \mathcal{V}(Q_{n-1}(2), Q_{n-2}(3)) = \dots. \tag{11.57}$$
To prove (11.55), we apply the automorphism (11.40) to (11.54). Finally, from the determinant formula (11.8) one sees the congruences
$$Q_n(1) \equiv Q_p(1) \cdot Q_{n-p}(p+1) \bmod A_p; \quad 1 \le p \le n-1. \tag{11.58}$$
We can use these to combine the $\mathcal{V}(A_i, *)$ from (11.54) and (11.55).
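The congruence (11.58) says that the tridiagonal determinant splits into two blocks once $A_p$ is set to zero. A minimal numerical sketch, with hypothetical helper names and the lists `B`, `A` standing for $(B_1, \dots, B_n)$ and $(A_1, \dots, A_{n-1})$:

```python
import random

def det(m):
    """Integer determinant by cofactor expansion; the empty matrix has det 1."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total

def Q(B, A):
    """Tridiagonal determinant with diagonal B (length m), off-diagonal A (length m-1)."""
    m = len(B)
    M = [[0] * m for _ in range(m)]
    for i in range(m):
        M[i][i] = B[i]
    for i in range(m - 1):
        M[i][i + 1] = M[i + 1][i] = A[i]
    return det(M)

def check_congruence(n, p, rng):
    """Check Q_n(1) = Q_p(1) * Q_{n-p}(p+1) on a random sample with A_p = 0."""
    B = [rng.randint(-9, 9) for _ in range(n)]
    A = [rng.randint(-9, 9) for _ in range(n - 1)]
    A[p - 1] = 0                          # impose A_p = 0 (A is 0-indexed here)
    left_block = Q(B[:p], A[:p - 1])      # Q_p(1)
    right_block = Q(B[p:], A[p:])         # Q_{n-p}(p+1)
    return Q(B, A) == left_block * right_block
```

Setting several $A_{i_0}, \dots, A_{i_p}$ to zero at once splits the matrix into more blocks in the same way.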
The idea now is to use Mayer-Vietoris on (11.39) and (11.56). We get
$$\begin{aligned} & H^{2n-5}\big(\mathcal{V}(Q_n(1), Q_{n-1}(2))\big) \oplus H^{2n-5}\big(\mathcal{V}(Q_n(1), Q_{n-1}(1))\big) \to H^{2n-5}(Z_n) \to H^{2n-4}\Big(\mathcal{V}\Big(\prod_{i=1}^{n-1} A_i,\ Q_n(1)\Big)\Big) \\ & \quad \to H^{2n-4}\big(\mathcal{V}(Q_n(1), Q_{n-1}(2))\big) \oplus H^{2n-4}\big(\mathcal{V}(Q_n(1), Q_{n-1}(1))\big) \to H^{2n-4}(Z_n). \end{aligned} \tag{11.59}$$
The vanishing results from Lemma 11.12 now yield
$$H^{2n-5}(Z_n) \cong H^{2n-4}\Big(\mathcal{V}\Big(\prod_{i=1}^{n-1} A_i,\ Q_n(1)\Big)\Big)\Big/ H^{2n-4}(\mathbb{P}^{2n-2}). \tag{11.60}$$
The final step in the proof of Theorem 11.9 will be to analyse the spectral sequence
$$E_1^{p,q} = \bigoplus_{i_0, \dots, i_p} H^q\big(\mathcal{V}(A_{i_0}, \dots, A_{i_p}, Q_n(1))\big) \Rightarrow H^{p+q}\Big(\mathcal{V}\Big(\prod_{i=1}^{n-1} A_i,\ Q_n(1)\Big)\Big). \tag{11.61}$$
We can calculate $H^q\big(\mathcal{V}(A_{i_0}, \dots, A_{i_p}, Q_n(1))\big)$ as follows. Write $n_0 = i_0$, $n_1 = i_1 - i_0$, $\dots$, $n_p = i_p - i_{p-1}$, $n_{p+1} = n - i_p$. Thus we have a partition $n = \sum_{j=0}^{p+1} n_j$. As in (11.58) we may factor
$$Q_n(1)|_{A_{i_0} = \dots = A_{i_p} = 0} = Q_{n_0}(1)\,Q_{n_1}(i_0+1) \cdots Q_{n_{p+1}}(i_p+1)|_{A_{i_0} = \dots = A_{i_p} = 0}. \tag{11.62}$$
Each $Q_{n_j}(i_{j-1}+1)$ is a homogeneous function on $\mathbb{P}^{2n_j-2}$. Note if $n_j = 1$, $Q_1(i) = B_i$ is a homogeneous function on $\mathbb{P}^0$. (The homogeneous coordinate ring of $\mathbb{P}^0$ is a polynomial ring in one variable.) We have linear spaces
$$L_j \subset \mathbb{P}^{2n-p-3}(A_1, \dots, \widehat{A_{i_0}}, \dots, \widehat{A_{i_p}}, \dots, A_{n-1}, B_1, \dots, B_n)$$
and cone maps $\pi_j : \mathbb{P}^{2n-p-3} \setminus L_j \to \mathbb{P}^{2n_j-2}$. (When $n_j = 1$, $L_j$ is a hyperplane.) Then $\mathcal{V}(A_{i_0}, \dots, A_{i_p}, Q_n(1))$ is the union of the cones $\pi_j^{-1}(\mathcal{V}(Q_{n_j}(i_j)))$. (When $n_j = 1$, the cone is just $L_j$.) Write $U_j = \mathbb{P}^{2n_j-2} \setminus \mathcal{V}(Q_{n_j}(i_j))$ ($U_j = \mathrm{pt}$ when $n_j = 1$) and
$$U = \mathbb{P}^{2n-p-3} \setminus \bigcup_{j=0}^{p+1} \pi_j^{-1}(\mathcal{V}(Q_{n_j}(i_j))).$$
The map $\prod \pi_j : U \to \prod_{j=0}^{p+1} U_j$ is a $\mathbb{G}_m^{p+1}$-bundle. Thus
$$H_c^*\big(\mathbb{P}^{2n-p-3} \setminus \mathcal{V}(A_{i_0}, \dots, A_{i_p}, Q_n(1))\big) = H_c^*(U) \cong H_c^*(\mathbb{G}_m^{p+1}) \otimes \bigotimes_{j=0}^{p+1} H_c^*(U_j). \tag{11.63}$$
Suppose now that some $n_j > 1$. Then, by Lemma 11.12, these cohomology groups vanish in degrees less than or equal to
$$p + 1 + \sum_{j=0}^{p+1} (2n_j - 2) = 2n - p - 3. \tag{11.64}$$
(11.65)
Note this includes the middle dimensional cohomology. The exceptional case is when all the n j = 1. Then p = n −2. Formula (11.64) would n−1 (U ) = 0. We have suggest Hc∗ (U ) = (0), ∗ < n, but in fact U ∼ = Gn−1 m has Hc n
n−2,q q q E1 = H V(A1 , . . . , An−1 , Q n (1)) = H V Bi . (11.66) i=1 p,q
It follows that E 2n−2,n−2 = Q, and E 2 has E 20,2n−4 = ker
n−1
) i=1
→
)
= (0) for p + q = 2n − 4, if p = 0, n − 2. One
H 2n−4 (V(Ai , Q n (1)))
H 2n−4 (V(Ai1 , Ai2 , Q n (1))) .
(11.67)
I ={i 1 ,i 2 }
Again by (11.65) E 20,2n−4 = Q is generated by the class of the hyperplane section. Finally, the differential dr reads p−r,q+r −1
Er
p,q
→ Er
p+r,q−r +1
→ Er
.
(11.68)
We have r ≥ 2. In the case p + q = 2n − 4, the group on the left vanishes by (11.65), the group in the middle vanishes for p = 0, n − 2, and the group on the right vanishes p,q p,q for p = n − 2 because we have only n − 1 components. It follows that Er +1 ∼ = Er . We conclude from (11.60), H 2n−5 (Z n ) ∼ = Q(0). This completes the proof of Theorem 11.9.
(11.69)
By Lemma 11.8, Theorem 11.2 follows from Theorem 11.9. This completes the proof of Theorem 11.2.
12. de Rham Class

Let $X_n \subset \mathbb{P}^{2n-1}$ be the graph hypersurface associated to the wheel and spoke graph with $n$ spokes as in Sect. 11. By the results in that section, we know that de Rham cohomology fulfills $H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X_n) \cong K$. Our objective here is to show this is generated by
$$\eta_n := \frac{\Omega_{2n-1}}{\Delta_n^2} \in \Gamma(\mathbb{P}^{2n-1}, \omega(2X_n)) \tag{12.1}$$
(cf. (6.10)), i.e. we show that $[\eta_n] \ne 0$ in $H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X_n)$. To a certain point, the argument is general and applies to the form $\eta_\Gamma$ attached to any graph $\Gamma$ with $n$ loops and $2n$ edges. In this generality it is true that $[\eta_\Gamma]$ lies in the second level of the coniveau filtration. We do not give the proof here.
Lemma 12.1. Let $U = \mathrm{Spec}\,R$ be a smooth, affine variety, and let $0 \ne f, g \in R$ be functions. Let $Z : f = g = 0$ in $U$. We have a map of complexes
$$\big(\Omega^*_{R[1/f]}/\Omega^*_R\big) \oplus \big(\Omega^*_{R[1/g]}/\Omega^*_R\big) \to \Omega^*_{R[1/fg]}/\Omega^*_R. \tag{12.2}$$
Then the de Rham cohomology with supports $H^*_{Z,DR}(U)$ is computed by the cone of (12.2) shifted by $-2$.

Proof. The localization sequence identifies
$$H^*_{\{f=0\},DR}(U) = H^*\big(\Omega^*_{R[1/f]}/\Omega^*_R[-1]\big) \tag{12.3}$$
(resp. replace $f$ by $g$, resp. $fg$.) The assertion of the lemma follows from the exact sequence for $X, Y \subset U$
$$\dots \to H^*_{X \cap Y} \to H^*_X \oplus H^*_Y \to H^*_{X \cup Y} \to H^{*+1}_{X \cap Y} \to \dots. \tag{12.4}$$

Remark 12.2. Evidently, this cone is quasi-isomorphic to the cone of
$$\Omega^*_{R[1/f]}/\Omega^*_R \to \Omega^*_{R[1/fg]}/\Omega^*_{R[1/g]}. \tag{12.5}$$
For the application, $U = \mathbb{P}^{2n-1} \setminus X_n$. To facilitate computations, it is convenient to localize further and invert a homogeneous coordinate as well. We take $a_i = \frac{A_i}{A_{n-1}}$ and $b_i = \frac{B_i}{A_{n-1}}$, (11.4). (We will check that the forms we work with have no poles along $A_{n-1} = 0$.) We write $Q_p(i)$ as in (11.33). Let $q_p(i) = \frac{Q_p(i)}{A_{n-1}^p}$ (resp. $\kappa_n = \frac{K_n}{A_{n-1}^n}$ with $K_n$ as in (11.7)). Take $f = q_{n-1}(1)$, $g = q_{n-2}(2)$. The local defining equation $X_n : b_0q_{n-1}(1) + \kappa_n$ has been inverted in $U$, so $\kappa_n$ is invertible on $f = 0$ and the element
$$\beta := -db_1 \wedge \dots \wedge db_{n-1} \wedge da_0 \wedge \dots \wedge da_{n-2} \cdot \frac{1}{\kappa_n}\Big(\frac{1}{q_{n-1}(1)} - \frac{b_0}{b_0q_{n-1}(1) + \kappa_n}\Big) \tag{12.6}$$
is defined in $\Omega^{2n-2}_{R[1/f]}/\Omega^{2n-2}_R$ and satisfies
$$d\beta = \eta_n = \frac{db_0 \wedge \dots \wedge db_{n-1} \wedge da_0 \wedge \dots \wedge da_{n-2}}{(b_0q_{n-1}(1) + \kappa_n)^2}. \tag{12.7}$$
Applying the fundamental relation expressed by Lemma 11.6, one obtains κn qn−2 (2) ≡ (a0 qn−2 (2) + (−1)n a1 · · · an−2 )2
mod qn−1 (1).
(12.8)
Computing now in ∗R[1/ f g] / ∗R[1/g] we find
. b0 qn−1 (1) dqn−1 (1) db2 β=− ∧ ∧ db3 ∧ . . . ∧ dan−2 1 − qn−1 (1) κn qn−2 (2) b0 qn−1 (1) + κn
1 dqn−1 (1) dqn−2 (2) ∧ ∧ ν , (12.9) =d · a0 qn−2 (2) + (−1)n a1 · · · an−2 qn−1 (1) qn−2 (2)
where db3 ∧ db4 ∧ . . . ∧ dbn−1 ∧ da1 ∧ . . . ∧ dan−2 qn−3 (3) dqn−3 (3) dqn−4 (4) dq1 (n − 1) =± ∧ ∧ ... ∧ ∧ da1 · · · ∧ dan−2 . qn−3 (3) qn−4 (4) q1 (n − 1)
ν=±
(12.10) (Note that a0 is omitted.) It follows from (12.8) that in ∗R[1/ f g] / ∗R[1/g] we have
db2 1 dqn−1 (1) ∧ ∧ db3 . . . . · n a0 qn−2 (2) + (−1) a1 · · · an−2 qn−1 (1) qn−2 (2) dbn−1 ∧ da1 ∧ . . . ∧ dan−2 = dθ, (12.11)
β=d
θ:=
db2 1 dqn−1 (1) ∧ ∧ db3 . . . , · a0 qn−2 (2) + (−1)n a1 · · · an−2 qn−1 (1) qn−2 (2) dbn−1 ∧ da1 ∧ . . . ∧ dan−2
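(defining $\theta$.) One checks easily that neither $\beta$ nor $\theta$ has a pole along $A_{n-1} = 0$, so the pair
$$(\beta, \theta) \in H^{2n-1}_{Z,DR}(U) \tag{12.12}$$
represents a class mapping to $\eta_n \in H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X_n)$. Here $Z : Q_{n-1}(1) = Q_{n-2}(2) = 0$.

Lemma 12.3. The map
$$H_Z^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \to H^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \tag{12.13}$$
is injective.

Proof. Let $Y : Q_{n-1}(1) = 0$. We have
$$H_Z^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \xrightarrow{\ u\ } H_Y^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n) \xrightarrow{\ v\ } H^{2n-1}(\mathbb{P}^{2n-1} \setminus X_n), \tag{12.14}$$
and it will suffice to show $u$ and $v$ injective. We have projections
$$\mathbb{P}^{2n-1} \setminus (X_n \cup Y) \xrightarrow{\ B_0\ } \mathbb{P}^{2n-2} \setminus Y_0 \xrightarrow{\ A_0, A_{n-1}\ } \mathbb{P}^{2n-4} \setminus Y_1. \tag{12.15}$$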
Here P2n−1 has homogeneous coordinates A0 , . . . , An−1 , B0 , . . . , Bn−1 ; the arrows are labeled by the variables which are dropped, and Y, Y0 are cones over Y1 . The arrow on the left is a Gm -bundle and on the right an A2 -bundle. It follows that H 2n−2 (P2n−1 \ (X n ∪ Y )) ∼ = H 2n−2 (P2n−4 \ Y1 ) ⊕ H 2n−3 (P2n−4 \ Y1 )(−1) = (0)
(12.16)
by Artin vanishing. As a consequence, the map v in (12.14) is injective. The locus Y \ Z is smooth (Q n−2 (2) = ∂ Q n−1 (1)/∂ B1 ) so to prove injectivity for u it will suffice to show H 2n−4 (Y \ ((X n ∩ Y ) ∪ Z )) = (0).
(12.17)
Consider the projection obtained as in (12.15) by dropping the variables B0 , A0 , An−1 (so Y, Z are cones over Y1 , Z 1 ) π
→ Y1 \ Z 1 ⊂ P2n−4 . Y \ ((X n ∩ Y ) ∪ Z ) −
(12.18)
Note that X n ∩ Y : Q n−1 (1) = K n = 0, where K n is as in (11.7). We can write π as a composition of two projections. First dropping B0 yields an A1 -fibration. Then dropping A0 , An−1 leads to a fibration with fibre A2 − quadric. By Lemma 11.6, this quadric is a double line, so the fibres of π are A2 × Gm . It follows that H 2n−4 (Y \ ((X n ∩ Y ) ∪ Z )) ∼ = H 2n−4 (Y1 \ Z 1 ) ⊕ H 2n−5 (Y1 \ Z 1 )(−1) = H 2n−5 (Y1 \ Z 1 )(−1). (12.19) (The right hand identity is Artin vanishing since Y1 \ Z 1 is affine of dimension 2n − 5.) Dropping the variable B1 realizes {Q n−2 (2) = 0} as the cone over a hypersurface Y2 ⊂ P2n−5 . Using (11.9), we conclude H 2n−5 (Y1 \ Z 1 ) ∼ = H 2n−5 (P2n−5 \ Y2 ).
(12.20)
But the equation defining $Y_2$ does not involve $A_1$, so yet another projection is possible, and we deduce vanishing on the right in (12.20) by Lemma 11.4.

Theorem 12.4. Let $X_n$ be the graph hypersurface for the wheel and spokes graph with $n$ spokes. Let $[\eta_n] \in H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X_n)$ be the de Rham class (12.1). Then
$$K[\eta_n] = H^{2n-1}_{DR}(\mathbb{P}^{2n-1} \setminus X_n).$$

Proof. We have lifted $[\eta_n]$ to a class $(\beta, \theta) \in H^{2n-1}_{Z,DR}(\mathbb{P}^{2n-1} \setminus X_n)$, (12.12). By Lemma 12.3, it will suffice to show $(\beta, \theta) \ne 0$. We localize at the generic point of $Z$. It follows from (12.10) and (12.11) that as a class in the de Rham cohomology of the function field of $Z$, this class is represented by the form
$$\pm d\log(q_{n-3}(3)) \wedge \dots \wedge d\log(q_1(n-1)) \wedge d\log(a_1) \wedge \dots \wedge d\log(a_{n-2}). \tag{12.21}$$
It is easy to see that this is a non-zero multiple of $d\log(b_3) \wedge \dots \wedge d\log(b_{n-1}) \wedge d\log(a_1) \wedge \dots \wedge d\log(a_{n-2})$, and so is nonzero as a form. To see that it is nonzero as a cohomology class, one applies Deligne's mixed Hodge theory which implies that the vector space of logarithmic forms injects into de Rham cohomology of the open on which those forms are smooth.
13. Wheels and Beyond 13.1. A few words on the wheel with 3 spokes. Let X 3 ⊂ P5 be the hypersurface associated to the wheel with 3 spokes. X 3 : det(A1 M1 + . . . + A6 M6 ) = 0, where the Mi are symmetric rank one 3×3 matrices. It is easy to see in this case that the Mi span the vector space of all symmetric 3×3-matrices. The mapping g → t gg identifies G L 3 (C)/O3 (C) with the space of invertible symmetric 3 × 3 complex matrices. It follows that P5 − X 3 ∼ = G L 3 (C)/C× O3 (C).
(13.1)
From this, standard facts about the cohomology of symmetric spaces yield Theorem 11.2 for X 3 . (We thank P. Deligne for this argument.) From another point of view, X 3 is the space of singular quadrics in P2 . Such a quadric is a union of two (possibly coincident) lines, so we get X3 ∼ = Sym2 P2 .
(13.2)
This way we see immediately that $H^4(X) = \mathbb{Q}(-2) \oplus \mathbb{Q}(-2)$, where the 2 generators are the class of the algebraic cycles $p \times \mathbb{P}^2 + \mathbb{P}^2 \times p$ and the diagonal $\Delta$. In particular, Remark 10.5 is clear. Then $p \times \mathbb{P}^2$ is linearly embedded into $\mathbb{P}^5$ while $\Delta$ is embedded by the complete linear system $\mathcal{O}(-2)$. Thus $\Delta - 2 \cdot (p \times \mathbb{P}^2 + \mathbb{P}^2 \times p)$ spans the interesting class in $H^4(X)_{prim}$. It is likely that its strict transform in the blow up $\pi : P \to \mathbb{P}^5$ yields a relative class in $H_Y^6(P, B)$, but we haven't computed this last piece.

13.2. Beyond wheels. An immediate observation is that the wheel with $n$ spokes $w_n$,

$w_n =$ [figure: the wheel with $n$ spokes] (13.3)

and the zig-zag graphs $z_n$,

$z_n =$ [figure: the zig-zag graph] (13.4)
are both obtained by gluing triangles together in a rather obvious way. Both classes of graphs evaluate to rational multiples of ζ (2l − 3) at l-loops [4]. The kinship between these two classes of graphs is not easily seen at the level of their graph polynomials. Suppose we try to look directly at the Feynman period (5.3). Let = e1 +e2 +e3 ∈ H1 () be the loop spanned by a triangle. If we choose coordinates on H1 () in such a way that the first coordinate k coincides on Q · ⊂ H1 with ei∨ , i ≤ 3, and the other coordinates q are pulled back from a system of coordinates on H1 /Q · , then the k coordinate appears only in the quadrics Q i associated to the edges ei , i = 1, 2, 3. Replacing k by k1 , . . . , k4 , the period (5.3) can be written ∞ ∞ dq dk . (13.5) q=−∞ Q 4 (q) · · · Q n (q) k=−∞ Q 1 (k, q)Q 2 (k, q)Q 3 (k, q) We have the Feynman parametrization ∞ ∞ 1 1+y = d xd y, (13.6) Q 1 (k)Q 2 (k)Q 3 (k) (k)+ y Q 2 (k)+ Q 3 (k)]3 + y)Q [x(1 0 0 1 and the elementary integral, valid with appropriate positivity hypotheses on an inhomo 1 , . . . , k4 ), geneous quadric Q(k ∞ d 4k 1 = , (13.7) 3 Q Q k1 ,...,k4 =−∞ where, up to a scale factor depending on the determinant of the degree 2 homogeneous Q is a certain quadratic polynomial in the coefficients of Q. With these part of Q, substitutions, the period becomes ∞ ∞ dq , (13.8) d xd y Q(x, y, q)Q 4 (q) · · · Q n (q) x,y=0 q=−∞ where Q(x, y, q) is quadratic in q with coefficients which are rational functions in the Feynman parameters x, y. It would be of interest to try to make this calculation motivic. A triangle is the one-loop contribution to the six-point Green function in φ 4 theory: its four-valent vertices between any pair of its three edges allow for two external edges, so that these three vertices allow for six external edges altogether. 
The message of the above, that sequences of triangles increase the transcendental degree (= point at which $\zeta$ is evaluated) in steps of two, seems to be a universal observation judging by computational evidence. Indeed, let us look at the graph which encapsulates the first appearance of a multiple zeta value, in this case the first irreducible double sum $\zeta(5, 3)$ which appears in the graph
$M$ = [figure: graph $M$].
This graph is the first in a series of graphs
$M_i$ = [figure: graphs $M_i$].
Adding triangles yields ζ (5, 2l + 3). Most interestingly, these graphs can be decomposed into zig-zag graphs in a manner consistent with the Hopf algebra structure on the multiple zeta value Hopf algebra MZVs, upon noticing that the replacement of a triangle in
[figure: the wheel $w_3$] by the six-point function
$g_6$ = [figure: graph $g_6$]
delivers the graph M. (Remove the three edges of a triangle from w3 , and attach the remaining graph, which has 3 univalent vertices and one trivalent vertex, to g6 by identifying the univalent vertices with 3 vertices of g6 no two of which are connected by a single edge.) Note that indeed g6 has six vertices of valence three. Each vertex hence will have one external edge attached to it to make it four-valent, and the resulting six external edges make this graph into a contribution to a six-point function. It can hence replace any triangle.
Furthermore, the six-point function g6 is related to the four-loop graph
$w_4$ = [figure: wheel with four spokes]
by the operation
\[ w_4 = g_6/e, \tag{13.9} \]
where $e$ is any edge connecting two vertices. Indeed, $g_6$ is the complete bipartite graph on two times three vertices. Shrinking any of its edges to a point combines two valence-three vertices into one four-valent vertex with its four edges connecting to each of the other four remaining vertices. This suggests constructing a Hopf algebra $H$ on primitive vertex graphs in $\phi^4$ theory which incorporates the purely graph-theoretic Lemma 7.4 such that the following highly symbolic diagram commutes:
\[
\begin{array}{ccc}
H & \xrightarrow{\;\Delta_{2PI}\;} & H \otimes H \\
\downarrow{\scriptstyle \phi} & & \downarrow{\scriptstyle \phi \otimes \phi} \\
MZV & \xrightarrow{\;\Delta_{MZV}\;} & MZV \otimes MZV
\end{array}
\tag{13.10}
\]
First results are in agreement with the expectation that all graphs up to twelve edges are mixed Tate, which they are by explicit calculation [4], and also predict correctly the appearance of a double sum $\zeta(3, 5)$ or products $\zeta(3)\zeta(5)$ in six-loop graphs. The seven loop data demand some highly non-trivial checks (currently in process) on the data amassed in [4, 5].
Acknowledgement. The second named author thanks Pierre Deligne for important discussions.
References
1. Artin, M.: Théorème de finitude pour un morphisme propre; dimension cohomologique des schémas algébriques affines. In: SGA 4, tome 3, XIV, Lect. Notes Math., Vol. 305, Berlin-Heidelberg-New York: Springer, 1973, pp. 145–168
2. Borel, A.: Cohomologie de $SL_n$ et valeurs de fonctions zêta aux points entiers. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 4, no. 4, 613–636 (1977)
3. Belkale, P., Brosnan, P.: Matroids, Motives, and a Conjecture of Kontsevich. Duke Math. J. 116, no. 1, 147–188 (2003)
4. Broadhurst, D., Kreimer, D.: Knots and numbers in $\phi^4$ theory to 7 loops and beyond. Int. J. Mod. Phys. C 6, 519 (1995)
5. Broadhurst, D., Kreimer, D.: Association of multiple zeta values with positive knots via Feynman diagrams up to 9 loops. Phys. Lett. B 393 (3-4), 403–412 (1997)
6. Deligne, P., Goncharov, A.: Groupes fondamentaux motiviques de Tate mixte. Ann. Sci. Éc. Norm. Sup. (4) 38, no. 1, 1–56 (2005)
7. Deligne, P.: Cohomologie étale. SGA 4 1/2, Springer Lecture Notes 569, Berlin-Heidelberg-New York: Springer, 1977
8. Deninger, C.: Deligne periods of mixed motives, K-theory, and the entropy of certain $\mathbb{Z}^n$-actions. JAMS 10, no. 2, 259–281 (1997)
9. Dodgson, C.L.: Condensation of determinants. Proc. Roy. Soc. London 15, 150–155 (1866)
10. Esnault, H., Schechtman, V., Viehweg, E.: Cohomology of local systems on the complement of hyperplanes. Invent. Math. 109, 557–561 (1992); Erratum: Invent. Math. 112, 447 (1993)
11. Goncharov, A., Manin, Y.: Multiple zeta motives and moduli spaces $\overline{M}_{0,n}$. Compos. Math. 140, no. 1, 1–14 (2004)
12. Itzykson, C., Zuber, J.-B.: Quantum Field Theory. New York: McGraw-Hill, 1980
13. Stembridge, J.: Counting Points on Varieties over Finite Fields Related to a Conjecture of Kontsevich. Ann. Combin. 2, 365–385 (1998)
14. Soulé, C.: Régulateurs. Seminar Bourbaki, Vol. 1984/85. Astérisque No. 133–134, 237–253 (1986)
Communicated by J.Z. Imbrie
Commun. Math. Phys. 267, 227–263 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0039-8
Equivariant Asymptotics for Bohr-Sommerfeld Lagrangian Submanifolds
Marco Debernardi (1), Roberto Paoletti (2)
(1) Dipartimento di Matematica F. Casorati, Via Ferrata 1, Università di Pavia, 27100 Pavia, Italy. E-mail: [email protected]
(2) Dipartimento di Matematica e Applicazioni, Università degli Studi di Milano Bicocca, Via R. Cozzi 53, Edificio U5, 20126 Milano, Italy. E-mail: [email protected]
Received: 27 October 2005 / Accepted: 26 January 2006
Published online: 9 May 2006 – © Springer-Verlag 2006
Abstract: Suppose given a complex projective manifold $M$ with a fixed Hodge form $\Omega$. The Bohr-Sommerfeld Lagrangian submanifolds of $(M, \Omega)$ are the geometric counterpart to semi-classical physical states, and their geometric quantization has been extensively studied. Here we revisit this theory in the equivariant context, in the presence of a compatible (Hamiltonian) action of a connected compact Lie group.
1. Introduction
Let $M$ be an $n$-dimensional complex projective manifold, and let $\Omega$ be a Hodge form on it. Let $(L, h)$ be an Hermitian ample line bundle on $M$, such that $-2\pi i\,\Omega$ is the curvature of its unique compatible connection. Let $L^* \supseteq X \xrightarrow{\pi} M$ be the unit circle bundle in the dual line bundle, and denote by $\alpha \in \Omega^1(X)$ the normalized connection 1-form on $X$. Thus $\alpha$ is a contact form on $X$ satisfying $d\alpha = \pi^*(\Omega)$. The compact Legendrian submanifolds of $X$ play an important role in the theory of geometric quantization. An immersed Lagrangian submanifold $\iota : \Lambda \to M$ lifts to an immersed Legendrian submanifold $\tilde\iota : \Lambda \to X$ if and only if there exists a non-vanishing covariantly constant section of $\iota^*(L)$. Thus, the Legendrian submanifolds of $X$ determine by projection distinguished immersed Lagrangian submanifolds of $(M, \Omega)$, so-called Bohr-Sommerfeld Lagrangian submanifolds. Roughly speaking, from a semiclassical point of view these (rather than points in the phase space $(M, \Omega)$) are the geometric counterparts to physical states. Consequently, the quantization of Bohr-Sommerfeld Lagrangian submanifolds has been an important line of research in symplectic geometry (see for example [GS3, BPU, GT, BW] and references therein). In particular, a systematic procedure for quantizing Bohr-Sommerfeld Lagrangian submanifolds has been developed by Borthwick, Paul and Uribe in [BPU].
In short, the choice of a half-form $\lambda$ on $\Lambda$ determines a generalized half-form on $X$ supported on $\Lambda$, essentially the delta-function determined by $(\Lambda, \lambda)$; by applying the Szegö kernel to the latter, and taking Fourier components, one then naturally associates to $(\Lambda, \lambda)$ a
sequence of holomorphic sections of $L^{\otimes k}$, $u_k \in H^0(M, L^{\otimes k})$, for every $k = 0, 1, 2, \dots$. The theory of [BPU] describes how the local geometry of $(\Lambda, \lambda)$ captures the pointwise asymptotic properties of the sequence $u_k$.
In this article, we shall suppose given in addition the holomorphic action of a $g$-dimensional connected compact Lie group $G$ on $M$, Hamiltonian with respect to $\Omega$. We shall assume that $0 \in \mathfrak{g}^*$ is a regular value for the moment map $\Phi : M \to \mathfrak{g}^*$; here $\mathfrak{g}$ denotes the Lie algebra of $G$. We shall also suppose that $L$ is an ample $G$-line bundle and that the Hermitian metric $h$ on $L$ is $G$-invariant. We recall that, up to topological obstructions, the existence of a linearization amounts to the existence of a moment map ([K, GS1], §3, and [GGK], Chap. VI). More precisely, in the presence of a linearization one recovers a moment map by pairing the connection form on $X$ with the infinitesimal action of the Lie algebra $\mathfrak{g}$. Conversely, to a moment map there is associated an infinitesimal action of $\mathfrak{g}$ on $L$; more precisely, $\xi \in \mathfrak{g}$ acts on sections of $L$ by the operator $\nabla_{\xi_M} + 2\pi i\,\Phi_\xi$, where $\nabla$ is the covariant derivative associated to the connection, $\xi_M$ is the vector field on $M$ generated by $\xi$, and $\Phi_\xi =: \langle \Phi, \xi \rangle : M \to \mathbb{R}$. The obstruction to extending the infinitesimal action of $\mathfrak{g}$ to an action of $G$ is of topological nature, and the extension certainly exists if $G$ is simply connected. In this situation, every space of global holomorphic sections $H^0(M, L^{\otimes k})$ admits a $G$-equivariant unitary decomposition over the irreducible representations of $G$:
\[ H^0(M, L^{\otimes k}) = \bigoplus_{\varpi} H^0(M, L^{\otimes k})_{\varpi}. \]
Here, $\varpi$ runs over the set of highest weights, and thus indexes all finite dimensional irreducible representations $V_\varpi$ of $G$; for every $\varpi$, the summand $H^0(M, L^{\otimes k})_\varpi$ is $G$-equivariantly isomorphic to a direct sum of finitely many copies of $V_\varpi$. In particular, if $u_k \in H^0(M, L^{\otimes k})$ is the sequence associated to the pair $(\Lambda, \lambda)$, we have for every $k = 0, 1, 2, \dots$ a decomposition $u_k = \oplus_\varpi u_{k,\varpi}$, where $u_{k,\varpi} \in H^0(M, L^{\otimes k})_\varpi$. We shall investigate the asymptotic properties of the sequence $u_{k,\varpi}$, for fixed $\varpi$ and $k \to +\infty$. Naturally enough, these are governed by the mutual position of $\Lambda$ and the zero locus of the moment map, $\Phi^{-1}(0) \subseteq M$. For example, when $G$ is semi-simple and $\Lambda$ covers a Lagrangian submanifold $\pi(\Lambda) \subseteq M$, $\Lambda$ is $G$-invariant if and only if $\pi(\Lambda) \subseteq \Phi^{-1}(0)$ [GS1]. If we choose, as we may after averaging, $\lambda$ itself to be $G$-invariant, then so will be each $u_k$; therefore, $u_{k,\varpi} = 0$ for every $\varpi \neq 0$ and $k \in \mathbb{N}$. We shall assume instead that $\Lambda$ is transversal to $\Phi^{-1}(0)$; this geometric hypothesis implies a nontrivial decomposition over the irreducibles of $G$.
Incidentally, we remark that any given compact Legendrian submanifold $\Lambda \subseteq X$ may be deformed into one transversal to $\Phi^{-1}(0)$, by a contactomorphism arbitrarily close to the identity. To see this, for some integer $r \ge 1$ let us choose Hamiltonian vector fields $V_1, \dots, V_r$ on $M$, such that for every $m$ in an open neighbourhood $U$ of $\pi(\Lambda)$ one has $T_mM = \mathrm{span}\{V_1(m), \dots, V_r(m)\}$. For every $i = 1, \dots, r$, let $\psi_i : \mathbb{R} \to \mathrm{Diff}(M)$ be the one-parameter group of symplectomorphisms generated by $V_i$, and consider the smooth map $\Psi : \Lambda \times \mathbb{R}^r \to M$ given by $\Psi(x, \vec t\,) = \psi_1(t_1) \circ \cdots \circ \psi_r(t_r) \circ \pi(x)$ ($x \in \Lambda$, $\vec t = (t_1, \dots, t_r) \in \mathbb{R}^r$). By the assumption on the $V_i$'s, we can find $\delta > 0$ such that, if $B_\delta(0) \subseteq \mathbb{R}^r$ is the open ball centered at the origin of radius $\delta$, the restriction of $\Psi$ to $\Lambda \times B_\delta(0)$ is a submersion.
The transversality theorem [GP] then implies that we can find arbitrarily small $\vec t \in B_\delta(0)$ such that the map $\psi_1(t_1) \circ \cdots \circ \psi_r(t_r) \circ \pi : \Lambda \to M$ is transversal to $\Phi^{-1}(0)$. For every $i = 1, \dots, r$, there exist vector fields $\tilde V_i$ on $X$ which are $\pi$-related to the $V_i$'s (i.e., the horizontal component of $\tilde V_i$ is the horizontal lift of $V_i$ to $X$, for every $i = 1, \dots, r$), and which generate a one-parameter group of contactomorphisms $\tilde\psi_i : \mathbb{R} \to \mathrm{Diff}(X)$ ([W], §4, and [GE], Theorem 2.2). Since $\tilde\psi_i(t)$ covers $\psi_i(t)$ for every $i$ and $t \in \mathbb{R}$, $\Lambda_{\vec t} =: \tilde\psi_1(t_1) \circ \cdots \circ \tilde\psi_r(t_r)(\Lambda)$ is a Legendrian submanifold of $X$ transversal to $\Phi^{-1}(0)$.
Here is an explicit example:
Example 1.1. Endow $\mathbb{P}^1$ with the Fubini-Study metric, so that $L$ is the hyperplane bundle. Then $X$ is the unit sphere $S^3 \subseteq \mathbb{C}^2$, with projection $\pi : S^3 \to \mathbb{P}^1$ given by the Hopf map. Fix $e^{ia} \in S^1$ ($-\pi \le a \le \pi$). Let $\iota_a : S^1 \to \mathbb{C} \oplus \mathbb{C}$ be given by $\iota_a(e^{i\theta}) = (\cos(\theta), e^{ia}\sin(\theta))$. Then $\iota_a(S^1) \subseteq S^3$ is a Legendrian knot for the standard contact structure, and may therefore be viewed as a Bohr-Sommerfeld immersed Lagrangian submanifold of $\mathbb{P}^1$. Let us consider the Hamiltonian action of $S^1$ on $\mathbb{P}^1$ given by $t \bullet [z_0 : z_1] =: [t z_0 : t^{-1} z_1]$, with moment map $\Phi([z_0 : z_1]) = (|z_0|^2 - |z_1|^2)/\|z\|^2$ (we use $\bullet$ rather than $\cdot$ to distinguish the action from the ordinary one given by scalar multiplication). In affine coordinates, $\iota_a(S^1)$ covers the line through the origin of slope $\tan(a)$, and $\Phi^{-1}(0)$ is the unit circle centered at the origin. This example may be generalized in any dimension.
One motivation for studying this problem comes from the following natural question: Let us set
\[ M' =: \Phi^{-1}(0) \subseteq M \quad\text{and}\quad M_0 =: M'/G. \tag{1} \]
Thus, $M_0$ is the GIT quotient of $M$ with respect to the action of the complexification $\tilde G$ of $G$, and $(L, h, \Omega)$ descend to corresponding orbi-objects $(L_0, h_0, \Omega_0)$ on $M_0$. If we set
\[ X' =: \pi^{-1}(M') \subseteq X \quad\text{and}\quad X_0 =: X'/G, \tag{2} \]
then $X_0$ is the circle orbi-bundle of the Hermitian line orbi-bundle $(L_0^*, h_0)$. Let us momentarily suppose to fix ideas that $G$ acts freely on $M'$, so that $(M_0, \Omega_0)$ is a Kähler manifold, and $L_0$ an honest ample line bundle on it. Now if $\Lambda \subseteq X$ is a (half-weighted) Legendrian submanifold transversal to $M'$, the intersection $\Lambda' = \Lambda \cap X'$ determines by projection an immersed (half-weighted) Legendrian submanifold $\Lambda_0 \to X_0$, which we may think of as the reduction of $\Lambda$. We thus have corresponding half-forms $u$ on $X$ and $u_0$ on $X_0$ in the images of the respective Szegö projectors; taking Fourier components we obtain sequences $u_k \in H^0(M, L^{\otimes k})$ and $(u_0)_k \in H^0(M_0, L_0^{\otimes k})$.
On the other hand, it is well-known that for $k = 0, 1, 2, \dots$ there is a natural isomorphism $H^0(M, L^{\otimes k})^G \cong H^0(M_0, L_0^{\otimes k})$ [GS1]. One is then led to ask whether, under the latter isomorphism, $(u_0)_k = u_{k,0}$, at least in some asymptotic sense. More pictorially, does the principle "quantization commutes with reduction" also hold for the single (transverse, semiclassical) state? To leading order, the relation between the two sequences is governed by the effective potential of the action, defined as the function $V_{\mathrm{eff}}$ on $M'$ associating to every $p \in M'$ the volume of its $G$-orbit [BG], and a measure of the mutual position between $\Lambda$ and the $G$-orbit at a given point (another appearance of the effective potential in equivariant asymptotics is described in [P2]). Although $(u_0)_k$ and $u_{k,0}$ have the same order of growth, the answer to the question above is negative (see Remark 3.3).
Following Corollary 1.1, we shall also make some remarks regarding the case where $\Lambda$ is $G$-invariant.
Some general introductory remarks are in order. First, while we have followed the general philosophy of Borthwick, Paul and Uribe, we have based our approach on the parametrix for the Szegö kernel constructed by Boutet de Monvel and Sjöstrand in [BS], rather than on the theory of Fourier-Hermite distributions and symplectic spinors as in [BPU]. This follows the approach to equivariant asymptotics already used in [P1], and is inspired by the study of algebro-geometric Szegö kernels by Zelditch in [Z] and its subsequent developments, as in [BSZ, STZ, SZ] (in [STZ], in particular, the authors work out scaling asymptotics for toric eigenfunctions). We shall then deal with half-densities, rather than half-forms, on the given Legendrian submanifolds. We have furthermore made extensive use of the notion of Heisenberg local coordinates introduced in [SZ], for this makes the relation between the local geometry of $\Lambda$ and the leading term in the asymptotic expansions particularly explicit and simple to express. Thus, even in the action-free case, our proofs and statements depart somewhat from the corresponding ones in [BPU]. Finally, we have focused on the case of complex projective manifolds. However, given the microlocal description of almost-complex Szegö kernels given in [SZ], our arguments can be extended to the symplectic almost complex category.
Our statements are best expressed by viewing sections of $L^{\otimes k}$ as equivariant functions on $X$. Given that $\alpha$ and $\Omega$ endow $X$ with a $G$-invariant volume form, functions and half-densities on $X$ may be equivariantly and unitarily identified. Briefly, let $H(X) \subseteq C^\infty(X)$ be the Hardy space, and let $H(X)_k$ be the $k$-th isotype for the $S^1$-action. Then there is a well-known canonical unitary isomorphism $H(X)_k \cong H^0(M, L^{\otimes k})$, and we shall use the same symbol for the holomorphic sections and the corresponding equivariant functions.
Now, if $\Lambda \subseteq X$ is a compact Legendrian submanifold, the choice of a smooth half-density $\lambda$ on it determines a generalized half-density $\delta_{\Lambda,\lambda}$ on $X$ (§2.1). Applying the Szegö projector to $\delta_{\Lambda,\lambda}$, and taking Fourier components, we obtain as before equivariant functions $u_{k,\varpi}$, for every integer $k$ and highest weight $\varpi$. Our key result concerns the asymptotic expansion for an appropriate scaling limit of the sequence $u_{k,\varpi}$. More precisely, suppose that $x \in X$ and $w \in T_mM$, where $m = \pi(x)$. We shall often implicitly identify $T_mM$ with the horizontal tangent space at $x$ determined by the connection, $H_x(X/M) \subseteq T_xX$. If $x \in \Lambda$, the tangent space $T_x\Lambda$ may be viewed as a Lagrangian subspace of $T_mM$. Inspired by the results on the scaling limits of Szegö kernels in [BSZ, SZ], we shall investigate the asymptotic behavior of $u_{k,\varpi}(x + w/\sqrt{k})$, for $k \to +\infty$ and as $w$ varies in $T_mM$. The point $x + w/\sqrt{k}$ is only well-defined up to the choice of a coordinate system near $x$, and the ambiguity is $O(k^{-1})$; the leading order part of the asymptotic expansion in Theorem 1.1 below is then independent of the choice of local coordinates. For concreteness, we shall at any rate assume that some system of local Heisenberg coordinates has been fixed.
Let us introduce some further pieces of notation. Here orthogonality refers to the standard Euclidean structure on $\mathbb{C}^n = \mathbb{R}^n \oplus \mathbb{R}^n$.
Definition 1.1.
i) If $(h, g) \in S^1 \times G$ and $x \in X$, let $d_x(h, g) : T_xX \to T_{(h,g)\cdot x}X$ be the differential of the action.
ii) As above, set $\Lambda' =: \Lambda \cap X'$. Let $H_{\Lambda'}(X/M)$ be the restriction to $\Lambda'$ of the horizontal tangent bundle; thus, $H_{\Lambda'}(X/M)$ has fiber at $x \in \Lambda'$ given by $T_{\pi(x)}M$.
iii) For $m \in M$, let $G_m \subseteq G$ be the stabilizer subgroup of $m$. If $0 \in \mathfrak{g}^*$ is a regular value of $\Phi$, then $G$ acts locally freely on $M'$; therefore, $G_m$ is a finite subgroup of $G$ for every $m \in M'$.
iv) If $x \in X$ and $m \in M$ let us denote by
\[ \mathfrak{g}_X(x) = T_x(G \cdot x) \subseteq T_x(X) \quad\text{and}\quad \mathfrak{g}_M(m) = T_m(G \cdot m) \subseteq T_m(M) \tag{3} \]
the tangent spaces to the respective $G$-orbits. Now given that the $G$-action is horizontal on $X'$, for $x \in X'$ we have a natural identification
\[ \mathfrak{g}_X(x) \cong \mathfrak{g}_M(\pi(x)) \subseteq H_x(X/M). \tag{4} \]
It is well-known that if $m \in M'$ then $\mathfrak{g}_M(m)$ is a $g$-dimensional isotropic subspace of $T_mM$ [GS1].
v) To leading order, we shall see that only the component of $w$ in a certain $n$-dimensional real vector subspace $\tilde N(x) \subseteq T_{\pi(x)}M$ contributes to $|u_{k,\varpi}(x + w/\sqrt{k})|$. More precisely, recall that $T_x\Lambda$ may be viewed as a Lagrangian subspace of $T_{\pi(x)}M$. Now if $\Lambda$ is transversal to $X'$, then $T_x\Lambda \cap \mathfrak{g}_X(x) = \{0\}$ for every $x \in \Lambda'$ (Corollary 2.1). Thus, if $x \in \Lambda'$ there is a direct sum decomposition
\[ T_{\pi(x)}M = (T_x\Lambda + \mathfrak{g}_M(\pi(x)))^\perp \oplus^\perp (T_x\Lambda + \mathfrak{g}_M(\pi(x))) \cong (T_x\Lambda + \mathfrak{g}_M(\pi(x)))^\perp \oplus^\perp \bigl[ (T_x\Lambda'^\perp \cap T_x\Lambda) \oplus T_x\Lambda' \oplus \mathfrak{g}_M(\pi(x)) \bigr]. \tag{5} \]
If $x \in \Lambda'$, we shall then set
\[ \tilde N(x) =: (T_x\Lambda + \mathfrak{g}_M(\pi(x)))^\perp \oplus^\perp (T_x\Lambda'^\perp \cap T_x\Lambda); \tag{6} \]
\[ \tilde T(x) =: T_x\Lambda' \oplus \mathfrak{g}_M(\pi(x)). \tag{7} \]
Thus, $T_{\pi(x)}M = \tilde N(x) \oplus \tilde T(x)$. We shall denote by $\tilde N$ and $\tilde T$ the rank-$n$ vector sub-bundles of $H_{\Lambda'}(X/M)$ whose fibres at $x \in \Lambda'$ are given by, respectively, (6) and (7).
vi) Suppose again $x \in \Lambda'$, $m = \pi(x)$. Given $w \in T_mM$, we shall denote by $w_a, w_b, w_c, w_d$ the components of $w$ in the following intrinsic and unique algebraic decomposition:
\[ w = w_a + w_b + w_c + w_d, \tag{8} \]
where
\[ w_a \in (T_m\Lambda + \mathfrak{g}_M(m))^\perp, \quad w_b \in T_m\Lambda'^\perp \cap T_m\Lambda, \quad w_c \in T_m\Lambda', \quad w_d \in \mathfrak{g}_M(m). \]
Thus, $w' =: w_a + w_b$ and $w'' =: w_c + w_d$ are the components of $w$ in $\tilde N(x)$ and $\tilde T(x)$, respectively.
vii) Let $\mathrm{dens}_\Lambda$ and $\mathrm{dens}_\Lambda^{(1/2)}$ be the Riemannian density and half-density on $\Lambda$, respectively. If $\lambda$ is any smooth half-density on $\Lambda$, we may write $\lambda = f_\lambda\, \mathrm{dens}_\Lambda^{(1/2)}$ for a unique smooth function $f_\lambda$ on $\Lambda$.
The leading order term of the asymptotic expansion for $u_{k,\varpi}(x + w/\sqrt{k})$ will depend on both the effective volume of the action at $\pi(x)$, and a function expressing a pointwise measure of the mutual position between the Legendrian submanifold and the $G$-orbit. Given any $x \in \Lambda'$, let us choose Heisenberg local coordinates $(\theta, p, q)$ for $X$ centered at $x$. The horizontal tangent space $H_x(X/M)$ then gets unitarily identified with $\mathbb{C}^n$, with complex coordinates $z = p + iq$. Perhaps after applying a unitary transformation in $z$, we may as well assume that the Lagrangian subspace $T_x\Lambda \subseteq \mathbb{C}^n$ is defined by $p = 0$. Let us choose an orthonormal basis of $\mathfrak{g}_M(\pi(x))$ (for the induced metric); the inclusion $\mathfrak{g}_M(\pi(x)) \subseteq T_{\pi(x)}M$ is then described by a linear map
\[ r \in \mathbb{R}^g \mapsto R\,r + i\,T R\,r, \tag{9} \]
where $R$ is an $n \times g$ real matrix, $T$ an $n \times n$ real matrix, and they satisfy
\[ \mathrm{rank}(R) = g, \qquad R^tR + R^tT^tTR = I_g. \tag{10} \]
Recalling that $\mathfrak{g}_M(m) \subseteq T_mM$ is an isotropic subspace when $m \in M'$, one can see that the complex matrix $R^tR + i\,R^tTR$ is symmetric, has positive definite real part, and its determinant is independent of the choices involved. Let $\Gamma : \Lambda' \to \mathbb{C}$ be the smooth function defined by
\[ \Gamma(x) =: \frac{\det(R^tR + i\,R^tTR)^{-1/2}}{V_{\mathrm{eff}}(\pi(x))} \qquad (x \in \Lambda'). \tag{11} \]
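The symmetry claim can be checked in one line. The sketch below assumes that, in the coordinates $z = p + iq$, the symplectic form at the origin is a nonzero multiple of $\omega_0(u, v) = a \cdot d - b \cdot c$ for $u = a + ib$, $v = c + id$; the name $\omega_0$ and this normalization are our assumption, not notation taken from the text.

```latex
\begin{align*}
0 = \omega_0\bigl(Rr + i\,TRr,\; Rr' + i\,TRr'\bigr)
  &= (Rr)^t (TRr') - (TRr)^t (Rr') \\
  &= r^t \bigl(R^tTR - R^tT^tR\bigr)\, r'
  \qquad \text{for all } r, r' \in \mathbb{R}^g, \\
\text{hence}\quad R^tTR &= R^tT^tR = (R^tTR)^t .
\end{align*}
```

Under this assumption $R^tR + i\,R^tTR$ is a complex symmetric matrix, and its real part $R^tR$ is positive definite because $\mathrm{rank}(R) = g$.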
The square root of the determinant is determined according to the conventions described in [H], §3.4. We are now ready to state our main result. Recall that $n = \dim_{\mathbb{C}}(M)$, $g = \dim_{\mathbb{R}}(G)$.
Theorem 1.1. Suppose that $0 \in \mathfrak{g}^*$ is a regular value of $\Phi$, and that the compact Legendrian submanifold $\Lambda \subseteq X$ is transversal to $\Phi^{-1}(0)$. Let $\lambda$ be a smooth half-density on $\Lambda$. Fix a highest weight $\varpi$ for $G$. For $k = 0, 1, 2, \dots$, let $u_{k,\varpi}$ be the component of $\delta_{\Lambda,\lambda}$ in $H(X)_{k,\varpi} \subseteq H(X)$. Then:
i) if $x \notin (S^1 \times G) \cdot \Lambda'$, then $u_{k,\varpi}(x) = O(k^{-\infty})$ as $k \to +\infty$;
ii) there exist a positive definite metric $S$ on $\tilde N$ and a real quadratic form $P$ on $H_{\Lambda'}(X/M)$ such that the following holds. If $x \in (S^1 \times G) \cdot \Lambda'$, let $(h_j, g_j) \in S^1 \times G$, $1 \le j \le r_x$, be the finitely many elements such that $(h_j, g_j) \cdot x \in \Lambda'$. For every $j$, let $x_j =: (h_j, g_j) \cdot x$, $w_j = d_x(h_j, g_j)(w) \in H_{x_j}(X/M)$. For every $w \in T_{\pi(x)}M$, the following asymptotic expansion holds for $k \to +\infty$:
\[ u_{k,\varpi}\bigl(x + w/\sqrt{k}\bigr) \sim k^{(n-g)/2}\, \frac{\dim(V_\varpi)}{|G_{\pi(x)}|}\, \frac{(2\pi)^{(n+g)/2}}{\pi^{n}\, 2^{g/2}} \sum_{j=1}^{r_x} A_0^{(j)}(x, \varpi, k, w)\, f_\lambda(x_j) + \sum_{f \ge 1} k^{(n-g-f)/2} \sum_{j=1}^{r_x} A_f^{(j)}(x, \varpi, k, w), \]
where $G_{\pi(x)} \subseteq G$ is the stabilizer of $\pi(x)$, and for every $j = 1, \dots, r_x$ we have
\[ A_0^{(j)}(x, \varpi, k, w) =: h_j^{-k}\, \chi_\varpi(g_j)\, \Gamma(x_j)\, e^{-S_{x_j}(w_j', w_j') - i P_{x_j}(w_j, w_j)}. \]
Here $\chi_\varpi : G \to \mathbb{C}$ denotes the character of the representation $V_\varpi$. The real quadratic forms $S$ and $P$ will be described precisely in the course of the proof (see (73) and (74)); they are determined by $R$ and $T$. Letting $w = 0$, we obtain an asymptotic expansion for $u_{k,\varpi}(x)$:
\[ u_{k,\varpi}(x) \sim k^{(n-g)/2}\, \frac{\dim(V_\varpi)}{|G_{\pi(x)}|}\, \frac{(2\pi)^{(n+g)/2}}{\pi^{n}\, 2^{g/2}} \sum_{j=1}^{r_x} h_j^{-k}\, \chi_\varpi(g_j)\, \Gamma(x_j)\, f_\lambda(x_j) + \text{L.O.T.} \]
Let us see what the asymptotic expansion of Theorem 1.1 looks like in the action-free case. In this case obviously $\Lambda' = \Lambda$, and $\Gamma$ may be disregarded. Here we work in a system of adapted Heisenberg local coordinates (Definition 2.1). As a corollary of (the proof of) Theorem 1.1, we obtain (cf. Theorem 3.12 of [BPU]):
Corollary 1.1. Let $\Lambda \subseteq X$ be any compact Legendrian submanifold, $\lambda$ a smooth half-density on $\Lambda$. Let $u_k \in H(X)_k$ be the components of $\delta_{\Lambda,\lambda}$ in $H(X)_k$, $k = 0, 1, 2, \dots$. Then:
i) if $x \notin S^1 \cdot \Lambda$, then $u_k(x) = O(k^{-\infty})$, as $k \to +\infty$;
ii) if $x \in S^1 \cdot \Lambda$, let $h_1, \dots, h_{r_x} \in S^1$ be the finitely many elements such that $x_j =: h_j \cdot x \in \Lambda$. Set $m = \pi(x)$, $m_j = \pi(x_j)$. Suppose that $(\theta, p, q)$ is a system of local Heisenberg coordinates for $X$ adapted to $\Lambda$ at $x$. Then for every $w \in T_mM$ the following asymptotic expansion holds as $k \to +\infty$:
\[ u_k\bigl(x + w/\sqrt{k}\bigr) \sim \Bigl(\frac{2}{\pi}\Bigr)^{n/2} k^{n/2} \sum_{j=1}^{r_x} A_0^{(j)}(x, k, w)\, f_\lambda(x_j) + \sum_{f \ge 1} k^{(n-f)/2} \sum_{j=1}^{r_x} A_f^{(j)}(x, k, w), \]
where for every $j = 1, \dots, r_x$ we have
\[ A_0^{(j)}(x, k, w) =: h_j^{-k}\, e^{-\|w_j^\perp\|^2 - i\,\Omega_{m_j}(w_j^\perp,\, w_j^\parallel)}; \]
here $w_j^\perp$ is the component of $w_j = d_xh_j(w)$ perpendicular to $T_{x_j}\Lambda$, and $w_j^\parallel = w_j - w_j^\perp \in T_{x_j}\Lambda$. In particular, for $w = 0$ we obtain:
\[ u_k(x) \sim \Bigl(\frac{2}{\pi}\Bigr)^{n/2} k^{n/2} \sum_{j=1}^{r_x} h_j^{-k}\, f_\lambda(x_j) + \text{L.O.T.} \]
As a consequence of Corollary 1.1, let us momentarily return to the equivariant setting and take up again the case of an invariant Legendrian submanifold. More precisely,
suppose that $G$ is semi-simple and acts freely on $\Phi^{-1}(0)$. Suppose also that $\Lambda \subseteq X$ is a $G$-invariant compact Legendrian submanifold which maps down diffeomorphically under $\pi$ onto a Lagrangian submanifold $\pi(\Lambda) \subseteq M$. Thus, $\pi(\Lambda) \subseteq \Phi^{-1}(0)$, and if we choose a $G$-invariant smooth half-density $\lambda = f_\lambda \cdot \mathrm{dens}_\Lambda^{(1/2)}$ on $\Lambda$, the corresponding generalized half-density $\delta_{\Lambda,\lambda}$ is also $G$-invariant. As we have mentioned, we then have $u_{k,\varpi} = 0$ unless $\varpi = 0$, and $u_{k,0} = u_k$ for every $k$. Now, $\Lambda_0 =: \Lambda/G$ is a compact Legendrian submanifold of $X_0 = X'/G$; let us define (with a slight abuse of language) $\lambda_0 =: f_\lambda \cdot \mathrm{dens}_{\Lambda_0}^{(1/2)}$. Then it follows from Corollary 1.1 that, up to the multiplicative factor $(2/\pi)^{g/2} k^{g/2}$, the asymptotic expansion for the corresponding sequence $(u_0)_k$ has the same leading order term as the asymptotic expansion of $u_k$.
Let us illustrate the theorem and the corollary with some examples.
Example 1.2. Let us consider again the setting of Example 1.1. Thus, $\Lambda \subseteq S^3 \subseteq \mathbb{C}^2$ is the Legendrian knot given by $\iota(e^{it}) = (\cos(t), \sin(t))$. Let us choose the Riemannian half-density on it, so that $f_\lambda = 1$. Let us consider first the action-free case. The Szegö kernel at the level $k$ is given by
\[ \Pi_k(x, y) = \frac{k+1}{\pi}\, \langle x, y \rangle^k \qquad (x, y \in S^3), \]
where $\langle x, y \rangle$ denotes the standard Hermitian product of $x, y \in \mathbb{C}^2$ [BSZ]. Since $\Pi_k$ is self-adjoint with respect to the $L^2$-Hermitian pairing, we have
\[ \langle \Pi_k(\delta_{\Lambda,\lambda}), f \rangle = \langle \delta_{\Lambda,\lambda}, \Pi_k(f) \rangle, \tag{12} \]
for every $f \in C^\infty(S^3)$ (we are identifying half-densities with functions by the Riemannian half-density on $S^3$). Thus, setting $x = (x_0, x_1) \in S^3 \subseteq \mathbb{C}^2$, we have
\[ u_k(x) = \int_0^{2\pi} \Pi_k\bigl(x, (\cos(t), \sin(t))\bigr)\, dt = \frac{k+1}{\pi} \int_0^{2\pi} (x_0\cos(t) + x_1\sin(t))^k\, dt = \frac{k+1}{\pi} \int_0^{2\pi} e^{k \log(x_0\cos(t) + x_1\sin(t))}\, dt = \frac{k+1}{\pi} \int_0^{2\pi} e^{i k S(t,x)}\, dt, \tag{13} \]
where $S(t, x) =: -i \log(x_0\cos(t) + x_1\sin(t))$ (any branch of the logarithm may be used). The latter equalities are meaningless at those values of $t$ where $x_0\cos(t) + x_1\sin(t) = 0$; however, the contribution of a neighbourhood of radius $\varepsilon$ of any of these points is $O(e^{k \log(\varepsilon)})$. We shall implicitly introduce cut-off functions vanishing in a small neighbourhood of those points and ignore them in the following. Clearly, $\Im(S) \ge 0$. We have
\[ \frac{\partial S}{\partial t} = -i\, \frac{-x_0\sin(t) + x_1\cos(t)}{x_0\cos(t) + x_1\sin(t)}. \]
Therefore, $\frac{\partial S}{\partial t}(t_0, x) = 0$ if and only if there exists $e^{ih} \in S^1$ such that $e^{ih} \cdot x = \iota(t_0)$. Thus, (13) is rapidly decreasing in $k$ unless $x \in S^1 \cdot \Lambda$. Suppose then $x \in S^1 \cdot \Lambda$, and let $e^{ih_j} \in S^1$, where $h_1, \dots, h_r \in [0, 2\pi)$, be the elements such that $e^{ih_j} \cdot x \in \Lambda$. For every $j = 1, \dots, r$ there is a unique $t_j \in [0, 2\pi)$ such that $e^{ih_j} \cdot x = \iota(t_j)$. Hence the $t_j$'s are the only stationary points of $S$. At every $t_j$, we have
\[ \frac{\partial^2 S}{\partial t^2} = i, \]
so that the $t_j$ are all non-degenerate critical points. We have $x_0\cos(t_j) + x_1\sin(t_j) = e^{-ih_j}$. Application of the stationary phase lemma now yields
\[ u_k(x) \sim \frac{\sqrt{2\pi}}{\pi}\, \sqrt{k} \cdot \sum_{j=1}^{r} e^{-ikh_j} + \text{L.O.T.} \tag{14} \]
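The expansion (14) can be compared numerically with the integral (13). The following is a minimal sketch, not part of the original argument: the function names `u_k_integral` and `u_k_leading`, the sample point `x`, and the grid size are our own choices. The integrand is a trigonometric polynomial of degree $k$, so a uniform Riemann sum with more than $k$ nodes evaluates the integral up to rounding.

```python
import cmath
import math


def u_k_integral(k, x0, x1, n_grid=1024):
    """Evaluate (13): (k+1)/pi * int_0^{2pi} (x0 cos t + x1 sin t)^k dt.

    A uniform Riemann sum is used; it is exact (up to rounding) for a
    trigonometric polynomial of degree k as long as n_grid > k.
    """
    step = 2.0 * math.pi / n_grid
    total = 0j
    for b in range(n_grid):
        t = b * step
        total += (x0 * math.cos(t) + x1 * math.sin(t)) ** k
    return (k + 1) / math.pi * total * step


def u_k_leading(k, phases):
    """Leading term of (14): sqrt(2 pi k)/pi * sum_j exp(-i k h_j)."""
    return math.sqrt(2.0 * math.pi * k) / math.pi * sum(
        cmath.exp(-1j * k * h) for h in phases
    )


# Sample point (our choice): x = e^{-0.3 i} (cos 1, sin 1), which lies on S^1 . Lambda.
# Then e^{i h} x lies on the knot exactly for h = 0.3 (t = 1) and h = 0.3 + pi (t = 1 + pi).
k = 96
x0 = cmath.exp(-0.3j) * math.cos(1.0)
x1 = cmath.exp(-0.3j) * math.sin(1.0)
exact = u_k_integral(k, x0, x1)
leading = u_k_leading(k, [0.3, 0.3 + math.pi])
relative_error = abs(exact - leading) / abs(leading)

# A point off S^1 . Lambda (our choice): x = (1, i)/sqrt(2); here the integral is negligible.
off_orbit = u_k_integral(k, 1 / math.sqrt(2), 1j / math.sqrt(2))
```

For even $k$ the two phases add up, for odd $k$ they cancel; the relative error of the leading term should shrink like $1/k$ as $k$ grows.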
Example 1.3. Let us re-examine Example 1.2 in the presence of the action $S^1 \times \mathbb{P}^1 \to \mathbb{P}^1$ given by $t \bullet [z_0 : z_1] =: [t z_0 : t^{-1} z_1]$ which we considered in Example 1.1. For any $\varpi \in \mathbb{Z}$, we now have:
\[ u_{k,\varpi}(x) = \frac{k+1}{2\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} e^{i k S(s,t,x)}\, e^{i \varpi s}\, dt\, ds, \tag{15} \]
where $S(s, t, x) =: -i \log\bigl(x_0 e^{-is}\cos(t) + x_1 e^{is}\sin(t)\bigr)$. We have
\[ \frac{\partial S}{\partial t} = -i \cdot \frac{-x_0 e^{-is}\sin(t) + x_1\cos(t)\, e^{is}}{x_0 e^{-is}\cos(t) + x_1 e^{is}\sin(t)}, \tag{16} \]
\[ \frac{\partial S}{\partial s} = \frac{-x_0 e^{-is}\cos(t) + x_1\sin(t)\, e^{is}}{x_0 e^{-is}\cos(t) + x_1 e^{is}\sin(t)}. \tag{17} \]
Thus, $d_{s,t}S(s_0, t_0, x) = 0$ if and only if $e^{ih} \cdot (e^{is_0} \bullet x) = \iota(t_0)$ for some $e^{ih} \in S^1$ (by (16)) and $|x_0| = |x_1|$ (by pairing (16) with (17)). Thus, $u_{k,\varpi}(x)$ is rapidly decreasing unless $x \in (S^1 \times G) \cdot \Lambda'$ (we have $G = S^1$ here).
Now suppose $x \in (S^1 \times G) \cdot \Lambda'$, and let $(e^{ih_j}, e^{is_j}) \in S^1 \times G$ be the finitely many elements such that $e^{ih_j} \cdot (e^{is_j} \bullet x) \in \Lambda'$, and for every $j$ let $t_j \in [0, 2\pi)$ be uniquely determined by the condition that $e^{ih_j} \cdot (e^{is_j} \bullet x) = \iota(t_j)$. The pairs $(s_j, t_j)$ are the only critical points of $S(\cdot, \cdot, x)$, and for any $j = 1, \dots, r$ the Hessian matrix of $S$ at $(s_j, t_j)$ is given by
\[ H_{(s_j,t_j)}(S) = i \begin{pmatrix} 1 & \pm i \\ \pm i & 1 \end{pmatrix}. \]
Applying the stationary phase lemma, we now obtain:
\[ u_{k,\varpi}(x) \sim \frac{1}{\sqrt{2}\,\pi} \sum_j e^{i(\varpi s_j - k h_j)} + \text{L.O.T.} \]
This agrees with Theorem 1.1. To check this, remark that $|G_m| = 2$ and $V_{\mathrm{eff}}(m) = \pi$, for every $m \in \Phi^{-1}(0)$. The latter equality follows from the fact that every $G$-orbit in $S^3$ has length $2\pi$, and if an orbit maps to $\Phi^{-1}(0)$ then it doubly covers its image in $\mathbb{P}^1$.
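The leading term of Example 1.3 can be checked against the double integral (15) in the same spirit. The sketch below is ours and rests on stated assumptions: the sample point is $x = (1, 1)/\sqrt{2}$, for which the critical points of $S$ sit on the grid of multiples of $\pi/4$ and are detected as the points where the integrand base has modulus 1; the function names are hypothetical. The constant $1/(\sqrt{2}\,\pi)$ is what stationary phase gives for a Hessian with $\det(-i H) = 2$.

```python
import cmath
import math


def base(s, t, x0, x1):
    """The quantity exp(i S(s, t, x)) = x0 e^{-is} cos t + x1 e^{is} sin t."""
    return x0 * cmath.exp(-1j * s) * math.cos(t) + x1 * cmath.exp(1j * s) * math.sin(t)


def u_k_weight_integral(k, weight, x0, x1, n_grid=64):
    """Evaluate (15) by a uniform Riemann sum on [0, 2pi)^2.

    The integrand is a trigonometric polynomial of degree at most
    k + |weight| in each variable, so the sum is exact up to rounding
    when n_grid > k + |weight|.
    """
    step = 2.0 * math.pi / n_grid
    total = 0j
    for a in range(n_grid):
        s = a * step
        phase = cmath.exp(1j * weight * s)
        for b in range(n_grid):
            total += base(s, b * step, x0, x1) ** k * phase
    return (k + 1) / (2.0 * math.pi ** 2) * total * step * step


def u_k_weight_leading(k, weight, x0, x1):
    """Leading term: 1/(sqrt(2) pi) * sum over critical points of e^{i(weight s_j - k h_j)}.

    Assumption: the critical points lie on the grid of multiples of pi/4
    (true for x = (1, 1)/sqrt(2)); there exp(i S) = e^{-i h_j} has modulus 1.
    """
    total = 0j
    for a in range(8):
        for b in range(8):
            s, t = a * math.pi / 4, b * math.pi / 4
            value = base(s, t, x0, x1)
            if abs(value) > 1.0 - 1e-9:
                total += value ** k * cmath.exp(1j * weight * s)
    return total / (math.sqrt(2.0) * math.pi)


r = 1.0 / math.sqrt(2.0)
exact_13 = u_k_weight_integral(40, 0, r, r)
leading_13 = u_k_weight_leading(40, 0, r, r)
relative_error_13 = abs(exact_13 - leading_13) / abs(leading_13)
```

For this sample point there are eight critical pairs $(s_j, t_j)$, and with $k$ divisible by 4 and $\varpi = 0$ all eight phases equal 1, so the leading term is $8/(\sqrt{2}\,\pi)$; the order in $k$ is $k^{(n-g)/2} = k^0$, as predicted.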
As an application, in §4 we shall study the following problem:
Problem 1.1. Suppose given two compact Legendrian submanifolds, $\Lambda, \Sigma \subseteq X$, with specified smooth half-densities $\lambda$ and $\sigma$, respectively. Let $u_{k,\varpi}, v_{k,\varpi} \in H(X)_{k,\varpi}$ be the components of $\delta_{\Lambda,\lambda}$ and $\delta_{\Sigma,\sigma}$, respectively. How can we relate the asymptotic behavior of the Hermitian products $\langle u_{k,\varpi}, v_{k,\varpi} \rangle$, as $k \to +\infty$, to the geometry of $\Lambda$, $\Sigma$ and $\Phi^{-1}(0)$?
In the action-free case, and in the setting of Fourier-Hermite distributions and symplectic spinors, this was carried out in [BPU]. Broadly speaking, we shall see that:
• if $(S^1 \times G) \cdot \Lambda \cap \Sigma \cap (\Phi \circ \pi)^{-1}(0) = \emptyset$, then $\langle u_{k,\varpi}, v_{k,\varpi} \rangle = O(k^{-\infty})$ as $k \to +\infty$;
• if, more generally, the map $S^1 \times G \times \Lambda \to X$ given by the action is transversal to $\Sigma' = \Sigma \cap (\Phi \circ \pi)^{-1}(0)$, then there is an asymptotic expansion
\[ \langle u_{k,\varpi}, v_{k,\varpi} \rangle \sim k^{-g/2} \rho_0 + \sum_{f \ge 1} k^{-(g+f)/2} \rho_f, \]
where the leading term $\rho_0$ is described explicitly, and is a sum of terms corresponding to each $(h_j, g_j) \in S^1 \times G$ such that $(h_j, g_j) \cdot \Lambda \cap \Sigma' \neq \emptyset$;
• a similar asymptotic expansion holds when $S^1 \times G \times \Lambda \to X$ meets $\Sigma'$ nicely; the order of the leading term depends on the dimension of the inverse image of $\Sigma'$ in $S^1 \times G \times \Lambda$, and the leading coefficient is determined by certain integrals on this inverse image.
The present work covers part of the PhD thesis of the first author at the University of Pavia.
2. Preliminaries
In this section we shall collect a number of preliminary technical results, and begin a more precise description of the microlocal background of the quantization scheme outlined in the introduction. In §2.3 we shall prove statement i) of Theorem 1.1.
2.1. The distribution defined by a half-density on a Legendrian submanifold.
The given complex structure $J$ on $M$ and the unique compatible connection form $\alpha$ on $X$ determine a Riemannian metric and a volume form $\mathrm{vol}_X =: \alpha \wedge (d\alpha)^n$. The generator $\frac{\partial}{\partial\theta}$ of the $S^1$-action on $X$ spans the vertical tangent bundle $V(X/M) =: \ker(d\pi) \subseteq TX$:
\[ V(X/M) = \mathrm{span}\Bigl\{ \frac{\partial}{\partial\theta} \Bigr\}, \tag{18} \]
and $TX = V(X/M) \oplus H(X/M) \cong V(X/M) \oplus \pi^*(TM)$.
Now let $\Lambda \subseteq X$ be a compact Legendrian submanifold, endowed with the induced Riemannian metric. Suppose $x \in \Lambda$. In view of the above isomorphism, any basis $b$ of $T_x\Lambda$ can be naturally extended to a basis $\tilde b = \bigl( \frac{\partial}{\partial\theta}, b, J_{\pi(x)} b \bigr)$ of $T_xX$, where $J_{\pi(x)}$ denotes the complex structure at $\pi(x) \in M$. By construction, $\tilde b$ is orthonormal if $b$ is also. Thus the map $b \mapsto \tilde b$ yields an embedding $\mathrm{Bs}(\Lambda)_{\mathrm{ort}} \to \mathrm{Bs}(X)_{\mathrm{ort}}$, with obvious equivariance
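properties; here $\mathrm{Bs}(\Lambda)_{\mathrm{ort}}$ is the principal $O(n)$-bundle of orthonormal frames of $\Lambda$, and $\mathrm{Bs}(X)_{\mathrm{ort}}$ is the principal $O(2n+1)$-bundle of orthonormal frames of $X$. Given this, half-densities on $X$ restrict to half-densities on $\Lambda$; conversely, half-densities on $\Lambda$ extend to half-densities for $X$ defined on $\Lambda$. The upshot is that any choice of a smooth half-density $\lambda$ on $\Lambda$ determines a generalized half-density on $X$ supported on $\Lambda$, $\delta_{\Lambda,\lambda}$, as follows: If $\beta$ is a smooth half-density on $X$, let $\beta_\Lambda$ denote the induced half-density on $\Lambda$; thus the product $\beta_\Lambda \otimes \lambda$ is a density on $\Lambda$. Then
\[ \delta_{\Lambda,\lambda}(\beta) =: \int_\Lambda \beta_\Lambda \otimes \lambda. \]
Suppose $\beta = b\, \mathrm{dens}_X^{(1/2)}$, $\lambda = f_\lambda\, \mathrm{dens}_\Lambda^{(1/2)}$ with $b \in C^\infty(X)$, $f_\lambda \in C^\infty(\Lambda)$ (here $\mathrm{dens}_X^{(1/2)}$, $\mathrm{dens}_\Lambda^{(1/2)}$ denote the Riemannian half-densities on $X$ and $\Lambda$, respectively). Then $\beta_\Lambda = b|_\Lambda\, \mathrm{dens}_\Lambda^{(1/2)}$, and
\[ \delta_{\Lambda,\lambda}(\beta) = \int_\Lambda \bigl( b|_\Lambda \cdot f_\lambda \bigr)\, \mathrm{dens}_\Lambda. \tag{19} \]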
2.2. Adapted local coordinates and the microlocal structure of $\delta_{\Lambda,\sigma}$.
Heisenberg coordinates for circle bundles are discussed in [SZ]. The context of [SZ] is the symplectic and almost complex category; we recall that the construction of local Heisenberg coordinates at a given $x_0 \in X$ involves the choice of preferred local coordinates and a preferred frame for $L$ at $m_0 = \pi(x_0) \in M$. Although it isn't strictly necessary, in the present complex projective setting the preferred local coordinates and frames involved may as well be assumed holomorphic. In short, suppose that $(z_1, \dots, z_n)$ is a system of preferred local holomorphic coordinates for $M$ at $m_0$, so that the Hermitian metric satisfies $(g - i\Omega)|_{m_0} = \sum_{j=1}^n dz_j \otimes d\bar z_j$. Let $e_L$ be a preferred local holomorphic frame for $L$ at $m_0$, with dual frame $e_L^*$, such that $e_L^*(m_0) = x_0$. The associated system of local Heisenberg coordinates for $X$ centered at $x_0$, $\rho : U \subseteq (-\pi, \pi) \times \mathbb{C}^n \to V \subseteq X$, is
\[ \rho(\theta, z) = e^{i\theta}\, a(z)^{-1/2}\, e_L^*(z); \]
here $a =: \|e_L^*\|^2 = \|e_L\|^{-2}$. Write $z = (z_1, \dots, z_n) = p + iq$, where $p, q \in \mathbb{R}^n$, and set $p\, dq =: \sum_j p_j\, dq_j$, $q\, dp =: \sum_j q_j\, dp_j$. By [SZ], §1.2, the connection form $\alpha$ has the local representation
\[ \alpha = d\theta + p\, dq - q\, dp + \beta(p, q), \tag{20} \]
with $\beta = O(\|z\|^2)$.
Definition 2.1. Suppose that $\Lambda \subseteq X$ is a compact Legendrian submanifold, and that $x_0 \in \Lambda$. A system of Heisenberg local coordinates $(\theta, p, q)$ centered at $x_0$ is called adapted to $\Lambda$ at $x_0$ if $\Lambda$ is tangent to the submanifold $\{\theta = 0,\ p = 0\}$ at $x_0$.

Any system of Heisenberg local coordinates at $x_0$ may be turned into one adapted to $\Lambda$ at $x_0$ simply by applying a suitable unitary transformation in the $z$ coordinates. Suppose the Heisenberg local coordinates $(\theta, p, q)$ are adapted to $\Lambda$ at $x_0$. Then $\Lambda$ is locally defined by $\theta = f(q)$, $p = h(q)$, where $(f, h) : V \to \mathbb{R} \times \mathbb{C}^n$ vanishes to second order at $x_0$. Thus, $F(\theta, q) = \theta - f(q)$ and $H(p, q) = p - h(q)$ are local defining functions for $\Lambda$ on $V$. Actually,
Lemma 2.1. $f$ vanishes to third order at the origin.

Proof. By assumption, the restriction of $\alpha$ to $\Lambda$ vanishes identically; by (20) with $\theta = f(q)$ and $p = h(q)$, therefore, $df = -h\,dq + q\,dh - \beta(h, q)$, which vanishes to second order at $q = 0$.

The $q$'s may be naturally viewed as local coordinates on $\Lambda$. Let $D_\Lambda(q)$ be the local coordinate expression for the Riemannian half-density $\mathrm{dens}_\Lambda^{(1/2)}$ on $\Lambda$. In view of (19), we conclude:

Lemma 2.2. Suppose $x_0 \in \Lambda$, and choose adapted Heisenberg local coordinates at $x_0$, defined in an open neighborhood $V \ni x_0$. Up to a smoothing contribution, the restriction of $\delta_{\Lambda,\lambda}$ to $C_c^\infty(V)$ is a Fourier integral
\[ \frac{1}{(2\pi)^{n+1}} \int_{\mathbb{R} \times \mathbb{R}^n} e^{i(\tau F + \eta \cdot H)}\, f_\lambda(q)\, D_\Lambda(q)\, d\tau\, d\eta. \tag{21} \]
Here $(\tau, \eta) \in \mathbb{R} \times \mathbb{R}^n$, $\eta \cdot H = \sum_k \eta_k H_k$, where $H(p, q) = p - h(q)$, and $D_\Lambda$ is the local coordinate expression for the Riemannian density of $\Lambda$ (the $q$'s restrict to a system of local coordinates on $\Lambda$). By our choices, $D_\Lambda(0) = 1$. The factor $(2\pi)^{-(n+1)}$ in front of (21) comes from the fact that $\mathrm{codim}(\Lambda, X) = n + 1$.

Let $\{V_j\}$ be an open cover of $X$ such that whenever $V_j \cap \Lambda \neq \emptyset$ there exist Heisenberg local coordinates on $V_j$ adapted to $\Lambda$ at some $x_j \in V_j \cap \Lambda$. Let $\sum_j \rho_j = 1$ be a partition of unity subordinate to the open cover $\{V_j\}$. Then $\delta_{\Lambda,\lambda} = \sum_j \delta_{\Lambda,\lambda}^{(j)}$, where each $\delta_{\Lambda,\lambda}^{(j)} =: \rho_j\, \delta_{\Lambda,\lambda}$ is either a smoothing operator or a Fourier integral as in (21).

The construction of a system of Heisenberg local coordinates adapted to $\Lambda$ at some $x \in \Lambda$ may be varied smoothly with $x$. More precisely, let $B_{2n+1}(0, \varepsilon) \subseteq \mathbb{R}^{2n+1} \cong \mathbb{R} \times \mathbb{C}^n$ be the open ball of radius $\varepsilon > 0$ centered at the origin. Then:

Lemma 2.3. Fix $y \in \Lambda$. Then there exist i) open neighborhoods $y \in U \subseteq \Lambda$ and $y \in V \subseteq X$, and ii) a smooth map $\kappa : U \times B_{2n+1}(0, \varepsilon) \to V$, such that the following holds: for every $x \in U$, the restricted map $\kappa_x = \kappa(x, \cdot) : B_{2n+1}(0, \varepsilon) \to V$ is a Heisenberg local chart adapted to $\Lambda$ at $x$.

Let $(\theta^{(x)}, p^{(x)}, q^{(x)})$ be the local coordinates associated to $\kappa_x$ ($x \in U$). It is then clear that we may also find smoothly varying local defining functions $(F_x, H_x) : V \to \mathbb{R} \times \mathbb{C}^n$, of the form $F_x = \theta^{(x)} - f_x(q^{(x)})$, $H_x = p^{(x)} - h_x(q^{(x)})$.

Lemma 2.4. Suppose $x_0 \in X$ and fix Heisenberg local coordinates $(\theta, z)$ centered at $x_0$. Let $m_0 =: \pi(x_0) \in M$ and for some $\delta > 0$ consider a smooth path $\gamma : (-\delta, \delta) \to M$ satisfying $\gamma(0) = m_0$. Let $\tilde{\gamma} : (-\delta, \delta) \to X$ be the unique horizontal lift of $\gamma$ to $X$ satisfying $\tilde{\gamma}(0) = x_0$. Then the local Heisenberg coordinates of $\tilde{\gamma}$ are of the form $(\theta(t), z(t))$, where $z(t) \in \mathbb{C}^n$ are the (holomorphic) preferred local coordinates of $\gamma(t)$, and $\theta(t) \in (-\pi, \pi)$ vanishes to third order at $t = 0$.

Proof.
If $z(t)$ are the preferred coordinates of $\gamma(t)$, then clearly the Heisenberg coordinates of $\tilde{\gamma}(t)$ have the form $(\theta(t), z(t))$ for some smooth real function $\theta(t)$ vanishing at the origin. Now let $z'(0) = p_0 + i q_0$ be the tangent vector of $\gamma$ at $t = 0$ (expressed in local coordinates). Thus, $z(t) = (t p_0 + P(t)) + i (t q_0 + Q(t))$, where $P(t)$ and $Q(t)$
vanish to second order at $t = 0$. Since $\tilde{\gamma}$ is horizontal, the pull-back $\tilde{\gamma}^*(\alpha) \in \Omega^1(-\delta, \delta)$ vanishes identically. Now in view of (20) and the horizontality of $\tilde{\gamma}$,
\[ 0 = \tilde{\gamma}^*(\alpha) = \left[ \theta'(t) + (t p_0 + P(t)) \cdot (q_0 + Q'(t)) - (t q_0 + Q(t)) \cdot (p_0 + P'(t)) \right] dt + O(t^2). \]
It follows that $\theta'(t) = O(t^2)$, since the first order terms $t\, p_0 \cdot q_0 - t\, q_0 \cdot p_0$ cancel out; hence $\theta(t)$ vanishes to third order at $t = 0$.

2.3. The equivariant setting. Recall that the connection form $\alpha$ and $\pi^*(\Omega)$ naturally endow $X$ with a $G$-invariant volume form. This yields a unitary and equivariant identification of functions and half-densities, which with some abuse of language will be implicit in the following discussion. Let $L^2(X)$ be the Hilbert space of square-integrable half-densities on $X$. By the theory of [BS], the Schwartz kernel of the Szegö projector $\Pi : L^2(X) \to H(X) \subseteq L^2(X)$ is microlocally a Fourier integral
\[ \Pi(x, y) = \int_0^{+\infty} e^{i t \psi(x, y)}\, \zeta(x, y, t)\, dt. \tag{22} \]
The phase $\psi$ is the restriction to $X \times X$ of a smooth function on $L^* \times L^*$ defined in the neighbourhood of $(x_0, x_0)$, satisfying $\Im(\psi) \geq 0$. The Taylor series of $\psi$ along the diagonal in $L^* \times L^*$ is:
\[ \psi(x + h, x + k) \sim i \sum_{I, J} \frac{\partial^{I+J} \rho}{\partial z^I\, \partial \bar{z}^J}(x)\, h^I\, \bar{k}^J, \]
where $\rho = 1 - \|\cdot\|^2$ is the defining function for $X \subseteq L^*$. The amplitude $\zeta(x, y, t) \in S^n(X \times X \times \mathbb{R}_+)$ is a classical symbol of the form
\[ \zeta(x, y, t) \sim \sum_{k=0}^{\infty} t^{n-k}\, \zeta_k(x, y). \tag{23} \]
A complete discussion of the almost analytic geometry involved, together with a description of the leading term, is in [BS, Z, SZ]. It follows in particular that the wave front of $\Pi$ is the closed isotropic cone
\[ \Sigma = \left\{ (x, r\alpha_x, x, -r\alpha_x) : x \in X,\ r > 0 \right\} \subseteq (T^*X \setminus \{0\}) \times (T^*X \setminus \{0\}). \]
By standard basic results on wave fronts [DU, H], $\Pi$ extends to a continuous operator $\Pi : \mathcal{D}'(X) \to \mathcal{D}'(X)$. Its image is the space of those distributions all of whose Fourier coefficients belong to the Hardy space. In particular, if $u = \Pi(\delta_{\Lambda,\lambda})$ then the wave front of $u$ satisfies $\mathrm{WF}(u) \subseteq \{(x, r\alpha_x) : x \in \Lambda,\ r > 0\}$; its projection in $X$ is the singular support of $u$, which thus satisfies $\mathrm{SS}(u) \subseteq \mathrm{SS}(\delta_{\Lambda,\lambda}) \subseteq \Lambda$. If $x \notin S^1 \cdot \Lambda$, then $u$ is smooth on an $S^1$-invariant neighbourhood of $x$; hence $u_k(x) = O(k^{-\infty})$.

Now the $G$-action on $X$ induces a unitary representation of $G$ on $L^2(X)$, given by $(g \cdot f)(x) = \mu_{g^{-1}}^* f(x) = f(g^{-1} \cdot x)$. Given a highest weight $\varpi$ for $G$, let $L^2(X)_\varpi \subseteq$
$L^2(X)$ be the subspace of those elements contained in a finite direct sum of copies of $V_\varpi$. The orthogonal projector $P_\varpi : L^2(X) \to L^2(X)_\varpi$ is given by
\[ P_\varpi = \dim(V_\varpi) \int_G \chi_\varpi(g^{-1})\, \mu_{g^{-1}}^*\, dg, \tag{24} \]
where χ is the character of the representation [DI]. We need to recall some basic facts concerning the microlocal structure of (24). To this end, let us remark that the action µ : G × X → X naturally induces a Hamiltonian action (for the canonical symplectic structure) µ˜ : G × (T ∗ X \ {0}) → T ∗ X \ {0}. Let : T ∗ X \ {0} → g∗ be the associated moment map. Then P is a Fourier integral operator, associated to the Lagrangian submanifold 0 =: (ν1 , ν2 ) : (ν1 ) = 0, ν2 = µ(g, ˜ ν1 ) (25) ⊆ T ∗ X \ {0} × T ∗ X \ {0} [GS2]. Thus, P obviously extends to a continuous operator P : D (X ) → D (X ), and the composition P ◦ X is a Fourier integral operator with complex phase, whose wave front satisfies: WF (P ◦ X ) = x, r αx , y, −r α y : r > 0, (x, r αx ) = 0, y = µ(g, x) (26) = x, r αx , y, −r α y : r > 0, (x) = 0, y = µ(g, x) . In the first equality, we have made use of the G-invariance of α, and in the second we have used the equality (x, r αx ) = r (x, αx ) = r (x) [GS1]. On the upshot, WF P ◦ X δ,λ ⊆ {(x, r αx ) : x ∈ G · , (x) = 0, r > 0} . (27) Therefore, if x ∈ (S 1 × G) · , then P ◦ X (u) is smooth on an S 1 -invariant neighborhood of x. Given that u k, is the k th Fourier component of P ◦ X (u), we obtain: Proposition 2.1. If x ∈ (S 1 × G) · , then u k, (x) = O(k −∞ ) as k → +∞. Let us now dwell on the geometry of . To this end, let us first recall the following basic fact from [GS1]: Lemma 2.5. For every m ∈ M , Tm M is the symplectic annihilator g M (m)0 of the isotropic subspace g M (m). In particular, Tm M is a co-isotropic subspace of Tm M. We deduce: Corollary 2.1. Suppose that is transversal to X , and x ∈ . Then Tx ∩ Tx (G · x) = 0. Proof. Let m =: π(x) ∈ M . Since both Tx and Tx (G · x) = g X (x) are horizontal subspaces of Tx X , it is equivalent to prove that Tx ∩ g M (m) = 0, where in the latter equality Tx is identified with the Lagrangian subspace dx π (Tx ) ⊆ Tm M. 
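Passing to symplectic annihilators, we have $(T_x\Lambda \cap \mathfrak{g}_M(m))^0 = (T_x\Lambda)^0 + \mathfrak{g}_M(m)^0 = T_x\Lambda + T_mM' = T_mM$, by Lemma 2.5 and the transversality assumption. Hence $T_x\Lambda \cap \mathfrak{g}_M(m) = 0$.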
By horizontality, this means:

Corollary 2.2. Let $\Lambda \subseteq X$ be a Legendrian submanifold transversal to $X'$. If $x \in \Lambda \cap X'$, we have $T_x\Lambda \cap (V_x(X/M) \oplus \mathfrak{g}_X(x)) = \{0\}$.

Since the action of $S^1 \times G$ on $X$ is locally free, we conclude:

Corollary 2.3. Suppose that $\Lambda \subseteq X$ is a compact Legendrian submanifold transversal to $X'$, and that $x \in (S^1 \times G) \cdot \Lambda$. There are then only finitely many $(h_j, g_j) \in S^1 \times G$ such that $(h_j, g_j) \cdot x \in \Lambda$.

Now suppose $x \in \Lambda \cap X'$, and let us choose Heisenberg local coordinates $(\theta, z)$ adapted to $\Lambda$ centered at $x$, defined on some open neighborhood $V \ni x$. Let $F(\theta, q) = \theta - f(q)$ and $H(p, q) =: p - h(q)$ be defining functions for $\Lambda \cap V$, as in §2.2. By construction, the locus $\theta = 0$ is tangent to the horizontal tangent bundle at the origin, and $d_0 f = 0$. Given that the $G$-action on $X'$ is horizontal at any $x \in X'$, it follows that $d_xF(\xi_X(x)) = 0$. In view of Corollary 2.1, we obtain:

Corollary 2.4. There exist an open neighbourhood $E$ of $0 \in \mathfrak{g}$ and $c > 0$ such that $\left\| H\left( \mu_{\exp_G(\xi)}(x) \right) \right\| \geq c\, \|\xi\|$ for every $\xi \in E$.

2.4. A unitary invariant of pairs of Lagrangian subspaces. The invariant introduced in this section will be used in §4. Let $(V, \Omega_V, J)$ be a unitary vector space; that is, $V$ is a $2r$-dimensional real vector space, $\Omega_V$ a linear symplectic structure on $V$, and $J \in GL(V)$ is a complex structure compatible with $\Omega_V$. Leaving $\Omega_V$ and $J$ understood, let $U(V)$ denote its unitary group, and let $\mathrm{Gr}_{\mathrm{Lag}}(V)$ be its Lagrangian Grassmannian (the manifold parametrizing Lagrangian vector subspaces of $V$). Given $L, L' \in \mathrm{Gr}_{\mathrm{Lag}}(V)$, let $U(V)_{L,L'} \subseteq U(V)$ be the subset of unitary transformations mapping $L$ onto $L'$.

Definition 2.2. For every $L$, we let $\imath_J(L, L) = 1$. If $c = \dim(L \cap L') < n$, suppose that $\psi \in U(V)_{L,L'}$ satisfies $\psi(L \cap L') \subseteq L \cap L'$. Let $\mathcal{B}$ be an orthonormal real basis of $L$ whose first $c$ vectors lie in $L \cap L'$. The matrix of $\psi$ in the basis $\mathcal{B}$, viewed as an orthonormal complex basis of $V$, has a block diagonal form, whose first $c \times c$ block is a real orthogonal matrix and whose second $(r - c) \times (r - c)$ block is a unitary matrix $A + iB \in U(r - c)$, where $A, B \in M_{r-c}(\mathbb{R})$ and $B$ is non-singular. Then $\imath_J(L, L') =: |\det(B)|$.

For example, if $V = \mathbb{C}$ with its standard unitary structure, and $L, L' \subseteq \mathbb{C}$ are two distinct lines through the origin, then $\imath_J(L, L') = |\sin(\vartheta)|$, where $\vartheta$ is the angle between $L$ and $L'$. For every $c = 0, \ldots, n$, let
\[ D_c =: \left\{ (L, L') \in \mathrm{Gr}_{\mathrm{Lag}}(V) \times \mathrm{Gr}_{\mathrm{Lag}}(V) : \dim(L \cap L') = c \right\}. \]
We leave it to the reader to check the following:

Lemma 2.6. $\imath_J : \mathrm{Gr}_{\mathrm{Lag}}(V) \times \mathrm{Gr}_{\mathrm{Lag}}(V) \to \mathbb{R}^*$ is well-defined, $U(V)$-invariant (with respect to the action $R \cdot (L, L') = (RL, RL')$), and symmetric. It is continuous on $D_c$, for every $c = 0, \ldots, n$.
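As a sanity check on Definition 2.2, the following Python sketch evaluates $|\det(B)|$ for a unitary block $U = A + iB$ and compares it with the $|\sin\vartheta|$ value in the simplest case. It assumes the generic situation $L \cap L' = 0$ (so $c = 0$), takes $L$ to be the real span of the standard basis, and lets $\psi$ be a diagonal rotation; the one-dimensional case $r = 1$ is read here as two real lines in $\mathbb{C}$. The helper names (`det_real`, `iota_from_unitary_block`, `diagonal_rotation`, `conjugate_transpose`) are illustrative and do not come from the text.

```python
import cmath
import math


def det_real(m):
    """Determinant of a small real square matrix (Gaussian elimination, partial pivoting)."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def iota_from_unitary_block(u):
    """|det(B)| for a unitary block U = A + iB, given as nested lists of complex numbers."""
    b = [[entry.imag for entry in row] for row in u]
    return abs(det_real(b))


def diagonal_rotation(angles):
    """Unitary matrix diag(exp(i*theta_1), ..., exp(i*theta_r)); it maps R^r onto a Lagrangian L'."""
    r = len(angles)
    return [[cmath.exp(1j * angles[i]) if i == j else 0j for j in range(r)] for i in range(r)]


def conjugate_transpose(u):
    """Inverse of a unitary matrix, i.e. a transformation mapping L' back onto L."""
    n = len(u)
    return [[u[j][i].conjugate() for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    theta = 0.7
    # Two lines in C at angle theta: the invariant should agree with |sin(theta)|.
    print(iota_from_unitary_block(diagonal_rotation([theta])), abs(math.sin(theta)))
```

Replacing $\psi$ by its inverse changes $B$ into $-B^{t}$, which leaves $|\det(B)|$ unchanged; this is the symmetry asserted in Lemma 2.6, and `conjugate_transpose` lets one test it numerically.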
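3. Proof of Theorem 1.1

As before, let $u =: \Pi(\delta_{\Lambda,\lambda}) \in H(X)$, and denote by $u_{k,\varpi} \in H_{k,\varpi}(X)$ its $S^1 \times G$-equivariant components. Suppose $x \in (S^1 \times G) \cdot \Lambda$. Choose local Heisenberg coordinates $(\theta, z) = (\theta, p, q)$ centered at $x$, defined on an open neighbourhood $V \ni x$ ($z = p + iq$ and $p, q \in \mathbb{R}^n$). We shall denote by $x + w$ the point in $V$ having Heisenberg local coordinates $(0, w)$ ($w \in \mathbb{C}^n$). Given $w \in \mathbb{C}^n$, let us consider the asymptotics of $u_{k,\varpi}(x + w/\sqrt{k})$ for fixed $\varpi$ and $k \to +\infty$. We have:
\[ u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right) = \frac{\dim(V_\varpi)}{(2\pi)^{n+2}} \int_G \int_{-\pi}^{\pi} u\left( \mu_{g^{-1}} \circ r_{e^{i\vartheta}}\left(x + \frac{w}{\sqrt{k}}\right) \right) \chi_\varpi(g^{-1})\, e^{-ik\vartheta}\, dg\, d\vartheta. \tag{28} \]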
Here $\mu$ and $r$ denote the $G$- and $S^1$-actions on $X$ (we shall occasionally also use a dot to denote group action on a given point). Let $\{(e^{i\vartheta_j}, g_j)\}$, $1 \leq j \leq N_m$, be the finitely many elements of $S^1 \times G$ such that $x_j =: (e^{i\vartheta_j}, g_j) \cdot x \in \Lambda$ (Corollary 2.3); $N_m$ depends only on $m = \pi(x) \in M$. Since the action of $G$ on $X$ is locally free, but not necessarily free, it may happen that $x_j = x_{j'}$ for $j \neq j'$. We shall now show that, perhaps after disregarding a rapidly decaying contribution, the integration over $S^1 \times G$ may be localized near the $(e^{i\vartheta_j}, g_j)$'s. Using standard basic facts from the theory of wave fronts [DU, H], and recalling that we are identifying functions and densities by means of the Riemannian volume forms, one can prove the following:

Lemma 3.1. For $y \in X$, define the smooth map $\Upsilon_y : S^1 \times G \to X$ by $\Upsilon_y(h, g) = \mu_{g^{-1}} \circ r_h(y)$ ($h \in S^1$, $g \in G$). Then: i) $\Upsilon_y$ is an immersion, for every $y \in X$; ii) the pull-back $\Upsilon_y^*(u)$ is a well-defined generalized half-density on $S^1 \times G$; iii) the singular support of $\Upsilon_y^*(u)$ satisfies $\mathrm{SS}(\Upsilon_y^*(u)) \subseteq \{(h, g) \in S^1 \times G : \mu_{g^{-1}} \circ r_h(y) \in \Lambda\}$.

Now suppose $\epsilon > 0$ is suitably small; similarly, choose a suitably small open neighborhood $E$ of the unit $e \in G$. For every $j = 1, \ldots, N_m$, let $D_j =: \{e^{i\vartheta} : |\vartheta - \vartheta_j| < \epsilon\}$, and $E_j =: g_j^{-1} \cdot E \subseteq G$. Thus, $T_j =: D_j \times E_j \subseteq S^1 \times G$ is an open neighborhood of $(e^{i\vartheta_j}, g_j^{-1})$. Let $T_0 \subseteq S^1 \times G$ be an open subset such that $(e^{i\vartheta_j}, g_j^{-1}) \notin \overline{T_0}$, for every $j$, and such that $S^1 \times G = \bigcup_{j=0}^{N_m} T_j$. Let $\sum_{j=0}^{N_m} \gamma_j(h, g) = 1$ be a partition of unity subordinate to the open cover $\mathcal{T} = \{T_j\}_{j=0}^{N_m}$ of $S^1 \times G$. We have, with $dh = (2\pi)^{-1} d\vartheta$:
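\[ u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right) = \frac{\dim(V_\varpi)}{(2\pi)^{n+1}} \sum_{j=0}^{N_m} \int_{T_j} \gamma_j(h, g)\, u\left( \mu_{g^{-1}} \circ r_h\left(x + \frac{w}{\sqrt{k}}\right) \right) \chi_\varpi(g^{-1})\, h^{-k}\, dg\, dh = \sum_j u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right)_j, \tag{29} \]
where $u_{k,\varpi}(x + w/\sqrt{k})_j$ is defined to be the $j$-th summand in (29).

Lemma 3.2. $u_{k,\varpi}(x + w/\sqrt{k})_0 = O(k^{-\infty})$ as $k \to +\infty$.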
Proof. Perhaps after restricting the open neighborhood V of x, we may assume that dist X µg−1 ◦ rh (y), > 1 for some given sufficiently small 1 > 0 and every (h, g) ∈ T0 , y ∈ V . Therefore, as y ∈ V varies, the generalized functions γ0 (h, g) ϒ y∗ (u) ∈ D (S 1 × G) are smooth and have bounded derivatives. Taking Fourier components, we deduce that there exist C N > 0, N = 1, 2, . . . , such that u k, (y)0 < C N k −N for every y ∈ V . Since √ x + w/ k ∈ V for k 0, the statement follows. √ Next we shall focus on the asymptotics of each u k, (x + w/ k) j , 1 ≤ j ≤ Nm . Recalling (28), we have: √ √ dim(V ) ˜ µg−1 ◦ rh x + w/ k , y δ,λ (y) u k, x + w/ k = j (2π )n+1 T j X ×γ j (h, g) χ (g −1 ) h −k dy dg dh. (30) iϑ For every j, set V j =: e j , g j · V . We may assume that if (h, g) ∈ T j and y ∈ V j , then √ dist X µg−1 ◦ rh x + w/ k , y > 3 for some 3 > 0 and every k 0. Given that the Szegö kernel is smoothing away from the diagonal, it follows from (31) that √ √ dim (V ) ˜ µg−1 ◦ rh x + w/ k , y δ,λ (y) u k, x + w/ k ∼ j (2π )n+1 T j V j ×γ j (h, g) j (y) χ (g −1 ) h −k dg dh dy,
(31)
for an appropriate compactly supported bump function j on V j , identically equal to one near x j . Here symbol ∼ means that the difference between the left and right the hand side is O k −∞ . For every j, the local Heisenberg coordinates on V determine by translation local Heisenberg coordinates on V j centered at x j . We may then compose with an appropriate A j( j)∈ U( (n) in the z-variable, so as to obtain a system of local Heisenberg coordinates j) θ ,z adapted to at x j , in the sense of §2.2. With these coordinates understood, √ √ µg−1 ◦ reiϑ j (x + w/ k) = x j + w j / k, where w j = A j (w); furthermore, j
µ
g −1 j g
−1
√ √ ◦ r i (ϑ+ϑ j ) x + w/ k = µg−1 ◦ reiϑ x j + w j / k e (g ∈ E, |ϑ| < ) .
(32)
To simplify, when focussing on one j at a time, we shall write (θ, p, q) for (θ ( j) , q ( j) , F j (q) = θ − f j (q) and H j ( p, q) = p − h j will denote local defining functions for in V j . Thus, √ √ dim (V ) −kϑ j ˜ µg−1 ◦ reiϑ x j + w j / k , y u k, x + w/ k ∼ e j (2π )n+2 E − V j −1 −ikϑ j (y) δ,λ (y) ×γ j ei (ϑ j +ϑ ) , g −1 j g χ g g j e p ( j) ). Thus,
×dg dϑ dy.
(33)
We may assume that the Szegö kernel can be represented on each V j ×V j by a Fourier integral as in (22), and that δ,λ is represented on each V j by a Fourier integral as in (21); we shall thus apply (22) to (21), write the k th Fourier component of the result as an oscillatory integral, and study the asymptotics of the latter by the lemma of stationary phase [H]. Let us fix an orthonormal basis of g, and identify the latter with Rg . We may assume that the exponential map, expG : g → G, induces a diffeomorphism E =: exp−1 G (E) → E. Thus, the linear coordinates on E become local coordinates on E. Lemma 3.3. Let s j , S j : E ⊆ g → R ×Cn be defined by the condition that µe−ξ (x j ) has adapted Heisenberg local coordinates s j (ξ ), S j (ξ ) . Then S j is an embedding and s j vanishes to third order at 0 ∈ g. Proof. The first statement holds because the G-action on −1 (0) ⊆ M is locally free, and the second follows from Lemma 2.4 since G acts horizontally on (◦π )−1 (0) ⊆ X . By construction of Heisenberg local coordinates, we have: Lemma 3.4. Suppose y ∈ V j has adapted Heisenberg local coordinates (θ, z). Let dist M be the geodesic distance function on M. Then (perhaps after restricting V j and E ): 1 S j (ξ ) − z ≤ dist M µe−ξ (π(x j )), π(y) ≤ 2 S j (ξ ) − z . 2 Let us write ψ j and ζ j for the phase and amplitude in (22) on V j × V j , and define √ j (τ, η, t, g, ϑ, y) =: tψ j µg−1 ◦ reiϑ x j + w j / k , y +τ F j (y) + η · H j (y) − ϑ.
(34)
Clearly, j = t (ψ j ) ≥ 0. Performing the change of variables t → kt, η → kη, τ → kτ , we obtain: +∞ √ dim(V ) −ikϑ j u k, (x + w/ k) j ∼ k n+2 e eik j χ (g −1 g j ) (2π )n+2 R Rn 0 E − V j √ · γ j (ei (ϑ j +ϑ ) , g −1 j g) ζ j (µg −1 ◦ rϑ (x j + w j / k), y, kt) · j (y) f λ (q) D (q) dτ dη dt dg dϑ dy.
(35)
Remark 3.1. Arguing as in the proof of Theorems 2.3.1 and 2.2.2 of [DU], an oscillatory integral like (35) can be evaluated asymptotically by implicitly introducing a cut-off in the norm of $(t, \tau, \eta)$, vanishing for large values of the argument (this justifies the integration by parts in the proof of Lemma 3.5).

Let us now split the integration in $dg\, dy$ as follows. Let $\mathrm{dist}_G$ denote the Riemannian distance function on $G$, and for $k = 1, 2, \ldots$ define open sets $A_{jk}, B_{jk} \subseteq E \times V_j$ by
\[ A_{jk} =: \left\{ (g, y) \in E \times V_j : \gamma(g, y) > k^{-1/3} \right\}, \qquad B_{jk} =: \left\{ (g, y) \in E \times V_j : \gamma(g, y) < 2 k^{-1/3} \right\}, \tag{36} \]
where
\[ \gamma(g, y) =: \max\left\{ \mathrm{dist}_G(g, e),\ \mathrm{dist}_M\left( \mu_{g^{-1}}(\pi(x)), \pi(y) \right) \right\}. \tag{37} \]
Let $a_{jk} + b_{jk} = 1$ be a partition of unity on $E \times V_j$ subordinate to the open cover $E \times V_j = A_{jk} \cup B_{jk}$. By construction, $a_{jk}$ and $b_{jk}$ may be chosen $S^1$-invariant. In local coordinates, we may actually assume that
\[ a_{jk}(\xi, z) = a_j\left( \sqrt[3]{k}\, \xi,\ \sqrt[3]{k}\, z \right), \qquad b_{jk}(\xi, z) = b_j\left( \sqrt[3]{k}\, \xi,\ \sqrt[3]{k}\, z \right), \tag{38} \]
for fixed functions $a_j$, $b_j$. Then
\[ u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right)_j \sim u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right)_{ja} + u_{k,\varpi}\left(x + \frac{w}{\sqrt{k}}\right)_{jb}, \]
where $u_{k,\varpi}(x + w/\sqrt{k})_{ja}$ is obtained by multiplying the integrand in (35) by $a_{jk}$, so that integration is over $A_{jk}$, and similarly for $u_{k,\varpi}(x + w/\sqrt{k})_{jb}$.

Lemma 3.5. $u_{k,\varpi}(x + w/\sqrt{k})_{ja} = O(k^{-N})$, $N = 1, 2, \ldots$.
Proof. In view of Lemma 3.4, for every eiϑ , expG (ξ ) ∈ S 1 × E, we have √ √ dist X µe−ξ ◦ rϑ x j + w j / k , y ≥ dist M µe−ξ π x j + w j / k , π(y) ≥
1 z − S j (ξ ) + O(k −1/2 ). 2
(39)
Denote by dist X the geodesic distance function on X . Fix an open neighborhood R x j with compact closure, contained in the chosen chart adapted to . Let C =: max dx H j . x ∈R
If x , x ∈ R have local coordinates θ , z , θ , z then H j x − H j x ≤ C z − z . Here · is the standard norm in Cn . Choose c ∈ (0, C) satisfying the conclusions of Corollary 2.4.
(40)
Now let us set A(1) =: (g, y) ∈ A jk : dist M µg−1 (π(x)), π(y) > jk
c γ (g, y) , 10 C
(41)
c (2) A jk =: (g, y) ∈ A jk : dist M µg−1 (π(x)), π(y) < γ (g, y) . 5C
(42) (1)
(2)
Let τ1 + τ2 = 1 be a partition of unity of A jk subordinate to the open cover A jk ∪ A jk = A jk . We may assume that τ1 and τ2 are fixed functions of (g, y), independent of k. −1/3 , Suppose first that (g, y) ∈ A(1) jk . By Lemma 3.4 and the hypothesis γ (g, y) > k this implies z − S j (ξ ) > c k −1/3 /(20 C). Given this, (39) implies √ 1 (43) dist X µe−ξ ◦ rϑ x j + w j / k , y ≥ z − S j (ξ ) 3 for k 0. In view of Corollary 1.3 of [BS], we deduce with g = expG (ξ ): √ √ dt j µg−1 ◦ rϑ x j + w j / k , y = ψ j µg−1 ◦ rϑ x j + w j / k , y √ ≥ ψ j µg−1 ◦ rϑ x j + w j / k , y 2 ≥ C1 z − S j (ξ ) , (44) with C1 > 0 an appropriate constant. (1) The differential operator L (1) =: ψ −1 ∂t∂ is thus well-defined and smooth on A jk , is positively homogeneous of degree −1 in t, and satisfies 2 L (1) j = 1, L (1) (1, y, g) ≤ C2 / z − S j (ξ ) . If on the other hand (g, y) ∈ A(2) jk and k 0, then necessarily distG (g, e) = γ (g, y) > k −1/3 . If g = expG (ξ ), we also deduce S j (ξ ) − z ≤ 2 dist M µg−1 (π(x)), π(y) < 2c γ (exp (ξ ), y) G 5C 2c 4c distG (g, e) ≤ ξ . = 5C 5C Thus, if ξ ∈ g, g = expG (ξ ) ∈ R, and c > 0 is as in Corollary 2.4, then dη j = H j (y) = H j (y) − H j (µg (x)) + H j (µg (x)) ≥ H j (µg (x)) − H j (µg (x)) − H j (y) ≥ cξ − CS j (ξ ) − z 4 ≥ c − c ξ ≥ C3 ξ ≥ C3 ξ 2 , 5
(45)
since we may assume ξ < 1/2 if g = expG (ξ ) ∈ R. Arguing as above, one can (2) then produce a linear first order differential operator L (2) on A jk , positively homogeneous of degree −1 in η and with no zero order term, such that L (2) j = 1, whence L (2) (eik j ) = ikeik j , and L (2) ≤ C4 /ξ 2 .
Then $L =: \tau_1 L^{(1)} + \tau_2 L^{(2)}$ is a first order linear partial differential operator on $A_{jk}$, positively homogeneous of degree $-1$ in $(t, \eta)$, having no zero order term, and satisfying $L \Psi_j = 1$. Hence $L\left( e^{ik\Psi_j} \right) = ik\, e^{ik\Psi_j}$, and $\|L\| \leq C' / \|(z, \xi)\|^2$ for some $C' > 0$. Let $L^T$ be the transpose operator (norms and transposes are in the given local coordinates); then for every $s = 1, 2, \ldots$ there exists a constant $C_s > 0$ such that
\[ \left\| (L^T)^s \right\| \leq C_s / \|(z, \xi)\|^{2s}. \tag{46} \]
In L T and its powers only t- and η- derivatives occur, and the coefficients are functions of (g, y). Now let us set S = R × Rn × (0, +∞) × (−2, 2), and d X =:√dτ dη dt dϑ. Let us write eik j F j for the integrand in the expression for u k, (x + w/ k) ja ; the latter is obtained from (35) by inserting the additional factor a jk . Then (Remark 3.1) √ dim(V ) −ikϑ j u k, x + w/ k ∼ k n+2−s e eik j (L T )s (F j ) d X dg dy. ja (2π )n+2 i s S A jk (47) Introducing radial coordinates in the (z, g)-variables and invoking (46), ∞ √ u k, x + w/ k ≤ Ds k n+2−s r 2n+g−1−2s dr ja
=
Ds
k
k −1/3 (n−g−s)/3
,
where Ds , Ds are appropriate positive constants. This completes the proof of Lemma 3.5. √ We shall now determine the asymptotics of u k, (x + w/ k) jb . To this end, let us perform the following change of integration variables: θ = θ,
p = p − h j (q), q = q.
(48)
At the origin, the Jacobian of this transformation is the identity. In the new coordinates, the phase function (34) becomes j τ, η, t, g, ϑ, θ , p , q wj wj , θ , p + h j (q ), q =: tψ j µg−1 ◦ rϑ x j + √ + R √ k k (49) +τ θ − f j (q ) + η · p − ϑ, w where R : Cn → Cn vanishes to second order at the origin. Thus, Rk =: R √ j = k
O(k −1 ) as k → +∞. In the following, we shall work in the new coordinates and omit the primes for notational simplicity. We shall next rescale our coordinates by a factor k −1/2 , as follows. First, let √ us rescale the local coordinates on G in the neighbourhood E e, by writing ξ = ν/ k; thus on √ E we have g = expG (ξ ) = expG (ν/ k). Let us also rescale in the same manner the new coordinates (48) centered at x j in the horizontal direction; more precisely, let us √ √ write (θ, p, q) = (θ, r/ k, s/ k). Here ν ∈ Rg (given our choice of an orthonormal
√ √ n basis of g) and √ r, s ∈ R . In Heisenberg coordinates, (θ, r/ k, s/ k) corresponds to (θ, (r + is)/ k + h j (s)/k). By the definition (36) of B jk , integration in dν dr ds takes place over a ball of radius O(k 1/6 ) in Rg × Cn . Let us express (49) in rescaled coordinates, and define r ν s jk (τ, η, t, ν, ϑ, θ, r, s) =: j τ, η, t, expG √ , ϑ, θ, √ , √ . (50) k k k √ By Lemma 2.1, f j vanishes to third order at the origin. Thus f j (s/ k) = k −3/2 f jk (s) for a smooth function f jk vanishing to third order at the origin. We obtain √ jk = tψ j µe−ν/√k ◦ rϑ x j + w j / k + Rk , θ, k −1/2 r + k −1 h j (s), k −1/2 s +τ θ + k −1/2 η · r − ϑ + k −3/2 f jk (s).
(51)
As usual, we may identify w j with a tangent vector in Tm j M, where m j =: π(x j ); clearly, m j = g j · m. Let ν M be the vector field on M generated by ν ∈ g. √ Lemma 3.6. The adapted Heisenberg coordinates of µe−ν/√k ◦ rϑ (x j + w j / k + Rk ) are wj ν wj ν 2 1 ϑ − m j ν M (m j ), w j + Q √ , √ , √ w j − ν M (m j ) + T √ , √ , k k k k k k where Q, T : Cn ×Rg → Cn vanish at the origin to third and second order, respectively. w w Thus, Q k =: Q( √ j , √ν ) = O k −3/2 and Tk = T ( √ j , √ν ) = O(k −1 ) as k → +∞ k k k k for fixed ν as k → +∞. √ Proof. Clearly, the preferred holomorphic coordinates of µe−ν/√k (m j + w j / k + Rk ) w are k −1/2 (w j − ν M (m j )) + T ( √ j , √ν ), for some Cn -valued function T vanishing to k k second order at the origin. By construction, the Heisenberg coordinates of µe−ν/√k ◦ √ rϑ x j + w j / k + Rk then have the form √ wj ν 1 , θ (1/ k), √ w j − ν M (m j ) + T √ , √ k k k for some smooth function θ : (−δ, δ) → R. To determine the latter, let us momentarily set ϑ = 0 and consider the path, defined for sufficiently small s, t ∈ R, γs : t ∈ (−δ, δ) → µe−t ν x j + s w j + R(s w j ) , where R is as in (49); thus, R(s w j ) is a smooth function (−δ, δ) → Cn vanishing to second order at s = 0. Let us write w j = pw + iqw , ν M ( p j ) = pν + iqν . The preferred coordinates of π(γs (t)) are given by (spw − t pν ) + i(sqw − tqν ) + Q(sw, tν), where Q : Cn × Rg → Cn vanishes to second order at the origin. Thus, the Heisenberg coordinates of γs (t) have the form θ (s, t), (spw − t pν ) + i(sqw − tqν ) + Q(sw, tν) , for some real-valued smooth function θ (s, t).
Claim 3.1. $\theta(s, t) = (s\,t) \cdot d_0 + \theta_1(sw, t\nu)$, where $d_0$ depends only on $w$ and $\nu$, while $\theta_1$ vanishes to third order at $s = t = 0$.

Proof. By construction, $\theta(s, 0)$ vanishes identically. Therefore, $\theta(s, t) = t\, \theta_1(s, t)$, for a smooth function $\theta_1$. Given that $G$ acts horizontally on $X'$, Lemma 3.3 implies that $\theta(0, t)$ vanishes to third order at $t = 0$. Thus, $\theta_1(s, t) = a t^2 + t^3 b(t) + s\, d(s, t)$. The claim follows by writing $d(s, t) = d_0 + d_1(s, t)$, where $d_1(0, 0) = 0$.

Next we shall determine $d_0$ by use of (20). The pull-back $\gamma_s^*(\alpha) \in \Omega^1(-\delta, \delta)$ is given by
\begin{align*}
\gamma_s^*(\alpha)(t) &= \left\{ s\, d_0 + \left[ (s q_w - t q_\nu) \cdot p_\nu - (s p_w - t p_\nu) \cdot q_\nu \right] \right\} dt + G_1(s, t)\, dt \\
&= s \left[ d_0 + (p_\nu \cdot q_w - q_\nu \cdot p_w) \right] dt + G_1(s, t)\, dt \\
&= s \left[ d_0 + \Omega_{m_j}(\nu_M(m_j), w_j) \right] dt + G_1(s, t)\, dt,
\end{align*}
where $G_1(s, t)$ vanishes to second order at $s = t = 0$. On the other hand, if $\nu_X$ is the vector field on $X$ generated by $\nu \in \mathfrak{g}$, then $\alpha(\nu_X) = \Phi^\nu =: \langle \Phi, \nu \rangle$ (Eq. (5.1) of [GS1]). Since $\Phi^{-1}(0)$ is $G$-invariant, $\left. \frac{\partial (\Phi \circ \gamma)}{\partial t} \right|_{(0, t)} = 0$ for every $t$. Thus, $\Phi(\gamma_s(t)) = s\, d_{m_j}\Phi(w_j) + G_2(s, t)$, where $G_2$ vanishes to second order at the origin. Hence,
\begin{align*}
\gamma_s^*(\alpha)(t) &= -\langle \Phi(\gamma_s(t)), \nu \rangle\, dt = -\left[ s\, \langle d_{m_j}\Phi(w_j), \nu \rangle + G_3(s, t) \right] dt \\
&= -\left[ s\, d_{m_j}\Phi^\nu(w_j) + G_3(s, t) \right] dt = -\left[ s\, \Omega_{m_j}(\nu_M(m_j), w_j) + G_3(s, t) \right] dt,
\end{align*}
where $G_3$ vanishes to second order, and $\Phi^\nu =: \langle \Phi, \nu \rangle$ is the Hamiltonian function associated to $\nu$. Thus $G_1 = -G_3$, and $d_0 = -2\, \Omega_{m_j}(\nu_M(m_j), w_j)$. Lemma 3.6 now follows in the case $\vartheta = 0$ by letting $s = t = k^{-1/2}$; in general we need only notice that $r_\vartheta$ corresponds in local Heisenberg coordinates to a translation by $\vartheta$.

Following the notation of [BSZ], let us set
\[ K_2(u, v) =: u \cdot \bar{v} - \frac{1}{2}\left( \|u\|^2 + \|v\|^2 \right) = i\, \Im(u \cdot \bar{v}) - \frac{1}{2} \|u - v\|^2 \qquad (u, v \in \mathbb{C}^n). \tag{52} \]
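The equality of the two expressions in (52) is elementary, and a quick numerical check makes the conventions explicit. The sketch below reads (52) as $K_2(u, v) = u \cdot \bar{v} - \frac{1}{2}(\|u\|^2 + \|v\|^2) = i\, \Im(u \cdot \bar{v}) - \frac{1}{2}\|u - v\|^2$, with the conjugation on the second argument assumed from the convention of [BSZ]; the function names are illustrative only.

```python
import random


def hermitian_dot(u, v):
    """Return u . conj(v) for vectors in C^n given as lists of complex numbers."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def norm_sq(u):
    """Squared Euclidean norm of a vector in C^n."""
    return sum(abs(a) ** 2 for a in u)


def k2_first_form(u, v):
    """u . conj(v) - (|u|^2 + |v|^2) / 2, the first expression in (52)."""
    return hermitian_dot(u, v) - 0.5 * (norm_sq(u) + norm_sq(v))


def k2_second_form(u, v):
    """i * Im(u . conj(v)) - |u - v|^2 / 2, the second expression in (52)."""
    diff = [a - b for a, b in zip(u, v)]
    return 1j * hermitian_dot(u, v).imag - 0.5 * norm_sq(diff)


def random_vector(n, rng):
    """Random vector in C^n with real and imaginary parts in [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    u, v = random_vector(3, rng), random_vector(3, rng)
    # The difference of the two forms should vanish up to rounding.
    print(abs(k2_first_form(u, v) - k2_second_form(u, v)))
```

The second form shows at once that the real part of $K_2(u, v)$ is $-\frac{1}{2}\|u - v\|^2 \leq 0$, which is the source of the Gaussian decay in the scaling limit.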
In view of the asymptotic expansion for the phase discussed in the proof of the scaling limit of the Szegö kernel in [BSZ] and [SZ], Lemma 3.6 implies that jk in (51) has an asymptotic k-expansion of the form
M. Debernardi, R. Paoletti 1 i ϑ− 2k m j (ν M (m j ),w j )−θ −ϑ +τ θ + √ η·r jk ∼ i t 1 − e k it i(ϑ−θ) K 2 w j − ν M (m j ), r + i s + O k −3/2 − e k
1 = i t 1 − ei (ϑ−θ) − ϑ + τ θ + √ η · r k t i(ϑ−θ) 2 m j (ν M (m j ), w j ) + i K 2 w j − ν M (m j ), r + i s − e k 1 w + t ei(ϑ−θ) P √ (r + is), √ , k k
(53)
where P : Cn ×Rg → C vanishes to third order at the origin. Hence, Pk =: P( √1 (r +is), k w √ ) = O k −3/2 for fixed r, s ∈ Rn as k → +∞. k √ √ On the upshot, we have u k, (x + w/ k) j ∼ u k, (x + w/ k) jb and √ dim(V ) 2−g/2 −ikϑl k e F jk (ν, s) dν ds; (54) u k, (x + w/ k) jb = (2π )n+2 Rg Rn √ here, performing the coordinate change η → k η, we have set +∞ n/2 F jk (ν, s) =: k ei k S A dr dθ dτ dη dϑ dt, (55) 0
where the complex phase
\[ S(\vartheta, t, \theta, \tau, \eta, r) =: i t \left[ 1 - e^{i(\vartheta - \theta)} \right] - \vartheta + \tau\theta + \eta \cdot r \tag{56} \]
satisfies $\Im(S) \geq 0$, and the amplitude $A$ is
−i t ei(ϑ−θ) 2 m j (ν M (m j ),w j )+i K 2 (w j −ν M (m j ), r +i s ) i k t ei(ϑ−θ) P √1 (r +is), √w k k ×e j (y) f λ (q) D (q)
A =: e
√ ×b j (k −1/6 (ν, r + is)) ζ j µg−1 ◦ rϑ (x j + w j / k), y, kt √ √ √ ν/ k −ν/ k χ ×γ j ei(ϑ j +ϑ) , g −1 e (e g j ) HG (ν/ k); j
(57)
in (55), the integration from $0$ to $+\infty$ refers to $dt$. In (57), $H_G$ denotes the Haar density on $G$, expressed in the local coordinates given by the exponential chart; thus, $H_G(0) = 1$. The factor $b_j(k^{-1/6}(\nu, r + is))$ comes from setting $\xi = \nu/\sqrt{k}$ and $z = (r + is)/\sqrt{k}$ in $b_{jk}$ given by (38). As above, let us write $w_j = p_w + i q_w$, with $p_w, q_w \in \mathbb{R}^n$. The term $2\, \Omega_{m_j}(\nu_M(m_j), w_j) + i K_2(w_j - \nu_M(m_j), r + is)$, appearing in the exponent in the first factor of (57), may be rewritten:
\[ 2 (p_\nu \cdot q_w - q_\nu \cdot p_w) - (q_w - q_\nu) \cdot r + (p_w - p_\nu) \cdot s - \frac{i}{2} \left( \|p_w - p_\nu - r\|^2 + \|q_w - q_\nu - s\|^2 \right). \tag{58} \]
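The rewriting in (58) can be verified numerically. The sketch below assumes the conventions used above: $w_j = p_w + i q_w$, $\nu_M(m_j) = p_\nu + i q_\nu$, the symplectic pairing of $\nu_M(m_j)$ with $w_j$ equal to $p_\nu \cdot q_w - q_\nu \cdot p_w$, and $K_2(u, v) = u \cdot \bar{v} - \frac{1}{2}(\|u\|^2 + \|v\|^2)$. All function names are hypothetical.

```python
import random


def dot(a, b):
    """Euclidean dot product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def k2(u, v):
    """K_2(u, v) = u . conj(v) - (|u|^2 + |v|^2) / 2 for complex vectors u, v."""
    herm = sum(a * b.conjugate() for a, b in zip(u, v))
    return herm - 0.5 * (sum(abs(a) ** 2 for a in u) + sum(abs(b) ** 2 for b in v))


def exponent_term(p_w, q_w, p_nu, q_nu, r, s):
    """2 * pairing(nu_M, w) + i * K_2(w - nu_M, r + i s), the term before rewriting."""
    pairing = dot(p_nu, q_w) - dot(q_nu, p_w)
    u = [complex(pw - pn, qw - qn) for pw, qw, pn, qn in zip(p_w, q_w, p_nu, q_nu)]
    v = [complex(ri, si) for ri, si in zip(r, s)]
    return 2 * pairing + 1j * k2(u, v)


def expanded_term(p_w, q_w, p_nu, q_nu, r, s):
    """The expression (58), written out in real coordinates."""
    dp = [a - b for a, b in zip(p_w, p_nu)]
    dq = [a - b for a, b in zip(q_w, q_nu)]
    real_part = 2 * (dot(p_nu, q_w) - dot(q_nu, p_w)) - dot(dq, r) + dot(dp, s)
    dpr = [a - b for a, b in zip(dp, r)]
    dqs = [a - b for a, b in zip(dq, s)]
    return real_part - 0.5j * (dot(dpr, dpr) + dot(dqs, dqs))


def random_real_vector(n, rng):
    """Random vector in R^n with entries in [-1, 1]."""
    return [rng.uniform(-1, 1) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    args = [random_real_vector(3, rng) for _ in range(6)]
    # The two expressions should agree up to rounding.
    print(abs(exponent_term(*args) - expanded_term(*args)))
```

Setting $r = 0$ in `expanded_term` and multiplying by $-i$ gives the exponent of the leading coefficient in Lemma 3.7.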
Lemma 3.7. i) As $k \to +\infty$, there is an asymptotic expansion, uniform on compact subsets of $\mathbb{R}^g \times \mathbb{R}^n$,
\[ F_{jk}(\nu, s) \sim k^{n/2 - 2}\, Z_{j0}(\nu, s) + \sum_{f \geq 1} k^{(n - f)/2 - 2}\, Z_{jf}(\nu, s). \]
The coefficient of the leading term is:
\[ Z_{j0}(\nu, s) = \frac{(2\pi)^{n+2}}{\pi^n}\, \chi_\varpi(g_j)\, f_\lambda(x_j)\, e^{-\frac{1}{2}\|p_w - p_\nu\|^2}\, e^{-i (p_w - p_\nu) \cdot s - 2i (p_\nu \cdot q_w - q_\nu \cdot p_w) - \frac{1}{2}\|q_w - q_\nu - s\|^2}. \]
ii) There exist positive constants $c > 0$ and $C_f > 0$ for every $f = 1, 2, \ldots$, such that
\[ |Z_{jf}(\nu, s)| < C_f\, e^{-c \left( \|s\|^2 + \|\nu\|^2 \right)}, \]
for every $(\nu, s) \in \mathbb{R}^g \times \mathbb{R}^n$.
iii) For every $\ell = 0, 1, 2, \ldots$ there exists $D_\ell > 0$ such that
\[ \left| F_{jk}(\nu, s) - \sum_{f=0}^{\ell} k^{(n - f)/2 - 2}\, Z_{jf}(\nu, s) \right| \leq D_\ell\, k^{(n - \ell - 1)/2 - 2}\, e^{-c \left( \|s\|^2 + \|\nu\|^2 \right)} \]
for every $(\nu, s) \in \mathbb{R}^g \times \mathbb{R}^n$.

Proof. i) On any fixed compact subset of $\mathbb{R}^g \times \mathbb{R}^n$, we have $H_G(\nu/\sqrt{k}) = 1 + O(k^{-1/2})$, $\mu_{e^{\nu/\sqrt{k}}} = \mathrm{id} + O(k^{-1/2})$, $\chi_\varpi(e^{-\nu/\sqrt{k}} g_j) = \chi_\varpi(g_j) + O(k^{-1/2})$ as $k \to +\infty$. Incorporating the terms $O(k^{-1/2})$ into the amplitude, (55) may be interpreted as an oscillatory integral, with complex phase (56), and whose amplitude may be developed in powers of $k^{-1/2}$. Since $r = \partial S/\partial \eta$, the asymptotic contribution to $F_{jk}(\nu, s)$ from the region $\|r\| \geq 1$, say, is $O(k^{-\infty})$. Therefore, we may assume that both $r$ and $s$ are bounded in norm, and so $b_j(k^{-1/6}(\nu, r + is)) = 1$ if $k \gg 0$. The proof of the following is left to the reader:

Claim 3.2. The phase $S$ has only one stationary point $(\vartheta_0, t_0, \theta_0, \tau_0, \eta_0, r_0)$, given by $t_0 = \tau_0 = 1$, $\vartheta_0 = \theta_0 = 0$, $r_0 = \eta_0 = 0$. With respect to the ordering $(\vartheta, t, \theta, \tau, \eta, r)$ of the variables, the Hessian of $S$ at this stationary point is $i$ times the matrix
\[ \begin{pmatrix}
1 & -i & -1 & 0 & 0 & 0 \\
-i & 0 & i & 0 & 0 & 0 \\
-1 & i & 1 & -i & 0 & 0 \\
0 & 0 & -i & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & -i I_n \\
0 & 0 & 0 & 0 & -i I_n & 0
\end{pmatrix}. \]

Now i) follows in view of (23) and (58) by the complex stationary phase Lemma (Theorem 7.7.5 of [H]).

ii) By our choice of adapted Heisenberg local coordinates for $\Lambda$ centered at $x_j$, and by Corollary 2.1, $\nu \mapsto p_\nu$ is an injective $\mathbb{R}$-linear map $\mathfrak{g} \to \mathbb{R}^n$; therefore, so is the affine map $A_j : \mathfrak{g} \oplus \mathbb{R}^n \to \mathbb{C}^n$ given by $A_j(\nu, s) =: (p_w - p_\nu) + i (q_w - q_\nu - s)$.
Hence there exist c, d > 0 such that
pw − pν 2 + qw − qν − s2 ≥ c ν2 + s2 − d.
(59)
Now ii) follows in view of (58) and i). iii) In view of the cut-off b j (k −1/6 (ν, r + is)), we may suppose (ν, s) ≤ k 1/6 (and r ≤ 1, as above). Let us make the coordinate change s = s/k 1/6 , ν = ν/k 1/6 , so that ν and s may be assumed to be bounded. Then (55) may still be interpreted as an oscillatory integral, with complex phase S, and whose amplitude is S1/6 . We may then apply the stationary phase lemma as in i), ii) and plug back in s = k 1/6 s and ν = k 1/6 ν in the result. This completes the proof of Lemma 3.7. In view of (54), Lemma 3.7 implies the asymptotic expansion √ u k, x + w/ k = k (n−g)/2 0 (k, , w)( j) jb + k (n−g− f )/2 f (k, , w, x)( j) ,
(60)
f ≥1
where f (k, , w, x)( j) =
dim(V ) −ikϑl e Z j f (ν, s) dν ds. (2π )n+2 Rg Rn
(61)
In particular, the coefficient of the leading term is 0 (k, , w)( j) =
dim(V ) −ikϑ j e χ (g j ) f λ (x j ) n π ×
1
e− 2 pw − pν e−i( pw − pν )s−2 i ( pν qw −qν 2
(62) pw )− 21 qw −qν −s2
dνds.
Rg Rn
To integrate in ds, let us make the change of variables s s + qν − qw , and recall 2 that the function e−x /2 on Rn equals its own Fourier transform. We obtain: 0 (k, , w)( j) =
n dim(V ) (2π ) 2 e−ikϑ j χ (g j ) f λ (x j ) n π
×
e− pw − pν e−i[( pw − pν )(qw −qν )+2 ( pν qw −qν 2
Rg
pw )]
dν. (63)
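The integration in $ds$ above rests on the fact that the Gaussian $e^{-x^2/2}$ is its own Fourier transform. As a rough numerical sanity check, the following sketch approximates the one-dimensional transform by the trapezoid rule. The unitary normalization $(2\pi)^{-1/2}\int f(x)e^{-ix\xi}\,dx$, the function names and the grid parameters are assumptions made for illustration; they are not taken from the text.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=12.0, steps=4800):
    """Trapezoid approximation of (2*pi)^(-1/2) * integral of f(x) * exp(-i*x*xi) dx.

    The integral is truncated to [-half_width, half_width]; this is harmless
    for rapidly decaying f such as a Gaussian (assumed here).
    """
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * f(x) * cmath.exp(-1j * x * xi)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian(x):
    """The function exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def max_deviation(points):
    """Largest gap between the numerical transform and the Gaussian itself."""
    return max(abs(fourier_transform(gaussian, xi) - gaussian(xi)) for xi in points)


if __name__ == "__main__":
    print(max_deviation([0.0, 0.5, 1.0, 2.0]))
```

With the unnormalized integral $\int f(x)e^{-ix\cdot\xi}\,dx$ on $\mathbb{R}^n$ the same identity holds up to the factor $(2\pi)^{n/2}$.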
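Let us now decompose $w_j$ as in (8), and to simplify our notation let us write $w_a, \ldots$ for $(w_j)_a, \ldots$ ($j$ being fixed in our argument). Let us write $w_a, \ldots$ in local coordinates as column vectors $(p_a\ q_a)^t, \ldots \in \mathbb{R}^{2n}$. More precisely, we have:

Claim 3.3. i) $w_a = \begin{pmatrix} p_a \\ 0 \end{pmatrix}$, where $p_a \cdot p_\nu = 0$ $\forall\, \nu \in \mathfrak{g}$;
ii) $w_b = \begin{pmatrix} 0 \\ q_b \end{pmatrix}$ and $w_c = \begin{pmatrix} 0 \\ q_c \end{pmatrix}$ for appropriate $q_b, q_c \in \mathbb{R}^n$, and $p_\nu \cdot q_c = 0$, $\forall\, \nu \in \mathfrak{g}$;
iii) $w_d = \begin{pmatrix} p_{\nu_w} \\ q_{\nu_w} \end{pmatrix}$, for a unique $\nu_w \in \mathfrak{g}$.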
Proof. For the second statement of ii), recall that wc ∈ Tm j ⊆ Tm j M , and the latter is the symplectic annihilator of g M (m j ). Everything else is immediate. Lemma 3.8. Let pw j , qw j ∈ Rn be as in Claim 3.3. Then for every ν ∈ g one has pν · qw − qν · pw = pν · qb − qν · pa .
(64) Proof. The left-hand side of (64) is the symplectic pairing m j ν M (m j ), w j . We have pw = pa + pνw , qw = qb + qc + qνw . However, recalling that g M (m j ) ⊆ Tm j M is an isotropic subspace, we have m j ν M (m j ), νw (m j ) = 0 for every ν ∈ g; therefore, pνw and qνw may be ignored. The statement then follows from ii) of Claim 3.3. Let us make the change of variables β = ν − νw in (63). The real part of the exponent in (63) then is − pa 2 − pβ 2 , while the imaginary part may be written as − pa − pβ · (qb + qc ) − qβ − 2 ( pβ + pνw ) · qb − (qβ + qνw ) · pa = m j (w j − wd , wa ) + 2 m j (w j , wd ) − pβ · qb − 3qβ · pa − pβ · qβ . (65) Let us set C(w j ) = m j (w j − wd , wa ) + 2 m j (w j , wd ). We may then rewrite the right-hand side of (63) as 3
0 (k, , w)( j) = dim(V ) (2π ) 2 n+1 e−ikϑ j χ (g j ) f λ (x j ) e−wa 2 × e− pβ +i pβ ·qβ e−i pβ ·qb −3qβ · pa dβ. Rg
2 +iC(w
j)
(66)
In (66), the integral is over $\mathfrak{g}$, identified with $\mathbb{R}^g$ by means of an orthonormal basis for the Haar metric. Thus, the Lebesgue measure $d\beta$ corresponds to the Haar measure at the identity $e \in G$. To make our statement more intrinsic, we shall now rewrite the latter integral as an integral over $\mathfrak{g}_M(m_j) \subseteq T_{m_j}M$, with the induced metric.

Lemma 3.9. Fix $t \in M$. Suppose that $B$ is an orthonormal basis of $\mathfrak{g}$ for the Haar metric, and that $B_t$ is an orthonormal basis of $\mathfrak{g}_M(t)$ for the induced metric from $T_tM$. Identify $\mathfrak{g}$ with $\mathfrak{g}_M(t)$ by the linear isomorphism $\xi \mapsto \xi_M(t)$. Let $A = M_{B B_t}(\mathrm{id}_{\mathfrak{g}})$ be the matrix of the base change. Then
\[
|\det(A)| = \frac{1}{V_{\mathrm{eff}}(t)\,|G_t|},
\]
where $V_{\mathrm{eff}}(t)$ is the effective potential at $t$, and $G_t \subseteq G$ is the stabilizer subgroup of $t$.

Remark 3.2. Since $m_j = \mu_{g_j}(m)$, we have $V_{\mathrm{eff}}(m_j) = V_{\mathrm{eff}}(m)$ and $|G_{m_j}| = |G_m|$.

Proof. Suppose $B = \{v_1, \ldots, v_g\}$, $B_t = \{w_1, \ldots, w_g\}$, so that $w_j = \sum_{i=1}^{g} a_{ij} v_i$, where $A = [a_{ij}]$. Hence
\[
w_1 \wedge \cdots \wedge w_g = \det(A)\, v_1 \wedge \cdots \wedge v_g. \tag{67}
\]
Let $\mathrm{dens}_G$ be the Haar density on $G$, so that $\int_G \mathrm{dens}_G = 1$; hence $\mathrm{dens}_G(v_1 \wedge \cdots \wedge v_g) = 1$. Let $\mathrm{dens}_t$ be the pull-back to $G$ of the invariant metric density on the orbit $G \cdot t \cong G/G_t$ under the degree-$|G_t|$ covering map $g \mapsto g \cdot t$. By invariance,
\[
\mathrm{dens}_t = V_{\mathrm{eff}}(t) \cdot |G_t| \cdot \mathrm{dens}_G. \tag{68}
\]
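By construction,
\[
1 = \mathrm{dens}_t(w_1 \wedge \cdots \wedge w_g) = |\det(A)| \cdot V_{\mathrm{eff}}(t) \cdot |G_t| \cdot \mathrm{vol}_G(v_1 \wedge \cdots \wedge v_g) = |\det(A)| \cdot V_{\mathrm{eff}}(t) \cdot |G_t|.
\]
Now let $\beta$ and $u$ denote the linear coordinates on $\mathfrak{g}$ associated to the basis $B$ and $B_q$, respectively; thus, $\beta = A\,u$. By the lemma,
\[
d\beta = \big(V_{\mathrm{eff}}(q)\,|G_q|\big)^{-1}\, du. \tag{69}
\]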
We have already exploited the following consequence of Corollary 2.1: working in adapted Heisenberg local coordinates, the projection of $\mathfrak{g}_M(m_j) \subseteq T_{m_j}M \cong \mathbb{C}^n \cong \mathbb{R}^n \times \mathbb{R}^n$ onto $\mathbb{R}^n \times \{0\}$, $p_{\mathrm{real}} : \nu_M(m_j) \mapsto p_\nu$ $(\nu \in \mathfrak{g})$, is injective. Hence there exists a linear map $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $q_\nu = T(p_\nu)$ for every $\nu \in \mathfrak{g}$ (if we so wish, we may determine $T$ uniquely by imposing that it vanishes on the Euclidean orthocomplement of $p_{\mathrm{real}}(\mathfrak{g}_M(p_j)) \subseteq \mathbb{R}^n$). We shall think of $T$ as an $n \times n$ real matrix. On the upshot, identifying $\mathfrak{g} \cong \mathbb{R}^g$ by $B_{m_j}$, and $T_{m_j}M \cong \mathbb{C}^n$ by the given choice of adapted Heisenberg local coordinates, the inclusion $\mathfrak{g} \cong \mathfrak{g}_M(m_j) \hookrightarrow T_{m_j}M$ may be written
\[
\iota(u) = Ru + i\,TRu \qquad (u \in \mathbb{R}^g), \tag{70}
\]
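for a certain $n \times g$ real matrix $R$ of maximal rank $g$. Since $B_{m_j}$ is orthonormal for the induced metric, we have $R^tR + R^tT^tTR = I_g$.

Lemma 3.10. $R^tTR$ is a $g \times g$ symmetric matrix.

Proof. Since $m_j \in M$, $\mathfrak{g}_M(m_j) \subseteq T_{m_j}M$ is an isotropic subspace. Thus, for every $\xi, \nu \in \mathfrak{g}$ we have
\[
0 = \omega_{m_j}\big(\xi_M(m_j), \nu_M(m_j)\big)
= \omega_0\left(\begin{pmatrix} p_\xi \\ q_\xi \end{pmatrix}, \begin{pmatrix} p_\nu \\ q_\nu \end{pmatrix}\right)
= \omega_0\left(\begin{pmatrix} R\xi \\ TR\xi \end{pmatrix}, \begin{pmatrix} R\nu \\ TR\nu \end{pmatrix}\right)
= \xi^t\big(R^tTR - R^tT^tR\big)\nu,
\]
where $\omega_0$ denotes the standard symplectic structure on $\mathbb{R}^n \times \mathbb{R}^n$.

Now (66) may be rewritten as 0 (k, , w)( j) =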
3 dim(V ) 2 (2π ) 2 n+1 e−ikϑ j χ (g j ) f λ (x j ) e−wa +iC(w j ) Veff (q) |G q | t t t t t t (71) × e−u R R+i R T R u e−iu R qb −3T pa du.
Rg
Up to a scalar factor $(2\pi)^{g/2}$, the integral in (71) is the evaluation at $R^t(q_b - 3T^tp_a)$ of the Fourier transform of the function $e^{-\frac{1}{2}(u, Au)}$ on $\mathbb{R}^g$, where $A = 2(R^tR + i\,R^tTR)$ is a $g \times g$ complex symmetric matrix with positive definite real part. By Theorem 7.6.1 of [H], we have
1 ) (2π )(n+g) −ikϑ j dim(V 0 (k, , w)( j) = e χ (g j ) f λ (x j ) n Veff (q) |G q | π 2g −1/2 × det R t R + i R t T R exp(−Q(w j )), (72) where
\[
Q(w_j) = \|p_a\|^2 - i\,C(w_j) + \frac{1}{4}\Big(R^t(q_b - 3T^tp_a),\ \big(R^tR + i\,R^tTR\big)^{-1} R^t(q_b - 3T^tp_a)\Big) = S(w_j) + i\,P(w_j), \tag{73}
\]
where $S$ and $P$ denote real valued quadratic forms. Here $(\ ,\ )$ is the standard Euclidean scalar product on $\mathbb{R}^g$. Thus, if $\big(R^tR + i\,R^tTR\big)^{-1} = F + i\,G$, where $F$ and $G$ are $g \times g$ real symmetric matrices,
\[
S(w_j) =: \|p_a\|^2 + \frac{1}{4}\Big(R^t(q_b - 3T^tp_a),\ F\,R^t(q_b - 3T^tp_a)\Big). \tag{74}
\]
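The Gaussian Fourier transform formula invoked above can be checked numerically in the simplest case $g = 1$. The sketch below assumes the convention $\hat f(\xi) = \int f(u)\,e^{-iu\xi}\,du$ and the closed form $\sqrt{2\pi/a}\;e^{-\xi^2/(2a)}$ (principal branch of the square root) for $f(u) = e^{-au^2/2}$ with $\operatorname{Re} a > 0$. The sample value of $a$, the function names and the grid parameters are illustrative assumptions, not taken from the text.

```python
import cmath
import math


def gaussian_ft_numeric(a, xi, half_width=12.0, steps=4800):
    """Trapezoid approximation of the integral of exp(-a*u^2/2 - i*u*xi) du.

    Assumes Re(a) > 0 so that the integrand decays and truncation to
    [-half_width, half_width] is harmless.
    """
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        u = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-0.5 * a * u * u - 1j * u * xi)
    return total * h


def gaussian_ft_closed_form(a, xi):
    """Closed form sqrt(2*pi/a) * exp(-xi^2 / (2a)), principal square root."""
    return cmath.sqrt(2.0 * math.pi / a) * cmath.exp(-xi * xi / (2.0 * a))


if __name__ == "__main__":
    a = 2.0 * (1.0 + 0.7j)  # plays the role of A = 2(R^t R + i R^t T R) when g = 1
    for xi in (0.0, 0.5, 1.5):
        gap = abs(gaussian_ft_numeric(a, xi) - gaussian_ft_closed_form(a, xi))
        print(xi, gap)
```

For $a = 2(\alpha + i\beta)$ the exponent $-\xi^2/(2a)$ equals $-\tfrac{1}{4}\xi^2(\alpha + i\beta)^{-1}$, which is the one-dimensional shape of the last term of $Q(w_j)$ in (73).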
Recall that w j = wa +wb in the decomposition described in Definition 1.1 and Claim 3.3. Lemma 3.11. S(w j ) ≥ 0, and equality holds only if w j = 0. Proof. Since F is positive definite by construction, both summands in (74) are ≥ 0. Suppose S(wj ) = 0. Then both summands vanish, whence pa = 0 and R t qb = 0. Thus we are reduced to proving: Lemma 3.12. If R t qb = 0, then qb = 0. Proof. By construction, the range of R is preal g M (m j ) = { pν : ν ∈ g} ⊆ Rn . Recall that the symplectic annihilator of g M (m j ) is given by g M (m j )0 = Tm j M . Hence, in view of the identification Tm j M ∼ = Rn ⊕ Rn (and viewing as usual Tx j as a subspace of Tm j M), 0 t n t n 0 ker(R ) = q ∈ R : pν q = 0 ∀ν ∈ g = q ∈ R : ∈ g M (m j ) q 0 0 n n ∈ Tm j M = q ∈ R : ∈ Tx j . = q∈R : q q By definition, wb ∈ (Tx j )⊥ . Thus if R t qb = 0 then 0 ∈ Tx j ∩ (Tx j )⊥ = {0}. qb This completes the proof of Theorem 1.1.
Remark 3.3. Let us now consider the question described in the introduction, i.e. whether quantization commutes with reduction for a transverse compact Legendrian submanifold ⊆ X . For the sake of brevity, we shall make fairly brutal simplifying assumptions, and leave it to the interested reader to work out more general cases. Let us assume that G acts freely on M , and – to fix ideas – that meets every 1 S × G-orbit in X at most once. Thus, =: ∩ X is an (n-g)-dimensional isotropic submanifold, which projects diffeomorphically onto a compact Legendrian submanifold 0 ⊆ X 0 =: X /G ( and 0 actually map down diffeomorphically onto Lagrangian submanifolds in M and M0 , respectively). Also, let us choose as a half-density on the (1/2) Riemannian half-density dens , so that f λ = 1. Let us fix x ∈ , and let x0 ∈ X 0 be its image in X 0 . To simplify, let us also assume that is perpendicular to the G orbit G · x at x, so that – referring to (70) – we have T = 0 and R t R = Ig . In view of Theorem 1.1, we then have: ( (2π )n+g (n−g)/2 π −n u k,0 (x) ∼ k + f k (n−g− f )/2 . 2g Veff π(x) f ≥1
Now there are two natural ways to induce a half-density on ∼ = 0 : One is to (1/2) choose the Riemannian half-density, λ = dens0 , so that f λ = 1. The other is to (1/2) ∼ g associated to the Haar metric. Let λ be divide dens by the half-density on g∗ = the half-density obtained in this manner. By arguments similar to those in Lemma 3.9, −1/2 one can see that f λ (x0 ) = . Veff π(x) H (X ) is the sequence associated to λ , then by CorolIf u ∈ H 0 M0 , L ⊗k ∼ = k 0 k
0
lary 1.1 we have: (2π ) 2 (n−g)/2 k + f k (n− f )/2 . πn n
u k (x) ∼
f ≥1
∼ k (X 0 ) is the sequence associated to λ , If on the other hand u k ∈ H 0 (M0 , L ⊗k 0 ) = H −1/2 . the leading order term gets multiplied by Veff π(x) 4. The Hermitian Products Let us now assume that , ⊆ X are two compact Legendrian submanifolds, and that λ and σ are given smooth half-densities on and , respectively. Let u =: X (δ,λ ), v =: X (δ,σ ). Let as usual be a fixed highest weight of G. We shall study in this section the asymptotics of the Hermitian products (u k, , vk, ) L 2 (X ) as k → +∞. In the action-free case, we shall reproduce expansions similar to those in [BPU], except for some differences due to the fact that we are dealing with half-densities rather than half-forms. To this end, let us first of all recall that we are unitarily and equivariantly identifying functions and half-densities on X . Furthermore, the self-duality pairing < , > and the L 2 -unitary product ( , ) L 2 of smooth half-densities τ = f · dens X and υ = g · dens X ' are related by X f · g dens X = (τ, υ) L 2 = τ, υ. Suppose then that u t , t > 0, is a family of smooth half-densities on X such that u t → u as t → 0, in the topology of the
space of all generalized half-densities whose wave front is conormal to . In view of the self-adjointness of the orthogonal projector k, on H(X )k, , we obtain: = lim u t , vk, L 2 (X ) (u k, , vk, ) L 2 (X ) = lim k, (u t ), vk, ) 2 t→0 t→0 L (X ) = lim u t , vk, = u, vk, t→0 f λ · vk, dens , =
(75)
where dens is the Riemannian density on . 4.1. The transverse case. Consider the smooth map given by group action restricted to , ϒ : (h, g, x) ∈ S 1 × G × → (h, g) · x ∈ X. To fix ideas, suppose first that ϒ is transversal to = ∩ ( ◦ π )−1 (0) . In this case, ϒ −1 ( ) is a finite set: ϒ −1 ( ) = { y˜1 , . . . , y˜r }, where y˜ j = (h j , g j , y j ) for some h j ∈ S 1 , g j ∈ G and y j ∈ . Hence y)j =: ϒ( y˜ j ) = (h j , g j ) · y˜ j ∈ for every j. Now let U j ⊆ be some arbitrarily small neighbourhood of y j . Since vk, = O(k −∞ ) away from , in view of (75) we have r
(u k, , vk, ) L 2 (X ) ∼
j=1 U j
f λ · vk, dens .
(76)
Let us fix Heisenberg local coordinates ( p, q, θ ) for X centered at ) y j and adapted to , defined on an open neighbourhood V j ) y j . Thus, ∩ V j ⊆ V j is defined by conditions θ = f (q) and p = h(q), as described in §2.2. We may arrange, given our assumptions, that * ∂ ∂ T)y j = span ,..., . (77) ∂q1 )y j ∂qn−g )y j The following is left to the reader: Lemma 4.1. Given (77), we have y j ) = span g X ()
∂
∂ pn−g+1
for appropriate tn−g+1 , . . . , tn ∈ T)y j .
) yj
* ∂ + tn−g+1 , . . . , + tn , ∂ pn )y j
(78)
Let us now consider the Legendrian submanifold ) y j ∈ j =: ϒ {(h j , g j )} × ⊆ X, obtained by ‘translating’ by the action of (h j , g j ) ∈ S 1 × G. Given (77) and Lemma 4.1, the present transversality assumption implies: Lemma 4.2. In the above situation, ( p1 , . . . , pn−g ) restrict to local coordinates on j centered at ) y j , and ( p1 , . . . , pn−g , qn−g+1 , . . . , qn ) restrict to local coordinates on j centered at ) yj. Therefore ( p1 , . . . , pn−g , qn−g+1 , . . . , qn ) may be viewed in a natural manner as local coordinates on centered at y j , defined on some open neighbourhood U j ⊆ . In order to apply Theorem 1.1, we need to relate these coordinates on to the local Heisenberg coordinates on X . Given x = (x1 , . . . , xn ), to simplify our notation let us write x = (x , x ), where x = (x1 , . . . , xn−g ), x = (xn−g+1 , . . . , xn ). The following is left to the reader: We have: Lemma 4.3. There exists an R-linear map A j : Rn → Cn ∼ = {0} ⊕ Cn ⊆ R ⊕ Cn such that if y ∈ U j ⊆ has local coordinates Heisenberg coordinates
√1 k
Aj
( p , q ) +
O(k −1 ).
√1 ( p , q ) k
on , then it has local
Let y( √1 ( p , q )) denote the point in U j having local coordinates √1 ( p , q ). By k k Theorem 1.1 and Lemma 4.3, passing to rescaled coordinates on U j we may then write the j th summand in (76) as: −n/2 f λ · vk, dens = k f λ (k −1/2 ( p , q ) vk, k −1/2 A j ( p , q )+ O(k −1 ) Rn
Uj
×D k −1/2 ( p , q ) dp dq .
(79)
Inserting the asymptotic expansion of Theorem 1.1 in (79), we conclude Proposition 4.1. If ϒ : S 1 × G × → X is transversal to , the j th summand in (83) is ( j) ( j) f λ · vk, dens ∼ k −g/2 ρ0 + k −(g+ f )/2 ρ f , Uj
f ≥1
where ( j) ρ0
( (2π )n+g k dim(V ) 1 = h j χ (g −1 )j ) f λ (y j ) f σ ( y)j ) j ) ( y |G π(y j ) | π n 2g −S y j A j ( p ,q ),A j ( p ,q ) −i T+ y j A j ( p ,q ),A j ( p ,q ) · e + dp dq . Rn
In the action-free case, the present transversality assumption means that S 1 × → X is transverse to . For every j = 1, . . . , r , T)y j j ⊆ Tπ()y j ) M is a Lagrangian subspace transversal to T)y j . Thus, in the given Heisenberg local coordinates adapted to at ) yj, we have T)y j j = {( p, Z j p) : p ∈ Rn } ⊆ Tπ()y j ) M ∼ = Rn ⊕ Rn ,
(80)
where Z j is a symmetric matrix. Therefore, the p’s restrict to a system of local coordinates on j (whence on ), and A j ( p) = p + i Z j p. Let ı Jπ()y j ) : Grlag (Tπ()y j ) M) × Grlag (Tπ()y j ) M) → R be the invariant introduced in Sect. 2.4; let us write J j = Jπ()y j ) . Applying the asymptotic expansion of Corollary 1.1, Corollary 4.1. Suppose that the two projections → M and → M are transversal. Let ϒ : S 1 × → be the map induced by the action, and suppose ϒ −1 () = { y˜1 , . . . , y˜r }, where y˜ j = (h j , y j ). Set ) y j =: h j · y j and j =: h j · for every j. Then k − f /2 ρ f , (u k , vk ) ∼ ρ0 + f ≥1
where n r −1 (2π ) 2 k 2 t ρ0 = h j f λ (y j ) f σ ( y)j ) ı J j T)y j j , T)y j e− p +i p Z j p dp. n πn R
j=1
4.2. The clean case. Now we shall make the following more general hypothesis: i) and are both transversal to X ; let us set =: ∩ X , =: ∩ X . ii) The smooth map given by group action restricted to , ϒ : (h, g, x) ∈ S 1 × G × → (h, g) · x ∈ X, meets nicely; by this, we mean that every connected component of ϒ −1 ( ) is a manifold, and that for every ς = (h, g, x) ∈ ϒ −1 ( ) we have Tς ϒ −1 ( ) = (dς ϒ)−1 Tϒ(ς) . iii) there exist integers r , r ≥ 1 such that for every x ∈ and y ∈ one has | ∩ (G · x)| = r and | ∩ (G · y)| = r .
(81)
iv) G acts freely on M . Definition 4.1. Let us set Y˜ =: ϒ −1 ( ) ⊆ S 1 × G × . Let π : S 1 × G × → be the projection onto the third summand, and let us set Y =: π (Y˜ ) ⊆ . Lemma 4.4. Let = ∩ X . Then there exists an open neighbourhood V ⊆ of such that ϒ is immersive on S 1 × G × V . Proof. This follows from the horizontality of and of the G-action on X , and from Corollary 2.1.
Proposition 4.2. Suppose that the hypotheses i), ii) and iii) above are satisfied. Let Y˜1 , . . . , Y˜r ⊆ S 1 × G × be the connected components of Y˜ , and let Y j =: π (Y˜ j ) ⊆ . Then: i) for every j = 1, . . . , r , there exists h j ∈ S 1 such that Y˜ j ⊆ {h j } × G × ; ii) every Y j is a submanifold, and the induced map π j : Y˜ j → Y j is an unramified covering; iii) the Y j ’s, with possible repetitions, are the connected components of Y . j . Proof. i) Suppose that (h, g, x) ∈ Y˜ j for some j, and consider (a, v, w) ∈ T(h,g,x) Y Since ϒ(Y j ) ⊆ and is Legendrian, we conclude that ∂ + d(h,g,x) ϒ(0, v, w) 0 = αϒ(h,g,x) d(h,g,x) ϒ(a, v, w) = αϒ(h,g,x) a ∂θ = a + αϒ(h,g,x) d(h,g,x) ϒ(0, v, w) = a. The latter equality follows from the horizontality of and of the G-action on X . Since Y˜ j is connected, the statement follows. π j is not an immersion, by part i) ii) and iii) Let π j : Y˜ j → be the projection. If there exists (h j , g, x) ∈ Y˜ j and a tangent vector of the form (0, v, 0) ∈ T(h j ,g,x) Y˜ j , for some 0 = v ∈ Tg G. By Lemma 4.4, 0 = d(h j ,g,x) ϒ (0, v, 0) ∈ Tϒ(h j ,g,x) ∩ Tϒ(h j ,g,x) G · ϒ(h j , g, x) , against Corollary 2.1. Suppose now that h ∈ {h 1 , . . . , h r } ⊆ S 1 . Let , Y (h) =: Y˜ j . h j =h
Suppose that y ∈ π (Y (h) ); there are as many inverse images of y in Y (h) as there are group elements g ∈ G such that (h, g) · y = g · (h · y) ∈ ; in other words, (h) −1 (y) = ∩ G · h · y) = d . (82) Y ∩ π On the other hand, in an immersion with compact domain the number of points in a inverse image can only jump up. Therefore, given (82) the cardinality of a fibre has to component Y˜ j of Y (h) . If on the be constant for each map Y˜ j → Y j , for every −1connected other f : X → Y is an immersion, and f (y) is constant for every y ∈ f (X ), then f (X ) is a manifold and the induced map X → f (X ) is an unramified covering. We note in passing that by the same argument, and given the symmetry of our hypothesis on and , we also have: Proposition 4.3. For every j = 1, . . . , r , let j = ϒ(Y˜ j ). Then the j ’s are disjoint manifolds, and are the connected components of ϒ(Y˜ ). The induced map Y˜ → ϒ(Y˜ ) is an unramified covering.
Definition 4.2. For every j = 1, . . . , r , set c j =: n − dim(Y j ) and let d j be the degree of the unramified cover π j : Y˜ j → Y j . Thus, for every y ∈ Y j there exist distinct s1 j (y), . . . , sd j , j (y) ∈ G such that (h j , si j (y), y) ∈ Y˜ j , and therefore (h j , si j (y)) · y ∈ , i = 1, . . . , d j . Locally on Y j near y we may think of si j ’s as G-valued smooth maps. The si j ’s are not globally well-defined as smooth maps Y j → G; nonetheless, collectively they do define a smooth map from Y j to the appropriate symmetrized product of G. Now let U j ⊆ be some arbitrarily small tubular neighbourhood of the submanifold Y j ⊆ . Since vk, = O(k −∞ ) away from , in view of (75) we have (u k, , vk, ) L 2 (X ) ∼
r j=1
Uj
f λ · vk, dens .
(83)
Remark 4.1. Since the Y j ’s are not necessarily all distinct, (83) is not literally true. However, to avoid making our exposition too heavy, we shall be slightly vague on this; we shall thus act as the Y j were all disjoint. In the following computations, each summand in (83) will split as the sum of various other contributions, and we shall not sum the same contribution twice. Suppose 1 ≤ j ≤ r . For every y ∈ Y j , we may find an open neighbourhood d j ˜ y ∈ S ⊆ Y j which is uniformly covered by π˜ j , meaning that π˜ −1 j (S) = i=1 Si ⊆ Y˜ j , a disjoint union where each S˜i projects diffeomorphically onto S under π˜ j , and (h j , si j (y), y) ∈ S˜i . Perhaps after restricting S, by Lemma 4.4 we may further assume that for each i the map induced by ϒ, (h j , si j (y), y) → (h j , si j (y)) · y is a diffeomorphism onto its image, Si =: ϒ( S˜i ) ⊆ . S˜i ∼ =)
(84)
We may then find a finite open cover {S ja }a∈A of Y j with the following properties: i) each S ja is the domain of a coordinate chart, say R ja = (r1 , . . . , rn−c j ) : S ja → Bn−c j (0, ) ⊆ Rn−c j , for some > 0; d j ˜ ii) each S ja is uniformly covered by π˜ j , and π˜ −1 j (S ja ) = i=1 Si ja is a disjoint union, where for each i, a, (85) S˜i ja =: (h j , si j (y), y) : y ∈ S ja ⊆ Y˜ j ; Si ja =: ϒ( S˜i ja ) ⊆ for every i, a; iii) ϒ induces a diffeomorphism S˜i ja ∼ =) iv) for every i, a there exist an open neighbourhood Ti ja ⊆ X of ) Si ja , and a smooth ) Si ja the map κ = κi ja : Si ja × Ti ja → B2n+1 (0, ), such that for every y ∈ ) partial function κ y : Ti ja → B2n+1 (0, ) is a Heisenberg chart adapted to at y (Lemma 2.3).
Now recall that for every j we have fixed a tubular neighbourhood U j ⊆ of Y j ; let p j : U j → Y j be the projection, and set U ja =: p −1 U j . Thus, {U ja }a∈A j (S ja ) ⊆ is a finite open cover of U j . By introducing a partition of unity a ϕ ja = 1, we may decompose the j th summand in (83) as f λ · vk, dens = ϕ ja f λ · vk, dens . (86) Uj
U ja
a
We are thus reduced to considering the asymptotics of each summand in (86). Given (85) we may the apply a relative version of the argument in §4.1; rescaling will now be in the coordinates in U ja which are transversal to Y j . We now leave it to the reader to verify that, using the local coordinates R ja = (r1 , . . . , rn−c j ) on S ja , one obtains an asymptotic expansion ( ja) (n−g−c j )/2 ϕ ja f λ · vk, dens ∼ k ρ0 (r ) dr U ja
+
Rn−c j
( ja)
k (n−g−c j − f )/2 ρ f
,
(87)
f ≥1
where, in view of the asymptotic expansion of Theorem 1.1, ( 1 (2π )n+g k dim(V ) ( ja) hj χ sl j (y(r )) ρ0 (r ) = n g Veff [π(y(r ))] π 2 l e−Sr (z)−i Pr (z) dz, · ϕ ja y(r ) f λ y(r ) f σ h j · sl j (y(r )) Rc j
for quadratic forms Sr , Pr on Rc j , with Sr positive definite. In the action-free case, this becomes: ( ja) ϕ ja f λ · vk dens ∼ k (n−c j )/2 ρ0 (r ) dr U ja
+
Rn−c j
( ja)
k (n−c j − f )/2 ρ f
,
(88)
f ≥1
where n
( ja) ρ0 (r )
dim(V ) (2π ) 2 k = hj Veff [π(y(r ))] π n · ϕ ja y(r ) f λ y(r ) f σ h j · y(r ) T ja y(r ) ,
T ja y(r ) =: ı Jh j ·y(r ) Th j ·y(r ) · j , Th j ·y(r )
−1
·
R
cj
e
− p2 +i p t Z h j ·y( p) p
dp,
Z r being an appropriate c j × c j symmetric matrix. Acknowledgements. We are very grateful to the referee for suggesting various improvements in presentation, and to Steve Zelditch for some interesting remarks.
References

[BW] Bates, S., Weinstein, A.: Lectures on the geometry of quantization. Berkeley Mathematics Lecture Notes 8, Providence, RI: AMS, 1997
[BSZ] Bleher, P., Shiffman, B., Zelditch, S.: Universality and scaling of correlations between zeros on complex manifolds. Invent. Math. 142, 351–395 (2000)
[BPU] Borthwick, D., Paul, T., Uribe, A.: Legendrian distributions with applications to relative Poincaré series. Invent. Math. 122, no. 2, 359–402 (1995)
[BS] Boutet de Monvel, L., Sjöstrand, J.: Sur la singularité des noyaux de Bergman et de Szegö. Astérisque 34–35, 123–164 (1976)
[BG] Burns, D., Guillemin, V.: Potential functions and actions of tori on Kähler manifolds. Comm. Anal. Geom. 12, no. 1–2, 281–303 (2004)
[DI] Dixmier, J.: Les C*-algèbres et leurs représentations. Paris: Gauthier-Villars, 1964
[DU] Duistermaat, J.J.: Fourier integral operators. Boston: Birkhäuser, 1996
[GE] Geiges, H.: Contact Geometry. In: Handbook of Differential Geometry 2, F.J.E. Dillen, L.C.A. Verstraelen, eds. Amsterdam: North Holland, 2006, pp. 325–382
[GT] Gorodentsev, A.L., Tyurin, A.N.: Abelian Lagrangian algebraic geometry. Izv. Ross. Akad. Nauk Ser. Mat. 65:3, 15–50 (2001); English transl., Izv. Math. 65, 437–467 (2001)
[GGK] Guillemin, V., Ginzburg, V., Karshon, Y.: Moment maps, cobordism, and Hamiltonian group actions. Mathematical Surveys and Monographs 98, Providence, RI: AMS, 2002
[GP] Guillemin, V., Pollack, A.: Differential topology. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1974
[GS1] Guillemin, V., Sternberg, S.: Geometric quantization and multiplicities of group representations. Invent. Math. 67, 515–538 (1982)
[GS2] Guillemin, V., Sternberg, S.: Homogeneous quantization and multiplicities of group representations. J. Func. Anal. 47, 344–380 (1982)
[GS3] Guillemin, V., Sternberg, S.: The Gelfand-Cetlin system and quantization of the complex flag manifold. J. Func. Anal. 52, 106–128 (1983)
[H] Hörmander, L.: The analysis of partial differential operators I. Berlin-Heidelberg-New York: Springer-Verlag, 1990
[K] Kostant, B.: Quantization and unitary representations. I. Prequantization. Lectures in modern analysis and applications, III (1965), Lecture Notes in Math., Vol. 170, Springer, Berlin, 1970, pp. 87–208
[P1] Paoletti, R.: Moment maps and equivariant Szegö kernels. J. Symplectic Geom. 2, no. 1, 133–175 (2003)
[P2] Paoletti, R.: The Szegö kernel of a symplectic quotient. Adv. Math. 197, 523–553 (2005)
[STZ] Shiffman, B., Tate, T., Zelditch, S.: Distribution laws for integrable eigenfunctions. Ann. Inst. Fourier (Grenoble) 54, no. 5, 1497–1546 (2004)
[SZ] Shiffman, B., Zelditch, S.: Asymptotics of almost holomorphic sections of ample line bundles on symplectic manifolds. J. Reine Angew. Math. 544, 181–222 (2002)
[W] Weinstein, A.: Connections of Berry and Hannay type for moving Lagrangian submanifolds. Adv. Math. 82, 133–159 (1990)
[Z] Zelditch, S.: Szegö kernels and a theorem of Tian. Int. Math. Res. Not. 6, 317–331 (1998)

Communicated by M.R. Douglas
Commun. Math. Phys. 267, 265–277 (2006)
Digital Object Identifier (DOI) 10.1007/s00220-006-0025-1
Communications in Mathematical Physics

Self-Organized Forest-Fires near the Critical Time

J. van den Berg, R. Brouwer
CWI, Kruislaan 413, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands.
E-mail: [email protected]; [email protected]

Received: 29 October 2005 / Accepted: 6 December 2005
Published online: 22 April 2006 – © Springer-Verlag 2006
Part of vdB's research is supported by BRICKS project AFM 2.2.

Abstract: We consider a forest-fire model which, somewhat informally, is described as follows: Each site (vertex) of the square lattice is either vacant or occupied by a tree. Vacant sites become occupied at rate 1. Further, each site is hit by lightning at rate $\lambda$. This lightning instantaneously destroys (makes vacant) the occupied cluster of the site. This model is closely related to the Drossel-Schwabl forest-fire model, which has received much attention in the physics literature. The most interesting behaviour seems to occur when the lightning rate goes to zero. In the physics literature it is believed that then the system has so-called self-organized critical behaviour. We let the system start with all sites vacant and study, for positive but small $\lambda$, the behaviour near the 'critical time' $t_c$, defined by the relation $1 - \exp(-t_c) = p_c$, the critical probability for site percolation. Intuitively one might expect that if, for fixed $t > t_c$, we let simultaneously $\lambda$ tend to 0 and $m$ to $\infty$, the probability that some tree at distance smaller than $m$ from $O$ is burnt before time $t$ goes to 1. However, we show that under a percolation-like assumption (which we can not prove but believe to be true) this intuition is false. We compare with the case where the square lattice is replaced by the directed binary tree, and pose some natural open problems.

1. Introduction

1.1. Background and motivation. Consider the following, informally described, forest-fire model. (A precise description follows later in this section.) Each site of the lattice $\mathbb{Z}^d$ is either vacant or occupied by a tree. Vacant sites become occupied at rate 1, independently of anything else. Further, sites are hit by lightning at rate $\lambda$, the parameter of the model. When a site is hit by lightning, its entire occupied cluster instantaneously burns down (that is, becomes vacant).
This is a continuous-time version of the Drossel-Schwabl model which has received much attention in the physics literature. See e.g. [1, 3, 5, 9] and sections in the book by Jensen [7]. For comparison with real forest-fires see [8]. The most interesting questions are related to the asymptotic behaviour when the lightning rate tends to 0. It is believed that this behaviour resembles that of ‘ordinary’ statistical mechanics systems at criticality. In particular, it is believed that, asymptotically, the cluster size distribution has a power-law behaviour. Heuristic results confirming such behaviour have been given in the literature, but the validity of some of these results is debatable (see [5]) and almost nothing is known rigorously (except for the one-dimensional case). Our goal is more modest, and we address some basic problems which, surprisingly, have so far been practically ignored, although their solution is crucial for a beginning of rigorous understanding of these models. We restrict to the 2-dimensional case. That is, the forest is represented by the square lattice. It seems to be taken for granted in the literature that, informally speaking, as we let λ tend to 0, the steady-state probability that a given site, say the origin O, is vacant stays away from 0. But is this really obvious? (Even, is it true?) The intuitive reasoning seems to be roughly as follows: “If the limit of the probability to be occupied would be 1, then the system would have an ‘infinite occupied cluster’. But that cluster would be immediately destroyed, bringing the occupation density away from 1: contradiction”. Of course this reasoning is, mildly speaking, quite shaky and we believe that a rigorous solution of this problem is necessary for a clear understanding of the forest-fire model. The problems investigated in this paper are, although not the same as the one just described, of the same spirit. 
Instead of looking at the steady-state distribution, we start with all sites vacant and look at the time tc at which, in the modified model where there is only growth but no ignition, an infinite cluster starts to form. Intuitive reasoning similar to that above makes plausible that, informally speaking, for every t > tc , the probability that O burns before time t stays away from 0 as λ tends to 0. Continuing such intuitive reasoning then leads to the ‘conclusion’ that, again informally speaking, if we take m sufficiently large and replace the above event by the event {Some vertex at distance ≤ m from O burns before time t}, the corresponding probability will be, as λ tends to 0, as close to 1 as we want. We relate this to problems which are closer to ordinary percolation. In particular we show that, under a percolation-like assumption (which we believe to be true), the above ‘conclusion’ is false. We hope our results will lead to further research and clarification of the above problems.
1.2. Formal statement of the problems. So far, we have not defined our model precisely yet. We now give this more precise description, formulate some of the above mentioned problems more formally, and introduce much of the terminology used in the rest of this paper. We work on the square lattice, i.e. the graph of which the set of sites (vertices) is Z2 , and where two vertices (i, j) and (k, l) share an edge if |i − k| + | j − l| = 1. To each site we assign two Poisson clocks: one (which we call the ‘growth clock’) having rate 1, and the other (the ‘ignition clock’) having rate λ. All Poisson clocks behave independently of each other. A site can be occupied by a tree or vacant. These states are denoted by 1 and 0 respectively. Initially all sites are vacant. We restrict ourselves to a finite box B(n) := [−n, n]2 . (In our theorems we consider the behaviour as n → ∞). The dynamics is as follows: when the growth clock of a site v rings, that site becomes
Self-Organized Forest-Fires near the Critical Time
267
occupied (unless it already was occupied, in which case the clock is ignored); when the ignition clock of a site $v$ rings, each site that has an occupied path in $B(n)$ to $v$ becomes vacant instantaneously. (Note that this means that if $v$ was already vacant, nothing happens.) Now let $\eta_v^n(t) = \eta_v(t) \in \{0, 1\}$ denote the state of site $v$ at time $t$, and define $\eta(t) = \eta^n(t) := (\eta_v^n(t), v \in B(n))$. Note that, for each $n$, $(\eta^n(t), t \ge 0)$ is a finite-state (continuous-time) irreducible Markov chain with state space $\{0, 1\}^{B(n)}$. The assignment of Poisson clocks to every site of the square lattice provides a natural coupling of the processes $\eta^n(\cdot)$, $n \ge 1$, with each other, and with other processes (see below). For $m \le n$, we often use the informal phrase “$\eta^n$ has a fire in $B(m)$ before time $t$” for the event $\{\exists v \in B(m)$ and $\exists s \le t$ such that $\eta_v^n(s^-) = 1$ and $\eta_v^n(s) = 0\}$. Similarly, we use “$\eta^n$ has at least two fires in $B(m)$ before time $t$” for the event $\{\exists v, w \in B(m)$ and $\exists s < u \le t$ s.t. $\eta_v^n(s^-) = \eta_w^n(u^-) = 1$ and $\eta_v^n(s) = \eta_w^n(u) = 0\}$. Note that we allow $v$ and $w$ to be equal. Let $P_\lambda$ be the measure that governs all the underlying Poisson processes mentioned above (and hence, for all $n$ simultaneously, the processes $\eta^n(\cdot)$). Often, when there is no need to explicitly indicate the dependence on $\lambda$, or when we consider events involving the growth clocks only, we will omit this subscript. It is trivial that for all times $t$ and all $n$, $m$ the probability that $\eta^n$ has a fire in $B(m)$ before time $t$ goes to 0 as $\lambda \downarrow 0$, and hence
\[ \lim_{n \to \infty} \lim_{\lambda \downarrow 0} P_\lambda(\eta^n \text{ has a fire in } B(m) \text{ before time } t) = 0. \]
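The dynamics just described can be simulated directly on a small box. The following is a minimal sketch, not part of the original paper: the function name `simulate_forest_fire` and its parameters are hypothetical, and the independent clocks are merged into a single Poisson stream of rate $|B(n)|(1 + \lambda)$, each ring being a growth ring with probability $1/(1+\lambda)$ at a uniformly chosen site.

```python
import random
from collections import deque


def simulate_forest_fire(n, lam, t_max, seed=0):
    """Simulate the process on B(n) = [-n, n]^2 up to time t_max.

    Hypothetical helper (not from the paper). Returns the final set of
    occupied sites and the list of fires as (time, burnt_cluster) pairs.
    """
    rng = random.Random(seed)
    sites = [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)]
    occupied = set()          # initially all sites are vacant
    fires = []
    total_rate = len(sites) * (1.0 + lam)
    t = 0.0
    while True:
        # Superposition of all clocks: exponential waiting time to next ring.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break
        v = rng.choice(sites)
        if rng.random() < 1.0 / (1.0 + lam):
            # Growth clock: the site becomes occupied (ignored if it already is).
            occupied.add(v)
        elif v in occupied:
            # Ignition clock at an occupied site: burn its whole occupied cluster.
            cluster = {v}
            queue = deque([v])
            while queue:
                x, y = queue.popleft()
                for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if w in occupied and w not in cluster:
                        cluster.add(w)
                        queue.append(w)
            occupied -= cluster
            fires.append((t, cluster))
        # Ignition at a vacant site: nothing happens.
    return occupied, fires


if __name__ == "__main__":
    occupied, fires = simulate_forest_fire(n=10, lam=0.01, t_max=1.5)
    print(len(occupied), "occupied sites,", len(fires), "fires")
```

Such a simulation only illustrates the finite-volume process for fixed $n$ and $\lambda$; it says nothing by itself about the double limits considered in the text.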
A much more natural (and difficult!) question is what happens when we reverse the order of the limits. For the investigation of such questions it turns out to be very useful to consider the modified process $\sigma(t)$ on the infinite lattice, which we obtain, loosely speaking, if we obey the above mentioned growth clocks but ignore the ignition clocks: $\sigma_v(t) = I_{\{\text{the growth clock at } v \text{ rings in } [0,t]\}}$, where $I$ denotes the indicator function. It is clear that, for each time $t$, the $\sigma_v(t)$, $v \in \mathbb{Z}^2$, are Bernoulli random variables with parameter $1 - \exp(-t)$. So, if we define $t_c$ by the relation $p_c = 1 - \exp(-t_c)$, where $p_c$ is the critical value for ordinary site percolation on the square lattice, we see that $\sigma(t)$ has no infinite occupied cluster for $t \le t_c$ but does have an infinite cluster for $t > t_c$. To illustrate the usefulness of comparison of $\eta$ with $\sigma$ (and as introduction to more subtle comparison arguments), we show that
\[ \limsup_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\eta^n \text{ has a fire in } O \text{ before time } t) \le \theta(1 - e^{-t}), \tag{1} \]
where $\theta(\cdot)$ is the percolation function for ordinary site percolation. The argument is as follows: Let $C_t(O)$ denote the occupied cluster of $O$ in the configuration $\sigma(t)$. It is easy to see from the process descriptions above that in order to have, for the process $\eta^n$, a fire in $O$ before time $t$, it is necessary (but not sufficient) that at least one of the ignition clocks in the set $C_t(O)$ has rung before time $t$. Using the independence of the different Poisson clocks, we have
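\[ \begin{aligned} P_\lambda(\eta^n \text{ has a fire in } O \text{ before time } t) &\le \sum_{k=1}^{\infty} P_\lambda\bigl(|C_t(O)| = k \text{ and } \exists v \in C_t(O) \text{ that has ignition before time } t\bigr) + P(|C_t(O)| = \infty) \\ &= \sum_{k=1}^{\infty} P(|C_t(O)| = k)\,(1 - e^{-\lambda t k}) + \theta(1 - e^{-t}). \end{aligned} \]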
Note that, in the r.h.s. above, the first term does not depend on $n$ and, as $\lambda \to 0$, clearly goes to 0 (by bounded convergence). The desired result follows. In particular, we have for each $m$ and each $t \le t_c$,
\[ \lim_{\lambda \downarrow 0} \lim_{n \to \infty} P_\lambda(\eta^n \text{ has a fire in } B(m) \text{ before time } t) \le |B(m)|\,\theta(1 - e^{-t}) = 0, \tag{2} \]
where $|B(m)|$ denotes the number of sites in $B(m)$. But what happens right after $t_c$? Intuitively one might argue as follows: “If the l.h.s. of (2) is 0 for some $t > t_c$, then roughly speaking, the system at time $t$ looks as in ordinary percolation with parameter $1 - \exp(-t)$, so that an infinite occupied cluster has built up, and this cluster intersects $B(m)$ with positive probability. But an infinite cluster has an infinite total ignition rate and hence catches fire immediately: contradiction. Hence for each $t > t_c$ the l.h.s. of (2) is strictly positive.” As we said before, such reasoning is very shaky. Its conclusion is correct for the directed binary tree (see Lemma 4.5). We have some inclination to believe that the conclusion also holds for the square lattice, but prefer to formulate this as an open problem, rather than a conjecture:
Open Problem 1.1. Is, for all $t > t_c$,
\[ \limsup_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\eta^n \text{ has a fire in } O \text{ before time } t) > 0 \,? \tag{3} \]
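For orientation, the critical time $t_c$ appearing in the problem above is fixed by the relation $p_c = 1 - \exp(-t_c)$. The sketch below evaluates it numerically. The value used for $p_c$ is an assumption: it is the commonly quoted numerical estimate for site percolation on the square lattice, which is not stated in the paper and is not known in closed form. The helper names are hypothetical.

```python
import math

# Assumption: widely quoted numerical estimate of p_c for site percolation
# on the square lattice (not given in the paper).
P_C_SITE_SQUARE = 0.592746


def density_at(t):
    """Density of occupied sites in the pure growth process sigma(t)."""
    return 1.0 - math.exp(-t)


def critical_time(p_c=P_C_SITE_SQUARE):
    """Solve p_c = 1 - exp(-t_c) for t_c."""
    return -math.log(1.0 - p_c)


if __name__ == "__main__":
    print("t_c on the square lattice (approx.):", critical_time())
    # On the binary tree p_c = 1/2, which gives t_c = log 2.
    print("t_c on the binary tree:", critical_time(0.5))
```

The same function with $p_c = 1/2$ reproduces the threshold $\log 2$ used for the binary tree in Sect. 4.2.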
Believing the answer to the above problem is affirmative, it is intuitively very tempting to go further and ‘conclude’ that also the answer to the following problem is affirmative:
Open Problem 1.2. Is it true that for all $t > t_c$ and each $\varepsilon > 0$ there exists an $m$ such that
\[ \limsup_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\eta^n \text{ has a fire in } B(m) \text{ before time } t) > 1 - \varepsilon \,? \tag{4} \]
The intuitive (and again shaky) reasoning here is, roughly speaking, that if the answer to Problem 1.1 is affirmative, there will be a positive density of sites that burn before time t, and hence the probability of having such a site in B(m) will tend to 1 as m → ∞. Our main result, Theorem 2.2, indicates that the behaviour of the process may be considerably different from what the above intuition suggests. At this point, one could wonder whether it is really necessary to first restrict to finite n, so that we have the annoying ‘extra’ limit n → ∞ in our theorems and problem formulations: is, for each λ > 0, the model well-defined on the infinite lattice? For sufficiently large λ one can easily see that this is true. (Using domination by suitable
Bernoulli processes one can, for such $\lambda$, make a standard graphical construction.) M. Dürre (see [4]) has recently shown, by more abstract means, that an infinite-volume forest-fire process exists for every $\lambda > 0$, but for small $\lambda$ the uniqueness of such process is still open. In Sect. 4 we consider a slightly modified process that is obviously, by a graphical construction, well-defined on the infinite lattice. In this modified process occupied clusters with size larger than or equal to $L$ (the parameter of the model) are instantaneously removed. For that model we have results very similar to those for the original one. In this paper we will assume knowledge of some ‘classical’ results in 2-dimensional percolation, in particular the standard RSW-type results (see [6], Chap. 11).
2. Statement of the Main Results
2.1. A percolation-like critical value. In this subsection we define a percolation-like critical value, denoted by $\hat\delta_c$, which plays a major role in the statement of our main results. First some notational remarks. Recall that $p_c$ denotes the critical probability for site percolation on the square lattice. The product measure with density $p$ will be denoted by $P_p$. The event that there is an occupied path from a set $V$ to a set $W$ is denoted by $\{V \leftrightarrow W\}$. Let $n$ be a positive integer, and consider the box $B = [0, 4n] \times [0, 3n]$. By the boundary of $B$, denoted by $\partial B$, we mean the set of those sites in $B$ that have a neighbour in the complement of $B$. We are now ready to define $\hat\delta_c$. Let $\delta \in [0, 1]$. Suppose the sites of $B$ are, independently of each other, occupied with probability $p_c$ and vacant with probability $1 - p_c$. Next, informally, we destroy the occupied cluster of the boundary. That is, each vertex in $B$ that initially had an occupied path to the boundary of $B$ is made vacant.
Finally, in the resulting configuration, each vacant site (that is, each site that initially was vacant, or that was initially occupied but made vacant by the above destruction step) is, independently, made occupied with probability $\delta$. It is straightforward to see that in the final configuration a site $v \in B$ is occupied with probability
\[ p_c - P_{p_c}(v \leftrightarrow \partial B) + \bigl(1 - p_c + P_{p_c}(v \leftrightarrow \partial B)\bigr)\,\delta. \]
If we let $n$ grow and choose $v$ further and further away from $\partial B$, this clearly converges to $p_c + (1 - p_c)\delta$. Although this is larger than $p_c$, the final configuration has complicated spatial dependencies and therefore it is not clear whether, in the bulk, it is ‘essentially supercritical’. In particular, let $A$ be the box $[n, 3n] \times [n, 2n]$, and consider the probability $p_n(\delta)$ that the final configuration has an occupied vertical crossing of $A$. (As is well-known, in ordinary supercritical percolation the probability of such event goes to 1 as $n \to \infty$.) It is clear that $p_n(\delta)$ is increasing in $\delta$, and we define
\[ \hat\delta_c = \sup\{\delta : p_n(\delta) \text{ is bounded away from 1, uniformly in } n\}. \tag{5} \]
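The three-step construction behind $p_n(\delta)$ (occupy, destroy the boundary cluster, re-occupy with probability $\delta$) is easy to experiment with. The following Monte Carlo sketch is an illustration only, not the simulation code referred to in the paper: all function names are hypothetical, and the default value of $p$ is the usual numerical estimate of $p_c$, which is an assumption here.

```python
import random
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def flood(starts, allowed):
    """All sites of `allowed` reachable from `starts` by nearest-neighbour steps."""
    seen = {s for s in starts if s in allowed}
    queue = deque(seen)
    while queue:
        x, y = queue.popleft()
        for dx, dy in NEIGHBOURS:
            w = (x + dx, y + dy)
            if w in allowed and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def final_configuration(n, p, delta, rng):
    """Occupied set in B = [0,4n] x [0,3n] after destruction and re-occupation."""
    width, height = 4 * n, 3 * n
    box = [(x, y) for x in range(width + 1) for y in range(height + 1)]
    occupied = {v for v in box if rng.random() < p}
    boundary = [(x, y) for (x, y) in box if x in (0, width) or y in (0, height)]
    # Destroy every occupied site that has an occupied path to the boundary.
    occupied -= flood(boundary, occupied)
    # Each vacant site independently becomes occupied with probability delta.
    occupied |= {v for v in box if v not in occupied and rng.random() < delta}
    return occupied


def has_vertical_crossing(occupied, n):
    """Occupied path inside A = [n,3n] x [n,2n] from its bottom to its top side."""
    inside = {(x, y) for (x, y) in occupied if n <= x <= 3 * n and n <= y <= 2 * n}
    bottom = [(x, n) for x in range(n, 3 * n + 1)]
    return any(y == 2 * n for (_, y) in flood(bottom, inside))


def estimate_p_n(n, delta, p=0.592746, trials=200, seed=0):
    """Monte Carlo estimate of p_n(delta); p defaults to an assumed estimate of p_c."""
    rng = random.Random(seed)
    hits = sum(
        has_vertical_crossing(final_configuration(n, p, delta, rng), n)
        for _ in range(trials)
    )
    return hits / trials
```

For example, `estimate_p_n(20, 0.05)` gives a rough estimate for a small box. As the text cautions, box sizes reachable this way are limited, so such numbers should be interpreted with care.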
Conjecture 2.1. $\hat\delta_c > 0$.
In spite of serious attempts no proof (or disproof) of this conjecture has been found yet. It is supported by simulation results but, since the box size our simulations could handle is limited, one has to be careful with interpreting such results. Conjecture 2.1 is very similar in nature to, and ‘somewhat’ weaker than (see the discussion below), Conjecture 3.2 in [2]. There we proved, among other results, that
assumption of that conjecture yields, informally speaking, the non-existence of a process on the square lattice, starting with all sites vacant, where (as in our model) vacant sites always become occupied at rate 1, and where infinite occupied clusters instantaneously become vacant. Such a non-existence result, although theoretically interesting, looks somewhat esoteric. In the present paper we show that the conjecture also has remarkable consequences for the ‘concrete’ and natural forest-fire models $\eta(\cdot)$. Conjecture 2.1 is weaker than the above mentioned conjecture in [2], in the sense that we can prove that the correctness of the latter implies that of the former but we don’t know how to prove the reverse implication. Since the weaker form is sufficient for our purposes (here as well as in [2]), we decided to present that form.
2.2. The main results. Recall the definition of $\hat\delta_c$ in (5). We are now ready to state our main results:
Theorem 2.2. If $\hat\delta_c > 0$, there exists a $t > t_c$ such that for all $m$,
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P_\lambda(\eta^n \text{ has a fire in } B(m) \text{ before time } t) \le 1/2. \tag{6} \]
The key to Theorem 2.2 is the following proposition (which is also interesting in itself):
Proposition 2.3. If $\hat\delta_c > 0$, there exists a $t > t_c$ such that for all $m$,
\[ \lim_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\eta^n \text{ has at least 2 fires in } B(m) \text{ before time } t) = 0. \tag{7} \]
The proofs of the above proposition and theorem are given in Sect. 3.
3. Proofs
The proof of our main theorem (Theorem 2.2) depends heavily on Proposition 2.3. For the proof of the proposition we need two auxiliary models. One of these, the ‘pure growth’ model $\sigma(t)$, was already introduced in Section 1. The other, which has the same growth mechanism but where removal of trees takes place at time $t_c$ only, is described below.
3.1. Removal at $t_c$ only. Let $I$ denote the set of all positive even integers $i$ and consider the annuli $A_i := B(5 \cdot 3^i) \setminus B(3^i)$, $i \in I$. Note that these annuli are pairwise disjoint. In the process we are going to describe, again every site can be vacant (have value 0) or occupied (value 1). By a ‘surrounding $i$ cluster’ we will mean an occupied circuit $C$ around $O$ in the annulus $A_i$, together with all occupied paths in $A_i$ that contain a site in $C$. The process is completely determined by the Poisson growth clocks introduced in Sect. 1, in the following way. Initially each site is vacant. Whenever the growth clock of a site rings, the site becomes occupied. (When it already is occupied, the clock is ignored.) Destruction ($1 \to 0$ transitions) only takes place at time $t_c$: at that time, for each positive even integer $i$, each ‘surrounding $i$ cluster’ is instantaneously made vacant. After $t_c$ the growth mechanism proceeds as before. Let $\xi_v(t)$ denote the value of site $v$ at time $t$. Earlier in this paper we mentioned an obvious but useful relation (comparison) between the pure growth process $\sigma(\cdot)$ and the forest-fire process $\eta(\cdot)$. There is also a
useful relation between $\xi(\cdot)$ and $\eta(\cdot)$, but its statement and proof are less straightforward (see Lemma 3.2 in Sect. 3.2). Another lemma involving the process $\xi(\cdot)$ that will be important for us is the following.
Lemma 3.1. If $\hat\delta_c > 0$ there exist $\gamma < 1$ and $\varepsilon > 0$ such that for all $i \in I$,
\[ P\bigl(\partial B(3^i) \leftrightarrow \partial B(3 \cdot 3^i) \text{ in the configuration } \xi(t_c + \varepsilon)\bigr) < \gamma. \tag{8} \]
The proof of this lemma is very similar to that of Lemma 3.4 of [2]. (The $p_n$’s we defined a few lines before (5) differ from the ‘corresponding’ $a_n$’s in [2], but the modifications in the proof arising from this difference are straightforward.)
3.2. Proof of Proposition 2.3. Fix $m$. Since the probability in the statement of the proposition is monotone in $m$, we may assume that $m$ is of the form $3^l$ for some even positive integer $l$. (So each annulus $A_i$, $i \in I$, defined in the previous subsection, is either contained in $B(m)$ or disjoint from $B(m)$.) Let $\tau = \tau(n, m)$ be the first time that $\eta^n$ has a fire in $B(m)$; more precisely, $\tau := \inf\{t : \exists v \in B(m) \text{ s.t. } \eta_v^n(t) = 0 \text{ and } \eta_v^n(t^-) = 1\}$. Next, define, for $1 > \lambda > 0$,
\[ K(\lambda) := \frac{1}{\sqrt[3]{\lambda}}, \qquad k(\lambda) := \frac{1}{\sqrt[4]{\lambda}}, \qquad A(k(\lambda), K(\lambda)) := B(K(\lambda)) \setminus B(k(\lambda)). \tag{9} \]
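As a quick check of why these two scales are convenient, the following worked estimate (our own sketch; the constant 9 is a crude bound that uses $K(\lambda) \ge 1$ for $\lambda < 1$, and is not taken from the paper) shows that the total ignition rate in $B(K(\lambda))$ vanishes while the ratio of the two radii diverges:

```latex
\[
\lambda\,|B(K(\lambda))| \;\le\; \lambda\,\bigl(2K(\lambda)+1\bigr)^2
\;\le\; 9\,\lambda\,K(\lambda)^2 \;=\; 9\,\lambda^{1/3} \;\longrightarrow\; 0,
\qquad
\frac{K(\lambda)}{k(\lambda)} \;=\; \lambda^{-1/3+1/4} \;=\; \lambda^{-1/12} \;\longrightarrow\; \infty
\qquad (\lambda \downarrow 0).
\]
```

The first limit is what makes ignitions inside $B(K(\lambda))$ unlikely on a bounded time interval; the second is what allows a vacant *-circuit to be found in the annulus $A(k(\lambda), K(\lambda))$ with high probability at criticality.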
Further, define the following events: $B_1 = B_1(\lambda) := \{$no ignitions in $B(K(\lambda))$ before or at time $\tau\}$, $B_2 = B_2(\lambda) := \{\sigma(t_c)$ has a vacant *-circuit in $A(k(\lambda), K(\lambda))\}$, where by ‘*-circuit’ we mean a circuit (surrounding $O$) in the matching lattice (i.e. the lattice obtained from the square lattice by adding the two ‘diagonal edges’ in each face of the square lattice). We will use the following relation between the forest-fire process $\eta(\cdot)$ and the auxiliary process $\xi(\cdot)$ described in the previous subsection.
Lemma 3.2. Let $\lambda \in (0, 1)$. On $B_1 \cap B_2$ we have, for all $t > \tau$, all $v \in B(k(\lambda)) \setminus B(m)$ and all $n$, that
\[ \eta_v^n(t) \le \xi_v(t). \tag{10} \]
Proof. (of Lemma 3.2). Suppose B1 ∩ B2 holds. Take n, t and v as in the statement of the lemma. Obviously, we may assume that k(λ) > m. To simplify notation we will, during the proof of this lemma, omit the superscript n from η, and the argument λ from k and K . Suppose ξv (t) = 0. We have to show that then also ηv (t) = 0. Since ξv (t) = 0, the growth clock of v does not ring in the interval (tc , t], and we may assume that just before tc the occupied ξ cluster of v surrounds B(m). (Otherwise the desired conclusion follows trivially.) From the definitions of the processes it then follows that at time tc
the occupied σ cluster of v, which we will denote by C, surrounds B(m). By B2 we have that C is in the interior of a vacant (that is, having σ (tc ) = 0) *-circuit in A(k, K ). Clearly, η ≡ 0 on this circuit during the time interval (0, tc ], which prevents fires starting in its exterior to reach its interior. From this, and the event B1 , we conclude that τ > tc , and that η(tc ) and σ (tc ) agree in the interior of this circuit. In particular, the occupied η cluster of v at time tc equals the above mentioned set C. From B1 it follows that at time τ a connected set is burnt which contains sites in B(m) as well as in the complement of B(K ). But then it also contains a site in C (because C surrounds B(m) and lies inside B(K )). So the whole set C, and in particular v, burns at some time s ∈ (tc , τ ]. Since the growth clock of v does not ring between time tc and t, it follows that indeed ηv (t) = 0. This completes the proof of Lemma 3.2.
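Now we go back to the proof of the proposition. Assume $\hat\delta_c > 0$. Choose $\varepsilon$ and $\gamma$ as in Lemma 3.1. By (2) it is sufficient to prove that
\[ \lim_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\eta^n \text{ has at least 2 fires in } B(m) \text{ in } (t_c, t_c + \varepsilon)) = 0. \tag{11} \]
Define, in addition to $B_1$ and $B_2$ above, the event $\tilde B_1 = \{$no ignitions in $B(K(\lambda))$ in the time interval $(0, t_c + \varepsilon)\}$. We have
\[ \begin{aligned} P(\text{at least 2 fires in } B(m) \text{ in } (t_c, t_c + \varepsilon)) \le{}& P(\{\text{at least 2 fires in } B(m) \text{ in } (t_c, t_c + \varepsilon)\} \cap \tilde B_1 \cap B_2) \\ &+ P(\tilde B_1^c) + P(B_2^c). \end{aligned} \tag{12} \]
Now note that $\tilde B_1$ does not depend on $n$, and that
\[ P(\tilde B_1^c) \le \lambda\,|B(K(\lambda))|\,(t_c + \varepsilon) \to 0, \quad \text{as } \lambda \downarrow 0, \tag{13} \]
by the definition of $K(\lambda)$ (see (9)). Next, note that the probability of $B_2$ does not depend on $n$ either, and that the domination of $\eta$ by $\sigma$ gives:
\[ P(B_2^c) \le P\{\partial B(k(\lambda)) \leftrightarrow \partial B(K(\lambda)) \text{ in } \sigma(t_c)\} \to 0, \quad \text{as } \lambda \downarrow 0, \tag{14} \]
by a well-known result from ordinary percolation and the fact that $K(\lambda)/k(\lambda) \to \infty$ as $\lambda \downarrow 0$. Finally, we handle the event in the first term on the right hand side of (12). Since we will take limits as $\lambda \downarrow 0$, we may restrict to $\lambda$’s for which $k(\lambda) > m$. Then we have the following relation between events:
\[ \begin{aligned} &\{\text{at least 2 fires in } B(m) \text{ in } (t_c, t_c + \varepsilon)\} \cap \tilde B_1 \cap B_2 \\ &\quad = \{\tau \in (t_c, t_c + \varepsilon) \text{ and at least 1 fire in } B(m) \text{ in } (\tau, t_c + \varepsilon)\} \cap \tilde B_1 \cap B_2 \\ &\quad \subset \{\partial B(m) \leftrightarrow \partial B(k(\lambda)) \text{ in } \eta^n(s) \text{ for some } s \in (\tau, t_c + \varepsilon)\} \cap B_1 \cap B_2 \\ &\quad \subset \{\partial B(m) \leftrightarrow \partial B(k(\lambda)) \text{ in } \xi(t_c + \varepsilon)\}, \end{aligned} \tag{15} \]
where the second inclusion follows from Lemma 3.2 (and the monotonicity of $\xi(t)$ for $t > t_c$), and the first inclusion holds because, by the event $\tilde B_1$, fires in $B(m)$ before time $t_c + \varepsilon$ can only arrive from outside $B(K(\lambda))$. To handle the probability of the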
last event in (15), first observe that, for each $i$, the random variables $\xi_v(t)$, $t \ge 0$, $v \in A_i$, are completely determined by Poisson clocks assigned to the sites inside the annulus $A_i$. We use the notation $I(\lambda)$ for the set of all positive even integers $j$ with $m < 3^j < 5 \cdot 3^j \le k(\lambda)$. Since the annuli $A_i$, $i \in I$, are disjoint, we get from Lemma 3.1 that
\[ P\{\partial B(m) \leftrightarrow \partial B(k(\lambda)) \text{ in } \xi(t_c + \varepsilon)\} \le \gamma^{|I(\lambda)|}. \tag{16} \]
Combining (15) and (16), and using that $k(\lambda)$, and hence also $|I(\lambda)|$, goes to $\infty$ as $\lambda \downarrow 0$, we get
\[ \lim_{\lambda \downarrow 0} \limsup_{n \to \infty} P(\{\text{at least 2 fires in } B(m) \text{ before time } (t_c + \varepsilon)\} \cap \tilde B_1 \cap B_2) = 0. \tag{17} \]
Combining (12), (13), (14) and (17) completes the proof of Proposition 2.3.
3.3. Proof of Theorem 2.2. Proof. Suppose $\hat\delta_c > 0$ and that for all $t > t_c$ there exists an $m = m(t)$ such that
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P_\lambda(\text{fire in } B(m) \text{ before time } t) > 1/2. \tag{18} \]
We will show that this leads to a contradiction. Choose $t$ as in Proposition 2.3. Now take $u \in (t_c, t)$. By (18) there exist $m_0$ and $\alpha(u) > 0$ such that
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P_\lambda(\text{fire in } B(m_0) \text{ before time } u) > 1/2 + \alpha(u). \tag{19} \]
By (1) (and the continuity of $\theta$) we can choose an $s \in (t_c, u)$ with
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P_\lambda(\text{fire in } B(m_0) \text{ before time } s) \le \alpha(u)/2. \tag{20} \]
By (18) there exists an $m_1 > m_0$ such that
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P_\lambda(\text{fire in } B(m_1) \text{ before time } s) > 1/2. \tag{21} \]
Clearly,
\[ \begin{aligned} P(\text{fire in } B(m_0) \text{ before time } u) \le{}& P(\text{fire in } B(m_1) \text{ before time } s \text{ and fire in } B(m_0) \text{ between times } s \text{ and } u) \\ &+ P(\text{fire in } B(m_0) \text{ before time } s) \\ &+ P(\text{no fire in } B(m_1) \text{ before time } s). \end{aligned} \tag{22} \]
Now for each term in (22) we take $\liminf_{\lambda \downarrow 0} \liminf_{n \to \infty}$. Then, by Proposition 2.3 the first term on the r.h.s. will vanish. Using this, and applying (20) and (21) to the second and the third term respectively, yields
\[ \liminf_{\lambda \downarrow 0} \liminf_{n \to \infty} P(\text{fire in } B(m_0) \text{ before time } u) \le 1/2 + \alpha(u)/2, \]
which contradicts (19). This completes the proof of Theorem 2.2.
4. Discussion and Modified Models
In the model above it was the square lattice which played the role of space. Completely analogous results can be proved, in the same way, for the triangular or the honeycomb lattice. In the following subsections we discuss some less obvious modifications of the model (different ignition mechanism; binary tree instead of square lattice).
4.1. Ignition of sufficiently large clusters. Again we work on the square lattice. In this model the growth mechanism is the same as before (that is, vacant sites become occupied at rate 1), but the ignition mechanism is different: Instead of the ignition rate $\lambda$ we have an (integer) parameter $L$. The ignition rule now is that whenever a cluster of size $\ge L$ occurs, it is instantaneously ignited and burnt down (that is, each of its sites becomes vacant). A very pleasant feature of this model is that, since the interactions now have finite range, it can be defined on the infinite lattice using a standard graphical construction. This frees us from the necessity to first work on $B(n)$ and later take limits as $n \to \infty$, and thus from the annoying double limits we had in our main results. As before, we start at time 0 with all sites vacant. Let $\eta_v^{[L]}(t)$ denote the value (0 or 1) of site $v$ at time $t$. The analog of Open Problem 1.1 is:
Open Problem 4.1. Is, for all $t > t_c$,
\[ \limsup_{L \to \infty} P(\eta^{[L]} \text{ has a fire in } O \text{ before time } t) > 0 \,? \tag{23} \]
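The size-threshold rule is simple to simulate. The sketch below is an illustration under an explicit simplification: the paper defines $\eta^{[L]}$ on the infinite lattice, whereas here the process is run on a finite box $B(n)$ (an assumption made only to keep the example self-contained). The function names are hypothetical.

```python
import random
from collections import deque


def cluster_of(v, occupied):
    """Occupied nearest-neighbour cluster containing the occupied site v."""
    cluster = {v}
    queue = deque([v])
    while queue:
        x, y = queue.popleft()
        for w in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if w in occupied and w not in cluster:
                cluster.add(w)
                queue.append(w)
    return cluster


def simulate_cluster_size_model(n, L, t_max, seed=0):
    """Growth at rate 1 on B(n); any cluster reaching size >= L burns at once.

    Finite-box version (an assumption; the paper works on the infinite lattice).
    Returns the final occupied set and the fires as (time, cluster_size) pairs.
    """
    rng = random.Random(seed)
    sites = [(i, j) for i in range(-n, n + 1) for j in range(-n, n + 1)]
    occupied = set()
    fires = []
    t = 0.0
    while True:
        t += rng.expovariate(len(sites))   # superposition of the growth clocks
        if t > t_max:
            break
        v = rng.choice(sites)
        if v in occupied:
            continue                       # clock ignored at an occupied site
        occupied.add(v)
        cluster = cluster_of(v, occupied)
        if len(cluster) >= L:
            occupied -= cluster            # instantaneous ignition and burning
            fires.append((t, len(cluster)))
    return occupied, fires
```

By construction, after every step each remaining occupied cluster has fewer than $L$ sites, which is the finite-range feature the text points to.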
Similarly, there is a straightforward analog of Open Problem 1.2. Although this modified model is seemingly simpler than the original one, we think the problems are, essentially, as hard as before. We have, with $\hat\delta_c$ as before (see (5)), analogs of Theorem 2.2 and Proposition 2.3.
Theorem 4.2. If $\hat\delta_c > 0$, there exists a $t > t_c$ such that for all $m$,
\[ \liminf_{L \to \infty} P(\eta^{[L]} \text{ has a fire in } B(m) \text{ before time } t) \le 1/2. \tag{24} \]
Proposition 4.3. If $\hat\delta_c > 0$, there exists $t > t_c$ such that for all $m$,
\[ \lim_{L \to \infty} P(\eta^{[L]} \text{ has at least 2 fires in } B(m) \text{ before time } t) = 0. \tag{25} \]
Theorem 4.2 follows from Proposition 4.3 in the same way as Theorem 2.2 from Proposition 2.3. The proof of Proposition 4.3 is very similar to that of Proposition 2.3 and we only indicate the main modifications: Instead of (9) we define $K_L := L^{1/3}$, $k_L := L^{1/4}$. Next, the events $B_1$, $B_2$ are replaced by the single event $B_3 := \{\sigma(t_c)$ has a vacant *-circuit in $A(k_L, K_L)\}$, and Lemma 3.2 is replaced by the following lemma, whose proof is a straightforward modification of that of the former. (Of course we take $m$ as before, and $\tau = \tau(L, m)$ is now defined as the first time that $\eta^{[L]}$ has a fire in $B(m)$.)
Lemma 4.4. On $B_3$ we have, for all $t > \tau$ and all $v \in B(k_L) \setminus B(m)$, that
\[ \eta_v^{[L]}(t) \le \xi_v(t). \tag{26} \]
The proof of Proposition 4.3 now proceeds as before.
4.2. The binary tree. In this subsection we consider the same dynamics as for the process $\eta$ in Sects. 1–3, but now we take the directed binary tree instead of the square lattice. By the infinite binary directed tree, denoted by $T$, we mean the tree where one vertex (called the root) has two edges, each other vertex has three edges, and where all edges are oriented in the direction of the root. The root will be denoted by $O$. By the children of a site $v$ we mean the two sites from which there is an edge to $v$. (And we say that $v$ is the parent of these sites.) By the first generation of $v$ we mean the set of children of $v$, by the second generation the children of the children of $v$, etc. The subgraph of $T$ containing $O$ and its first $n$ generations will be denoted by $T(n)$. Let us now describe the model in detail. We work on $T(n)$. Initially all sites are vacant. As in the original (Sect. 1) model vacant sites become occupied at rate 1 and occupied sites are ignited at rate $\lambda$. When a site $v$ is ignited, instantaneously each site on the occupied path from $v$ in the direction of the root is made vacant. The forest-fire interpretation is not very natural here. More natural is the interpretation in terms of a nervous system: Replace the word ‘site’ by ‘node’, ‘occupied’ by ‘alert’, ‘vacant’ by ‘recovering’, ‘ignition’ by ‘arrival of a signal from outside the system’. Then the above description says that whenever an alert node $v$ receives a signal (either from a child, or from outside the system), it immediately transmits it to its parent (except when $v = O$, in which case it ‘handles’ the signal itself), after which it needs an exponentially distributed recovering time to become alert again. As before we use 1 to represent an occupied (‘alert’) and 0 to represent a vacant (‘recovering’) vertex. Let $\zeta_v(t) \in \{0, 1\}$ denote the state of vertex $v$ at time $t$. If we want to stress dependence on $n$ we write $\zeta_v^n(t)$. As in Sect. 1, the processes $\zeta^n(\cdot)$ can be completely described in terms of independent Poisson growth and ignition clocks, assigned to the sites of $T$. Recall that site percolation on the binary tree has critical probability 1/2, and percolation probability function $\theta(p) = (2p - 1)/p$, for $p \ge 1/2$. Combining this with the same arguments that led to (1) shows that, if we first let $n$ go to $\infty$ and then $\lambda$ to 0, the probability that the root burns before time $\log 2$ goes to 0, and, moreover, that for $t > \log 2$,
\[ \limsup_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(O \text{ burns before time } t) \le \frac{1 - 2e^{-t}}{1 - e^{-t}}. \tag{27} \]
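The percolation function quoted above, and hence the right-hand side of (27), can be recovered from a one-step recursion. The following is a sketch of the standard branching argument, supplied here for convenience and not taken from the paper: the root lies in an infinite occupied cluster exactly when it is occupied and at least one of its two children lies in an infinite occupied cluster within its own subtree, and the two subtrees are independent copies of $T$.

```latex
\[
\theta(p) = p\,\bigl[\,1 - (1 - \theta(p))^2\,\bigr]
\quad\Longrightarrow\quad
\theta(p) = 0 \ \text{ or } \ 1 = p\,\bigl(2 - \theta(p)\bigr),
\quad\text{so for } p \ge \tfrac12:\quad
\theta(p) = 2 - \frac{1}{p} = \frac{2p - 1}{p}.
\]
\[
p = 1 - e^{-t}:\qquad
\theta(1 - e^{-t}) = \frac{2(1 - e^{-t}) - 1}{1 - e^{-t}} = \frac{1 - 2e^{-t}}{1 - e^{-t}},
\qquad\text{which is positive exactly when } t > \log 2.
\]
```

That the percolation probability is the non-zero root for $p > 1/2$ is the usual survival-probability fact for branching processes, assumed here without proof.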
A nice feature of the binary tree is that we can (quite simply in fact) also prove a lower bound (compare with Open Problem 1.1 for the square lattice):
Lemma 4.5. For all $t > \log 2$,
\[ \liminf_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\zeta^n \text{ has a fire in } O \text{ before time } t) \ge \frac{1}{2}\,\frac{1 - 2e^{-t}}{1 - e^{-t}}. \tag{28} \]
Note that this lower bound is half the upper bound (27).
Proof. Define the functions $f_n^\lambda(t) := P_\lambda(\zeta^n \text{ has a fire in } O \text{ before time } t)$, $t > 0$, and $g_n^\lambda(s, t) := f_n^\lambda(t) - f_n^\lambda(s)$, $0 < s < t$, i.e. the probability that the first time that $O$ burns is between $s$ and $t$. Fix a $t > \log 2$ and take $\tilde t \in (\log 2, t)$. Suppose that
\[ \liminf_{\lambda \downarrow 0} \limsup_{n \to \infty} f_n^\lambda(t) < \frac{1}{2}\,\frac{1 - 2e^{-\tilde t}}{1 - e^{-\tilde t}}. \tag{29} \]
We will show that this leads to a contradiction. By (29) there exist an $\alpha > 0$ and a sequence $(\lambda_i, i = 1, 2, \cdots)$, which is decreasing, converges to 0 and has, for all $i$,
\[ \limsup_{n \to \infty} f_n^{\lambda_i}(t) < \frac{1}{2}\,\frac{1 - 2e^{-\tilde t}}{1 - e^{-\tilde t}} - \alpha. \tag{30} \]
Fix $j$ large enough such that
\[ e^{-\lambda_j \tilde t}\,\bigl(1 + 2\alpha(1 - e^{-\tilde t})\bigr) > 1. \tag{31} \]
The reason for this choice will become clear later. Observe that, if $v$ and $w$ are the children of $O$, the processes $\zeta_v^{n+1}(\cdot)$ and $\zeta_w^{n+1}(\cdot)$ are independent copies of $\zeta_O^n(\cdot)$ (and are also independent of the Poisson clocks at $O$). Also observe that, to ensure that the first fire at the root occurs between times $\tilde t$ and $t$, it is sufficient that the growth clock of $O$ rings before time $\tilde t$, no ignition occurs at the root before time $\tilde t$, at least one of its children burns between times $\tilde t$ and $t$ and none of its children burns before time $\tilde t$. Hence, by these observations,
\[ g_{n+1}^\lambda(\tilde t, t) \ge (1 - e^{-\tilde t})\,e^{-\lambda \tilde t}\,\Bigl[\,g_n^\lambda(\tilde t, t)^2 + 2\,g_n^\lambda(\tilde t, t)\,\bigl(1 - f_n^\lambda(\tilde t)\bigr)\Bigr]. \tag{32} \]
Now we take $\lambda$ equal to $\lambda_j$ in (32), and apply (30) (noting that $f_n^\lambda(t) \ge f_n^\lambda(\tilde t)$). This gives that (with the abbreviation $g_k$ for $g_k^{\lambda_j}(\tilde t, t)$, $k = 1, 2, \cdots$) for all sufficiently large $n$,
\[ g_{n+1} \ge (1 - e^{-\tilde t})\,e^{-\lambda_j \tilde t}\,2\,g_n\,\bigl(1 - f_n^{\lambda_j}(\tilde t)\bigr) \ge g_n \times e^{-\lambda_j \tilde t}\,\bigl(1 + 2\alpha(1 - e^{-\tilde t})\bigr). \tag{33} \]
However, the factor behind $g_n$ in the r.h.s. of (33) does not depend on $n$ and is, by (31), strictly larger than 1, so that the sequence of $g_n$’s ‘explodes’: a contradiction. Hence
\[ \liminf_{\lambda \downarrow 0} \limsup_{n \to \infty} f_n^\lambda(t) \ge \frac{1}{2}\,\frac{1 - 2e^{-\tilde t}}{1 - e^{-\tilde t}}. \tag{34} \]
This holds for each $\tilde t \in (\log 2, t)$. Letting $\tilde t \uparrow t$ in (34) completes the proof of Lemma 4.5.
By a similar ‘independent copies’ observation as used a few lines above (32) (now for all sites in the $m$-th generation of the root), Lemma 4.5 immediately gives the following corollary (compare with Theorem 2.2 and Proposition 2.3):
Corollary 4.6. For all $t > \log 2$, all $\varepsilon > 0$ and all $k$, there exists $m$ such that
\[ \liminf_{\lambda \downarrow 0} \limsup_{n \to \infty} P_\lambda(\zeta^n \text{ has at least } k \text{ fires in } T(m) \text{ before time } t) > 1 - \varepsilon. \tag{35} \]
Acknowledgements. We thank Antal Járai, Ronald Meester and Vladas Sidoravicius for stimulating discussions.
References
1. van den Berg, J., Járai, A.A.: On the asymptotic density in a one-dimensional self-organized critical forest-fire model. Commun. Math. Phys. 253, 633–644 (2004)
2. van den Berg, J., Brouwer, R.: Self-destructive percolation. Random Structures and Algorithms 24, Issue 4, 480–501 (2004)
3. Drossel, B., Schwabl, F.: Self-organized critical forest-fire model. Phys. Rev. Lett. 69, 1629–1632 (1992)
4. Dürre, M.: Existence of multi-dimensional infinite volume self-organized critical forest-fire models. Preprint (2005)
5. Grassberger, P.: Critical behaviour of the Drossel-Schwabl forest fire model. New J. Phys. 4, 17.1–17.15 (2002)
6. Grimmett, G.R.: Percolation. Berlin-Heidelberg-New York: Springer-Verlag (1999)
7. Jensen, H.J.: Self-Organized Criticality. Cambridge Lecture Notes in Physics, Cambridge: Cambridge Univ. Press (1998)
8. Malamud, B.D., Morein, G., Turcotte, D.L.: Forest Fires: An example of self-organized critical behaviour. Science 281, 1840–1841 (1998)
9. Schenk, K., Drossel, B., Schwabl, F.: Self-organised critical forest-fire model on large scales. Phys. Rev. E 65, 026135-1-8 (2002)
Communicated by H. Spohn
Commun. Math. Phys. 267, 279–305 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0065-6
Communications in
Mathematical Physics
Charges and Fluxes in Maxwell Theory on Compact Manifolds with Boundary
Marcos Alvarez1, David I Olive2
1 Centre for Mathematical Science, City University, London Northampton Square, London EC1V 0HB, UK. E-mail: [email protected]
2 Physics Department, University of Wales Swansea, Singleton Park, Swansea SA2 8PP, UK. E-mail: [email protected]
Received: 27 March 2003 / Accepted: 17 February 2006 Published online: 5 August 2006 – © Springer-Verlag 2006
Abstract: We investigate the charges and fluxes that can occur in higher-order Abelian gauge theories defined on compact space-time manifolds with boundary. The boundary is necessary to supply a destination to the electric lines of force emanating from brane sources, thus allowing non-zero net electric charges, but it also introduces new types of electric and magnetic flux. The resulting structure of currents, charges, and fluxes is studied and expressed in the language of relative homology and de Rham cohomology and the corresponding abelian groups. These can be organised in terms of a pair of exact sequences related by the Poincaré-Lefschetz isomorphism and by a weaker flip symmetry exchanging the ends of the sequences. It is shown how all this structure is brought into play by the imposition of the appropriately generalised Maxwell’s equations. The requirement that these equations be integrable restricts the world-volume of a permitted brane (assumed closed) to be homologous to a cycle on the boundary of space-time. All electric charges and magnetic fluxes are quantised and satisfy the Dirac quantisation condition. But through some boundary cycles there may be unquantised electric fluxes associated with quantised magnetic fluxes and so dyonic in nature. 1. Introduction In the search for a unified theory of particle interactions encompassing both the standard model and Einstein’s theory of gravity the most promising candidate seems to be the superstring and M-theories which require space-time to have dimensions 10 and 11 respectively for internal consistency. A common feature of these is the presence of states known as “ p-branes”, objects which, classically at least, can be pictured as extended objects resembling p-dimensional surfaces (or volumes) in space. As time evolves these sweep out surfaces (or volumes) of one dimension higher, p + 1. When p = 0, the object is simply a point particle tracing out a world-line, w, in space-time. 
It has a geometrically natural interaction with Maxwell’s electromagnetic field specified by the addition of a term in the action taking the schematic form
\[ \text{``}\,q \int_w A\,\text{''}. \tag{1.1} \]
The same can be done for any positive value of $p$ less than $m$, the dimension of space-time, with the proviso that the gauge potential $A$ now has degree $(p + 1)$, matching the fact that the world-volume, $w$, has $(p + 1)$ dimensions [O, N, T1]. When $p$ equals one, so that the brane is a string, $A$ is the familiar Kalb-Ramond [KR] gauge potential (see also [CS]). Naively Stokes’ theorem implies that expression (1.1) is unchanged when $A$ is altered according to
\[ A \to A + d\chi, \tag{1.2} \]
where $\chi$ is of degree $p$ and arbitrary, if it is assumed that $w$ is closed, i.e. a $(p + 1)$-cycle. This generalised gauge invariance suggests that an important physical role would be played by the following quantity which is invariant with respect to (1.2):
\[ F = dA. \tag{1.3} \]
This is the $(p + 2)$-form field strength, reducing to the familiar one of Maxwell when $p = 0$. The most natural equations of motion for $F$ take the form:
\[ dF = 0, \tag{1.4} \]
\[ d{*}(hF) = {*}j, \tag{1.5} \]
in exterior calculus notation, although there are more elaborate possibilities. In order to include a common feature of supergravity/superstring theories we have admitted the presence of a positive scalar function of scalar fields, $h(\phi)$, in (1.5), that equals unity in vacuo. Apart from this feature, these are Maxwell’s equations generalised in the form envisaged by Hodge, and $*$ denotes his duality operation, constructed by means of a metric on space-time, here assumed to be a fixed background [H, F]. We shall henceforth refer to them as Maxwell’s equations. The inhomogeneous Maxwell equation (1.5) will play an important role in what follows irrespective of the detailed form of the quantity $h$ as long as it reduces to unity in vacuo and we shall not have to consider equations of motion for the scalar fields. The quantity $j$ is the “electric current” and is a $(p + 1)$-form, so possessing the same degree as $A$. By virtue of (1.5) and the nilpotency of the exterior derivative, $d$, it has to be “conserved” so that
\[ d{*}j = 0, \tag{1.6} \]
and we shall always suppose this. Such field theories are indeed part of modern superstring/M-theory and it is therefore important to understand their properties by answering the questions listed below, particularly when the background space-time is taken to be topologically complicated. But these equations are only a part of the larger theory and not the whole. In this subtheory, no account need be taken of supersymmetry and the values of $p$ and $m$, the dimension of space-time $M$, can be treated as arbitrary. Many special features of these subtheories have become familiar [N, T1, T2, HT, DGHT], but our aim is to uncover yet more general structure as will be seen.
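For reference, the three statements used above (gauge invariance of the coupling (1.1), invariance of $F$, and conservation of the current) each follow in one line from Stokes’ theorem or from $d^2 = 0$. The display below is a sketch of these standard steps, written out here for convenience; it assumes, as in the text, that $w$ is closed so that $\partial w = \emptyset$.

```latex
\[
\int_w (A + d\chi) = \int_w A + \int_{\partial w} \chi = \int_w A
\quad (\partial w = \emptyset),
\qquad
d(A + d\chi) = dA + d^2\chi = dA = F,
\]
\[
d{*}j = d\bigl(d{*}(hF)\bigr) = d^2\bigl({*}(hF)\bigr) = 0 .
\]
```

The first identity is the sense in which (1.1) is “naively” unchanged under (1.2); the last one is (1.6), obtained by applying $d$ to both sides of (1.5).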
Charges and Fluxes in Maxwell Theory on Compact Manifolds with Boundary
When p vanishes and m equals 4 there are three notions familiar since the times of Faraday, electric charge, electric flux and magnetic flux. (Magnetic charge is excluded by (1.4) since magnetic current is). These three quantities are all conserved, that is unchanged by various sorts of evolution, including that in time. It will be seen to be important to distinguish the notions before examining the possibility of any relations between them. Allowing p and m to be arbitrary, the physical questions considered in this paper concern: (1) the classification and enumeration of the independent charges and fluxes, (2) the understanding of how the Maxwell equations (1.4) and (1.5) relate the notions of electric flux and charge, (3) the determination of the possible numerical values of these charges and fluxes, (4) the understanding of how quantum theory can relate the values of electric and magnetic conserved quantities (yielding the generalised Dirac quantisation condition). The answers turn out to be more subtle and interesting than we had expected and it is this that motivates this presentation. It is found important to resist the common temptation to simplify by taking the space-time manifold, M, to be closed as this results in oversimplification. When account is taken of the generalised Maxwell equations, (1.4) and (1.5), all electric charges and fluxes then vanish, leaving magnetic fluxes as the only available conserved quantities, as Henneaux and Teitelboim [HT] emphasised. In particular this applies to the situation with m = 4 considered by Misner and Wheeler [MW1] who were amongst the earliest to advocate the application of homology theory to unified field theories. Thus it is essential to allow space-time, M, to possess a boundary, B, a manifold of dimension one less, interpreted as corresponding to the “points at spatial infinity” through which “electric field lines” may escape, thereby furnishing a potentially nontrivial flux. 
When this is done, the answers to the physical questions above are provided by a set of results in pure mathematics whose physical relevance is, we believe, hitherto unappreciated by physicists. Once we have established the appropriate definitions we find the connections, made more precise in the text:

electric charges ⇔ relative homology of space-time,
electric fluxes ⇔ absolute homology of the boundary of space-time,
magnetic fluxes ⇔ absolute homology of space-time.

All these charges and fluxes are expressed as integrals over some sort of cycle in space-time and homology deals with the classification of these cycles in the way that is appropriate to the physics. There are precisely three types of homology in the situation just described and all three play a physical role according to the connections just listed. Moreover the relationships between the different sorts of conserved quantity correspond to relationships between these different sorts of homology. All of the aforementioned types of homology class form elements of an abelian group, the appropriate homology group, $H_*$, say. Taking into account all values of p that are possible in the given fixed background space-time, M, these abelian groups can be arranged in a certain order such that there is a natural homomorphism acting between successive members. This provides a sequence with the property of being exact, that is, at each stage, the homology group, $H_*$, possesses a subgroup, $K_*$, say, that is at the same time the kernel of the succeeding homomorphism and the image of the preceding one. This is the exact sequence of relative homology (of space-time). A more refined classification of the physical notions of charge and flux will depend on the distinction
M. Alvarez, D. I. Olive
between the subgroup K ∗ and the coset group H∗ /K ∗ within each homology group H∗ . This structure is explained in more detail in the text as it becomes relevant to the development of the physical arguments and particularly in Sects. 7 and 10, as well as the Appendix. Relevant mathematical background together with more detail can be found in [S1] and [M]. Each homology group, H∗ , is abelian, and usually of infinite order. For reasons explained they are essentially discrete lattices of finite dimension, b∗ which is known as the Betti number. The number of linearly independent charges and fluxes will be expressible in terms of Betti numbers in a surprisingly complicated way that we shall determine. These results will answer the first two of the physical questions listed above. An important subtlety is that although the definition of the conserved electric charge as an integral over the conserved current, j works irrespective of whether or not the generalised Maxwell equations (1.4) and (1.5) are assumed to hold, the counting of the charges does depend on this choice, being more complicated when they do hold, as they should when account is taken of physical relevance. For example, when spacetime is closed, all possible non-trivial electric charges are forced to vanish by Eqs. (1.4) and (1.5). The point is that there exist conserved currents on space-time for which it is impossible to integrate (1.5) to obtain a field strength, F. Consequently these currents will be forbidden on the physical grounds that the field strengths must exist. It is the aforementioned exact sequence of relative homology that clarifies the occurrence of this phenomenon as explained in Sect. 4 and amplified later. Answering the third of the physical questions listed above requires an explicit form of the conserved current, j (w), due to a p-brane with world-volume w as implied by (1.1) and (1.5) together. 
This is provided by a singular differential form involving Dirac δ-functions whose support is w. Then the electric charge associated with integrating over a relative cycle S is q times the intersection number of S and the absolute cycle, w. The coefficient q is defined by (1.1) and the intersection number is well defined as w and S have dimensions summing to m, that of space-time. As this intersection number is unchanged by homologies of both w and S, it is defined on their homology classes. Since the groups formed by these classes are essentially lattices whose dimension is the relevant Betti number, it follows that the intersection data is encoded in the intersection matrix, I , formed of the intersection numbers between elements of bases of the two lattices. This matrix, I , has integer entries, is square and unimodular (that is, has determinant equal to ±1), the latter two properties being consequences of “Poincaré-Lefschetz duality”, another feature of the exact sequence of relative homology. So far this analysis does not use the “Maxwell equations”, (1.4) and (1.5), and hence applies whether or not they are chosen to hold. If not, the electric charges take values equal to an integer times q. Conversely the unimodularity of the intersection matrix means that there exist brane configurations realising all possible values of these sets of values. If Maxwell’s equations are chosen to hold, as they should, the situation is more complicated as many potential electric charges are forced to vanish, apparently contradicting the unimodular property of the intersection matrix. The resolution of this paradox depends on the recognition that some brane configurations are forbidden as they yield conserved electric currents for which Maxwell’s equations (1.4) and (1.5) cannot be integrated to yield a field strength. This is explained in more detail in Sect. 8 and requires the intersection matrix to have a more detailed structure than so far apparent. 
This is revealed by writing it in block form according to the kernel subgroup, K ∗ , of each H∗ , and the coset H∗ /K ∗ . One block has to vanish identically and this is
verified explicitly in Sect. 10 and the Appendix. This leaves square matrices on the block diagonal, each of which has to be unimodular. The upshot is that the only brane configurations that are allowed by the integrability requirement are those that are homologous to cycles in the boundary, B, of space-time, M. The surviving electric charges again take values that are integral multiples of q. Conversely there are allowed brane configurations that realise all possible sets of these values. Another phenomenon quantified by the exact sequence of homology is the existence of electric fluxes which are not equal to electric charges and hence not quantised. Through the same cycles there may flow quantised magnetic fluxes of the field coupling to the brane dual to that coupling to the electric field so that the overall effect is suggestive of something dyonic. All results so far are “classical”, invoking no quantum theory. Taking the latter into account requires that the schematic term (1.1) in the action be unambiguous when suitably exponentiated. This constrains the values of the magnetic fluxes to satisfy a generalisation of Dirac’s celebrated quantisation when compared with any of the electric charges [D, WY, AO1]. The resulting picture is beautifully consistent yet unexpectedly rich. Nevertheless our analysis made a number of implicit simplifications compared with the full superstring theory that are so far unavoidable. Some of these are listed in the conclusion, Sect. 11, and it is hoped that a subsequent elaboration of our present methods will lead to answers removing these assumptions. A technical Appendix extends the idea of a distribution valued form associated with a bulk cycle (such as the brane world-volume) to chains both in the bulk and on the boundary. These constructions are used to derive the weak form of Poincaré-Lefschetz duality used in establishing the vanishing of an off-diagonal block of the previously mentioned intersection matrix.
Relative topology has been used previously to discuss certain aspects of branes in M-theory. A partial description of the role of relative cohomology in the classification of charges in generalised Maxwell theory was sketched in Sect. 2 of [MW2]. In [KS] relative cohomology is used to present a geometric description of certain brane intersections in M-theory. A similar analysis of D2-branes in Wess-Zumino-Witten theory can be found in [FS].
2. First Notions

Taken as given is a fixed background space-time M, assumed oriented and compact, but possibly of complicated topology. It has dimension m and initially it is assumed to be closed. On it is defined a field strength F that is a (p + 2)-form satisfying the generalised Maxwell equations (1.4) and (1.5). According to the first of these F is closed so that locally there is defined a (p + 1)-form gauge potential A, (1.3), with a gauge ambiguity with respect to the gauge transformations (1.2), where χ is a p-form also defined locally. The quantity j is a (p + 1)-form denoting the electric current due to the matter degrees of freedom. For the time being it does not have to be assumed that it has the form that (1.1) would imply. Electric current conservation is the statement that $*j$ is a closed form on M, (1.6). This follows from the above Maxwell equation (1.5) as $d^2$ vanishes, but we shall assume its validity even when Maxwell’s equations are disregarded.
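The step from (1.5) to (1.6) can be written out in one line. The following is only a restatement of the argument just given, using no symbols beyond those already defined:

```latex
d{*}j \;=\; d\bigl(d{*}(hF)\bigr) \;=\; d^{2}\bigl({*}(hF)\bigr) \;=\; 0 .
```

The function h plays no role in this step: the conclusion uses only that $*j$ is exact wherever (1.5) holds.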
The first notion of an electric charge is associated with the current j without any reference to the field strength, F. Hence Maxwell’s equations can be temporarily disregarded. It is formulated by considering an oriented region S that is an (m − p − 1)-chain over which it is possible to integrate the matching form $*j$:

$$Q(S) = \int_S {*}j. \tag{2.1}$$
Conventionally the region S would be thought of as “space-like” but this is not essential. The virtue of the definition (2.1) is that it is insensitive to alterations of S by homologies that preserve its boundary, ∂S. Thus, if $S' = S + \partial C$, so $\partial S' = \partial S$, then $Q(S') = Q(S)$ as $Q(\partial C) = \int_{\partial C} {*}j = \int_C d{*}j = 0$, using Stokes’ theorem and current conservation (1.6). This establishes a good sense in which the charge Q is conserved. The disadvantage of this definition is that the regions S for which the charge is defined lack any real homological significance unless it is assumed that S is closed, ∂S = 0. Now the result means that each electric charge, Q(S), is preserved by homologies of S, that is, unchanged by the kinds of evolution associated with these homologies. Homologous surfaces form absolute homology classes which themselves form an abelian group under addition of surfaces, in this case the absolute homology group of M, denoted $H_{m-p-1}(M;\mathbb{Z})$. Without any field strengths satisfying the Maxwell equations this would be the end of the story as there would be no fluxes to consider. Since field strengths are included, it is necessary to consider the effect of applying Maxwell’s equation (1.5):

$$Q(S) = \int_S {*}j = \int_S d{*}(hF) = \int_{\partial S} {*}(hF) = 0$$

as ∂S vanishes. Thus all electric charges vanish when Maxwell’s equations hold on a closed space-time, M. In physical terms, the problem is that the Maxwell equation (1.5) attaches electric lines of force to the electric charge distribution and these lines have nowhere to go. Mathematically the point is that when the conserved electric current, j, is such that any Q(S) fails to vanish, it is impossible to integrate (1.5) to obtain the field strength F on M. This is unacceptable on physical grounds. An obvious remedy is to provide a destination for the lines of force by allowing space-time, M, to be non-compact, and this will be considered next. But it will remain necessary to check the integrability of Maxwell’s equations in the sense just described.

3. Electric Charges and Relative Homology

Instead of allowing space-time, M, to be non-compact, as just suggested, we shall do something slightly different and keep it compact but allow it to have a non-trivial boundary, B = ∂M, of one dimension less. This can be thought of as comprising those points at spatial infinity through which electric lines of force may escape. On the other hand, electric current, j, is not allowed to escape, that is its Hodge dual, $*j$, is assumed to be localised and this is expressed by the boundary condition:

$${*}j\big|_B = 0. \tag{3.1}$$
More precisely this means that the restriction of the differential form $*j$ to B vanishes. Thus the normal components of j vanish on B. In addition, it is assumed that the scalar function, h, occurring in (1.5), takes its vacuum value on B:

$$h\big|_B = 1. \tag{3.2}$$

Of course Maxwell’s equations, (1.4) and (1.5), and also current conservation, (1.6), still hold on M, or as we shall say, in the bulk. The same expression (2.1) for the electric charge holds good except that now, instead of assuming ∂S vanishes, we assume that it lies in B, and so S has become what is called a relative cycle. Suppose that S is altered by a relative homology:

$$S \to S' = S + \partial C + \beta, \qquad C \in M, \quad \beta \in B.$$

Then $Q(\partial C) = \int_{\partial C} {*}j = \int_C d{*}j = 0$, by Stokes’ theorem and (1.6), while $Q(\beta) = \int_\beta {*}j = 0$ by (3.1). So

$$Q(S) = Q(S') \quad \text{if } S \sim S' \tag{3.3}$$
in relative homology M mod B. In particular Q(S) vanishes if S ∼ 0. Thus the electric charge is well defined as an integral over relative homology classes, denoted [S] and forming an abelian group, $H_{m-p-1}(M,B;\mathbb{Z})$. This is one sense in which the electric charges are conserved. The abelian group structure arises because two like relative cycles can be added to form a third. According to (2.1) this addition law is respected by the electric charges:

$$Q([S]) + Q([S']) = Q([S + S']) = Q([S] + [S']), \tag{3.4}$$

and this furnishes another sense in which they are conserved. Some elements of this homology group have finite order and are called torsion elements. Thus if S is not trivial, that is not relatively homologous to 0, yet has the property that there exists an integer n such that n[S] = [nS] is trivial, then, by the above, Q([S]) = Q([nS])/n = 0. Altogether such elements form a finite abelian subgroup T (the torsion group), which can be divided out of $H_{m-p-1}(M,B;\mathbb{Z})$ to form a free group

$$F_{m-p-1}(M,B;\mathbb{Z}) = H_{m-p-1}(M,B;\mathbb{Z})/T \tag{3.5}$$

which can be regarded as a lattice of finite dimension $b_{m-p-1}(M,B)$. This dimension is the corresponding Betti number. Because there are no contributions from torsion elements, electric charges are only defined on $F_{m-p-1}(M,B;\mathbb{Z})$. Hence the integer $b_{m-p-1}(M,B)$ counts what appears to be the number of linearly independent electric charges that can be defined on the space-time M. This conclusion is an overestimate for reasons to be explained in the next section. From now on, the conventions of this section will be adopted, and absolute chains will be denoted by lower case Roman letters (a, b, c … s, t, u, v, w, …), relative chains by upper case Roman letters (A, B, C … S, T, U, V, W, …) and chains in the boundary by Greek letters (α, β, γ … φ, χ, ψ, …). The letters later in the alphabet will denote the corresponding cycles.
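As a concrete illustration of (3.3), consider the following sketch. It rests on assumptions that are not taken from the text: p = 0, m = 4, and a space-time $M = D^3 \times S^1$ (a spatial ball with periodically identified time), so that $B = S^2 \times S^1$. The slices $S_t$ and the chain C are introduced only for this example, and the overall sign of β depends on orientation conventions.

```latex
% Illustrative assumption: M = D^3 x S^1, B = S^2 x S^1, p = 0, m = 4.
S_t = D^3 \times \{t\}, \qquad \partial S_t = S^2 \times \{t\} \subset B ,
\\
C = D^3 \times [t, t'], \qquad
\partial C = S_{t'} - S_t \pm S^2 \times [t, t'] ,
\\
S_{t'} = S_t + \partial C + \beta , \qquad
\beta = \mp\, S^2 \times [t, t'] \in B ,
\\
Q(S_{t'}) = Q(S_t) + \int_C d{*}j + \int_\beta {*}j = Q(S_t) .
```

In this setting the relative homology between the two slices expresses the familiar conservation of total charge in time: the bulk term vanishes by (1.6) and the boundary term by (3.1).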
4. Electric Fluxes and Electric Charges

The above derivation of the topological classification of electric charges by relative homology $F_{m-p-1}(M,B;\mathbb{Z})$ used only the properties (1.6) and (3.1) of the current j, and not the Maxwell equations $dF = 0$ and (1.5). Current conservation (1.6) can be regarded as a necessary local condition for the integrability of the field strength F, given the current, j, but it is not sufficient, as already has been seen when space-time, M, has no boundary, nor will it be so when it does have a boundary. Assuming the Maxwell equation (1.5) does hold, the electric charge Q(S) can be rewritten as an electric flux:

$$Q(S) = \int_S {*}j = \int_S d{*}(hF) = \int_{\partial S} {*}(hF) = \int_{\partial S} {*}F, \tag{4.1}$$

by Stokes’ theorem and (3.2). Of course ∂S is in the space-time boundary, B, and is a cycle though not necessarily a boundary of a chain there, even though it is in the bulk, M. But it is possible to provide a more general definition of electric flux than this by considering any cycle in B, not just one that is a boundary of a relative cycle:

$$\mathcal{E}(\phi) = \int_\phi {*}F, \qquad \phi \in B, \quad \partial\phi = 0. \tag{4.2}$$
This extended definition works on all the absolute homology classes of the boundary, $H_{m-p-2}(B;\mathbb{Z})$, or, more precisely, the free parts, $F_{m-p-2}(B;\mathbb{Z})$, defined as before. To check, consider the absolute homology in the boundary, $\phi \to \phi + \partial\gamma$, $\gamma \in B$. Then

$$\mathcal{E}(\partial\gamma) = \int_{\partial\gamma} {*}F = \int_\gamma d{*}(hF) = \int_\gamma {*}j = 0,$$

using Stokes’ theorem and Eqs. (1.5) and (3.1). So indeed $\mathcal{E}(\phi) = \mathcal{E}(\phi')$ if $\phi \sim \phi'$ in absolute homology in B, in distinction to the electric charges that appeared to correspond to relative homology. So, since their classifications differ, electric charges and electric fluxes must be distinguished. This distinction manifests itself in two different physical ways. First, not all electric fluxes are expressible as electric charges because not all cycles on the boundary, B, are boundaries of chains on M. The electric fluxes that are equal to charges are associated with cycles on the boundary, B, that are also boundaries of chains on M, as in (4.1). These classes of cycles form a subgroup of the absolute homology group of the boundary, B, $H_{m-p-2}(B;\mathbb{Z})$, that we shall denote as follows:

$$K_{m-p-2}(B;\mathbb{Z}) = \{\text{classes of boundary cycle } \phi \text{ satisfying } \phi = \partial S \text{ for some bulk chain } S\}. \tag{4.3}$$
Secondly there are apparently non-trivial electric charges, Q(S), which must vanish if they are expressible as fluxes. This happens precisely when ∂S is a boundary in B, as well as in M, according to Stokes’ theorem applied to (4.1). The classes of these cycles form a subgroup of the relative homology group that we shall denote as follows:

$$K_{m-p-1}(M,B;\mathbb{Z}) = \{\text{classes of relative cycle, } R, \text{ satisfying } \partial R = \partial\alpha,\ \alpha \in B\}. \tag{4.4}$$

It follows that it is the vanishing of the charges associated with these cycles that is the extra integrability condition on Maxwell’s equation (1.5) in order to obtain a field strength, F, given a conserved current, j.
To recap, electric charges defined on $K_{m-p-1}(M,B;\mathbb{Z})$ all vanish, leaving non-trivial charges associated with each coset element of this subgroup. Furthermore only those electric fluxes defined on $K_{m-p-2}(B;\mathbb{Z})$ are expressible as electric charges. What is happening mathematically is that the boundary operation ∂ mapping relative cycles to boundary cycles induces a map

$$\partial_* : H_{m-p-1}(M,B;\mathbb{Z}) \longrightarrow H_{m-p-2}(B;\mathbb{Z}). \tag{4.5}$$

In fact this is a group homomorphism with kernel $K_{m-p-1}(M,B;\mathbb{Z})$, (4.4), and image $K_{m-p-2}(B;\mathbb{Z})$, (4.3). So, by Lagrange’s theorem, $H_{m-p-1}(M,B;\mathbb{Z})/K_{m-p-1}(M,B;\mathbb{Z}) \equiv K_{m-p-2}(B;\mathbb{Z})$, and it is this group (or more precisely the free part) that classifies the non-trivial electric charges. Applying this to the free parts that are lattices classifying the corresponding charges and fluxes, the number of linearly independent electric charges is given by

$$b_{m-p-1}(M,B) - s_{m-p-1}(M,B) = s_{m-p-2}(B), \tag{4.6}$$
explaining the overestimate mentioned previously. The integers $s_*(X)$ are the dimensions of the lattices specifying the free part of $K_*(X;\mathbb{Z})$. As there are $b_{m-p-2}(B)$ linearly independent fluxes, $s_{m-p-2}(B)$ of which are expressible as electric charges, the difference, $b_{m-p-2}(B) - s_{m-p-2}(B)$, specifies the number of linearly independent electric fluxes that cannot be equated to electric charges of the form (2.1).

5. Relation Between the Preliminary and Final Versions of Electric Charge

For reasons that become clear later, it is worth asking a question that seems rather ridiculous from a physical point of view, namely how to relate the class of electric charge obtained by integrating $*j$ over a bulk cycle to the class obtained by integrating over a relative cycle. The first class, considered in our preliminary discussion, still makes sense when space-time has a boundary since a bulk cycle can be considered as a special case of a relative cycle. The reason the question is apparently ridiculous from a physical point of view is that these preliminary charges do all vanish when account is taken of Maxwell’s equations as already seen. Consider an absolute bulk (m − p − 1)-cycle, r, and decompose it into the sum of an (m − p − 1)-chain in B and a remainder that contains no such chain: r = R + α. As ∂r = 0, ∂R = −∂α ∈ B, so that R is a relative cycle. Furthermore, if r is trivial as a bulk cycle, so r = ∂a, then R = ∂a − α and so is trivial as a relative cycle. Hence the projection map j : r → R induces a map, $j_*$, of absolute bulk homology classes to relative homology classes:

$$j_* : H_{m-p-1}(M;\mathbb{Z}) \longrightarrow H_{m-p-1}(M,B;\mathbb{Z}). \tag{5.1}$$

This is actually a homomorphism. Its kernel consists of bulk cycles, r, for which R is relatively trivial, so r = (∂C + β) + α = ∂C + γ, where γ ∈ B. Since r is closed, so is γ. Thus γ is a cycle in the boundary and the kernel can be denoted

$$K_{m-p-1}(M;\mathbb{Z}) = \{\text{classes of bulk cycle homologous to cycles in } B\}. \tag{5.2}$$
On the other hand the image of $j_*$ consists of classes of relative cycle whose boundary is the boundary of a chain within B. This coincides with the subgroup $K_{m-p-1}(M,B;\mathbb{Z})$ already defined as being the kernel of $\partial_*$ in the previous section, (4.4). Putting together $j_*$ and $\partial_*$ as two successive homomorphisms:

$$H_{m-p-1}(M;\mathbb{Z}) \xrightarrow{\ j_*\ } H_{m-p-1}(M,B;\mathbb{Z}) \xrightarrow{\ \partial_*\ } H_{m-p-2}(B;\mathbb{Z}),$$

we see that this sequence is exact at $H_{m-p-1}(M,B;\mathbb{Z})$ as $K_{m-p-1}(M,B;\mathbb{Z})$ is both the image of $j_*$ and the kernel of $\partial_*$. This is a short segment of the exact sequence of relative homology alluded to in the introduction and more segments will be seen when magnetic fluxes are considered next. The complete exact sequence will be presented in later sections. A textbook presentation can be found in [M]. Of course, as we saw at the start, all electric charges vanish that are integrals over cycles in $H_{m-p-1}(M;\mathbb{Z})$. This agrees with the fact already found above that they also vanish on $K_{m-p-1}(M,B;\mathbb{Z})$, which is the image of the former group under the action of $j_*$.

6. Action Principle for p-Branes and Magnetic Flux Quantisation

The standard (naive) expression for the term in the action describing the interaction of the field strength, F, with the current, j, that is its source, according to (1.4) and (1.5), is

$$\int_M A \wedge {*}j. \tag{6.1}$$
Naively, this term is gauge invariant on its own with respect to the transformation (1.2) given that the electric current, j, is conserved, (1.6), and localised, (3.1). Ideally the current should be expressible in terms of quantum mechanical wave functions for the matter but it is only really understood how to do this when p = 0 so that the branes are point particles. By default, the only accepted way to proceed is to adopt the classical geometric picture described in the introduction. The evolution of the p-brane in space-time is specified by its world-volume, w, an absolute bulk (p + 1)-cycle on M. Then the action term (6.1) takes the form (1.1) mentioned at the start. Because we already know the classical equations of motion in the Maxwell form (1.4) and (1.5), the detailed form of the action is only really relevant in the quantum theory. In that context, the expressions (1.1) and (6.1) are equally problematical (which explains the use of the words “schematic or naive”) as they involve the gauge potential, A, which is only defined locally, whilst the integration extends globally over all of space-time, M. Consequently, in a topologically complicated space-time such as the one being imagined, there are problems in patching together this expression in overlapping neighbourhoods of space-time. Fortunately it is the exponentiated action $e^{iq\int_w A/\hbar}$ that enters the Feynman action principle and this is more amenable. One needs to know how this phase alters when w is altered by a boundary. That is tantamount to requiring that the phase has a meaning when w is a boundary of a bulk chain. This can be done provided the background field strength F satisfies the Dirac quantisation conditions [D] for all magnetic fluxes through bulk (p + 2)-cycles [N, T2]:

$$\mathcal{M}(v) = \int_v F \in \frac{2\pi\hbar}{q}\,\mathbb{Z}, \qquad \partial v = 0. \tag{6.2}$$
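The link between the phase and the condition (6.2) can be made explicit by a short calculation. The sketch below is the standard argument rather than a quotation from the text. The chains v and v′, the use of Stokes’ theorem to define the phase, and the explicit factor of ħ are assumptions introduced for illustration.

```latex
% Assume the world-volume is a boundary in two ways: w = \partial v = \partial v'.
% Then v - v' is a bulk (p+2)-cycle, and F = dA with Stokes' theorem suggests defining
\exp\!\Bigl(\tfrac{iq}{\hbar}\int_w A\Bigr)
  := \exp\!\Bigl(\tfrac{iq}{\hbar}\int_v F\Bigr).
% The right-hand side is independent of the choice of v precisely when
\exp\!\Bigl(\tfrac{iq}{\hbar}\int_{v - v'} F\Bigr) = 1
\quad\Longleftrightarrow\quad
\int_{v - v'} F \in \frac{2\pi\hbar}{q}\,\mathbb{Z},
% which is the quantisation condition applied to the cycle v - v'.
```

Read this way, the quantisation of the magnetic fluxes through bulk cycles is exactly what is needed for the phase to be insensitive to the choice of bulk chain bounded by w.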
As $dF = 0$ these fluxes are defined on the classes of the absolute homology of space-time M, forming the group $H_{p+2}(M;\mathbb{Z})$, or more precisely the free part of this, $F_{p+2}(M;\mathbb{Z})$, a lattice of dimension $b_{p+2}(M)$. A parenthetic remark concerning this quantisation condition (6.2) is that it is known not really to be correct when wave functions are considered, as is, so far, only possible when p vanishes and the brane is therefore a point particle. Then there is a possibility of fractional quantisation conditions when the wave function is of a spinor nature (involving half-integers instead of integers). The precise rule is easy to state when m = 4 [AO1]. By this stage of the argument it has become established that, as claimed in the introduction, there is a connection between the physical notions of electric charge, electric flux and magnetic flux of a p-brane and mathematical notions of relative homology, absolute boundary homology and absolute bulk homology and more precisely, with the free parts of the abelian homology groups, $H_{m-p-1}(M,B;\mathbb{Z})$, $H_{m-p-2}(B;\mathbb{Z})$ and $H_{p+2}(M;\mathbb{Z})$, respectively. But there is a more detailed structure connected to the subgroups $K_*$ of $H_*$, for short, that plays a role in the exact sequence of relative homology and moreover possesses a physical relevance. Let us illustrate this last point by investigating magnetic fluxes through B cycles with a view to comparing electric and magnetic fluxes. Later on we shall see how this comparison will indicate a generalised dyonic phenomenon that is possibly related to the Zwanziger-Schwinger quantisation condition [Z2, S2]. Magnetic fluxes can already be defined for cycles in the boundary, B, rather than in the bulk, M, but nothing appears to be gained by this as cycles in B are automatically cycles in M but may become boundaries of bulk chains when regarded as M-cycles and hence homologically trivial in the bulk. Associated with this idea is the inclusion map, i, which induces the homomorphism:

$$i_* : H_{p+2}(B;\mathbb{Z}) \longrightarrow H_{p+2}(M;\mathbb{Z}), \tag{6.3}$$
with kernel consisting of the classes of cycle just mentioned that become boundaries. This is precisely the subgroup $K_{p+2}(B;\mathbb{Z})$ of the type met before, (4.3), (with p + 2 replaced by m − p − 2), as the image of the homomorphism, $\partial_*$, (4.5), induced by the boundary operator and met before in the comparison of electric charges and fluxes. The image of this homomorphism is clearly given by classes of bulk cycle homologous to a cycle in the boundary and these precisely form the subgroup $K_{p+2}(M;\mathbb{Z})$, (5.2), already met as the kernel of the homomorphism $j_*$, (5.1), (again with p + 2 replaced by m − p − 2). All magnetic fluxes on cycles of $K_{p+2}(B;\mathbb{Z})$ vanish, as

$$\int_\phi F = \int_{\partial S} F = \int_S dF = 0,$$

by (4.3), Stokes’ theorem and (1.4), corresponding to the fact that these cycles are trivial as bulk cycles. So the only non-trivial magnetic fluxes through boundary cycles correspond to the $b_{p+2}(B) - s_{p+2}(B)$ cosets of $K_{p+2}(B;\mathbb{Z})$ in $H_{p+2}(B;\mathbb{Z})$. These observations will become more interesting when we are able to compare them with the corresponding properties of electric fluxes through boundary cycles later on. Thus we have two more examples of a coincidence between images and kernels of different homomorphisms. This phenomenon is part of the exact sequence of relative homology mentioned in the introduction, an important pattern that has been emerging gradually and will be elaborated now.
7. The Exact Sequence of Relative Homology of Space-Time

Our study within a general setting of the physical concepts of electric charge, electric flux and magnetic flux has revealed how these are described as integrals over cycles in space-time that are respectively relative, boundary and bulk type and unchanged by the appropriate homologies. So they are certainly classified by the corresponding homology groups $H_{m-p-1}(M,B;\mathbb{Z})$, $H_{m-p-2}(B;\mathbb{Z})$ and $H_{p+2}(M;\mathbb{Z})$, when p-branes are considered. We have also met three different types of homomorphism between the three types of homology group, denoted $i_*$, $j_*$ and $\partial_*$, and illustrated by (6.3), (5.1) and (4.5). Associated with all of these is an image and kernel which is always a very specific subgroup of the relevant homology group, illustrated by (4.4), (4.3) and (5.2). If p is allowed to run over all the values compatible with possible p-branes in the given background space-time M, the set of all possible homology groups can be arranged as an ordered sequence with homomorphisms of one or other of the above three types relating each successive pair:

$$\cdots \xrightarrow{\ \partial_*\ } H_{m-p-1}(B) \xrightarrow{\ i_*\ } H_{m-p-1}(M) \xrightarrow{\ j_*\ } H_{m-p-1}(M,B) \xrightarrow{\ \partial_*\ } H_{m-p-2}(B) \xrightarrow{\ i_*\ } \cdots. \tag{7.1}$$

This is the exact sequence of relative homology well known to pure mathematicians in the context of algebraic topology, and more careful and detailed treatments can be found in various textbooks. The notation has been compressed by omitting reference to the integers $\mathbb{Z}$. Assuming space-time, M, is connected, this exact sequence of abelian groups starts and finishes with the trivial group, written as 1 in multiplicative notation:

$$1 \to H_m(M,B) \to H_{m-1}(B) \to H_{m-1}(M) \to H_{m-1}(M,B) \to \cdots$$

and

$$\cdots \to H_1(B) \to H_1(M) \to H_1(M,B) \to H_0(B) \to H_0(M) \to 1.$$

Thus, besides the two trivial terms terminating the exact sequence, there are 3m terms. From the sequence it is now possible to evaluate in terms of the Betti numbers the numbers $s_q(B)$, $s_q(M)$ and $s_q(M,B)$ that are the dimensions of the free parts of the kernels (4.3), (5.2) and (4.4) and entered the counts of the various charges and fluxes. The exact sequence (7.1) implies a similar but simpler exact sequence for the free parts of the homology groups (obtained by dividing out the torsion subgroup). Working over real coefficients rather than integers yields an exact sequence of vector spaces with dimensions given by the Betti numbers and linked by linear maps replacing the group homomorphisms. To understand what happens consider such a sequence in simplified notation:

$$1 \to V_0 \to V_1 \to V_2 \to V_3 \to \cdots \to V_N \to 1. \tag{7.2}$$
If $K_n \subset V_n$ is the kernel/image, then, by exactness, $K_n \equiv V_{n-1}/K_{n-1}$ (retaining multiplicative notation). So, repeating,

\[ K_n = V_{n-1}/V_{n-2}/V_{n-3}/\cdots/V_1/V_0/1, \tag{7.3} \]
and taking dimensions,

\[ s_n = \dim K_n = b_{n-1} - b_{n-2} + \cdots + (-1)^{n+1} b_0 = b_n - b_{n+1} + \cdots + (-1)^{N-n} b_N, \tag{7.4} \]
using the fact that s N +1 , which equals the alternating sum of all the Betti numbers, vanishes. Applying these formulae to the exact sequence of relative homology, (7.1), yields sn (M) = bn (M) − bn (M, B) + bn−1 (B) − bn−1 (M) . . . , sn (M, B) = bn (M, B) − bn−1 (B) + bn−1 (M) − bn−1 (M, B) . . . , sn (B) = bn (B) − bn (M) + bn (M, B) − bn−1 (B) . . . , showing how the count of electric charges, (4.6), depends on the topology of space-time, M. It is familiar that in the understanding of electro-magnetic duality on closed spacetime manifolds, M, a property known as Poincaré duality is important. There is an analogous property for manifolds with boundary that will play an important role in the present context. This is known as Poincaré-Lefschetz duality and a short explanation follows. Corresponding to the integer homology groups already defined it is possible to define integer cohomology groups denoted H q (M; ZZ ) and so on. There is also an exact sequence of homomorphisms linking these in the sense of ascending superscript: . . . → H p (B) → H p+1 (M, B) → H p+1 (M) → H p+1 (B) → . . . .
(7.5)
The statement of Poincaré-Lefschetz duality is that the corresponding terms in the two exact sequences (7.1) and (7.5) are isomorphic as groups. So

\[ H^{p+1}(M,B;\mathbb{Z}) \equiv H_{m-p-1}(M;\mathbb{Z}), \qquad H^{p+1}(M;\mathbb{Z}) \equiv H_{m-p-1}(M,B;\mathbb{Z}), \tag{7.6} \]

and

\[ H^{p}(B;\mathbb{Z}) \equiv H_{m-p-1}(B;\mathbb{Z}). \tag{7.7} \]
The last isomorphism is simply Poincaré duality for the boundary, B, which is automatically a closed manifold of dimension m − 1. Notice how the superscripts and subscripts in an isomorphism are always complementary in the sense of summing to the dimension of the relevant manifold and how (7.6) relates relative topology to absolute topology in the bulk. There is yet another relation between homology and cohomology that results from the universal coefficient theorem by considering the coefficients to be real numbers rather than integers. The resultant groups are simply the vector spaces, with dimension equal to the Betti number, spanned by the lattices given by the free parts of the integer groups as previously mentioned. Then a homology group of given suffix and type is the dual of the cohomology group of corresponding superscript and type:

\[ H^q(M;\mathbb{R}) = H_q(M;\mathbb{R})^*, \qquad H^q(B;\mathbb{R}) = H_q(B;\mathbb{R})^*, \qquad H^q(M,B;\mathbb{R}) = H_q(M,B;\mathbb{R})^*. \tag{7.8} \]
By means of these and the Poincaré-Lefschetz duality relations (7.6) and (7.7), the cohomology groups can be eliminated to yield the following relations between homology groups:

\[ H_q(M;\mathbb{R}) = H_{m-q}(M,B;\mathbb{R})^* \quad\text{and}\quad H_q(B;\mathbb{R}) = H_{m-q-1}(B;\mathbb{R})^*. \tag{7.9} \]

Because the Betti numbers are the dimensions of these real vector spaces, particular consequences are the following equalities:

\[ b_q(M) = b_{m-q}(M,B) \quad\text{and}\quad b_q(B) = b_{m-q-1}(B). \tag{7.10} \]
The corresponding duality relations for the dimensions of the image/kernels of the exact sequence, the numbers $s_q(M)$, $s_q(B)$ and $s_q(M,B)$, will be important and are easily obtained by recognising that, in the simplified notation for the exact sequence, (7.2), $V_n = V_{N-n}^*$. So $b_n = b_{N-n}$ and hence, by (7.4), $s_n = s_{N+1-n}$. As a consequence $\dim V_n = b_n = s_n + s_{n+1} = \dim K_n + \dim (V/K)_n$ and $b_{N-n} = \dim V_{N-n} = s_{N+1-n} + s_{N-n} = \dim (V/K)_{N-n} + \dim K_{N-n}$. This means that the dimensions of the two complementary subspaces of $V$, namely $K$ and $V/K$, interchange under duality, $n \leftrightarrow N-n$. In particular

\[ s_{m-p-1}(M,B) = b_{p+1}(M) - s_{p+1}(M) \quad\text{and}\quad s_{p+1}(M) = b_{m-p-1}(M,B) - s_{m-p-1}(M,B). \tag{7.11} \]
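The counting rule (7.4) and the duality $s_n = s_{N+1-n}$ can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `kernel_dims` is hypothetical, and the input is an illustrative choice (the solid torus $M = D^2 \times S^1$ with boundary $B = T^2$, so $m = 3$), whose Betti numbers are listed in the order of the sequence (7.1).

```python
def kernel_dims(betti):
    """Dimensions s_0..s_{N+1} of the kernel/image subspaces K_n of an exact
    sequence 0 -> V_0 -> V_1 -> ... -> V_N -> 0 with dim V_n = betti[n].

    Uses s_0 = 0 and the recursion s_{n+1} = b_n - s_n, which is (7.4)
    unrolled one step at a time.
    """
    s = [0]
    for b in betti:
        nxt = b - s[-1]
        if nxt < 0:
            raise ValueError("dimensions are incompatible with exactness")
        s.append(nxt)
    if s[-1] != 0:
        raise ValueError("alternating sum of dimensions must vanish")
    return s


# Illustrative input (an assumption, not taken from the text): solid torus,
# m = 3, so the sequence has 3m = 9 non-trivial terms, in the order
# H_3(M,B), H_2(B), H_2(M), H_2(M,B), H_1(B), H_1(M), H_1(M,B), H_0(B), H_0(M).
SOLID_TORUS = [1, 1, 0, 1, 2, 1, 0, 1, 1]

s = kernel_dims(SOLID_TORUS)
N = len(SOLID_TORUS) - 1

# b_n = s_n + s_{n+1}: each V_n splits into K_n and (V/K)_n.
assert all(s[n] + s[n + 1] == SOLID_TORUS[n] for n in range(N + 1))
# Duality s_n = s_{N+1-n}, which follows from b_n = b_{N-n}.
assert all(s[n] == s[N + 1 - n] for n in range(N + 2))
print(s)  # [0, 1, 0, 0, 1, 1, 0, 0, 1, 0]
```

In this example the entry `s[4] = 1` is the dimension of the kernel inside $H_1(B)$, consistent with exactly one of the two independent cycles of the torus bounding in the solid torus.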
In fact, by (7.10) these two equations are the same as each other.

8. Electric Charges as Intersection Numbers

With this information we are now well prepared to consider the physical question as to the possible numerical values of the generalised electric charges (2.1). Given a suitable expression for the conserved, localised electric current, j, the charges are evidently determined without recourse to Maxwell’s equations (1.4) and (1.5). Hence in this calculation these equations can be temporarily renounced, provided it is remembered that their reinstatement will reduce the number of independent electric charges, as explained in Sect. 3. We shall defer this reinstatement and the detailed understanding of the issues it raises until the following section. Just as in the discussion of magnetic fluxes and their quantisation in Sect. 6, we shall have to resort to the geometrical picture of a brane world-volume, as this will give us a tractable form for the current. This is found by equating (1.1) and (6.1), the two versions of the term in the action responsible for the brane coupling to the gauge potential:

\[ q \int_w A = \int_M A \wedge {*j}. \tag{8.1} \]
Now A is taken to be an arbitrary (p + 1)-form on M, so it follows that

\[ {*j} = q\,\mu(w), \tag{8.2} \]

where µ(w) is a singular (m − p − 1)-form involving a product of the same number of Dirac δ-functions with support on the absolute (p + 1)-cycle w and differentials in the variables transverse to it. It follows that its restriction to B vanishes, as it should (3.1). In the Appendix it will be shown to be closed as well. Inserting the p-brane current (8.2) into the electric charge (2.1) yields

\[ Q(w;S) = q \int_S \mu(w). \]

This is invariant under relative homologies of S according to the work of Sect. 3. Now consider a bulk homology of the world volume, w, $w \to w' = w + \partial a$. By linearity $\mu(w') = \mu(w) + \mu(\partial a)$. As discussed in the Appendix, µ(a) exists for a bulk chain, a (and now involves step functions as well as Dirac δ-functions) and, moreover, obeys dµ(a) = µ(∂a), up to a sign. Hence the change in the electric charge, Q(w; S), due to this homology is

\[ Q(w'-w;S) = q\int_S \mu(\partial a) = q\int_S d\mu(a) = q\int_{\partial S} \mu(a) = q\int_{\partial S} \mu(a)|_B = 0. \]
So Q(w; S) is defined on the homology classes Hm− p−1 (M, B; ZZ ) × H p+1 (M; ZZ ), or rather on the corresponding product of free parts. So it can be assumed that the relative cycle S intersects the absolute bulk cycle of complementary dimension, w, at discrete points. Then the integral for the electric charge is recognised as [HT] Q(w; S) = q I (w, S), where I (w, S) denotes the intersection number of the absolute bulk cycle w with the relative cycle S, being the algebraic sum of the number of these points, taking into account signs due to relative orientation. This intersection number possesses a number of mathematical properties that are important for the physical interpretation of this result that we shall now describe. Choose bases S j and wi in the lattices Fm− p−1 (M, B; ZZ ) and F p+1 (M; ZZ ) that are the free parts of the two relevant homology groups. Then all intersection numbers are specified by knowledge of the matrix I (wi , S j ) = Ii j
∈ ZZ .
(8.3)
This intersection matrix, I , has b p+1 (M) rows and bm− p−1 (M, B) columns and hence is square, by (7.10). Yet another consequence of Poincaré-Lefschetz duality is that this matrix I is unimodular: det I = ±1.
(8.4)
Putting these results together it follows that all electric charges are quantised: Q(S) ∈ q ZZ
(8.5)
as integral multiples of the coupling constant, q, that enters the action. Thus any electric charge paired with any magnetic flux satisfies the Dirac quantisation condition: Q(S) M (v) ∈ 2π ZZ .
(8.6)
In deriving this quantisation condition it was implicitly assumed that there is only one species of p-brane and that it had a definite coupling constant, q, as defined above. Classically it is possible to imagine several distinct species of p-brane, distinguished by different coupling constants, $q_1, q_2, \ldots, q_N$, whose ratios may be irrational. Then the total electric charge contained in the relative cycle, S, would now be

\[ Q(S) = \sum_{i=1}^{N} q_i\, I(w_i, S), \tag{8.7} \]
where $w_i$ is the world-volume of the brane of species i. This electric charge is not quantised if the ratios of coupling constants are irrational. However, in order to make sense of the exponentiated quantum action (6.1) for each species of brane the magnetic flux quantisation (6.2) has to hold separately with q replaced in turn by each species of coupling constant $q_1, q_2, \ldots, q_N$. It is this that forces their ratios to be rational, as we now show. Quantum mechanical consistency requires any given flux M (v) to be quantised separately for each of the N coupling constants,

\[ M(v) = \frac{2\pi}{q_1} m_1 = \frac{2\pi}{q_2} m_2 = \cdots = \frac{2\pi}{q_N} m_N, \]
in which $m_i \in \mathbb{Z}$ for all $i = 1, \ldots, N$. Therefore $q_i/q_j = m_i/m_j$ for all i and j, so that the ratios of the coupling constants must be rational, as claimed. It follows that, for any fixed j,

\[ Q(S)\, M(v) = \sum_{i=1}^{N} q_i\, I(w_i,S)\, \frac{2\pi}{q_j}\, m_j = 2\pi \sum_{i=1}^{N} I(w_i,S)\, \frac{q_i}{q_j}\, m_j = 2\pi \sum_{i=1}^{N} I(w_i,S)\, m_i, \tag{8.8} \]
which is in $2\pi\mathbb{Z}$. That is, (8.6) must remain true in the presence of several distinct species of p-branes carrying different charges. If Maxwell’s equations are now taken into account, then it follows that certain of the electric fluxes are indeed quantised, namely those obtained by integrating over boundary cycles within $K_{m-p-2}(B;\mathbb{Z})$, as these fluxes are equal to electric charges. On the other hand, there is no reason to believe that the remaining electric fluxes are quantised, and we shall return to some comments on this later. All that can be said as a result of (8.5) is that, for the latter fluxes, the quantities $\exp(2\pi i\, E(v)/q)$ are well defined on the cosets $(H/K)_{m-p-2}(B;\mathbb{Z})$. The physical consequence of I being unimodular is that a configuration of the brane world-volume can be found that realises any assignment of charges satisfying (8.5). Finally let us comment on the connection between the physical arguments of this section and the mathematical arguments of the preceding one, outlining Poincaré-Lefschetz duality. By current conservation, (1.6), the dual current, ∗j, is a closed (m − p − 1)-form that, in addition, has vanishing restriction on the boundary of space-time, (3.1). This means that it is a relative (m − p − 1)-cocycle in the sense of de Rham cohomology, and so defines a class of $H^{m-p-1}_{\text{de Rham}}(M,B;\mathbb{R})$. The same is therefore true of µ(w) by (8.2), which therefore provides a map from the homology class of the world-volume, w, in $H_{p+1}(M;\mathbb{Z})$ to $H^{m-p-1}_{\text{de Rham}}(M,B;\mathbb{R})$. This is part of the Poincaré-Lefschetz isomorphism, (7.6). It is possible to develop this line of thought and this is done in the Appendix.
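The multi-species argument leading to (8.8) can be illustrated in exact rational arithmetic. This is only a sketch under stated assumptions: the function name `charge_times_flux` and the sample numbers are hypothetical, and the flux is supplied in units of 2π so that every quantity stays rational.

```python
from fractions import Fraction


def charge_times_flux(m, intersections, flux_over_2pi):
    """Return Q(S) * flux / (2*pi) for N brane species.

    m[i]             integer flux quantum number of species i
    intersections[i] integer intersection number I(w_i, S)
    flux_over_2pi    the common flux divided by 2*pi (any non-zero rational)

    The couplings are fixed by the flux condition q_i = 2*pi*m_i / flux,
    so every ratio q_i / q_j = m_i / m_j is rational.
    """
    phi = Fraction(flux_over_2pi)
    q = [Fraction(mi) / phi for mi in m]                  # couplings q_i
    Q = sum(qi * Ii for qi, Ii in zip(q, intersections))  # total charge, as in (8.7)
    return Q * phi                                        # should equal sum_i I_i * m_i


# Hypothetical data: three species and an arbitrary rational value of flux/(2*pi).
result = charge_times_flux([2, 3, -5], [1, 4, 2], Fraction(7, 3))
assert result.denominator == 1  # Q(S) * flux lies in 2*pi*Z
print(result)  # 4
```

The individual couplings here are not integer multiples of a single q, yet the product of total charge and flux is still an integer multiple of 2π, which is the content of (8.8).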
9. Maxwell’s Equations and the Intersection Matrix Two (correct) arguments have been developed in this paper that apparently lead to a contradiction. We shall now explain what this is, and how it is resolved by finding that the intersection matrix, I , (8.3), has further detailed properties, hitherto unexpected. In Sect. 4 it was shown that the effect of Maxwell’s equations is to force all electric charges, Q(S), to vanish when the relative cycle over which they are integrated, S, belongs to K m− p−1 (M, B; ZZ ). Yet according to the preceding section, irrespective of Maxwell’s equations, the electric charge due to a brane configuration with worldvolume, w was seen to be proportional to the intersection number of w with S. The apparent contradiction arises from the fact that the intersection matrix is non-singular, as a consequence of its being unimodular (8.4). To make this clearer, it is natural to partition the intersection matrix in a way that distinguishes each kernel K within each H from the cosets H/K . This is done by choosing the basis {S j } so that the first sm− p−1 (M, B) elements form a basis of K m− p−1 (M, B) while the remainder refer to the cosets (H/K )m− p−1 (M, B). The basis {wi } is chosen so that the last s p+1 (M) elements form a basis for K p+1 (M) while the remainder refer to the cosets. Corresponding to this, the intersection matrix, (8.3), is written in the block form
\[ I(w,S) \;=\; \begin{array}{c|cc} & K(M,B) & (H/K)(M,B) \\ \hline (H/K)(M) & A & Y \\ K(M) & X & B \end{array} \tag{9.1} \]
That electric charges associated with $K_{m-p-1}(M,B)$ all vanish seems to imply that the submatrices A and X vanish, apparently contradicting the fact that the overall matrix has determinant equal to ±1. But this is not a correct interpretation of what has been shown. The correct interpretation is that the absolute bulk homology classes of the brane world-volume, w, that yield non-zero charges associated with relative cycles of homology belonging to the subgroup $K_{m-p-1}(M,B;\mathbb{Z})$ are all forbidden because Maxwell’s equations cannot then be integrated to yield field strengths, given the corresponding currents (8.2). Thus the only permitted homology classes of world-volume are those whose intersection numbers with all elements of $K_{m-p-1}(M,B;\mathbb{Z})$ vanish. These classes should form a subgroup and it is natural to anticipate that this be provided by the kernel $K_{p+1}(M;\mathbb{Z})$. The condition for this is that the submatrix X in (9.1) vanish. This is perfectly consistent with the unimodularity of I, (8.4), since, by (7.11), the consequences of Poincaré-Lefschetz duality for the kernels, the block diagonal submatrices A and B are both square. Consequently ±1 = det I = det A det B, implying that the block diagonal submatrices A and B, possessing integer entries, are both unimodular too. Thus it is the submatrix B that gives the physical charges, Q(S), for S in the coset $(H/K)_{m-p-1}(M,B;\mathbb{Z})$ as it determines the intersection numbers between these relative classes and the permitted homology classes of brane world-volume. According to (5.2) (with m − p − 1 replaced by p + 1), these permitted world-volumes are those homologous to cycles in the boundary of space-time, B. It is remarkable that such a
selection rule on brane configurations can be derived without recourse to any equations of motion for the brane degrees of freedom. Since the submatrix, B, that determines the physical charges, is unimodular, the previous conclusion that there exist brane configurations realising any assignment of quantised charges, (8.5), holds good even when the selection rule is taken into account. What remains is to provide an independent check that the block submatrix X in (9.1) vanishes. This is a geometrical condition that should hold for any background space-time, M, with boundary B, and it can be rewritten as: I (K p+1 (M), K m− p−1 (M, B)) = 0.
(9.2)
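The role of the block structure can be made concrete with a small integer example. The sketch below is illustrative only: the blocks A, Y, X, B are hypothetical numbers, and the helper names `det`, `assemble` and `row_times_matrix` are not from the text. It checks that when X vanishes and I is unimodular, det I = det A det B, and that a unimodular B lets any integer assignment of charges (in units of q) be realised by an integer combination of permitted world-volume classes.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def assemble(A, Y, X, B):
    """Build the block matrix [[A, Y], [X, B]] laid out as in (9.1)."""
    top = [ra + ry for ra, ry in zip(A, Y)]
    bottom = [rx + rb for rx, rb in zip(X, B)]
    return top + bottom


def row_times_matrix(n, B):
    """Charges (in units of q) carried by the world-volume sum_i n_i w_i."""
    return [sum(n[i] * B[i][j] for i in range(len(n))) for j in range(len(B[0]))]


# Hypothetical blocks with X = 0, as required by (9.2).
A = [[1]]
Y = [[3, -2]]
X = [[0], [0]]
B = [[2, 1], [1, 1]]

I = assemble(A, Y, X, B)
assert det(I) == det(A) * det(B) == 1

# Because det B = 1, the target charges (3, 5) are realised by integer n:
assert row_times_matrix([-2, 7], B) == [3, 5]
```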
This vanishing theorem will be demonstrated in the next section using some results developed in the Appendix. 10. De Rham Cohomology, Field Strengths and Currents The argument will be interesting as it brings into play further parallels between physical and mathematical concepts and sheds light on the more abstract ideas involving the two related exact sequences mentioned previously and to be elaborated below. We can no longer avoid describing de Rham cohomology which deals with the exterior derivative, d, of differential forms (such as the field strengths and currents we have been talking about). We have to explain the three types of cohomology group that arise, given a manifold with boundary; how they can be arranged in an exact sequence, and how that exact sequence is related to the one for homology groups already explained. A real q-form, ω, on M is an absolute bulk cocycle if it is coclosed, dω = 0. Two such cocycles are absolutely cohomologous in the bulk if H q (M; IR) :
\[ \omega \sim \omega' \iff \omega' = \omega + d\alpha. \tag{10.1} \]
As indicated, these form equivalence classes which constitute elements of the absolute bulk cohomology group $H^q(M)$ (in the sense of de Rham). The groups are abelian since composition is by addition. Actually these groups are real vector spaces since they are closed under multiplication by real numbers. Essentially the same concepts can be applied to q-forms, φ, on the boundary, B. φ is a boundary cocycle if it exists on B and is coclosed there. Two such cocycles are absolutely cohomologous on the boundary if $H^q(B;\mathbb{R})$ :
\[ \phi \sim \phi' \iff \phi' = \phi + d\beta. \tag{10.2} \]
Again these are equivalence relations whose classes form the group indicated. The third and last concept is that of a relative cocycle, η, which is defined in the bulk, on M, is coclosed there, dη = 0, and has vanishing restriction to the boundary, $\eta|_B = 0$. Two such cocycles are relatively cohomologous if they differ by a coexact form dα with the property that the restriction of α to the boundary is coexact there:

\[ H^q(M,B;\mathbb{R}) : \quad \eta \sim \eta' \iff \eta' = \eta + d\alpha, \quad \alpha|_B = d\beta. \tag{10.3} \]

Again these are equivalence relations whose classes form the group indicated. In each case the cohomology relation preserves the appropriate coclosure property. Physical
examples are provided by the field strength, F, which is an absolute bulk (p + 2)-cocycle, $*F|_B$, which is an absolute boundary (m − p − 2)-cocycle, and the dual current, ∗j, which is a relative (m − p − 1)-cocycle. Thus there are three types of cohomology matching the three types of homology already explained. Furthermore they both exist for a range of values of the integer q specifying the dimension of the cycle or the degree of the form as appropriate. When taken over real numbers, homology and cohomology groups of matching type and integer q are related in a nice way, as dual vector spaces, see (7.8). To understand the first example of these relations, let ω be a q-cocycle and v a q-cycle, both in the absolute bulk sense, and consider

\[ \int_v \omega \in \mathbb{R}. \]
This integral enjoys a number of properties: 1) It is invariant under the appropriate homologies of v, $v \to v' = v + \partial a$, and cohomologies of ω, (10.1). 2) It is linear in v and ω separately and hence provides a real bilinear form. 3) It is nonsingular; that is, there is no nontrivial class of either type such that the integral vanishes for all classes of the other type. The first two properties are easy to check but the third, nonsingularity, is quoted as a known theorem (of de Rham). Of course the magnetic flux (6.2) already defined is an example of such an integral. Precisely analogous constructions work for the other two types of homology/cohomology and yield the remaining duality relations (7.8). Physical examples of these integrals are provided by electric charge, (2.1), and electric flux, (4.2), involving relative and boundary homology/cohomology respectively. Space-time, M, is itself a relative m-cycle and hence it is appropriate to integrate relative m-cocycles over it. The wedge product η ∧ ω is such a cocycle if η and ω are respectively relative and absolute bulk cocycles of complementary degree (summing to m). So it is natural to consider

\[ \int_M \eta \wedge \omega \in \mathbb{R}. \]
This integral is (1) invariant under the appropriate cohomologies of ω and η, (10.1) and (10.3), (2) bilinear in ω and η, (3) nonsingular. As a consequence there results the duality relation $H^q(M;\mathbb{R}) = H^{m-q}(M,B;\mathbb{R})^*$ which, when combined with the previous duality relations (7.8), implies $H^q(M;\mathbb{R}) = H_{m-q}(M,B;\mathbb{R})$ and $H^q(M,B;\mathbb{R}) = H_{m-q}(M;\mathbb{R})$, a weak version of Poincaré-Lefschetz duality, (7.6) (weak because it is over the reals rather than the integers). A weak version (over the reals) of the similar relation for the boundary, (7.7), can likewise be checked.
These results are sufficient to show that there exists an exact sequence of de Rham cohomology groups but it is worth demonstrating this explicitly in order to find precise definitions of the common kernel/image subgroups of these groups. Relative cocycles are automatically absolute cocycles in the bulk too and this leads to the homomorphism

\[ j^* : H^q(M,B) \to H^q(M). \tag{10.4} \]

The kernel of $j^*$ is made up of the elements that are trivial in $H^q(M)$:

\[ K^q(M,B) = \{\text{classes of } H^q(M,B) \text{ satisfying } \eta = d\alpha,\ d\alpha|_B = 0\}, \tag{10.5} \]
while the image appears to consist of elements of $H^q(M)$ with $\omega|_B$ vanishing. Absolute cocycles in the bulk automatically yield cocycles in the boundary when restricted to it. So $\omega \to \omega|_B$ yields the homomorphism

\[ i^* : H^q(M) \to H^q(B) \tag{10.6} \]

with kernel

\[ K^q(M) = \{\text{classes of } H^q(M) \text{ with } \omega|_B \text{ coexact}\}. \tag{10.7} \]
This obviously includes the image of $j^*$ and tallies with it after applying bulk cohomologies (10.1). The image of $i^*$ will be specified below. Given a coclosed form, $\beta_0$, on the boundary, $d\beta_0 = 0$ on B, there is a way to find a closed form $\eta_\beta$, of one degree higher, on the bulk whose restriction to the boundary automatically vanishes so that it is relatively coclosed. Although the procedure is not unique, the degree of ambiguity lies in a single relative cohomology class and so the procedure leads to a homomorphism, known as the Bockstein homomorphism:

\[ d^* : H^q(B) \to H^{q+1}(M,B). \tag{10.8} \]
Let β denote an extension of $\beta_0$ from the boundary, that is, a form on M, not necessarily closed, satisfying $\beta|_B = \beta_0$. Then, if $\eta_\beta = d\beta$, $d\eta_\beta = 0$ and $\eta_\beta|_B = d\beta|_B = d\beta_0 = 0$ and so $\eta_\beta$ is a relative cocycle. Consider now $\beta_0$ and $\beta_0'$, forms which are cohomologous in $H^q(B)$, (10.2), so $\beta_0' - \beta_0 = d\alpha$ (on B). If they have extensions β and β′, respectively, to the bulk, then $\eta_{\beta'} - \eta_\beta = d(\beta' - \beta)$ and $(\beta' - \beta)|_B = d\alpha$, which means $\eta_\beta$ and $\eta_{\beta'}$ are relatively cohomologous, (10.3), as desired. In particular, this applies to the ambiguity arising when β and β′ are different extensions of the same $\beta_0$. The image of $d^*$ is obviously given by increasing q by unity in (10.5), originally the kernel of $j^*$, but the kernel of $d^*$ is trickier. Obviously $\eta_\beta$ is trivial in relative cohomology (10.3) whenever $\beta_0$ is coexact but this means it is trivial in $H^q(B)$. But $\eta_\beta$ is also trivial if it vanishes, that is if $\beta_0$ extends to a form β in the bulk which is still coclosed. Thus

\[ K^q(B) = \{\text{classes of } H^q(B) \text{ extending to coclosed forms on } M\}. \tag{10.9} \]
This is also the image of $i^*$. Thus we have a series of identifications of images and kernels and the results can be all assembled in the following grand diagram:

\[
\begin{array}{ccccccccccc}
\cdots & \xrightarrow{\;i^*\;} & H^{p}(B) & \xrightarrow{\;d^*\;} & H^{p+1}(M,B) & \xrightarrow{\;j^*\;} & H^{p+1}(M) & \xrightarrow{\;i^*\;} & H^{p+1}(B) & \xrightarrow{\;d^*\;} & \cdots \\
 & & \updownarrow & & \updownarrow & & \updownarrow & & \updownarrow & & \\
\cdots & \xrightarrow{\;\partial_*\;} & H_{m-p-1}(B) & \xrightarrow{\;i_*\;} & H_{m-p-1}(M) & \xrightarrow{\;j_*\;} & H_{m-p-1}(M,B) & \xrightarrow{\;\partial_*\;} & H_{m-p-2}(B) & \xrightarrow{\;i_*\;} & \cdots
\end{array}
\tag{10.10}
\]

The upper sequence is composed of the homomorphisms of the de Rham cohomology groups just described. It is exact because at each stage the kernels and images coincide as was just explained. The lower sequence is the exact sequence of homology (7.1) explained in previous sections whilst the vertical arrows indicate the Poincaré-Lefschetz isomorphisms (7.6) and (7.7). The most powerful version of this diagram refers to groups taken over the integers, $\mathbb{Z}$, but for some parts of the diagram we have only given arguments establishing a weaker version, over the reals, $\mathbb{R}$. By (7.3), a consequence of exactness, the kernel subgroups of the pairs of groups related by the Poincaré-Lefschetz isomorphism are themselves isomorphic. This suggests that the result we wish to prove, (9.2), is equivalent to its cohomological counterpart:

\[ \int_M \eta \wedge \omega = 0 \quad \text{if } \eta \in K^{m-q}(M,B;\mathbb{R}) \text{ and } \omega \in K^{q}(M;\mathbb{R}). \tag{10.11} \]

But this is quite easy to prove using the results above, as we now see. By (10.5) the integral equals $\int_M d\alpha \wedge \omega = \int_M d(\alpha \wedge \omega)$, as dω vanishes by (10.7). By Stokes’ theorem on M the integral equals $\int_B \alpha \wedge \omega = \int_B \alpha|_B \wedge \omega|_B = \int_B \alpha \wedge d\gamma$, as $\omega|_B$ is coexact by (10.7). But, by (10.5), α is coclosed on B, so the integral equals, up to a sign, $\int_B d(\alpha \wedge \gamma) = \int_{\partial B} \alpha \wedge \gamma = 0$, by Stokes’ theorem for the boundary, B, and the fact that the latter is automatically closed. Vanishing theorems analogous to (10.11) also apply to the integrals like $\int_w \omega$ coupling a pair of like homology and cohomology groups. For example, the electric charge $Q(S) = \int_S {*j}$ couples the relative homology of the integration domain, S, $H_{m-p-1}(M,B)$, to the relative de Rham cohomology of the dual current, ∗j, $H^{m-p-1}(M,B)$, and vanishes when $S \in K_{m-p-1}(M,B)$, (4.4), and ${*j} \in K^{m-p-1}(M,B)$, (10.5). The latter condition certainly holds when Maxwell’s equation, (1.5), for the field strength, F, holds. This vanishing theorem is then precisely what was proven in our earlier discussion of electric charges, and that is now seen to be part of a more general pattern. The last step is the derivation of the vanishing theorem (9.2) for the intersection matrix from the vanishing theorem for cohomology, (10.11), proven above, using the upward arrow in the Poincaré-Lefschetz isomorphism, (10.10). A convenient concrete version of this map is provided by the quantity µ(w) that enters the expression (8.2) for the dual current ∗j due to a brane whose world-volume is the absolute cycle w, and generalisations of this to be explained in the Appendix. These maps will provide homomorphisms between the groups indicated in (10.10), mapping the appropriate kernel subgroups into each other. The desired result follows by combining these results with the fact that the intersection number can be written

\[ I(w,S) = \int_M \mu(w) \wedge \mu(S). \]
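The homomorphism $d^*$ of (10.8) and its kernel (10.9) can be seen in the simplest possible setting. The following worked example is an illustration only and is not taken from the text: it assumes $M = [0,1]$ (so $m = 1$), with boundary $B = \{0\} \cup \{1\}$, and takes $q = 0$.

```latex
% Assumed toy case: M = [0,1], B = {0} \cup {1}, q = 0.
% A 0-form on B is a pair of numbers, and H^0(B;\mathbb{R}) = \mathbb{R}^2.
\begin{align*}
\beta_0 &= (a, b), \qquad a = \beta_0(0),\ b = \beta_0(1), \\
\beta(x) &= a + (b - a)\,x \quad \text{(one extension to } M\text{)}, \\
\eta_\beta &= d\beta = (b - a)\,dx, \qquad \eta_\beta|_B = 0, \\
\int_M \eta_\beta &= \beta(1) - \beta(0) = b - a .
\end{align*}
% Any other extension \beta' = \beta + f has f(0) = f(1) = 0, so
% \eta_{\beta'} - \eta_\beta = df with f|_B = 0, a relative cohomology as in (10.3),
% and the integral b - a is unchanged.
% The class of \eta_\beta is trivial exactly when a = b, i.e. when \beta_0 is the
% restriction of a constant function on M:
\[
K^0(B) = \{(c, c) : c \in \mathbb{R}\} = \text{image of } i^* : H^0(M) \to H^0(B),
\]
% in agreement with (10.9).
```

Here $H^1(M,B;\mathbb{R})$ is one-dimensional, generated by the class of $dx$, so $d^*$ is onto and its kernel is the diagonal, as exactness requires.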
11. Discussion Motivated by the physical questions of elucidating and counting the types of conservation laws occurring in the sorts of generalised Maxwell theories that arise naturally in string/superstring theories formulated on a fixed background space-time of possibly complicated topology, we have been led to a well established area of pure mathematics. This is the theory of relative homology/cohomology associated with the space-time, assumed to have a boundary, and it seems not to be so familiar to physicists despite its evident physical relevance. Accordingly we have tried to build it up systematically, as guided by physics, and in particular, the generalised Maxwell equations, and included reasonably self-contained proofs. Given an understanding of the overall grand mathematical structure, comprising the two exact sequences of homology and cohomology and the Poincaré-Lefschetz isomorphism relating them, as depicted by (10.10), and the duality relations, (7.8), indicating a horizontal reflection symmetry of the exact sequences, it is relatively easy to explain the relevance to physics. This is what we now do because of the value of the new perspectives afforded. The first step is the recognition that there are precisely three types of conserved quantity, electric charge, (2.1), electric flux, (4.2) and magnetic flux, (6.2), and that these are associated with the three possible types of homology/cohomology, namely relative, boundary and absolute bulk, respectively. In fact these conserved quantities are invariant under the appropriate homologies/cohomologies and, indeed, constitute nonsingular bilinear forms on the free parts of these groups, thereby being responsible for the duality relations, (7.8), of the exact sequences (10.10). However this argument makes only partial use of the generalised Maxwell’s equations, (1.4) and (1.5), and the associated boundary conditions (3.1) and (3.2). 
What is used for each conserved quantity in turn is:

Electric charge, (2.1): $d{*j} = 0$ and ${*j}|_B = 0$,
Electric flux, (4.2): $d\{({*F})|_B\} = 0$,
Magnetic flux, (6.2): $dF = 0$.

With this limited information these three conserved quantities appear unrelated to each other and counted by the relevant Betti numbers, $b_{m-p-1}(M,B)$, $b_{m-p-2}(B)$ and $b_{p+2}(M)$, as explained above. The content in Maxwell’s equations that has not so far been exploited is the inhomogeneous Maxwell equation in the bulk, (1.5), and it has many extra consequences, as we have seen in the text. From the point of view of de Rham cohomology the most immediate is that the dual current, ∗j, is not just coclosed (current conservation, (1.6)) but coexact, and hence an element of the subgroup $K^{m-p-1}(M,B)$, (10.5), of the relative de Rham cohomology group. This is the subgroup that plays the role of kernel/image at this stage of the exact sequence of cohomology. Thus the exact sequence is now brought into play by means of the bulk Maxwell’s equations. When this current, j, is determined by the geometrical picture in terms of the p-brane world-volume, w, by (8.2), the fact that µ realises the Poincaré-Lefschetz isomorphism, as explained in the Appendix, means that the world-volume w must belong to a class of $K_{p+1}(M)$, (4.3), and hence be homologous to a cycle on the boundary, B, of space-time. This was one of our main results, obtained by a more roundabout, though more self-contained, method, when we were not taking the complete mathematical structure for granted. This conclusion is contrary to what would have seemed intuitively likely, that any configuration of brane world-volumes in space-time is possible. The reason unsuitable
configurations are forbidden is that they provide topological obstructions to the integration of the generalised Maxwell equations for which they provide sources, as argued in the text. Notice that in obtaining this selection rule it was not necessary to take into account any equations of motion for the brane degrees of freedom. We saw that another, related, consequence of Maxwell’s inhomogeneous equations in the bulk was the reduction of the count of linearly independent electric charges from the Betti number $b_{m-p-1}(M,B)$ to $s_{p+1}(M) = b_{m-p-1}(M,B) - s_{m-p-1}(M,B)$, corresponding to the number of linearly independent homology classes permitted for the p-brane world-volume. In Sect. 4 we saw that p-brane electric fluxes are classified by the boundary homology group, $H_{m-p-2}(B;\mathbb{Z})$, or, more precisely, by the free part of this abelian group obtained by dividing out the torsion subgroup, namely a lattice of dimension given by the Betti number $b_{m-p-2}(B)$. The effect of the inhomogeneous bulk Maxwell equations is to equate to electric charges those electric fluxes on the sublattice of dimension $s_{m-p-2}(B)$, corresponding to $K_{m-p-2}(B;\mathbb{Z})$. As a result these electric fluxes are quantised, as integer multiples of q, but this result does not apply to the remaining $b_{m-p-2}(B) - s_{m-p-2}(B)$ electric fluxes. There is no reason for them to be quantised. Through these boundary cycles there may also be magnetic fluxes, this time associated with $\tilde{p}$-branes, dual to the p-branes (so $p + \tilde{p} + 4 = m$). As seen in Sect. 6, these vanish on the afore-mentioned sublattice of dimension $s_{m-p-2}(B)$ whilst the remaining fluxes are quantised as integer multiples of 2π/q. Thus there is evidence of states carrying just quantised electric charge and no magnetic charge, and these must be the input p-brane states.
But the quantised magnetic flux and non-quantised electric flux through the (H/K )m− p−2 (B; ZZ ) cycles is rather reminiscent of known solutions [W1] to the Zwanziger-Schwinger quantisation condition [Z2, S2] applying to particles in four dimensional space-time and so provides evidence of mysterious and intriguing dyonic objects that are not situated on the space-time, M, according to (1.4). Maybe a better understanding of this phenomenon is important in connection with electromagnetic duality. At this stage, we should explain that one motivation for the present work was to gain a better understanding of electromagnetic duality [MO]. It has been understood that in a closed space-time of four dimensions, certain partition functions exhibit a beautiful covariance under the action of the modular group implementing electro-magnetic duality transformations [V, W2] and this is further enhanced when spin is taken into account [AO2]. It is also possible to include Wilson loops [Z1]. But, in closed space-times there are neither non-vanishing electric charges nor electric fluxes, only magnetic fluxes. Yet in supersymmetric gauge theories on flat space-time it is familiar that electromagnetic duality transformations permute electric and magnetic charges [S3]. So it might be important to consider space-times with boundary, as we have. As just explained there is a beautiful topological classification of conservation laws involving electric charge, electric flux and magnetic flux but leaving no room for the classification of magnetic charge. As a result we are left with a dilemma to be resolved by future work. There are many other questions left open for future work and many of them concern undesirable simplifications that have been made relative to the full complication of superstring theory. We shall conclude by listing some of these. Some of these oversimplifications are routine practice in the subject but should none-the-less be removed when possible. 
1) Branes have been treated as geometrical objects, cycles in space-time, and not assigned any sort of generalised quantum mechanical wave function as ideally they should. In the absence of this there is lacking the concept of intrinsic spin which is familiar for particle ($p = 0$) wave-functions on four dimensional space-time, and known to play a role in the understanding of electromagnetic duality [AO2].
2) No account is taken of any internal brane structure, such as gauge theories confined to the brane as sometimes required by supersymmetry. If so presumably a $K$-theory classification of this internal structure would be relevant, [W4].
3) Brane world-volumes have been treated as cycles. It might be more reasonable to allow them to have boundaries in the infinite past or future but we do not know how to do this.
4) No equations of motion for $p$-brane degrees of freedom have been considered. Partly this is because these equations ought to involve the wave-functions, not yet formulated properly anyway when $p > 0$.
5) No account is taken of any supersymmetry. This usually requires a spectrum of values of $p$ and the fact that some branes may possess boundaries situated on other branes [S6]. It would be interesting to know how the charges and fluxes we have discussed could be related to the tensor charges occurring in the supersymmetry algebra.
6) Branes have been treated as carrying only electric charge but maybe an additional magnetic charge should be allowed as an input in (1.4).
7) No account of Chern-Simons type terms has been taken in the generalised Maxwell equations. This could only occur when $p + 2$ is even and divides $m + 1$, as for the familiar case of $p = 2$ and $m = 11$, [CJS].
8) No special consideration has been made of the self-dual case when $m$ equals twice $p + 2$ (so $p = \tilde{p}$).
9) Only orientable manifolds have been considered but there is a possibility of interesting phenomena when space-time (or space) is not orientable [S4, S5, DH].
12. Appendix
The basic idea stems from the way the term in the action, (1.1), describing the geometrical coupling of the $p$-brane to the gauge potential $A$, defines the dual electric current, $*j$, via (6.1) to be proportional to a distribution valued differential form, $\mu(w)$, depending on the world-volume, $w$. Clearly this idea is motivated by physical considerations. A mathematical version had earlier been proposed by de Rham [dR]. So far the idea applies to absolute cycles and it has to be extended to relative cycles and to chains both in the bulk and on the boundary and this is now done. If $C$ is a $q$-chain containing no sub $q$-chain lying in the boundary, $B$, its dual current, $\mu(C)$, is defined by
$$\int_C f = \int_M f \wedge \mu(C),$$
where $f$ is an arbitrary $q$-form. On the other hand if $\gamma$ is a $q$-chain lying on the boundary, $B$, the dual surface current, $\nu(\gamma)$, is defined by
$$\int_\gamma g = \int_B g \wedge \nu(\gamma),$$
where $g$ is an arbitrary $q$-form on the boundary. Notice that even though $C$ and $\gamma$ are chains of the same dimension, $q$, $\mu(C)$ and $\nu(\gamma)$ are forms of different degree, $m - q$ and $m - q - 1$ respectively.
The boundary of $C$, $\partial C$, can be decomposed into two terms of the type just described, each of dimension one less: $\partial C = U + \alpha$, so that $\mu(C)$, $\mu(U)$ and $\nu(\alpha)$ are all well-defined. The integral $\int_{\partial C} h$ can be evaluated in two ways, first as $\int_M h \wedge \mu(U) + \int_B h_B \wedge \nu(\alpha)$, and secondly as $\int_M dh \wedge \mu(C)$, using Stokes' theorem. On integrating by parts this equals $\int_B h_B \wedge \mu(C)_B + (-1)^{d(C)} \int_M h \wedge d\mu(C)$, where $d(C)$ is the dimension of $C$. Equating the bulk and boundary terms separately yields the identities
$$d\mu(C) = (-1)^{d(C)} \mu(U) \qquad \text{and} \qquad \mu(C)_B = \nu(\alpha).$$
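The sign in the first identity can be checked directly. The following is a sketch, assuming the graded Leibniz rule for the $(d(C)-1)$-form $h$, Stokes' theorem on $M$ with $\partial M = B$, and reading the subscript $B$ as restriction to the boundary:

```latex
\begin{align*}
d\bigl(h \wedge \mu(C)\bigr)
  &= dh \wedge \mu(C) + (-1)^{d(C)-1}\, h \wedge d\mu(C), \\
\int_M dh \wedge \mu(C)
  &= \int_B h_B \wedge \mu(C)_B + (-1)^{d(C)} \int_M h \wedge d\mu(C).
\end{align*}
```

Comparing the right-hand side with $\int_M h \wedge \mu(U) + \int_B h_B \wedge \nu(\alpha)$ for arbitrary $h$ reproduces the two identities.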
These are precisely what is needed to check the properties of the upwards Poincaré-Lefschetz homomorphism from homology to de Rham cohomology. This will be done by exploiting the different ways of interpreting these relations and special cases of them. $C$ is a relative cycle if $U$ vanishes. Then $d\mu(C) = 0$ and so $\mu(C)$ is an absolute bulk de Rham cocycle as $\mu(C)_B$ need not vanish. $C$ is an absolute bulk cycle if $U$ and $\alpha$ both vanish. Then $d\mu(C)$ and $\mu(C)_B$ both vanish, implying that $\mu(C)$ is a relative de Rham cocycle. $U$ is a relative boundary and $\mu(U)$ is coexact and so trivial in absolute bulk cohomology. $U$ is an absolute boundary if $\alpha$ vanishes. Again $\mu(U)$ is coexact but in addition $\mu(U)_B = \nu(\alpha) = 0$ so that now $\mu(U)$ is trivial in relative cohomology. These four observations are sufficient to show that $\mu$ maps absolute or relative homology classes into relative or absolute cohomology classes respectively. By linearity these maps are homomorphisms and it is easy to see that their kernels include the torsion subgroups so more properly $\mu$ acts on the free parts, $F$, of the homology groups $H$ (obtained by dividing out the torsion). So
$$\mu: \; F_q(M; \mathbb{Z}) \to F^{m-q}(M, B; \mathbb{Z}) \qquad \text{and} \qquad F_q(M, B; \mathbb{Z}) \to F^{m-q}(M; \mathbb{Z}).$$
The last step is to check that $\mu$ maps the appropriate kernel subgroups into each other. $U$ is an absolute bulk cycle homologous to a boundary cycle, $-\alpha$, if $\partial U$ vanishes and so in a class of $K_q(M)$ by (5.2). But then $\mu(U) = d[(-1)^{d(C)} \mu(C)]$ and $\mu(C)_B = \nu(\alpha)$, where $d\nu(\alpha) = -(-1)^{d(C)} \nu(\partial\alpha) = 0$. So, by (10.5), $\mu$ maps from a class of $K_q(M)$ to a class of $K^{m-q}(M, B)$. Finally if $U$ vanishes and $\alpha = \partial\gamma$, then $\partial C = \partial\gamma$ meaning that $C$ is a relative cycle in a class of $K_q(M, B)$ by (4.4). Hence $d\mu(C)$ vanishes and $\mu(C)_B = \nu(\alpha) = \nu(\partial\gamma) = (-1)^{d(\gamma)} d\nu(\gamma)$. Thus, by (10.7), $\mu$ maps from a class of $K_q(M, B)$ to a class of $K^{m-q}(M)$. By the work of this Appendix, the intersection number
$$I(w, S) \equiv \int_S \mu(w) = \int_M \mu(w) \wedge \mu(S).$$
Furthermore if $w \in K_q(M; \mathbb{Z})$ and $S \in K_{m-q}(M, B; \mathbb{Z})$ then $\mu(w) \in K^{m-q}(M, B; \mathbb{Z})$ and $\mu(S) \in K^q(M; \mathbb{Z})$ so that, finally, $I(w, S)$ vanishes by (10.11), as desired.
Acknowledgements. D.I. Olive is belatedly grateful to G.-C. Wick for first introducing him to homology theory, long ago, and to Tobias Ekholm and Victor Pidstrigach for separately explaining important mathematical concepts to him. He thanks the Mittag-Leffler Institute (Djursholm), IFT (UNESP São Paulo), the Yukawa Institute (University of Kyoto) and NORDITA for hospitality whilst parts of this work were accomplished. M. Alvarez’s research has been supported by PPARC through the Advanced Fellowship PPA/A/S/1999/00486. Support to both of us from the European String Network HPRN-CT-2000-122 is also gratefully acknowledged.
References
[A] Alvarez, O.: Topological quantisation and cohomology. Commun. Math. Phys. 100, 279–309 (1985)
[AO1] Alvarez, M., Olive, D.I.: The Dirac quantisation condition for fluxes on four-manifolds. Commun. Math. Phys. 210, 13–28 (2000)
[AO2] Alvarez, M., Olive, D.I.: Spin and abelian electromagnetic duality on four-manifolds. Commun. Math. Phys. 217, 331–356 (2001)
[BT] Bott, R., Tu, L.W.: Differential forms in algebraic topology. Graduate Texts in Mathematics 82, Berlin Heidelberg New York: Springer, 1982
[CJS] Cremmer, E., Julia, B., Scherk, J.: Supergravity Theory In 11 Dimensions. Phys. Lett. B 76, 409 (1978)
[CS] Cremmer, E., Scherk, J.: Spontaneous dynamical breaking of gauge symmetry in dual models. Nucl. Phys. B 72, 117–124 (1974)
[D] Dirac, P.A.M.: Quantised singularities in the electromagnetic field. Proc. Roy. Soc. A 133, 60–72 (1931)
[dR] de Rham, G.: Variétés Différentiables. Paris: Hermann 1955; Differentiable Manifolds. Comprehensive Studies in Mathematics 266, Berlin Heidelberg New York: Springer, 1984
[DGHT] Deser, S., Gomberoff, A., Henneaux, M., Teitelboim, C.: Duality, self-duality, source and charge quantisation in abelian N-form theories. Phys. Lett. B 400, 80–86 (1997)
[DH] Diemer, T., Hadley, M.J.: Charge and the topology of space-time. Class. Quant. Grav. 16, 3567–3577 (1999)
[F] Flanders, H.: Differential forms, with applications to the physical sciences. New York: Academic, 1963, New York: Dover, 1989
[FS] Figueroa-O'Farrill, J., Stanciu, S.: D-brane charge, flux quantisation and relative (co)homology. JHEP 0101, 006 (2001)
[H] Hodge, W.V.D.: The theory and applications of harmonic integrals. Cambridge: Cambridge University Press, 1952
[HT] Henneaux, M., Teitelboim, C.: p-form Electrodynamics. Found. Phys. 16, 593–717 (1986)
[KR] Kalb, M., Ramond, P.: Classical Direct Interstring Action. Phys. Rev. D 9, 2273–2284 (1974)
[KS] Kalkkinen, J., Stelle, K.: Large gauge transformations in M-theory. J. Geom. Phys. 48, 100–132 (2003)
[M] Massey, W.S.: A basic course in algebraic topology. Graduate Texts in Mathematics 127, Berlin Heidelberg New York: Springer, 1991
[MW1] Misner, C.W., Wheeler, J.A.: Classical Physics as Geometry. Annals of Phys. 2, 525–603 (1957)
[MW2] Moore, G., Witten, E.: Self-duality, Ramond-Ramond fields, and K-theory. JHEP 0005, 32 (2000)
[MO] Montonen, C., Olive, D.I.: Magnetic monopoles as gauge particles? Phys. Lett. B 72, 117–120 (1977)
[N] Nepomechie, R.: Magnetic monopoles from antisymmetric tensor gauge fields. Phys. Rev. D 31, 1921–1924 (1985)
[O] Orland, P.: Instantons and Disorder in Antisymmetric Tensor gauge fields. Nucl. Phys. B 205 [FS8], 107–118 (1982)
[S1] Schwarz, A.: Topology for Physicists. Comprehensive Studies in Mathematics 308, Berlin Heidelberg New York: Springer, 1994
[S2] Schwinger, J.S.: A Magnetic Model Of Matter. Science 165, 757 (1969)
[S3] Sen, A.: Dyon-monopole bound states, self-dual harmonic forms on the multimonopole moduli space, and $SL(2, \mathbb{Z})$ invariance in string theory. Phys. Lett. B 329, 217–221 (1994)
[S4] Sorkin, R.: On the relation between charge and topology. J. Phys. A 10, 717–725 (1977)
[S5] Sorkin, R.: The quantum electromagnetic field in multiply connected space. J. Phys. A 12, 403–421 (1979)
[S6] Strominger, A.: Open p-branes. Phys. Lett. B 383, 44–47 (1996)
[T1] Teitelboim, C.: Gauge invariance for extended objects. Phys. Lett. B 167, 63–68 (1986)
[T2] Teitelboim, C.: Monopoles of higher rank. Phys. Lett. B 167, 69–72 (1986)
[V] Verlinde, E.: Global aspects of electric-magnetic duality. Nucl. Phys. B 455, 211–228 (1995)
[W1] Witten, E.: Dyons Of Charge $e\theta/2\pi$. Phys. Lett. B 86, 283–287 (1979)
[W2] Witten, E.: On S-duality in abelian gauge theory. Selecta Math (NS) 1, 383–410 (1995)
[W3] Witten, E.: On flux quantization in M-theory and the effective action. J. Geom. Phys. 22, 1–13 (1997)
[W4] Witten, E.: Overview of K-theory applied to strings. Int. J. Mod. Phys. A 16, 693 (2001)
[WY] Wu, T.T., Yang, C.N.: Concept of non-integrable phase factors and global formulation of gauge fields. Phys. Rev. D 12, 3845–3857 (1975)
[Z1] Zucchini, R.: Abelian duality and Wilson loops. Commun. Math. Phys. 242, 473–500 (2003)
[Z2] Zwanziger, D.: Quantum Field Theory Of Particles With Both Electric And Magnetic Charges. Phys. Rev. 176, 1489 (1968)
Communicated by G.W. Gibbons
Commun. Math. Phys. 267, 307–353 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0075-4
Communications in
Mathematical Physics
The Universality Classes in the Parabolic Anderson Model
Remco van der Hofstad1, Wolfgang König2, Peter Mörters3
1 Department of Mathematics and Computer Science, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands. E-mail: [email protected]
2 Mathematisches Institut, Universität Leipzig, 04109 Leipzig, Germany. E-mail: [email protected]
3 Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, United Kingdom. E-mail: [email protected]
Received: 18 March 2005 / Accepted: 20 February 2006
Published online: 3 August 2006 – © Springer-Verlag 2006
Abstract: We discuss the long time behaviour of the parabolic Anderson model, the Cauchy problem for the heat equation with random potential on $\mathbb{Z}^d$. We consider general i.i.d. potentials and show that exactly four qualitatively different types of intermittent behaviour can occur. These four universality classes depend on the upper tail of the potential distribution: (1) tails at ∞ that are thicker than the double-exponential tails, (2) double-exponential tails at ∞ studied by Gärtner and Molchanov, (3) a new class called almost bounded potentials, and (4) potentials bounded from above studied by Biskup and König. The new class (3), which contains both unbounded and bounded potentials, is studied in both the annealed and the quenched setting. We show that intermittency occurs on unboundedly increasing islands whose diameter is slowly varying in time. The characteristic variational formulas describing the optimal profiles of the potential and of the solution are solved explicitly by parabolas, respectively, Gaussian densities. Our analysis of class (3) relies on two large deviation results for the local times of continuous-time simple random walk. One of these results is proved by Brydges and the first two authors in [BHK05], and is also used here to correct a proof in [BK01].
1. Introduction and Main Results
1.1. The parabolic Anderson model. We consider the continuous solution $v : [0, \infty) \times \mathbb{Z}^d \to [0, \infty)$ to the Cauchy problem for the heat equation with random coefficients and localised initial datum,
$$\frac{\partial}{\partial t} v(t, z) = \Delta^{\mathrm{d}} v(t, z) + \xi(z) v(t, z), \qquad \text{for } (t, z) \in (0, \infty) \times \mathbb{Z}^d, \tag{1.1}$$
$$v(0, z) = \mathbb{1}_0(z), \qquad \text{for } z \in \mathbb{Z}^d. \tag{1.2}$$
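Here $\xi = (\xi(z) : z \in \mathbb{Z}^d)$ is an i.i.d. random potential with values in $[-\infty, \infty)$, and $\Delta^{\mathrm{d}}$ is the discrete Laplacian,
$$\Delta^{\mathrm{d}} f(z) = \sum_{y \sim z} \bigl[ f(y) - f(z) \bigr], \qquad \text{for } z \in \mathbb{Z}^d, \; f : \mathbb{Z}^d \to \mathbb{R}.$$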
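As a quick illustration of this definition, the discrete Laplacian can be applied to the initial datum of (1.2), the indicator of the origin. This is only a sketch; the symbol $e$ for a lattice unit vector is introduced here and is not the paper's notation. Each site has $2d$ neighbours, so:

```latex
\begin{align*}
\Delta^{\mathrm{d}} \mathbb{1}_0(0) &= \sum_{y \sim 0} \bigl[ 0 - 1 \bigr] = -2d, \\
\Delta^{\mathrm{d}} \mathbb{1}_0(e) &= \sum_{y \sim e} \bigl[ \mathbb{1}_0(y) - 0 \bigr] = 1
  \quad \text{for each of the } 2d \text{ unit vectors } e, \\
\sum_{z \in \mathbb{Z}^d} \Delta^{\mathrm{d}} \mathbb{1}_0(z) &= -2d + 2d \cdot 1 = 0 .
\end{align*}
```

In words: without the potential term, mass leaves the origin at rate $2d$, arrives at each neighbour at rate $1$, and the total is conserved. Only the term $\xi(z)v(t,z)$ in (1.1) creates or destroys mass.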
The parabolic problem (1.1) is called the parabolic Anderson model. The operator $\Delta^{\mathrm{d}} + \xi$ appearing on the right is called the Anderson Hamiltonian; its spectral properties are well-studied in mathematical physics. Equation (1.1) describes a random mass transport through a random field of sinks and sources, corresponding to lattice points $z$ with $\xi(z) < 0$, respectively, $> 0$. It is a linearised model for chemical kinetics [GM90], is equivalent to Burgers' equation in hydrodynamics [CM94], and describes magnetic phenomena [MR94]. We refer the reader to [GM90, M94, CM94] for more background and to [GK05] for a survey on mathematical results. The long-time behaviour of the parabolic Anderson problem is well-studied in the mathematics and mathematical physics literature because it is the prime example of a model exhibiting an intermittency effect. This means, loosely speaking, that most of the total mass of the solution,
$$U(t) = \sum_{z \in \mathbb{Z}^d} v(t, z), \qquad \text{for } t > 0, \tag{1.3}$$
is concentrated on a small number of remote islands, called the intermittent islands. A manifestation of intermittency in terms of the moments of $U(t)$ is as follows. For $0 < p < q$, the main contribution to the $q$th moment of $U(t)$ comes from islands that contribute only negligibly to the $p$th moments. Therefore, intermittency can be defined by the requirement,
$$\limsup_{t \to \infty} \frac{\langle U(t)^p \rangle^{1/p}}{\langle U(t)^q \rangle^{1/q}} = 0, \qquad \text{for } 0 < p < q, \tag{1.4}$$
where $\langle \, \cdot \, \rangle$ denotes expectation with respect to $\xi$. Whenever $\xi$ is truly random, the parabolic Anderson model is intermittent in this sense, see [GM90, Theorem 3.2]. However, one wishes to understand the intermittent behaviour in much greater detail. The following has been heuristically argued in the literature and has been verified, at least partially, for important special examples of potentials: the intermittent islands are characterized by a particularly high exceedance of the potential and an optimal shape, which is determined by a deterministic variational formula. A universal picture is present: the location and number of the intermittent islands are random, their size and the absolute height of the potential in the islands is $t$-dependent, but the (rescaled) shape depends neither on randomness nor on $t$. Examples studied include the double-exponential distribution [GM98], potentials bounded from above [BK01] and continuous analogues on $\mathbb{R}^d$ instead of $\mathbb{Z}^d$ like Poisson obstacle fields [S98] and Gaussian and other Poisson fields [GK00, GKM00]. A finer analysis of the geometry of the intermittent islands has been carried out for Poisson obstacle fields [S98] and the double-exponential distribution [GKM06]. In the present paper we initiate the study of the parabolic Anderson model for arbitrary potentials, with the aim of identifying all universality classes of intermittent behaviour that can arise for different potential distributions. Our standing assumption is that the potentials $(\xi(z) : z \in \mathbb{Z}^d)$ are independent and identically distributed and that all positive exponential moments of $\xi(0)$ are finite, which is necessary and sufficient for
the finiteness of the $p$th moments of $U(t)$ at all times. The long-term behaviour of the solutions depends strongly and exclusively on the upper tail behaviour of the random variable $\xi(0)$. It is fully described by the top of the spectrum of the Anderson Hamiltonian $\Delta^{\mathrm{d}} + \xi$ in large $t$-dependent boxes.
The outline of the remainder of this section is as follows. In Sect. 1.2, we formulate and discuss a mild regularity condition on the potential. In Sect. 1.3, we show that under this condition the potentials can be split into exactly four classes, which exhibit four different types of intermittent behaviour. Three of these classes have been studied in the literature up to now. A fourth class, the class of almost bounded potentials, is studied in the present paper for the first time. We present our results on the moment and almost-sure large-time asymptotics for $U(t)$ in Sect. 1.4. In Sect. 1.5, we give a heuristic derivation of the moment asymptotics, and in Sect. 1.6, we explain the variational problems involved.
1.2. Regularity assumptions. We first state and discuss our regularity assumptions on the potential. Roughly speaking, the purpose of these assumptions is to ensure that the potential has the same qualitative behaviour at different scales, and therefore the system does not belong to different universality classes at different times. Our assumptions refer to the upper tail of $\xi(0)$, and are conveniently formulated in terms of the regularity of its logarithmic moment generating function,
$$H(t) = \log \langle e^{t \xi(0)} \rangle, \qquad \text{as } t \uparrow \infty. \tag{1.5}$$
Note that $H$ is convex and $t \mapsto H(t)/t$ is increasing with $\lim_{t \to \infty} H(t)/t = \operatorname{esssup} \xi(0)$. To simplify the presentation, we make the assumption that if $\xi$ is bounded from above, then $\operatorname{esssup} \xi(0) = 0$, so that $\lim_{t \to \infty} H(t)/t \in \{0, \infty\}$. This is no loss of generality, as additive constants in the potential appear as additive constants both in $\frac{1}{pt} \log \langle U(t)^p \rangle$ and $\frac{1}{t} \log U(t)$. The first central assumption on $H$ is the following:
Assumption (H). $t \mapsto H(t)/t$ is in the de Haan class.
We say that a measurable function $\widetilde{H}$ is in the de Haan class if, for some regularly varying function $g : (0, \infty) \to \mathbb{R}$, the term $g(t)^{-1} (\widetilde{H}(\lambda t) - \widetilde{H}(t))$ converges to a nonzero limit as $t \uparrow \infty$, for any $\lambda > 1$. In the notation of [BGT87] this means that $\widetilde{H} \in \Pi_g$. Recall that a measurable function $g$ is called regularly varying if $g(\lambda t)/g(t)$ converges to a positive limit for every $\lambda > 0$. If this is the case, then the limit takes the form $\lambda^{\varrho}$, and $\varrho$ is called the index of regular variation. If $\varrho = 0$, then the function is called slowly varying. When $H(t)/t$ is in the de Haan class, then $H$ is regularly varying with some index $\gamma \in \mathbb{R}$, see [BGT87, Theorem 3.6.6]. By convexity of $H$, we have $\gamma \geq 0$. If $H$ is regularly varying with index $\gamma \neq 1$, then $H(t)/t$ is in the de Haan class, so that the statements are equivalent for $\gamma \neq 1$. However, if $\gamma = 1$, then this does not necessarily hold, see [BGT87, Theorem 3.7.4].
From the theory of regular functions we derive the existence of a function $\widehat{H}$ which can be characterized by two parameters, $\gamma \in [0, \infty)$ and $\rho \in (0, \infty)$, and plays an important role in the sequel.
Proposition 1.1. Assumption (H) is equivalent to the existence of a function $\widehat{H} : (0, \infty) \to \mathbb{R}$ and a continuous auxiliary function $\kappa : (0, \infty) \to (0, \infty)$ such that
$$\lim_{t \uparrow \infty} \frac{H(ty) - y H(t)}{\kappa(t)} = \widehat{H}(y) \neq 0, \qquad \text{for } y \in (0, 1) \cup (1, \infty). \tag{1.6}$$
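The convergence holds uniformly on every interval $[0, M]$, with $M > 0$. Moreover, with $\gamma$ the index of variation of $H$, the following statements hold:
(i) $\kappa$ is regularly varying of index $\gamma \geq 0$. In particular, $\kappa(t) = t^{\gamma + o(1)}$ as $t \uparrow \infty$.
(ii) There exists a parameter $\rho > 0$ such that, for every $y > 0$,
(a) if $\gamma \neq 1$, then $\widehat{H}(y) = \rho \, \dfrac{y - y^{\gamma}}{1 - \gamma}$, and $\lim_{t \uparrow \infty} \dfrac{H(t)}{\kappa(t)} = \dfrac{\rho}{\gamma - 1}$,
(b) if $\gamma = 1$, then $\widehat{H}(y) = \rho \, y \log y$, and $\lim_{t \uparrow \infty} \dfrac{|H(t)|}{\kappa(t)} = \infty$.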
Proof. See Chapter 3 in [BGT87]. More accurately, using the notation f (t) = H (t)/t and g(t) = κ(t)/t, (i) is shown in [BGT87, Sect. 3.0], see also [BGT87, Theorem 1.4.1]. The uniformity of the convergence follows since the left hand side of (1.6) is convex in y, negative on the interval (0, 1), and continuous in zero. (ii) follows from [BGT87, Lemma 3.2.1]. The implication stated in (ii)(a) follows from [BGT87, Theorems 3.2.6, 3.2.7], and the implication stated in (ii)(b) is shown in [BGT87, Theorem 3.7.4].
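To make Proposition 1.1 concrete, here is a worked instance, offered as a sketch under the assumption that $\xi(0)$ is centred Gaussian with variance $\sigma^2$. This example and the symbol $\sigma$ are introduced here for illustration and do not appear in the paper. Then $H(t) = \sigma^2 t^2 / 2$, so $\gamma = 2$, and one admissible choice of the auxiliary function is the following:

```latex
\begin{align*}
H(ty) - y H(t) &= \tfrac{\sigma^2 t^2}{2} \, \bigl( y^2 - y \bigr), \\
\kappa(t) := \frac{\sigma^2 t^2}{2\rho}
  \;&\Longrightarrow\;
  \frac{H(ty) - y H(t)}{\kappa(t)} = \rho \, (y^2 - y)
  = \rho \, \frac{y - y^{\gamma}}{1 - \gamma} \bigg|_{\gamma = 2}, \\
\frac{H(t)}{\kappa(t)} &= \rho = \frac{\rho}{\gamma - 1} \bigg|_{\gamma = 2},
\qquad
\frac{\kappa(t)}{t} \to \infty \quad (t \uparrow \infty).
\end{align*}
```

Any $\rho > 0$ is admissible in this example, since a change of $\rho$ is compensated by rescaling $\kappa$.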
Note that $\kappa$ is an asymptotic scale function, and $\widehat{H}$ an asymptotic shape function for $H$. While $\gamma \in [0, \infty)$ is unambiguously determined by the potential distribution, the parameter $\rho$ could be absorbed in either $\kappa$ or $\widehat{H}$. The latter option makes it possible to keep track of $\rho$ in the sequel. If $\xi$ is unbounded from above, then $\xi$ and $\xi + C$ have the same pair of $\widehat{H}$ and $\kappa$ for any $C \in \mathbb{R}$. If $\xi$ is replaced by $C\xi$ for some $C > 0$, then the pair $(\widehat{H}, \kappa)$ may be replaced by $(C^{\gamma} \widehat{H}, \kappa)$. In the case $\gamma \neq 1$ one may choose $\kappa(t) = H(t)$ in (1.6), if $\gamma = 1$ one may take $\kappa(t) = H(t) - \int_1^t H(s)/s \, ds$, see [BGT87, Theorem 3.7.3]. The three regimes $0 \leq \gamma < 1$, $\gamma = 1$ and $\gamma > 1$ obviously distinguish three qualitatively different classes of (upper tail behaviour of) potentials. However, in order to appropriately describe the asymptotics of the parabolic Anderson model in the case $\gamma = 1$, a finer distinction is necessary. For this we need an additional mild assumption on the auxiliary function $\kappa$:
Assumption (K). The limit $\kappa^* = \lim_{t \to \infty} \dfrac{\kappa(t)}{t}$ exists as an element of $[0, \infty]$.
Assumption (K) is obviously satisfied in the cases $\gamma \neq 1$ and for potentials bounded from above in the case $\gamma = 1$. Indeed, when $\gamma < 1$, then $\kappa^* = 0$, while when $\gamma > 1$, then $\kappa^* = \infty$ by Proposition 1.1(ii)(a). When $\gamma = 1$ and $H(t)/t \to 0$, then, by Proposition 1.1(ii)(b), $H(t)/\kappa(t) \to \infty$, so that $\kappa(t)/t \to 0$. Hence, Assumption (K) can be a restriction only for potentials unbounded from above in the case $\gamma = 1$.
1.3. The universality classes. In this section, we define and discuss the four universality classes of the parabolic Anderson model under the Assumptions (H) and (K). In particular, we explain the relation between the asymptotics of the parabolic Anderson model and the parameters $\gamma$ and $\kappa^*$ introduced in Assumptions (H) and (K). For the moment, we focus on the large time behaviour of the $p$th moment $\langle U(t)^p \rangle$ for any $p > 0$. In this paper we show that there is a scale function $\alpha : (0, \infty) \to (0, \infty)$ and a number $\chi \in \mathbb{R}$ such that
$$\frac{1}{pt} \log \langle U(t)^p \rangle = \frac{H\bigl(pt \, \alpha(pt)^{-d}\bigr)}{pt \, \alpha(pt)^{-d}} - \frac{1}{\alpha(pt)^2} \bigl( \chi + o(1) \bigr), \qquad \text{as } t \uparrow \infty. \tag{1.7}$$
The scale function $\alpha$ describes how fast the expected total mass, which at time $t = 0$ is localised at the origin, spreads, in the sense that
$$\lim_{R \uparrow \infty} \liminf_{t \uparrow \infty} \frac{\alpha(t)^2}{t} \log \frac{\bigl\langle \sum_{z \in \mathbb{Z}^d} v(t, z) \, \mathbb{1}\{|z| \leq R \, \alpha(t)\} \bigr\rangle}{\bigl\langle \sum_{z \in \mathbb{Z}^d} v(t, z) \bigr\rangle} = 0. \tag{1.8}$$
Moreover, in the three classes where the mass does not concentrate asymptotically in a single point, there exists $R > 0$ such that
$$\liminf_{t \uparrow \infty} \frac{\alpha(t)^2}{t} \log \frac{\bigl\langle \sum_{z \in \mathbb{Z}^d} v(t, z) \, \mathbb{1}\{|z| \leq R \, \alpha(t)\} \bigr\rangle}{\bigl\langle \sum_{z \in \mathbb{Z}^d} v(t, z) \bigr\rangle} < 0. \tag{1.9}$$
In three of the four classes the results (1.7), (1.8) and (1.9) are already contained in the literature, and we only give references; a further class will be the subject of the remainder of the paper. Heuristically, $\alpha(t)$ also determines the size of the intermittent islands for the almost sure behaviour of $U(t)$. The order of their diameter is given as $(\alpha \circ \beta)(t)$, where $\beta(t)$ is the asymptotic inverse of $t \mapsto t/\alpha(t)^2$ evaluated at $d \log t$, cf. Sect. 1.4.2 below. The numbers $\chi$ are naturally given in terms of minimisation problems, where the minimisers correspond to the typical shape of the solution on an intermittent island. A rigorous proof of these heuristic statements, however, is beyond the means of this paper. One expects that $\alpha(t)$ is asymptotically the larger, the thinner the upper tails of $\xi(0)$ are. It will turn out that when $\kappa^* = \infty$, then (1.7) is satisfied with $\alpha(t) = 1$. Therefore, we only need to analyse $\alpha(t)$ in the case when $\kappa^* < \infty$. Analytically, if $\kappa^* < \infty$, then $\alpha(t)$ may be defined by a fixed point equation as follows:
Proposition 1.2 (The scale function $\alpha$). Suppose that Assumptions (H) and (K) are satisfied and $\kappa^* < \infty$. There exists a regularly varying scale function $\alpha : (0, \infty) \to (0, \infty)$, which is unique up to asymptotic equivalence, such that for all sufficiently large $t > 0$,
$$\frac{\kappa\bigl(t \, \alpha(t)^{-d}\bigr)}{t \, \alpha(t)^{-d}} = \frac{1}{\alpha(t)^2}. \tag{1.10}$$
The index of regular variation is $\dfrac{1 - \gamma}{d + 2 - d\gamma}$ and hence $\lim_{t \uparrow \infty} \dfrac{t}{\alpha(t)^d} = \infty$. Moreover,
(i) If $\gamma = 1$ and $0 < \kappa^* < \infty$, then $\lim_{t \uparrow \infty} \alpha(t) = 1/\sqrt{\kappa^*} \in (0, \infty)$.
(ii) If $\gamma = 1$ and $\kappa^* = 0$, or if $\gamma < 1$, then $\lim_{t \uparrow \infty} \alpha(t) = \infty$.
Proof. To see that $\alpha$ is regularly varying and unique up to asymptotic equivalence we note that $f(t) = t \, (\kappa(t)/t)^{-d/2}$ is regularly varying with index at least one. By [BGT87, Theorem 1.5.12], there exists an asymptotically unique inverse $g$ such that $f(g(t)) \sim t$ for $t \uparrow \infty$. This inverse is regularly varying. By definition, $t \mapsto t \, \alpha(t)^{-d}$ satisfies $f(t \, \alpha(t)^{-d}) = t$ and hence $\alpha(t) \sim (t/g(t))^{1/d}$ is regularly varying. The index of regular variation of $\alpha$ is immediate from the defining equation and the fact that $\kappa(t)$ is regularly varying with index $\gamma$. Under the assumptions of (i), for large $t$, the mapping $x \mapsto \kappa(t x^{d/2}) / (t x^{d/2})$ maps a compact interval centred in $\kappa^*$ to itself, and hence the existence of a solution to (1.10) follows from a fixed-point argument. The stated properties of $\alpha(\,\cdot\,)$ follow immediately from the definition.
Under the assumptions of (ii), we look at the problem of finding $s > 0$ such that $\kappa(s)/s = (s/t)^{2/d}$. For any fixed $t$, as we increase $s$ the left hand side goes to zero and the right hand side to infinity. Hence for sufficiently large $t$, there exists a solution $s = s(t)$, which is going to infinity as $t \uparrow \infty$. Then $\alpha(t) = (t/s(t))^{1/d}$ solves (1.10) and converges to infinity.
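The index of regular variation stated in Proposition 1.2 can be read off from (1.10) by comparing exponents. The following is a heuristic sketch only, assuming pure power behaviour $\kappa(s) = s^{\gamma + o(1)}$ and $\alpha(t) = t^{a + o(1)}$; the exponent name $a$ is introduced here for illustration.

```latex
\begin{align*}
s &:= t \, \alpha(t)^{-d} = t^{\,1 - da + o(1)}, \\
\frac{\kappa(s)}{s} = \frac{1}{\alpha(t)^2}
  \;&\Longrightarrow\; (1 - da)(\gamma - 1) = -2a
  \;\Longrightarrow\; a = \frac{1 - \gamma}{d + 2 - d\gamma}, \\
1 - da = \frac{2}{d + 2 - d\gamma} > 0 \quad (\gamma \leq 1)
  \;&\Longrightarrow\; \frac{t}{\alpha(t)^d} \to \infty .
\end{align*}
```

For $\gamma = 1$ this gives $a = 0$, consistent with $\alpha$ being slowly varying in that case.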
Now we introduce the four universality classes, ordered from thick to thin upper tails of $\xi(0)$. Recall the general formula for the asymptotics of the moments $\langle U(t)^p \rangle$ from (1.7). Uniqueness for the variational problems below is to be understood up to spatial translation.
(1) $\gamma > 1$, or $\gamma = 1$ and $\kappa^* = \infty$. This case is included in [GM98], see also [GM90], as the upper boundary case $\rho = \infty$ in their notation. Examples include the Weibull-type distributions, for which $\mathrm{Prob}\{\xi(0) > x\} \approx \exp(-\beta x^a)$ with $a > 1$. Here $\chi = 2d$, the scale function $\alpha(t) = 1$ is constant, and the first term on the right-hand side in (1.7) dominates the sum, which diverges to infinity. The asymptotics in (1.8) can be strengthened to
$$\lim_{t \uparrow \infty} \frac{1}{t} \log \frac{\langle v(t, 0) \rangle}{\bigl\langle \sum_{z \in \mathbb{Z}^d} v(t, z) \bigr\rangle} = 0,$$
i.e. the expected total mass remains essentially in the origin and the intermittent islands are single sites, a phenomenon of complete localisation. We call this the single-peak case.
(2) $\gamma = 1$ and $\kappa^* \in (0, \infty)$. This case, the double-exponential case, is the main objective of [GM98]. The prime example is the double exponential distribution with parameter $\rho \in (0, \infty)$,
$$\mathrm{Prob}\{\xi(0) > r\} = \exp\{-e^{r/\rho}\},$$
which implies $H(t) = \rho t \log(\rho t) - \rho t + o(t)$. Here $\alpha(t) \to 1/\sqrt{\kappa^*} \in (0, \infty)$, so that the size of the intermittent islands is constant in time. The first term on the right hand side in (1.7) dominates the sum, which goes to infinity. Moreover,
$$\chi = \min_{\substack{g : \mathbb{Z}^d \to \mathbb{R} \\ \|g\|_2 = 1}} \Bigl\{ \frac{1}{2} \sum_{\substack{x, y \in \mathbb{Z}^d \\ x \sim y}} \bigl( g(x) - g(y) \bigr)^2 - \rho \sum_{x \in \mathbb{Z}^d} g^2(x) \log g^2(x) \Bigr\}, \tag{1.11}$$
where we write $x \sim y$ if $x$ and $y$ are neighbours. This variational problem is difficult to analyse. It has a solution, which is unique for sufficiently large values of $\rho$, and heuristically this minimizer represents the shape of the solution. As noted in [GH99], for any family of minimizers $g_\rho$, as $\rho \uparrow \infty$, $g_\rho$ converges to $\delta_0$, which links to the single-peak case. Furthermore, as $\rho \downarrow 0$, the minimisers $g_\rho$ are asymptotically given by
$$g_\rho^2(x/\sqrt{\rho}) = (1 + o(1)) \, e^{-|x|^2} \pi^{-d/2},$$
uniformly on compacts and in $L^1(\mathbb{R}^d)$. Consequently,
$$\chi = \rho d \Bigl( 1 - \frac{1}{2} \log \frac{\rho}{\pi} + o(1) \Bigr) \qquad \text{as } \rho \downarrow 0.$$
(3) $\gamma = 1$ and $\kappa^* = 0$. Potentials in this class are called almost bounded in [GM98] and may be seen as the degenerate case for $\rho = 0$ in their notation. This class contains both bounded and unbounded potentials, and is analysed for the first time in the present paper. The scale function $\alpha(t)$ and hence the diameter of the intermittent islands goes to infinity and is slowly varying, in particular it is slower than any power of $t$. The first term on the right-hand side in (1.7) dominates the sum, which may go to infinity or zero. Moreover,
$$\chi = \min_{\substack{g \in H^1(\mathbb{R}^d) \\ \|g\|_2 = 1}} \Bigl\{ \int_{\mathbb{R}^d} |\nabla g(x)|^2 \, dx - \rho \int_{\mathbb{R}^d} g^2(x) \log g^2(x) \, dx \Bigr\}, \tag{1.12}$$
see Theorem 1.4. This variational formula is obviously the continuous variant of (1.11), and it is much easier to solve. There is a unique minimiser, given by
$$g_\rho(x) = \Bigl( \frac{\rho}{\pi} \Bigr)^{d/4} \exp\Bigl( -\frac{\rho}{2} |x|^2 \Bigr),$$
representing the rescaled shape of the solution on an intermittent island. In particular, $\chi = \rho d \bigl( 1 - \frac{1}{2} \log \frac{\rho}{\pi} \bigr)$, which is the asymptotics of (1.11) as $\rho \downarrow 0$. Hence, on the level of variational problems, (3) is the boundary case of (2) for $\rho \downarrow 0$.
(4) $\gamma < 1$. This is the case of potentials bounded from above, which is treated in [BK01]. Indeed, in [BK01], it is assumed that there exists a non-decreasing function $\alpha(t)$ and a nonpositive function $\widetilde{H} : (0, \infty) \to (-\infty, 0]$ such that
$$\lim_{t \uparrow \infty} \frac{\alpha(t)^{d+2}}{t} \, H\Bigl( \frac{t}{\alpha(t)^d} \, y \Bigr) = \widetilde{H}(y),$$
uniformly on compact sets in $(0, \infty)$. It is easy to infer from the results of Sect. 1.2 that this assumption is equivalent to Assumption (H) with index $\gamma < 1$ (recall that in this case Assumption (K) is redundant), for $\alpha$ defined by (1.10) and
$$\widetilde{H}(y) = \rho \, \frac{y^{\gamma}}{\gamma - 1}.$$
Here $\alpha(t) \to \infty$, as $t \mapsto \alpha(t)$ is regularly varying with index $\frac{1 - \gamma}{d + 2 - d\gamma}$. The potential $\xi$ is necessarily bounded from above. In this case, the two terms on the right hand side in (1.7) are of the same order, and (1.7) converges to zero. Moreover,
$$\chi = \inf_{\substack{g \in H^1(\mathbb{R}^d) \\ \|g\|_2 = 1}} \Bigl\{ \int_{\mathbb{R}^d} |\nabla g(x)|^2 \, dx - \rho \int_{\mathbb{R}^d} \frac{g^{2\gamma}(x) - g^2(x)}{\gamma - 1} \, dx \Bigr\}. \tag{1.13}$$
In the lower boundary case where $\gamma = 0$, the functional $\int g^{2\gamma}$ must be replaced by the Lebesgue measure of $\mathrm{supp}(g)$. In this case the formula is well-known and well-understood. In particular, the minimizer exists, is unique up to spatial shifts, and has compact support. To the best of our knowledge, for $\gamma \in (0, 1)$, the formula in (1.13) has not been analysed explicitly, unless in $d = 1$. In Proposition 1.16 below, we show that (1.13) converges to (1.12), as would follow from interchanging the limit $\gamma \uparrow 1$ with the infimum on $g$. This means that, on the level of variational formulas, (3) is the boundary case of (4) for $\gamma \uparrow 1$.
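The value of $\chi$ in class (3) can be checked by inserting the Gaussian minimiser into the functional of (1.12). This is a sketch of the arithmetic only; it uses that $g_\rho^2$ is a centred Gaussian density with variance $1/(2\rho)$ in each coordinate, so that $\int |x|^2 g_\rho^2 \, dx = d/(2\rho)$, and it does not address why the Gaussian is the minimiser.

```latex
\begin{align*}
g_\rho^2(x) &= \Bigl( \frac{\rho}{\pi} \Bigr)^{d/2} e^{-\rho |x|^2},
\qquad \int_{\mathbb{R}^d} g_\rho^2 \, dx = 1, \\
\int_{\mathbb{R}^d} |\nabla g_\rho|^2 \, dx
  &= \rho^2 \int_{\mathbb{R}^d} |x|^2 g_\rho^2 \, dx = \frac{\rho d}{2}, \\
\int_{\mathbb{R}^d} g_\rho^2 \log g_\rho^2 \, dx
  &= \frac{d}{2} \log \frac{\rho}{\pi} - \rho \cdot \frac{d}{2\rho}
  = \frac{d}{2} \log \frac{\rho}{\pi} - \frac{d}{2}, \\
\chi &= \frac{\rho d}{2} - \rho \Bigl( \frac{d}{2} \log \frac{\rho}{\pi} - \frac{d}{2} \Bigr)
  = \rho d \Bigl( 1 - \frac{1}{2} \log \frac{\rho}{\pi} \Bigr).
\end{align*}
```

The same limit structure explains the link to (1.13): for fixed $g$, $(g^{2\gamma} - g^2)/(\gamma - 1) \to g^2 \log g^2$ pointwise as $\gamma \uparrow 1$.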
Remark 1.3. The variational problems in (1.11), (1.12), and (1.13) encode the asymptotic shape of the rescaled and normalised solution $v(t, \,\cdot\,)$ in the centred ball with radius of order $\alpha(t)$. Informally, the main contribution to $\langle U(t) \rangle$ comes from the events that
$$\frac{v\bigl(t, \,\cdot\, \alpha(t)\bigr)}{\bigl\| v\bigl(t, \,\cdot\, \alpha(t)\bigr) \bigr\|_2} \approx g,$$
where $g$ is a minimiser in the definition of $\chi$. To the best of our knowledge this heuristics has not been made rigorous in any nontrivial case so far. Note that in case (1), formally, (1.11) holds with $\rho = \infty$ and hence the optimal $g$ is $\mathbb{1}_0$.
Since the cases (1), (2) and (4) have been studied in the literature [BK01, GM98], the possible scaling picture of the parabolic Anderson model under the Assumptions (H) and (K) is complete once the case (3) is resolved. This is the content of the remainder of this paper.
1.4. Long time tails in the almost bounded case. In this section we present our results on the almost bounded case (3). In other words, we assume that $\kappa(t)/t$ is slowly varying and converges to zero.
1.4.1. Moment asymptotics. Our main result on the annealed asymptotics of $U(t)$ gives the first two terms in the asymptotics of $\langle U(t)^p \rangle$ for any $p > 0$, as $t \uparrow \infty$. This is a substantial improvement over the result for the almost bounded case contained in [GM98, Theorem 1.2], which just states that $\log \langle U(t)^p \rangle = o(t)$ for $p \in \mathbb{N}$.
Theorem 1.4 (Moment asymptotics). Suppose Assumptions (H) and (K) hold, and assume that we are in case (3), i.e., $\gamma = 1$ and $\kappa^* = 0$. Let $\rho > 0$ be as in Proposition 1.1(ii)(b). Then, for any $p \in (0, \infty)$,
$$\frac{1}{pt} \log \langle U(t)^p \rangle = \frac{H\bigl(pt \, \alpha(pt)^{-d}\bigr)}{pt \, \alpha(pt)^{-d}} - \frac{1}{\alpha(pt)^2} \Bigl( \rho d \bigl( 1 - \tfrac{1}{2} \log \tfrac{\rho}{\pi} \bigr) + o(1) \Bigr), \qquad \text{as } t \uparrow \infty. \tag{1.14}$$
Remark 1.5 (The constant). Recall from (1.7) and (1.12) that the constant $\rho d (1 - \frac{1}{2} \log \frac{\rho}{\pi})$ arises as a variational problem; see Sect. 1.6. The variational problem plays an essential role in the proof.
Remark 1.6 (Intermittency). Note from (1.10) that the first term in (1.14) is of higher order than the second term. Formula (1.14), together with the results of Proposition 1.1 and the fact that $\alpha(\,\cdot\,)$ is slowly varying, imply that
$$\log \frac{\langle U(t)^p \rangle^{1/p}}{\langle U(t)^q \rangle^{1/q}} = \frac{H\bigl(pt \, \alpha(pt)^{-d}\bigr)}{p \, \alpha(pt)^{-d}} - \frac{H\bigl(qt \, \alpha(qt)^{-d}\bigr)}{q \, \alpha(qt)^{-d}} + o\bigl( t/\alpha(t)^2 \bigr) = \frac{t}{\alpha(t)^2} \Bigl( \frac{q}{p} \, \widehat{H}\Bigl( \frac{p}{q} \Bigr) + o(1) \Bigr) \tag{1.15}$$
for $p, q \in (0, \infty)$. In particular, we have intermittency in the sense of (1.4), and the convergence is exponential on the scale $t/\alpha(t)^2$.
Remark 1.7 (Interpretation). The fact that the minimisers of the variational problem (1.12) are given by Gaussian densities can be interpreted in the sense that the solution u(t, x) is asymptotically a heat flow running in the ‘slow motion’ scale α(t). Observe that this heat flow is the solution of (1.1) if the potential ξ is replaced by a certain parabola in the same scale. This parabola is the optimal potential in the sense of Remark 1.13 below. ◇
In spite of the simplicity of the variational formula (1.12), the derivation of (1.14) is technically rather involved and requires a number of demanding tools. We use both representations of U(t) available to us: an approximative representation in terms of an eigenfunction expansion, and the Feynman-Kac formula involving the simple random walk. The heart of the proof is an application of a large deviation principle for the rescaled local times of simple random walk. However, there are three major obstacles to be removed, which require a variety of novel techniques. The first one is a compactification argument for the space, which is based on an estimate for Dirichlet eigenvalues in large boxes against maximal Dirichlet eigenvalues in small subboxes. This is an adaptation of a method from [BK01]. The second technique is a cutting argument for the large potential values, which we trace back to a large deviations estimate for the self-intersection number of the simple random walk. This is of independent interest and is carried out in Sect. 2. Finally, the third obstacle, which appears in the proof of the upper bound, is the lack of upper semi-continuity of the map $f \mapsto \int f(x)\log f(x)\,dx$ in the topology of the large deviation principle, even after compactification and removal of large values. Therefore, in the proof of the upper bound we replace the classical large deviation principle by a new approach, taken from [BHK05], which identifies and estimates the joint density of the family of the random walk local times.
See Proposition 3.3 below. An alternative heuristic derivation of formula (1.14) is given in Sect. 1.5. The proof of Theorem 1.4 is given in Sect. 2 and 3.
1.4.2. Almost-sure asymptotics. We define another scale function β such that
\[ \frac{\beta(t)}{\alpha(\beta(t))^2} \sim d\log t. \tag{1.16} \]
In other words, β(t) is the asymptotic inverse of $t \mapsto t/\alpha(t)^2$ evaluated at d log t, which by [BGT87, Theorem 1.5.12] exists and is slowly varying. In order to avoid technical inconveniences, we assume that the field ξ is bounded from below. See Remark 1.11 for comments on this issue.
Theorem 1.8 (Almost sure asymptotics). Suppose Assumptions (H) and (K) hold, and assume that we are in case (3), i.e., γ = 1 and $\kappa^* = 0$. Furthermore, suppose that β is defined by (1.16) and that essinf ξ(0) > −∞. Let ρ > 0 be as in Proposition 1.1. Then, almost surely,
\[ \frac{1}{t}\log U(t) = \frac{H\big(\beta(t)\,\alpha(\beta(t))^{-d}\big)}{\beta(t)\,\alpha(\beta(t))^{-d}} - \frac{1}{\alpha(\beta(t))^2}\Big(\rho\big(d - \tfrac{d}{2}\log\tfrac{\rho}{\pi} + \log\tfrac{\rho}{e}\big) + o(1)\Big), \qquad\text{as } t\uparrow\infty. \tag{1.17} \]
Remark 1.9 (The constant). In Sect. 1.6, we will see that also the constant $\rho(d - \frac{d}{2}\log\frac{\rho}{\pi} + \log\frac{\rho}{e})$ arises as a variational problem. A remarkable fact is that the first two leading contributions to U(t) are deterministic. ◇
Remark 1.10 (Interpretation). Heuristically, α(β(t)) is the order of the diameter of the intermittent islands, which almost surely carry most of the mass of U(t). Note that $\beta(t) = (\log t)^{1+o(1)}$ and $\alpha(\beta(t)) = (\log t)^{o(1)}$, i.e., the size of the intermittent islands increases extremely slowly. The crucial point in the proof of Theorem 1.8 is to show the existence of an island with radius of order α(β(t)) within the box $[-t, t]^d$ on which the shape of the vertically shifted and rescaled potential is optimal, i.e., resembles a certain parabola. To prove this, we use the first moment asymptotics at time β(t) locally on that island. The exponential rate, which is $\beta(t)/\alpha(\beta(t))^2$, has to be balanced against the number of possible islands, which has exponential rate d log t, cf. (1.16). ◇
Remark 1.11 (Lower tails of the potential). The assertion of Theorem 1.8 remains true mutatis mutandis if the assumption essinf ξ(0) > −∞ is replaced, in d ≥ 2, by the assumption that Prob{ξ(0) > −∞} exceeds the critical nearest-neighbour site percolation threshold. This ensures the existence of an infinite component in the set $C = \{z \in \mathbb{Z}^d : \xi(z) > -\infty\}$, and thus (1.17) holds conditional on the event that the origin belongs to the infinite cluster in C. In d = 1, an infinite cluster exists if and only if Prob{ξ(0) > −∞} = 1. If we assume that ξ(0) > −∞ almost surely and $\langle\log(-\xi(0) \vee 1)\rangle < \infty$, (1.17) is true verbatim, while otherwise the rate of the almost sure asymptotics depends on the lower tails of ξ(0); see [BK01a] for details. The effect of the assumption is to ensure sufficient connectivity in the sense that the mass flow from the origin to regions where the random potential assumes high values and an approximately optimal shape is not hampered by deep valleys on the way. We decided to detail the proof of the almost sure asymptotics under the stronger assumption that essinf ξ(0) > −∞. See [BK01, Sect. 5.2] for the proof of the analogous assertion in the bounded-potential case under the weaker assumptions. The arguments given there can be extended with some effort to the situation of the present paper. ◇
The proof of Theorem 1.8 is given in Sect. 4. It essentially follows the strategy of [BK01].
1.4.3. Examples. We now explain what kind of upper tail behaviour is covered by the almost bounded case, arguing separately for the bounded and unbounded case, denoted by (B) and (U), respectively. Suppose the distribution of the field ξ(0) satisfies
\[ \log \mathrm{Prob}\{\xi(0) > r\} \sim -e^{f(r)}, \qquad\text{as } \begin{cases} r\uparrow\infty & \text{in case (U)},\\ r\uparrow 0 = \operatorname{esssup}\xi(0) & \text{in case (B)}. \end{cases} \tag{1.18} \]
Here f is a positive, strictly increasing smooth function satisfying $f'(r)\uparrow\infty$ as r ↑ ∞ in case (U) and $f'(r)\,r\uparrow\infty$ as r ↑ 0 in case (B). Note that typical representatives of case (2) of the four universality classes are $f(r)\approx cr$ as r ↑ ∞, violating the condition in case (U); and typical representatives of case (4) of the four universality classes are $f(r)\approx -\frac{\gamma}{1-\gamma}\log|r|$ as r ↑ 0, violating the condition in case (B). The cumulant generating function behaves like
\[ H(t) \approx \log\int e^{tr}\exp\{-e^{f(r)}\}\,dr \approx \sup_r\big[tr - e^{f(r)}\big] = t\,r(t) - e^{f(r(t))}, \tag{1.19} \]
where r(t) is asymptotically, as t ↑ ∞, defined via $t = f'(r(t))\,e^{f(r(t))}$. Note that r(t) ↑ ∞ in case (U), while r(t) ↑ 0 in case (B), as t ↑ ∞. Hence, $f'(r(t))\uparrow\infty$ in case (U), while $f'(r(t))\,r(t)\uparrow\infty$ in case (B). Rewriting the definition of r(t) as
\[ e^{f(r(t))} = \frac{t\,r(t)}{f'(r(t))\,r(t)} = o\big(t\,r(t)\big), \]
we thus obtain that the first term on the right hand side of (1.19) dominates the second term. Therefore, we can approximate H(t)/t ≈ r(t), as t ↑ ∞. We next assume that $f'(r(\cdot))$ is slowly varying at infinity. We then see that, using the fact that $r(t) = f^{-1}\big(\log\frac{t}{f'(r(t))}\big)$ in the last equality,
\[ \begin{aligned} H(ty) - yH(t) &\approx ty\Big[f^{-1}\Big(\log\frac{ty}{f'(r(ty))}\Big) - f^{-1}\Big(\log\frac{t}{f'(r(t))}\Big)\Big] \\ &\approx ty\Big[f^{-1}\Big(\log\frac{t}{f'(r(t))} + \log y\Big) - f^{-1}\Big(\log\frac{t}{f'(r(t))}\Big)\Big] \\ &\approx t\,(y\log y)\,(f^{-1})'\Big(\log\frac{t}{f'(r(t))}\Big) = (y\log y)\,\frac{t}{f'(r(t))}. \end{aligned} \]
Using Proposition 1.1, this means that the scaling relation in (1.6) is satisfied with $\kappa(t) = t/f'(r(t))$ and ρ = 1. As $f'(r(t))\uparrow\infty$ is slowly varying, we see that we are in case (3) of the four universality classes.
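The scaling relation just derived can be checked numerically on a concrete example. The following is a minimal sketch, assuming the illustrative choice f(r) = r² in case (U) (so f′(r) = 2r, which is not taken from the text) and taking the approximation H(t) = t r(t) − e^{f(r(t))} from (1.19) at face value. The function names are hypothetical. The ratio of H(ty) − yH(t) to (y log y) t/f′(r(t)) should be close to one for large t, with slow convergence since f′(r(t)) grows only like the square root of log t.

```python
import math


def r_of_log_t(log_t):
    """Solve t = f'(r) * exp(f(r)) for r, with the assumed f(r) = r**2.

    Taking logarithms, the equation reads r**2 + log(2 r) = log t,
    which is solved here by Newton's method.
    """
    r = math.sqrt(log_t)  # starting guess that ignores the log(2 r) correction
    for _ in range(100):
        g = r * r + math.log(2.0 * r) - log_t
        dg = 2.0 * r + 1.0 / r
        r -= g / dg
    return r


def scaling_ratio(log_t, y):
    """Return [H(ty) - y H(t)] / [(y log y) t / f'(r(t))].

    Uses H(s) = s r(s) - exp(f(r(s))) and exp(f(r(s))) = s / f'(r(s)).
    Every quantity is divided by t so that no huge numbers appear.
    """
    r_t = r_of_log_t(log_t)
    r_ty = r_of_log_t(log_t + math.log(y))
    h_t_over_t = r_t - 1.0 / (2.0 * r_t)               # H(t) / t
    h_ty_over_t = y * (r_ty - 1.0 / (2.0 * r_ty))       # H(ty) / t
    predicted_over_t = y * math.log(y) / (2.0 * r_t)    # (y log y) / f'(r(t))
    return (h_ty_over_t - y * h_t_over_t) / predicted_over_t


if __name__ == "__main__":
    for log_t in (20.0, 50.0 * math.log(10.0)):
        print(log_t, scaling_ratio(log_t, 2.0))
```

Working with log t rather than t keeps the computation stable for very large t, which is the regime the heuristic addresses.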
1.5. Heuristic derivation of Theorem 1.4. In this section, we give a heuristic explanation of Theorem 1.4 in terms of large deviations for the scaled potential ξ. Our proof of Theorem 1.4 follows a different strategy. We use the setup and notation of Sect. 1.4.3 and handle the cases (B) and (U) simultaneously. Consequently, the definition (1.10) of α(t) reads
\[ \alpha(t)^2 = \frac{t\alpha(t)^{-d}}{\kappa(t\alpha(t)^{-d})} = f'\big(r(t\alpha(t)^{-d})\big). \tag{1.20} \]
We introduce the shifted, scaled potential
\[ \overline{\xi}_t(x) := \alpha(t)^2\Big[\xi\big(\lfloor x\alpha(t)\rfloor\big) - \frac{H(t\alpha(t)^{-d})}{t\alpha(t)^{-d}}\Big] \approx \alpha(t)^2\Big[\xi\big(\lfloor x\alpha(t)\rfloor\big) - r(t\alpha(t)^{-d}) + \frac{\alpha(t)^d}{t}\,e^{f(r(t\alpha(t)^{-d}))}\Big], \tag{1.21} \]
for $x \in Q_R = [-R, R]^d$. The process $\overline{\xi}_t$ satisfies a large deviation principle, for every R > 0, on the cube $Q_R$ with rate $t\alpha(t)^{-2}$ and rate function $\varphi \mapsto \int_{Q_R} e^{\varphi(x)-1}\,dx$. Indeed, with $B_R = [-R, R]^d \cap \mathbb{Z}^d$,
\[ \begin{aligned} \mathrm{Prob}\big(\overline{\xi}_t \approx \varphi \text{ on } Q_R\big) &\approx \prod_{z\in B_{R\alpha(t)}} \mathrm{Prob}\Big(\xi(0) \approx \frac{\varphi(z\alpha(t)^{-1})}{\alpha(t)^2} + r(t\alpha(t)^{-d}) - \frac{\alpha(t)^d}{t}\,e^{f(r(t\alpha(t)^{-d}))}\Big) \\ &\approx \prod_{z\in B_{R\alpha(t)}} \exp\Big\{-\exp\Big[f\Big(r(t\alpha(t)^{-d}) + \frac{\varphi(z\alpha(t)^{-1})}{\alpha(t)^2} - \frac{\alpha(t)^d}{t}\,e^{f(r(t\alpha(t)^{-d}))}\Big)\Big]\Big\}. \end{aligned} \]
By a Taylor expansion around $r(t\alpha(t)^{-d})$, using that $s = f'(r(s))\,e^{f(r(s))}$ for $s = t\alpha(t)^{-d}$ as well as (1.20), we can continue with
\[ \begin{aligned} \mathrm{Prob}\big(\overline{\xi}_t \approx \varphi \text{ on } Q_R\big) &\approx \exp\Big\{-\alpha(t)^d \int_{Q_R} \exp\Big[f\big(r(t\alpha(t)^{-d})\big) + f'\big(r(t\alpha(t)^{-d})\big)\frac{\varphi(x)}{\alpha(t)^2} - 1\Big]\,dx\Big\} \\ &= \exp\Big\{-\frac{t}{f'(r(t\alpha(t)^{-d}))}\int_{Q_R} e^{\varphi(x)-1}\,dx\Big\} \approx \exp\Big\{-\frac{t}{\alpha(t)^2}\int_{Q_R} e^{\varphi(x)-1}\,dx\Big\}. \end{aligned} \]
The asymptotics of $\langle U(t)^p\rangle$ can now be explained as follows. Note that U(t) = u(t, 0), where u(t, · ) is the solution of the parabolic Anderson model (1.1) with initial condition u(0, · ) = 1. We can approximate u(t, 0) by $w_t(t, 0)$, where $(s, z) \mapsto w_t(s, z)$ is the solution to the initial boundary value problem (1.1) with zero boundary condition outside the box $B_t$ and initial condition $w_t(0, \cdot) = \mathbb{1}_{B_t}$. Let $\lambda^d_t(\xi)$ denote the principal eigenvalue of $\Delta^d + \xi$ in $\ell^2(B_t)$ with zero boundary condition. Then an eigenfunction expansion shows that
\[ \langle U(t)^p\rangle = \langle u(t,0)^p\rangle \approx \langle w_t(t,0)^p\rangle \approx \big\langle e^{pt\lambda^d_t(\xi)}\big\rangle. \]
This already explains why the asymptotics of the p-th moments of U(t) are the same as the asymptotics of the moments of U(pt). We proceed by taking p = 1. Now the shift invariance and the asymptotic scaling properties of the discrete Laplace operator yield that
\[ \lambda^d_t(\xi) = \frac{H(t\alpha(t)^{-d})}{t\alpha(t)^{-d}} + \lambda^d_t\big(\alpha(t)^{-2}\,\overline{\xi}_t(\cdot\,\alpha(t)^{-1})\big) \approx \frac{H(t\alpha(t)^{-d})}{t\alpha(t)^{-d}} + \alpha(t)^{-2}\lambda(\overline{\xi}_t), \]
where λ(ψ) denotes the principal eigenvalue of Δ + ψ in $L^2(Q_{t\alpha(t)^{-d}})$, with zero boundary condition. Hence,
\[ \langle U(t)\rangle \approx e^{H(t\alpha(t)^{-d})\,\alpha(t)^d}\,\Big\langle \exp\Big\{\frac{t}{\alpha(t)^2}\lambda(\overline{\xi}_t)\Big\}\Big\rangle. \tag{1.22} \]
Using the large deviation principle for $\overline{\xi}_t$ with $R = t\alpha(t)^{-d}$, and anticipating that ψ ↦ λ(ψ) has the appropriate continuity and boundedness properties, we may use Varadhan’s lemma to deduce that
\[ \frac{1}{t}\log\langle U(t)\rangle \approx \frac{H(t\alpha(t)^{-d})}{t\alpha(t)^{-d}} - \frac{1}{\alpha(t)^2}\,\chi, \]
where χ is given by
\[ \chi = \inf_{\psi}\Big[\int_{\mathbb{R}^d} e^{\psi(x)-1}\,dx - \lambda(\psi)\Big]. \tag{1.23} \]
We show in Sect. 1.6 that χ is equal to $\rho d(1 - \frac12\log\frac{\rho}{\pi})$. This completes the heuristic derivation of Theorem 1.4. The interpretation of the above heuristics is that the moments of the total mass U(t) are mainly governed by potentials ξ whose shape is approximately given as
\[ \xi(\cdot) \approx \frac{H(t\alpha(t)^{-d})}{t\alpha(t)^{-d}} + \alpha(t)^{-2}\,\psi\big(\cdot\,\alpha(t)^{-1}\big), \]
where ψ is a minimiser of the formula in (1.23).
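1.6. Variational representations of the constants in Theorem 1.4 and 1.8.
1.6.1. The constant in Theorem 1.4. Fix ρ > 0 and define χ(ρ) ∈ R by
\[ \chi(\rho) = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}} \big\{\|\nabla g\|_2^2 - \mathcal{H}(g^2)\big\}, \tag{1.24} \]
where $H^1(\mathbb{R}^d)$ is the usual Sobolev space, ∇ the usual (distributional) gradient, and
\[ \mathcal{H}(g^2) = \rho\int_{\mathbb{R}^d} g^2(x)\log g^2(x)\,dx. \tag{1.25} \]
By the logarithmic Sobolev inequality in (1.30) below, $\mathcal{H}(g^2) \in [-\infty, \infty)$ is well-defined for $g \in H^1(\mathbb{R}^d)$. Furthermore, we introduce the Legendre transform of $\mathcal{H}$ on $L^2(\mathbb{R}^d)$ and the top of the spectrum of the operator Δ + ψ in $H^1(\mathbb{R}^d)$,
\[ \mathcal{L}(\psi) = \sup_{g\in L^2(\mathbb{R}^d)} \big[\langle g^2, \psi\rangle - \mathcal{H}(g^2)\big] \qquad\text{and}\qquad \lambda(\psi) = \sup_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}} \big[\langle\psi, g^2\rangle - \|\nabla g\|_2^2\big]. \tag{1.26} \]
Introduce the functions
\[ g_\rho(x) = \Big(\frac{\rho}{\pi}\Big)^{d/4} e^{-\frac{\rho}{2}|x|^2} \qquad\text{and}\qquad \psi_\rho(x) = \rho + \rho\,\frac{d}{2}\log\frac{\rho}{\pi} - \rho^2|x|^2, \qquad\text{for } x\in\mathbb{R}^d. \tag{1.27} \]
Note that the Gaussian density $g_\rho$ is the unique $L^2$-normalized positive eigenfunction of the operator $\Delta + \psi_\rho$ in $H^1(\mathbb{R}^d)$ with eigenvalue $\lambda(\psi_\rho) = \rho - \rho d + \rho\frac{d}{2}\log\frac{\rho}{\pi}$. It satisfies $\mathcal{L}(\psi_\rho) = \rho$.
Proposition 1.12 (Solution of the variational formula in (1.24)). For any ρ ∈ (0, ∞), the infimum in (1.24) is, up to horizontal shift, uniquely attained at $g_\rho$. In particular, $\chi(\rho) = \rho d\big(1 - \frac12\log\frac{\rho}{\pi}\big)$ is the constant appearing in Theorem 1.4. Moreover, $\mathcal{L}$ is identified as
\[ \mathcal{L}(\psi) = \frac{\rho}{e}\int_{\mathbb{R}^d} e^{\frac{1}{\rho}\psi(x)}\,dx, \tag{1.28} \]
and the ‘dual’ representation is
\[ \chi(\rho) = \inf_{\substack{\psi\in C(\mathbb{R}^d)\\ \mathcal{L}(\psi)<\infty}} \big[\mathcal{L}(\psi) - \lambda(\psi)\big], \tag{1.29} \]
where $C(\mathbb{R}^d)$ is the set of continuous functions $\mathbb{R}^d \to \mathbb{R}$. Up to horizontal shift, the infimum in (1.29) is uniquely attained at the parabola $\psi_\rho$ in (1.27).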
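The value of the functional in (1.24) at the Gaussian of (1.27) can be confirmed numerically. The sketch below is only a sanity check in dimension d = 1, not part of the argument. It evaluates the squared gradient norm minus ρ times the entropy integral by a midpoint rule and compares the result with the closed form ρd(1 − ½ log(ρ/π)). The function names, the truncation window and the grid size are illustrative assumptions.

```python
import math


def chi_closed_form(rho, d=1):
    """Closed form rho * d * (1 - 0.5 * log(rho / pi)) from Proposition 1.12."""
    return rho * d * (1.0 - 0.5 * math.log(rho / math.pi))


def gaussian_functional_1d(rho, half_width=12.0, n=20_000):
    """Midpoint-rule value of ||g'||_2^2 - rho * int g^2 log g^2 at g = g_rho, d = 1.

    Returns the pair (functional value, total mass of g^2); the mass should be 1.
    """
    scale = half_width / math.sqrt(rho)    # integrate over [-scale, scale]
    h = 2.0 * scale / n
    norm = math.sqrt(rho / math.pi)         # g_rho(x)^2 = norm * exp(-rho x^2)
    grad_sq = entropy = mass = 0.0
    for i in range(n):
        x = -scale + (i + 0.5) * h
        g2 = norm * math.exp(-rho * x * x)
        grad_sq += (rho * x) ** 2 * g2 * h  # |g'(x)|^2 = rho^2 x^2 g(x)^2
        entropy += g2 * math.log(g2) * h
        mass += g2 * h
    return grad_sq - rho * entropy, mass


if __name__ == "__main__":
    for rho in (0.5, 1.0, 3.0):
        value, mass = gaussian_functional_1d(rho)
        print(rho, value, chi_closed_form(rho), mass)
```

The check only evaluates the functional at the claimed minimiser; that no other g does better is exactly the content of the logarithmic Sobolev inequality.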
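Proof. By the logarithmic Sobolev inequality in the form of [LL01, Th. 8.14] with $a = \sqrt{\pi/\rho}$, we have
\[ \|\nabla g\|_2^2 \ge \rho\int_{\mathbb{R}^d} g^2(x)\log g^2(x)\,dx + \rho d\big(1 - \tfrac12\log\tfrac{\rho}{\pi}\big), \tag{1.30} \]
with equality exactly for the Gaussian density $g_\rho$ and its horizontal shifts. This proves the first statement. In order to see that (1.28) holds, use Jensen’s inequality for any $g \in L^2(\mathbb{R}^d)$ to obtain
\[ \langle g^2, \psi\rangle - \mathcal{H}(g^2) = \rho\|g\|_2^2\int \frac{g^2}{\|g\|_2^2}\,\log\frac{e^{\frac1\rho\psi}}{g^2} \le \rho\|g\|_2^2\,\log\frac{\int e^{\frac1\rho\psi}}{\|g\|_2^2}. \tag{1.31} \]
Equality holds if and only if $g^2 = Ce^{\frac1\rho\psi}$ for some C > 0. The right side of (1.31) is maximal precisely for $\|g\|_2^2 = \frac1e\int e^{\frac1\rho\psi}$. Substituting this value, we arrive at (1.28). To see the last two statements, we use (1.28) and the formula in (1.26) for λ(ψ) to obtain, for any $\psi \in C(\mathbb{R}^d)$,
\[ \mathcal{L}(\psi) - \lambda(\psi) = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}} \Big\{\|\nabla g\|_2^2 - \mathcal{H}(g^2) - \rho\Big\langle g^2, \Big[\frac{\psi}{\rho} - \log g^2 - e^{\frac1\rho\psi - \log g^2 - 1}\Big]\Big\rangle\Big\}. \tag{1.32} \]
The term in square brackets is equal to $\theta - e^{\theta-1}$ for $\theta = \frac{\psi}{\rho} - \log g^2$. Since this is nonpositive and is zero only for θ = 1, we have that ‘≤’ holds in (1.29). Furthermore, by restricting the infimum over g to strictly positive continuous functions and interchanging the order of the infima, we see that
\[ \begin{aligned} \inf_{\psi\in C(\mathbb{R}^d)} \big[\mathcal{L}(\psi) - \lambda(\psi)\big] &\le \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1,\ g>0}}\ \inf_{\psi\in C(\mathbb{R}^d)} \Big\{\|\nabla g\|_2^2 - \mathcal{H}(g^2) - \rho\Big\langle g^2, \Big[\frac{\psi}{\rho} - \log g^2 - e^{\frac1\rho\psi - \log g^2 - 1}\Big]\Big\rangle\Big\} \\ &\le \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1,\ g>0}} \big\{\|\nabla g\|_2^2 - \mathcal{H}(g^2)\big\} = \chi(\rho), \end{aligned} \]
by substituting $\psi = \rho + \rho\log g^2$, and we use that the minimizer g of the right hand side is strictly positive. Therefore, equality holds in (1.29). We also know that, by uniqueness of the solution in (1.24), the unique minimizer in (1.29) is $\psi = \rho + \rho\log g_\rho^2 = \psi_\rho$.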
Remark 1.13 (Interpretation). Both representations (1.24) and (1.29) may be interpreted in terms of optimal rescaled profiles for the moment asymptotics of the total mass U(t). While the minimizer $\psi_\rho$ in (1.29) describes the shape of the potential ξ (see Sect. 1.5), the minimizer $g_\rho$ in (1.24) describes the solution u(t, ·), cf. Remark 1.3. ◇
1.6.2. The constant in Theorem 1.8. We now turn to the variational representation of the constant appearing in Theorem 1.8. We define $\widetilde{\chi}(\rho)$ by
\[ \widetilde{\chi}(\rho) = \inf\{-\lambda(\psi) : \psi \in C(\mathbb{R}^d),\ \mathcal{L}(\psi) \le 1\}, \tag{1.33} \]
where we recall that $C(\mathbb{R}^d)$ is the set of continuous functions $\mathbb{R}^d \to \mathbb{R}$.
Proposition 1.14 (Solution of the variational formula in (1.33)). For any ρ ∈ (0, ∞), the function $\psi_\rho - \rho\log\frac{\rho}{e}$, with $\psi_\rho$ as defined in (1.27), is the unique minimizer in (1.33), and $\widetilde{\chi}(\rho) = \chi(\rho) + \rho\log\frac{\rho}{e}$.
Proof. Obviously, the condition $\mathcal{L}(\psi) \le 1$ in (1.33) may be replaced by $\mathcal{L}(\psi) = 1$. In the representation
\[ \widetilde{\chi}(\rho) = \inf\big\{\rho\log\mathcal{L}(\psi) - \lambda(\psi) : \psi \in C(\mathbb{R}^d),\ \mathcal{L}(\psi) = 1\big\} \]
we may omit the condition $\mathcal{L}(\psi) = 1$ completely since $\rho\log\mathcal{L}(\psi) - \lambda(\psi)$ is invariant under adding constants to ψ. We use the definition of λ(ψ) in (1.26), and (1.28), and obtain, after interchanging the infima,
\[ \widetilde{\chi}(\rho) = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}} \Big\{\|\nabla g\|_2^2 - \sup_{\psi\in C(\mathbb{R}^d)} \Big[\langle\psi, g^2\rangle - \rho\log\int e^{\frac1\rho\psi(x)}\,dx\Big]\Big\} + \rho\log\frac{\rho}{e}. \tag{1.34} \]
The supremum over ψ is uniquely (up to additive constants) attained at $\psi = \rho\log g^2$ with value $\mathcal{H}(g^2)$, as an application of Jensen’s inequality shows:
\[ \rho\log\int e^{\frac1\rho\psi(x)}\,dx = \rho\log\int dx\, g^2(x)\,e^{\frac1\rho\psi(x) - \log g^2(x)} \ge \rho\int dx\, g^2(x)\Big[\frac1\rho\psi(x) - \log g^2(x)\Big] = \langle\psi, g^2\rangle - \mathcal{H}(g^2). \]
Hence, $\widetilde{\chi}(\rho) = \chi(\rho) + \rho\log\frac{\rho}{e}$. Since $g_\rho$ is, up to horizontal shifts, the unique minimiser in (1.24), $\widetilde{\psi}_\rho = \rho\log g_\rho^2 + C$ is the unique minimizer in (1.34). By the above reasoning, $\widetilde{\psi}_\rho$ is the unique minimizer of (1.33), where $C = -\rho\log\frac{\rho}{e}$ is determined by requiring that $\mathcal{L}(\widetilde{\psi}_\rho) = 1$.
Remark 1.15 (Interpretation). There is an interpretation of the minimiser of (1.33) in terms of the optimal rescaled profile of the potential ξ for the almost-sure asymptotics of the total mass U(t). Indeed, the condition $\mathcal{L}(\psi) \le 1$ guarantees that, almost surely for all large t, the profile ψ appears in some ‘microbox’ in the rescaled landscape ξ within the ‘macrobox’ $B_t = [-t, t]^d \cap \mathbb{Z}^d$, which is one of the intermittent islands. The logarithmic rate of the total mass, $\frac1t\log U(t) \approx \lambda_{B_t}(\xi)$, can be bounded from below against the eigenvalue of ξ in the microbox, which is described by λ(ψ). Optimising over all admissible ψ explains the lower bound in (1.17). Our proof of the lower bound in Sect. 4 makes this heuristics precise. The Gaussian density $g_\rho$ in (1.27) is the unique positive $L^2$-normalized eigenfunction of $\Delta + \psi_\rho - \rho\log\frac{\rho}{e}$ corresponding to the eigenvalue $-\widetilde{\chi}(\rho) = \lambda(\psi_\rho - \rho\log\frac{\rho}{e})$.
It describes the rescaled shape of the solution u(t, ·) in the intermittent island. An interesting consequence is that the appropriately rescaled potential and solution shapes are identical for the moment asymptotics and for the almost sure asymptotics. This phenomenon also occurs in the cases of the double-exponential distribution and the potentials bounded from above. ◇
1.6.3. Convergence of the variational problem in (1.13). We close this section by showing that the variational problem in (1.13) converges to the variational problem in (1.12) as γ ↑ 1. We define
\[ \chi(\rho, \gamma) = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}} \Big\{\int_{\mathbb{R}^d} |\nabla g(x)|^2\,dx + \rho\int_{\mathbb{R}^d} \frac{g^{2\gamma}(x) - g^2(x)}{1-\gamma}\,dx\Big\}, \tag{1.35} \]
which is equal to the variational problem in (1.13).
Proposition 1.16 (Convergence of the variational problem in (1.35)). For any ρ ∈ (0, ∞),
\[ \lim_{\gamma\uparrow 1}\chi(\rho, \gamma) = \chi(\rho). \tag{1.36} \]
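For the Gaussian density of (1.27), the integrals entering (1.35) can be written in closed form, which makes the limit γ ↑ 1 transparent. The following worked computation is a sketch under the assumption that the functional has the form shown in (1.35); it uses only the standard Gaussian integral.

```latex
% Gaussian integral with g_\rho^2(x) = (\rho/\pi)^{d/2} e^{-\rho |x|^2}:
\int_{\mathbb{R}^d} g_\rho^{2\gamma}(x)\,dx
  = \Big(\frac{\rho}{\pi}\Big)^{d\gamma/2}\Big(\frac{\pi}{\gamma\rho}\Big)^{d/2}
  = \gamma^{-d/2}\Big(\frac{\rho}{\pi}\Big)^{d(\gamma-1)/2},
\qquad
\int_{\mathbb{R}^d} |\nabla g_\rho(x)|^2\,dx = \frac{\rho d}{2}.

% Difference quotient in \gamma at \gamma = 1 (derivative of the right-hand side above):
\lim_{\gamma\uparrow 1}
  \frac{\gamma^{-d/2}(\rho/\pi)^{d(\gamma-1)/2} - 1}{\gamma - 1}
  = \frac{d}{2}\log\frac{\rho}{\pi} - \frac{d}{2}
  = \int_{\mathbb{R}^d} g_\rho^2(x)\log g_\rho^2(x)\,dx.

% Hence the functional in (1.35), evaluated at g_\rho, converges to
\frac{\rho d}{2} - \rho\Big(\frac{d}{2}\log\frac{\rho}{\pi} - \frac{d}{2}\Big)
  = \rho d\Big(1 - \frac12\log\frac{\rho}{\pi}\Big) = \chi(\rho).
```

This gives only the upper bound on the limit, since it evaluates the infimum at one particular g.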
Proof. The upper bound in (1.36) follows by substituting the Gaussian density $g = g_\rho$ in (1.27) into the infimum in (1.35), and by noting that
\[ \lim_{\gamma\uparrow 1}\int_{\mathbb{R}^d} \frac{g_\rho^{2\gamma}(x) - g_\rho^2(x)}{\gamma - 1}\,dx = \int_{\mathbb{R}^d} g_\rho^2(x)\log g_\rho^2(x)\,dx, \]
by an explicit computation of the integrals involved. For the lower bound in (1.36), we bound, for any γ ∈ [0, 1) and $g \in H^1(\mathbb{R}^d)$,
\[ \int_{\mathbb{R}^d} \frac{g^{2\gamma}(x) - g^2(x)}{1-\gamma}\,dx = \int_{\mathbb{R}^d} g^2(x)\,\frac{e^{(\gamma-1)\log g^2(x)} - 1}{1-\gamma}\,dx \ge -\int_{\mathbb{R}^d} g^2(x)\log g^2(x)\,dx, \]
since $e^\theta - 1 \ge \theta$ for every θ ∈ R. Therefore, χ(ρ, γ) ≥ χ(ρ) for every γ ∈ [0, 1).
The remainder of the paper is as follows. In Sect. 2 we present an important auxiliary result on self-intersections of random walks, which will be used in the proof of Theorem 1.4 in Sect. 3. The proof of Theorem 1.8 is given in Sect. 4. Finally, in Sect. 5 we use the opportunity to correct an error in the proof of the moment asymptotics in case (4) from [BK01].
2. An Auxiliary Result on Self-Intersections of Random Walks
In this section we provide a result on q-fold self-intersections of random walks, for small q > 1, which is an important tool in the proof of the upper bound in Theorem 1.4. This result is of independent interest. Let $\ell_t(z) = \int_0^t \delta_z(X(s))\,ds$ denote the local time at z of the simple random walk (X(s) : s ∈ [0, t]) on $\mathbb{Z}^d$ with generator $\Delta^d$, starting at the origin.
Proposition 2.1. Fix q > 1 such that q(d − 2) < d and R > 0. Let α(t) → ∞ such that $\alpha(t) = O(t^{2/(2d+2)-\varepsilon})$ for some ε > 0. Then
\[ \limsup_{\theta\downarrow 0}\,\limsup_{t\uparrow\infty}\, \frac{\alpha(t)^2}{t}\,\log \mathbb{E}\Big[\exp\Big\{\theta\,\alpha(t)^{-\frac1q[d+(2-d)q]}\Big(\sum_{x\in\mathbb{Z}^d}\ell_t(x)^q\Big)^{1/q}\Big\}\,\mathbb{1}\{\operatorname{supp}(\ell_t) \subseteq B_{R\alpha(t)}\}\Big] = 0. \tag{2.1} \]
Remark 2.2. The result is better understood when rephrasing it in terms of the normalised and rescaled local times, $L_t(\cdot) = \frac1t\,\alpha(t)^d\,\ell_t(\lfloor\cdot\,\alpha(t)\rfloor)$. Then the exponent may be rewritten as
\[ \alpha(t)^{-\frac1q[d+(2-d)q]}\Big(\sum_{x\in\mathbb{Z}^d}\ell_t(x)^q\Big)^{1/q} = \frac{t}{\alpha(t)^2}\,\|L_t\|_q, \]
where $\|\cdot\|_q$ is the norm on $L^q(\mathbb{R}^d)$. Hence, (2.1) is a large deviations result for the q-norm of $L_t$ on the scale $t/\alpha(t)^2$. It is known that $(L_t : t > 0)$ satisfies a large deviation principle on this scale in the weak topology generated by bounded continuous functions, see for example [GKS06]. However, (2.1) does not follow from a routine application of Varadhan’s lemma, since the q-norm is neither bounded nor continuous in this topology. See [Ch04] for an analogous result for a smoothed version of $L_t$. ◇
Remark 2.3. Our proof yields (2.1) also without indicator on $\{\operatorname{supp}(\ell_t) \subseteq B_{R\alpha(t)}\}$ if the sum is restricted to a finite subset of $\mathbb{Z}^d$. It can easily be extended to a large class of random walks, also in discrete time. The proof is based on a combinatorial analysis of the high integer moments of the random variable $\sum_x \ell_t(x)^q$. This method is of crucial importance in the analysis of intersections and self-intersections of random paths [KM02], and of random walk in random scenery [GKS06]. ◇
Proof of Proposition 2.1. By B we denote the box $B = B_{R\alpha(t)} = [-R\alpha(t), R\alpha(t)]^d \cap \mathbb{Z}^d$. In the exponent on the left side of (2.1), we restrict the sum to x ∈ B and forget about the indicator on $\{\operatorname{supp}(\ell_t) \subseteq B_{R\alpha(t)}\}$. In the following we write $\|\cdot\|_q$ for the norm in $\ell^q(B)$. In a first step we reduce the problem to a problem on asymptotics of high integer moments. Suppose first that there are constants T, C > 0 such that
\[ \mathbb{E}\big[\|\ell_t\|_q^{kq}\big] \le k^{kq}\,C^{kq}\,\alpha(t)^{k[d+(2-d)q]}, \qquad\text{for any } t \ge T,\ k \ge \frac{t}{\alpha(t)^2}. \tag{2.2} \]
We now show that this assumption implies (2.1). Expanding the exponential series, we rewrite
\[ \mathbb{E}\exp\Big\{\theta\,\alpha(t)^{-\frac1q[d+(2-d)q]}\,\|\ell_t\|_q\Big\} = \sum_{k=0}^\infty \frac{1}{k!}\,\theta^k\,\alpha(t)^{-\frac kq[d+(2-d)q]}\,\mathbb{E}\big[\|\ell_t\|_q^k\big]. \]
Abbreviate $k_t = qt/\alpha(t)^2$. Under our assumption,
\[ \mathbb{E}\big[\|\ell_t\|_q^k\big] \le \Big(\frac kq\Big)^k C^k\,\alpha(t)^{\frac kq[d+(2-d)q]}, \qquad\text{for } t \ge T,\ k \ge k_t, \tag{2.3} \]
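and hence we obtain
\[ \mathbb{E}\exp\Big\{\theta\,\alpha(t)^{-\frac1q[d+(2-d)q]}\,\|\ell_t\|_q\Big\} \le \sum_{k=0}^{k_t-1} \frac{1}{k!}\,\theta^k\,\alpha(t)^{-\frac kq[d+(2-d)q]}\,\mathbb{E}\big[\|\ell_t\|_q^k\big] + \sum_{k=k_t}^\infty \frac{1}{k!}\Big(\frac{\theta C k}{q}\Big)^k. \tag{2.4} \]
For all sufficiently small θ > 0, the second term is estimated as follows:
\[ \sum_{k=k_t}^\infty \frac{1}{k!}\Big(\frac{\theta C k}{q}\Big)^k \le \sum_{k=k_t}^\infty \Big(\frac{e\theta C}{q}\Big)^k = \frac{\big(\frac{e\theta C}{q}\big)^{k_t}}{1 - \frac{e\theta C}{q}}, \]
and the exponential rate (in $t\alpha(t)^{-2}$) of the right hand side tends to −∞ as θ ↓ 0. For the first term, we bound, using Hölder’s inequality and (2.3), for $k \le k_t$,
\[ \mathbb{E}\big[\|\ell_t\|_q^k\big] \le \mathbb{E}\big[\|\ell_t\|_q^{k_t}\big]^{k/k_t} \le \Big(\Big(\frac{k_t}{q}\Big)^{k_t} C^{k_t}\,\alpha(t)^{\frac{k_t}{q}[d+(2-d)q]}\Big)^{k/k_t} = \Big(\frac{k_t}{q}\Big)^k C^k\,\alpha(t)^{\frac kq[d+(2-d)q]}. \]
Therefore, the first term in (2.4) is bounded by
\[ \sum_{k=0}^{k_t} \frac{1}{k!}\Big(\frac{\theta C k_t}{q}\Big)^k \le e^{\frac{\theta C}{q}k_t}. \]
This proves that (2.2) implies the statement (2.1). Therefore, it suffices to prove (2.2) with some constants C, T > 0. We use C to denote a generic constant which depends on R, d and q, but not on k and t, and C may change its value from appearance to appearance. To prove (2.2), we write $A_k$ for the set of maps $\beta : B \to \mathbb{N}_0$ satisfying $\sum_{x\in B}\beta_x = k$. First we write out
\[ \begin{aligned} \mathbb{E}\big[\|\ell_t\|_q^{kq}\big] &= \sum_{z_1,\dots,z_k\in B} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{q\,\#\{i\,:\,z_i=x\}}\Big] \\ &= \sum_{\beta\in A_k} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{q\beta_x}\Big]\,\#\big\{z\in B^k : \beta_x = \#\{i : z_i = x\}\ \forall x\big\} \\ &= k!\sum_{\beta\in A_k} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{q\beta_x}\Big]\prod_{x\in B}\frac{1}{\beta_x!}. \end{aligned} \tag{2.5} \]
Note that, for $\beta \in A_k$, the numbers $q\beta_x$ are not necessarily integers. We resolve this problem, in an upper bound, by introducing a further sum over the set $\widetilde{A}_k(\beta)$ of all $\widetilde\beta : B \to \mathbb{N}_0$ satisfying $|\widetilde\beta_x - q\beta_x| < 1$ for every x ∈ B. Then, clearly,
\[ \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{q\beta_x}\Big] \le \sum_{\widetilde\beta\in\widetilde{A}_k(\beta)} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big]. \tag{2.6} \]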
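We fix $\beta \in A_k$ and $\widetilde\beta \in \widetilde{A}_k(\beta)$ and denote $\widetilde k = \sum_{x\in B}\widetilde\beta_x$. Writing out the local times, we have
\[ \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big] = \prod_{x\in B}\prod_{i=1}^{\widetilde\beta_x}\int_0^t ds_i^x\ \mathbb{P}\big\{X(s_i^x) = x\ \ \forall x\in B\ \forall i = 1,\dots,\widetilde\beta_x\big\}. \]
The next step is to give new names to the integration variables $s_i^x$ such that we can order the time variables. Fix some function $\varrho : \{1,\dots,\widetilde k\} \to B$ such that $|\varrho^{-1}(\{x\})| = \widetilde\beta_x$ for any x ∈ B. We continue with, denoting the set of permutations of $1,\dots,\widetilde k$ by $S_{\widetilde k}$,
\[ \begin{aligned} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big] &= \int_{[0,t]^{\widetilde k}} dt_1\dots dt_{\widetilde k}\ \mathbb{P}\{X(t_i) = \varrho(i)\ \forall i = 1,\dots,\widetilde k\} \\ &= \sum_{\sigma\in S_{\widetilde k}}\int_{0<t_1<\dots<t_{\widetilde k}<t} dt_1\dots dt_{\widetilde k}\ \mathbb{P}\big\{X(t_{\sigma(i)}) = \varrho(i)\ \forall i\big\} \\ &= \sum_{\sigma\in S_{\widetilde k}}\int_{(0,\infty)^{\widetilde k}} ds_1\dots ds_{\widetilde k}\ \mathbb{1}\Big\{\sum_{i=1}^{\widetilde k} s_i \le t\Big\}\prod_{i=1}^{\widetilde k} p_{s_i}\big(\varrho(\sigma(i-1)), \varrho(\sigma(i))\big), \end{aligned} \tag{2.7} \]
where we switched from σ to $\sigma^{-1}$ and substituted $s_i = t_i - t_{i-1}$ (with $t_0 = 0$), and we introduced the transition probabilities of a continuous time simple random walk, $p_s(x, y) = \mathbb{P}_x\{X(s) = y\}$. Here we use the convention σ(0) = 0 and $\varrho(0) = 0$, the starting point of the random walk. We estimate the indicator on the right hand side of (2.7) against $e^{\lambda t}\prod_{i=1}^{\widetilde k} e^{-\lambda s_i}$ for $\lambda = \alpha(t)^{-2}$. Then we integrate out over all the $s_i$, to obtain
\[ \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big] \le e^{\lambda t}\sum_{\sigma\in S_{\widetilde k}}\prod_{i=1}^{\widetilde k} G_\lambda\big(\varrho(\sigma(i-1)), \varrho(\sigma(i))\big), \tag{2.8} \]
where $G_\lambda$ is the Green’s function of the walk given by $G_\lambda(x, y) = \int_0^\infty e^{-\lambda s}\,p_s(x, y)\,ds$. It will be convenient to use a closed loop of sites, i.e., to change the convention σ(0) = 0 to the convention $\sigma(0) = \sigma(\widetilde k)$. Since
\[ \mathbb{1}\{\sigma(0) = 0\}\prod_{i=1}^{\widetilde k} G_\lambda\big(\varrho(\sigma(i-1)), \varrho(\sigma(i))\big) = \mathbb{1}\{\sigma(0) = \sigma(\widetilde k)\}\,\frac{G_\lambda\big(\varrho(0), \varrho(\sigma(1))\big)}{G_\lambda\big(\varrho(\sigma(\widetilde k)), \varrho(\sigma(1))\big)}\prod_{i=1}^{\widetilde k} G_\lambda\big(\varrho(\sigma(i-1)), \varrho(\sigma(i))\big), \]
this change of conventions leads to a factor
\[ \frac{G_\lambda\big(\varrho(0), \varrho(\sigma(1))\big)}{G_\lambda\big(\varrho(\sigma(\widetilde k)), \varrho(\sigma(1))\big)}, \]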
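which can be bounded by $e^{o(k)}$ since $\sup_{x,y\in B} G_\lambda(0, y)/G_\lambda(x, y) \le e^{o(k)}$, where we recall that $\lambda = \alpha(t)^{-2}$, $k > t\alpha(t)^{-2}$ and $B = B_{R\alpha(t)}$. We denote by $P(\widetilde\beta)$ the set of maps $\gamma : B\times B \to \mathbb{N}_0$ such that $\sum_{y\in B}\gamma_{x,y} = \sum_{y\in B}\gamma_{y,x} = \widetilde\beta_x$ for any x ∈ B. Then we can rewrite
\[ \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big] \le e^{\lambda t + o(k)}\sum_{\gamma\in P(\widetilde\beta)}\prod_{x,y\in B} G_\lambda(x, y)^{\gamma_{x,y}} \times \sum_{\substack{w\in B^{\widetilde k}\\ (w_0 = w_{\widetilde k})}}\ \sum_{\sigma\in S_{\widetilde k}} \mathbb{1}\{\varrho\circ\sigma = w\}\,\mathbb{1}\big\{\gamma_{x,y} = \#\{i : w_{i-1} = x, w_i = y\}\ \forall x, y\in B\big\}. \]
We can evaluate the sums over w and σ using elementary combinatorics. Indeed, note that
\[ \#\{\sigma : \varrho\circ\sigma = w\} = \prod_{x\in B}\widetilde\beta_x!, \tag{2.9} \]
since, given w and $\varrho$, the left hand side equals the number of orders in which one can put $\widetilde k$ objects, of which $\widetilde\beta_x$ for each x ∈ B are indistinguishable, into a row such that the same vector of elements arises. Since one can only permute within those indices which belong to the same class of indistinguishable objects, we obtain (2.9). This performs the sum over σ. To perform the sum over w for fixed γ, we use [dH00, p.17], to obtain
\[ \#\big\{w : \gamma_{x,y} = \#\{i : w_{i-1} = x, w_i = y\}\ \forall x, y\in B\big\} \le \widetilde k\,\frac{\prod_{x\in B}\widetilde\beta_x!}{\prod_{x,y\in B}\gamma_{x,y}!}. \]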
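The counting identity (2.9) is easy to confirm by brute force for small parameters. In the sketch below, `labels` plays the role of the fixed map from the index set into B, `word` plays the role of w, and all names are illustrative rather than taken from the text. The code counts the permutations that turn the labelling into the given word and compares the count with the product of factorials of the site multiplicities.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_relabellings(labels, word):
    """Count permutations sigma of range(k) with labels[sigma[i]] == word[i] for all i."""
    k = len(labels)
    return sum(
        all(labels[sigma[i]] == word[i] for i in range(k))
        for sigma in permutations(range(k))
    )


def predicted_count(labels):
    """Right-hand side of (2.9): product over sites of (multiplicity of the site)!."""
    result = 1
    for multiplicity in Counter(labels).values():
        result *= factorial(multiplicity)
    return result


if __name__ == "__main__":
    labels = ["a", "a", "b", "c", "c", "c"]   # site 'a' twice, 'b' once, 'c' three times
    word = ["c", "a", "c", "b", "a", "c"]     # any arrangement with the same multiplicities
    print(count_relabellings(labels, word), predicted_count(labels))
```

The brute-force count enumerates all k! permutations, so it is only meant for very small k.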
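Therefore, we obtain
\[ \begin{aligned} \mathbb{E}\Big[\prod_{x\in B}\ell_t(x)^{\widetilde\beta_x}\Big] &\le e^{\lambda t + o(k)}\sum_{\gamma\in P(\widetilde\beta)}\prod_{x,y\in B} G_\lambda(x, y)^{\gamma_{x,y}}\,\prod_{x\in B}\widetilde\beta_x!\ \frac{\prod_{x\in B}\widetilde\beta_x!}{\prod_{x,y\in B}\gamma_{x,y}!} \\ &\le e^{\lambda t + o(k)}\,e^{\sum_{x,y\in B}\gamma_{x,y}}\sum_{\gamma\in P(\widetilde\beta)}\prod_{x,y\in B}\Big(\frac{G_\lambda(x, y)^q}{\gamma_{x,y}}\Big)^{\frac1q\gamma_{x,y}}\prod_{x,y\in B}\gamma_{x,y}^{\frac{1-q}{q}\gamma_{x,y}}\prod_{x\in B}\widetilde\beta_x^{\,2\widetilde\beta_x}, \end{aligned} \tag{2.10} \]
where we use that $n^n e^{-n} \le n! \le n^n$. We next use that, since $|\widetilde\beta_x - q\beta_x| < 1$,
\[ \widetilde k = \sum_{x,y\in B}\gamma_{x,y} = \sum_{x\in B}\widetilde\beta_x \le q\sum_{x\in B}\beta_x + |B| = qk + |B|. \]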
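By our assumption on the growth of α(t), we have $|B| \le C\alpha(t)^d \le o(k)$ and hence $e^{\sum_{x,y\in B}\gamma_{x,y}} \le C^k$. Fix $\gamma \in P(\widetilde\beta)$. We use Jensen’s inequality for the logarithm to obtain
\[ \prod_{x,y\in B}\Big(\frac{G_\lambda(x, y)^q}{\gamma_{x,y}}\Big)^{\frac1q\gamma_{x,y}} = \exp\Big\{\frac1q\sum_{x\in B}\widetilde\beta_x\sum_{y\in B}\frac{\gamma_{x,y}}{\widetilde\beta_x}\log\frac{G_\lambda(x, y)^q}{\gamma_{x,y}}\Big\} \le \exp\Big\{\frac1q\sum_{x\in B}\widetilde\beta_x\log\Big(\frac{1}{\widetilde\beta_x}\sum_{y\in B} G_\lambda(x, y)^q\Big)\Big\}. \tag{2.11} \]
Recall that $\lambda = \alpha(t)^{-2}$. Since (d − 2)q < d, there is a constant C (only depending on R, d and q) such that, for any x ∈ B,
\[ \sum_{y\in B} G_{\alpha(t)^{-2}}(x, y)^q \le C\alpha(t)^{d+(2-d)q}. \tag{2.12} \]
This gives that
\[ \prod_{x,y\in B}\Big(\frac{G_\lambda(x, y)^q}{\gamma_{x,y}}\Big)^{\frac1q\gamma_{x,y}} \le C^k\,\alpha(t)^{[d+(2-d)q]\widetilde k/q}\prod_{x\in B}\widetilde\beta_x^{-\frac1q\widetilde\beta_x}. \tag{2.13} \]
We substitute (2.13) into (2.10) and summarise (2.5), (2.6) and (2.10). Using that $|\widetilde k - qk| \le |B|$, we obtain
\[ \mathbb{E}\big[\|\ell_t\|_q^{qk}\big] \le \widetilde k^{\widetilde k}\,C^k\,\alpha(t)^{[d+(2-d)q](k+\frac1q|B|)} \times \sum_{\beta\in A_k}\sum_{\widetilde\beta\in\widetilde{A}_k(\beta)}\sum_{\gamma\in P(\widetilde\beta)}\prod_{x,y\in B}\Bigg(\frac{\big(\frac{1}{\widetilde k}\gamma_{x,y}\big)^{\frac{1-q}{q}}}{\big(\frac{1}{\widetilde k}\widetilde\beta_x\big)^{\frac1q-2}\big(\frac1k\beta_x\big)^{\beta_x/\widetilde\beta_x}}\Bigg)^{\gamma_{x,y}}. \tag{2.14} \]
Note that, by our growth assumption on α(t) and since $k \ge t/\alpha(t)^2$,
\[ \widetilde k^{\widetilde k} \le (qk)^{qk}\,C^{|B|}\,k^{|B|} \le (qk)^{qk}\,e^{o(k)}. \]
The product is estimated with the help of Jensen’s inequality for the logarithm, together with the fact that $y \mapsto \gamma_{x,y}/\widetilde\beta_x$ is a probability measure, as follows:
\[ \begin{aligned} \prod_{x,y\in B}\Bigg(\frac{\big(\frac{1}{\widetilde k}\gamma_{x,y}\big)^{\frac{1-q}{q}}}{\big(\frac{1}{\widetilde k}\widetilde\beta_x\big)^{\frac1q-2}\big(\frac1k\beta_x\big)^{\beta_x/\widetilde\beta_x}}\Bigg)^{\gamma_{x,y}} &= \exp\Big\{\frac{q-1}{q}\sum_{x\in B}\widetilde\beta_x\sum_{y\in B}\frac{\gamma_{x,y}}{\widetilde\beta_x}\log\frac{\widetilde\beta_x\big(\frac{1}{\widetilde k}\widetilde\beta_y\big)}{\gamma_{x,y}}\Big\} \\ &\quad\times\exp\Big\{-\sum_{x\in B}\beta_x\log\frac{\beta_x}{k} + \frac1q\sum_{x\in B}\widetilde\beta_x\log\frac{\widetilde\beta_x}{\widetilde k}\Big\} \\ &\le \exp\Big\{-\sum_{x\in B}\beta_x\log\frac{\beta_x}{k} + \frac1q\sum_{x\in B}\widetilde\beta_x\log\frac{\widetilde\beta_x}{\widetilde k}\Big\}. \end{aligned} \]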
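Now recall that $q\beta_x - 1 \le \widetilde\beta_x \le q\beta_x + 1 \le 2q\beta_x$ for $\beta_x > 0$ to bound
\[ \sum_{x\in B}\widetilde\beta_x\log\frac{\widetilde\beta_x}{\widetilde k} \le \sum_{x\in B}(q\beta_x - 1)\log\frac{2q\beta_x}{\widetilde k} = q\sum_{x\in B}\beta_x\log\frac{\beta_x}{k} + \sum_{x\in B}\log\frac{\widetilde k}{2q\beta_x} + qk\log\frac{2qk}{\widetilde k}, \]
so that we arrive at
\[ -\sum_{x\in B}\beta_x\log\frac{\beta_x}{k} + \frac1q\sum_{x\in B}\widetilde\beta_x\log\frac{\widetilde\beta_x}{\widetilde k} \le k\log\frac{2qk}{\widetilde k} + \frac1q\sum_{x\in B}\log\frac{\widetilde k}{2q\beta_x} \le k\log\frac{qk}{\widetilde k} + Ck + C|B|\log\widetilde k \le Ck, \]
since $qk/\widetilde k$ converges to one and since $|B|\log\widetilde k \le C\alpha(t)^d\log k \le o(k)$. Hence, we have estimated the product on the right hand side of (2.14) against $C^k$ uniformly in $\beta \in A_k$, $\widetilde\beta \in \widetilde{A}_k(\beta)$ and $\gamma \in P(\widetilde\beta)$. Our growth condition on α(t) implies that each of the sums can be estimated against $e^{o(k)}$. Indeed,
\[ |P(\widetilde\beta)| \le \widetilde k^{|B|^2} \le e^{C\alpha(t)^{2d}\log k} \le e^{o(k)}, \]
for any $\widetilde\beta \in \widetilde{A}_k(\beta)$ and for any $\beta \in A_k$. Furthermore, $|\widetilde{A}_k(\beta)| \le 2^{|B|} \le e^{o(k)}$ for any $\beta \in A_k$, and finally $|A_k| \le k^{|B|} \le e^{o(k)}$. Therefore, we obtain
\[ \mathbb{E}\big[\|\ell_t\|_q^{kq}\big] \le C^k\,k^{qk}\,\alpha(t)^{[d+(2-d)q]k}\,\alpha(t)^{C|B|} \le k^{kq}\,C^{kq}\,\alpha(t)^{k[d+(2-d)q]}, \]
where we again used our growth condition on α(t). This completes the proof.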
3. The Moment Asymptotics: Proof of Theorem 1.4
Our analysis is based on the link between the random-walk and random-field descriptions provided by the Feynman-Kac formula. Let (X(s) : s ∈ [0, ∞)) be the continuous-time simple random walk on $\mathbb{Z}^d$ with generator $\Delta^d$. By $\mathbb{P}_z$ and $\mathbb{E}_z$ we denote the probability measure, respectively, the expectation with respect to the walk starting at $X(0) = z \in \mathbb{Z}^d$. Let $V : \mathbb{Z}^d \to [-\infty, \infty)$ be a potential that is non-percolating from below, i.e. there exists A ∈ R such that the level set $\{z \in \mathbb{Z}^d : V(z) \le A\}$ does not contain an infinite connected component. Then, see e.g. [GM90, Lemmas 2.2 and 2.3], there exists a unique nonnegative, continuous solution $u^V$ of the initial-value problem
\[ \begin{aligned} \partial_t u(t, z) &= \Delta^d u(t, z) + V(z)\,u(t, z), &&\text{for } (t, z) \in (0, \infty)\times\mathbb{Z}^d, \\ u(0, z) &= 1, &&\text{for } z \in \mathbb{Z}^d, \end{aligned} \tag{3.1} \]
and the Feynman-Kac formula allows us to express $u^V$ as
\[ u^V(t, z) = \mathbb{E}_z\exp\Big\{\int_0^t V\big(X(s)\big)\,ds\Big\}, \qquad\text{for } z \in \mathbb{Z}^d,\ t > 0. \tag{3.2} \]
By [GM90, Theorem 2.1] the random potential ξ is almost surely non-percolating from below. Hence, $u^\xi$ is the solution of the parabolic Anderson problem in (1.1) with initial
condition u(0, z) = 1 for all $z \in \mathbb{Z}^d$, and the main object of our study is $U(t) = u^\xi(t, 0)$. Introduce the vertically shifted random potential
\[ \xi_t(z) = \xi(z) - H\Big(\frac{t}{\alpha(t)^d}\Big)\frac{\alpha(t)^d}{t}. \tag{3.3} \]
Note that t is a parameter here, and $\xi_t$ should not be seen as a time-dependent random potential. Fix p ∈ (0, ∞). Then Theorem 1.4 is equivalent to the statement
\[ \lim_{t\uparrow\infty}\frac{\alpha(pt)^2}{pt}\log\big\langle u^{\xi_{pt}}(t, 0)^p\big\rangle = -\chi(\rho), \tag{3.4} \]
where χ(ρ) is defined in (1.24). We approximate $u^{\xi_{pt}}$ by finite-space versions. Let R > 0 and let $B_R = [-R, R]^d \cap \mathbb{Z}^d$ be the centred box in $\mathbb{Z}^d$ with radius R. Introduce $u^V_R : [0, \infty)\times\mathbb{Z}^d \to [0, \infty)$ by
\[ u^V_R(t, z) = \mathbb{E}_z\Big[\exp\Big\{\int_0^t V\big(X(s)\big)\,ds\Big\}\,\mathbb{1}\{\operatorname{supp}(\ell_t) \subseteq B_R\}\Big], \tag{3.5} \]
where $\ell_t(z) = \int_0^t \delta_z(X(s))\,ds$ are the local times of the random walk. Note that $u^V_r \le u^V_R \le u^V$ for 0 < r < R < ∞. In the finite space setting we can work easily with eigenfunction expansions: We look at the function
\[ p^V_R(t, y, z) = \mathbb{E}_y\Big[e^{\langle V, \ell_t\rangle}\,\mathbb{1}\{\operatorname{supp}(\ell_t) \subseteq B_R\}\,\mathbb{1}\{X(t) = z\}\Big], \qquad\text{for } y, z \in \mathbb{Z}^d, \tag{3.6} \]
and the eigenvalues, $\lambda_1 > \lambda_2 \ge \lambda_3 \ge \dots \ge \lambda_n$, of the operator $\Delta^d + V$ in $\ell^2(B_R)$ with zero boundary condition, where we abbreviate $n = |B_R|$. We may pick an orthonormal basis of corresponding eigenfunctions $e_k$. By convention, $e_k$ vanishes outside $B_R$. Note that $\sum_{z\in B_R} p^V_R(t, y, z) = u^V_R(t, y)$. Furthermore, we have the eigenfunction expansion
\[ p^V_R(t, y, z) = \sum_k e^{t\lambda_k}\,e_k(y)\,e_k(z). \tag{3.7} \]
In particular,
\[ u^V_R(t, z) = \sum_k e^{t\lambda_k}\,\langle e_k, \mathbb{1}\rangle\,e_k(z). \tag{3.8} \]
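The expansions (3.7) and (3.8) can be illustrated on the smallest nontrivial example. The sketch below assumes d = 1 and a box with only two sites, so that the operator with zero boundary condition is a symmetric 2×2 matrix with diagonal entries −2 + V and off-diagonal entries 1. The two-site box, the potential values and all function names are illustrative choices. The spectral sum is compared with the matrix exponential computed from its power series, taking for granted (as in the text) that the Feynman-Kac kernel of (3.6) is the kernel of that exponential.

```python
import math


def spectral_kernel(t, v0, v1):
    """p(t, y, z) = sum_k exp(t * lambda_k) e_k(y) e_k(z) on the two-site box {0, 1}."""
    a, b = -2.0 + v0, -2.0 + v1             # diagonal of Delta + V in d = 1
    mean = 0.5 * (a + b)
    half_gap = math.hypot(0.5 * (a - b), 1.0)
    kernel = [[0.0, 0.0], [0.0, 0.0]]
    for lam in (mean + half_gap, mean - half_gap):
        vec = (1.0, lam - a)                 # first row of (A - lam) vec = 0
        norm = math.hypot(vec[0], vec[1])
        e = (vec[0] / norm, vec[1] / norm)
        for y in range(2):
            for z in range(2):
                kernel[y][z] += math.exp(t * lam) * e[y] * e[z]
    return kernel


def series_kernel(t, v0, v1, terms=60):
    """exp(t * (Delta + V)) on {0, 1} via its power series, as an independent check."""
    m = [[t * (-2.0 + v0), t], [t, t * (-2.0 + v1)]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [
            [sum(term[i][k] * m[k][j] for k in range(2)) / n for j in range(2)]
            for i in range(2)
        ]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def box_solution(kernel, z):
    """u_R(t, z) as the sum over y of p(t, z, y), which is the content of (3.8)."""
    return sum(kernel[z])


if __name__ == "__main__":
    p_spec = spectral_kernel(1.5, 0.3, -0.7)
    p_ser = series_kernel(1.5, 0.3, -0.7)
    print(p_spec, p_ser, box_solution(p_spec, 0))
```

For larger boxes one would use a numerical eigensolver instead of the closed-form 2×2 diagonalisation; the structure of the check stays the same.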
The following proposition carries out the necessary large deviations arguments for the case p = 1, and is the key result for the proof of (3.4).
Proposition 3.1.
(i) Let R > 0. Then $\displaystyle\limsup_{t\uparrow\infty}\frac{\alpha(t)^2}{t}\log\big\langle u^{\xi_t}_{R\alpha(t)}(t, 0)\big\rangle \le -\chi(\rho)$.
(ii) $\displaystyle\liminf_{R\uparrow\infty}\,\liminf_{t\uparrow\infty}\frac{\alpha(t)^2}{t}\log\big\langle u^{\xi_t}_{R\alpha(t)}(t, 0)\big\rangle \ge -\chi(\rho)$.
The proofs of Proposition 3.1(i) and (ii) are deferred to Sect. 3.2 and 3.3, respectively.
3.1. Proof of (3.4) subject to Proposition 3.1. Proof of the lower bound in (3.4). All we have to do is to show that, as $t\uparrow\infty$,
\[ \big\langle u^{\xi_{pt}}(t,0)^p\big\rangle\ge e^{o(t\alpha(pt)^{-2})}\big\langle u^{\xi_{pt}}_{R\alpha(pt)}(pt,0)\big\rangle. \tag{3.9} \]
To prove this, we repeat the proof of [BK01, Lemmas 4.1 and 4.3] for the reader's convenience. We abbreviate $r=R\alpha(pt)$, $V=\xi_{pt}$, $u=u^V$, $u_r=u_r^V$ and $p_r=p_r^V$. Note that $|B_r|=e^{o(t\alpha(pt)^{-2})}$. Now we prove (3.9). First we assume that $p\in(0,1)$. Use the shift invariance of the distribution of the field $V$ and the inequality $\sum_i x_i^p\ge\big(\sum_i x_i\big)^p$ for nonnegative $x_i$ to estimate
\[ \big\langle u(t,0)^p\big\rangle=\frac{1}{|B_r|}\sum_{z\in B_r}\big\langle u(t,z)^p\big\rangle\ge e^{o(t\alpha(pt)^{-2})}\Big\langle\sum_{z\in B_r}u(t,z)^p\Big\rangle\ge e^{o(t\alpha(pt)^{-2})}\Big\langle\Big(\sum_{z\in B_r}u(t,z)\Big)^p\Big\rangle. \tag{3.10} \]
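By $\|\cdot\|$ we denote the norm on $\ell^2(B_r)$. According to Parseval's identity, the numbers $\langle e_k,\mathbb{1}\rangle^2/\|\mathbb{1}\|^2$ sum up to one. Using $u\ge u_r$, the Fourier expansion in (3.8) and Jensen's inequality, we obtain
\[ \Big(\sum_{z\in B_r}u(t,z)\Big)^p\ge\|\mathbb{1}\|^{2p}\Big(\sum_k e^{t\lambda_k}\frac{\langle e_k,\mathbb{1}\rangle^2}{\|\mathbb{1}\|^2}\Big)^p\ge\|\mathbb{1}\|^{2p}\sum_k e^{pt\lambda_k}\frac{\langle e_k,\mathbb{1}\rangle^2}{\|\mathbb{1}\|^2}\ge e^{o(t\alpha(pt)^{-2})}\sum_{z\in B_r}u_r(pt,z)\ge e^{o(t\alpha(pt)^{-2})}\,u_r(pt,0). \tag{3.11} \]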
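Substituting (3.11) in (3.10) completes the proof of (3.9) in the case $p\in(0,1)$. Now we turn to the case $p\in[1,\infty)$. We use the first equation in (3.10), Jensen's inequality, the eigenfunction expansion in (3.8) and the inequality $\big(\sum_i x_i\big)^p\ge\sum_i x_i^p$ for nonnegative $x_i$ to obtain
\[ \big\langle u(t,0)^p\big\rangle\ge\Big\langle\Big(\frac{1}{|B_r|}\sum_{z\in B_r}u(t,z)\Big)^p\Big\rangle\ge e^{o(t\alpha(pt)^{-2})}\Big\langle\Big(\sum_k e^{t\lambda_k}\langle e_k,\mathbb{1}\rangle^2\Big)^p\Big\rangle\ge e^{o(t\alpha(pt)^{-2})}\Big\langle\sum_k e^{pt\lambda_k}\langle e_k,\mathbb{1}\rangle^{2p}\Big\rangle. \tag{3.12} \]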
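Next we use Jensen's inequality as follows:
\[ \sum_k e^{pt\lambda_k}\langle e_k,\mathbb{1}\rangle^{2p}=\Big(\sum_k e^{pt\lambda_k}\Big)\frac{\sum_k e^{pt\lambda_k}\big(\langle e_k,\mathbb{1}\rangle^{2}\big)^p}{\sum_k e^{pt\lambda_k}}\ge\Big(\sum_k e^{pt\lambda_k}\Big)\Big(\frac{\sum_k e^{pt\lambda_k}\langle e_k,\mathbb{1}\rangle^{2}}{\sum_k e^{pt\lambda_k}}\Big)^p=\sum_{z\in B_r}u_r(pt,z)\Big(\frac{\sum_k e^{pt\lambda_k}}{\sum_k e^{pt\lambda_k}\langle e_k,\mathbb{1}\rangle^{2}}\Big)^{1-p}\ge u_r(pt,0). \tag{3.13} \]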
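In the last step, we have used the eigenfunction expansions in (3.7) and (3.8), together with the nonnegativity of the kernel $p_r^V$, to see that
\[ \sum_k e^{pt\lambda_k}\langle e_k,\mathbb{1}\rangle^2=\sum_{y,z}p_r^V(pt,y,z)\ge\sum_z p_r^V(pt,z,z)=\sum_k e^{pt\lambda_k}. \]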
Combining (3.12) and (3.13) completes the proof of (3.9) also in the case p ∈ [1, ∞).
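The two elementary power inequalities used in (3.10) and (3.12) point in opposite directions depending on whether $p<1$ or $p\ge1$. A minimal numerical sketch, with a hypothetical helper name and an arbitrary sample vector chosen only for illustration:

```python
def power_sum_gap(xs, p):
    """Return sum_i x_i^p - (sum_i x_i)^p for nonnegative x_i."""
    if any(x < 0 for x in xs):
        raise ValueError("entries must be nonnegative")
    return sum(x ** p for x in xs) - sum(xs) ** p


sample = [0.3, 1.7, 2.0, 0.05]  # arbitrary illustrative values

# p in (0, 1): the gap is nonnegative, as used for (3.10).
gap_concave = power_sum_gap(sample, 0.5)

# p >= 1: the gap is nonpositive, as used for (3.12).
gap_convex = power_sum_gap(sample, 3.0)
```

The sign of the gap flips at $p=1$, where both sides coincide.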
Proof of the upper bound in (3.4). A main ingredient in our proof is the following preparatory lemma, which provides, for any potential $V$, an estimate of $u^V(t,0)$ in terms of the maximal principal eigenvalue of $\Delta^{\mathrm d}+V$ in small subboxes ('microboxes') of a 'macrobox'. For $z\in\mathbb{Z}^d$ and $R>0$, we denote by $\lambda^{\mathrm d}_{z;R}(V)$ the principal eigenvalue of the operator $\Delta^{\mathrm d}+V$ with Dirichlet boundary conditions in the shifted box $z+B_R$.
Lemma 3.2. Let $r\colon(0,\infty)\to(0,\infty)$ be such that $r(t)/t\uparrow\infty$. For $R,t>0$ let $B_R(t)=B_{r(t)+2R}$. Then there is a constant $C>0$ such that, for any sufficiently large $R,t$ and any potential $V\colon\mathbb{Z}^d\to[-\infty,\infty)$,
\[ u^V(t,0)\le\mathbb{E}\Big[e^{2\int_0^tV(X_s)\,ds}\Big]^{1/2}e^{-r(t)}+e^{Ct/R^2}\big(3r(t)\big)^d\exp\Big\{t\max_{z\in B_R(t)}\lambda^{\mathrm d}_{z;2R}(V)\Big\}. \tag{3.14} \]
Proof. This is a modification of the proof of [BK01, Proposition 4.4], which refers to nonpositive potentials $V$ only. The proof of [BK01, Proposition 4.4] consists of [BK01, Lemma 4.5] and [BK01, Lemma 4.6]. The latter states that
\[ u^V_{r(t)}(t,0)\le e^{Ct/R^2}\big(3r(t)\big)^d\exp\Big\{t\max_{z\in B_R(t)}\lambda^{\mathrm d}_{z;2R}(V)\Big\}. \tag{3.15} \]
A careful inspection of the proof shows that no use is made of nonpositivity of $V$ and hence (3.15) applies in the present setting. In order to estimate $u^V(t,0)-u^V_{r(t)}(t,0)$, we introduce the exit time $\tau_R=\inf\{t>0\colon X(t)\notin B_R\}$ from the box $B_R$ and use the Cauchy-Schwarz inequality to obtain
\[ u^V(t,0)-u^V_{r(t)}(t,0)=\mathbb{E}\Big[\exp\Big\{\int_0^tV(X(s))\,ds\Big\}\mathbb{1}\{\tau_{r(t)}\le t\}\Big]\le\mathbb{E}\Big[e^{2\int_0^tV(X_s)\,ds}\Big]^{1/2}\,\mathbb{P}\{\tau_{r(t)}\le t\}^{1/2}. \]
According to [GM98, Lemma 2.5(a)], for any $r>0$,
\[ \mathbb{P}\{\tau_r\le t\}\le2^{d+1}\exp\Big\{-r\Big(\log\frac{r}{dt}-1\Big)\Big\}. \]
Hence, we may estimate $\mathbb{P}\{\tau_{r(t)}\le t\}^{1/2}\le e^{-r(t)}$, for sufficiently large $t$, completing the proof.
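The last step of the proof can be made concrete by asking from which $t$ on the square root of the exit-time bound is dominated by $e^{-r(t)}$. The sketch below works on the logarithmic scale and assumes that the bound of [GM98, Lemma 2.5(a)] reads $2^{d+1}\exp\{-r(\log\frac{r}{dt}-1)\}$; the function names and the test values of $t$ and $d$ are illustrative assumptions.

```python
import math


def log_sqrt_exit_bound(r, t, d):
    """Logarithm of the square root of 2^(d+1) * exp(-r * (log(r / (d*t)) - 1))."""
    return 0.5 * ((d + 1) * math.log(2.0) - r * (math.log(r / (d * t)) - 1.0))


def is_dominated(t, d, r_of_t):
    """True if the square-root bound is at most exp(-r(t)) at time t."""
    r = r_of_t(t)
    return log_sqrt_exit_bound(r, t, d) <= -r
```

With $r(t)=t^2$ the domination needs roughly $\log(t/d)\ge3$, so in $d=3$ it fails at $t=10$ and holds at $t=100$. With the slower choice $r(t)=t\log t$ it sets in only once $\log t$ exceeds about $d\,e^3$, which is still "sufficiently large $t$" but far beyond any practical range.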
We now complete the proof of the upper bound in (3.4), subject to Proposition 3.1. Let $p\in(0,\infty)$ and fix $R>0$. First, notice that the second term in (3.14) can be estimated in terms of a sum,
\[ \exp\Big\{pt\max_{z\in B_R(t)}\lambda^{\mathrm d}_{z;2R}(V)\Big\}\le\sum_{z\in B_R(t)}e^{pt\lambda^{\mathrm d}_{z;2R}(V)}. \tag{3.16} \]
Thus, applying (3.14) to $u^{\xi_{pt}}(t,0)$ with $R$ replaced by $R\alpha(pt)$, raising both sides to the $p$th power, and using $(x+y)^p\le2^p(x^p+y^p)$ for $x,y\ge0$, together with (3.16), we get
\[ u^{\xi_{pt}}(t,0)^p\le2^p\Big[\mathbb{E}\Big[e^{2\int_0^t\xi_{pt}(X(s))\,ds}\Big]^{p/2}e^{-pr(t)}+e^{Cpt/(R^2\alpha(pt)^2)}\big(3r(t)\big)^{pd}\sum_{z\in B_{R\alpha(pt)}(t)}e^{pt\lambda^{\mathrm d}_{z;2R\alpha(pt)}(\xi_{pt})}\Big]. \]
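Next we take the expectation with respect to $\xi$ and note that, by the shift-invariance of $\xi$, the distribution of $\lambda^{\mathrm d}_{z;2R\alpha(pt)}(\xi)$ does not depend on $z\in\mathbb{Z}^d$. This gives
\[ \big\langle u^{\xi_{pt}}(t,0)^p\big\rangle\le2^p\Big[\Big\langle\mathbb{E}\Big[e^{2\int_0^t\xi_{pt}(X(s))\,ds}\Big]^{p/2}\Big\rangle e^{-pr(t)}+e^{Cpt/(R^2\alpha(pt)^2)}\big(3r(t)\big)^{pd+d}\Big\langle e^{pt\lambda^{\mathrm d}_{0;2R\alpha(pt)}(\xi_{pt})}\Big\rangle\Big]. \tag{3.17} \]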
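In order to show that the first term on the right is negligible, estimate, in the case $p\ge2$, with the help of Jensen's inequality and Fubini's theorem,
\[ \Big\langle\mathbb{E}\Big[e^{2\int_0^t\xi_{pt}(X(s))\,ds}\Big]^{p/2}\Big\rangle\le\Big\langle\mathbb{E}\Big[e^{\frac1t\int_0^tpt\,\xi(X(s))\,ds}\Big]\Big\rangle\exp\Big\{-\frac{H(pt\alpha(pt)^{-d})}{\alpha(pt)^{-d}}\Big\}\le\Big\langle\mathbb{E}\Big[\frac1t\int_0^te^{pt\,\xi(X(s))}\,ds\Big]\Big\rangle\exp\Big\{-\frac{H(pt\alpha(pt)^{-d})}{\alpha(pt)^{-d}}\Big\}=e^{H(pt)}\exp\Big\{-\frac{H(pt\alpha(pt)^{-d})}{\alpha(pt)^{-d}}\Big\}. \]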
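In the case $p<2$, a similar calculation shows that
\[ \Big\langle\mathbb{E}\Big[e^{2\int_0^t\xi_{pt}(X(s))\,ds}\Big]^{p/2}\Big\rangle\le e^{\frac p2H(2t)}\exp\Big\{-\frac{H(pt\alpha(pt)^{-d})}{\alpha(pt)^{-d}}\Big\}. \]
Hence, for the choice $r(t)=t^2$, the first term on the right hand side of (3.17) satisfies
\[ \limsup_{t\uparrow\infty}\frac{\alpha(pt)^2}{pt}\log\Big(\Big\langle\mathbb{E}\Big[e^{2\int_0^t\xi_{pt}(X(s))\,ds}\Big]^{p/2}\Big\rangle e^{-pr(t)}\Big)=-\infty, \tag{3.18} \]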
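where we use that $H(t)/t$ and $\alpha(t)$ are slowly varying. In (3.17), take the logarithm, multiply by $\alpha(pt)^2/(pt)$ and let $t\uparrow\infty$. Then we have that
\[ \limsup_{t\uparrow\infty}\frac{\alpha(pt)^2}{pt}\log\big\langle u^{\xi_{pt}}(t,0)^p\big\rangle\le\frac{C}{R^2}+\limsup_{t\uparrow\infty}\frac{\alpha(pt)^2}{pt}\log\Big\langle\exp\big\{pt\lambda^{\mathrm d}_{0;2R\alpha(pt)}(\xi_{pt})\big\}\Big\rangle, \tag{3.19} \]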
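where we also used that $r(t)^{pd+d}=e^{o(t\alpha(pt)^{-2})}$ as $t\uparrow\infty$. Now we estimate the right hand side of (3.19). We denote by $\lambda^{\mathrm d,k}_{0;R\alpha(pt)}(\xi_{pt})$ the $k$th eigenvalue of $\Delta^{\mathrm d}+\xi_{pt}$ in the box $B_{R\alpha(pt)}$ with zero boundary condition. Using an eigenfunction expansion as in (3.7), we get
\[ \begin{aligned} \Big\langle\exp\big\{pt\lambda^{\mathrm d}_{0;R\alpha(pt)}(\xi_{pt})\big\}\Big\rangle &\le\Big\langle\sum_k\exp\big\{pt\lambda^{\mathrm d,k}_{0;R\alpha(pt)}(\xi_{pt})\big\}\Big\rangle=\sum_{x\in B_{R\alpha(pt)}}\big\langle p^{\xi_{pt}}_{R\alpha(pt)}(pt,x,x)\big\rangle\\ &\le\sum_{x\in B_{R\alpha(pt)}}\big\langle u^{\xi_{pt}}_{R\alpha(pt)}(pt,x)\big\rangle\\ &\le\sum_{x\in B_{R\alpha(pt)}}\Big\langle\mathbb{E}_x\Big[e^{\int_0^{pt}\xi_{pt}(X(s))\,ds}\,\mathbb{1}\{\operatorname{supp}(\ell_{pt})\subseteq x+B_{2R\alpha(pt)}\}\Big]\Big\rangle\\ &\le|B_{R\alpha(pt)}|\,\big\langle u^{\xi_{pt}}_{2R\alpha(pt)}(pt,0)\big\rangle, \end{aligned} \tag{3.20} \]
where we also used the shift-invariance. Recall that $|B_{R\alpha(pt)}|\le e^{o(t\alpha(pt)^{-2})}$. We finally use Proposition 3.1(i) for $pt$ instead of $t$ to complete the proof of the upper bound in (3.4).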
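The first three members of the chain (3.20) are deterministic spectral facts: the principal exponential is bounded by the trace of the semigroup, and the trace by its total mass. A small numerical sketch, under the simplifying assumptions (not in the text) that $d=1$, the potential vanishes, and the box is relabelled $\{1,\dots,n\}$ so that the Dirichlet eigenpairs are explicit; the names are hypothetical.

```python
import math


def dirichlet_eigenpairs(n):
    """Eigenvalues and orthonormal eigenvectors of the 1-d Dirichlet Laplacian on n sites."""
    norm = math.sqrt(2.0 / (n + 1))
    pairs = []
    for k in range(1, n + 1):
        lam = 2.0 * math.cos(k * math.pi / (n + 1)) - 2.0
        vec = [norm * math.sin(j * k * math.pi / (n + 1)) for j in range(1, n + 1)]
        pairs.append((lam, vec))
    return pairs


def chain_terms(s, n):
    """Return (exp(s*lambda_1), sum_k exp(s*lambda_k), sum_x u(s, x)) for V = 0."""
    pairs = dirichlet_eigenpairs(n)
    principal = math.exp(s * max(lam for lam, _ in pairs))
    trace = sum(math.exp(s * lam) for lam, _ in pairs)              # sum_x p(s, x, x)
    total_mass = sum(math.exp(s * lam) * sum(vec) ** 2 for lam, vec in pairs)  # sum_x u(s, x)
    return principal, trace, total_mass
```

For every $s>0$ the three returned numbers are nondecreasing from left to right, and the last one is at most $n$ because each $u(s,x)$ is a probability when $V\equiv0$.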
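3.2. Proof of Proposition 3.1(i). Recall the local times of the walk, $\ell_t(z)=\int_0^t\mathbb{1}\{X(s)=z\}\,ds$. Note that $\int_0^tV(X(s))\,ds=\langle V,\ell_t\rangle$, where $\langle\cdot,\cdot\rangle$ stands for the inner product on $\ell^2(\mathbb{Z}^d)$. From (3.5) with $V=\xi_t$, we have
\[ u^{\xi_t}_{R\alpha(t)}(t,0)=\mathbb{E}_0\big[e^{\langle\xi_t,\ell_t\rangle}\,\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\big]. \tag{3.21} \]
Recall from (1.5) that $\langle e^{l\xi(x)}\rangle=e^{H(l)}$ for any $l\in\mathbb{R}$ and $x\in\mathbb{Z}^d$. We carry out the expectation with respect to the potential, and obtain, using Fubini's theorem and the independence of the potential variables,
\[ \begin{aligned} \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle &=e^{-\alpha(t)^dH(t/\alpha(t)^d)}\,\mathbb{E}_0\Big[\Big\langle\exp\Big\{\sum_{x\in B_{R\alpha(t)}}\ell_t(x)\xi(x)\Big\}\Big\rangle\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]\\ &=\mathbb{E}_0\Big[\exp\Big\{\sum_{x\in B_{R\alpha(t)}}\Big[H(\ell_t(x))-\frac{\ell_t(x)}{t}\,\alpha(t)^dH\big(t/\alpha(t)^d\big)\Big]\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big], \end{aligned} \tag{3.22} \]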
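where we also use that $\sum_{x\in\mathbb{Z}^d}\ell_t(x)=t$. We now split the sum in the exponent into a part where we have some control over the size of the local times, and a part with very large local times. Introducing
\[ \begin{aligned} \mathcal{H}^{(t)}_M(\ell_t)&=\frac{\alpha(t)^2}{t}\sum_{x\in B_{R\alpha(t)}}\Big[H(\ell_t(x))-\frac{\ell_t(x)}{t}\,\alpha(t)^dH\big(t/\alpha(t)^d\big)\Big]\mathbb{1}\Big\{\ell_t(x)\le\frac{Mt}{\alpha(t)^d}\Big\},\\ \mathcal{R}^{(t)}_M(\ell_t)&=\sum_{x\in B_{R\alpha(t)}}\Big[H(\ell_t(x))-\frac{\ell_t(x)}{t}\,\alpha(t)^dH\big(t/\alpha(t)^d\big)\Big]\mathbb{1}\Big\{\ell_t(x)>\frac{Mt}{\alpha(t)^d}\Big\}, \end{aligned} \tag{3.23} \]
we have
\[ \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle=\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\mathcal{H}^{(t)}_M(\ell_t)+\mathcal{R}^{(t)}_M(\ell_t)\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]. \tag{3.24} \]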
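We will see that $\mathcal{H}^{(t)}_M$ gives the main term and $\mathcal{R}^{(t)}_M$ a small remainder in the limit $t\to\infty$, followed by $M\to\infty$. To separate the two factors coming from this split, we use Hölder's inequality. For any small $\eta>0$, we have
\[ \begin{aligned} \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle\le{}&\mathbb{E}_0\Big[\exp\Big\{(1+\eta)\frac{t}{\alpha(t)^2}\mathcal{H}^{(t)}_M(\ell_t)\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]^{\frac{1}{1+\eta}}\\ &\times\mathbb{E}_0\Big[\exp\Big\{\frac{1+\eta}{\eta}\mathcal{R}^{(t)}_M(\ell_t)\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]^{\frac{\eta}{1+\eta}}. \end{aligned} \tag{3.25} \]
We show later that the second factor is asymptotically negligible; more precisely, we show that
\[ \limsup_{M\to\infty}\limsup_{t\to\infty}\frac{\alpha(t)^2}{t}\log\mathbb{E}_0\Big[\exp\big\{C\mathcal{R}^{(t)}_M(\ell_t)\big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]\le0,\qquad\text{for }C>0. \tag{3.26} \]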
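Let us first focus on the first term. Recall the definition of $\alpha(t)$ in (1.10) and the uniform convergence claimed in Proposition 1.1. For every $\varepsilon>0$ and all sufficiently large times $t$, we obtain the upper bound
\[ \begin{aligned} \mathcal{H}^{(t)}_M(\ell_t)&\le\frac{\alpha(t)^2}{t}\,\kappa\Big(\frac{t}{\alpha(t)^d}\Big)\,\rho\sum_{x\in B_{R\alpha(t)}}\frac{\ell_t(x)}{t/\alpha(t)^d}\log\frac{\ell_t(x)}{t/\alpha(t)^d}\,\mathbb{1}\Big\{\ell_t(x)\le\frac{Mt}{\alpha(t)^d}\Big\}+\varepsilon(2R)^d\alpha(t)^d\,\frac{\alpha(t)^2}{t}\,\kappa\Big(\frac{t}{\alpha(t)^d}\Big)\\ &\le\rho\sum_{x\in B_{R\alpha(t)}}\tfrac1t\ell_t(x)\log\Big(\alpha(t)^d\,\tfrac1t\ell_t(x)\Big)+\varepsilon(2R)^d\\ &=G_t\big(\tfrac1t\ell_t\big)+\varepsilon(2R)^d, \end{aligned} \tag{3.27} \]
where we dropped the indicator, which we can do for $M\ge1$ since $y\log y\ge0$ for $y>M$, and let
\[ G_t(\mu)=\rho\sum_{x\in B_{R\alpha(t)}}\mu(x)\log\big(\alpha(t)^d\mu(x)\big),\qquad\text{for }\mu\in\mathcal{M}(\mathbb{Z}^d). \tag{3.28} \]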
The further analysis makes crucial use of an inequality derived in [BHK05]. In [BHK05], the law of the local times is investigated, and an explicit formula is derived for the density of the local times on the range of the random walk. This explicit formula makes it possible to give strong upper bounds on exponential functionals:
Proposition 3.3. For any finite set $B\subseteq\mathbb{Z}^d$ and any measurable functional $F\colon\mathcal{M}_1(B)\to\mathbb{R}$,
\[ \mathbb{E}_0\Big[e^{tF(\frac1t\ell_t)}\,\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B\}\Big]\le\exp\Big\{t\sup_{\mu\in\mathcal{M}_1(B)}\Big[F(\mu)-\frac12\sum_{x\sim y}\Big(\sqrt{\mu(x)}-\sqrt{\mu(y)}\Big)^2\Big]\Big\}\,(2dt)^{|B|}\,|B|. \tag{3.29} \]
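We substitute (3.27) into (3.25) and apply (3.29) for $F=(1+\eta)G_t/\alpha(t)^2$ and $B=B_{R\alpha(t)}$ and note that $(2dt)^{|B_{R\alpha(t)}|}\le e^{o(t/\alpha(t)^2)}$. Hence, we obtain that the first term on the right hand side of (3.25) can be estimated by
\[ \mathbb{E}_0\Big[\exp\Big\{(1+\eta)\frac{t}{\alpha(t)^2}\mathcal{H}^{(t)}_M(\ell_t)\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]\le e^{o(t/\alpha(t)^2)}\exp\Big\{-t\Big[\chi^{\mathrm d}\Big(\frac{\tilde\rho}{\alpha(t)^2}\Big)-\frac{1}{\alpha(t)^2}\Big(\tilde\rho\,d\log\alpha(t)+\varepsilon(2R)^d\Big)\Big]\Big\}, \tag{3.30} \]
where we abbreviated $\tilde\rho=(1+\eta)\rho$ and introduced
\[ \chi^{\mathrm d}(\delta)=\inf_{\mu\in\mathcal{M}_1(\mathbb{Z}^d)}\Big[\frac12\sum_{x\sim y}\Big(\sqrt{\mu(x)}-\sqrt{\mu(y)}\Big)^2-\delta\sum_{x\in\mathbb{Z}^d}\mu(x)\log\mu(x)\Big],\qquad\text{for }\delta>0, \tag{3.31} \]
the discrete variant of $\chi(\rho)$ in (1.24), which was studied in Gärtner and den Hollander [GH99]. In Proposition 3 and the subsequent remark they show that
\[ \chi^{\mathrm d}(\delta)=\frac{d\delta}{2}\Big[\log\frac{\pi e^2}{\delta}+o(1)\Big],\qquad\text{as }\delta\downarrow0. \]
Substituting this into (3.30), we obtain
\[ \limsup_{t\to\infty}\frac{\alpha(t)^2}{t}\log\mathbb{E}_0\Big[\exp\Big\{(1+\eta)\frac{t}{\alpha(t)^2}\mathcal{H}^{(t)}_M(\ell_t)\Big\}\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big]\le-\frac{\tilde\rho\,d}{2}\log\frac{\pi e^2}{\tilde\rho}+\varepsilon(2R)^d=-\chi(\tilde\rho)+\varepsilon(2R)^d, \tag{3.32} \]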
as can be seen from Proposition 1.12. Using (3.32) together with (3.26) in (3.25) and letting $M\to\infty$, $\varepsilon\downarrow0$ and $\eta\downarrow0$, gives the desired upper bound and finishes the proof of Proposition 3.1(i) subject to the proof of (3.26).
It remains to investigate the second term in (3.25), i.e., to prove (3.26). We first estimate $\mathcal{R}^{(t)}_M(\ell_t)$ (recall (3.24)) from above in terms of a nice functional of $\ell_t$. Since we have to work uniformly for arbitrarily large local times, it is not possible to estimate against a functional of the form $\sum_x\ell_t(x)\log\ell_t(x)$, but we succeed in finding an upper bound of the form $\big(\sum_x\ell_t(x)^q\big)^{1/q}$ for some $q>1$ close to 1. Then Proposition 2.1 can be applied and yields (3.26). We fix $\delta\in(0,\frac12]$ and note that there exist $A>1$, $t_0>0$ such that
\[ \frac{H(ty)-yH(t)}{\kappa(t)}\le Ay^{1+\delta^2/3}\qquad\text{for any }y\ge1\text{ and }t>t_0. \tag{3.33} \]
Indeed, this follows from [BGT87, Theorem 3.8.6(a)]. Therefore, we obtain that
\[ H(\ell_t(x))-\frac{\ell_t(x)}{t}\,\alpha(t)^dH\Big(\frac{t}{\alpha(t)^d}\Big)\le A\,\kappa\big(t\alpha(t)^{-d}\big)\Big(\frac{\ell_t(x)}{t\alpha(t)^{-d}}\Big)^{1+\delta^2/3}. \tag{3.34} \]
We pick now $\varepsilon>0$ such that
\[ 1+\delta^2/3-\varepsilon=1/(1+\delta), \tag{3.35} \]
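which implies that $\varepsilon<\delta$. For any $\mu\in\mathcal{M}_1(\mathbb{Z}^d)$, we use Jensen's inequality together with (3.35) as follows:
\[ \begin{aligned} \sum_{x\colon\mu(x)>M}\mu(x)^{1+\delta^2/3} &=\Big(\sum_{x\colon\mu(x)>M}\mu(x)^{\varepsilon}\Big)\sum_{x\colon\mu(x)>M}\frac{\mu(x)^{\varepsilon}}{\sum_{x\colon\mu(x)>M}\mu(x)^{\varepsilon}}\,\mu(x)^{1+\delta^2/3-\varepsilon}\\ &\le\Big(\sum_{x\colon\mu(x)>M}\mu(x)^{\varepsilon}\Big)\Big(\frac{\sum_{x\colon\mu(x)>M}\mu(x)^{1+\varepsilon}}{\sum_{x\colon\mu(x)>M}\mu(x)^{\varepsilon}}\Big)^{\frac{1}{1+\delta}}\\ &=\Big(\sum_{x\colon\mu(x)>M}\mu(x)^{\varepsilon}\Big)^{1-\frac{1}{1+\delta}}\Big(\sum_{x\colon\mu(x)>M}\mu(x)^{1+\varepsilon}\Big)^{\frac{1}{1+\delta}}\\ &\le M^{\frac{\delta}{1+\delta}(\varepsilon-1)}\Big(\sum_{x\colon\mu(x)>M}\mu(x)\Big)^{\frac{\delta}{1+\delta}}M^{\frac{\varepsilon-\delta}{1+\delta}}\Big(\sum_{x}\mu(x)^{1+\delta}\Big)^{\frac{1}{1+\delta}}\\ &\le M^{\varepsilon-\frac{2\delta}{1+\delta}}\Big(\sum_{x}\mu(x)^{1+\delta}\Big)^{\frac{1}{1+\delta}}, \end{aligned} \tag{3.36} \]
where we used that $\mu^{\varepsilon}\le M^{\varepsilon-1}\mu$ and $\mu^{1+\varepsilon}\le M^{\varepsilon-\delta}\mu^{1+\delta}$ on $\{\mu>M\}$ (recall that $\varepsilon<\delta<1$), and that the sum of $\mu$ over $\{\mu>M\}$ is not bigger than one, as the exponent $\delta/(1+\delta)$ is positive and $\mu\in\mathcal{M}_1(\mathbb{Z}^d)$. We write $q=1+\delta$. We apply the above to $\mu=\frac1t\ell_t$ and $M$ replaced by $M/\alpha(t)^d$, to obtain that
\[ \sum_x\Big(\frac{\ell_t(x)}{t\alpha(t)^{-d}}\Big)^{1+\delta^2/3}\mathbb{1}\Big\{\ell_t(x)>\frac{Mt}{\alpha(t)^d}\Big\}\le\alpha(t)^{d(1+\delta^2/3)}\Big(\frac{M}{\alpha(t)^d}\Big)^{\varepsilon-\frac{2\delta}{1+\delta}}t^{-1}\|\ell_t\|_q=M^{\varepsilon-\frac{2\delta}{1+\delta}}\,\alpha(t)^{d(1+\frac{\delta}{1+\delta})}\,t^{-1}\|\ell_t\|_q. \tag{3.37} \]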
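We recall (3.24), use (3.34) and the definition of $\alpha(t)$ in (1.10). With the help of (3.37) we arrive at
\[ \mathcal{R}^{(t)}_M(\ell_t)\le A\,\kappa\Big(\frac{t}{\alpha(t)^d}\Big)\sum_x\Big(\frac{\ell_t(x)}{t\alpha(t)^{-d}}\Big)^{1+\delta^2/3}\mathbb{1}\Big\{\ell_t(x)>\frac{Mt}{\alpha(t)^d}\Big\}\le AM^{\varepsilon-\frac{2\delta}{1+\delta}}\,\alpha(t)^{-(2+d)+d(1+\frac{\delta}{1+\delta})}\,\|\ell_t\|_q=AM^{\varepsilon-\frac{2\delta}{1+\delta}}\,\alpha(t)^{-\frac1q[d+(2-d)q]}\,\|\ell_t\|_q, \]
where we recall that $q=1+\delta$ and therefore $-(2+d)+d\big(1+\frac{\delta}{1+\delta}\big)=-\frac1q[d+(2-d)q]$. Put $\theta=AM^{\varepsilon-\frac{2\delta}{1+\delta}}$, and observe that $\theta\downarrow0$ as $M\uparrow\infty$ for $\delta>0$ small enough, since $\varepsilon-\frac{2\delta}{1+\delta}=\frac{\delta^2}{3}-\frac{\delta}{1+\delta}<0$ for $\delta>0$ small enough. Hence, (3.26) follows immediately from Proposition 2.1. This completes the proof of Proposition 3.1(i).
3.3. Proof of Proposition 3.1(ii). Recall from (1.21) the rescaled version, $\bar\xi_t$, of the vertically shifted potential, $\xi_t$, defined in (3.3). Furthermore, introduce the normalised, scaled version of the random walk local times,
\[ L_t(x):=\frac{\alpha(t)^d}{t}\,\ell_t\big(\lfloor x\alpha(t)\rfloor\big),\qquad\text{for }x\in\mathbb{R}^d, \]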
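and note that $L_t$ is an $L^1$-normalised random step function. Note that $\operatorname{supp}(L_t)\subseteq Q_R$ if $\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}$, where we abbreviated $Q_R=[-R,R]^d$. We start from (3.22). Let
\[ \widehat H_t(y)=\frac{H\big(y\,\frac{t}{\alpha(t)^d}\big)-yH\big(\frac{t}{\alpha(t)^d}\big)}{\kappa\big(\frac{t}{\alpha(t)^d}\big)},\qquad\text{for }t,y>0, \]
and recall that $\widehat H_t$ converges to $\widehat H$, uniformly on all compact sets. Now the exponent on the right hand side of (3.22) can be rewritten as follows.
\[ \begin{aligned} -\alpha(t)^dH\Big(\frac{t}{\alpha(t)^d}\Big)+\sum_{z\in B_{R\alpha(t)}}H(\ell_t(z)) &=-\alpha(t)^dH\Big(\frac{t}{\alpha(t)^d}\Big)\int_{Q_R}L_t(x)\,dx+\alpha(t)^d\int_{Q_R}H\Big(\frac{t}{\alpha(t)^d}L_t(x)\Big)\,dx\\ &=\alpha(t)^d\,\kappa\Big(\frac{t}{\alpha(t)^d}\Big)\int_{Q_R}\widehat H_t\big(L_t(x)\big)\,dx=\frac{t}{\alpha(t)^2}\,\mathcal{H}^{(t)}_R(L_t), \end{aligned} \]
where we use the definition of $\alpha(t)$ in (1.10) and introduce the functional
\[ \mathcal{H}^{(t)}_R(f)=\int_{Q_R}\widehat H_t\big(f(x)\big)\,dx. \]
Hence,
\[ \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle=\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\mathcal{H}^{(t)}_R(L_t)\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]. \tag{3.38} \]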
A key ingredient in the proof of Proposition 3.1(ii) is the large deviation principle for $(L_t\colon t>0)$ as formulated in the following proposition:
Proposition 3.4. Fix $R>0$. Under $\mathbb{P}_0\{\,\cdot\,,\operatorname{supp}(L_t)\subseteq Q_R\}$, the rescaled local times process $(L_t\colon t>0)$ satisfies a large deviation principle as $t\uparrow\infty$ on the set of $L^1$-normalized functions $Q_R\to\mathbb{R}$, equipped with the weak topology induced by test integrals against all continuous functions, where the speed of the large deviation principle is $t\alpha(t)^{-2}$, and the rate function is $g^2\mapsto\|\nabla g\|_2^2$ on the set of all $g\in H^1(\mathbb{R}^d)$ with $\operatorname{supp}(g)\subseteq Q_R$, and is equal to $\infty$ outside this set.
Proof. This large deviation principle is stated in [GKS06, Lemma 3.2] in the discrete-time case, and is proved in [GKS06, Sect. 6]. The proof in the continuous-time case is very similar, so we only briefly sketch the argument. Let $f\colon Q_R\to\mathbb{R}$ be continuous. The core of the argument is to show that
\[ \lim_{t\uparrow\infty}\frac{\alpha(t)^2}{t}\log\mathbb{E}_0\Big[\exp\big\{t\alpha(t)^{-2}\langle f,L_t\rangle\big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]=\lambda_R(f), \tag{3.39} \]
where $\lambda_R(f)$ is the principal eigenvalue of $\Delta+f$ in $H_0^1(Q_R)$, see also (4.12). The rest of the argument is an application of the Gärtner-Ellis theorem, see [GKS06, Sect. 6] for details. To show (3.39), consider the discrete approximation
\[ f_t(z)=\alpha(t)^d\int_{[0,\alpha(t)^{-1})^d}f\big(x+z\alpha(t)^{-1}\big)\,dx,\qquad\text{for }z\in\mathbb{Z}^d. \]
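Then
\[ \frac{t}{\alpha(t)^2}\langle f,L_t\rangle=\frac{1}{\alpha(t)^2}\int_0^tf_t(X(s))\,ds=\int_0^{t/\alpha(t)^2}f_t\big(X(s\alpha(t)^2)\big)\,ds. \]
Denoting by $\mu_t$ the normalised occupation measure of a Brownian motion $\{B(s)\colon s\ge0\}$, an application of the local functional central limit theorem yields that
\[ \mathbb{E}_0\Big[\exp\big\{t\alpha(t)^{-2}\langle f,L_t\rangle\big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]=\mathbb{E}_0\Big[\exp\Big\{\int_0^{t/\alpha(t)^2}f_t\big(\alpha(t)B(s)\big)\,ds\Big\}\mathbb{1}\{\operatorname{supp}(\mu_{t/\alpha(t)^2})\subseteq Q_R\}\Big]e^{o(t/\alpha(t)^2)}. \]
Since $f_t(\alpha(t)\,\cdot\,)\to f(\,\cdot\,)$ uniformly on $Q_R$, (3.39) follows from
\[ \mathbb{E}\Big[\exp\Big\{\int_0^Tf\big(B(s)\big)\,ds\Big\}\mathbb{1}\{\operatorname{supp}(\mu_T)\subseteq Q_R\}\Big]=\exp\big\{T\lambda_R(f)+o(T)\big\},\qquad\text{for }T\uparrow\infty, \]
see, e.g. [S98, Theorem 3.1.2], with $T=t\alpha(t)^{-2}$.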
In order to apply the large deviation principle in Proposition 3.4 to obtain a lower bound for the right hand side of (3.38), we need the lower-bound half of Varadhan's lemma, and we have to replace $\mathcal{H}^{(t)}_R$ by its limiting version
\[ \mathcal{H}_R(f)=\rho\int_{Q_R}f(x)\log f(x)\,dx. \tag{3.40} \]
However, the latter is technically not so easy. Inserting the indicator on the event $\{\|L_t\|_\infty<M\}$ for any $M>1$ would make it possible to use the locally uniform convergence of $\widehat H_t(y)$ towards $\rho y\log y$, but this event is not open in the topology of the large deviation principle. Therefore, similarly to the proof of the upper bound, we have to split $\mathcal{H}^{(t)}_R(L_t)$ into the sum of $\mathcal{H}_R(L_t)$ and a remainder term, separate these two from each other by the use of Hölder's inequality and apply Proposition 2.1 to the remainder term.
Let us turn to the details. Since $H$ is convex with $H(0)=0$, we have $H(yt)\ge yH(t)$ for all $t>0$ and all $y\ge1$. Therefore, $\widehat H_t(f(x))\ge0$ on $\{x\colon f(x)>M\}$ for any $M>1$. Hence, we may estimate
\[ \mathcal{H}^{(t)}_R(f)\ge\int_{Q_R}\widehat H_t\big(f(x)\big)\mathbb{1}\{f(x)\le M\}\,dx=\rho\int_{Q_R}\mathbb{1}\{f(x)\le M\}f(x)\log f(x)\,dx+o(1)=\mathcal{H}_R(f)-\rho\int_{Q_R}\mathbb{1}\{f(x)>M\}f(x)\log f(x)\,dx+o(1). \]
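The remainder can be estimated, for any $\delta>0$, as follows. For any $f\colon Q_R\to[0,\infty)$ satisfying $\int f=1$,
\[ \begin{aligned} \int_{f>M}f\log f&=\frac2\delta\int_{f>M}f\log f^{\delta/2}\le\frac2\delta\Big(\int_{f>M}f\Big)\log\frac{\int_{f>M}f^{1+\delta/2}}{\int_{f>M}f}\le\frac2\delta\Big(\int_{f>M}f\Big)\log\frac{M^{-\delta/2}\int_{f>M}f^{1+\delta}}{\int_{f>M}f}\\ &\le\frac2\delta\Big(\int_{f>M}f\Big)\Big(\frac{M^{-\delta/2}\int_{f>M}f^{1+\delta}}{\int_{f>M}f}\Big)^{\frac{1}{1+\delta}}\le\frac2\delta M^{-\frac{\delta}{2+2\delta}}\Big(\int_{f>M}f\Big)^{\frac{\delta}{1+\delta}}\|f\|_q\le\frac2\delta M^{-\frac{\delta}{2+2\delta}}\|f\|_q, \end{aligned} \]
where we put $q=1+\delta$. Altogether, we have, abbreviating $\theta=\frac{2\rho}{\delta}M^{-\frac{\delta}{2+2\delta}}$,
\[ \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle\ge\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\big[\mathcal{H}_R(L_t)-\theta\|L_t\|_q\big]\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]e^{o(t/\alpha(t)^2)}. \tag{3.41} \]
Similarly to the proof of the upper bound, the main contribution will turn out to come from $\mathcal{H}_R$, and the $q$-norm is a small remainder. In order to separate the two from each other, we use Hölder's inequality to estimate, for some small $\eta>0$,
\[ \begin{aligned} \mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}(1-\eta)\mathcal{H}_R(L_t)\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]\le{}&\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\big[\mathcal{H}_R(L_t)-\theta\|L_t\|_q\big]\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]^{1-\eta}\\ &\times\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\frac{1-\eta}{\eta}\theta\|L_t\|_q\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]^{\eta}. \end{aligned} \tag{3.42} \]
This effectively yields a lower bound on the expected value in (3.41) of the form
\[ \begin{aligned} \big\langle u^{\xi_t}_{R\alpha(t)}(t,0)\big\rangle\ge{}&\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}(1-\eta)\mathcal{H}_R(L_t)\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]^{\frac{1}{1-\eta}}\\ &\times\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}\frac{1-\eta}{\eta}\theta\|L_t\|_q\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]^{-\frac{\eta}{1-\eta}}e^{o(t/\alpha(t)^2)}. \end{aligned} \tag{3.43} \]
From Proposition 2.1 it follows that the second expectation on the right is negligible in the limit $t\to\infty$, followed by $M\to\infty$, i.e., $\theta\downarrow0$. Hence, we can concentrate on the first term. To apply the lower-bound half of Varadhan's lemma, see [DZ98, Lemma 4.3.4], we need the following lower semi-continuity property of the function $\mathcal{H}_R$:
Lemma 3.5. Let $f\colon Q_R\to[0,\infty)$ be continuous. Then $\mathcal{H}_R$ is lower semi-continuous in $f$ in the topology induced by pairing with all continuous functions $Q_R\to[0,\infty)$.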
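Proof. Let $(f_n\colon n\in\mathbb{N})$ be a family in $L^1(Q_R)$ such that $\langle f_n,\psi\rangle\to\langle f,\psi\rangle$ as $n\to\infty$ for any continuous function $\psi\colon Q_R\to\mathbb{R}$. We have to show that $\liminf_{n\to\infty}\mathcal{H}_R(f_n)\ge\mathcal{H}_R(f)$. For any $s\in(0,\infty)$ we denote by $g_s$ the tangent to $y\mapsto\varphi(y):=\rho y\log y$ in $s$, i.e., $g_s(y)=\rho(1+\log s)y-\rho s$, for all $y\in\mathbb{R}$. By convexity we have $g_s\le\varphi$ for any $s\in(0,\infty)$. Therefore, for any $0<\varepsilon<1/e$,
\[ \mathcal{H}_R(f_n)=\int_{Q_R}\varphi\big(f_n(x)\big)\,dx\ge\int_{Q_R}g_{f(x)\vee\varepsilon}\big(f_n(x)\big)\,dx=\rho\big\langle1+\log(f\vee\varepsilon),f_n\big\rangle-\rho\big\langle f\vee\varepsilon,\mathbb{1}\big\rangle. \]
Letting $n\to\infty$, we obtain, using the boundedness and continuity of $\log(f\vee\varepsilon)$,
\[ \begin{aligned} \liminf_{n\to\infty}\mathcal{H}_R(f_n)&\ge\rho\big\langle1+\log(f\vee\varepsilon),f\big\rangle-\rho\big\langle f\vee\varepsilon,\mathbb{1}\big\rangle\\ &\ge\rho\int_{Q_R}f(x)\log f(x)\,\mathbb{1}\{f(x)>\varepsilon\}\,dx+\int_{Q_R}g_\varepsilon\big(f(x)\big)\mathbb{1}\{f(x)\le\varepsilon\}\,dx\\ &\ge\rho\int_{Q_R}f(x)\log f(x)\,dx+\rho\int_{Q_R}\big[f(x)(1+\log\varepsilon)-\varepsilon\big]\mathbb{1}\{f(x)\le\varepsilon\}\,dx. \end{aligned} \]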
The second summand is bounded from below by Leb(Q R )ε log ε, which converges to zero as ε ↓ 0. This completes the proof.
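The two facts driving this proof, namely that each tangent $g_s$ lies below $y\mapsto\rho y\log y$ and that the correction integrand is bounded below by $\varepsilon\log\varepsilon$ on $\{f\le\varepsilon\}$, are easy to probe numerically. The following is only an illustrative sketch on a finite grid; the function names and the particular values of $\rho$, $s$ and $\varepsilon$ are assumptions made for the example.

```python
import math


def phi(y, rho):
    """rho * y * log(y), extended by 0 at y = 0."""
    return 0.0 if y == 0.0 else rho * y * math.log(y)


def tangent(s, y, rho):
    """Tangent line to phi at the point s, evaluated at y."""
    return rho * (1.0 + math.log(s)) * y - rho * s


def min_gap(rho, s, ys):
    """Smallest value of phi(y) - g_s(y) over the grid ys (should be about 0, attained at y = s)."""
    return min(phi(y, rho) - tangent(s, y, rho) for y in ys)


def min_correction(eps, fs):
    """Smallest value of f * (1 + log(eps)) - eps over grid points f in [0, eps]."""
    return min(f * (1.0 + math.log(eps)) - eps for f in fs if f <= eps)
```

On a grid containing $s$, `min_gap` returns a value that is zero up to rounding, and `min_correction(eps, ...)` returns $\varepsilon\log\varepsilon$ whenever $\varepsilon<1/e$ and the grid contains $\varepsilon$, because the integrand is then decreasing in $f$.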
Now we can apply [DZ98, Lemma 4.3.4] and obtain
\[ \liminf_{t\to\infty}\frac{\alpha(t)^2}{t}\log\mathbb{E}_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}(1-\eta)\mathcal{H}_R(L_t)\Big\}\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big]\ge-\inf\Big\{\|\nabla g\|_2^2-(1-\eta)\mathcal{H}_R(g^2)\colon g\in H^1(\mathbb{R}^d)\cap C(\mathbb{R}^d),\ \|g\|_2=1,\ \operatorname{supp}(g)\subseteq Q_R\Big\}. \]
Letting $\eta\downarrow0$ and $R\uparrow\infty$, it is easy to see that the right hand side tends to $-\chi(\rho)$ defined in (1.24). Indeed, use appropriate continuous cut-off versions $g^{(R)}_{(1-\eta)\rho}$ of the minimiser $g_{(1-\eta)\rho}$ in (1.27) to verify this claim. Using this on the right hand side of (3.43) and recalling Proposition 2.1, we see that the proof of the lower bound in Proposition 3.1(ii) is finished.
4. The Almost-Sure Asymptotics: Proof of Theorem 1.8
We again derive upper and lower bounds, following the strategy in [BK01, Sect. 5]. Recall the scale function $\beta(t)$ defined in (1.16) and let
\[ \xi_{\beta(t)}(z)=\xi(z)-\frac{\alpha(\beta(t))^d}{\beta(t)}\,H\Big(\frac{\beta(t)}{\alpha(\beta(t))^d}\Big) \tag{4.1} \]
denote the appropriately vertically shifted potential (compare to (3.3)). Then Theorem 1.8 is equivalent to the assertion
\[ \lim_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log u^{\xi_{\beta(t)}}(t,0)=-\tilde\chi(\rho),\qquad\text{almost surely}, \tag{4.2} \]
where $\tilde\chi(\rho)=\rho\big(d-\frac d2\log\frac{\rho}{\pi}+\log\frac{\rho}{e}\big)=-\sup\{\lambda(\psi)\colon\psi\in C(\mathbb{R}^d),\ \mathcal{L}(\psi)\le1\}$, see Sect. 1.6.2.
4.1. Proof of the upper bound in (4.2). Let $r(t)=t\log t$ and apply Lemma 3.2 with $V=\xi_{\beta(t)}$ and with $R$ replaced by $R\alpha(\beta(t))$. Furthermore, take logarithms, multiply with $\alpha(\beta(t))^2/t$ and let $t\uparrow\infty$. As in (3.18), one shows that the first term is negligible. Hence, we obtain that
\[ \limsup_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log u^{\xi_{\beta(t)}}(t,0)\le\frac{C}{R^2}+\limsup_{t\uparrow\infty}\alpha(\beta(t))^2\max_{z\in B(t)}\lambda_{z;2R\alpha(\beta(t))}(\xi_{\beta(t)}), \]
where $B(t)=B_{R\alpha(\beta(t))}(t)$ (recall the definition $B_R(t)=B_{r(t)+2R}$ from Lemma 3.2). Let $(\lambda_i(t)\colon i=1,\dots,N(t))$, with $N(t)=|B_R(t)|$, be a deterministic enumeration of the random variables $\lambda_{z;2R\alpha(\beta(t))}(\xi_{\beta(t)})$ with $z\in B(t)$. Note that these random variables are identically distributed (but not independent) and that, by (3.20) and Proposition 3.1(i), their exponential moments are estimated by
\[ \limsup_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{\beta(t)}\log\big\langle e^{\beta(t)\lambda_1(t)}\big\rangle\le-\chi(\rho). \tag{4.3} \]
We next show that, for any $\varepsilon>0$, almost surely,
\[ \limsup_{t\uparrow\infty}\alpha(\beta(t))^2\max_{i=1}^{N(t)}\lambda_i(t)\le-\tilde\chi(\rho)+\varepsilon, \tag{4.4} \]
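which completes the proof of the upper bound in (4.2). To prove (4.4), one first realizes that it suffices to show (4.4) only for $t\in\{e^n\colon n\in\mathbb{N}\}$, since the functions $t\mapsto\alpha(t)$, $t\mapsto\beta(t)$, and $t\mapsto H(t)/t$ are slowly varying, and $t\mapsto N(t)$, $R\mapsto\lambda_R(\xi_{\beta(t)})$ are increasing. Let
\[ p_n=\operatorname{Prob}\Big(\max_{i=1}^{N(e^n)}\lambda_i(e^n)\ge\frac{-\tilde\chi(\rho)+\varepsilon}{\alpha(\beta(e^n))^2}\Big). \]
We recall that $\beta(e^n)\alpha(\beta(e^n))^{-2}\sim dn$. Using Chebyshev's inequality and (4.3), we estimate, for any $k>0$,
\[ p_n\le N(e^n)\operatorname{Prob}\Big(e^{k\beta(e^n)\lambda_1(e^n)}\ge e^{-k\beta(e^n)\alpha(\beta(e^n))^{-2}(\tilde\chi(\rho)-\varepsilon)}\Big)\le e^{n(d+o(1))}\,e^{nkd(\tilde\chi(\rho)-\varepsilon)}\,\big\langle e^{k\beta(e^n)\lambda_1(e^n)}\big\rangle. \tag{4.5} \]
In order to evaluate the last expectation, we intend to apply (4.3) with $\beta(t)$ replaced by $k\beta(t)$. For this purpose, we note that we can replace $\alpha(\beta(t))$ by $\alpha(k\beta(t))$ in (4.3), since $\alpha$ is slowly varying. Also,
\[ k\beta(t)\lambda_{R\alpha(k\beta(t))}(\xi_{\beta(t)})=k\beta(t)\lambda_{R\alpha(k\beta(t))}(\xi_{k\beta(t)})-k\beta(t)\Big[\frac{H(\beta(t)\alpha(\beta(t))^{-d})}{\beta(t)\alpha(\beta(t))^{-d}}-\frac{H(k\beta(t)\alpha(k\beta(t))^{-d})}{k\beta(t)\alpha(k\beta(t))^{-d}}\Big], \]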
where we use that by (4.1), the field $\xi_{\beta(t)}-\xi_{k\beta(t)}$ is constant and deterministic. Now we use (1.6) and (1.10), to see that the deterministic term is equal to
\[ \begin{aligned} k\beta(t)\Big[\frac{H(\beta(t)\alpha(\beta(t))^{-d})}{\beta(t)\alpha(\beta(t))^{-d}}-\frac{H(k\beta(t)\alpha(k\beta(t))^{-d})}{k\beta(t)\alpha(k\beta(t))^{-d}}\Big] &=\alpha(\beta(t))^d\Big[kH\big(\beta(t)\alpha(\beta(t))^{-d}\big)-H\big(k\beta(t)\alpha(\beta(t))^{-d}\big)\Big]+o(n)\\ &=-\alpha(\beta(t))^d\big(\widehat H(k)+o(1)\big)\,\kappa\big(\beta(t)\alpha(\beta(t))^{-d}\big)+o(n)\\ &=-\frac{\beta(t)}{\alpha(\beta(t))^2}\big(\rho k\log k+o(1)\big)(1+o(1))+o(n)\\ &=-nd\,(\rho k\log k)(1+o(1)). \end{aligned} \]
Hence,
\[ \big\langle e^{k\beta(e^n)\lambda_1(e^n)}\big\rangle\le\exp\Big\{-nd\big[k\chi(\rho)-\rho k\log k+o(1)\big]\Big\}. \]
Using this in (4.5), we arrive at
\[ p_n\le\exp\Big\{nd\big[1+k(\tilde\chi(\rho)-\varepsilon)-k\chi(\rho)+\rho k\log k+o(1)\big]\Big\}. \]
Choosing $k=\frac1\rho$, we see that $p_n\le e^{-nd(k\varepsilon+o(1))}$. This is summable over $n\in\mathbb{N}$, and the Borel-Cantelli lemma yields that (4.4) holds almost surely. This completes the proof of the upper bound in (4.2).
4.2. Proof of the lower bound in (4.2). Our proof of the lower bound in (4.2) follows the strategy of [BK01, Sect. 5.2]. First we establish that, with probability one, for any sufficiently large $t$, there is, inside a 'macrobox' of radius roughly $t$, centred at the origin, some 'microbox' of radius $R\alpha(\beta(t))$ in which the random field $\xi_{\beta(t)}$ has some shape with optimal spectral properties. Then we obtain a lower bound for the Feynman-Kac formula in (3.2) by requiring that the random walk moves quickly to that box and stays there for approximately $t$ time units. As a result, the contribution from that strategy is basically given by the largest eigenvalue of $\Delta^{\mathrm d}+\xi$ in that microbox. Rescaling and letting $R\uparrow\infty$, the lower bound is derived from this.
Let us go to the details. We pick an increasing auxiliary scale function $t\mapsto\gamma_t$ satisfying
\[ \gamma_t=t^{1-o(1)},\qquad t-\gamma_t=t(1+o(1)),\qquad\gamma_t\,\frac{H(\beta(t)\alpha(\beta(t))^{-d})}{\beta(t)\alpha(\beta(t))^{-d}}=o\Big(\frac{t}{\alpha(\beta(t))^2}\Big),\qquad\gamma_t=o\Big(\frac{t}{\alpha(\beta(t))^2}\Big). \tag{4.6} \]
(Note that the second requirement follows from the third.) For example, $\gamma_t=t\alpha(\beta(t))^{-2}\varepsilon_t$ with some suitable $\varepsilon_t\downarrow0$ as a small inverse power of $\log t$ satisfies (4.6). This is obvious in the case where $\lim_{s\uparrow\infty}H(s)/s=0$, and in the case where $\lim_{s\uparrow\infty}H(s)/s=\infty$, it is also clear since $H(s)/s$ diverges only subpolynomially in $s$, while $\beta(t)=(\log t)^{1+o(1)}$ and $\alpha$ is slowly varying.
The crucial step is to show that, in the 'macrobox' $B_{\gamma_t}$, we find an appropriate 'microbox'. To fix some notation, let $Q_R=[-R,R]^d$ and let $C(Q_R)$ denote the set of continuous functions $Q_R\to\mathbb{R}$. We need finite-space versions of the functionals $\mathcal{H}$, $\mathcal{L}$ and $\lambda$ defined in (1.25) and (1.26). Recall the definition of $\mathcal{H}_R$ from (3.40) and define its Legendre transform $\mathcal{L}_R\colon C(Q_R)\to(-\infty,\infty]$ by
\[ \mathcal{L}_R(\psi)=\sup\big\{\langle f,\psi\rangle-\mathcal{H}_R(f)\colon f\in C(Q_R),\ f\ge0,\ \operatorname{supp}f\subseteq\operatorname{supp}\psi\big\}. \tag{4.7} \]
As in the proof of Proposition 1.12 one can see that $f=e^{\psi/\rho-1}$ is the unique maximizer in (4.7) with
\[ \mathcal{L}_R(\psi)=\frac\rho e\int_{Q_R}e^{\psi(x)/\rho}\,dx. \]
Proposition 4.1 (Existence of an optimal microbox). Fix $R>0$ and let $\psi\in C(Q_R)$ satisfy $\mathcal{L}_R(\psi)<1$. Let $\varepsilon>0$. Then, with probability one, there exists $t_0>0$, depending also on $\xi$, such that, for all $t>t_0$, there is $y_t\in B_{\gamma_t}$, depending on $\xi$, such that
\[ \xi_{\beta(t)}(y_t+z)\ge\frac{1}{\alpha(\beta(t))^2}\,\psi\Big(\frac{z}{\alpha(\beta(t))}\Big)-\frac{\varepsilon}{\alpha(\beta(t))^2},\qquad\text{for }z\in B_{R\alpha(\beta(t))}. \tag{4.8} \]
The proof of Proposition 4.1 is deferred to the end of this section. Now we finish the proof of the lower bound in (4.2) subject to Proposition 4.1. Let $R,\varepsilon>0$, and let $\psi\in C(Q_R)$ be twice continuously differentiable with $\mathcal{L}_R(\psi)<1$. Fix $\xi$ not belonging to the exceptional set of Proposition 4.1, i.e., let $t_0$ and $(y_t\colon t>t_0)$ in $B_{\gamma_t}$ be chosen such that (4.8) holds for every $t>t_0$. Fix $t>t_0$. In the Feynman-Kac formula
\[ u^{\xi_{\beta(t)}}(t,0)=\mathbb{E}_0\Big[\exp\Big\{\int_0^t\xi_{\beta(t)}(X(s))\,ds\Big\}\Big], \]
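we obtain a lower bound by requiring that the random walk is at $y_t$ at time $\gamma_t$ and remains within the microbox $B_{y_t,t}=y_t+B_{R\alpha(\beta(t))}$ during the time interval $[\gamma_t,t]$. Using the Markov property at time $\gamma_t$, we obtain by this the lower bound
\[ u^{\xi_{\beta(t)}}(t,0)\ge\mathbb{E}_0\Big[\exp\Big\{\int_0^{\gamma_t}\xi_{\beta(t)}(X(s))\,ds\Big\}\delta_{y_t}(X(\gamma_t))\Big]\times\mathbb{E}_{y_t}\Big[\exp\Big\{\int_0^{t-\gamma_t}\xi_{\beta(t)}(X(s))\,ds\Big\}\mathbb{1}\{\tau_{y_t,t}>t-\gamma_t\}\Big], \tag{4.9} \]
where $\tau_{y_t,t}=\inf\{s>0\colon X(s)\notin B_{y_t,t}\}$ denotes the exit time from the microbox $B_{y_t,t}$. In the first expectation on the right side of (4.9), we estimate $\xi$ from below by its minimum $K=\operatorname{essinf}\xi(0)>-\infty$, and in the second expectation we use (4.8) and shift spatially by $y_t$ to obtain
\[ u^{\xi_{\beta(t)}}(t,0)\ge\exp\Big\{\gamma_t\Big[K-\frac{H(\beta(t)\alpha(\beta(t))^{-d})}{\beta(t)\alpha(\beta(t))^{-d}}\Big]\Big\}\,\mathbb{P}_0\{X(\gamma_t)=y_t\}\times e^{-\varepsilon(t-\gamma_t)\alpha(\beta(t))^{-2}}\,\mathbb{E}_0\Big[\exp\Big\{\int_0^{t-\gamma_t}\psi_t(X(s))\,ds\Big\}\mathbb{1}\{\tau_{0,t}>t-\gamma_t\}\Big], \tag{4.10} \]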
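where we have denoted $\psi_t(\cdot)=\alpha(\beta(t))^{-2}\psi(\,\cdot\,\alpha(\beta(t))^{-1})$. By our choice in (4.6), the first term on the right side of (4.10) is $e^{o(t\alpha(\beta(t))^{-2})}$. Now, by choosing a path from the origin to $y_t$ consisting of $k$ steps for $k=\gamma_t$ or $k=\gamma_t+1$,
\[ \mathbb{P}_0\{X(\gamma_t)=y_t\}\ge\Big(\frac{1}{2d}\Big)^k\,\mathbb{P}\big\{\sigma(1)+\dots+\sigma(k)\le\gamma_t<\sigma(1)+\dots+\sigma(k+1)\big\}, \]
where $\sigma(1),\sigma(2),\dots$ are independent exponential random variables with mean $1/2d$. Using that
\[ \mathbb{P}\big\{\sigma(1)+\dots+\sigma(k)\le\gamma_t<\sigma(1)+\dots+\sigma(k+1)\big\}\ge\mathbb{P}\big\{\sigma(1)+\dots+\sigma(k)\in[\tfrac{\gamma_t}{2},\gamma_t)\big\}\,\mathbb{P}\big\{\sigma(0)\ge\tfrac{\gamma_t}{2}\big\}, \]
and Cramér's theorem, we obtain the lower bound
\[ \mathbb{P}_0\{X(\gamma_t)=y_t\}\ge e^{-O(\gamma_t)}=e^{-o(t\alpha(\beta(t))^{-2})}. \]
By an eigenfunction expansion we have that
\[ \begin{aligned} \mathbb{E}_0\Big[\exp\Big\{\int_0^{t-\gamma_t}\psi_t(X(s))\,ds\Big\}\mathbb{1}\{\tau_{0,t}>t-\gamma_t\}\Big] &\ge\mathbb{E}_0\Big[\exp\Big\{\int_0^{t-\gamma_t}\psi_t(X(s))\,ds\Big\}\mathbb{1}\{\tau_{0,t}>t-\gamma_t,\ X(t-\gamma_t)=0\}\Big]\\ &\ge\exp\big\{(t-\gamma_t)\lambda^{\mathrm d}(t)\big\}\,e_t(0)^2, \end{aligned} \]
where $\lambda^{\mathrm d}(t)$ is the principal eigenvalue of $\Delta^{\mathrm d}+\psi_t$ in the box $B_{R\alpha(\beta(t))}$ with zero boundary condition, and $e_t$ is the corresponding positive $\ell^2$-normalized eigenvector. Putting together these estimates and recalling from (4.6) that $t-\gamma_t=t(1+o(1))$, we obtain, almost surely,
\[ \liminf_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log u^{\xi_{\beta(t)}}(t,0)\ge-\varepsilon+\liminf_{t\uparrow\infty}\alpha(\beta(t))^2\lambda^{\mathrm d}(t)+\liminf_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log e_t(0)^2. \tag{4.11} \]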
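We now define the continuous counterpart $\lambda_R$ of $\lambda^{\mathrm d}(t)$, which is the finite-space version of the spectral radius defined in (1.26):
\[ \lambda_R(\psi)=\sup\big\{\langle\psi,g^2\rangle-\|\nabla g\|_2^2\colon g\in H^1(\mathbb{R}^d),\ \|g\|_2=1,\ \operatorname{supp}g\subseteq Q_R\big\}. \tag{4.12} \]
According to [BK01, Lemma 5.3],
\[ \liminf_{t\uparrow\infty}\alpha(\beta(t))^2\lambda^{\mathrm d}(t)\ge\lambda_R(\psi)\qquad\text{and}\qquad\liminf_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log e_t(0)^2\ge0. \]
Using this in (4.11), we obtain
\[ \liminf_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log u^{\xi_{\beta(t)}}(t,0)\ge-\varepsilon+\lambda_R(\psi), \tag{4.13} \]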
for any $\varepsilon>0$ and for any twice continuously differentiable function $\psi\in C^2(Q_R)$ satisfying $\mathcal{L}_R(\psi)<1$. Hence,
\[ \liminf_{t\uparrow\infty}\frac{\alpha(\beta(t))^2}{t}\log u^{\xi_{\beta(t)}}(t,0)\ge-\tilde\chi_R, \]
where
\[ \tilde\chi_R=\inf\big\{-\lambda_R(\psi)\colon\psi\in C^2(Q_R)\text{ and }\mathcal{L}_R(\psi)<1\big\}. \tag{4.14} \]
It remains to show that, for any $\rho>0$, we have $\limsup_{R\uparrow\infty}\tilde\chi_R\le\tilde\chi(\rho)$. This can be seen as follows: By Proposition 1.14 the variational problem in (1.33) has a minimizer $\psi^*$, a parabola with $\mathcal{L}(\psi^*)=1$. Pick $\psi_R=\varepsilon_R+\psi^*|_{Q_R}$, where $\varepsilon_R>0$ is chosen such that $\mathcal{L}_R(\psi_R)=1-\frac1R$. Obviously $\varepsilon_R\downarrow0$. It is easy to show, using the explicit principal eigenfunction of $\Delta+\psi^*$, that $\lim_{R\to\infty}\lambda_R(\psi_R)=\lambda(\psi^*)$. This completes the proof of the lower bound in (4.2) subject to Proposition 4.1.
We finally prove Proposition 4.1:
Proof of Proposition 4.1. This is very similar to the proof of [BK01, Prop. 5.1]. Recall that $\psi_t(\cdot)=\alpha(\beta(t))^{-2}\psi(\,\cdot\,\alpha(\beta(t))^{-1})$. Consider the event
\[ A^{(t)}_y=\bigcap_{z\in B_{R\alpha(\beta(t))}}\Big\{\xi_{\beta(t)}(y+z)\ge\psi_t(z)-\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big\},\qquad\text{for }y\in\mathbb{Z}^d. \]
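Note that the distribution of $A^{(t)}_y$ does not depend on $y$. Our first goal is to show that, for every $\varepsilon>0$,
\[ \operatorname{Prob}\big(A^{(t)}_0\big)\ge t^{-d\mathcal{L}_R(\psi)-C\varepsilon+o(1)},\qquad\text{as }t\uparrow\infty, \tag{4.15} \]
where $C>0$ depends only on $R$ and $\psi$, but not on $\varepsilon$. It is convenient to abbreviate
\[ s_t=\beta(t)\alpha(\beta(t))^{-d}. \tag{4.16} \]
Let $f\in C(Q_R)$ be some positive auxiliary function (to be determined later), and consider the tilted probability measure
\[ \operatorname{Prob}_{t,z}(\,\cdot\,)=\big\langle e^{f_t(z)\xi_{\beta(t)}(z)}\,\mathbb{1}\{\xi(z)\in\cdot\,\}\big\rangle\,e^{-H(f_t(z))+f_t(z)H(s_t)/s_t},\qquad\text{for }z\in\mathbb{Z}^d, \]
where $f_t(z)=s_tf(z\alpha(\beta(t))^{-1})$ is the scaled version of $f$. The purpose of this tilting is to make the event $A^{(t)}_0$ typical. We denote the expectation with respect to $\operatorname{Prob}_{t,z}$ by $\langle\,\cdot\,\rangle_{t,z}$. Consider the event
\[ D_t(z)=\Big\{\frac{\varepsilon}{2\alpha(\beta(t))^2}\ge\xi_{\beta(t)}(z)-\psi_t(z)\ge-\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big\}. \]
Using that $\bigcap_{z\in B_{R\alpha(\beta(t))}}D_t(z)\subseteq A^{(t)}_0$ and the left inequality in the definition of $D_t(z)$, we obtain
\[ \operatorname{Prob}\big(A^{(t)}_0\big)\ge\prod_{z\in B_{R\alpha(\beta(t))}}\exp\Big\{H(f_t(z))-f_t(z)\Big[\frac{H(s_t)}{s_t}+\psi_t(z)+\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big]\Big\}\times\prod_{z\in B_{R\alpha(\beta(t))}}\operatorname{Prob}_{t,z}\big(D_t(z)\big). \tag{4.17} \]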
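Since $\beta(t)\alpha(\beta(t))^{-2}=d\log t$, it is clear from a Riemann sum approximation that
\[ \begin{aligned} \prod_{z\in B_{R\alpha(\beta(t))}}\exp\Big\{-f_t(z)\Big[\psi_t(z)+\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big]\Big\} &=\exp\Big\{-\frac{\beta(t)}{\alpha(\beta(t))^2}\,\frac{1}{\alpha(\beta(t))^d}\sum_{z\in B_{R\alpha(\beta(t))}}f\Big(\frac{z}{\alpha(\beta(t))}\Big)\Big[\psi\Big(\frac{z}{\alpha(\beta(t))}\Big)+\frac\varepsilon2\Big]\Big\}\\ &=t^{-d\langle f,\psi\rangle-d\frac\varepsilon2\langle f,\mathbb{1}\rangle+o(1)},\qquad\text{as }t\uparrow\infty. \end{aligned} \tag{4.18} \]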
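We use the uniformity of the convergence in (1.6), the definitions (1.10) of $\alpha(\,\cdot\,)$ and (1.16) of $\beta(t)$, and a Riemann sum approximation to obtain
\[ \begin{aligned} \sum_{z\in B_{R\alpha(\beta(t))}}\Big[H(f_t(z))-f_t(z)\frac{H(s_t)}{s_t}\Big] &=\sum_{z\in B_{R\alpha(\beta(t))}}\Big[H\Big(s_tf\Big(\frac{z}{\alpha(\beta(t))}\Big)\Big)-f\Big(\frac{z}{\alpha(\beta(t))}\Big)H(s_t)\Big]\\ &=\kappa(s_t)\Big[\sum_{z\in B_{R\alpha(\beta(t))}}\rho f\Big(\frac{z}{\alpha(\beta(t))}\Big)\log f\Big(\frac{z}{\alpha(\beta(t))}\Big)+o\big(\alpha\circ\beta(t)^d\big)\Big]\\ &=\big(\mathcal{H}_R(f)+o(1)\big)(1+o(1))\,\frac{\beta(t)}{\alpha(\beta(t))^2}=d(\log t)\big(\mathcal{H}_R(f)+o(1)\big). \end{aligned} \tag{4.19} \]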
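Using (4.18) and (4.19) in (4.17), we arrive at
\[ \operatorname{Prob}\big(A^{(t)}_0\big)\ge t^{\,d\left[\mathcal{H}_R(f)-\langle f,\psi\rangle-\frac\varepsilon2\langle f,\mathbb{1}\rangle\right]+o(1)}\prod_{z\in B_{R\alpha(\beta(t))}}\operatorname{Prob}_{t,z}\big(D_t(z)\big),\qquad\text{as }t\uparrow\infty. \]
Recall from (4.7) that $\mathcal{L}_R$ is the Legendre transform of $\mathcal{H}_R$. We choose $f$ as the maximizer on the right of (4.7), i.e., such that $\mathcal{H}_R(f)-\langle f,\psi\rangle=-\mathcal{L}_R(\psi)$. Hence, to show that (4.15) holds, it is sufficient to show that
\[ \prod_{z\in B_{R\alpha(\beta(t))}}\operatorname{Prob}_{t,z}\big(D_t(z)\big)\ge t^{o(1)},\qquad\text{as }t\uparrow\infty. \tag{4.20} \]
To show this, note that
\[ \operatorname{Prob}_{t,z}\big(D_t(z)\big)=1-\operatorname{Prob}_{t,z}\Big(\xi_{\beta(t)}(z)>\psi_t(z)+\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big)-\operatorname{Prob}_{t,z}\Big(\xi_{\beta(t)}(z)<\psi_t(z)-\frac{\varepsilon}{2\alpha(\beta(t))^2}\Big). \tag{4.21} \]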
Since both terms are handled in the same way, we treat only the second term. For any a > 0 we use the exponential Chebyshev inequality to bound ε Probt,z ξβ(t) (z) < ψt (z) − 2α(β(t))2 ε ≤ e−H ( ft (z))+ ft (z)H (st )/st exp f t (z)ξβ(t) (z) + a ψt (z) − ξβ(t) (z) − 2α(β(t))2 = e H ( ft (z)−a)−H ( f t (z))+a H (st )/st ea[ψt (z)−ε/2α(β(t)) ] . 2
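We pick $a = \delta_t f_t(z)$ with some $\delta_t \downarrow 0$. Then the terms involving $H$ can be treated similarly to (4.19). Indeed, abbreviating $\tilde f = f(z\alpha(\beta(t))^{-1})$, we obtain
$$\begin{aligned}
H(f_t(z) - a) - H(f_t(z)) + a\frac{H(s_t)}{s_t}
&= \big[H\big(s_t(1-\delta_t)\tilde f\big) - (1-\delta_t)\tilde f H(s_t)\big] - \big[H\big(s_t\tilde f\big) - \tilde f H(s_t)\big] \\
&= \kappa(s_t)(1+o(1))\big[\widehat H\big((1-\delta_t)\tilde f\big) - \widehat H\big(\tilde f\big)\big] \\
&= -\big(\rho + o(1)\big)\,\tilde f\,\big(1 + \log\tilde f\big)\,\delta_t\,\frac{d\log t}{\alpha(\beta(t))^d},
\end{aligned}$$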
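where we also used the approximation $\log(1-\delta_t) = -\delta_t(1+o(1))$. Hence, we obtain
$$\mathrm{Prob}_{t,z}\Big(\xi_{\beta(t)}(z) < \psi_t(z) - \frac{\varepsilon}{2\alpha(\beta(t))^2}\Big) \le t^{\,\delta_t\alpha(\beta(t))^{-d}\,\tilde f\,[\tilde\psi - \varepsilon/2 - \rho(1+\log\tilde f)](d+o(1))}, \qquad \text{as } t\uparrow\infty,$$
where we recall (4.1) and abbreviate $\tilde\psi = \psi(z\alpha(\beta(t))^{-1})$. Recall that we chose $f$ optimally in (4.7), which in particular means that $\log f(x) = \psi(x)/\rho - 1$. Hence, for some $C > 0$, not depending on $t$ nor on $z$, we have, for $t > 1$ large enough,
$$\mathrm{Prob}_{t,z}\Big(\xi_{\beta(t)}(z) < \psi_t(z) - \frac{\varepsilon}{2\alpha(\beta(t))^2}\Big) \le t^{-Cd\varepsilon\delta_t\alpha(\beta(t))^{-d}} \le \frac14.$$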
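The two computations behind this bound can be checked numerically. The sketch below assumes the limiting shape $\widehat H(y) = \rho\, y\log y$, which is inferred from the factor $\rho(1+\log f)$ above and is not restated in this passage. It compares the exact increment $\widehat H((1-\delta)f) - \widehat H(f)$ with its first-order form $-\rho\,\delta f(1+\log f)$, and evaluates the bracket $\psi - \varepsilon/2 - \rho(1+\log f)$ at the optimal choice $\log f = \psi/\rho - 1$. All function names and parameter values are hypothetical.

```python
import math

def h_hat(y, rho):
    """Assumed limiting shape rho * y * log(y) for y > 0 (an assumption of this sketch)."""
    return rho * y * math.log(y)

def exact_increment(f, delta, rho):
    """h_hat((1 - delta) * f) - h_hat(f), computed directly."""
    return h_hat((1.0 - delta) * f, rho) - h_hat(f, rho)

def first_order_increment(f, delta, rho):
    """First-order expansion in delta: -rho * delta * f * (1 + log f)."""
    return -rho * delta * f * (1.0 + math.log(f))

def exponent_bracket(psi, f, eps, rho):
    """The bracket psi - eps/2 - rho * (1 + log f) appearing in the exponent."""
    return psi - eps / 2.0 - rho * (1.0 + math.log(f))

rho, eps, delta = 1.5, 0.1, 1e-4   # hypothetical values
for psi in (-2.0, 0.0, 0.7):
    f_opt = math.exp(psi / rho - 1.0)   # optimal f: log f = psi / rho - 1
    gap = abs(exact_increment(f_opt, delta, rho) - first_order_increment(f_opt, delta, rho))
    print(psi, gap, exponent_bracket(psi, f_opt, eps, rho))
```

With the optimal $f$ the bracket collapses to $-\varepsilon/2$, which is what makes the exponent strictly negative.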
Going back to (4.21) and assuming that the first probability term satisfies the same bound, we have
$$\prod_{z\in B_{R\alpha(\beta(t))}} \mathrm{Prob}_{t,z}\big(D_t(z)\big) \ge \big(1 - \tfrac12\big)^{|B_{R\alpha(\beta(t))}|} = e^{-C(R\alpha(\beta(t)))^d} = e^{o(\log t)} = t^{o(1)}, \tag{4.22}$$
where we use that $\alpha$ is slowly varying and $\beta(t) = (\log t)^{1-o(1)}$, so that $\alpha(\beta(t))^d \le \beta(t)^{d\eta} = o(\log t)$ for $t\to\infty$. This proves (4.20), and therefore (4.15).

We finally complete the proof of Proposition 4.1. As in the proof of [BK01, Prop. 5.1] it suffices to prove the almost sure existence of a (random) $n_0 \in \mathbb{N}$ such that, for any $n \ge n_0$, there is a $y_n \in B_{\gamma e^n}$ such that the event $A_{y_n}^{(e^{n+1})}$ occurs. In the following, we abbreviate $t = e^n$. Let $M_t = B_{\gamma t} \cap 3R\alpha(\beta(et))\mathbb{Z}^d$. Note that $|M_t| \ge t^{d-o(1)}$ as $t\uparrow\infty$ and that the events $A_y^{(et)}$ with $y \in M_t$ are independent. It suffices to show the summability of
$$p_t = \mathrm{Prob}\Big(\sum_{y\in M_t} \mathbb{1}\{A_y^{(et)}\} \le \tfrac12 |M_t|\,\mathrm{Prob}\big(A_0^{(et)}\big)\Big)$$
on $t \in e^{\mathbb{N}}$. Indeed, since, by (4.15),
$$|M_t|\,\mathrm{Prob}\big(A_0^{(et)}\big) \ge t^{\,d - dL_R(\psi) - C\varepsilon - o(1)} \tag{4.23}$$
tends to infinity if ε > 0 is small enough (recall that L R (ψ) < 1), the summability ensures, via the Borel-Cantelli lemma, that, for all sufficiently large t, even a growing
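number of the events $A_y^{(et)}$ with $y \in M_t$ occurs. To show the summability of $p_t$ for $t \in e^{\mathbb{N}}$, we use the Chebyshev inequality to estimate
$$p_t \le \mathrm{Prob}\Big(\Big[\sum_{y\in M_t}\mathbb{1}\{A_y^{(et)}\} - \Big\langle\sum_{y\in M_t}\mathbb{1}\{A_y^{(et)}\}\Big\rangle\Big]^2 > \tfrac14\big(|M_t|\,\mathrm{Prob}(A_0^{(et)})\big)^2\Big) \le 4\,\frac{1 - \mathrm{Prob}(A_0^{(et)})}{|M_t|\,\mathrm{Prob}(A_0^{(et)})}.$$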
The summability over all t ∈ eN is clear from (4.23).
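The Chebyshev step can be illustrated on a toy model. The sketch below replaces the independent events by $n$ independent indicators with a common success probability $p$ (hypothetical stand-ins for $|M_t|$ and $\mathrm{Prob}(A_0^{(et)})$), computes the exact probability that their sum is at most half its mean, and compares it with the bound $4(1-p)/(np)$.

```python
import math

def binom_cdf(n, p, k):
    """P(S <= k) for S ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(0, k + 1))

def lower_tail(n, p):
    """Exact P(S <= n * p / 2); S is integer valued, so the threshold is floored."""
    return binom_cdf(n, p, math.floor(n * p / 2.0))

def chebyshev_bound(n, p):
    """Bound 4 * Var(S) / E[S]^2 = 4 * (1 - p) / (n * p) on the same event."""
    return 4.0 * (1.0 - p) / (n * p)

# Hypothetical sizes: n plays the role of |M_t|, p that of Prob(A_0^{(et)}).
for n, p in [(50, 0.2), (200, 0.1), (1000, 0.05)]:
    print(n, p, lower_tail(n, p), chebyshev_bound(n, p))
```

The bound decays like $1/(np)$, so it is summable along $t = e^n$ as soon as $|M_t|\,\mathrm{Prob}(A_0^{(et)})$ grows like a positive power of $t$, which is the content of (4.23).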
5. Appendix: Corrected Proof of Lemma 4.2 in [BK01]

We use the opportunity to correct an error in the proof of one of the main results of [BK01], the analogue of Theorem 1.4 for case (4) in Sect. 1.3. In the original proof the large deviation principle of Proposition 3.4 and Varadhan's lemma are applied to the functional $f \mapsto -\int f^\gamma\,dx$, which fails to be continuous in the topology of the large deviation principle. Here we adapt the techniques of the present paper to derive this result. We use the notation of Sect. 3.

Recall case (4) from Sect. 1.3. That is, we are in the case where $\operatorname{esssup}\xi(0) = 0$, $\gamma \in (0,1)$ and $\kappa^* = 0$. The case $\gamma = 0$ is easier and can be treated analogously. The main assumption is that $\lim_{t\to\infty}\widetilde H_t(x) = -Dx^\gamma$, uniformly in $x$ on compact subsets of $[0,\infty)$, where
$$\widetilde H_t(x) = \frac{\alpha(t)^{d+2}}{t}\, H\Big(x\,\frac{t}{\alpha(t)^d}\Big), \tag{5.1}$$
and $D > 0$ is a parameter. We have $\alpha(t) = t^{\nu+o(1)}$ as $t\to\infty$, where $\nu = \frac{1-\gamma}{d+2-d\gamma} \in \big(0, \frac{1}{d+2}\big)$.
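The step which needs amendment in [BK01] is the following analogue of Proposition 3.1:

Proposition 5.1.
(i) For any $R > 0$ and $M > 0$, $\displaystyle\limsup_{t\uparrow\infty}\frac{\alpha(t)^2}{t}\log\big\langle u^\xi_{R\alpha(t)}(t,0)\big\rangle \le -\chi^{(M)}$.
(ii) For any $R > 0$, $\displaystyle\liminf_{t\uparrow\infty}\frac{\alpha(t)^2}{t}\log\big\langle u^\xi_{R\alpha(t)}(t,0)\big\rangle \ge -\chi_R$,

where
$$\chi^{(M)} = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1}}\Big\{\|\nabla g\|_2^2 + D\int\big(g^2(x)\wedge M\big)^\gamma\,dx\Big\}, \qquad \chi_R = \inf_{\substack{g\in H^1(\mathbb{R}^d)\\ \|g\|_2=1,\ \operatorname{supp}(g)\subseteq Q_R}}\Big\{\|\nabla g\|_2^2 + D\int g^{2\gamma}(x)\,dx\Big\}.$$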
Proof. Introduce
$$H_R^{(t)}(f) = \int_{Q_R}\widetilde H_t\big(f(x)\big)\,dx, \qquad \text{for } f\in L^1(Q_R),\ f\ge 0.$$
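As in (3.38), we have
$$\big\langle u^\xi_{R\alpha(t)}(t,0)\big\rangle = E_0\Big[\exp\Big\{\frac{t}{\alpha(t)^2}H_R^{(t)}(L_t)\Big\}\,\mathbb{1}\{\operatorname{supp}(L_t)\subseteq Q_R\}\Big], \tag{5.2}$$
where we recall the rescaled and normalized local times $L_t$. We start with the proof of (i). Fix $M > 0$. With $H_R(f) = -D\int_{Q_R} f(x)^\gamma\,dx$, we have, uniformly in $f \in L^1(Q_R)$, $f \ge 0$,
$$\limsup_{t\uparrow\infty} H_R^{(t)}(f) \le \limsup_{t\uparrow\infty} H_R^{(t)}(f\wedge M) = H_R(f\wedge M).$$
Note that $H_R(L_t\wedge M) = \alpha(t)^2 G_t(\frac1t\ell_t)$, where we introduce
$$G_t(\mu) = -\frac{D}{\alpha(t)^2}\,\alpha(t)^{-d}\sum_{z\in B_{R\alpha(t)}}\Big(\big(\alpha(t)^d\mu(z)\big)\wedge M\Big)^\gamma, \qquad \text{for } \mu\in\mathcal{M}_1(B_{R\alpha(t)}).$$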
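We now use Proposition 3.3 for $B = B_{R\alpha(t)}$ and $F = G_t$ to obtain from (5.2) that, for any large $t$,
$$\big\langle u^\xi_{R\alpha(t)}(t,0)\big\rangle \le e^{o(t\alpha(t)^{-2})}E_0\Big[\exp\big\{tG_t(\tfrac1t\ell_t)\big\}\,\mathbb{1}\{\operatorname{supp}(\ell_t)\subseteq B_{R\alpha(t)}\}\Big] \le e^{o(t\alpha(t)^{-2})}\exp\big\{-t\chi_t^{(M)}\big\},$$
where
$$\chi_t^{(M)} = \inf_{\mu\in\mathcal{M}_1(B_{R\alpha(t)})}\Big[\frac12\sum_{x\sim y}\Big(\sqrt{\mu(x)} - \sqrt{\mu(y)}\Big)^2 - G_t(\mu)\Big].$$
The proof of the upper bound is finished as soon as we have shown that
$$\liminf_{t\uparrow\infty}\alpha(t)^2\chi_t^{(M)} \ge \chi^{(M)}. \tag{5.3}$$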
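This is shown as follows. Let $(t_n : n\in\mathbb{N})$ be a sequence of positive numbers $t_n\to\infty$ along which $\liminf_{t\uparrow\infty}\alpha(t)^2\chi_t^{(M)}$ is realized. We may assume that its value is finite. Let $(\mu_n : n\in\mathbb{N})$ be a sequence of approximative minimizers, i.e., probability measures on $\mathbb{Z}^d$ having support in $B_{R\alpha(t_n)}$ such that
$$\liminf_{n\to\infty}\Big(\alpha(t_n)^2\,\frac12\sum_{z\sim y}\Big(\sqrt{\mu_n(z)} - \sqrt{\mu_n(y)}\Big)^2 + D\alpha(t_n)^{-d}\sum_z\Big(\big(\alpha(t_n)^d\mu_n(z)\big)\wedge M\Big)^\gamma\Big)$$
is equal to the left-hand side of (5.3). For any $i\in\{1,\dots,d\}$ consider $g_n^{(i)}\colon\mathbb{R}^d\to\mathbb{R}$ given by
$$g_n^{(i)}(x) = \sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)x\rfloor\big)} + \Big(x_i - \frac{\lfloor\alpha(t_n)x_i\rfloor}{\alpha(t_n)}\Big)\times\alpha(t_n)\Big[\sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)x\rfloor + e_i\big)} - \sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)x\rfloor\big)}\Big],$$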
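where $e_i\in\mathbb{Z}^d$ is the $i$-th unit vector. For $x = (x_j : j = 1,\dots,d)\in\mathbb{R}^d$, we abbreviate $\hat x_i = (x_j : j\ne i)\in\mathbb{R}^{d-1}$ and denote $g^{(i)}_{n,\hat x_i}(x_i) = g_n^{(i)}(x)$. For almost every $\hat x_i\in\mathbb{R}^{d-1}$, the map $g^{(i)}_{n,\hat x_i}$ is continuous and piecewise affine, and hence lies in $H^1(\mathbb{R})$ with support in $[-R,R]$. Furthermore,
$$\big(g^{(i)}_{n,\hat x_i}\big)'(x_i) = \frac{\partial g_n^{(i)}}{\partial x_i}(x) = \alpha(t_n)\Big[\sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)x\rfloor + e_i\big)} - \sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)x\rfloor\big)}\Big].$$
Hence, using Fubini's theorem and Fatou's lemma, we see that
$$\begin{aligned}
\infty &> \liminf_{n\to\infty}\alpha(t_n)^2\,\frac12\sum_{z\sim y}\Big(\sqrt{\alpha(t_n)^d\mu_n(z)} - \sqrt{\alpha(t_n)^d\mu_n(y)}\Big)^2 \\
&= \liminf_{n\to\infty}\sum_{i=1}^d\int_{\mathbb{R}^{d-1}}d\hat x_i\int_{\mathbb{R}}dx_i\,\Big|\big(g^{(i)}_{n,\hat x_i}\big)'(x_i)\Big|^2 \\
&\ge \sum_{i=1}^d\int_{\mathbb{R}^{d-1}}d\hat x_i\,\liminf_{n\to\infty}\int_{\mathbb{R}}dx_i\,\Big|\big(g^{(i)}_{n,\hat x_i}\big)'(x_i)\Big|^2.
\end{aligned} \tag{5.4}$$
Since $|x_i - \lfloor\alpha(t_n)x_i\rfloor/\alpha(t_n)| \le \alpha(t_n)^{-1}$, this also shows that
$$\lim_{n\to\infty}\Big\|g_n^{(i)} - \sqrt{\alpha(t_n)^d\mu_n\big(\lfloor\alpha(t_n)\,\cdot\,\rfloor\big)}\Big\|_2 = 0. \tag{5.5}$$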
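The interpolation can be made concrete in dimension $d = 1$. The following sketch builds the piecewise-affine function from a toy measure, under the assumption (made explicit in the reconstruction above) that the lattice site of $x$ is $\lfloor\alpha x\rfloor$. It checks that the offset $x - \lfloor\alpha x\rfloor/\alpha$ lies in $[0, 1/\alpha)$ and that the slope inside a cell equals $\alpha\big[\sqrt{\alpha\mu(z+1)} - \sqrt{\alpha\mu(z)}\big]$. The measure `mu` and the value of `alpha` are hypothetical.

```python
import math

def sqrt_density(mu, alpha, z):
    """sqrt(alpha * mu(z)) in dimension d = 1; mu is a dict mapping sites to masses."""
    return math.sqrt(alpha * mu.get(z, 0.0))

def cell_slope(mu, alpha, x):
    """Slope alpha * [sqrt(alpha mu(z+1)) - sqrt(alpha mu(z))] on the cell containing x."""
    z = math.floor(alpha * x)
    return alpha * (sqrt_density(mu, alpha, z + 1) - sqrt_density(mu, alpha, z))

def g(mu, alpha, x):
    """Affine interpolation between neighbouring lattice values (d = 1 version of g_n)."""
    z = math.floor(alpha * x)
    offset = x - z / alpha            # lies in [0, 1/alpha)
    return sqrt_density(mu, alpha, z) + offset * cell_slope(mu, alpha, x)

mu = {0: 0.2, 1: 0.5, 2: 0.3}         # hypothetical probability measure on Z
alpha = 4.0                           # hypothetical scale
x, h = 0.3, 1e-6
print(g(mu, alpha, x), (g(mu, alpha, x + h) - g(mu, alpha, x)) / h, cell_slope(mu, alpha, x))
```

Because the offset is at most $1/\alpha$ and the slopes are square-integrable, the interpolant and the step function $\sqrt{\alpha\mu(\lfloor\alpha\,\cdot\,\rfloor)}$ differ by $O(1/\alpha)$ in $L^2$, which is the mechanism behind (5.5).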
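In particular, $g_n^{(i)}$ is asymptotically $L^2$-normalized. Furthermore, it follows that, along a suitable subsequence, for almost all $\hat x_i\in\mathbb{R}^{d-1}$, $g^{(i)}_{n,\hat x_i}$ converges to some $g^{(i)}_{\hat x_i}\in H^1(\mathbb{R})$. The convergence is (i) strong in $L^2$, (ii) pointwise almost everywhere, and (iii) weak in $L^2$ for the gradients. The limit satisfies
$$\liminf_{n\to\infty}\alpha(t_n)^2\,\frac12\sum_{z\sim y}\Big(\sqrt{\alpha(t_n)^d\mu_n(z)} - \sqrt{\alpha(t_n)^d\mu_n(y)}\Big)^2 \ge \sum_{i=1}^d\int_{\mathbb{R}^{d-1}}d\hat x_i\int_{\mathbb{R}}dx_i\,\Big|\big(g^{(i)}_{\hat x_i}\big)'(x_i)\Big|^2. \tag{5.6}$$
Since $g^{(i)}_{n,\hat x_i}(x_i) = g_n^{(i)}(x)$ and $\lim_{n\to\infty}\big\|g_n^{(i)} - \sqrt{\alpha(t_n)^d\mu_n(\lfloor\alpha(t_n)\,\cdot\,\rfloor)}\big\|_2 = 0$, there is $g\in L^2(\mathbb{R}^d)$ such that $g(x) = g^{(i)}_{\hat x_i}(x_i)$ for almost all $x\in\mathbb{R}^d$. In particular, (a) $g\in H^1(\mathbb{R}^d)$ with (b) $\|g\|_2 = 1$, (c) $\operatorname{supp}(g)\subset Q_R$ and (d)
$$\|\nabla g\|_2^2 \le \liminf_{n\to\infty}\alpha(t_n)^2\,\frac12\sum_{z\sim y}\Big(\sqrt{\alpha(t_n)^d\mu_n(z)} - \sqrt{\alpha(t_n)^d\mu_n(y)}\Big)^2.$$
Indeed, (a) follows from (b) and (d). Item (b) follows from (5.5), while item (c) is trivially satisfied. We are left to prove item (d). Since $g^{(i)}_{\hat x_i}(x_i) = g(x)$ for almost every $x$, we get
$$\big(g^{(i)}_{\hat x_i}\big)'(x_i) = \frac{\partial}{\partial x_i}g^{(i)}_{\hat x_i}(x_i) = \frac{\partial}{\partial x_i}g(x), \tag{5.7}$$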
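and hence
$$\sum_{i=1}^d\int_{\mathbb{R}^{d-1}}d\hat x_i\int_{\mathbb{R}}dx_i\,\Big|\big(g^{(i)}_{\hat x_i}\big)'(x_i)\Big|^2 = \int_{\mathbb{R}^d}dx\sum_{i=1}^d\Big|\frac{\partial}{\partial x_i}g(x)\Big|^2 = \|\nabla g\|_2^2. \tag{5.8}$$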
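Therefore, item (d) follows from (5.6). It remains to show that $\int(g(x)^2\wedge M)^\gamma\,dx \le \liminf_{n\to\infty}\alpha(t_n)^{-d}\sum_z\big((\alpha(t_n)^d\mu_n(z))\wedge M\big)^\gamma$. Note that
$$\begin{aligned}
\alpha(t_n)^{-d}\sum_z\Big(\big(\alpha(t_n)^d\mu_n(z)\big)\wedge M\Big)^\gamma
&= \int\Big(\big(\alpha(t_n)^d\mu_n(\lfloor\alpha(t_n)x\rfloor)\big)\wedge M\Big)^\gamma\,dx \\
&= \int\Big(\Big|g_n^{(i)}(x) - \Big(x_i - \frac{\lfloor\alpha(t_n)x_i\rfloor}{\alpha(t_n)}\Big)\big(g^{(i)}_{n,\hat x_i}\big)'(x_i)\Big|^{2\gamma}\wedge M^\gamma\Big)\,dx.
\end{aligned} \tag{5.9}$$
We next use the inequality $|a-b|^{2\gamma} \ge (|a|^\gamma - |b|^\gamma)^2 \ge |a|^{2\gamma} - 2|ab|^\gamma$ and for the subtracted term use Jensen's inequality and the Cauchy-Schwarz inequality, as well as (5.4), to see that
$$\begin{aligned}
\int\Big|g_n^{(i)}(x)\Big(x_i - \frac{\lfloor\alpha(t_n)x_i\rfloor}{\alpha(t_n)}\Big)\big(g^{(i)}_{n,\hat x_i}\big)'(x_i)\Big|^\gamma\,dx
&\le (2R)^d\Big((2R)^{-d}\int_{Q_R}\big|g_n^{(i)}(x)\big|\,\Big|x_i - \frac{\lfloor\alpha(t_n)x_i\rfloor}{\alpha(t_n)}\Big|\,\Big|\big(g^{(i)}_{n,\hat x_i}\big)'(x_i)\Big|\,dx\Big)^\gamma \\
&\le \alpha(t_n)^{-\gamma}(2R)^{(1-\gamma)d}\,\big\|g_n^{(i)}\big\|_2^\gamma\,\Big\|\frac{\partial}{\partial x_i}g_n^{(i)}\Big\|_2^\gamma,
\end{aligned}$$
which is negligible. Next we use the fact that $g_n^{(i)}\to g$ pointwise and Fatou's lemma to see that the limit inferior of the right hand side of (5.9) is not smaller than $\int(g(x)^{2\gamma}\wedge M^\gamma)\,dx$. This completes the proof of (5.3) and therefore the proof of (i).

We next turn to the proof of (ii). First we show that, for any $f\in C(Q_R)$ and any family of $L^1(Q_R)$-normalized functions $f_t\in L^1(Q_R)$ satisfying $f_t\to f$ in the weak topology induced by test integrals against all continuous functions,
$$\liminf_{t\uparrow\infty}H_R^{(t)}(f_t) \ge H_R(f). \tag{5.10}$$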
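We fix a large $M > 0$ and estimate $H_R^{(t)}(f_t) \ge H_R^{(t)}(f_t\wedge M) + H_R^{(t)}(f_t\mathbb{1}\{f_t > M\})$. We first handle the first term. Introduce $\varphi(x) = x^\gamma$ and let $g_y(x) = (1-\gamma)y^\gamma + \gamma y^{\gamma-1}x$ denote the tangent of $\varphi$ at $y\in(0,\infty)$. By concavity, we have $\varphi \le g_y$ on $(0,\infty)$ for any $y > 0$. This implies that, as $t\uparrow\infty$, for any $\varepsilon > 0$,
$$\begin{aligned}
H_R^{(t)}(f_t\wedge M) &= o(1) - D\int_{Q_R}\varphi\big(f_t(x)\wedge M\big)\,dx \ge o(1) - D\int_{Q_R}g_{f(x)\vee\varepsilon}\big(f_t(x)\big)\,dx \\
&\ge o(1) - D(1-\gamma)\int_{Q_R}\big(f(x)\vee\varepsilon\big)^\gamma\,dx - D\gamma\int_{Q_R}f_t(x)\big(f(x)\vee\varepsilon\big)^{\gamma-1}\,dx \\
&= o(1) - D(1-\gamma)\int_{Q_R}(f\vee\varepsilon)^\gamma - D\gamma\int_{Q_R}f\,(f\vee\varepsilon)^{\gamma-1},
\end{aligned}$$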
where in the last step we used that $(f\vee\varepsilon)^{\gamma-1}$ is continuous and $f_t\to f$. Letting $\varepsilon\downarrow 0$, we see that $\liminf_{t\uparrow\infty}H_R^{(t)}(f_t\wedge M) \ge H_R(f)$ for any $M > 0$. It remains to show that $\liminf_{M\uparrow\infty}\liminf_{t\uparrow\infty}H_R^{(t)}(f_t\mathbb{1}\{f_t > M\}) \ge 0$. Fix $\delta > 0$ such that $\gamma+\delta < 1$. Recall (5.1). Since $H$ is regularly varying with exponent $\gamma$, by [BGT87, Proposition 1.3.6], there is an $M > 0$ such that, for any sufficiently large $t$,
$$\widetilde H_t(x) \ge -x^{\gamma+\delta}, \qquad \text{for any } x > M.$$
Hence,
$$H_R^{(t)}\big(f_t\mathbb{1}\{f_t > M\}\big) \ge -\int_{Q_R}f_t(x)^{\gamma+\delta}\,\mathbb{1}\{f_t(x) > M\}\,dx \ge -M^{\gamma+\delta-1}\int_{Q_R}f_t(x)\,dx = -M^{\gamma+\delta-1},$$
since $f_t$ is $L^1$-normalized. This completes the proof of (5.10). We complete the proof of Proposition 5.1(ii) by using (5.10) in (5.2) and use the lower bound of Varadhan's lemma in [DZ98, Lemma 4.3.4] to conclude that the assertion in (ii) holds.
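The proof of Proposition 5.1 rests on three elementary inequalities for exponents in $(0,1)$: the tangent bound $x^\gamma \le g_y(x)$, the chain $|a-b|^{2\gamma} \ge (|a|^\gamma - |b|^\gamma)^2 \ge |a|^{2\gamma} - 2|ab|^\gamma$, and the tail bound $v^{\gamma+\delta} \le M^{\gamma+\delta-1}v$ for $v > M$. The sketch below checks them on a sample grid. The grid, the helper names and the parameter values are hypothetical, and a grid check illustrates the inequalities without proving them.

```python
import math

def tangent(y, x, gamma):
    """Tangent line to phi(x) = x**gamma at the point y > 0, evaluated at x."""
    return (1.0 - gamma) * y ** gamma + gamma * y ** (gamma - 1.0) * x

def tangent_dominates(gamma, ys, xs, tol=1e-12):
    """Concavity: x**gamma <= g_y(x) for all sampled x, y > 0."""
    return all(x ** gamma <= tangent(y, x, gamma) + tol for y in ys for x in xs)

def power_chain_holds(gamma, a, b, tol=1e-12):
    """|a-b|^(2g) >= (|a|^g - |b|^g)^2 >= |a|^(2g) - 2|ab|^g."""
    left = abs(a - b) ** (2 * gamma)
    middle = (abs(a) ** gamma - abs(b) ** gamma) ** 2
    right = abs(a) ** (2 * gamma) - 2 * abs(a * b) ** gamma
    return left + tol >= middle and middle + tol >= right

def tail_bound_holds(gamma, delta, big_m, v, tol=1e-12):
    """For v > M and gamma + delta < 1: v^(gamma+delta) <= M^(gamma+delta-1) * v."""
    return v ** (gamma + delta) <= big_m ** (gamma + delta - 1.0) * v + tol

grid = [0.1 * k for k in range(1, 60)]       # sample points in (0, 6)
signed = [-3.0, -0.5, 0.0, 0.2, 1.0, 4.5]     # sample values for a and b
gamma, delta, big_m = 0.4, 0.3, 2.0           # hypothetical, with gamma + delta < 1

all_ok = (
    tangent_dominates(gamma, grid, grid)
    and all(power_chain_holds(gamma, a, b) for a in signed for b in signed)
    and all(tail_bound_holds(gamma, delta, big_m, v) for v in grid if v > big_m)
)
print("all inequalities hold on the sample grid:", all_ok)
```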
Acknowledgements. We would like to thank Laurens de Haan for helpful discussions on regularly varying functions, and the organisers of the Workshop on Interacting stochastic systems in Cologne, 2003, where this work was initiated. The work of the first author was supported in part by Netherlands Organisation for Scientific Research (NWO). The second author would like to thank the German Science Foundation for awarding a Heisenberg grant (realized in 2003/04), and the third author would like to acknowledge the support of the Nuffield Foundation through grant NAL/00631/G and the EPSRC through grant EP/C500229/1.
References

[BGT87] Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge: Cambridge University Press 1987
[BK01] Biskup, M., König, W.: Long-time tails in the parabolic Anderson model with bounded potential. Ann. Probab. 29(2), 636–682 (2001)
[BK01a] Biskup, M., König, W.: Screening effect due to heavy lower tails in one-dimensional parabolic Anderson model. J. Stat. Phys. 102(5/6), 1253–1270 (2001)
[BHK05] Brydges, D., van der Hofstad, R., König, W.: Joint density for the local times of continuous-time Markov chains (2005), available at http://www.math.uni-leipzig.de/∼koenig/www/localtimes.pdf
[CM94] Carmona, R., Molchanov, S.A.: Parabolic Anderson problem and intermittency. Mem. Amer. Math. Soc. 108(518), (1994)
[Ch04] Chen, X.: Exponential asymptotics and law of the iterated logarithm for self-intersection local times of random walks. Ann. Probab. 32(4), 3248–3300 (2004)
[DZ98] Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd Edition. New York: Springer, 1998
[DV79] Donsker, M., Varadhan, S.R.S.: On the number of distinct sites visited by a random walk. Comm. Pure Appl. Math. 32, 721–747 (1979)
[GKS06] Gantert, N., König, W., Shi, Z.: Annealed deviations for random walk in random scenery. Annales Inst. H. Poincaré: Probab. et Stat., to appear (2006)
[GH99] Gärtner, J., den Hollander, F.: Correlation structure of intermittency in the parabolic Anderson model. Probab. Theory Relat. Fields 114, 1–54 (1999)
[GK00] Gärtner, J., König, W.: Moment asymptotics for the continuous parabolic Anderson model. Ann. Appl. Probab. 10(3), 192–217 (2000)
[GK05] Gärtner, J., König, W.: The parabolic Anderson model. In: J.-D. Deuschel, A. Greven (eds.), Interacting Stochastic Systems, Berlin Heidelberg New York: Springer 2005, pp. 153–179
[GKM00] Gärtner, J., König, W., Molchanov, S.: Almost sure asymptotics for the continuous parabolic Anderson model. Probab. Theory Relat. Fields 118(4), 547–573 (2000)
[GKM06] Gärtner, J., König, W., Molchanov, S.: Geometric characterization of intermittency in the parabolic Anderson model. Ann. Probab., to appear (2006)
[GM90] Gärtner, J., Molchanov, S.: Parabolic problems for the Anderson model I. Intermittency and related topics. Commun. Math. Phys. 132, 613–655 (1990)
[GM98] Gärtner, J., Molchanov, S.: Parabolic problems for the Anderson model II. Second-order asymptotics and structure of high peaks. Probab. Theory Relat. Fields 111, 17–55 (1998)
[dH00] den Hollander, F.: Large Deviations. Fields Institute Monographs. Providence, RI: Amer. Math. Soc. 2000
[KM02] König, W., Mörters, P.: Brownian intersection local times: upper tail asymptotics and thick points. Ann. Probab. 30, 1605–1656 (2002)
[LL01] Lieb, E.H., Loss, M.: Analysis, 2nd Edition. AMS Graduate Studies, Vol. 14, Providence, RI: Amer. Math. Soc., 2001
[M94] Molchanov, S.: Lectures on random media. In: D. Bakry, R.D. Gill, S. Molchanov, Lectures on Probability Theory, Ecole d'Eté de Probabilités de Saint-Flour XXII-1992, LNM 1581, Berlin: Springer, 1994, pp. 242–411
[MR94] Molchanov, S., Ruzmaikin, A.: Lyapunov exponents and distributions of magnetic fields in dynamo models. In: The Dynkin Festschrift: Markov Processes and their Applications. Mark Freidlin (ed.), Basel: Birkhäuser, 1994
[S98] Sznitman, A.-S.: Brownian motion, Obstacles and Random Media. Berlin: Springer 1998

Communicated by H. Spohn
Commun. Math. Phys. 267, 355–392 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0085-2
Communications in
Mathematical Physics
Convergence in Higher Mean of a Random Schrödinger to a Linear Boltzmann Evolution Thomas Chen Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544-1000, USA. E-mail:
[email protected] Received: 20 April 2005 / Accepted: 16 May 2006 Published online: 1 August 2006 – © Springer-Verlag 2006
Abstract: We study the macroscopic scaling and weak coupling limit for a random Schrödinger equation on $\mathbb{Z}^3$. We prove that the Wigner transforms of a large class of "macroscopic" solutions converge in $r$-th mean to solutions of a linear Boltzmann equation, for any $1 \le r < \infty$. This extends previous results where convergence in expectation was established.

1. Introduction

We study the macroscopic scaling and weak coupling limit of the quantum dynamics in the three dimensional Anderson model, generated by the Hamiltonian
$$H_\omega = -\frac12\Delta + \lambda V_\omega(x) \tag{1}$$
on $\ell^2(\mathbb{Z}^3)$. Here, $\Delta$ is the nearest neighbor discrete Laplacian, the coupling constant $\lambda > 0$ defines the disorder strength, and the random potential is given by $V_\omega(x) = \omega_x$, where $\{\omega_x\}_{x\in\mathbb{Z}^3}$ are independent, identically distributed Gaussian random variables. While the phenomenon of impurity-induced insulation is, for strong disorders $\lambda \gg 1$ or extreme energies, mathematically well understood (Anderson localization, [1, 6]), establishing the existence of electric conduction in the weak coupling regime $\lambda \ll 1$ is a central open problem in this research field. A particular strategy to elucidate aspects of the latter, which has led to important recent successes (especially [5]), is to analyze the macroscopic transport properties derived from the microscopic quantum dynamics generated by (1), [2–5, 9]. Let $\phi_t\in\ell^2(\mathbb{Z}^3)$ be the solution of the random Schrödinger equation
$$i\partial_t\phi_t = H_\omega\phi_t, \qquad \phi_0\in\ell^2(\mathbb{Z}^3), \tag{2}$$
with a deterministic initial condition $\phi_0$ which is supported on a region of diameter $O(\lambda^{-2})$. Let $W_{\phi_t}(x,v)$ denote its Wigner transform, where $x\in\frac12\mathbb{Z}^3 \equiv (\mathbb{Z}/2)^3$, and $v\in\mathbb{T}^3 = [0,1]^3$. We consider a scaling for small $\lambda$ defined by the macroscopic time, position, and velocity variables $(T,X) := \lambda^2(t,x)$, $V := v$, while $(t,x,v)$ are the microscopic variables. Likewise, we introduce an appropriately rescaled, macroscopic counterpart $W_\lambda^{resc}(T,X,V)$ of $W_{\phi_t}(x,v)$. It was proved by Erdös and Yau for the continuum, [4, 3], and by the author for the lattice model, [2], that globally in macroscopic time $T$, and for any test function $J(X,V)$,
$$\lim_{\lambda\to0}E\Big[\int dX\,dV\,J(X,V)\,W_\lambda^{resc}(T,X,V)\Big] = \int dX\,dV\,J(X,V)\,F(T,X,V),$$
where $F(T,X,V)$ is the solution of a linear Boltzmann equation. For the random wave equation, a similar result is proved by Lukkarinen and Spohn, [7]. The corresponding local in $T$ result was established much earlier by Spohn, [9].

The main goal of this paper is to strengthen the mode of convergence. We establish convergence in $r$-th mean,
$$\lim_{\lambda\to0}E\Big[\Big|\int dX\,dV\,J(X,V)\,W_\lambda^{resc}(T,X,V) - \int dX\,dV\,J(X,V)\,F(T,X,V)\Big|^r\Big] = 0, \tag{3}$$
for any $J$ and any $r\in2\mathbb{N}$, and hence for any $1\le r<\infty$. Thus, in particular, we observe that the variance of $\int dX\,dV\,J(X,V)\,W_\lambda^{resc}(T,X,V)$ vanishes in this macroscopic, hydrodynamic limit. Our proof comprises generalizations and extensions of the graph expansion methods introduced by Erdös and Yau in [4, 3], and further elaborated on in [2]. The structure of the graphs entering the problem is significantly more complicated than in [4, 3, 2], and the number of graphs in the expansion grows much faster than in [4, 3, 2] (superfactorial versus factorial). As a main technical result in this paper, it is established that the associated Feynman amplitudes are sufficiently small to compensate for the large number of graphs, which is shown to imply (3). This is similar to the approach in [4, 3, 2].
The present work addresses a time scale of order $O(\lambda^{-2})$ (as in [4, 3, 2]), in which the average number of collisions experienced by the electron is finite, so that ballistic behavior is observed. Accordingly, the macroscopic dynamics is governed by a linear Boltzmann equation. Beyond this time scale, the average number of collisions is infinite, and the level of difficulty of the problem increases drastically. In their recent breakthrough result, Erdös, Salmhofer and Yau have established that over a time scale of order $O(\lambda^{-2-\kappa})$ for an explicit numerical value of $\kappa > 0$, the macroscopic dynamics in $d = 3$ derived from the quantum dynamics is determined by a diffusion equation, [5]. We note that control of the macroscopic dynamics up to a time scale $O(\lambda^{-2})$ produces lower bounds of the same order (up to logarithmic corrections) on the localization lengths of eigenvectors of $H_\omega$, see [2] for $d = 3$ (the same arguments are valid for $d \ge 3$). This extends recent results of Schlag, Shubin and Wolff, [8], who derived similar lower bounds for the weakly disordered Anderson model in dimensions $d = 1, 2$ using harmonic analysis techniques.
This work comprises a partial joint result with Laszlo Erdös (Lemma 5.2), to whom the author is deeply grateful for his support and generosity.

2. Definition of the Model and Statement of Main Results

To give a mathematically well-defined meaning to all quantities occurring in our analysis, we first introduce our model on a finite box
$$\Lambda_L = \{-L, -L+1, \dots, -1, 0, 1, \dots, L-1, L\}^3 \subset \mathbb{Z}^3, \tag{4}$$
for $L\in\mathbb{N}$ much larger than any relevant scale of the problem, and take the limit $L\to\infty$ later. All estimates derived in the sequel will be uniform in $L$. We consider the discrete Schrödinger operator
$$H_\omega = -\frac12\Delta + \lambda V_\omega \tag{5}$$
on $\ell^2(\Lambda_L)$ with periodic boundary conditions. Here, $\Delta$ is the nearest neighbor Laplacian,
$$(\Delta f)(x) = 6f(x) - \sum_{|y-x|=1}f(y), \tag{6}$$
and Vω (x) = ωx
(7)
is a random potential with {ω y } y∈ L i.i.d. Gaussian random variables satisfying E[ωx ] = 0, E[ω2x ] = 1, for all x ∈ L . Expectations of higher powers of ωx satisfy Wick’s theorem, cf. [4], and our discussion below. Clearly, Vω ∞ ( L ) < ∞ almost surely (a.s.), and Hω is a.s. self-adjoint on 2 ( L ), for every L < ∞. 1 1 L−1 3 3 Let ∗L = L1 L = {−1, − L−1 L , . . . , − L , 0, L , . . . , L , 1} ⊂ T denote the lat3 3 tice dual to L , where T = [−1, 1] the 3-dimensional unit torus. For 0 < ρ ≤ 1 with 1 is given by ∗L ,ρ = ρ ∈ N, we define L ,ρ := ρρ −1 L , and note that its dual lattice 1 −1 3 k∈ L ,ρ , L ρ−1 L ⊂ ρ T . For notational convenience, we shall write L ,ρ dk ≡ and ρ −1 T3 dk for the Lebesgue integral. For the Fourier transform and its inverse, we use the convention 3 −2πik·x ∨ e f (x), g (x) = dkg(k)e2πik·x , (8) f (k) = ρ ∗L ,ρ
x∈ L ,ρ
for L ≤ ∞ (where ∞,ρ = ρZ3 and ∗∞,ρ = T3 /ρ). We will mostly use ρ = 1, and sometimes ρ = 21 . On ∗L ,ρ , we define δ(k) = 1(k) with δ(0) = | L ,ρ | if k = 0 and d d δ(k) = 0 if k = 0. On T or R , δ will denote the usual d-dimensional delta distribution. The nearest neighbor lattice Laplacian defines the Fourier multiplier f (k), (− f ) (k) = 2e (k)
(9)
where
$$e_\Delta(k) = \sum_{i=1}^3\big(1 - \cos(2\pi k_i)\big) = 2\sum_{i=1}^3\sin^2(\pi k_i) \tag{10}$$
determines the kinetic energy of the electron.
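The multiplier in (9) and (10) can be verified on plane waves. The sketch below applies the difference operator written in (6), $6f(x) - \sum_{|y-x|=1}f(y)$, to $f(x) = e^{2\pi i k\cdot x}$ on $\mathbb{Z}^3$ and compares the result with $2e_\Delta(k)f(x)$. It also checks the trigonometric identity in (10). The function names and the sample values of $k$ and $x$ are hypothetical.

```python
import cmath
import math

def e_delta(k):
    """Kinetic energy symbol: sum over i of (1 - cos(2 pi k_i))."""
    return sum(1.0 - math.cos(2.0 * math.pi * ki) for ki in k)

def e_delta_sin(k):
    """Equivalent form 2 * sum over i of sin^2(pi k_i)."""
    return 2.0 * sum(math.sin(math.pi * ki) ** 2 for ki in k)

def plane_wave(k, x):
    """f(x) = exp(2 pi i k.x) for a lattice point x in Z^3."""
    return cmath.exp(2j * math.pi * sum(ki * xi for ki, xi in zip(k, x)))

def difference_operator_on_plane_wave(k, x):
    """6 f(x) minus the sum of f over the six nearest neighbours of x, as in (6)."""
    total = 6.0 * plane_wave(k, x)
    for i in range(3):
        for step in (-1, 1):
            y = list(x)
            y[i] += step
            total -= plane_wave(k, y)
    return total

k = (0.1, 0.25, -0.4)   # hypothetical momentum
x = (2, -1, 5)          # hypothetical lattice site
print(difference_operator_on_plane_wave(k, x), 2.0 * e_delta(k) * plane_wave(k, x))
```

Since $0 \le e_\Delta(k) \le 6$, the symbol $2e_\Delta(k)$ ranges over $[0, 12]$.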
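Let $\phi_t\in\ell^2(\Lambda_L)$ denote the solution of the random Schrödinger equation
$$i\partial_t\phi_t = H_\omega\phi_t, \qquad \phi_0\in\ell^2(\Lambda_L), \tag{11}$$
for a fixed realization of the random potential. We define its (real, but not necessarily positive) Wigner transform $W_{\phi_t}\colon\Lambda_{L,\frac12}\times\Lambda_L^*\to\mathbb{R}$ by
$$W_{\phi_t}(x,v) := 8\sum_{\substack{y,z\in\Lambda_L\\ y+z=2x}}\overline{\phi_t(y)}\,\phi_t(z)\,e^{2\pi i(y-z)v}. \tag{12}$$
Fourier transformation with respect to the variable $x\in\Lambda_{L,\frac12}$ (i.e. (8) with $\rho = \frac12$, see [5] for more details) yields
$$\widehat W_{\phi_t}(\xi,v) = \overline{\hat\phi_t\big(v - \tfrac{\xi}{2}\big)}\,\hat\phi_t\big(v + \tfrac{\xi}{2}\big), \tag{13}$$
for $v\in\Lambda_L^*$ and $\xi\in\Lambda^*_{L,\frac12}\subset2\mathbb{T}^3$.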
The Wigner transform is the key tool in our derivation of the macroscopic limit for the quantum dynamics described by (19). For $\eta > 0$ small, we introduce macroscopic variables $T := \eta t$, $X := \eta x$, $V := v$, and consider the rescaled Wigner transform
$$W^\eta_{\phi_t}(X,V) := \eta^{-3}\,W_{\phi_t}(\eta^{-1}X, V) \tag{14}$$
for $T\ge0$, $X\in\eta\Lambda_{L,\frac12}$, and $V\in\Lambda_L^*$. For a Schwartz class function $J\in\mathcal{S}(\mathbb{R}^3\times\mathbb{T}^3)$, we write
$$\langle J, W^\eta_{\phi_t}\rangle := \sum_{X\in\eta\Lambda_{L,\frac12}}\int_{\Lambda_L^*}dV\,J(X,V)\,W^\eta_{\phi_t}(X,V). \tag{15}$$
With $\widehat W_{\phi_t}$ as in (13), we have
$$\langle J, W^\eta_{\phi_t}\rangle = \langle\hat J_\eta, \widehat W_{\phi_t}\rangle = \int_{\Lambda^*_{L,\frac12}\times\Lambda_L^*}d\xi\,dv\,\hat J_\eta(\xi,v)\,\widehat W_{\phi_t}(\xi,v), \tag{16}$$
where $J_\eta(x,v) := \eta^{-3}J(\eta x,v)$, and
$$\hat J_\eta(\xi,v) = \eta^{-3}\sum_{x\in\Lambda_{L,\frac12}}J(\eta x,v)\,e^{-2\pi ix\xi} = \eta^{-3}\sum_{X\in\eta\Lambda_{L,\frac12}}J(X,v)\,e^{-2\pi iX\xi/\eta}. \tag{17}$$
We note that in the limit $L\to\infty$, $\hat J_\eta(\xi,v)$ tends to a smooth delta function with respect to the $\xi$-variable, of width $O(\eta)$ and amplitude $O(\eta^{-1})$, but remains uniformly bounded with respect to $\eta$ in the $v$-variable. The macroscopic scaling limit obtained from letting $\eta\to0$, with $\eta = \lambda^2$, is determined by a linear Boltzmann equation. This was proven in [2] for $\mathbb{Z}^3$, and non-Gaussian distributed random potentials (the Gaussian case follows also from [2]). The corresponding result for the continuum model in dimensions 2, 3 was proven in [4].
Theorem 2.1. For $\eta > 0$, let
$$\phi_0^\eta(x) := \eta^{\frac32}\,\frac{h(\eta x)\,e^{2\pi i\frac{S(\eta x)}{\eta}}}{\|h\|_{\ell^2(\eta\mathbb{Z}^3)}}, \tag{18}$$
with $h, S\in\mathcal{S}(\mathbb{R}^3,\mathbb{R})$ of Schwartz class, and $\|h\|_{L^2(\mathbb{R}^3)} = 1$. Assume $L$ sufficiently large (see (73)) that $\phi_0^\eta\big|_{\Lambda_L} = \phi_0^\eta$. Let $\phi_t^\eta$ be the solution of the random Schrödinger equation
$$i\partial_t\phi_t^\eta = H_\omega\phi_t^\eta \tag{19}$$
on $\ell^2(\Lambda_L)$ with initial condition $\phi_0^\eta$, and let
$$W_T^{(\eta)}(X,V) := W^\eta_{\phi^\eta_{\eta^{-1}T}}(X,V) \tag{20}$$
denote the corresponding rescaled Wigner transform. Choosing
$$\eta = \lambda^2, \tag{21}$$
where $\lambda$ is the coupling constant in (5), it follows that
$$\lim_{\lambda\to0}\lim_{L\to\infty}E\big[\langle J, W_T^{(\lambda^2)}\rangle\big] = \langle J, F_T\rangle, \tag{22}$$
where $F_T(X,V)$ solves the linear Boltzmann equation
$$\partial_TF_T(X,V) + \sum_{j=1}^3(\sin2\pi V_j)\,\partial_{X_j}F_T(X,V) = \int_{\mathbb{T}^3}dU\,\sigma(U,V)\,\big[F_T(X,U) - F_T(X,V)\big] \tag{23}$$
with initial condition
$$F_0(X,V) = \operatorname*{w-lim}_{\eta\to0}W^\eta_{\phi_0^\eta} = |h(X)|^2\,\delta(V - \nabla S(X)), \tag{24}$$
and where $\sigma(U,V) := 2\pi\delta(e_\Delta(U) - e_\Delta(V))$ denotes the collision kernel.

The purpose of the present work is to obtain a significant improvement of the mode of convergence. Our main result is the following theorem.
Theorem 2.2. Assume that the Fourier transform of (18), $\hat\phi_0^\eta$, satisfies the concentration of singularity property (29)–(31). Then, for any fixed, finite $r\in2\mathbb{N}$, any $T > 0$, and for any Schwartz class function $J$, the estimate
$$\lim_{L\to\infty}\Big(E\Big[\Big|\langle J, W_T^{(\lambda^2)}\rangle - E\big[\langle J, W_T^{(\lambda^2)}\rangle\big]\Big|^r\Big]\Big)^{\frac1r} \le c(r,T)\,\lambda^{\frac{1}{300r}} \tag{25}$$
holds for $\lambda$ sufficiently small, and a finite constant $c(r,T)$ that does not depend on $\lambda$. Consequently,
$$\lim_{\lambda\to0}\lim_{L\to\infty}E\Big[\Big|\langle J, W_T^{(\lambda^2)}\rangle - \langle J, F_T\rangle\Big|^r\Big] = 0 \tag{26}$$
for any $1\le r<\infty$ (i.e. convergence in $r$-th mean), and any $T\in\mathbb{R}_+$.

We observe that, in particular, the variance of $\langle J, W_T^{(\lambda^2)}\rangle$ vanishes in the limit $\lambda\to0$. Moreover, the following result is an immediate consequence.

Corollary 2.1. Under the assumptions of Theorem 2.2, the rescaled Wigner transform $W_T^{(\lambda^2)}$ converges weakly, and in probability, to a solution of the linear Boltzmann equations, globally in $T > 0$, as $\lambda\to0$. That is, for any finite $T > 0$, any $\nu > 0$, and any $J$ of Schwartz class,
$$P\Big[\lim_{\lambda\to0}\Big|\langle J, W_T^{(\lambda^2)}\rangle - \langle J, F_T\rangle\Big| > \nu\Big] = 0, \tag{27}$$
where $F_T$ solves (23) with initial condition (24).
2.1. Singularities of $\hat\phi_0^\eta$. We have obtained a well-defined semiclassical initial condition (24) for the linear Boltzmann evolution (23) from initial data in (2) of WKB type (18). This can in general not be expected if the initial data in (2) are only required to be in $\ell^2(\mathbb{Z}^3)$, but $\ell^2(\mathbb{Z}^3)$ initial data in (2) suffice for the expectation value of the quantum fluctuations in (22) to converge to zero as $\lambda\to0$, see [4, 2].

As we will see, a key point in proving that as $\lambda\to0$, the quantum fluctuations vanish in higher mean, (26), consists of controlling the overlap of the singularities of $\hat\phi_0^\eta$ with those of the resolvent multipliers $(e_\Delta(k) - \alpha\pm i\varepsilon)^{-1}$, where $\alpha\in\mathbb{R}$ and $\varepsilon = O(\eta)\ll1$. As opposed to the case in (22), it cannot be expected that the quantum fluctuations vanish in higher mean for general $\ell^2$ initial data (for (22), the overlap of the singularities of $\hat\phi_0^\eta$ and of those of the resolvent multipliers plays no rôle). Moreover, we note that the singularities of the WKB initial condition
$$\hat\phi_0^\eta(k) = \eta^{\frac32}\sum_{x\in\mathbb{Z}^3}h(\eta x)\,e^{2\pi i\left(\frac{S(\eta x)}{\eta} - kx\right)} = \eta^{\frac32}\sum_{X\in\eta\mathbb{Z}^3}h(X)\,e^{2\pi i\frac{S(X) - kX}{\eta}} \tag{28}$$
(which are determined by the zeros of $\det\operatorname{Hess}S(X)$, the determinant of the Hessian of $S$) will possess a rather arbitrary structure for generic choices of $S\in\mathcal{S}(\mathbb{R}^3,\mathbb{R})$. At present, we do not know if for WKB initial data of the form (18), the quantum fluctuations would
converge to zero in higher mean without any further restrictions on the phase function $S\in\mathcal{S}(\mathbb{R}^3,\mathbb{R})$. A more detailed analysis of these questions is left for future work.

In this paper, we shall assume that the Fourier transform of the WKB initial condition (18) satisfies a concentration of singularity condition:
$$\hat\phi_0^\eta(k) = f_\infty^\eta(k) + f_{sing}^\eta(k), \tag{29}$$
where
$$\|f_\infty^\eta\|_{L^\infty(\mathbb{T}^3)} < c, \tag{30}$$
and
$$\big\|\,|f_{sing}^\eta| * |f_{sing}^\eta|\,\big\|_{L^2(\mathbb{T}^3)} = \big\|\,|f_{sing}^\eta|^\vee\big\|^2_{\ell^4(\mathbb{Z}^3)} \le c'\,\eta^{\frac45} \tag{31}$$
for finite, positive constants $c$, $c'$ independent of $\eta$. This condition imposes a restriction on the phase function $S$. The following simple, but physically important examples of $\hat\phi_0^\eta$ satisfy (29)–(31).

2.1.1. Example. Let $S(X) = pX$ for $X\in\operatorname{supp}\{h\}$, and $p\in\mathbb{T}^3$. Then,
$$\hat\phi_0^\eta(k) = \frac{\eta^{-\frac32}\,\hat h(\eta^{-1}(k-p))}{\|h\|_{\ell^2(\eta\mathbb{Z}^3)}} =: \delta_\eta(k-p). \tag{32}$$
Since $h$ is of Schwartz class, $\delta_\eta$ is a smooth bump function concentrated on a ball of radius $O(\eta)$, with $\|\delta_\eta\|_{L^2(\mathbb{T}^3)} = 1$. Accordingly, we find
$$(|\delta_\eta| * |\delta_\eta|)(k) \approx \chi(|k| < c\eta), \tag{33}$$
and
$$\big\|\,|\delta_\eta| * |\delta_\eta|\,\big\|_{L^2(\mathbb{T}^3)} = \big\|\,|\delta_\eta|^\vee\big\|^2_{\ell^4(\mathbb{Z}^3)} \le c\,\eta^{\frac32}. \tag{34}$$
Hence, (29)–(31) is satisfied, with $f_\infty^\eta = 0$. In this example, $p\in\mathbb{T}^3$ corresponds to the velocity of the macroscopic initial condition $F_0(X,V)$ in (24) for the linear Boltzmann evolution.

2.1.2. Example. As a small generalization of the previous case, we may likewise assume for $S$ that for every $k\in\mathbb{T}^3$, there are finitely many solutions $X_j(k)$ of $\nabla_XS(X_j(k)) = k$, and that $X_j(\cdot)\in C^1(\operatorname{supp}\{h\})$ for each $j$. Moreover, we assume that $|\det\operatorname{Hess}S(X)| > c$ uniformly on $\operatorname{supp}\{h\}$. Then, by stationary phase arguments, [10], one finds that
$$\hat\phi_0^\eta(k) = f_\infty^\eta(k) + f_{sing}^\eta(k), \qquad \|f_\infty^\eta\|_{L^\infty(\mathbb{T}^3)} < c \tag{35}$$
with
$$f_{sing}^\eta(k) = \sum_jc_j\,\delta_\eta^{(j)}\big(k - \nabla_XS(X_j(k))\big), \tag{36}$$
for constants $c_j$ independent of $\eta$, and smooth bump functions $\delta_\eta^{(j)}$ similar to (32). One again obtains $\big\|\,|f_{sing}^\eta|^\vee\big\|^2_{\ell^4(\mathbb{Z}^3)} \le c\,\eta^{\frac32}$, which verifies that (29)–(31) holds. $\nabla S$ determines the velocity distribution of the macroscopic initial condition $F_0(X,V)$ in (24).
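3. Proof of Theorem 2.2

We expand $\phi_t$ into a truncated Duhamel series
$$\phi_t = \sum_{n=0}^{N-1}\phi_{n,t} + R_{N,t}, \tag{37}$$
where
$$\phi_{n,t} := (-i\lambda)^n\int_{\mathbb{R}_+^{n+1}}ds_0\cdots ds_n\,\delta\Big(\sum_{j=0}^ns_j - t\Big)\,e^{is_0\frac{\Delta}{2}}V_\omega\,e^{is_1\frac{\Delta}{2}}\cdots V_\omega\,e^{is_n\frac{\Delta}{2}}\phi_0 \tag{38}$$
denotes the $n$-th Duhamel term, and where
$$R_{N,t} = -i\lambda\int_0^tds\,e^{-i(t-s)H_\omega}\,V_\omega\,\phi_{N-1,s} \tag{39}$$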
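is the remainder term. Here and in the sequel, we write $\phi_0\equiv\phi_0^\eta$ for brevity. The number $N$ remains to be optimized. Since $\|V_\omega\|_{\ell^1(\Lambda_L)} < \infty$ a.s., $\hat V_\omega$ is well-defined and bounded on $\Lambda_L^*$, with probability one, for every $L < \infty$. Then,
$$\hat\phi_{n,t}(k_0) = (-i\lambda)^n\int ds_0\cdots ds_n\,\delta\Big(\sum_{j=0}^ns_j - t\Big)\int_{(\Lambda_L^*)^n}dk_1\cdots dk_n\,e^{-is_0e_\Delta(k_0)}\,\hat V_\omega(k_1-k_0)\,e^{-is_1e_\Delta(k_1)}\cdots\hat V_\omega(k_n-k_{n-1})\,e^{-is_ne_\Delta(k_n)}\,\hat\phi_0(k_n). \tag{40}$$
Expressed as a resolvent expansion in momentum space, we find
$$\hat\phi_{n,t}(k_0) = \frac{(-\lambda)^n}{2\pi i}\,e^{\varepsilon t}\int_{\mathbb{R}}d\alpha\,e^{-it\alpha}\int_{(\Lambda_L^*)^n}dk_1\cdots dk_n\,\frac{1}{e_\Delta(k_0)-\alpha-i\varepsilon}\,\hat V_\omega(k_1-k_0)\cdots\hat V_\omega(k_n-k_{n-1})\,\frac{1}{e_\Delta(k_n)-\alpha-i\varepsilon}\,\hat\phi_0(k_n). \tag{41}$$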
We refer to the Fourier multiplier $\frac{1}{e_\Delta(k)-\alpha-i\varepsilon}$ as a particle propagator. Likewise, we note that (41) is equivalent to the $n$-th term in the resolvent expansion of
$$\phi_t = \frac{1}{2\pi i}\int_{-i\varepsilon+\mathbb{R}}dz\,e^{-itz}\,\frac{1}{H_\omega - z}\,\phi_0. \tag{42}$$
By the analyticity of the integrand in (41) with respect to the variable $\alpha$, the path of the $\alpha$-integration can, for any fixed $n\in\mathbb{N}$, be deformed into the closed contour $I = I_0\cup I_1$, away from $\mathbb{R}$, with
$$I_0 := [-1,13], \qquad I_1 := ([-1,13]-i)\cup(-1-i(0,1])\cup(13-i(0,1]), \tag{43}$$
which encloses $\operatorname{spec}(-\Delta - i\varepsilon) = [0,12] - i\varepsilon$.

Next, we apply the time partitioning method introduced in [4]. To this end, we choose $\kappa\in\mathbb{N}$ with $1\ll\kappa\ll\varepsilon^{-1}$, and subdivide $[0,t]$ into $\kappa$ subintervals bounded by $\theta_j = \frac{jt}{\kappa}$, $j = 1,\dots,\kappa$. Then,
$$R_{N,t} = -i\lambda\sum_{j=0}^{\kappa-1}e^{-i(t-\theta_{j+1})H_\omega}\int_{\theta_j}^{\theta_{j+1}}ds\,e^{-i(\theta_{j+1}-s)H_\omega}\,V_\omega\,\phi_{N-1,s}. \tag{44}$$
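Let $\phi_{n,N,\theta}(s)$ denote the $n$-th Duhamel term, conditioned on the requirement that the first $N$ collisions occur in the time interval $[0,\theta]$, and all remaining $n-N$ collisions take place in the time interval $(\theta,s]$. That is,
$$\phi_{n,N,\theta}(s) := (-i\lambda)^{n-N}\int_{\mathbb{R}_+^{n-N+1}}ds_0\cdots ds_{n-N}\,\delta\Big(\sum_{j=0}^{n-N}s_j - (s-\theta)\Big)\,e^{is_0\frac{\Delta}{2}}V_\omega\cdots V_\omega\,e^{is_{n-N}\frac{\Delta}{2}}\,V_\omega\,\phi_{N-1,\theta}. \tag{45}$$
Moreover, let
$$\tilde\phi_{n,N,\theta}(s) := -i\lambda V_\omega\,\phi_{n-1,N,\theta}(s) \tag{46}$$
denote its "truncated" counterpart. Further expanding $e^{-isH_\omega}$ in (44) into a truncated Duhamel series with $3N$ terms, we find
$$R_{N,t} = R_{N,t}^{(<4N)} + R_{N,t}^{(4N)}, \tag{47}$$
where
$$R_{N,t}^{(<4N)} = \sum_{j=1}^\kappa\sum_{n=N}^{4N-1}e^{-i(t-\theta_j)H_\omega}\,\phi_{n,N,\theta_{j-1}}(\theta_j) \tag{48}$$
and
$$R_{N,t}^{(4N)} = \sum_{j=1}^\kappa e^{-i(t-\theta_j)H_\omega}\int_{\theta_{j-1}}^{\theta_j}ds\,e^{-i(\theta_j-s)H_\omega}\,\tilde\phi_{4N,N,\theta_{j-1}}(s). \tag{49}$$
By the Schwarz inequality,
$$\big\|R_{N,t}^{(<4N)}\big\|_2 \le 3N\kappa\sup_{N\le n<4N,\ 1\le j\le\kappa}\big\|\phi_{n,N,\theta_{j-1}}(\theta_j)\big\|_2 \tag{50}$$
and
$$\big\|R_{N,t}^{(4N)}\big\|_2 \le t\,\sup_{1\le j\le\kappa}\ \sup_{s\in[\theta_{j-1},\theta_j]}\big\|\tilde\phi_{4N,N,\theta_{j-1}}(s)\big\|_2, \tag{51}$$
for every fixed realization of $V_\omega$.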
Let $r\in2\mathbb{N}$, and let
$$W_{t;n_1,n_2}(x,v) := 8\sum_{\substack{y,z\in\Lambda_L\\ y+z=2x}}\overline{\psi_{n_2,t}(y)}\,\psi_{n_1,t}(z)\,e^{2\pi i(y-z)\cdot v}, \tag{52}$$
for $x\in\Lambda_{L,\frac12}$, denote the $(n_1,n_2)$-th term in the Wigner distribution, with
$$\psi_{n,t} := \begin{cases}\phi_{n,t} & \text{if } n < N,\\[2pt] \sum_{j=1}^\kappa e^{-i(t-\theta_j)H_\omega}\,\phi_{n,N,\theta_{j-1}}(\theta_j) & \text{if } N\le n<4N,\\[2pt] R_{N,t}^{(4N)} & \text{if } n = 4N.\end{cases} \tag{53}$$
We note that Fourier transformation with respect to $x\in\Lambda_{L,\frac12}$ (see (8)) yields
$$\widehat W_{t;n_1,n_2}(\xi,v) = \overline{\hat\psi_{n_2,t}\big(v-\tfrac{\xi}{2}\big)}\,\hat\psi_{n_1,t}\big(v+\tfrac{\xi}{2}\big), \tag{54}$$
4N
1 t;n 1 ,n 2 − E J λ2 , W t;n 1 ,n 2 r r , E J λ2 , W
(55)
n 1 ,n 2 =0
and we distinguish the following cases. If n 1 , n 2 < N , we note that 1 t;n 1 ,n 2 − E J λ2 , W t;n 1 ,n 2 r r E J λ2 , W
1 t;n 1 ,n 2 r r = E2−conn J λ2 , W r r1 2 2 ξ ξ φn 1 ,t v + = E2−conn dξ dv J λ2 (ξ, v)φn 2 ,t v − , 2 2
(56) where E2−conn denotes the expectation based on 2-connected graphs, cf. Definition 5.1 below. If N ≤ n i ≤ 4N for at least one value of i, we use ξ ξ ψn 1 ,t v + | Jλ2 , Wt;n 1 ,n 2 | = dξ dv Jλ2 (ξ, v)ψn 2 ,t v − 2 2 ≤ dξ sup | Jλ2 (ξ, v)| ψn 1 ,t 2 ψn 2 ,t 2 (57) v
and
2T3
dξ sup | J λ2 (ξ, v)| < c. v
(58)
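Then, for constants $C$ which are independent of $\varepsilon$, we obtain the following estimates. If $n_1 < N$ and $N \le n_2 < 4N$, the Schwarz inequality implies
\[ \begin{aligned} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} &\le C \Big( \big( \mathbb{E}\big[ \|\psi_{n_1,t}\|_2^r \|\psi_{n_2,t}\|_2^r \big] \big)^{\frac1r} + \mathbb{E}\big[ \|\psi_{n_1,t}\|_2 \|\psi_{n_2,t}\|_2 \big] \Big) \\ &\le C \Big( \big( \mathbb{E}\|\psi_{n_1,t}\|_2^{2r} \big)^{\frac{1}{2r}} \big( \mathbb{E}\|\psi_{n_2,t}\|_2^{2r} \big)^{\frac{1}{2r}} + \big( \mathbb{E}\|\psi_{n_1,t}\|_2^{2} \big)^{\frac12} \big( \mathbb{E}\|\psi_{n_2,t}\|_2^{2} \big)^{\frac12} \Big). \end{aligned} \tag{59} \]
Thus, if $n_1 < N$, $N \le n_2 < 4N$,
\[ \begin{aligned} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} \le C\kappa N \Big[ &\sup_j \big( \mathbb{E}\|\phi_{n_1,t}\|_2^{2r} \big)^{\frac{1}{2r}} \big( \mathbb{E}\|\phi_{n_2,N,\theta_{j-1}}(\theta_j)\|_2^{2r} \big)^{\frac{1}{2r}} \\ &+ \sup_j \big( \mathbb{E}\|\phi_{n_1,t}\|_2^{2} \big)^{\frac12} \big( \mathbb{E}\|\phi_{n_2,N,\theta_{j-1}}(\theta_j)\|_2^{2} \big)^{\frac12} \Big], \end{aligned} \tag{60} \]
while for $n_1 < N$, $n_2 = 4N$,
\[ \begin{aligned} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,4N} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,4N} \rangle \big|^r \Big] \Big)^{\frac1r} \le Ct \Big[ &\sup_j \sup_{s \in [\theta_{j-1}, \theta_j]} \big( \mathbb{E}\|\phi_{n_1,t}\|_2^{2r} \big)^{\frac{1}{2r}} \big( \mathbb{E}\|\tilde\phi_{4N,N,\theta_{j-1}}(s)\|_2^{2r} \big)^{\frac{1}{2r}} \\ &+ \sup_j \sup_{s \in [\theta_{j-1}, \theta_j]} \big( \mathbb{E}\|\phi_{n_1,t}\|_2^{2} \big)^{\frac12} \big( \mathbb{E}\|\tilde\phi_{4N,N,\theta_{j-1}}(s)\|_2^{2} \big)^{\frac12} \Big]. \end{aligned} \tag{61} \]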
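If $N \le n_1, n_2 \le 4N$, we use the Schwarz inequality in the form
\[ \begin{aligned} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} &\le C \Big( \big( \mathbb{E}\big[ ( \|\psi_{n_1,t}\|_2 + \|\psi_{n_2,t}\|_2 )^{2r} \big] \big)^{\frac1r} + \mathbb{E}\big[ \|\psi_{n_1,t}\|_2 \|\psi_{n_2,t}\|_2 \big] \Big) \\ &\le C \sum_{j=1}^{2} \Big( \big( \mathbb{E}\|\psi_{n_j,t}\|_2^{2r} \big)^{\frac1r} + \mathbb{E}\|\psi_{n_j,t}\|_2^{2} \Big). \end{aligned} \tag{62} \]
Hence, for $N \le n_1, n_2 \le 4N$,
\[ \begin{aligned} \sum_{N \le n_1, n_2 \le 4N} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} \le\ & C(N\kappa)^2 \Big[ \sup_j \sup_{N \le n < 4N} \big( \mathbb{E}\|\phi_{n,N,\theta_{j-1}}(\theta_j)\|_2^{2r} \big)^{\frac1r} + \sup_j \sup_{N \le n < 4N} \mathbb{E}\|\phi_{n,N,\theta_{j-1}}(\theta_j)\|_2^{2} \Big] \\ &+ Ct^2 \Big[ \sup_j \sup_{s \in [\theta_{j-1}, \theta_j]} \big( \mathbb{E}\|\tilde\phi_{4N,N,\theta_{j-1}}(s)\|_2^{2r} \big)^{\frac1r} + \sup_j \sup_{s \in [\theta_{j-1}, \theta_j]} \mathbb{E}\|\tilde\phi_{4N,N,\theta_{j-1}}(s)\|_2^{2} \Big]. \end{aligned} \tag{63} \]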
We shall next use Lemmata 4.1, 4.2, and 4.3 below to bound the above sums. From Lemma 4.1, and $((nr)!)^{\frac1r} < n^n r^n$, one obtains
1 t;n ,n r r E2−conn J λ2 , W 1
2
n 1 ,n 2
(64)
n 2
1 1 ≤ Cκ N N +3 (log )3 (cr λ2 ε−1 log ) N ε ε 1 1 1 (cλ2 ε−1 )4N 1 3 1 4N 2 4N 2 −1 ε 5 + ε 5r (log ) (cr λ ε log ) + (4N ) × . √ ε ε N! (65)
From (75) and Lemma 4.3, 1 E J λ2 , W t;n 1 ,4N − E J λ2 , W t;n 1 ,4N r r + n 1
n 2
1 1 6 1 5N 2 −1 5N +1 −2N 2 −1 ≤ Cε N κ (log ) (cr λ ε log ) . ε ε
(66)
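Finally, from Lemmata 4.2 and 4.3,
\[ \begin{aligned} \sum_{N \le n_1, n_2 \le 4N} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} \le\ & C(N\kappa)^2 \frac{(c\lambda^2\varepsilon^{-1})^{4N}}{\sqrt{N!}} \\ &+ C(N\kappa)^2 (4N)^{4N} \big( \varepsilon^{\frac15} + \varepsilon^{\frac{1}{5r}} \big) \Big( \log\frac1\varepsilon \Big)^3 \Big( c r \lambda^2 \varepsilon^{-1} \log\frac1\varepsilon \Big)^{4N} \\ &+ C\varepsilon^{-2} \kappa^{-2N} (4N)^{4N} \Big( \log\frac1\varepsilon \Big)^3 \Big( c r \lambda^2 \varepsilon^{-1} \log\frac1\varepsilon \Big)^{4N}. \end{aligned} \tag{67} \]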
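We emphasize that the bounds (64)–(67) are uniform in $L$. Consequently, for a choice of parameters
\[ \varepsilon = \frac1t = \frac{\lambda^2}{T}, \qquad N = \Big\lfloor \frac{\log\frac1\varepsilon}{100\, r \log\log\frac1\varepsilon} \Big\rfloor, \qquad \kappa = \Big\lceil \Big( \log\frac1\varepsilon \Big)^{150 r} \Big\rceil, \tag{68} \]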
we find, for sufficiently small ε, 1
1
ε− 70r < N N < ε− 100r , (4N )4N 1 (cr λ2 ε−1 log )4N ε (cλ2 ε−1 )4N √ N! κ −2N
1
< ε− 20r , 1
< ε− 50r ,
(69)
1
< ε 25r , ≤ ε3 ,
whereby it is easy to verify that 1 1 1 1 1 (64) < (log )10 ε 5r ε− 100r ε 50r < ε 20r , ε 21 1 1 1 1 1 1 10+150r − 1 1 − − ε 50r ε 25r + ε 20r (ε 5 + ε 5r )ε 50r < ε 120r , (65) < (log ) ε 1 1 1 10 −1− 1 3 − 1 2 20r ε ε 50r (66) < (log ) ε < ε2, ε 1 1 1 1 1 2+300r 1 1 (67) < (log ) ε 25r + (log )6+300r ε− 20r (ε 5 + ε 5r )ε− 50r ε ε 1 1 1 1 + (log )3 ε−2 ε3 ε− 20r ε− 50r < ε 30r . ε Collecting all of the above, and recalling (55), 1 1 1 φt − E J λ2 , W φt r r < C N ε 120r < ε 150r , E J λ2 , W
(70)
(71)
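uniformly in $L$. Hence, using Theorem 2.1,
\[ \begin{aligned} \lim_{\lambda \to 0} \lim_{L \to \infty} \Big( \mathbb{E}\Big[ \big| \langle J, W_T^{(\lambda^2)} \rangle - \langle J, F_T \rangle \big|^r \Big] \Big)^{\frac1r} \le\ & \lim_{\lambda \to 0} \lim_{L \to \infty} \Big( \mathbb{E}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{\phi_t} \rangle - \mathbb{E}\langle \hat J_{\lambda^2}, \hat W_{\phi_t} \rangle \big|^r \Big] \Big)^{\frac1r} \\ &+ \lim_{\lambda \to 0} \lim_{L \to \infty} \big| \mathbb{E}\langle J, W_T^{(\lambda^2)} \rangle - \langle J, F_T \rangle \big| = 0, \end{aligned} \tag{72} \]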
for every fixed, finite value of r ∈ 2N and T > 0. This in turn implies that (72) holds for any fixed, finite r ≥ 1 and globally in T , which establishes Theorem 2.2.
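As a sanity check on the parameter choice, the sketch below works in the variable $\log\frac1\varepsilon$ and ignores the integer rounding of $N$ and $\kappa$. It assumes that (68) reads $N \approx \log\frac1\varepsilon / (100\, r \log\log\frac1\varepsilon)$ and $\kappa \approx (\log\frac1\varepsilon)^{150r}$; under that reading one gets $N \log N < \frac{1}{100r}\log\frac1\varepsilon$, that is $N^N < \varepsilon^{-\frac{1}{100r}}$, and $2N\log\kappa = 3\log\frac1\varepsilon$, that is $\kappa^{-2N} = \varepsilon^3$. The function name is hypothetical.

```python
import math


def parameters(log_inv_eps, r):
    """Real-valued N and log(kappa) for the assumed reading of (68).

    log_inv_eps is log(1/eps); integer rounding of N and kappa is ignored.
    """
    loglog = math.log(log_inv_eps)
    n_real = log_inv_eps / (100 * r * loglog)
    log_kappa = 150 * r * loglog
    return n_real, log_kappa
```

Working with $\log\frac1\varepsilon$ directly avoids floating point underflow, since $N > 1$ already requires $\varepsilon$ far below machine precision.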
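4. Main Lemmata

In this section, we summarize the key technical lemmata needed to establish (71). The proofs are based on graph expansion techniques and estimation of high dimensional singular integrals in momentum space. To arrive at our results, we have to significantly generalize and extend methods developed in [4] and [2]. In all that follows, we suppose that $L$ is finite, but much larger than any relevant scale of the problem; for our purposes, the assumption that
\[ L \gg \varepsilon^{-r/\varepsilon} \tag{73} \]
will suffice.

Lemma 4.1. Let $\bar n := n_1 + n_2$, where $n_1, n_2 < N$. For any fixed $r \in 2\mathbb{N}$, and every $T = \lambda^2\varepsilon^{-1} > 0$, there exists a finite constant $c = c(T)$ independent of $L$ such that
\[ \Big( \mathbb{E}_{2\text{-conn}}\Big[ \big| \langle \hat J_{\lambda^2}, \hat W_{t;n_1,n_2} \rangle \big|^r \Big] \Big)^{\frac1r} \le \varepsilon^{\frac{1}{5r}} \Big( \big( \tfrac{\bar n r}{2} \big)! \Big)^{\frac1r} \Big( \log\frac1\varepsilon \Big)^3 \Big( c\lambda^2\varepsilon^{-1} \log\frac1\varepsilon \Big)^{\frac{\bar n}{2}}. \tag{74} \]
Furthermore, for any fixed $r \in 2\mathbb{N}$ and $n < N$, there is an a priori bound
\[ \big( \mathbb{E}\|\phi_{n,t}\|_2^{2r} \big)^{\frac1r} \le ((nr)!)^{\frac1r} \Big( \log\frac1\varepsilon \Big)^3 \Big( c\lambda^2\varepsilon^{-1} \log\frac1\varepsilon \Big)^{n}, \tag{75} \]
where $c$ is independent of $T$ and $L$.

The gain of a factor $\varepsilon^{\frac{1}{5r}}$ in (74) over the a priori bound (75) is the key ingredient in our proof of (71).

Lemma 4.2. For any fixed $r \in 2\mathbb{N}$, $N \le n < 4N$, and $T = \lambda^2\varepsilon^{-1}$,
\[ \big( \mathbb{E}\|\phi_{n,N,\theta_{j-1}}(\theta_j)\|_2^{2r} \big)^{\frac1r} \le \frac{(c\lambda^2\varepsilon^{-1})^n}{\sqrt{n!}} + \Big( (n!)\,\varepsilon^{\frac15} + ((nr)!)^{\frac1r} \varepsilon^{\frac{1}{5r}} \Big) \Big( \log\frac1\varepsilon \Big)^3 \Big( c'\lambda^2\varepsilon^{-1} \log\frac1\varepsilon \Big)^{n}, \tag{76} \]
for finite constants $c$ and $c' = c'(T)$ which are independent of $L$. This lemma is proved in Sect. 6.

Lemma 4.3. For any fixed $r \in 2\mathbb{N}$ and $T > 0$, there exists a finite constant $c = c(T)$ independent of $L$ such that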
1 ((4Nr )!) 2r (log 1ε )3 (cλ2 ε−1 log 1ε )4N r 2r E φ4N ,N ,θ j−1 (θ j ) 2 ≤ . κ 2N
(77)
The proof of this lemma is given in Sect. 7. We will make extensive use of the basic inequalities formulated in the following lemma.
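Lemma 4.4. For $L$ sufficiently large (e.g. for (73)),
\[ \sup_{\alpha \in I,\, k \in \mathbb{T}^3} \frac{1}{|e_\Delta(k) - \alpha - i\varepsilon|} = \frac1\varepsilon, \qquad \int_{\mathbb{T}^3} \frac{dk}{|e_\Delta(k) - \alpha - i\varepsilon|},\ \int_I \frac{|d\alpha|}{|e_\Delta(k) - \alpha - i\varepsilon|} < c \log\frac1\varepsilon. \tag{78} \]
Proof. Clearly,
\[ \int_{\Lambda_L^*} \frac{dk}{|e_\Delta(k) - \alpha - i\varepsilon|} \le \int_{\mathbb{T}^3} \frac{dk}{|e_\Delta(k) - \alpha - i\varepsilon|} + O\Big( \frac{1}{\varepsilon^2 |\Lambda_L|} \Big). \tag{79} \]
The bound $\int_{\mathbb{T}^3} \frac{dk}{|e_\Delta(k) - \alpha - i\varepsilon|} < c \log\frac1\varepsilon$ is proved in [2, 4]. The remaining cases are evident.

Moreover, we point out the following key property of the functions $\phi_{n,N,\theta_{j-1}}(s)$. From
\[ \phi_{n,N,\theta}(s) = (-i\lambda)^{n-N} \int_{\mathbb{R}_+^{n-N+1}} ds_0 \cdots ds_{n-N}\, \delta\Big( \sum_{j=0}^{n-N} s_j - (s - \theta) \Big)\, e^{i s_0 \frac{\Delta}{2}} V_\omega \cdots V_\omega e^{i s_{n-N} \frac{\Delta}{2}} V_\omega \phi_{N-1,\theta}, \tag{80} \]
where $s \in [\theta, \theta']$ and $\theta' - \theta = \frac{t}{\kappa}$, we find
\[ \begin{aligned} \big( \phi_{n,N,\theta}(s) \big)^{\wedge}(k_0) = \frac{(-\lambda)^{n-N} e^{(s-\theta)\kappa\varepsilon}}{2\pi i} &\int_I d\alpha\, e^{-i(s-\theta)\alpha} \int_{(\Lambda_L^*)^{n-N}} dk_1 \cdots dk_{n-N} \\ &\times \frac{1}{e_\Delta(k_0) - \alpha - i\kappa\varepsilon} \hat V_\omega(k_1 - k_0) \cdots \\ &\times \cdots \frac{1}{e_\Delta(k_{n-N}) - \alpha - i\kappa\varepsilon} \hat V_\omega(k_{n-N+1} - k_{n-N}) \\ &\times \hat\phi_{N-1,\theta}(k_{n-N+1}), \end{aligned} \tag{81} \]
recalling that $\varepsilon = \frac1t$, and with
\[ \begin{aligned} \hat\phi_{N-1,\theta}(k_{n-N+1}) = \frac{(-\lambda)^{N} e^{\varepsilon\theta}}{2\pi i} &\int_I d\alpha\, e^{-i\alpha\theta} \int_{(\Lambda_L^*)^{N}} \prod_{j=n-N+1}^{n} dk_j \\ &\times \frac{1}{e_\Delta(k_{n-N+1}) - \alpha - i\varepsilon} \hat V_\omega(k_{n-N+2} - k_{n-N+1}) \cdots \\ &\times \cdots \hat V_\omega(k_n - k_{n-1}) \frac{1}{e_\Delta(k_n) - \alpha - i\varepsilon} \hat\phi_0(k_n). \end{aligned} \tag{82} \]
The key observation here is that there are $n - N + 1$ propagators with imaginary part $-\kappa\varepsilon$ in the denominator, where $\kappa\varepsilon \gg \varepsilon$, and $N$ propagators where the corresponding imaginary part is $-\varepsilon$. Therefore, we have a bound
\[ \frac{1}{|e_\Delta(p) - \alpha - i\kappa\varepsilon|} \le \frac{1}{\kappa\varepsilon} \ll \frac1\varepsilon \tag{83} \]
for $n - N + 1$ propagators, which is much smaller than the bound $\frac{1}{|e_\Delta(p) - \alpha - i\varepsilon|} \le \frac1\varepsilon$. This gain of a factor $\frac1\kappa$ as compared to (78) is exploited in the time partitioning method, and is applied systematically in the proof of Lemma 4.3.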
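The logarithmic size of the $L^1$-type bounds in (78) can be seen in closed form for the integral over the real segment $[-1, 13]$ of the contour. The sketch below is only an illustration under simplifying assumptions: it treats the energy `e` as a fixed real number in $[0, 12]$, it covers only the real segment and not the full contour $I$ or the momentum integral over $\mathbb{T}^3$, and the constant 4 used in the accompanying check is an arbitrary choice, not the constant $c$ of the lemma. The second helper records the sup bounds $\frac1\varepsilon$ and $\frac{1}{\kappa\varepsilon}$ from (83). All names are hypothetical.

```python
import math


def resolvent_l1_over_segment(e, eps, a=-1.0, b=13.0):
    """Exact integral of 1/|e - alpha - i*eps| over real alpha in [a, b].

    Uses the antiderivative asinh((alpha - e)/eps) of 1/sqrt((alpha - e)^2 + eps^2).
    """
    return math.asinh((b - e) / eps) + math.asinh((e - a) / eps)


def propagator_sup_bound(eps, kappa=1):
    """Sup bound 1/(kappa*eps) for a propagator with imaginary part kappa*eps."""
    return 1.0 / (kappa * eps)
```

For small `eps` the first function behaves like $2\log\frac1\varepsilon$ plus a bounded term, while the ratio of the two sup bounds is exactly $\kappa$, which is the gain used in the time partitioning argument.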
5. Proof of Lemma 4.1

We recall that
t;n 1 ,n 2 r E2−conn J λ2 , W 0 k + k = E2−conn dkdk J λ2 (k − k , ) ∗ ×∗ 2 L , 21
L
2 r 1 2 ×φn 2 ,t (k)φn 1 ,t (k ) ,
(84)
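and note that $\hat J_{\lambda^2}$ forces $|k - k' \ (\mathrm{mod}\ 2\mathbb{T}^3)| < c\lambda^2$, while $|k + k' \ (\mathrm{mod}\ 2\mathbb{T}^3)|$ is essentially unrestricted. Next, we introduce the following multi-index notation. As $n_1, n_2$ will remain fixed in the proof, let for brevity $n \equiv n_1$, and $\bar n \equiv n_1 + n_2$. For $j = 1, \dots, r$, let
\[ \begin{aligned} k^{(j)} &:= (k_0^{(j)}, \dots, k_{\bar n+1}^{(j)}), \\ dk^{(j)} &:= \prod_{\ell=0}^{\bar n+1} dk_\ell^{(j)}, \\ dk^{(j)}_{\hat J_{\lambda^2}} &:= \prod_{\ell=0}^{\bar n+1} dk_\ell^{(j)}\ \hat J_{\lambda^2}\Big( k_n^{(j)} - k_{n+1}^{(j)}, \frac{k_n^{(j)} + k_{n+1}^{(j)}}{2} \Big), \\ K^{(j)}[k^{(j)}, \alpha_j, \beta_j, \varepsilon] &:= \prod_{\ell=0}^{n} \frac{1}{e_\Delta(k_\ell^{(j)}) - \alpha_j - i\varepsilon_j} \prod_{\ell=n+1}^{\bar n+1} \frac{1}{e_\Delta(k_\ell^{(j)}) - \beta_j + i\varepsilon_j}, \\ U^{(j)}[k^{(j)}] &:= \prod_{\ell=1}^{n} \hat V_\omega(k_\ell^{(j)} - k_{\ell-1}^{(j)}) \prod_{\ell=n+2}^{\bar n+1} \hat V_\omega(k_\ell^{(j)} - k_{\ell-1}^{(j)}), \end{aligned} \tag{85} \]
where $\varepsilon_j := (-1)^j \varepsilon$, $\alpha_j \in I$, and $\beta_j \in \bar I$ (the complex conjugate of $I$). On the last line, we note that $\overline{\hat V_\omega(k)} = \hat V_\omega(-k)$. Moreover, we introduce the notation
\[ \alpha := (\alpha_1, \dots, \alpha_r), \qquad d\alpha := \prod_{j=1}^{r} d\alpha_j, \tag{86} \]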
and likewise for β, ξ and dβ, dξ . Then, e2r εt λr n¯ −it rj=1 (−1) j (α j −β j ) (84) = dαdβe (2π )2r (I × I¯)r r r / / ( j) × dk J E2−conn U ( j) [k ( j) ] ¯ (T3 )(n+2)r
×
r / j=1
j=1
λ2
j=1
( j) (k ( j) ), ( j) (k ( j) )φ K ( j) [k ( j) , α j , β j , ε]φ 0 0 0 n+1 ¯
(87)
where ( j) := φ 0
0 if j is even φ 0 if jis odd. φ
(88)
The expectation $\mathbb{E}_{2\text{-conn}}$ (defined in (91) below) in (87) produces a sum of $O((\bar n r)!)$ singular integrals with complicated delta distribution insertions. We organize them by use of (Feynman) graphs, which we define next, see also Fig. 1. We consider graphs comprising $r$ parallel, horizontal solid lines, which we refer to as particle lines, each containing $\bar n$ vertices enumerated from the left, which account for copies of the random potential $\hat V_\omega$. Between the $n$-th and the $(n+1)$-th $\hat V_\omega$-vertex, we insert a distinguished vertex to account for the contraction with $\hat J_{\lambda^2}$ (henceforth referred to as the "$\hat J_{\lambda^2}$-vertex"). Then, the $n$ edges on the left of the $\hat J_{\lambda^2}$-vertex correspond to the propagators in $\hat\psi_{n,t}$ resp. $\overline{\hat\psi_{n,t}}$, while the $\bar n - n$ edges on the right correspond to those in $\overline{\hat\psi_{\bar n-n,t}}$ resp. $\hat\psi_{\bar n-n,t}$. We shall refer to those edges, labeled by the momentum variables $k_\ell^{(j)}$, as propagator lines. The expectation produces a sum over all possible products of $\frac{r\bar n}{2}$ delta distributions, each standing for one contraction between a pair of random potentials. We connect every pair of mutually contracted random potentials with a dashed contraction line. We then identify the contraction type with the corresponding graph.

We remark that what is defined here as one particle line was referred to as a pair of particle lines joined by a $\hat J_{\lambda^2}$-, or respectively, a $\delta$-vertex in [2, 4]. Thus, according to the terminology of [2, 4], we would here be discussing the case of $2r$ particle lines. Due to the different emphasis in the work at hand, the convention introduced here appears to be more convenient.

We particularly distinguish the class of completely disconnected graphs, in which random potentials are mutually contracted only if they are located on the same particle line. Clearly, all of its members possess $r$ connectivity components. All other contraction types are referred to as non-disconnected graphs.
A particular subfamily of non-disconnected graphs, referred to as 2-connected graphs, is defined by the property that every connectivity component has at least two particle lines. Accordingly, we may now provide the following definitions which were in part already anticipated in the preceding discussion.
Fig. 1. A (completely connected) contraction graph for the case $r = 6$, $n = 3$, $\bar n = 7$. The $\hat J_{\lambda^2}$-vertices are drawn in black, while the $\hat V_\omega$-vertices are shown in white. The $r$ particle lines are solid, while the lines corresponding to contractions of pairings of random potentials are dashed. For $j = 3$ in the notation of (87), the momenta $k_0^{(3)}$ and $k_{\bar n+1}^{(3)}$ are written above the corresponding propagator lines
Definition 5.1. Let
\[ \mathbb{E}_{disc}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big] := \prod_{j=1}^{r} \mathbb{E}\big[ U^{(j)}[k^{(j)}] \big] \tag{89} \]
include contractions among random potentials $\hat V_\omega$ only if they lie on the same particle line. We refer to $\mathbb{E}_{disc}$ as the expectation based on completely disconnected graphs. We denote by
\[ \mathbb{E}_{n\text{-}d}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big] := \mathbb{E}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big] - \mathbb{E}_{disc}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big], \tag{90} \]
the expectation based on non-disconnected graphs, defined by the condition that there is at least one connectivity component comprising more than one particle line. Moreover, we refer to
\[ \mathbb{E}_{2\text{-conn}}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big] := \mathbb{E}\Big[ \prod_{j=1}^{r} \Big( U^{(j)}[k^{(j)}] - \mathbb{E}\big[ U^{(j)}[k^{(j)}] \big] \Big) \Big] \tag{91} \]
as the expectation based on 2-connected graphs.

For $r\bar n \in 2\mathbb{N}$, let $\Gamma^{(\hat J_{\lambda^2})}_{r;\bar n,n}$ denote the set of all graphs on $r \in \mathbb{N}$ particle lines, each containing $\bar n$ $\hat V_\omega$-vertices, and each with the $\hat J_{\lambda^2}$-vertex located between the $n$-th and $(n+1)$-th $\hat V_\omega$-vertex. Then,
\[ \mathbb{E}\Big[ \prod_{j=1}^{r} U^{(j)}[k^{(j)}] \Big] = \sum_{\pi \in \Gamma^{(\hat J_{\lambda^2})}_{r;\bar n,n}} \delta_\pi(k^{(1)}, \dots, k^{(r)}), \tag{92} \]
where $\delta_\pi(k^{(1)}, \dots, k^{(r)})$ is defined as follows. There are $\frac{r\bar n}{2}$ pairing contractions between $\hat V_\omega$-vertices in $\pi$. Every (dashed) contraction line connects a random potential $\hat V_\omega(k_{n_1+1}^{(j_1)} - k_{n_1}^{(j_1)})$ with a random potential $\hat V_\omega(k_{n_2+1}^{(j_2)} - k_{n_2}^{(j_2)})$, for some pair of multi-indices $((j_1, n_1), (j_2, n_2))$ determined by $\pi$, for which
\[ \mathbb{E}\big[ \hat V_\omega(k_{n_1+1}^{(j_1)} - k_{n_1}^{(j_1)})\, \hat V_\omega(k_{n_2+1}^{(j_2)} - k_{n_2}^{(j_2)}) \big] = \delta(k_{n_1+1}^{(j_1)} - k_{n_1}^{(j_1)} + k_{n_2+1}^{(j_2)} - k_{n_2}^{(j_2)}). \tag{93} \]
Then, $\delta_\pi(k^{(1)}, \dots, k^{(r)})$ is given by the product of deltas (where $\delta(0) = |\Lambda_L|$ and $\delta(k) = 0$ if $k \ne 0$) over all pairs of multi-indices $((n_1; j_1), (n_2; j_2))$ determined by $\pi$.

We refer to a graph with a single connectivity component as a completely connected graph. We denote the subset of $\Gamma^{(\hat J_{\lambda^2})}_{r;\bar n,n}$ consisting of completely connected graphs by $\Gamma^{(\hat J_{\lambda^2})\,conn}_{r;\bar n,n}$. Clearly, any graph $\pi \in \Gamma^{(\hat J_{\lambda^2})}_{r;\bar n,n}$ is the disjoint union of its completely connected components $\pi_j \in \Gamma^{(\hat J_{\lambda^2})\,conn}_{s_j;\bar n,n}$ with $\sum_j s_j = r$. Accordingly, (84) factorizes into the corresponding Feynman amplitudes, $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi) = \prod_j \mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j)$.
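The graph classes of Definition 5.1 can be made concrete by brute-force enumeration for very small $r$ and $\bar n$. The sketch below is a toy illustration with hypothetical names: it enumerates only pair contractions of the $r\bar n$ potential vertices, ignores the $\hat J_{\lambda^2}$-vertex (which does not affect connectivity), and groups particle lines into connectivity components with a union-find structure. It then counts completely disconnected graphs ($r$ components), 2-connected graphs (every component has at least two particle lines), and completely connected graphs (one component).

```python
def pairings(items):
    """Yield all perfect matchings of a list with an even number of items."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def line_components(pairing, r):
    """Connectivity components of the r particle lines under the contractions."""
    parent = list(range(r))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (j1, _), (j2, _) in pairing:
        parent[find(j1)] = find(j2)
    groups = {}
    for j in range(r):
        groups.setdefault(find(j), []).append(j)
    return list(groups.values())


def classify(r, nbar):
    """Count contraction graphs on r particle lines with nbar vertices each."""
    vertices = [(j, i) for j in range(r) for i in range(nbar)]
    counts = {"total": 0, "disconnected": 0, "two_connected": 0, "completely_connected": 0}
    for pairing in pairings(vertices):
        comps = line_components(pairing, r)
        counts["total"] += 1
        if len(comps) == r:
            counts["disconnected"] += 1
        if all(len(c) >= 2 for c in comps):
            counts["two_connected"] += 1
        if len(comps) == 1:
            counts["completely_connected"] += 1
    return counts
```

The total count is $(r\bar n - 1)!!$, which is bounded by $(r\bar n)!$; the enumeration is only feasible for a handful of vertices and is meant as a check of the definitions, not as a computational tool.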
We may thus restrict our attention to completely connected graphs. ( J 2 )conn
λ Let π ∈ s;n,n ¯
A mp J 2 (π ) := λ
with s ≥ 1. Its Feynman amplitude is given by
λs n¯ e2sεt (2π )2s ×
s
dαdβe−it j=1 (−1) (α j −β j ) (I × I¯)s s / dk ( j) δπ (k (1) , . . . , k (s) )
¯ (∗L )(n+2)s
×
×
(∗ 1 )s L, 2 s /
j
j=1
dξ
s /
J λ2 (ξ j ,
( j) kn
( j) + kn+1
2
j=1
( j)
( j)
)δ(kn − kn+1 − ξ j )
( j) (k ( j) ). ( j) (k ( j) )φ K ( j) [k ( j) , α j , β j , ε]φ 0 0 0 n+1 ¯
(94)
j=1
dk by T3 dk, and the scaled Kronecker deltas on ∗L by delta distributions on T3 , we define λs n¯ e2sεt −it sj=1 (−1) j (α j −β j ) Amp J 2 (π ) := dαdβe λ (2π )2s (I × I¯)s s / × dk ( j) δπ (k (1) , . . . , k (s) )
Replacing
∗L
¯ (T3 )(n+2)s
× ×
(2T3 )s s /
j=1
dξ
s / j=1
( j) ( j) kn + kn+1 ( j) ( j) )δ(kn − kn+1 − ξ j ) J λ2 (ξ j , 2
( j) (k ( j) ), ( j) (k ( j) )φ K ( j) [k ( j) , α j , β j , ε]φ 0 0 0 n+1 ¯
(95)
j=1
which is independent of $L$. It is obvious that $\widetilde{\mathrm{Amp}}_{\hat J_{\lambda^2}}(\pi)$ is a discretization of $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)$ on a grid of lattice spacing $O(\frac1L)$. The discretization error is bounded in the following lemma.

Lemma 5.1.
\[ \big| \widetilde{\mathrm{Amp}}_{\hat J_{\lambda^2}}(\pi) - \mathrm{Amp}_{\hat J_{\lambda^2}}(\pi) \big| < \frac{C(\bar n, \varepsilon)}{|\Lambda_L|}, \tag{96} \]
where $C(\bar n, \varepsilon) \le O\Big( \frac{(s\bar n)^2}{\varepsilon^{s(\bar n+2)+1}} \Big)$.

Proof. The integrand in $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)$ contains $s(\bar n + 2)$ resolvent multipliers, each of which is bounded by $\frac1\varepsilon$ in $L^\infty(\mathbb{T}^3)$, and by $\frac{1}{\varepsilon^2}$ in $C^1(\mathbb{T}^3)$. It is demonstrated in our discussion below how to systematically integrate out all deltas in (95). Replacing the integral over $\mathbb{T}^3$ by the sum over $\Lambda_L^*$ for each momentum remaining after integrating out the delta distributions in (95) (using $\xi$ to integrate out the functions $\hat J_{\lambda^2}$, see (58)) yields an error of order $O(\frac{1}{|\Lambda_L|})$, multiplied with the sum of first derivatives of the integrand with respect to each momentum. That integrand is given by a product of $(\bar n + 2)s$ resolvent multipliers; differentiation with respect to the momentum variables yields a sum in which each term can be bounded by $\varepsilon^{-s(\bar n+2)-1}$. Moreover, this sum comprises no more than $(s(\bar n + 2))^2$ terms (where $s \le r$ is fixed).

For the truncated Duhamel series (37), we have to estimate amplitudes of the form (95) for $\bar n$ up to $\bar n \le 4N \le O(\log\frac1\varepsilon)$, see (69) and (48), (49) (for $\bar n = 4N$, there are $2r$ propagators less, and the denominators of some propagators have an imaginary part $\kappa\varepsilon$ instead of $\varepsilon$, see Sect. 6; this only improves the bounds considered here). Thus, for $L \gg \varepsilon^{-r/\varepsilon} \ge \varepsilon^{-s/\varepsilon}$ (see (73)), the discretization error is smaller than $O(\varepsilon)$. Accordingly, we shall henceforth only consider $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)$, and assume $L$ to be sufficiently large for the discretization errors to be negligible in all cases under consideration. In particular, all bounds obtained in the sequel will be uniform in $L$, and we recall that we are sending $L$ to $\infty$ first before taking any other limits.

The following key lemma is in part a joint result with Laszlo Erdös.

Lemma 5.2. Let $s \ge 2$, $s\bar n \in 2\mathbb{N}$, and let $\pi \in \Gamma^{(\hat J_{\lambda^2})\,conn}_{s;\bar n,n}$ be a completely connected graph. Then, there exists a finite constant $c = c(T)$ independent of $L$ such that
\[ |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)| \le \varepsilon^{\frac15} \Big( \log\frac1\varepsilon \Big)^3 \Big( c\lambda^2\varepsilon^{-1} \log\frac1\varepsilon \Big)^{\frac{s\bar n}{2}}, \tag{97} \]
for every $T = \lambda^2\varepsilon^{-1} > 0$.
5.1. Classification of contractions. For the proof of Lemma 5.2, we classify the contractions among random potentials appearing in $\delta_\pi$ beyond the typification introduced in [4] and [2]. We define the following types of delta distributions.

Definition 5.2. A delta distribution of the form
\[ \delta(k_{i+1}^{(j)} - k_i^{(j)} + k_{i'+1}^{(j)} - k_{i'}^{(j)}), \qquad |i - i'| \ge 1 \tag{98} \]
which connects the $i$-th with the $i'$-th vertex on the same particle line is called an internal delta. The corresponding contraction line in the graph is an internal contraction. An internal delta with $|i - i'| = 1$ is called an immediate recollision. A delta distribution of the form
\[ \delta(k_{i+1}^{(j)} - k_i^{(j)} + k_{i'+1}^{(j')} - k_{i'}^{(j')}), \qquad j \ne j' \tag{99} \]
which connects the $i$-th vertex on the $j$-th particle line with the $i'$-th vertex on the $j'$-th particle line is called a transfer delta. The corresponding contraction line is referred to as a transfer contraction, and labeled by $((i; j), (i'; j'))$. A vertex that is adjacent to a transfer contraction is called a transfer vertex.
5.2. Reduction to the $L^4$-problem. Assume that $s \ge 2$. Given a completely connected graph $\pi \in \Gamma^{(\hat J_{\lambda^2})\,conn}_{s;\bar n,n}$, we enumerate the transfer contraction lines by $\ell \in \{1, \dots, m\}$ (with $m$ denoting the number of transfer contraction lines in $\pi$).

We decompose $\pi \in \Gamma^{(\hat J_{\lambda^2})\,conn}_{s;\bar n,n}$ into $s$ reduced 1-particle lines as follows, see also Fig. 2. Assume that the $\ell$-th transfer contraction is labeled by $((i_\ell; j), (i'_\ell; j'))$. We replace the corresponding transfer delta by the product
\[ \delta\big( k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} + k_{i'_\ell+1}^{(j')} - k_{i'_\ell}^{(j')} \big) \to \delta\big( k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} + u_\ell \big)\, \delta\big( k_{i'_\ell+1}^{(j')} - k_{i'_\ell}^{(j')} - u_\ell \big), \tag{100} \]
where the first factor is attributed to the $j$-th, and the second factor to the $j'$-th particle line. We say that $\delta(k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} + u_\ell)$ couples the vertex $(i_\ell; j)$ to the new variable $u_\ell$. We refer to $u_\ell$ as the transfer momentum corresponding to the $\ell$-th transfer contraction line, and to $\delta(k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} + u_\ell)$ as the reduced transfer delta on the $j$-th particle line (parametrized by $u_\ell$). We factorize every transfer delta, and associate each reduced transfer delta to the corresponding contraction line. Let $u^{(j)}$ comprise all transfer momenta $u_\ell$ which couple to a transfer vertex on the $j$-th particle line. We define
\[ \delta_{int}(k^{(j)}) := \prod_{\text{internal deltas}} \delta(k_{i+1}^{(j)} - k_i^{(j)} + k_{i'+1}^{(j)} - k_{i'}^{(j)}) \tag{101} \]
and
\[ \delta^{(j)}(u^{(j)}, k^{(j)}) := \delta_{int}(k^{(j)}) \prod_{\substack{u_\ell \text{ belonging to } u^{(j)} \\ \text{on } j\text{-th particle line}}} \delta(k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} \pm u_\ell), \tag{102} \]
which comprises all deltas on the $j$-th particle line, including the corresponding factors from the modified transfer deltas. Moreover, every vertex carries a factor $\lambda$.
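The factorization (100) rests on the elementary identity $\delta(x + y) = \int du\, \delta(x + u)\, \delta(y - u)$, with $x$ and $y$ standing for the momentum differences on the two particle lines joined by the transfer contraction. The following toy check verifies the discrete analogue on the cyclic group $\mathbb{Z}_L$ in one dimension, using unscaled Kronecker deltas (the paper's convention $\delta(0) = |\Lambda_L|$ with the matching integration measure is not reproduced here). The function names are hypothetical.

```python
def kronecker(k, L):
    """Unscaled Kronecker delta on the cyclic group Z_L."""
    return 1 if k % L == 0 else 0


def factorized_delta(x, y, L):
    """Sum over the transfer momentum u of delta(x + u) * delta(y - u)."""
    return sum(kronecker(x + u, L) * kronecker(y - u, L) for u in range(L))
```

Summing over the transfer momentum `u` recovers the original transfer delta, which is why the full amplitude is obtained by integrating the product of reduced 1-particle amplitudes over all transfer momenta.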
Fig. 2. The decomposition of the graph $\pi$ in Fig. 1 into reduced 1-particle lines, with the exception of the particle lines labeled by $j = 1$ and $j = 2$. A numbered vertex with label $\ell$ accounts for a reduced transfer delta carrying the transfer momentum $u_\ell$, and a label $-\ell$ accounts for one carrying a transfer momentum $-u_\ell$. In this example, unfilled numbered transfer vertices carry transfer momenta used for $L^\infty$-bounds in (131), while shaded transfer vertices carry transfer momenta used for $L^1$-bounds
Definition 5.3. The j th reduced 1-particle graph π j (u ( j) ) comprises the j th particle line, nV ¯ ω -vertices, one J λ2 -vertex, all internal contractions, but none of the transfer contraction lines. The transfer vertices carry the reduced transfer deltas, and are parametrized by u ( j) . Accordingly, we refer to Amp J 2 (π j (u ( j) )) := λ
λn¯ e2sεt (2π )2 × ×
I × I¯
¯ (T3 )n+2
2T3
dα j dβ j e−it (−1)
j (α
j −β j )
dk ( j) δ ( j) (u ( j) , k ( j) )
dξ j J λ2 (ξ j ,
( j)
( j)
kn + kn+1 ( j) ( j) )δ(kn − kn+1 − ξ j ) 2
( j) (k ( j) ) ( j) (k ( j) )φ ×K ( j) [k ( j) , α j , β j , ε]φ 0 0 0 n+1 ¯
(103)
as the $j$-th reduced 1-particle amplitude. The amplitude $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)$ is obtained from the product of all reduced 1-particle amplitudes, by integrating over the transfer momenta.

Lemma 5.3 (Factorization lemma). Assume that $\pi \in \Gamma^{(\hat J_{\lambda^2})\,conn}_{s;\bar n,n}$, for $s \ge 2$, carries the transfer momenta $u = (u_1, \dots, u_m)$. Let $\pi_j(u^{(j)})$, for $j = 1, \dots, s$, denote the $j$-th reduced 1-particle graph. Then,
\[ \mathrm{Amp}_{\hat J_{\lambda^2}}(\pi) = \int du_1 \dots du_m \prod_{j=1}^{s} \mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j)})). \tag{104} \]
Notably, every $u_\ell$ in $u$ appears in precisely two different reduced 1-particle amplitudes (once with each sign).

Next, we reduce the problem for $s \ge 2$ to the problem $s = 2$ (corresponding to a completely connected $L^4$-graph). To this end, let us assume that $\pi$ contains $m$ transfer contractions, carrying the transfer momenta $u = (u_1, \dots, u_m)$. Then, by Lemma 5.3,
\[ |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)| \le \int du \prod_{j=1}^{s} |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j)}))|, \tag{105} \]
where $u^{(j)}$ denotes the subset of $m_j$ transfer momenta which couple to the $j$-th particle line. Moreover, let $u^{(j;i)}$ denote the subset of transfer momenta in $u^{(j)}$ belonging to transfer contractions between the $j$-th and the $i$-th reduced 1-particle line. We recall that every transfer momentum appears in precisely two reduced 1-particle amplitudes. Hence,
\[ u^{(j;i)} = u^{(i;j)} \ \text{ for all } i \ne j, \quad \text{and} \quad u^{(i;i)} = \emptyset \ \text{ for all } i. \tag{106} \]
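Assuming that $u^{(s-1;s)} \ne \emptyset$ (possibly after relabeling the particle lines),
\[ \begin{aligned} |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi)| \le\ & \int du^{(1;2)} \cdots du^{(1;s)}\, du^{(2;3)} \cdots\cdots du^{(s-2;s-1)}\, du^{(s-2;s)}\, du^{(s-1;s)} \\ & \times \prod_{j=1}^{s} |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j;1)}, \dots, u^{(j;j-1)}, u^{(j;j+1)}, \dots, u^{(j;s)}))| \\ \le\ & \int du^{(1;2)} \cdots du^{(1;s)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_1(u^{(1;2)}, \dots, u^{(1;s)}))| \\ & \times \sup_{u^{(1;2)}} \int du^{(2;3)} \cdots du^{(2;s)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_2(u^{(2;1)}, u^{(2;3)}, \dots, u^{(2;s)}))| \\ & \cdots\cdots \\ & \times \sup_{\substack{u^{(\ell;s-1)}, u^{(\ell;s)} \\ 1 \le \ell \le s-2}} \Big[ \int du^{(s-1;s)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_{s-1}(u^{(s-1;1)}, \dots, u^{(s-1;s)}))| \\ & \qquad\qquad \times |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_s(u^{(s;1)}, \dots, u^{(s;s-1)}))| \Big] \\ =\ & \prod_{j=1}^{s-2} A_j\, B, \end{aligned} \tag{107} \]
where
\[ A_j := \sup_{\substack{u^{(i;j)} \\ 1 \le i < j}} \int du^{(j;j+1)} \cdots du^{(j;s)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j;1)}, \dots, u^{(j;j-1)}, u^{(j;j+1)}, \dots, u^{(j;s)}))| \tag{108} \]
for $1 \le j \le s - 1$, and
\[ A_s := \sup_{\substack{u^{(i;s)} \\ 1 \le i < s}} |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_s(u^{(s;1)}, \dots, u^{(s;s-1)}))| \tag{109} \]
($A_{s-1}$ and $A_s$ are used in the a priori bound of Lemma 5.5 below). Moreover,
\[ B := \sup_{\substack{u^{(\ell;s-1)}, u^{(\ell;s)} \\ 1 \le \ell \le s-2}} \int du^{(s-1;s)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_{s-1}(u^{(s-1;1)}, \dots, u^{(s-1;s)}))| \times |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_s(u^{(s;1)}, \dots, u^{(s;s-1)}))| \tag{110} \]
corresponds to a completely connected $L^4$-graph. We note that
\[ B \le A_{s-1} A_s \tag{111} \]
is evident. Next, we estimate the terms $A_j$.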
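Lemma 5.4. Assume that the $j$-th truncated particle line contains $m_j$ transfer deltas, carrying the transfer momenta $u^{(j)}$. Let $u^{(j)} = (u_1^{(j)}, \dots, u_{m_j}^{(j)})$, according to an arbitrary enumeration of the transfer vertices. Let $a \in \mathbb{N}_0$ and $0 \le a \le m_j$, and arbitrarily partition $u^{(j)}$ into $\underline{u}_1^{(j)}$ and $\underline{u}_\infty^{(j)}$, where $\underline{u}_1^{(j)}$ contains $a$, and $\underline{u}_\infty^{(j)}$ contains $m_j - a$ transfer momenta. Then,
\[ \sup_{\underline{u}_\infty^{(j)}} \int d\underline{u}_1^{(j)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j)}))| < (c\lambda)^{\bar n}\, \varepsilon^{-\frac{\bar n + m_j}{2} + a} \Big( \log\frac1\varepsilon \Big)^{\frac{\bar n - m_j}{2} + a + 2}, \tag{112} \]
for a constant $c$ which is independent of $\varepsilon$.

Proof. For notational convenience, we may, without any loss of generality, assume that
\[ \underline{u}_1^{(j)} := (u_1^{(j)}, \dots, u_a^{(j)}), \qquad \underline{u}_\infty^{(j)} := (u_{a+1}^{(j)}, \dots, u_{m_j}^{(j)}) \tag{113} \]
(by possibly relabeling the transfer momenta in $u^{(j)}$). We recall the definition of $\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j)}))$ from (103), and note that $\pi_j(u^{(j)})$ contains $m_j$ vertices carrying reduced transfer deltas, $\bar n - m_j \in 2\mathbb{N}$ vertices that are adjacent to an internal contraction line, and one $\hat J_{\lambda^2}$-vertex. Clearly,
\[ \begin{aligned} \int d\underline{u}_1^{(j)}\, |\mathrm{Amp}_{\hat J_{\lambda^2}}(\pi_j(u^{(j)}))| \le\ & \frac{\lambda^{\bar n}}{(2\pi)^2} e^{2\varepsilon t} \int_{2\mathbb{T}^3} d\xi\, \sup_v |\hat J_{\lambda^2}(\xi, v)| \\ & \times \sup_\xi \int_{I \times \bar I} |d\alpha_j||d\beta_j| \int_{(\mathbb{T}^3)^{\bar n+2+a}} d\underline{u}_1^{(j)}\, dk^{(j)}\, \delta^{(j)}(u^{(j)}, k^{(j)})\, \delta(k_n^{(j)} - k_{n+1}^{(j)} - \xi) \\ & \times |K^{(j)}[k^{(j)}, \alpha_j, \beta_j, \varepsilon]|\, |\hat\phi_0^{(j)}(k_0^{(j)})|\, |\hat\phi_0^{(j)}(k_{\bar n+1}^{(j)})|, \end{aligned} \tag{114} \]
and we recall (58). Adding the arguments of all delta distributions, we find the momentum conservation condition
\[ k_{\bar n+1}^{(j)} = k_0^{(j)} + \xi + \sum_{i=1}^{m_j} (\pm u_i^{(j)}), \tag{115} \]
linking the momenta at both ends of the reduced 1-particle graph. We replace the delta belonging to the vertex $(\bar n; j)$ by $\delta(k_{\bar n+1}^{(j)} - k_0^{(j)} - \xi - \sum_{i=1}^{m_j} (\pm u_i^{(j)}))$, irrespective of it being an internal or a reduced transfer delta, and remove it from $\delta^{(j)}(u^{(j)}, k^{(j)})$. We integrate out the $\hat J_{\lambda^2}$-delta $\delta(k_n^{(j)} - k_{n+1}^{(j)} - \xi)$ using the variable $k_{n+1}^{(j)}$, and the delta $\delta(k_{\bar n+1}^{(j)} - k_0^{(j)} - \xi - \sum_{i=1}^{m_j} (\pm u_i^{(j)}))$ using the variable $k_{\bar n+1}^{(j)}$. It follows that if $1 \le n < \bar n$,
\[ \begin{aligned} |(114)| \le\ & C\lambda^{\bar n} \sup_\xi \int_{(\mathbb{T}^3)^a} d\underline{u}_1^{(j)} \int dk_0^{(j)}\, |\hat\phi_0^{(j)}(k_0^{(j)})|\, \Big| \hat\phi_0^{(j)}\Big( k_0^{(j)} + \xi + \sum_{i=1}^{m_j} (\pm u_i^{(j)}) \Big) \Big| \\ & \times \int_{I \times \bar I} |d\alpha_j||d\beta_j|\, F[u^{(j)}, k_0^{(j)}, \xi, \alpha_j, \beta_j, \varepsilon] \\ & \times \frac{1}{|e_\Delta(k_0^{(j)}) - \alpha_j - i\varepsilon|\, |e_\Delta(k_0^{(j)} + \xi + \sum_{i=1}^{m_j} (\pm u_i^{(j)})) - \beta_j + i\varepsilon|}, \end{aligned} \]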
where ( j)
F[u ( j) , k0 , ξ, α j , β j , ε] ( j) ( j)( j) ( j) ( j) dkn d k δ (u , k , ξ ) := ¯ (T3 )n−1
( j)
×
( j) [ |K k , α j , β j , ε]| ( j)
( j)
|e (kn ) − α j − iε||e (kn + ξ ) − β j + iε|
,
(116)
with ( j) ( j) ( j) ( j) ( j) ( j) ( j) ( j) ( j) k := (k0 , kn , k ), k := (k1 , . . . , kn−1 , kn+2 , . . . , kn¯ )
(117)
and ( j) ( j) [ K k , α j , β j , ε] :=
n−1 /
1
( j) =1 e (k ) − α j n¯ /
×
− iε j 1
( j) =n+2 e (k ) − β j
+ iε j
.
(118)
Here, $\delta^{(j)}(u^{(j)}, \tilde k^{(j)}, \xi)$ is obtained from $\delta^{(j)}(u^{(j)}, k^{(j)})$ by omitting the delta distribution belonging to the vertex $(\bar n, j)$, and by substituting $k_{n+1}^{(j)} \to k_n^{(j)} - \xi$. Splitting
\[ |\hat\phi_0^{(j)}(k_0^{(j)})|\, |\hat\phi_0^{(j)}(k_0^{(j)} + u)| \le \frac12 |\hat\phi_0^{(j)}(k_0^{(j)})|^2 + \frac12 |\hat\phi_0^{(j)}(k_0^{(j)} + u)|^2, \tag{119} \]
we find
\[ |(114)| \le (I) + (II), \tag{120} \]
where (I ) ≤ Cλn¯
( j) ( j) ( j) 2 dk0 |φ 0 (k0 )|
× sup sup sup ξ
αj
I¯
( j)
k0
|dβ j |
1 ε
(T3 )a
sup
( j) k0
|dα j | I
1 ( j)
|e (k0 ) − α j − iε|
( j)
du 1
F[u ( j) , k0 , ξ, α j , β j , ε] m j ( j) ( j) |e (k0 + ξ + i=1 (±u i )) − β j + iε|
( j) du 1
F[u ( j) , k0 , ξ, α j , β j , ε] , m j ( j) ( j) |e (k0 + ξ + i=1 (±u i )) − β j + iε|
( j)
0 22 log ≤ Cλn¯ φ × sup sup sup ξ
αj
( j)
k0
I¯
|dβ j |
(T3 )a
( j)
(121)
and (I I ) ≤ Cλn¯ sup
I × I¯
( j)
(T3 )a
ξ
×
( j)
( j)
( j)
(k + ξ + dk0 |φ 0 0
du 1
mj ( j) (±u i ))|2 i=1
|dα j ||dβ j |
1 ( j)
|e (k0 ) − α j − iε| ( j)
F[u ( j) , k0 , ξ, α j , β j , ε] m j ( j) ( j) |e (k0 + ξ + i=1 (±u i )) − β j + iε| 1 ( j) ( j) ( j) 2 ≤ Cλn¯ dk0 |φ sup |dβ j | 0 (k0 )| ( j) ( j) I¯ |e (k0 ) − β j − iε| k0 ( j) F [u ( j) , k0 , ξ, α j , β j , ε] ( j) × sup sup sup |dα j | du 1 m j ( j) ( j) ξ β j k ( j) I (T3 )a |e (k0 − ξ − i=1 (±u i )) − α j + iε| ×
0
1 ε
0 22 log ≤ Cλn¯ φ
ξ
β j k ( j) 0
|dα j |
× sup sup sup I
( j)
( j)
(T3 )a
du 1
( j)
( j)
F [u ( j) , k0 , ξ, α j , β j , ε] . m j ( j) ( j) |e (k0 − ξ − i=1 (±u i )) − α j + iε| m j
(122) ( j)
We have here applied a shift k0 → k0 − ξ − i=1 (±u i ) which induces F → F in the obvious way. We note that this only affects the delta distributions belonging to the ( j) k , ξ ) of F. We focus on (I ), the case of (I I ) is vertices (1, j) and (n, ¯ j) in δ ( j) (u ( j) , analogous. We have ( j) F[u ( j) , k0 , ξ, α j , β j , ε] ( j) sup sup sup |dβ j | du 1 mj ( j) ( j) ξ α j k ( j) I (T3 )a |e (k0 + ξ + i=1 (±u i )) − β j + iε| 0 ( j) ( j) [ |K k , α j , β j , ε]| ( j) ( j) ( j) k = sup sup sup |dβ j | du 1 dkn d m j ( j) ( j) ξ α j k ( j) I |e (k0 + ξ + i=1 (±u i )) − β j + iε| 0
( j) δ ( j) (u ( j) , k , ξ)
×
. (123) ( j) ( j) |e (kn ) − α j − iε||e (kn + ξ ) − β j + iε| 1 ≤ sup |dβ j | ( j) ( j) I |e (k + ξ ) − β + iε| n j kn ( j) ( j) [ |K k , α j , β j , ε]| ( j) ( j) ( j) × sup sup sup du 1 dkn d k m j ( j) ( j) ξ α j k ( j) |e (k0 + ξ + i=1 (±u i )) − β j + iε| 0
×
( j) δ ( j) (u ( j) , k , ξ) ( j)
|e (kn ) − α j − iε|
.
Next, we integrate out the reduced transfer deltas:
(124)
• The case $1 \le \ell \le a$. If $i_\ell < \bar n$, we integrate out the corresponding transfer deltas $\delta(k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} \pm u_\ell^{(j)})$ using the transfer momenta $u_\ell^{(j)}$ (the components of $\underline{u}_1^{(j)}$). Then, for each such $\ell$, we use the variable $k_{i_\ell+1}^{(j)}$ (on the right of the corresponding transfer vertex, according to our conventions) to estimate the corresponding propagator in $L^1$,
\[ \int \frac{dk_{i_\ell+1}^{(j)}}{|e_\Delta(k_{i_\ell+1}^{(j)}) - \gamma \pm i\varepsilon|} < c \log\frac1\varepsilon \tag{125} \]
(where $\gamma$ denotes $\alpha_j$ or $\beta_j$). If the $\bar n$-th vertex is a transfer vertex, and $i_\ell = \bar n$, we recall that the corresponding transfer delta has already been integrated out using the momentum $k_{\bar n+1}^{(j)}$, and replaced by the delta enforcing (115). Accordingly, we use $u_\ell$ for the estimate
\[ \int \frac{du_\ell}{|e_\Delta(k_0^{(j)} + \xi + \sum_{i=1}^{m_j} (\pm u_i^{(j)})) - \beta_j + i\varepsilon|} \le c \log\frac1\varepsilon, \tag{126} \]
noting that the propagator in the integrand (supported on the edge initially labeled by $k_{\bar n+1}^{(j)}$) is the only one depending on $u_\ell$. Thus, in this step, $a$ propagators are in total estimated in $L^1$ by $c \log\frac1\varepsilon$, irrespectively of whether there is $\ell \le a$ with $i_\ell = \bar n$ or not.
• The case $a < \ell \le m_j$. If $i_\ell < \bar n$, we integrate out the corresponding reduced transfer deltas $\delta(k_{i_\ell+1}^{(j)} - k_{i_\ell}^{(j)} \pm u_\ell^{(j)})$ using the variable $k_{i_\ell+1}^{(j)}$ on the right of the associated vertex $(i_\ell; j)$. We then estimate each of the corresponding propagators by
\[ \sup_{k_{i_\ell+1}^{(j)}} \frac{1}{|e_\Delta(k_{i_\ell+1}^{(j)}) - \gamma \pm i\varepsilon|} \le \frac1\varepsilon \tag{127} \]
in $L^\infty$. If $i_\ell = \bar n$, we again note that the corresponding transfer delta has already been integrated out using the momentum $k_{\bar n+1}^{(j)}$. For the propagator supported on the edge labeled by $k_{\bar n+1}^{(j)}$, we use
\[ \frac{1}{|e_\Delta(k_0^{(j)} + \xi + \sum_{i=1}^{m_j} (\pm u_i^{(j)})) - \beta_j + i\varepsilon|} \le \frac1\varepsilon. \tag{128} \]
Thus, in this step, $m_j - a$ propagators are in total estimated in $L^\infty$ by $\frac1\varepsilon$, irrespectively of whether there is $\ell > a$ with $i_\ell = \bar n$ or not.

We summarize that out of the $\bar n + 2$ momenta in $k^{(j)}$, we have used $k_0^{(j)}$, $k_{n+1}^{(j)}$, and $k_{\bar n+1}^{(j)}$ to begin with. Moreover, if the $\bar n$-th vertex is a transfer vertex, we have used another $m_j - 1$ components of $k^{(j)}$ to either integrate out transfer deltas, or to estimate propagators in $L^1$. On the other hand, if the $\bar n$-th vertex is an internal vertex, we have, to this end, used $m_j$ components of $k^{(j)}$. We also note that out of the $\bar n + 2$ propagators, $m_j - a$ have been estimated by $\frac1\varepsilon$ in $L^\infty$, and $a + 2$ (two from the integrals in $\alpha_j$ and $\beta_j$) by $c \log\frac1\varepsilon$ in $L^1$.

Next, we introduce a spanning tree $T$ on $\pi_j(u^{(j)})$, which contains all internal contraction lines, but none of the transfer vertices, and none of the $m_j - a + 2$ edges carrying
propagators that were already estimated above in $L^1$ or $L^\infty$. Thus, in particular, $T$ does not contain the propagator edges corresponding to the momenta $k_0^{(j)}$, $k_{n+1}^{(j)}$ and $k_{\bar n+1}^{(j)}$. We then call $T$ admissible. Thus, we distinguish the following cases:
• The $\bar n$-th vertex is an internal vertex. The corresponding internal delta has already been replaced by the delta enforcing (115), and integrated out using $k_{\bar n+1}^{(j)}$. Accordingly, we use the estimate (128) for the propagator on its right. Out of the remaining $\bar n - 2 - m_j$ momenta in $k^{(j)}$, we use $\frac{\bar n - m_j}{2} - 1$ momenta supported on $T$ to integrate out the remaining internal deltas, and we estimate the corresponding propagators in $L^\infty$ by $\frac{1}{\varepsilon}$. There remain $\frac{\bar n - m_j}{2} - 1$ momenta for $L^1$-bounds on the corresponding propagators.
• The $\bar n$-th vertex is a transfer vertex. Out of the remaining $\bar n - 3 - m_j$ momenta in $k^{(j)}$, we use $\frac{\bar n - m_j}{2}$ momenta supported on $T$ to integrate out the internal deltas, and we estimate the corresponding propagators in $L^\infty$ by $\frac{1}{\varepsilon}$. There remain $\frac{\bar n - m_j}{2} - 2$ momenta for $L^1$-bounds on the corresponding propagators.
With the rôles of the propagators on the edges labeled by $k_0^{(j)}$ and $k_{\bar n+1}^{(j)}$ interchanged, the discussion for the term $(II)$ is fully analogous to the one of $(I)$.
Summarizing, $\frac{\bar n - m_j}{2} + (m_j - a) = \frac{\bar n + m_j}{2} - a$ propagators are in total bounded in $L^\infty$, and $\frac{\bar n - m_j}{2} + 2 + a$ in $L^1$. In conclusion, we obtain
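$$\sup_{u_\infty^{(j)}} \int du_1^{(j)}\, \big|\mathrm{Amp}_{J_\lambda^2}(\pi_j(u^{(j)}))\big| < (c\lambda)^{\bar n}\, \varepsilon^{-\frac{\bar n + m_j}{2} + a} \Big(\log\frac{1}{\varepsilon}\Big)^{\frac{\bar n - m_j}{2} + a + 2}, \tag{129}$$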
as claimed. The cases $n = 0$ and $n = \bar n$ are similar, and also yield (129). This can be proved with minor modifications of the arguments explained above, and will not be reiterated. This concludes the proof.

Lemma 5.5. Let $\pi \in \Pi^{(J_\lambda^2)\,\mathrm{conn}}_{s;\bar n,n}$. We then have the a priori bound
$$\big|\mathrm{Amp}_{J_\lambda^2}(\pi)\big| < \Big(\log\frac{1}{\varepsilon}\Big)^{3s} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{\frac{s\bar n}{2}}. \tag{130}$$
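Proof. From (107), we have
$$\big|\mathrm{Amp}_{J_\lambda^2}(\pi)\big| \le \prod_{j=1}^{s} A_j. \tag{131}$$
Using (131) and Lemma 5.4, we get
$$\begin{aligned}
\big|\mathrm{Amp}_{J_\lambda^2}(\pi)\big| &\le (c\lambda)^{s\bar n} \prod_{j=1}^{s} \varepsilon^{-\frac{\bar n + m_j}{2} + a_j} \Big(\log\frac{1}{\varepsilon}\Big)^{\frac{\bar n - m_j}{2} + a_j + 2} \\
&< (c\lambda)^{s\bar n}\, \varepsilon^{-\frac{s\bar n + \sum_j m_j}{2} + \sum_j a_j} \Big(\log\frac{1}{\varepsilon}\Big)^{\frac{s\bar n - \sum_j m_j}{2} + \sum_j a_j + 2s}.
\end{aligned} \tag{132}$$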
We observe that $m = \sum_j m_j \in 2\mathbb{N}_0$ is twice the number of transfer contractions in $\pi$, since it counts the number of transfer vertices. Moreover, $\sum_j a_j = \frac{m}{2}$ because to every
transfer contraction, we associate one resolvent estimated in $L^1$ and one estimated in $L^\infty$, and $\sum_j a_j$ counts those estimated in $L^1$. This implies the asserted bound.

Next, we estimate the term $B$ in (107), and show that exploiting the connectedness of a pair of particle lines, one gains a factor $\varepsilon^{\frac15}$ over the bound $B \le A_{s-1} A_s$ inferred from Lemma 5.4.
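The exponent bookkeeping behind (130) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it adds up the per-line exponents of $\varepsilon$ and $\log\frac{1}{\varepsilon}$ from (129) and confirms that, once $\sum_j a_j = \frac{m}{2}$, the totals are $-\frac{s\bar n}{2}$ and $\frac{s\bar n}{2} + 2s$. The function name and the sample values of $\bar n$, $m_j$, $a_j$ are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def combined_exponents(n_bar, m_list, a_list):
    """Sum the per-line exponents of eps and of log(1/eps) taken from (129).

    Line j contributes eps^(-(n_bar + m_j)/2 + a_j) and
    (log 1/eps)^((n_bar - m_j)/2 + a_j + 2).
    """
    eps_exp = Fraction(0)
    log_exp = Fraction(0)
    for m_j, a_j in zip(m_list, a_list):
        eps_exp += -Fraction(n_bar + m_j, 2) + a_j
        log_exp += Fraction(n_bar - m_j, 2) + a_j + 2
    return eps_exp, log_exp


# Hypothetical sample values (assumptions, not taken from the paper).
n_bar = 6
m_list = [2, 4, 2]   # transfer vertices on each of the s = 3 lines
a_list = [1, 2, 1]   # chosen so that sum(a_j) = m / 2
s = len(m_list)
m = sum(m_list)
assert 2 * sum(a_list) == m

eps_exp, log_exp = combined_exponents(n_bar, m_list, a_list)
print(eps_exp, log_exp)
```

Since $\log\frac{1}{\varepsilon} \ge 1$ for small $\varepsilon$, the total logarithmic exponent $\frac{s\bar n}{2} + 2s$ is dominated by the exponent $\frac{s\bar n}{2} + 3s$ appearing in (130).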
Lemma 5.6. Assume that the reduced 1-particle lines $\pi_j(u^{(j)})$ and $\pi_{j'}(u^{(j')})$ have $m_{j;j'}$ common transfer momenta $u^{(j;j')}$. Let $u^{(i;j)}$ denote the $m_j + m_{j'} - 2m_{j;j'}$ transfer momenta appearing in either $u^{(j)}$ or $u^{(j')}$, but not in both. Moreover, assume that $\hat\phi_0$ satisfies the concentration of singularity condition (31). Then,
$$\sup_{u^{(i;j)}} \int du^{(j;j')}\, \big|\mathrm{Amp}_{J_\lambda^2}(\pi_j(u^{(j)}))\big|\,\big|\mathrm{Amp}_{J_\lambda^2}(\pi_{j'}(u^{(j')}))\big| \le \lambda^{2\bar n}\, \varepsilon^{\frac15 - \bar n - \frac{m_j + m_{j'}}{2} + m_{j;j'}} \Big(c\log\frac{1}{\varepsilon}\Big)^{\bar n - \frac{m_j + m_{j'}}{2} + m_{j;j'} + 4}, \tag{133}$$
which improves the corresponding a priori bound by a factor $\varepsilon^{\frac15}$.
Proof. To estimate the l.h.s. of (133), we use $L^\infty$-$L^1$-bounds in the variables $u^{(j;j')}$, with the exception of one transfer momentum, which we denote by $u$. Thereby, we cut all but one transfer lines between the $j$-th and the $j'$-th reduced 1-particle line. One straightforwardly obtains (133) if it is possible to identify a subgraph in the expression for (133) that corresponds to the "crossing integral"
$$\sup_{\gamma_i \in I}\, \sup_{k \in \mathbb{T}^3} \int dp\, dq\, \frac{1}{|e(p) - \gamma_1 - i\varepsilon_1|\,|e(q) - \gamma_2 - i\varepsilon_1|} \times \frac{1}{|e(p - q + k) - \gamma_3 - i\varepsilon|} \le c\,\varepsilon^{-\frac45} \Big(\log\frac{1}{\varepsilon}\Big)^3, \tag{134}$$
see Lemma 3.11 in [2]. Here, one of the three resolvents would have been estimated in $L^\infty$ by $\frac{1}{\varepsilon}$ in the a priori bound. There is a gain of a factor $\varepsilon^{\frac15}$ because the singularities which contribute most to (134) are concentrated in tubular $\varepsilon$-neighborhoods of level surfaces of $e$, whose intersections are of small measure (the curvature of the level surfaces of the energy function $e$ plays a crucial role for this result).
Fig. 3. An example unrelated to that in Figs. 1 and 2. Here, all reduced transfer vertices are shaded, while the unreduced transfer vertices and all internal vertices are unfilled. The reduced 1-particle line with j = 1 contains an immediate recollision with a reduced transfer vertex insertion, and a nesting subgraph. The reduced 1-particle lines with j = 2, 3 define a ladder diagram with two rungs each, and decorated by an immediate recollision, and connected by a transfer contraction line. The reduced 1-particle line with j = 4 contains a crossing subgraph
On each reduced 1-particle line, we identify the contraction structure based on internal deltas, see also Fig. 3. As explained in detail in [4, 2], the only possible cases are (we are here omitting the labels $j$, $j'$ of the reduced 1-particle lines):
• The internal contractions of the reduced 1-particle line define a ladder graph decorated with progressions of immediate recollisions. That is, every internal contraction is either an immediate recollision (a contraction between neighboring internal vertices, possibly with transfer vertices located in between), or a rung of the ladder contracting a vertex labeled by $i \le n$ with a vertex labeled by $i' > n$. For any pair of rung contractions labeled by $(i_1, i_1')$ and $(i_2, i_2')$, one has $i_1 < i_2$, and $i_1' > i_2'$ (no crossing of rungs). These were denoted "simple graphs" in [4, 2].
• Otherwise, one can identify at least one nesting or crossing subgraph. A pair of internal deltas $\delta(k_{i_1+1} - k_{i_1} + k_{i_1'+1} - k_{i_1'})$ and $\delta(k_{i_2+1} - k_{i_2} + k_{i_2'+1} - k_{i_2'})$ defines a nesting subgraph if $i_1 < i_2 < i_2' < i_1'$, and either $i_1' \le n$ or $i_1 > n$. It defines a crossing subgraph if $i_1 < i_2 < i_1' < i_2'$, and either $i_2' \le n$ or $i_1 > n$.
In (133), one can identify a crossing subintegral of the form (134) in the following situations:
• One of the reduced 1-particle graphs contains a nesting or crossing subgraph consisting of internal contraction lines, similarly as in [4, 2]. Then, one can completely disconnect the $j$-th and the $j'$-th reduced 1-particle line by $L^\infty$-$L^1$-estimates in $u^{(j;j')}$, and one still gains a factor $\varepsilon^{\frac15}$ from (134).
• Both reduced 1-particle subgraphs correspond to ladder graphs with immediate recollision insertions (denoted "simple graphs" in [4, 2]), but there is at least one transfer contraction between the $j$-th and the $j'$-th reduced 1-particle line whose ends are located either between rungs of the ladder (that is, not on the left or right of the outermost rung contraction $\delta(k_{i_*+1} - k_{i_*} + k_{i^*+1} - k_{i^*})$, where $i_*$ is the smallest, and $i^*$ is the largest index appearing in any rung contraction on the given reduced 1-particle line) and/or inside an immediate recollision subgraph. The integral over the associated transfer momentum $u$ then produces a subintegral of the form (134), and one gains a factor $\varepsilon^{\frac15}$.
The crossing estimate cannot be applied in its basic form (134) when the $j$-th and the $j'$-th reduced 1-particle graphs have ladder structure, and every transfer contraction between the $j$-th and the $j'$-th reduced 1-particle graph is adjacent to at least one vertex on the left or right of the outermost rung contraction, which is also not located inside an immediate recollision subgraph. Then, the corresponding integrals do not only involve propagators, but also $\hat\phi_0$, which itself typically exhibits singularities.
The situation is most difficult to handle if both of the adjacent transfer vertices are of that type. Then, the only subintegral with crossing structure has the form
$$\begin{aligned}
A_\varepsilon := \int d\alpha_1\, d\alpha_2 \int dp\, dq\, du\; &|\hat\phi_0(p)|\,|\hat\phi_0(p + u + k)|\,|\hat\phi_0(q)|\,|\hat\phi_0(q - u)| \\
&\times \frac{1}{|e(p) - \alpha_1 - i\varepsilon|\,|e(p + u + k) - \beta_1 + i\varepsilon|} \\
&\times \frac{1}{|e(q) - \alpha_2 - i\varepsilon|\,|e(q - u) - \beta_2 + i\varepsilon|}.
\end{aligned} \tag{135}$$
This expression is obtained from partitioning the integrals on the r.h.s. of (133) in the same way as in the proof of Lemma 5.4 (we recall that on each reduced 1-particle line,
one of the energy parameters $\alpha_j$ or $\beta_j$ is always used to estimate a propagator neighboring to either $\hat\phi_0$ or $\overline{\hat\phi_0}$ in $L^1$). In (135), singularities of $|\hat\phi_0|$ may overlap with those of the neighboring resolvents; crossing structures then also depend on the singularity structure of $|\hat\phi_0|$. If we argue as in the proof of Lemma 5.4, we would use two momentum integrals for $\|\hat\phi_0\|^4_{L^2(\mathbb{T}^3)}$, and the remaining integrals ($\alpha_1$, $\alpha_2$, and the third momentum) to bound three resolvents in $L^1(\mathbb{T}^3)$ by $c\log\frac{1}{\varepsilon}$ so that one resolvent is estimated in $L^\infty(\mathbb{T}^3)$ by $\frac{1}{\varepsilon}$. Thereby, one gets
$$A_\varepsilon < c\,\varepsilon^{-1} \Big(\log\frac{1}{\varepsilon}\Big)^3 \|\hat\phi_0\|^4_{L^2(\mathbb{T}^3)}. \tag{136}$$
The remaining terms contributing to the l.h.s. of (134) are estimated in the same way as in the proof of Lemma 5.4 (i.e. by introduction of a spanning tree, and use of $L^1$-$L^\infty$ bounds on the propagators), whereby one again arrives at the expression for the a priori bound, which is the r.h.s. of (134) without the $\varepsilon^{\frac15}$-factor. We shall not repeat the detailed argument. To prove (133), we improve (136) by
$$A_\varepsilon \le c(T)\,\varepsilon^{-\frac45} \Big(\log\frac{1}{\varepsilon}\Big)^4, \tag{137}$$
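where the constant depends only on the macroscopic time $T > 0$. Our proof uses the $\eta$-concentration property of the WKB initial data $\hat\phi_0$. We do not know if for general $L^2$ initial data, or for WKB initial conditions with an arbitrary phase function $S$ of Schwartz class, (136) can be improved. We recall the concentration of singularity condition (29)–(31), by which
$$\hat\phi_0(k) = f_\infty(k) + f_{\mathrm{sing}}(k), \tag{138}$$
where
$$\|f_\infty\|_\infty < c, \tag{139}$$
and
$$\big\|\,|f_{\mathrm{crit}}| * |f_{\mathrm{crit}}|\,\big\|_{L^2(\mathbb{T}^3)} = \big\|\,|f_{\mathrm{sing}}|^\vee\,\big\|^2_{\ell^4(\mathbb{Z}^3)} \le c'\,\eta^{\frac45} \tag{140}$$
for constants $c$, $c'$ that are uniform in $\eta$. We observe that $A_\varepsilon$ has the form
$$\begin{aligned}
A_\varepsilon[g_1, g_2, g_3, g_4] &= \int_{I^2} d\alpha_1\, d\alpha_2\, \langle g_1 * g_2,\, g_3 * g_4 \rangle_{L^2(\mathbb{T}^3)} \\
&= \int_{I^2} d\alpha_1\, d\alpha_2\, \langle g_1^\vee g_2^\vee,\, g_3^\vee g_4^\vee \rangle_{\ell^2(\mathbb{Z}^3)} \\
&= \sum_{r_i \in \{\infty, \mathrm{crit}\}} A_\varepsilon[g_{1,r_1}, g_{2,r_2}, g_{3,r_3}, g_{4,r_4}],
\end{aligned} \tag{141}$$
where $g_1(p) := \frac{1}{|e(p) - \alpha_1 - i\varepsilon|}\,|\hat\phi_0(p)|$, etc., and where $g_{i,r}$ is obtained from replacing $\hat\phi_0$ by $f_r$ in $g_i$, for $r \in \{\infty, \mathrm{crit}\}$. The corresponding terms can then be bounded as follows.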
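First of all, if $r_i = \infty$ for $i = 1, \dots, 4$,
$$\begin{aligned}
A_\varepsilon[g_{1,\infty}, g_{2,\infty}, g_{3,\infty}, g_{4,\infty}] &\le \|f_\infty\|^4_\infty \sup_{\alpha_j, \beta_j} \int dp\, dq\, du\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|\,|e(p + u) - \beta_1 + i\varepsilon|} \\
&\qquad \times \frac{1}{|e(q) - \alpha_2 - i\varepsilon|\,|e(q - u) - \beta_2 + i\varepsilon|} \\
&< c\,\varepsilon^{-\frac45} \Big(\log\frac{1}{\varepsilon}\Big)^4,
\end{aligned} \tag{142}$$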
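using (134). If $r_i = \mathrm{crit}$ for one value of $i$,
$$\begin{aligned}
A_\varepsilon[g_{1,\mathrm{crit}}, g_{2,\infty}, g_{3,\infty}, g_{4,\infty}] &\le \|f_\infty\|^3_\infty \int dp\, |f_{\mathrm{crit}}(p)| \\
&\qquad \times \sup_{p,u} \frac{1}{|e(p + u) - \beta_1 + i\varepsilon|} \int dq\, \frac{1}{|e(q) - \beta_2 + i\varepsilon|} \\
&\qquad \times \sup_p \int_I d\alpha_1\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|}\; \sup_q \int_I d\alpha_2\, \frac{1}{|e(q) - \alpha_2 - i\varepsilon|} \\
&< c\,\varepsilon^{-1} \eta^{\frac25} \Big(\log\frac{1}{\varepsilon}\Big)^4,
\end{aligned} \tag{143}$$
using $\|f_{\mathrm{crit}}\|_{L^1(\mathbb{T}^3)} \le c\,\eta^{\frac25}$, which follows from
$$\begin{aligned}
\|f_{\mathrm{crit}}\|^2_{L^1(\mathbb{T}^3)} &= \int dp\, du\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(u)| = \int dp\, du\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(u - p)| \\
&\le \Big(\int du\Big)^{\frac12} \Big(\int du\, \Big(\int dp\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(u - p)|\Big)^2\Big)^{\frac12} \\
&= \big\|\,|f_{\mathrm{crit}}| * |f_{\mathrm{crit}}|\,\big\|_{L^2(\mathbb{T}^3)} \le c\,\eta^{\frac45},
\end{aligned} \tag{144}$$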
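see (140), and where we have used $\mathrm{Vol}(\mathbb{T}^3) = 1$. The remaining cases $r_1 = r_3 = r_4 = \infty$, $r_2 = \mathrm{crit}$, etc., are similar. If $r_i = \mathrm{crit}$ for two values of $i$,
$$\begin{aligned}
A_\varepsilon[g_{1,\mathrm{crit}}, g_{2,\mathrm{crit}}, g_{3,\infty}, g_{4,\infty}] &\le \|f_\infty\|^2_\infty \int_{I^2} d\alpha_1\, d\alpha_2 \int dp\, dq\, du\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(p + u)|\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|} \\
&\qquad \times \frac{1}{|e(p + u) - \beta_1 + i\varepsilon|\,|e(q) - \alpha_2 - i\varepsilon|\,|e(q - u) - \beta_2 + i\varepsilon|} \\
&\le \|f_\infty\|^2_\infty \int dp\, |f_{\mathrm{crit}}(p)| \int du\, |f_{\mathrm{crit}}(p + u)|\; \varepsilon^{-1} \\
&\qquad \times \sup_u \int dq\, \frac{1}{|e(q - u) - \beta_2 + i\varepsilon|} \\
&\qquad \times \sup_p \int_I d\alpha_1\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|}\; \sup_q \int_I d\alpha_2\, \frac{1}{|e(q) - \alpha_2 - i\varepsilon|} \\
&< c\,\varepsilon^{-1} \eta^{\frac45} \Big(\log\frac{1}{\varepsilon}\Big)^3,
\end{aligned} \tag{145}$$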
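again using (144). The cases $r_1 = r_3 = \infty$, $r_2 = r_4 = \mathrm{crit}$, etc., are similar. If $r_i = \mathrm{crit}$ for three values of $i$,
$$\begin{aligned}
A_\varepsilon[g_{1,\mathrm{crit}}, g_{2,\mathrm{crit}}, g_{3,\mathrm{crit}}, g_{4,\infty}] &\le \|f_\infty\|_\infty \int_{I^2} d\alpha_1\, d\alpha_2 \int dp\, dq\, du\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(p + u)|\,|f_{\mathrm{crit}}(q)| \\
&\qquad \times \frac{1}{|e(p) - \alpha_1 - i\varepsilon|\,|e(p + u) - \beta_1 + i\varepsilon|} \times \frac{1}{|e(q) - \alpha_2 - i\varepsilon|\,|e(q - u) - \beta_2 + i\varepsilon|} \\
&\le \|f_\infty\|_\infty\, \|f_{\mathrm{crit}}\|^3_{L^1(\mathbb{T}^3)}\, \varepsilon^{-2} \\
&\qquad \times \sup_p \int_I d\alpha_1\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|}\; \sup_q \int_I d\alpha_2\, \frac{1}{|e(q) - \alpha_2 - i\varepsilon|} \\
&\le c\,\varepsilon^{-2} \eta^{\frac65} \Big(\log\frac{1}{\varepsilon}\Big)^2
\end{aligned} \tag{146}$$
using (144). The remaining cases are similar. Finally, if $r_i = \mathrm{crit}$ for all values of $i$,
$$\begin{aligned}
A_\varepsilon[g_{1,\mathrm{crit}}, g_{2,\mathrm{crit}}, g_{3,\mathrm{crit}}, g_{4,\mathrm{crit}}] &\le \int_{I^2} d\alpha_1\, d\alpha_2 \int dp\, dq\, du\, |f_{\mathrm{crit}}(p)|\,|f_{\mathrm{crit}}(p + u)|\,|f_{\mathrm{crit}}(q)|\,|f_{\mathrm{crit}}(q - u)| \\
&\qquad \times \frac{1}{|e(p) - \alpha_1 - i\varepsilon|\,|e(p + u) - \beta_1 + i\varepsilon|} \times \frac{1}{|e(q) - \alpha_2 - i\varepsilon|\,|e(q - u) - \beta_2 + i\varepsilon|} \\
&\le \varepsilon^{-2}\, \big\|\,|f_{\mathrm{crit}}| * |f_{\mathrm{crit}}|\,\big\|^2_{L^2(\mathbb{T}^3)} \\
&\qquad \times \sup_p \int_I d\alpha_1\, \frac{1}{|e(p) - \alpha_1 - i\varepsilon|}\; \sup_q \int_I d\alpha_2\, \frac{1}{|e(q) - \alpha_2 - i\varepsilon|} \\
&< c\,\varepsilon^{-2} \eta^{\frac85} \Big(\log\frac{1}{\varepsilon}\Big)^2,
\end{aligned} \tag{147}$$
using (140). We recall that $\eta = \lambda^2 = T\varepsilon$, where $\varepsilon = \frac{1}{t}$ is the inverse microscopic time, and $T = \lambda^2 t$ is the macroscopic time.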
Collecting the estimates on (141) derived above, we find that for any $T > 0$, there is a constant $c(T) < cT^2$ such that
$$A_\varepsilon \le c(T)\,\varepsilon^{-\frac45} \Big(\log\frac{1}{\varepsilon}\Big)^4. \tag{148}$$
This estimate improves (136) by a factor $\varepsilon^{\frac15}$, as claimed, and establishes (133).
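As a sanity check on how (142)–(147) combine into (148), one can substitute $\eta = T\varepsilon$ into each bound and compare the resulting powers of $\varepsilon$. The sketch below does only this exponent arithmetic; logarithmic factors and constants are ignored, and the dictionary layout and function name are illustrative assumptions rather than notation from the text.

```python
from fractions import Fraction as F

# (exponent of eps, exponent of eta) in the bounds (142), (143), (145),
# (146), (147), indexed by the number of factors with r_i = crit.
# Logarithmic factors and constants are deliberately ignored.
CASES = {
    0: (F(-4, 5), F(0)),
    1: (F(-1), F(2, 5)),
    2: (F(-1), F(4, 5)),
    3: (F(-2), F(6, 5)),
    4: (F(-2), F(8, 5)),
}


def after_substitution(eps_exp, eta_exp):
    """Substitute eta = T * eps: eps^a * eta^b = T^b * eps^(a + b)."""
    return eps_exp + eta_exp, eta_exp


results = {k: after_substitution(*v) for k, v in CASES.items()}
worst_eps_exp = min(e for e, _ in results.values())
largest_T_exp = max(t for _, t in results.values())

for k in sorted(results):
    eps_exp, T_exp = results[k]
    print(f"{k} critical factors: eps^({eps_exp}) * T^({T_exp})")
```

Under these assumptions no case is more singular than $\varepsilon^{-\frac45}$, and the largest power of $T$ that occurs is $T^{\frac85}$, which is consistent with a $T$-dependent constant in (148).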
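Using the arguments used in the proof of Lemma 5.5, one hereby also establishes Lemma 5.2. Moreover, we find the following bounds.

Lemma 5.7. Let $r \in 2\mathbb{N}$, and let $\Pi^{(J_\lambda^2)\,2\text{-conn}}_{r;\bar n,n}$, $\Pi^{(J_\lambda^2)\,n\text{-}d}_{r;\bar n,n} \subset \Pi^{(J_\lambda^2)}_{r;\bar n,n}$ denote the subclasses of 2-connected and non-disconnected graphs, respectively. Then, for every $T = \lambda^2\varepsilon^{-1} > 0$, there exists a finite constant $c = c(T)$ such that
$$\sum_{\pi \in \Pi^{(J_\lambda^2)\,2\text{-conn}}_{r;\bar n,n}} \big|\mathrm{Amp}_{J_\lambda^2}(\pi)\big| \le (r\bar n)!\; \varepsilon^{\frac15} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{\frac{r\bar n}{2}}, \tag{149}$$
and
$$\sum_{\pi \in \Pi^{(J_\lambda^2)\,n\text{-}d}_{r;\bar n,n}} \big|\mathrm{Amp}_{J_\lambda^2}(\pi)\big| \le (r\bar n)!\; \varepsilon^{\frac15} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{\frac{r\bar n}{2}}. \tag{150}$$
Proof. This follows immediately from Lemma 5.2 and the fact that $\Pi^{(J_\lambda^2)}_{r;\bar n,n}$ contains no more than $(r\bar n)!\,2^{r\bar n}$ graphs.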
Lemma 5.8. For any fixed $r \ge 2$, $r \in 2\mathbb{N}$, $n \le N$, and $T > 0$, there exists a finite constant $c = c(T)$ such that
$$\Big(\mathbb{E}\big[\|\phi_{n,t}\|_2^{2r}\big]\Big)^{\frac1r} \le \big((2nr)!\big)^{\frac1r} \Big(\log\frac{1}{\varepsilon}\Big)^3 \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^n. \tag{151}$$
Proof. This is proved in the same way as the a priori bound of Lemma 5.5. The only modification is that the $J_\lambda^2$-delta is replaced by $\delta(k_n^{(j)} - k_{n+1}^{(j)})$ on every particle line. We note that the expansion for (151) contains disconnected graphs.

6. Proof of Lemma 4.2

Based on the previous discussion, it is straightforward to see that
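$$\begin{aligned}
\mathbb{E}\big[\|\phi_{n,N,\theta_{m-1}}(\theta_m)\|_2^{2r}\big] = {} & \frac{e^{2r(1 + \theta_{m-1}\varepsilon)}}{(2\pi)^{2r}} \int_{(I \times \bar I)^r} \prod_{j=1}^r d\alpha_j\, d\beta_j\; e^{-i\theta_m \sum_{j=1}^r (-1)^j (\alpha_j - \beta_j)} \\
& \times \int_{(\mathbb{T}^3)^{(\bar n + 2) r}} \prod_{j=1}^r dk^{(j)}\, \delta(k_n^{(j)} - k_{n+1}^{(j)})\; \lambda^{r\bar n}\, \mathbb{E}\Big[\prod_{j=1}^r U^{(j)}[k^{(j)}]\Big] \\
& \times \prod_{j=1}^r K^{(j)}_{n,N,\kappa}[k^{(j)}, \alpha_j, \beta_j, \varepsilon]\; \hat\phi_0^{(j)}(k_0^{(j)})\, \hat\phi_0^{(j)}(k_{\bar n+1}^{(j)})
\end{aligned} \tag{152}$$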
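(using $(\theta_m - \theta_{m-1})\kappa\varepsilon = 1$) where
$$K^{(j)}_{n,N,\kappa} := \frac{1}{\big(e(k_n^{(j)}) - \alpha_j - i\varepsilon_j\big)\big(e(k_{n+1}^{(j)}) - \beta_j + i\varepsilon_j\big)}\; \widetilde K^{(j)}_{n,N,\kappa}, \tag{153}$$
and
$$\begin{aligned}
\widetilde K^{(j)}_{n,N,\kappa}[k^{(j)}, \alpha_j, \beta_j, \varepsilon] := {} & \prod_{\ell_1 = 0}^{n - N} \frac{1}{e(k_{\ell_1}^{(j)}) - \alpha_j - i\kappa\varepsilon_j} \prod_{\ell_2 = n - N + 1}^{n - 1} \frac{1}{e(k_{\ell_2}^{(j)}) - \alpha_j - i\varepsilon_j} \\
& \times \prod_{\ell_3 = n + 2}^{n + N} \frac{1}{e(k_{\ell_3}^{(j)}) - \beta_j + i\varepsilon_j} \prod_{\ell_4 = n + N + 1}^{2n + 1} \frac{1}{e(k_{\ell_4}^{(j)}) - \beta_j + i\kappa\varepsilon_j},
\end{aligned} \tag{154}$$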
see also (86) and the discussion following (80). We refer to $\delta(k_n^{(j)} - k_{n+1}^{(j)})$, which replaces the $J_\lambda^2$-delta, as the "$L^2$-delta", since it is responsible for the $L^2$-inner product on the left-hand side of (152). The expression (152) can be estimated in the same way as the integrals (87) considered above, however, we are now considering the full instead of the non-disconnected expectation.

As before, $\mathbb{E}\big[\prod_{j=1}^r U^{(j)}[k^{(j)}]\big]$ in (152) decomposes into a sum of products of delta distributions, which we represent by Feynman graphs. By the notational conventions introduced after (87), we have $\bar n = 2n$. We let $\Pi_{r;\bar n,n}$ denote the set of graphs on $r$ particle lines, each containing $\bar n$ vertices from copies of the random potential $\hat V_\omega$, and with the $L^2$-delta located between the $n$-th and the $(n+1)$-st $\hat V_\omega$-vertex. For $\pi \in \Pi_{r;\bar n,n}$, let $\mathrm{Amp}_\delta(\pi)$ denote the amplitude corresponding to the graph $\pi$, given by the integral obtained from replacing $\mathbb{E}\big[\prod_{j=1}^r U^{(j)}[k^{(j)}]\big]$ in (152) by $\delta_\pi(k^{(1)}, \dots, k^{(r)})$ (the product of delta distributions corresponding to the contraction graph $\pi$). The subscript in $\mathrm{Amp}_\delta$ implies that instead of $J_\lambda^2$ as before, we now have the $L^2$-delta at the distinguished vertex. Let $\Pi^{\mathrm{conn}}_{r;\bar n,n}$ denote the subclass of $\Pi_{r;\bar n,n}$ of completely connected graphs.

Lemma 6.1. Let $s \ge 2$, $s \in \mathbb{N}$, and let $\pi \in \Pi^{\mathrm{conn}}_{s;2n,n}$ (that is, $\bar n = 2n$) be a completely connected graph. Then, for every $T = \lambda^2\varepsilon^{-1} > 0$, there exists a finite constant $c = c(T)$ such that
$$|\mathrm{Amp}_\delta(\pi)| \le \varepsilon^{\frac15} \Big(\log\frac{1}{\varepsilon}\Big)^{3s} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{sn}. \tag{155}$$
Proof. The proof is completely analogous to the one given for Lemma 5.2 (using $\frac{1}{\kappa\varepsilon} \le \frac{1}{\varepsilon}$), and will not be reiterated here.

In contrast to the situation in Lemma 5.2, the expectation in (152) contains completely disconnected graphs, which satisfy
$$\sum_{\pi \in \Pi^{\mathrm{disc}}_{r;\bar n,n}} |\mathrm{Amp}_\delta(\pi)| \le \Big(\sum_{\pi \in \Pi^{\mathrm{conn}}_{1;\bar n,n}} |\mathrm{Amp}_\delta(\pi)|\Big)^r. \tag{156}$$
We invoke the following bound from [2] (the continuum version is proved in [4]).
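Lemma 6.2. Let $\bar n = 2n$. Then, for constants $c$, $c'$ independent of $T$,
$$\sum_{\pi \in \Pi^{\mathrm{conn}}_{1;\bar n,n}} |\mathrm{Amp}_\delta(\pi)| \le \frac{(c\lambda^2\varepsilon^{-1})^n}{\sqrt{n!}} + (n!)\; \varepsilon^{\frac15} \Big(\log\frac{1}{\varepsilon}\Big)^3 \Big(c'\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^n. \tag{157}$$
The term $\frac{(c\lambda^2\varepsilon^{-1})^n}{\sqrt{n!}}$ bounds the contribution from decorated ladder diagrams, while the term that carries an additional $\varepsilon^{\frac15}$-factor is obtained from crossing and nesting type subgraphs. The proof of (157) is presented in detail in [2] and [4]. The number of non-ladder graphs is bounded by $n!\,2^n$, hence the factor $n!$.

The sum over non-disconnected graphs can be estimated by the same bound as in Lemma 5.7. The result is formulated in the following lemma.

Lemma 6.3. Let $r \in 2\mathbb{N}$, $\bar n = 2n$, and let $\Pi^{n\text{-}d}_{r;\bar n,n} \subset \Pi_{r;\bar n,n}$ denote the subclass of non-disconnected graphs. Then, for every $T > 0$, there exists a finite constant $c = c(T)$ such that
$$\sum_{\pi \in \Pi^{n\text{-}d}_{r;\bar n,n}} |\mathrm{Amp}_\delta(\pi)| \le (r\bar n)!\; \varepsilon^{\frac15} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{\frac{r\bar n}{2}}. \tag{158}$$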
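Combining (157) with Lemma 6.3, and applying the Minkowski inequality, the statement of Lemma 4.2 follows straightforwardly.

7. Proof of Lemma 4.3

For $r \in 2\mathbb{N}$, $s \in [\theta_{m-1}, \theta_m]$, $\theta_m - \theta_{m-1} = \frac{t}{\kappa}$, $n = 4N$, and $\bar n = 8N$, one gets
$$\begin{aligned}
\mathbb{E}\big[\|\widetilde\phi_{4N,N,\theta_{m-1}}(s)\|_2^{2r}\big] = {} & \frac{e^{2r((s - \theta_m)\kappa + \theta_m)\varepsilon}}{(2\pi)^{2r}} \int_{(I \times \bar I)^r} \prod_{j=1}^r d\alpha_j\, d\beta_j\; e^{-is \sum_{j=1}^r (-1)^j (\alpha_j - \beta_j)} \\
& \times \int_{(\mathbb{T}^3)^{(\bar n + 2) r}} \prod_{j=1}^r dk^{(j)}\, \delta(k_{4N}^{(j)} - k_{4N+1}^{(j)})\; \lambda^{r\bar n}\, \mathbb{E}\Big[\prod_{j=1}^r U^{(j)}[k^{(j)}]\Big] \\
& \times \prod_{j=1}^r \widetilde K^{(j)}_{4N,N,\kappa}[k^{(j)}, \alpha_j, \beta_j, \varepsilon]\; \hat\phi_0^{(j)}(k_0^{(j)})\, \hat\phi_0^{(j)}(k_{8N+1}^{(j)}),
\end{aligned} \tag{159}$$
where $\varepsilon = \frac{1}{t}$. The notations are the same as in the proof of Lemma 4.2. See (154) for the definition of $\widetilde K^{(j)}_{4N,N,\kappa}$. We note that here, the propagators on each particle line previously labeled by $n$ and $n + 1$ are absent, since we are considering $\widetilde\phi_{4N,N,\theta_{m-1}}(\theta_m)$ instead of $\phi_{n,N,\theta_{m-1}}(\theta_m)$, see (46). Let $\Pi^{\mathrm{conn}}_{r;8N,4N}$ denote the subset of $\Pi_{r;8N,4N}$ of completely connected graphs.

Lemma 7.1. Let $r \ge 1$, $r \in \mathbb{N}$, and let $\pi \in \Pi^{\mathrm{conn}}_{r;8N,4N}$ be a completely connected graph. Then, for every $T = \lambda^2\varepsilon^{-1} > 0$, there exists a finite constant $c = c(T)$ such that
$$|\mathrm{Amp}_\delta(\pi)| \le \kappa^{-2rN} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{\frac{r\bar n}{2}}. \tag{160}$$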
Proof. We modify the proof of Lemma 5.5 in the following manner. We observe that (152) contains $r(6N + 2)$ propagators with imaginary parts $\pm i\kappa\varepsilon$, and $2rN$ propagators with imaginary parts $\pm i\varepsilon$. In the proof of Lemma 5.5, $4rN$ out of all propagators were estimated in $L^\infty$, while the rest was estimated in $L^1$. The fact that there are two propagators less per reduced 1-particle line leads to an improvement over the estimates of Lemma 5.2 which we, however, do not need to exploit. Carrying out the same arguments line by line, we estimate $4rN$ out of all propagators in (159) in $L^\infty$. There are $2rN$ propagators carrying an imaginary part $\pm i\varepsilon$ in the denominator. By the pigeonhole principle, at least $2rN$ propagators bounded in $L^\infty$ have a denominator with an imaginary part $\pm i\kappa\varepsilon$. From each of those, one obtains an improvement of the a priori bound in Lemma 5.5 by a factor $\kappa^{-1}$. This is because all propagators estimated in $L^\infty$ in Lemma 5.5 were bounded by $\frac{1}{\varepsilon}$. In total, one gains a factor of at least $\kappa^{-2rN}$ over the estimate in Lemma 5.5. For a more detailed exposition of arguments concerning the time partitioning method, we refer to [4].

Lemma 7.2. Let $r \in 2\mathbb{N}$. Then, for every $T > 0$, there exists a finite constant $c = c(T)$ such that
$$\sum_{\pi \in \Pi_{r;8N,4N}} |\mathrm{Amp}_\delta(\pi)| \le (4rN)!\; \kappa^{-2rN} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\lambda^2\varepsilon^{-1}\log\frac{1}{\varepsilon}\Big)^{4rN}. \tag{161}$$
Proof. Let $\pi \in \Pi_{r;8N,4N}$ have $m$ connectivity components, and let $\pi$ comprise $s_1, \dots, s_m$ particle lines, where $\sum_{l=1}^m s_l = r$. Then,
$$\begin{aligned}
|\mathrm{Amp}_\delta(\pi)| &\le \kappa^{-2N \sum_{l=1}^m s_l} \Big(\log\frac{1}{\varepsilon}\Big)^{3 \sum_{l=1}^m s_l} \Big(c\,\varepsilon^{-1}\lambda^2 \log\frac{1}{\varepsilon}\Big)^{\frac{\bar n}{2} \sum_{l=1}^m s_l} \\
&\le \kappa^{-2rN} \Big(\log\frac{1}{\varepsilon}\Big)^{3r} \Big(c\,\varepsilon^{-1}\lambda^2 \log\frac{1}{\varepsilon}\Big)^{4rN}.
\end{aligned} \tag{162}$$
Moreover, the number of elements of $\Pi_{r;8N,4N}$ is bounded by $(4rN)!\,2^{4rN}$.

The corresponding sum over disconnected graphs can be estimated by the bound in Lemma 6.3. This proves Lemma 4.3.

Acknowledgements. I am deeply grateful to H.-T. Yau and L. Erdös for their support, encouragement, advice, and generosity. I have benefitted immensely from numerous discussions with them, in later stages of this work especially from conversations with L. Erdös. I also thank H.-T. Yau for his very generous hospitality during two visits at Stanford University. I am most grateful to the anonymous referee for very detailed and helpful comments, and for pointing out an error related to the WKB initial condition in an earlier version of the manuscript. This work was supported by NSF grants DMS-0407644 and DMS-0524909, and in part by a grant of the NYU Research Challenge Fund Program while the author was at the Courant Institute, NYU, as a Courant Instructor.
References
1. Aizenman, M., Molchanov, S.: Localization at large disorder and at extreme energies: an elementary derivation. Commun. Math. Phys. 157, 245–278 (1993)
2. Chen, T.: Localization Lengths and Boltzmann Limit for the Anderson Model at Small Disorders in Dimension 3. J. Stat. Phys. 120(1–2), 279–337 (2005)
3. Erdös, L.: Linear Boltzmann equation as the scaling limit of the Schrödinger evolution coupled to a phonon bath. J. Stat. Phys. 107(5), 1043–1127 (2002)
4. Erdös, L., Yau, H.-T.: Linear Boltzmann equation as the weak coupling limit of a random Schrödinger equation. Comm. Pure Appl. Math. LIII, 667–753 (2000)
5. Erdös, L., Salmhofer, M., Yau, H.-T.: Quantum diffusion of random Schrödinger evolution in the scaling limit. http://arxiv.org/abs/math-ph/0502025, 2005
6. Fröhlich, J., Spencer, T.: Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Commun. Math. Phys. 88, 151–184 (1983)
7. Lukkarinen, J., Spohn, H.: Kinetic Limit for Wave Propagation in a Random Medium. http://arxiv.org/abs/math-ph/0505075, 2005
8. Schlag, W., Shubin, C., Wolff, T.: Frequency concentration and localization lengths for the Anderson model at small disorders. J. Anal. Math. 88, 173 (2002)
9. Spohn, H.: Derivation of the transport equation for electrons moving through random impurities. J. Stat. Phys. 17(6), 385–412 (1977)
10. Stein, E.: Harmonic Analysis. Princeton, NJ: Princeton University Press, 1993

Communicated by H.-T. Yau
Commun. Math. Phys. 267, 393–418 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0084-3
Communications in
Mathematical Physics
Multifractal Analysis for Lyapunov Exponents on Nonconformal Repellers Luis Barreira1 , Katrin Gelfert2 1 Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal.
E-mail:
[email protected]
2 Max-Planck-Institut für Physik Komplexer Systeme, Nöthnitzer Str. 38, D-01187 Dresden, Germany.
E-mail:
[email protected] Received: 29 April 2005 / Accepted: 12 May 2006 Published online: 8 August 2006 – © Springer-Verlag 2006
Abstract: For nonconformal repellers satisfying a certain cone condition, we establish a version of multifractal analysis for the topological entropy of the level sets of the Lyapunov exponents. Due to the nonconformality, the Lyapunov exponents are averages of nonadditive sequences of potentials, and thus one cannot use Birkhoff’s ergodic theorem nor the classical thermodynamic formalism. We use instead a nonadditive topological pressure to characterize the topological entropy of each level set. This prevents us from estimating the complexity of the level sets using the classical Gibbs measures, which are often one of the main ingredients of multifractal analysis. Instead, we avoid even equilibrium measures, and thus in particular g-measures, by constructing explicitly ergodic measures, although not necessarily invariant, which play the corresponding role in our work. 1. Introduction The theory of multifractal analysis is a subfield of the dimension theory of dynamical systems. Briefly, multifractal analysis studies the complexity of the level sets of the invariant local quantities obtained from a dynamical system. For example, we can consider Birkhoff averages, Lyapunov exponents, pointwise dimensions, or local entropies. These functions are usually only measurable and thus their level sets are rarely manifolds. Hence, in order to measure the complexity of these sets it is appropriate to use quantities such as the topological entropy or the Hausdorff dimension. The theory of multifractal analysis has also a privileged relation with the experimental study of dynamical systems. More precisely, the so-called multifractal spectra—which are one of the main components of multifractal analysis—are obtained from the study of the complexity of the level sets, and can be determined experimentally essentially with arbitrary precision. 
As such we may expect to be able to recover, to some extent, information about the dynamical system from the information contained in the multifractal spectra. See [6] for a related discussion.

Supported by the Center for Mathematical Analysis, Geometry, and Dynamical Systems, through FCT by Program POCTI/FEDER and the grant SFRH/BPD/12108/2003.

Our main objective is to develop a new approach that allows one to study the multifractal analysis of the level sets of Lyapunov exponents for a class of nonconformal repellers. We recall that a differentiable map is said to be conformal on a given set provided that its differential is a multiple of an isometry at all points of that set. We use the expression nonconformal repeller to refer to a repeller $\Lambda$ of an expanding map which is not conformal on $\Lambda$ (see Sect. 2.1 for details). We emphasize that the dimension theory and the multifractal analysis of dynamical systems are only well understood in the case of conformal uniformly hyperbolic dynamics (either invertible or noninvertible), and thus in particular for systems with sufficiently low dimension. These include saddle-type hyperbolic diffeomorphisms on surfaces, and holomorphic maps in the complex plane with hyperbolic Julia sets.

The study of the dimension of invariant sets of nonconformal transformations has proven to be much more delicate. The main difficulty is related to the possible existence of distinct Lyapunov exponents associated to different directions, which may change from point to point. Another difficulty is that certain number-theoretical properties, and not only the geometric properties of the invariant sets, start playing an important role (we refer to [2] for more details on this phenomenon). Nevertheless, there exist several noteworthy results concerning the dimension theory of certain classes of invariant sets of nonconformal transformations, namely due to Falconer [11, 12], Bothe [8], Simon [19], and Simon and Solomyak [20]. In all these delicate works the authors make some additional assumptions ultimately in order to avoid two main types of difficulties, which are closely related to the above discussion. These difficulties are:
1. the lack of a clear separation between different "Lyapunov" directions, connected with a possible small regularity of the associated distributions;
2. the existence of number-theoretical properties that can cause the variation of the Hausdorff dimension with respect to a certain generic value (such as the one in [11]).
For the current state-of-the-art of the dimension theory of invariant sets of nonconformal transformations we refer the reader to the relevant sections of Pesin's book [15] and of our survey [4]. On the other hand, the multifractal analysis of nonconformal dynamics is essentially open ground. Related works are [7, 13], although they pursue different objectives and use different techniques from the ones developed here.

We now describe a prototype of the problems that we want to address. Consider a local diffeomorphism $f \colon M \to M$ on a compact manifold. Given $x \in M$ and $v \in T_x M$, we define the Lyapunov exponent of $(x, v)$ by
$$\lambda(x, v) = \limsup_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\|, \tag{1}$$
with the convention that $\log 0 = -\infty$. It follows from the abstract theory of Lyapunov exponents (see [5] for full details) that for each $x \in M$ there exist a positive integer $s(x) \le \dim M$, numbers $\lambda_1(x) < \cdots < \lambda_{s(x)}(x)$ (which are the values of the Lyapunov exponent $\lambda(x, \cdot)$), and linear spaces $\{0\} = E_0(x) \subset E_1(x) \subset \cdots \subset E_{s(x)}(x) = T_x M$ such that for $i = 1, \dots, s(x)$ we have
$$E_i(x) = \{v \in T_x M : \lambda(x, v) \le \lambda_i(x)\},$$
and $\lambda(x, v) = \lambda_i(x)$ whenever $v \in E_i(x) \setminus E_{i-1}(x)$. We also consider the values of the Lyapunov exponent $\lambda(x, \cdot)$ counted with multiplicities, i.e., the numbers
$$\rho_1(x) \le \cdots \le \rho_m(x) \quad \text{with } m = \dim M, \tag{2}$$
where for each $i = 1, \dots, s(x)$ we define $\rho_j(x) = \lambda_i(x)$ whenever $j = \dim E_{i-1}(x) + 1, \dots, \dim E_i(x)$. It follows from Oseledets' multiplicative ergodic theorem (see for example [5]), or more precisely from its version for noninvertible transformations, that for every finite $f$-invariant measure on $M$ there is a set $X \subset M$ of full measure such that if $x \in X$ then
$$\lim_{n \to +\infty} \frac{1}{n} \log \|d_x f^n v\| = \lambda_i(x) \tag{3}$$
for every $v \in E_i(x) \setminus E_{i-1}(x)$ and $i = 1, \dots, s(x)$ (in particular, the lim sup in (1) is now a limit). Given $\alpha = (\alpha_1, \dots, \alpha_m) \in \mathbb{R}^m$ we define the level set $F(\alpha)$ of the Lyapunov exponents by
$$F(\alpha) = \{x \in X : \rho_j(x) = \alpha_j \text{ for } j = 1, \dots, m\} \tag{4}$$
(we stress that by taking points in $X$ we are also assuming the existence of the limits, as in (3)). Our main aim is to describe the complexity of these sets. More precisely, given a compact $f$-invariant set $\Lambda \subset M$ we want to characterize the topological entropy $h_{\mathrm{top}}(f|F(\alpha) \cap \Lambda)$ of $f$ on $F(\alpha) \cap \Lambda$ as a function of $\alpha$. We note that the sets $F(\alpha)$ need not be compact, and in fact this should be the typical situation (see [2]). Accordingly we are using here the notion of topological entropy on noncompact sets introduced by Bowen in [9] (see also Sect. 7).

For example, if $f$ is conformal, i.e., $d_x f$ is a multiple of an isometry for each $x \in M$, then
$$\lambda(x, v) = \limsup_{n \to +\infty} \frac{1}{n} \log \|d_x f^n\| = \limsup_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k x) \tag{5}$$
for every $v \in T_x M \setminus \{0\}$, where $\varphi(x) = \log \|d_x f\|$. In particular, when $f$ is conformal, all values $\rho_j(x)$ of the Lyapunov exponent are equal. Thus, in this situation the set $F(\alpha)$ in (4) only depends on one parameter, and it is sufficient to study the level sets
$$G(\beta) = \Big\{x \in M : \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(f^k x) = \beta\Big\} \tag{6}$$
for $\beta \in \mathbb{R}$. Note that $G(\beta) = F(\beta, \dots, \beta)$. It is known (see [15] for full details and references) that if $\Lambda \subset M$ is a repeller of a $C^{1+\alpha}$ transformation $f$ such that $f|\Lambda$ is conformal, then the map $\beta \mapsto h_{\mathrm{top}}(f|G(\beta) \cap \Lambda)$ is real analytic, and we have the relation
$$h_{\mathrm{top}}(f|G(\beta) \cap \Lambda) = \sup_{q \in \mathbb{R}} \big(T(q) + q\beta\big), \tag{7}$$
where
$$T(q) = P(q\varphi) - qP(\varphi) \tag{8}$$
with $P$ the topological pressure with respect to $f|\Lambda$.
When we consider an arbitrary map $f$, i.e., one which is not necessarily conformal, in general one is not able to replace the limits in (3) (and in (4)) by limits of Birkhoff averages as in (6). This causes several additional complications when we study the function
$$\alpha \mapsto h_{\mathrm{top}}(f|F(\alpha) \cap \Lambda), \tag{9}$$
where $\Lambda$ is a given compact $f$-invariant set, such as for example a repeller of $f$. In particular, instead of the sequence
$$\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$$
in (5) and (6), we need to consider sequences $\varphi_n$ that may not satisfy any additivity (or even subadditivity), contrary to what happens in (5) and (6). This prevents us from using the classical thermodynamic formalism and the (classical) topological pressure $P$ when we study the complexity of the level sets $F(\alpha)$. Nevertheless, we are still able to develop what we consider a natural approach using a nonadditive version of the topological pressure. In particular, we establish an identity for the function in (9), which is analogous to that in (7) with the topological pressure $P$ in (8) replaced by the nonadditive pressure.

We would also like to consider repellers of maps that need not be of class $C^{1+\alpha}$. However, in this situation one is not able in general to obtain Gibbs measures, even in the case of conformal repellers, which could allow us to establish the generalization of the identity in (7). This is due to the fact that one is naturally led to consider nonadditive sequences of potentials, and thus we require, at least to some extent, a genuine nonadditive thermodynamic formalism. Unfortunately, there exists as yet no corresponding theory of nonadditive equilibrium measures, which could be another possible approach to obtain our results. Instead, we develop a new approach which involves constructing explicitly certain ergodic measures, although a priori not necessarily invariant. In the case of $C^{1+\alpha}$ conformal maps the measures that we construct indeed coincide with the Gibbs measures considered in the classical multifractal analysis, which are the equilibrium measures $\mu_q$ of the functions $q\varphi$. On the other hand, we emphasize that for nonconformal maps, even of class $C^{1+\alpha}$, we are in general not able to use Gibbs measures.

The content of the paper is the following. In Sect. 2 we introduce some notions that are needed in the formulation of our main results in Sect. 3. We discuss a special class of almost additive sequences in Sect. 4.
These are related to the Lyapunov exponents and are crucial in our work. In Sect. 5 we introduce a class of measures related to the level sets of Lyapunov exponents, and we discuss some of its properties in Sect. 6. We also discuss the nonadditive topological pressure in Sect. 7, as well as its relevance to our work. Combining the results in Sects. 4–7 we then establish our results in Sects. 8 and 9.

2. Preliminaries

We present in this section several notions that are needed in the formulation of our main results in Sect. 3. In particular, we consider a model class of nonconformal repellers in $\mathbb{R}^2$ instead of striving for any formal generalization. This will allow us to highlight the main ideas without any accessory technicalities.
2.1. Repellers and Lyapunov exponents. Let $\Lambda$ be a repeller of a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. This means that $\Lambda$ is compact, $f$-invariant ($f^{-1}\Lambda = \Lambda$), and that $f$ is expanding on $\Lambda$, i.e., there exist $c > 0$ and $\beta > 1$ such that $\|d_x f^n v\| \ge c\beta^n \|v\|$ for every $x \in \Lambda$, $n \in \mathbb{N}$, and $v \in T_x\mathbb{R}^2$. We also assume that there is an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n \in \mathbb{N}} f^n U$, and that $f|\Lambda$ is topologically mixing. We recall that the singular values $\sigma_1(A) \ge \sigma_2(A)$ of the $2 \times 2$ matrix $A$ are the eigenvalues, counted with multiplicities, of the matrix $(A^*A)^{1/2}$, where $A^*$ denotes the transpose of $A$. We consider functions $\varphi_{i,n}\colon \Lambda \to \mathbb{R}$ for $i = 1, 2$ and $n \in \mathbb{N}$, defined on the repeller by

$$\varphi_{i,n}(x) = \log \sigma_i(d_x f^n). \tag{10}$$

Given $\alpha = (\alpha_1, \alpha_2) \in \mathbb{R}^2$ we want to study the complexity of the level sets

$$E(\alpha) = \Bigl\{ x \in \Lambda : \lim_{n\to\infty} \frac{1}{n} \varphi_n(x) = \alpha \Bigr\}, \tag{11}$$

where $\varphi_n = (\varphi_{1,n}, \varphi_{2,n})$. In particular, we are interested in the topological entropy $h_{\mathrm{top}}(f|E(\alpha))$ of the map $f$ on each level set $E(\alpha)$. It follows from Oseledets' multiplicative ergodic theorem (see for example [5]) that for any $f$-invariant finite measure $\mu$ on $\Lambda$ the set of points $x \in \Lambda$ for which (see (2))

$$(\rho_1(x), \rho_2(x)) = \lim_{n\to+\infty} \frac{1}{n} \varphi_n(x)$$
has full $\mu$-measure. In particular we have $E(\alpha) = F(\alpha)$ (mod 0) for every $\alpha \in \mathbb{R}^2$, with $F(\alpha)$ as in (4) with $m = 2$. Accordingly, we may also think of the set $E(\alpha)$ as a level set of the Lyapunov exponents. We call the function $\alpha \mapsto h_{\mathrm{top}}(f|E(\alpha))$ the entropy spectrum of the Lyapunov exponents.

2.2. Markov partitions and topological pressure. We recall that a collection of closed sets $R_1, \ldots, R_p \subset \Lambda$ is a Markov partition of the repeller $\Lambda$ if:
1. $\Lambda = \bigcup_{i=1}^{p} R_i$, and $\overline{\operatorname{int} R_i} = R_i$ for $i = 1, \ldots, p$;
2. $\operatorname{int} R_i \cap \operatorname{int} R_j = \emptyset$ whenever $i \ne j$;
3. $f R_i \supset R_j$ whenever $f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset$.

It is well known that any repeller possesses Markov partitions of arbitrarily small diameter (see [17]). Given a Markov partition of the repeller $\Lambda$, we define a $p \times p$ matrix $A = (a_{ij})$ with entries

$$a_{ij} = \begin{cases} 1 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j \ne \emptyset, \\ 0 & \text{if } f(\operatorname{int} R_i) \cap \operatorname{int} R_j = \emptyset, \end{cases}$$

and we consider the associated topological Markov chain $\sigma\colon \Sigma_A \to \Sigma_A$ defined by $\sigma(i_1 i_2 \cdots) = (i_2 i_3 \cdots)$ on the set

$$\Sigma_A = \bigl\{ (i_1 i_2 \cdots) \in \{1, \ldots, p\}^{\mathbb{N}} : a_{i_k i_{k+1}} = 1 \text{ for every } k \in \mathbb{N} \bigr\}.$$
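We denote by $\Sigma_{A,n}$ the set of $n$-tuples which are the first $n$ elements of some sequence in $\Sigma_A$, i.e., $(i_1 \cdots i_n) \in \Sigma_{A,n}$ if and only if there exists $(j_1 j_2 \cdots) \in \Sigma_A$ such that $i_\ell = j_\ell$ for $\ell = 1, \ldots, n$. For each $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we define

$$\Delta_{i_1 \cdots i_n} = \bigcap_{\ell=1}^{n} f^{-\ell+1} R_{i_\ell}. \tag{12}$$

We obtain a coding map $\chi\colon \Sigma_A \to \Lambda$ for the repeller, given by

$$\chi(i_1 i_2 \cdots) = \bigcap_{\ell=0}^{\infty} f^{-\ell} R_{i_{\ell+1}} = \bigcap_{n=1}^{\infty} \Delta_{i_1 \cdots i_n}.$$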
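As an illustration of the symbolic coding, the following minimal Python sketch enumerates the admissible words of length n for a given transition matrix and compares their number with the sum of the entries of the (n-1)-th power of the matrix. The function names and the example matrix (the golden-mean shift) are assumptions made for this sketch and do not come from the paper. The sketch also assumes that every row of the matrix contains at least one 1, so that each finite admissible word extends to an infinite admissible sequence.

```python
def admissible_words(A, n):
    """All n-tuples (i_1, ..., i_n), with 0-based symbols, such that
    A[i_k][i_{k+1}] == 1 for every consecutive pair."""
    p = len(A)
    words = [(i,) for i in range(p)]
    for _ in range(n - 1):
        # Extend each word by every symbol allowed after its last symbol.
        words = [w + (j,) for w in words for j in range(p) if A[w[-1]][j] == 1]
    return words


def count_via_matrix_power(A, n):
    """Number of admissible n-words, computed as the sum of entries of A^(n-1)."""
    p = len(A)
    M = [[1 if i == j else 0 for j in range(p)] for i in range(p)]  # identity
    for _ in range(n - 1):
        M = [[sum(M[i][k] * A[k][j] for k in range(p)) for j in range(p)]
             for i in range(p)]
    return sum(sum(row) for row in M)


# Hypothetical example: golden-mean shift (symbol 1 cannot follow symbol 1).
A = [[1, 1],
     [1, 0]]
words = admissible_words(A, 4)
```

Both counts agree for this example, which reflects the standard fact that the entries of the powers of the transition matrix count admissible paths.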
We will express the topological entropy of the level sets $E(\alpha)$ (see (11)) in terms of the following nonadditive version of the topological pressure. Let $\Phi = (\varphi_n)_n$ be a sequence of continuous functions $\varphi_n\colon \Lambda \to \mathbb{R}$. We define

$$P(\Phi) = \limsup_{n\to\infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \max_{x \in \Delta_{i_1 \cdots i_n}} \varphi_n(x). \tag{13}$$
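To make the definition in (13) concrete, here is a hedged Python sketch for a purely symbolic toy model, which is an assumption of this example and not the setting of the paper: the sequence is additive and generated by a potential that depends only on the first symbol, so it is constant on each cylinder and the maximum in (13) is trivial. The names `pressure_approx`, `A_full`, `A_gm`, and the weights `g` are hypothetical.

```python
import math


def pressure_approx(A, g, n):
    """(1/n) * log of the sum, over admissible n-words (i_1, ..., i_n),
    of exp(g[i_1] + ... + g[i_n])."""
    p = len(A)
    # v[j] = total weight of admissible words of the current length ending in j.
    v = [math.exp(g[j]) for j in range(p)]
    for _ in range(n - 1):
        v = [math.exp(g[j]) * sum(v[i] * A[i][j] for i in range(p))
             for j in range(p)]
    return math.log(sum(v)) / n


# Full shift on two symbols: the partition sum is (e^{g_0} + e^{g_1})^n exactly.
A_full = [[1, 1], [1, 1]]
g = [math.log(2.0), math.log(3.0)]
full_value = pressure_approx(A_full, g, 10)

# Golden-mean shift with zero potential: the values approach log((1 + sqrt 5)/2).
A_gm = [[1, 1], [1, 0]]
gm_value = pressure_approx(A_gm, [0.0, 0.0], 200)
```

In this additive toy case the finite-n values converge, in line with the remark that the lim sup in (13) is then a limit.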
One can show that $P(\Phi)$ is independent of the particular Markov partition that we use in its definition (this follows from results in [1]; see Sect. 7). For simplicity, we continue to refer to $P(\Phi)$ as the topological pressure of $\Phi$ (with respect to $f|\Lambda$). It coincides with the notion of nonadditive upper capacity topological pressure introduced in [1, Sect. 1.5] (see also Sect. 7 for a related discussion), when restricted to the case of symbolic dynamics. One can easily verify that if $\Phi$ is the sequence composed of the functions $\varphi_n = \sum_{k=0}^{n-1} \varphi \circ f^k$ for some fixed continuous function $\varphi\colon \Lambda \to \mathbb{R}$, then $P(\Phi) = P(\varphi)$, where $P(\varphi)$ is the classical topological pressure of $\varphi$ (with respect to $f|\Lambda$), and in particular the lim sup in (13) can be replaced by a limit. We note that for an arbitrary sequence $\Phi$ the lim sup in (13) may not be a limit. However, it is a limit for the so-called almost additive sequences considered below (see Sect. 4).

2.3. Tempered distortion. Consider $\delta > 0$ such that $f$ is invertible on the ball $B(x, \delta)$ for every $x \in \Lambda$ (simply take a Lebesgue number of a cover by balls such that $f$ is invertible on each of them). For each $x \in \Lambda$ and $n \in \mathbb{N}$ we define

$$B_n(x, \delta) = \bigcap_{\ell=0}^{n-1} f^{-\ell} B(f^\ell x, \delta).$$
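We always assume that the diameter of the Markov partition used to define the sets $\Delta_{i_1 \cdots i_n}$ in (12) is at most $\delta/2$. This ensures that

$$\Delta_{i_1 \cdots i_n} \subset B_n(x, \delta) \text{ for every } x = \chi(i_1 i_2 \cdots) \in \Lambda \text{ and } n \in \mathbb{N}. \tag{14}$$

We say that $f$ has tempered distortion on $\Lambda$ if for some $\delta > 0$,

$$\limsup_{n\to\infty} \frac{1}{n} \log \sup \bigl\{ \|d_y f^n (d_z f^n)^{-1}\| : x \in \Lambda \text{ and } y, z \in B_n(x, \delta) \bigr\} = 0, \tag{15}$$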
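and that $f$ has bounded distortion on $\Lambda$ if for some $\delta > 0$,

$$\sup \bigl\{ \|d_y f^n (d_z f^n)^{-1}\| : x \in \Lambda \text{ and } y, z \in B_n(x, \delta) \bigr\} < \infty.$$

Note that if one of these properties holds for some $\delta$ then it also holds for any smaller $\delta$. Given $d = (d_1, d_2) \in \mathbb{R}^2$ and the sequence of functions $\varphi_n = (\varphi_{1,n}, \varphi_{2,n})$ defined by (10) we write $\langle d, \varphi_n \rangle = \sum_{i=1}^{2} d_i \varphi_{i,n}$.

Proposition 1. Let $\Lambda$ be a repeller of the $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$. If $f$ has tempered distortion on $\Lambda$, then there exists a positive sequence $(\rho_n)_n$ decreasing to 0 such that for every $d \in \mathbb{R}^2$, $n \in \mathbb{N}$, and $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we have

$$\frac{\max_{x \in \Delta_{i_1 \cdots i_n}} \exp\langle d, \varphi_n(x) \rangle}{\min_{y \in \Delta_{i_1 \cdots i_n}} \exp\langle d, \varphi_n(y) \rangle} \le e^{n\rho_n \|d\|}. \tag{16}$$

If $f$ has bounded distortion on $\Lambda$, then there exists $D > 0$ such that for every $d \in \mathbb{R}^2$, $n \in \mathbb{N}$, and $(i_1 \cdots i_n) \in \Sigma_{A,n}$ we have

$$\frac{\max_{x \in \Delta_{i_1 \cdots i_n}} \exp\langle d, \varphi_n(x) \rangle}{\min_{y \in \Delta_{i_1 \cdots i_n}} \exp\langle d, \varphi_n(y) \rangle} \le D^{\|d\|}. \tag{17}$$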
Proof. Assume first that $f$ has tempered distortion on $\Lambda$. It follows from (15) that there exists a positive sequence $(\rho_n)_n$ decreasing to zero such that $\|d_y f^n (d_z f^n)^{-1}\| \le e^{n\rho_n/2}$ for every $x \in \Lambda$ and $y, z \in B_n(x, \delta)$. Since $d_y f^n = (d_y f^n (d_z f^n)^{-1}) d_z f^n$ this yields $d_y f^n(B) \subset e^{n\rho_n/2} d_z f^n(B)$, where $B \subset T_x M$ is the unit ball centered at 0. Since the numbers $\sigma_i(d_y f^n)$ coincide with the lengths of the semiaxes of $d_y f^n(B)$ we conclude that for $i = 1, 2$, $\sigma_i(d_y f^n)/\sigma_i(d_z f^n) \le e^{n\rho_n}$ and thus $\varphi_{i,n}(y) - \varphi_{i,n}(z) \le n\rho_n$. Therefore $\langle d, \varphi_n(y) \rangle - \langle d, \varphi_n(z) \rangle \le n\rho_n \|d\|$, and in view of (14) this yields (16). When $f$ has bounded distortion on $\Lambda$ we can replace $n\rho_n$ by a constant, and we obtain (17).

The tempered distortion property ensures that one can replace the maximum in (13) by the minimum (or in fact by any intermediate value) when we consider the topological pressure $P(\Phi)$ of a sequence of potentials. We now give a condition for bounded distortion in the case of $C^{1+\alpha}$ transformations. Given $\alpha > 0$, we say that $f$ is $\alpha$-bunched on $\Lambda$ if $\|(d_x f)^{-1}\|^{1+\alpha} \|d_x f\| < 1$ for every $x \in \Lambda$. Notice that any conformal map on $\Lambda$ is $\alpha$-bunched for every $\alpha > 0$. The following statement is an immediate consequence of the proof of Theorem 4 in [3].
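Proposition 2. Let $\Lambda$ be a repeller of the $C^{1+\alpha}$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ which is $\alpha$-bunched on $\Lambda$. Then $f$ has bounded distortion on $\Lambda$.

2.4. Cone condition. Given a number $\gamma \le 1$ and a 1-dimensional subspace $E(x) \subset T_x\mathbb{R}^2$, we define the cone

$$C_\gamma(x) = \{ (u, v) \in E(x) \oplus E(x)^\perp : \|v\| \le \gamma \|u\| \}. \tag{18}$$

We say that a differentiable map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ satisfies a cone condition on a set $\Lambda \subset \mathbb{R}^2$ if there exist $\gamma \le 1$ and for each $x \in \Lambda$ a 1-dimensional subspace $E(x) \subset T_x\mathbb{R}^2$ varying continuously with $x$ such that

$$(d_x f) C_\gamma(x) \subset \{0\} \cup \operatorname{int} C_\gamma(f x). \tag{19}$$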
We present several examples of maps satisfying a cone condition.

Example 1. Assume that for each $x \in \Lambda$ the derivative $d_x f$ is represented by a positive $2 \times 2$ matrix. Then the first quadrant $Q$ is invariant under these linear transformations, i.e., $(d_x f) Q \subset Q$ for each $x \in \Lambda$. Therefore, the map $f$ satisfies the cone condition in (19) with $\gamma = 1$, taking for $E(x)$ the 1-dimensional subspace making an angle of $\pi/4$ with the horizontal direction.

Another class of examples corresponds to the existence of a strongly unstable foliation.

Example 2. Let $\Lambda$ be a locally maximal repeller in the sense that in some open neighborhood $U$ the repeller is the only invariant set. In this case $f^{-1}\Lambda \cap U = \Lambda$. Assume that there exists a strongly unstable foliation of the set $U$, i.e., a foliation by 1-dimensional $C^2$ leaves $V(x)$ such that:
1. $f(V(x)) \supset V(f x)$ for every $x \in U \cap f^{-1} U$;
2. there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that

$$\frac{|\det d_x f^n|}{\|d_x f^n | T_x V(x)\|^2} \le c\lambda^n \text{ for all } x \in \bigcap_{i=0}^{n} f^{-i} U \text{ and } n \in \mathbb{N}.$$
It is shown in [14] that this assumption is equivalent to: 1. for some choice of subspaces E(x) varying continuously with x, the cone condition in (19) holds for every x ∈ U ∩ f −1 U ; 2. there exist 1-dimensional subspaces F(x) ⊂ {0} ∪ int Cγ (x) for each x ∈ U ∩ f −1 U such that dx f F(x) = F( f x). Thus, repellers with a strongly unstable foliation satisfy a cone condition. Notice that the cone condition in (19) is weaker than assuming the existence of a strongly unstable foliation. In particular, (19) does not ensure the existence of an invariant distribution F(x) as in Example 2. On the other hand, when there exists a strongly unstable foliation, the invariant distribution F(x) is given by (see [14])
F(x) = d y f n Cγ (y). n∈N y∈ f −n x
It is thus independent of the particular preimages xn ∈ f −n x, i.e.,
dxn f n Cγ (xn ). F(x) = n∈N
We can also consider repellers with a dominated splitting.

Example 3. We say that the repeller $\Lambda$ possesses a dominated splitting if there exists a decomposition $T_\Lambda\mathbb{R}^2 = E \oplus F$ such that:
1. $d_x f E(x) = E(f x)$ and $d_x f F(x) = F(f x)$ for every $x \in \Lambda$;
2. there exist constants $c > 0$ and $\lambda \in (0, 1)$ such that $\|d_x f^n | E\| \cdot \|(d_x f)^{-n} | F\| \le c\lambda^n$ for all $x \in \Lambda$ and $n \in \mathbb{N}$.

It follows easily from the definition that the subspaces $E(x)$ and $F(x)$ vary continuously with $x$. Furthermore, one can verify that when there exists a dominated splitting of $\Lambda$, the map $f$ satisfies a cone condition on $\Lambda$. We note that the existence of a strongly unstable foliation does not ensure the existence of a dominated splitting, due to the requirement of a $df$-invariant decomposition $E \oplus F$ (more precisely, the existence of a strongly unstable foliation only ensures the existence of the invariant distribution $F$ in Example 2).

3. Main Results

We formulate our main results in this section. We first introduce some notions. Consider the sequence $\langle d, \Phi \rangle = (\langle d, \varphi_n \rangle)_n$. We say that a vector $\alpha \in \mathbb{R}^2$ is a gradient of the topological pressure, or more precisely of the function $d \mapsto P(\langle d, \Phi \rangle)$, if there exists $d(\alpha) \in \mathbb{R}^2$ such that $\alpha = \nabla P(\langle d(\alpha), \Phi \rangle)$ (in particular, this includes the requirement that the gradient is well-defined). We note that $d(\alpha)$ may not be unique. We show below that the function $d \mapsto P(\langle d, \Phi \rangle)$ is convex (see Proposition 6). Thus, it readily follows from the theory of convex analysis (see for example [16]) that $d \mapsto P(\langle d, \Phi \rangle)$ is differentiable on a $G_\delta$ dense set.

Let $\nu$ be a Borel $f$-invariant probability measure on $\Lambda$. The entropy of the partition $\xi_n = \{ \Delta_{i_1 \cdots i_n} : (i_1 \cdots i_n) \in \Sigma_{A,n} \}$ with respect to the measure $\nu$ is defined by

$$H_\nu(\xi_n) = -\sum_{i_1 \cdots i_n} \nu(\Delta_{i_1 \cdots i_n}) \log \nu(\Delta_{i_1 \cdots i_n}),$$

and the entropy of $f$ with respect to $\nu$ is given by

$$h_\nu(f) = \lim_{n\to\infty} \frac{1}{n} H_\nu(\xi_n) = \inf_{n \ge 1} \frac{1}{n} H_\nu(\xi_n).$$

We now formulate our main result concerning the topological entropy of the level sets $E(\alpha)$.

Theorem 1 (Entropy spectrum of the Lyapunov exponents). Let $\Lambda$ be a repeller of a $C^{1+\alpha}$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$, $\alpha \in (0, 1]$ such that:
1. $f$ satisfies a cone condition on $\Lambda$;
2. $f$ has bounded distortion on $\Lambda$.
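Then for each gradient $\alpha \in \mathbb{R}^2$ of the topological pressure we have

$$h_{\mathrm{top}}(f|E(\alpha)) = \inf_{d \in \mathbb{R}^2} [P(\langle d, \Phi \rangle) - \langle d, \alpha \rangle]. \tag{20}$$

Furthermore, there exists a unique ergodic $f$-invariant probability measure $\mu_{d(\alpha)}$ on $\Lambda$ concentrated on $E(\alpha)$ such that

$$P(\langle d(\alpha), \Phi \rangle) = h_{\mu_{d(\alpha)}}(f) + \langle d(\alpha), \alpha \rangle, \tag{21}$$

and there exists a constant $K > 0$ (independent of $\alpha$) such that for every $n \in \mathbb{N}$ and $x \in \Delta_{i_1 \cdots i_n}$ we have

$$K^{-(1+\|d(\alpha)\|)} \le \frac{\mu_{d(\alpha)}(\Delta_{i_1 \cdots i_n})}{\exp[-n P(\langle d(\alpha), \Phi \rangle) + \langle d(\alpha), \varphi_n(x) \rangle]} \le K^{1+\|d(\alpha)\|}. \tag{22}$$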
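The Legendre-type formula in (20) can be illustrated on a deliberately simple model. The following Python sketch is a toy one-parameter analogue and not the construction of the paper: it assumes a full shift with a potential taking the hypothetical values `a` on the symbols, so that the pressure is $P(q) = \log \sum_i e^{q a_i}$, and it checks numerically that the infimum of $P(q) - q\alpha$ at a gradient $\alpha = P'(q_0)$ equals the entropy of the Gibbs weights at $q_0$. All names are assumptions made for this illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def pressure(q, a):
    """Toy pressure P(q) = log sum_i exp(q * a_i)."""
    return log_sum_exp([q * ai for ai in a])


def gibbs_weights(q, a):
    """Probability vector p_i = exp(q * a_i - P(q))."""
    z = pressure(q, a)
    return [math.exp(q * ai - z) for ai in a]


def legendre_inf(alpha, a, lo=-50.0, hi=50.0, iters=200):
    """Minimise the convex function q -> P(q) - q * alpha by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if pressure(m1, a) - m1 * alpha < pressure(m2, a) - m2 * alpha:
            hi = m2
        else:
            lo = m1
    q = (lo + hi) / 2.0
    return pressure(q, a) - q * alpha


a = [0.0, 1.0, 3.0]                                  # hypothetical potential values
q0 = 0.7
p = gibbs_weights(q0, a)
alpha = sum(pi * ai for pi, ai in zip(p, a))         # alpha = P'(q0)
entropy = -sum(pi * math.log(pi) for pi in p)
spectrum_value = legendre_inf(alpha, a)
```

In this toy model the identity $P(q_0) = H(p) + q_0 \alpha$ plays the role of (21), with the Gibbs weights standing in for the measure $\mu_{d(\alpha)}$.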
As a byproduct of our approach we also obtain the following statement, concerning the existence of measures possessing a weak version of the Gibbs property in (22).

Theorem 2 (Existence of weak Gibbs measures). Let $\Lambda$ be a repeller of a $C^1$ map $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ such that:
1. $f$ satisfies a cone condition on $\Lambda$;
2. $f$ has tempered distortion on $\Lambda$.

Then there exist a constant $K > 0$, a positive sequence $(\rho_n)_n$ decreasing to 0, and for each $d \in \mathbb{R}^2$ an ergodic measure $\nu_d$ on $\Lambda$ such that given $n \in \mathbb{N}$ and $x \in \Delta_{i_1 \cdots i_n}$ we have

$$K^{-(1+\|d\|)} e^{-n\rho_n \|d\|} \le \frac{\nu_d(\Delta_{i_1 \cdots i_n})}{\exp[-n P(\langle d, \Phi \rangle) + \langle d, \varphi_n(x) \rangle]} \le K^{1+\|d\|} e^{n\rho_n \|d\|}.$$

We emphasize that the measures $\nu_d$ in Theorem 2 need not be invariant. The proofs of Theorems 1 and 2 are given in Sect. 9, using the auxiliary material developed in the following sections.

4. Almost Additive Sequences

4.1. Topological pressure. We say that the sequence $\Phi = (\varphi_n)_n$ is almost additive (with respect to $f|\Lambda$) if there exists $C > 0$ such that for every $n, m \in \mathbb{N}$ and $x \in \Lambda$ we have

$$-C + \varphi_n(x) + \varphi_m(f^n x) \le \varphi_{n+m}(x) \le C + \varphi_n(x) + \varphi_m(f^n x). \tag{23}$$
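We have the following result for the topological pressure of almost additive sequences.

Proposition 3. Let $\Lambda$ be a repeller of a $C^1$ map, and let $\Phi$ be an almost additive sequence of continuous functions on $\Lambda$ such that

$$\frac{1}{n} \sup \bigl\{ |\varphi_n(y) - \varphi_n(z)| : y, z \in \Delta_{i_1 \cdots i_n} \text{ and } (i_1 \cdots i_n) \in \Sigma_{A,n} \bigr\} \to 0 \tag{24}$$

as $n \to \infty$. Then

$$P(\Phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{i_1 \cdots i_n} \exp \varphi_n(x_{i_1 \cdots i_n})$$

for any points $x_{i_1 \cdots i_n} \in \Delta_{i_1 \cdots i_n}$ for each $n \in \mathbb{N}$ and $(i_1 \cdots i_n) \in \Sigma_{A,n}$.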
The proof of Proposition 3 requires the notion of nonadditive topological pressure introduced in [1] and is given in Sect. 8. The condition in (24) is of the same type as the tempered distortion property, and indeed in our applications of Proposition 3 the condition (24) can be readily obtained from this property.
4.2. Almost multiplicativity of the singular values. We now study the sequences of functions in (10) obtained from the singular values of the linear transformations $A = d_x f\colon \mathbb{R}^2 \to \mathbb{R}^2$. Since $(A^*A)^{1/2}$ is symmetric, there exists an orthonormal basis $e_1, e_2$ of $\mathbb{R}^2$ which consists of eigenvectors of $(A^*A)^{1/2}$ corresponding to the eigenvalues $\sigma_i(A)$. Furthermore, the vectors $Ae_1$, $Ae_2$ are orthogonal, and satisfy $\sigma_1(A) = \|A\| = \|Ae_1\|$ and $\sigma_2(A) = \|A^{-1}\|^{-1} = \|Ae_2\|$ (with the 2-norm in $\mathbb{R}^2$). We want to show that under the cone condition in (19) the sequences of functions in (10) are almost additive.

Proposition 4. Let $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ be a $C^1$ local diffeomorphism, and let $\Lambda \subset \mathbb{R}^2$ be a compact $f$-invariant set. If $f$ satisfies a cone condition on $\Lambda$, then there exists $C \ge 1$ such that for every $x \in \Lambda$ and $n, m \in \mathbb{N}$ we have

$$C^{-1} \sigma_1(d_x f^n) \sigma_1(d_{f^n x} f^m) \le \sigma_1(d_x f^{n+m}) \le \sigma_1(d_x f^n) \sigma_1(d_{f^n x} f^m),$$

and

$$\sigma_2(d_x f^n) \sigma_2(d_{f^n x} f^m) \le \sigma_2(d_x f^{n+m}) \le C \sigma_2(d_x f^n) \sigma_2(d_{f^n x} f^m).$$

Proof. Let $x \in \Lambda$. Note that $C_\gamma(x)$ is taken by the linear transformation $d_x f$ into a subset of $C_\gamma(f x)$ which is also a cone, namely

$$d_x f C_\gamma(x) = \{ (u, v) \in F(x) \oplus F(x)^\perp : \|v\| \le \gamma(x) \|u\| \},$$

for some number $\gamma(x) < 1$ and some 1-dimensional subspace $F(x) \subset T_{f x}\mathbb{R}^2$. Furthermore, since $f$ is of class $C^1$ and the subspaces $E(x)$ in (18) vary continuously with $x$, we can always assume that the function $x \mapsto \gamma(x)$ is continuous. Take unit vectors $v, u_1 \in C_\gamma(x)$ and choose another unit vector $u_2$ orthogonal to $u_1$, such that $u_1$ and $u_2$ are eigenvectors of the map $((d_x f)^* d_x f)^{1/2}$. Then the vectors $v_i = d_x f u_i$, $i = 1, 2$ are also orthogonal. Writing $v = \cos\beta_v\, u_1 + \sin\beta_v\, u_2$ we have

$$|\cos\beta_v| = \cos\angle(u_1, v) \ge \frac{1 - \gamma(x)^2}{1 + \gamma(x)^2} > 0. \tag{25}$$
By the continuity of the function $x \mapsto \gamma(x)$,

$$a := \inf_{x \in \Lambda} \frac{1 - \gamma(x)^2}{1 + \gamma(x)^2} > 0.$$

Since
dx f v = cos βv dx f u 1 + sin βv dx f u 2 = cos βv
v1 v2 + sin βv , v1 v2
(26)
in view of (25),

$$\|d_x f v\| \ge |\cos\beta_v| \cdot \|d_x f u_1\| \ge a \|d_x f u_1\|. \tag{27}$$
On the other hand, by (26), tan βv = tan βv v2 /v1 .
(28)
Let v, w ∈ Cγ (x)\int Cγ (x) be vectors along the boundaries of Cγ (x), with positive projection in the direction of u 1 . By the cone condition in (19), we have βv + βw = (v, w) > (dx f v, dx f w) = βv + βw .
(29)
If v2 ≥ v1 , then it would follow from (28) that βv ≥ βv and βw ≥ βw . But this contradicts (29). Therefore, we must have v2 < v1 , and vi = dx f u i = σi (dx f ) for i = 1, 2.
(30)
Given $y \in \Lambda$ and $k \in \mathbb{N}$ let $v_{y,k} \ne 0$ be an eigenvector of $((d_y f^k)^* d_y f^k)^{1/2}$ corresponding to the largest eigenvalue. It follows from (30) (with $f$ replaced by $f^k$) that $v_{y,k} \in C_\gamma(y)$. Consider now the vector $w = d_x f^n v_{x,n+m} / \|d_x f^n v_{x,n+m}\| \in C_\gamma(f^n x)$. Since

$$\|d_x f^n v_{x,n}\| = \|d_x f^n\| \quad \text{and} \quad \|d_x f^{n+m} v_{x,n+m}\| = \|d_x f^{n+m}\|,$$

it follows from (27) (with $f$ replaced by $f^n$ and $f^m$) that

$$\|d_x f^n v_{x,n+m}\| \ge a \|d_x f^n v_{x,n}\| \quad \text{and} \quad \|d_{f^n x} f^m w\| \ge a \|d_{f^n x} f^m v_{f^n x, m}\|.$$

We now obtain from $d_x f^{n+m} v_{x,n+m} = (d_{f^n x} f^m \circ d_x f^n) v_{x,n+m}$ that

$$\|d_x f^{n+m}\| = \|d_{f^n x} f^m w\| \cdot \|d_x f^n v_{x,n+m}\| \ge a^2 \|d_{f^n x} f^m v_{f^n x, m}\| \cdot \|d_x f^n v_{x,n}\|$$

and the first statement follows. The second statement follows readily from the identities $|\det A| = \sigma_1(A)\sigma_2(A)$ and $|\det(AB)| = |\det A| \cdot |\det B|$.

In view of Proposition 4 we conclude that the sequences of functions $(\varphi_{i,n})_n$, $i = 1, 2$ in (10) are almost additive. It follows readily from a result of Derriennic in [10] (see Lemma 11 below) together with Proposition 4 that there always exist vectors $\alpha \in \mathbb{R}^2$ for which the set $E(\alpha)$ in (11) is nonempty.
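The two one-sided bounds in Proposition 4 that hold without any cone condition, together with the determinant identity used for the second statement, can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the helper names and the two positive matrices `M` and `N` are hypothetical, and the closed formula for the singular values is the standard one for invertible 2 × 2 matrices.

```python
import math


def singular_values(M):
    """Singular values s1 >= s2 of an invertible 2x2 matrix M = [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    t = a * a + b * b + c * c + d * d          # trace of M^T M
    det = a * d - b * c
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    s1 = math.sqrt((t + disc) / 2.0)           # largest singular value = operator norm
    s2 = abs(det) / s1                         # uses |det M| = s1 * s2
    return s1, s2


def matmul(M, N):
    """Product of two 2x2 matrices."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def multiplicativity_ratios(M, N):
    """Return (s1(MN) / (s1(M) s1(N)), s2(MN) / (s2(M) s2(N)))."""
    s1m, s2m = singular_values(M)
    s1n, s2n = singular_values(N)
    s1p, s2p = singular_values(matmul(M, N))
    return s1p / (s1m * s1n), s2p / (s2m * s2n)


# Hypothetical positive matrices, as in Example 1.
M = [[2.0, 1.0], [1.0, 1.0]]
N = [[1.0, 2.0], [1.0, 3.0]]
r1, r2 = multiplicativity_ratios(M, N)
```

The first ratio is at most 1 and the second at least 1, and their product equals 1 because the determinant is multiplicative. The content of Proposition 4 is that under the cone condition the first ratio is also bounded below by a uniform constant, which this sketch does not prove.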
5. Construction of the Measures With the same setting as in Theorem 2, we construct in this section certain probability measures on the repeller. These measures are crucial for our study of multifractal analysis. The measures are ergodic, although a priori not necessarily invariant. We set
n (i 1 · · · i n , d) = max{expd, ϕ n (y) : y ∈ i1 ···in } = max{σ1 (d y f n )d1 σ2 (d y f n )d2 : y ∈ i1 ···in }, with the convention that n (i 1 · · · i n , d) = 0 if i1 ···in = ∅. In the following lemmas we use the almost multiplicativity of the singular values established in Proposition 4 to obtain the almost multiplicativity of the functions n . We also set
n (d) =
n (i 1 · · · i n , d). i 1 ···i n
Note that since f | is topologically mixing, there exists k ∈ N such that Ak has only positive entries. This ensures that given sequences (i 1 · · · i n ) ∈ A,n and ( j1 · · · jl ) ∈ A,l there exists ( p1 · · · pk ) ∈ A,k such that (i 1 · · · i n p1 · · · pk j1 · · · jl ) ∈ A,n+k+l . Lemma 1. There exists K 1 > 0 such that for every d ∈ R2 and l > k we have − d
K1
d
l−k (d) ≤ l (d) ≤ p k K 1 l−k (d).
Proof. Given y ∈ i1 ···il such that expd, ϕ l (y) = l (i 1 · · · il , d), it follows from Proposition 4 that
l (i 1 · · · il , d) ≤ C d expd, ϕ k (y) expd, ϕ l−k ( f k (y)) ≤ C d C kd l−k ( j1 · · · jl−k , d),
(31)
for some constant C > 0. Furthermore, for each x ∈ i1 ···il we have
l (i 1 · · · il , d) ≥ expd, ϕ l (x) ≥ C −d expd, ϕ k (x) expd, ϕ l−k ( f k (x)).
(32)
If, in addition, x satisfies expd, ϕ l−k ( f k (x)) = l−k ( j1 · · · jl−k , d), then it follows from (31) and (32) that C d C kd l−k ( j1 · · · jl−k , d) ≥ l (i 1 · · · il , d) ≥ C −d C −kd l−k ( j1 · · · jl−k , d), by eventually enlarging the constant C . On theother hand, since Ak > 0, for each ( j1 · · · jl−k ) ∈ A,l−k there exists (n 1 · · · n k ) ∈ A,k such that (n 1 · · · n k j1 · · · jl−k ) ∈ A,l . Hence, p k C d C kd
l−k ( j1 · · · jl−k , d) ≥
j1 ··· jl−k
l (i 1 · · · il , d) ≥ C −d C −kd
i 1 ···il
This completes the proof of the lemma.
j1 ··· jl−k
l−k ( j1 · · · jl−k , d).
Lemma 2. There exists K 2 > 0 such that for every d ∈ R2 , n ∈ N, (i 1 · · · i n ) ∈ A,n , and l > k we have
n+l (i 1 · · · i n j1 · · · jl , d) ≤ C d n (i 1 · · · i n , d) l (d) j1 ··· jl
and
− d
n+l (i 1 · · · i n j1 · · · jl , d) ≥ n (i 1 · · · i n , d) l (d)e−nρn d p −k K 2
.
j1 ··· jl
Proof. Fix (i 1 · · · i n ) ∈ A,n and choose a sequence ( j1 · · · jl ) ∈ A,l such that (i 1 · · · i n j1 · · · jl ) ∈ A,n+l . Given y ∈ i1 ···il such that expd, ϕ l (y) = l (i 1 · · · il , d), it follows from Proposition 4 that
n+l (i 1 · · · i n j1 · · · jl , d) ≤ C d n (i 1 · · · i n , d) l ( j1 · · · jl , d). Thus,
(33)
n+l (i 1 · · · i n j1 · · · jl , d) ≤ C d n (i 1 · · · i n , d) l (d).
j1 ··· jl
This establishes the first inequality in the lemma. Note now that for any ( j1 · · · jl−k ) ∈ A,l−k there exists (m 1 · · · m k ) ∈ A,k such that (i 1 · · · i n m 1 · · · m k j1 · · · jl−k ) ∈ A,n+l . Hence, for each x ∈ i1 ···in m 1 ···m k j1 ··· jl−k we have
n+l (i 1 · · · i n k1 · · · kk j1 · · · jl−k , d) ≥ C −2d expd, ϕ n (x) expd, ϕ k ( f n (x)) expd, ϕ l−k ( f n+k (x)).
(34)
If, in addition, x satisfies expd, ϕ l−k ( f n+k (x)) = l−k ( j1 · · · jl−k , d), then it follows from (34) and Proposition 1 that
n+l (i 1 · · · i n m 1 · · · m k j1 · · · jl−k , d) ≥ C −2d C −kd expd, ϕ n (x) expd, ϕ l−k ( f n+k (x)) ≥ C −2d C −kd n (i 1 · · · i n , d)e−nρn d l−k ( j1 · · · jl−k , d),
(35)
C
> 0. Summing over all admissible sequences in A,l and using for some constant Lemma 1 we obtain
n+l (i 1 · · · i n t1 · · · tl , d) t1 ···tl
≥
n+l (i 1 · · · i n m 1 · · · m k j1 · · · jl−k , d)
j1 ··· jl−k
≥ C −2d C −kd n (i 1 · · · i n , d)e−nρn d l−k (d) − d
≥ C −2d C −kd n (i 1 · · · i n , d)e−nρn d p −k K 1 This gives the second inequality in the lemma.
l (d).
(36)
The following is an immediate consequence of Lemma 2. Lemma 3. For every d ∈ R2 , n ∈ N, and l > k we have − d
l (d) n (d)e−nρn d p −k K 2
≤ l+n (d) ≤ C d l (d) n (d).
To establish the almost additivity of the function n → log n (d) we use the following statement. Lemma 4. Let (ξn )n be a sequence of real numbers satisfying ξn+m ≥ ξn ξm e−nρn for 1/n every n, m ∈ N, with ρn → 0 as n → ∞. Then there exists the limit limn→∞ ξn , and 1/n
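$$\lim_{n\to\infty} \xi_n^{1/n} \ge \sup_{n \ge 1} \bigl( \xi_n^{1/n} e^{-\rho_n} \bigr).$$

Proof. Each $q \in \mathbb{N}$ can be written in the form $q = kp + l$ with $k \in \mathbb{N}$ and $0 \le l < p$. By assumption, $\xi_q \ge \xi_p^k \xi_l e^{-kp\rho_p}$. Since $kp \le q$ we have

$$\xi_q^{1/q} \ge \bigl( \xi_p^{1/p} e^{-\rho_p} \bigr)^{kp/q} \min_{0 \le l < p} \xi_l^{1/q}.$$

Letting $q \to \infty$, with $p$ fixed, we obtain $kp/q \to 1$ and hence

$$\liminf_{q\to\infty} \xi_q^{1/q} \ge \xi_p^{1/p} e^{-\rho_p}.$$

This yields

$$\liminf_{q\to\infty} \xi_q^{1/q} \ge \sup_{p \ge 0} \xi_p^{1/p} e^{-\rho_p} \ge \limsup_{p\to\infty} \xi_p^{1/p} e^{-\rho_p} = \limsup_{p\to\infty} \xi_p^{1/p},$$

and the proof is complete.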
Lemma 5. For every d ∈ R2 and n ∈ N we have − d −k
e−nρn d K 2
p
n (d) ≤ exp[n P(d, )] ≤ C d n (d).
Proof. By Lemma 3 we have n+l (d) ≤ C d n (d) l (d). Hence, a classical result on submultiplicative sequences implies that there exists the limit lim
n→∞
1 1 1 log n (d) = lim log(C d n (d)) = inf log(C d n (d)). n→∞ n≥1 n n n
(37)
Recall now that (i 1 · · · i n , d) = max x∈i1 ···in expd, ϕ(x). By Proposition 3 we obtain 1 1 log
n (i 1 · · · i n , d) = lim log n (d), n→∞ n n→∞ n
P(d, ) = lim
(38)
i 1 ···i n
and by (37) it follows that exp[n P(d, )] ≤ C d n (d) for each n ∈ N. Again from Lemma 3 we have that ρn ,
l+n (d) ≥ l (d) n (d) p −k e−n
where ρ n = d(ρn +
1 n
log K 2 ), and by Lemma 4,
ρn exp P(d, ) = lim n (d)1/n ≥ sup[( n (d) p −k )1/n e− ]. n→∞
Hence, exp[n P(d, )] ≥ n
ρn (d) p −k e−n
n≥1
for each n ∈ N.
We now define a probability measure νn,d on the algebra Bn generated by the sets i1 ···in by νn,d (i1 ···in ) =
n (i 1 · · · i n , d)
n (d)
for each set i1 ···in , and we extend it arbitrarily to the Borel σ -algebra of . Since is compact, the family of Borel probability measures on is compact in the weak∗ topology and hence, there exists a subsequence (νn k ,d ) converging to some probability measure νd in the weak∗ topology. We emphasize that the limit νd may not be unique. 6. Properties of the Measures We now present several properties of the measures νd constructed in Sect. 5. We continue to consider the same setting as in Theorem 2. Our first result shows that each νd possesses a weak Gibbs property. However, the measures νd may not be invariant and thus, in principle, this statement cannot be obtained directly from the thermodynamic formalism by defining instead each νd as an equilibrium measure. Lemma 6. There exists K > 0 such that for every d ∈ R2 , n ∈ N, and (i 1 · · · i n ) ∈ A,n we have p −2k (K e2nρn )−d ≤
νd (i1 ···in ) ≤ p k (K enρn )d . exp[−n P(d, )] n (i 1 · · · i n , d)
Proof. Take k ∈ N such that Ak > 0, and integers n ∈ N and l > n + k. Since νl,d is a measure on the algebra generated by the sets i1 ···il we have νl,d (i1 ···in ) =
l (i 1 · · · i n j1 · · · jl−n , d) ,
l (d)
j1 ··· jl−n
where the sum is taken over all admissible sequences ( j1 · · · jl−n ) such that (i 1 · · · i n j1 · · · jl−n ) ∈ A,l . By Lemmas 2, 3, and 5 we obtain νl,d (i1 ···in ) ≤ n (i 1 · · · i n , d) l−n (d) l (d)−1 C d ≤ n (i 1 · · · i n , d) l (d) n (d)−1 enρn d l (d)−1 p k (C K 2 )d ≤ n (i 1 · · · i n , d) exp[−n P(d, )]enρn d p k (C 2 K 2 )d . Analogously, we obtain − d
νl,d (i1 ···in ) ≥ n (i 1 · · · i n , d) l−n (d) l (d)−1 e−nρn d p −k K 2
≥ n (i 1 · · · i n , d) l (d) n (d)−1 l (d)−1 e−nρn d p −k (C K 2 )−d ≥ n (i 1 · · · i n , d) exp[−n P(d, )]e−2nρn d p −2k (C K 22 )−d . Taking the limit of the sequence νn k ,d as k → ∞, we obtain the desired inequalities. By Lemma 6 and Proposition 1 we obtain the following statement.
Lemma 7. For every d ∈ R2 , n ∈ N, and x ∈ i1 ···in we have p −2k (K e3nρn )−d ≤
νd (i1 ···in ) ≤ p k (K e2nρn )d . exp[−n P(d, ) + d, ϕ n (x)]
In particular, for a point x = χ (i 1 i 2 · · · ) ∈ E(α) we have 1 log νd (i1 ···in ) = d, α n→∞ n
P(d, ) + lim
(39)
(i.e., there exists the limit in (39), and the identity in (39) holds). We emphasize that the measure νd may not be invariant and thus, in principle, the existence of the limit in (39) does not follow from the Shannon–McMillan–Breiman theorem. For the proof of the ergodicity of νd we need the following statement. Lemma 8. There exists a constant K 3 > 0 such that for every d ∈ R2 , n, l ∈ N, (i 1 · · · i n ) ∈ A,n , ( j1 · · · jl ) ∈ A,l , and m > n + l + 2k we have
m (i 1 · · · i n k1 · · · km−n−l j1 · · · jl , d)
k1 ···km−n−l − d
≥ n (i 1 · · · i n , d) l ( j1 · · · jl , d) m−n−l (d)e(−nρn −lρl )d p −2k K 3
.
Proof. Since Ak > 0, for any finite sequence Lˆ ∈ A,m−n−l−2k there exist L 1 , L 2 ∈ A,k such that (i 1 · · · i n L 1 Lˆ L 2 j1 · · · jl ) ∈ A,m . Arguing in a similar manner to that in (35) and (36), we obtain
m (i 1 · · · i n k1 · · · km−n−l j1 · · · jl , d)
k1 ···km−n−l
≥
n (i 1 · · · i n , d) l ( j1 · · · jl , d)e(−nρn −lρl )d
Lˆ
ˆ d)C −4d C −2kd × m−n−l−2k ( L, = n (i 1 · · · i n , d) l ( j1 · · · jl , d)e(−nρn −lρl )d m−n−l−2k (d)C −4d C −2kd . The desired inequality follows by applying Lemma 1.
We can now establish the ergodicity of the measures. Lemma 9. Each measure νd is ergodic. Proof. Given sets i1 ···in and j1 ··· jl , for any m > n + 2k we have νd (i1 ···in ∩ f −m ( j1 ··· jl )) =
k1 ···km−n
νd (i1 ···in k1 ···km−n j1 ··· jl ).
By Lemmas 6, 8, and 5 we obtain νd (i1 ···in ∩ f −m ( j1 ··· jl )) ≥
m+l (i 1 · · · i n k1 · · · km−n j1 · · · jl , d) k1 ···km−n
× p −2k exp[−(m + l)(P(d, ) + 2ρm+l d)]K −d ≥ exp[−(m + l)(P(d, ) + 2ρm+l d)] × n (i 1 · · · i n , d) l ( j1 · · · jl , d) m−n (d)e(−nρn −lρl )d p −4k (K K 3 )−d ≥ exp[−(n + l)P(d, ) − 2(m + l)ρm+l d] × n (i 1 · · · i n , d) l ( j1 · · · jl , d)e(−nρn −lρl )d p −4k (C K K 3 )−d . Again from Lemma 6, νd (i1 ···in ∩ f −m ( j1 ··· jl )) ≥ νd (i1 ···in )νd ( j1 ··· jl )e−2(nρn +lρl +(m+l)ρm+l )d p −6k (C K 3 K 3 )−d .
(40)
Given Borel sets A, B ⊂ we write them as disjoint unions up to zero measure sets with respect to the measure νd , i.e., A=
∞
ai
(mod 0) and B =
i=1
For each m ∈ N we have
∞
b j
(mod 0).
j=1
νd ( f −m (A) ∩ B) = νd
∞
f −m (ai ) ∩
i=1
= νd
∞
i, j=1
∞ j=1
b j
f −m (ai ) ∩ b j =
∞
νd ( f −m (ai ) ∩ b j ).
i, j=1
If νd (A) > 0 and νd (B) > 0, we take finite sequences ai ∈ A,li and b j ∈ A,l j for which νd (ai ) > 0 and νd (b j ) > 0. For any integer m > li + 2k, it follows from (40) that νd ( f −m ai ∩ b j ) ≥ νd (ai )νd (b j )D > 0, for some constant D = D(li , l j , m) > 0, and thus νd ( f −m (A) ∩ B) > 0. Take now B = \ A with A f -invariant. If 0 < νd (A) < 1 then 0 = νd (A ∩ ( \ A)) = νd ( f −n (A) ∩ ( \ A)) for every n ∈ N, while we have shown above that νd ( f −m (A) ∩ ( \ A)) > 0. Hence, either νd (A) = 0 or νd (A) = 1, and νd is ergodic. > 0 and for Lemma 10. If f has bounded distortion on , then there exist a constant K each d ∈ R2 a unique ergodic f -invariant probability measure µd among the measures νd , such that for every n ∈ N and (i 1 · · · i n ) ∈ A,n we have −(1+d ) ≤ K
µd (i1 ···in ) 1+d . ≤K exp[−n P(d, ) + d, ϕ n (x)]
(41)
>0 Proof. By Lemma 7 and the bounded distortion property there exists a constant K such that for every d ∈ R2 , n ∈ N, and x ∈ i1 ···in we have −(1+d ) ≤ K
νd (i1 ···in ) 1+d . ≤K exp[−n P(d, ) + d, ϕ n (x)]
n−1 We now consider the sequence ( n1 l=0 νd ◦ f −l )n . Any limit point µd of this sequence in the weak∗ topology is an f -invariant measure concentrated on and satisfying (41). In fact, applying successively Lemma 6, (33) in the proof of Lemma 2, Lemma 5, and again Lemma 6 we obtain νd ( f −l i1 ···in ) =
νd ( j1 ··· jl i1 ···in )
j1 ··· jl
≤ c1
exp[−(l + n)P(d, )] l+n ( j1 · · · jl i 1 · · · i n , d)
j1 ··· jl
≤ c2
exp[−(l + n)P(d, )] l ( j1 · · · jl , d) n (i 1 · · · i n , d)
j1 ··· jl
= c2 exp[−(l + n)P(d, )] l (d) n (i 1 · · · i n , d) ≤ c3 exp[−n P(d, )] n (i 1 · · · i n , d) ≤ c4 νd (i1 ···in ), for some constants c1 , c2 , c3 , and c4 > 0. Analogously, applying now successively Lemma 6, (34)–(35) in the proof of Lemma 2, Lemma 5, and again Lemma 6 we obtain νd ( f −l i1 ···in ) ≥ c5
exp[−(l + n)P(d, )] l+n ( j1 · · · jl i 1 · · · i n , d)
j1 ··· jl
≥ c6
exp[−(l + n)P(d, )] l ( j1 · · · jl , d) n (i 1 · · · i n , d)
j1 ··· jl
≥ c7 exp[−n P(d, )] n (i 1 · · · i n , d) ≥ c8 νd (i1 ···in ), for some constants c5 , c6 , c7 , and c8 > 0. Therefore, c8 νd (i1 ···in ) ≤
n−1 1 νd ( f −l i1 ···in ) ≤ c4 νd (i1 ···in ) n
(42)
l=0
for every n ∈ N, and thus any limit point µd of the above sequence of measures satisfies (41). We could now proceed in a similar manner to that in Lemma 9 to show that µd is ergodic. Alternatively, by (42) we have c8 νd (i1 ···in ) ≤ µd (i1 ···in ) ≤ c4 νd (i1 ···in )
(43)
for every n ∈ N and (i 1 · · · i n ) ∈ A,n . By Lemma 9 the measure νd is ergodic and hence, by (43), the measure µd is also ergodic. The uniqueness of µd follows from its ergodicity together with the fact that by (41) any two such measures must be equivalent.
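7. Topological Pressure and Entropy

We assume everywhere in this section that $f$ has bounded distortion on $\Lambda$. Denote by $\mathcal{M}$ the family of $f$-invariant Borel probability measures on $\Lambda$. We have the following variational principle for the topological pressure.

Proposition 5. For every $d \in \mathbb{R}^2$ we have

$$P(\langle d, \Phi \rangle) = \max_{\mu \in \mathcal{M}} \Bigl\{ h_\mu(f) + \lim_{n\to\infty} \frac{1}{n} \int \langle d, \varphi_n \rangle \, d\mu \Bigr\},$$

and

$$P(\langle d, \Phi \rangle) = h_{\mu_d}(f) + \lim_{n\to\infty} \frac{1}{n} \int \langle d, \varphi_n \rangle \, d\mu_d. \tag{44}$$

Proof. Given numbers $a_1, \ldots, a_k$ and $p_1, \ldots, p_k \ge 0$ satisfying $\sum_{i=1}^{k} p_i = 1$ we have

$$0 \ge \sum_{i=1}^{k} p_i (a_i - \log p_i) - \log \sum_{i=1}^{k} e^{a_i}$$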
(see for example [21, Lemma 9.9]). Thus, we obtain from the definition of n (d) that for every Borel probability measure µ on , 0≥ µ(i1 ···in )(log n (i 1 · · · i n , d) − log µ(i1 ···in ) − log n (d)) i 1 ···i n
= Hµ (ξn ) +
µ(i1 ···in )(log n (i 1 · · · i n , d) − log n (d)).
(45)
i 1 ···i n
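The elementary inequality behind (45), cited from [21, Lemma 9.9], can be checked numerically. The following Python sketch is a hedged illustration with hypothetical names and test values: it evaluates the gap $\sum_i p_i (a_i - \log p_i) - \log \sum_i e^{a_i}$ and confirms that it vanishes when $p_i$ is proportional to $e^{a_i}$, which is the equality case exploited when $\mu = \mu_d$.

```python
import math


def gibbs_gap(p, a):
    """sum_i p_i (a_i - log p_i) - log sum_i exp(a_i).
    Terms with p_i == 0 contribute 0 (convention 0 * log 0 = 0)."""
    s = sum(pi * (ai - math.log(pi)) for pi, ai in zip(p, a) if pi > 0)
    m = max(a)
    log_z = m + math.log(sum(math.exp(ai - m) for ai in a))  # stable log-sum-exp
    return s - log_z


a = [0.3, -1.2, 2.0]                         # hypothetical values a_i
uniform = [1.0 / 3.0] * 3
z = sum(math.exp(ai) for ai in a)
gibbs = [math.exp(ai) / z for ai in a]       # equality case p_i = e^{a_i} / Z

gap_uniform = gibbs_gap(uniform, a)
gap_gibbs = gibbs_gap(gibbs, a)
```

For any probability vector the gap is nonpositive, and it is zero exactly for the Gibbs weights.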
We need the following special case of the extension of Kingman’s sub-additive ergodic theorem given by Derriennic in [10] (in fact this particular statement follows readily from the subadditive ergodic theorem, since by (23) the sequence ψn = ϕn +C is subadditive). Lemma 11. Let f be a continuous map of a compact metric space . If (ϕn )n is an almost additive sequence of continuous functions on satisfying inf n∈N (ϕn /n) > −∞, then for any f -invariant probability measure µ on , there exists a µ-measurable f -invariant function ϕ : → R such that ϕn /n → ϕ as n → ∞ µ-almost everywhere and in L 1 (µ). By Lemma 11 there is an f -invariant µ-measurable function ϕ such that ϕ n /n → ϕ as n → ∞ µ-almost everywhere and in L 1 (µ). By the dominated convergence theorem we obtain 1 1 d, ϕ n dµ = lim d, ϕ n dµ = d, ϕ dµ. lim n→∞ n n→∞ n Hence, dividing in (45) by n and taking the limit as n → ∞ we have 1 lim d, ϕ n dµ. P(d, ) ≥ h µ ( f ) + n→∞ n
Multifractal Analysis for Lyapunov Exponents
When µ = µd it follows from Lemmas 5 and 6 that
n (i 1 · · · i n , d) ≥ µd (i1 ···in ) p −2k (K K 2 )−d e−2nρn d .
n (d) Thus,
0 ≥ Hµd (ξn ) +
µd (i1 ···in ) log n (i 1 · · · i n , d) − log n (d)
i 1 ···i n
≥ Hµd (ξn ) − 2k log p − d log(K K 2 ) − 2nρn d + µd (i1 ···in ) log µd (i1 ···in ) i 1 ···i n
= −2k log p − d log(K K 2 ) − 2nρn d.
(46)
By Proposition 1, for every x ∈ i1 ···in we have |log n (i 1 · · · i n , d) − d, ϕ n (x)| ≤ nρn d. Hence, dividing in (46) by n and taking the limit as n → ∞ we derive (see (38)) 1 0 = h µd ( f ) + lim d, ϕ n dµd − P(d, ). n→∞ n This completes the proof of the lemma.
Proposition 6. The function d → P(d, ) is convex. Proof. By Hölder’s inequality, given sequences = (ϕn )n and = (ψn )n of continuous functions ϕn , ψn : → R and t ∈ (0, 1) we have exp t sup ϕn (x) + (1 − t) sup ψn (x) i 1 ···i n
x∈i 1 ···i n
≤
exp
sup x∈i 1 ···i n
i 1 ···i n
t ϕn (x)
x∈i 1 ···i n
exp
i 1 ···i n
1−t sup
ψn (x)
.
sup
ψn (y),
x∈i 1 ···i n
Thus, log
exp
(tϕn (x) + (1 − t)ψn (x))
sup x∈i 1 ···i n
i 1 ···i n
≤ t log
i 1 ···i n
exp
sup x∈i 1 ···i n
ϕn (x) + (1 − t) log
i 1 ···i n
exp
y∈i 1 ···i n
which implies that P(t + (1 − t)) ≤ t P( ) + (1 − t)P(). The convexity follows by setting = d , ϕ n and = d, ϕ n .
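The convexity step rests on the fact that a log-sum-exp expression is convex, which is what Hölder's inequality gives. The sketch below tests this on random finite lists; the lists `phi` and `psi` are stand-ins (an assumption made for illustration) for the suprema of $\varphi_n$ and $\psi_n$ over the cylinder sets, and the name `log_sum_exp` is hypothetical.

```python
import math
import random

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

rng = random.Random(1)
worst_gap = float("-inf")
for _ in range(500):
    k = rng.randint(2, 8)
    phi = [rng.uniform(-5.0, 5.0) for _ in range(k)]
    psi = [rng.uniform(-5.0, 5.0) for _ in range(k)]
    t = rng.random()
    mixed = [t * a + (1.0 - t) * b for a, b in zip(phi, psi)]
    # Hoelder: log sum exp(t*phi + (1-t)*psi) <= t*lse(phi) + (1-t)*lse(psi)
    gap = log_sum_exp(mixed) - (t * log_sum_exp(phi) + (1.0 - t) * log_sum_exp(psi))
    worst_gap = max(worst_gap, gap)
```

A non-positive `worst_gap` (up to rounding) is the finite-dimensional inequality behind the convexity of the pressure; dividing by n and passing to the limit is not simulated here.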
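Proposition 7. For every $\alpha = \nabla P(\langle d, \Phi\rangle) \in \mathbb{R}^2$, the measure $\mu_d$ is concentrated on $E(\alpha)$, i.e., $\mu_d(E(\alpha)) = 1$, and
$$P(\langle d, \Phi\rangle) = h_{\mu_d}(f) + \langle d, \alpha\rangle. \tag{47}$$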
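Proof. By Proposition 5, for every $v \in \mathbb{R}^2$ we have
$$P(\langle d + v, \Phi\rangle) \ge h_{\mu_d}(f) + \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \langle d + v, \varphi_n\rangle \, d\mu_d,$$
and using (44) we obtain
$$P(\langle d + v, \Phi\rangle) - P(\langle d, \Phi\rangle) \ge \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \langle d + v, \varphi_n\rangle \, d\mu_d - \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \langle d, \varphi_n\rangle \, d\mu_d = \Big\langle v, \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu_d \Big\rangle.$$
This shows that
$$q = \lim_{n\to\infty} \frac{1}{n} \int_\Lambda \varphi_n \, d\mu_d \tag{48}$$
is a subgradient of the function $d \mapsto P(\langle d, \Phi\rangle)$ at d. By Proposition 6 this function is convex, and since $d \mapsto P(\langle d, \Phi\rangle)$ is differentiable at d, the derivative $\alpha = \nabla P(\langle d, \Phi\rangle)$ is the unique subgradient at this point (see for example [16]). In particular,
$$\alpha = \nabla P(\langle d, \Phi\rangle) = q. \tag{49}$$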
By Lemma 11, there is an f -invariant µd -measurable function ϕ : R2 → R such that ϕ n /n → ϕ as n → ∞ µd -almost everywhere. On the other hand, by Lemma 9, the measure µd is ergodic, and thus ϕ is constant µd -almost everywhere. Together with (48) and (49) this shows that µd is concentrated on the level set E(α). Finally, the identity in (47) follows readily from (44) in Proposition 5. 8. Proof of Proposition 3 We first recall the concept of nonadditive topological pressure following the approach of Barreira in [1]. Let U be a finite open cover of . Given n ∈ N we denote by Wn (U) the collection of strings U = U1 · · · Un with U1 , . . ., Un ∈ U. For each U ∈ Wn (U), we write m(U) = n, and we define the open set (U) = x ∈ : f k−1 x ∈ Uk for k = 1, . . . , n . We say that ⊂ n∈N Wn (U) covers the set Z ⊂ if U ∈ (U) ⊃ Z . Let now
= (ϕn )n be a sequence of continuous functions ϕn : → R. We define γn ( , U) = sup {|ϕn (x) − ϕn (y)| : x, y ∈ (U) for some U ∈ Wn (U)} . We always assume that 1 lim sup lim sup γn ( , U) = 0. diam U→0 n→∞ n
(50)
For each string U ∈ Wn (U) we write ϕ(U) = sup(U ) ϕn when (U) = ∅, and ϕ(U) = −∞ otherwise. For each set Z ⊂ we define M(Z , α, , U) = lim inf exp(−αm(U) + ϕ(U)), (51) n→∞
U ∈
where the infimum is taken over all ⊂ k≥n Wk (U) covering Z . The quantity in (51) jumps from +∞ to 0 at a unique critical value of α, and we can define PZ ( , U) = inf{α : M(Z , α, , U) = 0}. Moreover there exists the limit PZ ( ) =
lim
diam U→0
PZ ( , U).
The number PZ ( ) is called the nonadditive topological pressure of the sequence of functions on the set Z (with respect to f ). We note that Z need not be compact nor f -invariant. For each n ∈ N we also define the partition function Zn (Z , , U) = inf exp sup ϕn ,
U ∈
(U )
where the infimum is taken over all ⊂ Wn (U) covering Z . We say that the sequence = (ϕn )n is subadditive if ϕn+m ≤ ϕn + ϕm ◦ f n for every n, m ∈ N. The following is a combination of Theorem 1.6 and Proposition 1.10 in [1]. Proposition 8 ([1]). Assume that is subadditive with (ϕn − ϕn+1 )n uniformly bounded from above. If Z ⊂ is a compact f -invariant set, and U is a finite open cover of , then 1 PZ ( ) = lim log Zn (Z , , U). n→∞ n We can now establish Proposition 3. Proof of Proposition 3. Since = (ϕn )n is almost additive (see (23)), the sequence of functions ψn = ϕn +C is subadditive. We consider the open cover of by the rectangles R1 , . . . , R p of the Markov partition used to define the sets i1 ···in (we note that this is a partition by open sets with respect to the induced topology on ). In this case the partition function takes the form Zn (Z , , U) = exp max ϕn (x). i 1 ···i n
x∈i 1 ···i n
Furthermore, the condition (24) ensures that the sequence satisfies (50). In addition, again since is almost additive, we have −C + ϕ1 ◦ f n ≤ ϕn+1 − ϕn ≤ C + ϕ1 ◦ f n , and the continuity of ϕ1 on the compact set ensures that (ϕn − ϕn+1 )n is uniformly bounded from above. It now follows immediately from (13) and Proposition 8 that 1 P( ) = lim sup log exp max ϕn (x) = P ( ). (52) x∈i 1 ···i n n→∞ n i 1 ···i n
This completes the proof.
We also recall some properties of the nonadditive topological pressure that will be useful in Sect. 9. The topological entropy of f on the set $Z \subset \Lambda$ (see [9, 15]) can be shown to coincide with $h_{\mathrm{top}}(f|Z) = P_Z(\Psi)$, for the sequence $\Psi = (\psi_n)_n$ given by $\psi_n = 0$ for each n. We now write
$$\|\Phi\| = \limsup_{n\to\infty} \frac{1}{n} \sup_{x \in \Lambda} |\varphi_n(x)|.$$
Proposition 9 ([1]). The following properties hold:
1. $P_{Z_1}(\Phi) \le P_{Z_2}(\Phi)$ whenever $Z_1 \subset Z_2 \subset \Lambda$;
2. for a countable union $Z = \bigcup_{i \in I} Z_i$ we have $P_Z(\Phi) = \sup_{i \in I} P_{Z_i}(\Phi)$;
3. if $\varphi_n \le \psi_n$ for all sufficiently large n, then $P_Z(\Phi) \le P_Z(\Psi)$;
4. $|P_Z(\Phi) - P_Z(\Psi)| \le \|\Phi - \Psi\|$;
5. $h_{\mathrm{top}}(f|Z) = s$, where $s \ge 0$ is the unique root of $P_Z(-s\Psi) = 0$ with $\Psi = (\psi_n)_n$ given by $\psi_n = n$ for each n.
9. Proofs of the Main Results We first introduce the notion of local entropy for the symbolic dynamics. Consider the pullback µd = µd ◦ χ of the probability measure µd to the space A . We endow A with the distance d((i 1 i 2 · · · ), ( j1 j2 · · · )) =
∞
β −k |i k − jk |,
k=1
for some constant β > p. One can easily verify that (i 1 i 2 · · · ) is in the ball B(( j1 j2 · · · ), β −l ) if and only if i k = jk for every k = 1, . . . , k = l. Hence, µd (i1 ···il ) = µd (B((i 1 i 2 · · · ), β −l )). For a Borel probability measure µ on A the µ-local entropy of the shift map σ at the point ω = (i 1 i 2 · · · ) is defined by 1 h µ (σ, ω) = lim − log µ(i 1 · · · i n ), n→∞ n whenever the limit exists. We also need the following result of Schmeling concerning the preservation of the topological entropy under the coding map χ of the repeller. Proposition 10 ([18]). If is a repeller of a C 1+α map f , then h top (σ |Z ) = h top ( f |χ (Z )) for every set Z ⊂ A . We can now establish our main theorem. Proof of Theorem 1. By (39), for ω ∈ χ −1 (E(α)) we have h µd (σ, ω) = P(d, ) − d, α.
(53)
Let now α = ∇ P(d, ) be a gradient of the topological pressure. By Proposition 7 we have µd (E(α)) = 1 and thus µd (χ −1 (E(α))) = 1. In view of (53) this implies (see for example [15]) that h top (σ |χ −1 (E(α))) ≥ P(d, ) − d, α.
(54)
By Proposition 10 we have h top ( f |E(α)) = h top (σ |χ −1 (E(α))), and hence, h top ( f |E(α)) ≥ inf [P(d, ) − d, α]. d ∈R2
We now establish the reverse inequality. By Proposition 9, the number h = h top ( f |E(α)) is the unique root of the equation PE(α ) (−h) = 0, where = (ψn )n is given by ψn = n for each n. Given ε > 0 and τ ∈ N we define the set L ε,τ = {x ∈ : |ϕi,n (x) − αi ψn | ≤ εn for every n ≥ τ, i = 1, 2}.
(55)
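Notice that $L_{\varepsilon,\tau} \subset L_{\varepsilon,\tau'}$ if $\tau \le \tau'$, and that
$$E(\alpha) \subset \bigcap_{\varepsilon > 0} \bigcup_{\tau \in \mathbb{N}} L_{\varepsilon,\tau}.$$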
Using the bounded distortion property, it follows from the proof of Proposition 1 that there exists δ > 0 such that for every x ∈ , sup
y∈Bn (x,δ)
ϕ n (x) − ϕ n (y) ≤ εn.
(56)
Hence, if Bn (x, δ) ∩ L ε,τ = ∅, then by (56) and (55) we obtain |ϕi,n (y) − αi ψn | ≤ 2εn for every y ∈ Bn (x, δ). Again by Proposition 9 this implies that PL ε,τ (−h) ≤ PL ε,τ (d, (ϕ n − αψn )n − h) + 2εd, and hence 0 = PE(α ) (−h) ≤ Pτ ∈N L ε,τ (−h) = sup PL ε,τ (−h) τ ∈N
≤ P (d, (ϕ n − αψn )n − h) + 2εd. Since ε is arbitrary, we obtain
inf P (d, (ϕ n − αψn )n − h) = inf [P (d, ) − d, α] − h ≥ 0.
d ∈R2
d ∈R2
By (52) we have P = P , and together with (54) this completes the proof of the identity in (20). The properties in (21) and (22) are a simple rewriting respectively of the properties (47) in Proposition 7, and (41) in Lemma 10. This completes the proof of the theorem. We now consider the case of tempered distortion. Proof of Theorem 2. The statement is an immediate consequence of Lemmas 6 and 9. Acknowledgements. Luis Barreira would like to thank Claudia Valls for her precious help, and Benoît Saussol for several discussions related with the explicit construction of full measures (although with another purpose in mind and with different methods, a related construction was effected by the two in 1999 in unpublished joint work).
References
1. Barreira, L.: A non-additive thermodynamic formalism and applications to dimension theory of hyperbolic dynamical systems. Ergodic Theory Dynam. Systems 16, 871–927 (1996)
2. Barreira, L.: Hyperbolicity and recurrence in dynamical systems: a survey of recent results. Resenhas IME-USP 5, 171–230 (2002)
3. Barreira, L.: Dimension estimates in nonconformal hyperbolic dynamics. Nonlinearity 16, 1657–1672 (2003)
4. Barreira, L., Gelfert, K.: Dimension estimates in dynamical systems: a survey. In preparation
5. Barreira, L., Pesin, Ya.: Lyapunov Exponents and Smooth Ergodic Theory. Univ. Lect. Ser. 23, Providence, RI: Amer. Math. Soc., 2002
6. Barreira, L., Pesin, Ya., Schmeling, J.: On a general concept of multifractality: multifractal spectra for dimensions, entropies, and Lyapunov exponents. Multifractal rigidity. Chaos 7, 27–38 (1997)
7. Barreira, L., Radu, L.: Multifractal analysis of nonconformal repellers: a model case. Dyn. Syst. To appear.
8. Bothe, H.: The Hausdorff dimension of certain solenoids. Ergodic Theory Dynam. Systems 15, 449–474 (1995)
9. Bowen, R.: Topological entropy for noncompact sets. Trans. Amer. Math. Soc. 184, 125–136 (1973)
10. Derriennic, Y.: Un théorème ergodique presque sous-additif. Ann. Probab. 11, 669–677 (1983)
11. Falconer, K.: The Hausdorff dimension of self-affine fractals. Math. Proc. Cambridge Philos. Soc. 103, 339–350 (1988)
12. Falconer, K.: Bounded distortion and dimension for non-conformal repellers. Math. Proc. Cambridge Philos. Soc. 115, 315–334 (1994)
13. Feng, D., Lau, K.: The pressure function for products of non-negative matrices. Math. Res. Lett. 9, 363–378 (2002)
14. Hu, H.: Box dimensions and topological pressure for some expanding maps. Commun. Math. Phys. 191, 397–407 (1998)
15. Pesin, Ya.: Dimension Theory in Dynamical Systems: Contemporary Views and Applications. Chicago Lectures in Mathematics, Chicago: Chicago University Press, 1997
16. Rockafellar, R.: Convex Analysis. Princeton Math. Ser. 28, Princeton, NJ: Princeton Univ. Press, 1970
17. Ruelle, D.: Repellers for real analytic maps. Ergodic Theory Dynam. Systems 2, 99–107 (1982)
18. Schmeling, J.: Entropy preservation under Markov coding. J. Stat. Phys. 104, 799–815 (2001)
19. Simon, K.: The Hausdorff dimension of the Smale–Williams solenoid with different contraction coefficients. Proc. Amer. Math. Soc. 125, 1221–1228 (1997)
20. Simon, K., Solomyak, B.: Hausdorff dimension for horseshoes in $\mathbb{R}^3$. Ergodic Theory Dynam. Systems 19, 1343–1363 (1999)
21. Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Mathematics 79, Berlin-Heidelberg-New York: Springer, 1981
Communicated by A. Kupiainen
Commun. Math. Phys. 267, 419–449 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0060-y
Communications in
Mathematical Physics
Carleman Estimates and Absence of Embedded Eigenvalues
Herbert Koch1, Daniel Tataru2
1 Fachbereich Mathematik, Universität Dortmund, 44221 Dortmund, Germany
2 Department of Mathematics, University of California, Berkeley, CA 94720-3840, USA.
E-mail:
[email protected]
Received: 24 August 2005 / Accepted: 28 February 2006 Published online: 11 August 2006 – © Springer-Verlag 2006
The first author was partially supported by DFG grant KO1307/1 and also by MSRI for Fall 2005.
The second author was partially supported by NSF grants DMS0354539 and DMS 0301122 and also by MSRI for Fall 2005.
Abstract: Let $L = -\Delta - W$ be a Schrödinger operator with a potential $W \in L^{\frac{n+1}{2}}(\mathbb{R}^n)$, $n \ge 2$. We prove that there is no positive eigenvalue. The main tool is an $L^p - L^{p'}$ Carleman type estimate, which implies that eigenfunctions to positive eigenvalues must be compactly supported. The Carleman estimate builds on delicate dispersive estimates established in [7]. We also consider extensions of the result to variable coefficient operators with long range and short range potentials and gradient potentials.
1. Introduction
Let $n \ge 2$. Suppose W is a potential in $\mathbb{R}^n$ which decays at infinity. Then the Schrödinger operator $-\Delta_{\mathbb{R}^n} - W$ has continuous spectrum $[0, \infty)$. In addition its spectrum may contain eigenvalues which could be positive, negative or zero. Positive eigenvalues in the continuous spectrum are undesirable. They are very unstable since they are destroyed even by weak interactions between the continuous spectrum and the eigenvalue (see [9]). Physically they correspond to trapped states in the continuous spectrum, and they are difficult to handle analytically. Moreover, excluding eigenvalues in the continuous spectrum is often a first step toward scattering. There is an extensive theory dealing with the absence of positive eigenvalues. It is well known that under weak assumptions like
$$\lim_{|x| \to \infty} |x| \, |W(x)| = 0 \tag{1}$$
there are no positive eigenvalues. The argument uses $L^2$ Carleman estimates in three steps as follows. Suppose that $-\Delta u - Wu = u$ with $u \in L^2$, where the eigenvalue is normalized to 1 by scaling. Then one proves that:
(1) The eigenfunction u decays faster than polynomially at infinity.
(2) If u vanishes faster than polynomially at infinity then u has compact support.
(3) If u has compact support then it must vanish.
These arguments work for many Schrödinger operators. However they do not cover Schrödinger operators for several particles (which are studied in [2] and [1]), neither do the standard arguments apply to the absence of bound states (i.e. $L^2$ solutions) in nonlinear optics modeled by problems of the type $-\Delta u = \omega u + a(x)|u|^\sigma u$ with a bounded function a, because it is not clear how the assumption $u \in L^2(\mathbb{R}^n)$ is related to pointwise decay. On the other hand the assumption (1) on pointwise decay is sharp: There is the famous Wigner-Von Neumann example of a positive eigenvalue and a potential decaying like $1/|x|$ but not better, see [12, 8]. Motivated by the above questions and by other potential applications one seeks to replace the pointwise bound (1) by an $L^p$ bound. In terms of scaling any such bound must necessarily be weaker than (1) due to counterexamples by Jerison and Ionescu ([3]). Jerison and Ionescu [3] have recently obtained absence of embedded eigenvalues for $W \in L^{n/2}$. In this paper we obtain the same result for a larger class of potentials which includes
$$W \in L^{\frac{n+1}{2}}. \tag{2}$$
We note that a higher index is better since it allows for potentials with less decay at infinity. Another way to look at this is that such a condition is mostly relevant for the low frequency part of W. The counterexample of Jerison and Ionescu ([3]) shows that this is the highest possible exponent. Our method is robust enough so that it also allows us to add a long range potential, and also to replace the Laplacian with a (mildly) asymptotically flat second order elliptic operator. The latter generalization is more technical and less self-contained, so it is discussed only in the last section. Thus we consider potentials which are the sum of weakly decaying long range potentials V and short range potentials W. We even include the eigenvalue $\lambda > 0$ into the long range potential and study the problem
$$(-\Delta - V)u = Wu. \tag{3}$$
To describe the long range potential we define the space $C^2_{\langle x\rangle}$ by
Definition 1. $C^2_{\langle x\rangle}$ is the space of $C^2_{loc}$ functions for which the following norm is finite:
$$\|f\|_{C^2_{\langle x\rangle}} := \max\{\sup |f(x)|, \ \sup \langle x\rangle |Df|, \ \sup \langle x\rangle^2 |D^2 f|\}.$$
Then we introduce the condition
Assumption A 1 (The long range potential). V belongs to $C^2_{\langle x\rangle}$ and satisfies
$$\liminf_{|x| \to \infty} V > 0, \qquad \tau_0 := -\liminf_{|x| \to \infty} \frac{x \cdot \nabla V}{4V} < 1/2. \tag{4}$$
The bound from below on V corresponds to the condition $\lambda > 0$ while the last bound in (4) says that for large $|x|$ the function $|x|^2$ is strictly convex along the null Hamilton flow for $-\Delta - V$, and thus guarantees nontrapping outside a compact set. To describe the short range potential we define the space X:
Definition 2. X is the space of $W^{-\frac{1}{n+1}, \frac{2(n+1)}{n+3}}_{loc}$ functions for which the following norm is finite:
$$\|W\|_X = \sup_{u \in C_0^\infty} \|Wu\|_{W^{-\frac{1}{n+1}, \frac{2(n+1)}{n+3}}} \Big/ \|u\|_{W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}}, \qquad n \ge 3,$$
$$\|W\|_X = \sup_{u \in C_0^\infty} \|Wu\|_{W^{-\frac{1}{3}+\varepsilon, \frac{6}{5}}} \Big/ \|u\|_{W^{\frac{1}{3}-\varepsilon, 6}}, \qquad n = 2, \ \varepsilon > 0.$$
For a domain $D \subset \mathbb{R}^n$ we denote $X(D) = \{1_D W; \ W \in X\}$. Then we introduce
Assumption A 2 (The short range potential). W belongs to $X_{loc}$ and can be decomposed as $W = W_1 + W_2$ where
$$\limsup_{j \to \infty} \|W_1\|_{X(\{x \,|\, 2^{j-1} \le |x| \le 2^{j+1}\})} < \delta, \tag{5}$$
$$\limsup_{|x| \to \infty} |x| \, |W_2(x)| < \delta. \tag{6}$$
The $W_2$ component corresponds to the $L^2$ Carleman estimates. The class of allowed $W_1$ potentials includes $L^{\frac{n}{2}}$ and $L^{\frac{n+1}{2}}$, or even better$^1$ $l^{\frac{n+1}{2}}(L^{\frac{n}{2}})$, where the $l^{\frac{n+1}{2}}$ norm is taken with respect to a partition of $\mathbb{R}^n$ into unit cubes.
$^1$ $l^{\frac{3}{2}} L^{1+\varepsilon}$ if $n = 2$.
Our main result is
Theorem 3. Assume that V and W satisfy Assumptions A1 and A2, let $\tau_1 > \tau_0$ and assume that $\delta$ is sufficiently small. Let $u \in H^1_{loc}(\mathbb{R}^n)$ satisfy (3) and $|x|^{\tau_1 - \frac{1}{2}} u \in L^2$. Then $u \equiv 0$.
By comparison, the result of Jerison and Ionescu [3] applies to the case $V = 1$ and $W \in L^{\frac{n}{2}}$, $n \ge 3$. We note that the exponent $p = n/2$ is critical for weak unique continuation; for smaller exponents there are examples of compactly supported eigenfunctions, see [6]. The conditions (5) and (6) have a different scaling behavior. Nevertheless both are sharp, which can be seen by the Wigner-Von Neumann example and the non-radial counter-example of Jerison and Ionescu. The proof uses Carleman estimates, following the same three steps indicated above. A combined $L^2$-$L^p$ Carleman inequality replaces the previous $L^2$ Carleman inequalities. Proving such inequalities is a highly nontrivial task and relies on the bounds
established in [7]. Conjugation of the operator $-\Delta - V$ with the weight of the Carleman inequality leads to a non-selfadjoint partial differential equation. A pseudo-convexity type condition is satisfied, but it degenerates for large x. This is related to the fact that the anti-selfadjoint part of the conjugated operator decays for large x in relevant coordinates. Compared to earlier work and to the steps outlined above, we also consider a different family of weights in the Carleman estimates. Precisely, we begin with weights of the form $h(x) = e^{\tau \sqrt{|x|}}$ for part 2 of the argument, which we then flatten at infinity for part 1. This yields a more robust argument, and also better results in the variable coefficient case.
The paper is organized as follows. In the next section we state all the $L^p$ Carleman estimates and show how they lead to the result on the absence of the embedded eigenvalues. There are two main ingredients in the proof of the $L^p$ Carleman estimates. The first is the $L^2$ Carleman estimates, which are proved in Sect. 3. The second is a dispersive estimate for second order operators which is obtained in Sect. 4 using an earlier result of the authors, namely Theorem 3 of [7]. This is of independent interest so we state it in more generality than needed here. The $L^p$ estimates are proved in Sect. 5. The $L^2$ bounds obtained earlier are used to localize the $L^p$ bounds to small spatial scales. Then we can rescale to a setting where the general dispersive estimates of Theorem 7 apply. Finally, in the last two sections we discuss the extension of the results to second order elliptic operators with variable but asymptotically flat coefficients as well as unbounded gradient potentials. This goes along the same lines.
2. Carleman Estimates and Embedded Eigenvalues
As explained above the proof depends on Carleman inequalities. In this section we explain the Carleman inequalities and their application whereas most of the proofs are postponed to the remaining sections.
Let $1 \le p \le \infty$ and $s \in \mathbb{R}$. We define the Sobolev space $W^{s,p}(\mathbb{R}^n)$ by the norm
$$\|f\|_{W^{s,p}} = \|(1 + |D|^2)^{s/2} f\|_{L^p}$$
and $W^{s,p}(U)$ for open subsets U of $\mathbb{R}^n$ through its norm which is the infimum of the norm of extensions. Given a temperate distribution f and the Sobolev space $W^{s,q}$ we define the norm
$$\|f\|_{l^p W^{s,q}} = \Big( \sum_{j=1}^\infty \|f\|^p_{W^{s,q}(\{2^{j-1} \le |x| \le 2^{j+1}\})} \Big)^{1/p}$$
with the obvious modification for $p = \infty$. Our Carleman estimates have the form
$$\|e^{h(\ln(|x|))} v\|_{l^2 W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}} + \|e^{h(\ln(|x|))} \rho v\|_{L^2} \lesssim \inf_{f_1 + f_2 = (-\Delta - V)v} \Big( \|e^{h(\ln(|x|))} \rho^{-1} f_1\|_{L^2} + \|e^{h(\ln(|x|))} f_2\|_{l^2 W^{-\frac{1}{n+1}, \frac{2(n+1)}{n+3}}} \Big), \tag{7}$$
where $\rho$ is given by
$$\rho = \Big( \frac{h'(\ln(|x|))}{|x|^2} + \frac{h'(\ln(|x|))^2 \, h''_+(\ln(|x|))}{|x|^4} \Big)^{\frac{1}{4}} \tag{8}$$
with $h''_+$ denoting the positive part of $h''$. As a general rule, the function h is chosen to be
(a) increasing, $h' \ge \tau_0$, with $h'(0)$ large,
(b) slowly varying on the unit scale, $|h^{(j)}| \lesssim h'$ for $j = 2, 3, 4$,
(c) strictly convex for as long as $h'(\ln(|x|)) \gtrsim |x|$.
More precise choices are made later on for convenience, but the estimates are in effect true for all functions h satisfying the above conditions. The two terms in $\rho$ have different origins. The second one simply measures the effect of the convexity of the function h. The first one, on the other hand, is due to the presence of the long range potential, which provides some extra pseudoconvexity for large $|x|$. A simplifying assumption consistent with the choices of weights in this paper is to strengthen (c) to
(c)' $h''(\ln(|x|)) \approx h'(\ln(|x|))$ for as long as $h'(\ln(|x|)) \gtrsim |x|$.
This allows us to simplify the expression of $\rho$ to
$$\rho = \Big( \frac{h'(\ln(|x|))}{|x|^2} \Big( 1 + \frac{h'(\ln(|x|))^2}{|x|^2} \Big) \Big)^{\frac{1}{4}}. \tag{9}$$
Our Carleman estimates use weights which grow exponentially, but also allow for the possibility of leveling off the weight for large enough $|x|$. Precisely, for $\varepsilon > 0$ we consider the weight $h_\varepsilon$ defined by
$$h'_\varepsilon(t) = \tau_1 + (\tau e^{\frac{t}{2}} - \tau_1) \frac{\tau^2}{\tau^2 + \varepsilon e^t}. \tag{10}$$
Proposition 4. Suppose that V satisfies Assumption A1. Then (7) holds for all v supported in $|x| > 1$ and satisfying $|x|^{\tau_1 - \frac{1}{2}} v \in L^2$, uniformly with respect to $0 < \varepsilon \le \varepsilon_0$ and $\tau$ large enough.
The coefficient $\frac{1}{2}$ in the exponent is chosen somewhat arbitrarily. However, it must be smaller than 1 in order for stage (c) above to be reached. This is necessary if we are to be able to taper off the weight at infinity. We continue with a short discussion of the weight $h_\varepsilon$. For small t it is uniformly convex in the sense that $h''_\varepsilon \approx h'_\varepsilon$. The first interesting threshold for it is $t_0$ defined by $e^{t_0} \approx \tau^2$. This implies that $h'_\varepsilon(t_0) \approx e^{t_0}$. In the range $[0, t_0]$ the last factor in (10) is largely irrelevant, and $h_\varepsilon$ behaves like an exponential. In this region, the pseudoconvexity in the Carleman estimates is produced by the convexity of h. After $t_0$ $h_\varepsilon$ is still convex, roughly up to $t_1$ defined by $e^{t_1} \approx \varepsilon^{-1} \tau^2$. The region $t_1 + O(1)$ contains both the inflexion point $t_1$ and the maximum point for $h'_\varepsilon$. In between $t_0$ and $t_1$ the pseudo-convexity comes from the potential term, while the contribution from the convexity of $h_\varepsilon$ is still positive but smaller. Beyond $t_1 + O(1)$ the function $h'_\varepsilon(t) - \tau_1$ decays in an exponential fashion. The last interesting threshold is $t_2$, where $h'_\varepsilon$ approaches 1, given by $e^{t_2} \approx \varepsilon^{-2} \tau^6$. Between $t_1$ and $t_2$ there is still convexity coming from the potential V, which suffices in order to control the lack of convexity of $h_\varepsilon$. Finally, after $t_2$ the pseudoconvexity in the classical sense is lost, but there remains an Airy type gain to push the estimates through.
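The three thresholds in this discussion can be illustrated numerically. The sketch below assumes that the derivative of the weight is $h'_\varepsilon(t) = \tau_1 + (\tau e^{t/2} - \tau_1)\,\tau^2/(\tau^2 + \varepsilon e^t)$, which is how (10) reads here; the parameter values and the names `h_prime`, `ratio_at_t0`, `excess_at_t2` and `t_max` are arbitrary illustrative choices, not taken from the paper.

```python
import math

def h_prime(t, tau, tau1, eps):
    """Assumed form of (10): tau1 + (tau e^{t/2} - tau1) * tau^2 / (tau^2 + eps e^t)."""
    return tau1 + (tau * math.exp(t / 2.0) - tau1) * tau**2 / (tau**2 + eps * math.exp(t))

# Illustrative parameters only: tau large, tau1 below 1/2, eps small.
tau, tau1, eps = 50.0, 0.4, 0.01

t0 = math.log(tau**2)              # e^{t0} ~ tau^2
t1 = math.log(tau**2 / eps)        # e^{t1} ~ tau^2 / eps
t2 = math.log(tau**6 / eps**2)     # e^{t2} ~ tau^6 / eps^2

# At t0 the derivative is comparable to e^{t0}.
ratio_at_t0 = h_prime(t0, tau, tau1, eps) / math.exp(t0)

# At t2 the excess over tau1 has decayed to size about 1.
excess_at_t2 = h_prime(t2, tau, tau1, eps) - tau1

# The maximum of the derivative sits within O(1) of t1 (grid search, step 0.01).
grid = [0.01 * k for k in range(4000)]
t_max = max(grid, key=lambda t: h_prime(t, tau, tau1, eps))
```

For these values `ratio_at_t0` is close to 1 and `t_max` lands next to `t1`, in line with the qualitative picture above; other parameter choices shift the constants but not the ordering $t_0 < t_1 < t_2$.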
Proof of Theorem 3. Here we show that Proposition 4 implies Theorem 3.
Step 1. We prove that u decays at infinity faster than $e^{-\tau \sqrt{|x|}}$. We choose R large enough so that (see Assumption A2)
$$\sup_{2^{j+1} > R} \|W_1\|_{X(\{x \,|\, 2^j \le |x| \le 2^{j+1}\})} < 2\delta, \tag{11}$$
$$\sup_{|x| > R} |x| \, |W_2(x)| < 2\delta. \tag{12}$$
Choose $\phi \in C^\infty$ to be identically 1 for $|x| \ge 2R$ and 0 for $|x| \le R$. We set $v = \phi u$. Then
$$-\Delta v - Vv = Wv - (\Delta \phi)u - 2\nabla \phi \cdot \nabla u.$$
For $\tau_1$ as in Theorem 3 we have $|x|^{\tau_1 - \frac{1}{2}} v \in L^2$, therefore we can apply Proposition 4 with $\varepsilon > 0$ to v to obtain
$$\|e^{h_\varepsilon(\ln |x|)} v\|_{l^2 W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}} + \|e^{h_\varepsilon(\ln |x|)} \rho v\|_{L^2} \lesssim \|e^{h_\varepsilon(\ln |x|)} \rho^{-1} (|u| + |\nabla u|)\|_{L^2(B_{2R} \setminus B_R)} + \|e^{h_\varepsilon(\ln |x|)} W_1 v\|_{l^2 W^{-\frac{1}{n+1}, \frac{2(n+1)}{n+3}}} + \|e^{h_\varepsilon(\ln |x|)} \rho^{-1} W_2 v\|_{L^2}.$$
By (11), (12) if $\delta$ is small enough then we can absorb the last two right hand side terms on the left to obtain
$$\|e^{h_\varepsilon(\ln |x|)} v\|_{l^2 W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}} + \|e^{h_\varepsilon(\ln |x|)} \rho v\|_{L^2} \lesssim \|e^{h_\varepsilon(\ln |x|)} \rho^{-1} (|u| + |\nabla u|)\|_{L^2(B_{2R} \setminus B_R)}.$$
Then letting $\varepsilon \to 0$ in the definition of $h_\varepsilon$ yields
$$\|e^{\tau \sqrt{|x|}} v\|_{l^2 W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}} + \|e^{\tau \sqrt{|x|}} \rho v\|_{L^2} \lesssim \|e^{\tau \sqrt{|x|}} \rho^{-1} (|u| + |\nabla u|)\|_{L^2(B_{2R} \setminus B_R)}, \tag{13}$$
which shows that v and therefore u is rapidly decaying at infinity.
Step 2. We prove that u vanishes outside a compact set. This is done using (13) (which can also be derived directly from Proposition 4 as above). From (13) we obtain
$$R^{-1} e^{-\tau \sqrt{2R}} \Big( \|e^{\tau \sqrt{|x|}} v\|_{l^2 W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}} + \|e^{\tau \sqrt{|x|}} \rho v\|_{L^2} \Big) \lesssim \|(|u| + |\nabla u|)\|_{L^2(B_{2R} \setminus B_R)}.$$
Letting $\tau \to \infty$ shows that $v = 0$ outside $B_{2R}$. Then the same holds for u.
Step 3. We prove that u is identically 0. Assume by contradiction that this is not the case, and choose r minimal so that u is supported in $B(0, r)$. Our problem is scale invariant, so without any restriction in generality we can assume that $r > 1$. Take $x_0 \in \operatorname{supp} u$ with $|x_0| = r$. The problem is also invariant with respect to translations so we can assume instead that $\operatorname{supp} u \subset B(x_0, r)$ and $2x_0 \in \operatorname{supp} u$. To reach a contradiction we prove that there is $\alpha > 0$ so that u is supported in $B(0, 2r - \alpha)$. This follows as in Step 2 provided we know that for every $\delta > 0$ we can find $\rho > 0$ such that
$$\|W_1 v\|_{W^{-\frac{1}{n+1}, \frac{2(n+1)}{n+3}}} \le \delta \|v\|_{W^{\frac{1}{n+1}, \frac{2(n+1)}{n-1}}}, \qquad \operatorname{supp} v \subset B(2x_0, \rho).$$
Then $\alpha$ is chosen so that $\{2r - \alpha < |x| < 2r\} \cap B(x_0, r) \subset B(2x_0, \rho)$. Due to our choice of W this is a somewhat technical matter which is left for Proposition 16 in the appendix. This step can be approached alternatively by the unique continuation results of [7].
3. The $L^2$ Carleman Estimates
In this section we obtain the $L^2$ Carleman inequalities.
Proposition 5. Suppose that V satisfies Assumption A1. Let h be as in (10) and $\rho$ as in (8). Then for all u satisfying $|x|^{\tau_1 - \frac{1}{2}} u \in L^2$ we have
$$\|e^{h(\ln |x|)} \rho u\|_{L^2} + \Big\| e^{h(\ln |x|)} \frac{|x|}{h'(\ln |x|) + |x|} \rho \nabla u \Big\|_{L^2} \lesssim \|e^{h(\ln |x|)} \rho^{-1} (\Delta + V)u\|_{L^2}, \tag{14}$$
uniformly with respect to $\tau$ sufficiently large and $0 < \varepsilon \le \varepsilon_0$.
Proof. We use a conformal change of coordinates
$$t = \ln |x|, \qquad y = x/|x| \in S^{n-1}.$$
Denote $(\Delta + V)u = g$ and set
$$v(t, y) = e^{(n-2)t/2} u(e^t y), \qquad f(t, y) = e^{(n+2)t/2} g(e^t y).$$
A routine computation shows that
$$|x|^{(n+2)/2} (\Delta + V) |x|^{(2-n)/2} = \frac{\partial^2}{\partial t^2} + \Delta_{S^{n-1}} - ((n-2)/2)^2 + e^{2t} V,$$
therefore v solves the equation
$$Lv = f, \qquad L = \partial_t^2 + \Delta_{S^{n-1}} - ((n-2)/2)^2 + e^{2t} V. \tag{15}$$
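The routine computation behind (15) can be sanity-checked numerically in the simplest setting. The sketch below makes two simplifying assumptions that are not in the text: it takes $V = 0$ and restricts to radial functions, so that the Laplacian reduces to $u'' + \frac{n-1}{r} u'$ and the spherical Laplacian drops out. The test profile `v` and all function names are hypothetical. It compares $r^{(n+2)/2} \Delta\big(r^{(2-n)/2} v(\ln r)\big)$ with $v''(t) - ((n-2)/2)^2 v(t)$ at $t = \ln r$.

```python
import math

def radial_laplacian(u, r, n, step=1e-4):
    """Central-difference approximation of u'' + (n-1)/r * u' for radial u(r)."""
    second = (u(r + step) - 2.0 * u(r) + u(r - step)) / step**2
    first = (u(r + step) - u(r - step)) / (2.0 * step)
    return second + (n - 1) / r * first

def v(t):
    """Hypothetical test profile in the variable t = ln r."""
    return math.sin(3.0 * t) + t * t

def v_second(t):
    """Exact second derivative of the test profile."""
    return -9.0 * math.sin(3.0 * t) + 2.0

def conjugation_error(n, r):
    """|r^{(n+2)/2} Laplace(r^{(2-n)/2} v(ln r)) - (v'' - ((n-2)/2)^2 v)| at t = ln r."""
    u = lambda s: s ** ((2 - n) / 2.0) * v(math.log(s))
    lhs = r ** ((n + 2) / 2.0) * radial_laplacian(u, r, n)
    t = math.log(r)
    rhs = v_second(t) - ((n - 2) / 2.0) ** 2 * v(t)
    return abs(lhs - rhs)

max_error = max(conjugation_error(n, r) for n in (2, 3, 5) for r in (0.5, 1.0, 2.5))
```

The remaining discrepancy is finite-difference error only. Including the potential would add the term $e^{2t} V$ on the right, since the factor $|x|^{(n+2)/2} |x|^{(2-n)/2} = |x|^2 = e^{2t}$ multiplies V.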
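We also note that part of Assumption A1 in the new coordinates gives
$$-\liminf_{t \to \infty} \frac{V_t}{4V} = \tau_0 < \frac{1}{2}.$$
By (4) we slightly readjust $\tau_0$ so that for t large we have
$$-\frac{V_t}{4V} \le \tau_0 < \frac{1}{2}. \tag{16}$$
For any exponential weight h we have
$$\int e^{2h(\ln |x|)} |u|^2 \, dx = \int_{\mathbb{R}} \int_{S^{n-1}} e^{2h(t) + nt} |u(e^t y)|^2 \, dt \, dy = \|e^{h(t)} e^t v\|^2_{L^2(\mathbb{R} \times S^{n-1})}, \tag{17}$$
$$\int e^{2h(\ln |x|)} |g|^2 \, dx = \int_{\mathbb{R}} \int_{S^{n-1}} e^{2h(t) + nt} |g(e^t y)|^2 \, dt \, dy = \|e^{h(t)} e^{-t} f\|^2_{L^2(\mathbb{R} \times S^{n-1})}. \tag{18}$$
Hence, in the new coordinates the bound (14) becomes
$$\|e^{h(t)} \rho_1 v\|_{L^2} + \Big\| e^{h(t)} \frac{\rho_1}{e^t + h'(t)} \nabla v \Big\|_{L^2} \lesssim \|e^{h(t)} \rho_1^{-1} f\|_{L^2}, \tag{19}$$
where $\nabla v$ is the gradient of v with respect to y and t and, by (9),
$$\rho_1(t) = e^t \rho = \Big( h'(t) \big( e^{2t} + h'(t)^2 \big) \Big)^{\frac{1}{4}}. \tag{20}$$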
To prove the above bound one would like to follow a standard strategy. This means conjugating the operator with respect to the exponential weight, and producing a commutator estimate for the self-adjoint and the skew-adjoint part of the conjugated operator. There are two small problems with this approach, both of which occur in the region where $h'(t)$ is small. First we want to incorporate the weight $\rho_1^{-1}$ on the right, which would require an additional conjugation. Where $h'$ is small this cannot be treated as a small perturbation, so we really have to include $\rho^{-1}$ in the exponential weight. This leads to a second difficulty. After including $\rho^{-1}$ in the exponential weight the commutator between the self-adjoint and the skew-adjoint part of the conjugated operator is no longer fully positive definite and we need a slightly modified argument. To handle both issues we prove a slightly more general result and then we obtain (19) as a special case of it. Precisely, we consider an exponential weight $\phi$ as follows:
(i) $\phi' \ge \tau_1 - \frac{1}{2}$, and $\phi'(0)$ is large.
(ii) $1 + \phi'$ is slowly varying on the unit scale, i.e. $|\phi^{(j)}(t)| \lesssim 1 + \phi'(t)$, $j = 2, 3$.
(iii) $\phi'$ can only have a limited exponential growth rate, $\phi'' \le \frac{3}{4}(1 + \phi')$. Together with (i) this yields the existence of a unique $t_0$ so that $\phi'(t_0) = e^{t_0}$. Our last assumption asks for uniform convexity up to $t_0$:
(iv) $\phi''(t) \approx \phi'(t)$ for $0 \le t \le t_0 + C$ for some large parameter C.
We summarize the bound for the weight $e^\phi$:
Lemma 6. Consider a weight function $\phi$ satisfying the conditions (i)-(iv) above. Then for all v which are supported in $t > 0$ and with $e^{\phi(t) + t} v \in L^2$ we have
$$\|e^{\phi(t)} (e^{2t} + \phi'(t)^2)^{\frac{1}{2}} v\|_{L^2} + \|e^{\phi(t)} \nabla v\|_{L^2} \lesssim \|e^{\phi(t)} (1 + \phi')^{-\frac{1}{2}} Lv\|_{L^2}. \tag{21}$$
Proof. First we conjugate with respect to the exponential weight. If we set $w = e^{\phi(t)} v$ then w solves the equation
$$L_\phi w = e^{\phi(t)} f, \qquad L_\phi = e^{\phi(t)} L e^{-\phi(t)}.$$
We decompose $L_\phi$ into a selfadjoint and a skewadjoint part,
$$L^r_\phi = \partial_t^2 + \Delta_{S^{n-1}} - \Big( \frac{n-2}{2} \Big)^2 + e^{2t} V + \phi'^2, \qquad L^i_\phi = -\phi' \partial_t - \partial_t \phi'.$$
The bound to prove is
$$\|(e^{2t} + \phi'(t)^2)^{\frac{1}{2}} w\|_{L^2} + \|\nabla w\|_{L^2} \lesssim \|(1 + \phi')^{-\frac{1}{2}} L_\phi w\|_{L^2}. \tag{22}$$
The proof of this inequality is based on several integrations by parts. In a standard manner one verifies that the integrations by parts below are valid if $e^{\phi + t} v \in L^2$.
We multiply L φ w by − 21 wt and integrate by parts to obtain 2t 1 1 e φ |wt |2 dy dt + ( φ + φ φ )|w|2 dtdy + (2V + Vt )w 2 dy dt 4 2 4 1 = (23) wt L φ w dy dt. 2 This computation is essentially like taking the commutator of L rφ and L iφ . On the left we have mostly positive contributions, with the following qualifications: – – –
– the first term can be negative where $\phi' < 0$,
– the $\phi' \phi''$ term can also be negative, but only for $t > t_0 + C$ where it is controlled by the V term,
– the $\phi'''$ term is controlled either by the V term or by the $\phi' \phi''$ term.
To correct the first term in the region where φ is negative we consider a cutoff function χ which equals δ in {φ > 2} and which equals 1 in {φ < 1}. Here δ is a small universal parameter which we shall choose below. Since φ + 1 is slowly varying we can assume that χ has uniformly bounded derivatives. Multiplying L φ w by χ 2 (t)w and integrating gives n−2 2 χ wt 2L 2 + χ ∇w2L 2 + ( ) χ w2L 2 − χ 2 (e2t V + φ 2 )|w|2 dy dt 2 1 2 2 2 (∂ χ )w dy dt + wL φ w dy dt. = (24) 2 t We multiply this by µ and add to the previous relation. This yields 1 Vt 2 2 2 2 −χ µ+ e2t V w 2 dy dt µχ ∇w L 2 + (χ µ + φ )|wt | dy dt + 2 4V 1 + ( φ φ − χ 2 µφ 2 )|w|2 dy dt 2
n−2 2 2 ) µχ + ∂tt2 χ 2 µ |w|2 dy dt = − φ /4 − ( 2 1 + (χ 2 µw + wt )L φ w dy dt. (25) 2 To ensure that the left hand side is positive definite we recall that for large t, −
Vt 1 ≤ τ0 < τ1 ≤ + φ . 4V 2
Hence if we choose µ positive so that 1 1 − τ1 < µ < − τ0 2 2 then the first three terms are positive definite. For the fourth term we consider two possibilities. If t < t0 + C then χ = δ while φ ≈ φ so it yields a positive contribution. We choose the universal constant δ so that 1 1 2 φ φ − χ µφ ≥ φ φ 2 4
if $t \le t_0 + C$. For larger $t$ this fourth integrand may be negative but then it is controlled by the third. The first term on the right hand side is controlled by the left hand side and we obtain
\[
\|\nabla w\|^2_{L^2} + \|(1+\phi')^{1/2} w_t\|^2_{L^2} + \|(\phi'(t)^2 + e^{2t})^{1/2} w\|^2_{L^2} \lesssim \tfrac12\int (\mu w + w_t)\, L_\phi w \,dy\,dt.
\]
The proof is completed by an application of the Cauchy-Schwarz inequality to the right hand side.

Proof of Proposition 5, continued. We obtain (19) from Lemma 21. For this we need to associate to each weight $h$ a function $\phi$ satisfying (i)-(iv) with the property that
\[
1 + \phi' \approx h', \qquad (1+\phi')^{-1/4} e^{\phi} \approx e^{h}(h'^2 + e^{2t})^{-1/4}.
\]
The natural choice for $\phi$ is
\[
\phi(t) = h(t) - \frac{t}{2} + \frac14\ln(1 + h'(t)) - \frac14\ln(1 + e^{-t}h'(t)).
\]
Then
\[
\phi' = h' - \frac12 + \frac{h''}{4(1+h')} + \frac{(h' - h'')e^{-t}}{4(1 + e^{-t}h')}.
\]
We verify the properties of $\phi$. It is easy to see that $1 + \phi'$ is slowly varying. This implies that the last two terms in $\phi'$ are bounded and have bounded derivatives. Hence the properties (ii)-(iv) follow from the similar properties of $h$. It remains to check the bound $\phi' > \tau_1 - \frac12$. This is clear when $h' \gg 1$, which corresponds to $e^{t/2} \ll \tau^3$. For larger $t$ we have
\[
h'(t) = \tau_1 + \tau^3 e^{-t/2}(1 + O(\tau^{-1}))
\]
and
\[
h''(t) = -\tfrac12\tau^3 e^{-t/2}(1 + O(\tau^{-1})).
\]
Then
\[
\phi'(t) > \tau_1 - \tfrac12 + \tfrac12\tau^3 e^{-t/2}(1 + O(\tau^{-1}))
\]
so the desired bound is again verified. We note that what happens when $h'$ is small is not so important anyway; in this region we can simply choose $\phi(t) = h(t) - \frac{t}{2}$.
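The expression for $\phi'$ quoted above can be checked term by term. The following is a short verification, assuming only the definition of $\phi$ given in the text; the abbreviation $g(t) = e^{-t}h'(t)$ is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
\phi(t) &= h(t) - \tfrac{t}{2} + \tfrac14\ln\bigl(1+h'(t)\bigr) - \tfrac14\ln\bigl(1+g(t)\bigr),
  \qquad g(t) := e^{-t}h'(t),\\
\frac{d}{dt}\,\tfrac14\ln(1+h') &= \frac{h''}{4(1+h')},\\
g'(t) &= e^{-t}\bigl(h''-h'\bigr),
  \qquad -\frac{d}{dt}\,\tfrac14\ln(1+g) = \frac{(h'-h'')\,e^{-t}}{4(1+e^{-t}h')},\\
\phi' &= h' - \tfrac12 + \frac{h''}{4(1+h')} + \frac{(h'-h'')\,e^{-t}}{4(1+e^{-t}h')}.
\end{align*}
```

If $h' \ge 0$ and $|h''| \lesssim h'$, both fractions are bounded, which is consistent with the remark that the last two terms in $\phi'$ are bounded.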
4. A General Dispersive Estimate for Second Order Operators

In this section we study the second order operator²
\[
L_\mu = \partial_i a^{ij}(x)\partial_j + \mu^2 c(x) - i\mu\bigl(b^j(x)\partial_j + \partial_j b^j(x)\bigr),
\]
in the unit ball $B \subset \mathbb{R}^n$, $n \ge 2$, with real coefficients $a^{ij}$ and complex coefficients $b^j$ and $c$. Here $\mu$ is sufficiently large and plays the role of a semiclassical parameter. Concerning the type and regularity of the coefficients we assume that

(REG) the matrix $(a^{ij}(x))$ is real, symmetric and positive definite; the functions $a^{ij}$, $b^i$ and $c$ are of class $C^2$.

We define the symbol
\[
l(x,\xi) = -\xi_i a^{ij}(x)\xi_j + c(x) + 2b^j\xi_j .
\]
The real part of $l$ is a second degree polynomial in $\xi$ with characteristic set
\[
\operatorname{char}_x \Re l(x,\xi) = \{\xi \in \mathbb{R}^n ;\ \Re l(x,\xi) = 0\}.
\]
The geometric assumption on the operator $L$ is

(GEOM) for each $x$ the characteristic set $\operatorname{char}_x \Re l(x,\xi)$ is an ellipsoid of size $\approx 1$.

Our third hypothesis is concerned with the size of the Poisson bracket of the real and imaginary part of $L$. We are interested in a principal normality type condition of the form
\[
|\{\Re l(x,\xi), \Im l(x,\xi)\}| \lesssim \delta + |\Re l(x,\xi)| + |\Im l(x,\xi)|, \tag{26}
\]
where the relevant range for $\delta$ is $\mu^{-1} < \delta \lesssim 1$. This would suffice for our purposes if in addition we knew that all the coefficients of $l$ are of class $C^3$. In general for technical reasons we need to replace the inequality with a decomposition
\[
\{\Re l, \Im l\}(x,\xi) = \delta q_0(x,\xi) + q_1^r(x,\xi)\Re l(x,\xi) + q_1^i(x,\xi)\Im l(x,\xi) + q_2(x,\xi). \tag{27}
\]
Thus our last assumption has the form

(PN) the Poisson bracket $\{\Re l, \Im l\}$ admits a representation (27) where
\[
|\partial_x^\alpha\partial_\xi^\beta q_i(x,\xi)| \le c_{\alpha\beta}, \quad |\alpha| \le i, \qquad |q_0| \lesssim 1, \quad |q_1^r| + |q_1^i| \lesssim 1, \quad |q_2| \lesssim |l|.
\]

For $L$ in the class of operators described above we are interested in constructing a parametrix $T$ which has good $L^{p'} \to L^p$ and $L^2 \to L^p$ mapping properties, while the errors are always measured in $L^2$. A dual form of this also allows us to estimate the $L^p$ norm of a function $u$ in terms of the $L^2$ norms of $u$ and $Lu$.

² We use the summation convention here and in the sequel.
In the context of the Carleman estimates such parametrices allow us to superimpose local $L^{p'} \to L^p$ bounds on top of the global $L^2 \to L^2$ estimates in order to obtain a global $L^{p'} \to L^p$ bound. Such estimates are dispersive in nature and are strongly related to the spreading of singularities in the parametrix $T$. This in turn is determined by the nonvanishing curvatures of the characteristic set $\operatorname{char}_x \Re l(x,\xi)$.

If $L$ has constant coefficients and real symbol then the theorem below is nothing but a reformulation of the restriction theorem. If $L$ has real symbol but variable coefficients then we are close to the spectral projection estimates of C. Sogge [10]. In the case when $L$ has constant coefficients but complex symbol some bounds of this type were obtained in [4]. In the more general case considered here we rely on bounds and parametrix constructions in the authors' earlier paper [7]. These apply to principally normal operators. The operator $L_\mu$ is principally normal on the unit spatial scale only if $\delta \approx \mu^{-1}$. Otherwise, we use a better spatial localization to the $(\delta\mu)^{-1/2}$ scale. On one hand $L_\mu$ is principally normal on this scale, while on the other hand this localization is compatible with the $L^2$ estimates and this allows us to easily put the pieces back together.

All Sobolev norms in the theorem below are flattened at frequency $\mu$ instead of frequency 1 as usual. Hence we introduce the notation
\[
W_\mu^{s,p} = \{u \in \mathcal{S}' ;\ (\mu^2 + D^2)^{s/2} u \in L^p\}
\]
with the corresponding norm. We note that the operator $L$ is elliptic at frequencies larger than $\mu$ so all the estimates are trivial in that case. All the interesting action takes place at frequency $\mu$, where we can identify all Sobolev norms with $L^p$ norms.

Theorem 7. Suppose that the operator $L_\mu$ satisfies the conditions (REG), (GEOM) and (PN) for some $\delta > \mu^{-1}$. Let $\phi \in C(B_2(0))$ have compact support. Then

A) There exists an operator $T$ such that
\[
\|Tf\|_{W_\mu^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + (\delta\mu)^{1/4}\mu^{-1/2}\|Tf\|_{H^1_\mu} \lesssim \inf_{f = f_1 + f_2} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}} \tag{28}
\]
and
\[
(\delta\mu)^{-1/4}\mu^{-1/2}\|LT\phi f - \phi f\|_{L^2} \lesssim \inf_{f = f_1 + f_2} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}. \tag{29}
\]

B) For all functions $u$ in $B_2(0)$ we have
\[
\|\phi u\|_{W_\mu^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} \lesssim (\delta\mu)^{1/4}\mu^{1/2}\|u\|_{L^2} + \inf_{Lu = f_1 + f_2} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}. \tag{30}
\]
C) Suppose that in addition the problem is pseudoconvex in the sense that q0 (x, ξ ) ≈ δ µ−1
x ∈ B2 (0), µ 1.
(31)
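Then for all functions $u$ with compact support in $B_2(0)$ we have
\[
\|u\|_{W_\mu^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + (\delta\mu)^{1/4}\mu^{1/2}\|u\|_{L^2} \lesssim \inf_{Lu = f_1 + f_2 + f_3} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}. \tag{32}
\]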
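The two Lebesgue exponents appearing in Theorem 7 are dual to each other, which is what makes the duality argument for Part B possible. A quick check, with the labels $p$ and $p'$ introduced here only for illustration:

```latex
\[
p = \frac{2(n+1)}{n-1}, \qquad p' = \frac{2(n+1)}{n+3}, \qquad
\frac1p + \frac1{p'} = \frac{(n-1)+(n+3)}{2(n+1)} = 1 .
\]
```

For example, $n = 3$ gives $p = 4$ and $p' = 4/3$.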
The difficult part of this theorem is the existence of the rough parametrix in Part A. This existence will be derived from Theorem 3 in [7]. The arguments repeat partially those of Sects. 3, 7 and 8 of [7].

Proof. Part A. (i) Localization. We first reduce the problem to the case when $\delta = \mu^{-1}$. This is done by localization to a small spatial scale and then by rescaling. The appropriate spatial scale is $r = (\mu\delta)^{-1/2}$. We cover the support of $\phi$ with balls $B_j$ of radius $r$ and choose a subordinate partition of unity of the form $\sum_j \phi_j^2 = 1$. Suppose that within $B_j$ there exists a parametrix $T_j$ satisfying the desired estimates. Then we set
\[
T = \sum_{j=1}^N \phi_j T_j \phi_j .
\]
The bound (28) for $T$ follows directly by square summing the similar bounds for $T_j$. For (29) we compute
\[
I - LT = \sum_{j=1}^N \phi_j (I - LT_j)\phi_j - \sum_{j=1}^N [L, \phi_j]\, T_j \phi_j .
\]
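The formula for $I - LT$ follows from the partition of unity and a single commutation. A sketch, assuming only $\sum_j \phi_j^2 = 1$ and $T = \sum_j \phi_j T_j \phi_j$ as above:

```latex
\begin{align*}
L\,\phi_j T_j \phi_j &= \phi_j L T_j \phi_j + [L,\phi_j]\,T_j\phi_j,\\
I - LT &= \sum_{j=1}^N \phi_j^2 - \sum_{j=1}^N L\,\phi_j T_j\phi_j
        = \sum_{j=1}^N \phi_j (I - LT_j)\phi_j - \sum_{j=1}^N [L,\phi_j]\,T_j\phi_j .
\end{align*}
```

The sign in front of the commutator sum plays no role in the estimates. Since $L$ is of second order, each commutator $[L,\phi_j]$ is a first order operator whose coefficients involve derivatives of $\phi_j$, which is why the bound (28) for $T_j$ suffices to control it.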
For the first term we use (29) for $T_j$ while for the second we estimate the commutators using (28) for $T_j$. In order to obtain the localized parametrices $T_j$ we rescale $B_j$ to the unit scale. Then the problem reduces to the original one but with $\delta = \mu^{-1}$.

(ii) The elliptic high frequency parametrix. For each $x$ the zero set of $\Re l$ is an ellipse contained in a ball $B_{R\mu}(0)$ with $R \sim 1$. Let $\psi \in C^\infty(\mathbb{R}^n)$ be a nonnegative, radial, radially decreasing function supported in $B_2(0)$ and identically 1 in $B_1(0)$. Let $\phi$ be as in the statement of the theorem. We fix a nonnegative function $\phi_0 \in C_0^\infty(B_2(0))$, identically 1 on the support of $\phi$. We define $T_{high}$ by its Weyl symbol
\[
\phi_0(x)\, l_\mu^{-1}(x,\xi)\,(1 - \psi(\xi/\mu R))\,\phi_0(x).
\]
Then the following $L^2$ bounds are immediate:
\[
\|T_{high} f\|_{H^1_\mu} \lesssim \|f\|_{H^{-1}_\mu}, \qquad
\|(1 - LT_{high})(1 - \psi(D/(2\mu R)))\phi f\|_{L^2} \lesssim \|f\|_{H^{-1}_\mu}.
\]
These estimates are the elliptic versions of the parametrix bounds. By Sobolev embeddings they imply bounds of the type of Theorem 4.

(iii) The low frequency parametrix. We first mollify the coefficients of $L_\mu$ on a scale $\mu^{-1/2}$ and note that this does not affect the hypothesis of the theorem. We also modify its symbol for large $\xi$ and extend it to $\mathbb{R}^{2n}$ so that it is of size $\mu^2$ and so that it satisfies
\[
|\partial_x^\alpha\partial_\xi^\beta \tilde l_\mu(x,\xi)| \lesssim
\begin{cases}
\mu^{2-|\beta|} & \text{if } |\alpha| \le 2,\\
\mu^{1+|\alpha|/2-|\beta|} & \text{if } |\alpha| \ge 3.
\end{cases}
\]
By Theorem 3 of [7] there exists a parametrix $T_{low}$ for $\tilde l_\mu$ satisfying
\[
\mu^{\frac{1}{n+1}}\|T_{low} f\|_{L^{\frac{2(n+1)}{n-1}}} + \mu^{1/2}\|T_{low} f\|_{L^2} \lesssim \inf_{f = f_1 + f_2} \mu^{-1/2}\|f_1\|_{L^2} + \mu^{-\frac{1}{n+1}}\|f_2\|_{L^{\frac{2(n+1)}{n+3}}} \tag{33}
\]
and the error estimate
\[
\mu^{-1/2}\|(1 - \tilde l_\mu^w(x,D)T_{low})\,\psi(D/(2\mu R))\phi f\|_{L^2} \lesssim \inf_{f = f_1 + f_2} \mu^{-1/2}\|f_1\|_{L^2} + \mu^{-\frac{1}{n+1}}\|f_2\|_{L^{\frac{2(n+1)}{n+3}}}. \tag{34}
\]
(iv) The complete parametrix. In the final step we combine the low and high frequency parametrices. We set
\[
T = T_{high}(1 - \psi(D/2\mu R))\phi_0 + \phi_0\psi(D/4\mu R)\,T_{low}\,\psi(D/2\mu R)\phi_0 .
\]
The estimate (28) follows easily from the similar bounds for $T_{high}$ and $T_{low}$. It remains to consider the error estimate. We have
\[
\begin{aligned}
(I - LT)\phi f ={}& (I - LT_{high})(1 - \psi(D/2\mu R))\phi f + \phi_0\psi(D/4\mu R)(I - \tilde L^w_\mu T_{low})\psi(D/2\mu R)\phi f \\
&+ [\tilde L^w_\mu, \phi_0\psi(D/4\mu R)]\,T_{low}\,\psi(D/2\mu R)\phi f + (L - \tilde L^w_\mu)\phi_0\psi(D/4\mu R)\,T_{low}\,\psi(D/2\mu R)\phi f.
\end{aligned}
\]
For the first two terms we use the error estimates for $T_{high}$, respectively $T_{low}$. In the third term the commutator has size $\mu$ in $L^2$ so we can use the $L^2$ bound for $T_{high}$. The operator $(L - \tilde L^w_\mu)\phi_0\psi(D/4\mu R)$ also has size $\mu$ in $L^2$ since the original coefficients differ from the mollified ones by $\mu^{-1}$. This completes the proof of the inequality (29).

Part B. We prove (30) by duality as in Sect. 3 of [7]. Let $g \in W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}$. We decompose $\phi g$ as
\[
\phi g = h + L^* T\phi g,
\]
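where $T$ is the operator of Theorem 7 constructed for the formal adjoint operator $L^*$. By Part A of the theorem we have
\[
(\delta\mu)^{-1/4}\mu^{-1/2}\|h\|_{L^2} + (\delta\mu)^{1/4}\mu^{1/2}\|T\phi g\|_{L^2} + \|T\phi g\|_{W_\mu^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} \lesssim \|g\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\]
Therefore we can write
\[
\begin{aligned}
|\langle \phi u, g\rangle| = |\langle u, \phi g\rangle| &\le |\langle u, h\rangle| + |\langle u, L^* T\phi g\rangle| = |\langle u, h\rangle| + |\langle Lu, T\phi g\rangle| \\
&\lesssim \Bigl((\delta\mu)^{1/4}\mu^{1/2}\|u\|_{L^2} + \inf_{Lu = f_1 + f_2} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}\Bigr) \times \|g\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\end{aligned}
\]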
This implies the estimate (30).

Part C. We begin with an $L^2$ estimate. The principal symbol of $\bar L_\mu = L_\mu(\mu^2 + |D|^2)^{-1/2}$ is
\[
\bar l_\mu(x,\xi) = \bigl(-a^{ij}(x)\xi_i\xi_j + \mu^2 c(x) + 2\mu b^j\xi_j\bigr)(\mu^2 + |\xi|^2)^{-1/2}.
\]
A short calculation shows that
\[
\delta\mu - \{\Re\bar l_\mu(x,\xi), \Im\bar l_\mu(x,\xi)\} \lesssim |\bar l_\mu(x,\xi)|,
\]
and hence, by Corollary II.14 of [11], we obtain the bound
\[
\delta\mu\|w\|^2_{L^2} \lesssim \|\bar L_\mu w\|^2_{L^2} + \|w\|^2_{L^2}.
\]
If $\delta\mu \gg 1$ then the norm of $w$ on the right hand side can be hidden on the left hand side. Applying this to $w = (\mu^2 + |D|^2)^{1/2} v$ we obtain
\[
\delta\mu\|v\|^2_{H^1_\mu} \lesssim \|L_\mu v\|^2_{L^2}. \tag{35}
\]
For $u$ as in the theorem we write $u = v + TL_\mu u$. The bounds for the second term come from Part A. On the other hand, $L_\mu v = (1 - L_\mu T)L_\mu u$, for which we can use the error estimate (29) to obtain
\[
(\delta\mu)^{-1/4}\mu^{-1/2}\|Lv\|_{L^2} \lesssim \inf_{L_\mu u = f_1 + f_2} (\delta\mu)^{-1/4}\mu^{-1/2}\|f_1\|_{L^2} + \|f_2\|_{W_\mu^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\]
Then we successively apply (35) and (30) to $v$, concluding the proof.
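The way (35) feeds into the final estimate can be made explicit. A sketch of how the weights match, using only (35) and the elementary inequality $\mu\|v\|_{L^2} \le \|v\|_{H^1_\mu}$, which follows from the definition of the flattened Sobolev norm:

```latex
\begin{align*}
\delta\mu\,\|v\|_{H^1_\mu}^2 \lesssim \|L_\mu v\|_{L^2}^2
&\;\Longrightarrow\;
(\delta\mu)^{1/4}\mu^{-1/2}\|v\|_{H^1_\mu}
  \lesssim (\delta\mu)^{-1/4}\mu^{-1/2}\|L_\mu v\|_{L^2},\\
\mu\|v\|_{L^2} \le \|v\|_{H^1_\mu}
&\;\Longrightarrow\;
(\delta\mu)^{1/4}\mu^{1/2}\|v\|_{L^2}
  \lesssim (\delta\mu)^{-1/4}\mu^{-1/2}\|L_\mu v\|_{L^2}.
\end{align*}
```

The right-hand side is the quantity bounded through (29), and the left-hand side is the $L^2$ term that appears in (30) and (32).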
5. The $L^p$ Carleman Inequality

In this section we prove Proposition 4. We first conjugate with respect to the exponential weight. If we set $w = e^{h(\ln|x|)} v$ then we can rewrite (7) in the form
\[
\|w\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|\rho w\|_{L^2} \lesssim \inf_{L_h w = f_1 + f_2} \|\rho^{-1} f_1\|_{L^2} + \|f_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}},
\]
where
\[
L_h = \Delta + V + h'(\ln|x|)^2 |x|^{-2} - \Bigl(\nabla\cdot\frac{x}{|x|^2}\, h'(\ln|x|) + h'(\ln|x|)\,\frac{x}{|x|^2}\cdot\nabla\Bigr).
\]
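The form of $L_h$ comes from conjugating the Laplacian with the exponential weight. A short derivation, assuming a radial weight $H(x) = h(\ln|x|)$; the symbol $H$ is introduced here only as shorthand and does not appear in the original text.

```latex
\begin{align*}
\nabla H &= h'(\ln|x|)\,\frac{x}{|x|^2}, \qquad |\nabla H|^2 = h'(\ln|x|)^2\,|x|^{-2},\\
e^{H}\Delta\bigl(e^{-H}w\bigr) &= \Delta w - 2\nabla H\cdot\nabla w - (\Delta H)\,w + |\nabla H|^2 w\\
 &= \Delta w + |\nabla H|^2 w - \bigl(\nabla\cdot(\nabla H\,w) + \nabla H\cdot\nabla w\bigr).
\end{align*}
```

The zeroth order term $|\nabla H|^2$ is symmetric, while the bracketed first order part is skew-symmetric; this is the splitting used throughout the Carleman estimates.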
We want to apply Theorem 7 on dyadic annuli $A_j = \{x \mid 2^{j-1} < |x| < 2^{j+1}\}$. The rescaling $y = 2^{-j}x$ transforms this set to $A_0$ and the operator $L_h$ to
\[
L_h^j = \Delta + 2^{2j}\tilde V + h'(\ln(2^j|y|))^2 |y|^{-2} - \Bigl(\nabla\cdot\frac{y}{|y|^2}\, h'(\ln(2^j|y|)) + h'(\ln(2^j|y|))\,\frac{y}{|y|^2}\cdot\nabla\Bigr).
\]
We verify that we can apply Theorem 7 to $L_h^j$. Since $h'$ varies slowly on the unit scale we can take the corresponding value for $\mu$ to be
\[
\mu_j^2 = 2^{2j} + h'(j\ln 2)^2 .
\]
The coefficients $b$ and $c$ are given by
\[
c = \mu_j^{-2}\bigl(2^{2j}V + h'(\ln(2^j|y|))^2/|y|^2\bigr), \qquad b^j = -\frac{h'(\ln(2^j|y|))}{\mu_j}\,\frac{y_j}{|y|^2}
\]
and are clearly of class $C^2$ and size $O(1)$. We have
and are clearly of class C 2 and size O(1). We have j
j
lh (x, ξ ) = −ξ 2 + c, lh (x, ξ ) = 2b · ξ. Their Poisson bracket has the form h (t) {−|ξ | + c, b · ξ } = (−|ξ |2 + c) + 2y · ξ µ|y|2 2
−
1 h (t) − |y|4 h (t)|y|3
22 j h (t) 2h (t)2 h (t) y · ∇V − , |y|2 µ3j |y|4 µ3j
b·ξ
t = ln(2 j |y|).
Then we can apply Theorem 7 with $\delta$ comparable to the size of the third term. For our choice of $h$ we have $|h''| \lesssim h'$ and also $h''(t) \ll h'(t) \Longrightarrow h'(t) \lesssim e^t$. Hence we can choose
\[
\delta_j = \mu_j^{-3}\bigl(2^{2j} h'(j\ln 2) + h'(j\ln 2)^3\bigr).
\]
Let $\phi \in C_0^\infty(\mathbb{R})$ be a nonnegative function supported in $[-1,1]$ with
\[
\sum_{j=-\infty}^{\infty} \phi^2(t - j) = 1,
\]
and let $\phi_j(x) = \phi(\ln|x| - j)$. After rescaling, Part A of Theorem 7 yields a parametrix $T_j$ for $L_h$ in $A_j$ with the property that
\[
\|T_j g\|_{W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|\rho T_j g\|_{L^2} + \Bigl\|\rho\,\frac{|x|}{h'(\ln|x|) + |x|}\,\nabla(T_j g)\Bigr\|_{L^2} + \|\rho^{-1}(L_h T_j - 1)\phi_j g\|_{L^2} \lesssim \inf_{g = g_1 + g_2} \|\rho^{-1} g_1\|_{L^2(A_j)} + \|g_2\|_{W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}(A_j)}.
\]
We define a parametrix for $L_h$ by
\[
T = \sum_{j=0}^{\infty} \phi_j T_j \phi_j .
\]
Summing up the bounds on $T_j$ we obtain a bound for $T$,
\[
\|Tg\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|\rho Tg\|_{L^2} \lesssim \inf_{g = g_1 + g_2} \|\rho^{-1} g_1\|_{L^2} + \|g_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\]
We also compute the error
\[
1 - L_h T = \sum_{j=0}^{\infty} \phi_j (1 - L_h T_j)\phi_j - \sum_{j=0}^{\infty} [L_h, \phi_j]\, T_j \phi_j .
\]
Since $[L_h, \phi_j] = O(|x|^{-1})\nabla + O(h'(\ln|x|)|x|^{-2})$ and
\[
|x|^{-1} \lesssim \rho^2\,\frac{|x|}{h'(\ln|x|) + |x|}, \qquad h'(\ln|x|)|x|^{-2} \lesssim \rho^2,
\]
we can bound the error by
\[
\|\rho^{-1}(1 - LT)g\|_{L^2} \lesssim \inf_{g = g_1 + g_2} \|\rho^{-1} g_1\|_{L^2} + \|g_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\]
Now, after the construction of the parametrix, the assertion of Proposition 4 follows exactly as the corresponding part of Theorem 7. We repeat the argument. Split $w$ into $w = v + TLw$. Then the second term satisfies the desired bounds while for the first we know that
\[
\|\rho^{-1} Lv\|_{L^2} = \|\rho^{-1}(LT - 1)Lw\|_{L^2} \lesssim \inf_{Lw = g_1 + g_2} \|\rho^{-1} g_1\|_{L^2} + \|g_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}.
\]
Lemma 5 allows us to also estimate $\|\rho v\|_{L^2}$. On the other hand, by Theorem 7, Part B, rescaled and applied to $v$ in $A_j$ we get
\[
\|\phi_j v\|_{W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} \lesssim \|\rho v\|_{L^2(A_j)} + \|\rho^{-1} Lv\|_{L^2(A_j)},
\]
and after summation in $j$,
\[
\|v\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} \lesssim \|\rho v\|_{L^2} + \|\rho^{-1} Lv\|_{L^2},
\]
thereby concluding the proof.
6. Equations with Gradient Potentials

In this section we discuss the corresponding results which are obtained when short range gradient potentials are added. Thus we consider equations of the form
\[
(-\Delta - V)u = Wu + Z^l\nabla u + \nabla Z^r u \tag{36}
\]
with $V$ and $W$ as before. The gradient potential $Z = (Z^l, Z^r)$ is subject to the following conditions:

Assumption A3 (The short range gradient potential). The gradient potential $Z \in l^\infty(L^n)$ satisfies³
\[
\limsup_{j\to\infty} \|Z\|_{L^n(\{x \mid 2^j \le |x| \le 2^{j+1}\})} \le \delta. \tag{37}
\]
In addition for some R V L ∞ the low frequency part S
τ0 and assume that δ is sufficiently small. Let u ∈ Hloc 1
(1 + |x|)τ1 − 2 u ∈ L 2 . Then u ≡ 0.
By scaling we obtain the following result on the absence of embedded eigenvalues:

Corollary 9. Assume that $V$, $W$ and $Z$ satisfy Assumptions A1, A2, respectively A3 with $\delta = 0$. Then there are no embedded eigenvalues for the operator $-\Delta - W - Z^l\nabla - \nabla Z^r$.

³ In dimension $n = 2$ one needs to replace $L^n$ by $L^p$ for some $p > 2$ close to 2.
The problem of introducing gradient potentials has long been considered in the context of the unique continuation and the strong unique continuation problems for the same operators as here. There the key breakthrough came in Wolff's work [13], who proved that $Z \in L^n$ suffices for the unique continuation property. He also obtained the same result for strong unique continuation but only in low dimension. Later his ideas were used by the authors in [5] to complete the picture for strong unique continuation in high dimension, working with gradient potentials $Z \in l^1 L^n$. This latter paper is more relevant to the present context as it provides Carleman estimates in largely the same format as here.

Ideally, one would like to include matching gradient estimates to our $L^p$ Carleman inequalities. This would solve the problem but unfortunately cannot work. Wolff's contribution was to show that by osculating the weight one can considerably improve the bounds for the gradient term in the equation. Thus the choice of weights ultimately depends both on the gradient potentials and on the solution $u$. In our context this argument is needed only at spatial scales where the frequency is larger than one in the characteristic set of the conjugated operator; this corresponds to $h'(\ln|x|) > |x|$, which translates to $|x| \le \tau^2$ for our weight. For larger $|x|$ the gradient does not contribute much to the problem. Thus we are led to consider perturbed weights
\[
\psi_{\epsilon,\tau}(x) = h(\ln|x|) + k(x), \tag{38}
\]
where $k$ is not spherically symmetric but is small in an appropriate sense. The assumptions on $k$ are summarized in what follows:
\[
\operatorname{supp} k \subset \{|x| \le \tau^2\}, \qquad |x|^\alpha|\nabla^\alpha k(x)| \ll h'(\ln|x|), \quad \alpha = 1, 2, 3. \tag{39}
\]
The Carleman estimates are mainly concerned with the characteristic set of the conjugated operator $L_\psi$. Away from it we can obtain elliptic estimates. To select (part of) the elliptic region for $L_\psi$ we introduce a smooth cutoff symbol $\chi_e$ which selects the region
\[
E = \{|\xi| \ge R(1 + |x|^{-1}h'(\ln|x|))\}.
\]
Here the truncation in $\xi$ is done on the dyadic scale, while $R$ is chosen sufficiently large so that $E$ is away from the characteristic set of $P_\phi$. Then the Carleman estimates are as follows:

Theorem 10. Let $n \ge 3$. Assume that the long range potential $V$ satisfies A1. Let Z satisfy A3 with Z l ∞ L n + S
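Then for each $0 < \epsilon \le \epsilon_0$, $\tau$ large enough, $\tau_1 > \tau_0$ and $u$ which satisfies $(1+|x|)^{\tau_1 - \frac12} u \in L^2$ and $u \in H^1_{loc}$ there is a weight perturbation $k$ satisfying (38), (39) so that the following estimate holds with constants independent of $0 < \epsilon \le \epsilon_0$, $\tau > \tau_0$:
\[
\begin{aligned}
&\|e^{\psi_{\epsilon,\tau}(x)} u\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|e^{\psi_{\epsilon,\tau}(x)}\rho u\|_{L^2} + \|\chi_e^w e^{\psi_{\epsilon,\tau}(x)} u\|_{H^1} \\
&\qquad + \|e^{\psi_{\epsilon,\tau}(x)} Z^l\nabla u\|_{l^2 L^{\frac{2n}{n+2}} + \chi_e^w H^{-1}} + \|e^{\psi_{\epsilon,\tau}(x)} \nabla Z^r u\|_{l^2 L^{\frac{2n}{n+2}} + \chi_e^w H^{-1}} \\
&\qquad\qquad \lesssim \|e^{\psi_{\epsilon,\tau}(x)}\rho^{-1} f_1\|_{L^2} + \|e^{\psi_{\epsilon,\tau}(x)} f_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}} + \chi_e^w H^{-1}},
\end{aligned} \tag{40}
\]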
where $f_1 + f_2 = (-\Delta - V)u$.
In dimension $n = 2$ the result remains true if we replace the space $L^n$ for $Z$ by $L^p$ with $p > 2$ and close to 2, and if $L^{\frac{2n}{n+2}}$ is replaced by $L^q$ for any $q > 1$ close to 1. On the other hand, there is no need to use Wolff's Lemma so the bound holds for all weights as above independently of the choice of $Z$ and $u$. This is because in the region $\{|x| < \tau^2\}$, where we have strong pseudoconvexity, the analysis of the conjugated operator has an elliptic character and there is a full one derivative gain in the dispersive estimates.

The key feature of the theorem is that the weight $\psi_{\epsilon,\tau}(x)$ depends both on the potential $Z$ and on the solution $v$ itself. Once this result is established, it leads as before to the conclusion that solutions to (36) must be compactly supported. Then (a variation of) Wolff's weak unique continuation result [13] takes over and implies that $v$ must be identically 0. We also refer the reader to [5], where the estimates are formulated in a way similar to this paper, and where both left and right gradient potentials are considered. The proof largely repeats arguments presented earlier in this paper and in [5]. For the convenience of the reader we describe the main steps.

Proof of Theorem 10, outline. The main steps in the proof are as follows:

(i) Prove the $L^2$ part of the estimate (40) for $u$, i.e. the analogue of (14), uniformly for all weights $\psi_{\epsilon,\tau}$ with $k$ satisfying (39). This is done as in Sect. 3. Switching to the $(t, y)$ coordinates this becomes
\[
\|e^{\psi_{\epsilon,\tau}}\rho_1 v\|_{L^2} + \Bigl\|e^{\psi_{\epsilon,\tau}}\,\frac{1}{h'(t) + e^t}\,\rho_1\nabla v\Bigr\|_{L^2} \lesssim \|e^{\psi_{\epsilon,\tau}}\rho_1^{-1} Lv\|_{L^2}, \tag{41}
\]
where the operator $L$ and $\rho_1$ are the same as in Sect. 3. This bound is localizable on the unit scale in $t$. By this we mean that if we prove it on $O(1)$ size intervals in $t$ then the full estimate can be obtained by assembling the local bounds using a smooth partition of unity. In particular, it suffices to prove it separately in the regions $\{t > 2\ln\tau\}$, respectively $\{t < 2\ln\tau + 1\}$. In the first region the perturbation $k$ is zero, so this is nothing but (19). In the second region we do have a contribution from $k$, but we have the added benefit of the uniform convexity of $h$. In particular $e^t \lesssim h'(t)$, so the desired bound can be rewritten as
\[
\|e^{\psi_{\epsilon,\tau}} h'(t)^{3/4} v\|_{L^2} + \|e^{\psi_{\epsilon,\tau}} h'(t)^{-1/4}\nabla v\|_{L^2} \lesssim \|e^{\psi_{\epsilon,\tau}} h'(t)^{-3/4} Lv\|_{L^2}.
\]
Since $h'$ is large and slowly varying on the unit scale, it can be conjugated away from the right hand side to reduce this to
\[
\|e^{\psi_{\epsilon,\tau}} h'(t)^{3/2} v\|_{L^2} + \|e^{\psi_{\epsilon,\tau}} h'(t)^{1/2}\nabla v\|_{L^2} \lesssim \|e^{\psi_{\epsilon,\tau}} Lv\|_{L^2}.
\]
We set $w = e^{\psi_{\epsilon,\tau}} v$ and rewrite the above bound in terms of $w$ in order to eliminate the exponential weight,
\[
\|h'(t)^{3/2} w\|_{L^2} + \|h'(t)^{1/2}\nabla w\|_{L^2} \lesssim \|L_{\psi_{\epsilon,\tau}} w\|_{L^2}, \tag{42}
\]
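where the conjugated operator is given by
\[
L_{\psi_{\epsilon,\tau}} = e^{\psi_{\epsilon,\tau}} L e^{-\psi_{\epsilon,\tau}} = L^r_{\psi_{\epsilon,\tau}} + L^i_{\psi_{\epsilon,\tau}}
\]
with selfadjoint, respectively skewadjoint parts
\[
L^r_{\psi_{\epsilon,\tau}} = -\Delta_{t,y} - |\nabla\psi_{\epsilon,\tau}|^2 - e^{2t}V, \qquad L^i_{\psi_{\epsilon,\tau}} = \partial\cdot\nabla\psi_{\epsilon,\tau} + \nabla\psi_{\epsilon,\tau}\cdot\partial .
\]
Since
\[
\|L_{\psi_{\epsilon,\tau}} w\|^2_{L^2} = \|L^r_{\psi_{\epsilon,\tau}} w\|^2_{L^2} + \|L^i_{\psi_{\epsilon,\tau}} w\|^2_{L^2} + 2\langle [L^r_{\psi_{\epsilon,\tau}}, L^i_{\psi_{\epsilon,\tau}}] w, w\rangle,
\]
using Gårding's inequality this reduces as usual to a bound from below for the symbol of the commutator,
\[
\frac{1}{i}\{l^r_{\psi_{\epsilon,\tau}}, l^i_{\psi_{\epsilon,\tau}}\} \gtrsim h'^3 \quad \text{on } \{l^r_{\psi_{\epsilon,\tau}} = l^i_{\psi_{\epsilon,\tau}} = 0\}. \tag{43}
\]
We have
\[
l^r_{\psi_{\epsilon,\tau}} = \xi^2 - |\nabla\psi_{\epsilon,\tau}|^2 - e^{2t}V, \qquad l^i_{\psi_{\epsilon,\tau}} = 2i\,\xi\cdot\nabla\psi_{\epsilon,\tau},
\]
therefore
\[
\frac{1}{i}\{l^r_{\psi_{\epsilon,\tau}}, l^i_{\psi_{\epsilon,\tau}}\} = 4\,\xi\,\nabla^2\psi_{\epsilon,\tau}\,\xi + 4\,\nabla\psi_{\epsilon,\tau}\,\nabla^2\psi_{\epsilon,\tau}\,\nabla\psi_{\epsilon,\tau} + 2e^{2t}\,\partial_t\psi_{\epsilon,\tau}\,(2V + V_t).
\]
If $k = 0$ then in the characteristic set of $l^r_{\psi_{\epsilon,\tau}}$ we have $|\xi| \approx h'$. We also know that $h'' \approx h' > 0$ so the bound from below for the Poisson bracket follows. On the other hand, in the $(t, y)$ coordinates the regularity of the perturbation $k$ is easy to describe:
\[
|\partial^\alpha_{t,y} k(t, y)| \ll h'(t), \qquad |\alpha| = 1, 2, 3. \tag{44}
\]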
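The Poisson bracket formula used for (43) can be verified directly. A sketch, assuming the convention $\{p, q\} = \nabla_\xi p\cdot\nabla_x q - \nabla_x p\cdot\nabla_\xi q$ with $x = (t, y)$, the real symbols $p = \xi^2 - |\nabla\psi|^2 - e^{2t}V$ and $q = 2\xi\cdot\nabla\psi$ (so that the imaginary symbol is $iq$), and a weight $\psi = \psi(t)$ depending on $t$ only, as is the case when $k = 0$. The names $p$ and $q$ are shorthand introduced here.

```latex
\begin{align*}
\nabla_\xi p &= 2\xi, & \nabla_x q &= 2\,\nabla^2\psi\,\xi,\\
\nabla_x p &= -2\,\nabla^2\psi\,\nabla\psi - \nabla_x\bigl(e^{2t}V\bigr), & \nabla_\xi q &= 2\nabla\psi,\\
\{p,q\} &= 4\,\xi\cdot\nabla^2\psi\,\xi + 4\,\nabla\psi\cdot\nabla^2\psi\,\nabla\psi
          + 2\,\nabla\psi\cdot\nabla_x\bigl(e^{2t}V\bigr), \\
\nabla\psi\cdot\nabla_x\bigl(e^{2t}V\bigr) &= \partial_t\psi\; e^{2t}\,(2V + V_t)
\quad\text{when } \nabla_y\psi = 0 .
\end{align*}
```

With $\psi = h(t)$ the second term equals $4h'(t)^2 h''(t)$, which is comparable to $h'^3$ when $h'' \approx h'$; this is the source of the lower bound in (43).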
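Then we can treat $k$ as a negligible perturbation in the above computation. This concludes the proof of (41).

(ii) Conjugate the original equation with respect to the exponential weight. We set $v = e^{\psi_{\epsilon,\tau}} u$, $g_i = e^{\psi_{\epsilon,\tau}} f_i$ and rewrite the equation
\[
(-\Delta - V)u = f_1 + f_2
\]
in the form
\[
L_{\psi_{\epsilon,\tau}} v = g_1 + g_2,
\]
where the conjugated operator $L_{\psi_{\epsilon,\tau}} = e^{\psi_{\epsilon,\tau}}(-\Delta - V)e^{-\psi_{\epsilon,\tau}}$ is written as a sum of its selfadjoint and skewadjoint parts,
\[
L_{\psi_{\epsilon,\tau}} = L^r_{\psi_{\epsilon,\tau}} + L^i_{\psi_{\epsilon,\tau}}
\]
with
\[
L^r_{\psi_{\epsilon,\tau}} = -\Delta - |\nabla\psi_{\epsilon,\tau}|^2 - V, \qquad L^i_{\psi_{\epsilon,\tau}} = \nabla\psi_{\epsilon,\tau}\cdot\partial + \partial\cdot\nabla\psi_{\epsilon,\tau}.
\]
This conjugation allows us to eliminate the exponential weight in (40) at the expense of replacing the operator $-\Delta - V$ with $L_{\psi_{\epsilon,\tau}}$.

(iii) Prove the mixed $L^2$-$L^p$ estimate which is the analogue of (7), again uniformly with respect to all choices for $k$. After conjugation we can write it as
\[
\|v\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|\rho v\|_{L^2} \lesssim \inf_{g_1 + g_2 = L_{\psi_{\epsilon,\tau}} v} \|\rho^{-1} g_1\|_{L^2} + \|g_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}}. \tag{45}
\]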
The proof of (45) is done by repeating the arguments in Sect. 5 with no change. As before, a key ingredient is the fact that in the $L^2$ estimates we are allowed to localize in $x$ on a dyadic scale. This effectively turns the global problem into a local problem which after rescaling fits into the framework of Theorem 7.

(iv) Add in the $H^{-1}$ and $H^1$ norms in order to prove (40) for $Z = 0$. After conjugation this can be written as
\[
\|v\|_{l^2 W^{\frac{1}{n+1},\frac{2(n+1)}{n-1}}} + \|\rho v\|_{L^2} + \|\chi_e^w v\|_{H^1} \lesssim \|\rho^{-1} g_1\|_{L^2} + \|g_2\|_{l^2 W^{-\frac{1}{n+1},\frac{2(n+1)}{n+3}}} + \|g_3\|_{H^{-1}} \tag{46}
\]
whenever $g_1 + g_2 + \chi_e^w g_3 = L_{\psi_{\epsilon,\tau}} v$. This is done in an elliptic fashion once we observe that $L_{\psi_{\epsilon,\tau}}$ is elliptic in the support of the symbol $\chi_e$. Indeed, the condition (39) on $k$ implies that
\[
|\nabla\psi_{\epsilon,\tau}| \approx 1 + \frac{h'(\ln|x|)}{|x|}.
\]
Hence if $R \gg 1$ then in the support of $\chi_e$ we have
\[
l^r_{\psi_{\epsilon,\tau}}(x,\xi) = \xi^2 - |\nabla\psi_{\epsilon,\tau}|^2 - V \gtrsim R^2\Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr)^2 .
\]
We construct a high frequency elliptic parametrix $K$ for $L_{\psi_{\epsilon,\tau}}$ as a pseudodifferential operator $K(x, D)$ whose symbol is
\[
k(x,\xi) = \frac{1}{l_{\psi_{\epsilon,\tau}}(x,\xi)}, \qquad |\xi| \gtrsim R\Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr)
\]
with some nice extension for small $|\xi|$. Using standard pdo calculus one obtains the elliptic bound
\[
\|Kf\|_{H^1} \lesssim \|f\|_{H^{-1}}
\]
and the error estimates
\[
\|\rho^{-1}(L_{\psi_{\epsilon,\tau}} K - I)\chi_e^w g\|_{L^2} \lesssim \|g\|_{H^{-1}},
\]
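respectively
\[
\|\chi_e^w (K L_{\psi_{\epsilon,\tau}} - I)v\|_{H^1} \lesssim \|\rho v\|_{L^2}.
\]
Using these estimates we show that (46) follows from (45). We first reduce the problem to the case $g_3 = 0$ with the substitution
\[
v \to v - K\chi_e^w g_3, \qquad g_2 \to g_2 - (L_{\psi_{\epsilon,\tau}} K - I)\chi_e^w g_3 .
\]
Then we estimate $\chi_e^w v$ by writing
\[
\chi_e^w v = \chi_e^w K L_{\psi_{\epsilon,\tau}} v + \chi_e^w (I - K L_{\psi_{\epsilon,\tau}})v.
\]
In both cases the $L^p$ norms are handled via Sobolev embeddings.

(v) Estimate the part of (40) involving the high frequencies of $Z$ uniformly with respect to the choice of the weight $k$. We consider a smooth cutoff symbol $\chi_{low}$ which selects the region
\[
E = \{|\xi| \le 4R(1 + |x|^{-1}h'(\ln|x|))\}.
\]
Correspondingly we split $Z$ into
\[
Z_{low} = \chi_{low} Z, \qquad Z_{high} = (1 - \chi_{low})Z .
\]
Then we consider the $Z$ terms in (40) with $Z$ replaced with $Z_{high}$. After conjugation we need to estimate the expressions
\[
Z^l_{high}\nabla v, \qquad \nabla(Z^r_{high} v), \qquad \nabla\psi_{\epsilon,\tau}\, Z_{high} v .
\]
For the first one we write
\[
Z^l_{high}\nabla v = Z^l_{high}\nabla\chi_e^w v + (\chi_e^w)^2 Z^l_{high}\nabla(1 - \chi_e^w)v + (1 - (\chi_e^w)^2) Z^l_{high}\nabla(1 - \chi_e^w)v .
\]
We can bound the first term in $L^{\frac{2n}{n+2}}$,
\[
\|Z^l_{high}\nabla\chi_e^w v\|_{l^2 L^{\frac{2n}{n+2}}} \lesssim \|Z^l\|_{l^2 L^n}\,\|\chi_e^w v\|_{H^1}.
\]
The second one is estimated in $H^{-1}$,
\[
\begin{aligned}
\|\chi_e^w Z^l_{high}\nabla(1 - \chi_e^w)v\|_{H^{-1}}
&\lesssim \Bigl\|\Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr)^{-1} Z^l_{high}\nabla(1 - \chi_e^w)v\Bigr\|_{L^2} \\
&\lesssim \|Z^l_{high}\|_{l^\infty L^n}\,\Bigl\|\Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr)^{-1}\nabla(1 - \chi_e^w)v\Bigr\|_{l^2 L^{\frac{2n}{n-2}}} \\
&\lesssim \|Z^l_{high}\|_{l^\infty L^n}\,\|v\|_{l^2 L^{\frac{2n}{n-2}}}.
\end{aligned}
\]
Finally the third term is pointwise small and rapidly decreasing at infinity since in the Fourier space the sets
\[
\operatorname{supp}(1 - \chi_e^2), \qquad \operatorname{supp}(1 - \chi_{low}) + \operatorname{supp}(1 - \chi_e)
\]
are separated.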
The bound for the second expression $\nabla(Z^r_{high} v)$ is dual to the previous one. Finally, $\nabla\psi_{\epsilon,\tau}\, Z_{high} v$ can be estimated in a similar fashion. We note that after eliminating the high frequency part of $Z$ we can rewrite the contribution of the low frequency part symbolically in the form
\[
Z^l_{low}\nabla + \nabla Z^r_{low} = Z_{low}\nabla + (\nabla Z_{low}) \approx Z_{low}\nabla + \Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr) Z_{low}.
\]

(vi) Estimate the contribution of $Z_{low}$ in the region $\{|x| > \tau^2\}$ uniformly with respect to the choice of the weight $k$. This is exactly where the frequency has size 1 in the characteristic set of $L_{\psi_{\epsilon,\tau}}$, i.e.
\[
1 + \frac{h'(\ln|x|)}{|x|} \approx 1 .
\]
We write $v$ as
\[
v = \chi_e^w v + (1 - \chi_e^w)v .
\]
For the first part we use the $H^1$ bound and argue as in (iv). For the second term we note that in the region $\{|x| > \tau^2\}$ the function $\nabla(1 - \chi_e^w)v$ satisfies the same bounds as $v$, therefore we can use the second part of the Assumption (A3) and treat the gradient potential as a potential.

(vii) Show that within the region $\{|x| < \tau^2\}$ it is possible to choose the perturbation $k$ so that the estimate for $Z_{low}$ included holds. This is the part that uses Wolff's osculation lemma, and it is explained in detail in [5]. Our case here is somewhat simpler than in [5] since in the region $\{|x| < \tau^2\}$ we have uniform convexity of the weight, $h'' \approx h'$. Also the $L^p$ bound here is stronger. For the reader's convenience we outline the argument. We begin with $k = 0$ and look at the function
\[
F = \Bigl(Z_{low}\nabla u,\ \Bigl(1 + \frac{h'(\ln|x|)}{|x|}\Bigr) Z_{low} u\Bigr).
\]
We choose dyadically separated balls $B_j = B(x_j, \frac{1}{16}|x_j|)$, $|x_j| \approx 2^j$, where it concentrates,
\[
\sum_j \|e^{h(\ln|x|)} F\|^2_{L^{\frac{2n}{n+2}}(B_j)} \approx \|e^{h(\ln|x|)} F\|^2_{l^2 L^{\frac{2n}{n+2}}}.
\]
Consider a smooth nonincreasing function $\phi$ which equals 1 in $(-\infty, 1)$ and 0 in $(2, \infty)$. Let $\sigma$ be a small positive constant. Then we define a family of perturbations $k$ of the form
\[
k(x) = \sigma\sum_j h'(\ln|x_j|)\Bigl[\phi\Bigl(\frac{16|x - x_j|}{|x|}\Bigr) + |x_j|^{-1}(x - x_j)\cdot p_j\,\phi\Bigl(\frac{4|x - x_j|}{|x|}\Bigr)\Bigr]
\]
depending on the vectors $p_j$ which are subject to the condition $|p_j| \le 1$. With this choice for $k$ we retain the concentration in slightly larger sets uniformly with respect to the choice of the $p_j$'s,
\[
\sum_j \|e^{\psi_{\epsilon,\tau}} F\|^2_{L^{\frac{2n}{n+2}}(2B_j)} \approx \|e^{\psi_{\epsilon,\tau}} F\|^2_{l^2 L^{\frac{2n}{n+2}}}.
\]
To get a good bound within $2B_j$ we seek to choose $p_j$ so that $e^{\psi_{\epsilon,\tau}} Z\nabla u|_{2B_j}$ concentrates on a small set, more precisely a set where $Z$ is small in $L^n$. For this we use
Lemma 11. (Wolff's Lemma [13]) Let $\mu$ be a measure in $\mathbb{R}^n$ and $B$ a convex set. Then one can find $b_k \in B$ and disjoint convex sets $E_k \subset \mathbb{R}^n$ so that the measures $e^{x b_k}\mu$ are concentrated in $E_k$,
\[
\int_{E_k} e^{x b_k}\,d\mu \ge \frac12\int e^{x b_k}\,d\mu,
\]
and
\[
\sum_k |E_k|^{-1} \gtrsim |B| .
\]
For each $j$ we use this lemma with
\[
\mu = 1_{2B_j}\,|e^{\psi^0_{\epsilon,\tau}} F|^{\frac{2n}{n+2}}\,dx, \qquad B_j = B(0, \sigma h'(\ln|x_j|)|x_j|^{-1}),
\]
where $e^{\psi^0_{\epsilon,\tau}}$ corresponds to choosing $p_j = 0$ in the definition of $k$. Of the sets $E_{jk}$ we obtain in this way we can select one, which we call $E_j$, with the property that
\[
|E_j|\,\|Z\|^n_{L^n(E_j)} \ll h'(\ln|x_j|)^{-n}|x_j|^n .
\]
Making this choice for each $j$ we produce a weight $\psi_{\epsilon,\tau}$ for which $e^{\psi_{\epsilon,\tau}} Z\nabla u$ is concentrated in $E_j$. From the above relation we obtain
\[
\frac{h'(\ln|x_j|)}{|x_j|}\,\|Z\|_{L^{\frac{n}{2}}(E_j)} \ll 1 . \tag{47}
\]
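The passage from the selection property of $E_j$ to (47) is an application of Hölder's inequality on $E_j$. A sketch, where the constant $C$ stands for whatever bound the selection step provides (it is a placeholder introduced here):

```latex
\begin{align*}
\|Z\|_{L^{n/2}(E_j)} &\le |E_j|^{\frac2n-\frac1n}\,\|Z\|_{L^n(E_j)}
   = \bigl(|E_j|\,\|Z\|_{L^n(E_j)}^n\bigr)^{1/n},\\
|E_j|\,\|Z\|_{L^n(E_j)}^n \le C\,h'(\ln|x_j|)^{-n}|x_j|^n
 &\;\Longrightarrow\;
\frac{h'(\ln|x_j|)}{|x_j|}\,\|Z\|_{L^{n/2}(E_j)} \le C^{1/n}.
\end{align*}
```

In particular, a small constant in the selection property gives a small constant in (47).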
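Due to the choice of the $E_j$'s we can estimate the contribution of $Z_{low}$ by
\[
\|e^{\psi_{\epsilon,\tau}} F\|_{l^2 L^{\frac{2n}{n+2}}(\{|x| < \tau^2\})} \lesssim \Bigl(\sum_j \|e^{\psi_{\epsilon,\tau}} F\|^2_{L^{\frac{2n}{n+2}}(E_j)}\Bigr)^{1/2}.
\]
After conjugation this becomes
\[
\Bigl(\sum_j \bigl\|\bigl(Z_{low}\nabla v,\ Z_{low}(1 + |x|^{-1}h'(\ln|x|))v\bigr)\bigr\|^2_{L^{\frac{2n}{n+2}}(E_j)}\Bigr)^{1/2}.
\]
We split $\nabla v$ into low and high frequencies,
\[
\nabla v = \nabla(1 - \chi_e^w)v + \nabla\chi_e^w v .
\]
For the high frequency part we use the $H^1$ bound for $\chi_e^w v$ and the $L^n$ bound for $Z_{low}$ as in (iv). For the low frequency part we use the $L^{\frac{2n}{n-2}}$ bound for $v$ and the $L^{\frac{n}{2}}$ bound for $Z_{low}$. Then the contribution of $Z_{low}$ is estimated by
\[
\|\nabla\chi_e^w v\|_{H^1}\,\|Z\|_{l^2 L^n} + \|v\|_{l^2 L^{\frac{2n}{n-2}}}\,\sup_j |x_j|^{-1}h'(\ln|x_j|)\,\|Z\|_{L^{\frac{n}{2}}(E_j)}.
\]
By (A3) and (47) the conclusion follows.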
7. Asymptotically Flat Metrics

In this section we describe how the results on the absence of embedded eigenvalues extend to variable coefficient asymptotically flat metrics. We replace the Laplacian with a second order elliptic selfadjoint operator
\[
L = -\partial_j a^{jk}\partial_k + i(b^j\partial_j + \partial_j b^j) + c,
\]
where the coefficients $a$, $b$, $c$ are real. We assume that $L$ is flat at infinity in the sense that (see Definition 1): $a^{jk}, b^j, c \in C^2_x \subset L^\infty$,
\[
\limsup_{|x|\to\infty} |x||\nabla a^{ij}| \le \delta_0, \qquad
\limsup_{|x|\to\infty} |b(x)| + |x||\nabla b(x)| \le \delta_1, \qquad
\limsup_{|x|\to\infty} |c(x)| + |x||\nabla c(x)| \le \delta_1^2 . \tag{48}
\]
We also slightly strengthen the Assumption A3 to make it stable with respect to changes of variable:

Assumption A4 (The short range gradient potential). The gradient potential $Z \in l^\infty(L^n)$ satisfies⁴
\[
\limsup_{j\to\infty} \|Z\|_{L^n(\{x \mid 2^j \le |x| \le 2^{j+1}\})} \le \delta. \tag{49}
\]
In addition $\langle D\rangle^{-N} Z$ satisfies the conditions in Assumption A2 for some $N$ sufficiently large.

Then we have

Theorem 12. Assume that $W$, $V$ and $Z$ satisfy Assumptions A1, A2 and A4 with small enough $\delta$, that $\tau_1 > \tau_0$ and that the coefficients of $P$ satisfy (48) with $\delta_0$ and $\delta_1$ sufficiently small. If $u \in H^1_{loc}(\mathbb{R}^n)$ solves
\[
Lu + Vu = Wu + Z^l\nabla u + \nabla Z^r u \tag{50}
\]
and $|x|^{\tau_1 - \frac12} u \in L^2$ then $u \equiv 0$.

The assumptions of Theorem 12 are not scale invariant. For the following straightforward consequence we rescale the operator.

Corollary 13. Assume that the coefficients of the operator $P$ satisfy (48) with $\delta_0$ sufficiently small. Let $W$, $Z$ be as in Assumptions A2, A4 with $\delta = 0$. Then there exists $C > 0$ so that $P + W$ has no eigenvalues $\lambda > C\delta_1$.

Proof of Theorem 12, outline. The proof follows the same ideas as in the constant coefficient case. We describe the steps in what follows, and discuss the necessary modifications.

(i) Rescale so that
\[
|x||\nabla a| \lesssim \delta_0, \qquad |b(x)| + |x||\nabla b(x)| \lesssim \delta_1, \qquad |c(x)| + |x||\nabla c(x)| \lesssim \delta_1^2, \qquad |x| > 1. \tag{51}
\]

⁴ Again in dimension $n = 2$ one needs to replace $L^n$ by $L^{n+\epsilon}$.
Then in the Carleman estimates we can use the same weights $\phi_{\epsilon,\tau}$ as in Theorem 10. We claim that (40) holds with $-\Delta$ replaced by $L$. It suffices to justify the $L^2$ Carleman estimates. The rest of the argument is identical to the proof of Theorem 10.

(ii) Augment (51) to gain also the relation
\[
\limsup_{|x|\to\infty} |a(x) - I_n| \lesssim \delta_0 . \tag{52}
\]
This is achieved using a change of coordinates somewhat similar to the one introduced in [5]. Due to (48), within each spatial dyadic region this can be achieved with a linear change of coordinates. Precisely, if we choose points $x_j$ so that $|x_j| = 2^j$ then for $|x| \approx 2^j$ we can use the linear transformation
\[
y = B_j x, \qquad B_j = A_j^{1/2}, \qquad A_j = (a_{kl}(x_j)).
\]
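To see why such a linear map normalizes the principal part at $x_j$, here is a sketch under two assumptions that the text does not state explicitly: that $(a_{kl})$ with lower indices denotes the inverse of the coefficient matrix $(a^{kl})$, and that $B_j$ is the symmetric positive square root. Indices $j$ are dropped and the coefficients are frozen at $x_j$.

```latex
\begin{align*}
y = Bx \;&\Longrightarrow\; \nabla_x = B^{T}\nabla_y,\\
\nabla_x\cdot a\,\nabla_x &= \nabla_y\cdot\bigl(B\,a\,B^{T}\bigr)\nabla_y
  \quad\text{(for constant } a \text{ and } B\text{)},\\
B = A^{1/2},\; A = a^{-1} \;&\Longrightarrow\; B\,a\,B^{T} = A^{1/2}A^{-1}A^{1/2} = I_n .
\end{align*}
```

Since $|x||\nabla a| \lesssim \delta_0$, the coefficients vary by $O(\delta_0)$ across a dyadic region, so freezing them at $x_j$ suggests $|a - I_n| \lesssim \delta_0$ there in the new coordinates.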
From one dyadic region to the next these linear maps differ by $O(\delta)$. We use a dyadic partition of unity
\[
1 = \sum_{j=1}^{\infty} \phi_j(x)
\]
to glue them together,
\[
y = \sum_{j=1}^{\infty} \phi_j(x) B_j x .
\]
It is easy to verify that this change of variable achieves (52) and has the regularity
\[
|\partial^\alpha\chi(x)| \lesssim \delta_0 |x|^{1-|\alpha|}, \qquad |\alpha| \ge 2 .
\]
In the new coordinates we replace the operator $L$ by $J^{-\frac12} L J^{\frac12}$ where $J$ is the Jacobian of the change of coordinates. This is selfadjoint in the new coordinates. One can check that this change of coordinates does not affect $\delta_0$, $\delta_1$ by more than a fixed factor. One also has to verify that Assumptions A1, A2 and A4 remain unchanged. If $\chi$ were linear then the Assumption A1 on $V$ would remain unchanged. As it is, we have to modify $\tau_1$ by $O(\delta_0)$, which is suitably small. The only change in A2 and A4 is that $\delta$ is modified by a fixed constant.

(iii) Prove the $L^2$ Carleman estimates, namely
\[
\|e^{\psi_{\epsilon,\tau}}\rho u\|_{L^2} + \Bigl\|e^{\psi_{\epsilon,\tau}}\rho\,\frac{|x|}{h'(\ln|x|) + |x|}\,\nabla u\Bigr\|_{L^2} \lesssim \|e^{\psi_{\epsilon,\tau}}\rho^{-1}(L - V)u\|_{L^2}, \tag{53}
\]
with $\psi_{\epsilon,\tau}$ as in (38). For this we argue as in the proof of Theorem 10. Switching to the $(t, y)$ coordinates (53) is rewritten as
\[
\|e^{\psi_{\epsilon,\tau}}\rho_1 u\|_{L^2} + \Bigl\|e^{\psi_{\epsilon,\tau}}\,\frac{1}{h'(t) + e^t}\,\rho_1\nabla u\Bigr\|_{L^2} \lesssim \|e^{\psi_{\epsilon,\tau}}\rho_1^{-1}(\tilde L - e^{2t}V)u\|_{L^2}. \tag{54}
\]
The new operator $\tilde L$ has the form
\[
\tilde L = -\partial_t^2 - \Delta_y + \partial_i\tilde a^{ij}\partial_j + ie^t(\tilde b^j\partial_j + \partial_j\tilde b^j) + e^{2t}(\tilde c + V),
\]
H. Koch, D. Tataru
where, by (51), the coefficients a, ˜ b˜ and c˜ satisfy ˜ + |∇ b| ˜ δ1 , |c| ˜ + |∇ c| ˜ δ12 . |a| ˜ + |∇ a| ˜ δ0 , |b|
(55)
Conjugation in (54) with the exponential weight leads to 1 ρ1 v L 2 + ρ1 ∇v 2 ρ1−1 L˜ ,τ v L 2 , t L h (t) + e
(56)
where the selfadjoint, respectively the skewadjoint parts of L˜ ,τ are L˜ r,τ = L˜ − |∇ψ,τ |2 − a˜ jk ∂ j ψ,τ ∂k ψ,τ , L˜ i,τ = ∇ψ,τ ∂ + ∂∇ψ,τ + ∂ j a˜ jk (∂k ψ,τ ) + (∂k ψ,τ )a˜ jk ∂ j − i b˜ k (∂k ∇ψ,τ ). Using the fact that the Carleman estimates are localizable on the unit scale in t we split the proof in two parts. First we consider the region t < 2 ln τ + 1 which corresponds to |x| < 2τ 2 . In this region the weight φ,τ is strongly pseudoconvex and et h (t). As in the proof of Theorem 10 we can conjugate away the weight ρ1−1 on the right and reduce the problem to the bound (42). This in turn follows from (43). But due to (55), the contribution of the terms involving the coefficients a, ˜ b˜ and c˜ in the proof of (43) is negligible provided that δ0 and δ1 are sufficiently small. Secondly we consider the region t > 2 ln τ which corresponds to |x| > τ 2 . Here the pseudoconvexity of φ,τ is gradually lost as t increases. A negative effect of this is that we are no longer allowed to conjugate away the weight ρ −1 from the right. On the positive side, in this region we have k = 0. Then we can argue as in the proof of Proposition 5. As there, it suffices to prove the counterpart of Lemma 21. One can follow the same path, i.e. successively multiply L φ w by − 21 wt , and then by χ 2 w and integrate by parts. In this computation, the contributions of all terms involving the coefficients a, ˜ ˜b and c˜ are negligible if δ0 and δ1 are sufficiently small. (iv) The bound (40) suffices in order to show that the solution u to (50) has compact support. In order to prove that u is identically zero one uses local Carleman estimates, which are simpler. For the sake of completeness we state here the result, which is a slightly stronger version of a similar result in [5]. Theorem 14. Let L be a second order elliptic operator with C 1 coefficients in a compact domain D ⊂ Rn . Let φ be a strongly pseudoconvex function with respect to L in D. 
Then for each τ large enough, each Z with Z L n ≤ 1 and each u ∈ L 2 with support in D there exists a small linear function k in D so that the following estimate holds with constants independent of τ > τ0 : eτ (φ+k) u
3
W
1 2(n+1) n+1 , n−1
+eτ (φ+k) Z l ∇u
+ τ 4 eτ (φ+k) u L 2 + χew eτ (φ+k) u H 1
2n
L n+2 +χew H −1
+ eτ (φ+k) ∇ Z r u
2n
L n+2 +χew H −1
3
τ − 4 eτ (φ+k) f 1 L 2 + eτ (φ+k) f 2
W
1 , 2(n+1) − n+1 n+3
where f 1 + f 2 = (− − V )u, and χe is a frequency cutoff which selects the region {|ξ | τ }.
+χew H −1
,
(57)
In this case the L 2 Carleman estimates are established using integration by parts in a standard manner. Due to the strong pseudoconvexity the L 2 estimates guarantee locali1 zation on the τ − 2 scale, which is exactly what is needed in order to freeze the coefficients for the proof of the L p estimates. This is why only C 1 regularity for the coefficients is needed. The proof of the L p estimates is based again on Theorem 7. The same applies for the rest of the Carleman estimates in this paper in the region |x| τ 2 (which corresponds to et τ 2 ). However, beyond this threshold the rescaled skewadjoint part becomes very small and the problem is close to the spectral projection estimates, respectively the Strichartz estimates for wave equations with C 2 coefficients. 1 The spatial localization scale is (1 + |x|−1 h (ln(|x|)))− 2 while the frequency, instead of decaying, remains O(1) due to the long range potential V . Hence the difference between 1 P and its frozen coefficient version is O(h (ln(|x|))− 2 ), which is more than the constant ρ 2 in the L 2 estimates. This is why we need also bounds on the second derivatives of a i j , as required by Theorem 7. 8. Appendix We consider a dyadic partition of unity in Rn , ∞
1=
χ j (x),
j=−∞
where χ j (x) = χ0 (2− j x) is supported in |x| ≈ 2 j . We also consider bump functions χ˜ j (x) = χ˜ j (2− j x) with slightly larger support, which equal 1 within the support of χ j such that χ˜ j χ˜l = 0 if | j − l| ≥ 2. Lemma 15. Let 1 < p < ∞ and − pn < s < np . Then p p χ j uW s, p . uW s, p ≈ Proof. Let (u j ) be a sequence in W s, p . Arguing by duality it suffices to prove the bound
∞
p
χ j u j W s, p
p
u j W s, p .
j=−∞
With D =
(1 + |D|2 )1/2
we have
∞
χ j u j W s, p = Ds
χ j u j L p .
j=−∞
We write Ds
∞ j=−∞
χju j =
∞
χ˜ j Ds χ j u j +
j=−∞
∞
(1 − χ˜ j )Ds χ j u j .
j=−∞
The terms in the first sum have almost disjoint supports and are easy to estimate. It remains to consider the second sum. We use bounds on the kernel of D−s and its derivatives to estimate |(1 − χ˜ j )Ds χ j u j (x)| u j L p 2
(s+ pn ) j
(2 j + |x|)−n−s .
Then we conclude using ∞
aj2
(s+ pn ) j
∞
(2 j + |x|)−n−s L p ≈ p
j=−∞
|a j | p ,
s>−
j=−∞
n . p
This is the main ingredient in the proof of Proposition 16. Let δ > 0. Suppose that W ∈ X (see Definition 2). Then we have lim W X (B(0,α)) = 0.
α→0
Proof. We assume that n ≥ 3, the case n = 2 is similar. The result follows from the estimate W X ≈ χ j W
n+1 2 (X )
l
For one direction we write |W u, v| = | χ j W χ˜ j u, χ˜ j v| χ j W X χ˜ j u
1 , 2(n+1) n−1
W n+1
χ j W
l
χ j W
l
n+1 2 n+1 2
χ˜ j u
l
u
2(n+1) n−1
.
χ˜ j v
1 , 2(n+1) n−1
W n+1
v
1 , 2(n+1) n−1
W n+1
(58)
1 , 2(n+1) n−1
W n+1
χ˜ j v
l
1 , 2(n+1) n−1
W n+1
2(n+1) n−1
1 , 2(n+1) n−1
W n+1
.
For the other, we consider separately sums with j even and with j odd: χ j W u j , v j = χ j W χ˜ j u j , χ˜ j v j j even
j even
= W
χ˜ j u j ,
j even
W X
χ˜ j u j
j even
W X u j
l
χ˜ j v j
2(n+1) n−1
W
1 2(n+1) n+1 , n−1
1 , 2(n+1) n−1
W n+1
χ˜ j v j
j even
u j
l
2(n+1) n−1
1 , 2(n+1) n−1
W n+1
1 , 2(n+1) n−1
W n+1
.
References

1. Froese, R., Herbst, I.: Exponential bounds and absence of positive eigenvalues for N-body Schrödinger operators. Commun. Math. Phys. 87(3), 429–447 (1982/83)
2. Froese, R., Herbst, I., Hoffmann-Ostenhof, M., Hoffmann-Ostenhof, T.: On the absence of positive eigenvalues for one-body Schrödinger operators. J. Analyse Math. 41, 272–284 (1982)
3. Ionescu, A.D., Jerison, D.: On the absence of positive eigenvalues of Schrödinger operators with rough potentials. Geom. Funct. Anal. 13(5), 1029–1081 (2003)
4. Kenig, C.E., Ruiz, A., Sogge, C.D.: Uniform Sobolev inequalities and unique continuation for second order constant coefficient differential operators. Duke Math. J. 55(2), 329–347 (1987)
5. Koch, H., Tataru, D.: Carleman estimates and unique continuation for second-order elliptic equations with nonsmooth coefficients. Comm. Pure Appl. Math. 54(3), 339–360 (2001)
6. Koch, H., Tataru, D.: Sharp counterexamples in unique continuation for second order elliptic equations. J. Reine Angew. Math. 542, 133–146 (2002)
7. Koch, H., Tataru, D.: Dispersive estimates for principally normal pseudodifferential operators. Comm. Pure Appl. Math. 58(2), 217–284 (2005)
8. Reed, M., Simon, B.: Methods of modern mathematical physics. IV. Analysis of operators. New York: Academic Press [Harcourt Brace Jovanovich Publishers], 1978
9. Soffer, A., Weinstein, M.I.: Time dependent resonance theory. Geom. Funct. Anal. 8(6), 1086–1128 (1998)
10. Sogge, C.D.: Fourier integrals in classical analysis, Volume 105 of Cambridge Tracts in Mathematics. Cambridge: Cambridge University Press, 1993
11. Tataru, D.: On the Fefferman-Phong inequality and related problems. Comm. Partial Differ. Eq. 27(11-12), 2101–2138 (2002)
12. von Neumann, J., Wigner, E.P.: Über merkwürdige diskrete Eigenwerte. Z. Phys. 30, 465–467 (1929)
13. Wolff, T.H.: Recent work on sharp estimates in second order elliptic unique continuation problems. In: Fourier analysis and partial differential equations (Miraflores de la Sierra, 1992), Boca Raton, FL: CRC, 1995, pp. 99–128

Communicated by B. Simon
Commun. Math. Phys. 267, 451–476 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0076-3
Communications in
Mathematical Physics
Diagonalization of an Integrable Discretization of the Repulsive Delta Bose Gas on the Circle
J. F. van Diejen
Instituto de Matemática y Física, Universidad de Talca, Casilla 747, Talca, Chile. E-mail: [email protected]
Received: 28 October 2005 / Accepted: 10 March 2006
Published online: 3 August 2006 – © Springer-Verlag 2006
Abstract: We introduce an integrable lattice discretization of the quantum system of n bosonic particles on a ring interacting pairwise via repulsive delta potentials. The corresponding (finite-dimensional) spectral problem of the integrable lattice model is solved by means of the Bethe Ansatz method. The resulting eigenfunctions turn out to be given by specializations of the Hall-Littlewood polynomials. In the continuum limit the solution of the repulsive delta Bose gas due to Lieb and Liniger is recovered, including the orthogonality of the Bethe wave functions first proved by Dorlas (extending previous work of C.N. Yang and C.P. Yang).

Contents
1. Introduction
2. Discrete Laplacians on the Alcove
   2.1 Preliminaries
   2.2 Laplacians
   2.3 Hilbert space structure
3. Bethe Ansatz Eigenfunctions
   3.1 Bethe ansatz
   3.2 Bethe equations
4. Solution of the Bethe Equations
   4.1 Solution
   4.2 Proof
5. Diagonalization
   5.1 Spectrum and eigenfunctions
   5.2 Orthogonality and completeness
   5.3 Integrability
6. Continuum Limit
   6.1 Eigenfunctions
   6.2 Orthogonality
   6.3 Hamiltonian

Work supported in part by the Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) Grant # 1051012, by the Anillo Ecuaciones Asociadas a Reticulados financed by the World Bank through the Programa Bicentenario de Ciencia y Tecnología, and by the Programa Reticulados y Ecuaciones of the Universidad de Talca.
1. Introduction

The non-ideal Bose gas with delta-potential interactions is a system of n one-dimensional bosonic particles characterized by a Hamiltonian given by the formal Schrödinger operator
$$H = -\Delta + g \sum_{1 \le j \ne k \le n} \delta(x_j - x_k). \tag{1.1}$$
Here $x_1, \ldots, x_n$ represent the position variables, $\Delta := \partial_{x_1}^2 + \cdots + \partial_{x_n}^2$, $\delta$ refers to the delta distribution, and g denotes a coupling parameter determining the strength of the interaction. For g > 0 the interaction between the particles is repulsive and for g < 0 it is attractive, whereas for g = 0 the model degenerates to an ideal one-dimensional boson gas without interaction between the particles. The eigenvalue problem for the above Schrödinger operator with periodic boundary conditions (i.e. for particles moving along a circle) was solved by Lieb and Liniger by means of the Bethe Ansatz method [LL]. The corresponding spectral problem for particles moving along the whole real line was considered subsequently by McGuire [Mc]. Since the appearance of these two pioneering papers, the exactly solvable quantum models under consideration have been the subject of numerous studies; for an overview of the vast literature and an extensive bibliography we refer the reader to Refs. [Ma, G4, KBI, AK, S]. Further generalizations of the one-dimensional quantum n-particle system with delta-potential interactions can be found in Refs. [G3, Gu, HO, Di], where analogous quantum eigenvalue problems are studied in which the permutation-symmetry is traded for an invariance with respect to the action of more general reflection groups [B, Hu], and also in Refs. [AFK, AK, CC, HLP], where the delta-potential interaction between the particles is replaced by more general zero-range point-like interactions (involving combinations of $\delta$ and $\delta'$ type potentials) [A-H]. An important (and notoriously hard) problem connected with the Bethe Ansatz method is the question of demonstrating the completeness of the Bethe wave functions in a Hilbert space context. For the non-ideal Bose gas on the line with a repulsive delta-potential interaction (g > 0), the spectrum of the Schrödinger operator is purely continuous.
The completeness of the Bethe wave functions was proved for this case by Gaudin [G1, G2]. For the corresponding system in the attractive regime (g < 0), the completeness problem is much harder as multi-particle binding may occur thus giving rise to mixed continuous-discrete spectrum. In this more complex situation the completeness of the Bethe wave functions was shown by Oxford [O], with the aid of techniques developed by Babbitt and Thomas in their treatment of an analogous spectral problem for the one-dimensional infinite isotropic Heisenberg spin chain [T, BT]. When passing from particles on the line to particles on the circle the nature of the system changes drastically, as the confinement to a compact region forces the spectrum of the Schrödinger operator to become purely discrete. In this situation the completeness of the Bethe wave functions was proved for the repulsive regime by Dorlas [Do]. In Dorlas’ approach the
question of the completeness is first reduced to that of the orthogonality of the Bethe wave functions. This orthogonality is then shown to hold with the aid of quantum inverse scattering theory [KBI], combined with previous results of C.N. Yang and C.P. Yang pertaining to the solution of the associated algebraic system of Bethe equations (determining the spectrum of the Schrödinger operator under consideration) [YY]. For the attractive regime such progress has yet to be made: the question of the construction of a complete eigenbasis for the non-ideal Bose gas on the circle with delta-potential interaction remains (to date) open. By exploiting the translational invariance and the permutation symmetry the eigenvalue problem characterized by the Hamiltonian H (1.1) reduces, in the case of bosonic particles moving along a circle of unit circumference, to that of the free Laplacian
$$-\Delta\psi = E\psi, \tag{1.2}$$
acting on a domain of wave functions $\psi(x_1, \ldots, x_n)$ with support inside the alcove
$$A = \{x \in \mathbb{R}^n \mid x_1 + \cdots + x_n = 0,\ x_1 \ge x_2 \ge \cdots \ge x_n,\ x_1 - x_n \le 1\}, \tag{1.3}$$
and subject to normal linear homogeneous boundary conditions at the walls of the alcove of the form
$$(\partial_{x_j} - \partial_{x_{j+1}} - g)\psi\,\big|_{x_j - x_{j+1} = 0} = 0, \qquad j = 1, \ldots, n-1, \tag{1.4a}$$
$$(\partial_{x_n} - \partial_{x_1} - g)\psi\,\big|_{x_1 - x_n = 1} = 0. \tag{1.4b}$$
(The parameter E represents the energy eigenvalue.) The purpose of the present paper is to study a discretization of the eigenvalue problem in Eqs. (1.2)–(1.4b). Throughout we will restrict attention to the repulsive parameter regime g > 0. More specifically, we study the eigenvalue problem for an integrable system of discrete Laplacians acting on functions supported on a regular lattice over the alcove A (1.3), and subject to repulsive reflection relations at the boundary of the lattice. Since the alcove is compact, the lattice in question is finite; hence, our discrete eigenvalue problem is finite-dimensional. We solve the eigenvalue problem at issue by means of the Bethe Ansatz method. The resulting Bethe eigenfunctions turn out to be given by specializations of the Hall-Littlewood polynomials [M2, M3]. The orthogonality and completeness of these Bethe eigenfunctions arises as an immediate consequence of the integrability (which permits removing possible degeneracies in the spectrum of the Laplacians). As a byproduct, the Lieb-Liniger type Bethe eigenfunctions for the eigenvalue problem in Eqs. (1.2)–(1.4b) are recovered via a continuum limit. The orthogonality of the latter eigenfunctions (and thus eventually—because of Dorlas’ results [Do]—also the completeness) are in our approach immediately inherited from the corresponding orthogonality results for our discretized lattice model. In this connection it is probably helpful to recall that the original proof of the orthogonality due to Dorlas [Do] also involves a discretization, which arises however in a fundamentally different way from the one employed here. In a nutshell: Dorlas arrives at the orthogonality through a continuum limit of the (second) quantization of the Lattice Nonlinear Schrödinger Equation introduced by Izergin and Korepin [KBI], whereas here—in contrast—we study a rather more elementary quantum lattice model characterized by a direct discretization of the Schrödinger operator in Eqs. 
(1.2)–(1.4b) itself. The paper is structured as follows. In Sect. 2 the discretization of the eigenvalue problem in Eqs. (1.2)–(1.4b) is formulated. In Sect. 3 the eigenfunctions are constructed by
means of the Bethe Ansatz method. The associated Bethe equations are solved in Sect. 4 and the orthogonality and completeness of the corresponding Bethe wave functions is demonstrated in Sect. 5. Finally, the continuum limit is analyzed in Sect. 6.

2. Discrete Laplacians on the Alcove

In this section we introduce a system of discrete Laplacians on a finite lattice over (a dilated version of) the alcove A (1.3). For this purpose it will be convenient to borrow concepts and notation from the theory of root systems. Here we will only need to deal with the simplest type of root systems: those of type A. For further background material concerning root systems the reader is referred to Refs. [B, Hu].

2.1. Preliminaries. Let $e_1, \ldots, e_n$ denote the standard basis of unit vectors in $\mathbb{R}^n$ and let $\langle \cdot, \cdot \rangle$ be the (usual) inner product with respect to which the standard basis is orthonormal. The alcove A (1.3) constitutes a convex polyhedron in the center-of-mass plane
$$E = \{x \in \mathbb{R}^n \mid \langle x, e \rangle = 0\}, \qquad e = e_1 + \cdots + e_n, \tag{2.1}$$
which is bounded by the n hyperplanes
$$E_0 = \{x \in E \mid \langle x, \alpha_0 \rangle = 1\}, \tag{2.2a}$$
$$E_j = \{x \in E \mid \langle x, \alpha_j \rangle = 0\}, \qquad j = 1, \ldots, n-1, \tag{2.2b}$$
where
$$\alpha_0 := e_1 - e_n \quad \text{and} \quad \alpha_j := e_j - e_{j+1}, \qquad j = 1, \ldots, n-1. \tag{2.3}$$
Specifically, we have that
$$A = \{x \in E \mid \langle x, \alpha_0 \rangle \le 1,\ \langle x, \alpha_j \rangle \ge 0,\ j = 1, \ldots, n-1\}. \tag{2.4}$$
The vertices (i.e. corners) of the polyhedron A are determined by the intersections of all choices of n − 1 (= dim(E)) out of the n hyperplanes $E_0, \ldots, E_{n-1}$. These vertices are given explicitly by the origin 0 and the vectors
$$\omega_j := e_1 + \cdots + e_j - \tfrac{j}{n}(e_1 + \cdots + e_n), \qquad j = 1, \ldots, n-1. \tag{2.5}$$
Indeed, the vectors $\omega_1, \ldots, \omega_{n-1}$ all lie on the hyperplane $E_0$ (2.2a) and constitute a basis of E that is dual to the basis $\alpha_1, \ldots, \alpha_{n-1}$ (in the sense that $\langle \omega_j, \alpha_k \rangle = \delta_{j,k}$, where $\delta_{j,k}$ denotes the Kronecker delta symbol). Let $r_0 : E \to E$ be the orthogonal reflection in the hyperplane $E_0$ (2.2a) and let $r_j : E \to E$, $j = 1, \ldots, n-1$ be the orthogonal reflections in the hyperplanes $E_j$ (2.2b). The action of these reflections on an arbitrary vector $x \in E$ is of the form
$$r_0(x) = x + (1 - \langle x, \alpha_0 \rangle)\alpha_0, \tag{2.6a}$$
$$r_j(x) = x - \langle x, \alpha_j \rangle \alpha_j, \qquad j = 1, \ldots, n-1. \tag{2.6b}$$
From these two formulas it is readily inferred that for $j \in \{1, \ldots, n-1\}$ the reflection $r_j$ swaps the $j$th and $(j+1)$th coordinates of x, and that $r_0$ swaps the first and the last coordinates followed by a translation over the vector $\alpha_0$. Hence, the reflections $r_1, \ldots, r_{n-1}$ generate an action of the permutation group $S_n$ on E, and the reflections $r_0, \ldots, r_{n-1}$ generate an action of the affine permutation group $\hat{S}_n = S_n \ltimes Q$, which is the semidirect product of the permutation group $S_n$ and the lattice of translations
$$Q := \mathrm{Span}_{\mathbb{Z}}(\alpha_1, \ldots, \alpha_{n-1}). \tag{2.7}$$
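The statements above about the roots, the vectors $\omega_j$ and the reflections can be checked numerically. The following is a minimal sketch in exact rational arithmetic; the helper names (`unit`, `simple_roots`, `fundamental_weight`, `reflect`) are illustrative choices and do not come from the paper.

```python
from fractions import Fraction


def unit(n, j):
    """Standard basis vector e_j of R^n (1-based index j)."""
    return [Fraction(int(i == j)) for i in range(1, n + 1)]


def sub(x, y):
    """Componentwise difference x - y."""
    return [a - b for a, b in zip(x, y)]


def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def simple_roots(n):
    """Return [alpha_0, alpha_1, ..., alpha_{n-1}] as in Eq. (2.3)."""
    alpha0 = sub(unit(n, 1), unit(n, n))
    return [alpha0] + [sub(unit(n, j), unit(n, j + 1)) for j in range(1, n)]


def fundamental_weight(n, j):
    """omega_j = e_1 + ... + e_j - (j/n)(e_1 + ... + e_n), Eq. (2.5)."""
    return [Fraction(int(i <= j)) - Fraction(j, n) for i in range(1, n + 1)]


def reflect(x, j, roots):
    """Reflection r_0 (Eq. (2.6a)) for j = 0, or r_j (Eq. (2.6b)) for j > 0."""
    a = roots[j]
    c = (1 - dot(x, a)) if j == 0 else -dot(x, a)
    return [xi + c * ai for xi, ai in zip(x, a)]
```

For instance, with n = 4 one can confirm that `dot(fundamental_weight(4, j), simple_roots(4)[k])` equals 1 when j = k and 0 otherwise, and that `reflect(x, j, roots)` with j > 0 exchanges two neighbouring coordinates of x.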
For any (affine) permutation $\sigma \in \hat{S}_n$, one defines its length $\ell(\sigma)$ as the minimal number of reflections needed for decomposing $\sigma$ (non-uniquely) in terms of the generators:
$$\sigma = r_{j_1} r_{j_2} \cdots r_{j_\ell} \tag{2.8}$$
(where $j_1, \ldots, j_\ell \in \{0, 1, \ldots, n-1\}$ and with the convention that the length of the identity element is equal to zero). The polyhedron A (2.4) constitutes a fundamental domain for the action of $\hat{S}_n$ on E. More specifically, for each $x \in E$ the orbit $\hat{S}_n(x)$ intersects A precisely once. Let us denote by $\sigma_x \in \hat{S}_n$ the unique shortest affine permutation such that
$$\sigma_x(x) \in A. \tag{2.9}$$
Let us fix a positive integer m. Below it will often be convenient to employ a dilated version of the polyhedron $A^{(m)} := mA$ rather than A itself. Throughout we shall distinguish by means of a superscript (m) the corresponding boundary planes, boundary reflections, and the elements of the affine permutation group $\hat{S}_n^{(m)} := S_n \ltimes (mQ) \subset \hat{S}_n$ generated by these reflections. That is to say, the dilated alcove $A^{(m)}$ is bounded by the hyperplanes $E_j^{(m)} := mE_j$, $j = 0, \ldots, n-1$; the orthogonal reflections in these hyperplanes act as $r_j^{(m)}(x) := m\, r_j(x/m)$, $j = 0, \ldots, n-1$. (So $E_j^{(m)} = E_j$ and $r_j^{(m)} = r_j$ if j > 0.) The affine permutation $\sigma_x^{(m)} \in \hat{S}_n^{(m)}$ mapping a vector $x \in E$ into the fundamental domain $A^{(m)}$ is given by the action $\sigma_x^{(m)}(x) := m\,\sigma_{x/m}(x/m)$.

2.2. Laplacians. The orthogonal projection of $\mathbb{Z}^n \subset \mathbb{R}^n$ onto the center-of-mass plane E (2.1) is given by the lattice dual to Q (2.7):
$$P := \{\lambda \in E \mid \forall \alpha \in Q : \langle \lambda, \alpha \rangle \in \mathbb{Z}\} \tag{2.10a}$$
$$\phantom{P :}= \mathrm{Span}_{\mathbb{Z}}(\omega_1, \ldots, \omega_{n-1}). \tag{2.10b}$$
It is clear from Eq. (2.10a) that Q is contained as a sublattice in P. Hence, the action of the affine permutation group $\hat{S}_n$ in E maps the lattice P into itself (cf. Eqs. (2.6a), (2.6b)). The intersection of the lattice P with the dilated polyhedron $A^{(m)}$ provides a finite grid $P^{(m)}$ over $A^{(m)}$ containing its vertices 0 and $m\omega_1, \ldots, m\omega_{n-1}$:
$$P^{(m)} := \{\lambda \in P \mid \lambda \in A^{(m)}\} = \{k_1\omega_1 + \cdots + k_{n-1}\omega_{n-1} \mid k_1, \ldots, k_{n-1} \in \mathbb{Z}_{\ge 0},\ k_1 + \cdots + k_{n-1} \le m\}. \tag{2.11}$$
We are now in the position to define a system of n − 1 Laplace operators acting in the space $C(P^{(m)})$ of complex functions $\psi : P^{(m)} \to \mathbb{C}$.
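To make the grid of Eq. (2.11) concrete, the sketch below enumerates its points from the coefficient tuples $(k_1, \ldots, k_{n-1})$ and tests membership in the dilated alcove directly from the inequalities in (1.3). The function names are hypothetical, and the comparison with the binomial coefficient $\binom{m+n-1}{n-1}$ is a standard counting fact for such tuples rather than a statement taken from the text.

```python
from fractions import Fraction
from itertools import product
from math import comb


def fundamental_weight(n, j):
    """omega_j from Eq. (2.5) as a tuple of exact rationals."""
    return tuple(Fraction(int(i <= j)) - Fraction(j, n) for i in range(1, n + 1))


def grid(n, m):
    """Points k_1*omega_1 + ... + k_{n-1}*omega_{n-1} with k_i >= 0 and sum k_i <= m."""
    weights = [fundamental_weight(n, j) for j in range(1, n)]
    points = []
    for ks in product(range(m + 1), repeat=n - 1):
        if sum(ks) <= m:
            point = tuple(sum(k * w[i] for k, w in zip(ks, weights)) for i in range(n))
            points.append(point)
    return points


def in_dilated_alcove(x, m):
    """Membership in A^(m) = mA: coordinates sum to zero, decrease weakly, x_1 - x_n <= m."""
    ordered = all(x[i] >= x[i + 1] for i in range(len(x) - 1))
    return sum(x) == 0 and ordered and x[0] - x[-1] <= m


def grid_size(n, m):
    """Number of coefficient tuples (k_1, ..., k_{n-1}) with nonnegative entries and sum <= m."""
    return comb(m + n - 1, n - 1)
```

Calling `grid(3, 2)` returns six points, all of which pass `in_dilated_alcove(point, 2)`.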
Definition (Laplace Operator). To each basis vector $\omega_k$ from Eq. (2.5) we associate a corresponding Laplace operator $L_k^{(m)} : C(P^{(m)}) \to C(P^{(m)})$ defined by its action on an arbitrary function $\psi : P^{(m)} \to \mathbb{C}$ of the form
$$(L_k^{(m)}\psi)_\lambda := \sum_{\nu \in S_n(\omega_k)} \psi_{\lambda+\nu}, \tag{2.12a}$$
with the boundary convention that for $\mu \in P \setminus P^{(m)}$,
$$\psi_\mu := t^{\ell^{(m)}(\sigma_\mu^{(m)})}\, \psi_{\sigma_\mu^{(m)}(\mu)}, \tag{2.12b}$$
where t denotes a real (coupling) parameter (and the length function $\ell^{(m)}(\cdot)$ refers to the minimal number of reflections in the decomposition of an affine permutation in $\hat{S}_n^{(m)}$ in terms of $r_0^{(m)}, \ldots, r_{n-1}^{(m)}$).

Roughly speaking, the value of $L_k^{(m)}\psi$ in a point $\lambda \in P^{(m)}$ is equal to the sum of the values of $\psi$ in all neighboring points of the form $\lambda + \nu$, where $\nu$ runs through the orbit of $\omega_k$ with respect to the action of the permutation group $S_n$. When $\lambda + \nu$ lies outside $P^{(m)}$ the value of $\psi_{\lambda+\nu}$ is governed by the boundary convention in Eq. (2.12b). It is instructive to clarify the nature of this boundary convention somewhat more in detail by decomposing $\sigma_{\lambda+\nu}^{(m)}$ in terms of the elementary reflections in the hyperplanes bounding $A^{(m)}$.

Proposition 2.1 (Boundary Reflection Relations). Let $\lambda \in P^{(m)}$. The boundary convention in Eq. (2.12b) amounts to the requirement that $\forall \nu \in S_n(\omega_k)$ for which $\lambda + \nu \in P \setminus P^{(m)}$,
$$\psi_{\lambda+\nu} = \begin{cases} t\,\psi_{r_0^{(m)}(\lambda+\nu)} & \text{if } \langle \lambda+\nu, \alpha_0 \rangle > m, \\ t\,\psi_{r_j^{(m)}(\lambda+\nu)} & \text{if } \langle \lambda+\nu, \alpha_j \rangle < 0 \ (j > 0), \end{cases} \tag{2.13a}$$
or equivalently
$$\psi_{\lambda+\nu} = \begin{cases} t\,\psi_{\lambda+\nu-\alpha_0} & \text{if } \langle \lambda, \alpha_0 \rangle = m \text{ and } \langle \nu, \alpha_0 \rangle = 1, \\ t\,\psi_{\lambda+\nu+\alpha_j} & \text{if } \langle \lambda, \alpha_j \rangle = 0 \text{ and } \langle \nu, \alpha_j \rangle = -1 \ (j > 0). \end{cases} \tag{2.13b}$$
Proof. Let $\lambda \in P^{(m)}$ and $\nu \in S_n(\omega_k)$ such that $\lambda + \nu \in P \setminus P^{(m)}$. Then there exist $j \in \{0, \ldots, n-1\}$ such that $\langle \lambda+\nu, \alpha_j \rangle$ is either > m if j = 0 or < 0 if j > 0. Geometrically, this means that the hyperplane $E_j^{(m)}$ separates $\lambda + \nu$ from $\sigma_{\lambda+\nu}^{(m)}(\lambda+\nu) \in P^{(m)}$. Let us write $\mu := r_j^{(m)}(\lambda+\nu)$. Then for any such j we have that
$$\sigma_{\lambda+\nu}^{(m)} = \sigma_\mu^{(m)} r_j^{(m)} \quad \text{with} \quad \ell^{(m)}(\sigma_\mu^{(m)}) = \ell^{(m)}(\sigma_{\lambda+\nu}^{(m)}) - 1.$$
We will now use induction on the length of $\sigma_{\lambda+\nu}^{(m)}$ to prove that the boundary convention in Eq. (2.12b) and the boundary reflection relation in Eq. (2.13a) are equivalent. Indeed, upon assuming (2.12b) it is clear that
$$\psi_{\lambda+\nu} = t^{\ell^{(m)}(\sigma_{\lambda+\nu}^{(m)})}\, \psi_{\sigma_{\lambda+\nu}^{(m)}(\lambda+\nu)} = t^{\ell^{(m)}(\sigma_\mu^{(m)})+1}\, \psi_{\sigma_\mu^{(m)}(\mu)} = t\,\psi_\mu,$$
which amounts to (2.13a). Reversely, upon assuming (2.13a) and invoking the induction hypothesis we see that
$$\psi_{\lambda+\nu} = t\,\psi_\mu = t^{\ell^{(m)}(\sigma_\mu^{(m)})+1}\, \psi_{\sigma_\mu^{(m)}(\mu)} = t^{\ell^{(m)}(\sigma_{\lambda+\nu}^{(m)})}\, \psi_{\sigma_{\lambda+\nu}^{(m)}(\lambda+\nu)},$$
which amounts to (2.12b). To finish the proof of the proposition it remains to check that the boundary reflection relations in Eqs. (2.13a) and (2.13b) are equivalent. For this purpose it suffices to notice that the requirements that $\lambda \in P^{(m)}$ and $\nu \in S_n(\omega_k)$ imply that $0 \le \langle \lambda, \alpha_j \rangle \le m$ and that $-1 \le \langle \nu, \alpha_j \rangle \le 1$ for $j = 0, \ldots, n-1$. Hence $\langle \lambda+\nu, \alpha_0 \rangle > m$ iff $\langle \lambda, \alpha_0 \rangle = m$ and $\langle \nu, \alpha_0 \rangle = 1$, in which case $r_0^{(m)}(\lambda+\nu) = \lambda + \nu - \alpha_0$, and furthermore for j > 0 one has that $\langle \lambda+\nu, \alpha_j \rangle < 0$ iff $\langle \lambda, \alpha_j \rangle = 0$ and $\langle \nu, \alpha_j \rangle = -1$, in which case $r_j^{(m)}(\lambda+\nu) = \lambda + \nu + \alpha_j$.

It is clear from the proposition that the boundary convention in Eq. (2.12b) amounts to a normal linear boundary condition at the hyperplanes bounding $A^{(m)}$. The coupling parameter t determines the nature of this boundary condition; for |t| > 1 the boundary term (i.e. the interaction between the particles) is attractive whereas for |t| < 1 it is repulsive. For t = 0 and t = 1 we are dealing with Dirichlet type and Neumann type boundary conditions, respectively. Applying the boundary convention to those contributions in the sum over the translated orbit $\lambda + S_n(\omega_k)$ corresponding to lattice points outside the grid $P^{(m)}$ gives rise to a closed formula for the action of the Laplace operator in which the value of $L_k^{(m)}\psi$ at the point $\lambda$ is expressed completely in terms of the values of $\psi$ at the neighboring points of the form $\lambda + \nu \in P^{(m)}$. To make this procedure completely explicit we shall need some further notation. Let R be the orbit of the basis $\alpha_1, \ldots, \alpha_{n-1}$ with respect to the action of the permutation group $S_n$ and let $R_+$ be the part of the orbit that expands nonnegatively with respect to this basis:
$$R = \{e_j - e_k \mid 1 \le j \ne k \le n\}, \qquad R_+ = \{e_j - e_k \mid 1 \le j < k \le n\}. \tag{2.14}$$
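Proposition 2.2 (Explicit Action of the Laplace Operator). The action of the Laplace operator $L_k^{(m)}$ ($k \in \{1, \ldots, n-1\}$) on an arbitrary grid function $\psi : P^{(m)} \to \mathbb{C}$ is of the form
$$(L_k^{(m)}\psi)_\lambda = \sum_{\substack{\nu \in S_n(\omega_k) \\ \lambda+\nu \in P^{(m)}}} V_{\lambda,\nu}^{(m)}\, \psi_{\lambda+\nu}, \tag{2.15a}$$
where
$$V_{\lambda,\nu}^{(m)} := \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = 0 \\ \langle \nu, \alpha \rangle = 1}} \frac{1 - t^{1+\langle \rho, \alpha \rangle}}{1 - t^{\langle \rho, \alpha \rangle}} \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = m \\ \langle \nu, \alpha \rangle = -1}} \frac{1 - t^{1+n-\langle \rho, \alpha \rangle}}{1 - t^{n-\langle \rho, \alpha \rangle}}, \tag{2.15b}$$
with $\rho := \sum_{\alpha \in R_+} \alpha/2 = \omega_1 + \cdots + \omega_{n-1}$.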
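Proof. It follows from (the proof of) Proposition 2.1 that for $\lambda \in P^{(m)}$ and $\nu \in S_n(\omega_k)$ such that $\lambda + \nu \in P^{(m)}$ the coefficient $V_{\lambda,\nu}^{(m)}$ of $\psi_{\lambda+\nu}$ in $(L_k^{(m)}\psi)_\lambda$ is of the form $\sum_\sigma t^{\ell^{(m)}(\sigma)}$, where the sum is over those affine permutations $\sigma$ in $\hat{S}_n^{(m)}$ that are generated by iterated application of reflections of the type figuring in Eqs. (2.13a), (2.13b). These affine permutations are precisely the $\sigma \in \hat{S}_n^{(m)}$ for which $\sigma(\lambda) = \lambda$ and $\sigma(\lambda+\nu) \in P^{(m)}$, or equivalently, $\sigma(\lambda+\nu) = \lambda + \nu$. In other words, the coefficient is equal to the Poincaré series of the quotient of the stabilizer subgroup $\hat{S}_{n,\lambda}^{(m)} := \{\sigma \in \hat{S}_n^{(m)} \mid \sigma(\lambda) = \lambda\}$ and the stabilizer subgroup $\hat{S}_{n,\lambda}^{(m)} \cap \hat{S}_{n,\lambda+\nu}^{(m)}$, i.e.
$$V_{\lambda,\nu}^{(m)} = \sum_{\sigma \in \hat{S}_{n,\lambda}^{(m)} / (\hat{S}_{n,\lambda}^{(m)} \cap \hat{S}_{n,\lambda+\nu}^{(m)})} t^{\ell^{(m)}(\sigma)} = \frac{\sum_{\sigma \in \hat{S}_{n,\lambda}^{(m)}} t^{\ell^{(m)}(\sigma)}}{\sum_{\sigma \in \hat{S}_{n,\lambda}^{(m)} \cap \hat{S}_{n,\lambda+\nu}^{(m)}} t^{\ell^{(m)}(\sigma)}}. \tag{2.16}$$
It follows from a general formula for the Poincaré series of (affine) Weyl groups due to Macdonald [M1] (cf. Corollaries (2.5) and (3.4)) that the Poincaré series in the numerator and denominator of Eq. (2.16) admit product representations given by
$$\sum_{\sigma \in \hat{S}_{n,\lambda}^{(m)}} t^{\ell^{(m)}(\sigma)} = \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = 0}} \frac{1 - t^{1+\langle \rho, \alpha \rangle}}{1 - t^{\langle \rho, \alpha \rangle}} \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = m}} \frac{1 - t^{1+n-\langle \rho, \alpha \rangle}}{1 - t^{n-\langle \rho, \alpha \rangle}} \tag{2.17}$$
and
$$\sum_{\sigma \in \hat{S}_{n,\lambda}^{(m)} \cap \hat{S}_{n,\lambda+\nu}^{(m)}} t^{\ell^{(m)}(\sigma)} = \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = 0 \\ \langle \lambda+\nu, \alpha \rangle = 0}} \frac{1 - t^{1+\langle \rho, \alpha \rangle}}{1 - t^{\langle \rho, \alpha \rangle}} \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = m \\ \langle \lambda+\nu, \alpha \rangle = m}} \frac{1 - t^{1+n-\langle \rho, \alpha \rangle}}{1 - t^{n-\langle \rho, \alpha \rangle}}, \tag{2.18}$$
respectively, which (upon inserting in Eq. (2.16)) gives rise to Eq. (2.15b).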
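As a sanity check of Proposition 2.2 on small cases, one can compute the rows of the Laplace operator in two ways: directly from (2.12a) with the boundary convention (2.12b), and from the product formula (2.15b). The sketch below does this in exact arithmetic. It rests on one assumption that is not spelled out in the text: the number of wall reflections used by the greedy folding routine `fold` is taken to equal the length $\ell^{(m)}(\sigma_\mu^{(m)})$, which is the usual behaviour for reflection groups. All function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product


def weight(n, j):
    """omega_j from Eq. (2.5), as a tuple of exact rationals."""
    return tuple(Fraction(int(i < j)) - Fraction(j, n) for i in range(n))


def grid(n, m):
    """The finite grid P^(m) of Eq. (2.11)."""
    ws = [weight(n, j) for j in range(1, n)]
    return [tuple(sum(k * w[i] for k, w in zip(ks, ws)) for i in range(n))
            for ks in product(range(m + 1), repeat=n - 1) if sum(ks) <= m]


def orbit(n, k):
    """S_n-orbit of omega_k: all distinct coordinate permutations."""
    return sorted(set(permutations(weight(n, k))))


def fold(mu, m):
    """Reflect mu into A^(m) wall by wall; return (image, number of reflections used)."""
    x, count = list(mu), 0
    while True:
        for j in range(len(x) - 1):
            if x[j] < x[j + 1]:            # <x, alpha_j> < 0: apply r_j (swap)
                x[j], x[j + 1] = x[j + 1], x[j]
                count += 1
                break
        else:
            if x[0] - x[-1] > m:           # <x, alpha_0> > m: apply r_0^(m)
                x[0], x[-1] = x[-1] + m, x[0] - m
                count += 1
            else:
                return tuple(x), count


def laplacian_row(lam, n, m, k, t):
    """Coefficients of (L_k psi)_lam from (2.12a) with boundary convention (2.12b)."""
    row = {}
    for nu in orbit(n, k):
        image, length = fold(tuple(a + b for a, b in zip(lam, nu)), m)
        row[image] = row.get(image, 0) + t ** length
    return row


def V(lam, nu, n, m, t):
    """Product formula (2.15b); for alpha = e_j - e_k one has <rho, alpha> = k - j."""
    value = Fraction(1)
    for j in range(n):
        for k in range(j + 1, n):
            h = k - j
            la, na = lam[j] - lam[k], nu[j] - nu[k]
            if la == 0 and na == 1:
                value *= (1 - t ** (1 + h)) / (1 - t ** h)
            if la == m and na == -1:
                value *= (1 - t ** (1 + n - h)) / (1 - t ** (n - h))
    return value


def rows_agree(n, m, k, t):
    """Compare the folded rows with the closed formula (2.15a), (2.15b) on all of P^(m)."""
    pts = set(grid(n, m))
    for lam in pts:
        closed = {}
        for nu in orbit(n, k):
            target = tuple(a + b for a, b in zip(lam, nu))
            if target in pts:
                closed[target] = V(lam, nu, n, m, t)
        if laplacian_row(lam, n, m, k, t) != closed:
            return False
    return True
```

For n = 3, m = 1 and $\lambda = 0$, both computations give the single coefficient $1 + t + t^2$ in front of $\psi_{\omega_1}$ for the operator $L_1^{(1)}$.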
Remark. The stabilizer subgroups $\hat{S}_{n,\lambda}^{(m)}$ and $\hat{S}_{n,\lambda}^{(m)} \cap \hat{S}_{n,\lambda+\nu}^{(m)}$ in the proof of Proposition 2.2 consist of (direct products of) permutation groups. It is well-known (and readily seen by induction) that the Poincaré series of the permutation group $S_\ell$ admits the product representation $\prod_{j=1}^{\ell} (1 - t^j)/(1 - t)$ (cf. e.g. Ref. [M2], Chapter III, §1). With the aid of this latter product formula it is not so difficult to verify Eqs. (2.17), (2.18) (and thus Eq. (2.15b)) directly (i.e. without invoking the much more general results of Macdonald [M1] cited in the proof). (We thank the referee for making this point.)
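The product formula for the Poincaré series of a permutation group quoted in the Remark is easy to confirm by brute force for small groups. The sketch below uses the standard fact that the length of a permutation with respect to adjacent transpositions equals its number of inversions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j] (the Coxeter length in S_l)."""
    size = len(perm)
    return sum(1 for i in range(size) for j in range(i + 1, size) if perm[i] > perm[j])


def poincare_series_sum(size, t):
    """Sum of t^length over all permutations of `size` letters."""
    return sum(t ** inversions(p) for p in permutations(range(size)))


def poincare_series_product(size, t):
    """Product representation prod_{j=1}^{size} (1 - t^j) / (1 - t)."""
    value = Fraction(1)
    for j in range(1, size + 1):
        value *= (1 - t ** j) / (1 - t)
    return value
```

Evaluating both functions at a rational t with |t| < 1, for example `Fraction(2, 7)`, gives identical values for every group size that is small enough to enumerate.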
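2.3. Hilbert space structure. We shall now endow the function space $C(P^{(m)})$ with an inner product, turning it into a (finite-dimensional) Hilbert space $\mathcal{H}^{(m)} := \ell^2(P^{(m)}, \Delta^{(m)})$ characterized by a positive weight function $\Delta^{(m)} : P^{(m)} \to (0, \infty)$. To this end it will always be assumed from here onwards that the coupling parameter t lies in the repulsive regime
$$-1 < t < 1 \tag{2.19}$$
(unless explicitly stated otherwise). For two arbitrary functions $\psi, \phi \in C(P^{(m)})$ the inner product in question is then defined as
$$\langle \psi, \phi \rangle_{(m)} := \sum_{\lambda \in P^{(m)}} \psi_\lambda \overline{\phi_\lambda}\, \Delta_\lambda^{(m)}, \tag{2.20a}$$
where the weight function is given by
$$\Delta_\lambda^{(m)} := \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = 0}} \frac{1 - t^{\langle \rho, \alpha \rangle}}{1 - t^{1+\langle \rho, \alpha \rangle}} \prod_{\substack{\alpha \in R_+ \\ \langle \lambda, \alpha \rangle = m}} \frac{1 - t^{n-\langle \rho, \alpha \rangle}}{1 - t^{1+n-\langle \rho, \alpha \rangle}}. \tag{2.20b}$$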
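(Notice in this connection that the restriction of the coupling parameter to the repulsive regime (2.19) ensures that the values of the weight function $\Delta_\lambda^{(m)}$ are indeed positive for all $\lambda \in P^{(m)}$.)

Proposition 2.3 (Adjoint). The Laplace operators $L_k^{(m)}$ and $L_{n-k}^{(m)}$ ($k \in \{1, \ldots, n-1\}$) are each other's adjoints in $\mathcal{H}^{(m)}$:
$$\forall \psi, \phi \in C(P^{(m)}) : \quad \langle L_k^{(m)}\psi, \phi \rangle_{(m)} = \langle \psi, L_{n-k}^{(m)}\phi \rangle_{(m)}. \tag{2.21}$$

Proof. The proof hinges on the explicit formula for the action of $L_k^{(m)}$ in Proposition 2.2. Elementary manipulations reveal that
$$\begin{aligned}
\langle L_k^{(m)}\psi, \phi \rangle_{(m)} &= \sum_{\lambda \in P^{(m)}} (L_k^{(m)}\psi)_\lambda\, \overline{\phi_\lambda}\, \Delta_\lambda^{(m)} \\
&= \sum_{\nu \in S_n(\omega_k)} \sum_{\substack{\lambda \in P^{(m)} \\ \lambda+\nu \in P^{(m)}}} V_{\lambda,\nu}^{(m)}\, \psi_{\lambda+\nu}\, \overline{\phi_\lambda}\, \Delta_\lambda^{(m)} \\
&\overset{(i)}{=} \sum_{\nu \in S_n(\omega_k)} \sum_{\substack{\mu \in P^{(m)} \\ \mu-\nu \in P^{(m)}}} \psi_\mu\, V_{\mu-\nu,\nu}^{(m)}\, \overline{\phi_{\mu-\nu}}\, \Delta_{\mu-\nu}^{(m)} \\
&\overset{(ii)}{=} \sum_{\nu \in S_n(\omega_k)} \sum_{\substack{\mu \in P^{(m)} \\ \mu-\nu \in P^{(m)}}} \psi_\mu\, V_{\mu,-\nu}^{(m)}\, \overline{\phi_{\mu-\nu}}\, \Delta_\mu^{(m)} \\
&\overset{(iii)}{=} \sum_{\mu \in P^{(m)}} \psi_\mu\, \overline{(L_{n-k}^{(m)}\phi)_\mu}\, \Delta_\mu^{(m)} = \langle \psi, L_{n-k}^{(m)}\phi \rangle_{(m)},
\end{aligned}$$
where we have used: (i) the substitution $\lambda = \mu - \nu$, (ii) the identity
$$V_{\mu-\nu,\nu}^{(m)}\, \Delta_{\mu-\nu}^{(m)} = V_{\mu,-\nu}^{(m)}\, \Delta_\mu^{(m)}, \tag{2.22}$$
and (iii) the facts that the coefficient $V_{\mu,-\nu}^{(m)}$ is real and $-\omega_k \in S_n(\omega_{n-k})$. To infer the identity in Eq. (2.22) one observes that for $\mu \in P^{(m)}$ and $\nu \in S_n(\omega_k)$ such that $\mu - \nu \in P^{(m)}$ both sides $V_{\mu-\nu,\nu}^{(m)} \Delta_{\mu-\nu}^{(m)}$ and $V_{\mu,-\nu}^{(m)} \Delta_\mu^{(m)}$ reduce (upon canceling common terms from the numerator and denominator) to
$$\prod_{\substack{\alpha \in R_+ \\ \langle \mu, \alpha \rangle = 0 \\ \langle \nu, \alpha \rangle = 0}} \frac{1 - t^{\langle \rho, \alpha \rangle}}{1 - t^{1+\langle \rho, \alpha \rangle}} \prod_{\substack{\alpha \in R_+ \\ \langle \mu, \alpha \rangle = m \\ \langle \nu, \alpha \rangle = 0}} \frac{1 - t^{n-\langle \rho, \alpha \rangle}}{1 - t^{1+n-\langle \rho, \alpha \rangle}}.$$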
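The adjointness statement can be illustrated in the simplest case n = 2, where there is a single Laplace operator ($k = n - k = 1$) and the grid consists of the points $k\omega_1$, $k = 0, \ldots, m$. Under that specialization, worked out here as an assumption rather than quoted from the paper, the boundary relations (2.13b) read $\psi_{-1} = t\psi_1$ and $\psi_{m+1} = t\psi_{m-1}$, and the weight (2.20b) equals $1/(1+t)$ at the two end points and 1 elsewhere. The sketch uses real-valued test vectors, so complex conjugation is omitted, and all names are illustrative.

```python
from fractions import Fraction


def laplacian_matrix(m, t):
    """Matrix of the Laplace operator for n = 2 on the grid {0, 1, ..., m}."""
    size = m + 1
    L = [[Fraction(0)] * size for _ in range(size)]
    for k in range(size):
        for step in (1, -1):
            target, factor = k + step, Fraction(1)
            if target < 0:          # boundary relation psi_{-1} = t * psi_{1}
                target, factor = 1, t
            elif target > m:        # boundary relation psi_{m+1} = t * psi_{m-1}
                target, factor = m - 1, t
            L[k][target] += factor
    return L


def weights(m, t):
    """Weight function for n = 2: 1/(1+t) at the end points 0 and m, 1 otherwise."""
    return [1 / (1 + t) if k in (0, m) else Fraction(1) for k in range(m + 1)]


def apply_matrix(L, psi):
    """Matrix-vector product (L psi)_k."""
    return [sum(c * p for c, p in zip(row, psi)) for row in L]


def inner(psi, phi, w):
    """Weighted inner product for real-valued grid functions."""
    return sum(a * b * c for a, b, c in zip(psi, phi, w))
```

With these definitions, `inner(apply_matrix(L, psi), phi, w)` and `inner(psi, apply_matrix(L, phi), w)` coincide for arbitrary rational test vectors, which is the n = 2 instance of Eq. (2.21).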
3. Bethe Ansatz Eigenfunctions

In this section the eigenfunctions of the Laplacian $L_k^{(m)}$ (2.12a), (2.12b) are constructed by means of the Bethe Ansatz method of Lieb and Liniger [LL, Ma, G4, KBI].
3.1. Bethe ansatz. If we ignore boundary effects for a moment and interpret $L_k^{(m)}$ (2.12a) (without the boundary convention (2.12b)) as a Laplacian acting on functions $\psi : P \to \mathbb{C}$, then clearly the plane wave $\psi_\lambda(\xi) = \exp(i\langle \lambda, \xi \rangle)$ with wave number $\xi \in E/(2\pi Q)$ constitutes an eigenfunction corresponding to the eigenvalue $E_k(\xi) = \sum_{\nu \in S_n(\omega_k)} \exp(i\langle \nu, \xi \rangle)$. The Bethe Ansatz method aims to construct the eigenfunctions $\psi : P^{(m)} \to \mathbb{C}$ for the operator $L_k^{(m)}$ (2.12a) with the boundary convention (2.12b) via a suitable linear combination of plane waves (corresponding to the same eigenvalue $E_k(\xi)$). This prompts us to look for eigenfunctions given by a linear combination of plane waves $\exp(i\langle \lambda, \xi_\sigma \rangle)$, $\sigma \in S_n$ with coefficients such that the boundary conditions in Proposition 2.1 are satisfied. Specifically, we will employ an $S_n$-invariant (in $\xi$) Bethe Ansatz wave function of the form
$$\Psi_\lambda(\xi) = \frac{1}{\delta(\xi)} \sum_{\sigma \in S_n} (-1)^\sigma\, C(\xi_\sigma)\, e^{i\langle \rho+\lambda, \xi_\sigma \rangle}, \qquad \xi \in 2\pi\, \mathrm{Int}(A), \tag{3.1a}$$
where $\xi_\sigma := \sigma(\xi)$, $(-1)^\sigma := \det(\sigma) = (-1)^{\ell(\sigma)}$,
$$\delta(\xi) := \prod_{\alpha \in R_+} \left(e^{i\langle \alpha, \xi \rangle/2} - e^{-i\langle \alpha, \xi \rangle/2}\right), \tag{3.1b}$$
and $\mathrm{Int}(A) = \{\xi \in E \mid \langle \xi, \alpha_0 \rangle < 1,\ \langle \xi, \alpha_j \rangle > 0,\ j = 1, \ldots, n-1\}$. (The condition that $\xi \in 2\pi\, \mathrm{Int}(A)$ guarantees that the denominator $\delta(\xi)$ is nonzero.)

Proposition 3.1 (Bethe Wave Function). The Bethe Ansatz wave function $\Psi_\lambda(\xi)$ (3.1a), (3.1b) satisfies the boundary reflection relations in Proposition 2.1 for $j = 1, \ldots, n-1$ provided that
$$C(\xi) = \prod_{\alpha \in R_+} (1 - t\, e^{-i\langle \alpha, \xi \rangle}) \tag{3.2}$$
(or a scalar multiple thereof ). Proof. Let λ ∈ P (m) and ν ∈ Sn (ωk ) such that λ + ν, α j = −1 for some j ∈ {1, . . . , n − 1}. Equating λ+ν (ξ ) =
1 (−1)σ C(ξ σ )eiρ +λ+ν ,ξ σ δ(ξ ) σ ∈Sn
to t r j (λ+ν ) (ξ ) = t λ+ν +α j (ξ ) t = (−1)σ C(ξ σ )eiα j ,ξ σ eiρ +λ+ν ,ξ σ δ(ξ ) σ ∈Sn
leads to the relation (−1)σ C(ξ σ τ ) = t σ ∈Sn,ρ+λ+ν
σ ∈Sn,ρ+λ+ν
(−1)σ C(ξ σ τ )eiα j ,ξ σ τ ∀τ ∈ Sn .
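Because $r_j$ stabilizes $\rho+\lambda+\nu$ (i.e. $r_j \in S_{n,\rho+\lambda+\nu}$), the latter relation can be rewritten as
\[
\sum_{\substack{\sigma\in S_{n,\rho+\lambda+\nu} \\ \sigma^{-1}(\alpha_j)\in R^+}} (-1)^\sigma \bigl[C(\xi_{\sigma\tau}) - C(r_j(\xi_{\sigma\tau}))\bigr]
= t \sum_{\substack{\sigma\in S_{n,\rho+\lambda+\nu} \\ \sigma^{-1}(\alpha_j)\in R^+}} (-1)^\sigma \bigl[C(\xi_{\sigma\tau})\,e^{i\langle\alpha_j,\xi_{\sigma\tau}\rangle} - C(r_j(\xi_{\sigma\tau}))\,e^{-i\langle\alpha_j,\xi_{\sigma\tau}\rangle}\bigr].
\]
By varying $\lambda$ and $\nu$ it is seen that this relation implies that
\[
C(\xi) - C(r_j(\xi)) = t\,\bigl[C(\xi)\,e^{i\langle\alpha_j,\xi\rangle} - C(r_j(\xi))\,e^{-i\langle\alpha_j,\xi\rangle}\bigr]
\]
(as an identity in $\xi$) for all reflections $r_j$, $j = 1,\dots,n-1$, or equivalently (assuming $C(\xi)$ is nontrivial in the sense that it does not vanish identically)
\[
\frac{C(\xi)}{C(r_j(\xi))} = \frac{1 - t\,e^{-i\langle\alpha_j,\xi\rangle}}{1 - t\,e^{i\langle\alpha_j,\xi\rangle}}, \qquad j = 1,\dots,n-1 .
\]
Hence $C(\xi)$ must be of the form
\[
C(\xi) = c_0(\xi) \prod_{\alpha\in R^+} \bigl(1 - t\,e^{-i\langle\alpha,\xi\rangle}\bigr),
\]
where $c_0(\xi)$ denotes an arbitrary $S_n$-invariant overall factor (i.e. $c_0(\xi_\sigma) = c_0(\xi)$, $\forall\sigma\in S_n$).

3.2. Bethe equations. By pulling the overall factor $\delta(\xi)$ inside the sum and exploiting the anti-invariance $\delta(\xi_\sigma) = (-1)^\sigma\delta(\xi)$, the Bethe wave function $\Psi_\lambda(\xi)$ (3.1a), (3.1b), with coefficients $C(\xi)$ taken from Eq. (3.2), passes over to
\[
\Psi_\lambda(\xi) = \sum_{\sigma\in S_n} e^{i\langle\lambda,\xi_\sigma\rangle} \prod_{\alpha\in R^+} \frac{1 - t\,e^{-i\langle\alpha,\xi_\sigma\rangle}}{1 - e^{-i\langle\alpha,\xi_\sigma\rangle}} .
\tag{3.3}
\]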
From this expression it is clear that, for $\lambda$ fixed, the Bethe wave function amounts to a Hall-Littlewood polynomial in the spectral parameter $\xi$ [M2, M3]. We will now derive conditions on the spectral parameter such that the Bethe wave function satisfies the boundary reflection relations in Eqs. (2.13a), (2.13b) for $j = 0$.

Proposition 3.2 (Bethe System). The Bethe Ansatz wave function $\Psi_\lambda(\xi)$ (3.3) satisfies the boundary reflection relations in Proposition 2.1 for $j = 0$ if the spectral parameter $\xi\in 2\pi\,\mathrm{Int}(A)$ solves the algebraic system
\[
e^{im\langle\beta,\xi\rangle} = \left(\frac{1 - t\,e^{i\langle\beta,\xi\rangle}}{e^{i\langle\beta,\xi\rangle} - t}\right)^{2} \prod_{\substack{\alpha\in R \\ \langle\alpha,\beta\rangle = 1}} \frac{1 - t\,e^{i\langle\alpha,\xi\rangle}}{e^{i\langle\alpha,\xi\rangle} - t}, \qquad \forall\beta\in R .
\tag{3.4}
\]
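Proof. Let us write $\hat{C}(\xi) := \prod_{\alpha\in R^+} \frac{1 - t\,e^{-i\langle\alpha,\xi\rangle}}{1 - e^{-i\langle\alpha,\xi\rangle}}$ and let $\lambda\in P^{(m)}$ and $\nu\in S_n(\omega_k)$ such that $\langle\lambda+\nu,\alpha_0\rangle = m+1$. Equating
\[
\Psi_{\lambda+\nu}(\xi) = \sum_{\sigma\in S_n} \hat{C}(\xi_\sigma)\, e^{i\langle\lambda+\nu,\xi_\sigma\rangle}
\]
to
\[
t\,\Psi_{r_0^{(m)}(\lambda+\nu)}(\xi) = t \sum_{\sigma\in S_n} \hat{C}(\xi_\sigma)\, e^{i\langle r_0^{(m)}(\lambda+\nu),\xi_\sigma\rangle}
\]
yields the relation
\[
\sum_{\sigma\in S_n} \bigl(1 - t\,e^{-i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(\xi_\sigma)\, e^{i\langle\lambda+\nu,\xi_\sigma\rangle} = 0,
\]
or equivalently
\[
\sum_{\substack{\sigma\in S_n \\ \sigma^{-1}(\alpha_0)\in R^+}} \Bigl[ \bigl(1 - t\,e^{-i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(\xi_\sigma)\, e^{i\langle\lambda+\nu,\xi_\sigma\rangle}
+ \bigl(1 - t\,e^{-i\langle\alpha_0,r_0^{(0)}(\xi_\sigma)\rangle}\bigr)\, \hat{C}(r_0^{(0)}(\xi_\sigma))\, e^{i\langle\lambda+\nu,r_0^{(0)}(\xi_\sigma)\rangle} \Bigr] = 0,
\]
where $r_0^{(0)}\in S_n$ denotes the orthogonal reflection $r_0^{(0)}(x) = x - \langle x,\alpha_0\rangle\alpha_0$. The latter equation translates to
\[
\sum_{\substack{\sigma\in S_n \\ \sigma^{-1}(\alpha_0)\in R^+}} \Bigl[ \bigl(1 - t\,e^{-i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(\xi_\sigma)
+ \bigl(1 - t\,e^{i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(r_0^{(0)}(\xi_\sigma))\, e^{-(m+1)i\langle\alpha_0,\xi_\sigma\rangle} \Bigr]\, e^{i\langle\lambda+\nu,\xi_\sigma\rangle} = 0,
\]
which is satisfied if
\[
\bigl(1 - t\,e^{-i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(\xi_\sigma) + \bigl(1 - t\,e^{i\langle\alpha_0,\xi_\sigma\rangle}\bigr)\, \hat{C}(r_0^{(0)}(\xi_\sigma))\, e^{-(m+1)i\langle\alpha_0,\xi_\sigma\rangle} = 0
\]
for all $\sigma\in S_n$, or equivalently (assuming $\hat{C}(\xi_\sigma)\neq 0$)
\[
e^{(m+1)i\langle\beta,\xi\rangle} = -\,\frac{1 - t\,e^{i\langle\beta,\xi\rangle}}{1 - t\,e^{-i\langle\beta,\xi\rangle}}\; \frac{\hat{C}(r_0^{(0)}(\xi_\sigma))}{\hat{C}(\xi_\sigma)}, \qquad \forall\sigma\in S_n,
\]
where $\beta := \sigma^{-1}(\alpha_0)$. The proposition now follows upon inserting
\[
\frac{\hat{C}(r_0^{(0)}(\xi_\sigma))}{\hat{C}(\xi_\sigma)} = -\prod_{\substack{\alpha\in R^+ \\ \langle\alpha,\alpha_0\rangle > 0}} \frac{1 - t\,e^{i\langle\alpha,\xi_\sigma\rangle}}{e^{i\langle\alpha,\xi_\sigma\rangle} - t}
= -\prod_{\substack{\alpha\in R \\ \langle\alpha,\beta\rangle > 0}} \frac{1 - t\,e^{i\langle\alpha,\xi\rangle}}{e^{i\langle\alpha,\xi\rangle} - t} .
\]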
4. Solution of the Bethe Equations

In this section the Bethe System in Proposition 3.2 is solved using a variational technique due to C.N. Yang and C.P. Yang [YY, Ma, G4, KBI].
4.1. Solution. The following theorem provides (the existence of) a sequence $\xi_\mu \in 2\pi\,\mathrm{Int}(A)$ of solutions to the Bethe system in Proposition 3.2 labeled by vectors (playing the role of quantum numbers) $\mu\in P^{(m)}$.

Theorem 4.1 (Bethe Vectors). For each $\mu\in P^{(m)}$ there exists a (unique) Bethe vector $\xi_\mu \in 2\pi\,\mathrm{Int}(A) = \{x\in E \mid \langle x,\alpha_0\rangle < 2\pi,\ \langle x,\alpha_j\rangle > 0,\ j = 1,\dots,n-1\}$ such that $\xi_\mu$ satisfies the system in Eq. (3.4). Moreover, these Bethe vectors have the following properties:
(i) $\xi_\mu = \xi_{\mu'}$ if and only if $\mu = \mu'$,
(ii) $\xi_\mu$ depends smoothly on the boundary parameter $t\in(-1,1)$,
(iii) $\xi_\mu = \frac{2\pi}{n+m}(\rho+\mu)$ for $t = 0$.

4.2. Proof. In standard coordinates the Bethe system of Proposition 3.2 reads
\[
e^{im(\xi_j-\xi_k)} = \prod_{\substack{1\le\ell\le n \\ \ell\neq j}} \frac{1 - t\,e^{i(\xi_j-\xi_\ell)}}{e^{i(\xi_j-\xi_\ell)} - t} \prod_{\substack{1\le\ell\le n \\ \ell\neq k}} \frac{1 - t\,e^{i(\xi_\ell-\xi_k)}}{e^{i(\xi_\ell-\xi_k)} - t},
\tag{4.1}
\]
for $1\le j\neq k\le n$. This overdetermined system of $n(n-1)$ equations in the variables $\xi_1,\dots,\xi_n$ is equivalent to the system of $n$ equations
\[
e^{im\xi_j} = c \prod_{\substack{1\le\ell\le n \\ \ell\neq j}} \frac{1 - t\,e^{i(\xi_j-\xi_\ell)}}{e^{i(\xi_j-\xi_\ell)} - t}, \qquad j = 1,\dots,n,
\tag{4.2}
\]
where $c\neq 0$ denotes an overall constant factor that we can scale to 1 by means of the translation $\xi_j \to \xi_j - i m^{-1}\log c$, $j = 1,\dots,n$. Picking thus $c = 1$ and taking the logarithm of both sides recasts Eq. (4.2) in the additive form
\[
m\,\xi_j + \sum_{\substack{1\le\ell\le n \\ \ell\neq j}} \theta(\xi_j - \xi_\ell) = 2\pi m_j, \qquad j = 1,\dots,n,
\tag{4.3}
\]
where $m = (m_1,\dots,m_n)\in\mathbb{Z}^n$ and
\[
\begin{aligned}
\theta(x) &:= (1-t^2)\int_0^x \bigl(1 - 2t\cos(x) + t^2\bigr)^{-1}\,dx && \text{(4.4a)} \\
&= 2\arctan\!\left(\frac{1+t}{1-t}\,\tan\frac{x}{2}\right) && \text{(4.4b)} \\
&= i\log\!\left(\frac{1 - t\,e^{ix}}{e^{ix} - t}\right). && \text{(4.4c)}
\end{aligned}
\]
Here the branches of the arctangent function and those of the logarithmic function are to be chosen in such a way that (i) $\theta(x)$ (4.4b), (4.4c) is quasi-periodic: $\theta(x+2\pi) = \theta(x) + 2\pi$, and (ii) $\theta(x)$ (4.4b), (4.4c) varies from $-\pi$ to $\pi$ as $x$ varies from $-\pi$ to $\pi$ (which corresponds to the principal branch). We notice that this choice of the branches ensures that $\theta(x)$ (4.4b), (4.4c) is smooth on the whole real axis and strictly monotonically increasing.
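As a numerical sanity check on Eqs. (4.4a)-(4.4c), the following sketch evaluates the three expressions for $\theta(x)$ on the principal branch ($-\pi < x < \pi$) and compares them. It is an illustration only: the function names, the midpoint quadrature, and the sample points are choices made here, not part of the paper.

```python
import cmath
import math


def theta_integral(x, t, steps=20000):
    """Eq. (4.4a): (1 - t^2) * int_0^x du / (1 - 2 t cos(u) + t^2), midpoint rule."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 1.0 / (1.0 - 2.0 * t * math.cos(u) + t * t)
    return (1.0 - t * t) * total * h


def theta_arctan(x, t):
    """Eq. (4.4b) on the principal branch, valid for -pi < x < pi."""
    return 2.0 * math.atan((1.0 + t) / (1.0 - t) * math.tan(x / 2.0))


def theta_log(x, t):
    """Eq. (4.4c) with the principal branch of the complex logarithm."""
    z = (1.0 - t * cmath.exp(1j * x)) / (cmath.exp(1j * x) - t)
    return (1j * cmath.log(z)).real


def max_discrepancy(t, xs):
    """Largest pairwise difference between the three forms over the points xs."""
    worst = 0.0
    for x in xs:
        a, b, c = theta_integral(x, t), theta_arctan(x, t), theta_log(x, t)
        worst = max(worst, abs(a - b), abs(b - c), abs(a - c))
    return worst


if __name__ == "__main__":
    points = [-3.0, -1.0, 0.0, 0.5, 2.5]
    for t in (0.5, -0.3, 0.0):
        print(t, max_discrepancy(t, points))
```

The discrepancies should be limited by the quadrature error of the midpoint rule; outside $(-\pi,\pi)$ the arctangent and logarithm forms need the quasi-periodic continuation described above.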
Lemma 4.2. For each $n$-tuple $m = (m_1,\dots,m_n)\in\mathbb{Z}^n$, there exists a unique vector $\xi(m) = (\xi_1(m),\dots,\xi_n(m))$ solving the system in Eq. (4.3) (with $\theta(x)$ of the form in Eqs. (4.4a)–(4.4c)). Furthermore, this solution $\xi(m)$ depends smoothly on the boundary parameter $t\in(-1,1)$.

Proof. Let
\[
V(\xi_1,\dots,\xi_n) := \frac{m}{2}\sum_{j=1}^n \xi_j^2 + \frac{1}{2}\sum_{j,k=1}^n \Theta(\xi_j - \xi_k) - 2\pi\sum_{j=1}^n m_j\,\xi_j,
\tag{4.5}
\]
where $\Theta(x) := \int_0^x \theta(x)\,dx$. Clearly the solution(s) of the system in Eq. (4.3) coincide(s) with the critical point(s) of the (smooth) function $V(\xi_1,\dots,\xi_n)$. The Hesse matrix of $V$ is given by
\[
H_{j,k} = \frac{\partial^2 V}{\partial\xi_j\,\partial\xi_k} = \Bigl(m + \sum_{\ell=1}^n \theta'(\xi_j - \xi_\ell)\Bigr)\delta_{j,k} - \theta'(\xi_j - \xi_k), \qquad 1\le j,k\le n,
\]
where $\theta'(x) = (1-t^2)(1 - 2t\cos(x) + t^2)^{-1} > 0$. It is readily seen that this Hesse matrix is positive definite:
\[
\sum_{j,k=1}^n H_{j,k}\,x_j x_k = m\sum_{j=1}^n x_j^2 + \frac{1}{2}\sum_{j,k=1}^n \theta'(\xi_j - \xi_k)(x_j - x_k)^2 \ \ge\ m\sum_{j=1}^n x_j^2 > 0
\]
(for any nonzero vector $x\in\mathbb{R}^n$). The function $V(\xi_1,\dots,\xi_n)$ is thus strictly convex, i.e., it admits at most one critical point: a global minimum. That such a global minimum $\xi(m)$ indeed exists in our case is immediate from the observation that $V(\xi_1,\dots,\xi_n)\to+\infty$ when $\xi\to\infty$. (Notice in this connection that $\Theta(x)\to+\infty$ for $x\to\pm\infty$.) We thus conclude that the system in Eq. (4.3) has a unique solution $\xi(m)$ (given by the global minimum of $V$). It remains to check that the position of this global minimum depends smoothly on the boundary parameter $t$. To this end we notice that the integrand of $\theta(x)$ (4.4a) (which, incidentally, coincides with the generating function for the Chebyshev polynomials) is analytic in $t$ for $|t| < 1$, and thus so are the function $V(\xi_1,\dots,\xi_n)$ and the system of equations for the global minimum in Eq. (4.3). The smoothness in $\xi_1,\dots,\xi_n$ and $t$, combined with the fact that the Hessian $\det(H_{j,k})$ is positive (i.e. nonvanishing), now guarantees that the solution $\xi(m)$ to the latter system must be smooth in $t\in(-1,1)$ by the implicit function theorem.

The following lemma shows that the ordering between the components of the solution $\xi(m)$ coincides with the ordering of the components of the labeling vector $m$.

Lemma 4.3. Let $m\in\mathbb{Z}^n$ and let $\xi(m)$ be the associated solution of Eq. (4.3) detailed in Lemma 4.2. Then for $m_j\ge m_k$ the following inequalities hold:
\[
\frac{2\pi(m_j - m_k)}{m + n\,\kappa_-(t)} \ \le\ \xi_j(m) - \xi_k(m) \ \le\ \frac{2\pi(m_j - m_k)}{m + n\,\kappa_+(t)},
\tag{4.6a}
\]
where $\kappa_\pm(t) := \frac{1-t^2}{(1\pm|t|)^2} > 0$. So, one has in particular that
\[
m_j > m_k \implies \xi_j(m) > \xi_k(m) \quad\text{and}\quad m_j = m_k \implies \xi_j(m) = \xi_k(m).
\tag{4.6b}
\]
Proof. Let $m\in\mathbb{Z}^n$ with $m_j\ge m_k$. Subtracting the $k$th equation from the $j$th equation of the system in Eq. (4.3) yields that
\[
m(\xi_j - \xi_k) + \sum_{\ell=1}^n \bigl[\theta(\xi_j - \xi_\ell) - \theta(\xi_k - \xi_\ell)\bigr] = 2\pi(m_j - m_k).
\tag{4.7}
\]
Since the r.h.s. of this identity is nonnegative and $\theta(x)$ is strictly monotonically increasing, it follows that $\xi_j(m)\ge\xi_k(m)$. Furthermore, from the formula $\theta(x) - \theta(y) = (1-t^2)\int_y^x (1 - 2t\cos(x) + t^2)^{-1}\,dx$ (cf. Eq. (4.4a)) it is immediate that $\kappa_+(t)(x-y) \le \theta(x) - \theta(y) \le \kappa_-(t)(x-y)$ for $x\ge y$. Applying these upper and lower bounds to the terms in the sum on the l.h.s. of Eq. (4.7) gives rise to the inequalities $(m + n\kappa_+(t))(\xi_j(m) - \xi_k(m)) \le 2\pi(m_j - m_k) \le (m + n\kappa_-(t))(\xi_j(m) - \xi_k(m))$, which completes the proof of Eq. (4.6a) (and thus also that of Eq. (4.6b)).
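The variational characterization of Lemma 4.2 and the bounds of Lemma 4.3 can be probed numerically. The sketch below minimizes $V$ (4.5) by plain gradient descent, using the fixed step $1/(m + n\kappa_-(t))$ suggested by the Hessian estimate above, and then checks the inequalities (4.6a). The solver, the function names, and the sample parameters are illustrative assumptions made here; the paper itself only uses the existence and uniqueness of the minimum.

```python
import math


def theta(x, t):
    """theta of Eqs. (4.4b)-(4.4c), continued so that theta(x + 2*pi) = theta(x) + 2*pi."""
    k = round(x / (2.0 * math.pi))
    y = x - 2.0 * math.pi * k
    principal = 2.0 * math.atan2((1.0 + t) * math.sin(y / 2.0),
                                 (1.0 - t) * math.cos(y / 2.0))
    return principal + 2.0 * math.pi * k


def kappa(t, sign):
    """kappa_{+/-}(t) = (1 - t^2) / (1 +/- |t|)^2 from Lemma 4.3 (sign = +1 or -1)."""
    return (1.0 - t * t) / (1.0 + sign * abs(t)) ** 2


def gradient(xi, mvec, m, t):
    """Gradient of V (4.5): l.h.s. minus r.h.s. of the additive Bethe system (4.3)."""
    n = len(xi)
    return [m * xi[j]
            + sum(theta(xi[j] - xi[l], t) for l in range(n) if l != j)
            - 2.0 * math.pi * mvec[j]
            for j in range(n)]


def solve_bethe(mvec, m, t, iterations=2000):
    """Minimize the strictly convex function V by fixed-step gradient descent."""
    n = len(mvec)
    step = 1.0 / (m + n * kappa(t, -1))  # Hessian eigenvalues lie in [m, m + n*kappa_-]
    xi = [2.0 * math.pi * mj / (m + n) for mj in mvec]
    for _ in range(iterations):
        grad = gradient(xi, mvec, m, t)
        xi = [x - step * g for x, g in zip(xi, grad)]
    return xi


def satisfies_lemma_4_3(xi, mvec, m, t, tol=1e-9):
    """Check the two-sided bound (4.6a) for every pair with m_j >= m_k."""
    n = len(xi)
    for j in range(n):
        for k in range(n):
            if mvec[j] >= mvec[k]:
                gap = 2.0 * math.pi * (mvec[j] - mvec[k])
                lower = gap / (m + n * kappa(t, -1))
                upper = gap / (m + n * kappa(t, +1))
                if not (lower - tol <= xi[j] - xi[k] <= upper + tol):
                    return False
    return True


if __name__ == "__main__":
    labels, m, t = [3, 1, 0], 4, 0.5
    xi = solve_bethe(labels, m, t)
    print(xi, satisfies_lemma_4_3(xi, labels, m, t))
```

For $t = 0$ one has $\kappa_\pm = 1$, so the two bounds in (4.6a) coincide and the differences $\xi_j(m) - \xi_k(m)$ equal $2\pi(m_j - m_k)/(n+m)$ exactly.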
The next lemma improves the upper bound on the distance between $\xi_j(m)$ and $\xi_k(m)$ stemming from Lemma 4.3 in the situation that the distance between $m_j$ and $m_k$ is smaller than $n+m$.

Lemma 4.4. Let $m\in\mathbb{Z}^n$ such that $m_j - m_k < n+m$ and let $\xi(m)$ be the associated solution of Eq. (4.3) detailed in Lemma 4.2. Then
\[
\xi_j(m) - \xi_k(m) < 2\pi.
\tag{4.8}
\]

Proof. Subtracting the $k$th equation from the $j$th equation of the system in Eq. (4.3) leads, upon recalling that $\theta(x)$ is odd, to (cf. Eq. (4.7))
\[
m(\xi_j - \xi_k) + \sum_{\ell=1}^n \bigl[\theta(\xi_j - \xi_\ell) + \theta(\xi_\ell - \xi_k)\bigr] = 2\pi(m_j - m_k).
\tag{4.9}
\]
If $\xi_j - \xi_k \ge 2\pi$, then the average of $\xi_j - \xi_\ell$ and $\xi_\ell - \xi_k$ is $\ge\pi$. Hence $\theta(\xi_j - \xi_\ell) + \theta(\xi_\ell - \xi_k) \ge 2\pi$, in view of the fact that $\theta(x)$ is strictly monotonically increasing and $\theta(\pi+x) + \theta(\pi-x) = 2\pi$. Plugging this estimate in Eq. (4.9) reveals that $\xi_j(m) - \xi_k(m) \ge 2\pi$ implies that $2\pi(m_j - m_k) \ge m(\xi_j(m) - \xi_k(m)) + 2\pi n \ge 2\pi(m+n)$, which completes the proof.

We will now piece the results of Lemmas 4.2–4.4 together, so as to arrive at a proof for Theorem 4.1. Let $m\in\mathbb{Z}^n$ such that
\[
m_1 > m_2 > \cdots > m_n \quad\text{and}\quad m_1 - m_n < m+n,
\tag{4.10a}
\]
and let $\xi(m)$ be the associated solution of Eq. (4.3) detailed in Lemma 4.2. It follows from Lemmas 4.3 and 4.4 that
\[
\xi_1(m) > \xi_2(m) > \cdots > \xi_n(m) \quad\text{and}\quad \xi_1(m) - \xi_n(m) < 2\pi.
\tag{4.10b}
\]
Let us define
\[
\mu := m - \frac{1}{n}\langle m,e\rangle\,e - \rho, \qquad \xi_\mu := \xi(m) - \frac{1}{n}\langle\xi(m),e\rangle\,e,
\tag{4.11}
\]
where (recall) $e = e_1 + \cdots + e_n$ and $\rho = \omega_1 + \cdots + \omega_{n-1}$. In other words, $\mu$ is the orthogonal projection of $m$ onto the center-of-mass hyperplane $E$ (2.1) translated by $-\rho$ and $\xi_\mu$ is the orthogonal projection of $\xi(m)$ onto $E$. The inequalities in Eqs. (4.10a) and (4.10b) ensure that $\mu\in P^{(m)}$ (2.11) and that $\xi_\mu\in 2\pi\,\mathrm{Int}(A)$ (cf. Eq. (2.4)), respectively. It is furthermore clear that by varying $m$ we can reach any lattice point $\mu\in P^{(m)}$. Indeed, for $\mu = k_1\omega_1 + \cdots + k_{n-1}\omega_{n-1}$ with $k_j\in\mathbb{Z}_{\ge 0}$ and $k_1 + \cdots + k_{n-1}\le m$ we may pick the components of $m$ equal to $m_j = \langle\mu+\rho,\alpha_j + \cdots + \alpha_{n-1}\rangle = k_j + \cdots + k_{n-1} + n - j$, $j = 1,\dots,n$. It is not difficult to check that the assignment $\mu\to\xi_\mu$ is indeed well-defined (i.e. $\mu = \mu' \Rightarrow \xi_\mu = \xi_{\mu'}$) and one-to-one (i.e. $\xi_\mu = \xi_{\mu'} \Rightarrow \mu = \mu'$). Indeed, one has that
\[
\xi_\mu = \xi_{\mu'} \iff \xi(m) - \xi(m')\in\mathbb{R}e \ \overset{\text{Eq. (4.3)}}{\iff}\ \xi(m) - \xi(m')\in 2\pi\mathbb{Z}e \ \overset{\text{Eq. (4.3)}}{\iff}\ m - m'\in\mathbb{Z}e \iff \mu = \mu'.
\]
Since it is obvious that $\xi_\mu$ inherits from $\xi(m)$ the smooth dependence on the boundary parameter $t$ and the property that its components solve the system in Eq. (4.2) (and thus the Bethe system in Eq. (4.1)), this proves Theorem 4.1 up to Property (iii). It remains to check Property (iii), which states that for $t = 0$ the Bethe vectors are given by $\xi_\mu = \frac{2\pi}{n+m}(\rho+\mu)$, $\mu\in P^{(m)}$. To this end we simply observe that Lemma 4.3 implies that for $t = 0$,
\[
\xi_j(m) - \xi_k(m) = \frac{2\pi}{n+m}(m_j - m_k),
\]
whence the statement follows by varying $m$ subject to the constraints in Eq. (4.10a) and projecting onto the center-of-mass plane with the aid of Eq. (4.11).

5. Diagonalization

In this section we will combine the results of Sects. 2–4 to arrive at an orthogonal basis for the Hilbert space $\mathcal{H}^{(m)} = \ell^2(P^{(m)},\Delta^{(m)})$, consisting of a complete set of joint eigenfunctions for the (commuting) Laplace operators $L_1^{(m)},\dots,L_{n-1}^{(m)}$.

5.1. Spectrum and eigenfunctions. The following theorem provides the eigenfunctions of our Laplace operators in terms of Hall-Littlewood polynomials specialized at the Bethe vectors $\xi_\mu$, $\mu\in P^{(m)}$.

Theorem 5.1 (Spectrum and Eigenfunctions). For special values of the spectral parameter, given by the Bethe vectors $\xi_\mu$, $\mu\in P^{(m)}$ in Theorem 4.1, the Bethe wave function $\Psi_\lambda(\xi)$ (3.3) constitutes an eigenfunction of the Laplace operator $L_k^{(m)}$ (2.12a), (2.12b), i.e. for any $k\in\{1,\dots,n-1\}$,
\[
L_k^{(m)}\Psi(\xi_\mu) = E_k(\xi_\mu)\,\Psi(\xi_\mu),
\tag{5.1a}
\]
where the eigenvalue is of the form
\[
E_k(\xi) = \sum_{\nu\in S_n(\omega_k)} \exp(i\langle\nu,\xi\rangle)
\tag{5.1b}
\]
(and $\Psi(\xi_\mu)\neq 0$).

Proof. Clearly the Hall-Littlewood polynomials $\Psi_\lambda(\xi)$ (3.3) satisfy the identity $\sum_{\nu\in S_n(\omega_k)} \Psi_{\lambda+\nu}(\xi) = E_k(\xi)\Psi_\lambda(\xi)$ (because all of the plane waves $\psi_\lambda(\xi_\sigma) = \exp(i\langle\lambda,\xi_\sigma\rangle)$, $\sigma\in S_n$ do so). Moreover, since the specialized Hall-Littlewood polynomials $\Psi_\lambda(\xi_\mu)$, $\mu\in P^{(m)}$ also satisfy the boundary convention in Eq. (2.12b) in view of Propositions 2.1, 3.1, 3.2 and Theorem 4.1, the stated eigenvalue equation follows. It remains to check that $\Psi_\lambda(\xi_\mu)$ does not vanish identically. For this purpose it is enough to observe that for $\lambda = 0$:
\[
\Psi_0(\xi) = \sum_{\sigma\in S_n}\prod_{\alpha\in R^+} \frac{1 - t\,e^{-i\langle\alpha,\xi_\sigma\rangle}}{1 - e^{-i\langle\alpha,\xi_\sigma\rangle}}
\overset{(i)}{=} \sum_{\sigma\in S_n} t^{\ell(\sigma)}
\overset{(ii)}{=} \prod_{\alpha\in R^+} \frac{1 - t^{1+\langle\rho,\alpha\rangle}}{1 - t^{\langle\rho,\alpha\rangle}},
\]
where we have used (i) a rational function identity and (ii) a product formula for the Poincaré series of the permutation group that are both due to Macdonald [M1] (cf. Theorem (2.8) and Corollary (2.5), respectively). It is clear from the product formula on the r.h.s. that $\Psi_0(\xi) > 0$ for $-1 < t < 1$, whence $\Psi_\lambda(\xi_\mu)$ indeed constitutes a true (i.e. nonzero) eigenfunction in $\mathcal{H}^{(m)}$.

5.2. Orthogonality and completeness. Theorem 5.1 provides as many eigenfunctions as the dimension of the Hilbert space (indeed, $\dim(\mathcal{H}^{(m)}) = \#P^{(m)} = \binom{n+m-1}{m}$). The following theorem confirms our expectation that these eigenfunctions actually form an orthogonal basis for the Hilbert space in question. Alternatively, one may think of this theorem as describing a novel system of discrete (dual) orthogonality relations for the Hall-Littlewood polynomials.

Theorem 5.2 (Orthogonality and Completeness). The Bethe wave functions
\[
\Psi(\xi_\mu), \qquad \mu\in P^{(m)}
\tag{5.2a}
\]
constitute an orthogonal basis of $\mathcal{H}^{(m)}$:
\[
\forall\mu,\mu'\in P^{(m)}:\qquad \langle\Psi(\xi_\mu),\Psi(\xi_{\mu'})\rangle_{(m)}
\begin{cases} = 0 & \text{if } \mu\neq\mu', \\ > 0 & \text{if } \mu = \mu'. \end{cases}
\tag{5.2b}
\]
Proof. Since $L_k^{(m)}$ and $L_{n-k}^{(m)}$ are each other's adjoints in $\mathcal{H}^{(m)}$ by Proposition 2.3, it is clear that $\langle L_k^{(m)}\Psi(\xi_\mu),\Psi(\xi_{\mu'})\rangle_{(m)} = \langle\Psi(\xi_\mu),L_{n-k}^{(m)}\Psi(\xi_{\mu'})\rangle_{(m)}$. By applying Theorem 5.1 and using the fact that $\overline{E_k(\xi)} = E_{n-k}(\xi)$, this equality is readily rewritten in the form
\[
\bigl(E_k(\xi_\mu) - E_k(\xi_{\mu'})\bigr)\,\langle\Psi(\xi_\mu),\Psi(\xi_{\mu'})\rangle_{(m)} = 0.
\tag{5.3}
\]
Theorem 4.1 now guarantees that for $\mu\neq\mu'$ the associated Bethe vectors $\xi_\mu$ and $\xi_{\mu'}$ are distinct in $2\pi\,\mathrm{Int}(A)$. Moreover, since the elementary symmetric polynomials
$E_1(\xi),\dots,E_{n-1}(\xi)$ separate the points of $2\pi\,\mathrm{Int}(A)$ (as they generate the full algebra of trigonometric polynomials on $2\pi A$ spanned by the $S_n$-invariant Fourier basis $\sum_{\mu\in S_n(\lambda)}\exp(i\langle\mu,\xi\rangle)$, $\lambda\in P$), this implies that in this situation $E_k(\xi_\mu)\neq E_k(\xi_{\mu'})$ for a certain value of $k\in\{1,\dots,n-1\}$. We thus conclude from Eq. (5.3) that the inner product $\langle\Psi(\xi_\mu),\Psi(\xi_{\mu'})\rangle_{(m)}$ must vanish if $\mu\neq\mu'$. Finally, for $\mu = \mu'$ the inner product yields the squared norm of the Bethe wave function $\Psi(\xi_\mu)$ in $\mathcal{H}^{(m)}$, which is positive as $\Psi(\xi_\mu)\neq 0$ by (the proof of) Theorem 5.1.

5.3. Integrability. From the previous results it is seen that our Laplace operators model a finite-dimensional quantum system that is integrable in the following sense.

Theorem 5.3 (Integrability). The Laplacians $L_1^{(m)},\dots,L_{n-1}^{(m)}$ (2.12a), (2.12b) constitute $n-1$ ($= \dim(E)$) mutually commuting operators in the Hilbert space $\mathcal{H}^{(m)}$. Furthermore, any operator $L : \mathcal{H}^{(m)}\to\mathcal{H}^{(m)}$ that commutes with all of the Laplacians $L_1^{(m)},\dots,L_{n-1}^{(m)}$ lies in the polynomial algebra $\mathbb{C}[L_1^{(m)},\dots,L_{n-1}^{(m)}]$.

Proof. The commutativity of $L_1^{(m)},\dots,L_{n-1}^{(m)}$ is immediate from the fact that the operators are simultaneously diagonalized by the basis $\Psi(\xi_\mu)$, $\mu\in P^{(m)}$ of $\mathcal{H}^{(m)}$ (cf. Theorems 5.1 and 5.2). The property that any operator $L : \mathcal{H}^{(m)}\to\mathcal{H}^{(m)}$ that commutes with $L_1^{(m)},\dots,L_{n-1}^{(m)}$ is necessarily algebraically dependent on $L_1^{(m)},\dots,L_{n-1}^{(m)}$ hinges on the fact that the eigenvalues $E_1(\xi_\mu),\dots,E_{n-1}(\xi_\mu)$ separate the elements of the eigenbasis $\Psi(\xi_\mu)$, $\mu\in P^{(m)}$ (cf. also the proof of Theorem 5.2). Indeed, it is immediate from this that $L$ is diagonalized by $\Psi(\xi_\mu)$, $\mu\in P^{(m)}$. In other words, there exists a function $E_L : \{\xi_\mu\}_{\mu\in P^{(m)}}\to\mathbb{C}$ such that
\[
L\,\Psi(\xi_\mu) = E_L(\xi_\mu)\,\Psi(\xi_\mu), \qquad \forall\mu\in P^{(m)}.
\tag{5.4a}
\]
Since the Bethe functions $\Psi(\xi_\mu)$, $\mu\in P^{(m)}$ form an orthogonal basis of $\mathcal{H}^{(m)}$, we have (by transposition) that the Hall-Littlewood polynomials $\Psi_\lambda(\xi)$, $\lambda\in P^{(m)}$ form a basis for the space of complex functions on the spectral set $\{\xi_\mu\}_{\mu\in P^{(m)}}$ upon specialization. In particular, there exist (unique) complex coefficients $c_\lambda$, $\lambda\in P^{(m)}$ such that
\[
E_L(\xi_\mu) = \sum_{\lambda\in P^{(m)}} c_\lambda\,\Psi_\lambda(\xi_\mu), \qquad \forall\mu\in P^{(m)}.
\tag{5.4b}
\]
Furthermore, from the well-known property that the elementary symmetric polynomials $E_1(\xi),\dots,E_{n-1}(\xi)$ (5.1b) generate the space of symmetric polynomials it is clear that there exists a polynomial $P_L\in\mathbb{C}[E_1,\dots,E_{n-1}]$ such that
\[
\sum_{\lambda\in P^{(m)}} c_\lambda\,\Psi_\lambda(\xi) = P_L(E_1(\xi),\dots,E_{n-1}(\xi)).
\tag{5.4c}
\]
It follows from Eqs. (5.4a)–(5.4c) and Theorem 5.1 that the operators $L$ and $P_L(L_1^{(m)},\dots,L_{n-1}^{(m)})$ coincide on the basis $\Psi(\xi_\mu)$, $\mu\in P^{(m)}$. Hence, we conclude that $L = P_L(L_1^{(m)},\dots,L_{n-1}^{(m)})\in\mathbb{C}[L_1^{(m)},\dots,L_{n-1}^{(m)}]$.
The Laplace operators $L_1^{(m)},\dots,L_{n-1}^{(m)}$ are not self-adjoint in general in view of Proposition 2.3. As a consequence, the spectrum in Theorem 5.1 is generally complex-valued. Within the commuting algebra $\mathbb{C}[L_1^{(m)},\dots,L_{n-1}^{(m)}]$ there exist however many operators that are self-adjoint. For example, the alternative generators
\[
L_{R,k}^{(m)} := \frac{1}{2}\bigl(L_k^{(m)} + L_{n-k}^{(m)}\bigr), \qquad k\in\{1,\dots,[n/2]\},
\tag{5.5a}
\]
\[
L_{I,k}^{(m)} := \frac{1}{2i}\bigl(L_k^{(m)} - L_{n-k}^{(m)}\bigr), \qquad k\in\{1,\dots,[(n-1)/2]\},
\tag{5.5b}
\]
are self-adjoint and have real spectrum of the form
\[
E_{R,k}(\xi_\mu) = \sum_{\nu\in S_n(\omega_k)} \cos(\langle\nu,\xi_\mu\rangle), \qquad \mu\in P^{(m)},
\tag{5.6a}
\]
\[
E_{I,k}(\xi_\mu) = \sum_{\nu\in S_n(\omega_k)} \sin(\langle\nu,\xi_\mu\rangle), \qquad \mu\in P^{(m)},
\tag{5.6b}
\]
respectively. The real subalgebra $\mathbb{R}[L_{R,1}^{(m)},\dots,L_{R,[n/2]}^{(m)},L_{I,1}^{(m)},\dots,L_{I,[(n-1)/2]}^{(m)}]$ consists of all operators $L : \mathcal{H}^{(m)}\to\mathcal{H}^{(m)}$ such that (i) $L$ commutes with all of the Laplacians $L_1^{(m)},\dots,L_{n-1}^{(m)}$ and (ii) $L$ is self-adjoint. One of the simplest positive operators in this real subalgebra is given by
\[
H^{(m)} := n\,\mathrm{Id} - L_{R,1}^{(m)}.
\tag{5.7}
\]
In standard coordinates the explicit action of this operator on an arbitrary wave function $\psi\in\mathcal{H}^{(m)}$ is of the form (cf. Proposition 2.2)
\[
(H^{(m)}\psi)_\lambda = n\,\psi_\lambda - \frac{1}{2}\sum_{\substack{1\le j\le n \\ \lambda+\nu_j\in P^{(m)}}} V_{j,+}(\lambda)\,\psi_{\lambda+\nu_j} - \frac{1}{2}\sum_{\substack{1\le j\le n \\ \lambda-\nu_j\in P^{(m)}}} V_{j,-}(\lambda)\,\psi_{\lambda-\nu_j},
\tag{5.8a}
\]
where
\[
V_{j,+}(\lambda) = \prod_{\substack{j<k\le n \\ \lambda_k = \lambda_j}} \frac{1 - t^{1+k-j}}{1 - t^{k-j}} \prod_{\substack{1\le k<j \\ \lambda_k = \lambda_j + m}} \frac{1 - t^{1+n+k-j}}{1 - t^{n+k-j}},
\tag{5.8b}
\]
\[
V_{j,-}(\lambda) = \prod_{\substack{1\le k<j \\ \lambda_k = \lambda_j}} \frac{1 - t^{1+j-k}}{1 - t^{j-k}} \prod_{\substack{j<k\le n \\ \lambda_k = \lambda_j - m}} \frac{1 - t^{1+n+j-k}}{1 - t^{n+j-k}},
\tag{5.8c}
\]
and $\nu_j = e_j - (e_1 + \cdots + e_n)/n$, $j = 1,\dots,n$ (so $\nu_1,\dots,\nu_n$ consist of the orthogonal projection of the standard basis $e_1,\dots,e_n$ onto the center-of-mass plane $E$ (2.1)). The spectrum of $H^{(m)}$ is built of positive eigenvalues $E(\xi_\mu)$, $\mu\in P^{(m)}$ with
\[
E(\xi) = \sum_{j=1}^n \bigl(1 - \cos(\xi_j)\bigr).
\tag{5.8d}
\]
The operator $H^{(m)}$ (5.8a)–(5.8c) serves as the Hamiltonian of our lattice $n$-particle model. Below we will verify that in a continuum limit this lattice Hamiltonian tends formally to the Hamiltonian of the $n$-particle delta Bose gas on the circle.
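The relations between the eigenvalues (5.1b), (5.6a) and (5.8d) can be illustrated numerically. The sketch below rests on one assumption that is not restated in this section: that $\omega_k = \nu_1 + \cdots + \nu_k$, so that for $\xi$ in the center-of-mass plane the orbit sum $E_k(\xi)$ becomes the $k$th elementary symmetric polynomial in $e^{i\xi_1},\dots,e^{i\xi_n}$. Under that assumption it checks the conjugation symmetry $\overline{E_k(\xi)} = E_{n-k}(\xi)$ used in the proof of Theorem 5.2 and the identity $n - \mathrm{Re}\,E_1(\xi) = E(\xi)$ behind Eqs. (5.7) and (5.8d). All function names are illustrative.

```python
import cmath
import math
from itertools import combinations


def project_to_center_of_mass(xi):
    """Orthogonal projection onto the hyperplane xi_1 + ... + xi_n = 0."""
    mean = sum(xi) / len(xi)
    return [x - mean for x in xi]


def eigenvalue(k, xi):
    """E_k(xi) of Eq. (5.1b), assuming the orbit of omega_k is indexed by k-subsets J."""
    return sum(cmath.exp(1j * sum(xi[j] for j in subset))
               for subset in combinations(range(len(xi)), k))


def hamiltonian_eigenvalue(xi):
    """E(xi) of Eq. (5.8d)."""
    return sum(1.0 - math.cos(x) for x in xi)


if __name__ == "__main__":
    xi = project_to_center_of_mass([0.7, -1.3, 2.1, 0.4])
    n = len(xi)
    for k in range(1, n):
        # conj(E_k) and E_{n-k} should agree up to rounding
        print(k, eigenvalue(k, xi).conjugate(), eigenvalue(n - k, xi))
    print(n - eigenvalue(1, xi).real, hamiltonian_eigenvalue(xi))
```

The conjugation symmetry relies on $e^{i(\xi_1 + \cdots + \xi_n)} = 1$, which is why the sample vector is first projected onto the center-of-mass plane.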
6. Continuum Limit

In this final section we first review the solution of the eigenvalue problem for the Laplacian in Eq. (1.2), with wave functions supported inside the alcove $A$ (1.3) subject to repulsive boundary conditions of the form in Eqs. (1.4a), (1.4b) (i.e. with $g > 0$). Our formulation amounts to the center-of-mass reduction of the seminal results due to Lieb and Liniger [LL] (Bethe wave functions), C.N. Yang and C.P. Yang [YY] (Bethe vectors), and Dorlas [Do] (orthogonality and completeness). Next we will show how this solution of the eigenvalue problem for the Laplacian in the alcove can be recovered from the corresponding solution of our discrete lattice model via a continuum limit.

6.1. Eigenfunctions. In the notation of Sect. 2 the eigenvalue problem in Eqs. (1.2)–(1.4b) reads
\[
-\Delta\psi = E\psi, \qquad x\in A,
\tag{6.1a}
\]
with
\[
\bigl(\langle\nabla_x\psi,\alpha_0\rangle + g\psi\bigr)\big|_{x\in E_0} = 0,
\tag{6.1b}
\]
\[
\bigl(\langle\nabla_x\psi,\alpha_j\rangle - g\psi\bigr)\big|_{x\in E_j} = 0, \qquad j = 1,\dots,n-1
\tag{6.1c}
\]
(where $\nabla_x$ refers to the gradient). Let us define
\[
C := \{\xi\in E \mid \langle\xi,\alpha_j\rangle > 0,\ j = 1,\dots,n-1\},
\tag{6.2a}
\]
\[
P^{(\infty)} := \{k_1\omega_1 + \cdots + k_{n-1}\omega_{n-1} \mid k_1,\dots,k_{n-1}\in\mathbb{Z}_{\ge 0}\}.
\tag{6.2b}
\]
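Theorem 6.1 (Bethe Wave Functions [LL]). The Bethe wave function [1]
\[
\Psi^{(\infty)}(x,\xi) = \sum_{\sigma\in S_n}\ \prod_{\alpha\in R^+} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle},
\tag{6.3a}
\]
with the spectral parameter $\xi\in C$ (6.2a) solving the Bethe system
\[
e^{i\langle\beta,\xi\rangle} = \left(\frac{ig + \langle\beta,\xi\rangle}{ig - \langle\beta,\xi\rangle}\right)^{2} \prod_{\substack{\alpha\in R \\ \langle\alpha,\beta\rangle = 1}} \frac{ig + \langle\alpha,\xi\rangle}{ig - \langle\alpha,\xi\rangle}, \qquad \forall\beta\in R,
\tag{6.3b}
\]
constitutes a solution to the eigenvalue problem in Eqs. (6.1a)–(6.1c) corresponding to the eigenvalue $E = E^{(\infty)}(\xi) := \langle\xi,\xi\rangle$.

[1] This explicit form of the expressions for the coefficients of the Bethe wave function is due to Gaudin [G3, G4].

It is instructive to recall briefly the essence of the proof of Lieb and Liniger in the present notation. Firstly, it is clear that the linear combination of plane waves $\Psi^{(\infty)}(x,\xi)$ constitutes an eigenfunction of $-\Delta$ with eigenvalue $E^{(\infty)}(\xi)$. It remains to check that the boundary conditions are also satisfied. The boundary condition in Eq. (6.1c) is inferred by the following computation for $x\in E_j$:
\[
\begin{aligned}
\langle\nabla_x\Psi^{(\infty)},\alpha_j\rangle
&= \sum_{\sigma\in S_n} i\langle\alpha_j,\xi_\sigma\rangle \prod_{\alpha\in R^+} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} \\
&= \sum_{\sigma\in S_n} \bigl(g + i\langle\alpha_j,\xi_\sigma\rangle\bigr) \prod_{\substack{\alpha\in R^+ \\ \alpha\neq\alpha_j}} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} \\
&\overset{(i)}{=} g \sum_{\sigma\in S_n}\ \prod_{\substack{\alpha\in R^+ \\ \alpha\neq\alpha_j}} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} \\
&\overset{(ii)}{=} g \sum_{\sigma\in S_n} \Bigl(1 - \frac{ig}{\langle\alpha_j,\xi_\sigma\rangle}\Bigr) \prod_{\substack{\alpha\in R^+ \\ \alpha\neq\alpha_j}} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} \\
&= g \sum_{\sigma\in S_n}\ \prod_{\alpha\in R^+} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} = g\,\Psi^{(\infty)},
\end{aligned}
\]
where in Steps (i) and (ii) one exploits that $\prod_{\alpha\in R^+,\,\alpha\neq\alpha_j} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}$ and $\langle\alpha_j,\xi_\sigma\rangle$ are symmetric and skew-symmetric, respectively, with respect to the action of $r_j$ on $\xi_\sigma$, combined with the symmetry $\langle x,r_j(\xi_\sigma)\rangle = \langle x,\xi_\sigma\rangle$ (since $r_j(x) = x$ if $x\in E_j$). Finally, the boundary condition in Eq. (6.1b) requires that for $x\in E_0$,
\[
\sum_{\sigma\in S_n} \bigl(g + i\langle\xi_\sigma,\alpha_0\rangle\bigr) \prod_{\alpha\in R^+} \frac{\langle\alpha,\xi_\sigma\rangle - ig}{\langle\alpha,\xi_\sigma\rangle}\; e^{i\langle x,\xi_\sigma\rangle} = 0.
\]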
Manipulations similar to those in the proof of Proposition 3.2 reveal that this relation holds when the spectral parameter solves the Bethe system in Eq. (6.3b).

Theorem 6.2 (Bethe Vectors [YY]). Let $g > 0$. For each $\mu\in P^{(\infty)}$ (6.2b) there exists a (unique) Bethe vector $\xi_\mu\in C$ (6.2a) such that $\xi_\mu$ satisfies the system in Eq. (6.3b). Moreover, these Bethe vectors have the following properties:
(i) $\xi_\mu = \xi_{\mu'}$ if and only if $\mu = \mu'$,
(ii) $\xi_\mu$ depends smoothly on the boundary parameter $g > 0$,
(iii) $\xi_\mu\to 2\pi(\rho+\mu)$ for $g\to+\infty$.

In standard coordinates the Bethe system in Eq. (6.3b) reads
\[
e^{i(\xi_j - \xi_k)} = \prod_{\substack{1\le\ell\le n \\ \ell\neq j}} \frac{ig + \xi_j - \xi_\ell}{ig - \xi_j + \xi_\ell} \prod_{\substack{1\le\ell\le n \\ \ell\neq k}} \frac{ig + \xi_\ell - \xi_k}{ig - \xi_\ell + \xi_k},
\tag{6.4}
\]
for $1\le j\neq k\le n$, or equivalently (upon exploiting the translational invariance)
\[
e^{i\xi_j} = \prod_{\substack{1\le\ell\le n \\ \ell\neq j}} \frac{ig + \xi_j - \xi_\ell}{ig - \xi_j + \xi_\ell}, \qquad j = 1,\dots,n.
\tag{6.5}
\]
In the additive form the latter system becomes
\[
\xi_j + \sum_{\substack{1\le\ell\le n \\ \ell\neq j}} \theta^{(\infty)}(\xi_j - \xi_\ell) = 2\pi m_j, \qquad j = 1,\dots,n,
\tag{6.6}
\]
with $m = (m_1,\dots,m_n)\in\mathbb{Z}^n$ and
\[
\begin{aligned}
\theta^{(\infty)}(x) &= 2g\int_0^x (x^2 + g^2)^{-1}\,dx && \text{(6.7a)} \\
&= 2\arctan\!\left(\frac{x}{g}\right) && \text{(6.7b)} \\
&= i\log\!\left(\frac{ig + x}{ig - x}\right). && \text{(6.7c)}
\end{aligned}
\]
It was shown in [YY] that for any $m\in\mathbb{Z}^n$ the Bethe system in Eqs. (6.6)–(6.7c) has a unique solution $\xi(m)$ given by the unique global minimum of the strictly convex function
\[
V^{(\infty)}(\xi_1,\dots,\xi_n) := \frac{1}{2}\sum_{j=1}^n \xi_j^2 + \frac{1}{2}\sum_{j,k=1}^n \Theta^{(\infty)}(\xi_j - \xi_k) - 2\pi\sum_{j=1}^n m_j\,\xi_j,
\tag{6.8}
\]
with $\Theta^{(\infty)}(x) := \int_0^x \theta^{(\infty)}(x)\,dx$. By projecting the solutions $\xi(m)$, corresponding to vectors $m\in\mathbb{Z}^n$ with $m_1 > m_2 > \cdots > m_n$, orthogonally onto the center-of-mass plane $E$ the statements of Theorem 6.2 readily follow (cf. also Sect. 4).

Theorem 6.3 (Orthogonality and Completeness [Do]). The Bethe wave functions $\Psi^{(\infty)}(x,\xi_\mu)$, $\mu\in P^{(\infty)}$ form an orthogonal basis for the Hilbert space $\mathcal{H}^{(\infty)} := L^2(A,dx)$ (with inner product $\langle\phi,\psi\rangle_{(\infty)} := \int_A \phi(x)\overline{\psi(x)}\,dx$), i.e. $\forall\mu,\mu'\in P^{(\infty)}$:
\[
\langle\Psi^{(\infty)}(\xi_\mu),\Psi^{(\infty)}(\xi_{\mu'})\rangle_{(\infty)}
\begin{cases} = 0 & \text{if } \mu\neq\mu', \\ > 0 & \text{if } \mu = \mu', \end{cases}
\tag{6.9a}
\]
and
\[
\langle\phi,\Psi^{(\infty)}(\xi_\mu)\rangle_{(\infty)} = 0,\ \forall\mu\in P^{(\infty)} \implies \phi = 0.
\tag{6.9b}
\]
Below we will infer that this center-of-mass reduction of Dorlas' orthogonality relations can be recovered via a continuum limit from the corresponding results pertaining to the discrete lattice model in Sect. 5. It was moreover shown by Dorlas that the orthogonality of the Bethe wave functions for the repulsive delta Boson gas implies their completeness [Do, Sect. 3]. In other words, the completeness in Theorem 6.3 follows from the orthogonality (upon a cosmetic adaptation of Dorlas' arguments to our center-of-mass situation).
6.2. Orthogonality. In order to perform the continuum limit let us from now on rescale the coupling parameter $t$ putting
\[
t = e^{-g/m}, \qquad g > 0.
\tag{6.10}
\]
For any $x$ in (the closure of) $C$ (6.2a) we define an integral approximation $[x]\in P^{(\infty)}$ (6.2b) of the form
\[
[x] := [\langle x,\alpha_1\rangle]\,\omega_1 + \cdots + [\langle x,\alpha_{n-1}\rangle]\,\omega_{n-1},
\tag{6.11}
\]
where $[x]$ denotes the integral part of $x\in\mathbb{R}_{\ge 0}$ obtained through truncation. With these notations we are in the position to embed the Hilbert space of lattice functions $\mathcal{H}^{(m)} = \ell^2(P^{(m)},\Delta^{(m)})$ into $L^2(C,dx)$ by means of a linear injection $J^{(m)} : \mathcal{H}^{(m)}\to L^2(C,dx)$ that associates to a lattice function $\phi : P^{(m)}\to\mathbb{C}$ a staircase function $J^{(m)}(\phi) : C\to\mathbb{C}$ of the form
\[
(J^{(m)}\phi)(x) :=
\begin{cases}
\sqrt{\Delta^{(m)}_{[mx]}}\;\phi_{[mx]} & \text{for } [mx]\in P^{(m)}, \\
0 & \text{for } [mx]\notin P^{(m)}.
\end{cases}
\tag{6.12}
\]
It is not difficult to see that the staircase function $J^{(m)}(\phi)$ has support on a bounded domain inside the dilated alcove $(1 + \frac{n}{m})A$. This support shrinks towards (a subset of) $A$ for $m\to\infty$. It is also not difficult to deduce from this definition that $\forall\phi,\psi\in\mathcal{H}^{(m)}$,
\[
\int_C (J^{(m)}\phi)(x)\,\overline{(J^{(m)}\psi)(x)}\,dx = c_{n,m}\sum_{\lambda\in P^{(m)}} \phi_\lambda\,\overline{\psi_\lambda}\,\Delta^{(m)}_\lambda,
\tag{6.13}
\]
where $c_{n,m} = \mathrm{Vol}(\omega_1,\dots,\omega_{n-1})/m^{n-1} = 1/(m^{n-1}\sqrt{n})$. Let $\Psi^{(m)}(x,\xi)$ be the staircase embedding of the Hall-Littlewood polynomial $\Psi_\lambda(\xi)$ (3.3)
\[
\begin{aligned}
\Psi^{(m)}(x,\xi) &:= (J^{(m)}\Psi(\xi))(x) \\
&= \sqrt{\Delta^{(m)}_{[mx]}}\;\Psi_{[mx]}(\xi) \\
&= \sqrt{\Delta^{(m)}_{[mx]}}\;\sum_{\sigma\in S_n} e^{i\langle[mx],\xi_\sigma\rangle} \prod_{\alpha\in R^+} \frac{1 - e^{-g/m}\,e^{-i\langle\alpha,\xi_\sigma\rangle}}{1 - e^{-i\langle\alpha,\xi_\sigma\rangle}} .
\end{aligned}
\tag{6.14}
\]
The following lemma states that, for $m\to\infty$, the rescaled staircase function $\Psi^{(m)}(x,\frac{1}{m}\xi)$ (6.14) converges pointwise to the Lieb–Liniger Bethe wave function $\Psi^{(\infty)}(x,\xi)$ (6.3a) when $x$ lies in the interior of $A$ and to zero when $x$ lies outside $A$.

Lemma 6.4. For any $\xi\in C$, one has that
\[
\lim_{m\to\infty} \Psi^{(m)}(x,\tfrac{1}{m}\xi) =
\begin{cases}
\Psi^{(\infty)}(x,\xi) & \text{if } x\in\mathrm{Int}(A), \\
0 & \text{if } x\in C\setminus A.
\end{cases}
\tag{6.15}
\]

Proof. The lemma readily follows from the explicit expression of the staircase wave function on the third line of Eq. (6.14), together with the observation that $\lim_{m\to\infty}\Delta^{(m)}_{[mx]} = 1$ if $x\in\mathrm{Int}(A)$ and $\lim_{m\to\infty}\Delta^{(m)}_{[mx]} = 0$ if $x\in C\setminus A$, and the fact that $\lim_{m\to\infty}\frac{1}{m}[mx] = x$.

Let us fix a $\mu\in P^{(\infty)}$ and pick $m$ sufficiently large so as to ensure that $\mu\in P^{(m)}$. We denote by $\xi_\mu^{(m)}$ and $\xi_\mu^{(\infty)}$ the associated Bethe vectors detailed in Theorem 4.1 and Theorem 6.2, respectively.

Lemma 6.5. For any $\mu\in P^{(\infty)}$, one has that
\[
\lim_{m\to\infty} m\,\xi_\mu^{(m)} = \xi_\mu^{(\infty)}.
\tag{6.16}
\]
Proof. Let $m_j := \langle\mu,\alpha_j\rangle + \cdots + \langle\mu,\alpha_{n-1}\rangle + n - j$, $j = 1,\dots,n$. Then $\xi_\mu^{(m)}$ and $\xi_\mu^{(\infty)}$ correspond to (the projections onto the center-of-mass plane of) the (unique) global minima of $V(\xi_1,\dots,\xi_n)$ (4.5) and $V^{(\infty)}(\xi_1,\dots,\xi_n)$ (6.8), respectively. The rescaled Bethe vector $m\,\xi_\mu^{(m)}$ thus corresponds to the global minimum of the function $V^{(m)}(\xi_1,\dots,\xi_n) := m\,V(\xi_1/m,\dots,\xi_n/m)$. The lemma now follows from the observation that for $m\to\infty$ the strictly convex function $V^{(m)}(\xi_1,\dots,\xi_n)$ tends to $V^{(\infty)}(\xi_1,\dots,\xi_n)$ uniformly on compacts (which implies in particular that the global minimum of $V^{(m)}$ converges to the global minimum of $V^{(\infty)}$).

The proof of the orthogonality in Theorem 6.3 now hinges on the following proposition.

Proposition 6.6. For all $\mu,\mu'\in P^{(\infty)}$, one has that
\[
\lim_{m\to\infty}\int_C \Psi^{(m)}(x,\xi_\mu^{(m)})\,\overline{\Psi^{(m)}(x,\xi_{\mu'}^{(m)})}\,dx
= \int_A \Psi^{(\infty)}(x,\xi_\mu^{(\infty)})\,\overline{\Psi^{(\infty)}(x,\xi_{\mu'}^{(\infty)})}\,dx.
\tag{6.17}
\]

Proof. It is clear from (the proof of) Lemma 6.4 and from Lemma 6.5 that the integrand and support of the integral on the l.h.s. converge pointwise to the integrand and support of the integral on the r.h.s. To see that the integrals themselves converge accordingly we write
\[
\int_C \Psi^{(m)}(x,\xi)\,\overline{\Psi^{(m)}(x,\xi')}\,dx
= \sum_{\sigma,\sigma'\in S_n} \hat{C}(\xi_\sigma)\,\hat{C}(-\xi'_{\sigma'}) \int_{(1+\frac{n}{m})A} e^{i\langle[mx],\,\xi_\sigma - \xi'_{\sigma'}\rangle}\,\Delta^{(m)}_{[mx]}\,dx,
\]
where $\hat{C}(\xi) = \prod_{\alpha\in R^+} \frac{1 - e^{-g/m}\,e^{-i\langle\alpha,\xi\rangle}}{1 - e^{-i\langle\alpha,\xi\rangle}}$. After substituting $\xi := \xi_\mu^{(m)}$ and $\xi' := \xi_{\mu'}^{(m)}$ the proposition follows for $m\to\infty$ upon invoking Lemma 6.5 and the dominated convergence theorem of Lebesgue. Indeed, one has that
\[
e^{i\langle\frac{1}{m}[mx],\,m\sigma(\xi_\mu^{(m)}) - m\sigma'(\xi_{\mu'}^{(m)})\rangle} \longrightarrow e^{i\langle x,\,\sigma(\xi_\mu^{(\infty)}) - \sigma'(\xi_{\mu'}^{(\infty)})\rangle}
\]
and
\[
\Delta^{(m)}_{[mx]} \longrightarrow
\begin{cases}
1 & \text{if } x\in\mathrm{Int}(A), \\
0 & \text{if } x\in C\setminus A,
\end{cases}
\]
pointwise for $m\to\infty$, and that
\[
\bigl|e^{i\langle\frac{1}{m}[mx],\,m\sigma(\xi_\mu^{(m)}) - m\sigma'(\xi_{\mu'}^{(m)})\rangle}\bigr| = 1, \qquad \bigl|\Delta^{(m)}_{[mx]}\bigr| \le 1.
\]

Proposition 6.6 can be rephrased as
\[
\langle\Psi^{(\infty)}(\xi_\mu^{(\infty)}),\Psi^{(\infty)}(\xi_{\mu'}^{(\infty)})\rangle_{(\infty)}
= \lim_{m\to\infty}\int_C (J^{(m)}\Psi(\xi_\mu^{(m)}))(x)\,\overline{(J^{(m)}\Psi(\xi_{\mu'}^{(m)}))(x)}\,dx.
\]
The r.h.s. of this limiting relation vanishes when $\mu\neq\mu'$ in view of Eq. (6.13) and Theorem 5.2, whence the orthogonality in Theorem 6.3 follows.
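The convergence of the rescaled Bethe vectors in Lemma 6.5 can be observed numerically. The sketch below solves the rescaled lattice system (the critical-point equations of $V^{(m)}(\xi) = m\,V(\xi/m)$ with $t = e^{-g/m}$) and the continuum system (6.6) by fixed-step gradient descent and compares the solutions before projection onto the center-of-mass plane. The solver, the step sizes, the function names, and the sample values of $g$ and of the labels are assumptions made for this illustration, not part of the paper.

```python
import math


def theta_lattice(x, t):
    """theta of Eq. (4.4b), continued so that theta(x + 2*pi) = theta(x) + 2*pi."""
    k = round(x / (2.0 * math.pi))
    y = x - 2.0 * math.pi * k
    principal = 2.0 * math.atan2((1.0 + t) * math.sin(y / 2.0),
                                 (1.0 - t) * math.cos(y / 2.0))
    return principal + 2.0 * math.pi * k


def theta_continuum(x, g):
    """theta^(infinity) of Eq. (6.7b)."""
    return 2.0 * math.atan(x / g)


def descend(gradient, start, step, iterations=3000):
    """Fixed-step gradient descent for a strictly convex function."""
    xi = list(start)
    for _ in range(iterations):
        grad = gradient(xi)
        xi = [x - step * gj for x, gj in zip(xi, grad)]
    return xi


def rescaled_lattice_solution(labels, m, g):
    """Minimizer of V^(m)(eta) = m * V(eta / m), i.e. m * xi(m), with t = exp(-g/m)."""
    n = len(labels)
    t = math.exp(-g / m)

    def gradient(eta):
        return [eta[j]
                + sum(theta_lattice((eta[j] - eta[l]) / m, t)
                      for l in range(n) if l != j)
                - 2.0 * math.pi * labels[j]
                for j in range(n)]

    # theta' <= (1 + t) / (1 - t) for 0 < t < 1, so the Hessian is bounded by 1/step.
    step = 1.0 / (1.0 + n * (1.0 + t) / ((1.0 - t) * m))
    return descend(gradient, [2.0 * math.pi * mj for mj in labels], step)


def continuum_solution(labels, g):
    """Minimizer of V^(infinity) (6.8), i.e. the solution of Eq. (6.6)."""
    n = len(labels)

    def gradient(xi):
        return [xi[j]
                + sum(theta_continuum(xi[j] - xi[l], g) for l in range(n) if l != j)
                - 2.0 * math.pi * labels[j]
                for j in range(n)]

    step = 1.0 / (1.0 + 2.0 * n / g)  # (theta^(infinity))' <= 2 / g
    return descend(gradient, [2.0 * math.pi * mj for mj in labels], step)


def limit_error(labels, m, g):
    """max_j | m * xi_j(m) - xi_j^(infinity) | for the unprojected solutions."""
    lattice = rescaled_lattice_solution(labels, m, g)
    continuum = continuum_solution(labels, g)
    return max(abs(a - b) for a, b in zip(lattice, continuum))


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, limit_error([2, 1, 0], m, 1.0))
```

The printed discrepancy should shrink as $m$ grows, in line with the uniform convergence of $V^{(m)}$ to $V^{(\infty)}$ on compacts used in the proof of Lemma 6.5.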
6.3. Hamiltonian. We will now wrap up by verifying briefly that formally the Hamiltonian $H^{(m)}$ (5.8a)–(5.8c) converges in the continuum limit to the Hamiltonian of the repulsive delta Bose gas on the circle. It is quite plausible that with a somewhat more in-depth analysis in the spirit of Ref. [R] one would be able to show that this convergence of the Hamiltonian is in fact in the strong resolvent sense, but we will not attempt to do so here. Let $\mathbf{H}^{(\infty)}$ be the self-adjoint extension in $\mathcal{H}^{(\infty)} = L^2(A,dx)$ of the Laplace operator $-\Delta$ with boundary conditions of the form in Eqs. (6.1b), (6.1c), and let $\mathbf{H}^{(m)}$ be the following rescaled staircase embedding of the operator $H^{(m)}$ (5.8a)–(5.8c) in $L^2(C,dx)$:
\[
\mathbf{H}^{(m)} = 2m^2\,J^{(m)} H^{(m)} (J^{(m)})^{-1}\,\Pi^{(m)},
\tag{6.18}
\]
where $\Pi^{(m)} : L^2(C,dx)\to L^2(C,dx)$ denotes the orthogonal projection onto the finite-dimensional subspace of staircase functions $J^{(m)}(\mathcal{H}^{(m)})\subset L^2(C,dx)$. It is clear that
\[
\mathbf{H}^{(\infty)}\,\Psi^{(\infty)}(\xi_\mu^{(\infty)}) = E^{(\infty)}(\xi_\mu^{(\infty)})\,\Psi^{(\infty)}(\xi_\mu^{(\infty)}),
\tag{6.19a}
\]
with $E^{(\infty)}(\xi) = \langle\xi,\xi\rangle$, and that
\[
\mathbf{H}^{(m)}\,\Psi^{(m)}(\xi_\mu^{(m)}) = E^{(m)}(\xi_\mu^{(m)})\,\Psi^{(m)}(\xi_\mu^{(m)}),
\tag{6.19b}
\]
where $E^{(m)}(\xi) := 2m^2 E(\xi)$ with $E(\xi)$ given by Eq. (5.8d). From Lemmas 6.4 and 6.5 it follows that $\lim_{m\to\infty}\Psi^{(m)}(x,\xi_\mu^{(m)}) = \Psi^{(\infty)}(x,\xi_\mu^{(\infty)})$ pointwise for $x\in\mathrm{Int}(A)$ and that $\lim_{m\to\infty}E^{(m)}(\xi_\mu^{(m)}) = E^{(\infty)}(\xi_\mu^{(\infty)})$. In other words, for $m\to\infty$ the eigenfunctions, the eigenvalues, and the eigenvalue equation for $\mathbf{H}^{(m)}$ in Eq. (6.19b) converge pointwise to the eigenfunctions, the eigenvalues, and the eigenvalue equation for $\mathbf{H}^{(\infty)}$ in Eq. (6.19a), respectively.

Acknowledgements. Thanks are due to M. Bustamante and to S.N.M. Ruijsenaars for several helpful discussions.
References [AFK] Albeverio, S., Fei, S.-M., Kurasov, P.: On integrability of many-body systems with point interactions. Operator Theory: Advances and Applications 132, 67–76 (2002) [A-H] Albeverio, S., Gesztesy, F., Høegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. Second Edition, Providence, R.I.: AMS Chelsea Publishing, 2004 [AK] Albeverio, S., Kurasov, P.: Singular Perturbations of Differential Operators. Cambridge: Cambridge University Press, 2000 [BT] Babbitt, D., Thomas, L.: Ground state representation of the infinite one-dimensional Heisenberg ferromagnet, II. An explicit Plancherel formula. Commun. Math. Phys. 54, 255–278 (1977) [B] Bourbaki, N.: Groupes et algèbres de Lie, Chapitres 4–6. Paris: Hermann, 1968 [CC] Caudrelier, V., Crampé, N.: Exact results for the one-dimensional many-body problem with contact interaction: Including a tunable impurity. http://arxiv.org/list/cond-mat/0501110 (2005) [Di] van Diejen, J.F.: On the Plancherel formula for the (discrete) Laplacian in a Weyl chamber with repulsive boundary conditions at the walls. Ann. Henri Poincaré 5, 135–168 (2004) [Do] Dorlas, T.C.: Orthogonality and completeness of the Bethe Ansatz eigenstates of the Nonlinear Schroedinger Model. Commun. Math. Phys. 154, 347–376 (1993) [G1] Gaudin, M.: Bose gas in one dimension, I. The closure property of the scattering wavefunctions. J. Math. Phys. 12, 1674–1676 (1971) [G2] Gaudin, M.: Bose gas in one dimension, II. Orthogonality of the scattering states. J. Math. Phys. 12, 1677–1680 (1971) [G3] Gaudin, M.: Boundary energy of a Bose gas in one dimension. Phys. Rev. A. 4, 386–394 (1971) [G4] Gaudin, M.: La Fonction d’Onde de Bethe. Paris: Masson, 1983
[Gu] Gutkin, E.: Integrable systems with delta-potential. Duke Math. J. 49, 1–21 (1982) [HLP] Hallnäs, M., Langmann, E., Paufler, C.: Generalized local interactions in 1D: solutions of quantum many-body systems describing distinguishable particles. J. Phys. A: Math. Gen. 38, 4957–4974 (2005) [HO] Heckman, G.J., Opdam, E.M.: Yang’s system of particles and Hecke algebras. Ann. Math. 145, 139-173 (1997); erratum ibid. 146, 749–750 (1997) [Hu] Humphreys, J.E.: Reflection Groups and Coxeter Groups. Cambridge: Cambridge University Press, 1990 [KBI] Korepin, V.E., Bogoliubov, N.M., Izergin, A.G.: Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press, 1993 [LL] Lieb, E.H., Liniger, W.: Exact analysis of an interacting Bose gas, I. The general solution and the ground state. Phys. Rev. (2) 130, 1605–1616 (1963) [M1] Macdonald, I.G.: The Poincaré series of a Coxeter group. Math. Ann. 199, 151–174 (1972) [M2] Macdonald, I.G.: Symmetric Functions and Hall Polynomials. Second Edition, Oxford: Clarendon Press, 1995 [M3] Macdonald, I.G.: Orthogonal polynomials associated with root systems. Sém. Lothar. Combin. 45, Art. B45a, 40 pp. electronic (2000/01) [Ma] Mattis (ed.), D.C.: The Many-Body Problem: An Encyclopedia of Exactly Solved Models in One Dimension. Singapore: World Scientific, 1994 [Mc] McGuire, J.B.: Study of exactly soluble one-dimensional N -body problems. J. Math. Phys. 5, 622–636 (1964) [O] Oxford, S.: The Hamiltonian of the Quantized Nonlinear Schrödinger Equation. Ph. D. Thesis, Los Angeles: UCLA, 1979 [R] Ruijsenaars, S.N.M.: The continuum limit of the infinite isotropic Heisenberg chain in its ground state representation. J. Funct. Anal. 39, 75–84 (1980) [S] Sutherland, B.: Beautiful Models: 70 Years of Exactly Solved Quantum Many-Body Problems. Singapore: World Scientific, 2004 [T] Thomas, L.: Ground state representation of the infinite one-dimensional Heisenberg ferromagnet. J. Math. Anal. Appl. 
59, 392–414 (1977) [YY] Yang, C.N., Yang, C.P.: Thermodynamics of a one-dimensional system of Bosons with repulsive delta-function interaction. J. Math. Phys. 10, 1115–1122 (1969) Communicated by B. Simon
Commun. Math. Phys. 267, 477–492 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0057-6
Communications in
Mathematical Physics
Domain and Range of the Modified Wave Operator for Schrödinger Equations with a Critical Nonlinearity Nakao Hayashi1 , Pavel I. Naumkin2 1 Department of Mathematics, Graduate School of Science, Osaka University, Osaka, Toyonaka, 560-0043,
Japan. E-mail: [email protected]
2 Instituto de Matemáticas UNAM Campus Morelia, AP 61-3 (Xangari), Morelia CP 58089, Michoacán,
Mexico. E-mail: [email protected] Received: 14 November 2005 / Accepted: 6 March 2006 Published online: 19 July 2006 – © Springer-Verlag 2006
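Abstract: We study the final problem for the nonlinear Schrödinger equation
\[ i\partial_t u + \tfrac{1}{2}\Delta u = \lambda |u|^{\frac{2}{n}} u, \quad (t, x) \in \mathbb{R} \times \mathbb{R}^n, \]
where $\lambda \in \mathbb{R}$, $n = 1, 2, 3$. If the final data $u_+ \in H^{0,\alpha} = \{\phi \in L^2 : (1 + |x|)^{\alpha} \phi \in L^2\}$ with $\frac{n}{2} < \alpha < \min\left(n, 2, 1 + \frac{2}{n}\right)$ and the norm $\|\hat u_+\|_{L^\infty}$ is sufficiently small, then we prove the existence of the wave operator in $L^2$. We also construct the modified scattering operator from $H^{0,\alpha}$ to $H^{0,\delta}$ with $\frac{n}{2} < \delta < \alpha$.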
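1. Introduction
In this paper we consider the modified wave operator for the nonlinear Schrödinger equation
\[ i\partial_t u + \tfrac{1}{2}\Delta u = \lambda |u|^{\frac{2}{n}} u, \quad (t, x) \in \mathbb{R} \times \mathbb{R}^n, \tag{1.1} \]
where $\lambda \in \mathbb{R}$, $n = 1, 2, 3$. Denote by $\mathcal{F}\phi$ or $\hat\phi$ the Fourier transform of $\phi$,
\[ \mathcal{F}\phi(\xi) = (2\pi)^{-\frac{n}{2}} \int_{\mathbb{R}^n} e^{-i(x\cdot\xi)} \phi(x)\, dx; \]
the inverse Fourier transform is denoted by $\mathcal{F}^{-1}$. Our purpose is to find the solutions of (1.1) satisfying
\[ \lim_{t\to+\infty} \left\| u(t) - (it)^{-\frac{n}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{\frac{2}{n}} \log t\right) \right\| = 0 \tag{1.2} \]
in $L^2$ under the conditions that the final data $u_+ \in H^{0,\alpha}$ with $\frac{n}{2} < \alpha < \min\left(n, 2, 1 + \frac{2}{n}\right)$ and the norm $\|\hat u_+\|_{L^\infty}$ is sufficiently small. Also we show the existence of the modified scattering operator from $H^{0,\alpha}$ to $H^{0,\delta}$, $\frac{n}{2} < \delta < \alpha$, under the smallness condition in $H^{0,\alpha}$.
Notation and function spaces. We let $\partial_j = \partial/\partial x_j$, $\partial^l = \partial_1^{l_1} \cdots \partial_n^{l_n}$, $l \in (\mathbb{N} \cup \{0\})^n$. $U(t)$ is the free Schrödinger evolution group defined by
\[ U(t)\phi = (2\pi i t)^{-\frac{n}{2}} \int_{\mathbb{R}^n} e^{\frac{i}{2t}|x-y|^2} \phi(y)\, dy = \mathcal{F}^{-1} e^{-\frac{it}{2}|\xi|^2} \mathcal{F}\phi = M(t) D(t) \mathcal{F} M(t)\phi, \]
where $M = M(t) = \exp\left(\frac{i|x|^2}{2t}\right)$ and $D(t)$ is the dilation operator
\[ (D(t)\phi)(x) = (it)^{-\frac{n}{2}} \phi\!\left(\tfrac{x}{t}\right). \]
We note that
\[ U(-t) = M(-t)\, i^n \mathcal{F}^{-1} D\!\left(\tfrac{1}{t}\right) M(-t), \]
since $(D(t))^{-1} = i^n D\!\left(\tfrac{1}{t}\right)$. By using the above identities we easily see that
\[ J(t) = U(t)\, x\, U(-t) = M(t)\, it\nabla\, M(-t) = x + it\nabla \]
and
\[ |J|^{\beta}(t) = U(t) |x|^{\beta} U(-t) = t^{\beta} M(t) (-\Delta)^{\frac{\beta}{2}} M(-t) \]
for $\beta \ge 0$.
We introduce some function spaces. The Lebesgue space is $L^p = \{\phi \in \mathcal{S}' ; \|\phi\|_{L^p} < \infty\}$, where $\|\phi\|_{L^p} = \left(\int_{\mathbb{R}^n} |\phi(x)|^p\, dx\right)^{1/p}$ if $1 \le p < \infty$ and $\|\phi\|_{L^\infty} = \operatorname{ess.sup}\{|\phi(x)| ; x \in \mathbb{R}^n\}$ if $p = \infty$. We denote by $W_p^{s,a}$ the weighted Sobolev space
\[ W_p^{s,a} = \left\{\phi \in \mathcal{S}' : \left\| \langle \cdot \rangle^{a} \langle i\partial_x \rangle^{s} \phi \right\|_{L^p} < \infty \right\} \]
for any $s, a \in \mathbb{R}$, $1 \le p \le \infty$, where $\langle x \rangle = (1 + |x|^2)^{\frac{1}{2}}$. In particular, we denote $H^{s,a} = W_2^{s,a}(\mathbb{R}^n)$, $W_p^{s} = W_p^{s,0}$ and $H^{s} = W_2^{s}$. By $\dot B^{s}_{p,q}$ we denote the homogeneous Besov space with the semi-norm
\[ \|\phi\|_{\dot B^{s}_{p,q}} = \left( \int_0^\infty x^{-1-\sigma q} \sup_{|y| \le x} \sum_{|\theta| \le [s]} \left\| \partial^{\theta} (\phi_y - \phi) \right\|_{L^p}^{q} dx \right)^{1/q}, \]
where $s = [s] + \sigma$, $0 < \sigma < 1$, $\phi_y(x) = \phi(x + y)$ and $[s]$ is the largest integer less than $s$. We let $C(I; E)$ be the space of continuous functions from an interval $I$ to a Banach space $E$. Different positive constants might be denoted by the same letter $C$.
We now state our results in this paper. In the next theorem we prove the existence of the modified wave operator $W_+ : u_+ \in H^{0,\alpha} \to u_0 \in H^{0,\beta}$.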
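The operator identities from the notation paragraph, J(t) = M(t) it∇ M(−t) = x + it∇ and (D(t))^{-1} = i^n D(1/t), can be spot-checked numerically. The following is a minimal sketch in one space dimension (n = 1), assuming a hypothetical Gaussian test function `phi` and a central-difference derivative; the helper names are illustrative and not part of the paper.

```python
import cmath

def M(t, x):
    """Multiplier M(t) = exp(i x^2 / (2t)) in one space dimension."""
    return cmath.exp(1j * x * x / (2.0 * t))

def phi(x):
    """Hypothetical smooth test function (a modulated Gaussian)."""
    return cmath.exp(-x * x + 0.5j * x)

def dphi(x):
    """Exact derivative of phi."""
    return (-2.0 * x + 0.5j) * phi(x)

def J_via_conjugation(t, x, h=1e-5):
    """M(t) (i t d/dx) [M(-t) phi], with the derivative taken by central differences."""
    g = lambda y: M(-t, y) * phi(y)
    dg = (g(x + h) - g(x - h)) / (2.0 * h)
    return M(t, x) * (1j * t) * dg

def J_direct(t, x):
    """(x + i t d/dx) phi, evaluated exactly."""
    return x * phi(x) + 1j * t * dphi(x)

def D(t, f, n=1):
    """Dilation (D(t) f)(x) = (it)^(-n/2) f(x/t), principal branch of the power."""
    return lambda x: (1j * t) ** (-n / 2.0) * f(x / t)

def D_inverse(t, f, n=1):
    """Candidate inverse i^n D(1/t), as stated in the text."""
    return lambda x: (1j) ** n * D(1.0 / t, f, n)(x)

if __name__ == "__main__":
    for t in (0.5, 2.0, 7.0):
        for x in (-1.0, 0.3, 1.2):
            print(t, x, abs(J_via_conjugation(t, x) - J_direct(t, x)))
```

The printed differences should be at the level of the finite-difference error, which is consistent with the commutation identity; the check covers only t > 0 and the principal branch of the complex power.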
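Theorem 1. We assume that $u_+ \in H^{0,\alpha}$ and $\|\hat u_+\|_{L^\infty} = \varepsilon$, where $\varepsilon$ is sufficiently small and $\frac{n}{2} < \alpha < \min\left(n, 2, 1 + \frac{2}{n}\right)$. Then there exists a unique global solution $u$ of (1.1) satisfying
\[ u \in C\left([0, \infty); L^2\right), \quad |J|^{\beta} u \in C\left([0, \infty); L^2\right), \]
where $\frac{n}{2} < \beta < \alpha$. Moreover the estimate
\[ \left\| U(-t) \left( u(t) - (it)^{-\frac{n}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{\frac{2}{n}} \log t\right) \right) \right\|_{H^{0,\delta}} \le C t^{-\frac{\beta-\delta}{2} - \mu} \]
is true for all $t > 0$, where $0 \le \delta \le \beta$, $\mu > 0$.
Next we show the existence of the operator $W_-^{-1}$ such that
\[ W_-^{-1} : u_0 \in H^{0,\beta} \to u_- \in H^{0,\delta}, \]
where $\frac{n}{2} < \delta < \beta < \min\left(2, 1 + \frac{2}{n}\right)$. Therefore we have the modified scattering operator
\[ S_+ = W_-^{-1} W_+ : H^{0,\alpha} \to H^{0,\delta}, \]
where $\frac{n}{2} < \delta < \alpha < \min\left(n, 2, 1 + \frac{2}{n}\right)$, provided that the norm $\|u_+\|_{H^{0,\alpha}}$ is sufficiently small.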
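Theorem 2. We assume that $u_0 \in H^{0,\beta}$ and $\|u_0\|_{H^{0,\beta}} = \varepsilon$, where $\varepsilon$ is sufficiently small and $\frac{n}{2} < \beta < 1 + \frac{2}{n}$. Then there exist unique functions $u_-, h_- \in H^{0,\delta}$ with $\frac{n}{2} < \delta < \beta$ satisfying
\[ \left\| (\mathcal{F}U(-t)u) \exp\left( i\lambda \left(\hat h_- + |t|^{-\chi}\right)^{\frac{1}{n}} \log |t| \right) - \hat u_- \right\|_{H^{\delta}} \le C \varepsilon^{1+\frac{2}{n}} |t|^{-\mu} \tag{1.3} \]
for all $t < 0$, with some $\mu > 2\chi > 0$, where $u(t)$ is a solution of (1.1) such that $u \in C\left((-\infty, 0]; L^2\right)$, $|J|^{\beta} u \in C\left((-\infty, 0]; L^2\right)$. Furthermore the asymptotic representation
\[ \left\| U(-t) \left( u(t) \exp\left( i\lambda \left(\hat h_-\!\left(\tfrac{\cdot}{t}\right) + |t|^{-\chi}\right)^{\frac{1}{n}} \log |t| \right) - (it)^{-\frac{n}{2}}\, \hat u_-\!\left(\tfrac{\cdot}{t}\right) \right) \right\|_{H^{0,\eta}} \le C \varepsilon^{1+\frac{2}{n}} |t|^{-\frac{n}{2} - \mu} \tag{1.4} \]
is true for all $t < 0$, where $\frac{n}{2} < \eta < \beta$ with some $\mu > 0$.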
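Our results are improvements of papers [3, 5, 9]. In Theorem 2 of [9], it was shown that for any $u_+ \in H^{0,3} \cap H^{1,2}$ with a smallness condition on $\|\hat u_+\|_{L^\infty}$, Eq. (1.1) has a unique solution $u \in C\left([0, \infty); H^{1,0}\right)$ such that
\[ \left\| u(t) - (it)^{-\frac{1}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{2} \log t\right) \right\|_{H^{1,0}} \le C t^{-\frac{b}{2}}, \]
with $1 < b < 2$ in the one dimensional case $n = 1$. In [3], the result of [9] was improved as follows: it was shown that for any $u_+ \in H^{0,3} \cap H^{1,2}$ with a smallness condition on $\|\hat u_+\|_{L^\infty}$, Eq. (1.1) has a unique solution $u \in C\left([0, \infty); H^{1,0} \cap H^{0,1}\right)$ such that
\[ \left\| u(t) - (it)^{-\frac{1}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{2} \log t\right) \right\|_{H^{1,0}} \le C t^{-1} \log^3 t, \]
and
\[ \left\| U(-t) \left( u(t) - (it)^{-\frac{1}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{2} \log t\right) \right) \right\|_{H^{0,1}} \le C t^{-1} \log^3 t. \]
The last estimate and the result of [7] enable us to define the modified scattering operator $S_+ : H^{0,3} \cap H^{1,2} \to L^2$ (see Corollary 2 in [3]). Their results required more smoothness conditions than ours, since their methods are based on the substitution of an approximate solution
\[ (it)^{-\frac{1}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{2} \log t\right) \]
into the free Schrödinger equation, which implies the second differentiability of $\hat u_+\!\left(\tfrac{\cdot}{t}\right)$. Note that by the method of paper [9] only the condition $u_+ \in H^{0,2}$ is required for constructing the modified wave operator. In order to get the result of Theorem 1 we use the factorization of $U(-t)$ and take $U(t)\mathcal{F}^{-1} \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right)$ as an approximate solution of $u$. By the identity
\[ U(t)\mathcal{F}^{-1} \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right) = (it)^{-\frac{n}{2}} e^{\frac{i|x|^2}{2t}}\, \hat u_+\!\left(\tfrac{\cdot}{t}\right) \exp\left(-i\lambda \left|\hat u_+\!\left(\tfrac{\cdot}{t}\right)\right|^{\frac{2}{n}} \log t\right) + M D \mathcal{F} (M - 1) \mathcal{F}^{-1} \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right) \]
we can see that the difference between the two approximate solutions is
\[ M D \mathcal{F} (M - 1) \mathcal{F}^{-1} \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right). \]
In the proof of Theorem 2 we take a modified approximate solution
\[ U(t)\mathcal{F}^{-1} \hat u_- \exp\left(-i\lambda \left(\hat h_- + |t|^{-\chi}\right)^{\frac{1}{n}} \log |t|\right) \]
to avoid the loss of differentiability.
The rest of the paper is organized as follows. In Sect. 2 we prove some preliminary estimates of the nonlinearity in the Sobolev space. Section 3 is devoted to the proof of Theorem 1. Then we prove Theorem 2 in Sect. 4.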
2. Lemmas
First we state the Sobolev imbedding inequality (see [4]).
Lemma 3. Let $q, r$ be any numbers satisfying $1 \le q, r \le \infty$, and let $j, m$ be any numbers satisfying $0 \le j < m$. If $\phi \in W_r^{m} \cap L^q$, then
\[ \left\| (-\Delta)^{\frac{j}{2}} \phi \right\|_{L^p} \le C \left\| (-\Delta)^{\frac{m}{2}} \phi \right\|_{L^r}^{a} \|\phi\|_{L^q}^{1-a}, \]
where $\frac{1}{p} = \frac{j}{n} + a\left(\frac{1}{r} - \frac{m}{n}\right) + \frac{1-a}{q}$ for all $a$ in the interval $\frac{j}{m} \le a \le 1$, where $C$ is a constant depending only on $n, m, j, q, r, a$, with the following exception: if $m - j - \frac{n}{r}$ is a nonnegative integer, then the above estimate holds for $a = \frac{j}{m}$.
We denote the fractional partial derivative $\partial_{x_j}^{\beta}$ for $\beta > 0$, $j = 1, 2, \ldots, n$, as follows $\partial_{x_j}^{\beta} \phi(x)$
2π = (1 − )
∞
−1− ∂xk j φ y j − φ y j dy j ,
0
where k = [β] , = β − k ∈ (0, 1) , φ y j = φ x1 , . . . , x j + y j , . . . , xn , is the Euler gamma function (see [1, 11]). Lemma 4. Let n2 < β < min n, 2, 1 + n2 . Then the estimates are true 2 2 φ |φ| n ˙ β ≤ C φLn ∞ φH˙ β , H 2 2 2 j φLn ∞ (log τ ) j φH˙ β , φ exp i |φ| n log τ β ≤ C 1 + ˙ H
and
j=1
2 |φ| φ − |ψ|2 ψ ˙ β ≤ C φ2L∞ + ψ2L∞ φ − ψH˙ β H + C φL∞ + ψL∞ φ − ψL∞ ψH˙ β
if n = 1. Also 2 2 n |φ| φ − |ψ| n ψ ˙ β H 2 ≤ C φLn ∞ φ − ψH˙ β + s 1−β φ − ψH˙ 1 2 + C φ − ψLn ∞ ψH˙ β + s 1−β ψH˙ 1 2 2 2 2 γ n n n n φ n +γ φH˙ β + φ n +γ ψH˙ β + ψ n +γ ψH˙ β + Cs ˙ 2 H
˙ 2 H
for all s > 0 if n = 2, 3, where 0 < γ ≤ min β − n2 , n2 1 + the right-hand sides are finite.
˙ 2 H
2 n
−β
, provided that
Proof. By the Taylor expansion 1 2 2 2 n n exp i |φ| log τ − 1 = i |φ| log τ exp iθ |φ| n log τ dθ. 0
Let us estimate
2 2 φ |φ| n exp iθ |φ| n log τ ˙ β H
for β > 1 and n = 2, 3. By a direct computation 2 2 2 ∇ φ |φ| n exp iθ |φ| n log τ = f (φ) exp i |φ| n log τ , where
f (φ) =
2 2 4 2 2 2i |φ| n −2 φ φ∇φ + φ∇φ log τ. + 1 |φ| n ∇φ + φ 2 |φ| n −2 ∇φ + n n n
By the Hölder inequality we find
2 f φ y exp iθ φ y n log τ − f (φ) exp iθ |φ| n2 log τ j j
L2
2 ≤ C f φ y j − f (φ)L2 + C log τ φ y j − φ n 2 p f (φ)L p Ln
with
1 p
+
1 p
= 1. Therefore 2 f (φ) exp iθ |φ| n log τ ˙ σ
B2,2
∞ 1 2 2 ≤ C x −1−2σ sup f φ y j − f (φ)L2 d x 0
|k|≤x
+C log τ
∞
x −1−2σ
0
1 2 n4 2 sup φ y j − φ 2 p f (φ)L p d x
|k|≤x
Ln
˙ σ is equivalent to for 0 < σ < 1. Since the norm of the homogeneous Sobolev space H that of the homogeneous Besov space B˙ σ2,2 (see [2]), then the first two estimates of the lemma follow by the method of proof of Lemma 3.4 in paper [6]. We now prove the last two estimates of the lemma. Since the norm of the homoge˙ β is equivalent to that of the homogeneous Besov space B˙ β (see neous Sobolev space H 2,2 [2]), we have
φH˙ β ≤ C φB˙ β
2,2
∞ 1 2 2 = x −1−2β sup φ y − φ L2 d x , 0
|y|≤x
where 0 < β < 1, ψ y (x) = ψ(x + y). For n = 1 we represent
2
2
φ y φ y − ψ y ψ y − |φ|2 φ + |ψ|2 ψ
2
≤ C φ y + ψ y + |φ| + |ψ| φ y − ψ y − (φ − ψ) + ψ y − ψ , then we get 2 |φ| φ − |ψ|2 ψ ˙ β H ∞ ≤ C x −1−2β sup
|y|≤x
0
1 2
2
2 2 2 2
φ y φ y − ψ y ψ y − |φ| φ + |ψ| ψ 2 d x L
≤ C φ2L∞ + ψ2L∞ φ − ψH˙ β + C φL∞ + ψL∞ φ − ψL∞ ψH˙ β . Thus the third estimate of the lemma is true. To prove the last estimate of the lemma we represent 2 2 2 2 2 2 n n |φ| n ∂x j (φ − ψ) + |φ| n − |ψ| n ∂x j ψ ∂x j |φ| φ − |ψ| ψ = 1 + n for n = 2, 3. Then we get
2 β−1 |φ| n ∂x j (φ − ψ) 2 ∂x j L ∞ 2 −β =C ∂x j φ y j − ψ y j − (φ − ψ) y j dy j |φ| n 0 ∞
2 −β 2
φ y n − |φ| n ∂x φ y − ψ y y dy j + j j j j j 0
L2
2 n
≤ C φL∞ φ − ψH˙ β ∞ 2 2 n
n + C φ y j − |φ| ∂x j φ y j − ψ y j 0
−β
L2
y j dy j .
By Lemma 3 we obtain 2 φ y n − |φ| n2 ∂x ϕ j j
L2
2 2 n
n ≤ C φ y j − |φ|
n
L β−1
ϕH˙ β
2 2 2 γ +β−1 φ n n +γ ϕH˙ β , ≤ C φ y j − φ Hn˙ σ ϕH˙ β ≤ C y jn
˙ 2 H
(2.1)
where σ = n2 − n2 (β − 1) ≥ 0, 0 < γ ≤ min β − n2 , n2 1 + Therefore we find ∞ 2 φ y n − |φ| n2 ∂x ϕ y −β dy j j j j
2 n
−β
, for n = 2, 3.
L2
0
≤ C φ
2 n
s n
˙ 2 +γ H
2 n γ −1
ϕH˙ β
yj
∞
2 n
dy j + C φL∞ ϕH˙ 1 s
0
≤ Cs
2 nγ
φ
2 n
n
−β
y j dy j
2 n
˙ 2 +γ H
ϕH˙ β + Cs 1−β φL∞ ϕH˙ 1
for all s > 0 so that
2 β−1 |φ| n ∂x j (φ − ψ) ∂x j
2 ≤ C φLn ∞ φ − ψH˙ β + s 1−β φ − ψH˙ 1
L2
2
2
+ Cs n γ φ n n +γ φ − ψH˙ β .
(2.2)
˙ 2 H
In the same manner
2 2 β−1 |φ| n − |ψ| n ∂x j ψ 2 ∂x j L ∞ −β 2 2 n n ≤ C |φ| − |ψ| ∂x j ψ − ψ y j y j dy j 0 L2 ∞
2
2 2 2 −β n n
|φ| n − φ y j − |ψ| n + ψ y j ∂x j ψ y j y j dy j +C 0
.
L2
For the first summand we have ∞ −β |φ| n2 − |ψ| n2 ψ − ψ y ∂ dy xj yj j j 0 L2 2 2 2 ≤ C |φ| n − |ψ| n ∞ ψH˙ β ≤ C φ − ψLn ∞ ψH˙ β . L
As in (2.1) we obtain ∞
2
2 2 2 −β n n
|φ| n − φ y j − |ψ| n + ψ y j ∂x j ψ y j y j dy j 0
2
2
ψH˙ β
≤ C φ n n +γ + ψ n n +γ ˙ 2 H
s
˙ 2 H
L2
2
y jn
γ −1
dy j
0 2
∞
+ C φ − ψLn ∞ ψH˙ 1 ≤ Cs
2 nγ
−β
y j dy j s
φ
2 n
n ˙ 2 +γ H
2
+ ψ n n +γ ˙ 2 H
2
ψH˙ β + Cs 1−β φ − ψLn ∞ ψH˙ 1 .
Then we find
2 2 β−1 |φ| n − |ψ| n ∂x j ψ ∂x j
L2
2 ≤ C φ − ψLn ∞ ψH˙ β + s 1−β ψH˙ 1 2 2 2 + Cs n γ φ n n +γ +ψ n n +γ ψH˙ β . ˙ 2 H
˙ 2 H
(2.3) By (2.2) and (2.3) the last estimate of the lemma follows. Lemma 4 is proved.
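3. Modified Wave Operator
We denote the first approximation for the solutions of (1.1) by
\[ u_1(t) = M(t) D(t) \hat w(t), \quad \hat w(t) = \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right). \]
The free Schrödinger evolution group can be decomposed as
\[ U(t)\phi = M(t) D(t) \hat\phi + R(t) \hat\phi, \]
where $R(t) = M(t) D(t) \mathcal{F} (M(t) - 1) \mathcal{F}^{-1}$. To prove Theorem 1 we define the following function space:
\[ X = \left\{ \phi \in C\left([T, \infty); L^2\right) ; \|\phi(t) - u_1(t)\|_X < \infty \right\} \]
with the norm
\[ \|\phi\|_X = \sup_{t \in [T, \infty)} \left( t^{\frac{\beta}{2} + \mu} \|\phi(t)\|_{L^2} + t^{\mu} \left\| |J|^{\beta} \phi(t) \right\|_{L^2} \right), \]
where $\frac{n}{2} < \beta < \alpha < \min\left(n, 2, 1 + \frac{2}{n}\right)$, and $\alpha - \beta > \mu > 0$ is sufficiently small. Multiplying both sides of (1.1) by $\mathcal{F}U(-t)$, we obtain
\[ i\partial_t (\mathcal{F}U(-t)u) = \lambda \mathcal{F}U(-t) |u|^{\frac{2}{n}} u. \tag{3.1} \]
Note that $\hat w(t) = \hat u_+ \exp\left(-i\lambda |\hat u_+|^{\frac{2}{n}} \log t\right)$ satisfies the equation
\[ i\partial_t \hat w = \frac{\lambda}{t} |\hat w|^{\frac{2}{n}} \hat w. \tag{3.2} \]
By (3.1) and (3.2) we have
\[ i\partial_t (\mathcal{F}U(-t)u - \hat w) = \lambda \mathcal{F}U(-t) \left( |u|^{\frac{2}{n}} u - \frac{1}{t} M D |\hat w|^{\frac{2}{n}} \hat w - \frac{1}{t} R |\hat w|^{\frac{2}{n}} \hat w \right) = \lambda \mathcal{F}U(-t) \left( |u|^{\frac{2}{n}} u - |u_1|^{\frac{2}{n}} u_1 \right) - \frac{\lambda}{t} \mathcal{F}U(-t) R |\hat w|^{\frac{2}{n}} \hat w. \tag{3.3} \]
Since
\[ \mathcal{F}U(-t)u - \hat w = \mathcal{F}U(-t) (u - u_1 - R\hat w), \]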
by integrating (3.3) in time and by using condition (1.2) we obtain
\[ u(t) - u_1(t) = -i\lambda \int_t^\infty U(t - \tau) \left( |u|^{\frac{2}{n}} u - |u_1|^{\frac{2}{n}} u_1 \right) d\tau + R\hat w + i\lambda \int_t^\infty U(t - \tau) R(\tau) |\hat w|^{\frac{2}{n}} \hat w\, \frac{d\tau}{\tau}. \tag{3.4} \]
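Equation (3.2), which drives the derivation of (3.3) and (3.4), can be checked pointwise in the frequency variable, because the modulus of the phase-corrected profile does not depend on t. The sketch below assumes a single hypothetical complex value `a` standing in for the Fourier transform of the final data at one frequency, and uses a central difference in t; it is an illustration, not part of the paper's argument.

```python
import cmath
import math

def w_hat(t, a, lam, n):
    """Phase-corrected profile a * exp(-i * lam * |a|^(2/n) * log t) at one frequency."""
    return a * cmath.exp(-1j * lam * abs(a) ** (2.0 / n) * math.log(t))

def residual_32(t, a, lam, n, h=1e-6):
    """Residual of i dw/dt - (lam/t) |w|^(2/n) w, with dw/dt by central differences."""
    dw = (w_hat(t + h, a, lam, n) - w_hat(t - h, a, lam, n)) / (2.0 * h)
    w = w_hat(t, a, lam, n)
    return 1j * dw - (lam / t) * abs(w) ** (2.0 / n) * w

if __name__ == "__main__":
    a, lam = 0.7 - 0.2j, 1.5  # hypothetical sample values
    for n in (1, 2, 3):
        print(n, abs(residual_32(3.0, a, lam, n)))
```

The residual should be at the level of the finite-difference error for each n, and the modulus of `w_hat` stays equal to that of `a`, which is why the logarithmic phase is the natural correction.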
Equation (3.4) is the integral equation for (1.1) with condition (1.2). Let us consider the linearized version of (3.4),
\[ u(t) - u_1(t) = -i\lambda \int_t^\infty U(t - \tau) \left( |v|^{\frac{2}{n}} v - |u_1|^{\frac{2}{n}} u_1 \right) d\tau + R\hat w + i\lambda \int_t^\infty U(t - \tau) R(\tau) |\hat w|^{\frac{2}{n}} \hat w\, \frac{d\tau}{\tau}, \tag{3.5} \]
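where $v \in X_\rho \equiv \{\phi \in X ; \|\phi\|_X \le \rho\}$ and $\rho \le C \|u_+\|_{H^{0,\alpha}}$. Since $R = M D \mathcal{F} (M - 1) \mathcal{F}^{-1}$, by Lemma 4 the remainder terms are estimated as
\[ \left\| R(t) (-\Delta)^{\frac{\delta}{2}} \hat w \right\|_{L^2} = \left\| (M - 1) \mathcal{F}^{-1} (-\Delta)^{\frac{\delta}{2}} \hat w \right\|_{L^2} \le C t^{-\frac{\alpha-\delta}{2}} \|\hat w\|_{H^{\alpha}} \le C\rho\, t^{-\frac{\alpha-\delta}{2}} \log^2 t \tag{3.6} \]
and
\[ \int_t^\infty \left\| R(\tau) (-\Delta)^{\frac{\delta}{2}} |\hat w|^{\frac{2}{n}} \hat w \right\|_{L^2} \frac{d\tau}{\tau} \le C \varepsilon^{\frac{2}{n}} \rho \int_t^\infty \tau^{-1-\frac{\alpha-\delta}{2}} \log^2 \tau\, d\tau \le C \varepsilon^{\frac{2}{n}} \rho\, t^{-\frac{\alpha-\delta}{2}} \log^2 t, \]
where $0 \le \delta \le \beta < \alpha$. Also by virtue of Lemma 3 we obtain
\[ \|v\|_{L^\infty} \le \|v - u_1\|_{L^\infty} + \|u_1\|_{L^\infty} \le C t^{-\frac{n}{2}} \left\| |J|^{\beta} (v - u_1) \right\|_{L^2}^{\frac{n}{2\beta}} \|v - u_1\|_{L^2}^{1-\frac{n}{2\beta}} + C t^{-\frac{n}{2}} \|\hat u_+\|_{L^\infty} \le C t^{-\frac{n}{2}} \left( \rho t^{-\mu} + \varepsilon \right) \tag{3.7} \]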
since v ∈ Xρ . Then by (3.5) the L2 -norm can be estimated as
u (t) − u 1 (t)L2 ≤ C
∞ 2 2 vLn ∞ + u 1 Ln ∞ v − u 1 L2 dτ t
+ R w L2
∞ 2 w| n w + C R (τ ) |
L2
t
≤ Cε
2 n
∞
2 dτ + Cρ n τ
v − u 1 L2 t
+Cρt
− α2
log t ≤ Cρt 2
dτ τ
∞ v − u 1 L2 t
dτ τ 1+µ
− β2 −µ
(3.8)
for all t ≥ T if T > 0 is sufficiently large. β Note that |J |β R (τ ) = R (τ ) (−) 2 . Then multiplying (3.5) by |J |β = t β M (t) β (−) 2 M (−t), we obtain ∞
β
|J | (u (t) − u 1 (t)) = −iλ
2 2 U (t − τ ) |J (τ )|β |v| n v − |u 1 | n u 1 dτ
t β 2
∞
β
2
w| n w U (t − τ ) R (τ ) (−) 2 |
+ R (t) (−) w + iλ t
dτ . τ
Then by (3.6) and (3.7) we find
β |J | (u (t) − u 1 (t))
L2
∞ 2 2 ≤ |J |β |v| n v − |u 1 | n u 1
L2
dτ + Cρt −
α−β 2
log2 t.
t
(3.9) Applying Lemma 4 we have β |J | |v|2 v − |u 1 |2 u 1 2 L
2
2 β
= Cτ Mv Mv − Mu 1 Mu 1 β ˙ H β −1 −µ |J | (v − u 1 ) L2 + v − u 1 L∞ ε + ρτ ≤ Cρτ ≤ Cρτ −1−µ ε + ρτ −µ
(3.10)
in the case n = 1 and also 2 2 β |J | |v| n v − |u 1 | n u 1 2 L
n2
n2 β
= Cτ Mv Mv − Mu 1 Mu 1
˙β H
2 ≤ C vL∞ |J |β (v − u 1 )L2 + C v − u 1 Ln ∞ |J |β u 1 L2 2 2 1−β β−1 n n vL∞ J (v − u 1 )L2 + v − u 1 L∞ J u 1 L2 + Cs τ 2 n
2
2
+ Cs n γ τ −1− n γ
2 2 n n n n |J | 2 +γ v 2 |J |β v L2 + |J | 2 +γ v 2 |J |β u 1 L2 L L 2 n n + |J | 2 +γ u 1 2 |J |β u 1 L2 (3.11) L
for all s > 0 if n = 2, 3, where 0 < γ ≤ min β − n2 , n2 1 + by using the estimates
2 n
−β
. Since v ∈ Xρ ,
n 1− n n v − u 1 L∞ ≤ Cτ − 2 |J |β (v − u 1 )L2β2 v − u 1 L2 2β ≤ Cρτ − 2 −µ− 2 (β− 2 ) n
and J (v − u 1 )L2 ≤ Cρτ −µ−
β−1 2
1
n
, we get from (3.11)
2 2 β |J | |v| n v − |u 1 | n u 1 2 L 2 2 2 2 2 n −1−µ −µ n n ε +ρ τ + Cρ 1+ n τ −1− n µ− n (β− 2 ) ≤ Cρτ β−1 2 2 2 n + Cρ 1+ n s 1−β τ β−1 τ −1−µ− 2 + τ −1− n µ− n (β− 2 ) 2 2 2 2 2 + Cρ 1+ n s n γ τ −1− n γ ≤ Cρτ −1−µ ε n + ρ n τ −µ
(3.12)
if we take s = τ 1−ν , γ ν ≥ nµ and (β − 1) ν + 2µ ≤ n1 β − n2 in the cases n = 2, 3. (For example, we can choose ν = γ4 and µ = ν 2 .) Then by virtue of (3.10) and (3.12) we find from (3.9), |J (t)|β (u (t) − u 1 (t)) 2 ≤ Cρ L
∞
2 α−β 2 τ −1−µ ε n + ρ n τ −µ dτ + Cρt − 2 log2 t
t
≤ Cρτ −µ .
(3.13)
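The example parameter choice ν = γ/4, µ = ν² mentioned after (3.12) can be sanity-checked by direct arithmetic. The sketch below assumes the two side conditions read γν ≥ nµ and (β − 1)ν + 2µ ≤ (1/n)(β − n/2), which is how the extracted text is interpreted here, and takes γ at the largest value allowed by 0 < γ ≤ min(β − n/2, (n/2)(1 + 2/n − β)); the function names are hypothetical.

```python
def admissible_gamma(n, beta):
    """Largest gamma allowed by 0 < gamma <= min(beta - n/2, (n/2)(1 + 2/n - beta))."""
    return min(beta - n / 2.0, (n / 2.0) * (1.0 + 2.0 / n - beta))

def check_choice(n, beta):
    """Check both side conditions for the example choice nu = gamma/4, mu = nu^2."""
    gamma = admissible_gamma(n, beta)
    nu = gamma / 4.0
    mu = nu ** 2
    first = gamma * nu >= n * mu                                      # gamma*nu >= n*mu
    second = (beta - 1.0) * nu + 2.0 * mu <= (beta - n / 2.0) / n     # assumed reading
    return gamma > 0.0 and first and second

if __name__ == "__main__":
    for n in (2, 3):
        upper = min(n, 2.0, 1.0 + 2.0 / n)
        for k in range(1, 10):
            beta = n / 2.0 + k / 10.0 * (upper - n / 2.0)
            print(n, round(beta, 4), check_choice(n, beta))
```

Under these assumptions the choice passes on a sample grid of β in (n/2, min(n, 2, 1 + 2/n)) for n = 2, 3; the first condition reduces to n ≤ 4, so it is independent of β.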
In view of (3.8) and (3.13) we find that there exists a time $T$ such that $u \in X_\rho$. In the same manner we can prove the estimate
\[ \|u - \tilde u\|_X \le \tfrac{1}{2} \|v - \tilde v\|_X, \]
where $\tilde u$ is defined by (3.5) with $v$ replaced by $\tilde v$. Therefore (3.5) defines a contraction mapping. Hence there exists a unique global solution $u \in C\left([T, \infty); L^2\right)$ of the integral equation (3.4) satisfying the estimate
\[ \|u(t) - u_1(t)\|_{L^2} \le C t^{-\frac{n}{4} - \mu}. \]
Arguing in the same way as in the proof of [12] we can extend the existence time to zero. Theorem 1 is proved.
4. Modified Scattering Operator
To prove Theorem 2 let us consider the Cauchy problem for Eq. (1.1) with initial data $u_0 \in H^{0,\beta}$ with $\frac{n}{2} < \beta < 1 + \frac{2}{n}$ and with sufficiently small norm $\varepsilon = \|u_0\|_{H^{0,\beta}}$. In [7] it was proved that there exists a unique global solution $u$ of the Cauchy problem for Eq. (1.1) satisfying $U(-t)u \in C\left((-\infty, 0]; H^{0,\beta}\right)$, and the following estimates: $\|u(t)\|_{L^2} \le \|u_0\|_{L^2}$, |J (t)|β u (t)L2 ≤ Cε |t|
(4.1)
2
for all t ≤ 0, where = Cε n > 0 is small. From estimates (4.1) and the identity u (t) = MDFU (−t) u (t) + MDF (M − 1) U (−t) u (t) by Lemma 3 it follows that n
uL∞ ≤ Ct − 2 FU (−t) uL∞ n β 1− n n 2β +Ct − 2 (−) 2 F (M − 1) U (−t) u 2 F (M − 1) U (−t) uL2 2β ≤ Ct
− n2
FU (−t) uL∞ + Ct
n
L n − n2 − β2 1− 2β n
β
≤ Ct − 2 FU (−t) uL∞ + Ct − 4 − 2
n 1− n |J |β u L2β2 |x|β U (−t) u L2 2β β |J | u 2 . (4.2) L
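By (3.1) we have for the function $\hat w(t) = \mathcal{F}U(-t)u(t)$,
\[ \hat w_t = -\frac{i\lambda}{t} |\hat w|^{\frac{2}{n}} \hat w + R_1 + R_2, \tag{4.3} \]
where the remainder terms are
\[ R_1 = -\frac{i\lambda}{t} \left( |\mathcal{F}Mw|^{\frac{2}{n}} \mathcal{F}Mw - |\hat w|^{\frac{2}{n}} \hat w \right) \]
and
\[ R_2 = -\frac{i\lambda}{t} \mathcal{F} \left( M^{-1} - 1 \right) \mathcal{F}^{-1} |\mathcal{F}Mw|^{\frac{2}{n}} \mathcal{F}Mw. \]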
By using Lemma 4 we have 2 2 R1 H˙ δ = |t|−1 |FMw| n FMw − | w| n w δ ˙ H −1 2 2 FMwL∞ + ≤ C |t| w L∞ F (M − 1) wH˙ δ wH˙ δ + C |t|−1 (FMwL∞ + w L∞ ) F (M − 1) wL∞ n n 2− ≤ C |t|−1 (M − 1) wH˙ 0,δ |J |β u Lβ 2 uL2 β n
n
1−
2δ + C |t|−1 (M − 1) wH2δ ˙ 0,δ (M − 1) wL2 n δ 1− n 1− δ × |J |β u L2β2 uL2 2β |J |β u Lβ 2 uL2 β
n −1− β−δ 2 + 1+ β
≤ Cε |t| if n = 1. Also
2 2 R1 H˙ δ = C |t|−1 |FMw| n FMw − | w| n w δ ˙ H 2 ≤ C |t|−1 FMwLn ∞ (M − 1) wH˙ 0,δ + s 1−β (M − 1) wH˙ 0,1 2 + C |t|−1 F (M − 1) wLn ∞ wH˙ 0,δ + s 1−β wH˙ 0,1 2
2
+ C |t|−1 s n γ w n 0, n +γ wH˙ 0,δ ˙ H
2
for all s > 0 if n = 2, 3, where 0 < γ ≤ min β − n2 , n2 1 + estimates FMwL∞ ≤ Cε |t| wH˙ 0,α ≤ Cε |t|
n 2β α β
2 n
−β
. Then using the
n − β2 1− 2β +
, F (M − 1) wL∞ ≤ Cε |t|
, (M − 1) wH˙ 0,α ≤ Cε |t|
for 0 ≤ α ≤ β we get R1 H˙ δ ≤ C |t|−1−
β−δ n +2
1 n 2 + C |t|−1+2 s 1−β |t|− n (β− 2 ) + s n γ
≤ C |t|−1−
β−δ n +2
+ C |t|−1−
if we take s = t −ν , with ν =
1 n
,
− β−α 2 +
(β− n2 )
β−1+ 2γ n
2γ n
ν+2
.
Also by using Lemma 4 we have 2 R2 H˙ δ ≤ C |t|−1 F M−1 − 1 F −1 |FMw| n FMw δ ˙ H 2 β−δ β−δ 2 ≤ C |t|−1− 2 |FMw| n FMw β ≤ C |t|−1− 2 FMwLn ∞ wH˙ 0,β ˙ H
2 n 1+ βn − wH˙ 0,β wLn 2 β n −1− β−δ 2 + 1+ β
−1− β−δ 2
≤ C |t|
≤ Cε |t|
.
≤ C |t|−1−
β−δ 2
2 n β 1+ βn − |J | u 2 u n 2 β L L
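Thus we have
\[ \|R_1\|_{\dot H^{\delta}} + \|R_2\|_{\dot H^{\delta}} \le C\varepsilon |t|^{-1-\mu} \tag{4.4} \]
for $\frac{n}{2} < \delta < \beta$ and some $0 < \mu < \frac{\beta-\delta}{n}$. Multiplying both sides of Eq. (4.3) by $\overline{\hat w}$ and taking the real part of the result we get for $h(t) = |\hat w(t)|^2$,
\[ h_t = 2\,\mathrm{Re}\left( \overline{\hat w} (R_1 + R_2) \right), \]
hence integrating with respect to time we find
\[ h(t) - h(s) = 2\,\mathrm{Re} \int_s^t \overline{\hat w} (R_1 + R_2)\, d\tau. \]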
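The step from Eq. (4.3) to the formula for h_t uses the fact that the leading nonlinear term is purely imaginary once multiplied by the complex conjugate of the profile, so it drops out of the real part. A small sketch of this cancellation at a single frequency follows; the sample values and the function names are hypothetical, and `R` stands in for the sum of the two remainder terms.

```python
def leading_term(w, t, lam, n):
    """Leading term of (4.3): -(i * lam / t) * |w|^(2/n) * w."""
    return -1j * lam / t * abs(w) ** (2.0 / n) * w

def dh_dt(w, t, lam, n, R):
    """d/dt |w|^2 = 2 Re( conj(w) * w_t ), with w_t = leading term + R."""
    w_t = leading_term(w, t, lam, n) + R
    return 2.0 * (w.conjugate() * w_t).real

if __name__ == "__main__":
    w, t, lam = 0.3 + 0.4j, -5.0, 2.0   # hypothetical sample values, t < 0 as in Sect. 4
    R = 0.01 - 0.02j                     # stand-in for R1 + R2
    for n in (1, 2, 3):
        print(n, dh_dt(w, t, lam, n, 0j), dh_dt(w, t, lam, n, R))
```

With R = 0 the derivative of the squared modulus vanishes up to rounding, and with a nonzero R only the remainder contributes, in line with the identity stated above.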
By estimate (4.4) we have h (t) − h t
t Hδ
≤ Cε
2
|τ |−1−µ+ dτ ≤ Cε2 |t|−µ
t
for all t < t < −1. Then we see that there exists a unique limit h − ∈ Hδ such that h (t) − h − Hδ ≤ Cε2 |t|−µ
(4.5)
for all t < −1.
1 Multiplying both sides of Eq. (4.3) by E (t) ≡ exp iλ h − + |t|−χ n log |t| , with
a small χ > 0 we get ∂t ( w E) = F,
(4.6)
where
1 1 F (t) = −iλ t −1 |h (t)| n − ∂t h − + |t|−χ n log |t| w E + (R1 + R2 ) E.
Note that by Lemma 4 we find −1 1 1 −χ n n − ∂t |h F (t)Hδ ≤ C |t| |t| t h + log w E (t)| − + (R1 + R2 ) EHδ ≤ Cε
1+ n2
−1−µ+2χ +
|t|
Hδ
.
Therefore integrating (4.6) with respect to time, we obtain t w t E t −w (t) E (t) Hδ = F (τ ) dτ
Hδ
t
2
t
≤ Cε1+ n
|τ |−1−µ+2χ + dτ
t
≤ Cε
1+ n2
|t|2χ +−µ
(4.7)
for all t < t < −1. Since the norm u 0 H0,β is sufficiently small, then we see that there δ exists a unique function u − ∈ H such that 2
1+ 2χ +−µ w (t) E (t) − u − Hδ ≤ Cε n |t|
for all t < −1. This implies (1.3). Furthermore the asymptotic representation (1.4) is true. Theorem 2 is proved. References 1. Bateman, H., Erdelyi, A.: Tables of Integral Transforms. N.Y.: McGraw-Hill Book Co., 1954, pp. 343 2. Bergh, J., Löfström, J.: Interpolation spaces. An introduction. Berlin-N.Y.: Springer-Verlag, 1976, pp. 207 3. Carles, R.: Geometric optics and Long range scattering for one dimensional nonlinear Schrödinger equations. Commun. Math. Phys. 220, 41–67 (2001) 4. Friedman, A.: Partial Differential Equations. New York: Holt-Rinehart and Winston, 1969 5. Ginibre, J., Ozawa, T.: Long range scattering for nonlinear Schrödinger and Hartree equations in space dimension n ≥ 2. Commun. Math. Phys. 151, 619–645 (1993) 6. Ginibre, J., Ozawa, T., Velo, G.: On the existence of the wave operators for a class of nonlinear Schrödinger equations. Ann. Inst. H. Poincaré Phys. Théor. 60(2), 211–239 (1994) 7. Hayashi, N., Naumkin, P.I.: Asymptotics in large time of solutions to nonlinear Schrödinger and Hartree equations. Amer. J. Math. 120, 369–389 (1998) 8. Hayashi, N., Tsutsumi, Y.: Remarks on the scattering problem for nonlinear Schrödinger equations. In: Knowles, I.W., Saito, Y. (eds.) Differential Equations and Mathematical Physics, Lecture Note in Mathematics. Berlin-Heidelberg-New York: Springer-Verlag, 1285, 1986, pp. 162–168 9. Ozawa, T.: Long range scattering for nonlinear Schrödinger equations in one space dimension. Commun. Math. Phys. 139, 479–493 (1991) 10. Shimomura, A., Tonegawa, S.: Long range scattering for nonlinear Schrödinger equations in one and two space dimensions. Differ. Int. Eqs. 17, 127–150 (2004) 11. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. 30, Princeton, NJ: Princeton Univ. Press, 1970 12. Tsutsumi, Y.: L2 -solutions for nonlinear Schrödinger equations and nonlinear groups. Funkcialaj Ekvacioj 30, 115–125 (1987) Communicated by P. Constantin
Commun. Math. Phys. 267, 493–513 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0070-9
Communications in
Mathematical Physics
On the Volumorphism Group, the First Conjugate Point is Always the Hardest Stephen C. Preston Department of Mathematics, Stony Brook University, Stony Brook, NY 11794-3651, USA. E-mail: [email protected] Received: 2 December 2005/Accepted: 23 March 2006 Published online: 10 August 2006 – © Springer-Verlag 2006
Much of this work was completed while the author was a Lecturer at the University of Pennsylvania. The author is grateful for their hospitality.
Abstract: We find a simple local criterion for the existence of conjugate points on the group of volume-preserving diffeomorphisms of a 3-manifold with the Riemannian metric of ideal fluid mechanics, in terms of an ordinary differential equation along each Lagrangian path. Using this criterion, we prove that the first conjugate point along a geodesic in this group is always pathological: the differential of the exponential map always fails to be Fredholm.
1. Introduction
The theory of ideal (inviscid) incompressible fluid mechanics is one of the most mathematically beautiful theories in physics. This is partly because one does not need any parameters to describe the system: as soon as one has a compact Riemannian manifold M, possibly with boundary, one can construct the volume-preserving diffeomorphism group Dµ (M), and on this infinite-dimensional manifold define a Riemannian metric using the kinetic energy integral. Arnold [A] showed that the geodesics of this metric on Dµ (M) are precisely the ideal incompressible fluid flows on M, in the Lagrangian coordinate description. Thus, once one gets past the fairly serious technical issues of functional analysis involved in constructing a topology on Dµ (M) and proving that the geodesic equations are well-posed (as accomplished by Ebin and Marsden [EMa]), one has essentially reduced much of ideal fluid mechanics to a study of geometry. Of course, this does not automatically solve the outstanding problems of fluid mechanics, but it does give a different context to them. For example, the most significant open problem of ideal fluid dynamics is global existence of solutions in a 3-manifold. From the geometric point of view, this is precisely geodesic completeness. We may thus hope that a better understanding of the geometry
of Dµ (M) may help in clarifying this problem, or in suggesting new techniques for its solution. Researchers are only beginning to develop this subject, although much progress has been made in recent years, and it is likely to remain an interesting field of study for some time to come. Conjugate points on Dµ (M) have been of interest ever since Arnold [A] computed the sectional curvature on Dµ (T2 ), found that it was usually negative but sometimes positive, and asked whether one could find conjugate points. Computational difficulties prevented much progress in this direction, until Misiołek [M1] proved that one could construct some simple examples of conjugate points in Dµ (S 3 ) along geodesics corresponding to rigid rotations of the 3-sphere. Here, positive curvature on the underlying manifold M helps one obtain positive curvature on Dµ (M), which leads to the conjugate points. More surprisingly (and far more difficult computationally), Misiołek [M2] showed that conjugate points exist on Dµ (T2 ), using an example similar to the one about which Arnold had asked. Since the work of Misiołek, substantial progress has been made in understanding conjugate points on Dµ (M). For example, Shnirelman [Sh] proved that the diameter of Dµ (M 3 ) is finite for any 3-D manifold M, and using this result and the generalized flows of Brenier [B], showed that there must be “local cut points” along any sufficiently long geodesic in Dµ (M 3 ): that is, there is an arbitrarily close path joining the two points which is strictly shorter. In finite dimensions, such points must be conjugate; on the infinite-dimensional manifold Dµ (M 3 ), this is not necessarily true. No such result is possible in the 2-D case, since the diameter of Dµ (M 2 ) is infinite. More recently, Ebin, Misiołek, and the author [EMP] studied the nature of the differential of the exponential map. 
Singularities of d exp are precisely the conjugate points, so the nature of conjugate points tells us much about the structure of the exponential map. The map d exp is a mapping from one infinite-dimensional space to another, and its singularities may be of two types: failure to be injective, and failure to be surjective. (For finite-dimensional mappings, both types always coincide.) Grossman [G] called these singularities monoconjugate points and epiconjugate points, respectively. The authors of [EMP] showed that in Dµ (M 2 ), both types of conjugate points coincide and are of finite order, because the exponential map is Fredholm. On the other hand, [EMP] also showed that in Dµ (M 3 ), it is possible to have an epiconjugate point that is not a monoconjugate point. Their explicit example is the solid flat torus D 2 × S 1 , where the geodesic η is rigid, unit speed rotation of the disc. Here η(π ) is the first conjugate point; it is epiconjugate but not monoconjugate. In addition, for every ε > π , there is a to ∈ (π, π + ε) such that η(to ) is monoconjugate to η(0). The present research will demonstrate that this phenomenon is actually quite typical on three-manifolds. So the structure of conjugate points on Dµ (M 3 ) is in general much more complicated than on Dµ (M 2 ) or on a finite-dimensional Riemannian manifold. In Sect. 3, we explain why it has often been easier to find conjugate points in three dimensions than in two. It turns out that one can construct local approximations of Jacobi fields supported near any point in three dimensions, and use these to search for genuine Jacobi fields. In Theorem 3.1, we prove that one can construct a divergence-free vector field in a neighborhood of any Lagrangian path interior to M 3 , such that the index form along a geodesic in Dµ (M 3 ) can be approximated by a corresponding index form along this path. (This construction is not possible on a two-dimensional manifold.) 
The approximate index form comes from the following simple equation along a Lagrangian path:
The First Conjugate Point
$$\frac{D^2 y}{dt^2} + R(y, \dot\eta)\dot\eta + \nabla_y \nabla p = 0. \tag{1.1}$$
Here p is the pressure function, and the path η(t)(x) in the interior of M satisfies the Lagrangian form of the ideal fluid equation:
$$\frac{D\dot\eta}{dt} = -\nabla p. \tag{1.2}$$
Equation (1.1) is simply the linearization in M 3 of Eq. (1.2). If Eq. (1.1) has a solution y(t) vanishing at times t = 0 and t = a, then the geodesic η has a Jacobi field vanishing at t = 0 and some t = b, with b arbitrarily close to a. The criterion of Theorem 3.1 yields a very simple condition for conjugate points, which is easiest to apply when we are dealing with a steady solution X of the 3-D Euler equations. We find that for any steady solution X with a certain type of fixed point at some x (for example, an elliptic fixed point), there must be a monoconjugate point somewhere along the geodesic, and we can compute its location in terms of p(x). This is a purely three-dimensional phenomenon: in two dimensions, there are many steady flows that have elliptic fixed points but do not have any conjugate points along the corresponding geodesic, because the curvature operator is nonpositive in all directions. See [P2] for details. Although Theorem 3.1 applies only in three (or possibly higher) dimensions, it has a sort of converse that is true in dimension two or higher. This converse, Proposition 3.6, states that if a geodesic has a monoconjugate point at η(a) for some t = a, then along some Lagrangian path, Eq. (1.1) must have a solution vanishing at t = 0 and some t = α ≤ a. We apply this for some simple two-dimensional flows (rotational fields on rotationally-symmetric surfaces) and obtain a new criterion for them not to have monoconjugate points. A previous result of the author [P2] gives a different criterion, and we show that the two criteria are distinct with examples. The results of Theorem 3.1 and Proposition 3.6 are quite reminiscent of the results of Friedlander and Vishik [FV], though the method of proof is very different. They found ordinary differential equations such that exponential growth of their solutions implies exponential growth of linearized Euler perturbations, and hence of Jacobi fields along geodesics. 
Our result on conjugate points is loosely related to positive curvature on Dµ , while their result on exponential growth of Jacobi fields is loosely related to negative curvature on Dµ . But although the connection between curvature and Jacobi fields is subtle (as discussed in [P1]), both results show that the most important features of Jacobi fields in Dµ (M 3 ) are determined by certain ordinary differential equations along an arbitrary Lagrangian path. Theorem 3.1 implies that the first conjugate point along a geodesic in Dµ (M 3 ) is always pathological: we prove in Theorem 4.3 that the differential of the exponential map has either infinite-dimensional kernel or does not have closed range in the L 2 topology. (If M is a surface, the differential of the exponential map always has both finite-dimensional kernel and closed range, by the Fredholmness result of [EMP]. This is a very striking and surprising difference between two-dimensional and three-dimensional fluid mechanics.) A complete classification of the possible behaviors at the first conjugate point would be quite interesting. Finally we give an example of a geodesic in Dµ (M 3 ) such that monoconjugate points are all of infinite order and dense in an interval. Misiołek [M1] showed that if X is a unit-length left-invariant vector field on S 3 , then the corresponding geodesic η has a
S. C. Preston
Jacobi field vanishing at t = 0 and t = π. We compute all of the monoconjugate point locations (using a basis of curl eigenfields on S³ constructed by Jason Cantarella), and find that they occur at all rational multiples of π greater than or equal to π itself. Furthermore, each one has infinite order. By a result of Biliotti et al. [BEPT], we conclude that for every τ ∈ [π, ∞), the point η(τ) is epiconjugate to η(0). This example and the one in [EMP] give us explicit and natural examples of the sorts of pathological conjugate points first described by Grossman [G] in infinite-dimensional geometry, and recently explored in more depth by [BEPT].

2. Background

In this section, we briefly review the geometry of the volumorphism group Dµ(M). Many of the formulas provided here for covariant derivatives, curvature operators, and the index form were derived in [M1, P1, EMP]. We will confine ourselves to the C∞ case, even though for some technical proofs it is more convenient to use the Sobolev H^s spaces. See Ebin and Marsden [EMa] for the precise constructions.

The space of volumorphisms Dµ(M) of a Riemannian manifold M (possibly with boundary ∂M) consists of those C∞ diffeomorphisms η satisfying $\eta^*\mu = \mu$, where µ is the Riemannian volume form. This space has the structure of a Fréchet manifold. Its tangent spaces $T_\eta D_\mu(M)$ consist of elements of the form X∘η, where X is a vector field on M that is divergence-free and tangent to the boundary. The L² Riemannian metric $\langle\langle\cdot,\cdot\rangle\rangle$ on Dµ(M) is defined in terms of the metric $\langle\cdot,\cdot\rangle$ on M by the formula
$$\langle\langle U\circ\eta, V\circ\eta\rangle\rangle = \int_M \langle U, V\rangle\circ\eta \;\mu. \tag{2.3}$$
Dµ(M) also has a Lie group structure, where the group operator is composition. The differentials of the translation operators at the identity are
$$dL_\eta(X) = D\eta(X) \quad\text{and}\quad dR_\eta(X) = X\circ\eta. \tag{2.4}$$
By the change of variables formula for integrals, and the fact that each η is volume-preserving, we see that the metric (2.3) is right-invariant. It is not, however, left-invariant. In some sense, then, all the geometric information about Dµ(M) is contained in the left-translations.

To compute covariant derivatives in the metric (2.3), we use the Weyl decomposition of vector fields. This decomposition allows us to write any vector field X on a manifold M as X = U + ∇f, where U is divergence-free and tangent to ∂M. We construct this decomposition by solving the Neumann problem
$$\Delta f = \operatorname{div} X, \qquad \langle\nabla f, \nu\rangle\big|_{\partial M} = \langle X, \nu\rangle\big|_{\partial M} \tag{2.5}$$
for f, then defining U := X − ∇f. (Here ν is the unit normal on ∂M.) This decomposition is orthogonal in the L² metric (2.3). We will denote the orthogonal projections by
$$P(X) = U \quad\text{and}\quad Q(X) = \nabla f. \tag{2.6}$$
By right-invariance of the metric, the orthogonal projections in $T_\eta D_\mu(M)$ are given by
$$P_\eta = dR_\eta\circ P\circ dR_{\eta^{-1}} \quad\text{and}\quad Q_\eta = dR_\eta\circ Q\circ dR_{\eta^{-1}}. \tag{2.7}$$
Now we consider covariant derivatives.
Proposition 2.1. Suppose η(t) is a curve in Dµ(M), and J(t) is a vector field along η(t). Let X(t) be the Eulerian velocity field of η, defined by the formula
$$X(t) = dR_{\eta(t)^{-1}}\Big(\frac{d\eta}{dt}\Big) = \frac{\partial\eta}{\partial t}\circ\eta(t)^{-1}. \tag{2.8}$$
If we right-translate back to the identity to obtain $Y(t) = dR_{\eta(t)^{-1}}\big(J(t)\big)$, the covariant derivative can be computed using
$$\frac{\widetilde{D}J}{dt} = dR_{\eta(t)}\Big(\frac{\partial Y}{\partial t} + P\big(\nabla_{X(t)}Y(t)\big)\Big). \tag{2.9}$$

Proof. Formula (2.9) is a consequence of the formula
$$\frac{DJ}{dt}(t,x) = \frac{\partial Y}{\partial t}\big(t,\eta(t,x)\big) + \nabla_{X(t,\eta(t,x))}Y, \tag{2.10}$$
where the covariant derivative of J is computed along each path t ↦ η(t, x); this just comes from the Chain Rule on M. Projecting both sides of (2.10) onto $T_\eta D_\mu(M)$, we obtain (2.9). A more detailed derivation is given in [M1].

The geodesic equation on Dµ(M) is
$$\frac{\widetilde{D}}{dt}\frac{d\eta}{dt} = 0, \qquad \eta(0) = \mathrm{id}, \quad \dot\eta(0) = X_o.$$
Using Eq. (2.9), the geodesic equation becomes, in terms of the Eulerian velocity field X(t) defined by (2.8), the Euler equation of ideal incompressible flow:
$$\frac{\partial X}{\partial t} + \nabla_{X(t)}X(t) = -\nabla p(t), \qquad X(0) = X_o. \tag{2.11}$$
The pressure function p(t) is written with a negative sign by convention, and comes from solving Eq. (2.5):
$$\nabla p(t) = -Q\big(\nabla_{X(t)}X(t)\big). \tag{2.12}$$
The flow equation (2.8) can always be solved for η, with initial condition η(0) = id, and this gives a one-to-one correspondence between solutions of (2.11) and geodesics starting at the identity. We will assume that η(0) = id from now on; by right-invariance, this is no loss of generality. By formula (2.10), we may also think of the geodesic equation as a family of ordinary differential equations on the manifold:
$$\frac{D}{dt}\frac{d\eta}{dt}(t,x) = -\nabla p(t,x), \tag{2.13}$$
which is Newton’s equation with a time-dependent potential p(t, x). Of course, p(t, x) is not given in advance, but determined by the fluid so as to preserve volume. This point of view will be useful later; we will see how one can consider the full linearized geodesic equation on Dµ (M) by comparing it to the far simpler linearized Newton equation on M. Although these equations are not the same (when one is considering perturbations in M, one does not have the volume-preserving constraint to complicate the formulas), they are quite closely related.
We can eliminate the pressure term in (2.11) by computing the curl of both sides. In three dimensions, we get the vorticity form of the Euler equation:
$$\frac{\partial}{\partial t}\operatorname{curl}X(t,x) + \big[X(t,x), \operatorname{curl}X(t,x)\big] = 0. \tag{2.14}$$
Equation (2.14) implies that the vorticity is transported by the flow: for every x ∈ M,
$$\operatorname{curl}X\big(t,\eta(t,x)\big) = D\eta(t,x)\big(\operatorname{curl}X_o(x)\big). \tag{2.15}$$
More generally, by lowering indices in Eq. (2.11) to get an equation for the 1-form $X^\flat$ and taking the differential, we obtain the equation
$$\frac{\partial}{\partial t}\,dX^\flat + L_X\,dX^\flat = 0, \tag{2.16}$$
the solution of which is
$$dX^\flat(t) = \big(\eta(t)^{-1}\big)^{*}\,dX_o^\flat. \tag{2.17}$$
See for example [AK] for details.

There are several formulas for the Riemann curvature tensor $\widetilde{R}$ on Dµ(M), but the only one we'll need is the following: if U, V, and W are divergence-free and tangent to the boundary, then
$$\widetilde{R}(U,V)W = P\big(R(U,V)W + \nabla_V Q(\nabla_U W) - \nabla_U Q(\nabla_V W)\big). \tag{2.18}$$
See for example [P1]. Since the metric is right-invariant, the curvature tensor is as well, and thus we can compute the curvature at any η ∈ Dµ(M) using the same formula.

We are interested in Jacobi fields, which are defined as follows: if η(t, s) is a family of geodesics in Dµ(M) with η(t) = η(t, 0), then $J(t) = \frac{\partial\eta}{\partial s}\big|_{s=0}$ is a Jacobi field along η(t). Jacobi fields satisfy the linearized geodesic equation
$$\frac{\widetilde{D}^2 J}{dt^2} + \widetilde{R}\big(J(t),\dot\eta(t)\big)\dot\eta(t) = 0. \tag{2.19}$$
Equation (2.19) is extremely unwieldy, not least because the formulas for both curvature and the covariant derivative involve nonlocal operators (specifically, the solution of the Neumann problem (2.5)). However, the Jacobi equation can be simplified substantially, and in fact decoupled into two first-order equations. This fact was first observed by Rouchon [Ro], and exploited by the author [P1] to obtain explicit Jacobi fields along certain geodesics of Dµ(M). These simplifications result from the following equivalent expressions for the linearization of the Newton equation (2.13).

Proposition 2.2. Consider a solution X(t) of the Euler equation (2.11), with corresponding geodesic η(t) in Dµ(M). Let J(t) be a vector field along η(t). Then we have
$$\frac{D^2 J}{dt^2} + \nabla_J\nabla p + R(J,\dot\eta)\dot\eta = \Big(\frac{\partial Z}{\partial t} + \nabla_X Z + \nabla_Z X\Big)\circ\eta, \tag{2.20}$$
where
$$Z = \frac{\partial Y}{\partial t} + [X, Y] \tag{2.21}$$
and J = Y∘η.
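We can also write
$$\frac{D^2 J}{dt^2} + \nabla_J\nabla p + R(J,\dot\eta)\dot\eta = (D\eta^{-1})^\dagger\Big(\frac{\partial}{\partial t}(\Lambda V) + (\iota_V\,dX_o^\flat)^\sharp\Big), \tag{2.22}$$
where $V = \partial_t U$ and J = Dη(U). Here $(D\eta^{-1})^\dagger$ is the pointwise metric adjoint of $D\eta^{-1}$, and $\Lambda = D\eta^\dagger D\eta$ is the metric pullback, a positive-definite linear operator on each $T_xM$. In addition, $(\iota_V\,dX_o^\flat)^\sharp$ is a vector field satisfying
$$\langle(\iota_V\,dX_o^\flat)^\sharp, F\rangle = dX_o^\flat(V, F) = \langle\nabla_V X_o, F\rangle - \langle\nabla_F X_o, V\rangle$$
for any vector field F.

Proof. To obtain Eq. (2.20), we start with
$$\frac{DJ}{dt} = (\partial_t Y + \nabla_X Y)\circ\eta = (Z + \nabla_Y X)\circ\eta,$$
a consequence of (2.10). Using the Euler equation (2.11), we have
$$\begin{aligned}
\Big(\frac{D^2 J}{dt^2} + \nabla_J\nabla p + R(J,\dot\eta)\dot\eta\Big)\circ\eta^{-1}
&= \Big(\frac{D}{dt}\big((Z + \nabla_Y X)\circ\eta\big)\Big)\circ\eta^{-1} + \nabla_Y\nabla p + R(Y,X)X\\
&= \partial_t(Z + \nabla_Y X) + \nabla_X(Z + \nabla_Y X) - \nabla_Y(\partial_t X + \nabla_X X) + R(Y,X)X\\
&= \partial_t Z + \nabla_{\partial_t Y}X + \nabla_Y(\partial_t X) + \nabla_X Z + \nabla_X\nabla_Y X\\
&\quad - \nabla_Y(\partial_t X) - \nabla_Y\nabla_X X + \nabla_Y\nabla_X X - \nabla_X\nabla_Y X + \nabla_{[X,Y]}X\\
&= \partial_t Z + \nabla_Z X + \nabla_X Z.
\end{aligned}$$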
To derive Eq. (2.22), we first recall that $Y = \eta_* U$. By the definition of the Lie bracket (see for example Spivak [Sp]), we have
$$\eta_*\frac{\partial}{\partial t}\big(\eta_*^{-1}Y\big) = \frac{\partial Y}{\partial t} + [X, Y]. \tag{2.23}$$
Thus $Z = \eta_* V$. For convenience, define L = Z∘η = Dη(V). Then by Eqs. (2.20) and (2.10), our goal becomes to prove that
$$\frac{DL}{dt} + \nabla_L X = (D\eta^{-1})^\dagger\Big(\frac{\partial}{\partial t}(\Lambda V) + (\iota_V\,dX_o^\flat)^\sharp\Big). \tag{2.24}$$
This equation involves no space derivatives, so we can consider it as an equation along the fixed curve η(t, x) for each particular x ∈ M. So for some fixed x, pick an arbitrary vector $w_o \in T_xM$. Then we can compute
$$\Big\langle w_o, \frac{d}{dt}(\Lambda V)\Big\rangle = \frac{d}{dt}\langle w_o, \Lambda V\rangle = \frac{d}{dt}\langle D\eta(w_o), L\rangle.$$
By Eqs. (2.23) and (2.10), we can compute that
$$\frac{D}{dt}D\eta(w_o) = \nabla_{D\eta(w_o)}X. \tag{2.25}$$
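Thus (2.25) yields
$$\begin{aligned}
\Big\langle w_o, \frac{d}{dt}(\Lambda V)\Big\rangle
&= \langle\nabla_{D\eta(w_o)}X, L\rangle + \Big\langle D\eta(w_o), \frac{DL}{dt}\Big\rangle\\
&= \langle\nabla_{D\eta(w_o)}X, L\rangle - \langle D\eta(w_o), \nabla_L X\rangle + \Big\langle D\eta(w_o), \frac{DL}{dt} + \nabla_L X\Big\rangle.
\end{aligned}$$
Using the general formula $\langle\nabla_A X, B\rangle - \langle\nabla_B X, A\rangle = dX^\flat(A, B)$, we can write
$$\Big\langle w_o, \frac{d}{dt}(\Lambda V)\Big\rangle = \Big\langle D\eta(w_o), \frac{DL}{dt} + \nabla_L X\Big\rangle - dX^\flat\big(D\eta(V), D\eta(w_o)\big).$$
Now since the vorticity 2-form is transported by the flow, Eq. (2.17) yields
$$dX^\flat\big(D\eta(V), D\eta(w_o)\big) = dX_o^\flat(V, w_o) = \langle(\iota_V\,dX_o^\flat)^\sharp, w_o\rangle.$$
Thus we finally get
$$\Big\langle w_o, \frac{d}{dt}(\Lambda V)\Big\rangle = \Big\langle D\eta(w_o), \frac{DL}{dt} + \nabla_L X\Big\rangle - \langle(\iota_V\,dX_o^\flat)^\sharp, w_o\rangle,$$
and since this is true for any $w_o \in T_xM$, we have the equation
$$D\eta^\dagger\Big(\frac{DL}{dt} + \nabla_L X\Big) = \frac{d}{dt}(\Lambda V) + (\iota_V\,dX_o^\flat)^\sharp,$$
which yields (2.24) and hence (2.22).

The following proposition shows how the Jacobi equation simplifies under either left- or right-translations.

Proposition 2.3. If η is a geodesic in Dµ(M) and X is its Eulerian velocity field defined by (2.8), then the Jacobi operator in (2.19) can be written in two ways:
• in terms of the right-translation, with $Y = dR_{\eta^{-1}}(J)$, as
$$\frac{\widetilde{D}^2 J}{dt^2} + \widetilde{R}(J,\dot\eta)\dot\eta = dR_\eta\Big(\frac{\partial Z}{\partial t} + P\big(\nabla_Z X + \nabla_X Z\big)\Big), \tag{2.26}$$
where $Z = \partial_t Y + [X, Y]$.
• in terms of the left-translation, with $U = dL_{\eta^{-1}}J$, as
$$\frac{\widetilde{D}^2 J}{dt^2} + \widetilde{R}(J,\dot\eta)\dot\eta = (dL_{\eta^{-1}})^\dagger\Big(\frac{\partial}{\partial t}\Big(P\Big(\Lambda\frac{\partial U}{\partial t}\Big)\Big) + K_{X_o}\Big(\frac{\partial U}{\partial t}\Big)\Big), \tag{2.27}$$
where the operator $K_{X_o}: T_{\mathrm{id}}D_\mu(M) \to T_{\mathrm{id}}D_\mu(M)$ is defined by
$$K_{X_o}(W) = P\big((\iota_W\,dX_o^\flat)^\sharp\big), \tag{2.28}$$
with $\Lambda = D\eta^\dagger D\eta$ being the metric pullback, $X_o$ being the initial velocity field, and $(dL_{\eta^{-1}})^\dagger = dR_\eta\circ P\circ(D\eta^{-1})^\dagger\circ dR_{\eta^{-1}}$ being the L² adjoint of the operator $dL_{\eta^{-1}}: T_{\mathrm{id}}D_\mu \to T_{\eta^{-1}}D_\mu$.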
Proof. The right-translated Jacobi equation (2.26) was derived by Rouchon [Ro], by linearizing the geodesic equation (2.11) and (2.8) directly. The fact that (2.26) is equivalent to (2.19) can be seen directly, using formulas (2.9) and (2.18). Equation (2.27) is a consequence of (2.22), along with the observation that for any vector field W, we have $(dL_{\eta^{-1}})^\dagger(W) = (dL_{\eta^{-1}})^\dagger\circ P(W)$. This observation follows from the fact that if g is any function on M, then
$$(dL_{\eta^{-1}})^\dagger(\nabla g) = dR_\eta\circ P\circ(D\eta^{-1})^\dagger\circ dR_{\eta^{-1}}(\nabla g) = dR_\eta\circ P\big(\nabla(g\circ\eta^{-1})\big) = 0,$$
so that $(dL_{\eta^{-1}})^\dagger\circ Q \equiv 0$.
The operator $K_{X_o}$ defined by (2.28) is given in two dimensions by $K_{X_o}(W) = P\big((\operatorname{curl}X_o)\,{\star}W\big)$, where ⋆ is the two-dimensional Hodge star operator that rotates vectors 90°. This operator is compact, as discussed in [EMP]. In three dimensions, $K_{X_o}(W) = P(\operatorname{curl}X_o\times W)$, and this operator is generally not compact. The fact that this operator fails to be compact in three dimensions is the main reason Fredholmness of the exponential map fails in three dimensions, which is why conjugate points look so different between Dµ(M²) and Dµ(M³).

The main thing we are interested in for this paper is the index form along a geodesic η(t) in Dµ(M). In general this is defined for a Riemannian manifold as
$$I_a\big(J(t), J(t)\big) = \int_0^a\Big(\Big\langle\Big\langle\frac{\widetilde{D}J}{dt}, \frac{\widetilde{D}J}{dt}\Big\rangle\Big\rangle - \Big\langle\Big\langle\widetilde{R}\Big(J(t), \frac{d\eta}{dt}\Big)\frac{d\eta}{dt}, J(t)\Big\rangle\Big\rangle\Big)\,dt. \tag{2.29}$$
The index form represents the second derivative of the energy functional
$$E(s) = \frac{1}{2}\int_0^a\Big\langle\Big\langle\frac{\partial\eta(t,s)}{\partial t}, \frac{\partial\eta(t,s)}{\partial t}\Big\rangle\Big\rangle\,dt.$$
If η(t, s) is a family of curves in Dµ(M), such that η(t, 0) is a geodesic, with η(0, s) and η(a, s) constant in s, then E′(0) = 0 and
$$E''(0) = I_a\Big(\frac{\partial\eta}{\partial s}\Big|_{s=0}, \frac{\partial\eta}{\partial s}\Big|_{s=0}\Big).$$
So if the index form is negative for some vector field J(t) vanishing at t = 0 and t = a, then the geodesic is not minimizing on [0, a]. In addition, there must be a Jacobi field which vanishes at t = 0 and t = b for some b ∈ (0, a). We will derive an alternative formula for the index form (2.29), which will form the basis for the rest of the paper.

Proposition 2.4. If η(t) is a geodesic in Dµ(M) and J(t) is a smooth vector field along η(t) vanishing at t = 0 and t = a, then the index form $I_a\big(J(t), J(t)\big)$ may be written in terms of the left-translation $U(t) = dL_{\eta(t)^{-1}}\big(J(t)\big)$ as
$$I_a\big(J(t), J(t)\big) = \int_0^a\!\!\int_M\Big(\Big\langle\Lambda(t,x)\frac{\partial U}{\partial t}(t,x), \frac{\partial U}{\partial t}(t,x)\Big\rangle + dX_o^\flat(x)\Big(U(t,x), \frac{\partial U}{\partial t}(t,x)\Big)\Big)\,\mu(x)\,dt, \tag{2.30}$$
where $\Lambda(t,x) = D\eta(t,x)^\dagger D\eta(t,x)$ is the metric pullback and $dX_o^\flat$ is the initial vorticity 2-form.

Proof. The formula follows immediately from Proposition 2.3, after integrating the index form (2.29) by parts to obtain
$$I_a\big(J(t), J(t)\big) = -\int_0^a\Big\langle\Big\langle\frac{\widetilde{D}^2 J}{dt^2} + \widetilde{R}\Big(J(t), \frac{d\eta}{dt}\Big)\frac{d\eta}{dt}, J(t)\Big\rangle\Big\rangle\,dt.$$

What is remarkable about the formula (2.30) is that it involves only local computations; it is not necessary to solve the Neumann problem (2.5) to compute the index form. The index form is virtually the only object in the geometry of Dµ that can be computed so easily, and it is this fact which helps so much to understand conjugate points on Dµ, despite our very incomplete understanding of the curvature on Dµ. The main reason we use left translations to write the index form (2.30) is that it yields an index form in each tangent space $T_xM$, so that we can study the differential equation in a single vector space rather than along a Lagrangian path. However, the two approaches are equivalent.

3. The Local Criterion

Theorem 3.1. Let M be a 3-dimensional compact manifold (possibly with boundary). Let η : [0, T) → Dµ(M) be a geodesic curve in the diffeomorphism group with η(0) = id. (Here T is the maximal time of existence, which may be infinite.) Let $X(t) = \frac{d\eta}{dt}\circ\eta(t)^{-1}$ be the velocity field, with $X(0) = X_o$. If for some point x in the interior of M, the ordinary differential equation
$$\frac{d}{dt}\Big(\Lambda(t,x)\frac{du}{dt}\Big) + \operatorname{curl}X_o(x)\times\frac{du}{dt} = 0 \tag{3.31}$$
has a nontrivial solution vanishing at t = 0 and t = a, then for any δ > 0, there is a b ∈ (0, a + δ) such that η(b) is monoconjugate to id along η.

Proof. Clearly $\operatorname{curl}X_o(x)$ is not zero; if it were, we could not have a nontrivial solution vanishing at two points, since Λ(t, x) is positive definite. So set up an oriented orthonormal basis $\{e_1, e_2, e_3\}$ at $T_xM$, such that $\operatorname{curl}X_o(x) = \omega_o e_3$, with $\omega_o > 0$. Choose Riemannian normal coordinates $(x_1, x_2, x_3)$ in a neighborhood of x interior to M, such that at x, $\partial_{x_1} = e_1$, $\partial_{x_2} = e_2$, and $\partial_{x_3} = e_3$. Let h : R → R be a nontrivial C∞ function which vanishes identically outside [−1, 1].
For a small ε > 0, let us define three vector fields x x x x x x 1 2 3 1 2 3 h 2 h 3 ∂x1 − ε 3 h h 2 h 3 ∂x2 , A1 = ε 4 h ε ε ε ε ε ε x x x x x 1 2 3 2 3 3 4 x1 A2 = ε h h 2 h 3 ∂x1 + ε h h 2 h 3 ∂x2 , ε ε ε ε ε ε ε3 x1 x2 x3 ε2 x1 x2 x3 ∂x1 + h 2 h 3 h ∂x2 . A3 = − h 2 h 3 h 2 ε ε ε 2 ε ε ε (We set each A j ≡ 0 outside the coordinate neighborhood.)
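Now we specify divergence-free vector fields $E_1$, $E_2$, and $E_3$ by the formulas $E_j = \operatorname{curl}A_j$. Since we are working in Riemannian normal coordinates, we can compute these curls to order O(ε) just using the Euclidean formulas, and we obtain:
$$\begin{aligned}
E_1 &= h\Big(\frac{x_1}{\varepsilon}\Big)h'\Big(\frac{x_2}{\varepsilon^2}\Big)h'\Big(\frac{x_3}{\varepsilon^3}\Big)\,\partial_{x_1} + O(\varepsilon) &&\text{on } [-\varepsilon,\varepsilon]\times[-\varepsilon^2,\varepsilon^2]\times[-\varepsilon^3,\varepsilon^3],\\
E_2 &= h\Big(\frac{x_1}{\varepsilon}\Big)h'\Big(\frac{x_2}{\varepsilon^2}\Big)h'\Big(\frac{x_3}{\varepsilon^3}\Big)\,\partial_{x_2} + O(\varepsilon) &&\text{on } [-\varepsilon,\varepsilon]\times[-\varepsilon^2,\varepsilon^2]\times[-\varepsilon^3,\varepsilon^3],\\
E_3 &= h'\Big(\frac{x_1}{\varepsilon^2}\Big)h'\Big(\frac{x_2}{\varepsilon^3}\Big)h\Big(\frac{x_3}{\varepsilon}\Big)\,\partial_{x_3} + O(\varepsilon) &&\text{on } [-\varepsilon^2,\varepsilon^2]\times[-\varepsilon^3,\varepsilon^3]\times[-\varepsilon,\varepsilon].
\end{aligned}$$
These vector fields are chosen so that, roughly speaking, $E_i$ is nearly parallel to $e_i$ near x, to lowest order. More precisely, we can check that the following formulas hold for the L² inner products:
$$\begin{aligned}
&\langle\langle E_1, E_1\rangle\rangle = \langle\langle E_2, E_2\rangle\rangle = \langle\langle E_3, E_3\rangle\rangle = C\varepsilon^6 + O(\varepsilon^7),\\
&\langle\langle E_1, E_2\rangle\rangle = \langle\langle E_1, E_3\rangle\rangle = \langle\langle E_2, E_3\rangle\rangle = O(\varepsilon^7),\\
&\langle\langle\partial_{x_3}\times E_1, E_2\rangle\rangle = C\varepsilon^6 + O(\varepsilon^7), \qquad \langle\langle\partial_{x_3}\times E_2, E_1\rangle\rangle = -C\varepsilon^6 + O(\varepsilon^7),\\
&\langle\langle\partial_{x_3}\times E_1, E_3\rangle\rangle = \langle\langle\partial_{x_3}\times E_2, E_3\rangle\rangle = O(\varepsilon^7),
\end{aligned}$$
where the constant C is defined by
$$C = \Big(\int_{-1}^{1} h(\sigma)^2\,d\sigma\Big)\Big(\int_{-1}^{1} h'(\sigma)^2\,d\sigma\Big)^2.$$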
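Now since u(t) is a solution of Eq. (3.31) vanishing at t = 0 and t = a, we know that there is a vector function ũ(t) vanishing at t = 0 and t = a + δ such that
$$i_{a+\delta}(\tilde u, \tilde u) \equiv \int_0^{a+\delta}\Big(\big\langle\Lambda(t,x)\,\partial_t\tilde u(t), \partial_t\tilde u(t)\big\rangle + \big\langle\operatorname{curl}X_o(x)\times\tilde u(t), \partial_t\tilde u(t)\big\rangle\Big)\,dt < 0.$$
(The construction is the same as that for Jacobi fields in finite-dimensional Riemannian geometry, or more generally for index forms of second-order self-adjoint equations. See for example Reid [Re].) If $\tilde u(t) = u_1(t)e_1 + u_2(t)e_2 + u_3(t)e_3$, then define $\tilde U(t) = u_1(t)E_1 + u_2(t)E_2 + u_3(t)E_3$. For y in the support of $\tilde U$, we can approximate Λ(t, y) = Λ(t, x) + O(ε) and $\operatorname{curl}X_o(y) = \operatorname{curl}X_o(x) + O(\varepsilon)$. Therefore, we have
$$\begin{aligned}
I_{a+\delta}(\tilde U, \tilde U) &= \int_0^{a+\delta}\Big(\langle\langle\Lambda(t)\,\partial_t\tilde U, \partial_t\tilde U\rangle\rangle + \langle\langle\operatorname{curl}X_o\times\tilde U(t), \partial_t\tilde U\rangle\rangle\Big)\,dt\\
&= \int_0^{a+\delta}\Big(\langle\langle\Lambda(t)\,\partial_t\tilde U, \partial_t\tilde U\rangle\rangle + \omega_o\langle\langle\partial_{x_3}\times\tilde U, \partial_t\tilde U\rangle\rangle\Big)\,dt + O(\varepsilon^7)\\
&= C\varepsilon^6\, i_{a+\delta}(\tilde u, \tilde u) + O(\varepsilon^7),
\end{aligned}$$
where C > 0 is the constant multiplying ε⁶ in the L² inner products of the $E_j$, and choosing ε sufficiently small, we can make this quantity negative. Since the index form is negative for some divergence-free vector field on the interval [0, a + δ], there must be a Jacobi field along η vanishing at t = 0 and t = b for some b < a + δ. So η(b) is monoconjugate to η(0) along η, as desired.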
Remark 3.2. The main point is that for any particular vector $u \in T_xM$, we can construct a divergence-free vector field U such that U ≈ u near x and $P(\operatorname{curl}X_o\times U) \approx \operatorname{curl}X_o\times u$ near x. We can do this only in three (or possibly higher) dimensions. In two dimensions, the index form takes the form
$$I_a(U, U) = \int_0^a\Big(\langle\langle\Lambda(t)\,\partial_t U, \partial_t U\rangle\rangle + \langle\langle(\operatorname{curl}X_o)\,{\star}U, \partial_t U\rangle\rangle\Big)\,dt,$$
where ⋆ is the Hodge star operator. If U is any divergence-free vector field with support in a disc, then ⋆U is a gradient, and thus to lowest order, $(\operatorname{curl}X_o)\,{\star}U$ is also a gradient. Since the gradients are orthogonal to the divergence-free vector fields, the second term in the index form vanishes to lowest order; thus the index form is positive definite to lowest order. We conclude that there is no local criterion that can be used to find conjugate points along two-dimensional fluid flows: conjugate points on Dµ(M²) are an essentially global phenomenon. In three (and possibly higher) dimensions, conjugate points are essentially a local phenomenon.

Remark 3.3. The result is sharp, in the sense that there may not be a monoconjugate point actually at η(a). This is precisely what happens for one example where we can compute everything explicitly: uniform rotation with angular velocity 1 of the solid torus D² × S¹. Ebin, Misiołek, and the author [EMP] computed explicitly the Jacobi fields along this flow in terms of curl eigenfields on the cylinder, and found that monoconjugate points occur at a sequence of locations that decreases to π, but that η(π) itself is not a monoconjugate point. In this example Λ(t, x) is always the identity and $\operatorname{curl}X_o \equiv 2\,\partial_z$, so that Eq. (3.31) becomes
$$u''(t) + 2\,\partial_z\times u'(t) = 0.$$
With u(0) = 0, the solutions are
$$u(t) = \begin{pmatrix}\sin t\cos t & \sin^2 t & 0\\ -\sin^2 t & \sin t\cos t & 0\\ 0 & 0 & t\end{pmatrix}u'(0),$$
and choosing u′(0) orthogonal to $\partial_z$, we see that the first vanishing point is a = π. In general, if a is not actually a monoconjugate location, then there must be a sequence of monoconjugate locations descending to a. We will discuss this point more thoroughly in the next section.

Remark 3.4. By Proposition 2.2, Eq. (3.31) is equivalent to the pair of equations
$$\frac{Dy}{dt} - \nabla_{y(t)}X = z(t) \quad\text{and}\quad \frac{Dz}{dt} + \nabla_{z(t)}X = 0 \tag{3.32}$$
for vector fields y(t) and z(t) along a Lagrangian path t ↦ η(t)(x) in M³. For a steady flow X for which we happen to know a Lagrangian path, Eqs. (3.32) will often be easier to write down and solve than Eq. (3.31). (Of course, at a fixed point of a steady flow, the two approaches are the same.) We can also write Eqs. (3.32) as the single second-order equation
$$\frac{D^2 y}{dt^2} + \nabla_y\nabla p + R(y,\dot\eta)\dot\eta = 0. \tag{3.33}$$
This is the linearization of the Newton equation $\frac{D}{dt}\dot\eta = -\nabla p$ on the manifold. Thus if the finite-dimensional Newtonian system (for the time-dependent potential energy function
p) has a conjugate point, so does the infinite-dimensional Riemannian system. One can use the same sort of comparison techniques to find solutions of (3.33) as are used in finite-dimensional Riemannian geometry.

Theorem 3.1 is easiest to apply if $X = X_o$ is a steady solution of the Euler equation, with a fixed point x.

Proposition 3.5. Suppose X is a steady solution of the Euler equation $\nabla_X X = -\nabla p$ on a 3-manifold M, and x is a fixed point of X in the interior of M. If Δp(x) > 0, then Eq. (3.31) has nontrivial solutions vanishing at both t = 0 and $t = \pi\sqrt{2/\Delta p(x)}$. Otherwise, all solutions of (3.31) can vanish at most at one time.

Proof. The operator $u \mapsto \nabla_u X$ is a linear operator in $T_xM$. Since X is assumed to be a steady solution of the Euler equation, we know by (2.14) that [X, curl X] = 0 everywhere, and in particular at x. Thus $\nabla_{\operatorname{curl}X(x)}X = \nabla_{X(x)}\operatorname{curl}X = 0$, because X(x) = 0. In addition, for any $u \in T_xM$, we have
$$\langle\nabla_u X, \operatorname{curl}X(x)\rangle - \langle\nabla_{\operatorname{curl}X(x)}X, u\rangle = \langle\operatorname{curl}X(x)\times\operatorname{curl}X(x), u\rangle = 0.$$
Therefore, in an orthonormal basis $\{e_1, e_2, e_3\}$ with $e_3$ parallel to curl X(x), the matrix of $u \mapsto \nabla_u X$ is of the form
$$\nabla_u X = \begin{pmatrix}A & 0\\ 0 & 0\end{pmatrix}u,$$
where A is a 2 × 2 matrix. Since div X = 0, we have Tr A = 0. So the characteristic equation for A is $A^2 = -(\det A)I$. We have the general formula
$$\operatorname{div}\nabla_X X = \operatorname{Ric}(X, X) + \operatorname{Tr}\big(u \mapsto \nabla_{\nabla_u X}X\big),$$
which is valid for any divergence-free vector field. Thus for a solution of Eq. (2.11), we have
$$\Delta p + \operatorname{Ric}(X, X) = -\operatorname{Tr}\big(u \mapsto \nabla_{\nabla_u X}X\big).$$
In particular, at the point x, we have Ric(X(x), X(x)) = 0 and thus
$$\Delta p(x) = -\operatorname{Tr}\big(u \mapsto \nabla_{\nabla_u X}X\big) = -\operatorname{Tr}A^2 = 2\det A.$$
The solution of the pair of equations
$$\frac{dz}{dt} + \nabla_z X = 0, \qquad \frac{dy}{dt} - \nabla_y X = z$$
with initial conditions y(0) = 0 and $z(0) = z_o$ is
$$y(t) = \begin{pmatrix}\tfrac{1}{2}A^{-1}\big(e^{tA} - e^{-tA}\big) & 0\\ 0 & t\end{pmatrix}z_o.$$
If Δp(x) > 0, then $e^{tA} = \cos\big(\sqrt{\det A}\,t\big)I + \frac{1}{\sqrt{\det A}}\sin\big(\sqrt{\det A}\,t\big)A$, so that
$$y(t) = \begin{pmatrix}\frac{1}{\sqrt{\det A}}\sin\big(\sqrt{\det A}\,t\big)I & 0\\ 0 & t\end{pmatrix}z_o.$$
Choosing $z_o$ orthogonal to $e_3$, we obtain a nontrivial solution which vanishes at time $t = \pi\sqrt{2/\Delta p(x)}$. On the other hand, if Δp(x) ≤ 0, we can easily verify that each component of y(t) increases with time, so that there are no nontrivial solutions vanishing at two times.

We have a natural converse to Theorem 3.1, which works for any dimension n ≥ 2.
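The computation in the proof of Proposition 3.5 can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical trace-free block A (the values are illustrative only), integrates the pair dz/dt = −Az, dy/dt = Ay + z in the plane orthogonal to e₃ with a hand-written Runge-Kutta step, and compares the first return time of y to zero with π√(2/Δp(x)), using Δp(x) = 2 det A.

```python
import math

def rk4(f, state, t_end, steps):
    """Integrate state' = f(state) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    s = list(state)
    for _ in range(steps):
        k1 = f(s)
        k2 = f([x + 0.5 * h * k for x, k in zip(s, k1)])
        k3 = f([x + 0.5 * h * k for x, k in zip(s, k2)])
        k4 = f([x + h * k for x, k in zip(s, k3)])
        s = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return s

# Hypothetical trace-free 2x2 block of u -> grad_u X at the fixed point
# (illustrative values, not taken from the paper).
A = [[0.0, 4.0], [-1.0, 0.0]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lap_p = 2.0 * det_A  # Laplacian of p at x, as computed in the proof

def rhs(s):
    """Right-hand side of y' = A y + z, z' = -A z (components orthogonal to e3)."""
    y1, y2, z1, z2 = s
    return [A[0][0] * y1 + A[0][1] * y2 + z1,
            A[1][0] * y1 + A[1][1] * y2 + z2,
            -(A[0][0] * z1 + A[0][1] * z2),
            -(A[1][0] * z1 + A[1][1] * z2)]

z_o = [1.0, 0.5]  # initial z, chosen orthogonal to e3
t_star = math.pi * math.sqrt(2.0 / lap_p)
end = rk4(rhs, [0.0, 0.0] + z_o, t_star, 4000)
mid = rk4(rhs, [0.0, 0.0] + z_o, t_star / 2.0, 4000)
end_norm = math.hypot(end[0], end[1])  # |y(t_star)|, expected to be near zero
mid_norm = math.hypot(mid[0], mid[1])  # |y(t_star / 2)|, expected to be nonzero
print(t_star, end_norm, mid_norm)
```

With these illustrative values det A = 4, so the predicted vanishing time is π/2; the intermediate evaluation is included only to confirm that y does not vanish earlier along this particular solution.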
Proposition 3.6. Let M be any manifold with dimension n ≥ 2, possibly with boundary. Suppose η(t) is a geodesic in Dµ(M) with η(0) = id, and let X be the Eulerian velocity field defined by $X = \frac{\partial\eta}{\partial t}\circ\eta^{-1}$. If there is a Jacobi field along η vanishing at t = 0 and t = a > 0, then for some x in the interior of M, there is a solution u(t) of the ordinary differential equation
$$\frac{d}{dt}\Big(\Lambda(t,x)\frac{du}{dt}\Big) + \big(\iota_{\dot u(t)}\,dX_o^\flat(x)\big)^\sharp = 0 \tag{3.34}$$
with u(0) = u(α) = 0 for some α ∈ (0, a].

Proof. Let U(t) be the left translation of the Jacobi field vanishing at t = 0 and t = a. Then the index form $I_a(U, U)$ vanishes:
$$I_a(U, U) = \int_0^a\Big(\langle\langle\Lambda(t)U_t(t), U_t(t)\rangle\rangle + \langle\langle(\iota_{U(t)}\,dX_o^\flat)^\sharp, U_t(t)\rangle\rangle\Big)\,dt = 0.$$
Thus, interchanging the order of integration, we know that
$$\int_M\int_0^a\Big(\big\langle\Lambda(t,x)U_t(t,x), U_t(t,x)\big\rangle + \big\langle(\iota_{U(t,x)}\,dX_o^\flat(x))^\sharp, U_t(t,x)\big\rangle\Big)\,dt\,d\mu(x) = 0.$$
As a result, we know
$$I_a(U, U) = \int_M i_a(x)\,d\mu(x) = 0,$$
where the integrand is
$$i_a(x) = \int_0^a\Big(\big\langle\Lambda(t,x)U_t(t,x), U_t(t,x)\big\rangle + \big\langle(\iota_{U(t,x)}\,dX_o^\flat(x))^\sharp, U_t(t,x)\big\rangle\Big)\,dt. \tag{3.35}$$
Thus $i_a(x)$ must vanish for some x in the interior of M. Now $i_a(x)$ is the index form of the self-adjoint system (3.34), and since the matrix Λ(t, x) is always positive-definite, we can apply the Morse index theorem for systems to conclude that if $i_a(x) = 0$, then there is some solution of (3.34) vanishing at t = 0 and at some t = α ≤ a. See Reid [Re], Theorem V.8.1.

Remark 3.7. As in Remark 3.4, we can also use Eqs. (3.32), or the equivalent (3.33), instead of (3.34).

Example 3.8. Consider M², a two-dimensional disc, sphere, annulus, or torus, with rotationally symmetric metric of the form
$$ds^2 = dr^2 + \varphi^2(r)\,d\theta^2 \tag{3.36}$$
and a vector field
$$X = u(r)\,\partial_\theta. \tag{3.37}$$
Any such X is a steady solution of the Euler equation (2.11), and thus generates a geodesic η given in coordinates by η(t)(r, θ) = (r, θ + t u(r)). These are the simplest nontrivial steady Euler flows. In [P2], the author proved the following theorem.
Theorem 3.9. A geodesic in Dµ(M²) generated by an analytic steady flow X on M² with isolated singularities has nonpositive curvature operator all along it if and only if M² is a disc, sphere, annulus, or torus with a polar coordinate system with metric (3.36) in which X has the form (3.37), and in addition:
• if M² is a torus, ϕ is constant;
• if M² is not a torus, then the function Q(r) = (ϕ u) /u is defined for all r and satisfies the differential inequality
$$\varphi Q' + Q^2 \le 1 \tag{3.38}$$
everywhere.

Nonpositive curvature operator implies, by the Rauch comparison theorem, that the differential of the exponential map satisfies $|d(\exp_{\mathrm{id}})_{tX}(V)| \ge t|V|$ for any t and any $V \in T_{\mathrm{id}}D_\mu(M^2)$; therefore there are no conjugate points (monoconjugate or epiconjugate; see Grossman [G]). Proposition 3.6 gives us a simpler criterion for the nonexistence of monoconjugate points for flows of the form $X = u(r)\,\partial_\theta$. We can prove the following.

Proposition 3.10. Suppose M² is a disc, sphere, annulus, or torus, with metric given in polar coordinates by (3.36), with a vector field X of the form (3.37). Define a radial function by
$$A(r) = 4u(r)^2\varphi'(r)^2 + 2\varphi(r)\varphi'(r)u(r)u'(r). \tag{3.39}$$
If A(r) ≤ 0 for all r, then the geodesic η in Dµ(M²) defined by X has no monoconjugate points. Furthermore, if A(r) > 0 for some r, then the first monoconjugate point along η (if there is one) cannot occur earlier than $\tau = 2\pi/\sqrt{\sup_r A(r)}$.

Proof. Using Remark 3.7, if η(a) is monoconjugate to id along η, then along any circle of constant r, Eqs. (3.32) have a solution with y(0) = 0 and y(a) = 0. With $y(t) = f(t)\,\partial_r + g(t)\,\partial_\theta$ and $z(t) = h(t)\,\partial_r + k(t)\,\partial_\theta$, Eqs. (3.32) take the form
$$\begin{aligned}
\dot f(t) &= h(t), & \dot g(t) &= u'(r)f(t) + k(t),\\
\dot h(t) &= 2u(r)\varphi(r)\varphi'(r)k(t), & \dot k(t) &= -\Big(u'(r) + \frac{2\varphi'(r)u(r)}{\varphi(r)}\Big)h(t).
\end{aligned}$$
The equation for h(t) takes the form
$$\ddot h(t) + A(r)h(t) = 0,$$
and from here we can easily solve to find h, k, f, and g with initial conditions $h(0) = h_o$, $k(0) = k_o$, f(0) = 0, and g(0) = 0. It is not hard to see that these equations have solutions vanishing for t > 0 if and only if we have A(r) > 0, and then the solution vanishes at $2\pi/\sqrt{A(r)}$. Thus if A(r) ≤ 0 for all r, Eq. (3.32) does not have a solution vanishing at two times along any Lagrangian path, and thus the geodesic cannot have any monoconjugate pairs. On the other hand, if η(0) and η(a) are monoconjugate, then for some r, we must have $2\pi/\sqrt{A(r)} = a$. As a result, the first possible monoconjugate point cannot occur before $2\pi/\sqrt{\sup_r A(r)}$.
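The first-order system in the proof of Proposition 3.10 is easy to integrate numerically along a single circle of constant r. The sketch below is illustrative only: the profile ϕ(r) = r, u(r) = 1 + r² on the flat disc, the radius r = 1, and the initial data h_o = 1, k_o = 0 are assumptions chosen for the example, not taken from the paper. It evaluates A(r) from (3.39) and checks that y = (f, g) returns to zero at 2π/√A(r) but not at half that time.

```python
import math

def rk4(f, state, t_end, steps):
    """Integrate state' = f(state) from t = 0 to t_end with classical RK4."""
    h = t_end / steps
    s = list(state)
    for _ in range(steps):
        k1 = f(s)
        k2 = f([x + 0.5 * h * k for x, k in zip(s, k1)])
        k3 = f([x + 0.5 * h * k for x, k in zip(s, k2)])
        k4 = f([x + h * k for x, k in zip(s, k3)])
        s = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(s, k1, k2, k3, k4)]
    return s

# Assumed example: phi(r) = r, u(r) = 1 + r^2, evaluated on the circle r = 1.
r = 1.0
phi, dphi = r, 1.0
u, du = 1.0 + r * r, 2.0 * r

# Radial function A(r) of Eq. (3.39).
A_r = 4.0 * u**2 * dphi**2 + 2.0 * phi * dphi * u * du

def rhs(s):
    """The system for (f, g, h, k) from the proof, at fixed r."""
    f, g, h, k = s
    return [h,
            du * f + k,
            2.0 * u * phi * dphi * k,
            -(du + 2.0 * dphi * u / phi) * h]

t_star = 2.0 * math.pi / math.sqrt(A_r)
start = [0.0, 0.0, 1.0, 0.0]  # f(0) = g(0) = 0, h_o = 1, k_o = 0
end = rk4(rhs, start, t_star, 4000)
mid = rk4(rhs, start, t_star / 2.0, 4000)
print(A_r, t_star, end[:2], mid[:2])
```

For this choice A(1) = 24. At the half time the radial component f vanishes but the angular component g does not, which is why the first common zero of y is at 2π/√A(r) rather than π/√A(r) for this solution.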
Theorem 3.9 and Proposition 3.10 do actually give us distinct criteria for a two-dimensional rotational flow to have no monoconjugate points (and thus to be infinitesimally minimizing along its entire length). For example, if ϕ(r) = r and u(r) ≡ 1 on the disc, then Theorem 3.9 implies that the geodesic has nonpositive curvature operator and thus no conjugate points, while Proposition 3.10 is inconclusive. On the other hand, if ϕ(r) = sin r and u(r) = 1/sin² r on the portion π/4 < r < 3π/4 of the round sphere, then A(r) ≡ 0, so that the geodesic has no monoconjugate points; however, Q(r) = (1 + cos² r)/(2 cos r) is singular at r = π/2 and never satisfies the differential inequality (3.38). Thus the curvature operator along the geodesic is sometimes positive and sometimes negative.
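The two values of A(r) quoted in these examples can be reproduced directly from the definition (3.39). The snippet below is a small sketch under that definition; the function name and the sample radii are choices made here for illustration.

```python
import math

def A(phi, dphi, u, du):
    """A(r) = 4 u^2 (phi')^2 + 2 phi phi' u u', from pointwise values at r."""
    return 4.0 * u**2 * dphi**2 + 2.0 * phi * dphi * u * du

# Flat disc with rigid rotation: phi(r) = r, u(r) = 1, so u' = 0.
disc_values = [A(r, 1.0, 1.0, 0.0) for r in (0.25, 0.5, 1.0)]

def sphere_A(r):
    """Round sphere with phi(r) = sin r and u(r) = 1 / sin^2 r."""
    s, c = math.sin(r), math.cos(r)
    return A(s, c, 1.0 / s**2, -2.0 * c / s**3)

# Sample points strictly inside pi/4 < r < 3 pi/4.
grid = [math.pi / 4 + k * (math.pi / 2) / 50 for k in range(1, 50)]
sphere_max = max(abs(sphere_A(r)) for r in grid)
print(disc_values, sphere_max)
```

On the disc A(r) = 4 > 0 at every radius, so the hypothesis A(r) ≤ 0 fails and Proposition 3.10 gives no conclusion there, while on the spherical band the two terms of A(r) cancel up to rounding error.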
4. The First Conjugate Point

Proposition 4.1. Suppose $\eta$ is a geodesic in $D_\mu(M^3)$. Let $\tau = \inf\{a > 0 \mid \eta(a) \text{ is monoconjugate to } \eta(0) \text{ along } \eta\}$. For each $x$ in the interior of $M^3$, let $\tau(x) = \inf\{a > 0 \mid \text{some solution of Eq. (3.31) vanishes at } t = 0 \text{ and } t = a\}$. Then $\tau = \inf_{x\in \mathrm{int}(M)} \tau(x)$. If $\eta(\tau)$ is itself monoconjugate to $\eta(0)$ along $\eta$, then in addition $\tau(x)$ is constant on $M$.

Proof. Theorem 3.1 implies that $\tau \le \inf_{x\in \mathrm{int}(M)} \tau(x)$, while Proposition 3.6 implies that $\inf_{x\in \mathrm{int}(M)} \tau(x) \le \tau$. This proves the first claim. Now suppose $\eta(\tau)$ is actually monoconjugate to $\eta(0)$. Then there is a Jacobi field $J(t)$ along $\eta$ with $J(0) = 0$ and $J(\tau) = 0$. As in the proof of Proposition 3.6, the left-translation $U(t)$ of this vector field satisfies the equation
$$\int_M i_\tau(x)\, d\mu(x) = 0,$$
where the integrand is
$$i_\tau(x) = \int_0^\tau \Big( \big\langle \Lambda(t,x)U_t(t,x),\, U_t(t,x) \big\rangle + \big\langle \iota_{U(t,x)}\, dX_o(x),\, U_t(t,x) \big\rangle \Big)\, dt.$$
If $i_\tau(x) < 0$ at some point $x$, then since $i_\tau(x)$ is the index form of the self-adjoint system (3.31) in $T_xM$, there must be a solution of (3.31) vanishing at $t = 0$ and $t = a$ for some $a < \tau$. (As before, see Reid [Re].) This implies $\tau(x) \le a < \tau$, which is impossible. Thus $i_\tau(x) \ge 0$ for every $x$ in the interior of $M$. The only way a nonnegative function can integrate to zero is if it identically vanishes, so $i_\tau(x) = 0$ for every $x$. Thus we must have $\tau(x) = \tau$ for every $x$.

If we know anything about the metric pullback $\Lambda$, Eq. (3.31) will be easy to solve, and we can determine the exact location of the first conjugate point. The easiest case is of course when $X$ is a Killing field.
Corollary 4.2. If the geodesic $\eta(t)$ in $D_\mu(M^3)$ consists of isometries, then the Eulerian velocity field $X$ is steady and a Killing field. The infimum of monoconjugate point locations is then
$$\tau = \frac{2\pi}{\sup_M |\operatorname{curl} X|}.$$
If $\eta(\tau)$ is itself monoconjugate to id, then $\operatorname{curl} X$ has constant length on $M$.

Proof. We know that $\eta'(0) = X_o$ is a Killing field by definition, and since any Killing field is a steady solution of the Euler equation (see Misiołek [M1] for the proof), we must have $X(t) = X_o$ for all $t$. Since $\eta(t)$ is always an isometry, the deformation $\Lambda(t,x)$ is always the identity, so that Eq. (3.31) becomes (using $\omega = \operatorname{curl} X$)
$$\frac{d^2u}{dt^2} + \omega(x) \times \frac{du}{dt} = 0,$$
whose solution with $u(0) = 0$ and $u'(0)$ perpendicular to $\omega(x)$ is
$$u(t) = \frac{\sin |\omega(x)|t}{|\omega(x)|}\, u'(0) + \frac{\cos |\omega(x)|t - 1}{|\omega(x)|^2}\, \omega(x) \times u'(0).$$
Thus $\tau(x) = 2\pi/|\omega(x)|$. The rest follows from Proposition 4.1.
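The closed-form solution in the proof of Corollary 4.2 can be sanity-checked numerically. The following sketch uses an illustrative choice $\omega = (0,0,2)$ and $u'(0) = (1,0,0)$ (both hypothetical sample values), evaluates the stated formula for $u(t)$, and measures the residual of $u'' + \omega \times u' = 0$ with central finite differences.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(c * c for c in a))

def jacobi_solution(t, omega, v0):
    # u(t) = sin(|w| t)/|w| * v0 + (cos(|w| t) - 1)/|w|^2 * (omega x v0),
    # valid for u(0) = 0 and u'(0) = v0 perpendicular to omega.
    w = norm(omega)
    s = math.sin(w * t) / w
    c = (math.cos(w * t) - 1.0) / w ** 2
    wxv = cross(omega, v0)
    return tuple(s * v0[i] + c * wxv[i] for i in range(3))

def ode_residual(t, omega, v0, h=1e-4):
    # Norm of u'' + omega x u', using central differences of step h.
    up = jacobi_solution(t + h, omega, v0)
    u0 = jacobi_solution(t, omega, v0)
    um = jacobi_solution(t - h, omega, v0)
    d1 = tuple((up[i] - um[i]) / (2 * h) for i in range(3))
    d2 = tuple((up[i] - 2 * u0[i] + um[i]) / h ** 2 for i in range(3))
    wxd1 = cross(omega, d1)
    return norm(tuple(d2[i] + wxd1[i] for i in range(3)))

omega = (0.0, 0.0, 2.0)   # sample vorticity with |omega| = 2
v0 = (1.0, 0.0, 0.0)      # sample initial velocity, perpendicular to omega
tau = 2.0 * math.pi / norm(omega)
```

With $|\omega| = 2$ the field returns to zero at $\tau = \pi$ and is of unit length at $\tau/2$, matching $\tau(x) = 2\pi/|\omega(x)|$.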
In Corollary 4.2, constant length of the vorticity is necessary for the infimum of conjugate points to be monoconjugate; however, it is not sufficient, as shown by the example given in [EMP]. There, the Killing field $X$ on the solid torus is given in cylindrical coordinates by $X = \partial_\theta$, and the vorticity is $\operatorname{curl} X = 2\,\partial_z$, with constant length. The Jacobi fields can all be computed in terms of curl eigenfields, and the monoconjugate point locations can be expressed in terms of roots of Bessel functions. The infimum of these is $\tau = \pi$, but this is not itself a monoconjugate point location. Instead, this represents an epiconjugate point; the differential of the exponential map is one-to-one, but not closed, and therefore not onto.

As we will see in the next theorem, the first conjugate point along a geodesic in $D_\mu(M^3)$ is always pathological: the differential of the exponential map $(d\exp_{\mathrm{id}})_{\tau X_o}$ either is not closed, which implies that the span of the Jacobi fields excludes an infinite-dimensional space of vectors; or has infinite-dimensional kernel. Thus the first conjugate point is either epiconjugate of infinite order or monoconjugate of infinite order. We will present an example of this latter phenomenon later.

Theorem 4.3. Let $\eta$ be a geodesic in $D_\mu(M^3)$ and let $\tau$ be the infimum of monoconjugate point locations, as in Proposition 4.1. If the differential of the exponential map $(d\exp_{\mathrm{id}})_{\tau X_o}$ has empty or finite-dimensional kernel, then its range is not closed in the $L^2$ norm. Hence it is epiconjugate of infinite order, i.e., there is an infinite-dimensional space in $T_{\eta(\tau)} D_\mu(M^3)$ disjoint from the image of $(d\exp_{\mathrm{id}})_{\tau X_o}$.

Proof. If there are no monoconjugate points, we have nothing to prove. Otherwise, $\tau$ as defined in Proposition 4.1 is finite. For each $\delta > 0$, there is some $x_o$ with $\tau(x_o) < \tau + \delta$. Solutions of the differential equation (3.31) depend continuously on $x$, and thus $\tau(x)$ is a continuous function of $x$ in a neighborhood of $x_o$.
So for all $x$ in some open set $\Omega$ containing $x_o$, we know $\tau(x) < \tau + \delta$.
In this set $\Omega$, we can find a sequence of disjoint open sets $\Omega_n$, and in each one we can construct a “test” Jacobi field $U_n$ as in Theorem 3.1, vanishing at $t = 0$ and $t = \tau + \delta$, with $I_{\tau+\delta}(U_n, U_n) < 0$. Furthermore, since the sets $\Omega_n$ are disjoint, we have $I_{\tau+\delta}(U_n, U_m) = 0$ if $m \ne n$. Thus the space of vector fields vanishing at $0$ and $\tau + \delta$ on which $I_{\tau+\delta}$ is negative-definite is infinite-dimensional. Consequently, we must have infinitely many linearly independent Jacobi fields, each of which vanishes at $t = 0$ and at some $t < \tau + \delta$, as in the proof of the finite-dimensional Morse Index theorem. This happens in one of two ways: either there are infinitely many independent Jacobi fields all satisfying $J(0) = 0$ and $J(\tau) = 0$; or there is a sequence of distinct monoconjugate point locations $\tau_n$ decreasing to $\tau$. In the former case, we are done since $\eta(\tau)$ is then monoconjugate to $\eta(0)$ of infinite order. In the latter case, $(d\exp_{\mathrm{id}})_{\tau X_o}$ is not closed, by a result of Biliotti, Exel, Piccione, and Tausk [BEPT].

Remark 4.4. The result of [BEPT] is applicable in the topology generated by the Riemannian metric, i.e., the $L^2$ topology. One may also ask whether $(d\exp_{\mathrm{id}})_{\tau X_o}$ also has non-closed range in the Sobolev $H^s$ topology. If $M$ has no boundary, one can proceed as in [EMP]; the commutators of the partial derivative operators with $d\exp$ are compact, and thus one can conclude non-closed range in $H^s$ from non-closed range in $L^2$. If $M$ does have a boundary, these commutators may not be compact, so one cannot answer this question (in the same way that Fredholmness in $H^s$ of the exponential map on $D_\mu(M^2)$ is unknown if $M$ has a boundary).

As a result of Theorem 4.3, we can say there always is a “first conjugate point,” either monoconjugate or epiconjugate, found at $\eta(\tau)$, where $\tau = \inf_{x\in \mathrm{int}(M)} \tau(x)$, as in Proposition 4.1.
It is natural to ask whether the monoconjugate point may be of infinite order, and whether the first conjugate point is actually monoconjugate; the example given in [EMP] exhibits neither. To answer this question, we consider the example of $M = S^3$, where the velocity field $X$ is a left-invariant Killing field. Misiołek [M1] showed that the corresponding geodesic in $D_\mu(S^3)$ has a monoconjugate point occurring at $t = \pi$. Using a basis of curl eigenfields on $S^3$, we compute all of the conjugate points in this example. They are quite interesting.

Proposition 4.5. If $M = S^3$ and $X$ is a left-invariant vector field, then along the corresponding geodesic $\eta$ in $D_\mu(S^3)$, the point $\eta(a)$ is monoconjugate to $\eta(0)$ if and only if $a = n\pi/j$ for positive integers $n$ and $j$, with $j \le n$. Each such monoconjugate point has infinite order. In addition, for any $t \ge \pi$, $(d\exp_{\mathrm{id}})_{tX}$ does not have closed range; thus the differential of the exponential map is never Fredholm at or beyond the first conjugate point.

Proof. Let us consider $S^3$ as consisting of the unit quaternions in $\mathbb{R}^4$, with coordinates $(w, x, y, z)$. Then a basis of left-invariant vector fields is
$$e_1 = x\,\partial_w - w\,\partial_x + z\,\partial_y - y\,\partial_z, \quad e_2 = y\,\partial_w - z\,\partial_x - w\,\partial_y + x\,\partial_z, \quad e_3 = z\,\partial_w + y\,\partial_x - x\,\partial_y - w\,\partial_z.$$
These have unit length and satisfy the bracket relations
$$[e_1, e_2] = 2e_3, \qquad [e_2, e_3] = 2e_1, \qquad [e_3, e_1] = 2e_2.$$
Their curls satisfy
$$\operatorname{curl} e_i = -2e_i,$$
and by linearity, every vector $X$ in the Lie algebra of $S^3$ also satisfies $\operatorname{curl} X = -2X$. Thus by (2.14), every such $X$ is a steady solution of the Euler equation. By bi-invariance of the metric on $S^3$, every left-invariant $X$ is also a Killing field, so that the resulting flow consists of isometries. Thus the metric pullback is $\Lambda(t,x) \equiv 1$, and by formula (2.27), the Jacobi equation for $J = D\eta(U)$ takes the form $U_{tt} - 2P(X \times U_t) = 0$. Computing curl of both sides, and using the fact that curl annihilates gradients, we get
$$\operatorname{curl} U_{tt} + 2[X, U_t] = 0. \tag{4.40}$$
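The bracket relations among $e_1, e_2, e_3$ stated in the proof can be verified mechanically. Each $e_a$ is a linear vector field $q \mapsto E_a q$ on $\mathbb{R}^4$ in the coordinates $(w,x,y,z)$, and for linear fields $X(q) = Aq$, $Y(q) = Bq$ the Lie bracket is the linear field with matrix $BA - AB$. The sketch below (matrix names are illustrative) encodes the three fields and computes the brackets with plain integer arithmetic.

```python
# Rows give the (w, x, y, z) components of each field as linear forms in (w, x, y, z).
E1 = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]   # x dw - w dx + z dy - y dz
E2 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]   # y dw - z dx - w dy + x dz
E3 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]   # z dw + y dx - x dy - w dz

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def bracket(A, B):
    # Matrix of [X, Y] for X(q) = A q, Y(q) = B q:
    # [X, Y] = DY.X - DX.Y = (B A - A B) q.
    BA, AB = matmul(B, A), matmul(A, B)
    return [[BA[i][j] - AB[i][j] for j in range(4)] for i in range(4)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def is_antisymmetric(A):
    # Antisymmetry gives q . (A q) = 0, i.e. the field is tangent to S^3.
    return all(A[i][j] == -A[j][i] for i in range(4) for j in range(4))
```

Each matrix is antisymmetric and orthogonal, which is the matrix form of the statements that the fields are tangent to $S^3$ and have unit length there.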
Now we have the general formula
$$\nabla\langle A, B\rangle = \nabla_A B + \nabla_B A + A \times \operatorname{curl} B + B \times \operatorname{curl} A,$$
for any vector fields $A$ and $B$. Since $X$ is a Killing field, $2\nabla_U X = \operatorname{curl} X \times U$, so we have the formula
$$\nabla\langle U, X\rangle = [X, U] + X \times \operatorname{curl} U,$$
which implies upon computing curls that $\operatorname{curl}[X, U] = [X, \operatorname{curl} U]$ for any vector field $U$. Thus $L_X$ and curl commute as operators, so that in the (finite-dimensional) eigenspaces of curl, $L_X$ restricts to an operator from each eigenspace to itself. In the $L^2$ metric, curl is self-adjoint while $L_X$ is anti-self-adjoint. Therefore, there is a basis of the (complexified) space of divergence-free fields on $S^3$, orthonormal in $L^2$, consisting of vector fields $U$ such that
$$\operatorname{curl} U = \lambda U \quad\text{and}\quad [X, U] = i\alpha U \tag{4.41}$$
for some real numbers $\lambda$ and $\alpha$. To find this basis, we first start with the related question of the eigenvalues of $X$ and $\Delta$ on functions. Without loss of generality, we may assume that $X = e_1$ by rotational symmetry. One can compute that the eigenfunctions of the Laplacian consist of restrictions of homogeneous, harmonic polynomials in $\mathbb{R}^4$ having degree some nonnegative integer $k$; for any such $f$, we have $\Delta f = -k(k+2)f$. To find the eigenvalues of the operator $f \mapsto e_1(f)$, we use a toroidal coordinate system: define $(\sigma, \theta, \phi)$ so that
$$w = \cos\sigma\cos\theta, \quad x = \cos\sigma\sin\theta, \quad y = \sin\sigma\cos\phi, \quad z = \sin\sigma\sin\phi.$$
Here $\sigma \in [0, \pi/2]$ while $\theta, \phi \in [0, 2\pi)$. In these coordinates, we can compute that $e_1 = \partial_\theta + \partial_\phi$. Any monomial $P = w^{k_1}x^{k_2}y^{k_3}z^{k_4}$ with $k_1 + k_2 + k_3 + k_4 = k$ can be written in these coordinates as
$$P = \cos^{k_1+k_2}\sigma\, \sin^{k_3+k_4}\sigma\, \cos^{k_1}\theta\, \sin^{k_2}\theta\, \cos^{k_3}\phi\, \sin^{k_4}\phi,$$
and these functions are in turn spanned by the trigonometric functions
$$p = \cos^{k_1+k_2}\sigma\, \sin^{k_3+k_4}\sigma\, e^{im_1\theta} e^{im_2\phi},$$
for some integers $m_1 \in \{k_1+k_2, k_1+k_2-2, \ldots, -(k_1+k_2)\}$ and $m_2 \in \{k_3+k_4, k_3+k_4-2, \ldots, -(k_3+k_4)\}$. For any such $p$, we have $e_1(p) = i(m_1 + m_2)p$, so that the
eigenvalue $(m_1 + m_2)$ takes on every integer value between $-k$ and $k$ with the same odd/even parity as $k$. Thus we have a basis of complex functions $f$ such that $\Delta f = -k(k+2)f$ for some nonnegative integer $k$ and $e_1(f) = imf$ for some integer $m \in \{-k, -k+2, \ldots, k-2, k\}$.

We now construct a convenient basis of curl eigenfields, following Jason Cantarella (personal communication).

(I): $U = \operatorname{curl}^2 S + (k+2)\operatorname{curl} S$ for $S = f e_1$ and $k \ge 2$; then $\operatorname{curl} U = kU$ and $[e_1, U] = imU$.
(II): $U = \operatorname{curl}^2 S + (k+2)\operatorname{curl} S$ for $S = f(e_2 \pm ie_3)$ and $k \ge 2$; then $\operatorname{curl} U = kU$ and $[e_1, U] = i(m \mp 2)U$.
(III): $U = \operatorname{curl}^2 S - k\operatorname{curl} S$ for $S = f e_1$ and $k \ge 0$; then $\operatorname{curl} U = -(k+2)U$ and $[e_1, U] = imU$.
(IV): $U = \operatorname{curl}^2 S - k\operatorname{curl} S$ for $S = f(e_2 \pm ie_3)$ and $k \ge 0$; then $\operatorname{curl} U = -(k+2)U$ and $[e_1, U] = i(m \mp 2)U$.

From these formulas, we see that the eigenvalues $\lambda$ of curl are all integers except $0$ and $\pm 1$; for each such $\lambda$, there is an eigenvalue $i\alpha$ of $L_{e_1}$, where $\alpha$ is an integer with $|\alpha| \le |\lambda|$ and $\alpha$ having the same odd/even parity as $\lambda$. (Although it appears that type (II) violates this rule when $m = \mp k$, it turns out that $U$ vanishes in this case.)

In such a basis, Eq. (4.40) is diagonalized, and the coefficient $c(t)$ of an eigenfield $U$ with (4.41) will satisfy the equation $\lambda c''(t) + 2i\alpha c'(t) = 0$. If $\alpha \ne 0$, the solution with $c(0) = 0$ is
$$c(t) = \frac{\lambda}{\alpha} \sin\Big(\frac{\alpha t}{\lambda}\Big) e^{-i\alpha t/\lambda}\, c'(0),$$
and the corresponding conjugate point occurs at $t = |\lambda|\pi/|\alpha|$. (If $\alpha = 0$, there is no conjugate point obtained.) Since $\lambda$ and $\alpha$ must both be integers, we have shown that all conjugate point locations are of the form $q\pi$ with $q \ge 1$ a rational number. We can obtain all such rational numbers infinitely many times; if $q = n/j$ with $j \le n$ positive integers, then for any positive integer $L$, we can construct a vector field $U$ of type (I) using $k = 2nL$ and $m = 2jL$; then the corresponding Jacobi field vanishes at $q\pi$ for any $L$.

The final claim, that $(d\exp_{\mathrm{id}})_{tX}$ is not closed if $t \ge \pi$, follows from the general result of [BEPT], that the differential of the exponential map is not closed at any limit point of the set of monoconjugate point locations. Since the rational multiples of $\pi$ are dense in $[\pi, \infty)$, the differential of the exponential map is not closed beyond the first conjugate point.

It would be interesting to see if a similar result is true in general; that is, whether the monoconjugate point locations are always dense in some intervals. We conjecture that they are, and expect that the proof involves a similar but more sophisticated approximation of the Jacobi field solution operator as in Theorem 3.1.
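The coefficient equation and the family of type (I) fields used at the end of the proof lend themselves to a quick numerical illustration. The sketch below (function names are illustrative) evaluates the stated solution $c(t)$, checks the residual of $\lambda c'' + 2i\alpha c' = 0$ by central differences, and confirms that the choice $k = 2nL$, $m = 2jL$, with $\lambda = k$ and $\alpha = m$ as for type (I), always places the zero at $n\pi/j$.

```python
import cmath
import math
from fractions import Fraction

def coefficient(t, lam, alpha, dc0=1.0):
    # c(t) = (lam/alpha) sin(alpha t / lam) exp(-i alpha t / lam) c'(0), for alpha != 0
    return (lam / alpha) * math.sin(alpha * t / lam) * cmath.exp(-1j * alpha * t / lam) * dc0

def first_zero(lam, alpha):
    # First positive time at which c vanishes
    return abs(lam) * math.pi / abs(alpha)

def ode_residual(t, lam, alpha, h=1e-4):
    # |lam c'' + 2 i alpha c'| with central finite differences of step h
    cp = coefficient(t + h, lam, alpha)
    c0 = coefficient(t, lam, alpha)
    cm = coefficient(t - h, lam, alpha)
    d1 = (cp - cm) / (2 * h)
    d2 = (cp - 2 * c0 + cm) / h ** 2
    return abs(lam * d2 + 2j * alpha * d1)

def type_I_indices(n, j, L):
    # Indices (k, m) used in the proof for a type (I) field: lam = k, alpha = m
    return 2 * n * L, 2 * j * L

# Sample target q = n/j = 3/2, realised for several values of L
realisations = [type_I_indices(3, 2, L) for L in range(1, 5)]
```

Every pair in `realisations` has the same ratio $k/m = 3/2$, so each gives a distinct eigenfield whose Jacobi field vanishes at the same time $3\pi/2$, which is the mechanism behind the infinite order of each monoconjugate point.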
References

[A] Arnold, V.: Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits. Ann. Inst. Grenoble 16, 319–361 (1966)
[AK] Arnold, V., Khesin, B.: Topological Methods in Hydrodynamics. New York: Springer, 1998
[BEPT] Biliotti, L., Exel, R., Piccione, P., Tausk, D.V.: On the singularities of the exponential map in infinite dimensional Riemannian manifolds. To appear in Math. Ann.
[B] Brenier, Y.: The dual least action problem for an ideal, incompressible fluid. Arch. Rat. Mech. Anal. 122, 323–351 (1993)
[EMa] Ebin, D., Marsden, J.: Groups of diffeomorphisms and the motion of an incompressible fluid. Ann. of Math. 92, 102–163 (1970)
[EMP] Ebin, D., Misiołek, G., Preston, S.C.: Singularities of the exponential map on the volume-preserving diffeomorphism group. To appear in Geom. Funct. Anal.
[FV] Friedlander, S., Vishik, M.M.: Instability criteria for steady flows of a perfect fluid. Chaos 2(3), 455–460 (1992)
[G] Grossman, N.: Hilbert manifolds without epiconjugate points. Proc. Amer. Math. Soc. 16, 1365–1371 (1965)
[M1] Misiołek, G.: Stability of ideal fluids and the geometry of the group of diffeomorphisms. Indiana Univ. Math. J. 42, 215–235 (1993)
[M2] Misiołek, G.: Conjugate points in $D_\mu(T^2)$. Proc. Amer. Math. Soc. 124, 977–982 (1996)
[P1] Preston, S.C.: For ideal fluids, Eulerian and Lagrangian instabilities are equivalent. Geom. Funct. Anal. 14, 1044–1062 (2004)
[P2] Preston, S.C.: Nonpositive curvature on the area-preserving diffeomorphism group. J. Geom. Phys. 53(3), 259–274 (2005)
[Re] Reid, W.T.: Sturmian Theory for Ordinary Differential Equations. New York: Springer-Verlag, 1980
[Ro] Rouchon, P.: The Jacobi equation, Riemannian curvature and the motion of a perfect incompressible fluid. European J. Mech. B Fluids 11(3), 317–336 (1992)
[Sh] Shnirelman, A.: Generalized fluid flows, their approximation and applications. Geom. Funct. Anal. 4, 586–620 (1994)
[Sp] Spivak, M.: A comprehensive introduction to differential geometry. Volume 1, Houston, TX: Publish or Perish, 1999

Communicated by A. Kupiainen
Commun. Math. Phys. 267, 515–542 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0055-8
Renormalization of Non-Commutative $\Phi^4_4$ Field Theory in x Space

Razvan Gurau1, Jacques Magnen2, Vincent Rivasseau1, Fabien Vignes-Tourneret1

1 Laboratoire de Physique Théorique, Bât. 210, CNRS UMR 8627, Université Paris XI, 91405 Orsay Cedex, France. E-mail: {razvan.gurau, vincent.rivasseau, fabien.vignes}@th.u-psud.fr
2 Centre de Physique Théorique, CNRS UMR 7644, Ecole Polytechnique, 91128 Palaiseau Cedex, France. E-mail: [email protected]

Received: 22 December 2005 / Accepted: 27 February 2006
Published online: 15 June 2006 – © Springer-Verlag 2006
Abstract: In this paper we provide a new proof that the Grosse–Wulkenhaar non-commutative scalar $\Phi^4_4$ theory is renormalizable to all orders in perturbation theory, and extend it to more general models with covariant derivatives. Our proof relies solely on a multiscale analysis in x space. We think this proof is simpler. It also allows direct interpretation in terms of the physical positions of the particles and should be better adapted to the future study of these theories (in particular at the non-perturbative or constructive level).

1. Introduction

In this paper we recover the proof of perturbative renormalizability of non-commutative $\Phi^4_4$ field theory [1–3] by a method solely based on x space. In this way we avoid completely the sometimes tedious use of the matrix basis and of the associated special functions of [1–3] and we recover the more physical direct space representation of fields and particles. Moreover our proof works for the optimal range $]0, 1]$ of the parameter $\Omega$, which was restricted to a much smaller interval in a previous proof. We also extend the corresponding BPHZ theorem to the more general complex Langmann-Szabo-Zarembo $\bar\varphi \star \varphi \star \bar\varphi \star \varphi$ model with covariant derivatives, hereafter called the LSZ model. This model has a slightly more complicated propagator, and is exactly solvable in a certain limit [4].

Our method builds upon previous work of Filk and Chepelev-Roiban [5, 6]. These works however remained inconclusive [7], since these authors used the right interaction but not the right propagator, hence the problem of ultraviolet/infrared mixing prevented them from obtaining a finite renormalized perturbation series. The Grosse-Wulkenhaar breakthrough was to realize that the right propagator in non-commutative field theory is not the ordinary commutative propagator, but has to be modified to obey Langmann-Szabo duality [8, 2].

Non-commutative field theories (for a general review see [9]) deserve a thorough and systematic investigation.
Indeed they may be relevant for physics beyond the standard model. They are certainly effective models for certain limits of string theory [10, 11]. Also they form almost certainly the correct framework for a microscopic ab initio understanding of the quantum Hall effect, which is currently lacking. We think that x-space methods are probably more powerful for the future systematic study of the noncommutative Langmann-Szabo covariant field theories.

Fermionic theories such as the two dimensional Gross-Neveu model can be shown to be renormalizable to all orders in their Langmann-Szabo covariant versions, using either the matrix basis or the direct space version developed here [12]. However the x-space version seems the most promising for a complete non-perturbative construction, using Pauli’s principle to control the apparent (fake) divergences of perturbation theory. In the case of $\varphi^4_4$, recall that although the commutative version has been until now fatally flawed due to the famous Landau ghost, there is some hope that the non-commutative field theory treated at the perturbative level in this paper may also exist at the constructive level [13, 14]. Again the x-space formalism is probably better than the matrix basis for a rigorous investigation of this question.

In the first section of this paper we establish the x-space power counting of the theory using the Mehler kernel form of the propagator in direct space given in [15]. In the second section we prove that the divergent subgraphs can be renormalized by counterterms of the form of the initial Lagrangian. The LSZ models are treated in the Appendix. Note that we do not prove here the exact topological power counting for irrelevant graphs. This should be doable with our methods but is not necessary for our theorem.

2. Power Counting in x-Space

2.1. Model, notations. Beware that throughout this paper we will use many different notations for position variables. To avoid any confusion for the reader we summarize these notations at the end of the paper.
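The simplest noncommutative $\varphi^4_4$ theory is defined on $\mathbb{R}^4$ equipped with the associative and noncommutative Moyal product
$$(a \star b)(x) = \int \frac{d^4k}{(2\pi)^4} \int d^4y\; a(x + \tfrac{1}{2}\theta\cdot k)\, b(x+y)\, e^{\imath k\cdot y}. \tag{2.1}$$
The renormalizable action functional introduced in [2] is
$$S[\varphi] = \int d^4x \Big( \frac{1}{2}\partial_\mu\varphi \star \partial^\mu\varphi + \frac{\Omega^2}{2}(\tilde x_\mu\varphi) \star (\tilde x^\mu\varphi) + \frac{1}{2}\mu_0^2\, \varphi \star \varphi + \frac{\lambda}{4!}\, \varphi \star \varphi \star \varphi \star \varphi \Big)(x), \tag{2.2}$$
where $\tilde x_\mu = 2(\theta^{-1})_{\mu\nu}x^\nu$ and the Euclidean metric is used. In four dimensional x-space the propagator is [15]
$$C(x, x') = \int_0^\infty dt\, \frac{\tilde\Omega^2}{[2\pi\sinh\tilde\Omega t]^2}\, e^{-\frac{\tilde\Omega}{2}\coth(\tilde\Omega t)\,(x^2 + x'^2) + \frac{\tilde\Omega}{\sinh\tilde\Omega t}\, x\cdot x' - \mu_0^2 t}, \tag{2.3}$$
where $\tilde\Omega = 2\Omega\theta^{-1}$, and the (cyclically invariant) vertex is [5]
$$V(x_1, x_2, x_3, x_4) = \delta(x_1 - x_2 + x_3 - x_4)\, e^{\imath \sum_{1\le i<j\le 4} (-1)^{i+j+1}\, x_i\theta^{-1}x_j}, \tag{2.4}$$
where we note1 $x\theta^{-1}y \equiv \frac{2}{\theta}(x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3)$.

1 Of course two different $\theta$ parameters could be used for the two symplectic pairs of variables of $\mathbb{R}^4$.

The main result of this paper is a new proof in configuration space of the following theorem.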
Theorem 2.1 (BPHZ Theorem for Noncommutative $\Phi^4_4$ [2, 3]). The theory defined by the action (2.2) is renormalizable to all orders of perturbation theory.

Let $G$ be an arbitrary connected graph. The amplitude associated with this graph is (with self-explaining notations):
$$A_G = \int \prod_{v,\, i=1,\ldots,4} dx_{v,i} \prod_l dt_l\; \prod_v \Big[ \delta(x_{v,1} - x_{v,2} + x_{v,3} - x_{v,4})\, e^{\imath \sum_{i<j} (-1)^{i+j+1}\, x_{v,i}\theta^{-1}x_{v,j}} \Big] \times \prod_l \frac{\tilde\Omega^2}{[2\pi\sinh(\tilde\Omega t_l)]^2}\, e^{-\frac{\tilde\Omega}{2}\coth(\tilde\Omega t_l)\,(x_{v,i(l)}^2 + x_{v',i'(l)}^2) + \frac{\tilde\Omega}{\sinh(\tilde\Omega t_l)}\, x_{v,i(l)}\cdot x_{v',i'(l)} - \mu_0^2 t_l}. \tag{2.5}$$
For each line $l$ of the graph joining positions $x_{v,i(l)}$ and $x_{v',i'(l)}$, we choose an orientation and we define the “short” variable $u_l = x_{v,i(l)} - x_{v',i'(l)}$ and the “long” variable $v_l = x_{v,i(l)} + x_{v',i'(l)}$. With these notations, defining $\tilde\Omega t_l = \alpha_l$, the propagators in our graph can be written as:
$$\int_0^\infty \frac{\tilde\Omega\, d\alpha_l}{[2\pi\sinh(\alpha_l)]^2}\, e^{-\frac{\tilde\Omega}{4}\coth(\frac{\alpha_l}{2})\,u_l^2 - \frac{\tilde\Omega}{4}\tanh(\frac{\alpha_l}{2})\,v_l^2 - \frac{\mu_0^2}{\tilde\Omega}\alpha_l}. \tag{2.6}$$
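The passage from the position form of the propagator exponent to the short/long form (2.6) rests on the identities $\coth\alpha + 1/\sinh\alpha = \coth(\alpha/2)$ and $\coth\alpha - 1/\sinh\alpha = \tanh(\alpha/2)$, together with $x^2 + x'^2 = (u^2+v^2)/2$ and $x\cdot x' = (v^2 - u^2)/4$. The sketch below compares the two exponents numerically at a few sample points; the mass term is omitted since it is identical on both sides, and the sample vectors and the value of $\tilde\Omega$ (called `omega_t`) are arbitrary illustrative choices.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def exponent_positions(x, xp, alpha, omega_t):
    # -(omega_t/2) coth(alpha) (x^2 + x'^2) + omega_t / sinh(alpha) * x.x'
    return (-0.5 * omega_t / math.tanh(alpha) * (dot(x, x) + dot(xp, xp))
            + omega_t / math.sinh(alpha) * dot(x, xp))

def exponent_short_long(x, xp, alpha, omega_t):
    # -(omega_t/4) coth(alpha/2) u^2 - (omega_t/4) tanh(alpha/2) v^2
    u = [a - b for a, b in zip(x, xp)]   # short variable
    v = [a + b for a, b in zip(x, xp)]   # long variable
    return (-0.25 * omega_t / math.tanh(alpha / 2) * dot(u, u)
            - 0.25 * omega_t * math.tanh(alpha / 2) * dot(v, v))

x = (0.3, -1.2, 0.7, 2.0)     # sample point in R^4
xp = (1.1, 0.4, -0.5, 0.9)    # second sample point
omega_t = 1.7                 # sample value standing in for tilde-Omega
alphas = [0.01, 0.5, 3.0]
pairs = [(exponent_positions(x, xp, a, omega_t),
          exponent_short_long(x, xp, a, omega_t)) for a in alphas]
```

For small $\alpha$ the coefficient of $u^2$ grows like $1/\alpha$ while that of $v^2$ shrinks like $\alpha$, which is the origin of the names “short” and “long”.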
2.2. Orientation and position routing. A rule to solve the δ functions at every vertex is a “position routing” exactly analogous to a momentum routing in the ordinary commutative case, except for the additional difficulty of the cyclic signs, which force us to orient the lines. It is well known that there is no canonical such routing, but there is a routing associated to any choice of a spanning tree in $G$. Such a tree choice is also useful to orient the lines of the graph, hence to fix the exact sign definition of the “short” line variables $u_l$, and to optimize the multiscale power counting bounds below.

Let $n$ be the number of vertices of $G$, $N$ the number of its external fields, and $L$ the number of internal lines of $G$. We have $L = 2n - N/2$. Let $T$ be a rooted tree in the graph (when the graph is not a vacuum graph it is convenient to choose for the root a vertex with external fields but this is not essential). We orient first all the lines of the tree and all the remaining half-loop lines or “loop fields”, following the cyclicity of the vertices. This means that starting from an arbitrary orientation of a first field at the root and inductively climbing into the tree, at each vertex we follow the cyclic order to alternate entering and exiting lines as in Fig. 1.

Every line of the tree by definition of this orientation has one end exiting a vertex and another entering another one. This may not be true for the loop lines, which join two “loop fields”. Among these, some exit one vertex and enter another; they are called well-oriented. But others may enter or exit at both ends. These loop lines are subsequently referred to as “clashing lines”. If there are no clashing lines, the graph is called orientable. If not, it is called non-orientable. We will see below that non-orientable graphs are irrelevant in the renormalization group sense.
In fact they do not occur at all in some particular models such as the LSZ model treated in the Appendix, or in the most natural noncommutative Gross-Neveu models [12].

For all the well-oriented lines (hence all tree propagators plus some of the loop propagators) we define in the natural way $u_l = x_{v,i(l)} - x_{v',i'(l)}$ if the line enters at $x_{v,i(l)}$ and exits from $x_{v',i'(l)}$. Finally we fix an additional (completely arbitrary) auxiliary orientation for all the clashing loop lines, and fix in the same way $u_l = x_{v,i(l)} - x_{v',i'(l)}$ with respect to this auxiliary orientation.

Fig. 1. Orientation of a tree

It is also convenient to define the set of “branches” associated to the rooted tree $T$. There are $n - 1$ such branches $b(l)$, one for each of the $n - 1$ lines $l$ of the tree, plus the full tree itself, called the root branch, and noted $b_0$. Each such branch is made of the subgraph $G_b$ containing all the vertices “above $l$” in $T$, plus the tree lines and loop lines joining these vertices. It has also “external fields” which are the true external fields hooked to $G_b$, plus the loop fields in $G_b$ for the loops with one end (or “field”) inside and one end outside $G_b$, plus the upper end of the tree line $l$ itself to which $b$ is associated. In the particular case of the root branch, $G_{b_0} = G$ and the external fields for that branch are simply all true external fields. We call $X_b$ the set of all external fields $f$ of $b$.

We can now describe the position routing associated to $T$. There are $n$ δ functions in (2.5), hence $n$ linear equations for the $4n$ positions, one for each vertex. The position routing associated to the tree $T$ solves this system by passing to another equivalent system of $n$ linear equations, one for each branch of the tree. This equivalent system is obtained by summing the arguments of the δ functions of the vertices in each branch. Obviously the Jacobian of this transformation is 1, so we simply get another equivalent set of $n$ δ functions, one for each branch.

Let us describe more precisely the positions summed in these branch equations, using the orientation. Fix a particular branch $G_b$, with its subtree $T_b$. In the branch sum we find a sum over all the $u_l$ short parameters of the lines $l$ in $T_b$ and no $v_l$ long parameters since $l$ both enters and exits the branch. This is also true for the set $L_b$ of well-oriented loop
lines with both fields in the branch. For the set $L_{b,+}$ of clashing loop lines with both fields entering the branch, the short variable disappears and the long variable remains; the same is true but with a minus sign for the set $L_{b,-}$ of clashing loop lines with both fields exiting the branch. Finally we find the sum of positions of all external fields for the branch (with the signs according to entrance or exit). For instance in the particular case of Fig. 2, the delta function is
$$\delta\big(u_{l_1} + u_{l_2} + u_{l_3} + u_{L_1} + u_{L_3} - v_{L_2} + X_1 - X_2 + X_3 + X_4\big). \tag{2.7}$$
The position routing is summarized by:

Lemma 2.1 (Position Routing). We have, calling $I_G$ the remaining integrand in (2.5):
$$A_G = \int \Big[\prod_v \delta(x_{v,1} - x_{v,2} + x_{v,3} - x_{v,4})\Big]\, I_G(\{x_{v,i}\}) = \int \prod_b \delta\Big( \sum_{l\in T_b\cup L_b} u_l + \sum_{l\in L_{b,+}} v_l - \sum_{l\in L_{b,-}} v_l + \sum_{f\in X_b} \epsilon(f)\, x_f \Big)\, I_G(\{x_{v,i}\}), \tag{2.8}$$
where $\epsilon(f)$ is $\pm 1$ depending on whether the field $f$ enters or exits the branch.

Using the above equations one can at least solve all the long tree variables $v_l$ in terms of external variables, short variables and long loop variables, using the $n - 1$ non-root branches. To this end, recall that the unique $X_i$ which is at the upper end of each tree line should be written in (2.7) as $\frac{1}{2}(v_l \pm u_l)$. There remains then the root branch δ function. If $G_b$ is orientable, this δ function of branch $b_0$ contains only short and external variables, since $L_{b,+}$ and $L_{b,-}$ are empty. If $G_b$ is non-orientable one can solve for an additional “clashing” long loop variable.

Fig. 2. A branch

We can summarize these observations in the following lemma:

Lemma 2.2. The position routing solves any long tree variable $v_l$ as a function of:
• the short tree variable $u_l$ of the line $l$ itself,
• the short tree and loop variables with both ends in $G_{b(l)}$,
• the long loop variables of the clashing loops with both ends in $G_{b(l)}$ (if any),
• the short and long variables of the loop lines with one end inside $G_{b(l)}$ and the other outside,
• the true external variables $x$ hooked to $G_{b(l)}$.
The last equation corresponding to the root branch is particular. In the orientable case it does not contain any long variable, but gives a linear relation among the short variables and the external positions. In the non-orientable case it gives a linear relation between the long variables $w$ of all the clashing loops in the graph, some short variables $u$, and all the external positions. From now on, each time we use this lemma to solve the long tree variables $v_l$ in terms of the other variables, we shall call $w_l$ rather than $v_l$ the remaining $n + 1 - N/2$ independent long loop variables. Hence, by looking at the names of the long variables, the reader can check whether Lemma 2.2 has been used or not.
2.3. Multiscale analysis and crude power counting. In this section we follow the standard procedure of multiscale analysis [16]. First the parametric integral for the propagator is sliced in the usual way:
$$C(u, v) = C^0(u, v) + \sum_{i=1}^{\infty} C^i(u, v), \tag{2.9}$$
with
$$C^i(u, v) = \int_{M^{-2i}}^{M^{-2(i-1)}} \frac{\tilde\Omega\, d\alpha}{[2\pi\sinh\alpha]^2}\, e^{-\frac{\tilde\Omega}{4}\coth(\frac{\alpha}{2})\,u^2 - \frac{\tilde\Omega}{4}\tanh(\frac{\alpha}{2})\,v^2 - \frac{\mu_0^2}{\tilde\Omega}\alpha}. \tag{2.10}$$

Lemma 2.3. For some constants $K$ (large) and $c$ (small):
$$C^i(u, v) \le K M^{2i}\, e^{-c[M^i\|u\| + M^{-i}\|v\|]} \tag{2.11}$$
(which a posteriori justifies the terminology of “long” and “short” variables).

The proof is elementary. For $i \ge 1$, it relies only on second order approximation of the hyperbolic functions near the origin. This bound is also true for the first slice $i = 0$ with $K$ depending on $\mu_0$.

Taking absolute values, hence neglecting all oscillations, leads to the following crude bound:
$$|A_G| \le \sum_\mu \int \prod_l du_l\, dv_l\; C^{i_l}(u_l, v_l) \prod_v \delta_v, \tag{2.12}$$
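The prefactor $M^{2i}$ in Lemma 2.3 can be illustrated at the single point $u = v = 0$, where the slice integral is explicit. Dropping the mass term (which only decreases the integrand) and using $\int d\alpha/\sinh^2\alpha = -\coth\alpha$, one gets $C^i(0,0) \le \frac{\tilde\Omega}{4\pi^2}\,[\coth(M^{-2i}) - \coth(M^{-2(i-1)})]$. The sketch below evaluates this closed form, cross-checks it against a midpoint quadrature, and compares it with $M^{2i}$; the values $M = 2$ and $\tilde\Omega = 1$ are arbitrary illustrative choices, and this is a check at one point, not a proof of the lemma.

```python
import math

def slice_at_origin(i, M=2.0, omega_t=1.0):
    # Closed form of the slice integral at u = v = 0 with mu_0 = 0,
    # using the antiderivative -coth(alpha) of 1/sinh(alpha)^2.
    a = M ** (-2 * i)
    b = M ** (-2 * (i - 1))
    coth = lambda s: 1.0 / math.tanh(s)
    return omega_t / (4 * math.pi ** 2) * (coth(a) - coth(b))

def slice_at_origin_quadrature(i, M=2.0, omega_t=1.0, steps=20000):
    # Midpoint rule on [M^{-2i}, M^{-2(i-1)}] as an independent cross-check.
    a = M ** (-2 * i)
    b = M ** (-2 * (i - 1))
    h = (b - a) / steps
    total = sum(1.0 / math.sinh(a + (j + 0.5) * h) ** 2 for j in range(steps))
    return omega_t / (4 * math.pi ** 2) * total * h

def ratio_to_scale(i, M=2.0, omega_t=1.0):
    # C^i(0,0) / M^{2i}: stays bounded as i grows, consistent with (2.11).
    return slice_at_origin(i, M, omega_t) / M ** (2 * i)

ratios = [ratio_to_scale(i) for i in range(1, 6)]
```

Since $\coth s - 1/s$ is increasing, the bracket is bounded by $M^{2i} - M^{2(i-1)}$, so each ratio stays below $\tilde\Omega/(4\pi^2)$.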
where $\mu$ is the standard assignment of an integer index $i_l$ to each propagator of each internal line $l$ of the graph $G$, which represents its “scale”. We will consider only amputated graphs. Therefore we have no external propagators, but only external vertices of the graph; in the renormalization group spirit, the convenient convention is to assign all external indices of these external fields to a fictitious $-1$ “background” scale.

To any assignment $\mu$ and scale $i$ are associated the standard connected components $G^i_k$, $k = 1, \ldots, k(i)$ of the subgraph $G^i$ made of all lines with scales $j \ge i$. These tree components are partially ordered according to their inclusion relations and the (abstract) tree describing these inclusion relations is called the Gallavotti-Nicolò tree [17]; its nodes are the $G^i_k$’s and its root is the complete graph $G$ (see Fig. 3). More precisely for an arbitrary subgraph $g$ one defines:
$$i_g(\mu) = \inf_{l\in g} i_l(\mu), \qquad e_g(\mu) = \sup_{l \text{ external line of } g} i_l(\mu). \tag{2.13}$$
The subgraph $g$ is a $G^i_k$ for a given $\mu$ if and only if $i_g(\mu) \ge i > e_g(\mu)$. As is well known in the commutative field theory case, the key to optimize the bound over spatial integrations is to choose the real tree $T$ compatible with the abstract Gallavotti-Nicolò tree, which means that the restriction $T^i_k$ of $T$ to any $G^i_k$ must still span $G^i_k$. This is always possible (by a simple induction from leaves to root). In Fig. 3a, an example of such a compatible tree is given with bold lines. We pick such a compatible tree $T$ and use it both to orient the graph as in the previous section and to solve the associated branch system of δ functions according to Lemma 2.2. We obtain:
$$|A_{G,\mu}| \le K^n \int \prod_l du_l\, dv_l\; M^{2i_l}\, e^{-c[M^{i_l}\|u_l\| + M^{-i_l}\|v_l\|]} \prod_b \delta_b \le K^n \int \prod_l du_l\, dw_l\; M^{2i_l}\, e^{-c[M^{i_l}\|u_l\| + M^{-i_l}\|v_l(u,w,x)\|]}\, \delta_{b_0}. \tag{2.14}$$
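The connected components $G^i_k$ defined above are simple to compute for a concrete scale assignment. The sketch below is a minimal union-find implementation on a hypothetical four-vertex graph (the graph, its scales, and all function names are illustrative, not the example of Fig. 3); each internal line is a triple (vertex, vertex, scale), and external fields are left implicit since they carry the background scale $-1$ and never belong to any $G^i$ with $i \ge 0$.

```python
def components_at_scale(lines, i):
    """Connected components of the subgraph made of lines with scale >= i.

    lines: list of (vertex_a, vertex_b, scale). Returns a list of sets of
    line indices, one set per component, ordered by smallest line index.
    """
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    selected = [(idx, a, b) for idx, (a, b, s) in enumerate(lines) if s >= i]
    for _, a, b in selected:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(a)] = find(b)           # merge the two endpoints

    components = {}
    for idx, a, _ in selected:
        components.setdefault(find(a), set()).add(idx)
    return sorted(components.values(), key=min)

def is_nested(lines, max_scale):
    # Every component at scale i sits inside some component at scale i - 1,
    # which is what makes the inclusion relations form a tree.
    for i in range(1, max_scale + 1):
        coarse = components_at_scale(lines, i - 1)
        for fine in components_at_scale(lines, i):
            if not any(fine <= c for c in coarse):
                return False
    return True

# Hypothetical phi^4 graph: n = 4 vertices, N = 4 external fields, L = 6 lines.
lines = [(0, 1, 3), (0, 1, 3), (1, 2, 1), (2, 3, 2), (2, 3, 2), (3, 0, 1)]
```

In this example two separate components exist at scale 2 and merge into the full graph at scale 1, so the corresponding Gallavotti-Nicolò tree has a root with two children, one of which persists up to scale 3.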
The key observation is that any long variable integrated at scale $i$ costs $K M^{4i}$ whereas any short variable integrated at scale $i$ brings $K M^{-4i}$, and the variables “solved” by the δ functions bring or cost nothing. For an orientable graph the optimal solution is easy: we should solve the $n - 1$ long variables $v_l$ of the tree propagators in terms of the other variables, because this is the maximal number of long variables that we can solve, and they have the highest possible indices because $T$ has been chosen compatible with the Gallavotti-Nicolò tree structure. Finally we still have the last $\delta_{b_0}$ function (equivalent to the overall momentum conservation in the commutative case). It is optimal to use it to solve one external variable (if any) in terms of all the short variables and the external ones. Since external variables are typically smeared against unit scale test functions, this leaves power counting invariant.2

The non-orientable case is slightly more subtle. We remarked that in this case the system of branch equations allows us to solve $n$ long variables as a function of all the others.

2 In the case of a vacuum graph, there are no external variables and we must therefore use the last $\delta_{b_0}$ function to solve the lowest possible short variable in terms of all others. In this way, we lose the $M^{-4i}$ factor for this short integration. This is why the power counting of a vacuum graph at scale $i$ is not given by the usual formula $M^{(4-N)i} = M^{4i}$ below at $N = 0$, but is in $M^{8i}$, hence worse by $M^{4i}$. This is of course still much better than the commutative case, because in that case and in the analogous conditions, that is without a fixed internal point, vacuum graphs would be worse than the others by an . . . infinite factor, due to translation invariance! In any case vacuum graphs are absorbed in the normalization of the theory, hence play no role in the renormalization.
Fig. 3. (a) A ϕ⁴ graph (b) Example of scale attribution (c) The “Gallavotti-Nicolò” tree
Should we always choose these n long variables as the n − 1 long tree variables plus one long loop variable? This is not always the optimal choice. Indeed when several disjoint $G^i_k$ subgraphs are non-orientable it is better to solve more long clashing loop variables, essentially one per disjoint non-orientable $G^i_k$, because they spare higher costs than if tree lines were chosen instead. We now describe the optimal procedure, using words rather than equations to facilitate the reader’s understanding.
Let C be the set of all the clashing loop lines. Each clashing loop line has a certain scale i, and therefore belongs to one and only one $G^i_k$ and consequently to all $G^j_{k'} \supset G^i_k$. We now define the set S of n long variables to be solved via the δ functions. First we put in S all the n − 1 long tree variables $v_l$. Then we scan all the connected components $G^i_k$ starting from the leaves towards the root, and we add a clashing line to S each time some new non-orientable component $G^i_k$ appears. We also remove p − 1 tree lines from S each time p ≥ 2 non-orientable components merge into a single one. In the end we obtain a new set S of exactly n long variables.

More precisely, suppose some $G^i_k$ at scale i is a “non-orientable leaf”, which means that it contains some clashing lines at scale i but none at scales j > i. We then choose one (arbitrary) such clashing line and put it in the set S. Once a clashing line is added to S in this way it is never removed, and no other clashing line is chosen in any of the $G^j_{k'}$ at lower scales j < i to which the chosen line belongs. (The reader should be aware that this process nevertheless allows several clashing lines of S to belong to a single $G^i_k$, provided they were added to different connected components at upper scales.) When p ≥ 2 non-orientable components merge at scale i into a single non-orientable $G^i_k$, we can find p − 1 lines in the part of the tree $T^i_k$ joining them together (e.g. taking them among the first lines on the unique paths in T from these p components towards the root) and remove them from S. We see that if we have added in all q clashing lines to the set S, we have eliminated q − 1 tree lines. The final set S thus obtained has exactly n elements. The non-trivial statement is that, thanks to the inductive use of Lemma 2.2 in each $G^i_k$, we can solve all the long variables in the set S with the branch system of δ functions associated to T. We now perform all remaining integrations.
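The counting part of the procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the proof: the class `Node`, its flag `clashing_here` and the functions `scan` and `size_of_S` are hypothetical names introduced here, and the Gallavotti-Nicolò tree is assumed to be given as nested components. The sketch does not select actual lines; it only counts how many clashing lines enter S and how many tree lines leave it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One connected component G^i_k (hypothetical encoding of the tree)."""
    clashing_here: bool  # True if it contains a clashing line at its own scale i
    children: List["Node"] = field(default_factory=list)  # components at higher scales


def scan(node: Node) -> Tuple[bool, int, int]:
    """Scan from the leaves to `node`.

    Returns (is_non_orientable, clashing_lines_added, tree_lines_removed).
    """
    added = removed = 0
    p = 0  # number of non-orientable components merging at this node
    for child in node.children:
        non_orientable, a, r = scan(child)
        added += a
        removed += r
        p += 1 if non_orientable else 0
    if p == 0 and node.clashing_here:
        added += 1        # a new non-orientable component appears: add one clashing line
    elif p >= 2:
        removed += p - 1  # p non-orientable components merge: remove p - 1 tree lines
    return (p > 0 or node.clashing_here), added, removed


def size_of_S(root: Node, n: int) -> int:
    """Start from the n - 1 tree variables, then apply the additions and removals."""
    _, added, removed = scan(root)
    return (n - 1) + added - removed


# A made-up example: two non-orientable leaves that merge at the root.
leaf_a = Node(clashing_here=True)
leaf_b = Node(clashing_here=True)
leaf_c = Node(clashing_here=False)
middle = Node(clashing_here=False, children=[leaf_b, leaf_c])
root = Node(clashing_here=False, children=[leaf_a, middle])
print(size_of_S(root, n=7))
```

In this example q = 2 clashing lines are added and q − 1 = 1 tree line is removed, so the count returns n, while a tree without any clashing line keeps the n − 1 tree variables.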
This spares the corresponding $M^{4i}$ integration cost for each long variable in S. For any line not in S we see that the net power counting is 1, since the cost of the long variable integration exactly compensates the gain of the short variable integration. But for any line in S we earn the $M^{-4i}$ power counting of the corresponding short variable u without paying the $M^{4i}$ cost of the long variable. Gathering all the corresponding factors together with the propagators’ prefactors $M^{2i}$ leads to the following bound:

$$|A_{G,\mu}| \le K^n \prod_l M^{2i_l} \prod_{l \in S} M^{-4i_l}. \tag{2.15}$$
Remark that if the graph is well-oriented this formula remains true but the set S consists of only the n − 1 tree lines. In the usual way of [16] we write

$$\prod_l M^{2i_l} = \prod_l \prod_{i=1}^{i_l} M^2 = \prod_{i,k} \prod_{l \in G^i_k} M^2 = \prod_{i,k} M^{2l(G^i_k)} \tag{2.16}$$

and

$$\prod_{l \in S} M^{-4i_l} = \prod_{l \in S} \prod_{i=1}^{i_l} M^{-4} = \prod_{i,k} \prod_{l \in G^i_k \cap S} M^{-4}, \tag{2.17}$$

and we must now only count the number of elements in $G^i_k \cap S$.
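The rewriting in (2.16) rests on a simple counting fact: a line of scale $i_l$ belongs to exactly one component $G^i_k$ for each $i \le i_l$. A minimal numerical check of the resulting identity between exponents of M is sketched below; the function names and the list of scales are illustrative assumptions, and scales are taken to be integers starting at 1.

```python
def exponent_per_line(scales):
    """Exponent of M in prod_l M^(2 i_l): sum over lines of 2 * i_l."""
    return sum(2 * i_l for i_l in scales)


def exponent_per_slice(scales):
    """Exponent of M in prod_{i,k} M^(2 l(G^i_k)).

    For each slice i, the lines of all the components G^i_k taken together
    are exactly the lines with scale i_l >= i.
    """
    top = max(scales)
    return sum(2 * sum(1 for i_l in scales if i_l >= i)
               for i in range(1, top + 1))


scales = [3, 1, 4, 1, 5, 2]  # made-up scale attribution for six lines
print(exponent_per_line(scales), exponent_per_slice(scales))
```

Both functions return the same exponent, which is the content of (2.16); the same argument restricted to the lines of S gives (2.17).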
If $G^i_k$ is orientable, it contains no clashing lines, hence $G^i_k \cap S = T^i_k$, and the cardinal of $T^i_k$ is $n(G^i_k) - 1$. If $G^i_k$ contains one or more clashing lines and p clashing lines $l_1, \dots, l_p$ in $G^i_k$ have been chosen to belong to S, then p − 1 tree variables in $T^i_k$ have also been removed from S and $G^i_k \cap S = T^i_k \cup \{l_1, \dots, l_p\} - \{p - 1 \text{ tree variables}\}$, hence the cardinal of $G^i_k \cap S$ is $n(G^i_k)$. Using the fact that $2l(G^i_k) - 4n(G^i_k) = -N(G^i_k)$ we can summarize these results in the following lemma:

Lemma 2.4. The following bound holds for a connected graph (with external arguments integrated against fixed smooth test functions):

$$|A_{G,\mu}| \le K^n \prod_{i,k} M^{-\omega(G^i_k)} \tag{2.18}$$

for some (large) constant K, with $\omega(G^i_k) = N(G^i_k) - 4$ if $G^i_k$ is orientable and $\omega(G^i_k) = N(G^i_k)$ if $G^i_k$ is non-orientable.

This lemma is optimal if vertices’ oscillations are not taken into account, and proves that non-orientable subgraphs are irrelevant. But it is not yet sufficient for a renormalization theorem to all orders of perturbation.

2.4. Improved power counting. Recall that for any non-commutative Feynman graph G we can define the genus of the graph, called g, and the number of faces “broken by external legs”, called B [2, 3]. We have g ≥ 0 and B ≥ 1. The power counting established with the matrix basis in [2, 3], rewritten in the language of this paper,³ is:

$$\omega(G) = N - 4 + 8g + 4(B - 1), \tag{2.19}$$

hence we must (and can) renormalize only 2 and 4 point subgraphs with g = 0 and B = 1, which we call planar regular. They are the only non-vacuum graphs with ω ≤ 0. In the previous section we established that

$$\omega(G) \ge N - 4 \ \text{ if } G \text{ orientable}, \qquad \omega(G) \ge N \ \text{ if } G \text{ non-orientable}. \tag{2.20}$$
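As a quick sanity check of the statement that only planar regular 2 and 4 point subgraphs have ω ≤ 0, formula (2.19) can be tabulated directly. The sketch below is illustrative only: the function names and the scanned ranges of N, g and B are arbitrary choices made here, and N is taken even, as it must be for a ϕ⁴ graph.

```python
def omega(N, g, B):
    """Convergence degree of (2.19): N - 4 + 8g + 4(B - 1)."""
    return N - 4 + 8 * g + 4 * (B - 1)


def divergent_classes(max_N=8, max_g=2, max_B=3):
    """All (N, g, B) in the scanned range with omega <= 0 (non-vacuum, N even)."""
    return [(N, g, B)
            for N in range(2, max_N + 1, 2)
            for g in range(max_g + 1)
            for B in range(1, max_B + 1)
            if omega(N, g, B) <= 0]


def gap_to_orientable_bound(N, g, B):
    """How far (2.19) lies above the orientable bound N - 4 of (2.20)."""
    return omega(N, g, B) - (N - 4)


print(divergent_classes())
print(gap_to_orientable_bound(6, 1, 2))
```

The gap 8g + 4(B − 1) returned by `gap_to_orientable_bound` is what the bound (2.20) does not capture for orientable graphs with g ≥ 1 or B ≥ 2.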
It is easy to check that planar regular subgraphs are orientable, but the converse is not true. Hence proving that orientable non-planar subgraphs or orientable planar subgraphs with B ≥ 2 are irrelevant requires using a bit of the vertices’ oscillations to improve Lemma 2.4 and get:

Lemma 2.5. For orientable subgraphs with g ≥ 1 we have

$$\omega(G) \ge N + 4. \tag{2.21}$$

For orientable subgraphs with g = 0 and B ≥ 2 we have

$$\omega(G) \ge N. \tag{2.22}$$

³ Beware that the factor i in [3] is now 2i, and that the ω used here is the convergence rather than divergence degree. Hence there is both a sign change and a factor 2 of difference between the ω’s of this paper and the ones of [3].
This lemma, although still not giving (2.19), is sufficient for the purpose of this paper. For instance it implies directly that graphs which contain only irrelevant subgraphs in the sense of (2.19) have finite amplitudes uniformly bounded by $K^n$, using the standard method of [16] to bound the assignment sum over μ in (2.12). The rest of this subsection is essentially devoted to the proof of Lemma 2.5.

We return before solving δ functions, hence to the v variables. We will need only to compute in a precise way the oscillations which are quadratic in the long variables v to prove (2.21) and the linear oscillations in $v\theta^{-1}x$ to prove (2.22). Fortunately an analogous problem was solved in momentum space by Filk and Chepelev-Roiban [5, 6], and we need only a slight adaptation of their work to position space. In fact in this subsection short variables are quite inessential, but it is convenient to treat on the same footing the long $\frac{1}{2}v$ and the external x variables, so we introduce a new global notation y for all these variables. The vertices are rewritten as

$$\prod_v \delta\Big(y_1 - y_2 + y_3 - y_4 + \frac{1}{2}\sum_i \epsilon_i u_i\Big)\, e^{\imath\left(\sum_{i<j}(-1)^{i+j+1} y_i\theta^{-1}y_j + yQu + uRu\right)} \tag{2.23}$$

for some inessential signs $\epsilon_i$ and some symplectic matrices Q and R. Since we are not interested in the precise oscillations in the short u variables we will denote in the sequel quite sloppily by $E_u$ any linear combination of the u variables.

Let us consider the first Filk reduction [5], which contracts tree lines of the graph. It creates progressively generalized vertices with an even number of fields. At a given induction step and for a tree line joining two such generalized vertices with respectively p and q − p + 1 fields (p is even and q is odd), we assume by induction that the two vertices are

$$\delta(y_1 - y_2 + y_3 \dots - y_p + E_u)\,\delta(y_p - y_{p+1} + \dots - y_q + E_u)\; e^{\imath\left(\sum_{1\le i<j\le p}(-1)^{i+j+1}y_i\theta^{-1}y_j + \sum_{p\le i<j\le q}(-1)^{i+j+1}y_i\theta^{-1}y_j + yQu + uRu\right)}. \tag{2.24}$$

Using the second δ function we see that:

$$y_p = y_{p+1} - y_{p+2} + \dots + y_q - E_u. \tag{2.25}$$

Substituting this expression in the first δ function we get:

$$\delta(y_1 - y_2 + \dots - y_{p+1} + \dots - y_q + E_u)\,\delta(y_p - y_{p+1} + \dots - y_q + E_u)\; e^{\imath\left(\sum_{1\le i<j\le p}(-1)^{i+j+1}y_i\theta^{-1}y_j + \sum_{p\le i<j\le q}(-1)^{i+j+1}y_i\theta^{-1}y_j + yQu + uRu\right)}. \tag{2.26}$$

The quadratic terms which include $y_p$ in the exponential are (taking into account that p is an even number):

$$\sum_{i=1}^{p-1}(-1)^{i+1} y_i\theta^{-1}y_p + \sum_{j=p+1}^{q}(-1)^{j+1} y_p\theta^{-1}y_j. \tag{2.27}$$

Using the expression (2.25) for $y_p$ we see that the second term gives only terms in yLu. The first term yields:

$$\sum_{i=1}^{p-1}\sum_{j=p+1}^{q}(-1)^{i+1+j+1} y_i\theta^{-1}y_j = \sum_{i=1}^{p-1}\sum_{k=p}^{q-1}(-1)^{i+k+1} y_i\theta^{-1}y_k, \tag{2.28}$$
which reconstitutes the crossed terms, and we have recovered the inductive form of the larger generalized vertex. One should be aware that $y_p$ has disappeared from the final result, but that all the subsequent $y_s$ with s > p have changed sign. This complication arises because of the cyclicity of the vertex. As p was chosen to be even (which implies q odd) we see that q − 1 is even as it should be. Consequently by this procedure we will always treat only even vertices. We finally rewrite the product of the two vertices as:

$$\delta(y_1 - y_2 + \dots + y_{p-1} - y_{p+1} + \dots - y_q + E_u)\,\delta(y_p - y_{p+1} + \dots - y_q + E_u) \times e^{\imath\left(\sum_{1\le i<j\le q}(-1)^{i+j+1}y_i\theta^{-1}y_j + yQu + uRu\right)}, \tag{2.29}$$

where the exponential is written in terms of the reindexed vertex variables. In this way we can contract all lines of a spanning tree T and reduce G to a single vertex with “tadpole loops” called a “rosette graph” [6]. In this rosette keeping track of cyclicity is essential, so rather than the “point-like” vertex of [6] we prefer to draw the rosette as a cycle (which is the border of the former tree) bearing loop lines on it (see Fig. 4). Remark that the rosette can also be considered as a big vertex, with r = 2n + 2 fields, of which N are external fields with external variables x and 2n + 2 − N are loop fields for the corresponding n + 1 − N/2 loops. When the graph is orientable (which is the case to consider in Lemma 2.5), the fields alternately enter and exit, and correspond to the fields on the border of the tree T, which we meet turning around counterclockwise in Fig. 1. In the rosette the long variables $y_l$ for l in T have disappeared. Let us call z the set of remaining long loop and external variables. Then the rosette vertex factor is

$$\delta(z_1 - z_2 + \dots - z_r + E_u)\, e^{\imath\left(\sum_{1\le i<j\le r}(-1)^{i+j+1}z_i\theta^{-1}z_j + zQu + uRu\right)}.$$
(2.30) The initial product of δ functions has not disappeared so we can still write it as a product over branches like in the previous section and use it to solve the yl variables in terms of the z variables and the short u variables. The net effect of the Filk first reduction was simply to rewrite the root branch δ function and the combination of all vertices oscillations (using the other δ functions) as the new big vertex or rosette factor (2.30). The second Filk reduction [5] further simplifies the rosette factor by erasing the loops of the rosette which do not cross any other loops or arch over external fields. Here again
Fig. 4. A typical rosette
the same operation is possible. Consider indeed such a rosette loop l (for instance loop 2 in Fig. 4). This means that on the rosette cycle there is an even number of vertices in between the two ends of that loop and moreover that the sum of z’s in between these two ends must be zero, since they are loop variables which both enter and exit between these ends. Putting together all the terms in the exponential which contain $z_l$ we conclude exactly as in [5] that these long z variables completely disappear from the rosette oscillation factor, which simplifies as in [6] to

$$\delta(z_1 - z_2 + \dots - z_r + E_u)\, e^{\imath\left(zIz + zQu + uRu\right)}, \tag{2.31}$$

where $I_{ij}$ is the antisymmetric “intersection matrix” of [6] (up to a different sign convention). Here $I_{ij} = +1$ if oriented loop line i crosses oriented loop line j coming from its right, $I_{ij} = -1$ if i crosses j coming from its left, and $I_{ij} = 0$ if i and j do not cross. These formulas are also true for i an external line and j a loop line or the converse, provided one extends the external lines from the rosette circle radially to infinity to see their crossing with the loops. Finally when i and j are external lines one should define $I_{ij} = (-1)^{p+q+1}$ if p and q are the numbering of the lines on the rosette cycle (starting from an arbitrary origin).

If a node $G^i_k$ of the Gallavotti-Nicolò tree is orientable but non-planar (g ≥ 1), there must therefore exist two intersecting loop lines in the rosette corresponding to this $G^i_k$, with long variables $w_1$ and $w_2$. Moreover since $G^i_k$ is orientable, none of the long loop variables associated with these two lines belongs to the set S of long variables eliminated by the δ constraints.
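The loop-loop block of the intersection matrix is easy to compute once the loop lines are recorded by the positions of their two ends on the rosette cycle. The sketch below is a hypothetical encoding introduced here for illustration: each loop is a pair of positions, and the sign is fixed from the order of the end points alone, which is an assumed convention and need not coincide with the “from its right / from its left” convention of the text or with that of [6].

```python
def crossing_sign(a, b):
    """Sign of the crossing of chords a and b on the rosette cycle.

    Each chord is a pair of positions of its two ends. Assumed convention:
    +1 if the ends interlace as a1 < b1 < a2 < b2, -1 if b1 < a1 < b2 < a2,
    and 0 if the chords are disjoint or nested (no crossing).
    """
    a1, a2 = sorted(a)
    b1, b2 = sorted(b)
    if a1 < b1 < a2 < b2:
        return 1
    if b1 < a1 < b2 < a2:
        return -1
    return 0


def intersection_matrix(loops):
    """Antisymmetric matrix I_ij restricted to the loop lines."""
    return [[crossing_sign(li, lj) for lj in loops] for li in loops]


def has_crossing_loops(loops):
    """True if some pair of loop lines crosses on the rosette."""
    matrix = intersection_matrix(loops)
    return any(entry != 0 for row in matrix for entry in row)


loops = [(1, 3), (2, 4), (5, 6)]  # made-up rosette: the first two loops cross
print(intersection_matrix(loops))
```

For the example the first two loops cross and the third one crosses nothing, so only the entries (0, 1) and (1, 0) are non-zero, with opposite signs. A rosette whose loops are all disjoint or nested gives a vanishing loop-loop block.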
Therefore, after integrating the variables in S the basic mechanism to improve the power counting of a single non-planar subgraph is the following:

$$\int dw_1\, dw_2\; e^{-cM^{-2i_1}w_1^2 - cM^{-2i_2}w_2^2 - \imath w_1\theta^{-1}w_2 + w_1\cdot E_1(x,u) + w_2\cdot E_2(x,u)}$$
$$= \int dw'_1\, dw'_2\; e^{-cM^{-2i_1}(w'_1)^2 - cM^{-2i_2}(w'_2)^2 + \imath w'_1\theta^{-1}w'_2 + (u,x)Q(u,x)}$$
$$= KM^{4i_1}\int dw'_2\; e^{-(M^{2i_1} + M^{-2i_2})(w'_2)^2} = K. \tag{2.32}$$

In these equations we used for simplicity $M^{-2i}$ instead of the correct but more complicated factor $(\tilde\Omega/4)\tanh(\alpha/2)$ (see (2.6)) (of course this does not change the argument) and we performed a unitary linear change of variables $w'_1 = w_1 + \ell_1(x,u)$, $w'_2 = w_2 + \ell_2(x,u)$ to compute the oscillating $w'_1$ integral. The gain in (2.32) is $M^{-4(i_1+i_2)}$, which is the difference between K and the normal factor $M^{4(i_1+i_2)}$ that the $w_1$ and $w_2$ integrals would have cost if we had done them with the regular $e^{-cM^{-2i_1}w_1^2 - cM^{-2i_2}w_2^2}$ factor for long variables. Beware that in (2.32) our constant c depends on θ and that our bounds are singular in the limit θ → 0.

This basic argument must then be generalized to each non-planar leaf in the Gallavotti-Nicolò tree. This is done exactly in the same way as the inductive definition of the set A of clashing lines in the non-orientable case. In any orientable non-planar “primitive” $G^i_k$ node (i.e. not containing sub-non-planar nodes) we can choose an arbitrary pair of crossing loop lines which will be integrated as in (2.32) using this oscillation. The corresponding improvements are independent. This leads to an improved amplitude bound:

$$|A_{G,\mu}| \le K^n \prod_{i,k} M^{-\omega(G^i_k)}, \tag{2.33}$$
where now $\omega(G^i_k) = N(G^i_k) + 4$ if $G^i_k$ is orientable and non-planar (i.e. g ≥ 1). This bound proves (2.21).

Finally it remains to consider the case of nodes $G^i_k$ which are planar orientable but with B ≥ 2. In that case there are no crossing loops in the rosette but there must be at least one loop line arching over a non-trivial subset of external legs in the $G^i_k$ rosette (see line 6 in Fig. 4). We have then a non-trivial integration over at least one external variable, called x, of at least one long loop variable called w. This “external” x variable without the oscillation improvement would be integrated with a test function of scale 1 (if it is a true external line of scale 1) or better (if it is a higher long loop variable).⁴ But we get now

$$\int dx\, dw\; e^{-M^{-2i}w^2 - \imath w\theta^{-1}x + w\cdot E_1(x',u)} = KM^{4i}\int dx\; e^{-M^{+2i}x^2} = K, \tag{2.34}$$

so that a factor $M^{4i}$ in the former bound becomes O(1), hence is improved by $M^{-4i}$. This proves (2.22), hence completes the proof of Lemma 2.5.
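The mechanism behind (2.32) can be illustrated on a toy model in which $w_1$ and $w_2$ are real numbers rather than four-vectors and θ is a positive number. This is a simplification made here for illustration only, and the function names and parameter values are assumptions. Without the oscillation, the Gaussian integral grows like $\pi/\sqrt{ab}$ when the decay rates a, b (standing for $cM^{-2i_1}$, $cM^{-2i_2}$) become small; with the phase $e^{\imath w_1 w_2/\theta}$ it equals $\pi/\sqrt{ab + 1/(4\theta^2)}$ and stays below $2\pi\theta$.

```python
import math


def plain_gaussian(a, b):
    """Integral of exp(-a w1^2 - b w2^2) over the plane: pi / sqrt(a b)."""
    return math.pi / math.sqrt(a * b)


def oscillating_closed_form(a, b, theta):
    """Integral of exp(-a w1^2 - b w2^2 + i w1 w2 / theta) over the plane."""
    return math.pi / math.sqrt(a * b + 1.0 / (4.0 * theta * theta))


def oscillating_numeric(a, b, theta, half_width=30.0, steps=600):
    """Midpoint-rule check of the closed form (the sine part vanishes by parity)."""
    h = 2.0 * half_width / steps
    pts = [-half_width + (k + 0.5) * h for k in range(steps)]
    g1 = [math.exp(-a * w * w) for w in pts]
    g2 = [math.exp(-b * w * w) for w in pts]
    total = 0.0
    for w1, e1 in zip(pts, g1):
        row = sum(e2 * math.cos(w1 * w2 / theta) for w2, e2 in zip(pts, g2))
        total += e1 * row
    return total * h * h


a = b = 0.05   # small decay rates, playing the role of c M^(-2i)
theta = 1.0
print(plain_gaussian(a, b), oscillating_closed_form(a, b, theta))
```

In this toy model the oscillating integral is bounded uniformly in a and b, while the plain one diverges as they tend to zero. This is the analogue of the gain $M^{-4(i_1+i_2)}$ in (2.32), and the bound $2\pi\theta$ shows why the estimate degenerates as θ → 0.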
This method could be generalized to get the true power counting (2.19). One simply needs a better description of the rosette oscillating factors when g or B increase. We conjecture that it is in fact possible to “disentangle” the rosette by some kind of “third Filk move”. Indeed the rank of the long variables’ quadratic oscillations is exactly the genus [7], and the rank of the linear term coupling these long variables to the external ones is exactly B − 1. So one can, through a unitary change of variables on the long variables, inductively disentangle adjacent crossing pairs of loops in the rosette. This means that it is possible to diagonalize the rosette symplectic form through explicit moves of the loops along the rosette. Once oscillations are factorized in this way, the single improvements shown in this section generalize to one improvement of $M^{-8i}$ per genus and one improvement of $M^{-4i}$ per broken face. In this way the exact power counting (2.19) should be recovered by pure x-space techniques which never require the use of the matrix basis. This study is more technical and not really necessary for the BPHZ theorem proved in this paper.

3. Renormalization

In this section we need to consider only divergent subgraphs, namely the planar two and four point subgraphs with a single external face (g = 0, B = 1, N = 2 or 4). We shall prove that they can be renormalized by appropriate counterterms of the form of the initial Lagrangian. We compute first the oscillating factors Q and R of the short variables in (2.31) for these graphs. This is not truly necessary for what follows, but is a good exercise.

3.1. The oscillating rosette factor. In this subsection we define another more precise representation for the rosette factor obtained after applying the first Filk moves to a graph of order n. We rewrite in terms of $u_l$ and $v_l$ the coordinates of the ends of the tree lines l, l = 1, ..., n − 1 (those contracted in the first Filk moves), but keep as variables called $s_1, \dots, s_{2n+2}$ the positions of all external fields and all ends of loop lines (those not contracted in the first Filk moves).

⁴ Since the loop line arches over a non-trivial (i.e. neither full nor empty) subset of external legs of the rosette, the variable x cannot be the full combination of external variables in the “root” δ function.
We start from the root and turn around the tree in the trigonometrical sense. We number separately all the fields as 1, ..., 2n + 2 and all the tree lines as 1, ..., n − 1 in the order they are met, but we also define a global ordering ≺ on the set of all the fields and tree lines according to the order in which they are met (see Fig. 5). In this way we know whether the field number p is met before or after tree line number q. For example, in Fig. 5, field number 8 ≺ tree line number 6.

Lemma 3.1. The rosette contribution after a complete first Filk reduction is exactly:

$$\delta\Big(s_1 - s_2 + \dots - s_{2n+2} + \sum_{l\in T} u_l\Big)\, e^{\imath\sum_{0\le i<j\le 2n+2}(-1)^{i+j+1}s_i\theta^{-1}s_j}$$
$$\times\, e^{-\imath\sum_{l\prec l'}u_l\theta^{-1}u_{l'}}\; e^{-\imath\sum_l \epsilon(l)\frac{u_l\theta^{-1}v_l}{2}}\; e^{\imath\sum_{l,\,i\prec l}(-1)^i s_i\theta^{-1}u_l + \imath\sum_{l,\,i\succ l}u_l\theta^{-1}(-1)^i s_i}, \tag{3.1}$$

where ε(l) is −1 if the tree line l is oriented towards the root and +1 if it is not.

Fig. 5. Total ordering of the tree lines and fields
Proof. We proceed by induction. We contract the tree lines according to their ordering. In this way, at any step k we contract a generalized vertex with 2k + 2 external fields corresponding to the contraction of the k − 1 first lines with a usual four-vertex with r = 4, and obtain a new generalized vertex with 2k + 4 fields. We suppose inductively that the generalized vertex has the above form and prove that it keeps this form after the contraction. We denote the external coordinates of this vertex as s1 , . . . , s2k+2 and the coordinates of the four-vertex as t1 , . . . , t4 . We contract the propagator (s p , tq ) with associated variables v = s p + tq and u = (−1) p+1 s p + (−1)q+1 tq . We also note that, since the tree is orientable, p + q is odd. Adding the arguments of the two δ functions gives the global δ function. We have the two equations: s1 − s2 + · · · − s2k+2 + u s = 0 , t1 − t2 + t3 − t4 = 0. (3.2) Using the invariance of the t vertex we can always eliminate the contribution of tq in the phase factor. We therefore have: ϕ = [s1 − s2 + · · · + (−1) p s p−1 ]θ −1 (−1) p s p +(−1) p s p θ −1 [(−1) p+2 s p+1 + · · · − s2k+2 ] = [s1 − s2 + · · · + (−1) p s p−1 ]θ −1 [−u + (−1)q+1 tq ] +[−u + (−1)q+1 tq ]θ −1 [(−1) p+2 s p+1 + . . . . − s2k+2 ].
(3.3)
4
As (−1)q+1 tq = i=1,i=q (−1)i ti we see that the sθ −1 tq terms in the above expression reproduce exactly the crossed terms needed to complete the first exponential. We rewrite the other terms as: [s1 − s2 + · · · + (−1) p s p−1 ]θ −1 (−u) + (−u)θ −1 [(−1) p+2 s p+1 + · · · − s2k+2 ] = [s1 − s2 + · · · + (−1) p s p−1 ]θ −1 (−u) us ] +(−u)θ −1 [−s1 + s2 · · · + (−1) p s p − s
= 2[s1 − s2 + · · · + (−1) s p−1 ]θ p
−1
(−u) + (−u)θ −1 (−1) p s p + uθ −1
us
s
=2
uθ −1 v −1 + (−1)i si θ −1 u + (−1) p+1 uθ u s , 2 s
(3.4)
i≺l
where we have used (−1) p s p = (−1) p (v − u)/2. Note that further contractions will not involve s1 . . . s p−1 . After collecting all the contractions and using the global delta function we write: 2 (−1)i si θ −1 u l = (−1)i si θ −1 u l + u l θ −1 (−1)i si + u l θ −1 u l , (3.5) l,i≺l
l,i≺l
l,l
l,i l
and the last term is zero by the antisymmetry of θ −1 .
We denote by L the set of loop lines, and analyze now further the rosette contribution for planar graphs. We call now xi , i = 1, . . . , N the N external positions. We choose as first external field 1 an arbitrary entering external line. We define an ordering among the
set of all lines, writing l ≺ l′ if both ends of l are before the first end of l′ when turning around the tree as in Fig. 5, where $l_1 \prec l_2$. Analogously we define l ≺ j when j is an external vertex ($l_1 \prec x_4$ in Fig. 5). We define l ⊂ l′ if both ends of l lie in between the ends of l′ on the rosette ($l_2 \subset l_4$ in Fig. 5). We count a loop line as positive if it turns in the trigonometric sense like the rosette and negative if it turns clockwise. Each loop line l ∈ L has now a sign ε(l) associated with this convention, and we now make explicit its end variables in terms of $u_l$ and $w_l$. With these conventions we prove the following lemma:

Lemma 3.2. The vertex contribution for a planar regular graph is exactly:

$$\delta\Big(\sum_i(-1)^{i+1}x_i + \sum_{l\in T\cup L}u_l\Big)\, e^{\imath\sum_{i<j}(-1)^{i+j+1}x_i\theta^{-1}x_j}$$
$$\times\, e^{\imath\sum_{l\in T\cup L,\ l\prec j}u_l\theta^{-1}(-1)^j x_j + \imath\sum_{l\in T\cup L,\ l\succ j}(-1)^j x_j\theta^{-1}u_l}$$
$$\times\, e^{-\imath\sum_{l,l'\in T\cup L,\ l\prec l'}u_l\theta^{-1}u_{l'} - \imath\sum_{l\in T}\frac{u_l\theta^{-1}v_l}{2}\epsilon(l) - \imath\sum_{l\in L}\frac{u_l\theta^{-1}w_l}{2}\epsilon(l)}$$
$$\times\, e^{-\imath\sum_{l\in L,\ l'\in L\cup T;\ l'\subset l}u_{l'}\theta^{-1}w_l\,\epsilon(l)}. \tag{3.6}$$
Proof. We see that the global root δ function has the argument:

$$\sum_i(-1)^{i+1}x_i + \sum_{l\in L\cup T}u_l. \tag{3.7}$$

Since the graph has one broken face we always have an even number of vertices on the external face between two external fields. We express all the internal loop variables as functions of u’s and w’s. Using Lemma 3.1, we regroup the terms which still contain the external points, which we relabel x, in one quadratic and one linear form in the external positions. The quadratic term can be written as:

$$\sum_{i<j}(-1)^{i+j+1}x_i\theta^{-1}x_j. \tag{3.8}$$
The linear term in the external vertices is:
(−1)i+1 si θ −1 (−1) j x j +
i< j
+
(−1) j x j θ −1 (−1)i+1 si i> j
(−1) x j θ j
−1
ul +
l∈T,l j
=
+
l∈T,l j
u l θ −1 (−1) j x j
l∈T,l≺ j
ul θ
l ∈L,l j
−1
(−1) x j + j
(−1) j x j θ −1 u l
l ∈L,l j
(−1) j x j θ −1 u l +
l∈T,l≺ j
u l θ −1 (−1) j x j .
(3.9)
Consider a loop line from s p to sq with p < q. Its contribution to the vertex amplitude decomposes in a “loop-loop” term and a “loop-tree” term. The first one is: (−1)i+1 si θ −1 (−1) p s p + (−1) p s p θ −1 (−1)i+1 si + s p θ −1 sq i< p
+
p
(−1)i+1 si θ −1 (−1)q sq +
i
=
(−1) p sq θ −1 (−1)i+1 si q
(−1)i+1 si θ −1 [(−1) p s p + (−1)q sq ] i< p
+
[(−1) p s p + (−1)q sq ]θ −1 (−1)i+1 si
q
+
(−1)i+1 s i θ −1 [(−1) p+1 s p + (−1)q sq ] + s p θ −1 sq .
(3.10)
p
Taking into account that (−1)i+1 si + (−1) j+1 s j = u l if si and s j are the two ends of the loop line l , we can rewrite the above expression as: u l θ −1 (−u l ) + (−u l )θ −1 u l + u l θ −1 (−1) p+1 wl l ≺l
+(−1)
l l −1 p+1 u l θ wl
+
2
l ⊂l
u l θ −1 (−1)i+1 wl ,
(3.11)
l ,l⊂l
where l is fixed in all the above expressions. Summing the contributions of all the lines (being careful not to count the same term twice) we get the final result: −
u l θ −1 u l −
l ≺l
u l θ −1 wl (l) −
l,l ⊂l
u l θ −1 wl (l) . 2
(3.12)
l
We still have to add the “loop-tree” contribution. It reads: u l θ −1 (−1) p s p + (−1) p s p θ −1 u l l ∈T,l ≺ p
+
l ∈T,l p
ul θ
−1
l ∈T,l ≺q
=
(−1)q sq +
(−1)q sq θ −1 u l
l ∈T,l q
u l θ −1 [(−1) p s p + (−1)q sq ] +
l ∈T ;l ≺ p,q
+ =
l ∈T ;l ≺l
[(−1) p s p + (−1)q sq ]θ −1 u l
l ∈T ;l p,q
u l θ −1 [(−1) p+1 s p + (−1)q sq ]
l ∈T ; p≺l ≺q
u l θ −1 (−u l ) +
(−u l )θ −1 u l +
l ∈T ;l l
Collecting all the factors proves the lemma
l ∈T ;l ⊂l
u l θ −1 (−1) p+1 wl . (3.13)
3.2. Renormalization of the four-point function. Consider a 4 point subgraph which needs to be renormalized, hence is a node of the Gallavotti-Nicolò tree. This means that there is (i, k) such that $N(G^i_k) = 4$. The four external positions of the amputated graph are labeled $x_1$, $x_2$, $x_3$ and $x_4$. We also define Q, R and S as three skew-symmetric matrices of respective sizes $4 \times l(G^i_k)$, $l(G^i_k) \times l(G^i_k)$ and $[n(G^i_k) - 1] \times l(G^i_k)$, where we recall that n(G) − 1 is the number of loops of a 4 point graph with n vertices. The amplitude associated to the connected component $G^i_k$ is then

$$A(G^i_k)(x_1,x_2,x_3,x_4) = \int \prod_{\ell\in T^i_k} du_\ell\, C_\ell(x,u,w) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l)$$
$$\times\, \delta\Big(x_1 - x_2 + x_3 - x_4 + \sum_{l\in G^i_k}u_l\Big)\, e^{\imath\left(\sum_{p<q}(-1)^{p+q+1}x_p\theta^{-1}x_q + XQU + URU + USW\right)}. \tag{3.14}$$
The exact form of the factor p
(3.15)
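$$X_\ell = \sum_{e\in E(\ell)}\epsilon_{\ell,e}\,x_e \tag{3.16}$$

is a linear combination on the set of external variables of the branch graph $G_\ell$ with the correct alternating signs $\epsilon_{\ell,e}$,

$$W_\ell = \sum_{l\in L(\ell)}\epsilon_{\ell,l}\,w_l \tag{3.17}$$

is a linear combination over the set L(ℓ) of long loop variables for the external lines of $G_\ell$ (and $\epsilon_{\ell,l}$ are other signs), and

$$U_\ell = \sum_{l'\in S(\ell)}\epsilon_{\ell,l'}\,u_{l'} \tag{3.18}$$

is a linear combination over a set $S_\ell$ of short variables that we do not need to know explicitly. The tree propagator for line ℓ then is

$$C_\ell(u_\ell, X_\ell, U_\ell, W_\ell) = \int_{M^{-2i(\ell)}}^{M^{-2(i(\ell)-1)}} \frac{\tilde\Omega\, d\alpha_\ell}{[2\pi\sinh(\alpha_\ell)]^2}\; e^{-\frac{\tilde\Omega}{4}\left\{\coth(\frac{\alpha_\ell}{2})u_\ell^2 + \tanh(\frac{\alpha_\ell}{2})[X_\ell + W_\ell + U_\ell]^2\right\}}. \tag{3.19}$$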
To renormalize, let us call e = max e p , p = 1, . . . , 4 the highest external index of the subgraph G ik . We have e < i since G ik is a node of the Gallavotti-Nicolò tree. We evaluate A(G ik ) on external fields5 ϕ e (x p ) as: A(G ik )
=
4
d x p ϕ e (x p ) A(G ik )(x1 , x2 , x3 , x4 )
p=1
=
4
d x p ϕ e (x p ) eıExt
du C (u , t X , U , W )
∈Tki
p=1
×
du l dwl Cl (u l , wl ) δ + t u l eıt X QU +ıU RU +ıU SW
l∈G ik l∈T
l∈G ik
(3.20)
t=1
with = x1 − x2 + x3 − x4 and Ext = 4p
R(t) = −
˜
α tanh( ) t 2 X 2 + 2t X W + U 4 2 i
∈Tk
≡ −t 2 AX.X − 2tAX.(W + U ), where A =
˜ 4
A(G ik ) =
tanh( α2 ), and X.Y means 4
d x p ϕ e (x p ) eıExt
∈Tki
∈Tki
p=1
×
(3.21)
X Y . We have
du C (u , U , W )
du l dwl Cl (u l , wl ) eıU RU +ıU SW
(3.22)
l∈G ik l∈T
1
× δ() +
! " dt U.∇δ( + tU) + δ( + tU)[ı X QU + R (t)]
0
e
ıt X QU +R(t)
#
,
where C (u , U , W ) is given by (3.19) but taken at X = 0. The first term, denoted by τ A, is of the desired form (2.4) times a number independent of the external variables x. It is asymptotically constant in the slice index i, hence $
5 For the external index to be exactly e the external smearing factor should be in fact $ ϕ e (x ) − p p e−1 (x ) but this subtlety is inessential. ϕ p p
the sum over i at fixed e is logarithmically divergent: this is the divergence expected for the four-point function. It remains only to check that (1 − τ)A converges as i − e → ∞. But we have three types of terms in (1 − τ)A, each providing a specific improvement over the regular, log-divergent power counting of A:
• The term U.∇δ(Δ + tU). For this term, integrating by parts over external variables, the ∇ acts on external fields $\varphi^{\le e}$, hence brings at most $M^e$ to the bound, whereas the U term brings at least $M^{-i}$.
• The term XQU. Here X brings at most $M^e$ and U brings at least $M^{-i}$.
• The term R′(t). It decomposes into terms in AX.X, AX.U and AX.W. Here the $A_\ell$ brings at least $M^{-2i(\ell)}$, $X_\ell$ brings at worst $M^e$, $U_\ell$ brings at least $M^{-i}$ and $X_\ell W_\ell$ brings at worst $M^{e+i(\ell)}$. This last point is the only subtle one: if ℓ ∈ $T^i_k$, remark that because $T^i_k$ is a sub-tree within each Gallavotti-Nicolò subnode of $G^i_k$, in particular all parameters $w_l$ for l ∈ L(ℓ) which appear in $W_\ell$ must have indices lower or equal to i(ℓ) (otherwise they would have been chosen instead of ℓ in $T^i_k$).

In conclusion, since i(ℓ) ≥ i, the Taylor remainder term (1 − τ)A improves the power counting of the connected component $G^i_k$ by a factor at least $M^{-(i-e)}$. This additional $M^{-(i-e)}$ factor makes $(1 - \tau)A(G^i_k)$ convergent and irrelevant as desired.

3.3. Renormalization of the two-point function. We consider now the nodes such that $N(G^i_k) = 2$. We use the same notations as in the previous subsection. The two external points are labeled x and y. Using the global δ function, which is now δ(x − y + U), we remark that the external oscillation $e^{\imath x\theta^{-1}y}$ can be absorbed in a redefinition of the term $e^{\imath tXQU}$, which we do from now on. Also we want to use expressions symmetrized over x and y. The full amplitude is

$$A(G^i_k) = \int dx\, dy\; \varphi^{\le e}(x)\varphi^{\le e}(y)\,\delta(x - y + U) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l)$$
$$\times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell, X_\ell, U_\ell, W_\ell)\; e^{\imath XQU + \imath URU + \imath USW}. \tag{3.23}$$
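First we write the identity

$$\varphi^{\le e}(x)\varphi^{\le e}(y) = \frac{1}{2}\Big\{[\varphi^{\le e}(x)]^2 + [\varphi^{\le e}(y)]^2 - [\varphi^{\le e}(y) - \varphi^{\le e}(x)]^2\Big\}, \tag{3.24}$$

we develop it as

$$\varphi^{\le e}(x)\varphi^{\le e}(y) = \frac{1}{2}\Big\{[\varphi^{\le e}(x)]^2 + [\varphi^{\le e}(y)]^2 - \Big[(y-x)^\mu\cdot\nabla_\mu\varphi^{\le e}(x) + \int_0^1 ds\,(1-s)(y-x)^\mu(y-x)^\nu\nabla_\mu\nabla_\nu\varphi^{\le e}(x + s(y-x))\Big]^2\Big\}, \tag{3.25}$$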
and substitute into (3.23). The first term $A_0$ is a symmetric combination with external fields at the same argument. Consider the case with the two external legs at x, namely the term in $[\varphi^{\le e}(x)]^2$. For this term we integrate over y. This uses the δ function. We then perform a Taylor expansion in t at order 3 of the remaining function

$$f(t) = e^{\imath tXQU + R(t)}, \tag{3.26}$$

where we recall that $R(t) = -[t^2 AX.X + 2tAX.(W + U)]$. We get

$$A_0 = \frac{1}{2}\int dx\,[\varphi^{\le e}(x)]^2\, e^{\imath(URU + USW)} \times \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell, U_\ell, W_\ell)$$
$$\times \Big(f(0) + f'(0) + \frac{1}{2}f''(0) + \frac{1}{2}\int_0^1 dt\,(1-t)^2 f^{(3)}(t)\Big). \tag{3.27}$$
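The last factor of (3.27) is the Taylor formula with integral remainder evaluated at t = 1. The sketch below checks this formula numerically on a stand-in function $f(t) = e^{kt}$ with a complex constant k. This stand-in is an assumption chosen only because its derivatives are explicit; it is not the function f of (3.26), and the function name and the value of k are likewise illustrative.

```python
import cmath


def taylor_with_remainder(k, steps=2000):
    """f(0) + f'(0) + f''(0)/2 + (1/2) * integral_0^1 (1 - t)^2 f'''(t) dt
    for the stand-in f(t) = exp(k t), with a midpoint rule for the integral."""
    polynomial = 1.0 + k + 0.5 * k * k
    h = 1.0 / steps
    remainder = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        remainder += (1.0 - t) ** 2 * k ** 3 * cmath.exp(k * t)
    return polynomial + 0.5 * h * remainder


k = complex(-0.4, 0.9)  # arbitrary test value
print(taylor_with_remainder(k), cmath.exp(k))
```

Up to the quadrature error the two printed values agree, that is, the expansion reproduces f(1) exactly. The terms $A_{0,0}$, $A_{0,1}$, $A_{0,2}$ and $A_{0,R}$ discussed next correspond to the four pieces of this formula.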
In order to evaluate that expression, let $A_{0,0}$, $A_{0,1}$, $A_{0,2}$ be the zeroth, first and second order terms in this Taylor expansion, and $A_{0,R}$ be the remainder term. First,

$$A_{0,0} = \int dx\,[\varphi^{\le e}(x)]^2\, e^{\imath(URU + USW)} \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell, U_\ell, W_\ell) \tag{3.28}$$

is quadratically divergent and exactly of the expected form for the mass counterterm. Then

$$A_{0,1} = \frac{1}{2}\int dx\,[\varphi^{\le e}(x)]^2\, e^{\imath(URU + USW)} \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell, U_\ell, W_\ell)\,\big(\imath XQU + R'(0)\big) \tag{3.29}$$

vanishes identically. Indeed all the terms are odd integrals over the u, w-variables. $A_{0,2}$ is more complicated:

$$A_{0,2} = \frac{1}{2}\int dx\,[\varphi^{\le e}(x)]^2\, e^{\imath(URU + USW)} \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell, U_\ell, W_\ell)$$
$$\times \Big(-(XQU)^2 - 4\imath\,XQU\,AX.(W + U) - 2AX.X + 4[AX.(W + U)]^2\Big). \tag{3.30}$$

The four terms in $(XQU)^2$, $XQU\,AX.W$, $AX.X$ and $[AX.W]^2$ are logarithmically divergent and contribute to the renormalization of the harmonic frequency term $\tilde\Omega$ in
(2.2). The terms in $x^\mu x^\nu$ with $\mu \neq \nu$ do not survive by parity and the terms in $(x^\mu)^2$ have obviously the same coefficient. The other terms in $XQU\,AX.U$, $(AX.U)(AX.W)$ and $[AX.U]^2$ are irrelevant. Similarly the terms in $A_{0,R}$ are all irrelevant.

For the term in $A_0(y)$ in which we have $\int dx\,[\varphi^e(y)]^2$ we have to perform a similar computation, but beware that it is now $x$ which is integrated with the $\delta$ function so that the matrices $Q$, $S$, $R$ and the function $R(t)$ change, but not the conclusion.

Next we have to consider the term in $\big[(y-x)^\mu.\nabla_\mu \varphi^e(x)\big]^2$ in (3.25), for which we need to develop the $f$ function only to first order. Integrating over $y$ replaces each $y-x$ by a $U$ factor so that we get a term
$$A_1 = \frac{1}{2}\int dx\,\big[U^\mu.\nabla_\mu \varphi^e(x)\big]^2\, e^{\imath(URU+USW)} \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,U_\ell,W_\ell) \times \Big( f(0) + \int_0^1 dt\, f'(t) \Big). \tag{3.31}$$
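The first term is
$$A_{1,0} = \frac{1}{2}\int dx\,\big[U^\mu.\nabla_\mu \varphi^e(x)\big]^2\, e^{\imath(URU+USW)} \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,U_\ell,W_\ell). \tag{3.32}$$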
The terms with $\mu \neq \nu$ do not survive by parity. The other ones reconstruct a counterterm proportional to the Laplacian. The power-counting of this factor $A_{1,0}$ is improved, with respect to $A$, by a factor $M^{-2(i-e)}$ which makes it only logarithmically divergent, as it should be for a wave-function counterterm.

The remainder term in $A_{1,R}$ has an additional factor at worst $M^{-(i-e)}$ coming from the $\int_0^1 dt\, f'(t)$ term, hence is irrelevant and convergent.

Finally the remainder terms $A_R$ with three or four gradients in (3.25) are also irrelevant and convergent. Indeed we have terms of various types:
• There are terms in $U^3$ with $\nabla^3$. The $\nabla$ act on the variables $x$, hence on external fields, hence bring at most $M^{3e}$ to the bound, whereas the $U^3$ brings at least $M^{-3i}$.
• Finally there are terms with 4 gradients which are still smaller.

Therefore for the renormalized amplitude $A_R$ the power-counting is improved, with respect to $A_0$, by a factor $M^{-3(i-e)}$, and becomes convergent.

Putting together the results of the two previous sections, we have proved that the usual effective series, which expresses any connected function of the theory in terms of an infinite set of effective couplings related to one another by a discretized flow [16], has finite coefficients to all orders. Reexpressing these effective series in terms of the renormalized couplings would reintroduce in the usual way Zimmermann’s forests of “useless” counterterms and build the standard “old-fashioned” renormalized series. The most explicit way to check finiteness of these renormalized series in order to complete the “BPHZ theorem” is to use the standard “classification of forests” which distributes
Zimmermann’s forests into packets such that the sum over assignments in each packet is finite [16].$^6$ This part is completely standard and identical to the commutative case. Hence the proof of Theorem 2.1 is completed.

A. The LSZ Model

In this section we prove the perturbative renormalizability of a generalized Langmann-Szabo-Zarembo model [18]. It consists in a bosonic complex scalar field theory in a fixed magnetic background plus a harmonic oscillator. The quartic interaction is of the Moyal type. The action functional is given by
$$S = \int \frac{1}{2}\,\bar\varphi\big(-D^\mu D_\mu + \tilde\Omega^2 x^2 + \mu_0^2\big)\varphi + \lambda\,\bar\varphi\star\varphi\star\bar\varphi\star\varphi, \tag{A.1}$$
where $D_\mu = \partial_\mu - \imath B_{\mu\nu}x^\nu$ is the covariant derivative. The 1/2 factor is somewhat unusual in a complex theory but it allows us to recover exactly the results given in [15] with $\tilde\Omega^2 \to \omega^2 = \tilde\Omega^2 + B^2$. By expanding the quadratic part of the action, we get a $\Phi^4$-like kinetic part plus an angular momentum term:
$$\bar\varphi D^\mu D_\mu\varphi + \tilde\Omega^2 x^2\,\bar\varphi\varphi = \bar\varphi\big(\Delta - \omega^2 x^2 - 2BL_5\big)\varphi \tag{A.2}$$
with $L_5 = x^1p_2 - x^2p_1 + x^3p_4 - x^4p_3 = x\wedge\nabla$. Here the skew-symmetric matrix $B$ has been put in its canonical form
$$B = \begin{pmatrix} \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} & (0) \\ (0) & \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \end{pmatrix}. \tag{A.3}$$
In $x$ space, the interaction term is exactly the same as (2.4). The complex conjugation of the fields only selects the orientable graphs. At $\tilde\Omega = 0$, the model is similar to the Gross-Neveu theory. Its renormalization is therefore harder [12] and is not treated in this paper. If we additionally set $B = \theta^{-1}$ we recover the integrable LSZ model [18].
A.1. Power counting. The propagator corresponding to the action (A.1) has been calculated in [15] in the two-dimensional case. The generalization to higher dimensions, e.g. four, is straightforward:
$$C(x,y) = \int_0^\infty dt\, \frac{\omega^2}{(2\pi\sinh\omega t)^2}\, \exp -\frac{\omega}{2}\Big( \frac{\cosh Bt}{\sinh\omega t}(x-y)^2 + \frac{\cosh\omega t - \cosh Bt}{\sinh\omega t}(x^2+y^2) + \imath\,\frac{\sinh Bt}{\sinh\omega t}\, x\theta^{-1}y \Big). \tag{A.4}$$
6 One could also use the popular inductive scheme of Polchinski, which however does not extend yet to non-perturbative “constructive” renormalization
Note that the sliced version of (A.4) obeys the same bound (2.11) as the $\varphi^4$ propagator. Moreover the additional oscillating phases $\exp \imath\, x\theta^{-1}y$ are of the form $\exp \imath\, u_l\theta^{-1}v_l$. Such terms played no role in the power counting of the $\Phi^4$ theory. They were bounded by one. This allows us to conclude that Lemmas 2.4 and 2.5 hold for the generalized LSZ model. Note also that in this case, the theory contains only orientable graphs due to the use of complex fields.
A.2. Renormalization. As for the noncommutative $\Phi^4$ theory, we only need to renormalize the planar ($g = 0$) two and four-point functions with only one external face. Recall that the oscillating factors of the propagators are
$$\exp \imath\,\frac{\sinh Bt}{2\sinh\omega t}\, u_l\theta^{-1}v_l. \tag{A.5}$$
After resolving the $v_\ell$, $\ell\in T$ variables in terms of $X_\ell$, $W_\ell$ and $U_\ell$, they can be included in the vertices oscillations by a redefinition of the $Q$, $S$ and $R$ matrices (see (3.14)). For the four-point function, we can then perform the same Taylor subtraction as in the $\Phi^4$ case. The two-point function case is more subtle. Let us consider the generic amplitude
$$A(G^i_k) = \int dx\,dy\,\bar\varphi^e(x)\varphi^e(y)\,\delta(x-y+U) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,X_\ell,U_\ell,W_\ell)\; e^{\imath XQU + \imath URU + \imath USW}. \tag{A.6}$$
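The symmetrization procedure (3.24) over the external fields is not possible anymore, the theory being complex. Nevertheless we can decompose $\bar\varphi(x)\varphi(y)$ into a symmetric and an anti-symmetric part:
$$\bar\varphi(x)\varphi(y) = \frac{1}{2}\Big[\big(\bar\varphi(x)\varphi(y) + \bar\varphi(y)\varphi(x)\big) + \big(\bar\varphi(x)\varphi(y) - \bar\varphi(y)\varphi(x)\big)\Big] \stackrel{\rm def}{=} (\mathcal{S} + \mathcal{A})\,\bar\varphi(x)\varphi(y). \tag{A.7}$$
The symmetric part of $A$, called $A_s$, will lead to the same renormalization procedure as the $\Phi^4$ case. Indeed,
$$\mathcal{S}\,\bar\varphi(x)\varphi(y) = \frac{1}{2}\big(\bar\varphi(x)\varphi(y) + \bar\varphi(y)\varphi(x)\big) = \frac{1}{2}\Big[\bar\varphi(x)\varphi(x) + \bar\varphi(y)\varphi(y) - \big(\bar\varphi(x) - \bar\varphi(y)\big)\big(\varphi(x) - \varphi(y)\big)\Big], \tag{A.8}$$
which is the complex equivalent of (3.24).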
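The algebra behind (A.7) and (A.8) is a pointwise identity between four complex numbers, so it can be checked on sample values. The sketch below is an illustration only: the field values are random placeholders, and the function names are hypothetical.

```python
import random

def symmetric_part(phibar_x, phi_x, phibar_y, phi_y):
    # (1/2) * (phibar(x) phi(y) + phibar(y) phi(x)), first form in (A.8)
    return 0.5 * (phibar_x * phi_y + phibar_y * phi_x)

def symmetric_part_rewritten(phibar_x, phi_x, phibar_y, phi_y):
    # Second form in (A.8): local terms minus a product of differences
    return 0.5 * (phibar_x * phi_x + phibar_y * phi_y
                  - (phibar_x - phibar_y) * (phi_x - phi_y))

def antisymmetric_part(phibar_x, phi_x, phibar_y, phi_y):
    # (1/2) * (phibar(x) phi(y) - phibar(y) phi(x))
    return 0.5 * (phibar_x * phi_y - phibar_y * phi_x)

rng = random.Random(1)
phi_x = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))  # placeholder value of phi(x)
phi_y = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))  # placeholder value of phi(y)
args = (phi_x.conjugate(), phi_x, phi_y.conjugate(), phi_y)

# Gap between the two expressions of the symmetric part in (A.8)
gap = abs(symmetric_part(*args) - symmetric_part_rewritten(*args))
# Gap in the decomposition (A.7): symmetric + antisymmetric = phibar(x) phi(y)
total_gap = abs(symmetric_part(*args) + antisymmetric_part(*args)
                - phi_x.conjugate() * phi_y)
```

Both gaps vanish up to rounding, which confirms that the symmetric part only involves local terms and the squared difference, exactly as in the real case.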
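In the anti-symmetric part of $A$, called $A_a$, the linear terms $\bar\varphi\nabla\varphi$ do not compensate:
$$\mathcal{A}\,\bar\varphi(x)\varphi(y) = \frac{1}{2}\big(\bar\varphi(x)\varphi(y) - \bar\varphi(y)\varphi(x)\big) = \frac{1}{2}\Big( \bar\varphi(x)\,(y-x).\nabla\varphi(x) - (y-x).\nabla\bar\varphi(x)\,\varphi(x) + \frac{1}{2}\bar\varphi(x)\big((y-x).\nabla\big)^2\varphi(x) - \frac{1}{2}\big((y-x).\nabla\big)^2\bar\varphi(x)\,\varphi(x) + \frac{1}{2}\int_0^1 ds\,(1-s)^2\Big[ \bar\varphi(x)\big((y-x).\nabla\big)^3\varphi(x+s(y-x)) - \big((y-x).\nabla\big)^3\bar\varphi(x+s(y-x))\,\varphi(x) \Big]\Big). \tag{A.9}$$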
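We decompose $A_a$ into five parts following the Taylor expansion (A.9):
$$A_a^{1+} = \int dx\,dy\,\bar\varphi(x)\,(y-x).\nabla\varphi(x)\,\delta(x-y+U) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,X_\ell,U_\ell,W_\ell)\; e^{\imath XQU + \imath URU + \imath USW}$$
$$= \int dx\,\bar\varphi(x)\,U.\nabla\varphi(x) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,X'_\ell,U'_\ell,W_\ell)\; e^{\imath XQ'U' + \imath U'RU' + \imath U'SW}, \tag{A.10}$$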
where we performed the integration over $y$ thanks to the delta function. The changes have been absorbed in a redefinition of $X_\ell$, $U$ and $Q$. From now on $X_\ell$ (and $X$) contain only $x$ (if $x$ is hooked to the branch $b(\ell)$) and we forget the primes for $Q$ and $U$. We expand the function $f$ defined in (3.26) up to order 2:
$$A_a^{1+} = \bar\varphi(x)\,U.\nabla\varphi(x) \int \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,U_\ell,W_\ell)\; e^{\imath URU + \imath USW} \times \Big( f(0) + f'(0) + \int_0^1 dt\,(1-t)\, f''(t) \Big). \tag{A.11}$$
The zeroth order term vanishes thanks to the parity of the integrals with respect to the $u$ and $w$ variables. The first order term contains
$$\bar\varphi(x)\,U^\mu\nabla_\mu\varphi(x)\,\big(\imath XQU + R'(0)\big). \tag{A.12}$$
The first term leads to $(U^1\nabla_1 + U^2\nabla_2)\varphi\,(x^1U^2 - x^2U^1)$ with the same kind of expressions for the two other dimensions. Due to the odd integrals, only the terms of the form $(U^1)^2x^2\nabla_1 - (U^2)^2x^1\nabla_2$ survive. We are left with integrals like
$$\int (u_\ell^1)^2 \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,U_\ell,W_\ell)\; e^{\imath URU + \imath USW}. \tag{A.13}$$
To prove that these terms give the same coefficient (in order to reconstruct an $x\wedge\nabla$ term), note that, apart from the $(u_\ell^1)^2$, the involved integrals are actually invariant under an overall rotation of the $u$ and $w$ variables. Then by performing rotations of $\pi/2$, we prove that the counterterm is of the form of the Lagrangian. The $R'(0)$ and the remainder term in $A_a^{1+}$ are irrelevant. Let us now study the other terms in $A_a$.
$$A_a^{1-} = -\int dx\; U.\nabla\bar\varphi(x)\,\varphi(x) \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,X_\ell,U_\ell,W_\ell)\; e^{\imath XQU + \imath URU + \imath USW}. \tag{A.14}$$
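Once more we decouple the external variables from the internal ones by Taylor expanding the function $f$. Up to irrelevant terms, this only doubles the $x\wedge\nabla$ term in $A_a^{1+}$,
$$A_a^{2+} = \frac{1}{2}\,\bar\varphi(x)\,(U.\nabla)^2\varphi(x) \int \prod_{l\in G^i_k,\ l\notin T} du_l\, dw_l\, C_l(u_l,w_l) \times \prod_{\ell\in T^i_k} du_\ell\, C_\ell(u_\ell,U_\ell,W_\ell)\; e^{\imath URU + \imath USW}\, \Big( f(0) + \int_0^1 dt\, f'(t) \Big). \tag{A.15}$$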
The $f(0)$ term renormalizes the wave-function. The remainder term in (A.15) is irrelevant. $A_a^{2-}$ doubles the $A_a^{2+}$ contribution. Finally the last remainder terms (the last two lines in (A.9)) are irrelevant too. This completes the proof of the perturbative renormalizability of the LSZ models.

Note that if we had considered a real theory with a covariant derivative, which corresponds to a neutral scalar field in a magnetic background, the angular momentum term would not renormalize. Only the harmonic potential term would. It seems that the renormalization “distinguishes” the true theory in which a charged field should couple to a magnetic field. It would be interesting to study the renormalization group flow of this kind of model along the lines of [13].

B. Notations of Positions

• The letter $x$ is used for the four initial positions of a vertex.
• The letter $X$ is used solely for external positions of the considered graph or subgraphs.
• The letters $v$ and $u$ are used for the sum and difference of two positions joined by an internal line.
• The letter $w$ is used solely as another name for a $v$ variable which corresponds to a loop line (not a tree line) once a tree has been chosen.
• The letter $y$ is used for the collective of long and external variables.
• $z$ is to $y$ what $w$ is to $v$, namely a name for the external variables or long loop variables.
• $s$ and $t$ are names for external variables and ends of loop lines variables in rosette vertices.
Hence the same complete set of $4n$ variables for a graph with $n$ vertices can, depending on context, be denoted $x$; $X$, $u$ and $v$; $y$ and $u$; $X$, $u$, $w$ and the $v$ of the tree lines; $z$, $u$ and the $v$ of the tree lines. The $s$ and $t$ are only used in Subsect. 3.1.

Acknowledgements. We thank V. Gayral and R. Wulkenhaar for useful discussions on this work.
References
1. Grosse, H., Wulkenhaar, R.: Power-counting theorem for non-local matrix models and renormalisation. Commun. Math. Phys. 254(1), 91–127 (2005)
2. Grosse, H., Wulkenhaar, R.: Renormalisation of φ⁴-theory on noncommutative R⁴ in the matrix base. Commun. Math. Phys. 256(2), 305–374 (2005)
3. Rivasseau, V., Vignes-Tourneret, F., Wulkenhaar, R.: Renormalization of noncommutative φ⁴-theory by multi-scale analysis. Commun. Math. Phys. 262, 565–594 (2006)
4. Langmann, E., Szabo, R.J., Zarembo, K.: Exact solution of quantum field theory on noncommutative phase spaces. JHEP 01, 017 (2004)
5. Filk, T.: Divergencies in a field theory on quantum space. Phys. Lett. B376, 53–58 (1996)
6. Chepelev, I., Roiban, R.: Convergence theorem for non-commutative Feynman graphs and renormalization. JHEP 03, 001 (2001)
7. Chepelev, I., Roiban, R.: Renormalization of quantum field theories on noncommutative R^d. I: Scalars. JHEP 05, 037 (2000)
8. Langmann, E., Szabo, R.J.: Duality in scalar field theory on noncommutative phase spaces. Phys. Lett. B533, 168–177 (2002)
9. Douglas, M.R., Nekrasov, N.A.: Noncommutative field theory. Rev. Mod. Phys. 73, 977–1029 (2001)
10. Connes, A., Douglas, M.R., Schwarz, A.: Noncommutative geometry and matrix theory: Compactification on tori. JHEP 02, 003 (1998)
11. Seiberg, N., Witten, E.: String theory and noncommutative geometry. JHEP 09, 032 (1999)
12. Vignes-Tourneret, F.: Perturbative renormalizability of the noncommutative Gross-Neveu model. Work in progress
13. Grosse, H., Wulkenhaar, R.: The beta-function in duality-covariant noncommutative φ⁴-theory. Eur. Phys. J. C35, 277–282 (2004)
14. Rivasseau, V., Vignes-Tourneret, F.: Non-commutative renormalization. To appear in the Proceedings of Rigorous Quantum Field Theory: A Symposium in Honor of Jacques Bros, Paris, France, 19-21 Jul 2004 (2004). Available at http://www.arXiv.org/list/hep-th/0409312, 2004
15. Gurau, R., Rivasseau, V., Vignes-Tourneret, F.: Propagators for noncommutative field theories. http://www.arXiv.org/list/hep-th/0512071, 2005. Accepted by Ann. H. Poincaré
16. Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton Series in Physics. Princeton, NJ: Princeton University Press, 1991, 336 pp.
17. Gallavotti, G., Nicolò, F.: Renormalization theory in four-dimensional scalar fields. I. Commun. Math. Phys. 100, 545–590 (1985)
18. Langmann, E.: Interacting fermions on noncommutative spaces: Exactly solvable quantum field theories in 2n+1 dimensions. Nucl. Phys. B654, 404–426 (2003)

Communicated by A. Connes
Commun. Math. Phys. 267, 543–558 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0078-1
Communications in
Mathematical Physics
The 2D Euler Equations and the Statistical Transport Equations N. V. Chemetov1 , F. Cipriano2 1 CMAF, Universidade de Lisboa, Av. Prof. Gama Pinto 2, 1649-003 Lisboa, Portugal.
E-mail: [email protected]
2 GFM e Dep. de Matemática FCT-UNL, Av. Prof. Gama Pinto 2, 1649-003 Lisboa, Portugal.
E-mail: [email protected] Received: 5 January 2006 / Accepted: 28 February 2006 Published online: 1 August 2006 – © Springer-Verlag 2006
Abstract: We prove the existence of weak solutions for the forward and backward statistical transport equations associated with the 2D Euler equations. Such solutions can be interpreted, respectively, as a statistical Lagrangian and a statistical Eulerian description of the motion of the fluid.

1. Introduction

This article is concerned with the 2D Euler equations for an inviscid incompressible fluid
$$\frac{\partial v}{\partial t} = -(v\cdot\nabla)v - \nabla p, \qquad \operatorname{div} v = 0, \tag{1.1}$$
subjected to periodic boundary conditions and given initial data
$$v(x,0) = v_0(x), \tag{1.2}$$
where $v(x,t) = (v_1(x_1,x_2,t), v_2(x_1,x_2,t))$ is the velocity field of the fluid and $p = p(x,t)$ is the pressure. In the study of such a problem two different approaches are possible:
A) (1.1) is considered as a usual PDE with smooth or less smooth initial data;
B) (1.1) is considered as a statistical equation with low regularity of the initial data.

Concerning A, the existence and uniqueness results of the classical weak solutions for the initial data with $\operatorname{rot} v_0 \in L^\infty$ have been shown by different approaches in [Y, Ka, C-S]. The description of less smooth fluctuations of the fluid, when we have a velocity discontinuity (mixing layers, jets), is related to the case when $\operatorname{rot} v_0$ is a measure (delta function), but $v_0 \in L^2$. There is extensive literature on the subject of existence results; we refer to [K, DiP-M, D]. All the solutions of the Euler equations in the articles mentioned have the feature of finite kinetic energy.
However, many physical problems possess highly unstable structures, whose complete dynamics can not be described by a smooth model. To study such dynamics of the fluid, when we deal with the velocity field v0 belonging to H −s , s > 0, approach B is more natural. Statistical solutions of (1.1) are defined almost everywhere with respect to some probability measure, which is associated to physical quantities, which are invariants of the system. The underlying state space of the solutions is an infinite dimensional space, where a suitable differential calculus has to be considered. For our 2D case, in [A-H.K-M, A-H.K-R.F, B-F, C-D.G] invariant measures of Gibbs type have been constructed. These Gibbs measures are determined by quantities such as the enstrophy, the energy and the re-normalized energy and it has been proved that such measures are infinitesimally invariant with respect to the Euler equation. The existence of global flows (weak statistical solutions), leaving the measures invariant, has been shown in [A-C, Ci], where point-wise flows are carried by some families of probability measures. The flows take values in the support of the Gaussian measures with covariance given by the enstrophy, therefore such flows can be of infinite kinetic energy. Similar situations have arisen in Stochastic Analysis and, in particular, within Malliavin’s stochastic calculus of variations. In [Cr] (and also in [U-Z, P]) flows on the classical Wiener space, associated with vector fields with low regularity, have been defined. In our present article we develop the approach suggested in [A-C, Ci] to study flows with v0 ∈ H −s , s > 0. We introduce the concept of generalized statistical forward and backward flows. These are solutions of the corresponding Transport Equations defined on the invariant infinite Gibbs measure. 
This approach has many advantages compared with the previous results [A-C, Ci]: using these statistical transport equations, one can try to adapt methods from the usual PDE theory to important questions: 1) uniqueness; 2) regularity of solutions; 3) numerical methods for these statistical differential equations.

As to the more detailed structure of this paper, in Sect. 2 we formulate our problem, define the standard Gibbs measure $\mu_\gamma$ given by the enstrophy integral and present a very useful lemma on the integrability of the operator $B(U)$ in $L^2_{\mu_\gamma}$, associated to the nonlinear part of the Euler equation (1.1). In Sects. 4 and 5, we define the generalized forward and backward flows of (1.1), which are solutions of the transport equations, and show their existence. In Sect. 6 we describe how the generalized forward flow can be interpreted as the statistical Lagrangian viewpoint for the description of the motion of the fluid and the generalized backward flow as the corresponding statistical Eulerian viewpoint.
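2. Statement of the Problem

Let us return to the Euler equations (1.1), (1.2). Since $\operatorname{div} v = 0$ and $\operatorname{div} v_0 = 0$, there exist functions $U = U(x,t)$, $u = u(x)$, such that
$$v = \nabla^\perp U = (-\partial_{x_2}U, \partial_{x_1}U), \qquad v_0 = \nabla^\perp u = (-\partial_{x_2}u, \partial_{x_1}u).$$
We can eliminate the pressure $p$ in (1.1) by applying the differential operator $\operatorname{rot} z = -\partial_{x_2}z_1 + \partial_{x_1}z_2$ to the first equation of (1.1) and obtain
$$\partial_t \Delta U = -\partial_{x_2}U\cdot\partial_{x_1}\Delta U + \partial_{x_1}U\cdot\partial_{x_2}\Delta U. \tag{2.1}$$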
We consider solutions of (1.1), (1.2) on the 2-dimensional torus that we identify with $T^2 = [0,2\pi]\times[0,2\pi]$ subjected to periodic boundary conditions
$$U(0,x_2,t) = U(2\pi,x_2,t), \qquad U(x_1,0,t) = U(x_1,2\pi,t) \tag{2.2}$$
$\forall x = (x_1,x_2)\in T^2$, $\forall t\in[0,T]$. Let us denote by $e_k(x) = \frac{1}{2\pi}e^{ik\cdot x}$, $k\in\mathbb{Z}^2$, the eigenfunctions for the operator $-\Delta$ with eigenvalues $k^2 = k_1^2 + k_2^2$, where $k\cdot x = k_1x_1 + k_2x_2$. They form a complete set of orthonormal functions in $L^2(T^2)$. We expand the solution $U(x,t)$ of (2.1) in the form of a Fourier series,
$$U(x,t) = \sum_k U_k(t)e_k(x).$$
Since $U$ is a real function and we can assume $\int_{T^2} U\,dx = 0$, then $U_{-k} = \overline{U_k}$ ($\bar z$ is the complex conjugate of $z$) and
$$U(x,t) = \sum_{k\in\mathbb{Z}^2_+} U_k(t)e_k(x), \tag{2.3}$$
where $\mathbb{Z}^2_+$ denotes the set $\{k\in\mathbb{Z}^2 : k_1>0,\ k_2\in\mathbb{Z}\ \text{or}\ k_1=0,\ k_2>0\}$. For the initial data $u(x)$ we have
$$u(x) = \sum_{k\in\mathbb{Z}^2_+} u_k e_k(x). \tag{2.4}$$
In the sequel by (2.3), (2.4), we can identify the functions $U$, $u$ with infinite vectors of Fourier coefficients
$$U = (U_k)_{k\in\mathbb{Z}^2_+} \quad\text{and}\quad u = (u_k)_{k\in\mathbb{Z}^2_+},$$
where $k\in\mathbb{Z}^2_+$. We define $\mathbb{C}^\infty = \{u = (u_k)_{k\in\mathbb{Z}^2_+} : u_k\in\mathbb{C}\}$.

Substituting (2.3) in Eq. (2.1) and introducing the operator $B(U) = (B_k(U))_{k\in\mathbb{Z}^2_+}$ with coefficients $B_k = B_k(U)$ described by the equalities
$$B_k(U) = \sum_{h\neq k,\ h,k\in\mathbb{Z}^2_+} \alpha_{h,k}U_hU_{k-h}, \qquad \alpha_{h,k} = \frac{1}{2\pi}\Big[\frac{1}{k^2}(h^\perp\cdot k)(h\cdot k) - \frac{1}{2}(h^\perp\cdot k)\Big], \tag{2.5}$$
where $h^\perp = (-h_2,h_1)$, the system (1.1), (1.2), (2.2) will be equivalent to

Problem. Find $U = U(t)$, the solution of the infinite dimensional system
$$\frac{d}{dt}U(t) = B(U(t)) \tag{2.6}$$
with the initial conditions
$$U(0) = u. \tag{2.7}$$
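The coefficients $\alpha_{h,k}$ of (2.5) are easy to tabulate. The following minimal sketch evaluates the formula as written above; the function name `alpha` and the tuple representation of wave vectors are assumptions made for illustration.

```python
import math

def alpha(h, k):
    """Coefficient alpha_{h,k} of (2.5) for integer wave vectors h, k (k != 0):
    (1/(2*pi)) * [ (h_perp . k)(h . k) / |k|^2 - (1/2)(h_perp . k) ],
    with h_perp = (-h2, h1)."""
    h1, h2 = h
    k1, k2 = k
    perp_dot = -h2 * k1 + h1 * k2      # h_perp . k
    dot = h1 * k1 + h2 * k2            # h . k
    k_sq = k1 * k1 + k2 * k2           # |k|^2
    return (perp_dot * dot / k_sq - 0.5 * perp_dot) / (2 * math.pi)

# Both terms carry the factor h_perp . k, so alpha vanishes when h is parallel to k.
parallel_value = alpha((2, 4), (1, 2))
# A direct evaluation: h = (1, 0), k = (0, 1) gives -1/(4*pi).
sample_value = alpha((1, 0), (0, 1))
# The formula is symmetric under h -> k - h, as expected for a coefficient
# multiplying the symmetric product U_h U_{k-h}.
symmetry_gap = abs(alpha((2, 0), (3, 1)) - alpha((1, 1), (3, 1)))
```

In particular the term with $h = 2k$, which would pair $U_{2k}$ with $U_{-k} = \overline{U_k}$, carries a zero coefficient, in line with the later remark that $B_k^n$ does not depend on the component $u_k$.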
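Now we introduce the Sobolev spaces of order $\beta\in\mathbb{R}$ on the torus $T^2$,
$$H^\beta = \Big\{\varphi = \sum_k \varphi_k e_k : \sum_k k^{2\beta}|\varphi_k|^2 < +\infty,\ \varphi_{-k} = \overline{\varphi_k}\Big\} \equiv \Big\{\varphi = (\varphi_k)_{k\in\mathbb{Z}^2_+}\in\mathbb{C}^\infty : \sum_{k\in\mathbb{Z}^2_+} k^{2\beta}|\varphi_k|^2 < +\infty\Big\}. \tag{2.8}$$
The spaces $H^\beta$ are complex Hilbert spaces with inner product and norm given by
$$\langle\varphi,\psi\rangle_{H^\beta} = \sum_{k\in\mathbb{Z}^2_+} k^{2\beta}\varphi_k\bar\psi_k, \qquad \|\varphi\|^2_{H^\beta} = \langle\varphi,\varphi\rangle_{H^\beta}.$$
The two dimensional Euler equation has an infinite number of invariants of motion, among which we mention the energy and the enstrophy, defined respectively by
$$E(t) = \frac{1}{2}\int_{T^2}|v|^2\,dx = \frac{1}{2}\sum_k k^2|u_k|^2 = \frac{1}{2}\|u\|^2_{H^1}, \qquad S(t) = \frac{1}{2}\int_{T^2}(\operatorname{rot} v)^2\,dx = \frac{1}{2}\sum_k k^4|u_k|^2 = \frac{1}{2}\|u\|^2_{H^2}.$$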
Lemma 2.1 ([A-C]). Let u(t) be a smooth solution of the Euler equation (1.1), (1.2), (2.2). Then E(t), S(t) are constants of motion, i. e., E(t) = E(0) and S(t) = S(0), ∀t ∈ [0, T ].
(2.9)
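We consider the Gaussian measures with covariance given by the enstrophy multiplied by a constant $\gamma>0$, namely
$$d\mu_\gamma(\varphi) = \prod_{k\in\mathbb{Z}^2_+} d\nu^k_\gamma(\varphi_k), \qquad d\nu^k_\gamma(z) = \frac{\gamma k^4}{2\pi}\exp\Big(-\frac{1}{2}\gamma k^4|z|^2\Big)\,dx\,dy \tag{2.10}$$
with $z = x + iy$. Taking into account that $\mu_\gamma(H^\beta) = 1$ for any $\beta<1$, then, for simplification of notations, the integral will be written
$$\int\varphi(u)\,d\mu_\gamma(u) \quad\text{instead of}\quad \int_{H^\beta}\varphi(u)\,d\mu_\gamma(u)$$
for the arbitrary function $\varphi = \varphi(u) : \mathbb{C}^\infty\to\mathbb{C}$, which is integrable with respect to $\mu_\gamma$. In the following we will assume that the value of $\beta<1$ is given.

Definition 1. We define the Banach space
$$L^p_{\mu_\gamma}(H^\beta) = \Big\{\varphi : \mathbb{C}^\infty\to\mathbb{C} : \int|\varphi(u)|^p\,d\mu_\gamma(u) < \infty\Big\}$$
with the norm
$$\|\varphi\|^p_{L^p_{\mu_\gamma}} = \int|\varphi(u)|^p\,d\mu_\gamma(u), \qquad p\ge 1. \tag{2.11}$$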
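The threshold $\beta<1$ can be made plausible by a second-moment count. Under the density in (2.10) each real component of $\varphi_k$ has variance $1/(\gamma k^4)$, so $\int|\varphi_k|^2d\mu_\gamma = 2/(\gamma k^4)$ and the expected squared $H^\beta$ norm is $\frac{2}{\gamma}\sum_{k\in\mathbb{Z}^2_+}|k|^{2\beta-4}$, which is finite in two dimensions exactly when $\beta<1$. The sketch below only evaluates truncated sums; the function names and the truncation radius `n` are assumptions for illustration.

```python
def index_set(n):
    """Z^2_{+,n}: integer k with |k| <= n and (k1 > 0, or k1 == 0 and k2 > 0)."""
    return [(k1, k2)
            for k1 in range(0, n + 1)
            for k2 in range(-n, n + 1)
            if (k1 > 0 or k2 > 0) and k1 * k1 + k2 * k2 <= n * n]

def expected_sq_norm(beta, gamma, n):
    """Truncated expectation of ||phi||^2 in H^beta under mu_gamma:
    sum over Z^2_{+,n} of |k|^(2*beta) * 2 / (gamma * |k|^4)."""
    total = 0.0
    for k1, k2 in index_set(n):
        k_sq = k1 * k1 + k2 * k2
        total += k_sq ** beta * 2.0 / (gamma * k_sq ** 2)
    return total

# Partial sums for a subcritical and for the critical exponent (illustrative values).
subcritical = [expected_sq_norm(0.5, 1.0, n) for n in (10, 20, 40)]
critical = [expected_sq_norm(1.0, 1.0, n) for n in (10, 20, 40)]
```

For $\beta = 0.5$ the partial sums level off as `n` grows, while for $\beta = 1$ they keep growing roughly like $\log n$; this is a consistency check on the statement above, not a proof of it.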
3. Useful Previous Results

In this paragraph we state some results that will be needed in the next sections. Let us denote $\mathbb{Z}^2_{+,n} = \{k\in\mathbb{Z}^2_+ : |k|\le n\}$ and $d(n) = \#\mathbb{Z}^2_{+,n}$. We consider the finite dimensional approximations of $B_k(u)$ defined as
$$B^n_k(u) = \sum_{h\neq k,\ h,k\in\mathbb{Z}^2_{+,n}} \alpha_{h,k}u_hu_{k-h}. \tag{3.1}$$
Following [A-C, Ci], we can verify that
$$B^n_k \to B_k \ \text{ in } L^p_{\mu_\gamma}(H^\beta), \qquad p>1, \tag{3.2}$$
and the functional $B(u) = \sum_{k\in\mathbb{Z}^2_+} B_k(u)e_k$ is integrable as a functional from $H^\beta$ to $H^\beta$ only if $\beta<-1$, that is
$$\int\|B(u)\|^p_{H^\beta}\,d\mu_\gamma < \infty, \qquad p>1 \text{ and } \beta<-1.$$
More precisely, we have

Lemma 3.1. For any $k\in\mathbb{Z}^2_+$ we have
$$B_k(u)\in L^p_{\mu_\gamma}(H^\beta),\ \forall p>1,\ \beta<1, \qquad B(u)\in L^p_{\mu_\gamma}(H^\beta,H^\beta),\ \forall p>1,\ \beta<-1.$$

Definition 2. An arbitrary complex function $f = f(u) : \mathbb{C}^\infty\to\mathbb{C}$ is a cylindrical function if, for some integer $N$, we have $f = f(u)\equiv F(u_{\alpha_1},\dots,u_{\alpha_{d(N)}})$, where $F$ is a $C^1_0(\mathbb{C}^{d(N)})$-smooth function depending only on the components $u_{\alpha_i}$, $\alpha_i\in\mathbb{Z}^2_{+,d(N)}$.

Definition 3. The operator $\delta_{\mu_\gamma}B : H^\beta\to\mathbb{C}$, which satisfies
$$\int B(u)\cdot\nabla f(u)\,d\mu_\gamma(u) = \int\delta_{\mu_\gamma}B(u)\cdot f(u)\,d\mu_\gamma(u)$$
for any cylindrical function $f$, is named the divergence of the field $B(u)$ with respect to the measure $\mu_\gamma$.

Lemma 3.2. We have $\delta_{\mu_\gamma}B(u) = 0$ for $\mu_\gamma$-a.e. $u$.

Proof. To prove this result we use the approximations $B^n_k$ defined in (3.1). Since $B^n_k$ does not depend on the component $u_k$, using Lemma 2.1, we can verify that $\delta_{\mu_\gamma}B^n = 0$, $\forall n\in\mathbb{N}$. Noting that the definition of $\delta_{\mu_\gamma}B$ only involves integration against cylindrical functions, we obtain (3.2) (cf. [A-C] for detailed proof of this result).
4. Euler equations and transport equations

Let us return to the Problem, written as
$$\frac{d}{dt}U_k(t) = B_k(U(t)), \qquad \forall k\in\mathbb{Z}^2_+, \tag{4.1}$$
with initial condition $U_k(0) = u_k$. The solution of this system $U = U(t)$ can be considered as a function of both the time parameter $t\in[0,T]$ and the initial data $u\in\mathbb{C}^\infty$; therefore we define $U(t,u) := U(t)$ for $(t,u)\in[0,T]\times\mathbb{C}^\infty$. Let us assume, just formally, that $B = B(u)$ and $U = U(t,u)$ are $C^1$-differentiable functions. We can write the identity (flow property)
$$U(t+s,u) = U(t,U(s,u)), \qquad \forall t,s\ge 0,\ 0\le t+s\le T. \tag{4.2}$$
Taking the derivative on the time variable $s$ we obtain
$$\frac{\partial}{\partial t}U_k(t+s,u) = \sum_l\frac{\partial}{\partial u_l}U_k(t,U(s,u))\,B_l(U(s,u)) \tag{4.3}$$
for each $k\in\mathbb{Z}^2_+$. For $s = 0$ we deduce that the function $U = U(t,u)$ satisfies the linear transport equation
$$\frac{\partial}{\partial t}U_k(t,u) = B(u)\cdot\nabla U_k(t,u), \qquad k\in\mathbb{Z}^2_+,$$
with initial condition $U_k(0,u) = u_k$. Let us consider $\psi\in C^1([0,T])$ with $\psi(T) = 0$ and $f$ any cylindrical function. We multiply the linear transport equation by $\psi(t)f(u)$ and integrate with respect to the measure $dt\times d\mu_\gamma(u)$, where $dt$ denotes the Lebesgue measure on $[0,T]$. Integrating by parts, considering the initial condition and the fact that $\delta_{\mu_\gamma}B = 0$, we verify that a regular solution of (4.1) satisfies the integral equation
$$\int u_k\,\psi(0)f(u)\,d\mu_\gamma(u) + \int_0^T\!\!\int U_k(t,u)\,\psi'(t)f(u)\,d\mu_\gamma(u)\,dt = \int_0^T\!\!\int U_k(t,u)\,\big(B(u)\cdot\nabla f(u)\big)\,\psi(t)\,d\mu_\gamma(u)\,dt. \tag{4.4}$$

Definition 4. A function $U = U(t,u) = (U_k(t,u))_{k\in\mathbb{Z}^2_+}$ is called a generalized forward flow of the Problem if
$$U_k(t,u)\in W^{1,\infty}([0,T],L^p_{\mu_\gamma}(H^\beta)), \qquad p>1,\ \beta<1,$$
$\forall k\in\mathbb{Z}^2_+$ and the identities (4.4) hold, for any cylindrical function $f = f(u)$ and any $\psi\in C^1([0,T])$, such that $\psi(T) = 0$.

Theorem 4.1. There exists a generalized forward flow $U(t,u)$ of the Problem such that
$$\int|U_k(t,u)|^p\,d\mu_\gamma(u) \le \int|u_k|^p\,d\mu_\gamma(u) < \infty \tag{4.5}$$
for any $t\in[0,T]$ and $k\in\mathbb{Z}^2_+$.
Proof. From the theory of O. D. E. and the conservation of the energy (Lemma 2.1), there exists a unique solution of
$$\frac{d}{dt}U^n_k(t) = B^n_k(U^n(t)),\ \forall k\in\mathbb{Z}^2_{+,n}, \qquad U^n(0) = u\in\mathbb{C}^{d(n)}, \tag{4.6}$$
such that
$$U^n(t) = U^n(t,u)\in C^1\big([0,T]\times\mathbb{C}^{d(n)}\big). \tag{4.7}$$
Applying (4.2), (4.3) to each function $U^n_k(t,u)$, we deduce that any solution of (4.6) satisfies the system
$$\frac{\partial}{\partial t}U^n_k(t,u) = B^n(u)\cdot\nabla U^n_k(t,u), \qquad U^n_k(0) = u_k. \tag{4.8}$$
Taking into account that $\delta_{\mu_\gamma}B^n = 0$, we deduce
$$\int u_k\,\psi(0)f(u)\,d\mu_\gamma(u) + \int_0^T\!\!\int U^n_k(t,u)\,\psi'(t)f(u)\,d\mu_\gamma(u)\,dt = \int_0^T\!\!\int U^n_k(t,u)\,\big(B^n(u)\cdot\nabla f(u)\big)\,\psi(t)\,d\mu_\gamma(u)\,dt \tag{4.9}$$
for any cylindrical function $f = f(u)$ and $\forall\psi\in C^1([0,T])$, such that $\psi(T) = 0$. The condition $\delta_{\mu_\gamma}B^n = 0$, $\forall n\in\mathbb{N}$ implies that the measure $\mu_\gamma$ is invariant for the flow map associated with $B^n$. More precisely
$$\int f(U^n(t,u))\,d\mu_\gamma(u) = \int f(u)\,d\mu_\gamma(u), \quad\text{for any cylindrical function } f. \tag{4.10}$$
In particular we have
$$\int|U^n_k(t,u)|^p\,d\mu_\gamma(u) = \int|u_k|^p\,d\mu_\gamma(u) < \infty, \qquad \int|\partial_tU^n_k(t,u)|^p\,d\mu_\gamma(u) \le \int|B_k(u)|^p\,d\mu_\gamma(u) < \infty. \tag{4.11}$$
Therefore there exists a subsequence of $\{U^n_k, B^n_l\}$ such that, when $n\to\infty$,
$$U^n_k(t,u)\rightharpoonup U_k(t,u) \text{ (weakly) in } W^{1,\infty}\big([0,T],L^p_{\mu_\gamma}(H^\beta)\big), \qquad B^n_l(u)\to B_l(u) \text{ in } L^p_{\mu_\gamma}(H^\beta) \tag{4.12}$$
for all $k$, $l$. This convergence allows us to pass to the limit in Eq. (4.9), and conclude that $U_k(t,u)$ satisfies Eq. (4.4) for every $k$. Inequality (4.5) follows from (4.11) and the weak convergence of $U^n_k(t,u)$.
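The formal step of Sect. 4, from the flow property (4.2) to the linear transport equation $\partial_tU = B(u)\cdot\nabla U$, can be illustrated on a one-dimensional toy problem. The sketch below assumes the scalar field $B(u) = u^2$, whose flow $U(t,u) = u/(1-tu)$ is explicit; this toy model, the sample point and the step size `step` are assumptions and have nothing to do with the Euler field $B$ itself.

```python
def B(u):
    # Toy vector field (assumption): du/dt = u^2
    return u * u

def U(t, u):
    # Explicit flow of the toy ODE, valid while t * u < 1
    return u / (1.0 - t * u)

def transport_residual(t, u, step=1e-5):
    """Central-difference estimate of dU/dt - B(u) * dU/du at (t, u)."""
    dU_dt = (U(t + step, u) - U(t - step, u)) / (2 * step)
    dU_du = (U(t, u + step) - U(t, u - step)) / (2 * step)
    return dU_dt - B(u) * dU_du

t0, s0, u0 = 0.3, 0.2, 0.5                          # illustrative sample point
flow_gap = abs(U(t0 + s0, u0) - U(t0, U(s0, u0)))   # flow property (4.2)
residual = abs(transport_residual(t0, u0))          # transport equation
```

Both `flow_gap` and `residual` are zero up to rounding and discretization error, mirroring in one dimension the computation that leads from (4.2) and (4.3) to the transport equation.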
5. Transport Equations and Liouville-Type Equations

As discussed in [F-M-R-T], in turbulent flow regimes, physical properties are universally recognized as randomly varying and characterized by suitable probability distribution functions. For instance, some turbulent processes, due to technical difficulties, cannot be measured with good precision; measurements are therefore to be taken with some error estimates. This is why we speak about finding the solution in some distribution class of initial data. Mathematically this can be formulated as follows: If the initial conditions are given according to a measure (distribution)
$$\nu_0 = \nu_0(u) \tag{5.1}$$
on the phase space $\mathbb{C}^\infty$, then the solution of the Euler equation (4.1) with initial distribution $\nu_0$ at some later time $t$ will be distributed according to another distribution $\nu_t = \nu_t(u)$. How can we determine this time dependent distribution $\nu_t = \nu_t(u)$ with respect to the initial distribution $\nu_0 = \nu_0(u)$?

Definition 5. A distribution $\nu_t = \nu_t(u)$ is called a generalized backward flow of the Euler equation (4.1) with initial distribution $\nu_0$, if $\nu_t(u)$ is a probability measure and satisfies the Liouville-type equations
$$\int f(u)\,d\nu_t(u) = \int f(u)\,d\nu_0(u) + \int_0^t\!\!\int B(u)\cdot\nabla f(u)\,d\nu_\tau(u)\,d\tau \tag{5.2}$$
for any cylindrical function $f = f(u)$.

5.1. Existence for initial data absolutely continuous with respect to the measure $\mu_\gamma$. The class of all absolutely continuous probability measures with respect to the Gaussian measure of the form $v(u)\,d\mu_\gamma(u)$ with $v\in L^q_{\mu_\gamma}(H^\beta)$, $q>1$, will be denoted as $M^q_{\mu_\gamma}$.
Theorem 5.1. For any initial probabilistic distribution $d\nu_0 = v_0(u)\,d\mu_\gamma(u)\in M^q_{\mu_\gamma}$, $q>1$, there exists a generalized backward flow $\nu_t$ of the Euler equation (4.1) such that
$$d\nu_t(u) = V(t,u)\,d\mu_\gamma(u) \quad\text{and}\quad V(t,u)\in L^\infty\big([0,T],L^q_{\mu_\gamma}(H^\beta)\big),$$
that is, $\nu_t(u)\in M^q_{\mu_\gamma}$, for a.e. $t\in[0,T]$. Moreover the function $V = V(t,u)$ is a weak solution of the transport equation
$$\partial_tV(t,u) + B(u)\cdot\nabla V(t,u) = 0, \qquad V(0,u) = v_0(u). \tag{5.3}$$

Proof. Let us consider regular initial distributions $d\nu^n_0(u) = v^n_0(u)\,d\mu_\gamma(u)$, where $v^n_0\in C^1(\mathbb{C}^{d(n)})$, $\int v^n_0(u)\,d\mu_\gamma(u) = \int v_0(u)\,d\mu_\gamma(u)$ and $v^n_0\to v_0$ in $L^q_{\mu_\gamma}(H^\beta)$. Here the function $v^n_0$ can be taken as $P_{1/n^2}v_0(U_{\alpha_1},\dots,U_{\alpha_{d(n)}})$, $\alpha_i\in\mathbb{Z}^2_{+,d(n)}$, where
$$P_tf(u) = \int f\big(e^{-t}u + \sqrt{1-e^{-2t}}\,y\big)\,d\mu_\gamma(y)$$
is the Ornstein-Uhlenbeck operator (see [U-Z]). For any cylindrical function $f = f(u)$ we have
$$\frac{d}{dt}\int f\big(U^n(t,u)\big)v^n_0(u)\,d\mu_\gamma(u) = \int B^n\big(U^n(t,u)\big)\cdot\nabla f\big(U^n(t,u)\big)\,v^n_0(u)\,d\mu_\gamma(u);$$
here $U^n(t,u)\in C^1\big([0,T]\times\mathbb{C}^{d(n)}\big)$ is the solution of problem (4.6). Making a change of variables from $u$ to $U = U^n(t,u)$ (cf. (4.10)), we obtain
$$\frac{d}{dt}\int f(u)\,v^n_0\big(U^n(-t,u)\big)\,d\mu_\gamma(u) = \int B^n(u)\cdot\nabla f(u)\,v^n_0\big(U^n(-t,u)\big)\,d\mu_\gamma(u) \tag{5.4}$$
and for the function $V^n(t,u) = v^n_0\big(U^n(-t,u)\big)$, we have
$$\int|V^n(t,u)|^q\,d\mu_\gamma(u) = \int|v^n_0(u)|^q\,d\mu_\gamma(u) < C, \qquad \forall n. \tag{5.5}$$
So there exists a subsequence of $V^n$ such that
$$V^n(t,u)\rightharpoonup V(t,u) \text{ (weakly) in } L^\infty\big([0,T],L^q_{\mu_\gamma}(H^\beta)\big). \tag{5.6}$$
Hence by (3.2), (5.6) and (5.4) the distribution $d\nu_t(u) = V(t,u)\,d\mu_\gamma(u)$ satisfies (5.2). Since $\nu_0$ is a probability measure, convergence (5.6) implies that $\nu_t$ is also a probability measure.

Let us now show that $V(t,U)$ can be obtained as a weak solution of the transport equation (5.3). To do it we consider the approximated system
$$\partial_tV^n(t,u) + B^n(u)\cdot\nabla V^n(t,u) = 0, \qquad V^n(0,u) = v^n_0(u). \tag{5.7}$$
This system has a unique regular solution $V^n = V^n(t,u)\in C^1\big([0,T]\times\mathbb{C}^{d(n)}\big)$, which satisfies the identity
$$\frac{d}{dt}V^n\big(t,U^n(t,u)\big) = 0,$$
i.e. $V^n(t,u)$ verifies (5.1). On the other hand $V^n(t,u)$ is a solution of the weak formulation for the problem (5.7), namely
$$\int v^n_0(u)\,\psi(0)f(u)\,d\mu_\gamma(u) + \int_0^T\!\!\int V^n(t,u)\,\psi'(t)f(u)\,d\mu_\gamma(u)\,dt = \int_0^T\!\!\int V^n(t,u)\,\big(B^n(u)\cdot\nabla f(u)\big)\,\psi(t)\,d\mu_\gamma(u)\,dt \tag{5.8}$$
for any fixed cylindrical function $f = f(u)$ and $\psi\in C^1([0,T])$, $\psi(T) = 0$. From (3.2) and (5.6) we deduce that $V(t,U)$ is a weak solution of (5.3).
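The construction $V^n(t,u) = v^n_0(U^n(-t,u))$ used in the proof, a density transported backwards along the flow, can be checked against the transport equation (5.3) on a scalar toy model. The sketch below assumes the field $B(u) = u^2$ with explicit flow $U(t,u) = u/(1-tu)$ and a Gaussian-shaped initial density `v0`; all of these are illustrative assumptions, not the objects of the proof.

```python
import math

def B(u):
    # Toy vector field (assumption): du/dt = u^2
    return u * u

def U(t, u):
    # Explicit flow of the toy ODE; for backward times t < 0 it is defined when 1 - t*u > 0
    return u / (1.0 - t * u)

def v0(u):
    # Hypothetical smooth initial density profile
    return math.exp(-u * u)

def V(t, u):
    # Density pulled back along the flow: V(t, u) = v0(U(-t, u))
    return v0(U(-t, u))

def backward_residual(t, u, step=1e-5):
    """Central-difference estimate of dV/dt + B(u) * dV/du at (t, u)."""
    dV_dt = (V(t + step, u) - V(t - step, u)) / (2 * step)
    dV_du = (V(t, u + step) - V(t, u - step)) / (2 * step)
    return dV_dt + B(u) * dV_du

t0, u0 = 0.4, 0.7                               # illustrative sample point
residual = abs(backward_residual(t0, u0))       # transport equation (5.3)
constant_gap = abs(V(t0, U(t0, u0)) - v0(u0))   # V is constant along trajectories
```

`residual` vanishes up to discretization error and `constant_gap` up to rounding, which is the one-dimensional analogue of the identity $\frac{d}{dt}V^n(t,U^n(t,u)) = 0$ quoted in the proof.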
Remark 1. The measure νt = µγ , ∀t ≥ 0 is a particular generalized backward solution of the Euler equation (4.1) with initial condition (5.1), a result which has been proved in paper [A-C]. In the following two paragraphs we study the evolution of turbulent processes with the initial data, which are Dirac measures.
5.2. Approximation of the Dirac measure. Since we shall need the results of Lemma 3.1, in this subsection and in Subsect. 5.3, we consider the space $H^\beta$ with a fixed $\beta$, such that $\beta<-1$. Let $\delta_{z^0}$ be the Dirac measure concentrated at a given point $z^0\in H^\beta$, that is for any set $A\subseteq H^\beta$, we have
(a) $\delta_{z^0}(A) = 1$, if $z^0\in A$;
(b) $\delta_{z^0}(A) = 0$, if $z^0\notin A$.

The main objective of this subsection is to construct an approximation of the Dirac measure $\delta_{z^0}$ with respect to the measure $\mu_\gamma$. Let $\varepsilon>0$ be a fixed real. If we take an arbitrary $z^0\in H^\beta$, we define the set $B_{\varepsilon_k}(z^0_k) = \{z_k\in\mathbb{C} : |z_k - z^0_k|\le\varepsilon_k\}$ with $\varepsilon_k = \dfrac{\varepsilon}{k^{3/2}}$ and $z^0_k$ being the $k$-th coordinate of $z^0$. We consider the function
$$g_{k,\varepsilon}(z^0_k) := \int_{\mathbb{C}}\chi^{\varepsilon_k}_{z^0_k}(z_k)\,d\mu^k_\gamma(z_k),$$
where $\chi^{\varepsilon_k}_{z^0_k}(z_k)$ is the characteristic function of the set $B_{\varepsilon_k}(z^0_k)$. Let us compute the integral of $g_{k,\varepsilon}(z^0_k)$ on $\mathbb{C}$ with respect to the measure $\mu^k_\gamma(z^0_k)$, that gives
$$\int_{\mathbb{C}} g_{k,\varepsilon}(z^0_k)\,d\mu^k_\gamma(z^0_k) = 1 - e^{-|k|A} \quad\text{with}\quad A = \frac{\gamma\varepsilon^2}{4}.$$
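The value $1-e^{-|k|A}$ has a probabilistic reading: it is the probability that two independent samples of the $k$-th Gaussian factor lie within distance $\varepsilon_k$ of each other. The Monte Carlo sketch below illustrates this under the assumption that $\mu^k_\gamma$ is the one-mode Gaussian $\nu^k_\gamma$ of (2.10), whose real components have variance $1/(\gamma|k|^4)$; the sample size, seed and parameter values are arbitrary choices.

```python
import math
import random

def overlap_probability(k_norm, gamma, eps, samples=20000, seed=0):
    """Monte Carlo estimate of P(|z - z0| <= eps_k) for independent z, z0
    distributed according to the one-mode Gaussian of (2.10)."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(gamma * k_norm ** 4)  # std of each real component
    radius = eps / k_norm ** 1.5                  # eps_k = eps / |k|^(3/2)
    hits = 0
    for _ in range(samples):
        dx = rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
        dy = rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
        if dx * dx + dy * dy <= radius * radius:
            hits += 1
    return hits / samples

def closed_form(k_norm, gamma, eps):
    # 1 - exp(-|k| * A) with A = gamma * eps^2 / 4
    return 1.0 - math.exp(-k_norm * gamma * eps ** 2 / 4.0)

estimate = overlap_probability(2.0, 1.0, 1.0)
exact = closed_form(2.0, 1.0, 1.0)
```

The estimate agrees with the closed form up to sampling noise. The choice $\varepsilon_k = \varepsilon/k^{3/2}$ is what turns the exponent $\gamma k^4\varepsilon_k^2/4$ into $|k|A$.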
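Let us now define two sequences
$$L_n := \prod_{j=1}^n\big(1-e^{-jA}\big)^{2j}, \qquad T_n := \prod_{k\in\mathbb{Z}^2_{+,n}}\big(1-e^{-|k|A}\big), \qquad n = 1,2,\dots.$$
The sequences $\{L_n\}$, $\{T_n\}$ are monotone decreasing, such that $0<L_n\le T_n<1$, $\forall n = 1,2,\dots$ and $\lim_{n\to\infty}\ln(L_n) = \sum_{j=1}^\infty 2j\ln(1-e^{-jA})$ is a convergent series, therefore we have
$$0 < \lim_{n\to\infty}L_n \le T_\infty := \lim_{n\to\infty}T_n < 1.$$
Let us consider the sequence $\{\chi^{(n,\varepsilon)}_{z^0}(z)\}_{n=1}^\infty$ of functions defined as
$$\chi^{(n,\varepsilon)}_{z^0}(z) := \prod_{k\in\mathbb{Z}^2_{+,n}}\chi^{\varepsilon_k}_{z^0_k}(z_k), \qquad \forall(z^0,z)\in H^\beta\times H^\beta,$$
that is monotone decreasing in $n$ and bounded: $0\le\chi^{(n,\varepsilon)}_{z^0}(z)\le 1$. It is clear that
$$\chi^{(n,\varepsilon)}_{z^0}(z)\ \xrightarrow[n\to\infty]{}\ \chi^\varepsilon_{z^0}(z) = \prod_{k\in\mathbb{Z}^2_+}\chi^{\varepsilon_k}_{z^0_k}(z_k), \qquad \forall(z^0,z)\in H^\beta\times H^\beta.$$
Let us introduce the following functions:
$$f^{(n,\varepsilon)}(z^0) := \int\chi^{(n,\varepsilon)}_{z^0}(z)\,d\mu_\gamma(z), \qquad f^\varepsilon(z^0) := \int\chi^\varepsilon_{z^0}(z)\,d\mu_\gamma(z). \tag{5.9}$$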
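By Lebesgue’s theorem of dominated convergence and (5.9), we have
$$\lim_{n\to\infty} f^{(n,\varepsilon)}(z^0) = f^{\varepsilon}(z^0), \quad \forall z^0 \in H^\beta,$$
$$\lim_{n\to\infty} \int f^{(n,\varepsilon)}(z^0)\, d\mu_\gamma(z^0) = \int f^{\varepsilon}(z^0)\, d\mu_\gamma(z^0) = T_\infty \in (0, 1).$$
Considering Girsanov’s theorem (cf. [U-Z]), this implies $f^{\varepsilon}(z^0) > 0$, $\mu_\gamma$-a.e. Therefore the functions
$$\delta^{(n,\varepsilon)}_{z^0}(z) := \frac{1}{f^{(n,\varepsilon)}(z^0)} \cdot \chi^{(n,\varepsilon)}_{z^0}(z), \qquad \delta^{\varepsilon}_{z^0}(z) := \frac{1}{f^{\varepsilon}(z^0)} \cdot \chi^{\varepsilon}_{z^0}(z) \tag{5.10}$$
are well defined and, for $\mu_\gamma$-a.e. $z^0$ and $\forall z \in H^\beta$, $\delta^{(n,\varepsilon)}_{z^0}(z) \to \delta^{\varepsilon}_{z^0}(z)$ when $n \to \infty$. Moreover we have
1) $\int \delta^{\varepsilon}_{z^0}(z)\, d\mu_\gamma(z) = 1$ for $\mu_\gamma$-a.e. $z^0$;
2) For $\mu_\gamma$-a.e. $z^0$, the function $\delta^{\varepsilon}_{z^0}(z)$ has a compact support with respect to the weak topology in $H^\beta$:
$$\mathrm{supp}(\delta^{\varepsilon}_{z^0}) \subset B_{\varepsilon D}(z^0) = \Big\{ z \in H^\beta : \|z - z^0\|^2_{H^\beta} \le \varepsilon^2 \sum_{k \in \mathbb{Z}^2_+} \frac{1}{k^3} =: \varepsilon^2 D^2 \Big\}. \tag{5.11}$$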
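By Prokhorov’s theorem [G-S], there exists a subsequence of the measures $\delta^{\varepsilon}_{z^0}(z)\, d\mu_\gamma(z)$ converging weakly to a measure $dm_{z^0}(z)$, for $\mu_\gamma$-a.e. $z^0$, when $\varepsilon \to 0$. Using 1 and 2 above we can verify that $m_{z^0}$ coincides with the Dirac measure $\delta_{z^0}$, that is
$$\delta^{\varepsilon}_{z^0}(z)\, d\mu_\gamma(z) \rightharpoonup d\delta_{z^0}(z) \quad \text{weakly as } \varepsilon \to 0, \text{ for } \mu_\gamma\text{-a.e. } z^0. \tag{5.12}$$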
5.3. Existence for Dirac measure initial data. In this paragraph we will assume that good measurements of the turbulent process can be made. In that case the evolution of the system should be described by a generalized solution associated with an initial Dirac measure. Let us assume that the initial distribution (5.1) is the Dirac measure concentrated at a given point $u^0 \in \mathrm{supp}(\mu_\gamma)$. In the sequel we prove the following theorem.

Theorem 5.2. For $\mu_\gamma$-a.e. $u^0$ there exists a generalized backward flow $\nu_t = \nu_t(u)$, $u \in H^\beta$ for $\beta < -1$, of (4.1), satisfying equality (5.2) with the initial distribution (5.1), such that $\nu_0 = \delta_{u^0}$.

Proof. The proof will be done in three steps. In the first step we construct approximated solutions of our problem, satisfying the Liouville-type equations (5.2). The second step will be devoted to showing that this set of approximated solutions is relatively compact in a corresponding space of measures. In the last step we pass to the limit in the integral equations (5.2).
1st step. Using the reasoning of Theorem 5.1, we have that the function
$$v^{(n,\varepsilon)}(t, u) = \delta^{\varepsilon}_{u^0}(U^n(-t, u)), \tag{5.13}$$
where $U^n = U^n(t, u)$ is the solution of problem (4.6), fulfills the identity
$$\varphi(0) \cdot \int f(u)\, \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u) + \int_0^T \varphi'(t) \int f(u)\, v^{(n,\varepsilon)}(t, u)\, d\mu_\gamma(u)\, dt = \int_0^T \varphi(t) \int (B^n(u) \cdot \nabla f(u))\, v^{(n,\varepsilon)}(t, u)\, d\mu_\gamma(u)\, dt \tag{5.14}$$
for any fixed cylindrical function $f = f(u)$ and any $\varphi(t) \in C^1([0, T])$ such that $\varphi(T) = 0$.

2nd step. We show that the set of measures
$$d\nu^{(n,\varepsilon)}_t(u) := v^{(n,\varepsilon)}(t, u)\, d\mu_\gamma(u) \tag{5.15}$$
is relatively compact for the weak topology on the space of measures over $H^\beta$. Let $C([0, T], H^\beta)$ be the space of continuous functions on $[0, T]$ with values in $H^\beta$. We introduce the measure
$$\nu^{(n,\varepsilon)}(\Gamma) := \int_{S} \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u), \tag{5.16}$$
where $\Gamma \subset C([0, T], H^\beta)$ and $S := \{u \in H^\beta : U^n(\cdot, u) \in \Gamma\}$. For $t \in [0, T]$ and a given cylindrical function $f$, we have
$$\int f(y(t))\, d\nu^{(n,\varepsilon)}(y) = \int f(U^n(t, u))\, \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u) = \int f(u)\, d\nu^{(n,\varepsilon)}_t(u). \tag{5.17}$$
As we see, the relative compactness of the set of measures $\{d\nu^{(n,\varepsilon)}(y)\}$ with respect to the weak topology of measures over $C([0, T], H^\beta)$ implies the relative compactness of the measures $\{d\nu^{(n,\varepsilon)}_t(u)\}$ over $H^\beta$ for a.e. $t \in [0, T]$. Then, in the sequel, the objective is to show that the set of measures $\{d\nu^{(n,\varepsilon)}(y)\}$ on $C([0, T], H^\beta)$ is relatively compact. Let us denote by $C_c(H^\beta)$ the space of continuous functions on $H^\beta$ with a compact support (in the weak topology) on $H^\beta$. We introduce the operator
$$V^{(n,\varepsilon)}(\psi, \Gamma) := \int \psi(u^0) \int_{S} \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u)\, d\mu_\gamma(u^0), \tag{5.18}$$
defined for any function $\psi \in C_c(H^\beta)$ and any set $\Gamma \subset C([0, T], H^\beta)$. We fix an arbitrary function $\psi \in C_c(H^\beta)$, denote $A := \|\psi\|_{C_c(H^\beta)}$ and choose a sufficiently large real $R > 0$ such that $\mathrm{supp}(\psi) \subset B_R = \{u^0 \in H^\beta : \|u^0\|_{H^\beta} < R\}$. Let us verify that the set of measures
$$V^n(\psi, \cdot) := V^{(n,\varepsilon_n)}(\psi, \cdot), \quad n = 1, 2, 3, \ldots, \tag{5.19}$$
where the exact value of $\varepsilon_n$ is defined below in (5.22), satisfies the following conditions:
a) $\lim_{\rho \to +\infty} \sup_n V^n(\psi, \|y(0)\|_{H^\beta} > \rho) = 0$;
b) $\forall \rho > 0$, we have $\lim_{\delta \to 0} \sup_n V^n\big(\psi, \sup_{0 \le t < t_1 \le T,\ t_1 - t \le \delta} \|y(t) - y(t_1)\|_{H^\beta} \ge \rho\big) = 0$.
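By Prokhorov’s criteria ([Mal], Theorem 2.6) these two conditions guarantee that the set of measures $\{V^n(\psi, \cdot)\}$ is relatively compact (see, for instance, [Mal], Theorems 4.2 and 4.3). Let us start from the proof of condition a. We have
$$V^{(n,\varepsilon)}(\psi, \|y(0)\|_{H^\beta} > \rho) \le \frac{A}{\rho} \int_{B_R} \int \|u\|_{H^\beta} \cdot \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u)\, d\mu_\gamma(u^0).$$
For $u \in H^\beta$, $u^0 \in B_R$, satisfying $\|u - u^0\|_{H^\beta} < \varepsilon \cdot D$, we obtain $\|u\|_{H^\beta} < C$, hence
$$V^{(n,\varepsilon)}(\psi, \|y(0)\|_{H^\beta} > \rho) \le \frac{A \cdot C}{\rho},$$
which implies condition a. Now we show that the set of measures $\{V^n(\psi, \cdot)\}$ satisfies condition b. We have
$$V^{(n,\varepsilon)}\Big(\psi, \sup_{0 \le t < t_1 \le T,\ t_1 - t \le \delta} \|y(t) - y(t_1)\|_{H^\beta} \ge \rho\Big) \le \frac{A}{\rho} \int_t^{t_1} \int_{B_R} \int \|B^n(U^n(s, u))\|_{H^\beta}\, \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u)\, d\mu_\gamma(u^0)\, ds$$
$$\le \frac{A}{\rho} \int_t^{t_1} \Big[ \int_{B_R} \|B^n(U^n(s, u^0))\|_{H^\beta}\, d\mu_\gamma(u^0) + \int_{B_R} \int \|B^n(U^n(s, u)) - B^n(U^n(s, u^0))\|_{H^\beta}\, \delta^{\varepsilon}_{u^0}(u)\, d\mu_\gamma(u)\, d\mu_\gamma(u^0) \Big]\, ds. \tag{5.20}$$
From (4.7), in the domain $\{(t, u^0, u) : t \in [0, T],\ u^0 \in B_R \text{ and } \|u - u^0\|_{H^\beta} \le \varepsilon \cdot D\}$ the functions $U^n(t, u)$, $U^n(t, u^0)$ are bounded. By (3.1), there exists a constant $C_n$, depending only on the parameter $n$ and satisfying the inequality
$$\|B^n(U^n(s, u)) - B^n(U^n(s, u^0))\|_{H^\beta} \le C_n \|U^n(s, u) - U^n(s, u^0)\|_{H^\beta}.$$
Let us note that $C_n \to \infty$ when $n \to \infty$. By (4.6), the function $z(t) := \|U^n(t, u) - U^n(t, u^0)\|_{H^\beta}$ satisfies a Gronwall type inequality, which implies
$$z(t) \le z(0) \cdot \exp(C_n T) \quad \text{for any } t \in [0, T]. \tag{5.21}$$
In the following considerations, we assume that the parameter $\varepsilon$ is equal to
$$\varepsilon_n := C_n^{-1} \cdot \exp(-C_n T). \tag{5.22}$$
Considering estimates (5.20)-(5.22) and Lemma 3.1, we deduce the inequality
$$V^n\Big(\psi, \sup_{0 \le t < t_1 \le T,\ t_1 - t \le \delta} \|y(t) - y(t_1)\|_{H^\beta} \ge \rho\Big) \le \frac{\delta \cdot A}{\rho} \cdot C,$$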
with the constant $C$, which is independent of $n$. Taking $\delta \to 0$, we conclude that the measures $V^n(\psi, \cdot)$ fulfill condition b. Therefore, for any fixed function $\psi \in C_c(H^\beta)$, there exist a measure $V(\psi, \cdot)$ on $C([0, T], H^\beta)$ and a subsequence $n = n(\psi) \to \infty$ such that, for all $\Gamma \subset C([0, T], H^\beta)$,
$$V^{n(\psi)}(\psi, \Gamma) \xrightarrow[n(\psi)\to\infty]{} V(\psi, \Gamma).$$
Let us show that such a subsequence can be chosen independently of $\psi \in C_c(H^\beta)$. For a countable dense subset $P$ of $C_c(H^\beta)$, we can select a subsequence of $V^n$, that we still denote by $V^n$, such that for any $\psi \in P$ we have
$$V^n(\psi, \Gamma) \xrightarrow[n\to\infty]{} V(\psi, \Gamma) \quad \text{for all } \Gamma \subset C([0, T], H^\beta). \tag{5.23}$$
Let us now fix an arbitrary $\psi \in C_c(H^\beta)$. Using that $P$ is dense in $C_c(H^\beta)$, by the representation (5.18) and the convergence (5.23) it is easy to show that $\{V^n(\psi, \cdot)\}_{n=1}^{\infty}$ is a Cauchy sequence. Hence we have the convergence (5.23) for any $\psi \in C_c(H^\beta)$. Let us note that for any fixed $\Gamma \subset C([0, T], H^\beta)$, the operator $V^n(\psi, \Gamma)$ as a function of the parameter $\psi \in C_c(H^\beta)$ is linear, hence the operator $V(\psi, \Gamma)$ is also linear in $\psi$. By the Kakutani-Riesz representation theorem there exists a unique positive measure $\mu_\Gamma$ which represents the operator $V$ as
$$V(\psi, \Gamma) = \int \psi(u^0)\, d\mu_\Gamma(u^0), \quad \psi \in C_c(H^\beta) \text{ and fixed } \Gamma \subset C([0, T], H^\beta).$$
Let us consider the constructed measure $\mu_\Gamma$ and the measure $\mu_\gamma$, which are two measures on the space $(H^\beta, \mathcal{B})$, where $\mathcal{B}$ is the $\sigma$-algebra defined by the Borel sets of $H^\beta$. By the Radon-Nikodym theorem (see, for instance, [U-Z]) there exists an integrable positive real function $h_\Gamma(u^0)$ on the space $(H^\beta, \mathcal{B})$ that satisfies $\mu_\Gamma(A) = \int_A h_\Gamma(u^0)\, d\mu_\gamma(u^0)$ and
$$V(\psi, \Gamma) = \int \psi(u^0)\, h_\Gamma(u^0)\, d\mu_\gamma(u^0), \tag{5.24}$$
for any function $\psi \in C_c(H^\beta)$ and any set $\Gamma \subset C([0, T], H^\beta)$. Considering (5.18), (5.23) and (5.24), we obtain for $\mu_\gamma$-a.e. $u^0$,
$$\nu^{(n,\varepsilon_n)}(\Gamma) \xrightarrow[n\to\infty]{} \nu^{u^0}(\Gamma) := h_\Gamma(u^0), \quad \forall \Gamma \subset C([0, T], H^\beta).$$
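Taking into account (5.17), for $\mu_\gamma$-a.e. $u^0$ we deduce the existence of a probability measure $\nu^{u^0}_t$ on $H^\beta$, such that
$$\nu^{(n,\varepsilon_n)}_t(\Gamma) \xrightarrow[n\to\infty]{} \nu^{u^0}_t(\Gamma) \quad \forall \Gamma \subset C(H^\beta) \text{ and for a.e. } t \in [0, T]. \tag{5.25}$$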
3rd step. With the help of (3.2), (5.12) and (5.25), passing to the limit $n \to \infty$ in equality (5.14) written for $\varepsilon = \varepsilon_n$, we obtain that the measures $\{\nu^{u^0}_t,\ t \in [0, T]\}$ satisfy the equality
$$\varphi(0) \cdot f(u^0) + \int_0^T \varphi'(t) \int f(u)\, d\nu^{u^0}_t(u)\, dt = \int_0^T \varphi(t) \int (B(u) \cdot \nabla f(u))\, d\nu^{u^0}_t(u)\, dt.$$
Hence we have shown that for $\mu_\gamma$-a.e. $u^0$ the set of the probability measures $\{\nu^{u^0}_t,\ t \in [0, T]\}$ is the generalized backward flow of the Euler equation (4.1) with the initial distribution $\nu_0 = \delta_{u^0}$.
6. Conclusion

From (4.5) and if $U(0) = u \in H^\beta$, $\beta < 1$, we obtain
$$\int \|U(t, u)\|^2_{H^\beta}\, d\mu_\gamma < C, \quad t \in [0, T].$$
Hence for $\mu_\gamma$-a.e. fixed $u$, using (2.3), $U(t, u)$ is a function of $(t, x)$,
$$U(t, x) = \sum_{k \in \mathbb{Z}^2_+} U_k(t, u)\, e_k(x) \in L^\infty\big([0, T], H^\beta\big),$$
satisfying the periodic conditions (2.2) and the initial data (2.4): $U(0, x) = u(x)$. The function $U(t, x)$ may not satisfy the Euler equation (2.1), because the Fourier coefficients may not correspond to a solution of the infinite dimensional equation (2.6). One possible approach to this open problem is the method of re-normalized solutions [DiP-L].

As is well known, there are two different ways of expressing the behaviour of the fluid: the Lagrangian and the Eulerian point of view. Their difference lies in the choice of coordinates used to describe flow phenomena. In the Lagrangian description, the fluid is viewed as a collection of fluid particles (elements) that are freely translating, rotating, and deforming. To obtain a full description of the flow we need to identify the initial position of the elements. In our case, the relationship $U = U(t, u)$ is the statistical Lagrangian description of the fluid. In the Eulerian description, an observed point of the physical space remains unchanged in time $t$. Quantities (velocities, temperature, pressure, ...) are measured at different instants of $t$. Hence, our quantities $\mu_t = \mu_t(U)$, $V = V(t, U)$ are determined as functions of Euler parameters: the time $t$ and the observed point $U$.

Acknowledgements. The authors are grateful to Prof. A. B. Cruzeiro for her suggestions, very helpful discussions and corrections. N.V. Chemetov thanks the FCT, project POCTI / MAT / 45700 / 2002 for support. Financial support of FCT, project POCTI / MAT / 55977 / 2004 is gratefully acknowledged by F. Cipriano.
References

[A-H.K-M] Albeverio, S., Hoegh-Krohn, R., Merlini, D.: Some remarks on Euler flow, associated generalized random fields and Coulomb systems. In: Albeverio, S. (ed.) Infinite dimensional analysis and stochastic processes. London: Pitman, 1985, pp. 216–244
[A-H.K-R.F] Albeverio, S., Ribeiro de Faria, M., Hoegh-Krohn, R.: Stationary measures for the periodic Euler flow in two dimensions. J. Stat. Phys. 20, 585–595 (1979)
[A-C] Albeverio, S., Cruzeiro, A.B.: Global flows with invariant (Gibbs) measures for Euler and Navier-Stokes two dimensional fluids. Commun. Math. Phys. 129, 431–444 (1990)
[B-F] Boldrighini, C., Frigio, S.: Equilibrium states for the two-dimensional incompressible Euler fluid. Atti Sem. Mat. Fis. Univ. Modena XXVII, 106–125 (1978)
[Ci] Cipriano, F.: The two dimensional Euler equation: a statistical study. Commun. Math. Phys. 201, 139–154 (1999)
[C-S] Chemetov, N.V., Starovoitov, V.N.: On a motion of perfect fluid in a domain with sources and sinks. J. Math. Fluid Mech. 4, 128–144 (2002)
[Cr] Cruzeiro, A.B.: Équations différentielles sur l’espace de Wiener et formules de Cameron-Martin non linéaires. J. Funct. Anal. 54, 206–227 (1983)
[C-D.G] Caprino, S., De Gregorio, S.: On the statistical solutions of the two-dimensional periodic Euler equations. Math. Methods Appl. Sci. 7, 55–73 (1985)
[D] Delort, J.M.: Existence de nappes de tourbillon en dimension deux. J. Amer. Math. Soc. 4, 553–586 (1991)
[DiP-L] DiPerna, R.J., Lions, P.L.: Ordinary differential equations, transport theory and Sobolev spaces. Invent. Math. 98, 511–547 (1989)
[DiP-M] DiPerna, R.J., Majda, A.J.: Concentrations in regularizations for 2D incompressible flow. Commun. Pure Appl. Math. 40, 301–345 (1987)
[F-M-R-T] Foias, C., Manley, O., Rosa, R., Temam, R.: Navier-Stokes equations and turbulence. Cambridge: Cambridge University Press, 2001
[G-S] Gihman, I.I., Skorohod, A.V.: The theory of stochastic processes. Springer Monographs in Mathematics. New York: Springer Verlag, 1974
[Ka] Kato, T.: On classical solutions of the two-dimensional nonstationary Euler equations. Arch. Rat. Mech. Anal. 25 (3), 188–200 (1967)
[K] Krasny, R.: Computing vortex sheet motion. In: Proceedings of Int. Congress of Math. Kyoto 1990, New York: Springer-Verlag, 2, 1991, pp. 1573–1583
[Mal] Malliavin, P.: Introduction au cours d’Analyse. Paris: Masson, 1982
[P] Peters, G.: Anticipating flows on the Wiener space generated by vector fields of low regularity. J. Funct. Anal. 142, 129–192 (1996)
[U-Z] Ustunel, A.S., Zakai, M.: Transformation of measure on the Wiener space. Springer Monographs in Mathematics, Berlin: Springer Verlag, 2000
[Y] Yudovic, V.: Non-stationary flow of an incompressible liquid. Zh. Vychysl. Mat. Mat. Fiz. 3, 1032–1066 (1963)
Communicated by P. Constantin
Commun. Math. Phys. 267, 559–561 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0064-7
Communications in
Mathematical Physics
Erratum
Connecting Solutions of the Lorentz Force Equation do Exist
E. Minguzzi1,2, M. Sánchez3
1 Departamento de Matemáticas, Plaza de la Merced 1–4, 37008 Salamanca, Spain. E-mail: [email protected]
2 INFN, Piazza dei Caprettari 70, 00186 Roma, Italy
3 Departamento de Geometría y Topología, Facultad de Ciencias, Avda. Fuentenueva s/n, 18071 Granada, Spain. E-mail: [email protected]
Received: 5 May 2006 / Accepted: 5 May 2006 Published online: 5 August 2006 – © Springer-Verlag 2006
Commun. Math. Phys. 264, 349–370 (2006)
The online version of the original article can be found at http://dx.doi.org/10.1007/s00220-006-1547-2

Due to a processing error the last paragraph of Sect. 3.1 on p. 358 was printed with an error. In addition, in Sect. 5.2 on pp. 365–366 the presentation of the Example was processed incorrectly. The second paragraph of Sect. 5.2 (two lines) must be removed and the last sentence in the same section must be replaced with ‘Nevertheless, even though $\sigma_0$ maximizes in $\bar{C}_1$, it does not maximize in $\bar{C}_2$ (nor in the causal homotopy class $\mathcal{C}$), in agreement with our results.’ The corrected paragraphs read as follows.

p. 358: In particular, if $x_0 \ll x_1$ the two points can be connected by means of a timelike geodesic (in fact, by one for each timelike homotopy class in $\mathcal{C}_{x_0,x_1}$, as will be apparent below). If $x_1 \in E^+(x_0) = J^+(x_0) \setminus I^+(x_0)$ then $x_0$ and $x_1$ can still be joined by a lightlike geodesic, but this case does not make sense for the LFE. One can also wonder about the connectedness of $x_0$, $x_1$ by means of a geodesic even if they are not causally related, as in the variational frameworks described below. Although this question has a geometrical interest (see for instance the survey [37]), it does not have a direct physical interpretation, nor an equivalent for the LFE.

pp. 365–366: 5.2. A remarkable example. Lemma 5.1 does not forbid the existence of a lightlike geodesic $\sigma$ which maximizes the functional on the closure of a timelike class $\bar{C}_{x_0,x_1}$. However, in that case the maximizer on $\mathcal{C}_{x_0,x_1} \supset \bar{C}_{x_0,x_1}$ does not coincide with $\sigma$, as the following example shows.

Example. Let $\Sigma$ be a surface embedded in $\mathbb{R}^3$ obtained by gluing the spherical cap $x^2 + y^2 + z^2 = r^2$, $z > -\frac{\sqrt{3}}{2} r + \Delta z$, with a cylinder $x^2 + y^2 = r^2/4$, $z < -\frac{\sqrt{3}}{2} r - \Delta z$,
by making a smooth transition in the points with coordinate $z \in [-\frac{\sqrt{3}}{2} r - \Delta z, -\frac{\sqrt{3}}{2} r + \Delta z]$, for some positive $\Delta z < \frac{\sqrt{3}}{2} r$. Notice that this transition can be made smooth and depending only on the azimuthal angle $\theta$ in a small interval $(\frac{5}{6}\pi - \Delta\theta, \frac{5}{6}\pi + \Delta\theta)$, $\Delta\theta < \pi/6$. Only the details of this surface included in the spherical cap with $\theta \le \pi/2 + \epsilon$, for some small positive $\epsilon < \pi/6$, will be relevant. Let $dl^2$ be the induced Riemannian metric on $\Sigma$, and fix $q = (r, 0, 0) \in \Sigma$. Consider the natural product (globally hyperbolic) spacetime $M = \mathbb{R} \times \Sigma$, $g = dt^2 - dl^2$, with natural projection $\pi : M \to \Sigma$, and the fixed events $x_0 = (0, q)$, $x_1 = (2\pi r, q)$. (1) The timelike curve $\lambda \mapsto (2\pi r \lambda, q)$ fixes a timelike homotopy class $C_1$ $(:= C_{x_0,x_1})$. The connecting lightlike geodesic
$$\sigma_0(\lambda) = (2\pi r \lambda, c_0(\lambda)), \quad c_0(\lambda) = (r \cos 2\pi\lambda, r \sin 2\pi\lambda, 0), \quad \lambda \in [0, 1],$$
lies in the boundary $\dot{C}_1$. In fact, $\sigma_0$ can be reached by approximating the part $c_0$ with a constant-speed parametrization $c_\alpha$ of $\Sigma \cap \Pi_\alpha$, where $\Pi_\alpha \subset \mathbb{R}^3$ is the plane through $q$, orthogonal to the plane $y = 0$, which makes an oriented positive angle $\alpha < \pi/2$ with the plane $z = 0$ ($c_\alpha$ is contained in the region $z > 0$ except in the tangent point $q$). However, by letting $\alpha < 0$ we can find a second timelike homotopy class $C_2$ such that $\sigma_0 \in \dot{C}_2$; of course, $C_1$ and $C_2$ are contained in the same causal homotopy class $\mathcal{C}$. Notice that $c_0$ passes through the antipodal point $-q = (-r, 0, 0)$, which is also a conjugate point of $q$; thus, $\sigma_0$ also contains a conjugate point.

Fix $q/m > 0$ (resp. $q/m < 0$), and let $F = B(\cdot)\, \pi^*\Omega = d\omega$ be on $M$, where $\Omega$ is the volume 2-form of $\Sigma$ (with the orientation induced by the outer normal in the spherical cap), and where $B(\cdot) : \Sigma \to \mathbb{R}$ is a non-negative (resp. non-positive) function, with $B(\cdot) \equiv B > 0$ (resp. $< 0$) constant for $\theta \le \pi/2$, and monotonically decreasing (resp. increasing) to 0 for $\theta \in (\pi/2, \pi/2 + \epsilon]$. The charged-particle action $I_{x_0,x_1}$ is given by two contributions. The electromagnetic term reads
$$\frac{q}{m} \int_\sigma \omega = \frac{q}{m} \int_R B(\cdot)\, \Omega, \tag{17}$$
where, without loss of generality, $\sigma(\lambda) = (2\pi r \lambda, c(\lambda))$ and $\partial R = c$. For a given length $L \le 2\pi r$ of $c$ this integral is maximized in $C_1$ by the circle $c_\alpha$ with length $L$, namely $c_L$. Indeed, the maximizer must be a circle in order to maximize the area, and it is tangent to $c_0$ since, otherwise, its enclosed surface $R$ would include regions where $B(\cdot) < B$ (resp. $B(\cdot) > B$). Thus
$$\frac{q}{m} \int_\sigma \omega \le \frac{q}{m} B\, A[c_L], \tag{18}$$
where $A[c_L]$ is the area contained in $c_L$. And the equality holds iff $c = c_L$ (up to a reparametrization with the same winding number). The contribution of the length of $\sigma$ in $I_{x_0,x_1}$ is:
$$\int_\sigma ds = \int_0^{2\pi r} \sqrt{1 - \Big(\frac{dl}{dt}\Big)^2}\, dt \le 2\pi r \sqrt{1 - \Big(\frac{l[c]}{2\pi r}\Big)^2}, \tag{19}$$
where $l[c]$ is the length of $c = \pi \circ \sigma$, and the equality holds when the speed of $c$ is constant. We have then
$$I_{x_0,x_1}[\sigma] \le 2\pi r \sqrt{1 - \Big(\frac{l[c]}{2\pi r}\Big)^2} + \frac{q}{m} B\, A[c_{l[c]}], \tag{20}$$
where the equality holds iff $\pi \circ \sigma = c_{l[c]}$. But in terms of the angle $0 \le \alpha \le \pi/2$ with $c_\alpha = c_l$, we have $l = 2\pi r \cos\alpha$ and $A[c_l] = 2\pi r^2 (1 - \sin\alpha)$. Hence if $\frac{q}{m} B r > 1$,
$$I_{x_0,x_1}[\sigma] \le 2\pi r^2 \frac{q}{m} B + 2\pi r \Big(1 - \frac{q}{m} B r\Big) \sin\alpha \le 2\pi r^2 \frac{q}{m} B = I_{x_0,x_1}[\sigma_0], \tag{21}$$
and the equality holds iff $\alpha = 0$ and the projection of $\sigma$ is $c_0$ $(= c_{2\pi r})$, i.e. iff $\sigma = \sigma_0$. Nevertheless, even though $\sigma_0$ maximizes in $\bar{C}_1$, it does not maximize in $\bar{C}_2$ (nor in the causal homotopy class $\mathcal{C}$), in agreement with our results.
Commun. Math. Phys. 267, 563–586 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0066-5
Communications in
Mathematical Physics
A Domain of Spacetime Intervals in General Relativity
Keye Martin1, Prakash Panangaden2
1 Naval Research Laboratory, Center for High Assurance Computer Systems, Washington, DC 20375, USA. E-mail: [email protected]
2 School of Computer Science, McGill University, Montreal, Quebec H3A 2A7, Canada. E-mail: [email protected]
Received: 24 February 2005 / Accepted: 23 January 2006 Published online: 15 August 2006 – © Springer-Verlag 2006
Abstract: We prove that a globally hyperbolic spacetime with its causality relation is a bicontinuous poset whose interval topology is the manifold topology. From this one can show that from only a countable dense set of events and the causality relation, it is possible to reconstruct a globally hyperbolic spacetime in a purely order theoretic manner. The ultimate reason for this is that globally hyperbolic spacetimes belong to a category that is equivalent to a special category of domains called interval domains. We obtain a mathematical setting in which one can study causality independently of geometry and differentiable structure, and which also suggests that spacetime emerges from something discrete.

1. Introduction

Since the first singularity theorems [Pen65, HE73] causality has played a key role in understanding spacetime structure. The analysis of causal structure relies heavily on techniques of differential topology [Pen72]. For the past decade Sorkin and others [Sor91] have pursued a program for quantization of gravity based on causal structure. In this approach the causal relation is regarded as the fundamental ingredient and the topology and geometry are secondary. In this paper, we prove that the causality relation is much more than a relation – it turns a globally hyperbolic spacetime into what is known as a bicontinuous poset. The order on a bicontinuous poset allows one to define an intrinsic topology called the interval topology¹. On a globally hyperbolic spacetime, the interval topology is the manifold topology. Theorems that reconstruct the spacetime topology have been known [Pen72] and Malament [Mal77] has shown that the class of timelike curves determines the causal structure. We establish these results as well though in a purely order theoretic fashion: there is no need to know what “smooth curve” means.

¹ Other people use this term for a different topology: what we call the interval topology has been called the biScott topology.
Our more abstract stance also teaches us something new: the fact that a globally hyperbolic spacetime is bicontinuous implies that it can be reconstructed in a purely order theoretic manner, beginning from only a countable dense set of events and the causality relation. The ultimate reason for this is that the category of globally hyperbolic posets, which contains the globally hyperbolic spacetimes, is equivalent to a very special category of posets called interval domains. Domains [AJ94, GKK+ 03] are special types of posets that have played an important role in theoretical computer science since the late 1960s when they were discovered by Dana Scott [Sco70] for the purpose of providing a semantics for the lambda calculus. They are partially ordered sets that carry intrinsic (order theoretic) notions of completeness and approximation. From a certain viewpoint, then, the fact that the category of globally hyperbolic posets is equivalent to the category of interval domains is surprising, since globally hyperbolic spacetimes are usually not order theoretically complete. This equivalence also explains why spacetime can be reconstructed order theoretically from a countable dense set: each ω-continuous domain is the ideal completion of a countable abstract basis, i.e., the interval domains associated to globally hyperbolic spacetimes are the systematic ‘limits’ of discrete sets. This may be relevant to the development of a foundation for quantum gravity, an idea we discuss at the end. But, with all speculation aside, the importance of these results and ideas is that they suggest an abstract formulation of causality – a setting where one can study causality independently of geometry and differentiable structure.
2. Domains, Continuous Posets and Topology

In this section we quickly review the basic notions of domain theory. These notions arose in the study of the mathematical theory of programming languages, but it is not necessary to know any of the computer science motivation for the mathematics that follows. A poset is a partially ordered set, i.e., a set together with a reflexive, antisymmetric and transitive relation.

Definition 2.1. Let (P, ⊑) be a partially ordered set. A nonempty subset S ⊆ P is directed if (∀x, y ∈ S)(∃z ∈ S) x, y ⊑ z. The supremum of S ⊆ P is the least of all its upper bounds provided it exists. This is written ⨆S.

One can think of countable directed sets as generalizations of increasing sequences; one will not go too far wrong picturing them as sequences. These ideas have duals that will be important to us: a nonempty S ⊆ P is filtered if (∀x, y ∈ S)(∃z ∈ S) z ⊑ x, y. The infimum ⋀S of S ⊆ P is the greatest of all its lower bounds provided it exists.

Definition 2.2. For a subset X of a poset P, set ↑X := {y ∈ P : (∃x ∈ X) x ⊑ y} & ↓X := {y ∈ P : (∃x ∈ X) y ⊑ x}. We write ↑x = ↑{x} and ↓x = ↓{x} for elements x ∈ X.

A partial order allows for the derivation of several intrinsically defined topologies. Here is our first example.
Definition 2.3. A subset U of a poset P is Scott open if
(i) U is an upper set: x ∈ U & x ⊑ y ⇒ y ∈ U, and
(ii) U is inaccessible by directed suprema: For every directed S ⊆ P with a supremum, ⨆S ∈ U ⇒ S ∩ U ≠ ∅.
The collection of all Scott open sets on P is called the Scott topology.

Posets can have a variety of completeness properties. If every subset has a supremum and an infimum the poset is called a complete lattice. This is a rather strong condition though still a very useful concept. The following completeness condition has turned out to be particularly useful in applications.

Definition 2.4. A dcpo is a poset in which every directed subset has a supremum. The least element in a poset, when it exists, is the unique element ⊥ with ⊥ ⊑ x for all x. The set of maximal elements in a dcpo D is max(D) := {x ∈ D : ↑x = {x}}. Each element in a dcpo has a maximal element above it.

Definition 2.5. For elements x, y of a poset, write x ≪ y iff for all directed sets S with a supremum, y ⊑ ⨆S ⇒ (∃s ∈ S) x ⊑ s. We set ↡x = {a ∈ D : a ≪ x} and ↟x = {a ∈ D : x ≪ a}. For the symbol “≪,” read “approximates.”

Definition 2.6. A basis for a poset D is a subset B such that B ∩ ↡x contains a directed set with supremum x for all x ∈ D. A poset is continuous if it has a basis. A poset is ω-continuous if it has a countable basis.

Continuous posets have an important property: they are interpolative.

Proposition 2.7. If x ≪ y in a continuous poset P, then there is z ∈ P with x ≪ z ≪ y.

This enables a clear description of the Scott topology,

Theorem 2.8. The collection {↟x : x ∈ D} is a basis for the Scott topology on a continuous poset.

And also helps us give a clear definition of the Lawson topology.

Definition 2.9. The Lawson topology on a continuous poset P has as a basis all sets of the form ↟x \ ↑F, for F ⊆ P finite.
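The definitions above can be tried out by brute force on a small finite poset. The following sketch is an illustration under stated assumptions: the poset (divisors of 12 ordered by divisibility) and all function names are choices made here, not taken from the paper. On a finite poset every directed set contains its supremum, so the approximation relation coincides with the order and the Scott open sets are exactly the upper sets; the code checks this directly from the definitions.

```python
from itertools import combinations

# Hypothetical example poset: divisors of 12 ordered by divisibility.
P = [1, 2, 3, 4, 6, 12]

def leq(x, y):
    """x is below y iff x divides y."""
    return y % x == 0

def up(X):
    """Upper set generated by X (Definition 2.2)."""
    return {y for y in P if any(leq(x, y) for x in X)}

def down(X):
    """Lower set generated by X (Definition 2.2)."""
    return {y for y in P if any(leq(y, x) for x in X)}

def is_directed(S):
    """Nonempty, and any two elements have an upper bound inside S (Definition 2.1)."""
    S = set(S)
    return bool(S) and all(any(leq(x, z) and leq(y, z) for z in S) for x in S for y in S)

def supremum(S):
    """Least upper bound of S in P, or None if it does not exist."""
    uppers = [u for u in P if all(leq(s, u) for s in S)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None

def directed_subsets():
    """Enumerate all directed subsets of P (feasible only for tiny posets)."""
    for n in range(1, len(P) + 1):
        for S in combinations(P, n):
            if is_directed(S):
                yield set(S)

def way_below(x, y):
    """Definition 2.5: every directed S with y below its supremum has some s above x."""
    for S in directed_subsets():
        sup = supremum(S)
        if sup is not None and leq(y, sup) and not any(leq(x, s) for s in S):
            return False
    return True

def is_scott_open(U):
    """Definition 2.3: upper set that is inaccessible by directed suprema."""
    U = set(U)
    if up(U) != U:
        return False
    return all(S & U for S in directed_subsets() if supremum(S) in U)
```

For instance, `is_directed({2, 3})` fails because neither element lies above the other, while `{2, 3, 6}` is directed with supremum 6.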
These relations and topologies are understood as expressing qualitative aspects of information. These aspects are not directly applicable to the present context, but they are, nevertheless, suggestive. The partial order describes relative information content. Thus x ⊑ y implies that x has less information than y. The way-below relation captures the idea of a “finite piece of information.” If one considers the subsets of any infinite set, say the natural numbers, ordered by inclusion, then x ≪ y simply means that x is a finite subset of y. The idea of a continuous poset is that every element can be reconstructed from its finite approximants.

Scott open sets can be thought of as observable properties. Consider a process which successively produces the digits of a real number r between 0 and 1. After n digits are produced, we have an approximation rn of r, and since each new digit provides additional information, we have rn ⊑ rn+1. A Scott open set U is now an observable property of r, i.e., r has property U (r ∈ U) iff this can be finitely observed: (i) If at just one stage of computation we find rn ∈ U, then r must have property U, since U is an upper set; (ii) If r has property U, then this fact is finitely observable, because U is inaccessible by directed suprema and r = ⨆rn. Thus, one can deduce properties of ideal elements assuming only the ability to work with their finite approximations. The information order and the Scott topology σD on a domain D are then related by x ⊑ y ≡ (∀U ∈ σD) x ∈ U ⇒ y ∈ U, i.e., y is more informative than x iff it has every observable property that x does. In general, the Scott topology is T0 but not T1 while the Lawson topology on an ω-continuous domain is metrizable. The next idea is fundamental to the present work:

Definition 2.10. A continuous poset P is bicontinuous if
• For all x, y ∈ P, x ≪ y iff for all filtered S ⊆ P with an infimum, ⋀S ⊑ x ⇒ (∃s ∈ S) s ⊑ y, and
• For each x ∈ P, the set ↟x is filtered with infimum x.

Example 2.11. R, Q are bicontinuous.

Definition 2.12. On a bicontinuous poset P, sets of the form (a, b) := {x ∈ P : a ≪ x ≪ b} form a basis for a topology called the interval topology.

The proof uses interpolation and bicontinuity. A bicontinuous poset P has ↟x ≠ ∅ for each x, so it is rarely a dcpo. Later we will see that on a bicontinuous poset, the Lawson topology is contained in the interval topology (causal simplicity), the interval topology is Hausdorff (strong causality), and ≤ is a closed subset of P².

Definition 2.13. A continuous dcpo is a continuous poset which is also a dcpo. A domain is a continuous dcpo.
Example 2.14. Let X be a locally compact Hausdorff space. Its upper space UX = {∅ ≠ K ⊆ X : K is compact} ordered under reverse inclusion A ⊑ B ⇔ B ⊆ A is a continuous dcpo:
• For directed S ⊆ UX, ⨆S = ⋂S.
• For all K, L ∈ UX, K ≪ L ⇔ L ⊆ int(K).
• UX is ω-continuous iff X has a countable basis.
It is interesting here that the space X can be recovered from UX in a purely order theoretic manner: X ≃ max(UX) = {{x} : x ∈ X}, where max(UX) carries the relative Scott topology it inherits as a subset of UX. Several constructions of this type are known.

The next example is due to Scott [Sco70]; it will be good to keep in mind when studying the analogous construction for globally hyperbolic spacetimes.

Example 2.15. The collection of compact intervals of the real line IR = {[a, b] : a, b ∈ R & a ≤ b} ordered under reverse inclusion [a, b] ⊑ [c, d] ⇔ [c, d] ⊆ [a, b] is an ω-continuous dcpo:
• For directed S ⊆ IR, ⨆S = ⋂S,
• I ≪ J ⇔ J ⊆ int(I), and
• {[p, q] : p, q ∈ Q & p ≤ q} is a countable basis for IR.
The domain IR is called the interval domain. We also have max(IR) ≃ R in the Scott topology. Approximation can help explain why:

Example 2.16. A basic Scott open set in IR is ↟[a, b] = {x ∈ IR : x ⊆ (a, b)}.
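A small sketch can make the interval domain concrete. It is an illustration only, with hypothetical function names and intervals restricted to rational endpoints (the countable basis of Example 2.15) so that all comparisons are exact. It encodes reverse inclusion, the approximation relation J ⊆ int(I), the supremum of a finite chain as an intersection, and the chain of decimal truncations of a real number described in the discussion of observable properties.

```python
from fractions import Fraction

def below(I, J):
    """I is below J in the interval domain: reverse inclusion, i.e. J is contained in I."""
    (a, b), (c, d) = I, J
    return a <= c and d <= b

def way_below(I, J):
    """I approximates J iff J is contained in the interior of I."""
    (a, b), (c, d) = I, J
    return a < c and d < b

def sup_of_chain(chain):
    """Supremum of a finite chain of intervals: their intersection."""
    lo = max(a for a, _ in chain)
    hi = min(b for _, b in chain)
    return (lo, hi)

def truncations(digits):
    """Intervals r_n = [0.d1...dn, 0.d1...dn + 10^-n] known after n decimal digits."""
    out, lo = [], Fraction(0)
    for n, d in enumerate(digits, start=1):
        lo += Fraction(d, 10**n)
        out.append((lo, lo + Fraction(1, 10**n)))
    return out

# First four decimal digits of 1/3 (an arbitrary example).
chain = truncations([3, 3, 3, 3])

# The basic Scott open set generated by [3/10, 4/10] consists of intervals inside (0.3, 0.4).
U = (Fraction(3, 10), Fraction(4, 10))
observed_at = [n for n, r_n in enumerate(chain, start=1) if way_below(U, r_n)]
print(observed_at)
```

Each new digit shrinks the interval, so the chain increases in the information order, and membership of the limit in the open set is witnessed at a finite stage.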
3. The Causal Structure of Spacetime

A manifold M is a locally Euclidean Hausdorff space that is connected and has a countable basis. A connected Hausdorff manifold is paracompact iff it has a countable basis. A Lorentz metric on a manifold is a symmetric, nondegenerate tensor field of type (0, 2) whose signature is (− + ++).

Definition 3.1. A spacetime is a real four-dimensional² smooth manifold M with a Lorentz metric gab.

Let (M, gab) be a time-orientable spacetime. Let $\Pi^+_{\le}$ denote the future directed causal curves, and $\Pi^+_{<}$ denote the future directed time-like curves.

Definition 3.2. For p ∈ M, I⁺(p) := {q ∈ M : (∃π ∈ $\Pi^+_{<}$) π(0) = p, π(1) = q} and J⁺(p) := {q ∈ M : (∃π ∈ $\Pi^+_{\le}$) π(0) = p, π(1) = q}. Similarly, we define I⁻(p) and J⁻(p).

We write the relation J⁺ as p ⊑ q ≡ q ∈ J⁺(p). The following properties from [HE73] are very useful:

Proposition 3.3. Let p, q, r ∈ M. Then
(i) The sets I⁺(p) and I⁻(p) are open.
(ii) p ⊑ q and r ∈ I⁺(q) ⇒ r ∈ I⁺(p).
(iii) q ∈ I⁺(p) and q ⊑ r ⇒ r ∈ I⁺(p).
(iv) Cl(I⁺(p)) = Cl(J⁺(p)) and Cl(I⁻(p)) = Cl(J⁻(p)).

We always assume the chronology conditions that ensure (M, ⊑) is a partially ordered set. We also assume strong causality which can be characterized as follows [Pen72]:

Theorem 3.4. A spacetime M is strongly causal iff its Alexandroff topology is Hausdorff iff its Alexandroff topology is the manifold topology.

The Alexandroff topology on a spacetime has {I⁺(p) ∩ I⁻(q) : p, q ∈ M} as a basis [Pen72]³.

² The results in the present paper work for any dimension n ≥ 2 [J93].
³ This terminology is common among relativists but order theorists use the phrase “Alexandrov topology” to mean something else: the topology generated by the upper sets.
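Minkowski space gives a concrete instance of these relations. The sketch below is an illustration with hypothetical function names, assuming flat four-dimensional Minkowski space with points written as (t, x, y, z) and integer coordinates so that comparisons are exact: J⁺(p) is the closed future light cone of p and I⁺(p) is the open one. It checks properties (ii) and (iii) of Proposition 3.3 on a small grid of sample points.

```python
from itertools import product

def causal(p, q):
    """q lies in J+(p): the closed future light cone of p in Minkowski space."""
    dt = q[0] - p[0]
    dist2 = sum((b - a) ** 2 for a, b in zip(p[1:], q[1:]))
    return dt >= 0 and dt * dt >= dist2

def chrono(p, q):
    """q lies in I+(p): the open future light cone of p."""
    dt = q[0] - p[0]
    dist2 = sum((b - a) ** 2 for a, b in zip(p[1:], q[1:]))
    return dt > 0 and dt * dt > dist2

def interval(a, b, pts):
    """Sample points lying in the causal interval between a and b."""
    return [x for x in pts if causal(a, x) and causal(x, b)]

# A small integer grid of sample events (t, x, y, z); an arbitrary choice.
points = list(product(range(-2, 3), range(-1, 2), range(-1, 2), (0,)))

def check_prop_3_3(pts):
    """Check items (ii) and (iii) of Proposition 3.3 on the sample points."""
    for p, q, r in product(pts, repeat=3):
        if causal(p, q) and chrono(q, r) and not chrono(p, r):
            return False   # would violate (ii)
        if chrono(p, q) and causal(q, r) and not chrono(p, r):
            return False   # would violate (iii)
    return True

print(check_prop_3_3(points))
```

A point on the light cone, such as (1, 1, 0, 0) relative to the origin, is causally but not chronologically related to it, which is the distinction between the order and the approximation relation exploited in the next section.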
4. Global Hyperbolicity

Penrose has called globally hyperbolic spacetimes “the physically reasonable spacetimes [Wal84].” In this section, M denotes a globally hyperbolic spacetime, and we prove that (M, ⊑) is a bicontinuous poset.

Definition 4.1. A spacetime M is globally hyperbolic if it is strongly causal and if ↑a ∩ ↓b is compact in the manifold topology, for all a, b ∈ M.

Lemma 4.2. If (xₙ) is a sequence in M with xₙ ⊑ x for all n, then

lim_{n→∞} xₙ = x ⇒ ⨆_{n≥1} xₙ = x.

Proof. Let xₙ ⊑ y. By global hyperbolicity, M is causally simple, so the set J⁻(y) is closed. Since xₙ ∈ J⁻(y), x = lim xₙ ∈ J⁻(y), and thus x ⊑ y. This proves x = ⨆xₙ.

Lemma 4.3. For any x ∈ M, I⁻(x) contains an increasing sequence with supremum x.

Proof. Because x ∈ Cl(I⁻(x)) = J⁻(x) but x ∉ I⁻(x), x is an accumulation point of I⁻(x), so for every open set V with x ∈ V, V ∩ I⁻(x) ≠ ∅. Let (Uₙ) be a countable basis for x, which exists because M is locally Euclidean. Define a sequence (xₙ) by first choosing

x₁ ∈ U₁ ∩ I⁻(x) ≠ ∅

and then whenever xₙ ∈ Uₙ ∩ I⁻(x) we choose

xₙ₊₁ ∈ (Uₙ ∩ I⁺(xₙ)) ∩ I⁻(x) ≠ ∅.

By definition, (xₙ) is increasing, and since (Uₙ) is a basis for x, lim xₙ = x. By Lemma 4.2, ⨆xₙ = x.

Proposition 4.4. Let M be a globally hyperbolic spacetime. Then

x ≪ y ⇔ y ∈ I⁺(x)

for all x, y ∈ M.

Proof. Let y ∈ I⁺(x). Let y ⊑ ⨆S with S directed. By Prop. 3.3(iii),

y ∈ I⁺(x) & y ⊑ ⨆S ⇒ ⨆S ∈ I⁺(x).

Since I⁺(x) is manifold open and M is locally compact, there is an open set V ⊆ M whose closure Cl(V) is compact with ⨆S ∈ V ⊆ Cl(V) ⊆ I⁺(x). Then, using approximation on the upper space of M,

Cl(V) ≪ {⨆S} = ⋂_{s∈S} [s, ⨆S],
where the intersection on the right is a filtered collection of nonempty compact sets by directedness of S and global hyperbolicity of M. Thus, for some s ∈ S, [s, ⨆S] ⊆ Cl(V) ⊆ I⁺(x), and so s ∈ I⁺(x), which gives x ⊑ s. This proves x ≪ y.

Now let x ≪ y. By Lemma 4.3, there is an increasing sequence (yₙ) in I⁻(y) with y = ⨆yₙ. Then since x ≪ y, there is n with x ⊑ yₙ. By Prop. 3.3(ii),

x ⊑ yₙ & yₙ ∈ I⁻(y) ⇒ x ∈ I⁻(y),

which is to say that y ∈ I⁺(x).
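The sequence constructed in Lemma 4.3 can be written down explicitly in a toy case. The sketch below assumes flat 1+1 dimensional Minkowski space with rational coordinates, which is our choice of model and not part of the argument above; `approximating_sequence` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import List, Tuple

Event = Tuple[Fraction, Fraction]  # (t, x) in 1+1 Minkowski space, c = 1


def in_chron_future(p: Event, q: Event) -> bool:
    """q ∈ I⁺(p), i.e. p ≪ q in the sense of Proposition 4.4."""
    dt, dx = q[0] - p[0], q[1] - p[1]
    return dt > abs(dx)


def approximating_sequence(x: Event, n_terms: int) -> List[Event]:
    """Points x_n = x - (1/n, 0): each lies in I⁻(x) and they increase toward x."""
    t, s = x
    return [(t - Fraction(1, n), s) for n in range(1, n_terms + 1)]


origin = (Fraction(0), Fraction(0))
seq = approximating_sequence(origin, 5)
```

Each term is chronologically before the next and before x, and the coordinate distance to x is 1/n, so the sequence converges to x in the manifold topology; Lemma 4.2 then identifies x as its supremum.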
Theorem 4.5. If M is globally hyperbolic, then (M, ⊑) is a bicontinuous poset with ≪ = I⁺ whose interval topology is the manifold topology.

Proof. By combining Lemma 4.3 with Prop. 4.4, ⇓x contains an increasing sequence with supremum x, for each x ∈ M. Thus, M is a continuous poset. For the bicontinuity, Lemmas 4.2, 4.3 and Prop. 4.4 have “duals” which are obtained by replacing ‘increasing’ by ‘decreasing’, I⁺ by I⁻, J⁻ by J⁺, etc. For example, the dual of Lemma 4.3 is that I⁺(x) contains a decreasing sequence with infimum x. Using the duals of these two lemmas, we then give an alternate characterization of ≪ in terms of infima:

x ≪ y ≡ (∀S) ⋀S ⊑ x ⇒ (∃s ∈ S) s ⊑ y,

where we quantify over filtered subsets S of M. These three facts then imply that ⇑x contains a decreasing sequence with inf x. But because ≪ can be phrased in terms of infima, ⇑x itself must be filtered with inf x. Finally, M is bicontinuous, so we know it has an interval topology. Because ≪ = I⁺, the interval topology is the one generated by the timelike causality relation, which by strong causality is the manifold topology.

Bicontinuity, as we have defined it here, is really quite a special property, and some of the nicest posets in the world are not bicontinuous. For example, the powerset of the naturals Pω is not bicontinuous, because we can have F ≪ G for G finite, and F = ⋂Vₙ, where all the Vₙ are infinite.

5. Causal Simplicity

Sometimes global hyperbolicity is regarded as a little too strong. A weaker condition often used is causal simplicity.

Definition 5.1. A spacetime M is causally simple if J⁺(x) and J⁻(x) are closed for all x ∈ M.

It turns out that causal simplicity also has a purely order theoretic characterization.

Theorem 5.2. Let M be a spacetime and (M, ⊑) a continuous poset with ≪ = I⁺. The following are equivalent:
(i) M is causally simple.
(ii) The Lawson topology on M is a subset of the interval topology on M.
Proof. (i) ⇒ (ii): We want to prove that

{⇑x \ ↑F : x ∈ M & F ⊆ M finite} ⊆ int_M.

By strong causality of M and ≪ = I⁺, int_M is the manifold topology, and this is the crucial fact we need as follows. First, ⇑x = I⁺(x) is open in the manifold topology and hence belongs to int_M. Second, ↑x = J⁺(x) is closed in the manifold topology by causal simplicity, so M \ ↑x belongs to int_M. Then int_M contains the basis for the Lawson topology given above.

(ii) ⇒ (i): First, since (M, ⊑) is continuous, its Lawson topology is Hausdorff, so int_M is Hausdorff since it contains the Lawson topology by assumption. Since ≪ = I⁺, int_M is the Alexandroff topology, so Theorem 3.4 implies M is strongly causal. Now, Theorem 3.4 also tells us that int_M is the manifold topology. Since the manifold topology int_M contains the Lawson topology by assumption, and since

J⁺(x) = ↑x and J⁻(x) = ↓x

are both Lawson closed (the second is Scott closed), each is also closed in the manifold topology, which means M is causally simple.

Note that in this proof we have used the fact that causal simplicity implies strong causality.

6. Global Hyperbolicity in the Abstract

There are two elements which make the topology of a globally hyperbolic spacetime tick. They are:
(i) A bicontinuous poset (X, ≤).
(ii) The intervals [a, b] = {x : a ≤ x ≤ b} are compact in the interval topology on X.

From these two we can deduce some aspects we already know as well as some new ones. In particular, bicontinuity ensures that the topology of X, the interval topology, is implicit in ≤. We call such posets globally hyperbolic.

Theorem 6.1.
(i) A globally hyperbolic poset is locally compact and Hausdorff.
(ii) The Lawson topology is contained in the interval topology.
(iii) Its partial order ≤ is a closed subset of X².
(iv) Each directed set with an upper bound has a supremum.
(v) Each filtered set with a lower bound has an infimum.

Proof. First we show that the Lawson topology is contained in the interval topology. Sets of the form ⇑x are open in the interval topology. To prove X \ ↑x is open, let y ∈ X \ ↑x. Then x ≰ y. By bicontinuity, there is b with y ≪ b such that x ≰ b. For any a ≪ y,

y ∈ (a, b) ⊆ X \ ↑x

which proves the Lawson topology is contained in the interval topology. Because the Lawson topology is always Hausdorff on a continuous poset, X is Hausdorff in its interval topology.

Let x ∈ U where U is open. Then there is an open interval x ∈ (a, b) ⊆ U. By continuity of (X, ≤), we can interpolate twice, obtaining a closed interval [c, d] followed by another open interval we call V. We get

x ∈ V ⊆ [c, d] ⊆ (a, b) ⊆ U.
The closure of V is contained in [c, d]: X is Hausdorff so compact sets like [c, d] are closed. Then Cl(V) is a closed subset of a compact space [c, d], so it must be compact. This proves X is locally compact.

To prove ≤ is a closed subset of X², let (a, b) ∈ X² \ ≤. Since a ≰ b, there is x ≪ a with x ≰ b by continuity. Since x ≰ b, there is y with b ≪ y and x ≰ y by bicontinuity. Now choose elements 1 and 2 such that x ≪ a ≪ 1 and 2 ≪ b ≪ y. Then

(a, b) ∈ (x, 1) × (2, y) ⊆ X² \ ≤.

For if (c, d) ∈ (x, 1) × (2, y) and c ≤ d, then x ≤ c ≤ 1 and 2 ≤ d ≤ y, and since c ≤ d, we get x ≤ y, a contradiction. This proves X² \ ≤ is open.

Given a directed set S ⊆ X with an upper bound x, if we fix any element 1 ∈ S, then the set ↑1 ∩ S is also directed and has a supremum iff S does. Then we can assume that S has a least element named 1 ∈ S. The inclusion f : S → X :: s ↦ s is a net and since S is contained in the compact set [1, x], f has a convergent subnet g : I → S. Then T := g(I) ⊆ S is directed and cofinal in S. We claim ⨆T = lim T.

First, lim T is an upper bound for T. If there were t ∈ T with t ≰ lim T, then lim T ∈ X \ ↑t. Since X \ ↑t is open, there is α ∈ I such that

(∀β ∈ I) α ≤ β ⇒ g(β) ∈ X \ ↑t.

Let u = g(α) and t = g(γ). Since I is directed, there is β ∈ I with α, γ ≤ β. Then

g(β) ∈ X \ ↑t & t = g(γ) ≤ g(β),

where the second inequality follows from the fact that subnets are monotone by definition. This is a contradiction, which proves t ≤ lim T for all t.

To prove ⨆T = lim T, let u be an upper bound for T. Then t ≤ u for all t. However, if lim T ≰ u, then lim T ∈ X \ ↓u, and since X \ ↓u is open, we get that T ∩ (X \ ↓u) ≠ ∅, which contradicts that u is an upper bound for T.

Now we prove ⨆S = lim T. Let s ∈ S. Since T is cofinal in S, there is t ∈ T with s ≤ t. Hence s ≤ t ≤ lim T, so lim T is an upper bound for S. To finish, any upper bound for S is one for T so it must be above lim T. Then ⨆S = lim T.

Given a filtered set S with a lower bound x, we can assume it has a greatest element 1. The map f : S* → S :: x ↦ x is a net where the poset S* is obtained by reversing the order on S. Since S ⊆ [x, 1], f has a convergent subnet g, and now the proof is simply the dual of the suprema case.

Globally hyperbolic posets share a remarkable property with metric spaces, that separability and second countability are equivalent.

Proposition 6.2. Let (X, ≤) be a bicontinuous poset. If C ⊆ X is a countable dense subset in the interval topology, then
(i) The collection {(aᵢ, bᵢ) : aᵢ, bᵢ ∈ C, aᵢ ≪ bᵢ} is a countable basis for the interval topology. Thus, separability implies second countability, and even complete metrizability if X is globally hyperbolic.
(ii) For all x ∈ X, ⇓x ∩ C contains a directed set with supremum x, and ⇑x ∩ C contains a filtered set with infimum x.
Proof. (i) Sets of the form (a, b) := {x ∈ X : a ≪ x ≪ b} form a basis for the interval topology. If x ∈ (a, b), then since C is dense, there is aᵢ ∈ (a, x) ∩ C and bᵢ ∈ (x, b) ∩ C and so x ∈ (aᵢ, bᵢ) ⊆ (a, b).

(ii) Fix x ∈ X. Given any a ≪ x, the set (a, x) is open and C is dense, so there is c_a ∈ C with a ≪ c_a ≪ x. The set

S = {c_a ∈ C : a ≪ x} ⊆ ⇓x ∩ C

is directed: If c_a, c_b ∈ S, then since ⇓x is directed, there is d ≪ x with c_a, c_b ≤ d ≪ x and thus c_a, c_b ≤ c_d ∈ S. Finally, ⨆S = x: Any upper bound for S is also one for ⇓x and so above x by continuity. The dual argument shows ⇑x ∩ C contains a filtered set with inf x.

Globally hyperbolic posets are very much like the real line. In fact, a well-known domain theoretic construction pertaining to the real line extends in perfect form to the globally hyperbolic posets:

Theorem 6.3. The closed intervals of a globally hyperbolic poset X

IX := {[a, b] : a ≤ b & a, b ∈ X}

ordered by reverse inclusion

[a, b] ⊑ [c, d] ≡ [c, d] ⊆ [a, b]

form a continuous domain with

[a, b] ≪ [c, d] ≡ a ≪ c & d ≪ b.

The poset X has a countable basis iff IX is ω-continuous. Finally,

max(IX) ≃ X,

where the set of maximal elements has the relative Scott topology from IX.

Proof. If S ⊆ IX is a directed set, we can write it as

S = {[aᵢ, bᵢ] : i ∈ I}.

Without loss of generality, we can assume S has a least element 1 = [a, b]. Thus, for all i ∈ I, a ≤ aᵢ ≤ bᵢ ≤ b. Then {aᵢ} is a directed subset of X bounded above by b, and {bᵢ} is a filtered subset of X bounded below by a. We know that ⨆aᵢ = lim aᵢ, ⋀bᵢ = lim bᵢ and that ≤ is closed. It follows that

⨆S = [⨆aᵢ, ⋀bᵢ].

For the continuity of IX, consider [a, b] ∈ IX. If c ≪ a and b ≪ d, then [c, d] ≪ [a, b] in IX. Then

[a, b] = ⨆{[c, d] : c ≪ a & b ≪ d}     (1)

a supremum that is directed since X is bicontinuous. Suppose now that [x, y] ≪ [a, b] in IX. Then using (1), there is [c, d] with [x, y] ⊑ [c, d] such that c ≪ a and b ≪ d, which means x ≤ c ≪ a and b ≪ d ≤ y and thus x ≪ a and b ≪ y. This completely characterizes the ≪ relation on IX, which now enables us to prove max(IX) ≃ X, since we can write

⇑[a, b] ∩ max(IX) = {{x} : x ∈ X & a ≪ x ≪ b}

and ⇑[a, b] is a basis for the Scott topology on IX.
Finally, if X has a countable basis, then it has a countable dense subset C ⊆ X, which means {[aₙ, bₙ] : aₙ ≪ bₙ, aₙ, bₙ ∈ C} is a countable basis for IX by Prop. 6.2(ii).

The endpoints of an interval [a, b] form a two element list x : {1, 2} → X with a = x(1) ≤ x(2) = b. We call these formal intervals. They determine the information in an interval as follows:

Corollary 6.4. The formal intervals ordered by

x ⊑ y ≡ x(1) ≤ y(1) & y(2) ≤ x(2)

form a domain isomorphic to IX.

This observation, that spacetime has a canonical domain theoretic model, has at least two important applications, one of which we now consider. We prove that from only a countable set of events and the causality relation, one can reconstruct spacetime in a purely order theoretic manner. Explaining this requires domain theory.

7. Spacetime from a Discrete Causal Set

In the causal set program [Sor91] causal sets are discrete sets equipped with a partial order relation interpreted as causality. They require a local finiteness condition: between two elements there are only finitely many elements. This is at variance with our notions of interpolation. We are not going to debate the merits of local finiteness here; instead we show that from a countable dense subset equipped with the causal order we can reconstruct the spacetime manifold with its topology.

Recall from the appendix on domain theory that an abstract basis is a set (C, ≪) with a transitive relation that is interpolative from the − direction:

F ≪ x ⇒ (∃y ∈ C) F ≪ y ≪ x,

for all finite subsets F ⊆ C and all x ∈ C. Suppose, though, that it is also interpolative from the + direction:

x ≪ F ⇒ (∃y ∈ C) x ≪ y ≪ F.

Then we can define a new abstract basis of intervals

int(C) = {(a, b) : a ≪ b} = ≪ ⊆ C²

whose relation is

(a, b) ≪ (c, d) ≡ a ≪ c & d ≪ b.

Lemma 7.1. If (C, ≪) is an abstract basis that is ± interpolative, then (int(C), ≪) is an abstract basis.

Proof. Let F = {(aᵢ, bᵢ) : 1 ≤ i ≤ n} ≪ (a, b). Let A = {aᵢ} and B = {bᵢ}. Then A ≪ a and b ≪ B in C. Since C lets us interpolate in both directions, we get (x, y) with F ≪ (x, y) ≪ (a, b). Transitivity is inherited from C.

Let IC denote the ideal completion of the abstract basis int(C).
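The interpolation step in the proof of Lemma 7.1 can be sketched for the simplest ± interpolative basis, C = Q with ≪ taken to be <. This choice of C, the helper name `interpolate`, and the midpoint construction are our assumptions for illustration; the lemma itself only needs the existence of some interpolant.

```python
from fractions import Fraction
from typing import List, Tuple

Pair = Tuple[Fraction, Fraction]  # an element (a, b) of int(Q), with a < b


def way_below(p: Pair, q: Pair) -> bool:
    """(a, b) ≪ (c, d) in int(Q): a < c and d < b."""
    return p[0] < q[0] and q[1] < p[1]


def interpolate(finite: List[Pair], target: Pair) -> Pair:
    """Return (x, y) with F ≪ (x, y) ≪ target, assuming F ≪ target."""
    if not all(way_below(p, target) for p in finite):
        raise ValueError("every element of F must be way below the target")
    a, b = target
    # For empty F, fall back to endpoints one unit outside the target.
    lo = max((p[0] for p in finite), default=a - 1)
    hi = min((p[1] for p in finite), default=b + 1)
    x = Fraction(lo + a) / 2  # interpolates A ≪ x ≪ a from the - direction
    y = Fraction(b + hi) / 2  # interpolates b ≪ y ≪ B from the + direction
    return (x, y)


F = [(Fraction(0), Fraction(10)), (Fraction(1), Fraction(9))]
target = (Fraction(3), Fraction(4))
middle = interpolate(F, target)
```

The left endpoint is interpolated from below and the right endpoint from above, which is why both directions of interpolation are needed in C.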
Theorem 7.2. Let C be a countable dense subset of a globally hyperbolic spacetime M and ≪ = I⁺ be timelike causality. Then

max(IC) ≃ M,

where the set of maximal elements has the Scott topology.

Proof. Because M is bicontinuous, the sets ⇑x and ⇓x are filtered and directed respectively. Thus (C, ≪) is an abstract basis for which (int(C), ≪) is also an abstract basis. Because C is dense, (int(C), ≪) is a basis for the domain IM. But, the ideal completion of any basis for IM must be isomorphic to IM. Thus, IC ≃ IM, and so M ≃ max(IM) ≃ max(IC).

In “ordering the order” I⁺, taking its completion, and then the set of maximal elements, we recover spacetime by reasoning only about the causal relationships between a countable dense set of events. One objection to this might be that we begin from a dense set C, and then order theoretically recover the space M, but dense is a topological idea so we need to know the topology of M before we can recover it! But the denseness of C can be expressed in purely causal terms:

C dense ≡ (∀x, y ∈ M) x ≪ y ⇒ (∃z ∈ C) x ≪ z ≪ y.

Now the objection might be that we still have to reference M. We too would like to not reference M at all. However, some global property needs to be assumed, either directly or indirectly, in order to reconstruct M.

Theorem 7.2 is very different from results like “Let M be a certain spacetime with relation ≤. Then the interval topology is the manifold topology.” Here we identify, in abstract terms, a process by which a countable set with a causality relation determines a space. The process is entirely order theoretic in nature; spacetime is not required to understand or execute it (i.e., if we put C = Q and ≪ = <, then max(IC) ≃ R). In this sense, our understanding of the relation between causality and the topology of spacetime is now explainable independently of geometry.

Last, notice that if we naively try to obtain M by taking the ideal completion of (S, ⊑) or (S, ≪), it will not work: M is not a dcpo.
Some other process is necessary, and the exact structure of globally hyperbolic spacetime allows one to carry out this alternative process. Ideally, one would now like to know what constraints on C in general imply that max(IC) is a manifold.
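The remark that C = Q with ≪ = < gives max(IC) ≃ R can be illustrated by exhibiting a ≪-increasing chain in int(Q) whose intervals shrink to a single irrational point. The sketch below, which is our own construction and not taken from the text, does this for √2 using exact integer square roots; the chain generates an ideal of int(Q), and the assumption is that this ideal corresponds to the maximal element representing √2.

```python
from fractions import Fraction
from math import isqrt
from typing import Tuple

Pair = Tuple[Fraction, Fraction]  # an element (a, b) of int(Q), with a < b


def way_below(p: Pair, q: Pair) -> bool:
    """(a, b) ≪ (c, d) in int(Q): a < c and d < b."""
    return p[0] < q[0] and q[1] < p[1]


def sqrt2_interval(n: int) -> Pair:
    """A rational interval of width 3/10**n with sqrt(2) strictly inside."""
    k = isqrt(2 * 100**n)  # floor(sqrt(2) * 10**n), computed exactly
    scale = 10**n
    return (Fraction(k - 1, scale), Fraction(k + 2, scale))


chain = [sqrt2_interval(n) for n in range(6)]
```

Only rational numbers and the order < are used, yet the chain pins down a point that is not in Q, which is the sense in which the completion adds the missing events.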
8. Spacetime as a Domain

The category of globally hyperbolic posets is naturally isomorphic to a special category of domains called interval domains.

Definition 8.1. An interval poset is a poset D that has two functions left : D → max(D) and right : D → max(D) such that
(i) Each x ∈ D is an “interval” with left(x) and right(x) as endpoints:
(∀x ∈ D) x = left(x) ⊓ right(x),
(ii) The union of two intervals with a common endpoint is another interval: For all x, y ∈ D, if right(x) = left(y), then
left(x ⊓ y) = left(x) & right(x ⊓ y) = right(y),
(iii) Each point p ∈ ↑x ∩ max(D) of an interval x ∈ D determines two subintervals, left(x) ⊓ p and p ⊓ right(x), with endpoints:
left(left(x) ⊓ p) = left(x) & right(left(x) ⊓ p) = p,
left(p ⊓ right(x)) = p & right(p ⊓ right(x)) = right(x).

Notice that a nonempty interval poset D has max(D) ≠ ∅ by definition. With interval posets, we only assume that infima indicated in the definition exist; in particular, we do not assume the existence of all binary infima.

Definition 8.2. For an interval poset (D, left, right), the relation ≤ on max(D) is

a ≤ b ≡ (∃x ∈ D) a = left(x) & b = right(x)

for a, b ∈ max(D).

Lemma 8.3. (max(D), ≤) is a poset.

Proof. Reflexivity: By property (i) of an interval poset, x ⊑ left(x), right(x), so if a ∈ max(D), a = left(a) = right(a), which means a ≤ a.

Antisymmetry: If a ≤ b and b ≤ a, then there are x, y ∈ D with a = left(x) = right(y) and b = right(x) = left(y), so this combined with property (i) gives

x = left(x) ⊓ right(x) = right(y) ⊓ left(y) = y

and thus a = b.

Transitivity: If a ≤ b and b ≤ c, then there are x, y ∈ D with a = left(x), b = right(x) = left(y) and c = right(y), so property (ii) of interval posets says that for z = x ⊓ y we have

left(z) = left(x) = a & right(z) = right(y) = c,

and thus a ≤ c.

An interval poset D is the set of intervals of (max(D), ≤) ordered by reverse inclusion:

Lemma 8.4. If D is an interval poset, then

x ⊑ y ≡ (left(x) ≤ left(y) ≤ right(y) ≤ right(x)).

Proof. (⇒) Since x ⊑ y ⊑ left(y), property (iii) of interval posets implies z = left(x) ⊓ left(y) is an “interval” with

left(z) = left(x) & right(z) = left(y)

and thus left(x) ≤ left(y). The inequality right(y) ≤ right(x) follows similarly. The inequality left(y) ≤ right(y) follows from the definition of ≤.

(⇐) Applying the definition of ≤ and properties (ii) and (i) of interval posets to left(x) ≤ left(y) ≤ right(x), we get x ⊑ left(y). Similarly, x ⊑ right(y). Then x ⊑ left(y) ⊓ right(y) = y.
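A small finite instance may help fix the axioms. The sketch below takes D to be the closed intervals of the chain {0, 1, 2, 3} under reverse inclusion, which is our choice of example, and checks axioms (i) and (ii) of Definition 8.1 together with the equivalence of Lemma 8.4 by brute force. The names `meet` and `leq` are hypothetical helpers, and a finite chain says nothing about the continuity conditions introduced later.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # [a, b] with a <= b over the chain {0, 1, 2, 3}

D: List[Interval] = [(a, b) for a in range(4) for b in range(4) if a <= b]


def below(x: Interval, y: Interval) -> bool:
    """x ⊑ y: reverse inclusion of intervals."""
    return x[0] <= y[0] and y[1] <= x[1]


def left(x: Interval) -> Interval:
    return (x[0], x[0])  # a maximal element of D


def right(x: Interval) -> Interval:
    return (x[1], x[1])  # a maximal element of D


def meet(x: Interval, y: Interval) -> Optional[Interval]:
    """Greatest lower bound x ⊓ y in (D, ⊑), found by exhaustive search."""
    lower = [z for z in D if below(z, x) and below(z, y)]
    greatest = [z for z in lower if all(below(w, z) for w in lower)]
    return greatest[0] if greatest else None


def leq(a: Interval, b: Interval) -> bool:
    """a ≤ b on max(D), as in Definition 8.2."""
    return any(left(x) == a and right(x) == b for x in D)


axiom_i = all(meet(left(x), right(x)) == x for x in D)
axiom_ii = all(
    left(meet(x, y)) == left(x) and right(meet(x, y)) == right(y)
    for x in D for y in D if right(x) == left(y)
)
lemma_8_4 = all(
    below(x, y) == (leq(left(x), left(y)) and leq(left(y), right(y))
                    and leq(right(y), right(x)))
    for x in D for y in D
)
```

In this model the induced order ≤ on max(D) is just the usual order on {0, 1, 2, 3}, so D is recovered as the intervals of its own maximal elements.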
Corollary 8.5. If D is an interval poset,

φ : D → I(max(D), ≤) :: x ↦ [left(x), right(x)]

is an order isomorphism. In particular,

p ∈ ↑x ∩ max(D) ≡ left(x) ≤ p ≤ right(x)

in any interval poset.

Definition 8.6. If (D, left, right) is an interval poset,

[p, ·] := left⁻¹(p) and [·, q] := right⁻¹(q)

for any p, q ∈ max(D).

Definition 8.7. An interval domain is an interval poset (D, left, right) where D is a continuous dcpo such that
(i) If p ∈ ⇑x ∩ max(D), then
⇑(left(x) ⊓ p) ≠ ∅ & ⇑(p ⊓ right(x)) ≠ ∅.
(ii) For all x ∈ D, the following are equivalent:
(a) ⇑x ≠ ∅
(b) (∀y ∈ [left(x), ·])(y ⊑ x ⇒ y ≪ right(y) in [·, right(y)])
(c) (∀y ∈ [·, right(x)])(y ⊑ x ⇒ y ≪ left(y) in [left(y), ·])
(iii) Invariance of endpoints under suprema:
(a) For all directed S ⊆ [p, ·],
left(⨆S) = p & right(⨆S) = right(⨆T)
for any directed T ⊆ [q, ·] with right(T) = right(S).
(b) For all directed S ⊆ [·, q],
left(⨆S) = left(⨆T) & right(⨆S) = q
for any directed T ⊆ [·, p] with left(T) = left(S).
(iv) Intervals are compact: For all x ∈ D, ↑x ∩ max(D) is Scott compact.

Interval domains are interval posets whose axioms also take into account the completeness and approximation present in a domain: (i) says if a point p belongs to the interior of an interval x ∈ D, the subintervals left(x) ⊓ p and p ⊓ right(x) both have nonempty interior; (ii) says an interval has nonempty interior iff all intervals that contain it have nonempty interior locally; (iii) explains the behavior of endpoints when taking suprema.

For a globally hyperbolic (X, ≤), we define:

left : IX → IX :: [a, b] ↦ [a] and right : IX → IX :: [a, b] ↦ [b].
Lemma 8.8. If (X, ≤) is a globally hyperbolic poset, then (IX, left, right) is an interval domain.

In essence, we now prove that this is the only example.

Definition 8.9. The category IN of interval domains and commutative maps is given by
• objects Interval domains (D, left, right).
• arrows Scott continuous f : D → E that commute with left and right, i.e., such that the two squares formed by f with left_D, left_E and with right_D, right_E commute:
left_E ∘ f = f ∘ left_D and right_E ∘ f = f ∘ right_D.
• identity 1 : D → D.
• composition f ∘ g.

Definition 8.10. The category G is given by
• objects Globally hyperbolic posets (X, ≤).
• arrows Continuous in the interval topology, monotone.
• identity 1 : X → X.
• composition f ∘ g.

It is routine to verify that IN and G are categories.

Proposition 8.11. The correspondence I : G → IN given by

(X, ≤) ↦ (IX, left, right),
(f : X → Y) ↦ (f̄ : IX → IY)

is a functor between categories.

Proof. The map f̄ : IX → IY defined by f̄[a, b] = [f(a), f(b)] takes intervals to intervals since f is monotone. It is Scott continuous because suprema and infima in X and Y are limits in the respective interval topologies and f is continuous with respect to the interval topology.

Now we prove there is also a functor going the other way. Throughout the proof, we use ⨆ for suprema in (D, ⊑) and ⋁ for suprema in (max(D), ≤).

Lemma 8.12. Let D be an interval domain with x ∈ D and p ∈ max(D). If x ≪ p in D, then left(x) ≪ p ≪ right(x) in (max(D), ≤).

Proof. Since x ≪ p in D, x ⊑ p, and so left(x) ≤ p ≤ right(x).

(⇒) First we prove left(x) ≪ p. Let S ⊆ max(D) be a ≤-directed set with p ≤ ⋁S. For x̄ := φ⁻¹([left(x), p]) and y := φ⁻¹([left(x), ⋁S]), we have y ⊑ x̄. By property
(i) of interval domains, ⇑x ≠ ∅ implies that ⇑x̄ = ⇑(left(x) ⊓ p) ≠ ∅, so property (ii) of interval domains says y ≪ right(y) in the poset [·, right(y)]. Then

y ≪ right(y) = ⨆_{s∈S} φ⁻¹[s, ⋁S]

which means y ⊑ φ⁻¹[s, ⋁S] for some s ∈ S. So by monotonicity of φ, left(x) ≤ s. Thus, left(x) ≪ p in (max(D), ≤).

Now we prove p ≪ right(x). Let S ⊆ max(D) be a ≤-directed set with right(x) ≤ ⋁S. For x̄ := φ⁻¹([p, right(x)]) and y := φ⁻¹([p, ⋁S]), y ⊑ x̄, and since ⇑x̄ ≠ ∅ by property (i) of interval domains, property (ii) of interval domains gives y ≪ right(y) in [·, right(y)]. Then

y ≪ right(y) = ⨆_{s∈S} φ⁻¹[s, ⋁S]

which means [s, ⋁S] ⊆ [p, ⋁S] and hence p ≤ s for some s ∈ S.
Now we begin the proof that (max(D), ≤) is a globally hyperbolic poset when D is an interval domain.

Lemma 8.13. Let p, q ∈ max(D).
(i) If S ⊆ [p, ·] is directed, then
right(⨆S) = ⋀_{s∈S} right(s).
(ii) If S ⊆ [·, q] is directed, then
left(⨆S) = ⋁_{s∈S} left(s).

Proof. (i) First, right(⨆S) is a ≤-lower bound for {right(s) : s ∈ S} because

φ(⨆S) = [left(⨆S), right(⨆S)] = ⋂_{s∈S} [p, right(s)].

Given any other lower bound q ≤ right(s) for all s ∈ S, the set

T := {φ⁻¹([q, right(s)]) : s ∈ S} ⊆ [q, ·]

is directed with right(T) = right(S), so

q = left(⨆T) ≤ right(⨆T) = right(⨆S),

where the two equalities follow from property (iii)(a) of interval domains, and the inequality follows from the definition of ≤. This proves the claim.

(ii) This proof is simply the dual of (i), using property (iii)(b) of interval domains.
Lemma 8.14. Let D be an interval domain. If ⇑x ≠ ∅ in D, then

⋀S ≤ left(x) ⇒ (∃s ∈ S) s ≤ right(x)

for any ≤-filtered S ⊆ max(D) with an infimum in (max(D), ≤).

Proof. Let S ⊆ max(D) be a ≤-filtered set with ⋀S ≤ left(x). There is some [a, b] with x = φ⁻¹[a, b]. Setting y := φ⁻¹[⋀S, b], we have y ⊑ x and ⇑x ≠ ∅, so property (ii)(c) of interval domains says y ≪ left(y) in [left(y), ·]. Then

y ≪ left(y) = ⨆_{s∈S} φ⁻¹[⋀S, s],

where this set is ⊑-directed because S is ≤-filtered. Thus, y ⊑ φ⁻¹[⋀S, s] for some s ∈ S, which gives s ≤ b.
Lemma 8.15. Let D be an interval domain. Then
(i) The set ⇓x is ≤-directed with ⋁⇓x = x.
(ii) For all a, b ∈ max(D), a ≪ b in (max(D), ≤) iff for all ≤-filtered S ⊆ max(D) with an infimum, ⋀S ≤ a ⇒ (∃s ∈ S) s ≤ b.
(iii) The set ⇑x is ≤-filtered with ⋀⇑x = x.
Thus, the poset (max(D), ≤) is bicontinuous.

Proof. (i) By Lemma 8.12, if x ≪ p in D, then left(x) ≪ p in max(D). Then the set

T = {left(x) : x ≪ p in D} ⊆ ⇓p

is ≤-directed. We will prove ⋁T = p. To see this,

S = {φ⁻¹[left(x), p] : x ≪ p in D}

is a directed subset of [·, p], so by Lemma 8.13(ii),

left(⨆S) = ⋁T.

Now we calculate ⨆S. We know ⨆S = φ⁻¹[a, b], where [a, b] = ⋂[left(x), p]. Assume ⨆S ≠ p. By maximality of p, p ⋢ ⨆S, so there must be an x ∈ D with x ≪ p and x ⋢ ⨆S. Then [a, b] ⊄ [left(x), right(x)], so either

left(x) ≰ a or b ≰ right(x).

But, [a, b] ⊆ [left(x), p] for any x ≪ p in D, so we have left(x) ≤ a and b ≤ p ≤ right(x), which is a contradiction. Thus,

p = ⨆S = left(⨆S) = ⋁T,

and since ⇓p contains a ≤-directed set with sup p, ⇓p itself is ≤-directed with ⋁⇓p = p. This proves (max(D), ≤) is a continuous poset.

(ii) (⇒) Let a ≪ b in max(D). Let x := φ⁻¹[a, b]. We first prove ⇑x ≠ ∅ using property (ii)(b) of interval domains. Let y ⊑ x with y ∈ [a, ·]. We need to show y ≪ right(y)
in the poset [·, right(y)]. Let S ⊆ [·, right(y)] be directed with right(y) ⊑ ⨆S and hence right(y) = ⨆S by maximality. Using Lemma 8.13(ii),

right(y) = ⨆S = left(⨆S) = ⋁_{s∈S} left(s).

But y ⊑ x, so b ≤ right(y) = ⋁_{s∈S} left(s), and since a ≪ b, a ≤ left(s) for some s ∈ S. Then for this same s, we have

left(y) = a ≤ left(s) ≤ right(s) = right(y)

which means y ⊑ s. Then y ≪ right(y) in the poset [·, right(y)]. By property (ii)(b), we have ⇑x ≠ ∅, so Lemma 8.14 now gives the desired result.

(ii) (⇐) First, S = {a} is one such filtered set, so a ≤ b. Let x = φ⁻¹[a, b]. We prove ⇑x ≠ ∅ using axiom (ii)(c) of interval domains. Let y ⊑ x with y ∈ [·, b]. To prove y ≪ left(y) in [left(y), ·], let S ⊆ [left(y), ·] be directed with left(y) ⊑ ⨆S. By maximality, left(y) = ⨆S. By Lemma 8.13(i),

left(y) = ⨆S = right(⨆S) = ⋀_{s∈S} right(s)

and {right(s) : s ∈ S} is ≤-filtered. Since y ⊑ x,

⋀_{s∈S} right(s) = left(y) ≤ left(x) = a,

so by assumption, right(s) ≤ b, for some s ∈ S. Then for this same s,

left(y) = left(s) ≤ right(s) ≤ b = right(y)

which means y ⊑ s. Then y ≪ left(y) in [left(y), ·]. By property (ii)(c) of interval domains, ⇑x ≠ ∅. By Lemma 8.12, taking any p ∈ ⇑x, we get a = left(x) ≪ p ≪ right(x) = b.

(iii) Because of the characterization of ≪ in (ii), this proof is simply the dual of (i).

Lemma 8.16. Let (D, left, right) be an interval domain. Then
(i) If a ≪ p ≪ b in (max(D), ≤), then φ⁻¹[a, b] ≪ p in D.
(ii) The interval topology on (max(D), ≤) is the relative Scott topology max(D) inherits from D.
Thus, the poset (max(D), ≤) is globally hyperbolic.

Proof. (i) Let S ⊆ D be directed with p ⊑ ⨆S. Then p = ⨆S by maximality. The sets

L = {φ⁻¹[left(s), p] : s ∈ S} and R = {φ⁻¹[p, right(s)] : s ∈ S}

are both directed in D. For their suprema, Lemma 8.13 gives

left(⨆L) = ⋁_{s∈S} left(s) & right(⨆R) = ⋀_{s∈S} right(s).

Since s ⊑ φ⁻¹[⋁_{s∈S} left(s), ⋀_{s∈S} right(s)] for all s ∈ S,

p = ⨆S ⊑ φ⁻¹[⋁_{s∈S} left(s), ⋀_{s∈S} right(s)],
and so

⋁_{s∈S} left(s) = p = ⋀_{s∈S} right(s).

Since a ≪ p, there is s₁ ∈ S with a ≤ left(s₁). Since p ≪ b, there is s₂ ∈ S with right(s₂) ≤ b, using bicontinuity of max(D). By the directedness of S, there is s ∈ S with s₁, s₂ ⊑ s, which gives

a ≤ left(s₁) ≤ left(s) ≤ right(s) ≤ right(s₂) ≤ b

which proves φ⁻¹[a, b] ⊑ s.

(ii) Combining (i) and Lemma 8.12,

a ≪ p ≪ b in (max(D), ≤) ⇔ φ⁻¹[a, b] ≪ p in D.

Thus, the identity map 1 : (max(D), ≤) → (max(D), σ) sends basic open sets in the interval topology to basic open sets in the relative Scott topology, and conversely, so the two spaces are homeomorphic. Finally, since ↑x ∩ max(D) = {p ∈ max(D) : left(x) ≤ p ≤ right(x)}, and this set is Scott compact, it must also be compact in the interval topology on (max(D), ≤), since they are homeomorphic.

Proposition 8.17. The correspondence max : IN → G given by

(D, left, right) ↦ (max(D), ≤),
(f : D → E) ↦ (f|max(D) : max(D) → max(E))

is a functor between categories.

Proof. First, commutative maps f : D → E preserve maximal elements: If x ∈ max(D), then f(x) = f(left_D(x)) = left_E ∘ f(x) ∈ max(E). By Lemma 8.16(ii), f|max(D) is continuous with respect to the interval topology. For monotonicity, let a ≤ b in max(D) and x := φ⁻¹[a, b] ∈ D. Then

left_E ∘ f(x) = f(left_D(x)) = f(a)

and

right_E ∘ f(x) = f(right_D(x)) = f(b),

which means f(a) ≤ f(b), by the definition of ≤ on max(E).
Before the statement of the main theorem in this section, we recall the definition of a natural isomorphism.

Definition 8.18. A natural transformation η : F → G between functors F : C → D and G : C → D is a collection of arrows (η_X : F(X) → G(X))_{X∈C} such that for any arrow f : A → B in C, the square formed by η_A : F(A) → G(A), F(f) : F(A) → F(B), G(f) : G(A) → G(B) and η_B : F(B) → G(B) commutes, that is,

G(f) ∘ η_A = η_B ∘ F(f).

If each η_X is an isomorphism, η is a natural isomorphism.
Categories C and D are equivalent when there are functors F : C → D and G : D → C and natural isomorphisms η : 1_C → GF and μ : 1_D → FG.

Theorem 8.19. The category of globally hyperbolic posets is equivalent to the category of interval domains.

Proof. We have natural isomorphisms

η : 1_IN → I ∘ max and μ : 1_G → max ∘ I.

This result suggests that questions about spacetime can be converted to domain theoretic form, where we can use domain theory to answer them, and then translate the answers back to the language of physics (and vice-versa). Notice too that the category of interval posets and commutative maps is equivalent to the category of posets and monotone maps.

It also shows that causality between events is equivalent to an order on regions of spacetime. Most importantly, we have shown that globally hyperbolic spacetime with causality is equivalent to a structure IX whose origins are “discrete.” This is the formal explanation for why spacetime can be reconstructed from a countable dense set of events in a purely order theoretic manner.

9. Conclusions

We summarize the main results of this paper:
1. we have shown how to reconstruct the spacetime topology from the causal structure using purely order-theoretic ideas,
2. we have given an order theoretic characterization of causal simplicity,
3. we have given an abstract order-theoretic definition of global hyperbolicity,
4. we have identified bicontinuity as an important causality condition,
5. we have shown that one can reconstruct the spacetime manifold, and its topology, from a countable dense subset,
6. we have shown that there is an equivalence of categories between a new category of interval domains and the category of globally hyperbolic posets.

One of us has also shown that (a version of) the Sorkin-Woolgar result [SW96, Mar06] holds using order theoretic arguments. In fact other aspects of domain theory, in particular the notion of powerdomains as a domain theoretic generalization of powersets, play an important role in that result and provide a very natural setting for it.

There is much more one can do. As we have seen, one of the benefits of the domain theoretic viewpoint is that from a dense discrete set (C, ≪) with timelike causality ≪, spacetime can be order theoretically reconstructed: globally hyperbolic spacetime emanates from something discrete. So one question is whether the ‘denseness’ requirement can be eliminated: in essence, can one tell when an abstract basis (C, ≪) comes from a manifold? Of course, we can
attempt the reconstruction and see what we get, but can we predict what the result will be by imposing certain conditions on (C, ≪)?

Another interesting, and possibly related, question is the algebraic topology of these manifolds based on directed homotopy [GFR98, Faj00]. It is clear that the developments cited show the usefulness of the concept of homotopy based on directed curves for computational applications. It would be fascinating to see what one could learn about spacetime structure and especially topology change.

It seems that it might be possible to use order as the basis for new and useful causality conditions which generalize global hyperbolicity. Some possible candidates are to require (M, ⊑) to be a continuous (bicontinuous) poset. Bicontinuity, in particular, has the nice consequence that one does not have to explicitly assume strong causality as one does with global hyperbolicity. Is M bicontinuous iff it is causally simple? We also expect there to be domain theoretic versions of most of the well known causality conditions, such as causal continuity or stable causality.

It is now natural to ask about the domain theoretic analogue of ‘Lorentz metric’, and the authors suspect it is related to the study of measurement ([Mar00a, Mar00b]). Measurements give a way of introducing quantitative information into domain theory. As is well known, the causal structure determines the conformal metric: to get the rest of the metric one needs some length or volume information.

We feel that the domain theoretic setting can be used to address the whole gamut of quantum theoretic questions. Perhaps one can use domain theoretic notions of the derivative to define fields on spacetime. After that, we could ask about the domain theoretic analogue of dynamics for fields on spacetime or even for Einstein’s equation.
Given a reformulation of general relativity in domain theoretic terms, a first step toward a theory of quantum gravity would be to restrict to a countable abstract basis with a measurement. The advantage though of the domain theoretic formulation is that we will know up front how to reconstruct ‘classical’ general relativity as an order theoretic ‘limit’.

Appendix: Domain Theory

A useful technique for constructing domains is to take the ideal completion of an abstract basis.

Definition 9.1. An abstract basis is given by a set B together with a transitive relation < on B which is interpolative, that is, M < x ⇒ (∃ y ∈ B) M < y < x for all x ∈ B and all finite subsets M of B. Notice the meaning of M < x: it means y < x for all y ∈ M. Abstract bases are covered in [AJ94], which is where one finds the following.

Definition 9.2. An ideal in (B, <) is a nonempty subset I of B such that
(i) I is a lower set: (∀ x ∈ B)(∀ y ∈ I) x < y ⇒ x ∈ I.
(ii) I is directed: (∀ x, y ∈ I)(∃ z ∈ I) x, y < z.

The collection of ideals of an abstract basis (B, <) ordered under inclusion is a partially ordered set called the ideal completion of B. We denote this poset by B̄.
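Definitions 9.1 and 9.2 can be checked by brute force on a small example. The sketch below is only an illustration under stated assumptions: the basis (the divisors of 12 with the reflexive divisibility order) and the helper names `is_abstract_basis` and `ideals` are hypothetical choices, not taken from the paper. For a finite reflexive order every directed lower set has a greatest element, so one expects exactly the principal ideals.

```python
from itertools import combinations

def is_abstract_basis(B, lt):
    """Check transitivity and the interpolation property of Definition 9.1."""
    transitive = all(
        lt(x, z)
        for x in B for y in B for z in B
        if lt(x, y) and lt(y, z)
    )

    def interpolates(M, x):
        # M < x must imply the existence of y in B with M < y < x.
        if not all(lt(m, x) for m in M):
            return True
        return any(all(lt(m, y) for m in M) and lt(y, x) for y in B)

    subsets = [c for r in range(len(B) + 1) for c in combinations(B, r)]
    return transitive and all(interpolates(M, x) for M in subsets for x in B)

def ideals(B, lt):
    """Enumerate the nonempty, lower, directed subsets of (B, lt) (Definition 9.2)."""
    found = []
    for r in range(1, len(B) + 1):
        for c in combinations(B, r):
            I = set(c)
            lower = all(x in I for y in I for x in B if lt(x, y))
            directed = all(
                any(lt(x, z) and lt(y, z) for z in I)
                for x in I for y in I
            )
            if lower and directed:
                found.append(frozenset(I))
    return found

# Hypothetical example: divisors of 12, with x < y meaning "x divides y".
B = [1, 2, 3, 4, 6, 12]
lt = lambda x, y: y % x == 0

all_ideals = ideals(B, lt)
# Principal ideals {y in B : y < x}, one for each x in B.
principal = {frozenset(y for y in B if lt(y, x)) for x in B}
```

In this finite case the ideal completion is just a copy of B; genuinely new limit elements only appear for infinite bases, which a brute-force enumeration like this cannot reach.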
The set {y ∈ B : y < x} for x ∈ B is an ideal, which leads to a natural mapping from B into B̄, given by i(x) = {y ∈ B : y < x}.

Proposition 9.3. If (B, <) is an abstract basis, then
(i) Its ideal completion B̄ is a dcpo.
(ii) For I, J ∈ B̄, I ≪ J ⇔ (∃ x, y ∈ B) x < y & I ⊆ i(x) ⊆ i(y) ⊆ J.
(iii) B̄ is a continuous dcpo with basis i(B).

If one takes any basis B of a domain D and restricts the approximation relation ≪ on D to B, one is left with an abstract basis (B, ≪) whose ideal completion is D. Thus, all domains arise as the ideal completion of an abstract basis.

Appendix: Topology

Nets are a generalization of sequences. Let X be a space.

Definition 9.4. A net is a function f : I → X where I is a directed poset. A subset J of I is cofinal if for all α ∈ I, there is β ∈ J with α ≤ β.

Definition 9.5. A subnet of a net f : I → X is a function g : J → I such that J is directed and
• For all x, y ∈ J, x ≤ y ⇒ g(x) ≤ g(y)
• g(J) is cofinal in I.

Definition 9.6. A net f : I → X converges to x ∈ X if for all open U ⊆ X with x ∈ U, there is α ∈ I such that α ≤ β ⇒ f(β) ∈ U for all β ∈ I.

A space X is compact if every open cover has a finite subcover.

Proposition 9.7. A space X is compact iff every net f : I → X has a convergent subnet.

References

[AJ94] Abramsky, S., Jung, A.: Domain theory. In: Maibaum, T.S.E., Abramsky, S., Gabbay, D.M., eds., Handbook of Logic in Computer Science, Vol. III. Oxford: Oxford University Press, 1994
[Faj00] Fajstrup, L.: Loops, ditopology and deadlocks: Geometry and concurrency. Math. Struct. Comput. Sci. 10(4), 459–480 (2000)
[GFR98] Goubault, E., Fajstrup, L., Raussen, M.: Algebraic topology and concurrency. Department of Mathematical Sciences RR-98-2008, Aalborg University, 1998. Presented at MFCS 1998 London
[GKK+03] Gierz, G., Hofmann, K.H., Keimel, K., Lawson, J.D., Mislove, M., Scott, D.S.: Continuous lattices and domains. Number 93 in Encyclopedia of Mathematics and its Applications. Cambridge: Cambridge University Press, 2003
[HE73] Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge Monographs on Mathematical Physics. Cambridge: Cambridge University Press, 1973
[J93] Joshi, P.S.: Global aspects in gravitation and cosmology. International Series of Monographs on Physics 87. Oxford: Oxford Science Publications, 1993
[Mal77] Malament, D.: The class of continuous timelike curves determines the topology of spacetime. J. Math. Phys. 18(7), 1399–1404 (1977)
[Mar00a] Martin, K.: A foundation for computation. PhD thesis, Department of Mathematics, Tulane University, 2000
[Mar00b] Martin, K.: The measurement process in domain theory. In: 27th International Colloquium on Automata, Languages and Programming (ICALP’00), Number 1853 in Lecture Notes In Computer Science. Berlin-Heidelberg-New York: Springer-Verlag, 2000, pp. 116–126
[Mar06] Martin, K.: Compactness of the space of causal curves. J. Class. Quantum Gravity 23(4), 1241–1253 (2006)
[Pen65] Penrose, R.: Gravitational collapse and space-time singularities. Phys. Rev. Lett. 14, 57–59 (1965)
[Pen72] Penrose, R.: Techniques of differential topology in relativity. Philadelphia, PA: Society for Industrial and Applied Mathematics, 1972
[Sco70] Scott, D.: Outline of a mathematical theory of computation. Technical Monograph PRG-2, Oxford University Computing Laboratory, 1970
[Sor91] Sorkin, R.: Spacetime and causal sets. In: D’Olivo, J. et al., ed., Relativity and Gravitation: Classical and Quantum. Singapore: World Scientific, 1991
[SW96] Sorkin, R.D., Woolgar, E.: A causal order for spacetimes with C^0 Lorentzian metrics: Proof of compactness of the space of causal curves. Class. Quantum Grav. 13, 1971–1994 (1996)
[Wal84] Wald, R.M.: General relativity. Chicago, IL: The University of Chicago Press, 1984
Communicated by G.W. Gibbons
Commun. Math. Phys. 267, 587–610 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0068-3
Communications in
Mathematical Physics
On Z-Gradations of Twisted Loop Lie Algebras of Complex Simple Lie Algebras
Kh. S. Nirov1,*, A. V. Razumov2
1 Max-Planck-Institut für Gravitationsphysik – Albert-Einstein-Institut, Am Mühlenberg 1, 14476 Golm b. Potsdam, Germany. E-mail: [email protected]
2 Institute for High Energy Physics, 142281 Protvino, Moscow Region, Russia. E-mail: [email protected]
* On leave of absence from the Institute for Nuclear Research of the Russian Academy of Sciences, 117312 Moscow, Russia. E-mail: [email protected]
Received: 14 June 2005 / Accepted: 20 March 2006 Published online: 15 August 2006 – © Springer-Verlag 2006

Abstract: We define the twisted loop Lie algebra of a finite dimensional Lie algebra g as the Fréchet space of all twisted periodic smooth mappings from R to g. Here the Lie algebra operation is continuous. We call such Lie algebras Fréchet Lie algebras. We introduce the notion of an integrable Z-gradation of a Fréchet Lie algebra, and find all inequivalent integrable Z-gradations with finite dimensional grading subspaces of twisted loop Lie algebras of complex simple Lie algebras.

1. Introduction

The theory of loop groups and loop Lie algebras has a lot of applications to mathematical and physical problems. In particular, it is a necessary tool for the formulation of many integrable systems and the construction of appropriate integration methods. Here one or another version of the factorization problem for the underlying group arises (see, for example, [17]). For the so-called Toda systems associated with loop groups the required factorization is induced by a Z-gradation of the corresponding loop Lie algebra, and, at least from this point of view, the classification of Z-gradations of loop Lie algebras is quite important. Certainly, this problem is also interesting from a purely mathematical point of view. The definition and general integration procedure for the Toda systems can be found in [8, 13, 14]. The classification of Z-gradations of complex semisimple finite dimensional Lie algebras is well known (see, for example, [2]). The corresponding classification of the Toda systems associated with complex classical Lie groups was given in papers [15, 10].

There are two main definitions of the loop Lie algebras. In accordance with the first definition used, for example, by Kac in his famous monograph [4], a loop Lie algebra is the set of finite Laurent polynomials with coefficients in a finite dimensional Lie algebra
g. One can say that in accordance with this definition a loop Lie algebra is formed by polynomial loops in g. It is rather difficult to associate a Lie group with such a Lie algebra. Actually this is connected to the fact that the exponential of a finite polynomial is usually not a finite polynomial. However, it should be noted that with this definition, in the case when the underlying Lie algebra is complex and simple, one can classify all Z-gradations of the loop Lie algebras with finite dimensional grading subspaces [4, Exercise 8.8].1

In accordance with the second definition, used in the monograph by Pressley and Segal [12], a loop Lie algebra is the set of smooth mappings from the circle $S^1$ to a finite dimensional Lie algebra g, or, in other words, the set of smooth loops in g. This set is endowed with the structure of a Fréchet space. Here the Lie algebra operation defined pointwise is continuous. The definition given in [12] is more convenient for applications to the theory of integrable systems, because in this case we always have an appropriate Lie group. Therefore, it would be interesting and useful to obtain a classification of Z-gradations for loop Lie algebras defined as in [12].

In the present paper we introduce the concept of an integrable Z-gradation and classify all integrable Z-gradations with finite dimensional grading subspaces of loop Lie algebras and twisted loop Lie algebras of finite dimensional complex simple Lie algebras. The result of the classification is actually the same as for loop Lie algebras and twisted loop Lie algebras defined as in [4]. Namely, to classify all integrable Z-gradations with finite dimensional grading subspaces of the Lie algebras under consideration one has to classify all $\mathbb{Z}_K$-gradations of the underlying Lie algebras or, equivalently, all their automorphisms of finite order.
Note that any Z-gradation defined for polynomial loops can be extended to smooth loops, and then used to construct the corresponding Toda system. However, to classify Toda systems we have to know all Z-gradations which can be defined for smooth loops.

2. Loop Lie algebras and Loop Lie Groups

Consider the vector space $C^\infty(S^1, V)$ of smooth mappings from the circle $S^1$ to a real or complex finite dimensional vector space $V$. It is convenient to treat the circle $S^1$ as the set of complex numbers of modulus one. There is a natural mapping from the set $\mathbb{R}$ of real numbers to $S^1$ which takes $\sigma \in \mathbb{R}$ to $e^{i\sigma} \in S^1$. Given an element $\xi \in C^\infty(S^1, V)$, one defines a mapping $\tilde\xi$ from $\mathbb{R}$ to $V$ by the equality $\tilde\xi(\sigma) = \xi(e^{i\sigma})$. The mapping $\tilde\xi$ is smooth and satisfies the relation $\tilde\xi(\sigma + 2\pi) = \tilde\xi(\sigma)$. Conversely, any smooth periodic mapping from $\mathbb{R}$ to $V$ induces a smooth mapping from $S^1$ to $V$. Introduce the notation $\tilde\xi^{(k)} = d^k\tilde\xi/ds^k$, where $s$ is the standard coordinate function on $\mathbb{R}$. We set $\tilde\xi^{(0)} = \tilde\xi$. Given an element $\xi \in C^\infty(S^1, V)$, we denote by $\xi^{(k)}$ the element of $C^\infty(S^1, V)$ induced by $\tilde\xi^{(k)}$.

Endow $C^\infty(S^1, V)$ with the structure of a topological vector space in the following way. Let $\|\cdot\|$ be a norm on $V$. Define a countable collection of norms $\{\|\cdot\|_m\}_{m\in\mathbb{N}}$ on $C^\infty(S^1, V)$ by
$$\|\xi\|_m = \max_{0 \le k < m}\, \max_{p \in S^1} \|\xi^{(k)}(p)\|,$$
1 Actually in [4] one can find the classification of Z-gradations of the affine Kac–Moody algebras. The classification of Z-gradations of the corresponding loop Lie algebras immediately follows from that classification.
or via the corresponding mapping $\tilde\xi : \mathbb{R} \to V$ by
$$\|\tilde\xi\|_m = \max_{0 \le k < m}\, \max_{\sigma \in [0, 2\pi]} \|\tilde\xi^{(k)}(\sigma)\|.$$
Note that for any $\xi \in C^\infty(S^1, V)$, if $m_1 < m_2$, then $\|\xi\|_{m_1} \le \|\xi\|_{m_2}$. Given a positive integer $m$, denote $U_m = \{\xi \in C^\infty(S^1, V) \mid \|\xi\|_m < 1/m\}$. The collection formed by the sets $U_m$ is a local base of the topology on $C^\infty(S^1, V)$. As a base of the topology we can take the collection of subsets of the form
$$U_{\xi,m} = \xi + U_m, \qquad \xi \in C^\infty(S^1, V).$$
A sequence $(\xi_n)$ in $C^\infty(S^1, V)$ converges to $\xi \in C^\infty(S^1, V)$ relative to this topology if and only if for each nonnegative integer $k$ the sequence $(\tilde\xi_n^{(k)})$ converges uniformly to $\tilde\xi^{(k)}$. One can show that actually we have a Fréchet space. We define a Fréchet space as a complete topological vector space whose topology is induced by a countable family of seminorms.2

Let now $\mathfrak{g}$ be a real or complex finite dimensional Lie algebra. Supply the Fréchet space $C^\infty(S^1, \mathfrak{g})$ with the Lie algebra structure defining the Lie algebra operation pointwise. The obtained Lie algebra is called the loop Lie algebra of $\mathfrak{g}$ and denoted $\mathcal{L}(\mathfrak{g})$. It is clear that constant mappings form a subalgebra of $\mathcal{L}(\mathfrak{g})$ which is isomorphic to the initial Lie algebra $\mathfrak{g}$.

Let again $V$ be a real or complex finite dimensional vector space, and let $a$ be an automorphism of $V$. Consider the quotient space $E$ of the direct product $\mathbb{R} \times V$ by the equivalence relation which identifies $(\sigma, v)$ with $(\sigma + 2\pi, a(v))$. Define the projection $\pi : E \to S^1$ by the relation
$$\pi([(\sigma, v)]) = e^{i\sigma}.$$
It is not difficult to show that in such a way one obtains a smooth vector bundle $E \xrightarrow{\pi} S^1$ with fiber $V$. Let $\xi$ be a smooth section of $E$. For any $\sigma \in \mathbb{R}$ there exists a unique element $\tilde\xi(\sigma) \in V$ such that
$$[(\sigma, \tilde\xi(\sigma))] = \xi(e^{i\sigma}).$$
This relation defines a smooth mapping $\tilde\xi$ from $\mathbb{R}$ to $V$ which satisfies the relation
$$\tilde\xi(\sigma + 2\pi) = a(\tilde\xi(\sigma)),$$
called twisted periodicity. Conversely, given a mapping $\tilde\xi : \mathbb{R} \to V$ which is twisted periodic, the equality
$$\xi(p) = [(\sigma, \tilde\xi(\sigma))], \qquad p = e^{i\sigma},$$
defines a smooth section of $E$. One can make the space $C^\infty(S^1 \xleftarrow{\pi} E)$ of smooth sections of $E \xrightarrow{\pi} S^1$ a Fréchet space in the same way as it was done above for the space $C^\infty(S^1, V)$.3 Here it is natural and useful to assume that the corresponding norm on $V$ is invariant with respect to the automorphism $a$.

2 Sometimes a more general definition of a Fréchet space is used (see, for example, [16]).
3 In the case where $V$ is a complex linear space, the vector bundle $E$ is trivializable, and the sections of $E$ can be identified with the mappings from $S^1$ to $V$.
If the vector space $V$ is a Lie algebra $\mathfrak{g}$ and $a$ is an automorphism of $\mathfrak{g}$, one can supply the vector space $C^\infty(S^1 \xleftarrow{\pi} E)$, or equivalently the vector space of twisted periodic mappings from $\mathbb{R}$ to $\mathfrak{g}$, with the structure of a Lie algebra defining the Lie algebra operation pointwise.4 We denote this Lie algebra by $\mathcal{L}_a(\mathfrak{g})$ and call it a twisted loop Lie algebra. The loop Lie algebra $\mathcal{L}(\mathfrak{g})$ can be considered as the twisted loop Lie algebra $\mathcal{L}_a(\mathfrak{g})$ with $a = \mathrm{id}_{\mathfrak{g}}$.

Let $G$ be a Lie group whose Lie algebra coincides with $\mathfrak{g}$ and $\mathrm{Ad}$ be the adjoint representation of $G$ in $\mathfrak{g}$. For any $g \in G$ the linear operator $\mathrm{Ad}(g)$ is an automorphism of $\mathfrak{g}$. Such automorphisms are called inner automorphisms. They form a normal subgroup $\mathrm{Int}\,\mathfrak{g}$ of the group $\mathrm{Aut}\,\mathfrak{g}$ of automorphisms of $\mathfrak{g}$. One can show that if the automorphisms $a$ and $b$ of $\mathfrak{g}$ differ by an inner automorphism of $\mathfrak{g}$ then the twisted loop Lie algebras $\mathcal{L}_a(\mathfrak{g})$ and $\mathcal{L}_b(\mathfrak{g})$ are naturally isomorphic. This means, in particular, that if the Lie algebra $\mathfrak{g}$ is semisimple one can consider only the twisted loop Lie algebras $\mathcal{L}_a(\mathfrak{g})$ with $a$ belonging to the finite subgroup of $\mathrm{Aut}\,\mathfrak{g}$ identified with the automorphism group $\mathrm{Aut}\,\Pi$ of some simple root system $\Pi$ of $\mathfrak{g}$. In particular, one can assume that $a^K = \mathrm{id}_{\mathfrak{g}}$ for some positive integer $K$. It is convenient for our purposes to assume that $K$ does not necessarily coincide with the order of $a$.

Let $\mathfrak{g}$ be a semisimple Lie algebra. Consider an arbitrary element $\eta$ of $\mathcal{L}_a(\mathfrak{g})$ and the corresponding mapping $\tilde\eta$ from $\mathbb{R}$ to $\mathfrak{g}$. It is clear that the mapping $\tilde\xi$ defined as
$$\tilde\xi(\sigma) = \tilde\eta(K\sigma)$$
is a periodic mapping from $\mathbb{R}$ to $\mathfrak{g}$. Therefore, it induces an element $\xi$ of $\mathcal{L}(\mathfrak{g})$. It is clear that in this way we obtain an injective homomorphism from $\mathcal{L}_a(\mathfrak{g})$ to $\mathcal{L}(\mathfrak{g})$. The image of this homomorphism is formed by the elements $\xi$ satisfying the condition
$$\xi(\epsilon_K p) = a(\xi(p)),$$
where $\epsilon_K = \exp(2\pi i/K)$ is the $K$th principal root of unity. We will denote this image as $\mathcal{L}_{a,K}(\mathfrak{g})$. For the corresponding mapping $\tilde\xi$ from $\mathbb{R}$ to $\mathfrak{g}$ one has
$$\tilde\xi(\sigma + 2\pi/K) = a(\tilde\xi(\sigma)).$$
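The passage from a twisted periodic mapping to a periodic one can be checked numerically on a small case. The following sketch is an illustration only: the choice of g = sl(2, C), the automorphism a (conjugation by diag(1, -1), so K = 2), the particular mapping `eta_tilde` and the helper names are all assumptions made for the example, not objects from the paper.

```python
import cmath
import math

K = 2  # a has order 2 in this toy example, so a^K = id

def a(x):
    """Conjugation by diag(1, -1) on 2x2 matrices: flips the off-diagonal signs."""
    (x11, x12), (x21, x22) = x
    return ((x11, -x12), (-x21, x22))

def eta_tilde(s):
    """A traceless 2x2 matrix function with eta_tilde(s + 2*pi) = a(eta_tilde(s))."""
    return ((math.cos(s), cmath.exp(0.5j * s)),
            (cmath.exp(-1.5j * s), -math.cos(s)))

def xi_tilde(s):
    """Rescaled mapping xi_tilde(s) = eta_tilde(K * s)."""
    return eta_tilde(K * s)

def close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

samples = [0.0, 0.3, 1.7, 4.1]
# Twisted periodicity of eta_tilde with period 2*pi.
twisted_ok = all(close(eta_tilde(s + 2 * math.pi), a(eta_tilde(s))) for s in samples)
# xi_tilde is honestly periodic, so it defines an element of the untwisted loop algebra.
periodic_ok = all(close(xi_tilde(s + 2 * math.pi), xi_tilde(s)) for s in samples)
# The twist survives as the condition xi_tilde(s + 2*pi/K) = a(xi_tilde(s)).
shifted_ok = all(close(xi_tilde(s + 2 * math.pi / K), a(xi_tilde(s))) for s in samples)
```

The three flags correspond, in order, to twisted periodicity of the original mapping, periodicity of the rescaled one, and the shifted condition stated in the text.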
Thus, when $\mathfrak{g}$ is a semisimple Lie algebra, the twisted loop Lie algebra $\mathcal{L}_a(\mathfrak{g})$ can be identified with a subalgebra of the loop Lie algebra $\mathcal{L}(\mathfrak{g})$.

We call a Lie algebra $\mathcal{G}$ a Fréchet Lie algebra if $\mathcal{G}$ is a Fréchet space and the Lie algebra operation in $\mathcal{G}$, considered as a mapping from $\mathcal{G} \times \mathcal{G}$ to $\mathcal{G}$, is continuous. Actually one can consider a Fréchet Lie algebra as a smooth manifold modelled on itself. Here the Lie algebra operation is a smooth mapping. It is not difficult to verify the next simple lemma.

Lemma 1. There are positive real numbers $C_m$, $m = 1, 2, \ldots$, such that
$$\|[\xi, \eta]\|_m \le C_m \|\xi\|_m \|\eta\|_m$$
for all $\xi, \eta \in \mathcal{L}_a(\mathfrak{g})$.

Using this lemma, one can prove the following proposition.

Proposition 1. The twisted loop Lie algebra $\mathcal{L}_a(\mathfrak{g})$ is a Fréchet Lie algebra.

4 Even if $\mathfrak{g}$ is a complex Lie algebra, we cannot in general trivialize the vector bundle $E$ without destroying the Lie algebra structure.
Let $G$ be a finite dimensional Lie group with the Lie algebra $\mathfrak{g}$. The loop group $\mathcal{L}(G)$ is defined as the set of all smooth mappings from the circle $S^1$ to $G$ with the group law being pointwise composition in $G$. Here, as for the case of loop Lie algebras, for any element $\gamma$ of $\mathcal{L}(G)$ one can define a smooth mapping $\tilde\gamma$ from $\mathbb{R}$ to $G$ connected with $\gamma$ by the equality $\tilde\gamma(\sigma) = \gamma(e^{i\sigma})$, and satisfying the relation $\tilde\gamma(\sigma + 2\pi) = \tilde\gamma(\sigma)$. Conversely, any periodic smooth mapping from $\mathbb{R}$ to $G$ induces an element of $\mathcal{L}(G)$.

One can endow the loop group $\mathcal{L}(G)$ with the structure of an infinite dimensional manifold and a Lie group in the following way. Recall that the exponential mapping $\exp : \mathfrak{g} \to G$ is a local diffeomorphism near the identity. Let $\breve{U}_e$ be an open neighbourhood of the identity of $G$ diffeomorphic to some open neighbourhood of the zero element of $\mathfrak{g}$, and $\breve\varphi$ be the restriction of the inverse of the exponential mapping to $\breve{U}_e$. Denote $U_e = C^\infty(S^1, \breve{U}_e)$ and define a mapping $\varphi : U_e \to C^\infty(S^1, \breve\varphi(\breve{U}_e))$ by
$$\varphi(\gamma) = \breve\varphi \circ \gamma.$$
Note that the set $C^\infty(S^1, \breve\varphi(\breve{U}_e))$ is open in $\mathcal{L}(\mathfrak{g})$ and we can consider the pair $(U_e, \varphi)$ as a chart on $\mathcal{L}(G)$. For an arbitrary element $\gamma \in \mathcal{L}(G)$ denote $U_\gamma = \gamma U_e$, and define the mapping $\varphi_\gamma : U_\gamma \to C^\infty(S^1, \breve\varphi(\breve{U}_e))$ by
$$\varphi_\gamma(\gamma') = \breve\varphi \circ (\gamma^{-1}\gamma').$$
In this way we obtain an atlas which makes $\mathcal{L}(G)$ into a smooth manifold modelled on the Fréchet space $\mathcal{L}(\mathfrak{g})$. Actually in this way $\mathcal{L}(G)$ becomes a Lie group. The Lie algebra of $\mathcal{L}(G)$ can be naturally identified with $\mathcal{L}(\mathfrak{g})$. We say that the set $U \subset \mathcal{L}(G)$ is open if for any $\gamma \in \mathcal{L}(G)$ the set $\varphi_\gamma(U \cap U_\gamma)$ is open. This definition supplies $\mathcal{L}(G)$ with the structure of a topological space. As any Lie group, the loop Lie group $\mathcal{L}(G)$ is a Hausdorff topological space.

Twisted loop groups are defined in full analogy with twisted loop Lie algebras. Let $a$ be an automorphism of a Lie group $G$ and $E$ be the quotient space of the direct product $\mathbb{R} \times G$ by the equivalence relation which identifies $(\sigma, g)$ with $(\sigma + 2\pi, a(g))$. Defining the projection $\pi : E \to S^1$ by the relation
$$\pi([(\sigma, g)]) = e^{i\sigma},$$
we obtain a smooth fiber bundle $E \xrightarrow{\pi} S^1$ with fiber $G$. Endow the space of smooth sections of this bundle with the structure of a group defining the group composition pointwise. This group is called the twisted loop group of $G$ and denoted $\mathcal{L}_a(G)$. Similarly, as for the case of $\mathcal{L}(G)$, one endows $\mathcal{L}_a(G)$ with the structure of an infinite dimensional manifold modelled on the Fréchet space $\mathcal{L}_a(\mathfrak{g})$, where we denote the automorphism of $\mathfrak{g}$ induced by the automorphism $a$ of $G$ by the same letter $a$. One can verify that in such a way $\mathcal{L}_a(G)$ becomes a Lie group with the Lie algebra $\mathcal{L}_a(\mathfrak{g})$.

Recall that for any $g \in G$ the mapping $\mathrm{Int}(g) : h \in G \mapsto ghg^{-1} \in G$ is an automorphism of $G$. Such automorphisms are called inner and form a normal subgroup of the group $\mathrm{Aut}\,G$. Similarly, as for the twisted loop Lie algebras, if the automorphisms $a$ and $b$ of $G$ differ by an inner automorphism of $G$, then the twisted loop Lie groups $\mathcal{L}_a(G)$ and $\mathcal{L}_b(G)$ are naturally isomorphic. Therefore, for the case of a semisimple Lie
group $G$ one can consider only twisted loop groups $\mathcal{L}_a(G)$, where $a^K = \mathrm{id}_G$ for some positive integer $K$.

One can show that there is a bijective correspondence between elements of $\mathcal{L}_a(G)$ and twisted periodic mappings from $\mathbb{R}$ to $G$. We denote by $\tilde\gamma$ the twisted periodic mapping from $\mathbb{R}$ to $G$ corresponding to the element $\gamma \in \mathcal{L}_a(G)$. Let $G$ be a semisimple Lie group, and $a$ be an automorphism of $G$ such that $a^K = \mathrm{id}_G$ for some positive integer $K$. The transformation $\sigma \to K\sigma$ induces an injective homomorphism from $\mathcal{L}_a(G)$ to $\mathcal{L}(G)$ whose image is formed by the elements $\gamma$ satisfying the condition
$$\gamma(\epsilon_K p) = a(\gamma(p)),$$
and will be denoted by $\mathcal{L}_{a,K}(G)$. For the corresponding mapping $\tilde\gamma$ from $\mathbb{R}$ to $G$ the above condition becomes
$$\tilde\gamma(\sigma + 2\pi/K) = a(\tilde\gamma(\sigma)).$$
Thus, when $G$ is a semisimple Lie group the twisted loop group $\mathcal{L}_a(G)$ can be identified with a subgroup of $\mathcal{L}(G)$.

3. Automorphisms of Twisted Loop Lie Algebras

In this section $\mathfrak{g}$ is always a complex simple Lie algebra and $a$ is an automorphism of $\mathfrak{g}$. As was shown in Sect. 2, studying the twisted loop Lie algebra $\mathcal{L}_a(\mathfrak{g})$, one can assume without any loss of generality that $a^K = \mathrm{id}_{\mathfrak{g}}$ for some positive integer $K$ and consider instead of $\mathcal{L}_a(\mathfrak{g})$ the corresponding subalgebra $\mathcal{L}_{a,K}(\mathfrak{g})$ of $\mathcal{L}(\mathfrak{g})$.

We include the requirement of continuity into the definition of an automorphism of a Fréchet Lie algebra $\mathcal{G}$. Treating a Fréchet Lie algebra $\mathcal{G}$ as a smooth manifold, we see that since any automorphism $A$ of $\mathcal{G}$ is linear and continuous, it is smooth.

There are two main classes of automorphisms of the Fréchet Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$. The automorphisms of the first class are generated by diffeomorphisms of $S^1$. Let us recall that the group $\mathrm{Diff}(S^1)$ of smooth diffeomorphisms of the circle $S^1$ can be supplied with the structure of a smooth infinite dimensional manifold in such a way that it becomes a Lie group (see, for example, [9, 6, 7]). The Lie algebra of the Lie group $\mathrm{Diff}(S^1)$ is the Lie algebra $\mathrm{Der}\,C^\infty(S^1)$ of smooth vector fields on $S^1$.
Here the one-parameter subgroup associated with a vector field $X$ is actually the flow generated by $X$.

Let $f$ be a diffeomorphism of $S^1$. Consider a linear continuous mapping $A_f : \mathcal{L}_{a,K}(\mathfrak{g}) \to \mathcal{L}(\mathfrak{g})$ defined by the equality
$$A_f\,\xi = \xi \circ f^{-1}.$$
It is easy to see that if $\eta = A_f\,\xi$, then
$$\eta(f(\epsilon_K f^{-1}(p))) = a(\eta(p)).$$
Hence, if $f(\epsilon_K p) = \epsilon_K f(p)$ for any $p \in S^1$, then $A_f$ can be considered as a mapping from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathcal{L}_{a,K}(\mathfrak{g})$. In this case $A_f$ is an automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$. Conversely, if the mapping $A_f$ is a mapping from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathcal{L}_{a,K}(\mathfrak{g})$, then $f$ satisfies the above condition and $A_f$ is an automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$.
One can show that the diffeomorphisms satisfying the condition $f(\epsilon_K p) = \epsilon_K f(p)$ form a Lie subgroup of the Lie group $\mathrm{Diff}(S^1)$. We denote it by $\mathrm{Diff}_K(S^1)$. The Lie algebra of $\mathrm{Diff}_K(S^1)$ is the subalgebra of $\mathrm{Der}\,C^\infty(S^1)$ formed by the vector fields $X$ such that
$$(X(\varphi))(\epsilon_K p) = (X(\varphi))(p)$$
for any function $\varphi \in C^\infty(S^1)$ satisfying the condition $\varphi(\epsilon_K p) = \varphi(p)$. Denote this subalgebra by $\mathrm{Der}_K C^\infty(S^1)$.

It is clear that we have a left action of $\mathrm{Diff}_K(S^1)$ on $\mathcal{L}_{a,K}(\mathfrak{g})$ realised by automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$. Since this action is effective, we can say that the group of automorphisms $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ has a subgroup which can be identified with the Lie group $\mathrm{Diff}_K(S^1)$.

The Lie group $\mathrm{Diff}_K(S^1)$ can be identified with a subgroup of the group $\mathrm{Diff}(\mathbb{R})$. To construct this identification we start with consideration of general smooth mappings from $S^1$ to $S^1$. For any $f \in C^\infty(S^1, S^1)$ one can find a smooth mapping $\tilde f \in C^\infty(\mathbb{R}, \mathbb{R})$, connected with $f$ by the equality
$$f(e^{i\sigma}) = e^{i\tilde f(\sigma)}.$$
The function $\tilde f$ satisfies the relation
$$\tilde f(\sigma + 2\pi) - \tilde f(\sigma) = 2\pi k,$$
where $k$ is an integer, called the winding number of $f$. On the other hand, any smooth mapping $\tilde f \in C^\infty(\mathbb{R}, \mathbb{R})$ which satisfies the above relation induces a smooth mapping from $S^1$ to $S^1$. It is evident that two functions differing by a multiple of $2\pi$ induce the same mapping.

If $f$ is a diffeomorphism, then its winding number is $1$ for an orientation preserving mapping, and it is $-1$ for an orientation reversing mapping. Note that in this case the corresponding function $\tilde f$ is strictly monotonic, and that any smooth strictly monotonic function satisfying the relation
$$\tilde f(\sigma + 2\pi) - \tilde f(\sigma) = \pm 2\pi$$
induces a diffeomorphism of $S^1$. If $f \in \mathrm{Diff}_K(S^1)$ one obtains that
$$\tilde f(\sigma + 2\pi/K) = \tilde f(\sigma) + 2\pi/K$$
for $K > 2$ and that
$$\tilde f(\sigma + \pi) = \tilde f(\sigma) \pm \pi$$
for $K = 2$. Note that if $\xi$ is an element of $\mathcal{L}_{a,K}(\mathfrak{g})$ and $f \in \mathrm{Diff}_K(S^1)$ then
$$\widetilde{A_f\,\xi} = \tilde\xi \circ \tilde f^{-1},$$
where $A_f$ is the automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$ induced by $f$.

The second interesting class of automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$ is formed by automorphisms generated by automorphisms of $\mathfrak{g}$ acting on the elements of $\mathcal{L}_{a,K}(\mathfrak{g})$ pointwise.
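The conditions on the lift of a diffeomorphism that commutes with the rotation by 2π/K can be illustrated numerically. The sketch below is a toy check under assumptions: the lift σ + C sin(Kσ), the values K = 3 and C = 0.2, and the function names are hypothetical choices for illustration. The condition |C K| < 1 keeps the lift strictly increasing, so it should induce an orientation preserving diffeomorphism of the circle.

```python
import cmath
import math

K = 3    # assumed order of the rotation, K > 2
C = 0.2  # assumed amplitude; |C * K| < 1 keeps f_tilde strictly increasing

def f_tilde(s):
    """Lift to R of a circle map; a perturbation of the identity with period 2*pi/K."""
    return s + C * math.sin(K * s)

def f(p):
    """Induced circle map, f(e^{i s}) = e^{i f_tilde(s)}."""
    return cmath.exp(1j * f_tilde(cmath.phase(p)))

eps_K = cmath.exp(2j * math.pi / K)
samples = [0.1 * n for n in range(63)]
step = 2 * math.pi / K

# f_tilde(s + 2*pi/K) = f_tilde(s) + 2*pi/K
lift_ok = all(abs(f_tilde(s + step) - f_tilde(s) - step) < 1e-12 for s in samples)
# Winding number 1: f_tilde(s + 2*pi) - f_tilde(s) = 2*pi
winding_ok = all(
    abs(f_tilde(s + 2 * math.pi) - f_tilde(s) - 2 * math.pi) < 1e-12 for s in samples
)
# The induced map commutes with multiplication by eps_K.
equivariant_ok = all(
    abs(f(eps_K * cmath.exp(1j * s)) - eps_K * f(cmath.exp(1j * s))) < 1e-12
    for s in samples
)
# Strict monotonicity on the sample grid.
monotone_ok = all(
    f_tilde(samples[n + 1]) > f_tilde(samples[n]) for n in range(len(samples) - 1)
)
```

Because the lift shifts by exactly 2π under a full turn, the value of `f` does not depend on which branch of the argument `cmath.phase` returns.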
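Let $\alpha$ be an element of the Lie group $\mathcal{L}(\mathrm{Aut}\,\mathfrak{g})$. Consider a linear mapping from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathcal{L}(\mathfrak{g})$ defined by the equality
$$A_\alpha\,\xi = \alpha\,\xi,$$
where $(\alpha\,\xi)(p) = \alpha(p)(\xi(p))$. It is clear that $A_\alpha$ is a homomorphism from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathcal{L}(\mathfrak{g})$. Moreover, if $\alpha$ satisfies the relation
$$\alpha(\epsilon_K p) = a\,\alpha(p)\,a^{-1} = \mathrm{Int}(a)(\alpha(p)),$$
then the mapping $A_\alpha$ is an automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$. It follows from the equality $a^K = \mathrm{id}_{\mathfrak{g}}$ that $(\mathrm{Int}(a))^K = \mathrm{id}_{\mathrm{Aut}\,\mathfrak{g}}$. Hence, we see that any element of the Lie group $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ induces an automorphism of the Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$, and we have a left action of $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ on $\mathcal{L}_{a,K}(\mathfrak{g})$ realised by automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$. This action is again effective and, therefore, $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ has a subgroup which can be identified with the Lie group $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$.

Actually, if for $f \in \mathrm{Diff}_K(S^1)$ and $\alpha \in \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ we define the automorphism $A_{(f,\alpha)}$ of $\mathcal{L}_{a,K}(\mathfrak{g})$ by
$$A_{(f,\alpha)}\,\xi = \alpha(\xi \circ f^{-1}),$$
we obtain a left effective action of the semidirect product of $\mathrm{Diff}_K(S^1)$ and $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ on $\mathcal{L}_{a,K}(\mathfrak{g})$ realised by automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$. Here the group operations in $\mathrm{Diff}_K(S^1) \ltimes \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ are given by
$$(f_1, \alpha_1)(f_2, \alpha_2) = (f, \alpha),$$
where
$$f = f_1 \circ f_2, \qquad \alpha = \alpha_1(\alpha_2 \circ f_1^{-1}),$$
and
$$(f, \alpha)^{-1} = (f^{-1}, \alpha^{-1} \circ f).$$
The last formula follows from the product rule: the second component of $(f, \alpha)(f^{-1}, \alpha^{-1} \circ f)$ is $\alpha\,(\alpha^{-1} \circ f \circ f^{-1})$, which is the identity. Thus, we see that $\mathrm{Diff}_K(S^1) \ltimes \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ can be identified with a subgroup of the group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$. In fact, this subgroup exhausts the whole group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$.

Theorem 1. The group of automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$ can be naturally identified with the semidirect product $\mathrm{Diff}_K(S^1) \ltimes \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$.

Proof. The main idea of the proof is borrowed from [12]. Let $A$ be an automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$. Fix a point $p \in S^1$ and consider the mapping $A_p$ from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathfrak{g}$ defined by the equality
$$A_p(\xi) = (A\,\xi)(p).$$
This mapping is linear and continuous. Some necessary information on such mappings is given in Appendix A. Certainly, $A_p$ is a homomorphism from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathfrak{g}$.

Let $m$ be a nonnegative integer. Denote by $\chi^m_p$ a smooth function on $S^1$ such that
$$(\chi^m_p)^{(m)}(p) = 1, \qquad (\chi^m_p)^{(k)}(p) = 0, \quad k \ne m,$$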
and
$$\operatorname{supp}\chi^m_p \cap \epsilon_K \operatorname{supp}\chi^m_p = \emptyset.$$
Let $x$ be an arbitrary element of $\mathfrak{g}$. It is not difficult to see that for any nonnegative integer $m$ the mapping
$$\eta^m_{p,x} = \sum_{l=0}^{K-1} \chi^m_{\epsilon_K^l p}\, a^{-l}(x)$$
is an element of $\mathcal{L}_{a,K}(\mathfrak{g})$ satisfying the conditions
$$(\eta^m_{p,x})^{(m)}(p) = x, \qquad (\eta^m_{p,x})^{(k)}(p) = 0, \quad k \ne m.$$
The linear mapping $A$ is invertible by definition. Therefore, there is an element $\xi^0_{p,x} \in \mathcal{L}_{a,K}(\mathfrak{g})$ such that $A(\xi^0_{p,x}) = \eta^0_{p,x}$. This implies that $A_p(\xi^0_{p,x}) = x$. Thus the mapping $A_p$ is surjective.

For any open set $U \subset S^1$ the set
$$\mathcal{L}^U_{a,K}(\mathfrak{g}) = \{\xi \in \mathcal{L}_{a,K}(\mathfrak{g}) \mid \operatorname{supp}\xi \subset U\}$$
is an ideal of $\mathcal{L}_{a,K}(\mathfrak{g})$. If an element $x \in \mathfrak{g}$ belongs to the image of the restriction of $A_p$ to $\mathcal{L}^U_{a,K}(\mathfrak{g})$, then there is an element $\xi \in \mathcal{L}^U_{a,K}(\mathfrak{g})$ such that $A_p(\xi) = x$. Since $A_p$ is surjective, it follows that for any element $y \in \mathfrak{g}$ one can find an element $\eta \in \mathcal{L}_{a,K}(\mathfrak{g})$ such that $A_p(\eta) = y$. Since $[\xi, \eta]$ belongs to $\mathcal{L}^U_{a,K}(\mathfrak{g})$, it follows that $[x, y] = A_p([\xi, \eta])$ belongs to the image of $A_p|_{\mathcal{L}^U_{a,K}(\mathfrak{g})}$. This means that the image of $A_p|_{\mathcal{L}^U_{a,K}(\mathfrak{g})}$ is an ideal of $\mathfrak{g}$. As the Lie algebra $\mathfrak{g}$ is simple, the mapping $A_p|_{\mathcal{L}^U_{a,K}(\mathfrak{g})}$ is either trivial or surjective.

The support of the mapping $A_p$ is the union of sets of the form $q$, where $q \in S^1$ (see Appendix A). Suppose that $\operatorname{supp} A_p \supset q_1 \cup q_2$ and $q_1 \cap q_2 = \emptyset$. Let $U_1$ and $U_2$ be disjoint neighbourhoods of $q_1$ and $q_2$ respectively. Since $S^1$ is a normal topological space such neighbourhoods do exist. It is clear that $\mathcal{L}^{U_1}_{a,K}(\mathfrak{g})$ and $\mathcal{L}^{U_2}_{a,K}(\mathfrak{g})$ are commuting ideals of $\mathcal{L}_{a,K}(\mathfrak{g})$. Therefore, the images of $A_p|_{\mathcal{L}^{U_1}_{a,K}(\mathfrak{g})}$ and $A_p|_{\mathcal{L}^{U_2}_{a,K}(\mathfrak{g})}$ are commuting ideals of $\mathfrak{g}$. Hence, one of the mappings $A_p|_{\mathcal{L}^{U_1}_{a,K}(\mathfrak{g})}$ and $A_p|_{\mathcal{L}^{U_2}_{a,K}(\mathfrak{g})}$ is surjective, the other one is trivial. Thus, the support of $A_p$ has the form $f(p)$ for some mapping $f : S^1 \to S^1$, and we can write
$$A_p(\xi) = \sum_{m=0}^{M} c^m_p\big(\xi^{(m)}(f(p))\big)$$
for some nonnegative integer $M$ and endomorphisms $c^m_p$ (see Appendix A). We assume that the endomorphisms $c^m_p$ are defined for all nonnegative $m$, but $c^m_p = 0$ for $m > M$. It is clear that
$$A_p(\eta^m_{f(p),x}) = c^m_p(x).$$
Using the relations
$$[\eta^m_{p,x}, \eta^n_{p,y}]^{(m+n)}(p) = \binom{m+n}{m}[x, y], \qquad [\eta^m_{p,x}, \eta^n_{p,y}]^{(k)}(p) = 0, \quad k \ne m + n,$$
we obtain
$$A_p([\eta^m_{f(p),x}, \eta^n_{f(p),y}]) = \binom{m+n}{m}\, c^{m+n}_p([x, y]).$$
Since $A_p$ is a homomorphism, we have
$$A_p([\eta^m_{f(p),x}, \eta^n_{f(p),y}]) = [A_p(\eta^m_{f(p),x}), A_p(\eta^n_{f(p),y})],$$
therefore,
$$A_p([\eta^m_{f(p),x}, \eta^n_{f(p),y}]) = [c^m_p(x), c^n_p(y)].$$
Thus, one has the equalities
$$\binom{m+n}{m}\, c^{m+n}_p([x, y]) = [c^m_p(x), c^n_p(y)]. \qquad (*)$$
In particular, for $m = n = 0$ the equality
$$c^0_p([x, y]) = [c^0_p(x), c^0_p(y)] \qquad (**)$$
is valid. Since $\mathfrak{g}$ is simple, the mapping $c^0_p$ is either trivial or surjective. Suppose that it is trivial. Putting $n = 0$ in the equality (*), we obtain
$$c^m_p([x, y]) = [c^m_p(x), c^0_p(y)].$$
Since the Lie algebra $\mathfrak{g}$ is simple, $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$. Therefore, for any $m$ the mapping $c^m_p$ is trivial. Hence, the mapping $A_p$ is also trivial. This contradicts surjectivity of $A_p$. Thus, $c^0_p$ is surjective, and the equality (**) says that it is an automorphism of the Lie algebra $\mathfrak{g}$.

Putting $m = 0$ and $n = 1$ in (*), we obtain
$$c^1_p([x, y]) = [c^0_p(x), c^1_p(y)]$$
for any $x, y \in \mathfrak{g}$. Rewrite this equality as
$$(c^0_p)^{-1}(c^1_p([x, y])) = [x, (c^0_p)^{-1}(c^1_p(y))].$$
Therefore,
$$((c^0_p)^{-1} c^1_p)\,\mathrm{ad}(x) = \mathrm{ad}(x)\,((c^0_p)^{-1} c^1_p)$$
for any $x \in \mathfrak{g}$. Since the Lie algebra $\mathfrak{g}$ is simple, the linear operator $(c^0_p)^{-1} c^1_p$ is multiplication by some scalar, denote it by $\rho$. Thus, we have
$$c^1_p = \rho\, c^0_p.$$
Relation (*) for $m = 1$ and $n = 1$ takes the form
$$2\, c^2_p([x, y]) = [c^1_p(x), c^1_p(y)] = \rho^2 c^0_p([x, y]).$$
Therefore, $c^2_p = (\rho^2/2)\, c^0_p$. In the general case we have
$$c^m_p = (\rho^m/m!)\, c^0_p$$
for any positive $m$. On the other hand, $c^m_p = 0$ for $m > M$. This is possible only if $\rho = 0$. Hence, $c^m_p = 0$ for all $m > 0$. Define a mapping $\alpha : S^1 \to \mathrm{Aut}\,\mathfrak{g}$ by $\alpha(p) = c^0_p$, then one can write
$$A\,\xi = \alpha(\xi \circ f).$$
Since for any $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ the mapping $A\,\xi$ belongs to $\mathcal{L}_{a,K}(\mathfrak{g})$, the mappings $f$ and $\alpha$ must be smooth. The mapping $f$ is actually an element of $\mathrm{Diff}_K(S^1)$, and $\alpha$ belongs to $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$. Hence, replacing $f$ by its inverse, we see that $A\,\xi = \alpha(\xi \circ f^{-1})$. Thus, an arbitrary automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$ has the above form for some $f \in \mathrm{Diff}_K(S^1)$ and some $\alpha \in \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$.

Note that in the case where $\mathfrak{g}$ is a complex Lie algebra, the Lie group $\mathrm{Aut}\,\mathfrak{g}$ is a complex Lie group. In this case $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ is also a complex Lie group. On the other hand, any homomorphism from the identity component of the Lie group $\mathrm{Diff}_K(S^1)$ to a complex Lie group is trivial (see, for example, [12]). This implies that $\mathrm{Diff}_K(S^1)$ cannot be endowed with the structure of a complex Lie group. Therefore, even in the case where $\mathfrak{g}$ is a complex Lie algebra we consider $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ as a real Lie group. Thus, the identification described in Theorem 1 supplies the group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ with the structure of a real Lie group. Here the action of the group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ on $\mathcal{L}_{a,K}(\mathfrak{g})$, where $\mathcal{L}_{a,K}(\mathfrak{g})$ is treated as a real manifold, is smooth.

The Lie algebra of the Lie group $\mathrm{Aut}\,\mathfrak{g}$ is the Lie algebra $\mathrm{Der}\,\mathfrak{g}$ of derivations of $\mathfrak{g}$. The situation is almost the same for the case of the Lie group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$. Actually, any element of the Lie algebra of the Lie group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ induces a derivation of $\mathcal{L}_{a,K}(\mathfrak{g})$, but in the case where $\mathfrak{g}$ is a complex Lie algebra there are derivations of $\mathcal{L}_{a,K}(\mathfrak{g})$ which cannot be obtained in such a way. To show this, let us consider first the Lie algebra of $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$. Using the identification described in Theorem 1, we see that this Lie algebra can be identified with the semidirect product of the Lie algebra of the Lie group $\mathrm{Diff}_K(S^1)$ and the Lie algebra of the Lie group $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$. As we already noted, the Lie algebra of $\mathrm{Diff}_K(S^1)$ is the subalgebra $\mathrm{Der}_K C^\infty(S^1)$ of the Lie algebra $\mathrm{Der}\,C^\infty(S^1)$ of smooth vector fields on $S^1$.
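The Lie algebra of the Lie group $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$ is $\mathcal{L}_{\mathrm{Ad}(a),K}(\mathrm{Der}\,\mathfrak{g})$. Thus, the Lie algebra of the group of automorphisms of $\mathcal{L}_{a,K}(\mathfrak{g})$ can be naturally identified with the Lie algebra $\mathrm{Der}_K\,C^\infty(S^1) \ltimes \mathcal{L}_{\mathrm{Ad}(a),K}(\mathrm{Der}\,\mathfrak{g})$. By a derivation of a Fréchet Lie algebra $\mathcal{G}$ we mean a continuous linear mapping $D$ from $\mathcal{G}$ to $\mathcal{G}$ which satisfies the relation $D[\xi,\eta] = [D\xi,\eta] + [\xi,D\eta]$. Note again that continuity and linearity imply smoothness. The derivation of $\mathcal{L}_{a,K}(\mathfrak{g})$ corresponding to an element of $\mathrm{Der}_K\,C^\infty(S^1) \ltimes \mathcal{L}_{\mathrm{Ad}(a),K}(\mathrm{Der}\,\mathfrak{g})$ is constructed as follows. Define the action of a vector field $X \in \mathrm{Der}_K\,C^\infty(S^1)$ on an element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ in the usual way. Let $(e_i)$ be a basis of $\mathfrak{g}$; then for any element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ one can write
\[ \xi = \sum_i e_i\, \xi^i, \]
where $\xi^i$ are smooth functions on $S^1$. Then one sets
\[ X(\xi) = \sum_i e_i\, X(\xi^i). \]
One can check that this definition does not depend on the choice of the basis $(e_i)$. Let $(X,\delta)$ be an element of $\mathrm{Der}_K\,C^\infty(S^1) \ltimes \mathcal{L}_{\mathrm{Ad}(a),K}(\mathrm{Der}\,\mathfrak{g})$. Consider the corresponding one-parameter subgroup of the Lie group $\mathrm{Diff}_K(S^1) \ltimes \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$. It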
Kh. S. Nirov, A.V. Razumov
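is determined by two mappings $\lambda : \mathbb{R} \to \mathrm{Diff}_K(S^1)$ and $\theta : \mathbb{R} \to \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$. For any fixed element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ one has a curve $\tau \in \mathbb{R} \mapsto \theta(\tau)(\xi \circ (\lambda(\tau))^{-1})$ in $\mathcal{L}_{a,K}(\mathfrak{g})$. The tangent vector to this curve at zero can be treated as the action of a linear operator $D$ on the element $\xi$. It is clear that $D\xi = -X(\xi) + \delta(\xi)$, where
\[ (\delta(\xi))(p) = \delta(p)(\xi(p)). \]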
One can verify that $D$ is a derivation of the Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$. It can also be shown that in the case where $\mathfrak{g}$ is a real Lie algebra the derivations of the above form exhaust all possible derivations of the Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$. In the case where $\mathfrak{g}$ is a complex Lie algebra, to exhaust all derivations one should assume that the vector field $X$ may be complex.

4. Z-Gradations of Twisted Loop Lie Algebras

In general, dealing with $\mathbb{Z}$-gradations of infinite dimensional Lie algebras we confront the necessity to work with infinite series of their elements, or, in other words, with series in Fréchet spaces. Let us recall some relevant information on such series. Given a countable set $S$, we denote by $D(S)$ the set of all finite subsets of $S$, considered as a directed set, where $\alpha \succeq \beta$ if and only if $\alpha \supset \beta$. The definition of a directed set can be found, for example, in the book [5]. Let $X$ be a topological vector space, $I$ be some countable set, and $(x_i)_{i \in I}$ be a collection of elements of $X$ indexed by $I$. The symbol $\sum_{i \in I} x_i$ is called a series in $X$. Consider a net $(s_\alpha)_{\alpha \in D(I)}$ [5], where
\[ s_\alpha = \sum_{i \in \alpha} x_i. \]
If the net $(s_\alpha)$ converges to an element $s \in X$ we say that the series $\sum_{i \in I} x_i$ converges unconditionally to $s$ and write
\[ s = \sum_{i \in I} x_i. \]
Here the element $s$ is called the sum of the series $\sum_{i \in I} x_i$. Let $X$ be a topological vector space whose topology is induced by a countable family of seminorms $(\|\cdot\|_m)$. If a series $\sum_{i \in I} x_i$ in $X$ converges unconditionally and for each $m$ the series $\sum_{i \in I} \|x_i\|_m$ also converges unconditionally, one says that the series $\sum_{i \in I} x_i$ converges absolutely.

Let $\mathcal{G}$ be a Fréchet Lie algebra. Suppose that for any $k \in \mathbb{Z}$ there is given a closed subspace $\mathcal{G}_k$ of $\mathcal{G}$ such that (a) for any $k, l \in \mathbb{Z}$ one has $[\mathcal{G}_k, \mathcal{G}_l] \subset \mathcal{G}_{k+l}$, (b) any element $\xi$ of $\mathcal{G}$ can be uniquely represented as an absolutely convergent series
\[ \xi = \sum_{k \in \mathbb{Z}} \xi_k, \]
where ξk ∈ Gk . In this case we say that the Fréchet Lie algebra G is supplied with a Z-gradation, and call the subspaces Gk the grading subspaces of G and the elements ξk the grading components of ξ . If F is an isomorphism from the Fréchet Lie algebra
$\mathcal{G}$ to a Fréchet Lie algebra $\mathcal{H}$, then taking the subspaces $\mathcal{H}_k = F(\mathcal{G}_k)$ of $\mathcal{H}$ as grading subspaces we endow $\mathcal{H}$ with a $\mathbb{Z}$-gradation. In this case we say that the $\mathbb{Z}$-gradations of $\mathcal{G}$ and $\mathcal{H}$ under consideration are conjugated by the isomorphism $F$. It is clear that if the grading components of an element $\xi \in \mathcal{G}$ are $\xi_k$, then the grading components $F(\xi)_k$ of the element $F(\xi) \in \mathcal{H}$ are $F(\xi_k)$. As the simplest example, let us consider the so-called standard gradation of $\mathcal{L}(\mathfrak{g})$. Denote by $\lambda$ the standard coordinate function on $\mathbb{C}$ and its restriction to $S^1$. The grading subspaces for the standard gradation are defined as $\mathcal{L}(\mathfrak{g})_k = \{\xi \in \mathcal{L}(\mathfrak{g}) \mid \xi = \lambda^k x,\ x \in \mathfrak{g}\}$, and the expansion of a general element $\xi$ of $\mathcal{L}(\mathfrak{g})$ over grading subspaces is the representation of $\xi$ as a Fourier series:
\[ \xi = \sum_{k \in \mathbb{Z}} \lambda^k x_k, \]
that in terms of the mapping $\tilde\xi$ has the usual form
\[ \tilde\xi = \sum_{k \in \mathbb{Z}} e^{iks} x_k, \]
with
\[ x_k = \frac{1}{2\pi} \int_{[0,2\pi]} e^{-iks}\, \tilde\xi \, ds. \]
From the theory of Fourier series it follows that the Fourier series of any element $\xi \in \mathcal{L}(\mathfrak{g})$ converges absolutely to $\xi$ as a series in the Fréchet space $\mathcal{L}(\mathfrak{g})$. Hence, we really have a $\mathbb{Z}$-gradation of $\mathcal{L}(\mathfrak{g})$. To justify the requirement of absolute convergence in the definition of a $\mathbb{Z}$-gradation, let us first formulate a proposition which can be proved along the lines of the proof of the corresponding proposition for series in normed spaces (see, for example, [1]).

Proposition 2. Let a series $\sum_{i \in I} x_i$ in a Fréchet space converge absolutely. Assume that the set $I$ is represented as the union of a countable number of nonempty nonintersecting sets $I_j$, $j \in J$. For any $j \in J$ the series $\sum_{i \in I_j} x_i$ converges absolutely and the series $\sum_{j \in J} y_j$, where
\[ y_j = \sum_{i \in I_j} x_i, \]
converges absolutely. Moreover, one has
\[ \sum_{i \in I} x_i = \sum_{j \in J} \Bigl( \sum_{i \in I_j} x_i \Bigr). \]
Now we are able to prove the following proposition.
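Proposition 3. Let a Fréchet Lie algebra $\mathcal{G}$ be supplied with a $\mathbb{Z}$-gradation. For any two elements of $\mathcal{G}$,
\[ \xi = \sum_{k \in \mathbb{Z}} \xi_k, \qquad \eta = \sum_{k \in \mathbb{Z}} \eta_k, \]
the grading components of $[\xi,\eta]$ are given by
\[ [\xi,\eta]_k = \sum_{l \in \mathbb{Z}} [\xi_{k-l}, \eta_l]. \]
Here the series on the right-hand side converges absolutely.

Proof. First we prove that the series $\sum_{(k,l) \in \mathbb{Z} \times \mathbb{Z}} [\xi_k, \eta_l]$ converges absolutely. Let $\alpha$ be an element of $D(\mathbb{Z} \times \mathbb{Z})$, fix a positive integer $m$, and define
\[ r_{\alpha,m} = \sum_{(k,l) \in \alpha} \|[\xi_k, \eta_l]\|_m. \]
There are elements $\beta, \gamma \in D(\mathbb{Z})$ such that $\alpha \subset \beta \times \gamma$. Using Lemma 1, we obtain
\[ r_{\alpha,m} \le \sum_{(k,l) \in \beta \times \gamma} \|[\xi_k, \eta_l]\|_m \le C_m \sum_{(k,l) \in \beta \times \gamma} \|\xi_k\|_m \|\eta_l\|_m = C_m \Bigl( \sum_{k \in \beta} \|\xi_k\|_m \Bigr) \Bigl( \sum_{l \in \gamma} \|\eta_l\|_m \Bigr) \le C_m \sum_{k \in \mathbb{Z}} \|\xi_k\|_m \sum_{l \in \mathbb{Z}} \|\eta_l\|_m. \]
It is clear that for any positive integer $m$ the net $(r_{\alpha,m})_{\alpha \in D(\mathbb{Z} \times \mathbb{Z})}$ is monotonically increasing, which means that $r_{\alpha,m} \ge r_{\beta,m}$ if $\alpha \succeq \beta$. The above inequalities show that it is also bounded above. As in the case of sequences, such a net is convergent. Therefore, the series $\sum_{(k,l) \in \mathbb{Z} \times \mathbb{Z}} [\xi_k, \eta_l]$ converges absolutely. As follows from Proposition 2 one can write
\[ \sum_{(k,l) \in \mathbb{Z} \times \mathbb{Z}} [\xi_k, \eta_l] = \sum_{k \in \mathbb{Z}} \Bigl( \sum_{l \in \mathbb{Z}} [\xi_k, \eta_l] \Bigr). \]
For a fixed $k$ the net $\bigl( \sum_{l \in \alpha} [\xi_k, \eta_l] \bigr)_{\alpha \in D(\mathbb{Z})}$ converges. This net coincides with the net $\bigl( [\xi_k, \sum_{l \in \alpha} \eta_l] \bigr)_{\alpha \in D(\mathbb{Z})}$. Since the net $\bigl( \sum_{l \in \alpha} \eta_l \bigr)_{\alpha \in D(\mathbb{Z})}$ converges to $\eta$ and the Lie algebra operation in $\mathcal{G}$ is continuous, one has
\[ \sum_{l \in \mathbb{Z}} [\xi_k, \eta_l] = [\xi_k, \eta]. \]
Similarly, one obtains
\[ \sum_{k \in \mathbb{Z}} [\xi_k, \eta] = [\xi, \eta]. \]
Using again Proposition 2, we come to the equality
\[ [\xi, \eta] = \sum_{k \in \mathbb{Z}} \sum_{l \in \mathbb{Z}} [\xi_{k-l}, \eta_l], \]
where for any $k$ the series $\sum_{l \in \mathbb{Z}} [\xi_{k-l}, \eta_l]$ converges absolutely.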
Suppose that a Fréchet Lie algebra $\mathcal{G}$ is supplied with a $\mathbb{Z}$-gradation such that for any element $\xi = \sum_{k \in \mathbb{Z}} \xi_k$ of $\mathcal{G}$ the series $\sum_{k \in \mathbb{Z}} k\, \xi_k$ converges unconditionally. In this case one can define a linear operator $Q$ in $\mathcal{G}$, acting on an element $\xi = \sum_{k \in \mathbb{Z}} \xi_k$ as
\[ Q\xi = \sum_{k \in \mathbb{Z}} k\, \xi_k. \]
The elements $k\, \xi_k$ are the grading components of the element $Q\xi$; therefore the series $\sum_{k \in \mathbb{Z}} k\, \xi_k$ converges absolutely by the definition of a $\mathbb{Z}$-gradation. It is clear that $\mathcal{G}_k = \{\xi \in \mathcal{G} \mid Q\xi = k\xi\}$. We call the linear operator $Q$ the grading operator and say that the $\mathbb{Z}$-gradation under consideration is generated by the grading operator. If a $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{G}$ and a $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{H}$ are conjugated by an isomorphism $F$, and the $\mathbb{Z}$-gradation of $\mathcal{G}$ is generated by a grading operator $Q$, then the $\mathbb{Z}$-gradation of $\mathcal{H}$ is generated by the grading operator $F Q F^{-1}$. The standard gradation of $\mathcal{L}(\mathfrak{g})$ is generated by a grading operator $Q$ such that $\widetilde{Q\xi} = -i\, d\tilde\xi/ds$. Here the operator $Q$ is a derivation of $\mathcal{L}(\mathfrak{g})$. In general we have the following statement.

Proposition 4. Let a $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{G}$ be generated by a grading operator $Q$. The equality $Q[\xi,\eta] = [Q\xi,\eta] + [\xi,Q\eta]$ is valid for any $\xi, \eta \in \mathcal{G}$.

Proof. Using Proposition 3, one obtains
\[ Q[\xi,\eta] = \sum_{k \in \mathbb{Z}} (Q[\xi,\eta])_k = \sum_{k \in \mathbb{Z}} k\,[\xi,\eta]_k = \sum_{k \in \mathbb{Z}} \Bigl( \sum_{l \in \mathbb{Z}} k\,[\xi_{k-l}, \eta_l] \Bigr). \]
In a similar way one comes to the equalities
\[ [Q\xi,\eta] = \sum_{k \in \mathbb{Z}} \Bigl( \sum_{l \in \mathbb{Z}} (k-l)\,[\xi_{k-l}, \eta_l] \Bigr), \qquad [\xi,Q\eta] = \sum_{k \in \mathbb{Z}} \Bigl( \sum_{l \in \mathbb{Z}} l\,[\xi_{k-l}, \eta_l] \Bigr). \]
The three equalities above imply the statement of the proposition.

It follows from this proposition that if the grading operator $Q$ generating a $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{G}$ is continuous, it is a derivation of $\mathcal{G}$. We call a $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{G}$ integrable if the mapping $\Phi$ from $\mathbb{R} \times \mathcal{G}$ to $\mathcal{G}$ defined by the relation
\[ \Phi(\tau, \xi) = \sum_{k \in \mathbb{Z}} e^{-ik\tau}\, \xi_k \]
is smooth. Here as usual we denote by ξk the grading components of the element ξ with respect to the Z-gradation under consideration. For each fixed ξ ∈ G the mapping Φ induces a smooth curve Φξ : R → G given by the equality Φξ (τ ) = Φ(τ, ξ ).
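Propositions 3 and 4 can be checked by hand on finitely supported elements, where every series is a finite sum. The following is a minimal sketch under assumptions that are not in the text: the algebra is taken to be Laurent-polynomial loops in sl(2) with the standard gradation, an element is stored as a dictionary from degree k to its 2×2 component, and all helper names (`bracket`, `grading_operator`, `loop_add`) are hypothetical.

```python
# Finitely supported loops in sl(2): dict {degree k: 2x2 integer matrix xi_k}.
ZERO = ((0, 0), (0, 0))

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_scale(c, a):
    return tuple(tuple(c * e for e in row) for row in a)

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][r] * b[r][j] for r in range(2))
                       for j in range(2)) for i in range(2))

def commutator(a, b):
    return mat_add(mat_mul(a, b), mat_scale(-1, mat_mul(b, a)))

def loop_add(xi, eta):
    """Componentwise sum of two loops, dropping zero components."""
    out = dict(xi)
    for k, y in eta.items():
        out[k] = mat_add(out.get(k, ZERO), y)
    return {k: v for k, v in out.items() if v != ZERO}

def bracket(xi, eta):
    """Proposition 3: [xi, eta]_k = sum_l [xi_{k-l}, eta_l]."""
    out = {}
    for k, x in xi.items():
        for l, y in eta.items():
            out[k + l] = mat_add(out.get(k + l, ZERO), commutator(x, y))
    return {k: v for k, v in out.items() if v != ZERO}

def grading_operator(xi):
    """Q xi = sum_k k xi_k (the degree-0 component is annihilated)."""
    return {k: mat_scale(k, x) for k, x in xi.items() if k != 0 and x != ZERO}

# Standard basis of sl(2) and two sample loops.
E = ((0, 1), (0, 0))
F = ((0, 0), (1, 0))
H = ((1, 0), (0, -1))
xi = {-1: E, 2: H}
eta = {1: F, 3: E}

# Proposition 4: Q[xi, eta] = [Q xi, eta] + [xi, Q eta].
lhs = grading_operator(bracket(xi, eta))
rhs = loop_add(bracket(grading_operator(xi), eta),
               bracket(xi, grading_operator(eta)))
print(lhs == rhs)
```

Exact integer arithmetic is used so that the two sides can be compared with `==`; the check says nothing about convergence, which is the actual content of the propositions for infinite series.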
Proposition 5. Any integrable $\mathbb{Z}$-gradation of a Fréchet Lie algebra $\mathcal{G}$ is generated by a grading operator. The corresponding grading operator $Q$ acts on an element $\xi \in \mathcal{G}$ as
\[ Q\xi = i\, \frac{d}{dt}\Big|_0 \Phi_\xi, \]
where we denote by $t$ the standard coordinate function on $\mathbb{R}$.

Proof. Since the mapping $\Phi$ is smooth and linear in $\xi$, $Q$ is a continuous linear operator on $\mathcal{G}$. Therefore, for any net $(\xi_\alpha)_{\alpha \in D(\mathbb{Z})}$ in $\mathcal{G}$ which converges to an element $\xi \in \mathcal{G}$, the net $(Q\xi_\alpha)_{\alpha \in D(\mathbb{Z})}$ converges to $Q\xi$. The net $(\xi_\alpha)_{\alpha \in D(\mathbb{Z})}$, where
\[ \xi_\alpha = \sum_{k \in \alpha} \xi_k \]
and $\xi_k$ are the grading components of $\xi$, converges to $\xi$. Since for any $\alpha \in D(\mathbb{Z})$ the element $\xi_\alpha$ is the sum of a finite number of grading components, one has
\[ Q\xi_\alpha = i\, \frac{d}{dt}\Big|_0 \Phi_{\xi_\alpha} = \sum_{k \in \alpha} k\, \xi_k. \]
This means that $Q\xi = \sum_{k \in \mathbb{Z}} k\, \xi_k$. Thus, the linear operator $Q$ generates the $\mathbb{Z}$-gradation under consideration.
Proposition 6. Let a Fréchet Lie algebra $\mathcal{G}$ be supplied with an integrable $\mathbb{Z}$-gradation. Then for any fixed $\tau \in \mathbb{R}$ the mapping $\xi \in \mathcal{G} \mapsto \Phi(\tau,\xi) \in \mathcal{G}$ is an automorphism of $\mathcal{G}$. The mapping $\Phi$ satisfies the relation $\Phi(\tau_1, \Phi(\tau_2, \xi)) = \Phi(\tau_1 + \tau_2, \xi)$.

Proof. From Proposition 3 it follows that one can write
\[ [\Phi(\tau,\xi), \Phi(\tau,\eta)] = \sum_{k \in \mathbb{Z}} \sum_{l \in \mathbb{Z}} [(\Phi(\tau,\xi))_{k-l}, (\Phi(\tau,\eta))_l]. \]
It is clear that
\[ (\Phi(\tau,\xi))_{k-l} = e^{-i(k-l)\tau}\, \xi_{k-l}, \qquad (\Phi(\tau,\eta))_l = e^{-il\tau}\, \eta_l. \]
Therefore, one has
\[ [\Phi(\tau,\xi), \Phi(\tau,\eta)] = \sum_{k \in \mathbb{Z}} e^{-ik\tau} \sum_{l \in \mathbb{Z}} [\xi_{k-l}, \eta_l] = \sum_{k \in \mathbb{Z}} e^{-ik\tau}\, [\xi,\eta]_k = \Phi(\tau, [\xi,\eta]). \]
That proves the first statement of the proposition. The second statement of the proposition is evident.

We will need further the following facts from the theory of infinite dimensional manifolds. Let $M$ and $N$ be two finite dimensional manifolds, and let $M$ be compact. The space $C^\infty(M,N)$ of all smooth mappings from $M$ to $N$ can be supplied with the structure of a smooth manifold modelled on Fréchet spaces (see, for example, [3, 6, 7]).
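The two statements of Proposition 6 can be illustrated numerically on finitely supported elements. This is only a sketch under stated assumptions: loops are Laurent polynomials in sl(2) with the standard gradation, stored as dictionaries from degree to a 2×2 matrix, and the names `phi`, `bracket` and `distance` are hypothetical helpers, not notation from the text.

```python
import cmath
import math

ZERO = ((0, 0), (0, 0))

def commutator(a, b):
    """Matrix commutator ab - ba for 2x2 matrices stored as nested tuples."""
    def mul(p, q):
        return [[sum(p[i][r] * q[r][j] for r in range(2)) for j in range(2)]
                for i in range(2)]
    ab, ba = mul(a, b), mul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

def bracket(xi, eta):
    """[xi, eta]_k = sum_l [xi_{k-l}, eta_l] for finitely supported loops."""
    out = {}
    for k, x in xi.items():
        for l, y in eta.items():
            c, old = commutator(x, y), out.get(k + l, ZERO)
            out[k + l] = tuple(tuple(old[i][j] + c[i][j] for j in range(2))
                               for i in range(2))
    return out

def phi(tau, xi):
    """Phi(tau, xi) = sum_k exp(-i k tau) xi_k."""
    return {k: tuple(tuple(cmath.exp(-1j * k * tau) * e for e in row) for row in x)
            for k, x in xi.items()}

def distance(xi, eta):
    """Largest entrywise difference between two finitely supported loops."""
    return max((abs(xi.get(k, ZERO)[i][j] - eta.get(k, ZERO)[i][j])
                for k in set(xi) | set(eta) for i in range(2) for j in range(2)),
               default=0.0)

E, F, H = ((0, 1), (0, 0)), ((0, 0), (1, 0)), ((1, 0), (0, -1))
xi, eta = {-1: E, 2: H}, {1: F, 3: E}
t1, t2 = 0.7, -1.9

# Automorphism property: [Phi xi, Phi eta] = Phi [xi, eta].
auto_err = distance(bracket(phi(t1, xi), phi(t1, eta)), phi(t1, bracket(xi, eta)))
# One-parameter group property: Phi(t1, Phi(t2, xi)) = Phi(t1 + t2, xi).
group_err = distance(phi(t1, phi(t2, xi)), phi(t1 + t2, xi))
# Periodicity in tau, which is used later in the proof of Proposition 7.
period_err = distance(phi(t1 + 2 * math.pi, xi), phi(t1, xi))
print(auto_err, group_err, period_err)
```

All three errors are at the level of floating-point rounding; the comparison uses a tolerance because the phases are computed in floating point.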
Let $P$, $M$, $N$ be three finite dimensional manifolds, and let $M$ be compact. Consider a smooth mapping $f$ from $P$ to $C^\infty(M,N)$. This mapping induces a mapping $F$ from $P \times M$ to $N$ defined by the equality $F(p,q) = (f(p))(q)$. One can prove that the mapping $F$ is smooth. Conversely, if one has a smooth mapping from $P \times M$ to $N$, reversing the above equality one can define a mapping from $P$ to $C^\infty(M,N)$, and this mapping is also smooth. Thus, we have the following canonical identification $C^\infty(P, C^\infty(M,N)) = C^\infty(P \times M, N)$. This fact is called the exponential law or the cartesian closedness (see, for example, [6, 7]).

Let us return to the consideration of twisted loop Lie algebras. Suppose that $\mathfrak{g}$ is a complex simple Lie algebra, and $a$ is an automorphism of $\mathfrak{g}$ satisfying the relation $a^K = \mathrm{id}_{\mathfrak{g}}$ for some positive integer $K$. Assume that the twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ is endowed with an integrable $\mathbb{Z}$-gradation. Define a mapping $\varphi$ from $\mathbb{R}$ to the Lie group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ by the equality $(\varphi(\tau))(\xi) = \Phi(\tau,\xi)$. It is a curve in the Lie group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$. Using the identification of $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$ with the Lie group $\mathrm{Diff}_K(S^1) \ltimes \mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$, for any $\tau \in \mathbb{R}$ one can write $\varphi(\tau) = (\lambda(\tau), \theta(\tau))$, where $\lambda$ is a mapping from $\mathbb{R}$ to the Lie group $\mathrm{Diff}_K(S^1)$ and $\theta$ is a mapping from $\mathbb{R}$ to the Lie group $\mathcal{L}_{\mathrm{Int}(a),K}(\mathrm{Aut}\,\mathfrak{g})$. The mapping $\lambda$ induces a mapping $\Lambda$ from $\mathbb{R} \times S^1$ to $S^1$ given by $\Lambda(\tau,p) = (\lambda(\tau))(p)$, and the mapping $\theta$ induces a mapping $\Theta$ from $\mathbb{R} \times S^1$ to $\mathrm{Aut}\,\mathfrak{g}$ given by $\Theta(\tau,p) = (\theta(\tau))(p)$. Using the mappings $\Lambda$ and $\Theta$ one can write $(\Phi(\tau,\xi))(p) = \Theta(\tau,p)(\xi(\Lambda^{-1}(\tau,p)))$, where the mapping $\Lambda^{-1} : \mathbb{R} \times S^1 \to S^1$ is defined by the equality $\Lambda^{-1}(\tau, \Lambda(\tau,p)) = p$. Since the mapping $\Phi$ is smooth, the mappings $\Lambda$ and $\Theta$ are also smooth. Therefore, by the exponential law, the mappings $\lambda$ and $\theta$ are smooth as well. Thus, the curve $\varphi$ is a smooth curve in the Lie group $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$. As follows from Proposition 6, it is a one-parameter subgroup of $\mathrm{Aut}\,\mathcal{L}_{a,K}(\mathfrak{g})$.
The tangent vector to the curve ϕ at zero is a derivation of La,K (g) which coincides with the linear operator −i Q. Therefore, one has the equality Qξ = −i X (ξ ) + i δ(ξ ). Here X ∈ Der K C ∞ (S 1 ) is the vector field being the tangent vector at zero to the curve λ in Diff K (S 1 ), and δ is the tangent vector at zero to the curve θ in LInt(a),K (Aut g). Note that the mapping Λ corresponding to the mapping λ is a flow on S 1 , and X is the vector field which generates this flow.
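The relation between the tangent vector and the grading operator can be spelled out in two lines. The sketch below assumes that the series for $\Phi$ may be differentiated term by term at $\tau = 0$ (as in the proof of Proposition 5) and uses the operator $D\xi = -X(\xi) + \delta(\xi)$ attached earlier to a one-parameter subgroup:

```latex
\begin{align*}
\frac{d}{d\tau}\Big|_{\tau=0} \Phi(\tau,\xi)
  &= \frac{d}{d\tau}\Big|_{\tau=0} \sum_{k\in\mathbb{Z}} e^{-ik\tau}\,\xi_k
   = -i \sum_{k\in\mathbb{Z}} k\,\xi_k
   = -i\,Q\xi, \\
-i\,Q\xi = D\xi = -X(\xi) + \delta(\xi)
  &\quad\Longrightarrow\quad
  Q\xi = -i\,X(\xi) + i\,\delta(\xi).
\end{align*}
```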
Proposition 7. Either the vector field $X$ is the zero vector field, or it has no zeros.

Proof. It is clear that $\Phi(\tau + 2\pi, \xi) = \Phi(\tau, \xi)$ for any $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$. It implies that $\Lambda(\tau + 2\pi, p) = \Lambda(\tau, p)$ for any $p \in S^1$. According to the mechanical interpretation of the flow, $\Lambda(\tau,p)$ is the position of a particle at time $\tau$, if its position at zero time is $p$. Here the velocity of the particle at time $\tau$ is $X(\Lambda(\tau,p))$. If $p \in S^1$ is a zero of $X$, then a particle placed at the point $p$ at some instant of time will forever remain at that point. If $X(p) \neq 0$, a particle placed at the point $p$ will keep moving in the same direction, and it cannot pass any zero of the vector field $X$. If $X$ has zeros, this contradicts the periodicity of $\Lambda$ in the first argument. There is no contradiction only if $X$ is the zero vector field.

Recall that any derivation of a simple Lie algebra is an inner derivation. Therefore, if $\delta$ is an element of $\mathcal{L}_{\mathrm{Ad}(a),K}(\mathrm{Der}\,\mathfrak{g})$, then there exists a unique element $\eta$ of $\mathcal{L}_{a,K}(\mathfrak{g})$ such that $\delta(\xi) = [\eta, \xi]$. Thus, we come to the following proposition.

Proposition 8. The grading operator $Q$ generating an integrable $\mathbb{Z}$-gradation of a twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ acts on an element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ as $Q\xi = -i\, X(\xi) + i\, [\eta, \xi]$, where $X \in \mathrm{Der}_K\,C^\infty(S^1)$, and $\eta$ is an element of $\mathcal{L}_{a,K}(\mathfrak{g})$.
We will not consider $\mathbb{Z}$-gradations with infinite dimensional grading subspaces. Therefore, the vector field $X$ cannot be the zero vector field. Indeed, suppose that $X = 0$ and $Q\xi = i\,[\eta,\xi] = k\xi$; then $\|[\eta,\xi]\|_1 = |k|\, \|\xi\|_1$. From Lemma 1 one obtains $|k| \le C \|\eta\|_1$. Hence, we have only a finite number of grading subspaces, and thus at least some of them must be infinite dimensional. From now on we identify any element $\xi$ of $\mathcal{L}_{a,K}(\mathfrak{g})$ with the corresponding mapping $\tilde\xi$ from $\mathbb{R}$ to $\mathfrak{g}$, omitting the tilde. Similarly, we identify each element $f$ of $\mathrm{Diff}_K(S^1)$ with the corresponding mapping $\tilde f$, again omitting the tilde. An element $X$ of $\mathrm{Der}_K\,C^\infty(S^1)$ is identified with the vector field on $\mathbb{R}$, which we denote again by $X$. One has $X = v\, d/ds$, where the function $v$ satisfies the relation $v(\sigma + 2\pi/K) = v(\sigma)$.

Proposition 9. Let the twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ be endowed with an integrable $\mathbb{Z}$-gradation with finite dimensional grading subspaces, and let $Q$ be the corresponding grading operator, which has the form described in Proposition 8. For any diffeomorphism $f \in \mathrm{Diff}_K(S^1)$ one has $A_f Q A_f^{-1} \xi = -i\, f_* X(\xi) + i\, [\eta, \xi]$, where $A_f$ is the automorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$ induced by $f$. Here the diffeomorphism $f$ can be chosen so that $f_* X = \kappa\, d/ds$ for some nonzero real constant $\kappa$.
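The second statement of Proposition 9 can be made concrete numerically. The sketch below assumes the natural choice $f(\sigma) = \kappa \int_0^\sigma d\sigma'/v(\sigma')$ with $\kappa$ fixed by $f(2\pi/K) = 2\pi/K$, for which $f_* X = \kappa\, d/ds$ amounts to $f'(\sigma)\, v(\sigma) = \kappa$. The function name `straighten`, the sample field $v(\sigma) = 2 + \cos(K\sigma)$ and the quadrature rule are illustrative assumptions, not part of the text.

```python
import math

def straighten(v, K, n=2000):
    """Return (kappa, f) with f(sigma) = kappa * int_0^sigma ds / v(s).

    kappa is fixed by f(2*pi/K) = 2*pi/K; v is assumed to be zero-free
    and 2*pi/K-periodic.
    """
    period = 2 * math.pi / K

    def integral(upper):
        # composite midpoint rule for int_0^upper ds / v(s)
        h = upper / n
        return h * sum(1.0 / v((j + 0.5) * h) for j in range(n))

    kappa = period / integral(period)
    return kappa, (lambda sigma: kappa * integral(sigma))

K = 3
v = lambda s: 2.0 + math.cos(K * s)   # sample zero-free, 2*pi/K-periodic field
kappa, f = straighten(v, K)

# f_* X = kappa d/ds means f'(sigma) * v(sigma) = kappa at every point.
sigma, h = 0.4, 1e-4
f_prime = (f(sigma + h) - f(sigma - h)) / (2 * h)
print(kappa, f_prime * v(sigma))
```

For this sample field the integral over one period is $2\pi/(K\sqrt{3})$, so $\kappa = \sqrt{3}$, which the quadrature reproduces to high accuracy.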
Proof. The first statement of the proposition follows from the well known equality $f_* X(\varphi) = f^{-1*} X(f^* \varphi)$ valid for any $\varphi \in C^\infty(S^1)$. Writing the vector field $X$ as $v\, d/ds$, in accordance with Proposition 7 we conclude that the function $v$ has no zeros. Thus we can consider a diffeomorphism $f$ of $\mathrm{Diff}_K(S^1)$ with
\[ f(\sigma) = \kappa \int_0^\sigma \frac{d\sigma'}{v(\sigma')}. \]
Here the constant $\kappa$ is fixed by $f(2\pi/K) = 2\pi/K$. It is easy to verify that $f_* X = \kappa\, d/ds$. Thus, the second statement of the proposition is true.

Without any loss of generality one can assume that the constant $\kappa$ of the above proposition is positive. Indeed, if it is not the case one can achieve this by performing the mapping $\xi(\sigma) \to \xi(-\sigma)$, which maps $\mathcal{L}_{a,K}(\mathfrak{g})$ isomorphically onto $\mathcal{L}_{a^{-1},K}(\mathfrak{g})$. Let $G$ be a simply connected Lie group whose Lie algebra coincides with $\mathfrak{g}$. Denote the automorphism of $G$ corresponding to the automorphism $a$ of $\mathfrak{g}$ by the same letter $a$. Since $G$ is a complex simple Lie group, we will consider it as a linear group. The following proposition is evident.

Proposition 10. Let the twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ be endowed with an integrable $\mathbb{Z}$-gradation, and let $Q$ be the corresponding grading operator, which has the form described in Proposition 8. Let $\gamma$ be a smooth mapping from $\mathbb{R}$ to $G$ satisfying the relation $\gamma(\sigma + 2\pi/K) = a(g\, \gamma(\sigma))$ for some $g \in G$. Consider a linear mapping $A_\gamma$ acting on any element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ as $A_\gamma \xi = \gamma\, \xi\, \gamma^{-1}$. The mapping $A_\gamma$ is an isomorphism from $\mathcal{L}_{a,K}(\mathfrak{g})$ to the Lie algebra of smooth mappings $\xi$ from $\mathbb{R}$ to $\mathfrak{g}$ satisfying the equality $\xi(\sigma + 2\pi/K) = a(g\, \xi(\sigma)\, g^{-1})$. This isomorphism conjugates the $\mathbb{Z}$-gradation of $\mathcal{L}_{a,K}(\mathfrak{g})$ and the $\mathbb{Z}$-gradation generated by the grading operator $A_\gamma Q A_\gamma^{-1}$, which acts as
\[ A_\gamma Q A_\gamma^{-1} \xi = -i\, X(\xi) + i\, [\gamma\, \eta\, \gamma^{-1} + X(\gamma)\gamma^{-1}, \xi]. \]
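Although Proposition 10 is stated as evident, the formula for the conjugated grading operator can be verified directly. The following sketch assumes that $G$ is realised as a matrix group, that $X$ acts entrywise as a derivation, so that $X(\gamma^{-1}) = -\gamma^{-1} X(\gamma) \gamma^{-1}$, and that $Q$ has the form given in Proposition 8:

```latex
\begin{align*}
\gamma\, X(\gamma^{-1}\xi\gamma)\, \gamma^{-1}
  &= -X(\gamma)\gamma^{-1}\xi + X(\xi) + \xi\, X(\gamma)\gamma^{-1}
   = X(\xi) - [X(\gamma)\gamma^{-1}, \xi], \\
\gamma\, [\eta, \gamma^{-1}\xi\gamma]\, \gamma^{-1}
  &= [\gamma\eta\gamma^{-1}, \xi], \\
A_\gamma Q A_\gamma^{-1}\xi
  &= \gamma\,\bigl(-i\,X(\gamma^{-1}\xi\gamma) + i\,[\eta, \gamma^{-1}\xi\gamma]\bigr)\,\gamma^{-1}
   = -i\,X(\xi) + i\,[\gamma\eta\gamma^{-1} + X(\gamma)\gamma^{-1}, \xi].
\end{align*}
```

In particular, for $X = \kappa\, d/ds$ the commutator term vanishes exactly when $\gamma\eta\gamma^{-1} + \kappa\,(d\gamma/ds)\gamma^{-1} = 0$, that is, when $\kappa\, \gamma^{-1} d\gamma/ds = -\eta$.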
Now we are able to prove our main theorem.

Theorem 2. An integrable $\mathbb{Z}$-gradation of a twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ with finite dimensional grading subspaces is conjugated by an isomorphism to a $\mathbb{Z}$-gradation of an appropriate twisted loop Lie algebra $\mathcal{L}_{a',K'}(\mathfrak{g})$ generated by the grading operator $Q\xi = -i\, d\xi/ds$. Here the automorphisms $a$ and $a'$ differ by an inner automorphism of $\mathfrak{g}$.
Proof. In accordance with Proposition 8 the grading operator of an integrable $\mathbb{Z}$-gradation of $\mathcal{L}_{a,K}(\mathfrak{g})$ with finite dimensional grading subspaces is specified by the choice of a vector field $X \in \mathrm{Der}_K\,C^\infty(S^1)$ and by an element $\eta \in \mathcal{L}_{a,K}(\mathfrak{g})$. Having in mind Proposition 9 and the discussion given just below it, we assume without loss of generality that $X = \kappa\, d/ds$ for some positive real constant $\kappa$. Let a mapping $\gamma : \mathbb{R} \to G$ be a solution of the equation $\kappa\, \gamma^{-1}\, d\gamma/ds = -\eta$. It is well known that this equation always has solutions, all its solutions are smooth, and if $\gamma$ and $\gamma'$ are two solutions then $\gamma' = g\gamma$ for some $g \in G$. Using the equality $\eta(\sigma + 2\pi/K) = a(\eta(\sigma))$, one concludes that, if $\gamma$ is a solution, then the mapping $\gamma'$ defined by the equality $\gamma'(\sigma) = a^{-1}(\gamma(\sigma + 2\pi/K))$ is also a solution. Hence, for some $g \in G$ one has $\gamma(\sigma + 2\pi/K) = a(g\gamma(\sigma))$. The mapping $A_\gamma$ described in Proposition 10, accompanied by the transformation $\sigma \to \sigma/K$, maps $\mathcal{L}_{a,K}(\mathfrak{g})$ isomorphically onto the Fréchet Lie algebra $\mathcal{G}$ formed by smooth mappings $\xi$ from $\mathbb{R}$ to $\mathfrak{g}$ satisfying the condition $\xi(\sigma + 2\pi) = a'(\xi(\sigma))$, where $a' = a \circ \mathrm{Ad}(g)$. Denote the grading operator generating the corresponding conjugated $\mathbb{Z}$-gradation again by $Q$. In accordance with Proposition 10 the operator $Q$ acts on an element $\xi$ as $Q\xi = -i K'\, d\xi/ds$, where $K' = \kappa K$. Suppose that for some integer $k$ the grading subspace $\mathcal{G}_k$ is nontrivial and $\xi \in \mathcal{G}_k$ is not equal to zero; then $\xi = \exp(iks/K')\, \xi(0)$ with $\xi(0) \neq 0$. Since $\xi$ is an element of $\mathcal{G}$, one should have $a'(\xi(0)) = \exp(2\pi i k/K')\, \xi(0)$. For any integer $l$ the mapping $\xi'$ defined by $\xi' = \exp(ils)\, \xi$ is a nonzero element of $\mathcal{G}$. The action of the grading operator $Q$ on $\xi'$ gives $(K' l + k)\xi'$. The number $K' l + k$ should be an integer. Since $l$ is an arbitrary integer, this is possible only if $K'$ is an integer. Due to the remark given after the proof of Proposition 9, one can assume without any loss of generality that it is a positive integer.
For any integer $k$ denote by $[k]_{K'}$ the element of the ring $\mathbb{Z}_{K'}$ corresponding to $k$. Let $x$ be an arbitrary element of $\mathfrak{g}$ and $\xi$ be an element of $\mathcal{G}$ such that $\xi(0) = x$. Expanding $\xi$ over the grading subspaces,
\[ \xi = \sum_{k \in \mathbb{Z}} \xi_k, \]
one obtains
\[ x = \sum_{m \in \mathbb{Z}_{K'}} x_m, \qquad \text{where} \qquad x_m = \sum_{\substack{k \in \mathbb{Z} \\ [k]_{K'} = m}} \xi_k(0). \]
Here for any $m \in \mathbb{Z}_{K'}$ we have $a'(x_m) = \exp(2\pi i k/K')\, x_m$, where $k$ is an arbitrary integer such that $[k]_{K'} = m$. Hence, the automorphism $a'$ is semisimple and $a'^{K'} = \mathrm{id}_{\mathfrak{g}}$. The change $\sigma \to K'\sigma$ induces an isomorphism from $\mathcal{G}$ to $\mathcal{L}_{a',K'}(\mathfrak{g})$ which conjugates the $\mathbb{Z}$-gradation of $\mathcal{G}$ under consideration with the $\mathbb{Z}$-gradation of $\mathcal{L}_{a',K'}(\mathfrak{g})$ generated by the grading operator $Q = -i\, d/ds$. That was to be proved.

Note now that the grading subspaces of the standard $\mathbb{Z}$-gradation of a twisted loop Lie algebra $\mathcal{L}_{a,K}(\mathfrak{g})$ are $\mathcal{L}_{a,K}(\mathfrak{g})_k = \{\xi \in \mathcal{L}_{a,K}(\mathfrak{g}) \mid \xi = \lambda^k x,\ x \in \mathfrak{g},\ a(x) = \epsilon_K^k x\}$, where $\epsilon_K = e^{2\pi i/K}$. It follows from the above theorem that to classify all $\mathbb{Z}$-gradations of the twisted loop Lie algebras $\mathcal{L}_{a,K}(\mathfrak{g})$ it suffices to classify the automorphisms of $\mathfrak{g}$ of finite order. The solution of the latter problem can be found, for example, in [11, 2], or in [4]. It is well known that the classification of the automorphisms of $\mathfrak{g}$ of finite order is equivalent to the classification of $\mathbb{Z}_K$-gradations of $\mathfrak{g}$. Suppose we have two $\mathbb{Z}$-gradations of $\mathcal{L}_{a,K}(\mathfrak{g})$ which are conjugated to standard $\mathbb{Z}$-gradations of Lie algebras $\mathcal{L}_{a',K'}(\mathfrak{g})$ and $\mathcal{L}_{a'',K''}(\mathfrak{g})$. It is clear that the initial $\mathbb{Z}$-gradations are conjugated by an isomorphism of $\mathcal{L}_{a,K}(\mathfrak{g})$ if and only if $K' = K''$ and the automorphisms $a'$ and $a''$ are conjugated.

Acknowledgements. Kh.S.N. is grateful to the Max-Planck-Institut für Gravitationsphysik – Albert-Einstein-Institut in Potsdam for its hospitality and friendly atmosphere. His work was supported by the Alexander von Humboldt-Stiftung, under a follow-up fellowship program. The work of A.V.R. was supported in part by the Russian Foundation for Basic Research (Grant No. 04–01–00352).
A. Distributions on S1 and Generalisations A continuous linear functional on the Fréchet space C ∞ (S 1 ) = C ∞ (S 1 , C) is said to be a distribution on S 1 . For a general presentation of the theory of distributions we refer to the book by Rudin [16]. The support of a function ϕ ∈ C ∞ (S 1 ) is defined as the closure of the set where ϕ does not vanish and is denoted as supp ϕ. We say that a distribution T vanishes on an
open set $U$ if $T(\varphi) = 0$ whenever $\mathrm{supp}\, \varphi \subset U$. Then the support of $T$ is defined as the complement of the union of all open sets where $T$ vanishes. It is clear that the support of a distribution on $S^1$ is a closed set. If the support of a distribution $T$ coincides with a one-point set $\{p\}$, then
\[ T(\varphi) = \sum_{m=0}^{n} c_m\, \varphi^{(m)}(p) \]
for some nonnegative integer $n$ and constants $c_m$. Let now $T$ be a continuous linear mapping from $\mathcal{L}(\mathfrak{g})$ to $\mathfrak{g}$. Given a basis $(e_i)$ of $\mathfrak{g}$, denote by $(\mu^i)$ the dual basis of $\mathfrak{g}^*$. For any element $x$ of $\mathfrak{g}$ one has
\[ x = \sum_i e_i\, \mu^i(x). \]
Using this equality, one can write
\[ T(\xi) = \sum_i e_i\, \mu^i(T(\xi)). \]
Representing a general element $\xi$ of $\mathcal{L}(\mathfrak{g})$ as $\sum_j e_j \xi^j$, one obtains
\[ \mu^i(T(\xi)) = \sum_j \mu^i(T(e_j \xi^j)). \]
Introduce a matrix of distributions $(T^i{}_j)$ on $S^1$ defined by the relation $T^i{}_j(\varphi) = \mu^i(T(e_j \varphi))$. Here $\varphi$ is a smooth function on $S^1$. Now one can write
\[ T(\xi) = \sum_{i,j} e_i\, T^i{}_j(\xi^j). \]
Thus, the matrix $(T^i{}_j)$ completely determines the mapping $T$. The support $\mathrm{supp}\, \xi$ of the element $\xi$ of $\mathcal{L}(\mathfrak{g})$ is defined as the closure of the set where $\xi$ does not take zero value. Representing $\xi$ as $\sum_i e_i \xi^i$ one concludes that $\mathrm{supp}\, \xi = \bigcup_i \mathrm{supp}\, \xi^i$. We say that a continuous linear mapping $T$ from $\mathcal{L}(\mathfrak{g})$ to $\mathfrak{g}$ vanishes on an open set $U$ if $T(\xi) = 0$ whenever $\mathrm{supp}\, \xi \subset U$. The support of $T$ is defined as the complement of the union of all open sets where $T$ vanishes. It is clear that $\mathrm{supp}\, T = \bigcup_{i,j} \mathrm{supp}\, T^i{}_j$, where $(T^i{}_j)$ is the matrix of distributions on $S^1$ which determines the mapping $T$ for given dual bases $(e_i)$ and $(\mu^i)$ of $\mathfrak{g}$ and $\mathfrak{g}^*$ respectively. If the support of $T$ is a one-point set $\{p\}$, then one can easily demonstrate that
\[ T(\xi) = \sum_{m=0}^{n} c_m(\xi^{(m)}(p)) \]
for some nonnegative integer $n$ and endomorphisms $c_m$ of $\mathfrak{g}$. Consider now continuous linear mappings from $\mathcal{L}_a(\mathfrak{g})$ to $\mathfrak{g}$ for the case when $\mathfrak{g}$ is a semisimple Lie algebra and $a$ is an automorphism of $\mathfrak{g}$ satisfying the relation $a^K = \mathrm{id}_{\mathfrak{g}}$
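for some positive integer $K$. In this case $\mathcal{L}_a(\mathfrak{g})$ can be considered as a subalgebra of $\mathcal{L}(\mathfrak{g})$ formed by the elements $\xi$ satisfying the condition $\xi(\epsilon_K p) = a(\xi(p))$, where $\epsilon_K = e^{2\pi i/K}$. We denote this subalgebra as $\mathcal{L}_{a,K}(\mathfrak{g})$. Define a linear operator $A$ in $\mathcal{L}(\mathfrak{g})$ acting on an element $\xi$ in accordance with the relation $A\xi(p) = a(\xi(\epsilon_K^{-1} p))$. An element $\xi \in \mathcal{L}(\mathfrak{g})$ belongs to $\mathcal{L}_{a,K}(\mathfrak{g})$ if $A\xi = \xi$. For an arbitrary element $\xi \in \mathcal{L}(\mathfrak{g})$ the element $\bar\xi$ defined as
\[ \bar\xi = \frac{1}{K} \sum_{m=0}^{K-1} A^{-m} \xi \]
belongs to $\mathcal{L}_{a,K}(\mathfrak{g})$, and one can extend a continuous linear mapping $T$ from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathfrak{g}$ to a continuous linear mapping $\tilde T$ from $\mathcal{L}(\mathfrak{g})$ to $\mathfrak{g}$ assuming that $\tilde T(\xi) = T(\bar\xi)$. One can easily show that $\tilde T \circ A = \tilde T$, and that $\mathrm{supp}\, \tilde T = \mathrm{supp}\, T$.

It is clear that if the support of an element $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$ contains a point $p \in S^1$, then it also contains the point $\epsilon_K p$. Therefore, the support of an element of $\mathcal{L}_{a,K}(\mathfrak{g})$ is the union of sets of the form $\{p, \epsilon_K p, \ldots, \epsilon_K^{K-1} p\}$, $p \in S^1$. The same is true for the support of an arbitrary continuous linear mapping from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathfrak{g}$. Let $T$ be a continuous linear mapping from $\mathcal{L}_{a,K}(\mathfrak{g})$ to $\mathfrak{g}$ whose support is a single set $\{p, \epsilon_K p, \ldots, \epsilon_K^{K-1} p\}$. The corresponding mapping $\tilde T$ has the same support and is invariant with respect to the action of the automorphism $A$. Using these facts one can obtain that
\[ \tilde T(\xi) = \frac{1}{K} \sum_{l=0}^{K-1} \sum_{m=0}^{n} c_m\bigl(a^{-l}(\xi^{(m)}(\epsilon_K^l p))\bigr) \]
for some nonnegative integer $n$ and endomorphisms $c_m$ of $\mathfrak{g}$. Restricting the mapping $\tilde T$ again to $\mathcal{L}_{a,K}(\mathfrak{g})$ one has
\[ T(\xi) = \sum_{m=0}^{n} c_m(\xi^{(m)}(p)) \]
for any $\xi \in \mathcal{L}_{a,K}(\mathfrak{g})$.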
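The claim that the averaged element lies in the twisted subalgebra follows from $A^K = \mathrm{id}$. The short derivation below is a sketch; it writes $\epsilon_K = e^{2\pi i/K}$ for the $K$-th root of unity in the twisting condition and $\bar\xi$, $\tilde T$ for the averaged element and the extended mapping, all of which are notational assumptions made for this illustration:

```latex
\begin{align*}
(A^K \xi)(p) &= a^K\bigl(\xi(\epsilon_K^{-K} p)\bigr) = \xi(p), \\
A\bar\xi &= \frac{1}{K} \sum_{m=0}^{K-1} A^{1-m}\xi
          = \frac{1}{K} \Bigl( A^{-(K-1)}\xi + \sum_{m=0}^{K-2} A^{-m}\xi \Bigr)
          = \bar\xi, \\
\tilde T(A\xi) &= T\bigl(\overline{A\xi}\bigr) = T(A\bar\xi) = T(\bar\xi) = \tilde T(\xi).
\end{align*}
```

The second line uses $A = A^{1-K}$, and the third uses that $A$ commutes with the averaging. If $\xi$ already satisfies $A\xi = \xi$, every term of the average equals $\xi$, so $\bar\xi = \xi$ and $\tilde T$ restricts to $T$.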
References
1. Dieudonné, J.: Foundations of Modern Analysis. New York: Academic Press, 1960
2. Gorbatsevich, V. V., Onishchik, A. L., Vinberg, E. B.: Lie Groups and Lie Algebras. III. Structure of Lie Groups and Lie Algebras. Encyclopaedia of Mathematical Sciences, Vol. 41, Berlin: Springer, 1994
3. Hamilton, R.: The inverse function theorem of Nash and Moser. Bull. Am. Math. Soc. 7, 65–222 (1982)
4. Kac, V. G.: Infinite dimensional Lie algebras. 3rd ed., Cambridge: Cambridge University Press, 1994
5. Kelley, J. L.: General Topology. New York: Springer Verlag, 1975
6. Kriegl, A., Michor, P.: Aspects of the theory of infinite dimensional manifolds. Diff. Geom. Appl. 1, 159–176 (1991)
7. Kriegl, A., Michor, P.: The Convenient Setting of Global Analysis. Mathematical Surveys and Monographs, Vol. 53. Providence, RI: Amer. Math. Soc., 1997
8. Leznov, A. N., Saveliev, M. V.: Group-theoretical Methods for Integration of Nonlinear Dynamical Systems. Basel: Birkhäuser, 1992
9. Milnor, J.: Remarks on infinite-dimensional Lie groups. In: Relativity, Groups and Topology II, DeWitt, B. S., Stora, R., eds., Amsterdam: North-Holland, 1984, pp. 1007–1057
10. Nirov, Kh. S., Razumov, A. V.: On classification of non-abelian Toda systems. In: Geometrical and Topological Ideas in Modern Physics, Petrov, V. A., ed., Protvino: IHEP, 2002, pp. 213–221
11. Onishchik, A. L., Vinberg, E. B.: Lie Groups and Algebraic Groups. Berlin: Springer, 1990
12. Pressley, A., Segal, G.: Loop Groups. Oxford: Clarendon Press, 1986
13. Razumov, A. V., Saveliev, M. V.: Lie Algebras, Geometry, and Toda-type Systems. Cambridge: Cambridge University Press, 1997
14. Razumov, A. V., Saveliev, M. V.: Multi-dimensional Toda-type systems. Theor. Math. Phys. 112, 999–1022 (1997)
15. Razumov, A. V., Saveliev, M. V., Zuevsky, A. B.: Non-abelian Toda equations associated with classical Lie groups. In: Symmetries and Integrable Systems, Sissakian, A. N., ed., Dubna: JINR, 1999, pp. 190–203
16. Rudin, W.: Functional Analysis. New York: McGraw-Hill, 1973
17. Semenov-Tian-Shansky, M. A.: Integrable systems and factorization problems. In: Factorization and Integrable Systems, Gohberg, I., Manojlovic, N., Ferreira dos Santos, A., eds., Boston: Birkhäuser, 2003, pp. 155–218

Communicated by L. Takhtajan
Commun. Math. Phys. 267, 611–629 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0090-5
Communications in
Mathematical Physics
Moments of the Derivative of Characteristic Polynomials with an Application to the Riemann Zeta Function J.B. Conrey1,2 , M.O. Rubinstein3 , N.C. Snaith2 1 American Institute of Mathematics, 360 Portage Ave, Palo Alto, CA 94306, USA.
2 School of Mathematics, University of Bristol, Bristol, BS8 1TW, United Kingdom.
3 Pure Mathematics, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada.
Received: 25 August 2005 / Accepted: 21 April 2006 Published online: 22 August 2006 – © Springer-Verlag 2006
Abstract: We investigate the moments of the derivative, on the unit circle, of characteristic polynomials of random unitary matrices and use this to formulate a conjecture for the moments of the derivative of the Riemann ζ function on the critical line. We do the same for the analogue of Hardy’s Z -function, the characteristic polynomial multiplied by a suitable factor to make it real on the unit circle. Our formulae are expressed in terms of a determinant of a matrix whose entries involve the I-Bessel function and, alternately, by a combinatorial sum.
1. Introduction

Characteristic polynomials of unitary matrices serve as extremely useful models for the Riemann zeta-function $\zeta(s)$. The distribution of their eigenvalues gives insight into the distribution of zeros of the Riemann zeta-function, and the values of these characteristic polynomials give a model for the value distribution of $\zeta(s)$. See the works [KS] and [CFKRS] for detailed descriptions of how these models work. The important fact is that formulas for the moments of the Riemann zeta-function are suggested by the moments of the characteristic polynomials of unitary matrices. We consider two problems here: the moments of the derivative of the characteristic polynomial $\Lambda_A(s)$ of an $N \times N$ unitary matrix $A$, and also the moments of the analogue of the derivative of Hardy's $Z$-function, the characteristic polynomial multiplied by a suitable factor to make it real on the unit circle. In its simplest form our problem is to give an exact formula, valid for complex $r$ with $\Re r > 0$, for the moments of the absolute value of the derivative of characteristic polynomials
\[ \int_{U(N)} |\Lambda_A'(1)|^r \, dA_N \tag{1.1} \]
or of
\[ \int_{U(N)} |Z_A'(1)|^r \, dA_N. \tag{1.2} \]
Here we are integrating against Haar measure on the unitary group, and $Z_A(s)$ is equal to $\Lambda_A(s)$ times a rotation factor that makes it real on the unit circle. See the next section for the precise definition. Unfortunately, we cannot yet solve either of these problems. However, we can give asymptotic formulas when $r = 2k$ for positive integer values of $k$. The first two of these formulas involve the Maclaurin series coefficients of a certain $k \times k$ determinant, while the third involves a combinatorial sum.

Theorem 1. For fixed $k$ and $N \to \infty$ we have
\[ \int_{U(N)} |\Lambda_A'(1)|^{2k} \, dA_N \sim b_k N^{k^2+2k}, \tag{1.3} \]
where
\[ b_k = (-1)^{k(k+1)/2} \sum_{h=0}^{k} \binom{k}{h} \left( \frac{d}{dx} \right)^{k+h} \left( e^{-x} x^{-k^2/2} \det_{k \times k} \bigl( I_{i+j-1}(2\sqrt{x}) \bigr) \right) \Bigg|_{x=0}, \tag{1.4} \]
and $I_\nu(z)$ denotes the modified Bessel function of the first kind.

Theorem 2. For fixed $k$ and $N \to \infty$ we have
\[ \int_{U(N)} |Z_A'(1)|^{2k} \, dA_N \sim b_k' N^{k^2+2k}, \tag{1.5} \]
where
\[ b_k' = (-1)^{k(k+1)/2} \left( \frac{d}{dx} \right)^{2k} \left( e^{-x/2} x^{-k^2/2} \det_{k \times k} \bigl( I_{i+j-1}(2\sqrt{x}) \bigr) \right) \Bigg|_{x=0}. \tag{1.6} \]
We also have a combinatorial description of $b_k'$.

Theorem 3.
\[ b_k' = (-1)^{k(k+1)/2} \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} \left( -\frac{1}{2} \right)^{m_0} \left( \prod_{i=1}^{k} \frac{1}{(2k - i + m_i)!} \right) \left( \prod_{1 \le i < j \le k} (m_j - m_i + i - j) \right), \tag{1.7} \]
where $P_O^{k+1}(2k)$ denotes the set of partitions $m = (m_0, \ldots, m_k)$ of $2k$ into $k+1$ nonnegative parts.

We have computed some values of $b_k$ and $b_k'$; these are tabulated at the end of the paper. Applying the random matrix theory philosophy suggests the conjecture:
Characteristic Polynomials Applied to Riemann Zeta Function
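Conjecture 1.

$$\frac{1}{T} \int_0^T |\zeta'(1/2+it)|^{2k} \, dt \sim a_k b_k (\log T)^{k^2+2k}, \tag{1.8}$$

and, similarly for Hardy's $Z$ function,

$$\frac{1}{T} \int_0^T |Z'(1/2+it)|^{2k} \, dt \sim a_k b_k' (\log T)^{k^2+2k}, \tag{1.9}$$

where $a_k$ is the arithmetic factor

$$a_k = \prod_p \left(1 - \frac{1}{p}\right)^{k^2} \sum_{m=0}^{\infty} \left( \frac{\Gamma(m+k)}{m!\,\Gamma(k)} \right)^{2} p^{-m}. \tag{1.10}$$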
Remarks. (1) The factor $a_k$ is the same arithmetic contribution that arises in the moments of the Riemann zeta function itself, see [KS] or [CFKRS]. For an explanation of why these moments factor, asymptotically, into the product of a contribution from the primes, $a_k$, and a coefficient calculated via random matrix theory, see [GHK].

(2) In this paper we are only concerned with the leading asymptotics of the moments of $\zeta'(1/2+it)$ and $Z'(1/2+it)$. Consequently, we use the $k$-fold integrals for moments given below in Lemma 3. If one wishes to study lower order terms one would need to use the full moment conjecture for $\zeta$ and $Z$ given in [CFKRS] as a $2k$-fold integral.

(3) Forrester and Witte have taken our Theorems 1-2 and managed to find an alternate expression for $b_k$ and $b_k'$ involving a Painlevé III equation, and also an expression involving a certain generalised hypergeometric function [FW, Sect. 5].

(4) In his PhD thesis, Chris Hughes gives a similar conjecture for a more general problem involving mixed moments [Hug, Conj. 6.1]. What is new in our paper are the formulas for $b_k$ and $b_k'$. For comparison, we state Hughes' formulation of the conjecture for Hardy's $Z$ function. Let

$$I(h,k) = \int_0^T |Z(t)|^{2k-2h} \, Z'(t)^{2h} \, dt.$$

Hughes conjectures that

$$I(h,k) \sim B(h,k) \, a_k f_k \, T (\log T)^{k^2+2h},$$

where $a_k$ is given above,

$$f_k = \prod_{j=0}^{k-1} \frac{j!}{(k+j)!},$$

and $B(h,k)$ is the constant that Hughes obtains from the analogous moments for $Z_A$ via random matrix theory:

$$B(h,k) = \lim_{\beta \to 0} \frac{1}{\beta^{2h}} \sum_{n=0}^{2h} (-1)^{n-h} \binom{2h}{n} e^{-n\beta/2} \det\{b_{i,j}\},$$

where

$$b_{i,j} = \sum_{m=0}^{2h} \frac{(2k-n+i-1)!}{(2k-n+i-1+m)!} \binom{i+k-n-1+m}{m} \binom{i+m-1}{j-1} \beta^{m}.$$
To get Hughes' conjecture in this form, see (6.60), (6.51), and (6.52) of [Hug] and replace $\beta$ by $\beta/(iN)$.

(5) Brezin and Hikami [BH] attempt to use a similar approach to obtain a theorem for moments of derivatives of characteristic polynomials, but there is an error in their paper.

(6) Numerically, we have observed that $b_k \sim 4^{-k} f_k$ as $k \to \infty$. Other than the power of 4, the r.h.s. is the constant that appears in the moments of characteristic polynomials [KS]:

$$\int_{U(N)} |\Lambda_A(1)|^{2k} \, dA_N \sim f_k N^{k^2}, \quad \text{as } N \to \infty.$$
A heuristic explanation is as follows. The large values of $|\Lambda_A'(1)|$ occur near the large values of $|\Lambda_A(1)|$, namely when all the eigenvalues are close to $-1$. The derivative of $\Lambda_A$ involves a sum of $N$ terms, each of which is missing, when the eigenvalues are close to $-1$, one factor of size roughly 2. A comparison with the $2k$-th moment of $|\Lambda_A|$ thus gives an extra $N^{2k}$ and a factor of $2^{-2k}$. We have not attempted to make this argument rigorous. We have also not attempted to show that $b_k$ and $b_k'$ are non-zero.

(7) The problem of moments of the derivative is related, through Jensen's formula, to the problem of zeros of the derivative. This approach requires knowledge of the complex moments of the derivative and we are only able to obtain integer moments. For characteristic polynomials one is interested in studying the radial distribution of the zeros of the derivative. Francesco Mezzadri has the best results in this direction [Mez]. On the number theory side, one is interested in the horizontal distribution of the zeros of $\zeta'$. Partial results have been obtained by Levinson and Montgomery [LM], Conrey and Ghosh [CG], Soundararajan [Sou], and Zhang [Zha].

2. Notation

If $A$ is an $N \times N$ matrix with complex entries $A = (a_{jk})$, we let $A^*$ be its conjugate transpose, i.e. $A^* = (b_{jk})$, where $b_{jk} = \overline{a_{kj}}$. $A$ is said to be unitary if $AA^* = I$. We let $U(N)$ denote the group of all $N \times N$ unitary matrices. This is a compact Lie group and has a Haar measure. All of the eigenvalues of $A \in U(N)$ have absolute value 1; we write them as

$$e^{i\theta_1}, e^{i\theta_2}, \ldots, e^{i\theta_N}. \tag{2.1}$$

The eigenvalues of $A^*$ are $e^{-i\theta_1}, \ldots, e^{-i\theta_N}$. Clearly, the determinant $\det A = \prod_{n=1}^{N} e^{i\theta_n}$ of a unitary matrix is a complex number with absolute value equal to 1. We are interested in computing various statistics about these eigenvalues. Consequently, we identify all matrices in $U(N)$ that have the same set of eigenvalues. The collection of matrices with the same set of eigenvalues constitutes a conjugacy class in
$U(N)$. Weyl's integration formula [Weyl, p. 197] gives a simple way to perform averages over $U(N)$ for functions $f$ that are constant on conjugacy classes. Weyl's formula asserts that for such an $f$,

$$\int_{U(N)} f(A) \, d\mathrm{Haar} = \int_{[0,2\pi]^N} f(\theta_1, \ldots, \theta_N) \, dA_N, \tag{2.2}$$

where

$$dA_N = \prod_{1 \le j < k \le N} \left| e^{i\theta_k} - e^{i\theta_j} \right|^2 \frac{d\theta_1 \cdots d\theta_N}{N!\,(2\pi)^N}. \tag{2.3}$$

The characteristic polynomial of a matrix $A$ is denoted $\Lambda_A(s)$ and is defined by

$$\Lambda_A(s) = \det(I - sA^*) = \prod_{n=1}^{N} (1 - s e^{-i\theta_n}). \tag{2.4}$$
The roots of $\Lambda_A(s)$ are the eigenvalues of $A$ and are on the unit circle. Notice that this definition of the characteristic polynomial differs slightly from the usual definition in that it has an extra factor of $\det(A^*)$. We regard $\Lambda_A(s)$ as an analogue of the Riemann zeta-function and this definition is chosen so as to resemble the Hadamard product of $\zeta$. The characteristic polynomial satisfies the functional equation

$$\Lambda_A(s) = (-s)^N \prod_{n=1}^{N} e^{-i\theta_n} \prod_{n=1}^{N} (1 - e^{i\theta_n}/s) \tag{2.5}$$

$$\phantom{\Lambda_A(s)} = (-1)^N \det(A^*) \, s^N \Lambda_{A^*}(1/s). \tag{2.6}$$
We define the $Z$-function by

$$Z_A(s) = e^{-\pi i N/2} \, e^{i \sum_{n=1}^{N} \theta_n/2} \, s^{-N/2} \Lambda_A(s); \tag{2.7}$$

here if $N$ is odd, we use the branch of the square-root function that is positive for positive real $s$. The functional equation for $Z$ is

$$Z_A(s) = (-1)^N Z_{A^*}(1/s). \tag{2.8}$$

Note that

$$\overline{Z_A(e^{i\theta})} = Z_A(e^{i\theta}) \tag{2.9}$$

so that $Z_A(e^{i\theta})$ is real when $\theta$ is real. We regard $Z_A(e^{i\theta})$ as an analogue of Hardy's function $Z(t)$.

We let $I_n$ be the usual modified Bessel function with power series expansion

$$I_n(x) = \left(\frac{x}{2}\right)^{n} \sum_{j=0}^{\infty} \frac{x^{2j}}{2^{2j} (n+j)! \, j!}. \tag{2.10}$$
The way that the $I$-Bessel function enters our calculation is through the following formula:

$$\frac{1}{2\pi i} \oint_{|z|=1} \frac{e^{Lz + t/z}}{z^{2k}} \, dz = \frac{L^{2k-1} I_{2k-1}(2\sqrt{Lt})}{(Lt)^{k-1/2}}. \tag{2.11}$$

This formula can be proven by comparing the coefficient of $z^{2k-1}$ in $e^{Lz+t/z}$ with the power series formula for $I_{2k-1}$.

We let $\Delta(z_1, \ldots, z_k)$ denote the Vandermonde determinant

$$\Delta(z_1, \ldots, z_k) = \det_{k\times k}\left( z_i^{\,j-1} \right). \tag{2.12}$$
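Formula (2.11) can also be checked numerically. The following sketch is an illustration only: the function names and the sample values of $L$, $t$, $k$ are arbitrary choices, not taken from the paper. It evaluates the contour integral with the trapezoid rule on the unit circle and compares it with the right-hand side, written via (2.10) as $\sum_{j\ge0} L^{2k-1+j} t^{j} / (j!\,(2k-1+j)!)$.

```python
import cmath
from math import factorial, pi


def contour_side(L, t, k, points=256):
    """Trapezoid rule for (1/(2 pi i)) * integral over |z|=1 of exp(Lz + t/z) / z^(2k) dz."""
    total = 0j
    for n in range(points):
        z = cmath.exp(2j * pi * n / points)
        # With z = e^{i theta}, dz = i z d(theta), so the integrand becomes
        # exp(Lz + t/z) * z^(1 - 2k) / (2 pi) d(theta).
        total += cmath.exp(L * z + t / z) * z ** (1 - 2 * k)
    return (total / points).real


def bessel_side(L, t, k, terms=60):
    """L^(2k-1) I_{2k-1}(2 sqrt(Lt)) / (Lt)^(k-1/2), summed from the series (2.10)."""
    return sum(
        L ** (2 * k - 1 + j) * t ** j / (factorial(j) * factorial(2 * k - 1 + j))
        for j in range(terms)
    )


# Arbitrary sample point; the two sides should agree to rounding error.
print(contour_side(1.5, 0.7, 2), bessel_side(1.5, 0.7, 2))
```

The trapezoid rule is used because it converges very quickly for analytic periodic integrands, so a few hundred sample points are ample at moderate $L$ and $t$.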
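We often omit the subscripts and write $\Delta(z)$ in place of $\Delta(z_1, \ldots, z_k)$. Also, we allow differential operators as the arguments, such as

$$\Delta\left(\frac{d}{dL}\right) = \Delta\left(\frac{d}{dL_1}, \ldots, \frac{d}{dL_k}\right) = \det_{k\times k}\left( \left(\frac{d}{dL_i}\right)^{j-1} \right). \tag{2.13}$$

The key fact about the Vandermonde is that

$$\Delta(z_1, \ldots, z_k) = \prod_{1\le i<j\le k} (z_j - z_i). \tag{2.14}$$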
We let

$$z(x) = \frac{1}{1 - e^{-x}} = \frac{1}{x} + O(1). \tag{2.15}$$
The function $z(x)$ plays the role for random matrix theory that $\zeta(1+x)$ plays in the theory of moments of the Riemann zeta-function. See for example pp. 371-372 of [CFKRS2].

We let $\Xi$ denote the subset of permutations $\sigma \in S_{2k}$ of $\{1, 2, \ldots, 2k\}$ for which

$$\sigma(1) < \sigma(2) < \cdots < \sigma(k) \tag{2.16}$$

and

$$\sigma(k+1) < \sigma(k+2) < \cdots < \sigma(2k). \tag{2.17}$$

We let $P_O^{k+1}(2k)$ be the set of partitions $m = (m_0, \ldots, m_k)$ of $2k$ into $k+1$ non-negative parts. This set arises from the multinomial expansion

$$(x_0 + x_1 + \cdots + x_k)^{2k} = \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} x_0^{m_0} \cdots x_k^{m_k}, \tag{2.18}$$

where

$$\binom{2k}{m} = \frac{(2k)!}{m_0! \cdots m_k!}. \tag{2.19}$$
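As a small illustration of the index set in (2.18), the sketch below enumerates the ordered tuples $(m_0, \ldots, m_k)$ of non-negative integers summing to $2k$ and checks the identity obtained from (2.18) by setting every $x_i = 1$. The helper names are hypothetical and chosen for this example only.

```python
from math import factorial


def weak_compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in weak_compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial(m):
    """Multinomial coefficient (m_0 + ... + m_k)! / (m_0! ... m_k!) as in (2.19)."""
    value = factorial(sum(m))
    for part in m:
        value //= factorial(part)  # each step of the division is exact
    return value


k = 2
index_set = list(weak_compositions(2 * k, k + 1))
# Setting x_0 = ... = x_k = 1 in (2.18) gives (k+1)^(2k) on the left-hand side.
print(len(index_set), sum(multinomial(m) for m in index_set), (k + 1) ** (2 * k))
```

Note that the tuples are ordered, so $(0, 1, 3)$ and $(3, 1, 0)$ count as different elements, which is what the expansion (2.18) requires.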
3. Lemmas

The main tool in proving Theorems 1-2 is to take formulas (Lemma 3) for moments of characteristic polynomials with shifts, differentiate these with respect to the shifts, and then set the shifts equal to zero. This gives $k$-fold contour integrals. To separate the integrals involved, we introduce extra parameters and differential operators to pull out a portion of these integrands.

Lemma 1. Assume that $\alpha_1, \ldots, \alpha_{2k}$ are distinct complex numbers. We have

$$\int_{U(N)} \prod_{j=1}^{k} \Lambda_A(e^{-\alpha_j}) \Lambda_{A^*}(e^{\alpha_{j+k}}) \, dA_N = \sum_{\sigma \in \Xi} e^{N \sum_{j=1}^{k} (\alpha_{\sigma(j)} - \alpha_j)} \prod_{1\le i,j\le k} z(\alpha_{\sigma(j)} - \alpha_{\sigma(k+i)}). \tag{3.1}$$

This is proven in Sect. 2 of [CFKRS2]. See formulas (2.5), (2.16), and (2.21) of that paper. The definition given there of the characteristic polynomial differs slightly from the one we use here, and that introduces some extra exponential factors in (2.21) of the aforementioned paper, and also necessitates replacing the $\alpha$'s by $-\alpha$'s. Since

$$Z_A(e^{-\alpha_j}) Z_{A^*}(e^{\alpha_{j+k}}) = (-1)^N e^{N(\alpha_j - \alpha_{j+k})/2} \Lambda_A(e^{-\alpha_j}) \Lambda_{A^*}(e^{\alpha_{j+k}}) \tag{3.2}$$

we can write a corresponding lemma for $Z$.

Lemma 2. Assume that $\alpha_1, \ldots, \alpha_{2k}$ are distinct complex numbers. Then

$$\int_{U(N)} \prod_{j=1}^{k} Z_A(e^{-\alpha_j}) Z_{A^*}(e^{\alpha_{j+k}}) \, dA_N = (-1)^{Nk} e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j} \sum_{\sigma \in \Xi} e^{N \sum_{j=1}^{k} \alpha_{\sigma(j)}} \prod_{1\le i,j\le k} z(\alpha_{\sigma(j)} - \alpha_{\sigma(k+i)}). \tag{3.3}$$
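We can express the sums in the last two lemmas as integrals. Thus we have

Lemma 3. Assume that all of the $\alpha_j$ are smaller than 1 in absolute value. Then

$$\int_{U(N)} \prod_{j=1}^{k} \Lambda_A(e^{-\alpha_j}) \Lambda_{A^*}(e^{\alpha_{j+k}}) \, dA_N = \frac{1}{k!(2\pi i)^k} \oint_{|w_i|=1} e^{N \sum_{j=1}^{k} (w_j - \alpha_j)} \prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} z(w_i - \alpha_j) \prod_{i \ne j} z(w_i - w_j)^{-1} \prod_{j=1}^{k} dw_j \tag{3.4}$$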
and

$$\int_{U(N)} \prod_{j=1}^{k} Z_A(e^{-\alpha_j}) Z_{A^*}(e^{\alpha_{j+k}}) \, dA_N = \frac{(-1)^{Nk} e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_i|=1} e^{N \sum_{j=1}^{k} w_j} \prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} z(w_i - \alpha_j) \prod_{i \ne j} z(w_i - w_j)^{-1} \prod_{j=1}^{k} dw_j. \tag{3.5}$$

In this lemma, and its corollary below, we do not require the $\alpha_j$'s to be distinct.

The proof of Lemma 3 is a straightforward evaluation of the residues in the integral in (3.4), arising from the factor $\prod z(w_i - \alpha_j)$, to obtain the $\binom{2k}{k}$ terms in (3.1). Each of the $k$ integrals in (3.4) results in a sum over $2k$ residues, but due to the factor $\prod_{i \ne j} z(w_i - w_j)^{-1}$, any one of the resulting terms is zero if the residues of two of the integrals, say $w_i$ and $w_j$, are evaluated at the same point $\alpha_\ell$.

Using the fact that $z(w) = 1/w + O(1)$ we easily deduce

Corollary 1. Suppose that $\alpha_j = \alpha_j(N)$ and $|\alpha_j| \ll 1/N$ for each $j$. Then

$$\int_{U(N)} \prod_{j=1}^{k} \Lambda_A(e^{-\alpha_j}) \Lambda_{A^*}(e^{\alpha_{j+k}}) \, dA_N = \frac{1}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} (w_j - \alpha_j)} \prod_{i \ne j} (w_i - w_j)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j + O(N^{k^2-1}) \tag{3.6}$$
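with an implicit constant independent of $N$; similarly,

$$\int_{U(N)} \prod_{j=1}^{k} Z_A(e^{-\alpha_j}) Z_{A^*}(e^{\alpha_{j+k}}) \, dA_N = \frac{(-1)^{Nk} e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} w_j} \prod_{i \ne j} (w_i - w_j)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j + O(N^{k^2-1}). \tag{3.7}$$

Lemma 4. Let $f$ be $k-1$ times differentiable, $k \ge 1$. Then

$$\Delta\left(\frac{d}{dL}\right) \left( \prod_{i=1}^{k} f(L_i) \right) = \det_{k\times k}\left( f^{(j-1)}(L_i) \right),$$

where by $\Delta(d/dL)$ we mean the differential operator

$$\Delta\left(\frac{d}{dL}\right) = \det_{k\times k}\left( \frac{d^{\,j-1}}{dL_i^{\,j-1}} \right) = \prod_{1\le i<j\le k} \left( \frac{d}{dL_j} - \frac{d}{dL_i} \right). \tag{3.8}$$

Proof. This follows using the definition of the Vandermonde determinant. Noticing that row $i$ of the matrix only involves $L_i$, we factor the product into the determinant.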
Lemma 5. Let $f$ be $2k-2$ times differentiable. Then

$$\Delta^2\left(\frac{d}{dL}\right) \left. \prod_{i=1}^{k} f(L_i) \right|_{L_i = L} = k! \det_{k\times k}\left( f^{(i+j-2)}(L) \right). \tag{3.9}$$

More generally, suppose that $g(L_1, \ldots, L_k) = \sum_{r=1}^{R} a_r \prod_{i=1}^{k} f_{r,i}(L_i)$ is a symmetric function of its $k$ variables. Then

$$\Delta^2\left(\frac{d}{dL}\right) \left. g(L_1, \ldots, L_k) \right|_{L_j = L} = k! \sum_{r=1}^{R} a_r \det_{k\times k}\left( f_{r,i}^{(i+j-2)}(L) \right). \tag{3.10}$$

Proof. Applying the Vandermonde a second time to Lemma 4 we get

$$\Delta\left(\frac{d}{dL}\right) \det_{k\times k}\left( f^{(j-1)}(L_i) \right). \tag{3.11}$$

Expand the determinant as a sum over all permutations $\mu$ of the numbers $1, 2, \ldots, k$:

$$\det_{k\times k}\left( f^{(j-1)}(L_i) \right) = \sum_{\mu} \mathrm{sgn}(\mu) \prod_{i=1}^{k} f^{(\mu_i - 1)}(L_i). \tag{3.12}$$

Apply Lemma 4 to find that a typical term above equals

$$\mathrm{sgn}(\mu) \det_{k\times k}\left( f^{(\mu_i + j - 2)}(L_i) \right). \tag{3.13}$$

Setting $L_i = L$ for $1 \le i \le k$, we may rearrange the rows so as to undo the permutation $\mu$. This introduces another $\mathrm{sgn}(\mu)$ in front of the determinant and gives

$$\det_{k\times k}\left( f^{(i+j-2)}(L) \right). \tag{3.14}$$

Since there are $k!$ permutations $\mu$, we get

$$k! \det_{k\times k}\left( f^{(i+j-2)}(L) \right). \tag{3.15}$$
The proof of the second part of the lemma is left to the reader.

Lemma 6. Suppose that $P$ and $Q$ are polynomials with $Q(w) = \prod_{j=1}^{2k} (w - \alpha_j)$ and $\max |\alpha_j| < c$. Then

$$\frac{1}{2\pi i} \oint_{|w|=c} \frac{e^{wL} P(w)}{w \, Q(w)} \, dw = P\left(\frac{d}{dL}\right) \int_{\sum_{j=1}^{2k} x_j \le L} e^{\sum_{j=1}^{2k} x_j \alpha_j} \prod_{j=1}^{2k} dx_j. \tag{3.16}$$

Proof. Since

$$P\left(\frac{d}{dL}\right) e^{wL} = e^{wL} P(w), \tag{3.17}$$

the derivatives can be pulled outside the integral immediately. With the Laplace transform pair $e^{x\alpha}$ and $\frac{1}{w - \alpha}$, related by

$$e^{x\alpha} = \frac{1}{2\pi i} \oint_{|w|=c} \frac{e^{wx}}{w - \alpha} \, dw, \tag{3.18}$$

we merely apply repeatedly the Laplace convolution formula, which for Laplace transform pairs $f_i$ and $\varphi_i$ states that

$$\frac{1}{2\pi i} \oint_{|s|=c} \varphi_1(s) \varphi_2(s) e^{sx} \, ds = \int_0^x f_1(y) f_2(x - y) \, dy, \tag{3.19}$$

to evaluate the inverse Laplace transform of the product

$$\frac{1}{w \prod_{j=1}^{2k} (w - \alpha_j)}.$$
x j ≤L
x1 . . . xn
2k
dx j =
j=1
L 2k+n . (2k + n)!
(3.20)
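This lemma can be proved in a straightforward manner by induction.

4. Proofs

We now give the proofs of our identities for the leading terms of the moments of the derivatives of $\Lambda$ and $Z$. We begin with the proof of Theorem 2 for $Z$ as it is slightly easier.

Proof of Theorem 2. A differentiated form of the second formula of Corollary 1 gives us

$$\prod_{j=1}^{2k} \frac{d}{d\alpha_j} \int_{U(N)} \prod_{h=1}^{k} Z_A(e^{-\alpha_h}) Z_{A^*}(e^{\alpha_{k+h}}) \, dA_N = (-1)^{\frac{k(k-1)}{2} + kN} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left[ \frac{e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} w_j} \, \Delta^2(w)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right] + O(N^{k^2+2k-1}), \tag{4.1}$$

provided that $\alpha_j = \alpha_j(N) \ll 1/N$. Notice that

$$\left. \frac{d}{d\alpha} Z_A(e^{-\alpha}) \right|_{\alpha=0} = - \left. \frac{d}{ds} Z_A(s) \right|_{s=1} = -Z_A'(1) \tag{4.2}$$

and

$$\left. \frac{d}{d\alpha} Z_{A^*}(e^{\alpha}) \right|_{\alpha=0} = Z_{A^*}'(1) = (-1)^N \overline{Z_A'(1)}. \tag{4.3}$$

So,

$$\int_{U(N)} |Z_A'(1)|^{2k} \, dA_N = (-1)^{\frac{k(k+1)}{2}} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left. \left[ \frac{e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} w_j} \, \Delta^2(w)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right] \right|_{\alpha=0} + O(N^{k^2+2k-1}). \tag{4.4}$$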
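The sign here arises as follows: the $(-1)^{kN}$ from (4.3) cancels the same factor in (4.1), we have a $(-1)^k$ from (4.2), and we pick up the $(-1)^{\frac{k(k-1)}{2}}$ in (4.1) through writing the factor $\prod_{i \ne j} (w_i - w_j)$ in (3.7) as $\Delta^2(w)$ above. To separate the integrals, we introduce extra parameters $L_i$ and move the Vandermonde polynomial outside the integral as a differential operator, getting

$$(-1)^{\frac{k(k+1)}{2}} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left. \left[ \frac{\Delta^2(d/dL) \, e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_j|=1} \frac{e^{\sum_{i=1}^{k} L_i w_i}}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right] \right|_{\alpha=0,\, L_i=N} + O(N^{k^2+2k-1}). \tag{4.5}$$

Next, we observe that

$$\left. \frac{d}{d\alpha} \frac{e^{-\frac{N}{2}\alpha}}{\prod_{1\le i\le k} (w_i - \alpha)} \right|_{\alpha=0} = \frac{1}{\prod_{i=1}^{k} w_i} \left( \sum_{j=1}^{k} \frac{1}{w_j} - \frac{N}{2} \right) \tag{4.6}$$

so that (4.5) equals, without the $O$ term,

$$(-1)^{\frac{k(k+1)}{2}} \frac{\Delta^2(d/dL)}{k!(2\pi i)^k} \left. \oint_{|w_j|=1} e^{\sum_{i=1}^{k} L_i w_i} \, \frac{\left( \sum_{j=1}^{k} \frac{1}{w_j} - \frac{N}{2} \right)^{2k}}{\prod_{i=1}^{k} w_i^{2k}} \prod_{j=1}^{k} dw_j \right|_{L_i=N}. \tag{4.7}$$

Introducing another auxiliary variable $t$, this can be expressed as

$$(-1)^{\frac{k(k+1)}{2}} \frac{\Delta^2(d/dL) \, (d/dt)^{2k}}{k!(2\pi i)^k} \left. \left[ e^{-Nt/2} \oint_{|w_j|=1} \frac{e^{\sum_{i=1}^{k} (L_i w_i + t/w_i)}}{\prod_{i=1}^{k} w_i^{2k}} \prod_{j=1}^{k} dw_j \right] \right|_{L_i=N,\, t=0}. \tag{4.8}$$

This allows us to separate the integrals and we get

$$(-1)^{\frac{k(k+1)}{2}} \frac{\Delta^2(d/dL) \, (d/dt)^{2k}}{k!} \left. \left[ e^{-Nt/2} \prod_{i=1}^{k} \left( \frac{1}{2\pi i} \oint_{|w|=1} \frac{e^{L_i w + t/w}}{w^{2k}} \, dw \right) \right] \right|_{L_i=N,\, t=0}. \tag{4.9}$$

The integral evaluates to

$$\frac{L_i^{2k-1} I_{2k-1}(2\sqrt{L_i t})}{(L_i t)^{k-1/2}} \tag{4.10}$$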
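as noted earlier. Thus,

$$\int_{U(N)} |Z_A'(1)|^{2k} \, dA_N = (-1)^{\frac{k(k+1)}{2}} \frac{\Delta^2(d/dL) \, (d/dt)^{2k}}{k!} \left. \left[ e^{-Nt/2} \prod_{i=1}^{k} \frac{L_i^{2k-1} I_{2k-1}(2\sqrt{L_i t})}{(L_i t)^{k-1/2}} \right] \right|_{L_i=N,\, t=0} + O(N^{k^2+2k-1}). \tag{4.11}$$

So, letting

$$f_t(L_i) = \frac{L_i^{2k-1} I_{2k-1}(2\sqrt{L_i t})}{(L_i t)^{k-1/2}}, \tag{4.12}$$

we have, by Lemma 5, that (4.11) equals

$$(-1)^{\frac{k(k+1)}{2}} \left(\frac{d}{dt}\right)^{2k} \left. \left[ e^{-Nt/2} \det_{k\times k}\left( f_t^{(i+j-2)}(N) \right) \right] \right|_{t=0} + O(N^{k^2+2k-1}). \tag{4.13}$$

Now we see, from (2.10), that

$$f_t(L) = \sum_{r=0}^{\infty} \frac{t^r L^{2k-1+r}}{r!\,(2k-1+r)!}, \tag{4.14}$$

so that if $\mu \le 2k-1$, then

$$f_t^{(\mu)}(L) = \sum_{r=0}^{\infty} \frac{t^r L^{2k-1-\mu+r}}{r!\,(2k-1-\mu+r)!} = \left(\frac{L}{t}\right)^{(2k-1-\mu)/2} I_{2k-1-\mu}(2\sqrt{Lt}). \tag{4.15}$$

Therefore, (4.13) equals

$$(-1)^{\frac{k(k+1)}{2}} \left(\frac{d}{dt}\right)^{2k} \left. \left[ e^{-Nt/2} \det_{k\times k}\left( \left(\frac{N}{t}\right)^{(2k+1-i-j)/2} I_{2k+1-i-j}(2\sqrt{Nt}) \right) \right] \right|_{t=0} + O(N^{k^2+2k-1}). \tag{4.16}$$

Clearly $\det_{k\times k}(a_{i,j}) = \det_{k\times k}(a_{k+1-i,k+1-j})$, therefore (4.16) can be written, without the remainder term, as

$$(-1)^{\frac{k(k+1)}{2}} \left(\frac{d}{dt}\right)^{2k} \left. \left[ e^{-Nt/2} \det_{k\times k}\left( \left(\frac{N}{t}\right)^{(i+j-1)/2} I_{i+j-1}(2\sqrt{Nt}) \right) \right] \right|_{t=0}. \tag{4.17}$$

If we substitute $x = Nt$, then $d/dt = N \, d/dx$ and we get

$$(-1)^{\frac{k(k+1)}{2}} N^{2k} \left(\frac{d}{dx}\right)^{2k} \left. \left[ e^{-x/2} \det_{k\times k}\left( \left(\frac{N^2}{x}\right)^{(i+j-1)/2} I_{i+j-1}(2\sqrt{x}) \right) \right] \right|_{x=0} = (-1)^{\frac{k(k+1)}{2}} N^{k^2+2k} \left(\frac{d}{dx}\right)^{2k} \left. \left[ e^{-x/2} x^{-k^2/2} \det_{k\times k}\left( I_{i+j-1}(2\sqrt{x}) \right) \right] \right|_{x=0}, \tag{4.18}$$

since $\det_{k\times k}(M^{i+j-1} a_{i,j}) = M^{k^2} \det_{k\times k}(a_{i,j})$, as is seen by factoring $M^{j}$ out of the $j$-th column and $M^{i-1}$ out of the $i$-th row. This completes the proof of Theorem 2.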
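Proof of Theorem 1. Turning to Theorem 1's proof, we begin as before, but with a differentiated form of the first formula of Corollary 1:

$$\prod_{j=1}^{2k} \frac{d}{d\alpha_j} \int_{U(N)} \prod_{h=1}^{k} \Lambda_A(e^{-\alpha_h}) \Lambda_{A^*}(e^{\alpha_{k+h}}) \, dA_N = (-1)^{\frac{k(k-1)}{2}} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left[ \frac{1}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} (w_j - \alpha_j)} \, \Delta^2(w)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right] + O(N^{k^2+2k-1}), \tag{4.19}$$

provided that $\alpha_j \ll 1/N$. Now

$$\left. \frac{d}{d\alpha} \Lambda_A(e^{-\alpha}) \right|_{\alpha=0} = - \left. \frac{d}{ds} \Lambda_A(s) \right|_{s=1} = -\Lambda_A'(1) \tag{4.20}$$

and

$$\left. \frac{d}{d\alpha} \Lambda_{A^*}(e^{\alpha}) \right|_{\alpha=0} = \Lambda_{A^*}'(1) = \overline{\Lambda_A'(1)}, \tag{4.21}$$

hence setting $\alpha = 0$, (4.19) becomes

$$\int_{U(N)} |\Lambda_A'(1)|^{2k} \, dA_N = (-1)^{\frac{k(k+1)}{2}} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left. \left[ \frac{1}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{j=1}^{k} (w_j - \alpha_j)} \, \Delta^2(w)}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right] \right|_{\alpha_j=0} + O(N^{k^2+2k-1}). \tag{4.22}$$

Introducing variables $L_i$ as before, the above equals, without the $O$ term,

$$(-1)^{\frac{k(k+1)}{2}} \frac{\prod_{j=1}^{2k} (d/d\alpha_j) \, \Delta^2(d/dL)}{k!(2\pi i)^k} \left. \oint_{|w_j|=1} \frac{e^{\sum_{i=1}^{k} (L_i w_i - N\alpha_i)}}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{j=1}^{k} dw_j \right|_{\alpha_j=0,\, L_i=N}. \tag{4.23}$$

Performing the differentiations with respect to the $\alpha_j$ leads us to

$$(-1)^{\frac{k(k+1)}{2}} \frac{\Delta^2(d/dL)}{k!(2\pi i)^k} \left. \oint_{|w_j|=1} e^{\sum_{i=1}^{k} L_i w_i} \, \frac{\left( \sum_{j=1}^{k} \frac{1}{w_j} - N \right)^{k} \left( \sum_{j=1}^{k} \frac{1}{w_j} \right)^{k}}{\prod_{i=1}^{k} w_i^{2k}} \prod_{j=1}^{k} dw_j \right|_{L_i=N}. \tag{4.24}$$

Now we write

$$\left( \sum_{j=1}^{k} \frac{1}{w_j} - N \right)^{k} \left( \sum_{j=1}^{k} \frac{1}{w_j} \right)^{k} = \left( \sum_{j=1}^{k} \frac{1}{w_j} - N \right)^{k} \left( \sum_{j=1}^{k} \frac{1}{w_j} - N + N \right)^{k} = \sum_{h=0}^{k} \binom{k}{h} N^{k-h} \left( \sum_{j=1}^{k} \frac{1}{w_j} - N \right)^{k+h}. \tag{4.25}$$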
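Introducing the auxiliary variable $t$, (4.24) can be expressed as

$$(-1)^{\frac{k(k+1)}{2}} \sum_{h=0}^{k} \binom{k}{h} N^{k-h} \frac{\Delta^2(d/dL) \, (d/dt)^{k+h}}{k!(2\pi i)^k} \left. \left[ e^{-Nt} \oint_{|w_j|=1} \frac{e^{\sum_{i=1}^{k} (L_i w_i + t/w_i)}}{\prod_{i=1}^{k} w_i^{2k}} \prod_{j=1}^{k} dw_j \right] \right|_{L_i=N,\, t=0}$$

$$= (-1)^{\frac{k(k+1)}{2}} \sum_{h=0}^{k} \binom{k}{h} N^{k-h} \frac{\Delta^2(d/dL) \, (d/dt)^{k+h}}{k!} \left. \left[ e^{-Nt} \prod_{i=1}^{k} \left( \frac{1}{2\pi i} \oint_{|w|=1} \frac{e^{L_i w + t/w}}{w^{2k}} \, dw \right) \right] \right|_{L_i=N,\, t=0}. \tag{4.26}$$

Proceeding as before we arrive at

$$\int_{U(N)} |\Lambda_A'(1)|^{2k} \, dA_N = (-1)^{\frac{k(k+1)}{2}} N^{k^2+2k} \sum_{h=0}^{k} \binom{k}{h} \left(\frac{d}{dx}\right)^{k+h} \left. \left[ e^{-x} x^{-k^2/2} \det_{k\times k}\left( I_{i+j-1}(2\sqrt{x}) \right) \right] \right|_{x=0} + O(N^{k^2+2k-1}). \tag{4.27}$$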
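Proof of Theorem 3. We now give the proof of Theorem 3. We rewrite Eq. (4.1) as

$$\prod_{j=1}^{2k} \frac{d}{d\alpha_j} \int_{U(N)} \prod_{h=1}^{k} Z_A(e^{-\alpha_h}) Z_{A^*}(e^{\alpha_{k+h}}) \, dA_N = (-1)^{\frac{k(k-1)}{2} + kN} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left[ \frac{e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}}{k!(2\pi i)^k} \oint_{|w_i|=1} \frac{e^{N \sum_{i=1}^{k} w_i} \, \Delta^2(w) \prod_{i=1}^{k} w_i}{\prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} (w_i - \alpha_j)} \prod_{i=1}^{k} \frac{dw_i}{w_i} \right] + O(N^{k^2+2k-1}). \tag{4.28}$$

Introducing variables $L_i$ as before, we can rewrite the main term above as

$$\frac{(-1)^{\frac{k(k-1)}{2} + kN}}{k!} \prod_{j=1}^{2k} \frac{d}{d\alpha_j} \left[ e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j} \, \Delta^2\left(\frac{d}{dL}\right) \prod_{i=1}^{k} \frac{d}{dL_i} \prod_{i=1}^{k} \left( \frac{1}{2\pi i} \oint_{|w|=1} \frac{e^{L_i w}}{w \prod_{j=1}^{2k} (w - \alpha_j)} \, dw \right) \right]. \tag{4.29}$$

Now, by Lemma 6, the integral is

$$\int_{\sum_{j=1}^{2k} x_j \le L_i} e^{\sum_{j=1}^{2k} x_j \alpha_j} \prod_{1\le j\le 2k} dx_j. \tag{4.30}$$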
Letting the variables in the $i$-th integral be $x_{i,j}$ we may express the product of the $k$ integrals as

$$\int_{\sum_{j=1}^{2k} x_{1,j} \le L_1} \cdots \int_{\sum_{j=1}^{2k} x_{k,j} \le L_k} e^{\sum_{i=1}^{k} \sum_{j=1}^{2k} x_{i,j} \alpha_j} \prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} dx_{i,j}. \tag{4.31}$$

We incorporate the factor $e^{-\frac{N}{2} \sum_{j=1}^{2k} \alpha_j}$ into this product and have

$$\int_{\sum_{j=1}^{2k} x_{1,j} \le L_1} \cdots \int_{\sum_{j=1}^{2k} x_{k,j} \le L_k} e^{\sum_{j=1}^{2k} \alpha_j \left( \sum_{i=1}^{k} x_{i,j} - N/2 \right)} \prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} dx_{i,j}. \tag{4.32}$$

We differentiate this product of integrals with respect to each $\alpha_j$ and set each $\alpha_j$ equal to 0, yielding

$$\int_{\sum_{j=1}^{2k} x_{1,j} \le L_1} \cdots \int_{\sum_{j=1}^{2k} x_{k,j} \le L_k} \prod_{j=1}^{2k} \left( \sum_{i=1}^{k} x_{i,j} - \frac{N}{2} \right) \prod_{\substack{1\le i\le k \\ 1\le j\le 2k}} dx_{i,j}. \tag{4.33}$$

We want to compute this integral by multiplying out the product and using Lemma 7. A good way to think about this is as follows. By Eq. (2.18)

$$(A_1 + \cdots + A_k - A)^{2k} = \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} (-A)^{m_0} A_1^{m_1} \cdots A_k^{m_k}. \tag{4.34}$$

When we multiply out the product we will have a sum of $(k+1)^{2k}$ terms, each term being a product of some number of factors $(-N/2)$ and $x_{i,j}$. Let $m \in P_O^{k+1}(2k)$ represent a generic term in which $(-N/2)$ appears $m_0$ times, factors $x_{1,j}$ appear for $m_1$ values of $j$, factors $x_{2,j}$ for $m_2$ values of $j$, and so on. When we apply Lemma 7 to this term, the integration over the variables $x_{1,j}$ gives an answer that is solely determined by $m_1$, the number of different $x_{1,j}$ that appear in this term. Therefore, we find that the product of integrals evaluates as

$$\sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} \left(-\frac{N}{2}\right)^{m_0} \frac{L_1^{2k+m_1}}{(2k+m_1)!} \cdots \frac{L_k^{2k+m_k}}{(2k+m_k)!}. \tag{4.35}$$

We now have that the quantity in Eq. (4.29) is equal to

$$\frac{(-1)^{\frac{k(k-1)}{2} + kN}}{k!} \, \Delta^2\left(\frac{d}{dL}\right) \prod_{i=1}^{k} \frac{d}{dL_i} \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} \left(-\frac{N}{2}\right)^{m_0} \frac{L_1^{2k+m_1}}{(2k+m_1)!} \cdots \frac{L_k^{2k+m_k}}{(2k+m_k)!}. \tag{4.36}$$
Now we need to carry out the differentiations with respect to the $L_i$ and set the $L_i$ equal to $N$. We perform the differentiations $\prod_{i=1}^{k} d/dL_i$ and obtain

$$\frac{(-1)^{\frac{k(k-1)}{2} + kN}}{k!} \, \Delta^2\left(\frac{d}{dL}\right) \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} \left(-\frac{N}{2}\right)^{m_0} \frac{L_1^{2k-1+m_1}}{(2k-1+m_1)!} \cdots \frac{L_k^{2k-1+m_k}}{(2k-1+m_k)!}. \tag{4.37}$$

Now the sum over $m_1, \ldots, m_k$ is a symmetric function of the variables $L_i$. Therefore, we can apply the second part of Lemma 5 to obtain that the above, evaluated at $L_i = N$, is

$$\int_{U(N)} |Z_A'(1)|^{2k} \, dA_N = (-1)^{\frac{k(k+1)}{2}} \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} \left(-\frac{N}{2}\right)^{m_0} \det_{k\times k}\left( \frac{N^{2k+1+m_i-i-j}}{(2k+1+m_i-i-j)!} \right) + O(N^{k^2+2k-1}), \tag{4.38}$$

which we rewrite as

$$(-1)^{\frac{k(k+1)}{2}} N^{k^2+2k} \sum_{m \in P_O^{k+1}(2k)} \binom{2k}{m} (-2)^{-m_0} \det_{k\times k}\left( \frac{1}{(2k+1+m_i-i-j)!} \right) + O(N^{k^2+2k-1}). \tag{4.39}$$
Here the signs work out as in (4.4). We factor $1/(2k-i+m_i)!$ out of the $i$-th row. The remaining determinant has $i$-th row

$$1, \quad (2k-i+m_i), \quad (2k-i+m_i)(2k-i-1+m_i), \quad \ldots, \quad \prod_{j=1}^{k-1} (2k-i-j+1+m_i). \tag{4.40}$$

This determinant is a polynomial in the $m_i$ of degree $0 + 1 + \cdots + (k-1) = k(k-1)/2$ which vanishes whenever $m_j - m_i = j - i$; moreover the part of it with degree $k(k-1)/2$ is precisely $\Delta(m_1, \ldots, m_k) = \prod_{1\le i<j\le k} (m_j - m_i)$. Consequently the determinant evaluates to

$$\prod_{1\le i<j\le k} (m_j - m_i - j + i). \tag{4.41}$$

This concludes the evaluation of $b_k'$.
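The combinatorial sum (1.7) is finite and can be evaluated directly in exact rational arithmetic. The following is a minimal sketch of such an evaluation; the function names are illustrative and this is not presented as the authors' program. It enumerates the tuples $m = (m_0, \ldots, m_k)$ of non-negative integers with $m_0 + \cdots + m_k = 2k$ and accumulates the summand of (1.7).

```python
from fractions import Fraction
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def b_prime_combinatorial(k):
    """Evaluate the right-hand side of (1.7) exactly."""
    total = Fraction(0)
    for m in compositions(2 * k, k + 1):  # m = (m_0, m_1, ..., m_k)
        multinomial = factorial(2 * k)
        for part in m:
            multinomial //= factorial(part)  # exact at every step
        term = Fraction(multinomial) * Fraction(-1, 2) ** m[0]
        for i in range(1, k + 1):
            term /= factorial(2 * k - i + m[i])
        for i in range(1, k + 1):
            for j in range(i + 1, k + 1):
                term *= m[j] - m[i] + i - j
        total += term
    return (-1) ** (k * (k + 1) // 2) * total


print(b_prime_combinatorial(1), b_prime_combinatorial(2))
```

For $k = 1$ and $k = 2$ this returns $1/12$ and $1/6720$, which match the first two tabulated values of $b_k'$. The number of tuples is $\binom{3k}{k}$, so the direct sum becomes expensive as $k$ grows.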
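5. Numerical Evaluation of $b_k$ and $b_k'$

We have the following values for $b_k$:

$b_1 = \dfrac{1}{3}$

$b_2 = \dfrac{61}{2^{5} \cdot 3^{2} \cdot 5 \cdot 7}$

$b_3 = \dfrac{277}{2^{7} \cdot 3^{4} \cdot 5^{2} \cdot 7^{2} \cdot 11}$

$b_4 = \dfrac{2275447}{2^{18} \cdot 3^{10} \cdot 5^{4} \cdot 7^{3} \cdot 11 \cdot 13}$

$b_5 = \dfrac{3700752773}{2^{26} \cdot 3^{14} \cdot 5^{6} \cdot 7^{4} \cdot 11^{2} \cdot 13^{2} \cdot 17 \cdot 19}$

$b_6 = \dfrac{3654712923689}{2^{39} \cdot 3^{19} \cdot 5^{9} \cdot 7^{6} \cdot 11^{3} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23}$

$b_7 = \dfrac{53 \cdot 13008618017 \cdot 143537}{2^{50} \cdot 3^{28} \cdot 5^{13} \cdot 7^{8} \cdot 11^{5} \cdot 13^{4} \cdot 17^{2} \cdot 19^{2} \cdot 23}$

$b_8 = \dfrac{41 \cdot 359 \cdot 5505609492791 \cdot 3637}{2^{68} \cdot 3^{35} \cdot 5^{16} \cdot 7^{11} \cdot 11^{6} \cdot 13^{5} \cdot 17^{3} \cdot 19^{2} \cdot 23 \cdot 29 \cdot 31}$

$b_9 = \dfrac{757 \cdot 45742439 \cdot 60588179 \cdot 13723}{2^{84} \cdot 3^{42} \cdot 5^{21} \cdot 7^{14} \cdot 11^{8} \cdot 13^{6} \cdot 17^{4} \cdot 19^{3} \cdot 23^{2} \cdot 29 \cdot 31}$

$b_{10} = \dfrac{652071900673 \cdot 241845775551409}{2^{105} \cdot 3^{55} \cdot 5^{25} \cdot 7^{17} \cdot 11^{10} \cdot 13^{8} \cdot 17^{5} \cdot 19^{4} \cdot 23^{3} \cdot 29 \cdot 37}$

$b_{11} = \dfrac{1318985497 \cdot 578601141598041214011811}{2^{121} \cdot 3^{64} \cdot 5^{31} \cdot 7^{19} \cdot 11^{12} \cdot 13^{9} \cdot 17^{7} \cdot 19^{6} \cdot 23^{4} \cdot 29^{2} \cdot 31^{2} \cdot 37 \cdot 41 \cdot 43}$

$b_{12} = \dfrac{113 \cdot 206489633386447920175141 \cdot 51839 \cdot 14831}{2^{150} \cdot 3^{75} \cdot 5^{37} \cdot 7^{23} \cdot 11^{15} \cdot 13^{12} \cdot 17^{7} \cdot 19^{7} \cdot 23^{5} \cdot 29^{3} \cdot 31^{2} \cdot 37 \cdot 41 \cdot 43 \cdot 47}$

$b_{13} = \dfrac{4670754069404622871904068067089635254838677}{2^{174} \cdot 3^{90} \cdot 5^{42} \cdot 7^{28} \cdot 11^{17} \cdot 13^{14} \cdot 17^{10} \cdot 19^{9} \cdot 23^{6} \cdot 29^{3} \cdot 31^{3} \cdot 37^{2} \cdot 41 \cdot 43 \cdot 47}$

$b_{14} = \dfrac{107 \cdot 194946046688455595346779341 \cdot 996075171809335069}{2^{203} \cdot 3^{103} \cdot 5^{50} \cdot 7^{31} \cdot 11^{20} \cdot 13^{17} \cdot 17^{12} \cdot 19^{10} \cdot 23^{7} \cdot 29^{4} \cdot 31^{4} \cdot 37^{2} \cdot 41 \cdot 43 \cdot 47 \cdot 53}$

$b_{15} = \dfrac{29547975377 \cdot 3981541 \cdot 1807995588661527603489333681461 \cdot 1584311}{2^{230} \cdot 3^{117} \cdot 5^{57} \cdot 7^{37} \cdot 11^{22} \cdot 13^{19} \cdot 17^{14} \cdot 19^{12} \cdot 23^{9} \cdot 29^{5} \cdot 31^{5} \cdot 37^{3} \cdot 41^{2} \cdot 43^{2} \cdot 47 \cdot 53 \cdot 59}$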
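The numerical observation $b_k \sim 4^{-k} f_k$ from Remark (6) can be illustrated with the first few tabulated values. The sketch below is only a rough check: the dictionary `tabulated_b` is a hand transcription of the first three entries of the table of $b_k$ above, and the names are chosen for this example.

```python
from fractions import Fraction
from math import factorial


def f(k):
    """f_k = prod_{j=0}^{k-1} j! / (k+j)!, the constant quoted in Remark (6)."""
    value = Fraction(1)
    for j in range(k):
        value *= Fraction(factorial(j), factorial(k + j))
    return value


# First three tabulated values of b_k, transcribed by hand from the table.
tabulated_b = {
    1: Fraction(1, 3),
    2: Fraction(61, 2**5 * 3**2 * 5 * 7),
    3: Fraction(277, 2**7 * 3**4 * 5**2 * 7**2 * 11),
}

# Ratio b_k / (4^{-k} f_k); Remark (6) suggests it tends to 1 as k grows.
ratios = {k: value / (f(k) / 4**k) for k, value in tabulated_b.items()}
for k, ratio in ratios.items():
    print(k, float(ratio))
```

For these three values the ratio decreases from $4/3$ towards 1, which is consistent with the remark but of course proves nothing about the limit.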
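We have the following values for $b_k'$:

$b_1' = \dfrac{1}{2^{2} \cdot 3}$

$b_2' = \dfrac{1}{2^{6} \cdot 3 \cdot 5 \cdot 7}$

$b_3' = \dfrac{1}{2^{12} \cdot 3^{2} \cdot 5^{2} \cdot 7^{2} \cdot 11}$

$b_4' = \dfrac{31}{2^{20} \cdot 3^{10} \cdot 5^{4} \cdot 7^{2} \cdot 11 \cdot 13}$

$b_5' = \dfrac{227}{2^{30} \cdot 3^{12} \cdot 5^{6} \cdot 7^{4} \cdot 11 \cdot 13^{2} \cdot 17 \cdot 19}$

$b_6' = \dfrac{67 \cdot 1999}{2^{42} \cdot 3^{19} \cdot 5^{9} \cdot 7^{6} \cdot 11^{3} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23}$

$b_7' = \dfrac{43 \cdot 46663}{2^{56} \cdot 3^{28} \cdot 5^{13} \cdot 7^{8} \cdot 11^{4} \cdot 13^{3} \cdot 17^{2} \cdot 19^{2} \cdot 23}$

$b_8' = \dfrac{46743947}{2^{72} \cdot 3^{34} \cdot 5^{16} \cdot 7^{11} \cdot 11^{6} \cdot 13^{4} \cdot 17^{3} \cdot 19^{2} \cdot 23 \cdot 29 \cdot 31}$

$b_9' = \dfrac{19583 \cdot 16249}{2^{90} \cdot 3^{42} \cdot 5^{21} \cdot 7^{14} \cdot 11^{8} \cdot 13^{6} \cdot 17^{3} \cdot 19^{3} \cdot 23^{2} \cdot 29 \cdot 31}$

$b_{10}' = \dfrac{3156627824489}{2^{110} \cdot 3^{55} \cdot 5^{25} \cdot 7^{17} \cdot 11^{10} \cdot 13^{8} \cdot 17^{5} \cdot 19^{4} \cdot 23^{3} \cdot 29 \cdot 31 \cdot 37}$

$b_{11}' = \dfrac{59 \cdot 11332613 \cdot 33391}{2^{132} \cdot 3^{63} \cdot 5^{31} \cdot 7^{18} \cdot 11^{12} \cdot 13^{10} \cdot 17^{5} \cdot 19^{5} \cdot 23^{4} \cdot 29^{2} \cdot 31^{2} \cdot 37 \cdot 41 \cdot 43}$

$b_{12}' = \dfrac{241 \cdot 251799899121593}{2^{156} \cdot 3^{75} \cdot 5^{37} \cdot 7^{23} \cdot 11^{15} \cdot 13^{12} \cdot 17^{8} \cdot 19^{7} \cdot 23^{4} \cdot 29^{3} \cdot 31^{2} \cdot 41 \cdot 43 \cdot 47}$

$b_{13}' = \dfrac{285533 \cdot 37408704134429}{2^{182} \cdot 3^{90} \cdot 5^{42} \cdot 7^{28} \cdot 11^{17} \cdot 13^{14} \cdot 17^{10} \cdot 19^{8} \cdot 23^{5} \cdot 29^{3} \cdot 31^{3} \cdot 37^{2} \cdot 41 \cdot 43 \cdot 47}$

$b_{14}' = \dfrac{197 \cdot 1462253323 \cdot 6616773091}{2^{210} \cdot 3^{100} \cdot 5^{50} \cdot 7^{31} \cdot 11^{20} \cdot 13^{17} \cdot 17^{12} \cdot 19^{10} \cdot 23^{7} \cdot 29^{4} \cdot 31^{4} \cdot 37^{2} \cdot 41 \cdot 43 \cdot 47 \cdot 53}$

$b_{15}' = \dfrac{1625537582517468726519545837}{2^{240} \cdot 3^{117} \cdot 5^{57} \cdot 7^{37} \cdot 11^{22} \cdot 13^{19} \cdot 17^{14} \cdot 19^{11} \cdot 23^{9} \cdot 29^{5} \cdot 31^{5} \cdot 37^{3} \cdot 41^{2} \cdot 43^{2} \cdot 47 \cdot 53 \cdot 59}$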
Acknowledgements. The authors are grateful to AIM and the Isaac Newton Institute for very generous support and hospitality. JBC was supported by the NSF, MOR by the NSF and NSERC, and NCS by an EPSRC Advanced Research Fellowship.
References

[BH] Brezin, E., Hikami, S.: Characteristic polynomials of random matrices at edge singularities. Phys. Rev. E 62(3), 3558–3567 (2000)
[CFKRS] Conrey, J.B., Farmer, D.W., Keating, J.P., Rubinstein, M.O., Snaith, N.C.: Integral moments of L-functions. Proc. London Math. Soc. 91, 33–104 (2005)
[CFKRS2] Conrey, J.B., Farmer, D.W., Keating, J.P., Rubinstein, M.O., Snaith, N.C.: Autocorrelation of random matrix polynomials. Commun. Math. Phys. 237(3), 365–395 (2003)
[CG] Conrey, J.B., Ghosh, A.: Zeros of derivatives of the Riemann zeta-function near the critical line. Analytic number theory (Allerton Park, IL, 1989), Progr. Math. 85, Boston, MA: Birkhäuser Boston, 1990, pp. 95–110
[FW] Forrester, P.J., Witte, N.S.: Boundary conditions associated with the Painlevé III and V evaluations of some random matrix averages. http://arXiv.org/list/math.CA/0512142
[GHK] Gonek, S.M., Hughes, C.P., Keating, J.P.: hybrid Euler-Hadamard product formula for the Riemann zeta function. http://arXiv.org/list/math.NT/0511182, 2005
[Hug] Hughes, C.P.: On the characteristic polynomial of a random unitary matrix and the Riemann zeta function. PhD thesis, University of Bristol, 2001
[KS] Keating, J.P., Snaith, N.C.: Random matrix theory and ζ(1/2 + it). Commun. Math. Phys. 214(1), 57–89 (2000)
[Lev] Levinson, N.: More than one third of zeros of Riemann's zeta-function are on σ = 1/2. Adv. Math. 13, 383–436 (1974)
[LM] Levinson, N., Montgomery, H.L.: Zeros of the derivatives of the Riemann zeta-function. Acta Math. 133, 49–65 (1974)
[Mez] Mezzadri, F.: Random matrix theory and the zeros of ζ′(s). J. Phys. A 36(12), 2945–2962 (2003)
[Sou] Soundararajan, K.: The horizontal distribution of zeros of ζ′(s). Duke Math. J. 91(1), 33–59 (1998)
[Weyl] Weyl, H.: The Classical Compact Groups. Princeton, NJ: Princeton University Press, 1946
[Zha] Zhang, Y.: On the zeros of ζ′(s) near the critical line. Duke Math. J. 110(3), 555–572 (2001)

Communicated by P. Sarnak
Commun. Math. Phys. 267, 631–667 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0072-7
Communications in
Mathematical Physics
Continuous Phase Transitions for Dynamical Systems Omri Sarig Mathematics Department, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected] Received: 29 August 2005/ Accepted: 13 February 2006 Published online: 17 August 2006 – © Springer-Verlag 2006
Dedicated to Y. Pesin on the occasion of his 60th birthday

Abstract: We study the asymptotic expansion of the topological pressure of one-parameter families of potentials at a point of non-analyticity. The singularity is related qualitatively and quantitatively to non-Gaussian limit laws and to slow decay of correlations with respect to the equilibrium measure.

This work was partially supported by NSF grant DMS-0400687.

1. Introduction

This paper deals with the thermodynamic formalism of countable Markov shifts. It explores the stochastic implications of non-analyticity for the topological pressure functional, by pursuing an analogy with the theory of continuous phase transitions.

Continuous phase transitions. A continuous phase transition (sometimes also called high-order phase transition) is a situation where a thermodynamic quantity varies continuously but not analytically when some external parameter of the system is changed. The prototypical example is ferromagnetic material at zero external magnetic field: the magnetic moment per unit volume ('magnetization') decreases continuously as the material is heated, until it completely vanishes at a certain critical temperature $T_c$; the derivative of the magnetization with respect to temperature ('susceptibility') diverges at $T_c$. Systems undergoing a continuous phase transition develop local long-range order. This order can be described in terms of large fluctuations of thermodynamic quantities ('abnormal fluctuations'), and slow decay of correlations ('infinite correlation length'). See [St] for examples of continuous phase transitions, and [BDFN, Hi] for theoretical treatment.

Most thermodynamic quantities can be expressed as partial derivatives of the Helmholtz or Gibbs Free Energy $F$. Therefore, a continuous phase transition is sometimes defined as a situation where the free energy is $C^1$ but not real-analytic. Physicists have found empirically that the free energy $F(t)$ satisfies an asymptotic power law close to the critical point: $F(t) \approx C t^{\alpha} + \text{analytic terms}$ for $t = (T - T_c)/T_c$ (the 'reduced temperature'). The parameter $\alpha$ is called a critical exponent. It is not clear how to define $\approx$. In this work (as in [Hi]), we formalize $\approx$ by stipulating that $F(t) = \pm t^{\alpha} L(1/t) + \text{analytic terms}$, where $L: (c_0, \infty) \to \mathbb{R}$ is a positive (Borel) function s.t.

$$\frac{L(st)}{L(t)} \xrightarrow[t \to \infty]{} 1 \quad \text{for all } s > 0. \tag{1}$$

In this case $L(t)$ is called slowly varying (s.v.) at infinity, and $t^{\alpha} L(1/t)$ is said to be regularly varying (r.v.) with index $\alpha$, see Appendix A. This interpretation of $\approx$ is not completely standard, but it is reasonable because it is equivalent to saying that the singular part $F^*(t)$ of $F(t)$ scales asymptotically like a power: $\frac{F^*(st)}{F^*(t)} \xrightarrow[t \to 0^+]{} s^{\alpha}$ for all $s > 0$, compare with [BDFN].$^1$
Continuous phase transitions in dynamical systems. Let $T: X \to X$ be some continuous map on a complete metric separable space $X$ (in most examples treated below $X$ is not locally compact). The dynamical counterpart to the free energy is the topological pressure functional $\phi \mapsto P_{top}(\phi)$ defined for continuous $\phi: X \to \mathbb{R}$ s.t. $\sup \phi < \infty$ by

$$P_{top}(\phi) := \sup\left\{ h_\mu(T) + \int \phi \, d\mu : \mu \text{ is a Borel probability measure s.t. } \mu \circ T^{-1} = \mu \text{ and } \int \phi \, d\mu \ne -\infty \right\}.$$

Here and throughout $h_\mu(T)$ is the metric entropy of $\mu$. The analogy [Ru] becomes apparent if one thinks of the metric entropy $h_\mu(T)$ as entropy per particle, and of $\int \phi \, d\mu$ as $-\beta \times$ energy per particle with $\beta$ = inverse temperature (we describe an example in Appendix B). With this interpretation, maximizing $h_\mu(T) + \int \phi \, d\mu$ amounts to minimizing the Helmholtz free energy. The maximizing measure $\mu$ (if it exists) is called the equilibrium measure of $\phi$, and (if unique) is denoted by $\mu_\phi$.

Definition 1. Let $T: X \to X$ be a continuous map of a complete metric separable space $X$, and $\phi_t: X \to \mathbb{R}$ a family of continuous functions, $t \ge 0$.
(1) $\{\phi_t\}_{t \ge 0}$ is called regular, if $\exists \epsilon > 0$ s.t. $\phi_t$ has an equilibrium measure $\mu_t$ for $t \in [0, \epsilon)$.
(2) $\{\phi_t\}_{t \ge 0}$ is said to undergo a continuous phase transition at $0^+$, if it is regular, $\exists \epsilon$ s.t. $t \mapsto P_{top}(\phi_t)$ is $C^1$ on $[0, \epsilon)$, but $\nexists \epsilon > 0$ s.t. $t \mapsto P_{top}(\phi_t)$ extends to a real analytic function on $(-\epsilon, +\epsilon)$.
(3) $\{\phi_t\}_{t \ge 0}$ is said to exhibit a critical exponent $\alpha$ as $t \to 0^+$ if $P_{top}(\phi_t) = \pm t^{\alpha} L(1/t) + h(t)$ with $h(t)$ analytic at zero, $L(x)$ s.v. at infinity and either $\alpha \notin \mathbb{N}$, or $\alpha \in \mathbb{N}$ and $L(x) \not\to \text{const}$ as $x \to \infty$.

1 In fact, this interpretation seems to be implicit in many of the manipulations done in the physical theory of critical phenomena. For example, the standard derivation of the critical exponent identities is done by formal differentiation of a (postulated) asymptotic expansion of the free energy (see e.g. [BDFN] §1.5.1). However, if $\alpha > 0$, $f(t) \sim t^{\alpha} L(1/t)$ and $f'(t) \sim \alpha t^{\alpha-1} L(1/t)$, then $L(1/t)$ must be slowly varying, because of Karamata's Theorem (Appendix A).
Some people would also include in the definition of a continuous phase transition cases when Ptop (φt ) is equal to two different analytic functions on the two sides of zero, but is differentiable at zero. We do not treat such cases here. We focus on one–parameter families of the form φt := φ + tψ. In this case t → Ptop (φ + tψ) is convex and this imposes restrictions on the sign in front of t α L(1/t), see below. The dynamical systems we study are assumed to have countable Markov partitions. This is equivalent to the study of topological Markov shifts. + , T ) with a countable set of Topological Markov Shifts. A topological Markov shift (A + := {(x , x , . . .) ∈ S N∪{0} : states S and a transition matrix A = (ti j ) S×S is the set A 0 1 ∀i, txi xi+1 = 1} together with the map (T x)i = xi+1 . A word (w1 , . . . , wn ) ∈ S n is called admissible if twi wi+1 = 1 for all i. A topological Markov shift is called topologically mixing if for every a, b ∈ S there are admissible words beginning with a and ending at b of length n for all large n. A topological Markov shift is endowed with a metric d(x, y):= 2− min{k:xk = yk } . The resulting topology is generated by the basis of cylinders + [a0 , . . . , an−1 ] := {x ∈ A : xi = ai , 0 ≤ i ≤ n − 1}. + → R is called Hölder continuous if |φ(x) − φ(y)| ≤ Ad(x, y)κ A function φ: A for some constants A, κ > 0. This condition is too strong for us, because it implies boundedness. The following notions do not:
(1) $\phi$ is locally Hölder continuous if $|\phi(x) - \phi(y)| \le A\,d(x,y)^\kappa$ whenever $x_0 = y_0$;
(2) $\phi$ is weakly Hölder continuous if $|\phi(x) - \phi(y)| \le A\,d(x,y)^\kappa$ whenever $x_0 = y_0$, $x_1 = y_1$;
(3) $\phi$ has summable variations if $\sum_{k\ge 2} \mathrm{var}_k\phi < \infty$, where $\mathrm{var}_k\phi := \sup\{|\phi(x) - \phi(y)| : y_i = x_i\ (i = 0,\ldots,k-1)\}$.

Local Hölder continuity is stronger than weak Hölder continuity, and weak Hölder continuity is stronger than summable variations. We need the Variational Principle for countable Markov shifts [S1]: Suppose $T: X \to X$ is a topologically mixing topological Markov shift, $\phi: X \to \mathbb{R}$ has summable variations, and $\sup\phi < \infty$; then for any state $a$,
\[ P_{top}(\phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{\substack{T^n x = x \\ x_0 = a}} \exp \sum_{k=0}^{n-1} \phi(T^k x). \]
The limit on the right-hand side is called the Gurevich pressure of $\phi$ in honor of B. Gurevich who proved the variational principle in the case $\phi = \phi(x_0, x_1)$ [Gu1, Gu2].²

Program. In the case $|S| < \infty$, Ruelle [Ru] has established the following relation between the analytic properties of $t \mapsto P_{top}(\phi + t\psi)$ and the statistical properties of the equilibrium measure at $t = 0$:

Theorem 1 (Ruelle). Suppose $(\Sigma_A^+, T)$ is topologically mixing. If $|S| < \infty$ and $\phi, \psi: \Sigma_A^+ \to \mathbb{R}$ are Hölder continuous, then $t \mapsto P_{top}(\phi + t\psi)$ is real-analytic, and admits the expansion $P_{top}(\phi + t\psi) = P_{top}(\phi) + c_\psi t + \tfrac{1}{2}\sigma_\psi^2 t^2 + o(t^2)$, where $c_\psi = E_{\mu_\phi}[\psi]$ and
\[ \frac{1}{\sqrt{n}} \Big( \sum_{k=0}^{n-1} \psi\circ T^k - n c_\psi \Big) \xrightarrow[n\to\infty]{\ \mathrm{dist.}\ } N(0,\sigma_\psi^2) \quad \text{w.r.t. } \mu_\phi. \text{³} \]

² The variational principle is stated under a stronger condition on $\phi$ in [S1], but is true with practically the same proof under the assumptions stated above.
Ruelle has also proved exponential decay of correlations in this case [Ru]. Thus there can be no phase transitions for short-range interactions when $|S| < \infty$. Phase transitions are possible for short-range interactions when $|S| = \infty$. Indeed, it is well-known that long-range interactions on one-dimensional lattice-gas models may admit phase transitions, and there are cases when such models can be recast as short-range interactions on infinite state shifts. See Appendix B and [FF, Ho, Hi, Lo, PS, Wa1, Wa2, S2, S7, MU2, Y1].

Motivated by the physical analogy described above, we seek a generalization of Theorem 1 which relates singular behavior for $P_{top}(\phi + t\psi)$ ('critical exponents') to non-Gaussian distributional limit theorems for $\sum_{k=0}^{n-1}\psi\circ T^k$ and to sub-exponential rates of mixing ('abnormal fluctuations' and 'infinite correlation length'). Such a relation is mentioned in the physics literature, see [BDFN] for a renormalization group approach and Hilfer [Hi] for a probabilistic point of view very similar to the one we adopt below. Rigorous results are more difficult to find. See Sect. V.8 in [El] for a discussion of the Ising model.⁴

2. Statement of Results

Assumptions. Let $G_\alpha$ ($0 < \alpha \le 2$) be the probability distribution with Laplace transform $\int_{\mathbb{R}} e^{s\xi}\,dG_\alpha(\xi) = \exp[\mathrm{sgn}(\alpha-1)s^\alpha]$ when $\alpha \neq 1$ and $\int_{\mathbb{R}} e^{s\xi}\,dG_\alpha(\xi) = e^{-s}$ when $\alpha = 1$. Such distributions exist: when $\alpha \neq 1$, $G_\alpha$ is the standard spectrally negative stable distribution of index $\alpha$, and when $\alpha = 1$, $G_\alpha$ is the degenerate distribution concentrated at $\{-1\}$ (see [Z] for details).

Let $(\Sigma_A^+, T)$ be a topologically mixing topological Markov shift with a countable set of states $S$ and a transition matrix $A = (t_{ij})_{S\times S}$. Our results are simplest to state when $A$ satisfies the Big Images and Preimages property:
\[ \exists b_1,\ldots,b_N : \forall a \in S,\ \exists i, j \text{ s.t. } t_{b_i a}\, t_{a b_j} = 1. \tag{BIP} \]
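The (BIP) condition can be tested mechanically on examples. The following is a minimal sketch, not part of the original text: the helper name `bip_witnesses_ok`, the two example shifts, and the truncation to 50 states are all illustrative assumptions. A check on a finite truncation can only refute a proposed witness set $b_1,\ldots,b_N$; it cannot prove (BIP) for the full countable state set.

```python
from typing import Callable, Iterable


def bip_witnesses_ok(t: Callable[[int, int], bool],
                     states: Iterable[int],
                     witnesses: Iterable[int]) -> bool:
    """Check the (BIP) requirement on the given states only:
    every state a must have some witness b_i with t(b_i, a) allowed
    and some witness b_j with t(a, b_j) allowed."""
    ws = list(witnesses)
    return all(
        any(t(b, a) for b in ws) and any(t(a, b) for b in ws)
        for a in states
    )


# Hypothetical example 1: the full shift on {0, 1, 2, ...}; all transitions allowed.
def full_shift(i: int, j: int) -> bool:
    return True


# Hypothetical example 2: a renewal-type shift with 0 -> n for every n
# and n -> n - 1 for n >= 1. State a >= 1 has the single successor a - 1,
# so no finite witness set can serve all states.
def renewal_shift(i: int, j: int) -> bool:
    return i == 0 or j == i - 1


truncation = range(50)
full_ok = bip_witnesses_ok(full_shift, truncation, [0])
renewal_ok = bip_witnesses_ok(renewal_shift, truncation, range(5))
print(full_ok, renewal_ok)
```

In the second example the proposed witnesses $\{0,\ldots,4\}$ already fail at the state $a = 6$, whose only successor is $5$.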
This condition appears naturally in the theory of countable Markov shifts, as necessary and sufficient for the existence of Gibbs measures in the sense of Bowen [S3, MU1]. (Equilibrium measures may exist in the absence of (BIP), see [S4].) We can remove the BIP property, at the cost of additional assumptions on φ and ψ. Define for a state a ∈ S the function ra (x):= min{k : xk = a}, with the convention min ∅ = ∞. Let μφ be the equilibrium measure of φ (when it exists). We shall impose the following assumption on φ: There exists a ∈ S such that Eμφ [ra ] < ∞.
( )
We call a set $E \subseteq \Sigma_A^+$ bounded, if $E \subseteq \{x : x_0 \in S_0\}$ for some finite set $S_0 \subset S$. We shall consider functions $\psi$ for which
\[ \psi \in L^1(\mu_\phi), \text{ and } \psi \le E_{\mu_\phi}[\psi] \text{ outside a bounded set.} \]
( )
³ Here and throughout, $E$ denotes expectation, $N(0,\sigma^2)$ is the Gaussian distribution, and $\xrightarrow{\mathrm{dist.}}$ means convergence in distribution, see [F, GK].
⁴ The case of discontinuous ('first-order') phase transitions is more amenable to rigorous treatment. A discontinuous phase transition is characterized by the lack of differentiability of the free energy. The theory of large deviations can be used to interpret such a singularity as lack of exponential convergence in distribution of an associated macroscopic quantity to a unique thermodynamic value, see Ellis [El].
Critical exponents and abnormal fluctuations. Throughout this section let $(\Sigma_A^+, T)$ be a topologically mixing topological Markov shift, and suppose $\{\phi + t\psi\}_{t\ge 0}$ is a regular family, where $\phi, \psi$ are two locally Hölder continuous functions s.t. $\sup\phi < \infty$, $P_{top}(\phi) < \infty$ and $\sup\psi < \infty$.

Theorem 2. Assume ( ) and ( ). The following are equivalent for $1 < \alpha < 2$:
(1) Critical Exponent: $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + t^\alpha L(1/t)$ with $L(x)$ slowly varying at infinity.
(2) Non-Gaussian Fluctuations: $\frac{1}{B_n}\Big(\sum_{k=0}^{n-1}\psi\circ T^k - cn\Big) \xrightarrow[n\to\infty]{\mathrm{dist.}} G_\alpha$ w.r.t. $\mu_\phi$, where $c = E_{\mu_\phi}[\psi]$, $B_n = n^{1/\alpha}\ell(n)$ and $\ell(n)$ is slowly varying at infinity.

The following theorems treat the case $\alpha = 1, 2$.

Theorem 3. Assume ( ) and ( ).
(1) Taylor Expansion: $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + \frac{\sigma^2}{2}t^2 + o(t^2)$ with $\sigma \neq 0$ iff $c = E_{\mu_\phi}[\psi]$ and $\frac{1}{\sqrt{n}}\sum_{k=0}^{n-1}(\psi\circ T^k - c) \xrightarrow[n\to\infty]{\mathrm{dist.}} N(0,\sigma^2)$. In this case $\psi \in L^2(\mu_\phi)$.
(2) Critical Expansion: $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + \frac{1}{2}t^2 L(1/t)$ with $L(x)$ s.v. at infinity s.t. $L(x) \not\to \mathrm{const}$ iff $\frac{1}{B_n}\Big(\sum_{k=0}^{n-1}\psi\circ T^k - cn\Big) \xrightarrow[n\to\infty]{\mathrm{dist.}} N(0,1)$, where $c = E_{\mu_\phi}[\psi]$ and $B_n$ is r.v. of index $\frac{1}{2}$ s.t. $\frac{\sqrt{n}}{B_n} \to 0$. In this case $L(x) \to \infty$.
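Theorem 4.
(1) Taylor Expansion: Assume ( ). Then $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + o(t)$ with $c = E_{\mu_\phi}[\psi]$, and $\frac{1}{n}\sum_{k=0}^{n-1}\psi\circ T^k \xrightarrow[n\to\infty]{} E_{\mu_\phi}[\psi]$ $\mu_\phi$-a.s. and in distribution.
(2) Critical Expansion: Assume (BIP). Then $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + tL(1/t)$ with $|L(x)|$ s.v. at infinity and $L(x) \not\to \mathrm{const.}$ iff $\psi \notin L^1(\mu_\phi)$ and $\exists B_n$ r.v. of index 1 s.t. $\frac{n}{B_n} \to 0$ and $\frac{1}{B_n}\sum_{k=0}^{n-1}\psi\circ T^k \xrightarrow[n\to\infty]{\mathrm{dist.}} G_1$. In this case $L(x) \xrightarrow[x\to\infty]{} -\infty$.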
To understand the previous results, it is useful to think of $\psi_n := \sum_{k=0}^{n-1}\psi\circ T^k$ as a 'macroscopic' quantity with average (at equilibrium) $nE_{\mu_\phi}[\psi]$. In the absence of a phase transition, one expects the fluctuations of $\psi_n$ about its average to be of order $\sqrt{n}$. The previous results say that in the presence of a continuous phase transition with critical exponent $\alpha \le 2$, the fluctuations are of order $B_n$ with $B_n \gg \sqrt{n}$ (compare with [Hi]).

Remark. Theorems 2, 3, 4 remain true if the pair of conditions ( ) and ( ) is replaced by (BIP). Under this new set of assumptions:
(1) Theorem 2 is also valid for $0 < \alpha < 1$, except that in this case $E_{\mu_\phi}[\psi] = -\infty$, the slow variation of $L(x)$ should be replaced by the slow variation of $-L(x)$, and $c$ can be set to zero (because $ct = o(t^\alpha L(1/t))$, $cn = o(B_n)$);
(2) Case (1) of Theorem 3 holds iff $\psi \in L^2(\mu_\phi)$ and $\psi$ is not a measurable coboundary [AD], [Gou2], and Case (2) of Theorem 3 holds iff $\psi \notin L^2(\mu_\phi)$ (see Theorem 5 below);
(3) Case (1) of Theorem 4 holds iff $\psi \in L^1(\mu_\phi)$.
When do different systems exhibit the same asymptotic expansion? In order to answer this question, one needs to clarify what properties of $\psi$ and $\mu_\phi$ are captured by $\alpha$ and $L(x)$. The following is motivated by [AD, GK].

Theorem 5. Let $(\Sigma_A^+, T)$ be a topologically mixing topological Markov shift with the BIP property, and suppose $\phi, \psi$ are locally Hölder continuous s.t. $\sup\phi < \infty$, $P_{top}(\phi) < \infty$, $\sup\psi < \infty$ and s.t. $\phi$ has an equilibrium measure. The following are equivalent for $L(x)$ s.t. $|L(x)|$ is s.v. at infinity and $0 < \alpha \le 2$:
(1) Critical Exponent: $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + t^\alpha L(1/t)[1 + o(1)]$ as $t \to 0^+$;
(2) Domain of Attraction: One of the following holds as $x \to \infty$:
(a) $0 < \alpha < 2$, $\alpha \neq 1$ and $\mu_\phi[\psi < -x] \sim -\dfrac{x^{-\alpha}}{\Gamma(1-\alpha)}L(x)$;
(b) $\alpha = 1$ and either $\psi \in L^1$, and then $L(x) = E_{\mu_\phi}[\psi] - c + o(1)$, or $\psi \notin L^1$ and then $L(x) \sim E_{\mu_\phi}[\psi \vee (-x)]$;
(c) $\alpha = 2$ and either $\psi \in L^2$, and then $L(x) = \frac{1}{2}\sigma^2 + o(1)$ for some $\sigma \in \mathbb{R}$; or $\psi \notin L^2$, and then $L(x) \sim \frac{1}{2}E_{\mu_\phi}\big[\psi^2 1_{[|\psi| \le x]}\big]$.
Here $f(x) \sim g(x)$ means $\frac{f(x)}{g(x)} \xrightarrow[x\to\infty]{} 1$ and $a \vee b := \max\{a, b\}$.
Remark 1. The implication (2) ⇒ (1) follows from Theorem 2 and the work of Aaronson & Denker [AD] who showed that (2) implies a distributional limit theorem. We give an alternative proof below.

Remark 2. Theorem 5 enables one to construct an abundance of $\psi$'s for which $\{\phi + t\psi\}_{t\ge 0}$ has a critical exponent. These $\psi$'s are of course unbounded. Indeed, by [S3], in the BIP case $P_{top}(\phi + t\psi)$ is real-analytic whenever $P_{top}(\phi + t\psi) < \infty$ for all $t$ in some two-sided neighbourhood of zero (e.g. bounded $\psi$'s). For shifts without the BIP property, critical exponents are possible for $\psi$ bounded (see below).

Remark 3. The following generalization of Theorem 5 to general shifts is a direct consequence of the discussion at the beginning of Sect. 4 and Theorem 8 there. Let $(\Sigma_A^+, T)$ be a topologically mixing topological Markov shift, and suppose $\{\phi + t\psi\}_{t\ge 0}$ is a regular family, where $\phi, \psi$ are two locally Hölder continuous functions s.t. $P_{top}(\phi) < \infty$, $\sup\phi < \infty$, and $\sup\psi < \infty$. We assume ( ) (but do not assume ( ) or (BIP)). Let $A$ be the bounded set mentioned in ( ), and define
\[ \overline{\psi} := 1_A \cdot \sum_{k=0}^{r-1}\psi\circ T^k, \quad \text{where } r(x) := \inf\{n \ge 1 : T^n x \in A\}. \]
Then $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + t^\alpha L(1/t)[1 + o(1)]$ for $1 < \alpha \le 2$ with $L$ s.v. at infinity iff $\overline{\psi}$ satisfies the domain of attraction condition of Theorem 5 w.r.t. the normalized restriction of $\mu_\phi$ to $A$. The random variables $\overline{\psi}$ can be thought of as sums over 'weakly correlated blocks', see [Hi, FF] and Appendix B. This explains why in the non-BIP case even bounded $\psi$'s may satisfy non-Gaussian limit laws: $\overline{\psi}$ may have a heavy tail, even if $\psi$ does not, because $r$ may have a heavy tail (of course ( ) must fail in this case).

Returning to the case treated in Theorem 5, we note that the domain of attraction condition is phrased in terms of $\psi$ alone, and is not an asymptotic property of $\sum_{k=0}^{n-1}\psi\circ T^k$ as $n \to \infty$. Of course the thermodynamic limit is still present in the form of the equilibrium measure $\mu_\phi$. But in the BIP case the equilibrium measure satisfies certain a priori uniform bounds which allow one to deduce the following thermodynamic-limit-free necessary condition for the existence of a critical exponent. Choose some $x_a \in \Sigma_A^+$ s.t. $x_a$ starts at $a$ ($a \in S$).

Corollary 1. Under the assumptions of Theorem 5, $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + t^\alpha L(1/t)$ with $|L(x)|$ s.v. at infinity and $\alpha \in (0,2)\setminus\{1\}$ implies:
\[ \sum_{a \in S:\ \psi(x_a) < -x} e^{\phi(x_a)} \asymp \frac{1}{x^\alpha}|L(x)| \quad \text{as } x \to \infty. \]
Here and throughout $f(x) \asymp g(x)$ as $x \to \infty$ means: $\exists M$ such that $\frac{1}{M} \le \frac{f(x)}{g(x)} \le M$ for all $x$ large enough.
Critical exponents and slow decay of correlations. The covariance of two square integrable functions $f, g$ defined on a probability space $(X, \mathcal{B}, \mu)$ is $\mathrm{Cov}_\mu(f, g) := \int fg\,d\mu - \int f\,d\mu \int g\,d\mu$. The following result says that under certain assumptions, the existence of a critical exponent implies that the decay of correlations is sub-exponential, as expected from the analogy described in the introduction. We need a strengthening of ( ):
\[ \psi \in L^1 \text{ and } \exists\epsilon > 0 \text{ s.t. } \psi \le E_{\mu_\phi}[\psi] - \epsilon \text{ outside a bounded set.} \]
()

Theorem 6. Let $(\Sigma_A^+, T)$ be a topologically mixing topological Markov shift, and suppose $\{\phi + t\psi\}_{t\ge 0}$ is a regular family, where $\phi, \psi$ are locally Hölder continuous functions s.t. $\sup\phi < \infty$, $P_{top}(\phi) < \infty$, $\|\psi\|_\infty < \infty$, and $\psi$ satisfies (). If $P_{top}(\phi + t\psi) = P_{top}(\phi) + ct + t^\alpha L(1/t)$ with $1 < \alpha < 2$ and $L$ is s.v. at $\infty$, then
\[ \mathrm{Cov}_{\mu_\phi}(f, g\circ T^n) \asymp \frac{L(n)}{n^{\alpha-1}} \int f\,d\mu_\phi \int g\,d\mu_\phi \quad \text{as } n \to \infty \]
for all $f, g$ locally Hölder continuous with bounded support and positive expectation.
3. Proofs for Shifts Satisfying the BIP Property Standing Assumptions. In this section we give the proofs of Theorems 2, 3, 4 and 5 in the case of topologically mixing countable Markov shifts with (BIP). Our assumptions on φ and ψ are that they are locally Hölder continuous, bounded from above, and that Ptop (φ) < ∞. We do not assume ( ), ( ) or that {φ + tψ}t>0 is regular. We do assume that φ has an equilibrium measure.5 5 In fact, this assumption can be removed as well: locally Hölder potentials with finite pressure on shifts with (BIP) always have Gibbs measures [S3], and everything we say below holds with μφ = Gibbs measure of φ.
Our results remain unchanged if we add to $\phi$ a term of the form $h - h\circ T + c$ with $h$ bounded (locally) Hölder continuous and $c \in \mathbb{R}$. It is always possible, by means of such $h$ and $c$, to change $\phi$ so that
\[ P_{top}(\phi) = 0, \quad \sup\phi \le 0, \quad \text{and} \quad \sum_{Ty=x} e^{\phi(y)} = 1 \text{ for all } x. \]
This is Lemma 1 in [S2] (the boundedness of $h$ is proved for systems with the BIP property in [S3]). Henceforth, we assume that $\phi$ satisfies these additional assumptions.

Distributional Limit Theorems and Laplace Transforms. We shall study the distributional limit behaviour of $X_n := \frac{1}{B_n}\Big(\sum_{k=0}^{n-1}\psi\circ T^k - cn\Big)$ by analyzing the behaviour of its Laplace transform $E_{\mu_\phi}[e^{tX_n}]$:

Proposition 1. Let $X, X_n$ be random variables such that for some $\omega > 0$, $E(e^{tX_n})$, $E(e^{tX})$ are finite for all $0 \le t \le \omega$. The following are equivalent:
(1) $E(e^{tX_n}) \xrightarrow[n\to\infty]{} E(e^{tX})$ for all $0 \le t \le t_0$ and some $t_0 > 0$;
(2) $X_n \xrightarrow[n\to\infty]{\mathrm{dist.}} X$.

Proof. See e.g. Martin-Löf [ML].
Nagaev's Method [N]. This is a method for analyzing the Laplace (or Fourier) transform of the distribution of the sum of dependent identically distributed random variables. We need it to analyze the distribution of $\psi_n := \psi + \psi\circ T + \cdots + \psi\circ T^{n-1}$ with respect to $\mu_\phi$. The idea is to construct a family of operators $R_t$ such that $E_{\mu_\phi}[e^{t\psi_n}] = E_{\mu_\phi}[R_t^n 1]$ and use operator theory to analyze the right-hand side, see Nagaev [N] and Aaronson & Denker [AD].

In order to construct $R_t$, we recall some facts on the structure of equilibrium measures for countable Markov shifts. It was proved in [S4] and [BS] that the equilibrium measure $\mu_\phi$ must be of the form $h\,d\nu$, where $h$ is a positive continuous function and $\nu$ is a positive measure such that $R_0^*\nu = e^{P_{top}(\phi)}\nu$ and $R_0 h = e^{P_{top}(\phi)}h$, where $R_0$ is Ruelle's operator:
\[ (R_0 f)(x) = \sum_{Ty=x} e^{\phi(y)} f(y). \]
It is also known that $h$ is, up to a constant, the unique positive continuous function such that $R_0 h = e^{P_{top}(\phi)}h$. Since by our assumptions on $\phi$, $P_{top}(\phi) = 0$ and $R_0 1 \equiv \sum_{Ty=x} e^{\phi(y)} = 1$, we must have $h = \mathrm{const.}$, whence $\mu_\phi$ is a constant times $\nu$. It follows that $R_0^*\mu_\phi = \mu_\phi$. In particular, $E_{\mu_\phi}[R_0 F] = E_{\mu_\phi}[F]$ for every bounded continuous function $F$.

Now define the operators $R_t f := R_0[e^{t\psi}f]$. A calculation shows that $R_t^n 1 = R_0^n[e^{t\psi_n}1]$. Passing to expectations with respect to $\mu_\phi$, we see that $E_{\mu_\phi}[e^{t\psi_n}] = E_{\mu_\phi}[R_0^n(e^{t\psi_n})] = E_{\mu_\phi}[R_t^n 1]$ as required.
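The normalization $P_{top}(\phi) = 0$, $R_0 1 = 1$ used above can be illustrated on a toy case. The sketch below is an assumption-laden illustration, not the paper's construction: it takes a finite state set $\{0,1\}$ and a potential $\phi(x) = \phi(x_0,x_1)$, so that $R_0$ acts on functions of $x_0$ through the matrix $M_{aj} = t_{aj}e^{\phi(a,j)}$. The names `perron`, `M`, `W` and the numerical values of $\phi$ are hypothetical. With $\lambda$ and a positive $h$ satisfying $R_0 h = \lambda h$, replacing $\phi$ by $\phi + \log h - \log h\circ T - \log\lambda$ gives weights whose sums over preimages equal one.

```python
import math


def perron(M, iters=500):
    """Power iteration for the transpose action (R f)(j) = sum_a M[a][j] f(a).
    Returns (lam, h) with sum_a M[a][j] * h[a] ~= lam * h[j] and max(h) = 1."""
    n = len(M)
    h = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        new = [sum(M[a][j] * h[a] for a in range(n)) for j in range(n)]
        lam = max(new)
        h = [v / lam for v in new]
    return lam, h


# Golden-mean transition matrix (the word "11" is forbidden) and an
# arbitrary illustrative two-coordinate potential phi(a, j).
t = [[1, 1], [1, 0]]
phi = [[0.3, -0.2], [0.1, 0.0]]
M = [[t[a][j] * math.exp(phi[a][j]) for j in range(2)] for a in range(2)]

lam, h = perron(M)

# Normalized weights exp(phi + log h(a) - log h(j) - log lam).
W = [[M[a][j] * h[a] / (lam * h[j]) for j in range(2)] for a in range(2)]

# (R_0 1)(j) for the normalized potential: sum over preimages a of j.
column_sums = [sum(W[a][j] for a in range(2)) for j in range(2)]

# The leading eigenvalue of the normalized operator, i.e. exp(pressure).
lam_normalized, _ = perron(W)
print(column_sums, lam_normalized)
```

Both entries of `column_sums` and `lam_normalized` come out equal to one up to rounding, which is the finite-state analogue of $R_0 1 = 1$ and $P_{top}(\phi) = 0$.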
Next we seek a Banach space $L$ such that $R_t: L \to L$ has good spectral properties. Such a space was found by Aaronson and Denker [AD]. We review their construction. Recall the metric $d(x,y)$ on $\Sigma_A^+$, and fix some $\kappa > 0$ such that $\phi, \psi$ are both Hölder continuous with exponent $\kappa$ with respect to $d$. Define $L$ to be the space of functions $f: \Sigma_A^+ \to \mathbb{C}$ such that
\[ \|f\|_L := \|f\|_\infty + Df < \infty, \quad \text{where } Df := \sup\Big\{ \frac{|f(x) - f(y)|}{d(x,y)^\kappa} : x \neq y,\ x_0 = y_0 \Big\}. \]
This is a Banach space with respect to $\|\cdot\|_L$.

Proposition 2 (Aaronson & Denker). Suppose $\Sigma_A^+$ has the BIP property, and let $\phi, \psi$ be two locally Hölder continuous functions such that $\sup\phi \le P_{top}(\phi) = 0$, $R_0 1 = 1$, $\sup\psi < \infty$, and $\phi$ has an equilibrium measure $\mu_\phi$. Then:
(1) Boundedness: $R_t(L) \subseteq L$ and $R_t: L \to L$ are bounded linear operators for all $t \ge 0$.
(2) Spectral Gap: $R_0 = P + N$, where $PR_0 = R_0P$, $P^2 = P$, $NP = PN = 0$ and the spectral radius of $N$ is less than one. $P$ is given by $Pf := E_{\mu_\phi}[f]$.
(3) Continuity: $\|R_t - R_s\| = O\big(|t - s| + E_{\mu_\phi}|1 - e^{|t-s|\psi}|\big)$ for $0 \le t, s \le 1$, where $\|\cdot\|$ is the operator norm.
(4) Differentiability: If $E_{\mu_\phi}|\psi| < \infty$, then $t \mapsto R_t$ is continuously differentiable on $[0,\delta_0)$ for some $\delta_0 > 0$. The derivative is $R_t': f \mapsto R_t(\psi f)$ (right derivative is meant at 0).

Proof. The BIP property implies that any equilibrium measure $\mu_\phi$ of a locally Hölder continuous potential $\phi$ has the Gibbs property: $\exists G = G(\phi)$ such that
\[ G^{-1}\mu_\phi[x_0,\ldots,x_{n-1}] \le e^{\phi_n(x)} \le G\mu_\phi[x_0,\ldots,x_{n-1}] \quad (x \in \Sigma_A^+), \]
where $\phi_n := \phi + \phi\circ T + \cdots + \phi\circ T^{n-1}$ [S3]. Thus $e^{\phi(x)} \le G\mu_\phi[x_0]$. Fix some bounded Lipschitz function $F: (-\infty, \sup\psi] \to \mathbb{R}$ with Lipschitz constant $\mathrm{Lip}(F)$, and define the linear operator $R_F: f \mapsto R_0[F\circ\psi\cdot f]$. We need the following estimate: For some constant $M$ independent of $F$, and $D_a[F\circ\psi] := \sup\Big\{\frac{|F(\psi(x)) - F(\psi(y))|}{d(x,y)^\kappa} : x \neq y,\ x, y \in [a]\Big\}$,
\[ \|R_F\| \le M\Big( E_{\mu_\phi}[|F|\circ\psi] + \sum_{a\in S}\mu_\phi[a]D_a(F\circ\psi) \Big). \tag{2} \]
To prove this, we must estimate $\|R_F f\|_\infty$, $D[R_F f]$ for $f \in L$. Fix $x, y$ such that $x_0 = y_0$, and let $P(x_0) := \{a \in S : t_{ax_0} = 1\}$. Then
\[ |R_F f(x) - R_F f(y)| \le \sum_{a\in P(x_0)} e^{\phi(ax)}\,|1 - e^{\phi(ay)-\phi(ax)}|\cdot|F(\psi(ax))f(ax)| \]
\[ \qquad + \sum_{a\in P(x_0)} e^{\phi(ay)}\,|F(\psi(ax)) - F(\psi(ay))|\,|f(ax)| \]
\[ \qquad + \sum_{a\in P(x_0)} e^{\phi(ay)}\,|F(\psi(ay))|\,|f(ax) - f(ay)|. \]
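If $\phi(ax) \neq \phi(ay)$ then
\[ |1 - e^{\phi(ax)-\phi(ay)}| \le \frac{|1 - e^{\phi(ax)-\phi(ay)}|}{|\phi(ax) - \phi(ay)|}\, D\phi\, d(ax,ay)^\kappa \le \sup\Big\{ \frac{|1 - e^\delta|}{|\delta|} : |\delta| \le D\phi\, d(ax,ay)^\kappa \Big\}\, D\phi\, d(ax,ay)^\kappa < K d(x,y)^\kappa \]
with (for example) $K = D\phi\cdot\sup\big\{\frac{|1 - e^\delta|}{|\delta|} : |\delta| \le D\phi\big\}$. Re-define $K$ if necessary to guarantee $K > 1$. It is now straightforward to deduce, using the inequality $\sum_{a\in P(x_0)} e^{\phi(ax)} \le G$, that
\[ D(R_F f) \le 2K\|f\|_L\,\|R_0(|F|\circ\psi)\|_\infty + G\|f\|_L \sum_{a\in S}\mu_\phi[a]D_a(F\circ\psi). \]
It is also clear that $\|R_F f\|_\infty = \|R_0[F\circ\psi\cdot f]\|_\infty \le \|R_0(|F|\circ\psi)\|_\infty\|f\|_L$. Thus
\[ \|R_F\| \le 3K\|R_0(|F|\circ\psi)\|_\infty + G\sum_{a\in S}\mu_\phi[a]D_a(F\circ\psi). \]
We proceed to estimate $\|R_0(|F|\circ\psi)\|_\infty$:
\[ R_0(|F|\circ\psi)(x) \le G\sum_{a\in P(x_0)}\mu_\phi[a]\Big( \inf_{[a]}|F|\circ\psi + D_a(|F|\circ\psi) \Big) \le G\Big( E_{\mu_\phi}[|F|\circ\psi] + \sum_{a\in S}\mu_\phi[a]D_a(F\circ\psi) \Big). \]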
Recalling that $K > 1$, we obtain (2) with $M := 3KG$. Note that for every $a$, $D_a(|F|\circ\psi) \le \mathrm{Lip}(F)D\psi$, so
\[ \|R_F\| \le M\big( E_{\mu_\phi}[|F|\circ\psi] + \mathrm{Lip}(F)D\psi \big). \]
The boundedness of $R_t$ is the special case with $F(\xi) = e^{t\xi}$. The spectral gap of $R_0$ follows from the Ionescu-Tulcea Marinescu theorem and the mixing of $\mu_\phi$, as in [AD]. The modulus of continuity of $t \mapsto R_t$ is obtained by observing that $R_t - R_s = R_F$ with $F(\xi) = e^{t\xi} - e^{s\xi}$.

Differentiability is more difficult. We begin with the continuity of $t \mapsto R_t'$ (defined in part (4)). Write $(R_{t+h}' - R_t')f = R_{F_h}(f)$, where $F_h(\xi) = e^{t\xi}\xi(e^{h\xi} - 1)$. We fix $t > 0$ and show that the norm of this operator tends to zero as $h \to 0$, using (2):
(1) $E_{\mu_\phi}[|F_h\circ\psi|] \xrightarrow[h\to 0]{} 0$ because $F_h\circ\psi \xrightarrow[h\to 0]{} 0$ pointwise and $|F_h\circ\psi|$ is uniformly bounded for $|h| < \frac{t}{2}$.
(2) $\sum_a \mu_\phi[a]D_a(F_h\circ\psi) \xrightarrow[h\to 0]{} 0$: By the mean value theorem, $D_a(F_h\circ\psi) \le D\psi\cdot\sup\{|F_h'(z)| : z \in (\inf\psi[a], \sup\psi[a])\}$. The right-hand side converges to zero as $h \to 0$, and is uniformly bounded (as a function of $a$) for all $|h| < \frac{t}{2}$ (direct calculation). The result follows from the bounded convergence theorem.
It follows from (2) that $\|R_{F_h}\| \xrightarrow[h\to 0]{} 0$, whence the continuity of $R_t'$ for $t > 0$. The continuity from the right at $t = 0$ can be proved by repeating the previous argument with $t = 0$ and $h \to 0^+$. The only difference is that now instead of the bounded convergence theorem one has to use the dominated convergence theorem, the integrability of $|\psi|$, and the uniform boundedness of $e^{h\psi}$, $e^{t\psi}t\psi$ for $0 < h, t < 1$.

We prove the differentiability of $R_t$. Set $F_h(\xi) := \frac{e^{h\xi} - 1}{h} - \xi$. We have:
\[ \Big( \frac{R_{t+h} - R_t}{h} - R_t' \Big) f \equiv R_{F_h}(e^{t\psi}f). \]
It is easy to check that $e^{t\psi} \in L$ and that $\|fe^{t\psi}\|_L \le \|f\|_L\|e^{t\psi}\|_L$. Consequently
\[ \Big\| \frac{R_{t+h} - R_t}{h} - R_t' \Big\| \le \|e^{t\psi}\|_L\|R_{F_h}\| = O\Big( E_{\mu_\phi}[|F_h(\psi)|] + \sum_{a\in S}\mu_\phi[a]D_a(F_h\circ\psi) \Big). \]
By the mean value theorem, the following inequality holds on $[a]$: $D_a(F_h\circ\psi) \le D\psi\,|e^{h(\psi + \mathrm{var}_1\psi)} - 1|$. Consequently,
\[ \Big\| \frac{R_{t+h} - R_t}{h} - R_t' \Big\| = O\Big( E_{\mu_\phi}[|F_h(\psi)|] + E_{\mu_\phi}[|e^{h(\psi + \mathrm{var}_1\psi)} - 1|] \Big). \]
Now $|F_h|\circ\psi$ is dominated by a constant times $1 + |\psi|$, and $|e^{h(\psi + \mathrm{var}_1\psi)} - 1|$ is uniformly bounded for $0 < h < 1$. Therefore, if $\psi \in L^1$, then $\big\| \frac{R_{t+h} - R_t}{h} - R_t' \big\| \xrightarrow[h\to 0^+]{} 0$.
To see the limit as $h \to 0^-$ (when $t > 0$), write $\tau = t - |h|$, $\tau + |h| = t$ and repeat the previous argument with $\tau$ for $t$, using the continuity of $t \mapsto R_t'$ and the fact that the big Oh in the previous equation is uniform on a neighbourhood of $t$.

Spectral gaps are stable under small perturbations [Ka]. Therefore, there exists an open neighbourhood $U$ of $R_0$ in $\mathrm{Hom}(L, L)$ (the space of bounded linear operators on $L$) and analytic maps $\lambda: U \to \mathbb{C}$, $P, N: U \to \mathrm{Hom}(L, L)$ s.t. for all $R \in U$,
\[ R = \lambda(R)[P(R) + N(R)], \quad RP(R) = P(R)R = \lambda(R)P(R), \quad P(R)N(R) = N(R)P(R) = 0, \quad P(R)^2 = P(R), \quad \dim\mathrm{Im}[P(R)] = 1, \]
and such that the spectral radius of $N(R)$ is uniformly smaller than one.

Proposition 3. Under the standing assumptions of this section, there exist $\epsilon_0 > 0$ and $\epsilon(t) \xrightarrow[t\to 0^+]{} 0$ s.t. for all $0 \le t \le \epsilon_0$,
\[ E_{\mu_\phi}[e^{t\psi_n}] = [1 + O(\epsilon(t))]\exp[nP_{top}(\phi + t\psi)] \]
uniformly in $n$, where $\psi_n := \sum_{k=0}^{n-1}\psi\circ T^k$.

Proof. Fix $\epsilon_0 > 0$ so small that $0 \le t \le \epsilon_0$ implies that $R_t \in U$ and that the spectral radius of $N_t$ is less than $\theta < 1$. This is possible, because $t \mapsto R_t$ is continuous. For such $t$'s $\lambda(t) := \lambda(R_t)$, $P_t := P(R_t)$ and $N_t := N(R_t)$ make sense, and depend continuously on $t$. In particular, $E_{\mu_\phi}[P_t 1] \xrightarrow[t\to 0^+]{} E_{\mu_\phi}[P1] = E_{\mu_\phi}[1] = 1$. Making $\epsilon_0$ smaller, if necessary, we ensure that $E_{\mu_\phi}[P_t 1] \neq 0$ for all $0 < t < \epsilon_0$.
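Now define $h_t := P_t 1/E_{\mu_\phi}[P_t 1]$. Recalling that $R_0^*\mu_\phi = \mu_\phi$, we see that
\[ \lambda(t)^n = \lambda(t)^n\int h_t\,d\mu_\phi = \int R_t^n h_t\,d\mu_\phi = \int R_0^n[e^{t\psi_n}h_t]\,d\mu_\phi = E_{\mu_\phi}[e^{t\psi_n}] + \int e^{t\psi_n}(h_t - 1)\,d\mu_\phi. \]
Now $\big|\int e^{t\psi_n}(h_t - 1)\,d\mu_\phi\big| \le E_{\mu_\phi}[e^{t\psi_n}]\,\|h_t - 1\|_L$, so $\lambda(t)^n = [1 + O(\|h_t - 1\|_L)]E_{\mu_\phi}[e^{t\psi_n}]$. We show that $\|h_t - 1\|_L \xrightarrow[t\to 0^+]{} 0$. Clearly
\[ \|h_t - 1\|_L \le \frac{\|P_t 1 - 1\|_L + |1 - E_{\mu_\phi}(P_t 1)|}{E_{\mu_\phi}(P_t 1)} \le \frac{2\|1\|_L\,\|P(R_t) - P(R_0)\|}{E_{\mu_\phi}(P_t 1)}. \]
The spectral gap of $R_0$ implies that $\|P(R_t) - P(R_0)\| = O(\|R_t - R_0\|)$ as $t \to 0^+$, so by the previous proposition,
\[ \|h_t - 1\|_L = O\big( |t| + E_{\mu_\phi}(|1 - e^{|t|\psi}|) \big). \tag{3} \]
The bounded convergence theorem now shows that $\|h_t - 1\|_L \xrightarrow[t\to 0^+]{} 0$. We deduce: $\exists\epsilon(t) \xrightarrow[t\to 0^+]{} 0$ such that $E_{\mu_\phi}[e^{t\psi_n}] = [1 + O(\epsilon(t))]\lambda(t)^n$.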
We show that $\lambda(t) = \exp[P_{top}(\phi + t\psi)]$. Consider the indicator function $1_{[a]}$ of $[a]$ for some $a \in S$ s.t. $\mu_\phi[a] \neq 0$ (in fact every $a \in S$ has this property). Since $1_{[a]} \in L$, $P_t 1_{[a]} \xrightarrow[t\to 0^+]{L} P1_{[a]} = E_{\mu_\phi}[1_{[a]}] \neq 0$. Thus $P_t 1_{[a]} > 0$ for all $t$ small enough.

Fix some $x_a \in [a]$. The commutation relations between $R_t$, $P_t$ and $N_t$ imply that
\[ (R_t^n 1_{[a]})(x_a) = \lambda(t)^n[P_t 1_{[a]} + N_t^n 1_{[a]}](x_a) = \lambda(t)^n[P_t 1_{[a]}(x_a) + o(1)] \sim \lambda(t)^n P_t 1_{[a]}(x_a). \]
We see that for every $x_a \in [a]$,
\[ \log\lambda(t) = \lim_{n\to\infty}\frac{1}{n}\log(R_t^n 1_{[a]})(x_a) = \lim_{n\to\infty}\frac{1}{n}\log\sum_{T^n y = x_a} e^{\phi_n(y) + t\psi_n(y)}1_{[a]}(y) = \lim_{n\to\infty}\frac{1}{n}\log\sum_{T^n z = z} e^{\phi_n(z) + t\psi_n(z)}1_{[a]}(z), \]
where the last transition is because the local Hölder continuity of $\phi$ and $\psi$ allows us to change each $y \in T^{-n}(x_a) \cap [a]$ into $z(y) = (y_0,\ldots,y_{n-1}; y_0,\ldots,y_{n-1}; \ldots)$ without affecting the limit. By the variational principle of [S1], $\log\lambda(t) = P_{top}(\phi + t\psi)$.

Now that we have related $E[e^{t\psi_n}]$ to $P_{top}(\phi + t\psi)$ we can proceed as in the case of i.i.d.'s (see e.g. [F]). It is convenient to start with Theorem 4, Part 1.
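The identity $\log\lambda(t) = \lim_n \frac{1}{n}\log\sum_{T^n z = z} e^{\phi_n(z) + t\psi_n(z)}1_{[a]}(z)$ can be checked numerically on a finite toy model. The sketch below is only an illustration under assumptions that are not in the text: the golden-mean shift on $\{0,1\}$, $\phi = 0$, and $\psi(x) = -1$ if $x_0 = 1$ and $0$ otherwise. For such a one-coordinate $\psi$ the periodic-orbit sum with $z_0 = a$ equals the diagonal entry $(M_t^n)_{aa}$ of the matrix $M_t = \begin{pmatrix} 1 & 1 \\ e^{-t} & 0\end{pmatrix}$, whose leading eigenvalue is $\frac{1}{2}\big(1 + \sqrt{1 + 4e^{-t}}\big)$. The function names are hypothetical.

```python
import math


def matmul(A, B):
    """Plain matrix product for small square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def periodic_orbit_sum(M, a, n):
    """Sum of exp(phi_n + t*psi_n) over points z with T^n z = z and z_0 = a,
    which for one-step weights equals the diagonal entry (M^n)[a][a]."""
    size = len(M)
    P = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        P = matmul(P, M)
    return P[a][a]


def tilted_matrix(t):
    """Weights t_ij * exp(t * psi(i)) with psi(0) = 0, psi(1) = -1 (toy choice)."""
    return [[1.0, 1.0], [math.exp(-t), 0.0]]


def pressure_estimate(t, n):
    """(1/n) log of the periodic-orbit sum based at the state a = 0."""
    return math.log(periodic_orbit_sum(tilted_matrix(t), 0, n)) / n


def exact_pressure(t):
    """log of the leading root of lambda^2 - lambda - exp(-t) = 0."""
    return math.log((1.0 + math.sqrt(1.0 + 4.0 * math.exp(-t))) / 2.0)


for t in (0.0, 0.5):
    print(t, pressure_estimate(t, 400), exact_pressure(t))
```

The two columns agree up to an error of order $1/n$, which reflects the bounded prefactor $P_t 1_{[a]}(x_a)$ in the asymptotic relation above.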
Proof of Theorem 4, Part 1 for shifts satisfying (BIP). We continue to assume w.l.o.g. that $\phi \le P_{top}(\phi) = 0$, $\sum_{Ty=x} e^{\phi(y)} = 1$ and $R_0^*\mu_\phi = \mu_\phi$. Subtracting a suitable constant from $\psi$ if necessary, we also assume w.l.o.g. that $\sup\psi < 0$. Recall the notation $R_t$, $\lambda(t)$, $h_t$ from the proof of Proposition 3. We have:
\[ \lambda(t) - 1 = \int R_t h_t\,d\mu_\phi - 1 = \int R_0(e^{t\psi}h_t)\,d\mu_\phi - \int h_t\,d\mu_\phi = \int (e^{t\psi} - 1)h_t\,d\mu_\phi = E_{\mu_\phi}[e^{t\psi} - 1] + E_{\mu_\phi}[(e^{t\psi} - 1)(h_t - 1)]. \]
Now $|E_{\mu_\phi}[(e^{t\psi} - 1)(h_t - 1)]| \le |E_{\mu_\phi}[e^{t\psi} - 1]|\cdot\|h_t - 1\|_\infty = o(E_{\mu_\phi}[e^{t\psi} - 1])$, because $e^{t\psi} - 1$ doesn't change sign and because $\|h_t - 1\|_\infty \le \|h_t - 1\|_L \to 0$ (see the proof of Proposition 3). We conclude that $\lambda(t) - 1 = [1 + o(1)]E_{\mu_\phi}[e^{t\psi} - 1]$. We have seen in the proof of Proposition 3 that $\lambda(t) = \exp P_{top}(\phi + t\psi)$. Since $P_{top}(\phi + t\psi) = o(1)$ as $t \to 0^+$,
\[ P_{top}(\phi + t\psi) = [1 + o(1)]\big( e^{P_{top}(\phi + t\psi)} - 1 \big) = [1 + o(1)]E_{\mu_\phi}[e^{t\psi} - 1], \quad \text{as } t \to 0^+. \tag{4} \]
It follows that $P_{top}(\phi + t\psi) = ct + o(t)$ iff $E_{\mu_\phi}[e^{t\psi} - 1] = [ct + o(t)][1 + o(1)]$, as $t \to 0^+$, which (upon division by $t$ and some rearrangements) is equivalent to
\[ \lim_{t\to 0^+} E_{\mu_\phi}\Big[ \frac{e^{t\psi} - 1}{t\psi}\,\psi \Big] = c. \]
It is not difficult to see, using $\psi < 0$, that the limit is equal to $E_{\mu_\phi}[\psi]$. We conclude that $P_{top}(\phi + t\psi) = ct + o(t)$ iff $\psi \in L^1$ and $c = E_{\mu_\phi}[\psi]$. In this case $\frac{1}{n}\sum_{k=0}^{n-1}\psi\circ T^k \xrightarrow[n\to\infty]{} E_{\mu_\phi}[\psi]$ $\mu_\phi$-almost surely and in distribution, because of the ergodicity of $\mu_\phi$ [BS] and the Birkhoff ergodic theorem. This proves the 'Taylor expansion' case of Theorem 4 (in the extended form described by the remark after Theorem 4).

Proof of Theorem 2 for shifts satisfying (BIP). We keep the standing assumptions of this section. Assume first that $P_{top}(\phi + t\psi) = ct + t^\alpha L(1/t)$ with $|L(x)|$ slowly varying at infinity and $0 < \alpha < 2$, $\alpha \neq 1$ (we are also considering $0 < \alpha < 1$ because of the remark after Theorem 4). Since for every continuous function $f$ and constant $C$, $P_{top}(f + C) = P_{top}(f) + C$, we can normalize $\psi$ to make $c = 0$. The asymptotic relation becomes $P_{top}(\phi + t\psi) = t^\alpha L(1/t)$.

Construct $B_n \to \infty$ such that $n|L(B_n)|/B_n^\alpha \xrightarrow[n\to\infty]{} 1$. Here is how to do this: The function $f(x) := x^\alpha/|L(x)|$ is regularly varying at infinity with index $\alpha > 0$, and therefore admits a regularly varying asymptotic inverse $g(x)$ (see Appendix A). By definition, $(f\circ g)(x) \sim (g\circ f)(x) \sim x$ as $x \to \infty$, so $B_n := g(n)$ is as required (it tends to infinity, because it is regularly varying with index $1/\alpha > 0$).

Assume for the moment that the sign of $L(x)$ converges to $\mathrm{sgn}(\alpha - 1)$ as $x \to \infty$. Proposition 3 and the expansion of $P_{top}(\phi + t\psi)$ imply that
\[ E_{\mu_\phi}\big[e^{t\frac{\psi_n}{B_n}}\big] = \big[1 + O(\epsilon(\tfrac{t}{B_n}))\big]\exp\big[nP_{top}(\phi + \tfrac{t}{B_n}\psi)\big] = \big[1 + O(\epsilon(\tfrac{t}{B_n}))\big]\exp\Big[ t^\alpha\,\frac{nL(B_n)}{B_n^\alpha}\,\frac{L(B_n/t)}{L(B_n)} \Big] \xrightarrow[n\to\infty]{} \exp[\mathrm{sgn}(\alpha - 1)t^\alpha]. \]
The last expression is the Laplace transform of $G_\alpha$, and so, by Proposition 1 (whose conditions hold because $\sup\psi < \infty$), $\frac{1}{B_n}\psi_n \xrightarrow[n\to\infty]{\mathrm{dist.}} G_\alpha$. Finally we observe that when $\alpha > 1$, Theorem 4, Part 1 applies, and gives $E_{\mu_\phi}[\psi] = c$.

We now explain why $\mathrm{sgn}[L(x)] \xrightarrow[x\to\infty]{} \mathrm{sgn}(\alpha - 1)$. Recalling the definition of the topological pressure, we observe that $\varphi(t) := t^\alpha L(1/t) = P_{top}(\phi + t\psi)$ is convex on $[0,\infty)$. If $\alpha > 1$, then $\varphi(0) = 0$ and $\varphi_+'(0) = 0$ (the right-derivative at zero). Convexity forces $\varphi$ to be non-negative, whence $L(x) \ge 0$ for all $x > 0$. Since $L(x)$ is eventually non-zero (its absolute value is assumed to be slowly varying), it is eventually positive. If on the other hand $0 < \alpha < 1$, then for any $c_0$,
\[ P_{top}(\phi + t(\psi - c_0)) = P_{top}(\phi + t\psi) - c_0 t = t^\alpha L(1/t)\big[1 - c_0(1/t)^{\alpha-1}/L(1/t)\big] = t^\alpha L(1/t)[1 + o(1)]. \]
If $c_0 > \sup\psi$, then $P_{top}(\phi + t(\psi - c_0)) < P_{top}(\phi) = 0$. This forces $L(1/t)$ to be eventually negative. This completes the proof of (1) ⇒ (2).

We prove the other direction. Assume $\frac{1}{B_n}(\psi_n - cn) \xrightarrow[n\to\infty]{\mathrm{dist.}} G_\alpha$ for $B_n$ regularly varying with index $1/\alpha$ and $c \in \mathbb{R}$. Again, we can subtract a constant from $\psi$ to make $c = 0$. Our objective is then to show that $P_{top}(\phi + t\psi) = t^\alpha L(1/t)$ with $|L(x)|$ slowly varying. Proposition 1 says that $E_{\mu_\phi}\big[e^{t\frac{\psi_n}{B_n}}\big] \to \exp[\mathrm{sgn}(\alpha - 1)t^\alpha]$. Combining this with Proposition 3 gives, since $B_n \to \infty$,
\[ \lim_{n\to\infty} nP_{top}\big(\phi + \tfrac{t}{B_n}\psi\big) = \mathrm{sgn}(\alpha - 1)t^\alpha \tag{5} \]
on some one-sided right neighbourhood of 0. Applying the sufficient condition for regular variation of Appendix A with $f(x) := |P_{top}(\phi + \frac{1}{x}\psi)|$, $a_n = n$ and $b_n = B_n$, we conclude that $P_{top}(\phi + t\psi) = t^\rho L(1/t)$, with $|L(x)|$ slowly varying at infinity and some $\rho > 0$. By (5), $\rho = \alpha$.

Proof of Theorem 4, Part 2 for shifts satisfying (BIP). We keep the standing assumptions of this section. Suppose $P_{top}(\phi + t\psi) = ct + tL(1/t)$ with $|L(x)|$ slowly varying at infinity and $L(x) \not\to \mathrm{const}$. Changing $\psi$ by a constant, we arrange for $\sup\psi < 0$. Equation (4) holds, and leads to
\[ c + L(1/t) = [1 + o(1)]E_{\mu_\phi}\Big[ \frac{e^{t\psi} - 1}{t\psi}\,\psi \Big] \xrightarrow[t\to 0^+]{} E_{\mu_\phi}[\psi]. \]
Since $L(x) \not\to \mathrm{const.}$, we must have $E_{\mu_\phi}[\psi] = -\infty$, whence $L(x) \to -\infty$.

As in the proof of Theorem 2, we construct $B_n$ regularly varying of index 1 such that $n|L(B_n)|/B_n \to 1$, and observe using Proposition 3 that $E_{\mu_\phi}\big[e^{\frac{t}{B_n}\psi_n}\big] \xrightarrow[n\to\infty]{} e^{-t}$. The limit is the Laplace transform of $G_1$. It follows that $\frac{1}{B_n}\psi_n \xrightarrow[n\to\infty]{\mathrm{dist.}} G_1$. Note that $n/B_n \to 0$, because $|L(x)| \to \infty$. This proves (⇒). To see (⇐) assume that $\frac{\psi_n}{B_n} \xrightarrow[n\to\infty]{\mathrm{dist.}} G_1$ with $B_n$ r.v. of index one such that $n/B_n \to 0$.
Arguing as in the proof of Theorem 2, we deduce that $P_{top}(\phi + t\psi) = tL(1/t)$ with $|L(x)|$ slowly varying at infinity such that $\frac{n|L(B_n)|}{B_n} \xrightarrow[n\to\infty]{} 1$. Since $n/B_n \to 0$, $L(x) \not\to \mathrm{const}$.

Proof of Theorem 3 for shifts satisfying (BIP). We keep the standing assumptions of this section. Assume first that $\frac{1}{B_n}(\psi_n - cn) \xrightarrow[n\to\infty]{\mathrm{dist.}} N(0,1)$ for some $B_n$ regularly varying of index $\frac{1}{2}$ (this includes the case $B_n = \sigma\sqrt{n}$). Subtracting a suitable constant from $\psi$ we may assume w.l.o.g. that $c = 0$ (of course we can no longer assume that $\sup\psi < 0$).

The Laplace transform of $N(0,1)$ is $e^{\frac{1}{2}t^2}$. Arguing as in the proof of Theorem 2, we obtain (since $P_{top}(\phi) = 0$)
\[ P_{top}(\phi + t\psi) = \frac{1}{2}t^2 L(1/t) \tag{6} \]
with $L(x)$ s.v. at infinity such that $\frac{nL(B_n)}{B_n^2} \xrightarrow[n\to\infty]{} 1$. If $B_n \sim \sigma\sqrt{n}$, then $L(B_n) \xrightarrow[n\to\infty]{} \sigma^2$, and if $\sqrt{n}/B_n \to 0$, then $L(B_n) \xrightarrow[n\to\infty]{} \infty$. The same limits must hold for $L(x)$ as $x \to \infty$, because of the regular variation of $B_n$ and $L(x)$ (use the uniform convergence theorem for slow variation in Appendix A). This proves (⇐) in parts (1) and (2).

We prove (⇒). It is enough to treat the case
\[ P_{top}(\phi + t\psi) = \frac{1}{2}t^2 L(1/t) \]
with $L(x) = \sigma^2 + o(1)$ or with $L(x) \not\to \mathrm{const.}$, $L$ slowly varying (we can always reduce to this case by subtracting $c$ from $\psi$). Note that $P_{top}(\phi + t\psi) = o(t)$, whence by Theorem 4 for systems with BIP, $\psi \in L^1$ and $E_{\mu_\phi}[\psi] = 0$.

As before, the asymptotic expansion above implies the existence of $B_n$ regularly varying of order $\frac{1}{2}$ such that $\frac{1}{B_n}\psi_n \xrightarrow[n\to\infty]{\mathrm{dist.}} N(0,1)$, and $B_n$ is determined up to asymptotic equivalence by the condition $\frac{nL(B_n)}{B_n^2} \xrightarrow[n\to\infty]{} 1$. In the Taylor expansion case $L(x) = \sigma^2 + o(1)$, so $B_n \sim \sigma\sqrt{n}$. In the critical expansion case, $L(x) \not\to \mathrm{const}$. We shall see in the next section that this happens iff $\psi \notin L^2$ and $L(x) \to \infty$. In particular $\frac{\sqrt{n}}{B_n} \to 0$, and $L(x) \not\to \mathrm{const.}$ can only happen if $\psi \notin L^2$.
Proof of Theorem 5. We keep the standing assumptions of this section, and begin with the direction (1) ⇒ (2).

Case 1. $0 < \alpha < 1$. In this case (1) can be rewritten as $P_{top}(\phi + t\psi) = t^\alpha L(1/t)[1 + o(1)]$, because $P_{top}(\phi) = 0$ (standing assumptions) and $ct = o(t^\alpha L(1/t))$. We assume without loss of generality that $\sup\psi < 0$ (otherwise subtract a suitable constant $c_0$ from $\psi$ and pass from $L(x)$ to $L(x) - c_0 t^{1-\alpha} \sim L(x)$). We saw in the proof of Theorem 2 that $L(x)$ is eventually negative. Since $\sup\psi < 0$, (4) holds, and so
\[ 1 - E_{\mu_\phi}[e^{t\psi}] \sim t^\alpha|L(1/t)| \quad \text{as } t \to 0^+. \tag{7} \]
Write $1 - E_{\mu_\phi}[e^{t\psi}] = E_{\mu_\phi}[1 - e^{-t|\psi|}] = \int_0^\infty (1 - e^{-tx})\,dF(x)$, where $F(x) := \mu_\phi\big[|\psi| \le x\big]$ is the distribution function of $|\psi|$. Now⁶
\[ \int_0^\infty (1 - e^{-tx})\,dF(x) = t\int_0^\infty\!\!\int_0^x e^{-ty}\,dy\,dF(x) = t\int_0^\infty\!\!\int_0^\infty e^{-ty}1_{[y<x]}\,dF(x)\,dy = t\int_0^\infty e^{-ty}(1 - F(y))\,dy. \]

⁶ Here and throughout Lebesgue-Stieltjes integrals are used with the convention $\int_a^b = \int_{(a,b]}$.
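The integration-by-parts identity $\int_0^\infty (1 - e^{-tx})\,dF(x) = t\int_0^\infty e^{-ty}(1 - F(y))\,dy$ can be sanity-checked for a purely atomic distribution, where both sides are finite sums. The sketch below is illustrative only: the atoms $2^k$ with weights proportional to $2^{-k/2}$ are an arbitrary heavy-tailed-looking choice, and the names `lhs`, `rhs`, `atoms` are hypothetical.

```python
import math


def lhs(atoms, t):
    """Integral of (1 - exp(-t x)) dF(x) for a discrete law given as
    (value, probability) pairs."""
    return sum(p * (1.0 - math.exp(-t * x)) for x, p in atoms)


def rhs(atoms, t):
    """t * integral_0^infinity exp(-t y) (1 - F(y)) dy, computed exactly on
    each interval between consecutive atoms, where 1 - F is constant."""
    total, prev, tail = 0.0, 0.0, 1.0
    for x, p in sorted(atoms):
        # On [prev, x) the tail 1 - F(y) equals `tail`, and
        # t * integral_prev^x exp(-t y) dy = exp(-t prev) - exp(-t x).
        total += tail * (math.exp(-t * prev) - math.exp(-t * x))
        prev, tail = x, tail - p
    return total


# Illustrative discrete distribution on {1, 2, 4, ..., 2^19}.
weights = [2.0 ** (-k / 2) for k in range(20)]
norm = sum(weights)
atoms = [(2.0 ** k, w / norm) for k, w in enumerate(weights)]

for t in (0.001, 0.1, 1.0):
    print(t, lhs(atoms, t), rhs(atoms, t))
```

Swapping the order of summation in `rhs` gives exactly the sum in `lhs`, which is the discrete form of the Fubini step in the display above.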
Consequently,
\[ \int_0^\infty e^{-ty}\,dU(y) \sim t^{\alpha-1}|L(1/t)|, \quad \text{where } U(y) := \int_0^y [1 - F(x)]\,dx. \]
By Karamata's Tauberian theorem this is equivalent to $U(x) \sim \frac{x^{1-\alpha}}{\Gamma(2-\alpha)}|L(x)|$ as $x \to \infty$. The monotone density theorem of Appendix A applies; differentiating, we obtain $1 - F(x) \sim \frac{x^{-\alpha}}{\Gamma(1-\alpha)}|L(x)|$ as $x \to \infty$, which is Case (2)(a) in Theorem 5.
Case 2. $\alpha = 1$. According to Theorem 4 and the remark immediately following it, either $\psi \in L^1$ and then $L(x) = E_{\mu_\phi}[\psi] - c + o(1)$, or $\psi \notin L^1$ and then $L(x) \xrightarrow[x\to\infty]{} -\infty$. In the first case there is nothing further to prove, so we focus on the second. In this case the asymptotic expansion of the pressure becomes $P_{top}(\phi + t\psi) \sim tL(1/t)$, because $P_{top}(\phi) = 0$ and $ct = o(tL(1/t))$. As before, we may assume w.l.o.g. that $\sup\psi < 0$, and this gives us (4) with $\alpha = 1$. Again, Karamata's Tauberian Theorem leads to $U(x) = \int_0^x [1 - F(y)]\,dy \sim |L(x)|$ with $F(\cdot)$ the distribution function of $|\psi|$. We now observe that
\[ \int_0^x [1 - F(t)]\,dt = \int_0^x \Big( \int_t^\infty dF(y) \Big)dt = \int_0^\infty\!\!\int_0^\infty 1_{[t\le x]}1_{[t<y]}\,dt\,dF(y) = \int_0^\infty (x \wedge y)\,dF(y) = E_{\mu_\phi}\big[|\psi| \wedge x\big], \]
where $a \wedge b := \min\{a, b\}$. We obtain $E_{\mu_\phi}[|\psi| \wedge x] \sim |L(x)|$. Since $L(x)$ is eventually negative and $\sup\psi < 0$, Case (2)(b) follows.

Case 3. $1 < \alpha \le 2$. By Theorem 4, in this case $\psi \in L^1$ and $c = E_{\mu_\phi}[\psi]$. Assume w.l.o.g. that $E_{\mu_\phi}[\psi] = 0$. We are left with the expansion $P_{top}(\phi + t\psi) = t^\alpha L(1/t)[1 + o(1)]$. As in the proof of Theorem 2, $L(x)$ must be eventually positive. Proposition 2 says that $t \mapsto R_t$ is differentiable on $[0, \delta_0)$ for some $\delta_0 > 0$, that its derivative there is $R_t' : f \mapsto R_t(\psi f)$, and that this derivative converges to $R_0'$ as $t \to 0^+$. Make $\delta_0$ smaller, if necessary, to ensure that $P_t 1 > 0$ for $0 < t < \delta_0$. This is possible, because $P_t 1 \to P_0 1 \equiv 1$ uniformly. Since $P_t = P(R_t)$ and $P(\cdot)$ is analytic close to $R_0$, $t \mapsto P_t 1$ is differentiable on $[0, \delta_0)$ and its derivative is continuous from the right at zero. It follows that $t \mapsto h_t \equiv P_t 1/E_{\mu_\phi}[P_t 1]$ is differentiable on $[0, \delta_0)$ and that its derivative, which we denote by $h_t'$, satisfies $h_t' \xrightarrow[t\to0^+]{\mathcal{L}} h_0'$.

Differentiation of $R_t h_t = \lambda(t) h_t$ gives: $R_0[e^{t\psi}\psi h_t] + R_0[e^{t\psi} h_t'] = \lambda'(t) h_t + \lambda(t) h_t'$. Taking expectations on both sides, we obtain after some re-organization:
\[ E_{\mu_\phi}[e^{t\psi}\psi h_t] = \lambda'(t) + (\lambda(t) - 1)E_{\mu_\phi}[h_t'] + tE_{\mu_\phi}\!\left[\frac{1 - e^{t\psi}}{t\psi}\,\psi h_t'\right]. \]
Add $E_{\mu_\phi}[e^{t\psi}\psi(1 - h_t)]$ to both sides to get:
\[ E_{\mu_\phi}[e^{t\psi}\psi] = \lambda'(t) + (\lambda(t) - 1)E_{\mu_\phi}[h_t'] + tE_{\mu_\phi}\!\left[\frac{1 - e^{t\psi}}{t\psi}\,\psi h_t' + e^{t\psi}\psi\,\frac{1 - h_t}{t}\right]. \]
Since $\sup\psi < \infty$, $\left|\frac{1 - e^{t\psi}}{t\psi}\psi h_t' + e^{t\psi}\psi\frac{1 - h_t}{t}\right|$ is dominated by some constant times $|\psi|$. Since $\psi \in L^1$, $E_{\mu_\phi}\!\left[\frac{1 - e^{t\psi}}{t\psi}\psi h_t' + e^{t\psi}\psi\frac{1 - h_t}{t}\right] \xrightarrow[t\to0^+]{} -2E_{\mu_\phi}[\psi h_0']$. It follows that
\[ E_{\mu_\phi}[e^{t\psi}\psi] = \lambda'(t) + (\lambda(t) - 1)\left(E_{\mu_\phi}[h_0'] + o(1)\right) - 2tE_{\mu_\phi}[\psi h_0'] + o(t). \tag{8} \]
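The re-organization behind (8) can be spelled out as follows. This is only a sketch under two assumptions that come from the normalization used in this section and are not restated here: that $E_{\mu_\phi}[R_0 g] = E_{\mu_\phi}[g]$ for integrable $g$, and that $E_{\mu_\phi}[h_t] = 1$ (which follows from $h_t = P_t 1/E_{\mu_\phi}[P_t 1]$). Here $E$ abbreviates $E_{\mu_\phi}$.

```latex
\begin{align*}
% Apply E to R_0[e^{t\psi}\psi h_t] + R_0[e^{t\psi}h_t'] = \lambda'(t)h_t + \lambda(t)h_t',
% using E[R_0 g] = E[g] and E[h_t] = 1 (both assumptions of this sketch):
E[e^{t\psi}\psi h_t] + E[e^{t\psi}h_t'] &= \lambda'(t) + \lambda(t)E[h_t'], \\
% Move E[e^{t\psi}h_t'] to the right and split \lambda(t) = (\lambda(t)-1) + 1:
E[e^{t\psi}\psi h_t] &= \lambda'(t) + (\lambda(t)-1)E[h_t'] + E[(1-e^{t\psi})h_t'], \\
% Rewrite the last term so that a factor t appears (on the set where \psi \neq 0):
E[(1-e^{t\psi})h_t'] &= t\,E\!\left[\frac{1-e^{t\psi}}{t\psi}\,\psi h_t'\right].
\end{align*}
```

As $t \to 0^+$ one has $\frac{1 - e^{t\psi}}{t\psi} \to -1$ and $\frac{1 - h_t}{t} \to -h_0'$ pointwise, which is where the limit $-2E_{\mu_\phi}[\psi h_0']$ in the text comes from.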
Continuous Phase Transitions for Dynamical Systems
647
Recalling that $\lambda(t) = \exp P_{top}(\phi + t\psi) = \exp\left([1 + o(1)]t^\alpha L(1/t)\right)$, we see that $\lambda(t) - 1 \sim t^\alpha L(1/t)$. Now $\lambda(t) - 1$ is convex, because $P_{top}(\phi + t\psi)$ is convex. Therefore, its derivative is monotonic, and the Monotone Density Theorem (Appendix A) applies; differentiating, we get $\lambda'(t) \sim \alpha t^{\alpha-1}L(1/t)$. Plugging these relations into (8) gives
\[ E_{\mu_\phi}[e^{t\psi}\psi] = \begin{cases} \alpha t^{\alpha-1}L(1/t)(1 + o(1)) & 1 < \alpha < 2, \\ 2t\left(L(1/t) - E_{\mu_\phi}[\psi h_0']\right) + o(t) + o(tL(1/t)) & \alpha = 2. \end{cases} \tag{9} \]
When $\alpha = 2$, this relation implies (since $E_{\mu_\phi}[\psi] = 0$)
\[ (1 + o(1))L(1/t) - E_{\mu_\phi}[h_0'\psi] = \frac{1}{2t}E_{\mu_\phi}[e^{t\psi}\psi] + o(1) = \frac{1}{2}E_{\mu_\phi}\!\left[\frac{e^{t\psi} - 1}{t\psi}\cdot\psi^2\right] + o(1) \xrightarrow[t\to0^+]{} \frac{1}{2}E_{\mu_\phi}[\psi^2], \]
because $\frac{e^{t\psi} - 1}{t\psi}$ is positive and uniformly bounded on $[\psi \ne 0]$ when $0 < t < 1$. We see that $L(x) \to \mathrm{const.}$ or $L(x) \to \infty$ according to whether $\psi \in L^2$ or not.

Consider first the case $\alpha = 2$ and $\psi \in L^2$. In this case $L(x) \to \mathrm{const.}$ This constant is non-negative, otherwise $P_{top}(\phi + t\psi) = t^2 L(1/t)[1 + o(1)]$ is not convex (see the proof of Theorem 2). We denote it by $\frac{1}{2}\sigma^2$, and recognize the first half of (2)(c) in Theorem 5.

Next assume that $\alpha = 2$ and $\psi \notin L^2$, or that $1 < \alpha < 2$. In these cases, (9) becomes $E_{\mu_\phi}[e^{t\psi}\psi] \sim \alpha t^{\alpha-1}L(1/t)$ (when $\alpha = 2$ this is because $L(x) \to \infty$). We wish to differentiate this asymptotic relation. In order to do this we first need to check that $E_{\mu_\phi}[e^{t\psi}\psi]$ has a monotonic derivative on some interval $(0, \delta)$. To see this, we use the dominated convergence theorem to see that for every $t > 0$,
\[ \frac{d}{dt}E_{\mu_\phi}[e^{t\psi}\psi] = E_{\mu_\phi}\!\left[\lim_{h\to0}\frac{e^{h\psi} - 1}{h}\,e^{t\psi}\psi\right] = E_{\mu_\phi}[e^{t\psi}\psi^2]. \]
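This function is convex. Therefore, it is monotonic on $(0, \delta)$ for some $\delta > 0$, and the monotone density theorem is applicable. Differentiating, we have $E_{\mu_\phi}[e^{t\psi}\psi^2] \sim \alpha(\alpha - 1)t^{\alpha-2}L(1/t)$, as $t \to 0^+$. The right-hand side diverges at zero; it follows that $E_{\mu_\phi}[\psi^2] = \infty$ for $\alpha \in (1, 2)$. Since $\sup\psi < \infty$, $E_{\mu_\phi}[e^{t\psi}\psi^2] \sim E_{\mu_\phi}[e^{-t|\psi|}\psi^2]$ as $t \to 0^+$, and we obtain: $E_{\mu_\phi}[e^{-t|\psi|}|\psi|^2] \sim \alpha(\alpha - 1)t^{\alpha-2}L(1/t)$, as $t \to 0^+$. Setting $F(x) := \mu_\phi[|\psi| \le x]$, we rewrite this in the form
\[ \alpha(\alpha - 1)t^{\alpha-2}L(1/t) \sim \int_0^\infty e^{-tx}x^2\,dF(x) \equiv \int_0^\infty e^{-tx}\,d\!\left(\int_0^x y^2\,dF(y)\right). \]
By Karamata's Tauberian Theorem:
\[ \int_0^x y^2\,dF(y) \sim \frac{\alpha(\alpha - 1)}{\Gamma(3 - \alpha)}x^{2-\alpha}L(x), \quad \text{as } x \to \infty. \]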
When $\alpha = 2$ (and $\psi \notin L^2$), we obtain
\[ L(x) \sim \frac{1}{2}\int_0^x y^2\,dF(y) = \frac{1}{2}E_{\mu_\phi}[\psi^2 1_{[|\psi|\le x]}], \]
and we recognize Case (2)(c) of Theorem 5. When $1 < \alpha < 2$, Feller's theorem (Appendix A) gives $1 - F(x) \sim -\frac{1}{\Gamma(1-\alpha)}x^{-\alpha}L(x)$ as $x \to \infty$. Observing that $1 - F(x) = \mu_\phi[\psi < -x]$ for all $x > \sup\psi$, we recognize Case (2)(a) in Theorem 5.

We now assume Part (2) in Theorem 5, and prove Part (1). As explained before, this follows from Aaronson & Denker in [AD] and Theorems 2, 3, and 4, but we include the proof anyway, because it is much simpler than in the more general case they treated (more on this below).

Suppose first that $0 < \alpha < 1$, and assume w.l.o.g. that $\sup\psi < 0$. Reversing the steps of the proof of Case 1 above, we see that $\mu_\phi[\psi < -x] \sim \frac{x^{-\alpha}}{|\Gamma(1-\alpha)|}|L(x)|$ implies that $E_{\mu_\phi}[e^{t\psi} - 1] \sim t^\alpha |L(1/t)|$ as $t \to 0^+$. This implies the desired expansion of $P_{top}(\phi + t\psi)$ because of (4).

Now assume that $\alpha = 1$. If $\psi \in L^1$ and $L(x) = E_{\mu_\phi}[\psi] - c + o(1)$, then the expansion of $P_{top}(\phi + t\psi)$ follows from the version of Theorem 4 for shifts satisfying (BIP). If $\psi \notin L^1$ and $L(x) \sim E_{\mu_\phi}[\psi \vee (-x)]$ as $x \to \infty$, then necessarily $|L(x)| \to \infty$. This allows us to assume w.l.o.g. that $\sup\psi < 0$, because a subtraction of a constant from $\psi$ does not affect the statements $P_{top}(\phi + t\psi) = tL(1/t)[1 + o(1)]$ or $L(x) \sim E_{\mu_\phi}[\psi \vee (-x)]$. We can now reverse the steps of the proofs of Case 2, and then of Case 1, to obtain $\int_0^\infty (1 - e^{-tx})\,dF(x) \sim t|L(1/t)|$. This, by (4), implies the desired expansion of $P_{top}(\phi + t\psi)$.

Now suppose that $1 < \alpha < 2$ and $\mu_\phi[\psi < -x] \sim -\frac{x^{-\alpha}}{\Gamma(1-\alpha)}L(x)$. Since $\sup\psi < \infty$, this implies that $\psi \in L^1$, $\psi \notin L^2$, and that $\mu_\phi[|\psi| > x] \sim -\frac{x^{-\alpha}}{\Gamma(1-\alpha)}L(x)$. We subtract a constant from $\psi$ to ensure that $E_{\mu_\phi}[\psi] = 0$ (this does not affect the previous assertions). Reversing the asymptotic analysis in Case 3, we see that $E_{\mu_\phi}[e^{-t|\psi|}|\psi|^2] \sim \alpha(\alpha - 1)t^{\alpha-2}L(1/t)$, whence $E_{\mu_\phi}[e^{t\psi}\psi^2] \sim \alpha(\alpha - 1)t^{\alpha-2}L(1/t)$ (these quantities diverge because $\psi \notin L^2$, and differ by $O(1)$ because $\sup\psi < \infty$). Integrating this relation (using $E_{\mu_\phi}[\psi] = 0$) we deduce $E_{\mu_\phi}[e^{t\psi}\psi] \sim \alpha t^{\alpha-1}L(1/t)$ as $t \to 0^+$. By (8), $\lambda'(t) \sim \alpha t^{\alpha-1}L(1/t)$ (all terms on the right-hand side of (8) are $O(t)$ except $\lambda'(t)$). Integrating once more gives by Karamata's theorem $\lambda(t) - 1 \sim t^\alpha L(1/t)$. Since $\lambda(t) = \exp P_{top}(\phi + t\psi)$ and $P_{top}(\phi) = 0$, this implies $P_{top}(\phi + t\psi) = t^\alpha L(1/t)[1 + o(1)]$.

Suppose $\alpha = 2$, $\psi \notin L^2$, and $L(x) \sim \frac{1}{2}E_{\mu_\phi}[\psi^2 1_{[|\psi|\le x]}]$. By Karamata's Tauberian theorem, $\int_0^\infty e^{-tx}x^2\,dF(x) = 2[1 + o(1)]L(1/t)$ as $t \to 0^+$, where $F(x) = \mu_\phi[|\psi| \le x]$. Integrating both sides w.r.t. $t$ over $(t_0, \infty)$ gives
\[ \int_0^\infty e^{-t_0 x}x\,dF(x) = 2\int_{t_0}^\infty [1 + o(1)]L(1/t)\,dt \equiv 2\int_0^{1/t_0}\frac{[1 + o(1)]L(s)}{s^2}\,ds. \]
It follows that
\[ E_{\mu_\phi}|\psi| = \lim_{t_0\to0^+}\int_0^\infty e^{-t_0 x}x\,dF(x) = 2\int_0^\infty\frac{[1 + o(1)]L(s)}{s^2}\,ds < \infty, \]
where the last integral converges at infinity because of the slow variation of $L$. Now that we know that $\psi \in L^1$ we can assume w.l.o.g. that $E_{\mu_\phi}[\psi] = 0$ (the reader can check that $E_{\mu_\phi}[(\psi - c)^2 1_{[\psi\le x+c]}]$ is still asymptotic to $L(x)$). The reader may verify that $E_{\mu_\phi}[e^{t\psi}\psi^2] \sim E_{\mu_\phi}[e^{-t|\psi|}\psi^2]$ as $t \to 0^+$, using the assumptions $\sup\psi < \infty$ and $E_{\mu_\phi}[\psi^2] = \infty$. We have already seen that $E_{\mu_\phi}[e^{-t|\psi|}\psi^2] = 2[1 + o(1)]L(1/t)$ as $t \to 0^+$, because of Karamata's Tauberian theorem, and so $E_{\mu_\phi}[e^{t\psi}\psi^2] \sim 2L(1/t)$ as $t \to 0^+$. Integrating this gives (since $E_{\mu_\phi}[\psi] = 0$), $E_{\mu_\phi}[e^{t\psi}\psi] \sim 2tL(1/t)$ as $t \to 0^+$. We can now deduce the asymptotic expansion of $P_{top}(\phi + t\psi)$ from (8) as before.

It remains to treat the case $\alpha = 2$ and $\psi \in L^2$. Without loss of generality, $E_{\mu_\phi}[\psi] = 0$. We must prove that $P_{top}(\phi + t\psi) = \frac{\sigma^2}{2}t^2 + o(t^2)$ for some $\sigma \in \mathbb{R}$. Define $L(x)$ by the relation
\[ E_{\mu_\phi}[e^{t\psi}\psi] = 2t\left(L(1/t) - E_{\mu_\phi}[\psi h_0']\right). \]
By (8), $\lambda'(t) = 2t\left(L(1/t) + \frac{1 - \lambda(t)}{2t}\left(E_{\mu_\phi}[h_0'] + o(1)\right) + o(1)\right)$. Recalling that $\lambda(t) = \exp P_{top}(\phi + t\psi)$ and that $P_{top}(\phi + t\psi) = o(t)$ by Theorem 4 and the assumption $E_{\mu_\phi}[\psi] = 0$, we deduce that $\lambda'(t) = 2tL(1/t) + o(t)$. Next, observe that $L(x) \xrightarrow[x\to\infty]{} \frac{1}{2}E_{\mu_\phi}[\psi^2] + E_{\mu_\phi}[\psi h_0'] =: \frac{1}{2}\sigma_0$, because $\frac{1}{t}E_{\mu_\phi}[e^{t\psi}\psi] = E_{\mu_\phi}\!\left[\frac{e^{t\psi} - 1}{t\psi}\cdot\psi^2\right] \to E_{\mu_\phi}[\psi^2]$ by the dominated convergence theorem. Consequently, $\lambda'(t) = \sigma_0 t + o(t)$. Integrating over $(0, t]$ gives $\lambda(t) - 1 = \frac{\sigma_0}{2}t^2 + o(t^2)$. Now $\lambda(t) - 1 = e^{P_{top}(\phi+t\psi)} - 1 \sim P_{top}(\phi + t\psi)$, so also $P_{top}(\phi + t\psi) = \frac{1}{2}\sigma_0 t^2 + o(t^2)$. The convexity of the topological pressure forces $\sigma_0$ to be non-negative. We may therefore write $\sigma_0 = \sigma^2$ for some $\sigma \in \mathbb{R}$, and (1) is proved.

Final Remarks. Our analysis is simplified by the assumption that $\sup\psi < \infty$. This assumption allows us to use Laplace transforms rather than Fourier transforms as in [AD], and this enables us to use the full force of the theory of regular variation. It is likely that $\sup\psi < \infty$ can be relaxed to the (more cumbersome) assumption that $\exists t > 0$ for which $E_{\mu_\phi}[e^{t\psi}] < \infty$ (I did not check). It makes no sense to go further and consider $\psi$ without exponential moments, because for such $\psi$'s the BIP property implies $P_{top}(\phi + t\psi) = \infty$ for all $t > 0$, and critical exponents are meaningless.

4. Inducing

Every countable Markov shift induces a topological Markov shift with the BIP property, in a sense that is explained below. The proof of Theorems 2, 3, and 4 for systems without the BIP property uses this technique to reduce the general case to the BIP case. In this section we explain how to relate information on distributional convergence and asymptotic expansions for the pressure for the original system to that for the induced system.
Inducing. Let $(X, \mathcal{B}, m)$ be a probability space, and $T : X \to X$ a measurable map. Assume that $T$ is probability preserving and ergodic. Fix some $A \in \mathcal{B}$ with positive measure. By Poincaré's Recurrence Theorem, the following functions are finite almost everywhere:
\[ r(x) := \min\{n \ge 1 : T^n(x) \in A\}; \qquad \varphi(x) := 1_A(x)\min\{n \ge 1 : T^n(x) \in A\}. \]
The induced map on $A$ is $T_A(x) := T^{\varphi(x)}(x)$, defined on the measure space $(A, \mathcal{B}_A, m_A)$, where $\mathcal{B}_A := \{E \in \mathcal{B} : E \subseteq A\}$ and
\[ m_A(E) := m(E|A) \equiv \frac{m(E \cap A)}{m(A)}. \]
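The return time and induced map just defined can be made concrete on a toy system. The sketch below uses a cyclic rotation on twelve points with the uniform measure (an ergodic, measure-preserving map) and a hypothetical set `A`; it then checks the classical Kac identities $E_{m_A}[\varphi] = 1/m(A)$ and $\int_X f\,dm = \int_A \sum_{k=0}^{\varphi-1} f \circ T^k\,dm$ in exact arithmetic. All names and the choice of system are illustrative assumptions, not objects from the paper.

```python
from fractions import Fraction

N_POINTS = 12                      # X = {0, ..., 11} with the uniform measure m
A = frozenset({0, 1, 5, 9})        # hypothetical set of positive measure

def T(x):
    """Cyclic rotation: a single cycle, hence ergodic for the uniform measure."""
    return (x + 1) % N_POINTS

def return_time(x):
    """First return time min{n >= 1 : T^n(x) in A}, used here for x in A."""
    n, y = 1, T(x)
    while y not in A:
        n, y = n + 1, T(y)
    return n

def induced_map(x):
    """T_A(x) = T^{return_time(x)}(x) for x in A."""
    y = x
    for _ in range(return_time(x)):
        y = T(y)
    return y

def kac_sides(f):
    """Both sides of Kac's formula, integrated against m (not m_A)."""
    lhs = Fraction(sum(f(x) for x in range(N_POINTS)), N_POINTS)
    total = 0
    for x in A:
        y = x
        for _ in range(return_time(x)):   # sum f along the excursion from x
            total += f(y)
            y = T(y)
    return lhs, Fraction(total, N_POINTS)

m_of_A = Fraction(len(A), N_POINTS)
mean_return = Fraction(sum(return_time(x) for x in A), len(A))  # E_{m_A}[return time]
print(mean_return, 1 / m_of_A)
print(kac_sides(lambda x: x * x))
```

The excursions from the points of `A` partition the whole space, which is why both identities hold exactly in this example.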
The following facts are classical (we are assuming that $m$ is ergodic and invariant):
(1) $m_A$ is ergodic and invariant w.r.t. $T_A$;
(2) Kac' Formula: $\int_X f\,dm = \int_A \sum_{k=0}^{\varphi-1} f \circ T^k\,dm$. In particular, $E_{m_A}[\varphi] = 1/m(A)$;
(3) Abramov's Formula: $h_m(T) = m(A)\,h_{m_A}(T_A)$.

Inducing Distributional Limit Theorems. Let $T$ be an ergodic probability preserving transformation on a standard probability space $(X, \mathcal{B}, m)$, fix a set of positive measure $A \in \mathcal{B}$, and define $r(x)$, $\varphi(x)$, $(A, \mathcal{B}_A, m_A, T_A)$ as above. Set:
\[ \varphi_n := 1_A(x)\sum_{k=0}^{n-1}\varphi \circ T_A^k, \qquad r_n := r + \varphi_{n-1} \circ T^r. \]
Melbourne & Török [MT] related the Central Limit Theorem for Birkhoff sums of $T_A$ to that for Birkhoff sums of $T$ (see also Gouëzel [Gou2]). The following theorem generalizes their result to other distributional limit theorems:

Theorem 7. Suppose $\exists B_n$ s.t. $\frac{1}{B_n}[\varphi_n - n/m(A)]$ is tight on $(A, \mathcal{B}_A, m_A)$. Set $\overline{\psi} := \sum_{k=0}^{\varphi-1}\psi \circ T^k$. If $B_n$ is regularly varying of index $0 < \rho \ne 1$, and $\psi \vee 0 \in L^1$ or $\psi \wedge 0 \in L^1$, then the following are equivalent:
(1) $\frac{1}{B_n}\sum_{k=0}^{n-1}\overline{\psi} \circ T_A^k$ converges in distribution on $(A, \mathcal{B}_A, m_A)$;
(2) $\frac{1}{B_n}\sum_{k=0}^{n-1}\psi \circ T^k$ converges in distribution on $(X, \mathcal{B}, m)$.
If $\exists \epsilon_0 > 0$ s.t. $\frac{1}{n^{1-\epsilon_0}}\left[\varphi_n - \frac{n}{m(A)}\right]$ is tight on $(A, \mathcal{B}_A, m_A)$, then the conclusion holds for $\rho = 1$.
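Proof. We assume w.l.o.g. that $T$ is invertible (otherwise, pass to the natural extension of $T$). Of course, if $T$ is invertible, then $T_A$ is invertible. Invertibility allows us to define:
\[ \psi_n := \begin{cases} \sum_{k=0}^{n-1}\psi \circ T^k & n > 0 \\ 0 & n = 0 \\ -\psi_{|n|} \circ T^n & n < 0 \end{cases} \qquad \text{and} \qquad \overline{\psi}_n := \begin{cases} \sum_{k=0}^{n-1}\overline{\psi} \circ T_A^k & n > 0 \\ 0 & n = 0 \\ -\overline{\psi}_{|n|} \circ T_A^n & n < 0. \end{cases} \]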
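The two-sided Birkhoff sums defined above are set up so that they are additive along orbits, $\psi_{n+m} = \psi_n + \psi_m \circ T^n$ for all integers $m, n$. The following sketch checks this on a small invertible system. The permutation `T`, the observable `psi`, and the helper names are hypothetical and serve only as an illustration.

```python
# Hypothetical invertible map on {0, ..., 4}: T[x] is the image of x (a 5-cycle).
T = [2, 0, 3, 4, 1]
T_INV = [0] * len(T)
for x, y in enumerate(T):
    T_INV[y] = x

# Hypothetical integer-valued observable psi on the five points.
psi = [3, -1, 4, -1, -5]

def iterate(x, n):
    """Return T^n(x) for any integer n (negative n uses the inverse map)."""
    step = T if n >= 0 else T_INV
    for _ in range(abs(n)):
        x = step[x]
    return x

def birkhoff(n, x):
    """Two-sided Birkhoff sum psi_n(x) following the three-case definition."""
    if n > 0:
        return sum(psi[iterate(x, k)] for k in range(n))
    if n == 0:
        return 0
    return -birkhoff(-n, iterate(x, n))   # psi_n = -psi_{|n|} o T^n for n < 0

# Additivity psi_{n+m}(x) = psi_n(x) + psi_m(T^n x), also for negative indices.
ok = all(
    birkhoff(n + m, x) == birkhoff(n, x) + birkhoff(m, iterate(x, n))
    for n in range(-6, 7) for m in range(-6, 7) for x in range(len(T))
)
print(ok)
```

The negative-index convention is exactly what makes the identity hold when $n$ and $m$ have opposite signs, which is how it is used in the decomposition of $\psi_N$ along return times.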
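With these conventions, $\psi_{n+m} = \psi_n + \psi_m \circ T^n$ on $X$, and $\overline{\psi}_{n+m} = \overline{\psi}_n + \overline{\psi}_m \circ T_A^n$ on $A$ for all $m, n \in \mathbb{Z}$. Given $x \in A$, let $n[x, N]$ be the unique integer such that $\varphi_{n[x,N]}(x) \le N < \varphi_{n[x,N]+1}(x)$ (this makes sense almost everywhere in $A$). Note that
\[ \frac{\varphi_{n[x,N]}(x)}{n[x,N]} \le \frac{N}{n[x,N]} < \frac{\varphi_{n[x,N]+1}(x)}{n[x,N]}. \]
By the ergodic theorem, $\frac{\varphi_n}{n} \xrightarrow[n\to\infty]{} E_{m_A}[\varphi]$, and by Kac' formula $E_{m_A}[\varphi] = 1/m(A)$. It follows that
\[ n[x, N] \sim N_A := [N m(A)] \quad \text{almost everywhere, as } N \to \infty. \]
Here is an outline of the proof. We start, as in [MT], from the following identity on $A$:
\[ \frac{\psi_N}{B_N} = \frac{B_{N_A}}{B_N}\left[\frac{\overline{\psi}_{N_A}}{B_{N_A}} + \frac{1}{B_{N_A}}\left(\overline{\psi}_{n[x,N]} - \overline{\psi}_{N_A}\right)\right] + \frac{1}{B_N}\,\psi_{N-\varphi_{n[x,N]}(x)} \circ T^{\varphi_{n[x,N]}(x)}. \tag{10} \]
We prove below that $\frac{B_{N_A}}{B_N} \to m(A)^\rho$ (Step 1), $\frac{1}{B_{N_A}}\left(\overline{\psi}_{n[x,N]} - \overline{\psi}_{N_A}\right) \xrightarrow[N\to\infty]{dist.} 0$ on $(A, \mathcal{B}_A, m_A)$ (Step 2), and $\frac{1}{B_N}\psi_{N-\varphi_{n[x,N]}(x)}(T^{\varphi_{n[x,N]}}x) \xrightarrow[N\to\infty]{dist.} 0$ on $(A, \mathcal{B}_A, m_A)$ (Step 3). This implies that $\frac{1}{B_N}\overline{\psi}_N$ converges in distribution on $(A, \mathcal{B}_A, m_A)$ iff $\frac{1}{B_N}\psi_N$ converges in distribution on $(A, \mathcal{B}_A, m_A)$. Eagleson's theorem on distributional convergence implies that $\frac{1}{B_N}\psi_N$ converges in distribution on $(A, \mathcal{B}_A, m_A)$ iff it converges in distribution on $(X, \mathcal{B}, m)$ (Step 4). The theorem follows.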
Step 1. $\frac{B_{N_A}}{B_N} \xrightarrow[N\to\infty]{} m(A)^\rho$.

Proof. Use the uniform convergence theorem for slow variation (Appendix A).

Step 2. If (1) or (2) in Theorem 7 hold, then $W_N := \frac{1}{B_{N_A}}\left(\overline{\psi}_{n[x,N]} - \overline{\psi}_{N_A}\right) \xrightarrow[N\to\infty]{dist.} 0$ on $(A, \mathcal{B}_A, m_A)$. (This is a generalization of Lemma 3.4 in [MT].)

Proof. Set $m_0[x, N] := n[x, N] - N_A$ and $m[x, N] := m_0[T_A^{-N_A}x, N]$. By Step 1, it is enough to show that $\frac{1}{B_N}\overline{\psi}_{m_0[x,N]}(T_A^{N_A}x) \xrightarrow[N\to\infty]{dist.} 0$ on $A$. This is the same as
\[ \frac{1}{B_N}\overline{\psi}_{m[x,N]} \xrightarrow[N\to\infty]{dist.} 0 \text{ on } A, \tag{11} \]
because $T_A$ is measure preserving.

Case 1. $\psi \in L^1$. Suppose first that $\int\psi\,dm \ne 0$. By Kac' formula, if $\psi \in L^1(X)$, then $\overline{\psi} \in L^1(A)$ and $\int_X \psi\,dm = m(A)\int_A \overline{\psi}\,dm_A$. By the ergodic theorem, $\frac{\psi_N}{N}$, $\frac{\overline{\psi}_N}{N}$ converge pointwise, whence in distribution, to their means. These means are different (otherwise $m(A) = 1$ and there is nothing to prove). Therefore, if $\limsup_{n\to\infty}\frac{n}{B_n} = 0$ then (1) and (2) both hold, and if $\limsup_{n\to\infty}\frac{n}{B_n} > 0$, $\int\psi \ne 0$, $\int\overline{\psi} \ne 0$, then both (1) and (2) fail. We may therefore restrict ourselves to the case $\int\psi = \int\overline{\psi} = 0$, $0 < \rho \le 1$ (if $\rho > 1$ then $\frac{n}{B_n} \to 0$).
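Fix some $N_0$ and $\epsilon > 0$ to be determined later:
\begin{align*}
m_A\!\left[\tfrac{1}{B_N}|\overline{\psi}_{m[x,N]}| > t\right] &\le m_A\!\left[|m[x,N]| \le N_0,\ \tfrac{1}{B_N}\sum_{k=-N_0}^{N_0}|\overline{\psi}| \circ T_A^k > t\right] + m_A\!\left[|m[x,N]| \ge N_0,\ \tfrac{|m[x,N]|}{B_N}\cdot\left|\tfrac{1}{m[x,N]}\overline{\psi}_{m[x,N]}\right| > t\right] \\
&\le m_A\!\left[\sum_{k=-N_0}^{N_0}|\overline{\psi}| \circ T_A^k > tB_N\right] + m_A\!\left[|m[x,N]| \ge N_0,\ \left|\tfrac{1}{m[x,N]}\overline{\psi}_{m[x,N]}\right| > \epsilon\right] + m_A\!\left[\tfrac{|m[x,N]|}{B_N} > t/\epsilon\right].
\end{align*}
The first summand is $o(1)$ as $N \to \infty$. The second summand can be made less than $\epsilon$ by choosing $N_0$ sufficiently large, because $\frac{\overline{\psi}_n}{n} \xrightarrow[n\to\pm\infty]{} E_{m_A}[\overline{\psi}] = 0$ almost surely, whence uniformly outside a set of measure $\epsilon$. Since $m[\cdot, N]$, $m_0[\cdot, N]$ are equal in distribution on $(A, \mathcal{B}_A, m_A)$, this leaves us with
\[ m_A\!\left[\tfrac{1}{B_N}|\overline{\psi}_{m[x,N]}| > t\right] \le o(1) + \epsilon + m_A\!\left[\left|\tfrac{m_0[x,N]}{B_N}\right| > t/\epsilon\right] \quad \text{as } N \to \infty. \]
Since $\epsilon$ is arbitrary, (11) reduces to the tightness of $\frac{m_0[x,N]}{B_N}$.

When $\rho \in (0, 1)$ we argue as follows. By the definition of $m_0[x, N]$ and $n[x, N]$,
\[ m_0[x, N] > tB_N \iff n[x, N] > [tB_N] + N_A =: \alpha_N(t) \implies \varphi_{\alpha_N(t)} < N. \]
Therefore,
\[ m_A\!\left[\tfrac{m_0[x,N]}{B_N} > t\right] \le m_A[\varphi_{\alpha_N(t)} < N] = m_A\!\left[\frac{\varphi_{\alpha_N(t)} - \frac{\alpha_N(t)}{m(A)}}{B_{\alpha_N(t)}} < \beta_N(t)\right], \quad \text{where } \beta_N(t) := \frac{N - \frac{\alpha_N(t)}{m(A)}}{B_{\alpha_N(t)}}. \]
But $B_N$ is regularly varying of index $\rho \in (0, 1)$, so $\beta_N(t) \xrightarrow[N\to\infty]{} -\frac{t}{m(A)^{\rho+1}}$. Using this, and the assumption that $\frac{1}{B_N}[\varphi_N - N/m(A)]$ is tight, it is easy to see that for every $\epsilon > 0$, $\exists t$ so large that $m_A\!\left[\frac{m_0[x,N]}{B_N} > t\right] < \epsilon$ for all $N$. A similar estimate of $m_A\!\left[\frac{m_0[x,N]}{B_N} < t\right]$ for $t \ll 0$ finishes the proof of tightness when $\rho \in (0, 1)$.

Now suppose $\rho = 1$. Let $\epsilon_0$ be as in the statement of the theorem. Repeating the same argument, we see that
\[ m_A\!\left[\tfrac{m_0[x,N]}{B_N} > t\right] \le m_A[\varphi_{\alpha_N(t)} < N] = m_A\!\left[\frac{\varphi_{\alpha_N(t)} - \frac{\alpha_N(t)}{m(A)}}{\alpha_N(t)^{1-\epsilon_0}} < \gamma_N(t)\right], \quad \text{where } \gamma_N(t) := \frac{N - \frac{\alpha_N(t)}{m(A)}}{\alpha_N(t)^{1-\epsilon_0}}. \]
Calculating, we see that
\[ \gamma_N = \frac{N m(A) - [tB_N] - N_A}{m(A)\left([tB_N] + N_A\right)^{1-\epsilon_0}} \sim \frac{-t\,\frac{B_N}{N}\,N^{\epsilon_0}}{m(A)\left(t\frac{B_N}{N} + m(A) + o(1)\right)^{1-\epsilon_0}}. \]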
Since $B_N$ is regularly varying of index 1, $B_N/N$ is slowly varying. It follows that $\gamma_N$ is minus a regularly varying sequence of index $\epsilon_0 > 0$, whence $\gamma_N \to -\infty$. Since $\frac{1}{N^{1-\epsilon_0}}[\varphi_N - N/m(A)]$ is tight, by assumption, we get $m_A\!\left[\frac{m_0[x,N]}{B_N} > t\right] \to 0$ for all $t > 0$. A similar argument shows that $m_A\!\left[\frac{m_0[x,N]}{B_N} < t\right] \to 0$ for all $t$ negative, and we obtain the tightness of $\frac{m_0[x,N]}{B_N}$ when $\rho = 1$. Equation (11) follows.

Case 2. $\psi \notin L^1$. We prove (11) when $\psi \notin L^1$. By our assumptions one of $\psi \vee 0$, $\psi \wedge 0$ is integrable. Without loss of generality, $\int\psi \vee 0 < \infty$ and $\int\psi \wedge 0 = -\infty$. By the ergodic theorem $\frac{\psi_N}{N}, \frac{\overline{\psi}_N}{N} \xrightarrow[N\to\infty]{} -\infty$ almost surely, so either (1) and (2) are both false or $N/B_N \to 0$. We restrict ourselves to this case. By the ergodic theorem, $\frac{\psi_N}{B_N} = \frac{(\psi\wedge0)_N}{B_N} + o(1)$ and $\frac{\overline{\psi}_N}{B_N} = \frac{\overline{(\psi\wedge0)}_N}{B_N} + o(1)$. We may therefore also assume without loss of generality that $\psi \le 0$.
We begin by showing that if (1) or (2) holds, then $\frac{\overline{\psi}_N}{B_N}$ is tight. When (1) holds, this is clear, so suppose (2) holds. In this case $\frac{\psi_N}{B_N}$ is tight, and since $\psi$ doesn't change sign,
$ $ $ ψ $ ϕ ψ ]+1 $ > t m A $ B NN $ > t ≤ m A $ n[x,N BN $ $
$ > t + m A ϕ n[x,N ]+1 > 2N ≤ m A ϕ n[x,N ]+1 ≤ 2N and $ ψB2N N $ $ 1 ϕ ψ B ]+1 1 1 2N $ N > t B2N m $ B2N + m A n[x,N . ≤ − > N m(A) m(A) A m(A) B N is regularly varying, so
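$B_N$ is regularly varying, so $\frac{B_{2N}}{B_N} \xrightarrow[N\to\infty]{} 2^\rho$. $\frac{\psi_N}{B_N}$ is tight, so we can make the first summand uniformly small by choosing $t$ large. The second summand tends to zero as $N \to \infty$, because by the ergodic theorem $\frac{\varphi_{n[x,N]+1}}{N_A} \sim \frac{\varphi_{n[x,N]+1}}{n[x,N]+1} \xrightarrow[N\to\infty]{} \frac{1}{m(A)}$ a.e., whence $\frac{\varphi_{n[x,N]+1}}{N_A} - \frac{1}{m(A)} \xrightarrow[N\to\infty]{dist.} 0$. This proves tightness.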
m 0 [x,N ] dist. −−−−→ 0 on A, because |m 0 [x, N ]| ≤ n[x, N ] + N A BN N →∞ N [1 + m(A)] and N /B N → 0 by assumption. Since m 0 [·, N ] m[·, N ] are equal ] dist. distribution w.r.t m A , m[x,N −−−−→ 0 on A. BN N →∞
Next we observe that
Since the sign of ψ is constant, for every > 0,
$ $ $ ψ m[x,N ] $ m A $ B N $ > t, m[x, N ] < 0 ≤ $ $ $ $ $ m[x,N ] $ $ ψ −m[x,N ] ◦T Am[x,N ] $ m[x,N ] > t + m ≥ ≤ mA ∈ [−, 0], $ $ $ $ A BN BN BN $ −[ B N ] $ $ $ $ $ $ ψ [ B N ] ◦TA ]$ ≤ mA $ $ > t/ + m A $ m[x,N [ B N ] BN $ ≥ $ $ $ ψ [ B ] $ $ ]$ ≥ , because m A ◦ T A[ B N ] = m A , = m A $ [ B NN] $ > t/ + m A $ m[x,N $ BN
≤ in
$ $ $ψ ]$ > t and m[x, N ] ≥ 0 m A $ m[x,N $ BN $ $ $ ψ m[x,N ] $ ] $ > t + m A m[x,N ] ≥ $ ≤ m A 0 ≤ m[x,N ≤ and $ BN $ BN BN $ $ $ ψ [ B ] $ ] ≥ . ≤ m A $ [ B NN] $ > t/ + m A m[x,N BN Putting this all together, we get $ $ $ $ $ $ $ψ $ ψ [ B N ] $ $ m[x,N ] $ ]$ > t ≤ 2m > t/ + 2m ≥ . m A $ m[x,N $ $ $ $ $ A A BN [ B N ] BN ψk Bk is tight, there exists so small that the first summand is less than ] dist. −−−−→ 0, there exists N0 s.t. the second summand is less than N . Since m[x,N BN N →∞ $ $ $ψ ]$ N > N0 . We deduce that m A $ m[x,N B N $ > t < 2δ for N large enough, proving
Fix δ > 0. Since δ for all
δ for all (11) in Case 2.
This completes the proof of Step 2.

Step 3. $\frac{1}{B_N}\psi_{N-\varphi_{n[x,N]}(x)}(T^{\varphi_{n[x,N]}(x)}x) \xrightarrow[N\to\infty]{} 0$ in distribution on $(A, \mathcal{B}_A, m_A)$.

Proof. We thank the referee for the following short argument. Recall the definition of $r$ from the beginning of Sect. 4, and set $S(x) := T_A^{-1}(T^{r(x)}(x))$ $(x \in X)$. Then $|\psi_{N-\varphi_{n[x,N]}(x)}(T^{\varphi_{n[x,N]}(x)}x)| \le \Psi(T^N x)$, where
\[ \Psi(x) := \sum_{k=0}^{\varphi(Sx)}|\psi(T^k Sx)|. \]
Now $\frac{1}{B_N}\Psi \circ T^N \xrightarrow[N\to\infty]{dist.} 0$ on $(X, \mathcal{B}, m)$, because $m \circ T^{-1} = m$ and $B_N \to \infty$. It follows that $\frac{1}{B_N}\Psi \circ T^N \xrightarrow[N\to\infty]{dist.} 0$ on $(A, \mathcal{B}_A, m_A)$.
Steps 1–3 and (10) show that $\frac{\psi_N}{B_N}$ converges in distribution on $(A, \mathcal{B}_A, m_A)$ iff $\frac{\overline{\psi}_N}{B_N}$ converges in distribution on $(A, \mathcal{B}_A, m_A)$.

Step 4. $\frac{\psi_N}{B_N}$ converges in distribution on $(A, \mathcal{B}_A, m_A)$ iff $\frac{\psi_N}{B_N}$ converges in distribution on $(X, \mathcal{B}, m)$, and the limiting distribution is the same.
Proof. Eagleson proves that if X i is a stationary ergodic stochastic process and Yn := 1 Bn (X 1 + · · · + X n ) converges in distribution for some Bn ↑ ∞ on (, F, μ), then Yn converges in distribution to the same limit on (, F, μ ) for all μ μ ([Ea], Theorem 4). This proves (⇐). To see the other direction, assume ψB NN converges in distribution on (A, B A , m A ), and consider the following decomposition on (X, B, m) in the limit N → ∞: ψN ψN ψr ψr ◦T −N |ψ|r r N + 1[r ≥N ] O = 1[r
The big-Oh terms converge to zero in distribution, and 1[r
Remark. The proof shows that the distributional limit of $\frac{\psi_N}{B_N}$ is a $m(A)^\rho$–scaled version of the distributional limit of $\frac{\overline{\psi}_N}{B_N}$, see Step 1.
Inducing Asymptotic Expansions. Throughout this section, let $(\Sigma_{\mathbf{A}}^+, T)$ be a topologically mixing countable Markov shift with set of states $S$, and let $A \subset S$ be some finite union of states. Define $\varphi(x)$ and $T_A(x) := T^{\varphi(x)}(x)$ as above. The resulting map can be given the structure of countable Markov shift as follows:

(1) States: $\overline{S} := \{[a, \xi_1, \ldots, \xi_{n-1}, b] : a, b \in A,\ n \ge 1,\ \xi_i \notin A \text{ for all } i\} \setminus \{\varnothing\}$;
(2) Transition matrix: $\overline{\mathbf{A}} = (t_{[a],[b]})_{\overline{S}\times\overline{S}}$ with $t_{[a],[b]} = 1$ iff the last symbol in $a$ is the first symbol in $b$.

We call this shift the induced shift (on $A$), because it is conjugate to the induced map. The conjugacy is $\pi : \Sigma_{\overline{\mathbf{A}}}^+ \to A$ given by
\[ \pi([a^{(1)}, \xi^{(1)}, b^{(1)}], [a^{(2)}, \xi^{(2)}, b^{(2)}], \ldots) = (a^{(1)}, \xi^{(1)}, a^{(2)}, \xi^{(2)}, a^{(3)}, \ldots). \]
It is easy to verify that the induced shift satisfies the BIP property. Every $f : \Sigma_{\mathbf{A}}^+ \to \mathbb{R}$ induces a function $\overline{f} : \Sigma_{\overline{\mathbf{A}}}^+ \to \mathbb{R}$ by $\overline{f} := \left(\sum_{k=0}^{\varphi-1} f \circ T^k\right) \circ \pi$. We call this function the induced function (by $f$). Define
\[ H_A := \{f : \Sigma_{\mathbf{A}}^+ \to \mathbb{R} \mid f \text{ has summable variations, } \sup f < \infty \text{ and } \overline{f} \text{ is locally Hölder continuous}\}. \]
It is easy to see that $H_A$ contains all weakly Hölder continuous functions which are bounded from above.

Theorem 8. Suppose $\phi, \psi \in H_A$, and that $\psi$ satisfies ( ) with respect to a finite set of states $A$. If $\{\phi + t\psi\}_{t\ge0}$ is regular and $P_{top}(\phi) = E_{\mu_\phi}[\psi] = 0$, then
\[ P_{top}(\overline{\phi} + t\overline{\psi}) = \frac{1 + o(1)}{\mu_\phi(A)}\,P_{top}(\phi + t\psi) \quad \text{as } t \to 0^+, \]
where $\mu_\phi$ is the equilibrium measure of $\phi$.

Lemma 1. If $f : \Sigma_{\mathbf{A}}^+ \to \mathbb{R}$ belongs to $H_A$, then
(1) $\mathrm{var}_n(\overline{f}) \le \sum_{k=n+1}^\infty \mathrm{var}_k(f)$;
(2) If, in addition, $P_{top}(f) < \infty$, then $\sup\overline{f - P_{top}(f)} < \infty$;
(3) If, in addition, $f$ has an equilibrium measure $\mu$, then $P_{top}(\overline{f - P_{top}(f)}) = 0$, and $\overline{\mu}(E) := \frac{(\mu\circ\pi)(E)}{\mu(A)}$ is an equilibrium measure for $\overline{f - P_{top}(f)}$.
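Proof. Suppose $\overline{x}, \overline{y} \in \Sigma_{\overline{\mathbf{A}}}^+$ agree on the first $n$ symbols, and write $x = \pi(\overline{x})$, $y = \pi(\overline{y})$. Since $\varphi \circ \pi$ is constant on partition sets in $\Sigma_{\overline{\mathbf{A}}}^+$, $\varphi(x) = \varphi(y) = n_0$. One checks that $x, y \in \Sigma_{\mathbf{A}}^+$ agree on the first $\varphi(x) + \varphi(T_A x) + \cdots + \varphi(T_A^{n-1}x) + 1$ symbols (the one at the end is because of the last symbol of the last cylinder). We see that $x, y$ agree on (at least) the first $n_0 + (n - 1) + 1 = n_0 + n$ symbols, and so
\[ |\overline{f}(\overline{x}) - \overline{f}(\overline{y})| \le \sum_{k=0}^{n_0-1}\left|f(T^k x) - f(T^k y)\right| \le \sum_{k=n+1}^{n_0+n}\mathrm{var}_k(f). \]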
Part (1) follows. To see Part (2), construct a finite set of admissible words {wab : a, b ∈ A} of length n ab (as words in the alphabet S) such that wab starts with a and ends with b. Such words + . Set exist because of the topological mixing of A & % C := sup |( f − Ptop ( f ))n ab (x)|: x ∈ [wab ], a, b ∈ A . By Part (1), C < ∞. We show that sup f − Ptop ( f ) ≤ C + ∞ n=2 var n ( f ) =: C 0 . Otherwise ∃x ∈ A + for which f − Ptop ( f )(x) > C0 . By Part (1), f − Ptop ( f ) > C on the partition set which contains x. Denote this partition set by [x 0 ], write x 0 = [b, ξ , a], and consider the point z := (x 0 , wab , x 0 , wab , x 0 , wab , . . .). This is a periodic point of order 1 + n ab , and ( f − Ptop ( f ))1+n ab (z) > C − C = 0. N −1 [ f (T k z) − Ptop ( f )] > 0. Write z = π(z). Then for some N , T N (z) = z and k=0 N −1 1 The measure μ:= N k=0 δT k z is T –invariant, has zero entropy, and satisfies h μ (T ) +
+ A
[ f − Ptop ( f )]dμ =
N −1 1 [ f (T k z) − Ptop ( f )] > 0. N k=0
It follows that h μ (T ) + f dμ > Ptop ( f ), in contradiction to the definition of Ptop ( f ). Part (2) is proved. Before proving Part (3), we recall from [BS] that μ[a] = 0 for any state a ∈ S and every equilibrium measure μ of a potential with summable variations on a topologically mixing shift. Therefore, μ is well defined. Next we note that μ is shift invariant, because μ| A is T A –invariant. The formulæ of Kac and Abramov and the conjugacy between T A and the induced shift give f − Ptop ( f )dμ Ptop ( f − Ptop ( f )) ≥ h μ◦π −1 (T A ) + 1 = h μ (T ) + f − Ptop ( f )dμ = 0. μ(A) The other inequality is more delicate, because it is not true that every T A –invariant probability measure is induced by a T –invariant probability measure: We can only guarantee this for T A –invariant measures for which ϕ is integrable.
To deal with this difficulty, we note that since f − Ptop ( f ) has summable variations (Part 1) and is bounded from above (Part 2), then Ptop ( f − Ptop ( f )) is equal to the Gurevich pressure of f − Ptop ( f ). Therefore, by Theorem 2 of [S1], Ptop ( f − Ptop ( f )) = sup h m (T ) + f − Ptop ( f )dm : m has compact support . For such measures ϕ ◦ π is essentially bounded, whence integrable. Therefore Ptop ( f − Ptop ( f )) is achieved as a supremum over invariant measures which are in+ . Such measures ν satisfy duced by shift invariant measures on A 1 h ν (T ) + f − Ptop ( f )dν = f − Ptop ( f )dν ≤ 0. h ν◦π −1 (T A ) + + ν(A) + A A
Passing to the supremum, we get Ptop ( f − Ptop ( f )) ≤ 0. In the first part of the proof we saw that h μ◦π −1 (T A ) + f − Ptop ( f )dμ = 0 for μ induced by the equilibrium measure of f . Consequently, this is an equilibrium measure for f − Ptop ( f ) (by [BS] the only one), and the pressure is zero. Proof of Theorem 8. The convexity of Ptop (φ +tψ) and the assumption that Ptop (φ) = 0 imply that either Ptop (φ +tψ) = 0 on some right neighborhood of 0, or Ptop (φ +tψ) = 0 for all t > 0 small. In the first case the theorem holds trivially by Lemma 1, Part (3). We may therefore assume without loss of generality that Ptop (φ + tψ) = 0 for all t > 0 small. Recall the definitions of ϕ, + , and of the functions φ, ψ induced by φ, ψ. By A assumption, A is a finite union of states such that ψ ≤ 0 = Eμφ [ψ] outside A. Thus: ϕ−1
sup ψ < ∞.
To see this write ψ = k=0 ψ ◦ T k , and observe that the first summand is dominated by sup ψ, while the other summands are non-positive (they correspond to the part of the orbit which lies outside A). Note also that by Lemma 1 Part (2) sup φ < ∞. Step 1. Ptop (φ + tψ) > 0 for all t > 0. Proof. Kac’ formula and the assumption Eμφ [ψ] = 0 imply that Eμφ [ψ] = 0, where μ ◦π μφ = μφφ(A) . By Lemma 1, Ptop (φ) = 0, and μφ is the equilibrium measure of φ: μφ = μφ . Consequently Eμφ [ψ] = 0.
By Theorem 4 for BIP systems, Ptop (φ +tψ) = o(t) as t → 0+ (note that the assumptions listed at the beginning of Sect. 3 are satisfied). We see that the right–derivative of t → Ptop (φ +tψ) at t = 0 vanishes. But t → Ptop (φ +tψ) is convex, so Ptop (φ +tψ) ≥ 0 for t ≥ 0. Lemma 1 tells us that Ptop (φ + tψ − Ptop (φ + tψ)) = 0. If Ptop (φ + tψ) were negative, then by the properties of the topological pressure and since ϕ ≥ 1, 0 = Ptop (φ + tψ − Ptop (φ + tψ)ϕ) ≥ Ptop (φ + tψ) + |Ptop (φ + tψ)| > 0. Therefore Ptop (φ + tψ) ≥ 0 for t > 0. The inequality is strict, otherwise by convexity Ptop (φ + tψ) vanishes on some right-neighbourhood of 0, contrary to our assumptions.
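Step 2. Set $f_t := \phi + t\psi$ and $\phi_t := f_t - P_{top}(f_t)$. The induced potentials $\overline{\phi}_t$, $\overline{f}_t$ have Gibbs measures $\mu_{\overline{f}_t}$, $\mu_{\overline{\phi}_t}$, and
\[ E_{\mu_{\overline{\phi}_t}}[\varphi] \le \frac{P_{top}(\overline{f}_t)}{P_{top}(f_t)} \le E_{\mu_{\overline{f}_t}}[\varphi] \quad \text{for all } 0 \le t \le \epsilon_0. \]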
Proof. Any locally Hölder continuous potential with finite pressure on a shift with the BIP property has an invariant Gibbs measure [S3]. Therefore, since $\Sigma_{\overline{\mathbf{A}}}^+$ has the BIP property, it is enough to check that $\overline{f}_t$, $\overline{\phi}_t$ have finite pressure. They do, because $\sup\overline{\psi} < \infty$, $P_{top}(f_t) \ge 0$ (Step 1), and $P_{top}(\overline{\phi}) = 0 < \infty$ (Lemma 1 Part 3).

Fix $\epsilon_0 > 0$ such that $f_t := \phi + t\psi$ has an equilibrium measure for $0 \le t \le \epsilon_0$ (regularity). Fix $0 \le t \le \epsilon_0$, and consider the function $p(s) := P_{top}(\overline{f}_t - s\varphi)$ for $s \ge 0$. This is a convex function, and therefore
\[ p_+'(0) \le \frac{p(P_{top}(f_t)) - p(0)}{P_{top}(f_t)} \le p_+'(P_{top}(f_t)), \]
where $p_+'$ denotes one-sided derivative from the right (which can be infinite). The term in the middle is $-\frac{P_{top}(\overline{f}_t)}{P_{top}(f_t)}$ (Lemma 1, Part (3)). Theorem 4 for BIP systems gives the one-sided derivatives (see the remark after Theorem 4):
\[ p_+'(0) = \left.\frac{d}{ds}\right|_{s=0^+}P_{top}(\overline{f}_t + s(-\varphi)) = -E_{\mu_{\overline{f}_t}}[\varphi], \]
\[ p_+'(P_{top}(f_t)) = \left.\frac{d}{ds}\right|_{s=0^+}P_{top}(\overline{\phi}_t + s(-\varphi)) = -E_{\mu_{\overline{\phi}_t}}[\varphi]. \]
This completes the proof.

Step 3. $E_{\mu_{\overline{f}_t}}[\varphi] \xrightarrow[t\to0^+]{} \frac{1}{\mu_\phi(A)}$.
Proof. We work on the BIP shift ( + , T ). Define as in Sect. 3 the space L and the A
operators R0 , Rt corresponding to φ and ψ: R0 ( f )(x):= eφ(y) f (y) , Rt ( f ):= R0 [etψ f ]. T y=x
Here and throughout T denotes the shift on + . A As in the beginning of Sect. 3, we may assume without loss of generality that φ(y) = 1 (otherwise pass to φ + h − h ◦ T with some bounded locally Hölder T y=x e continuous function h : +A → R, and note that this does not affect μ ft or μφ ). This reduction allows us to assume that R0 1 = 1. By Proposition 2 Part (3), Rt − R0 −−−→ 0. It follows that the eigenprojections + t→0
Pt := P(Rt ) are well–defined for t small, and converge in norm to P0:= P(R0 ). The operator Rt is the Ruelle operator of f t . The theory of Ruelle operators for shifts with BIP says that λ(Rt ) = exp Ptop ( f t ) and that Pt F = ht Fdνt where νt is an eigenmeasure of Rt , h t is a positive eigenfunction of Rt , and h t dνt = 1. The Gibbs measure of f t is h t dνt . Consequently, Pt [F Pt 1] Fdμ ft = for all F ∈ L. (12) Pt 1 (The RHS is a scalar, because dim Im(Pt ) = 1.)
Since Pt → P0 in norm and P0 1 = 1 (because R0 1 = 1), Eμ f [F] −−−→ Eμφ [F] t t→0+
for all F ∈ L. In particular, Eμ f ϕ1[ϕ
t→0
We claim that for every > 0 there exists N such that Eμ f [ϕ1[ϕ≥N ] ] < for all t t in some one–sided neighbourhood of zero (uniform integrability). This will imply that Eμ f [ϕ] −−−→ Eμφ [ϕ]. Step 3 will then follow from Kac’ formula Eμφ [ϕ] = 1/μφ (A). + t
t→0
To prove uniform integrability, we need the transfer operator of μ ft , given by Tt F:=
λ(Rt )−1 Rt [F Pt 1]. Pt 1
It is straightforward to check, using (12), that Eμ f [Tt F] = Eμ f [F] for all F ∈ L. It t
follows that for every a ∈ S,
t
(Pt 1)(ax) e ft (ax) μ ft [a] = Eμ f [Tt 1[a] ] = λ(Rt )−1 dμ ft (x) t (Pt 1)(x) T [a] sup Pt 1 Dφ+t sup ψ φ(z) e e ≤ e−Ptop ( ft ) for all z ∈ [a]. inf Pt 1
The term in the brackets converges as t → 0+ (to e Dφ ), and is therefore uniformly bounded. The term eφ(z) is bounded by Gμφ [a], where G is as in the proof of Proposition 2. Consequently, there exists some constant C0 such that μ ft [a] ≤ C0 μφ [a] for all a ∈ S. Since ϕ is constant on 1–cylinders in + , we obtain Eμ f [ϕ1[ϕ≥N ] ] ≤ C0 Eμφ [ϕ1[ϕ≥N ] ] t A for all N . The RHS tends to zero as N → ∞, by the dominated convergence theorem. We obtained the uniform integrability of ϕ w.r.t. μ ft . Step 4. Eμφ [ϕ] −−−→ + t
t→0
1 μφ (A) .
Proof. The proof is essentially the same as in the previous step, except that here we need to use the perturbation operators 't f := Rt [e−Ptop ( ft )ϕ f ] R 't − R0 −−−→ 0. (the Ruelle operators of φt = f t − Ptop ( f t )ϕ). We first claim that R + t→0
(1) (d) We need the following generalization of Eq. (2): Let → ψ = (ψ , . . . , ψ ) be a + and F(t , . . . , t ) some real valued function such vector of real valued functions on A 1 d → + . Define R f := R [F( → ) f ]. Then for that F( ψ (x)) is well defined for all x ∈ A F 0 ψ some constant M which only depends on φ, ⎛ ⎞ → → R F ≤ M ⎝Eμφ |F( ψ )| + μφ [a]Da [F( ψ )]⎠ . (13) a∈S
The proof is the same as in the one-dimensional case (as is the constant M).
't − R0 = R Ft with Ft (ψ, ϕ) = etψ−Ptop ( ft )ϕ − 1. Therefore, We now observe that R $ $ tψ−Ptop ( f t )ϕ 't − R0 ≤ M Eμ $etψ−Ptop ( ft )ϕ − 1$ + R μ [a]D [e ] a φ φ a∈S
$ $ 0, ≤ M Eμφ $etψ−Ptop ( ft )ϕ − 1$ + tet sup ψ Dψ −−−→ +
t→0
because of the bounded convergence theorem (we are using here the facts that sup ψ < ∞ and Ptop ( f t ) > 0). 't − R0 −−−→ 0 we can proceed exactly as in the preNow that we know that R + t→0
't := P( R 't ) replacing Pt , to deduce that vious step, but with the eigenprojections P Eμφ [ϕ] −−−→ E [ϕ]. The theorem follows from Step 2. μ φ + t
t→0
5. Proofs for Shifts not Satisfying the BIP Property

Reduction of the General Case to the BIP Case. Let φ and ψ be two locally Hölder continuous functions bounded from above and assume ( ), ( ), and that φ + tψ has an equilibrium measure for 0 ≤ t ≤ ε₀. We also assume without loss of generality that Ptop(φ) = 0 and E_{μφ}[ψ] = 0 (otherwise subtract suitable constants). Let A ⊂ S be a finite union of states such that ψ ≤ E_{μφ}[ψ] = 0 outside A. Let a ∈ S be some state such that E_{μφ}[r_a] < ∞, where r_a(x) := min{k : x_k = a}. Without loss of generality, [a] ⊆ A (otherwise add a to A). Set ϕ(x) := 1_A(x) min{k ≥ 1 : T^k x ∈ A}, and let T_A : A → A, T_A(x) := T^{ϕ(x)}(x) be the induced map. We have seen that this map can be coded by a topological Markov shift with the BIP property. Let φ and ψ be as before. These are locally Hölder continuous functions, and as in the proof of Theorem 8, sup φ = sup φ − Ptop(φ) < ∞; sup ψ = sup ψ − E_{μφ}[ψ] < ∞. We conclude that the shift coding the induced map, together with φ, ψ, satisfies the standing assumptions listed at the beginning of Sect. 3, that is, the assumptions needed to prove Theorems 2, 3, 4 for BIP systems.

In order to pass from the induced system to the original system, we need to apply Theorems 7 and 8. The conditions of Theorem 8 are satisfied (by the shift coding the induced map and φ, ψ); we check the conditions of Theorem 7. The only thing to check is that the tightness assumption holds in all relevant cases. If α ∈ (1, 2) one must show that $\frac{1}{B_n}(\phi_n - n/\mu_\varphi(A))$ is tight for any sequence B_n regularly varying of index 1/α; if α = 2 one must check tightness for B_n = √n or for B_n s.t. √n = o(B_n). (The case α = 1 does not require Theorem 7.) We show
\[ \frac{1}{B_n}\bigl[\phi_n - n/\mu_\varphi(A)\bigr] \text{ is tight for all } \{B_n\} \text{ positive s.t. } \limsup_{n\to\infty} \frac{\sqrt{n}}{B_n} < \infty. \tag{14} \]
This covers all possibilities.
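The return time ϕ and Kac' formula used in this reduction can be illustrated on a toy example. The following is only a sketch under loud assumptions: the topological Markov shift is replaced by a hypothetical 3-state Markov chain (the matrix `P` is made up for illustration), A is the single state 0, and the code checks numerically that the mean return time to A is close to 1/μ(A) = 4.

```python
import random

# Hypothetical 3-state chain, chosen only for illustration.
# Its stationary distribution is (1/4, 1/2, 1/4).
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

def step(state, rng):
    """Sample the next state of the chain from row P[state]."""
    u = rng.random()
    acc = 0.0
    for nxt, p in enumerate(P[state]):
        acc += p
        if u < acc:
            return nxt
    return len(P) - 1  # guard against floating-point rounding

def return_times(a, n_steps, seed=0):
    """Successive return times to state a along one orbit started at a."""
    rng = random.Random(seed)
    times, state, clock = [], a, 0
    for _ in range(n_steps):
        state = step(state, rng)
        clock += 1
        if state == a:
            times.append(clock)  # one value of the return time to A = {a}
            clock = 0
    return times

times = return_times(a=0, n_steps=200_000)
mean_return = sum(times) / len(times)  # Kac: should be near 1 / mu(A) = 4
```

The simulation only samples one long orbit, so agreement with Kac' formula is statistical, not exact.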
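Observe that E_{μφ}[ϕ²] < ∞. To see this recall from Lemma 1 that μφ = μφ, note that r_a(x) ≥ min{k ≥ 1 : T^k(x) ∈ A} =: r_A(x), and observe that
\[
E_{\mu_\varphi}[\phi^2] = \sum_{n=1}^{\infty} n^2 \mu_\varphi[\phi = n]
\le 2\sum_{n=1}^{\infty}\Bigl(\sum_{k=1}^{n} k\Bigr)\mu_\varphi[\phi = n]
= 2\sum_{n=1}^{\infty}\int_{[\phi = n]}\Bigl(\sum_{k=0}^{\phi-1} r_A\circ T^k\Bigr)\,d\mu_\varphi
= 2\int r_A\,d\mu_\varphi \le 2\int r_a\,d\mu_\varphi < \infty.
\]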
It follows that −ϕ satisfies Case (2)(c) of Theorem 5. By Theorem 3 for BIP systems, ϕ_n satisfies the central limit theorem, and (14) follows.

Proof of Theorem 4 for Systems without the BIP Property. It is enough to treat the case E_{μφ}[ψ], Ptop(φ) = 0. By Lemma 1, Ptop(φ) = 0 and μφ = μφ. By Kac' formula, ψ ∈ L¹, and E_{μφ}[ψ] = 0. We deduce from Theorem 4 in the BIP case that Ptop(φ̄ + tψ̄) = o(t) for the induced potentials φ̄, ψ̄. By Theorem 8, Ptop(φ + tψ) = o(t). The remaining part of the Theorem is because of the ergodicity of μφ, see [BS].

Proof of Theorem 2 for Systems without the BIP Property. It is enough to treat the case E_{μφ}[ψ], Ptop(φ) = 0. Suppose Ptop(φ + tψ) = ct + t^α L(1/t) with c ∈ R, 1 < α < 2 and L(x) slowly varying at infinity. The previous section shows that c = 0. By Theorem 8,
\[ P_{top}(\bar\varphi + t\bar\psi) = \frac{1}{\mu_\varphi(A)}\, t^{\alpha}\,[1 + o(1)]\,L(1/t). \]
The function $x \mapsto \frac{1+o(1)}{\mu_\varphi(A)} L(x)$ is slowly varying at infinity, therefore by Theorem 2 for BIP systems, there exists B_n regularly varying of index 1/α such that
\[ \frac{1}{B_n}\bar\psi_n \xrightarrow[n\to\infty]{dist.} G_\alpha. \]
Since 1 < α < 2, √n/B_n → 0, so $\frac{1}{B_n}[\phi_n - n/\mu_\varphi(A)]$ is tight. Theorem 7 now implies that $\frac{1}{B_n}\psi_n \xrightarrow[n\to\infty]{dist.} G^*_\alpha$, where G*_α is equal to G_α up to change of scale. Renormalizing B_n, we obtain (2) in Theorem 2, and we proved (1)⇒(2). The other direction is handled in the same way.
Proof of Theorem 3 for Systems without the BIP Property. It is enough to treat the case Ptop(φ), E_{μφ}[ψ] = 0. We saw above that c = 0.

Part 1 (Taylor expansion). By Theorem 8, Ptop(φ + tψ) = ½σ²t² + o(t²) iff
\[ P_{top}(\bar\varphi + t\bar\psi) = \frac{\sigma^2}{2\mu_\varphi(A)}\,t^2 + o(t^2). \]
Our results for BIP maps say that this is equivalent to
\[ \frac{1}{\sqrt{n}}\,\bar\psi_n \xrightarrow[n\to\infty]{dist.} N\Bigl(0, \frac{\sigma^2}{\mu_\varphi(A)}\Bigr) \]
w.r.t. μφ. By Theorem 7, this happens iff $\frac{1}{\sqrt{n}}\psi_n \xrightarrow[n\to\infty]{dist.} N(0, \sigma^2)$ (see the remark at the end of the proof of Theorem 7). We explain why in this case ψ ∈ L²(μφ). By Theorem 5, the induced version of ψ is in L²(μφ). When we proved (14), we saw that ( ) ⇒ ϕ ∈ L²(μφ). Therefore, the induced version of ψ − sup ψ is in L². It follows that
\[ \int \sum_{k=0}^{\phi-1}(\psi - \sup\psi)^2\circ T^k\,d\mu_\varphi + \text{positive terms} < \infty, \]
whence the induced version of (ψ − sup ψ)² is integrable. By Kac' formula, ψ − sup ψ ∈ L²(μφ), and so ψ ∈ L².

Part 2 (Critical expansion). By Theorem 8, Ptop(φ + tψ) = t²L(1/t) with L(x) slowly varying and not asymptotically constant, iff $P_{top}(\bar\varphi + t\bar\psi) = \frac{1+o(1)}{\mu_\varphi(A)}\,t^2 L(1/t)$ with such L. By the BIP property, this is equivalent to the existence of B_n regularly varying of index 1/2 such that
\[ \frac{1}{B_n}\bar\psi_n \xrightarrow[n\to\infty]{dist.} N(0,1), \quad E_{\mu_\varphi}[\psi] = 0, \quad\text{and}\quad \frac{\sqrt{n}}{B_n} \to 0. \]
By (14), $\frac{1}{B_n}[\phi_n - n/\mu_\varphi(A)]$ is tight, so $\frac{1}{B_n}\bar\psi_n \xrightarrow[n\to\infty]{dist.} N(0,1)$ is equivalent to $\frac{1}{B^*_n}\psi_n \xrightarrow[n\to\infty]{dist.} N(0,1)$ for B*_n proportional to B_n. This gives the equivalence in Theorem 3, Part 2.
To finish the proof, it is enough to observe that the BIP property, the expansion $P_{top}(\bar\varphi + t\bar\psi) = \frac{1+o(1)}{\mu_\varphi(A)}\,t^2 L(1/t)$, and Theorem 5 Case (2)(c) show that L(x) → ∞ whenever it is not asymptotic to a constant.

Proof of Theorem 6. Without loss of generality φ has zero pressure, and ψ has zero expectation (and then Ptop(φ + tψ) = t^α L(1/t)). Fix an arbitrary finite union of states A so large that ψ < E_{μφ}[ψ] − ε = −ε outside A, and let ϕ(x) := 1_A(x) min{n ≥ 1 : T^n(x) ∈ A}. Let ψ̄ be the induced version of ψ on A. By Theorem 8,
\[ P_{top}(\bar\varphi + t\bar\psi) = \frac{1 + o(1)}{\mu_\varphi(A)}\,t^{\alpha} L(1/t) \quad\text{as } t \to 0^+. \]
Since 1 < α < 2, L(t) must be asymptotically non-negative (see Sect. 3). By Theorem 5,
\[ \mu_\varphi\bigl(A \cap [|\bar\psi| > t]\bigr) \sim \frac{t^{-\alpha}}{|\Gamma(1-\alpha)|}\,L(t) \quad\text{as } t\to\infty. \]
By choice of A, $\bar\psi = \psi + \sum_{k=1}^{\phi-1}\psi\circ T^k$, where each summand under the sigma symbol is less than −ε. It follows that (ϕ − 1)ε − ‖ψ‖∞ ≤ |ψ̄| ≤ ϕ‖ψ‖∞. Since L(x) is slowly varying, L(λx), L(λ + x) ∼ L(x) as x → ∞ for all λ ∈ R+, and so
\[ \mu_\varphi[\phi > t] \le \mu_\varphi\bigl[|\bar\psi| > (t-1)\varepsilon - \|\psi\|_\infty\bigr] \sim \frac{\varepsilon^{-\alpha}\,t^{-\alpha} L(t)}{|\Gamma(1-\alpha)|}, \]
\[ \mu_\varphi[\phi > t] \ge \mu_\varphi\bigl[|\bar\psi| > t\|\psi\|_\infty\bigr] \sim \frac{\|\psi\|_\infty^{-\alpha}\,t^{-\alpha} L(t)}{|\Gamma(1-\alpha)|}. \]
Consequently, $\mu_\varphi[\phi > n] \asymp L(n)/n^{\alpha}$. We now appeal to Gouëzel [Gou1], Theorem 1.3 (see also [S5]), which says that in our context for every f, g locally Hölder continuous supported inside A with non-zero expectation,
\[ \mathrm{Cov}_{\mu_\varphi}(f, g\circ T^n) \sim \sum_{k=n+1}^{\infty}\mu_\varphi[\phi > k]\int f\,d\mu_\varphi\int g\,d\mu_\varphi. \]
The theorem follows from Karamata’s Theorem.
Appendix A. Slow and Regular Variation

Slow and Regular Variation. A positive function L : (c0, ∞) → R is called slowly varying (at infinity) if it is Borel measurable and
\[ \frac{L(ts)}{L(t)} \xrightarrow[t\to\infty]{} 1 \quad\text{for all } s > 0. \]
A positive sequence {c_n}_{n≥1} is called slowly varying (at infinity) if L(t) := c_{[t]} is slowly varying (at infinity). A positive function f : (c0, ∞) → R is called regularly varying at infinity with index α, if f(x) = x^α L(x) with L(x) slowly varying at infinity. A positive sequence {c_n}_{n≥1} is called regularly varying at infinity with index α, if f(t) := c_{[t]} is regularly varying at infinity with index α. For example, log x, 1/ln ln x are slowly varying at infinity, and x^α ln x (ln ln x)², x^α/ln x are regularly varying with index α.

Sufficient Condition for Regular Variation. Let f(x) be a positive continuous function, and {a_n}, {b_n} some positive numbers such that
\[ \limsup_{n\to\infty} b_n = \infty \quad\text{and}\quad \limsup_{n\to\infty}\frac{b_{n+1}}{b_n} = 1. \]
If $\lim_{n\to\infty} a_n f(b_n x)$ exists, is positive, and is continuous on some open interval (a, b) ⊂ R+, then f(x) is regularly varying at infinity ([BGT], Theorem 1.9.2).

The General Form of Regularly Varying Functions. A Borel function f(x) is regularly varying at infinity with index α iff
\[ f(x) = [c + o(1)]\,x^{\alpha}\exp\Bigl(\int_1^x \varepsilon(u)\,\frac{du}{u}\Bigr) \quad\text{as } x\to\infty, \]
where c > 0 and ε(u) → 0 as u → ∞ ([BGT], Theorem 1.3.1). In particular, any regularly varying function f(x) with index α satisfies f(x) → ∞ when α > 0 and f(x) → 0 when α < 0 ([BGT], Prop. 1.5.1).

Uniform Convergence Theorem. If L(t) is slowly varying at infinity, then $\frac{L(ts)}{L(t)} \to 1$ as t → ∞ uniformly on compact subsets of (0, ∞) ([BGT], Theorem 1.2.1).

Asymptotic Inversion Theorem. If f(x) is regularly varying at infinity with positive index α, then there exists g(x) regularly varying at infinity with index 1/α such that (f ∘ g)(x) ∼ (g ∘ f)(x) ∼ x as x → ∞ ([BGT], Theorem 1.5.12).

Differentiating Asymptotic Relations: The Monotone Density Theorem. Suppose $U(x) = \int_0^x u(y)\,dy$, and L(x) is slowly varying at infinity.
(1) If u(y) is monotone on some interval (0, δ) and ρ ≥ 0, then U(t) ∼ t^ρ L(1/t) as t → 0+ implies u(t) ∼ ρ t^{ρ−1} L(1/t) as t → 0+.
(2) If u(y) is monotone on some interval (δ, ∞) and ρ ∈ R, then U(x) ∼ x^ρ L(x) as x → ∞ implies u(x) ∼ ρ x^{ρ−1} L(x) as x → ∞.
Here and throughout f(x) ∼ 0 · g(x) means f(x) = o(g(x)) ([BGT], Theorems 1.7.2 and 1.7.2b).

Integrating Asymptotic Relations: Karamata's Theorem. Suppose L(x) is slowly varying at infinity and locally bounded. Then as x → ∞,
\[ \int_a^x t^{\rho} L(t)\,dt \sim \frac{x^{\rho+1}}{\rho+1}\,L(x) \quad\text{for all } \rho > -1, \]
\[ \int_x^{\infty} t^{\rho} L(t)\,dt \sim -\frac{x^{\rho+1}}{\rho+1}\,L(x) \quad\text{for all } \rho < -1. \]
The converse is also true: Any positive locally bounded L(x) for which one of these relations holds for some ρ ≠ −1 must be slowly varying ([BGT], Theorems 1.5.11 and 1.6.1). After a change of variables, Karamata's theorem implies that if L(x) is slowly varying at infinity and α > −1, then
\[ \int_0^t \tau^{\alpha} L(1/\tau)\,d\tau \sim \frac{t^{1+\alpha}}{1+\alpha}\,L(1/t) \quad\text{as } t\to 0^+. \]
Conversely, if L satisfies the above, then it must be slowly varying at infinity.

Karamata's Tauberian Theorem. Let U(x) be a non-decreasing function on R, which is continuous from the right, and such that U(0) = 0. Suppose L(x) is slowly varying at infinity, and c > 0, ρ ≥ 0. The following are equivalent:
\[ U(x) \sim \frac{c\,x^{\rho}}{\Gamma(1+\rho)}\,L(x) \quad\text{as } x\to\infty, \]
\[ \int_0^{\infty} e^{-tx}\,dU(x) \sim \frac{c}{t^{\rho}}\,L(1/t) \quad\text{as } t\to 0^+ \]
([BGT], Theorem 1.7.1).
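Karamata's theorem can be checked numerically in a simple case. The sketch below is an illustration only, under the assumptions L(t) = log t, ρ = 1/2 and lower limit a = 2 (all chosen here, not taken from the text); the helper name `karamata_ratio` is hypothetical. It computes the ratio of the integral of t^ρ L(t) over [a, x] to x^{ρ+1} L(x)/(ρ+1), which should tend to 1, although slowly, with an error of order 1/log x.

```python
import math

def karamata_ratio(rho, x, a=2.0, steps=20_000):
    """Ratio of int_a^x t^rho * log(t) dt to x^(rho+1) * log(x) / (rho+1).

    The integral is evaluated by the midpoint rule after the substitution
    t = e^s, which turns the integrand into e^{(rho+1) s} * s.
    """
    lo, hi = math.log(a), math.log(x)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        s = lo + (i + 0.5) * h
        total += math.exp((rho + 1.0) * s) * s
    integral = total * h
    return integral / (x ** (rho + 1.0) * math.log(x) / (rho + 1.0))

ratios = [karamata_ratio(0.5, x) for x in (1e3, 1e6, 1e12)]
```

For this particular L the integral has a closed form, and the ratio behaves like 1 − 1/((ρ+1) log x), which is why convergence is visible only on a logarithmic scale.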
Truncated Variance: Feller's Theorem. Let F(x) be a right continuous probability distribution function such that F(0) = 0, and set $U(x) := \int_0^x y^2\,dF(y)$. Suppose L(x) is slowly varying at infinity as x → ∞, c ≠ 0, and 0 < ρ < 2. The following are equivalent:
\[ U(x) \sim c\,x^{\rho} L(x) \quad\text{as } x\to\infty, \]
\[ 1 - F(x) \sim \frac{c\rho}{2-\rho}\,x^{\rho-2} L(x) \quad\text{as } x\to\infty. \]
(See Feller [F] VIII.9 for generalizations).

Proof. Start with the identity
\[ 1 - F(x) = \int_x^{\infty} y^{-2}\,d\Bigl(\int_0^y t^2\,dF(t)\Bigr) = \int_x^{\infty} y^{-2}\,dU(y). \]
Integration by parts gives:
\[ 1 - F(x) = y^{-2}U(y)\Big|_{y=x}^{y=\infty} + 2\int_x^{\infty} y^{-3}\,U(y^-)\,dy. \]
If U(y) ∼ c y^ρ L(y), then U(y^−) ∼ c y^ρ L(y). By Karamata's theorem:
\[ 1 - F(x) = -c\,x^{\rho-2}L(x)[1 + o(1)] + 2c\,L(x)[1 + o(1)]\int_x^{\infty} y^{\rho-3}\,dy = \frac{c\rho}{2-\rho}\,x^{\rho-2}L(x)[1 + o(1)]. \]
To see the other direction, integrate by parts $U(x) = \int_0^x y^2\,dF(y)$:
\[ U(x) = y^2F(y)\Big|_{y=0}^{y=x} - 2\int_0^x y\,F(y^-)\,dy = x^2F(x) - 2\int_0^x y\,F(y^-)\,dy = -x^2[1 - F(x)] + 2\int_0^x y\,[1 - F(y^-)]\,dy. \]
Now plug into this expression the asymptotic formula for 1 − F(x) and conclude as before, using Karamata's theorem.
Appendix B. The Fisher–Felderhof Droplet Model

We describe a crude simplification of a model in [FF]. A 'vapor' close to the condensation point consists of microscopic droplets. The interaction between particles in different droplets is negligible, but the interaction between particles in the same droplet is strong, and long-range.⁷ When two droplets 'touch', they become one. 'Condensation' is the appearance of macroscopic droplets.

Here is a lattice-gas model of this situation. Space is discretized and described by a one-sided one-dimensional string of sites, each of which can be in one of two states: empty (state '0') or occupied ('1'). The configuration space is {0, 1}^{N₀}. A 'droplet' is a maximal string of occupied sites. We describe the interaction by prescribing the function φ(x0, x1, . . .) := −βU(x0|x1, x2, . . .), where β is a constant ('inverse temperature') and U(x0|x1, x2, . . .) is the energy due to the interaction of site zero and the other sites.⁸ Note that the energy due to the interaction between the first n sites and the rest is minus the nth Birkhoff sum of φ. It follows that the Helmholtz free energy (= Energy − (1/β) × Entropy) per site is up to a constant
\[ \lim_{n\to\infty}\frac{1}{n}\Bigl[-\int\sum_{k=0}^{n-1}\varphi\circ T^k\,d\mu_\varphi - \sum_{n\text{-cylinders } [a]}\mu_\varphi[a]\log\frac{1}{\mu_\varphi[a]}\Bigr] = -\Bigl(h_{\mu_\varphi}(T) + \int\varphi\,d\mu_\varphi\Bigr) = -P_{top}(\varphi), \]
at least when φ has an equilibrium measure μφ. Since different droplets do not interact, φ takes the form
\[ \varphi(0, *) = 0, \qquad \varphi(\underbrace{1, 1, \ldots, 1}_{n}, 0, *) := f(n) \]
for some function f(n). If the interaction is 'long range', then this function is not locally Hölder, because the effect of far away sites is not exponentially small.

Consider now the following re-coding of a configuration: (x0, x1, . . .) → (y0, y1, . . .), where x_i = 0 ⇒ y_i = 0, and x_i = 1 ⇒ y_i = 1 + number of occupied sites to the right of i until the first unoccupied site, for example: (0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, . . .) → (0, 2, 1, 0, 0, 1, 0, 3, 2, 1, 0, . . .). In this coding, the configuration space becomes the renewal shift: the topological Markov shift with state space N ∪ {0} and transition matrix
\[ A = (t_{ij}) \quad\text{where}\quad t_{ij} = \begin{cases} 1 & i = 0;\\ 1 & i > 0,\ j = i - 1;\\ 0 & \text{otherwise.}\end{cases} \]

⁷ One example of long-range interactions in liquid droplets is 'surface tension'.
⁸ It is useful to think of U(x0|x1, x2, . . .) as of the energy cost of separating site zero from sites n, n > 0, and moving it to infinity, that is, if site zero is occupied.
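The re-coding described above can be sketched in a few lines. This is a minimal illustration, assuming a finite truncation of the configuration that ends with an empty site (otherwise the length of the last droplet is unknown); the function names `recode` and `admissible` are hypothetical.

```python
def recode(x):
    """Map a finite 0/1 configuration to renewal-shift symbols.

    An empty site is sent to 0; an occupied site is sent to 1 plus the
    number of occupied sites to its right before the first empty site.
    The truncation must end with 0 so that every droplet length is known.
    """
    if not x or x[-1] != 0:
        raise ValueError("the truncated configuration must end with an empty site")
    y = [0] * len(x)
    run = 0  # length of the current run of occupied sites, scanned from the right
    for i in range(len(x) - 1, -1, -1):
        run = run + 1 if x[i] == 1 else 0
        y[i] = run
    return y

def admissible(y):
    """Check the renewal-shift transitions: 0 -> anything, i > 0 -> i - 1 only."""
    return all(i == 0 or j == i - 1 for i, j in zip(y, y[1:]))

example = recode([0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0])
```

On the configuration used in the text, `example` reproduces the coded sequence given there, and every consecutive pair is an allowed transition of the renewal shift.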
In the new coordinates the interaction becomes locally Hölder ('short range'):
\[ \widehat{\varphi}(y_0, y_1, \ldots) = \begin{cases} f(y_0) & y_0 \ne 0;\\ 0 & y_0 = 0.\end{cases} \]
Thus a compact shift with a long range potential is recoded as a non-compact shift with a short range potential. The critical phenomena for the Fisher–Felderhof model for various choices of f(n) are described in [FF] and [Wa1, Wa2].

Acknowledgements. The author wishes to thank the referee for his careful reading of the manuscript and for many valuable suggestions, and Michael Fisher and David Ruelle for useful discussions.
References

[ADU] Aaronson, J., Denker, M., Urbański, M.: Ergodic theory for Markov fibred systems and parabolic rational maps. Trans. AMS 337, 495–548 (1993)
[AD] Aaronson, J., Denker, M.: Local limit theorems for partial sums of stationary sequences generated by Gibbs-Markov maps. Stoch. Dyn. 1(2), 193–237 (2001)
[BG] Bálint, P., Gouëzel, S.: Limit theorems in the stadium billiard. Commun. Math. Phys. 263(2), 461–512 (2006)
[BGT] Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular variation. Encyclopedia of Math. and its Appl. 27, Cambridge: Cambridge Univ. Press, 1987
[BDFN] Binney, J.J., Dowrick, N.J., Fisher, A.J., Newman, M.E.J.: The theory of critical phenomena, an introduction to the renormalization group. Oxford Science Publications, Oxford: Oxford University Press, 1992
[Bo] Bowen, R.: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lecture Notes in Mathematics, Vol. 470. Berlin-New York: Springer-Verlag, 1975
[BS] Buzzi, J., Sarig, O.: Uniqueness of equilibrium measures for countable Markov shifts and multidimensional piecewise expanding maps. Erg. Th. Dynam. Sys. 23, 1383–1400 (2003)
[Ea] Eagleson, G.K.: Some simple conditions for limit theorems to be mixing. (Russian) Teor. Verojatnost. i Primenen. 21(3), 653–660 (1976). Engl. Transl. in Theor. Probab. Appl. 21(3), 637–642 (1976)
[El] Ellis, R.S.: Entropy, large deviations, and statistical mechanics. Grund. Math. Wissenschaften 271, Berlin-Heidelberg-New York: Springer-Verlag, 1985
[F] Feller, W.: An introduction to probability theory and its applications. Volume II, Second edition, New York: John Wiley & Sons, 1971
[FF] Fisher, M.E., Felderhof, B.U.: Phase transition in one-dimensional cluster-interaction fluids: IA. Thermodynamics, IB. Critical behavior. II. Simple logarithmic model. Ann. Phy. 58, 177–280 (1970)
[GK] Gnedenko, B.V., Kolmogorov, A.N.: Limit distributions for sums of independent random variables. Translated and annotated by K.L. Chung, with an Appendix by J.L. Doob. Reading, MA: Addison-Wesley Publishing Company, 1954
[Gou1] Gouëzel, S.: Sharp polynomial estimates for the decay of correlations. Israel J. Math. 139, 29–65 (2004)
[Gou2] Gouëzel, S.: Central limit theorem and stable laws for intermittent maps. Probab. Theory Related Fields 128(1), 82–122 (2004)
[Gu1] Gurevič, B.M.: Topological entropy of a countable Markov chain. Dokl. Akad. Nauk SSSR 187, 715–718 (1969)
[Gu2] Gurevich, B.M.: A variational characterization of one-dimensional countable state Gibbs Random field. Z. Wahrscheinlichkeitstheorie verw. Gebiete 68, 205–242 (1984)
[H1] Haydn, N., Isola, S.: Parabolic rational maps. J. London Math. Soc. 63(2), 673–689 (2001)
[Hi] Hilfer, R.: Classification theory for anequilibrium phase transitions. Phys. Rev. E 48(4), 2466–2475 (1993)
[Ho] Hofbauer, F.: Examples for the non-uniqueness of the equilibrium states. Trans. AMS 228, 223–241 (1977)
[Ka] Kato, T.: Perturbation theory for linear operators. Reprint of the 1980 edition. Classics in Mathematics. Berlin: Springer-Verlag, 1995
[Ke] Keane, M.: Strongly mixing g-measures. Invent. Math. 16, 309–324 (1972)
[Lo] Lopes, A.O.: The Zeta function, non-differentiability of pressure, and the critical exponent of transition. Adv. Math. 101(2), 133–165 (1993)
[ML] Martin-Löf, A.: Mixing properties, differentiability of the free energy and the central limit theorem for a pure phase in the Ising model at low temperature. Commun. Math. Phys. 32, 75–92 (1973)
[MU1] Urbański, M., Mauldin, R.D.: Gibbs states on the symbolic space over an infinite alphabet. Israel J. Math. 125, 93–130 (2001)
[MU2] Mauldin, R.D., Urbański, M.: Graph directed Markov systems; Geometry and dynamics of limit sets. Cambridge Tracts in Mathematics, 148, Cambridge: Cambridge University Press, 2003
[MT] Melbourne, I., Török, A.: Statistical limit theorems for suspension flows. Israel J. Math. 194, 191–210 (2004)
[N] Nagaev, S.V.: Some limit theorems for stationary Markov chains. (Russian) Teor. Veroyatnost. i Primenen. 2, 389–416 (1957)
[PS] Prellberg, T., Slawny, J.: Maps of intervals with indifferent fixed points: thermodynamic formalism and phase transitions. J. Stat. Phys. 66(1–2), 503–514 (1992)
[Ru] Ruelle, D.: Thermodynamic Formalism, The mathematical structures of equilibrium statistical mechanics. 2nd Ed. Cambridge Mathematical Library. Cambridge: Cambridge University Press, 2004
[S1] Sarig, O.M.: Thermodynamic Formalism for Countable Markov shifts. Erg. Th. Dyn. Sys. 19, 1565–1593 (1999)
[S2] Sarig, O.: Phase Transitions for Countable Markov Shifts. Commun. Math. Phys. 217, 555–577 (2001)
[S3] Sarig, O.: Characterization of existence of Gibbs measures for Countable Markov shifts. Proc. of AMS 131(6), 1751–1758 (2003)
[S4] Sarig, O.: Thermodynamic formalism for null recurrent potentials. Israel J. Math. 121, 285–311 (2001)
[S5] Sarig, O.: Subexponential decay of correlations. Invent. Math. 150, 629–653 (2002)
[S6] Sarig, O.: Thermodynamic formalism for countable Markov shifts. Tel-Aviv University Thesis (2000)
[S7] Sarig, O.: On an example with a non-analytic topological pressure. C. R. Acad. Sci. Paris Sér. I Math. 330(4), 311–315 (2000)
[St] Stanley, H.E.: Introduction to phase transitions and critical phenomena. Oxford: Oxford University Press, 1971
[W] Walters, P.: Ruelle's operator theorem and g-measures. Trans. Amer. Math. Soc. 214, 375–387 (1975)
[Wa1] Wang, X.-J.: Abnormal fluctuations and thermodynamic phase transition in dynamical systems. Phys. Review A 39(6), 3214–3217
[Wa2] Wang, X.-J.: Statistical physics of temporal intermittency. Phys. Review A 40(11), 6647–6661 (1989)
[Y1] Yuri, M.: Thermodynamic formalism for countable to one Markov systems. Trans. Amer. Math. Soc. 355(7), 2949–2971 (2003)
[Y2] Yuri, M.: Phase transition, non-Gibbsianness and subexponential instability. Ergodic Thy Dynam. Syst. 25, 1325–1342 (2005)
[Z] Zolotarev, V.M.: One-dimensional stable distributions. Transl. Math. Monog. 65, Providence, RI: Amer. Math. Soc., 1986
Communicated by G. Gallavotti
Commun. Math. Phys. 267, 669–701 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0059-4
Communications in
Mathematical Physics
Lifshitz Tails in Constant Magnetic Fields Frédéric Klopp1 , Georgi Raikov2 1 Université de Paris Nord et Institut Universitaire de France, Département de mathématiques, Avenue J.
Baptiste Clément, 93430 Villetaneuse, France. E-mail: [email protected]
2 Departamento de Matemáticas, Facultad de Ciencias, Universidad de Chile, Las Palmeras 3425, Santiago,
Chile. E-mail: [email protected] Received: 12 September 2005 / Accepted: 25 January 2006 Published online: 19 August 2006 – © Springer-Verlag 2006
Abstract: We consider the 2D Landau Hamiltonian H perturbed by a random alloy-type potential, and investigate the Lifshitz tails, i.e. the asymptotic behavior of the corresponding integrated density of states (IDS) near the edges in the spectrum of H. If a given edge coincides with a Landau level, we obtain different asymptotic formulae for power-like, exponential sub-Gaussian, and super-Gaussian decay of the one-site potential. If the edge is away from the Landau levels, we impose a rational-flux assumption on the magnetic field, consider compactly supported one-site potentials, and formulate a theorem which is analogous to a result obtained by the first author and T. Wolff in [25] for the case of a vanishing magnetic field.

1. Introduction

Let
\[ H_0 = H_0(b) := (-i\nabla - A)^2 - b \tag{1.1} \]
be the unperturbed Landau Hamiltonian, essentially self-adjoint on $C_0^\infty(\mathbb{R}^2)$. Here $A = \bigl(-\frac{b x_2}{2}, \frac{b x_1}{2}\bigr)$ is the magnetic potential, and b ≥ 0 is the constant scalar magnetic field. It is well-known that if b > 0, then the spectrum σ(H0) of the operator H0(b) consists of the so-called Landau levels 2bq, q ∈ Z+, and each Landau level is an eigenvalue of infinite multiplicity. If b = 0, then H0 = −Δ, and σ(H0) = [0, ∞) is absolutely continuous. Next, we introduce a random Z²-ergodic alloy-type electric potential
\[ V(x) = V_\omega(x) := \sum_{\gamma\in\mathbb{Z}^2}\omega_\gamma\,u(x - \gamma), \quad x\in\mathbb{R}^2. \]
Our general assumptions concerning the potential Vω are the following ones:
• H1: The single-site potential u satisfies the estimates
\[ 0 \le u(x) \le C_0(1 + |x|)^{-\kappa}, \quad x\in\mathbb{R}^2, \tag{1.2} \]
with some κ > 2 and C0 > 0. Moreover, there exist a non-empty open subset of R² and a constant C1 > 0 such that u(x) ≥ C1 for x in this subset.
• H2: The coupling constants $\{\omega_\gamma\}_{\gamma\in\mathbb{Z}^2}$ are non-trivial, almost surely bounded i.i.d. random variables.

Evidently, these two assumptions entail
\[ M := \operatorname*{ess\,sup}_{\omega}\,\sup_{x\in\mathbb{R}^2}|V_\omega(x)| < \infty. \tag{1.3} \]
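Assumptions H1 and H2 and the bound (1.3) can be illustrated numerically. The sketch below is an assumption-laden toy: the single-site potential is taken to be u(x) = (1 + |x|)^(−3), the couplings are uniform on [0, 1], and the lattice sum is truncated to a finite box; none of these choices comes from the paper, and all function names are hypothetical.

```python
import math
import random

def single_site(x, kappa=3.0):
    """Hypothetical single-site potential u(x) = (1 + |x|)^(-kappa), kappa > 2."""
    return (1.0 + math.hypot(x[0], x[1])) ** (-kappa)

def sample_couplings(radius, seed=0):
    """i.i.d. couplings omega_gamma, uniform on [0, 1], on a truncated lattice box."""
    rng = random.Random(seed)
    return {(i, j): rng.uniform(0.0, 1.0)
            for i in range(-radius, radius + 1)
            for j in range(-radius, radius + 1)}

def alloy_potential(x, omega):
    """Truncated alloy-type sum V_omega(x) = sum_gamma omega_gamma * u(x - gamma)."""
    return sum(w * single_site((x[0] - g[0], x[1] - g[1]))
               for g, w in omega.items())

omega = sample_couplings(radius=20)
grid = [(0.25 * i, 0.25 * j) for i in range(-8, 9) for j in range(-8, 9)]
values = [alloy_potential(x, omega) for x in grid]

# With all couplings set to their maximal value 1 we obtain the (truncated)
# periodic potential, which dominates every realisation on the same grid.
all_ones = dict.fromkeys(omega, 1.0)
periodic_bound = max(alloy_potential(x, all_ones) for x in grid)
```

Because κ > 2, the full lattice sum of u converges, which is what makes the uniform bound (1.3) possible; the truncation here only mimics that.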
On the domain of H0 define the operator H = Hω := H0(b) + Vω. The integrated density of states (IDS) for the operator H is defined as a non-decreasing left-continuous function N_b : R → [0, ∞) which almost surely satisfies
\[ \int_{\mathbb{R}}\varphi(E)\,dN_b(E) = \lim_{R\to\infty} R^{-2}\,\mathrm{Tr}\bigl(\mathbf{1}_{\Lambda_R}\varphi(H)\mathbf{1}_{\Lambda_R}\bigr), \quad \forall\varphi\in C_0^\infty(\mathbb{R}). \tag{1.4} \]
Here and in the sequel $\mathbf{1}_{\mathcal{O}}$ denotes the characteristic function of the set $\mathcal{O}$, and $\Lambda_R := \bigl(-\frac{R}{2}, \frac{R}{2}\bigr)^2$. By the Pastur-Shubin formula (see e.g. [36, Sect. 2] or [11, Cor. 3.3]) we have
\[ \int_{\mathbb{R}}\varphi(E)\,dN_b(E) = \mathbb{E}\bigl(\mathrm{Tr}\bigl(\mathbf{1}_{\Lambda_1}\varphi(H)\mathbf{1}_{\Lambda_1}\bigr)\bigr), \quad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{1.5} \]
where $\mathbb{E}$ denotes the mathematical expectation. Moreover, there exists a set Σ ⊂ R such that σ(Hω) = Σ almost surely, and supp dN_b = Σ. The aim of the present article is to study the asymptotic behavior of N_b near the edges of Σ. It is well known that, for many random models, this behavior is characterized by a very fast decay which goes under the name of "Lifshitz tails". It was studied extensively in the absence of magnetic field (see e.g. [31, 15]), and also in the presence of magnetic field for other types of disorder (see [2, 6, 12, 7, 13]).

2. Main Results

In order to fix the picture of the almost sure spectrum σ(Hω), we assume b > 0, and make the following two additional hypotheses:
• H3: The support of the random variables ω_γ, γ ∈ Z², consists of the interval [ω−, ω+] with ω− < ω+ and ω−ω+ ≤ 0.
• H4: We have M+ − M− < 2b where $\pm M_\pm := \operatorname*{ess\,sup}_{\omega}\sup_{x\in\mathbb{R}^2}(\pm V_\omega(x))$.

Assumptions H1 – H4 imply M−M+ ≤ 0. Moreover, the union $\cup_{q=0}^{\infty}[2bq + M_-, 2bq + M_+]$, which contains Σ, is disjoint. Introduce the bounded Z²-periodic potential
\[ W(x) := \sum_{\gamma\in\mathbb{Z}^2} u(x - \gamma), \quad x\in\mathbb{R}^2, \]
and on the domain of H0 define the operators H± := H0 + ω±W. It is easy to see that
\[ \sigma(H^-) \subseteq \cup_{q=0}^{\infty}[2bq + M_-, 2bq], \qquad \sigma(H^+) \subseteq \cup_{q=0}^{\infty}[2bq, 2bq + M_+], \]
and
\[ \sigma(H^-)\cap[2bq + M_-, 2bq] \ne \emptyset, \qquad \sigma(H^+)\cap[2bq, 2bq + M_+] \ne \emptyset, \qquad \forall q\in\mathbb{Z}_+. \]
Set
\[ E_q^- := \inf\bigl\{\sigma(H^-)\cap[2bq + M_-, 2bq]\bigr\}, \qquad E_q^+ := \sup\bigl\{\sigma(H^+)\cap[2bq, 2bq + M_+]\bigr\}. \]
Following the argument in [16] (see also [31, Theorem 5.35]), we easily find that
\[ \Sigma = \cup_{q=0}^{\infty}[E_q^-, E_q^+], \]
i.e. Σ is represented as a disjoint union of compact intervals, and each interval [E_q^−, E_q^+] contains exactly one Landau level 2bq, q ∈ Z+. In the following theorems we describe the behavior of the integrated density of states N_b near E_q^−, q ∈ Z+; its behavior near E_q^+ could be analyzed in a completely analogous manner. Our first theorem concerns the case where E_q^− = 2bq, q ∈ Z+. This is the case if and only if ω− = 0; in this case, the random variables ω_γ, γ ∈ Z², are non-negative.

Theorem 2.1. Let b > 0 and Assumptions H1 – H4 hold. Suppose that ω− = 0, and that
\[ \mathbb{P}(\omega_0 \le E) \sim C E^{\kappa}, \quad E\downarrow 0, \tag{2.1} \]
for some C > 0 and κ > 0. Fix the Landau level 2bq = E_q^−, q ∈ Z+.
i) Assume that $C_-(1 + |x|)^{-\kappa} \le u(x) \le C_+(1 + |x|)^{-\kappa}$, x ∈ R², for some κ > 2, and C+ ≥ C− > 0. Then we have
\[ \lim_{E\downarrow 0}\frac{\ln|\ln(N_b(2bq + E) - N_b(2bq))|}{\ln E} = -\frac{2}{\kappa - 2}. \tag{2.2} \]
ii) Assume $\frac{e^{-C_+|x|^{\beta}}}{C_+} \le u(x) \le \frac{e^{-C_-|x|^{\beta}}}{C_-}$, x ∈ R², β ∈ (0, 2], C+ ≥ C− > 0. Then we have
\[ \lim_{E\downarrow 0}\frac{\ln|\ln(N_b(2bq + E) - N_b(2bq))|}{\ln|\ln E|} = 1 + \frac{2}{\beta}. \tag{2.3} \]
iii) Assume $\frac{\mathbf{1}_{\{x\in\mathbb{R}^2;\,|x - x_0| < \varepsilon\}}}{C_+} \le u(x) \le \frac{e^{-C_-|x|^2}}{C_-}$ for some C+ ≥ C− > 0, x0 ∈ R², and ε > 0. Then there exists δ > 0 such that
\[ 1 + \delta \le \liminf_{E\downarrow 0}\frac{\ln|\ln(N_b(2bq + E) - N_b(2bq))|}{\ln|\ln E|} \le \limsup_{E\downarrow 0}\frac{\ln|\ln(N_b(2bq + E) - N_b(2bq))|}{\ln|\ln E|} \le 2. \tag{2.4} \]
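To make the double-logarithmic statements (2.2) and (2.3) concrete, the following sketch tabulates the exponents and evaluates a model profile with exactly the slope of (2.2). It is an illustration only: the function names are hypothetical, and the profile exp(−E^(−2/(κ−2))) is an assumed stand-in for N_b(2bq + E) − N_b(2bq), not a formula from the theorem, which controls only the double logarithm.

```python
import math

def power_decay_exponent(kappa):
    """Right-hand side of (2.2): slope of ln|ln(IDS increment)| against ln E."""
    if kappa <= 2:
        raise ValueError("the decay exponent must exceed 2")
    return -2.0 / (kappa - 2.0)

def exp_decay_exponent(beta):
    """Right-hand side of (2.3): slope of ln|ln(IDS increment)| against ln|ln E|."""
    if not 0 < beta <= 2:
        raise ValueError("beta must lie in (0, 2]")
    return 1.0 + 2.0 / beta

def model_increment(energy, kappa):
    """Assumed model profile exp(-E^(-2/(kappa-2))) with the double-log slope of (2.2)."""
    return math.exp(-energy ** power_decay_exponent(kappa))

def double_log_slope(energy, kappa):
    """ln|ln(model increment)| / ln E, to be compared with (2.2)."""
    return math.log(abs(math.log(model_increment(energy, kappa)))) / math.log(energy)

slope = double_log_slope(0.01, kappa=4.0)
gaussian_boundary = exp_decay_exponent(2.0)
```

Note that at β = 2 the exponent in (2.3) equals 2, which matches the upper bound in (2.4) for super-Gaussian decay.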
The proof of Theorem 2.1 is contained in Sects. 3–5. In Sect. 3 we construct a periodic approximation of the IDS Nb which plays a crucial role in this proof. The upper bounds of the IDS needed for the proof of Theorem 2.1 are obtained in Sect. 4, and the corresponding lower bounds are deduced in Sect. 5. Remarks. i) In the first and second part of Theorem 2.1 we consider one-site potentials u respectively of power-like or exponential sub-Gaussian decay at infinity, and obtain the values of the so-called Lifshitz exponents. Note however that in the case of power-like decay of u the double logarithm of Nb (2bq + E) − Nb (2bq) is asymptotically proportional to ln E (see (2.2)), while in the case of exponentially decaying u this double logarithm is asymptotically proportional to ln | ln E| (see (2.3)); in both cases the Lifshitz exponent is defined as the corresponding proportionality factor. In the third part of the theorem which deals with one-site potentials u of super-Gaussian decay, we obtain only upper and lower bounds of the Lifshitz exponent. It is natural to conjecture that the value of this exponent is 2, i.e. that the upper bound in (2.4) reveals the correct asymptotic behavior. ii) In the case of a vanishing magnetic field, the Lifshitz asymptotics for random Schrödinger operator with repulsive random alloy-type potentials has been known since long ago (see [17]). To the authors’ best knowledge the Lifshitz asymptotics for the Landau Hamiltonian with non-zero magnetic field, perturbed by a positive random alloy-type potential, is considered for the first time in the present article. However, it is appropriate to mention here the related results concerning the Landau Hamiltonian with repulsive random Poisson potential. In [2] the Lifshitz asymptotics in the case of a power-like decay of the one-site potential u, was investigated. The case of a compact support of u was considered in [6]. 
The results for the case of a compact support of u were essentially used in [12] and [7] (see also [13]), in order to study the problem in the case of an exponential decay of u.

Our second theorem concerns the case where E_q^− < 2bq, q ∈ Z+. This is the case if and only if ω− < 0. In order to handle this case, we need some facts from the magnetic Floquet-Bloch theory. Let Γ := g1Z ⊕ g2Z with g_j > 0, j = 1, 2. Introduce the tori
\[ \mathbb{T}_\Gamma := \mathbb{R}^2/\Gamma, \qquad \mathbb{T}^*_\Gamma := \mathbb{R}^2/\Gamma^*, \tag{2.5} \]
where $\Gamma^* := 2\pi g_1^{-1}\mathbb{Z}\oplus 2\pi g_2^{-1}\mathbb{Z}$ is the lattice dual to Γ. Denote by $\mathcal{O}_\Gamma$ and $\mathcal{O}^*_\Gamma$ the fundamental domains of $\mathbb{T}_\Gamma$ and $\mathbb{T}^*_\Gamma$ respectively. Let W : R² → R be a Γ-periodic bounded real-valued function. On the domain of H0 define the operator H_W := H0 + W. Assume that the scalar magnetic field b ≥ 0 satisfies the integer-flux condition with respect to the lattice Γ, i.e. that b g1 g2 ∈ 2πZ+. Fix θ ∈ T*_Γ. Denote by h0(θ) the self-adjoint operator generated in $L^2(\mathcal{O}_\Gamma)$ by the closure of the non-negative quadratic form
\[ \int_{\mathcal{O}_\Gamma}|(i\nabla + A - \theta)f|^2\,dx \]
defined originally on the set
\[ \bigl\{ f = g|_{\mathcal{O}_\Gamma} \;\big|\; g\in C^\infty(\mathbb{R}^2),\ (\tau_\gamma g)(x) = g(x),\ x\in\mathbb{R}^2,\ \gamma\in\Gamma \bigr\}, \]
where τ_y, y ∈ R², is the magnetic translation given by
\[ (\tau_y g)(x) := e^{ib\frac{y_1 y_2}{2}}\,e^{ib\frac{x\wedge y}{2}}\,g(x + y), \quad x\in\mathbb{R}^2, \tag{2.6} \]
with x ∧ y := x1y2 − x2y1. Note that the integer-flux condition implies that the operators τ_γ, γ ∈ Γ, commute with each other, as well as with the operators $i\frac{\partial}{\partial x_j} + A_j$, j = 1, 2 (see (1.1)), and hence with H0 and H_W. In the case b = 0, the domain of the operator h0 is isomorphic to the Sobolev space H²(T_Γ), but if b > 0, this is not the case even under the integer-flux assumption since h0 acts on U(1)-sections rather than on functions over T_Γ (see e.g. [30, Subsect. 2.2]). On the domain of h0 define the operator
\[ h_W(\theta) := h_0(\theta) + W, \quad \theta\in\mathbb{T}^*_\Gamma. \tag{2.7} \]
Set
\[ \mathcal{H}_0 := \int^{\oplus}_{\mathcal{O}^*_\Gamma} h_0(\theta)\,d\theta, \qquad \mathcal{H}_W := \int^{\oplus}_{\mathcal{O}^*_\Gamma} h_W(\theta)\,d\theta. \tag{2.8} \]
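It is well-known (see e.g. [10, 35] or [30, Subsect. 2.4]) that the operators H0 and H_W are unitarily equivalent to the operators $\mathcal{H}_0$ and $\mathcal{H}_W$ respectively. More precisely, we have $H_0 = U^*\mathcal{H}_0 U$ and $H_W = U^*\mathcal{H}_W U$, where $U : L^2(\mathbb{R}^2) \to L^2(\mathcal{O}_\Gamma\times\mathcal{O}^*_\Gamma)$ is the unitary Gelfand-type operator defined by
\[ (Uf)(x;\theta) := \frac{1}{\sqrt{\mathrm{vol}\,\mathbb{T}^*_\Gamma}}\sum_{\gamma\in\Gamma} e^{-i\theta(x+\gamma)}(\tau_\gamma f)(x), \quad x\in\mathcal{O}_\Gamma,\ \theta\in\mathbb{T}^*_\Gamma. \tag{2.9} \]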
Evidently for each θ ∈ T*_Γ the spectrum of the operator h_W(θ) is purely discrete. Denote by $\{E_j(\theta)\}_{j=1}^{\infty}$ the non-decreasing sequence of its eigenvalues. Let E ∈ R. Set
\[ J(E) := \bigl\{ j\in\mathbb{N} \;;\; \text{there exists } \theta\in\mathbb{T}^*_\Gamma \text{ such that } E_j(\theta) = E \bigr\}. \]
Evidently, for each E ∈ R the set J(E) is finite. If E ∈ R is an end of an open gap in σ(H0 + W), then we will call it an edge in σ(H0 + W). We will call the edge E in σ(H0 + W) simple if #J(E) = 1. Moreover, we will call the edge E non-degenerate if for each j ∈ J(E) the number of points θ ∈ T*_Γ such that E_j(θ) = E is finite, and at each of these points the extremum of E_j is non-degenerate.

Assume at first that b = 0. Then H0 = −Δ, and we will consider the general d-dimensional situation; the simple and non-degenerate edges in σ(−Δ + W) are defined exactly as in the two-dimensional case. If W : R^d → R is a bounded periodic function, it is well-known that:
• The spectrum of −Δ + W is absolutely continuous (see e.g. [33, Theorems XIII.90, XIII.100]). In particular, no Floquet eigenvalue E_j : T*_Γ → R, j ∈ N, is constant.
• If d = 1, all the edges in σ(−Δ + W) are simple and non-degenerate (see e.g. [33, Theorem XIII.89]).
• For d ≥ 1 the bottom of the spectrum of −Δ + W is a simple and non-degenerate edge (see [19]).
• For d ≥ 1, the edges of σ(−Δ + W) generically are simple (see [24]).

Despite the widely spread belief that generically the higher edges in σ(−Δ + W) should also be non-degenerate in the multi-dimensional case d > 1, there are no rigorous results in support of this conjecture. Let us go back to the investigation of the Lifshitz tails for the operator −Δ + Vω. It follows from the general results of [16] that E− (respectively, E+) is an upper (respectively, lower) end of an open gap in σ(−Δ + Vω) if and only if it is an upper (respectively,
lower) end of an open gap in the spectrum of −Δ + ω−W (respectively, −Δ + ω+W). For definiteness, let us consider the case of an upper end E−. The asymptotic behavior of the IDS N0(E) as E ↓ E− has been investigated in [28, 29] in the case d = 1, and in [18] in the case d ≥ 1 and E− = inf σ(−Δ + ω−W). Note that the proofs of the results of [28, 29] and [18] essentially rely on the non-degeneracy of E−. Later, the Lifshitz tails for the operator −Δ + Vω near the edge E− were investigated in [21] under the assumptions that d ≥ 1, E− > inf σ(−Δ + ω−W), and that E− is a non-degenerate edge in the spectrum of −Δ + ω−W; due to the last assumption these results are conditional. However, it turned out possible to lift the non-degeneracy assumption in the two-dimensional case considered in [25]. First, it was shown in [25, Theorem 0.1] that for any single-site potential u satisfying assumption H1, we have
\[ \limsup_{E\downarrow 0}\frac{\ln|\ln(N_0(E^- + E) - N_0(E^-))|}{\ln E} < 0 \]
without any additional assumption on E−. If, moreover, the support of u is compact, and the probability P(ω0 − ω− ≤ E) admits a power-like decay as E ↓ 0, it follows from [25, Theorem 0.2] that there exists α > 0 such that
\[ \lim_{E\downarrow 0}\frac{\ln|\ln(N_0(E^- + E) - N_0(E^-))|}{\ln E} = -\alpha \tag{2.10} \]
under the unique generic hypothesis that E− is a simple edge. Note that the absolute continuity of σ(−Δ + ω−W) plays a crucial role in the proofs of the results of [25].

Assume now that the scalar magnetic field b > 0 satisfies the rational flux condition b ∈ 2πQ. More precisely, we assume that b/2π is equal to the irreducible fraction p/r, p ∈ N, r ∈ N. Then b satisfies the integer-flux assumption with respect, say, to the lattice Γ = rZ ⊕ Z, and the operator H− is unitarily equivalent to $\mathcal{H}_{\omega_- W}$. As in the non-magnetic case, in order to investigate the Lifshitz asymptotics as E ↓ E_q^− of N_b(E), we need some information about the character of E_q^− as an edge in the spectrum of H−. For example, if we assume that E_q^− is a simple edge, and the corresponding Floquet band does not shrink into a point, we can repeat almost word by word the argument of the proof of [25, Theorem 0.2], and obtain the following

Theorem 2.2. Let b > 0, b ∈ 2πQ, and Assumptions H1 – H4 hold. Assume that the support of u is compact, ω− < 0, and P(ω0 − ω− ≤ E) ∼ C E^κ, E ↓ 0, for some C > 0 and κ > 0. Fix q ∈ Z+. Suppose E_q^− is a simple edge in the spectrum of the operator H−, and that the function E_j, j ∈ J(E_q^−), is not identically constant. Then there exists α > 0 such that
\[ \lim_{E\downarrow 0}\frac{\ln|\ln(N_b(E_q^- + E) - N_b(E_q^-))|}{\ln E} = -\alpha. \tag{2.11} \]
Remarks. i) It is believed that under the rational-flux assumption the Floquet eigenvalues $E_j$, $j \in \mathbb{N}$, for the operator $H^-$ generically are not constant. Note that this property may hold only generically due to the obvious counterexample where $u = \mathbf{1}_{\Lambda_1}$, $H^- = H_0 + \omega_-$, and for all $j \in \mathbb{N}$ the Floquet eigenvalue $E_j$ is identically equal to $2b(j - 1) + \omega_-$. Also, in contrast to the non-magnetic case, we do not know whether the edges in the spectrum of $H^-$ generically are simple. ii) The definition of the constant $\alpha$ in (2.11) is completely analogous to the one in (2.10) which concerns the non-magnetic case. This definition involving the concepts of
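Newton polygon, Newton diagram, and Newton decay exponent, is not trivial, and can be found in the original work [25], or in [22, Subsect. 4.2.8].

3. Periodic Approximation

Pick $a > 0$ such that $\frac{ba^2}{2\pi} \in \mathbb{N}$. Set $L := (2n + 1)a/2$, $n \in \mathbb{N}$, and define the random $2L\mathbb{Z}^2$-periodic potential
\[
V^{\mathrm{per}}(x) = V^{\mathrm{per}}_{n,\omega}(x) := \sum_{\gamma \in 2L\mathbb{Z}^2} \left(V_\omega \mathbf{1}_{\Lambda_{2L}}\right)(x + \gamma), \quad x \in \mathbb{R}^2.
\]
On the domain of $H_0$ define the operator $H^{\mathrm{per}} = H^{\mathrm{per}}_{n,\omega} := H_0 + V^{\mathrm{per}}_{n,\omega}$. For brevity set $\mathbb{T}_{2L} := \mathbb{T}_{2L\mathbb{Z}^2}$, $\mathbb{T}^*_{2L} := \mathbb{T}^*_{2L\mathbb{Z}^2}$ (see (2.5)). Note that the square $\Lambda_{2L}$ is the fundamental domain of the torus $\mathbb{T}_{2L}$, while $\Lambda^*_{2L} := \Lambda_{\pi L^{-1}}$ is the fundamental domain of $\mathbb{T}^*_{2L}$. As in (2.7), on the domain of $h_0$ define the operator $h(\theta) = h^{\mathrm{per}}(\theta) := h_0(\theta) + V^{\mathrm{per}}$, $\theta \in \mathbb{T}^*_{2L}$, and by analogy with (2.8) set
\[
\mathcal{H}^{\mathrm{per}} := \int^{\oplus}_{\Lambda^*_{2L}} h^{\mathrm{per}}(\theta)\, d\theta.
\]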
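As above, the operators $H_0$ and $H^{\mathrm{per}}$ are unitarily equivalent to the operators $\mathcal{H}_0$ and $\mathcal{H}^{\mathrm{per}}$ respectively. Set
\[
N^{\mathrm{per}}(E) = N^{\mathrm{per}}_{n,\omega}(E) := (2\pi)^{-2} \int_{\Lambda^*_{2L}} N(E; h^{\mathrm{per}}(\theta))\, d\theta, \quad E \in \mathbb{R}. \tag{3.1}
\]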
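Here and in the sequel, if $T$ is a self-adjoint operator with purely discrete spectrum, then $N(E; T)$ denotes the number of the eigenvalues of $T$ less than $E \in \mathbb{R}$, counted with the multiplicities. The function $N^{\mathrm{per}}$ plays the role of IDS for the operator $H^{\mathrm{per}}$ since, similarly to (1.4) and (1.5), we have
\[
\int_{\mathbb{R}} \varphi(E)\, dN^{\mathrm{per}}(E) = \lim_{R \to \infty} R^{-2}\, \mathrm{Tr}\left(\mathbf{1}_{\Lambda_R}\, \varphi(H^{\mathrm{per}})\, \mathbf{1}_{\Lambda_R}\right)
\]
almost surely, and
\[
\mathbb{E}\left(\int_{\mathbb{R}} \varphi(E)\, dN^{\mathrm{per}}(E)\right) = \mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}\, \varphi(H^{\mathrm{per}})\, \mathbf{1}_{\Lambda_1}\right)\right), \tag{3.2}
\]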
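for any $\varphi \in C_0^\infty(\mathbb{R})$ (see e.g. the proof of [21, Theorem 5.1] where however the case of a vanishing magnetic field is considered).

Theorem 3.1. Assume that Hypotheses $H_1$ and $H_2$ hold. Let $q \in \mathbb{Z}_+$, $\eta > 0$. Then there exist $\nu > 0$ and $E_0 > 0$ such that for $E \in (0, E_0]$ and $n \geq E^{-\nu}$ we have
\[
\begin{aligned}
\mathbb{E}\left(N^{\mathrm{per}}(2bq + E/2) - N^{\mathrm{per}}(2bq - E/2)\right) - e^{-E^{-\eta}}
&\leq N_b(2bq + E) - N_b(2bq - E) \\
&\leq \mathbb{E}\left(N^{\mathrm{per}}(2bq + 2E) - N^{\mathrm{per}}(2bq - 2E)\right) + e^{-E^{-\eta}}.
\end{aligned} \tag{3.3}
\]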
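The main technical steps of the proof of Theorem 3.1, which is the central result of this section, are contained in Lemmas 3.1 and 3.2 below.

Lemma 3.1. Let $Q = \overline{Q} \in L^\infty(\mathbb{R}^2)$, $X := H_0 + Q$, $D(X) = D(H_0)$. Then there exists $\epsilon = \epsilon(b) > 0$ such that for each $\alpha, \beta \in \mathbb{Z}^2$, and $z \in \mathbb{C} \setminus \sigma(X)$ we have
\[
\left\|\chi_\alpha (X - z)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \leq \frac{2(b + 1)}{\pi^{1/2}} \left(1 + \frac{1}{\eta(z)}\right) e^{-\epsilon \eta(z) |\alpha - \beta|}, \tag{3.4}
\]
where $\chi_\alpha := \mathbf{1}_{\Lambda_1 + \alpha}$, $\alpha \in \mathbb{Z}^2$, $\eta(z) = \eta(z; b, Q) := \frac{\mathrm{dist}(z, \sigma(X))}{|z| + |Q|_\infty + 1}$, $\|\cdot\|_{\mathrm{HS}}$ denotes the Hilbert-Schmidt norm, and $|Q|_\infty := \|Q\|_{L^\infty(\mathbb{R}^2)}$.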
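Proof. We will apply the ideas of the proof of [20, Prop. 4.1]. For $\xi \in \mathbb{R}^2$ set
\[
X_\xi := e^{\xi \cdot x} X e^{-\xi \cdot x} = (i\nabla + A - i\xi)^2 + Q = X - 2i\xi \cdot (i\nabla + A) + |\xi|^2.
\]
Evidently,
\[
X_\xi - z = (X - z)\left(1 + (X - z)^{-1}\left(|\xi|^2 - 2i\xi \cdot (i\nabla + A)\right)\right). \tag{3.5}
\]
Let us estimate the norm of the operator $(X - z)^{-1}\left(|\xi|^2 - 2i\xi \cdot (i\nabla + A)\right)$ appearing at the right-hand side of (3.5). We have
\[
\left\|(X - z)^{-1} |\xi|^2\right\| \leq |\xi|^2\, \mathrm{dist}(z, \sigma(X))^{-1},
\]
\[
\begin{aligned}
\left\|(X - z)^{-1}\, 2i\xi \cdot (i\nabla + A)\right\|
&\leq 2\left\|(H_0 + 1)^{-1}(i\nabla + A) \cdot \xi - (X - z)^{-1}(Q - z - 1)(H_0 + 1)^{-1}(i\nabla + A) \cdot \xi\right\| \\
&\leq 2C\left(1 + \frac{1}{\eta(z)}\right)|\xi|
\end{aligned}
\]
with
\[
C = C(b) := \left\|(H_0 + 1)^{-1}(i\nabla + A)\right\| = \sup_{q \in \mathbb{Z}_+} \frac{((2q + 1)b)^{1/2}}{2bq + 1}.
\]
Choose $\epsilon \in \left(0, \frac{1}{8(C + 1)}\right)$ and $\xi \in \mathbb{R}^2$ such that $|\xi| = \epsilon \eta(z)$. Then, by the above estimates, we have
\[
\begin{aligned}
\left\|(X - z)^{-1}\left(|\xi|^2 - 2i\xi \cdot (i\nabla + A)\right)\right\|
&\leq \epsilon^2 \eta(z)^2\, \mathrm{dist}(z, \sigma(X))^{-1} + 2C\epsilon\left(1 + \frac{1}{\eta(z)}\right)\eta(z) \\
&\leq \epsilon^2 \eta(z) + 2C\epsilon(1 + \eta(z)) < \epsilon^2 + 4C\epsilon < 3/4,
\end{aligned} \tag{3.6}
\]
since the resolvent identity implies $\eta(z) < 1$. Therefore, the operator $X_\xi - z$ is invertible, and
\[
\chi_\alpha (X - z)^{-1} \chi_\beta = \left(e^{-\xi \cdot x} \chi_\alpha\right) \chi_\alpha (X_\xi - z)^{-1} \chi_\beta \left(e^{\xi \cdot x} \chi_\beta\right). \tag{3.7}
\]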
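Moreover, (3.5) and (3.6) imply
\[
\begin{aligned}
\left\|\chi_\alpha (X_\xi - z)^{-1} \chi_\beta\right\|_{\mathrm{HS}}
&\leq 4\left\|(X - z)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \\
&\leq 4\left\|(H_0 + 1)^{-1} \chi_\beta - (X - z)^{-1}(Q - z - 1)(H_0 + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \\
&\leq 4\left\|(H_0 + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \left(1 + \left\|(X - z)^{-1}(Q - z - 1)\right\|\right) \\
&\leq 4\left\|(H_0 + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \left(1 + \frac{1}{\eta(z)}\right).
\end{aligned} \tag{3.8}
\]
Finally, applying the diamagnetic inequality for Hilbert-Schmidt operators (see e.g. [1]), we get
\[
\begin{aligned}
\left\|(H_0 + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}}
&\leq \left\|(H_0 + 1)^{-1}(H_0 + b + 1)\right\| \left\|(H_0 + b + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \\
&\leq \left\|(H_0 + 1)^{-1}(H_0 + b + 1)\right\| \left\|(-\Delta + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \\
&= \sup_{q \in \mathbb{Z}_+} \frac{2bq + b + 1}{2bq + 1} \left\|(-\Delta + 1)^{-1} \chi_\beta\right\|_{\mathrm{HS}} = \frac{b + 1}{2\pi^{1/2}}.
\end{aligned} \tag{3.9}
\]
The combination of (3.7), (3.8), and (3.9) yields
\[
\left\|\chi_\alpha (X - z)^{-1} \chi_\beta\right\|_{\mathrm{HS}} \leq \frac{2(b + 1)}{\pi^{1/2}}\, e^{-\xi \cdot (\alpha - \beta)} \left(1 + \frac{1}{\eta(z)}\right).
\]
Choosing $\xi = \epsilon \eta(z) \frac{\alpha - \beta}{|\alpha - \beta|}$, we get (3.4).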
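The explicit constants in the proof of Lemma 3.1 can be checked numerically. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the value $b = 1$ is an arbitrary test value, the Hilbert-Schmidt norm of the free resolvent is evaluated through the Fourier-side integral $(2\pi)^{-2}\int_{\mathbb{R}^2} (|k|^2+1)^{-2}\,dk$, and the small parameter of the proof is called `eps`.

```python
import math

def landau_sup_ratio(b, q_max=2000):
    # sup over q >= 0 of (2bq + b + 1)/(2bq + 1); the ratio decreases in q,
    # so the supremum is attained at q = 0 and equals b + 1
    return max((2 * b * q + b + 1) / (2 * b * q + 1) for q in range(q_max + 1))

def free_resolvent_hs_sq(r_max=200.0, steps=200_000):
    # (2 pi)^{-2} * integral over R^2 of (|k|^2 + 1)^{-2} dk,
    # computed in polar coordinates with the midpoint rule on [0, r_max]
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += 2.0 * math.pi * r / (r * r + 1.0) ** 2 * h
    return total / (2.0 * math.pi) ** 2

def velocity_constant(b, q_max=2000):
    # C(b) = sup over q >= 0 of ((2q + 1) b)^{1/2} / (2bq + 1)
    return max(math.sqrt((2 * q + 1) * b) / (2 * b * q + 1) for q in range(q_max + 1))

b = 1.0  # arbitrary test value of the magnetic field
hs_bound = landau_sup_ratio(b) * math.sqrt(free_resolvent_hs_sq())
C = velocity_constant(b)
eps = 0.99 / (8.0 * (C + 1.0))           # any value below 1/(8(C+1)) is admissible
perturbation_bound = eps ** 2 + 4.0 * C * eps

print(hs_bound, (b + 1) / (2.0 * math.sqrt(math.pi)))  # both sides of the last equality in (3.9)
print(perturbation_bound)                              # should stay below 3/4, as in (3.6)
```

The first printed pair compares the numerically evaluated right-hand side of (3.9) with the closed form $(b+1)/(2\pi^{1/2})$; the second line illustrates that the choice of the small parameter keeps the perturbation bound in (3.6) below $3/4$.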
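Lemma 3.2. Assume that Hypotheses $H_1$ and $H_2$ hold. Then there exists a constant $C > 1$ such that for any $\varphi \in C_0^\infty(\mathbb{R})$, and any $n \in \mathbb{N}$, $l \in \mathbb{N}$, we have
\[
\left|\mathbb{E}\left(\int_{\mathbb{R}} \varphi(E)\, dN_b(E) - \int_{\mathbb{R}} \varphi(E)\, dN^{\mathrm{per}}(E)\right)\right|
\leq C n^{-l} e^{Cl \log l} \sup_{x \in \mathbb{R},\ 0 \leq j \leq l + 5} (|x| + C)^{l + 5} \left|\frac{d^j \varphi}{dx^j}(x)\right|. \tag{3.10}
\]
Proof. We will follow the general lines of the proof of [23, Lemma 2.1]. Due to the fact that we consider only the two-dimensional case, and an alloy-type potential which is almost surely bounded, the argument here is somewhat simpler than the one in [23]. By (1.5) and (3.2) we have
\[
\mathbb{E}\left(\int_{\mathbb{R}} \varphi(E)\, dN_b(E) - \int_{\mathbb{R}} \varphi(E)\, dN^{\mathrm{per}}(E)\right) = \mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}\left(\varphi(H) - \varphi(H^{\mathrm{per}})\right)\mathbf{1}_{\Lambda_1}\right)\right).
\]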
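Next, we introduce a representation of the operator $\varphi(H) - \varphi(H^{\mathrm{per}})$ by the Helffer-Sjöstrand formula (see e.g. [4, Chap. 8]). Let $\tilde{\varphi}$ be an almost analytic extension of the function $\varphi \in C_0^\infty(\mathbb{R})$ appearing in (3.10). We recall that $\tilde{\varphi}$ possesses the following properties:
1. If $\mathrm{Im}\, z = 0$, then $\tilde{\varphi}(z) = \varphi(z)$.
2. $\mathrm{supp}\, \tilde{\varphi} \subset \{x + iy \in \mathbb{C};\ |y| < 1\}$.
3. $\tilde{\varphi} \in \mathcal{S}(\{x + iy \in \mathbb{C};\ |y| < 1\})$.
4. The family of functions $x \mapsto \frac{\partial \tilde{\varphi}}{\partial \bar{z}}(x + iy)\, |y|^{-m}$, $|y| \in (0, 1)$, is bounded in $\mathcal{S}(\mathbb{R})$ for any $m \in \mathbb{Z}_+$.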
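Such extensions exist for $\varphi \in \mathcal{S}(\mathbb{R})$ (see [27], [4, Chapt. 8]), and there exists a constant $C > 1$ such that for any $m \geq 0$, $\alpha \geq 0$, $\beta \geq 0$, we have
\[
\sup_{0 \leq |y| \leq 1}\ \sup_{x \in \mathbb{R}} \left|x^\alpha \frac{\partial^\beta}{\partial x^\beta}\left(|y|^{-m} \frac{\partial \tilde{\varphi}}{\partial \bar{z}}(x + iy)\right)\right|
\leq C^{m \log m + \alpha \log \alpha + \beta + 1} \sup_{\beta' \leq m + \beta + 2,\ \alpha' \leq \alpha}\ \sup_{x \in \mathbb{R}} \left|x^{\alpha'} \frac{d^{\beta'} \varphi(x)}{dx^{\beta'}}\right|. \tag{3.11}
\]
Then the Helffer-Sjöstrand formula yields
\[
\begin{aligned}
\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}\left(\varphi(H) - \varphi(H^{\mathrm{per}})\right)\mathbf{1}_{\Lambda_1}\right)\right)
&= \frac{1}{\pi}\, \mathbb{E}\left(\mathrm{Tr}\left(\int_{\mathbb{C}} \frac{\partial \tilde{\varphi}}{\partial \bar{z}}(z)\, \mathbf{1}_{\Lambda_1}\left((H - z)^{-1} - (H^{\mathrm{per}} - z)^{-1}\right)\mathbf{1}_{\Lambda_1}\, dx\, dy\right)\right) \\
&= \frac{1}{\pi}\, \mathbb{E}\left(\mathrm{Tr}\left(\int_{\mathbb{C}} \frac{\partial \tilde{\varphi}}{\partial \bar{z}}(z)\, \mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\, dx\, dy\right)\right).
\end{aligned} \tag{3.12}
\]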
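Next, we will show that $\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}$ is a trace-class operator for $z \in \mathbb{C} \setminus \mathbb{R}$, and almost surely
\[
\left\|\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right\|_{\mathrm{Tr}} \leq \frac{M(b + 1)^2}{2\pi}\left(1 + \frac{M + |z| + 1}{|\mathrm{Im}\, z|}\right)^2, \tag{3.13}
\]
where $\|\cdot\|_{\mathrm{Tr}}$ denotes the trace-class norm. Evidently,
\[
\left\|\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right\|_{\mathrm{Tr}}
\leq \left\|\mathbf{1}_{\Lambda_1}(H_0 + 1)^{-1}\right\|_{\mathrm{HS}}^2 \left\|V^{\mathrm{per}} - V\right\| \left\|(H_0 + 1)(H - z)^{-1}\right\| \left\|(H_0 + 1)(H^{\mathrm{per}} - z)^{-1}\right\|. \tag{3.14}
\]
By (3.9) we have $\left\|\mathbf{1}_{\Lambda_1}(H_0 + 1)^{-1}\right\|_{\mathrm{HS}}^2 \leq \frac{(b + 1)^2}{4\pi}$. Moreover, almost surely $\|V^{\mathrm{per}} - V\| \leq 2M$. Finally, it is easy to check that both norms $\|(H_0 + 1)(H - z)^{-1}\|$ and $\|(H_0 + 1)(H^{\mathrm{per}} - z)^{-1}\|$ are almost surely bounded from above by $1 + \frac{M + |z| + 1}{|\mathrm{Im}\, z|}$, so that (3.13) follows from (3.14). Taking into account estimate (3.13) and Properties 2, 3, and 4 of the almost analytic continuation $\tilde{\varphi}$, we find that (3.12) implies
\[
\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}\left(\varphi(H) - \varphi(H^{\mathrm{per}})\right)\mathbf{1}_{\Lambda_1}\right)\right)
= \frac{1}{\pi} \int_{\mathbb{C}} \frac{\partial \tilde{\varphi}}{\partial \bar{z}}(z)\, \mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right) dx\, dy. \tag{3.15}
\]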
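Our next goal is to obtain a precise estimate (see (3.19) below) on the decay rate as $n \to \infty$ of $\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right)$ with $z \in \mathbb{C} \setminus \mathbb{R}$ and $|\mathrm{Im}\, z| < 1$. Evidently,
\[
\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right)
= \sum_{\alpha \in \mathbb{Z}^2,\ |\alpha|_\infty > na} \mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}\chi_\alpha(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right),
\]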
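where $|\alpha|_\infty := \max_{j = 1, 2} |\alpha_j|$, since $V^{\mathrm{per}} = V$ on $\Lambda_{2L}$, and therefore $\chi_\alpha(V^{\mathrm{per}} - V) = 0$ if $|\alpha|_\infty \leq na$. Hence, bearing in mind estimates (1.3) and (3.4), we easily find that
\[
\begin{aligned}
\left|\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right)\right|
&\leq \sum_{\alpha \in \mathbb{Z}^2,\ |\alpha|_\infty > na} \mathbb{E}\left(\left\|\chi_0(H - z)^{-1}\chi_\alpha(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\chi_0\right\|_{\mathrm{Tr}}\right) \\
&\leq 2M \sum_{\alpha \in \mathbb{Z}^2,\ |\alpha|_\infty > na} \mathbb{E}\left(\left\|\chi_0(H - z)^{-1}\chi_\alpha\right\|_{\mathrm{HS}} \left\|\chi_\alpha(H^{\mathrm{per}} - z)^{-1}\chi_0\right\|_{\mathrm{HS}}\right) \\
&\leq \frac{M(b + 1)^2}{2\pi}\left(1 + \frac{|x| + M + 2}{|y|}\right)^2 \sum_{\alpha \in \mathbb{Z}^2,\ |\alpha|_\infty > na} \exp\left(-\frac{2\epsilon|\alpha||y|}{|x| + M + 2}\right)
\end{aligned} \tag{3.16}
\]
for every $z = x + iy$ with $0 < |y| < 1$. Using the summation formula for a geometric series, and some elementary estimates, we conclude that there exists a constant $C$ depending only on $\epsilon$ such that
\[
\sum_{\alpha \in \mathbb{Z}^2,\ |\alpha|_\infty > na} \exp\left(-\frac{2\epsilon|\alpha||y|}{|x| + M + 2}\right) \leq \left(1 + C\, \frac{|x| + M + 2}{|y|}\right) \exp\left(-\frac{\epsilon a n|y|}{|x| + M + 2}\right), \tag{3.17}
\]
provided that $0 < |y| < 1$. Putting together (3.16) and (3.17), we find that there exists a constant $C = C(M, b, \epsilon, a)$ such that
\[
\left|\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right)\right| \leq C\left(\frac{|x| + C}{|y|}\right)^3 \exp\left(-\frac{\epsilon a n|y|}{|x| + C}\right). \tag{3.18}
\]
Writing
\[
\left(\frac{|x| + C}{|y|}\right)^3 \exp\left(-\frac{\epsilon a n|y|}{|x| + C}\right) = (\epsilon a n)^{-l}\left(\frac{|x| + C}{|y|}\right)^{3 + l}\left(\frac{\epsilon a n|y|}{|x| + C}\right)^l \exp\left(-\frac{\epsilon a n|y|}{|x| + C}\right)
\]
with $l \in \mathbb{N}$, and bearing in mind the elementary inequality $t^l e^{-t} \leq (l/e)^l$, $t \geq 0$, $l \in \mathbb{N}$, we find that (3.18) implies
\[
\left|\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}(H - z)^{-1}(V^{\mathrm{per}} - V)(H^{\mathrm{per}} - z)^{-1}\mathbf{1}_{\Lambda_1}\right)\right)\right| \leq C(\epsilon a e)^{-l} n^{-l}\left(\frac{|x| + C}{|y|}\right)^{3 + l} e^{l \log l}, \quad l \in \mathbb{N}. \tag{3.19}
\]
Combining (3.19) and (3.15), we get
\[
\left|\mathbb{E}\left(\mathrm{Tr}\left(\mathbf{1}_{\Lambda_1}\left(\varphi(H) - \varphi(H^{\mathrm{per}})\right)\mathbf{1}_{\Lambda_1}\right)\right)\right|
\leq \frac{C}{\pi} \int_{\mathbb{R}} (|x| + C)^{-2}\, dx\ (\epsilon a e)^{-l} n^{-l} e^{l \log l} \sup_{0 < |y| < 1}\ \sup_{x \in \mathbb{R}}\ (|x| + C)^{l + 5}\, |y|^{-(l + 3)} \left|\frac{\partial \tilde{\varphi}}{\partial \bar{z}}(x + iy)\right|, \quad l \in \mathbb{N}. \tag{3.20}
\]
Applying estimate (3.11) on almost analytic extensions, we find that (3.20) entails (3.10).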
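The passage from the exponential bound (3.18) to the polynomial bound (3.19) rests only on the elementary inequality $t^l e^{-t} \leq (l/e)^l$. The snippet below is an illustrative sketch, not part of the original argument: the function names are hypothetical, `A` stands for the product of the constants multiplying $n|y|$ in the exponent, `X` stands for $|x| + C$, and the overall multiplicative constant is dropped on both sides.

```python
import math

def power_exp_sup(l, t_max=200.0, steps=20_000):
    # largest value of t**l * exp(-t) on a uniform grid of [0, t_max];
    # the true maximum sits at t = l and equals (l/e)**l
    return max((t_max * i / steps) ** l * math.exp(-t_max * i / steps)
               for i in range(steps + 1))

def exponential_side(A, X, y):
    # (X/|y|)^3 * exp(-A|y|/X): right-hand side of (3.18) without the constant
    return (X / abs(y)) ** 3 * math.exp(-A * abs(y) / X)

def polynomial_side(A, X, y, l):
    # (A e)^{-l} * (X/|y|)^{3+l} * e^{l log l}: right-hand side of (3.19) without the constant
    return (A * math.e) ** (-l) * (X / abs(y)) ** (3 + l) * math.exp(l * math.log(l))

for l in (1, 2, 5, 10):
    print(l, power_exp_sup(l), (l / math.e) ** l)

print(exponential_side(50.0, 3.0, 0.2), polynomial_side(50.0, 3.0, 0.2, 4))
```

For each $l$ the grid maximum should not exceed $(l/e)^l$, and the exponential side should be dominated by the polynomial side for every admissible choice of the (arbitrary) test values.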
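Now we are in position to prove Theorem 3.1. Let $\varphi_+ \in C_0^\infty(\mathbb{R})$ be a non-negative Gevrey-class function with Gevrey exponent $\varrho > 1$, such that $\int_{\mathbb{R}} \varphi_+(t)\, dt = 1$, $\mathrm{supp}\, \varphi_+ \subset \left(-\frac{E}{2}, \frac{E}{2}\right)$. Set $\Phi_+ := \mathbf{1}_{\left[2bq - \frac{3E}{2},\, 2bq + \frac{3E}{2}\right]} * \varphi_+$. Then $\Phi_+$ is a Gevrey-class function with Gevrey exponent $\varrho$. Moreover,
\[
\mathbf{1}_{[2bq - E,\, 2bq + E]}(t) \leq \Phi_+(t) \leq \mathbf{1}_{[2bq - 2E,\, 2bq + 2E]}(t), \quad t \in \mathbb{R}.
\]
Therefore,
\[
N_b(2bq + E) - N_b(2bq - E) \leq \mathbb{E}\left(N^{\mathrm{per}}(2bq + 2E) - N^{\mathrm{per}}(2bq - 2E)\right)
+ \left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_+(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_+(t)\, dN^{\mathrm{per}}(t)\right)\right|. \tag{3.21}
\]
Applying Lemma 3.2 and the standard estimates on the derivatives of Gevrey-class functions, we get
\[
\left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_+(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_+(t)\, dN^{\mathrm{per}}(t)\right)\right| \leq C n^{-l} (l + 5)^{\varrho(l + 5)}, \quad l \in \mathbb{N}, \tag{3.22}
\]
with $C$ independent of $n$ and $l$. Optimizing the r.h.s. of (3.22) with respect to $l$, we get
\[
\left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_+(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_+(t)\, dN^{\mathrm{per}}(t)\right)\right| \leq \exp\left(-(\varrho + C)\, n^{1/(\varrho + C)}\right)
\]
for sufficiently large $n$. Picking $\eta > 0$, and choosing $\nu > (\varrho + C)\eta$ and $n \geq E^{-\nu}$, we find that
\[
\left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_+(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_+(t)\, dN^{\mathrm{per}}(t)\right)\right| \leq e^{-E^{-\eta}} \tag{3.23}
\]
for sufficiently small $E > 0$. Now the combination of (3.21) and (3.23) yields the upper bound in (3.3).

The proof of the first inequality in (3.3) is quite similar, so that we will just outline it. Let $\varphi_- \in C_0^\infty(\mathbb{R})$ be a non-negative Gevrey-class function with Gevrey exponent $\varrho > 1$, such that $\int_{\mathbb{R}} \varphi_-(t)\, dt = 1$, and $\mathrm{supp}\, \varphi_- \subset \left(-\frac{E}{4}, \frac{E}{4}\right)$. Set $\Phi_- := \mathbf{1}_{\left[2bq - \frac{3E}{4},\, 2bq + \frac{3E}{4}\right]} * \varphi_-$. Then $\Phi_-$ is a Gevrey-class function with Gevrey exponent $\varrho$. Similarly to (3.21) we have
\[
\mathbb{E}\left(N^{\mathrm{per}}(2bq + E/2) - N^{\mathrm{per}}(2bq - E/2)\right)
- \left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_-(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_-(t)\, dN^{\mathrm{per}}(t)\right)\right|
\leq N_b(2bq + E) - N_b(2bq - E). \tag{3.24}
\]
Arguing as in the proof of (3.23), we obtain
\[
\left|\mathbb{E}\left(\int_{\mathbb{R}} \Phi_-(t)\, dN_b(t) - \int_{\mathbb{R}} \Phi_-(t)\, dN^{\mathrm{per}}(t)\right)\right| \leq e^{-E^{-\eta}}
\]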
which combined with (3.24) yields the lower bound in (3.3). Thus, the proof of Theorem 3.1 is now complete.

Further, we introduce a reduced IDS $\rho_q$ related to a fixed Landau level $2bq$, $q \in \mathbb{Z}_+$. It is well-known that for every fixed $\theta \in \mathbb{T}^*_{2L}$ we have $\sigma(h_0(\theta)) = \cup_{q = 0}^{\infty} \{2bq\}$, and
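$\dim \mathrm{Ker}\,(h_0(\theta) - 2bq) = 2bL^2/\pi$ for each $q \in \mathbb{Z}_+$ (see [5]). Denote by $p_q(\theta) : L^2(\Lambda_{2L}) \to L^2(\Lambda_{2L})$ the orthogonal projection onto $\mathrm{Ker}\,(h_0(\theta) - 2bq)$, and by $r_q(\theta) = r_{q,n,\omega}(\theta)$ the operator $p_q(\theta) V^{\mathrm{per}}_{n,\omega}\, p_q(\theta)$ defined and self-adjoint on the finite-dimensional Hilbert space $p_q(\theta) L^2(\Lambda_{2L})$. Set
\[
\rho_q(E) = \rho_{q,n,\omega}(E) = (2\pi)^{-2} \int_{\Lambda^*_{2L}} N(E; r_{q,n,\omega}(\theta))\, d\theta, \quad E \in \mathbb{R}. \tag{3.25}
\]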
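As a quick sanity check on the multiplicity $2bL^2/\pi$ of each Landau level on the periodicity cell, note that the conditions $ba^2/(2\pi) \in \mathbb{N}$ and $L = (2n+1)a/2$ force this number to be an integer, namely the number of flux quanta through the square of side $2L$. The sketch below only illustrates this arithmetic; the function name and the parameter `flux_quanta_per_cell` (standing for the integer $ba^2/(2\pi)$) are hypothetical.

```python
import math

def landau_multiplicity(flux_quanta_per_cell, n, a=1.0):
    # b is chosen so that b a^2 / (2 pi) equals the prescribed integer
    b = 2.0 * math.pi * flux_quanta_per_cell / a ** 2
    L = (2 * n + 1) * a / 2.0
    # multiplicity 2 b L^2 / pi of each Landau level on the cell of side 2L
    return 2.0 * b * L ** 2 / math.pi

for k, n in [(1, 0), (1, 2), (3, 4)]:
    print(k, n, landau_multiplicity(k, n), k * (2 * n + 1) ** 2)
```

Up to rounding, the two printed numbers agree: the multiplicity equals $\frac{ba^2}{2\pi}(2n+1)^2$.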
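By analogy with (3.1), we call the function $\rho_{q,n,\omega}$ the IDS for the operator $\mathcal{R}_q = \mathcal{R}_{q,n,\omega} := \int^{\oplus}_{\Lambda^*_{2L}} r_{q,n,\omega}(\theta)\, d\theta$ defined and self-adjoint on $\mathcal{P}_q L^2(\Lambda_{2L} \times \Lambda^*_{2L})$ where $\mathcal{P}_q := \int^{\oplus}_{\Lambda^*_{2L}} p_q(\theta)\, d\theta$. Note that $\mathcal{R}_q = \mathcal{P}_q V^{\mathrm{per}} \mathcal{P}_q$.

Denote by $P_q$, $q \in \mathbb{Z}_+$, the orthogonal projection onto $\mathrm{Ker}\,(H_0 - 2bq)$. Evidently, $\mathcal{P}_q = U P_q U^*$. As mentioned in the Introduction, $\mathrm{rank}\, P_q = \infty$ for every $q \in \mathbb{Z}_+$. Moreover, the functions
\[
e_j(x) = e_{j,q}(x) := (-i)^q \sqrt{\frac{q!}{\pi j!}} \left(\frac{b}{2}\right)^{(j - q + 1)/2} (x_1 + i x_2)^{j - q}\, L_q^{(j - q)}\left(\frac{b}{2}|x|^2\right) e^{-\frac{b}{4}|x|^2}, \quad j \in \mathbb{Z}_+, \tag{3.26}
\]
form the so-called angular-momentum orthogonal basis of $P_q L^2(\mathbb{R}^2)$, $q \in \mathbb{Z}_+$ (see [8] or [3, Sect. 9]). Here
\[
L_q^{(j - q)}(\xi) := \sum_{l = \max\{0,\, q - j\}}^{q} \frac{j!}{(j - q + l)!\,(q - l)!}\, \frac{(-\xi)^l}{l!}, \quad \xi \in \mathbb{R},\ q \in \mathbb{Z}_+,\ j \in \mathbb{Z}_+,
\]
are the generalized Laguerre polynomials. For further references we give here several estimates concerning the functions $e_{j,q}$. If $q \in \mathbb{Z}_+$, $j \geq 1$, and $\xi \geq 0$, we have
\[
L_q^{(j - q)}(j\xi)^2 \leq j^{2q} e^{2\xi} \tag{3.27}
\]
(see [14, Eq. (4.2)]). On the other hand, there exists $j_0 > q$ such that $j \geq j_0$ implies
\[
L_q^{(j - q)}(j\xi)^2 \geq \frac{1}{(q!)^2} \left(\frac{1}{2}\right)^{2 + 2q} (j - q)^{2q} \tag{3.28}
\]
if $\xi \in [0, 1/2]$ (see [32, Eq. (3.6)]). Moreover, for $j \in \mathbb{Z}_+$ and $q \in \mathbb{Z}_+$ we have
\[
e_{j,q}(x) = \frac{1}{\sqrt{q!\,(2b)^q}}\, (a^*)^q e_{j,0}(x), \quad x \in \mathbb{R}^2, \tag{3.29}
\]
where
\[
a^* := -i\frac{\partial}{\partial x_1} - A_1 - i\left(-i\frac{\partial}{\partial x_2} - A_2\right) = -2i\, e^{b|z|^2/4}\, \frac{\partial}{\partial z}\, e^{-b|z|^2/4}, \quad z := x_1 + i x_2, \tag{3.30}
\]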
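is the creation operator (see e.g. [3, Sect. 9]). Evidently, $a^*$ commutes with the magnetic translation operators $\tau_\gamma$, $\gamma \in 2L\mathbb{Z}^2$ (see (2.6)). Finally, the projection $P_q$, $q \in \mathbb{Z}_+$, admits the integral kernel
\[
K_{q,b}(x, x') = \frac{b}{2\pi}\, \Psi_q\left(\frac{b}{2}|x - x'|^2\right) e^{-i\frac{b}{2} x \wedge x'}, \quad x, x' \in \mathbb{R}^2, \tag{3.31}
\]
where $\Psi_q(\xi) := L_q^{(0)}(\xi)\, e^{-\xi/2}$, $\xi \in \mathbb{R}$. Since $P_q$ is an orthogonal projection in $L^2(\mathbb{R}^2)$ we have $\|P_q\|_{L^2(\mathbb{R}^2) \to L^2(\mathbb{R}^2)} = 1$. Using the facts that $\mathcal{P}_q = U P_q U^*$ and $\mathcal{P}_q := \int^{\oplus}_{\Lambda^*_{2L}} p_q(\theta)\, d\theta$, as well as the explicit expressions (2.9) for the unitary operator $U$, and (3.31) for the integral kernel of $P_q$, $q \in \mathbb{Z}_+$, we easily find that the projection $p_q(\theta)$, $\theta \in \mathbb{T}^*_{2L}$, admits an explicit kernel in the form
\[
\mathcal{K}_{q,b}(x, x'; \theta) = \frac{b}{2\pi}\, e^{i\theta(x' - x)}\, e^{-i\frac{b}{2} x \wedge x'} \sum_{\alpha \in 2L\mathbb{Z}^2} \Psi_q\left(\frac{b}{2}|x - x' + \alpha|^2\right) e^{-i\theta\alpha}\, e^{i\frac{b}{2}(x + x') \wedge \alpha}\, e^{i\frac{b}{2}\alpha_1\alpha_2}, \quad x, x' \in \Lambda_{2L}. \tag{3.32}
\]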
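The explicit formulas for the angular-momentum functions and the generalized Laguerre polynomials lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions, not part of the original text: the function names are hypothetical, the Laguerre polynomial is evaluated from the finite sum quoted after (3.26), the $L^2(\mathbb{R}^2)$-norm of $e_{j,q}$ is computed by a midpoint rule in the radial variable with a Gaussian cut-off, and the bound (3.27) is tested on a small grid only.

```python
import math

def laguerre(q, alpha, x):
    # generalized Laguerre polynomial L_q^{(alpha)}(x), integer alpha >= -q,
    # via the finite sum with coefficients j!/((j-q+l)!(q-l)!), where j = q + alpha
    j = q + alpha
    return sum(math.comb(j, q - l) * (-x) ** l / math.factorial(l)
               for l in range(max(0, -alpha), q + 1))

def basis_norm_sq(j, q, b=1.0, steps=40_000):
    # 2*pi * integral over r > 0 of |e_{j,q}(r)|^2 r dr, midpoint rule
    r_max = math.sqrt(160.0 / b)  # b r^2 / 2 = 80 at the cut-off; the tail is negligible
    h = r_max / steps
    pref = math.factorial(q) / (math.pi * math.factorial(j)) * (b / 2.0) ** (j - q + 1)
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        xi = b * r * r / 2.0
        total += pref * r ** (2 * (j - q)) * laguerre(q, j - q, xi) ** 2 * math.exp(-xi) * r * h
    return 2.0 * math.pi * total

def upper_bound_holds(q, j, xi):
    # inequality (3.27): L_q^{(j-q)}(j xi)^2 <= j^{2q} e^{2 xi}
    return laguerre(q, j - q, j * xi) ** 2 <= j ** (2 * q) * math.exp(2.0 * xi)

print(basis_norm_sq(0, 0), basis_norm_sq(3, 1), basis_norm_sq(1, 2, b=2.5))
print(all(upper_bound_holds(q, j, xi)
          for q in range(4) for j in range(1, 31) for xi in (0.0, 0.25, 1.0, 3.0, 5.0)))
```

The three norms should be close to $1$ independently of $b$, which is consistent with the normalization factor in (3.26).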
Lemma 3.3. Let the assumptions of Theorem 3.1 hold. Suppose, moreover, that the random variables $\omega_\gamma$, $\gamma \in \mathbb{Z}^2$, are non-negative.
a) For each $c_0 \in \left(1 + \frac{M}{2b}, \infty\right)$ there exists $E_0 \in (0, 2b)$ such that for each $E \in (0, E_0)$, $\theta \in \mathbb{T}^*_{2L}$, almost surely
\[
N(E; r_0(\theta)) \leq N(E; h(\theta)) \leq N(c_0 E; r_0(\theta)). \tag{3.33}
\]
b) Assume $H_4$, i.e. $2b > M$. Then for each $c_1 \in \left(0, 1 - \frac{M}{2b}\right)$, $c_2 \in \left(1 + \frac{M}{2b}, \infty\right)$, there exists $E_0 \in (0, 2b)$ such that for each $E \in (0, E_0)$, $\theta \in \mathbb{T}^*_{2L}$, and $q \geq 1$, almost surely
\[
N(c_1 E; r_q(\theta)) \leq N(2bq + E; h(\theta)) - N(2bq; h(\theta)) \leq N(c_2 E; r_q(\theta)). \tag{3.34}
\]
Proof. In order to simplify the notations we will omit the explicit dependence of the operators $h$, $h_0$, $p_q$, and $r_q$, on $\theta \in \mathbb{T}^*_{2L}$. Moreover, we set $D_q := p_q D(h) = p_q L^2(\Lambda_{2L})$, and $C_q := (1 - p_q) D(h)$. At first we prove (3.33). The minimax principle implies
\[
N(E; h) \geq N(E; p_0 h p_0|_{D_0}) = N(E; r_0),
\]
which coincides with the lower bound in (3.33). On the other hand, the operator inequality
\[
h \geq p_0\left(h_0 + (1 - \delta)V^{\mathrm{per}}\right)p_0 + (1 - p_0)\left(h_0 + (1 - \delta^{-1})V^{\mathrm{per}}\right)(1 - p_0), \quad \delta \in (0, 1), \tag{3.35}
\]
combined with the minimax principle, entails
\[
\begin{aligned}
N(E; h) &\leq N\left(E; p_0\left(h_0 + (1 - \delta)V^{\mathrm{per}}\right)p_0|_{D_0}\right) + N\left(E; (1 - p_0)\left(h_0 + (1 - \delta^{-1})V^{\mathrm{per}}\right)(1 - p_0)|_{C_0}\right) \\
&\leq N\left((1 - \delta)^{-1}E; r_0\right) + N\left(E + M(\delta^{-1} - 1); (1 - p_0)h_0(1 - p_0)|_{C_0}\right).
\end{aligned} \tag{3.36}
\]
Choose $\delta \in (0, 1)$ so that $M(\delta^{-1} - 1) < 2b$, and, hence, $c_0 := (1 - \delta)^{-1} > 1 + \frac{M}{2b}$, and $E \in (0, 2b - M(\delta^{-1} - 1))$. Since
\[
\inf \sigma\left((1 - p_0)h_0(1 - p_0)|_{C_0}\right) = 2b,
\]
we find that the second term on the r.h.s. of (3.36) vanishes, and $N(E; h) \leq N(c_0 E; r_0)$ which coincides with the upper bound in (3.33).

Next we assume $q \geq 1$ and $M < 2b$, and prove (3.34). Note that for any $E_1 \in (0, 2b - M)$ we have $N(2bq; h) = N(2bq - E_1; h)$. Pick again $\delta \in \left(\frac{M}{2b + M}, 1\right)$ so that $c_2 := (1 - \delta)^{-1} > 1 + \frac{M}{2b}$. Then the operator inequality
\[
h \geq p_q\left(h_0 + (1 - \delta)V^{\mathrm{per}}\right)p_q + (1 - p_q)\left(h_0 + (1 - \delta^{-1})V^{\mathrm{per}}\right)(1 - p_q), \quad \delta \in (0, 1),
\]
analogous to (3.35), yields
\[
\begin{aligned}
N(2bq + E; h) &\leq N\left(2bq + E; p_q\left(h_0 + (1 - \delta)V^{\mathrm{per}}\right)p_q|_{D_q}\right) + N\left(2bq + E; (1 - p_q)\left(h_0 + (1 - \delta^{-1})V^{\mathrm{per}}\right)(1 - p_q)|_{C_q}\right) \\
&\leq N(c_2 E; r_q) + N\left(2bq + E + M(\delta^{-1} - 1); (1 - p_q)h_0(1 - p_q)|_{C_q}\right).
\end{aligned}
\]
On the other hand, the minimax principle implies
\[
N(2bq - E_1; h) \geq N\left(2bq - E_1; (1 - p_q)h(1 - p_q)|_{C_q}\right) \geq N\left(2bq - E_1 - M; (1 - p_q)h_0(1 - p_q)|_{C_q}\right).
\]
Thus we get
\[
\begin{aligned}
N(2bq + E; h) - N(2bq - E_1; h) \leq{}& N(c_2 E; r_q) + N\left(2bq + E + M(\delta^{-1} - 1); (1 - p_q)h_0(1 - p_q)|_{C_q}\right) \\
&- N\left(2bq - E_1 - M; (1 - p_q)h_0(1 - p_q)|_{C_q}\right).
\end{aligned} \tag{3.37}
\]
It is easy to check that
\[
2bq - E_1 - M > 2b(q - 1), \quad 2bq + E + M(\delta^{-1} - 1) < 2(q + 1)b,
\]
provided that $E \in (0, 2b - M(\delta^{-1} - 1))$. Since
\[
\sigma\left((1 - p_q)h_0(1 - p_q)|_{C_q}\right) \cap (2(q - 1)b, 2(q + 1)b) = \emptyset,
\]
we find that the r.h.s. of (3.37) is equal to $N(c_2 E; r_q)$, thus getting the upper bound in (3.34).

Finally, we prove the lower bound in (3.34). Pick $\zeta \in \left(\frac{M}{2b - M}, \infty\right)$, and, hence, $c_1 := (1 + \zeta)^{-1} \in \left(0, 1 - \frac{M}{2b}\right)$. Bearing in mind the operator inequality
\[
h \leq p_q\left(h_0 + (1 + \zeta)V^{\mathrm{per}}\right)p_q + (1 - p_q)\left(h_0 + (1 + \zeta^{-1})V^{\mathrm{per}}\right)(1 - p_q),
\]
and applying the minimax principle, we obtain
\[
\begin{aligned}
N(2bq + E; h) &\geq N\left(2bq + E; p_q\left(h_0 + (1 + \zeta)V^{\mathrm{per}}\right)p_q|_{D_q}\right) + N\left(2bq + E; (1 - p_q)\left(h_0 + (1 + \zeta^{-1})V^{\mathrm{per}}\right)(1 - p_q)|_{C_q}\right) \\
&\geq N(c_1 E; r_q) + N\left(2bq + E - M(\zeta^{-1} + 1); (1 - p_q)h_0(1 - p_q)|_{C_q}\right).
\end{aligned}
\]
On the other hand, since $V^{\mathrm{per}} \geq 0$, the minimax principle directly implies
\[
N(2bq - E_1; h) \leq N(2bq - E_1; h_0) = N\left(2bq - E_1; (1 - p_q)h_0(1 - p_q)|_{C_q}\right).
\]
Combining the above estimates, we get
\[
\begin{aligned}
N(2bq + E; h) - N(2bq - E_1; h) \geq{}& N(c_1 E; r_q) \\
&- \left|N\left(2bq + E - M(\zeta^{-1} + 1); (1 - p_q)h_0(1 - p_q)|_{C_q}\right) - N\left(2bq - E_1; (1 - p_q)h_0(1 - p_q)|_{C_q}\right)\right|.
\end{aligned} \tag{3.38}
\]
Since
\[
2(q - 1)b < 2bq + E - M(\zeta^{-1} + 1) < 2(q + 1)b, \quad 2(q - 1)b < 2bq - E_1 < 2(q + 1)b,
\]
provided that $E \in (0, 2b + M(\zeta^{-1} + 1))$, we find that the r.h.s. of (3.38) is equal to $N(c_1 E; r_q)$ which entails the lower bound in (3.34).

Integrating (3.33) and (3.34) with respect to $\theta$ and $\omega$, and combining the results with (3.3), we obtain the following

Corollary 3.1. Assume that the hypotheses of Theorem 3.1 hold. Let $q \in \mathbb{Z}_+$, $\eta > 0$. If $q \geq 1$, assume $M < 2b$. Then there exist $\nu = \nu(\eta) > 0$, $d_1 \in (0, 1)$, $d_2 \in (1, \infty)$, and $\tilde{E}_0 > 0$, such that for each $E \in (0, \tilde{E}_0)$ and $n \geq E^{-\nu}$, we have
\[
\mathbb{E}\left(\rho_{q,n,\omega}(d_1 E)\right) - e^{-E^{-\eta}} \leq N_b(2bq + E) - N_b(2bq) \leq \mathbb{E}\left(\rho_{q,n,\omega}(d_2 E)\right) + e^{-E^{-\eta}}. \tag{3.39}
\]

4. Proof of Theorem 2.1: Upper Bounds of the IDS

In this section we obtain the upper bounds of $N_b(2bq + E) - N_b(2bq)$ necessary for the proof of Theorem 2.1.

Theorem 4.1. Assume that $H_1$–$H_4$ hold, that almost surely $\omega_\gamma \geq 0$, $\gamma \in \mathbb{Z}^2$, and (2.1) is valid. Fix the Landau level $2bq$, $q \in \mathbb{Z}_+$.
i) Assume that $u(x) \geq C(1 + |x|)^{-\kappa}$, $x \in \mathbb{R}^2$, for some $\kappa > 2$, and $C > 0$. Then we have
\[
\liminf_{E \downarrow 0} \frac{\ln |\ln (N_b(2bq + E) - N_b(2bq))|}{|\ln E|} \geq \frac{2}{\kappa - 2}. \tag{4.1}
\]
ii) Assume $u(x) \geq C e^{-C|x|^\beta}$, $x \in \mathbb{R}^2$, for some $\beta > 0$, $C > 0$. Then we have
\[
\liminf_{E \downarrow 0} \frac{\ln |\ln (N_b(2bq + E) - N_b(2bq))|}{\ln |\ln E|} \geq 1 + \frac{2}{\beta}. \tag{4.2}
\]
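iii) Assume $u(x) \geq C \mathbf{1}_{\{x \in \mathbb{R}^2;\ |x - x_0| < \varepsilon\}}$ for some $C > 0$, $x_0 \in \mathbb{R}^2$, and $\varepsilon > 0$. Then there exists $\delta > 0$ such that we have
\[
\liminf_{E \downarrow 0} \frac{\ln |\ln (N_b(2bq + E) - N_b(2bq))|}{\ln |\ln E|} \geq 1 + \delta. \tag{4.3}
\]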
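Fix $\theta \in \mathbb{T}^*_{2L}$. Denote by $\lambda_j(\theta)$, $j = 1, \ldots, \mathrm{rank}\, r_{q,n,\omega}(\theta)$, the eigenvalues of the operator $r_{q,n,\omega}(\theta)$ enumerated in non-decreasing order. Then (3.25) implies
\[
\mathbb{E}\left(\rho_{q,n,\omega}(E)\right) = \frac{1}{(2\pi)^2} \int_{\Lambda^*_{2L}} \mathbb{E}\left(N(E; r_{q,n,\omega}(\theta))\right) d\theta = \frac{1}{(2\pi)^2} \int_{\Lambda^*_{2L}} \sum_{j = 1}^{\mathrm{rank}\, r_{q,n,\omega}(\theta)} \mathbb{P}(\lambda_j(\theta) < E)\, d\theta \tag{4.4}
\]
with $E \in \mathbb{R}$. Since the potential $V$ is almost surely bounded, we have $\mathrm{rank}\, r_{q,n,\omega}(\theta) \leq \mathrm{rank}\, p_q(\theta) = 2bL^2/\pi$. Therefore, (4.4) entails
\[
\mathbb{E}\left(\rho_{q,n,\omega}(E)\right) \leq \frac{bL^2}{2\pi^3} \int_{\Lambda^*_{2L}} \mathbb{P}\left(r_{q,n,\omega}(\theta) \text{ has an eigenvalue less than } E\right) d\theta. \tag{4.5}
\]
In order to estimate the probability in (4.5), we need the following

Lemma 4.1. Assume that, for $n \sim E^{-\nu}$, the operator $r_{q,n,\omega}(\theta)$ has an eigenvalue less than $E$. Set $L := (2n + 1)a/2$. Pick $E$ small and $l$ large such that $L \gg l$. Decompose $\Lambda_{2L} = \cup_{\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}} (\gamma + \Lambda_{2l})$. Fix $C > 1$ sufficiently large and $m = m(L, l)$ such that
\[
\frac{1}{C}\, b l^2 \leq m \leq C b L^2, \tag{4.6}
\]
\[
E \left(\frac{l}{L}\right)^2 > C e^{-bl^2/2 + m \ln(C b l^2/m)}. \tag{4.7}
\]
Then, there exists $\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}$ and a non-identically vanishing function $\psi \in L^2(\mathbb{R}^2)$ in the span of $\{e_{j,q}\}_{0 \leq j \leq m}$, the functions $e_{j,q}$ being defined in (3.26), such that
\[
\langle V_\omega^\gamma \psi, \psi \rangle_l \leq 2E \langle \psi, \psi \rangle_l, \tag{4.8}
\]
where $V_\omega^\gamma(x) = V_\omega^{\mathrm{per}}(x + \gamma)$, and $\langle \cdot, \cdot \rangle_l := \int_{\Lambda_{2l}} |\cdot|^2\, dx$.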
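Proof. Consider $\varphi \in \mathrm{Ran}\, p_q(\theta)$ a normalized eigenfunction of the operator $r_{q,n,\omega}(\theta)$ corresponding to an eigenvalue smaller than $E$. Then we have
\[
\langle V_\omega^{\mathrm{per}} \varphi, \varphi \rangle_L \leq E \langle \varphi, \varphi \rangle_L. \tag{4.9}
\]
Whenever necessary, we extend $\varphi$ by magnetic periodicity (i.e. the periodicity with respect to the magnetic translations) to the whole plane $\mathbb{R}^2$. Note that
\[
\varphi(x) = \varphi(x; \theta) = \int_{\Lambda_{2L}} \mathcal{K}_{q,b}(x, x'; \theta)\, \varphi(x')\, dx' = \int_{\mathbb{R}^2} e^{i\theta(x' - x)} K_{q,b}(x, x')\, \varphi(x')\, dx'
\]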
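with $x \in \Lambda_{2L}$ (see (3.31) and (3.32) for the definition of $K_{q,b}$ and $\mathcal{K}_{q,b}$ respectively). Evidently, $\varphi \in L^\infty(\mathbb{R}^2)$, and since it is normalized in $L^2(\Lambda_{2L})$, we have
\[
\|\varphi\|_{L^\infty(\mathbb{R}^2)} \leq \sup_{x \in \Lambda_{2L}} \left(\int_{\Lambda_{2L}} |\mathcal{K}_{q,b}(x, x'; \theta)|^2\, dx'\right)^{1/2}
\leq \sup_{x \in \Lambda_{2L}} \left(\int_{\Lambda_{2L}} \Bigg(\sum_{\alpha \in 2L\mathbb{Z}^2} \tilde{\Psi}_q(x - x' + \alpha)\Bigg)^2 dx'\right)^{1/2} \leq C, \tag{4.10}
\]
where
\[
\tilde{\Psi}_q(y) := \frac{b}{2\pi} \left|\Psi_q\left(\frac{b}{2}|y|^2\right)\right|, \quad y \in \mathbb{R}^2, \tag{4.11}
\]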
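and $C$ depends on $q$ and $b$ but is independent of $n$ and $\theta$. Fix $C_1 > 1$ large to be chosen later on. Consider the sets
\[
\mathcal{L}_+ = \left\{\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L};\ \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx \geq \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx\right\},
\]
\[
\mathcal{L}_- = \left\{\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L};\ \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx < \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx\right\}.
\]
The sets $\mathcal{L}_-$ and $\mathcal{L}_+$ partition $2l\mathbb{Z}^2 \cap \Lambda_{2L}$. Fix $C_2 > 1$ large. Let us now prove that for some $\gamma \in \mathcal{L}_+$, one has
\[
\int_{\gamma + \Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\varphi(x)|^2\, dx \leq C_2 E \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx. \tag{4.12}
\]
Indeed, if this were not the case, then (4.9) would yield
\[
\begin{aligned}
-E \sum_{\gamma \in \mathcal{L}_-} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx
&\leq \sum_{\gamma \in \mathcal{L}_-} \int_{\gamma + \Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\varphi(x)|^2\, dx - E \sum_{\gamma \in \mathcal{L}_-} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx \\
&\leq E \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx - \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\varphi(x)|^2\, dx \\
&\leq -E(C_2 - 1) \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx.
\end{aligned} \tag{4.13}
\]
On the other hand, the definition of $\mathcal{L}_-$ yields
\[
\begin{aligned}
\int_{\Lambda_{2L}} |\varphi(x)|^2\, dx
&= \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx + \sum_{\gamma \in \mathcal{L}_-} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx \\
&\leq \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx + \sum_{\gamma \in \mathcal{L}_-} \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx \\
&\leq \sum_{\gamma \in \mathcal{L}_+} \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx + \frac{1}{C_1} \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx.
\end{aligned}
\]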
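Plugging this into (4.13), we get
\[
\frac{E}{C_1} \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx \geq E(C_2 - 1)\left(1 - \frac{1}{C_1}\right) \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx \tag{4.14}
\]
which is clearly impossible if we choose $(C_2 - 1)(C_1 - 1) > 1$. So from now on we assume that $(C_2 - 1)(C_1 - 1) > 1$. Hence, we can find $\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}$ such that one has
\[
\int_{\gamma + \Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\varphi(x)|^2\, dx \leq C_2 E \int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx,
\qquad
\int_{\gamma + \Lambda_{2l}} |\varphi(x)|^2\, dx \geq \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx.
\]
Shifting the variables in the integrals above by $\gamma$, we may assume $\gamma = 0$ if we replace $V_\omega^{\mathrm{per}}$ by $V_\omega^\gamma$. Thus we get
\[
\int_{\Lambda_{2l}} V_\omega^\gamma(x)\, |\varphi(x)|^2\, dx \leq C_2 E \int_{\Lambda_{2l}} |\varphi(x)|^2\, dx,
\qquad
\int_{\Lambda_{2l}} |\varphi(x)|^2\, dx \geq \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\gamma + \Lambda_{2L}} |\varphi(x)|^2\, dx.
\]
Due to the magnetic periodicity of $\varphi$, we have
\[
\int_{\gamma + \Lambda_{2L}} |\varphi(x)|^2\, dx = \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx
\]
which yields
\[
\int_{\Lambda_{2l}} V_\omega^\gamma(x)\, |\varphi(x)|^2\, dx \leq C_2 E \int_{\Lambda_{2l}} |\varphi(x)|^2\, dx, \tag{4.15}
\]
\[
\int_{\Lambda_{2l}} |\varphi(x)|^2\, dx \geq \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx. \tag{4.16}
\]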
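Let us now show that roughly the same estimates hold true for $\varphi$ replaced by a function $\psi \in P_q L^2(\mathbb{R}^2)$. Set $\psi := P_q \chi_- e_\theta \varphi$ where $e_\theta(x) := e^{i\theta x}$, $x \in \mathbb{R}^2$, and $\chi_-$ denotes the characteristic function of the set $\{x \in \mathbb{R}^2;\ |x|_\infty < L\}$. Note that $\varphi - \overline{e_\theta}\psi = \overline{e_\theta} P_q \chi_+ e_\theta \varphi$, where $\chi_+$ is the characteristic function of the set $\{x \in \mathbb{R}^2;\ |x|_\infty \geq L\}$. Let us estimate the $L^2(\Lambda_{2L})$-norm of the function $\varphi - \overline{e_\theta}\psi$. We have
\[
\begin{aligned}
\|\varphi - \overline{e_\theta}\psi\|_L^2 := \|\varphi - \overline{e_\theta}\psi\|_{L^2(\Lambda_{2L})}^2
&= \int_{\Lambda_{2L}} \left|\int_{\mathbb{R}^2} K_{q,b}(x, x')\, \chi_+(x')\, e^{i\theta x'} \varphi(x')\, dx'\right|^2 dx \\
&\leq \sup_{x' \in \mathbb{R}^2} |\varphi(x')|^2 \int_{\Lambda_{2L}} \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \tilde{\Psi}_q(x - x')\, \tilde{\Psi}_q(x - x'')\, \chi_+(x')\, \chi_+(x'')\, dx'\, dx''\, dx,
\end{aligned} \tag{4.17}
\]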
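the function $\tilde{\Psi}_q$ being defined in (4.11). Bearing in mind estimate (4.10), and taking into account the Gaussian decay of $\tilde{\Psi}_q$ at infinity, we easily find that (4.17) implies the existence of a constant $C > 0$ such that for sufficiently large $L$ we have $\|\varphi - \overline{e_\theta}\psi\|_L^2 \leq e^{-L/C}$. As $\varphi$ is normalized in $L^2(\Lambda_{2L})$, this implies that, for sufficiently small $E$,
\[
\|\psi\|_L \geq \frac{1}{2}\|\varphi\|_L \quad \text{and} \quad \|\varphi - \overline{e_\theta}\psi\|_L \leq e^{-L/C}\|\psi\|_L. \tag{4.18}
\]
As $V_\omega^{\mathrm{per}}$ is uniformly bounded, it follows from our choice for $L$ and $l$ and estimate (4.18) that, for $E$ sufficiently small,
\[
\int_{\Lambda_{2l}} |\psi(x)|^2\, dx \geq \frac{1}{C_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\varphi(x)|^2\, dx - C\|\varphi - \overline{e_\theta}\psi\|_L^2 \geq \frac{1}{\tilde{C}_1}\left(\frac{l}{L}\right)^2 \int_{\Lambda_{2L}} |\psi(x)|^2\, dx,
\]
\[
\int_{\Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\psi(x)|^2\, dx \leq \int_{\Lambda_{2l}} V_\omega^{\mathrm{per}}(x)\, |\varphi(x)|^2\, dx + \tilde{C}\|\varphi - \overline{e_\theta}\psi\|_L^2 \leq \tilde{C}_2 E \int_{\Lambda_{2l}} |\psi(x)|^2\, dx.
\]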
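Hence, we obtain inequalities (4.15)–(4.16) with $\varphi$ replaced by $\psi \in P_q L^2(\mathbb{R}^2)$. Now, we write $\psi = \sum_{j \geq 0} a_j e_j$ (see (3.26)). Using the fact that $\{e_j\}_{j \geq 0}$ is an orthogonal family on any disk centered at $0$ (this is due to the rotational symmetry), we compute
\[
\int_{\Lambda_{2l}} |\psi(x)|^2\, dx \leq \int_{|x| \leq \sqrt{2}\, l} |\psi(x)|^2\, dx = \sum_{j \geq 0} |a_j|^2 \int_{|x| \leq \sqrt{2}\, l} |e_j(x)|^2\, dx, \tag{4.19}
\]
and
\[
\int_{\Lambda_{2L}} |\psi(x)|^2\, dx \geq \int_{|x| \leq L} |\psi(x)|^2\, dx = \sum_{j \geq 0} |a_j|^2 \int_{|x| \leq L} |e_j(x)|^2\, dx. \tag{4.20}
\]
Fix $m \geq 1$ and decompose $\psi = \psi_0 + \psi_m$, where
\[
\psi_0 = \sum_{j = 0}^{m} a_j e_j, \quad \psi_m = \sum_{j \geq m + 1} a_j e_j. \tag{4.21}
\]
Our next goal is to estimate the ratio
\[
\frac{\int_{|x| < \sqrt{2}\, l} |e_{j,q}(x)|^2\, dx}{\int_{|x| < L} |e_{j,q}(x)|^2\, dx}, \quad j \geq m + 1, \tag{4.22}
\]
where $l$, $m$, and $L$ satisfy (4.6) with suitable $C$, under the hypotheses that $l$, and hence $m$ and $L$ are sufficiently large. Passing to polar coordinates $(\rho, \theta)$, and then changing the variable $s = \frac{b\rho^2}{2j}$ in both the numerator and the denominator of (4.22), we find that
\[
\frac{\int_{|x| < \sqrt{2}\, l} |e_{j,q}(x)|^2\, dx}{\int_{|x| < L} |e_{j,q}(x)|^2\, dx}
= \frac{\int_0^{bl^2/j} e^{-s(j - q)}\, s^{j - q}\, L_q^{(j - q)}(js)^2\, ds}{\int_0^{bL^2/(2j)} e^{-s(j - q)}\, s^{j - q}\, L_q^{(j - q)}(js)^2\, ds}. \tag{4.23}
\]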
Employing estimates (3.27) and (3.28), we get bl 2 /j
( j−q) e−s( j−q) s j−q L q ( js)2 ds 0 bL 2 /(2 j) ( j−q) e−s( j−q) s j−q L q ( js)2 ds 0
≤ C(q)
j j −q
2q bl 2 /j 0
e( j−q) f (s) ds
0
e( j−q) f (s) ds
( j)
,
(4.24) where f (s) := ln s − s, s > 0, and ( j) =
1 2 if bL 2 2j
j ≤ bL 2 , if j > bL 2 .
Note that the function f is increasing on the interval (0, 1). Since j ≥ m + 1, and C, the 2 constant in (4.6), is greater than one, we have blj < 1. Therefore,
bl 2 /j
e( j−q) f (s) ds ≤
0
bl 2 ( j−q) f (bl 2 /j) e . j
(4.25)
On the other hand, using a second-order Taylor expansion of f , we get f (s) ≥ f (( j)) +
s − ( j) 1 − , s ∈ (( j)/2, ( j)). ( j) 2
Consequently,
( j)
e( j−q) f (s) ds ≥
0
( j) ( j)/2
e( j−q) f (s) ds ≥
( j) ( j−q)( f (( j))−1)) e . 2
(4.26)
Putting together (4.24)-(4.26), we obtain
√ 2l
|x|<
|x|
|e j,q (x)|2 d x
|e j,q (x)|2 d x
˜ ≤ C(q)
2bl 2 j( j)
j j −q
≤ C(q) ⎧
2q ⎨
2bl 2 j( j)
j j −q
2q exp (( j − q)( f (bl 2 /j) − f (( j)) + 1)
3/2 2 j q exp −bl 2 + j ln ( 2e j bl ) if j < bL 2 , ⎩ exp −bl 2 + j ln ( 2e22l 2 ) if j ≥ bL 2 . L
(4.27)
Now, using the computations (4.19) and (4.20) done for ψm , as well as (4.6), we obtain 2 2 |ψm (x)|2 d x ≤ Ce−bl /2+m ln(C bl /2m) |ψ(x)|2 d x 2l
2L
2 L 2 2 e−bl /2+m ln(C bl /m) |ψ(x)|2 d x. ≤ C1 l 2l
(4.28)
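Plugging this into (4.15)–(4.16) for $\psi$, and using the uniform boundedness of $V_\omega$, we get that
\[
\int_{\Lambda_{2l}} V_\omega(x)\, |\psi_0(x)|^2\, dx \leq \left(C_2 E + C\left(\frac{L}{l}\right)^2 e^{-bl^2 + m \ln(C b l^2/2m)}\right) \int_{\Lambda_{2l}} |\psi_0(x)|^2\, dx,
\]
\[
\int_{\Lambda_{2l}} |\psi_0(x)|^2\, dx \geq \left(\frac{1}{C_1}\left(\frac{l}{L}\right)^2 - e^{-bl^2 + m \ln(C b l^2/2m)}\right) \int_{\Lambda_{2L}} |\psi(x)|^2\, dx.
\]
Taking (4.7) into consideration, this completes the proof of Lemma 4.1.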
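Let us now complete the proof of Theorem 4.1. Assume at first the hypotheses of its first part. In particular, suppose that $u(x) \geq C(1 + |x|)^{-\kappa}$, $x \in \mathbb{R}^2$, with some $\kappa > 2$, and $C > 0$. Pick $\eta > 2/(\kappa - 2)$, and $\nu_0 > \max\left\{\frac{1}{\kappa - 2}, \nu\right\}$ where $\nu = \nu(\eta)$ is the number defined in Corollary 3.1. Finally, fix an arbitrary $\kappa' > \kappa$ and set
\[
n \sim E^{-\nu_0}, \quad L = (2n + 1)a/2, \quad l = E^{-\frac{1}{\kappa' - 2}}, \quad m \sim E^{-\frac{2}{\kappa' - 2}}.
\]
Then the numbers $m$, $l$, and $L$, satisfy (4.6)–(4.7) provided that $E > 0$ is sufficiently small. Further, for any $\gamma_0 \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}$ we have
\[
\langle V_\omega^{\gamma_0} \psi, \psi \rangle_l \geq \sum_{|\gamma| \leq l} \omega_\gamma \int_{\Lambda_{2l}} u(x - \gamma)\, |\psi(x)|^2\, dx \geq \frac{1}{C_3}\, l^{-\kappa} \sum_{|\gamma| \leq l} \omega_\gamma \int_{\Lambda_{2l}} |\psi(x)|^2\, dx \tag{4.29}
\]
with $C_3 > 0$ independent of $\theta$ and $E$. Hence, the probability that there exists $\gamma \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}$ and a non-identically vanishing function $\psi$ in the span of $\{e_j\}_{0 \leq j \leq m}$ such that (4.8) be satisfied, is not greater than the probability that
\[
l^{-2} \sum_{|\gamma| \leq l} \omega_\gamma \leq C_3 E\, l^{\kappa - 2} = C_3 E^{\frac{\kappa' - \kappa}{\kappa' - 2}}. \tag{4.30}
\]
Applying a standard large-deviation estimate (see e.g. [15, Subsect. 8.4] or [22, Sect. 3.2]), we easily find that the probability that (4.30) holds, is bounded by
\[
\exp\left(C_4\, l^2 \ln \mathbb{P}\left(\omega_0 \leq C_3 E^{\frac{\kappa' - \kappa}{\kappa' - 2}}\right)\right) = \exp\left(C_4 E^{-\frac{2}{\kappa' - 2}} \ln \mathbb{P}\left(\omega_0 \leq C_3 E^{\frac{\kappa' - \kappa}{\kappa' - 2}}\right)\right)
\]
with $C_4$ independent of $\theta$ and $E > 0$ small enough. Applying our hypothesis that $\mathbb{P}(\omega_0 \leq E) \sim C E^{\kappa}$, $E \downarrow 0$, with $C > 0$ and $\kappa > 0$, we find that for any $\kappa' > \kappa$, $\theta \in \mathbb{T}^*_{2L}$, and sufficiently small $E > 0$, we have
\[
\mathbb{P}\left(r_{q,n,\omega}(\theta) \text{ has an eigenvalue less than } E\right) \leq \exp\left(-C_5 E^{-\frac{2}{\kappa' - 2}} |\ln E|\right) \tag{4.31}
\]
with $C_5 > 0$ independent of $\theta$ and $E$. Putting together (3.39), (4.5) and (4.31), and taking into account that $\mathrm{area}\, \Lambda^*_{2L} = \pi^2 L^{-2}$, we get
\[
N_b(2bq + E) - N_b(2bq) \leq \frac{b}{2\pi} \exp\left(-C_5 E^{-\frac{2}{\kappa' - 2}} |\ln E|\right) + \exp\left(-E^{-\eta}\right)
\]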
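which implies
\[
\liminf_{E \downarrow 0} \frac{\ln |\ln (N_b(2bq + E) - N_b(2bq))|}{|\ln E|} \geq \frac{2}{\kappa' - 2}
\]
for any $\kappa' > \kappa$. Letting $\kappa' \downarrow \kappa$, we get (4.1).

Assume now the hypotheses of Theorem 4.1 ii). In particular, we suppose that $u(x) \geq C e^{-C|x|^\beta}$, $x \in \mathbb{R}^2$, $C > 0$, $\beta > 0$. Put $\beta_0 = \max\{1, 2/\beta\}$. Pick an arbitrary $\beta' > \beta$ and set
\[
l = |\ln E|^{1/\beta'}, \quad m \sim |\ln E|^{\beta_0}.
\]
Then (4.6)–(4.7) are satisfied provided that $E > 0$ is sufficiently small, and similarly to (4.29), for any $\gamma_0 \in 2l\mathbb{Z}^2 \cap \Lambda_{2L}$ we have
\[
\langle V_\omega^{\gamma_0} \psi, \psi \rangle_l \geq \frac{1}{C_6}\, e^{-C_6 l^\beta} \sum_{|\gamma| \leq l} \omega_\gamma \int_{\Lambda_{2l}} |\psi(x)|^2\, dx
\]
with $C_6 > 0$ independent of $\theta$ and $E$. Arguing as in the derivation of (4.31), we get
\[
\mathbb{P}\left(r_{q,n,\omega}(\theta) \text{ has an eigenvalue less than } E\right) \leq \exp\left(-C_7 |\ln E|^{1 + 2/\beta'} \ln |\ln E|\right) \tag{4.32}
\]
with $C_7 > 0$ independent of $\theta$ and $E$. As in the previous case, we put together (3.39), (4.5) and (4.32), and obtain the estimate
\[
N_b(2bq + E) - N_b(2bq) \leq \frac{b}{2\pi} \exp\left(-C_7 |\ln E|^{1 + 2/\beta'} \ln |\ln E|\right) + \exp\left(-E^{-\eta}\right)
\]
which implies
\[
\liminf_{E \downarrow 0} \frac{\ln |\ln (N_b(2bq + E) - N_b(2bq))|}{\ln |\ln E|} \geq 1 + \frac{2}{\beta'}
\]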
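for any $\beta' > \beta$. Letting $\beta' \downarrow \beta$, we get (4.2).

Finally, let us assume the hypotheses of Theorem 4.1 iii). In particular, we assume that $u(x) \geq C \mathbf{1}_{\{x \in \mathbb{R}^2;\ |x - x_0| < \varepsilon\}}$ with some $C > 0$, $x_0 \in \mathbb{R}^2$, and $\varepsilon > 0$. Due to $\tau_{x_0} H_0 \tau_{x_0}^* = H_0$ and $\tau_{x_0} \mathbf{1}_{\{x \in \mathbb{R}^2;\ |x - x_0| < \varepsilon\}} \tau_{x_0}^* = \mathbf{1}_{\{x \in \mathbb{R}^2;\ |x| < \varepsilon\}}$ we can assume without loss of generality that $x_0 = 0$. Our first goal is to estimate from below the ratio
\[
R_\gamma = R_{\gamma,m,q} := \frac{\int_{|x - \gamma| \leq \varepsilon} |P_m(x)|^2\, dx}{\int_{|x| \leq \sqrt{2}\, l} |P_m(x)|^2\, dx}, \tag{4.33}
\]
where
\[
P_m(x) := \sum_{j = 0}^{m} c_j e_{j,q}(x), \quad x \in \mathbb{R}^2, \tag{4.34}
\]
with $0 \neq c = (c_0, c_1, \ldots, c_m) \in \mathbb{C}^{m + 1}$.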
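Lemma 4.2. Let $q \in \mathbb{Z}_+$. Let $\pi(s) = \sum_{j = 0}^{q} c_j s^j$, $s \in \mathbb{R}$. Moreover, let $p \in \mathbb{Z}_+$, $\rho \in (0, \infty)$. Then we have
\[
\frac{e^{-(q + 1)\rho}\, \rho^{q(q + 1)}\, \rho^{p + 1}\, (p + 1)^q}{(q + 1)^{q/2}\, (1 + \rho^q)^q\, (p + 2q + 1)^{(q + 1)^2}} \left(\prod_{r = 0}^{q} (r!)^2\right) |c|^2
\leq \int_0^\rho |\pi(s)|^2 e^{-s} s^p\, ds
\leq \sqrt{q + 1}\, (1 + \rho^q)\, \frac{\rho^{p + 1}}{p + 1}\, |c|^2, \tag{4.35}
\]
where $c := (c_0, c_1, \ldots, c_q) \in \mathbb{C}^{q + 1}$ and $|c|^2 = |c_0|^2 + \cdots + |c_q|^2$.

Proof. Let $\mathcal{M}$ be the $(q + 1) \times (q + 1)$ positive-definite matrix with entries $\int_0^\rho s^{j + k + p} e^{-s}\, ds$, $j, k = 0, 1, \ldots, q$. Then we have
\[
\int_0^\rho |\pi(s)|^2 e^{-s} s^p\, ds = \langle \mathcal{M} c, c \rangle \leq \|\mathcal{M}\|\, |c|^2.
\]
Further, $\mathcal{M} = \int_0^\rho \mathcal{E}(s)\, ds$, where $\mathcal{E}(s)$, $s \in (0, \rho)$, is the rank-one matrix with entries $s^{j + k} e^{-s} s^p$, $j, k = 0, 1, \ldots, q$. Obviously,
\[
\|\mathcal{E}(s)\| = \sqrt{\sum_{j = 0}^{q} s^{2j}}\ e^{-s} s^p \leq \sqrt{q + 1}\, (1 + s^q)\, e^{-s} s^p, \quad s \in (0, \rho),
\]
and
\[
\|\mathcal{M}\| \leq \int_0^\rho \|\mathcal{E}(s)\|\, ds \leq \sqrt{q + 1} \int_0^\rho (1 + s^q)\, e^{-s} s^p\, ds \leq \sqrt{q + 1}\, \frac{\rho^{p + 1}(1 + \rho^q)}{p + 1},
\]
which yields the upper bound in (4.35). Next, we have
\[
\frac{\det \mathcal{M}}{\|\mathcal{M}\|^q}\, |c|^2 \leq \int_0^\rho |\pi(s)|^2 e^{-s} s^p\, ds. \tag{4.36}
\]
Further,
\[
e^{-(1 + q)\rho} \det \tilde{\mathcal{M}} \leq \det \mathcal{M}, \tag{4.37}
\]
where $\tilde{\mathcal{M}}$ is the $(q + 1) \times (q + 1)$-matrix with entries $\int_0^\rho s^{j + k + p}\, ds = \frac{\rho^{j + k + p + 1}}{j + k + p + 1}$, $j, k = 0, 1, \ldots, q$, and
\[
\det \tilde{\mathcal{M}} = \rho^{(q + p + 1)(q + 1)} \Delta_q, \tag{4.38}
\]
where $\Delta_q = \Delta_q(p)$ is the determinant of the $(q + 1) \times (q + 1)$-matrix with entries $(j + k + p + 1)^{-1}$, $j, k = 0, 1, \ldots, q$. On the other hand, it is easy to check that
\[
\Delta_q = \frac{(q!)^2}{(p + 2q + 1) \prod_{r = 0}^{q - 1} (p + q + r + 1)^2}\, \Delta_{q - 1}, \quad q \geq 1,\ p \geq 0, \qquad \Delta_0 = \frac{1}{p + 1}.
\]
Hence, for $q \geq 1$ and $p \geq 0$,
\[
\frac{\prod_{r = 0}^{q} (r!)^2}{(p + 2q + 1)^{(q + 1)^2}} \leq \Delta_q. \tag{4.39}
\]
Putting together (4.36)–(4.39) and using the upper bound in (4.35), we obtain the corresponding lower bound.

In the following proposition we obtain the needed lower bound of ratio (4.33).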
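The determinant of the $(q+1)\times(q+1)$ matrix with entries $(j+k+p+1)^{-1}$ used in the proof of Lemma 4.2 can be checked in exact rational arithmetic. The sketch below is only an illustration: the function names are hypothetical, the determinant is computed by plain Gaussian elimination over fractions (no pivoting is needed, since the matrix is a positive-definite Gram matrix), and it is compared with the recurrence in $q$ and with the lower bound (4.39) for a few small values of $q$ and $p$.

```python
from fractions import Fraction
from math import factorial, prod

def cauchy_type_det(q, p):
    # exact determinant of the (q+1)x(q+1) matrix with entries 1/(j+k+p+1)
    n = q + 1
    a = [[Fraction(1, j + k + p + 1) for k in range(n)] for j in range(n)]
    det = Fraction(1)
    for i in range(n):
        pivot = a[i][i]  # non-zero because the matrix is positive definite
        det *= pivot
        for r in range(i + 1, n):
            factor = a[r][i] / pivot
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return det

def recurrence_det(q, p):
    # the same determinant from the recurrence in q, starting from 1/(p+1)
    d = Fraction(1, p + 1)
    for s in range(1, q + 1):
        d = d * factorial(s) ** 2 / ((p + 2 * s + 1) * prod((p + s + r + 1) ** 2 for r in range(s)))
    return d

def lower_bound(q, p):
    # left-hand side of (4.39)
    return Fraction(prod(factorial(r) ** 2 for r in range(q + 1)), (p + 2 * q + 1) ** ((q + 1) ** 2))

for q in range(4):
    for p in range(3):
        print(q, p, cauchy_type_det(q, p), recurrence_det(q, p), lower_bound(q, p) <= cauchy_type_det(q, p))
```

For every pair $(q, p)$ the two determinants should coincide exactly and the last column should read `True`.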
Proposition 4.1. There exists a constant C > 0 such that for sufficiently large m and l, ratio (4.33) satisfies the estimates Rγ ≥ e−C m ln l
(4.40)
for each linear combination Pm of the form (4.34). Proof. Evidently, 2 2 |Pm (x)| d x = |Pm (x + γ )| d x = |(τγ Pm )(x)|2 , (4.41) |x−γ |≤ε |x|≤ε |x|≤ε 2 2 √ |Pm (x)| d x ≤ √ |Pm (x)| |x|≤ 2l
|x−γ |≤2 2l
=
√ |x|≤2 2l
|Pm (x + γ )| d x = 2
√ |x|≤2 2l
|(τγ Pm )(x)|2 d x,
(4.42)
the magnetic translation operator $\tau_\gamma$ being defined in (2.6). Using the fact that $\tau_\gamma$ commutes with the creation operator $a^*$ (see (3.30)), we easily find that (3.29) implies
\[
(\tau_\gamma P_m)(x) = \sum_{j=0}^{m} \tilde{c}_j\,(a^*)^{q} z^{j} e^{\zeta z} e^{-b|z|^{2}/4},
\tag{4.43}
\]
where $z = x_1 + i x_2$, $\zeta = -\frac{b}{2}(\gamma_1 - i\gamma_2)$, and the coefficients $\tilde{c}_j$, $j = 0,1,\ldots,m$, may depend on $\gamma$, $b$ and $q$ but are independent of $x \in \mathbb{R}^2$. Applying (3.26) and (3.29), we get
\[
\sum_{j=0}^{m} \tilde{c}_j\,(a^*)^{q} z^{j} e^{\zeta z} e^{-b|z|^{2}/4}
= \sum_{j=0}^{m} \tilde{c}_j \sum_{k=0}^{\infty} \frac{\zeta^{k}}{k!}\,(a^*)^{q} z^{j+k} e^{-b|z|^{2}/4}
= e^{-b|z|^{2}/4} \sum_{j=0}^{m} \hat{c}_j\, z^{j-q} \sum_{k=0}^{\infty} \frac{(\zeta z)^{k}}{k!}\, L_q^{(j+k-q)}(b|z|^{2}/2)
\tag{4.44}
\]
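with $\hat{c}_j$, $j = 0,1,\ldots,m$, independent of $x \in \mathbb{R}^2$. By [9, Eq. (8.977.2)] we have
\[
\sum_{k=0}^{\infty} \frac{(\zeta z)^{k}}{k!}\, L_q^{(j+k-q)}(b|z|^{2}/2) = e^{\zeta z}\, L_q^{(j-q)}\!\left(\frac{b|z|^{2}}{2} - \zeta z\right),
\tag{4.45}
\]
while the Taylor expansion formula entails
\[
L_q^{(j-q)}\!\left(\frac{b|z|^{2}}{2} - \zeta z\right) = \sum_{s=0}^{q} \frac{(-\zeta z)^{s}}{s!}\, \frac{d^{s} L_q^{(j-q)}(\xi)}{d\xi^{s}}\bigg|_{\xi = b|z|^{2}/2},
\tag{4.46}
\]
and [9, Eq. (8.971.3)] yields
\[
\frac{d^{s} L_q^{(j-q)}(\xi)}{d\xi^{s}} = (-1)^{s} L_{q-s}^{(j-q+s)}(\xi), \qquad \xi \in \mathbb{R}.
\tag{4.47}
\]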
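The two Laguerre identities (4.45) and (4.47) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it builds the generalized Laguerre polynomials from their standard explicit sum with an integer upper index `alpha` (playing the role of $j-q$), compares a truncated series with the closed form in (4.45), and compares coefficient lists exactly for (4.47). The helper names are hypothetical, `x` stands for $b|z|^2/2$, and `w` stands for $\zeta z$.

```python
import cmath
from fractions import Fraction
from math import factorial


def laguerre_coeffs(n, alpha):
    """Exact monomial coefficients of L_n^{(alpha)}, alpha an integer.

    Uses L_n^{(alpha)}(x) = sum_i (-1)^i binom(n+alpha, n-i) x^i / i!,
    with binom(n+alpha, n-i) = prod_{t=1}^{n-i} (alpha+i+t)/t.
    """
    coeffs = []
    for i in range(n + 1):
        binom = Fraction(1)
        for t in range(1, n - i + 1):
            binom *= Fraction(alpha + i + t, t)
        coeffs.append((-1) ** i * binom / factorial(i))
    return coeffs


def laguerre(n, alpha, x):
    """Evaluate L_n^{(alpha)}(x) for real or complex x."""
    return sum(float(c) * x ** i for i, c in enumerate(laguerre_coeffs(n, alpha)))


def derivative_coeffs(coeffs, s):
    """Coefficients of the s-th derivative of a polynomial."""
    for _ in range(s):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return coeffs


def check_derivative(q, alpha, s):
    """Identity (4.47): d^s/dx^s L_q^{(alpha)} = (-1)^s L_{q-s}^{(alpha+s)}, s <= q."""
    lhs = derivative_coeffs(laguerre_coeffs(q, alpha), s)
    rhs = [(-1) ** s * c for c in laguerre_coeffs(q - s, alpha + s)]
    return lhs == rhs


def check_generating(q, alpha, x, w, terms=80):
    """Identity (4.45), with the series truncated after `terms` summands."""
    series = sum(
        w ** k / factorial(k) * laguerre(q, alpha + k, x) for k in range(terms)
    )
    closed = cmath.exp(w) * laguerre(q, alpha, x - w)
    return abs(series - closed) < 1e-9 * max(1.0, abs(closed))
```

Negative integer values of `alpha` are allowed in this sketch, which matches the case $j < q$ in the text.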
F. Klopp, G. Raikov
Combining (4.43)–(4.47), we find that
\[
(\tau_\gamma P_m)(x) = e^{\zeta z}\,\tilde{P}_m(x), \qquad x \in \mathbb{R}^2,
\tag{4.48}
\]
where
\[
\tilde{P}_m(x) = e^{-b|z|^{2}/4} \sum_{j=0}^{m} \hat{c}_j \sum_{s=0}^{q} \frac{\zeta^{s}}{s!}\, z^{j+s-q}\, L_{q-s}^{(j+s-q)}(b|z|^{2}/2)
= e^{-b|z|^{2}/4} \sum_{p=0}^{m+q} z^{p-q}\,\phi_{p,q}(b|z|^{2}/2),
\tag{4.49}
\]
and $\phi_{p,q}$, $p = 0,\ldots,m+q$, are polynomials of degree not exceeding $q$; moreover, if $p < q$, then the minimal possible degree of the non-zero monomial terms in $\phi_{p,q}$ is $q - p$. Bearing in mind that $|e^{\zeta z}|^{2} = e^{-b\,x\cdot\gamma}$ and $|\gamma| \le \sqrt{2}\,l$, we find that there exists a constant $C$ such that for sufficiently large $l$ we have
\[
R_\gamma \ge e^{-C l^{2}}\,\tilde{R},
\tag{4.50}
\]
where
\[
\tilde{R} = \frac{\int_{|x|\le\varepsilon} |\tilde{P}_m(x)|^{2}\,dx}{\int_{|x|\le 2\sqrt{2}\,l} |\tilde{P}_m(x)|^{2}\,dx},
\tag{4.51}
\]
the functions $\tilde{P}_m$ being defined in (4.49). Passing to the polar coordinates $(r,\theta)$ in $\mathbb{R}^2$, after that changing the variable $s = br^{2}/2$, and taking into account the rotational symmetry, we find that for each $R > 0$ we have
\[
\int_{|x|\le R} |\tilde{P}_m(x)|^{2}\,dx
= \frac{2\pi}{b} \sum_{p=0}^{m+q} \left(\frac{2}{b}\right)^{p-q} \int_0^{\rho} s^{p-q} e^{-s}\,|\phi_{p,q}(s)|^{2}\,ds
= \sum_{p=0}^{m} \int_0^{\rho} s^{p} e^{-s}\,|\Phi_{p,q}(s)|^{2}\,ds + \sum_{p=1}^{q} \int_0^{\rho} s^{p} e^{-s}\,|\tilde{\Phi}_{p,q}(s)|^{2}\,ds;
\tag{4.52}
\]
if $q = 0$, then the second term in the last line of (4.52) should be set equal to zero. Here $\rho = bR^{2}/2$,
\[
\Phi_{p,q}(s) = \sqrt{\frac{2\pi}{b}\left(\frac{2}{b}\right)^{p}}\;\phi_{p+q,q}(s), \quad p = 0,\ldots,m, \qquad
\tilde{\Phi}_{p,q}(s) = s^{-p}\sqrt{\frac{2\pi}{b}\left(\frac{2}{b}\right)^{-p}}\;\phi_{q-p,q}(s), \quad p = 1,\ldots,q.
\]
Note that the degree of the polynomials $\Phi_{p,q}$ does not exceed $q$, and the degree of the polynomials $\tilde{\Phi}_{p,q}$ does not exceed $q - p$. Bearing in mind (4.52) and applying Lemma 4.2, we easily deduce the existence of a constant $C > 0$ such that for sufficiently large $m$ and $l$ we have $\tilde{R} \ge e^{-Cm\ln l}$, which combined with (4.50) yields (4.40).
Next, we pick an arbitrary $\eta$ and $\nu = \nu(\eta)$, the number defined in Corollary 3.1. Further, we choose $\varsigma > 1$ and $\delta \in (0, 1/2)$ so that $\varsigma(1-\delta) > 1 + 2\nu$, and set
\[
l = |\ln E|^{\delta/2}, \qquad m \sim \frac{\varsigma\,|\log E|}{\log|\log E|}, \qquad L = (2n+1)a/2.
\tag{4.53}
\]
Then, for E sufficiently small, (4.6) – (4.7) are satisfied. Further, we impose the additional condition that μ := Cςδ 2 < 1, where C is the constant in (4.40), which is compatible with the conditions on ς and δ formulated above. Now, the probability that there exists γ ∈ 2lZ2 ∩ 2L and a non identically vanishing function ψ in the span of {e j }0≤ j≤m such that (4.8) be satisfied, is not greater than the probability that ωγ ≤ l −2 E 1−μ = E 1−μ | ln E|δ . l −2 |γ |≤l
Arguing as in the derivation of (4.31) and (4.32), we conclude that for any $\theta \in \mathbb{T}^*_{2L}$ we have
\[
\mathbb{P}(r_{q,m,\omega} \text{ has an eigenvalue less than } E) \le \exp\left(C_8\, l^{2} \log \mathbb{P}(\omega_0 \le E^{1-\mu}|\ln E|^{\delta})\right) \le \exp\left(-C_9\,|\ln E|^{1+\delta}\ln|\ln E|\right)
\tag{4.54}
\]
with positive $C_8$ and $C_9$ independent of $\theta$ and $E > 0$ small enough. Combining the upper bound in (3.39), (4.5), and (4.54), we get (4.3). This completes the proof of the upper bounds in Theorem 2.1.

5. Proof of Theorem 2.1: Lower Bounds of the IDS

In this section we get the lower bounds of $N_b(2bq+E) - N_b(2bq)$ needed for the proof of Theorem 2.1.

Theorem 5.1. Assume that H1 – H4 hold, that almost surely $\omega_\gamma \ge 0$, $\gamma \in \mathbb{Z}^2$, and (2.1) is valid. Fix the Landau level $2bq$, $q \in \mathbb{Z}_+$.
i) We have
\[
\liminf_{E\downarrow 0} \frac{\ln|\ln(N_b(2bq+E) - N_b(2bq))|}{|\ln E|} \le \frac{2}{\kappa-2},
\tag{5.1}
\]
where $\kappa$ is the constant in (1.2).
ii) Let $u(x) \le e^{-C|x|^{\beta}}$, $x \in \mathbb{R}^2$, for some $C > 0$ and $\beta \in (0,2]$. Then we have
\[
\limsup_{E\downarrow 0} \frac{\ln(N_b(2bq+E) - N_b(2bq))}{|\ln E|^{1+2/\beta}} \ge -\frac{\pi\kappa}{C^{2/\beta}},
\tag{5.2}
\]
if $\beta \in (0,2)$, and
\[
\liminf_{E\downarrow 0} \frac{\ln(N_b(2bq+E) - N_b(2bq))}{|\ln E|^{2}} \ge -\pi\kappa\left(\frac{2}{b} + \frac{1}{C}\right),
\tag{5.3}
\]
if $\beta = 2$. Therefore,
\[
\limsup_{E\downarrow 0} \frac{\ln|\ln(N_b(2bq+E) - N_b(2bq))|}{\ln|\ln E|} \le 1 + 2/\beta.
\tag{5.4}
\]
Note that the combination of Theorem 4.1 with Theorem 5.1 completes the proof of Theorem 2.1. Let us prove now Theorem 5.1. Pick $\eta \ge \frac{2}{\kappa-2}$ in the case of its first part, or an arbitrary $\eta > 0$ in the case of its second part. As above, set $n \sim E^{-\nu}$, where $\nu = \nu(\eta)$ is the number defined in Corollary 3.1, and $L = (2n+1)a/2$. Bearing in mind the lower bound in (3.39), and (4.4), we conclude that it suffices to estimate from below the quantity
\[
\mathbb{E}\bigl(\rho_{q,n,\omega}(E)\bigr) = \frac{1}{(2\pi)^{2}} \int_{\Lambda^*_{2L}} \mathbb{E}\bigl(N(E; r_{q,n,\omega}(\theta))\bigr)\,d\theta
= (2\pi)^{-2} \int_{\Lambda^*_{2L}} \sum_{j=1}^{\operatorname{rank} r_{q,n,\omega}(\theta)} \mathbb{P}(\lambda_j(\theta) < E)\,d\theta
\ge (2\pi)^{-2} \int_{\Lambda^*_{2L}} \mathbb{P}(\lambda_1(\theta) < E)\,d\theta.
\tag{5.5}
\]
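Fix an arbitrary $\theta \in \mathbb{T}^*_{2L}$. Evidently, $\mathbb{P}(\lambda_1(\theta) < E)$ is equal to the probability that there exists a non-zero function $f \in \operatorname{Ran} r_{q,n,\omega}(\theta)$ such that
\[
\int_{\Lambda_{2L}} V_\omega(x)\,|f(x;\theta)|^{2}\,dx < E \int_{\Lambda_{2L}} |f(x;\theta)|^{2}\,dx.
\tag{5.6}
\]
Further, pick the trial function
\[
\varphi(x;\theta) = \sum_{\gamma \in 2L\mathbb{Z}^2} e^{-i\theta(x+\gamma)}\,(\tau_\gamma \tilde{\varphi})(x), \qquad x \in \Lambda_{2L}, \quad \theta \in \mathbb{T}^*_{2L},
\tag{5.7}
\]
where
\[
\tilde{\varphi}(x) = \tilde{\varphi}_q(x) := \bar{z}^{\,q} e^{-b|z|^{2}/4}, \qquad z = x_1 + i x_2, \quad \bar{z} = x_1 - i x_2.
\tag{5.8}
\]
Since the function $\tilde{\varphi}_q$ is proportional to $e_{0,q}$ (see (3.26)), we have $\varphi \in \operatorname{Ran} r_{q,n,\omega}(\theta)$. Therefore, the probability that there exists a non-zero function $f \in \operatorname{Ran} r_{q,n,\omega}(\theta)$ such that (5.6) holds, is not less than the probability that
\[
\int_{\Lambda_{2L}} V_\omega(x)\,|\varphi(x;\theta)|^{2}\,dx < E \int_{\Lambda_{2L}} |\varphi(x;\theta)|^{2}\,dx.
\tag{5.9}
\]

Lemma 5.1. Let the function $\varphi$ be defined as in (5.7) – (5.8). Then there exist $L_0 > 0$ and $c_1 > 0$ independent of $\theta$ such that $L \ge L_0$ implies
\[
\int_{\Lambda_{2L}} |\varphi(x;\theta)|^{2}\,dx > c_1.
\tag{5.10}
\]

Proof. We have $\varphi = \varphi_0 + \varphi_\infty$ where
\[
\varphi_0(x;\theta) = e^{-i\theta x}\,\tilde{\varphi}(x),
\tag{5.11}
\]
\[
\varphi_\infty(x;\theta) = \sum_{\gamma \in 2L\mathbb{Z}^2,\ \gamma \ne 0} e^{-i\theta(x+\gamma)}\,(\tau_\gamma \tilde{\varphi})(x).
\tag{5.12}
\]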
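Note that
\[
\sup_{x \in \Lambda_{2L}} |\varphi_\infty(x;\theta)| \le \tilde{c}\,e^{-\tilde{c}L^{2}}
\tag{5.13}
\]
with $\tilde{c}$ independent of $L$ and $\theta$. Further,
\[
\int_{\Lambda_{2L}} |\varphi(x;\theta)|^{2}\,dx \ge \frac{1}{2}\int_{\Lambda_{2L}} |\varphi_0(x;\theta)|^{2}\,dx - 2\int_{\Lambda_{2L}} |\varphi_\infty(x;\theta)|^{2}\,dx
\ge \frac{1}{2}\int_{\mathbb{R}^2} |\tilde{\varphi}(x)|^{2}\,dx - 8\tilde{c}L^{2}e^{-\tilde{c}L^{2}}.
\tag{5.14}
\]
Taking into account that $\int_{\mathbb{R}^2} |\tilde{\varphi}|^{2}\,dx = \frac{2\pi}{b}\left(\frac{2}{b}\right)^{q} q!$, we find that (5.14) implies (5.10).

By assumption we have
\[
u(x) \le C\,v(x), \qquad C > 0, \quad x \in \mathbb{R}^2,
\tag{5.15}
\]
where $v(x) := (1+|x|)^{-\kappa}$ in the case of Theorem 5.1 i), and $v(x) := e^{-C|x|^{\beta}}$ in the case of Theorem 5.1 ii). Since $\omega_\gamma \ge 0$, inequality (5.9) will follow from
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi(x;\theta)|^{2}\,dx \le c_2 E,
\tag{5.16}
\]
where $c_2 = c_1 C^{-1}$, $C$ being the constant in (5.15), and $c_1$ being the constant in (5.10). Next, we write
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi(x;\theta)|^{2}\,dx
\le 2\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi_0(x;\theta)|^{2}\,dx + 2\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi_\infty(x;\theta)|^{2}\,dx,
\tag{5.17}
\]
where $\varphi_0$ and $\varphi_\infty$ are defined in (5.11) and (5.12) respectively.

Lemma 5.2. Fix $q \in \mathbb{Z}_+$.
i) Let $\kappa > 0$, $b > 0$. Then there exists a constant $c > 0$ such that for each $y \in \mathbb{R}^2$, $L > 0$, and $\theta \in \mathbb{T}^*_{2L}$, we have
\[
\int_{\Lambda_{2L}} (1+|x-y|)^{-\kappa}\,|\varphi_0(x;\theta)|^{2}\,dx \le c\,(1+|y|)^{-\kappa}.
\tag{5.18}
\]
ii) Let $\beta \in (0,2]$, $b > 0$, $C > 0$. If $\beta \in (0,2)$, set $b_0 := C$. If $\beta = 2$, set $b_0 := \frac{Cb}{2C+b}$. Then for each $b_1 < b_0$ there exists a constant $c > 0$ such that for each $y \in \mathbb{R}^2$, $L > 0$, and $\theta \in \mathbb{T}^*_{2L}$, we have
\[
\int_{\Lambda_{2L}} e^{-C|x-y|^{\beta}}\,|\varphi_0(x;\theta)|^{2}\,dx \le c\,e^{-b_1|y|^{\beta}}.
\tag{5.19}
\]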
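We omit the proof since estimates (5.18)–(5.19) follow from standard simple facts concerning the asymptotics at infinity of the convolutions of functions admitting power-like or exponential decay, with the derivatives of Gaussian functions. In the case of power-like decay, results of this type can be found in [34, Theorem 24.1], and in the case of an exponential decay similar results are contained in [12, Lemma 3.5].

Using Lemma 5.2, we find that under the hypotheses of Theorem 5.1 i) we have
\[
2\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi_0(x;\theta)|^{2}\,dx \le c_3 \sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,(1+|\gamma|)^{-\kappa},
\tag{5.20}
\]
while under the hypotheses of Theorem 5.1 ii) for each $b_1 < b_0$ we have
\[
2\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi_0(x;\theta)|^{2}\,dx \le c_3 \sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,e^{-b_1|\gamma|^{\beta}},
\tag{5.21}
\]
where $c_3$ is independent of $L$ and $\theta$. Further, for both parts of Theorem 5.1 we have
\[
2\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma \int_{\Lambda_{2L}} v(x-\gamma)\,|\varphi_\infty(x;\theta)|^{2}\,dx \le c_4 L^{2} e^{-\tilde{c}L^{2}},
\tag{5.22}
\]
where $c_4$ is independent of $L$ and $\theta$, and $\tilde{c}$ is the constant in (5.13). Since $L \sim E^{-\nu}$, $\nu > 0$, we have
\[
c_2 E - c_4 L^{2} e^{-\tilde{c}L^{2}} \ge \frac{c_2}{2}\,E
\tag{5.23}
\]
for sufficiently small $E > 0$. Combining (5.17) with (5.20)–(5.23), and setting $c_5 = c_2/(2c_3)$, we find that (5.16) will follow from the inequality
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,(1+|\gamma|)^{-\kappa} \le c_5 E,
\tag{5.24}
\]
in the case of Theorem 5.1 i), or from the inequality
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,e^{-b_1|\gamma|^{\beta}} \le c_5 E, \qquad b_1 < b_0,
\tag{5.25}
\]
in the case of Theorem 5.1 ii). Now pick $l > 0$ and write
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,(1+|\gamma|)^{-\kappa} \le \sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| \le l} \omega_\gamma + \sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| > l} \omega_\gamma\,|\gamma|^{-\kappa},
\tag{5.26}
\]
\[
\sum_{\gamma \in \mathbb{Z}^2} \omega_\gamma\,e^{-b_1|\gamma|^{\beta}} \le \sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| \le l} \omega_\gamma + \sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| > l} \omega_\gamma\,e^{-b_1|\gamma|^{\beta}}.
\tag{5.27}
\]
Evidently, for each $\kappa' \in (2,\kappa)$ and $b_2 < b_1$ there exists a constant $c_6 > 0$ such that
\[
\sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| > l} \omega_\gamma\,|\gamma|^{-\kappa} \le c_6\,l^{-\kappa'+2},
\tag{5.28}
\]
\[
\sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| > l} \omega_\gamma\,e^{-b_1|\gamma|^{\beta}} \le c_6\,e^{-b_2 l^{\beta}}.
\tag{5.29}
\]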
Fix $l$ and $c_7 \in (0, c_5)$ such that
\[
l^{-\kappa'+2} = \frac{c_5 - c_7}{c_6}\,E
\tag{5.30}
\]
in the case of Theorem 5.1 i), or
\[
e^{-b_2 l^{\beta}} = \frac{c_5 - c_7}{c_6}\,E
\tag{5.31}
\]
in the case of Theorem 5.1 ii). Putting together (5.26) – (5.31), we conclude that (5.24), or, respectively, (5.25) will follow from the inequality
\[
\sum_{\gamma \in \mathbb{Z}^2,\ |\gamma| \le l} \omega_\gamma \le c_7 E,
\tag{5.32}
\]
provided that $l$ satisfies (5.30) or, respectively, (5.31). Set $N(l) := \#\{\gamma \in \mathbb{Z}^2;\ |\gamma| \le l\}$, so that we have
\[
N(l) = \pi l^{2}(1 + o(1)), \qquad l \to \infty.
\tag{5.33}
\]
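The lattice-point asymptotics (5.33) and the two choices of the cutoff $l$ in (5.30) and (5.31) are easy to explore numerically. The sketch below is illustrative only: the function names are hypothetical, and the argument `ratio` stands for the quotient $(c_5 - c_7)/c_6$, whose actual value is not specified in the text.

```python
import math


def lattice_count(l):
    """N(l) = #{gamma in Z^2 : |gamma| <= l}, counted column by column."""
    m = math.floor(l * l)          # integer points satisfy g1^2 + g2^2 <= floor(l^2)
    bound = math.isqrt(m)
    return sum(2 * math.isqrt(m - g1 * g1) + 1 for g1 in range(-bound, bound + 1))


def cutoff_power(E, kappa_prime, ratio):
    """Solve l^(-kappa' + 2) = ratio * E for l, as in (5.30); needs kappa' > 2."""
    return (ratio * E) ** (-1.0 / (kappa_prime - 2.0))


def cutoff_exponential(E, b2, beta, ratio):
    """Solve exp(-b2 * l^beta) = ratio * E for l, as in (5.31); needs ratio * E < 1."""
    return (-math.log(ratio * E) / b2) ** (1.0 / beta)


if __name__ == "__main__":
    for l in (10, 100, 1000):
        print(l, lattice_count(l) / (math.pi * l * l))
```

The printed ratios approach 1 as $l$ grows, in line with (5.33); the power-law cutoff grows like a negative power of $E$, whereas the exponential cutoff grows only like a power of $|\ln E|$.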
Evidently, the probability that (5.32) holds is not less than the probability that $\omega_\gamma \le c_7 E/N(l)$ for each $\gamma \in \mathbb{Z}^2$ such that $|\gamma| \le l$. Since the random variables $\omega_\gamma$ are identically distributed and independent, the last probability is equal to $\mathbb{P}(\omega_0 \le c_7 E/N(l))^{N(l)}$. Combining the above inequalities, and using the lower bound in (3.39), we get
\[
N_b(2bq+E) - N_b(2bq) \ge \frac{\operatorname{area} \Lambda^*_{2L}}{(2\pi)^{2}}\,\mathbb{P}(\omega_0 \le c_7 E/N(l))^{N(l)} - e^{-E^{-\eta}},
\tag{5.34}
\]
where $l$ is chosen to satisfy (5.30) with an arbitrary $\kappa' \in (2,\kappa)$ in the case of Theorem 5.1 i), or to satisfy (5.31) with an arbitrary fixed $b_2 < b_0$ in the case of Theorem 5.1 ii). Putting together (5.34), (2.1), (5.30), and (5.33), we get
\[
\limsup_{E\downarrow 0} \frac{\ln|\ln(N_b(2bq+E) - N_b(2bq))|}{|\ln E|} \le \frac{2}{\kappa'-2}
\]
for any $\kappa' \in (2,\kappa)$ such that $\eta > \frac{2}{\kappa'-2}$. Letting $\kappa' \uparrow \kappa$, we get (5.1). Similarly, putting together (5.34), (2.1), (5.31), and (5.33), we get
\[
\liminf_{E\downarrow 0} \frac{\ln(N_b(2bq+E) - N_b(2bq))}{|\ln E|^{1+2/\beta}} \ge -\frac{\pi\kappa}{b_2^{2/\beta}}
\]
for any $b_2 < b_0$. Letting
\[
b_2 \uparrow b_0 = \begin{cases} C & \text{if } \beta \in (0,2), \\ \dfrac{bC}{b+2C} & \text{if } \beta = 2, \end{cases}
\]
we get (5.2)–(5.3).

Acknowledgements. The financial support of the Chilean Science Foundation Fondecyt under Grants 1020737 and 7020737 is acknowledged by both authors. Georgi Raikov is sincerely grateful for the warm hospitality of his colleagues at the Department of Mathematics, University of Paris 13, during his visit in 2004, when a considerable part of this work was done. He would like to thank also Werner Kirsch, Hajo Leschke and Simone Warzel for several illuminating discussions.
References
1. Avron, J., Herbst, I., Simon, B.: Schrödinger operators with magnetic fields. I. General interactions. Duke Math. J. 45, 847–883 (1978)
2. Broderix, K., Hundertmark, D., Kirsch, W., Leschke, H.: The fate of Lifshits tails in magnetic fields. J. Stat. Phys. 80, 1–22 (1995)
3. Bruneau, V., Pushnitski, A., Raikov, G.D.: Spectral shift function in strong magnetic fields. Alg. i Analiz 16, 207–238 (2004)
4. Dimassi, M., Sjöstrand, J.: Spectral Asymptotics in the Semi-Classical Limit. London Mathematical Society Lecture Note Series 268, Cambridge: Cambridge University Press, 1999
5. Dubrovin, B.A., Novikov, S.P.: Ground states in a periodic field. Magnetic Bloch functions and vector bundles. Sov. Math., Dokl. 22, 240–244 (1980)
6. Erdős, L.: Lifschitz tail in a magnetic field: the nonclassical regime. Probab. Theory Related Fields 112, 321–371 (1998)
7. Erdős, L.: Lifschitz tail in a magnetic field: coexistence of classical and quantum behavior in the borderline case. Probab. Theory Related Fields 121, 219–236 (2001)
8. Fock, V.: Bemerkung zur Quantelung des harmonischen Oszillators im Magnetfeld. Z. Physik 47, 446–448 (1928)
9. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. New York, San Francisco, London: Academic Press, 1965
10. Helffer, B., Sjöstrand, J.: Equation de Schrödinger avec champ magnétique et équation de Harper. In: H. Holden, A. Jensen (eds.), Schrödinger operators, Proceedings, Sonderborg, Denmark 1988, Lect. Notes in Physics 345, Berlin: Springer, 1989, pp. 118–197
11. Hupfer, T., Leschke, H., Müller, P., Warzel, S.: Existence and uniqueness of the integrated density of states for Schrödinger operators with magnetic fields and unbounded random potentials. Rev. Math. Phys. 13, 1547–1581 (2001)
12. Hupfer, T., Leschke, H., Warzel, S.: Poissonian obstacles with Gaussian walls discriminate between classical and quantum Lifshits tailing in magnetic fields. J. Stat. Phys. 97, 725–750 (1999)
13. Hupfer, T., Leschke, H., Warzel, S.: The multiformity of Lifshits tails caused by random Landau Hamiltonians with repulsive impurity potentials of different decay at infinity. In: Differential equations and mathematical physics (Birmingham, AL, 1999), AMS/IP Stud. Adv. Math., 16, Providence, RI: Amer. Math. Soc., 2000, pp. 233–247
14. Hupfer, T., Leschke, H., Warzel, S.: Upper bounds on the density of states of single Landau levels broadened by Gaussian random potentials. J. Math. Phys. 42, 5626–5641 (2001)
15. Kirsch, W.: Random Schrödinger operators: a course. In: Schrödinger operators, Proc. Nord. Summer Sch. Math., Sandbjerg Slot, Soenderborg/Denmark 1988, Lect. Notes Phys. 345, Berlin: Springer, 1989, pp. 264–370
16. Kirsch, W., Martinelli, F.: On the spectrum of Schrödinger operators with a random potential. Commun. Math. Phys. 85, 329–350 (1982)
17. Kirsch, W., Martinelli, F.: Large deviations and Lifshitz singularity of the integrated density of states of random Hamiltonians. Commun. Math. Phys. 89, 27–40 (1983)
18. Kirsch, W., Simon, B.: Lifshitz tails for periodic plus random potentials. J. Statist. Phys. 42, no. 5-6, 799–808 (1986)
19. Kirsch, W., Simon, B.: Comparison theorems for the gap of Schrödinger operators. J. Funct. Anal. 75, 396–410 (1987)
20. Klopp, F.: An asymptotic expansion for the density of states of a random Schrödinger operator with Bernoulli disorder. Random Oper. Stochastic Equations 3, 315–331 (1995)
21. Klopp, F.: Internal Lifshits tails for random perturbations of periodic Schrödinger operators. Duke Math. J. 98, 335–396 (1999)
22. Klopp, F.: Lifshitz tails for random perturbations of periodic Schrödinger operators. In: Spectral and inverse spectral theory (Goa, 2000). Proc. Indian Acad. Sci. Math. Sci. 112, 147–162 (2002)
23. Klopp, F., Pastur, L.: Lifshitz tails for random Schrödinger operators with negative singular Poisson potential. Commun. Math. Phys. 206, 57–103 (1999)
24. Klopp, F., Ralston, J.: Endpoints of the spectrum of periodic operators are generically simple. In: Cathleen Morawetz: a great mathematician. Methods Appl. Anal. 7, 459–463 (2000)
25. Klopp, F., Wolff, T.: Lifshitz tails for 2-dimensional random Schrödinger operators. Dedicated to the memory of Tom Wolff. J. Anal. Math. 88, 63–147 (2002)
26. Landau, L.: Diamagnetismus der Metalle. Z. Physik 64, 629–637 (1930)
27. Mather, J.N.: On Nirenberg's proof of Malgrange's preparation theorem. In: Proceedings of Liverpool Singularities Symposium, I (1969/70), Lecture Notes in Mathematics, 192, Berlin: Springer, 1971, pp. 116–120
28. Mezincescu, G.: Lifschitz singularities for periodic operators plus random potentials. J. Statist. Phys. 49, 1181–1190 (1987)
29. Mezincescu, G.: Internal Lifshitz singularities for one-dimensional Schrödinger operators. Commun. Math. Phys. 158, 315–325 (1993)
30. Mohamed, A., Raikov, G.: On the spectral theory of the Schrödinger operator with electromagnetic potential. In: Pseudo-differential calculus and mathematical physics, Math. Top., 5, Berlin: Akademie Verlag, 1994, pp. 298–390
31. Pastur, L., Figotin, A.: Spectra of Random and Almost-Periodic Operators. Grundlehren der Mathematischen Wissenschaften 297, Berlin: Springer-Verlag, 1992
32. Raikov, G.D., Warzel, S.: Quasi-classical versus non-classical spectral asymptotics for magnetic Schrödinger operators with decreasing electric potentials. Rev. Math. Phys. 14, 1051–1072 (2002)
33. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. IV. Analysis of Operators. New York: Academic Press, 1978
34. Shubin, M.A.: Pseudodifferential Operators and Spectral Theory. Second Edition, Berlin: Springer-Verlag, 2001
35. Sjöstrand, J.: Microlocal analysis for the periodic magnetic Schrödinger equation and related questions. In: Microlocal analysis and applications (Montecatini Terme, 1989), Lecture Notes in Math., 1495, Berlin: Springer, 1991, pp. 237–332
36. Veselić, I.: Integrated density of states and Wegner estimates for random Schrödinger operators. In: Spectral Theory of Schrödinger Operators, Contemp. Math. 340, Providence, RI: AMS, 2004, pp. 97–183

Communicated by B. Simon
Commun. Math. Phys. 267, 703–733 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0100-7
Communications in
Mathematical Physics
Uniqueness of Diffeomorphism Invariant States on Holonomy–Flux Algebras
Jerzy Lewandowski1,2, Andrzej Okołów2,5, Hanno Sahlmann1, Thomas Thiemann3,4
1 Center for Gravitational Physics and Geometry, Physics Department, 104 Davey, Penn State, University Park, PA 16802, USA. E-mail: [email protected]
2 Instytut Fizyki Teoretycznej, Uniwersytet Warszawski, ul. Hoża 69, 00-681 Warszawa, Poland. E-mail: [email protected]; [email protected]
3 Albert Einstein Institut, MPI f. Gravitationsphysik, Am Mühlenberg 1, 14476 Golm, Germany
4 Perimeter Institute for Theoretical Physics and University of Waterloo, 31 Caroline Street North, Waterloo, Ontario N2L 2Y5, Canada. E-mail: [email protected]
5 Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803-4001, USA
Received: 24 October 2005 / Accepted: 1 June 2006 Published online: 22 August 2006 – © Springer-Verlag 2006
Abstract: Loop quantum gravity is an approach to quantum gravity that starts from the Hamiltonian formulation in terms of a connection and its canonical conjugate. Quantization proceeds in the spirit of Dirac: First one defines an algebra of basic kinematical observables and represents it through operators on a suitable Hilbert space. In a second step, one implements the constraints. The main result of the paper concerns the representation theory of the kinematical algebra: We show that there is only one cyclic representation invariant under spatial diffeomorphisms. While this result is particularly important for loop quantum gravity, we are rather general: The precise definition of the abstract ∗-algebra of the basic kinematical observables we give could be used for any theory in which the configuration variable is a connection with a compact structure group. The variables are constructed from the holonomy map and from the fluxes of the momentum conjugate to the connection. The uniqueness result is relevant for any such theory invariant under spatial diffeomorphisms or being a part of a diffeomorphism invariant theory.

1. Introduction

In the Hamiltonian analysis of theories of gauge potentials, the configuration space usually is the space A of connections defined on a principal fiber bundle $\Pi : P \to \Sigma$ of a compact structure group G. The cotangent bundle T∗A (appropriately defined) with the natural symplectic structure becomes the phase space. In addition to the Hamiltonian equations of motion, the theory will exhibit constraint equations. The constraints play a double role in a Hamiltonian theory. On the one hand they generate a group of symmetries of the phase space referred to as the gauge transformations, on the other hand the set of solutions of the constraints defines the constraint surface of the phase space. The simplest example for such a kind of theory is certainly Maxwell theory, where the structure group is U(1).
A more general example is Yang–Mills theory, where the
structure group may be an arbitrary compact Lie group G. In this case the group of the gauge transformations is the group of the fiber preserving automorphisms of the given bundle, homotopic to the identity. The group is often referred to as the “Yang–Mills gauge transformations”. Another example, in fact the one which has triggered the present investigations, is gravity, formulated in terms of real Ashtekar variables [1, 4–7]. In the 3 + 1 case, the structure group is SU(2), the bundle is trivial and defined over a 3-manifold. The group of the gauge transformations generated by the constraints contains all the bundle automorphisms homotopic to the identity map. In terms of a local section, the group becomes the semi-direct product of the Yang–Mills gauge transformations and the diffeomorphisms of $\Sigma$ homotopic to the identity map. This Hamiltonian formulation is the starting point of the loop quantum gravity (LQG, for brevity) program. To quantize such a theory à la Dirac, one first seeks appropriate basic variables. These are preferred functions separating the points of the phase space which are then quantized. This part of the procedure is called kinematical hereafter. The constraints are then imposed as operator equations on the kinematical Hilbert space or in an appropriately selected dual. In the present paper we are concerned with two issues arising in the kinematical quantization framework of LQG and of every theory of connections whose phase space is T∗A. The first issue concerns the choice of basic classical variables and a definition of a corresponding (abstract) quantum ∗-algebra. We slightly generalize and improve details of the ideas developed in LQG and define a quantum ∗-algebra A of basic quantum variables. Our definitions are valid for arbitrary dimension D ≥ 2 of the base manifold $\Sigma$, arbitrary compact structure group G, and arbitrary bundle P.
The second issue arises when we look for representations: If A admits more than one representation, which one are we going to choose to carry out the Dirac quantization program? Our result here will hold in a more specific setting than our definition of the algebra A: We will show that in the case of diffeomorphism invariant theories,1 upon restricting to diffeomorphism invariant representations, this issue will not arise: we find a unique cyclic representation. In the following, let us explain the two results of the paper a bit more in detail and relate them to what has already been achieved elsewhere.

The classical algebra. For the sake of informal presentation, let us choose a (local) trivialization of the bundle P and use the notation of field theory. (The main part of the paper will be kept in the geometric and algebraic style.) Then the phase space consists of pairs (A, E) of fields defined on $\Sigma$, where: (i) A is a differential 1-form taking values in the Lie algebra $\mathfrak{g}$ of the gauge group G and (ii) E is a vector density of weight 1 taking values in $\mathfrak{g}^*$, the dual vector space to $\mathfrak{g}$. The non-vanishing Poisson relations between the fields evaluated at points can be written as
\[
\{A^i_a(x), E^b_j(y)\} = \delta^b_a\,\delta^i_j\,\delta(x,y),
\tag{1}
\]
where in a local coordinate system $(x^1, \ldots, x^D)$ in $\Sigma$ and in a basis $\{\tau_1, \ldots, \tau_d\}$ of $\mathfrak{g}$ the fields are decomposed into $A = A^i_a\,dx^a \otimes \tau_i$ and $E = E^a_i\,\partial_a \otimes \tau^i$ ($\tau^1, \ldots, \tau^d \in \mathfrak{g}^*$ denote the dual basis). The first question to ask is which functionals of A and E should be quantized. A very natural answer is obtained by considering the geometric nature of the fields A and E: A is a 1-form on $\Sigma$ and therefore integrals of A along 1-dimensional submanifolds are well defined. E on the other hand, as a vector density of weight one, can be turned into a pseudo (D − 1)-form $\tilde{E}$ (still $\mathfrak{g}^*$ valued) using the totally antisymmetric symbol, namely
\[
\tilde{E} = \frac{1}{(D-1)!}\,E^a_i\,\epsilon_{a a_1 \ldots a_{D-1}}\,dx^{a_1} \wedge \cdots \wedge dx^{a_{D-1}} \otimes \tau^i.
\]
Hence it can be integrated over (D − 1)-dimensional hyper-surfaces of $\Sigma$. Asking in addition for simple transformation behavior of the functionals of A upon a change of trivialization, that is with respect to $A \mapsto g^{-1}Ag + g^{-1}dg$, $E \mapsto g^{-1}Eg$, where g is an arbitrary (locally defined) G valued function in $\Sigma$, one is led to consider functionals depending on A via the Wilson loop functionals
\[
h_\alpha[A] = \mathcal{P}\exp\left(-\int_\alpha A\right),
\]
where $\alpha$ is a path in $\Sigma$. A similar requirement applied to the canonical conjugate field E leads to the flux-like variables
\[
E_{S,f} = \int_S \tilde{E}_i\,f^i,
\tag{2}
\]
where S is a (D − 1)-dimensional surface and $f : S \to \mathfrak{g}$ is a function of compact support on S. Starting from the bracket (1) these variables can be endowed with a Lie algebra structure with a remarkable geometric flavor which was systematically explored in [9, 10]: The functions $\Psi : A \to \mathbb{C}$ depending on A via the Wilson loop functionals only form the algebra of cylindrical functions and every flux variable $E_{S,f}$ acts as a derivation $X_{S,f}$ on this algebra, defined by the Poisson bracket2
\[
X_{S,f}\,\Psi := \{\Psi, E_{S,f}\}.
\tag{3}
\]
1 In the non-trivial bundle case, we mean invariance with respect to a group of automorphisms of P which induces, by the bundle projection $\Pi$, all the diffeomorphisms of $\Sigma$ homotopic with the identity map.
This is also the approach we will use in the present paper. The product (3) is well defined provided that the intersection between the path $\alpha$ with the surface S contains finitely many isolated points. A simple condition that ensures this property uses a real analytic structure on $\Sigma$, analytic paths and analytic surfaces. Correspondingly, analytic diffeomorphisms of $\Sigma$ are among the natural symmetries inherited from $\Sigma$ that define automorphisms of the algebra of the basic variables. The analyticity requirement, however, breaks the local character of the non-analytic diffeomorphisms group. Therefore, the most important difference from the treatment of the previous papers on the subject [13–16, 20] is that we will employ here a considerably larger group of symmetries. We will not require that the diffeomorphisms we consider be analytic everywhere but, roughly speaking, analytic only up to submanifolds of lower dimension. Some care has to be taken in the precise definition of this notion, mainly to ensure that they form a group and that application of these diffeomorphisms produces surfaces and edges that still have finitely many isolated intersection points. The important point is that this larger symmetry group now contains local diffeomorphisms, and this will be instrumental for proving the uniqueness result.3
2 The Poisson bracket in (3) preserves the space of cylindrical functions and the Dirac delta is absorbed completely by the integrations involved in the definitions of the holonomy and flux. This fact was pointed out for the first time in [1, 8]. The specific flux derivation used in this paper was defined in [9].
3 A more radical enlargement of the symmetry group of the algebra has been advocated for a long time by Zapata (see e.g. [17]). Recently, a similar enlargement has been implemented in [11, 29]. See also [18] for a discussion of these questions.
A more technical difference as compared to the LQG literature
is that, following [20], we will be working with arbitrary space-time dimensions and not assume a trivialization of the G-bundle. The quantum algebra. The next step in the quantization program is to define the quantum algebra A. Stated in a heuristic way, we want to define an abstract ∗-algebra of quantum objects hˆ α , Xˆ S,I whose relations reflect (i) the multiplicative structure of the functions of the Wilson loop functionals and the derivations, and (ii) the complex conjugation structure of the functions of the Wilson loop functionals and the flux functionals. Such an algebra has been defined in [13–16] on various levels of rigor and abstraction. Here, we will reach an equivalent, precise definition by using intuition from geometric quantization. Representations, uniqueness. After one has defined the quantum algebra A, according to the Dirac quantization program A has to be represented on a Hilbert space, the constraints have to be implemented as operators, and solutions to the constraints have to be found. Generically, A will admit an infinite number of inequivalent representations, so it is an important question which one of them is the right one to use. Ultimately, this question can only be answered by exhibiting one or more representations in which the program can be followed through to the end, leading to a bona fide quantization of the theory. However, there are clearly more and less natural choices of representations to try first: Most importantly, if the classical theory has symmetries that act on A by a group of automorphisms then it is natural to try to find a representation in which these automorphisms are unitarily implemented. A second natural idea is to first look at irreducible or at least cyclic representations as the simple building blocks, out of which more complicated representations could eventually be built. 
Finally, if A is not a Banach algebra, one has to worry about domain questions and it is somewhat natural to consider representations first that have simple properties in this respect. A simple formulation of these properties can be given by asking for a state (i.e. a positive, normalized, linear functional) on A that is invariant under the classical symmetry automorphisms of A. Given a state on A one can define a representation via the GNS construction. This representation will be cyclic by construction. Furthermore, by construction it has a common invariant dense domain for all the operators representing elements of A. Finally, if the state is invariant under some automorphism of A, its action is automatically unitarily implemented in the representation. In this article, we will investigate the class of representations of A delineated above in a special case, namely if the theory under consideration is invariant under diffeomorphisms of the manifold $\Sigma$ in the trivial bundle case, and automorphisms of the bundle in the general case. Most prominently, this is the case for gravity, written in terms of connection variables, as used in loop quantum gravity. It follows from [9, 19] that for the case of interest for loop quantum gravity, D = 3 and G = SU(2), a state with these properties exists. The corresponding representation has subsequently served as a cornerstone in the LQG program. Moreover, this representation can be immediately generalized to arbitrary dimension and arbitrary compact gauge group. Therefore the requirements above do not reduce considerations to the empty set. However, it is an important question for the LQG program, and at least an interesting mathematical question in general, whether there exist other representations with these properties. Our analysis will show that this is not the case: the only state that is invariant under the group of diffeomorphisms described above is the one used in LQG.
This is a more satisfying result than the ones obtained in [13–16, 20]. However it relies heavily on the enlargement of the symmetry group of a state not used in earlier publications.
While work on this manuscript was in progress, similar results as the ones that we will present here have been obtained in [29]. The technical setup of [29] differs somewhat from the one used here, and we refer to Sect. 5 for a comparison. 2. The Holonomy–Flux ∗-Algebra The goal of this section is a definition of the ∗-algebra A of basic, quantum observables. We have already mentioned the algebra in the introduction and explained its meaning; in this section we will give a complete definition. As indicated, in our approach we will base all definitions on a new category of manifolds that is larger than the analytic category but smaller than the m-times differentiable one. The technical definitions and proofs in this respect are relegated to the appendix. Let us start here by giving a more intuitive description, justification for this enlargement and outline of the properties relevant in our paper.
2.1. Semianalytic structures. In this work we consider a D-dimensional differential manifold $\Sigma$. The differentiability class $C^m$ is fixed, $m \ge 1$. Our elementary variables (already mentioned in the introduction and carefully defined in the following sections) are constructed by using curves (later called edges) and co-dimension one submanifolds (faces) of $\Sigma$. A necessary condition for the Poisson bracket between the variables to be finite is that every edge intersects every face in an at most finite number of isolated intersection points plus a finite number of connected segments (i.e. edges in themselves). To ensure this condition we need to carefully define the class of curves and submanifolds we consider. It will also be important that the class be preserved by a sufficiently large subgroup of the diffeomorphisms of $\Sigma$. ‘Large’ means that the subgroup contains sufficiently many diffeomorphisms that act non-trivially only within compact regions. This is not the case, for example, for the analytic diffeomorphism group that has usually been considered in this context. We solve this technical issue by defining an appropriate category of manifolds which we call semianalytic.

Next, we assume that the manifold $\Sigma$ is equipped with a semianalytic structure. Henceforth, all the local maps, diffeomorphisms, submanifolds and functions thereon are assumed to be $C^m$ and semianalytic. Throughout the paper, submanifolds are assumed to be embedded submanifolds. A semianalytic structure is weaker than an analytic one, therefore it can be determined on $\Sigma$, for example, by choosing an arbitrary analytic structure. Briefly, ‘semianalytic’ means ‘piecewise analytic’. For example, a semianalytic submanifold would be analytic except on some lower dimensional submanifolds, which in turn have to be piecewise analytic. To convey the idea, Fig. 1 depicts a semianalytic surface in $\mathbb{R}^3$.

However, whereas in the case of $\Sigma = \mathbb{R}$ ‘piecewise analytic’ has a well established interpretation, in a higher dimensional case those words admit a huge ambiguity. Therefore in the appendix we introduce exact definitions and prove relevant properties. We heavily rely on the theory of the semianalytic sets developed by Łojasiewicz [2, 3]. The special property of the semianalytic category that is so relevant for us in this paper is that the intersection between every two connected submanifolds is, locally, a finite union of connected submanifolds. Of course, this is also true in the analytic case. But the difference between the analytic and the semianalytic structures lies in the local character of the latter ones. Technically, the locality is expressed by the fact that every open covering of $\Sigma$ admits a compatible semianalytic partition of unity.
J. Lewandowski, A. Okołów, H. Sahlmann, T. Thiemann
Fig. 1. A semianalytic surface
2.2. The cylindrical functions. Let us recall from the appendix that by semianalytic edge we mean a connected, 1-dimensional semianalytic submanifold of $\Sigma$ with 2-point boundary.

Definition 2.1. An edge is an oriented embedded 1-dimensional $C^0$ submanifold of $\Sigma$ with a 2-point boundary, given by a finite union of semianalytic edges.

Over the manifold $\Sigma$ we fix a principal fiber bundle
$$\Pi : P \to \Sigma. \tag{4}$$
The structure group of $P$ is denoted by $G$, and it is assumed to be compact and connected. The right action of $G$ on $P$ will be denoted in the usual way as $G \times P \ni (g, p) \mapsto R_g p \in P$. We are assuming the bundle is semianalytic. On $P$ we consider the space $\mathcal{A}$ of the connections. Given an edge $e$, a connection $A \in \mathcal{A}$ defines a bundle isomorphism
$$A(e) : \Pi^{-1}(x) \to \Pi^{-1}(y), \tag{5}$$
where $x$ and $y$ are the beginning and end points of $e$, and the fibers of $P$ are considered as pullbacks of the bundle $P$. The space $\mathcal{A}_e$ of all the bundle isomorphisms $\Pi^{-1}(x) \to \Pi^{-1}(y)$ (in fact, $\mathcal{A}_e$ depends on the points $x$ and $y$ only) can be mapped in a 1-1 way into $G$,
$$\sigma : \mathcal{A}_e \to G, \tag{6}$$
and the map, called a gauge map, is defined by a choice of two points, $p_x \in \Pi^{-1}(x)$ and $p_y \in \Pi^{-1}(y)$, and by
$$A_e(p_x) = R_{\sigma(A_e)}\, p_y. \tag{7}$$
Therefore, it is determined up to the left and right multiplication in $G$ by arbitrary elements $g, h \in G$, corresponding to changing the points $p_x$ and $p_y$. In this way $\mathcal{A}_e$ inherits every structure of $G$ which is left and right invariant (including the topology, the differential manifold structure, the Haar measure).
Definition 2.2. A function $\Psi : \mathcal{A} \to \mathbb{C}$ is called cylindrical if there exists a finite set $\gamma = \{e_1, \ldots, e_n\}$ of edges and a function $\psi \in C^\infty(\mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n})$ such that for every $A \in \mathcal{A}$,
$$\Psi(A) = \psi(A(e_1), \ldots, A(e_n)); \tag{8}$$
in this case, we say that $\Psi$ is compatible with $\gamma$ and $\psi$.

Every cylindrical function is compatible with many sets of edges. Without loss of generality, we may assume that $\gamma$ is an embedded graph, that is, if two edges $e_I \neq e_J$ intersect, then the intersection is contained in the boundary of each of them [22].$^4$ The boundary points of edges constituting a graph $\gamma$ are called the vertices of $\gamma$. It is easy to see that all the cylindrical functions form a subalgebra of the algebra of all the complex valued functions defined on $\mathcal{A}$; we denote it by $\mathrm{Cyl}$. In a natural way it admits the definition of an involution and a norm,
$$\Psi^* := \bar{\Psi}, \qquad \|\Psi\| := \sup_{A \in \mathcal{A}} |\Psi(A)|. \tag{9}$$
2.3. The Ashtekar–Isham quantum configuration space $\overline{\mathcal{A}}$. The space $\mathcal{A}$ of connections is considered here a configuration space. However, promoting the cylindrical functions to the basic position variables on $\mathcal{A}$ (an over-complete set of variables) is equivalent to embedding $\mathcal{A}$ into the Gel’fand spectrum of the unital $C^*$-algebra $\overline{\mathrm{Cyl}}$ defined as the completion of $(\mathrm{Cyl}, \|\cdot\|, *)$. Elements of the Gel’fand spectrum of $\overline{\mathrm{Cyl}}$ have a geometric interpretation of generalized (or distributional) connections on $P$. We recall now the definition of the generalized connections (see [18] for a recent review, and [19, 21–25] for the origins).

Consider the space $\mathcal{E}$ of all the edges in $\Sigma$ including the trivial one. Certain pairs $(e, e') \in \mathcal{E} \times \mathcal{E}$ can be composed, yielding a new edge. More precisely, let the beginning point of $e$ be the end point of $e'$; then we define
$$e \circ e' := \overline{e \cup e' \setminus (e \cap e')}, \tag{10}$$
provided the result is again an edge, where the line stands for the completion, and the beginning (end) point of $e \circ e'$ is defined to be the beginning (end) point of $e'$ ($e$). If $e'$ differs from $e$ in orientation only, then $e \circ e'$ is trivial, hence we will also use the notation $e^{-1}$ for the edge $e$ with orientation reversed. On the other hand, from the principal fiber bundle $P$, for pairs of points $x, y \in \Sigma$, one has the bundle isomorphisms $\Pi^{-1}(x) \to \Pi^{-1}(y)$, and those from $\Pi^{-1}(x)$ to $\Pi^{-1}(y)$ can be composed with those from $\Pi^{-1}(y)$ to $\Pi^{-1}(z)$, yielding isomorphisms again.

Definition 2.3. A generalized connection $\bar{A}$ on $P$ assigns to every edge $e$ a bundle isomorphism $\bar{A}(e) : \Pi^{-1}(e_s) \to \Pi^{-1}(e_t)$, where $e_s$ is the beginning (source) of the edge $e$, and $e_t$ is its end (target), such that
$$\bar{A}(e \circ e') = \bar{A}(e) \circ \bar{A}(e'), \quad \text{and} \quad \bar{A}(e^{-1}) = \bar{A}(e)^{-1} \tag{11}$$
whenever $e \circ e'$ is defined. We denote the space of generalized connections by $\overline{\mathcal{A}}$.

$^4$ In [22] the analyticity was assumed. However, owing to Proposition A.14 the semianalyticity assumption used in the definition of the edges is sufficient.
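The composition rules (11) can be illustrated by a minimal sketch. It is not part of the construction in the text and rests on strong simplifying assumptions: the bundle is taken to be trivial over the real line, the group is G = U(1), edges are oriented intervals `(start, end)`, and the names `compose`, `inverse`, `make_connection` and `phase` are hypothetical. The function `phase` is deliberately discontinuous, to stress that a generalized connection need not come from a smooth connection.

```python
import cmath
import math

def compose(e, e_prime):
    """Return e ∘ e' (first e', then e); defined when e begins where e' ends."""
    if e[0] != e_prime[1]:
        raise ValueError("beginning of e must be the end of e'")
    return (e_prime[0], e[1])

def inverse(e):
    """Edge with reversed orientation."""
    return (e[1], e[0])

def make_connection(phase):
    """Toy generalized connection (assumption): the holonomy along an edge is
    the U(1) element exp(i * (phase(end) - phase(start)))."""
    def A_bar(e):
        return cmath.exp(1j * (phase(e[1]) - phase(e[0])))
    return A_bar

# A discontinuous 'phase' is allowed: no smoothness is required in (11).
phase = lambda x: 1.3 * math.floor(x) + x * x
A_bar = make_connection(phase)

e_prime = (0.25, 1.5)
e = (1.5, 3.0)
lhs = A_bar(compose(e, e_prime))      # holonomy of the composed edge
rhs = A_bar(e) * A_bar(e_prime)       # product of holonomies, as in (11)
```

In this toy setting both conditions of (11) hold up to floating-point rounding, and composing an edge with its inverse gives the trivial edge with holonomy 1.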
Every cylindrical function $\Psi$ is naturally extendable to $\overline{\mathcal{A}}$, by using a compatible graph $\gamma$ and function $\psi$ (see (8)), namely
$$\Psi(\bar{A}) := \psi(\bar{A}(e_1), \ldots, \bar{A}(e_n)) \tag{12}$$
(the result defines a unique function on $\overline{\mathcal{A}}$, independent of the choice of the compatible $\gamma$ and $\psi$). Given any $\bar{A} \in \overline{\mathcal{A}}$, the map
$$\mathrm{Cyl} \ni \Psi \mapsto \Psi(\bar{A}) \in \mathbb{C} \tag{13}$$
is continuous in $(\mathrm{Cyl}, \|\cdot\|)$ and defines a $C^*$-algebra homomorphism, that is, an element of the Gel’fand spectrum. Moreover, every element of the Gel’fand spectrum can be represented by a generalized connection in that way [19, 22]. In this way we identify the spectrum with $\overline{\mathcal{A}}$.

Definition 2.4. The Ashtekar–Isham quantum configuration space for the loop quantization of the theory of connections defined on $P$ is the space $\overline{\mathcal{A}}$ of the generalized connections.

2.4. Generalized vector fields tangent to $\overline{\mathcal{A}}$. Given a finite dimensional manifold as a configuration space, and the cotangent bundle as the phase space, the momenta correspond to the tangent vector fields$^5$ and there is available an elegant geometric quantization scheme. This idea is easily generalized to an infinite dimensional $\mathcal{A}$, but for the quantization one would need a measure on $\mathcal{A}$, in our case required to be invariant with respect to the automorphisms of the bundle $P$. Instead, Ashtekar and Isham defined $\overline{\mathcal{A}}$ and proposed to embed $\mathcal{A}$ in $\overline{\mathcal{A}}$ because the latter has a naturally defined compact topology and is therefore easier to treat. However, $\overline{\mathcal{A}}$ does not have a manifold structure. Nonetheless, the fluxes of the electric field and the corresponding derivations (3) defined in $\mathrm{Cyl}$ do lead to a quite precise definition of a generalized vector field tangent to $\overline{\mathcal{A}}$. We introduce it in this subsection in a geometric, manifestly trivialization invariant way.

We define now on $\overline{\mathcal{A}}$ generalized vector fields which correspond to the derivations (3), that is, to the smeared fluxes of the frame field $E$. The generalized vector fields are labeled by faces and appropriate smearing functions. A face $S$ is introduced in the appendix (see Definition A.16) as a co-dimension 1 submanifold of $\Sigma$, oriented in the sense that the normal bundle of $S$ is equipped with an orientation. Now, we will carefully define the smearing functions.
Our emphasis is on the geometric, gauge invariant characteristics, and on careful specification of the class of fields we are going to use. Let $S$ be a face. Consider the bundle
$$P_S := \Pi^{-1}(S) \subset P \tag{14}$$
equipped with the principal fiber bundle structure induced by the bundle $P$.

Definition 2.5. Given a face $S$, a smearing vector field is a compactly supported semianalytic vector field defined on the bundle $P_S$, tangent to the fibers of the bundle and invariant under the action of the structure group $G$ in $P$.

$^5$ Every vector field on a manifold defines naturally a function on the cotangent bundle. If the manifold is a configuration space, and the cotangent bundle with the natural Poisson bracket is the phase space, then the function is linear in ‘momenta’.
Let $f$ be a smearing vector field on $P_S$. Denote by $\exp(\cdot f)$ the corresponding flow. The map
$$\exp(t f) : P_S \to P_S \tag{15}$$
assigned by the flow to each $t \in \mathbb{R}$ preserves the fibers of $P_S$ (i.e. $\Pi \circ \exp(t f) = \Pi$) and commutes with the right action of the structure group (i.e. $\exp(t f) R_g = R_g \exp(t f)$ for every $g \in G$). It is easy to show that the flow is semianalytic: in a local trivialization of $P_S$, the vector field $f$ corresponds to an element of the Lie algebra of $G$ and the flow can be expressed by the usual exponential map. Restricted to each fiber $\Pi^{-1}(x) \subset P_S$, the flow becomes a fiber automorphism $\exp(t f)_x$,
$$\exp(t f)_x := \exp(t f)|_{\Pi^{-1}(x)}. \tag{16}$$
We use the latter one to define below a 1-dimensional group formed by maps $\theta^{(t)} : \overline{\mathcal{A}} \to \overline{\mathcal{A}}$ which, briefly speaking, give every generalized connection $\bar{A}$ a ‘translation’ $\exp(\pm t f)$ supported on those edges which intersect the face $S$ transversally (in the topological sense), where the sign depends on the orientation of the edge with respect to the orientation of $S$. To define it, note that $S$ admits an open neighborhood $U \subset \Sigma$ such that $U \setminus S = U^- \cup U^+$, where $U^-$ and $U^+$ are disjoint, each of them open in $\Sigma$, connected and non-empty. The labels ‘+’ and ‘−’ correspond to the orientation of $S$. An action of the generalized flow $\theta^{(t)}$ on a generalized connection $\bar{A} \in \overline{\mathcal{A}}$ can be defined by using only a subclass of edges taken into account in what follows:
$$\theta^{(t)}(\bar{A})(e) := \begin{cases} \bar{A}(e)\, \exp(\tfrac{1}{2} t f)_x & \text{if } e \cap S = \{x\} \text{ and } e \setminus x \subset U^+, \\ \bar{A}(e)\, \exp(-\tfrac{1}{2} t f)_x & \text{if } e \cap S = \{x\} \text{ and } e \setminus x \subset U^-, \\ \bar{A}(e) & \text{if } e \cap S = \emptyset \text{ or } e \cap S = e, \end{cases} \tag{17}$$
where $x$ stands for the beginning point of $e$. Every edge $e'$ can be written as a composition of edges of the type given on the right-hand side of Eq. (17) and their inverses, therefore $\theta^{(t)}(\bar{A})(e')$ is determined by (17) and the requirement that $\theta^{(t)}$ maps generalized connections to generalized connections. Given an orientation of $S$, the resulting flow is independent of the choice of the neighborhoods $U$, $U^-$, $U^+$. Importantly, the pullback $\theta^{(t)*}$ preserves $\mathrm{Cyl}$ and for every cylindrical function $\Psi$, the derivative
$$X_{S,f} \Psi := \frac{d}{dt}\Big|_{t=0} \Psi(\theta^{(t)}(\bar{A})) \tag{18}$$
is a well defined element of $\mathrm{Cyl}$. This definition is equivalent to (3) and it is its manifestly trivialization independent version. An important observation is that the action of $X_{S,f}$ on the cylindrical functions is linear in the vector field $f$, i.e. $X_{S, f_1 + f_2} = X_{S, f_1} + X_{S, f_2}$. The explicit formula for the action of the operator can be found for example in [6].

Definition 2.6. The operator $X_{S,f} : \mathrm{Cyl} \to \mathrm{Cyl}$ defined in (18), where $S$ is a face and $f$ is a smearing vector field (see Definition 2.5), will be called the flux vector field corresponding to $(S, f)$.
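The derivative (18) and its linearity in $f$ can be checked numerically in a toy setting. The sketch below is only an illustration under stated assumptions: G = U(1), a single edge that begins on $S$, a smearing field whose value at the intersection point is the Lie algebra element $i c$, and a cylindrical function given by $\psi(g) = g^n$. The names `theta`, `flux_derivative`, `sign`, `c1`, `c2` are hypothetical; `sign` encodes whether the edge leaves $S$ into $U^+$ or $U^-$.

```python
import cmath

def theta(t, g, c, sign):
    """Flow (17) for one edge starting on S, with G = U(1):
    g -> g * exp(sign * (t/2) * f_x), where f_x = i*c (assumed)."""
    return g * cmath.exp(sign * 0.5j * t * c)

def flux_derivative(psi, g, c, sign, h=1e-6):
    """Central finite-difference approximation of d/dt psi(theta(t, g)) at t = 0."""
    return (psi(theta(h, g, c, sign)) - psi(theta(-h, g, c, sign))) / (2 * h)

n = 3
psi = lambda g: g ** n            # a character of U(1), used as psi in (8)
g = cmath.exp(0.7j)               # holonomy of the connection along the edge
c1, c2 = 0.4, -1.1                # two smearing values at the intersection point

x1 = flux_derivative(psi, g, c1, +1)
x2 = flux_derivative(psi, g, c2, +1)
x12 = flux_derivative(psi, g, c1 + c2, +1)
exact = 0.5j * n * c1 * g ** n    # d/dt (g e^{i t c1 / 2})^n at t = 0
```

Within finite-difference accuracy, `x1` agrees with `exact`, `x12` agrees with `x1 + x2`, and flipping `sign` flips the sign of the result, mirroring the $U^+$/$U^-$ cases of (17).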
The space of the linear combinations of the operators $\mathrm{Cyl} \to \mathrm{Cyl}$ of the form
$$\Psi \cdot X_{S_1, f_1}, \quad \Psi \cdot [X_{S_1, f_1}, X_{S_2, f_2}], \quad \Psi \cdot [\ldots [X_{S_1, f_1}, X_{S_2, f_2}], \ldots, X_{S_k, f_k}], \tag{19}$$
where $\Psi \in \mathrm{Cyl}$ and $X_{S_1, f_1}, X_{S_2, f_2}, \ldots$ are the flux vector fields, will be called the space of generalized vector fields tangent to $\overline{\mathcal{A}}$, and denoted by $\Gamma(T\overline{\mathcal{A}})$. It will also be convenient to use the complexification $\Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$ of $\Gamma(T\overline{\mathcal{A}})$. Note that every $Y \in \Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$ is a derivation in $\mathrm{Cyl}$, that is
$$Y(\Psi \Psi') = Y(\Psi)\Psi' + \Psi\, Y(\Psi'). \tag{20}$$
Continuing the analogy with the geometric quantization in the finite dimensional case, consider the vector space
$$\mathfrak{A}_{\rm class} := \mathrm{Cyl} \times \Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}, \tag{21}$$
equipped with:
• the Lie bracket $\{\cdot, \cdot\}$,
$$\{(\Psi, Y), (\Psi', Y')\} := -\big(Y(\Psi') - Y'(\Psi),\ [Y, Y']\big), \tag{22}$$
• the complex conjugation $\bar{\ }$ defined by the complex conjugations in $\mathrm{Cyl}$ and in $\Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$, extended to a map
$$\mathfrak{A}_{\rm class} \ni (\Psi, Y) = a \mapsto \bar{a} := (\bar{\Psi}, \bar{Y}) \in \mathfrak{A}_{\rm class} \tag{23}$$
(in $\Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$ the c.c. is defined naturally as $\bar{Y}(\Psi) := \overline{Y(\bar{\Psi})}$).

Definition 2.7. The classical Ashtekar–Corichi–Zapata holonomy-flux algebra is the Lie algebra $(\mathfrak{A}_{\rm class}, \{\cdot, \cdot\})$ equipped with $\bar{\ }$ as involution.

The ACZ algebra $\mathfrak{A}_{\rm class}$ also admits an action of the algebra $\mathrm{Cyl}$,
$$\mathrm{Cyl} \times \mathfrak{A}_{\rm class} \to \mathfrak{A}_{\rm class}, \qquad (\Psi', (\Psi, Y)) \mapsto \Psi' \cdot (\Psi, Y) := (\Psi' \Psi, \Psi' Y).$$

2.5. The quantum ∗-algebra. The ACZ classical holonomy-flux Lie algebra $\mathfrak{A}_{\rm class}$ is used now as a set of labels to define an abstract ∗-algebra. Consider the ∗-algebra of the finite formal linear combinations of all the finite sequences of elements of $\mathfrak{A}_{\rm class}$ with the obvious vector space structure, the associative product $\cdot$, and the involutive anti-linear algebra anti-isomorphism $*$, defined, respectively, as follows:
$$(a_1, \ldots, a_n) \cdot (b_1, \ldots, b_m) = (a_1, \ldots, a_n, b_1, \ldots, b_m), \tag{24}$$
$$(a_1, \ldots, a_n)^* = (\bar{a}_n, \ldots, \bar{a}_1). \tag{25}$$
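The product (24) and the involution (25) can be sketched on formal linear combinations of finite sequences. This is a minimal illustration rather than an implementation of the algebra in the text: labels are plain strings, the conjugation of a label toggles a trailing `~` (a purely illustrative convention), elements are dictionaries from sequences to complex coefficients, and no ideal is divided out. The names `bar`, `product` and `star` are hypothetical.

```python
from collections import defaultdict

def bar(label):
    """Formal conjugate of a label: toggles a trailing '~' (illustrative convention)."""
    return label[:-1] if label.endswith("~") else label + "~"

def product(x, y):
    """Bilinear extension of concatenation of sequences, as in (24)."""
    out = defaultdict(complex)
    for s, cs in x.items():
        for t, ct in y.items():
            out[s + t] += cs * ct
    return dict(out)

def star(x):
    """Anti-linear involution, as in (25): reverse each sequence, conjugate
    every label and conjugate the coefficient."""
    return {tuple(bar(a) for a in reversed(s)): c.conjugate() for s, c in x.items()}

# Two elements: formal combinations of 1- and 2-element sequences.
x = {("a", "b"): 2 + 1j, ("c",): 1j}
y = {("d",): 3 - 2j}

left = star(product(x, y))          # (x . y)^*
right = product(star(y), star(x))   # y^* . x^*
```

With these definitions `left == right` and `star(star(x)) == x`, i.e. `star` is an involutive anti-isomorphism of the free algebra before the ideal is divided out.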
Divide the algebra by a two-sided ideal defined by the following elements (consisting of 1-element and 2-element sequences):
$$(\alpha a) - \alpha (a), \qquad (a + b) - (a) - (b), \tag{26}$$
$$(a, b) - (b, a) - i(\{a, b\}), \tag{27}$$
$$(\Psi, a) - (\Psi a), \tag{28}$$
given by all $\alpha \in \mathbb{C}$, $a, b \in \mathfrak{A}_{\rm class}$ and $\Psi \in \mathrm{Cyl}$. The first class (26) of elements of the ideal relates the linear structure of $\mathfrak{A}_{\rm class}$ with the linear structure of the resulting quotient. The second class (27) of elements encodes the familiar quantum relation between the bracket $\{\cdot, \cdot\}$ in $\mathfrak{A}_{\rm class}$ and the commutators in the quantum algebra $\mathfrak{A}$. The third class$^6$ (28) encodes the module structure of the ACZ Lie algebra $\mathfrak{A}_{\rm class}$ over the algebra $\mathrm{Cyl}$. In short, the algebra $\mathfrak{A}$ may also be viewed as the algebra $\exp(\otimes \mathfrak{A}_{\rm class})$ divided by the identities (27) and (28). Note that each of the classes (26), (27), (28) is preserved by $*$. Denote the quotient ∗-algebra by $\mathfrak{A}$.

Definition 2.8. The quantum holonomy-flux ∗-algebra is the (unital) ∗-algebra $(\mathfrak{A}, *)$.

The classical ACZ algebra $\mathfrak{A}_{\rm class}$ is naturally mapped into $\mathfrak{A}$,
$$\mathfrak{A}_{\rm class} \to \mathfrak{A}, \tag{29}$$
in the sense that $\mathfrak{A}$ is isomorphic to the enveloping algebra of $\mathfrak{A}_{\rm class}$ (see (26), (27)), divided by the additional identities (28) to preserve the structure of a $\mathrm{Cyl}$-module. The images in $\mathfrak{A}$ of the 1-element sequences $((\Psi, 0))$ or $((0, Y))$, where $\Psi \in \mathrm{Cyl}$ and $Y \in \Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$, will be denoted by $\hat{\Psi}$ and $\hat{Y}$ respectively. They generate the algebra $\mathfrak{A}$. In particular, for every cylindrical function $\Psi$ and every flux vector field $X_{S,f}$,
$$\hat{\Psi}^* = \hat{\bar{\Psi}}, \qquad \hat{X}^*_{S,f} = \hat{X}_{S,f}. \tag{30}$$
It is easy to see that the map (29) is an embedding. Here is a simple argument.$^7$ Consider a representation $\pi_0 : \exp(\otimes \mathfrak{A}_{\rm class}) \to L(\mathrm{Cyl})$, where $L(\mathrm{Cyl})$ is the algebra of the linear maps $\mathrm{Cyl} \to \mathrm{Cyl}$, defined on the generators as follows:
$$\pi_0((\Psi, X))\, \Psi' = \Psi \Psi' - i\{\Psi', X\}. \tag{31}$$
It is easy to check that each of the elements (27) and (28) is in the kernel of $\pi_0$. Therefore, $\pi_0$ passes naturally to a representation
$$\pi_0 : \mathfrak{A} \to L(\mathrm{Cyl}). \tag{32}$$
The point is that the composition of the maps (29) and $\pi_0$,
$$\mathfrak{A}_{\rm class} \to \mathfrak{A} \to L(\mathrm{Cyl}), \tag{33}$$
is obviously injective. Hence the first map is also injective.

2.6. The elements of $\mathfrak{A}$. Owing to the identities in $\mathfrak{A}$ defined by the third class (28) of elements defining the ideal above, the image $\widehat{\mathrm{Cyl}} \subset \mathfrak{A}$ of $\mathrm{Cyl}$ upon the map (29),
$$\mathrm{Cyl} \ni \Psi \mapsto \hat{\Psi} \in \mathfrak{A}, \tag{34}$$
is a ∗-subalgebra. Due to (26), the map is linear; it is multiplicative due to (28), and bijective onto its image as the restriction of (29), and hence a ∗-isomorphism between $\mathrm{Cyl}$ and $\widehat{\mathrm{Cyl}}$. Every element of the algebra $\mathfrak{A}$ is a finite linear combination of elements of the form
$$\hat{\Psi}, \quad \hat{\Psi}_1 \hat{X}_{S_{11}, f_{11}}, \quad \hat{\Psi}_2 \hat{X}_{S_{21}, f_{21}} \hat{X}_{S_{22}, f_{22}}, \quad \ldots, \quad \hat{\Psi}_k \hat{X}_{S_{k1}, f_{k1}} \cdots \hat{X}_{S_{kk}, f_{kk}}, \quad \ldots, \tag{35}$$

$^6$ The theorem we formulate and prove in this paper uses only the fact that the two-sided ideal contains elements of the third class for $a \in \mathrm{Cyl}$.
$^7$ We thank Wojtek Kamiński for help.
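where $\Psi, \Psi_i \in \mathrm{Cyl}$ and $X_{S_{ij}, f_{ij}}$ are the flux vector fields for all the $i, j = 1, \ldots, k$. For example,
$$a = \hat{X}_{S,f}\, \hat{\Psi}\, \hat{X}_{S',f'} = -i\, \widehat{X_{S,f}(\Psi)}\, \hat{X}_{S',f'} + \hat{\Psi}\, \hat{X}_{S,f}\, \hat{X}_{S',f'}. \tag{36}$$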
2.7. Symmetries of A. The group of the semianalytic automorphisms of the principal fiber bundle P acts naturally in the space A of connections. The action preserves the algebra Cyl of the cylindrical functions, the norm · and the ∗ involution. Therefore it induces an action of the bundle automorphism group in the space A of generalized connections. The action of the bundle automorphism group on the flux vector fields can be viewed either as the action on operators X S, f : Cyl → Cyl (3), or as the action on the field E and its flux functional (2). Both definitions are equivalent and lead to an appropriate action of the bundle automorphisms on the labels, i.e. the faces and the smearing vector fields. In this way, the bundle automorphism group induces an isomorphism of the ACZ classical Lie algebra Aclass , and finally a ∗-isomorphism of the quantum ∗-algebra A. In this subsection we discuss the action of the automorphisms/diffeomorphisms in detail. But before doing that let us make a remark on the relation between the bundle P automorphisms and the manifold diffeomorphisms. For every bundle automorphism, ϕ˜ : P → P,
(37)
there is a unique diffeomorphism
$$\varphi : \Sigma \to \Sigma \tag{38}$$
such that
$$\Pi \circ \tilde{\varphi} = \varphi \circ \Pi. \tag{39}$$
In our case both of them are assumed to be semianalytic. If the diffeomorphism $\varphi$ is the identity map, then the corresponding automorphism is fiber preserving, and we can call it a Yang–Mills gauge transformation. On the other hand, all the diffeomorphisms homotopic to the identity map are related to the bundle automorphisms via (39). In this sense, the bundle automorphisms represent also the diffeomorphisms of $\Sigma$.

Now we turn to the technical details of the action of the bundle automorphism group in the quantum ∗-algebra $\mathfrak{A}$. For every edge $e$ (only the end points of $e$ matter here), the map $\tilde{\varphi}$ defines the following map [20]:
$$\mathrm{Ad}_{\tilde{\varphi}} : \mathcal{A}_e \to \mathcal{A}_{\varphi(e)}, \qquad A(e) \mapsto \tilde{\varphi} \circ A(e) \circ \tilde{\varphi}^{-1}. \tag{40}$$
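The map extends naturally to the product space,
$$\mathrm{Ad}_{\tilde{\varphi}} : \mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_N} \to \mathcal{A}_{\varphi(e_1)} \times \cdots \times \mathcal{A}_{\varphi(e_N)}, \qquad (A(e_1), \ldots, A(e_N)) \mapsto (\mathrm{Ad}_{\tilde{\varphi}}(A(e_1)), \ldots, \mathrm{Ad}_{\tilde{\varphi}}(A(e_N))). \tag{41}$$
It will be relevant later that the map is smooth, which can easily be seen by fixing any trivialization of the fibers of $P$ in question. Via $\mathrm{Ad}$, the automorphism $\tilde{\varphi}$ acts in the space of the generalized connections,
$$\mathrm{Ad}_{\tilde{\varphi}} : \overline{\mathcal{A}} \to \overline{\mathcal{A}}, \tag{42}$$
$$\big(\mathrm{Ad}_{\tilde{\varphi}} \bar{A}\big)(e) := \mathrm{Ad}_{\tilde{\varphi}}\big(\bar{A}(\varphi^{-1}(e))\big). \tag{43}$$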
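The pullback $\mathrm{Ad}^*_{\tilde{\varphi}}$ preserves the space of the cylindrical functions. Indeed, for every cylindrical function $\Psi$ given by (8),
$$(\mathrm{Ad}^*_{\tilde{\varphi}} \Psi)(\bar{A}) = (\mathrm{Ad}^*_{\tilde{\varphi}} \psi)\big(\bar{A}(\varphi^{-1}(e_1)), \ldots, \bar{A}(\varphi^{-1}(e_n))\big), \tag{44}$$
meaning that $\mathrm{Ad}^*_{\tilde{\varphi}} \Psi$ is compatible with the graph $\varphi^{-1}(\gamma) := \{\varphi^{-1}(e_1), \ldots, \varphi^{-1}(e_n)\}$ and the function $\mathrm{Ad}^*_{\tilde{\varphi}} \psi \in C^\infty(\mathcal{A}_{\varphi^{-1}(e_1)} \times \cdots \times \mathcal{A}_{\varphi^{-1}(e_n)})$. Given a flux vector field $X_{S,f}$, the map $\mathrm{Ad}_{\tilde{\varphi}}$ defined above maps the generalized flow (17) into a new flow $\mathrm{Ad}_{\tilde{\varphi}}\, \theta^{(\cdot)}$. It is easy to check that the new flow is the flow of the flux vector field $X_{\varphi(S), \tilde{\varphi}_* f}$ [20], hence
$$\mathrm{Ad}_{\tilde{\varphi} *} X_{S,f} = X_{\varphi(S), \tilde{\varphi}_* f}. \tag{45}$$
Finally, the natural action of $\tilde{\varphi}$ on $\mathfrak{A}$,
$$\alpha_{\tilde{\varphi}} : \mathfrak{A} \to \mathfrak{A}, \tag{46}$$
is a ∗-algebra automorphism determined by the following action on the generators $\hat{\Psi}$ and $\hat{X}_{S,f}$, where $\Psi \in \mathrm{Cyl}$ and $X_{S,f}$ are arbitrary:
$$\alpha_{\tilde{\varphi}} \hat{\Psi} := \widehat{\mathrm{Ad}^*_{\tilde{\varphi}^{-1}} \Psi}, \qquad \alpha_{\tilde{\varphi}} \hat{X}_{S,f} := \hat{X}_{\varphi(S), \tilde{\varphi}_* f}. \tag{47}$$
The action satisfies
$$\alpha_{\tilde{\varphi}_1 \circ \tilde{\varphi}_2} = \alpha_{\tilde{\varphi}_1} \circ \alpha_{\tilde{\varphi}_2}. \tag{48}$$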
It should be pointed out that the quantum holonomy-flux ∗-algebra $\mathfrak{A}$ can potentially admit more symmetries. The relevance of the bundle automorphisms lies in the fact that they are the symmetries of a diffeomorphism invariant classical theory.

3. States, GNS

In all of the following, we will be concerned with states $\omega$ on $\mathfrak{A}$ and their GNS representations. Recall that

Definition 3.1. A state on a ∗-algebra $\mathfrak{A}$ is a functional $\omega : \mathfrak{A} \to \mathbb{C}$ such that for every $\alpha \in \mathbb{C}$ and every $a, b \in \mathfrak{A}$,
$$\omega(\alpha a + b) = \alpha\, \omega(a) + \omega(b), \qquad \omega(a^*) = \overline{\omega(a)}, \tag{49}$$
$$\omega(a^* a) \ge 0, \qquad \omega(I) = 1, \tag{50}$$
where $I$ stands for the unity element of $\mathfrak{A}$.

Given a state on $\mathfrak{A}$, we can construct the corresponding GNS representation $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$, where $\mathcal{H}_\omega$ is a Hilbert space, $\pi_\omega$ a representation of $\mathfrak{A}$ on $\mathcal{H}_\omega$ and $\Omega_\omega$ a vector in $\mathcal{H}_\omega$ which, when viewed as a state on $\mathfrak{A}$, coincides with $\omega$. A detailed exposition of the GNS construction for algebras of unbounded operators can be found for example in [12]. Here we will only need the following elements and properties that are easy to prove:
(i) the linear space of the equivalence classes
$$[\mathfrak{A}] := \mathfrak{A}/\mathfrak{I}, \tag{51}$$
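where $\mathfrak{I}$ is the left ideal formed by all $a \in \mathfrak{A}$ such that $\omega(a^* a) = 0$, is equipped by the state $\omega$ with the following product: $\langle [a], [b] \rangle := \omega(a^* b)$, where for every $a \in \mathfrak{A}$, $[a] \in \mathfrak{A}/\mathfrak{I}$ stands for the equivalence class defined by $a$;
(ii) the product provides a norm $\|a\|_\omega = \sqrt{\langle [a], [a] \rangle}$ in $[\mathfrak{A}]$, and the completion
$$\mathcal{H}_\omega := \overline{[\mathfrak{A}]} \tag{52}$$
together with the product $\langle \cdot, \cdot \rangle$ is a Hilbert space;
(iii) to every element $a$ of $\mathfrak{A}$ we assign a linear but in general unbounded operator $\pi_\omega(a)$ acting in $[\mathfrak{A}]$,
$$\pi_\omega(a)[b] := [ab], \quad \text{for every } b \in \mathfrak{A}; \tag{53}$$
(iv) the action $\pi_\omega$ preserves the subspace $[\mathfrak{A}]$, hence $[\mathfrak{A}]$ serves as a common, dense domain for all the operators $\pi_\omega(a)$, $a \in \mathfrak{A}$;
(v) the representation $\pi_\omega$ satisfies
$$\langle \pi_\omega(a)[b], [c] \rangle = \langle [b], \pi_\omega(a^*)[c] \rangle \tag{54}$$
for every $a, b, c \in \mathfrak{A}$.

As we explained in the previous section, $\mathfrak{A}$ contains the subalgebra $\widehat{\mathrm{Cyl}} \subset \mathfrak{A}$ isomorphic as a ∗-algebra with the algebra $\mathrm{Cyl}$. Therefore, every state $\omega$ defined in $\mathfrak{A}$, restricted to $\widehat{\mathrm{Cyl}}$, defines a state on $\mathrm{Cyl}$. On the other hand, there is known a powerful characterization of states defined on the completion $\overline{\mathrm{Cyl}}$. Fortunately, that characterization applies also to all the states on the ∗-algebra $\mathrm{Cyl}$ (and hence $\widehat{\mathrm{Cyl}}$), due to the following fact:

Lemma 3.2.$^8$ Suppose that $\omega : \mathrm{Cyl} \to \mathbb{C}$ satisfies for every $\Psi, \Psi' \in \mathrm{Cyl}$ and $\alpha \in \mathbb{C}$ the following equalities and inequality:
$$\omega(\alpha \Psi + \Psi') = \alpha\, \omega(\Psi) + \omega(\Psi'), \qquad \omega(\bar{\Psi}) = \overline{\omega(\Psi)}, \qquad \omega(\bar{\Psi} \Psi) \ge 0, \qquad \omega(I) = 1.$$
Then,
$$|\omega(\Psi)| \le 2 \|\Psi\|. \tag{55}$$
Therefore, $\omega$ is continuous with respect to the norm $\|\cdot\|$ and determines a unique extension to a state defined on the $C^*$-algebra $\overline{\mathrm{Cyl}}$.

Proof. For every $\Psi \in \mathrm{Cyl}$ we have
$$|\omega(\Psi)| = |\omega(\Psi_R) + i\, \omega(\Psi_I)| \le |\omega(\Psi_R)| + |\omega(\Psi_I)|, \tag{56}$$
where $\Psi_R$ and $\Psi_I$ are the real and imaginary parts of $\Psi$, respectively. Let $\Psi' \neq 0$ be $\Psi_R$ or $\Psi_I$ (if both $\Psi_R$ and $\Psi_I$ are zero then (55) is satisfied trivially). For every real number $q$ the following equality holds:
$$\omega(\Psi') = q \|\Psi'\| - \omega(q \|\Psi'\| I - \Psi'). \tag{57}$$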
8 This lemma is a modification of similar well known results see for example [30], p. 106,107. We include it for completeness. The factor 2 in the inequality (55) can be probably lowered to 1, but this is not relevant in our paper.
Let $q > 1$. Then, the function $q \|\Psi'\| I - \Psi'$ is strictly positive, and
$$\mathcal{A} \ni A \mapsto \Psi''(A) := \sqrt{q \|\Psi'\| I - \Psi'}\,(A) \tag{58}$$
is a well defined function on the space of connections. Importantly, this is also a cylindrical function. Indeed, if we represent $\Psi'$ by a compatible graph $\gamma$ and function $\psi'$ (see (8)), then the function $q \|\psi'\| - \psi'$ is everywhere strictly positive, because the natural map $\mathcal{A} \to \mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n}$ is onto. Therefore the function
$$\psi'' := \sqrt{q \|\psi'\| I - \psi'} \tag{59}$$
is $C^\infty$ and the corresponding cylindrical function is exactly $\Psi''$. But this means that the second term on the right hand side of the equality (57) (including the sign) is non-positive. Indeed,
$$-\omega(q \|\Psi'\| - \Psi') = -\omega(\bar{\Psi}'' \Psi'') \le 0. \tag{60}$$
This observation completes the proof of Lemma 3.2.
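For completeness, the step from (57) and (60) to the bound (55) can be spelled out as follows. This is a short worked derivation under the notation assumed here, in which $\Psi'$ stands for either $\Psi_R$ or $\Psi_I$; it uses only the hypotheses of Lemma 3.2 and the fact that $-\Psi'$ is again a real cylindrical function with the same norm.

```latex
% From (57) and (60), for every q > 1:
\omega(\Psi') \;=\; q\|\Psi'\| - \omega\big(q\|\Psi'\| I - \Psi'\big) \;\le\; q\|\Psi'\| ,
\qquad\text{hence}\qquad \omega(\Psi') \le \|\Psi'\| \quad (q \to 1^+).

% The same argument applied to -\Psi' (real, cylindrical, same norm) gives
-\omega(\Psi') \le \|\Psi'\| ,
\qquad\text{so}\qquad |\omega(\Psi')| \le \|\Psi'\| \le \|\Psi\| .

% Inserting this for \Psi' = \Psi_R and \Psi' = \Psi_I into (56):
|\omega(\Psi)| \;\le\; |\omega(\Psi_R)| + |\omega(\Psi_I)| \;\le\; 2\,\|\Psi\| .
```

Here $\omega(\Psi')$ is real because $\Psi'$ is real and $\omega(\bar{\Psi}') = \overline{\omega(\Psi')}$, and $\|\Psi_R\|, \|\Psi_I\| \le \|\Psi\|$ holds pointwise for the supremum norm (9).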
4. The Uniqueness Theorem

As explained in the introduction, of particular importance are states invariant with respect to the automorphisms of the principal fiber bundle $P$. A state $\omega$ defined on the algebra $\mathfrak{A}$ is invariant with respect to a bundle automorphism $\tilde{\varphi} : P \to P$ if for every $a \in \mathfrak{A}$,
$$\omega(a) = \omega(\alpha_{\tilde{\varphi}}\, a). \tag{61}$$
If $\omega$ is invariant with respect to all the fiber preserving automorphisms, we call it Yang–Mills gauge invariant.

Definition 4.1. If a state defined on the quantum ∗-algebra is invariant with respect to all the bundle automorphisms of $P$ that induce, via the bundle projection $\Pi$, diffeomorphisms homotopic to the identity, we call it Yang–Mills gauge and diffeomorphism invariant or, if there is no danger of confusion, just invariant.

Given a ∗-algebra and a symmetry group, assuming the existence of a diffeomorphism invariant state is a strong condition. However, in our case one invariant state is already known; we recall it below. It will be, therefore, natural to ask if there are other states with that property.

Example. A Yang–Mills gauge invariant and diffeomorphism invariant state on $\mathfrak{A}$. Define the action of $\omega_0$ on the elements of $\mathfrak{A}$ of the form $a \cdot \hat{Y}$, where $a \in \mathfrak{A}$ and $Y \in \Gamma(T\overline{\mathcal{A}})^{(\mathbb{C})}$, as simply
$$\omega_0(a \cdot \hat{Y}) := 0 \tag{62}$$
for every $a$ and every vector field $Y$. To define the action of $\omega_0$ on an element $\hat{\Psi}$ corresponding to $\Psi \in \mathrm{Cyl}$, recall the general form (8) of a cylindrical function. Recall also that each factor $\mathcal{A}_e$ in the domain $\mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n}$ has all the left and right invariant structures of the bundle structure group $G$. One of them is the probability Haar measure $\mu_e$. We use it to set
$$\omega_0(\hat{\Psi}) := \int_{\mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n}} \psi \; d\mu_{e_1} \otimes \cdots \otimes d\mu_{e_n}. \tag{63}$$
Importantly, this integral is independent of the choice of the graph $\gamma$ compatible with a given $\Psi$. Due to the general form of $a \in \mathfrak{A}$ given by (35), the equalities (62), (63) determine a state $\omega_0$. The positivity of $\omega_0$ amounts to the positivity of the Haar measure $\mu_{e_1} \otimes \cdots \otimes \mu_{e_n}$ on $C^\infty(\mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n})$, which is obviously true.

The state $\omega_0$ is Yang–Mills gauge and diffeomorphism invariant. To see this, note that every automorphism $\tilde{\varphi}$ of the bundle $P$ maps every flux vector field into another flux vector field, therefore the condition (62) is manifestly invariant. To see the invariance of the part (63) of the definition, consider a graph $\gamma$ and function $\psi \in C^\infty(\mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n})$ compatible with $\Psi$ (see (8)), and a gauge map $\sigma_\gamma^{-1} : G^n \to \mathcal{A}_{e_1} \times \cdots \times \mathcal{A}_{e_n}$ defined by the gauge maps (6) and any choice of points $p_x \in \Pi^{-1}(x)$ for every vertex $x$ of $\gamma$. Then the definition (63) reads
$$\omega_0(\hat{\Psi}) = \int_{G^n} (\sigma_\gamma^{-1})^* \psi \; d\mu_H, \tag{64}$$
where $\mu_H$ is the probability Haar measure on $G^n$. On the other hand, the transformed function $\mathrm{Ad}^*_{\tilde{\varphi}} \Psi$ is compatible with the graph $\varphi^{-1}(\gamma)$ and the function $\psi' = \mathrm{Ad}^*_{\tilde{\varphi}} \psi$. If we use for the graph $\varphi^{-1}(\gamma)$ the gauge map $\sigma_{\varphi^{-1}(\gamma)}$ given by the points $\tilde{\varphi}^{-1}(p_x)$, then simply
$$(\sigma^{-1}_{\varphi^{-1}(\gamma)})^* \psi' = (\sigma_\gamma^{-1})^* \psi. \tag{65}$$
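The definition (63) and two of its properties (independence of the compatible graph, and insensitivity to the choice of gauge map) can be illustrated numerically. The following is a minimal sketch under simplifying assumptions: G = U(1), the Haar integral is replaced by an average over a uniform grid on the circle (which is exact for trigonometric polynomials of low degree), and the names `haar_average`, `psi`, `psi_split`, `psi_regauged` are hypothetical.

```python
import cmath
import itertools
import math

def haar_average(psi, n_edges, grid=32):
    """Average psi over a uniform grid on U(1)^n_edges; for trigonometric
    polynomials of degree below `grid` this equals the Haar integral (63)."""
    points = [cmath.exp(2j * math.pi * k / grid) for k in range(grid)]
    total = sum(psi(*gs) for gs in itertools.product(points, repeat=n_edges))
    return total / grid ** n_edges

def psi(g):
    """A cylindrical function on one edge: constant part 2 plus two characters."""
    return 2 + g ** 3 + 1 / g

def psi_split(g1, g2):
    """The same function written on the finer graph e = e2 ∘ e1."""
    return psi(g2 * g1)

# Changing the points p_x, p_y multiplies the gauge map by fixed group elements.
h, k = cmath.exp(0.3j), cmath.exp(-1.1j)

def psi_regauged(g):
    return psi(h * g * k)

omega_one_edge = haar_average(psi, 1)
omega_two_edges = haar_average(psi_split, 2)
omega_regauged = haar_average(psi_regauged, 1)
```

All three values equal 2 up to rounding: only the constant part of `psi` survives the integration, and neither the subdivision of the edge nor the left and right translations change the result, which is the mechanism behind (64) and (65) in this toy case.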
The state $\omega_0$ is well known, and is extensively used in the loop quantization [6, 22]. Given the example of an invariant state above, let us now state and prove our uniqueness result:

Theorem 4.2. There exists exactly one Yang–Mills gauge invariant and diffeomorphism invariant state on the quantum holonomy-flux ∗-algebra $\mathfrak{A}$.

Proof. The existence is known, see the example; therefore it suffices to prove the uniqueness. We will assume from now on that $\omega$ is a diffeomorphism invariant state on $\mathfrak{A}$, label the corresponding representation obtained from the GNS construction by $(\mathcal{H}_\omega, \pi_\omega)$ and use the notation introduced in Sect. 3. To simplify the reading, we will break down the proof into two parts. The first of these is rather technical. It will establish a proof of the following fundamental lemma:

Lemma 4.3. Let $\omega$ be an invariant state on $\mathfrak{A}$. Then for every flux vector field $X_{S,f}$, where $S$ is a face and $f$ a smearing vector field, in the corresponding GNS-representation,
$$[\hat{X}_{S,f}] = 0. \tag{66}$$
Once the lemma is established, the rest of the proof of the uniqueness is fairly straightforward.

Proof of Lemma 4.3. Let $S$ be a face in $\Sigma$ and $f$ be a smearing vector field. We will decompose $f$ into a certain finite sum,
$$f = \sum_I \sum_i f_{Ii}, \tag{67}$$
such that each term $f_{Ii}$ is a smearing vector field itself which satisfies
$$[\hat{X}_{S, f_{Ii}}] = 0. \tag{68}$$
Then, (66) follows automatically from the linearity
$$X_{S,f} = \sum_I \sum_i X_{S, f_{Ii}}. \tag{69}$$
To each point $x \in \Pi(\operatorname{supp} f) \subset \Sigma$ choose an open neighborhood $U_x$ in $\Sigma$ in such a way that there exists a trivialization $T_x$ of $\Pi^{-1}(U_x)$,
$$T_x : U_x \times G \to \Pi^{-1}(U_x), \tag{70}$$
and such that there is a chart $\chi_x$ containing $U_x$ in its domain with
$$\chi_x(S \cap U_x) = \{(x^1, \ldots, x^D) \mid x^D = 0,\ 0 < x^1 < 1, \ldots, 0 < x^{D-1} < 1\}. \tag{71}$$
Since the support of $f$ is compact in $P$, we can choose from that covering a finite subcovering $\{U_I\}_{I=1}^N$ of $\Pi(\operatorname{supp} f)$. We denote the corresponding trivialization by $T_I$ and the corresponding chart by $\chi_I$. Let $\phi_I : \Sigma \to \mathbb{R}$, where $I = 1, \ldots, N$, be a family of functions such that $\operatorname{supp} \phi_I \subset U_I$ for every $I$, and for every $x \in \Pi(\operatorname{supp} f)$,
$$\sum_{I=1}^N \phi_I(x) = 1. \tag{72}$$
We use that partition of unity to decompose the smearing vector field $f$,
$$f = \sum_{I=1}^N f_I, \qquad f_I := \phi_I f. \tag{73}$$
Each $f_I$ ($I = 1, \ldots, N$) is still a smearing vector field in the sense of Definition 2.5 and additionally has the appropriate support property: $\Pi(\operatorname{supp} f_I) \subset U_I$. Now we fix $I$ and decompose the smearing vector field $f_I$ further. Let $R_1$ be a right invariant vector field defined on $G$. It defines naturally a vector field on $U_I \times G$. That vector field is mapped by the trivialization $T_I$ into a vector field defined in $\Pi^{-1}(U_I)$, tangent to the fibers and invariant with respect to the group action. Let $R_i$, $i = 1, \ldots, \dim G$, be a basis in the vector space of the right invariant vector fields defined on $G$. Then, every smearing vector field $f_I$ defined on $P_S = \Pi^{-1}(S)$ is a sum of vector fields proportional to the vector fields $T_{I*} R_i$, $i = 1, \ldots, \dim G$,
$$f_I = \sum_i f_{Ii}, \qquad f_{Ii} = (\Pi^* h^i)\, T_{I*} R_i, \tag{74}$$
where each coefficient $\Pi^* h^i$ is a function $h^i : S \to \mathbb{R}$ lifted to the bundle $P_S = \Pi^{-1}(S)$. Obviously $\operatorname{supp} h^i \subset U_I$.

Now we can finish the proof of Lemma 4.3 by showing that necessarily (68) is true. To this end, fix indices $I, i$, and for every compactly supported function $h : S \cap U_I \to \mathbb{R}$ consider the following smearing vector field defined on $S$ (no summation with respect to $i$):
$$w(h) := (\Pi^* h)\, T_{I*} R_i. \tag{75}$$
Consider the following product $(\cdot \mid \cdot)$ which, given a pair of compactly supported functions $h, g : S \cap U_I \to \mathbb{R}$, assigns the following number $(h|g)$:
$$(h|g) := \langle [\hat{X}_{S, w(h)}], [\hat{X}_{S, w(g)}] \rangle.$$
The product $(\cdot \mid \cdot)$ has the following properties:
(i) it is bilinear and symmetric,
(ii) it is invariant under diffeomorphisms of $\Sigma$ which are supported in $U_I$ and preserve $U_I$ as well as $S$,
(iii) for $h = g = h^i$ of (74), it is exactly the norm squared of the Hilbert space $\mathcal{H}_\omega$ element $[\hat X_{S,f_{Ii}}]$ under consideration,
\[ \big\| [\hat X_{S,f_{Ii}}] \big\|^2 = \big\langle [\hat X_{S,f_{Ii}}],\, [\hat X_{S,f_{Ii}}] \big\rangle = (h^i|h^i). \tag{76} \]
Property (ii) above follows from the fact that, via the trivialization $T_I$, every diffeomorphism $\varphi : \Sigma \to \Sigma$ preserving $U_I$ defines an automorphism $\tilde\varphi$ of the bundle $P$ which preserves each of the vector fields $T_{I*}R_i$. Therefore, if $\varphi$ additionally preserves $S$, then the action of $\alpha_{\tilde\varphi}$ (47) on $\hat X_{S,w(h)}$ amounts to
\[ \alpha_{\tilde\varphi}\, \hat X_{S,w(h)} = \hat X_{S,w(h\circ\varphi^{-1})}. \tag{77} \]
We will now show that properties (i) and (ii) already imply that $(h|h)$ is zero for every function $h : S \to \mathbb{R}$ with compact support in $U_I$. Then (iii) shows that we have reached our goal. Let us use the chart $\chi_I$ to push forward the action arena into $\mathbb{R}^D$:
\[ U'_I := \chi_I(U_I), \qquad S' = \chi_I(S \cap U_I), \tag{78} \]
\[ h' := h \circ \chi_I^{-1} : S' \to \mathbb{R}, \tag{79} \]
where $h'$ has compact support and $S'$ is defined by (71). We want to extend $h'$ to a function defined in $U'_I$ and of compact support. Therefore, we choose an arbitrary semianalytic function $\kappa' : \mathbb{R} \to \mathbb{R}$ such that $\kappa'(0) = 1$ and the function
\[ (x^1,\dots,x^D) \mapsto h'(x^1,\dots,x^{D-1})\,\kappa'(x^D) \tag{80} \]
has compact support contained in $U'_I$. Using these ingredients, we can define a map$^{9}$ $\varphi'_\lambda : \mathbb{R}^D \to \mathbb{R}^D$, where $\lambda$ is a real parameter, by
\[ \varphi'_\lambda(x^1,\dots,x^D) := \big(x^1 + \lambda\, h'(x^1,\dots,x^{D-1})\,\kappa'(x^D),\ x^2,\ \dots,\ x^D\big). \tag{81} \]
Lemma 4.4. There is $\lambda_0 > 0$ such that for every $0 < \lambda < \lambda_0$, $\varphi'_\lambda$ is a semianalytic diffeomorphism of $\mathbb{R}^D$ equal to the identity outside of $U'_I$ and preserving $U'_I$.

Proof. The Jacobian of $\varphi'_\lambda$ is a triangular matrix and its determinant turns out to be simply $1 + \lambda\kappa'\partial_1 h'$. Since $\kappa'\partial_1 h'$ has compact support and is semianalytic, it is in particular bounded, and thus there is a $\lambda_0 > 0$ such that $1 + \lambda\kappa'\partial_1 h' > 0$ for every $0 < \lambda < \lambda_0$. Hence $\varphi'_\lambda$ is locally a diffeomorphism, provided $0 < \lambda < \lambda_0$. It is also a global diffeomorphism, because outside of the support of $\kappa' h'$ it acts as the identity and thus $\lim_{|x|\to\infty} |\varphi'_\lambda(x)| = \infty$. Then a well known theorem by Hadamard proves the assertion. Because all the functions used in the construction of $\varphi'_\lambda$ are assumed to be semianalytic, and all the operations used preserve semianalyticity (see the Appendix), $\varphi'_\lambda$ is also semianalytic. Finally, note that every bijection which is the identity on a certain subset necessarily preserves the complement.

9 We will use here a modification of the trick mentioned in the Appendix of [27].
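The bound on $\lambda$ in Lemma 4.4 can be checked numerically in a toy case. The following is only a sketch under stated assumptions: $D = 2$, and both $h'$ and $\kappa'$ are taken to be the hypothetical piecewise-polynomial bump $(1-t^2)^3$ on $|t|<1$ (zero elsewhere), which is $C^2$ and piecewise analytic; none of these choices come from the paper.

```python
import math

def bump(t):
    """Hypothetical C^2 bump: (1 - t^2)^3 for |t| < 1, zero elsewhere."""
    return (1.0 - t * t) ** 3 if abs(t) < 1.0 else 0.0

def dbump(t):
    """Derivative of bump."""
    return -6.0 * t * (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0

def phi(lam, x1, x2):
    """Toy version of (81) with h' = kappa' = bump; note bump(0) = 1."""
    return (x1 + lam * bump(x1) * bump(x2), x2)

def jac_det(lam, x1, x2):
    """Determinant of the triangular Jacobian: 1 + lam * kappa' * d_1 h'."""
    return 1.0 + lam * bump(x2) * dbump(x1)

grid = [i / 100.0 - 1.5 for i in range(301)]  # sample points in [-1.5, 1.5]

# sup |kappa' d_1 h'| factorizes because the two factors depend on different variables
sup_grad = max(abs(bump(t)) for t in grid) * max(abs(dbump(t)) for t in grid)
lam0 = 1.0 / sup_grad          # the determinant stays positive for 0 < lam < lam0
lam = 0.9 * lam0

min_det = min(jac_det(lam, a, b) for a in grid for b in grid)

# injectivity along each line x2 = const: the first component is strictly increasing
increasing = all(
    phi(lam, grid[k], b)[0] < phi(lam, grid[k + 1], b)[0]
    for b in grid for k in range(len(grid) - 1)
)
```

In this toy case the supremum of $|\partial_1 h'|$ is attained at $t = 1/\sqrt5$, so the sampled `lam0` should be close to $25\sqrt5/96$; outside the support of the bump the map is the identity, as the lemma requires.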
Now let us choose a semianalytic function $H'$ with support in $U'_I$ such that
\[ H'(x^1,\dots,x^D) = x^1 \quad\text{whenever}\quad (x^1,\dots,x^D) \in \operatorname{supp}\kappa' h'. \tag{82} \]
Such a function can easily be constructed by using an appropriate partition of unity. Let us see how each of the diffeomorphisms $\varphi'_\lambda$ acts on $H'$. Because of the properties of $H'$ in relation to the support of $\kappa' h'$ we find
\[ \varphi'^{*}_{\lambda} H'(x^1,\dots,x^D) = \begin{cases} x^1 + \lambda\, h'(x^1,\dots,x^{D-1})\,\kappa'(x^D) & \text{on } \operatorname{supp}\kappa' h', \\ H'(x^1,\dots,x^D) & \text{otherwise} \end{cases} \;=\; H'(x^1,\dots,x^D) + \lambda\, h'(x^1,\dots,x^{D-1})\,\kappa'(x^D). \]
Now let us pull this relation back to $U_I$ and the manifold $\Sigma$ again by using the chart $\chi_I$. Denote the pullbacks of the functions $H'$, $h'$, $\kappa'$ and $\varphi'_\lambda$, respectively, by $H$, $h$, $\kappa$ and $\varphi_\lambda$. The functions $H$ and $h\kappa$ have support contained in $U_I$, therefore we can extend them as identically zero to the rest of $\Sigma$. Similarly, $\varphi_\lambda$, for every $0 < \lambda < \lambda_0$, is a diffeomorphism defined locally in $U_I$ that can be extended as the identity to the rest of $\Sigma$, and the result is a diffeomorphism of $\Sigma$. The above relation then reads
\[ \varphi_\lambda^{*} H = H + \lambda\kappa h. \tag{83} \]
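Now we compute
\[ (H|H) \overset{\text{(ii)}}{=} (\varphi_\lambda^{*}H \,|\, \varphi_\lambda^{*}H) = (H|H) + \lambda(h|H) + \lambda(H|h) + \lambda^2 (h|h), \tag{84} \]
where we have used the invariance of the product $(\,\cdot\,|\,\cdot\,)$ under diffeomorphisms homotopic to the identity (the $\varphi_\lambda$ obviously are) and the fact that $\kappa|_S = 1$. The equality (84) holds for every value of $\lambda$ provided $0 < \lambda < \lambda_0$. Subtracting $(H|H)$ and dividing by $\lambda$ gives $(h|H) + (H|h) + \lambda(h|h) = 0$ on that whole interval, which is possible only if the coefficient of $\lambda$ vanishes. We conclude that
\[ (h|h) = 0. \tag{85} \]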
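The step from (84) to (85) uses only that a polynomial $a\lambda + b\lambda^2$ vanishing at two distinct nonzero values of $\lambda$ has $a = b = 0$. A minimal sketch of that linear-algebra fact follows; the helper name `solve_coefficients` is hypothetical, and here $a$ stands for $(h|H) + (H|h)$ and $b$ for $(h|h)$.

```python
from fractions import Fraction

def solve_coefficients(lam1, lam2, r1, r2):
    """Solve a*lam + b*lam**2 = r for (a, b), given samples at lam1 and lam2.

    Uses Cramer's rule; the determinant lam1*lam2*(lam2 - lam1) is nonzero
    whenever lam1 and lam2 are distinct and nonzero.
    """
    det = lam1 * lam2 ** 2 - lam2 * lam1 ** 2
    if det == 0:
        raise ValueError("sample points must be distinct and nonzero")
    a = (r1 * lam2 ** 2 - r2 * lam1 ** 2) / det
    b = (lam1 * r2 - lam2 * r1) / det
    return a, b

# Two admissible values of lambda at which the combination in (84) vanishes
lam1, lam2 = Fraction(1, 4), Fraction(1, 2)
a, b = solve_coefficients(lam1, lam2, Fraction(0), Fraction(0))

# Sanity check of the solver on a polynomial with known coefficients 3 and 5
a_chk, b_chk = solve_coefficients(
    lam1, lam2, 3 * lam1 + 5 * lam1 ** 2, 3 * lam2 + 5 * lam2 ** 2
)
```

Exact rational arithmetic is used so that the vanishing of both coefficients is not blurred by rounding.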
Then, as announced above for $h = h^i$, we get the desired result, and in turn conclude that $[\hat X_{S,f}] = 0$ as a vector in the GNS-Hilbert space $\mathcal{H}_\omega$.

Now that we have established the fundamental Lemma 4.3 asserting that $[\hat X_{S,f}] = 0$ for any face $S$ and any smearing vector field $f$ in any GNS-representation coming from the invariant state $\omega$, we can show that the structure of the GNS-Hilbert space $\mathcal{H}_\omega$ is actually very simple. Let us start by reminding the reader of the form (35) of elements of $\mathfrak{A}$ whose linear span is $\mathfrak{A}$. It follows immediately that a dense set of vectors in $\mathcal{H}_\omega$ is given by the linear span of all the vectors of the form
\[ [\hat\Psi],\quad \pi_\omega(\hat\Psi_1)[\hat X_{S_{11},f_{11}}],\quad \pi_\omega(\hat\Psi_2 \hat X_{S_{21},f_{21}})[\hat X_{S_{22},f_{22}}],\quad \dots,\quad \pi_\omega(\hat\Psi_k \hat X_{S_{k1},f_{k1}} \cdots)[\hat X_{S_{kk},f_{kk}}],\quad \dots \tag{86} \]
But because of Lemma 4.3, of these vectors, only the ones of the form $[\hat\Psi]$ are non-zero. Therefore, all the information on the state $\omega$ is contained in the corresponding state defined on the algebra Cyl,
\[ \mathrm{Cyl} \ni \Psi \mapsto \omega(\hat\Psi) = \big\langle [\hat I],\, [\hat\Psi] \big\rangle_{\mathcal{H}_\omega}, \tag{87} \]
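where $\hat I$ is the unit element of $\mathfrak{A}$. Now we make use of Lemma 3.2 from Sect. 3: the state (87) is actually continuous with respect to the $C^*$-norm on Cyl. Thus, using the representation theorem by Riesz and Markov, there exists a measure $\mu$ on $\overline{\mathcal{A}}$ such that
\[ \omega(\hat\Psi) = \int_{\overline{\mathcal{A}}} \Psi \, d\mu. \]
Notice now that Lemma 4.3 implies what follows:
\[ \begin{aligned} \int_{\overline{\mathcal{A}}} \overline{\Psi'}\, X_{S,f}(\Psi)\, d\mu &= \big\langle [\hat\Psi'],\, [\widehat{X_{S,f}(\Psi)}] \big\rangle_{\mathcal{H}_\omega} \\ &= i \big\langle [\hat\Psi'],\, [\hat X_{S,f}\hat\Psi] - [\hat\Psi \hat X_{S,f}] \big\rangle_{\mathcal{H}_\omega} \\ &= i \big\langle [\hat\Psi'],\, [\hat X_{S,f}\hat\Psi] \big\rangle = i \big\langle [\hat X_{S,f}\hat\Psi'],\, [\hat\Psi] \big\rangle = - \int_{\overline{\mathcal{A}}} \overline{X_{S,f}(\Psi')}\, \Psi \, d\mu. \end{aligned} \]
Setting $\Psi' = I$ (i.e. the constant function on $\overline{\mathcal{A}}$ of the value 1) we see that for any face $S$, any smearing vector field $f$ and any function $\Psi \in \mathrm{Cyl}$,
\[ \int_{\overline{\mathcal{A}}} X_{S,f}(\Psi)\, d\mu = 0. \]
As was shown in [15], the only measure satisfying the above condition coincides with the measure defined on $\overline{\mathcal{A}}$ by the state $\omega_0$ described in the example. In conclusion,
\[ \omega = \omega_0. \tag{88} \]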
5. Closing Remarks

As we have emphasized, the uniqueness result proved in the last section is reassuring for the LQG program, and it shows that diffeomorphism invariance can sometimes be a powerful remedy against complications that one expects based on what one knows about background dependent field theories. Our result is based on certain, albeit reasonable, assumptions; therefore an immediate question is whether it can be generalized. Certainly the result holds for any enlargement of the symmetry group that contains the diffeomorphisms considered above. Whether it also holds for smaller extensions, or even for only the analytic diffeomorphisms, is an open question. We feel, however, that as soon as the subgroup of diffeomorphisms is big enough to contain 'local ones', application of the techniques used above should be straightforward. Also, even if uniqueness were to break down if only invariance under analytic automorphisms is required, it is not clear how relevant the result would be physically, as it would heavily involve details of the structure of analytic diffeomorphisms on a given manifold $\Sigma$.
Another way to generalize the result would be to consider, instead of the flux operators, the unitary groups that they generate, and ask for diffeomorphism invariant representations in which these groups are strongly continuous. In [16] this setting was considered; however, the results were not satisfactory due to the more complicated domain questions arising. A more satisfying result was recently obtained by Fleischhack in [29]. It does not make the assumption of a common dense domain for all flux operators and cylindrical functions that is implicit in our treatment. However, it needs an additional assumption on the action of the bundle automorphisms in the representation.

Finally, as for background dependent theories, at least the definition of the kinematical algebra $\mathfrak{A}$ applies in principle. Whether one expects the type of variables used to be well defined in the quantum theory is certainly a difficult question. Still it seems worthwhile to look for non-diffeomorphism invariant representations of $\mathfrak{A}$ and see if they can be put to use in physics.

Another interesting starting point for future work is the observation that the uniqueness theorem fails if a rather innocent looking assumption (that of compact support on $S$ for the smearing functions $f$ used in the flux variables $E_{S,f}$) is removed. Consider the example $\Sigma = \mathbb{R}^2$, $G = U(1)$, and drop the assumption of compact support for the smearing functions. The hyper-surfaces $S$ are one dimensional in this case. Let us also choose an orientation for $\Sigma$. From that orientation, together with the orientation on the normal bundle of a given $S$, we can equip $S$ with an intrinsic orientation, and thus integrate one-forms on $S$. Then we can define
\[ \omega(\hat\Psi) := \omega_0(\hat\Psi), \qquad \omega(\hat\Psi \hat X_{S_1,f_1} \cdots \hat X_{S_n,f_n}) := \int_{S_1} df_1 \cdots \int_{S_n} df_n \;\, \omega_0(\hat\Psi). \]
It is easy to check that $\omega$ defines a state on $\mathfrak{A}$ and is manifestly invariant under the action of orientation-preserving diffeomorphisms. Obviously it is different from $\omega_0$ and thus would constitute a counterexample to our uniqueness result, were it not for the fact that for smearing functions $f$ with compact support in $S$, $\int_S df = 0$. Hence under the assumptions made in this paper, $\omega = \omega_0$, and there is no contradiction. Since obvious generalizations of this state to a higher dimensional situation seem to fail, the existence of $\omega$ for smearing functions without compact support might just be a peculiarity of $D = 2$. However, just as for $\omega$ the endpoints of the lines $S$ can be used to "anchor" diffeomorphism invariant information in the state, it is not inconceivable that similarly points of the boundary of hyper-surfaces $S$ in which the boundary has a lower differentiability than $C^m$ might be used to that end in higher dimensions.$^{10}$ In any case, the restriction to compact support does not seem to be unphysical.$^{11}$ It can be viewed as analogous to the smoothness and decay properties assumed for smearing functions in standard quantum field theory. A more detailed investigation into these issues will be carried out elsewhere.

10 In a similar way, one might intuitively understand the need to use semianalytic smearing functions, not just, say, continuous ones: Singular points (from the semianalytic perspective) of the smearing functions could not be removed by semianalytic diffeomorphisms and evaluation of the function at such points would thus constitute diffeomorphism invariant data that could give rise to other diffeomorphism-invariant states.
11 And, as for applications to Loop Quantum Gravity, all results that use smearing functions with non-compact support (such as the definition of the volume operator) can be recovered by taking appropriate limits once the state is fixed to be $\omega_0$.

An outcome of our work that is useful for LQG is the introduction of the semianalytic category. The corresponding diffeomorphisms form a subgroup of the $C^m$ diffeomorphism group, the group of symmetries induced by the action of the diffeomorphisms of $\Sigma$ in the classical phase-space and preserving our classical Ashtekar–Corichi–Zapata algebra. The relevance of this symmetry group lies in its local character (as opposed to the analytic diffeomorphisms). For example, the symmetry group provides a new, elegant version of a map $\mathrm{Cyl} \to \mathrm{Cyl}^*$ (the algebraic dual) which averages with respect to the (allowed) diffeomorphisms of $\Sigma$. This application has been recently implemented in [6]$^{12}$ (see also [5, 26, 27]). There are several similar, non-equivalent extensions of the analytic diffeomorphisms recently introduced in the literature. One of them is due to Fleischhack [29], who also applies the theory of stratifications. Another one was considered by Rovelli and Fairbairn [11]. They even advocate the relevance of non-differentiable homeomorphisms in classical Einstein gravity. The Rovelli–Fairbairn generalized diffeomorphisms, however, are defined to be smooth everywhere except at a finite set of points, therefore they would not be useful in our case.

Acknowledgements. We thank Abhay Ashtekar, Klaus Fredenhagen, Detlev Buchholz, Stefan Hollands and Bob Wald for urging us to investigate the uniqueness issue. Stanisław Woronowicz gave us a useful technical suggestion which we appreciate very much. We thank Christian Fleischhack for drawing our attention to the theory of semianalytic sets. JL and AO were partially supported by Polish KBN grants: 2 P03B 06823 and 2 P03B 12724. TT was supported in part by a grant from NSERC of Canada to the Perimeter Institute for Theoretical Physics. Parts of this work were supported by NSF grant PHY-00-90091 and the Eberly research funds of Penn State.
A. Semianalytic Category, Details

A.1. Semianalytic functions in $\mathbb{R}^n$. In this section we introduce semianalytic functions, semianalytic manifolds and semianalytic geometry. We will take advantage of the results of the theory of semianalytic sets [2, 3].$^{13}$ Throughout this section, by 'neighborhood' we always mean 'open neighborhood', even if 'open' is dropped.

Briefly speaking, a real valued function $f$ defined on an open subset $U \subset \mathbb{R}^n$ will be called semianalytic if it is analytic on an open and dense subset of $U$, if the non-analyticity surfaces also have an appropriate analytic structure, and if the restrictions of $f$ to the non-analyticity surfaces are again analytic in an appropriate sense. To introduce our definition, we need the notion of a semianalytic partition of $U$. Consider in $U$ a finite sequence of equalities and/or inequalities, namely
\[ h_1(x)\ \sigma_1\ 0, \quad \dots, \quad h_N(x)\ \sigma_N\ 0, \tag{89} \]
where each $\sigma_i$ is one of the three relations $>$, $<$, $=$, and $\{h_1,\dots,h_N\}$ is a set of analytic functions defined on a domain containing $U$. More formally, there is defined a map
\[ \sigma : h = \{h_1,\dots,h_N\} \to \{>, =, <\} \tag{90} \]
and in (89) we denoted
\[ \sigma_I := \sigma(h_I), \tag{91} \]
12 Except that our definition of the extension of the analytic diffeomorphism group has changed since then.
13 We thank Christian Fleischhack [28] for drawing our attention to the theory of semianalytic sets.
where the integer $I$ runs from 1 to $N$. The set of conditions (89) determines the following subset of $U$:
\[ U_{h,\sigma} = \{x \in U : \text{(89)}\}. \tag{92} \]
Definition A.1. Given a finite set $h$ of real valued analytic functions defined on a neighborhood of an open subset $U$ of $\mathbb{R}^n$, the corresponding semianalytic partition of $U$ is the set of all the subsets $U_{h,\sigma} \subset U$ defined by (89, 92) such that $\sigma$ is an arbitrary map (90).

Given $U$ and $h$ as above, the partition will be denoted by $P(U, h)$. Obviously, every semianalytic partition covers $U$,
\[ U = \bigcup_{\sigma} U_{h,\sigma}, \tag{93} \]
where $\sigma$ runs through all the maps (90). Also,
\[ \sigma \neq \sigma' \;\Rightarrow\; U_{h,\sigma} \cap U_{h,\sigma'} = \emptyset, \tag{94} \]
and a set $U_{h,\sigma}$ may itself be empty. Another obvious property is that given a semianalytic covering $P(U, h)$ and an open subset $V \subset U$, the family $h$ of functions defines a semianalytic covering $P(V, h)$. Now we are in a position to define a semianalytic function:

Definition A.2. A function $f : U \to \mathbb{R}^m$, where $U$ is an open subset of $\mathbb{R}^n$, is called semianalytic if every $x \in U$ has an open neighborhood $\tilde U$ equipped with a semianalytic partition $P(\tilde U, h)$, such that for every $\tilde U_{h,\sigma} \in P(\tilde U, h)$ there is an analytic function $f_\sigma : \tilde U \to \mathbb{R}^m$ such that
\[ f|_{\tilde U_{h,\sigma}} = f_\sigma|_{\tilde U_{h,\sigma}}, \tag{95} \]
that is, such that $f_\sigma$ coincides with $f$ on $\tilde U_{h,\sigma}$.

Given a semianalytic function $f$ and a point $x$ in its domain, a semianalytic partition $P(\tilde U, h)$ which has the properties described in Definition A.2 will be called compatible with $f$ at the point $x$. There are infinitely many semianalytic partitions compatible with a given $f$ at $x$. Clearly, if $f : U \to \mathbb{R}^m$ is semianalytic, and $V \subset U$ is open, then the restriction $f|_V$ is semianalytic. Given a semianalytic covering $P(\tilde U, h)$ compatible with $f$, and an open subset $\tilde V \subset \tilde U \cap V$, the semianalytic covering $P(\tilde V, h)$ is compatible with $f|_V$.

Example. Consider a function $f : \mathbb{R} \to \mathbb{R}$ analytic on every closed interval $[n, n+1]$. Then $f$ is semianalytic. A semianalytic partition $P(\tilde U, h)$ compatible with $f$ at $x_0$ is defined for the open interval
\[ \tilde U := \big]\,[x_0] - 1,\ [x_0] + 1\,\big[ \tag{96} \]
(we denote by $]a, b[$ the open interval bounded by $a, b \in \mathbb{R}$ and by $[a]$ the integer part of $a$) by the set $\{h_{-1}, h_0, h_1\}$ of functions
\[ h_{-1}(x) = x - [x_0] + 1, \qquad h_0(x) = x - [x_0], \qquad h_1(x) = x - [x_0] - 1. \tag{97} \]
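The Example can be made concrete with a small sketch. As an assumption for illustration only, take $f(x) = |\sin \pi x|$, which coincides with the analytic function $(-1)^n \sin \pi x$ on each interval $[n, n+1]$; the helper names below are hypothetical.

```python
import math

def sign(v):
    """Relation between v and 0, as one of the three symbols used in (89)."""
    return '>' if v > 0 else ('<' if v < 0 else '=')

def partition_functions(x0):
    """The functions h_{-1}, h_0, h_1 of (97) for the base point x0."""
    k = math.floor(x0)  # integer part [x0]
    return [lambda x: x - k + 1, lambda x: x - k, lambda x: x - k - 1]

def cell(hs, x):
    """The map sigma labelling the element of the partition that contains x."""
    return tuple(sign(h(x)) for h in hs)

def f(x):
    """Illustrative piecewise-analytic function (an assumption, not from the text)."""
    return abs(math.sin(math.pi * x))

def analytic_piece(x0, sigma):
    """An analytic function on the whole line that agrees with f on the cell sigma."""
    k = math.floor(x0)
    n = k - 1 if sigma[1] == '<' else k   # which interval [n, n+1] the cell lies in
    s = -1.0 if n % 2 else 1.0            # |sin(pi x)| = (-1)^n sin(pi x) there
    return lambda x: s * math.sin(math.pi * x)

x0 = 2.5
hs = partition_functions(x0)
samples = [1.0 + 2.0 * j / 40 for j in range(1, 40)]  # points of the open interval ]1, 3[
max_error = max(abs(f(x) - analytic_piece(x0, cell(hs, x))(x)) for x in samples)
```

Inside $\tilde U = \,]1, 3[$ only $h_0$ changes sign, so just three of the 27 possible maps $\sigma$ give non-empty elements of the partition, in line with the remark that a set $U_{h,\sigma}$ may be empty.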
Proposition A.3. Let $f_1 : U \to \mathbb{R}$ and $f_2 : U \to \mathbb{R}^m$ be two semianalytic functions, where $U$ is an open subset of $\mathbb{R}^n$. Then the functions
\[ U \ni x \mapsto f_1(x) f_2(x) \in \mathbb{R}^m, \qquad U \ni x \mapsto (f_1(x), f_2(x)) \in \mathbb{R}^{m+1}, \tag{98} \]
are also semianalytic.

Proof. Let $x \in U$. Let $P(\tilde U^{(1)}, h^{(1)})$ be a semianalytic partition compatible with $f_1$ at $x$, and $P(\tilde U^{(2)}, h^{(2)})$ be a semianalytic partition compatible with $f_2$ at $x$. The proof becomes obvious if we construct a single semianalytic partition compatible at $x$ with both functions. The natural choice is just the semianalytic partition of the intersection
\[ \tilde U := \tilde U^{(1)} \cap \tilde U^{(2)} \tag{99} \]
defined by the set of functions
\[ h := h^{(1)} \cup h^{(2)}. \tag{100} \]
Indeed, it is enough to notice that for every $\tilde U_{h,\sigma} \in P(\tilde U, h)$ there are some $\tilde U^{(1)}_{h^{(1)},\sigma^{(1)}} \in P(\tilde U^{(1)}, h^{(1)})$ and $\tilde U^{(2)}_{h^{(2)},\sigma^{(2)}} \in P(\tilde U^{(2)}, h^{(2)})$ such that
\[ \tilde U_{h,\sigma} \subset \tilde U^{(1)}_{h^{(1)},\sigma^{(1)}} \quad\text{and}\quad \tilde U_{h,\sigma} \subset \tilde U^{(2)}_{h^{(2)},\sigma^{(2)}}. \tag{101} \]
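The inclusion (101) amounts to restricting the map $\sigma$ from $h^{(1)} \cup h^{(2)}$ to each of the two subsets. The following sketch illustrates this on a hypothetical example in $\mathbb{R}^2$ with $h^{(1)} = \{x\}$ and $h^{(2)} = \{x - 1,\ y\}$; these functions are assumptions chosen only for illustration.

```python
def sign(v):
    """Relation between v and 0, as one of the three symbols used in (89)."""
    return '>' if v > 0 else ('<' if v < 0 else '=')

def sign_pattern(funcs, point):
    """The map sigma (as a tuple) labelling the partition element containing point."""
    return tuple(sign(h(point)) for h in funcs)

# Hypothetical sets of analytic functions on R^2
h1 = [lambda p: p[0]]
h2 = [lambda p: p[0] - 1.0, lambda p: p[1]]
h = h1 + h2  # the union (100)

def restrict(sigma):
    """Split a map on h into its restrictions to h1 and to h2."""
    return sigma[:len(h1)], sigma[len(h1):]

points = [(x / 2.0, y / 2.0) for x in range(-4, 5) for y in range(-4, 5)]

# Group the sample points by the element of P(U, h) they belong to
cells = {}
for p in points:
    cells.setdefault(sign_pattern(h, p), []).append(p)

# Each element of the common partition lies in exactly one element of each original one
refined = all(
    len({sign_pattern(h1, q) for q in members}) == 1
    and len({sign_pattern(h2, q) for q in members}) == 1
    for members in cells.values()
)
```

Representing $\sigma$ as a tuple ordered like the list of functions makes the restriction a plain slice, which mirrors how $\sigma^{(1)}$ and $\sigma^{(2)}$ are read off from $\sigma$.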
It is obvious that if $f : U \to \mathbb{R}$ is a semianalytic function and it does not vanish anywhere on an open set $U$, then $1/f$ is also semianalytic. This fact will be important in the construction of semianalytic partitions of unity. They are useful owing to Proposition A.3.

We turn now to the issue of the morphisms of the semianalytic functions. It is obvious that every analytic map $\phi : U \to U'$ between two open subsets $U \subset \mathbb{R}^n$ and $U' \subset \mathbb{R}^{n'}$ pulls back all the semianalytic functions defined on $U'$ into semianalytic functions defined on $U$. The following proposition shows that the same is true for a semianalytic map.
Proposition A.4. Let $U \subset \mathbb{R}^n$ and $U' \subset \mathbb{R}^{n'}$ be open subsets. Suppose the functions $f : U' \to \mathbb{R}^m$ and $\phi : U \to U'$ are semianalytic. Then the composition $f \circ \phi : U \to \mathbb{R}^m$ is semianalytic.

Proof. The idea of the proof is simple: we construct a suitable partition of $U$ using the inverse image of a given partition of $U'$ compatible with $f$. The inverse image is not, in general, semianalytic, but we show it can be sub-divided into a semianalytic partition.

Lemma A.5. Let $P(\tilde U', h')$ be a semianalytic partition. Let $\phi : U \to \tilde U'$ be a semianalytic function, where the subset $U \subset \mathbb{R}^n$ is open. For every $x_0 \in U$ there exists an open neighborhood $\tilde U$ and a semianalytic partition $P(\tilde U, \tilde h)$ such that for every element of $P(\tilde U, \tilde h)$, say $\tilde U_{\tilde h,\tilde\sigma}$, there is an element of $P(\tilde U', h')$, say $\tilde U'_{h',\sigma'}$, such that
\[ \tilde U_{\tilde h,\tilde\sigma} \subset \phi^{-1}\big(\tilde U'_{h',\sigma'}\big). \tag{102} \]
Proof. Let $x_0 \in U$. We will construct a partition $P(\tilde U, \tilde h)$ which satisfies the conclusion. If $\phi$ is analytic, then the set of the pullbacks of all the functions $h'_I \in h'$ defines a suitable partition of the whole $U$. In general, $\phi$ is not analytic. However, it gives rise to a family of analytic functions $\phi_\sigma$ defined in some neighborhood $\tilde U$ of $x_0$ via Definition A.2 (with $f$ being replaced by $\phi$). We choose $\tilde U$ small enough that all the images $\phi_\sigma(\tilde U)$ are contained in the given $\tilde U'$. Hence, consider a semianalytic partition $P(\tilde U, h)$ compatible with $\phi$ at $x_0$, and the corresponding family of analytic functions $\phi_\sigma : \tilde U \to \tilde U'$. We define $\tilde h$ to be the set of functions formed by (i) all the functions $h'_I \circ \phi_\sigma$ defined by all the functions $\phi_\sigma$ and all $h'_I \in h'$, and (ii) all the functions $h_I \in h$. Let us demonstrate that the corresponding semianalytic partition $P(\tilde U, \tilde h)$ satisfies the conclusion. Let $\tilde\sigma : \tilde h \to \{>, =, <\}$ be an arbitrary map. Denote
\[ \sigma := \tilde\sigma|_{h}, \tag{103} \]
\[ \sigma' := \tilde\sigma|_{\phi_\sigma^{*}(h')}, \tag{104} \]
where $\sigma$ in the second line is the one introduced in the first line, and $\phi_\sigma^{*}(h')$ is the set of pullbacks of the elements of $h'$ by $\phi_\sigma$. Now, it follows directly from (103) (see (89) with $h_I$ and $\sigma_I$ being themselves as well as being replaced by $\tilde h_I$ and $\tilde\sigma_I$) that
\[ \tilde U_{\tilde h,\tilde\sigma} \subset \tilde U_{h,\sigma}. \tag{105} \]
On the other hand, the second line (104) means that
\[ \phi_\sigma\big(\tilde U_{\tilde h,\tilde\sigma}\big) \subset \tilde U'_{h',\sigma'}. \tag{106} \]
The combination of the last two facts with
\[ \phi_\sigma|_{\tilde U_{h,\sigma}} = \phi|_{\tilde U_{h,\sigma}} \tag{107} \]
concludes the proof of the lemma.
We go back to the proof of the proposition. Given $x_0 \in U$, consider the point $\phi(x_0)$ and a partition $P(\tilde U', h')$ compatible with the function $f$ at $\phi(x_0)$. Let $P(\tilde U, \tilde h)$ be a partition provided by the lemma. For every $\tilde U_{\tilde h,\tilde\sigma} \in P(\tilde U, \tilde h)$ use the pair $\sigma, \sigma'$ defined by (103, 104). The function $f_{\sigma'} \circ \phi_\sigma$ is the wanted analytic extension of $f \circ \phi|_{\tilde U_{\tilde h,\tilde\sigma}}$.

In general, the inverse of an invertible semianalytic function is not necessarily semianalytic. However, a carefully formulated set of assumptions ensures the semianalyticity of the inverse.

Proposition A.6. Let $\phi : U \to U'$ be a semianalytic and bijective function, where $U, U' \subset \mathbb{R}^n$ are open. Suppose that for every $x_0 \in U$ there exists a semianalytic partition $P(\tilde U, h)$ compatible with $\phi$ at $x_0$, and such that for every $\tilde U_{h,\sigma} \in P(\tilde U, h)$ the restriction $\phi|_{\tilde U_{h,\sigma}}$ is extendable to an analytic, injective function $\phi_\sigma : \tilde U \to U'$, such that: (i) $\phi_\sigma(\tilde U)$ is an open subset of $\mathbb{R}^n$, and (ii) the inverse $\phi_\sigma^{-1} : \phi_\sigma(\tilde U) \to \tilde U$ is analytic. Then $\phi^{-1}$ is semianalytic.
Proof. Given a point $x'_0 \in U'$, let $P(\tilde U, h)$ be a partition compatible with $\phi$ at $x_0 = \phi^{-1}(x'_0) \in U$. Suppose $P(\tilde U, h)$ and $\phi$ satisfy the assumptions. We have to construct a semianalytic partition of a neighborhood $\tilde U'$ of $x'_0$ compatible with $\phi^{-1}$. We choose $\tilde U'$ such that all the inverse functions $\phi_\sigma^{-1}$ are well defined, namely
\[ \tilde U' = \bigcap_{\sigma} \phi_\sigma(\tilde U). \tag{108} \]
Mapping with $\phi$ the partition $P(\tilde U, h)$ we get a partition of $\tilde U'$ which consists of the sets
\[ \phi\big(\tilde U_{h,\sigma}\big) \cap \tilde U', \tag{109} \]
given by all the elements $\tilde U_{h,\sigma} \in P(\tilde U, h)$. For every set $\phi(\tilde U_{h,\sigma}) \cap \tilde U'$ we have
\[ \phi^{-1}|_{\phi(\tilde U_{h,\sigma}) \cap \tilde U'} = \phi_\sigma^{-1}|_{\phi(\tilde U_{h,\sigma}) \cap \tilde U'}, \tag{110} \]
where $\phi_\sigma^{-1}$ is the analytic function provided by the assumptions. That would be sufficient for the semianalyticity of $\phi^{-1}$ if the constructed partition were semianalytic. We do not know if this is the case, though. However, we will subdivide the partition in such a way that the result is a semianalytic partition without any doubt. Establishing that refined partition will be enough to complete the proof by referring to (110). The needed semianalytic partition is defined in the following way. First, we fix a subset $\phi(\tilde U_{h,\sigma}) \cap \tilde U'$ and use the corresponding analytic function $\phi_\sigma^{-1}$ to pull back all the functions $h_I \in h$ from $\tilde U$ onto $\tilde U'$. Denote the resulting set of analytic, real valued functions defined on $\tilde U'$ by $\phi_\sigma^{-1*} h$, and consider the corresponding semianalytic partition $P(\tilde U', \phi_\sigma^{-1*} h)$. It is easy to see that
\[ \phi\big(\tilde U_{h,\sigma}\big) \cap \tilde U' \in P(\tilde U', \phi_\sigma^{-1*} h). \tag{111} \]
Next, enlarge the set $\phi_\sigma^{-1*} h$ corresponding to a given $\sigma$ by taking the union with respect to all the $\sigma$'s (90),
\[ h' = \bigcup_{\sigma} \phi_\sigma^{-1*} h. \tag{112} \]
Consider the semianalytic partition $P(\tilde U', h')$ defined by $h'$. This partition just divides every $\phi(\tilde U_{h,\sigma}) \cap \tilde U'$ into smaller subsets of $\tilde U'$, that is, it consists of subsets of the sets $\phi(\tilde U_{h,\sigma}) \cap \tilde U'$. This concludes the proof.

Corollary A.7. Suppose $\phi : U \to U'$ is a diffeomorphism of the differentiability class $C^m$, where $U, U' \subset \mathbb{R}^n$ are open and $m > 0$. If $\phi$ is semianalytic, then so is $\phi^{-1} : U' \to U$.

Proof. Let us assume that $\phi$ satisfies the assumptions made in Corollary A.7 and consider an arbitrary point $x_0$ in the domain $U$. Since $\phi$ is semianalytic, we can find: (a) a neighborhood $\tilde U$ of $x_0$, (b) a semianalytic partition $P(\tilde U, h)$, and (c) for every $\tilde U_{h,\sigma} \in P(\tilde U, h)$ an analytic function $\phi_\sigma$ defined on $\tilde U$, which coincides with $\phi$ on $\tilde U_{h,\sigma}$.
It would be sufficient to show that the data (a)–(c) can be chosen in such a way that every function $\phi_\sigma$ of (c) has a non-degenerate derivative $D\phi_\sigma$ at every point of $\tilde U$. Then the hypothesis of Proposition A.6 would be satisfied. Certainly the derivative of $\phi$ is nowhere degenerate in $U$. Therefore, for every function $\phi_\sigma$ of (c) there is an open subset of $\tilde U$ on which the derivative of $\phi_\sigma$ is non-degenerate. The problem is that the subsets of points on which the derivatives are non-degenerate may be too small. They may be smaller than $\tilde U$, and some of them may not even contain the point $x_0$ at all. Therefore the data (a)–(c) is not yet sufficient to apply Prop. A.6. We therefore define new data (a')–(c') given by shrinking the neighborhood $\tilde U$ appropriately. For every $\tilde U_{h,\sigma} \in P(\tilde U, h)$ consider the subset $S_\sigma \subset \tilde U$ of points at which the function $\phi_\sigma$ has a non-degenerate derivative. Note that $S_\sigma$ contains the completion $\overline{\tilde U_{h,\sigma}}$. Indeed, this follows from the continuity of $D\phi$ and $D\phi_\sigma$. As a new $\tilde U'$ we take
\[ \tilde U' := \bigcap_{\sigma \,:\, x_0 \in \overline{\tilde U_{h,\sigma}}} S_\sigma \;\setminus \bigcup_{\sigma \,:\, x_0 \notin \overline{\tilde U_{h,\sigma}}} \overline{\tilde U_{h,\sigma}}. \]
The set $\tilde U'$ constitutes new data (a'). A new partition (b') and functions (c') are given just by restricting the previous (b), (c) to $\tilde U'$. Then (a')–(c') fulfill the assumptions of Prop. A.6.

Once we have generalized the notion of analytic structure into the notion of semianalytic structure, it is natural to introduce new partitions by relaxing in Definition A.1 the assumption that the functions constituting the set $h$ are analytic, and replacing it by the condition that they be semianalytic. Let us do so, apply the same notation as in Definition A.1 to a finite set of semianalytic functions $h$, and call the result a semi-semianalytic partition. Given any partition of a set into subsets, another partition is called finer if every element of the first partition is a finite union of elements of the second partition.

Lemma A.8. Suppose $P(U, h)$ is a semi-semianalytic partition of an open $U \subset \mathbb{R}^n$. Then every $x \in U$ has a neighborhood $\tilde U$ which admits a semianalytic partition finer than $P(\tilde U, h)$.

Proof. Let $x_0 \in U$. There is a neighborhood $\tilde U$ of $x_0$ which admits a semianalytic partition $P(\tilde U, f)$ compatible with all the elements of $h$ (which are semianalytic functions). As before, we start by collecting all the analytic functions available. Firstly, all the elements $f_I \in f$ are analytic functions defined on $\tilde U$. Secondly, for every assignment $\sigma : f \to \{>, =, <\}$, every element $h_I \in h$ defines an analytic function $h_{I\sigma}$. Given $\sigma$, denote by $h_\sigma$ the set of the functions $h_{I\sigma}$ with $h_I \in h$ arbitrary. The resulting set of analytic functions is
\[ \tilde h := f \cup \bigcup_{\sigma} h_\sigma. \tag{113} \]
Our candidate for a semianalytic partition of $\tilde U$ compatible with $P(U, h)$ is the semianalytic partition defined by the set of functions $\tilde h$. Consider an arbitrary
\[ \tilde\sigma : \tilde h \to \{>, =, <\}, \tag{114} \]
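and the corresponding set $\tilde U_{\tilde h,\tilde\sigma} \in P(\tilde U, \tilde h)$. We have to point out an element $\tilde U_{h,\sigma'}$ which contains $\tilde U_{\tilde h,\tilde\sigma}$. It is defined as follows. Consider
\[ \sigma := \tilde\sigma|_{f}. \tag{115} \]
Using this $\sigma$ select another subset of $\tilde h$, namely $h_\sigma$. The restriction
\[ \tilde\sigma|_{h_\sigma} \tag{116} \]
naturally defines an assignment $\sigma' : h \to \{>, =, <\}$, namely $\sigma'(h_I) := \tilde\sigma(h_{I\sigma})$. It is easy to check that
\[ \tilde U_{\tilde h,\tilde\sigma} \subset \tilde U_{h,\sigma'}. \tag{117} \]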
Finally, our interest in semianalytic sets is a consequence of a certain strong result of that theory ([2], see Prop. 2.10 in [3]) which we now translate into the terms of semianalytic partitions. We call a semianalytic partition an analytic partition if every element of the partition is a connected, analytic submanifold. The result we are referring to reads:

Proposition A.9. For every semianalytic partition $P(U, h)$ of an open $U \subset \mathbb{R}^n$, every point $x \in U$ has a neighborhood $\tilde U$ which admits an analytic partition finer than $P(\tilde U, h)$.

A.2. Semianalytic manifolds and submanifolds. In this subsection, $\Sigma$ is an $n$ dimensional differential manifold. Henceforth we will be assuming that $\Sigma$ and all the considered functions are of a differentiability class $C^m$, where $m > 0$. By analogy with the definitions of an analytic structure, analytic function, and analytic submanifold, we now introduce natural semianalytic generalizations. The generalization is possible due to Propositions A.3, A.4, A.6 of the previous subsection. We denote below an atlas of $\Sigma$ by $\{(U_I, \chi_I)\}_{I \in \mathcal{I}}$, where $\mathcal{I}$ is some labeling set, $\{U_I\}_{I \in \mathcal{I}}$ is an open covering of $\Sigma$, and $\{\chi_I\}_{I \in \mathcal{I}}$ is a family of diffeomorphisms $\chi_I : U_I \to U'_I \subset \mathbb{R}^n$.

Definition A.10. An atlas $\{(U_I, \chi_I)\}_{I \in \mathcal{I}}$ of $\Sigma$ is called semianalytic if for every pair $I, J \in \mathcal{I}$ the map
\[ \chi_J \circ \chi_I^{-1} : \chi_I(U_I \cap U_J) \to \chi_J(U_I \cap U_J) \tag{118} \]
is semianalytic. The diffeomorphisms $\chi_I$ are called semianalytic charts. A semianalytic structure on $\Sigma$ is a maximal semianalytic atlas. A semianalytic manifold is a differential manifold endowed with a semianalytic structure.

Definition A.11. Given two semianalytic manifolds $\Sigma$ and $\Sigma'$, a map $f : \Sigma \to \Sigma'$ is called semianalytic if for every semianalytic chart $\chi_I$ of $\Sigma$, and every semianalytic chart $\chi'_{I'}$ of $\Sigma'$, the function $\chi'_{I'} \circ f \circ \chi_I^{-1}$ (whenever the composition can be applied) is semianalytic.
In particular, if
\[ \Sigma' = \mathbb{R}^{n'} \tag{119} \]
and the semianalytic structure is the natural one defined by the atlas $\{(\mathbb{R}^{n'}, \mathrm{id})\}$, then the map $f$ is a semianalytic function defined on $\Sigma$.

Definition A.12. A semianalytic submanifold of a semianalytic manifold $\Sigma$ is a subset $S \subset \Sigma$ such that for every $x \in S$, there is a semianalytic chart $\chi_I$ defined in a neighborhood $U_I$ of $x$, such that
\[ \chi_I(S \cap U_I) = \{(x^1,\dots,x^n) \in \mathbb{R}^n : x^1 = \dots = x^{n-n'} = 0,\ 0 < x^{n-n'+1} < 1,\ \dots,\ 0 < x^n < 1\}, \tag{120} \]
where $n'$ is a non-negative integer, $n' \le n$, and $n'$ is called the dimension of $S$.

Definition A.13. An $n'$ dimensional semianalytic submanifold with boundary of $\Sigma$ is a subset $S \subset \Sigma$ such that for every $x \in S$, there is a semianalytic chart $\chi_I$ defined in a neighborhood $U_I$ of $x$, such that either (120) or
\[ \chi_I(S \cap U_I) = \{(x^1,\dots,x^n) \in \mathbb{R}^n : x^1 = \dots = x^{n-n'} = 0,\ 0 \le x^{n-n'+1} < 1,\ 0 < x^{n-n'+2} < 1,\ \dots,\ 0 < x^n < 1\}. \tag{121} \]
It is also assumed that the set of points $x$ such that (121) holds is not empty.

The key property of the semianalytic submanifolds crucial in our work is:

Proposition A.14. Let $S_1$ and $S_2$ be two semianalytic submanifolds of a semianalytic manifold $\Sigma$. Suppose $x \in S_1 \cap S_2$. Then there is an open neighborhood $W$ of $x$ in $\Sigma$, such that $W \cap S_1 \cap S_2$ is a finite, disjoint union of connected semianalytic submanifolds.

Remark. What is crucial for us in the conclusion of Proposition A.14 is the finiteness of the partition and the connectedness of its elements. After all, an infinite set of disjoint, embedded intervals may also form a single submanifold, albeit a disconnected one. Those two properties hold simultaneously due to the (semi)analyticity.

Proof. For every point $x \in S_1 \cap S_2$, there is a neighborhood $W$ which can be mapped by a semianalytic chart into an open subset $U \subset \mathbb{R}^n$. The intersection $W \cap S_1 \cap S_2$ is mapped into a subset of $U$ described by a finite family of equalities of the form (89) defined by some fixed family of semianalytic functions $h_I$ and relations $\sigma_I$ equal to '=' (the definition of a semianalytic submanifold involves also inequalities, however $W$ can be chosen such that the latter ones are satisfied at every point in $W$; we are assuming this is the case). Hence, the intersection is an element of the semi-semianalytic partition defined by the family of the semianalytic functions $h_I$. Due to Lemma A.8, if we choose the neighborhood $W$ of the point $x$ appropriately, then the intersection $W \cap S_1 \cap S_2$ is a finite union of elements of a certain semianalytic partition. Finally, via the result quoted in the previous subsection, the neighborhood $W$ can be chosen such that every element of a semianalytic partition of the image $U$ is a finite, disjoint union of connected analytic submanifolds. Their inverse image by the chart defines the decomposition of the intersection $W \cap S_1 \cap S_2$ into semianalytic submanifolds.
In the paper we are using extensively two particular classes of submanifolds: edges and faces.

Definition A.15. A semianalytic edge is a connected, 1-dimensional semianalytic submanifold of Σ with 2-point boundary.

Definition A.16. A face is a connected, codimension 1 semianalytic submanifold of Σ whose normal bundle is equipped with an orientation.

The property of the semianalytic structures which distinguishes them so much from the analytic ones is the local character of the spaces of semianalytic functions and semianalytic diffeomorphisms. That feature is guaranteed by the existence of a partition of unity compatible with an arbitrary open covering. We formulate this fact precisely now, in the form we refer to in the proof of our main theorem:

Proposition A.17. Suppose W ⊂ Σ is a compact subset. Let U_I ⊂ Σ, I = 1, . . . , N, be a family of open sets which covers W. There exists a family of C^m semianalytic functions φ_I : Σ → R, I = 1, . . . , N, such that for every I,
$$\operatorname{supp} \phi_I \subset U_I \tag{122}$$
and
$$\sum_I \phi_I \big|_W = 1. \tag{123}$$
Proof. The proof is standard owing to the following two properties of the semianalytic functions: (i) For every open ball in R^D, there is a C^m semianalytic function greater than zero at every point inside the ball and identically zero everywhere else. (ii) If f is a nowhere vanishing C^m semianalytic function then so is 1/f.

Communicated by Y. Kawahigashi
Commun. Math. Phys. 267, 735–740 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0071-8
Cantor Spectrum and KDS Eigenstates

Joaquim Puig

Departament de Matemàtica Aplicada I, Universitat Politècnica de Catalunya, Av. Diagonal 647, 08028 Barcelona, Spain. E-mail: [email protected]

Received: 3 November 2005 / Accepted: 17 February 2006
Published online: 19 August 2006 – © Springer-Verlag 2006
Abstract: In this note we consider KDS eigenstates of one-dimensional Schrödinger operators with ergodic potential, which are a class of generalized eigenfunctions including Bloch eigenstates. We show that if the spectrum, restricted to an interval, has zero Lyapunov exponents and is a Cantor set, then for a residual subset of energies, KDS eigenstates do not exist. In particular, we show that the quasi-periodic Schrödinger operators whose Schrödinger quasi-periodic cocycles are reducible for all energies have a limit band-type spectrum.

1. Introduction. Main results

The aim of this note is to relate the existence of KDS eigenstates (from Kotani, Deift and Simon), which generalize Bloch eigenstates, for one-dimensional Schrödinger operators with ergodic potential to the Cantor structure of the spectrum. More specifically, we consider a probability measure space (Ω, μ), a measure preserving invertible ergodic transformation T, and a bounded measurable real-valued function V : Ω → R. We let H_ω be the operator on l²(Z) defined by
$$(H_\omega x)_n = x_{n+1} + x_{n-1} + V(T^n \omega)\, x_n, \quad n \in \mathbb{Z}. \tag{1}$$
Our primary interest is in almost periodic and quasi-periodic operators, which are included in this formulation. Most of the arguments can be transported to the continuous case with straightforward adaptations, although we restrict to the discrete case for the sake of definiteness. As is well known, an operator like (1) can exhibit different spectral types, depending on V, ω and T. These different types are closely related to the behaviour of solutions of the corresponding eigenvalue equation
$$x_{n+1} + x_{n-1} + V(T^n \omega)\, x_n = a\, x_n, \quad n \in \mathbb{Z}, \tag{2}$$
where a is the energy. To measure the exponential growth of solutions in the spectrum, which is relevant for the spectral decomposition, one can introduce the upper Lyapunov exponent as the limit
$$\gamma(a) = \lim_{N \to +\infty} \frac{1}{N} \int_\Omega \log \left\| A_{a,V}(T^{N-1}\omega) \cdots A_{a,V}(\omega) \right\| d\mu(\omega),$$
where
$$A_{a,V}(\omega) = \begin{pmatrix} a - V(\omega) & -1 \\ 1 & 0 \end{pmatrix},$$
and whose existence is granted by the subadditive ergodic theorem [Kin68]. Outside the spectrum of H_ω, which is μ-a.e. independent of ω and which we denote by Σ, the Lyapunov exponent is always positive. Ishii-Pastur-Kotani theory, see Simon [Sim83] for the discrete version, relates the absolutely continuous spectrum to the set of zero Lyapunov exponents
$$A_0 = \{a \in \Sigma;\ \gamma(a) = 0\}.$$
If A0 has positive measure, its essential closure is the support of the absolutely continuous part of the spectrum of H_ω, which is μ-a.e. constant [KS81]. Moreover, for almost every a ∈ A0 (in the Lebesgue sense), Eq. (2) has a pair of independent solutions of the form
$$x_n^+ = e^{i\varphi(n,\omega)} \psi(T^n \omega) \quad \text{and} \quad x_n^- = e^{-i\varphi(n,\omega)} \psi(T^n \omega),$$
where ϕ(n, ω) is measurable and ψ ∈ L²(Ω), which we will call KDS eigenstates, as shown, almost simultaneously, by Kotani [Kot84] and Deift & Simon [DS83]. In the almost-periodic case, the norm of these solutions is an L²-almost-periodic function with the same frequency module as the potential V.

In this note we address the possible existence of KDS eigenstates for the remaining energies and the connection with the existence of gaps in the spectrum. Therefore we define the set
$$A_1 = \{a \in \Sigma;\ \text{there are KDS eigenstates}\}.$$
It is easy to see that A1 ⊂ A0 ⊂ Σ. The last inclusion can be strict, since the Lyapunov exponent can be positive in the spectrum (e.g. [Her83, SS91, Bou05, Bje05]). In this note we will characterize when the first inclusion is strict. De Concini & Johnson [DCJ87] considered the case where A0 contains nonvoid intervals. Let I be one of these open maximal intervals. Then they show that for all a ∈ I, KDS eigenfunctions do exist. At endpoints of I KDS eigenfunctions cannot exist, as we will see later on, but these form (at most) a countable set in any case.

The content of our main theorem is that whenever endpoints of gaps are dense in A0 (which is therefore a Cantor set), energies without KDS eigenstates are topologically abundant (although of zero Lebesgue measure according to Kotani theory).
Theorem 1. Let I be an open interval in R such that I ∩ = {a ∈ I ; γ (a) = 0} = A0 ∩ I, and it is a nonvoid Cantor set. Then (A0\A1 ) ∩ I is a residual G δ of A0 ∩ I . This theorem generalizes a result in [Pui06] in the context of quasi-periodic skewproducts and is similar to arguments in circle maps relating Cantor structure of the hyperbolic zones to the existence of non-regular dynamics [Arn61]. Cantor spectrum has been derived for several models, most notably the Almost Mathieu, V (θ ) = b cos θ and an irrational frequency. In fact, this work is inspired by some methods in [Rie03] to treat this case, although in a different sense. Moreover, in the Almost Mathieu Lyapunov exponent has been shown to be 0 in the spectrum if, and only if, |b| ≤ 2. Therefore, we have the following immediate consequence: Corollary 2. In the Almost Mathieu operator, with irrational frequency and nonzero coupling, there is a G δ -set of energies in the spectrum without KDS eigenstates. Remark 3. The fact that the Almost Mathieu model is invariant under Fourier transform (Aubry duality), allows to try to produce the same result using a theorem by Jitomirskaya & Simon [JS94] who prove that, under the same hypothesis as Corollary 2, there is a residual G δ of energies which are not point eigenvalues (in l 2 (Z)). Then using Aubry duality, the dual set of energies could not have L 2 quasi-periodic Bloch waves, which are a particular case of our result. Note that for the existence of KDS eigenstates no control is imposed on the phase of the sequence (only that its modulus follows the dynamics of T ), and for Bloch waves dynamics are imposed also in the phase, see the discussion following Theorem 7.1 in [DS83]. Finally, we would like to state a result concerning the reducibility of quasi-periodic Schrödinger cocycles to constant coefficients. 
In this case is a suitable d-dimensional torus and T is a quasi-periodic translation defined by a frequency vector α ∈ Rd whose components are rationally independent. A quasi-periodic Schrödinger operator is reducible to constant coefficients if there is a continuous quasi-periodic transformation, with the same basic frequencies, which renders it to a constant matrix (called the Floquet matrix). If a Schrödinger cocycle whose energy a is not at the endpoint of a gap is reducible to constant coefficients then the Floquet matrix can be chosen in S O(2, R) and therefore a has KDS eigenstates. This implies that we can get a sort of “inverse” result. Theorem 4. If a quasi-periodic Schrödinger cocycle is reducible to constant coefficients for all energies then the spectrum consists of spectral bands (nonvoid closed intervals in the spectrum) and accumulation points of these. Proof. If a Schrödinger cocycle is reducible to constant coefficients and the Lyapunov exponent is positive, then it has an exponential dichotomy and the corresponding energy belongs to the resolvent set [Joh82]. Thus if a Schrödinger cocycle is reducible to constant coefficients for all energies in the spectrum then the Lyapunov exponent must be zero in the spectrum. If in addition there is a component of the spectrum which is a Cantor set, we are under the hypothesis of Theorem 1 and there do not exist Bloch waves for a G δ set of energies. Even if at endpoints of gaps the cocycle is reducible to constant coefficients and there is only a single Bloch wave, these endpoints form a countable set. So the cocycle is still nonreducible to constant coefficients for a residual set of energies.
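As a numerical illustration of the Lyapunov exponent defined in the Introduction, the following minimal sketch estimates γ(a) for the Almost Mathieu potential V(θ) = b cos θ by applying the transfer matrices A_{a,V} along a single orbit. The function name, the golden-mean frequency, the orbit length and the use of a single phase in place of the average over ω are illustrative assumptions, not part of the note. For |b| ≤ 2 and an energy in the spectrum the estimate should be close to zero, while for |b| > 2 it should stay above log(|b|/2).

```python
import math

def lyapunov_estimate(a, b, alpha, omega=0.0, n_steps=20000):
    """Estimate gamma(a) for V(theta) = b*cos(theta) along one orbit (a sketch)."""
    x, y = 1.0, 0.0          # arbitrary initial vector
    log_growth = 0.0
    for n in range(n_steps):
        theta = 2.0 * math.pi * (omega + n * alpha)   # phase after n translations
        v = b * math.cos(theta)
        # Apply A_{a,V} = [[a - V, -1], [1, 0]] to the current vector.
        x, y = (a - v) * x - y, x
        norm = math.hypot(x, y)
        log_growth += math.log(norm)
        x, y = x / norm, y / norm                     # renormalise to avoid overflow
    return log_growth / n_steps

golden = (math.sqrt(5.0) - 1.0) / 2.0                 # assumed irrational frequency
gamma_sub = lyapunov_estimate(0.0, 1.0, golden)       # coupling |b| <= 2
gamma_super = lyapunov_estimate(0.0, 4.0, golden)     # coupling |b| > 2
```

Tracking the growth of a single renormalised vector, rather than the norm of the full matrix product, is a common shortcut: a generic initial vector grows at the rate of the upper exponent.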
2. Proof of Theorem 1

Take I an open interval in R such that
$$K := I \cap \Sigma = \{a \in I;\ \gamma(a) = 0\} = A_0 \cap I,$$
where Σ denotes the spectrum. We must show that K contains a residual set of energies without KDS eigenstates. It is worth noting that, with these hypotheses, the Lyapunov exponent is a continuous function on I. Indeed, continuity on the resolvent set follows from general principles, and continuity at points of K, where γ vanishes, is a consequence of the upper semi-continuity of the Lyapunov exponent [CS83]. The existence of KDS eigenstates at some energy a0 implies that there is a fundamental matrix of the first-order system associated to the eigenvalue equation whose norm, at any time, is bounded by a square integrable function. Using the definition of the Lyapunov exponent it is easy to show that it satisfies a Lipschitz condition at this energy.

Lemma 5. If a0 ∈ K is an energy with KDS eigenstates, then the map a ∈ R ↦ γ(a) is Lipschitz at a0.

When a0 lies at the endpoint of a gap in K, one cannot have Lipschitz continuity at a0.

Lemma 6. Assume that a0 is the endpoint of an open gap in the spectrum with γ(a0) = 0. Then
$$\sup_{a \ne a_0} \left| \frac{\gamma(a) - \gamma(a_0)}{a - a_0} \right| = \infty. \tag{3}$$

Proof. The Lyapunov exponent can be expressed through the Thouless formula [Tho72, AS83, CS83],
$$\gamma(a) = \int_{\mathbb{R}} \log |\lambda - a| \, d\kappa(\lambda),$$
where dκ stands for integration with respect to the density of states measure (supported on the spectrum). Introducing the so-called w-function or Floquet exponent,
$$w(a) = -\int_{\mathbb{R}} \log(\lambda - a) \, d\kappa(\lambda),$$
we have Re w(z) = −γ(z). Given an open spectral gap, a suitable choice of the branch of the logarithm makes w analytic on the union of the upper half plane, the gap and the lower half plane. Its derivative,
$$w'(a) = \int_{\mathbb{R}} \frac{1}{\lambda - a} \, d\kappa(\lambda), \tag{4}$$
is a single-valued function on C\Σ which is never zero in C\R. Let us now show that if a0 is an endpoint of a gap, the Lyapunov exponent has the asymptotics given by Eq. (3). For the sake of simplicity, let a0 be the leftmost endpoint of the spectrum, so that the corresponding gap is (−∞, a0). Take the determination of the logarithm which makes w analytic and conformal on C\[a0, +∞). With this choice, w(a) is real and negative if a ∈ (−∞, a0). Since γ is analytic in (−∞, a0), with γ'' negative and γ' nonzero there (see Eq. (4)), and γ is continuous at a0, the limit
$$\lim_{a \to a_0^-,\, a \in \mathbb{R}} w'(a) = -\lim_{a \to a_0^-} \gamma'(a) = -\lim_{a \to a_0^-} \frac{\gamma(a) - \gamma(a_0)}{a - a_0}$$
exists and is either +∞ or C, a finite positive constant. Let us now see that the latter case is impossible. Montel's theorem (e.g. [Sch93]) shows that in that case w' possesses angular limits at a0 when approaching from C\[a0, +∞), and they are all equal to C. Therefore, since w is conformal on C\[a0, +∞) and C ≠ 0, w must preserve angles at a0. However, the image of the upper half plane under w lies in the region Im z ∈ [0, π] and Re z < 0. At w(a0) = 0 the boundaries of this region form an internal angle of π/2, and therefore w cannot preserve angles at a0.

The argument when a0 is an endpoint of any other open spectral gap with γ(a0) = 0 is very similar. The image of the upper half-plane under w forms an internal angle of π/2 at w(a0), because the boundary value of Im w is the integrated density of states (IDS) and thus constant in the gap, while its real part is minus the Lyapunov exponent. The argument for the lowest gap can now be used, since the Lyapunov exponent has a negative second derivative on the gap and the limits of γ' at the endpoints are either ±∞ or a finite constant.

We now turn to the proof of Theorem 1. For any a ∈ K we define
$$m(a) = \sup_{\lambda \ne a,\ \lambda \in I} \left| \frac{\gamma(a) - \gamma(\lambda)}{a - \lambda} \right|,$$
which is either a positive real number or +∞. If a has a KDS eigenstate then m(a) < ∞ according to Lemma 5 and, if a is at the endpoint of a gap in the spectrum, then m(a) = ∞ due to Lemma 6. We will show that there is a residual set of energies in K with m(a) = +∞. In particular, these cannot only be endpoints of gaps (which form at most a countable set). Let, for any n ∈ N ∪ {0},
$$U(n) = \{a \in K;\ m(a) > n\}.$$
This is an open set of K (due to the continuity of γ on the interval I) which is also dense, because it includes the endpoints of gaps in K. Therefore
$$U(\infty) = \bigcap_{n > 0} U(n) = \{a \in K;\ m(a) = \infty\}$$
is a residual G_δ subset in K without KDS eigenstates.
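The failure of Lipschitz continuity in Lemma 6 can be seen in closed form in the simplest case V ≡ 0. This case is an illustrative assumption made here, not an example from the note, and its spectrum is the interval [−2, 2] rather than a Cantor set. For |a| > 2 the eigenvalue equation x_{n+1} + x_{n−1} = a x_n has solutions μ^n with μ + 1/μ = a, so γ(a) = arccosh(|a|/2), and γ(2) = 0 at the endpoint a0 = 2 of the gap (2, +∞). The sketch below (function names are hypothetical) evaluates the difference quotient appearing in (3), which grows like δ^(−1/2) as δ = a − a0 tends to zero.

```python
import math

def gamma_free(a):
    """Lyapunov exponent for V = 0: zero on [-2, 2], arccosh(|a|/2) outside."""
    return math.acosh(abs(a) / 2.0) if abs(a) > 2.0 else 0.0

def difference_quotient(delta, a0=2.0):
    """|gamma(a0 + delta) - gamma(a0)| / delta at the gap endpoint a0 = 2."""
    return abs(gamma_free(a0 + delta) - gamma_free(a0)) / delta

# Quotients for delta = 1e-2, 1e-4, 1e-6: they increase without bound.
quotients = [difference_quotient(10.0 ** (-k)) for k in (2, 4, 6)]
```

The square-root behaviour at the band edge is what rules out a finite Lipschitz constant there.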
Acknowledgement. The author wishes to thank Russell Johnson and the anonymous referee for useful comments and suggestions. This work has been supported by MCyT/FEDER Grant BFM2003-09504-C02-01 and CIRIT 2001 SGR-70
References

[Arn61] Arnol'd, V.I.: Small denominators. I. Mapping the circle onto itself. Izv. Akad. Nauk SSSR Ser. Mat. 25, 21–86 (1961)
[AS83] Avron, J., Simon, B.: Almost periodic Schrödinger operators II. The integrated density of states. Duke Math. J. 50, 369–391 (1983)
[Bje05] Bjerklöv, K.: Positive Lyapunov exponent and minimality for a class of one-dimensional quasi-periodic Schrödinger equations. Erg. Theory Dynam. Syst. 25(4), 1015–1045 (2005)
[Bou05] Bourgain, J.: Green's function estimates for lattice Schrödinger operators and applications. Volume 158 of Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 2005
[CS83] Craig, W., Simon, B.: Subharmonicity of the Lyaponov index. Duke Math. J. 50(2), 551–560 (1983)
[DCJ87] De Concini, C., Johnson, R.A.: The algebraic-geometric AKNS potentials. Erg. Theory Dynam. Syst. 7(1), 1–24 (1987)
[DS83] Deift, P., Simon, B.: Almost periodic Schrödinger operators III. The absolute continuous spectrum. Commun. Math. Phys. 90, 389–411 (1983)
[Her83] Herman, M.R.: Une méthode pour minorer les exposants de Lyapunov et quelques exemples montrant le caractère local d'un théorème d'Arnold et de Moser sur le tore de dimension 2. Comment. Math. Helv. 58(3), 453–502 (1983)
[Joh82] Johnson, R.: The recurrent Hill's equation. J. Diff. Eq. 46, 165–193 (1982)
[JS94] Jitomirskaya, S., Simon, B.: Operators with singular continuous spectrum. III. Almost periodic Schrödinger operators. Commun. Math. Phys. 165(1), 201–205 (1994)
[Kot84] Kotani, S.: Ljapunov indices determine absolutely continuous spectra of stationary random one-dimensional Schrödinger operators. In: Stochastic analysis (Katata/Kyoto, 1982), Amsterdam: North-Holland, 1984, pp. 225–247
[Kin68] Kingman, J.F.C.: The ergodic theory of subadditive stochastic processes. J. Roy. Statist. Soc. Ser. B 30, 499–510 (1968)
[KS81] Kunz, H., Souillard, B.: Sur le spectre des opérateurs aux différences finies aléatoires. Commun. Math. Phys. 78(2), 201–246 (1980/81)
[Pui06] Puig, J.: A nonperturbative Eliasson's reducibility theorem. Nonlinearity 19, 355–376 (2006)
[Rie03] Riedel, N.: The spectrum of a class of almost periodic operators. Int. J. Math. Math. Sci. 36, 2277–2301 (2003)
[Sch93] Schiff, J.L.: Normal families. Universitext. New York: Springer-Verlag, 1993
[Sim83] Simon, B.: Kotani theory for one-dimensional stochastic Jacobi matrices. Commun. Math. Phys. 89(2), 227–234 (1983)
[SS91] Sorets, E., Spencer, T.: Positive Lyapunov exponents for Schrödinger operators with quasi-periodic potentials. Commun. Math. Phys. 142(3), 543–566 (1991)
[Tho72] Thouless, D.J.: A relation between the density of states and range of localization for one-dimensional random system. J. Phys. C 5, 77–81 (1972)

Communicated by B. Simon
Commun. Math. Phys. 267, 741–755 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0041-1
A Second Eigenvalue Bound for the Dirichlet Schrödinger Operator

Rafael D. Benguria, Helmut Linde

Department of Physics, Pontificia Universidad Católica de Chile, Casilla 306, Correo 22 Santiago, Chile. E-mail: [email protected]; [email protected]

Received: 10 November 2005 / Accepted: 4 January 2006
Published online: 23 May 2006 – © Springer-Verlag 2006
Abstract: Let λ_i(Ω, V) be the i-th eigenvalue of the Schrödinger operator with Dirichlet boundary conditions on a bounded domain Ω ⊂ R^n and with the positive potential V. Following the spirit of the Payne-Pólya-Weinberger conjecture and under some convexity assumptions on the spherically rearranged potential V_⋆, we prove that λ_2(Ω, V) ≤ λ_2(S_1, V_⋆). Here S_1 denotes the ball, centered at the origin, that satisfies the condition λ_1(Ω, V) = λ_1(S_1, V_⋆). Further we prove, under the same convexity assumptions on a spherically symmetric potential V, that λ_2(B_R, V)/λ_1(B_R, V) decreases when the radius R of the ball B_R increases. We conclude with several results about the first two eigenvalues of the Laplace operator with respect to a measure of Gaussian or inverted Gaussian density.

1. Introduction

In an earlier publication [3], Ashbaugh and one of us have proven the Payne-Pólya-Weinberger (PPW) conjecture, which states that the first two eigenvalues λ_1, λ_2 of the Dirichlet-Laplacian on a bounded domain Ω ⊂ R^n (n ≥ 2) obey the bound
$$\lambda_2 / \lambda_1 \le j_{n/2,1}^2 / j_{n/2-1,1}^2. \tag{1}$$
Here j_{ν,k} stands for the k-th positive zero of the Bessel function J_ν. Thus the right hand side of (1) is just the ratio of the first two eigenvalues of the Dirichlet-Laplacian on an n-dimensional ball of arbitrary radius. This result is optimal in the sense that equality holds in (1) if and only if Ω is a ball.

R.B. was supported by FONDECYT project # 102-0844. H.L. gratefully acknowledges financial support from DIPUC of the Pontificia Universidad Católica de Chile and from CONICYT.
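To make the right hand side of the PPW bound (1) concrete, the following sketch evaluates it numerically. It is only an illustration under stated assumptions: the helper names are hypothetical, the Bessel function is summed from its power series (adequate for the small arguments needed here), and the first zero is located by a coarse scan followed by bisection. For n = 2 the result should agree with the Euclidean value ≈ 2.539 quoted in Sect. 3.

```python
import math

def bessel_j(nu, x, terms=40):
    """Power series of the Bessel function J_nu(x), for nu >= 0 and moderate x > 0."""
    half = x / 2.0
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
               * half ** (2 * k + nu) for k in range(terms))

def first_zero(nu, step=0.05, x_max=20.0):
    """First positive zero j_{nu,1}: scan for a sign change, then bisect."""
    lo = step
    while lo < x_max:
        hi = lo + step
        if bessel_j(nu, lo) * bessel_j(nu, hi) <= 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if bessel_j(nu, lo) * bessel_j(nu, mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        lo = hi
    raise ValueError("no zero found below x_max")

def ppw_bound(n):
    """Right hand side of (1): (j_{n/2,1} / j_{n/2-1,1}) ** 2."""
    return (first_zero(n / 2.0) / first_zero(n / 2.0 - 1.0)) ** 2

bound_2d = ppw_bound(2)   # ratio for the disk
bound_3d = ppw_bound(3)   # ratio for the three-dimensional ball
```

A library routine for Bessel zeros would be preferable in practice; the series is used here only to keep the sketch free of external dependencies.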
The proof of the PPW conjecture has been generalized in several ways. In [2] a corresponding theorem has been established for the Laplacian operator on a domain that is contained in a hemisphere of the n-dimensional sphere Sn . More precisely, it has been shown that λ2 () ≤ λ2 (S1 ), where S1 is the n-dimensional geodesic ball in Sn that has λ1 () as its first Dirichlet eigenvalue. A further variant of the PPW conjecture has been considered by Haile. In [11] he compares the second eigenvalue λ2 (, kr α ) of the Schrödinger operator with the potential V = kr α (k > 0, α ≥ 2) with λ2 (S1 , kr α ), where S1 is the ball, centered at the origin, that satisfies the condition λ1 (, kr α ) = λ1 (S1 , kr α ). Here and in the following we denote by λi (, V ) the i th eigenvalue of the Schrödinger operator − + V (r) with Dirichlet boundary conditions on a bounded domain ⊂ Rn . We have to mention a gap in [11], which occurs in the proof of Lemma 3.2. The author claims (and uses) that all derivatives of the function Z (θ ) (which is equal to T (θ ) where T (θ ) = 0) coincide with the derivatives of T (θ ) in the points where T (θ ) = 0. This is not proven and there seems to be no reason why it should be true. The same problem occurs in the proof of Lemma 3.3. In the present paper we will prove a theorem that includes Haile’s theorem as a special case and thus remedies the situation. One very important difference between the original PPW conjecture and the extended problems in [2, 11] is that in the later cases the ratio λ2 /λ1 is not scaling invariant anymore. While λ2 /λ1 is the same for any ball in Rn , it is an increasing function of the radius for balls in Sn [2]. On the other hand, we will see that λ2 (B R , V )/λ1 (B R , V ) on the ball B R is a decreasing function of the radius R, if V has certain convexity properties. This raises the question which is the ‘right size’ of the comparison ball in the PPW estimate. We will make some remarks on this problem below. 
The main objective of the present work is to prove a PPW type result for a Schrödinger operator with a positive potential. We will state the corresponding theorem in the following section. In Sect. 3 we will transfer our results to the case of a Laplacian operator with respect to a metric of Gaussian or inverted Gaussian measure, the two cases of which are closely related to the harmonic oscillator. The rest of the article will be devoted to the proofs of our results.
2. Main Results

Let Ω ⊂ R^n (with n ≥ 2) be some bounded domain and V : Ω → R+ some positive potential such that the Schrödinger operator −Δ + V (subject to Dirichlet boundary conditions) is self-adjoint in L²(Ω). We call λ_i(Ω, V) its i-th eigenvalue. Further, we denote by V_⋆ the radially increasing rearrangement of V. Then the following PPW type estimate holds:

Theorem 2.1. Let S_1 ⊂ R^n be a ball centered at the origin and of radius R_1 and let Ṽ : S_1 → R+ be some radially symmetric positive potential such that Ṽ(r) ≤ V_⋆(r) for all 0 ≤ r ≤ R_1 and λ_1(Ω, V) = λ_1(S_1, Ṽ). If Ṽ(r) satisfies the conditions
a) Ṽ(0) = Ṽ'(0) = 0 and
b) Ṽ'(r) exists and is increasing and convex,
then
$$\lambda_2(\Omega, V) \le \lambda_2(S_1, \tilde V). \tag{2}$$
If V is such that V_⋆ satisfies the convexity conditions stated in the theorem, the best bound is obtained by choosing Ṽ = V_⋆. In this case the theorem is a typical PPW result and optimal in the sense that equality holds in (2) if Ω is a ball and V = V_⋆. For a general potential V we still get a non-trivial bound on λ_2(Ω, V), though it is not sharp anymore. To show that our Theorem 2.1 contains Haile's result [11] as a special case, we state the following corollary:

Corollary 2.1. Let Ṽ : R^n → R+ be a radially symmetric positive potential that satisfies the conditions a) and b) of Theorem 2.1 and let S_1 ⊂ R^n be the ball (centered at the origin) such that λ_1(Ω, Ṽ) = λ_1(S_1, Ṽ). Then λ_2(Ω, Ṽ) ≤ λ_2(S_1, Ṽ).

The proof of Theorem 2.1 follows the lines of the proof in [3] and will be presented in Sect. 5. Let us make a few remarks on the conditions that Ṽ has to satisfy. Condition a) is not a very serious restriction, because any bounded potential can be shifted such that V_⋆(0) = 0. Also V_⋆'(0) = 0 holds if V is somewhat regular where it takes the value zero. Moreover, our method relies heavily on the fact that
$$\lambda_2(B_R, \tilde V) \ge \left(1 + \frac{2}{n}\right) \lambda_1(B_R, \tilde V), \tag{3}$$
which is a byproduct of our proof and holds for any ball B_R and any potential Ṽ that satisfies the conditions of Theorem 2.1. The conditions a) and b) will be needed to show the above inequality, which is equivalent to q''(0) ≤ 0 for a function q to be defined in the proof. Numerical studies indicate that b) is somewhat sharp in the sense that, for example, a potential r^{2−ε} (which violates b) only 'slightly') does not satisfy (3) for every R. In this case the statement of Theorem 2.1 may still be true, but the typical scheme of the PPW proof will fail. Furthermore, conditions a) and b) will allow us to employ the crucial Baumgartner-Grosse-Martin (BGM) inequality [7, 4]: From a) and b) we see that Ṽ(r) + rṼ'(r) is increasing. Consequently rṼ(r) is convex, which is just the condition needed to apply the BGM inequality.

As mentioned above, one has to choose carefully the size of the comparison ball in a PPW estimate if λ_2/λ_1 is a non-constant function of the ball's radius. In the case of the Laplacian on S^n, one compares the second eigenvalues on Ω and S_1, the ball that has the same first eigenvalue as Ω. By the Rayleigh-Faber-Krahn (RFK) inequality for S^n it is clear that S_1 ⊂ Ω^⋆, where Ω^⋆ is the spherically symmetric rearrangement of Ω. It has also been shown in [2] that λ_2/λ_1 on a geodesic ball in S^n is an increasing function of the ball's radius. One can conclude from these two facts that in S^n an estimate of the type (2) is stronger than the inequality
$$\lambda_2(\Omega)/\lambda_1(\Omega) \le \lambda_2(\Omega^\star)/\lambda_1(\Omega^\star). \tag{4}$$
It has also been argued in [4] why the situation is different in the hyperbolic space H^n. Here an estimate of the type (4) is not possible, for the following reason: One can show that λ_2/λ_1 on geodesic balls in H^n is a decreasing function of the radius. Now suppose, for example, that Ω is the ball B_R with very long and thin tentacles attached to it. Then the first and the second eigenvalue of the Laplacian on Ω and B_R are almost the same, while the ratio λ_2/λ_1 on Ω^⋆ can be considerably less than on B_R (and thus on Ω). We will prove a PPW inequality of the type λ_2(Ω) ≤ λ_2(S_1) for H^n and the monotonicity of λ_2/λ_1 on geodesic balls in a future publication.
To shed light on the question which is the right type of PPW inequality for the Schrödinger operator on , we state Theorem 2.2. Let V : Rn → R+ be a spherically symmetric potential that satisfies the conditions of Theorem 2.1, i.e. a) V (0) = V (0) = 0 and b) V (r ) exists and is increasing and convex. Then the ratio λ2 (B R , V ) λ1 (B R , V ) is a decreasing function of R. This theorem shows that one can not replace Eq. (2) in our Theorem 2.1 by an inequality of the type (4), following the same reasoning as in the case of the Laplacian on Hn . Theorem 2.2 will be proven in Sect. 6. 3. Connection to the Laplacian Operator in Gaussian Space Recently, there has been some interest in isoperimetric inequalities in Rn endowed with 2 2 a measure of Gaussian ( dμ− = e−r /2 dn r ) or inverted Gaussian ( dμ+ = e+r /2 dn r ) density. For the Gaussian space it has been known for several years that a classical isoperimetric inequality holds. Yet the ratio of Gaussian perimeter and Gaussian measure is minimized by half-spaces instead of spherical domains [9]. The ‘inverted Gaussian’ case, i.e., Rn with the measure μ+ , is more similar to the Euclidean case: It has been shown recently that a classical isoperimetric inequality holds and that the minimizers are balls centered at the origin [15]. We consider the Dirichlet-Laplacians −± on L 2 (, dμ± ), where ⊂ Rn is an open domain with μ± () < μ± (Rn ). These two operators are defined by their quadratic forms h ± [ ] = |∇ (r)|2 dμ± , ∈ W01,2 (, dμ± ). (5)
The eigenfunctions i± and eigenvalues λi± () in question are determined by the the differential equation n ± ∂ 2 ±r 2 ∂ i − (6) e = λi± ()e±r i± (r). ∂rk ∂rk k=1
There is a tight connection between the operators −± on a domain and the harmonic oscillator − + r 2 restricted to . Their eigenfunctions and eigenvalues are related by [6]
i± (r) = i (r) · e∓r λi± ()
2 /2
and
= λi (, r ) ± n, 2
denoting by i the Dirichlet eigenfunctions of − + r 2 on .
(7)
There is an equivalent of the RFK inequality in Gaussian space [6] stating that λ− 1 () is minimized for given μ− () if is a half-space. The corresponding fact for the ‘inverted’ Gaussian space is that λ+1 () is minimized for given μ+ () by the ball centered at the origin. This can be concluded from the RFK inequality for Schrödinger operators [14] in combination with (7). Concerning the second eigenvalue, we will now show what our results from Sect. 2 imply for the operators −± . We state Theorem 3.1. For the operator −+ on a ball B R of radius R (centered at the origin) the ratio λ+2 (B R )/λ+1 (B R ) is a strictly decreasing function of R. In Sect. 7 we will derive Theorem 3.1 from Theorem 2.2 in a purely algebraic way using only the relation (7). Repeating the argument for Hn from the previous section, we see that by Theorem 3.1 the best PPW result we can expect to get is Theorem 3.2. Let S1 be the ball (centered at the origin) that satisfies the condition λ+1 (S1 ) = λ+1 (). Then λ+2 () ≤ λ+2 (S1 ). Theorem 3.2 follows immediately from Theorem 2.1 and (7). In the same way we easily get the corresponding version for −− : Theorem 3.3. Let S1 be the ball (centered at the origin) that satisfies the condition − λ− 1 (S1 ) = λ1 (). Then − λ− 2 () ≤ λ2 (S1 ).
Yet in this case it is no longer clear whether $S_1$ is the optimal comparison ball. First, in contrast to the ‘inverted’ Gaussian case, the ratio $\lambda_2^-(B_R)/\lambda_1^-(B_R)$ is not a decreasing function of $R$ anymore. This can be seen by comparing the values of $\lambda_2^-(B_R)/\lambda_1^-(B_R)$ for $R \to 0$ and $R \to \infty$: For small $R$ the ratio is close to the Euclidean value ($\approx 2.539$) while for large $R$ it approaches infinity (by (7)). Second, the RFK inequality in Gaussian space states that $\lambda_1^-(\Omega)$ is minimized by half-spaces, not circles. This means that for general $\Omega$ we do not know whether $\Omega$ is bigger or smaller than $S_1$. Because of these differences it remains unclear what the most natural way is to generalize the PPW conjecture to Gaussian space.

4. A Monotonicity Lemma

In our proof of Theorem 2.1 we will need

Lemma 4.1 (Monotonicity of $g$ and $B$). Let $\tilde V$, $S_1$ and $R_1$ be as in Theorem 2.1 and call $z_1(r)$ and $z_2(r)$ the radial parts (both chosen positive) of the first two Dirichlet eigenfunctions of $-\Delta + \tilde V$ on $S_1$. Set
\[ g(r) = \frac{z_2(r)}{z_1(r)} \quad \text{and} \quad B(r) = g'(r)^2 + (n-1)\frac{g(r)^2}{r^2} \]
for $0 < r < R_1$. Then $g(r)$ is increasing on $(0, R_1)$ and $B(r)$ is decreasing on $(0, R_1)$.
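Proof. [11, 1]. In this section we abbreviate $\lambda_i = \lambda_i(S_1, \tilde V)$. The functions $z_1$ and $z_2$ are solutions of the differential equations
\[ \begin{aligned} -z_1'' - \frac{n-1}{r} z_1' + \left( \tilde V - \lambda_1 \right) z_1 &= 0, \\ -z_2'' - \frac{n-1}{r} z_2' + \left( \frac{n-1}{r^2} + \tilde V - \lambda_2 \right) z_2 &= 0 \end{aligned} \tag{8} \]
with the boundary conditions
\[ z_1'(0) = 0, \quad z_1(R_1) = 0, \quad z_2(0) = 0, \quad z_2(R_1) = 0. \tag{9} \]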
This is assured by the BGM inequality [1, 7], which is applicable because $r\tilde V$ is convex. As in [1] we define the function
\[ q(r) := \frac{r g'(r)}{g(r)}. \]
Proving the lemma is thus reduced to showing that $0 < q(r) < 1$ and $q'(r) < 0$ for $r \in [0, R_1]$. Using the definition of $g$ and Eq. (8), one can show that $q(r)$ is a solution of the Riccati differential equation
\[ q' = (\lambda_1 - \lambda_2) r + \frac{(1-q)(q+n-1)}{r} - 2q \frac{z_1'}{z_1}. \tag{10} \]
It is straightforward to establish the boundary behavior
\[ q(0) = 1, \quad q'(0) = 0, \quad q''(0) = \frac{2}{n} \left( \left( 1 + \frac{2}{n} \right) \lambda_1 - \lambda_2 \right) \]
and $q(R_1) = 0$.

Fact 4.1. For $0 \le r \le R_1$ we have $q(r) \ge 0$.

Proof. Assume the contrary. Then there exist two points $0 < r_1 < r_2 \le R_1$ such that $q(r_1) = q(r_2) = 0$ but $q'(r_1) \le 0$ and $q'(r_2) \ge 0$. If $r_2 < R_1$ then the Riccati equation (10) yields
\[ 0 \ge q'(r_1) = (\lambda_1 - \lambda_2) r_1 + \frac{n-1}{r_1} > (\lambda_1 - \lambda_2) r_2 + \frac{n-1}{r_2} = q'(r_2) \ge 0, \]
which is a contradiction. If $r_2 = R_1$ then we get a contradiction in a similar way by
\[ 0 \ge q'(r_1) = (\lambda_1 - \lambda_2) r_1 + \frac{n-1}{r_1} > (\lambda_1 - \lambda_2) R_1 + \frac{n-1}{R_1} = q'(R_1) \ge 0. \]
In the following we will analyze the behavior of $q'$ according to (10), considering $r$ and $q$ as two independent variables. For the sake of compact notation we will make use of the following abbreviations:
\[ p(r) = z_1'(r)/z_1(r), \qquad \nu = n - 2, \qquad E = \lambda_2 - \lambda_1, \]
\[ N_y = y^2 - n + 1, \qquad M_y = N_y^2/(2y) - \nu^2 y/2, \qquad Q_y = 2y\lambda_1 + E N_y y^{-1} - 2E. \]
We further define the function
\[ T(r, y) := -2p(r)y - \frac{\nu y + N_y}{r} - Er. \tag{11} \]
Then we can write (10) as $q'(r) = T(r, q(r))$. The definition of $T(r, y)$ allows us to analyze the Riccati equation for $q'$ considering $r$ and $q(r)$ as independent variables. For $r$ going to zero, $p$ is $O(r)$ and thus
\[ T(r, y) = \frac{1}{r} \big( (\nu + 1 + y)(1 - y) \big) + O(r) \quad \text{for } y \text{ fixed}. \]
Consequently,
\[ \lim_{r \to 0} T(r, y) = \begin{cases} +\infty & \text{for } 0 \le y < 1 \text{ fixed}, \\ 0 & \text{for } y = 1, \\ -\infty & \text{for } y > 1 \text{ fixed}. \end{cases} \]
For $r$ approaching $R_1$, the function $p(r)$ goes to minus infinity, while all other terms in (11) are bounded. Therefore
\[ \lim_{r \to R_1} T(r, y) = +\infty \quad \text{for } y > 0 \text{ fixed}. \]
The partial derivative of $T(r, y)$ with respect to $r$ is given by
\[ T' = \frac{\partial}{\partial r} T(r, y) = -2y p' + \frac{\nu y}{r^2} + \frac{N_y}{r^2} - E. \tag{12} \]
In the points $(r, y)$ where $T(r, y) = 0$ we have, by (11),
\[ p|_{T=0} = -\frac{\nu}{2r} - \frac{N_y}{2yr} - \frac{Er}{2y}. \tag{13} \]
From (8) we get the Riccati equation
\[ p' + p^2 + \frac{\nu + 1}{r} p + \lambda_1 - \tilde V = 0. \tag{14} \]
Putting (13) into (14) and the result into (12) yields
\[ T'|_{T=0} = \frac{M_y}{r^2} + \frac{E^2 r^2}{2y} + Q_y - 2y \tilde V. \tag{15} \]
If we define the function
\[ Z_y(r) := \frac{M_y}{r^2} + \frac{E^2 r^2}{2y} + Q_y - 2y \tilde V, \]
it is clear that $T'(r, y) = Z_y(r)$ for any $r, y$ with $T(r, y) = 0$. The behavior of $Z_y(r)$ at $r = 0$ is determined by $M_y$. From the definition of $M_y$ we get
\[ y M_y = \frac{1}{2} \left( y^2 - 1 \right) \cdot \big( (y - 1) - (n - 2) \big) \cdot \big( (y + 1) + (n - 2) \big). \tag{16} \]
This implies that $M_y > 0$ for $0 < y < 1$ and $M_1 = 0$, and therefore
\[ \lim_{r \to 0} Z_y(r) = \infty \quad \text{for } 0 < y < 1. \]
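The factorization (16) and the sign of $M_y$ on $0 < y < 1$ can be spot-checked in exact arithmetic. This is a small sketch; the function names and the grid of test values are assumptions made for illustration, and the dimension is taken to be $n \ge 2$.

```python
from fractions import Fraction


def M(n, y):
    """M_y = N_y^2 / (2y) - nu^2 y / 2 with N_y = y^2 - n + 1 and nu = n - 2."""
    nu = n - 2
    N = y * y - n + 1
    return N * N / (2 * y) - nu * nu * y / 2


def yM_factored(n, y):
    """Right-hand side of Eq. (16)."""
    return Fraction(1, 2) * (y * y - 1) * ((y - 1) - (n - 2)) * ((y + 1) + (n - 2))


def check_factorization(max_n=7):
    """Compare both sides of (16) on a grid and test M_y > 0 for 0 < y < 1."""
    for n in range(2, max_n + 1):
        for k in range(1, 20):
            y = Fraction(k, 10)
            if y * M(n, y) != yM_factored(n, y):
                return False
            if y < 1 and not M(n, y) > 0:
                return False
        if M(n, Fraction(1)) != 0:
            return False
    return True
```

The factored form also makes the remaining zero of $M_y$ visible: besides $y = 1$ it vanishes only at $y = n - 1$.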
Fact 4.2. There is some $r_0 > 0$ such that $q(r) \le 1$ for $0 < r < r_0$ and $q(r_0) < 1$.

Proof. Suppose the contrary, i.e., $q(r)$ first increases away from $r = 0$. Then, because $q(0) = 1$ and $q(R_1) = 0$ and because $q$ is continuous and differentiable, we can find two points $r_1 < r_2$ such that $\hat q := q(r_1) = q(r_2) > 1$ and $q'(r_1) > 0 > q'(r_2)$. Even more, we can choose $r_1$ and $r_2$ such that $\hat q$ is arbitrarily close to one. Writing $\hat q = 1 + \epsilon$ with $\epsilon > 0$, we can calculate from the definition of $Q_y$ that
\[ Q_{1+\epsilon} = Q_1 + \epsilon n \left( \lambda_2 - (1 - 2/n) \lambda_1 \right) + O(\epsilon^2). \]
The term in brackets can be estimated by $\lambda_2 - (1 - 2/n)\lambda_1 > \lambda_2 - \lambda_1 > 0$. We can also assume that $Q_1 \ge 0$, because otherwise $q''(0) = \frac{2}{n^2} Q_1 < 0$ and Fact 4.2 is immediately true. Thus, choosing $r_1$ and $r_2$ such that $\epsilon$ is sufficiently small, we can make sure that $Q_{\hat q} > 0$. We further note that in view of (16) the constant $M_{\hat q}$ can be positive or negative (depending on $n$), but not zero because $1 < \hat q < 2$.

Now consider the function $T(r, \hat q)$. We have $T(r_1, \hat q) > 0 > T(r_2, \hat q)$ and the boundary behavior $T(0, \hat q) = -\infty$ and $T(R_1, \hat q) = +\infty$. Thus $T(r, \hat q)$ changes its sign at least thrice on $[0, R_1]$. Consequently, we can find three points $0 < \hat r_1 < \hat r_2 < \hat r_3 < R_1$ such that
\[ Z_{\hat q}(\hat r_1) \ge 0, \qquad Z_{\hat q}(\hat r_2) \le 0, \qquad Z_{\hat q}(\hat r_3) \ge 0. \tag{17} \]
Let us define
\[ h(r) = \frac{E^2 r^2}{2 \hat q} - 2 \hat q \tilde V(r). \]
Then
\[ Z_{\hat q}(r) = \frac{M_{\hat q}}{r^2} + Q_{\hat q} + h(r). \tag{18} \]
By condition b) on $\tilde V$, the function $h'(r)$ is concave. Also $h(0) = h'(0) = 0$. We conclude that if $h'(r_0) < 0$ or $h(r_0) < 0$ for some $r_0 > 0$, then $h'(r)$ is negative and decreasing for all $r > r_0$. We will now show that $Z_{\hat q}$ cannot have the properties (17), a contradiction that proves Fact 4.2:

Case 1. Assume $M_{\hat q} > 0$. Then from $Z_{\hat q}(\hat r_2) \le 0$ we see that
\[ -h(\hat r_2) \ge \frac{M_{\hat q}}{\hat r_2^2} + Q_{\hat q} > 0. \]
By what has been said above about $h(r)$, we conclude that $-h(r)$ is a strictly increasing function on $[\hat r_2, \hat r_3]$. Therefore
\[ -h(\hat r_3) > -h(\hat r_2) \ge \frac{M_{\hat q}}{\hat r_2^2} + Q_{\hat q} > \frac{M_{\hat q}}{\hat r_3^2} + Q_{\hat q}, \]
such that $Z_{\hat q}(\hat r_3) < 0$, contradicting (17).

Case 2. Assume $M_{\hat q} < 0$. Then from $Z_{\hat q}(\hat r_1) \ge 0 \ge Z_{\hat q}(\hat r_2)$ it follows that $Z_{\hat q}'(\hat r) \le 0$ for some $\hat r \in [\hat r_1, \hat r_2]$. In view of (18) we have $h'(\hat r) < 0$. But this means by our above concavity argument that $h'(r)$ is decreasing and thus $h'(r) < 0$ for all $r > \hat r$. Then $Z_{\hat q}'$ is strictly decreasing for $r \ge \hat r$. Together with $Z_{\hat q}(\hat r_2) \le 0$ and $Z_{\hat q}'(\hat r) \le 0$ this implies that $Z_{\hat q}(\hat r_3) < 0$, a contradiction to (17).

Fact 4.3. For all $0 \le r \le R_1$ the inequality $q'(r) \le 0$ holds.

Proof. Assume the contrary. Then there are three points $r_1 < r_2 < r_3$ in $(0, R_1)$ with $0 < \hat q := q(r_1) = q(r_2) = q(r_3) < 1$ and $q'(r_1) < 0$, $q'(r_2) > 0$, $q'(r_3) < 0$. Consider the function $T(r, \hat q)$, which is equal to $q'(r)$ at $r_1, r_2, r_3$. Taking into account its boundary behavior at $r = 0$ and $r = R_1$, it is clear that $T(r, \hat q)$ must have at least the sign changes positive-negative-positive-negative-positive. Thus $T(r, \hat q)$ has at least four zeros $\hat r_1 < \hat r_2 < \hat r_3 < \hat r_4$ with the properties
\[ Z_{\hat q}(\hat r_1) \le 0, \qquad Z_{\hat q}(\hat r_2) \ge 0, \qquad Z_{\hat q}(\hat r_3) \le 0, \qquad Z_{\hat q}(\hat r_4) \ge 0. \]
We also know that $Z_{\hat q}(0) = +\infty$. To satisfy all these requirements, $Z_{\hat q}$ must either have at least three extremal points where $Z_{\hat q}'$ crosses zero or $Z_{\hat q}'$ must vanish on a finite interval. But we have
\[ Z_{\hat q}'(r) = -\frac{2 M_{\hat q}}{r^3} + \frac{E^2}{\hat q} r - 2 \hat q \tilde V'(r), \]
which is a strictly concave function (recall $M_{\hat q} > 0$ for $0 < \hat q < 1$). A strictly concave function can only cross zero twice and not be zero on a finite interval, which is a contradiction that proves Fact 4.3.

Altogether we have shown that $0 < q(r) < 1$ and $q'(r) \le 0$ for all $r \in [0, R_1]$, proving Lemma 4.1.
5. Proof of Theorem 2.1

Proof of Theorem 2.1. We start from the basic gap inequality
\[ \lambda_2(\Omega, V) - \lambda_1(\Omega, V) \le \frac{\int_\Omega |\nabla P|^2 u_1^2 \, d^n r}{\int_\Omega P^2 u_1^2 \, d^n r}, \tag{19} \]
where $u_1$ is the first Dirichlet eigenfunction of $-\Delta + V$ on $\Omega$ and $P$ is a suitable test function that satisfies the condition $\int_\Omega P u_1^2 \, d^n r = 0$. We set
\[ P_i(\mathbf{r}) = g(r) \frac{r_i}{r} \quad \text{for } i = 1, 2, \dots, n, \tag{20} \]
where
\[ g(r) = \begin{cases} \dfrac{z_2(r)}{z_1(r)} & \text{for } r < R_1, \\[2mm] \lim_{t \uparrow R_1} g(t) & \text{for } r \ge R_1. \end{cases} \tag{21} \]
Here $z_1$ and $z_2$ are the radial parts (both chosen positive) of the first two eigenfunctions of $-\Delta + \tilde V$ on $S_1$. More precisely, $z_2(r) r_i r^{-1}$ for $i = 1, \dots, n$ is a basis of the space of second eigenfunctions. It follows from the convexity of $r\tilde V$ and the BGM inequality [1, 7] that the second eigenfunctions can be written in that way. According to an argument in [3] one can always choose the origin of the coordinate system such that $\int_\Omega P_i u_1^2 \, d^n r = 0$ is satisfied for all $i$. Putting the functions $P_i$ into (19) and summing over all $i$ yields
\[ \lambda_2(\Omega, V) - \lambda_1(\Omega, V) \le \frac{\int_\Omega B(r) u_1^2 \, d^n r}{\int_\Omega g(r)^2 u_1^2 \, d^n r} \tag{22} \]
with
\[ B(r) = g'(r)^2 + (n-1) \frac{g(r)^2}{r^2}. \]
By Lemma 4.1 we know that $B$ is a decreasing and $g$ an increasing function of $r$. Thus, denoting by $u_1^\star$ the spherically decreasing rearrangement of $u_1$ with respect to the origin, we have
\[ \int_\Omega B(r) u_1^2 \, d^n r \le \int_{\Omega^\star} B^\star(r)\, {u_1^\star}^2 \, d^n r \le \int_{\Omega^\star} B(r)\, {u_1^\star}^2 \, d^n r \le \int_{S_1} B(r) z_1^2 \, d^n r \tag{23} \]
and
\[ \int_\Omega g(r)^2 u_1^2 \, d^n r \ge \int_{\Omega^\star} g_\star(r)^2\, {u_1^\star}^2 \, d^n r \ge \int_{\Omega^\star} g(r)^2\, {u_1^\star}^2 \, d^n r \ge \int_{S_1} g(r)^2 z_1^2 \, d^n r. \tag{24} \]
In each of the above chains of inequalities the first step follows from general properties of rearrangements and the second from the monotonicity properties of $g$ and $B$. The
third step is justified by a comparison result that we state below and the monotonicity of $g$ and $B$ again. Putting (23) and (24) into (22) we get
\[ \lambda_2(\Omega, V) - \lambda_1(\Omega, V) \le \frac{\int_{S_1} B(r) z_1^2 \, d^n r}{\int_{S_1} g(r)^2 z_1^2 \, d^n r} = \lambda_2(S_1, \tilde V) - \lambda_1(S_1, \tilde V). \]
Keeping in mind that $\lambda_1(\Omega, V) = \lambda_1(S_1, \tilde V)$, Theorem 2.1 is proven by this last inequality.
Lemma 5.1 (Chiti comparison result). Let $u_1^\star$ be the radially decreasing rearrangement of the first eigenfunction of $-\Delta + V$ on $\Omega$ and $z_1$ the first eigenfunction of $-\Delta + \tilde V$ on $S_1$. Assume both functions to be positive and normalized in $L^2(\Omega^\star)$. Then there exists an $r_0$ such that $u_1^\star(r) \le z_1(r)$ for $r \le r_0$ and $u_1^\star(r) \ge z_1(r)$ for $r_0 < r \le R_1$.

Proof. By a version of the RFK inequality for Schrödinger operators [14] and by domain monotonicity of the first eigenvalue it is clear that $S_1 \subset \Omega^\star$. This is why we can view $z_1(r)$ as a function in $L^2(\Omega^\star)$, setting $z_1(r) = 0$ for $r > R_1$. Both $u_1^\star$ and $z_1$ are positive and spherically symmetric. Moreover, $u_1^\star(r)$ and $z_1(r)$ are decreasing functions of $r$. For $u_1^\star$ this is clear by definition of the rearrangement. For $z_1$ it follows from a simple comparison argument using $z_1^\star$ as a test function in the Rayleigh quotient for $\lambda_1$. (Here and in the sequel we write short-hand $\lambda_1 = \lambda_1(\Omega, V) = \lambda_1(S_1, \tilde V)$.) We introduce a change of variables via $s = C_n r^n$ and write $u_1^\#(s) \equiv u_1^\star(r)$, $z_1^\#(s) \equiv z_1(r)$ and $\tilde V_\#(s) \equiv \tilde V(r)$.

Fact 5.1. For the functions $u_1^\#(s)$ and $z_1^\#(s)$ we have
\[ -\frac{d u_1^\#}{ds} \le n^{-2} C_n^{-2/n} s^{2/n - 2} \int_0^s (\lambda_1 - \tilde V_\#(w))\, u_1^\#(w) \, dw, \tag{25} \]
\[ -\frac{d z_1^\#}{ds} = n^{-2} C_n^{-2/n} s^{2/n - 2} \int_0^s (\lambda_1 - \tilde V_\#(w))\, z_1^\#(w) \, dw. \tag{26} \]

Proof. We integrate both sides of $-\Delta u_1 + V u_1 = \lambda_1 u_1$ over the level set $\Omega_t := \{\mathbf{r} \in \Omega : u_1(\mathbf{r}) > t\}$ and use Gauss’ Divergence Theorem to obtain
\[ \int_{\partial \Omega_t} |\nabla u_1| \, H_{n-1}(dr) = \int_{\Omega_t} (\lambda_1 - V(\mathbf{r}))\, u_1(\mathbf{r}) \, d^n r, \tag{27} \]
where $\partial \Omega_t = \{\mathbf{r} \in \Omega : u_1(\mathbf{r}) = t\}$. Now we define the distribution function $\mu(t) = |\Omega_t|$. Using the coarea formula, the Cauchy-Schwarz inequality and the classical isoperimetric inequality, Talenti derives ([18], p. 709, Eq. (32))
\[ \int_{\partial \Omega_t} |\nabla u_1| \, H_{n-1}(dr) \ge -n^2 C_n^{2/n} \frac{\mu(t)^{2 - 2/n}}{\mu'(t)}. \tag{28} \]
The left sides of (27) and (28) are the same, thus
\[ \begin{aligned} -n^2 C_n^{2/n} \frac{\mu(t)^{2 - 2/n}}{\mu'(t)} &\le \int_{\Omega_t} (\lambda_1 - V(\mathbf{r}))\, u_1(\mathbf{r}) \, d^n r \\ &\le \int_{\Omega_t^\star} (\lambda_1 - V_\star(\mathbf{r}))\, u_1^\star(\mathbf{r}) \, d^n r \\ &\le \int_{\Omega_t^\star} (\lambda_1 - \tilde V(\mathbf{r}))\, u_1^\star(\mathbf{r}) \, d^n r \\ &= \int_0^{(\mu(t)/C_n)^{1/n}} n C_n r^{n-1} (\lambda_1 - \tilde V(r))\, u_1^\star(r) \, dr. \end{aligned} \]
Now we perform the change of variables $r \to s$ on the right-hand side of the above chain of inequalities. We also choose $t$ to be $u_1^\#(s)$. Using the fact that $u_1^\#$ and $\mu$ are essentially inverse functions to one another, this means that $\mu(t) = s$ and $\mu'(t)^{-1} = (u_1^\#)'(s)$. The result is (25). Equation (26) is proven analogously.

Fact 5.1 enables us to prove Lemma 5.1. We have $u_1^\#(|S_1|) > z_1^\#(|S_1|) = 0$. Being equally normalized, $u_1^\star$ and $z_1$ must have at least one intersection on $[0, R_1]$. Thus $u_1^\#$ and $z_1^\#$ have at least one intersection on $[0, |S_1|]$. Now assume that they intersect at least twice. Then there is an interval $[s_1, s_2] \subset [0, |S_1|]$ such that $u_1^\#(s) > z_1^\#(s)$ for $s \in (s_1, s_2)$, $u_1^\#(s_2) = z_1^\#(s_2)$ and either $u_1^\#(s_1) = z_1^\#(s_1)$ or $s_1 = 0$. There is also an interval $[s_3, s_4] \subset [s_2, |S_1|]$ with $u_1^\#(s) < z_1^\#(s)$ for $s \in (s_3, s_4)$, $u_1^\#(s_3) = z_1^\#(s_3)$ and $u_1^\#(s_4) = z_1^\#(s_4)$. Let further $\tilde s$ be the point where $\tilde V_\#(s) - \lambda_1(S_1, \tilde V)$ crosses zero (set $\tilde s = |S_1|$ if $\tilde V_\#(s) - \lambda_1$ does not cross zero on $[0, |S_1|]$). To keep our notation compact we will write
\[ I_a^b[u] = \int_a^b (\lambda_1 - \tilde V_\#(w))\, u(w) \, dw. \]
Case 1. Assume $\tilde s \ge s_2$. Then $\tilde V_\#(s) - \lambda_1(S_1, \tilde V)$ is negative for $s < s_2$. Set
\[ v(s) = \begin{cases} u_1^\#(s) \text{ on } [0, s_1] & \text{if } I_0^{s_1}[u_1^\#] > I_0^{s_1}[z_1^\#], \\ z_1^\#(s) \text{ on } [0, s_1] & \text{if } I_0^{s_1}[u_1^\#] \le I_0^{s_1}[z_1^\#], \\ u_1^\#(s) \text{ on } [s_1, s_2], \\ z_1^\#(s) \text{ on } [s_2, |S_1|]. \end{cases} \]
Using Fact 5.1, one can check that then $v(s)$ fulfills the inequality
\[ -\frac{dv}{ds} \le n^{-2} C_n^{-2/n} s^{2/n - 2} \int_0^s (\lambda_1 - \tilde V_\#(w))\, v(w) \, dw. \tag{29} \]

Case 2. Assume $\tilde s < s_2$. Then $\tilde V_\#(s) - \lambda_1(S_1, \tilde V)$ is positive for $s \ge s_3$. Set
\[ v(s) = \begin{cases} u_1^\#(s) \text{ on } [0, s_3] & \text{if } I_0^{s_3}[u_1^\#] > I_0^{s_3}[z_1^\#], \\ z_1^\#(s) \text{ on } [0, s_3] & \text{if } I_0^{s_3}[u_1^\#] \le I_0^{s_3}[z_1^\#], \\ u_1^\#(s) \text{ on } [s_3, s_4], \\ z_1^\#(s) \text{ on } [s_4, |S_1|]. \end{cases} \]
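Again using Fact 5.1, one can check that also in this case $v(s)$ fulfills the inequality (29). Now define the test function
\[ \Psi(\mathbf{r}) = v(C_n r^n) = v(s). \]
Then we use the Rayleigh characterization of $\lambda_1$, Eq. (29) and integration by parts to calculate
\[ \begin{aligned} \lambda_1 \int_{S_1} \Psi(\mathbf{r})^2 \, d^n r &< \int_{S_1} \left( |\nabla \Psi|^2 + \tilde V(r) \Psi^2 \right) d^n r \\ &= \int_0^{|S_1|} \left( v'(s)^2 n^2 s^{2 - 2/n} C_n^{2/n} + \tilde V_\#(s) v^2(s) \right) ds \\ &\le \int_0^{|S_1|} \left( -v'(s) \int_0^s (\lambda_1 - \tilde V_\#(w))\, v(w) \, dw + \tilde V_\#(s) v^2(s) \right) ds \\ &= \int_0^{|S_1|} \left( v(s) (\lambda_1 - \tilde V_\#(s))\, v(s) + \tilde V_\#(s) v^2(s) \right) ds \\ &= \lambda_1 \int_{S_1} \Psi(\mathbf{r})^2 \, d^n r. \end{aligned} \]
This is a contradiction to our original assumption that $u_1^\#$ and $z_1^\#$ have more than one intersection, thus proving Lemma 5.1.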
6. Proof of Theorem 2.2

Proof of Theorem 2.2. The first eigenfunction of $-\Delta + V$ on $B_R$ is radially symmetric and will be called $z_1(r)$. Further, a standard separation of variables and the Baumgartner-Grosse-Martin inequality [7, 4] imply that we can write a basis of the space of second eigenfunctions in the form $z_2(r) \cdot r_i \cdot r^{-1}$. The radial parts $z_1$ and $z_2$ of the first and the second eigenfunction, which we assume to be positive, solve the differential equations
\[ \begin{aligned} -z_1''(r) - \frac{n-1}{r} z_1'(r) + \left( V(r) - \lambda_1 \right) z_1(r) &= 0, \\ -z_2''(r) - \frac{n-1}{r} z_2'(r) + \left( \frac{n-1}{r^2} + V(r) - \lambda_2 \right) z_2(r) &= 0 \end{aligned} \tag{30} \]
with the boundary conditions
\[ z_1'(0) = 0, \quad z_1(R) = 0, \quad z_2(0) = 0, \quad z_2(R) = 0. \tag{31} \]
We define the rescaled functions $\tilde z_{1/2}(r) = z_{1/2}(\beta r)$. Putting $\beta r$ (with $\beta > 0$) instead of $r$ into Eq. (30) and multiplying by $\beta^2$ yields the rescaled equations
\[ \begin{aligned} -\tilde z_1''(r) - \frac{n-1}{r} \tilde z_1'(r) + \left( \beta^2 V(\beta r) - \beta^2 \lambda_1 \right) \tilde z_1(r) &= 0, \\ -\tilde z_2''(r) - \frac{n-1}{r} \tilde z_2'(r) + \left( \frac{n-1}{r^2} + \beta^2 V(\beta r) - \beta^2 \lambda_2 \right) \tilde z_2(r) &= 0. \end{aligned} \]
We conclude that $\tilde z_1$ and $\tilde z_2$ are the radial parts of the first two eigenfunctions of $-\Delta + \beta^2 V(\beta r)$ on $B_{R/\beta}$ to the eigenvalues $\beta^2 \lambda_1$ and $\beta^2 \lambda_2$. Consequently, if we replace $R$ by $R/\beta$ and $V(r)$ by $\beta^2 V(\beta r)$, then the ratio $\lambda_2/\lambda_1$ does not change.
For the rest of this section we shall write $\lambda_{1/2}(R, V)$ instead of $\lambda_{1/2}(B_R, V)$. We also fix two radii $0 < R_1 < R_2$ and let $\rho(\beta)$ for $\beta > 1$ be the function defined implicitly by
\[ \lambda_1(\rho(\beta), V(r)) = \lambda_1(R_2/\beta, \beta^2 V(\beta r)). \tag{32} \]
Then we have $\rho(1) = R_2$. By domain monotonicity of $\lambda_1$ and because $V(r)$ is increasing and positive we see that the right-hand side of (32) is increasing in $\beta$. Therefore, again by domain monotonicity, $\rho(\beta)$ must be decreasing in $\beta$. One can also check that $\rho(\beta)$ is a continuous function and that $\rho(\beta)$ goes to zero for $\beta \to \infty$. Thus we can find $\beta_0 > 1$ such that $\rho(\beta_0) = R_1$. Then we can apply Theorem 2.1, with $B_{R_2/\beta_0}$ for $\Omega$ and $B_{\rho(\beta_0)}$ for $S_1$, as well as $\beta_0^2 V(\beta_0 r)$ for $V$ and $V(r)$ for $\tilde V$, to get
\[ \lambda_2(R_2/\beta_0, \beta_0^2 V(\beta_0 r)) \le \lambda_2(\rho(\beta_0), V(r)) = \lambda_2(R_1, V(r)). \tag{33} \]
But by what has been said above about the scaling properties of the problem, we have
\[ \frac{\lambda_2(R_2/\beta_0, \beta_0^2 V(\beta_0 r))}{\lambda_1(R_2/\beta_0, \beta_0^2 V(\beta_0 r))} = \frac{\lambda_2(R_2, V(r))}{\lambda_1(R_2, V(r))}. \tag{34} \]
Combining (32) for $\beta = \beta_0$, (33) and (34), we get
\[ \frac{\lambda_2(R_1, V(r))}{\lambda_1(R_1, V(r))} \ge \frac{\lambda_2(R_2, V(r))}{\lambda_1(R_2, V(r))}. \tag{35} \]
Because $R_1$ and $R_2$ were chosen arbitrarily, this proves Theorem 2.2.
7. Proof of Theorem 3.1

Before we prove Theorem 3.1 we need to state the following technical lemma:

Lemma 7.1. Let $a, b, c, d > 0$ with $a \ge b$, $d \ge b$ and $\frac{a}{b} < \frac{c}{d}$. Then
\[ \frac{a+x}{b+x} < \frac{c+x}{d+x} \]
holds for any $x > 0$.

Proof. Define the function
\[ f(x) := \frac{c+x}{d+x} - \frac{a+x}{b+x}, \]
then $f(0) > 0$. A straightforward calculation shows that $f$ has exactly one zero at
\[ x_0 = -\frac{bc - ad}{b + c - a - d}. \]
The numerator $bc - ad$ in the expression for $x_0$ is positive because of the condition $a/b < c/d$. For the denominator we get
\[ b + c - a - d > c + b - \frac{bc}{d} - d = \frac{(d-b)(c-d)}{d} \ge 0. \]
This means that $x_0 < 0$, such that $f(x) > 0$ for all $x > 0$.
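Lemma 7.1 and the location of the zero $x_0$ lend themselves to a quick numerical sanity check. The sketch below uses exact rationals; the sampling scheme (building $a$, $d$, $c$ from $b$ so that the hypotheses hold by construction) and all function names are assumptions chosen for illustration, not part of the proof.

```python
import random
from fractions import Fraction


def hypotheses_hold(a, b, c, d):
    """a, b, c, d > 0 with a >= b, d >= b and a/b < c/d."""
    return min(a, b, c, d) > 0 and a >= b and d >= b and a / b < c / d


def f(a, b, c, d, x):
    """f(x) = (c + x)/(d + x) - (a + x)/(b + x) from the proof."""
    return (c + x) / (d + x) - (a + x) / (b + x)


def x0(a, b, c, d):
    """The unique zero of f claimed in the proof."""
    return -(b * c - a * d) / (b + c - a - d)


def count_violations(trials=500, seed=0):
    """Sample admissible (a, b, c, d) and x > 0; count failures of the lemma."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        b = Fraction(rng.randint(1, 40), rng.randint(1, 10))
        a = b + Fraction(rng.randint(0, 40), rng.randint(1, 10))
        d = b + Fraction(rng.randint(0, 40), rng.randint(1, 10))
        c = a * d / b + Fraction(rng.randint(1, 40), rng.randint(1, 10))
        x = Fraction(rng.randint(1, 400), rng.randint(1, 10))
        assert hypotheses_hold(a, b, c, d)  # guaranteed by construction
        if not f(a, b, c, d, x) > 0 or not x0(a, b, c, d) < 0:
            bad += 1
    return bad
```

For instance, with $(a, b, c, d) = (3, 2, 8, 4)$ the zero sits at $x_0 = -4/3$, where both quotients equal $5/2$.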
Proof of Theorem 3.1. Choose some $x > 0$. From Theorem 2.2 we know that
\[ \frac{\lambda_2(B_{R+x}, r^2)}{\lambda_1(B_{R+x}, r^2)} < \frac{\lambda_2(B_R, r^2)}{\lambda_1(B_R, r^2)} \quad \text{for } x > 0. \]
Moreover, $\lambda_1(B_R, r^2) \ge \lambda_1(B_{R+x}, r^2)$ and $\lambda_2(B_{R+x}, r^2) > \lambda_1(B_{R+x}, r^2)$. Thus we can apply first (7), then Lemma 7.1 and then (7) again, to get
\[ \frac{\lambda_2^+(B_{R+x})}{\lambda_1^+(B_{R+x})} = \frac{\lambda_2(B_{R+x}, r^2) + n}{\lambda_1(B_{R+x}, r^2) + n} < \frac{\lambda_2(B_R, r^2) + n}{\lambda_1(B_R, r^2) + n} = \frac{\lambda_2^+(B_R)}{\lambda_1^+(B_R)}. \]
References

1. Ashbaugh, M.S., Benguria, R.D.: A second proof of the Payne-Pólya-Weinberger conjecture. Commun. Math. Phys. 147, 181–190 (1992)
2. Ashbaugh, M.S., Benguria, R.D.: A sharp bound for the ratio of the first two Dirichlet eigenvalues of a domain in a hemisphere of $S^n$. Trans. AMS 353, No. 3, 1055–1087 (2000)
3. Ashbaugh, M.S., Benguria, R.D.: A sharp bound for the ratio of the first two eigenvalues of Dirichlet Laplacians and extensions. Ann. Math. 135, 601–628 (1992)
4. Ashbaugh, M.S., Benguria, R.D.: Log-concavity of the ground state of Schrödinger operators: A new proof of the Baumgartner-Grosse-Martin inequality. Phys. Lett. A 131, No. 4,5, 273–276 (1988)
5. Ashbaugh, M.S., Benguria, R.D.: Isoperimetric inequalities for eigenvalue ratios. In: Partial Differential Equations of Elliptic Type, Cortona, 1992, A. Alvino, E. Fabes, G. Talenti, eds., Symposia Mathematica, Vol. 35, Cambridge: Cambridge University Press, 1994, pp. 1–36
6. Betta, M.F., Chiacchio, F., Ferone, A.: Isoperimetric estimates for the first eigenfunction of a class of linear elliptic problems. To appear in ZAMP
7. Baumgartner, B., Grosse, H., Martin, A.: The Laplacian of the potential and the order of energy levels. Phys. Lett. 146B, No. 5, 363–366 (1984)
8. Baumgartner, B., Grosse, H., Martin, A.: Order of levels in potential models. Nucl. Phys. B 254, 528–542 (1985)
9. Borell, C.: The Brunn-Minkowski inequality in Gauss space. Invent. Math. 30, (1975)
10. Faber, G.: Beweis, dass unter allen homogenen Membranen von gleicher Fläche und gleicher Spannung die kreisförmige den tiefsten Grundton gibt. Sitzungsberichte der mathematisch-physikalischen Klasse der Bayerischen Akademie der Wissenschaften zu München Jahrgang, 1923, pp. 169–172
11. Haile, C.: A second eigenvalue bound for the Dirichlet Schrödinger equation with a radially symmetric potential. Electronic J. Differ. Eq. 2000, No. 10, 1–19 (2000)
12. Krahn, E.: Über eine von Rayleigh formulierte Minimaleigenschaft des Kreises. Math. Ann. 94, 97–100 (1925)
13. Krahn, E.: Über Minimaleigenschaften der Kugel in drei und mehr Dimensionen. Acta Comm. Univ. Tartu (Dorpat) A9, 1–44 (1926). [English translation: Minimal properties of the sphere in three and more dimensions, Edgar Krahn 1894–1961: A Centenary Volume, Ü. Lumiste and J. Peetre, eds., Amsterdam: IOS Press, 1994, pp. 139–174]
14. Luttinger, J.M.: Generalized isoperimetric inequalities. Proc. Nat. Acad. Sci. USA 70, 1005–1006 (1973)
15. Mercaldo, A., Posteraro, M.R., Brock, F.: On Schwarz and Steiner symmetrization with respect to a measure. Preprint
16. Payne, L.E., Pólya, G., Weinberger, H.F.: Sur le quotient de deux fréquences propres consécutives. Comptes Rendus Acad. Sci. Paris 241, 917–919 (1955)
17. Payne, L.E., Pólya, G., Weinberger, H.F.: On the ratio of consecutive eigenvalues. J. Math. Phys. 35, 289–298 (1956)
18. Talenti, G.: Elliptic equations and rearrangements. Ann. Scuola Norm. Sup. Pisa (4) 3, 697–718 (1976)

Communicated by B. Simon
Commun. Math. Phys. 267, 757–782 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0042-0
Communications in
Mathematical Physics
Some Computations in the Cyclic Permutations of Completely Rational Nets Feng Xu Department of Mathematics, University of California at Riverside, Riverside, CA 92521, USA. E-mail: [email protected] Received: 2 January 2006/Accepted: 14 January 2006 Published online: 9 May 2006 – © Springer-Verlag 2006
Abstract: In this paper we calculate certain chiral quantities from the cyclic permutation orbifold of a general completely rational net. We determine the fusion of a fundamental soliton, and by suitably adapting arguments of A. Coste, T. Gannon and especially P. Bantay to our setting we are able to prove a number of arithmetic properties, including congruence subgroup properties for the S, T matrices of a completely rational net defined by K.-H. Rehren and rationality of the central charge.

1. Introduction

Let $\mathcal{A}$ be a completely rational net (cf. Definition 2.6). Then $\mathcal{A} \otimes \mathcal{A} \otimes \dots \otimes \mathcal{A}$ ($n$ tensors) admits an action of $\mathbb{Z}_n$ by cyclic permutations. The corresponding orbifold net is referred to as the ($n$th order) cyclic permutation orbifold of $\mathcal{A}$. This construction has been used both in the mathematics and the physics literature (for a partial list, see [1–3, 5] and references therein). In [23], this construction was used for $n = 2$ to show that strong additivity is automatic in a conformal net with finite $\mu$ index. In [18], the construction was used to demonstrate applications of general orbifold theories among other things. The starting motivation of this paper is to improve a result (Prop. 9.4 in [18]) on fusion of a fundamental soliton. The second motivation came from two papers: one by A. Coste and T. Gannon (cf. [7]), where under certain conditions they showed that the S, T matrices verified congruence subgroup properties, and one by P. Bantay (cf. [1]), where he showed that congruence subgroup properties hold if a number of heuristic arguments, including what he called the “Orbifold Covariance Principle”, hold. This “Orbifold Covariance Principle” of P. Bantay is highly nontrivial even in concrete examples, and at present the only conceptual framework in which this principle is a theorem is the framework of local conformal nets (cf. Sect. 2.1), where the principle follows from Theorem 2.7. In the language of local conformal nets the S, T matrices were defined by K.-H. Rehren (cf. 
Supported in part by NSF.
[26]) by using local data of conformal nets, and in all known cases they agree with the “S,T” matrices coming from modular transformations of characters. If one is interested in the modular tensor categories, then these “S,T” matrices of Rehren are sufficient for calculations of three manifold invariants (cf. for example [30]). It is therefore an interesting question to see if one can adapt the methods of A. Coste and T. Gannon and P. Bantay to Rehren’s “S,T” matrices. In this paper we will show that Prop. 9.4 of [18] holds in general (cf. Th. 3.6), and that a suitable modification of the arguments of A. Coste and T. Gannon and P. Bantay is possible in the conformal net setting, and congruence subgroup properties hold for Rehren’s “S,T” matrices (cf. Th. 4.9). Our key observation is the squares of nets in §3. By using (3) of Lemma 2.10, which relates the chiral data of a net and subnet for suitably chosen squares, we are able to obtain strong constraints on certain matrices (cf. Th. 3.12). These squares are in fact commuting squares first considered in the setting of $II_1$ factors by S. Popa in [24], and they already played an important role in the setting of nets in [33]. However the “commuting” property of these squares will not play an explicit role in this paper. Theorem 3.12 allows us to apply the methods of P. Bantay in [1] to obtain arithmetic properties of Rehren’s “S,T” matrices in Th. 4.5 and Th. 4.9. We note that Th. 3.12 implies series of arithmetic properties of “S,T” matrices, and even the first one as observed in [18] seems to be nontrivial for concrete examples like the nets coming from $SU(n)$ at level $k$.

The rest of this paper is organized as follows: In §2, after recalling basic definitions of completely rational nets, Rehren’s S, T matrices, orbifolds and Galois actions, we state a few general results from §9 of [18] which will be used in §3 and §4. In §3 we improve Prop. 9.4 of [18] in Th. 3.6, and we present the proof of Th. 3.12 as mentioned above from a commuting square. In §4, by modifying the arguments of A. Coste and T. Gannon and P. Bantay to incorporate phase factors, we are able to prove Th. 4.5 and Th. 4.9. We note that the arguments in §4 can be simplified if one can prove a conjecture on Page 734 of [18].

2. Conformal Nets, Complete Rationality, and Orbifolds

For the convenience of the reader we collect here some basic notions that appear in this paper. This is only a guideline and the reader should look at the references for a more complete treatment.

2.1. Conformal nets on $S^1$. By an interval of the circle we mean an open connected non-empty subset $I$ of $S^1$ such that the interior of its complement $I'$ is not empty. We denote by $\mathcal{I}$ the family of all intervals of $S^1$. A net $\mathcal{A}$ of von Neumann algebras on $S^1$ is a map
\[ I \in \mathcal{I} \to \mathcal{A}(I) \subset B(\mathcal{H}) \]
from $\mathcal{I}$ to von Neumann algebras on a fixed Hilbert space $\mathcal{H}$ that satisfies:

A. Isotony. If $I_1 \subset I_2$ belong to $\mathcal{I}$, then $\mathcal{A}(I_1) \subset \mathcal{A}(I_2)$.

The net $\mathcal{A}$ is called local if it satisfies:
B. Locality. If $I_1, I_2 \in \mathcal{I}$ and $I_1 \cap I_2 = \emptyset$ then
\[ [\mathcal{A}(I_1), \mathcal{A}(I_2)] = \{0\}, \]
where brackets denote the commutator.

The net $\mathcal{A}$ is called Möbius covariant if in addition it satisfies the following properties C, D, E, F:

C. Möbius covariance. There exists a strongly continuous unitary representation $U$ of the Möbius group Möb (isomorphic to $PSU(1,1)$) on $\mathcal{H}$ such that
\[ U(g) \mathcal{A}(I) U(g)^* = \mathcal{A}(gI), \quad g \in \text{Möb}, \ I \in \mathcal{I}. \]
If $E \subset S^1$ is any region, we shall put $\mathcal{A}(E) \equiv \bigvee_{E \supset I \in \mathcal{I}} \mathcal{A}(I)$ with $\mathcal{A}(E) = \mathbb{C}$ if $E$ has empty interior (the symbol $\vee$ denotes the von Neumann algebra generated). Note that the definition of $\mathcal{A}(E)$ remains the same if $E$ is an interval, namely: if $\{I_n\}$ is an increasing sequence of intervals and $\cup_n I_n = I$, then the $\mathcal{A}(I_n)$'s generate $\mathcal{A}(I)$ (consider a sequence of elements $g_n \in$ Möb converging to the identity such that $g_n I \subset I_n$).

D. Positivity of the energy. The generator of the one-parameter rotation subgroup of $U$ (conformal Hamiltonian) is positive.

E. Existence of the vacuum. There exists a unit $U$-invariant vector $\Omega \in \mathcal{H}$ (vacuum vector), and $\Omega$ is cyclic for the von Neumann algebra $\bigvee_{I \in \mathcal{I}} \mathcal{A}(I)$.

By the Reeh-Schlieder theorem $\Omega$ is cyclic and separating for every fixed $\mathcal{A}(I)$. The modular objects associated with $(\mathcal{A}(I), \Omega)$ have a geometric meaning
\[ \Delta_I^{it} = U(\Lambda_I(2\pi t)), \qquad J_I = U(r_I). \]
Here $\Lambda_I$ is a canonical one-parameter subgroup of Möb and $U(r_I)$ is an antiunitary acting geometrically on $\mathcal{A}$ as a reflection $r_I$ on $S^1$. This implies Haag duality:
\[ \mathcal{A}(I)' = \mathcal{A}(I'), \quad I \in \mathcal{I}, \]
where $I'$ is the interior of $S^1 \setminus I$.

F. Irreducibility. $\bigvee_{I \in \mathcal{I}} \mathcal{A}(I) = B(\mathcal{H})$. Indeed $\mathcal{A}$ is irreducible iff $\Omega$ is the unique $U$-invariant vector (up to scalar multiples). Also $\mathcal{A}$ is irreducible iff the local von Neumann algebras $\mathcal{A}(I)$ are factors. In this case they are $III_1$-factors in Connes' classification of type III factors (unless $\mathcal{A}(I) = \mathbb{C}$ for all $I$).

By a conformal net (or diffeomorphism covariant net) $\mathcal{A}$ we shall mean a Möbius covariant net such that the following holds:

G. Conformal covariance. There exists a projective unitary representation $U$ of $\text{Diff}(S^1)$ on $\mathcal{H}$ extending the unitary representation of Möb such that for all $I \in \mathcal{I}$ we have
\[ \begin{aligned} U(g) \mathcal{A}(I) U(g)^* &= \mathcal{A}(gI), \quad g \in \text{Diff}(S^1), \\ U(g) x U(g)^* &= x, \quad x \in \mathcal{A}(I), \ g \in \text{Diff}(I'), \end{aligned} \]
where $\text{Diff}(S^1)$ denotes the group of smooth, positively oriented diffeomorphisms of $S^1$ and $\text{Diff}(I)$ the subgroup of diffeomorphisms $g$ such that $g(z) = z$ for all $z \in I'$.

Let $G$ be a simply connected compact Lie group. By Th. 3.2 of [12], the vacuum positive energy representation of the loop group $LG$ (cf. [25]) at level $k$ gives rise to an irreducible conformal net denoted by $\mathcal{A}_{G_k}$. By Th. 3.3 of [12], every irreducible positive energy representation of the loop group $LG$ at level $k$ gives rise to an irreducible covariant representation of $\mathcal{A}_{G_k}$.

A (DHR) representation $\pi$ of $\mathcal{A}$ on a Hilbert space $\mathcal{H}$ is a map $I \in \mathcal{I} \to \pi_I$ that associates to each $I$ a normal representation of $\mathcal{A}(I)$ on $B(\mathcal{H})$ such that
\[ \pi_{\tilde I} \restriction \mathcal{A}(I) = \pi_I, \quad I \subset \tilde I, \quad I, \tilde I \in \mathcal{I}. \]
$\pi$ is said to be Möbius (resp. diffeomorphism) covariant if there is a projective unitary representation $U_\pi$ of Möb (resp. $\text{Diff}^{(\infty)}(S^1)$, the infinite cover of $\text{Diff}(S^1)$) on $\mathcal{H}$ such that
\[ \pi_{gI}(U(g) x U(g)^*) = U_\pi(g) \pi_I(x) U_\pi(g)^* \]
for all $I \in \mathcal{I}$, $x \in \mathcal{A}(I)$ and $g \in$ Möb (resp. $g \in \text{Diff}^{(\infty)}(S^1)$). Note that if $\pi$ is irreducible and diffeomorphism covariant then $U$ is indeed a projective unitary representation of $\text{Diff}(S^1)$.

Assume $\rho$ to be localized in $I$ and $\rho_I \in \text{End}(\mathcal{A}(I))$ to be irreducible with a conditional expectation $E : \mathcal{A}(I) \to \rho_I(\mathcal{A}(I))$. Then $\lambda_\rho := E(\epsilon)$ depends only on the unitary equivalence class of $\rho$, where $\epsilon$ is the braiding operator associated with $\rho$ (cf. [13, 14]). The statistical dimension $d(\rho)$ and the univalence $\omega_\rho$ are then defined by
\[ d(\rho) = |\lambda_\rho|^{-1}, \qquad \omega_\rho = \frac{\lambda_\rho}{|\lambda_\rho|}. \]
The conformal spin-statistics theorem shows that
\[ \omega_\rho = e^{i 2\pi \Delta_\rho}, \]
where $\Delta_\rho$ is the conformal dimension (the lowest eigenvalue of the generator of the rotation subgroup) in the representation $\rho$. The right hand side in the above equality is called the univalence of $\rho$. $d(\rho)^2$ will be called the index of $\rho$. The general index was first defined and investigated by Vaughan Jones in the case of $II_1$ factors in [16].

2.2. Rehren’s S, T-matrices. Next we will recall some of the results of [26] and introduce notations. Let $\{[\lambda], \lambda \in P\}$ be a finite set of all equivalence classes of irreducible, covariant, finite-index representations of an irreducible local conformal net $\mathcal{A}$. We will denote the conjugate of $[\lambda]$ by $[\bar\lambda]$ and the identity sector (corresponding to the vacuum representation) by $[1]$ if no confusion arises, and let $N_{\lambda\mu}^\nu = \langle [\lambda][\mu], [\nu] \rangle$. Here $\langle \mu, \nu \rangle$ denotes the dimension of the space of intertwiners from $\mu$ to $\nu$ (denoted by $\text{Hom}(\mu, \nu)$). We will denote
by {Te } a basis of isometries in Hom(ν, λμ). The univalence of λ and the statistical dimension of (cf. §2 of [13]) will be denoted by ωλ and d(λ) (or dλ )) respectively. Let ϕλ be the unique minimal left inverse of λ, define: Yλ,μ := d(λ)d(μ)ϕμ ( (μ, λ)∗ (λ, μ)∗ ),
(1)
where (μ, λ) is the unitary braiding operator (cf. [13]). We list two properties of Yλ,μ (cf. (5.13), (5.14) of [26]) which will be used in the following: Lemma 2.1. ∗ Yλ,μ = Yμ,λ = Yλ, μ¯ = Yλ¯ ,μ, ¯
Yλ,μ =
k
ν Nλμ
ωλ ωμ d(ν). ων
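We note that one may take the second equation in the above lemma as the definition of $Y_{\lambda,\mu}$. Define $a := \sum_i d_{\rho_i}^2 \omega_{\rho_i}^{-1}$. If the matrix $(Y_{\mu,\nu})$ is invertible, by the Proposition on p. 351 of [26] $a$ satisfies $|a|^2 = \sum_\lambda d(\lambda)^2$.
Definition 2.2. Let $a = |a| \exp\left(-2\pi i \frac{c_0(\mathcal{A})}{8}\right)$, where $c_0(\mathcal{A}) \in \mathbb{R}$ and $c_0(\mathcal{A})$ is well defined mod $8\mathbb{Z}$. For simplicity we will denote $c_0(\mathcal{A})$ simply as $c_0$ when the underlying $\mathcal{A}$ is clear. Define matrices
\[ S := |a|^{-1} Y, \qquad T := C\,\mathrm{Diag}(\omega_\lambda), \tag{2} \]
where
\[ C := \exp\left(-2\pi i \frac{c_0}{24}\right). \]
Then these matrices satisfy (cf. [26]):
Lemma 2.3.
\[ SS^\dagger = TT^\dagger = \mathrm{id}, \qquad STS = T^{-1}ST^{-1}, \qquad S^2 = \hat C, \qquad T\hat C = \hat C T, \]
where $\hat C_{\lambda\mu} = \delta_{\lambda\bar\mu}$ is the conjugation matrix. The above lemma shows that S, T as defined there give rise to a representation of the modular group denoted by $\Gamma(1)$. This is the group generated by the two matrices
\[ t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad s = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \]
and the representation is given by $s \to S$, $t \to T$. Let $r$ be a rational number. Throughout this paper we will use $T^r$ to denote a diagonal matrix whose $(\lambda, \lambda)$ entry is given by $\exp\left(2\pi i\left(\Delta_\lambda - \frac{c_0}{24}\right) r\right)$.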
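As a sanity check on Definition 2.2 and Lemma 2.3, the following sketch builds $Y$, $a$, $c_0$, $S$ and $T$ from the second equation of Lemma 2.1 for a toy input. The input is an assumption made only for illustration (it is not taken from the text): two sectors with $\mathbb{Z}_2$ fusion rules, statistical dimensions 1 and univalences $(1, i)$.

```python
import cmath
import math

# Assumed toy data (illustration only): two sectors 0, 1 with Z_2 fusion,
# d = (1, 1) and univalences omega = (1, i).
d = [1.0, 1.0]
omega = [1 + 0j, 1j]
# N[l][m][n] = N_{l m}^{n}: here the product of sectors l and m is (l + m) mod 2.
N = [[[1 if (l + m) % 2 == n else 0 for n in range(2)]
      for m in range(2)] for l in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A[0])))

# Second equation of Lemma 2.1, taken as the definition of Y.
Y = [[sum(N[l][m][n] * omega[l] * omega[m] / omega[n] * d[n] for n in range(2))
      for m in range(2)] for l in range(2)]

# a = sum_i d_i^2 / omega_i = |a| exp(-2 pi i c0 / 8), so c0 is read off the phase (mod 8).
a = sum(d[i] ** 2 / omega[i] for i in range(2))
c0 = (-8 * cmath.phase(a) / (2 * math.pi)) % 8
C = cmath.exp(-2j * math.pi * c0 / 24)

S = [[y / abs(a) for y in row] for row in Y]
T = [[C * omega[i] if i == j else 0j for j in range(2)] for i in range(2)]
Tinv = [[1 / (C * omega[i]) if i == j else 0j for j in range(2)] for i in range(2)]
I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]

print("c0 mod 8 =", round(c0, 9))
print("S unitary:", close(matmul(S, dagger(S)), I2))
print("STS = T^-1 S T^-1:", close(matmul(matmul(S, T), S), matmul(matmul(Tinv, S), Tinv)))
print("S^2 = conjugation matrix:", close(matmul(S, S), I2))
```

In this toy case both sectors are self-conjugate, so the conjugation matrix $\hat C$ is the identity and the last relation of Lemma 2.3 holds trivially.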
Moreover
\[ N_{\lambda\mu}^{\nu} = \sum_{\delta} \frac{S_{\lambda,\delta}\, S_{\mu,\delta}\, S^*_{\nu,\delta}}{S_{1,\delta}} \tag{3} \]
is known as the Verlinde formula. Sometimes we will refer to the S, T matrices as defined above as genus 0 modular matrices of $\mathcal{A}$ since they are constructed from the fusion rules, monodromies and minimal indices, which can be thought of as genus 0 chiral data associated to a Conformal Field Theory. Let $c$ be the central charge associated with the projective representations of $\mathrm{Diff}(S^1)$ of the conformal net $\mathcal{A}$ (cf. [17]). Note that by [8] $c$ is uniquely determined for a conformal net. We will see that $c$ is always rational for a completely rational net (see (4) of Th. 4.5 for a more refined statement). It is proved in Lemma 9.7 of [18] that $c_0 - c \in 4\mathbb{Z}$ for completely rational nets. The commutative algebra generated by the λ's with structure constants $N_{\lambda\mu}^{\nu}$ is called the fusion algebra of $\mathcal{A}$. If $Y$ is invertible, it follows from Lemma 2.3 and Eq. (3) that any nontrivial irreducible representation of the fusion algebra is of the form $\lambda \to \frac{S_{\lambda\mu}}{S_{1\mu}}$ for some μ.
2.3. The Galois action on Rehren's S, T matrices. The basic idea in the theory of the Galois action [6, 7] is to look at the field $F$ obtained by adjoining to the rationals $\mathbb{Q}$ the matrix elements of all modular transformations as defined after Lemma 2.3. One may show that, as a consequence of Verlinde's formula, $F$ is a finite Abelian extension of $\mathbb{Q}$. By the theorem of Kronecker and Weber this means that $F$ is a subfield of some cyclotomic field $\mathbb{Q}[\zeta_n]$ for some integer $n$, where $\zeta_n = \exp\left(\frac{2\pi i}{n}\right)$ is a primitive $n$-th root of unity. We'll call the conductor of $\mathcal{A}$ the smallest $n$ for which $F \subseteq \mathbb{Q}[\zeta_n]$ and which is divisible by the order of the T matrix. The above results imply that the Galois group $\mathrm{Gal}(F/\mathbb{Q})$ is a homomorphic image of the Galois group $G_n = \mathrm{Gal}(\mathbb{Q}[\zeta_n]/\mathbb{Q})$. But it is known that $G_n$ is isomorphic to the group $(\mathbb{Z}/n\mathbb{Z})^*$ of prime residues modulo $n$, its elements being the Frobenius maps $\sigma_l : \mathbb{Q}[\zeta_n] \to \mathbb{Q}[\zeta_n]$ that leave $\mathbb{Q}$ fixed, and send $\zeta_n$ to $\zeta_n^l$ for $l$ coprime to $n$. Consequently, the maps $\sigma_l$ are automorphisms of $F$ over $\mathbb{Q}$.
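To make the role of the Verlinde formula (3) concrete, here is a small numerical sketch. It assumes, purely for illustration, the familiar $3 \times 3$ Ising-type S matrix with sectors ordered $(1, \sigma, \psi)$; this matrix is not taken from the text. Its entries lie in $\mathbb{Q}(\sqrt{2})$, yet formula (3) returns non-negative integers.

```python
import math

# Assumed example input (not from the text): Ising-type S matrix, order (1, sigma, psi).
r2 = math.sqrt(2.0)
S = [[0.5,     r2 / 2,  0.5],
     [r2 / 2,  0.0,    -r2 / 2],
     [0.5,    -r2 / 2,  0.5]]

def verlinde(S, lam, mu, nu):
    """N_{lam mu}^{nu} = sum_delta S[lam][delta] S[mu][delta] conj(S[nu][delta]) / S[0][delta]."""
    return sum(S[lam][dl] * S[mu][dl] * S[nu][dl].conjugate() / S[0][dl]
               for dl in range(len(S)))

# Round to integers; the raw values are integers up to floating point error.
fusion = [[[round(verlinde(S, l, m, n)) for n in range(3)]
           for m in range(3)] for l in range(3)]

print("sigma x sigma ->", fusion[1][1])  # multiplicities of (1, sigma, psi)
print("sigma x psi   ->", fusion[1][2])
print("psi x psi     ->", fusion[2][2])
```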
According to [7], we have (for $l$ coprime to the conductor)
\[ \sigma_l(S_{\lambda,\mu}) = \varepsilon_l(\mu)\, S_{\lambda,\pi_l(\mu)} \tag{4} \]
for some permutation $\pi_l \in \mathrm{Sym}(P)$ of the irreducible representations and some function $\varepsilon_l : P \to \{-1, +1\}$. Upon introducing the orthogonal monomial matrices
\[ (G_l)_{\lambda,\mu} = \varepsilon_l(\mu)\,\delta_{\lambda,\pi_l(\mu)} \tag{5} \]
and denoting by $\sigma_l(M)$ the matrix that one obtains by applying $\sigma_l$ to $M$ elementwise, we have $\sigma_l(S) = SG_l = G_l^{-1}S$. Note that for $l$ and $m$ both coprime to the conductor
\[ \pi_{lm} = \pi_l\pi_m, \qquad G_{lm} = G_lG_m. \tag{6} \]
The Galois action on T is given by
\[ \sigma_l(T) = T^l. \tag{7} \]
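The signed permutation in Eq. (4) can be computed explicitly in a small case. The sketch below again assumes the Ising-type S matrix (an illustrative input, not from the text), whose entries lie in $\mathbb{Q}[\zeta_8]$. Each entry is stored as a hypothetical dictionary mapping an exponent $k$ to the rational coefficient of $\zeta_8^k$, so that $\sigma_l$ acts by $k \mapsto lk \bmod 8$.

```python
import cmath
from fractions import Fraction
from math import gcd

n = 8                      # entries are written in powers of zeta_8
half = Fraction(1, 2)
one = {0: half}            # 1/2
rt = {1: half, 7: half}    # sqrt(2)/2 = (zeta_8 + zeta_8^{-1})/2
mrt = {1: -half, 7: -half}
zero = {}
# Assumed example: Ising-type S matrix, sectors ordered (1, sigma, psi).
S = [[one, rt, one],
     [rt, zero, mrt],
     [one, mrt, one]]

def sigma(l, x):
    """Frobenius map sigma_l: zeta_8 -> zeta_8^l, applied to one field element."""
    out = {}
    for k, c in x.items():
        e = (l * k) % n
        out[e] = out.get(e, 0) + c
    return out

def value(x):
    """Numerical value of a field element."""
    return sum(float(c) * cmath.exp(2j * cmath.pi * k / n) for k, c in x.items())

def galois_action(l):
    """Return (pi_l, eps_l) with sigma_l(S)[lam][mu] = eps_l[mu] * S[lam][pi_l[mu]]."""
    V = [[value(x) for x in row] for row in S]
    W = [[value(sigma(l, x)) for x in row] for row in S]
    size = len(S)
    pi, eps = [], []
    for mu in range(size):
        matches = [(nu, e) for nu in range(size) for e in (1, -1)
                   if all(abs(W[lam][mu] - e * V[lam][nu]) < 1e-12 for lam in range(size))]
        nu, e = matches[0]
        pi.append(nu)
        eps.append(e)
    return pi, eps

units = [l for l in range(1, n) if gcd(l, n) == 1]  # (Z/8Z)^*
for l in units:
    print(l, galois_action(l))
```

In this example $\sigma_3$ sends $\sqrt{2}$ to $-\sqrt{2}$; the resulting permutation exchanges the first and third sectors and the sign is $-1$ on the middle one, which is one instance of the monomial matrix $G_l$ of Eq. (5).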
2.4. The orbifolds. Let $\mathcal{A}$ be an irreducible conformal net on a Hilbert space $\mathcal{H}$ and let $\Gamma$ be a finite group. Let $V : \Gamma \to U(\mathcal{H})$ be a unitary representation of $\Gamma$ on $\mathcal{H}$. If $V : \Gamma \to U(\mathcal{H})$ is not faithful, we set $\Gamma' := \Gamma/\ker V$.
Definition 2.4. We say that $\Gamma$ acts properly on $\mathcal{A}$ if the following conditions are satisfied: (1) For each fixed interval $I$ and each $g \in \Gamma$, $\alpha_g(a) := V(g)aV(g)^* \in \mathcal{A}(I)$, $\forall a \in \mathcal{A}(I)$; (2) $V(g)\Omega = \Omega$, $\forall g \in \Gamma$.
We note that if $\Gamma$ acts properly, then $V(g)$, $g \in \Gamma$, commutes with the unitary representation $U$ of Möb. Define $\mathcal{B}(I) := \{a \in \mathcal{A}(I) \mid \alpha_g(a) = a, \forall g \in \Gamma\}$ and $\mathcal{A}^\Gamma(I) := \mathcal{B}(I)P_0$ on $\mathcal{H}_0$, where $\mathcal{H}_0 := \{x \in \mathcal{H} \mid V(g)x = x, \forall g \in \Gamma\}$ and $P_0$ is the projection from $\mathcal{H}$ to $\mathcal{H}_0$. Then $U$ restricts to a unitary representation (still denoted by $U$) of Möb on $\mathcal{H}_0$. Then:
Proposition 2.5. The map $I \in \mathcal{I} \to \mathcal{A}^\Gamma(I)$ on $\mathcal{H}_0$ together with the unitary representation (still denoted by $U$) of Möb on $\mathcal{H}_0$ is an irreducible Möbius covariant net.
The irreducible Möbius covariant net in Prop. 2.5 will be denoted by $\mathcal{A}^\Gamma$ and will be called the orbifold of $\mathcal{A}$ with respect to $\Gamma$. When $\Gamma$ is generated by $h_1, \dots, h_k$, we will write $\mathcal{A}^\Gamma$ as $\mathcal{A}^{\langle h_1,\dots,h_k\rangle}$.
2.5. Complete rationality. We first recall some definitions from [19]. Recall that $\mathcal{I}$ denotes the set of intervals of $S^1$. Let $I_1, I_2 \in \mathcal{I}$. We say that $I_1, I_2$ are disjoint if $\bar I_1 \cap \bar I_2 = \emptyset$, where $\bar I$ is the closure of $I$ in $S^1$. Denote by $\mathcal{I}_2$ the set of unions of two disjoint elements in $\mathcal{I}$. Let $\mathcal{A}$ be an irreducible Möbius covariant net as in §2.1. For $E = I_1 \cup I_2 \in \mathcal{I}_2$, let $I_3 \cup I_4$ be the interior of the complement of $I_1 \cup I_2$ in $S^1$, where $I_3, I_4$ are disjoint intervals. Let
\[ \mathcal{A}(E) := \mathcal{A}(I_1) \vee \mathcal{A}(I_2), \qquad \hat{\mathcal{A}}(E) := (\mathcal{A}(I_3) \vee \mathcal{A}(I_4))'. \]
Note that $\mathcal{A}(E) \subset \hat{\mathcal{A}}(E)$. Recall that a net $\mathcal{A}$ is split if $\mathcal{A}(I_1) \vee \mathcal{A}(I_2)$ is naturally isomorphic to the tensor product of von Neumann algebras $\mathcal{A}(I_1) \otimes \mathcal{A}(I_2)$ for any disjoint intervals $I_1, I_2 \in \mathcal{I}$. $\mathcal{A}$ is strongly additive if $\mathcal{A}(I_1) \vee \mathcal{A}(I_2) = \mathcal{A}(I)$, where $I_1 \cup I_2$ is obtained by removing an interior point from $I$.
Definition 2.6 ([19]).
$\mathcal{A}$ is said to be completely rational if $\mathcal{A}$ is split, strongly additive, and the index $[\hat{\mathcal{A}}(E) : \mathcal{A}(E)]$ is finite for some $E \in \mathcal{I}_2$. The value of the index $[\hat{\mathcal{A}}(E) : \mathcal{A}(E)]$ (it is independent of $E$ by Prop. 5 of [19]) is denoted by $\mu_{\mathcal{A}}$ and is called the μ-index of $\mathcal{A}$. If the index $[\hat{\mathcal{A}}(E) : \mathcal{A}(E)]$ is infinity for some $E \in \mathcal{I}_2$, we define the μ-index of $\mathcal{A}$ to be infinity. A formula for the μ-index of a subnet is proved in [19]. With the result on strong additivity for $\mathcal{A}^\Gamma$ in [29], we have the complete rationality in the following theorem. Note that, by our recent results in [23], every irreducible, split, local conformal net with finite μ-index is automatically strongly additive.
Theorem 2.7. Let $\mathcal{A}$ be an irreducible Möbius covariant net and let $\Gamma$ be a finite group acting properly on $\mathcal{A}$. Suppose that $\mathcal{A}$ is completely rational. Then: (1) $\mathcal{A}^\Gamma$ is completely rational and $\mu_{\mathcal{A}^\Gamma} = |\Gamma|^2\mu_{\mathcal{A}}$; (2) There is only a finite number of irreducible covariant representations of $\mathcal{A}^\Gamma$ (up to unitary equivalence), and they give rise to a unitary modular category as defined in II.5 of [27] by the construction as given in §1.7 of [30].
Suppose that $\mathcal{A}$ and $\Gamma$ satisfy the assumptions of Th. 2.7. Then $\mathcal{A}^\Gamma$ has only a finite number of irreducible representations $\dot\lambda$ and
\[ \sum_{\dot\lambda} d(\dot\lambda)^2 = \mu_{\mathcal{A}^\Gamma} = |\Gamma|^2\mu_{\mathcal{A}}. \]
The set of such $\dot\lambda$'s is closed under conjugation and compositions, and by Cor. 32 of [19], the Y-matrix in (1) for $\mathcal{A}^\Gamma$ is non-degenerate, and we will denote the corresponding genus 0 modular matrices by $\dot S, \dot T$. Denote by $\dot\lambda$ (resp. μ) the irreducible covariant representations of $\mathcal{A}^\Gamma$ (resp. $\mathcal{A}$) with finite index. Denote by $b_{\mu\dot\lambda} \in \mathbb{N} \cup \{0\}$ the multiplicity of the representation $\dot\lambda$ which appears in the restriction of the representation μ when restricting from $\mathcal{A}$ to $\mathcal{A}^\Gamma$. The $b_{\mu\dot\lambda}$ are also known as the branching rules. An irreducible covariant representation $\dot\lambda$ of $\mathcal{A}^\Gamma$ is called an untwisted representation if $b_{\mu\dot\lambda} \neq 0$ for some representation μ of $\mathcal{A}$. These are representations of $\mathcal{A}^\Gamma$ which appear as subrepresentations in the restriction of some representation of $\mathcal{A}$ to $\mathcal{A}^\Gamma$. A representation is called twisted if it is not untwisted.
2.6. Induction and restriction for a net and its subnet. Let $\mathcal{A}$ be a Möbius covariant net. By a Möbius (resp. conformal) covariant subnet $\mathcal{B} \subset \mathcal{A}$ we mean a map $I \in \mathcal{I} \to \mathcal{B}(I) \subset \mathcal{A}(I)$ that associates to each $I \in \mathcal{I}$ a von Neumann subalgebra $\mathcal{B}(I)$ so that isotony and covariance with respect to the Möbius (resp. conformal) group hold. Given a bounded interval $I_0 \in \mathcal{I}_0$ we fix the canonical endomorphism $\gamma_{I_0}$ associated with $\mathcal{B}(I_0) \subset \mathcal{A}(I_0)$. Then we can choose for each $I \in \mathcal{I}_0$ with $I \supset I_0$ a canonical endomorphism $\gamma_I$ of $\mathcal{A}(I)$ into $\mathcal{B}(I)$ in such a way that $\gamma_I \restriction \mathcal{A}(I_0) = \gamma_{I_0}$ and $\gamma_{I_1}$ is the identity on $\mathcal{B}(I_1)$ if $I_1 \in \mathcal{I}_0$ is disjoint from $I_0$, where $\gamma_I \equiv \gamma_I \restriction \mathcal{B}(I)$. We then have an endomorphism γ of the $C^*$-algebra $\mathfrak{A} \equiv \overline{\cup_I \mathcal{A}(I)}$ ($I$ bounded interval of $\mathbb{R}$). Given a DHR endomorphism ρ of $\mathcal{B}$ localized in $I_0$, the induction $\alpha_\rho$ of ρ is the endomorphism of $\mathfrak{A}$ given by $\alpha_\rho \equiv \gamma^{-1} \cdot \mathrm{Ad}\,\varepsilon(\rho, \gamma) \cdot \rho \cdot \gamma$, where ε denotes the right braiding unitary symmetry (there is another choice for α associated with the left braiding). $\alpha_\rho$ is localized in a right half-line containing $I_0$, namely $\alpha_\rho$ is the identity on $\mathcal{A}(I)$ if $I$ is a bounded interval contained in the left complement of $I_0$ in $\mathbb{R}$.
Up to unitary equivalence, αρ is localizable in any right half-line, thus αρ is normal on left half-lines, that is to say, for every a ∈ R, αρ is normal on the C ∗ -algebra A(−∞, a) ≡ ∪ I ⊂(−∞,a) A(I ) (I bounded interval of R), namely αρ A(−∞, a) extends to a normal morphism of A(−∞, a). When there are several subnets involved, we will use notation αρB→A introduced in §3 of [32] to indicate the net and subnet where we apply the induction.
2.7. Preliminaries on cyclic orbifolds. In the rest of this paper we assume that $\mathcal{A}$ is completely rational. $\mathcal{D} := \mathcal{A} \otimes \mathcal{A} \otimes \dots \otimes \mathcal{A}$ ($n$-fold tensor product) and $\mathcal{B} := \mathcal{D}^{\mathbb{Z}_n}$ (resp. $\mathcal{D}^{P_n}$, where $P_n$ is the permutation group on $n$ letters) is the fixed point subnet of $\mathcal{D}$ under the action of cyclic permutations (resp. permutations). Recall that $J_0 = (0, \infty) \subset \mathbb{R}$. Note that the action of $\mathbb{Z}_n$ (resp. $P_n$) on $\mathcal{D}$ is faithful and proper. Let $v \in \mathcal{D}(J_0)$ be a unitary such that $\beta_g(v) = e^{\frac{2\pi i}{n}} v$ (such $v$ exists by p. 48 of [15]), where $g$ is the generator of the cyclic group $\mathbb{Z}_n$ and $\beta_g$ stands for the action of $g$ on $\mathcal{D}$. Note that $\sigma := \mathrm{Ad}_v$ is a DHR representation of $\mathcal{B}$ localized on $J_0$. Let $\gamma : \mathcal{D}(J_0) \to \mathcal{B}(J_0)$ be the canonical endomorphism from $\mathcal{D}(J_0)$ to $\mathcal{B}(J_0)$ and let $\gamma_{\mathcal{B}} := \gamma \restriction \mathcal{B}(J_0)$. Note $[\gamma] = [1] + [g] + \dots + [g^{n-1}]$ as sectors of $\mathcal{D}(J_0)$ and $[\gamma_{\mathcal{B}}] = [1] + [\sigma] + \dots + [\sigma^{n-1}]$ as sectors of $\mathcal{B}(J_0)$. Here $[g^i]$ denotes the sector of $\mathcal{D}(J_0)$ which is the automorphism induced by $g^i$. All the sectors considered in the rest of this paper will be sectors of $\mathcal{D}(J_0)$ or $\mathcal{B}(J_0)$ as should be clear from their definitions. All DHR representations will be assumed to be localized on $J_0$ and have finite statistical dimensions unless noted otherwise. For simplicity of notations, for a DHR representation $\sigma_0$ of $\mathcal{D}$ or $\mathcal{B}$ localized on $J_0$, we will use the same notation $\sigma_0$ to denote its restriction to $\mathcal{D}(J_0)$ or $\mathcal{B}(J_0)$ and we will make no distinction between local and global intertwiners for DHR representations localized on $J_0$ since they are the same by the strong additivity of $\mathcal{D}$ and $\mathcal{B}$. The following is Lemma 8.3 of [23]:
Lemma 2.8. Let μ be an irreducible DHR representation of $\mathcal{B}$. Let $i$ be any integer. Then: (1) $G(\mu, \sigma^i) := \varepsilon(\mu, \sigma^i)\varepsilon(\sigma^i, \mu) \in \mathbb{C}$, $G(\mu, \sigma)^i = G(\mu, \sigma^i)$. Moreover $G(\mu, \sigma)^n = 1$; (2) If $\mu_1 \prec \mu_2\mu_3$ with $\mu_1, \mu_2, \mu_3$ irreducible, then $G(\mu_1, \sigma^i) = G(\mu_2, \sigma^i)G(\mu_3, \sigma^i)$; (3) μ is untwisted if and only if $G(\mu, \sigma) = 1$; (4) $G(\bar\mu, \sigma^i) = \overline{G(\mu, \sigma^i)}$.
2.8. One cycle case.
First we recall the construction of solitons for permutation orbifolds in §6 of [23]. Let $h : S^1 \setminus \{-1\} \simeq \mathbb{R} \to S^1$ be a smooth, orientation preserving, injective map which is smooth also at $\pm\infty$, namely the left and right limits $\lim_{z \to -1^\pm} \frac{d^n h}{dz^n}$ exist for all $n$. The range $h(S^1 \setminus \{-1\})$ is either $S^1$ minus a point or a (proper) interval of $S^1$. With $I \in \mathcal{I}$, $-1 \notin I$, we set $\Phi_{h,I} \equiv \mathrm{Ad}\,U(k)$, where $k \in \mathrm{Diff}(S^1)$ and $k(z) = h(z)$ for all $z \in I$ and $U$ is the projective unitary representation of $\mathrm{Diff}(S^1)$ associated with $\mathcal{A}$. Then $\Phi_{h,I}$ does not depend on the choice of $k \in \mathrm{Diff}(S^1)$ and $\Phi_h : I \to \Phi_{h,I}$ is a well defined soliton of $\mathcal{A}_0 \equiv \mathcal{A} \restriction \mathbb{R}$. Clearly $\Phi_h(\mathcal{A}_0(\mathbb{R}))'' = \mathcal{A}(h(S^1 \setminus \{-1\}))''$, thus $\Phi_h$ is irreducible if the range of $h$ is dense, otherwise it is a type III factor representation. It is easy to see that, in the last case, $\Phi_h$ does not depend on $h$ up to unitary equivalence. Let now $f : S^1 \to S^1$ be the degree $n$ map $f(z) \equiv z^n$. There are $n$ right inverses $h_i$, $i = 0, 1, \dots, n-1$, for $f$ ($n$-th roots); namely there are $n$ injective smooth maps $h_i : S^1 \setminus \{-1\} \to S^1$ such that $f(h_i(z)) = z$, $z \in S^1 \setminus \{-1\}$. The $h_i$'s are smooth also at $\pm\infty$.
Note that the ranges $h_i(S^1 \setminus \{-1\})$ are $n$ pairwise disjoint intervals of $S^1$, thus we may fix the labels of the $h_i$'s so that these intervals are counterclockwise ordered, namely we have $h_0(1) < h_1(1) < \dots < h_{n-1}(1) < h_0(1)$, and we choose $h_j = e^{\frac{2\pi i j}{n}} h_0$, $0 \le j \le n-1$. When no confusion arises, we will write $h_0$ simply as $z^{\frac{1}{n}}$ and $\Phi_{h_j,I}(x) = R_{\frac{2\pi j}{n}} R_{z^{\frac{1}{n}}}(x)$. For any interval $I$ of $\mathbb{R}$, we set
\[ \pi_{1,\{0,1,\dots,n-1\},I} \equiv \chi_I \cdot (\Phi_{h_0,I} \otimes \Phi_{h_1,I} \otimes \dots \otimes \Phi_{h_{n-1},I}), \tag{8} \]
where $\chi_I$ is the natural isomorphism from $\mathcal{A}(I_0) \otimes \dots \otimes \mathcal{A}(I_{n-1})$ to $\mathcal{A}(I_0) \vee \dots \vee \mathcal{A}(I_{n-1})$ given by the split property, with $I_k \equiv h_k(I)$. Clearly $\pi_{1,\{0,1,\dots,n-1\}}$ is a soliton of $\mathcal{D}_0 \equiv \mathcal{A}_0 \otimes \mathcal{A}_0 \otimes \dots \otimes \mathcal{A}_0$ ($n$-fold tensor product). Let $p \in P_n$. We set
\[ \pi_{1,\{p(0),p(1),\dots,p(n-1)\}} = \pi_{1,\{0,1,\dots,n-1\}} \cdot \beta_{p^{-1}}, \tag{9} \]
where β is the natural action of $P_n$ on $\mathcal{D}$, and $\pi_{1,\{0,1,\dots,n-1\}}$ is as in (8). Let λ be a DHR representation of $\mathcal{A}$. Given an interval $I \subset S^1 \setminus \{-1\}$, we set
Definition 2.9. $\pi_{\lambda,\{p(0),p(1),\dots,p(n-1)\},I}(x) = \pi_{\lambda,J}(\pi_{1,\{p(0),p(1),\dots,p(n-1)\},I}(x))$, $x \in \mathcal{D}(I)$, where $\pi_{1,\{p(0),p(1),\dots,p(n-1)\},I}$ is defined as in (9), and $J$ is any interval which contains $I_0 \cup I_1 \cup \dots \cup I_{n-1}$. Denote the corresponding soliton by $\pi_{\lambda,\{p(0),p(1),\dots,p(n-1)\}}$. When $p$ is the identity element in $P_n$, we will denote the corresponding soliton by $\pi_{\lambda,n}$.
2.9. Some properties of S matrix for general orbifolds. Let $\mathcal{A}$ be a completely rational conformal net and let $\Gamma$ be a finite group acting properly on $\mathcal{A}$. By Th. 2.7 $\mathcal{A}^\Gamma$ has only finitely many irreducible representations. We use $\dot\lambda$ (resp. μ) to label representations of $\mathcal{A}^\Gamma$ (resp. $\mathcal{A}$). We will denote the corresponding genus 0 modular matrices by $\dot S, \dot T$. Denote by $\dot\lambda$ (resp. μ) the irreducible covariant representations of $\mathcal{A}^\Gamma$ (resp. $\mathcal{A}$) with finite index. Recall that $b_{\mu\dot\lambda} \in \mathbb{Z}$ denotes the multiplicity of the representation $\dot\lambda$ which appears in the restriction of the representation μ when restricting from $\mathcal{A}$ to $\mathcal{A}^\Gamma$. The $b_{\mu\dot\lambda}$ are also known as the branching rules. We have:
Lemma 2.10. (1) If τ is an automorphism (i.e., $d(\tau) = 1$) then $S_{\tau(\lambda)\mu} = G_1(\tau, \mu)^* S_{\lambda\mu}$, where $\tau(\lambda) := \tau\lambda$, $G_1(\tau, \mu) = \epsilon(\tau, \mu)\epsilon(\mu, \tau)$; (2) For any $h \in \Gamma$, let $h(\lambda)$ be the DHR representation $\lambda \cdot \mathrm{Ad}_{h^{-1}}$. Then $S_{\lambda\mu} = S_{h(\lambda)h(\mu)}$; (3) If $[\alpha_{\dot\lambda}] = [\mu\alpha_{\dot\delta}]$, then for any $\dot\lambda_1, \mu_1$ with $b_{\dot\lambda_1\mu_1} \neq 0$ we have
\[ \frac{\dot S_{\dot\lambda\dot\lambda_1}}{\dot S_{\dot 1\dot\lambda_1}} = \frac{S_{\mu\mu_1}}{S_{1\mu_1}}\, \frac{\dot S_{\dot\delta\dot\lambda_1}}{\dot S_{\dot 1\dot\lambda_1}}; \]
(4) We can choose $c_0(\mathcal{A}^\Gamma)$ so that $c_0(\mathcal{A}^\Gamma) = c_0(\mathcal{A})$ (cf. Definition 2.2).
Proof. (1), (2) follows from Lemma 9.1 of [18]. (3) follows from the proof on p.182 of [31] or Lemma 6.4 of [4]. 2.10. Fusions of solitons in cyclic orbifolds. Let B ⊂ D be as in Sect. 2.7. We note that Th. 8.4 of [18] gives a list of all irreducible representations of B.
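Remark 2.11. For simplicity we will label the representation $\pi_{\lambda,g^j,i}$ ($g = (01 \dots n-1)$) by $(\lambda, g^j, i)$, and when $i = 0$ (resp. $j = 0$), which stands for the trivial representation, we will denote the corresponding representation simply as $(\lambda, g^j)$ (resp. $(\lambda, 1)$). When 1 is used to denote the representation of a net, it will always be the vacuum representation.
Lemma 2.12. (1) $G(\sigma, (\mu, g)) = e^{\frac{2\pi i}{n}}$; (2) There exists an automorphism $\tau_{n,\mathcal{A}}$, $[\tau_{n,\mathcal{A}}^2] = [1]$, such that
\[ S_{(\lambda,1),(\mu,g)} = \frac{1}{n} S_{\lambda,\tau_{n,\mathcal{A}}(\mu)}. \]
For simplicity we will denote $\tau_{n,\mathcal{A}}$ simply as $\tau_n$ when the underlying net $\mathcal{A}$ is clear.
Proof. (1) follows from Remark 4.18 in [9], and (2) follows from Lemma 9.3 of [18].
Remark 2.13. By (4) of Lemma 2.10, we can choose $c_0(\mathcal{B})$ so that $c_0(\mathcal{B}) = nc_0(\mathcal{A})$. We will make such a choice in the rest of this paper.
3. Squares of Conformal Nets
Definition 3.1. Let $\mathcal{A}_i$, $1 \le i \le 4$, be four Möbius covariant nets such that $\mathcal{A}_3 \subset \mathcal{A}_2$, $\mathcal{A}_2 \subset \mathcal{A}_1$, $\mathcal{A}_3 \subset \mathcal{A}_4$ and $\mathcal{A}_4 \subset \mathcal{A}_1$ are subnets. Then the square
\[ \begin{array}{ccc} \mathcal{A}_2 & \subset & \mathcal{A}_1 \\ \cup & & \cup \\ \mathcal{A}_3 & \subset & \mathcal{A}_4 \end{array} \]
is called a square of Möbius covariant nets.
Let $N = nk$, $g = (123 \dots N)$, $\mathcal{D} := \mathcal{A} \otimes \mathcal{A} \otimes \dots \otimes \mathcal{A}$. Then $g^n$ is a product of $n$ cycles $g_1, \dots, g_n$ of length $k$, with $g_{i+1} = (i\,(i+n)\,(i+2n) \dots (i+(k-1)n))$, $0 \le i \le n-1$. The following square of conformal nets plays an important role in this paper:
\[ \begin{array}{ccc} \mathcal{D}^{\langle g\rangle} & \subset & \mathcal{D}^{\langle g^n\rangle} \\ \cup & & \cup \\ \mathcal{B}_1 := \mathcal{D}^{\langle g,g_1,\dots,g_n\rangle} & \subset & \mathcal{B}_2 := \mathcal{D}^{\langle g_1,\dots,g_n\rangle} \end{array} \]
Proposition 3.2. (1) We identify $\mathcal{D}^{\langle g_1,\dots,g_n\rangle}$ with $n$ tensor products of a $k$-th order cyclic permutation orbifold $(\mathcal{A} \otimes \dots \otimes \mathcal{A})^{\langle h_1\rangle}$ in a natural way. Then $\mathcal{D}^{\langle g,g_1,\dots,g_n\rangle} \subset \mathcal{D}^{\langle g_1,\dots,g_n\rangle}$ is a cyclic permutation of order $n$; denote by $h_2$ the cyclic permutation on $\mathcal{D}^{\langle g_1,\dots,g_n\rangle}$ which comes from the permutation $(01 \dots (n-1))(n(n+1) \dots (n+n-1)) \dots ((k-1)n \dots ((k-1)n+n-1))$ of $\mathcal{D}$;
(2) $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,h_1),h_2)} \succ (\lambda, g)$;
(3) $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,h_1^i),1)} = (\lambda, g^{ni})$;
(4) When $(k, n) = 1$, $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,1),h_2^k)} \succ (\lambda, g^k)$.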
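The combinatorial fact behind the square above, that $g^n$ splits into $n$ disjoint cycles of length $k$ when $N = nk$, can be checked directly. The sketch below is an illustration only: it uses the labelling $\{0, \dots, N-1\}$ with $g : i \mapsto i + 1 \bmod N$, which is an assumed convention chosen to match the cycles $(i\,(i+n) \dots (i+(k-1)n))$, and `cycles_of_power` is a hypothetical helper name.

```python
def cycles_of_power(N, n):
    """Cycle decomposition of g^n, where g sends i to i + 1 mod N on {0, ..., N-1}."""
    seen, cycles = set(), []
    for start in range(N):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = (i + n) % N  # one application of g^n
        cycles.append(cycle)
    return cycles

n, k = 2, 3
N = n * k
for cycle in cycles_of_power(N, n):
    print(cycle)
```

Each printed cycle has the form $(i, i+n, \dots, i+(k-1)n)$ with $0 \le i \le n-1$, as in the definition of $g_{i+1}$.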
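Proof. (1) follows from definition. As for (2), we first show that $(\lambda, g)$ and $((\lambda, h_1), h_2)$ come from the restriction of the same soliton of $\mathcal{D}$. This can be seen from Definition 2.9 as follows: $(\lambda, g)$ comes from a soliton of $\mathcal{D}$ defined by:
\[ x_0 \otimes x_1 \otimes \dots \otimes x_{N-1} \in \mathcal{A}(I) \otimes \dots \otimes \mathcal{A}(I) \to \pi_\lambda\big(R_{z^{\frac{1}{N}}}(x_0) \vee R_{\frac{2\pi}{N}} R_{z^{\frac{1}{N}}}(x_1) \vee \dots \vee R_{\frac{2\pi(N-1)}{N}} R_{z^{\frac{1}{N}}}(x_{N-1})\big). \]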
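Let $y_i = x_i \otimes x_{n+i} \otimes \dots \otimes x_{n(k-1)+i}$, $0 \le i \le n-1$. Then $((\lambda, h_1), h_2)$ comes from a soliton of $\mathcal{D}$ defined by
\[ \begin{aligned} y_0 \otimes y_1 \otimes \dots \otimes y_{n-1} &\to \pi_{\lambda,h_1}\big(R_{z^{\frac{1}{n}}}(y_0) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(y_1) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(y_{n-1})\big) \\ &= \pi_\lambda\big(R_{z^{\frac{1}{N}}}(x_0) \vee R_{\frac{2\pi}{N}} R_{z^{\frac{1}{N}}}(x_1) \vee \dots \vee R_{\frac{2\pi(N-1)}{N}} R_{z^{\frac{1}{N}}}(x_{N-1})\big), \end{aligned} \]
where we have used
\[ R_{z^{\frac{1}{k}}} R_{\frac{2\pi i}{n}}(x) = R_{\frac{2\pi i}{kn}} R_{z^{\frac{1}{k}}}(x), \qquad R_{z^{\frac{1}{k}}} R_{z^{\frac{1}{n}}}(x) = R_{z^{\frac{1}{nk}}}(x), \qquad \forall x \in \mathcal{A}(I). \]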
Now by Th. 4.8 of [18], $(\lambda, g)$ is the component of the above soliton where $g$ acts trivially, and $((\lambda, h_1), h_2)$ is the component of the same soliton where $g, g_1, \dots, g_n$ act trivially. It follows that the restriction of $(\lambda, g)$ to $\mathcal{B}_1$ contains $((\lambda, h_1), h_2)$, and (2) is proved. To prove (3), we first show that $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,h_1^i),1)} \succ (\lambda, g^{ni})$. As in (2) it is sufficient to show that $((\lambda, h_1^i), 1)$, $(\lambda, g^{ni})$ come from restrictions of the same soliton of $\mathcal{D}$, and as in (2) this follows by definition. By using the index formula in Th. 4.5 and (3) of Prop. 7.4 of [18], we have $d(((\lambda, h_1^i), 1)) = d((\lambda, g^{ni}))$, and (3) is proved. (4) is proved in a similar way as in (2): we check that $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,1),h_2^k)} \succ (\lambda, g^k)$ by showing that $((\lambda, 1), h_2^k)$, $(\lambda, g^k)$
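come from the same soliton of $\mathcal{D}$. By Definition 2.9 $((\lambda, 1), h_2^k)$ comes from a soliton of $\mathcal{D}$ defined by
\[ \begin{aligned} x_0 \otimes x_1 \otimes \dots \otimes x_{N-1} \in \mathcal{A}(I) \otimes \dots \otimes \mathcal{A}(I) \to{} & \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_0) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{-k(n-1)})\big) \\ \otimes{} & \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_{-1}) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{-1-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{-1-k(n-1)})\big) \\ \otimes{} & \dots \otimes \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_{-k+1}) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{-k+1-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{-k+1-k(n-1)})\big), \end{aligned} \]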
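where indices are defined modulo $N$. Let $y_{ki} = x_{ki} \otimes x_{n+ki} \otimes \dots \otimes x_{n(k-1)+ki}$, $0 \le i \le n-1$. Since $(n, k) = 1$, $(\lambda, g^k)$ comes from a soliton of $\mathcal{D}^{\langle g_1,\dots,g_n\rangle}$ defined by
\[ \begin{aligned} \pi_{\lambda,(0,k,2k,\dots,k(n-1))}(y_0 \otimes y_1 \otimes \dots \otimes y_{n-1}) &= \pi_{\lambda,(0,1,2,\dots,(n-1))}(y_0 \otimes y_{-k} \otimes \dots \otimes y_{-k(n-1)}) \\ &= \pi_\lambda\big(R_{z^{\frac{1}{n}}}(y_0) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(y_{-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(y_{-(n-1)k})\big), \end{aligned} \]
where the indices are defined modulo $n$. Then the soliton of $\mathcal{D}^{\langle g_1,\dots,g_n\rangle}$ above comes from the restriction of the soliton of $\mathcal{D}$ defined by
\[ \begin{aligned} x_0 \otimes x_1 \otimes \dots \otimes x_{N-1} \to{} & \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_0) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{-k(n-1)})\big) \\ \otimes{} & \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_{-n}) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{-n-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{-n-k(n-1)})\big) \\ \otimes{} & \dots \otimes \pi_\lambda\big(R_{z^{\frac{1}{n}}}(x_{n(-k+1)}) \vee R_{\frac{2\pi}{n}} R_{z^{\frac{1}{n}}}(x_{n(-k+1)-k}) \vee \dots \vee R_{\frac{2\pi(n-1)}{n}} R_{z^{\frac{1}{n}}}(x_{n(-k+1)-k(n-1)})\big), \end{aligned} \]
which up to unitary equivalence (the unitary is a permutation of the tensor factors in the Hilbert space) is the same as the soliton defined at the beginning of the proof of (4). Thus we have shown that
\[ \alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle g\rangle}}_{((\lambda,1),h_2^k)} \succ (\lambda, g^k). \]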
3.1. Constraints on certain automorphisms. For simplicity of notations we define τk,n := τn,(A⊗A⊗...⊗A)Zk . Proposition 3.3.
(1) τn,A⊗A...⊗A = τn,A ⊗ τn,A . . . ⊗ τn,A (k tensors);
(2) τk,n = (τn , 1, jk,n ) with k|2 jk,n . Proof. Ad (1): Consider inclusions of sunbets B2 ⊂ Dg
n
⊂ D. Note that by definition
g n
B2 →D α(λ = (λ1 ⊗ λ2 . . . ⊗ λn , g n ). 1 ,g1 )⊗(λ2 ,g2 )⊗...⊗(λn ,gn )
By Lemma 2.10 we have S(λ1 ⊗...⊗λn ,gn ),(μ1 ⊗...μn ,1) S(λ1 ,g1 )⊗(λ2 ,g2 )⊗...⊗(λn ,gn ),(μ1 ,1)⊗(μ2 ,1)⊗...⊗(μn ,1) = . S1⊗...⊗1,(μ1 ,1)⊗...⊗(μn ,1) S1⊗...⊗1,μ1 ⊗...⊗μn By Lemma 2.12 it follows that S(τk,A⊗...⊗A (λ1 ⊗...⊗λn )),μ1 ⊗...⊗μn = S(τk,A λ1 ⊗...⊗τk,A λn ),μ1 ⊗...⊗μn . By unitarity of S matrix, and by replacing k with n, (1) is proved. Ad (2): it is sufficient to show that ατk,n = τn , where the induction is with respect to the k th cyclic permutation orbifold (A ⊗ A . . . ⊗ A)Zk and A ⊗ A . . . ⊗ A (k tensors). First we note that since d(τk,n ) = 1, and any twisted representation of (A⊗A . . .⊗A)Zk has index greater or equal to k 2 by Th. 4.5 and Prop. 7.4 of [18], it follows that ατk,n = β is a DHR representation of A ⊗ A . . . ⊗ A (k tensors). Consider the square of nets: h D ⊂ D where n B1 = Dg,g1 ,...,gn ⊂ B2 := Dg h = (012 . . . n − 1)(n(n + 1) . . . (n + n − 1)) . . . (((k − 1)n) ((k − 1)n + 1) . . . ((k − 1)n + n − 1)).
By definition $\alpha^{\mathcal{B}_1 \to \mathcal{D}^{\langle h\rangle}}_{((\lambda,1),h_2)} = (\lambda, h)$, where by a slight abuse of notations we use λ to denote an irreducible DHR representation of $\mathcal{A} \otimes \dots \otimes \mathcal{A}$ ($k$ tensors). By using (1), Lemma 2.12 and Lemma 2.10 we have $S_{\beta\lambda,\mu} = S_{\tau_n\lambda,\mu}$ for all λ, μ. By unitarity of the S matrix and the fact that $[\tau_{k,n}^2] = [1]$, (2) is proved.
Proposition 3.4. (1) $\tau_{k,n}(\lambda, h_1) = (\tau_N\tau_k\lambda, h_1, j)$ for some $0 \le j \le k-1$; (2) $\tau_N$ is the vacuum if $N$ is odd, and $\tau_N = \tau_2$ if $N$ is even; (3) $\tau_{k,n}$ is the vacuum if $n$ is odd, and $j_{k,n}$ as in Prop. 3.3 is 0 modulo $k$ if $k$ is odd.
Proof. Ad (1): By Prop. 3.3 we can assume that $\tau_{k,n}(\lambda, h_1) = (\mu, h_1, j)$. By Prop. 3.2 and Lemma 2.10 we have $S_{(\lambda,g)(\lambda_1,1)} = S_{\tau_{k,n}(\lambda,h_1),(\lambda_1,1)}$, hence $S_{\tau_k\mu,\lambda_1} = S_{\tau_N\lambda,\lambda_1}$ by (2) of Lemma 2.12. By unitarity of the S matrix, (1) is proved. Ad (2): By (1) we have $\alpha_{\tau_{k,n}(\lambda,h_1)} = \alpha_{(\tau_N\tau_k\lambda,h_1,j)}$, where the induction is with respect to the $k$-th order cyclic permutation orbifold of $\mathcal{A} \otimes \mathcal{A} \otimes \dots \otimes \mathcal{A}$ ($k$ tensors). Note that $\alpha_{\tau_{k,n}} = (\tau_n, \dots, \tau_n)$ by Prop. 3.3. It follows by Th. 8.6 of [23] that $\tau_n^k = \tau_N\tau_k$. Hence if $k$ is even, $\tau_N = \tau_k = \tau_2$, and if $k$ is odd, $\tau_{nk} = \tau_n\tau_k$. Choose $n$ even; we have $\tau_k$ is the vacuum when $k$ is odd. Ad (3): (3) follows from (2) and (2) of Prop. 3.3.
Remark 3.5. We can actually show that $\zeta_k^{j_{k,n}} = \zeta_2^{j_{2,2}}$ when $k, n$ are even integers, but this fact will not be used in the paper.
Theorem 3.6.
\[ [\pi^n_{1,\{0,1,\dots,n-1\}}] = \sum_{\lambda_1,\dots,\lambda_n} M_{\lambda_1,\dots,\lambda_n} [(\lambda_1, \dots, \lambda_n)], \]
where
\[ M_{\lambda_1,\dots,\lambda_n} := \sum_\lambda S_{1,\lambda}^{2-2g} \prod_{1 \le i \le n} \frac{S_{\lambda_i,\lambda}}{S_{1,\lambda}} \quad\text{with}\quad g = \frac{(n-1)(n-2)}{2}, \]
and $\pi_{1,\{0,1,\dots,n-1\}}$ is the soliton defined in Eq. (9).
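The coefficients in Theorem 3.6 have the shape of a higher-genus Verlinde-type sum. The sketch below evaluates them numerically, reading the definition as $M_{\lambda_1,\dots,\lambda_n} = \sum_\lambda S_{1,\lambda}^{2-2g} \prod_i S_{\lambda_i,\lambda}/S_{1,\lambda}$ with $g = (n-1)(n-2)/2$. Both this reading of the extracted formula and the Ising-type S matrix used as input are assumptions made for illustration.

```python
import math

# Assumed example input (not from the text): Ising-type S matrix, order (1, sigma, psi).
r2 = math.sqrt(2.0)
S = [[0.5,     r2 / 2,  0.5],
     [r2 / 2,  0.0,    -r2 / 2],
     [0.5,    -r2 / 2,  0.5]]

def M(S, lams):
    """Coefficient M_{lam_1,...,lam_n} for the tuple `lams` of sector indices."""
    n = len(lams)
    g = (n - 1) * (n - 2) // 2
    total = 0.0
    for lam in range(len(S)):
        term = S[0][lam] ** (2 - 2 * g)
        for li in lams:
            term *= S[li][lam] / S[0][lam]
        total += term
    return total

# n = 2 gives g = 0, and M reduces to the matrix S^2 (the conjugation matrix).
print([[round(M(S, (i, j)), 9) for j in range(3)] for i in range(3)])
# n = 3 gives g = 1; the values come out as non-negative integers.
print(round(M(S, (0, 0, 0)), 9), round(M(S, (1, 1, 0)), 9), round(M(S, (1, 1, 1)), 9))
```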
Proof. This is proved in Prop. 9.4 of [18] under the assumption $\tau_n^n = 1$. The assumption follows by Prop. 3.4. We note that the above theorem was conjectured on p. 759 of [18] as a consequence of another conjecture on p. 758 of [18] which states that $\tau_N$ is the vacuum for all $N$. By Prop. 3.4 it is now enough to prove that $\tau_2$ is the vacuum.
3.2. Properties of certain matrices. In this section we define and examine properties of certain matrices motivated by P. Bantay's $\Lambda$ matrices in [1] and [2], which we recall here for comparison. See [1] and [2] for more details. Suppose that a representation of the modular group $\Gamma(1)$ has been given. Let $r = \frac{k}{n}$ be a rational number in reduced form, i.e. with $n > 0$ and $k$ and $n$ coprime.
Choose integers $x$ and $y$ such that $kx - ny = 1$, and define $r^* = \frac{x}{n}$. Then $m = \begin{pmatrix} k & y \\ n & x \end{pmatrix}$ belongs to $\Gamma(1)$, and we define the matrix $\Lambda(r)$ via
\[ \Lambda(r)_{p,q} = \omega_p^{-r} M_{p,q}\, \omega_q^{-r^*}. \]
One should fix some definite branch of the logarithm to make the above definition meaningful, but different choices lead to equivalent results. See the remark after Lemma 2.3 for our choice for genus 0 modular matrices.
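The choice of $x$ and $y$ with $kx - ny = 1$ is an instance of the extended Euclidean algorithm. The following sketch computes one such pair together with $r^*$ and the matrix $m$; the helper names `ext_gcd` and `bantay_matrix` are hypothetical and introduced only for this illustration. Other valid pairs differ by $(x, y) \to (x + tn, y + tk)$, which changes $r^*$ by an integer.

```python
from fractions import Fraction
from math import gcd

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b), for a >= 0 and b >= 0."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def bantay_matrix(k, n):
    """For r = k/n in reduced form (n > 0, k >= 0), return (m, r_star) with det m = 1."""
    assert n > 0 and k >= 0 and gcd(k, n) == 1
    g, u, v = ext_gcd(k, n)      # u*k + v*n = 1
    x, y = u, -v                 # so that k*x - n*y = 1
    m = [[k, y], [n, x]]
    return m, Fraction(x, n)

for k, n in [(0, 1), (1, 2), (2, 5), (3, 7)]:
    m, r_star = bantay_matrix(k, n)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    print(f"r = {k}/{n}: m = {m}, r* = {r_star}, det = {det}")
```

For $r = 0$ (that is, $k = 0$, $n = 1$) this returns the generator $s$, consistent with $\Lambda(0) = S$.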
It is a simple matter to show that $\Lambda(r)$ is well defined, i.e. does not depend on the actual choice of $x$ and $y$, and $\Lambda(r)$ is periodic in $r$ with period 1, i.e. $\Lambda(r + 1) = \Lambda(r)$. For $r = 0$ we just get back the S matrix, $\Lambda(0) = S$, and for a positive integer $n$ we have
\[ \Lambda\left(\frac{1}{n}\right) = T^{-\frac{1}{n}} S^{-1} T^{-n} S T^{-\frac{1}{n}}. \tag{10} \]
Finally, we have
\[ \Lambda(r^*)^q_p = \Lambda(r)^p_q, \qquad \Lambda(-r)^q_p = \overline{\Lambda(r)^q_p}, \]
and the functional equation
\[ \Lambda\left(\frac{-1}{r}\right) = T^{\frac{1}{r}} S T^{r} \Lambda(r) T^{\hat r}, \tag{11} \]
where $\hat r = \frac{1}{kn}$.
Definition 3.7. When (i, N ) = 1, we define λ1 ,λ2
i N
λ1 ,λ2 (r ) = N Sλ1 ,λ2 , = N S(λ1 ,g),(λ2 ,gi ) ,
where r is any integer. We note that it follows from the definition that
i i λ1 ,λ2 = λ1 ,λ2 +1 . N N Proposition 3.8.
−i 1 j2 −i 2 j1
(1) S(λ1 ,gi1 , j1 ),(λ2 ,gi2 , j2 ) = ζ N
S(λ1 ,gi1 ),(λ2 ,gi2 ) ;
(2) If (i 1 , N ) = 1 and iˆ1 i 1 ≡ 1 mod N , then S(λ1 ,gi1 ),(λ2 ,gi2 ) = λ1 ,λ2 (r ) = λ2 ,λ1 (r ∗ ); (3) λ¯ ,λ (r ). λ1 ,λ2 (1 − r ) = (4) 1 2
1 N λ1 ,λ2
i 2 iˆ1 N
;
Proof. (1) follows from Lemma 2.10 and 2.8. For (2), let h ∈ S N so that hg i1 h −1 = g. Then hgh −1 = g i1 . By Lemma 2.10 and Definition 3.7, we have S(λ1 ,gi1 ),(λ2 ,gi2 ) = S(λ
i i 1 ,g),(λ1 ,g 1 2 )
1 λ ,λ = N 1 2
i2i1 . N
For (3), let h ∈ S N be such that h g i h −1 = g. Then h gh −1 = g i , and by Lemma 2.10 and the fact that S is symmetric we have S(λ1 ,gi ),(λ2 ,g) = S(λ1 ,g),(λ2 ,gi ) = S(λ2 ,gi ),(λ1 ,g) . This proves (3) by definition. For (4), by Prop. 6.1 of [23] the conjugate of (λ1 , g i ) is (λ¯ 1 , g −i ). (4) now follows from the definition and the property of the S matrix under conjugation. Proposition 3.9. S(λ1 ,g),(λ2 ,gni ) = S((λ1 ,h 1 ),h 2 ),((λ2 ,h i ),1) . 1
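Proof. Consider the square of nets
\[ \begin{array}{ccc} \mathcal{D}^{\langle g\rangle} & \subset & \mathcal{D}^{\langle g^n\rangle} \\ \cup & & \cup \\ \mathcal{B}_1 := \mathcal{D}^{\langle g,g_1,\dots,g_n\rangle} & \subset & \mathcal{B}_2 := \mathcal{D}^{\langle g_1,\dots,g_n\rangle} \end{array} \]
By Prop.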
3.2 and Lemma 2.10 we have S((λ2 ,h i ),1),((λ1 ,h 1 ),h 2 ) 1
S1,((λ1 ,h 1 ),h 2 )
=
S(λ2 ,gni ),(λ1 ,g) S1,(λ1 ,g)
.
But S1,((λ1 ,h 1 ),h 2 ) B1 S11
(kn−1)
= d(((λ1 , h 1 ), h 2 )) = d(λ1 )k n−1 μA 2
and S1,(λ1 ,g) Dg S11
(kn−1)
= d(λ1 )μA 2 ,
where we have used Th. 4.5 and (3) of Prop. 7.4 in [18] in the calculation above. On the other hand 1 1 √ √ g n √ μ = = k n μ , = μD = N μA , B A 1 g B1 D S11 S11 and using these equations we obtain S(λ1 ,g),(λ2 ,gni ) = S((λ1 ,h 1 ),1),((λ2 ,1),h 2 ). Proposition 3.10. Assume that (k, n) = 1 and N = kn. Then S(λ1 ,gn ),(λ2 ,gk ) = Sτk λ1 ,τn λ2 .
1 N
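Proof. Consider the square of nets
\[ \begin{array}{ccc} \mathcal{D}^{\langle g\rangle} & \subset & \mathcal{D}^{\langle g^n\rangle} \\ \cup & & \cup \\ \mathcal{B}_1 := \mathcal{D}^{\langle g,g_1,\dots,g_k\rangle} & \subset & \mathcal{B}_2 := \mathcal{D}^{\langle g_1,\dots,g_k\rangle} \end{array} \]
By Prop.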
3.2 and Lemma 2.10 we have S(λ1 ,gn ),(λ2 ,gk ) S((λ1 ,h 1 ),1),((λ2 ,1),h 2 ) = . S1,((λ2 ,1),h 2 ) S1,(λ2 ,gk ) But S1,((λ2 ,1),h 2 ) B1 S11
k(n−1)
= d(((λ2 , 1), h 2 )) = d(λ2 )k k n−1 μA 2
and S1,(λ2 ,gk ) Dg S11
k(n−1)
= d(λ2 )k μA 2 ,
where we have used Th. 4.5 and (3) of Prop. 7.4 in [18] in the calculation above. On the other hand 1 1 √ √ g n √ = μ = k n μ , = μD = N μA , B1 A g B1 D S11 S11 and using these equations we obtain S(λ1 ,gn ),(λ2 ,gk ) = S((λ1 ,h 1 ),1),((λ2 ,1),h 2 ) =
1 S(λ ,h ),(τ (λ ,1)) . n 1 1 k,n 2
j
Since (k, n) = 1, ζk k,n = 1 by Prop. 3.4, and we have 1 1 S(λ1 ,h 1 ),τk,n 1 1 S(λ1 ,h 1 ),(τk,n (λ2 ,1)) = Sτk λ1 ,λ2 = Sτk λ1 ,τn λ2 . n n S(λ1 ,h 1 ),1 k N To prepare the statement of the main theorem in this section, we define Definition 3.11. Let (k, n) = 1. Define a function g with value in Q mod Z by the following equations:
n k k k =g ±1 ,g +g g(0) = 1, g n n n k
2 −2πi n + k2 + 1 = (c − c0 ) 3nk − , 24 nk where c is the central charge of A and c0 is as in Definition 2.2. Such a function is clearly uniquely determined by the defining equations. We will give further properties of g in Prop. 4.7. Let be Bantay’s matrices as reviewed at the beginning of this section associated with genus 0 S, T matrices as defined after Definition 2.2. Then we have:
Theorem 3.12. λ1 ,λ2 (r ) = exp(2πig(r ))λ1 ,λ2 (r ), where r ∈ Q and g(r ) is as in Definition 3.11. Proof. The idea is to consider equation S = T ST ST for the net Dg with the order of g equal to nk and (n, k) = 1. Let us compute the (λ1 , g n ), (λ2 , g k−n ) entry on both sides of the equation. Since (k, n) = 1, (k − n, kn) = 1. Let x1 , x2 be integers such that x1 (k − n) + knx2 = 1. By Prop. 3.9 the left-hand side is S(λ1 ,gn ),(λ2 ,gk−n ) = S(λ1 ,gnx1 ),(λ2 ,g) = S((λ1 ,h x1 ),1),((λ2 ,h 1 ),h 2 ) . 1
By Lemma 2.12 and Prop. 3.4 we have S((λ1 ,h x1 ),1),((λ2 ,h 1 ),h 2 ) = 1
1 1 Sτk (λ1 ),τn λ1 λ2 x1 S = n (λ1 ,h 1 ),τk,n (λ2 ,h 1 ) N Sτk (λ1 ),1
k−n k
.
By (14) of [23] and Remark 2.13 we have
2 n (k − 1)(c − c0 ) , Tλ3 ,gk Tλ1 ,gn = Tλk1 exp 2πi 24k
2 k (n − 1)(c − c0 ) n = Tλ3 exp 2πi 24n and
2 2 1 (n k − 1)(c − c0 ) . Tλ2 ,gk−n = Tλnk2 exp 2πi 24nk
By using the above equations and Prop. 3.10 we obtain the (λ1 , g n ), (λ2 , g k−n ) entry on the RHS is given by 2 2 +1 2πi(c − c0 )(3nk − n +k 1 nk ) exp Tλ1 Sτk (λ1 ),τn (λ3 ) N 24 λ3
k k − n Sτk ,τn (λ3 ) kn1 λ3 ,λ2 Tλn3 Tλ2 . n S1λ3 Since (k, n) = 1, by Prop. 3.4 we have Sτk ,λ3 Sτ ,τ λ = k n 3, S1,λ3 S1,λ3 when k is odd, and when k is even, n must be odd and the above equation also holds. When comparing both the LHS and RHS, we see that the τ dependence canceled out : from both sides and we are left with the following equation for 2 2 +1
1 k 2πi(c − c0 )(3nk − n +k k−n k−n nk ) λ1 ,λ2 λ3 ,λ2 = exp Tλkn2 . Tλ1 Sλ1 ,λ3 Tλn3 k 24 n
Comparing with Eq. (11) of Bantay’s matrices and using Prop. 3.8 we conclude that there is a mod Z valued function as defined in Definition 3.11 such that λ1 ,λ2 (r ) = exp(2πig(r ))λ1 ,λ2 (r ). matrices completely, and hence the entries Note that the theorem above determined of S matrix as in Definition 3.71 . By using Verlinde’s formula, one can write down a matrices. Since fusion coefficients are series of equations of fusion rules in terms of matrinon-negative integers, these equations describe certain arithmetic properties of ces, and none of them seems too trivial for the case of conformal nets associated with SU (n) at level k, where S, T matrices are given (cf. [28]). We refer the reader to Cor. 9. 9 for such a statement in the case when N = 2. 4. Arithmetic Properties of S, T Matrices for a Completely Rational Net matrices. In this section we’ll study the Galois action in the 4.1. Galois action on cyclic permutation orbifold Dg as in [1]. By Th. 2.7 the Galois action on the genus 0 S-matrix elements of Dg may be described via suitable permutations πl of the irreducible representations of the orbifold and signs εl . This will in turn allow us to determine -matrices as defined in Definition 3.7. the Galois action on Let N be a positive integer , and as in §3 consider the cyclic permutation g = (1, . . . , N ). We will use C(N , A) to denote the conductor of the permutation orbifold Dg . Hence C(1, A) is the conductor of A. Note that C(1, A) depends on the choice of c0 (A) in Definition 2.2, and our choice of c0 (Dg ) is as in Remark 2.13. Among the irreducible representations of the permutation orbifold Dg there is a subset J of special relevance to us. The elements in J are labeled by triples (λ, g n , k), where λ is an irreducible representation of A, while n and k are integers mod N . The subset of those (λ, g n , k), where n is coprime to N will be denoted by J0 . 
It follows from (1) of Lemma 2.10 that (λ, g n , k) ∈ J0 have vanishing S-matrix elements with the labels not in J , while for (μ, g m , l) ∈ J we have:
m n 1 −(km+ln) n m , (12) λ,μ S(λ,g ,k),(μ,g ,l) = ζ N N N where n denotes the mod N inverse of n and ζ N = exp 2πi N . Lemma 4.1. εl (τ N (λ)) = εl (λ), τ N πl (λ) = πl (τ N λ). Sτ
,μ
N Proof. Note that by the property of τ N we have S1,μ = ±1. By definition of Galois actions we have Sτ ,μ Sτ ,μ σl (Sτ N λ,μ ) = N σl (Sλ,μ ) = N εl (λ)Sπl (λ),μ = εl (τ N λ)Sπl (τ N λ),μ . S1,μ S1,μ
Hence εl (λ)Sτ N πl (λ),μ = εl (τ N (λ))Sπl (τ N λ),μ . By unitarity of the S matrix the lemma is proved. 1 With little effort we can in fact determine all entries of the S matrix for the cyclic permutation orbifold using the methods of this chapter. However Th. 3.12 is enough for the purpose of this paper.
F. Xu
By using Lemma 4.1 and Th. 3.12, the proofs of Lemmas 1-3, Prop.1, Cor. 1 and Th. 1 of [1] go through. (In the statements of Lemmas 1-3, Prop. 1, Bantay’s matrix matrix, and the additional assumption on l is that l is cohas to be replaced by our prime to C(N , A) where N is the denominator of a rational number r as given in these statements.) For the reader’s convenience and to set up notations, we summarize Lemmas 1–3 and Prop. 1 of [1] in the following and sketch its proof. Lemma 4.2. Assume that l is coprime to the denominator N of r and C(N , A)C(1, A). Then: (1) The set J is invariant under the permutations π˜ l , i.e. π˜ l (J ) = J . For (λ, g n , k) ∈ J0 one has π˜ l λ, g n , k = πl (λ), gln , k˜ (13) for some function k˜ of l, λ, n and k, and ε˜l λ, g n , k = εl (λ), (2)
( (r ) = (lr ) G l Z l (r ∗ ) = Z l (r ) G −1 lr ), σl l
(14)
(15)
where Z l (r ) is a diagonal matrix whose order divides the denominator N of r , and l is the mod N inverse of l, and Z l (0) = I , Z l (r + 1) = Z l (r ); (3) lr )G l = Z lm (r )Z l−m (r ) G l−1 Z m (
(16)
whenever both l and m are coprime to the denominator N of r and C(N , A)C(1, A); (4) If n is coprime to the denominator of r , then Z ln (r ) = Z l (nr ) .
(17)
Proof. We give a proof of (1) following the proof of Bantay indicating necessary changes. First, let’s fix (λ, g n , k) ∈ J . According to Eq. (12), we have
nˆ 1 −k n S(λ,g ,k),(μ,1) = ζ N λ,μ N N and this expression differs from 0 for at least one μ, by the unitarity of -matrices and Th. 3.12. Select such a μ, and apply σl to both sides of the equation. One gets that ε˜l λ, g n , k Sπ˜ l (λ,gn ,k),(μ,1) = σl S(λ,gn ,k),(μ,1) n , k ∈ J because (μ, 1) ∈ J . differs from 0, but this can only happen if π ˜ g (λ, l 0 Next, for λ, g n , k ∈ J0 consider S(λ,gn ,k),(μ,1,m) =
1 −nm Sτ N λ,μ . ζ N N
Applying σl to both sides of the above equation we get from Eq.(4), ε˜l (λ, g n , k)Sπ˜ l (λ,gn ,k),(μ,1,m) =
1 −lnm ζ εl (τ N λ)Sπl (τ N λ),μ . N N
(18)
Some Computations in the Cyclic Permutations of Completely Rational Nets
But the lhs equals 1 −nm ζ ˜ Sτ N λ˜ ,μ N N according to Eq.(18) if π˜ l (λ, g n , k) = λ˜ , g n˜ , k˜ . Equating both sides we arrive at ε˜l (λ, g n , k)
−m(n−ln) ˜
Sτ N λ˜ ,μ = εl (τ N λ)˜εl (λ, g n , k)ζ N
Sπl (τ N λ),μ .
By Lemma 4.1 and the fact that Sτ N λ,μ =
Sτ N ,μ Sλ,μ S1,μ
we have −m(n−ln) ˜
Sλ˜ ,μ = εl (λ)˜εl (λ, n, k)ζ N
Sπl (λ),μ .
As the lhs is independent of m, we must have n˜ = ln
mod N
and λ˜ = πl (λ) as well as ε˜l (λ, g n , k) = εl (λ). The proof of (2)–(4) is the same as that of Bantay with his matrices replaced with our matrices. Let us sketch the proof of the following (cf. Th. 1 in [1]) theorem, indicating modifications compared to the proof in [1]: Theorem 4.3. Let A be a completely rational net and let the T -matrix be defined as 2 after Definition 2.2. Then for all l coprime to the conductor G l−1 T G l = T l . Proof. Let N be the order of T. Then N divides the conductor by definition. Choose l so −2πi(c−c0 )(N 2 −1) 1 that (l, 12C(1, A)C(N , A)) = 1. By Th. 3.12 we have ( N ) = exp 12N , we have ( 1 ). Following the argument of Bantay, with replaced by N
exp
−2l 2 −2 l −2πi(c − c0 )(N 2 − 1)(l 2 − 1) T N = G l−1 T N G l Z l2 . 12N N
Now we use the fact that since $(l, 12) = 1$, $12 \mid l^2 - 1$. The rest of the argument is the same as in [1], and we have $G_l^{-1} T G_l = T^{l^2}$ for $l$ with the property $(l, 12\,C(1, A)\,C(N, A)) = 1$. Now for any $l$ coprime to the conductor $C(1, A)$, by Dirichlet's theorem on arithmetic progressions we can always find an integer $p$ so that $l_1 = l + p\,C(1, A)$ has the property that $(l_1, 12\,C(1, A)\,C(N, A)) = 1$. Since $G_{l_1} = G_l$ and $T^{l_1^2} = T^{l^2}$, the theorem is proved for any $l$ coprime to the conductor.
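The two elementary number-theoretic steps used at the end of this proof can be checked by brute force. The following Python sketch is not part of the original argument: the function names and the sample moduli (6, 720) are illustrative assumptions and are not conductors of any particular net. It verifies that $l^2 - 1$ is divisible by 12 (in fact by 24) whenever $l$ is coprime to 12, and it searches for a shift $p$ such that $l + p\,C$ becomes coprime to a larger modulus.

```python
from math import gcd


def divides_square_minus_one(coprime_to, divisor, limit=2000):
    """Check that divisor | l^2 - 1 for every 1 <= l < limit with gcd(l, coprime_to) == 1."""
    return all(
        (l * l - 1) % divisor == 0
        for l in range(1, limit)
        if gcd(l, coprime_to) == 1
    )


def shift_to_coprime(l, step, modulus):
    """Return the smallest p >= 0 with gcd(l + p*step, modulus) == 1.

    Assumes gcd(l, step) == 1, which mirrors 'l coprime to the conductor';
    `step` plays the role of C(1, A) and `modulus` that of 12*C(1, A)*C(N, A).
    """
    if gcd(l, step) != 1:
        raise ValueError("l must be coprime to step")
    for p in range(modulus + 1):
        if gcd(l + p * step, modulus) == 1:
            return p
    raise RuntimeError("no admissible shift found in the search range")


if __name__ == "__main__":
    print(divides_square_minus_one(12, 12))
    print(divides_square_minus_one(12, 24))
    print(shift_to_coprime(5, 6, 720))
```

The search in `shift_to_coprime` only illustrates that a suitable $p$ exists in small examples; the proof itself relies on the theorem on arithmetic progressions rather than on a search.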
The above theorem has been conjectured in [7], where some of its consequences had been derived. Proposition 2 of [1] has to be modified due to a phase factor as follows:
Proposition 4.4. Let $r = \frac{n}{N}$. If $l$ is coprime to $C(N, A)\,C(1, A)\,N$, then
$$G_l^{-1}\, T^{r}\, G_l = T^{l^2 r}\, Z_l^{l}(r)\, \exp\left(\frac{-2\pi i (l^2 - 1)(c - c_0)\, r}{24}\right).$$
Proof. The proof is similar to the proof of Prop. 2 in [1] and we indicate modifications when necessary. Write r = Nn . The idea is to apply Th. 4.3 to Dg with the order of g equal to N . The phase factor comes in when we note that 1
T(λ,gn ,k) = ζ Nnk TλN exp
2πi(c − c0 ) 24
N−
1 N
.
Use Th. 4.3; we have l2 N
Tλ
2πi(c − c0 )(l 2 − 1) exp 24
1 N− N
1
= ζ Nlk0 TπNl (λ) ,
and the rest of the proof is as in [1]. By using Th. 4.3, Prop. 3–6 and Cor. 2 of [1] follows in our setting with the same proof (except (4) in the theorem below) as in [1] and [7]. We record these results in the following theorem: Theorem 4.5. Let A be completely rational net and let S, T be its genus 0 modular matrices as defined after Definition 2.2. Then: (1) For l coprime to the conductor,
G l = S −1 T l ST l ST l ,
(19)
where l denotes the inverse of l modulo the conductor; (2) The conductor equals the order N of T , and F = Q [ζ N ]; (3) Let N0 denote the order of the matrix ω0−1 T , i.e. the least common multiple of the denominators of the conformal weights. Then N = eN0 , where the integer e divides 12. Moreover, the greatest common divisor of e and N0 is either 1 or 2; (4) N0 times the central charge c is an even integer; (5) There exists a function N (r ) such that the conductor N divides N (r ) if the number of irreducible representations of A - i.e. the dimension of the modular representation - is r . Proof. Given Th. 4.3, (1), (2), (3), and (5) are proved in the same way as in [1]. As for (4), the proof of Cor. 2 in [1] shows that N0 c0 is an even integer. By Lemma 9.7 of [18] c − c0 ∈ 4Z and (4) is proved.
4.2. The kernel of the modular representation. In this section we consider the modular representation of a completely rational net A as defined after Lemma 2.3. We will show that this representation factorizes through a congruence subgroup. We refer the reader to [7] for a nice account of this and related questions. Recall that the kernel $K$ consists of those modular transformations which are represented by the identity matrix, i.e.
$$K = \{\, m \in \Gamma(1) \mid M_{\lambda,\mu} = \delta_{\lambda,\mu} \,\}.$$
ab Proposition 4.6. If m = ∈ (1) with d coprime to the conductor 2 . Let m e be ed an integer such that m e g( ae ) ∈ Z. Let l = d + meC(1, A) be such that l is coprime to 6m e C(|e|, A). Then
1 (l 2 − 1)(c − c0 ) a −g − . σd (M) = T b S −1 T −e σd (S) exp −2πi lg e e 24e Proof. According to Eq. (15) and Th. 3.12 a a a T d/e = σl T a/e exp −2πig T d/e . σl (M) = σl T a/e e e e By our assumption on l we have
σl T
a/e
exp −2πig
a e
a e
T
d/e
1 a −g = exp −2πi lg e e
d 1 2 Gl Zl T d /e . × T ad/e e e
But
1 = T −1/e S −1 T −c ST −1/e e so
1 a −g σl (M) = exp −2πi lg e e × T b S −1 T −e ST −1/e G l Z l (d/e)T
d2 e
.
From Prop. 4.4, T −1/e G l = G l T −l
2 /e
Z ll (−1/e) exp
2πi(l 2 − 1)(c − c0 ) . 24e
Putting all this together and using Lemma 4.2 we get the proposition. Next we show that the phase factor in the above proposition is always 1: 2 We use e instead of the more natural c since c has been used to denote the central charge.
Proposition 4.7. Let l be as in Prop. 4.6. Then
a 1 (l 2 − 1)(c − c0 ) exp −2πi lg −g − = 1. e e 24e Proof. Let 4x = (c − c0 ). By Lemma 9.7 of [18] x is an integer. Let us first prove the proposition for the case x = 2x2 is even. Choose an integer n so that 3n + x2 > 0 and consider a local net IE which is the 3n + x2 tensor product of the local net A(E 8 )1 . This net has μ index equal to one by Th. 3.18 of [9]. We choose our c0 (IE) = 24n in the definition of the T matrix for IE. The corresponding modular representation is trivial. IE (resp. IE ) the matrices as defined in Definition 3.7 (resp. Bantay’s We denote by matrices) associated with IE. Apply Th. 3.12 to IE; we have IE (a/e) = exp(2πig(a/e)) IE (a/e). By (2) of Th. 4.5 the conductor of the eth cyclic permutation orbifold of IE divides 3e. By conditions on l we can apply Prop. 4.6 to the net IE to have
1 (l 2 − 1)(c − c0 ) a b −1 −c −g − . σd (M) = T S T σd (S) exp −2πi lg e e 24e Since the modular representation for IE is trivial we must have
1 (l 2 − 1)(c − c0 ) a −g − =1 exp −2πi lg e e 24e and we have proved the proposition for x even. If x = 2(x1 + 2) − 3 is odd, we define a new set of S1 , T1 matrix by T1 = exp
−2πi x −2πi x T, S1 = exp T, 6 2
and denote by 1 the matrix of Bantay associated with S1 , T1 . Let g1 (k/n) be defined modulo integers such that 1 = exp(2πig1 (k/n)). From Definition 3.11 one checks easily that modulo integers g(k/n) =
n + g1 (k/n). 2
Using the assumption that l is odd it is now sufficient to check the proposition for g1 . From the defining equation for g1 , we see that exp 2πig1 (k/n) is Bantay’s matrix associated with the one dimensional representation of the modular group given by x −2πi x T → exp −2πi 6 , S → exp 2 . This representation is the tensor product of two one dimensional representations given by
−2πi(2x1 + 4) T → exp ,S →1 6 and T → exp
2πi 2πi , S → exp , 2 2
and we denote by g3 , g2 the associated matrices. Note that g1 (a/e) = g2 (a/e) + g3 (a/e). The same proof as in the x even case, with x2 = x1 + 2, shows that
1 (l 2 − 1)(2x1 + 4) a − g3 − = 1. exp −2πi lg3 e e 24e Hence to finish the proof we just have to show that
1 (l 2 − 1) a − g2 + = 1. exp −2πi lg2 e e 2e Since the associated modular representation is very simple, this can be checked directly using the following formulas:
−2πi(a + d) M2 (a, b, e, d), exp(2πig2 (a/e)) = exp 2e where M2 (a, b, e, d) = ±1. When e or d is even, M2 (a, b, e, d) = (−1)d or (−1)e ; when e, d are odd, M2 (a, b, e, d) = (−1)a+d+1 . By combining the above two propositions we have proved the following: Theorem 4.8. If d is coprime to the conductor, then σd (M) = T b S −1 T −e σd (S). Now Th. 2–4 of [1] follow exactly in the same way. Let us record these theorems in the following: Theorem 4.9. Let A be a completely rational net and consider the modular representation as defined after Lemma 2.3. Then:
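(1) Let $d$ be coprime to the conductor $N$. Then $\begin{pmatrix} a & b \\ e & d \end{pmatrix} \in \Gamma(1)$ belongs to the kernel $K$ if and only if
$$\sigma_d(S)\, T^{b} = T^{e} S; \tag{20}$$
(2) Define
$$\Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ e & d \end{pmatrix} \in \Gamma(1) \;\middle|\; a, d \equiv 1 \bmod N,\ e \equiv 0 \bmod N \right\}$$
and
$$\Gamma(N) = \left\{ \begin{pmatrix} a & b \\ e & d \end{pmatrix} \in \Gamma_1(N) \;\middle|\; b \equiv 0 \bmod N \right\}.$$
Then $K \cap \Gamma_1(N) = \Gamma(N)$. In particular, $K$ is a congruence subgroup of level $N$.
(3) Define $SL_2(N) \cong \Gamma(1)/\Gamma(N)$. The modular representation factorizes through $SL_2(N)$, which we denote by $D$. For $l$ coprime to $N$, define the automorphism $\tau_l : SL_2(N) \to SL_2(N)$ by
$$\tau_l \begin{pmatrix} a & b \\ e & d \end{pmatrix} = \begin{pmatrix} a & l b \\ \bar{l} e & d \end{pmatrix}, \tag{21}$$
where $\bar{l}$ is the mod $N$ inverse of $l$. Then $\sigma_l \circ D = D \circ \tau_l$.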
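That the map $\tau_l$ of Eq. (21) is indeed an automorphism of $SL_2(N)$ can be confirmed by exhaustive enumeration for small $N$. The sketch below is an illustration only, under the assumption that $SL_2(N)$ means $2 \times 2$ matrices of determinant 1 over the integers mod $N$; matrices are stored as tuples `(a, b, e, d)` and all function names are hypothetical. It checks that $\tau_l$ permutes the group and respects matrix multiplication.

```python
from itertools import product


def sl2_elements(n):
    """All (a, b, e, d) with entries mod n and determinant a*d - b*e = 1 mod n."""
    return [
        (a, b, e, d)
        for a, b, e, d in product(range(n), repeat=4)
        if (a * d - b * e) % n == 1 % n
    ]


def mul(x, y, n):
    """Product of the matrices [[a, b], [e, d]] and [[a2, b2], [e2, d2]] mod n."""
    a, b, e, d = x
    a2, b2, e2, d2 = y
    return (
        (a * a2 + b * e2) % n,
        (a * b2 + b * d2) % n,
        (e * a2 + d * e2) % n,
        (e * b2 + d * d2) % n,
    )


def tau(l, x, n):
    """Eq. (21): send [[a, b], [e, d]] to [[a, l*b], [l_inv*e, d]] mod n."""
    l_inv = pow(l, -1, n)  # modular inverse; needs Python 3.8 or later
    a, b, e, d = x
    return (a, (l * b) % n, (l_inv * e) % n, d)


def is_automorphism(l, n):
    """True if tau_l is a bijection of SL_2(Z/n) compatible with multiplication."""
    group = sl2_elements(n)
    if {tau(l, x, n) for x in group} != set(group):
        return False
    return all(
        tau(l, mul(x, y, n), n) == mul(tau(l, x, n), tau(l, y, n), n)
        for x in group
        for y in group
    )


if __name__ == "__main__":
    print(len(sl2_elements(5)), is_automorphism(2, 5), is_automorphism(3, 4))
```

The check succeeds because $\tau_l$ is conjugation by the diagonal matrix with entries $(l, 1)$, which has determinant $l$ and is therefore invertible mod $N$.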
References
1. Bantay, P.: The kernel of the modular representation and the Galois action in RCFT. Commun. Math. Phys. 233, no. 3, 423–438 (2003)
2. Bantay, P.: Permutation orbifolds. Nucl. Phys. B 633, no. 3, 365–378 (2002)
3. Barron, K., Dong, C., Mason, G.: Twisted sectors for tensor product vertex operator algebras associated to permutation groups. Commun. Math. Phys. 227, no. 2, 349–384 (2002)
4. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Commun. Math. Phys. 213, no. 2, 267–289 (2000)
5. Borisov, L., Halpern, M.B., Schweigert, C.: Systematic approach to cyclic orbifold. Int. J. Mod. Phys. A 13, no. 1, 125–168 (1998)
6. de Boere, J., Goeree, J.: Markov traces and II$_1$ factors in conformal field theory. Commun. Math. Phys. 139, 267 (1991)
7. Coste, A., Gannon, T.: Congruence subgroups and rational conformal field theory. http://arxiv.org/list/math-QA/9909080, 1999
8. Carpi, S., Weiner, M.: On the uniqueness of diffeomorphism symmetry in Conformal Field Theory. Commun. Math. Phys. 258, 203–221 (2005)
9. Dong, C., Xu, F.: Conformal nets associated with lattices and their orbifolds. To appear in Adv. in Mathematics, doi:10.1016/j.aim.2005.08.009, 2005
10. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics, I. Commun. Math. Phys. 23, 199–230 (1971); II. 35, 49–85 (1974)
11. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. I. Commun. Math. Phys. 125, 201–226 (1989); II. Rev. Math. Phys. Special issue, 113–157 (1992)
12. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
13. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
14. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
15. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann Algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
16. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
17. Kac, V.G.: Infinite Dimensional Lie Algebras. 3rd Edition, Cambridge: Cambridge University Press, 1990
18. Kac, V.G., Longo, R., Xu, F.: Solitons in affine and permutation orbifolds. Commun. Math. Phys. 253, 723–764 (2005)
19. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
20. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989)
21. Longo, R.: Index of subfactors and statistics of quantum fields. II. Commun. Math. Phys. 130, 285–309 (1990)
22. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
23. Longo, R., Xu, F.: Topological sectors and a dichotomy in conformal field theory. Commun. Math. Phys. 251, 321–364 (2004)
24. Popa, S.: Orthogonal pairs of ∗-subalgebras in finite von Neumann algebras. J. Oper. Th. 9, no. 2, 253–268 (1983)
25. Pressley, A., Segal, G.: Loop Groups. Oxford: Oxford University Press, 1986
26. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The Algebraic Theory of Superselection Sectors, D. Kastler, ed., Singapore: World Scientific, 1990
27. Turaev, V.G.: Quantum invariants of knots and 3-manifolds. Berlin, New York: Walter de Gruyter, 1994
28. Wassermann, A.: Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N) using bounded operators. Invent. Math. 133, no. 3, 467–538 (1998)
29. Xu, F.: Algebraic orbifold conformal field theories. Proc. Nat. Acad. Sci. USA 97, no. 26, 14069–14073
30. Xu, F.: 3-manifold invariants from cosets. J. Knot Th. and its Ramif. 14, no. 1, 21–90 (2005)
31. Xu, F.: On a conjecture of Kac-Wakimoto. Publ. RIMS, Kyoto Univ. 37, 165–190 (2001)
32. Xu, F.: Strong additivity and conformal nets. Pac. J. Math. 221, no. 1, 167–199 (2005)
33. Xu, F.: Algebraic coset conformal field theories. Commun. Math. Phys. 211, 1–43 (2000)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 267, 783–800 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0083-4
Communications in
Mathematical Physics
Moduli Space of BPS Walls in Supersymmetric Gauge Theories
Norisuke Sakai1, Yisong Yang2
1 Department of Physics, Tokyo Institute of Technology, Tokyo 152-8551, Japan.
E-mail: [email protected]
2 Department of Mathematics, Polytechnic University, Brooklyn, New York 11201, U.S.A.
E-mail: [email protected] Received: 17 May 2005 / Revised: 24 January 2006 / Accepted: 20 March 2006 Published online: 17 August 2006 – © Springer-Verlag 2006
Abstract: Existence and uniqueness of the solution are proved for the ‘master equation’ derived from the BPS equation for the vector multiplet scalar in the U (1) gauge theory with NF charged matter hypermultiplets with eight supercharges. This proof establishes that the solutions of the BPS equations are completely characterized by the moduli matrices divided by the V -equivalence relation for the gauge theory at finite gauge couplings. Therefore the moduli space at finite gauge couplings is topologically the same manifold as that at infinite gauge coupling, where the gauged linear sigma model reduces to a nonlinear sigma model. The proof is extended to the U (NC ) gauge theory with NF hypermultiplets in the fundamental representation, provided the moduli matrix of the domain wall solution is U (1)-factorizable. Thus the dimension of the moduli space of U (NC ) gauge theory is bounded from below by the dimension of the U (1)-factorizable part of the moduli space. We also obtain sharp estimates of the asymptotic exponential decay which depend on both the gauge coupling and the hypermultiplet mass differences.
1. Introduction Solitons have been important in understanding nonperturbative effects in field theories [1]. They are also useful to construct models of the brane-world scenario [2–4]. The simplest of these solitons is the domain wall separating two domains of discretely different vacua. The supersymmetric theories are useful to obtain realistic unified theories beyond the standard model [5]. If a field configuration preserves a part of supersymmetry, it satisfies the field equation automatically [6]. Such a configuration is called the Bogomol’nyi-Prasad-Sommerfield (BPS) configuration [7]. The BPS domain walls have been much studied in supersymmetric field theories with four supercharges [8, 9], and with eight supercharges [10 – 30]. These soliton solutions often contain parameters, which are called moduli. If we promote the moduli parameters as fields on the world volume of the soliton, they give massless fields on the world volume of the soliton [31].The
metric on the moduli space gives a (nonlinear) kinetic term of the Lagrangian of the low-energy effective field theory. Therefore the determination of the moduli space is of vital importance to understand the dynamics of the solitons. One of the most interesting classes of models possessing domain walls is the gauge theories with eight supercharges [18 – 29]. To allow domain walls, we need discrete vacua. For that purpose, we introduce Fayet-Iliopoulos (FI) terms for a U (1) factor gauge group. As a natural gauge group with the U (1) factor, we choose U (NC ) gauge theory. For simplicity, we take the matter hypermultiplets in the fundamental representation of U (NC ). To obtain more than one supersymmetric vacua, we require that the number of flavors of hypermultiplets NF be larger than the number of colors NC , NF > NC .
(1.1)
With massless hypermultiplets1 , the vacuum manifold is a hyper-Kähler manifold, the cotangent bundle over the complex Grassmann manifold T ∗ G NF ,NC . It reduces to T ∗ C P NF −1 manifold in the case of the U (1) gauge theory (NC = 1). If the nondegenerate hypermultiplet masses are turned on, a potential term is induced and most of the vacua are lifted, allowing wall solutions. In the resulting vacua of the massive U (NC ) gauge theories, each color component of the hypermultiplets chooses to have a particular flavor (color-flavor-locking) [21, 22]. Furthermore, a systematic construction of BPS wall solutions has been established [23, 24]. If we take the limit of strong gauge coupling g 2 → ∞ for the U (1) gauge theory, the vector multiplet can be eliminated to give constraints on the hypermultiplet field space, resulting in a supersymmetric massive nonlinear sigma model with the T ∗ C P NF −1 target space [32, 33]. Equations for preserving half of supersymmetry are called the 1/2 BPS equations. As boundary conditions for the BPS equations, the field configurations are required to approach one of the discrete vacua. If the vacua at y = −∞ and +∞ happen to be different, the solution represents (multi-) walls. If the vacua at y = −∞ and +∞ happen to be identical, the solution represents one of vacua. Therefore these BPS equations admit vacua (full supersymmetry conserved) besides (multi-) walls as solutions. The physical relation between these different topological sectors are as follows. If we let the position of one of the walls to go to infinity, we obtain a topological sector with one less wall. Therefore the topological sectors with n − 1 walls appear as boundaries of the moduli space of a topological sector with n walls. Continuing this process, we eventually arrive at topological sectors with no walls, namely vacua. In this way, we naturally obtain a compactification of moduli space of various topological sectors of multi-walls. 
The resulting manifold of all the solutions of the 1/2 BPS equations is topologically C P NF −1 in our case of the T ∗ C P NF −1 nonlinear sigma model [10, 13]. The massive U (NC ) gauge theory reduces to the massive nonlinear sigma model with the T ∗ G NF ,NC target space and with a potential term in the limit of strong gauge coupling [21]. Similarly to the U (1) case, the space of all solutions of the 1/2 BPS equations for the massive T ∗ G NF ,NC nonlinear sigma model with the vacuum boundary condition at infinity is found to be the complex Grassmann manifold G NF ,NC , which is the special Lagrangian submanifold of the target space of the nonlinear sigma model [23, 24]. More generally, it has been found that there are nonlinear sigma models with a target space having several special Lagrangian submanifolds whose union gives the space of all solutions of the 1/2 BPS equations [29]. 1 A common mass can be absorbed into the shift of vector multiplet scalar and is not relevant.
A number of exact solutions for $U(1)$ gauge theories have been obtained for particular discrete finite values of gauge coupling [34, 20, 24]. For generic finite gauge couplings of $U(1)$ as well as $U(N_C)$ gauge theories, only the general behavior of domain walls has been studied qualitatively [15, 22, 30]. The systematic construction of BPS walls first solves the hypermultiplet BPS equation and yields the moduli matrix $H_0$ as integration constants. After taking account of an equivalence relation called the V-equivalence relation, the independent variables in the moduli matrix $H_0$ constitute the complex Grassmann manifold $G_{N_F,N_C}$. The remaining BPS equation for the vector multiplet scalar can be rewritten into a “master equation”, which is a nonlinear ordinary differential equation for a gauge invariant quantity $\Omega$. In the limit $g^2 \to \infty$, this master equation can be solved algebraically without introducing additional moduli, once the moduli matrix $H_0$ is given. It has been conjectured that there exists a unique solution of the master equation even at finite gauge coupling, once the moduli matrix $H_0$ is given. Based on this conjecture, it has been pointed out that the moduli space of the BPS equations is given by the moduli matrix $H_0$ divided by the V-equivalence relation even at finite gauge couplings [23, 24]. Up to now, the best supporting evidence for this proposal is given by the index theorem, although the evidence is only indirect. The index theorems are proven for $U(1)$ gauge theories [35, 29], and for $U(N_C)$ gauge theories [30], respectively. They state that the complex Grassmann manifold contains the necessary and sufficient number of moduli parameters. However, it is much more desirable to demonstrate the existence and uniqueness of the solution of the master equation. The purpose of this paper is to study the master equation for the gauge invariant quantity $\Omega$.
We present a proof of the existence and uniqueness of the solution for the U (1) gauge theories, and extend the proof to a class of the moduli matrix H0 in the U (NC ) gauge theories. Our proof for the U (1) case finally establishes that solutions of the 1/2 BPS equations are completely characterized by the moduli matrices divided by the V -equivalence relation. This moduli space is topologically the same as the moduli space of the BPS equations in the nonlinear sigma model with T ∗ C P NF −1 target space, which is obtained in the limit of g 2 → ∞. Of course the metric on the moduli space at finite gauge coupling is expected to be deformed from that of nonlinear sigma model (infinite gauge coupling). We also obtain estimates of the asymptotic exponential decay which depend on both the hypermultiplet mass differences and the gauge coupling squared multiplied by the FI-parameter. These estimates agree with our previous result based on an iterative approximation scheme [20]. For the non-Abelian U (NC ) gauge theories, we show that the proof can be applied to the part of the moduli space which is described by the U (1) factorizable moduli matrix H0 . This result implies a lower bound of the dimensions of the moduli space for the U (NC ) gauge theories. We will leave for future publication to extend the proof of the existence and uniqueness of the solution of the master equation for the gauge invariant to entire moduli space of the BPS walls in the U (NC ) gauge theories. Interestingly, our one-dimensional nonlinear equation and its variational structure resemble in many ways the two-dimensional Abelian BPS vortex equation which allows us to extend the method developed in Jaffe and Taubes [36] to our problem here. There are two major technical differences/difficulties, though, that need to be overcome. 
The first one is that in one dimension, the ranges of the exponents in the Gagliardo–Nirenberg inequality cannot render as strong an estimate as in two dimensions (a relevant lower bound takes the weaker, sublinear, form $\|v\|_2^{2/3}$ instead of the usual stronger, linear, form $\|v\|_2$; see (4.19)). The second one is that, unlike in the Abelian vortex situation in which the vacuum state is uniquely characterized by the asymptotic amplitude of the Higgs field, our domain wall solution needs to interpolate two different vacua at the two infinities of the real line. Hence, the behavior of the solution in a local region is less uniform and the decay rates near the two infinities are also necessarily different.
In Sect. 2, BPS equations and their systematic solutions are introduced. We also describe the implications of the existence and uniqueness of the solution of the master equation for the gauge invariant $\Omega$. In Sect. 3, we extend our analysis to the $U(1)$ factorizable case of the $U(N_C)$ gauge theories, giving a lower bound for the dimension of the BPS wall moduli space. In Sect. 4, we present an analytic proof of the existence and uniqueness of the solution of the master equation for the gauge invariant $\Omega$, and give estimates of the asymptotic behavior of the solution.
2. Moduli Space of BPS Equations in U(1) Theories
Let us take a supersymmetric $U(1)$ gauge theory with eight supercharges in one time and four spatial dimensions.$^2$ The $U(1)$ vector multiplet contains gauge field $W_M$, gaugino $\lambda^i$, a real neutral scalar field $\Sigma$, and $SU(2)_R$ triplet of real auxiliary fields $Y^a$, where $M, N = 0, 1, \dots, 4$ denote space-time indices, and $i = 1, 2$ and $a = 1, 2, 3$ denote $SU(2)_R$ doublet and triplet indices, respectively. The hypermultiplet contains two complex scalar fields $H^{iA}$, hyperino $\psi^A$ and complex auxiliary fields $F_i^A$, where $A = 1, \dots, N_F$ stand for flavors. For simplicity, we assume that these $N_F$ hypermultiplets have the same $U(1)$ charge, say, unit charge. Denoting the gauge coupling $g$, the mass of the $A$th hypermultiplet $m_A$, and the FI parameters $\zeta^a$, the bosonic part of the Lagrangian reads [18–34]
$$\mathcal{L}_{\rm boson} = -\frac{1}{4g^2}\left(F_{MN}(W)\right)^2 + \frac{1}{2g^2}(\partial_M \Sigma)^2 + (\mathcal{D}^M H)^\dagger_{iA}(\mathcal{D}_M H^{iA}) - H^\dagger_{iA}(\Sigma - m_A)^2 H^{iA} + \frac{1}{2g^2}(Y^a)^2 - \zeta^a Y^a + H^\dagger_{iA}(\sigma^a Y^a)^i{}_j H^{jA} + F^{\dagger i}_A F_i^A, \tag{2.1}$$
where a sum over repeated indices is understood, $F_{MN}(W) = \partial_M W_N - \partial_N W_M$, the covariant derivative is defined as $\mathcal{D}_M = \partial_M + iW_M$, and our metric is $\eta_{MN} = (+1, -1, \dots, -1)$. We assume that the hypermultiplet masses are nondegenerate and are ordered as
$$m_1 > m_2 > \cdots > m_{N_F}. \tag{2.2}$$
The auxiliary fields are given by their equations of motion: $F_i^A = 0$ and
$$Y^a = g^2\left[\zeta^a - H^\dagger_{iA}(\sigma^a)^i{}_j H^{jA}\right]. \tag{2.3}$$
By making an $SU(2)_R$ transformation, we can choose the FI parameters to lie in the third direction
$$\zeta^a = (0, 0, \zeta), \qquad \zeta > 0. \tag{2.4}$$
In this choice, we find $N_F$ discrete SUSY vacua ($A = 1, \dots, N_F$) as
$$\Sigma = m_A, \quad |H^{1A}|^2 = \zeta, \quad H^{2A} = 0, \quad H^{1B} = 0, \quad H^{2B} = 0, \quad (B \neq A). \tag{2.5}$$
$^2$ Discrete vacua are required for wall solutions and are possible by mass terms for hypermultiplets which are available only in spacetime dimensions equal to or less than five.
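We assume the configuration to depend only on a single coordinate, which we denote as $y \equiv x^4$, and assume the four-dimensional Lorentz invariance in the world volume coordinates $x^\mu = (x^0, \dots, x^3)$. Let us examine the supersymmetry transformations of fermions: the gaugino $\lambda^i$ and hyperino $\psi^A$ transform as$^3$
$$\delta_\varepsilon \lambda^i = \left(\frac{1}{2}\gamma^{MN} F_{MN}(W) + \gamma^M \partial_M \Sigma\right)\varepsilon^i + i\left(Y^a \sigma^a\right)^i{}_j\, \varepsilon^j, \tag{2.6}$$
$$\delta_\varepsilon \psi^A = -i\sqrt{2}\left[\gamma^M \mathcal{D}_M H^{iA} + i(\Sigma - m_A) H^{iA}\right]\epsilon_{ij}\,\varepsilon^j + \sqrt{2}\, F_i^A \varepsilon^i. \tag{2.7}$$
We require the following half of supersymmetry to be preserved:
$$P_+ \varepsilon^1 = 0, \qquad P_- \varepsilon^2 = 0, \tag{2.8}$$
where $P_\pm \equiv (1 \pm \gamma_5)/2$ are the chiral projection operators. Then we obtain the 1/2 BPS equations
$$\mathcal{D}_y H^{1A} = (m_A - \Sigma) H^{1A}, \qquad \mathcal{D}_y H^{2A} = (-m_A + \Sigma) H^{2A}, \qquad A = 1, \dots, N_F, \tag{2.9}$$
$$0 = Y^1 + iY^2 = -2g^2 H^{2A}(H^{1A})^*, \tag{2.10}$$
$$\partial_y \Sigma = Y^3 = g^2\left(\zeta - H^\dagger_{1A} H^{1A} + H^\dagger_{2A} H^{2A}\right). \tag{2.11}$$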
We wish to obtain solutions of these BPS equations which interpolate two different vacua in Eq. (2.5). The boundary condition of these two vacua at $y = \pm\infty$ specifies the topological sector. By letting the outer-most wall go to infinity, one can obtain topological sectors with one less wall. Therefore we are interested in the maximal topological sector which allows the maximal number of walls and possesses the maximal number of moduli parameters. The boundary conditions for the maximal topological sector are given by
$$\Sigma(-\infty) = m_{N_F}, \qquad \Sigma(\infty) = m_1, \tag{2.12}$$
$$H^{1A}(-\infty) = \sqrt{\zeta}\,\delta^A_{N_F}, \quad H^{1A}(\infty) = \sqrt{\zeta}\,\delta^A_1, \quad H^{2A}(-\infty) = 0, \quad H^{2A}(\infty) = 0. \tag{2.13}$$
Let us define [18, 20, 23] a $GL(1, \mathbb{C})$ group element $S(y)$ that expresses the vector multiplet scalar and gauge field as a pure gauge
$$\frac{1}{S(y)}\frac{dS(y)}{dy} = \Sigma(y) + iW_4(y). \tag{2.14}$$
The BPS equation for hypermultiplets (2.9) can be solved in terms of this single complex function $S(y)$ as
$$H^{1A}(y) = S^{-1}(y)\, H_0^A\, e^{m_A y}, \tag{2.15}$$
where complex integration constants $H_0^A$ can be assembled into a constant complex vector $H_0 = (H_0^1, \dots, H_0^{N_F})$. This $N_F$ component vector is nothing but the $N_C = 1$ case of an $N_C \times N_F$ complex constant matrix called the moduli matrix [23]. Since Eq. (2.14) defines the function $S(y)$ only up to a complex multiplicative constant, equivalent descriptions of physical fields ($H^i$ and $\Sigma$) result from two complex vectors $H_0$ which are different by multiplication of a non-vanishing complex constant $V$:
$$H_0 \to V H_0, \qquad S \to V S, \tag{2.16}$$
which is called the V-equivalence relation [23]. Therefore the moduli matrix $H_0$ in this case is topologically $\mathbb{C}P^{N_F - 1}$. We define a $U(1)$ local gauge invariant function $\Omega(y)$ as
$$\Omega(y) = S(y) S^*(y). \tag{2.17}$$
$^3$ Our gamma matrices are $4 \times 4$ matrices and are defined as: $\{\gamma^M, \gamma^N\} = 2\eta^{MN}$, $\gamma^{MN} \equiv \frac{1}{2}[\gamma^M, \gamma^N] = \gamma^{[M}\gamma^{N]}$, $\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3 = -i\gamma^4$.
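Using the hypermultiplet solution (2.15), the remaining BPS equation for the vector multiplet scalar can be rewritten in terms of $\Omega$,
$$\frac{d}{dy}\left(\frac{1}{\Omega(y)}\frac{d\Omega(y)}{dy}\right) = g^2 \zeta \left(1 - \frac{\Omega_0(y)}{\Omega(y)}\right), \tag{2.18}$$
$$\Omega_0(y) \equiv \frac{1}{\zeta}\sum_{A=1}^{N_F} H_0^A\, e^{2 m_A y}\, (H_0^A)^* \equiv e^{2W(y)}. \tag{2.19}$$
The boundary conditions in Eqs. (2.12) and (2.13) now become the following boundary conditions for the master equation
$$\begin{aligned}
\Omega(y) \to \Omega_0(y) &\to \left(e^{2 m_1 y}|H_0^1|^2/\zeta,\ 0, \dots, 0\right), & y &\to \infty,\\
\Omega(y) \to \Omega_0(y) &\to \left(0, \dots, 0,\ e^{2 m_{N_F} y}|H_0^{N_F}|^2/\zeta\right), & y &\to -\infty.
\end{aligned} \tag{2.20}$$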
The non-vanishing left-most element of the moduli matrix $H_0$ specifies the boundary condition at $y = +\infty$, and the non-vanishing right-most element specifies the boundary condition at $y = -\infty$. We can see that the boundary condition is encoded in the choice of the moduli matrix $H_0$. The moduli matrix $H_0$ with more than one non-vanishing element gives a (multi-) wall solution, whereas the moduli matrix with a single non-vanishing element gives a vacuum solution. Therefore the space of all possible moduli matrices $H_0$ divided by the V-equivalence relation automatically gives all possible solutions of the BPS equation including the vacuum solutions and multi-wall solutions. Using $W(y)$ in Eq. (2.19), we finally obtain the master equation for the $U(1)$ gauge theory [20, 19]$^4$
$$\frac{1}{\zeta g^2}\frac{d^2\psi}{dy^2} = 1 - e^{-2\psi(y) + 2W(y)}, \qquad \psi(y) \equiv \frac{1}{2}\log \Omega(y). \tag{2.21}$$
The boundary conditions (2.20) now become the following boundary conditions for the master equation:$^5$
$$\psi \to m_1 y + \log\frac{|H_0^1|}{\sqrt{\zeta}}, \qquad y \to +\infty, \tag{2.22}$$
$$\psi \to m_{N_F} y + \log\frac{|H_0^{N_F}|}{\sqrt{\zeta}}, \qquad y \to -\infty. \tag{2.23}$$
$^4$ We have changed our notation Re $\psi \to \psi$ from Ref. [20]. Note that a factor of 1/2 is missing on the right-hand side of Eq. (4.6) of Ref. [20].
$^5$ We observe that one out of two constants $\log(|H_0^1|/\sqrt{\zeta})$ and $\log(|H_0^{N_F}|/\sqrt{\zeta})$ can be absorbed into a shift of $\psi$. This freedom corresponds to the V-equivalence relation implying that only one out of these two constants in the boundary conditions is the genuine moduli parameter.
In Ref. [23], it has been conjectured that there exists a unique solution of the master equation (2.21) given the boundary conditions (2.22) and (2.23). We shall prove the conjecture in Sect. 4. With this proved, we can state that the moduli space of the domain walls in the U(1) gauge theory with $N_F$ flavors is given by $\mathbb{C}P^{N_F-1}$, as described by the moduli matrix $H_0$ furnished with the equivalence relation (2.16).

3. U(1) Factorizable Case of U(N_C) Gauge Theories

Let us now turn our attention to a supersymmetric $U(N_C)$ gauge theory with $N_F$ ($> N_C$) flavors of hypermultiplets in the fundamental representation. We denote the gauge fields $W_M$ and the real scalar fields $\Sigma$ as $N_C \times N_C$ matrices whose basis is normalized as
$$\mathrm{Tr}(T_I T_J) = \frac{1}{2}\delta_{IJ}, \qquad [T_I, T_J] = i f_{IJK} T_K, \qquad T_0 \equiv \frac{\mathbf{1}}{\sqrt{2N_C}}. \tag{3.1}$$
We consider a five-dimensional spacetime and nondegenerate masses for the hypermultiplets, ordered as in Eq. (2.2). Hypermultiplets are denoted as $N_C \times N_F$ matrices $H^i$. The bosonic part of the Lagrangian of the supersymmetric $U(N_C)$ gauge theory reads
$$\mathcal{L}|_{\rm bosonic} = \mathrm{Tr}\Big[ -\frac{1}{2g^2}F_{MN}(W)F^{MN}(W) + \frac{1}{g^2}(\mathcal{D}_M\Sigma)^2 + \frac{1}{g^2}(Y^a)^2 + \mathcal{D}^M H^i\, \mathcal{D}_M H^{i\dagger} - (\Sigma H^i - H^i M)(\Sigma H^i - H^i M)^\dagger + F^i F^{i\dagger} - Y^a\big(c_a - (\sigma_a)^{i}{}_{j} H^j H^{i\dagger}\big)\Big] \tag{3.2}$$
with the Fayet-Iliopoulos parameter $c_a = (0, 0, c)$ with $c > 0$ and the hypermultiplet mass matrix $M = \mathrm{diag}(m_1, \dots, m_{N_F})$. Covariant derivatives are defined as $\mathcal{D}_M\Sigma = \partial_M\Sigma + i[W_M, \Sigma]$, $\mathcal{D}_M H^{irA} = (\partial_M\delta^r_s + i(W_M)^r{}_s)H^{isA}$, and the gauge field strength is $F_{MN}(W) = -i[\mathcal{D}_M, \mathcal{D}_N] = \partial_M W_N - \partial_N W_M + i[W_M, W_N]$. Considering the supersymmetry transformations in Eqs. (2.6) and (2.7), and requiring the half of supersymmetry in Eq. (2.8) to be preserved, we obtain the 1/2 BPS equations for walls with profile in $y$,
$$\mathcal{D}_y H^1 = -\Sigma H^1 + H^1 M, \tag{3.3}$$
$$\mathcal{D}_y H^2 = \Sigma H^2 - H^2 M, \tag{3.4}$$
$$\mathcal{D}_y\Sigma = Y^3 = \frac{g^2}{2}\left(c\,\mathbf{1}_{N_C} - H^1 H^{1\dagger} + H^2 H^{2\dagger}\right), \tag{3.5}$$
$$0 = Y^1 + iY^2 = -g^2 H^2 H^{1\dagger}. \tag{3.6}$$
The supersymmetric vacua are characterized by the vanishing of the right-hand sides of all of these BPS equations. The supersymmetric vacua have been found to be color-flavor-locked and discrete [21]. For each color component $r$, only one flavor $A = A_r$ of the hypermultiplets $H^{irA}$ should take a non-vanishing value,
$$H^{1rA} = \sqrt{c}\,\delta^{A_r}{}_{A}, \qquad H^{2rA} = 0, \tag{3.7}$$
and the corresponding color component of the vector multiplet scalar $\Sigma$ should be equal to the mass of that flavor of the hypermultiplets,
$$\Sigma = \mathrm{diag}(m_{A_1}, m_{A_2}, \dots, m_{A_{N_C}}). \tag{3.8}$$
We denote the above SUSY vacuum as
$$\langle A_1 A_2 \dots A_{N_C}\rangle. \tag{3.9}$$
Consequently the topological sector is labeled by two vacua:
$$\langle A_1, \dots, A_{N_C}\rangle \leftarrow \langle B_1, \dots, B_{N_C}\rangle, \tag{3.10}$$
where the first vacuum is at $y = \infty$ and the second at $y = -\infty$. Ideally we wish to solve the BPS equations for each topological sector and to obtain all possible solutions. Let us consider the maximal topological sector, which is specified in the $U(N_C)$ gauge theory as
$$\langle 1, \dots, N_C\rangle \leftarrow \langle N_F - N_C + 1, \dots, N_F\rangle. \tag{3.11}$$
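The vacuum labels of (3.9) and the pair of labels defining the maximal topological sector can be enumerated with a short sketch. It assumes that labels differing only by a permutation of colors are identified, so that a vacuum is a choice of $N_C$ distinct flavors out of $N_F$, and that the maximal sector pairs the first $N_C$ flavors (at $y = \infty$) with the last $N_C$ flavors (at $y = -\infty$), as the boundary conditions (3.12) and (3.13) indicate. The function names `vacua` and `maximal_sector` are hypothetical.

```python
from itertools import combinations
from math import comb


def vacua(n_f, n_c):
    """All labels <A_1 ... A_Nc> with 1 <= A_1 < ... < A_Nc <= n_f.

    Assumption: vacua related by a permutation of colors are identified.
    """
    return list(combinations(range(1, n_f + 1), n_c))


def maximal_sector(n_f, n_c):
    """Labels of the vacua at y = +infinity and y = -infinity."""
    at_plus_infinity = tuple(range(1, n_c + 1))
    at_minus_infinity = tuple(range(n_f - n_c + 1, n_f + 1))
    return at_plus_infinity, at_minus_infinity


if __name__ == "__main__":
    print(len(vacua(4, 2)), comb(4, 2))
    print(maximal_sector(4, 2))
```

For $N_F = 4$ and $N_C = 2$ this lists six vacua, and the maximal sector connects the label (1, 2) to the label (3, 4).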
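In this maximal topological sector, the boundary conditions for the hypermultiplet scalar $H^1(y)$ are given by
$$H^1(y) \to \sqrt{c}\begin{pmatrix} 1 & 0 & \cdots & 0 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0 & 0 & \cdots & 0\\ \vdots & & \ddots & \vdots & \vdots & & \vdots\\ 0 & 0 & \cdots & 1 & 0 & \cdots & 0 \end{pmatrix}, \qquad y \to \infty, \tag{3.12}$$
$$H^1(y) \to \sqrt{c}\begin{pmatrix} 0 & \cdots & 0 & 1 & 0 & \cdots & 0\\ 0 & \cdots & 0 & 0 & 1 & \cdots & 0\\ \vdots & & \vdots & \vdots & & \ddots & \vdots\\ 0 & \cdots & 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \qquad y \to -\infty. \tag{3.13}$$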
To solve the BPS equations, we define the following $GL(N_C, \mathbb{C})$ group element $S(y)$:
$$\Sigma + iW_y = S^{-1}(y)\,\partial_y S(y). \tag{3.14}$$
The hypermultiplet scalar BPS equations (3.3) and (3.4) can be solved with this matrix function $S(y)$ as
$$H^1 = S^{-1}(y)\,H_0\,e^{My}, \qquad H^2 = 0, \tag{3.15}$$
with the constant matrix $H_0$, which is called the moduli matrix. We used the boundary conditions $H^2(y) \to 0$ at $y \to \pm\infty$ to fix $H^2(y)$ in the above solution [24]. Since the matrix function $S(y)$ in Eq. (3.14) is defined up to a constant $GL(N_C, \mathbb{C})$ matrix $V$, equivalent descriptions for the hypermultiplet scalars $H^i$ and the vector multiplet scalar $\Sigma$ are obtained from two sets of matrices $(H_0, S)$ and $(H_0', S')$ that are related by $GL(N_C, \mathbb{C})$ transformations $V$,
$$(H_0, S) \sim (H_0', S'), \qquad H_0 \to H_0' = V H_0, \quad S \to S' = V S, \quad V \in GL(N_C, \mathbb{C}). \tag{3.16}$$
We call this symmetry the V-equivalence relation. The space of the moduli matrix divided by the V-equivalence relation is found [23] to be the complex Grassmann manifold $G_{N_F,N_C}$,
$$\{H_0 \mid H_0 \sim V H_0,\ V \in GL(N_C, \mathbb{C})\} \simeq G_{N_F,N_C} \simeq \frac{SU(N_F)}{SU(N_C) \times SU(\tilde{N}_C) \times U(1)}. \tag{3.17}$$
In place of the gauge variant matrix function $S(y)$, we define a gauge invariant $N_C \times N_C$ matrix function
$$\Omega \equiv S S^\dagger. \tag{3.18}$$
The remaining BPS equation (3.5) for the vector multiplet scalar can be rewritten in terms of $\Omega$ as a master equation for the $U(N_C)$ gauge theory [23, 24],
$$\partial_y\left(\Omega^{-1}\partial_y\Omega\right) = g^2 c\left(\mathbf{1}_C - \Omega^{-1}\Omega_0\right), \tag{3.19}$$
where the source term is defined in terms of the moduli matrix $H_0$ as
$$\Omega_0 \equiv c^{-1} H_0\, e^{2My} H_0^\dagger. \tag{3.20}$$
We need to specify the boundary conditions for $\Omega(y)$. For the maximal topological sector, the following quantity reduces to the unit matrix for the first $N_C \times N_C$ diagonal part in the limit of $y \to \infty$,
$$H^{1\dagger}(y)H^1(y) = e^{My} H_0^\dagger\,\Omega^{-1}(y)\,H_0\, e^{My} \to c\begin{pmatrix} \mathbf{1}_{N_C} & 0\\ 0 & 0 \end{pmatrix}, \qquad y \to \infty, \tag{3.21}$$
and to the unit matrix for the last $N_C \times N_C$ diagonal part in the limit of $y \to -\infty$,
$$H^{1\dagger}(y)H^1(y) = e^{My} H_0^\dagger\,\Omega^{-1}(y)\,H_0\, e^{My} \to c\begin{pmatrix} 0 & 0\\ 0 & \mathbf{1}_{N_C} \end{pmatrix}, \qquad y \to -\infty, \tag{3.22}$$
where both right-hand sides are $N_F \times N_F$ matrices written in block form. It has been conjectured [23] that there exists a unique solution of the master equation (3.19) given the above boundary conditions (3.21) and (3.22). If this is proved, the solutions of the BPS equations (3.3)–(3.6) in the $U(N_C)$ gauge theory with $N_F$ flavors are completely characterized by the moduli matrices $H_0$ divided by the V-equivalence relation (3.16). For a generic moduli matrix $H_0$, we cannot at present give a proof of the existence and uniqueness of the solution $\Omega(y)$ of the master equation (3.19). However, we can exploit our proof for the U(1) gauge theory if the moduli matrix $H_0$ is of a restricted form, as we describe now.

Let us note that the master equation (3.19) transforms covariantly under the world-volume transformation (3.16), where the matrix $H_0 e^{2My} H_0^\dagger$ is multiplied by the constant matrices $V$ and $V^\dagger$ from the left and the right, respectively. This V-equivalence relation allows us to diagonalize the matrix $H_0 e^{2My} H_0^\dagger$ at one point of the extra dimension, say, $y = y_0$. If the matrix $H_0 e^{2My} H_0^\dagger$ with this gauge fixing remains diagonal at every other point $y \neq y_0$ in the extra dimension, we obtain
$$H_0\, e^{2My} H_0^\dagger = c\ \mathrm{diag}\left(W_1(y), W_2(y), \dots, W_{N_C}(y)\right). \tag{3.23}$$
If this is valid, we call the moduli matrix $H_0$ U(1)-factorizable [24]. Whether a given moduli matrix is U(1)-factorizable or not is an inherent characteristic of each moduli matrix $H_0$, and is independent of the choice of the initial coordinate $y_0$. Thus U(1)-factorizability is a property attached to each point of the moduli space. If the moduli matrix is U(1)-factorizable, the off-diagonal components of the matrix $H_0 e^{2My} H_0^\dagger$ vanish at any point $y$ of the extra dimension by definition. Therefore each coefficient of $e^{2m_A y}$ in the off-diagonal components must vanish. Since we consider the case of non-degenerate masses, the condition for U(1)-factorizability can be rewritten for each flavor $A$ as
$$(H_0)^{rA}\left((H_0)^{sA}\right)^* = 0, \qquad \text{for } r \neq s, \tag{3.24}$$
where the flavor index $A$ is not summed. Namely, $(H_0)^{rA}$ can be non-vanishing in only one color component $r$ for each flavor $A$. To solve the master equation (3.19) for the gauge invariant matrix function $\Omega$ in the non-Abelian $U(N_C)$ gauge theory with U(1)-factorizable moduli, we are allowed to take an ansatz where only the diagonal components of the matrix $\Omega$ survive:
$$\Omega = \mathrm{diag}\left(e^{2\psi_1}, e^{2\psi_2}, \dots, e^{2\psi_{N_C}}\right), \tag{3.25}$$
where the $\psi_r(y)$ are real functions.
With this ansatz, the master equation (3.19) for the U(1)-factorizable moduli with the condition (3.23) reduces to a set of master equations for the Abelian gauge theory [23],
$$\frac{d^2\psi_r}{dy^2} = \frac{g^2 c}{2}\left(1 - e^{-2\psi_r} W_r\right), \qquad \text{for } r = 1, 2, \dots, N_C, \tag{3.26}$$
where the functions $W_r(y)$ defined in (3.23) are given by
$$W_r = \sum_{A \in \mathcal{A}_r} e^{2m_A y}\,\frac{|H_0^A|^2}{c}. \tag{3.27}$$
Here $\mathcal{A}_r$ is the set of flavors of the hypermultiplet scalars whose $r$th color component is non-vanishing. Note that the condition (3.24) of U(1)-factorizability can be rewritten as $\mathcal{A}_r \cap \mathcal{A}_s = \emptyset$ for $r \neq s$. In this case, the vector multiplet scalars $\Sigma$ and the hypermultiplet scalars $H^{1rA}$ are given by [20]
$$\Sigma = \mathrm{diag}\left(\partial_y\psi_1, \partial_y\psi_2, \dots, \partial_y\psi_{N_C}\right), \tag{3.28}$$
$$H^{1rA} = e^{-\psi_r(y) + m_A y} H_0^A, \tag{3.29}$$
with a gauge choice of $W_y = 0$ and a phase choice of $\mathrm{Im}\,H^{1rA} = 0$ at $y = \pm\infty$. Since the moduli parameters contained in the master equation (3.26) for each $\psi_r$ are independent of each other, we find that our system of BPS equations for walls becomes a decoupled set of $N_C$ systems of BPS equations of U(1) gauge theories. This fact enables us to apply our method of proof for U(1) gauge theories to this U(1)-factorizable case of the $U(N_C)$ gauge theory.

Let us count the dimensions of the part of the moduli space with the U(1)-factorizable property. For each color component $r$, we denote the number of non-vanishing hypermultiplets in $\mathcal{A}_r$ by $f_r$. For the $r$th color component, we obtain a U(1) gauge theory with $f_r$ flavors. In the maximal topological sector, all the flavors should participate: $\sum_r f_r = N_F$. Since every color component has to appear in the vacua at both infinities, we obtain the complex dimension of the U(1)-factorizable part of the moduli space in the maximal topological sector to be
$$\dim_{\mathbb{C}} \mathcal{M}_{U(1)\,\mathrm{fact}} = \sum_{r=1}^{N_C}(f_r - 1) = N_F - N_C. \tag{3.30}$$
We thus obtain a rather weak lower bound for the dimension of the moduli space for the $U(N_C)$ gauge theory:
$$\dim_{\mathbb{C}} \mathcal{M}_{U(N_C)} \geq N_F - N_C. \tag{3.31}$$
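The factorizability condition (3.24) and the dimension count (3.30) can be checked on a small example. The sketch below is only an illustration: the matrix `h0`, the masses and the helper names are hypothetical choices, with $N_C = 2$, $N_F = 4$ and flavor sets $\mathcal{A}_1 = \{1, 3\}$, $\mathcal{A}_2 = \{2, 4\}$ so that each color connects one of the first $N_C$ flavors to one of the last $N_C$ flavors.

```python
import math


def is_u1_factorizable(h0, tol=1e-12):
    """Condition (3.24): every flavor column has at most one non-zero color entry."""
    n_c, n_f = len(h0), len(h0[0])
    return all(
        sum(1 for r in range(n_c) if abs(h0[r][a]) > tol) <= 1
        for a in range(n_f)
    )


def gram(h0, masses, y):
    """The N_C x N_C matrix H0 exp(2 M y) H0^dagger at position y."""
    n_c, n_f = len(h0), len(h0[0])
    return [
        [
            sum(
                complex(h0[r][a]) * math.exp(2.0 * masses[a] * y)
                * complex(h0[s][a]).conjugate()
                for a in range(n_f)
            )
            for s in range(n_c)
        ]
        for r in range(n_c)
    ]


def factorizable_dimension(h0, tol=1e-12):
    """Sum over colors of (f_r - 1), where f_r counts non-zero flavors of color r."""
    return sum(sum(1 for x in row if abs(x) > tol) - 1 for row in h0)


if __name__ == "__main__":
    masses = [3.0, 1.0, -1.0, -2.0]          # m_1 > m_2 > m_3 > m_4 (hypothetical)
    h0 = [[1, 0, 2j, 0], [0, 1, 0, 3]]       # A_1 = {1, 3}, A_2 = {2, 4}
    print(is_u1_factorizable(h0), factorizable_dimension(h0))
    print(gram(h0, masses, 0.7)[0][1])
```

In this example the off-diagonal entries of $H_0 e^{2My} H_0^\dagger$ vanish for every $y$, and the count gives $(2-1) + (2-1) = 2 = N_F - N_C$.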
4. Proof of Existence and Uniqueness of Solution

In this section, we prove the existence and uniqueness of the BPS wall solutions to the U(1) and U(1)-factorizable models. We first state our results in suitably renormalized parameters. We then carry out the proof using the method of calculus of variations and functional analysis. For convenience, we relabel the variables and parameters of our master equations (2.21) and (3.26) so that they take the equivalent form (see footnote 6)
$$u'' = \lambda\left(M(y)e^{u} - 1\right), \tag{4.1}$$
subject to the boundary conditions
$$u(y) \to -\omega_1 y - r_1, \qquad y \to \infty, \tag{4.2}$$
$$u(y) \to -\omega_{N_F}\, y - r_{N_F}, \qquad y \to -\infty, \tag{4.3}$$
where $u'$ stands for $du/dy$ and
$$M(y) = \sum_{A=1}^{N_F} e^{\omega_A y + r_A}, \tag{4.4}$$
and $\lambda > 0$, $\omega_A$, $r_A$ are real constants such that the $\omega_A$ ($A = 1, 2, \dots, N_F$) satisfy the nondegeneracy condition
$$\omega_1 > \omega_2 > \dots > \omega_{N_F}. \tag{4.5}$$
Footnote 6: For the U(1) model, we denote $\lambda \equiv 2g^2\zeta$, $\omega_A \equiv 2m_A$, $u(y) \equiv -2\psi(y)$, $r_A \equiv \log(|H_0^A|^2/\zeta)$, and $M(y) \equiv e^{2W(y)} = \Omega_0(y) = \sum_A e^{2m_A y}|H_0^A|^2/\zeta$.

For Eq. (4.1) subject to the boundary conditions (4.2) and (4.3), we have
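Theorem 4.1. The problem (4.1)–(4.3) has a unique solution. Moreover, this solution satisfies the following asymptotic exponential decay estimates at $y \to \pm\infty$:
$$u(y) + (\omega_1 y + r_1) = O\left(e^{-\lambda_1(1-\varepsilon)y}\right) \quad \text{as } y \to \infty, \tag{4.6}$$
$$u(y) + (\omega_{N_F}\, y + r_{N_F}) = O\left(e^{\lambda_2(1-\varepsilon)y}\right) \quad \text{as } y \to -\infty, \tag{4.7}$$
where $\varepsilon > 0$ can be taken to be arbitrarily small and $\lambda_1$ and $\lambda_2$ are positive parameters defined by (see footnote 7)
$$\lambda_1 = \min\{\sqrt{\lambda},\ \omega_1 - \omega_2\}, \tag{4.8}$$
$$\lambda_2 = \min\{\sqrt{\lambda},\ \omega_{N_F-1} - \omega_{N_F}\}. \tag{4.9}$$
We now proceed with the proof, adapting the main ideas from [36]. Note that the conditions (4.2), (4.3), (4.5) ensure that the right-hand side of (4.1) vanishes at $y = \pm\infty$ because
$$M(y) = e^{\omega_1 y + r_1}\left(1 + e^{(\omega_2-\omega_1)y + r_2 - r_1} + \dots + e^{(\omega_{N_F}-\omega_1)y + r_{N_F} - r_1}\right), \qquad y > 0,$$
$$M(y) = e^{\omega_{N_F} y + r_{N_F}}\left(1 + e^{(\omega_1-\omega_{N_F})y + r_1 - r_{N_F}} + \dots + e^{(\omega_{N_F-1}-\omega_{N_F})y + r_{N_F-1} - r_{N_F}}\right), \qquad y < 0.$$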
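As a numerical illustration of the problem (4.1)–(4.3), and not of the variational proof that follows, the equation can be discretized on a finite interval and solved by Newton iteration. The sketch below makes several assumptions that are not part of the paper: the interval $[-L, L]$ is truncated with Dirichlet data taken from (4.2) and (4.3), second-order central differences are used, the initial guess is $u = -\log M$, and the name `solve_master` and all default parameters are hypothetical.

```python
import math


def solve_master(omega, r, lam, L=12.0, n=481, tol=1e-10, max_iter=50):
    """Newton iteration for u'' = lam * (M(y) e^u - 1) on [-L, L].

    Dirichlet values at +-L are taken from the asymptotics (4.2), (4.3).
    """
    h = 2.0 * L / (n - 1)
    ys = [-L + i * h for i in range(n)]
    ms = [sum(math.exp(w * y + ra) for w, ra in zip(omega, r)) for y in ys]
    u = [-math.log(m) for m in ms]                 # initial guess: M e^u = 1
    u[0] = -omega[-1] * ys[0] - r[-1]              # y -> -infinity branch
    u[-1] = -omega[0] * ys[-1] - r[0]              # y -> +infinity branch
    m_int = n - 2                                  # number of interior unknowns
    off = 1.0 / h ** 2                             # off-diagonal Jacobian entry
    for _ in range(max_iter):
        diag = [0.0] * m_int
        rhs = [0.0] * m_int
        for k in range(m_int):
            i = k + 1
            e = lam * ms[i] * math.exp(u[i])
            residual = (u[i + 1] - 2.0 * u[i] + u[i - 1]) * off - (e - lam)
            diag[k] = -2.0 * off - e               # d(residual_i)/d(u_i)
            rhs[k] = -residual
        # Thomas algorithm for the tridiagonal Newton system.
        cp = [0.0] * m_int
        dp = [0.0] * m_int
        cp[0] = off / diag[0]
        dp[0] = rhs[0] / diag[0]
        for k in range(1, m_int):
            denom = diag[k] - off * cp[k - 1]
            cp[k] = off / denom
            dp[k] = (rhs[k] - off * dp[k - 1]) / denom
        delta = [0.0] * m_int
        delta[-1] = dp[-1]
        for k in range(m_int - 2, -1, -1):
            delta[k] = dp[k] - cp[k] * delta[k + 1]
        for k in range(m_int):
            u[k + 1] += delta[k]
        if max(abs(d) for d in delta) < tol:
            break
    return ys, u


if __name__ == "__main__":
    # Two flavors, omega = (1, -1), r = (0, 0), lam = 1: M(y) = 2 cosh(y).
    ys, u = solve_master([1.0, -1.0], [0.0, 0.0], 1.0)
    mid = len(u) // 2
    print(ys[mid], u[mid], -math.log(2.0))
```

For this symmetric two-flavor example the computed profile is even in $y$ and approaches $-|y|$ away from the origin, in line with (4.2) and (4.3).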
To take into account the boundary asymptotics, we introduce a translation, $u = u_0 + v$, where $u_0(y)$ has a continuous second-order derivative and satisfies
$$u_0(y) = -\omega_1 y - r_1 \ \text{ if } y > y_0, \qquad u_0(y) = -\omega_{N_F}\, y - r_{N_F} \ \text{ if } y < -y_0, \tag{4.10}$$
where $y_0 > 0$ is a suitable constant to be determined shortly. Then Eq. (4.1) becomes
$$v'' = \lambda\left(Q(y)e^{v} - 1\right) + h(y), \tag{4.11}$$
where $h(y) = -u_0''(y)$ is of compact support and $Q(y) = e^{u_0(y)}M(y)$ has the representations
$$Q(y) = 1 + e^{(\omega_2-\omega_1)y + r_2 - r_1} + \dots + e^{(\omega_{N_F}-\omega_1)y + r_{N_F} - r_1}, \qquad y > y_0, \tag{4.12}$$
$$Q(y) = 1 + e^{(\omega_1-\omega_{N_F})y + r_1 - r_{N_F}} + \dots + e^{(\omega_{N_F-1}-\omega_{N_F})y + r_{N_F-1} - r_{N_F}}, \qquad y < -y_0. \tag{4.13}$$
Of course, $Q(y) > 0$ everywhere, $Q(\pm\infty) = 1$, and the boundary conditions (4.2) and (4.3) become the standard one,
$$v = 0 \quad \text{at } y = \pm\infty. \tag{4.14}$$
Since $u_0'(y) = -\omega_1$ for $y > y_0$ and $u_0'(y) = -\omega_{N_F}$ for $y < -y_0$, it is clear that, when $y_0 > 0$ is sufficiently large, we can define $u_0(y)$ for $-y_0 \leq y \leq y_0$ to make $|u_0''(y)|$ as small as we please. In particular, we may achieve (say)
$$|h(y)| = |u_0''(y)| < \frac{\lambda}{2} \quad \text{for all } y. \tag{4.15}$$
This assumption will be observed in the subsequent analysis. It is clear that (4.11) is the Euler–Lagrange equation of the action functional
$$I(v) = \int\left\{\frac{1}{2}(v')^2 + \lambda Q(y)\left(e^{v} - v - 1\right) + \lambda\left(Q(y) - 1\right)v + hv\right\}, \tag{4.16}$$
Footnote 7: The bound agrees with the previous result of the iterative approximation [20].
where, here and in the sequel, we use $\int$ to stand for the Lebesgue integral over the whole real line $(-\infty, \infty)$ and we omit writing out the measure $dy$ when no risk of confusion arises. In order to accommodate the boundary condition (4.14), we work on the standard Sobolev space $W^{1,2}(\mathbb{R})$, the completion of the set of all compactly supported real-valued smooth functions over $\mathbb{R}$ under the norm
$$\|f\|^2_{W^{1,2}(\mathbb{R})} = \|f\|_2^2 + \|f'\|_2^2,$$
where we use $\|\cdot\|_p$ ($p \geq 1$) to denote the integral norm
$$\|f\|_p = \left(\int |f(y)|^p\right)^{1/p}.$$
Use $C(\mathbb{R})$ to denote the space of continuous functions over $\mathbb{R}$ vanishing at $\pm\infty$, equipped with the standard pointwise norm
$$\|f\|_{C(\mathbb{R})} = \sup_{-\infty < y < \infty} |f(y)|.$$
Then an immediate application of the Schwarz inequality yields the continuous embedding $W^{1,2}(\mathbb{R}) \subset C(\mathbb{R})$ with
$$\|f\|_{C(\mathbb{R})} \leq \|f\|_{W^{1,2}(\mathbb{R})}, \qquad f \in W^{1,2}(\mathbb{R}). \tag{4.17}$$
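Using (4.17), we see that the functional (4.16) is a well-defined, continuously differentiable functional over $W^{1,2}(\mathbb{R})$, which is also strictly convex. In the following, we show that (4.16) has a critical point in $W^{1,2}(\mathbb{R})$ by minimizing (4.16) over $W^{1,2}(\mathbb{R})$. Recall the following one-dimensional Gagliardo–Nirenberg inequality:
$$\int |f|^{p+1} \leq C(p)\left(\int f^2\right)^{\frac{p+3}{4}}\left(\int (f')^2\right)^{\frac{p-1}{4}}, \qquad f \in W^{1,2}(\mathbb{R}), \tag{4.18}$$
where $p > 1$ and $C(p)$ is a positive constant depending only on $p$. Note that, in view of the Schwarz inequality and the Gagliardo–Nirenberg inequality (4.18) (with $p = 3$), we have
$$\begin{aligned}
\left(\int v^2\right)^2 &= \left(\int \frac{|v|}{1+|v|}\,(1+|v|)|v|\right)^2 \leq \int \frac{v^2}{(1+|v|)^2}\int (1+|v|)^2 v^2\\
&\leq 2\int \frac{v^2}{(1+|v|)^2}\int (v^2 + v^4)\\
&\leq C_1 \int \frac{v^2}{(1+|v|)^2}\left[\int v^2 + \left(\int v^2\right)^{\frac{3}{2}}\left(\int (v')^2\right)^{\frac{1}{2}}\right]\\
&\leq \frac{1}{2}\left(\int v^2\right)^2 + C_2\left(\int \frac{v^2}{(1+|v|)^2}\right)^2 + C_3\left[\left(\int \frac{v^2}{(1+|v|)^2}\right)^6 + \left(\int (v')^2\right)^6\right].
\end{aligned}$$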
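Consequently, we have
$$\|v\|_2 \leq C_0\left(1 + \int \frac{v^2}{1+|v|} + \int (v')^2\, dy\right)^{\frac{3}{2}}, \tag{4.19}$$
where $C_0 > 0$ is a suitable constant. Define $J(v) = (DI(v))(v)$ (the Fréchet derivative) for $v \in W^{1,2}(\mathbb{R})$. Then
$$J(v) = \lim_{t \to 0}\frac{I(v + tv) - I(v)}{t} = (DI(v))(v) = \int\left\{(v')^2 + \lambda Q(y)(e^{v} - 1)v + q(y)v\right\}, \tag{4.20}$$
where $q(y) = \lambda(Q(y) - 1) + h(y)$ vanishes at $y = \pm\infty$ exponentially fast. Let $v^+$ and $v^-$ be the positive and negative parts of $v$ respectively. That is, $v = v^+ - v^-$ and
$$v^+ = \max\{v, 0\} = \frac{1}{2}(v + |v|), \qquad v^- = \max\{0, -v\} = \frac{1}{2}(|v| - v).$$
Then the functional $J(v)$ defined in (4.20) may be rewritten as
$$J(v) = \|v'\|_2^2 + J_1(v) + J_2(v), \tag{4.21}$$
where
$$J_1(v) = \int\left\{\lambda Q(y)\left(e^{v^+} - 1\right)v^+ + q(y)v^+\right\}, \qquad J_2(v) = \int\left\{\lambda Q(y)\left(e^{-v^-} - 1\right)(-v^-) - q(y)v^-\right\}.$$
Recall that $Q(y) \geq 1$, $(e^{v^+} - 1)v^+ \geq (v^+)^2$, and $(e^{-v^-} - 1)(-v^-) \geq (v^-)^2/(1 + |v^-|)$. Hence
$$J_1(v) \geq \lambda\|v^+\|_2^2 - \|q\|_2\|v^+\|_2 \geq \frac{\lambda}{2}\|v^+\|_2^2 - \frac{1}{2\lambda}\|q\|_2^2, \tag{4.22}$$
$$\begin{aligned}
J_2(v) &\geq \int \lambda Q(y)\frac{(v^-)^2}{1+|v^-|} - \int q(y)\frac{v^-}{1+|v^-|}\left(1 + |v^-|\right)\\
&\geq \int \lambda Q(y)\frac{(v^-)^2}{1+|v^-|} - \int |q(y)| - \int \left(\lambda(Q(y) - 1) + h(y)\right)\frac{(v^-)^2}{1+|v^-|}\\
&\geq \int \left(\lambda - |h(y)|\right)\frac{(v^-)^2}{1+|v^-|} - \int |q(y)|\\
&\geq \frac{\lambda}{2}\int \frac{(v^-)^2}{1+|v^-|} - \int |q(y)|,
\end{aligned} \tag{4.23}$$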
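where we have used (4.15). Inserting (4.22) and (4.23) into (4.21) and applying (4.19), we arrive at
$$\begin{aligned}
J(v) &\geq \|v'\|_2^2 + \frac{\lambda}{2}\int \frac{v^2}{1+|v|} - C_4\\
&\geq \frac{1}{2}\|v'\|_2^2 + \min\left\{\frac{1}{2}, \frac{\lambda}{2}\right\}\left(\|v'\|_2^2 + \int \frac{v^2}{1+|v|}\right) - C_4\\
&\geq \frac{1}{2}\|v'\|_2^2 + \min\left\{\frac{1}{2}, \frac{\lambda}{2}\right\} C_0^{-2/3}\|v\|_2^{2/3} - C_5,
\end{aligned} \tag{4.24}$$
where the constants $C_4, C_5 > 0$ depend only on $\lambda$ and $q(y)$. Note also that $\|v'\|_2^{2/3} \leq (1/3)\|v'\|_2^2 + (2/3)$. We can rewrite (4.24) more evenly as
$$J(v) \geq C_6\left(\|v\|_2^{2/3} + \|v'\|_2^{2/3}\right) - C_7 \geq C_8\|v\|^{2/3}_{W^{1,2}(\mathbb{R})} - C_9, \tag{4.25}$$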
where $C_6, C_7, C_8, C_9$ are some positive constants. With the above estimates, we are now ready to carry out the minimization along a standard path. In view of (4.25), let $R > 0$ be such that
$$\inf\left\{J(v) \mid v \in W^{1,2}(\mathbb{R}),\ \|v\|_{W^{1,2}(\mathbb{R})} = R\right\} \geq 1, \tag{4.26}$$
and consider the minimization problem
$$\sigma = \inf\left\{I(v) \mid \|v\|_{W^{1,2}(\mathbb{R})} \leq R\right\}. \tag{4.27}$$
Let $\{v_n\}$ be a minimizing sequence of the problem (4.27). Since $\{v_n\}$ is bounded in $W^{1,2}(\mathbb{R})$, by extracting a subsequence if necessary, we may assume that $\{v_n\}$ is also weakly convergent. Let $\tilde{v}$ be its weak limit in $W^{1,2}(\mathbb{R})$. Since $I(\cdot)$ is a continuously differentiable functional over $W^{1,2}(\mathbb{R})$ and convex, $I(\cdot)$ is weakly lower semicontinuous. Hence $I(\tilde{v}) \leq \lim_{n\to\infty} I(v_n) = \sigma$. Of course, $\|\tilde{v}\|_{W^{1,2}(\mathbb{R})} \leq R$ because the norm of $W^{1,2}(\mathbb{R})$ is also weakly lower semicontinuous. Hence $\tilde{v}$ solves (4.27). That is, $I(\tilde{v}) = \sigma$. To show that $\tilde{v}$ is a critical point of $I(\cdot)$, we need to show that $\tilde{v}$ is interior, or $\|\tilde{v}\|_{W^{1,2}(\mathbb{R})} < R$. Otherwise, if $\|\tilde{v}\|_{W^{1,2}(\mathbb{R})} = R$, then, by (4.26), we have
$$\lim_{t \to 0}\frac{I(\tilde{v} - t\tilde{v}) - I(\tilde{v})}{t} = -(DI(\tilde{v}))(\tilde{v}) = -J(\tilde{v}) \leq -1.$$
In particular, when $t > 0$ is sufficiently small, we have $I(\tilde{v} - t\tilde{v}) < I(\tilde{v}) = \sigma$. On the other hand, $\|\tilde{v} - t\tilde{v}\|_{W^{1,2}(\mathbb{R})} = (1 - t)R < R$. These two facts violate the definition of $\sigma$ made in (4.27). Therefore $\tilde{v}$ is interior. Consequently, it is a critical point of $I(\cdot)$ in $W^{1,2}(\mathbb{R})$, which is a weak solution of (4.11). The standard elliptic regularity theory shows that this gives rise to a $C^\infty$-solution of the original equation (4.1) subject to the boundary conditions (4.2) and (4.3).
The strict convexity of $I(\cdot)$ already implies that $I(\cdot)$ can have at most one critical point in $W^{1,2}(\mathbb{R})$. Hence $I(\cdot)$ has exactly one critical point in $W^{1,2}(\mathbb{R})$, and the uniqueness of a solution to (4.1)–(4.3) or (4.11) and (4.14) follows. In fact, such a uniqueness result follows from the structure of the equation in a more straightforward way: if $v_1$ and $v_2$ are two solutions of (4.11) and (4.14), then $w = v_1 - v_2$ satisfies $w'' = \lambda Q(y)e^{\xi(y)}w$, where $\xi(y)$ lies between $v_1(y)$ and $v_2(y)$. Since $w = 0$ at $y = \pm\infty$ and $Q(y)e^{\xi(y)} > 0$ for any $y$, we must have $w(y) \equiv 0$.

Finally, we estimate the asymptotic exponential decay rates of the solution of (4.11) and (4.14) near $y = \pm\infty$. For $y > y_0$, we see that (4.11) takes the form
$$v'' = \lambda Q(y)\left(e^{v} - 1\right) + \lambda\left(Q(y) - 1\right). \tag{4.28}$$
Introduce a comparison function
$$V = Ce^{-\sqrt{\omega}(1-\varepsilon)y}, \qquad \omega = \min\{\lambda, (\omega_1 - \omega_2)^2\}, \qquad 0 < \varepsilon < 1. \tag{4.29}$$
Then $V$ satisfies $V'' = \omega(1-\varepsilon)^2 V$. In view of this, (4.28), and (4.29), we have
$$(v \pm V)'' = \lambda Q(y)e^{\xi(y)}(v \pm V) + \lambda\left(Q(y) - 1\right) \mp \left(\lambda Q(y)e^{\xi(y)} - \omega(1-\varepsilon)^2\right)Ce^{-\sqrt{\omega}(1-\varepsilon)y}, \tag{4.30}$$
where $\xi(y)$ lies between 0 and $v(y)$. Assume that the constant $C$ in (4.29) satisfies $C \geq 1$ (say). Since $v(y) \to 0$ and $Q(y) - 1 = O(e^{-(\omega_1-\omega_2)y})$ as $y \to \infty$, we can find a sufficiently large $y_1 > 0$ so that
$$\lambda\left(Q(y) - 1\right) - \left(\lambda Q(y)e^{\xi(y)} - \omega(1-\varepsilon)^2\right)Ce^{-\sqrt{\omega}(1-\varepsilon)y} < 0, \qquad y > y_1. \tag{4.31}$$
Combining (4.30) and (4.31), we arrive at
$$(v + V)'' < \lambda Q(y)e^{\xi(y)}(v + V), \qquad y > y_1. \tag{4.32}$$
Of course, we may choose the constant $C \geq 1$ in the definition of the function $V$ (see (4.29)) sufficiently large so that $(v + V)(y_1) \geq 0$. Using this condition, the fact that $v + V = 0$ at $y = \infty$, and (4.32), we get $(v + V)(y) > 0$ for all $y > y_1$. On the other hand, since $Q(y) > 1$ for $y > y_0$ (see (4.12)), in view of (4.29) and (4.30) again, we see that there is a sufficiently large $y_2 > y_0$ so that
$$(v - V)'' > \lambda Q(y)e^{\xi(y)}(v - V), \qquad y > y_2. \tag{4.33}$$
Again, we may choose the constant $C \geq 1$ (say) in the definition of the function $V$ (see (4.29)) sufficiently large so that $(v - V)(y_2) \leq 0$. Using this condition, the fact that $v - V = 0$ at $y = \infty$, and (4.33), we get $(v - V)(y) < 0$ for all $y > y_2$. In summary, we have obtained the expected asymptotic exponential decay estimate $|v(y)| < V(y) = Ce^{-\sqrt{\omega}(1-\varepsilon)y}$ for $y \to \infty$. A similar argument leads to the exponential decay estimate for $v(y)$ as $y \to -\infty$. The proof of Theorem 4.1 is now complete.

Acknowledgements. N.S. wishes to acknowledge a fruitful collaboration with Minoru Eto, Youichi Isozumi, Muneto Nitta, Keisuke Ohashi, Kazutoshi Ohta, Yuji Tachikawa, and David Tong. He also benefitted from useful communications with Jarah Evslin and David Tong on the solvability of the master equations. N.S. is supported in part by Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology, Japan No.16028203 for the priority area “origin of mass” and No.17540237. Y.Y. was supported in part by NSF grant DMS–0406446.
References
1. Seiberg, N., Witten, E.: Nucl. Phys. B426, 19 (1994) [Erratum-ibid. B430, 485 (1994)]; Nucl. Phys. B431, 484 (1994); Seiberg, N.: Nucl. Phys. B435, 129 (1995)
2. Horava, P., Witten, E.: Nucl. Phys. B460, 506 (1996)
3. Arkani-Hamed, N., Dimopoulos, S., Dvali, G.R.: Phys. Lett. B429, 263 (1998); Antoniadis, I., Arkani-Hamed, N., Dimopoulos, S., Dvali, G.R.: Phys. Lett. B436, 257 (1998)
4. Randall, L., Sundrum, R.: Phys. Rev. Lett. 83, 3370 (1999); Phys. Rev. Lett. 83, 4690 (1999)
5. Dimopoulos, S., Georgi, H.: Nucl. Phys. B193, 150 (1981); Sakai, N.: Z. f. Phys. C11, 153 (1981); Witten, E.: Nucl. Phys. B188, 513 (1981); Dimopoulos, S., Raby, S., Wilczek, F.: Phys. Rev. D24, 1681 (1981)
6. Witten, E., Olive, D.: Phys. Lett. 78B, 97 (1978)
7. Bogomol’nyi, E.: Sov. J. Nucl. Phys. B24, 449 (1976); Prasad, M.K., Sommerfield, C.H.: Phys. Rev. Lett. 35, 760 (1975)
8. Abraham, E.R.C., Townsend, P.K.: Phys. Lett. B291, 85 (1992); Phys. Lett. B295, 225 (1992); Cvetic, M., Quevedo, F., Rey, S.J.: Phys. Rev. Lett. 67, 1836 (1991); Cvetic, M., Griffies, S., Rey, S.J.: Nucl. Phys. B381, 301 (1992)
9. Dvali, G.R., Shifman, M.A.: Nucl. Phys. B504, 127 (1997); Phys. Lett. B396, 64 (1997) [Erratum-ibid. B407, 452 (1997)]; Kovner, A., Shifman, M.A., Smilga, A.: Phys. Rev. D56, 7978 (1997); Smilga, A., Veselov, A.: Phys. Rev. Lett. 79, 4529 (1997); de Carlos, B., Moreno, J.M.: Phys. Rev. Lett. 83, 2120 (1999); Bazeia, D., Boschi-Filho, H., Brito, F.A.: JHEP 9904, 028 (1999); Kaplunovsky, V.S., Sonnenschein, J., Yankielowicz, S.: Nucl. Phys. B552, 209 (1999); Dvali, G.R., Gabadadze, G., Kakushadze, Z.: Nucl. Phys. B562, 158 (1999); Ito, K., Oda, H., Naganuma, M., Sakai, N.: Phys. Lett. B471, 140 (1999); Naganuma, M., Nitta, M.: Prog. Theor. Phys. 105, 501 (2001); Acharya, B.S., Vafa, C.: http://arXiv.org/list/hep-th/0103011, 2001; Maru, N., Sakai, N., Sakamura, Y., Sugisaka, R.: Nucl. Phys. B616, 47 (2001); Binosi, D., ter Veldhuis, T.: Phys. Rev. D63, 085016 (2001); Ritz, A., Shifman, M., Vainshtein, A.: Phys. Rev. D66, 065015 (2002); Eto, M., Maru, N., Sakai, N., Sakata, T.: Phys. Lett. B553, 87-95 (2003); Eto, M., Sakai, N.: Phys. Rev. D68, 125001 (2003)
10. Gauntlett, J.P., Tong, D., Townsend, P.K.: Phys. Rev. D64, 025010 (2001)
11. Gauntlett, J.P., Tong, D., Townsend, P.K.: Phys. Rev. D63, 085001 (2001)
12. Gauntlett, J.P., Portugues, R., Tong, D., Townsend, P.K.: Phys. Rev. D63, 085002 (2001)
13. Arai, M., Naganuma, M., Nitta, M., Sakai, N.: Nucl. Phys. B652, 35 (2003); Arafune, J. et al. (eds.): Garden of Quanta - In honor of Hiroshi Ezawa, Singapore: World Scientific Publishing Co. Pte. Ltd., 2003, pp. 299–325
14. Naganuma, M., Nitta, M., Sakai, N.: Grav. Cosmol. 8, 129 (2002); Portugues, R., Townsend, P.K.: JHEP 0204, 039 (2002)
15. Shifman, M., Yung, A.: Phys. Rev. D67, 125007 (2003)
16. Arai, M., Ivanov, E., Niederle, J.: Nucl. Phys. B680, 23 (2004)
17. Arai, M., Fujita, S., Naganuma, M., Sakai, N.: Phys. Lett. B556, 192 (2003); To appear in the proceedings of International Seminar on Supersymmetries and Quantum Symmetries SQS 03, Dubna, Russia, 24–29 Jul 2003, available at http://arXiv.org/list/hep-th/0311210, 2003; To appear in the Proceedings of SUSY 2003, “SUSY in the Desert” 11th Annual International Conference on Supersymmetry and the Unification of Fundamental Interactions, Tucson, Arizona, 5-10 Jun 2003, available at http://arXiv.org/list/hep-th/0402040, 2004; Eto, M., Fujita, S., Naganuma, M., Sakai, N.: Phys. Rev. D69, 025007 (2004)
18. Tong, D.: Phys. Rev. D66, 025013 (2002)
19. Tong, D.: JHEP 0304, 031 (2003)
20. Isozumi, Y., Ohashi, K., Sakai, N.: JHEP 0311, 060 (2003); JHEP 0311, 061 (2003)
21. Arai, M., Nitta, M., Sakai, N.: Prog. Theor. Phys. 113, 657 (2005); To appear in the Proceedings of the 3rd International Symposium on Quantum Theory and Symmetries (QTS3), September 10–14, 2003, available at http://arXiv.org/list/hep-th/0401084, 2004; (Published in) Phys. Atom. Nucl. 68 (2005) 1634 [Yad. Fiz. 68 (2005) 1698], the Proceedings of the International Conference on “Symmetry Methods in Physics (SYM-PHYS10)” held at Yerevan, Armenia, 13–19 Aug. 2003, available at http://arXiv.org/list/hep-th/0401102, 2004; to appear in the Proceedings of SUSY 2003 held at the University of Arizona, Tucson, AZ, June 5–10, 2003, available at http://arxiv.org/list/hep-th/0402065
22. Shifman, M., Yung, A.: Phys. Rev. D70, 025013 (2004)
23. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Phys. Rev. Lett. 93, 161601 (2004)
24. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Phys. Rev. D70, 125014 (2004)
25. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Phys. Rev. D71, 065018 (2005)
26. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: In: The Proceedings of 12th International Conference on Supersymmetry and Unification of Fundamental Interactions (SUSY 04), p.1, (KEK) Tsukuba, Japan, 17–23 Jun 2004, available at http://arXiv.org/list/hep-th/0409110, 2004; (Published in) pages 229–238 in “Theme of Unification”, Pran Nath Festschrift (2005), World Scientific, Singapore, the proceedings of “NathFest” at PASCOS conference, Northeastern University, Boston, MA, August 2004, available at http://arXiv.org/list/hep-th/0410150, 2004
27. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Ohta, K., Sakai, N.: Phys. Rev. D71, 125006 (2005)
28. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Phys. Rev. D72, 025011 (2005)
29. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Ohta, K., Sakai, N., Tachikawa, Y.: Phys. Rev. D71, 105009 (2005)
30. Sakai, N., Tong, D.: JHEP 03, 019 (2005)
31. Manton, N.S.: Phys. Lett. B110, 54 (1982)
32. Lindström, U., Roček, M.: Nucl. Phys. B222, 285 (1983)
33. Hitchin, N.J., Karlhede, A., Lindström, U., Roček, M.: Commun. Math. Phys. 108, 535 (1987)
34. Kakimoto, K., Sakai, N.: Phys. Rev. D68, 065005 (2003)
35. Lee, K.S.M.: Phys. Rev. D67, 045009 (2003)
36. Jaffe, A., Taubes, C.H.: Vortices and Monopoles, Boston: Birkhäuser, 1980

Communicated by N.A. Nekrasov
Commun. Math. Phys. 267, 801–820 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0082-5
Communications in
Mathematical Physics
Global Existence and Blow-Up Phenomena for the Degasperis-Procesi Equation Yue Liu1 , Zhaoyang Yin2,3 1 Department of Mathematics, University of Texas, Arlington, TX 76019, USA.
E-mail: [email protected]
2 Department of Mathematics, Zhongshan University, 510275 Guangzhou, China.
E-mail: [email protected]
3 Institute for Applied Mathematics, University of Hanover, 30167 Hanover, Germany.
E-mail: [email protected] Received: 25 January 2006 / Accepted: 2 March 2006 Published online: 8 August 2006 – © Springer-Verlag 2006
Abstract: This paper is concerned with several aspects of the existence of global solutions and the formation of singularities for the Degasperis-Procesi equation on the line. Global strong solutions to the equation are determined for a class of initial profiles. On the other hand, it is shown that the first blow-up can occur only in the form of wave-breaking. A new wave-breaking mechanism for solutions is described in detail and two results of blow-up solutions with certain initial profiles are established.

1. Introduction

Recently, Degasperis and Procesi [21] studied the following family of third order dispersive PDE conservation laws,
$$u_t + c_0 u_x + \gamma u_{xxx} - \alpha^2 u_{txx} = \left(c_1 u^2 + c_2 u_x^2 + c_3 u u_{xx}\right)_x, \tag{1.1}$$
where $\alpha$, $c_0$, $c_1$, $c_2$, and $c_3$ are real constants and subindices denote partial derivatives. They found that there are only three equations that satisfy the asymptotic integrability condition within this family: the KdV equation, the Camassa-Holm equation and the Degasperis-Procesi equation.

With $\alpha = c_2 = c_3 = 0$ in Eq. (1.1), it becomes the well-known Korteweg-de Vries equation, which describes the unidirectional propagation of waves at the free surface of shallow water under the influence of gravity: $u(t, x)$ represents the wave height above a flat bottom, $x$ is proportional to the distance in the direction of propagation and $t$ is proportional to the elapsed time. The KdV equation is completely integrable and its solitary waves are solitons [22, 39]. The Cauchy problem of the KdV equation has been the subject of a number of studies, and a satisfactory local or global (in time) existence theory is now in hand (for example, see [31, 43]). It is shown that the KdV equation is globally well-posed for $u_0 \in L^2(\mathbb{R})$ [43]. It is observed that the KdV equation does not accommodate wave breaking (by wave breaking we understand that the wave remains bounded but its slope becomes unbounded in finite time [45]).

For $c_1 = -\frac{3}{2}c_3/\alpha^2$ and $c_2 = c_3/2$, Eq. (1.1) becomes the Camassa-Holm equation, modeling the unidirectional propagation of shallow water waves over a flat bottom, with $u(t, x)$ standing for the fluid velocity at time $t$ in the spatial $x$ direction and $c_0$ being a nonnegative parameter related to the critical shallow water speed [3, 23, 29]. The Camassa-Holm equation is also a model for the propagation of axially symmetric waves in hyperelastic rods [17, 19]. It has a bi-Hamiltonian structure [33, 26] and is completely integrable [3, 9]. Its solitary waves are smooth if $c_0 > 0$ and peaked in the limiting case $c_0 = 0$ [4]. The orbital stability of the peaked solitons is proved in [16], and that of the smooth solitons in [18]. The explicit interaction of the peaked solitons is given in [1]. The Cauchy problem of the Camassa-Holm equation has been studied extensively. It has been shown that the Camassa-Holm equation is locally well-posed [10, 34, 42] for initial data $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$. More interestingly, it has global strong solutions [7, 10] and also solutions that blow up in finite time [7, 10, 11, 14, 35], for different classes of initial profiles in the Sobolev spaces $H^s(\mathbb{R})$, $s > 3/2$. On the other hand, it has global weak solutions in $H^1(\mathbb{R})$ [2, 12, 15, 46]. It is observed that if $u$ is the solution of the Camassa-Holm equation with the initial data $u_0$ in $H^1(\mathbb{R})$, we have for all $t > 0$,
$$\|u(t, \cdot)\|_{L^\infty(\mathbb{R})} \leq \sqrt{2}\,\|u(t, \cdot)\|_{H^1(\mathbb{R})} \leq \sqrt{2}\,\|u_0(\cdot)\|_{H^1(\mathbb{R})}.$$
The advantage of the Camassa-Holm equation in comparison with the KdV equation lies in the fact that the Camassa-Holm equation has peaked solitons and models wave breaking [4].

With $c_1 = -2c_3/\alpha^2$ and $c_2 = c_3$ in Eq. (1.1), by rescaling, shifting the dependent variable and applying a Galilean boost [20], we find the Degasperis-Procesi equation of the form
$$u_t - u_{txx} + 4uu_x = 3u_x u_{xx} + uu_{xxx}, \qquad t > 0,\ x \in \mathbb{R}. \tag{1.2}$$
Degasperis, Holm and Hone [20] proved the formal integrability of Eq. (1.2) by constructing a Lax pair. They also showed [20] that Eq. (1.2) has a bi-Hamiltonian structure and an infinite sequence of conserved quantities, and admits exact peakon solutions which are analogous to the Camassa-Holm peakons. The Degasperis-Procesi equation can be regarded as a model for nonlinear shallow water dynamics and its asymptotic accuracy is the same as for the Camassa-Holm shallow water equation. Dullin, Gottwald and Holm [24] showed that the Degasperis-Procesi equation can be obtained from the shallow water elevation equation by an appropriate Kodama transformation. Lundmark and Szmigielski [37] presented an inverse scattering approach for computing n-peakon solutions to Eq. (1.2). Vakhnenko and Parkes [44] investigated traveling wave solutions of Eq. (1.2). Holm and Staley [28] studied numerically the stability of solitons and peakons of Eq. (1.2).

After the Degasperis-Procesi equation (1.2) was derived, many papers were devoted to its study, cf. [13, 27, 32, 36, 38, 47, 48] and the citations therein. For example, Yin proved local well-posedness of Eq. (1.2) with initial data $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, on the line [47] and on the circle [48], and derived the precise blow-up scenario and a blow-up result. The global existence of strong solutions and global weak solutions to Eq. (1.2) are also investigated in [49, 50]. Recently, Lenells [32] classified all weak traveling wave solutions. Matsuno [38] studied multisoliton solutions and their peakon limit. Analogous to the case of the Camassa-Holm equation [8], Henry [27] and Mustafa [41] showed that smooth solutions to Eq. (1.2) have infinite speed of propagation.
Global Existence and Blow-Up Phenomena for the Degasperis-Procesi Equation
Coclite and Karlsen [5] also obtained global existence results for entropy weak solutions belonging to the class $L^1(\mathbb{R}) \cap BV(\mathbb{R})$ and the class $L^2(\mathbb{R}) \cap L^4(\mathbb{R})$. Despite the similarities to the Camassa-Holm equation, we would like to point out that these two equations are truly different. One of the important features of Eq. (1.2) is that it has not only peakon solitons [20],
\[ u(t,x) = c\,e^{-|x-ct|}, \qquad c > 0, \]
but also shock peakons [6, 36] of the form
\[ u(t,x) = -\frac{1}{t+k}\,\mathrm{sgn}(x)\,e^{-|x|}, \qquad k > 0. \]
It is easy to see from [36] that the above shock-peakon solutions can be observed by substituting $(x,t) \mapsto (\epsilon x, \epsilon t)$ into Eq. (1.2) and letting $\epsilon \to 0$, which yields the "derivative Burgers equation" $(u_t + uu_x)_{xx} = 0$, from which shock waves form. On the other hand, the isospectral problem in the Lax pair for Eq. (1.2) is the third-order equation
\[ \psi_x - \psi_{xxx} - \lambda y\psi = 0, \]
cf. [20], while the isospectral problem for the Camassa-Holm equation is the second-order equation
\[ \psi_{xx} - \frac{1}{4}\psi - \lambda y\psi = 0 \]
(in both cases $y = u - u_{xx}$), cf. [3]. Another indication of the fact that there is no simple transformation of Eq. (1.2) into the Camassa-Holm equation is the entirely different form of conservation laws for these two equations [3, 20]. Furthermore, the Camassa-Holm equation is a re-expression of geodesic flow on the diffeomorphism group [13] or on the Bott-Virasoro group [40], while no geometric derivation of the Degasperis-Procesi equation is available. The following are three useful conservation laws of the Degasperis-Procesi equation:
\[ E_1(u) = \int_{\mathbb{R}} y\,dx, \qquad E_2(u) = \int_{\mathbb{R}} yv\,dx, \qquad E_3(u) = \int_{\mathbb{R}} u^3\,dx, \]
where $y = (1-\partial_x^2)u$ and $v = (4-\partial_x^2)^{-1}u$, while the corresponding three useful conservation laws of the Camassa-Holm equation are the following:
\[ F_1(u) = \int_{\mathbb{R}} y\,dx, \qquad F_2(u) = \int_{\mathbb{R}} (u^2 + u_x^2)\,dx, \qquad F_3(u) = \int_{\mathbb{R}} (u^3 + uu_x^2)\,dx. \]
It is found that the corresponding conservation laws of the Degasperis-Procesi equation are much weaker than those of the Camassa-Holm equation. Therefore, the issue of if and how particular initial data generate a global solution or blow-up in finite time is more subtle. As far as we know, the case of the Camassa-Holm equation is well understood by now [7, 10, 11, 14, 35] and the citations therein, while the Degasperis-Procesi equation case is the subject of this paper. The goal of this paper is to establish several new global existence and blow-up results for Eq. (1.2), and blow-up set as well so that important physical phenomena of Eq. (1.2) (such as wave breaking and shock waves) could be understood deeply. A forthcoming paper by the authors [25] deals with global weak solutions in H 1 (R) and blow-up structure for the Degasperis-Procesi equation.
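To make the comparison between $E_2$ and $F_2$ concrete, the following worked identity may help. It is a sketch that assumes the unitary normalization of the Fourier transform and uses only the definitions $y = (1-\partial_x^2)u$ and $v = (4-\partial_x^2)^{-1}u$ given above; it shows that $E_2$ is equivalent to the $L^2$-norm and therefore cannot control $u_x$, in contrast to $F_2$.

```latex
% Assumption: unitary Fourier transform, so that Plancherel holds without constants.
% On the Fourier side, \hat{y} = (1+\xi^2)\hat{u} and \hat{v} = \hat{u}/(4+\xi^2).
\[
E_2(u) = \int_{\mathbb{R}} y\,v\,dx
       = \int_{\mathbb{R}} \frac{1+\xi^2}{4+\xi^2}\,|\hat{u}(\xi)|^2\,d\xi ,
\qquad
\frac{1}{4} \le \frac{1+\xi^2}{4+\xi^2} \le 1 ,
\]
\[
\text{so that}\quad
\frac{1}{4}\,\|u\|_{L^2}^2 \le E_2(u) \le \|u\|_{L^2}^2 ,
\qquad\text{whereas}\qquad
F_2(u) = \|u\|_{H^1}^2 .
\]
```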
Y. Liu, Z. Yin
As mentioned earlier, the first blow-up must occur as wave breaking, and shock waves possibly appear afterwards. On the other hand, obtaining global existence from local results is a matter of a priori estimates. One approach to proving global existence or wave breaking for the shallow water equation (1.2) is to follow an idea of Constantin [7]; that is, to show that for a large class of initial profiles the corresponding solutions to Eq. (1.2) either exist globally in time or blow up in finite time, by using a continuous family of diffeomorphisms of the line associated to Eq. (1.2). However, the ideas in [7] depend heavily on the conservation law $F_2(u)$, which is an $H^1$-norm. Although the bi-Hamiltonian structure of Eq. (1.2) provides an infinite number of conservation laws in our case, the conservation laws $E_i(u)$ cannot guarantee the boundedness of the slope of the wave, and there is no way to find conservation laws controlling the $H^1$-norm. To deal with this difficulty and make wave breaking possible, for example in Theorem 4.2, we develop a new wave-breaking mechanism for solutions in detail. We first obtain an a priori estimate for the $L^\infty$-norm of the solution; then, by the structure of the equation, we find that the slope of the solution approaches infinity in finite or infinite time much faster than the solution itself, even if the latter is unbounded. As a result, this leads to the wave breaking phenomenon too. The remainder of the paper is organized as follows. In Sect. 2, we recall the local well-posedness of the Cauchy problem of Eq. (1.2) with initial data $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, the precise blow-up scenario of strong solutions, and two useful results from [47, 50] which are crucial in the proof of global existence and blow-up phenomena for Eq. (1.2). In Sect. 3, by using a new conservation law and a very useful a priori estimate for the $L^\infty$-norm of the strong solutions to Eq. (1.2), we will present two new global existence results for strong solutions to Eq.
(1.2) with certain initial profiles. The last section, Sect. 4, is devoted to establishing two new blow-up results and to showing the existence of a breaking point where the slope of the solution becomes infinite exactly at breaking time.

Notation. As above and henceforth, we denote by $*$ the convolution. We write $\hat{f}$ for the Fourier transform of $f$. We also use $(\cdot,\cdot)$ to represent the standard inner product in $L^2(\mathbb{R})$. For $1 \le p \le \infty$, the norm in the Lebesgue space $L^p$ will be written $\|\cdot\|_{L^p}$, while $\|\cdot\|_s$, $s \ge 0$, will stand for the norm in the classical Sobolev spaces $H^s(\mathbb{R})$.

2. Preliminaries

Since we shall also use a priori estimates and further properties of solutions in $H^s(\mathbb{R})$, $s > \frac{3}{2}$, we briefly collect the needed results from [47, 50] in order to pursue our goal. With $y := u - u_{xx}$, Eq. (1.2) takes the form of a quasi-linear evolution equation of hyperbolic type:
\[ \begin{cases} y_t + uy_x + 3u_xy = 0, & t > 0,\ x \in \mathbb{R}, \\ y(0,x) = u_0(x) - u_{0,xx}(x), & x \in \mathbb{R}. \end{cases} \tag{2.1} \]
Note that if $p(x) := \frac{1}{2}e^{-|x|}$, $x \in \mathbb{R}$, then $(1-\partial_x^2)^{-1}f = p * f$ for all $f \in L^2(\mathbb{R})$ and $p * (u - u_{xx}) = u$. Using this identity, we can rewrite Eq. (2.1) as follows:
\[ \begin{cases} u_t + uu_x + \partial_x p * \left(\frac{3}{2}u^2\right) = 0, & t > 0,\ x \in \mathbb{R}, \\ u(0,x) = u_0(x), & x \in \mathbb{R}. \end{cases} \tag{2.2} \]
The local well-posedness of the Cauchy problem of Eq. (2.2) with initial data $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, can be obtained by applying Kato's theorem [30, 47]. As a result, we have the following well-posedness result.
Lemma 2.1 [47]. Given $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, there exist a maximal $T = T(u_0) > 0$ and a unique solution $u$ to Eq. (1.2) (or Eq. (2.2)) such that
\[ u = u(\cdot, u_0) \in C([0,T); H^s(\mathbb{R})) \cap C^1([0,T); H^{s-1}(\mathbb{R})). \]
Moreover, the solution depends continuously on the initial data, i.e. the mapping $u_0 \mapsto u(\cdot,u_0) : H^s(\mathbb{R}) \to C([0,T); H^s(\mathbb{R})) \cap C^1([0,T); H^{s-1}(\mathbb{R}))$ is continuous, and the maximal time of existence $T > 0$ can be chosen to be independent of $s$.

By using the local well-posedness in Lemma 2.1 and the energy method, one can get the following precise blow-up scenario of strong solutions to Eq. (2.2).

Lemma 2.2 [47]. Given $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, blow-up of the solution $u = u(\cdot,u_0)$ in finite time $T < +\infty$ occurs if and only if
\[ \liminf_{t \uparrow T}\,\Big\{ \inf_{x \in \mathbb{R}}\,[u_x(t,x)] \Big\} = -\infty. \]
Remark 2.1. Lemma 2.2 shows that the Degasperis-Procesi equation and the Camassa-Holm equation have the same blow-up scenario [10]. Since the $H^1$-norm of a solution to the Camassa-Holm equation is conserved, the slope of such a solution becomes unbounded whereas its amplitude remains bounded. However, the $H^1$-norm of a solution to the Degasperis-Procesi equation is not conserved in general, so we cannot infer this blow-up phenomenon for the Degasperis-Procesi equation directly from Lemma 2.2.

Consider the following differential equation:
\[ \begin{cases} q_t = u(t,q), & t \in [0,T), \\ q(0,x) = x, & x \in \mathbb{R}. \end{cases} \tag{2.3} \]
Applying classical results in the theory of ordinary differential equations, one can obtain the following two results on $q$ which are crucial in the proof of global existence and blow-up solutions.

Lemma 2.3 [50]. Let $u_0 \in H^s(\mathbb{R})$, $s \ge 3$, and let $T > 0$ be the maximal existence time of the corresponding solution $u$ to Eq. (2.2). Then Eq. (2.3) has a unique solution $q \in C^1([0,T) \times \mathbb{R}, \mathbb{R})$. Moreover, the map $q(t,\cdot)$ is an increasing diffeomorphism of $\mathbb{R}$ with
\[ q_x(t,x) = \exp\left( \int_0^t u_x(s, q(s,x))\,ds \right) > 0, \qquad \forall (t,x) \in [0,T) \times \mathbb{R}. \]

Lemma 2.4 [50]. Let $u_0 \in H^s(\mathbb{R})$, $s \ge 3$, and let $T > 0$ be the maximal existence time of the corresponding solution $u$ to Eq. (2.2). Setting $y := u - u_{xx}$, we have
\[ y(t, q(t,x))\,q_x^3(t,x) = y_0(x), \qquad \forall (t,x) \in [0,T) \times \mathbb{R}. \]
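The identity $p * (u - u_{xx}) = u$ with $p(x) = \frac{1}{2}e^{-|x|}$, which underlies the passage from (2.1) to (2.2), can be checked numerically. The sketch below is only an illustration: the Gaussian test profile, the truncation range and the grid size are assumptions made here, not choices taken from the paper.

```python
import math

def p(x):
    """Kernel of (1 - d^2/dx^2)^{-1} on the line: p(x) = exp(-|x|) / 2."""
    return 0.5 * math.exp(-abs(x))

def u(x):
    """Hypothetical smooth test profile u(x) = exp(-x^2)."""
    return math.exp(-x * x)

def y(x):
    """y = u - u_xx for u = exp(-x^2), using u_xx = (4x^2 - 2) exp(-x^2)."""
    return (3.0 - 4.0 * x * x) * math.exp(-x * x)

def convolve_p(f, x, lo=-12.0, hi=12.0, n=24000):
    """Trapezoidal approximation of (p * f)(x) on the truncated range [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (p(x - lo) * f(lo) + p(x - hi) * f(hi))
    for k in range(1, n):
        eta = lo + k * h
        total += p(x - eta) * f(eta)
    return total * h

def max_identity_error(points=(-2.0, -0.5, 0.0, 0.3, 1.7)):
    """Largest deviation |p * (u - u_xx) - u| over a few sample points."""
    return max(abs(convolve_p(y, x) - u(x)) for x in points)

if __name__ == "__main__":
    print("max |p*(u - u_xx) - u| =", max_identity_error())
```

The deviation reported by `max_identity_error` reflects only the quadrature and truncation error of this sketch; refining the grid should make it smaller.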
3. Global Existence

In this section, we begin by deriving a conservation law for strong solutions to Eq. (2.2). Using this conservation law, we then obtain an a priori estimate for the $L^\infty$-norm of the strong solutions. This enables us to establish several global existence theorems.

Lemma 3.1. If $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, then as long as the solution $u(t,x)$ given by Lemma 2.1 exists, we have
\[ \int_{\mathbb{R}} y(t,x)\,v(t,x)\,dx = \int_{\mathbb{R}} y_0(x)\,v_0(x)\,dx, \]
where $y(t,x) = u(t,x) - u_{xx}(t,x)$ and $v(t,x) = (4-\partial_x^2)^{-1}u$. Moreover, we have
\[ \|u(t)\|_{L^2}^2 \le 4\|u_0\|_{L^2}^2. \]

Proof. Applying Lemma 2.1 and a simple density argument, we only need to prove the lemma for some $s > \frac{3}{2}$. Here we assume $s = 3$. Let $T > 0$ be the maximal time of existence of the solution $u$ to Eq. (2.2) with initial data $u_0 \in H^3(\mathbb{R})$ such that $u \in C([0,T); H^3(\mathbb{R})) \cap C^1([0,T); H^2(\mathbb{R}))$, which is guaranteed by Lemma 2.1. By Eq. (2.2), we have
\begin{align*}
\frac{1}{2}\frac{d}{dt}\int_{\mathbb{R}} yv\,dx &= \frac{1}{2}\int_{\mathbb{R}} y_tv\,dx + \frac{1}{2}\int_{\mathbb{R}} yv_t\,dx = \int_{\mathbb{R}} y_tv\,dx \\
&= -\int_{\mathbb{R}} vy_xu\,dx - 3\int_{\mathbb{R}} vyu_x\,dx \\
&= -\int_{\mathbb{R}} v(yu)_x\,dx - 2\int_{\mathbb{R}} vyu_x\,dx.
\end{align*}
Using the relations $y = u - u_{xx}$ and $4v - v_{xx} = u$, we obtain
\begin{align*}
\int_{\mathbb{R}} v(yu)_x\,dx &= -\int_{\mathbb{R}} v_xyu\,dx = -\int_{\mathbb{R}} v_xu^2\,dx + \int_{\mathbb{R}} v_xuu_{xx}\,dx \\
&= -\int_{\mathbb{R}} v_xu^2\,dx - \int_{\mathbb{R}} (v_xu)_xu_x\,dx \\
&= -\int_{\mathbb{R}} v_xu^2\,dx - \int_{\mathbb{R}} v_{xx}uu_x\,dx - \int_{\mathbb{R}} v_xu_x^2\,dx \\
&= -\int_{\mathbb{R}} v_xu^2\,dx + \frac{1}{2}\int_{\mathbb{R}} v_{xxx}u^2\,dx - \int_{\mathbb{R}} v_xu_x^2\,dx \\
&= -\int_{\mathbb{R}} v_xu^2\,dx + \frac{1}{2}\int_{\mathbb{R}} (4v_x - u_x)u^2\,dx - \int_{\mathbb{R}} v_xu_x^2\,dx \\
&= -\int_{\mathbb{R}} v_xu^2\,dx + 2\int_{\mathbb{R}} v_xu^2\,dx - \int_{\mathbb{R}} v_xu_x^2\,dx.
\end{align*}
On the other hand,
\[ 2\int_{\mathbb{R}} vyu_x\,dx = 2\int_{\mathbb{R}} vuu_x\,dx - 2\int_{\mathbb{R}} vu_{xx}u_x\,dx = -\int_{\mathbb{R}} v_xu^2\,dx + \int_{\mathbb{R}} v_xu_x^2\,dx. \]
Combining the above three relations, we deduce that
\[ \frac{1}{2}\frac{d}{dt}\int_{\mathbb{R}} yv\,dx = -\int_{\mathbb{R}} v(yu)_x\,dx - 2\int_{\mathbb{R}} vyu_x\,dx = 0. \]
Consequently, this implies the desired conserved quantity. In view of the above conservation law, it then follows that
\begin{align*}
\|u(t)\|_{L^2}^2 = \|\hat{u}(t)\|_{L^2}^2 &\le 4\int_{\mathbb{R}} \frac{1+\xi^2}{4+\xi^2}\,|\hat{u}(t,\xi)|^2\,d\xi = 4(\hat{y}(t), \hat{v}(t)) \\
&= 4(y(t), v(t)) = 4(y_0, v_0) = 4(\hat{y}_0, \hat{v}_0) \\
&\le 4\int_{\mathbb{R}} \frac{1+\xi^2}{4+\xi^2}\,|\hat{u}_0(\xi)|^2\,d\xi \le 4\|\hat{u}_0\|_{L^2}^2 = 4\|u_0\|_{L^2}^2.
\end{align*}
This completes the proof of Lemma 3.1.

The following important estimate can be obtained from Lemma 3.1.

Lemma 3.2. Assume $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$. Let $T$ be the maximal existence time of the solution $u$ to Eq. (2.2) guaranteed by Lemma 2.1. Then we have
\[ \|u(t,x)\|_{L^\infty} \le 3\|u_0(x)\|_{L^2}^2\,t + \|u_0(x)\|_{L^\infty}, \qquad \forall t \in [0,T]. \]

Proof. Applying Lemma 2.1 and a simple density argument, it suffices to consider $s = 3$. Let $T > 0$ be the maximal time of existence of the solution $u$ to Eq. (2.2) with the initial data $u_0 \in H^3(\mathbb{R})$ such that $u \in C([0,T); H^3(\mathbb{R})) \cap C^1([0,T); H^2(\mathbb{R}))$, which is guaranteed by Lemma 2.1. By (2.2), we have
\[ u_t + uu_x = -\partial_x p * \left(\frac{3}{2}u^2\right) = -3p * (uu_x). \tag{3.1} \]
Note that
\begin{align*}
-3p * (uu_x) &= -\frac{3}{2}\int_{-\infty}^{+\infty} e^{-|x-\eta|}uu_\eta\,d\eta \\
&= -\frac{3}{2}\int_{-\infty}^{x} e^{-x+\eta}uu_\eta\,d\eta - \frac{3}{2}\int_{x}^{+\infty} e^{x-\eta}uu_\eta\,d\eta \\
&= \frac{3}{4}\int_{-\infty}^{x} e^{-|x-\eta|}u^2\,d\eta - \frac{3}{4}\int_{x}^{+\infty} e^{-|x-\eta|}u^2\,d\eta.
\end{align*}
By (2.3), we have
\[ \frac{du(t,q(t,x))}{dt} = u_t(t,q(t,x)) + u_x(t,q(t,x))\frac{dq(t,x)}{dt} = (u_t + uu_x)(t,q(t,x)). \]
It then follows from (3.1) that
\[ -\frac{3}{4}\int_{q(t,x)}^{+\infty} e^{-|q(t,x)-\eta|}u^2\,d\eta \le \frac{du(t,q(t,x))}{dt} \le \frac{3}{4}\int_{-\infty}^{q(t,x)} e^{-|q(t,x)-\eta|}u^2\,d\eta. \]
It thus transpires that
\[ \left| \frac{du(t,q(t,x))}{dt} \right| \le \frac{3}{4}\int_{-\infty}^{+\infty} e^{-|q(t,x)-\eta|}u^2\,d\eta \le \frac{3}{4}\int_{-\infty}^{+\infty} u^2(t,\eta)\,d\eta. \]
In view of Lemma 3.1, we have
\[ -3\|u_0\|_{L^2}^2 \le \frac{du(t,q(t,x))}{dt} \le 3\|u_0\|_{L^2}^2. \]
Integrating the above inequality with respect to $t < T$ on $[0,t]$ yields
\[ -3\|u_0\|_{L^2}^2\,t + u_0(x) \le u(t,q(t,x)) \le 3\|u_0\|_{L^2}^2\,t + u_0(x). \]
Thus,
\[ |u(t,q(t,x))| \le \|u(t,q(t,x))\|_{L^\infty} \le 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty}. \tag{3.2} \]
Using the Sobolev embedding to ensure the uniform boundedness of $u_x(s,\eta)$ for $(s,\eta) \in [0,t] \times \mathbb{R}$ with $t \in [0,T)$, in view of Lemma 2.3, we get for every $t \in [0,T)$ a constant $C(t) > 0$ such that
\[ e^{-C(t)} \le q_x(t,x) \le e^{C(t)}, \qquad x \in \mathbb{R}. \]
We deduce from the above inequality that the function $q(t,\cdot)$ is strictly increasing on $\mathbb{R}$ with $\lim_{x \to \pm\infty} q(t,x) = \pm\infty$ as long as $t \in [0,T)$. Thus, by (3.2) we can obtain
\[ \|u(t,x)\|_{L^\infty} = \|u(t,q(t,x))\|_{L^\infty} \le 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty}. \tag{3.3} \]
This completes the proof of Lemma 3.2.

Remark 3.1. Although the $H^1$-norm of a solution to the Degasperis-Procesi equation is not conserved in general, Lemma 2.2 and Lemma 3.2 ensure that, when blow-up occurs in finite time, the slope of the solution becomes unbounded whereas its amplitude remains bounded.

We now present the first global existence result.

Theorem 3.1. Assume $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$. If $y_0 = u_0 - u_{0,xx}$ does not change sign on $\mathbb{R}$, then Eq. (2.2) has a global strong solution
\[ u = u(\cdot,u_0) \in C([0,\infty); H^s(\mathbb{R})) \cap C^1([0,\infty); H^{s-1}(\mathbb{R})). \]
Moreover, $E_2(u) = \int_{\mathbb{R}} yv\,dx$ is a conservation law, where $y = u - u_{xx}$ and $v = (4-\partial_x^2)^{-1}u$, and we have for all $t \in \mathbb{R}_+$,
(i) $|u_x(t,\cdot)| \le |u(t,\cdot)|$ on $\mathbb{R}$,
(ii) $\|u\|_1^2 \le 6\|u_0\|_{L^2}^4\,t^2 + 4\|u_0\|_{L^2}^2\|u_0\|_{L^\infty}\,t + \|u_0\|_1^2$.

Proof. We only assume $s = 3$ to prove the above theorem. Let $T$ be the maximal time of existence of the solution $u$ to Eq. (2.2) with initial data $u_0 \in H^3(\mathbb{R})$. We first consider the case where $y_0 \ge 0$ on $\mathbb{R}$. If $y_0 \ge 0$, then Lemma 2.3 and Lemma 2.4 ensure that $y \ge 0$ for all $t \in [0,T)$. Using $u = p * y$ and the positivity of $p$, we infer that $u(t,\cdot) \ge 0$ for all $t \in [0,T)$. Note that
\[ u(t,x) = \frac{e^{-x}}{2}\int_{-\infty}^{x} e^{\eta}y(t,\eta)\,d\eta + \frac{e^{x}}{2}\int_{x}^{\infty} e^{-\eta}y(t,\eta)\,d\eta \tag{3.4} \]
and
\[ u_x(t,x) = -\frac{e^{-x}}{2}\int_{-\infty}^{x} e^{\eta}y(t,\eta)\,d\eta + \frac{e^{x}}{2}\int_{x}^{\infty} e^{-\eta}y(t,\eta)\,d\eta. \tag{3.5} \]
From the above two equations, we deduce that
\[ \begin{cases} u(t,x) + u_x(t,x) = e^{x}\displaystyle\int_{x}^{\infty} e^{-\eta}y(t,\eta)\,d\eta, \\[2mm] u(t,x) - u_x(t,x) = e^{-x}\displaystyle\int_{-\infty}^{x} e^{\eta}y(t,\eta)\,d\eta. \end{cases} \tag{3.6} \]
By (3.6) and $y \ge 0$ for all $t \in [0,T)$, we obtain
\[ |u_x(t,x)| \le u(t,x), \qquad \forall (t,x) \in [0,T) \times \mathbb{R}. \]
This proves (i). By Lemma 3.2, we have
\[ |u_x(t,x)| \le u(t,x) \le 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty}, \qquad \forall (t,x) \in [0,T) \times \mathbb{R}. \]
The above inequality and Lemma 2.2 imply $T = \infty$. This proves that the solution $u$ exists globally in time. Multiplying Eq. (1.2) by $u$ and integrating by parts, in view of (i), Lemma 3.1 and Lemma 3.2, we get
\begin{align*}
\frac{1}{2}\frac{d}{dt}\int_{\mathbb{R}} (u^2 + u_x^2)\,dx &= -4\int_{\mathbb{R}} u^2u_x\,dx + 3\int_{\mathbb{R}} uu_xu_{xx}\,dx + \int_{\mathbb{R}} u^2u_{xxx}\,dx \\
&= -\frac{1}{2}\int_{\mathbb{R}} u_x^3\,dx \le \frac{1}{2}\int_{\mathbb{R}} u^3\,dx \le \frac{1}{2}\|u\|_{L^\infty}\int_{\mathbb{R}} u^2\,dx \\
&\le \frac{1}{2}\left( 12\|u_0\|_{L^2}^2\,t + 4\|u_0\|_{L^\infty} \right)\int_{\mathbb{R}} u_0^2\,dx.
\end{align*}
Integrating the above inequality with respect to $t$, we have
\[ \|u\|_1^2 \le 6\|u_0\|_{L^2}^4\,t^2 + 4\|u_0\|_{L^2}^2\|u_0\|_{L^\infty}\,t + \|u_0\|_1^2. \]
This proves (ii) and completes the proof of the theorem under the assumption $y_0 \ge 0$ on $\mathbb{R}$. In the case when $y_0(x) \le 0$ on $\mathbb{R}$, one can repeat the above proof to get the desired result.

Remark 3.2. Theorem 3.1 improves the previous global existence result in Theorem 3.4 (see [47]), where the additional assumptions $u_0 \in L^3(\mathbb{R})$ and $y_0 = (u_0 - u_{0,xx}) \in L^1(\mathbb{R})$ are needed. The new conservation law in Lemma 3.1 and the a priori estimate for the $L^\infty$-norm of strong solutions to Eq. (2.2) in Lemma 3.2 enable us to eliminate these additional assumptions.

We now present the second global existence result.

Theorem 3.2. Assume $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, and there exists $x_0 \in \mathbb{R}$ such that
\[ \begin{cases} y_0(x) \le 0 & \text{if } x \le x_0, \\ y_0(x) \ge 0 & \text{if } x \ge x_0. \end{cases} \]
Then Eq. (2.2) has a unique global strong solution
\[ u = u(\cdot,u_0) \in C([0,\infty); H^s(\mathbb{R})) \cap C^1([0,\infty); H^{s-1}(\mathbb{R})). \]
Moreover, $E_2(u) = \int_{\mathbb{R}} yv\,dx$ is a conservation law, where $y = (1-\partial_x^2)u$ and $v = (4-\partial_x^2)^{-1}u$, and for all $t \in \mathbb{R}_+$ we have
(i) $u_x(t,\cdot) \ge -|u(t,\cdot)|$ on $\mathbb{R}$,
(ii) $\|u\|_1^2 \le 6\|u_0\|_{L^2}^4\,t^2 + 4\|u_0\|_{L^2}^2\|u_0\|_{L^\infty}\,t + \|u_0\|_1^2$.

Proof. We only assume $s = 3$ to prove the above theorem. Note that the function $q(t,x)$ is an increasing diffeomorphism of $\mathbb{R}$ with $q_x(t,x) > 0$ with respect to time $t$. We infer from the assumptions of the theorem and Lemma 2.3 and Lemma 2.4 that for $t \in [0,T)$ we have
\[ \begin{cases} y(t,x) \le 0 & \text{if } x \le q(t,x_0), \\ y(t,x) \ge 0 & \text{if } x \ge q(t,x_0), \end{cases} \tag{3.7} \]
and $y(t,q(t,x_0)) = 0$, $t \in [0,T)$. By (3.6) and (3.7), we obtain for $t \in [0,T)$,
\[ \begin{cases} u_x(t,x) \ge u(t,x) & \text{if } x \le q(t,x_0), \\ u_x(t,x) \ge -u(t,x) & \text{if } x \ge q(t,x_0). \end{cases} \tag{3.8} \]
Therefore, $u_x(t,\cdot) \ge -|u(t,\cdot)|$ on $\mathbb{R}$ for all $t \in [0,T)$. This proves (i). By Lemma 3.2 and (i), we have
\[ u_x(t,\cdot) \ge -|u(t,\cdot)| \ge -\left( 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty} \right), \qquad \forall t \in [0,T). \]
The above inequality and Lemma 2.2 imply $T = \infty$. This proves that the solution $u$ exists globally in time. Multiplying Eq. (1.2) by $u$ and integrating by parts, in view of (i), Lemma 3.1 and Lemma 3.2, we get
\begin{align*}
\frac{1}{2}\frac{d}{dt}\int_{\mathbb{R}} (u^2 + u_x^2)\,dx &= -4\int_{\mathbb{R}} u^2u_x\,dx + 3\int_{\mathbb{R}} uu_xu_{xx}\,dx + \int_{\mathbb{R}} u^2u_{xxx}\,dx \\
&= -\frac{1}{2}\int_{\mathbb{R}} u_x^3\,dx \le \frac{1}{2}\int_{\mathbb{R}} |u|^3\,dx \le \frac{1}{2}\|u\|_{L^\infty}\int_{\mathbb{R}} u^2\,dx \\
&\le \frac{1}{2}\left( 12\|u_0\|_{L^2}^2\,t + 4\|u_0\|_{L^\infty} \right)\int_{\mathbb{R}} u_0^2\,dx.
\end{align*}
Integrating the above inequality with respect to $t < T$ on $[0,t]$, we have
\[ \|u\|_1^2 \le 6\|u_0\|_{L^2}^4\,t^2 + 4\|u_0\|_{L^2}^2\|u_0\|_{L^\infty}\,t + \|u_0\|_1^2. \]
This proves (ii) and completes the proof of the theorem.

Remark 3.3. Note that the previous blow-up result in Theorem 3.2 (see [47]) showed that if $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, is odd and $u_0'(0) < 0$, then the corresponding solution of Eq. (2.2) does not exist globally in time, while Theorem 3.2 implies that if $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, is such that $y_0 = (u_0 - u_{0,xx}) \not\equiv 0$ is odd, $y_0 \le 0$ on $\mathbb{R}_-$ and $y_0 \ge 0$ on $\mathbb{R}_+$, then the corresponding solution of Eq. (2.2) exists globally in time. Since $u_0 = p * y_0$ with $p = \frac{1}{2}e^{-|x|}$, $x \in \mathbb{R}$, one can verify that $u_0$ is also odd. However, from (3.5) we have $u_0'(0) > 0$.
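The situation described in Remark 3.3 can be illustrated numerically. The sketch below evaluates (3.4) and (3.5) at $t = 0$ by Simpson quadrature for one hypothetical odd profile, $y_0(x) = xe^{-x^2}$ (chosen here for illustration; it is not taken from the paper), and checks that $u_0$ is odd, that $u_0'(0) > 0$, and that $u_0' \ge -|u_0|$ at a few sample points, as in part (i) of Theorem 3.2 at the initial time. The truncation length and grid size are likewise assumptions of the sketch.

```python
import math

CUTOFF = 12.0  # truncation of the half-lines; y0 is negligible beyond it

def y0(x):
    """Hypothetical odd profile: y0 <= 0 for x <= 0 and y0 >= 0 for x >= 0."""
    return x * math.exp(-x * x)

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return s * h / 3.0

def left_part(x):
    """e^{-x} * integral over (-inf, x] of e^{eta} y0(eta), cf. (3.6)."""
    return math.exp(-x) * simpson(lambda e: math.exp(e) * y0(e), -CUTOFF, x)

def right_part(x):
    """e^{x} * integral over [x, inf) of e^{-eta} y0(eta), cf. (3.6)."""
    return math.exp(x) * simpson(lambda e: math.exp(-e) * y0(e), x, CUTOFF)

def u0(x):
    """u0 = p * y0 written as in (3.4)."""
    return 0.5 * (left_part(x) + right_part(x))

def u0_x(x):
    """Derivative of u0 written as in (3.5)."""
    return 0.5 * (right_part(x) - left_part(x))

if __name__ == "__main__":
    print("u0(0)   =", u0(0.0))
    print("u0'(0)  =", u0_x(0.0))
    print("u0(1) + u0(-1) =", u0(1.0) + u0(-1.0))
```

Replacing `y0` by its negative reverses the sign pattern and gives $u_0'(0) < 0$, which is the configuration of the blow-up result recalled in the remark.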
4. Blow-up Phenomena

Our purpose here is to establish two new blow-up results for Eq. (2.2) with certain initial profiles and to show that there is at least one point where the slope of the solution becomes infinite exactly at breaking time. We are now in the position to present the first blow-up result.

Theorem 4.1. Let $\varepsilon > 0$ and $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$. Assume there is $x_0 \in \mathbb{R}$ such that
\[ u_0'(x_0) < -\frac{(1+\varepsilon)\sqrt{6}}{4}\left( \|u_0\|_{L^\infty} + \left( 2\sqrt{6}\,\|u_0\|_{L^2}^2 \ln\left(1 + \frac{2}{\varepsilon}\right) + \|u_0\|_{L^\infty}^2 \right)^{\frac{1}{2}} \right). \]
Then the corresponding solution to Eq. (2.2) blows up in finite time. Moreover, the maximal time of existence is estimated above by
\[ \frac{\left( 2\sqrt{6}\,\|u_0\|_{L^2}^2 \ln\left(1 + \frac{2}{\varepsilon}\right) + \|u_0\|_{L^\infty}^2 \right)^{\frac{1}{2}} - \|u_0\|_{L^\infty}}{6\|u_0\|_{L^2}^2}. \]

Proof. As mentioned earlier, here we only need to show that the above theorem holds for $s = 3$. Let $T > 0$ be the maximal time of existence of the solution $u$ to Eq. (2.2) with the initial data $u_0 \in H^3(\mathbb{R})$. Differentiating Eq. (2.2) with respect to $x$, in view of $\partial_x^2 p * f = p * f - f$, we have
\[ u_{tx} = -u_x^2 - uu_{xx} + \frac{3}{2}u^2 - p * \left(\frac{3}{2}u^2\right). \tag{4.1} \]
Note that
\begin{align*}
\frac{du_x(t,q(t,x))}{dt} &= u_{xt}(t,q(t,x)) + u_{xx}(t,q(t,x))\frac{dq(t,x)}{dt} \\
&= u_{tx}(t,q(t,x)) + u(t,q(t,x))\,u_{xx}(t,q(t,x)). \tag{4.2}
\end{align*}
By (4.1) and (4.2), in view of $p * \left(\frac{3}{2}u^2\right)(t,q(t,x)) \ge 0$, we deduce that
\begin{align*}
\frac{du_x(t,q(t,x))}{dt} &= -u_x^2(t,q(t,x)) + \frac{3}{2}u^2(t,q(t,x)) - p * \left(\frac{3}{2}u^2\right)(t,q(t,x)) \\
&\le -u_x^2(t,q(t,x)) + \frac{3}{2}u^2(t,q(t,x)) \\
&\le -u_x^2(t,q(t,x)) + \frac{3}{2}\left( 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty} \right)^2. \tag{4.3}
\end{align*}
Set $m(t) = u_x(t,q(t,x_0))$ and fix $\varepsilon > 0$. Taking
\[ T_1 = \frac{\left( 2\sqrt{6}\,\|u_0\|_{L^2}^2 \ln\left(1 + \frac{2}{\varepsilon}\right) + \|u_0\|_{L^\infty}^2 \right)^{\frac{1}{2}} - \|u_0\|_{L^\infty}}{6\|u_0\|_{L^2}^2} \]
and
\[ K(T_1) = \frac{\sqrt{6}}{2}\left( 3\|u_0\|_{L^2}^2\,T_1 + \|u_0\|_{L^\infty} \right), \]
it is found that
\[ 2K(T_1)T_1 - \ln\left(1 + \frac{2}{\varepsilon}\right) \ge 0. \tag{4.4} \]
By the assumption of the theorem, we have $m(0) < -(1+\varepsilon)K(T_1)$. This implies that
\[ 0 < \frac{m(0) - K(T_1)}{m(0) + K(T_1)} = 1 - \frac{2K(T_1)}{m(0) + K(T_1)} \le 1 + \frac{2}{\varepsilon}. \]
It then follows from the above inequality and (4.4) that
\[ \frac{1}{2K(T_1)} \ln\frac{m(0) - K(T_1)}{m(0) + K(T_1)} \le T_1. \]
In view of (4.3), we have
\[ \frac{dm(t)}{dt} \le -m^2(t) + K^2(T_1), \qquad \forall t \in [0,T_1] \cap [0,T). \tag{4.5} \]
Note that $m(0) < -(1+\varepsilon)K(T_1) < -K(T_1)$ and $\frac{1}{2K(T_1)} \ln\frac{m(0) - K(T_1)}{m(0) + K(T_1)} \le T_1$. Thus the standard argument of continuity shows $m(t) \le -K(T_1)$ for all $t \in [0,T_1] \cap [0,T)$. By solving the inequality (4.5), we can obtain
\[ \frac{m(0) + K(T_1)}{m(0) - K(T_1)}\,e^{2K(T_1)t} - 1 \le \frac{2K(T_1)}{m(t) - K(T_1)} \le 0. \]
Since $0 < \frac{m(0) + K(T_1)}{m(0) - K(T_1)} < 1$, there exists
\[ 0 < T \le \frac{1}{2K(T_1)} \ln\left( \frac{m(0) - K(T_1)}{m(0) + K(T_1)} \right) \le T_1, \]
such that $\lim_{t \uparrow T} m(t) = -\infty$. This completes the proof of the theorem.

Remark 4.1. Note that if $\varepsilon > 0$ goes to positive infinity and the assumption of Theorem 4.1 still holds, then the maximal time of existence of the solution will tend to zero. This means that the steeper the slope of the solution at some point is, the quicker the solution blows up.

We now present the second blow-up result.

Theorem 4.2. Let $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$. Assume there exists $x_0 \in \mathbb{R}$ such that
\[ \begin{cases} y_0(x) = u_0(x) - u_{0,xx}(x) \ge 0 & \text{if } x \le x_0, \\ y_0(x) = u_0(x) - u_{0,xx}(x) \le 0 & \text{if } x \ge x_0, \end{cases} \]
and $y_0$ changes sign. Then the corresponding solution to Eq. (2.2) blows up in finite time.
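As an illustration of the hypothesis of Theorem 4.2, the sketch below evaluates at $t = 0$ the two one-sided quantities $e^{-x_0}\int_{-\infty}^{x_0} e^{\eta}y_0\,d\eta$ and $e^{x_0}\int_{x_0}^{\infty} e^{-\eta}y_0\,d\eta$, which by (3.6) equal $u_0(x_0) - u_0'(x_0)$ and $u_0(x_0) + u_0'(x_0)$. The profile $y_0(x) = -xe^{-x^2}$ with $x_0 = 0$, the function names and the quadrature parameters are all assumptions made for this example only.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return s * h / 3.0

def one_sided_parts(y0, x0, cutoff=12.0):
    """Return (M, N) at t = 0:
    M = e^{-x0} * int_{-inf}^{x0} e^{eta} y0(eta) d eta = u0(x0) - u0'(x0),
    N = e^{x0}  * int_{x0}^{inf} e^{-eta} y0(eta) d eta = u0(x0) + u0'(x0).
    The infinite ranges are truncated at +-cutoff (an assumption of the sketch)."""
    M = math.exp(-x0) * simpson(lambda e: math.exp(e) * y0(e), -cutoff, x0)
    N = math.exp(x0) * simpson(lambda e: math.exp(-e) * y0(e), x0, cutoff)
    return M, N

def y0_example(x):
    """Hypothetical profile: y0 >= 0 for x <= 0, y0 <= 0 for x >= 0, sign change at 0."""
    return -x * math.exp(-x * x)

M0, N0 = one_sided_parts(y0_example, 0.0)
slope0 = 0.5 * (N0 - M0)  # u0'(x0), obtained by subtracting the two relations in (3.6)

if __name__ == "__main__":
    print("M(0, x0) =", M0)
    print("N(0, x0) =", N0)
    print("u0'(x0)  =", slope0)
```

For this profile the first quantity is positive and the second negative, so the initial slope at $x_0$ is negative and the product of the two quantities is negative.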
Proof. Again, we only need to show that the above theorem holds for $s = 3$. Let $T > 0$ be the maximal time of existence of the solution $u$ to Eq. (2.2) with the initial data $u_0 \in H^3(\mathbb{R})$. In view of (4.1), we have
\begin{align*}
u_{tx} + uu_{xx} &= -u_x^2 + \frac{3}{2}u^2 - p * \left(\frac{3}{2}u^2\right) \\
&= -u_x^2 + \frac{3}{2}u^2 - p * \left(\frac{1}{2}u_x^2 + u^2\right) + \frac{1}{2}\,p * \left(u_x^2 - u^2\right).
\end{align*}
Due to $p * \left(\frac{1}{2}u_x^2 + u^2\right)(t,x) \ge \frac{1}{2}u^2(t,x)$, $\forall (t,x) \in [0,T) \times \mathbb{R}$ (see p. 347, line 11 in [7]), it follows from the above relation that
\[ u_{tx} + uu_{xx} \le -u_x^2 + u^2 + \frac{1}{2}\,p * \left(u_x^2 - u^2\right). \tag{4.6} \]
Note that the function $q(t,x)$ is an increasing diffeomorphism of $\mathbb{R}$ with $q_x(t,x) > 0$ with respect to time $t$. We infer from the assumption of the theorem and Lemma 2.4 that for $t \in [0,T)$ we have
\[ \begin{cases} y(t,x) \ge 0 & \text{if } x \le q(t,x_0), \\ y(t,x) \le 0 & \text{if } x \ge q(t,x_0), \end{cases} \tag{4.7} \]
and $y(t,q(t,x_0)) = 0$, $t \in [0,T)$. Define
\[ M(t,x) := e^{-x}\int_{-\infty}^{x} e^{\eta}y(t,\eta)\,d\eta, \qquad t \in [0,T), \tag{4.8} \]
and
\[ N(t,x) := e^{x}\int_{x}^{\infty} e^{-\eta}y(t,\eta)\,d\eta, \qquad t \in [0,T). \tag{4.9} \]
By (4.8) and (4.9), in view of (4.7), we have
\[ M(t,q(t,x_0))\,N(t,q(t,x_0)) = \int_{-\infty}^{q(t,x_0)} e^{\eta}y(t,\eta)\,d\eta \int_{q(t,x_0)}^{\infty} e^{-\eta}y(t,\eta)\,d\eta < 0, \qquad t \in [0,T). \tag{4.10} \]
On the other hand, from (3.6) we have
\[ M(t,q(t,x_0)) = u(t,q(t,x_0)) - u_x(t,q(t,x_0)), \qquad t \in [0,T), \tag{4.11} \]
and
\[ N(t,q(t,x_0)) = u(t,q(t,x_0)) + u_x(t,q(t,x_0)), \qquad t \in [0,T). \tag{4.12} \]
It is observed from (4.10)-(4.12) that
\[ 0 > M(t,q(t,x_0))\,N(t,q(t,x_0)) = u^2(t,q(t,x_0)) - u_x^2(t,q(t,x_0)). \tag{4.13} \]
Since $y(t,q(t,x_0)) = 0$ on $[0,T)$, one can obtain by taking the derivative with respect to $t$ on $[0,T)$,
\[ \frac{dM(t,q(t,x_0))}{dt} = -q_t(t,x_0)\,M(t,q(t,x_0)) + e^{-q(t,x_0)}\int_{-\infty}^{q(t,x_0)} e^{\eta}y_t(t,\eta)\,d\eta. \tag{4.14} \]
Using (2.1) and integrating by parts, in view of $y = u - u_{xx}$ and $y(t,q(t,x_0)) = 0$, we obtain, writing $q = q(t,x_0)$ for brevity and suppressing the argument $(t,\eta)$ under the integrals,
\begin{align*}
\int_{-\infty}^{q} e^{\eta}y_t\,d\eta &= -\int_{-\infty}^{q} e^{\eta}(yu)_\eta\,d\eta - 2\int_{-\infty}^{q} e^{\eta}yu_\eta\,d\eta \\
&= \int_{-\infty}^{q} e^{\eta}yu\,d\eta - 2\int_{-\infty}^{q} e^{\eta}yu_\eta\,d\eta \\
&= \int_{-\infty}^{q} e^{\eta}u^2\,d\eta - \int_{-\infty}^{q} e^{\eta}uu_{\eta\eta}\,d\eta - 2\int_{-\infty}^{q} e^{\eta}uu_\eta\,d\eta + 2\int_{-\infty}^{q} e^{\eta}u_\eta u_{\eta\eta}\,d\eta \\
&= \int_{-\infty}^{q} e^{\eta}u^2\,d\eta + \int_{-\infty}^{q} e^{\eta}u_\eta^2\,d\eta - \int_{-\infty}^{q} e^{\eta}uu_\eta\,d\eta + 2\int_{-\infty}^{q} e^{\eta}u_\eta u_{\eta\eta}\,d\eta - e^{q}u(t,q)\,u_x(t,q) \\
&= \frac{3}{2}\int_{-\infty}^{q} e^{\eta}u^2\,d\eta - \frac{1}{2}e^{q}u^2(t,q) - e^{q}u(t,q)\,u_x(t,q) + e^{q}u_x^2(t,q). \tag{4.15}
\end{align*}
Note that $p * \left(\frac{1}{2}u_x^2 + u^2\right)(t,x) \ge \frac{1}{2}u^2(t,x)$, $\forall (t,x) \in [0,T) \times \mathbb{R}$; the same argument gives the one-sided estimate $e^{-x}\int_{-\infty}^{x} e^{\eta}\left(\frac{1}{2}u_\eta^2 + u^2\right)d\eta \ge \frac{1}{2}u^2(t,x)$, which is used below. Substituting (4.15) into the last term in the expression (4.14), and writing $q = q(t,x_0)$ for brevity, yields
\begin{align*}
\frac{dM(t,q)}{dt} &= -u(t,q)\,M(t,q) - \frac{1}{2}u^2(t,q) - u(t,q)\,u_x(t,q) + u_x^2(t,q) + \frac{3}{2}e^{-q}\int_{-\infty}^{q} e^{\eta}u^2(t,\eta)\,d\eta \\
&= u_x^2(t,q) - \frac{3}{2}u^2(t,q) + \frac{3}{2}e^{-q}\int_{-\infty}^{q} e^{\eta}u^2(t,\eta)\,d\eta \\
&= u_x^2(t,q) + \frac{1}{2}e^{-q}\int_{-\infty}^{q} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&\quad - \frac{3}{2}u^2(t,q) + e^{-q}\int_{-\infty}^{q} e^{\eta}\left( \frac{1}{2}u_\eta^2(t,\eta) + u^2(t,\eta) \right)d\eta \\
&\ge u_x^2(t,q) - u^2(t,q) + \frac{1}{2}e^{-q}\int_{-\infty}^{q} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&= -M(t,q)\,N(t,q) + \frac{1}{2}e^{-q}\int_{-\infty}^{q} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta. \tag{4.16}
\end{align*}
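We claim that
\[ e^{-q(t,x_0)}\int_{-\infty}^{q(t,x_0)} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \ge M(t,q(t,x_0))\,N(t,q(t,x_0)). \tag{4.17} \]
In fact, in view of (4.7), (4.8), (4.9) and (3.6), we deduce, writing $q = q(t,x_0)$ for brevity,
\begin{align*}
e^{-q}\int_{-\infty}^{q} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta &= e^{-q}\int_{-\infty}^{q} e^{\eta}M(t,\eta)\,N(t,\eta)\,d\eta \\
&= e^{-q}\int_{-\infty}^{q} e^{\eta}\left( \int_{\eta}^{\infty} e^{-\xi}y(t,\xi)\,d\xi \right)\left( \int_{-\infty}^{\eta} e^{\xi}y(t,\xi)\,d\xi \right)d\eta \\
&= e^{-q}\int_{-\infty}^{q} e^{\eta}\left( \int_{\eta}^{q} e^{-\xi}y(t,\xi)\,d\xi \right)\left( \int_{-\infty}^{\eta} e^{\xi}y(t,\xi)\,d\xi \right)d\eta \\
&\quad + e^{-q}\int_{-\infty}^{q} e^{\eta}\left( \int_{q}^{\infty} e^{-\xi}y(t,\xi)\,d\xi \right)\left( \int_{-\infty}^{\eta} e^{\xi}y(t,\xi)\,d\xi \right)d\eta \\
&\ge e^{-q}\int_{-\infty}^{q} e^{\eta}\left( \int_{q}^{\infty} e^{-\xi}y(t,\xi)\,d\xi \right)\left( \int_{-\infty}^{\eta} e^{\xi}y(t,\xi)\,d\xi \right)d\eta \\
&\ge e^{-q}\int_{-\infty}^{q} e^{\eta}\,d\eta \left( \int_{q}^{\infty} e^{-\xi}y(t,\xi)\,d\xi \right)\left( \int_{-\infty}^{q} e^{\xi}y(t,\xi)\,d\xi \right) \\
&= M(t,q)\,N(t,q). \tag{4.18}
\end{align*}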
Combining (4.16) with (4.18), we get
\[ \frac{dM(t,q(t,x_0))}{dt} \ge -\frac{1}{2}M(t,q(t,x_0))\,N(t,q(t,x_0)) > 0. \tag{4.19} \]
In an analogous way one has, writing $q = q(t,x_0)$ for brevity,
\begin{align*}
\frac{dN(t,q)}{dt} &= q_t(t,x_0)\,N(t,q) + e^{q}\int_{q}^{\infty} e^{-\eta}y_t(t,\eta)\,d\eta \\
&= \frac{3}{2}u^2(t,q) - u_x^2(t,q) - \frac{3}{2}e^{q}\int_{q}^{\infty} e^{-\eta}u^2(t,\eta)\,d\eta \\
&\le u^2(t,q) - u_x^2(t,q) - \frac{1}{2}e^{q}\int_{q}^{\infty} e^{-\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&= M(t,q)\,N(t,q) - \frac{1}{2}e^{q}\int_{q}^{\infty} e^{-\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta. \tag{4.20}
\end{align*}
Following an argument similar to that of (4.18), it is found that
\[ e^{q(t,x_0)}\int_{q(t,x_0)}^{\infty} e^{-\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \ge M(t,q(t,x_0))\,N(t,q(t,x_0)). \tag{4.21} \]
Combining (4.20) with (4.21), we get
\[ \frac{dN(t,q(t,x_0))}{dt} \le \frac{1}{2}M(t,q(t,x_0))\,N(t,q(t,x_0)) < 0. \tag{4.22} \]
The differential inequalities (4.19) and (4.22) show therefore that $M(t,q(t,x_0))$ is strictly increasing while $N(t,q(t,x_0))$ is strictly decreasing on $[0,T)$. The assumptions of the theorem ensure $M(0,x_0) > 0$ and $N(0,x_0) < 0$ so that
\[ M(t,q(t,x_0))\,N(t,q(t,x_0)) < M(0,x_0)\,N(0,x_0) < 0, \qquad t \in [0,T). \tag{4.23} \]
By (4.6) and (4.13), we have
\begin{align*}
\frac{du_x(t,q(t,x_0))}{dt} &\le u^2(t,q(t,x_0)) - u_x^2(t,q(t,x_0)) - \frac{1}{2}\,p * \left( u^2 - u_x^2 \right)(t,q(t,x_0)) \\
&= M(t,q(t,x_0))\,N(t,q(t,x_0)) - \frac{1}{2}\,p * \left( u^2 - u_x^2 \right)(t,q(t,x_0)). \tag{4.24}
\end{align*}
By the definition of $p(x)$, in view of (4.17) and (4.21), we have
\begin{align*}
p * (u^2 - u_x^2)(t,q(t,x_0)) &= \frac{1}{2}\int_{-\infty}^{\infty} e^{-|q(t,x_0)-\eta|}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&= \frac{1}{2}e^{-q(t,x_0)}\int_{-\infty}^{q(t,x_0)} e^{\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&\quad + \frac{1}{2}e^{q(t,x_0)}\int_{q(t,x_0)}^{\infty} e^{-\eta}\left( u^2(t,\eta) - u_\eta^2(t,\eta) \right)d\eta \\
&\ge M(t,q(t,x_0))\,N(t,q(t,x_0)). \tag{4.25}
\end{align*}
Combining (4.24) with (4.25), in view of (4.23), we obtain
\begin{align*}
\frac{df(t)}{dt} &\le \frac{1}{2}\left( u^2(t,q(t,x_0)) - u_x^2(t,q(t,x_0)) \right) \\
&= \frac{1}{2}M(t,q(t,x_0))\,N(t,q(t,x_0)) < \frac{1}{2}M(0,x_0)\,N(0,x_0) < 0, \tag{4.26}
\end{align*}
where the function $f(t)$ is defined by $f(t) = u_x(t,q(t,x_0))$. Assume that the solution $u(t)$ of Eq. (2.2) exists globally in time $t \in [0,\infty)$, that is, $T = \infty$. We show this leads to a contradiction. We first claim that there exists $t_1 > 0$ such that
\[ f^2(t) \ge 2u^2(t,q(t,x_0)), \qquad t \ge t_1. \tag{4.27} \]
Note that $M(0,x_0) > 0$ and $N(0,x_0) < 0$. By means of Gronwall's inequality, it follows from (4.19) and (4.22) that
\[ M(t,q(t,x_0)) \ge M(0,x_0)\,e^{-\frac{1}{2}N(0,x_0)t} > 0, \]
\[ -N(t,q(t,x_0)) \ge -N(0,x_0)\,e^{\frac{1}{2}M(0,x_0)t} > 0. \]
From the above two inequalities, in view of (4.13), we get
\begin{align*}
u_x^2(t,q(t,x_0)) - u^2(t,q(t,x_0)) &= -M(t,q(t,x_0))\,N(t,q(t,x_0)) \\
&\ge -M(0,x_0)\,N(0,x_0)\,e^{\frac{1}{2}(M(0,x_0) - N(0,x_0))t}. \tag{4.28}
\end{align*}
On the other hand, it follows from (3.2) that
\[ u^2(t,q(t,x_0)) \le \left( 3\|u_0(x)\|_{L^2}^2\,t + \|u_0(x)\|_{L^\infty} \right)^2. \tag{4.29} \]
Comparing (4.28) with (4.29), it is found that there exists $t_1 > 0$ such that
\[ u_x^2(t,q(t,x_0)) - u^2(t,q(t,x_0)) \ge u^2(t,q(t,x_0)), \qquad t \ge t_1. \]
This proves (4.27). Combining (4.26) with (4.27), we obtain
\[ \frac{d}{dt}f(t) \le \frac{1}{2}u^2(t,q(t,x_0)) - \frac{1}{2}f^2(t) \le -\frac{1}{4}f^2(t), \qquad t \in [t_1,\infty). \tag{4.30} \]
By the assumptions of the theorem, we have
\[ f(0) = u_x(0,x_0) = -\frac{1}{2}e^{-x_0}\int_{-\infty}^{x_0} e^{\eta}y_0(\eta)\,d\eta + \frac{1}{2}e^{x_0}\int_{x_0}^{\infty} e^{-\eta}y_0(\eta)\,d\eta < 0. \]
It then follows from (4.26) that $f(t) \le f(0) + \frac{1}{2}M(0,x_0)\,N(0,x_0)\,t < 0$ for $t \ge 0$. Thus, solving the differential inequality (4.30) yields
\[ \frac{1}{f(t_1)} - \frac{1}{f(t)} + \frac{1}{4}(t - t_1) \le 0, \qquad t \ge t_1. \]
Note that $-\frac{1}{f(t)} > 0$. Then we have
\[ \frac{1}{f(t_1)} + \frac{1}{4}(t - t_1) < \frac{1}{f(t_1)} - \frac{1}{f(t)} + \frac{1}{4}(t - t_1) \le 0, \qquad t \ge t_1, \]
which leads to a contradiction as $t \to \infty$. This proves that $T < \infty$ and completes the proof of the theorem.

Remark 4.2. We note that Zhou claimed in Theorem 2.3 (see [51]) the same conclusion as Theorem 4.2. Unfortunately, the proof of Theorem 2.3 was incorrect. In particular, the key estimate in (2.13) (see [51]) was simply wrong.
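Both blow-up proofs above end with a Riccati-type differential inequality: (4.5) in Theorem 4.1 and (4.30) in Theorem 4.2. The sketch below treats the model equation $m' = -m^2 + K^2$ with $m(0) = m_0 < -K$, whose solution blows up at time $\frac{1}{2K}\ln\frac{m_0 - K}{m_0 + K}$, the same expression that bounds $T$ in the proof of Theorem 4.1. The function names and the sample values $m_0 = -2$, $K = 1$ are assumptions made for illustration; the comparison with a Runge-Kutta integration is only a sanity check of the closed form, not part of the paper's argument.

```python
import math

def riccati_blowup_time(m0, K):
    """Blow-up time of m' = -m^2 + K^2 with m(0) = m0 < -K < 0."""
    if not (K > 0.0 and m0 < -K):
        raise ValueError("need K > 0 and m0 < -K")
    return math.log((m0 - K) / (m0 + K)) / (2.0 * K)

def riccati_exact(m0, K, t):
    """Closed-form solution, valid for 0 <= t < riccati_blowup_time(m0, K).
    It follows from (m + K)/(m - K) = (m0 + K)/(m0 - K) * exp(2 K t)."""
    r = (m0 + K) / (m0 - K) * math.exp(2.0 * K * t)  # stays in (0, 1) before blow-up
    return -K * (1.0 + r) / (1.0 - r)

def riccati_rk4(m0, K, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration of m' = -m^2 + K^2 on [0, t_end]."""
    def rhs(m):
        return -m * m + K * K
    h = t_end / steps
    m = m0
    for _ in range(steps):
        k1 = rhs(m)
        k2 = rhs(m + 0.5 * h * k1)
        k3 = rhs(m + 0.5 * h * k2)
        k4 = rhs(m + h * k3)
        m += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return m

if __name__ == "__main__":
    m0, K = -2.0, 1.0
    t_star = riccati_blowup_time(m0, K)
    print("blow-up time:", t_star)
    print("exact vs RK4 at t*/2:", riccati_exact(m0, K, 0.5 * t_star),
          riccati_rk4(m0, K, 0.5 * t_star))
```

The same reasoning applied to (4.30) gives $\frac{1}{f(t)} \ge \frac{1}{f(t_1)} + \frac{1}{4}(t - t_1)$, so a negative $f$ cannot persist beyond $t_1 + 4/|f(t_1)|$, which is the contradiction used in the proof of Theorem 4.2.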
Remark 4.3. Theorem 3.1, Theorem 3.2, Theorem 4.1 and Theorem 4.2 show that the lifespan of strong solutions of the Degasperis-Procesi equation is not affected by the smoothness and the size of the initial data, but is affected by the shape of the initial data.

Attention is now turned to the blow-up set of a breaking solution for the Degasperis-Procesi equation. We will show that there is at least one point where the slope of the solution becomes infinite exactly at breaking time.

Theorem 4.3. Assume $u_0 \in H^s(\mathbb{R})$, $s > \frac{3}{2}$, and there exists $x_0 \in \mathbb{R}$ such that
\[ \begin{cases} y_0(x) = u_0(x) - u_{0,xx}(x) \ge 0 & \text{if } x \le x_0, \\ y_0(x) = u_0(x) - u_{0,xx}(x) \le 0 & \text{if } x \ge x_0, \end{cases} \]
and $y_0$ changes sign. Let $T < \infty$ be the finite blow-up time of the corresponding solution of Eq. (2.2). Then we have
\[ \lim_{t \to T} u_x(t,q(t,x_0)) = -\infty. \]
Proof. Fix $t \in [0,T)$. In view of (4.7), it follows from the relations (4.9) and (4.12) that for any $x \le q(t,x_0)$,
\begin{align*}
u_x(t,x) &= -u(t,x) + e^{x}\int_{x}^{\infty} e^{-\eta}y(t,\eta)\,d\eta \\
&= -u(t,x) + e^{x}\int_{x}^{q(t,x_0)} e^{-\eta}y(t,\eta)\,d\eta + e^{x}\int_{q(t,x_0)}^{\infty} e^{-\eta}y(t,\eta)\,d\eta \\
&\ge -u(t,x) + e^{x}\int_{q(t,x_0)}^{\infty} e^{-\eta}y(t,\eta)\,d\eta \\
&\ge -u(t,x) + e^{q(t,x_0)}\int_{q(t,x_0)}^{\infty} e^{-\eta}y(t,\eta)\,d\eta \\
&= -u(t,x) + u_x(t,q(t,x_0)) + u(t,q(t,x_0)).
\end{align*}
If $x \ge q(t,x_0)$ we have by the relations (4.8) and (4.11) that
\begin{align*}
u_x(t,x) &= u(t,x) - e^{-x}\int_{-\infty}^{x} e^{\eta}y(t,\eta)\,d\eta \\
&= u(t,x) - e^{-x}\int_{-\infty}^{q(t,x_0)} e^{\eta}y(t,\eta)\,d\eta - e^{-x}\int_{q(t,x_0)}^{x} e^{\eta}y(t,\eta)\,d\eta \\
&\ge u(t,x) - e^{-x}\int_{-\infty}^{q(t,x_0)} e^{\eta}y(t,\eta)\,d\eta \\
&\ge u(t,x) - e^{-q(t,x_0)}\int_{-\infty}^{q(t,x_0)} e^{\eta}y(t,\eta)\,d\eta \\
&= u(t,x) + u_x(t,q(t,x_0)) - u(t,q(t,x_0)).
\end{align*}
From the above two inequalities we deduce that for $(t,x) \in [0,T) \times \mathbb{R}$,
\begin{align*}
u_x(t,x) &\ge u_x(t,q(t,x_0)) - 2\|u(t,\cdot)\|_{L^\infty} \\
&\ge u_x(t,q(t,x_0)) - 2\left( 3\|u_0(x)\|_{L^2}^2\,t + \|u_0(x)\|_{L^\infty} \right). \tag{4.31}
\end{align*}
Since $T < \infty$, it follows from Lemma 2.2 that
\[ \liminf_{t \to T}\,\Big( \inf_{x \in \mathbb{R}} u_x(t,x) \Big) = -\infty. \]
Thus, from (4.31) it is easy to conclude $\lim_{t \to T} u_x(t,q(t,x_0)) = -\infty$. Indeed, (4.31) gives $u_x(t,q(t,x_0)) \le \inf_{x \in \mathbb{R}} u_x(t,x) + 2\left( 3\|u_0\|_{L^2}^2\,t + \|u_0\|_{L^\infty} \right)$, and $t \mapsto u_x(t,q(t,x_0))$ is strictly decreasing by (4.26). This completes the proof of the theorem.

Acknowledgements. The authors gratefully acknowledge the hospitality and support of the Mittag-Leffler Institute, Stockholm, where this research was performed during the semester program on Wave Motion in the Fall of 2005. Yin was partially supported by the Alexander von Humboldt Foundation, the NNSF of China (No. 10531040), the NSF of Guangdong Province, and the Foundation of Zhongshan University Advanced Research Center. The authors also thank the referee for valuable suggestions and comments.
Communicated by A. Kupiainen
Commun. Math. Phys. 267, 821–845 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0080-7
Communications in
Mathematical Physics
Birkhoff Type Decompositions and the Baker–Campbell–Hausdorff Recursion
Kurusch Ebrahimi-Fard1, Li Guo2, Dominique Manchon3
1 I.H.É.S., Le Bois-Marie, 35, Route de Chartres, F-91440 Bures-sur-Yvette, France. E-mail: [email protected]
2 Department of Mathematics and Computer Science, Rutgers University, Newark, NJ 07102, U.S.A. E-mail: [email protected]
3 Université Blaise Pascal, C.N.R.S.-UMR 6620, 63177 Aubière, France. E-mail: [email protected]
Received: 5 February 2006 / Accepted: 7 March 2006
Published online: 3 August 2006 – © Springer-Verlag 2006
Abstract: We describe a unification of several apparently unrelated factorizations arising from quantum field theory, vertex operator algebras, combinatorics and numerical methods in differential equations. The unification is given by a Birkhoff type decomposition that was obtained from the Baker–Campbell–Hausdorff formula in our study of the Hopf algebra approach of Connes and Kreimer to renormalization in perturbative quantum field theory. There we showed that the Birkhoff decomposition of Connes and Kreimer can be obtained from a certain Baker–Campbell–Hausdorff recursion formula in the presence of a Rota–Baxter operator. We will explain how the same decomposition generalizes the factorization of formal exponentials and uniformization for Lie algebras that arose in vertex operator algebra and conformal field theory, and the even-odd decomposition of combinatorial Hopf algebra characters as well as the Lie algebra polar decomposition as used in the context of the approximation of matrix exponentials in ordinary differential equations.
Contents
1. Introduction
2. The General Set Up
2.1 The Baker–Campbell–Hausdorff recursion
2.2 Rota–Baxter operator
2.3 The case of vanishing weight and the Magnus recursion
3. Renormalization in Perturbative QFT
4. Formal Exponentials
5. Combinatorial Hopf Algebras
6. Polar Decomposition
References
1. Introduction
The results presented in this paper grew out of an extension of the study on Rota–Baxter algebras and their applications to areas of mathematics and physics, including quantum field theory, classical integrable systems, number theory, operads, combinatorics and Hopf algebras. In recent works of Connes and Kreimer [10–12, 23], triggered by Kreimer’s seminal paper [22], new progress was made in the understanding of the process of renormalization in perturbative quantum field theory, both in terms of its mathematical and its physical content. These results motivated further studies, among other directions, in the context of Rota–Baxter algebras [14–17]. We refer the reader to [18, 19, 24, 28] for more details and references in this field.
One key result in the works of Connes and Kreimer is the Birkhoff decomposition of Feynman rules that captures the process of renormalization. Working in a fully algebraic framework of complete filtered Rota–Baxter algebras, it was shown in [14, 15] that the Connes–Kreimer decomposition follows from an additive decomposition in a Rota–Baxter (Lie) algebra through the exponential map. Thereby the well-known Bogoliubov formulae, which form the backbone of the standard BPHZ renormalization procedure [8, 21, 43], were derived as a special case of a generalization of Spitzer’s classical identity. As a side remark we mention here that a similar factorization was independently established as a fundamental theorem for Lie algebras in integrable systems [4, 37, 38].
These results rely in part on general properties of Rota–Baxter operators, but also on a recursive equation based on the famous Baker–Campbell–Hausdorff formula. We will show that in certain favorable cases we are able to give the recursion in closed form. The main topic of this paper is the exploration of further applications of this recursive equation, which was dubbed the BCH-recursion. We will show its appearance in several fields.
First, in the context of Rota–Baxter algebra, it is shown to be a generalization of the Magnus expansion known from matrix initial value problems. Then, by applying the recursion to the decomposition for certain Lie algebras, we derive the factorization of formal exponentials and uniformization in the work of Barron, Huang and Lepowsky [5] which is itself a generalization of several of their earlier results. Furthermore, we link explicitly the BCH-recursion with the even-odd decomposition of characters of connected graded Hopf algebras derived in recent work of Aguiar, Bergeron and Sottile [2]. This way we achieve an exponential form of their decomposition. This result also relates to our last point. We give a simplified approach to the polar decomposition in the work of Munthe-Kaas and collaborators [29, 30, 42] on numerical solutions of differential equations. Let us outline the organization of this paper. After the above introduction, Sect. 2 provides the key result. In Subsect. 2.1 we introduce a certain Baker–Campbell–Hausdorff type equation as the main object of this work, together with a general factorization theorem for complete filtered associative and Lie algebras. Solutions to this recursion are given under particular assumptions. Subsection 2.2 combines the former results with the notion of Rota–Baxter algebra, giving rise to a generalization of Spitzer’s classical identity. We relate the BCH-recursion to Magnus’ expansion in the context of weight zero Rota–Baxter maps. As a motivational example we recall in Sect. 3 how this, together with Atkinson’s factorization theorem, applies to the work of Connes and Kreimer in perturbative quantum field theory. Section 4 relates our findings to the work of Barron, Huang and Lepowsky on the factorization of formal exponentials and uniformization. After that, we deduce in Sect. 5 the even-odd decomposition of combinatorial Hopf algebra characters defined by Aguiar, Bergeron and Sottile and give a closed form for
the BCH-recursion in this particular setting. Finally, in Sect. 6, using similar ideas, we briefly mention a simplified approach to some results in the work of Munthe-Kaas and collaborators.
2. The General Set Up
In the following $K$ denotes the base field of characteristic zero, over which all algebraic structures are defined. Many results remain true if it is replaced by a commutative $\mathbb{Q}$-algebra. Here we establish general results to be applied in later sections. We start with a complete filtered associative algebra $A$ together with a filtration preserving linear map $P$ on $A$ as general setting. We obtain from the Baker–Campbell–Hausdorff (BCH) series a non-linear map $\chi$ on $A$ which we called BCH-recursion in [14, 15, 17]. This recursion gives a decomposition on the exponential level, and a one-sided inverse of the BCH series with the latter regarded as a map from $A \times A \to A$. The results naturally apply in the Lie algebra case. We then consider the above setting in the realm of a complete filtered associative Rota–Baxter algebra, giving rise to a generalization of Spitzer’s classical identity. We conclude this section by showing how the BCH-recursion can be seen as a generalization of the Magnus expansion in the context of Rota–Baxter algebras.
2.1. The Baker–Campbell–Hausdorff recursion. Let $A$ be a complete filtered associative algebra. Thus $A$ has a decreasing filtration $\{A_n\}$ of sub-algebras with vanishing intersection such that $A_m A_n \subseteq A_{m+n}$ and $A \cong \varprojlim A/A_n$ (i.e., $A$ is complete with respect to the topology from $\{A_n\}$). For instance, for an arbitrary associative algebra $\mathcal{A}$, consider the power series ring $A := \mathcal{A}[[t]]$ in one (commuting) variable $t$. Another example is given by the subalgebra $\mathcal{M}^{\ell}_n(\mathcal{A}) \subset \mathcal{M}_n(\mathcal{A})$ of (upper) lower triangular matrices in the algebra of $n \times n$ matrices with entries in $\mathcal{A}$, and with $n$ finite or infinite. By the completeness of $A$, the functions
\[ \exp : A_1 \to 1 + A_1, \qquad \exp(a) = \sum_{n=0}^{\infty} \frac{a^n}{n!}, \]
\[ \log : 1 + A_1 \to A_1, \qquad \log(1 + a) = -\sum_{n=1}^{\infty} \frac{(-a)^n}{n} \]
are well-defined and are inverses of each other. The Baker–Campbell–Hausdorff formula is the power series $\mathrm{BCH}(x, y)$ in the noncommutative power series algebra $A := \mathbb{Q}\langle\langle x, y \rangle\rangle$ (which is the free noncommutative complete $\mathbb{Q}$-algebra with generators $x, y$) such that [32, 40]
\[ \exp(x)\exp(y) = \exp\big(x + y + \mathrm{BCH}(x, y)\big). \]
Let us recall the first few terms of $\mathrm{BCH}(x, y)$, which are
\[ \mathrm{BCH}(x, y) = \frac{1}{2}[x, y] + \frac{1}{12}[x, [x, y]] - \frac{1}{12}[y, [x, y]] - \frac{1}{24}[x, [y, [x, y]]] + \cdots, \]
where $[x, y] := xy - yx$ is the commutator of $x$ and $y$ in $A$. Also denote $C(x, y) := x + y + \mathrm{BCH}(x, y)$. So we have
\[ C(x, y) = \log\big(\exp(x)\exp(y)\big), \]
which is a special case of the Hausdorff series [26]
\[ Z(x_1, \dots, x_n) := \log\big(\exp(x_1) \cdots \exp(x_n)\big). \]
Then for any complete $\mathbb{Q}$-algebra $A$ and $u, v \in A_1$, $C(u, v) \in A_1$ is well-defined. So we get a map
\[ C : A_1 \times A_1 \to A_1. \]
Now let $P : A \to A$ be any linear map preserving the filtration of $A$. We define $\tilde{P}$ to be $\mathrm{id}_A - P$. For $a \in A_1$, define $\chi(a) = \lim_{n \to \infty} \chi_{(n)}(a)$, where $\chi_{(n)}(a)$ is given by the BCH-recursion
\[ \chi_{(0)}(a) := a, \qquad \chi_{(n+1)}(a) = a - \mathrm{BCH}\big(P(\chi_{(n)}(a)), (\mathrm{id}_A - P)(\chi_{(n)}(a))\big), \tag{1} \]
and where the limit is taken with respect to the topology given by the filtration. Then the map $\chi : A_1 \to A_1$ satisfies
\[ \chi(a) = a - \mathrm{BCH}\big(P(\chi(a)), \tilde{P}(\chi(a))\big). \tag{2} \]
This map appeared in [14, 15, 17], where more details can also be found. The following proposition gives further properties of the map $\chi$, improving a result in [28] (Sect. II.6).
Proposition 1. For any linear map $P : A \to A$ preserving the filtration of $A$ there exists a unique (usually non-linear) map $\chi : A_1 \to A_1$ such that $(\chi - \mathrm{id}_A)(A_i) \subset A_{2i}$ for any $i \ge 1$, and such that, with $\tilde{P} := \mathrm{id}_A - P$, we have
\[ \forall a \in A_1, \qquad a = C\big(P(\chi(a)), \tilde{P}(\chi(a))\big). \tag{3} \]
This map is bijective, and its inverse is given by
\[ \chi^{-1}(a) = C\big(P(a), \tilde{P}(a)\big) = a + \mathrm{BCH}\big(P(a), \tilde{P}(a)\big). \tag{4} \]
Proof. Equation (3) can be rewritten as
\[ \chi(a) = F_a\big(\chi(a)\big), \]
with $F_a : A_1 \to A_1$ defined by
\[ F_a(b) = a - \mathrm{BCH}\big(P(b), \tilde{P}(b)\big). \]
This map $F_a$ is a contraction with respect to the metric associated with the filtration: indeed if $b, \varepsilon \in A_1$ with $\varepsilon \in A_n$, we have
\[ F_a(b + \varepsilon) - F_a(b) = \mathrm{BCH}\big(P(b), \tilde{P}(b)\big) - \mathrm{BCH}\big(P(b + \varepsilon), \tilde{P}(b + \varepsilon)\big). \]
The right-hand side is a sum of iterated commutators in each of which $\varepsilon$ appears at least once. So it belongs to $A_{n+1}$. So the sequence $F_a^n(b)$ converges in $A_1$ to a unique fixed point $\chi(a)$ for $F_a$. Let us remark that for any $a \in A_i$, by a straightforward induction argument, $\chi_{(n)}(a) \in A_i$ for any $n$, so $\chi(a) \in A_i$ by taking the limit. Then $\chi(a) - a = -\mathrm{BCH}\big(P(\chi(a)), \tilde{P}(\chi(a))\big)$ clearly belongs to $A_{2i}$. Now consider the map $\psi : A_1 \to A_1$ defined by $\psi(a) = C\big(P(a), \tilde{P}(a)\big)$. It is clear from the definition of $\chi$ that $\psi \circ \chi = \mathrm{id}_A$. Then $\chi$ is injective and $\psi$ is surjective. The injectivity of $\psi$ will be an immediate consequence of the following lemma.
Lemma 2. The map $\psi$ increases the ultrametric distance given by the filtration.
Proof. For any $x, y \in A_1$ the distance $d(x, y)$ is given by $e^{-n}$, where $n = \sup\{k \in \mathbb{N},\ x - y \in A_k\}$. We then have to prove that $\psi(x) - \psi(y) \notin A_{n+1}$. But
\[ \begin{aligned} \psi(x) - \psi(y) &= x - y + \Big( \mathrm{BCH}\big(P(x), \tilde{P}(x)\big) - \mathrm{BCH}\big(P(y), \tilde{P}(y)\big) \Big) \\ &= x - y + \Big( \mathrm{BCH}\big(P(x), \tilde{P}(x)\big) - \mathrm{BCH}\big(P(x) - P(x - y), \tilde{P}(x) - \tilde{P}(x - y)\big) \Big). \end{aligned} \]
The rightmost term inside the large brackets clearly belongs to $A_{n+1}$. As $x - y \notin A_{n+1}$ by hypothesis, this proves the claim. The map $\psi$ is then a bijection, so $\chi$ is also bijective, which proves Proposition 1.
Now let $\mathfrak{g}$ be a complete filtered Lie algebra. Let $A := \mathcal{U}(\mathfrak{g})$ be the universal enveloping associative algebra of $\mathfrak{g}$. Then with the induced filtration from $\mathfrak{g}$, $A$ is a complete filtered associative algebra. $A$ is also a complete Lie algebra under the bracket $[x, y] := xy - yx$ and contains $\mathfrak{g}$ as a complete filtered sub-Lie algebra. Let $P : \mathfrak{g} \to \mathfrak{g}$ be a linear map preserving the filtration in $\mathfrak{g}$. We can extend $P$ to a linear map $\hat{P} : A \to A$ that preserves the filtration in $A$. A simple way to build such an extension (by no means unique) is to choose any supplementary subspace $V$ of $\mathfrak{g}$ inside $A$ and to extend $P$ by the identity map on the complement. If $P$ is idempotent, so is $\hat{P}$. As is well-known [32, 40], the power series $C(x, y)$ and $\mathrm{BCH}(x, y) \in \mathbb{Q}\langle\langle x, y \rangle\rangle$ are Lie series. Therefore, the map $\chi : A_1 \to A_1$ in Eq. (2) and Proposition 1 restricts to a bijective map $\chi : \mathfrak{g}_1 \to \mathfrak{g}_1$ with its inverse given by Eq. (4). Further, for $u, v \in \mathfrak{g}_1$, $C(u, v)$ is a well-defined element in $\mathfrak{g}_1$. We thus have $C : \mathfrak{g}_1 \times \mathfrak{g}_1 \to \mathfrak{g}_1$ as in the associative case. The following theorem contains the key result of our exposition. It states a general decomposition on $A$ implied by the map $\chi$.
Theorem 3. Let $A$ be a complete filtered associative algebra or Lie algebra with a linear, filtration preserving map $P : A \to A$.
(1) For any $a \in A_1$, we have
\[ \exp(a) = \exp\big(P(\chi(a))\big) \exp\big(\tilde{P}(\chi(a))\big). \tag{5} \]
(2) $C : A_1 \times A_1 \to A_1$ has a right inverse $D_P$ given by
\[ D_P = (P \circ \chi, \tilde{P} \circ \chi) : A_1 \to A_1 \times A_1. \]
(3) $C$ restricts to a bijection $C : D_P(A_1) \to A_1$.
(4) Furthermore, for any subset $B$ of $A_1$, $C$ restricts to a bijection $C : D_P(B) \to B$.
Proof. (1) follows since
\[ C\big(P(\chi(a)), \tilde{P}(\chi(a))\big) = a. \]
(2) follows since
\[ C \circ D_P(a) = C\big(P(\chi(a)), \tilde{P}(\chi(a))\big) = a. \]
(3) is a general property of maps: for $(a_1, a_2) = D_P(a) \in D_P(A_1)$,
\[ (D_P \circ C)\big|_{D_P(A_1)}(a_1, a_2) = (D_P \circ C) \circ D_P(a) = D_P \circ (C \circ D_P)(a) = D_P(a) = (a_1, a_2). \]
(4) is clear as $D_P$ is a (two-sided) inverse for the restriction of $C$ to $D_P(A_1)$.
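To make item (1) of Theorem 3 concrete, the following minimal sketch (ours, not taken from the paper) works in the algebra of strictly upper triangular 4 × 4 rational matrices, filtered by the distance from the diagonal, where exp, log and hence $C$ and BCH are finite sums. The choice of $P$ (keeping the entries in rows 0 and 2) and all function names are illustrative assumptions; any filtration preserving linear map could be used instead.

```python
from fractions import Fraction
from math import factorial

N = 4  # strictly upper triangular N x N matrices; any product of N of them is zero

def eye():
    return [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]

def lin(x, y, c=1):
    """Return x + c*y entrywise."""
    return [[x[i][j] + c * y[i][j] for j in range(N)] for i in range(N)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def exp(a):
    """exp(a) = sum_{k<N} a^k / k!, a finite sum because a is nilpotent."""
    out, power = eye(), eye()
    for k in range(1, N):
        power = mul(power, a)
        out = lin(out, power, Fraction(1, factorial(k)))
    return out

def log(g):
    """log(1 + b) = -sum_{k=1}^{N-1} (-b)^k / k for g = 1 + b with b nilpotent."""
    b = lin(g, eye(), -1)
    out, power = lin(b, b, -1), eye()  # out starts as the zero matrix
    for k in range(1, N):
        power = mul(power, b)
        out = lin(out, power, Fraction((-1) ** (k + 1), k))
    return out

def C(x, y):
    return log(mul(exp(x), exp(y)))

def BCH(x, y):
    return lin(lin(C(x, y), x, -1), y, -1)

def P(a):
    """Illustrative choice (an assumption): keep rows 0 and 2, zero the rest."""
    return [[a[i][j] if i % 2 == 0 else Fraction(0) for j in range(N)]
            for i in range(N)]

def P_tilde(a):
    return lin(a, P(a), -1)

def chi(a):
    """Iterate b -> a - BCH(P(b), P~(b)); each pass fixes one more filtration level."""
    c = a
    for _ in range(N):
        c = lin(a, BCH(P(c), P_tilde(c)), -1)
    return c

a = [[Fraction(i + 2 * j) if j > i else Fraction(0) for j in range(N)]
     for i in range(N)]
c = chi(a)
factorised = mul(exp(P(c)), exp(P_tilde(c)))
print(factorised == exp(a))  # checks exp(a) = exp(P(chi(a))) exp(P~(chi(a)))
print(c != a)                # chi differs from the identity for this P
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.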
The particular case when the map $P$ is idempotent deserves special attention.
Theorem 4. Let $P : A \to A$ be an idempotent linear map preserving the filtration of $A$. Let $A = A_- \oplus A_+$ be the corresponding vector space decomposition, with $A_- := P(A)$ and $A_+ := \tilde{P}(A)$. Let $A_{1,-} := P(A_1)$ and $A_{1,+} := \tilde{P}(A_1)$. Let $\chi : A_1 \to A_1$ be the BCH-recursion map associated to the map $P$, and let $\tilde{\chi} : A_1 \to A_1$ be the BCH-recursion map associated to $\tilde{P}$.
(1) (Factorization Theorem) $C$ restricts to a bijection
\[ C_- : A_{1,-} \times A_{1,+} \longrightarrow A_1. \]
(2) (Formal Uniformization Theorem) There exists a unique bijection
\[ \Phi : A_{1,+} \times A_{1,-} \longrightarrow A_{1,-} \times A_{1,+} \]
such that for $a = (a_+, a_-) \in A_{1,+} \times A_{1,-}$, we have
\[ \exp(a_+)\exp(a_-) = \exp\big(\pi_-(\Phi(a))\big) \exp\big(\pi_+(\Phi(a))\big), \]
where $\pi_\pm : A_{1,-} \times A_{1,+} \to A_{1,\pm}$ are the projectors.
(3) The inverse map of $C_-$ in part (1) is given by
\[ D_P(a) = \big(P(\chi(a)), \tilde{P}(\chi(a))\big), \qquad a \in A_1, \]
and the uniformization map in part (2) is written
\[ \Phi(a) = \big(P(\chi \circ C(a)), \tilde{P}(\chi \circ C(a))\big) \]
or
\[ \Phi(a) = \big(P(\chi \circ \tilde{\chi}^{-1}(a_+ + a_-)), \tilde{P}(\chi \circ \tilde{\chi}^{-1}(a_+ + a_-))\big) \]
with $a = (a_+, a_-) \in A_{1,+} \times A_{1,-}$.
The statements (1) and (2) in the above theorem generalize theorems of Barron, Huang and Lepowsky [5] which are themselves generalizations of factorization and uniformization theorems for Lie algebras and Lie superalgebras such as Virasoro algebras and Neveu-Schwarz algebras, respectively. This is the motivation for the naming of those items. See Sect. 4 for further details.
Proof. (1) We already know from item (3) of Theorem 3 that $D_P$ is a right inverse for $C_-$. But it is also a left inverse, as for any $(x, y) \in A_{1,-} \times A_{1,+}$ there is a unique $v \in A_1$ such that $x = P(v)$ and $y = \tilde{P}(v)$, and we have
\[ D_P \circ C(x, y) = D_P \circ C\big(P(v), \tilde{P}(v)\big) = D_P \circ \chi^{-1}(v) = (P, \tilde{P})(v) = (x, y). \]
(2) By the same argument as for part (1), $C$ restricts to a bijection
\[ C_+ : A_{1,+} \times A_{1,-} \longrightarrow A_1. \]
Its inverse is now $D_{\tilde{P}}$. Since $A_1 = A_{1,+} \oplus A_{1,-} = A_{1,-} \oplus A_{1,+}$, we can define $\Phi$ by the following diagram:
\[ \begin{array}{ccc} A_{1,+} \times A_{1,-} & \xrightarrow{\;C_+\;} & A_{1,+} \oplus A_{1,-} \\ \Big\downarrow{\scriptstyle \Phi} & & \Big\downarrow{\scriptstyle \sigma} \\ A_{1,-} \times A_{1,+} & \xrightarrow{\;C_-\;} & A_{1,-} \oplus A_{1,+} \end{array} \]
Here $\sigma$ is just a cosmetic way to write the identity map $\sigma(b_+ + b_-) = b_- + b_+$. $\Phi$ is bijective since $C_+$ and $C_-$ in the diagram are. We see also from the diagram that this is the unique map such that
\[ \exp(a_+)\exp(a_-) = \exp\big(\pi_-(\Phi(a))\big) \exp\big(\pi_+(\Phi(a))\big). \]
(3) This follows from the above commutative diagram and part (1): now we can compute
\[ \begin{aligned} \Phi(a) = D_P \circ C_+(a) &= \big(P \circ \chi \circ C_+(a), \tilde{P} \circ \chi \circ C_+(a)\big) \\ &= \big(P(\chi \circ \tilde{\chi}^{-1}(a_+ + a_-)), \tilde{P}(\chi \circ \tilde{\chi}^{-1}(a_+ + a_-))\big), \end{aligned} \]
which ends the proof of Theorem 4.
Corollary 5. Under the hypotheses of Theorem 4, for any $\eta \in 1 + A_1$ there are unique $\eta_- \in \exp(A_{1,-})$ and $\eta_+ \in \exp(A_{1,+})$ such that $\eta = \eta_- \eta_+$.
Proof. This follows directly from the first item of Theorem 3 and the first item of Theorem 4, as the exponential map is a bijection from $A_1$ onto $1 + A_1$.
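As a sanity check of this factorization, consider the simplest non-commutative situation. We assume here, purely for illustration and not as a hypothesis of the paper, that $A_3 = 0$, so that all triple products of elements of $A_1$ vanish and $\mathrm{BCH}(x, y) = \frac{1}{2}[x, y]$ on $A_1$. Then the recursion (2) stops after one step and, for $\eta = \exp(a)$, the two factors can be written out explicitly:

```latex
\begin{aligned}
\chi(a) &= a - \tfrac{1}{2}\,[P(a), \tilde{P}(a)] = a - \tfrac{1}{2}\,[P(a), a], \\
\eta_- &= \exp\!\big(P(a) - \tfrac{1}{2}\,P([P(a), a])\big), \qquad
\eta_+ = \exp\!\big(\tilde{P}(a) - \tfrac{1}{2}\,\tilde{P}([P(a), a])\big), \\
\log(\eta_-\eta_+) &= \big(a - \tfrac{1}{2}\,[P(a), a]\big) + \tfrac{1}{2}\,[P(a), \tilde{P}(a)] = a .
\end{aligned}
```

In the last line the commutator of the two exponents reduces to $[P(a), \tilde{P}(a)] = [P(a), a]$ because the correction terms lie in $A_2$ and their brackets with $A_1$ fall into $A_3 = 0$.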
Let us finish this section with two observations simplifying the BCH-recursion considerably. The first one is of more general character. To begin with it might be helpful to work out the first few terms of the recursion for the map $\chi$ in (1). For this let us introduce a dummy parameter $t$ and write $\chi(at) = t \sum_{k \ge 0} \chi^{(k)}(a) t^k$. For $k = 0, 1, 2$ we readily find $\chi^{(0)}(a) = a$ and
\[ \chi^{(1)}(a) = -\frac{1}{2}[P(a), \tilde{P}(a)] = -\frac{1}{2}[P(a), a], \tag{6} \]
\[ \begin{aligned} \chi^{(2)}(a) &= -\frac{1}{2}\big[P(\chi^{(1)}(a)), \tilde{P}(a)\big] - \frac{1}{2}\big[P(a), \tilde{P}(\chi^{(1)}(a))\big] - \frac{1}{12}\Big( \big[P(a), [P(a), a]\big] - \big[\tilde{P}(a), [P(a), a]\big] \Big) \\ &= +\frac{1}{4}\big[P([P(a), a]), \tilde{P}(a)\big] + \frac{1}{4}\big[P(a), \tilde{P}([P(a), a])\big] - \frac{1}{12}\Big( \big[P(a), [P(a), a]\big] - \big[\tilde{P}(a), [P(a), a]\big] \Big) \\ &= \frac{1}{4}\big[P([P(a), a]), a\big] + \frac{1}{12}\Big( \big[P(a), [P(a), a]\big] - \big[[P(a), a], a\big] \Big). \end{aligned} \tag{7} \]
In both the last cases $\tilde{P} = \mathrm{id}_A - P$ has completely disappeared. Therefore, we might expect to find a simpler recursion for the map $\chi$, without the appearance of $\tilde{P}$. Indeed, such a simplification follows using the factorization property, implied by the $\chi$ map on $A$ in item (1) of Theorem 3.
Lemma 6 [15]. Let $A$ be a complete filtered algebra and $P : A \to A$ a linear map preserving the filtration. The map $\chi$ in (2) solves the following recursion:
\[ \chi(u) := u + \mathrm{BCH}\big(-P(\chi(u)), u\big), \qquad u \in A_1. \tag{8} \]
Proof. For any element $u \in A$ we can write $u = P(u) + (\mathrm{id}_A - P)(u)$ using linearity of $P$. The definition of the map $\chi$ then implies for $u \in A_1$ that $\exp(u) = \exp\big(P(\chi(u))\big) \exp\big(\tilde{P}(\chi(u))\big)$, see Eq. (5). Furthermore,
\[ \begin{aligned} \exp\big(\tilde{P}(\chi(u))\big) &= \exp\big(-P(\chi(u))\big) \exp(u) \\ &= \exp\big(-P(\chi(u)) + u + \mathrm{BCH}(-P(\chi(u)), u)\big). \end{aligned} \]
Bijectivity of the exp map then implies that
\[ \chi(u) - P(\chi(u)) = -P(\chi(u)) + u + \mathrm{BCH}\big(-P(\chi(u)), u\big), \]
from which Eq. (8) follows.
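As a consistency check (our own expansion, using only the terms of BCH recalled in Sect. 2.1), inserting $\chi(at) = t \sum_{k \ge 0} \chi^{(k)}(a) t^k$ into recursion (8) with $u = at$ and comparing powers of $t$ reproduces Eqs. (6) and (7) directly, without any appearance of $\tilde{P}$:

```latex
\begin{aligned}
t^2:\quad \chi^{(1)}(a) &= \tfrac{1}{2}\,[-P(a), a] = -\tfrac{1}{2}\,[P(a), a], \\
t^3:\quad \chi^{(2)}(a) &= \tfrac{1}{2}\,\big[-P(\chi^{(1)}(a)), a\big]
   + \tfrac{1}{12}\,\big[-P(a), [-P(a), a]\big]
   - \tfrac{1}{12}\,\big[a, [-P(a), a]\big] \\
 &= \tfrac{1}{4}\,\big[P([P(a), a]), a\big]
   + \tfrac{1}{12}\Big( \big[P(a), [P(a), a]\big] - \big[[P(a), a], a\big] \Big).
\end{aligned}
```

The fourth-order bracket of BCH does not contribute at these orders, since it is at least of order $t^4$.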
Our second observation is of more particular type. Again, it concerns the linear map $P$ in the definition of the BCH-recursion $\chi$. We will treat a special case, providing a solution, i.e., closed form, for the BCH-recursion. Further below in Sect. 5 we will observe another instance where a closed form for the BCH-recursion can be derived, see Eq. (50). Let us now assume that the linear map $P : A \to A$ in the BCH-recursion in Eq. (8) of Lemma 6 is an idempotent map, and moreover that it respects multiplication in $A$. This makes $P$, respectively $\tilde{P} = \mathrm{id}_A - P$, a Rota–Baxter map, to be introduced in the following section, although the map $\tilde{P}$ is not an algebra morphism. We then have
Lemma 7. Let $A$ be a complete filtered associative algebra with filtration preserving linear map $P : A \to A$, which moreover is an idempotent algebra homomorphism. Then the map $\chi$ in Eq. (8) of Lemma 6 has the simple form
\[ \chi(u) = u + \mathrm{BCH}\big(-P(u), u\big), \tag{9} \]
for any element $u \in A_1$.
Proof. The proof follows from Lemma 6, since $P(\chi(u)) = P(u)$. The latter results from the multiplicativity of $P$, i.e., applying $P$ to Eq. (2) we obtain
\[ P(\chi(u)) = P(u) - \mathrm{BCH}\big(P^2(\chi(u)), (P \circ \tilde{P})(\chi(u))\big). \]
Since $P$ is idempotent, we have $P \circ \tilde{P} = P - P^2 = 0$, and $\mathrm{BCH}(x, 0) = 0$. Thus $P(\chi(u)) = P(u)$.
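The closed form (9) can also be checked directly against the property $P(\chi(u)) = P(u)$ used in the proof; the short computation below is our own illustration. Because $P$ is multiplicative and filtration preserving it commutes with the Lie series BCH, and because it is idempotent we get:

```latex
P\big(u + \mathrm{BCH}(-P(u), u)\big)
  = P(u) + \mathrm{BCH}\big(-P^2(u), P(u)\big)
  = P(u) + \mathrm{BCH}\big(-P(u), P(u)\big)
  = P(u).
```

The last step uses $\mathrm{BCH}(-x, x) = 0$, which follows from $\exp(-x)\exp(x) = 1$.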
Remark. With the foregoing assumptions on $P$ the factorization in item (1) of Theorem 3 simplifies considerably. For any $a \in A_1$, we have
\[ \exp(a) = \exp\big(P(a)\big) \exp\big(\tilde{P}(a) + \mathrm{BCH}(-P(a), a)\big). \tag{10} \]
2.2. Rota–Baxter operator. In the 1950s and early 1960s, several interesting results were obtained in the fluctuation theory of probability. One of the most well-known is Spitzer’s identity [39]. In a seminal 1960 paper [6], the American mathematician G. Baxter deduced it from a certain operator identity that later bore his name. During the 1960s and early 1970s, algebraic, combinatorial and analytic aspects of Baxter’s work were studied by several people, among them G.-C. Rota and F. V. Atkinson. Much of the recent renewed interest in these works is due to Rota’s later survey articles [35, 36] and talks during the 1990s. Related concepts were independently developed by Russian physicists during the 1980s, especially in Belavin and Drinfeld’s, and Semenov-Tian-Shansky’s papers [7, 37] on solutions of the (modified) classical Yang–Baxter equation. In this context let us mention another connection linked with the last remark. Aguiar [1] related Rota–Baxter operators of weight zero to the associative analog of the classical Yang–Baxter equation, which also appeared in [31].
Now we assume that $A$ is an associative algebra and $P$ a Rota–Baxter operator of weight $\theta$ satisfying the Rota–Baxter relation
\[ P(x)P(y) + \theta P(xy) = P\big(x P(y)\big) + P\big(P(x) y\big) \tag{11} \]
for all $x, y \in A$ [6, 33–35]. A Rota–Baxter algebra of weight $\theta$ is an algebra with a Rota–Baxter operator, denoted by the pair $(A, P)$. The operator $\tilde{P} := \theta\,\mathrm{id}_A - P$ also is a Rota–Baxter map of weight $\theta$, such that the mixed relation
\[ P(x)\tilde{P}(y) = \tilde{P}\big(P(x) y\big) + P\big(x \tilde{P}(y)\big) \tag{12} \]
is satisfied for all $x, y \in A$. The image of $P$ as well as that of $\tilde{P}$ are subalgebras in $A$. A Rota–Baxter ideal $I$ is an ideal $I$ of $A$ such that $P(I) \subseteq I$. The case $\theta = 0$ corresponds to the integration by parts property of the usual Riemann integral.
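Alongside integration by parts (weight zero), a standard discrete example is the partial-sum operator on finite sequences with entrywise product; with the sign convention of (11) it has weight $\theta = 1$. This example is added here for illustration and is not taken from the paper. A minimal sketch checking (11) and (12) on sample data, where the names `P`, `P_tilde` and the test sequences are our own choices:

```python
from itertools import accumulate

def P(a):
    """Inclusive partial sums: P(a)_n = a_0 + ... + a_n."""
    return list(accumulate(a))

def P_tilde(a):
    """P~ = id - P (weight 1), so P~(a)_n = -(a_0 + ... + a_{n-1})."""
    return [v - s for v, s in zip(a, P(a))]

def mul(a, b):
    """Entrywise product of two sequences."""
    return [u * v for u, v in zip(a, b)]

def add(a, b):
    return [u + v for u, v in zip(a, b)]

x = [2, -1, 3, 5, 7]
y = [1, 4, -2, 0, 6]

# Relation (11) with theta = 1: P(x)P(y) + P(xy) = P(x P(y)) + P(P(x) y)
rb_holds = (add(mul(P(x), P(y)), P(mul(x, y)))
            == add(P(mul(x, P(y))), P(mul(P(x), y))))

# Mixed relation (12): P(x) P~(y) = P~(P(x) y) + P(x P~(y))
mixed_holds = (mul(P(x), P_tilde(y))
               == add(P_tilde(mul(P(x), y)), P(mul(x, P_tilde(y)))))

print(rb_holds, mixed_holds)
```

The weight-one term $P(xy)$ accounts for the diagonal pairs of indices, which are counted twice on the right-hand side of (11).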
An important class of examples is given by idempotent Rota–Baxter maps, i.e., projectors, where identity (11) (of weight θ = 1) implies that the Rota–Baxter algebra A splits as a direct sum into two parallel subalgebras given by the image, respectively kernel, of P. Assuming P to be an idempotent algebra morphism is sufficient to imply
that it is a Rota–Baxter map. As an example of an idempotent Rota–Baxter map which is moreover an algebra morphism, truncate the Taylor expansion of a real function at a point $a$ at zeroth order, i.e., evaluate a real function at a point $a$, $P_a^{(0)}(f)(x) = f(a)$. The modified Rota–Baxter operator, $B := \theta\,\mathrm{id}_A - 2P$, satisfies the modified Rota–Baxter relation
\[ B(x)B(y) + \theta^2 xy = B\big(B(x) y + x B(y)\big), \tag{13} \]
for all $x$ and $y$ in $A$. For a modified Rota–Baxter operator $B$ coming from an idempotent Rota–Baxter map $P$, we have $B^2 = \mathrm{id}_A$, $B \circ P = -P$, and $B \circ \tilde{P} = \tilde{P}$. Taking the Lie algebra associated to $(A, P)$, with commutator bracket $[x, y] := xy - yx$, for all $x, y \in A$, we find the Rota–Baxter Lie algebra, $(\mathcal{L}_A, P)$, of weight $\theta$ with $P$ fulfilling
\[ [P(x), P(y)] + \theta P([x, y]) = P\big([P(x), y] + [x, P(y)]\big). \tag{14} \]
Similarly for the modified Rota–Baxter map. Both equations are known as (the operator form of) the (modified) classical Yang–Baxter equations [7, 37] (referring to the Australian physicist Rodney Baxter). Every Rota–Baxter algebra $(A, P)$ of weight $\theta$ allows for a new product defined in terms of the Rota–Baxter map $P$,
\[ a \ast_P b := P(a) b + a P(b) - \theta ab, \tag{15} \]
such that the vector space $A$ with this product is a Rota–Baxter algebra of the same weight, with $P$ as its Rota–Baxter map. We will denote it by $(A_P, P)$. The Rota–Baxter map $P$ becomes a (not necessarily unital) algebra homomorphism from $A_P$ to $A$, $P(a \ast_P b) = P(a)P(b)$. For $\tilde{P}$ we have $\tilde{P}(a \ast_P b) = -\tilde{P}(a)\tilde{P}(b)$.
A complete filtered Rota–Baxter algebra is defined to be a Rota–Baxter algebra $(A, P)$ with a complete filtration by Rota–Baxter ideals $\{A_n\}$. Again, consider for any weight $\theta$ Rota–Baxter algebra $(\mathcal{A}, \mathcal{P})$ the power series ring $A := \mathcal{A}[[t]]$ and define an operator $P : A \to A$, $P\big(\sum_{n=0}^{\infty} a_n t^n\big) := \sum_{n=0}^{\infty} \mathcal{P}(a_n) t^n$. Then $(A, P)$ is a complete filtered Rota–Baxter algebra of weight $\theta$. In the case of the algebra of strictly (upper) lower triangular matrices $\mathcal{M}^{\ell}_n(\mathcal{A})$ with $n \le \infty$ and entries in a weight $\theta$ Rota–Baxter algebra $(\mathcal{A}, \mathcal{P})$, define the Rota–Baxter map $P : \mathcal{M}^{\ell}_n(\mathcal{A}) \to \mathcal{M}^{\ell}_n(\mathcal{A})$ entrywise, $P(\alpha) = \big(\mathcal{P}(\alpha_{ij})\big)$, for $\alpha$ in $\mathcal{M}^{\ell}_n(\mathcal{A})$ [17]. The normalized map $\theta^{-1} P$ is a Rota–Baxter operator of weight one. In the following we will assume that any Rota–Baxter map is of weight one, if not stated otherwise. The next proposition contains the generalization of Spitzer’s identity to non-commutative Rota–Baxter algebras.
Proposition 8 [14, 15, 17]. Let $(A, P)$ be a complete filtered Rota–Baxter algebra. The factors on the right hand side of Eq. (5),
\[ \exp(a) = \exp\big(P(\chi(a))\big) \exp\big(\tilde{P}(\chi(a))\big) \tag{16} \]
for $a \in A_1$, are the unique solutions to the equations
\[ u = 1 - P(\check{b}\, u) \quad \text{resp.} \quad u' = 1 - \tilde{P}(u'\, \check{b}), \tag{17} \]
where $\check{b} := \exp(-a) - 1$ in $A_1$. Their inverses satisfy uniquely the equations
\[ x = 1 - P(x\, b) \quad \text{resp.} \quad x' = 1 - \tilde{P}(b\, x'), \tag{18} \]
where $b := \exp(a) - 1 = (1 + \check{b})^{-1} - 1 \in A_1$. The following theorem is due to Atkinson [3].
Theorem 9. For the solutions $x$ and $x'$ in (18) (resp. their inverses in (17)) with $b := \exp(a) - 1$ we have
\[ x(1 + b)x' = 1, \quad \text{that is,} \quad (1 + b) = x^{-1} x'^{-1}. \tag{19} \]
If $P$ is idempotent, i.e., the algebra $A$ decomposes directly into the images of $P$ and $\tilde{P}$, the factorization of $1 + b$ is unique.
The next corollary follows readily and is stated for completeness.
Corollary 10. Let $(A, P)$ be a complete filtered Rota–Baxter algebra. For the solutions $u$ and $u'$ in (17), we find the equations
\[ u = 1 + P(b\, x'), \qquad u' = 1 + \tilde{P}(x\, b), \tag{20} \]
where $x$ and $x'$ are solutions of Eqs. (18), respectively.
As a proposition we mention without giving further details the fact that, using the double Rota–Baxter product $\ast_P$ in (15) for $\theta = 1$, we may write
\[ x = 1 + P\big(\exp^{\ast_P}(-\chi(a)) - 1\big), \]
where $\exp^{\ast_P}$ denotes the exponential defined in terms of the product in (15). This implies $-x\, b = \exp^{\ast_P}\big(-\chi(\log(1 + b))\big) - 1$ for $1 + b := \exp(a)$. When $(A, P)$ is commutative, the map $\chi$ reduces to the identity map, giving back Spitzer’s classical identity, for fixed $b \in A_1$ [39],
\[ \exp\big(-P(\log(1 + b))\big) = \sum_{n=0}^{\infty} (-1)^n \underbrace{P\big(P(\cdots P(P(b)b) \cdots b)b\big)}_{n\text{-times}}, \tag{21} \]
corresponding to the first recursion, x = 1 − P(x b), in (18). Replacing the Rota–Baxter map P by the identity map, the above identity reduces to the geometric series for the element −b ∈ A1 . Proofs of this identity in the commutative case have been given by quite a few authors, including the aforementioned Atkinson [3], Cartier [9], Kingman and Wendel [25, 41] as well as Rota and Smith [34]. In fact, Rota [33] showed that this identity is equivalent to the classical Waring identity relating elementary symmetric functions and power symmetric functions. Remark. Coming back to Lemma 7, respectively Eq. (10) we see immediately that in the case of a non-commutative Rota–Baxter algebra (A, P) with idempotent and multiplicative Rota–Baxter map P and thence necessarily of weight one, implying P(χ (a)) =
P(a), for all $a \in A_1$, we have the surprising result that the exponential solution to the recursion $x = 1 - P(x\,b)$ in (18) can be written as a geometric series
\[ \exp\big(-P(\chi(\log(1 + b)))\big) = \exp\big(-P(\log(1 + b))\big) = \frac{1}{1 + P(b)}. \tag{22} \]
Observe that the BCH-recursion $\chi$ disappeared after the first equality, since $P(\chi(a)) = P(a)$.

The normalization of the weight one Rota–Baxter map $P$ to $\theta P$ gives a Rota–Baxter map of weight $\theta$. This implies the following modification of Proposition 8.

Proposition 11. For a weight $\theta \neq 0$ Rota–Baxter operator $P$, the map $\chi$ in factorization (16) of Proposition 8 generalizes to
\[ \chi_\theta(a) = a - \frac{1}{\theta}\,\mathrm{BCH}\big(P(\chi_\theta(a)),\ \tilde{P}(\chi_\theta(a))\big). \tag{23} \]
Similarly, the recursion in Eq. (8) of Lemma 6 transposes into
\[ \chi_\theta(a) = a + \frac{1}{\theta}\,\mathrm{BCH}\big(-P(\chi_\theta(a)),\ \theta a\big), \quad a \in A_1, \tag{24} \]
such that for all $a \in A_1$ we have the decomposition
\[ \exp(\theta a) = \exp\big(P(\chi_\theta(a))\big)\exp\big(\tilde{P}(\chi_\theta(a))\big). \tag{25} \]
The factors on the right-hand side of Eq. (25) are inverses of the unique solutions of the equations
\[ x = 1 - P(x\,b) \quad \text{resp.} \quad x' = 1 - \tilde{P}(b\,x'), \tag{26} \]
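where $1 + \theta b := \exp(\theta a)$ in $A$. From this proposition we arrive at

Corollary 12. Spitzer’s identity for a complete filtered non-commutative Rota–Baxter algebra $(A, P)$ of weight $\theta \neq 0$ is
\[ \exp\Big(-P\Big(\chi_\theta\Big(\frac{\log(1 + \theta b)}{\theta}\Big)\Big)\Big) = \sum_{n=0}^{\infty} (-1)^n \underbrace{P\big(P(\cdots P(P}_{n\text{-times}}(b)b) \cdots b)b\big), \tag{27} \]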
for $b \in A_1$. We call $\chi_\theta$ the BCH-recursion of weight $\theta \in K$, or simply $\theta$-BCH-recursion.

As we will see in the next part, the particular appearance of the weight $\theta$ in Eqs. (23, 24) reflects the fact that in the case of weight $\theta = 0$, hence $\tilde{P} = -P$, Atkinson’s factorization formula (19) in Theorem 9 collapses to
\[ x\,x' = \big(1 - P(x\,b)\big)\big(1 + P(b\,x')\big) = 1 \tag{28} \]
for any $b \in A_1$, which is in accordance with (26) for $\theta \to 0$.

Remark. It should be clear that the decomposition in Eq. (25) in the above proposition is true for any complete filtered algebra $A$ with filtration preserving linear map $P$ and $\tilde{P}_\theta := \theta\,\mathrm{id}_A - P$. Hence, Eqs. (23, 24) generalize Theorem 3. The Rota–Baxter property only enters in the last part with respect to the equations in (26), respectively Corollary 12.
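As a sanity check of Spitzer’s classical identity (21), the following sketch (not taken from the paper) verifies it order by order in a small commutative example. All choices are illustrative assumptions: $A$ is modelled by power series in an auxiliary parameter `lam`, truncated at order `N`, with Laurent-polynomial coefficients in `eps`; `P` is the pole-part projection in `eps` applied coefficientwise, which is an idempotent Rota–Baxter map of weight one; the sample element `b` and all function names are hypothetical.

```python
from fractions import Fraction

N = 4  # truncation order in the auxiliary parameter lam (illustrative choice)

def ladd(p, q, c=1):
    """p + c*q for Laurent polynomials in eps stored as {exponent: coefficient}."""
    r = dict(p)
    for e, v in q.items():
        r[e] = r.get(e, 0) + c * v
    return {e: v for e, v in r.items() if v != 0}

def lmul(p, q):
    """Product of two Laurent polynomials."""
    r = {}
    for e1, v1 in p.items():
        for e2, v2 in q.items():
            r[e1 + e2] = r.get(e1 + e2, 0) + v1 * v2
    return {e: v for e, v in r.items() if v != 0}

# An element of A is a list s with s[k] = coefficient of lam^k, k = 0..N.
ZERO = [{} for _ in range(N + 1)]
ONE = [{0: Fraction(1)}] + [{} for _ in range(N)]

def sadd(s, t, c=1):
    return [ladd(a, b, c) for a, b in zip(s, t)]

def smul(s, t):
    """Product in A, discarding all terms of order > N in lam."""
    r = [{} for _ in range(N + 1)]
    for i, a in enumerate(s):
        for j, b in enumerate(t):
            if i + j <= N:
                r[i + j] = ladd(r[i + j], lmul(a, b))
    return r

def P(s):
    """Pole-part projection in eps, applied coefficientwise (weight one)."""
    return [{e: v for e, v in a.items() if e < 0} for a in s]

def log1p_series(b):
    """log(1 + b) for b of order >= 1 in lam."""
    out, power = ZERO, ONE
    for k in range(1, N + 1):
        power = smul(power, b)
        out = sadd(out, power, Fraction((-1) ** (k + 1), k))
    return out

def exp_series(u):
    """exp(u) for u of order >= 1 in lam."""
    out, power, fact = ONE, ONE, 1
    for k in range(1, N + 1):
        power = smul(power, u)
        fact *= k
        out = sadd(out, power, Fraction(1, fact))
    return out

def spitzer_lhs(b):
    """exp(-P(log(1 + b))), the left-hand side of (21)."""
    return exp_series(sadd(ZERO, P(log1p_series(b)), -1))

def spitzer_rhs(b):
    """Sum of (-1)^n P(P(...P(b)b...)b), built from x_n = -P(x_{n-1} b)."""
    total, term = ONE, ONE
    for _ in range(N):
        term = sadd(ZERO, P(smul(term, b)), -1)
        total = sadd(total, term)
    return total

# Sample element b = lam * (1/eps + 2 - eps), of order one in lam.
b = [{}, {-1: Fraction(1), 0: Fraction(2), 1: Fraction(-1)}] + [{} for _ in range(N - 1)]
print(spitzer_lhs(b) == spitzer_rhs(b))
```

The right-hand side is generated from the recursion $x = 1 - P(x\,b)$ rather than by expanding the nested expression by hand, so the comparison also illustrates that the exponential on the left solves that recursion up to the chosen truncation order.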
2.3. The case of vanishing weight and the Magnus recursion. Regarding Eq. (28) in connection with the factorization in Eq. (25) for a weight $\theta \neq 0$ Rota–Baxter algebra, it is interesting to observe the limit of $\theta$ going to zero in formula (24) for the $\theta$-BCH-recursion. The terms in the BCH series on the right-hand side of (24) vanish except for those which are linear with respect to the second variable. In general we may write $C(a, b) = a + b + \mathrm{BCH}(a, b)$ as a sum [32]
\[ C(a, b) = \sum_{n \geq 0} H_n(a, b), \]
where $H_n(a, b)$ is the part of $C(a, b)$ which is homogeneous of degree $n$ with respect to $b$. In particular, $H_0(a, b) = a$. For $n = 1$ we have
\[ H_1(a, b) = \frac{\mathrm{ad}\,a}{1 - e^{-\mathrm{ad}\,a}}(b) \]
(see e.g. [20]). Hence we get a non-linear map $\chi_0$ inductively defined on the pro-nilpotent Lie algebra $A_1$ by the formula
\begin{align}
\chi_0(a) &= -\frac{\mathrm{ad}\,P(\chi_0(a))}{1 - e^{\mathrm{ad}\,P(\chi_0(a))}}(a) \tag{29} \\
&= \Big(1 + \sum_{n > 0} b_n \big[\mathrm{ad}\,P(\chi_0(a))\big]^n\Big)(a), \tag{30}
\end{align}
where $P$ is now a weight zero Rota–Baxter operator. We call this the weight zero BCH-recursion. The coefficients are $b_n := B_n / n!$, where $B_n$ are the Bernoulli numbers. For $n = 1, 2, 3, 4$ we find the numbers $b_1 = -1/2$, $b_2 = 1/12$, $b_3 = 0$ and $b_4 = -1/720$. The first terms in (30) are
\[ \chi_0(a) = a - \frac{1}{2}\big[P(a), a\big] + \frac{1}{4}\big[P([P(a), a]), a\big] + \frac{1}{12}\big[P(a), [P(a), a]\big] + \cdots. \tag{31} \]
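The coefficients $b_n = B_n/n!$ quoted above can be reproduced with a few lines of exact arithmetic. This is only an illustrative sketch: the function names are hypothetical, and the Bernoulli numbers are generated from the standard recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ for $n \geq 1$ with $B_0 = 1$, which yields the convention $B_1 = -1/2$ used here.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # sum_{k=0}^{n} C(n+1, k) B_k = 0 determines B_n from B_0..B_{n-1}
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B

def bch_coefficients(n_max):
    """Return {n: b_n} with b_n = B_n / n! for 1 <= n <= n_max."""
    B = bernoulli_numbers(n_max)
    return {n: B[n] / factorial(n) for n in range(1, n_max + 1)}

b = bch_coefficients(4)
print(b)
```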
As a particular example we assume $P$ to be the Riemann integral operator defined by $P\{a\}(x) := \int_0^x a(y)\,dy$, which is a Rota–Baxter map of weight zero, i.e., it satisfies the integration by parts rule
\[ P\{a_1\}(x)\,P\{a_2\}(x) = P\big\{a_1 P\{a_2\}\big\}(x) + P\big\{P\{a_1\}\,a_2\big\}(x). \]
The functions $a_i = a_i(x)$, $i = 1, 2$, are defined over $\mathbb{R}$ and supposed to take values in a non-commutative algebra, say, matrices of size $n \times n$. Then we find
\begin{align}
P\{\chi_0(a)\}(x) = {} & P\{a\}(x) - \frac{1}{2} P\big\{[P\{a\}, a]\big\}(x) + \frac{1}{4} P\big\{\big[P\{[P\{a\}, a]\}, a\big]\big\}(x) \notag \\
& + \frac{1}{12} P\big\{\big[P\{a\}, [P\{a\}, a]\big]\big\}(x) + \cdots. \tag{32}
\end{align}
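Let us write the terms in (32) explicitly:
\begin{align}
P\{a\}(x) &= \int_0^x a(y)\,dy, \tag{33} \\
\frac{1}{2} P\big\{[P\{a\}, a]\big\}(x) &= \frac{1}{2} \int_0^x\!\!\int_0^{y_1} [a(y_2), a(y_1)]\,dy_2\,dy_1, \tag{34} \\
\frac{1}{4} P\big\{\big[P\{[P\{a\}, a]\}, a\big]\big\}(x) &= \frac{1}{4} \int_0^x\!\!\int_0^{y_1}\!\!\int_0^{y_2} \big[[a(y_3), a(y_2)], a(y_1)\big]\,dy_3\,dy_2\,dy_1, \tag{35} \\
\frac{1}{12} P\big\{\big[P\{a\}, [P\{a\}, a]\big]\big\}(x) &= \frac{1}{12} \int_0^x\!\!\int_0^{y_1}\!\!\int_0^{y_1} \big[a(y_3), [a(y_2), a(y_1)]\big]\,dy_3\,dy_2\,dy_1. \tag{36}
\end{align}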
Baxter’s original motivation was to generalize the integral equation
\[ f(x) = 1 + P\{f a\}(x) \tag{37} \]
corresponding to the first order initial value problem
\[ \frac{d}{dx} f(x) = a(x) f(x), \qquad f(0) = 1 \tag{38} \]
with unique solution
\[ f(x) = \exp\big(P\{a\}(x)\big) \tag{39} \]
by replacing the Riemann integral by another Rota–Baxter map $P$ of non-zero weight $\theta$ (11) on a commutative algebra. The result is the classical Spitzer identity (21), which in the more general non-zero weight $\theta$ case takes the form
\[ \exp\Big(-P\Big(\frac{\log(1 - \theta a)}{\theta}\Big)\Big) = \sum_{n=0}^{\infty} \underbrace{P\big(P(P(\cdots P}_{n\text{-times}}(a)a) \cdots a)a\big). \tag{40} \]
This follows from (27) with $b = -a$, since $\chi_\theta = \mathrm{id}_A$ in the commutative case. One readily verifies that the left-hand side of this identity reduces to the exponential $\exp\big(P(a)\big)$, compare with (39), in the limit $\theta \to 0$.

To summarize, Proposition 11 generalizes Proposition 8 to non-commutative weight $\theta \neq 0$ Rota–Baxter algebras. Corollary 12 describes an extension of Baxter’s result on Spitzer’s identity to general associative Rota–Baxter algebras of weight $\theta \neq 0$, i.e., not necessarily commutative. The particular case of vanishing weight $\theta \to 0$ is captured by the following

Lemma 13. Let $(A, P)$ be a complete filtered Rota–Baxter algebra of weight zero. For $a \in A_1$ the weight zero BCH-recursion $\chi_0 : A_1 \to A_1$ is given by the recursion in Eq. (29),
\[ \chi_0(a) = -\frac{\mathrm{ad}\,P(\chi_0(a))}{1 - e^{\mathrm{ad}\,P(\chi_0(a))}}(a). \]
(1) The equation $x = 1 - P(x\,a)$ has a unique solution $x = \exp\big(-P(\chi_0(a))\big)$.
(2) The equation $y = 1 + P(a\,y)$ has a unique solution $y = \exp\big(P(\chi_0(a))\big)$.

Atkinson’s factorization for the weight zero case, Eq. (28), follows immediately from the preceding lemma. In view of example (32) the last lemma leads to the following corollary.

Recall the work by Magnus [27] on initial value problems of the above type but in a non-commutative setting, e.g., for matrix-valued functions. He proposed an exponential solution $F(x) = \exp\big(\Omega[a](x)\big)$ with $\Omega[a](0) = 0$ for the first order initial value problem $\frac{d}{dx} F(x) = a(x) F(x)$, $F(0) = 1$, respectively the corresponding integral equation $F(x) = 1 + P\{a F\}(x)$, where $P$ is again the Riemann integral operator. He found an expansion $\Omega[a](x) = \sum_{n > 0} \Omega^{(n)}[a](x)$ in terms of multiple integrals of nested commutators, and provided a recursive equation for the terms $\Omega^{(n)}[a](x)$:
\[ \frac{d}{dx} \Omega[a](x) = \frac{\mathrm{ad}\,\Omega[a]}{e^{\mathrm{ad}\,\Omega[a]} - 1}(a)(x). \tag{41} \]
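The statement of Lemma 13(2), combined with the expansion (32), can be tested exactly in a nilpotent toy case. The sketch below is an illustration under our own assumptions, not the authors’ computation: $a(x) = A_0 + x A_1$ takes values in strictly upper triangular $4 \times 4$ rational matrices, so every product of four such matrices vanishes and the four terms displayed in (32) give $P\{\chi_0(a)\}$ exactly. The solution of $F = 1 + P\{aF\}$ is obtained independently by three Picard iterations. The matrices `A0`, `A1` and all helper names are hypothetical.

```python
from fractions import Fraction

n = 4  # strictly upper triangular 4x4 matrices: any product of four vanishes

def mat_zero():
    return [[Fraction(0)] * n for _ in range(n)]

def mat_add(A, B, c=1):
    return [[A[i][j] + c * B[i][j] for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A matrix-valued polynomial in x is stored as {degree: coefficient matrix}.
def padd(p, q, c=1):
    r = {d: [row[:] for row in m] for d, m in p.items()}
    for d, m in q.items():
        r[d] = mat_add(r.get(d, mat_zero()), m, c)
    return r

def pmul(p, q):
    r = {}
    for d1, m1 in p.items():
        for d2, m2 in q.items():
            r[d1 + d2] = mat_add(r.get(d1 + d2, mat_zero()), mat_mul(m1, m2))
    return r

def scale(p, c):
    return {d: [[c * v for v in row] for row in m] for d, m in p.items()}

def comm(p, q):
    return padd(pmul(p, q), pmul(q, p), -1)

def P(p):
    """Riemann integral from 0 to x, acting degree by degree (weight zero)."""
    return {d + 1: [[v / (d + 1) for v in row] for row in m] for d, m in p.items()}

def is_zero(p):
    return all(v == 0 for m in p.values() for row in m for v in row)

def upper(entries):
    M = mat_zero()
    for (i, j), v in entries.items():
        M[i][j] = Fraction(v)
    return M

A0 = upper({(0, 1): 1, (1, 2): 2, (2, 3): -1, (0, 2): 3})
A1 = upper({(0, 1): -2, (1, 2): 1, (2, 3): 4, (1, 3): 5})
a = {0: A0, 1: A1}  # a(x) = A0 + x*A1

# Omega = P{chi_0(a)} from the four terms displayed in (32)
Pa = P(a)
c2 = comm(Pa, a)  # [P{a}, a]
omega = padd(Pa, scale(P(c2), Fraction(-1, 2)))
omega = padd(omega, scale(P(comm(P(c2), a)), Fraction(1, 4)))
omega = padd(omega, scale(P(comm(Pa, c2)), Fraction(1, 12)))

ident = {0: [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]}
o2 = pmul(omega, omega)
o3 = pmul(o2, omega)
# exp(Omega) = 1 + Omega + Omega^2/2 + Omega^3/6, since Omega^4 = 0
exp_omega = padd(padd(padd(ident, omega), scale(o2, Fraction(1, 2))),
                 scale(o3, Fraction(1, 6)))

# Picard iteration for F = 1 + P{a F}; it terminates after three steps here
F = ident
for _ in range(3):
    F = padd(ident, P(pmul(a, F)))

print(is_zero(padd(exp_omega, F, -1)))
```

Because `A0` and `A1` do not commute, the commutator terms with coefficients $-1/2$, $1/4$ and $1/12$ all contribute, so the comparison is sensitive to each of them.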
Comparison with (29) (and also (30) and (31)) settles the link between Magnus’ recursion and the BCH-recursion in the context of a vanishing Rota–Baxter weight, namely

Corollary 14. Let $A$ be a function algebra over $\mathbb{R}$ with values in an operator algebra, and let $P$ denote the indefinite Riemann integral operator. Magnus’ expansion is given by the formula
\[ \Omega[a](x) = P\big(\chi_0(a)\big)(x). \tag{42} \]

Hence, the $\theta$-BCH-recursion (23) generalizes Magnus’ expansion to general weight $\theta \neq 0$ Rota–Baxter operators $P$ by replacing the weight zero Riemann integral in $F = 1 + P\{a F\}$.

The following commutative diagram (45) summarizes the foregoing relations. The generalization of the simple initial value problem in (38) is twofold. First we go to the integral equation in (37). Then we replace the Riemann integral by a general Rota–Baxter map and assume a non-commutative setting. Hence, we start with a complete filtered non-commutative associative Rota–Baxter algebra $(A, P)$ of non-zero weight $\theta \in K$. The top of (45) contains the solution to the recursive equation
\[ y = 1 + P(y\,b) \tag{43} \]
for $b \in A_1$, which is given in terms of Spitzer’s identity generalized to associative, otherwise arbitrary, Rota–Baxter algebras (27),
\[ y = \exp\Big(-P\Big(\chi_\theta\Big(\frac{\log(1 - \theta b)}{\theta}\Big)\Big)\Big). \tag{44} \]
The $\theta$-BCH-recursion $\chi_\theta$ is given in (24). The left wing of (45) describes the case when first the weight $\theta$ goes to zero, hence reducing $\chi_\theta \to \chi_0$. This is the algebraic structure underlying Magnus’ $\Omega$-expansion. Then the algebra $A$ becomes commutative, which implies $\chi_0 = \mathrm{id}_A$. The right wing of diagram (45) just describes the opposite reduction, i.e., we first make the algebra commutative, which gives the classical Spitzer identity for non-zero weight commutative Rota–Baxter algebras (40). Then we take the limit $\theta \to 0$.
\[
\begin{array}{ccc}
 & \underset{\theta \neq 0,\ \text{non-com.}}{\exp\Big(-P\Big(\chi_\theta\Big(\frac{\log(1-\theta b)}{\theta}\Big)\Big)\Big)} & \\
 \swarrow {\scriptstyle \theta \to 0} & & \searrow {\scriptstyle \text{com.}} \\
 \underset{\theta = 0,\ \text{non-com. (Magnus)}}{\exp\big(P(\chi_0(b))\big)} & & \underset{\theta \neq 0,\ \text{com. (cl. Spitzer)}}{\exp\Big(-P\Big(\frac{\log(1-\theta b)}{\theta}\Big)\Big)} \\
 \searrow {\scriptstyle \text{com.}} & & \swarrow {\scriptstyle \theta \to 0} \\
 & \underset{\theta = 0,\ \text{com.}}{\exp\big(P(b)\big)} &
\end{array}
\tag{45}
\]
Both paths eventually arrive at the simple fact that Eq. (43) is solved by a simple exponential in a commutative weight zero Rota–Baxter setting. This is the general algebraic structure underlying the initial value problem in (38), respectively its corresponding integral equation (37).

3. Renormalization in Perturbative QFT

This section recalls some of the results from [14, 15]. We derive Connes’ and Kreimer’s Birkhoff decomposition of Hopf algebra characters with values in a commutative unital Rota–Baxter algebra. For more details we refer the reader to [17–19, 28].

In most of the interesting and relevant 4-dimensional quantum field theories (QFT), to perform even simple perturbative calculations, one cannot avoid facing ill-defined integrals. The removal of these (ultraviolet) divergencies, or short-distance singularities, in a physically and mathematically sound way is the process of renormalization [13]. In the theory of Kreimer [22], and Connes and Kreimer, Feynman graphs as the main building blocks of perturbative QFT are organized into a Hopf algebra. In particular, Connes and Kreimer discovered a Birkhoff type decomposition for Hopf algebra characters with values in the field of Laurent series, which captures the process of renormalization. We will briefly outline an algebraic framework for this decomposition based on the above results of Spitzer and Atkinson.

We work in the setting of Connes and Kreimer [11]. Recall that in the language of Kreimer, for a given perturbative renormalizable QFT, denoted by $\mathcal{F}$, we have a graded, connected, commutative, non-cocommutative Hopf algebra $H_\mathcal{F} := (H := \bigoplus_{n \geq 0} H_n, \Delta, m_H, \varepsilon_H, S)$ of one-particle irreducible (1PI) Feynman graphs with coproduct defined by
\[ \Delta(\Gamma) = \Gamma \otimes 1 + 1 \otimes \Gamma + \sum_{\gamma \subset \Gamma} \gamma \otimes \Gamma/\gamma. \]
Here the sum is over all 1PI ultraviolet divergent subgraphs $\gamma$ in $\Gamma$, and $\Gamma/\gamma$ denotes the corresponding cograph. The decomposition of $\Gamma$ in $\Delta(\Gamma)$ essentially describes the combinatorics of renormalization.

The space $\mathrm{Hom}(H_\mathcal{F}, \mathbb{C})$ of linear maps $H_\mathcal{F} \to \mathbb{C}$ equipped with the convolution product $f \star g := m_\mathbb{C} \circ (f \otimes g) \circ \Delta$ is an associative algebra with the counit $\varepsilon_H$ as unit.
$\mathrm{Hom}(H_\mathcal{F}, \mathbb{C})$ contains the group $G := \mathrm{Char}(H_\mathcal{F}, \mathbb{C})$ of Hopf algebra characters, i.e., algebra homomorphisms, and its corresponding Lie algebra $g := g_\mathcal{F} = \partial\mathrm{Char}(H_\mathcal{F}, \mathbb{C})$ of derivations (infinitesimal characters). Feynman rules in $\mathcal{F}$ provide such an algebra homomorphism from $H_\mathcal{F}$ to $\mathbb{C}$. In general, ultraviolet divergencies demand a regularization prescription, where, by introducing extra parameters, the characters become algebra homomorphisms into, for instance, $L = \mathbb{C}[\epsilon^{-1}, \epsilon]]$, the field of Laurent series (dimensional regularization scheme). We denote by $G_L := \mathrm{Char}(H_\mathcal{F}, L) \subset \mathrm{Hom}(H_\mathcal{F}, L)$ the group of $L$-valued, or regularized, algebra homomorphisms. Hence, the set of Feynman rules together with dimensional regularization amounts to a linear map from the set of 1PI Feynman graphs to $L$ and hence an algebra homomorphism, denoted by $\phi \in G_L$, from $H_\mathcal{F}$ to $L$.

We will now make the connection to Subsect. 2.2. The field of Laurent series actually forms a commutative Rota–Baxter algebra $(L, R)$ with the projector $R$ on $L$,
\[ R : L \to L, \qquad \sum_{i=-n}^{\infty} a_i \epsilon^i \mapsto \sum_{i=-n}^{-1} a_i \epsilon^i, \]
to the strict pole part of a Laurent series as the idempotent weight-one Rota–Baxter map (minimal subtraction scheme). It was shown in [11, 22] that this setup allows for a concise Hopf algebraic description of the process of perturbative renormalization of the QFT $\mathcal{F}$. To wit, Connes and Kreimer observed that Bogoliubov’s recursive formula for the counterterm in renormalization has a Hopf algebraic expression given by inductively defining the map $\phi_-$ on $H_\mathcal{F}$,
\[ \phi_-(\Gamma) = -R\Big(\phi(\Gamma) + \sum_{\gamma \subset \Gamma} \phi_-(\gamma)\,\phi(\Gamma/\gamma)\Big) \tag{46} \]
with $\phi_-(\Gamma) = -R(\phi(\Gamma))$ if $\Gamma$ is a primitive element in $H_\mathcal{F}$, i.e., contains no subdivergence. The map
\[ \bar{R}[\phi](\Gamma) := \phi(\Gamma) + \sum_{\gamma \subset \Gamma} \phi_-(\gamma)\,\phi(\Gamma/\gamma) \]
for $\Gamma \in \ker(\varepsilon_H)$ is Bogoliubov’s preparation map. This leads to the Birkhoff decomposition of Feynman rules found by Connes and Kreimer [10–12, 23], described in the following theorem.

Theorem 15. The renormalization of $\phi \xrightarrow{\text{ren.}} \phi_+$ follows from the convolution product of the counterterm $\phi_-$ (46) with $\phi$, $\phi_+ := \phi_- \star \phi$, implying the inductive formula for $\phi_+$,
\[ \phi_+(\Gamma) = \phi(\Gamma) + \phi_-(\Gamma) + \sum_{\gamma \subset \Gamma} \phi_-(\gamma)\,\phi(\Gamma/\gamma). \]
Further, the maps $\phi_-$ and $\phi_+$ are the unique characters such that $\phi = \phi_-^{-1} \star \phi_+$ gives the algebraic Birkhoff decomposition of the regularized Feynman rules character $\phi \in G_L$.
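Bogoliubov’s recursion (46) and the formula for $\phi_+$ in Theorem 15 can be sketched on a toy model. The assumptions here are ours and purely illustrative: the Hopf algebra is replaced by ladder generators `Gamma_1, ..., Gamma_N` whose proper subgraphs are `Gamma_k` with cograph `Gamma_(n-k)`; the regularized character `phi` takes made-up Laurent-polynomial values in `eps`; `R` is the pole-part projector. The sketch also checks the weight-one Rota–Baxter relation $R(x)R(y) + R(xy) = R\big(R(x)y + xR(y)\big)$ on two sample elements.

```python
from fractions import Fraction

def ladd(p, q, c=1):
    """p + c*q for Laurent polynomials in eps stored as {exponent: coefficient}."""
    r = dict(p)
    for e, v in q.items():
        r[e] = r.get(e, 0) + c * v
    return {e: v for e, v in r.items() if v != 0}

def lmul(p, q):
    """Product of two Laurent polynomials."""
    r = {}
    for e1, v1 in p.items():
        for e2, v2 in q.items():
            r[e1 + e2] = r.get(e1 + e2, 0) + v1 * v2
    return {e: v for e, v in r.items() if v != 0}

def R(p):
    """Projector onto the strict pole part (minimal subtraction)."""
    return {e: v for e, v in p.items() if e < 0}

def rota_baxter_defect(x, y):
    """R(x)R(y) + R(xy) - R(R(x)y + xR(y)); empty for a weight-one map."""
    lhs = ladd(lmul(R(x), R(y)), R(lmul(x, y)))
    rhs = R(ladd(lmul(R(x), y), lmul(x, R(y))))
    return ladd(lhs, rhs, -1)

def birkhoff(phi, N):
    """Bogoliubov recursion on the ladders Gamma_1..Gamma_N.

    Assumed toy coproduct: the proper subgraphs of Gamma_n are Gamma_k
    (0 < k < n), with cograph Gamma_n / Gamma_k = Gamma_(n-k).
    """
    phi_minus, phi_plus = {}, {}
    for n in range(1, N + 1):
        prep = dict(phi[n])  # preparation map R-bar[phi](Gamma_n)
        for k in range(1, n):
            prep = ladd(prep, lmul(phi_minus[k], phi[n - k]))
        phi_minus[n] = ladd({}, R(prep), -1)     # counterterm: -R(R-bar)
        phi_plus[n] = ladd(prep, phi_minus[n])   # renormalized: (id - R)(R-bar)
    return phi_minus, phi_plus

# Made-up regularized values phi(Gamma_n) in eps
phi = {
    1: {-1: Fraction(1), 0: Fraction(1), 1: Fraction(1)},
    2: {-2: Fraction(1, 2), -1: Fraction(1), 0: Fraction(3)},
    3: {-3: Fraction(1, 6), -2: Fraction(1), -1: Fraction(2), 0: Fraction(1)},
}
phi_minus, phi_plus = birkhoff(phi, 3)
print(phi_minus[2], phi_plus[2])
```

In this toy model the counterterms contain only poles and the renormalized values contain none, which is the content of the decomposition $\phi = \phi_-^{-1} \star \phi_+$ restricted to the generators.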
The following theorem describes the Birkhoff decomposition of Connes and Kreimer in Theorem 15 using the algebraic setting developed in the earlier sections.
Theorem 16 [14, 15]. In Proposition 8, take $A$ to be $(\mathrm{Hom}(H_\mathcal{F}, L), \mathcal{R})$, which is a complete filtered Rota–Baxter algebra with Rota–Baxter operator $\mathcal{R}(\phi) := R \circ \phi$ and filtration from $H$. We denote its unit by $e := u_L \circ \varepsilon_H$. For an $L$-valued character $\phi \in \mathrm{Char}(H_\mathcal{F}, L)$ take $b := \phi - e$. Then one can show that $b \in A_1$ and
(1) the equations in (18) are the recursive formulae for $x =: \phi_-$ and $x' =: \phi_+^{-1}$ in the work of Connes–Kreimer;
(2) the exponential factors in Eq. (5) give the unique explicit formulae for $x^{-1} = \phi_-^{-1}$ and $x'^{-1} = \phi_+$;
(3) Equation (5) gives the unique Birkhoff decomposition of $\phi = \phi_-^{-1} \star \phi_+$ found in Connes–Kreimer’s work;
(4) Bogoliubov’s $\bar{R}$-map, $\bar{R}[\phi] : H \to L$, is given by $\bar{R}[\phi] = \exp^{\star_\mathcal{R}}\big(-\chi(\log(\phi))\big)$, for $\phi \in G_L$, such that $\tilde{\mathcal{R}}(\bar{R}[\phi]) = 2e - \phi_+$ and $\mathcal{R}(\bar{R}[\phi]) = \phi_- - e$. Here $\phi_1 \star_\mathcal{R} \phi_2 := \mathcal{R}(\phi_1) \star \phi_2 + \phi_1 \star \mathcal{R}(\phi_2) - \phi_1 \star \phi_2$, $\phi_i \in G_L$, $i = 1, 2$, see (15).

It is evident that one can replace the particular choice of the field of Laurent series $L$ by any other commutative Rota–Baxter algebra with idempotent Rota–Baxter map.

Proposition 8 provides us with a recursion for the renormalization of $\phi \xrightarrow{\text{ren.}} \phi_+$ which does not contain the counterterm $\phi_-$.

Corollary 17 [14, 16, 17]. With the assumption of Theorem 16, the second equation in (17) gives a recursion for $\phi_+$,
\[ \phi_+ = e - \tilde{\mathcal{R}}\big(\phi_+ \star (\phi^{-1} - e)\big). \]
Recall that the inverse of $\phi \in G_L$ is given by the composition with the antipode, $\phi^{-1} = \phi \circ S$.

We should mention that in recent work [16, 17] the first two authors showed, together with J. M. Gracia-Bondía and J. C. Várilly, how the combinatorics of perturbative renormalization can be represented by matrix factorization of unipotent upper (lower) triangular matrices with entries in a commutative Rota–Baxter algebra. As we have seen above such triangular matrices provide a simple example of a complete filtered Rota–Baxter algebra.

4. Formal Exponentials

In this section we consider the factorization of formal exponentials described by Barron, Huang and Lepowsky in [5] in the context of the BCH-recursion map $\chi$. Let us first recall their notations and results.

Let $g$ be a Lie algebra with a decomposition $g_- \oplus g_+$ of the underlying vector space. Equivalently, there is an idempotent linear map $P : g \to g$. Then we have the corresponding decomposition of the complete Lie algebra $g[[s, t]] = g_-[[s, t]] \oplus g_+[[s, t]]$. Consider the (restriction of the) BCH map
\[ C : g[[s, t]]_1 \times g[[s, t]]_1 \to g[[s, t]]_1. \]
Here $g[[s, t]]_1 = s\,g[[s, t]] + t\,g[[s, t]]$.
Theorem 18 [5]. (1) (Factorization Theorem of Barron–Huang–Lepowsky) The map
\[ C : s\,g_-[[s, t]] \times t\,g_+[[s, t]] \to s\,g_-[[s, t]] \oplus t\,g_+[[s, t]] \]
is bijective. Here we have the direct product on the source space and the (direct) sum in $g[[s, t]]$ on the target space.
(2) (Formal Algebraic Uniformization of Barron–Huang–Lepowsky) There exists a unique bijection
\[ \Psi = (\Psi^-, \Psi^+) : t\,g_+[[s, t]] \times s\,g_-[[s, t]] \to s\,g_-[[s, t]] \times t\,g_+[[s, t]] \]
such that for $g^\pm \in g_\pm[[s, t]]$,
\[ \exp(t g^+)\exp(s g^-) = \exp\big(\Psi^-(t g^+, s g^-)\big)\exp\big(\Psi^+(t g^+, s g^-)\big). \]

They further posed the following problem.

Problem 19 [5, Problem 3.2]. Find a closed form for the inverse map of $C$ in Theorem 18(1).

The theorem of Barron, Huang and Lepowsky generalizes a well-known result in the case when $g$ is a finite-dimensional Lie algebra over $\mathbb{R}$ or $\mathbb{C}$. In this special case, the proof was obtained by a geometric argument on the corresponding Lie group. Their proof, in contrast, is algebraic, making use of a more precise expression of $C(x, y)$ and the gradings given by $s$ and $t$. We will show here how to derive this result from Theorem 4.

Theorem 20. (1) The bijection $C$ in Theorem 18(1) is the restriction $C_-$ of the bijection $C$ in part (1) of Theorem 4.
(2) The uniformization $\Psi$ in Theorem 18(2) is the restriction of the uniformization in Theorem 4(2).
(3) The restriction of the formulae for $C_-$ and $\Psi$ in Theorem 4(3) gives the formulae for the inverse map of $C$ and $\Psi$ in Theorem 18(1).

Remark. The formulae in item (3) involve the BCH-recursion $\chi$, which is defined in terms of the recursive equation (2), respectively (8). Hence, our approach allows for a compact formulation of Problem 19 in a generalized setting, to wit, find a closed form for the recursively defined map $\chi$. Moreover, Lemma 7 and especially Eq. (50) of Sect. 5 give solutions to the BCH-recursion $\chi$, that is, closed forms for the inverse map of $C$, in some particular situations.

Proof. (1) We first note that $\bar{g} := g[[s, t]]$ is a complete Lie algebra with filtration defined by the grading given by the total degree in the parameters $s$ and $t$. In particular $\bar{g}_1 = s\,g[[s, t]] + t\,g[[s, t]]$. Thus by Theorem 4, the map
\[ C_- : g_-[[s, t]]_1 \times g_+[[s, t]]_1 \to g[[s, t]]_1 = g_-[[s, t]]_1 \oplus g_+[[s, t]]_1 \]
is bijective with inverse
\[ D_P = (P, \tilde{P}) \circ \chi : g_-[[s, t]]_1 \oplus g_+[[s, t]]_1 \to g_-[[s, t]]_1 \times g_+[[s, t]]_1. \]
Then to prove items (2) and (3) in Theorem 20, and hence Theorem 18, we only need to show
Lemma 21. Let $U = s\,g_-[[s, t]] \times t\,g_+[[s, t]]$, $V = s\,g_-[[s, t]] \oplus t\,g_+[[s, t]]$. Then $C_-$ restricts to a bijective map from $U$ onto $V$.

Proof. The inclusion $C(U) \subset V$ is straightforward: if $(a_-, a_+) \in s\,g_-[[s, t]] \times t\,g_+[[s, t]]$ then clearly $a_- + a_+ \in V$, and $\mathrm{BCH}(a_-, a_+) \in st\,g[[s, t]] = st\,g_-[[s, t]] \oplus st\,g_+[[s, t]] \subseteq s\,g_-[[s, t]] \oplus t\,g_+[[s, t]]$. Let us now prove the inclusion $\chi(V) \subseteq V$, i.e.,
\[ \chi\big(s\,g_-[[s, t]] \oplus t\,g_+[[s, t]]\big) \subseteq s\,g_-[[s, t]] \oplus t\,g_+[[s, t]], \tag{47} \]
by using the definition of the BCH-recursion: for any $v \in V$, $\chi(v)$ is the limit of $\chi_{(n)}(v)$ (for the topology defined by the filtration) recursively defined by $\chi_{(0)}(v) = v$ and
\[ \chi_{(n)}(v) = v - \mathrm{BCH}\big(P(\chi_{(n-1)}(v)),\ \tilde{P}(\chi_{(n-1)}(v))\big). \]
It is clear, from the same argument as above, that $\chi_{(n-1)}(v) \in V$ implies $\chi_{(n)}(v) \in V$, so $\chi_{(n)}(v) \in V$ for any $n$ by induction. We then deduce $\chi(v) \in V$ by taking the limit, as $V$ is closed. We deduce immediately from this inclusion that $D_P(V) \subseteq U$, as $D_P(a) = \big(P \circ \chi(a),\ \tilde{P} \circ \chi(a)\big)$.

(2) Recall that the map $\Psi$ in Theorem 18(2) is given by the following diagram:
\[
\begin{array}{ccc}
t\,g_+[[s, t]] \times s\,g_-[[s, t]] & \xrightarrow{\ C\ } & t\,g_+[[s, t]] \oplus s\,g_-[[s, t]] \\
\downarrow {\scriptstyle \Psi} & & \downarrow {\scriptstyle \sigma} \\
s\,g_-[[s, t]] \times t\,g_+[[s, t]] & \xleftarrow{\ D\ } & s\,g_-[[s, t]] \oplus t\,g_+[[s, t]]
\end{array}
\]
Here again $\sigma$ is just the identity map $\sigma(t h^+ + s h^-) = s h^- + t h^+$. Then the proof of part (2) follows from part (1). Part (3) is readily verified.

5. Combinatorial Hopf Algebras

In Sect. 3 we applied the factorization property of the BCH-recursion $\chi$ together with the Rota–Baxter relation to Hopf algebras, in the context of the Hopf algebraic description of renormalization by Connes and Kreimer. This section consists of another application of $\chi$ to connected graded Hopf algebras. We analyze explicitly the even-odd decomposition of Aguiar, Bergeron, and Sottile [2] (see footnote 2).

For a connected graded Hopf algebra $(H = \bigoplus_{n \geq 0} H_n, \Delta, m, \varepsilon, S)$, we define the grading operator $Y(h) := |h|\,h := n\,h$ for a homogeneous element $h \in H_n$, and extend linearly. The grading on $H$ defines a canonical involutive automorphism on $H$, denoted by $\bar{\ } : H \to H$, $\bar{h} := (-1)^{|h|} h = (-1)^n h$ for $h \in H_n$. It induces by duality an involution on $\mathrm{Hom}(H, K)$, $\bar{\phi}(h) = \phi(\bar{h})$ for $\phi \in \mathrm{Hom}(H, K)$, $h \in H$.

Footnote 2. We thank W. Schmitt for bringing the paper of Aguiar, Bergeron and Sottile [2] to our attention.
$H$ naturally decomposes into $H_- := \bigoplus_{n > 0} H_{2n-1}$ and $H_+ := \bigoplus_{n \geq 0} H_{2n}$ on the level of vector spaces,
\[ H = H_- \oplus H_+, \]
with projectors $\pi_\pm : H \to H_\pm$, such that for $h \in H$, $\pi_+(h) =: h_+ = \bar{h}_+$ and $\pi_-(h) =: h_- = -\bar{h}_-$, $h = h_- + h_+$. As a remark we mention that $H_+$ is a subalgebra of $H$, whereas $H_-$ is just a subspace, hence neither $\pi_-$ nor $\pi_+ := \mathrm{id}_H - \pi_-$ are Rota–Baxter maps. Instead, we have $H_\pm H_\pm \subset H_+$ and $H_\pm H_\mp \subset H_-$.

The set of characters $G := \mathrm{Char}(H, K)$, i.e., multiplicative maps $\phi \in \mathrm{Hom}(H, K)$, forms a group under convolution, defined by $f \star g := m_K \circ (f \otimes g) \circ \Delta$ for $f, g \in \mathrm{Hom}(H, K)$. A character $\phi \in G$ is called even if it is a fixed point of the involution, $\bar{\phi} = \phi$, and is called odd if it is an anti-fixed point, $\bar{\phi} = \phi^{-1} = \phi \circ S$. The sets of odd and even characters are denoted by $G_-$ and $G_+$, respectively. Even characters form a subgroup in $G$, whereas the set of odd characters forms a symmetric space. The following theorem is proved in [2].

Theorem 22 [2]. Any $\phi \in \mathrm{Char}(H, K)$ has a unique decomposition
\[ \phi = \phi_- \star \phi_+ \]
with $\phi_- \in G_-$ being an odd character and $\phi_+ \in G_+$ being an even character.

Both projectors $\pi_- : H \to H_-$ and $\pi_+ : H \to H_+$ lift to $\mathrm{Hom}(H, K)$. This implies, for the complete filtered Lie algebra $g := \partial\mathrm{Char}(H, K)$ with filtration from $H$, the direct decomposition $g = g_- \oplus g_+$ into the Lie subalgebra $g_+$ and the Lie triple system $g_-$, such that any $Z \in g$ decomposes uniquely as $Z = Z_- + Z_+$ with $Z_\pm \in g_\pm$. Then by Theorem 3, there is a BCH-recursion $\chi : g_1 \to g_1$ such that, for any $\phi = \exp(Z) \in \mathrm{Char}(H, K)$, $Z \in g_1$, we have
\[ \phi = \exp(Z) = \exp(Z_- + Z_+) = \exp\big(\chi(Z)_-\big) \star \exp\big(\chi(Z)_+\big). \tag{48} \]
Here the exponential is defined with respect to the convolution product, $\exp^\star(Z) := \sum_{n \geq 0} \frac{Z^{\star n}}{n!}$, but we will skip the $\star$ in the following to ease the notation.

Theorem 23. The even-odd factorization of a character in Theorem 22 coincides with the factorization in item (1) of Theorem 3.

The proof follows from the following properties of the involution $\bar{\ } : H \to H$. Recall the definition of an algebra involution on an algebra $A$, which is an algebra homomorphism $j : A \to A$ such that $j^2 = \mathrm{id}_A$.
Dually, define now a coalgebra involution to be a linear map $j$ on a coalgebra $C$ such that $j^2 = \mathrm{id}_C$ and $(j \otimes j) \circ \Delta = \Delta \circ j$.

Lemma 24. Let $H$ be a connected filtered Hopf $K$-algebra. Let $j : H \to H$ be a coalgebra involution preserving the filtration. Then by pre-composition, $j$ defines an algebra involution, still denoted by $j$, on the filtered algebra $A := \mathrm{Hom}(H, K)$ that preserves the filtration.

Proof. For $h \in H$, we have
\[ j(f \star g)(h) = (f \star g)(j(h)) = \big(m_K \circ (f \otimes g) \circ \Delta \circ j\big)(h) = \big(m_K \circ (f \otimes g) \circ (j \otimes j) \circ \Delta\big)(h) = \big(j(f) \star j(g)\big)(h). \]
So $j : A \to A$ is an algebra homomorphism. Clearly, $j$ preserves the filtration and $j^2 = \mathrm{id}$.
Lemma 25. Let $j$ be an algebra involution on the complete filtered algebra $A$ that preserves the filtration.
(1) If $j(a) = \pm a$ for $a \in A_1$, then $j(\exp(a)) = \exp(\pm a)$.
(2) Let $A_{1,\pm} := \{a \in A_1 \mid j(a) = \pm a\}$ and $G_\pm := \{\eta \in 1 + A_1 \mid j(\eta) = \eta^{\pm 1}\}$. Then $\exp(A_{1,-}) = G_-$ and $\exp(A_{1,+}) = G_+$.

Proof. Since $j$ preserves the filtration, $j$ is a continuous map with respect to the topology defined by the filtration. So for any $a \in A_1$, we have
\[ j\big(\exp(a)\big) = j\Big(\lim_{k \to \infty} \sum_{n=0}^{k} \frac{a^n}{n!}\Big) = \lim_{k \to \infty} j\Big(\sum_{n=0}^{k} \frac{a^n}{n!}\Big) = \lim_{k \to \infty} \sum_{n=0}^{k} \frac{j(a)^n}{n!} = \exp(j(a)). \]
Now item (1) of the lemma follows. Item (2) then follows from the bijectivity of exp.

Proof (of Theorem 23). Now let $\phi \in \mathrm{Char}(H, K)$, and $j = \bar{\ }$. Then by Lemma 24, we see that the induced $j = \bar{\ }$ on $\mathrm{Hom}(H, K)$ is an algebra involution that preserves the filtration. Then by Lemma 25(1), $\exp(\chi(Z)_-)$ (resp. $\exp(\chi(Z)_+)$) is odd (resp. even). So Eq. (48) gives a decomposition of $\phi$ as a product of an element of $G_-$ and an element of $G_+$. By Corollary 5 and Lemma 25(2), we must have $\phi_- = \exp(\chi(Z)_-)$ and $\phi_+ = \exp(\chi(Z)_+)$, as needed.

We should remind the reader that the results of Proposition 8 do not apply here. We cannot calculate the exponentials $\phi_\pm$ using Spitzer’s recursions in (18), since neither the projector $\pi_+$ nor $\pi_-$ is of Rota–Baxter type. Nevertheless, the particular setting allows for a significant simplification of the BCH-recursion. In fact, using that $\phi_\pm \in G_\pm$, hence $\bar{\phi}_- = \phi_-^{-1}$ and $\bar{\phi}_+ = \phi_+$, and some algebra [29], we find the following simple formula for $\pi_-(\chi(Z)) = \chi(Z)_-$:
\[ \pi_-(\chi(Z)) = \pi_-(Z) + \frac{1}{2}\,\mathrm{BCH}\big(\pi_-(Z) + \pi_+(Z),\ \pi_-(Z) - \pi_+(Z)\big). \tag{49} \]
This follows from Lemma 25, which implies $\bar{\phi} = j(\exp(Z)) = \exp\big(-\pi_-(Z) + \pi_+(Z)\big)$ but also $j(\exp(Z)) = \exp\big(-\chi(Z)_-\big)\exp\big(\chi(Z)_+\big)$. Therefore, we have $\phi \star \bar{\phi}^{-1} = \phi_- \star \phi_-$, that is, $\exp\big(2\chi(Z)_-\big) = \exp(Z)\exp\big(\pi_-(Z) - \pi_+(Z)\big)$, which gives Eq. (49). From the factorization in Theorem 23, i.e., $\exp\big(\chi(Z)_+\big) = \exp\big(-\chi(Z)_-\big)\exp(Z)$, we derive a closed form for the BCH-recursion
\[ \chi(Z) = Z + \mathrm{BCH}\Big(-\pi_-(Z) - \frac{1}{2}\,\mathrm{BCH}\big(Z,\ 2\pi_-(Z) - Z\big),\ Z\Big). \tag{50} \]
We may remark that this gives an answer to Problem 19 in the particular setting just outlined, see also the remark after Theorem 20.
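The even-odd factorization can be made concrete in the simplest complete filtered algebra with a filtration preserving involution. The following sketch rests on our own illustrative assumptions and is not taken from the paper: the algebra consists of upper triangular $4 \times 4$ rational matrices, the involution $j$ is conjugation by $\mathrm{diag}(1, -1, 1, -1)$, so that $\pi_+$ and $\pi_-$ keep the even and odd superdiagonals, and all function names are hypothetical. It computes $\chi(Z)$ by iterating $\chi(Z) = Z - \mathrm{BCH}\big(\pi_-(\chi(Z)), \pi_+(\chi(Z))\big)$, which stabilizes after finitely many steps because the algebra is nilpotent, and then checks $\exp(Z) = \exp(\chi(Z)_-)\exp(\chi(Z)_+)$ together with the relation $\phi\,\bar{\phi}^{-1} = \phi_-\phi_-$, i.e. $2\chi(Z)_- = \log\big(\exp(Z)\exp(-j(Z))\big)$.

```python
from fractions import Fraction

n = 4  # strictly upper triangular n x n matrices form a nilpotent Lie algebra

def zeros():
    return [[Fraction(0)] * n for _ in range(n)]

def identity():
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def add(A, B, c=1):
    return [[A[i][j] + c * B[i][j] for j in range(n)] for i in range(n)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_nil(X):
    """exp(X) for strictly upper triangular X (the series stops at X^(n-1))."""
    out, power, fact = identity(), identity(), 1
    for k in range(1, n):
        power = mul(power, X)
        fact *= k
        out = add(out, power, Fraction(1, fact))
    return out

def log_uni(U):
    """log(U) for unipotent upper triangular U (finite series in U - 1)."""
    N = add(U, identity(), -1)
    out, power = zeros(), identity()
    for k in range(1, n):
        power = mul(power, N)
        out = add(out, power, Fraction((-1) ** (k + 1), k))
    return out

def bch(X, Y):
    """BCH(X, Y) = log(exp(X) exp(Y)) - X - Y."""
    return add(add(log_uni(mul(exp_nil(X), exp_nil(Y))), X, -1), Y, -1)

def proj(X, parity):
    """Keep the entries on superdiagonals of the given parity (0: pi_+, 1: pi_-)."""
    return [[X[i][j] if (j - i) % 2 == parity else Fraction(0)
             for j in range(n)] for i in range(n)]

def involution(X):
    """j(X) = D X D^(-1) with D = diag(1, -1, 1, -1): even part minus odd part."""
    return add(proj(X, 0), proj(X, 1), -1)

def chi(Z):
    """Fixed point of c -> Z - BCH(pi_-(c), pi_+(c)); n iterations suffice here."""
    c = Z
    for _ in range(n):
        c = add(Z, bch(proj(c, 1), proj(c, 0)), -1)
    return c

Z = zeros()
Z[0][1], Z[1][2], Z[2][3] = Fraction(1), Fraction(2), Fraction(3)
Z[0][2], Z[1][3], Z[0][3] = Fraction(1), Fraction(-1), Fraction(5)

chi_Z = chi(Z)
chi_minus, chi_plus = proj(chi_Z, 1), proj(chi_Z, 0)
phi_minus, phi_plus = exp_nil(chi_minus), exp_nil(chi_plus)

print(mul(phi_minus, phi_plus) == exp_nil(Z))
```

Exact rational arithmetic is used so that the comparisons are equalities rather than numerical approximations.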
6. Polar Decomposition

The factorization of Aguiar et al. in the context of connected graded Hopf algebras is related to a general result elaborated in more detail in [29, 30] and [42]. There it is shown that any connected Lie group $G$, together with an involutive automorphism $\sigma$ on $G$, allows locally for a decomposition similar to the above one. We will briefly outline the setting of [29, 30, 42] and show that our BCH-recursion provides an efficient means for calculations. We should stress that the BCH-recursion approach gives only formal series.

Let $G$ be a connected Lie group, and $g$ its corresponding Lie algebra. We assume the existence of an involutive automorphism $\sigma$ on $G$. Let $G_- := \{\psi \in G \mid \sigma(\psi) = \psi^{-1}\}$ denote the symmetric space of anti-fixed points of $\sigma$, and by $G_+ := \{\psi \in G \mid \sigma(\psi) = \psi\}$ we denote the subgroup of fixed points of $\sigma$. Also we denote by $\pi_\pm$ the lifted projections on $g$ corresponding to $\sigma$. Hence for the Lie algebra $g$ we have the direct decomposition in terms of the images of these projectors, $g = g_- \oplus g_+$, where $\pi_+(g) =: g_+$ is a Lie subalgebra and $\pi_-(g) =: g_-$ a Lie triple system.

In this setting Munthe-Kaas et al. derive a differentiable factorization of $\psi = \exp(tZ) \in G$, $Z \in g$, for sufficiently small parameter $t$. Using the additive decomposition of $g$ in terms of the projectors $\pi_\pm$ corresponding to $\sigma$, the factors $\psi_\pm(t) := \exp\big(X_\pm(Z; t)\big)$ in $\psi(t) = \psi_-(t)\psi_+(t)$ are calculated by solving differential equations in $t$. This way explicit, complicated recursions are derived for the terms $X_\pm^{(i)}(Z)$ in $X_\pm(Z; t) = \sum_{i > 0} X_\pm^{(i)}(Z)\,t^i$, using the relations between the spaces $g_\pm$, i.e.,
\[ [g_\pm, g_\mp] \subset g_-, \qquad [g_\pm, g_\pm] \subset g_+. \tag{51} \]
The results coincide with those following from the simpler BCH-recursion map $\chi$ (2), which we state here again,
\[ \chi(Z) = Z - \mathrm{BCH}\big(\pi_-(\chi(Z)),\ \pi_+(\chi(Z))\big), \]
or Eq. (8), respectively its simple closed form in Eq. (50). The $\pi_\pm$ projections of the first three terms of $\chi(Zt) = t \sum_{k \geq 0} \chi^{(k)}(Z)\,t^k$ are $\pi_\pm(\chi^{(0)}(Z)) = Z_\pm$, and for the next two non-trivial parts (6, 7) we find in order $t^2$,
\[ \pi_+\big(\chi^{(1)}(Z)\big) = X_+^{(2)} = 0, \qquad \pi_-\big(\chi^{(1)}(Z)\big) = X_-^{(2)} = -\frac{1}{2}[Z_-, Z_+], \tag{52} \]
and the even respectively odd projections in order $t^3$,
\begin{align}
\pi_+\big(\chi^{(2)}(Z)\big) = X_+^{(3)} &= -\frac{1}{12}\big[Z_-, [Z_-, Z_+]\big], \notag \\
\pi_-\big(\chi^{(2)}(Z)\big) = X_-^{(3)} &= -\frac{1}{4}\big[Z_+, [Z_-, Z_+]\big] + \frac{1}{12}\big[Z_+, [Z_-, Z_+]\big] = -\frac{1}{6}\big[Z_+, [Z_-, Z_+]\big]. \tag{53}
\end{align}
The reader is invited to compare them with the results in Munthe-Kaas et al. [29, 30] and especially Zanna’s work [42] (see footnote 3). In our approach we work with one relatively simple BCH type recursion, $\chi(Z)$, respectively its closed form (50). Then we take the projections via $\pi_\pm$ to obtain $X_\pm(Z; t)$ up to third order in the parameter $t$. The parameter $t$ may be interpreted as providing us with the filtration (in the sense of formal power series). Here we make heavy use of the relations in (51). This seems to offer a simpler way for calculating the Lie algebra elements $X_\pm(Z; t) \in g_\pm$. We only need higher expansion terms for $\chi$ and then project into $g_\pm$. Relations (51) simplify the last step considerably.

Footnote 3. We would like to point to the recursive equation (1.1) on p. 2 for the $X_-^{(k)} =: X_k$, and Eq. (3.5) on p. 7 for $X_+^{(l)} =: Y_l$ in [42].

Acknowledgements. The first author acknowledges greatly the support by the European Post-Doctoral Institute and Institut des Hautes Études Scientifiques (I.H.É.S.). He profited from discussions with M. Aguiar, J. M. Gracia-Bondía and D. Kreimer. Thanks go to the Theory Department at the Physics Institute of Bonn University for warm hospitality. The second author thanks the NSF grant DMS-0505643 and Rutgers University Research Council for support, and thanks I.H.É.S. and Max Planck Institute for Mathematics in Bonn for hospitality. Many thanks go to J. Stasheff for comments and we appreciate helpful discussions with K. Barron, Y. Huang and J. Lepowsky. B. Fauser’s useful remark is acknowledged. The third author greatly acknowledges constant support from the Centre National de la Recherche Scientifique (C.N.R.S.).
References 1. Aguiar, M.: Prepoisson algebras. Lett. Math. Phys. 54(4), 263–277 (2000) 2. Aguiar, M., Bergeron, N., Sottile, F.: Combinatorial Hopf algebras and generalized Dehn–Sommerville relations. Comp. Math. 142, 1–30 (2006) 3. Atkinson, F.V.: Some aspects of Baxter’s functional equation. J. Math. Anal. Appl. 7, 1–30 (1963) 4. Babelon, O., Bernard, D., Talon, M.: Introduction to classical integrable systems. Cambridge Monographs on Mathematical Physics. Cambridge: Cambridge University Press, 2003 5. Barron, K., Huang, Y., Lepowsky, J.: Factorization of formal exponentials and uniformization. J. Algebra. 228, 551–579 (2000) 6. Baxter, G.: An analytic problem whose solution follows from a simple algebraic identity. Pacific J. Math. 10, 731–742 (1960) 7. Belavin, A.A., Drinfeld, V.G.: Solutions of the classical Yang-Baxter equation for simple Lie algebras. Funct. Anal. Appl. 16, 159–180 (1982) 8. Bogoliubov, N.N., Parasiuk, O.S.: On the multiplication of causal functions in the quantum theory of fields. Acta Math. 97, 227–266 (1957) 9. Cartier, P.: On the structure of free Baxter algebras. Advances in Math. 9, 253–265 (1972) 10. Connes, A., Kreimer, D.: Hopf algebras, Renormalization and noncommutative geometry. Commun Math. Phys. 199, 203–242 (1998) 11. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann–Hilbert problem. I. The Hopf algebra structure of graphs and the main theorem. Commun. Math. Phys. 210(1), 249–273 (2000) 12. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann–Hilbert problem. II. The β-function, diffeomorphisms and the renormalization group. Commun. Math. Phys. 216, 215–241 (2001) 13. Collins, J.C.: Renormalization. Cambridge Monographs on Mathematical Physics, Cambridge: Cambridge University Press, 1984 14. Ebrahimi-Fard, K., Guo, L., Kreimer, D.: Spitzer’s identity and the algebraic Birkhoff decomposition in pQFT. J. Phys. A: Math. Gen. 37, 11037–11052 (2004) 15. 
Ebrahimi-Fard, K., Guo, L., Kreimer, D.: Integrable renormalization II: the general case. Ann. H. Poincaré. 6, 369–395 (2005) 16. Ebrahimi-Fard, K., Gracia-Bondía, J.M., Guo, L., Várilly, J.C.: Combinatorics of renormalization as matrix calculus. Phys. Lett. B. 632(4), 552–558 (2006) 17. Ebrahimi-Fard, K., Guo, L.: Matrix Representation of Renormalization in Perturbative Quantum Field Theory. http://arXiv.org/list/hep-th/0508155, 2005 18. Ebrahimi-Fard, K., Kreimer, D.: Hopf algebra approach to Feynman diagram calculations. J. Phys. A: Math. Gen. 38, R385–R406 (2005) 19. Figueroa, H., Gracia-Bondía, J.M.: Combinatorial Hopf algebras in quantum field theory I. Rev. Math. Phys. 17, 881–976 (2005) 20. Godement, R.: Introduction à la théorie des groupes de Lie. Reprint of the 1982 original. Berlin Heidelberg New York: Springer, 2004
21. Hepp, K.: Proof of the Bogoliubov–Parasiuk theorem on renormalization. Commun. Math. Phys. 2, 301–326 (1966)
22. Kreimer, D.: On the Hopf algebra structure of perturbative quantum field theories. Adv. Theor. Math. Phys. 2, 303–334 (1998)
23. Kreimer, D.: Chen’s iterated integral represents the operator product expansion. Adv. Theor. Math. Phys. 3(3), 627–670 (1999)
24. Kreimer, D.: Combinatorics of (perturbative) quantum field theory. Phys. Rep. 363, 387–424 (2002)
25. Kingman, J.F.C.: Spitzer’s identity and its use in probability theory. J. London Math. Soc. 37, 309–316 (1962)
26. Loday, J.-L.: Série de Hausdorff, idempotents Eulériens et algèbres de Hopf. Expo. Math. 12, 165–178 (1994)
27. Magnus, W.: On the exponential solution of differential equations for a linear operator. Comm. Pure Appl. Math. 7, 649–673 (1954)
28. Manchon, D.: Hopf algebras, from basics to applications to renormalization. In: Comptes-rendus des Rencontres mathématiques de Glanon 2001, available at http://math.univ-bpclermont.fr/0107EManchon/biblio/bogofa2002.pdf, 2006
29. Munthe-Kaas, H.Z., Quispel, G.R.W., Zanna, A.: The polar decomposition of Lie groups with involutive automorphisms. Technical Report, 191, Dept. of Informatics, Univ. of Bergen, Norway 2000
30. Munthe-Kaas, H.Z., Quispel, G.R.W., Zanna, A.: Generalized polar decompositions on Lie groups with involutive automorphisms. Found. Comput. Math. 1(3), 297–324 (2001)
31. Polishchuk, A.: Classical Yang–Baxter equation and the A∞-constraint. Adv. Math. 168(1), 56–95 (2002)
32. Reutenauer, C.: Free Lie algebras. Oxford: Oxford University Press, 1993
33. Rota, G.-C.: Baxter algebras and combinatorial identities. I, II. Bull. Amer. Math. Soc. 75, 325–329 (1969); ibid. 75, 330–334 (1969)
34. Rota, G.-C., Smith, D.: Fluctuation theory and Baxter algebras. Istituto Nazionale di Alta Matematica, IX, 179, (1972). Reprinted in: Gian-Carlo Rota on Combinatorics: Introductory papers and commentaries. J.P.S. Kung, ed., Contemp. Mathematicians, Boston, MA: Birkhäuser Boston, 1995
35. Rota, G.-C.: Baxter operators, an introduction. In: Gian-Carlo Rota on Combinatorics, Introductory papers and commentaries. J.P.S. Kung ed., Contemp. Mathematicians, Boston, MA: Birkhäuser Boston, 1995
36. Rota, G.-C.: Ten mathematics problems I will never solve. Invited address at the joint meeting of the American Mathematical Society and the Mexican Mathematical Society, Oaxaca, Mexico, December 6, 1997. DMV Mitteilungen, Heft 2, 45–52 (1998)
37. Semenov-Tian-Shansky, M.A.: What is a classical r-matrix? Funct. Anal. Appl. 17(4), 254–272 (1983)
38. Semenov-Tian-Shansky, M.A.: Integrable systems and factorization problems. Lectures given at the Faro International Summer School on Factorization and Integrable Systems (Sept. 2000), Basel-Boston: Birkhäuser, 2003
39. Spitzer, F.: A combinatorial lemma and its application to probability theory. Trans. Amer. Math. Soc. 82, 323–339 (1956)
40. Varadarajan, V.S.: Lie groups, Lie algebras, and Their Representations. Berlin Heidelberg New York: Springer, 1984
41. Wendel, J.G.: A brief proof of a theorem of Baxter. Math. Scand. 11, 107–108 (1962)
42. Zanna, A.: Recurrence relations and convergence theory of the generalized polar decomposition on Lie groups. Math. Comp. 73(246), 761–776 (2004)
43. Zimmermann, W.: Convergence of Bogoliubov’s method of renormalization in momentum space. Commun. Math. Phys. 15, 208–234 (1969)

Communicated by A. Connes