Commun. Math. Phys. 231, 1 – 24 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0666-7
Communications in
Mathematical Physics
© Springer-Verlag 2002
Periodic Orbits, Symbolic Dynamics and Topological Entropy for the Restricted 3-Body Problem
Gianni Arioli
Dipartimento di Scienze e T.A., C.so Borsalino 54, 15100 Alessandria, Italy. E-mail: [email protected]
Received: 16 October 2001 / Accepted: 2 February 2002
Abstract: This paper concerns the restricted 3-body problem. By applying topological methods we give a computer assisted proof of the existence of some classes of periodic orbits and of symbolic dynamics, and we give a rigorous lower estimate for the topological entropy.
1. Introduction
The problem of both restricted and full N-body systems has such a long history that it is impossible to give an extensive bibliography here; we refer the reader to the classical texts [MH, M, SM, S]. This paper concerns the study of periodic and chaotic solutions of the planar restricted 3-body problem. If we assume that the primaries orbit around each other with period 2π and we use a rotating reference frame, i.e. if we use synodical coordinates, then the motion of the third body is described by the following system of second order differential equations:
$$\ddot{x} + 2\dot{y} = \Omega_x, \qquad \ddot{y} - 2\dot{x} = \Omega_y, \tag{1.1}$$
where
$$\Omega(x, y) = \frac{x^2}{2} + \frac{y^2}{2} + \frac{m_1}{\sqrt{(x - R_1)^2 + y^2}} + \frac{m_2}{\sqrt{(x + R_2)^2 + y^2}} + C,$$
This research was supported by MURST project “Metodi variazionali ed Equazioni Differenziali Non Lineari”. Current address: Dipartimento di Matematica, Politecnico, via Bonardi 9, 20133 Milano, Italy. E-mail:
[email protected]
$m_1$ and $m_2$ are the masses of the primaries, C is an arbitrary constant and $R_1$, $R_2$ depend on $m_1$ and $m_2$. This problem has been recently treated by variational methods in [AGT]. Here we employ topological methods with computer assistance. This kind of computer assisted proof was introduced in [MM1, MM2, MM3], while the topological methods employed here were introduced in [Z1, Z2], see also the applications in [AZ, GZ, Z3]. We point out that, although some of the proofs we give require computer assistance, they are rigorous and they can be easily reproduced on any recent computer. To this purpose, in the last section we provide all necessary information, mostly by referring to other papers, and we also provide on the web a Mathematica version of the algorithms employed.
The main result presented in this paper is the proof of symbolic dynamics on fourteen symbols, which corresponds to the existence of orbits, periodic or non-periodic, that come close to five different periodic orbits in any prescribed order. This result yields a lower estimate for the topological entropy of the system, and it is therefore a proof of its chaotic behavior. To the author's knowledge, no rigorous estimate of the topological entropy for the Poincaré map of this system is available in the literature.
This paper serves other purposes as well. We show how to extend the computer assisted techniques developed in the papers cited above to the planar restricted 3-body problem, which presents different features and difficulties and requires new techniques; to this purpose we introduce some new topological tools and computational methods. We give a rigorous proof of the existence of a class of periodic orbits at different energy levels: the existence of such periodic orbits (and many more) is well known, but to the author's knowledge the proof is sometimes purely numerical, with no mathematical rigour, or perturbative, with no estimate on the range of validity of the perturbation parameter. We also provide a very narrow estimate of the locations of the intersection of such periodic orbits with the line connecting the primaries. In summary, this paper provides both the methods and some examples of how the computer assisted techniques can yield results for the planar restricted 3-body problem. We point out that these results are not of perturbative nature, and to the author's knowledge the result on symbolic dynamics is not accessible by purely analytical methods; in particular the problem is not treatable by Melnikov's method.
A result on chaos for the planar restricted 3-body problem has been recently presented in [SK] with a different method, also based on computer assisted techniques, but the authors do not claim to give a rigorous proof. Indeed, they use, in their words, “realistic estimated upper bounds” for the errors made in the numerical computation of the Poincaré map. Such a method found a rigorous application on a different system in [KMS]. Here we estimate rigorously all computational errors and we do provide rigorous proofs of all the theorems we state.
The paper is organized as follows: in Sect. 2 we provide a brief introduction to the problem; in Sect. 3 we introduce the Poincaré maps that are used in the proofs and explain a symmetry of the system which is widely employed in the proofs; in Sect. 4 we explain our method for detecting and proving periodic orbits and present the results we obtain. Section 5 is the main portion of the paper. Here we introduce the topological and computational methods and we describe the results on symbolic dynamics and on topological entropy. Ideas on further developments are given in Sect. 6 and details on the computer assisted proofs are in Sect. 7.
2. Description of the System
We first derive briefly Eq. (1.1). It is well known that the two body problem admits a solution where both bodies move in a circular clockwise motion around their center of mass (and of course it also admits the symmetric counterclockwise solution). We call the two bodies $P_1$ and $P_2$ (the primaries). If $P_1$ and $P_2$ have mass $m_1$ and $m_2$, then the circular solution with minimal period 2π minimizes the Lagrangian functional
$$f(x_1, x_2) := \int_0^{2\pi} \left( \frac{m_1}{2} |\dot{x}_1|^2 + \frac{m_2}{2} |\dot{x}_2|^2 + \frac{m_1 m_2}{|x_1 - x_2|} \right) dt$$
defined on $H^1_{per}([0, 2\pi])$, the space of 2π-periodic functions whose weak derivative is square-integrable. Take $x_1 = R_1(\cos t, -\sin t)$ and $x_2 = R_2(-\cos t, \sin t)$; in order to compute $R_1$ and $R_2$ we have to minimize
$$\tilde{f}(R_1, R_2) = \frac{m_1}{2} R_1^2 + \frac{m_2}{2} R_2^2 + \frac{m_1 m_2}{R_1 + R_2}.$$
Setting 0 as the center of mass we have
$$R_2 = \frac{m_1}{m_2} R_1, \tag{2.1}$$
and by minimizing $\tilde{f}(R_1, R_2)$ we get $R_1 = \frac{m_2}{(m_1 + m_2)^{2/3}}$ and $R_2 = \frac{m_1}{(m_1 + m_2)^{2/3}}$.
The Lagrangian for the restricted 3-body problem is $\tilde{L}(w) = \frac{1}{2} |\dot{w}|^2 - V(w)$, where
$$V(w) = -\frac{m_1}{|w - x_1|} - \frac{m_2}{|w - x_2|}.$$
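The closed forms for $R_1$ and $R_2$ can be cross-checked numerically. The following Python sketch restricts $\tilde{f}$ to the center-of-mass constraint (2.1), writes it in terms of the separation $d = R_1 + R_2$ and minimizes it by a golden-section search. The function names (`radii`, `f_tilde`, `golden_min`) and the search interval are illustrative assumptions and are not taken from the paper.

```python
import math


def radii(m1, m2):
    """Closed-form radii R1, R2 of the circular solution with period 2*pi."""
    total = m1 + m2
    return m2 / total ** (2.0 / 3.0), m1 / total ** (2.0 / 3.0)


def f_tilde(m1, m2, d):
    """Reduced functional on the constraint m1*R1 = m2*R2, with d = R1 + R2."""
    r1 = m2 * d / (m1 + m2)
    r2 = m1 * d / (m1 + m2)
    return 0.5 * m1 * r1 ** 2 + 0.5 * m2 * r2 ** 2 + m1 * m2 / d


def golden_min(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimizer of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


if __name__ == "__main__":
    m1, m2 = 3.0, 1.0  # arbitrary test masses
    d_min = golden_min(lambda s: f_tilde(m1, m2, s), 0.1, 10.0)
    r1, r2 = radii(m1, m2)
    print("numerical separation:", d_min)
    print("closed form R1 + R2 :", r1 + r2)
```

Under these assumptions the numerical minimizer should agree with $R_1 + R_2 = (m_1 + m_2)^{1/3}$ up to the accuracy of the search.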
It is convenient to use synodical coordinates, i.e. a reference frame where the primaries sit still. Set $w = Rv$, where
$$R = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$$
and let
$$J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
be the standard symplectic matrix; then $\dot{w} = \dot{R} v + R \dot{v} = R(\dot{v} - Jv)$, therefore the Lagrangian is given by
$$L(v) = \frac{1}{2} |\dot{v}|^2 - (\dot{v}, Jv) + \Omega(v),$$
where
$$\Omega(v) = \frac{|v|^2}{2} + \frac{m_1}{|v - (R_1, 0)|} + \frac{m_2}{|v - (-R_2, 0)|} + C. \tag{2.2}$$
It is well known that $H(x, \dot{x}, y, \dot{y}) = \dot{x}^2 + \dot{y}^2 - 2\Omega(x, y)$ is an integral of the motion (the Jacobi integral), therefore the motion takes place on the manifold $H(x, \dot{x}, y, \dot{y}) = h$. With some abuse of language, we call the Jacobi integral $H(v) = |\dot{v}|^2 - 2\Omega(v)$ the energy. The Euler–Lagrange equations are:
$$\ddot{v} - 2J\dot{v} - \nabla\Omega(v) = 0. \tag{2.3}$$
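The statement that H is an integral of the motion can be checked pointwise: along the vector field of (1.1) the time derivative of H vanishes identically. The Python sketch below verifies this at an arbitrary state and also compares the coded gradient of Ω with a central finite difference. It assumes two unit masses placed at $(\pm 2^{-2/3}, 0)$ and the additive constant $-2^{5/3}$ (the equal-mass normalization used later in the paper); the function names and the test state are hypothetical.

```python
import math

R = 2.0 ** (-2.0 / 3.0)   # assumed R1 = R2 (two unit masses)
C = -2.0 ** (5.0 / 3.0)   # assumed additive constant, so that Omega(0, 0) = 0


def omega(x, y):
    """Potential Omega for two unit masses at (R, 0) and (-R, 0)."""
    r1 = math.hypot(x - R, y)
    r2 = math.hypot(x + R, y)
    return 0.5 * (x * x + y * y) + 1.0 / r1 + 1.0 / r2 + C


def grad_omega(x, y):
    """Analytic gradient of Omega."""
    r1 = math.hypot(x - R, y)
    r2 = math.hypot(x + R, y)
    gx = x - (x - R) / r1 ** 3 - (x + R) / r2 ** 3
    gy = y - y / r1 ** 3 - y / r2 ** 3
    return gx, gy


def accelerations(x, px, y, py):
    """Second derivatives from (1.1): x'' = Omega_x - 2 y', y'' = Omega_y + 2 x'."""
    gx, gy = grad_omega(x, y)
    return gx - 2.0 * py, gy + 2.0 * px


def jacobi_rate(x, px, y, py):
    """dH/dt along the flow, with H = px^2 + py^2 - 2*Omega(x, y)."""
    ax, ay = accelerations(x, px, y, py)
    gx, gy = grad_omega(x, y)
    return 2.0 * px * ax + 2.0 * py * ay - 2.0 * (gx * px + gy * py)


def gradient_error(x, y, eps=1e-6):
    """Largest gap between the analytic gradient and a central difference."""
    gx, gy = grad_omega(x, y)
    fx = (omega(x + eps, y) - omega(x - eps, y)) / (2.0 * eps)
    fy = (omega(x, y + eps) - omega(x, y - eps)) / (2.0 * eps)
    return max(abs(gx - fx), abs(gy - fy))


if __name__ == "__main__":
    print("dH/dt at a test state:", jacobi_rate(0.1, 0.3, 0.2, -0.4))
    print("gradient check       :", gradient_error(0.1, 0.2))
```

The Coriolis terms cancel in `jacobi_rate`, which is the algebraic reason behind the conservation of H.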
This system admits five equilibrium points. Three such points are aligned with the primaries and are called L1 , L2 and L3 (the collinear equilibrium solutions). One of the collinear points lies between the primaries; let that point be L1 . All the collinear points
are saddles for $\Omega$. The remaining two equilibrium points are called $L_4$ and $L_5$; they are the absolute minima of $\Omega$ and they both sit at the vertices of an equilateral triangle with the primaries at the other vertices (the equilateral equilibrium solutions). We only consider the case $m_1 = m_2$. The reason for this choice is that this case has been already widely considered and many families of periodic orbits for this case are known to exist. Nonetheless, rigorous estimates on the location of the periodic orbits, results on symbolic dynamics and topological entropy and even a rigorous proof of existence of the periodic orbits are new to the author's knowledge. Furthermore the equal masses case is, in some sense, the opposite of a “perturbative” case with one primary much less massive than the other one. In this case $R_1 = R_2 = 2^{-2/3}$ and we choose the constant $C = -2^{5/3}$ in order to have $\Omega(L_1) = 0$. By a direct computation it is easy to see that there exists $h_0 = -2\Omega(L_2) = -2\Omega(L_3) \approx .8623$ such that the region $\{v : 2\Omega(v) + h \ge 0\}$, where the motion can take place, is split in exactly two connected regions, one bounded and the other unbounded, if and only if $0 \le h < h_0$.
Fig. 1. Four panels in the (x, y) plane: h = −.2, h = 0, h = .3, h = .9
We only consider values of h in the range $[0, h_0)$ and we look for trajectories in the bounded region. Note that, if h = 0, the bounded region is equal to the union of two closed sets whose only intersection is the origin and no trajectory can intersect both sets; if $0 < h < h_0$ the bounded region is homeomorphic to a ball (if we include the singularities in the region); if h < 0 there are three connected regions and finally if $h \ge h_0$ the region $\{v : 2\Omega(v) + h \ge 0\}$ is connected, but not simply connected. Figure 1 represents the curves $\{2\Omega(v) + h = 0\}$ for different values of h (the small disks represent the primaries).
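The threshold $h_0$ can be reproduced with ordinary floating point arithmetic. The Python sketch below assumes two unit masses at $(\pm 2^{-2/3}, 0)$ and the constant $-2^{5/3}$, locates the collinear point outside the primaries as the zero of $\partial_x \Omega(x, 0)$ for $x > R_1$ by bisection, and evaluates $-2\Omega$ there. The names `omega`, `d_omega_axis`, `critical_energy` and `admissible`, and the bracketing interval, are illustrative assumptions; nothing here is rigorous.

```python
import math

R = 2.0 ** (-2.0 / 3.0)   # R1 = R2 for unit masses
C = -2.0 ** (5.0 / 3.0)   # constant chosen so that Omega(L1) = 0


def omega(x, y):
    """Potential Omega of (2.2) in the equal-mass case."""
    r1 = math.hypot(x - R, y)
    r2 = math.hypot(x + R, y)
    return 0.5 * (x * x + y * y) + 1.0 / r1 + 1.0 / r2 + C


def d_omega_axis(x):
    """x-derivative of Omega(x, 0), valid for x > R."""
    return x - 1.0 / (x - R) ** 2 - 1.0 / (x + R) ** 2


def bisect(func, lo, hi, iterations=100):
    """Plain bisection; assumes func(lo) < 0 < func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_energy():
    """Return the outer collinear point on the positive x-axis and h0."""
    x_outer = bisect(d_omega_axis, R + 0.1, 3.0)
    return x_outer, -2.0 * omega(x_outer, 0.0)


def admissible(x, y, h):
    """True when the point lies in the region 2*Omega + h >= 0."""
    return 2.0 * omega(x, y) + h >= 0.0


if __name__ == "__main__":
    x_outer, h0 = critical_energy()
    print("outer collinear point:", x_outer)
    print("h0                   :", h0)
```

By the symmetry of the equal-mass case the same value is obtained at the collinear point on the negative x-axis.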
3. The Poincaré Maps
In order to study the system at some fixed energy h we consider the Poincaré return map $P : D(P) \subset \mathbb{R}^2 \to \mathbb{R}^2$ defined in the following way. Given $(x, p_x)$ such that $x \notin \{R_1, -R_2\}$ and $2\Omega(x, 0) + h - p_x^2 > 0$ there exists a unique positive value of $p_y = p_y(x, p_x) = \sqrt{2\Omega(x, 0) + h - p_x^2}$ such that $H(x, p_x, 0, p_y) = h$. Let $\varphi(x, p_x; \cdot) : \mathbb{R} \to \mathbb{R}^4$ be the solution of Eq. (1.1) with initial conditions $x(0) = x$, $\dot{x}(0) = p_x$, $y(0) = 0$, $\dot{y}(0) = p_y$ and call $\varphi_i$, $i = 1, \ldots, 4$ its components. By definition $\varphi_3(x, p_x; 0) = 0$ and $\varphi_4(x, p_x; 0) > 0$, therefore $\varphi_3(x, p_x; t) > 0$ for all positive and small t. If there exists a time $T_1$ such that $\varphi_3(x, p_x; T_1) = 0$ and $\varphi_3(x, p_x; t) > 0$ for all $t \in (0, T_1)$, and a time $T_2 > T_1$ such that $\varphi_3(x, p_x; T_2) = 0$ and $\varphi_3(x, p_x; t) < 0$ for all $t \in (T_1, T_2)$, then we define $P(x, p_x) = (\varphi_1(x, p_x; T_2), \varphi_2(x, p_x; T_2))$. In other words, P is the return map on the section y = 0, $p_y > 0$.
In fact, for different reasons which we point out later, we find it useful to consider the half Poincaré maps $H_1 : D(H_1) \subset \mathbb{R}^2 \to \mathbb{R}^2$ and $H_2 : D(H_2) \subset \mathbb{R}^2 \to \mathbb{R}^2$. $H_1$ is defined by $H_1(x, p_x) = (\varphi_1(x, p_x; T_1), \varphi_2(x, p_x; T_1))$, where the map $\varphi$ and the time $T_1$ are as before, while $H_2$ is defined by $H_2(x, p_x) = (\tilde{\varphi}_1(x, p_x; T_1), \tilde{\varphi}_2(x, p_x; T_1))$, where $\tilde{\varphi}(x, p_x; \cdot) : \mathbb{R} \to \mathbb{R}^4$ is the solution of Eq. (1.1) with initial conditions $x(0) = x$, $\dot{x}(0) = p_x$, $y(0) = 0$, $\dot{y}(0) = -\sqrt{2\Omega(x, 0) + h - p_x^2}$ and $T_1 > 0$ is such that $\tilde{\varphi}_3(x, p_x; T_1) = 0$ and $\tilde{\varphi}_3(x, p_x; t) < 0$ for all $t \in (0, T_1)$. Of course, $D(P) \subset D(H_1)$, $H_1(D(P)) \subset D(H_2)$ and $P = H_2 \circ H_1$. In the following we refer to the maps $H_1$ and $H_2$ as the first and second half Poincaré maps respectively. In other words the map $H_1$ (resp. $H_2$) is the transition map from the section y = 0, $p_y > 0$ to the section y = 0, $p_y < 0$ (resp. from the section y = 0, $p_y < 0$ to the section y = 0, $p_y > 0$).
By inspection it is easy to see that, if $(x(t), y(t))$ is a solution of (1.1), then $(\tilde{x}(t), \tilde{y}(t)) := (x(-t), -y(-t))$ is also a solution. This implies that $H_1(x_1, p_1) = (x_2, p_2)$ is equivalent to $H_2(x_2, -p_2) = (x_1, -p_1)$ and $P(x_1, p_1) = (x_2, p_2)$ is equivalent to $P(x_2, -p_2) = (x_1, -p_1)$.
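The half Poincaré maps and the symmetry relation above can be explored with a non-rigorous floating point integrator. The Python sketch below is only an illustration: it assumes the equal-mass normalization ($R_1 = R_2 = 2^{-2/3}$, constant $-2^{5/3}$), uses a fixed-step Runge-Kutta scheme with bisection on the last step to locate the crossing of y = 0, and provides no error bounds, unlike the computations used in the proofs. The names `half_map` and `poincare_map`, the step size and the test point are hypothetical choices.

```python
import math

R = 2.0 ** (-2.0 / 3.0)   # assumed R1 = R2 (two unit masses)
C = -2.0 ** (5.0 / 3.0)   # assumed constant, Omega(0, 0) = 0


def omega(x, y):
    r1 = math.hypot(x - R, y)
    r2 = math.hypot(x + R, y)
    return 0.5 * (x * x + y * y) + 1.0 / r1 + 1.0 / r2 + C


def rhs(state):
    """First order form of (1.1); state = (x, px, y, py)."""
    x, px, y, py = state
    r1 = math.hypot(x - R, y)
    r2 = math.hypot(x + R, y)
    gx = x - (x - R) / r1 ** 3 - (x + R) / r2 ** 3
    gy = y - y / r1 ** 3 - y / r2 ** 3
    return (px, gx - 2.0 * py, py, gy + 2.0 * px)


def rk4_step(state, dt):
    """One classical Runge-Kutta step (no error control)."""
    def shifted(slope, factor):
        return tuple(s + factor * k for s, k in zip(state, slope))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * dt))
    k3 = rhs(shifted(k2, 0.5 * dt))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def half_map(x, px, h, direction, dt=1e-3, max_steps=200000):
    """direction=+1 approximates H1 (start with py > 0), -1 approximates H2."""
    py_squared = 2.0 * omega(x, 0.0) + h - px * px
    if py_squared <= 0.0:
        raise ValueError("2*Omega(x, 0) + h - px^2 must be positive")
    state = rk4_step((x, px, 0.0, direction * math.sqrt(py_squared)), dt)
    for _ in range(max_steps):
        following = rk4_step(state, dt)
        if direction * following[2] <= 0.0:
            lo, hi = 0.0, dt   # the crossing of y = 0 lies within this step
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if direction * rk4_step(state, mid)[2] > 0.0:
                    lo = mid
                else:
                    hi = mid
            end = rk4_step(state, 0.5 * (lo + hi))
            return end[0], end[1]
        state = following
    raise RuntimeError("no return to the section y = 0")


def poincare_map(x, px, h):
    """P = H2 o H1, approximated in floating point."""
    x_half, p_half = half_map(x, px, h, +1)
    return half_map(x_half, p_half, h, -1)


if __name__ == "__main__":
    h = 0.3
    x1, p1 = -0.4294, 0.05
    x2, p2 = half_map(x1, p1, h, +1)
    print("H1(x1, p1)  =", (x2, p2))
    print("H2(x2, -p2) =", half_map(x2, -p2, h, -1), "expected close to", (x1, -p1))
```

The second printed pair illustrates the equivalence between $H_1(x_1, p_1) = (x_2, p_2)$ and $H_2(x_2, -p_2) = (x_1, -p_1)$ up to integration error.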
4. Periodic Orbits
A standard method for studying periodic orbits consists in looking for fixed points of the Poincaré map P. By the symmetry of the system considered at the end of the previous section we infer that $H_1(x_1, 0) = (x_2, 0)$ yields $H_2(x_2, 0) = (x_1, 0)$, which in turn implies that $(x_1, 0)$ is a fixed point for the Poincaré map. On the other hand, if the system admits a periodic orbit which crosses orthogonally the x-axis at some point $x_1$, then, by definition of Poincaré map, $P(x_1, 0) = (x_1, 0)$, and this is possible only if $H_1(x_1, 0) = (x_2, 0)$ for some $x_2$. It turns out that it is also useful to consider whether there exist points $x_1$ and $x_2$ such that $P(x_1, 0) = (x_2, 0)$, with $x_1 \ne x_2$. By the same reason as before, these points correspond to periodic points of period 2 for the Poincaré map, hence to periodic trajectories for the system crossing the y = 0 hyperplane at two different points in each direction, see Fig. 3 of the orbits $D_1$ and $D_2$.
Let f(x) be the second component of $H_1(x, 0)$ and g(x) be the second component of $P(x, 0)$. In order to find fixed or period 2 points for the Poincaré map we can look for zeros of the function f or g. We remark that, even by considering the map g only, one can still find all the fixed points for the Poincaré map. On the other hand it may happen that $P(x_1, 0) = (x_2, 0)$ with $x_1$ very close to $x_2$, in the sense that $|x_1 - x_2|$ is smaller than the numerical error. In this case it is impossible to find out whether $x_1$ is a fixed point or a periodic point of period 2, without considering either the derivative of the Poincaré map or the half Poincaré map. For this reason it is convenient to study both the map f and the map g: first we look for zeros of f, i.e. fixed points of P, then we can look for zeros of g which are not zeros for f, such points yielding periodic points of minimal period 2 of P. Figure 2 displays a numerical computation of f(x) with $x \in (-.62, .62)$ and h = .1.
Fig. 2. Graph of f(x), h = .1
The picture strongly suggests that the Poincaré map has three fixed points in the set $\{(x, 0) : x \in (-.62, .62)\}$, which is exactly the result we show rigorously in the next section, where we also give a narrow bound on the position of such points. Our strategy to detect and prove the existence of periodic orbits is as follows: first we choose some values for h and compute an approximate image of the x axis through the maps f and g as in the previous picture; in this way we can spot the places where the intersections should be located. Assume that the numerically computed graph of f crosses the x-axis in a neighborhood of some point $\bar{x}$: we conjecture the existence of a fixed point for P near the point $(\bar{x}, 0)$. To prove the conjecture we choose $x_1$ and $x_2$ such that $x_1 < \bar{x} < x_2$ and both $x_1$ and $x_2$ are very close to $\bar{x}$. Then we compute the rigorous half
Poincaré map $H_1$ at $(x_1, 0)$ and $(x_2, 0)$. If we can prove that the second component of $H_1(x_1, 0)$ has opposite sign with respect to the second component of $H_1(x_2, 0)$ and that the segment joining the two points belongs entirely to the domain of $H_1$, then by the continuity of the half Poincaré map we have proved that there exists at least a point $x_1 < \tilde{x} < x_2$ such that $H_1(\tilde{x}, 0)$ lies on the x axis, therefore a periodic orbit intersects the x-axis orthogonally at $(\tilde{x}, 0)$. On the other hand, if we prove that some portion of the x-axis is mapped away from the x-axis itself, then we have a proof that there are no periodic solutions which cross the x-axis orthogonally in that section. Then we search for points of period 2 for the map P, that is we study the numerically computed graph of g and look for intersections with the x-axis. If we find two points $x_1$, $x_2$ such that the second components of $P(x_1, 0)$ and $P(x_2, 0)$ lie on the opposite sides of the x-axis, the set $[x_1, x_2] \times \{0\}$ belongs to the domain of P and $P([x_1, x_2], 0)$ does not intersect the set $[x_1, x_2] \times \{0\}$, then we have a proof that there exists at least a point $x_1 < \tilde{x} < x_2$ such that $\tilde{x}$ is the intersection of an orbit of minimal period 2 with the x-axis.
The case we consider is usually referred to as “the Copenhagen orbits”, from the results of the Observatory of Copenhagen (see [S] and references therein). We recall that, by our choice of the constant C in (2.2), h = 0 is the energy of the stationary solution at the Lagrangian point $L_1$ (the origin). Since the problem has the additional symmetry consisting in switching the primaries, we only look for orbits that cross the portion of the x-axis between the primaries with positive speed in the y direction. The lowest value of h we consider is 0, when the admissible region is split in two parts touching at the origin and no trajectory can enter both regions. The highest value of h we consider is 0.8, since at a slightly larger value the bounded admissible region touches the unbounded part and some trajectory starting close to the primaries may be unbounded.
The main result of this section is that the system (1.1) admits periodic solutions as shown in Table 1. In the left-hand side column the energy level is displayed, while the remaining columns represent the x-coordinate $\pm 5 \cdot 10^{-4}$ of the intersection of the orbits with the portion of the x-axis between the primaries (the L, $D_1$ and $D_2$ orbits have two such intersections; for the L orbits we only consider the intersection with positive y velocity, the other being symmetric with respect to the origin; for the $D_1$ and $D_2$ orbits we consider the only orthogonal intersection). More precisely, the following theorem holds:
Table 1.
h     S1       S2        L         D1        D2        U1        U2
0     0.3158   −0.4399   -         -         -         -         -
.1    0.3028   −0.4365   0.02697   -         -         -         -
.2    0.2883   −0.4330   0.03894   -         -         -         -
.22   0.2851   −0.4323   0.04102   -         -         -         -
.24   0.2818   −0.4316   0.04303   -         -         0.06963   0.1043
.26   0.2784   −0.4309   0.04497   0.04423   0.04456   0.06737   0.1205
.28   0.2749   −0.4301   0.04687   0.04600   0.04651   0.06689   0.1343
.3    0.2712   −0.4294   0.04873   0.04775   0.0484    0.06712   0.1470
.4    0.2443   −0.4257   0.05751   0.05615   0.05730   0.07199   0.2085
.5    -        −0.4218   0.06575   0.06412   0.06550   0.07870   -
.6    -        −0.4178   0.07369   0.07185   0.07345   0.08595   -
.8    -        −0.4093   0.08922   0.08716   -         -         -
Theorem 4.1. For all energy values displayed in column h of Table 1, the system (1.1) admits at least a periodic solution for each value printed in the remaining columns. Such solution crosses the x-axis orthogonally twice. The x-coordinate of the intersection with positive y-velocity lies in the interval centered in the position given in the table with width $10^{-3}$. The solutions corresponding to values in the columns $S_1$, $S_2$, L, $U_1$ and $U_2$ do not intersect the x-axis at any other point, while the solutions corresponding to values in the columns $D_1$ and $D_2$ intersect the x-axis twice at another point with the same negative x-velocity and with opposite y-velocity. The orbits in the column $S_2$ are retrograde around $P_1$; the orbits in the classes $S_1$, $U_1$ and $U_2$ are direct around $P_2$. The orbits in the class L are retrograde around $L_1$.
Proof. Orbits $S_1$, $S_2$, L, $U_1$ and $U_2$. We checked by computer assistance (see Sect. 7) that the second components of $H_1(x + 5 \cdot 10^{-4}, 0)$ and $H_1(x - 5 \cdot 10^{-4}, 0)$ have different sign and the interval set $[x - 5 \cdot 10^{-4}, x + 5 \cdot 10^{-4}] \times \{0\}$ is in the domain of $H_1$, where x is any of the values in the columns $S_1$, $S_2$, L, $U_1$ and $U_2$. According to the argument presented at the beginning of this section, this suffices to prove the existence of a fixed point of the Poincaré map P and hence the existence of a periodic orbit. To prove that an orbit in the column $S_2$ is retrograde around $P_1$ we proceed as follows. We compute the trajectory of the set $[x - 5 \cdot 10^{-4}, x + 5 \cdot 10^{-4}] \times \{0\}$ until it crosses the Poincaré section and we check that the angular velocity with respect to $P_1$ is always strictly positive. This also implies that the x-axis is crossed only once in each direction, and by construction and the symmetry of the system the intersections are orthogonal. To prove that the orbits in the classes $S_1$, $U_1$ and $U_2$ are direct around $P_2$, the orbits in the class L are retrograde around $L_1$ and they also cross the x-axis only once in each direction we follow an analogous procedure.
Orbits $D_1$ and $D_2$. Same argument with the map P. To check that the periodic points of the Poincaré map corresponding to these orbits have minimal period 2 we checked that $P([x - 5 \cdot 10^{-4}, x + 5 \cdot 10^{-4}], 0)$ is mapped away from $[x - 5 \cdot 10^{-4}, x + 5 \cdot 10^{-4}] \times \{0\}$. The symmetry of the system forces the two non-orthogonal intersections with the x-axis to occur at the same point, with the same x-velocity and with opposite y-velocity.
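The logic of the sign check used in the proof can be sketched as follows. In the actual proof the enclosures of the second component of $H_1$ come from rigorous interval computations (Sect. 7); in this Python illustration they are replaced by a hypothetical stand-in function with an assumed uniform error bound, so the sketch shows only the decision rule and not a proof. All names and numbers below are assumptions made for the example.

```python
def enclosure(value, error_bound):
    """Interval [value - error_bound, value + error_bound] as a (lo, hi) pair."""
    return value - error_bound, value + error_bound


def opposite_signs(left, right):
    """True when the two enclosures lie strictly on opposite sides of zero."""
    lo1, hi1 = left
    lo2, hi2 = right
    return (hi1 < 0.0 < lo2) or (hi2 < 0.0 < lo1)


def zero_is_bracketed(func, center, half_width, error_bound):
    """Decision rule: enclosures of func at center -/+ half_width differ in sign."""
    left = enclosure(func(center - half_width), error_bound)
    right = enclosure(func(center + half_width), error_bound)
    return opposite_signs(left, right)


def stand_in(x):
    """Hypothetical stand-in for the second component of H1(x, 0)."""
    return x ** 3 - 0.3 * x


if __name__ == "__main__":
    # With a small error bound the sign change is certified by the rule.
    print(zero_is_bracketed(stand_in, 0.5477, 5e-4, 1e-4))
    # With a coarse error bound the test is inconclusive.
    print(zero_is_bracketed(stand_in, 0.5477, 5e-4, 5e-4))
```

The second call shows why narrow error bounds matter: when the enclosures straddle zero nothing can be concluded, even though the underlying function does change sign.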
Remark 4.1. Orbits belonging to the same column appear to belong to the same class, in the sense that they have the same linear stability and they have the same winding number with respect to the primaries and the Lagrangian point L1 . More precisely, all orbits are unstable, except for orbits S1 and S2 which are linearly stable. The result on the stability is purely numerical. The orbits in the class L are well known, their trajectory is very close to an ellipse with center at the Lagrangian point L1 . They branch out from L1 , in the sense that as h → 0 they collapse to L1 . These are the only orbits considered in this paper which do not wind around any primary, while they wind around L1 . The orbits in the class D1 and D2 correspond to points of period 2 of the Poincaré map. In Fig. 3 the orbits D1 , D2 and L at energy level h = .5 are displayed (the small disk represents P1 ). The same orbits are plotted together in Fig. 4: note that part of the trajectories of both D1 and D2 are very close to the trajectory of L, giving a visual image of the strong instability of the system. Remark 4.2. The existence of unstable orbits around the other Lagrangian points is also well-known, but they occur only for higher values of the energy; we do not investigate those orbits, but we claim that the methods presented in this paper could be used to study those orbits and, possibly, chaotic dynamics involving those orbits as well.
Fig. 3. Three panels in the (x, y) plane: Orbit D1, Orbit L, Orbit D2
In Figure 5 we plot all periodic orbits at energy level h = .3: note that the shapes of the orbits $D_1$, $D_2$ and L are quite similar to those of the same orbits in Fig. 4, supporting the claim that they belong to the same branch.
Fig. 4. Orbits D1, D2 and L at energy level h = .5
Fig. 5. All orbits at h = .3 (orbits S1, S2, L, D1, D2, U1 and U2 in the (x, y) plane)
5. Chaotic Dynamics
We apply the method developed in [Z1, Z2, Z3], see also [AZ] where the Hénon-Heiles Hamiltonian was studied. We only consider the system at energy h = .3. We prove the existence of symbolic dynamics on 14 symbols and we give a lower estimate for the topological entropy. The computation of rigorous estimates in the (restricted) 3-body problem presents more difficulties than the cases previously considered. The first is due to the fact that the Poincaré map is not defined in a connected region, indeed we have to exclude at least the lines $x = R_1$ and $x = -R_2$. This could be avoided by using some kind of regularization, but we prefer to keep the original coordinates, both because they are far
more intuitive and because the Levi–Civita transform (or its variations) is not one to one. Furthermore, even after some kind of regularization, it is far from trivial to determine what is the domain of the Poincaré map. But the main problem is due to the fact that, even using a sophisticated algorithm to compute the rigorous bounds, such bounds turn out to be very large, particularly in the $p_x$ direction. The reason for such large bounds is not only the inevitable computational errors, which in theory can be as small as we like (by taking a smaller time step at the price of a slower computation), but particularly the wrapping effect. See [AZ, GZ] and the references given there for a discussion on this topic. Here we just want to point out that although it is possible by a change of variable to get rid of the singularities, it is not possible to get rid of the wrapping effect in the same way, at least not with the same change of variable.
The first trick we had to adopt consists in computing only half of the Poincaré map at a time, and this is the second reason, in fact the most important, for introducing the maps $H_1$ and $H_2$. Indeed the wrapping effect is usually exponential, therefore by considering about half the trajectory time it is rather drastically reduced. By this method we obtain a great reduction of the error: of course we have to pay the price of a larger number of computations, but the trade-off is very positive.
The second trick that proved to be essential consists in computing the inverse of the map instead of the actual map for some checks, see the definition of back-covering below. This is important whenever some trajectory starts away from the primaries but comes close to one of them at the intersection with the Poincaré plane, indeed in such cases a very short time step is necessary to keep the error small, since close to the singularities the speed undergoes a strong variation while the particle crosses the Poincaré plane. The particle is in fact a point, but in order to compute the Poincaré map we have to consider the envelope of its position with the error bounds, hence it takes a finite time to cross the plane. It turns out that the inverse trajectory, starting close to a primary and ending away from both primaries, produces a much smaller error.
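The exponential nature of the wrapping effect, and the benefit of halving the integration time, can be seen on a toy model. The Python sketch below is not the paper's algorithm: it assumes a flow that is a pure rotation and an enclosure method that replaces the image of a box by its smallest axis-aligned bounding box after every step. The function names and the choice of angle are illustrative assumptions.

```python
import math


def rotate_and_rebox(half_widths, angle):
    """Rotate the box [-a, a] x [-b, b] and return the half-widths of the
    smallest axis-aligned box containing the rotated image."""
    a, b = half_widths
    c, s = abs(math.cos(angle)), abs(math.sin(angle))
    return c * a + s * b, s * a + c * b


def wrapped_half_width(steps, angle, start=1.0):
    """Half-width of the enclosure after `steps` rotate-and-rebox operations."""
    box = (start, start)
    for _ in range(steps):
        box = rotate_and_rebox(box, angle)
    return box[0]


if __name__ == "__main__":
    angle = math.pi / 4.0
    # Eight steps make a full turn: the true image is the original square,
    # yet the enclosure has grown by the factor sqrt(2)**8.
    print("full turn overestimate:", wrapped_half_width(8, angle))
    # Half of the turn only costs the square root of that factor.
    print("half turn overestimate:", wrapped_half_width(4, angle))
```

In this toy model the overestimation factor over half of the steps is the square root of the factor over all of them, which is the kind of reduction one hopes for when the Poincaré map is split into two half maps.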
5.1. Topological tools.
Definitions 5.1 and 5.3 were introduced in [AZ]. Definitions 5.4 and 5.5 are introduced here for the first time to deal with the new difficulties of the restricted 3-body problem. Lemma 5.2 and Corollaries 5.1 and 5.2 are also new.
Definition 5.1. A triple set (or t-set) is a triple $N = (|N|, N^l, N^r)$ of closed subsets of $\mathbb{R}^2$ satisfying the following properties:
1. $|N|$ is a closed parallelogram,
2. $N^l$ and $N^r$ are closed half-planes,
3. the sets $N^{le} := N^l \cap |N|$ and $N^{re} := N^r \cap |N|$ are two nonadjacent edges of $|N|$.
We call $|N|$, $N^l$, $N^r$, $N^{le}$ and $N^{re}$ the support, the left side, the right side, the left edge and the right edge of the t-set N respectively. One observes that $\mathbb{R}^2 \setminus (|N| \cup N^l \cup N^r)$ consists of two disjoint sets. We call $N^t$ and $N^b$ the closure of such sets (the top and bottom sides of the triple set). We also set $N^{te} := N^t \cap |N|$ and $N^{be} := N^b \cap |N|$ (the top and bottom edges of N).
Remark 5.1. Since all theorems we use concern topological properties of the t-sets, one can choose any continuous deformation of the t-set defined above, obtaining the same results. An example of the sets we actually use, whose exact definition is given in Sect. 7, is given in Figure 6.
To exploit the symmetry of the system we give the following definition:
Definition 5.2. Let M be a t-set. We define its symmetric image with respect to the x-axis $\tilde{M}$ as follows: if $S : \mathbb{R}^2 \to \mathbb{R}^2$ is the map defined by $S(x, y) = (x, -y)$, let $|\tilde{M}| = S(|M|)$, $\tilde{M}^l = S(M^t)$, $\tilde{M}^r = S(M^b)$, and the remaining definitions follow as in Definition 5.1.
Fig. 6. An example of a t-set (labels: top side, bottom side, left side, right side, support)
Definition 5.3. Let $f : \Omega \subset \mathbb{R}^2 \to \mathbb{R}^2$ be a map and let $N_1$ and $N_2$ be two triple sets. We say that $N_1$ f-covers $N_2$ ($N_1 \overset{f}{\Longrightarrow} N_2$) if:
(a) $f(|N_1|) \subset \mathrm{int}(N_2^l \cup |N_2| \cup N_2^r)$,
(b) either $f(N_1^{le}) \subset \mathrm{int}(N_2^l)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^r)$ or $f(N_1^{le}) \subset \mathrm{int}(N_2^r)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^l)$.
The following lemma says that we can reduce condition (a) in the above definition to the boundary of $|N_1|$ if we know that the map f is defined on $|N_1|$ and it is injective. This lemma plays a very important role in the computer assisted verification of the covering relations, as it reduces the computations to the boundary of $|N_1|$ (see Sect. 6 in [GZ] for more details).
Lemma 5.1. Let $f : \Omega \subset \mathbb{R}^2 \to \mathbb{R}^2$ be a map and let $N_1$ and $N_2$ be two triple sets. Assume that f is an injective map on $|N_1|$, then $N_1 \overset{f}{\Longrightarrow} N_2$ if and only if
(a′) $f(\partial |N_1|) \subset \mathrm{int}(N_2^l \cup |N_2| \cup N_2^r)$,
(b) either $f(N_1^{le}) \subset \mathrm{int}(N_2^l)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^r)$ or $f(N_1^{le}) \subset \mathrm{int}(N_2^r)$ and $f(N_1^{re}) \subset \mathrm{int}(N_2^l)$.
As we pointed out above, in some circumstances it is easier to compute the inverse flow than the direct flow. Furthermore, we need to deal with the half Poincaré maps. To this purpose we give the following definitions:
Definition 5.4. Let $N_1$ and $N_2$ be two triple sets. We say that $N_1$ f-backcovers $N_2$ ($N_1 \overset{f}{\Longleftarrow} N_2$) whenever:
(a) $f : \Omega_1 \subset \mathbb{R}^2 \to \Omega_2 \subset \mathbb{R}^2$ is a homeomorphism,
(b) $|N_2| \subset \Omega_2$,
(c) $f^{-1}(\partial |N_2|) \subset \mathrm{int}(N_1^t \cup |N_1| \cup N_1^b)$,
(d) either $f^{-1}(N_2^{te}) \subset \mathrm{int}(N_1^t)$ and $f^{-1}(N_2^{be}) \subset \mathrm{int}(N_1^b)$ or $f^{-1}(N_2^{te}) \subset \mathrm{int}(N_1^b)$ and $f^{-1}(N_2^{be}) \subset \mathrm{int}(N_1^t)$.
Definition 5.5. Let $N_1$ and $N_2$ be two triple sets. We say that $N_1$ generically f-covers $N_2$ ($N_1 \overset{f}{\Longleftrightarrow} N_2$) if $N_1$ f-covers $N_2$ or there exist $n \ge 1$ t-sets $M_i$, $i = 1, \ldots, n$ and n + 1 maps $g_j$, $j = 0, \ldots, n$ such that $f = g_n \circ \cdots \circ g_1 \circ g_0$, $N_1 \overset{g_0}{\Longrightarrow} M_1$, $M_i \overset{g_i}{\Longrightarrow} M_{i+1}$ for all $i = 1, \ldots, n - 1$ and $M_n \overset{g_n}{\Longrightarrow} N_2$.
Remark 5.2. Although covering, backcovering and generic covering occur often simultaneously and they are indeed a very similar phenomenon, they are not equivalent. In fact it may even happen that a map f is not defined on the whole support of $N_1$, and still $N_1$ f-backcovers $N_2$ or $N_1$ generically f-covers $N_2$. On the other hand, the result of a backcovering or of a generic covering relation is very similar to the result of a covering relation as far as the results we are interested in are concerned, see Corollaries 5.1 and 5.2.
The following theorem follows immediately from Theorem 4 in [Z3] and together with Theorem 5.2 it is the main topological tool we use:
Theorem 5.1. Given n t-sets $M_i \subset \mathbb{R}^2$ and n continuous maps $f_i : M_i \to \mathbb{R}^2$, such that
$$M_0 \overset{f_0}{\Longrightarrow} M_1 \overset{f_1}{\Longrightarrow} M_2 \overset{f_2}{\Longrightarrow} \cdots \overset{f_{n-1}}{\Longrightarrow} M_n = M_0,$$
then there exists $x \in \mathrm{int}|M_0|$, such that $f_k \circ \cdots \circ f_1 \circ f_0(x) \in \mathrm{int}|M_{k+1}|$, for $k = 0, \ldots, n - 1$ and $x = f_{n-1} \circ \cdots \circ f_1 \circ f_0(x)$.
By the following lemma we can extend the previous theorem to the generic backcovering, see Corollary 5.2.
Lemma 5.2. Let $\Omega_1$ and $\Omega_2$ be two open sets of $\mathbb{R}^2$, let $f : \Omega_1 \to \Omega_2$ be a homeomorphism and let $N_1$ and $N_2$ be two triple sets such that $|N_i| \subset \Omega_i$, i = 1, 2. If $N_1$ f-backcovers $N_2$ then there exists a t-set $K \subset \Omega_1$ such that $N_1$ id-covers K, and K f-covers $N_2$ (id is the identity map in $\Omega_1$).
Proof. By definition of backcovering, if $N_1$ f-backcovers $N_2$, then $f^{-1}(|N_2|)$ looks like the light grey rectangle in Fig. 7, therefore it is possible to define a t-set K such that the boundary of its support is very close to the boundary of $f^{-1}(|N_2|)$, in such a way that $N_1$ id-covers K and K f-covers $N_2$. To prove this, just consider a set K as the dark grey rectangle in the picture or, more precisely, if a and b are the length of the sides of $|N_2|$, let $K'$ be the parallelogram with the same center as $|N_2|$, the sides parallel to the sides of $|N_2|$ and such that the length of its sides is $(1 - \varepsilon)a$ and $(1 + \varepsilon)b$, $0 < \varepsilon < 1$. If ε is small enough, $K' \subset \Omega_2$ and $f^{-1}(K')$ is defined. Call the edges of $K'$ top, bottom, left and right according to the corresponding edge of $N_2$. Now define K as follows. The edges are the counterimage of the edges of $K'$. By construction, definition of backcovering, compactness of $K'$ and continuity of $f^{-1}$, if ε is small enough the top (resp. bottom) edge of K lies in the top (resp. bottom) side of $N_1$ and both the right and the left edges are in $\mathrm{int}(N_1^t \cup |N_1| \cup N_1^b)$, hence it is possible to define the left (resp. right) side of K in such a way that $N_1$ id-covers K. By construction K f-covers $N_2$ and the proof is complete.
Fig. 7. In light grey $f^{-1}(N_2)$, in dark grey $K' = f(K)$
Corollary 5.1. Let N1 and N2 be two triple sets and let f be a homeomorphism. If N1 f -backcovers N2 , then N1 generically f -covers N2 .
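The boundary form of the covering relation (Lemma 5.1) lends itself to a simple numerical illustration. The Python sketch below makes strong simplifying assumptions that are not in the paper: the t-sets are axis-aligned rectangles `(a, b, c, d)` with left side $\{x \le a\}$ and right side $\{x \ge b\}$, the map is assumed injective, and the conditions are tested only on finitely many sampled boundary points in floating point. It therefore suggests a covering relation but does not prove it. All function names are hypothetical.

```python
def allowed(point, box):
    """Membership in int(N^l U |N| U N^r) for an axis-aligned t-set."""
    a, b, c, d = box
    u, v = point
    return u < a or u > b or c < v < d


def edge_samples(box, n):
    """Sample points on the left, right, bottom and top edges of the support."""
    a, b, c, d = box
    ys = [c + k * (d - c) / n for k in range(n + 1)]
    xs = [a + k * (b - a) / n for k in range(n + 1)]
    left = [(a, y) for y in ys]
    right = [(b, y) for y in ys]
    bottom = [(x, c) for x in xs]
    top = [(x, d) for x in xs]
    return left, right, bottom, top


def covers_sampled(f, box1, box2, n=200):
    """Sampled version of conditions (a') and (b) of Lemma 5.1."""
    left, right, bottom, top = edge_samples(box1, n)
    a2, b2 = box2[0], box2[1]
    images = [f(p) for p in left + right + bottom + top]
    if not all(allowed(q, box2) for q in images):
        return False                      # condition (a') fails on a sample
    left_u = [f(p)[0] for p in left]
    right_u = [f(p)[0] for p in right]
    same = all(u < a2 for u in left_u) and all(u > b2 for u in right_u)
    swapped = all(u > b2 for u in left_u) and all(u < a2 for u in right_u)
    return same or swapped                # condition (b)


if __name__ == "__main__":
    unit = (-1.0, 1.0, -1.0, 1.0)
    # A hyperbolic linear map stretches the square across itself.
    print(covers_sampled(lambda p: (3.0 * p[0], p[1] / 3.0), unit, unit))
    # A quarter turn does not produce a covering relation.
    print(covers_sampled(lambda p: (-p[1], p[0]), unit, unit))
```

A rigorous verification would replace the sampled points by interval enclosures of the images of small boundary segments, as described in the references cited for the computer assisted part.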
Corollary 5.2. Given t-sets $M_i \subset \mathbb{R}^2$ and n continuous maps $f_i : \Omega_i^1 \subset \mathbb{R}^2 \to \Omega_i^2 \subset \mathbb{R}^2$ such that
$$M_0 \overset{f_0}{\Longleftrightarrow} M_1 \overset{f_1}{\Longleftrightarrow} M_2 \overset{f_2}{\Longleftrightarrow} \cdots \overset{f_{n-1}}{\Longleftrightarrow} M_n = M_0,$$
then there exists $x \in \mathrm{int}|M_0|$, such that $f_k \circ \cdots \circ f_1 \circ f_0(x) \in \mathrm{int}|M_{k+1}|$, for $k = 0, \ldots, n - 1$ and $x = f_{n-1} \circ \cdots \circ f_1 \circ f_0(x)$.
Assume that we have n t-sets $N_i$, $i = 1, \ldots, n$, with some covering relations. Let $N = \bigcup_i |N_i|$; the following definitions are standard.
Definition 5.6. Let f be injective. The invariant set of N is defined by $\mathrm{Inv}(N, f) := \{x \in N : f^i(x) \in N \text{ for all } i \in \mathbb{Z}\}$.
Definition 5.7. The transition matrix $T(j, i)$, $i, j = 1, \ldots, n$, is defined as follows:
$$T(j, i) = \begin{cases} 1 & \text{if } N_i \overset{f}{\Longleftrightarrow} N_j, \\ 0 & \text{otherwise.} \end{cases}$$
Let $\Sigma_n$ be the set of bi-infinite sequences of n symbols.
Definition 5.8. A sequence $\{x_k\} \in \Sigma_n$ is said to be admissible if $T(x_{k+1}, x_k) = 1$ for all k. We denote by $\Sigma_A \subset \Sigma_n$ the set of all admissible sequences.
Definition 5.9. Assume $|N_i| \cap |N_j| = \emptyset$, for $i \ne j$. The projection $\pi : \mathrm{Inv}(N, f) \to \Sigma_A$ is defined by setting $\pi(x)_i = j$, where j satisfies $f^i(x) \in |N_j|$ for all $i \in \mathbb{Z}$.
The set $\Sigma_A$ inherits the topology from $\Sigma_n$; the shift map $\sigma : \Sigma_A \to \Sigma_A$ is continuous. We prove a semiconjugacy between σ and f, i.e. we prove that $\sigma \circ \pi = \pi \circ f|_{\mathrm{Inv}(N, f)}$. In particular this implies that there exists a symbolic dynamics structure on $\mathrm{Inv}(N, f)$. The following theorem was proved in [Z3] (see Theorems 5 and 6) for the case n = 2. The following is a natural extension to a generic number of sets and the proof is exactly the same.
Theorem 5.2. The projection π is onto, and if $\{x_n\} \in \Sigma_A$ is a periodic sequence, then $\pi^{-1}(\{x_n\})$ contains a periodic point.
We point out that this kind of topological tool has been created to deal with hyperbolic periodic points and is not suitable to look for elliptic points. On the other hand the symmetry of this system allows the search and proof of (symmetric) periodic points in a much easier way, as pointed out in Sect. 4.
Furthermore the main purpose of the paper is to show the existence of a symbolic dynamics, which cannot occur in neighborhoods of stable points, therefore we will not address this topic any further.

5.2. Heuristic results. In this section of the paper we provide the heuristic results we obtained, while in the following section we prove that such results are rigorous. In the remaining part of this paper we fix $h = 0.3$. We need to define 14 t-sets $N_0, \dots, N_5$, $\tilde N_1, \dots, \tilde N_4$, $K_0, \dots, K_3$: the precise definition is given in Sect. 7. Here we only point out that the sets $\tilde N_1, \dots, \tilde N_4$ are by definition the symmetric image with respect to the $x$-axis of the sets $N_1, \dots, N_4$ (see Definition 5.2), while the remaining sets are invariant with respect to the same symmetry. In Fig. 7 the supports of the sets $N_0$ to $N_5$ and their images through the Poincaré map are represented, together with a portion of the images of the supports of the sets $K_0$ and $K_2$. We remark that the images of the sets $N_2$, $K_0$ and $K_2$ are very thin and appear on the picture as lines. The sets are displayed in thick lines, while their images are displayed in thin lines. Figure 8 is included for better clarity.

[Fig. 8. Two panels in the $(x, p_x)$ plane: the first shows the sets $N_0$, $N_1$, $N_2$ with the images $PN_0$, $PN_1$; the second shows the sets $N_3$, $N_4$, $N_5$ with the images $PN_2$, $PN_3$, $PN_4$, $PN_5$, $PK_0$, $PK_2$.]

Lemma 5.3. Let $M_1$ and $M_2$ be t-sets, let $\tilde M_1$ and $\tilde M_2$ be the sets symmetric with respect to the symmetry defined above and assume that $M_1 \overset{P}{\Longrightarrow} M_2$. Then $\tilde M_2 \overset{P}{\Longleftarrow} \tilde M_1$.

Proof. This follows by Definitions 5.3, 5.4, 5.2 and the symmetry of the Poincaré map (see the end of Sect. 3).
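In the following we denote by $\Rightarrow$ (resp. $\Leftarrow$) the covering (resp. the backcovering) with respect to the map $P$ and we apply the topological theorems introduced above to the map $P$. The numerical experiments suggest that the following covering relations hold:

$$\begin{aligned}
&N_0 \Rightarrow N_0 \Rightarrow N_1 \Rightarrow N_2 \Rightarrow N_3 \Rightarrow N_4 \Rightarrow N_5 \Rightarrow N_5, \\
&K_0 \Rightarrow K_0 \Rightarrow N_3, \qquad K_1 \Rightarrow K_2 \Rightarrow N_3, \qquad K_3 \Rightarrow K_3 \Rightarrow N_3.
\end{aligned} \tag{5.1}$$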
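If these relations could be verified, then by Lemma 5.3 the following covering relations would hold as well:

$$\begin{aligned}
&N_5 \Leftarrow \tilde N_4 \Leftarrow \tilde N_3 \Leftarrow \tilde N_2 \Leftarrow \tilde N_1 \Leftarrow N_0, \\
&\tilde N_3 \Leftarrow K_0, \qquad \tilde N_3 \Leftarrow K_2 \Leftarrow K_1, \qquad \tilde N_3 \Leftarrow K_3.
\end{aligned} \tag{5.2}$$

5.3. Rigorous covering relations. As we pointed out before, it is not convenient to check directly the covering relations given in the previous section, since that would require an enormous amount of computer time. This problem is due to the fact that, even if we start from a very small interval in the computation of the Poincaré map, the error and the wrapping effect accumulated in the Poincaré time are quite large. Instead we prefer to define 12 auxiliary t-sets $M_i$, $i = 0, \dots, 6$ and $L_i$, $i = 1, \dots, 5$ on the symmetric Poincaré plane, i.e. the plane $y = 0$, $\dot y < 0$, and prove the covering relations given below (the precise definition of the t-sets is given in Sect. 7):

$$\begin{aligned}
&N_0 \overset{H_1}{\Longrightarrow} M_0 \overset{H_2}{\Longrightarrow} N_0 \overset{H_1}{\Longrightarrow} M_1 \overset{H_2}{\Longrightarrow} N_1 \overset{H_1}{\Longrightarrow} M_2 \overset{H_2}{\Longrightarrow} N_2 \overset{H_1}{\Longrightarrow} M_3 \\
&\quad \overset{H_2}{\Longrightarrow} N_3 \overset{H_1}{\Longrightarrow} M_4 \overset{H_2}{\Longrightarrow} N_4 \overset{H_1}{\Longrightarrow} M_5 \overset{H_2}{\Longrightarrow} N_5 \overset{H_1}{\Longrightarrow} M_6 \overset{H_2}{\Longrightarrow} N_5, \\
&L_3 \overset{H_2}{\Longrightarrow} K_0 \overset{H_1}{\Longrightarrow} L_3 \overset{H_2}{\Longrightarrow} N_3, \\
&K_1 \overset{H_1}{\Longrightarrow} L_1 \overset{H_2}{\Longrightarrow} K_2 \overset{H_1}{\Longrightarrow} L_2 \overset{H_2}{\Longrightarrow} N_3, \\
&K_3 \overset{H_1}{\Longrightarrow} L_4 \overset{H_2}{\Longrightarrow} K_3 \overset{H_1}{\Longrightarrow} L_5 \overset{H_2}{\Longrightarrow} N_3.
\end{aligned}$$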
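But even proving these covering relations with the half Poincaré maps turns out to be a difficult task: some of the sets are quite close to the primaries (particularly the sets $L_4$ and $L_5$), and this makes the computation of rigorous bounds for the intersection of the trajectory with the Poincaré plane critical. To overcome this difficulty we have to use the alternative, but equivalent, definition of covering through the inverse of the Poincaré map, or backcovering, see Definition 5.4. The covering relations we actually prove by computer assistance are as follows:

$$\begin{aligned}
&N_0 \overset{H_1}{\Longrightarrow} M_1 \overset{H_2}{\Longleftarrow} N_1 \overset{H_1}{\Longrightarrow} M_2 \overset{H_2}{\Longleftarrow} N_2 \overset{H_1}{\Longrightarrow} M_3 \\
&\quad \overset{H_2}{\Longrightarrow} N_3 \overset{H_1}{\Longleftarrow} M_4 \overset{H_2}{\Longrightarrow} N_4 \overset{H_1}{\Longleftarrow} M_5 \overset{H_2}{\Longrightarrow} N_5, \\
&N_0 \overset{H_1}{\Longrightarrow} M_0, \qquad M_6 \overset{H_2}{\Longrightarrow} N_5, \\
&K_1 \overset{H_1}{\Longrightarrow} L_1 \overset{H_2}{\Longleftarrow} K_2 \overset{H_1}{\Longrightarrow} L_2 \overset{H_2}{\Longrightarrow} N_3, \qquad K_0 \overset{H_1}{\Longrightarrow} L_3 \overset{H_2}{\Longrightarrow} N_3, \\
&L_4 \overset{H_2}{\Longrightarrow} K_3 \overset{H_1}{\Longleftarrow} L_5 \overset{H_2}{\Longrightarrow} N_3.
\end{aligned} \tag{5.3}$$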
Note that the coverings $L_3 \overset{H_2}{\Longleftarrow} K_0$, $K_3 \overset{H_1}{\Longleftarrow} L_4$, $M_0 \overset{H_2}{\Longleftarrow} N_0$ and $N_5 \overset{H_1}{\Longleftarrow} M_6$ follow from the coverings $K_0 \overset{H_1}{\Longrightarrow} L_3$, $L_4 \overset{H_2}{\Longrightarrow} K_3$, $N_0 \overset{H_1}{\Longrightarrow} M_0$ and $M_6 \overset{H_2}{\Longrightarrow} N_5$ respectively, by the symmetry of the t-sets and Lemma 5.3. As a result, we have to check 21 covering relations.

In order to have covering relations as described in Definition 5.3, we would need to check the image through the Poincaré map of the whole support of the t-sets. On the other hand, by Lemma 5.1 it is enough to perform rigorous computations on the boundaries of the t-sets, provided that we can prove that the map is defined on the whole supports. This is essential for the rigorous proof to be made in a reasonable time, since to check by computer assistance the definition of the Poincaré map on the supports of the t-sets would be very time consuming.

In order to prove that the half Poincaré maps are defined on the supports of all t-sets we argue as follows. We only give the proof for the half Poincaré map $H_1$, the proof for $H_2$ being equivalent. First we have to prove that the trajectory does not collide with one primary. Then we observe that the projections of the trajectories we are interested in on the $(y, p_y)$ plane appear to be rotating around either a primary or the Lagrangian point $L_1$. If we can compute the angular velocity in the $(y, p_y)$ plane and prove that it is bounded away from 0 for long enough time, then it follows that the trajectory has to cross the Poincaré section eventually, hence the map $H_1$ is well defined. In [AZ] it was possible to prove by analytical computations that the projection on the $(x, p_x)$ plane of all trajectories has positive angular velocity with respect to the origin. For the restricted 3-body problem such a general statement does not hold, therefore we have to proceed with a different method.

First we compute the trajectory of the boundary of the t-set with rigorous bounds on the error. The trajectory of the whole boundary describes a “tube” in $\mathbb{R}^4$. More precisely, the intersection of the trajectory of the boundary with any hyperplane $\alpha p_y + \beta y = 0$ bounds a region on the energy surface; the union of all these intersections is what we call the tube. Since no trajectories can intersect, all trajectories starting in the interior of the support of the t-set have to remain inside the tube. If we can prove that the angular velocity in the interior of the tube is always bounded away from 0, say its absolute value is larger than $\varepsilon > 0$, it follows that the half Poincaré map is well defined on the whole interior of the support of the t-set; indeed any trajectory starting in the interior of the t-set must intersect the Poincaré plane in a time $T \le \pi/\varepsilon$.

For a given point $(x, p_x) \in \partial N$ let

$$\gamma(x, p_x) = \varphi\left(x, p_x; [0, T_1]\right),$$

where $\varphi$ is the map defined in Sect. 3 and $T_1 = T_1(x, p_x)$ is the half Poincaré time also defined in Sect. 3. $\gamma(x, p_x)$ is the trajectory of the point $\left(x, p_x, 0, \sqrt{2\Omega(x, 0) + h - p_x^2}\right)$ under the flow induced by the equations from time 0 to the time $T_1$ when it reaches the Poincaré plane. If we can prove that $H_1(\partial N)$ exists, since the flows of two different points cannot intersect, unless they coincide, then $H_1(\partial N)$ bounds a region $M$ in the Poincaré plane. We want to prove that $H_1$ is defined on the whole set $|N|$ and $H_1(|N|) = M$. Let

$$S = |N| \cup \bigcup_{(x, p_x) \in \partial N} \gamma(x, p_x) \cup M.$$

The hypersurface $S \subset \mathbb{R}^4$ divides $\mathbb{R}^4$ into two connected regions; call $\Gamma$ the bounded region. It is clear that for all $(x, p_x) \in |N| \setminus \partial N$ either there exists $H_1(x, p_x)$, or $\varphi\left(x, p_x, 0, \sqrt{2\Omega(x, 0) + h - p_x^2}; T\right) \in \Gamma$ for all $T \ge 0$. Let $\beta(x, p_x, y, p_y) = (2p_x + \Omega_y(x, y))\,y - p_y^2$. The next lemma shows that, if $\Gamma$ does not intersect the plane $(y = 0, p_y = 0)$ and $|\beta(x, p_x, y, p_y)| \ge \delta > 0$ for all $(x, p_x, y, p_y) \in \Gamma$, then the second case cannot happen.
Lemma 5.4. Fix $h$ and $(x, p_x)$ such that $x \notin \{R_1, -R_2\}$ and $2\Omega(x, 0) + h - p_x^2 > 0$; let $p_y = \sqrt{2\Omega(x, 0) + h - p_x^2}$. Let $(x(T), p_x(T), y(T), p_y(T)) = \varphi(x, p_x, 0, p_y; T)$ be the flow induced by (1.1); let $\beta(T) = (2p_x(T) + \Omega_y(x(T), y(T)))\,y(T) - p_y^2(T)$. If there exists $\delta > 0$ such that $y^2(T) + p_y^2(T) > 0$ and $|\beta(T)| \ge \delta$ for all $T \in [0, \pi/\delta]$ (resp. $T \in [-\pi/\delta, 0]$), then the half Poincaré map $H_1$ (resp. $H_2$) in $(x, p_x)$ is well defined.

Proof. Since $y^2(T) + p_y^2(T) > 0$ we can use polar coordinates in the $(y, p_y)$ plane. Let $\alpha(T)$ be the angle. Taking the derivative, and using $\dot y = p_y$ and $\dot p_y = 2p_x + \Omega_y$, we have

$$\dot\alpha = \frac{\dot p_y\, y - p_y^2}{y^2 + p_y^2} = \frac{(2p_x + \Omega_y)\,y - p_y^2}{y^2 + p_y^2}.$$

Since $y^2 + p_y^2$ is bounded and $(2p_x + \Omega_y)\,y - p_y^2$ is bounded away from 0, then for some $T$ we have $|\alpha(T) - \alpha(0)| \ge \pi$ and the map is defined.
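The last step of the proof can be spelled out as a short estimate. The following is only a sketch: it assumes a bound $y^2 + p_y^2 \le R^2$ along the arc of trajectory under consideration, where the constant $R$ is introduced here for illustration and does not appear in the statement of the lemma. Since $|\beta| \ge \delta > 0$ and $\beta$ is continuous, $\dot\alpha$ has constant sign, so

```latex
\[
  |\dot\alpha(T)| \;=\; \frac{|\beta(T)|}{y^2(T) + p_y^2(T)} \;\ge\; \frac{\delta}{R^2}
  \quad\Longrightarrow\quad
  |\alpha(T) - \alpha(0)| \;=\; \left|\int_0^T \dot\alpha(s)\,ds\right| \;\ge\; \frac{\delta}{R^2}\,T .
\]
\[
  \text{Hence } |\alpha(T^*) - \alpha(0)| = \pi \text{ for some } T^* \le \frac{\pi R^2}{\delta}.
\]
```

Under this reading, the time window $[0, \pi/\delta]$ in the statement corresponds to the case $R \le 1$. A half turn in the $(y, p_y)$ plane starting from $y = 0$, $p_y > 0$ ends at $y = 0$, $p_y < 0$, which is the crossing that defines $H_1$.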
Lemma 5.5. For all covering relations listed in (5.3) the half Poincaré maps are defined on all the supports of all t-sets.

Proof. Consider a covering relation with the $H_1$ map, the $H_2$ map and the back-covering relations being equivalent. The proof is by computer assistance and it is performed with the following procedure. Choose a t-set and let $(x, p_x)$ be the middle point of its support. Let $\bar\varphi(x, p_x, t)$ be the approximate flow, computed by some suitable algorithm (we used Mathematica 4.0 for this purpose). Let $\bar T_1$ be the approximate Poincaré time, i.e. $\bar\varphi_3(x, p_x, t) > 0$ for all $0 < t < \bar T_1$ and $\bar\varphi_3(x, p_x, \bar T_1) = 0$. For an integer $N > 0$ let $t_i = i\,\bar T_1/N$, $i = 0, \dots, N$,

$$\bar\Gamma_i = \left\{(x, p_x, y, p_y) \in \mathbb{R}^4 : d\big((x, p_x, y, p_y), \bar\varphi(x, p_x, t_i)\big) \le 1\right\},$$

where $d$ is a distance in $\mathbb{R}^4$ defined by

$$d\big((x^1, p_x^1, y^1, p_y^1), (x^2, p_x^2, y^2, p_y^2)\big) = \max\big(a_1|x^1 - x^2|,\ a_2|p_x^1 - p_x^2|,\ a_3|y^1 - y^2|,\ a_4|p_y^1 - p_y^2|\big)$$

and $\{a_i\}$ are positive constants. Let

$$\bar\Gamma = \bigcup_{i=0}^{N} \bar\Gamma_i.$$

$\bar\Gamma$ is a very rough approximation of the flow of the support of the t-set. If all the constants entering in the definition of $\bar\Gamma$ are chosen appropriately, one can expect that the true trajectory of the support of the t-set is entirely contained in $\bar\Gamma$. With computer assistance we checked rigorously that the true trajectory of the boundary of the t-sets never leaves $\bar\Gamma$. Using interval arithmetics algorithms we compute

$$\delta_i = \max_{(x, p_x, y, p_y) \in \bar\Gamma_i} \left[(2p_x + \Omega_y(x, y))\,y - p_y^2\right].$$

By computer assistance we prove that $\max_i \delta_i$ is strictly negative for all t-sets. The proof is complete by Lemma 5.4.
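The bound on $\delta_i$ above is the kind of quantity that interval arithmetic delivers directly. The following Python sketch (Python 3.9 or later, for `math.nextafter`) is an illustration only, not the author's Mathematica/C++ code: the `Interval` class, the helper names and the sample box are hypothetical, and the enclosure of $\Omega_y$ on the box is simply passed in as an assumed input rather than computed from the potential.

```python
import math
from dataclasses import dataclass


def down(value: float) -> float:
    """Move one floating-point step towards -infinity (outward rounding)."""
    return math.nextafter(value, -math.inf)


def up(value: float) -> float:
    """Move one floating-point step towards +infinity (outward rounding)."""
    return math.nextafter(value, math.inf)


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(down(self.lo + other.lo), up(self.hi + other.hi))

    def __sub__(self, other: "Interval") -> "Interval":
        return Interval(down(self.lo - other.hi), up(self.hi - other.lo))

    def __mul__(self, other: "Interval") -> "Interval":
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(down(min(products)), up(max(products)))


def beta_enclosure(px: Interval, omega_y: Interval,
                   y: Interval, py: Interval) -> Interval:
    """Enclose (2*px + Omega_y)*y - py^2 over a box.

    omega_y is an ASSUMED enclosure of Omega_y on the box; computing it
    from the potential is outside the scope of this sketch.
    """
    two = Interval(2.0, 2.0)
    return (two * px + omega_y) * y - py * py


# Hypothetical box, chosen only to exercise the arithmetic.
box_beta = beta_enclosure(px=Interval(0.1, 0.2),
                          omega_y=Interval(-0.5, -0.4),
                          y=Interval(0.0, 0.01),
                          py=Interval(0.3, 0.4))
print(box_beta, "negative on the whole box:", box_beta.hi < 0)
```

If the upper endpoint of the enclosure is negative on every box $\bar\Gamma_i$, then $\max_i \delta_i < 0$, which is the hypothesis needed to apply Lemma 5.4. Stepping each endpoint one floating-point number outward is a simple, slightly pessimistic substitute for directed rounding.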
The previous lemmas yield the following: Theorem 5.3. All covering relations listed in (5.3) hold. Proof. We check with computer assistance that conditions a and b in Lemma 5.1 or conditions c and d in Definition 5.4 are verified for all the covering relations listed in (5.3). The proof is completed by Lemma 5.5.
Consider the collection of the t-sets $N_0, N_1, N_2, N_3, N_4, N_5, \tilde N_1, \tilde N_2, \tilde N_3, \tilde N_4, K_0, K_1, K_2, K_3$ in this order and let

$$T = \begin{pmatrix}
1&1&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&1&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&1&0&0&0&0&0&0&0&0\\
0&0&0&0&0&1&0&0&0&1&0&0&0&0\\
1&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&1&0&0&1&0&1&1\\
0&0&0&0&0&0&0&0&1&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&1&0\\
0&0&0&1&0&0&0&0&0&0&0&1&0&0\\
0&0&0&1&0&0&0&0&0&0&0&0&0&1
\end{pmatrix}.$$

Theorem 5.3 and the definition of generic covering yield the following

Corollary 5.3. All relations in (5.1) and (5.2) hold with generic covering instead of covering or backcovering and $T$ is the transition matrix associated with the map $P$.

5.4. Symbolic dynamics. Let $A$ be the set of all admissible sequences of fourteen symbols $(N_0, N_1, N_2, N_3, N_4, N_5, \tilde N_1, \tilde N_2, \tilde N_3, \tilde N_4, K_0, K_1, K_2, K_3)$ with respect to $T$ (recall Definition 5.8). Corollary 5.3 yields the following result on symbolic dynamics:

Theorem 5.4. For any bi-infinite sequence $\{x_n\} \in A$ there exists an orbit of (1.1) which crosses the Poincaré plane in the t-sets $N_0, \dots, N_5$, $\tilde N_1, \dots, \tilde N_4$ and $K_0, \dots, K_3$ in the order prescribed by the sequence. Furthermore, if $\{x_n\}$ is periodic of period $m$, such orbit is periodic as well and its trajectory on the Poincaré section also has period $m$.

Proof. The supports of the t-sets $N_0, \dots, N_5$, $\tilde N_1, \dots, \tilde N_4$ and $K_0, \dots, K_3$ are disjoint; let $N$ be the union of all such supports and let $\pi : \operatorname{Inv}(N, P) \to A$ be the projection defined as in Definition 5.9. The result follows by Theorem 5.2 and Corollary 5.3.
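Admissibility of a finite itinerary can be checked mechanically from the matrix. The Python sketch below is an illustration under one stated assumption: row $r$, column $c$ of the printed matrix is read as "the $r$-th set covers the $c$-th set", which is the reading that matches the chains in (5.1) and (5.2). The symbol names with a trailing `~` and the function `is_admissible` are hypothetical helpers, not part of the paper.

```python
# Symbols in the order used for the matrix; "N1~" stands for the reflected set.
SYMBOLS = ["N0", "N1", "N2", "N3", "N4", "N5",
           "N1~", "N2~", "N3~", "N4~", "K0", "K1", "K2", "K3"]

# Rows of the printed matrix T, one string of 14 digits per row.
ROWS = [
    "11000000000000",
    "00100000000000",
    "00010000000000",
    "00001000000000",
    "00000100000000",
    "00000100010000",
    "10000000000000",
    "00000010000000",
    "00000001001011",
    "00000000100000",
    "00010000001000",
    "00000000000010",
    "00010000000100",
    "00010000000001",
]

T = [[int(digit) for digit in row] for row in ROWS]
INDEX = {name: position for position, name in enumerate(SYMBOLS)}


def is_admissible(word):
    """True if every consecutive pair in the finite word is an allowed transition."""
    return all(T[INDEX[a]][INDEX[b]] == 1 for a, b in zip(word, word[1:]))


print(is_admissible(["N0", "N0", "N1", "N2", "N3", "N4", "N5", "N5"]))
print(is_admissible(["N5", "N4~", "N3~", "K0", "K0", "N3"]))
print(is_admissible(["N0", "N2"]))
```

A periodic bi-infinite sequence is admissible exactly when one period, followed by its own first symbol, passes this check; by Theorem 5.4 each such sequence is realised by a periodic orbit.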
Corollary 5.4. Equation (1.1) admits infinitely many periodic solutions. This result is important since it gives the complete picture of the chaotic behavior of the system and it yields the result on topological entropy we present in the next subsection. On the other hand, a better visual image of the result can be obtained by the following observations. Recall that the sets N0 , N5 , K0 and K3 P -cover themselves and K1 P 2 -covers itself. Hence there exists (at least) a periodic orbit for each of these sets
which crosses the hyperplane $y = 0$ with positive $y$ velocity in the set (or alternatively in the sets $K_1$ and $K_2$). The orbit crossing $N_0$ is the orbit symmetric to $U_2$ with respect to the $y$ axis, while the orbits crossing $N_5$, $K_0$, $K_1$ and $K_3$ are $U_2$, $L$, $D_1$ and $U_1$ respectively. We point out that without considering the derivative of the Poincaré map we cannot exclude that other periodic orbits cross those t-sets. It would be indeed rather easy to rigorously compute the derivative of the Poincaré map, but this is not relevant for the discussion that follows and such computation will be treated in another paper.

From the transition matrix it also follows that $N_0 \overset{P^5}{\Longleftrightarrow} N_5 \overset{P^5}{\Longleftrightarrow} N_0$ and $K_i \overset{P^3}{\Longleftrightarrow} N_5 \overset{P^3}{\Longleftrightarrow} K_i$, $i = 0, 2, 3$, therefore if we consider the t-sets $N_0$, $N_5$, $K_0$, $K_2$ and $K_3$ the transition matrix with respect to the map $P^{30}$ is a $5 \times 5$ matrix with all entries equal to 1, that is each of these t-sets $P^{30}$-covers itself and all the other t-sets. Using again Theorem 5.2 we infer that for all bi-infinite sequences of five symbols there exists a solution of Eq. (1.1) which crosses the t-sets in the same order. In other words, the system admits solutions which come close to any of these orbits in any prescribed order. We point out that this result concerns the thirtieth iterate of the Poincaré map, therefore only one crossing of the Poincaré section every 30 should be considered. Finally, we point out that all these results concern the existence of orbits, not uniqueness.

5.5. Topological entropy estimates. The results presented in the previous subsection yield a lower estimate of the topological entropy of the map $P$ by the following lemma:

Lemma 5.6. Let $f : X \to X$ be a continuous map. Let $S \subset X$ be an invariant set, let $A$ be an $n \times n$ matrix such that there exists a surjective map $\pi : S \to \Sigma_A$, where $\Sigma_A$ is the set of sequences admissible with respect to $A$, satisfying $\sigma \circ \pi = \pi \circ f$. Then the topological entropy of $f$ is larger than $\ln(\max\{|\lambda_i| : \lambda_i \text{ is an eigenvalue of } A\})$.

Proof. The proof is an easy consequence of Theorem 7.13 in [WA].
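The statement in Sect. 5.4 that the five sets $N_0$, $N_5$, $K_0$, $K_2$, $K_3$ all reach each other in exactly 30 iterates can be checked by following the allowed transitions. The Python sketch below is illustrative: the dictionary `EDGES` is a transcription of the generic coverings in (5.1) and (5.2), read as "source set: list of sets it covers" (an assumption about orientation), and the names with a trailing `~` denote the reflected sets.

```python
# Allowed one-step transitions, transcribed from (5.1) and (5.2).
EDGES = {
    "N0": ["N0", "N1"], "N1": ["N2"], "N2": ["N3"], "N3": ["N4"],
    "N4": ["N5"], "N5": ["N5", "N4~"],
    "N4~": ["N3~"], "N3~": ["N2~", "K0", "K2", "K3"],
    "N2~": ["N1~"], "N1~": ["N0"],
    "K0": ["K0", "N3"], "K1": ["K2"], "K2": ["K1", "N3"], "K3": ["K3", "N3"],
}

FIVE = ["N0", "N5", "K0", "K2", "K3"]


def reachable_in(start, steps):
    """Set of symbols reachable from `start` in exactly `steps` transitions."""
    current = {start}
    for _ in range(steps):
        current = {target for source in current for target in EDGES[source]}
    return current


def is_full_shift(symbols, steps):
    """True if every symbol reaches every symbol in exactly `steps` transitions."""
    return all(set(symbols) <= reachable_in(a, steps) for a in symbols)


print(is_full_shift(FIVE, 30))
```

The check succeeds because every one of the five sets reaches $N_5$ in at most 5 steps, $N_5$ reaches each of them in at most 5 steps, and the self-covering of $N_5$ can absorb the remaining iterates.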
Corollary 5.5. The topological entropy of $P$ is larger than $\ln(1.62746) > 0.487$.

Proof. By Lemma 5.6 a lower bound for $h_t$ is given by the logarithm of the maximal-modulus eigenvalue of the transition matrix $T$. The characteristic polynomial is

$$p(x) = 1 - 2x + 2x^3 + x^4 - 3x^5 - 3x^6 + 7x^7 - 4x^8 + 4x^9 - 5x^{10} + 5x^{12} - 4x^{13} + x^{14},$$

and since $p(1.62747) > 0$ and $p(1.62746) < 0$, $T$ has a real eigenvalue larger than $1.62746$, hence $h_t(P) > \ln(1.62746) > 0.487$.
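The root of the characteristic polynomial and the logarithm required by Lemma 5.6 can be reproduced numerically. The Python sketch below is a non-rigorous floating-point illustration, not part of the computer assisted proof: it runs Newton's iteration on $p$ starting from $x = 4$, which lies to the right of every eigenvalue because no row of $T$ sums to more than 4. The function names are hypothetical.

```python
import math

# Coefficients of p(x) for x^0, x^1, ..., x^14, as printed in the proof.
COEFFS = [1, -2, 0, 2, 1, -3, -3, 7, -4, 4, -5, 0, 5, -4, 1]


def poly_and_derivative(x):
    """Evaluate p(x) and p'(x) together with Horner's scheme."""
    value, slope = 0.0, 0.0
    for coefficient in reversed(COEFFS):
        slope = slope * x + value
        value = value * x + coefficient
    return value, slope


def largest_real_root(start=4.0, tolerance=1e-13, max_iterations=200):
    """Newton's iteration started to the right of the largest real root."""
    x = start
    for _ in range(max_iterations):
        value, slope = poly_and_derivative(x)
        step = value / slope
        x -= step
        if abs(step) < tolerance:
            break
    return x


spectral_radius = largest_real_root()
entropy_bound = math.log(spectral_radius)
print(spectral_radius, entropy_bound)
```

A rigorous version would replace the floating-point evaluation by interval evaluations of $p$ at two nearby points, as done in the proof with $1.62746$ and $1.62747$.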
6. Further Developments

Although the 3-body problem is very old, there are still many open problems. Computer assisted techniques can give rigorous proofs to some conjectures. This paper is focused on results which require only $C^0$ computations. We think that a much richer symbolic dynamics can be proved with these methods at different energy levels: this paper should only be considered as an example, although in the author’s opinion Theorem 5.4 and Corollary 5.5 have value in themselves. The rigorous computation of the derivatives of the Poincaré map can provide information on the linear stability of periodic orbits and yield existence results for homoclinic and heteroclinic orbits. We will treat these topics in a forthcoming paper, together with the dependence of the results presented in this paper on the ratio of the masses of the primaries. A further development under study concerns bifurcations with the energy as a parameter.
7. Computational Details

7.1. Description of the t-sets. The procedure used to build all t-sets is described in [AZ] for the Hénon-Heiles Hamiltonian and we refer to that paper for the details. Here we only mention that the t-sets $N_i$ and $K_i$ are either centered on the fixed or period 2 points of the Poincaré map, or placed along the invariant manifolds of such points with the sides approximately parallel to the manifolds. The sets $M_i$ and $L_i$ instead are built “by hand” trying to interpolate the image of the other sets. We recall that the Poincaré section is not connected, since the lines $x = \pm 2^{-2/3}$ corresponding to the position of the primaries must be excluded. It follows that the invariant manifolds of points belonging to different connected components cannot cross. This fact does not influence the topological method employed here, since a t-set and its image may be on different connected components, bypassing the fact that the Poincaré section passes through the primaries. The actual t-sets used in the proofs are as follows:

Definition 7.1. The triple sets are defined by giving the coordinates of the center $(x, y)$, the length of the sides $(l_x, l_y)$ and the angular coefficients of the sides $(\alpha, \beta)$ as follows:

N0: (x, y) = (−0.7942, 0), (lx, ly) = (0.007, 0.007), (α, β) = (0.9325, −0.9325).
N1: (x, y) = (−0.7849, 0.0129), (lx, ly) = (0.002, 0.005), (α, β) = (−2.136, 2.136).
N2: (x, y) = (−0.7589, 0.05086), (lx, ly) = (0.002, 0.005), (α, β) = (−1.812, 1.981).
N3: (x, y) = (0.08, 0.125), (lx, ly) = (0.01, 0.04), (α, β) = (1.124, 2.009).
N4: (x, y) = (0.131, 0.028), (lx, ly) = (0.0076, 0.012), (α, β) = (−2.09, 2.11).
N5: (x, y) = (0.1471, 0), (lx, ly) = (0.015, 0.015), (α, β) = (−2.07, 2.07).
K0: (x, y) = (0.04873, 0), (lx, ly) = (0.0002, 0.0002), (α, β) = (1.818, −1.818).
K1: (x, y) = (0.04775, 0), (lx, ly) = (0.0009, 0.0009), (α, β) = (1.131, −1.131).
K2: (x, y) = (−0.7365, 0), (lx, ly) = (0.0012, 0.0012), (α, β) = (−2, 2).
K3: (x, y) = (0.06712, 0), (lx, ly) = (0.01, 0.01), (α, β) = (1.835, −1.835).
M0: (x, y) = (−0.1471, 0), (lx, ly) = (0.0065, 0.0065), (α, β) = (1, −1).
M1: (x, y) = (−0.1424, 0.007289), (lx, ly) = (0.0026, 0.0065), (α, β) = (0.9, −1.05).
M2: (x, y) = (−0.1311, 0.0279), (lx, ly) = (0.0025, 0.005), (α, β) = (4.2, 2).
M3: (x, y) = (−0.0792, 0.1267), (lx, ly) = (0.002, 0.005), (α, β) = (4.2, 1.8).
M4: (x, y) = (0.7592, 0.05021), (lx, ly) = (0.006, 0.018), (α, β) = (1.2, 2.2).
M5: (x, y) = (0.7854, 0.01406), (lx, ly) = (0.01, 0.008), (α, β) = (4.2, 2.2).
M6: (x, y) = (0.7942, 0), (lx, ly) = (0.013, 0.013), (α, β) = (1, −1).
L1: (x, y) = (−0.0742, −0.1015), (lx, ly) = (0.0009, 0.0009), (α, β) = (1.131, −1.131).
L2: (x, y) = (−0.0738, 0.1025), (lx, ly) = (0.002, 0.002), (α, β) = (−1.85, 1.85).
L3: (x, y) = (−0.04873, 0), (lx, ly) = (0.01, 0.01), (α, β) = (1.819, −1.819).
L4: (x, y) = (0.6563, 0), (lx, ly) = (0.005, 0.005), (α, β) = (−1.49, 1.49).
L5: (x, y) = (0.6545, 0.022), (lx, ly) = (0.01, 0.01), (α, β) = (−1.49, 1.49).

The left (resp. right) edge of each set is the segment whose end points are $(x + l_x \cos\alpha + l_y \cos\beta,\ y + l_x \sin\alpha + l_y \sin\beta)$ and $(x + l_x \cos\alpha - l_y \cos\beta,\ y + l_x \sin\alpha - l_y \sin\beta)$ (resp. $(x - l_x \cos\alpha + l_y \cos\beta,\ y - l_x \sin\alpha + l_y \sin\beta)$ and $(x - l_x \cos\alpha - l_y \cos\beta,\ y - l_x \sin\alpha - l_y \sin\beta)$). The boundaries of the left and right sides of each t-set are the lines crossing two opposite vertices of the support, see Fig. 6. Furthermore, we denote by $\tilde N_i$, $i = 1, \dots, 4$, the sets obtained by reflecting the sets $N_i$ with respect to the $x$-axis, see Definition 5.2. Note that the sets $N_0$, $N_5$, $K_0$, $K_1$, $K_2$, $K_3$ are by definition symmetric with respect to the $x$ axis.
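The edge formulas of Definition 7.1 translate directly into code. The Python sketch below computes the end points of the left and right edges from the three pairs of parameters and reflects a point in the $x$-axis. The function names are hypothetical, and the reflection only mirrors points: whatever Definition 5.2 prescribes for the left and right sides of a reflected t-set is not modelled here.

```python
import math


def edges(center, lengths, angles):
    """End points of the left and right edges, following Definition 7.1."""
    x, y = center
    lx, ly = lengths
    alpha, beta = angles
    ax, ay = lx * math.cos(alpha), lx * math.sin(alpha)  # side vector for l_x
    bx, by = ly * math.cos(beta), ly * math.sin(beta)    # side vector for l_y
    left = ((x + ax + bx, y + ay + by), (x + ax - bx, y + ay - by))
    right = ((x - ax + bx, y - ay + by), (x - ax - bx, y - ay - by))
    return {"left": left, "right": right}


def reflect(point):
    """Mirror image of a point with respect to the x-axis."""
    return (point[0], -point[1])


# The set N0 from Definition 7.1.
n0 = edges(center=(-0.7942, 0.0), lengths=(0.007, 0.007), angles=(0.9325, -0.9325))
print(n0["left"])
print(n0["right"])
```

For $N_0$, where the center lies on the $x$-axis, $l_x = l_y$ and $\beta = -\alpha$, two of the four vertices fall on the $x$-axis and the other two are mirror images of each other, in agreement with the remark that this set is symmetric with respect to the $x$ axis.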
7.2. Computation techniques. We describe here the algorithm used in the computer assisted proof of Theorem 5.3. The proof of Theorem 4.1 is equivalent. By Lemma 5.1 the proof consists in checking that the images through the Poincaré map of the edges of the t-sets lie in some assigned regions of the plane as described by the covering relations. We do not know the exact images of such edges, since no analytical solution of the equation is available, therefore all we can do is to estimate the trajectories and compute the intersections with the Poincaré plane with rigorous error bounds. In order to compute the image of a side, we partition it in segments with small enough length and we check that every such segment is mapped in the correct region. To compute the image of a small segment, we enclose it in an interval set (a rectangle) and we compute its trajectory with a Taylor–Lohner algorithm using interval arithmetics. More precisely, we start with a Taylor method of order 12, i.e. we estimate the trajectory of an interval by using the Taylor expansion of order 12 and we estimate the error by the Lagrange remainder. If $h$ is the time step, we compute a rough but rigorous enclosure $D$ of the trajectory at times $[0, h]$, that is an interval set $D$ such that the solution of the equation lies in $D$ for all times between 0 and $h$, and by Lagrange’s theorem we estimate the error we make neglecting the remaining terms of the Taylor expansion by computing $x^{(13)}(D)\,\frac{h^{13}}{13!}$, where $x^{(13)}(D)$ (which is an interval enclosing all possible values assumed by the 13th derivative of the trajectory, therefore enclosing the Lagrange remainder) is computed using a recursive algorithm for the time derivatives of the solutions. The interval arithmetics algorithms address the problem of computing the trajectory of a segment and of keeping track of the errors in an elegant and rigorous way, but they introduce another problem.
Indeed, even in the simplest dynamical system, the procedure described above leads to a very rough estimate of trajectories, due to the wrapping effect which makes the bounds on the error grow exponentially. The problem has been strongly contained by introducing the half Poincaré maps and the backcovering relation, see the discussion in Sect. 5, but these techniques do not suffice. We obtain another significant reduction of the wrapping effect by using the Lohner algorithm. We refer to Sect. 6 in [AZ] and references cited there for a discussion of interval arithmetics and the wrapping effect and for a description of the Lohner algorithm employed. See also [MZ] for a discussion on interval arithmetics and [L] for the Lohner algorithm. We point out that using the Lohner algorithm to compute directly the Poincaré map, instead of the half Poincaré maps or their inverse, the computation time to perform all the proofs is unrealistic on current desktop computers, therefore the definitions of the half Poincaré maps and of backcovering are essential to perform the proofs. The round-off errors are taken care of directly by suitable C++ libraries and by Mathematica. Such errors may vary by changing computer and/or operating systems, but since they are usually very small when compared to the wrapping effect and since all proofs go through with a relatively large margin, we expect that the proof can be easily reproduced on any updated computer. The typical time step used in the computation of the images of the t-sets is $dt = 10^{-2}$, but in a few cases we had to use some lower value, down to $dt = 5 \cdot 10^{-4}$. Each side of the t-sets has been usually divided in 100 to 2000 segments, depending on the apparent value of the Lipschitz constant of the map in the area considered. In a few difficult cases we had to divide each side of a t-set in 5000 segments. Most of the computations used to obtain the bounds for the location of the orbits in Table 1 used $dt = 2 \cdot 10^{-5}$.
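The exponential growth caused by the wrapping effect is easy to reproduce on a toy example. The Python sketch below is an illustration only and is unrelated to the Lohner algorithm itself: it rotates a small axis-parallel square by 45 degrees and, after each rotation, replaces the image by the smallest axis-parallel box containing it, as a naive interval method would. The function name and the sample sizes are hypothetical.

```python
import math


def wrap_after_rotation(half_widths, theta):
    """Half-widths of the smallest axis-parallel box containing the rotated box."""
    wx, wy = half_widths
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    return (c * wx + s * wy, s * wx + c * wy)


box = (1e-3, 1e-3)       # initial half-widths of the square
theta = math.pi / 4      # rotation angle applied at each step
for _ in range(8):       # eight steps make one full turn
    box = wrap_after_rotation(box, theta)
print(box)
```

The exact image of the square after a full turn is the square itself, while the wrapped box grows by a factor $\sqrt 2$ at every step, that is by a factor 16 over the eight steps. The Lohner algorithm reduces this growth by propagating the set in a coordinate frame that follows the flow instead of re-wrapping it in axis-parallel boxes.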
To perform the proofs the author implemented a version of the whole algorithms in a combination of Mathematica and C++ under the Linux O.S. More precisely, Mathematica has been used to handle all the data and to perform a few algorithms which are less
demanding for the CPU, but more complicated to implement. Furthermore Mathematica has been used to make all numerical experiments used to choose the t-sets and to draw the pictures. On the other hand C++ has been used for the heavy interval arithmetic computations, where it offered a much better speed. The connection between the two languages is obtained by MathLink. We wish to point out that the full proof took almost a month of CPU time on a machine equipped with a 1 GHz Pentium III processor. In fact a large amount of the time is used for the computations involving the set $L_5$ whose trajectory is very close to one primary. A full C++ algorithm would reduce the time at the price of much more complicated and less user-friendly programming. We think that this sharing of tasks is almost optimal, as far as computational speed and simplicity of programming and data handling are concerned. The reader who desires to reproduce the computer assisted proofs in this paper without writing the program can use the Mathematica notebook provided via the Internet [A]. The notebook is provided with comments and instructions. By using the commands in the notebook it is possible to make an independent computation of the images of the sides of the t-sets and an independent check of Theorem 4.1.

Acknowledgement. The author is very grateful to P. Zgliczyński for many discussions and for the interval arithmetic C++ libraries, and to the referees for very useful suggestions.
References

[A] Arioli, G.: http://link.springer.de/link/service/journals/00220/index.htm or http://link.springer-ny.com/link/service/journals/00220/index.htm, DOI: 10.1007/s00220-002-0666-7
[AGT] Arioli, G., Gazzola, F., Terracini, S.: Minimization properties of Hill’s orbits and applications to some N-body problems. Ann. Inst. Henri Poincaré, Analyse non linéaire 17, 5, 617–650 (2000)
[AZ] Arioli, G., Zgliczyński, P.: Symbolic dynamics for the Hénon-Heiles Hamiltonian on the critical level. J. Diff. Eq. 171, 173–202 (2001)
[GH] Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. New York–Heidelberg–Berlin: Springer-Verlag, 1983
[GZ] Galias, Z., Zgliczyński, P.: Computer assisted proof of chaos in the Lorenz system. Physica D 115, 165–188 (1998)
[KMS] Kirchgraber, U., Manz, U., Stoffer, D.: Rigorous proof of chaotic behaviour in a dumbbell satellite model. J. of Math. Anal. and Appl. 251, 897–911 (2000)
[L] Lohner, R.J.: Computation of Guaranteed Enclosures for the Solutions of Ordinary Initial and Boundary Value Problems. In: Computational Ordinary Differential Equations, Cash, J.R., Gladwell, I. (eds.). Oxford: Clarendon Press, 1992
[MH] Meyer, K.R., Hall, G.R.: Introduction to Hamiltonian Dynamical Systems and the N-body problem. Berlin–Heidelberg–New York: Springer Verlag, 1991
[MM1] Mischaikow, K., Mrozek, M.: Isolating neighborhoods and chaos. Japan J. Indust. Appl. Math. 12, 205–236 (1995)
[MM2] Mischaikow, K., Mrozek, M.: Chaos in the Lorenz equations: A computer assisted proof. BAMS 32, 66–72 (1995)
[MM3] Mischaikow, K., Mrozek, M.: Chaos in the Lorenz equations: A computer assisted proof. Part II: Details. Mathematics of Computation 67, 1023–1046 (1998)
[M] Moser, J.: Stable and random motions in Dynamical Systems. Princeton, NJ: Princeton Univ. Press, 1973
[SM] Siegel, C.L., Moser, J.K.: Lectures on celestial mechanics. Berlin–Heidelberg–New York: Springer Verlag, 1971
[MZ] Mrozek, M., Zgliczyński, P.: Set arithmetic and the enclosing problem in dynamics. In print on Annales Polonici Mathematici
[SK] Stoffer, D., Kirchgraber, U.: Verification of chaotic behavior in the planar restricted three body problem. Preprint
[S] Szebehely, V.: Theory of orbits. New York–London: Academic Press, 1967
[WA] Walters, P.: An Introduction to Ergodic Theory. New York: Springer Verlag, 1982
[W] Willem, M.: Minimax Theorems. Boston: Birkhäuser, 1996
[Z1] Zgliczyński, P.: Fixed point index for iterations, topological horseshoe and chaos. Topological Methods in Nonlinear Analysis Vol. 8, No. 1, 1996, pp. 169–177
[Z2] Zgliczyński, P.: Computer assisted proof of chaos in the Hénon map and in the Rössler equations. Nonlinearity 10, No. 1, 243–252 (1997)
[Z3] Zgliczyński, P.: Sharkovskii’s Theorem for multidimensional perturbations of 1-dim maps. Ergod. Th. & Dynam. Sys. 19, 1655–1684 (1999)
Communicated by G. Gallavotti
Commun. Math. Phys. 231, 25 – 43 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0678-3
Communications in
Mathematical Physics
sh-Lie Algebras Induced by Gauge Transformations Ron Fulp1 , Tom Lada1 , Jim Stasheff 2, 1 Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA.
E-mail:
[email protected]; E-mail:
[email protected]
2 Department of Mathematics, University of North Carolina, Chapel Hill, NC 27599-3250, USA.
E-mail:
[email protected] Received: 14 December 2000 / Accepted: 8 February 2002 Published online: 2 October 2002 – © Springer-Verlag 2002
Abstract: Traditionally symmetries of field theories are encoded via Lie group actions, or more generally, as Lie algebra actions. A significant generalization is required when “gauge parameters” act in a field dependent way. Such symmetries appear in several field theories, most notably in a “Poisson induced” class due to Schaller and Strobl [SS94] and to Ikeda [Ike94], and employed by Cattaneo and Felder [CF99] to implement Kontsevich’s deformation quantization [Kon97]. Consideration of “particles of spin > 2” led Berends, Burgers and van Dam [Bur85, BBvD84, BBvD85] to study “field dependent parameters” in a setting permitting an analysis in terms of smooth functions. Having recognized the resulting structure as that of an sh-Lie algebra (L∞ -algebra), we have now formulated such structures entirely algebraically and applied it to a more general class of theories with field dependent symmetries. 1. Introduction Ever since the discovery of Yang–Mills theory, physicists have been intrigued by the different manifestations of symmetries in field theories. Symmetries in gravitational theories are induced by spacetime transformations which preserve the spacetime structure whereas Yang–Mills symmetries are defined via transformations of some internal vector space. Many authors have attempted to reformulate gravitational symmetries in a manner which is compatible with the Yang–Mills approach as quantization of Yang– Mills theories is better understood than most attempts to quantize gravity. The present paper has as its purpose to show that gauge symmetries of certain field theories have an unexpectedly rich algebraic structure. Traditional theories lead one to expect that the symmetries of field theories are encoded via Lie group actions, or more generally, as Lie algebra actions. We find that the gauge symmetries of many field theories in fact do not arise from a Lie algebra action, but rather from an sh-Lie (or L∞ ) algebra action. 
Stasheff’s research supported in part by the NSF throughout most of his career, most recently under grant DMS-9803435.
The physics of “particles of spin ≤ 2” leads to representations of a Lie algebra $\Xi$ of gauge parameters on a vector space $\Phi$ of fields. A significant generalization occurs when the gauge parameters act in a field dependent way. By a field dependent action of $\Xi$ on $\Phi$, Berends, Burgers and van Dam [Bur85, BBvD86, BBvD85] mean a polynomial (or power series) map $\delta(\xi)(\phi) = \sum_{i \ge 0} T_i(\xi, \phi)$, where $T_i$ is linear in $\xi$ and polynomial of homogeneous degree $i$ in $\phi$. Field dependent gauge symmetries appear in several field theories, most notably in a “Poisson induced” class due to Schaller and Strobl [SS94] and to Ikeda [Ike94], and employed by Cattaneo and Felder [CF99] to implement Kontsevich’s deformation quantization [Kon97]. Ikeda [Ike94] considers two-dimensional and three-dimensional [Ike01] theories with a generalized Yang–Mills field which has values in a so-called nonlinear Lie algebra. He finds that if the non-linear Lie structure is chosen appropriately and if he allows the Yang–Mills field to interact with certain scalar fields, then he can recapture gravitational theories in two dimensions. In this way, two-dimensional gravity is formulated as a Yang–Mills theory and its symmetries arise in the same way as traditional Yang–Mills symmetries. The three-dimensional case [Ike01] provides deformations of physicists’ BF theories and analogous results hold in higher dimensions. Although expressed rather differently, the Berends, Burgers and van Dam approach provides further insight into the algebraic structure of the gauge symmetries of the above class of field theories. In fact their context is more general than that of Ikeda and that of Cattaneo and Felder, since Berends, Burgers and van Dam consider arbitrary field theories, subject only to the requirement that the commutator of two gauge symmetries be another gauge symmetry whose gauge parameter is possibly field dependent. We refer to this requirement as the BBvD hypothesis.
Notice that Berends, Burgers and van Dam do not require an a priori given Lie structure to induce the algebraic structure of the gauge symmetry “algebra”. On the other hand, Ikeda requires a structure called a nonlinear Lie algebra which he uses to obtain symmetries which in turn are used to find a Lagrangian for which the symmetries are gauge symmetries. In this sense, his nonlinear Lie structure drives the entire theory. Similarly, Cattaneo and Felder have a Poisson structure which explicitly appears in both the action of their theory and in their gauge symmetries.

The present work has as its goal to clarify the algebraic structure of the more general gauge “algebra” outlined in Berends, Burgers and van Dam. When the BBvD hypothesis is satisfied, we show that the gauge symmetry algebra of a large class of field theories is an sh-Lie algebra. Of course, as we show, this sh-Lie structure, in special cases, will reduce to the more familiar Lie structures one encounters in various field theories. On the other hand, some of these field theories satisfy the BBvD hypothesis only “on-shell”. When closure on the original space of parameters is lost, physicists speak of an “open algebra”. This leads us, in Sect. 7, to a “generalized BBvD hypothesis” which in turn will allow us to show how the sh-Lie structure must be modified to handle “off-shell” gauge symmetries.

We formulate the relevant structures in BBvD’s theory in terms of linear maps from a certain coalgebra $\Lambda^*\Phi$ into the respective vector spaces $\Phi$ of fields and $\Xi$ of gauge parameters. The coalgebra and the algebra structures of $\Lambda^*\Phi$ as well as the Lie algebra structure of $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ are described in Sect. 2. It turns out that the space $\Xi$ of gauge parameters has, in general, no natural Lie structure, but the space of linear maps from $\Lambda^*\Phi$ into $\Xi$ is a Lie algebra under certain mild assumptions along with the BBvD hypothesis. This is proved in Sect. 3.
Section 4 provides the reader with a short description of two equivalent methods for defining sh-Lie algebras. Our main result is found in Sect. 5 where we show that, under the same assumptions required in Sect. 3, the fields and gauge
sh-Lie Algebras Induced by Gauge Transformations
parameters combine to form an sh-Lie algebra. In Sect. 6 we show how our results relate to the classical situation in which the space of gauge parameters is a Lie algebra which acts on the space of fields. Section 7 provides further links to the physics literature where certain sigma-models are known to satisfy the BBvD hypothesis only “on-shell”. This requires us to further generalize the BBvD hypothesis; consequently these gauge algebras are “on shell” sh-Lie algebras which are not “on shell” Lie algebras. Finally, in Sect. 8 we show explicitly how our formalism applies to the work of Ikeda [Ike94] on two-dimensional gravitational theories and his study of non-linear Lie algebras. In addition, we show that Ikeda’s bracket is the “non-linear” analog of the Kirillov-Kostant bracket. We are grateful to Berends, Burgers and van Dam for the inspiration of Burgers’ dissertation and especially to van Dam for several discussions as our research developed.
2. Our Framework

We work with vector spaces over a field $k$ of characteristic 0 or, more generally, over a commutative $k$-algebra $A$, typically $C^\infty(M)$ for some smooth manifold $M$. Unless otherwise specified, Hom will denote the $A$-module of $A$-linear maps. Let $\Phi$ be a free $A$-module and let $\Lambda^*\Phi$ denote the free nilpotent graded cocommutative coalgebra over $A$ cogenerated by $\Phi$, with comultiplication denoted $\Delta$. This is the coalgebra of graded symmetric tensors in the full tensor coalgebra on $\Phi$. The $A$-module $\mathrm{Coder}(\Lambda^*\Phi)$ of coderivations (over $A$) on $\Lambda^*\Phi$ is a Lie algebra with bracket given by the commutator with respect to composition. Recall that a coderivation is a linear map $\theta : \Lambda^*\Phi \to \Lambda^*\Phi$ that satisfies the equation
\[ \Delta \circ \theta = (\theta \otimes 1 + 1 \otimes \theta) \circ \Delta. \]
(In the graded situation, the usual Koszul sign conventions are in effect.) The $A$-module $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ is isomorphic to $\mathrm{Coder}(\Lambda^*\Phi)$ and hence inherits a Lie algebra structure; the bracket on $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ is known as the Gerstenhaber bracket [Ger62, Sta93]. The isomorphism $\mathrm{Hom}(\Lambda^*\Phi, \Phi) \ni h \mapsto \bar h \in \mathrm{Coder}(\Lambda^*\Phi)$ is given by the correspondence
\[ \bar h(\phi_1 \wedge \cdots \wedge \phi_n) = \sum_{\sigma \in \{\mathrm{unshuff}\}} h(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)}) \wedge \phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)} \]
for $h \in \mathrm{Hom}(\Lambda^p(\Phi), \Phi)$. The set $\{\mathrm{unshuff}\}$ is the set of $(p, n-p)$-unshuffles, that is, the permutations $\sigma$ of $\{1, \ldots, n\}$ such that $\sigma(1) < \cdots < \sigma(p)$ and $\sigma(p+1) < \cdots < \sigma(n)$. We may write $\bar h$ as the composition $\bar h = m \circ (h \otimes 1) \circ \Delta$, where $m$ is the usual product in $\Lambda^*\Phi$ regarded as an algebra (symmetric on even elements and skew on odd ones; no compatibility with the coproduct is assumed nor needed).

The Gerstenhaber bracket on $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ may be described as $[f, g] = f \circ \bar g - g \circ \bar f$, where $\bar f$ and $\bar g$ are the coderivations corresponding to $f$ and $g$. In this notation the “Gerstenhaber comp” operation may be defined by $f \circledcirc g = f \circ \bar g$, for $f, g \in \mathrm{Hom}(\Lambda^*\Phi, \Phi)$. Thus an alternative notation for the Lie bracket on $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ is $[f, g] = f \circledcirc g - g \circledcirc f$.
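As an illustration only, the index set in the sum defining $\bar h$ can be enumerated by machine. The following Python sketch lists the $(p, n-p)$-unshuffles of $\{1, \ldots, n\}$ and computes the sign of a permutation; the function names `unshuffles` and `sign` are hypothetical helpers introduced here and do not come from the text.

```python
from itertools import combinations


def unshuffles(p, n):
    """Yield every (p, n-p)-unshuffle of {1, ..., n} in one-line notation.

    An unshuffle is fixed by choosing which p values occupy the first p
    slots; both blocks are then listed in increasing order.
    """
    for head in combinations(range(1, n + 1), p):
        tail = tuple(i for i in range(1, n + 1) if i not in head)
        yield head + tail


def sign(perm):
    """Return the sign of a permutation in one-line notation (inversion count)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1
```

For example, `list(unshuffles(2, 3))` gives `[(1, 2, 3), (1, 3, 2), (2, 3, 1)]`, so for $p = 2$, $n = 3$ the sum defining $\bar h$ has three terms. In general there are $\binom{n}{p}$ unshuffles.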
3. A Preliminary Result

Now let $\Xi$ and $\Phi$ be arbitrary $A$-modules. In the Yang–Mills example, the map $\delta$ takes gauge parameters to covariant derivatives. In generalizing that, we suppose that we are given a $k$-linear map $\delta : \Xi \to \mathrm{Hom}(\Lambda^*\Phi, \Phi)$. Formally, we can write $\delta(\xi) = \sum_{i \ge 0} T_i(\xi)$, where $T_i$ is 0 except on $\Lambda^i\Phi$. (This $T_i$ is equivalent to the $T_i$ of Berends, Burgers and van Dam.) We extend $\delta$ to a map $\hat\delta : \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \to \mathrm{Hom}(\Lambda^*\Phi, \Phi)$ by
\[ \hat\delta(\pi) = ev \circ (\delta \circ \pi \otimes 1) \circ \Delta, \]
where $ev$ is the evaluation map. That is,
\[ \hat\delta(\pi)(\phi_1 \wedge \cdots \wedge \phi_n) = \sum_{\{\mathrm{unshuff}\}} \delta(\pi(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)}))(\phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)}). \]
We may think of $\Xi$ as being contained in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ by identifying $\xi \in \Xi$ with the map, also denoted $\xi$, in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ which is 0 except on the scalars, where $\xi(1) = \xi$. Note that $\Lambda^*\Phi$ is an $A$-module and $k \subset A$ and so $1 \in k \subset A$. We will be careful to distinguish $k$-linear maps from $A$-linear ones as the need occurs. It is easy to see that $\hat\delta(\xi) = \delta(\xi)$.

Our problem concerns possible algebraic structure on $\Xi$; consequently we consider the possibility of constructing a Lie-type bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ via the mapping $\hat\delta$. Under certain conditions, such a bracket may then be used to obtain a bracket on the parameter space $\Xi$ defined by restricting the induced bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ to the parameter space $\Xi$. With this in mind, define
\[ [\pi_1, \pi_2] := \pi_1 \circ \overline{\hat\delta(\pi_2)} - \pi_2 \circ \overline{\hat\delta(\pi_1)}, \]
for $\pi_1, \pi_2 \in \mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$. It turns out that this bracket does not generally satisfy the Jacobi identity. Moreover, if we choose $\pi_1 = \xi$, $\pi_2 = \eta \in \Xi$, then
\[ [\xi, \eta] = \xi \circledcirc \hat\delta(\eta) - \eta \circledcirc \hat\delta(\xi) = 0, \]
and as a result, the restriction of the induced bracket to $\Xi$ yields an abelian Lie algebra structure. In many cases of interest, the parameter space $\Xi$ has an a priori nonabelian Lie algebra structure on it and we would certainly want the Lie structure on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ to reproduce this structure when restricted to the parameter space $\Xi$.

In order to assure the Jacobi property of the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$, we introduce a correction term. We accomplish this, following Berends, Burgers and van Dam, by assuming that there is a map $C : \Xi \otimes \Xi \to \mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ such that
\[ [\delta(\xi), \delta(\eta)] = \hat\delta C(\xi, \eta) \in \mathrm{Hom}(\Lambda^*\Phi, \Phi) \]
for all $\xi, \eta \in \Xi$. We will refer to this as the BBvD hypothesis. Extend $C$ to a mapping
\[ \hat C : \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \otimes \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \to \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \]
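by
\[ \hat C(\pi_1, \pi_2) = C \circ ((\pi_1 \otimes \pi_2) \otimes 1) \circ (\Delta \otimes 1) \circ \Delta, \]
where we have identified $C$ with its adjoint mapping, which is the mapping from $\Xi \otimes \Xi \otimes \Lambda^*\Phi$ into $\Xi$ defined by $(\xi, \eta, \phi_1 \wedge \cdots \wedge \phi_n) \mapsto C(\xi, \eta)(\phi_1 \wedge \cdots \wedge \phi_n)$. Next, we redefine the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ given above by including the correction term $\hat C$:
\[ [\pi_1, \pi_2] := \pi_1 \circledcirc \hat\delta(\pi_2) - \pi_2 \circledcirc \hat\delta(\pi_1) + \hat C(\pi_1, \pi_2). \]

Theorem 1. The mapping $\hat\delta$ preserves brackets; that is, $\hat\delta[\pi_1, \pi_2] = [\hat\delta(\pi_1), \hat\delta(\pi_2)]$. Moreover, if $\hat\delta : \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \to \mathrm{Hom}(\Lambda^*\Phi, \Phi)$ is injective, then $[\pi_1, \pi_2]$ satisfies the Jacobi identity.

Proof. Observe that if $\pi_1, \pi_2 \in \mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$,
\begin{align*}
\hat\delta(\pi_1) \circledcirc \hat\delta(\pi_2)
&= \hat\delta(\pi_1) \circ \overline{\hat\delta(\pi_2)} \\
&= ev \circ [(\delta \circ \pi_1) \otimes 1] \circ \Delta \circ \overline{\hat\delta(\pi_2)} \\
&= ev \circ [(\delta \circ \pi_1) \otimes 1] \circ \{(\overline{\hat\delta(\pi_2)} \otimes 1) + (1 \otimes \overline{\hat\delta(\pi_2)})\} \circ \Delta \\
&= ev \circ [((\delta \circ \pi_1) \circ \overline{\hat\delta(\pi_2)}) \otimes 1] \circ \Delta + ev \circ [(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}] \circ \Delta \\
&= \hat\delta(\pi_1 \circ \overline{\hat\delta(\pi_2)}) + ev \circ [(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}] \circ \Delta \\
&= \hat\delta(\pi_1 \circledcirc \hat\delta(\pi_2)) + ev \circ [(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}] \circ \Delta.
\end{align*}
It follows that
\[ [\hat\delta(\pi_1), \hat\delta(\pi_2)] = \hat\delta(\pi_1 \circledcirc \hat\delta(\pi_2)) - \hat\delta(\pi_2 \circledcirc \hat\delta(\pi_1)) + E, \]
where
\[ E = ev \circ [(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}] \circ \Delta - ev \circ [(\delta \circ \pi_2) \otimes \overline{\hat\delta(\pi_1)}] \circ \Delta. \]
This says that $E$ measures the deviation of $\hat\delta$ from being a $\mathrm{Coder}(\Lambda^*\Phi)$-module map. We must show that $E$ is in the image of $\hat\delta$. Recall that for $f \in \mathrm{Hom}(\Lambda^*\Phi, \Phi)$, we have $\bar f = m \circ (f \otimes 1) \circ \Delta$, where $m$ denotes the algebra (wedge) product on $\Lambda^*\Phi$. Thus
\begin{align*}
(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}
&= (\delta \circ \pi_1) \otimes \{m \circ (\hat\delta(\pi_2) \otimes 1) \circ \Delta\} \\
&= (\delta \circ \pi_1) \otimes \{m \circ ([ev \circ ((\delta \circ \pi_2) \otimes 1)] \otimes 1) \circ (\Delta \otimes 1) \circ \Delta\} \\
&= (\delta \circ \pi_1) \otimes \{m \circ ([ev \circ ((\delta \circ \pi_2) \otimes 1)] \otimes 1) \circ (1 \otimes \Delta) \circ \Delta\}.
\end{align*}
For $F \in \Lambda^*\Phi$, write $\Delta(F) = \sum (F_1 \otimes F_2)$, $\Delta(F_2) = \sum (F_{21} \otimes F_{22})$ and $\Delta(F_{22}) = \sum (F_{221} \otimes F_{222})$. In order to simplify notation we drop the summation symbol wherever the latter coproducts appear below. From our last calculation we have
\[ (ev \circ [(\delta \circ \pi_1) \otimes \overline{\hat\delta(\pi_2)}] \circ \Delta)(F) = [\delta(\pi_1(F_1)) \circledcirc \delta(\pi_2(F_{21}))](F_{22}), \]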
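and
\[ (ev \circ [(\delta \circ \pi_2) \otimes \overline{\hat\delta(\pi_1)}] \circ \Delta)(F) = [\delta(\pi_2(F_1)) \circledcirc \delta(\pi_1(F_{21}))](F_{22}). \]
Because $\Delta$ is cocommutative, the full summations are equal: $\sum F_1 \otimes F_{21} \otimes F_{22} = \sum F_{21} \otimes F_1 \otimes F_{22}$. Thus
\begin{align*}
E(F) &= [\delta(\pi_1(F_1)), \delta(\pi_2(F_{21}))](F_{22}) \\
&= \delta(C(\pi_1(F_1), \pi_2(F_{21}), F_{221}))(F_{222}) \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1)(F_1 \otimes F_{21} \otimes F_{221} \otimes F_{222}) \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1)(F_1 \otimes F_{21} \otimes (\Delta F_{22})) \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1)(([1 \otimes ((1 \otimes \Delta) \circ \Delta)] \circ \Delta)(F)) \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1)(((1 \otimes 1 \otimes \Delta) \circ (1 \otimes \Delta) \circ \Delta)(F)).
\end{align*}
It follows from coassociativity that
\begin{align*}
E &= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1) \circ (\Delta \otimes 1 \otimes 1) \circ (\Delta \otimes 1) \circ \Delta \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1]\} \otimes 1) \circ ([(\Delta \otimes 1) \circ \Delta] \otimes 1) \circ \Delta \\
&= ev \circ (\{\delta \circ C \circ [(\pi_1 \otimes \pi_2) \otimes 1] \circ (\Delta \otimes 1) \circ \Delta\} \otimes 1) \circ \Delta \\
&= ev \circ (\{\delta \circ \hat C(\pi_1, \pi_2)\} \otimes 1) \circ \Delta = \hat\delta(\hat C(\pi_1, \pi_2)).
\end{align*}
Thus $E$ is in the image of $\hat\delta$ and in fact
\[ [\hat\delta(\pi_1), \hat\delta(\pi_2)] = \hat\delta(\pi_1 \circledcirc \hat\delta(\pi_2) - \pi_2 \circledcirc \hat\delta(\pi_1)) + \hat\delta(\hat C(\pi_1, \pi_2)) = \hat\delta([\pi_1, \pi_2]). \]
To verify the Jacobi identity, apply $\hat\delta$ to the Jacobi expression in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$. By the morphism condition just established, the result is the Jacobi identity valid in $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$. Assuming that $\hat\delta$ is injective, the Jacobi identity in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ follows.

This result suggests that the parameter space should be enlarged to include all of $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$. It turns out that the polynomial equations of physical relevance define an sh-Lie structure on an appropriate graded vector space $L$. We consider the sh-Lie formalism briefly in the next section.

4. sh-Lie Algebras

We now review the relationship between sh-Lie algebras ($L_\infty$-algebras) and cocommutative coalgebras [LS93, LM95]. Let $(L, d)$ be a differential graded vector space. If $(L, d)$ is a chain complex (degree $d = -1$), then an sh-Lie structure on $L$ is a collection of skew symmetric linear maps $l_n : L^{\otimes n} \to L$ of degree $n - 2$ that satisfy the relations
\[ \sum_{i+j=n+1} \sum_{\sigma} e(\sigma)(-1)^{\sigma}(-1)^{i(j-1)}\, l_j(l_i(x_{\sigma(1)}, \ldots, x_{\sigma(i)}), x_{\sigma(i+1)}, \ldots, x_{\sigma(n)}) = 0, \]
where $(-1)^{\sigma}$ is the sign of the permutation $\sigma$, $e(\sigma)$ is the sign that arises from the degrees of the permuted elements and $\sigma$ is taken over all $(i, n-i)$ unshuffles.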
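To make the relations concrete, here is a minimal Python sketch of the case $n = 3$ under simplifying assumptions that are ours, not the paper's: $L$ is concentrated in degree 0 (so $e(\sigma) = 1$), $l_1 = 0$ and $l_3 = 0$, and $l_2$ is the cross product on $\mathbb{R}^3$. Only $i = j = 2$ contributes, the factor $(-1)^{i(j-1)}$ equals $+1$, and the relation reduces to the Jacobi identity in the form $[[x_1, x_2], x_3] - [[x_1, x_3], x_2] + [[x_2, x_3], x_1] = 0$. The names `unshuffles`, `sign`, `cross` and `sh_lie_relation_n3` are hypothetical.

```python
from itertools import combinations


def unshuffles(i, n):
    """Yield the (i, n-i)-unshuffles of {1, ..., n} in one-line notation."""
    for head in combinations(range(1, n + 1), i):
        yield head + tuple(k for k in range(1, n + 1) if k not in head)


def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1


def cross(a, b):
    """Cross product on 3-vectors; plays the role of the bracket l_2."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sh_lie_relation_n3(l2, xs):
    """Left-hand side of the n = 3 relation when only l_2 is non-zero.

    Sums sign(sigma) * l2(l2(x_s1, x_s2), x_s3) over the (2, 1)-unshuffles.
    """
    total = (0, 0, 0)
    for sigma in unshuffles(2, 3):
        a, b, c = (xs[s - 1] for s in sigma)
        term = l2(l2(a, b), c)
        total = tuple(t + sign(sigma) * u for t, u in zip(total, term))
    return total
```

With integer inputs the arithmetic is exact, so `sh_lie_relation_n3(cross, [(1, 2, 3), (4, 5, 6), (7, 8, 10)])` evaluates to the zero vector, as the Jacobi identity for the cross product requires.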
If $l_n = 0$ for $n \ge 3$, this is just the description of a dg Lie algebra. If $(L, d)$ is a cochain complex (degree $d = +1$), then the sh-Lie structure on $L$ is given by skew symmetric linear maps $l_n : L^{\otimes n} \to L$ of degree $2 - n$ that satisfy the same relations.

Let $\uparrow L$ denote the suspension of the graded vector space $L$; i.e. $\uparrow L$ is the graded vector space with $(\uparrow L)_n = L_{n-1}$; similarly, let $\downarrow L$ denote the desuspension of $L$; i.e. $(\downarrow L)_n = L_{n+1}$. One may then describe an sh-Lie structure on the chain complex $(L, d)$ by a coderivation $D$ of degree $-1$ on the coalgebra $\Lambda^*(\uparrow L)$ such that $D^2 = 0$; similarly, an sh-Lie structure on the cochain complex $(L, d)$ is a coderivation $D$ of degree $+1$ on the coalgebra $\Lambda^*(\downarrow L)$ such that $D^2 = 0$. Equivalently, the sh-Lie structure may be described by a linear mapping $D : \Lambda^*(\downarrow L) \to (\downarrow L)$ such that $D \circ \bar D = 0$. The proof of the assertion for chain complexes may be found in [LS93] and [Sta93]; a proof for cochain complexes can be formulated by a straightforward modification of the proof for chain complexes.

5. The Gauge Algebra is an sh-Lie Algebra

We now restrict our attention to the constant maps in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ and show that our algebraic structure on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ induces an sh-Lie structure on the graded space $L = \{\Xi, \Phi\}$. Throughout this section, we assume the BBvD hypothesis and that $\hat\delta$ is injective, so Theorem 1 holds and consequently the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ defined by
\[ [\pi_1, \pi_2] := \pi_1 \circledcirc \hat\delta(\pi_2) - \pi_2 \circledcirc \hat\delta(\pi_1) + \hat C(\pi_1, \pi_2) \]
satisfies the Jacobi identity. By definition,
\[ [\delta(\xi), \delta(\eta)] = \delta(\xi) \circledcirc \delta(\eta) - \delta(\eta) \circledcirc \delta(\xi), \]
while the definition of $C$ gives
\[ [\delta(\xi), \delta(\eta)] = \hat\delta C(\xi, \eta) \in \mathrm{Hom}(\Lambda^*\Phi, \Phi), \]
so our commutator relation is
\[ \delta(\xi) \circledcirc \delta(\eta) - \delta(\eta) \circledcirc \delta(\xi) = \hat\delta(C(\xi, \eta)). \]
The definition of the bracket in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ restricted to constant maps takes on the form
\[ [\xi_1, \xi_2] = C(\xi_1, \xi_2). \]
Consequently, the Jacobi identity takes on the form
\[ [C(\xi_1, \xi_2), \xi_3] - [C(\xi_1, \xi_3), \xi_2] + [C(\xi_2, \xi_3), \xi_1] = 0. \]
Let us examine the first term:
\begin{align*}
[C(\xi_1, \xi_2), \xi_3] &= C(\xi_1, \xi_2) \circledcirc \delta(\xi_3) - \xi_3 \circledcirc \hat\delta C(\xi_1, \xi_2) + \hat C(C(\xi_1, \xi_2), \xi_3) \\
&= C(\xi_1, \xi_2) \circledcirc \delta(\xi_3) + \hat C(C(\xi_1, \xi_2), \xi_3)
\end{align*}
because $\xi_3 \circledcirc \hat\delta C(\xi_1, \xi_2) = 0$ as $\xi_3$ is a constant map (non-zero only on scalars). We now add together the results from the remaining two terms and write the Jacobi relation as
\begin{align*}
& C(\xi_1, \xi_2) \circledcirc \delta(\xi_3) - C(\xi_1, \xi_3) \circledcirc \delta(\xi_2) + C(\xi_2, \xi_3) \circledcirc \delta(\xi_1) \\
& \quad + \hat C(C(\xi_1, \xi_2), \xi_3) - \hat C(C(\xi_1, \xi_3), \xi_2) + \hat C(C(\xi_2, \xi_3), \xi_1) = 0.
\end{align*}
For the sh-Lie structure, we first combine the fields and gauge parameters to form a single differential graded vector space $L$.
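Definition 1. The underlying dg vector space $L$ of the sh-Lie algebra has $\Xi$ in degree 0, $\Phi$ in degree 1 and 0 in all other degrees. The differential $\partial : \Xi \to \Phi$ is given by $\partial(\xi) = \delta(\xi)(1) \in \Phi$.

Theorem 2. The linear map
\[ D : \Lambda^*(\downarrow L) \to\ \downarrow L \]
given by
\begin{align*}
D(\xi) &= \partial(\xi), \\
D(\xi \wedge \phi_1 \wedge \cdots \wedge \phi_n) &= \delta(\xi)(\phi_1 \wedge \cdots \wedge \phi_n) \quad \text{for } n \ge 1, \\
D(\xi_1 \wedge \xi_2 \wedge \phi_1 \wedge \cdots \wedge \phi_n) &= C(\xi_1, \xi_2)(\phi_1 \wedge \cdots \wedge \phi_n),
\end{align*}
and $D = 0$ on elements of $\Lambda^*(\downarrow L)$ with more than two entries from $\Xi$ or with no entry from $\Xi$, gives $L$ the structure of an sh-Lie algebra.

Remark. Recall that we have assumed as hypothesis for this theorem that the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ satisfies the Jacobi identity. According to Theorem 1 this is true if $\hat\delta$ is injective. It is not difficult to prove that $\hat\delta$ is injective whenever $\delta$ is injective. If we replace the original parameter space $\Xi$ with the new parameter space $\Xi/\ker(\delta)$, one has the sh-Lie structure obtained in the proof below.

Proof. We need only evaluate $D \circ \bar D$ on elements of the form $(\xi_1 \wedge \xi_2 \wedge \phi_1 \wedge \cdots \wedge \phi_n)$ and $(\xi_1 \wedge \xi_2 \wedge \xi_3 \wedge \phi_1 \wedge \cdots \wedge \phi_n)$. We begin with
\begin{align*}
D \circ \bar D(\xi_1 \wedge \xi_2 \wedge \phi_1 \wedge \cdots \wedge \phi_n)
= D\Big\{ & \sum_{\sigma} \delta(\xi_1)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(i)}) \wedge \xi_2 \wedge \phi_{\sigma(i+1)} \wedge \cdots \wedge \phi_{\sigma(n)} \\
& - \sum_{\tau} \delta(\xi_2)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(j)}) \wedge \xi_1 \wedge \phi_{\tau(j+1)} \wedge \cdots \wedge \phi_{\tau(n)} \\
& + \sum_{\rho} C(\xi_1, \xi_2)(\phi_{\rho(1)} \wedge \cdots \wedge \phi_{\rho(k)}) \wedge \phi_{\rho(k+1)} \wedge \cdots \wedge \phi_{\rho(n)} \Big\},
\end{align*}
where $\sigma$, $\tau$ and $\rho$ are the evident unshuffles. This composition is equal to
\begin{align*}
D\Big\{ & \sum_{\sigma} \xi_2 \wedge \delta(\xi_1)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(i)}) \wedge \phi_{\sigma(i+1)} \wedge \cdots \wedge \phi_{\sigma(n)} \\
& - \sum_{\tau} \xi_1 \wedge \delta(\xi_2)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(j)}) \wedge \phi_{\tau(j+1)} \wedge \cdots \wedge \phi_{\tau(n)} \\
& + \sum_{\rho} C(\xi_1, \xi_2)(\phi_{\rho(1)} \wedge \cdots \wedge \phi_{\rho(k)}) \wedge \phi_{\rho(k+1)} \wedge \cdots \wedge \phi_{\rho(n)} \Big\} \\
= {} & \sum_{\sigma} \delta(\xi_2)(\delta(\xi_1)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(i)}) \wedge \phi_{\sigma(i+1)} \wedge \cdots \wedge \phi_{\sigma(n)}) \\
& - \sum_{\tau} \delta(\xi_1)(\delta(\xi_2)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(j)}) \wedge \phi_{\tau(j+1)} \wedge \cdots \wedge \phi_{\tau(n)}) \\
& + \sum_{\rho} \delta\big(C(\xi_1, \xi_2)(\phi_{\rho(1)} \wedge \cdots \wedge \phi_{\rho(k)})\big)(\phi_{\rho(k+1)} \wedge \cdots \wedge \phi_{\rho(n)})
\end{align*}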
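which is equal to 0 by the commutator relation. For the terms of the form $(\xi_1 \wedge \xi_2 \wedge \xi_3 \wedge \phi_1 \wedge \cdots \wedge \phi_n)$, the only unshuffles that we need to consider are those that result in terms of the form $(\xi_i \wedge \phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)} \wedge \xi_j \wedge \xi_k \wedge \phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)})$ with $j < k$ and $(\xi_i \wedge \xi_j \wedge \phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(q)} \wedge \xi_k \wedge \phi_{\tau(q+1)} \wedge \cdots \wedge \phi_{\tau(n)})$ with $i < j$. Recall that when $i = 2$ in the first term and when $j = 3$, $k = 2$ in the second term, a coefficient of $-1$ must be introduced. So we have
\begin{align*}
D \circ \bar D(\xi_1 \wedge \xi_2 \wedge \xi_3 \wedge \phi_1 \wedge \cdots \wedge \phi_n)
= {} & D\Big\{ \sum_{\sigma} \delta(\xi_i)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)}) \wedge \xi_j \wedge \xi_k \wedge \phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)} \\
& \quad + \sum_{\tau} C(\xi_i, \xi_j)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(q)}) \wedge \xi_k \wedge \phi_{\tau(q+1)} \wedge \cdots \wedge \phi_{\tau(n)} \Big\} \\
= {} & D\Big\{ \sum_{\sigma} \xi_j \wedge \xi_k \wedge \delta(\xi_i)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)}) \wedge \phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)} \\
& \quad + \sum_{\tau} C(\xi_i, \xi_j)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(q)}) \wedge \xi_k \wedge \phi_{\tau(q+1)} \wedge \cdots \wedge \phi_{\tau(n)} \Big\} \\
= {} & \sum_{\sigma} C(\xi_i, \xi_j)(\delta(\xi_k)(\phi_{\sigma(1)} \wedge \cdots \wedge \phi_{\sigma(p)}) \wedge \phi_{\sigma(p+1)} \wedge \cdots \wedge \phi_{\sigma(n)}) \\
& + \sum_{\tau} C(C(\xi_i, \xi_j)(\phi_{\tau(1)} \wedge \cdots \wedge \phi_{\tau(q)}), \xi_k)(\phi_{\tau(q+1)} \wedge \cdots \wedge \phi_{\tau(n)})
\end{align*}
which, after expanding the $i, j, k$ terms of the unshuffles along with the signs mentioned above, is seen to equal the Jacobi relation, and hence is equal to 0.

6. The Classical Strict Lie Case

We examine the classical case in which $\Xi$ is a Lie algebra and $\Phi$ is a Lie module over $\Xi$. Let us denote the action of $\Xi$ on $\Phi$ by $\xi \cdot \phi$. We assume that we have a linear map $\partial : \Xi \to \Phi$ that interacts with the Lie module structure as follows:
\[ \partial[\xi, \eta]_\Xi = \xi \cdot (\partial\eta) - \eta \cdot (\partial\xi), \]
where we have denoted the Lie bracket on $\Xi$ by $[\cdot, \cdot]_\Xi$. As usual the Lie bracket on $L = \Xi \oplus \Phi$ is given by
\[ [x, y]_L = \begin{cases} [x, y]_\Xi & \text{for } x, y \in \Xi \\ x \cdot y & \text{for } x \in \Xi,\ y \in \Phi \\ 0 & \text{for } x, y \in \Phi. \end{cases} \tag{1} \]
Similarly denote the Lie bracket on $\mathrm{Hom}(\Lambda^*\Phi, \Phi)$ by $[\cdot, \cdot]_{\mathrm{Hom}(\Phi)}$ (see Sect. 2) and on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ by $[\cdot, \cdot]_{\mathrm{Hom}(\Xi)}$ (see Sect. 3).

Notice that this case is typical of the gauge structure which arises in fundamental physical theories such as Yang–Mills theory and basic gravitational theories. For the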
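Yang–Mills case, the parameter space $\Xi$ is the set of all smooth functions from the space-time $M$ into the Lie algebra $\mathfrak{g}$ of the structure group $G$ of the theory (for convenience of exposition, we assume that the principal bundle of the theory is trivial). The Lie bracket on the parameter space is the point-wise bracket of two such parameters. The fields of Yang–Mills theory are $\mathfrak{g}$-valued one-forms on $M$. Note that Berends, Burgers and van Dam denote the gauge transformation action of $\Xi$ on $\Phi$ by $\{A, \Lambda\}$ (for $A \in \Phi$, $\Lambda \in \Xi$) rather than the notation $\Lambda \cdot A$ used above. In this case, this action is simply the covariant derivative of $\Lambda$ relative to the connection $A$.

Similarly, when the Einstein-Hilbert action is utilized, the parameter space is the Lie algebra of all vector fields $\xi$ on the space-time manifold $M$. Again, in Berends, Burgers and van Dam, the background metric $\eta$ (Minkowski) is presumed and general metrics are written in the form $\eta + h$ for an appropriate symmetric tensor $h$. Thus the fields of the theory are symmetric tensors $h$. The action of a parameter $\xi$ on a field $h$ is the Lie derivative of $h$ relative to the vector field $\xi$. The function $\delta$ is given by
\[ (\delta h)_{\mu\nu} = \partial_\mu \xi_\nu + \partial_\nu \xi_\mu + [(\partial_\rho h_{\mu\nu})\xi^\rho - h_{\rho\mu}(\partial^\rho \xi_\nu) - h_{\rho\nu}(\partial^\rho \xi_\mu)]. \]
Details of these two standard examples may be found in Burgers’ dissertation [Bur85].

Notice that using a bracket notation, $[\xi, \phi]_L := \xi \cdot \phi$, for the action similar to that in Berends, Burgers and van Dam, the requirement that the bracket be a chain map with respect to $\partial$ is simply
\[ \partial[\xi, \eta]_L = [\xi, \partial\eta]_L - [\eta, \partial\xi]_L. \]
(We already require that $[\cdot, \cdot]_L$ restricts to $[\cdot, \cdot]_\Xi$.) Let us define the “gauge transformation” $\delta : \Xi \to \mathrm{Hom}(\Lambda^*\Phi, \Phi)$ by
\[ \delta(\xi)(\phi) = \begin{cases} \partial\xi & \text{for } \phi = 1 \\ \xi \cdot \phi & \text{for } \phi \in \Lambda^1\Phi = \Phi \\ 0 & \text{for } \phi = \phi_1 \wedge \cdots \wedge \phi_n \in \Lambda^n\Phi,\ n > 1. \end{cases} \tag{2} \]
Extend $\delta$ to $\hat\delta : \mathrm{Hom}_k(\Lambda^*\Phi, \Xi) \to \mathrm{Hom}(\Lambda^*\Phi, \Phi)$ by
\[ \hat\delta(\pi)(\phi) = \delta(\pi(\phi_1))(\phi_2) = \begin{cases} \partial\pi(\phi) & \text{for } \phi_2 = 1 \\ \pi(\phi_1) \cdot \phi_2 & \text{for } \phi_2 \in \Lambda^1\Phi = \Phi \\ 0 & \text{otherwise.} \end{cases} \tag{3} \]
Here $1 \in k \subset A$ while $\phi$ denotes an arbitrary element of $\Lambda^*\Phi$ and $\Delta(\phi) = \sum \phi_1 \otimes \phi_2$. The canonical bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ that is induced by $\hat\delta$ and defined below will not satisfy the Jacobi identity in general. This bracket is given by
\[ [\pi_1, \pi_2]_{\mathrm{Hom}(\Xi)}(\phi) = \pi_1 \circ \overline{\hat\delta(\pi_2)}(\phi) - \pi_2 \circ \overline{\hat\delta(\pi_1)}(\phi). \]
Here, $\overline{\hat\delta(\pi)}(\phi) = \hat\delta(\pi)(\phi_1) \wedge \phi_2 = \delta(\pi(\phi_{11}))(\phi_{12}) \wedge \phi_2$, and
\[ \delta(\pi(\phi_{11}))(\phi_{12}) \wedge \phi_2 = \begin{cases} \partial(\pi(\phi_1)) \wedge \phi_2 & \text{if } \phi_{12} = 1 \\ \pi(\phi_{11}) \cdot \phi_{12} \wedge \phi_2 & \text{if } \phi_{12} \in \Lambda^1\Phi = \Phi \\ 0 & \text{otherwise.} \end{cases} \tag{4} \]
In particular, if $\pi(\phi) = \xi(\phi)$ is defined to be the map with value $\xi$ when $\phi = 1 \in k$ and 0 otherwise, then for $\phi = (\phi_1 \wedge \phi_2)$,
\[ \overline{\hat\delta(\xi)}(\phi) = \begin{cases} \partial\xi \wedge \phi_2 = \partial\xi \wedge \phi & \text{if } \phi_1 = 1 \\ (\xi \cdot \phi_1) \wedge \phi_2 & \text{if } \phi_1 \in \Lambda^1\Phi = \Phi \\ 0 & \text{otherwise} \end{cases} \tag{5} \]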
and so in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$, the bracket
\[ [\xi, \eta]_{\mathrm{Hom}(\Xi)}(\phi) = (\xi \circ \overline{\hat\delta(\eta)})(\phi) - (\eta \circ \overline{\hat\delta(\xi)})(\phi) = 0 \]
because the coderivations in the definition of the bracket have image in $\Lambda^n\Phi$ with $n > 0$. It is important to note that the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ does not restrict to the original bracket on $\Xi$ except in the abelian case; we must introduce the “correction” term $C$.

We continue with our construction and introduce the map $C : \Xi \otimes \Xi \to \mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ by defining $C(\xi, \eta)(\phi) = [\xi, \eta]_\Xi$ if $\phi = 1$ and 0 otherwise. Here, $[\cdot, \cdot]_\Xi$ is the original Lie bracket on $\Xi$. Next, we must check that $\hat\delta C(\xi, \eta) = [\hat\delta(\xi), \hat\delta(\eta)]_{\mathrm{Hom}(\Phi)}$ (notation as follows Eq. (1)). So for $\phi \in \Lambda^*\Phi$, we have
\[ \hat\delta C(\xi, \eta)(\phi) = \begin{cases} \partial[\xi, \eta]_\Xi & \text{if } \phi = 1 \\ [\xi, \eta]_\Xi \cdot \phi & \text{if } \phi \in \Lambda^1\Phi = \Phi \\ 0 & \text{otherwise.} \end{cases} \tag{6} \]
On the other hand, we have
\begin{align*}
[\hat\delta(\xi), \hat\delta(\eta)]_{\mathrm{Hom}(\Phi)}(\phi) &= (\hat\delta(\xi) \circ \overline{\hat\delta(\eta)} - \hat\delta(\eta) \circ \overline{\hat\delta(\xi)})(\phi) \\
&= \begin{cases} \hat\delta(\xi)(\partial\eta \wedge \phi_2) - \hat\delta(\eta)(\partial\xi \wedge \phi_2) & \text{if } \phi_1 = 1 \\ \hat\delta(\xi)((\eta \cdot \phi_1) \wedge \phi_2) - \hat\delta(\eta)((\xi \cdot \phi_1) \wedge \phi_2) & \text{if } \phi_1 \in \Lambda^1\Phi = \Phi \\ 0 & \text{otherwise.} \end{cases} \tag{7}
\end{align*}
The first term is non-zero only if $\phi_2 = 1$, in which case $\phi = 1$ and we have $\xi \cdot (\partial\eta) - \eta \cdot (\partial\xi)$, which is equal to $\partial[\xi, \eta]_\Xi$ by our original assumption on $\partial$. The second term is non-zero only for $\phi_2 = 1$ and $\phi_1 \in \Lambda^1\Phi$ and is then equal to $\xi \cdot (\eta \cdot \phi) - \eta \cdot (\xi \cdot \phi)$, which in turn is equal to $[\xi, \eta]_\Xi \cdot \phi$ by the Lie module action of $\Xi$ on $\Phi$. Thus the BBvD hypothesis is satisfied.

Now we apply our Theorem 2 above to impose an sh-Lie structure on the graded vector space $L = \{L_n\}$ with $L_0 = \Xi$, $L_1 = \Phi$ and $L_n = 0$ otherwise. It is easy to see that our construction gives back the usual Lie algebra structure on the graded vector space $L$, the semi-direct product of the Lie algebra $\Xi$ and the module $\Phi$.

7. On Shell Gauge Symmetries

Up to this point we have focused primarily on unravelling the algebraic structure implicit in the BBvD hypothesis. This hypothesis is trivially satisfied for classical physical theories such as general relativity and Yang–Mills theories in the sense that the gauge symmetries of these physical theories satisfy the strict Lie version discussed in Sect. 6. On the other hand, the BBvD hypothesis appears to be precisely the condition satisfied by the symmetries of “free differential algebras” which are useful in a careful description of the Sohnius–West model of supergravity, see for example [CDF91] and [CP95]. (Physicists refer to “free differential algebras” meaning differential graded commutative algebras which are free as graded commutative algebras.) Note that the latter paper shows that “free differential algebras” satisfy the BBvD hypothesis (see Eq. 4.16 in [CP95])
without any extra terms that vanish on shell. Consequently, some analysis such as the one developed in Sect. 5 is required for a full understanding of the algebraic structure of these transformations.

Field dependent gauge symmetries appear in other field theories as well, including the class due to Ikeda [Ike94] and Schaller and Strobl [SS94] and employed by Cattaneo and Felder [CF99] to implement Kontsevich’s deformation quantization [Kon97] referred to above. These field symmetries do not satisfy the BBvD hypothesis as we have described it above, but rather satisfy the BBvD hypothesis “on shell”. In this section we outline how our work may be generalized so that in the next section we can show how to apply it to such field theories, illustrating this in terms of one due to Ikeda.

First we explain what is meant when one says that a condition holds “on shell”. In essence one means that the condition holds not for all the fields of the physical theory, but rather that it holds only for those fields which satisfy the field equations. In all the theories of interest here, the field equations are Euler-Lagrange equations. Such equations are obtained from the Lagrangian of the physical theory. In our case we assume that the Lagrangian is a polynomial in the components of both the fields and their derivatives. These components may be regarded as smooth functions on the space-time manifold $M$ and consequently the Lagrangian is a mapping from the space $\Phi_0$ of physical fields into $C^\infty M$ such that $L(\phi) = P_L(\phi^a, \partial_I \phi^a)$, where $P_L(u^a, u^a_I)$ is a polynomial over $C^\infty M$ in the indeterminates $u^a, u^a_I$ and where $\phi^a$ are the components of a typical field in $\Phi_0$ ($I$ is a symmetric multi-index). In that which follows, we identify $u^a$ with $u^a_I$ where $I$ is empty. Similarly $\phi^a = \partial_I \phi^a$, where $I$ is empty. The “action” of the physical theory is then the integral of the Lagrangian over the space-time manifold $M$.
All of the theories discussed in Berends, Burgers and van Dam, the supergravity example mentioned above, and the example due to Ikeda, discussed more fully below, are polynomial Lagrangian field theories in the sense we have described above. The Euler operator $E_a$ applied to the Lagrangian $L$ produces the Euler–Lagrange differential operator $E_a L$ which acts on fields via
\[ E_a L(\phi) = (-1)^{|I|}\, \partial_I \left( \frac{\partial P_L}{\partial u^a_I}(\partial_J \phi^b) \right). \]
Since the Lagrangian is polynomial in the components $\{\phi^a\}$ of the fields and their derivatives $\{\partial_I \phi^a\}$, the Euler-Lagrange differential operator is also a mapping from $\Phi_0$ into $C^\infty M$ which factors through an appropriate polynomial over $C^\infty M$.

Observe that each homogeneous polynomial $P(u^a_I)$ of degree $k$ uniquely defines a symmetric multi-linear mapping $\beta$ from $U_1 \times U_2 \times \cdots \times U_k$ into polynomials in $\bigcup_{i=1}^k U_i$ such that $P(u^a_I) = \beta(u^a_I, u^a_I, \cdots, u^a_I)$ for appropriate indeterminates $U_i = \{u(i)^a_I\}$. The polynomial $P_L$ is a sum of homogeneous terms, each of which can be recovered from an appropriate symmetric multi-linear mapping by evaluating the multi-linear mapping on the diagonal. Consequently, each Lagrangian $L$ uniquely identifies an element $\beta^L = \sum_i \beta_i^L$, where $\beta_i^L \in \mathrm{Hom}(\wedge^i_{C^\infty M}\, \partial\Phi_0, C^\infty M)$, such that
\[ L(\phi) = \sum_i \beta_i^L(\partial_I \phi^a, \partial_I \phi^a, \cdots, \partial_I \phi^a), \]
where $\partial\Phi_0$ denotes the vector space of the components of the fields and their derivatives. We refer to this identification as polarization and will be more precise in our algebraic formulation below. Similarly the Euler–Lagrange differential operator admits an analogous polarization.

It is probably useful to establish a dictionary relating our algebraic approach to field theory to more usual approaches. The algebra $A$ is identified with the algebra $C^\infty M$ of smooth functions on the space-time $M$ and $\Phi_0$ with the space of all physical fields of the theory. This space of fields in simple cases is the space of all maps from $M$ into a finite-dimensional vector space $W$. The module $\Phi$ is an algebraic way of formulating “jets” of fields and $\partial$ is the map which assigns the jet $\partial\phi = (\partial_I \phi^a) e^I_a$ to the field $\phi = \phi^a e_a$. Elements of $\mathrm{Hom}(\wedge^*_A \Phi, A)$ are identified with “polynomials” in the fields.

In our algebraic formulation, we let $A$ denote any commutative associative algebra and let $\Phi_0$ denote an arbitrary $A$-module freely and finitely generated over $A$ with basis $\{e_a\}$. Working locally, we assume the existence of a finite number of derivations $\partial_\mu$ of $A$ which admit extensions as $A$-derivations of $\Phi_0$ in the sense that for each $\mu$, $\partial_\mu(e_a) = 0$ and $\partial_\mu(f\phi) = f\,\partial_\mu \phi + (\partial_\mu f)\phi$, for $f \in A$, $\phi \in \Phi_0$. For each symmetric multi-index $I = (i_1, i_2, \cdots, i_k)$, let $\partial_I = \partial_{i_1} \circ \cdots \circ \partial_{i_k}$ and let $\Phi$ denote the $A$-module freely generated by symbols $\{e^I_a\}$ so that $\Phi = \{\phi^a_I e^I_a \mid \phi^a_I \in A\}$. In this context, $L$ and the Euler–Lagrange differential operators $E_a L$ are identified with their polarizations which are special elements of $\mathrm{Hom}(\wedge^*_A \Phi, A)$, where $\wedge^*_A \Phi$ is the free nilpotent cocommutative coalgebra generated by $\Phi$ over $A$. More precisely, when we say that $L : \Phi_0 \to A$ is a polynomial Lagrangian, we mean that there is a unique $\beta^L \in \mathrm{Hom}(\wedge^*_A \Phi, A)$ such that
\[ L(\phi) = \sum_i \beta_i^L(\partial\phi, \partial\phi, \cdots, \partial\phi), \]
where $\beta_i^L \in \mathrm{Hom}(\wedge^i_A \Phi, A)$ is homogeneous and $\partial$ is the mapping from $\Phi_0$ into $\Phi$ defined by $\partial\phi = \partial_I \phi^a e^I_a$. Here, of course, we mean that $\beta^L = \sum_i \beta_i^L$ where, for each $i$, $\beta_i^L$ can only be nonzero on $\wedge^i_A \Phi$, i.e., for each $i$, $\beta_i^L$ is multilinear and symmetric having the property that when it is evaluated on $\partial\phi \wedge \cdots \wedge \partial\phi$ one obtains precisely that term in $P_L(\phi^a, \partial_I \phi^a)$ of degree $i$. To obtain all the terms in $L(\phi)$, one must sum over all the homogeneous terms which appear in the polynomial $P_L$ which determines $L$. It is possible to recover the mapping $E_a L : \Phi_0 \to A$ from an element of $\mathrm{Hom}(\wedge^*_A \Phi, A)$ in a similar manner. Consequently, in that which follows, we identify $L$ with $\beta^L$ and we regard both $L$ and $E_a L$ as elements of $\mathrm{Hom}(\wedge^*_A \Phi, A)$.

In this formulation, the “shell” is the subset of $\Phi_0$ defined by
\[ \Sigma = \{\phi \in \Phi_0 \mid E_a L(\mathrm{diag}(\partial\phi)) = 0\}, \]
where the diagonal mapping $\mathrm{diag} : \Phi \to \wedge^*\Phi$ is defined by
\[ \mathrm{diag}(\phi) = \sum_p \phi^p, \]
and where $\phi^p = (\phi \wedge \phi \wedge \cdots \wedge \phi) \in \wedge^p_A \Phi$.
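The polarization described above, together with evaluation on the diagonal, can be sketched numerically. The following Python fragment is only an illustration under our own assumptions: the coefficients are rational numbers rather than elements of $C^\infty M$, and the symmetric multilinear map is computed with the standard polarization formula $\beta(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\emptyset \ne S \subseteq \{1, \ldots, k\}} (-1)^{k-|S|} P\big(\sum_{i \in S} v_i\big)$. The name `polarize` and the sample polynomial are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def polarize(P, k):
    """Return the symmetric k-linear map beta with beta(v, ..., v) == P(v).

    P must be a homogeneous polynomial function of degree k acting on
    tuples of numbers; exact rational arithmetic is used throughout.
    """
    def beta(*vs):
        assert len(vs) == k, "beta takes exactly k vector arguments"
        total = Fraction(0)
        for size in range(1, k + 1):
            for subset in combinations(vs, size):
                # Componentwise sum of the chosen vectors.
                s = tuple(sum(c) for c in zip(*subset))
                total += (-1) ** (k - size) * Fraction(P(s))
        return total / factorial(k)
    return beta


# Sample homogeneous polynomial of degree 3 in two indeterminates.
def P(u):
    return u[0] ** 2 * u[1]


beta = polarize(P, 3)
```

Evaluating `beta` on the diagonal, for instance `beta((1, 2), (1, 2), (1, 2))`, returns `P((1, 2))`, which is 2, while off the diagonal `beta((1, 0), (1, 0), (0, 1))` is `Fraction(1, 3)`, in agreement with the explicit polarization $\frac{1}{3}(a_0 b_0 c_1 + a_0 b_1 c_0 + a_1 b_0 c_0)$.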
It is required only that $E_a L \in \mathrm{Hom}(\wedge^*\Phi, A)$ be zero on the diagonal as the restriction of $E_a L$ to the diagonal agrees with the polynomial counterpart of $E_a L$ and it is the zero set of this latter function which defines the solution space of the usual Euler–Lagrange operator. Note that $\Sigma$ is not a subspace of $\wedge^*\Phi$. Define a subspace $\mathcal{I}$ of $\mathrm{Hom}(\wedge^*\Phi, A)$ by
\[ \mathcal{I} = \{f \in \mathrm{Hom}(\wedge^*\Phi, A) \mid f(\mathrm{diag}(\partial\phi)) = 0,\ \phi \in \Sigma\}. \]
Similarly, define a subspace $\mathcal{N}$ of $\mathrm{Hom}(\wedge^*\Phi, \Phi)$ by
\[ \mathcal{N} = \{\nu \in \mathrm{Hom}(\wedge^*\Phi, \Phi) \mid \nu(\mathrm{diag}(\partial\phi)) = 0,\ \phi \in \Sigma\}. \]
We say that $f \in \mathrm{Hom}(\wedge^*\Phi, A)$ and $\nu \in \mathrm{Hom}(\wedge^*\Phi, \Phi)$ vanish “on shell” iff $f$ and $\nu$ are in $\mathcal{I}$ and $\mathcal{N}$, respectively. Elements of $\mathcal{I}$ are “polynomials” such as $E_a L$ which vanish “on shell”. The “polynomials” referred to here are actually mappings from $\Phi_0$ to $A$ which factor through polynomials over $A$ in the indeterminates $\{u^a_I\}$ as in our description of the Lagrangian $L$ above. $\mathrm{Hom}(\wedge^*_A \Phi, \Phi)$ plays the role of vector fields with coefficients from $\mathrm{Hom}(\wedge^*_A \Phi, A)$, and $\mathcal{N}$ plays the role of the space of vector fields whose coefficients vanish on $\Sigma$.

At this point, we generalize the BBvD hypothesis as follows. We say that the $k$-linear mapping $\delta : \Xi \to \mathrm{Hom}(\wedge^*_A \Phi, \Phi)$ satisfies the generalized Berends, Burgers and van Dam hypothesis, denoted gBBvD, iff there exists a skew-symmetric $k$-bilinear mapping $C : \Xi \times \Xi \to \mathrm{Hom}(\wedge^*_A \Phi, \Xi)$ and an extension $\hat\delta$ of $\delta$ to $\mathrm{Hom}(\wedge^*_A \Phi, \Xi)$ such that
\[ [\delta(\xi), \delta(\eta)] - \hat\delta(C(\xi, \eta)) \in \mathcal{N} \]
for all $\xi, \eta \in \Xi$. Thus the BBvD hypothesis of Sect. 3 holds “on shell”. A consequence of this hypothesis is that there exists a skew-symmetric mapping $\nu : \Xi \times \Xi \to \mathcal{N}$ such that
\[ [\delta(\xi), \delta(\eta)] = \hat\delta(C(\xi, \eta)) + \nu(\xi, \eta). \]
Utilizing this mapping $C$, one can define a bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ analogous to that defined before in the presence of the BBvD hypothesis:
\[ [\pi_1, \pi_2] := \pi_1 \circledcirc \hat\delta(\pi_2) - \pi_2 \circledcirc \hat\delta(\pi_1) + \hat C(\pi_1, \pi_2). \]
Injectivity of $\hat\delta$ is not easily obtained and seems to be needed to obtain a proof of the Jacobi identity. Thus, in general the bracket on $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$ will not satisfy the Jacobi identity.

On the other hand, we can use the calculations of the proof of Theorem 1 to show that
\[ [\hat\delta(\pi_1), \hat\delta(\pi_2)] - \hat\delta([\pi_1, \pi_2]) \in \mathcal{N} \]
for all $\pi_1, \pi_2$ in $\mathrm{Hom}_k(\Lambda^*\Phi, \Xi)$. By using the calculations in the proof of Theorem 2, it is easy to show that
\begin{align*}
D \circ \bar D(\xi_1 \wedge \xi_2 \wedge \phi_1 \wedge \cdots \wedge \phi_n)
&= -[\delta(\xi_1), \delta(\xi_2)](\phi_1 \wedge \cdots \wedge \phi_n) + \hat\delta(C(\xi_1, \xi_2))(\phi_1 \wedge \cdots \wedge \phi_n) \\
&= -\nu(\xi_1, \xi_2)(\phi_1 \wedge \cdots \wedge \phi_n)
\end{align*}
and that
\[ D \circ \bar D(\xi_1 \wedge \xi_2 \wedge \xi_3 \wedge \phi_1 \wedge \cdots \wedge \phi_n) = \mathrm{Jacobi}(\xi_1, \xi_2, \xi_3)(\phi_1 \wedge \cdots \wedge \phi_n), \]
where Jacobi(ξ1 , ξ2 , ξ3 ) = ([[ξ1 , ξ2 ], ξ3 ] − [[ξ1 , ξ3 ], ξ2 ] + [[ξ2 , ξ3 ], ξ1 ]). In this latter equation we have used the notation [ξ, η] in place of C(ξ, η). It follows from these equations that D ◦ D is zero “on shell” provided that both the generalized BBvD hypothesis holds and that Jacobi(ξ, η, ζ ) is zero on shell for arbitrary constants ξ, η, ζ ∈ Hom(∧∗A , ). In the example due to Ikeda [Ike94] discussed in detail in Sect. 8, it is easy to prove that Jacobi(ξ1 , ξ2 , ξ3 ) = 0 (not just zero “on shell”) using the equation immediately prior to Eq. 2.10 in his paper. Consequently the gauge symmetries of this Poisson σ -model satisfy the postulates of an sh-Lie algebra “on shell”. 8. A -Model Example In Ikeda’s paper [Ike94], there is a finite dimensional vector space V with basis {TA } which later we will show is the dual of a Poisson manifold. We do this via a generalization of the classical Kirillov–Kostant bracket which exhibits the dual g∗ of a Lie algebra g as a Poisson manifold. In our analysis of Ikeda’s example, our space is the space of maps Maps(, V ) and the space 0 is the set of ordered pairs φ = (ψ, h), where: (1) ψ is a mapping from a given two-dimensional manifold into the dual V ∗ of the vector space V , and (2) h is a mapping from the same manifold to T ∗ ⊗ V , which in fact is required to be a section of the vector bundle T ∗ ⊗ V −→ . These mappings are denoted µ locally by ψ(x) = ψA (x)T A and h(x) = hA µ (dx ⊗ TA ), where {TA } is a basis of V A ∗ and {T } is the basis of V dual to {TA }. For the most part, our exposition follows that of Ikeda, although we use the notation φ = (ψ, h) for the fields of the theory whereas Ikeda’s notation for the fields is (φ, h). We also denote Ikeda’s vector space M by V . As is the case earlier in the paper, the space denotes the A = C ∞ M module whose elements are φIa eaI where {eaI } is a basis of the module and φIa ∈ A. 
This formulation is our algebraic description of the jet bundle of the vector bundle whose sections are the fields $\Phi_0$. Ikeda would denote $\phi^a_I$ as $\partial_I \phi^a$. Observe that $\Phi_0$ may be identified as a subspace of $\Phi$.
There is a parallel development to Ikeda’s work in Cattaneo and Felder [CF99] in which $\Sigma$ is a 2-dimensional disc and the target (denoted by $M$ in Cattaneo and Felder) is an arbitrary Poisson manifold. It is not hard to see that the ordered pairs $(\psi, h)$ of Ikeda may in fact be interpreted in a manner similar to that in the exposition of Cattaneo and Felder, where $\psi : \Sigma \to M$ is an arbitrary smooth mapping ($\psi$ is denoted by $X$ in Cattaneo and Felder) and $h$ is a section (denoted by $\eta$ in Cattaneo and Felder) of the bundle $\psi^*(T^*M) \otimes T^*\Sigma \to \Sigma$ (notice that the factors in their tensor product are reversed from the conventions used in our description of Ikeda’s results). In their exposition the section $h$ may be written as $h(x) = h_{i,\mu}(x)(dx^i \otimes du^\mu)$, where $\{dx^i\}$ is a basis of $T^*_{\psi(x)}M$, which, in the case $M$ is a vector space $V$, may be identified with a fixed basis $\{T^A\}$ of $T^*_0 V = V^*$. When one compares these two approaches, one sees that Ikeda’s target space is the vector space we have called $V^*$ while $\Sigma$ is an arbitrary 2-dimensional manifold, whereas for Cattaneo and Felder $\Sigma$ is a disc $D$ and the target space $M$ is a general
Poisson manifold. The parallel between the two is closer than one might initially expect since Ikeda uses the vector space $V$ to generate a Poisson structure on $V^*$.
Ikeda proceeds to investigate possible gauge symmetries $\delta(c)$ before looking for Lagrangians. The gauge symmetry mapping $\delta$ is defined locally, in this theory, as follows. Let $P$ denote the commutative polynomial algebra generated by the basis $\{T_A\}$. Let $\pi_A$, $\pi^A_\mu$, $\pi^A$ denote the projections defined by $\pi_A(\phi) = \pi_A(\psi, h) = \pi_A(\psi) = \psi_A$, $\pi^A_\mu(\phi) = \pi^A_\mu(\psi, h) = \pi^A_\mu(h) = h^A_\mu$ and $\pi^A(c) = c^A$, respectively. Consider arbitrary polynomials $\{W_{AB}\}$ in $P$ and define the components of $\delta(c)(\phi)$ by
$$\pi^A_\mu(\delta(c)(\phi)) = \partial_\mu c^A + \frac{\partial W_{BD}}{\partial T_A}(\psi)\, h^B_\mu c^D$$
and
$$\pi_A(\delta(c)(\phi)) = W_{BA}(\psi)\, c^B.$$
Here $W_{AB}(\psi)$ is a concise notation for the polynomial $W_{AB}$ evaluated by replacing the generators $\{T_A\}$ by the corresponding components $\{\psi_A\}$ of $\psi$, that is, $W_{AB} = W^a_{AB} T_a$ and $W_{AB}(\psi) = W^a_{AB} \psi_a$, where $a$ is a symmetric multi-index, $\psi_a = \psi_{A_1} \psi_{A_2} \cdots \psi_{A_n}$ and similarly for $T_a$.
Notice that, in case $V$ is a Lie algebra and $W_{AB}(T) = [T_A, T_B] = f^C_{AB} T_C$, the polynomials $W_{AB}$ define the Lie algebra structure on the vector space $V$ with structure constants $\{f^C_{AB}\}$. This then induces a Lie algebra structure on the parameter space $\Xi$ of all mappings $c$ from $\Sigma$ into $V$, as one expects in traditional Yang–Mills theory. In this case, the $\psi$-component of $\delta(c)$ is the coadjoint action of the parameter space $\Xi$ on the space of maps from $\Sigma$ into $V$, while the $h$-component is simply the “covariant derivative” of $c$ relative to the connection defined by the gauge field $h$. Thus, by introducing more general polynomials $W_{AB}$, Ikeda is introducing a generalization of ordinary gauge theory by requiring that the gauge symmetries be defined via the polynomials $W_{AB}$. For this generalization to work, Ikeda imposes restrictions on the polynomials $W_{AB}$ which amount to making $P$ a Lie algebra, hence his terminology of “non-linear Lie algebra”.
In order to obtain an algebraic structure on $P$ analogous to the usual Lie structure required in gauge theory, Ikeda’s bracket is defined on generators of $P$ by $[T_A, T_B] = W_{AB} \in P$ and extended to all of $P$ via the Leibniz rule: $[T_A, \cdot\,]$ and $[\,\cdot, T_B]$ are derivations of the commutative algebra $P$. Ikeda requires that these polynomials satisfy conditions which make $P$ a Poisson algebra. Thus the polynomials $\{W_{AB}\}$ in $P$ are subject to skew-symmetry, $W_{AB} = -W_{BA}$, and an appropriate generalization of the usual coordinate form of the Jacobi condition:
$$W_{AD} \frac{\partial W_{BC}}{\partial T_D} + W_{BD} \frac{\partial W_{CA}}{\partial T_D} + W_{CD} \frac{\partial W_{AB}}{\partial T_D} = 0.$$
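As a consistency check, and assuming the Lie algebra case $W_{AB} = f^C_{AB} T_C$ described above, the following short computation sketches how the generalized Jacobi condition reduces to the ordinary Jacobi identity for the structure constants. The summation convention over repeated indices is assumed throughout.

```latex
% Assumption: W_{AB} is linear in the generators (Lie algebra case).
W_{AB} = f^{C}_{AB}\,T_C
\quad\Longrightarrow\quad
\frac{\partial W_{BC}}{\partial T_D} = f^{D}_{BC}.
% Substituting into the generalized Jacobi condition:
W_{AD}\frac{\partial W_{BC}}{\partial T_D}
+ W_{BD}\frac{\partial W_{CA}}{\partial T_D}
+ W_{CD}\frac{\partial W_{AB}}{\partial T_D}
= \bigl( f^{E}_{AD} f^{D}_{BC} + f^{E}_{BD} f^{D}_{CA} + f^{E}_{CD} f^{D}_{AB} \bigr)\,T_E .
% Since the T_E are independent generators, the condition holds iff
f^{E}_{AD} f^{D}_{BC} + f^{E}_{BD} f^{D}_{CA} + f^{E}_{CD} f^{D}_{AB} = 0,
% which is the coordinate form of [T_A,[T_B,T_C]] + cyclic permutations = 0.
```

For polynomials $W_{AB}$ of higher degree the derivative $\partial W_{BC}/\partial T_D$ is no longer constant, which is why the condition has to be stated in the differential form above.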
To see $V^*$ as a Poisson manifold, we will imbed $V$ in $V^{**}$ as the linear functionals and thus regard the algebra $P$ as the subalgebra of $C^\infty(V^*)$ generated by the basis $\{T_A\}$. Regarding the $T_A$’s as functions on $V^*$, we have a bi-vector field
$$W_{AB}\, \frac{\partial}{\partial T_A} \wedge \frac{\partial}{\partial T_B}$$
on $V^*$. This makes $V^*$ a Poisson manifold with
$$\{f, g\} := W_{AB}\, \frac{\partial f}{\partial T_A} \wedge \frac{\partial g}{\partial T_B}$$
for $f, g \in C^\infty(V^*)$.
Now notice that, using Ikeda’s notation as defined above, we have for each $c$ that $\delta(c)$ is a mapping from $\Phi_0$ to $\Phi_0$. Since $\Phi_0$ is a vector space, it follows that with any reasonable topology on $\Phi_0$ one can identify the tangent space of $\Phi_0$ at a point $\phi \in \Phi_0$ with $\Phi_0$ itself. Thus maps from $\Phi_0$ into $\Phi_0$ may be regarded as vector fields on $\Phi_0$. In particular, $\delta(c)$ is a vector field on the space $\Phi_0$ of fields, and $\delta(c)(\phi)$ is a tangent vector to $\Phi_0$ at $\phi$. By an obvious abuse of notation one may write:
$$\delta(c)(\phi) = \bigl(W_{BA}(\psi)\, c^B\bigr) \frac{\partial}{\partial \psi_A} + \Bigl(\partial_\mu c^A + \frac{\partial W_{BD}}{\partial T_A}(\psi)\, h^B_\mu c^D\Bigr) \frac{\partial}{\partial h^A_\mu}. \tag{8}$$
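To illustrate Eq. (8), the sketch below specializes it to the Lie algebra case $W_{AB} = f^C_{AB} T_C$ mentioned earlier. This specialization is an illustration under that assumption, with the summation convention understood; it is not taken from Ikeda's paper.

```latex
% Assumption: W_{AB} = f^{C}_{AB} T_C, so that
%   W_{BA}(\psi) = f^{C}_{BA}\,\psi_C   and   \partial W_{BD}/\partial T_A = f^{A}_{BD}.
\pi_A\bigl(\delta(c)(\phi)\bigr) = f^{C}_{BA}\,\psi_C\,c^{B},
\qquad
\pi^{A}_{\mu}\bigl(\delta(c)(\phi)\bigr) = \partial_\mu c^{A} + f^{A}_{BD}\,h^{B}_{\mu}\,c^{D}.
% Equivalently, as a vector field on the space of fields:
\delta(c)(\phi)
= f^{C}_{BA}\,\psi_C\,c^{B}\,\frac{\partial}{\partial \psi_A}
+ \bigl(\partial_\mu c^{A} + f^{A}_{BD}\,h^{B}_{\mu}\,c^{D}\bigr)\frac{\partial}{\partial h^{A}_{\mu}} .
```

The first component is linear in $\psi$, as expected of a coadjoint action, and the second has the form of a covariant derivative of $c$ with respect to $h$, in agreement with the Yang–Mills picture described above.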
Now the components of $\delta(c)(\phi)$ as defined in Eq. (8) are polynomials in the components of $\phi$. Consequently, in conformity with our conventions in Sect. 7, we can identify $\delta(c)$ with the unique element of $\mathrm{Hom}(\wedge^*\Phi, \Phi)$ whose value at $\mathrm{diag}(\partial\phi)$, for $\phi \in \Phi_0$, gives $\delta(c)(\phi)$ as defined by Ikeda. The usual Lie bracket of the vector fields $\delta(c_1)$ and $\delta(c_2)$ as defined by Eq. (8) corresponds to our Lie structure on $\mathrm{Hom}(\wedge^*\Phi, \Phi)$. Using his brackets, Ikeda finds that the $\psi$ component of $[\delta(c_1), \delta(c_2)](\phi)$ is given by $[\delta(c_1), \delta(c_2)](\psi) = \delta(c_3(\psi))(\psi)$, where
$$\pi^A(c_3(\psi)) = \frac{\partial W_{BD}}{\partial T_A}(\psi)\, c_1^B c_2^D. \tag{9}$$
We see that the Lie bracket $[\delta(c_1), \delta(c_2)]$ is not of the form $\delta(c)$, where $c$ is a gauge parameter independent of the fields $\phi$; rather, the gauge parameter $c_3$ depends on $c_1$, $c_2$ and on the field $\psi$. Thus one does not have closure on the original space of gauge parameters $\mathrm{Maps}(\Sigma, V) = \Xi$. We are forced to enlarge the space of gauge parameters to include mappings from $\Phi$ to $\Xi$. In Ikeda’s context, these mappings are polynomials in the components of the fields and their derivatives. Consequently, they are identified with elements of $\mathrm{Hom}_k(\wedge^*\Phi, \Xi)$ in our formulation.
If we had only the fields $\psi$ to deal with, the BBvD hypothesis would be satisfied and we would be able to apply the ideas in earlier sections to describe Ikeda’s algebra of gauge transformations as an sh-Lie algebra on a graded space with in degree zero. However, the $h$-component transforms more subtly. To handle this, Ikeda makes the definition
$$D_\mu \psi_A = \partial_\mu \psi_A + W_{AB}(\psi)\, h^B_\mu. \tag{10}$$
(The resemblance to a covariant derivative is formal; it is not yet understood as arising in an obvious manner from a “representation” of the nonlinear Lie algebra defined by Ikeda.) He then calculates
$$[\delta(c_1), \delta(c_2)](h) = \delta(c_3(\psi))(h) - \frac{\partial^2 W_{CD}}{\partial \psi_A\, \partial \psi_B} (D_\mu \psi_B)\, c_1^C c_2^D\, \frac{\partial}{\partial h^A_\mu}. \tag{11}$$
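As a sanity check on Eqs. (9) and (11), consider once more the Lie algebra case $W_{AB} = f^C_{AB} T_C$, an assumption made here only for illustration. A short computation suggests that in this case the gauge algebra closes with field-independent parameters:

```latex
% Assumption: W_{AB} = f^{C}_{AB} T_C (linear in the generators).
% Eq. (9): the composite parameter no longer depends on \psi.
\pi^{A}(c_3) = \frac{\partial W_{BD}}{\partial T_A}\,c_1^{B} c_2^{D} = f^{A}_{BD}\,c_1^{B} c_2^{D}.
% Eq. (11): the extra term involves second derivatives of W, which vanish.
\frac{\partial^{2} W_{CD}}{\partial \psi_A\,\partial \psi_B} = 0
\quad\Longrightarrow\quad
[\delta(c_1), \delta(c_2)](h) = \delta(c_3)(h).
```

For polynomials $W_{AB}$ of degree two or higher, the second derivative in Eq. (11) is in general nonzero and $c_3$ depends on $\psi$, so neither simplification is available.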
Thus the BBvD hypothesis fails, but the generalized hypothesis may hold. When closure on the original space of $V$-valued gauge parameters is lost, physicists speak of an “open algebra”.
Having established his gauge algebra and potential gauge symmetries, Ikeda then searches for an appropriate Lagrangian. Up to a total divergence, the Lagrangian of Ikeda’s theory is
$$L = \epsilon^{\mu\nu} \Bigl[ h^A_\mu D_\nu \psi_A - \frac{1}{2} W_{AB}(\psi)\, h^A_\mu h^B_\nu \Bigr].$$
This includes self-interacting terms for the generalized gauge fields $h$ along with a minimal coupling of the scalar field $\psi$ through the generalized covariant derivative defined in Eq. (10) above. The tensor $\epsilon^{\mu\nu}$ is the area element which is assumed to be present on $\Sigma$. Ikeda actually works with an equivalent Lagrangian which differs from the one given above by a divergence, although the physical content of the Lagrangian defined above is clearer. Ikeda shows that for his equivalent Lagrangian $L(\phi, \partial\phi) = L(\psi, h, \partial\psi, \partial h)$, the function $\delta(c)(L)$ is a divergence for all parameters $c$. This is precisely the property physicists require in order to call $\delta$ a gauge symmetry. The field equations of the Lagrangian are
$$D_\mu \psi_A = 0, \qquad R^A_{\mu\nu} = 0,$$
where $R^A_{\mu\nu}$ is the “generalized” curvature
$$R^A_{\mu\nu} = \partial_\mu h^A_\nu - \partial_\nu h^A_\mu + \frac{\partial W_{BC}}{\partial T_A}(\psi)\, h^B_\mu h^C_\nu$$
of the “generalized gauge field” $h = h^A_\mu (dx^\mu \otimes T_A)$.
The coefficient of the last term on the right-hand side of Eq. (11) is polynomial in the components of $\phi \in \Phi_0$ and their derivatives and (as in Sect. 7) determines a unique bilinear mapping $\nu$ from $\Xi \times \Xi$ into $N = \{\nu \in \mathrm{Hom}(\wedge^*\Phi, \Phi) \mid \nu(\mathrm{diag}(\partial\phi)) = 0,\ \phi \in\ \}$ such that
$$[\delta(c_1), \delta(c_2)] = \delta(C(c_1, c_2)) + \nu(c_1, c_2), \tag{12}$$
where $C(c_1, c_2) = c_3 : \wedge^*\Phi \to \Xi$ is defined by Eq. (9). This latter property (12) is the one we have referred to above as the gBBvD hypothesis. Thus Ikeda provides an example of a field theory which satisfies the generalized BBvD hypothesis and it is this condition which we have assumed in Sects. 7 and 8. The gauge symmetries of these theories require a modification of the sh-Lie structure one obtains from the gauge structures of field theories satisfying the BBvD hypothesis.
References
[BBvD84] Berends, F.A., Burgers, G.J.H., van Dam, H.: On spin three selfinteractions. Z. Phys. C 24, 247–254 (1984)
[BBvD85] Berends, F.A., Burgers, G.J.H., van Dam, H.: On the theoretical problems in constructing interactions involving higher spin massless particles. Nucl. Phys. B 260, 295–322 (1985)
[BBvD86] Berends, F.A., Burgers, G.J.H., van Dam, H.: Explicit construction of conserved currents for massless fields of arbitrary spin. Nucl. Phys. B 271, 429–441 (1986)
[Bur85] Burgers, G.J.H.: On the construction of field theories for higher spin massless particles. Ph.D. thesis, Rijksuniversiteit te Leiden, 1985
[CDF91] Castellani, L., D’Auria, R., Fré, P.: Supergravity and superstrings, Vol. 2. Singapore: World Scientific, 1991
[CF99] Cattaneo, A., Felder, G.: A path integral approach to the Kontsevich quantization formula. math.QA/9902090
[CP95] Castellani, L., Perotto, A.: Free differential algebras: their use in field theory and dual formulation. Lett. Math. Phys. 38, 321–330 (1996); hep-th/9509031
[Ger62] Gerstenhaber, M.: The cohomology structure of an associative ring. Ann. Math. 78, 267–288 (1962)
[Ike94] Ikeda, N.: Two-dimensional gravity and nonlinear gauge theory. Ann. Phys. 235, 435–464 (1994)
[Ike01] Ikeda, N.: A deformation of three dimensional BF theory. hep-th/0010096
[Kon97] Kontsevich, M.: Deformation quantization of Poisson manifolds, I. Preprint, IHES, 1997; hep-th/9709040
[LM95] Lada, T., Markl, M.: Strongly homotopy Lie algebras. Comm. in Algebra 23, 2147–2161 (1995)
[LS93] Lada, T., Stasheff, J.D.: Introduction to sh Lie algebras for physicists. Intern’l J. Theor. Phys. 32, 1087–1103 (1993)
[SS94] Schaller, P., Strobl, T.: Poisson structure induced (topological) field theories. Modern Phys. Lett. A 9, 3129–3136 (1994)
[Sta93] Stasheff, J.D.: The intrinsic bracket on the deformation complex of an associative algebra. JPAA 89, 231–235 (1993), Festschrift in Honor of Alex Heller
Communicated by H. Araki
Commun. Math. Phys. 231, 45 – 95 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0672-9
Communications in
Mathematical Physics
A Bivariant Chern Character for Families of Spectral Triples
Denis Perrot
SISSA, via Beirut 2–4, 34014 Trieste, Italy. E-mail: [email protected]
Received: 26 November 2001 / Accepted: 22 February 2002
Published online: 2 October 2002 – © Springer-Verlag 2002
Abstract: In this paper we construct a bivariant Chern character defined on “families of spectral triples”. Such families should be viewed as a version of unbounded Kasparov bimodules adapted to the category of bornological algebras. The Chern character then takes its values in the bivariant entire cyclic cohomology of Meyer. The basic idea is to work within Quillen’s algebra cochains formalism, and construct the Chern character from the exponential of the curvature of a superconnection, leading to a heat kernel regularization of traces. The obtained formula is a bivariant generalization of the JLO cocycle.
1. Introduction
Recall that according to Connes [6], a noncommutative space is described by a spectral triple $(A, H, D)$, where $H$ is a separable Hilbert space, $A$ an associative algebra represented by bounded operators on $H$, and $D$ is a self-adjoint unbounded (Dirac) operator with compact resolvent, such that the commutator $[D, a]$ is densely defined for any $a \in A$ and extends to a bounded operator. The triple $(A, H, D)$ carries nontrivial homological information as a K-homology class of $A$. The major motivation leading Connes to introduce periodic cyclic cohomology [5] is that the latter is the natural receptacle for a Chern character defined on finitely summable representatives of K-homology. This finiteness condition was removed later and replaced by the weaker condition of θ-summability, i.e. the heat kernel $\exp(-tD^2)$ associated to the Laplacian of the Dirac operator has to be trace-class for any $t > 0$ [7]. In that case, the algebra $A$ has to be endowed with a norm and the Chern character of the spectral triple is expressed as an infinite-dimensional cocycle in the entire cyclic cohomology $HE^*(A)$. Apart from the original construction of Connes, one of the interesting explicit formulas for such a Chern character is provided by the so-called JLO cocycle [21]. Here the heat kernel plays the role of a regulator in the algebra of operators on $H$, and the JLO formula incorporates the data of the spectral triple in a rather simple way. This led Connes and Moscovici to use
the powerful machinery of asymptotic expansions of the heat kernel, giving rise to local expressions extending the classical index theorems of Atiyah–Singer to very interesting non-commutative situations [9, 10].
In this paper we want to generalize the construction of a Chern character to families of spectral triples “over a noncommutative space” described by a second associative algebra $B$. In the context of $C^*$-algebras, such objects correspond to the unbounded version of Kasparov’s bivariant K-theory [4]. In this picture, an element of the group $KK(A, B)$ is represented by a triple $(E, \rho, D)$, where $E$ is a Hilbert $B$-module. $D$ should be viewed as a family of Dirac operators over $B$, acting by unbounded endomorphisms on $E$, and $\rho$ is a representation of $A$ as bounded endomorphisms of $E$ commuting with $D$ modulo bounded endomorphisms. In the particular case $B = \mathbb{C}$, this description just reduces to spectral triples over $A$.
The construction of a general bivariant Chern character as a transformation from an algebraic version of $KK(A, B)$ (for $A$ and $B$ not necessarily $C^*$-algebras) to a bivariant cyclic cohomology has already been considered by several authors. For example Nistor [25, 26] constructed a bivariant Chern character for $p$-summable quasihomomorphisms [12], with values in the Jones–Kassel bivariant cyclic cohomology groups. Cuntz and Quillen also constructed a bivariant Chern character under some summability assumptions, with values in their own description of the bivariant periodic cyclic theory [15, 16]. On the other hand, Puschnigg constructed a well-behaved cyclic cohomology theory for $C^*$-algebras, namely the local cyclic cohomology [30, 31]. Upon generalization of a previous work of Cuntz [13], the local cyclic theory appears to be the suitable target for a completely general bivariant Chern character (without summability assumptions) defined on Kasparov’s K-theory.
However, the existence and properties of such constructions are often based on excision in cyclic cohomology and the universal properties of bivariant K-theory. By considering unbounded bimodules we will follow a different route, involving heat kernel regularization in the spirit of the JLO cocycle, keeping in mind that we are interested in explicit formulas for a bivariant Chern character incorporating the data $\rho$ and $D$. Our motivation mainly comes from the potential applications to mathematical physics, especially quantum field theory and string/brane theory, where such objects arise naturally:
• The heat kernel method admits a functional integral representation. The quantities under investigation then correspond to expectation values of observables corresponding to some quantum-mechanical system. This was first used by Alvarez-Gaumé and Witten in their study of mixed-gravitational anomalies [1], and led to the asymptotic symbol calculus of Getzler [2].
• The basic idea of introducing a heat kernel regularization of Chern characters in classical differential geometry is due to Quillen [32]. Bismut then successfully applied this method in his approach to the Atiyah–Singer index theorem for families of elliptic operators on submersions [3]. It is worth mentioning that Bismut also uses a stochastic representation of the heat kernel.
• The Bismut–Quillen approach is essential for the analytic and topological understanding of anomalies (both chiral and gravitational) in quantum field theory [27, 28]. A bivariant Chern character designed in an equivariant setting may shed some light on the interplay between BRS cohomology and the recently discovered cyclic cohomology of Hopf algebras [10, 11].
• Twisted K-theory and K-homology recently appeared in the physics literature through the classification of D-branes [23, 35]. This also falls into the scope of a bivariant Chern character.
First we have to consider the right category of algebras. For our purpose, it turns out that bornological associative algebras are exactly what we need. These are associative algebras endowed with an additional structure describing the notion of a bounded subset. Complete bornological algebras provide the general framework for entire cyclic cohomology. This theory has been developed in detail by Meyer in [24]. The interesting feature of the bivariant entire cyclic cohomology is that it contains infinite-dimensional cocycles and thus can be used as the receptacle of a bivariant Chern character for our families of spectral triples carrying some properties of θ-summability. Given two complete bornological algebras $A$ and $B$, we will consider the $\mathbb{Z}_2$-graded semigroup $\Psi_*(A, B)$, $* = 0, 1$, of unbounded $A$-$B$-bimodules. The latter is an adaptation of Kasparov’s unbounded bimodules to the realm of bornological algebras. In our geometric picture, such a bimodule represents a family of spectral triples over the non-commutative space $B$. Our aim is to construct an explicit formula for a Chern character defined on the subsemigroup of θ-summable bimodules,
$$\mathrm{ch} : \Psi^\theta_*(A, \tilde{B}) \to HE_*(A, B), \qquad * = 0, 1, \tag{1}$$
carrying suitable properties of additivity, differentiable homotopy invariance and functoriality. Here $HE_*(A, B)$ is the bivariant entire cyclic cohomology of $A$ and $B$, and $\tilde{B}$ is the unitalization of $B$.
On the technical side, we will use both the X-complex description of cyclic cohomology due to Cuntz–Quillen [15, 16], and the usual $(b, B)$-complex of Connes. The X-complex is useful for some conceptual explanations of the abstract properties of cyclic (co)homology. Given a complete bornological algebra $A$, its entire cyclic homology is computed by the supercomplex [24]
$$X(\mathcal{T}A) : \quad \mathcal{T}A \rightleftarrows \Omega^1\mathcal{T}A_\natural, \tag{2}$$
where $\mathcal{T}A$ is the analytic tensor algebra of $A$, obtained by a certain bornological completion of the tensor algebra over $A$, and $\Omega^1\mathcal{T}A_\natural = \Omega^1\mathcal{T}A / [\mathcal{T}A, \Omega^1\mathcal{T}A]$ is the commutator quotient space of the universal one-forms over $\mathcal{T}A$. This means that the entire cyclic homology of $A$ is completely described through the homological properties of its analytic tensor algebra in dimension 0 and 1. Furthermore, taking the analytic tensor algebra of $\mathcal{T}A$ is harmless: indeed $X(\mathcal{T}A)$ and $X(\mathcal{T}\mathcal{T}A)$ are homotopically equivalent complexes. In other words, entire cyclic homology does not distinguish between a complete bornological algebra and its successive nested analytic tensor algebras. This is a particular case of the analytic version [24] of Goodwillie’s theorem [18]. This result is a key point of our bivariant Chern character. The construction of (1) will follow two steps:
(a) Using the Goodwillie theorem, we first construct an invertible bivariant class $[\gamma] \in HE_0(A, \mathcal{T}A)$ realizing the equivalence between the entire cyclic homologies of $A$ and $\mathcal{T}A$.
(b) We consider a bimodule in $\Psi_*(A, B)$. Then under certain θ-summability conditions, we construct an element $[\chi] \in HE_*(\mathcal{T}A, B)$ involving the exponential of the curvature of a superconnection, which automatically incorporates the desired heat kernel regularization. This step uses Quillen’s theory of algebra cochains as an essential tool [33, 34].
Then the composition $[\gamma] \cdot [\chi] \in HE_*(A, B)$ is the bivariant Chern character (1).
The paper is organized as follows. In Sects. 2 and 3 we recall the basic definitions and properties of bornological spaces and entire cyclic cohomology. In Sect. 4 we present our construction of the Goodwillie equivalence $[\gamma] \in HE_0(A, \mathcal{T}A)$. The semigroup of unbounded bimodules $\Psi_*(A, B)$ is introduced in Sect. 5. Sections 6 and 7 are devoted
to the fundamental construction of the element $\chi \in HE_*(\mathcal{T}A, B)$. Finally, we end the paper with an application of our Chern character to the non-bivariant cases, namely ordinary K-theory and K-homology in Sect. 8. In particular we check that the composition product on $HE$ correctly describes the index pairing between idempotents and spectral triples. Besides, the study of the Bott class allows us to normalize the bivariant Chern character. The appendix contains a straightforward adaptation, to bornological algebras, of Quillen’s algebra cochains formalism.
We would like to mention a last point. There is a priori no obvious intersection product $\Psi(A, B) \times \Psi(B, C) \to \Psi(A, C)$ as in Kasparov theory. Also we will never ask if our construction is compatible with such a product. In fact, it is possible to show that the bivariant Chern character is compatible with the Kasparov product on $p$-summable quasihomomorphisms, but this involves a retraction of our entire cocycles onto periodic ones. These matters will be treated elsewhere [29].
All algebras are supposed to be based on the ground field $\mathbb{C}$. We work in the non-unital graded category, i.e. homomorphisms between algebras do not necessarily preserve units, and all operations like commutators, tensor products, etc., involving graded objects are automatically graded.
2. Bornology
This section is intended to give a short introduction to bornological vector spaces [20]. These are vector spaces with an additional structure describing abstractly the notion of boundedness. Concrete examples of bornological spaces are provided by normed or locally convex spaces. Bornology is the correct framework allowing the development of entire cyclic cohomology in full generality; this has been done by Meyer in [24]. Since this topic is not so familiar to mathematical physicists, we feel the need to recall the definitions and basic properties. Our sketch is by no means intended to give sufficient knowledge of bornology; we refer to [20, 24] for details.
Let $V$ be a vector space over $\mathbb{C}$. A subset $S \subset V$ is a disk iff it is circled and convex. Given any subset $S$, we denote by $S^\diamond$ its circled convex hull: it is the smallest disk containing $S$. If $S$ is a disk, its linear span $V_S$ is endowed with a semi-norm $\|\cdot\|_S$ whose unit ball is the closure of $S$. $S$ is called completant iff $V_S$ is a Banach space.
Definition 2.1. Let $V$ be a vector space. A (convex) bornology $S(V)$ is a collection of subsets of $V$ verifying the following axioms:
• $\{x\} \in S(V)$ for any vector $x \in V$.
• $S_1 + S_2 \in S(V)$ for any $S_1, S_2 \in S(V)$.
• If $S \in S(V)$, then $T \in S(V)$ for any $T \subset S$.
• $S^\diamond \in S(V)$ for any $S \in S(V)$.
Any $S \in S(V)$ is called a small subset of the bornological space $V$. The bornology $S(V)$ is called completant iff any small subset $S \in S(V)$ is contained in a completant small disk. In that case, $(V, S(V))$ is a complete bornological vector space.
Example 2.2. If $V$ is a locally convex space, then the bounded bornology $\mathrm{Bound}(V)$ is the collection of subsets $S$ bounded for all seminorms on $V$. If $V$ is complete for the locally convex topology, then it is complete as a bornological space. Fréchet spaces endowed with the bounded bornology are important examples of complete bornological spaces.
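To make Definition 2.1 concrete, the following sketch checks the four axioms for the bounded bornology of a normed space $(V, \|\cdot\|)$, a special case of Example 2.2. The notation $\|S\| := \sup_{x \in S} \|x\|$ is introduced here for convenience and is not part of the original text; $S^\diamond$ denotes the circled convex hull of $S$.

```latex
% Assumed notation: \|S\| := \sup_{x \in S} \|x\|, and S is small iff \|S\| < \infty.
% Singletons, sums and subsets:
\|\{x\}\| = \|x\| < \infty,
\qquad
\|S_1 + S_2\| \le \|S_1\| + \|S_2\|,
\qquad
T \subset S \ \Longrightarrow\ \|T\| \le \|S\| .
% Circled convex hull: finite combinations with \sum_i |\lambda_i| \le 1.
S^{\diamond} = \Bigl\{ \sum_i \lambda_i x_i \;:\; x_i \in S,\ \lambda_i \in \mathbb{C},\ \sum_i |\lambda_i| \le 1 \Bigr\},
\qquad
\Bigl\| \sum_i \lambda_i x_i \Bigr\| \le \sum_i |\lambda_i|\,\|x_i\| \le \|S\| .
```

Hence $\|S^\diamond\| = \|S\|$, and each of the four operations in Definition 2.1 preserves boundedness.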
Example 2.3. If $V$ is any vector space, the fine bornology $\mathrm{Fine}(V)$ is the smallest admissible bornology: a subset is small iff it is contained in the disked hull of a finite number of points of $V$. In particular, any small subset is contained in a finite-dimensional subspace of $V$. A bornological space with fine bornology is always complete because finite-dimensional spaces are complete.
Example 2.4. A useful way to construct a bornology on $V$ is to start from a collection $U$ of subsets not necessarily satisfying the axioms of a bornology, and then to consider the smallest bornology $S(V)$ containing $U$. We say that $S(V)$ is generated by $U$.
Bornological convergence. A sequence $\{x_n\}_{n \in \mathbb{N}}$ of points in a bornological space $V$ is said to converge bornologically to the limit $x_\infty \in V$ iff there is a small disk $S \in S(V)$ such that $x_n - x_\infty \in S$ for any $n$ and $\lim_{n \to \infty} \|x_n - x_\infty\|_S = 0$. A set is said to be closed for the bornology iff it is sequentially closed for bornologically convergent sequences. The closed sets for the bornology fulfill the axioms of a topology, hence a bornological space also has a topology (though in general not a vector space topology).
Bounded maps. Let $V$ and $W$ be two bornological vector spaces. A linear map $l : V \to W$ is bounded iff $l(S) \in S(W)$ for any small $S \in S(V)$. An arbitrary set $\{l_j\}_{j \in J}$ of linear maps is equibounded iff $\{l_j(x) \mid j \in J, x \in S\}$ is a small subset of $W$ for any $S \in S(V)$. We denote by $\mathrm{Hom}(V, W)$ the vector space of bounded linear maps between $V$ and $W$. The sets of equibounded maps form a bornology called the equibounded bornology on $\mathrm{Hom}(V, W)$. It is complete if $W$ is complete. We will always endow the spaces of bounded linear maps with the equibounded bornology.
Completions. Let $V$ be a bornological vector space. Its bornological completion $V^c$ is the complete bornological vector space defined as the solution of the following universal problem: there is a bounded linear map $u : V \to V^c$ such that, for any complete bornological space $W$ and any bounded linear map $l : V \to W$, there is a unique bounded linear map from $V^c$ to $W$ factorizing $l$. The completion always exists, and can be explicitly realized as the inductive limit of a system of Banach spaces (see [20] and the appendix of [24]). It is of course unique by universality. If $V$ is a normed space endowed with the bounded bornology, then its bornological completion coincides with its Hausdorff completion. However, it should be stressed that the universal map $V \to V^c$ may fail to be injective for an arbitrary bornological space $V$.
Multilinear maps. An $n$-linear map $l : V_1 \times \cdots \times V_n \to W$ between bornological spaces is bounded iff $l(S_1, \ldots, S_n) \in S(W)$ for any small sets $S_i \in S(V_i)$. If $W$ is complete, then there is a unique bounded $n$-linear map $V_1^c \times \cdots \times V_n^c \to W$ factorizing $l$.
Completed tensor products. Let $V_1$ and $V_2$ be two bornological vector spaces. We endow their algebraic tensor product $V_1 \otimes V_2$ with the bornology generated by the subsets $S_1 \otimes S_2$, for any $S_i \in S(V_i)$. The completion of $V_1 \otimes V_2$ with respect to this bornology is the completed tensor product $V_1 \hat\otimes V_2$. The completed tensor product is associative, whence the definition of the $n$-fold tensor product $V_1 \hat\otimes \cdots \hat\otimes V_n$ of $n$ bornological spaces. The latter is universal for the bounded $n$-linear maps $V_1 \times \cdots \times V_n \to W$ with complete range $W$.
Algebras. A bornological algebra is a bornological space $A$ endowed with a bounded bilinear map (product) $A \times A \to A$. The algebra $A$ is complete iff it is complete as a vector space. In this paper we will be concerned only with associative bornological algebras.
Subspaces, quotients. Let $V$ be a bornological vector space and $W \subset V$ a vector subspace. There is a canonical bornology on $W$: a subset $S \subset W$ is small iff it is small
for $V$. On the other hand, the quotient space $V/W$ also has a bornology: $S \in S(V/W)$ iff there is a small $T \in S(V)$ such that $S = T \bmod W$. When $V$ is complete, then the subspace $W \subset V$ and quotient $V/W$ are complete iff $W$ is bornologically closed in $V$.
Bornological complexes. A bornological space $V$ with a bounded linear map $\partial : V \to V$ satisfying $\partial^2 = 0$ is a bornological complex. Its homology is as usual the bornological vector space $H_*(V) = \mathrm{Ker}\,\partial / \mathrm{Im}\,\partial$.
3. Entire Cyclic Cohomology
Here we recall the formulation of cyclic (co)homology within the X-complex framework of Cuntz and Quillen [15]. The analytic adaptation of that theory presented by Meyer in [24] allows one to define elegantly the entire cyclic homology, cohomology and bivariant cohomology for bornological algebras. There are in fact two equivalent ways to describe the entire cyclic cohomology of a complete bornological algebra $A$. The first one is to use Connes’ $(b, B)$ complex of non-commutative forms $\Omega A$ completed with respect to a certain bornology; we call this completion $\Omega_\epsilon A$. The second one is the X-complex of the completed tensor algebra $\mathcal{T}A$. These complexes are homotopy equivalent [24], and give rise to the definition of entire cyclic cohomology. The construction of the bivariant Chern character proposed in our paper uses simultaneously the $(b, B)$-complex and X-complex approaches. Also, a third complex will be needed as an intermediate step; we call it the completed de Rham–Karoubi complex $\Omega_\delta A$.
3.1. Non-commutative differential forms. Let $A$ be a complete bornological algebra. The algebra of non-commutative differential forms over $A$ is the direct sum $\Omega A = \bigoplus_{n \ge 0} \Omega^n A$ of the $n$-dimensional subspaces $\Omega^n A = \tilde{A} \hat\otimes A^{\hat\otimes n}$ for $n \ge 1$ and $\Omega^0 A = A$, where $\tilde{A} = A \oplus \mathbb{C}$ is the unitalization of $A$. It is customary to use the differential notation $a_0\, da_1 \ldots da_n$ (resp. $da_1 \ldots da_n$) for the string $a_0 \otimes a_1 \otimes \cdots \otimes a_n$ (resp. $1 \otimes a_1 \otimes \cdots \otimes a_n$). The differential $d : \Omega^n A \to \Omega^{n+1} A$ is uniquely specified by $d(a_0\, da_1 \ldots da_n) = da_0\, da_1 \ldots da_n$ and $d^2 = 0$. The multiplication in $\Omega A$ is defined as usual and fulfills the Leibniz rule $d(\omega_1 \omega_2) = d\omega_1\, \omega_2 + (-)^{|\omega_1|} \omega_1\, d\omega_2$, where $|\omega_1|$ is the degree of $\omega_1$. Each $\Omega^n A$ is a complete bornological space by construction, and we endow $\Omega A$ with the direct sum bornology. This turns $\Omega A$ into a complete bornological differential graded (DG) algebra, i.e. the multiplication map and $d$ are bounded. It is the universal complete bornological DG algebra generated by $A$.
On $\Omega A$ are defined various operators. First of all, the Hochschild boundary $b : \Omega^{n+1} A \to \Omega^n A$ is $b(\omega\, da) = (-)^n [\omega, a]$ for $\omega \in \Omega^n A$, and $b = 0$ on $\Omega^0 A = A$. One easily shows that $b$ is bounded and $b^2 = 0$. Then the Karoubi operator $\kappa : \Omega^n A \to \Omega^n A$ is constructed out of $b$ and $d$:
$$1 - \kappa = db + bd. \tag{3}$$
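As an illustration, relation (3) can be checked by hand on a one-form $a_0\, da_1 \in \Omega^1 A$, using only the definitions of $d$ and $b$ given above. The computation below is a sketch of this low-degree case; it recovers the familiar formula $\kappa(a_0\, da_1) = da_1\, a_0$.

```latex
% b on one-forms (n = 0 in b(\omega\,da) = (-)^n[\omega,a]):
b(a_0\,da_1) = [a_0, a_1] = a_0 a_1 - a_1 a_0 .
% d applied to the result, using the Leibniz rule:
db(a_0\,da_1) = da_0\,a_1 + a_0\,da_1 - da_1\,a_0 - a_1\,da_0 ,
% b applied after d (n = 1, \omega = da_0):
bd(a_0\,da_1) = b(da_0\,da_1) = -[da_0, a_1] = -da_0\,a_1 + a_1\,da_0 .
% Adding the two lines and using 1 - \kappa = db + bd:
(db + bd)(a_0\,da_1) = a_0\,da_1 - da_1\,a_0
\quad\Longrightarrow\quad
\kappa(a_0\,da_1) = da_1\,a_0 = d(a_1 a_0) - a_1\,da_0 .
```

The last equality rewrites $da_1\, a_0$ in the normal form $a\, db$ used for elements of $\Omega^1 A$.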
Therefore $\kappa$ is bounded and commutes with $b$ and $d$. The last operator is Connes’ $B : \Omega^n A \to \Omega^{n+1} A$,
$$B = (1 + \kappa + \ldots + \kappa^n)\, d \quad \text{on } \Omega^n A, \tag{4}$$
which is bounded and verifies $B^2 = 0 = Bb + bB$ and $B\kappa = \kappa B = B$. We now define three other bornologies on $\Omega A$, leading to the notion of entire cyclic cohomology:
• The entire bornology $\mathfrak{S}_\epsilon(\Omega A)$ is generated by the sets
$$\bigcup_{n\ge 0} [n/2]!\,\tilde S(dS)^n, \qquad S \in \mathfrak{S}(A), \tag{5}$$
where $[n/2] = k$ if $n = 2k$ or $n = 2k+1$, and $\tilde S = S + \mathbb{C}$. That is, a subset of $\Omega A$ is small if it is contained in the circled convex hull of a set like (5). We write $\Omega_\epsilon A$ for the completion of $\Omega A$ with respect to this bornology. $\Omega_\epsilon A$ will be the complex of entire chains.
• The analytic bornology $\mathfrak{S}_{\mathrm{an}}(\Omega A)$ is generated by the sets $\bigcup_{n\ge 0}\tilde S(dS)^n$, $S \in \mathfrak{S}(A)$. The corresponding completion of $\Omega A$ is $\Omega_{\mathrm{an}} A$. It is related to the X-complex description of entire cyclic homology (see below).
• The de Rham–Karoubi bornology $\mathfrak{S}_\delta(\Omega A)$ is generated by the collection of sets $\bigcup_{n\ge 0}\frac{1}{[n/2]!}\tilde S(dS)^n$, $S \in \mathfrak{S}(A)$, with completion $\Omega_\delta A$. This will give rise to the de Rham–Karoubi complex.

The multiplication in $\Omega A$ is bounded for the three bornologies above, as well as all the operators $d$, $b$, $\kappa$, $B$. Moreover, the $\mathbb{Z}_2$-grading of $\Omega A$ given by even and odd forms is preserved by the completion process, so that $\Omega_\epsilon A$, $\Omega_{\mathrm{an}} A$ and $\Omega_\delta A$ are $\mathbb{Z}_2$-graded differential algebras, endowed with the operators $b$, $\kappa$, $B$ fulfilling the usual relations. In particular, $\Omega_\epsilon A$ is called the $(b, B)$-complex of entire chains. Note also that the multiplication or division of $n$-forms by $[n/2]!$ provides linear bornological isomorphisms between $\Omega_\epsilon A$, $\Omega_{\mathrm{an}} A$ and $\Omega_\delta A$.

3.2. The analytic tensor algebra. Let $A$ be a complete bornological algebra, $\Omega A = \Omega^+ A \oplus \Omega^- A$ the $\mathbb{Z}_2$-graded algebra of differential forms. The even part $\Omega^+ A$ is a trivially graded subalgebra. We endow $\Omega^+ A$ with a new associative product, the Fedosov product [15],
$$\omega_1 \odot \omega_2 = \omega_1\omega_2 - d\omega_1\,d\omega_2, \qquad \omega_{1,2} \in \Omega^+ A. \tag{6}$$
Associativity is easy to check. In fact the algebra $(\Omega^+ A, \odot)$ is isomorphic to the non-unital tensor algebra $TA = \bigoplus_{n\ge 1} A^{\hat\otimes n}$, under the correspondence
$$\Omega^+ A \ni a_0\,da_1\ldots da_{2n} \longleftrightarrow a_0 \otimes \omega(a_1, a_2) \otimes \cdots \otimes \omega(a_{2n-1}, a_{2n}) \in TA, \tag{7}$$
where $\omega(a_i, a_j) := a_i a_j - a_i \otimes a_j \in A \oplus A^{\hat\otimes 2}$ is the curvature of $(a_i, a_j)$. It turns out that the Fedosov product $\odot$ is bounded for the bornology $\mathfrak{S}_{\mathrm{an}}$ restricted to $\Omega^+ A$ [24], and thus extends to the analytic completion $\Omega^+_{\mathrm{an}} A$. The complete bornological algebra $(\Omega^+_{\mathrm{an}} A, \odot)$ is also denoted by $\mathcal{T}A$ and called the analytic tensor algebra of $A$ in [24].
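As an illustration of the remark that associativity is easy to check, the following worked computation (a sketch added here, not taken from the references) expands both bracketings of the Fedosov product (6) for even forms $\omega_1, \omega_2, \omega_3 \in \Omega^+ A$. It uses only the Leibniz rule and $d^2 = 0$; no signs appear because all three forms have even degree. The last line checks the correspondence (7) in the lowest case, under the assumption that $da_1 da_2$ is read as the case $a_0 = 1$ of (7).

```latex
\begin{align*}
(\omega_1 \odot \omega_2) \odot \omega_3
  &= (\omega_1\omega_2 - d\omega_1\,d\omega_2)\,\omega_3
     - d(\omega_1\omega_2 - d\omega_1\,d\omega_2)\,d\omega_3 \\
  &= \omega_1\omega_2\omega_3 - d\omega_1\,d\omega_2\,\omega_3
     - d\omega_1\,\omega_2\,d\omega_3 - \omega_1\,d\omega_2\,d\omega_3 , \\
\omega_1 \odot (\omega_2 \odot \omega_3)
  &= \omega_1(\omega_2\omega_3 - d\omega_2\,d\omega_3)
     - d\omega_1\,d(\omega_2\omega_3 - d\omega_2\,d\omega_3) \\
  &= \omega_1\omega_2\omega_3 - \omega_1\,d\omega_2\,d\omega_3
     - d\omega_1\,d\omega_2\,\omega_3 - d\omega_1\,\omega_2\,d\omega_3 .
\end{align*}
% Both expansions contain the same four terms, so \odot is associative.

% Lowest case of (7), assuming da_1 da_2 <-> \omega(a_1, a_2):
a_1 \odot a_2 = a_1 a_2 - da_1\,da_2
  \;\longleftrightarrow\;
  a_1 a_2 - (a_1 a_2 - a_1 \otimes a_2) = a_1 \otimes a_2 .
```

The second check shows why the Fedosov product, rather than the ordinary product of forms, is the one that matches the tensor product in $TA$.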
3.3. X-complex. The X-complex first appeared in Quillen's work on algebra cochains [33], and then was used by Cuntz–Quillen in their formulation of cyclic homology [15, 16]. Here we recall the X-complex construction for bornological algebras, following Meyer [24]. Let $A$ be a complete bornological algebra. The X-complex of $A$ is the $\mathbb{Z}_2$-graded complex
$$X(A) : \quad A \ \underset{\overline{b}}{\overset{\natural d}{\rightleftarrows}}\ \Omega^1 A_\natural, \tag{8}$$
D. Perrot
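where $\Omega^1 A_\natural$ is the completion of the commutator quotient space $\Omega^1 A / b\Omega^2 A = \Omega^1 A/[A, \Omega^1 A]$ endowed with the quotient bornology. The class of the generic element $(a_0\,da_1 \bmod [\,,])\in \Omega^1 A_\natural$ is usually denoted by $\natural a_0\,da_1$. The map $\natural d : A \to \Omega^1 A_\natural$ thus sends $a \in A$ to $\natural da$. Also, the Hochschild boundary $b : \Omega^1 A \to A$ vanishes on the commutators $[A, \Omega^1 A]$, hence passes to a well-defined map $\overline{b} : \Omega^1 A_\natural \to A$. Explicitly the image of $\natural a_0\,da_1$ by $\overline{b}$ is the commutator $[a_0, a_1]$. These maps are bounded and satisfy $\natural d \circ \overline{b} = 0$ and $\overline{b} \circ \natural d = 0$, so that $X(A)$ indeed defines a complete $\mathbb{Z}_2$-graded bornological complex.

We now focus on the X-complex of the analytic tensor algebra $\mathcal{T}A$. In that case, $\Omega^1 \mathcal{T}A_\natural = \Omega^1 \mathcal{T}A/[\mathcal{T}A, \Omega^1 \mathcal{T}A]$ is always complete, and as a bornological vector space $X(\mathcal{T}A)$ is canonically isomorphic to the analytic completion $\Omega_{\mathrm{an}} A$. Here we must take care of a notational problem. Since the symbol $d$ is already used for the differential on $\Omega A$, we always choose the bold print $\mathbf{d}$ for the differential on $\Omega \mathcal{T}A$. Then the correspondence between $X(\mathcal{T}A)$ and $\Omega_{\mathrm{an}} A$ goes as follows [15, 24]: first, one has a $\mathcal{T}A$-bimodule isomorphism
$$\Omega^1 \mathcal{T}A \cong \widetilde{\mathcal{T}}A \,\hat\otimes\, A \,\hat\otimes\, \widetilde{\mathcal{T}}A, \qquad x\,\mathbf{d}a\,y \leftrightarrow x \otimes a \otimes y \quad \text{for } a \in A,\ x, y \in \widetilde{\mathcal{T}}A, \tag{9}$$
where $\widetilde{\mathcal{T}}A := \mathbb{C} \oplus \mathcal{T}A$ is the unitalization of $\mathcal{T}A$. This implies that the bornological space $\Omega^1 \mathcal{T}A_\natural$ is isomorphic to $\widetilde{\mathcal{T}}A \,\hat\otimes\, A$, which can further be identified with the analytic completion of odd forms $\Omega^-_{\mathrm{an}} A$, through the correspondence $x \otimes a \leftrightarrow x\,da$, $\forall a \in A$, $x \in \widetilde{\mathcal{T}}A$. Thus collecting the even part $X_0(\mathcal{T}A) = \mathcal{T}A$ and the odd part $X_1(\mathcal{T}A) = \Omega^1 \mathcal{T}A_\natural$ together, yields a linear bornological isomorphism $X(\mathcal{T}A) \cong \Omega_{\mathrm{an}} A$. We still denote by $(\natural\mathbf{d}, \overline{b})$ the boundaries induced on $\Omega_{\mathrm{an}} A$ through this isomorphism; Cuntz and Quillen explicitly computed them in terms of the usual operators on differential forms [15]:
$$\overline{b} = b - (1 + \kappa)d \quad \text{on } \Omega^{2n+1}A, \qquad \natural\mathbf{d} = \sum_{i=0}^{2n}\kappa^i d - \sum_{i=0}^{n-1}\kappa^{2i} b \quad \text{on } \Omega^{2n}A. \tag{10}$$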
The crucial result [15, 24] is that the complex $(\Omega_{\mathrm{an}} A, \natural\mathbf{d}, \overline{b}) = X(\mathcal{T}A)$ is homotopy equivalent to the complex of entire chains $\Omega_\epsilon A$ endowed with the differential $(b + B)$. Let us briefly recall the argument [15, 24]. The Karoubi operator $\kappa$ verifies the polynomial identity $(\kappa^n - 1)(\kappa^{n+1} - 1) = 0$ on $\Omega^n A$, hence $\kappa^2$ also verifies a polynomial identity. It follows that $\kappa^2$ has a discrete spectrum $\sigma$, and $\Omega A$ decomposes into the direct sum of the generalized eigenspaces $V_\lambda$, $\lambda \in \sigma$. One of the eigenvalues of $\kappa^2$ is 1, with multiplicity 2. Let $P$ be the projection of $\Omega A$ onto $V_1$, vanishing on the other eigenspaces. Since $P$ and the complementary projection $P^\perp = 1 - P$ commute with all operators commuting with $\kappa$, the subspaces $P\Omega A$ and $P^\perp\Omega A$ are stable with respect to $d$, $b$ and $B$. One shows [24] that $P$, $P^\perp$ are bounded for the bornologies $\mathfrak{S}_{\mathrm{an}}(\Omega A)$ and $\mathfrak{S}_\epsilon(\Omega A)$, hence extend to the completions $\Omega_{\mathrm{an}} A$ and $\Omega_\epsilon A$. Moreover, the subcomplex $(P^\perp\Omega_{\mathrm{an}} A, \natural\mathbf{d}, \overline{b})$ is contractible, hence $\Omega_{\mathrm{an}} A$ retracts on $P\Omega_{\mathrm{an}} A$ for the differential $(\natural\mathbf{d}, \overline{b})$. Also, $(P^\perp\Omega_\epsilon A, b + B)$ is contractible and $\Omega_\epsilon A$ retracts on $P\Omega_\epsilon A$ for the differential $(b + B)$. Let $c : \Omega_{\mathrm{an}} A \to \Omega_\epsilon A$ be the bornological vector space isomorphism
$$c(a_0\,da_1\ldots da_n) = (-)^{[n/2]}\,[n/2]!\;a_0\,da_1\ldots da_n \qquad \forall n \in \mathbb{N}. \tag{11}$$
Then $c$ maps isomorphically $P\Omega_{\mathrm{an}} A$ onto $P\Omega_\epsilon A$, and under this correspondence, the boundaries $(\natural\mathbf{d}, \overline{b})$ and $b + B$ coincide: $c^{-1}(b + B)c = (\natural\mathbf{d}, \overline{b})$ on $P\Omega_{\mathrm{an}} A$. It follows that the X-complex $X(\mathcal{T}A)$ is homotopy equivalent to the $(b + B)$-complex of entire chains $\Omega_\epsilon A$. This leads to the definition of entire cyclic (co)homology:

Definition 3.1. Let $A$ be a complete bornological algebra.
(i) The entire cyclic homology of $A$ is the homology of the X-complex of the analytic tensor algebra $\mathcal{T}A$:
$$HE_*(A) = H_*(X(\mathcal{T}A)), \qquad * = 0, 1, \tag{12}$$
or equivalently, the $(b + B)$-homology of the $\mathbb{Z}_2$-graded complex of entire chains $\Omega_\epsilon A$.
(ii) Let $X(\mathcal{T}A)'$ be the $\mathbb{Z}_2$-graded complex of bounded maps from $X(\mathcal{T}A)$ to $\mathbb{C}$, with differential the transposed of $(\natural\mathbf{d}, \overline{b})$. Then the entire cyclic cohomology of $A$ is the cohomology of this dual complex:
$$HE^*(A) = H^*(X(\mathcal{T}A)'), \qquad * = 0, 1. \tag{13}$$
(iii) If $A$ and $B$ are complete bornological algebras, then $\mathrm{Hom}(X(\mathcal{T}A), X(\mathcal{T}B))$ denotes the space of bounded linear maps from $X(\mathcal{T}A)$ to $X(\mathcal{T}B)$. It is naturally a complete $\mathbb{Z}_2$-graded bornological complex, the differential of a map $f$ corresponding to the commutator $(\natural\mathbf{d}, \overline{b}) \circ f - (-)^{|f|} f \circ (\natural\mathbf{d}, \overline{b})$. The bivariant entire cyclic cohomology of $A$ and $B$ is then the cohomology of this complex:
$$HE_*(A, B) = H_*(\mathrm{Hom}(X(\mathcal{T}A), X(\mathcal{T}B))), \qquad * = 0, 1. \tag{14}$$
In the case $A = \mathbb{C}$, one shows [24] that $X(\mathcal{T}\mathbb{C})$ is homotopically equivalent to $X(\mathbb{C}) : \mathbb{C} \rightleftarrows 0$, thus the entire cyclic homology of $\mathbb{C}$ is simply $HE_0(\mathbb{C}) = \mathbb{C}$ and $HE_1(\mathbb{C}) = 0$. This implies that for any complete bornological algebra $A$, we get the usual isomorphisms $HE_*(\mathbb{C}, A) \cong HE_*(A)$ and $HE_*(A, \mathbb{C}) \cong HE^*(A)$. Furthermore, since the composition of bounded maps is bounded, there is a well-defined composition product on bivariant entire cyclic cohomology:
$$HE_i(A, B) \times HE_j(B, C) \to HE_{i+j+2\mathbb{Z}}(A, C), \qquad i, j = 0, 1 \tag{15}$$
for complete bornological algebras $A$, $B$, $C$. Any bounded homomorphism $\rho : A \to B$ extends to a bounded homomorphism $\rho_* : \mathcal{T}A \to \mathcal{T}B$ by setting $\rho_*(a_1 \otimes \cdots \otimes a_n) = \rho(a_1) \otimes \cdots \otimes \rho(a_n)$. The boundedness of $\rho_*$ becomes obvious once we rewrite it using the isomorphism $\mathcal{T}A \cong (\Omega^+_{\mathrm{an}} A, \odot)$, since $\rho_*(a_0\,da_1\ldots da_{2n}) = \rho(a_0)\,d\rho(a_1)\ldots d\rho(a_{2n})$. The homomorphism $\rho_*$ gives rise to a bounded X-complex morphism $X(\rho_*) : X(\mathcal{T}A) \to X(\mathcal{T}B)$:
$$x \mapsto \rho_*(x), \qquad \natural x\,\mathbf{d}y \mapsto \natural\rho_*(x)\,\mathbf{d}\rho_*(y) \qquad \forall x, y \in \mathcal{T}A. \tag{16}$$
We write $\mathrm{ch}(\rho)$ for the class of $X(\rho_*)$ in $HE_0(A, B)$. It is the simplest example of bivariant Chern character. Last, we remark that $HE_*(A, A)$ is a $\mathbb{Z}_2$-graded unital ring, the unit corresponding to the Chern character of the identity homomorphism of $A$.
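The following low-degree computation is offered as a sanity check on the operators of Sects. 3.1 and 3.3, not as part of the original argument. It evaluates the Karoubi operator from (3) on $\Omega^0 A$ and $\Omega^1 A$, using only $b(\omega\,da) = (-)^n[\omega, a]$ and the Leibniz rule, and then verifies the two relations $\overline{b} \circ \natural d = 0$ and $\natural d \circ \overline{b} = 0$ that make $X(A)$ a complex.

```latex
% Degree 0: b = 0 on A and b(da) = [1, a] = 0, hence
(db + bd)(a) = 0 \quad\Longrightarrow\quad \kappa(a) = a .

% Degree 1: b(a_0 da_1) = [a_0, a_1] and b(da_0\,da_1) = -[da_0, a_1], hence
\begin{align*}
(db + bd)(a_0\,da_1)
  &= d[a_0, a_1] - [da_0, a_1] \\
  &= da_0\,a_1 + a_0\,da_1 - da_1\,a_0 - a_1\,da_0 - da_0\,a_1 + a_1\,da_0 \\
  &= a_0\,da_1 - da_1\,a_0 ,
\end{align*}
% so that 1 - \kappa = db + bd gives
\kappa(a_0\,da_1) = da_1\,a_0 .

% X-complex relations:
\overline{b}(\natural\,da) = [1, a] = 0 , \qquad
\natural d\,\overline{b}(\natural\,a_0\,da_1)
  = \natural\,d[a_0, a_1]
  = \natural\big([da_0, a_1] + [a_0, da_1]\big) = 0 ,
% because \natural kills the commutator subspace [A, \Omega^1 A].
```

In particular $\kappa$ acts on one-forms by cyclically moving the differential to the front, which is why the commutator quotient $\Omega^1 A_\natural$ is exactly the space of $\kappa$-coinvariants in degree one.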
3.4. The entire de Rham–Karoubi complex. There is still another complex related to cyclic homology, namely the de Rham–Karoubi complex [22]. In our context of bornological algebras, we have to consider its completed version. So let $A$ be a complete bornological algebra. Recall that the de Rham–Karoubi bornology $\mathfrak{S}_\delta$ on $\Omega A$ is generated by the subsets $\bigcup_{n\ge 0}\frac{1}{[n/2]!}\,\tilde S \otimes S^{\otimes n}$, for any small set $S \in \mathfrak{S}(A)$. The completion of $\Omega A$ with respect to this bornology is $\Omega_\delta A$. Let $\Omega_\delta A_\natural$ be the completion of $\Omega_\delta A/[\Omega_\delta A, \Omega_\delta A]$ with respect to the quotient bornology, and $\natural : \Omega_\delta A \to \Omega_\delta A_\natural$ be the natural bounded map. The composition $\natural d : \Omega_\delta A \to \Omega_\delta A_\natural$ is bounded and vanishes on the commutator subspace $[\Omega_\delta A, \Omega_\delta A]$, thus it factors through a well-defined bounded map $\natural d : \Omega_\delta A_\natural \to \Omega_\delta A_\natural$. One obviously has $(\natural d)^2 = 0$, whence a bornological complex.

Definition 3.2. Let $A$ be a complete bornological algebra. The entire de Rham–Karoubi cohomology of $A$ is the cohomology of the $\mathbb{Z}_2$-graded complex $(\Omega_\delta A_\natural, \natural d)$:
$$H^*_{dR}(A) = H^*(\Omega_\delta A_\natural, \natural d), \qquad * = 0, 1. \tag{17}$$
There is a direct relation between the entire cyclic homology and the entire de Rham–Karoubi cohomology. Let $c : \Omega A \to \Omega A$ be the linear isomorphism sending the $n$-form $a_0\,da_1\ldots da_n$ to $\frac{1}{n!}\,a_0\,da_1\ldots da_n$. It is also a bornological isomorphism between $(\Omega A, \mathfrak{S}_\epsilon)$ and $(\Omega A, \mathfrak{S}_\delta)$, and thus extends to an isomorphism of complete bornological spaces $c : \Omega_\epsilon A \to \Omega_\delta A$. It is easy to show that the composition $\natural c : \Omega_\epsilon A \to \Omega_\delta A_\natural$ is a bounded morphism from the $(b+B)$-complex of entire chains to the de Rham–Karoubi complex, whence a natural (covariant) map
$$HE_*(A) \to H^*_{dR}(A), \qquad * = 0, 1. \tag{18}$$
The entire de Rham–Karoubi complex arises in differential geometry when characteristic classes of vector bundles are constructed from connections and curvatures [22]. If we let $A \in \Omega^1\mathcal{A}$ be a connection one-form over an algebra $\mathcal{A}$, with curvature $F = dA + A^2 \in \Omega^2\mathcal{A}$, then the Chern character form (here we omit some irrelevant $2\pi i$ factors)
$$\mathrm{ch}(A) = \natural \exp F \in \Omega_\delta\mathcal{A}_\natural \tag{19}$$
indeed defines an entire de Rham cocycle whose class lies in $H^0_{dR}(\mathcal{A})$. The use of exponentials will be crucial in our bivariant Chern character construction, because it allows one to combine heat kernel regularization with characteristic classes [32, 3]. Also the entire de Rham–Karoubi complex will be an important intermediate step.
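The closedness of the Chern character form (19) can be made explicit by the standard Bianchi-identity argument. The sketch below is a formal, term-by-term computation for a one-form $A$ with curvature $F = dA + A^2$; it assumes that the exponential series may be differentiated termwise in the completed complex and does not address convergence.

```latex
\begin{align*}
dF &= d(dA) + dA\,A - A\,dA = dA\,A - A\,dA , \\
[F, A] &= (dA + A^2)A - A(dA + A^2) = dA\,A - A\,dA ,
\end{align*}
% hence the Bianchi identity dF = [F, A].  Since F has even degree,
\begin{align*}
d(F^n) &= \sum_{i=0}^{n-1} F^i\,(dF)\,F^{n-1-i}
        = \sum_{i=0}^{n-1} \big( F^{i+1} A\,F^{n-1-i} - F^{i} A\,F^{n-i} \big)
        = F^n A - A\,F^n = [F^n, A] ,
\end{align*}
% a (graded) commutator, which is killed by \natural.  Therefore
\natural d \Big( \sum_{n \ge 0} \frac{F^n}{n!} \Big)
  = \sum_{n \ge 0} \frac{1}{n!}\,\natural\,[F^n, A] = 0 .
```

The telescoping sum in the middle line is the only place where the order of the non-commuting factors matters.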
4. A Goodwillie Theorem

One of the main properties of cyclic homology is the so-called Goodwillie theorem [18]. Roughly, it asserts that periodic cyclic homology is stable when taking nilpotent extensions. In other words, if $0 \to N \to E \to A \to 0$ is an extension of an algebra $A$, with $N$ a nilpotent ideal in $E$, then $A$ and $E$ have the same periodic cyclic (co)homology. This has been generalized by Meyer in the context of bornological algebras and entire cyclic homology [24], where algebraic nilpotence has to be replaced by the notion of analytic nilpotence. However, we don't need the whole theory of analytically nilpotent extensions here. Given a complete bornological algebra $A$, we will only be concerned with the universal analytically nilpotent extension
$$0 \to \mathcal{J}A \to \mathcal{T}A \to A \to 0. \tag{20}$$
Here the bounded projection $\mathcal{T}A \to A$ is induced by the multiplication map $m : a_1 \otimes \ldots \otimes a_n \mapsto a_1 \ldots a_n$, and $\mathcal{J}A$ is its kernel. Using the identification $\mathcal{T}A \cong (\Omega^+_{\mathrm{an}} A, \odot)$, the multiplication map simply coincides with the projection of an even differential form onto its degree zero component. The canonical linear embedding $\sigma_A : A \to \mathcal{T}A$ provides a bounded linear splitting of the exact sequence (20). The Goodwillie theorem then claims that the projection homomorphism $m : \mathcal{T}A \to A$ induces an isomorphism between $HE_*(\mathcal{T}A)$ and $HE_*(A)$. Moreover, the Chern character of $m$, corresponding to the class of the chain map $X(m_*) : X(\mathcal{T}\mathcal{T}A) \to X(\mathcal{T}A)$ in the entire bivariant cyclic cohomology $HE_0(\mathcal{T}A, A)$, has an inverse in $HE_0(A, \mathcal{T}A)$. The latter is constructed as follows [24]. There is a unique bounded homomorphism $v_A : \mathcal{T}A \to \mathcal{T}\mathcal{T}A$ such that $v_A \circ \sigma_A = \sigma_{\mathcal{T}A} \circ \sigma_A$, and it gives rise to a chain map $X(v_A) : X(\mathcal{T}A) \to X(\mathcal{T}\mathcal{T}A)$, whose cohomology class is the inverse of $\mathrm{ch}(m)$, i.e. $\mathrm{ch}(m) \cdot [X(v_A)] = 1$ in $HE_0(\mathcal{T}A, \mathcal{T}A)$ and $[X(v_A)] \cdot \mathrm{ch}(m) = 1$ in $HE_0(A, A)$. In this section we shall present a slightly different construction of the inverse of $\mathrm{ch}(m)$. It will be represented by a bounded chain map of degree zero
$$\gamma : X(\mathcal{T}A) \to \Omega_\epsilon\mathcal{T}A \tag{21}$$
from the X-complex of $\mathcal{T}A$ to the $(b + B)$-complex of entire chains over $\mathcal{T}A$. Since $(\Omega_\epsilon\mathcal{T}A, b + B)$ is homotopy equivalent to $X(\mathcal{T}\mathcal{T}A)$, the map $\gamma$ indeed defines a bivariant class $[\gamma] \in HE_0(A, \mathcal{T}A)$. The explicit expression of $\gamma$ will be an important part of the bivariant Chern character. Our aim is to prove Corollaries 4.4 and 4.5 below.

Before proceeding, we need some information about the bornology of $\Omega_\epsilon\mathcal{T}A$. The latter is the completion of $\Omega\mathcal{T}A = \bigoplus_{n\ge 0}\Omega^n\mathcal{T}A$ for the bornology $\mathfrak{S}_\epsilon(\Omega\mathcal{T}A)$ generated by the sets $\bigcup_{n\ge 0}[n/2]!\,\tilde T(\mathbf{d}T)^n$ for any small $T \in \mathfrak{S}(\mathcal{T}A)$, where $\mathcal{T}A$ is already the completion of the algebra $TA \cong (\Omega^+ A, \odot)$ for the bornology $\mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$ generated by $\bigcup_{n\ge 0}\tilde S(dS)^{2n}$, $S \in \mathfrak{S}(A)$. We could also obtain $\Omega_\epsilon\mathcal{T}A$ after only one completion of a certain bornological space. Let $\Omega TA := \bigoplus_{n\ge 0}\Omega^n TA$ be the DG algebra of non-commutative differential forms over the non-complete tensor algebra $TA$, where $\Omega^n TA = \tilde{T}A \otimes (TA)^{\otimes n}$ involves only algebraic (non-completed) tensor products. Endow $\Omega TA$ with the bornology $\mathfrak{S}_\epsilon(\Omega TA)$ generated by the sets $\bigcup_{n\ge 0}[n/2]!\,\tilde T(\mathbf{d}T)^n$, for any small $T \in \mathfrak{S}(TA) = \mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$.

Lemma 4.1. The completion of the bornological space $(\Omega TA, \mathfrak{S}_\epsilon)$ is canonically isomorphic to $\Omega_\epsilon\mathcal{T}A$.

Proof. It is a direct consequence of the universal property of bornological completions. Let us give some details. The natural map $(\Omega TA, \mathfrak{S}_\epsilon) \to (\Omega\mathcal{T}A, \mathfrak{S}_\epsilon)$ extending the arrow $TA \to \mathcal{T}A$ is clearly bounded. Composing with the universal map $(\Omega\mathcal{T}A, \mathfrak{S}_\epsilon) \to \Omega_\epsilon\mathcal{T}A$, we get a bounded map $(\Omega TA, \mathfrak{S}_\epsilon) \to \Omega_\epsilon\mathcal{T}A$. We thus have to show that $\Omega_\epsilon\mathcal{T}A$ is a solution of the universal problem associated to the bornological space $(\Omega TA, \mathfrak{S}_\epsilon)$. Let $W$ be any complete bornological space, and consider a bounded map $f : \Omega TA \to W$. Then the universal property of $\mathcal{T}A$ implies that there is a unique bounded map $f' : \Omega\mathcal{T}A \to W$ factorizing $f$. From $f'$ we then get a bounded map $f'' : \Omega_\epsilon\mathcal{T}A \to W$, also factorizing $f$. Moreover, the universal properties of completions imply that such an $f''$ is necessarily unique, hence the complete space $\Omega_\epsilon\mathcal{T}A$ identifies canonically with the bornological completion of $(\Omega TA, \mathfrak{S}_\epsilon)$.
The space of non-commutative forms $\Omega TA$ endowed with the usual boundaries $(b, B)$ is a (non-complete) bornological bicomplex. Consider the following subcomplex:
$$F = b\Omega^2 TA \oplus \bigoplus_{n\ge 2}\Omega^n TA, \tag{22}$$
which is stable by $b$ and $B$. The quotient $\Omega TA/F$ corresponds to the $\mathbb{Z}_2$-graded bornological complex
$$X(TA) : \quad TA \ \underset{\overline{b}}{\overset{\natural\mathbf{d}}{\rightleftarrows}}\ \Omega^1 TA_\natural := \Omega^1 TA / b\Omega^2 TA, \tag{23}$$
whose completion identifies with $X(\mathcal{T}A)$. The projection $\pi : \Omega TA \to \Omega TA/F$ being bounded, it extends to a bounded chain map
$$\pi : \Omega_\epsilon\mathcal{T}A \to X(\mathcal{T}A), \tag{24}$$
representing a bivariant entire cohomology class $[\pi] \in HE_0(\mathcal{T}A, A)$. It turns out that $[\pi]$ has an inverse $[\gamma] \in HE_0(A, \mathcal{T}A)$, that we are going to construct as a bounded chain map
$$\gamma : X(\mathcal{T}A) \to \Omega_\epsilon\mathcal{T}A. \tag{25}$$
According to the terminology of Cuntz and Quillen [14, 15], the non-completed tensor algebra $TA$ is algebraically quasi-free. This means that there is a right connection $\nabla : \Omega^1 TA \to \Omega^2 TA$, i.e. a linear map satisfying
$$\nabla(x\omega) = x\nabla\omega, \qquad \nabla(\omega x) = \nabla\omega\,x + \omega\,\mathbf{d}x, \qquad \forall x \in TA,\ \omega \in \Omega^1 TA. \tag{26}$$
Equivalently, there is a linear map $\phi : TA \to \Omega^2 TA$ such that $\phi(xy) = \phi(x)y + x\phi(y) + \mathbf{d}x\,\mathbf{d}y$ for any $x, y \in TA$. One obtains $\phi$ from $\nabla$ by setting $\phi(x) := \nabla(\mathbf{d}x)$. To show that such maps exist, one can use the fact that $TA$ is a free algebra, and construct $\phi$ recursively:
$$\phi(a) := 0 \quad \forall a \in A; \qquad \phi(a \otimes x) = a\phi(x) + \mathbf{d}a\,\mathbf{d}x \quad \forall a \in A,\ x \in TA, \tag{27}$$
and then $\nabla(x\,\mathbf{d}y) = x\phi(y)$, $\forall x, y \in TA$. In the remainder of the text we will always use these definitions of $\nabla$ and $\phi$. Note that any other admissible map $\phi'$ may be obtained from $\phi$ by adding a derivation from $TA$ to the $TA$-bimodule $\Omega^2 TA$. Now extend $\nabla$ to a map $\Omega^n TA \to \Omega^{n+1} TA$ for any $n \ge 1$ by the recursive relation
$$\nabla(\omega\,\mathbf{d}x) = (\nabla\omega)\,\mathbf{d}x, \qquad \forall \omega \in \Omega^n TA,\ x \in TA, \tag{28}$$
then put $\phi = \nabla B : \Omega^n TA \to \Omega^{n+2} TA$ for any $n \ge 0$. The latter extends the previous map $\phi$ to $\Omega TA$. Explicitly one has
$$\phi(x_0\,\mathbf{d}x_1\ldots\mathbf{d}x_n) = \nabla(1 + \kappa + \cdots + \kappa^n)(\mathbf{d}x_0\,\mathbf{d}x_1\ldots\mathbf{d}x_n) = \sum_{i=0}^{n}(-)^{ni}\,\phi(x_i)\,\mathbf{d}x_{i+1}\ldots\mathbf{d}x_{i-1}, \tag{29}$$
for any $x_j \in TA$. One can compute the successive powers of $\phi$. It turns out that given an element $x$ of $TA$, $\phi^k(x)$ vanishes for $k$ sufficiently large (depending on $x$). Indeed, let
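$x = a_1 \otimes a_2 \otimes \cdots \otimes a_n$. From the definition (27), one sees that for $k = [n/2]$, $\phi^k(x)$ contains only terms of the form $\mathbf{d}a_i\ldots\mathbf{d}a_n\,\mathbf{d}a_1\ldots\mathbf{d}a_{i-1}$ or $a_i\,\mathbf{d}a_{i+1}\ldots\mathbf{d}a_{i-1}$, and $\phi(a_i) = 0$ implies $\phi^{k+1}(x) = 0$. More generally, $\phi^k(\omega) = 0$ for any $\omega \in \Omega TA$ and $k \gg 0$. Consequently, the operator $1 - \phi$ is invertible on $\Omega TA$, because the power series
$$(1 - \phi)^{-1} = \sum_{k=0}^{\infty}\phi^k \tag{30}$$
only takes a finite number of non-zero terms on any $\omega \in \Omega TA$.

Proposition 4.2. (i) For any $n \ge 2$, $\nabla b + b\nabla = -\mathrm{Id}$ on $\Omega^n TA$. Hence $\nabla$ is a contracting homotopy for the $b$-cohomology of the subcomplex $F = b\Omega^2 TA \oplus \bigoplus_{n\ge 2}\Omega^n TA$ of $\Omega TA$.
(ii) The map $\gamma : X(TA) \to \Omega TA$ defined by
$$TA \ni x \mapsto (1 - \phi)^{-1}(x), \qquad \Omega^1 TA_\natural \ni \natural x\,\mathbf{d}y \mapsto (1 - \phi)^{-1}\big(x\,\mathbf{d}y + b(x\phi(y))\big) \tag{31}$$
is a morphism from the X-complex of $TA$ to the $(b + B)$-complex $\Omega TA$.
(iii) Let $\pi : \Omega TA \to X(TA)$ be the natural projection. There is a contracting homotopy $h : \Omega TA \to \Omega TA$ such that
$$\pi \circ \gamma = \mathrm{Id} \quad \text{on } X(TA), \tag{32}$$
$$\gamma \circ \pi = \mathrm{Id} + (b + B)h + h(b + B) \quad \text{on } \Omega TA. \tag{33}$$

Proof. (i) Let us first show that for any $\omega \in \Omega^n TA$, $n \ge 1$, and $x \in TA$ one has
$$\nabla(x\omega) = x\nabla\omega, \qquad \nabla(\omega x) = \nabla\omega\,x - (-)^n\omega\,\mathbf{d}x.$$
The first relation is trivial. The second one is proved by induction on $n$. Suppose the relation holds for $n$, then for any $\omega \in \Omega^n TA$ and $x, y \in TA$,
$$\nabla(\omega\,\mathbf{d}x\,y) = \nabla(\omega\,\mathbf{d}(xy)) - \nabla(\omega x\,\mathbf{d}y) = \nabla\omega\,\mathbf{d}(xy) - \nabla(\omega x)\,\mathbf{d}y = \nabla\omega\,\mathbf{d}(xy) - \nabla\omega\,x\,\mathbf{d}y + (-)^n\omega\,\mathbf{d}x\,\mathbf{d}y = \nabla(\omega\,\mathbf{d}x)\,y + (-)^n\omega\,\mathbf{d}x\,\mathbf{d}y,$$
proving the relation for $n + 1$. Next, any element of $\Omega^n TA$, for $n \ge 2$, is a sum of elements like $\omega\,\mathbf{d}x$ with $\omega \in \Omega^{n-1} TA$, $x \in TA$. Thus
$$(\nabla b + b\nabla)(\omega\,\mathbf{d}x) = (-)^{n-1}\nabla[\omega, x] + b(\nabla\omega\,\mathbf{d}x) = (-)^{n-1}\big(\nabla\omega\,x + (-)^n\omega\,\mathbf{d}x\big) - (-)^{n-1}x\nabla\omega + (-)^n[\nabla\omega, x] = -\omega\,\mathbf{d}x,$$
which concludes the proof.
(ii) First, the map $x\,\mathbf{d}y \mapsto x\,\mathbf{d}y + b(x\phi(y))$ vanishes on $b\Omega^2 TA = [TA, \Omega^1 TA]$ (see [15] §7), hence is well-defined on $\Omega^1 TA_\natural$. Therefore $\gamma$ is well-defined. Next, since $\phi = \nabla B$ one has $\phi B = 0$. Also $b\nabla + \nabla b = -\mathrm{Id}$ on $\bigoplus_{n\ge 2}\Omega^n TA$, thus the relation $b\phi - \phi b = b\nabla B - \nabla Bb = -(1 + \nabla b)B - \nabla Bb = -B$ holds on $\bigoplus_{n\ge 1}\Omega^n TA$. This implies $(1 - \phi)(b + B) = b + B - \phi b = b(1 - \phi)$, and composing from right and left by $(1 - \phi)^{-1}$ (which preserves the subspace $\bigoplus_{n\ge 1}\Omega^n TA$) yields
$$(b + B)(1 - \phi)^{-1} = (1 - \phi)^{-1}b \quad \text{on } \bigoplus_{n\ge 1}\Omega^n TA. \tag{34}$$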
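Let $x \in TA$. Using (34) and $\phi B = 0$, one has
$$\gamma(\natural\mathbf{d}x) = (1 - \phi)^{-1}(\mathbf{d}x + b\phi(x)) = \sum_{n=0}^{\infty}\phi^n B(x) + (1 - \phi)^{-1}b\phi(x) = B(x) + (b + B)(1 - \phi)^{-1}\phi(x),$$
and since $b(x) = 0$ one deduces $\gamma(\natural\mathbf{d}x) = (b + B)(1 - \phi)^{-1}(x) = (b + B)\gamma(x)$. Now let $\natural x\,\mathbf{d}y \in \Omega^1 TA_\natural$. One has $\gamma(\overline{b}\,\natural x\,\mathbf{d}y) = \gamma\,b(x\,\mathbf{d}y) = (1 - \phi)^{-1}b(x\,\mathbf{d}y)$. On the other hand, $(b + B)\gamma(\natural x\,\mathbf{d}y) = (b + B)(1 - \phi)^{-1}\big(x\,\mathbf{d}y + b(x\phi(y))\big) = (1 - \phi)^{-1}b(x\,\mathbf{d}y)$ by using (34). Hence $(b + B) \circ \gamma = \gamma \circ (\natural\mathbf{d} + \overline{b})$, proving that $\gamma$ is a chain map.
(iii) One obviously has $\pi \circ \gamma = \mathrm{Id}$ on $X(TA)$, whence a split exact sequence of complexes
$$0 \to F \to \Omega TA \ \underset{\gamma}{\overset{\pi}{\rightleftarrows}}\ X(TA) \to 0.$$
Let $Q := \gamma \circ \pi$ be the projection of $\Omega TA$ onto the image of $X(TA)$. Then $F = \mathrm{Ker}\,Q$, $X(TA) \cong \mathrm{Im}\,Q$ as complexes and $\Omega TA \cong F \oplus X(TA)$. We now construct a contracting homotopy for $F$. Let $h = (1 - \phi)^{-1}\nabla$ on $F$. One has $hF \subset F$ because the image of $h$ are differential forms of degree $\ge 2$. Extend $h$ to $\Omega TA$ by setting $h = 0$ on $X(TA)$. Then a simple computation using (34) and $\nabla b + b\nabla = -1$ on $\bigoplus_{n\ge 2}\Omega^n TA$ shows that $(b + B)h + h(b + B) = -\mathrm{Id}$ on $F$. Thus with $h = (1 - \phi)^{-1}\nabla(1 - Q)$ on $\Omega TA$, one gets $\gamma \circ \pi = Q = \mathrm{Id} + (b + B)h + h(b + B)$, and the proposition follows.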
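To make the map $\gamma$ of Proposition 4.2 concrete, here is a worked evaluation in the two lowest tensor degrees. It is an illustrative sketch that uses only the recursion (27), the definition $\phi = \nabla B$ and formula (31); the elements $a, a_1, a_2 \in A$ are arbitrary.

```latex
% Tensor degree 1:  \phi(a) = 0 by (27), hence
\gamma(a) = (1 - \phi)^{-1}(a) = a .

% Tensor degree 2:  by (27) with x = a_2,
\phi(a_1 \otimes a_2) = a_1\,\phi(a_2) + \mathbf{d}a_1\,\mathbf{d}a_2
                      = \mathbf{d}a_1\,\mathbf{d}a_2 ,
% and B(\mathbf{d}a_1\,\mathbf{d}a_2) = 0 because \mathbf{d}^2 = 0, so
\phi^2(a_1 \otimes a_2) = \nabla B(\mathbf{d}a_1\,\mathbf{d}a_2) = 0 ,
\qquad
\gamma(a_1 \otimes a_2) = a_1 \otimes a_2 + \mathbf{d}a_1\,\mathbf{d}a_2 .

% Consistency with (32): \pi kills forms of degree \ge 2, hence
\pi\,\gamma(a_1 \otimes a_2) = a_1 \otimes a_2 .

% Chain-map property on \natural\mathbf{d}a:  \phi(a) = 0 and B(\mathbf{d}a) = 0 give
\gamma(\natural\,\mathbf{d}a) = (1 - \phi)^{-1}(\mathbf{d}a) = \mathbf{d}a
  = (b + B)(a) = (b + B)\,\gamma(a) .
```

In general $\gamma(x)$ is $x$ plus a finite tail of forms of degree $\ge 2$, which is why $\pi \circ \gamma = \mathrm{Id}$ holds exactly while $\gamma \circ \pi$ is only homotopic to the identity.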
Proposition 4.2 shows that $\gamma$ and $\pi$ realize inverse homotopy equivalences between $X(TA)$ and $\Omega TA$. This is not very interesting at first sight, since the homology of these complexes is in fact trivial! However, these are bornological complexes and their completions compute the entire cyclic homology of $A$. The key point is that all the maps we have constructed are in fact bounded for the bornology $\mathfrak{S}_\epsilon(\Omega TA)$:

Proposition 4.3. The maps $\nabla$, $\phi$, $(1 - \phi)^{-1}$, $h$ are bounded on $\Omega TA$ and thus extend to bounded linear maps on the completion $\Omega_\epsilon\mathcal{T}A$. Also, $\gamma$ is bounded and extends to a bounded chain map
$$\gamma : X(\mathcal{T}A) \to \Omega_\epsilon\mathcal{T}A. \tag{35}$$
Proof. Recall that the bornology $\mathfrak{S}_\epsilon(\Omega TA)$ is generated by the collection of subsets $\bigcup_{n\ge 0}[n/2]!\,\tilde T(\mathbf{d}T)^n$, for any small $T \in \mathfrak{S}(TA) = \mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$. We will consider a set of generators of $\mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$ that will appear complicated at first sight, but it will
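simplify drastically the proof of boundedness. For any $S \in \mathfrak{S}(A)$, we put $T_n(S) = \tilde S \odot (dS\,dS)^n \odot \tilde S$, and consider the following subset of $\Omega^+ A$:
$$V(S) = (2S + S^2) \cup \bigcup_{N\ge 0}\sum_{n=0}^{N} T_n(S) \subset \Omega^+ A = TA.$$
We claim that $\mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$ is generated by $\{V(S)\,|\,S \in \mathfrak{S}(A)\}$. Indeed, by definition $\mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$ is generated by the set $\mathcal{U} = \{\bigcup_{n\ge 0}\tilde S(dS)^{2n}\,|\,S \in \mathfrak{S}(A)\}$. Then for any $S \in \mathfrak{S}(A)$, $\bigcup_{n\ge 0}\tilde S(dS)^{2n} \subset V(S)$. Conversely, given a small $S$ in $A$, we have to show that $V(S)$ is contained in the circled convex hull of elements of $\mathcal{U}$. Let $n \in \mathbb{N}$, one has
$$\tilde S \odot (dS\,dS)^n \odot \tilde S = \big(\tilde S(dS)^{2n}\big) \odot \tilde S \subset \tilde S(dS)^{2n}\tilde S + (dS)^{2n+2} \subset \tilde S(dS)^{2n-1}d(S\tilde S) + \tilde S(dS)^{2n-2}d(S^2)\,dS + \ldots + (\tilde S S)(dS)^{2n} + (dS)^{2n+2},$$
where the latter sum contains $2n + 2$ terms. Let $S'$ be the disk $(S\tilde S)^\diamond$, then we find
$$\sum_{n=0}^{N} T_n(S) \subset \sum_{n=0}^{N+1}\big((n + 1)\,\tilde S'(dS')^{2n}\big)^\diamond.$$
But the rescaling $U = 4S'$ implies
$$\sum_{n=0}^{N} T_n(S) \subset \sum_{n=0}^{N+1}\frac{1}{2^{2n+1}}\big(\tilde U(dU)^{2n}\big)^\diamond \subset \Big(\bigcup_{n\ge 0}\tilde U(dU)^{2n}\Big)^\diamond,$$
because the sum $\sum_{n=0}^{N+1} 1/2^{2n+1}$ is always less than 1. This shows that $V(S)$ is contained in the disked hull of $\bigcup_{n\ge 0}\tilde U(dU)^{2n}$. Consequently $V(S)$ is small for the bornology $\mathfrak{S}_{\mathrm{an}}(TA)$ as claimed, and $\{V(S)\,|\,S \in \mathfrak{S}(A)\}$ generates the analytic bornology of $TA$.

We now show that $\phi$ is bounded. For a small $S \in \mathfrak{S}(A)$ let $\tilde a_0 \odot da_1 da_2 \odot \cdots \odot da_{2n-1}da_{2n} \odot \tilde a_{2n+1} \in T_n(S) \subset \Omega^+ A$. The identity $\phi(x \odot y) = \phi(x)y + x\phi(y) + \mathbf{d}x\,\mathbf{d}y$ for any $x, y \in TA$, as well as $\phi(a) = 0$ $\forall a \in A$, imply $\phi(da_1 da_2) = -\mathbf{d}a_1\,\mathbf{d}a_2$, and more generally
$$\begin{aligned}
\phi(\tilde a_0 \odot da_1 da_2 \odot \cdots \odot da_{2n-1}da_{2n} \odot \tilde a_{2n+1}) ={}& \mathbf{d}\tilde a_0\,\mathbf{d}(da_1 da_2 \odot \cdots \odot da_{2n-1}da_{2n} \odot \tilde a_{2n+1}) \\
&+ \sum_{i=1}^{n}\Big(-\,\tilde a_0 \odot da_1 da_2 \odot \cdots\ \mathbf{d}a_{2i-1}\,\mathbf{d}(a_{2i} \odot \cdots \odot \tilde a_{2n+1}) \\
&\qquad + \tilde a_0 \odot da_1 da_2 \odot \cdots\ \mathbf{d}(a_{2i-1}a_{2i})\,\mathbf{d}(da_{2i+1}da_{2i+2} \odot \cdots \odot \tilde a_{2n+1}) \\
&\qquad - \tilde a_0 \odot da_1 da_2 \odot \cdots \odot a_{2i-1}\,\mathbf{d}a_{2i}\,\mathbf{d}(da_{2i+1}da_{2i+2} \odot \cdots \odot \tilde a_{2n+1})\Big).
\end{aligned}$$
This implies (with $T_n := T_n(S)$ for any $n$)
$$\phi(T_n) \subset \mathbf{d}\tilde S\,\mathbf{d}T_n + \sum_{i=1}^{n}\big(T_{i-1}\,\mathbf{d}(2S + S^2)\,\mathbf{d}T_{n-i}\big)^\diamond$$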
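and $\phi\big(\sum_{n=0}^{N} T_n\big)$ is contained in the disked hull of the sum $\mathbf{d}\tilde S\,\mathbf{d}\big(\sum_{n=0}^{N} T_n\big) + \big(\sum_{n=0}^{N} T_n\big)\,\mathbf{d}(2S + S^2)\,\mathbf{d}\big(\sum_{p=0}^{N} T_p\big)$. Furthermore, $\phi$ vanishes on $A$ so that $\phi(2S + S^2) = 0$, and with $V = V(S)$ one gets
$$\phi(V) \subset \bigcup_{N\ge 0}\Big(\mathbf{d}\tilde S\,\mathbf{d}\Big(\sum_{n=0}^{N} T_n\Big) + \Big(\sum_{n=0}^{N} T_n\Big)\mathbf{d}(2S + S^2)\,\mathbf{d}\Big(\sum_{p=0}^{N} T_p\Big)\Big)^\diamond \subset (\mathbf{d}V\,\mathbf{d}V + V\,\mathbf{d}V\,\mathbf{d}V)^\diamond = (\tilde V\,\mathbf{d}V\,\mathbf{d}V)^\diamond.$$
Now, the bornology $\mathfrak{S}_\epsilon$ on $\Omega TA$ is generated by the sets $\bigcup_n [n/2]!\,\tilde V(\mathbf{d}V)^n$, for $V = V(S)$, $S \in \mathfrak{S}(A)$. One thus has
$$\phi\big(\tilde V(\mathbf{d}V)^n\big) = \nabla B\big(\tilde V(\mathbf{d}V)^n\big) \subset \nabla\big((n + 1)(\mathbf{d}V)^{n+1}\big)^\diamond \subset (n + 1)\big(\phi(V)(\mathbf{d}V)^n\big)^\diamond \subset (n + 1)\big(\tilde V(\mathbf{d}V)^{n+2}\big)^\diamond.$$
$(n + 1)$ grows polynomially, so that by rescaling $V$, one can find a small set $V' \in \mathfrak{S}_{\mathrm{an}}(\Omega^+ A)$ such that $\phi\big(\bigcup_n [n/2]!\,\tilde V(\mathbf{d}V)^n\big)$ is contained in the disked hull of $\bigcup_n [1 + n/2]!\,\tilde V'(\mathbf{d}V')^{n+2}$. Hence $\phi$ is bounded for the bornology $\mathfrak{S}_\epsilon$ of $\Omega TA$. Next, since $\nabla(x_0\,\mathbf{d}x_1\ldots\mathbf{d}x_n) = x_0\,\phi(x_1)\,\mathbf{d}x_2\ldots\mathbf{d}x_n$ for any $x_i \in TA$, $\nabla$ is also bounded. Let us now focus on $(1 - \phi)^{-1} = \sum_{k=0}^{\infty}\phi^k$. For any $V = V(S)$, one has
$$(1 - \phi)^{-1}\big(\tilde V(\mathbf{d}V)^n\big) \subset \sum_{k=0}^{\infty}\big((n + 1)(n + 3)\ldots(n + 2k + 1)\,\tilde V(\mathbf{d}V)^{n+2k}\big)^\diamond.$$
In fact, this is a finite sum on all elements of $\tilde V(\mathbf{d}V)^n$. Elementary estimates on the function $n!$ show there is a constant number $\lambda$ such that $(n + 1)(n + 3)\ldots(n + 2k + 1) \le \lambda^{n+2k+1}\,\frac{[k + n/2]!}{[n/2]!}$. Hence $(1 - \phi)^{-1}\big([n/2]!\,\tilde V(\mathbf{d}V)^n\big)$ is contained in the disked hull of $\sum_{k=0}^{\infty}[k + n/2]!\,\lambda\tilde V(\lambda\,\mathbf{d}V)^{n+2k}$, and by the rescaling $W = 2\lambda V$, one gets
$$(1 - \phi)^{-1}\big([n/2]!\,\tilde V(\mathbf{d}V)^n\big) \subset \Big(\bigcup_p [p/2]!\,\tilde W(\mathbf{d}W)^p\Big)^\diamond.$$
Since $W$ does not depend on $n$, this shows that $(1 - \phi)^{-1}$ is bounded. It remains to study the morphism $\gamma : X(TA) \to \Omega TA$. By definition, for any $x \in X_0(TA) = TA$, one has $\gamma(x) = (1 - \phi)^{-1}(x)$, hence $\gamma$ is bounded on $X_0(TA)$. Let now $\natural x\,\mathbf{d}y \in X_1(TA) = \Omega^1 TA_\natural$. One has $\gamma(\natural x\,\mathbf{d}y) = (1 - \phi)^{-1}\big(x\,\mathbf{d}y + b(x\phi(y))\big)$. If we use the bornological vector space isomorphism $\Omega^1 TA_\natural \cong \tilde{T}A \otimes A$, it is sufficient to evaluate $\gamma$ on the element $\natural x\,\mathbf{d}a \in \tilde{T}A \otimes A$. Since $\phi(a) = 0$, one has $\gamma(\natural x\,\mathbf{d}a) = (1 - \phi)^{-1}(x\,\mathbf{d}a)$. We deduce that $\gamma$ is bounded, and also $Q = \gamma\pi$ and $h = (1 - \phi)^{-1}\nabla(1 - Q)$.

Corollary 4.4. The chain map $\gamma : X(\mathcal{T}A) \to \Omega_\epsilon\mathcal{T}A$ is a homotopy equivalence. Its class $[\gamma]$ in the bivariant entire cyclic cohomology $HE_0(A, \mathcal{T}A)$ is the inverse of $[\pi] \in HE_0(\mathcal{T}A, A)$.

Proof. By the universal properties of completions and Proposition 4.2 (iii), one has $\pi \circ \gamma = \mathrm{Id}$ on $X(\mathcal{T}A)$ and $\gamma \circ \pi = \mathrm{Id} + [b + B, h]$ on $\Omega_\epsilon\mathcal{T}A$.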
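Two elementary numerical facts are used in the proof of Proposition 4.3: the bound on the geometric sum, and the existence of the constant $\lambda$. The following sketch spells them out. The explicit value $\lambda = 2$ is our own illustrative choice (the text only asserts that some constant exists), and $m$ is shorthand for $[n/2]$, so that $[k + n/2] = k + m$.

```latex
% Geometric sum:
\sum_{n=0}^{N+1} \frac{1}{2^{2n+1}}
  < \frac{1}{2} \sum_{n=0}^{\infty} \frac{1}{4^n}
  = \frac{1}{2} \cdot \frac{4}{3} = \frac{2}{3} < 1 .

% Factorial estimate, with m = [n/2], so that n \le 2m + 1 and
% n + 2j + 1 \le 2(m + j + 1) for j = 0, ..., k:
\begin{align*}
(n+1)(n+3)\cdots(n+2k+1)
  &\le 2^{k+1} \prod_{j=0}^{k} (m + j + 1)
   = 2^{k+1}\,(m + k + 1)\,\frac{(m+k)!}{m!} \\
  &\le 2^{k+1}\, 2^{m+k}\, \frac{(m+k)!}{m!}
   \qquad (\text{using } t + 1 \le 2^t \text{ for integers } t \ge 0) \\
  &\le 2^{\,n+2k+1}\, \frac{[k + n/2]!}{[n/2]!}
   \qquad (\text{using } m \le n) .
\end{align*}
```

For instance, with $n = 3$ and $k = 2$ the left-hand side is $4 \cdot 6 \cdot 8 = 192$, while the right-hand side is $2^8 \cdot 3!/1! = 1536$.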
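Corollary 4.5. Let $m : \mathcal{T}A \to A$ be the multiplication map, $v_A : \mathcal{T}A \to \mathcal{T}\mathcal{T}A$ the canonical bounded homomorphism, and $\gamma$, $\pi$ as before. Let also $c : X(\mathcal{T}\,\cdot) \to \Omega_\epsilon(\cdot)$ be the bornological isomorphism (11) and $P$ the spectral projection onto the $\kappa^2$-invariant forms. Then the following diagram of chain maps commutes up to homotopy:
$$\begin{array}{ccc}
X(\mathcal{T}\mathcal{T}A) & \xrightarrow{\;P \circ c\;} & \Omega_\epsilon\mathcal{T}A \\[4pt]
{\scriptstyle X(m_*)}\big\downarrow\ \big\uparrow{\scriptstyle X(v_A)} & {\scriptstyle \gamma}\nearrow\ \swarrow{\scriptstyle \pi} & \big\downarrow{\scriptstyle \Omega_\epsilon(m)} \\[4pt]
X(\mathcal{T}A) & \xrightarrow{\;P \circ c\;} & \Omega_\epsilon A
\end{array} \tag{36}$$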
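Moreover, all the arrows are homotopy equivalences.

Proof. Let $\kappa$ be the Karoubi operator on $\Omega_\epsilon\mathcal{T}A$. For any $x, y \in \mathcal{T}A$ one has $\kappa(x) = x$ and $\kappa(x\,\mathbf{d}y) = \mathbf{d}y\,x$, hence the projection $\pi : \Omega_\epsilon\mathcal{T}A \to X(\mathcal{T}A)$ is $\kappa$-invariant: $\pi \circ \kappa = \pi$. It follows that $\pi$ is invariant under the spectral projection $P$, and $\pi \circ P \circ c = \pi \circ c$. Consider the composition of linear maps
$$X(\mathcal{T}A) \xrightarrow{\;X(v_A)\;} X(\mathcal{T}\mathcal{T}A) \xrightarrow{\;c\;} \Omega_\epsilon\mathcal{T}A \xrightarrow{\;\pi\;} X(\mathcal{T}A).$$
A direct computation using the definitions shows that it is the identity on $X(\mathcal{T}A)$, hence $\pi \circ P \circ c \circ X(v_A) = \mathrm{Id}_{X(\mathcal{T}A)}$. Since $\pi$ and $P \circ c$ are homotopy equivalences, one deduces that $X(v_A)$ is also a homotopy equivalence. Next, the composition $\mathcal{T}A \xrightarrow{\;v_A\;} \mathcal{T}\mathcal{T}A \xrightarrow{\;m_*\;} \mathcal{T}A$ is the identity homomorphism of $\mathcal{T}A$. Hence $X(m_*) \circ X(v_A) = \mathrm{Id}_{X(\mathcal{T}A)}$, and $X(m_*)$ is a homotopy equivalence inverting $X(v_A)$. Since we know that $\gamma$ is the inverse of $\pi$, we are left with the commutative diagram up to homotopy
$$\begin{array}{ccc}
X(\mathcal{T}\mathcal{T}A) & \xrightarrow{\;P \circ c\;} & \Omega_\epsilon\mathcal{T}A \\[4pt]
{\scriptstyle X(m_*)}\big\downarrow\ \big\uparrow{\scriptstyle X(v_A)} & {\scriptstyle \gamma}\nearrow\ \swarrow{\scriptstyle \pi} & \\[4pt]
X(\mathcal{T}A) & &
\end{array}$$
The bottom right corner of (36) follows from the functoriality of $\Omega_\epsilon(\cdot)$ and $X(\mathcal{T}\,\cdot)$, which implies $\Omega_\epsilon(m) \circ P \circ c = P \circ c \circ X(m_*)$.

5. Families of Spectral Triples

5.1. Definition. Let $B$ be a $\mathbb{Z}_2$-graded complete bornological algebra. Given a $\mathbb{Z}_2$-graded complete bornological vector space $H$, we can consider the (graded) completed tensor product $E = H \hat\otimes B$. Since the multiplication on $B$ is bounded, the obvious right action of $B$ on $H \otimes B$ extends to the completion $E$. This turns $E$ into a $\mathbb{Z}_2$-graded bornological right $B$-module, i.e. the following bilinear map is bounded:
$$E \times B \to E, \qquad (h \otimes b_1, b_2) \mapsto h \otimes b_1 b_2. \tag{37}$$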
Denote by EndB (E) the Z2 -graded algebra of bounded endomorphisms of E, commuting with the action of B. We always endow EndB (E) with the bornology of equibounded endomorphisms, so that it is a complete bornological algebra.
Definition 5.1. Let $A$ and $B$ be complete bornological algebras. We assume $A$ is trivially graded and $B$ is $\mathbb{Z}_2$-graded. Then a family of spectral triples over $B$, or an unbounded $A$-$B$-bimodule, is a triple $(E, \rho, D)$ corresponding to the following data:
• A $\mathbb{Z}_2$-graded complete bornological vector space $H$ and the corresponding right $B$-module $E = H \hat\otimes B$.
• A bounded homomorphism $\rho : A \to \mathrm{End}_B(E)$ sending $A$ to even degree endomorphisms of $E$. Hence $E$ is a bornological left $A$-module.
• An unbounded endomorphism $D : \mathrm{Dom}(D) \subset E \to E$ of odd degree, defined on a bornologically dense domain of $E$ and commuting with the right action of $B$:
$$D \cdot (\xi b) = (D \cdot \xi)b \qquad \forall \xi \in E,\ b \in B. \tag{38}$$
$D$ is also called a Dirac operator.
• The commutator $[D, \rho(a)]$ extends to an element of $\mathrm{End}_B(E)$ for any $a \in A$.
• For any $t \in \mathbb{R}_+$, the heat kernel $\exp(-tD^2)$ is densely defined and extends to a bounded endomorphism of $E$.
We denote by $\Psi(A, B)$ the set of such unbounded bimodules.

It is clear that this definition is an adaptation of the Baaj and Julg picture of unbounded Kasparov bimodules [4], with $C^*$-algebras replaced by bornological algebras. Note however that we do not require the "resolvent" $(1 + D^2)^{-1}$ to be a compact endomorphism as in Kasparov theory. Instead, we deal with the heat operator $\exp(-tD^2)$, and the compactness will be replaced by the so-called $\theta$-summability condition (see Definitions 6.3 and 7.2). Roughly speaking, $\theta$-summability means that the heat kernel is a trace-class endomorphism for $t > 0$, a well-defined notion in bornology. Before proceeding further, let us mention some simple examples of unbounded bimodules, in the case of a trivially graded algebra $B$:

Example 5.2. Homomorphisms: if $D = 0$ and $H = \mathbb{C}^{n_+} \oplus \mathbb{C}^{n_-}$ is a finite-dimensional $\mathbb{Z}_2$-graded space (with fine bornology), then the triple $(E, \rho, D)$ reduces to a pair of bounded homomorphisms $\rho_\pm : A \to M_{n_\pm}(B)$. If moreover $A = \mathbb{C}$, the latter is equivalent to a pair of idempotents $e_\pm = \rho_\pm(1) \in M_{n_\pm}(B)$. The difference "$e_+ - e_-$" then describes an algebraic K-theory class of $B$.

Example 5.3. $B = \mathbb{C}$: then $\Psi(A, \mathbb{C})$ is just the set of triples $(H, \rho, D)$. If $H$ is a Hilbert space, $D$ a selfadjoint unbounded operator and the heat kernel $\exp(-tD^2)$ is trace-class for any $t > 0$, then $(H, \rho, D)$ is a $\theta$-summable spectral triple over $A$.

Example 5.4. $A = \mathbb{C}$ and $\rho(1) = 1 \in \mathrm{End}_B(E)$: the homomorphism $\mathbb{C} \to \mathrm{End}_B(E)$ completely disappears. We view $E = H \hat\otimes B$ as the space of sections of a trivial vector bundle over the non-commutative manifold $B$, and $D$ represents a "family of Dirac operators" acting by endomorphisms on $E$. This also describes a K-theory element of $B$. More generally, if $\rho(1) = e \neq 1$ is any idempotent in $\mathrm{End}_B(E)$, then by virtue of the Serre–Swan theorem, $eE$ is a twisted vector bundle over $B$, and $(e, D)$ represents a twisted family of Dirac operators. Classically, such examples are provided by longitudinal elliptic operators on fibered manifolds or foliations [3, 6].
5.2. Higher bimodules and formal Bott periodicity. We shall introduce the higher unbounded bimodules. Let $C_1 = \mathbb{C} \oplus \varepsilon\mathbb{C}$ be the one-dimensional complex Clifford algebra (with fine bornology). It is a $\mathbb{Z}_2$-graded algebra generated by the unit 1 in degree zero and $\varepsilon$ in degree one, with $\varepsilon^2 = 1$. For any $n \ge 1$, the $n$-dimensional complex Clifford algebra is the graded tensor product $C_n = C_1^{\hat\otimes n}$, and by convention $C_0 = \mathbb{C}$. For any trivially graded complete bornological algebra $B$, the completed tensor product $C_n \hat\otimes B$ (which also coincides with the algebraic tensor product) is thus a $\mathbb{Z}_2$-graded complete bornological algebra.

Definition 5.5. Let $A$ and $B$ be trivially graded complete bornological algebras. For any $n \ge 0$, we set $\Psi_n(A, B) := \Psi(A, C_n \hat\otimes B)$.

It is well-known that, due to the formal Bott periodicity $C_{n+2} \cong M_2(C_n)$, only the first two cases $n = 0$ and $n = 1$ are relevant:
• $n = 0$. One has $\Psi_0(A, B) = \Psi(A, B)$. Let $(E, \rho, D)$ be such a bimodule, with $E = H \hat\otimes B$. The complete bornological space $H$ is $\mathbb{Z}_2$-graded, hence it comes equipped with an involutive operator $\Gamma$, $\Gamma^2 = 1$, which splits $H$ into two eigenspaces $H_+$ and $H_-$ of even and odd vectors respectively. Also the right $B$-module $E$ splits into two eigenspaces $E_\pm = H_\pm \hat\otimes B$. We adopt the usual $2 \times 2$ matrix notation
$$E = \begin{pmatrix} E_+ \\ E_- \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{39}$$
The even (resp. odd) part of the $\mathbb{Z}_2$-graded algebra $\mathrm{End}_B(E)$ is represented by diagonal (resp. off-diagonal) matrices. By definition the homomorphism $\rho : A \to \mathrm{End}_B(E)$ commutes with $\Gamma$, whereas $D$ anticommutes. In matricial notations one thus has
$$\rho(a) = \begin{pmatrix} \rho_+(a) & 0 \\ 0 & \rho_-(a) \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & D_- \\ D_+ & 0 \end{pmatrix} \tag{40}$$
for any $a \in A$.
• $n = 1$. Let $(E, \rho, D) \in \Psi_1(A, B) = \Psi(A, C_1 \hat\otimes B)$. One thus has $E = H \hat\otimes C_1 \hat\otimes B$ for some $\mathbb{Z}_2$-graded bornological space $H = H_+ \oplus H_-$. We may write the graded tensor product $H \hat\otimes C_1$ as the direct sum of its even and odd part:
$$H \hat\otimes C_1 = (H_+ \oplus H_-) \hat\otimes (\mathbb{C} \oplus \varepsilon\mathbb{C}) = (H_+ \oplus H_-\varepsilon) \oplus (H_+\varepsilon \oplus H_-) = K \hat\otimes C_1, \tag{41}$$
where $K = H_+ \oplus H_-\varepsilon$ is a trivially graded bornological space. Hence the module $E$ is the direct sum of two copies of $K \hat\otimes B$:
$$E = \begin{pmatrix} K \hat\otimes B \\ K \hat\otimes B \end{pmatrix}, \tag{42}$$
whose right action of $C_1 \hat\otimes B$ is such that $\varepsilon$ flips the two factors. It follows that any endomorphism $z \in \mathrm{End}_{C_1 \hat\otimes B}(E)$ reads
$$z = \begin{pmatrix} x & y \\ y & x \end{pmatrix}, \tag{43}$$
with $x, y \in \mathrm{End}_B(K \hat\otimes B)$. As a consequence, there is a bounded homomorphism $\alpha : A \to \mathrm{End}_B(K \hat\otimes B)$ and an unbounded endomorphism $Q : K \hat\otimes B \to K \hat\otimes B$ such that
$$\rho(a) = \begin{pmatrix} \alpha(a) & 0 \\ 0 & \alpha(a) \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & Q \\ Q & 0 \end{pmatrix}. \tag{44}$$
This is the general matricial form for an element of $\Psi_1(A, B)$.
• $n \ge 2$. The study of an element $(E, \rho, D) \in \Psi_n(A, B) = \Psi(A, C_n \hat\otimes B)$ is analogous to the previous one for $\Psi_1(A, B)$. We can reduce $E$ to a product $K \hat\otimes C_n \hat\otimes B$ for a certain trivially graded vector space $K$, and consequently $\mathrm{End}_{C_n \hat\otimes B}(E) = \mathrm{End}_B(K \hat\otimes B) \hat\otimes C_n$. If $n = 2k$ is even, then $C_n = M_2(M_{2^{k-1}}(\mathbb{C}))$ as a $\mathbb{Z}_2$-graded algebra, with standard even/odd grading corresponding respectively to the diagonal/off-diagonal $2 \times 2$ block matrices. It follows that $\mathrm{End}_{C_n \hat\otimes B}(E) = M_2(\mathrm{End}_B(K \hat\otimes \mathbb{C}^{2^{k-1}} \hat\otimes B))$ in $2 \times 2$ matrix notation. The homomorphism $\rho$ and the Dirac operator thus decompose as in (40). This shows that up to stabilization by matrices of arbitrary size, the elements of $\Psi_{2k}(A, B)$ correspond exactly to the elements of $\Psi_0(A, B)$. If $n = 2k + 1$ is odd, one has $C_n = M_{2^k}(\mathbb{C}) \hat\otimes C_1$ as a $\mathbb{Z}_2$-graded algebra, where $M_{2^k}(\mathbb{C})$ is trivially graded and $C_1$ has its natural grading. Consequently, $\mathrm{End}_{C_n \hat\otimes B}(E) = \mathrm{End}_B(K \hat\otimes \mathbb{C}^{2^k} \hat\otimes B) \hat\otimes C_1$. The homomorphism $\rho$ and the Dirac operator thus decompose as in (44); hence up to stabilization by matrices, $\Psi_{2k+1}(A, B)$ corresponds to $\Psi_1(A, B)$.

All in all, due to formal Bott periodicity there are only two different sets of bimodules, the even ones $\Psi_0(A, B)$, and the odd ones $\Psi_1(A, B)$. In both cases, a $\mathbb{Z}_2$-graded module splits into the direct sum $E = E_+ \oplus E_-$, according to which the homomorphism $\rho$ has a diagonal form, and the Dirac operator $D$ is off-diagonal.

5.3. Properties. Let $A$ and $B$ be trivially graded complete bornological algebras. It is readily seen that $\Psi_*(A, B)$, $* = 0, 1$, is a semigroup under direct orthogonal sum: $(E, \rho, D) + (E', \rho', D') = (E \oplus E', \rho \oplus \rho', D \oplus D')$.
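Returning for a moment to the odd case $n = 1$, the matrix form (43) can be recovered by a direct computation. The sketch below assumes that, in the decomposition (42), the generator $\varepsilon$ is represented by the flip matrix $F$ (our reading of "$\varepsilon$ flips the two factors"); under that assumption, commuting with $\varepsilon$ forces the displayed shape. The entries $p, q, r, s$ are hypothetical names for arbitrary elements of $\mathrm{End}_B(K \hat\otimes B)$.

```latex
\begin{align*}
z &= \begin{pmatrix} p & q \\ r & s \end{pmatrix}, \qquad
F = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \\
zF &= \begin{pmatrix} q & p \\ s & r \end{pmatrix}, \qquad
Fz = \begin{pmatrix} r & s \\ p & q \end{pmatrix}, \\
zF = Fz \;&\Longleftrightarrow\; p = s,\ q = r
\;\Longleftrightarrow\;
z = \begin{pmatrix} x & y \\ y & x \end{pmatrix}
\quad (x := p,\ y := q) .
\end{align*}
```

The diagonal part $x$ is the even component and the off-diagonal part $y$ the odd component, which is how the even homomorphism $\rho$ and the odd operator $D$ acquire the forms displayed in (44).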
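Since we don't deal with $C^*$-algebras, there is a priori no reason to find an interesting composition product $\Psi(A, \tilde B) \times \Psi(B, \tilde C) \to \Psi(A, \tilde C)$ as in the case of Kasparov's theory. Nevertheless, $\Psi$ is a bimodule over the category of bornological algebras in the following sense. If we let $\mathrm{Mor}(A, B) \subset \Psi_0(A, \tilde B)$ be the set of bounded homomorphisms from $A$ to $B$, then there is a well-defined left product $\mathrm{Mor}(A, B) \times \Psi_*(B, \tilde C) \to \Psi_*(A, \tilde C)$ given by
$$\varphi \cdot (E, \rho, D) = (E, \rho \circ \varphi, D), \tag{45}$$
with $E = H \hat\otimes \tilde C$. For the right product we must consider the unitalizations $\tilde B$ and $\tilde C$. Let $(E, \rho, D) \in \Psi_*(A, \tilde B)$ with $E = H \hat\otimes \tilde B$, and consider a unital bounded homomorphism $\varphi : \tilde B \to \tilde C$. Then the right product $\Psi_*(A, \tilde B) \times \mathrm{Mor}(\tilde B, \tilde C) \to \Psi_*(A, \tilde C)$ reads
$$(E, \rho, D) \cdot \varphi = (E \hat\otimes_\varphi \tilde C,\ \rho \otimes \mathrm{Id},\ D \otimes \mathrm{Id}), \tag{46}$$
where $E \hat\otimes_\varphi \tilde C$ is canonically isomorphic to $H \hat\otimes \tilde C$. Our aim is to construct a Chern character map
$$\mathrm{ch} : \Psi_*^\theta(A, \tilde B) \to HE_*(A, B), \qquad * = 0, 1, \tag{47}$$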
Bivariant Chern Character for Families of Spectral Triples
with domain the strongly $\theta$-summable unbounded $A$-$\tilde B$-bimodules (see Definitions 6.3 and 7.2), and range the bivariant entire cyclic cohomology of $A$ and $B$. This Chern character has to be additive, invariant under differentiable homotopies (Definition 6.6) and functorial with respect to $A$ and $B$, which means that the following diagram commutes:
$$\begin{array}{ccc}
\mathrm{Mor}(A, B) \times \Psi_*^\theta(B, \tilde C) & \longrightarrow & \Psi_*^\theta(A, \tilde C) \\
{\scriptstyle \mathrm{ch} \times \mathrm{ch}} \downarrow & & \downarrow {\scriptstyle \mathrm{ch}} \\
HE_0(A, B) \times HE_*(B, C) & \longrightarrow & HE_*(A, C)
\end{array} \tag{48}$$
and similarly for the right product (46).

6. Algebra Cochains and Superconnections

Let $A$ and $B$ be trivially graded complete bornological algebras, and $\tilde B$ the unitalization of $B$. To any unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$ verifying suitable $\theta$-summability conditions, we will associate a bounded chain map $\chi(E, \rho, D)$ from the $(b+B)$-complex of entire chains $\Omega_\epsilon A$ to the $X$-complex $X(B)$. This morphism, playing a central role in the bivariant Chern character, is obtained from the exponential of the curvature of a superconnection, as in the Bismut-Quillen approach to the families index theorem [3, 32]. To do this, we adapt the theory of algebra cochains developed by Quillen [33] to the bornological framework. For convenience, we postpone to the Appendix a self-contained account of Quillen's formalism.

6.1. Bar construction. Let $A$ be a complete bornological algebra, $\Omega A = \bigoplus_{n \ge 0} \Omega^n A$ the $(b, B)$-bicomplex of noncommutative forms over $A$, with $\Omega^n A = \tilde A \hat\otimes A^{\hat\otimes n}$. We use the bar complex
$$B(A) = \bigoplus_{n \ge 0} B_n(A), \tag{49}$$
where $B_n(A) = A^{\hat\otimes n}$ and $B_0(A) = \mathbb{C}$. Recall (Appendix) that $B(A)$ is a graded coassociative coalgebra. The coproduct $\Delta : B(A) \to B(A) \hat\otimes B(A)$ is given by
$$\Delta(a_1 \otimes \cdots \otimes a_n) = \sum_{i=0}^{n} (a_1 \otimes \cdots \otimes a_i) \otimes (a_{i+1} \otimes \cdots \otimes a_n), \tag{50}$$
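for any $a_j \in A$. Furthermore, one has a boundary map $b' : B(A) \to B(A)$ of degree $-1$:
$$b'(a_1 \otimes \cdots \otimes a_n) = \sum_{i=1}^{n-1} (-)^{i+1}\, a_1 \otimes \cdots \otimes a_i a_{i+1} \otimes \cdots \otimes a_n, \tag{51}$$
verifying $b'^2 = 0$ and $\Delta b' = (b' \otimes \mathrm{Id} + \mathrm{Id} \otimes b')\Delta$. Thus $B(A)$ is a graded differential coalgebra. There is an associated free bicomodule $\Omega_1 B(A) = B(A) \hat\otimes A \hat\otimes B(A)$, with left and right bicomodule maps
$$\Delta_l = \Delta \otimes \mathrm{Id} \otimes \mathrm{Id} : \Omega_1 B(A) \to B(A) \hat\otimes \Omega_1 B(A), \qquad \Delta_r = \mathrm{Id} \otimes \mathrm{Id} \otimes \Delta : \Omega_1 B(A) \to \Omega_1 B(A) \hat\otimes B(A). \tag{52}$$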
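The coproduct (50) and the boundary (51) can be checked mechanically on a toy model. The sketch below is only an illustration under stated assumptions: the algebra is taken to be the free monoid algebra on strings (concatenation as product), chains are integer-valued dictionaries, and the term Id ⊗ b' is assumed to carry the usual Koszul sign given by the length of the left factor. None of these names come from the paper.

```python
from collections import defaultdict

# Toy model (assumption): basis elements of the bar complex are tuples of
# strings, and the product of the algebra is string concatenation.

def b_prime(chain):
    """Boundary (51): sum over i of (-1)^(i+1) a1 ⊗ ... ⊗ a_i a_{i+1} ⊗ ... ⊗ an."""
    out = defaultdict(int)
    for word, c in chain.items():
        for i in range(1, len(word)):
            merged = word[:i - 1] + (word[i - 1] + word[i],) + word[i + 1:]
            out[merged] += (-1) ** (i + 1) * c
    return {w: c for w, c in out.items() if c != 0}

def coproduct(chain):
    """Coproduct (50): every way of cutting a word into a left and a right part."""
    out = defaultdict(int)
    for word, c in chain.items():
        for i in range(len(word) + 1):
            out[(word[:i], word[i:])] += c
    return {k: c for k, c in out.items() if c != 0}

def b_prime_on_tensor(tensor):
    """b' ⊗ Id + Id ⊗ b', with the Koszul sign (-1)^len(left) on the second term."""
    out = defaultdict(int)
    for (left, right), c in tensor.items():
        for w, s in b_prime({left: 1}).items():
            out[(w, right)] += s * c
        for w, s in b_prime({right: 1}).items():
            out[(left, w)] += (-1) ** len(left) * s * c
    return {k: c for k, c in out.items() if c != 0}

word = {("a", "b", "c", "d"): 1}
assert b_prime(b_prime(word)) == {}                                     # b'^2 = 0
assert coproduct(b_prime(word)) == b_prime_on_tensor(coproduct(word))   # coderivation
```

Both identities hold for words of any length in this model; the two assertions only exercise a word of length four.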
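$\Omega_1 B(A)$ is endowed with a differential $b'' : \Omega_1 B(A) \to \Omega_1 B(A)$, $b''^2 = 0$, compatible with the bicomodule structure and $b'$ (see Appendix). There is also a projection $\partial : \Omega_1 B(A) \to B(A)$ defined by
$$\partial\big((a_1 \otimes \cdots \otimes a_{i-1}) \otimes a_i \otimes (a_{i+1} \otimes \cdots \otimes a_n)\big) = a_1 \otimes \cdots \otimes a_n. \tag{53}$$
It is a coderivation ($\Delta\partial = (\mathrm{Id} \otimes \partial)\Delta_l + (\partial \otimes \mathrm{Id})\Delta_r$) and a morphism of complexes ($\partial b'' = b' \partial$). The last operator we need is the injection $\natural : \Omega A \to \Omega_1 B(\tilde A)$:
$$\natural(\tilde a_0\, da_1 \ldots da_n) = \sum_{i=0}^{n} (-)^{n(i+1)} (a_{i+1} \otimes \cdots \otimes a_n) \otimes \tilde a_0 \otimes (a_1 \otimes \cdots \otimes a_i), \tag{54}$$
for any $\tilde a_0$ in the unitalization $\tilde A = \mathbb{C} \oplus A$ and $a_j \in A$. Then $\natural$ is a cotrace, see Appendix.

We now endow the bar complex and its associated bicomodule with new bornologies satisfying the entire growth condition. Let $S_\epsilon(B(A))$ be the bornology generated by the sets $\bigcup_{n \ge 0} [n/2]!\, S^{\otimes n}$ for any $S \in S(A)$. We denote by $B_\epsilon(A)$ the completion of $B(A)$ with respect to this bornology. Also, let $S_\epsilon(\Omega_1 B(A))$ be the bornology generated by $\bigcup_{n,p \ge 0} [(n+p)/2]!\, (S^{\otimes n}) \otimes S \otimes (S^{\otimes p})$, $S \in S(A)$. The corresponding completion $\Omega_1 B_\epsilon(A)$ identifies with $B_\epsilon(A) \hat\otimes A \hat\otimes B_\epsilon(A)$.

Lemma 6.1. All the maps $\Delta$, $\Delta_{l,r}$, $b'$, $b''$, $\partial$, $\natural$ are bounded for the entire bornology and thus extend to the completions $B_\epsilon(A)$, $\Omega_1 B_\epsilon(A)$ and $\Omega_\epsilon A$.

Proof. It is a direct consequence of the definitions. Let us for example check the boundedness of the coproduct on $B(A)$. For any small $S \in S(A)$ one has
$$\Delta([n/2]!\, S^{\otimes n}) \subset [n/2]! \sum_{i=0}^{n} (S^{\otimes i}) \otimes (S^{\otimes (n-i)}) \subset \sum_{i=0}^{n} \frac{[n/2]!}{[i/2]!\,[(n-i)/2]!} \Big( ([i/2]!\, S^{\otimes i}) \otimes ([(n-i)/2]!\, S^{\otimes (n-i)}) \Big)^{\diamond}.$$
But there is a constant $\lambda$ such that $\frac{[n/2]!}{[i/2]!\,[(n-i)/2]!} \le \lambda^n$ for any $n$ and $i \le n$, so that $\Delta([n/2]!\, S^{\otimes n})$ is contained in
$$\lambda^n (n+1)\, \frac{1}{n+1} \sum_{i=0}^{n} \Big( ([i/2]!\, S^{\otimes i}) \otimes ([(n-i)/2]!\, S^{\otimes (n-i)}) \Big)^{\diamond}.$$
The sum over $i$ lies in the circled convex hull of the set $(\bigcup_m [m/2]!\, S^{\otimes m}) \otimes (\bigcup_p [p/2]!\, S^{\otimes p})$, hence by rescaling appropriately $S$, one can find a small $T \in S(A)$ such that
$$\Delta\Big(\bigcup_n [n/2]!\, S^{\otimes n}\Big) \subset \Big( \bigcup_m [m/2]!\, T^{\otimes m} \otimes \bigcup_p [p/2]!\, T^{\otimes p} \Big)^{\diamond}, \tag{55}$$
and the conclusion follows. The other operators are treated similarly.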
In particular, the completed bar complex is a $\mathbb{Z}_2$-graded differential coalgebra, and $\Omega_1 B_\epsilon(A)$ is a $B_\epsilon(A)$-bicomodule. All the results of the Appendix extend also to these completions.
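The proof of Lemma 6.1 relies on a constant λ bounding the factorial ratio by λ^n. As a hedged sanity check, the sketch below evaluates that ratio exactly and searches for a violation of the bound; the value λ = 2 is our own choice for illustration and is not stated in the text, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def entire_ratio(n, i):
    """[n/2]! / ([i/2]! [(n-i)/2]!), where [x] denotes the integer part."""
    return Fraction(factorial(n // 2),
                    factorial(i // 2) * factorial((n - i) // 2))

def first_violation(lam, n_max):
    """First pair (n, i) with ratio > lam**n, or None if the bound holds up to n_max."""
    for n in range(n_max + 1):
        for i in range(n + 1):
            if entire_ratio(n, i) > Fraction(lam) ** n:
                return (n, i)
    return None

# With the assumed value lam = 2 no violation shows up in this range,
# whereas lam = 1 already fails at n = 4.
print(first_violation(2, 60), first_violation(1, 10))
```

The exact arithmetic with `Fraction` avoids any floating-point doubt about the comparison.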
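6.2. The bimodule $\Omega_\delta E$. Let $A$ and $B$ be complete bornological algebras, $\tilde B$ the unitalization of $B$, and consider an unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$. One thus has $E = H \hat\otimes \tilde B$ for some complete $\mathbb{Z}_2$-graded bornological space $H$; $\rho$ is a bounded homomorphism from $A$ to the even part of $\mathrm{End}_{\tilde B}(E)$ and $D : E \to E$ is an unbounded endomorphism with dense domain. Let $\tilde\Omega_\delta B$ be the unitalization of the complete DG algebra $\Omega_\delta B$. Recall that the latter is the completion of $\Omega B$ with respect to the de Rham-Karoubi bornology. By definition the unit $1 \in \tilde\Omega_\delta B$ verifies $d1 = 0$. Since $E$ is a bornological right $\tilde B$-module and $\tilde\Omega_\delta B$ a left $\tilde B$-module, we can form the completed tensor product over $\tilde B$:
$$\Omega_\delta E := E \hat\otimes_{\tilde B} \tilde\Omega_\delta B. \tag{56}$$
Since $\tilde B$ is unital, $\Omega_\delta E$ identifies with $H \hat\otimes \tilde\Omega_\delta B$. The space $\Omega_\delta E$ is naturally a $\mathbb{Z}_2$-graded complete bornological right $\tilde\Omega_\delta B$-module, endowed with a (bounded) differential $d$ induced by $d(h \otimes \omega) := (-)^{|h|} h \otimes d\omega$ for any $h \otimes \omega \in H \hat\otimes \tilde\Omega_\delta B$. Here $|h|$ is the degree of the homogeneous element $h \in H$. Let $L = \mathrm{End}_{\tilde\Omega_\delta B}(\Omega_\delta E)$ be the $\mathbb{Z}_2$-graded complete bornological algebra of bounded endomorphisms of $\Omega_\delta E$. It has a differential induced by the differential on $\Omega_\delta E$: for any $x \in L$, one has $dx = d \circ x - (-)^{|x|} x \circ d$. $L$ has a unit $1_L$ corresponding to the identity endomorphism of $\Omega_\delta E$, satisfying $d1_L = 0$; hence $L$ is a complete unital DG algebra. Any endomorphism $y \in \mathrm{End}_{\tilde B}(E)$ gives rise to an endomorphism of $\Omega_\delta E$ by
$$y \cdot (\xi \otimes \omega) = (y \cdot \xi) \otimes \omega \qquad \forall\, \xi \otimes \omega \in E \hat\otimes_{\tilde B} \tilde\Omega_\delta B, \tag{57}$$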
whence a bounded homomorphism $\mathrm{End}_{\tilde B}(E) \to L$. Composing $\rho$ with this map yields a bounded representation of $A$ into $L$, hence $\Omega_\delta E$ becomes a complete bornological $A$-$\tilde\Omega_\delta B$-bimodule. In the subsequent constructions we will need to extend the homomorphism $\rho : A \to L$ to the unitalization $\tilde A$ by setting $\rho(1_{\tilde A}) = 1_L$. Next, the endomorphism $D : E \to E$ being unbounded, it may fail to extend to the completion of $E \otimes_{\tilde B} \tilde\Omega_\delta B$. Therefore we have to assume that $D$ is a densely defined unbounded operator on $\Omega_\delta E$, commuting with the right action of $\tilde\Omega_\delta B$.

6.3. Trace-class endomorphisms. Let $E = H \hat\otimes \tilde B$ be a $\mathbb{Z}_2$-graded right $\tilde B$-module, and consider the complete bornological space $E'$ consisting of bounded right $\tilde B$-module maps $E \to \tilde B$. In other words, the action of $E'$ on $E$ is given by a bounded bilinear bracket $\langle\ ,\ \rangle : E' \times E \to \tilde B$ satisfying
$$\langle v, \xi b \rangle = \langle v, \xi \rangle b \qquad \forall\, v \in E',\ \xi \in E,\ b \in \tilde B. \tag{58}$$
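Then $E'$ is naturally a left $\tilde B$-module: for any $v \in E'$ and $b \in \tilde B$, the product $bv \in E'$ is defined by $\langle bv, \xi \rangle = b \langle v, \xi \rangle$, $\forall\, \xi \in E$.

Definition 6.2. The algebra of trace-class endomorphisms of $E$ is the $\mathbb{Z}_2$-graded complete bornological algebra $\ell^1(E) = E \hat\otimes_{\tilde B} E'$.

The product on $\ell^1(E)$ is induced by the bracket:
$$(\xi_1 \otimes v_1) \cdot (\xi_2 \otimes v_2) = \xi_1 \langle v_1, \xi_2 \rangle \otimes v_2 \qquad \forall\, \xi_i \in E,\ v_i \in E'. \tag{59}$$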
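In the simplest finite-dimensional situation the product (59) is just multiplication of rank-one matrices. The sketch below assumes, purely for illustration, that the base algebra is the ring of integers, that E is a space of integer column vectors, that E' consists of row functionals, and that all gradings are trivial; the helper names are ours.

```python
def bracket(v, xi):
    """Pairing <v, xi> of a functional v in E' with a vector xi in E."""
    return sum(a * b for a, b in zip(v, xi))

def as_matrix(xi, v):
    """Matrix of the rank-one endomorphism eta -> xi <v, eta>."""
    return [[x * a for a in v] for x in xi]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def product(t1, t2):
    """Product (59): (xi1 ⊗ v1)(xi2 ⊗ v2) = xi1 <v1, xi2> ⊗ v2."""
    (xi1, v1), (xi2, v2) = t1, t2
    pairing = bracket(v1, xi2)
    return ([x * pairing for x in xi1], v2)

def trace(t):
    """Tr(xi ⊗ v) = <v, xi> in the ungraded scalar case."""
    xi, v = t
    return bracket(v, xi)

t1, t2 = ([1, 2], [3, 4]), ([5, 6], [7, 8])
# (59) agrees with matrix multiplication, and the trace kills commutators.
assert as_matrix(*product(t1, t2)) == matmul(as_matrix(*t1), as_matrix(*t2))
assert trace(product(t1, t2)) == trace(product(t2, t1))
```

In the graded bornological setting of the text the trace acquires signs, which this ungraded toy model does not capture.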
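Also, $\ell^1(E)$ acts on $E$ by bounded endomorphisms: one has a bounded bilinear map $\ell^1(E) \times E \to E$ sending $(\xi_1 \otimes v, \xi_2)$ to $\xi_1 \langle v, \xi_2 \rangle$, compatible with the right action of $\tilde B$, whence a canonical bounded homomorphism $\ell^1(E) \to \mathrm{End}_{\tilde B}(E)$. This map may fail to be injective in general. Finally, $\ell^1(E)$ is an $\mathrm{End}_{\tilde B}(E)$-bimodule, the left multiplication $\mathrm{End}_{\tilde B}(E) \times \ell^1(E) \to \ell^1(E)$ corresponding to
$$(x, \xi \otimes v) \mapsto x(\xi) \otimes v \tag{60}$$
for any $x \in \mathrm{End}_{\tilde B}(E)$, $\xi \in E$, $v \in E'$, and the right multiplication $\ell^1(E) \times \mathrm{End}_{\tilde B}(E) \to \ell^1(E)$ sends $(\xi \otimes v, x)$ to $\xi \otimes (v \circ x)$.

Let us now turn to partial supertraces. We first define a map $\mathrm{Tr} : E \otimes E' \to \tilde B$ by
$$\mathrm{Tr}((h \otimes b) \otimes v) = (-)^{|h||v|} \langle bv, h \otimes 1_{\tilde B} \rangle \qquad \forall\, h \otimes b \in E,\ v \in E'. \tag{61}$$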
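Note the sign appearing in the r.h.s., depending on the degrees of the graded elements $h \in H$ and $v \in E'$. Moreover this map is well-defined on $E \otimes_{\tilde B} E'$ and bounded, hence it extends to a bounded map on the completion
$$\mathrm{Tr} : \ell^1(E) \to \tilde B. \tag{62}$$
$\mathrm{Tr}$ is a partial supertrace on $\ell^1(E)$ viewed as an $\mathrm{End}_{\tilde B}(E)$-bimodule. This means that, if we compose by the universal trace $\natural : \tilde B \to \tilde B_\natural = (\tilde B/[\tilde B, \tilde B])_{\text{completed}}$ (see Appendix), the resulting bounded map $\natural\mathrm{Tr} : \ell^1(E) \to \tilde B_\natural$ vanishes on the supercommutators $[\ell^1(E), \mathrm{End}_{\tilde B}(E)]$.

The same discussion holds for the right $\tilde\Omega_\delta B$-module $\Omega_\delta E$, with some improvements due to the presence of the differential $d$. Here the space $(\Omega_\delta E)'$ of bounded right $\tilde\Omega_\delta B$-module maps from $\Omega_\delta E$ to $\tilde\Omega_\delta B$ is a graded (complete) left $\tilde\Omega_\delta B$-module, also with a differential: for $v \in (\Omega_\delta E)'$, one puts $dv = d \circ v - (-)^{|v|} v \circ d$. Consequently the set of trace-class endomorphisms $\ell^1(\Omega_\delta E) := \Omega_\delta E \hat\otimes_{\tilde\Omega_\delta B} (\Omega_\delta E)'$ is a DG algebra (in general non-unital), the differential corresponding to
$$d(\xi \otimes v) = d\xi \otimes v + (-)^{|\xi|} \xi \otimes dv \qquad \forall\, \xi \in \Omega_\delta E,\ v \in (\Omega_\delta E)', \tag{63}$$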
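and it is well-defined on $\ell^1(\Omega_\delta E)$ because $d(\xi \otimes \omega v) = d(\xi\omega \otimes v)$ for any $\omega \in \tilde\Omega_\delta B$. It is also easy to show that the natural bounded map $\ell^1(\Omega_\delta E) \to \mathrm{End}_{\tilde\Omega_\delta B}(\Omega_\delta E) = L$ is a DG algebra (and $L$-bimodule) morphism, and that the partial supertrace $\mathrm{Tr} : \ell^1(\Omega_\delta E) \to \tilde\Omega_\delta B$ commutes with $d$.

6.4. Superconnections. We are ready now to introduce the fundamental chain map $\chi$. Let $(E, \rho, D) \in \Psi_*(A, \tilde B)$ be an unbounded bimodule. As above, $B_\epsilon(\tilde A)$ denotes the completed bar coalgebra of the unitalization of $A$, with coproduct $\Delta : B_\epsilon(\tilde A) \to B_\epsilon(\tilde A) \hat\otimes B_\epsilon(\tilde A)$, and $L = \mathrm{End}_{\tilde\Omega_\delta B}(\Omega_\delta E)$ is the unital DG algebra of bounded endomorphisms on $\Omega_\delta E$, with product $m : L \hat\otimes L \to L$. The space of bounded linear maps
$$R = \mathrm{Hom}(B_\epsilon(\tilde A), L) \tag{64}$$
is a $\mathbb{Z}_2$-graded complete bornological algebra for the convolution product $fg = m \circ (f \otimes g) \circ \Delta$, $\forall f, g \in R$. The differentials $d$ on $L$ and $b'$ on $B_\epsilon(\tilde A)$ induce two anticommuting differentials on $R$,
$$df = d \circ f, \qquad \delta f = -(-)^{|f|} f \circ b', \qquad \forall f \in R, \tag{65}$$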
which moreover satisfy the Leibniz rule for the convolution product. Associated to the $B_\epsilon(\tilde A)$-bicomodule $\Omega_1 B_\epsilon(\tilde A)$ is the $\mathbb{Z}_2$-graded $R$-bimodule (see Appendix)
$$M = \mathrm{Hom}(\Omega_1 B_\epsilon(\tilde A), L), \tag{66}$$
the left and right multiplication maps being respectively given by $f\gamma = m \circ (f \otimes \gamma) \circ \Delta_l$ and $\gamma f = m \circ (\gamma \otimes f) \circ \Delta_r$, for any $f \in R$, $\gamma \in M$. $M$ is also endowed with two anticommuting differentials $d\gamma = d \circ \gamma$ and $\delta\gamma = -(-)^{|\gamma|} \gamma \circ b''$, compatible with the $R$-bimodule maps. Last but not least, the transpose of the canonical coderivation $\partial : \Omega_1 B_\epsilon(\tilde A) \to B_\epsilon(\tilde A)$ yields a bounded derivation $\partial : R \to M$, commuting with $d$ and $\delta$.

Let us now consider the bounded homomorphism $\rho : A \to \mathrm{End}_{\tilde B}(E)$. We know that it gives rise to a bounded homomorphism from $A$ to $L$, and we extend it to $\tilde A$ by imposing $\rho(1_{\tilde A}) = 1_L$. Next, since the projection $B(\tilde A) \to B_1(\tilde A) = \tilde A$ is clearly bounded for the entire bornology, it extends to a bounded map $B_\epsilon(\tilde A) \to \tilde A$. Then composing with $\rho : \tilde A \to L$, we obtain a linear map of degree 1, which we also denote by $\rho$:
$$\rho \in \mathrm{Hom}(B_\epsilon(\tilde A), L) = R, \qquad |\rho| = 1. \tag{67}$$
We go back to the right $\tilde\Omega_\delta B$-module $\Omega_\delta E$. The action of the algebra of endomorphisms $L = \mathrm{End}_{\tilde\Omega_\delta B}(\Omega_\delta E)$ yields a bounded map $m : L \hat\otimes \Omega_\delta E \to \Omega_\delta E$. We form the left $R$-module
$$F = \mathrm{Hom}(B_\epsilon(\tilde A), \Omega_\delta E). \tag{68}$$
The module map $R \times F \to F$ comes from the convolution product $f \cdot \xi = m \circ (f \otimes \xi) \circ \Delta$, $\forall f \in R$, $\xi \in F$. It is immediate to check the compatibility of this action with the product on $R$: the coassociativity of $B_\epsilon(\tilde A)$ implies $f \cdot (g \cdot \xi) = (fg) \cdot \xi$. The differentials $d$ on $\Omega_\delta E$ and $b'$ on $B_\epsilon(\tilde A)$ imply as above that $F$ is a bidifferential $R$-bimodule: one has $d\xi = d \circ \xi$ and $\delta\xi = -(-)^{|\xi|} \xi \circ b'$ for any $\xi \in F$, and these differentials are compatible with the ones on $R$, i.e. $d(f \cdot \xi) = df \cdot \xi + (-)^{|f|} f \cdot d\xi$ and $\delta(f \cdot \xi) = \delta f \cdot \xi + (-)^{|f|} f \cdot \delta\xi$ for any $f \in R$.

The last ingredient we have is the Dirac operator $D$ acting on $\Omega_\delta E$ as an unbounded endomorphism of odd degree. We assume the induced unbounded linear map $D : F \to F$ has dense domain. Remark that if $D$ were bounded, it could be considered as a bounded homomorphism from $\mathbb{C} = B_0(\tilde A)$ to $L$ and thus would define an element of $R$. The map $D : F \to F$ would correspond to the left action of this element. The unboundedness of $D$ however prevents us from considering it as an element of $R$. We now introduce a superconnection $\mathcal{D} : F \to F$,
$$\mathcal{D} = \delta - d + \rho + D, \tag{69}$$
where $\rho$ is the odd element of $R$ induced by the unital homomorphism $\rho : \tilde A \to L$ and $D$ is the odd unbounded operator on $F$. For any $\tilde a_1, \tilde a_2 \in \tilde A$ one has $\delta\rho(\tilde a_1, \tilde a_2) = \rho b'(\tilde a_1, \tilde a_2) = \rho(\tilde a_1 \tilde a_2)$ and $\rho^2(\tilde a_1, \tilde a_2) = -\rho(\tilde a_1)\rho(\tilde a_2)$. Since $\rho$ is a homomorphism, it follows that $\delta\rho + \rho^2 = 0$. Furthermore $D$ can be viewed as a 0-cochain on the bar complex, hence $\delta D = D \circ b' = 0$. This implies that the curvature of $\mathcal{D}$ reads
$$\mathcal{D}^2 = (\delta - d)(\rho + D) + (\rho + D)^2 = -d(\rho + D) + [D, \rho] + D^2. \tag{70}$$
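The vanishing of δρ + ρ² for a homomorphism can be tested on matrices. The following sketch makes several assumptions that are ours and not the paper's: A and L are both taken to be 2×2 integer matrices with trivial grading, cochains are plain Python functions on tuples of matrices, and the convolution product carries the Koszul sign (-1)^{|g|·len(left)}.

```python
ZERO = ((0, 0), (0, 0))

def matmul(p, q):
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def add(p, q):
    return tuple(tuple(p[i][j] + q[i][j] for j in range(2)) for i in range(2))

def scale(c, p):
    return tuple(tuple(c * p[i][j] for j in range(2)) for i in range(2))

def one_cochain(lin_map):
    """Turn a linear map A -> L into a cochain supported on words of length one."""
    return lambda word: lin_map(word[0]) if len(word) == 1 else ZERO

def convolution(f, g, deg_g):
    """(fg)(word) = sum over cuts of (-1)^(deg_g * len(left)) f(left) g(right)."""
    def fg(word):
        total = ZERO
        for i in range(len(word) + 1):
            sign = (-1) ** (deg_g * i)
            total = add(total, scale(sign, matmul(f(word[:i]), g(word[i:]))))
        return total
    return fg

def delta(f, deg_f):
    """delta f = -(-1)^|f| f o b', with b' the bar boundary (51)."""
    def df(word):
        total = ZERO
        for i in range(1, len(word)):
            merged = word[:i - 1] + (matmul(word[i - 1], word[i]),) + word[i + 1:]
            sign = -((-1) ** deg_f) * (-1) ** (i + 1)
            total = add(total, scale(sign, f(merged)))
        return total
    return df

def curvature(lin_map):
    """The cochain delta(rho) + rho^2 attached to a linear map rho of odd degree."""
    rho = one_cochain(lin_map)
    d_rho, rho_sq = delta(rho, 1), convolution(rho, rho, 1)
    return lambda word: add(d_rho(word), rho_sq(word))

P, P_INV = ((1, 1), (0, 1)), ((1, -1), (0, 1))
conjugation = lambda a: matmul(matmul(P, a), P_INV)  # a homomorphism
transpose = lambda a: tuple(zip(*a))                 # linear, but not a homomorphism

a1, a2 = ((1, 2), (3, 4)), ((0, 1), (1, 5))
assert curvature(conjugation)((a1, a2)) == ZERO
assert curvature(transpose)((a1, a2)) != ZERO
```

The second assertion illustrates that the term δρ + ρ² survives when ρ is merely linear.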
Because of the terms $D^2$ and $dD$, the curvature is an unbounded operator of even degree acting on $F$. In the following, we want the heat kernel $\exp(-t\mathcal{D}^2)$ to be a bounded operator on $F$, hence defining an element of $R$. We first have to make precise the meaning of the heat kernel of $\mathcal{D}^2$. Using a Duhamel-type expansion, we write for any $t \in \mathbb{R}_+$,
$$\exp(-t\mathcal{D}^2) = \sum_{n \ge 0} (-t)^n \int_{\Delta_n} ds_1 \ldots ds_n\, e^{-ts_0 D^2}\, \Theta\, e^{-ts_1 D^2} \cdots \Theta\, e^{-ts_n D^2}, \tag{71}$$
where $\Delta_n$ is the $n$-simplex $\{(s_0, \ldots, s_n) \in [0, 1]^{n+1} \mid \sum_i s_i = 1\}$, and $\Theta = -d(\rho + D) + [D, \rho]$. By hypothesis (Definition 5.1), the heat kernel $\exp(-uD^2)$ is a bounded endomorphism of $E$ for any $u \in \mathbb{R}_+$, hence defines an even element of $R$. Also, the commutator $[D, \rho]$ takes its values in $\mathrm{End}_{\tilde B}(E)$ and thus lies in $R$. However $\Theta$ may not be in $R$ because $dD$ is not necessarily a bounded operator on $F$. Therefore, we must impose $\exp(-uD^2)$ to be a regulator, so that each term of the Duhamel expansion is bounded (hence in $R$), and that the series itself converges bornologically. This is part of the content of the following definition.

Definition 6.3 (Weak $\theta$-summability). Let $A$ and $B$ be complete bornological algebras. An unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$ is called weakly $\theta$-summable if the following conditions hold:
(i) Boundedness condition: the heat kernel $\exp(-t\mathcal{D}^2)$, given by the power series (71), converges bornologically to an element of $R = \mathrm{Hom}(B_\epsilon(\tilde A), L)$ for any $t \in \mathbb{R}_+$.
(ii) Trace-class condition: the natural homomorphism $\ell^1(\Omega_\delta E) \to L$ is injective, and for any $t > 0$, the heat kernel lies in $\mathrm{Hom}(B_\epsilon(\tilde A), \ell^1(\Omega_\delta E))$.

Recall there is a derivation of degree zero $\partial : R \to M$. Thus for $\rho \in R$, $\partial\rho \in M$ is odd. Assuming the $\theta$-summability condition 6.3, we form the following odd element of $M$:
$$\mu = \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial\rho\, e^{(t-1)\mathcal{D}^2} : \Omega_1 B_\epsilon(\tilde A) \to \ell^1(\Omega_\delta E). \tag{72}$$
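Then composing $\mu$ by the cotrace $\natural : \Omega_\epsilon A \to \Omega_1 B_\epsilon(\tilde A)$ yields an entire cochain on the $(b + B)$-complex of $A$, namely $\mu\natural \in \mathrm{Hom}(\Omega_\epsilon A, \ell^1(\Omega_\delta E))$.

Proposition 6.4. The bounded map $\mu\natural : \Omega_\epsilon A \to \ell^1(\Omega_\delta E)$ satisfies the Bianchi identity
$$\mu\natural(b + B) + [\mu, \rho + D]\natural = d\mu\natural. \tag{73}$$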
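Proof. One has $\partial(\mathcal{D}^2) = \partial\mathcal{D}\,\mathcal{D} + \mathcal{D}\,\partial\mathcal{D} = [\mathcal{D}, \partial\rho]$, thus
$$[\mathcal{D}, \mu] = \int_0^1 dt\, e^{-t\mathcal{D}^2} [\mathcal{D}, \partial\rho]\, e^{(t-1)\mathcal{D}^2} = \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial(\mathcal{D}^2)\, e^{(t-1)\mathcal{D}^2} = -\partial e^{-\mathcal{D}^2},$$
which is equivalent to $(\delta - d)\mu + [\rho + D, \mu] = -\partial \exp(-\mathcal{D}^2)$. Thus composing with the cotrace $\natural$, one gets (recall $\mu$ is odd)
$$\delta\mu\natural + \partial e^{-\mathcal{D}^2}\natural + [\mu, \rho + D]\natural = d\mu\natural.$$
Now from Lemma A.1 one has $\delta\mu\natural = \mu\natural b$. Moreover, $\exp(-t\mathcal{D}^2)$ is an even element of $R$. It vanishes if one of its arguments is equal to the unit $1 \in \tilde A$, because it involves the commutator $[D, \rho]$ and the differential $d\rho$. Thus Lemma A.2 implies
$$\mu\natural B = \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial\rho\, e^{(t-1)\mathcal{D}^2}\, \natural B = \partial e^{-\mathcal{D}^2}\natural,$$
and the conclusion follows.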
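The next step is to compose $\mu\natural$ with a partial supertrace $\tau$ on the superalgebra $\ell^1(\Omega_\delta E)$. Of course we have the canonical map $\mathrm{Tr} : \ell^1(\Omega_\delta E) \to \tilde\Omega_\delta B$, but what we need is a little bit more complicated, depending on the parity of the unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$:

(a) $(E, \rho, D) \in \Psi_0(A, \tilde B)$. We have $E = H \hat\otimes \tilde B$ for a certain $\mathbb{Z}_2$-graded complete bornological vector space $H = H_+ \oplus H_-$, and $\Omega_\delta E = H \hat\otimes \tilde\Omega_\delta B$ is given its natural $\mathbb{Z}_2$-graduation. We simply take $\tau$ as the even partial supertrace $\mathrm{Tr}$:
$$\tau = \mathrm{Tr} : \ell^1(\Omega_\delta E) \to \tilde\Omega_\delta B, \qquad (h \otimes \omega) \otimes v \mapsto \pm\,\omega \langle v, h \otimes 1 \rangle \tag{74}$$
for any $h \in H$, $\omega \in \tilde\Omega_\delta B$, and $v \in (\Omega_\delta E)'$. The sign $\pm$ depends on the parity of $|h|(|\omega| + |v|)$. The bounded map $\tau\mu\natural : \Omega_\epsilon A \to \tilde\Omega_\delta B$ is thus an even entire cochain, i.e. it sends an even (resp. odd) entire chain on $A$ to an even (resp. odd) chain in the de Rham-Karoubi completion of forms over $B$.

(b) $(E, \rho, D) \in \Psi_1(A, \tilde B)$. In this case, $E = K \hat\otimes C_1 \hat\otimes \tilde B$ for a trivially graded bornological space $K$, and $\Omega_\delta E = K \hat\otimes C_1 \hat\otimes \tilde\Omega_\delta B$. The operators $\rho$ and $D$, acting by endomorphisms on $\Omega_\delta E$, commute with the right action of the Clifford algebra $C_1$. Since the map $\mu\natural : \Omega_\epsilon A \to \ell^1(\Omega_\delta E)$ is made out of $\rho$ and $D$, its range must lie in the trace-class endomorphisms of $\Omega_\delta E$ commuting with $C_1$. This subalgebra of $\ell^1(\Omega_\delta E)$ identifies with $\ell^1(K \hat\otimes \tilde\Omega_\delta B) \hat\otimes C_1$. Therefore, we choose the partial supertrace $\tau : \ell^1(K \hat\otimes \tilde\Omega_\delta B) \hat\otimes C_1 \to \tilde\Omega_\delta B$ to be the tensor product of the canonical (even) partial supertrace $\mathrm{Tr} : \ell^1(K \hat\otimes \tilde\Omega_\delta B) \to \tilde\Omega_\delta B$ with an odd supertrace $\zeta : C_1 \to \mathbb{C}$. The latter is unique up to a multiplication factor, because the universal supercommutator quotient space $C_{1\natural} := C_1/[C_1, C_1]$ is one-dimensional. This normalization factor can be fixed uniquely by imposing the compatibility of the bivariant Chern character with suspension and Bott periodicity. This is done in Sect. 8.2. One finds $\zeta(\varepsilon) = \sqrt{2i}$ and $\zeta(1) = 0$. Thus the partial supertrace $\tau$ is odd and reads
$$\tau = \mathrm{Tr} \otimes \zeta : \ell^1(K \hat\otimes \tilde\Omega_\delta B) \hat\otimes C_1 \to \tilde\Omega_\delta B, \qquad x + \varepsilon y \mapsto \sqrt{2i}\,\mathrm{Tr}(y) \quad \forall x, y \in \ell^1(K \hat\otimes \tilde\Omega_\delta B). \tag{75}$$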
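The bounded map $\tau\mu\natural : \Omega_\epsilon A \to \tilde\Omega_\delta B$ is then an odd entire cochain, i.e. it sends an even (resp. odd) entire chain on $A$ to an odd (resp. even) chain on $B$.

For any $n \ge 0$, let $p_n : \Omega B \to \Omega^n B$ be the natural projection. It is bounded for the de Rham-Karoubi bornology $S_\delta(\Omega B)$, hence extends to a bounded map on the (unital) completion $p_n : \tilde\Omega_\delta B \to \Omega^n B$. We denote by $\Omega^1 B_\natural$ the completion of the commutator quotient space $\Omega^1 B/[B, \Omega^1 B]$, and by $\natural : \Omega^1 B \to \Omega^1 B_\natural$ the bounded map induced by projection. We set $\chi_0 = p_0 \tau\mu\natural : \Omega_\epsilon A \to B$ and $\chi_1 = \natural p_1 \tau\mu\natural : \Omega_\epsilon A \to \Omega^1 B_\natural$. These are the components of a bounded map from the space of entire chains on $A$ to the $X$-complex of $B$:
$$\chi(E, \rho, D) \in \mathrm{Hom}(\Omega_\epsilon A, X(B)). \tag{76}$$
$\chi$ has the same parity as the unbounded bimodule $(E, \rho, D)$. The components $\chi_0$ and $\chi_1$ can be expressed explicitly through the map $\mu_0 : \Omega_1 B_\epsilon(\tilde A) \to \ell^1(\Omega_\delta E)$ defined by
$$\mu_0 = \int_0^1 dt\, e^{-t\theta}\, \partial\rho\, e^{(t-1)\theta}, \tag{77}$$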
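with $\theta = D^2 + [D, \rho]$. The exponentials of $\theta$ have a Duhamel expansion
$$e^{-t\theta} = \sum_{n \ge 0} (-t)^n \int_{\Delta_n} ds_1 \ldots ds_n\, e^{-s_0 t D^2} [D, \rho]\, e^{-s_1 t D^2} \cdots [D, \rho]\, e^{-s_n t D^2}. \tag{78}$$
The map $\tau\mu_0\natural$ is valued in $\tilde B$ and $\chi_0$ is its projection onto $B$. On the other hand, the composition $\natural p_1 \tau : \ell^1(\Omega_\delta E) \to \Omega^1 B_\natural$ is a trace, which implies that $\natural p_1 \tau(\cdot)\natural$ is a trace on the $R$-bimodule $M$ (see Appendix A4). One thus has
$$\natural p_1 \tau\mu\natural = \natural p_1 \tau \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial\rho\, e^{(t-1)\mathcal{D}^2}\, \natural = \natural p_1 \tau\, \partial\rho\, e^{-\mathcal{D}^2} \natural, \tag{79}$$
hence
$$\chi_1 = \natural\tau\, \partial\rho \int_0^1 dt\, e^{-t\theta}\, d(\rho + D)\, e^{(t-1)\theta}\, \natural = \natural\tau\mu_0\, d(\rho + D)\natural. \tag{80}$$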
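The Duhamel expansions (71) and (78) generalise the first-order perturbation formula for matrix exponentials, namely that the derivative of exp(-(A + εB)) at ε = 0 equals minus the integral over [0, 1] of exp(-sA) B exp(-(1-s)A). The sketch below checks this n = 1 term numerically for small real matrices; the matrices, the Taylor-series exponential and the Simpson quadrature are illustrative choices of ours and play no role in the paper.

```python
def matmul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * a for a in r] for r in p]

def expm(m, terms=40):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, m))
        result = add(result, term)
    return result

def duhamel_first_order(a, b, steps=200):
    """Simpson approximation of -∫_0^1 e^{-sA} B e^{-(1-s)A} ds (steps must be even)."""
    n = len(a)
    total = [[0.0] * n for _ in range(n)]
    for j in range(steps + 1):
        s = j / steps
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        integrand = matmul(matmul(expm(scale(-s, a)), b), expm(scale(s - 1.0, a)))
        total = add(total, scale(weight, integrand))
    return scale(-1.0 / (3 * steps), total)

def finite_difference(a, b, eps=1e-4):
    """Central difference of eps -> exp(-(A + eps B)) at eps = 0."""
    plus = expm(scale(-1.0, add(a, scale(eps, b))))
    minus = expm(scale(-1.0, add(a, scale(-eps, b))))
    return scale(1.0 / (2 * eps), add(plus, scale(-1.0, minus)))

def max_abs_diff(p, q):
    return max(abs(x - y) for r, s in zip(p, q) for x, y in zip(r, s))

A = [[1.0, 0.3], [0.2, 2.0]]
B = [[0.0, 1.0], [1.0, 0.5]]
assert max_abs_diff(duhamel_first_order(A, B), finite_difference(A, B)) < 1e-6
```

In the text the role of A is played by the unbounded operator D², so the finite-dimensional check only illustrates the algebraic shape of the expansion, not its analytic content.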
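Proposition 6.5. $\chi$ is a morphism from the $(b + B)$-complex of entire chains on $A$ to the $X$-complex of $B$, i.e. $\chi_0 \circ (b + B) = \pm b \circ \chi_1$ and $\chi_1 \circ (b + B) = \pm \natural d \circ \chi_0$, the sign $\pm$ depending on the parity of $\chi$.

Proof. (i) $\chi_0(b + B) = \pm b\chi_1$: Composing Eq. (73) with the trace $\tau : \ell^1(\Omega_\delta E) \to \tilde\Omega_\delta B$ yields the equality of linear maps $\tau\mu\natural(b + B) + \tau[\mu, \rho + D]\natural = \tau d\mu\natural$, and projecting the range of these maps onto $\tilde B$, one gets
$$\tau\mu_0\natural(b + B) + \tau[\mu_0, \rho + D]\natural = 0.$$
We shall use the left and right bicomodule maps
$$\Delta_l : \Omega_1 B_\epsilon(\tilde A) \to B_\epsilon(\tilde A) \hat\otimes \Omega_1 B_\epsilon(\tilde A), \qquad \Delta_r : \Omega_1 B_\epsilon(\tilde A) \to \Omega_1 B_\epsilon(\tilde A) \hat\otimes B_\epsilon(\tilde A),$$
as well as the graded flip $\sigma : \Omega_1 B_\epsilon(\tilde A) \hat\otimes B_\epsilon(\tilde A) \rightleftarrows B_\epsilon(\tilde A) \hat\otimes \Omega_1 B_\epsilon(\tilde A)$ exchanging the two factors (with signs). The bounded map $\natural : \Omega_\epsilon A \to \Omega_1 B_\epsilon(\tilde A)$ is a cotrace, which means $\sigma\Delta_r\natural = \Delta_l\natural$ and $\sigma\Delta_l\natural = \Delta_r\natural$. Let $m : L \hat\otimes L \to L$ denote the multiplication and $\sigma : L \otimes L \rightleftarrows L \otimes L$ the graded flip. Then if we treat formally $dD$ as an element of $L$, we can write
$$\chi_1 = \natural\tau\mu_0\, d(\rho + D)\natural = \natural\tau m(\mu_0 \otimes d(\rho + D))\Delta_r\natural.$$
Now let $x \in \ell^1(E)$ and $y \in \mathrm{End}_{\tilde B}(E)$. The tracial properties of $\tau$ imply $b\natural\tau(x\,dy) = (-)^{|\tau| + |x|} \tau([x, y])$ or equivalently $b\natural\tau(x\,dy) = (-)^{|\tau| + |x|} \tau m(x \otimes y - \sigma(x \otimes y))$. So we have
$$b\chi_1 = -(-)^{|\tau|} \tau m\big(\mu_0 \otimes (\rho + D) - \sigma(\mu_0 \otimes (\rho + D))\big)\Delta_r\natural = -(-)^{|\tau|} \tau\mu_0(\rho + D)\natural + (-)^{|\tau|} \tau m\sigma(\mu_0 \otimes (\rho + D))\sigma^2\Delta_r\natural$$
because $\sigma^2 = \mathrm{Id}$, and since $\sigma(\mu_0 \otimes (\rho + D))\sigma = -(\rho + D) \otimes \mu_0$, one gets
$$b\chi_1 = -(-)^{|\tau|} \tau\mu_0(\rho + D)\natural - (-)^{|\tau|} \tau m((\rho + D) \otimes \mu_0)\Delta_l\natural = -(-)^{|\tau|} \tau[\mu_0, \rho + D]\natural = (-)^{|\tau|} \tau\mu_0\natural(b + B).$$
But the range of $b$ lies in $B \subset \tilde B$, so that $\tau\mu_0\natural(b + B)$ takes its values in $B$, hence $\tau\mu_0\natural(b + B) = \chi_0(b + B)$, whence the result.

(ii) $\chi_1(b + B) = \pm\natural d\chi_0$: Projecting the range of Eq. (73) onto $\Omega^1 B_\natural$ yields $\natural\tau\mu_0\, d(\rho + D)\natural(b + B) = \natural\tau\, d\mu_0\natural$ because $\natural\tau(\cdot)\natural$ is a trace on $M$. Furthermore, we know that the canonical supertrace $\mathrm{Tr}$ on $\ell^1(\Omega_\delta E)$ commutes with the differential $d$, hence $d\tau = (-)^{|\tau|}\tau d$ and $\natural\tau\, d\mu_0\natural = (-)^{|\tau|}\natural d\tau\mu_0\natural$. The fact that $d$ vanishes on the unit $1 \in \tilde B$ yields $\chi_1(b + B) = (-)^{|\tau|}\natural d\chi_0$ as required.

The space of bounded linear maps $\mathrm{Hom}(\Omega_\epsilon A, X(B))$ is a $\mathbb{Z}_2$-graded complete bornological complex, the differential of a map $\varphi$ corresponding to the graded commutator $(\natural d, b) \circ \varphi - (-)^{|\varphi|} \varphi \circ (b + B)$. Hence the cocycles of $\mathrm{Hom}(\Omega_\epsilon A, X(B))$ are the bounded chain maps between $\Omega_\epsilon A$ and $X(B)$, and $\chi(E, \rho, D)$ is a cocycle whose degree coincides with the parity of the $\theta$-summable unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$.

6.5. Homotopy invariance. We have to show that the cohomology class of the cocycle $\chi \in \mathrm{Hom}(\Omega_\epsilon A, X(B))$ is invariant under suitable homotopies on the set of $\theta$-summable unbounded bimodules. From the construction above, it is clear that the correct notion of homotopy is obtained by suspension. Let $C^\infty[0, 1]$ be the algebra of smooth complex-valued functions on $[0, 1]$, such that all derivatives of order $\ge 1$ vanish at the endpoints, while the functions themselves take arbitrary values at 0 and 1. We endow this algebra with the usual Fréchet topology, generated by the countable family of norms
$$\|f\|_n = \sum_{i=0}^{n} \frac{1}{i!} \sup_{x \in [0,1]} |f^{(i)}(x)| \qquad \forall f \in C^\infty[0, 1],\ n \in \mathbb{N}. \tag{81}$$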
The Fréchet topology generates the bounded bornology $\mathrm{Bound}(C^\infty[0, 1])$, a subset being small iff it is bounded for all norms. This turns $C^\infty[0, 1]$ into a complete bornological algebra. Given any complete bornological space $V$, we define its suspension as the completed tensor product
$$V[0, 1] := V \hat\otimes C^\infty[0, 1]. \tag{82}$$
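One standard feature of norms weighted by 1/i! as in (81) is that they are submultiplicative, which follows from the Leibniz rule; the text does not state this, so it is offered here only as a plausible motivation for the weights. The sketch below checks the inequality on polynomials, with the supremum approximated on a uniform grid; the polynomial test functions ignore the endpoint condition on derivatives, and all names are ours.

```python
from math import factorial

def poly_mul(p, q):
    """Product of polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial; the derivative of a constant is the empty list."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_eval(p, x):
    """Horner evaluation; the empty polynomial evaluates to zero."""
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return value

def norm(p, n, grid=201):
    """Norm (81), with the supremum over [0, 1] replaced by a maximum on a grid."""
    total, derivative = 0.0, list(p)
    for i in range(n + 1):
        sup = max(abs(poly_eval(derivative, j / (grid - 1))) for j in range(grid))
        total += sup / factorial(i)
        derivative = poly_diff(derivative)
    return total

f, g = [1.0, -2.0, 0.0, 3.0], [0.5, 1.0, -1.0]
for n in range(5):
    assert norm(poly_mul(f, g), n) <= norm(f, n) * norm(g, n) + 1e-9
```

The inequality holds exactly for the grid maximum as well, so the small tolerance only absorbs floating-point rounding.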
For any $t \in [0, 1]$, there is a bounded evaluation map $\mathrm{ev}_t : C^\infty[0, 1] \to \mathbb{C}$ sending a function $f$ to its value $f(t)$. This extends for any complete bornological space $V$ to a bounded evaluation map $\mathrm{ev}_t = \mathrm{Id} \hat\otimes \mathrm{ev}_t : V[0, 1] \to V$.

Let $(E, \rho, D) \in \Psi_*(A, \tilde B)$, with $E = H \hat\otimes \tilde B$. The suspension $E[0, 1]$ is a right module over the complete algebra $\tilde B[0, 1]$, for the product
$$(\xi \otimes f) \cdot (b \otimes g) = \xi b \otimes fg \qquad \forall\, \xi \in E,\ b \in \tilde B,\ f, g \in C^\infty[0, 1]. \tag{83}$$
Consider the canonical bounded map $\iota : E \to E[0, 1]$ given by $\iota(\xi) = \xi \otimes 1$ for any $\xi \in E$, where 1 stands for the constant function $1 \in C^\infty[0, 1]$. Given any (possibly unbounded) endomorphism $Q : E[0, 1] \to E[0, 1]$ commuting with the right action of $\tilde B[0, 1]$, we define the evaluation of $Q$ at $t$ as the endomorphism $\mathrm{ev}_t(Q)$ of $E$ corresponding to the composition
$$\mathrm{ev}_t(Q) : E \xrightarrow{\ \iota\ } E[0, 1] \xrightarrow{\ Q\ } E[0, 1] \xrightarrow{\ \mathrm{ev}_t\ } E. \tag{84}$$
Definition 6.6. Let $A$ and $B$ be complete bornological algebras. Two unbounded bimodules $(E_0, \rho_0, D_0)$ and $(E_1, \rho_1, D_1)$ in $\Psi_*(A, \tilde B)$ are differentiably homotopic iff $E_0 = E_1 = H \hat\otimes \tilde B$ and there is an unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B[0, 1])$ such that $E = H \hat\otimes \tilde B[0, 1]$ and $\mathrm{ev}_t(\rho) = \rho_t$, $\mathrm{ev}_t(D) = D_t$ for $t = 0, 1$. Differentiable homotopy is an equivalence relation. A similar definition holds for $\theta$-summable bimodules, where the interpolating bimodule $(E, \rho, D)$ has also to be $\theta$-summable.

Proposition 6.7. Let $(E, \rho, D) \in \Psi_*(A, \tilde B)$ be a $\theta$-summable bimodule. The cohomology class of the cocycle $\chi(E, \rho, D)$ in $H_*(\mathrm{Hom}(\Omega_\epsilon A, X(B)))$ is invariant with respect to differentiable homotopies of $\theta$-summable bimodules.

Proof. Let $(E_0, \rho_0, D_0)$ and $(E_1, \rho_1, D_1)$ be homotopic $\theta$-summable unbounded $A$-$\tilde B$-bimodules. By definition there is an interpolating bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B[0, 1])$. One thus has $E = H \hat\otimes \tilde B[0, 1]$ for a given complete bornological space $H$. Let $\Omega^*[0, 1]$ be the (graded) commutative differential algebra of de Rham forms on $[0, 1]$ with its Fréchet topology. We endow $\Omega^*[0, 1]$ with the bounded bornology, and denote by $d_t$ the (bounded) de Rham coboundary. Consider the unital complete bornological DG algebra $\tilde\Omega_\delta B \hat\otimes \Omega^*[0, 1]$, endowed with the total differential $d + d_t$. We shall mimic the construction of the map $\chi$ before, with the right $\tilde\Omega_\delta B \hat\otimes \Omega^*[0, 1]$-module
$$\Omega_\delta E := H \hat\otimes \tilde\Omega_\delta B \hat\otimes \Omega^*[0, 1].$$
Then $\rho$ and $D$ lift to endomorphisms of $\Omega_\delta E$ as before, and we consider the superconnection $\mathcal{D} = \delta - (d + d_t) + \rho + D$ acting on $F = \mathrm{Hom}(B_\epsilon(\tilde A), \Omega_\delta E)$. In this way, one gets a bounded map
$$\mu = \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial\rho\, e^{(t-1)\mathcal{D}^2} : \Omega_1 B_\epsilon(\tilde A) \to \ell^1(\Omega_\delta E).$$
With the cotrace $\natural : \Omega_\epsilon A \to \Omega_1 B_\epsilon(\tilde A)$ and the partial trace $\tau : \ell^1(\Omega_\delta E) \to \tilde\Omega_\delta B \hat\otimes \Omega^*[0, 1]$, the analogue of Proposition 6.4 yields
$$\tau\mu\natural(b + B) + \tau[\mu, \rho + D]\natural = (-)^{|\tau|}(d + d_t)\tau\mu\natural. \tag{85}$$
For any $n \in \mathbb{N}$ and $k = 0, 1$, we let $p_{n,k}$ be the natural bounded map from $\tilde\Omega_\delta B \hat\otimes \Omega^*[0, 1]$ to the tensor product $\Omega^n B \hat\otimes \Omega^k[0, 1]$. Composing Eq. (85) with $p_{0,k}$ implies
$$p_{0,k}\tau\mu\natural(b + B) + p_{0,k}\tau[\mu, \rho + D]\natural = (-)^{|\tau|} p_{0,k}(d + d_t)\tau\mu\natural = (-)^{|\tau|} d_t\, p_{0,k-1}\tau\mu\natural.$$
Next, the bounded map $\natural : \Omega^1 B \hat\otimes \Omega^k[0, 1] \to \Omega^1 B_\natural \hat\otimes \Omega^k[0, 1]$ is a trace because the algebra $\Omega^*[0, 1]$ is graded commutative. Thus composing (85) with $\natural p_{1,k}$ yields
$$\natural p_{1,k}\tau\mu\natural(b + B) = (-)^{|\tau|}\natural p_{1,k}(d + d_t)\tau\mu\natural = (-)^{|\tau|}\natural d\, p_{0,k}\tau\mu\natural + (-)^{|\tau|} d_t\, \natural p_{1,k-1}\tau\mu\natural.$$
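Moreover, since $\Omega^*[0, 1]$ is graded commutative, the same computation as in the proof of Proposition 6.5 shows that $p_{0,k}\tau[\mu, \rho + D]\natural = -(-)^{|\tau|} b\natural p_{1,k}\tau\mu\natural$, hence we get a couple of equations
$$p_{0,k}\tau\mu\natural(b + B) - (-)^{|\tau|} b\natural p_{1,k}\tau\mu\natural = (-)^{|\tau|} d_t\, p_{0,k-1}\tau\mu\natural,$$
$$\natural p_{1,k}\tau\mu\natural(b + B) - (-)^{|\tau|}\natural d\, p_{0,k}\tau\mu\natural = (-)^{|\tau|} d_t\, \natural p_{1,k-1}\tau\mu\natural.$$
For $k = 0$, we introduce the notations $\chi_0 = p_{0,0}\tau\mu\natural$ and $\chi_1 = \natural p_{1,0}\tau\mu\natural$. They are the components of a bounded map $\chi : \Omega_\epsilon A \to X(B) \hat\otimes C^\infty[0, 1]$, and the above equations yield the cocycle condition
$$\chi_0(b + B) - (-)^{|\tau|} b\chi_1 = 0, \qquad \chi_1(b + B) - (-)^{|\tau|}\natural d\chi_0 = 0.$$
If we compose $\chi$ with the evaluation map $\mathrm{ev}_t : X(B) \hat\otimes C^\infty[0, 1] \to X(B)$, we recover $\chi(E_0, \rho_0, D_0)$ for $t = 0$ and $\chi(E_1, \rho_1, D_1)$ for $t = 1$. Next, for $k = 1$, define the Chern-Simons transgressions $cs_0 = p_{0,1}\tau\mu\natural$ and $cs_1 = \natural p_{1,1}\tau\mu\natural$. They form the components of a bounded map $cs : \Omega_\epsilon A \to X(B) \hat\otimes \Omega^1[0, 1]$, satisfying
$$cs_0(b + B) - (-)^{|\tau|} b\, cs_1 = (-)^{|\tau|} d_t\chi_0, \qquad cs_1(b + B) - (-)^{|\tau|}\natural d\, cs_0 = (-)^{|\tau|} d_t\chi_1. \tag{86}$$
Now, remark that the integration map of one-forms $\int : \Omega^1[0, 1] \to \mathbb{C}$ is bounded and extends to an integration map $\int : X(B) \hat\otimes \Omega^1[0, 1] \to X(B)$. Furthermore, for any $x \in X(B) \hat\otimes C^\infty[0, 1]$, one has $\int d_t x = \mathrm{ev}_1(x) - \mathrm{ev}_0(x) \in X(B)$. Thus integrating (86) shows that the difference $\chi(E_0, \rho_0, D_0) - \chi(E_1, \rho_1, D_1)$ is the coboundary of $\int cs$ in the complex $\mathrm{Hom}(\Omega_\epsilon A, X(B))$, whence the result.

Let us speak about functoriality. We know that for any complete bornological algebras $A_1, A_2, B_1, B_2$, there is a left product $\mathrm{Mor}(A_1, A_2) \times \Psi_*(A_2, \tilde B_1) \to \Psi_*(A_1, \tilde B_1)$ given by $\varphi \cdot (E, \rho, D) = (E, \rho \circ \varphi, D)$ for any bounded homomorphism $\varphi : A_1 \to A_2$, and a right product $\Psi_*(A_2, \tilde B_1) \times \mathrm{Mor}(\tilde B_1, \tilde B_2) \to \Psi_*(A_2, \tilde B_2)$ given by $(E, \rho, D) \cdot \psi = (E \hat\otimes_\psi \tilde B_2, \rho \otimes \mathrm{Id}, D \otimes \mathrm{Id})$, where $\psi : \tilde B_1 \to \tilde B_2$ is a unital bounded homomorphism. Using the explicit construction of $\chi$ in terms of $\rho$ and $D$, one sees that it is functorial, i.e. the following diagram commutes:
$$\begin{array}{ccc}
\Omega_\epsilon A_1 & \xrightarrow{\ \chi(\varphi \cdot (E, \rho, D))\ } & X(B_1) \\
{\scriptstyle \Omega(\varphi)} \downarrow & \nearrow {\scriptstyle \chi(E, \rho, D)} & \downarrow {\scriptstyle X(\psi)} \\
\Omega_\epsilon A_2 & \xrightarrow{\ \chi((E, \rho, D) \cdot \psi)\ } & X(B_2)
\end{array} \tag{87}$$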
We collect the preceding results in a theorem:

Theorem 6.8. Let $A$ and $B$ be complete bornological algebras. To any unbounded $\theta$-summable bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$, we associate a bounded chain map $\chi(E, \rho, D) : \Omega_\epsilon A \to X(B)$ of the same parity. Its associated cohomology class in $H_*(\Omega_\epsilon A, X(B))$ is invariant under differentiable homotopies and functorial in $A$ and $B$.
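Remark 6.9. In particular if $D = 0$ then $\mu_0 = \partial\rho$ and the two components of $\chi(E, \rho, 0)$ reduce to $\chi_0 = p\tau\,\partial\rho\,\natural$, where $p$ is the projection $\tilde B \to B$, and $\chi_1 = \natural\tau\,\partial\rho\, d\rho\,\natural$. One sees that $\chi_0$ and $\chi_1$ are respectively a zero-cochain and a one-cochain on the $(b + B)$-complex of entire chains over $A$, explicitly
$$\chi_0(a) = p\tau\rho(a), \qquad \chi_1(a_0\, da_1) = \natural\tau\rho(a_0)\, d\rho(a_1), \tag{88}$$
for any $a, a_0, a_1 \in A$.

Remark 6.10. For any $\theta$-summable unbounded bimodule, the composition of $\tau\mu\natural$ by the universal trace $\natural : \tilde\Omega_\delta B \to \tilde\Omega_\delta B_\natural$ yields a bounded map from the $(b + B)$-complex of entire chains over $A$ to the (unitalized) entire de Rham-Karoubi complex of $B$:
$$\natural\tau\mu\natural \in \mathrm{Hom}(\Omega_\epsilon A, \tilde\Omega_\delta B_\natural). \tag{89}$$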
Proposition 6.4 implies that it is in fact a cocycle: $\natural\tau\mu\natural(b + B) - (-)^{|\tau|} d\,\natural\tau\mu\natural = 0$. This cocycle was considered in a dual context for example in [19] (without superconnection and in periodic theory rather than in entire cyclic theory), and we know that it can be adapted to compute the action of unbounded representatives of $KK(A, B)$ classes on cyclic cocycles over $B$ (see [6], p. 434).

Remark 6.11. In fact the construction of $\chi$ works as well if $\rho : A \to \mathrm{End}_{\tilde B}(E)$ is simply a bounded linear map and not necessarily a homomorphism. In that case, the curvature $\delta\rho + \rho^2$ of $\rho$ does not vanish and has to be included in formula (70). One thus obtains some generalisations of the cocycles constructed by Quillen in [33, 34]. In the sequel we will always consider that $\rho$ is a homomorphism.

7. The Bivariant Chern Character

We are ready to construct a bivariant Chern character for unbounded bimodules satisfying some strong $\theta$-summability conditions. Given two complete bornological algebras $A$ and $B$, our goal is to lift an element $(E, \rho, D) \in \Psi_*(A, \tilde B)$ to a $\mathcal{T}A$-$\widetilde{\mathcal{T}B}$-bimodule and construct the corresponding bounded chain map $\chi : \Omega_\epsilon \mathcal{T}A \to X(\mathcal{T}B)$. Then composing $\chi$ with the homotopy equivalence $\gamma : X(\mathcal{T}A) \to \Omega_\epsilon \mathcal{T}A$ given by the Goodwillie theorem, we obtain a bounded chain map $\chi\gamma \in \mathrm{Hom}(X(\mathcal{T}A), X(\mathcal{T}B))$ whose class in the bivariant entire cyclic cohomology $HE_*(A, B)$ is the bivariant Chern character of $(E, \rho, D)$.

So we fix an unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$, $E = H \hat\otimes \tilde B$. Recall that $\Omega_{an} B$ is the completion of $\Omega B$ for the analytic bornology $S_{an}(\Omega B)$ generated by the sets $\bigcup_{n \ge 0} S(dS)^n$, for any $S \in S(B)$. It is a complete bornological DG algebra for the product of forms and the differential $d$. We let $\tilde\Omega_{an} B$ be its unitalization, with $d1 = 0$. Since the latter is a left $\tilde B$-module, we can introduce the analytic right $\tilde\Omega_{an} B$-module and its even/odd form part:
$$\Omega_{an} E := E \hat\otimes_{\tilde B} \tilde\Omega_{an} B, \qquad \Omega^\pm_{an} E := E \hat\otimes_{\tilde B} \tilde\Omega^\pm_{an} B. \tag{90}$$
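As a complete $\mathbb{Z}_2$-graded bornological vector space, $\Omega_{an} E$ is isomorphic to $H \hat\otimes \tilde\Omega_{an} B$, and has naturally a bounded differential induced by $d$. Now endow the subspace $\tilde\Omega^+_{an} B$ of even forms with the (bounded) Fedosov product
$$\omega_1 \odot \omega_2 = \omega_1\omega_2 - d\omega_1\, d\omega_2 \qquad \forall\, \omega_1, \omega_2 \in \tilde\Omega^+_{an} B. \tag{91}$$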
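The Fedosov product (91) is associative in any DG algebra, because d squares to zero and satisfies the Leibniz rule. As a hedged illustration, the sketch below replaces the noncommutative forms of the text by a much simpler toy model of our own choosing: commutative polynomial differential forms in two variables x, y, where an even form is a pair (f, g) standing for f + g dx∧dy and polynomials are dictionaries from exponent pairs to integer coefficients.

```python
def p_add(p, q, sign=1):
    """p + sign * q for polynomials stored as {(i, j): coefficient}."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def p_mul(p, q):
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return {k: c for k, c in out.items() if c != 0}

def p_dx(p):
    return {(i - 1, j): i * c for (i, j), c in p.items() if i > 0}

def p_dy(p):
    return {(i, j - 1): j * c for (i, j), c in p.items() if j > 0}

def fedosov(w1, w2):
    """(91) on even forms f + g dx∧dy: ordinary product minus dw1 dw2."""
    (f1, g1), (f2, g2) = w1, w2
    # dw1 dw2 = (∂x f1 ∂y f2 - ∂y f1 ∂x f2) dx∧dy, since d(g dx∧dy) = 0 here.
    jacobian = p_add(p_mul(p_dx(f1), p_dy(f2)), p_mul(p_dy(f1), p_dx(f2)), sign=-1)
    top = p_add(p_add(p_mul(f1, g2), p_mul(g1, f2)), jacobian, sign=-1)
    return (p_mul(f1, f2), top)

x, y = ({(1, 0): 1}, {}), ({(0, 1): 1}, {})
w1 = ({(1, 0): 1, (0, 2): 2}, {(0, 0): 1})
w2 = ({(1, 1): 1}, {(1, 0): 3})
w3 = ({(0, 1): 1, (2, 0): -1}, {})

assert fedosov(fedosov(w1, w2), w3) == fedosov(w1, fedosov(w2, w3))  # associative
assert fedosov(x, y) != fedosov(y, x)                                # not commutative
```

Even in this commutative model the deformed product fails to commute, which is the feature exploited when the even forms are identified with a tensor algebra.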
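The unit $1 \in \tilde\Omega^+_{an} B$ is also the unit for the Fedosov product, and the correspondence (7) shows that the associative algebra $(\tilde\Omega^+_{an} B, \odot)$ is isomorphic to the unitalized analytic tensor algebra $\widetilde{\mathcal{T}B}$. Then, we can endow the $\mathbb{Z}_2$-graded space $\Omega^+_{an} E = H \hat\otimes \tilde\Omega^+_{an} B$ with a right action of this Fedosov algebra $(\tilde\Omega^+_{an} B, \odot) \cong \widetilde{\mathcal{T}B}$:
$$\odot : \Omega^+_{an} E \times \tilde\Omega^+_{an} B \to \Omega^+_{an} E, \qquad (\xi, \omega) \mapsto \xi \odot \omega := \xi\omega - (-)^{|\xi|} d\xi\, d\omega, \tag{92}$$
where $d\xi \in \Omega^-_{an} E$ and $d\omega \in \tilde\Omega^-_{an} B$. It is easy to check that $\Omega^+_{an} E$ is a right $\widetilde{\mathcal{T}B}$-module: $(\xi \odot \omega_1) \odot \omega_2 = \xi \odot (\omega_1 \odot \omega_2)$ for any $\omega_1, \omega_2 \in \tilde\Omega^+_{an} B = \widetilde{\mathcal{T}B}$, and as such it is isomorphic to the tensor product $H \hat\otimes \widetilde{\mathcal{T}B}$. Hence we have just lifted the $\tilde B$-module $E$ to a $\widetilde{\mathcal{T}B}$-module.

Let $\mathrm{End}_{\widetilde{\mathcal{T}B}}(\Omega^+_{an} E)$ be the complete bornological algebra of bounded endomorphisms of $\Omega^+_{an} E$ commuting with $\widetilde{\mathcal{T}B}$. From the left representation $\rho : A \to \mathrm{End}_{\tilde B}(E)$, we want to construct a bounded homomorphism $\rho_* : \mathcal{T}A \to \mathrm{End}_{\widetilde{\mathcal{T}B}}(\Omega^+_{an} E)$. First, we have a bounded linear map (not a homomorphism) $\rho_* : A \to \mathrm{End}_{\widetilde{\mathcal{T}B}}(\Omega^+_{an} E)$ given by a Fedosov-type action:
$$\rho_*(a) \odot \xi := \rho(a)\xi - d\rho(a)\, d\xi, \qquad \forall\, a \in A,\ \xi \in \Omega^+_{an} E, \tag{93}$$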
where ρ(a) and dρ(a) are viewed as elements of the DG algebra $\mathrm{End}_{\tilde\Omega_{an}B}(\Omega_{an}E)$, while ξ, dξ are elements of $\Omega_{an}E$. One has $\rho_*(a) \odot (\xi \odot \omega) = (\rho_*(a) \odot \xi) \odot \omega$ for any $\omega \in \tilde{\mathcal{T}}B$, hence $\rho_*(a)$ is indeed an endomorphism of $\Omega^+_{an}E$. This induces a representation (= homomorphism) of the non-completed tensor algebra $\rho_* : TA \to \mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$ by

$$\rho_*(a_1 \otimes \cdots \otimes a_n) \odot \xi = \rho_*(a_1) \odot \cdots \odot \rho_*(a_n) \odot \xi, \qquad \forall a_1 \otimes \cdots \otimes a_n \in TA,\ \xi \in \Omega^+_{an}E. \tag{94}$$

Under the identification $TA \cong (\Omega^+A, \odot)$, the above action reads

$$\rho_*(a_0 da_1 \ldots da_{2n}) \odot \xi = (\rho(a_0)d\rho(a_1) \ldots d\rho(a_{2n})) \odot \xi = \rho(a_0)d\rho(a_1) \ldots d\rho(a_{2n})\,\xi - d\rho(a_0)d\rho(a_1) \ldots d\rho(a_{2n})\,d\xi \tag{95}$$

for any $a_0 da_1 \ldots da_{2n} \in \Omega^+A$, where $\rho(a_0)d\rho(a_1) \ldots d\rho(a_{2n})$ is viewed as an element of the DG algebra $\mathrm{End}_{\tilde\Omega_{an}B}(\Omega_{an}E)$. In general, we do not know whether the representation $\rho_*$ is bounded for the analytic bornology on TA. This is true, for example, when $\mathcal{H}$ is a Banach space endowed with the bounded bornology. In the following, we always assume that $\rho_*$ is bounded (this will be part of the strong θ-summability assumption below), and consequently extends to the desired bounded representation of the completion $\mathcal{T}A$ in $\mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$.

Let us now deal with the Dirac operator D. It is an odd, unbounded endomorphism of E, and extends to an unbounded endomorphism (with dense domain) of the right module $\Omega_{an}E$. Once again, we deform its restriction to $\Omega^+_{an}E$ into a Fedosov-type action:

$$D \odot \xi := D\xi + dD\, d\xi \qquad \forall \xi \in \Omega^+_{an}E, \tag{96}$$
so that $D \odot (\xi \odot \omega) = (D \odot \xi) \odot \omega$ for any $\omega \in \tilde\Omega^+_{an}B$. In this way, D defines an unbounded endomorphism of the right $\tilde{\mathcal{T}}B$-module $\Omega^+_{an}E$. Note that the sign + in front of dD in Eq. (96) is due to the odd degree of D.

What we have obtained so far is the following. Starting from a bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$ with $E = \mathcal{H}\hat\otimes\tilde B$, we constructed the right $\tilde{\mathcal{T}}B$-module $\Omega^+_{an}E = \mathcal{H}\hat\otimes\tilde{\mathcal{T}}B$, endowed with a bounded left representation $\rho_* : \mathcal{T}A \to \mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$, and with an odd, unbounded Dirac endomorphism D. It is natural to wonder if this lifted bimodule defines an element of $\Psi_*(\mathcal{T}A, \tilde{\mathcal{T}}B)$. In general this may be false, for the following reasons:

(a) For any $x \in \mathcal{T}A$, the commutator for the Fedosov action $[D, \rho_*(x)]_\odot := D \odot \rho_*(x) - \rho_*(x) \odot D$ may not act by a bounded endomorphism on $\Omega^+_{an}E$.

(b) The heat kernel for the Fedosov product, given by the formal power series

$$\exp_\odot(-tD^2) := \sum_{n\ge 0} \frac{(-t)^n}{n!} (D^{\odot 2})^{\odot n} \tag{97}$$

may not be a bounded endomorphism of $\Omega^+_{an}E$.

In order to understand what we mean exactly by this exponential, we state the following lemma:

Lemma 7.1. Let H be any (possibly unbounded) even endomorphism of $\Omega^+_{an}E$ acting by a Fedosov-type deformation. Then the formal power series $\sum_n \frac{1}{n!} H^{\odot n}$ may be rewritten as a Duhamel-type expansion

$$\exp_\odot H = \sum_{n\ge 0} (-)^n \int_{\Delta_n} ds_1 \ldots ds_n\, e^{s_0 H} dH\, d(e^{s_1 H})\, dH \ldots d(e^{s_{n-1} H})\, dH\, d(e^{s_n H}), \tag{98}$$

where $e^{s_i H}$ is the exponential for the usual product of endomorphisms. We call $\exp_\odot H$ the Fedosov exponential of H.

Proof. We establish a first-order differential equation for the Fedosov exponential. For any $t \in [0, 1]$ and $\xi \in \Omega^+_{an}E$, one has

$$\frac{d}{dt}\big(\exp_\odot(tH) \odot \xi\big) = H \odot \exp_\odot(tH) \odot \xi = \big(H \exp_\odot(tH) - dH\, d\exp_\odot(tH)\big) \odot \xi.$$

Thus $\frac{d}{dt}\exp_\odot(tH) = H \exp_\odot(tH) - dH\, d\exp_\odot(tH)$. A well-known trick of perturbative quantum mechanics is to introduce the interaction scheme $I(t) := e^{-tH}\exp_\odot(tH)$, for which the evolution equation reads

$$\frac{d}{dt} I(t) = -e^{-tH} dH\, d\exp_\odot(tH).$$

Using the fact that $I(0) = 1$, the solution is expressed in integral form

$$I(t) = 1 - \int_0^t ds\, e^{-sH} dH\, d\exp_\odot(sH),$$

or equivalently

$$\exp_\odot(tH) = e^{tH} - \int_0^t ds\, e^{(t-s)H} dH\, d\exp_\odot(sH).$$

Solving this equation by iteration, that is, repeatedly substituting the right-hand side for $\exp_\odot(sH)$ under the integral and setting $t = 1$, gives the expansion (98).
Substituting H by $-tD^{\odot 2} = -t(D^2 + dDdD)$ in (98) gives a power series of differential forms involving the heat operator $\exp(-sD^2)$, which by hypothesis is a bounded endomorphism of E playing the role of a regulator, together with some derivatives $d\exp(-uD^2)$, $d(D^2)$ and $dDdD$. The formula thus obtained is really the definition of $\exp_\odot(-tD^2)$. The bornological convergence of this series in $\mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$ is part of the strong θ-summability assumption below. This being understood, we can perform the construction of the previous section with A, B, E replaced by $\mathcal{T}A$, $\tilde{\mathcal{T}}B$, $\Omega^+_{an}E$ respectively, and get a bounded chain map

$$\chi(\Omega^+_{an}E, \rho_*, D) : \Omega_\epsilon\mathcal{T}A \to X(\mathcal{T}B). \tag{99}$$

Let us recall briefly the main steps. We first form the right $\Omega_\delta\tilde{\mathcal{T}}B$-module $\Omega_\delta\Omega^+_{an}E$, and denote by (L, d) the DG algebra $\mathrm{End}_{\Omega_\delta\tilde{\mathcal{T}}B}(\Omega_\delta\Omega^+_{an}E)$. Then, using the completed bar complex $B_\epsilon(\mathcal{T}A)$ and its associated bicomodule $\Omega_1 B_\epsilon(\mathcal{T}A)$, we consider the algebra $R = \mathrm{Hom}(B_\epsilon(\mathcal{T}A), L)$ and its associated R-bimodule $M = \mathrm{Hom}(\Omega_1 B_\epsilon(\mathcal{T}A), L)$; then the left R-module $F = \mathrm{Hom}(B_\epsilon(\mathcal{T}A), \Omega_\delta\Omega^+_{an}E)$ endowed with two differentials d, δ; and finally the superconnection $\mathcal{D} = \delta - d + \rho_* + D : F \to F$. Since we want the heat operator $\exp(-t\mathcal{D}^2)$ to define a trace-class element of R, the lifted $\mathcal{T}A$-$\tilde{\mathcal{T}}B$-bimodule $(\Omega^+_{an}E, \rho_*, D)$ must be weakly θ-summable. This leads to the strong version of θ-summability:

Definition 7.2 (Strong θ-summability). Let A and B be complete bornological algebras. An unbounded bimodule $(E, \rho, D) \in \Psi_*(A, \tilde B)$ is called strongly θ-summable iff the following conditions hold:

(i) The homomorphism $\rho_* : TA \to \mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$ is bounded for the analytic bornology on TA, and thus extends to a bounded representation of $\mathcal{T}A$ into $\mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E)$. This turns the lift $(\Omega^+_{an}E, \rho_*, D)$ into a $\mathcal{T}A$-$\tilde{\mathcal{T}}B$-bimodule.

(ii) The lift $(\Omega^+_{an}E, \rho_*, D)$ thus obtained, though not necessarily in $\Psi_*(\mathcal{T}A, \tilde{\mathcal{T}}B)$, nevertheless satisfies the weak θ-summability condition as stated in Definition 6.3.

We denote by $\Psi^\theta_*(A, \tilde B)$ the abelian semigroup of strongly θ-summable bimodules. Two strongly θ-summable bimodules are homotopic iff their lifts are homotopic. If the strong θ-summability conditions are satisfied, then from the odd element of M,

$$\mu = \int_0^1 dt\, e^{-t\mathcal{D}^2}\, \partial\rho\, e^{(t-1)\mathcal{D}^2} \in \mathrm{Hom}\big(\Omega_1 B_\epsilon(\mathcal{T}A), \ell^1(\Omega_\delta\Omega^+_{an}E)\big), \tag{100}$$

one gets the two components $\chi_0 = p_0\tau\mu$ and $\chi_1 = p_1\tau\mu$ of the cocycle $\chi(\Omega^+_{an}E, \rho_*, D) \in \mathrm{Hom}(\Omega_\epsilon\mathcal{T}A, X(\mathcal{T}B))$. Then composing χ with the Goodwillie equivalence $\gamma \in \mathrm{Hom}(X(\mathcal{T}A), \Omega_\epsilon\mathcal{T}A)$ of Sect. 4 yields a bivariant entire cyclic cohomology class $[\chi \circ \gamma] \in HE_*(A, B)$. This is the bivariant Chern character of (E, ρ, D). Thus we are led to the following theorem:
Theorem 7.3. Let A and B be complete bornological algebras. There is a bivariant Chern character map

$$ch : \Psi^\theta_*(A, \tilde B) \to HE_*(A, B), \qquad * = 0, 1, \tag{101}$$

sending a strongly θ-summable bimodule (E, ρ, D) to the bivariant entire cyclic cohomology class $ch(E, \rho, D) := [\chi(\Omega^+_{an}E, \rho_*, D) \circ \gamma]$. The Chern character is additive, invariant for differentiable homotopies inside $\Psi^\theta_*(A, \tilde B)$, and functorial in both variables.

Proof. This is a consequence of Theorem 6.8 applied to the lifted bimodule $(\Omega^+_{an}E, \rho_*, D)$ and the fact that the Goodwillie map γ is obviously functorial with respect to A.

Remark 7.4. The strong θ-summability condition 7.2 should not be taken too seriously. In concrete applications, it is sufficient to verify a posteriori that the composition map $\chi \circ \gamma : X(\mathcal{T}A) \to X(\mathcal{T}B)$ is bounded. On the other hand, when dealing with commutative algebras, one can replace the universal DG algebra of noncommutative forms $\Omega B$ by the smaller (graded) commutative algebra of de Rham forms over B, and similarly for the module E. In these circumstances, the θ-summability conditions are much less restrictive, and give rise to a bivariant Chern character in ordinary de Rham cohomology, which is satisfactory in many concrete geometrical examples. Also, in some situations it is not necessary to unitalize the algebra B, and the construction of the Chern character can be performed on $\Psi^\theta_*(A, B)$. The following section provides commutative examples illustrating these particular cases with the study of Bott elements.

8. Examples

Let us have a look at some examples related to K-theory. They illustrate our bivariant Chern character in the two extremal cases $\Psi_*(\mathbb{C}, \tilde A)$ and $\Psi_*(A, \tilde{\mathbb{C}})$, describing respectively the K-theory and K-homology of a complete bornological algebra A.

8.1. Index pairing and the JLO cocycle. Let A and B be complete bornological algebras. First of all, a bounded homomorphism $\rho : A \to B \subset \tilde B$ must be considered as the fundamental example of an even A-$\tilde B$-bimodule. In this case one chooses $E = \tilde B$ with trivial graduation, and the Dirac operator is equal to zero, hence we get an unbounded bimodule $(\tilde B, \rho, 0) \in \Psi_0(A, \tilde B)$.

Proposition 8.1. For any bounded homomorphism ρ : A → B, the bivariant Chern character of $(\tilde B, \rho, 0) \in \Psi_0(A, \tilde B)$ is equal to the class $ch(\rho) \in HE_0(A, B)$ of the chain map

$$X(\rho_*) : X(\mathcal{T}A) \to X(\mathcal{T}B) \tag{102}$$

induced by the bounded homomorphism $\rho_* : \mathcal{T}A \to \mathcal{T}B$.

Proof. $ch(\tilde B, \rho, 0)$ is the cohomology class of the composition of chain maps $X(\mathcal{T}A) \xrightarrow{\gamma} \Omega_\epsilon\mathcal{T}A \xrightarrow{\chi} X(\mathcal{T}B)$, where χ is constructed as follows. With $E = \tilde B$ one has $\Omega^+_{an}E = \tilde B \hat\otimes_{\tilde B} \tilde\Omega^+_{an}B \cong \tilde{\mathcal{T}}B$. Furthermore, the homomorphism $\rho_* : \mathcal{T}A \to \mathrm{End}_{\tilde{\mathcal{T}}B}(\Omega^+_{an}E) \cong \tilde{\mathcal{T}}B$ is simply given by $\rho_*(a_1 \otimes \ldots \otimes a_n) = \rho(a_1) \otimes \ldots \otimes \rho(a_n)$. Hence the lift of $(\tilde B, \rho, 0)$ corresponds to the $\mathcal{T}A$-$\tilde{\mathcal{T}}B$-bimodule $(\tilde{\mathcal{T}}B, \rho_*, 0)$. Thus by Remark 6.9, the
two components of the morphism $\chi : \Omega_\epsilon\mathcal{T}A \to X(\mathcal{T}B)$ are respectively a 0-cochain and a 1-cochain on the (b + B)-complex of universal forms over $\mathcal{T}A$:

$$\chi_0(x) = \rho_*(x),\ \forall x \in \mathcal{T}A, \qquad \chi_1(x\,dy) = \natural\rho_*(x)\,d\rho_*(y),\ \forall x\,dy \in \Omega^1\mathcal{T}A.$$

Hence χ vanishes on any differential form over $\mathcal{T}A$ of degree ≥ 2. From the explicit expression of the Goodwillie equivalence γ, we can easily compute the composition $\chi\gamma : X(\mathcal{T}A) \to X(\mathcal{T}B)$:

$$x \xrightarrow{\ \gamma\ } x + (\text{degree} \ge 2) \xrightarrow{\ \chi_0\ } \rho_*(x),$$

$$\natural x\,dy \xrightarrow{\ \gamma\ } x\,dy + b(x\phi(y)) + (\text{degree} \ge 3) \xrightarrow{\ \chi_1\ } \natural\rho_*(x)\,d\rho_*(y),$$

for any $x, y \in \mathcal{T}A$. This is precisely the morphism of complexes $X(\rho_*)$.

We focus on the algebra $\mathbb{C}$. One knows [24] that $X(\mathcal{T}\mathbb{C})$ is homotopic to $X(\mathbb{C}) : \mathbb{C} \rightleftarrows 0$. The generator of $HE_0(\mathbb{C}) = \mathbb{C}$ is represented by the following even cycle $\hat e \in X_0(\mathcal{T}\mathbb{C}) \cong \Omega^+_{an}\mathbb{C}$. Denoting by e the unit of $\mathbb{C}$, then

$$\hat e := e + \sum_{n\ge 1} \frac{(2n)!}{(n!)^2} \Big(e - \frac12\Big)(dede)^n \ \in \Omega^+_{an}\mathbb{C} \cong \mathcal{T}\mathbb{C} \tag{103}$$

is an idempotent: $\hat e^2 = \hat e$ in $\mathcal{T}\mathbb{C}$, which implies $\natural d\hat e = 0$. Thus $\hat e$ indeed defines an even cycle of $X(\mathcal{T}\mathbb{C})$.

Now let A be a complete bornological algebra, and fix an integer $N \in \mathbb{N}$. Let $\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-$ be the $\mathbb{Z}_2$-graded complete bornological space such that $\mathcal{H}_+ = \mathcal{H}_- = \mathbb{C}^N$, and consider the right $\tilde A$-module $E = \mathcal{H}\hat\otimes\tilde A$. The algebra of endomorphisms $\mathrm{End}_{\tilde A}(E)$ identifies with the $\mathbb{Z}_2$-graded matrix algebra $M_2(M_N(\tilde A))$, the graduation corresponding to the decomposition into diagonal/off-diagonal matrices as usual. Any pair of idempotents $e_\pm = 1 + u_\pm \in M_N(\tilde A)$, with $u_\pm \in M_N(A)$, is described by an even bounded homomorphism $\rho = \begin{pmatrix} \rho_+ & 0 \\ 0 & \rho_- \end{pmatrix}$ from $\mathbb{C}$ to $M_2(M_N(\tilde A))$, with $\rho_\pm(e) = e_\pm$, $e \in \mathbb{C}$. This gives rise to a bimodule $(E, \rho, 0) \in \Psi_0(\mathbb{C}, \tilde A)$. The pair $e_\pm$ is called degenerate if $e_+ = e_-$. The set of (differentiable) homotopy classes of such bimodules modulo degenerates is the K-theory group $K_0(A)$. The bivariant Chern character yields a well-defined additive map $K_0(A) \to HE_0(\mathbb{C}, A) \cong HE_0(A)$ which coincides with the usual Chern character on K-theory [15]:

Proposition 8.2. Let A be a complete bornological algebra, $e_\pm = 1 + u_\pm$ a pair of idempotents with $u_\pm \in M_N(A)$, and (E, ρ, 0) the corresponding bimodule in $\Psi_0(\mathbb{C}, \tilde A)$ representing the K-theory element $[e_+] - [e_-] \in K_0(A)$. Then the Chern character $ch(E, \rho, 0) \in HE_0(A)$ is represented by the entire chain $ch(e_+) - ch(e_-) \in \mathcal{T}A$, with

$$ch(e_\pm) = \mathrm{tr}(e_\pm) + \sum_{n\ge 1} \frac{(2n)!}{(n!)^2}\, \mathrm{tr}\Big(\big(e_\pm - \tfrac12\big)(de_\pm de_\pm)^n\Big) \ \in \Omega^+_{an}A \cong \mathcal{T}A, \tag{104}$$

where tr is the usual trace on N × N matrices. The difference $ch(e_+) - ch(e_-)$ is well-defined on $K_0(A)$ because it vanishes on degenerates and the bivariant Chern character is homotopy invariant.
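Proof. One has $E = \mathcal{H}\hat\otimes\tilde A$ with $\mathcal{H} = \mathbb{C}^N \oplus \mathbb{C}^N$. Thus $\Omega^+_{an}E$ is equal to $\mathcal{H}\hat\otimes\tilde{\mathcal{T}}A$ and the homomorphism $\rho : \mathbb{C} \to \mathrm{End}_{\tilde A}(E) = M_{2N}(\tilde A)$ lifts to a homomorphism $\rho_* : \mathcal{T}\mathbb{C} \to \mathrm{End}_{\tilde{\mathcal{T}}A}(\Omega^+_{an}E) = M_{2N}(\tilde{\mathcal{T}}A)$. Then by Remark 6.9, ch(E, ρ, 0) is the image of the generator $\hat e \in \mathcal{T}\mathbb{C}$ under the composition of bounded maps

$$\mathcal{T}\mathbb{C} \xrightarrow{\ \rho_*\ } M_{2N}(\tilde{\mathcal{T}}A) \xrightarrow{\ \mathrm{tr}_s\ } \tilde{\mathcal{T}}A \xrightarrow{\ p\ } \mathcal{T}A,$$

where $\mathrm{tr}_s$ is the usual supertrace on supersymmetric 2N × 2N matrices, and p is the projection. Since $e_\pm = 1 + u_\pm$ with $u_\pm \in M_N(A)$, we have $\mathrm{tr}_s(\rho(e)) = \mathrm{tr}(e_+) - \mathrm{tr}(e_-) = \mathrm{tr}(u_+) - \mathrm{tr}(u_-) \in A$, hence

$$ch(E, \rho, 0) = \mathrm{tr}_s(\rho(e)) + \sum_{n\ge 1} \frac{(2n)!}{(n!)^2}\, \mathrm{tr}_s\Big(\big(\rho(e) - \tfrac12\big)(d\rho(e)d\rho(e))^n\Big) = ch(e_+) - ch(e_-) \ \in \mathcal{T}A$$

as claimed.

Let Idem(A) be the set of idempotents in the inductive limit of matrix algebras $M_\infty(A) = \varinjlim M_N(A)$. Then we have a pairing $\mathrm{Idem}(A) \times \Psi_*(A, \tilde B) \to \Psi_*(\mathbb{C}, \tilde B)$ given by left composition with the homomorphisms $\mathbb{C} \to M_N(A)$ corresponding to the idempotents. The functorial properties of the Chern character

$$\begin{array}{ccc} \mathrm{Idem}(A) \times \Psi^\theta_*(A, \tilde B) & \longrightarrow & \Psi^\theta_*(\mathbb{C}, \tilde B) \\ \downarrow {\scriptstyle ch \times ch} & & \downarrow {\scriptstyle ch} \\ HE_0(A) \times HE_*(A, B) & \longrightarrow & HE_*(B) \end{array} \tag{105}$$

show that we can compute in homology the pairing between K-theory and bivariant modules. In particular when $B = \mathbb{C}$, the unbounded A-$\mathbb{C}$-bimodules essentially describe the spectral triples over A. Recall that a spectral triple is given by a $\mathbb{Z}_2$-graded Hilbert space $\mathcal{H}$, a representation ρ of A into the algebra $\mathrm{End}(\mathcal{H})$ of bounded operators, and an unbounded selfadjoint odd operator D. Extending the left actions of ρ and D to $E := \mathcal{H}\hat\otimes\tilde{\mathbb{C}}$, we get a bimodule $(E, \rho, D) \in \Psi_*(A, \tilde{\mathbb{C}})$ describing the above spectral triple. Its bivariant Chern character is an entire cyclic cohomology class in $HE_*(A, \mathbb{C}) \cong HE^*(A)$. Our construction represents ch(E, ρ, D) as a bounded cocycle on $X(\mathcal{T}A)$. Exploiting the homotopy equivalence between $X(\mathcal{T}A)$ and $\Omega_\epsilon A$, we may also represent the Chern character as an entire cocycle on the (b + B)-complex of A. When doing this, we recover exactly the JLO cocycle [21]:

Proposition 8.3. Let A be a complete bornological algebra, and $(E, \rho, D) \in \Psi_*(A, \tilde{\mathbb{C}})$ a θ-summable spectral triple, with $E = \mathcal{H}\hat\otimes\tilde{\mathbb{C}}$. Then the Chern character of (E, ρ, D) is represented by the entire cocycle $\chi(E, \rho, D) : \Omega_\epsilon A \to \mathbb{C}$ on the (b + B)-complex of A, corresponding to the JLO formula:

(i) In the even case $\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-$, $\rho = \begin{pmatrix} \rho_+ & 0 \\ 0 & \rho_- \end{pmatrix}$ and $D = \begin{pmatrix} 0 & D_- \\ D_+ & 0 \end{pmatrix}$. Then χ is the even entire cochain

$$\chi(E, \rho, D)(a_0 da_1 \ldots da_{2n}) = \int_{\Delta_{2n}} ds_1 \ldots ds_{2n}\, \mathrm{Tr}_s\big(\rho(a_0)e^{-s_0 D^2}[D, \rho(a_1)]e^{-s_1 D^2} \ldots [D, \rho(a_{2n})]e^{-s_{2n} D^2}\big), \tag{106}$$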
for any $n \in \mathbb{N}$ and $a_i \in A$, where $\mathrm{Tr}_s$ is the supertrace of operators on the $\mathbb{Z}_2$-graded Hilbert space $\mathcal{H}$.

(ii) In the odd case $\mathcal{H}$ is the sum of two copies of a trivially graded Hilbert space $\mathcal{K}$. One has $\rho = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix}$ for a given bounded homomorphism $\alpha : A \to \mathrm{End}(\mathcal{K})$, and $D = \begin{pmatrix} 0 & Q \\ Q & 0 \end{pmatrix}$ for an unbounded operator Q. Then χ is the odd entire cochain

$$\chi(E, \rho, D)(a_0 da_1 \ldots da_{2n+1}) = -\sqrt{2i} \int_{\Delta_{2n+1}} ds_1 \ldots ds_{2n+1}\, \mathrm{Tr}\big(\rho(a_0)e^{-s_0 Q^2}[Q, \rho(a_1)]e^{-s_1 Q^2} \ldots [Q, \rho(a_{2n+1})]e^{-s_{2n+1} Q^2}\big), \tag{107}$$

for any $n \in \mathbb{N}$ and $a_i \in A$, where Tr is the trace of operators on $\mathcal{K}$.

Proof. The isomorphism $HE_*(A, \mathbb{C}) \cong HE^*(A)$ is obtained as follows. Given a bounded chain map $f : X(\mathcal{T}A) \to X(\mathcal{T}\mathbb{C})$ representing a bivariant entire cyclic cohomology class, we associate the bounded cocycle $X(m) \circ f : X(\mathcal{T}A) \to X(\mathbb{C}) \cong \mathbb{C}$ obtained after composition with the projection morphism $X(m) : X(\mathcal{T}\mathbb{C}) \to X(\mathbb{C})$ coming from the multiplication map $m : \mathcal{T}\mathbb{C} \to \mathbb{C}$. The functoriality of the construction χ (Theorem 6.8) yields a commutative square

$$\begin{array}{ccc} \Omega_\epsilon\mathcal{T}A & \xrightarrow{\ \chi(\Omega^+_{an}E, \rho_*, D)\ } & X(\mathcal{T}\mathbb{C}) \\ \downarrow {\scriptstyle \Omega_\epsilon(m)} & & \downarrow {\scriptstyle X(m)} \\ \Omega_\epsilon A & \xrightarrow{\ \chi(E, \rho, D)\ } & \mathbb{C}. \end{array}$$

Combining this square with the commutative diagram (36) of Corollary 4.5 shows that under the homotopy equivalence $P \circ c : X(\mathcal{T}A) \xrightarrow{\sim} \Omega_\epsilon A$, the Chern character of the spectral triple $ch(E, \rho, D) \in HE^*(A)$ indeed corresponds to the class of the (b + B)-cocycle χ(E, ρ, D). Since $X_1(\mathbb{C}) = 0$, the only remaining component of χ is $\chi_0 = \tau\mu_0$, where τ is the even/odd supertrace of endomorphisms on $\mathcal{H}$, depending on the parity of the spectral triple. Since τ is a (super)trace, one has

$$\chi_0 = \tau \int_0^1 dt\, e^{-t\theta}\, \partial\rho\, e^{(t-1)\theta} = \tau\, \partial\rho\, e^{-\theta},$$

with $\theta = D^2 + [D, \rho]$. As usual $\exp(-\theta)$ is given by a Duhamel expansion

$$e^{-\theta} = \sum_{n\ge 0} (-)^n \int_{\Delta_n} ds_1 \ldots ds_n\, e^{-s_0 D^2}[D, \rho]e^{-s_1 D^2} \ldots [D, \rho]e^{-s_n D^2}.$$

(i) Even case: Then $\mathcal{H} = \mathcal{H}_+ \oplus \mathcal{H}_-$ and $\tau = \mathrm{Tr}_s$. Also, ρ is a diagonal matrix whereas D is off-diagonal, so that τ selects only even powers of D, and therefore $\chi_0$ is an even entire cocycle on A. Equation (106) follows.
(ii) Odd case: $\mathcal{H} = \mathcal{K} \oplus \mathcal{K}$ and $D = \varepsilon Q$, with $\varepsilon = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. For any $x, y \in \mathrm{End}(\mathcal{K})$, one has $\tau(x + \varepsilon y) = \sqrt{2i}\,\mathrm{Tr}(y)$, hence τ selects only odd powers of D; $\chi_0$ is thus an odd entire cocycle over A, whence (107).

This shows in particular that for any idempotent $e \in \mathrm{Idem}(A)$, and any θ-summable even spectral triple (E, ρ, D), the coupling of the Chern characters calculates the index pairing between K-theory and K-homology:

Proposition 8.4. Let A be a complete bornological algebra, $e \in M_\infty(A)$ an idempotent and (E, ρ, D) a θ-summable even spectral triple over A. Then the composition of the Chern characters by the map $HE_0(A) \times HE^0(A) \to \mathbb{C}$ computes the index pairing

$$\langle ch(e), ch(E, \rho, D) \rangle = \mathrm{index}(eDe). \tag{108}$$

Proof. One has $e \in M_N(A)$, and $E = \mathcal{H}\hat\otimes\tilde{\mathbb{C}}$ for a certain Hilbert space $\mathcal{H}$. The proposition is easily proved by exploiting the homotopy invariance of the Chern character with respect to D. For notational simplicity we still denote by e the idempotent $\rho(e) \in \mathrm{End}(\mathcal{H}\hat\otimes\mathbb{C}^N)$. For $t \in [0, 1]$ the operator

$$D_t = D - t\,([D, e]e - e[D, e])$$

is a bounded perturbation of D, connecting homotopically D to $D_1 = eDe + (1 - e)D(1 - e)$. One has $[D_1, e] = 0$, which shows that we can reduce to the situation where D commutes with e. The Chern character of e in the (b + B)-complex of entire chains over A is obtained from the X-complex cycle $\hat e$ by means of the rescaling factor of Eq. (11):

$$ch(e) = \mathrm{tr}(e) + \sum_{n\ge 1} (-)^n \frac{(2n)!}{n!}\, \mathrm{tr}\Big(\big(e - \tfrac12\big)(dede)^n\Big).$$

Since we assume $[D, e] = 0$, Eq. (106) implies that χ(E, ρ, D) vanishes on any term involving a (strictly) positive power of dede, hence

$$\chi(E, \rho, D)(ch(e)) = \mathrm{Tr}_s\big(e\, e^{-D^2}\big).$$

This is the McKean–Singer formula computing the index of D relative to e, whose proof involves spectral theory in a simple way [17].

8.2. The Bott class. We now examine the generators of the K-theory and K-homology of the n-dimensional real vector space. This will illustrate some features of commutative algebras, and explain the role of the somewhat mysterious Fedosov exponential (Lemma 7.1). Let $\mathcal{S}(\mathbb{R}^n)$ be the commutative algebra of smooth rapidly decreasing functions on $\mathbb{R}^n$. We denote by $\{x_1, \ldots, x_n\}$ the canonical coordinate system, giving to $\mathbb{R}^n$ its canonical orientation. $\mathcal{S}(\mathbb{R}^n)$ is a Fréchet algebra for the locally convex topology given by the countable family of seminorms

$$\|a\|_{\alpha,\delta} = \sup_{x \in \mathbb{R}^n} |x^\alpha \partial^\delta a(x)| \qquad \forall a \in \mathcal{S}(\mathbb{R}^n), \tag{109}$$
where $\alpha = (\alpha_1, \ldots, \alpha_n)$ and $\delta = (\delta_1, \ldots, \delta_n)$ are collections of positive integers such that $x^\alpha = (x_1)^{\alpha_1} \ldots (x_n)^{\alpha_n}$ and $\partial^\delta$ is the partial differentiation operator $\frac{\partial^{|\delta|}}{\partial x_1^{\delta_1} \ldots \partial x_n^{\delta_n}}$. We endow $\mathcal{S}(\mathbb{R}^n)$ with the bounded bornology.

For any $n \in \mathbb{N}$, we shall first construct a spectral triple over $\mathcal{S}(\mathbb{R}^n)$. It comes from the Dirac operator acting on sections of the trivial spinor bundle over $\mathbb{R}^n$. So let $C_n$ be the n-dimensional complex Clifford algebra. It is generated by a basis $\gamma^1, \ldots, \gamma^n$ of $\mathbb{C}^n$, subject to the anticommutation relations $\{\gamma^\mu, \gamma^\nu\} = 2\delta^{\mu\nu}$. We let $S_n$ be the complex spinor representation of $C_n$ endowed with the fine bornology. Then the fundamental class of $\mathbb{R}^n$ in K-homology is represented by the following bornological spectral triple: $\mathcal{H}_n := S_n \hat\otimes \mathcal{S}(\mathbb{R}^n)$ is the complete bornological space consisting of rapidly decreasing sections of the trivial spinor bundle; the representation $\rho : \mathcal{S}(\mathbb{R}^n) \to \mathrm{End}(\mathcal{H}_n)$ is given by (left) multiplication; and for any real parameter t > 0, the usual Dirac operator $D_t = i\sqrt{t}\,\gamma^\mu \frac{\partial}{\partial x^\mu}$ acts as an unbounded endomorphism of $\mathcal{H}_n$. If $E_n$ denotes the right $\tilde{\mathbb{C}}$-module $\mathcal{H}_n \hat\otimes \tilde{\mathbb{C}}$, then $(E_n, \rho, D_t) \in \Psi_*(\mathcal{S}(\mathbb{R}^n), \tilde{\mathbb{C}})$ is a θ-summable spectral triple with parity equal to n mod 2 (for n odd, replace $E_n$ by two copies of $\mathcal{H}_n \hat\otimes \tilde{\mathbb{C}}$, ρ by $\mathrm{Id}_2 \otimes \rho$ and $D_t$ by $\varepsilon D_t$).

In the subsequent calculations we will need some explicit properties of the matrix representation of the Clifford algebra into $S_n$. This depends on the parity of n:

(i) n = 2k: Then the spinor representation $S_n$ is a $2^k$-dimensional $\mathbb{Z}_2$-graded space. The generators $\gamma^\mu$ are odd operators represented by hermitian matrices. The grading operator is the element of highest degree in $C_n$:

$$\Gamma = (-i)^k \gamma^1 \ldots \gamma^n, \qquad (\Gamma)^2 = 1. \tag{110}$$

The supertrace of linear operators on $S_n$ is $\mathrm{tr}_s = \mathrm{tr}(\Gamma\,\cdot)$. Then for any j < n one has $\mathrm{tr}_s(\gamma^1 \ldots \gamma^j) = 0$, while $\mathrm{tr}_s(\gamma^1 \ldots \gamma^n) = (2i)^k$.

(ii) n = 2k + 1: Then $S_n$ is a trivially graded $2^k$-dimensional space. We also represent the generators $\gamma^\mu$ by hermitian matrices. Then for any j < n one has $\mathrm{tr}(\gamma^1 \ldots \gamma^j) = 0$, and $\mathrm{tr}(\gamma^1 \ldots \gamma^n) = (2i)^k$.

Proposition 8.5. Let n be a positive integer. The Chern character of the fundamental class $(E_n, \rho, D_t) \in \Psi_{n+2\mathbb{Z}}(\mathcal{S}(\mathbb{R}^n), \tilde{\mathbb{C}})$ in $HE^{n+2\mathbb{Z}}(\mathcal{S}(\mathbb{R}^n))$ retracts, when t → 0, to an n-dimensional (b + B)-cocycle $\chi^n : \Omega^n\mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$. It corresponds to the fundamental class of $\mathbb{R}^n$ in cyclic cohomology:

$$\chi^n(a_0 da_1 \ldots da_n) = \frac{1}{n!\,(2\pi i)^{n/2}} \int_{\mathbb{R}^n} a_0\, da_1 \wedge \cdots \wedge da_n, \tag{111}$$

for any $a_i \in \mathcal{S}(\mathbb{R}^n)$.

Proof. It is a well-known fact that when the Dirac operator acts on the sections of a spinor bundle on a manifold M, the JLO cocycle retracts on local expressions involving the $\hat A$-genus of M at the limit t → 0 [8]. In our case, $\mathbb{R}^n$ is a flat manifold so that the computation is particularly simple, and can be performed by means of the asymptotic symbol calculus, for example as in [9]. We do not give the details here because it is a classical result.

In fact the retracted (b + B)-cocycle $\chi^n$ is also a cyclic n-cocycle, that is, a closed graded trace of degree n on the DG algebra $\Omega\mathcal{S}(\mathbb{R}^n)$. It follows that $\chi^n$ is invariant under the Karoubi operator κ on $\Omega\mathcal{S}(\mathbb{R}^n)$, thus it vanishes on the contractible subspace
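$P^\perp\Omega\mathcal{S}(\mathbb{R}^n)$ (see Sect. 3.3). It follows that $\chi^n$ is also a cocycle on the X-complex $X(\mathcal{T}\mathcal{S}(\mathbb{R}^n)) \cong (\Omega_{an}\mathcal{S}(\mathbb{R}^n), d, b)$. Taking into account the rescaling factor $(-)^n [n/2]!$ of Eq. (11) when passing from $\Omega_{an}\mathcal{S}(\mathbb{R}^n)$ to $\Omega_\epsilon\mathcal{S}(\mathbb{R}^n)$, we deduce the expression of the fundamental class as an X-complex cocycle:

Corollary 8.6. Let $n \in \mathbb{N}$, and $(E_n, \rho, D_t)$ be the fundamental K-homology class of $\mathbb{R}^n$.

(i) If n is even, then the Chern character of $(E_n, \rho, D_t)$ in $HE^0(\mathcal{S}(\mathbb{R}^n))$ is represented by the following trace on the algebra $\mathcal{T}\mathcal{S}(\mathbb{R}^n) \cong (\Omega^+_{an}\mathcal{S}(\mathbb{R}^n), \odot)$:

$$ch(E_n, \rho, D_t)(a_0 da_1 \ldots da_n) = \frac{[n/2]!}{n!\,(2\pi i)^{n/2}} \int_{\mathbb{R}^n} a_0\, da_1 \wedge \cdots \wedge da_n \tag{112}$$

for any $a_i \in \mathcal{S}(\mathbb{R}^n)$.

(ii) If n is odd, then the Chern character of $(E_n, \rho, D_t)$ in $HE^1(\mathcal{S}(\mathbb{R}^n))$ is represented by the following one-cocycle on $\Omega^1\mathcal{T}\mathcal{S}(\mathbb{R}^n)_\natural \cong \Omega^-_{an}\mathcal{S}(\mathbb{R}^n)$:

$$ch(E_n, \rho, D_t)\big(\natural(a_0 da_1 \ldots da_{n-1})\,da_n\big) = -\frac{[n/2]!}{n!\,(2\pi i)^{n/2}} \int_{\mathbb{R}^n} a_0\, da_1 \wedge \cdots \wedge da_{n-1} \wedge da_n \tag{113}$$

for any $a_i \in \mathcal{S}(\mathbb{R}^n)$.

One sees that there is a simplification here, due to the fact that $\mathcal{S}(\mathbb{R}^n)$ is a commutative algebra: the cocycle factors through de Rham cohomology. Hence we can deal with the ordinary exterior algebra of differential forms $\Omega^*(\mathbb{R}^n)$ instead of the universal DG algebra of non-commutative forms $\Omega\mathcal{S}(\mathbb{R}^n)$. We endow $\Omega^*(\mathbb{R}^n)$ with its usual Fréchet topology and the associated bounded bornology. Then the universal property of $\Omega\mathcal{S}(\mathbb{R}^n)$ implies that there is a unique bounded DG algebra morphism $\Omega\mathcal{S}(\mathbb{R}^n) \to \Omega^*(\mathbb{R}^n)$ extending the identity map $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. The image of $a_0 da_1 \ldots da_k$ is equal to $a_0\, da_1 \wedge \cdots \wedge da_k$ if k ≤ n and zero otherwise. As a consequence, this DG morphism is also bounded for the analytic bornology on $\Omega\mathcal{S}(\mathbb{R}^n)$ and thus extends to a bounded DG morphism $\Omega_{an}\mathcal{S}(\mathbb{R}^n) \to \Omega^*(\mathbb{R}^n)$. We endow the even part $\Omega^+(\mathbb{R}^n)$ with the Fedosov product

$$\omega_1 \odot \omega_2 := \omega_1\omega_2 - d\omega_1 \wedge d\omega_2 \qquad \forall \omega_1, \omega_2 \in \Omega^+(\mathbb{R}^n). \tag{114}$$

Then $T_n := (\Omega^+(\mathbb{R}^n), \odot)$ is an associative (non-commutative!) complete bornological algebra, and we get a canonical bounded homomorphism $(\Omega^+_{an}\mathcal{S}(\mathbb{R}^n), \odot) \to (\Omega^+(\mathbb{R}^n), \odot)$, or equivalently $\mathcal{T}\mathcal{S}(\mathbb{R}^n) \to T_n$. This yields a bounded chain map $X(\mathcal{T}\mathcal{S}(\mathbb{R}^n)) \to X(T_n)$. Now, the fundamental class of $\mathbb{R}^n$ gives rise to a bounded cocycle $[\mathbb{R}^n] : X(T_n) \to \mathbb{C}$:

$$[\mathbb{R}^n](x) = \int_{\mathbb{R}^n} x, \qquad [\mathbb{R}^n](\natural x\,dy) = \int_{\mathbb{R}^n} x \wedge dy, \qquad \forall x, y \in T_n. \tag{115}$$

It is easily checked that $[\mathbb{R}^n]$ vanishes on the commutators $[T_n, \Omega^1 T_n]$, hence is well-defined on $\Omega^1 T_{n\natural} = X_1(T_n)$. Consequently, the Chern character of $(E_n, \rho, D) \in \Psi_{n+2\mathbb{Z}}(\mathcal{S}(\mathbb{R}^n), \tilde{\mathbb{C}})$ factors through $X(T_n)$: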
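$$ch(E_n, \rho, D_t) : X(\mathcal{T}\mathcal{S}(\mathbb{R}^n)) \to X(T_n) \xrightarrow{\ \hat\chi_n\ } \mathbb{C}, \tag{116}$$

where $\hat\chi_n$ is the cocycle $(-)^n \frac{[n/2]!}{n!\,(2\pi i)^{n/2}} [\mathbb{R}^n]$.

We now construct the Bott generator of the K-theory of $\mathbb{R}^n$. It will be represented by a bimodule $\beta_n \in \Psi_{n+2\mathbb{Z}}(\mathbb{C}, \mathcal{S}(\mathbb{R}^n))$. Since we deal with the exterior algebra of ordinary differential forms and its Fedosov deformation $T_n$, we do not need to consider the unitalization of $\mathcal{S}(\mathbb{R}^n)$. This property is also an advantage of the commutative case. Let again $\mathcal{H}_n = S_n \hat\otimes \mathcal{S}(\mathbb{R}^n)$ be the space of rapidly decreasing sections of the trivial spinor bundle over $\mathbb{R}^n$, considered this time as a right $\mathcal{S}(\mathbb{R}^n)$-module. There is an obvious homomorphism $\alpha : \mathbb{C} \to \mathrm{End}_{\mathcal{S}(\mathbb{R}^n)}(\mathcal{H}_n)$, sending the unit $e \in \mathbb{C}$ to the identity endomorphism. For any real parameter λ > 0, we introduce an unbounded Dirac operator $Q_\lambda$ acting on $\mathcal{H}_n$ by Clifford multiplication with respect to the vector x:

$$(Q_\lambda \xi)(x) = \sqrt{\lambda}\, x_\mu \gamma^\mu \cdot \xi(x) \qquad \forall \xi \in \mathcal{H}_n,\ x \in \mathbb{R}^n. \tag{117}$$

Note that $Q_\lambda$ is the Fourier transform of the previous Dirac operator $D_{\lambda^{-1}}$. Then the unbounded $\mathbb{C}$-$\mathcal{S}(\mathbb{R}^n)$-bimodule $\beta_n = (\mathcal{H}_n, \alpha, Q_\lambda)$ represents the Bott generator of $\mathbb{R}^n$. Its Chern character $ch(\beta_n) \in HE_{n+2\mathbb{Z}}(\mathcal{S}(\mathbb{R}^n))$ is represented by entire chains over $\mathcal{S}(\mathbb{R}^n)$. We want to evaluate $ch(\beta_n)$ on the fundamental class of $\mathbb{R}^n$, so that only its image in $X(T_n)$ is important. The whole construction of the bivariant Chern character then carries over immediately to the situation where the universal DG algebra $\Omega\mathcal{S}(\mathbb{R}^n)$ is replaced by $\Omega^*(\mathbb{R}^n)$.

Proposition 8.7. For any $n \in \mathbb{N}$, let $\beta_n = (\mathcal{H}_n, \alpha, Q_\lambda) \in \Psi_{n+2\mathbb{Z}}(\mathbb{C}, \mathcal{S}(\mathbb{R}^n))$ be the Bott generator. Its Chern character is represented by the following cycle in the X-complex of the Fedosov algebra $T_n = (\Omega^+(\mathbb{R}^n), \odot)$:

(i) n even:

$$ch(\beta_n) = \frac{n!}{(n/2)!} (2i\lambda)^{n/2}\, e^{-\lambda x^2} dx_1 \wedge \cdots \wedge dx_n \ \in T_n. \tag{118}$$

(ii) n = 2k + 1:

$$ch(\beta_n) = -\frac{(2k)!}{k!} (2i\lambda)^{n/2} \sum_{j=1}^{n} \natural\big(e^{-\lambda x^2} dx_{j+1} \wedge \cdots \wedge dx_{j-1}\big)\,dx_j \ \in \Omega^1 T_{n\natural}. \tag{119}$$

These are top-degree differential forms with gaussian shape over $\mathbb{R}^n$.

Proof. For n even we set $E_n = \mathcal{H}_n$, $D = Q_\lambda$ and ρ = α. For n odd we set $E_n = \mathcal{H}_n \oplus \mathcal{H}_n$, $D = \varepsilon Q_\lambda$ and $\rho = \mathrm{Id}_2 \otimes \alpha$, where $\varepsilon = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is the odd generator of the one-dimensional Clifford algebra $C_1$. Then for any n, the unbounded bimodule $(E_n, \rho, D) \in \Psi_{n+2\mathbb{Z}}(\mathbb{C}, \mathcal{S}(\mathbb{R}^n))$ represents the Bott element, and $ch(\beta_n)$ is the image of the generator $\hat e \in HE_0(\mathbb{C})$ under the composition of chain maps

$$X(\mathcal{T}\mathbb{C}) \xrightarrow{\ \gamma\ } \Omega_\epsilon\mathcal{T}\mathbb{C} \xrightarrow{\ \chi\ } X(\mathcal{T}\mathcal{S}(\mathbb{R}^n)) \to X(T_n),$$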
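where γ is the Goodwillie equivalence and $\chi = \chi(\Omega^+_{an}E_n, \rho_*, D)$ is the core of the bivariant Chern character. We know that the generator of $HE_0(\mathbb{C})$ is represented by the idempotent

$$\hat e = e + \sum_{k\ge 1} \frac{(2k)!}{(k!)^2} \Big(e - \frac12\Big)(dede)^k \ \in \Omega^+_{an}\mathbb{C} \cong \mathcal{T}\mathbb{C},$$

where e is the unit of $\mathbb{C}$. We claim that its image $\gamma(\hat e)$ in $\Omega_\epsilon\mathcal{T}\mathbb{C}$ has the same homology class as the (b + B) entire cycle

$$\hat f = \hat e + \sum_{k\ge 1} (-)^k \frac{(2k)!}{k!} \Big(\hat e - \frac12\Big)(d\hat e\, d\hat e)^k \ \in \Omega_\epsilon\mathcal{T}\mathbb{C}.$$

Indeed, the projection $\pi : \Omega_\epsilon\mathcal{T}\mathbb{C} \to X(\mathcal{T}\mathbb{C})$ maps $\hat f$ to $\hat e$, and Corollary 4.4 shows that π and γ are inverse homotopy equivalences. Thus $ch(\beta_n)$ is the image of $\hat f$ under $\chi(\Omega^+_{an}E_n, \rho_*, D)$ projected to $X(T_n)$. The construction of the bivariant Chern character carries over to the situation where the universal DG algebra $\Omega\mathcal{S}(\mathbb{R}^n)$ is replaced by $\Omega^*(\mathbb{R}^n)$. We thus consider the right $T_n$-module $\Omega^+E_n := S_n \hat\otimes T_n$. There is a unique bounded homomorphism $\rho_* : \mathcal{T}\mathbb{C} \to \mathrm{End}_{T_n}(\Omega^+E_n)$ extending $\rho : \mathbb{C} \to \mathrm{End}_{\mathcal{S}(\mathbb{R}^n)}(E_n)$. By definition one has ρ(e) = 1 and d1 = 0, so that $\rho_*(e(dede)^k) = 1(d1d1)^k = 0$ whenever k ≥ 1, and $\rho_*(\hat e) = 1$. Next, the chain map $\chi(\Omega^+E_n, \rho_*, D) : \Omega_\epsilon\mathcal{T}\mathbb{C} \to X(T_n)$ has two components: the $T_n$-valued $\chi_0 = \tau\mu_0$, and the $\Omega^1 T_{n\natural}$-valued $\chi_1 = \tau\mu_0\, d(D + \rho_*)$, with

$$\mu_0 = \int_0^1 dt\, \exp_\odot(-t\theta)\, \partial\rho_*\, \exp_\odot((t - 1)\theta), \qquad \theta = D^{\odot 2} + [D, \rho_*]_\odot,$$

and $\tau : \mathrm{End}_{T_n}(\Omega^+E_n) \to T_n$ is the supertrace. Recall that the exponentials and commutators are taken with respect to the Fedosov product on $(\Omega^+(\mathbb{R}^n), \odot) = T_n$. Since $\rho_*(\hat e) = 1$, the Fedosov commutator $[D, \rho_*(\hat e)]_\odot$ vanishes, and we simply have

$$\tau\mu_0(\hat f) = \int_0^1 dt\, \tau\big(\exp_\odot(-tD^2)\, \partial\rho_*\, \exp_\odot((t-1)D^2)\big)(\hat e) = \tau \exp_\odot(-D^2),$$

$$\tau\mu_0\, d(D + \rho_*)(\hat f) = \int_0^1 dt\, \tau\big(\exp_\odot(-tD^2)\, \partial\rho_*\, \exp_\odot((t-1)D^2)\big)\, d(D + \rho_*)(\hat f) = \tau \exp_\odot(-D^2)\, dD.$$

Let us now compute the Fedosov exponential $\exp_\odot(-D^2)$ in terms of differential forms. One has $D^{\odot 2} = D^2 + dDdD$, where the laplacian $D^2$ is a scalar function of $x \in \mathbb{R}^n$: indeed if n is even, the matrices $\gamma^\mu$ are odd for the $\mathbb{Z}_2$-graduation of $S_n$ and

$$D^2 = Q_\lambda^2 = \lambda x_\mu \gamma^\mu x_\nu \gamma^\nu = \frac12 \lambda x_\mu x_\nu \{\gamma^\mu, \gamma^\nu\} = \lambda x^2.$$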
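The even-case Clifford identities used here, $(x_\mu\gamma^\mu)^2 = x^2$, $\Gamma^2 = 1$ and $\mathrm{tr}_s(\gamma^1 \ldots \gamma^n) = (2i)^k$ for n = 2k, can be checked numerically. The following is only an illustrative sketch: the explicit hermitian matrices (tensor products of Pauli matrices) are an assumed choice of representation that the text does not fix, and all function names are hypothetical.

```python
# Pure-Python numerical check of Clifford identities for n = 2k (illustrative sketch).
# The gamma matrices are built from Pauli matrices; this choice is an assumption.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[p][q] for j in range(len(a)) for q in range(len(b))]
            for i in range(len(a)) for p in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def scaled_identity(n, c=1):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a)))

def gammas(k):
    """Return 2k hermitian matrices of size 2^k with {g_mu, g_nu} = 2 delta_{mu nu}."""
    out = []
    for m in range(k):
        for s in (SX, SY):
            g = [[1]]
            for pos in range(k):
                g = kron(g, SZ if pos < m else (s if pos == m else I2))
            out.append(g)
    return out

def slash_squared(x, gs):
    """Square of x_mu gamma^mu (summation over mu)."""
    dim = len(gs[0])
    slash = [[sum(x[m] * gs[m][i][j] for m in range(len(gs))) for j in range(dim)]
             for i in range(dim)]
    return matmul(slash, slash)

def grading_and_top_product(gs):
    """Return (Gamma, gamma^1...gamma^n) with Gamma = (-i)^k gamma^1...gamma^n."""
    k = len(gs) // 2
    top = gs[0]
    for g in gs[1:]:
        top = matmul(top, g)
    gamma = [[(-1j) ** k * v for v in row] for row in top]
    return gamma, top

def check(k, x):
    """Return the three boolean checks for n = 2k and a vector x of length 2k."""
    gs = gammas(k)
    dim = 2 ** k
    norm2 = sum(v * v for v in x)
    gamma, top = grading_and_top_product(gs)
    return (
        close(slash_squared(x, gs), scaled_identity(dim, norm2)),   # (x.gamma)^2 = x^2
        close(matmul(gamma, gamma), scaled_identity(dim)),          # Gamma^2 = 1
        abs(trace(matmul(gamma, top)) - (2j) ** k) < 1e-12,         # tr_s(top) = (2i)^k
    )
```

For instance, `check(1, [0.5, -1.25])` and `check(2, [0.5, -1.25, 2.0, 3.0])` exercise the cases n = 2 and n = 4.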
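If n is odd, then $S_n$ is trivially graded, and the product $\varepsilon\gamma^\mu$ is odd:

$$D^2 = (\varepsilon Q_\lambda)^2 = \lambda\, \varepsilon x_\mu \gamma^\mu\, \varepsilon x_\nu \gamma^\nu = \frac{\lambda\varepsilon^2}{2}\, x_\mu x_\nu \{\gamma^\mu, \gamma^\nu\} = \lambda x^2.$$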
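Next, with $H = -D^{\odot 2}$, Lemma 7.1 implies that the Fedosov exponential is the following differential form on $\mathbb{R}^n$:

$$\exp_\odot H = \sum_{k\ge 0} (-)^k \int_{\Delta_k} ds_1 \ldots ds_k\, e^{s_0 H} dH\, de^{s_1 H}\, dH \ldots de^{s_{k-1} H}\, dH\, de^{s_k H}.$$

But $dH = -d(D^2) = -\lambda\, d(x^2)$ and $d\exp(sH) = s\, dH \exp(sH)$, hence $d(x^2)$ always appears by pairs in the expression above. This means that all the terms corresponding to k ≥ 1 vanish, and

$$\exp_\odot H = e^H = e^{-D^2 - dDdD} = e^{-D^2} \sum_{k\ge 0} \frac{(-)^k}{k!} (dDdD)^k,$$

because the scalar $D^2$ commutes (for the ordinary product of differential forms) with dDdD. For n even one has $dDdD = \lambda\, dx_\mu \gamma^\mu dx_\nu \gamma^\nu = -\lambda\, dx_\mu dx_\nu \gamma^\mu \gamma^\nu$ because $\gamma^\mu$ and $dx_\nu$ are odd, and for n odd $dDdD = \lambda\, dx_\mu \varepsilon\gamma^\mu dx_\nu \varepsilon\gamma^\nu = -\lambda\, dx_\mu dx_\nu \gamma^\mu \gamma^\nu$. Thus in any case, the Fedosov exponential reads

$$\exp_\odot(-D^2) = e^{-\lambda x^2} \sum_{k\ge 0} \frac{\lambda^k}{k!}\, dx_{\mu_1} \ldots dx_{\mu_{2k}}\, \gamma^{\mu_1} \ldots \gamma^{\mu_{2k}}.$$

Let us now compute the Chern character of the Bott element:

(i) n even: then τ is equal to the supertrace $\mathrm{tr}_s$ on the $\mathbb{Z}_2$-graded spinor representation $S_n$, and $ch(\beta_n) = \mathrm{tr}_s \exp_\odot(-D^2)$. Thus

$$ch(\beta_n) = e^{-\lambda x^2} \sum_{k\ge 0} \frac{\lambda^k}{k!}\, dx_{\mu_1} \ldots dx_{\mu_{2k}}\, \mathrm{tr}_s(\gamma^{\mu_1} \ldots \gamma^{\mu_{2k}}).$$

However, if 2k < n, then the supertrace over the γ-matrices vanishes, and if 2k > n, the differential form $dx_{\mu_1} \ldots dx_{\mu_{2k}}$ is identically zero. Hence only the term 2k = n remains:

$$\begin{aligned} ch(\beta_n) &= \frac{\lambda^{n/2}}{(n/2)!}\, e^{-\lambda x^2} dx_{\mu_1} \ldots dx_{\mu_n}\, \mathrm{tr}_s(\gamma^{\mu_1} \ldots \gamma^{\mu_n}) \\ &= \frac{\lambda^{n/2}}{(n/2)!}\, n!\, e^{-\lambda x^2} dx_1 \ldots dx_n\, \mathrm{tr}_s(\gamma^1 \ldots \gamma^n) \\ &= \frac{n!}{(n/2)!} (2i\lambda)^{n/2}\, e^{-\lambda x^2} dx_1 \ldots dx_n. \end{aligned}$$

(ii) n odd: then $D = \varepsilon Q_\lambda$ and $\tau(x + \varepsilon y) = \sqrt{2i}\, \mathrm{tr}(y)$ for any endomorphisms x, y of the trivially graded spinor representation $S_n$. Thus

$$ch(\beta_n) = \tau \exp_\odot(-D^2)\, dD = \tau \exp_\odot(-D^2)\, d(\varepsilon Q_\lambda) = -\sqrt{2i}\, \mathrm{tr}\big(\exp_\odot(-D^2)\, dQ_\lambda\big),$$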
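because ε anticommutes with d. One has

$$\begin{aligned} ch(\beta_n) &= -\sqrt{2i}\, \mathrm{tr}\Big(e^{-\lambda x^2} \sum_{k\ge 0} \frac{\lambda^k}{k!}\, dx_{\mu_1} \ldots dx_{\mu_{2k}}\, \gamma^{\mu_1} \ldots \gamma^{\mu_{2k}}\, d(\sqrt{\lambda}\, x_\nu \gamma^\nu)\Big) \\ &= -\sqrt{2i} \sum_{k\ge 0} \frac{\lambda^{k+1/2}}{k!}\, \mathrm{tr}(\gamma^{\mu_1} \ldots \gamma^{\mu_{2k}} \gamma^\nu)\, e^{-\lambda x^2} dx_{\mu_1} \ldots dx_{\mu_{2k}}\, dx_\nu. \end{aligned}$$

With the same argument as in the even case, only the term 2k + 1 = n remains and

$$\begin{aligned} ch(\beta_n) &= -\sqrt{2i}\, \frac{\lambda^{n/2}}{k!}\, \mathrm{tr}(\gamma^{\mu_1} \ldots \gamma^{\mu_n})\, e^{-\lambda x^2} dx_{\mu_1} \ldots dx_{\mu_{n-1}}\, dx_{\mu_n} \\ &= -\sqrt{2i}\, \frac{\lambda^{n/2}}{k!}\, (2k)!\, \mathrm{tr}(\gamma^1 \ldots \gamma^n) \sum_{j=1}^{n} e^{-\lambda x^2} dx_{j+1} \ldots dx_{j-1}\, dx_j \\ &= -\frac{(2k)!}{k!} (2i\lambda)^{n/2} \sum_{j=1}^{n} e^{-\lambda x^2} dx_{j+1} \ldots dx_{j-1}\, dx_j. \end{aligned}$$

The proof is complete.

Corollary 8.8. Let $n \in \mathbb{N}$. Then the pairing between the Chern characters of the Bott element $\beta_n \in \Psi_{n+2\mathbb{Z}}(\mathbb{C}, \mathcal{S}(\mathbb{R}^n))$ and the Dirac spectral triple $(E_n, \rho, D_t) \in \Psi_{n+2\mathbb{Z}}(\mathcal{S}(\mathbb{R}^n), \tilde{\mathbb{C}})$ is normalized:

$$\langle ch(\beta_n), ch(E_n, \rho, D_t) \rangle = 1. \tag{120}$$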
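The normalization (120) can be checked numerically from the coefficients in (112), (113), (118) and (119) together with the Gaussian integral $\int_{\mathbb{R}^n} e^{-\lambda x^2} d^n x = (\pi/\lambda)^{n/2}$. The sketch below is only an illustration under stated assumptions: half-integer powers of complex numbers are taken on the principal branch, for odd n each of the n cyclically permuted volume forms in (119) is taken to integrate to the same Gaussian value (a cyclic permutation of an odd number of one-forms is even), and the function names are hypothetical.

```python
import math

def bott_coefficient(n, lam):
    """Coefficient in front of the Gaussian form(s) in (118) (n even) or (119) (n odd)."""
    k = n // 2
    base = (2j * lam) ** (n / 2)          # principal branch for odd n (assumption)
    if n % 2 == 0:
        return math.factorial(n) / math.factorial(k) * base
    return -math.factorial(2 * k) / math.factorial(k) * base

def fundamental_coefficient(n):
    """Coefficient (-)^n [n/2]! / (n! (2 pi i)^{n/2}) of the cocycle in (112)/(113)."""
    k = n // 2
    return (-1) ** n * math.factorial(k) / (math.factorial(n) * (2j * math.pi) ** (n / 2))

def gaussian_integral(n, lam):
    """Integral of exp(-lam * |x|^2) over R^n."""
    return (math.pi / lam) ** (n / 2)

def pairing(n, lam):
    """Evaluate the fundamental cocycle on the Bott cycle for dimension n and lam > 0."""
    # Odd n: the sum over j in (119) contributes n equal Gaussian integrals.
    terms = 1 if n % 2 == 0 else n
    return (fundamental_coefficient(n) * bott_coefficient(n, lam)
            * terms * gaussian_integral(n, lam))
```

Under these assumptions `pairing(n, lam)` should return a complex number equal to 1 up to rounding, independently of λ, which reflects the fact that the pairing does not depend on the regulator.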
Proof. It is a consequence of Corollary 8.6 and Proposition 8.7.

This explains the normalization factor $\sqrt{2i}$ appearing in the definition (75) of the canonical trace $\tau$ for the Chern character on $\Psi_1$. It is interesting also to note that this factor is the only one compatible with the external product on K-homology $KK(A, \mathbb{C}) \times KK(B, \mathbb{C}) \to KK(A \hat\otimes B, \mathbb{C})$, see [6], p. 295.

A. Appendix

In this appendix we adapt Quillen's formalism of algebra cochains [33] to the bornological framework. All the results presented here are straightforwardly obtained from Quillen's paper by replacing arbitrary algebras by complete bornological algebras, tensor products by completed tensor products and linear maps by bounded linear maps.

A.1. Bar construction. Let $A$ be an associative complete bornological algebra. The bar construction of $A$ is the graded space
$$B(A) = \bigoplus_{n \ge 0} B_n(A), \qquad (A.121)$$
where $B_n(A) = A^{\hat\otimes n}$ is localized in degree $n$. $B(A)$ endowed with the direct sum bornology is a complete bornological space. The decomposable element $a_1 \otimes \cdots \otimes a_n$ of $A^{\hat\otimes n}$ will be written $(a_1, \dots, a_n)$. $B(A)$ is naturally a bornological coassociative coalgebra, with bounded coproduct $\Delta : B(A) \to B(A) \hat\otimes B(A)$ given by
$$\Delta(a_1, \dots, a_n) = \sum_{i=0}^{n} (a_1, \dots, a_i) \otimes (a_{i+1}, \dots, a_n) \qquad (A.122)$$
and counit $\eta : B(A) \to \mathbb{C}$ corresponding to the projection onto $A^{\hat\otimes 0} = \mathbb{C}$. On $B(A)$ is defined a bounded differential $b'$ of degree $-1$:
$$b'(a_1, \dots, a_n) = \sum_{i=1}^{n-1} (-)^{i-1} (a_1, \dots, a_i a_{i+1}, \dots, a_n), \qquad (A.123)$$
with $b' = 0$ for $n = 0, 1$. One readily verifies that $b'^2 = 0$ and that the coproduct and counit are morphisms of (graded) complexes, i.e. $\Delta b' = (b' \otimes 1 + 1 \otimes b')\Delta$ and $\eta b' = b' \eta = 0$, taking care of the signs occurring when graded symbols are permuted, for instance
$$(1 \otimes b')\big((a_1, \dots, a_i) \otimes (a_{i+1}, \dots, a_n)\big) = (-)^i (a_1, \dots, a_i) \otimes b'(a_{i+1}, \dots, a_n)$$
according to the respective degrees of $(a_1, \dots, a_i)$ and $b'$. This turns $B(A)$ into a differential graded (DG) complete bornological coalgebra. Next we consider the free bicomodule over $B(A) = B$,
$$\Omega_1 B = B \hat\otimes A \hat\otimes B. \qquad (A.124)$$
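The identity $b'^2 = 0$ for the differential (A.123) can be checked mechanically on small chains. The following is only a minimal sketch under illustrative assumptions: the algebra is replaced by formal words with concatenation as an associative product, and the helper name `b_prime` is hypothetical, not taken from the paper.

```python
from collections import defaultdict


def b_prime(chain):
    """Apply b' of (A.123) to a linear combination {tuple_of_words: coefficient}.

    Words are Python strings and the product a_i a_{i+1} is string
    concatenation, an illustrative stand-in for an associative algebra.
    """
    out = defaultdict(int)
    for words, coeff in chain.items():
        n = len(words)
        # sum over i = 1..n-1 of (-1)^(i-1) (a_1, ..., a_i a_{i+1}, ..., a_n);
        # the range is empty for n = 0, 1, so b' vanishes there
        for i in range(1, n):
            merged = words[:i - 1] + (words[i - 1] + words[i],) + words[i + 1:]
            out[merged] += (-1) ** (i - 1) * coeff
    # drop terms whose coefficients cancelled
    return {w: c for w, c in out.items() if c != 0}


chain = {("a", "b", "c", "d"): 1}
once = b_prime(chain)    # three terms with alternating signs
twice = b_prime(once)    # all terms cancel pairwise
```

The cancellation in `twice` relies only on associativity of the product and on the alternating signs, which is the same mechanism as in the general statement.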
The generic element $(a_1, \dots, a_{i-1}) \otimes a_i \otimes (a_{i+1}, \dots, a_n)$ of $\Omega_1 B$ will be written $(a_1, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n)$. The left and right comodule maps $\Delta_l : \Omega_1 B \to B \hat\otimes \Omega_1 B$ and $\Delta_r : \Omega_1 B \to \Omega_1 B \hat\otimes B$ are bounded and given by
$$\Delta_l (a_1, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n) = \sum_{j=0}^{i-1} (a_1, \dots, a_j) \otimes (a_{j+1}, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n), \qquad (A.125)$$
and similarly for $\Delta_r$. $\Omega_1 B$ also has a grading over the integers,
$$(\Omega_1 B)_n = \bigoplus_{i=1}^{n} B_{i-1} \hat\otimes A \hat\otimes B_{n-i} \quad \text{for } n \ge 1, \qquad (\Omega_1 B)_0 = 0, \qquad (A.126)$$
just counting the number of arguments in $A$. There is a bounded differential $b''$ of degree $-1$
$$\begin{aligned}
b''(a_1, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n) ={}& (b'(a_1, \dots, a_{i-1}) | a_i | a_{i+1}, \dots, a_n) \\
&+ (-)^i (a_1, \dots, a_{i-2} | a_{i-1} a_i | a_{i+1}, \dots, a_n) \\
&+ (-)^{i+1} (a_1, \dots, a_{i-1} | a_i a_{i+1} | a_{i+2}, \dots, a_n) \\
&+ (-)^i (a_1, \dots, a_{i-1} | a_i | b'(a_{i+1}, \dots, a_n)).
\end{aligned} \qquad (A.127)$$
One has $b''^2 = 0$, and $\Delta_{l,r}$ are morphisms of (graded) complexes, i.e. $\Delta_l b'' = (b' \otimes 1 + 1 \otimes b'')\Delta_l$ and similarly for $\Delta_r$. The last operator we will consider is the obvious bounded map $\partial : \Omega_1 B \to B$ induced by
$$\partial(a_1, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n) = (a_1, \dots, a_n). \qquad (A.128)$$
It is a coderivation: $\Delta \partial = (1 \otimes \partial)\Delta_l + (\partial \otimes 1)\Delta_r$, and a morphism of complexes: $\partial b'' = b' \partial$, of degree zero with respect to gradings.

A.2. Algebra cochains. Let $L$ be a complete bornological $\mathbb{Z}_2$-graded algebra with unit 1 and differential $d$. The space of bounded linear maps
$$R = \mathrm{Hom}(B(A), L) \qquad (A.129)$$
endowed with the bornology of equibounded maps is complete, and splits into the even/odd subspaces coming from the $\mathbb{Z}_2$-gradings of $B$ and $L$. We denote by $|f|$ the degree of a homogeneous element $f \in R$. Since $B(A)$ is a coalgebra (coproduct $\Delta$), and $L$ is an algebra (product $m : L \hat\otimes L \to L$), $R$ is naturally endowed with a complete bornological algebra structure given by the convolution product $fg = m(f \otimes g)\Delta$, $\forall f, g \in R$. Explicitly on an $n$-chain one has
$$(fg)(a_1, \dots, a_n) = \sum_{i=0}^{n} (-)^{|g| i} f(a_1, \dots, a_i)\, g(a_{i+1}, \dots, a_n). \qquad (A.130)$$
Note the sign $(-)^{|g| i}$ occurring when the chain $(a_1, \dots, a_i)$ crosses $g$. The differentials $b'$ and $d$ induce two bounded differentials of odd degree on $R$:
$$df = d \circ f, \qquad \delta f = -(-)^{|f|} f \circ b', \qquad d\delta + \delta d = \delta^2 = d^2 = 0. \qquad (A.131)$$
$d$ and $\delta$ are derivations with respect to the convolution product. Thus $R$ is a bidifferential $\mathbb{Z}_2$-graded (complete bornological) algebra, with unit $1\eta : B(A) \to L$. To the bicomodule $\Omega_1 B(A)$ there corresponds by duality a graded $R$-bimodule
$$M = \mathrm{Hom}(\Omega_1 B(A), L) \qquad (A.132)$$
with the bounded left multiplication $R \hat\otimes M \to M$ given by $f\gamma = m(f \otimes \gamma)\Delta_l$, $\forall f \in R$, $\gamma \in M$, and similarly the bounded right multiplication reads $\gamma f = m(\gamma \otimes f)\Delta_r$. Explicitly, the product evaluated on an element of $\Omega_1 B(A)$ is
$$(f\gamma)(a_1, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n) = \sum_{j=0}^{i-1} (-)^{|\gamma| j} f(a_1, \dots, a_j)\, \gamma(a_{j+1}, \dots, a_{i-1} | a_i | a_{i+1}, \dots, a_n). \qquad (A.133)$$
As before $b''$ induces by duality a bounded differential of odd degree on $M$,
$$\delta\gamma = -(-)^{|\gamma|} \gamma \circ b'', \qquad (A.134)$$
compatible with the $R$-bimodule structure:
$$\delta(f\gamma) = \delta f\, \gamma + (-)^{|f|} f\, \delta\gamma, \qquad \delta(\gamma f) = \delta\gamma\, f + (-)^{|\gamma|} \gamma\, \delta f. \qquad (A.135)$$
This differential together with $d$ implies that $M$ is a bidifferential graded (complete bornological) bimodule. Last but not least, transposing the operator (A.128) yields a bounded derivation $\partial : R \to M$ commuting with $\delta$ and $d$.
= A ⊕ C be the complete bornological A.3. Noncommutative differential forms. Let A algebra obtained from A by adjoining a unit 1 (even if A is already unital). The space of noncommutative forms is the complete bornological space A = n≥0 n A with ˆ ⊗A ˆ ⊗n for n ≥ 1 and 0 A = A. The element a0 ⊗ · · · ⊗ an ∈ n A (resp. n A = A 1⊗a1 . . .⊗an ) is denoted by a0 da1 . . . dan (resp. da1 . . . dan ). Then A is a (non-unital) complete bornological DG algebra when specifying the differential d(a0 da1 . . . dan ) = da0 da1 . . . dan ,
d(da1 . . . dan ) = 0,
d 2 = 0,
(A.136)
verifying the Leibniz rule with respect to the ordinary product on differential forms. The Hochschild operator b : n A → n−1 A is the bounded map defined by b(ωda) = (−)|ω| [ω, a] for any ω ∈ A and a ∈ A, and b(a) = 0. From this one gets the Karoubi operator κ = 1 − (bd + db) and Connes’ boundary B = (1 + κ + · · · + κ n )d on n A, both bounded, verifying B 2 = b2 = bB + Bb = 0 and Bκ = κB = B. Thus (A, b, B) becomes a complete bornological bicomplex. in order to get cochains on the bicomplex We now can use the bar construction for A n+1 , A. First consider the bounded injection : n A → (1 B(A)) ( a0 da1 . . . dan ) =
n
(−)n(i+1) (ai+1 , . . . , an | a0 |a1 , . . . , ai ).
(A.137)
i=0
Then by direct computation one checks that b = b . Let L be a unital complete bornological Z2 -graded algebra, and consider the associated algebra and bimodule R = L) and M = Hom(1 B(A), L). Then composing with an element γ Hom(B(A), of M we get a bounded cochain γ ∈ Hom(A, L). The following lemma relates the Hochschild operator b on A with the differential δ on M. Lemma A.1. For any γ ∈ M one has δγ = −(−)|γ | γ b in Hom(A, L). Proof. By direct computation one checks that b = b , and then immediately δγ = −(−)|γ | γ b = −(−)|γ | γ b. It remains to relate the operator B to the derivation ∂ : R → M. For this we have → L0 with values in the even part of L, and to consider a bounded linear map ρ : A L) ⊂ R of preserving the unit: ρ(1) = 1. We view it as an element of Hom(B 1 (A), degree |ρ| = 1. Then the following lemma holds: → L0 be a bounded unital linear map (not necessarily an Lemma A.2. Let ρ : A homomorphism), and let f, g be two elements of R vanishing if one of their arguments Then one has is equal to 1 ∈ A. ∂(f g) = (−)|g| f ∂ρ gB
(A.138)
in Hom(A, L). L) and g ∈ Hom(B q (A), L) with p + q = Proof. We may suppose f ∈ Hom(B p (A), n + 1. One has f ∂ρgB( a0 da1 . . . dan ) =
n i=0
(−)n(i+1) f ∂ρg(dai+1 . . . dan da0 . . . dai ),
and a0 its projection on A. We compute with a0 ∈ A f ∂ρg(da0 . . . dan ) =
n+1
(−)(n+1)(i+1) (f ∂ρg)(ai , . . . , an |1|a0 , . . . , ai−1 )
i=0
= (−)(n+1)(q+1) (f ∂ρg)(aq , . . . , an |1|a0 , . . . , aq−1 ) = (−)(n+1)(q+1) (−)|g|+p (f g)(aq , . . . , an , a0 , . . . , aq−1 ) = (−)|g|+nq (f g)(aq , . . . , an , a0 , . . . , aq−1 ), where we retained only the term corresponding to i = q and used the fact that ρ(1) = 1. Similarly for any 0 ≤ i ≤ n one has f ∂ρg(dai+1 . . . dan da0 . . . dai ) = (−)|g|+nq (f g)(ai+q+1 , . . . , ai , ai+1 , . . . , ai+q ), where the indices of the a’s are defined modulo n + 1. Thus f ∂ρgB( a0 da1 . . . dan ) =
n
(−)n(i+1+q)+|g| (f g)(ai+q+1 , . . . , ai+q )
i=0
= (−)|g|
n
(−)n(i+1) (f g)(ai+1 , . . . , an , a0 , . . . , ai )
i=0
by reindexing i + q → i. On the other hand ∂(f g)( a0 da1 . . . dan ) =
n
(−)n(i+1) ∂(f g)(ai+1 , . . . , an | a0 |a1 , . . . , ai )
i=0
=
n
(−)n(i+1) (f g)(ai+1 , . . . , an , a0 , a1 , . . . , ai )
i=0
=
n
(−)n(i+1) (f g)(ai+1 , . . . , an , a0 , a1 , . . . , ai ),
i=0
since f, g are supposed to vanish on 1, and the conclusion follows.
A.4. Traces. Let L, A be as above. Let V be a complete bornological vector space and τ : L → V a bounded trace, i.e. a bounded linear map vanishing on the graded commutators [L, L]. Another way to specify this is to consider the permutation map ˆ → L⊗L ˆ which flips the two factors: σ : L⊗L σ (x ⊗ y) = (−)|x||y| y ⊗ x
(A.139)
according to their respective degrees. Then τ is a trace if and only if τ mσ = τ m, where ˆ → L is the multiplication map. An essential example is the universal trace m : L⊗L : L → L = (L/[L, L])completed . Its universal property stems from the fact that any trace τ factors through .
(A.140)
is a bounded cotrace. Indeed if At the dual level, the injection : A → 1 B(A) ⊗B( B(A) ⊗ which permutes the two ˆ A) ˆ 1 B(A) we introduce the map σ : 1 B(A) factors (with signs), then one has l = σ r and σ l = r . We now put traces and cotraces together. For any bounded trace τ on L, the map L) to Hom(A, V) which sends γ to τ γ is a trace on the from M = Hom(1 B(A), R-bimodule M, that is, it vanishes on the graded commutators [R, M]. Indeed for any f ∈ R and γ ∈ M, one has τ (γf ) = τ m(γ ⊗f )r = (−)|γ ||f | τ mσ (f ⊗γ )σ r = (−)|γ ||f | τ m(f ⊗ γ )l = (−)|γ ||f | τ (f γ ). References 1. Alvarez-Gaumé, L., Witten, E.: Gravitational anomalies. Nucl. Phys. B 234, 269–330 (1984) 2. Berline, N., Getzler, E., Vergne, M.: Heat kernels and Dirac operators. Grundlehren der Mathematischen Wissenschaft 298, Berlin–Heidelberg–New York: Springer-Verlag, 1992 3. Bismut, J. M.: The Atiyah-Singer index theorem for families of Dirac operators: Two heat equation proofs. Invent. Math. 83, 91–151 (1986) 4. Blackadar, B.: K-theory for operator algebras. New York: Springer-Verlag, 1986 5. Connes, A.: Non-commutative differential geometry. Publ. Math. IHES no 62, 41–144 (1986) 6. Connes, A.: Non-commutative geometry. New York: Academic Press, 1994 7. Connes, A.: Entire cyclic cohomology of Banach algebras and characters of θ -summable Fredholm modules. K-theory 1, 519–548 (1988) 8. Connes, A., Moscovici, H.: Transgression and the Chern character of finite-dimensional K-cycles. Commun. Math. Phys. 155, 103–122 (1993) 9. Connes, A., Moscovici, H.: The local index formula in non-commutative geometry. GAFA 5, 174–243 (1995) 10. Connes,A., Moscovici, H.: Hopf algebras, cyclic cohomology and the transverse index theorem. Commun. Math. Phys. 198, 199–246 (1998) 11. Connes, A., Moscovici, H.: Cyclic cohomology and Hopf algebras. Lett. Math. Phys. 48, 85 (1999) 12. Cuntz, J.: A new look at KK-theory. K-Theory 1, 31–51 (1987) 13. Cuntz, J.: Cyclic theory and the bivariant Chern-Connes character. 
Preprint SFB 478 (2000). 14. Cuntz, J., Quillen, D.: Algebra extensions and nonsingularity. JAMS 8, 251–289 (1995) 15. Cuntz, J., Quillen, D.: Cyclic homology and nonsingularity. JAMS 8, 373–442 (1995) 16. Cuntz, J., Quillen, D.: Excision in bivariant periodic cyclic cohomology. Invent. Math. 127, 67–98 (1997) 17. Gilkey, P.: Invariance theory, the heat equation, and the Atiyah-Singer index theorem. 2nd ed., Studies in Advanced Mathematics. CRC Press, 1995 18. Goodwillie, T. G.: Cyclic homology, derivations, and the free loopspace. Topology 24 (2), 187–215 (1985) 19. Gorokhovsky, A.: Characters of cycles, equivariant characteristic classes and Fredholm modules. Commun. Math. Phys. 208, 1–23 (1999) 20. Hogbe-Nlend, H.: Théorie des bornologies et applications. Lecture Notes in Mathematics, Vol. 213. Berlin–Heidelberg–New York: Springer-Verlag, 1971 21. Jaffe, A., Lesniewski, A., Osterwalder, K.: Quantum K-theory, I. The Chern character. Commun. Math. Phys. 118, 1–14 (1988) 22. Karoubi, M.: Homologie cyclique et K-théorie. Astérisque 149 (1987) 23. Mathai, V., Singer, I. M.: Twisted K-homology theory, twisted Ext-theory. Preprint hep-th/0012046 24. Meyer, R.: Analytic cyclic cohomology. Thesis, Münster, 1999: math.KT/9906205 25. Nistor, V.: A bivariant Chern character for p-summable quasihomomorphisms. K-Theory 5, 193–211 (1991) 26. Nistor, V.: A bivariant Chern–Connes character. Ann. Math. 138, 555–590 (1993) 27. Perrot, D.: BRS cohomology and the Chern character in noncommutative geometry. Lett. Math. Phys. 50, 135–144 (1999) 28. Perrot, D.: On the topological interpretation of gravitational anomalies. J. Geom. Phys. 39, 81–95 (2001) 29. Perrot, D.: Retraction of the bivariant Chern character. Preprint math-ph/0202016 30. Puschnigg, M.: Cyclic homology theories for topological algebras. K-theory preprint archives 292 (1998) 31. Puschnigg, M.: Excision in cyclic homology theories. Invent. Math. 143, 249–323 (2001) 32. 
Quillen, D.: Superconnections and the Chern character. Topology 24, 89–95 (1985) 33. Quillen, D.: Algebra cochains and cyclic cohomology. Publ. Math. IHES 68, 139–174 (1989) 34. Quillen, D.: Chern-Simons forms and cyclic cohomology. In: The interface of mathematics and particle physics, Oxford: Oxford Univ. Press, 1988, pp. 177–134 35. Witten, E.: D-branes and K-theory. JHEP 12 (1998) Communicated by A. Connes
Commun. Math. Phys. 231, 97 – 134 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0671-x
On the Density-Density Critical Indices in Interacting Fermi Systems
G. Benfatto, V. Mastropietro
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy
Received: 12 November 2001 / Accepted: 25 February 2002
Published online: 2 October 2002 – © Springer-Verlag 2002
Abstract: The behaviour of correlation functions of d = 1 interacting fermionic systems is determined by a small number of critical indices. We prove that one of them is exactly zero. As a consequence, the behaviour of the Fourier transform of the density-density correlation at zero momentum is qualitatively unaffected by the interaction, contrary to what happens at $\pm 2\tilde p_F$, if $\tilde p_F$ is the Fermi momentum. The result is obtained by implementing Ward identities in a Renormalization Group approach.

1. Introduction and Main Results

1.1. Motivations and results. If $a_x^\pm$, $x = -[\frac{L-1}{2}], \dots, [\frac{L}{2}]$, is a set of fermionic creation and annihilation operators, we consider the Hamiltonian
$$H = \sum_{x=-[\frac{L-1}{2}]}^{[\frac{L}{2}]} \Big\{ \frac{1}{2} (a^+_{x+1} - a^+_x)(a^-_{x+1} - a^-_x) - \mu\, a^+_x a^-_x + \lambda \Big(a^+_x a^-_x - \frac{1}{2}\Big)\Big(a^+_{x+1} a^-_{x+1} - \frac{1}{2}\Big) \Big\}, \qquad (1)$$
describing a system of spinless fermions in d = 1 with chemical potential $\mu$, a nearest-neighbor interaction and periodic boundary conditions. The space-time density-density correlation function at temperature $\beta^{-1}$ is given by
$$\Omega_{L,\beta}(\mathbf{x}) = \langle a^+_{\mathbf{x}} a^-_{\mathbf{x}} a^+_{\mathbf{0}} a^-_{\mathbf{0}} \rangle_{L,\beta} - \langle a^+_{\mathbf{x}} a^-_{\mathbf{x}} \rangle_{L,\beta} \langle a^+_{\mathbf{0}} a^-_{\mathbf{0}} \rangle_{L,\beta}, \qquad (2)$$
where $\mathbf{x} = (x, x_0)$, $a^\pm_{\mathbf{x}} = e^{H x_0} a^\pm_x e^{-H x_0}$ and $\langle \cdot \rangle_{L,\beta} = \mathrm{Tr}[e^{-\beta H}\, \cdot\,]/\mathrm{Tr}[e^{-\beta H}]$ denotes the expectation in the grand canonical ensemble. We shall use also the notation $\Omega(\mathbf{x}) \equiv \lim_{L,\beta \to \infty} \Omega_{L,\beta}(\mathbf{x})$.
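If the fermions are non-interacting ($\lambda = 0$), one can easily check that, if $|\mathbf{x}| \ge 1$,
$$\begin{aligned}
&\cos p_F = 1 - \mu, \qquad v_0 = \sin p_F > 0, \\
&\Omega(\mathbf{x}) = \cos(2 p_F x)\, \Omega^a_0(\mathbf{x}) + \Omega^b_0(\mathbf{x}) + \Omega^c_0(\mathbf{x}), \\
&\Omega^a_0(\mathbf{x}) = \frac{1}{2\pi^2 [x^2 + (v_0 x_0)^2]}, \\
&\Omega^b_0(\mathbf{x}) = \frac{1}{2\pi^2 [x^2 + (v_0 x_0)^2]}\, \frac{x_0^2 - (x/v_0)^2}{x^2 + (v_0 x_0)^2}, \\
&|\Omega^c_0(\mathbf{x})| \le \frac{1}{1 + |\mathbf{x}|^{2+\vartheta}},
\end{aligned} \qquad (3)$$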
for some positive constant $\vartheta < 1$. The interaction has two main effects: the period of the oscillating term $\cos(2 p_F x)\, \Omega^a_0(\mathbf{x})$ changes and the large distance asymptotic decay is modified by critical indices. It was indeed proved in [BM] by a Renormalization Group analysis that, for $\lambda$ small enough and $|\mathbf{x}| \ge 1$,
$$\begin{aligned}
&\Omega(\mathbf{x}) = \cos(2 \tilde p_F x)\, \Omega^a(\mathbf{x}) + \Omega^b(\mathbf{x}) + \Omega^c(\mathbf{x}), \\
&\Omega^a(\mathbf{x}) = \frac{1 + \lambda B_1(\mathbf{x})}{2\pi^2 [x^2 + (v_0^* x_0)^2]^{1+\eta_a}}, \\
&\Omega^b(\mathbf{x}) = \frac{1}{2\pi^2 [x^2 + (v_0^* x_0)^2]^{1+\eta_b}} \Big\{ \frac{x_0^2 - (x/v_0^*)^2}{x^2 + (v_0^* x_0)^2} + \lambda B_2(\mathbf{x}) \Big\}, \\
&|B_i(\mathbf{x})| \le C, \qquad |\Omega^c(\mathbf{x})| \le \frac{1}{1 + |\mathbf{x}|^{2+\vartheta}},
\end{aligned} \qquad (4)$$
where $C$ is a positive constant, $\eta_a$, $\eta_b$ are critical indices expressed by convergent series in $\lambda$, $v_0^* = v_0 + \delta^*$ and $\tilde p_F(\lambda, p_F) = p_F + \lambda f(\lambda, p_F)$ with $\delta^*$, $f$ analytic in $\lambda$ and $|\delta^*| \le C|\lambda|$, $|f(\lambda, p_F)| \le C$; note that $f(\lambda, \frac{\pi}{2}) = 0$, by symmetry reasons. By an explicit computation of the lowest order of the convergent series for $\eta_a$ one obtains that $\eta_a = -a_1 \lambda + O(\lambda^2)$, where $a_1 > 0$ is a constant. The lowest order contributions to $\eta_b$ are instead vanishing, in agreement with the conjecture (see for instance [Sp]) that $\eta_b$ is exactly zero. The aim of this paper is to prove such a conjecture.

Theorem 1. There exists a positive constant $\lambda_0$ such that, if $|\lambda| \le \lambda_0$, the density-density correlation function (2) can be written as in (4) with the critical index $\eta_b$ identically vanishing.

The vanishing of the critical index $\eta_b$ has many interesting consequences. For instance, see [BM], if $\lambda = 0$ the Fourier transform $\hat\Omega(k)$ of $\Omega(x, 0)$ has three cusps, at $k = 0$ and $k = \pm 2 p_F$, i.e. $\partial_k \hat\Omega(k)$ has a first order discontinuity at $k = 0$ and $k = \pm 2 p_F$. The vanishing of $\eta_b$ implies that $\hat\Omega(k)$ still has a cusp at $k = 0$ even if $\lambda \ne 0$; in fact it was proved in [BM] that, if $\eta_b = 0$, the possible logarithmic singularity of $\partial_k \hat\Omega(k)$ at $k = 0$ is changed by a parity cancellation into a first order discontinuity with jump $1 + O(\lambda)$; this is remarkable because, generally, the qualitative behaviour close to critical points is deeply changed by the interaction; for instance $\partial_k \hat\Omega(k)$ at $k = \pm 2 \tilde p_F$ in the $\lambda \ne 0$ case is continuous for $\lambda < 0$, while it diverges as $|k - (\pm 2 \tilde p_F)|^{2\eta_a}$ for $\lambda > 0$. Note finally that the model (1) is equivalent to the XXZ spin-chain with magnetic field $h = \mu - 1$, as one can show by a Jordan–Wigner transformation [LSM], with
(2) representing the spin-spin correlation function along the third axis. Moreover, our proof that $\eta_b = 0$ could be easily extended to a large class of models; for instance one can replace the nearest neighbor interaction with a non nearest neighbor one, or the lattice with a continuum, or consider the anisotropic XYZ spin chain, see [BM]. We recall finally that there are remarkable relations, based on exact solutions, between properties of quantum spin chains and two-dimensional classical statistical mechanics models; for instance the spin-spin correlation function of the XYZ spin chain is believed to be equal to the correlation between two vertical arrows in the same row in the eight vertex model, see [B] and [JKM], if a suitable identification of the parameters is made. Hence our results could be relevant also for such problems. Another application is for models of vicinal surfaces, see [Sp].

1.2. Remarks. In [BM] we derived a convergent expansion for the critical index $\eta_b$; each order is obtained by summing up a certain number of terms, and $\eta_b = 0$ means that there is a cancellation at all orders between such terms. While one can easily check from such expansion that this is the case at the second order, proving directly that such cancellation occurs at all orders looks to us essentially impossible. We proceed instead in a different way and our proof is conceptually divided in two main steps.

For the first step we refer to [BM], where the proof that $\eta_b = 0$ is reduced to a special property (see (20) below) of the Schwinger functions of a model (which we will call the reference model), describing fermions with a linear “relativistic” dispersion relation and allowed momenta restricted by infrared and ultraviolet cut-offs. This result, which is summarized in Theorem 3 below, gives further ground to the remarkable observation of Tomonaga [T], according to which the model (1) is essentially equivalent, as far as the low energy behaviour is considered, to a system of interacting massless relativistic fermions.

In the second step we deduce such property of the reference model by using a suitable Ward identity, which is obtained through a local gauge transformation. Usually in relativistic quantum field theory Ward identities are relations between correlation functions; the Ward identity we find is instead a relation between correlation functions and some other extra terms, which we call “corrections” as they would be formally zero if the cut-offs were removed. The extra terms do not vanish when the infrared cut-off is removed. The property that we need is reduced to suitable bounds (see Theorems 2 and 4), proved by using convergent expansions for all terms appearing in the Ward identity.

We conclude the introduction with a technical note. With respect to previous applications of Wilsonian Renormalization Group to d = 1 interacting fermionic theories, like [BG] or [BGPS], we are able here to rigorously implement in this scheme the method of Ward identities (based on local gauge transformations) to produce non trivial results. In the physical literature there are many claims on the vanishing of $\eta_b$, see for instance [DL, ES], [DM], and our results convert such ideas into a rigorous proof. Finally, note that there are many examples of QFT models in which Ward identities are implemented in a mathematical way, perturbatively (see for instance [FHRW, KK]) or non perturbatively (see for instance [BFS] or [MSR]). However such works consider the application of Ward identities to relativistic QFT; hence corrections to formal exact Ward identities are possibly found as a consequence of the cut-offs imposed to regularize the theory, but they vanish when the cut-offs are removed. The main novelty of our paper is that we implement the method of Ward identities in the non-relativistic model (1), where there is no reason why a Ward identity involving only correlation functions should be valid. The corrections do not vanish and the technical problem is to get for such terms bounds good enough to prove that $\eta_b = 0$.
2. Ward Identities

2.1. The reference model. The reference model is not Hamiltonian and is defined in terms of Grassmann variables. Given the interval $[0, L]$, the inverse temperature $\beta$ and the (large) integer $N$, we introduce in $\Lambda = [0, L] \times [0, \beta]$ a lattice $\Lambda_N$, whose sites are given by the space-time points $\mathbf{x} = (x, x_0) = (na, n_0 a_0)$, $a = L/N$, $a_0 = \beta/N$, $n, n_0 = 0, 1, \dots, N-1$. We also consider the set $D$ of space-time momenta $\mathbf{k} = (k, k_0)$, with $k = \frac{2\pi}{L}(n + \frac{1}{2})$ and $k_0 = \frac{2\pi}{\beta}(n_0 + \frac{1}{2})$, $n, n_0 = 0, 1, \dots, N-1$. With each $\mathbf{k} \in D$ we associate four Grassmannian variables $\hat\psi^{[h,0]\sigma}_{\mathbf{k},\omega}$, $\sigma, \omega \in \{+, -\}$. The lattice $\Lambda_N$ is introduced only for technical reasons so that the number of Grassmann variables is finite, and eventually the limit $N \to \infty$ is taken (and it is trivial, see [BM], Sect. 2.1).

If $\gamma$ is a fixed number greater than 1 and $h$ is a negative integer, we define the function $[C_{h,0}]^{-1}(\mathbf{k})$ as a smooth function acting as a cut-off for momenta $|\mathbf{k}| \ge \gamma$ (ultraviolet region) and $|\mathbf{k}| \le \gamma^{h-1}$ (infrared region) and having value 1 in the intermediate region $\gamma^h \le |\mathbf{k}| \le 1$. The infrared cut-off $\gamma^h$ is not fixed, because we are interested in the dependence on $h$ of the reference model. The exact definition of $[C_{h,0}]^{-1}(\mathbf{k})$ is the following one. We introduce a positive function $\chi_0 \in C^\infty(\mathbb{R}_+)$ such that
$$\chi_0(t) = \begin{cases} 1 & \text{if } 0 \le t \le 1, \\ 0 & \text{if } t \ge \gamma_0, \end{cases} \qquad 1 < \gamma_0 \le \gamma, \qquad (5)$$
and we define, for any integer $j \le 0$,
$$f_j(\mathbf{k}) = \chi_0(\gamma^{-j} |\mathbf{k}|) - \chi_0(\gamma^{-j+1} |\mathbf{k}|). \qquad (6)$$
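As a numerical illustration of (5) and (6): summing $f_j$ over $h \le j \le 0$, which is the combination used for the cut-off in this section, telescopes to $\chi_0(|\mathbf{k}|) - \chi_0(\gamma^{-h+1}|\mathbf{k}|)$, so the sum equals 1 for $\gamma^h \le |\mathbf{k}| \le 1$ and vanishes for $|\mathbf{k}| \le \gamma^{h-1}$ and $|\mathbf{k}| \ge \gamma$. The sketch below is only an illustration under assumptions that are not in the paper: the particular smooth profile chosen for $\chi_0$, the values of `GAMMA` and `GAMMA0`, and all function names are hypothetical.

```python
import math

GAMMA = 2.0    # scaling parameter gamma > 1 (illustrative value)
GAMMA0 = 1.5   # 1 < gamma0 <= gamma (illustrative value)


def _g(s):
    # exp(-1/s) for s > 0, extended by 0: the standard smooth building block
    return math.exp(-1.0 / s) if s > 0 else 0.0


def smooth_step(s):
    """C-infinity step: equals 0 for s <= 0 and 1 for s >= 1."""
    return _g(s) / (_g(s) + _g(1.0 - s))


def chi0(t):
    """One admissible choice of chi_0 in (5): 1 on [0, 1], 0 on [gamma0, inf)."""
    return 1.0 - smooth_step((t - 1.0) / (GAMMA0 - 1.0))


def f(j, abs_k):
    """f_j of (6), written as a function of |k|."""
    return chi0(GAMMA ** (-j) * abs_k) - chi0(GAMMA ** (-j + 1) * abs_k)


def cutoff(abs_k, h):
    """Sum of f_j over h <= j <= 0."""
    return sum(f(j, abs_k) for j in range(h, 1))


def telescoped(abs_k, h):
    """Closed form of the same sum after telescoping."""
    return chi0(abs_k) - chi0(GAMMA ** (1 - h) * abs_k)
```

For instance, with `h = -5` the value `cutoff(0.1, -5)` lies on the plateau, while `cutoff(2.0 ** -6, -5)` and `cutoff(2.0, -5)` lie in the infrared and ultraviolet regions where the sum vanishes.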
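Then we define $[C_{h,0}(\mathbf{k})]^{-1} = \sum_{j=h}^{0} f_j(\mathbf{k})$. If $\tilde D = \{\mathbf{k} \in D : [C_{h,0}(\mathbf{k})]^{-1} \ne 0\}$, we define the functional integration $\int D\psi^{[h,0]}$ as the linear functional on the Grassmann algebra generated by the variables $\hat\psi^{[h,0]\sigma}_{\mathbf{k},\omega}$, such that, given a monomial $Q(\hat\psi)$ in the variables $\hat\psi^{[h,0]\sigma}_{\mathbf{k},\omega}$, its value is 0, except in the case $Q(\hat\psi) = \prod_{\mathbf{k} \in \tilde D, \omega = \pm} \hat\psi^{[h,0]-}_{\mathbf{k},\omega} \hat\psi^{[h,0]+}_{\mathbf{k},\omega}$, up to a permutation of the variables. In this case the value of the functional is determined, by using the anticommuting properties of the variables, by $\int D\psi^{[h,0]} Q(\hat\psi) = 1$. We also define the Grassmannian field on the lattice $\Lambda_N$ as
$$\psi^{[h,0]\sigma}_{\mathbf{x},\omega} = \frac{1}{L\beta} \sum_{\mathbf{k} \in D} e^{i\sigma \mathbf{k}\mathbf{x}}\, \hat\psi^{[h,0]\sigma}_{\mathbf{k},\omega}, \qquad \mathbf{x} \in \Lambda_N. \qquad (7)$$
Note that $\psi^{[h,0]\sigma}_{\mathbf{x},\omega}$ is antiperiodic both in time and space variables. The Schwinger functions of the reference model are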
S(x1 , σ1 , ω1 ; . . . ; xs , σs , ωs ) =
[h,0]
i P (dψ [h,0] )e−V (ψ ) si=1 ψx[h,0]σ i ,ωi , [h,0] P (dψ [h,0] )e−V (ψ )
(8)
where V (ψ [h,0] ) = λ
[h,0]+ [h,0]− [h,0]+ [h,0]− dx ψx,+ ψx,+ ψx,− ψx,− ,
(9)
dx is a shorthand for “a a0
x∈ N ”
and
(10) P (dψ [h,0] ) = N −1 Dψ [h,0] 1 [h,0]+ ˆ [h,0]− , Ch,0 (k)(−ik0 + ωk)ψˆ k,ω ψk,ω · exp − Lβ with N =
ω=±1 k∈D˜
k∈D˜ [(Lβ)
−2 (−k 2 0
− k 2 )Ch,0 (k)2 ].
We also define the connected Schwinger functions as the functional derivatives of the Generating functional [h,0]+ [h,0]− + ψ [h,0]− +ψ [h,0]+ φ − −V (ψ)+ ω dx Jx,ω ψx,ω ψx,ω +φx,ω x,ω x,ω x,ω W(φ, J ) = log P (dψ)e (11) σ and J with respect to the external field variables φx,ω x,ω , x ∈ N , ω = ±1. The variables [h,0]σ σ , while φx,ω are antiperiodic in x0 and x and anticommuting with themselves and ψx,ω the variables Jx,ω are periodic and commuting with themselves and all the other variables. We shall need in particular the following connected Schwinger functions:
G2,1 ω (x; y, z) = G2ω (y, z) =
∂2 ∂ + − W(φ, J )|φ=J =0 , ∂Jx,ω ∂φy,ω ∂φz,ω
(12)
∂2 + − W(φ, J )|φ=J =0 . ∂φy,ω ∂φz,ω
(13)
They will be pictorially represented as in Fig. 1.
Fig. 1. Graphical representation of the connected Schwinger functions $G^{2,1}_\omega$ and $G^2_\omega$
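We also need the Fourier transforms of $G^{2,1}_\omega$ and $G^2_\omega$, defined by
$$G^2_\omega(\mathbf{x}, \mathbf{y}) = \frac{1}{L\beta} \sum_{\mathbf{k}} e^{-i\mathbf{k}(\mathbf{x}-\mathbf{y})}\, \hat G^2_\omega(\mathbf{k}), \qquad (14)$$
$$G^{2,1}_\omega(\mathbf{x}; \mathbf{y}, \mathbf{z}) = \frac{1}{(L\beta)^2} \sum_{\mathbf{k},\mathbf{p}} e^{i\mathbf{p}\mathbf{x}} e^{-i\mathbf{k}\mathbf{y}} e^{i(\mathbf{k}-\mathbf{p})\mathbf{z}}\, \hat G^{2,1}_\omega(\mathbf{p}, \mathbf{k}). \qquad (15)$$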
In Sect. 3 we prove the following bounds for the reference model with cut-off $\gamma^h$, which has to be of course larger than $\min\{\pi/L, \pi/\beta\}$ (otherwise the set $\tilde D$ is empty).

Theorem 2. There exists a positive constant $\lambda_0$, independent of $h$, such that, if $|\lambda| \le \lambda_0$, there exist two positive functions of $\lambda$, $Z_h$ and $Z_h^{(2)}$, and a positive constant $C$, independent of $h$, so that, uniformly in $N, L, \beta$ large enough, if $\bar{\mathbf{k}} \in D$ is such that $\gamma^h \le |\bar{\mathbf{k}}| \le \gamma^{h+1}$,
$$\hat G^{2,1}_\omega(2\bar{\mathbf{k}}, -\bar{\mathbf{k}}) = -\frac{Z_h^{(2)}}{Z_h^2 D_\omega(\bar{\mathbf{k}})^2}\, [1 + O(\lambda^2)], \qquad (16)$$
$$\hat G^2_\omega(\bar{\mathbf{k}}) = \frac{1}{Z_h D_\omega(\bar{\mathbf{k}})}\, [1 + O(\lambda^2)], \qquad (17)$$
$$C\lambda^2 |h| \le \log Z_h^{(2)} \le 2C\lambda^2 |h|, \qquad C|h|\lambda^2 \le \log Z_h \le 2C\lambda^2 |h|, \qquad (18)$$
with $D_\omega(\mathbf{k}) = -ik_0 + \omega k$. Moreover,
$$\lim_{h \to -\infty} \log \frac{Z^{(2)}_{h-1}}{Z^{(2)}_h} = \eta_2(\lambda), \qquad \lim_{h \to -\infty} \log \frac{Z_{h-1}}{Z_h} = \eta(\lambda), \qquad (19)$$
with $\eta(\lambda) = a_2 \lambda^2 + O(\lambda^3)$, and $\eta_2(\lambda) = a_2 \lambda^2 + O(\lambda^3)$, where $a_2$ is a positive constant.

The connection between the model (1) and the reference model is given by the following theorem, which is proved in [BM], even if it is not explicitly formulated. To be more precise, in Sect. 5.5 of [BM] we show that the condition (20), equivalent to Eq. (5.35) of [BM], implies the bound (5.38) of [BM], which is equivalent to saying that $\eta_b = 0$.

Theorem 3. Under the same assumptions of Theorem 2, there exists a constant $C$ such that, if for all negative integers $h$ the functions $Z_h^{(2)}$, $Z_h$ in (16), (17) verify
$$C\lambda^2 \le \Big| \frac{Z_h^{(2)}}{Z_h} - 1 \Big| \le 2C\lambda^2, \qquad (20)$$
then in (4) $\eta_b(\lambda) = 0$.

Hence by Theorem 3 the proof of $\eta_b = 0$ is reduced to the verification of (20), to which the rest of this paper is devoted. Note that (20) is equivalent, by (18), to $\eta(\lambda) - \eta_2(\lambda) = 0$ (in Theorem 2 it is only claimed that $\eta(\lambda) - \eta_2(\lambda) = O(\lambda^3)$).

2.2. Ward identities for the reference model. We have so far reduced the proof that $\eta_b = 0$ in the model (1) to the verification of (20) in the reference model. This result will be achieved by using an identity relating $\hat G^2_\omega$ to $\hat G^{2,1}_\omega$, obtained by performing a local gauge transformation, together with Eqs. (16), (17).

In order to derive such identity, we find it convenient to introduce a cut-off function $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$, where $\varepsilon$ is a small positive parameter and $\lim_{\varepsilon \to 0^+} [C^\varepsilon_{h,0}(\mathbf{k})]^{-1} = [C_{h,0}(\mathbf{k})]^{-1}$. The functions $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$ and $[C_{h,0}(\mathbf{k})]^{-1}$ are equivalent as far as the scaling properties of the theory are concerned but the support of $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$ is the set $D$ instead of $\tilde D$. The definition (10) of the reference model is easily extended to the case in which the cut-off is $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$ instead of $[C_{h,0}(\mathbf{k})]^{-1}$, by substituting in the r.h.s. of (10), as well as in the definition of the integration $D\psi^{[h,0]}$, the set $\tilde D$ with $D$. A reason why we find this convenient is that a technically important role in the following is played by the gauge invariance of the integration $D\psi^{[h,0]}$, a property which is lost if the Grassmann algebra is restricted to the variables $\hat\psi_{\mathbf{k},\omega}$ with $\mathbf{k} \in \tilde D$.

The exact definition of $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$ is the following one. Given a positive $\varepsilon \ll 1$, we define
$$\chi^\varepsilon_{h,0}(\mathbf{k}) = [C^\varepsilon_{h,0}(\mathbf{k})]^{-1} = \sum_{j=h}^{0} f^\varepsilon_j(\mathbf{k}), \qquad (21)$$
where $f^\varepsilon_j(\mathbf{k}) = f_j(\mathbf{k})$, if $h + 1 \le j \le -1$, while $f^\varepsilon_0(\mathbf{k})$ and $f^\varepsilon_h(\mathbf{k})$ are obtained by slightly modifying $f_0(\mathbf{k})$ and $f_h(\mathbf{k})$ in the following way. $f^\varepsilon_0(\mathbf{k})$ is a $C^\infty$ function of $|\mathbf{k}|$, such that $\lim_{\varepsilon \to 0} f^\varepsilon_0 = f_0$, $f^\varepsilon_0(\mathbf{k}) = f_0(\mathbf{k})$ for $\gamma^{-1} \le |\mathbf{k}| \le 1$, $f^\varepsilon_0(\mathbf{k}) > 0$ for $|\mathbf{k}| \ge 1$ and, if $|\mathbf{k}| \ge \gamma$, $0 < f^\varepsilon_0(\mathbf{k}) \le \varepsilon e^{-|\mathbf{k}|}$. Analogously, $f^\varepsilon_h(\mathbf{k})$ is a $C^\infty$ function of $|\mathbf{k}|$, such that $\lim_{\varepsilon \to 0} f^\varepsilon_h = f_h$, $f^\varepsilon_h(\mathbf{k}) = f_h(\mathbf{k})$ for $\gamma^h \le |\mathbf{k}| \le \gamma^{h+1}$, $f^\varepsilon_h(\mathbf{k}) > 0$, if $0 < |\mathbf{k}| \le \gamma^h$, and if $0 < |\mathbf{k}| \le \gamma^{h-1}$, $0 < f^\varepsilon_h(\mathbf{k}) \le \varepsilon \exp(-|\mathbf{k}|^{-1})$.
Fig. 2. The cutoff functions $[C^\varepsilon_{h,0}(\mathbf{k})]^{-1}$ (dashed line) and $[C_{h,0}(\mathbf{k})]^{-1}$ (solid line)
Hence, we first study the case ε > 0, for which a Ward identity can be easily obtained, relating the Schwinger functions of interest for us, for which the limit ε → 0 is trivial, and a “correction term”, which is apparently singular as ε → 0. However we prove that this term can be written as a suitable expansion, whose contributions admit “good” bounds uniformly in ε, as well in N, L, β, and have a well defined limit as ε → 0. σ in place of ψ [h,0]σ for simplicity, we can write By writing ψx,ω x,ω + − P (dψ) = N −1 Dψ exp − dx ψx,ω , (22) Dω[h,0] ψx,ω where σ Dω[h,0] ψx,ω =
1 iσ kx ε σ e Ch,0 (k)(iσ k0 − ωσ k)ψˆ k,ω . Lβ
(23)
k
By performing the gauge transformation σ iσ αx,ω¯ σ ψx, ψx,ω¯ , ω¯ → e
σ σ ψx,− ω¯ → ψx,−ω¯
(24)
and by using the invariance of Dψ after such transformation, we can rewrite the r.h.s. of (11) as [h,0] + − iαx,ω¯ [h,0] −iαx,ω¯ e ψ D e − D W(φ, J ) = log P (dψ) exp − dx ψx, ω¯ x,ω¯ ω¯ ω¯ + − · exp − V (ψ) + dx Jx,ω ψx,ω ψx,ω (25) ω + − − + − + − iαx,ω¯ + + e−iαx,ω¯ φx, ψx,ω¯ φx, ω¯ ψx,ω¯ + e ω¯ + φx,−ω¯ ψx,−ω¯ + ψx,−ω¯ φx,−ω¯
,
[h,0] [h,0] + + − − Since x∈ N ψx, αx,ω¯ ψx, ψx,ω¯ ]αx,ω¯ ψx, x∈ N [Dω¯ ω¯ [Dω¯ ω¯ ] = − ω¯ and W(φ, J ) is independent of αx,ω¯ , differentiating both sides of (25) with respect to αx,ω¯ and by putting αx,ω¯ = 0, we get 1 + − + − + − 0= P (dψ) Dω¯ (ψx, ω¯ ψx,ω¯ ) + δTx,ω¯ − φx,ω¯ ψx,ω¯ + ψx,ω¯ φx,ω¯ Z(φ, J ) + − + − + − · exp −V (ψ) + dx Jx,ω ψx,ω ψx,ω + φx,ω ψx,ω + ψx,ω φx,ω , (26) ω
where Z(φ, J ) = exp{W(φ, J )}, Dω is defined as Dω[h,0] , see (23), with 1 in place of ε (k), so that, if D (p) = −ip + ωp, Ch,0 ω 0 + − Dω (ψx,ω ψx,ω )=
1 + ˆ− Dω (p)e−ipx ψˆ k,ω ψk−p,ω , (Lβ)2
(27)
p,k
where p = (p, p0 ) is summed over momenta of the form (2π n/L, 2π m/β), with n, m integers. Moreover, δTx,ω =
1 + − ei(k −k )x C ε (k+ , k− )ψˆ k++ ,ω ψˆ k−− ,ω , 2 (Lβ) + − k =k −
ε ε C ε (k+ , k− ) = [Ch,0 (k ) − 1]Dω (k− ) − [Ch,0 (k+ ) − 1]Dω (k+ ),
(28) (29)
+ − By differentiating the r.h.s. of (26) with respect to φy, ω¯ and φz,ω¯ and then setting the external fields equal to 0, we obtain, in terms of the Fourier transform 2 2 2,1 −Dω G2,1 ω (x; y, z) = δ(x − y)Gω (x, z) − δ(x − z)Gω (y, x) + ω (x; y, z),
(30)
where − + T 2,1 ω (x; y, z) = ψy,ω ; ψz,ω ; δTx,ω .
If A1 , . . . , An are functions of the field, we are using the symbol n ∂n log P (dψ)e−V (ψ)+ i=1 λi Ai . A1 ; . . . ; An T = λ=0 ∂λ1 ...∂λn
(31)
(32)
Fig. 3. Graphical representation of the identity (33)
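It is convenient to express the Ward identity (30) in terms of the Fourier transforms of the connected Schwinger functions; $\hat\Delta^{2,1}_\omega(\mathbf{p}, \mathbf{k})$ is defined in a similar way to $\hat G^{2,1}_\omega(\mathbf{p}, \mathbf{k})$. In terms of the Fourier transform (30) can be written (see Fig. 3) as
$$D_\omega(\mathbf{p})\, \hat G^{2,1}_\omega(\mathbf{p}, \mathbf{k}) = \hat G^2_\omega(\mathbf{k} - \mathbf{p}) - \hat G^2_\omega(\mathbf{k}) + \hat\Delta^{2,1}_\omega(\mathbf{p}, \mathbf{k}). \qquad (33)$$
If $\mathbf{p} \ne 0$, (33) can be written in the form
$$\hat G^{2,1}_\omega(\mathbf{p}, \mathbf{k}) = \frac{\hat G^2_\omega(\mathbf{k} - \mathbf{p}) - \hat G^2_\omega(\mathbf{k})}{D_\omega(\mathbf{p})} + \hat H^{2,1}_\omega(\mathbf{k}, \mathbf{p}), \qquad (34)$$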
where Hˆ ω2,1 (k, p) is the Fourier transform, defined in agreement with (15), of Hω2,1 (x; y, z) =
∂2 ∂ + − W (φ, J )|φ=J =0 , ∂Jx,ω ∂φy,ω ∂φz,ω
(35)
with W (φ, J ) = log Tx,ω =
P (dψ)e−V (ψ)+
ω
+ ψ − +ψ + φ − ] dx[Jx,ω Tx,ω +φx,ω x,ω x,ω x,ω
ε + − 1 i(k+ −k− )x C (k , k ) ˆ + e ψ + ψˆ −− . (Lβ)2 + − Dω (k+ − k− ) k ,ω k ,ω
,
(36) (37)
k =k
Equation (34) is our Ward identity; it involves not only correlation functions but also the term $\hat H^{2,1}_\omega(k,p)$, which we can call the correction term, as it would be formally zero in the absence of cut-offs. Note that the definition (35) of the correction term $H^{2,1}_\omega$ is similar to the definition (12) of $G^{2,1}_\omega$, but the two quantities have very different properties. In fact $H^{2,1}_\omega$ can be obtained by substituting $\psi^+_{x,\omega}\psi^-_{x,\omega}$ in (11) with $T_{x,\omega}$, given by (37), which looks like a very singular term as $\varepsilon\to0$. We are nevertheless able to express also $\hat H^{2,1}_\omega(k,p)$ by a convergent expansion, and we prove in Sect. 4 the following bound.
Theorem 4. There exists a positive constant $\lambda_0$, independent of h, such that, if $|\lambda|\le\lambda_0$, then, uniformly in ε small enough and N, L, β large enough,
$$C\gamma^{-2h}\lambda^2\frac{Z^{(2)}_h}{(Z_h)^2}\le|\hat H^{2,1}_\omega(2\bar k,-\bar k)|\le2C\gamma^{-2h}\lambda^2\frac{Z^{(2)}_h}{(Z_h)^2}. \tag{38}$$
Moreover, $\lim_{\varepsilon\to0}\hat H^{2,1}_\omega$ does exist.

The above result (it was already claimed in [BM], referring for the proof to the present paper) says that $\hat H^{2,1}_\omega$ behaves, as $h\to-\infty$, exactly as $\hat G^{2,1}_\omega(2\bar k,-\bar k)$, but its bound has an extra $\lambda^2$ factor. This is just what we need; if we insert (16), (17) and (38) in (33), we obtain (20) and hence, by Theorems 2 and 3, $\eta_b=0$.

2.3. Remarks. In the physical literature Ward identities for interacting d = 1 fermions with cut-offs are usually derived by various formal arguments, see for example [DL, ES, MD, S]. All arguments are essentially equivalent to expanding $\hat G^2_\omega$ and $\hat G^{2,1}_\omega$ in Feynman graphs and then “forgetting” the cut-off function. In fact, if we neglect the cut-offs, the propagator is simply $D_\omega(k)^{-1}$ and the “identity” $G^{2,1}_\omega(p,k)=[G^2_\omega(k-p)-G^2_\omega(k)]/D_\omega(p)$, from which $\eta_b=0$ follows, is derived from the following obvious identity:
$$D_\omega(p)^{-1}[D_\omega(k)^{-1}-D_\omega(k+p)^{-1}]=D_\omega(k+p)^{-1}D_\omega(k)^{-1}. \tag{39}$$
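The identity (39), and the cut-off version (40) stated right after it, can be checked numerically. The sketch below is only an illustration: the function `C` is a hypothetical smooth stand-in for $C^\varepsilon_{h,0}(k)$ (any function with values at least 1 would do), the propagator is taken as $\hat g_\omega(k)=[C(k)D_\omega(k)]^{-1}$, and all function names are ours, not the paper's.

```python
import math

def D(omega, k):
    """D_omega(k) = -i*k0 + omega*k for k = (k, k0); it is linear in k."""
    return complex(omega * k[0], -k[1])

def C(k, eps=0.1):
    """Hypothetical smooth stand-in for C^eps_{h,0}(k) >= 1 (assumption)."""
    return 1.0 + eps * math.exp(k[0] ** 2 + k[1] ** 2)

def g(omega, k):
    """Propagator with cut-off, taken here as 1 / (C(k) * D_omega(k))."""
    return 1.0 / (C(k) * D(omega, k))

def C_eps(omega, k_plus, k_minus):
    """C^eps(k+, k-) as defined in (29)."""
    return ((C(k_minus) - 1.0) * D(omega, k_minus)
            - (C(k_plus) - 1.0) * D(omega, k_plus))

def residual_39(omega, k, p):
    """|l.h.s. - r.h.s.| of the cut-off free identity (39)."""
    q = (k[0] + p[0], k[1] + p[1])
    lhs = (1.0 / D(omega, k) - 1.0 / D(omega, q)) / D(omega, p)
    rhs = 1.0 / (D(omega, q) * D(omega, k))
    return abs(lhs - rhs)

def residual_40(omega, k, p):
    """|l.h.s. - r.h.s.| of the identity (40) with cut-off."""
    q = (k[0] + p[0], k[1] + p[1])
    lhs = (g(omega, k) - g(omega, q)) / D(omega, p)
    gg = g(omega, k) * g(omega, q)
    rhs = gg + gg * C_eps(omega, k, q) / D(omega, p)
    return abs(lhs - rhs)
```

Both residuals vanish up to rounding because $D_\omega$ is linear, so that $D_\omega(k+p)-D_\omega(k)=D_\omega(p)$; the extra term in (40) is exactly the part of $C(k+p)D_\omega(k+p)-C(k)D_\omega(k)$ that is not $D_\omega(p)$.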
By taking consistently into account the cut-off function one gets, instead of (39), the identity
$$\frac{\hat g_\omega(k)-\hat g_\omega(k+p)}{D_\omega(p)}=\hat g_\omega(k)\hat g_\omega(k+p)+\hat g_\omega(k)\hat g_\omega(k+p)\frac{C^\varepsilon(k,k+p)}{D_\omega(p)}, \tag{40}$$
which allows in principle to check directly Eq. (34) at any order (very easily at order 0, where it coincides with (40)). Our analysis shows then that one can still derive from the Ward identities the vanishing of $\eta_b$ in a rigorous way, by taking into account the presence of cut-offs. This however seems not to be true for other consequences of Ward identities for the model (1) claimed in the literature, see [BM1].

Note also that, as $\varepsilon\to0$, $[C^\varepsilon_{h,0}(k)]^{-1}$ becomes a compact support function, so $C^\varepsilon(k,k+p)$ becomes singular. However the singularity at ε = 0 of the function $C^\varepsilon(k,k+p)$ in the second addend of the r.h.s. in (40) is of course compensated by the cut-off functions appearing in the propagators. Hence one could “in principle” derive (34) directly using a compact support cut-off (i.e. using $[C_{h,0}]^{-1}$ instead of $[C^\varepsilon_{h,0}]^{-1}$), for instance by a Feynman graph analysis using (40) at ε = 0, but such a derivation would surely be much more lengthy.

3. Renormalization Group Analysis

3.1. The effective potentials and the beta function. The results in Theorems 2 and 4 can be derived by expressing $\hat G^2_\omega$, $\hat G^{2,1}_\omega$ and $\hat H^{2,1}_\omega$ by a suitable multiscale expansion based on Renormalization Group ideas. In the following sections we will prove (16), (17), (38), referring to [BM] for the proof of many technical lemmas we will need.
We begin our analysis, for clarity, by studying the “free energy” of the model, which is the simplest quantity which can be studied by our method; it is defined by
$$E_{L,\beta}=-\frac{1}{L\beta}\log\int P(d\psi^{[h,0]})\,e^{-V(\psi^{[h,0]})}. \tag{41}$$
The functional integration in (41) can be performed iteratively by a slight modification of the procedure described (for instance) in Sects. 2.5–2.8 of [BM]. We prove by induction that, for any negative integer j, there are a constant $E_j$, a positive function $\tilde Z_j(k)$ and a functional $V^{(j)}$ such that
$$\int P(d\psi^{[h,0]})\,e^{-V(\psi^{[h,0]})}=\int P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})\,e^{-V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})-L\beta E_j}, \tag{42}$$
with $V^{(j)}(0)=0$, $Z_j=\max_k\tilde Z_j(k)$,
$$P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})=\prod_{k:C^\varepsilon_{h,j}(k)>0}\prod_{\omega=\pm1}\frac{d\hat\psi^{[h,j]+}_{k,\omega}d\hat\psi^{[h,j]-}_{k,\omega}}{N_j(k)}\cdot\exp\Big\{-\frac{1}{L\beta}\sum_{\omega=\pm1}\sum_kC^\varepsilon_{h,j}(k)\tilde Z_j(k)\,\hat\psi^{[h,j]+}_{k,\omega}D_\omega(k)\hat\psi^{[h,j]-}_{k,\omega}\Big\}, \tag{43}$$
$$C^\varepsilon_{h,j}(k)^{-1}=\sum_{r=h}^jf^\varepsilon_r(k)\equiv\chi^\varepsilon_{h,j}(k) \tag{44}$$
and $N_j(k)=(L\beta)^{-1}C^\varepsilon_{h,j}(k)\tilde Z_j(k)[-k^2-k_0^2]^{1/2}$. Finally, $V^{(j)}$ can be written as
$$V^{(j)}(\psi)=\sum_{n=1}^\infty\frac{1}{(L\beta)^{2n}}\sum_{\substack{k_1,\dots,k_{2n}\\ \omega_1,\dots,\omega_{2n}}}\prod_{i=1}^{2n}\hat\psi^{\sigma_i}_{k_i,\omega_i}\,\hat W^{(j)}_{2n,\omega}(k_1,\dots,k_{2n-1})\,\delta\Big(\sum_{i=1}^{2n}\sigma_ik_i\Big), \tag{45}$$
where $\sigma_i=+$ for $i=1,\dots,n$, $\sigma_i=-$ for $i=n+1,\dots,2n$ and $\omega=(\omega_1,\dots,\omega_{2n})$. Equation (42) is in fact true for j = 0, with
$$\tilde Z_0(k)=1,\qquad E_0=0,\qquad V^{(0)}(\psi)=V(\psi). \tag{46}$$
Assume then that it is true for j; we show that it also holds for j − 1. First of all, we split $V^{(j)}$ as $\mathcal{L}V^{(j)}+\mathcal{R}V^{(j)}$, where $\mathcal{R}=1-\mathcal{L}$ and $\mathcal{L}$, the localization operator, is a linear operator on functions of the form (45), defined in the following way by its action on the kernels $\hat W^{(j)}_{2n,\omega}$.

(1) If 2n = 4, then
$$\mathcal{L}\hat W^{(j)}_{4,\omega}(k_1,k_2,k_3)=\hat W^{(j)}_{4,\omega}(\bar k_{++},\bar k_{++},\bar k_{++}), \tag{47}$$
where we used the definition
$$\bar k_{\eta\eta'}=\Big(\eta\frac{\pi}{L},\eta'\frac{\pi}{\beta}\Big),\qquad\eta,\eta'=\pm. \tag{48}$$
Note that $\mathcal{L}\hat W^{(j)}_{4,\omega}(k_1,k_2,k_3)=0$ if $\sum_{i=1}^4\omega_i\neq0$, by simple symmetry considerations.
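(2) If 2n = 2 (in this case there is a non zero contribution only if $\omega_1=\omega_2$),
$$\mathcal{L}\hat W^{(j)}_{2,\omega}(k)=\frac14\sum_{\eta,\eta'=\pm1}\hat W^{(j)}_{2,\omega}(\bar k_{\eta\eta'})\Big\{1+\eta\frac{L}{\pi}k+\eta'\frac{\beta}{\pi}k_0\Big\}. \tag{49}$$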
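In order to better understand this definition, note that if L = β = ∞,
$$\mathcal{L}\hat W^{(j)}_{2,\omega}(k)=\hat W^{(j)}_{2,\omega}(0)+k\frac{\partial\hat W^{(j)}_{2,\omega}}{\partial k}(0)+k_0\frac{\partial\hat W^{(j)}_{2,\omega}}{\partial k_0}(0). \tag{50}$$

(3) In all other cases
$$\mathcal{L}\hat W^{(j)}_{2n,\omega}(k_1,\dots,k_{2n-1})=0. \tag{51}$$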
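The relation between the finite-volume definition (49) and its infinite-volume form (50) can be illustrated numerically. The following sketch evaluates (49) for a hypothetical smooth kernel `W_demo` (chosen only for the demonstration, it is not a kernel of the model) and compares it with the first-order Taylor polynomial at the origin; the derivative in `taylor` is approximated by central differences.

```python
import math

def localize(W, k, k0, L, beta):
    """Finite-volume localization (49) of a two-point kernel W(k, k0)."""
    total = 0.0
    for eta in (1, -1):
        for eta_p in (1, -1):
            k_bar = (eta * math.pi / L, eta_p * math.pi / beta)
            weight = 1.0 + eta * (L / math.pi) * k + eta_p * (beta / math.pi) * k0
            total += W(*k_bar) * weight
    return total / 4.0

def taylor(W, k, k0, step=1e-6):
    """First-order Taylor polynomial at the origin, i.e. the form (50)."""
    d_k = (W(step, 0.0) - W(-step, 0.0)) / (2.0 * step)
    d_k0 = (W(0.0, step) - W(0.0, -step)) / (2.0 * step)
    return W(0.0, 0.0) + k * d_k + k0 * d_k0

def W_demo(k, k0):
    """Hypothetical smooth kernel, used only for this illustration."""
    return math.exp(0.8 * k - 0.3 * k0) + k * k0

def localization_error(size, k=0.4, k0=-0.2):
    """Distance between (49) at L = beta = size and the limit form (50)."""
    return abs(localize(W_demo, k, k0, size, size) - taylor(W_demo, k, k0))
```

For a smooth kernel the difference is of order $(\pi/L)^2+(\pi/\beta)^2$, since the four points $\bar k_{\eta\eta'}$ act as a symmetric finite-difference stencil around the origin.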
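The above definitions are such that $\mathcal{L}^2=\mathcal{L}$, a property which plays an important role in the analysis of [BM]. Moreover,
$$\mathcal{L}V^{(j)}(\psi^{[h,j]})=z_jF^{[h,j]}_\zeta+a_jF^{[h,j]}_\alpha+l_jF^{[h,j]}_\lambda, \tag{52}$$
where $z_j$, $a_j$ and $l_j$ are real numbers and
$$F^{[h,j]}_\alpha=\sum_\omega\frac{\omega}{L\beta}\sum_{k:C^\varepsilon_{h,j}(k)>0}k\,\hat\psi^{[h,j]+}_{k,\omega}\hat\psi^{[h,j]-}_{k,\omega}=\sum_\omega i\omega\int dx\,\psi^{[h,j]+}_{x,\omega}\partial_x\psi^{[h,j]-}_{x,\omega}, \tag{53}$$
$$F^{[h,j]}_\zeta=\sum_\omega\frac{1}{L\beta}\sum_{k:C^\varepsilon_{h,j}(k)>0}(-ik_0)\,\hat\psi^{[h,j]+}_{k,\omega}\hat\psi^{[h,j]-}_{k,\omega}=-\sum_\omega\int dx\,\psi^{[h,j]+}_{x,\omega}\partial_0\psi^{[h,j]-}_{x,\omega}, \tag{54}$$
$$F^{[h,j]}_\lambda=\frac{1}{(L\beta)^4}\sum_{k_1,\dots,k_4:C^\varepsilon_{h,j}(k_i)>0}\hat\psi^{[h,j]+}_{k_1,+}\hat\psi^{[h,j]-}_{k_2,+}\hat\psi^{[h,j]+}_{k_3,-}\hat\psi^{[h,j]-}_{k_4,-}\,\delta(k_1-k_2+k_3-k_4)=\int dx\,\psi^{[h,j]+}_{x,+}\psi^{[h,j]-}_{x,+}\psi^{[h,j]+}_{x,-}\psi^{[h,j]-}_{x,-}. \tag{55}$$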
$\partial_x$ and $\partial_0$ are discrete derivatives defined so that the second equality in (53) and (54) is satisfied; if N = ∞ they are simply the partial derivatives with respect to x and $x_0$. Note that $\mathcal{L}V^{(0)}=V^{(0)}$, hence $l_0=\lambda$, $a_0=z_0=0$. There is no local term proportional to $\sum_k\hat\psi^{[h,j]+}_{k,\omega}\hat\psi^{[h,j]-}_{k,\omega}$, because of the parity properties of the propagator.

We now renormalize $P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})$, by adding to it part of the quadratic part of the r.h.s. of (52). We get
$$\int P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})\,e^{-V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})}=e^{-L\beta t_j}\int P_{\tilde Z_{j-1},C^\varepsilon_{h,j}}(d\psi^{[h,j]})\,e^{-\tilde V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})}, \tag{56}$$
where
$$\tilde Z_{j-1}(k)=Z_j(k)[1+\chi^\varepsilon_{h,j}(k)z_j], \tag{57}$$
$$\tilde V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})=V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})-z_jZ_j[F^{[h,j]}_\zeta+F^{[h,j]}_\alpha], \tag{58}$$
and the factor $\exp(-L\beta t_j)$ in (56) takes into account the different normalization of the two functional integrals. If j > h, the r.h.s. of (56) can be written as
$$e^{-L\beta t_j}\int P_{\tilde Z_{j-1},C^\varepsilon_{h,j-1}}(d\psi^{[h,j-1]})\int P_{Z_{j-1},\tilde f_j^{-1}}(d\psi^{(j)})\,e^{-\tilde V^{(j)}(\sqrt{Z_j}[\psi^{[h,j-1]}+\psi^{(j)}])}, \tag{59}$$
where $P_{Z_{j-1},\tilde f_j^{-1}}(d\psi^{(j)})$ is the integration with propagator
$$\hat g^{(j)}_\omega(k)=\frac{1}{Z_{j-1}}\frac{\tilde f_j(k)}{D_\omega(k)}, \tag{60}$$
with $\tilde f_j(k)=f^\varepsilon_j(k)Z_{j-1}[\tilde Z_{j-1}(k)]^{-1}$. It is $\tilde Z_{j-1}(k)=Z_0+\sum_{i=j}^0Z_iz_i\chi^\varepsilon_{h,i}(k)$ and, if j > h and $f^\varepsilon_j(k)\neq0$, then $\tilde Z_{j-1}(k)=Z_j+Z_jz_j[f^\varepsilon_{j-1}(k)+f^\varepsilon_j(k)]$, so that the propagators for j > h do not depend on the infrared cut-off and we have
$$\tilde f_j(k)=f^\varepsilon_j(k)\frac{Z_j(1+z_j)}{Z_j+Z_jz_j[f^\varepsilon_{j-1}(k)+f^\varepsilon_j(k)]}\le f^\varepsilon_j(k)(1+z_j). \tag{61}$$
This equation also implies that $\hat g^{(j)}_\omega(k)$ is of size $Z^{-1}_{j-1}\gamma^{-j}$. All the dependence on the infrared cut-off is restricted to the integration of the field of scale h, whose propagator (see (56) with j = h) is
$$\hat g^{(h)}(k)=\frac{f^\varepsilon_h(k)}{\tilde Z_{h-1}(k)D_\omega(k)}=\frac{1}{D_\omega(k)}\frac{f^\varepsilon_h(k)}{Z_0+\sum_{i=h}^0Z_iz_i\chi^\varepsilon_{h,i}(k)}. \tag{62}$$
The latter propagator $\hat g^{(h)}(k)$ depends strongly on k near the cut-off; in fact, if $f_h(k)\neq0$ but $f_{h+1}(k)=0$, then
$$\hat g^{(h)}(k)=\frac{1}{D_\omega(k)}\frac{f^\varepsilon_h(k)}{Z_0+(Z_{h-1}-Z_0)f^\varepsilon_h(k)}. \tag{63}$$
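The passage from (62) to (63) rests on a telescoping sum. As a minimal sketch, assume the recursion $Z_{i-1}=Z_i(1+z_i)$ (this is how we read (57) and the numerator of (61); it is an assumption of the example) and a momentum where $f_{h+1}(k)=0$, so that $\chi^\varepsilon_{h,i}(k)=f^\varepsilon_h(k)$ for every $i\ge h$. The values of $z_i$ below are arbitrary illustrative numbers.

```python
def wave_function_constants(z, h, Z0=1.0):
    """Return {i: Z_i} for i = 0, -1, ..., h-1, assuming Z_{i-1} = Z_i (1 + z_i)."""
    Z = {0: Z0}
    for i in range(0, h - 1, -1):  # i = 0, -1, ..., h
        Z[i - 1] = Z[i] * (1.0 + z[i])
    return Z

def denominator_62(z, h, f_h, Z0=1.0):
    """Z_0 + sum_{i=h}^{0} Z_i z_i chi_{h,i}(k), with chi_{h,i}(k) = f_h(k) for all i >= h."""
    Z = wave_function_constants(z, h, Z0)
    return Z0 + sum(Z[i] * z[i] * f_h for i in range(h, 1))

def denominator_63(z, h, f_h, Z0=1.0):
    """Z_0 + (Z_{h-1} - Z_0) f_h(k), the closed form appearing in (63)."""
    Z = wave_function_constants(z, h, Z0)
    return Z0 + (Z[h - 1] - Z0) * f_h

# Arbitrary illustrative values of z_i (hypothetical), with infrared scale h = -3.
Z_DEMO = {0: 0.0, -1: 0.02, -2: 0.03, -3: 0.025}
H_DEMO = -3
```

The two denominators agree because $\sum_{i=h}^0Z_iz_i=\sum_{i=h}^0(Z_{i-1}-Z_i)=Z_{h-1}-Z_0$ under the assumed recursion.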
However, $\hat g^{(j)}(k)$ is of size $Z^{-1}_{j-1}\gamma^{-j}$ even for j = h, because
$$\frac{Z_{h-1}f^\varepsilon_h(k)}{Z_0+(Z_{h-1}-Z_0)f^\varepsilon_h(k)}\le2. \tag{64}$$
We now rescale the field so that
$$\tilde V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})=\hat V^{(j)}(\sqrt{Z_{j-1}}\psi^{[h,j]}); \tag{65}$$
it follows that
$$\mathcal{L}\hat V^{(j)}(\psi^{[h,j]})=\delta_jF^{[h,j]}_\alpha+\lambda_jF^{[h,j]}_\lambda, \tag{66}$$
where
$$\delta_j=\frac{Z_j}{Z_{j-1}}(a_j-z_j),\qquad\lambda_j=\Big(\frac{Z_j}{Z_{j-1}}\Big)^2l_j. \tag{67}$$
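For the reader's convenience, here is a short consistency check of (67). It assumes that the local part of $V^{(j)}(\sqrt{Z_j}\psi)$ is obtained from (52) by the substitution $\psi\to\sqrt{Z_j}\psi$, so that the quadratic terms pick up a factor $Z_j$ and the quartic term a factor $Z_j^2$ (the same scaling appears in (84) below); the scale labels $[h,j]$ are omitted.

```latex
\begin{align*}
\mathcal{L}\tilde V^{(j)}(\sqrt{Z_j}\psi)
  &= z_jZ_jF_\zeta + a_jZ_jF_\alpha + l_jZ_j^2F_\lambda - z_jZ_j\,[F_\zeta + F_\alpha]
     && \text{by (52) and (58)}\\
  &= (a_j - z_j)\,Z_jF_\alpha + l_jZ_j^2F_\lambda,\\
\mathcal{L}\hat V^{(j)}(\sqrt{Z_{j-1}}\psi)
  &= \delta_jZ_{j-1}F_\alpha + \lambda_jZ_{j-1}^2F_\lambda
     && \text{by (66)}.
\end{align*}
```

Equating the two expressions, as required by (65), gives $\delta_jZ_{j-1}=(a_j-z_j)Z_j$ and $\lambda_jZ_{j-1}^2=l_jZ_j^2$, which is (67).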
We call the pairs $v_j=(\delta_j,\lambda_j)$ the running coupling constants on scale j. A simple perturbative calculation shows that $\lambda_{-1}=\lambda+O(\lambda^2)$, $a_{-1}=O(\lambda^2)$, $z_{-1}=O(\lambda^2)$. Finally
$$e^{-V^{(j-1)}(\sqrt{Z_{j-1}}\psi^{[h,j-1]})-L\beta\tilde E_j}=\int P_{Z_{j-1},\tilde f_j^{-1}}(d\psi^{(j)})\,e^{-\hat V^{(j)}(\sqrt{Z_{j-1}}[\psi^{[h,j-1]}+\psi^{(j)}])}, \tag{68}$$
and $V^{(j-1)}(\sqrt{Z_{j-1}}\psi^{[h,j-1]})$ is of the form (45); moreover, it satisfies the identity (42), with $E_{j-1}=E_j+t_j+\tilde E_j$. This completes the iterative step. We finally define
$$e^{-L\beta\tilde E_h}=\int P_{Z_h,\tilde f_h^{-1}}(d\psi^{(h)})\,e^{-\hat V^{(h)}(\sqrt{Z_h}\psi^{(h)})}, \tag{69}$$
so that
$$E_{L,\beta}=E_h=\sum_{j=h}^{-1}\tilde E_j+\sum_{j=h+1}^{-1}t_j. \tag{70}$$
Note that the above procedure allows us to write, in particular, the running coupling constants $v_j$, $0>j\ge h$, in terms of $v_{j'}$, $0\ge j'\ge j+1$:
$$v_j=\beta(v_{j+1},\dots,v_0),\qquad v_0=(\lambda,0). \tag{71}$$
The function $\beta(v_{j+1},\dots,v_0)$ is called the Beta function. The fact that it is well defined, for small values of λ, in the limit L, β → ∞, is a highly non trivial result, see [BG, BGPS, BoM1, BM]. Finally note that $Z_h$ represents the wave function renormalization of the fermionic field, $\delta_j$ the renormalization of its velocity and $\lambda_j$ is the effective coupling of the theory at scale j.

3.2. The tree expansion. One can write the effective potential on scale j, if h ≤ j < 0, as a sum of terms, which is in fact a finite sum for finite values of N, L, β. Each term of this expansion is associated with a tree in the following way.

1. Let us consider the family of all trees which can be constructed by joining a point r, the root, with an ordered set of n ≥ 1 points, the endpoints of the unlabeled tree (see Fig. 4), so that r is not a branching point. n will be called the order of the unlabeled tree and the branching points will be called the non trivial vertices. The unlabeled trees are partially ordered from the root to the endpoints in the natural way; we shall use the symbol < to denote the partial order. Two unlabeled trees are identified if they can be superposed by a suitable continuous deformation, so that the endpoints with the same index coincide. It is then easy to see that the number of unlabeled trees with n endpoints is bounded by $4^n$. We shall consider also labeled trees (which we shall call simply trees in the following); they are defined by associating certain labels with the unlabeled trees, as explained in the following items.
Fig. 4. Example of a tree
2. We associate a label j ≤ 0 with the root and we denote $T_{j,n}$ the corresponding set of labeled trees with n endpoints. Moreover, we introduce a family of vertical lines, labeled by an integer taking values in [j, 1], and we represent any tree $\tau\in T_{j,n}$ so that, if v is an endpoint or a non-trivial vertex, it is contained in a vertical line with index $h_v>j$, to be called the scale of v, while the root is on the line with index j. There is the constraint that, if v is an endpoint, $h_v>j+1$. The tree will intersect in general the vertical lines in a set of points different from the root, the endpoints and the non trivial vertices; these points will be called trivial vertices. The set of the vertices of τ will be the union of the endpoints, the trivial vertices and the non-trivial vertices. Note that if $v_1$ and $v_2$ are two vertices and $v_1<v_2$, then $h_{v_1}<h_{v_2}$. Moreover, there is only one vertex immediately following the root, which will be denoted $v_0$ and can not be an endpoint; its scale is j + 1.

3. With each endpoint v of scale $h_v$ we associate one of the two local terms contributing to $\mathcal{L}\hat V^{(h_v)}(\psi^{[h,h_v-1]})$ in the r.h.s. of (66) and one space-time point $x_v$. We shall say that the endpoint is of type δ or λ, with an obvious correspondence with the two terms. Note that there is no endpoint of type δ, if $h_v=+1$. Given a vertex v, which is not an endpoint, $x_v$ will denote the family of all space-time points associated with one of the endpoints following v. Moreover, we impose the constraint that, if v is an endpoint, $h_v=h_{v'}+1$, if v' is the non-trivial vertex immediately preceding v.

4. If v is not an endpoint, the cluster $L_v$ with frequency $h_v$ is the set of endpoints following the vertex v; if v is an endpoint, it is itself a (trivial) cluster. The tree provides an organization of endpoints into a hierarchy of clusters.

5. We introduce a field label f to distinguish the field variables appearing in the terms associated with the endpoints as in item (3); the set of field labels associated with the endpoint v will be called $I_v$. Analogously, if v is not an endpoint, we shall call $I_v$ the set of field labels associated with the endpoints following the vertex v; x(f), σ(f) and ω(f) will denote the space-time point, the σ index and the ω index, respectively, of the field variable with label f.

6. If the endpoint v is of type δ, one of the field variables belonging to $I_v$ carries also a derivative. In (53) this derivative acts on the field $\psi^-$, but we could also
choose a representation of $F^{[h,j]}_\zeta$ such that the derivative acts on the field $\psi^+$. Which representation is used depends on detailed properties of the different terms associated with the tree, which are discussed in [BM], see the remark after Eq. 3.40 there. Once this choice is done, we can associate an integer m(f) ∈ {0, 1} to each field label f, denoting the order of the derivative acting on the corresponding field variable.

7. We associate with any vertex v of the tree a subset $P_v$ of $I_v$, the external fields of v. These subsets must satisfy various constraints. First of all, if v is not an endpoint and $v_1,\dots,v_{s_v}$ are the $s_v$ vertices immediately following it, then $P_v\subset\cup_iP_{v_i}$; if v is an endpoint, $P_v=I_v$. We shall denote $Q_{v_i}$ the intersection of $P_v$ and $P_{v_i}$; this definition implies that $P_v=\cup_iQ_{v_i}$. The subsets $P_{v_i}\setminus Q_{v_i}$, whose union will be made, by definition, of the internal fields of v, have to be non-empty, if $s_v>1$, that is if v is a non-trivial vertex.

Given $\tau\in T_{j,n}$, there are many possible choices of the subsets $P_v$, v ∈ τ, compatible with the previous constraints; let us call $\mathbf{P}$ one of these choices. Given $\mathbf{P}$, we consider the family $G_{\mathbf{P}}$ of all connected Feynman graphs, such that, for any v ∈ τ, the internal fields of v are paired by propagators of scale $h_v$, so that the following condition is satisfied: for any v ∈ τ, the subgraph built by the propagators associated with all vertices v' ≥ v is connected. The sets $P_v$ have, in this picture, the role of the external legs of the subgraph associated with v. The graphs belonging to $G_{\mathbf{P}}$ will be called compatible with $\mathbf{P}$ and we shall denote $\mathcal{P}_\tau$ the family of all choices of $\mathbf{P}$ such that $G_{\mathbf{P}}$ is not empty.

As explained in detail in Sect. 3.2 of [BM], we can write, if h ≤ j ≤ −1,
$$V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})+L\beta\tilde E_{j+1}=\sum_{n=1}^\infty\sum_{\tau\in T_{j,n}}\sum_{\mathbf{P}\in\mathcal{P}_\tau}\sqrt{Z_j}^{\,|P_{v_0}|}\int dx_{v_0}\,\tilde\psi^{[h,j]}(P_{v_0})\,K^{(j+1)}_{\tau,\mathbf{P}}(x_{v_0}), \tag{72}$$
where
$$\tilde\psi^{[h,j]}(P_v)=\prod_{f\in P_v}\psi^{[h,j]\sigma(f)}_{x(f),\omega(f)} \tag{73}$$
and $K^{(j+1)}_{\tau,\mathbf{P}}(x_{v_0})$ is a suitable function, which is obtained by summing the values of all the Feynman graphs compatible with $\mathbf{P}$, see item (7) above, and applying iteratively in the vertices of the tree, different from the endpoints and $v_0$, the $\mathcal{R}$-operation, starting from the vertices with higher scale. Note that there is no derivative acting on the fields with label $f\in P_{v_0}$, even if the field is associated with the endpoint of type δ; this result is achieved by using the freedom discussed in item (6) about the choice of the field with m(f) = 1. In a similar way we get
$$\tilde E_h=\sum_{n=1}^\infty\sum_{\tau\in T_{h-1,n}}\sum_{\mathbf{P}\in\mathcal{P}_\tau:P_{v_0}=\emptyset}K^{(h)}_{\tau,\mathbf{P}}(x_{v_0}). \tag{74}$$
3.3. The main bound. In order to control, uniformly in L and β, the various sums in (72), one has to exploit in a careful way the $\mathcal{R}$ operation acting on the vertices of the tree, as explained in full detail in [BM, Sect. 3]. The result of this analysis, which applies essentially unchanged to the model studied in this paper, is a general bound which has a simple dimensional interpretation. Let us see what happens if we erase the $\mathcal{R}$ operation in all the vertices of the tree. In this case one gets the dimensional bound
$$\int dx_{v_0}|K^{(j+1)}_{\tau,\mathbf{P}}(x_{v_0})|\le L\beta\,(C\bar\varepsilon)^n\gamma^{-j(-2+|P_{v_0}|/2)}\prod_{v\ \mathrm{not\ e.p.}}\Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}}\gamma^{-(-2+\frac{|P_v|}{2})}, \tag{75}$$
where C is a suitable constant and $\bar\varepsilon=\max_{j+1\le j'\le0}|v_{j'}|$. Note that the good dependence on n derives from the anticommuting properties of the field variables.

The bound (75) allows us to associate a factor $\gamma^{2-|P_v|/2}$ with any trivial or non-trivial vertex of the tree. This would allow us to control the sums over the scale labels and $\mathcal{P}_\tau$, provided $|P_v|$ were larger than 4 in all vertices, which is however not true. The effect of the $\mathcal{R}$ operation is to improve the bound, so that there is a factor less than 1 associated even with the vertices where $|P_v|$ is equal to 2 or 4. In order to explain how this works, we need a more detailed discussion of the $\mathcal{R}$ operation. We shall do that below by using the simpler expressions that one obtains in the (formal) limit L = β = ∞; this is sufficient to explain the essential points and makes the notation clearer.

(1) If 2n = 4, by (47) (with L = β = ∞),
$$\mathcal{L}\int dx\,W(x)\prod_{i=1}^4\psi^{[h,j]\sigma_i}_{x_i,\omega_i}=\int dx\,W(x)\prod_{i=1}^4\psi^{[h,j]\sigma_i}_{x_4,\omega_i}, \tag{76}$$
where $x=(x_1,\dots,x_4)$ and W(x) is the Fourier transform of $\hat W^{(j)}_{4,\omega}(k_1,k_2,k_3)$. Note that W(x) is translation invariant; hence $\psi^{[h,j]\sigma_i}_{x_4,\omega_i}$ in the r.h.s. of (76) can be substituted with $\psi^{[h,j]\sigma_i}_{x_k,\omega_i}$, k = 1, 2, 3, and we have four equivalent representations of the localization operation, which differ by the choice of the localization point. If the localization point is chosen as in (76), we have
$$\mathcal{R}\int dx\,W(x)\prod_{i=1}^4\psi^{[h,j]\sigma_i}_{x_i,\omega_i}=\int dx\,W(x)\Big[\prod_{i=1}^4\psi^{[h,j]\sigma_i}_{x_i,\omega_i}-\prod_{i=1}^4\psi^{[h,j]\sigma_i}_{x_4,\omega_i}\Big]$$
$$=\int dx\,W(x)\Big[\psi^{[h,j]\sigma_1}_{x_1,\omega_1}\psi^{[h,j]\sigma_2}_{x_2,\omega_2}D^{1[h,j]\sigma_3}_{x_3,x_4,\omega_3}\psi^{[h,j]\sigma_4}_{x_4,\omega_4}+\psi^{[h,j]\sigma_1}_{x_1,\omega_1}D^{1[h,j]\sigma_2}_{x_2,x_4,\omega_2}\psi^{[h,j]\sigma_3}_{x_4,\omega_3}\psi^{[h,j]\sigma_4}_{x_4,\omega_4}+D^{1[h,j]\sigma_1}_{x_1,x_4,\omega_1}\psi^{[h,j]\sigma_2}_{x_4,\omega_2}\psi^{[h,j]\sigma_3}_{x_4,\omega_3}\psi^{[h,j]\sigma_4}_{x_4,\omega_4}\Big], \tag{77}$$
where (again if L = β = ∞)
$$D^{1[h,j]\sigma}_{y,x,\omega}=\psi^{[h,j]\sigma}_{y,\omega}-\psi^{[h,j]\sigma}_{x,\omega}. \tag{78}$$
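The second equality in (77) is a telescoping identity which uses only distributivity and preserves the order of the factors, so it holds in any associative algebra, in particular for anticommuting fields. The sketch below checks it with random 2×2 integer matrices used as non-commuting stand-ins for the field variables; this choice, and all names in the code, are ours and serve only as an illustration.

```python
import random

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2))
                 for i in range(2))

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def product(factors):
    """Ordered product of a list of 2x2 matrices."""
    result = ((1, 0), (0, 1))
    for m in factors:
        result = mat_mul(result, m)
    return result

def check_telescoping(seed=0):
    """Check prod(a_i) - prod(b_i) = a1 a2 D3 a4 + a1 D2 b3 a4 + D1 b2 b3 a4."""
    rng = random.Random(seed)

    def rand():
        return tuple(tuple(rng.randint(-5, 5) for _ in range(2)) for _ in range(2))

    a = [rand() for _ in range(4)]           # stand-ins for the fields at x_1, ..., x_4
    b = [rand() for _ in range(3)] + [a[3]]  # same labels moved to x_4; the 4th is unchanged
    d = [mat_sub(a[i], b[i]) for i in range(3)]  # stand-ins for the D^1 fields of (78)
    lhs = mat_sub(product(a), product(b))
    rhs = mat_add(mat_add(product([a[0], a[1], d[2], a[3]]),
                          product([a[0], d[1], b[2], a[3]])),
                  product([d[0], b[1], b[2], a[3]]))
    return lhs == rhs
```

Since the entries are integers the comparison is exact. Note that in each of the three terms the last factor, the field sitting at the localization point, is left untouched.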
The field $D^{1[h,j]\sigma}_{y,x,\omega}$ is dimensionally equivalent to the product of |y − x| and the derivative of the field, so that the bound of its contraction with another field variable on a scale j' < j will produce a “gain” $\gamma^{-(j-j')}$ with respect to the contraction of $\psi^{[h,j]\sigma}_{y,\omega}$. On the other hand, each term in the r.h.s. of (77) differs from the term which $\mathcal{R}$ acts on mainly because one $\psi^{[h,j]}$ field is substituted with a $D^{1[h,j]}$ field and some of the other $\psi^{[h,j]}$ fields are “translated” in the localization point. All three terms share the property that the field whose x coordinate is equal to the localization point is not affected by the action of $\mathcal{R}$.

(2) If 2n = 2, by (50),
$$\mathcal{R}\int dx_1dx_2\,W(x_1-x_2)\psi^{[h,j]+}_{x_1,\omega}\psi^{[h,j]-}_{x_2,\omega}=\int dx_1dx_2\,W(x_1-x_2)\psi^{[h,j]+}_{x_1,\omega}D^{2[h,j]-}_{x_2,x_1,\omega}=\int dx_1dx_2\,W(x_1-x_2)D^{2[h,j]+}_{x_1,x_2,\omega}\psi^{[h,j]-}_{x_2,\omega}, \tag{79}$$
where W(x) is the Fourier transform of $\hat W^{(j)}_{2,\omega,\omega}(k)$ and
$$D^{2[h,j]\sigma}_{y,x,\omega}=\psi^{[h,j]\sigma}_{y,\omega}-\psi^{[h,j]\sigma}_{x,\omega}-(y-x)\cdot\nabla\psi^{[h,j]\sigma}_{x,\omega}. \tag{80}$$
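As in item (1) above, we define the localization point as the x coordinate of the field which is left unchanged by $\mathcal{L}$ or $\mathcal{R}$. We are free to choose it equal to $x_1$ or $x_2$. Hence the effect of $\mathcal{R}$ can be described as the replacement of a $\psi^{[h,j]\sigma}$ field with a $D^{2[h,j]\sigma}$ field, with a gain in the bounds of a factor $\gamma^{-2(j-j')}$.

By suitably using the definition of $\mathcal{R}$, it is shown in Sect. 3 of [BM] that
$$\mathcal{R}V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})=\sum_{n=1}^\infty\sum_{\tau\in T_{j,n}}\sum_{\mathbf{P}\in\mathcal{P}_\tau}\sum_{\alpha\in A_{\tau,\mathbf{P}}}\sqrt{Z_j}^{\,|P_{v_0}|}\int dx_{v_0}\,D_\alpha\tilde\psi^{[h,j]}(P_{v_0})\,K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0}), \tag{81}$$
where $A_{\tau,\mathbf{P}}$ labels a finite set of different terms, of counting power $C^n$, and, for any $\alpha\in A_{\tau,\mathbf{P}}$, $D_\alpha$ denotes an operator dimensionally equivalent to a derivative of order $m_\alpha$. The important property of (81) is that, see Eq. 3.110 of [BM],
$$\int dx_{v_0}|K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0})|\le L\beta\,(C\bar\varepsilon)^n\gamma^{-j(-2+|P_{v_0}|/2+m_\alpha)}\prod_{v\ \mathrm{not\ e.p.}}\Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{|P_v|/2}\gamma^{-[-2+|P_v|/2+z(P_v)]}, \tag{82}$$
where $m_\alpha\ge z(P_{v_0})$ and
$$z(P_v)=\begin{cases}1&\text{if }|P_v|=4,\\2&\text{if }|P_v|=2,\\0&\text{otherwise.}\end{cases} \tag{83}$$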
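The role of the regularization index $z(P_v)$ can be made explicit with a few lines of code. The sketch below (function names are ours) compares the exponent $-2+|P_v|/2$ attached to each vertex by the unrenormalized bound (75) with the exponent $-2+|P_v|/2+z(P_v)$ of the improved bound (82).

```python
def z(n_ext):
    """Regularization index z(P_v) of (83), as a function of |P_v|."""
    if n_ext == 4:
        return 1
    if n_ext == 2:
        return 2
    return 0

def naive_dimension(n_ext):
    """Exponent -2 + |P_v|/2 attached to a vertex in the bound (75)."""
    return -2 + n_ext / 2

def renormalized_dimension(n_ext):
    """Exponent -2 + |P_v|/2 + z(P_v) attached to a vertex in the bound (82)."""
    return naive_dimension(n_ext) + z(n_ext)

# |P_v| is even; list both exponents for the first few values.
TABLE = {n: (naive_dimension(n), renormalized_dimension(n)) for n in range(2, 13, 2)}
```

The unrenormalized exponent is positive only for $|P_v|>4$, whereas the renormalized one is at least 1 for every even $|P_v|\ge2$, so that each vertex carries a factor $\gamma^{-1}$ or smaller.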
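We now consider the action of $\mathcal{L}$ on $V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})$. We get an expansion similar to (81), that we can write in the form
$$\mathcal{L}V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})=\sum_{n=1}^\infty\sum_{\tau\in T_{j,n}}\big[z_j(\tau)Z_jF^{[h,j]}_\zeta+a_j(\tau)Z_jF^{[h,j]}_\alpha+l_j(\tau)Z_j^2F^{[h,j]}_\lambda\big], \tag{84}$$
where (in the limit L = β = ∞)
$$z_j(\tau)=\frac{1}{L\beta}\sum_{\substack{\mathbf{P}\in\mathcal{P}_\tau,\ \alpha\in A_{\tau,\mathbf{P}}\\ P_{v_0}=(f_1,f_2),\ \omega(f_1)=\omega(f_2)=+1}}\int dx_{v_0}\,[x_0(f_2)-x_0(f_1)]\,K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0}),$$
$$a_j(\tau)=\frac{1}{L\beta}\sum_{\substack{\mathbf{P}\in\mathcal{P}_\tau,\ \alpha\in A_{\tau,\mathbf{P}}\\ P_{v_0}=(f_1,f_2),\ \omega(f_1)=\omega(f_2)=+1}}\int dx_{v_0}\,[x(f_2)-x(f_1)]\,K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0}), \tag{85}$$
$$l_j(\tau)=\frac{1}{L\beta}\sum_{\substack{\mathbf{P}\in\mathcal{P}_\tau,\ \alpha\in A_{\tau,\mathbf{P}}\\ |P_{v_0}|=4,\ \sigma=(+,-,+,-),\ \omega=(+1,-1,-1,+1)}}\int dx_{v_0}\,K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0}).$$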
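The constants $z_j$, $a_j$ and $l_j$, which characterize the local part of the effective potential, can be obtained from (85) by summing over n ≥ 1 and $\tau\in T_{j,n}$. Finally, the constant $\tilde E_{j+1}$ appearing in the l.h.s. of (72) can be written in the form $\tilde E_{j+1}=\sum_{n=1}^\infty\sum_{\tau\in T_{j,n}}\tilde E_{j+1}(\tau)$, with
$$\tilde E_{j+1}(\tau)=\frac{1}{L\beta}\sum_{\substack{\mathbf{P}\in\mathcal{P}_\tau,\ \alpha\\ P_{v_0}=\emptyset}}\int dx_{v_0}\,K^{(j+1)}_{\tau,\mathbf{P},\alpha}(x_{v_0}). \tag{86}$$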
All the kernels appearing in (85) and (86) satisfy the bound (82), with $m_\alpha=0$.

Note that, by the remark preceding (62), the effective potential is independent of the infrared cut-off for j > h. This means in particular that, if we add a superscript (h) to keep track of the infrared cut-off, $v^{(h)}_j=v^{-\infty}_j$ for j > h. On the other hand, in previous papers (see [BGPS, GS, BoM1, BM]) it was shown, by using several properties of the exact solution of the Luttinger model (see [ML, BGM]), that $\lambda^{-\infty}_j=\lambda+O(\lambda^2)$ and $\delta^{-\infty}_j=O(\lambda^2)$. Moreover, since $\lambda^{(h)}_h-\lambda^{-\infty}_h=(\lambda^{(h)}_h-\lambda^{(h)}_{h+1})-(\lambda^{-\infty}_h-\lambda^{-\infty}_{h+1})$, the previous result implies that $\lambda^{(h)}_h=\lambda^{-\infty}_h+O(\lambda^2)$, since, by (82) and (85), both $\lambda^{(h)}_h-\lambda^{(h)}_{h+1}$ and $\lambda^{-\infty}_h-\lambda^{-\infty}_{h+1}$ are of order $\lambda^2$. We can summarize these results in the following theorem.

Theorem 5. There is a constant $\varepsilon_0$, such that, if $|\lambda|\le\varepsilon_0$, then, uniformly in the infrared cut-off,
$$\lambda_j=\lambda+O(\lambda^2),\qquad\delta_j=O(\lambda^2),\qquad h\le j\le-1. \tag{87}$$
3.4. The expansion for the Schwinger functions. The procedure described in Sects. 3.1–3.3 can be generalized to get an expansion for the connected Schwinger functions of the model, in particular those defined by (12) and (13). The main difference with respect to the “free energy” case is that the external fields $J_{x,\omega}$ and $\phi_{x,\omega}$ have to be taken into account. We start from the generating function (11) and we perform iteratively the integration of the ψ variables, in the following way. After the fields $\psi^{(0)},\dots,\psi^{(j+1)}$ have been integrated, we can write
$$e^{\mathcal{W}(\phi,J)}=e^{-L\beta E_j}\int P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})\,e^{-V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})+B^{(j)}(\sqrt{Z_j}\psi^{[h,j]},\phi,J)}, \tag{88}$$
where $B^{(j)}(\sqrt{Z_j}\psi,\phi,J)$ denotes the sum over the terms containing at least one φ or J field; we shall write it in the form
$$B^{(j)}(\sqrt{Z_j}\psi,\phi,J)=B^{(j)}_\phi(\sqrt{Z_j}\psi)+B^{(j)}_J(\sqrt{Z_j}\psi)+W^{(j)}_R(\sqrt{Z_j}\psi,\phi,J), \tag{89}$$
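where $B^{(j)}_\phi(\psi)$ and $B^{(j)}_J(\psi)$ denote the sums over all the terms containing only one φ or J field, respectively. For j = 0, the comparison with (11) shows that $W^{(0)}_R=0$, $B^{(0)}_\phi(\psi)=\sum_\omega\int dx\,[\phi^+_{x,\omega}\psi^-_{x,\omega}+\psi^+_{x,\omega}\phi^-_{x,\omega}]$ and $B^{(0)}_J(\psi)=\sum_\omega\int dx\,J_{x,\omega}\psi^+_{x,\omega}\psi^-_{x,\omega}$.

In order to control the expansion of the connected Schwinger functions, we have to extend the definition of the localization operation $\mathcal{L}$ to $B^{(j)}(\sqrt{Z_j}\psi,\phi,J)$. First of all, we put $\mathcal{L}W^{(j)}_R=W^{(j)}_R$. Let us now consider $B^{(j)}_\phi(\sqrt{Z_j}\psi)$; we want to show that, by a suitable choice of the localization procedure, it can be written in the form
$$B^{(j)}_\phi(\sqrt{Z_j}\psi)=\sum_\omega\sum_{i=j+1}^0\int dxdy\,\Big[\phi^+_{x,\omega}\,g^{Q,(i)}_\omega(x-y)\frac{\partial}{\partial\psi^+_{y,\omega}}V^{(j)}(\sqrt{Z_j}\psi)+\frac{\partial}{\partial\psi^-_{y,\omega}}V^{(j)}(\sqrt{Z_j}\psi)\,g^{Q,(i)}_\omega(y-x)\,\phi^-_{x,\omega}\Big]$$
$$+\frac{1}{\sqrt{Z_j}}\frac{1}{L\beta}\sum_{\omega,k}\Big[(\sqrt{Z_j}\hat\psi^+_{k,\omega})\hat Q^{(j+1)}_\omega(k)\hat\phi^-_{k,\omega}+\hat\phi^+_{k,\omega}\hat Q^{(j+1)}_\omega(k)(\sqrt{Z_j}\hat\psi^-_{k,\omega})\Big], \tag{90}$$
where
$$\hat g^{Q,(i)}_\omega(k)=\hat g^{(i)}_\omega(k)\hat Q^{(i)}_\omega(k) \tag{91}$$
and $\hat Q^{(j)}_\omega(k)$ is defined inductively by the relations
$$\hat Q^{(j)}_\omega(k)=\hat Q^{(j+1)}_\omega(k)-z_jZ_jD_\omega(k)\sum_{i=j+1}^0\hat g^{Q,(i)}_\omega(k),\qquad\hat Q^{(0)}_\omega(k)=1. \tag{92}$$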
In fact, the terms in the first two lines of (90) have a simple interpretation in terms of Feynman graphs; they are obtained by taking all the graphs contributing to $V^{(j)}(\sqrt{Z_j}\psi)$ and, given a single graph, by adding a new space-time point x associated with a term $\phi_x\psi_x$ and contracting the corresponding ψ field with one of the external fields of the graph through a propagator $\sum_{i=j+1}^0g^{Q,(i)}_\omega(x-y)$. Hence, it is very easy to see that (90) is satisfied for j = −1. The fact that it is valid for any j follows from our choice to localize $B^{(j)}_\phi(\sqrt{Z_j}\psi)$ by the following procedure: first of all we substitute in the r.h.s. of (90) $V^{(j)}$ with $\mathcal{L}V^{(j)}+\mathcal{R}V^{(j)}$, $\mathcal{L}V^{(j)}$ being defined by (52); then we extract from $\mathcal{L}V^{(j)}$ the terms proportional to $z_j$, as in (58), which are absorbed in the terms in the third line of (90). Finally we rescale the field ψ by (65) and perform the integration of the scale j field. It is then easy to check that (90) is satisfied for $j=\bar j+1$, if it is satisfied for $j=\bar j$, together with (92).

Note that $f_j(k)=0$ for $|k|<\gamma^{j-1}$ or $|k|>\gamma^{j+1}$, so that
$$f_{h_1}(k)f_{h_2}(k)=0\qquad\text{if }|h_1-h_2|>1. \tag{93}$$
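The support property behind (93) can be illustrated with an explicit family of cut-off functions. The construction below is an assumption made for the example (a smooth step $\chi_0$ equal to 1 for $t\le1$ and to 0 for $t\ge\gamma$, with $f_j(k)=\chi_0(\gamma^{-j}|k|)-\chi_0(\gamma^{-j+1}|k|)$ and γ = 2); the paper's own $f^\varepsilon_j$ is defined earlier in the text and need not coincide with it.

```python
import math

GAMMA = 2.0  # scaling parameter; the value 2 is an arbitrary choice for this example

def smooth_step(t):
    """Smooth function equal to 1 for t <= 1 and to 0 for t >= GAMMA (hypothetical choice)."""
    if t <= 1.0:
        return 1.0
    if t >= GAMMA:
        return 0.0
    s = (t - 1.0) / (GAMMA - 1.0)       # rescaled variable in (0, 1)
    a = math.exp(-1.0 / s)              # vanishes to all orders as s -> 0
    b = math.exp(-1.0 / (1.0 - s))      # vanishes to all orders as s -> 1
    return b / (a + b)

def f(j, k_abs):
    """Scale-j cut-off; it vanishes unless GAMMA**(j-1) < |k| < GAMMA**(j+1)."""
    return smooth_step(GAMMA ** (-j) * k_abs) - smooth_step(GAMMA ** (-j + 1) * k_abs)

def chi(h, j, k_abs):
    """chi_{h,j}(k) = sum_{r=h}^{j} f_r(k), compare (44)."""
    return sum(f(r, k_abs) for r in range(h, j + 1))
```

With this choice the sum defining `chi` telescopes, so that `chi(h, 0, k_abs)` equals 1 for $\gamma^h\le|k|\le1$, and at most two consecutive scales are non-zero at any given momentum, which is the content of (93).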
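It follows that, if $\hat g^{(j)}_\omega(k)\neq0$, by using also (62) and (92),
$$\hat Q^{(j)}_\omega(k)=1-z_jf^\varepsilon_{j+1}(k)\frac{Z_j}{\tilde Z_j(k)}. \tag{94}$$
Hence, the propagator $\hat g^{Q,(i)}_\omega(k)$ is equivalent to $\hat g^{(i)}_\omega(k)$, as concerns the dimensional bounds.

Finally let us consider $B^{(j)}_J(\sqrt{Z_j}\psi)$. It is easy to see that the field J is equivalent, from the point of view of dimensional considerations, to two ψ fields. Hence, the only terms which need a regularization are those of second order in ψ, which are indeed marginal. We shall use for them the definition
$$B^{(j,2)}_J(\sqrt{Z_j}\psi)=\sum_{\omega,\tilde\omega}\int dxdydz\,B_{\omega,\tilde\omega}(x,y,z)\,J_{x,\omega}(\sqrt{Z_j}\psi^+_{y,\tilde\omega})(\sqrt{Z_j}\psi^-_{z,\tilde\omega})=\frac{1}{(L\beta)^2}\sum_{\omega,\tilde\omega,k,p}\hat B_{\omega,\tilde\omega}(p,k)\,\hat J(p)(\sqrt{Z_j}\hat\psi^+_{p+k,\tilde\omega})(\sqrt{Z_j}\hat\psi^-_{k,\tilde\omega}). \tag{95}$$
We write
$$B^{(j,2)}_J(\sqrt{Z_j}\psi)=\mathcal{L}B^{(j,2)}_J(\sqrt{Z_j}\psi)+\mathcal{R}B^{(j,2)}_J(\sqrt{Z_j}\psi), \tag{96}$$
where $\mathcal{L}$ is defined through its action on $\hat B_\omega(p,k)$ in the following way:
$$\mathcal{L}\hat B_{\omega,\tilde\omega}(p,k)=\frac14\sum_{\eta,\eta'=\pm1}\hat B_{\omega,\tilde\omega}(\bar p_{\eta'},\bar k_{\eta,\eta'}), \tag{97}$$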
where $\bar k_{\eta,\eta'}$ is defined as in (48) and $\bar p_{\eta'}=(0,2\pi\eta'/\beta)$. In the L = β = ∞ limit it reduces simply to $\mathcal{L}\hat B_{\omega,\tilde\omega}(p,k)=\hat B_{\omega,\tilde\omega}(0,0)$. This definition apparently implies that we have to introduce two new renormalization constants. However, this is not the case. One can show that, in the L = β = ∞ limit,
$$\hat B_{\omega,-\omega}(0,0)=0, \tag{98}$$
by using the symmetry property of the propagators
$$\hat g^{(j)}_\omega(k)=-i\omega\,\hat g^{(j)}_\omega(k^*),\qquad k=(k,k_0),\qquad k^*=(-k_0,k). \tag{99}$$
In fact, the contribution of order n in $\bar\varepsilon$ to $B_{\omega,-\omega}(p,k)$ can be written as a sum of connected Feynman graphs obtained by contracting 2n + 2 fields of type ω and 2n − 2 fields of type −ω, so that, by (99), $B_{\omega,-\omega}(p,k)=(-i\omega)^{n+1}(i\omega)^{n-1}B_{\omega,-\omega}(p^*,k^*)=-B_{\omega,-\omega}(p^*,k^*)$, since $(-i\omega)^{n+1}(i\omega)^{n-1}=(-i)^2\omega^{2n}=-1$; this implies (98). If L and β are finite, the identity (98) is not true anymore, but the corrections do not give rise to any divergence, as $j\to-\infty$, and go to zero, for any fixed j, as L, β → ∞. In fact it is not hard to see, by comparing $\mathcal{L}\hat B_{\omega,\tilde\omega}(p,k)$ with its limit as L, β → ∞ and using the properties of the multiscale expansion described above, that the corrections are of order $\gamma^{-j}\max\{L^{-1},\beta^{-1}\}$, as one can guess by dimensional arguments. In other words, one can say that $\mathcal{L}\hat B_{\omega,-\omega}$ behaves as an irrelevant term (hence no renormalization constant is associated with it).

The previous considerations imply that we can write
$$\mathcal{L}B^{(j,2)}_J(\sqrt{Z_j}\psi)=\sum_\omega\frac{Z^{(2)}_j}{Z_j}\int dx\,J_{x,\omega}(\sqrt{Z_j}\psi^+_{x,\omega})(\sqrt{Z_j}\psi^-_{x,\omega}), \tag{100}$$
which defines a new renormalization constant $Z^{(2)}_j$, the density renormalization. It is easy to see, by proceeding as in Sect. 3.3, that $\mathcal{R}B^{(j,2)}_J(\sqrt{Z_j}\psi)$ can be written as a sum of terms of the form (95), with one of the fields ψ replaced by a field $D^1$ (see (78)). This allows us to improve the bounds in the usual way, see Sect. 3.3. The definition of $\mathcal{R}$ is extended to all the other contributions to $B^{(j)}_J(\sqrt{Z_j}\psi)$ as the identity.

At the end of the iterative integration procedure, we get
$$\mathcal{W}(\phi,J)=-L\beta E_{L,\beta}+\sum_{m^\phi+n^J\ge1}S^{(h)}_{2m^\phi,n^J}(\phi,J). \tag{101}$$
We can expand the functional $S^{(h)}_{2m^\phi,n^J}(\phi,J)$ and the various terms in the r.h.s. of (89) in terms of trees, as we did for the effective potential, by suitably modifying the definitions given in Sect. 3.2.

1. First of all, we have to add two new types of endpoints, to be called of type φ and J; the first one is associated with the terms in the third line of (90), the second one with the terms in the r.h.s. of (100). They will be sometimes called special endpoints; as for the other endpoints, the scale $h_{\bar v}$ of a special endpoint $\bar v$ of type J is $h_v+1$, if $h_v$ is the scale of the non-trivial vertex immediately preceding $\bar v$; if $\bar v$ is an endpoint of type φ, $h_{\bar v}\ge h_v+1$. Given v ∈ τ, we shall call $n^\phi_v$ and $n^J_v$ the number of endpoints of type φ and J belonging to the cluster $L_v$, defined as in item (4) of Sect. 3.2, while $n_v$ will denote the number of endpoints of type λ or δ, to be called normal. Analogously, given τ, we shall call $n^\phi_\tau$ and $n^J_\tau$ the number of endpoints of type φ and J, while $n_\tau$ will denote the number of normal endpoints. Finally, $T_{j,n,n^\phi,n^J}$ will denote the set of trees with n normal endpoints, $n^\phi$ endpoints of type φ and $n^J$ endpoints of type J.

2. The definition of the sets $P_v$ (of the external fields in the vertex v) is modified, in the sense that the set $P_v$ includes both the field variables of type ψ which are not yet contracted in the vertex v, to be called normal external fields, and those which belong to an endpoint normal or of type J and are contracted with a field variable belonging to an endpoint of type φ through a propagator $g^{Q,(h_v)}$, to be called special external fields of v.
3. As explained above, we regularize the terms linear in φ by extracting from the effective potential, in the r.h.s. of (90), its local part, defined by (52). This implies that one of the ψ variables contracted in the propagator linked to the φ variable is treated as an external field variable, see item (2) above. However, in order to exploit the regularizing effect of the $\mathcal{R}$ operation on the terms with 2 or 4 external fields (see remark after (78)), we have to be sure that the field variable which “acquires a derivative” is not yet contracted on the vertex scale. This can be realized by choosing the localization point as the space-time point of the special external field, that is the field which is contracted with the ψ field of the type φ endpoint.

It is easy to see that
$$S^{(h)}_{2m^\phi,n^J}(\phi,J)=\sum_{n=0}^\infty\sum_{j_0=h-1}^{-1}\sum_\omega\sum_{\tau\in T_{j_0,n,2m^\phi,n^J}}\sum_{\mathbf{P}\in\mathcal{P}_\tau:|P_{v_0}|=2m^\phi}\int dx\prod_{i=1}^{2m^\phi}\phi^{\sigma_i}_{x_i,\omega_i}\prod_{r=1}^{n^J}J_{x_{2m^\phi+r},\omega_{2m^\phi+r}}\,S_{2m^\phi,n^J,\tau,\omega}(x), \tag{102}$$
where $\omega=\{\omega_1,\dots,\omega_{2m^\phi+n^J}\}$, $x=\{x_1,\dots,x_{2m^\phi+n^J}\}$ and $\sigma_i=+$ if i is odd, $\sigma_i=-$ if i is even.

The Schwinger functions are simply related to the kernels of the functionals $S^{(h)}_{2m^\phi,n^J}(\phi,J)$ and (102) allows us to get an expansion for them. For example, $G^2_\omega(x_1,x_2)$ is equal to the sum over the terms in the r.h.s. of (102) with $m^\phi=1$, $n^J=0$ and $\omega=(\omega,\omega)$, while $G^{2,1}_\omega(x;y,z)$ is obtained by selecting the terms with $m^\phi=1$, $n^J=1$ and $\omega=(\omega,\omega,\omega)$. Hence, a bound for the Fourier transform of the Schwinger functions can be obtained by using the dimensional bound
$$\int dx\,|S_{2m^\phi,n^J,\tau,\omega}(x)|\le L\beta\,(C\bar\varepsilon)^n\gamma^{-j_0(-2+m^\phi+n^J)}\prod_{i=1}^{2m^\phi}\frac{\gamma^{-h_i}}{(Z_{h_i})^{1/2}}\prod_{r=1}^{n^J}\frac{Z^{(2)}_{\bar h_r}}{Z_{\bar h_r}}\prod_{v\ \mathrm{not\ e.p.}}\Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{|P_v|/2}\gamma^{-d_v}, \tag{103}$$
where $h_i$ is the scale of the propagator linking the i-th endpoint of type φ to the tree, $\bar h_r$ is the scale of the r-th endpoint of type J and
$$d_v=-2+|P_v|/2+n^J_v+\tilde z(P_v), \tag{104}$$
with
$$\tilde z(P_v)=\begin{cases}z(P_v)&\text{if }n^\phi_v\le1,\ n^J_v=0,\\1&\text{if }n^\phi_v=0,\ n^J_v=1,\ |P_v|=2,\\0&\text{otherwise.}\end{cases} \tag{105}$$
The bound (103) can be easily obtained by the same arguments leading to the bound (82), by taking into account the remarks in items (1)–(3) above. Essentially one has to modify the bound (82) in the following way.
(a) Insert a factor $\gamma^{-h_i}(Z_{h_i})^{-1/2}$ for each endpoint of type φ; this factor bounds the product of the propagator linking the i-th endpoint of type φ to the tree and the $(Z_{h_i})^{1/2}$ renormalization constant of the corresponding special external field variable. We use here (60), (61) and (64).
(b) Insert a factor $Z^{(2)}_{\bar h_r}/Z_{\bar h_r}$ for each endpoint of type J.
(c) Substitute the “regularization index” $z_v$ with $\tilde z_v$, to take into account that the R operation is trivial in all the vertices with $n^\phi_v+n^J_v>1$.
(d) Insert a factor $\gamma^{n^J_v}$ for each non-trivial vertex v, to take into account that any J variable is dimensionally equivalent to two ψ external fields, so that the dimension of any vertex v increases by one unit for any endpoint of type J belonging to the cluster $L_v$.
The bound (103) is sufficient to get a bound for the Fourier transforms of the Schwinger functions, because, by translation invariance, the Fourier transform of $S_{2m^\phi,n^J,\tau,\underline\omega}(\underline x)$ is bounded by $(L\beta)^{-1}\int d\underline x\,|S_{2m^\phi,n^J,\tau,\underline\omega}(\underline x)|$. We only have to sum over τ the r.h.s. of (103) (without the Lβ factor), by using the techniques described in detail in Sect. 5 of [BM]. The main point is to control the sums over the sets $P_v$ and the scale indices $h_v$, for fixed values of the external propagators scale indices $h_i$, which are determined up to one unit by the external momenta. Hence, if all the “vertex dimensions” $d_v$ were greater than 0, one would get a dimensional bound of the type
$$(C\bar\varepsilon)^{\bar n}\ \sum_{j_0=h}^{\bar h}\gamma^{-j_0(-2+m^\phi+n^J)}\ \prod_{i=1}^{2m^\phi}\frac{\gamma^{-h_i}}{(Z_{h_i})^{1/2}}\ \prod_{r=1}^{n^J}\frac{Z^{(2)}_{\bar h_r}}{Z_{\bar h_r}}, \tag{106}$$
where $\bar n$ is the minimal order in λ of the graphs contributing to the Schwinger function and $\bar h$ is an upper bound on the scale of the tree lower vertex $v_0$, which depends on the external momenta.
However, it is not true that, given τ, $d_v>0$ for all non-trivial $v\in\tau$; in fact $d_v=0$, if $|P_v|=2$ and $n^\phi_v=n^J_v=1$ or $n^\phi_v=2$, $n^J_v=0$. This implies that the sum over the scale indices of some special paths on the tree can produce a result different from the “trivial one”, leading to (106). Hence, in order to get the right bound, one has to analyze case by case the constraints on the endpoint scale indices, related with the support properties of the single scale propagators and the fact that the φ and J momenta are fixed. The result of this analysis, rather difficult to describe in general, will be given only for the bound of the connected Schwinger functions appearing in Theorem 2. Moreover, we shall use the expansion (102) also to extract some “dominant terms” and get an improved bound on the rest, as we shall see below.
3.5. Proof of Theorem 2. The bounds (18) and Eqs. (19) are proved in [BM], Theorem 4.9; so it remains to prove (16) and (17). By using (102), we can write, for any k,
$$\hat G^{(2)}_\omega(k)=\sum_{j=h}^{0}\hat g^{Q,(j)}_\omega(k)+\sum_{n=1}^{\infty}\ \sum_{j_0=h-1}^{-1}\ \sum_{\substack{\tau\in T_{j_0,n,2,0}\\ |P_{v_0}|=2}}\hat G_{2,\tau}(k), \tag{107}$$
where $G_{2,\tau}=S_{2,0,\tau,\{\omega,\omega\}}$. The choice of $\bar k$ implies that, given τ, the scale of the external propagators has to be equal to h or h+1, hence $\hat G_{2,\tau}$ can be different from 0 only if the index $j_0$ in the r.h.s. of (107) (which is also the scale index of $v_0$, the lower tree vertex) takes the value h or h+1. In this case, by using the bound (103) and translation invariance, we get, by using also the fact that $Z_j/Z_{j-1}<1$ (see [BM], Theorem 4.9) and Theorem 5,
$$|\hat G_{2,\tau}(\bar k)|\le\frac{1}{L\beta}\int dx_1\,dx_2\,|G_{2,\tau}(x_1-x_2)|\le (C\bar\varepsilon)^n\,\gamma^{-h}Z_h^{-1}\prod_v\gamma^{-d_v}. \tag{108}$$
The previous considerations also imply that the only vertices of τ with $d_v=0$ have scale h or h+1, so that there is no problem in performing the sum over the scale indices in the r.h.s. of (107). Moreover, by symmetry reasons, $G_{2,\tau}=0$ if $n_\tau=1$; hence the sum over all the trees with n ≥ 1 can be bounded by $(C\bar\varepsilon)^2\gamma^{-h}Z_h^{-1}$. Finally, the terms of order 0 in the r.h.s. of (107) sum up to
$$\frac{\tilde f_{h+1}(\bar k)+\tilde f_h(\bar k)}{Z_h D_\omega(\bar k)}\,[1+O(\bar\varepsilon^2)]=\frac{1}{Z_h D_\omega(\bar k)}\,[1+O(\bar\varepsilon^2)], \tag{109}$$
which easily implies (16).
We finally prove (17). By using (102), we can write, if $p=\bar k_1-\bar k_2$ and $k=\bar k_2$,
$$\hat G^{(2,1)}_\omega(p,k)=\sum_{n=0}^{\infty}\ \sum_{j_0=h-1}^{-1}\ \sum_{\substack{\tau\in T_{j_0,n,2,1}\\ |P_{v_0}|=2}}\hat G_{2,1,\tau}(p,k), \tag{110}$$
where $G_{2,1,\tau}=S_{2,1,\tau,\{\omega,\omega,\omega\}}$.
The condition on $\bar k_1$ and $\bar k_2$ implies that, for any τ, the only vertices with $d_v=0$ have scale h or h+1. Hence, the sum over the trees with n normal endpoints of $|\hat G_{2,1,\tau}(p,k)|$ satisfies, by (103), the dimensional bound
$$\sum_{\tau:\,n_\tau=n}|\hat G_{2,1,\tau}(p,k)|\le (C\bar\varepsilon)^n\,\frac{\gamma^{-2h}}{Z_h}\,\frac{Z^{(2)}_h}{Z_h}. \tag{111}$$
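Moreover, by symmetry reasons, $\hat G_{2,1,\tau}(p,k)=0$, if $n_\tau=1$, and
$$\sum_{\tau:\,n_\tau=0}\hat G_{2,1,\tau}(p,k)=\frac{Z^{(2)}_h}{Z_h}\,Z_h\,\frac{1}{Z_h D_\omega(\bar k_1)}\,\frac{1}{Z_h D_\omega(\bar k_2)}\,[1+O(\bar\varepsilon^2)], \tag{112}$$
since $\tilde f_{h+1}(\bar k_i)+\tilde f_h(\bar k_i)=1$. Hence, we get (17).
4. The Expansion for $H^{2,1}_\omega$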
4.1. Preliminary remark. In this section we have to find an expansion for the correction term $H^{2,1}_\omega$. The definition of $H^{2,1}_\omega$ as a derivative of a functional integral, see (35)–(37), is apparently very similar to the expression for $G^{2,1}_\omega$ given by (11)–(12). In fact, the definition (11) of the generating function W(φ, J) differs from the definition (36) of W (φ, J) only because $\psi^+_{x,\omega}\psi^-_{x,\omega}$ is replaced by $T_{x,\omega}$. However, such difference is not trivial at all, because of the singularity of C(k, k + p), as ε → 0, and of $D^{-1}(p)$ at p = 0. Nevertheless, we are still able to prove the bound (29), which differs from the analogous bound for $\hat G^{2,1}$ by an extra $\lambda^2$ factor.
In order to get this result, we will define a multiscale expansion similar to the previous ones, so getting a few terms of a new kind, for which the L operation must be defined in a proper way. Correspondingly, new renormalization constants will appear, which we can prove are strictly related to $Z^{(2)}_h$; this is crucial to get (29).
4.2. Properties of $T_{x,\omega}$. We begin our analysis by studying the quantity
$$\Delta^{(i,j)}_\omega(k^+,k^-)=\frac{C(k^+,k^-)}{D_\omega(p)}\,\tilde g^{(i)}_\omega(k^+)\,\tilde g^{(j)}_\omega(k^-)=\frac{1}{Z_{i-1}Z_{j-1}}\,\frac{1}{D_\omega(p)}\left\{\frac{\tilde f^\varepsilon_i(k^+)}{D_\omega(k^+)}\left[\frac{\tilde f^\varepsilon_j(k^-)}{\chi^\varepsilon_{h,0}(k^-)}-\tilde f^\varepsilon_j(k^-)\right]-\frac{\tilde f^\varepsilon_j(k^-)}{D_\omega(k^-)}\left[\frac{\tilde f^\varepsilon_i(k^+)}{\chi^\varepsilon_{h,0}(k^+)}-\tilde f^\varepsilon_i(k^+)\right]\right\}, \tag{113}$$
where $p=k^+-k^-$. The above quantity appears in the expansion for $\hat H^{2,1}$ when both the fields of $T_{x,\omega}$ are contracted. Note first that
$$\Delta^{(i,j)}_\omega(k^+,k^-)=0,\qquad\text{if } 0>i,j>h, \tag{114}$$
since $\chi^\varepsilon_{h,0}(k^\pm)=1$, if h < i, j < 0. We will see that this property plays a crucial role; it says that, contrary to what happens for $G^{2,1}$, at least one of the two fermionic lines connected to J must have scale 0 or h.
In the cases in which $\Delta^{(i,j)}_\omega(k^+,k^-)$ is not identically equal to 0, since $\Delta^{(i,j)}_\omega(k^+,k^-)=\Delta^{(j,i)}_\omega(k^-,k^+)$, we can restrict the analysis to the case i ≥ j.
(1) If i = j = 0, by using (21), it is easy to see that the r.h.s. of (113) has a well defined limit as ε → 0, given by
$$\Delta^{(0,0)}_\omega(k^+,k^-)=\frac{1}{D_\omega(p)}\left[\frac{f_0(k^+)}{D_\omega(k^+)}\,u_0(k^-)-\frac{f_0(k^-)}{D_\omega(k^-)}\,u_0(k^+)\right], \tag{115}$$
where $u_0(k)$ is a $C^\infty$ function such that
$$u_0(k)=\begin{cases}0 & \text{if } |k|\le 1,\\ 1-f_0(k) & \text{if } 1\le |k|.\end{cases} \tag{116}$$
We want to show that
$$\Delta^{(0,0)}_\omega(k^+,k^-)=\frac{p_0\,S^{(0)}_{\omega,0}(k^+,k^-)+p\,S^{(0)}_{\omega,1}(k^+,k^-)}{D_\omega(p)}=\frac{\mathbf p}{D_\omega(\mathbf p)}\,\mathbf S^{(0)}_\omega(k^+,k^-), \tag{117}$$
where $S^{(0)}_{\omega,i}(k^+,k^-)$ are smooth functions such that
$$|\partial^{m_+}_{k^+}\partial^{m_-}_{k^-}S^{(0)}_{\omega,i}(k^+,k^-)|\le C_{m_++m_-}, \tag{118}$$
if $\partial^m_k$ denotes a generic derivative of order m with respect to the variables k and $C_m$ is a suitable constant, depending on m.
The proof of (117) is trivial if p is bounded away from 0, for example |p| ≥ 1/2. It is sufficient to remark that $\Delta^{(0,0)}_\omega(k^+,k^-)$, by the compact support properties of $f_0(k)$, is a smooth function and put $S^{(0)}_{\omega,0}=-i\Delta^{(0,0)}_\omega$, $S^{(0)}_{\omega,1}=\omega\Delta^{(0,0)}_\omega$. If |p| ≤ 1/2, we can use the identity
$$\Delta^{(0,0)}_\omega(k^+,k^-)=-\frac{f_0(k^+)u_0(k^+)}{D_\omega(k^+)D_\omega(k^-)}+\frac{\mathbf p}{D_\omega(\mathbf p)}\int_0^1 dt\,\frac{k^+-t\mathbf p}{|k^+-t\mathbf p|}\left[f_0'(k^+-t\mathbf p)\,\frac{u_0(k^+)}{D_\omega(k^-)}-u_0'(k^+-t\mathbf p)\,\frac{f_0(k^+)}{D_\omega(k^+)}\right], \tag{119}$$
from which (118) follows.
(2) If i = 0 and h ≤ j < 0, we get
$$\Delta^{(0,j)}_\omega(k^+,k^-)=-\frac{1}{Z_{j-1}}\,\frac{\tilde f_j(k^-)\,u_0(k^+)}{D_\omega(p)D_\omega(k^-)}+\delta_{j,h}\,\frac{1}{\tilde Z_{h-1}(k^-)}\,\frac{f_0(k^+)\,u_h(k^-)}{D_\omega(p)D_\omega(k^+)}, \tag{120}$$
where
$$u_h(k)=\begin{cases}0 & \text{if } |k|\ge\gamma^h,\\ 1-f_h(k) & \text{if } |k|\le\gamma^h.\end{cases} \tag{121}$$
If j < −1, the first term in the r.h.s. of (120) vanishes for $|p|\le 1-\gamma^{-1}$, since $u_0(k^+)\ne 0$ implies that $|k^+|\ge 1$, so that $|k^-|=|k^+-p|\ge 1-(1-\gamma^{-1})=\gamma^{-1}$ and, as a consequence, $\tilde f_j(k^-)=0$. Analogously, the second term in the r.h.s. of (120) vanishes for $|p|\le 1-\gamma^{-1}-\gamma^h$, since $f_0(k^+)\ne 0$ implies that $|k^+|\ge 1-\gamma^{-1}$, so that $|k^-|\ge\gamma^h$ and, as a consequence, $u_h(k^-)=0$. On the other hand, if j = −1, because $\tilde f_{-1}(k)u_0(k)=0$, we can write
$$u_0(k^+)\tilde f_{-1}(k^-)=-u_0(k^+)\,\mathbf p\int_0^1 dt\,\frac{k^+-t\mathbf p}{|k^+-t\mathbf p|}\,\tilde f_{-1}'(k^+-t\mathbf p). \tag{122}$$
It follows that
$$\Delta^{(0,j)}_\omega(k^+,k^-)=\frac{\mathbf p}{D_\omega(\mathbf p)}\,\mathbf S^{(j)}_\omega(k^+,k^-), \tag{123}$$
where $S^{(j)}_{\omega,i}(k^+,k^-)$ are smooth functions such that
$$|\partial^{m_0}_{k^+}\partial^{m_j}_{k^-}S^{(j)}_{\omega,i}(k^+,k^-)|\le C_{m_0+m_j}\,\frac{\gamma^{-j(1+m_j)}}{\tilde Z_{j-1}(k^-)},\qquad h\le j<0. \tag{124}$$
(3) If i = j = h we get
$$\Delta^{(h,h)}_\omega(k^+,k^-)=\frac{1}{D_\omega(p)}\,\frac{1}{\tilde Z_{h-1}(k^+)\tilde Z_{h-1}(k^-)}\left[\frac{f_h(k^+)u_h(k^-)}{D_\omega(k^+)}-\frac{u_h(k^+)f_h(k^-)}{D_\omega(k^-)}\right]. \tag{125}$$
Since this expression can appear only at the last integration step, it is not involved in any regularization procedure. Hence we only need its size for values of p of order $\gamma^h$ or larger. It is easy to see that
$$|\Delta^{(h,h)}_\omega(k^+,k^-)|\le\frac{C}{M}\,\frac{\gamma^{-2h}}{\tilde Z_{h-1}(k^+)\tilde Z_{h-1}(k^-)},\qquad\text{if } |p|\ge M\gamma^h. \tag{126}$$
(4) If j = h < i < −1, we get
$$\Delta^{(i,h)}_\omega(k^+,k^-)=\frac{1}{\tilde Z_{h-1}(k^-)Z_{i-1}}\,\frac{\tilde f_i(k^+)\,u_h(k^-)}{D_\omega(p)D_\omega(k^+)}, \tag{127}$$
which satisfies the bound
$$|\Delta^{(i,h)}_\omega(k^+,k^-)|\le\frac{C}{M}\,\frac{\gamma^{-h-i}}{\tilde Z_{h-1}(k^-)Z_{i-1}},\qquad\text{if } |p|\ge M\gamma^h. \tag{128}$$
4.3. The multiscale expansion of the correction term. We are now ready to begin the description of the iterative integration procedure. As in Sect. 3.4, we can write
$$e^{W(\phi,J)}=e^{-L\beta E_j}\int P_{\tilde Z_j,C^\varepsilon_{h,j}}(d\psi^{[h,j]})\,e^{-V^{(j)}(\sqrt{Z_j}\psi^{[h,j]})+K^{(j)}(\sqrt{Z_j}\psi^{[h,j]},\phi,J)}, \tag{129}$$
where $K^{(j)}(\sqrt{Z_j}\psi,\phi,J)$ denotes the sum over the terms containing at least one φ or J field; we shall write it in the form
$$K^{(j)}(\sqrt{Z_j}\psi,\phi,J)=B^{(j)}_\phi(\sqrt{Z_j}\psi)+K^{(j)}_J(\sqrt{Z_j}\psi)+\tilde W^{(j)}_R(\sqrt{Z_j}\psi,\phi,J), \tag{130}$$
where $B^{(j)}_\phi(\psi)$ and $K^{(j)}_J(\psi)$ denote the sums over the terms containing only one φ or J field, respectively. Note that $B^{(j)}_\phi(\psi)$ is the same function appearing in (89) and the action of L on it is defined exactly as before.
As in Sect. 3.4, the only terms contributing to $K^{(j)}_J(\sqrt{Z_j}\psi)$ for which the localization has to be defined different from the identity are those of second order in ψ, which behave as marginal terms; we shall denote their sum $K^{(j,2)}_J(\sqrt{Z_j}\psi)$. For j = 0, $K^{(0,2)}_J(\sqrt{Z_0}\psi)=K^{(0)}_J(\sqrt{Z_0}\psi)=\sum_\omega\int dx\,J_{x,\omega}T^{[h,0]}_{x,\omega}$, and we define the L operation on it as the identity, that is
$$L K^{(0,2)}_J(\sqrt{Z_0}\psi^{[h,0]})=K^{(0,2)}_J(\sqrt{Z_0}\psi^{[h,0]}). \tag{131}$$
Let us now analyze the structure of $K^{(-1,2)}_J(\sqrt{Z_{-1}}\psi^{[h,-1]})$, as it appears after integrating the $\psi^{(0)}$ field and rescaling $\psi^{[h,-1]}$. We have
$$K^{(-1,2)}_J(\psi)=\frac{1}{Z_{-1}}\sum_\omega\int dx\,J_{x,\omega}\Big\{T_{x,\omega}+\sum_{\tilde\omega}\int dy\,dz\,\big[F^{(-1)}_{2,\omega,\tilde\omega}(x,y,z)+\delta_{\omega,\tilde\omega}F^{(-1)}_{1,\omega}(x,y,z)\big]\psi^+_{y,\tilde\omega}\psi^-_{z,\tilde\omega}\Big\}. \tag{132}$$
$F^{(-1)}_{2,\omega,\tilde\omega}$ denotes the sum over all Feynman graphs built by contracting both ψ fields of $T_{x,\omega}$ (on scale 0) and by choosing equal to $\tilde\omega=\pm 1$ the ω-index of the two external ψ fields. $F^{(-1)}_{1,\omega}$ represents the sum over the graphs built by leaving external one of these ψ fields of $T_{x,\omega}$. See Fig. 5, where the J field and the external ψ fields are represented as dashed lines and the small circle represents the non-local kernel of $T_{x,\omega}$.
Fig. 5. Graphical representation of Eq. (132)
It is easy to see that the Fourier transform of $F^{(-1)}_{2,\omega,\tilde\omega}$ can be written, if we choose the momenta $k^+$ and $k^-$ of the ψ external fields as independent variables, as
$$\hat F^{(-1)}_{2,\omega,\tilde\omega}(k^+,k^-)=\frac{\mathbf p}{D_\omega(\mathbf p)}\int d\tilde k^+\,\mathbf S_\omega(\tilde k^+,\tilde k^+-p)\,G^{(-1)}_{\omega,\tilde\omega}(\tilde k^+,k^+,k^-), \tag{133}$$
where $\mathbf S(k^+,k^-)$ is given by (117), $p=k^+-k^-$ and $G^{(-1)}_{\omega,\tilde\omega}(\tilde k^+,k^+,k^-)$ is of the form $G^{(-1)}_{\omega,\tilde\omega}(\tilde k^+,k^+,k^-)=G_0(\tilde k^+,k^+,k^-)+G_1(k^+)G_2(k^-)\delta(\tilde k^+-k^+)$, where $G_0$ represents a suitable sum over connected graphs with four external lines, while $G_1$ and $G_2$ represent suitable sums over connected graphs with two external lines. Note that all these three functions can be written, at order n of perturbation theory, as sums of $C^n$ terms, each term being represented as a truncated expectation of ψ monomials, which can then be expanded as a sum over tree graphs of suitable determinants, thanks to the anticommuting properties of the Grassmannian variables [Le]; hence the argument in [BM] to avoid bad factorials in the bounds can be used.
$G^{(-1)}_{\omega,\tilde\omega}$ has special symmetry properties, which it is very important to exploit. Consider first the case $\tilde\omega=\omega$; then each term contributing to $G^{(-1)}_{\omega,\omega}$ is obtained by taking n interaction terms (each having two ψ fields of type ω and two of type −ω) and by building a graph with four external lines, two of type ω and two of type $\tilde\omega$. It follows that n ≥ 2 and that in the graph there are (2n − 4)/2 propagators of type ω and (2n)/2 propagators of type −ω. By using the symmetry property of the propagators (99), one gets
$$G^{(-1)}_{\omega,\omega}(\tilde k^+,k^+,k^-)=-G^{(-1)}_{\omega,\omega}(\tilde k^{+*},k^{+*},k^{-*}). \tag{134}$$
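In a similar way, one can check that
$$G^{(-1)}_{\omega,-\omega}(\tilde k^+,k^+,k^-)=G^{(-1)}_{\omega,-\omega}(\tilde k^{+*},k^{+*},k^{-*}), \tag{135}$$
$$\mathbf p\cdot\mathbf S_\omega(k^+,k^-)=-i\omega\,\mathbf p^*\cdot\mathbf S_\omega(k^{+*},k^{-*}). \tag{136}$$
Equations (133)–(136) imply that
$$\hat F^{(-1)}_{2,\omega,\tilde\omega}(k^+,k^-)=\frac{1}{D_\omega(p)}\,\big[p_0\,\hat A^{(-1)}_{\omega,\tilde\omega,0}(k^+,k^-)+p\,\hat A^{(-1)}_{\omega,\tilde\omega,1}(k^+,k^-)\big], \tag{137}$$
where $\hat A^{(-1)}_{\omega,\tilde\omega,i}(k^+,k^-)$ are smooth functions verifying the condition
$$\hat A^{(-1)}_{\omega,\tilde\omega,1}(k^+,k^-)=i\tilde\omega\,\hat A^{(-1)}_{\omega,\tilde\omega,0}(k^{+*},k^{-*}). \tag{138}$$
It follows that, if we define (in the L = β = ∞ limit, see the discussion after (97))
$$L\hat F^{(-1)}_{2,\omega,\tilde\omega}(k^+,k^-)=\frac{1}{D_\omega(p)}\,\big[p_0\,\hat A^{(-1)}_{\omega,\tilde\omega,0}(0,0)+p\,\hat A^{(-1)}_{\omega,\tilde\omega,1}(0,0)\big], \tag{139}$$
then
$$L\hat F^{(-1)}_{2,\omega,\omega}(k^+,k^-)=Z^{(3,+)}_{-1}, \tag{140}$$
$$L\hat F^{(-1)}_{2,\omega,-\omega}(k^+,k^-)=\frac{D_{-\omega}(p)}{D_\omega(p)}\,Z^{(3,-)}_{-1}, \tag{141}$$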
where $Z^{(3,+)}_{-1}=i\hat A^{(-1)}_{\omega,\omega,0}(0,0)$ and $Z^{(3,-)}_{-1}=i\hat A^{(-1)}_{\omega,-\omega,0}(0,0)$ are constants, which one can easily show to be real.
The action of L on $F^{(-1)}_{2,\omega,\tilde\omega}$ was given above in momentum space; it is however very easy to write it in coordinate space. The support properties of the external propagator imply that $|p|\le\gamma+\gamma^h$, hence $|p|\le\gamma^2$, if $\gamma^h$ is small enough, as we shall suppose (to simplify the notation). Then we can freely multiply $\Delta^{(i,j)}_\omega(k^+,k^-)$ by $\chi_0(\gamma^{-2}|k^+-k^-|)$. Hence, in space-time coordinates, the contribution to $K^{(-1,2)}_J(\psi)$ containing $F^{(-1)}_{2,\omega,\tilde\omega}$ can be written, by using the representation (137), as
$$\frac{1}{Z_{-1}}\sum_{\omega,\tilde\omega}\int dx\,J_{x,\omega}\int dx'\,\mathbf V_\omega(x-x')\int dy\,dz\,\mathbf A^{(-1)}_{\omega,\tilde\omega}(x',y,z)\,\psi^+_{y,\tilde\omega}\psi^-_{z,\tilde\omega}, \tag{142}$$
where
$$\mathbf V_\omega(x)=\int\frac{dp}{(2\pi)^2}\,\frac{\mathbf p}{D_\omega(\mathbf p)}\,\chi_0(\gamma^{-2}|p|)\,e^{ipx},\qquad \mathbf A^{(-1)}_{\omega,\tilde\omega}(x',y,z)=\int\frac{dk^+\,dk^-}{(2\pi)^4}\,\chi_0(\gamma^{-2}|k^+-k^-|)\,e^{ik^+(x'-y)-ik^-(x'-z)}\,\hat{\mathbf A}^{(-1)}_{\omega,\tilde\omega}(k^+,k^-). \tag{143}$$
It follows that the operation L can be described as the localization of the ψ fields in the point $x'$ and that the corresponding R operation produces a term with $\psi^+_{y,\tilde\omega}\psi^-_{z,\tilde\omega}-\psi^+_{x',\tilde\omega}\psi^-_{x',\tilde\omega}$ in place of $\psi^+_{y,\tilde\omega}\psi^-_{z,\tilde\omega}$. We can then apply the argument following (78) to explain the regularization effect of R, since $\hat A^{(-1)}_{\omega,\tilde\omega,i}(k^+,k^-)$ are smooth functions.
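The passage from (138)–(139) to (140)–(141) is a one-line computation once a form of $D_\omega$ is fixed. The sketch below checks it numerically under the assumption $D_\omega(p)=-ip_0+\omega p$; the value chosen for $\hat A^{(-1)}_{\omega,\tilde\omega,0}(0,0)$ is arbitrary and purely illustrative.

```python
def D(omega, p0, p):
    # Assumed form D_omega(p) = -i*p0 + omega*p.
    return complex(omega * p, -p0)

def local_part(omega, omega_tilde, A0, p0, p):
    # Right-hand side of (139), with A1 fixed by (138) at zero momenta:
    # A1(0,0) = i * omega_tilde * A0(0,0).
    A1 = 1j * omega_tilde * A0
    return (p0 * A0 + p * A1) / D(omega, p0, p)

def check(omega=1, A0=-0.3j, p0=0.37, p=-1.2):
    # Illustrative value of A0; i*A0 plays the role of Z^{(3,+-)}_{-1}.
    same = local_part(omega, omega, A0, p0, p)        # case omega_tilde = omega
    opposite = local_part(omega, -omega, A0, p0, p)   # case omega_tilde = -omega
    z = 1j * A0
    expected_opposite = z * D(-omega, p0, p) / D(omega, p0, p)
    return same, z, opposite, expected_opposite
```

With $\tilde\omega=\omega$ the momentum dependence cancels and a constant is left, as in (140); with $\tilde\omega=-\omega$ the ratio $D_{-\omega}(p)/D_\omega(p)$ survives, as in (141).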
If λ is small enough, the size of $Z^{(3,+)}_{-1}$ and $Z^{(3,-)}_{-1}$ is determined by the contributions of lower order in their expansions in powers of λ. It is easy to see that $Z^{(3,+)}_{-1}$ is of order $\lambda^2$ and that there is only one graph of that order different from zero in its expansion, that on the left of Fig. 6, while $Z^{(3,-)}_{-1}$ is of order λ and the only corresponding first order graph is represented on the right of the picture.
Fig. 6. Terms of lower order contributing to $Z^{(3,+)}_{-1}$ and $Z^{(3,-)}_{-1}$.
By an explicit calculation, one can check that the contributions to $Z^{(3,+)}_{-1}$ and $Z^{(3,-)}_{-1}$ of the two graphs are different from zero. Hence
$$Z^{(3,+)}_{-1}=-c_+\lambda^2<0,\qquad Z^{(3,-)}_{-1}=\frac{\lambda}{4\pi}. \tag{144}$$
We now consider the contribution to $F^{(-1)}_{1,\omega}(x,y,z)$ associated with the third term in Fig. 5. Its Fourier transform can be written as
$$\hat F^{(-1,+)}_{1,\omega}(k^+,k^-)=\frac{[C^\varepsilon_{h,0}(k^-)-1]\,D_\omega(k^-)\,\hat g^{(0)}_\omega(k^+)-u_0(k^+)}{D_\omega(p)}\,G^{(2)}_\omega(k^+), \tag{145}$$
where $G^{(2)}_\omega(k^+)$ represents the sum over the connected Feynman graphs with propagator $g^{(0)}$ and two external lines. Since, by symmetry reasons, $G^{(2)}_\omega(0)=0$, the simplest way to regularize $F^{(-1,+)}_{1,\omega}$ is to define
$$R\hat F^{(-1,+)}_{1,\omega}(k^+,k^-)=\frac{[C^\varepsilon_{h,0}(k^-)-1]\,D_\omega(k^-)\,\hat g^{(0)}_\omega(k^+)-u_0(k^+)}{D_\omega(p)}\,\big[G^{(2)}_\omega(k^+)-G^{(2)}_\omega(0)\big], \tag{146}$$
whose corresponding local part is vanishing. In other words the dimensional gain is here obtained without the introduction of a renormalization constant. Note that there is a simple description of this operation in terms of a localization operation on the ψ fields, as in the remark following (141). A similar procedure can be defined for the contribution to $F^{(-1)}_{1,\omega}(x,y,z)$ associated with the fourth term in Fig. 5.
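We can summarize the previous discussion by defining
$$L K^{(-1,2)}_J(\psi)=\sum_\omega\int dx\,\Big\{J_{x,\omega}\Big[\frac{T_{x,\omega}}{Z_{-1}}+\frac{Z^{(3,+)}_{-1}}{Z_{-1}}\,\psi^+_{x,\omega}\psi^-_{x,\omega}\Big]+\frac{Z^{(3,-)}_{-1}}{Z_{-1}}\,J^{(-)}_{x,\omega}\,\psi^+_{x,-\omega}\psi^-_{x,-\omega}\Big\}, \tag{147}$$
where $J^{(-)}_{x,\omega}$ is the Fourier transform of
$$\hat J^{(-)}_{p,\omega}=\frac{D_{-\omega}(p)}{D_\omega(p)}\,\hat J_{p,\omega}. \tag{148}$$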
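The multiplier in (148) is bounded but not smooth at p = 0. A minimal numerical sketch of this fact follows, again assuming $D_\omega(p)=-ip_0+\omega p$ (an assumption, chosen to be consistent with (140)–(141)): the ratio has modulus one everywhere, is invariant under rescaling of p, and takes different values along different directions, so it has no limit as p → 0.

```python
def D(omega, p0, p):
    # Assumed form D_omega(p) = -i*p0 + omega*p.
    return complex(omega * p, -p0)

def multiplier(omega, p0, p):
    # The factor D_{-omega}(p) / D_omega(p) appearing in (148).
    return D(-omega, p0, p) / D(omega, p0, p)

def directional_values(omega=1, radius=1e-8):
    # Values of the multiplier very close to p = 0 along the two axes.
    along_p0 = multiplier(omega, radius, 0.0)   # p = 0, p0 != 0
    along_p = multiplier(omega, 0.0, radius)    # p0 = 0, p != 0
    return along_p0, along_p
```

Along the $p_0$ axis the ratio equals 1, along the p axis it equals −1, for every radius; this direction dependence is what prevents the function from being continuous at the origin, while its modulus stays equal to 1.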
Equation (147) implies that the integration of the scale j = −1 has to take into account two new local terms, to be called of type $Z^+$ and of type $Z^-$, similar to those introduced in Sect. 3.4 to analyze the Schwinger functions, see (100). There is however an important difference in the term of type $Z^-$, related with the fact that, in this term, we have absorbed in the external J field a bounded but not smooth function of p, in order to avoid that it is involved in the regularization operations.
We can now describe the general step, by defining the action of L on $K^{(j,2)}_J(\psi)$, which can be written, if j < −1, after rescaling $\psi^{[h,j]}$, as
$$K^{(j,2)}_J(\psi)=\frac{1}{Z_j}\sum_\omega\int dx\,\Big\{J_{x,\omega}T_{x,\omega}+\sum_{\tilde\omega}\int dy\,dz\,\big[J_{x,\omega}F^{(j)}_{Z^+,\omega,\tilde\omega}(x,y,z)+J^{(-)}_{x,\omega}F^{(j)}_{Z^-,\omega,\tilde\omega}(x,y,z)+J_{x,\omega}F^{(j)}_{2,\omega,\tilde\omega}(x,y,z)+\delta_{\omega,\tilde\omega}J_{x,\omega}F^{(j)}_{1,\omega}(x,y,z)\big]\psi^+_{y,\tilde\omega}\psi^-_{z,\tilde\omega}\Big\}, \tag{149}$$
where $F^{(j)}_{Z^\pm,\omega,\tilde\omega}$ represents the sum over all graphs with one vertex of type $Z^\pm$ and two ψ external fields of type $\tilde\omega$, built by using propagators of scale i ∈ [j + 1, −1], $F^{(j)}_{2,\omega,\tilde\omega}$ is the sum over the same kind of graphs with one vertex $T_{x,\omega}$, whose ψ fields are both contracted, and $F^{(j)}_{1,\omega}$ is the sum over the graphs with one vertex $T_{x,\omega}$, such that one of its ψ fields is external.
It is important to stress that, thanks to the identity (114), given a graph contributing to $F^{(j)}_{2,\omega,\tilde\omega}$, at least one among the ψ fields belonging to $T_{x,\omega}$ is contracted on scale 0, so that we can write
$$\hat F^{(j)}_{2,\omega,\tilde\omega}(k^+,k^-)=\frac{\mathbf p}{D_\omega(\mathbf p)}\sum_{i=j}^{0}\int d\tilde k^+\,\tilde{\mathbf S}^{(i)}_\omega(\tilde k^+,\tilde k^+-p)\,G^{(i)}_{\omega,\tilde\omega}(\tilde k^+,k^+,k^-), \tag{150}$$
where $\tilde{\mathbf S}^{(i)}(k^+,k^-)$ is given by (117) for i = 0 and (123) for i < 0, if no derivative acts on $\mathbf S^{(i)}(k^+,k^-)$ as a consequence of the regularization on a scale r such that j < r ≤ i, otherwise it is given by a suitable derivative of $\mathbf S^{(i)}(k^+,k^-)$. Moreover, $G^{(i)}_{\omega,\tilde\omega}(\tilde k^+,k^+,k^-)$ is a suitable smooth function, which can be expressed as a sum over products of propagators $\tilde g^{(i')}$, i′ ∈ [j + 1, 0], or their derivatives, integrated over suitable loop variables. Hence we can extend to the case j < −1 the definition of L given for
j = −1, for what concerns its action on all terms in the r.h.s. of (149) except $F^{(j)}_{Z^+,\omega,\tilde\omega}$ and $F^{(j)}_{Z^-,\omega,\tilde\omega}$, for which we put (for L = β = ∞, see otherwise the discussion after (99))
$$L\hat F^{(j)}_{Z^\pm,\omega,\tilde\omega}(k^+,k^-)=\hat F^{(j)}_{Z^\pm,\omega,\tilde\omega}(0,0). \tag{151}$$
Note that $\tilde{\mathbf S}^{(i)}(k^+,k^-)$ is a smooth function, by our definition of $R\hat F^{(j)}_{1,\omega}$, which generalizes (146). And, by the same argument leading to (98), we have
$$\hat F^{(j)}_{Z^+,\omega,-\omega}(0,0)=\hat F^{(j)}_{Z^-,\omega,\omega}(0,0)=0. \tag{152}$$
It follows that we can write
$$L K^{(j,2)}_J(\psi)=\sum_\omega\int dx\,\Big\{J_{x,\omega}\Big[\frac{T_{x,\omega}}{Z_j}+\frac{Z^{(3,+)}_j}{Z_j}\,\psi^+_{x,\omega}\psi^-_{x,\omega}\Big]+\frac{Z^{(3,-)}_j}{Z_j}\,J^{(-)}_{x,\omega}\,\psi^+_{x,-\omega}\psi^-_{x,-\omega}\Big\}, \tag{153}$$
which defines the new renormalization constants $Z^{(3,+)}_j$ and $Z^{(3,-)}_j$, for j ∈ [h, −1].
4.4. The bounds. The previous considerations allow us to define a tree expansion for $H^{2,1}_\omega$, similar to that used for $G^{2,1}_\omega$ in Sect. 3 and described in Sect. 3.4 after (103). The only important difference is that we have now three different special endpoints associated with the field J, corresponding to the three different terms in the r.h.s. of (153); we shall call these endpoints of type J and subtype T, $Z^+$ and $Z^-$, respectively.
There is of course a tree expansion also for the renormalization constants $Z^{(3,+)}_j$ and $Z^{(3,-)}_j$, involving trees with root at scale j − 1, one endpoint of type J, $|P_{v_0}|=2$ and the operation L acting on $v_0$. One can show in the usual way that, given a tree τ with n normal endpoints and the special endpoint of subtype $Z^\pm$ and scale i + 1, its contribution $Z^{(3,\pm)}_\tau$ to $Z^{(3,\pm)}_j$ satisfies the bound
$$|Z^{(3,\pm)}_\tau|\le (C\bar\varepsilon)^n\,|Z^{(3,\pm)}_i|\prod_{v\in\tau}\gamma^{-d_v},\qquad\text{with } d_v\ge 1,\ \forall v\in\tau. \tag{154}$$
A similar bound is satisfied if the special endpoint is of subtype T, without the $|Z^{(3,\pm)}_i|$ factor. However, in this case, the scale index of the special endpoint has to be equal to 1, because of the properties of the function $\Delta^{(i,j)}_\omega$ described in Sect. 4.2. Therefore there is a path C in the tree connecting the special endpoint with $v_0$ and we can extract from $\prod_{v\in\tau}\gamma^{-d_v}$ a small factor $\gamma^{-1/2}$ for each v ∈ C, without losing the summability properties of the bound; hence we write
$$|Z^{(3,\pm)}_\tau|\le (C\bar\varepsilon)^n\,\gamma^{j/2}\prod_{v\in C}\gamma^{-(d_v-\frac12)}\prod_{v\in\tau\setminus C}\gamma^{-d_v},\qquad\text{with } d_v\ge 1,\ \forall v\in\tau. \tag{155}$$
Another important property, following from (151) and (152), is that $Z^{(3,+)}_\tau=0$, if the special endpoint is of subtype $Z^-$, and vice versa. Hence we can write, if j ∈ [h + 1, −1],
$$Z^{(3,\pm)}_{j-1}=Z^{(3,\pm)}_j+\sum_{i=j}^{-1}\beta_{j,i}\,Z^{(3,\pm)}_i+\tilde\beta^{(3,\pm)}_j, \tag{156}$$
where $\beta_{j,i}Z^{(3,\pm)}_i$ is the sum over the contributions associated with trees whose special endpoint is of subtype $Z^\pm$ and scale i + 1, while $\tilde\beta^{(3,\pm)}_j$ is the sum over the trees whose special endpoint is of subtype T. Note that $\beta_{j,i}$ is equal for $Z^{(3,+)}_{j-1}$ and $Z^{(3,-)}_{j-1}$, by the symmetry of the interaction (9) under the transformation ω → −ω.
Let us now observe that the trees contributing to $\tilde\beta^{(3,+)}_j$ have at least two normal endpoints, since it is not possible to build a graph contributing to $Z^{(3,+)}_j$ with only one endpoint of type λ and the local part of the graphs with one endpoint of type δ is equal to zero. This last property is of course true also for $Z^{(3,-)}_j$, but in this case it is possible to build a graph contributing to it with one endpoint of type λ, see Fig. 6. However, the considerations of Sect. 4.2, item (2), imply that this graph could give a contribution different from zero to $\tilde\beta^{(3,-)}_j$ only for j = −1, but also in this case a simple explicit calculation implies that its value is zero.
By using (155) and the previous remark, one can easily show that
$$|\tilde\beta^{(3,\pm)}_j|\le C\bar\varepsilon^2\gamma^{j/2}. \tag{157}$$
In a similar way one can prove that
$$|\beta_{j,i}|\le C\bar\varepsilon^2\gamma^{-\frac{i-j}{2}}. \tag{158}$$
We want to compare the flow equation (156) with the flow equation of the renormalization constant $Z^{(2)}_j$ introduced in Sect. 3.4 to study the Schwinger functions, see (100). In this case the involved trees have one endpoint of type J, which can have scale ≤ +1, while in the previous case the scale of the special endpoints of subtype $Z^\pm$ was ≤ 0. However, if the scale of the special endpoint is ≤ 0, the contribution of the corresponding trees is equal to $\beta_{j,i}Z^{(2)}_i$, where $\beta_{j,i}$ is the same number appearing in (156). Hence we can write, if j ∈ [h + 1, −1],
$$Z^{(2)}_{j-1}=Z^{(2)}_j+\sum_{i=j}^{-1}\beta_{j,i}\,Z^{(2)}_i+\tilde\beta^{(2)}_j, \tag{159}$$
where $\beta_{j,i}Z^{(2)}_i$ is the sum over the contributions to $Z^{(2)}_j$ associated with trees whose special endpoint has scale i + 1 ≤ 0, while $\tilde\beta^{(2)}_j$ is the sum over the trees whose special endpoint has scale +1. By proceeding as in the proof of (157), one can show that
$$|\tilde\beta^{(2)}_j|\le C\bar\varepsilon^2\gamma^{j/2}. \tag{160}$$
As we shall see, the renormalization constants $Z^{(3,\pm)}_j$ and $Z^{(2)}_j$ are divergent as j → −∞, but $Z^{(3,\pm)}_j/Z^{(2)}_j$ is bounded and of order $\bar\varepsilon^2$, uniformly in j.
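The mechanism behind this claim can be seen in a toy version of the flow equations (156) and (159): the two recursions share the same linear coefficients $\beta_{j,i}$ and differ only in their initial data and in inhomogeneous terms that decay like $\gamma^{j/2}$. The sketch below is purely illustrative; the coefficients are hypothetical model values that saturate the bounds (157), (158) and (160), not the actual beta function, and the initial values are chosen only to mimic the orders of magnitude in the text.

```python
def flow(h, z_start, gamma=2.0, eps=0.1, c=1.0):
    # Iterate Z_{j-1} = Z_j + sum_{i=j}^{-1} beta_{j,i} Z_i + beta_tilde_j
    # for j = -1, -2, ..., h+1, with the hypothetical model choices
    #   beta_{j,i}   = c * eps^2 * gamma^{-(i-j)/2},
    #   beta_tilde_j = eps^2 * gamma^{j/2}.
    Z = {-1: z_start}
    for j in range(-1, h, -1):
        memory = sum(c * eps**2 * gamma ** (-(i - j) / 2) * Z[i]
                     for i in range(j, 0))
        Z[j - 1] = Z[j] + memory + eps**2 * gamma ** (j / 2)
    return Z

def ratios(h, eps=0.1):
    # Toy analogue of Z^{(3)}_j / Z^{(2)}_j: same flow, different start.
    z2 = flow(h, 1.0, eps=eps)        # starts at a value of order 1
    z3 = flow(h, eps**2, eps=eps)     # starts at a value of order eps^2
    return {j: z3[j] / z2[j] for j in z2}
```

In this toy model both sequences grow as j decreases, while their ratio settles to a constant; for instance, `ratios(-80)` stays strictly between 0 and 1 and changes by a negligible amount between the last two scales. This only illustrates the structure of the argument and is not a substitute for the proof of Lemma 1.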
Lemma 1. If $\bar\varepsilon\le 2|\lambda|$ and |λ| is small enough, there is a constant $c_0$, independent of j and h, such that
$$c_0\lambda^2\le\left|\frac{Z^{(3,+)}_j}{Z^{(2)}_j}\right|\le 2c_0\lambda^2,\qquad c_0|\lambda|\le\left|\frac{Z^{(3,-)}_j}{Z^{(2)}_j}\right|\le 2c_0|\lambda|,\qquad j\in[h,-1]. \tag{161}$$
Proof. Let us consider first $Z^{(3,+)}_j$. In order to control its dependence on j, we have to analyze in a different way the regions j ∈ [j0, −1] and j < j0, with j0 chosen so that
$$\gamma^{j_0/2}=c_1|\lambda|^2, \tag{162}$$
for some constant $c_1$. If j ≥ j0, we put $Z^{(3,+)}_j=a_j+b_j$, where $a_j$ is the contribution of order $\lambda^2$, while $b_j$ is the sum over the terms of order ≥ 3. The analysis of Sect. 4.2 implies that $a_j$ is obtained by applying the L operation to the graph on the left of Fig. 6, with the two propagators of type ω on scale 0 or −1 (by the remark after (121), the local part is zero, if this condition is not satisfied) and the two propagators of type −ω on scale i ∈ [j + 1, 0]. By an explicit calculation, we can show that
$$c_2\lambda^2\le -a_j\le 2c_2\lambda^2,\qquad\text{uniformly in } j. \tag{163}$$
On the other hand, if we extract from both sides of (156) the terms of order $\bar\varepsilon^2$, we get
$$b_{j-1}=b_j+\sum_{i=j}^{-1}\beta_{j,i}\,Z^{(3,+)}_i+\bar\beta_j,\qquad |\bar\beta_j|\le C\bar\varepsilon^3\gamma^{j/2}, \tag{164}$$
which allows very easily to prove the bound |bj| ≤ c3 ε̄³(1 + c2 ε̄)|j|, for some constant c3, as far as c3 ε̄(1 + c2 ε̄)|j| ≤ c2/2, a condition which is certainly satisfied for j ≥ j0, if ε̄ is small enough, and allows also to prove, under the further hypothesis ε̄ ≤ 2|λ|, that
$$\frac{c_2}{2}\,\bar\varepsilon^2\le |Z^{(3,+)}_j|\le\frac{5c_2}{2}\,\bar\varepsilon^2,\qquad j\in[j_0,-1]. \tag{165}$$
Moreover, by using (159), the fact that $Z^{(2)}_0=1$ and an explicit second order calculation, one can show very easily by induction that there exists a positive constant $c_4$ such that
$$\gamma^{c_4\bar\varepsilon^2}\le\frac{Z^{(2)}_{j-1}}{Z^{(2)}_j}\le\gamma^{2c_4\bar\varepsilon^2}\ \Rightarrow\ \gamma^{c_4\bar\varepsilon^2|j|}\le Z^{(2)}_j\le\gamma^{2c_4\bar\varepsilon^2|j|},\qquad j\in[h+1,-1]. \tag{166}$$
Equations (165) and (166) immediately imply the first of the bounds (161), if j ≥ j0 and |λ| ≥ ε̄/2 is so small that $\gamma^{C\bar\varepsilon j_0}\ge 1/2$.
Let us now suppose that j < j0. Equation (156) can be rewritten in the form
$$Z^{(3,+)}_{j-1}=Z^{(3,+)}_j+\sum_{i=j}^{j_0}\beta_{j,i}\,Z^{(3,+)}_i+\bar\beta_j,\qquad \bar\beta_j=\sum_{i=j_0+1}^{-1}\beta_{j,i}\,Z^{(3,+)}_i+\tilde\beta^{(3,+)}_j. \tag{167}$$
By using (157), (158), (162) and (165), we get the bound
$$|\bar\beta_j|\le C\bar\varepsilon^4\gamma^{-(j_0-j)/2}, \tag{168}$$
which allows to prove by induction that there is a constant, which can of course be chosen equal to the constant $c_4$ of (166) if it is large enough, such that, if j ∈ [h + 1, j0],
$$\gamma^{c_4\bar\varepsilon^2}\le\frac{Z^{(3,+)}_{j-1}}{Z^{(3,+)}_j}\le\gamma^{2c_4\bar\varepsilon^2}\ \Rightarrow\ \frac{c_2}{2}\,\bar\varepsilon^2\gamma^{c_4\bar\varepsilon^2(j_0-j)}\le |Z^{(3,+)}_j|\le\frac{5c_2}{2}\,\bar\varepsilon^2\gamma^{2c_4\bar\varepsilon^2(j_0-j)}. \tag{169}$$
We want now to show that, if r < j0, there exists a constant $c_5$ such that
$$\left|\frac{Z^{(3,+)}_{r-1}}{Z^{(3,+)}_r}-\frac{Z^{(2)}_{r-1}}{Z^{(2)}_r}\right|\le c_5\bar\varepsilon^2\gamma^{-(j_0-r)/4}. \tag{170}$$
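Note that, by (159), (166), (167) and (169), if j < j0,
$$\frac{Z^{(3,+)}_{j-1}}{Z^{(3,+)}_j}-\frac{Z^{(2)}_{j-1}}{Z^{(2)}_j}=\sum_{i=j}^{j_0}\beta_{j,i}\left[\frac{Z^{(3,+)}_i}{Z^{(3,+)}_j}-\frac{Z^{(2)}_i}{Z^{(2)}_j}\right]+\eta_j, \tag{171}$$
with
$$|\eta_j|\le C\bar\varepsilon^2\gamma^{-(j_0-j)/2}\le\frac{c_5}{2}\,\bar\varepsilon^2\gamma^{-(j_0-j)/2}, \tag{172}$$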
if $c_5$ is chosen large enough. Hence, the bound (170) follows immediately from (171), if r = j0; let us suppose that it is satisfied for r ∈ [j + 1, j0], j ≤ j0 − 1 and note that, if ε̄ is small enough, by the first of (166) and (169), $|Z^{(3,+)}_r/Z^{(3,+)}_{r-1}-Z^{(2)}_r/Z^{(2)}_{r-1}|\le |Z^{(3,+)}_{r-1}/Z^{(3,+)}_r-Z^{(2)}_{r-1}/Z^{(2)}_r|$, since $\gamma^{c_4\bar\varepsilon^2}>1$. Hence, if i ∈ [j + 1, j0],
$$\left|\frac{Z^{(3,+)}_i}{Z^{(3,+)}_j}-\frac{Z^{(2)}_i}{Z^{(2)}_j}\right|=\left|\prod_{r=j+1}^{i}\frac{Z^{(3,+)}_r}{Z^{(3,+)}_{r-1}}-\prod_{r=j+1}^{i}\frac{Z^{(2)}_r}{Z^{(2)}_{r-1}}\right|\le\sum_{r=j+1}^{i}c_5\bar\varepsilon^2\gamma^{-(j_0-r)/4}\,\gamma^{-c_4\bar\varepsilon^2(i-j-1)}\le c_5c_6\,\bar\varepsilon^2\,(i-j)\,\gamma^{-(j_0-i)/4}, \tag{173}$$
for some constant $c_6$. This bound, together with (158), (171) and (172), implies that there exists a constant $c_7$ such that
$$\left|\frac{Z^{(3,+)}_{j-1}}{Z^{(3,+)}_j}-\frac{Z^{(2)}_{j-1}}{Z^{(2)}_j}\right|\le c_5\bar\varepsilon^2\gamma^{-(j_0-j)/4}\Big[\frac12+c_7\bar\varepsilon^2\Big], \tag{174}$$
which implies (170) for r = j, if ε̄ is small enough.
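The first inequality in (173) is an instance of the standard telescoping identity for a difference of products, $\prod_r a_r-\prod_r b_r=\sum_r\big(\prod_{s<r}a_s\big)(a_r-b_r)\big(\prod_{s>r}b_s\big)$, combined with the fact that all the ratios involved are smaller than 1. The sketch below checks the identity numerically; the sampled ratios are hypothetical values of the form $\gamma^{-c_4\bar\varepsilon^2 t}$ with t ∈ [1, 2], chosen only to mimic the ranges in (166) and (169).

```python
import math
import random

def telescoped_difference(a, b):
    # sum_r (prod_{s<r} a_s) * (a_r - b_r) * (prod_{s>r} b_s),
    # which equals prod(a) - prod(b).
    total = 0.0
    for r in range(len(a)):
        total += math.prod(a[:r]) * (a[r] - b[r]) * math.prod(b[r + 1:])
    return total

def sample_ratios(n, gamma=2.0, c4=1.0, eps=0.1, seed=0):
    # Hypothetical ratios Z_r / Z_{r-1} lying in
    # [gamma^{-2 c4 eps^2}, gamma^{-c4 eps^2}], hence all below 1.
    rng = random.Random(seed)
    return [gamma ** (-c4 * eps**2 * rng.uniform(1.0, 2.0)) for _ in range(n)]
```

Since every partial product of numbers in (0, 1) is itself at most 1, the identity immediately gives $|\prod_r a_r-\prod_r b_r|\le\sum_r|a_r-b_r|$, which is the form used to pass from the single-scale bound (170) to (173).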
By using (166), (169) and (170), we get, if i < j0,
$$\left|\frac{Z^{(3,+)}_{i-1}}{Z^{(2)}_{i-1}}-\frac{Z^{(3,+)}_i}{Z^{(2)}_i}\right|\le\left|\frac{Z^{(3,+)}_i}{Z^{(2)}_{i-1}}\right|\left|\frac{Z^{(3,+)}_{i-1}}{Z^{(3,+)}_i}-\frac{Z^{(2)}_{i-1}}{Z^{(2)}_i}\right|\le C\bar\varepsilon^4\gamma^{-(j_0-i)/4+c_4\bar\varepsilon^2(j_0-i)}\le C\bar\varepsilon^4\gamma^{-(j_0-i)/8}, \tag{175}$$
so that
$$\left|\frac{Z^{(3,+)}_j}{Z^{(2)}_j}-\frac{Z^{(3,+)}_{j_0}}{Z^{(2)}_{j_0}}\right|\le C\bar\varepsilon^4\sum_{i=j+1}^{j_0}\gamma^{-(j_0-i)/8}\le C\bar\varepsilon^4, \tag{176}$$
which implies (161) for $Z^{(3,+)}_j/Z^{(2)}_j$, if ε̄ ≤ 2|λ|.
The proof for $Z^{(3,-)}_j/Z^{(2)}_j$ is very similar. However, one can avoid the different treatment of the regions j ≥ j0 and j < j0, since $\tilde\beta^{(3,-)}_j/Z^{(3,-)}_j$ is of order ε̄, by the bound on the right of (143). This bound also justifies the presence of ε̄ in place of $\bar\varepsilon^2$.
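Lemma 2. If $\bar k_1=-\bar k_2=\bar k$ and $|\bar k|=\gamma^h$, then there exists a constant C such that
$$C\gamma^{-2h}\,\bar\varepsilon\,\frac{Z^{(2)}_h}{(Z_h)^2}\le |\hat H^{2,1}_\omega(\bar k_1-\bar k_2,\bar k_2)|\le 2C\gamma^{-2h}\,\bar\varepsilon\,\frac{Z^{(2)}_h}{(Z_h)^2}. \tag{177}$$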
Proof. As explained at the beginning of Sect. 4.4, $\hat H^{2,1}_\omega(p,k)$ admits an expansion similar to that of $\hat G^{(2,1)}_\omega(p,k)$, see (110). The main difference is that the special endpoint of type J can be of three different subtypes, so that it is convenient to write
$$\hat H^{2,1}_\omega(p,k)=\hat H^{2,1}_{\omega,Z^+}(p,k)+\hat H^{2,1}_{\omega,Z^-}(p,k)+\hat H^{2,1}_{\omega,T}(p,k), \tag{178}$$
where $\hat H^{2,1}_{\omega,Z^\pm}$ denotes the sum over the trees whose special endpoint is of subtype $Z^\pm$, while $\hat H^{2,1}_{\omega,T}$ is the sum over the trees whose special endpoint is of subtype T.
Let us now suppose that $p=\bar k_1-\bar k_2$ and $k=\bar k_2$, with $|\bar k|=\gamma^h$. Then it is obvious that the sum over the trees with n normal endpoints contributing to $\hat H^{2,1}_{\omega,Z^+}(p,k)$ can be bounded as in the case of $\hat G^{(2,1)}_\omega(p,k)$, see (111), by substituting $Z^{(2)}_h$ with $Z^{(3,+)}_h$. A similar argument can be used for $\hat H^{2,1}_{\omega,Z^-}(p,k)$, but in this case one has to take into account the fact that the trivial trees containing only one endpoint, the special one, do not contribute. Hence we have
$$|\hat H^{2,1}_{\omega,Z^+}(\bar k_1-\bar k_2,\bar k_2)|+|\hat H^{2,1}_{\omega,Z^-}(\bar k_1-\bar k_2,\bar k_2)|\le C\gamma^{-2h}\,\frac{|Z^{(3,+)}_h|+\bar\varepsilon\,|Z^{(3,-)}_h|}{(Z_h)^2}. \tag{179}$$
Let us now consider $\hat H^{2,1}_{\omega,T}$. The analysis of the previous sections (in particular the bounds (126) and (128)) implies that a bound like (111) is still valid for the sum over the trees with n normal endpoints, but now $Z^{(2)}_h$ has to be substituted with 1. Moreover, the contributions corresponding to the trivial trees without normal endpoints are given by $\Delta^{(h,h)}_\omega(\bar k_1,\bar k_2)+\Delta^{(h,h+1)}_\omega(\bar k_1,\bar k_2)+\Delta^{(h+1,h)}_\omega(\bar k_1,\bar k_2)$, because of the support properties of the propagators. However, by (125) and (127), this quantity is exactly equal to 0, so that
$$|\hat H^{2,1}_{\omega,T}(\bar k_1-\bar k_2,\bar k_2)|\le C\gamma^{-2h}\,\frac{\bar\varepsilon}{(Z_h)^2}. \tag{180}$$
The bounds (179) and (180) immediately imply the upper bound of Theorem 4. The lower bound follows from the explicit calculation of the leading contributions to $\hat H^{2,1}_{\omega,Z^+}(p,k)$ and $\hat H^{2,1}_{\omega,Z^-}(p,k)$, which are both of order $\lambda^2$; one has essentially to check that they do not cancel out.
References
[B] Baxter, R.J.: Exactly solvable models in statistical mechanics. London–New York: Academic Press, 1984, p. 258
[BFS] Brydges, D., Fröhlich, J., Seiler, E.: On the construction of quantized gauge fields. III. The two-dimensional abelian Higgs model without cutoffs. Commun. Math. Phys. 79, 353–399 (1981)
[BG] Benfatto, G., Gallavotti, G.: Perturbation Theory of the Fermi Surface in a Quantum Liquid. A General Quasiparticle Formalism and One-Dimensional Systems. J. Stat. Phys. 59, 541–664 (1990)
[BGM] Benfatto, G., Gallavotti, G., Mastropietro, V.: Renormalization Group and the Fermi Surface in the Luttinger Model. Phys. Rev. B 45, 5468–5480 (1992)
[BGPS] Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Beta Functions and Schwinger Functions for a Many Fermions System in One Dimension. Commun. Math. Phys. 160, 93–171 (1994)
[BM] Benfatto, G., Mastropietro, V.: Renormalization Group, hidden symmetries and approximate Ward identities in the XYZ model. Rev. Math. Phys. 13, 1323–1435 (2001)
[BM1] Benfatto, G., Mastropietro, V.: Ward identities and local gauge invariance in d = 1 interacting Fermi systems. Preprint (2002)
[BoM1] Bonetto, F., Mastropietro, V.: Beta Function and Anomaly of the Fermi Surface for a d = 1 System of Interacting Fermions in a Periodic Potential. Commun. Math. Phys. 172, 57–93 (1995)
[DL] Dzyaloshinsky, I.E., Larkin, A.I.: Correlation functions for a one-dimensional Fermi system with long range interaction (Tomonaga model). Soviet Phys. JETP 38, 202 (1974)
[ES] Everts, H.U., Schulz, H.: Solid State Comm. 15, 1413 (1974)
[FHRW] Feldman, J., Hurd, T., Rosen, L., Wright, J.: QED: a proof of renormalizability. Lecture Notes in Physics 312, Berlin: Springer-Verlag, 1988
[G] Gallavotti, G.: Renormalization Theory and Ultraviolet Stability for Scalar Fields via Renormalization Group Methods. Rev. Mod. Phys. 57, 471–562 (1985)
[GS] Gentile, G., Scoppola, B.: Renormalization Group and the Ultraviolet Problem in the Luttinger Model. Commun. Math. Phys. 154, 153–179 (1993)
[KK] Keller, G., Kopper, Ch.: Renormalizability Proof for QED Based on Flow Equations. Commun. Math. Phys. 176, 193–226 (1996)
[JKM] Johnson, J.D., Krinsky, S., McCoy, B.: Vertical-Arrow Correlation Length in the Eight-Vertex Model and the Low-Lying Excitations of the X-Y-Z Hamiltonian. Phys. Rev. A 8, 2526–2538 (1973)
[Le] Lesniewski, A.: Effective action for the Yukawa_2 quantum field theory. Commun. Math. Phys. 108, 437–467 (1987)
[LSM] Lieb, E., Schultz, T., Mattis, D.: Two Soluble Models of an Antiferromagnetic Chain. Ann. of Phys. 16, 407–466 (1961)
[MD] Metzner, W., Di Castro, C.: Conservation Laws and Correlation Functions in the Luttinger Liquid. Phys. Rev. B 47, 16107 (1993)
[ML] Mattis, D., Lieb, E.: Exact solution of a many fermion system and its associated boson field. J. Math. Phys. 6, 304–312 (1965)
[MRS] Magnen, J., Rivasseau, V., Seneor, R.: Construction of Yang–Mills(4) with an infrared cutoff. Commun. Math. Phys. 155, 325–384 (1993)
[S] Sólyom, J.: The Fermi gas model of one dimensional conductors. Adv. in Phys. 28, 201–303 (1978)
[Sp] Spohn, H.: Bosonization, vicinal surfaces and Hydrodynamic fluctuation theory. Phys. Rev. E 60, 6411–6420 (1999)
[T] Tomonaga, S.: Remarks on Bloch’s method of sound waves applied to many fermion problem. Progr. Theoret. Phys. 5, 349–374 (1950)
Communicated by G. Gallavotti
Commun. Math. Phys. 231, 135–156 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0674-7
Communications in
Mathematical Physics
Periodic and Quasi-Periodic Orbits for the Standard Map
Alberto Berretti¹, Guido Gentile²
¹ Dipartimento di Matematica, Università di Roma “Tor Vergata”, 00133 Roma, Italy
² Dipartimento di Matematica, Università di Roma Tre, 00146 Roma, Italy
Received: 14 December 2001 / Accepted: 16 March 2002 Published online: 2 October 2002 – © Springer-Verlag 2002
Abstract: We consider both periodic and quasi-periodic solutions for the standard map, and we study the corresponding conjugating functions, i.e. the functions conjugating the motions to trivial rotations. We compare the invariant curves with rotation numbers ω satisfying the Bryuno condition and the sequences of periodic orbits with rotation numbers given by their convergents ωN = pN /qN . We prove the following results for N → ∞: (1) for rotation numbers ωN we study the radius of convergence of the conjugating functions and we find lower bounds on them, which tend to a limit which is a lower bound on the corresponding quantity for ω; (2) the periodic orbits consist of points which are more and more close to the invariant curve with rotation number ω; (3) such orbits lie on analytical curves which tend uniformly to the invariant curve. 1. Introduction Recently a new approach to KAM theory has been introduced in [1], based directly on the study of the perturbative series (Lindstedt series), and without using the standard rapidly convergent iterative procedure. Here we follow such an approach (in the Renormalization Group interpretation given by [2] and developed in a series of subsequent papers; see [3] for a list of references), and unify the analysis for periodic and quasi-periodic motions of the standard map. The standard map is a rather special system, introduced in [4] and [5], which shows a non-trivial dynamical behaviour and is, at the same time, simple enough to avoid any useless technical intricacies. The method we use should be extended to more general systems: the analysis may become a little more involved, but we think that no extra conceptual difficulties should arise (see [6] for a review of these techniques in a more general context). But, for clarity purposes, we prefer to confine ourselves to a simpler model. 
We shall be interested in the relation between the KAM invariant curves (on which the motion is quasi-periodic) and the periodic orbits corresponding to rotation numbers
tending to those of the invariant curves. When the perturbation is switched on, it is well known that the invariant curves with rational rotation number disappear, but some trace of them is left: there are curves, which can be interpreted as remnants (or “ghosts”) of the unperturbed invariant curves, on which the points of the periodic orbits have support: we can still define a conjugating function, i.e. a function which conjugates the motion to a trivial rotation, with the only difference that now the initial phase has to be fixed to an appropriate value. Even more we can consider functions which parametrize the remnants and which reduce to the conjugating functions in correspondence with the points of the periodic orbits: we shall refer to them as the interpolating functions. For rotation numbers ωN tending to the rotation number ω of a surviving invariant curve along the sequence of best approximants (convergents), the remnants are “analytically close” to the invariant curve itself. By “analytically close” we mean that the interpolating functions which define such remnants are analytic and converge uniformly to the (analytic) conjugating function for the corresponding invariant curve: a trivial application of the Cauchy formulae for the derivatives of analytic functions shows then that also the derivatives converge uniformly to the derivatives of the conjugating function of the invariant curve; the convergence therefore happens in quite a strong sense. The precise statements, with the proper setup of the domains of convergence, will be given later as it requires to establish first some definitions. 
The relation between periodic and quasi-periodic orbits can be studied also through variational methods, [7]: we think that the interest of our approach relies mostly on the possibility to have an accurate knowledge of the conjugating functions, in particular of their analyticity properties, and to obtain estimates which depend optimally on the involved parameters – in our case on the rotation numbers. We shall be able to provide lower bounds on the radius of convergence of the conjugating functions (in terms of a “truncated” Bryuno function), which are the analogue of what we found in [8] for the conjugating function of the quasi-periodic motions. Of course it is known that, while the invariant curves “break” at a certain critical threshold, periodic orbits persist for all real values of the perturbative parameter (see [9]). However singularities arise in the complex plane, and our analysis, together with the numerical results in [10], suggests that, for rotation numbers tending to a Bryuno number along the sequence of the convergents, the singularities of the conjugating functions of the corresponding periodic orbits tend to build up the “natural boundary” (of the analyticity domain) for the conjugating function of the invariant curve with that Bryuno number. We remark that so far the existence of such a natural boundary is only a numerical result and no rigorous proof has been given. This connection between analyticity properties of the conjugating functions is a point which – we think – should deserve further investigation, as it relates to the so-called “Greene’s method” to determine numerically the critical threshold and it would help to understand the mechanism of the breakup of the invariant curve itself. The plan of the paper is as follows. In Sect. 2 we recall the definition of the standard map and of the Bryuno function, by introducing a natural extension of the latter to rational numbers. In Sect. 
3 we discuss the existence and the analyticity of periodic solutions; in particular we provide lower bounds on the radius of convergence of the conjugating function. In Sect. 4 we consider a Bryuno number ω and the sequence of its convergents ωN , we compare the periodic solutions with rotation numbers ωN with the quasi-periodic solution with rotation number ω, and we state our main result, so making formally more precise the notion of analytical closeness introduced above. The proof of
the theorems is achieved in the remaining Sects. 5, 6 and 7. We assume that the reader is familiar with the techniques and results of [8], which we rely heavily on.
2. The standard map
2.1. The standard map. The standard map is a discrete one-dimensional dynamical system generated by the iteration of the symplectic map of the cylinder to itself, Tε : T × R → T × R, given by
$$ T_\varepsilon : \begin{cases} x' = x + y + \varepsilon \sin x, \\ y' = y + \varepsilon \sin x. \end{cases} \tag{2.1} $$
We look for a change of variables of the form
$$ x = \alpha + u(\alpha, \varepsilon, \omega), \qquad y = 2\pi\omega + v(\alpha, \varepsilon, \omega), \tag{2.2} $$
such that the dynamics in the α variable is a trivial rotation
$$ \alpha' = \alpha + 2\pi\omega, \tag{2.3} $$
where ω ∈ [0, 1] is called the rotation number. One immediately checks that the function v(α, ε, ω) is related to the function u(α, ε, ω) by
$$ v(\alpha, \varepsilon, \omega) = u(\alpha, \varepsilon, \omega) - u(\alpha - 2\pi\omega, \varepsilon, \omega), \tag{2.4} $$
while u(α, ε, ω) is a solution of the functional equation
$$ (D_\omega u)(\alpha, \varepsilon, \omega) \equiv u(\alpha + 2\pi\omega, \varepsilon, \omega) + u(\alpha - 2\pi\omega, \varepsilon, \omega) - 2u(\alpha, \varepsilon, \omega) = \varepsilon \sin(\alpha + u(\alpha, \varepsilon, \omega)). \tag{2.5} $$
We shall call u = u(α, ε, ω) the conjugating function.
2.2. Continued fraction expansion. For any ω ∈ [0, 1] let us write ω = [0, a1, a2, a3, ...], where {an} are the partial quotients of ω, and call {ωn} ≡ {pn/qn} the sequence of convergents of ω, [11]. If ω ∈ Q ∩ [0, 1], i.e. ω = p/q, with p ≤ q and gcd(p, q) = 1, then there exists N = N(ω) such that ω = [0, a1, a2, a3, ..., aN], i.e. such that $a_{N+1} = \infty$: in such a case the sequence of convergents is finite and the last one is given by pN/qN = p/q. We can eliminate a trivial ambiguity by requiring that aN > 1; in the following, though, we shall be interested essentially in given sequences of convergents, so that the problem does not arise. For such rational ω define
$$ B_1(\omega) = \sum_{n=0}^{N-1} \frac{\log q_{n+1}}{q_n}. \tag{2.6} $$
For any ω ∈ [0, 1] ∩ R \ Q define
$$ B_1(\omega) = \sum_{n=0}^{\infty} \frac{\log q_{n+1}}{q_n}, \tag{2.7} $$
and define ω a Bryuno number if it is irrational and B1 (ω) < ∞; the latter is called the Bryuno condition. With a slight abuse of notation we shall call B1 (ω) the Bryuno function (see [12]); by analogy we shall define (2.6) the truncated Bryuno function of the rational number ω. If ω is a Bryuno number then there exists a solution of the form (2.2), (2.3), with u, v analytic in α, ε, for ε small enough, and 2π -periodic in α; for the more restrictive Diophantine condition on ω, this follows from the standard KAM theorem. A more formal statement, which will be used later on, is the following. Theorem 2.1. Let ω ∈ (0, 1) be a Bryuno number. Then there exists ρ(ω) > 0 such that there exists a solution of the form (2.2), (2.3), with u(α, ε, ω) periodic in α ∈ T and analytic in ε for |ε| < ρ(ω). There exists a positive constant C such that |log ρ(ω) + 2B1 (ω)| < C,
(2.8)
uniformly in ω.
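As a concrete illustration of the truncated Bryuno function (2.6), here is a minimal Python sketch. It is not taken from the paper: the function names are hypothetical, the continued fraction is generated by the Euclidean algorithm (which automatically yields a last partial quotient larger than 1 for non-integer p/q), and the logarithm is assumed to be the natural one.

```python
import math

def convergent_denominators(p, q):
    """Denominators q_0 = 1, q_1, ..., q_N of the convergents of p/q in [0, 1].

    Assumption: the expansion [0, a_1, ..., a_N] is the one produced by the
    Euclidean algorithm applied to q/p.
    """
    if q <= 0 or not (0 <= p <= q):
        raise ValueError("need 0 <= p <= q and q > 0")
    denominators = [1]          # q_0 = 1, the denominator of the convergent 0/1
    q_prev, q_curr = 0, 1       # q_{-1} = 0, q_0 = 1
    a, b = q, p                 # omega = 1 / (q/p): expand q/p
    while b != 0:
        quotient, remainder = divmod(a, b)
        # standard recursion q_n = a_n * q_{n-1} + q_{n-2}
        q_prev, q_curr = q_curr, quotient * q_curr + q_prev
        denominators.append(q_curr)
        a, b = b, remainder
    return denominators

def truncated_bryuno(p, q):
    """B_1(p/q) = sum_{n=0}^{N-1} log(q_{n+1}) / q_n, as in (2.6)."""
    dens = convergent_denominators(p, q)
    return sum(math.log(dens[n + 1]) / dens[n] for n in range(len(dens) - 1))
```

For ω = 1/q the only convergent denominators are 1 and q, so the sketch returns log q, which is the value used in Remark 4.2 further on.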
2.3. Comments. The proof of the existence of the invariant curve with rotation numbers satisfying a Diophantine condition is standard, and can be found in any textbook about KAM theory; for instance see [13]. The proof in the case of Bryuno numbers and the explicit derivation of the bound (2.8) are given in [14] and [8]. If ω = p/q is rational then the functional equation (2.5) admits no solution; however we shall see in Sect. 3 that it is possible to fix α = α0 in such a way that x0 = α0 + u(α0 , ε, ω) is the initial datum of a periodic solution with period 2πp, i.e. such that u(α0 + 2πp) = u(α0 ).
(2.9)
This means that, after q iterates of the dynamics, the variable α has been shifted by 2πp, so that the variables (x, y) have come back to their original values (x0, y0), up to a shift by 2πp in the x-direction.
Remark 2.1. Note that, if ω is a Bryuno number, and {ωN} are the convergents of ω, then
$$ \lim_{N \to \infty} B_1(\omega_N) = B_1(\omega). \tag{2.10} $$
Note also that B1(ω) is still divergent on irrational, non-Bryuno numbers. It would actually be interesting to study the sequence of periodic orbits corresponding to such numbers, to understand the mechanism of divergence of Lindstedt series when the Bryuno condition is violated.
3. Periodic Solutions for the Standard Map
3.1. Periodic solutions. When ω is a Bryuno number, it is well known that a quasi-periodic solution with rotation number ω exists: the orbit is a smooth curve, and the trajectory is dense on it (see e.g. [15], and [8] for estimates on the radius of convergence which depend optimally on the rotation number). In the periodic case the trajectory consists of a finite number of points which can be interpolated through a smooth curve in a rather arbitrary way: we shall look for a precise interpolating curve and show that, for rotation numbers of the form ωN = pN/qN, where ωN are the convergents of the
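Bryuno number ω, the corresponding curves tend to the invariant KAM curve with rotation number ω. Fix ω = p/q and, for any 2π-periodic function
$$ f(\alpha) = \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat f_\nu, \tag{3.1} $$
write $f(\alpha) = \bar f(\alpha) + \tilde f(\alpha)$, with
$$ \bar f(\alpha) = \sum_{\nu \in \mathbb{Z} \setminus q\mathbb{Z}} e^{i\nu\alpha} \hat f_\nu, \qquad \tilde f(\alpha) = \sum_{\nu \in q\mathbb{Z}} e^{i\nu\alpha} \hat f_\nu. \tag{3.2} $$
Note that, in Fourier space,
$$ (D_\omega f)(\alpha) = \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \gamma(\omega\nu) \hat f_\nu, \qquad \gamma(\omega\nu) = 2\left[\cos(2\pi\omega\nu) - 1\right] = 2\left[\cos(2\pi p\nu/q) - 1\right], \tag{3.3} $$
so that $(D_\omega \tilde f)(\alpha) = 0$. Then we can write
$$ u(\alpha, \varepsilon, \omega) = \bar u(\alpha, \varepsilon, \omega) + \tilde u(\alpha, \varepsilon, \omega), \qquad \varepsilon \sin(\alpha + u(\alpha, \varepsilon, \omega)) \equiv S(\alpha, \varepsilon, \omega) = \bar S(\alpha, \varepsilon, \omega) + \tilde S(\alpha, \varepsilon, \omega), \tag{3.4} $$
so that (2.5) becomes
$$ (D_\omega \bar u)(\alpha, \varepsilon, \omega) \equiv \bar S(\alpha, \varepsilon, \omega) + \tilde S(\alpha, \varepsilon, \omega). \tag{3.5} $$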
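The diagonal action (3.3) of Dω on Fourier modes, and the vanishing of γ(ων) on multiples of q when ω = p/q, can be checked numerically. The sketch below is illustrative only; the helper names are hypothetical and not part of the paper.

```python
import cmath
import math

def gamma(omega, nu):
    """The symbol gamma(omega * nu) = 2 [cos(2 pi omega nu) - 1] of (3.3)."""
    return 2.0 * (math.cos(2.0 * math.pi * omega * nu) - 1.0)

def d_omega(f, omega):
    """Second-difference operator of (2.5), acting on a function of alpha."""
    shift = 2.0 * math.pi * omega
    return lambda alpha: f(alpha + shift) + f(alpha - shift) - 2.0 * f(alpha)

def fourier_mode(nu):
    """The function alpha -> exp(i nu alpha)."""
    return lambda alpha: cmath.exp(1j * nu * alpha)
```

Applying `d_omega` to `fourier_mode(nu)` multiplies the mode by `gamma(omega, nu)`; for ω = p/q the factor is zero exactly when ν is a multiple of q, which is why the components with ν ∈ qZ have to be treated separately in what follows.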
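We write also, formally, u, S as power series in ε, so that
$$ \begin{aligned} u(\alpha, \varepsilon, \omega) &= \sum_{k=1}^{\infty} \varepsilon^k u^{(k)}(\alpha, \omega) = \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat u_\nu(\varepsilon, \omega) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat u^{(k)}_\nu(\omega), \\ S(\alpha, \varepsilon, \omega) &= \sum_{k=1}^{\infty} \varepsilon^k S^{(k)}(\alpha, \omega) = \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat S_\nu(\varepsilon, \omega) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat S^{(k)}_\nu(\omega), \end{aligned} \tag{3.6} $$
and so defining the Taylor-Fourier coefficients $\hat u^{(k)}_\nu(\omega)$, $\hat S^{(k)}_\nu(\omega)$. Then, to all perturbative orders k, (3.5) gives two equations:
$$ (D_\omega \bar u^{(k)})(\alpha, \omega) = \bar S^{(k)}(\alpha, \omega), \qquad 0 = \tilde S^{(k)}(\alpha, \omega). \tag{3.7} $$
Note that for k > 1 we can express $S^{(k)}(\alpha, \omega)$ in terms of all $u^{(k')}(\alpha, \omega)$ with k′ < k. In fact one has
$$ S^{(k)}(\alpha, \omega) = - \sum_{m=1}^{\infty} \frac{1}{m!} \left[ \frac{\partial^{m+1}}{\partial \alpha^{m+1}} \cos\alpha \right] \sum_{\substack{k_1, \dots, k_m \ge 1 \\ k_1 + \dots + k_m = k-1}} u^{(k_1)}(\alpha, \omega) \cdots u^{(k_m)}(\alpha, \omega). \tag{3.8} $$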
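The order-by-order structure of (3.6)–(3.8) can be made concrete with a short Python sketch that computes the Taylor-Fourier coefficients for small k. This is an illustration under stated assumptions, not the authors' procedure: it assumes γ(ων) ≠ 0 for every ν ≠ 0 that occurs (ω irrational, or the maximal order smaller than q), it obtains S^{(k)} from the power series of exp(±iu) rather than from (3.8) directly (the two expansions are equivalent), and all names are hypothetical.

```python
import math

def gamma(omega, nu):
    return 2.0 * (math.cos(2.0 * math.pi * omega * nu) - 1.0)

def multiply(a, b):
    """Product of two Fourier polynomials stored as {nu: coefficient}."""
    out = {}
    for na, ca in a.items():
        for nb, cb in b.items():
            out[na + nb] = out.get(na + nb, 0j) + ca * cb
    return out

def add_scaled(target, source, factor):
    """target += factor * source, in place."""
    for nu, c in source.items():
        target[nu] = target.get(nu, 0j) + factor * c

def lindstedt_coefficients(omega, max_order):
    """Return u with u[k] = {nu: u_hat_nu^(k)} for k = 1..max_order (u[0] unused)."""
    u = [None]
    e_plus = [{0: 1 + 0j}]    # eps-series of exp(+i u)
    e_minus = [{0: 1 + 0j}]   # eps-series of exp(-i u)
    for k in range(1, max_order + 1):
        # S^(k) = [e^{i alpha} E_+^(k-1) - e^{-i alpha} E_-^(k-1)] / (2i)
        s_k = {}
        for nu, c in e_plus[k - 1].items():
            s_k[nu + 1] = s_k.get(nu + 1, 0j) + c / 2j
        for nu, c in e_minus[k - 1].items():
            s_k[nu - 1] = s_k.get(nu - 1, 0j) - c / 2j
        # invert D_omega mode by mode; the nu = 0 mode vanishes by oddness
        u_k = {nu: c / gamma(omega, nu) for nu, c in s_k.items() if nu != 0}
        u.append(u_k)
        # extend exp(+-i u): k E^(k) = +-i sum_{j=1..k} j u^(j) E^(k-j)
        for series, sign in ((e_plus, 1), (e_minus, -1)):
            acc = {}
            for j in range(1, k + 1):
                add_scaled(acc, multiply(u[j], series[k - j]), sign * 1j * j / k)
            series.append(acc)
    return u
```

At first order the sketch reproduces u^{(1)} = sin α / γ(ω), and at second order u^{(2)} = sin 2α / (2γ(ω)γ(2ω)); both follow by hand from (2.5) and serve as a sanity check.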
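Lemma 3.1. If the functions u, S are formally well defined, they have to be odd in α.
Proof of Lemma 3.1. First of all note that the operator Dω is even. Then the proof is by induction on k. For k = 1 one has
$$ S^{(1)}(\alpha, \omega) = \sin\alpha, \qquad u^{(1)}(\alpha, \omega) = (D_\omega^{-1} S^{(1)})(\alpha, \omega), \tag{3.9} $$
which are obviously odd. If all functions $u^{(k')}(\alpha, \omega)$ are odd for k′ < k then, by (3.8), one has for k > 1,
$$ \begin{aligned} S^{(k)}(-\alpha, \omega) &= - \sum_{m=1}^{\infty} \frac{1}{m!} \left[ \frac{\partial^{m+1}}{\partial \beta^{m+1}} \cos\beta \right]_{\beta = -\alpha} \sum_{k_1, \dots, k_m} u^{(k_1)}(-\alpha, \omega) \cdots u^{(k_m)}(-\alpha, \omega) \\ &= (-1)^{m+1+m} S^{(k)}(\alpha, \omega) = -S^{(k)}(\alpha, \omega), \end{aligned} \tag{3.10} $$
so that $S^{(k)}(\alpha, \omega)$ is odd; then also $u^{(k)}(\alpha, \omega)$ is odd.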
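Corollary 3.1. The functions (3.6), if formally existing, can be written as
$$ \begin{aligned} u(\alpha, \varepsilon, \omega) &= \sum_{\nu \in \mathbb{N}} 2i\, \hat u_\nu(\varepsilon, \omega) \sin\nu\alpha = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{N}} 2i\, \hat u^{(k)}_\nu(\omega) \sin\nu\alpha, \\ S(\alpha, \varepsilon, \omega) &= \sum_{\nu \in \mathbb{N}} 2i\, \hat S_\nu(\varepsilon, \omega) \sin\nu\alpha = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{N}} 2i\, \hat S^{(k)}_\nu(\omega) \sin\nu\alpha. \end{aligned} \tag{3.11} $$
Lemma 3.2. In (3.6) one has |ν| ≤ k; in other words one has $\hat u^{(k)}_\nu(\omega) = 0$ for |ν| > k.
Proof of Lemma 3.2. From (3.9) one obtains ν = ±1 for k = 1. Suppose that for all k′ < k one has $\hat u^{(k')}_\nu(\omega) = 0$ when |ν| > k′: then (3.8) gives
$$ \hat S^{(k)}_\nu(\omega) = - \sum_{m=1}^{\infty} \sum_{\substack{k_1, \dots, k_m \\ k_1 + \dots + k_m = k-1}} \sum_{\substack{\nu_0, \nu_1, \dots, \nu_m \\ \nu_0 + \nu_1 + \dots + \nu_m = \nu}} \frac{(i\nu_0)^{m+1}}{m!\, 2}\, \hat u^{(k_1)}_{\nu_1}(\omega) \cdots \hat u^{(k_m)}_{\nu_m}(\omega), \tag{3.12} $$
where ν0 = ±1, so that |ν| ≤ |ν0| + |ν1| + ... + |νm| ≤ 1 + (k1 + ... + km) ≤ k.
Lemma 3.3. There exists α0 such that one has formally $\tilde S(\alpha_0, \varepsilon, \omega) = \tilde u(\alpha_0, \varepsilon, \omega) = 0$, while $\bar S(\alpha_0, \varepsilon, \omega)$ and $\bar u(\alpha_0, \varepsilon, \omega)$ are formally well defined.
Proof of Lemma 3.3. From (3.6) and (3.11) one has that, if $\tilde S^{(k)}(\alpha_0, \omega)$ exists formally, then
$$ \tilde S^{(k)}(\alpha_0, \omega) = \sum_{\nu \in q\mathbb{N}} 2i\, \hat S^{(k)}_\nu(\omega) \sin\nu\alpha_0 = 0, \tag{3.13} $$
if we fix α = α0 such that
$$ \sin q\alpha_0 = 0. \tag{3.14} $$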
Then we can check by induction that the functions $u^{(k)}(\alpha, \omega)$ and $S^{(k)}(\alpha, \omega)$ are well defined for α = α0. By Lemma 3.2 one has $\tilde S^{(k)}(\alpha, \omega) = \tilde u^{(k)}(\alpha, \omega) = 0$ for all α when k < q. Moreover $\bar S^{(k)}(\alpha, \omega)$ and $\bar u^{(k)}(\alpha, \omega)$ are well defined for all α when k < q as γ(ων) ≠ 0 for |ν| < q; for k = q one has from (3.12)
$$ \hat S^{(q)}_\nu(\omega) = - \sum_{m=1}^{\infty} \sum_{\substack{k_1, \dots, k_m \\ k_1 + \dots + k_m = q-1}} \sum_{\substack{\nu_0, \nu_1, \dots, \nu_m \\ \nu_0 + \nu_1 + \dots + \nu_m = \nu}} \frac{(i\nu_0)^{m+1}}{m!\, 2}\, \hat u^{(k_1)}_{\nu_1}(\omega) \cdots \hat u^{(k_m)}_{\nu_m}(\omega), \tag{3.15} $$
where ν0 = ±1. By (3.15) also $\tilde S^{(q)}(\alpha, \omega)$ is well defined and, by (3.13) and (3.14), one has $\tilde S^{(q)}(\alpha_0, \omega) = 0$. Therefore (3.7) can be solved for k = q and the coefficients $\hat u^{(q)}_q(\omega)$ are arbitrary, as (3.3) shows. Moreover by (3.11) and (3.14) one has $\tilde u^{(q)}(\alpha_0, \omega) = 0$. Then suppose that the coefficients $\hat u^{(k')}_\nu(\omega)$ are well defined for all k′ < k and are arbitrarily chosen for ν ∈ qZ: then we can show that also the coefficients $\hat u^{(k)}_\nu(\omega)$ are formally well defined. This follows again from (3.12), which shows that $\hat S^{(k)}_\nu(\omega)$ is well defined for all ν. Then if ν ∉ qZ one has
$$ \hat u^{(k)}_\nu(\omega) = \frac{1}{\gamma(\omega\nu)}\, \hat S^{(k)}_\nu(\omega), \tag{3.16} $$
by the first equation in (3.7), so that also $\hat u^{(k)}_\nu(\omega)$ is well defined for ν ∉ qZ. Moreover if we sum together all Fourier components with ν ∈ qZ and we use (3.11) and (3.15) we see that in the second equation of (3.7) one has $\tilde S^{(k)}(\alpha, \omega) = 0$, so that both equations in (3.7) are formally soluble and the coefficients $\hat u^{(k)}_\nu(\omega)$, with ν ∈ qZ, can be arbitrarily fixed: independently of their values one has $\tilde u^{(k)}(\alpha_0, \omega) = 0$ by (3.14).
Remark 3.1. The proof of Lemma 3.3 yields, through (3.13), that there are 2q values of α0 in [0, 2π) such that there exists a formal 2π-periodic solution of (3.5):
$$ \alpha_0 \in A(\omega) \equiv \left\{ \frac{\pi k}{q} : k = 0, 1, 2, \dots, 2q - 1 \right\}. \tag{3.17} $$
As in the variable α the dynamics is a rotation by 2π/q (see (3.3)), we see that such values of α0 correspond to two distinct (formal) periodic orbits: one easily checks that, for ε small enough, of such orbits one is linearly stable and one is unstable.
Corollary 3.2. In order that the function u(α0, ε, ω) be formally well defined, for all ν ∈ qZ the coefficients $\hat u^{(k)}_\nu(\omega)$ can be chosen as arbitrary constants $c^{(k)}_\nu$; in particular they can be chosen as identically vanishing.
Remark 3.2. At a formal level, by choosing the initial datum α0 in the set A(ω) given by (3.17), we see that the corresponding trajectory turns out to be a periodic solution of the equation of motion, of the form (3.2), with
$$ u(\alpha_0, \varepsilon, \omega) \equiv \bar u(\alpha_0, \varepsilon, \omega) = \sum_{\nu \in \mathbb{Z} \setminus q\mathbb{Z}} e^{i\nu\alpha_0} \hat u_\nu(\varepsilon, \omega) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z} \setminus q\mathbb{Z}} e^{i\nu\alpha_0} \hat u^{(k)}_\nu(\omega). \tag{3.18} $$
Of course we are left with the problem of proving the convergence of the series (3.18).
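The counting in Remark 3.1 (2q admissible phases, grouped into two periodic orbits of q points each) can be verified with a few lines of Python. The sketch labels the point πk/q of A(ω) by the integer k and assumes the rotation acts as α → α + 2πp/q, i.e. k → k + 2p (mod 2q); the function name is hypothetical.

```python
from math import gcd

def periodic_orbits_in_A(p, q):
    """Split A(omega) = {pi*k/q : k = 0..2q-1} into orbits of the rotation.

    Each point is represented by its integer label k; the rotation by
    2*pi*p/q sends k to (k + 2p) mod 2q.
    """
    if gcd(p, q) != 1:
        raise ValueError("p and q must be coprime")
    unvisited = set(range(2 * q))
    orbits = []
    while unvisited:
        k = min(unvisited)
        orbit = []
        while k in unvisited:
            unvisited.remove(k)
            orbit.append(k)
            k = (k + 2 * p) % (2 * q)
        orbits.append(orbit)
    return orbits
```

For coprime p and q the even labels form one orbit and the odd labels the other, in agreement with the two formal periodic orbits mentioned in the remark.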
Theorem 3.1. Let ω = p/q be a rational number in [0, 1], with gcd(p, q) = 1. Then there exists ρ(ω) > 0 such that the 2πp-periodic solutions of the form (3.18) are analytic in ε for |ε| < ρ(ω). One has
$$ \log\rho(\omega) + 2B_1(\omega) > -C, \tag{3.19} $$
for some universal positive constant C, where B1(ω) is the truncated Bryuno function (2.6).
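3.2. About the proof of Theorem 3.1. The actual proof consists in proving that the function
$$ u(\alpha, \varepsilon, \omega) \equiv \bar u(\alpha, \varepsilon, \omega) = \sum_{\nu \in \mathbb{Z} \setminus q\mathbb{Z}} e^{i\nu\alpha} \hat u_\nu(\varepsilon, \omega) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z} \setminus q\mathbb{Z}} e^{i\nu\alpha} \hat u^{(k)}_\nu(\omega), \tag{3.20} $$
is analytic in (α, ε) ∈ D, where
$$ D = \left\{ (\varepsilon, \alpha) \in \mathbb{C}^2 : |\varepsilon| < \rho,\ |\mathrm{Im}\, \alpha| < \xi \ \text{ with } \ e^{\xi} \rho < C e^{-2B_1(\omega)} \right\}, \tag{3.21} $$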
for some universal constant C. For α ∈ A(ω) the function (3.20) interpolates the set of points (3.18), hence the periodic orbits. The proof of analyticity of (3.20) in the domain D proceeds exactly as the analogous proof of [8]. Instead of giving the full proof ex novo (which would be essentially a repetition of [8]), we assume the reader is familiar with [8] and we confine ourselves to stressing the (few) points in which there is a difference between the case of rational numbers and the case of Bryuno numbers: this will be done in Sect. 6.
Remark 3.3. (1) By taking into account Corollary 3.2 we can rewrite (3.20) as
$$ u(\alpha, \varepsilon, \omega) \equiv \bar u(\alpha, \varepsilon, \omega) = \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat u_\nu(\varepsilon, \omega) = \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \hat u^{(k)}_\nu(\omega), \tag{3.22} $$
as $\hat u_\nu(\varepsilon, \omega) = 0$ for ν such that ν ∈ qZ. We shall call (3.22) the interpolating function for the periodic solutions with rotation number ω. (2) The above result gives a lower bound on the radius of convergence of the function (3.18). It would be interesting to see if the radius of convergence admits also an upper bound of the same kind (analogously to what happens in the case of quasi-periodic solutions): the numerical results of [10] suggest that this is the case. [Note that such an upper bound could be easily obtained for the interpolating functions (3.22) by reasoning as in [14].] (3) For α ≠ α0 the function (3.20) no longer describes a periodic solution of the equation of motion: it is simply a 2π-periodic analytic function which is equal to the solution only when α = α0, with α0 satisfying (3.14), i.e. with α0 ∈ A(ω): we call a remnant the curve described by such a function.
4. Periodic and Quasi-Periodic Solutions
4.1. Tree formalism. We refer to [16] and [8] for a detailed description of the definition and properties of trees. Here we confine ourselves to recalling the basic notations, in order to have a self-contained discussion. A tree θ consists of a family of k lines arranged to connect a partially ordered set of points called nodes, with the lower nodes to the right. All the lines have two nodes at their extremes, except the highest which has only one node, the last node u0 of the tree (which is the leftmost one); the other extreme r will be called the root of the tree and it will not be regarded as a node. We denote by ≼ the partial ordering relation between nodes: given two nodes u1 and u2, we say that u2 ≼ u1 if u1 is along the path of lines connecting u2 to the root r of the tree (they could coincide: we say that u2 ≺ u1 if they do not). Each line carries an arrow pointing from the node u to the right to the node u′ to the left (i.e. directed toward the root): we say that the line exits from u and enters u′, and we write u0′ = r even if, strictly speaking, r is not a node. For each node there are only one exiting line and $m_u \ge 0$ entering ones; as there is a one-to-one correspondence between nodes and lines, we can associate to each node u a line $\ell_u$ exiting from it. The line $\ell_{u_0}$ connecting the node u0 to the root r will be called the root line. Note that each line $\ell_u$ can be considered the root line of the subtree consisting of the nodes satisfying w ≼ u and of the lines connecting them: u′ will be the root of such a subtree. The order k of the tree is defined as the number of nodes of the tree. To each node u ∈ θ we associate a mode label $\nu_u = \pm 1$, and define the momentum flowing through the line $\ell_u$ as
$$ \nu_{\ell_u} = \sum_{w \preceq u} \nu_w, \qquad \nu_w = \pm 1. \tag{4.1} $$
Let us denote by $\mathcal{T}^0_{\nu,k}$ the set of all trees of order k (i.e. with k nodes) and with momentum ν flowing through the root line (total momentum), and by V(θ) and Λ(θ), respectively, the set of nodes and the set of lines of the tree θ.
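Lemma 4.1. Let ω ∈ [0, 1] and let u(α, ε, ω) be a formal solution of the functional equation (3.5); for ω ∈ Q one takes α = α0 ∈ A(ω), while for ω a Bryuno number α varies in [0, 2π]. Then one has
$$ \hat u^{(k)}_\nu(\omega) = \frac{1}{2^k} \sum_{\theta \in \mathcal{T}^0_{\nu,k}} \mathrm{Val}(\theta, \omega), \qquad \mathrm{Val}(\theta, \omega) = -i \left[ \prod_{u \in V(\theta)} \frac{\nu_u^{m_u + 1}}{m_u!} \right] \left[ \prod_{\ell \in \Lambda(\theta)} g(\omega\nu_\ell) \right], \tag{4.2} $$
where
$$ g(\omega\nu_\ell) = \frac{1}{\gamma(\omega\nu_\ell)}, \qquad \gamma(\omega\nu) = 2\left[\cos(2\pi\omega\nu) - 1\right] \tag{4.3} $$
is the propagator associated to the line ℓ.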
4.2. About the proof of Lemma 4.1. The proof is iterative and it is left as an (easy) exercise to the reader. [See [16] for details.]
Lemma 4.2. For any tree θ and for any line ℓ ∈ Λ(θ) one has $\gamma(\omega\nu_\ell) \ne 0$.
Proof of Lemma 4.2. For ω a Bryuno number the proof reduces to showing that $\nu_\ell \ne 0$, and it is given in [17, Sect. 3] (in a more general situation), while for ω a rational number the proof is a consequence of the discussion in Sect. 3. In fact, as noted in Sect. 4.1, each line ℓ can be considered the root line of the subtree θ′ formed by the nodes and lines preceding ℓ. If k′ is the number of nodes of such a subtree and $\nu_\ell$ is the momentum flowing through the line ℓ, then the value of such a subtree contributes to $\hat u^{(k')}_{\nu_\ell}(\omega)$: the sum of the values Val(θ′, ω) of all subtrees $\theta' \in \mathcal{T}^0_{\nu_\ell,k'}$ gives exactly $\hat u^{(k')}_{\nu_\ell}(\omega)$. By Corollary 3.2, when ω = p/q, no line with $\nu_\ell \in q\mathbb{Z}$ can arise, so that, as γ(ων) = 0 if and only if ν ∈ qZ, the assertion follows. [Note that even if we did not choose the coefficients $\hat u^{(k')}_\nu(\omega)$ as vanishing, they would be simply some constants $c^{(k)}_\nu$ not involving any denominator $\gamma(\omega\nu_\ell)$.]
Lemma 4.3. Let ω ∈ [0, 1] be either a rational number or a Bryuno number. Let $\hat u^{(k)}_\nu(\omega)$ be defined as in (4.2). Then there exists a positive constant D such that one has
$$ \left| \hat u^{(k)}_\nu(\omega) \right| \le D^k e^{2kB_1(\omega)}, \tag{4.4} $$
where B1(ω) is given by (2.7) if ω is a Bryuno number and by (2.6) if ω is rational.
4.3. About the proof of Lemma 4.3. The proof of such a result can be obtained by reasoning as in [8], and it implies the lower bound in (3.19) of Theorem 3.1; see Sect. 6 below.
Remark 4.1. The result (4.4) is essentially an intermediate step toward Theorem 3.1: we have stated it explicitly as in that form it will be useful in proving the forthcoming theorem (and it will be exploited in Sect. 5).
Remark 4.2. If we take ω = 1/q, with q ∈ N, then (4.4) reduces to $|\hat u^{(k)}_\nu(1/q)| \le c\, q^{2k}$: such a (trivial) bound could be obtained without any effort simply by noting that any tree to order k has k propagators which can be bounded by a constant times $q^2$, so that we obtain for the radius of convergence the bound $\rho(1/q) > c\, q^{-2}$ (for some positive constant c), which is a particular case of Theorem 3.1.
Theorem 4.1. Let ω be a Bryuno number; if {ωN} are the convergents of ω, denote by uN ≡ u(α, ε, ωN) the functions interpolating the periodic solutions with rotation number ωN as given by (3.22), and by u ≡ u(α, ε, ω) the quasi-periodic solution with rotation number ω. Then there exist two positive constants ρ0 and β, such that the sequence {uN} converges to the function u, uniformly for $|\varepsilon| < \rho_0 e^{-\beta B_1(\omega)}$; one can choose β = 2.
Proof of Theorem 4.1. See Sect. 5.
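The claim in Remark 4.2 that each propagator is bounded by a constant times q² when ω = 1/q can be checked numerically. The sketch below is only illustrative: the function name is hypothetical, and the explicit constants 1/16 and 1/(4π²) used in the accompanying check are our own, obtained from |γ(ν/q)| = 4 sin²(πν/q) together with the elementary bounds 2x/π ≤ sin x ≤ x on [0, π/2].

```python
import math

def max_propagator(q):
    """Largest |g(nu/q)| = 1/|gamma(nu/q)| over nu = 1, ..., q-1.

    gamma(nu/q) = 2 [cos(2 pi nu / q) - 1] depends on nu only modulo q,
    so these values cover every momentum that is not a multiple of q.
    """
    return max(
        1.0 / abs(2.0 * (math.cos(2.0 * math.pi * nu / q) - 1.0))
        for nu in range(1, q)
    )
```

The maximum is attained at ν = 1 (and ν = q − 1) and grows like q²/(4π²) for large q, consistent with the bound ρ(1/q) > c q⁻² quoted in the remark.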
4.4. Conclusions. Theorem 4.1 is our main analytical result. It implies that there is a neighborhood of the origin (in the complex ε-plane) in which the limit u∞ of the functions uN exists and coincides with the quasi-periodic solution u. Note that we are heavily using that the sizes of the domains of analyticity of the interpolating functions for ωN and for the conjugating function for ω admit the same estimates, as follows from the discussion in Sect. 3.2 and from the trivial Remark 3.3, (2). By the uniqueness of the analytic continuation, [18], we can deduce that the functions u∞ and u have the same analyticity domains in ε: in particular this implies that u∞ can be extended to the overall analyticity domain of the quasi-periodic solution u, and it coincides with u over there. By trivial complex-variable arguments, the uniform convergence of the sequence of functions uN to u implies also that all their derivatives converge uniformly (in α and ε): in this sense we say that the sequence of remnants is “analytically close” to the invariant curve. Note that the periodic orbits consist of a finite number of points, whose number grows as ωN → ω. Then the content of Theorem 4.1 is the following: for any fixed N such points can be interpolated through a smooth curve which tends (in the analytical way explained above) to the invariant curve corresponding to the quasi-periodic solution with rotation number ω. Concerning what happens when ω is a Liouville number not satisfying the Bryuno condition, there are two issues that could be addressed. First one can ask if there is something analogous to the remnants seen in the case of rational rotation numbers. Secondly it would be interesting to understand what happens to the sequence of periodic orbits corresponding to the convergents of such a Liouville number.
5. Proof of Theorem 4.1
5.1. Scales.
As in [8] we can introduce a C∞ partition of unity by defining a set of functions χn(x) for x ∈ R+ and n ∈ N in the following way. Let χ(x) be a C∞ non-increasing compact-support function defined on R+ such that
$$ \chi(x) = \begin{cases} 1, & \text{for } x \le 1, \\ 0, & \text{for } x \ge 2, \end{cases} \tag{5.1} $$
and define
$$ \chi_0(x) = 1 - \chi(96 q_1 x), \qquad \chi_n(x) = \chi(96 q_n x) - \chi(96 q_{n+1} x) \quad \text{for } n \ge 1; \tag{5.2} $$
we shall come back in Sect. 6 to the meaning of the numerical values of the constants appearing in (5.1) and (5.2). Then for each line ℓ set
$$ g(\omega\nu_\ell) \equiv \frac{1}{\gamma(\omega\nu_\ell)} = \sum_{n=0}^{\infty} \frac{\chi_n(\|\omega\nu_\ell\|)}{\gamma(\omega\nu_\ell)} \equiv \sum_{n=0}^{\infty} g^{(n)}(\omega\nu_\ell), \qquad \|x\| = \inf_{p \in \mathbb{Z}} |x - p|, \tag{5.3} $$
and call $g^{(n)}(\omega\nu_\ell)$ the propagator on scale n. Given a tree θ, we can associate to each line ℓ ∈ Λ(θ) a scale label $n_\ell$, using the multiscale decomposition (5.3) and singling out the summands with $n = n_\ell$. We shall call $n_\ell$ the scale label of the line ℓ, and we shall say also that the line ℓ is on scale $n_\ell$.
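To see the partition of unity (5.1)–(5.2) at work, one can build an explicit cutoff and check both that the χn sum to 1 and that χn is supported where (5.5) says it should be. The paper does not specify χ; the choice below (built from exp(−1/t)) is an assumption made only for illustration, as are the function names and the use of Fibonacci denominators (the convergents of the golden mean) as sample values of qn.

```python
import math

def _h(t):
    """exp(-1/t) for t > 0 and 0 otherwise: a standard C-infinity building block."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def chi(x):
    """A smooth non-increasing cutoff: equal to 1 for x <= 1 and to 0 for x >= 2."""
    a = _h(2.0 - x)
    b = _h(x - 1.0)
    return a / (a + b)   # a + b > 0 for every real x

def chi_n(n, x, q):
    """The function chi_n of (5.2); q is the list of denominators q_0, q_1, ..."""
    if n == 0:
        return 1.0 - chi(96.0 * q[1] * x)
    return chi(96.0 * q[n] * x) - chi(96.0 * q[n + 1] * x)
```

The sum over n telescopes to 1 − χ(96 q_{M+1} x), which equals 1 as soon as 96 q_{M+1} x ≥ 2; this is why finitely many scales suffice for any fixed x > 0.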
This leads in a natural way to the definition of clusters, see [8, Sect. 2, pp. 628–629]: given a tree θ, a cluster T of θ on scale n is a maximal connected set of lines on scale ≤ n with at least one line on scale n. Let us denote by $\mathcal{T}(\theta)$ the set of clusters in a tree θ. For any cluster $T \in \mathcal{T}(\theta)$ set
$$ \nu_T = \sum_{u \in V(T)} \nu_u, \qquad k_T = |V(T)|, \tag{5.4} $$
where V(T) is the set of nodes contained in T and, given a set A, we are denoting by |A| the number of elements of A. Recall that for ω ∈ Q there is an integer N = N(ω) such that pN/qN = p/q = ω, and for all ν ∈ Z \ qZ one has $\|\omega\nu\| \ge 1/q_N$: therefore for ω ∈ Q there is only a finite number of scales n = 0, 1, 2, ..., N − 1. Given a line ℓ carrying a momentum $\nu_\ell$, there can be only two (consecutive) scale labels n such that $\chi_n(\|\omega\nu_\ell\|) \ne 0$: in such a case one has
$$ \frac{1}{96 q_{n_\ell + 1}} \le \|\omega\nu_\ell\| \le \frac{1}{48 q_{n_\ell}}. \tag{5.5} $$
Thus we arrive at a slightly different definition of tree values, taking into account also the scale labels, so that (4.2) has to be replaced with
$$ \hat u^{(k)}_\nu(\omega) = \frac{1}{2^k} \sum_{\theta \in \mathcal{T}_{\nu,k}} \mathrm{Val}(\theta, \omega), \qquad \mathrm{Val}(\theta, \omega) = -i \left[ \prod_{u \in V(\theta)} \frac{\nu_u^{m_u + 1}}{m_u!} \right] \left[ \prod_{\ell \in \Lambda(\theta)} g^{(n_\ell)}(\omega\nu_\ell) \right], \tag{5.6} $$
where here and henceforth $\mathcal{T}_{\nu,k}$ denotes the set of trees whose lines carry also a scale label.
5.2. Resonances. We recall briefly the definition of resonance from [8, Sect. 2, p. 629]. Given a cluster T, let $m_T$ be the number of entering lines (so that $m_T \ge 0$) and let $k_T$ be the number of nodes in T; we shall denote with $n_T$ the scale of the cluster T, with $n^i_T$ the minimum of the scales of the lines entering T, and with $n^o_T$ the scale of the line exiting T. Given a tree θ, a cluster V of θ will be called a resonance with resonance-scale $n = n^R_V \equiv \min\{n^i_V, n^o_V\}$, if
(1) the sum of the mode labels of its nodes is 0,
(2) all the lines entering V are on the same scale except at most one, which can be on a higher scale;
(3) $n^i_V \le n^o_V$ if $m_V \ge 2$, and $|n^i_V - n^o_V| \le 1$ for $m_V = 1$;
(4) $k_V < q_n$;
(5) $m_V = 1$ if $q_{n+1} \le 4q_n$;
(6) if $q_{n+1} > 4q_n$ and $m_V \ge 2$, denoting by $k_0$ the sum of the orders of the subtrees of order $< q_{n+1}/4$ entering V, either (a) there is only one subtree of order $k_1 \ge q_{n+1}/4$ entering V and $k_0 < q_{n+1}/8$, or (b) there is no such subtree and $k_0 + k_0 < q_{n+1}/4$.
We refer to [8] for further details. Let us denote by $N_n(\theta)$ the number of lines ℓ ∈ Λ(θ) on scale n and by $P_n(\theta)$ the number of resonances $T \in \mathcal{T}(\theta)$ on scale n. Set also $M_n(\theta) = N_n(\theta) + P_n(\theta)$. Finally let us denote by $N^R_n(\theta)$ the number of resonances $T \in \mathcal{T}(\theta)$ with resonance-scale n.
Lemma 5.1. For any tree $\theta \in \mathcal{T}_{\nu,k}$ one has
$$ M_n(\theta) \le \frac{2k}{q_n} + \frac{8k}{q_{n+1}} + N^R_n(\theta), \tag{5.7} $$
and $M_n(\theta) = 0$ if $k < q_n$.
Proof of Lemma 5.1. The proof is as in [8, Sect. 5], for ω an irrational number; it is not difficult to realize that the same proof works also for ω a rational number (see also the comments in Sect. 6 below).
5.3. Tree formalism for the function uN − u. Consider both uN ≡ u(α, ε, ωN) and u ≡ u(α, ε, ω). We can apply the renormalization scheme as in [8]: everything proceeds in the same way. In particular the set $\mathcal{T}_{\nu,k}$ has to be enlarged to a set $\mathcal{T}^*_{\nu,k}$ in which the bound (5.5) can be violated (see [8] and Sect. 6.1 below); nevertheless, for any tree $\theta \in \mathcal{T}^*_{\nu,k}$, if a line ℓ carries a momentum $\nu_\ell$ and a scale $n_\ell$, then
$$ \frac{1}{768 q_{n_\ell + 1}} \le \|\omega\nu_\ell\| \le \frac{1}{8 q_{n_\ell}}, \tag{5.8} $$
whenever $\chi_{n_\ell}(\|\omega\nu_\ell\|) \ne 0$. Then Lemmata 5.1 and 4.3 still apply, as their proofs are based on the bound (5.8); see [8] and Sect. 6 below for details. In order to prove Theorem 4.1 we have to consider the function
$$ \begin{aligned} u_N - u \equiv u(\alpha, \varepsilon, \omega_N) - u(\alpha, \varepsilon, \omega) &= \sum_{k=1}^{\infty} \varepsilon^k \left[ u^{(k)}(\alpha, \omega_N) - u^{(k)}(\alpha, \omega) \right] \\ &= \sum_{k=1}^{\infty} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \left[ \hat u^{(k)}_\nu(\omega_N) - \hat u^{(k)}_\nu(\omega) \right], \end{aligned} \tag{5.9} $$
where both coefficients $\hat u^{(k)}_\nu(\omega_N)$ and $\hat u^{(k)}_\nu(\omega)$ can be expressed in terms of trees; moreover the sum over the Fourier labels has to satisfy the constraint |ν| ≤ k (see Lemma 3.2), and one has $\hat u^{(k)}_\nu(\omega_N) = 0$ for all ν ∈ qZ (see Corollary 3.2). We want to prove that it is possible to choose a neighborhood of the origin in the ε-plane, call it B(0), such that for all η > 0 there exists N0 ∈ N such that for all N > N0 and for all ε ∈ B(0) one has
$$ |u(\alpha, \varepsilon, \omega_N) - u(\alpha, \varepsilon, \omega)| < \eta. \tag{5.10} $$
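We can split the sum over $k$ in (5.9) into two sums, the first one from $k = 1$ to $k = q_N/4$ and the second one over $k > q_N/4$, i.e.
$$u(\alpha, \varepsilon, \omega_N) - u(\alpha, \varepsilon, \omega) = \sum_{k=1}^{q_N/4} \varepsilon^k \left( u^{(k)}(\alpha, \omega_N) - u^{(k)}(\alpha, \omega) \right) + \sum_{k=(q_N/4)+1}^{\infty} \varepsilon^k \left( u^{(k)}(\alpha, \omega_N) - u^{(k)}(\alpha, \omega) \right). \tag{5.11}$$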
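By Lemma 4.3 (and by Lemma 3.2) one has
$$\begin{aligned}
\Big| \sum_{k=(q_N/4)+1}^{\infty} \varepsilon^k \left( u^{(k)}(\alpha, \omega_N) - u^{(k)}(\alpha, \omega) \right) \Big|
&\le \sum_{k=(q_N/4)+1}^{\infty} |\varepsilon|^k \left( |u^{(k)}(\alpha, \omega_N)| + |u^{(k)}(\alpha, \omega)| \right) \\
&\le \sum_{k=(q_N/4)+1}^{\infty} |\varepsilon|^k (2k+1) D^k \left( e^{2kB_1(\omega)} + e^{2kB_1(\omega_N)} \right) \le \left( \frac{1}{2} \right)^{q_N},
\end{aligned} \tag{5.12}$$
provided that one chooses the radius $\rho(B(0)) \equiv \rho_0 e^{-\beta B_1(\omega)}$, with $\beta \ge 2$ and $\rho_0$ small enough. Of course one can suppose that $N$ is so large that
$$\left( \frac{1}{2} \right)^{q_N} \le \frac{1}{2}\eta. \tag{5.13}$$
So we are left with the first sum in (5.11), i.e.
$$\sum_{k=1}^{q_N/4} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \left( \hat u^{(k)}_\nu(\omega_N) - \hat u^{(k)}_\nu(\omega) \right). \tag{5.14}$$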
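By taking into account (5.6) we can write
$$\hat u^{(k)}_\nu(\omega_N) - \hat u^{(k)}_\nu(\omega) = \frac{1}{2^k} \sum_{\theta \in T_{\nu,k}} \mathrm{Val}(\theta, \omega_N, \omega), \qquad \mathrm{Val}(\theta, \omega_N, \omega) = \mathrm{Val}(\theta, \omega_N) - \mathrm{Val}(\theta, \omega), \tag{5.15}$$
so that
$$\mathrm{Val}(\theta, \omega_N, \omega) = -i \Big[ \prod_{u \in V(\theta)} \frac{\nu_u^{m_u+1}}{m_u!} \Big] \Big[ \prod_{\ell \in \Lambda(\theta)} g^{(n_\ell)}(\omega_N \nu_\ell) - \prod_{\ell \in \Lambda(\theta)} g^{(n_\ell)}(\omega \nu_\ell) \Big]. \tag{5.16}$$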
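Then we can write $\mathrm{Val}(\theta, \omega_N, \omega)$ as the sum of $k$ terms corresponding to trees whose lines have all the propagators of the form either $g^{(n_\ell)}(\omega_N \nu_\ell)$ or $g^{(n_\ell)}(\omega \nu_\ell)$, up to one which has a new propagator given by the difference $g^{(n_\ell)}(\omega_N \nu_\ell) - g^{(n_\ell)}(\omega \nu_\ell)$; see [19] and [20] for an analogous discussion.

Given a tree $\theta$ we can order the lines and construct a family of subsets $\Lambda_1(\theta), \ldots, \Lambda_k(\theta)$ of $\Lambda(\theta)$, with $|\Lambda_j(\theta)| = j - 1$, in the following way. Set $\Lambda_1(\theta) = \emptyset$, $\Lambda_2(\theta) = \ell_1$, if $\ell_1$ is the root line of $\theta$ and, inductively for $2 \le j \le k$, $\Lambda_{j+1}(\theta) = \Lambda_j(\theta) \cup \ell_j$, where the line $\ell_j \in \Lambda(\theta) \setminus \Lambda_j(\theta)$ is connected to $\Lambda_j(\theta)$; of course $\Lambda_{k+1}(\theta) = \Lambda(\theta)$. Then
$$\mathrm{Val}(\theta, \omega_N, \omega) = -i \Big[ \prod_{u \in V(\theta)} \frac{\nu_u^{m_u+1}}{m_u!} \Big] \sum_{j=1}^{k} \Big[ \prod_{\ell \in \Lambda_j(\theta)} g^{(n_\ell)}(\omega_N \nu_\ell) \Big] \Big( g^{(n_{\ell_j})}(\omega_N \nu_{\ell_j}) - g^{(n_{\ell_j})}(\omega \nu_{\ell_j}) \Big) \Big[ \prod_{\ell \in \Lambda(\theta) \setminus \Lambda_{j+1}(\theta)} g^{(n_\ell)}(\omega \nu_\ell) \Big], \tag{5.17}$$
where, by construction, the sets $\Lambda_j(\theta)$ are connected (while of course $\Lambda(\theta) \setminus \Lambda_{j+1}(\theta)$ in general are not).

Lemma 5.2. With the notations of Lemma 4.3, there exist two positive constants $D_0$ and $D_1$ such that one has
$$\left| \hat u^{(k)}_\nu(\omega_N) - \hat u^{(k)}_\nu(\omega) \right| \le \frac{1}{q_{N+1}} D_0 D^k D_1^k e^{2kB_1(\omega)}, \tag{5.18}$$
for any $k \le q_N/4$.

Proof of Lemma 5.2. The proof is given in Sect. 7.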
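The telescoping decomposition behind (5.17) can be illustrated numerically. The following is a minimal sketch, not taken from the paper: the lists `a` and `b` stand for the propagator values $g^{(n_\ell)}(\omega_N\nu_\ell)$ and $g^{(n_\ell)}(\omega\nu_\ell)$ along a fixed ordering $\ell_1, \ldots, \ell_k$ of the lines, the numerical values are arbitrary, and the function name `telescoped_terms` is hypothetical.

```python
import math


def telescoped_terms(a, b):
    """Return the k terms whose sum equals prod(a) - prod(b).

    Term j uses a[i] for i < j, the difference a[j] - b[j] at position j,
    and b[i] for i > j, mirroring the structure of (5.17).
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    terms = []
    for j in range(len(a)):
        left = math.prod(a[:j])        # factors already evaluated at omega_N
        right = math.prod(b[j + 1:])   # factors still evaluated at omega
        terms.append(left * (a[j] - b[j]) * right)
    return terms


# Illustrative values only (assumed, not from the paper).
a = [1.5, -2.0, 0.25, 3.0]
b = [1.4, -2.1, 0.30, 2.9]
total = sum(telescoped_terms(a, b))
direct = math.prod(a) - math.prod(b)
```

Each of the `k` terms contains exactly one difference factor, which is the factor that Lemma 7.1 later shows to be small.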
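5.4. Conclusions. By Lemma 5.2 we can bound (5.14) as
$$\Big| \sum_{k=1}^{q_N/4} \varepsilon^k \sum_{\nu \in \mathbb{Z}} e^{i\nu\alpha} \left( \hat u^{(k)}_\nu(\omega_N) - \hat u^{(k)}_\nu(\omega) \right) \Big| \le \sum_{k=1}^{q_N/4} |\varepsilon|^k (2k+1) \frac{D_0}{q_{N+1}} \left( D D_1 e^{2B_1(\omega)} \right)^k \le \frac{2D_0}{q_{N+1}}, \tag{5.19}$$
provided that $\beta \ge 2$ and $\rho_0$ is small enough. Then one can suppose $N$ so large that
$$\frac{2D_0}{q_{N+1}} \le \frac{1}{2}\eta, \tag{5.20}$$
so that, by collecting together (5.13) and (5.20), we obtain (5.10).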
6. Comments about the Proof of Theorem 3.12

6.1. Preliminaries. We recall now the main heuristic ideas behind the proof of [8] for the reader not completely familiar with it. This will also clarify the apparently mysterious choice of the constants in the definitions of the scales. If we disregard resonances, the proof of [8] becomes very simple: basically one needs to prove Lemma 5 in [8, p. 631], relatively easy if $P_n(\theta) = N_n^R(\theta) = 0$ (see the end of Sect. 5.2 for notations), as it happens in the case of the semistandard map (see [21]). In fact we would have
$$|\mathrm{Val}(\theta)| \le D_1^k \prod_{n=0}^{\infty} (D_2 q_{n+1})^{2N_n(\theta)} \le D_1^k \prod_{n=0}^{\infty} (D_2 q_{n+1})^{2k/q_n + 8k/q_{n+1}} \tag{6.1}$$
(because of (5.5), with $D_2 = 96$).
Now, as it is trivial to see that $\sum_{n=1}^{\infty} (\log q_n)/q_n$ is convergent, it is easy to prove the claim. We recall that the main arithmetic tool behind the proof of Lemma 5 in [8] is Davie's lemma [14], which we quote here in a slightly extended version.

Lemma 6.1. Let $a > 2$, $b \ge 2a/(a-2)$. Given $\nu \in \mathbb{Z}$ such that $\|\omega\nu\| \le 1/(a q_n)$, then 1. either $\nu = 0$ or $|\nu| \ge q_n$, 2. either $|\nu| \ge q_{n+1}/b$ or $\nu = s q_n$, for some integer $s$.

For simplicity we choose $a = b = 4$ as in [14], but other choices would be equally good. The choice of the constants $a$ and $b$ sets some rather sharp constraints on all other strange-looking constants, for instance those used in the definition of the scales in Sect. 5.1, as we are going to discuss below.

To deal with resonances we need to exploit cancellations arising when summing over trees of given order and total momentum. Suitable resummation must be performed, whose effect is that, for each resonance $V$, it is as if one of the external lines on scale $n_V^R$ contributed $(D_2 q_{n_V+1})^2$ instead of $(D_2 q_{n_V^R+1})^2$. In the course of exhibiting such cancellations, one needs to perform transformations on trees which extend the set of trees being considered to a larger set $T^*_{\nu,k}$ (see [8, p. 633]).

Now, suppose that the scales had been defined in such a way that for a line on scale $n$ one obtains
$$\frac{1}{c' q_{n+1}} \le \|\omega\nu\| \le \frac{1}{c q_n}, \tag{6.2}$$
(we had $c' = 96$ and $c = 48$). The effect of the above mentioned transformations is such that, given a resonance $V$, one has to consider all the resonances which are obtained by shifting its entering lines. This implies that for any line $\ell$ in $V$ the corresponding momentum $\nu_\ell$ can be changed into a new value $\nu'_\ell$, as it follows from (4.2); call $\nu(\ell)$ the set of all momenta $\nu'_\ell$ which can be associated to the line $\ell$ in this way. As a consequence, for each $\nu'_\ell \in \nu(\ell)$, there will be a value $n'$ different from the original scale $n$ such that $\chi_{n'}(\omega\nu'_\ell) \ne 0$. If a line is contained inside several resonances, the above argument has to be applied for all such resonances.

Now, in order to exploit the cancellations assuring the convergence of the series (3.21), for each line $\ell$ one has to consider together all the scales $n'$ which are obtained
by the above described procedure (see [8, Sect. 3]). The latter scales are defined by the condition
$$\frac{1}{c' q_{n'+1}} \le \|\omega\nu'_\ell\| \le \frac{1}{c q_{n'}} \tag{6.3}$$
if $\nu'_\ell \in \nu(\ell)$. The essential fact is that such scales $n'$ are not arbitrary, on the contrary they are related to the original scale $n$, as one obtains (see [8, Lemma 4 and Sects. 3 and 4])
$$\frac{1}{d' q_{n+1}} \le \|\omega\nu'_\ell\| \le \frac{1}{d q_n}, \tag{6.4}$$
with $d' > c' > c > d$ (and consequently $D_2$ grows to $d'$ in (6.1)), provided the constants $c$ and $c'$ in (6.2) are suitably chosen. More precisely requiring that (6.4) be satisfied imposes two constraints on $c$ and $c'$: in fact they must be chosen in such a way that (i) constants $d$ and $d'$ such that (6.3) holds actually exist, and (ii) one must have d < 1/2b for Davie's lemma to be of some use. We found that $c = 48$ and $c' = 96$ is a choice compatible with those constraints, once one has chosen $a = b = 4$ in Davie's lemma, as follows from the bulk of [8, Sects. 3 and 4]: such a choice, if denoting by n ∈ ν($\ell$) the scale n associated to the line $\ell$ in (6.4), gives (5.8).

As anticipated in Sect. 3.2, instead of providing a complete proof of Theorem 3.8, which would require repeating, essentially word by word, the discussion in [8], we prefer to show the points in which the analysis has to be slightly changed, trying to convince the reader why almost the same proof, up to a very few minor adaptations, in fact still works.

6.2. Technical differences. The first item of Davie's lemma (see above and [8, p. 630, Lemma 1]) has to be replaced with: either $\nu \in q\mathbb{Z}$ or $|\nu| \ge q_n$, as $\|\omega n q\| = 0$ for all $n \in \mathbb{Z}$. As remarked in Sect. 5.1 for $\omega = p/q$ there is a finite number of scales, as $n \le N - 1$, if $q_N = q$: this follows from the fact that $\|\omega\nu\| \ge 1/q_N$ for all $\nu \in \mathbb{Z} \setminus q\mathbb{Z}$ and the very definition of scale. As a consequence, in the proof of Lemma 5.1 for rational rotation numbers, one can proceed as for the proof of Lemma 5 of [8], and, when discussing the case [2.2.3.2] in [8, Sect. 5], one can have either $|\nu - \nu_1| \ge q_{n+1}/4$, or $\nu - \nu_1 = \tilde s q_n$ or $|\nu - \nu_1| = q_N$ (as the case $\nu = \nu_1$ can be included in the previous one when $\tilde s = 0$). But the last case gives $|\nu - \nu_1| \ge q \ge q_N/4 \ge q_{n+1}/4$ which gives the case [2.2.3.2.1]. These are the only real technical differences in the proof of the lemma: we are left with the problem of verifying that the proof can then be performed by following the analysis of [8].
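Davie's lemma (Lemma 6.1) with the choice $a = b = 4$ lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $\omega$ to be the golden mean $[0; 1, 1, 1, \ldots]$, works in floating point, scans only $|\nu| \le 500$, and uses hypothetical helper names (`denominators`, `dist_to_int`, `davie_violations`).

```python
import math


def denominators(partial_quotients):
    """Denominators q_0, q_1, ... of the convergents of [0; a_1, a_2, ...]."""
    q_prev, q = 0, 1                      # q_{-1} = 0, q_0 = 1
    qs = [q]
    for a in partial_quotients:
        q_prev, q = q, a * q + q_prev     # q_n = a_n q_{n-1} + q_{n-2}
        qs.append(q)
    return qs


def dist_to_int(x):
    """Distance of x from the nearest integer, i.e. ||x||."""
    return abs(x - round(x))


def davie_violations(omega, qs, nu_max, a=4.0, b=4.0):
    """List the pairs (n, nu) for which Lemma 6.1 would fail (expected: none)."""
    bad = []
    for n in range(len(qs) - 1):
        qn, qn1 = qs[n], qs[n + 1]
        for nu in range(-nu_max, nu_max + 1):
            if dist_to_int(omega * nu) > 1.0 / (a * qn):
                continue                  # hypothesis of the lemma not met
            item1 = (nu == 0) or (abs(nu) >= qn)
            item2 = (abs(nu) >= qn1 / b) or (nu % qn == 0)
            if not (item1 and item2):
                bad.append((n, nu))
    return bad


omega = (math.sqrt(5.0) - 1.0) / 2.0      # golden mean (assumed test case)
qs = denominators([1] * 12)               # 1, 1, 2, 3, 5, 8, ...
violations = davie_violations(omega, qs, nu_max=500)
```

For a rational rotation number $p/q$ the first item would have to be tested in the modified form stated in Sect. 6.2 ($\nu \in q\mathbb{Z}$ instead of $\nu = 0$); the sketch does not cover that case.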
We shall briefly discuss such issues, by using the notations and concepts introduced in [8], with no further reference to it.

6.3. Bound (5.7). We already noticed that the proof of Lemma 5.1 can be carried out as in [8]: here we would want to give an intuitive argument to see why it is so. Basically the bound on the number of lines and resonances on scale $n$ implicit in (5.7), is worked out by finding a bound which is the worst possible one when all lines which are not on scale $n$ are on a lower scale (simply go along the proof of Lemma 5 of [8] to realize that this is the case). Then even if in the case of irrational rotation numbers $\omega$ there are many more scales than in the case of the rational rotation numbers $\omega_N$ (for $\omega$ irrational there are infinitely many scales in principle, and they can be arbitrarily large for arbitrarily large orders), the quantity $M_n(\theta)$ assumes its largest possible value when there are no lines on scale greater than $n$, so that the bound (5.7), when $n < N$, holds simultaneously for the Bryuno number $\omega$ and for the rational number $\omega_N$.
6.4. Renormalization. Moreover the renormalization procedure can be applied exactly in the same way, and no further difference arises between the case of rational numbers and the case of Bryuno numbers. The cancellation between the localized parts of the resonance values is a purely algebraic property which does not depend on the arithmetics of the rotation number, while the control of the renormalized parts is based on dimensional arguments which can be repeated unchanged once the scale labels have been fixed.

6.5. Remark. In [8] in fact the bound $|\varepsilon| < \rho(\omega) < C e^{-2B(\omega)}$ for real values of $\alpha$ was given, but, by simply noting that, for $|\varepsilon| < \rho$ and $|\mathrm{Im}\, \alpha| < \xi$, one has
$$|u(\alpha, \varepsilon, \omega)| \le \sum_{k=1}^{\infty} \sum_{|\nu| \le k} \rho^k e^{|\nu|\xi} \left| \hat u^{(k)}_\nu \right| \le \sum_{k=1}^{\infty} (2k+1) D^k e^{2kB_1(\omega)} e^{k\xi} \rho^k \le \sum_{k=1}^{\infty} \left( C^{-1} e^{2B_1(\omega)} e^{\xi} \rho \right)^k, \tag{6.5}$$
the analyticity in the domain (3.21) easily follows.
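7. Proof of Lemma 5.2

7.1. Set-up. We can write (5.17) as
$$\mathrm{Val}(\theta, \omega_N, \omega) = \sum_{j=1}^{k} \mathrm{Val}_j(\theta, \omega_N, \omega),$$
$$\mathrm{Val}_j(\theta, \omega_N, \omega) = -i \Big[ \prod_{u \in V(\theta)} \frac{\nu_u^{m_u+1}}{m_u!} \Big] \Big[ \prod_{\ell \in \Lambda_j(\theta)} g^{(n_\ell)}(\omega_N \nu_\ell) \Big] \Big( g^{(n_{\ell_j})}(\omega_N \nu_{\ell_j}) - g^{(n_{\ell_j})}(\omega \nu_{\ell_j}) \Big) \Big[ \prod_{\ell \in \Lambda(\theta) \setminus \Lambda_{j+1}(\theta)} g^{(n_\ell)}(\omega \nu_\ell) \Big], \tag{7.1}$$
and study separately each term $\mathrm{Val}_j(\theta, \omega_N, \omega)$. To any line $\ell$ we associate a rotation number $\omega_N$ if the corresponding propagator is $g^{(n_\ell)}(\omega_N \nu_\ell)$ (i.e. if $\ell \in \Lambda_j(\theta)$), and a rotation number $\omega$ if the corresponding propagator is $g^{(n_\ell)}(\omega \nu_\ell)$ (i.e. if $\ell \in \Lambda(\theta) \setminus \Lambda_{j+1}(\theta)$).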
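The difference propagator $g^{(n_{\ell_j})}(\omega_N \nu_{\ell_j}) - g^{(n_{\ell_j})}(\omega \nu_{\ell_j})$ in (7.1) can be written as follows. Set for simplicity $\nu_{\ell_j} \equiv \nu$ and $n_{\ell_j} = n$. Then
$$\begin{aligned}
g^{(n)}(\omega_N \nu) - g^{(n)}(\omega \nu) ={}& \frac{1}{2} \chi_n(\omega_N \nu) \left( \frac{1}{\gamma(\omega_N \nu)} - \frac{1}{\gamma(\omega \nu)} \right) + \frac{1}{2} \chi_n(\omega \nu) \left( \frac{1}{\gamma(\omega_N \nu)} - \frac{1}{\gamma(\omega \nu)} \right) \\
&+ \frac{1}{2} \left( \chi_n(\omega_N \nu) - \chi_n(\omega \nu) \right) \left( \frac{1}{\gamma(\omega_N \nu)} + \frac{1}{\gamma(\omega \nu)} \right),
\end{aligned} \tag{7.2}$$
where
$$\begin{aligned}
\frac{1}{2} \chi_n(\omega_N \nu) \left( \frac{1}{\gamma(\omega_N \nu)} - \frac{1}{\gamma(\omega \nu)} \right) &= g^{(n)}(\omega_N \nu)\, \frac{1}{2}\, \frac{\gamma(\omega \nu) - \gamma(\omega_N \nu)}{\gamma(\omega \nu)} \equiv g^{(n)}(\omega_N \nu)\, C_1(\omega_N \nu, \omega \nu), \\
\frac{1}{2} \chi_n(\omega \nu) \left( \frac{1}{\gamma(\omega_N \nu)} - \frac{1}{\gamma(\omega \nu)} \right) &= g^{(n)}(\omega \nu)\, \frac{1}{2}\, \frac{\gamma(\omega \nu) - \gamma(\omega_N \nu)}{\gamma(\omega_N \nu)} \equiv g^{(n)}(\omega \nu)\, C_2(\omega_N \nu, \omega \nu),
\end{aligned} \tag{7.3}$$
so that, by defining also
$$C_3(\omega_N \nu, \omega \nu) = \frac{1}{2} \left( \chi_n(\omega_N \nu) - \chi_n(\omega \nu) \right), \tag{7.4}$$
we see that (7.2) becomes
$$g^{(n)}(\omega_N \nu) - g^{(n)}(\omega \nu) = g^{(n)}(\omega_N \nu)\, C_1(\omega_N \nu, \omega \nu) + g(\omega_N \nu)\, C_3(\omega_N \nu, \omega \nu) + g^{(n)}(\omega \nu)\, C_2(\omega_N \nu, \omega \nu) + g(\omega \nu)\, C_3(\omega_N \nu, \omega \nu). \tag{7.5}$$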
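In conclusion to the line $\ell_j$ there corresponds the sum of the four "propagators" in (7.5): if we select the first one or the third one we can associate to the line $\ell_j$ a rotation number $\omega_N$ and $\omega$, respectively. The other two cases can be singled out by assigning a label $*$ to the line $\ell_j$; note that in the latter case at least one of the two conditions $\chi_n(\omega_N \nu) \ne 0$ and $\chi_n(\omega \nu) \ne 0$ has to be satisfied, otherwise the quantity $C_3(\omega_N \nu, \omega \nu)$, hence the corresponding propagator, is vanishing.

Lemma 7.1. There is a constant $C_0$ such that one has
$$|C_i(\omega_N \nu, \omega \nu)| \le \frac{k C_0}{q_{N+1}}, \tag{7.6}$$
for $i = 1, 2, 3$.

Proof of Lemma 7.1. Let us denote by $C$ any constant. One has
$$|\omega - \omega_N| = \left| \omega - \frac{p_N}{q_N} \right| = \frac{1}{q_N} |\omega q_N - p_N| = \frac{1}{q_N} \|\omega q_N\| < \frac{1}{q_N q_{N+1}}, \tag{7.7}$$
so that
$$|(\omega_N - \omega)\nu| \le \frac{k}{q_N q_{N+1}}, \tag{7.8}$$
provided that $|\nu| \le k$, as is the case in a tree $\theta \in T^*_{\nu,k}$. Then
$$|\chi_n(\omega_N \nu) - \chi_n(\omega \nu)| \le |(\omega_N - \omega)\nu| \int_0^1 dt \left| \frac{\partial \chi_n}{\partial x}\big(\omega \nu + t(\omega_N - \omega)\nu\big) \right| \le \frac{k}{q_N q_{N+1}}\, C q_{n+1} \le \frac{Ck}{q_{N+1}}, \tag{7.9}$$
which proves (7.6) for $i = 3$. Furthermore
$$\begin{aligned}
\left| \frac{\gamma(\omega \nu) - \gamma(\omega_N \nu)}{\gamma(\omega_N \nu)} \right| &= \left| \frac{\cos 2\pi\omega\nu - \cos 2\pi\omega_N\nu}{\cos 2\pi\omega_N\nu - 1} \right| \le \frac{|2\pi(\omega_N - \omega)\nu|}{|\cos 2\pi\omega_N\nu - 1|} \int_0^1 dt\, \left| \sin 2\pi\big(\omega\nu + t(\omega_N - \omega)\nu\big) \right| \\
&\le C \|\omega_N \nu\|^{-2} \max\{\|\omega_N \nu\|, \|\omega \nu\|\}\, \frac{k}{q_N q_{N+1}} \le C q_N \frac{k}{q_N q_{N+1}},
\end{aligned} \tag{7.10}$$
which proves (7.6) for $i = 2$; the proof for $i = 1$ is analogous.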
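The elementary estimate (7.7), $|\omega - p_N/q_N| < 1/(q_N q_{N+1})$, is easy to verify numerically on a concrete rotation number. The following sketch is only an illustration: it assumes $\omega = \sqrt{2} - 1 = [0; 2, 2, 2, \ldots]$, uses 60-digit decimal arithmetic so that rounding plays no role at the sizes considered, and the helper names `convergents` and `check_convergent_bound` are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60                    # ample precision for q_N up to ~5e5


def convergents(partial_quotients):
    """Convergents (p_n, q_n), n >= 0, of the continued fraction [0; a_1, a_2, ...]."""
    p_prev, q_prev = 1, 0                 # p_{-1}, q_{-1}
    p, q = 0, 1                           # p_0, q_0
    out = [(p, q)]
    for a in partial_quotients:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out


def check_convergent_bound(omega, cv):
    """True if |omega - p_N/q_N| < 1/(q_N q_{N+1}) for every consecutive pair."""
    for (p, q), (_, q_next) in zip(cv, cv[1:]):
        gap = abs(omega - Decimal(p) / Decimal(q))
        if not gap < Decimal(1) / Decimal(q * q_next):
            return False
    return True


omega = Decimal(2).sqrt() - 1             # assumed test value: sqrt(2) - 1
cv = convergents([2] * 15)
bound_holds = check_convergent_bound(omega, cv)
```

Multiplying the bound by $|\nu| \le k$ gives (7.8), and for $k < q_N/4$ the right-hand side of (7.8) is smaller than $1/(4 q_{N+1})$, which is the form used in the proof of Lemma 7.2.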
7.2. Bounds. We see that each value $\mathrm{Val}_j(\theta, \omega_N, \omega)$ splits into the sum of four contributions through (7.5). Each contribution not containing $C_3(\omega_N \nu, \omega \nu)$, up to a factor $C_i(\omega_N \nu, \omega \nu)$, $i = 1, 2$, is of the same form as either $\mathrm{Val}(\theta, \omega_N)$ or $\mathrm{Val}(\theta, \omega)$, with the only difference that for a connected subset $S_j(\theta)$ of $\theta$ the rotation number is $\omega_N$, while for the set $\theta \setminus S_j(\theta)$ the rotation number is $\omega$; with the notations introduced in Sect. 5.3 one has either $S_j(\theta) = \Lambda_j(\theta)$ or $S_j(\theta) = \Lambda_j(\theta) \cup \ell_j$. For the two contributions containing $C_3(\omega_N \nu, \omega \nu)$, a further difference is that the propagator corresponding to the line $\ell_j$ does not contain either a factor $\chi_{n_{\ell_j}}(\omega_N \nu_{\ell_j})$ or a factor $\chi_{n_{\ell_j}}(\omega \nu_{\ell_j})$ (see (3.5)), but this is not important as both functions $1/\gamma(\omega_N \nu_{\ell_j})$ and $1/\gamma(\omega \nu_{\ell_j})$ admit the same bound in terms of the denominators of the convergents (see below), when the factor $C_3(\omega_N \nu, \omega \nu)$ is taken into account.

Then we can try to reason as for the proof of Lemma 5.1, i.e. along the lines of [8, Sect. 5]. The only difference is that, when considering a cluster $T$, now it can happen that a rotation number $\omega_N$ is associated to the line exiting from $T$, while a rotation number $\omega$ is associated to the lines entering $T$. This does not yield any difference but for the case [2.2.3.2] of [8, Sect. 5], in which there is only one line entering $T$: to deal with such a case we shall use the following result.

Lemma 7.2. Given any tree $\theta \in T_{\nu,k}$, with $k < q_N/4$, if $\omega_N < \omega$ (respectively $\omega_N > \omega$), then one has $\|\omega_N \nu_\ell\| < \|\omega \nu_\ell\|$ (respectively $\|\omega \nu_\ell\| < \|\omega_N \nu_\ell\|$) for all $\ell \in \Lambda(\theta)$ with scales $n_\ell \ge 1$.

Proof of Lemma 7.2. Suppose that one has $0 < \omega_N < \omega < 1$. If a line $\ell$ is on scale $n$, by setting $\nu = \nu_\ell$ one has either $\|\omega_N \nu\| \le 1/8q_n$ or $\|\omega \nu\| \le 1/8q_n$ (see (5.8)), according to which type of rotation number is associated to $\ell$, so that, for $k < q_N/4$, one obtains
$$\max\{\|\omega_N \nu\|, \|\omega \nu\|\} \le \frac{1}{8q_n} + \frac{1}{4q_{N+1}}, \tag{7.11}$$
by (7.8). We can write $\omega_N \nu = \|\omega_N \nu\| + r_N$ and $\omega \nu = \|\omega \nu\| + r$, for suitable $r_N, r \in \mathbb{Z}$. Therefore one has
$$\|\omega_N \nu\| = \omega_N \nu - r_N = \omega \nu + (\omega_N - \omega)\nu - r_N = \|\omega \nu\| + (\omega_N - \omega)\nu + (r - r_N), \tag{7.12}$$
so that, by using (7.8) and (7.11),
$$|r_N - r| = \big| \|\omega \nu\| - \|\omega_N \nu\| + (\omega_N - \omega)\nu \big| \le \frac{1}{8q_n} + \frac{1}{8q_n} + \frac{1}{4q_{N+1}} + \frac{1}{4q_{N+1}} \le \frac{1}{8} + \frac{1}{8} + \frac{1}{4} + \frac{1}{4} < 1, \tag{7.13}$$
which yields $r_N = r$. Therefore, as we are assuming $\omega_N < \omega$, one has $|\omega_N \nu| < |\omega \nu|$ for all $\nu \in \mathbb{Z} \setminus \{0\}$, so that $\|\omega_N \nu_\ell\| < \|\omega \nu_\ell\|$ for all the above considered lines. In the same way one proves that, if assuming $\omega_N > \omega$, then one obtains the inequality $\|\omega_N \nu_\ell\| > \|\omega \nu_\ell\|$ for all lines $\ell \in \Lambda(\theta)$, $\theta \in T_{\nu,k}$, on scales $n_\ell \ge 1$.
7.3. Conclusions. When discussing the case [2.2.3.2] of [8], one can have the line $\ell$ with an associated rotation number $\omega_N$ and the line $\ell_1$ with an associated rotation number $\omega$, such that
$$\|\omega_N \nu\| \le \frac{1}{8q_n}, \qquad \|\omega \nu_1\| \le \frac{1}{8q_n}, \tag{7.14}$$
where $\nu = \nu_\ell$ and $\nu_1 = \nu_{\ell_1}$ (see [8, p. 648]). By Lemma 7.2 one has either $\|\omega \nu\| \le \|\omega_N \nu\| \le 1/8q_n$ or $\|\omega_N \nu_1\| \le \|\omega \nu_1\| \le 1/8q_n$, so that (7.11) in [8] has to be replaced with
$$\min\{\|\omega(\nu - \nu_1)\|, \|\omega_N(\nu - \nu_1)\|\} \le \min\{\|\omega \nu\| + \|\omega \nu_1\|, \|\omega_N \nu\| + \|\omega_N \nu_1\|\} \le \frac{1}{4q_n}. \tag{7.15}$$
If the label $*$ is associated to the line $\ell_j$, by observing that one has either $\chi_n(\omega_N \nu) \ne 0$ or $\chi_n(\omega \nu) \ne 0$, one can proceed in the same way.

Finally, in order to bound the small divisors, we can use that if a line is on scale $n$ then the corresponding propagator is bounded by a constant times $q_{n+1}^2$: this is trivial for all lines except, if there is any, for the one carrying the label $*$, about which one can reason as follows. Suppose for concreteness that one has $\omega_N < \omega$ (the case $\omega_N > \omega$ can be dealt with in the same way). It is easy to realize that, besides trivial cases, the only case which really deserves a careful analysis corresponds to the propagator $g(\omega_N \nu_1) C_3(\omega_N \nu_1, \omega \nu_1)$ when $\chi_n(\omega \nu_1) \ne 0$ and $\chi_n(\omega_N \nu_1) = 0$ (so that $\|\omega_N \nu_1\| < \|\omega \nu_1\|$ by Lemma 7.2). We can use that for $k < q_N/4$ one has $\|\omega \nu_1\| - \|\omega_N \nu_1\| < 1/4q_{N+1}$ by (7.8), $\|\omega \nu_1\| > 1/768q_{n+1}$ by (5.8), and $\|\omega_N \nu_1\| > 1/2q_{N+1}$ by [8, (3.15)]. Then $\|\omega_N \nu_1\| > \|\omega \nu_1\| - 1/4q_{N+1}$, so that, if $1/4q_{N+1} < 1/1536q_{n+1}$ one has $\|\omega_N \nu_1\| > 1/1536q_{n+1}$, while if $1/4q_{N+1} > 1/1536q_{n+1}$ one has $\|\omega_N \nu_1\| > 1/2q_{N+1} > 1/768q_{n+1}$: hence in both cases one has $\|\omega_N \nu_1\| > 1/1536q_{n+1}$. From here on the discussion proceeds as in [8], with no further difference.
Moreover, by taking into account the sum over $j$ in (7.1) and, for all $j$, the sum over the four terms in (7.5), we have an extra factor $k^5 \le e^{5k}$. In conclusion the same bound as (4.4) follows, with the only difference that there is a factor $C_0 k/q_{N+1} \le C_0 e^k/q_{N+1}$, arising from (7.6). Then (5.18) follows with $D_0 = 4C_0$ (where the factor 4 takes into account the fact that for a line carrying a label $*$ one can substitute $768^2$ with $1536^2$ in bounding the corresponding propagator) and $D_1 = e^6$. This completes the proof of the lemma.

Acknowledgements. We thank Ugo Bessi for many enlightening discussions, and Giovanni Gallavotti for a critical reading of the manuscript.
References
1. Eliasson, L.H.: Absolutely convergent series expansions for quasi-periodic motions. Math. Phys. Electron. J. 2, paper 4, 1–33 (1996) (Preprint 1988)
2. Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164, 145–156 (1994)
3. Gentile, G.: Diagrammatic techniques in perturbation theory, and applications. In: Proceedings of "Symmetry and Perturbation Theory II" (Rome, 16–22 December 1998), Ed. A. Degasperis and G. Gaeta, Singapore: World Scientific, 1999, 59–78
4. Greene, J.M.: A method for determining a stochastic transition. J. Math. Phys. 20, 1183–1201 (1979)
5. Chirikov, B.V.: A universal instability of many dimensional oscillator systems. Phys. Rep. 52, 264–379 (1979)
6. Gentile, G., Mastropietro, V.: Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications. Rev. Math. Phys. 8, 393–444 (1996)
7. Bernstein, D., Katok, A.: Birkhoff periodic orbits for small perturbations of completely integrable Hamiltonian systems with convex Hamiltonians. Invent. Math. 88, 225–241 (1987)
8. Berretti, A., Gentile, G.: Bryuno function and the standard map. Commun. Math. Phys. 220, 623–656 (2001) (Preprint 1998)
9. Birkhoff, G.D.: Dynamical systems. Providence: American Mathematical Society, 1966; reprinting of the original edition [New York, 1927]
10. Celletti, A., Falcolini, C.: Singularities of periodic orbits near invariant curves. Preprint (2001)
11. Schmidt, W.M.: Diophantine approximation. Lecture Notes in Mathematics 785, Berlin: Springer, 1980
12. Yoccoz, J.-C.: Théorème de Siegel, Nombres de Bruno et Polynômes Quadratiques. Astérisque 231, 3–88 (1995) (Preprint 1988)
13. Moser, J.K.: On invariant curves of area preserving mappings of the annulus. Nachr. Akad. Wiss. Göttingen, Math. Phys. Kl. II 1962, 1–20 (1962)
14. Davie, A.M.: The critical function for the semistandard map. Nonlinearity 7, 219–229 (1994)
15. Rüssmann, H.: On the frequencies of quasi-periodic solutions of analytic nearly integrable Hamiltonian systems. Seminar on Dynamical Systems (St. Petersburg, 1991), Progr. Nonlinear Differential Equations Appl. 12, Basel: Birkhäuser, 1994, pp. 160–183
16. Berretti, A., Gentile, G.: Scaling properties for the radius of convergence of Lindstedt series: The standard map. J. Math. Pures Appl. (9) 78, 159–176 (1999)
17. Berretti, A., Gentile, G.: Scaling properties for the radius of convergence of Lindstedt series: Generalized standard maps. J. Math. Pures Appl. (9) 79, 691–713 (2000)
18. Titchmarsh, E.C.: The theory of functions. London: Oxford University Press, 1933; 2nd edition, 1939
19. Bartuccelli, M., Gentile, G.: Lindstedt series for perturbations of isochronous systems. A review of the general theory. Rev. Math. Phys. 14, no. 2, 121–171 (2002)
20. Gallavotti, G., Gentile, G.: Hyperbolic low-dimensional invariant tori and summations of divergent series. To appear in Commun. Math. Phys.
21. Berretti, A., Gentile, G.: Renormalization Group and field theoretic techniques for the analysis of the Lindstedt series. Regul. Chaotic Dyn. 6, no. 4, 389–420 (2001)

Communicated by G. Gallavotti
Commun. Math. Phys. 231, 157 – 188 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0680-9
Gelation in Coagulation and Fragmentation Models
M. Escobedo¹, S. Mischler²,³, B. Perthame³
¹ Departamento de Matemáticas, Universidad del País Vasco, Apartado 644, 48080 Bilbao, Spain.
² Laboratoire de Mathématiques Appliquées, Université de Versailles – Saint Quentin, 45 avenue des Etats-Unis, 78035 Versailles, France.
³ DMA, UNM 8553, Ecole Normale Supérieure, 45 rue d'Ulm, 75230 Paris cedex 05, France.
Received: 28 September 2001 / Accepted: 1 April 2002 Published online: 2 October 2002 – © Springer-Verlag 2002
Abstract: Rates of decay for the total mass of the solutions to Smoluchowski's equation with homogeneous kernels of degree $\lambda > 1$ are proved. That implies that gelation always occurs. Morrey estimates from below and from above on solutions around the gelation time are also obtained which are in agreement with previously known formal results on the profile of solutions at gelling time. The same techniques are applied to the coagulation-fragmentation model for which gelation is established in some particular cases.

1. Introduction and Main Results

The purpose of this work is to investigate gelling transition in a coagulation and fragmentation model. The simplest model is the Smoluchowski coagulation equation describing irreversible aggregation processes between particles which coalesce and form larger and larger clusters. Denoting by $f(t, y) \ge 0$ the density of clusters of size $y \in \mathbb{R}_+$ at time $t \ge 0$, the continuous Smoluchowski coagulation equation reads [32]
$$\frac{\partial}{\partial t} f = Q_c(f) = \frac{1}{2} \int_0^y a(y - y', y')\, f(t, y - y')\, f(t, y')\, dy' - f(t, y) \int_0^\infty a(y, y')\, f(t, y')\, dy', \qquad f(0, y) = f_{in}(y) \ge 0. \tag{1.1}$$
We assume throughout this paper that the coagulation kernel $a$ is a homogeneous function of the form
$$a(y, y') = \frac{\kappa}{2} \left( y^\alpha y'^\beta + y^\beta y'^\alpha \right) \quad \text{with} \quad 0 \le \alpha \le \beta \le 1, \tag{1.2}$$
where we may take $\kappa = 1$ without any loss of generality. We refer to Remark 4.2 and Remark 7.1 for some extensions of our analysis to more general kernels. Since we are interested in gelling transition we only consider the case
$$\lambda := \alpha + \beta > 1. \tag{1.3}$$
We refer to the review papers of D.J. Aldous [2] and D.L. Drake [11] for a basic physical description of coagulation-fragmentation models as well as many other references concerning physical motivations and mathematical analysis. Let us emphasize that the results presented here are also valid for the discrete Smoluchowski coagulation equation [34]
$$\frac{\partial}{\partial t} c_i = \frac{1}{2} \sum_{j=1}^{i-1} a(i - j, j)\, c_{i-j}(t)\, c_j(t) - c_i(t) \sum_{j=1}^{\infty} a(i, j)\, c_j(t), \qquad c_i(0) = c_{i,in}, \tag{1.4}$$
for any $i \in \mathbb{N}^*$, where $c_i(t) \ge 0$ denotes the concentration of clusters of size $i \in \mathbb{N}^*$. Most of our results also extend (with minor modifications) to the non homogeneous continuous or discrete Smoluchowski coagulation equation. In order to avoid repetitions, we restrict our exposition mainly to the continuous homogeneous model (1.1). We quote the slight differences between the different models in Sect. 8 below.

The key point in the analysis of Eq. (1.1) is the following identity: for all measurable functions $f$ and $\psi$:
$$\int_0^\infty Q_c(f)\, \psi\, dy = \frac{1}{2} \int_0^\infty \int_0^\infty a(y, y')\, f(y)\, f(y')\, \tilde\psi(y, y')\, dy\, dy', \tag{1.5}$$
$$\tilde\psi(y, y') = \psi(y + y') - \psi(y) - \psi(y'). \tag{1.6}$$
This identity is obtained formally by performing the change of variables $(y, y') \to (z = y - y', y')$ in the first term of $Q_c(f)$ in (1.5) and is therefore rigorous when $f\, y^\beta \psi \in L^1(\mathbb{R}_+)$. As an immediate consequence, when we choose $\psi(y) = y$ in (1.5), so that (1.6) vanishes, we get (at least formally) the conservation of mass
$$\frac{d}{dt} \int_0^\infty y\, f(t, y)\, dy = 0. \tag{1.7}$$
Under the assumption
$$0 \le f_{in} \in L^1_1(\mathbb{R}_+), \qquad f_{in} \not\equiv 0 \tag{1.8}$$
it is well known that there exists a solution to (1.1) and (1.4) since the pioneering works of Melzak [31], Stockmayer [42] and McLeod [29, 30] for the simplest cases $\alpha = \beta = 0$. We also refer to [43, 28, 16, 5, 17] for the case $\alpha, \beta < 1$ and to [14, 23] who revisited the case $\alpha = \beta = 1$. The case $0 < \alpha < 1$, $\beta = 1$ is treated in Corollary 2.8 below. Exact solutions have also been constructed for special initial data (in particular for monodisperse distribution in the case of Smoluchowski's model) in [29, 28, 37, 27]. Finally, uniqueness (and existence) of solution has been proved in [21, 3, 39] and [13] under the assumption $\lambda \le 1$, but uniqueness of solution is an open problem under the general assumption (1.2).
Here and below we use the notation $g \in L^1_k(\mathbb{R}_+)$ to denote a measurable function such that
$$\|g\|_{L^1_k} := \int_0^\infty |g(y)|\, (1 + y^k)\, dy < \infty$$
and for such a function we define
$$M_k(g) := \int_0^\infty g(y)\, y^k\, dy. \tag{1.9}$$
The solutions to (1.1) obtained in the references quoted above always satisfy the following estimates:
$$M_0(t_1) \le M_0(t_0) \quad \text{(the number of particles decreases)}, \tag{1.10}$$
$$M_1(t_1) \le M_1(t_0) \quad \text{(the total mass decreases)}, \tag{1.11}$$
for any $t_1 \ge t_0 \ge 0$ (see [23] and Theorem 2.4). One of the most relevant questions from the physical point of view, and mathematically interesting, is whether one has equality or strict inequality in (1.11). This is the gelation problem. For $\lambda > 1$, it is known that for $f_{in} \in L^1_\lambda$ a solution of (1.1) can be constructed in such a way that it is mass conserving for small time (see [9] and [22]) i.e.:
$$\exists\, T > 0 \qquad M_1(t) := M_1(f(t, \cdot)) \equiv M_1(0) \qquad \forall t \in [0, T]. \tag{1.12}$$
For the discrete model (1.4) (then $M_k \le M_\ell$ for $k \le \ell$) and initial data satisfying $M_2(0) < \infty$, this fact can be easily understood using the following elementary argument. Thanks to (1.5)-(1.6) with $\psi(y) = y^2$, we have
$$\frac{d}{dt} M_2(t) = M_{1+\alpha}(t)\, M_{1+\beta}(t) \le M_2(t)^2, \tag{1.13}$$
and this differential inequality provides an (a priori) bound on the $L^1_2$ norm on a small interval $[0, T]$. The computation leading to (1.7) is therefore rigorous. Moreover, conservation of mass (1.12) holds true with $T = +\infty$ under the assumption $\lambda \le 1$, see [43, 41]. For the discrete model (1.4) we argue as follows
$$\frac{d}{dt} M_2(t) = M_{1+\alpha}(t)\, M_{1+\beta}(t) \le M_1^{1-\alpha}(t)\, M_2^{\alpha}(t)\, M_1^{1-\beta}(t)\, M_2^{\beta}(t) \le M_1(0)\, M_2(t),$$
where we have used twice the Hölder inequality and (1.11). That differential inequality provides an (a priori) bound on the $L^1_2$ norm of the solution on every bounded interval $[0, T]$. However, for $\lambda > 1$, mass conservation is expected to break down in finite time, i.e.: there exists $T_g \in \mathbb{R}_+$, called gelation time, such that
$$M_1(t) \equiv M_1(0) \quad \forall\, t < T_g, \qquad M_1(t) < M_1(0) \quad \forall\, t > T_g. \tag{1.14}$$
This fact was conjectured by Lushnikov and Ziff independently at the end of the 70’s.
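The a priori bound that follows from the differential inequality (1.13) can be made explicit by comparison with the scalar equation $y' = y^2$, whose solution $y(t) = y(0)/(1 - y(0)\,t)$ exists only for $t < 1/y(0)$. The sketch below is purely illustrative: it integrates the comparison equation, not the Smoluchowski system itself, the initial value $M_2(0) = 1$ is an arbitrary assumption, and the function names are hypothetical.

```python
def m2_upper_bound(m2_0, t):
    """Solution y(t) = m2_0 / (1 - m2_0 t) of y' = y^2 with y(0) = m2_0.

    By comparison with (1.13) it bounds M_2(t) from above for 0 <= t < 1/m2_0.
    """
    if m2_0 <= 0.0:
        raise ValueError("m2_0 must be positive")
    if t < 0.0 or t >= 1.0 / m2_0:
        raise ValueError("the bound is only available for 0 <= t < 1/m2_0")
    return m2_0 / (1.0 - m2_0 * t)


def rk4(rhs, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta for the autonomous ODE y' = rhs(y)."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(y + 0.5 * h * k1)
        k3 = rhs(y + 0.5 * h * k2)
        k4 = rhs(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


# Assumed data: M_2(0) = 1, so the comparison solution blows up at t = 1.
y_numeric = rk4(lambda y: y * y, 1.0, 0.5, 1000)
y_exact = m2_upper_bound(1.0, 0.5)
```

The blow-up of the comparison solution at $t = 1/M_2(0)$ is the reason why this argument only yields a bound on a small interval $[0, T]$.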
This gelation phenomenon was first addressed in the physical literature, based on explicit solutions for special initial data ([28, 37, 27, 14]) and on formal scaling arguments [18]. The physical interpretation is that after gelation, some mass is lost under the form of a particle of infinite size ($y = \infty$) with mass $M_{in} - M_1(t)$, the so-called gel part. The particles of density $f(t, y)$ are then called the sol part. It is then a microscopic description of a phase transition. The first universal and rigorous argument concluding to gelation seems to have been given by [28] in the case $\alpha = \beta = 1$. More recently, using probability arguments, I. Jeon [20] was able to construct gelling solutions to (1.4) for any initial datum when $\lambda > 1$.

P. Laurençot [23] has obtained a decay rate for the total mass $M_1(t)$ in the particular case $\alpha = \beta = 1$. We present here a sketch of the proof. Taking $\psi = 1$ in (1.5)-(1.6) we get that a solution $f$ to (1.1) satisfies for any $t_1 \ge t_0 \ge 0$,
$$\frac{1}{2} \int_{t_0}^{t_1} \int_0^\infty \int_0^\infty a(y, y')\, f(t, y)\, f(t, y')\, dy\, dy'\, dt = M_0(t_0) - M_0(t_1).$$
Thanks to the elementary inequality
$$(y\, y')^{\lambda/2} \le \left( y^\alpha y'^\beta + y^\beta y'^\alpha \right)/2 \tag{1.15}$$
and the positivity of $M_0$ we deduce the following first fundamental estimate:
$$\int_{t_0}^{\infty} M_{\lambda/2}^2(t)\, dt \le 2\, M_0(t_0) \qquad \forall t_0 \ge 0. \tag{1.16}$$
Now, when $\alpha = \beta = 1$ (so that $\lambda/2 = 1$), we deduce from (1.16) and (1.11) that
$$t\, M_1^2(t) \le \int_0^t M_1^2(s)\, ds \le 2\, M_0(0).$$
This implies the following decay rate on the total mass $M_1(t)$:
$$M_1(t) \le \frac{\sqrt{2 M_0(0)}}{\sqrt{t}} \qquad \forall\, t \ge 0. \tag{1.17}$$
In particular, $M_1(t)$ is not constant and gelation must occur in finite time. Finally, for modified models, gelation and non existence of solutions (which may be considered as instantaneous and complete gelation) have been proved in [40, 4, 8, 6].

Our first main result states that for any weak solution $f$ (see Sect. 2 for a precise definition) to the coagulation equation, with homogeneous kernel $a$ given by (1.2), (1.3), gelation occurs in finite time.

Theorem 1.1. Assume $f_{in} \not\equiv 0$. For every weak solution $f$ to (1.1)-(1.3),(1.8) there exists a positive constant $C_*$ (depending on $M_0(0)$, $M_1(0)$, $\lambda$) such that for any $t \ge 0$,
$$M_1(t) \le \frac{C_*}{(1 + t)^{1/\lambda}}. \tag{1.18}$$
As a consequence, gelation occurs in finite time: (1.14) holds with the following upper bound on $T_g$:
$$T_g \le T_* := \left( \frac{C_*}{M_1(0)} \right)^{\lambda}. \tag{1.19}$$
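The passage from the decay estimate (1.18) to the bound (1.19) on the gelation time is a one-line computation. A short worked derivation, assuming only (1.14), (1.18) and $T_* = (C_*/M_1(0))^{\lambda}$, reads:

```latex
For $t < T_g$ one has $M_1(t) = M_1(0)$ by (1.14), so (1.18) gives
\[
  M_1(0) \le \frac{C_*}{(1+t)^{1/\lambda}}
  \quad\Longrightarrow\quad
  1 + t \le \Big( \frac{C_*}{M_1(0)} \Big)^{\lambda} .
\]
Letting $t \uparrow T_g$ yields
\[
  T_g \le \Big( \frac{C_*}{M_1(0)} \Big)^{\lambda} - 1 \le T_* .
\]
```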
This result is a consequence of new momentum estimates (see Theorem 2.2 and Corollary 2.3). It recovers the previous result by I. Jeon [20] and extends it in several directions. First, our result is established for any weak solution and not for a particular well constructed solution (recall that uniqueness is not known). Our result also holds both for continuous and discrete models and is easy to extend to a non homogeneous setting (see Sect. 8). Finally, our proof is completely different from Jeon's proof (it is much more related to P. Laurençot's proof, [23]) and is very simple.

Once gelation is established, one may try to determine the asymptotic profile of the solution $f$ at gelling time $T_g$. Explicit exact solutions and formal arguments indicate that it should be a self similar profile, with algebraic spatial decay determined by the value of $\lambda$ in (1.3). More precisely:
$$f(T_g, y) \sim y^{-\frac{3+\lambda}{2}} \quad \text{when } y \to \infty, \tag{1.20}$$
see [18, 14, 10]. We present here a first rigorous result, in terms of Morrey-Campanato type estimates, which holds for every weak solution, and which is in agreement with (1.20) although it is less precise.

Theorem 1.2. Let $f$ be any weak solution to (1.1)–(1.3) and (1.8). Then, for any $\tau > 0$ and any $0 \le T_0 \le T_1$ such that
$$M_1(T_1) < M_1(T_0), \tag{1.21}$$
$f$ satisfies: $\forall R > 0$
$$\int_{T_0}^{T_1} \sup_{S > R} \left[ \frac{1}{S^\tau} \int_0^S f(t, y)\, y^{\lambda/2 + 1/2 + \tau}\, dy \right]^2 dt \le C_1, \tag{1.22}$$
and $\forall R > 0$
$$\int_{T_0}^{T_1} \sup_{S > R} \left[ \frac{1}{S^\tau} \int_0^S f(t, y)\, y^{\lambda/2 + 1/2 + \tau}\, dy \right]^2 dt \ge C_1^{-1}, \tag{1.23}$$
where C1 ∈ (0, ∞) depends on α, β, f , T0 , T1 and τ but not on R. The Morrey norms in (1.22) and (1.23) are classically used to include functions with a specific algebraic decay at infinity. The following corollary relates the estimates (1.22) and (1.23) to the algebraic decay (1.20): Corollary 1.3. Assume furthermore that f has a spatial decay profile ξ , i.e.: ∃ , M ∈ R+ , ∃ξ(y) s.t. ∀t ∈ [T0 , T1 ], ∀y ≥ M
−1 ξ(y) ≤ f (t, y) ≤ ξ(y).
(1.24)
Then, the profile ξ satisfies C2−1 ≤ lim sup ξ(y) y 3/2+λ/2 ,
lim inf ξ(y) y 3/2+λ/2 ≤ C2 ,
(1.25)
for some C2 ∈ R+ depending on C1 , and T1 − T0 . Moreover, if we also know that ξ(y) = y −θ ξ0 (y), then θ = 3/2 + λ/2 and ξ0 ≡ Const.
with ξ0 (y) y −κ −→ 0 ∀κ > 0, y→∞
(1.26)
It is then clear from Theorem 1.2 and its corollary that, at gelling time, the only possible algebraic decay for a solution is $3/2 + \lambda/2$. But since we are not able to prove (1.24) nor (1.26), other intricate large-size asymptotic behaviors are possible which could be in agreement with estimates (1.22) and (1.23).

We next investigate how the techniques that we have introduced for the coagulation equation can be used for the coagulation-fragmentation equation
\[ \frac{\partial f}{\partial t} = Q_c(f) + Q_f(f) \tag{1.27} \]
with $Q_c$ given by (1.1) and (1.2), and $Q_f$ defined by
\[ Q_f(f) = -\frac{1}{2}\, f(y) \int_0^y b(y-y', y')\, dy' + \int_0^\infty b(y,y')\, f(y+y')\, dy'. \tag{1.28} \]
This equation takes into account not only the coagulation but also the fragmentation processes by which the clusters break apart into smaller pieces. We assume throughout this paper that the fragmentation kernel $b$ satisfies
\[ 0 \le b(y,y') = b(y',y) \le B(y+y') \tag{1.29} \]
with
\[ B(z) = \frac{B}{(1+z)^{\gamma}} \tag{1.30} \]
for some $B > 0$ and $\gamma$ satisfying
\[ \frac{\lambda}{2} + \gamma \ge \frac{3}{2}. \tag{1.31} \]
The key identity for the fragmentation term, which plays the same role as (1.5)–(1.6) for the coagulation term, is
\[ \int_0^\infty Q_f(f)\, \psi\, dy = -\frac{1}{2} \int_0^\infty\!\!\int_0^\infty b(y,y')\, f(y+y')\, \tilde\psi(y,y')\, dy\, dy' = \frac{1}{2} \int_0^\infty f(z)\, k_\psi(z)\, dz, \tag{1.32} \]
with
\[ k_\psi(z) = \int_0^z b(y, z-y)\, \big( \psi(y) + \psi(z-y) - \psi(z) \big)\, dy, \tag{1.33} \]
for any $f \in L^1_1$ and any $\psi \in L^\infty$. As a consequence, the same formal conservation of mass (1.7) holds for the coagulation-fragmentation equation (1.27).

Existence results of solutions to the Cauchy problem associated to (1.27) have been proved under different growth assumptions on the kinetic kernels $a$ and $b$ and we refer to [36, 3, 12, 7] for the discrete model and [31, 1, 38, 13, 23] and [26] for the continuous model. See also Corollary 2.8.

Concerning the gelation phenomenon for the coagulation-fragmentation equation, rather few results are known. Notice that the two phenomena, coagulation and fragmentation, have opposite effects with respect to gelation. As we have already seen, if no fragmentation is present, and the coagulation is strong ($\lambda > 1$ in (1.3)), then a gelling transition
occurs for all the weak solutions (that is Theorem 1.1). On the other hand, under the condition $\lambda \le 1$, or if the fragmentation kernel $b$ is strong enough with respect to the coagulation kernel $a$, solutions exist which preserve the total mass for all time, see [41, 7, 3, 15].

An illustrative example is the following. Consider the case $a(y,y') = y\, y'$ and $b(y,y') = 1$. Then using (1.5) and (1.32) with $\psi(y) = 1$ and $\psi(y) = y^2$, we get, thanks to the Cauchy–Schwarz inequality $M_2^2 \le M_1 M_3$,
\[ \frac{d}{dt} M_0 = -\frac{1}{2} M_1^2 + \frac{1}{2} M_1 = \frac{1}{2}\, (1 - M_1)\, M_1 \tag{1.34} \]
1 1 1 1 d M2 = M22 − M3 ≤ (M1 − ) M3 . dt 2 2 2 3
(1.35)
and
On the one hand, we see from (1.35) that if $M_1(0) \le 1/3$ then $M_2$ is (formally) decreasing, and in this case one can build a mass conserving solution. On the other hand, if $M_1(0) > 1$ then from (1.34) any solution gels in finite time (otherwise $M_1(t)$ is a constant and $M_0(t)$ vanishes in finite time, which is a contradiction).

The question is then twofold. On the one hand, for which kernels $a$ and $b$ does the coagulation-fragmentation equation have gelling solutions? On the other hand, for which kernels $a$ and $b$ do all the solutions undergo a gelling transition? We are far from answering these questions, but Theorem 1.4 stated below gives some gelation criteria for the coagulation-fragmentation model and is a partial result in that direction.

Theorem 1.4. Assume $\lambda > 1$ and $\gamma \ge 3/2 - \lambda/2$. Then, for any weak solution of Eq. (1.27) gelation occurs when
\[ M_1(0) \text{ is large enough;} \tag{1.36} \]
or, without any condition on $M_1(0)$, when one of the two following conditions is satisfied:
\[ b \text{ has a compact support;} \tag{1.37} \]
\[ \gamma > 1 \quad \text{and} \quad a(y,y') = r(y)^{\alpha}\, r(y') + r(y)\, r(y')^{\alpha}, \quad \alpha \in (0,1], \quad 1 \le r(y) \sim y \text{ for large } y. \tag{1.38} \]
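As a sanity check on the example with $a(y,y') = y\,y'$ and $b \equiv 1$, the identity (1.34) can be recovered by inserting $\psi \equiv 1$ into (1.5) and (1.32). The sketch below is a formal computation and assumes the convention $\tilde\psi(y,y') = \psi(y+y') - \psi(y) - \psi(y')$, which is the one compatible with (1.32)–(1.33) and (3.2).

```latex
% psi = 1, coagulation part with a(y,y') = y y'
\tilde\psi(y,y') = 1 - 1 - 1 = -1
\;\Longrightarrow\;
\frac12 \int_0^\infty\!\!\int_0^\infty y\,y'\,f(t,y)\,f(t,y')\,\tilde\psi(y,y')\,dy\,dy'
   = -\frac12\,M_1^2 .

% psi = 1, fragmentation part with b = 1 in (1.33)
k_\psi(z) = \int_0^z (1 + 1 - 1)\,dy = z
\;\Longrightarrow\;
\frac12 \int_0^\infty f(t,z)\,k_\psi(z)\,dz = \frac12\,M_1 .

% sum of both contributions
\frac{d}{dt} M_0 = -\frac12\,M_1^2 + \frac12\,M_1 = \frac12\,(1 - M_1)\,M_1 .
```

If $M_1(t) \equiv M_1(0) > 1$, the right-hand side is a negative constant, so $M_0$ would reach zero in finite time; this is the contradiction invoked in the example.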
Previous results in that direction have been proved by I. Jeon [20] and P. Laurençot [23]. To our knowledge Theorem 1.4 is the first result establishing systematic gelation for the complete coagulation-fragmentation equation (with $b \not\equiv 0$). Notice that (1.38) includes the case of the discrete coagulation-fragmentation equation with kernel $a(i,j) = i^{\alpha} j + i j^{\alpha}$. Notice also that when $\gamma > 3/2 - \lambda/2$, Theorem 1.2 may be applied and therefore the expected profile of the solution at gelling time is again like $y^{-(3/2+\lambda/2)}$.

The outline of the rest of the paper is as follows:
Section 2. More about our results: Moment estimates.
Section 3. Proof of the estimates from above for the coagulation equation.
Section 4. Gelation for the coagulation equation.
Section 5. Estimates from below and profile at gelling time for the coagulation equation.
Section 6. Behaviour of the solutions to the coagulation-fragmentation equation.
Section 7. Existence result for the coagulation-fragmentation equation.
Section 8. Extensions to non-homogeneous models.
2. More About our Results: Moment Estimates

In this section we state in detail some new estimates which give rise, in particular, to the main results stated in the introduction. For the sake of completeness we first recall the definition of the weak solution to (1.1) and (1.27).

Definition 2.1. We say that a function $f : \mathbb{R}_+^2 \to \mathbb{R}_+$ is a weak solution to the coagulation-fragmentation equation (1.27) if $f \in C([0,\infty); L^1) \cap L^\infty(0,T; L^1_1)$ for any $T > 0$,
\[ \forall t \ge 0 \quad M_1(t) \le M_1(0), \tag{2.1} \]
and for any $T_1 \ge T_0 \ge 0$ and any $\psi \in L^\infty(\mathbb{R}_+)$,
\[ \left[ \int_0^\infty f(t,y)\, \psi(y)\, dy \right]_{T_0}^{T_1} = \frac{1}{2} \int_{T_0}^{T_1}\!\!\int_0^\infty\!\!\int_0^\infty \big( a\, f(t,y)\, f(t,y') - b\, f(t,y+y') \big)\, \tilde\psi(y,y')\, dy\, dy'\, dt, \tag{2.2} \]
with $a$ given by (1.2)–(1.3) and $b$ satisfying (1.29)–(1.31). Solutions to the coagulation equation (1.1) are defined in the same way taking $b \equiv 0$.

It is possible to define the solutions of Eq. (1.1) and (1.27) in many other ways. For instance we may impose $f$ to be a solution of (1.1) or (1.27) in the distributional sense in $[0,T) \times \mathbb{R}_+$ for any $T > 0$ or in the mild sense. But anyway, as it is proved in [26], all these definitions are equivalent. Let us also emphasize that condition (2.1) is made in order to select a physically relevant solution. Nevertheless, it is not necessary to impose such an additional condition for the kernels $a$ and $b$ that we are considering in this paper (except when $\gamma = 3/2 - \lambda/2$), since we shall prove that $M_1(t)$ is a decreasing function. Finally, we will not repeat any longer the assumptions on $a$ and $b$ which are, except if it is specified, those of Definition 2.1.

Our first result is a family of a posteriori upper bounds on the weak solutions to the coagulation equation (1.1).

Theorem 2.2. For any increasing function $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\Phi(0) = 0$ and
\[ C_\Phi := \int_0^\infty \Phi'(A)\, A^{-\frac{1}{2}}\, dA < \infty, \tag{2.3} \]
for any weak solution $f$ to (1.1), and for all $T \ge 0$, there holds
\[ \int_T^\infty \left( \int_0^\infty f(t,y)\, y^{\lambda/2}\, \Phi(y)\, dy \right)^2 dt \le 2\, C_\Phi^2\, M_1(T). \tag{2.4} \]
Under suitable choices of functions $\Phi$ we deduce the following refined moment estimates.
Corollary 2.3. For every weak solution $f$ to (1.1) and every $T \ge 0$, there holds:
\[ \int_T^\infty \left( \int_R^\infty f(t,y)\, y\, dy \right)^2 dt \le C_\lambda\, R^{1-\lambda}\, M_1(T) \tag{2.5} \]
for any $R > 0$, $\lambda \in (1,2)$;
\[ \int_T^\infty \left( \frac{1}{R^{\tau}} \int_0^R f(t,y)\, y^{\lambda/2+1/2+\tau}\, dy \right)^2 dt \le C_\tau\, M_1(T) \tag{2.6} \]
for any $R, \tau > 0$;
\[ \int_T^\infty \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{\delta}}\, f(t,y)\, dy \right)^2 dt \le C_\delta\, M_1(T) \tag{2.7} \]
for any $\delta > 1$;
\[ \int_T^\infty M_k^2(t)\, dt \le C_k\, \big( M_0(T) + M_1(T) \big), \tag{2.8} \]
for any $k \in [\lambda/2, \lambda/2 + 1/2)$ if $\lambda \in (1,2)$ and for $k = 1$ if $\lambda = 2$.

In particular, we see from (2.8) that $M_1 \in L^2(\mathbb{R}_+)$ so that $M_1(t)$ cannot be constant in time and gelation must occur. In fact, rates of decay for the total mass $M_1(t)$ of any weak solution as stated in Theorem 1.1 can be deduced from (2.5) and the following result.

Theorem 2.4. Any weak solution $f$ to (1.1) satisfies
\[ M_k \text{ is decreasing and } M_k(t) \to 0 \text{ when } t \to \infty \tag{2.9} \]
for any $k \in [0,1]$. Moreover, $M_k(t)$ is continuous for any $k \in [0,1)$ and $M_1(t)$ is right continuous.

We do not know whether (1.18) gives the generic decay of weak solutions for general nonnegative initial data $f_{in} \in L^1_1(\mathbb{R}_+)$. Nevertheless it may be improved in some cases. This is done in the following corollary which has to be compared with [27] and [14].

Corollary 2.5. For any weak solution of the discrete coagulation equation (1.4) we have
\[ M_1(t) \le \frac{C}{1+t} \quad \text{for any } t \ge 0. \tag{2.10} \]
For any weak solution to (1.1) with initial data such that $M_{-q}(0) < \infty$, $q \ge 0$:
\[ M_1(t) \le \frac{C_q}{(1+t)^{\nu}} \quad \text{for any } t \ge 0, \quad \text{with } \nu = \frac{1}{1 + \frac{\lambda-1}{1+q}}. \tag{2.11} \]
For any weak solution to (1.1) with initial data such that $f_{in}(y) = 0$ for $y \in [0,\delta]$ and some $\delta > 0$,
\[ M_1(t) \le \frac{C}{1+t} \quad \text{for any } t \ge 0. \tag{2.12} \]
Notice that, thanks to (2.10) we recover the upper bound on the gelling time obtained in [20], namely $T_g \le C/M_1(0)$.

We now present the lower bounds that we can derive on solutions around the gelling time.

Theorem 2.6. Let $f$ be a solution of (1.1). Assume that
\[ \Delta M_1 := M_1(T_0) - M_1(T_1) > 0 \quad \text{for some } T_0 < T_1. \tag{2.13} \]
Then for any $R, \tau > 0$ there exists $C_\tau$ such that
\[ \int_{T_0}^{T_1} \sup_{S>R} \left( \frac{1}{S^{\tau}} \int_0^S y^{\lambda/2+1/2+\tau}\, f(t,y)\, dy \right)^2 dt \ge C_\tau\, \Delta M_1 \tag{2.14} \]
and
\[ \int_{T_0}^{T_1} \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{1/2}}\, f(t,y)\, dy \right)^2 dt = +\infty. \tag{2.15} \]
As a consequence, combining (2.6) and (2.14) we obtain Theorem 1.2. Let us quote some questions of interest.

Open questions.
1. Is $M_1(t)$ continuous?
2. At gelling time, is the profile of $f$ exactly (1.20)?
3. Another way to understand the gelation phenomenon is to prove that for some $k > 1$ and $0 < T_c(k) < \infty$,
\[ \lim_{t \to T_c(k)} M_k(t) = \infty. \tag{2.16} \]
For instance, when $\alpha = \beta = 1$, (1.13) gives
\[ \frac{d}{dt} M_2(t) = M_2(t)^2, \tag{2.17} \]
which readily implies that $M_2(t)$ blows up in finite time and $T_c(2) = M_2(0)^{-1}$. We refer to [9] where it is shown, for the discrete model (1.4), that $M_\lambda(t)$ blows up in finite time for the coagulation kernel (1.2) with $\alpha \in (0,1)$, $\beta = 1$. It is in general an open problem to prove whether $T_g = T_c(k)$. Nevertheless, it is easy to check that in the simple case where $a(y,y') = y\, y'$ we have $T_g = T_c(2)$, see [33, Theorem 2.2, p. 411].
4. Does $T_g$ satisfy: $\forall t_0, t_1$ such that $t_0 < T_g \le t_1$,
\[ \liminf_{y\to\infty} f(t_0,y)\, y^{3/2+\lambda/2} = 0, \qquad \liminf_{y\to\infty} f(t_1,y)\, y^{3/2+\lambda/2} > 0\ ? \]
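For open question 3, the blow-up time quoted after (2.17) follows from a one-line integration. The sketch below is formal and assumes $0 < M_2(0) < \infty$ and that (2.17) holds on the whole interval considered.

```latex
\frac{d}{dt} M_2 = M_2^2
\;\Longrightarrow\;
\frac{d}{dt}\Big( \frac{1}{M_2} \Big) = -1
\;\Longrightarrow\;
\frac{1}{M_2(t)} = \frac{1}{M_2(0)} - t ,
\qquad
M_2(t) = \frac{M_2(0)}{1 - M_2(0)\,t}
\quad \text{for } 0 \le t < M_2(0)^{-1} .
```

The denominator vanishes exactly at $t = M_2(0)^{-1}$, which is the value of $T_c(2)$ stated above.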
We next investigate how the previous results may be extended to the coagulation-fragmentation equation (1.27). We start with the extension of Theorem 2.2 and Corollary 2.3.
Theorem 2.7. Let $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$ be an increasing function satisfying $\Phi(0) = 0$ and (2.3). For any weak solution $f$ to (1.27) there holds
\[ \int_{t_0}^{t_1} \left( \int_0^\infty f(t,y)\, y^{\lambda/2}\, \Phi(y)\, dy \right)^2 dt \le C_\Phi^2\, \big( 4\, M_1(t_0) + B^2\, (t_1 - t_0) \big) \tag{2.18} \]
for every $t_1 \ge t_0 \ge 0$. In particular, for convenient choices of $\Phi$, we obtain that for any $T \ge 0$:
\[ \int_0^T \left( \frac{1}{R^{\tau}} \int_0^R f(t,y)\, y^{\lambda/2+1/2+\tau}\, dy \right)^2 dt \le C_\tau\, \big( M_1(0) + B^2\, T \big) \tag{2.19} \]
for any $R, \tau > 0$;
\[ \int_0^T \left( \int_e^\infty f(t,y)\, \frac{y^{\lambda/2+1/2}}{(\ln y)^{\delta}}\, dy \right)^2 dt \le C_\delta\, \big( M_1(0) + B^2\, T \big) \tag{2.20} \]
for any $\delta > 1$;
\[ \int_0^T M_k^2(t)\, dt \le C_k\, \big( M_0(0) + M_1(0) + B^2\, T \big) \tag{2.21} \]
for any $k \in [\lambda/2, \lambda/2 + 1/2)$ if $\lambda \in (1,2)$ and for $k = 1$ if $\lambda = 2$.

The occurrence of gelation for large initial mass stated in Theorem 1.4 is a consequence of estimate (2.21) with $k = 1$. The two other statements of Theorem 1.4 come from "variations around" the proof of Theorem 2.7. Another consequence of Theorem 2.7, more precisely of the estimate (2.20), is the following existence result for Eq. (1.27).

Corollary 2.8. Assume $\alpha \in (0,1)$ and $\beta = 1$. For any initial datum $f_{in} \in L^1_1(\mathbb{R}_+)$, there exists a weak solution $f$ to the coagulation-fragmentation equation (1.27).

Let us finally state the following result about the profile at gelation time for the solutions to the coagulation-fragmentation equation (1.27).

Corollary 2.9. Assume that $\lambda/2 + \gamma > 3/2$. Any weak solution to the coagulation-fragmentation equation (1.27) satisfies: $M_1(t)$ is decreasing and right continuous, and moreover, if (2.13) holds, then (2.14) and (2.15) also hold.

As a consequence, gathering (2.19) in Theorem 2.7 and Corollary 2.9 we see that Theorem 1.2 (and hence Corollary 1.3) is also valid when $\lambda \in (1,2]$ and $\lambda/2 + \gamma > 3/2$. Therefore, the only algebraic decay at gelling time may be $3/2 + \lambda/2$, as for the pure coagulation model.

Under the conditions of Theorem 1.4, any solution of the coagulation-fragmentation equation satisfies (2.20), (2.21), just like those of the coagulation equation. In particular, their only possible asymptotic profile at $T_g$ with algebraic behaviour $y^{-\theta}$ as $y \to \infty$ is again $\theta = 3/2 + \lambda/2$. This seems to indicate that, under these hypotheses on the coagulation and fragmentation kernels, the gelation of the solutions to the coagulation-fragmentation equation is dominated by the coagulation, and the fragmentation is only a small perturbation.
Open Problem 2.10. It is possible in some cases to use the same formal arguments as Ernst, Ziff and Hendriks in [14] to get some insight on the behaviour of the solutions to the coagulation-fragmentation equation. Let us assume in what follows that $a$ is given by (1.2) and that $b(y,y') = B(y+y')$ with $B$ given by (1.30). The loss of mass from smaller clusters with $y < L$ to larger clusters is given by
\[ \frac{d}{dt} \int_0^L y\, f(t,y)\, dy = \int_0^L y\, Q(f)\, dy \tag{2.22} \]
with
\[ \int_0^L y\, Q(f)\, dy = -\int_0^L\!\!\int_{L-y}^\infty y\, f(t,y)\, f(t,y')\, a(y,y')\, dy'\, dy + \int_0^L\!\!\int_{L-y}^\infty y\, f(t,y+y')\, b(y,y')\, dy'\, dy. \tag{2.23} \]
Assume that at $T_g$ the solution is given by a pure power law $f(T_g, y) \equiv C\, y^{-r}$. Then, if $r$ satisfies
\[ \max\{\beta + 1,\, 2 - \gamma\} < r < \alpha + 2 \tag{2.24} \]
we obtain
\[ \int_0^L y\, Q(f)\, dy = -K_1\, C^2\, L^{3+\lambda-2r} + K_2\, C\, L^{3-\gamma-r} \tag{2.25} \]
for some positive constants $K_1 = K_1(\alpha, \beta, r)$ and $K_2 = K_2(\gamma, r)$. On the other hand we formally deduce from (2.22) that there is no conservation of mass if and only if:
\[ \lim_{L\to\infty} \int_0^L y\, Q(f)\, dy < 0. \tag{2.26} \]
From (2.25) this only holds whenever $r \le \min(3/2 + \lambda/2,\ \lambda + \gamma)$. Arguing now by analogy with the formal argument used for the pure coagulation equation we consider
\[ r = \min(3/2 + \lambda/2,\ \lambda + \gamma). \tag{2.27} \]
We obtain then:
(i) If $\gamma + \lambda/2 > 3/2$, so that $r = 3/2 + \lambda/2$, then
\[ \int_0^L y\, Q(f)\, dy = -K_1\, C^2 + K_2\, C\, L^{3/2-\lambda/2-\gamma} \to -K_1\, C^2 < 0 \quad \text{as } L \to \infty. \]
Therefore, (2.26) holds for any $C > 0$.
(ii) If $\gamma + \lambda/2 \le 3/2$, then $r = \lambda + \gamma$ and
\[ \int_0^L y\, Q(f)\, dy = (K_2\, C - K_1\, C^2)\, L^{3-\lambda-2\gamma}. \]
Therefore, (2.26) holds if and only if $C > K_2/K_1$, i.e. the initial datum is sufficiently large.

On the ground of the above remarks we are led to the following conjectures.
1. If $\gamma + \lambda/2 > 3/2$: gelation occurs for all initial data and the gelling profile is given by $y^{-(3/2+\lambda/2)}$. Moreover $M_1(t) \to 0$ as $t \to \infty$.
2. If $\gamma + \lambda/2 \le 3/2$ and $\lambda + \gamma \ge 2$ there exists a positive constant $M_1^*$ such that: if $M_1(0) > M_1^*$ gelation occurs, the gelling profile is like $y^{-(\lambda+\gamma)}$ and $M_1(t) \to M_1^*$ as $t \to \infty$; if $M_1(0) \le M_1^*$ the mass of the solution is conserved for all time.
Notice that when $\lambda + \gamma < 2$ it is possible to construct global mass preserving solutions, see [7, 15].
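The dichotomy between the two conjectured regimes comes from comparing the two powers of $L$ in (2.25). The bookkeeping below is a sketch of that comparison, using only the exponents $3 + \lambda - 2r$ (coagulation) and $3 - \gamma - r$ (fragmentation) that are consistent with cases (i) and (ii) of Open Problem 2.10.

```latex
% The two exponents coincide exactly when r = lambda + gamma:
3 + \lambda - 2r = 3 - \gamma - r \iff r = \lambda + \gamma .

% Case r = 3/2 + lambda/2 (coagulation-dominated):
3 + \lambda - 2r = 0 ,
\qquad
3 - \gamma - r = \tfrac32 - \tfrac{\lambda}{2} - \gamma < 0
\iff \gamma + \tfrac{\lambda}{2} > \tfrac32 .

% Case r = lambda + gamma (balanced):
3 + \lambda - 2r = 3 - \gamma - r = 3 - \lambda - 2\gamma ,
\qquad
-K_1 C^2 + K_2 C < 0 \iff C > \frac{K_2}{K_1} .
```

In the first case the fragmentation term vanishes as $L \to \infty$ and the sign in (2.26) is independent of $C$; in the second both terms scale identically and the sign depends on the size of $C$, which is what motivates a threshold mass $M_1^*$ in conjecture 2.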
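3. Upper Bounds

This section is devoted to the proof of the a posteriori estimates in Theorem 2.2 and Corollary 2.3.

Proof of Theorem 2.2. Consider a weak solution $f$ of Eq. (1.1). Choosing $\psi = \psi_A(y) = y \wedge A$ (which belongs to $L^\infty$) in (1.5) we get
\[ \frac{1}{2} \int_{t_0}^{t_1}\!\!\int_0^\infty\!\!\int_0^\infty a(y,y')\, f(t,y)\, f(t,y')\, \big( -\tilde\psi_A(y,y') \big)\, dy\, dy'\, dt = \int_0^\infty f(t_0,y)\, \psi_A\, dy - \int_0^\infty f(t_1,y)\, \psi_A\, dy. \tag{3.1} \]
We compute
\[ -\tilde\psi_A(y,y') = \begin{cases} 0 & \text{on } \{y, y';\ y + y' \le A\}, \\ y + y' - A & \text{on } T_A, \\ y & \text{on } \{y, y';\ y \le A,\ y' \ge A\}, \\ y' & \text{on } \{y, y';\ y \ge A,\ y' \le A\}, \\ A & \text{on } \{y, y';\ y \ge A,\ y' \ge A\}, \end{cases} \tag{3.2} \]
with
\[ T_A = \{y, y';\ y \le A,\ y' \le A,\ y + y' \ge A\}. \tag{3.3} \]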
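The five cases in (3.2) can be checked directly from $\psi_A(y) = y \wedge A$. The lines below assume the convention $-\tilde\psi_A(y,y') = \psi_A(y) + \psi_A(y') - \psi_A(y+y')$, which is the sign used in (3.1).

```latex
-\tilde\psi_A(y,y') = (y \wedge A) + (y' \wedge A) - \big( (y + y') \wedge A \big) .

% y + y' <= A (hence y, y' <= A):
y + y' - (y + y') = 0 .
% y <= A, y' <= A, y + y' >= A (the triangle T_A):
y + y' - A .
% y <= A <= y':
y + A - A = y .
% y' <= A <= y:
A + y' - A = y' .
% y >= A, y' >= A:
A + A - A = A .
```

In every case the value is nonnegative, and on the last region it equals $A$, which is the only contribution kept when deriving (3.4).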
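From (1.15) and $-\tilde\psi_A(y,y') \ge 0$, and keeping only the term coming from the region $\{y \ge A,\ y' \ge A\}$ in the collision integral (3.1), we deduce
\[ \int_{t_0}^{t_1} \left( \int_A^\infty y^{\lambda/2}\, f(t,y)\, dy \right)^2 dt \le 2\, \frac{M_1(t_0)}{A} \quad \forall A > 0. \tag{3.4} \]
First, using Fubini's Theorem, we have
\[ \int_0^\infty f(t,y)\, y^{\lambda/2}\, \Phi(y)\, dy = \int_0^\infty \Phi'(A) \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy\, dA. \tag{3.5} \]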
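Next, using the Cauchy–Schwarz inequality and Fubini's Theorem, we deduce from (3.4) and (3.5),
\begin{align*}
\int_{t_0}^{t_1} \left( \int_0^\infty \Phi'(A) \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy\, dA \right)^2 dt
&\le \int_{t_0}^{t_1} \left( \int_0^\infty \frac{\Phi'(A)}{A^{1/2}}\, dA \right) \left( \int_0^\infty \Phi'(A)\, A^{1/2} \left( \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy \right)^2 dA \right) dt \\
&\le C_\Phi \int_0^\infty \Phi'(A)\, A^{1/2} \int_{t_0}^{t_1} \left( \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy \right)^2 dt\, dA \tag{3.6} \\
&\le C_\Phi \int_0^\infty \Phi'(A)\, A^{1/2}\, \frac{2\, M_1(t_0)}{A}\, dA = 2\, C_\Phi^2\, M_1(t_0).
\end{align*}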
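Theorem 2.2 then follows from (3.5) and (3.6) letting $t_1 \to \infty$.

Proof of Corollary 2.3.

Proof of (2.5). Taking $\Phi(y) = \big( y^{1-\lambda/2} - (R/2)^{1-\lambda/2} \big)_+$, the constant $C_\Phi$ defined by (2.3) is
\[ C_\Phi = \frac{2-\lambda}{\lambda-1} \left( \frac{R}{2} \right)^{\frac{1-\lambda}{2}}. \tag{3.7} \]
From (3.7) and (2.4) we get
\[ \int_{t_0}^\infty \left( \int_R^\infty \frac{y^{1-\lambda/2}}{2}\, f(t,y)\, y^{\lambda/2}\, dy \right)^2 dt \le C\, M_1(t_0)\, R^{1-\lambda}, \]
and (2.5) readily follows.

Proof of (2.7). We define $\Phi(y) = \big( y^{1/2}/(\ln y)^{\delta} - r^{1/2}/(\ln r)^{\delta} \big)_+$ with $\delta > 1$ and $r = \exp(2\delta)$. We easily verify that $\Phi$ is increasing and that the associated constant $C_\Phi$ is finite. We conclude using estimate (2.4) for this choice of $\Phi$ and the estimate (2.5) with $R = e$.

Proof of (2.6). Taking $\Phi(y) = (y \wedge R)^{\frac{1}{2}+\tau}$, we get
\[ C_\Phi = \left( 1 + \frac{1}{2\tau} \right) R^{\tau}, \tag{3.8} \]
and conclude.

Proof of (2.8). For $k \in [\lambda/2, \lambda/2 + 1/2)$ we have, for some constants $C_i$ depending on $\lambda$ and $k$,
\[ M_k^2(t) \le C_1 \left( \int_0^e y^{\frac{\lambda}{2}}\, f(t,y)\, dy + \int_e^\infty y^{k}\, f(t,y)\, dy \right)^2 \le C_2 \left( M_{\lambda/2}^2(t) + \left( \int_e^\infty \frac{y^{\frac{\lambda}{2}+\frac{1}{2}}}{(\ln y)^2}\, f(t,y)\, dy \right)^2 \right). \]
Then (2.8) follows from (1.16) and (2.7).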
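The constants (3.7) and (3.8) come from evaluating the integral in (2.3) for the two test functions. The computation below is a sketch; it writes $\Phi$ for the increasing function of Theorem 2.2 and $C_\Phi$ for the constant in (2.3), and uses $\lambda > 1$ for the convergence of the first integral.

```latex
% Choice Phi(y) = (y /\ R)^{1/2 + tau}:
\Phi'(A) = \big( \tfrac12 + \tau \big) A^{\tau - 1/2} \ \text{for } A < R ,
\qquad \Phi'(A) = 0 \ \text{for } A > R ,
\\
C_\Phi = \big( \tfrac12 + \tau \big) \int_0^R A^{\tau - 1}\, dA
       = \big( \tfrac12 + \tau \big) \frac{R^{\tau}}{\tau}
       = \Big( 1 + \frac{1}{2\tau} \Big) R^{\tau} .

% Choice Phi(y) = ( y^{1 - lambda/2} - (R/2)^{1 - lambda/2} )_+ :
\Phi'(A) = \big( 1 - \tfrac{\lambda}{2} \big) A^{-\lambda/2} \ \text{for } A > R/2 ,
\qquad \Phi'(A) = 0 \ \text{for } A < R/2 ,
\\
C_\Phi = \big( 1 - \tfrac{\lambda}{2} \big) \int_{R/2}^{\infty} A^{-\frac{\lambda + 1}{2}}\, dA
       = \big( 1 - \tfrac{\lambda}{2} \big) \frac{2}{\lambda - 1} \Big( \frac{R}{2} \Big)^{\frac{1 - \lambda}{2}}
       = \frac{2 - \lambda}{\lambda - 1} \Big( \frac{R}{2} \Big)^{\frac{1 - \lambda}{2}} .
```

Squaring the second constant gives a factor proportional to $R^{1-\lambda}$, which is the power of $R$ appearing in (2.5); squaring the first gives $R^{2\tau}$, which is absorbed by the normalisation $R^{-\tau}$ inside the square in (2.6).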
4. Gelation

This section is devoted to the proof of Theorem 1.1, Theorem 2.4 and Corollary 2.5.

Proof of (2.9) in Theorem 2.4. Define $\psi_A(y) = y^k \wedge A$ with $k \in [0,1]$ and $\tilde\psi_A(y,y')$ by (1.6). From the two following elementary inequalities: $(u \wedge A) + (v \wedge A) \ge (u + v) \wedge A$ and $X^k + Y^k \ge (X + Y)^k$, and the fact that $z \mapsto z \wedge A$ is increasing, we get
\[ y^k \wedge A + y'^k \wedge A \ge (y^k + y'^k) \wedge A \ge (y + y')^k \wedge A, \]
so that $-\tilde\psi_A \ge 0$. For $t_1 \ge t_0 \ge 0$, we then deduce from the fundamental identity (1.5) that
\[ \int_0^\infty f(t_1,y)\, \psi_A(y)\, dy \le \int_0^\infty f(t_0,y)\, \psi_A(y)\, dy \le \int_0^\infty f(t_0,y)\, y^k\, dy. \]
We conclude by Fatou's Lemma (letting $A \to \infty$) that $t \mapsto M_k(t)$ is a decreasing function.

Proof of Theorem 1.1. We just need to prove (1.18). Our proof is based on the method introduced in [23] and on the new bound (2.5). For given $R > 0$ we have
\[ M_1^2(t) \le 2 \left( \int_0^R y\, f(t,y)\, dy \right)^2 + 2 \left( \int_R^\infty y\, f(t,y)\, dy \right)^2 \tag{4.1} \]
and
\[ \left( \int_0^R y\, f(t,y)\, dy \right)^2 \le R^{2-\lambda}\, M_{\lambda/2}^2(t). \tag{4.2} \]
Gathering (4.1), (4.2) with (1.16) and (2.5) we get for any $R > 0$
\[ \int_T^\infty M_1^2(t)\, dt \le 4\, R^{2-\lambda}\, M_0(T) + C_\lambda\, R^{1-\lambda}\, M_1(T). \tag{4.3} \]
Then, making the choice $R = M_1(T)/M_0(T)$ and using (1.10) we obtain
\[ \int_T^\infty M_1^2(t)\, dt \le C_\lambda\, M_0(0)^{\lambda-1}\, M_1(T)^{2-\lambda}. \tag{4.4} \]
The decay rate (1.18) follows then from (4.4) and Lemma 4.1 below.

Lemma 4.1. Assume that the square integrable and decreasing function $M_1(t)$ satisfies
\[ \forall T \ge 0 \quad \int_T^\infty M_1^2(s)\, ds \le C_1\, M_1(T)^{\theta}, \tag{4.5} \]
for some constants $C_1 > 0$ and $\theta \in (0,2)$. Then
\[ \forall t > 0 \quad M_1(t) \le C_2\, t^{-\frac{1}{2-\theta}}, \tag{4.6} \]
for some constant $C_2 = C_2(C_1, \theta, \|M_1\|_{L^2}) > 0$.
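Lemma 4.1 converts each integral bound of the form (4.5) into an algebraic decay rate, so the rates used in this section reduce to reading off $\theta$. The table of exponents below is a sketch of that arithmetic, using (4.4) for general data, the bound $\int_T^\infty M_1^2 \le C M_1(T)$ for the discrete or compactly-supported-away-from-zero case, and the exponent $1 + \frac{1-\lambda}{1+q}$ obtained under the assumption $M_{-q}(0) < \infty$.

```latex
% General data, from (4.4):
\theta = 2 - \lambda
\;\Longrightarrow\;
\frac{1}{2 - \theta} = \frac{1}{\lambda} .

% Discrete model or data vanishing near 0:
\theta = 1
\;\Longrightarrow\;
\frac{1}{2 - \theta} = 1 .

% Data with M_{-q}(0) finite:
\theta = 1 + \frac{1 - \lambda}{1 + q}
\;\Longrightarrow\;
2 - \theta = 1 + \frac{\lambda - 1}{1 + q} ,
\qquad
\nu = \frac{1}{2 - \theta} = \frac{1}{1 + \frac{\lambda - 1}{1 + q}} .
```

The last exponent interpolates between the other two: $q = 0$ gives $\nu = 1/\lambda$, and $\nu \to 1$ as $q \to \infty$.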
Proof of Lemma 4.1. The proof of Lemma 4.1 is classical. Nevertheless, for the sake of completeness, we present it here. Define
\[ u(t) := \int_t^\infty M_1^2(s)\, ds. \]
We deduce from (4.5) that $u$ satisfies
\[ \frac{du}{dt} \le -\left( \frac{u(t)}{C_1} \right)^{2/\theta} \quad \text{and thus} \quad u(t) \le \left( \Big( \frac{2}{\theta} - 1 \Big) \frac{t}{C_1^{2/\theta}} + \frac{1}{u(0)^{2/\theta-1}} \right)^{-\frac{1}{2/\theta-1}}. \tag{4.7} \]
Since $M_1(t)$ is decreasing we also have
\[ \frac{t}{2}\, M_1^2(t) \le \int_{t/2}^{t} M_1^2(s)\, ds \le u(t/2), \tag{4.8} \]
and Lemma 4.1 follows gathering (4.7) and (4.8).

End of the Proof of Theorem 2.4. We proceed in two steps.

Step 1. For any $k \in [0,1)$ we prove that $M_k(t) \to 0$. Choosing $\psi(y) = 1_{[0,\rho]}(y)$ we find that $\tilde\psi \le 0$ and therefore
\[ \int_0^\rho f(t,y)\, dy \le \int_0^\rho f_{in}(y)\, dy \quad \text{for any } t \ge 0. \tag{4.9} \]
We deduce that, for any $k \in [0,1)$ and $\rho \in (0,1)$,
\[ M_k(t) \le \rho^{k} \int_0^\rho f(t,y)\, dy + \frac{1}{\rho^{1-k}} \int_\rho^\infty y\, f(t,y)\, dy \le \int_0^\rho f_{in}(y)\, dy + \frac{1}{\rho^{1-k}}\, M_1(t), \]
and the right-hand side goes to 0 when $t \to \infty$ and $\rho \to 0$.

Step 2. For any $k \in [0,1)$ we prove that $M_k(t)$ is continuous and $M_1(t)$ is right continuous. Since by definition $f \in C([0,\infty); L^1(\mathbb{R}_+)) \cap L^\infty(0,T; L^1_1(\mathbb{R}_+))$ for all $T > 0$, it readily follows that $M_k(t)$ is continuous for any $k \in [0,1)$. Moreover, for any $t_0 \le t_1$ we have
\begin{align*}
0 \le M_1(t_0) - M_1(t_1) &= \int_0^\infty f(t_0,y)\, y \wedge A\, dy - \int_0^\infty f(t_1,y)\, y \wedge A\, dy \\
&\quad + \int_0^\infty f(t_0,y)\, (y - A)_+\, dy - \int_0^\infty f(t_1,y)\, (y - A)_+\, dy \tag{4.10} \\
&\le \int_{t_0}^{t_1}\!\!\int_0^\infty\!\!\int_0^\infty a\, f(t,y)\, f(t,y')\, (-\tilde\psi_A)\, dy\, dy'\, dt + \int_0^\infty f(t_0,y)\, (y - A)_+\, dy.
\end{align*}
For fixed $t_0 \ge 0$ we fix $A$ large enough so that the second term is small, and then $t_1$ close enough to $t_0$, so that the first term is small. This exactly means that $M_1(t)$ is right continuous.
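For completeness, the integration step behind (4.7) and its combination with (4.8) in the proof of Lemma 4.1 can be written out as follows. This is a sketch under the extra assumption $u(t) > 0$ on the interval considered (otherwise $M_1$ vanishes and there is nothing to prove), with the shorthand $p := 2/\theta > 1$, which is not used in the original.

```latex
% From u' = -M_1^2 and (4.5): u <= C_1 M_1^theta, i.e. M_1^2 >= (u / C_1)^p .
\frac{du}{dt} \le -\Big( \frac{u}{C_1} \Big)^{p}
\;\Longrightarrow\;
\frac{d}{dt}\, u^{1-p} = (1 - p)\, u^{-p}\, \frac{du}{dt} \ge \frac{p - 1}{C_1^{\,p}}
\;\Longrightarrow\;
u(t)^{1-p} \ge u(0)^{1-p} + \frac{(p - 1)\, t}{C_1^{\,p}} .

% Hence, dropping the initial term:
u(t) \le \Big( \frac{(p - 1)\, t}{C_1^{\,p}} \Big)^{-\frac{1}{p-1}} .

% Combining with (4.8):
M_1^2(t) \le \frac{2}{t}\, u(t/2) \le C\, t^{-1 - \frac{1}{p-1}} = C\, t^{-\frac{p}{p-1}} ,
\qquad
\frac{p}{p - 1} = \frac{2/\theta}{(2 - \theta)/\theta} = \frac{2}{2 - \theta} .
```

Taking square roots gives $M_1(t) \le C_2\, t^{-1/(2-\theta)}$, which is (4.6).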
Proof of Corollary 2.5. Here again we follow [23].

Proof of (2.10) and (2.12). For the discrete coagulation equation we have
\[ M_0(t) \le C\, M_1(t) \tag{4.11} \]
with $C = 1$. For the continuous coagulation equation with vanishing initial data near the origin, we use (4.9) with $\rho = \delta$ which implies that $f(t,y) \equiv 0$ for any time $t \ge 0$ and every $y \in [0,\delta]$. Therefore (4.11) also holds with $C = 1/\delta$. In both cases, (4.3) and (4.11) imply
\[ \int_T^\infty M_1^2(t)\, dt \le C_\lambda\, M_1(T). \]
The decay rate (2.10) then follows again from Lemma 4.1.

Proof of (2.11). Under the assumption $M_{-q}(0) < \infty$ we may write
\[ M_0(t) = \int_0^R f(t,y)\, dy + \int_R^\infty f(t,y)\, dy \le R^{q}\, M_{-q}(0) + \frac{1}{R}\, M_1(t), \tag{4.12} \]
for any $R > 1$, with the help of (4.9). Gathering (4.3) and (4.12) we have
\begin{align*}
\int_T^\infty M_1^2(t)\, dt &\le C_\lambda\, \big( R^{2-\lambda}\, M_0(T) + R^{1-\lambda}\, M_1(T) \big) \\
&\le C_\lambda\, \big( R^{2-\lambda+q}\, M_{-q}(0) + R^{1-\lambda}\, M_1(T) \big) \\
&\le C_\lambda\, \big( 1 + M_{-q}(0) \big)\, M_1(T)^{1 + \frac{1-\lambda}{1+q}}
\end{align*}
with the choice $R^{1+q} = M_1(T)$. We conclude again using Lemma 4.1.

Remark 4.2. One can also prove that gelation occurs for any coagulation kernel satisfying the assumption
\[ a > 0 \text{ on } \mathbb{R}_+^2 \quad \text{and} \quad a \ge a_0 \text{ on } [A_0, \infty)^2 \]
with $a_0$ of the shape (1.2) and $A_0$ large. Indeed, in this case we have for any $A_1 > 0$, $a \ge \kappa_{A_1}\, a_0$ on $[A_1, \infty)^2$, with $\kappa_{A_1} > 0$. We then proceed as in the proof of Theorem 1.4, Step 2 in Sect. 6, choosing $A_1$ small enough in (6.9) and (6.10).
5. Estimates from Below and Profile at Gelling Time

This section is devoted to the proof of Theorem 2.6, Theorem 1.2 and Corollary 1.3.

Proof of Theorem 2.6. Step 1. Preliminaries. Let us put again $\psi_A(y) = y \wedge A$. We deduce from (3.1), (3.2) and assumption (2.13) that there exists $A_0 \ge 0$ such that
\[ \forall A \ge A_0 \quad \int_{T_0}^{T_1} \kappa(A,t)\, dt \ge \frac{1}{2}\, \Delta M_1, \tag{5.1} \]
where we have set $\kappa = \kappa_1 + \kappa_2 + \kappa_3$ and
\[ \kappa_1(A,t) = A \int_A^\infty\!\!\int_A^\infty a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy' \ge 0, \tag{5.2} \]
\[ \kappa_2(A,t) = 2 \int_0^A y\, a(y,y')\, f(t,y)\, dy \int_A^\infty f(t,y')\, dy' \ge 0, \tag{5.3} \]
\[ \kappa_3(A,t) = \iint_{T_A} (y + y' - A)\, a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy' \ge 0, \tag{5.4} \]
with $T_A$ defined in (3.3). In order to get estimates (2.14) and (2.15) we treat separately the contribution of each term $\kappa_i$. We start showing that the analysis of $\kappa_3$ reduces to the analysis of $\kappa_1$ and $\kappa_2$. Indeed, we have
\begin{align*}
\iint_{T_A} (y + y' - A)\, a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy'
&= 2 \int_{A/2}^\infty dy \int_0^{A/2} dy'\ 1_{T_A}\, (y + y' - A)\, a(y,y')\, f(t,y)\, f(t,y') \\
&\quad + \int_{A/2}^\infty\!\!\int_{A/2}^\infty 1_{T_A}\, (y + y' - A)\, a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy' \\
&\le 2 \int_0^{A/2} y\, a(y,y')\, f(t,y)\, dy \int_{A/2}^\infty f(t,y')\, dy' \\
&\quad + 2\, \frac{A}{2} \int_{A/2}^\infty\!\!\int_{A/2}^\infty a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy',
\end{align*}
since $y + y' - A \le \min(y, y', A)$ on $T_A$. In other words
\[ \kappa_3(A,t) \le \kappa_2(A/2, t) + 2\, \kappa_1(A/2, t) \quad \forall t, A \ge 0. \tag{5.5} \]
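Step 2. We prove (2.14). First notice that for any $\delta_2, \tau \ge 0$ and $\delta_1 > 0$ we have
\begin{align*}
R^{\delta_1} \int_R^\infty f(t,y)\, y^{\delta_2}\, dy
&= \sum_{j \in \mathbb{N}} R^{\delta_1} \int_{2^j R}^{2^{j+1} R} f(t,y)\, y^{\delta_2}\, dy \\
&\le \sum_{j \in \mathbb{N}} \frac{1}{2^{j(\delta_1+\tau)}\, R^{\tau}} \int_{2^j R}^{2^{j+1} R} f(t,y)\, y^{\delta_1+\delta_2+\tau}\, dy \\
&\le \sum_{j \in \mathbb{N}} \frac{1}{2^{j \delta_1}}\ \sup_{S \ge R} \frac{1}{S^{\tau}} \int_S^{2S} f(t,y)\, y^{\delta_1+\delta_2+\tau}\, dy \tag{5.6} \\
&\le C_{\delta_1} \sup_{S > R} \frac{1}{S^{\tau}} \int_0^S f(t,y)\, y^{\delta_1+\delta_2+\tau}\, dy,
\end{align*}
for some constant $C_{\delta_1}$ only depending on $\delta_1$. On the one hand we have for any $\tau \ge 0$,
\begin{align*}
\kappa_1(A,t) &= A \int_A^\infty\!\!\int_A^\infty a(y,y')\, f(t,y)\, f(t,y')\, dy\, dy' \\
&\le \left( A^{1/2+(\beta-\alpha)/2} \int_A^\infty y^{\alpha}\, f(t,y)\, dy \right) \times \left( A^{1/2+(\alpha-\beta)/2} \int_A^\infty y'^{\beta}\, f(t,y')\, dy' \right) \tag{5.7} \\
&\le C_{\alpha,\beta}\ \sup_{S > A} \left( \frac{1}{S^{\tau}} \int_0^S f(t,y)\, y^{\frac{1}{2}+\frac{\lambda}{2}+\tau}\, dy \right)^2,
\end{align*}
where we have used twice the estimate (5.6), remarking that $|\alpha - \beta| < 1$.

On the other hand, for $(\alpha', \beta') = (\alpha, \beta)$ or $(\alpha', \beta') = (\beta, \alpha)$, so that $\frac{\alpha'-\beta'}{2} + \frac{1}{2} > 0$, we have, for any $\tau \ge 0$,
\[ \int_0^A y^{\alpha'+1}\, f(t,y)\, dy \int_A^\infty y'^{\beta'}\, f(t,y')\, dy' \le \left( \frac{1}{A^{\tau}} \int_0^A y^{\frac{\lambda}{2}+\frac{1}{2}+\tau}\, f(t,y)\, dy \right) \left( A^{\frac{\alpha'-\beta'}{2}+\frac{1}{2}} \int_A^\infty y'^{\beta'}\, f(t,y')\, dy' \right). \]
We then deduce from (5.6) that
\[ \kappa_2(A,t) \le C_{\alpha,\beta}\ \sup_{S > A} \left( \frac{1}{S^{\tau}} \int_0^S f(t,y)\, y^{\frac{1}{2}+\frac{\lambda}{2}+\tau}\, dy \right)^2. \tag{5.8} \]
As a conclusion, (2.14) follows from (5.1), (5.5), (5.7) and (5.8) for any $R = A/2 \ge A_0/2$, and therefore for any $R > 0$.

Step 3. We prove (2.15). The lower bound (5.1) and Fubini's theorem (notice that $\kappa_i \ge 0$) imply
\[ \sum_{i=1}^{3} \int_{T_0}^{T_1}\!\!\int_e^\infty \frac{\kappa_i(A,t)}{A \ln A}\, dA\, dt = \int_e^\infty \left( \int_{T_0}^{T_1} \kappa(A,t)\, dt \right) \frac{dA}{A \ln A} = \infty. \]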
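From (5.5) we then obtain
\[ \int_{T_0}^{T_1}\!\!\int_e^\infty \frac{\kappa_i(A,t)}{A \ln A}\, dA\, dt = +\infty \quad \text{for } i = 1 \text{ or } 2. \tag{5.9} \]
We need the following lemma, which we state below and prove at the end of the proof of Theorem 2.6.

Lemma 5.1. There is $\xi_1 \in L^\infty(\mathbb{R}_+^2)$ such that
\[ \forall y, y' \ge e \quad \int_e^{\min(y,y')} \frac{dA}{\ln A} \le 2\, \frac{y^{1/2+(\beta-\alpha)/2}}{\sqrt{\ln y}}\, \frac{y'^{1/2-(\beta-\alpha)/2}}{\sqrt{\ln y'}} + \xi_1(y,y'). \tag{5.10} \]
For any $\delta \in (0,2)$, there is $\xi_2 \in L^\infty(\mathbb{R}_+^2)$ such that
\[ \forall y', z \ge e \quad \int_e^{\min(y',z)} \frac{dA}{A^{1-\delta} \ln A} \le \frac{2}{\delta}\, \frac{y'^{\delta/2}}{\sqrt{\ln y'}}\, \frac{z^{\delta/2}}{\sqrt{\ln z}} + \xi_2(y',z). \tag{5.11} \]
For any $\delta \in (0,2)$, there is $\xi_3 \in L^\infty(\mathbb{R}_+^2)$ such that
\[ \forall y, z \ge e \quad \int_{\max(y,z)}^\infty \frac{dA}{A^{1+\delta} \ln A} \le \frac{2}{\delta}\, \frac{1}{y^{\delta/2} \sqrt{\ln y}}\, \frac{1}{z^{\delta/2} \sqrt{\ln z}} + \xi_3(y,z). \tag{5.12} \]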
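Therefore, using first (5.10) we deduce
\begin{align*}
\int_e^\infty \frac{\kappa_1(t,A)}{A \ln A}\, dA
&= \int_e^\infty \left( \int_A^\infty y^{\alpha}\, f(t,y)\, dy \int_A^\infty y'^{\beta}\, f(t,y')\, dy' \right) \frac{dA}{\ln A} \\
&= \int_e^\infty\!\!\int_e^\infty y^{\alpha}\, f(t,y)\, y'^{\beta}\, f(t,y') \left( \int_e^{\min(y,y')} \frac{dA}{\ln A} \right) dy\, dy' \tag{5.13} \\
&\le 2 \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{1/2}}\, f(t,y)\, dy \right)^2 + \|\xi_1\|_{L^\infty}\, \|f(t,\cdot)\|_{L^1_1}^2.
\end{align*}
On the other hand, using the Cauchy–Schwarz inequality, we have
\begin{align*}
\int_e^\infty \frac{dA}{A \ln A} \int_0^A y^{\alpha'+1}\, f(t,y)\, dy \int_A^\infty y'^{\beta'}\, f(t,y')\, dy'
&\le \left( \int_e^\infty \frac{1}{A^{1+\delta} \ln A} \left( \int_0^A y^{\alpha'+1}\, f(t,y)\, dy \right)^2 dA \right)^{1/2} \tag{5.14} \\
&\quad \times \left( \int_e^\infty \frac{1}{A^{1-\delta} \ln A} \left( \int_A^\infty y'^{\beta'}\, f(t,y')\, dy' \right)^2 dA \right)^{1/2}
\end{align*}
with $\delta = 1 - (\beta - \alpha)$ if $(\alpha', \beta') = (\alpha, \beta)$ and $\delta = 1 + \beta - \alpha$ if $(\alpha', \beta') = (\beta, \alpha)$. Notice that in both cases $\delta \in (0,2]$.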
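First, using (5.12), it yields
\begin{align*}
K_1(t) &:= \int_e^\infty \frac{1}{A^{1+\delta} \ln A} \left( \int_0^A y^{\alpha'+1}\, f(t,y)\, dy \right)^2 dA \\
&\le 2 \int_e^\infty\!\!\int_e^\infty y^{\alpha'+1}\, f(t,y)\, z^{\alpha'+1}\, f(t,z) \left( \int_{\max(y,z)}^\infty \frac{dA}{A^{1+\delta} \ln A} \right) dy\, dz \tag{5.15} \\
&\quad + 2 \int_e^\infty \left( \int_0^e y^{\alpha'+1}\, f(t,y)\, dy \right)^2 \frac{dA}{A^{1+\delta} \ln A} \\
&\le \frac{2}{\delta} \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{1/2}}\, f(t,y)\, dy \right)^2 + \big( \|\xi_3\|_{L^\infty} + C_\delta \big)\, \|f(t,\cdot)\|_{L^1_1}^2
\end{align*}
for some positive constant $C_\delta$ only depending on $\delta$. Next, using (5.11), we get
\begin{align*}
K_2(t) &:= \int_e^\infty \frac{1}{A^{1-\delta} \ln A} \left( \int_A^\infty y'^{\beta'}\, f(t,y')\, dy' \right)^2 dA \\
&= \int_e^\infty\!\!\int_e^\infty y'^{\beta'}\, f(t,y')\, z^{\beta'}\, f(t,z) \left( \int_e^{\min(y',z)} \frac{dA}{A^{1-\delta} \ln A} \right) dy'\, dz \tag{5.16} \\
&\le \frac{2}{\delta} \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{1/2}}\, f(t,y)\, dy \right)^2 + \|\xi_2\|_{L^\infty}\, \|f(t,\cdot)\|_{L^1_1}^2.
\end{align*}
Gathering (5.14), (5.15), (5.16) and using Young's inequality we deduce
\begin{align*}
\int_{T_0}^{T_1}\!\!\int_e^\infty \frac{\kappa_2(t,A)}{A \ln A}\, dA\, dt
&\le \frac{1}{2} \int_{T_0}^{T_1} K_1(t)\, dt + \frac{1}{2} \int_{T_0}^{T_1} K_2(t)\, dt \\
&\le \frac{2}{\delta} \int_{T_0}^{T_1} \left( \int_e^\infty \frac{y^{\lambda/2+1/2}}{(\ln y)^{1/2}}\, f(t,y)\, dy \right)^2 dt + T_1\, C_{\xi_2,\xi_3}\, \|f(0,\cdot)\|_{L^1_1}^2. \tag{5.17}
\end{align*}
As a conclusion, (2.15) follows from (5.9), (5.13), (5.17).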
Proof of Lemma 5.1. We only prove (5.10) since (5.11) and (5.12) follow in the same way. Integrating by parts, we have min(y,y ) min(y,y ) min(y, y ) dA dA = . (5.18) −1+ )) 2 lnA ln(min(y, y (lnA) e e Since for any k ∈ (0, 2) there is Ak ≥ 0 such that z → we also have
zk is increasing for z ≥ Ak , lnz
y y 1−(β−α) y 1−(β−α) min(y, y ) y β−α ≤ y = min , , min ln(min(y, y )) lny lny lny lny ≤
y 1/2+(β−α)/2 y 1/2−(β−α)/2 , √ lny lny
for any y, y ≥ max(A0 , A1−(β−α) ). We conclude observing that the last term in the right-hand side of (5.18) is bounded by one half of the left hand side term of (5.18) for a large value of min(y, y ).
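Remark 5.2. Let us emphasize that for any $g \ge 0$ none of the information on the behavior of $g(y)$ for large value $y \ge 0$,
\[ MC_\tau(g) := \limsup_{R \to \infty} \frac{1}{R^{\tau}} \int_0^R y^{\tau}\, g(y)\, dy > 0 \quad \text{and} \quad L_\delta(g) := \int_e^\infty \frac{g(y)}{(\ln y)^{\delta}}\, dy = +\infty \]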
is stronger than the other, as it is shown by the two examples below. In particular, (2.14) and (2.15) can not be deduced one from the other. On the one hand, taking ω g(y) = λj δyj , λj = 1, yj = ej j ∈N∗
we have
MCτ (g) =
e−τ k > 0 ω
Lδ (g) =
and
k∈N∗
j −ω δ < ∞ if ω δ > 1.
j ∈N∗
On the other hand, choosing $g(y) = y^{-1} (\ln y)^{-\nu}$, we get
\[ MC_\tau(g) = 0 \quad \text{and} \quad L_\delta(g) = +\infty \ \text{ if } \nu + \delta \le 1. \]
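The two claims for the logarithmic example can be verified by elementary integration. The sketch below makes two assumptions that are not stated explicitly in the remark: $g(y) = y^{-1} (\ln y)^{-\nu}$ is understood for $y \ge e$ (and extended by $0$ below $e$), and $\nu > 0$, which is what makes $MC_\tau(g)$ vanish.

```latex
% L_delta(g): substitute s = ln y
L_\delta(g) = \int_e^\infty \frac{dy}{y\, (\ln y)^{\nu + \delta}}
            = \int_1^\infty \frac{ds}{s^{\nu + \delta}}
            = +\infty \iff \nu + \delta \le 1 .

% MC_tau(g): split the integral at sqrt(R), using (ln y)^{-nu} <= 1 on [e, infinity)
\frac{1}{R^{\tau}} \int_e^{R} y^{\tau - 1} (\ln y)^{-\nu}\, dy
 \le \frac{1}{R^{\tau}} \Big( \frac{R^{\tau/2}}{\tau}
       + \big( \tfrac12 \ln R \big)^{-\nu}\, \frac{R^{\tau}}{\tau} \Big)
 = \frac{1}{\tau} \Big( R^{-\tau/2} + \big( \tfrac12 \ln R \big)^{-\nu} \Big)
 \xrightarrow[R \to \infty]{} 0 .
```

So for $0 < \nu \le 1 - \delta$ the logarithmic quantity $L_\delta$ diverges while the Morrey-type quantity $MC_\tau$ vanishes, which is the sense in which (2.15) cannot be deduced from (2.14).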
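Proof of Corollary 1.3. From (1.22) we obtain by Fatou's lemma
\[ \int_{T_0}^{T_1} \left( \liminf_{R \to \infty} \frac{1}{R^{\tau}} \int_0^R f(t,y)\, y^{\lambda/2+1/2+\tau}\, dy \right)^2 dt \le C_1. \]
Then if we define the measurable function $\omega_\tau : \mathbb{R}_+ \to \mathbb{R}_+$ by
\[ \omega_\tau(t) := \tau\, \liminf_{R \to \infty} \frac{1}{R^{\tau}} \int_0^R f(t,y)\, y^{\lambda/2+1/2+\tau}\, dy \quad \text{for a.e. } t \in [T_0, T_1], \]
we deduce
\[ \liminf_{y \to \infty} f(t,y)\, y^{\lambda/2+3/2} \le \omega_\tau(t) \quad \text{for a.e. } t \in [T_0, T_1]. \tag{5.19} \]
Gathering (5.19) with (1.24) we get the second estimate in (1.25). On the other hand, we deduce from (1.23) and (1.24):
\[ (T_1 - T_0)\, \Gamma^2 \left( \limsup_{R \to \infty} \frac{1}{R^{\tau}} \int_0^R \xi(y)\, y^{\lambda/2+1/2+\tau}\, dy \right)^2 \ge C_1^{-1}, \]
which implies
\[ \limsup_{y \to \infty} \xi(y)\, y^{\lambda/2+3/2} \ge \frac{\tau}{\Gamma\, C_1^{1/2}\, (T_1 - T_0)^{1/2}}, \]
and this ends the proof of Corollary 1.3.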
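6. Behavior of Solutions to the Coagulation Fragmentation Equation

This section is devoted to the proof of Theorem 2.7, Theorem 1.4 and Corollary 2.9.

Proof of Theorem 2.7. Consider a weak solution $f$ to the coagulation and fragmentation equation (1.27). For any $\psi \in L^\infty(\mathbb{R}_+)$ and $t_1 \ge t_0 \ge 0$ we get from the fundamental identities (1.5) and (1.32),
\begin{align*}
&\int_0^\infty f(t_1,y')\, \psi(y')\, dy' + \frac{1}{2} \int_{t_0}^{t_1}\!\!\int_0^\infty\!\!\int_0^\infty a\, f(t,y)\, f(t,y')\, (-\tilde\psi)\, dy\, dy'\, dt \tag{6.1} \\
&\qquad = \int_0^\infty f(t_0,y')\, \psi(y')\, dy' + \frac{1}{2} \int_{t_0}^{t_1}\!\!\int_0^\infty f(t,z)\, B(z)\, K_\psi(z)\, dz\, dt
\end{align*}
with
\[ K_\psi(z) = \int_0^z \big( \psi(y) + \psi(z-y) - \psi(z) \big)\, dy. \tag{6.2} \]
We proceed in two steps.

Step 1. Choose $\psi(y) = \psi_A(y) = y \wedge A$, so that $-\tilde\psi = -\tilde\psi_A$ is given by (3.2) and $K_A(z) = A\, (z - A)_+$. We obtain, arguing as for the proof of (3.4),
\[ \frac{1}{2} \int_{t_0}^{t_1} \left( \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy \right)^2 dt \le \frac{M_1(t_0)}{A} + \frac{B}{2} \int_{t_0}^{t_1}\!\!\int_A^\infty f(t,z)\, \frac{z}{(1+z)^{\gamma}}\, dz\, dt. \]
For the last term, using condition (1.31) and the Young inequality, we get
\begin{align*}
\int_A^\infty f(t,z)\, z^{\lambda/2}\, \frac{1}{z^{1/2}}\, \frac{z^{3/2-\lambda/2}}{(1+z)^{\gamma}}\, dz
&\le \frac{1}{A^{1/2}} \int_A^\infty f(t,z)\, z^{\lambda/2}\, dz \\
&\le \frac{1}{2B} \left( \int_A^\infty f(t,z)\, z^{\lambda/2}\, dz \right)^2 + \frac{B}{2A}.
\end{align*}
Gathering the two preceding estimates we deduce that
\[ \forall A > 0, \quad \int_{t_0}^{t_1} \left( \int_A^\infty f(t,y)\, y^{\lambda/2}\, dy \right)^2 dt \le \frac{1}{A}\, \big[ 4\, M_1(t_0) + B^2\, (t_1 - t_0) \big]. \tag{6.3} \]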
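Two computations are implicit in Step 1: the closed form $K_A(z) = A\,(z-A)_+$ and the constants in (6.3). The sketch below spells them out; the shorthand $X$ for the left-hand side of (6.3) is introduced here for convenience and is assumed finite so that it can be absorbed.

```latex
% K_A(z) from (6.2) with psi_A(y) = y /\ A.
% For z <= A the integrand is y + (z - y) - z = 0. For z > A:
\int_0^z (y \wedge A)\, dy = \frac{A^2}{2} + A\,(z - A) = A z - \frac{A^2}{2} ,
\\
K_A(z) = 2 \Big( A z - \frac{A^2}{2} \Big) - A z = A\,(z - A) .

% Constants in (6.3), with X := \int_{t_0}^{t_1} ( \int_A^\infty f y^{\lambda/2} dy )^2 dt :
\frac12 X \le \frac{M_1(t_0)}{A}
   + \frac{B}{2} \Big( \frac{X}{2B} + \frac{B\,(t_1 - t_0)}{2A} \Big)
 = \frac{M_1(t_0)}{A} + \frac{X}{4} + \frac{B^2 (t_1 - t_0)}{4A}
\\
\Longrightarrow\;
X \le \frac{1}{A} \big[ 4\, M_1(t_0) + B^2 (t_1 - t_0) \big] .
```

The only difference with the pure coagulation bound (3.4) is the additive term $B^2 (t_1 - t_0)$, which grows linearly in time and is the reason the estimates of Theorem 2.7 are stated on finite intervals.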
Proceeding as in the proof of Theorem 2.2 and Corollary 2.3 we readily deduce (2.18) and then (2.19), (2.20).

Step 2. In order to prove (2.21) we make the choice $\psi = 1$ in (6.1), so that $-\tilde\psi = 1$, $K_\psi(z) = z$, and thanks to (1.15), we get
\[ \frac{1}{2} \int_{t_0}^{t_1} M_{\lambda/2}^2(t)\, dt \le M_0(t_0) + \frac{B}{2} \int_{t_0}^{t_1}\!\!\int_0^\infty f(t,z)\, \frac{z}{(1+z)^{\gamma}}\, dz\, dt. \]
Using once again condition (1.31) and the Young inequality we have
\[ \int_0^\infty f(t,z)\, \frac{z}{(1+z)^{\gamma}}\, dz \le \frac{1}{2B}\, M_{\lambda/2}^2(t) + \frac{B}{2}, \]
from where we obtain
\[ \int_{t_0}^{t_1} M_{\lambda/2}^2(t)\, dt \le 4\, M_0(t_0) + B^2\, (t_1 - t_0). \tag{6.4} \]
Estimate (2.21) just follows interpolating (6.4) and (2.20).
Proof of Theorem 1.4. We proceed in three steps.

Step 1. Proof of (1.36). By (2.21) with $k = 1$ we have
\[ \int_0^T M_1^2(t)\, dt \le C_1\, \big( M_0(0) + M_1(0) + B^2\, T \big). \]
This inequality may hold with $M_1(t) \equiv M_1(0)$ for any time $T \ge 0$ only if $M_1(0) \le \sqrt{C_1}\, B$. When $M_1(0) > \sqrt{C_1}\, B$, gelation must occur before the time
\[ T_* = \frac{C_1\, \big( M_0(0) + M_1(0) \big)}{M_1^2(0) - C_1\, B^2}. \]
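The threshold $\sqrt{C_1}\,B$ and the value of $T_*$ in Step 1 follow from inserting a constant mass into (2.21). A short sketch of that rearrangement, assuming $M_1(t) \equiv M_1(0)$ on $[0,T]$:

```latex
M_1(0)^2\, T = \int_0^T M_1^2(t)\, dt
  \le C_1 \big( M_0(0) + M_1(0) \big) + C_1 B^2\, T
\;\Longrightarrow\;
\big( M_1(0)^2 - C_1 B^2 \big)\, T \le C_1 \big( M_0(0) + M_1(0) \big) .

% If M_1(0) > sqrt(C_1) B the bracket on the left is positive, hence
T \le \frac{C_1 \big( M_0(0) + M_1(0) \big)}{M_1(0)^2 - C_1 B^2} = T_* .
```

If instead $M_1(0) \le \sqrt{C_1}\,B$ the left-hand side is nonpositive and the inequality gives no restriction on $T$, which is why this argument only yields gelation for large initial mass.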
Step 2. We prove (1.38). Coming back to formula (6.1) with ψ(y) = y ∧ A we have ∞ 1 t1 ∞ ∞ f (t1 , y) y ∧ A dy + a f (t, y) f (t, y ) (−ψ˜ A ) dydy dt (6.5) 2 t0 0 0 0 ∞ 1 t1 ∞ ≤ f (t0 , y) y ∧ A dy + f (t, z) B(z) KA (z) dzdt, 2 t0 0 0 with ψ˜ = ψ˜ A given by (3.2) and KA (z) = A (z − A)+ . We assume by contradiction that M1 (t) ≡ M1 (0). Therefore, for any A > 0, A ∞ M1 (0) M1 (0) y f (t, y) dy ≥ y f (t, y) dy ≥ or . (6.6) 2 2 0 A We deduce from (6.6) that for A large enough ∞ ∞ f (t, z) B(z) KA (z) dz ≤ B f (t, z) z1−γ A dz 0 A ∞ ∞ B B α ≤ min A r(y ) f (t, y ) dy , γ −1 r(y) f (t, y) dy Aγ −α A A A ∞
∞ M1 (0) α ≤ min A r(y ) f (t, y ) dy , r(y) f (t, y) dy 4 A A ∞ ∞ 1 ≤ r(y) f (t, y) dy r(y )α f (t, y ) dy dt A 2 A A
A ∞ α + y r(y) f (t, y) dy r(y ) f (t, y ) dy 0 A 1 ∞ ∞ a f (t, y) f (t, y ) (−ψ˜ A (y, y )) dydy . ≤ 2 0 0
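This implies that there exists $A_0$ large enough such that for any $A \ge A_0/2$ and any $t_1 \ge t_0 \ge 0$,
\[ \frac{1}{4} \int_{t_0}^{t_1}\!\!\int_0^\infty\!\!\int_0^\infty a\, f(t,y)\, f(t,y')\, (-\tilde\psi_A)\, dy\, dy'\, dt \le M_1(t_0), \tag{6.7} \]
and
\[ \int_0^\infty f(t_1,y)\, y \wedge A\, dy \le \int_0^\infty f(t_0,y)\, y \wedge A\, dy. \tag{6.8} \]
From (6.7) we deduce (as in the proof of (2.5)) that
\[ t \mapsto \int_{A_0}^\infty f(t,y)\, y\, dy \in L^2(0,\infty), \tag{6.9} \]
and from (6.8) we deduce that for any $t \ge 1$,
\begin{align*}
\int_{A_0}^\infty f(t,y)\, y\, dy &\ge \int_0^\infty (y - A_0)_+\, f(t,y)\, dy \\
&\ge M_1(0) - \int_0^\infty (y \wedge A_0)\, f(t,y)\, dy \tag{6.10} \\
&\ge M_1(0) - \int_0^\infty (y \wedge A_0)\, f(1,y)\, dy.
\end{align*}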
From Lemma 6.1, whose statement and proof are given just below, we know that $f(1,y) > 0$ a.e. on $\mathbb{R}_+$, so the right-hand side in (6.10) is a positive constant. We deduce that (6.9) and (6.10) cannot hold together and we have a contradiction.

Step 3. We prove (1.37). We easily deduce from assumption (1.38) that for $A_0$ large enough, so that $\operatorname{supp} B \subset [0, A_0]$, (6.7) and (6.8) still hold. We then conclude as at the end of the preceding step.

Lemma 6.1. Every weak solution of the coagulation-fragmentation equation (1.27) with kinetic kernels such that
\[
0 < a(y,y'),\ b(y,y') \le C\,(1+y)\,(1+y') \quad \text{for a.e. } y, y' \in \mathbb{R}_+ \tag{6.11}
\]
and not identically zero initial data satisfies
\[
f(t,\cdot) > 0 \ \text{a.e. on } \mathbb{R}_+ \ \text{for any } t > 0. \tag{6.12}
\]
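Proof of Lemma 6.1. With the assumptions made on the kinetic kernels we have
\[
\frac{\partial f}{\partial t} + \lambda(t,y)\, f(t,y) \ge \Gamma(t,y) \quad \text{on } \mathbb{R}_+^2, \tag{6.13}
\]
with
\[
\lambda(t,y) := \frac12 \int_0^y b(y-y', y')\, dy' + C\,(1+y)\, \|f(t,\cdot)\|_{L^1_1} \in L^\infty_{loc}(\mathbb{R}_+^2), \tag{6.14}
\]
\[
\Gamma(t,y) := \frac12 \int_0^y a(y-y', y')\, f(t, y-y')\, f(t,y')\, dy' + \int_0^\infty b(y,y')\, f(t, y+y')\, dy' \in L^1_{loc}(\mathbb{R}_+^2). \tag{6.15}
\]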
We proceed in several steps.

Step 1. First, by hypothesis there exists $a > 0$ such that $f_{in} \not\equiv 0$ a.e. on $(2a, 3a)$. Therefore, since $\Gamma \ge 0$, Eq. (6.13) implies that $f(t,\cdot) \not\equiv 0$ a.e. on $(2a, 3a)$ for any $t \ge 0$.

Step 2. Thanks to Step 1, we have for a.e. $y \in (0, 2a)$ and any $t \ge 0$,
\[
\Gamma(t,y) \ge \int_0^\infty b(y,y')\, f(t, y+y')\, dy' \ge \int_{2a}^{3a} b(y, z-y)\, f(t,z)\, dz > 0,
\]
and then Eq. (6.13) implies that $f(t,\cdot) > 0$ a.e. on $(0, 2a)$ for any $t > 0$.

Step 3. Now, thanks to Step 2, we have for a.e. $y \in (0, 4a)$ and any $t > 0$,
\[
\Gamma(t,y) \ge \frac12 \int_0^y a(y-y', y')\, f(t, y-y')\, f(t,y')\, dy' > 0,
\]
and therefore Eq. (6.13) implies that $f(t,\cdot) > 0$ a.e. on $(0, 4a)$ for any $t > 0$.

Step 4. Assertion (6.12) follows by iterating Step 3.

Proof of Corollary 2.9. Let $f$ be a solution to the coagulation-fragmentation equation (1.27). Then it satisfies (2.20). Coming back to (6.5), we notice that for $\varepsilon \in (0, \gamma + \lambda/2 - 3/2)$ we have
\[
\int_0^\infty f(t,z)\, B(z)\, K_A(z)\, dz \le \int_A^\infty f(t,z)\, z^{2-\gamma}\, dz \le \frac{1}{A^{\gamma+\lambda/2-3/2-\varepsilon}}\, M_{\frac{\lambda}{2}+\frac12-\varepsilon}(t) \xrightarrow[A\to\infty]{} 0, \tag{6.16}
\]
in $L^1(0,T)$ for any $T \in \mathbb{R}_+$. Letting $A \to \infty$ in (6.5) we first deduce from (6.16) that $M_1(t)$ is decreasing. Moreover, since (4.10) still holds (because the contribution of the fragmentation term has the good sign) we deduce that $M_1(t)$ is right continuous. Finally, if $\Delta M_1 := M_1(t_0) - M_1(t_1) > 0$ we deduce from (6.5) and (6.16) that
\[
\int_{t_0}^{t_1}\!\int_0^\infty\!\!\int_0^\infty a\, f(t,y)\, f(t,y')\, (-\tilde\psi_A)\, dy\, dy'\, dt \ge \frac{\Delta M_1}{2} \tag{6.17}
\]
for $A$ large enough. Therefore, the analysis performed in the proof of Theorem 2.6 still holds, so that (2.14) and (2.15) follow.
7. Existence Result

This section is devoted to the proof of Corollary 2.8. As in [23] and [26], the strategy is to define a sequence $(f_n)$ of solutions to the coagulation-fragmentation equation with "truncated" coefficients, to establish some bounds which hold uniformly in $n \ge 0$, and then to pass to the limit in a weak formulation of the equations. Let us define the approximated coagulation and fragmentation kernels
\[
a_n(y,y') := a(y\wedge n,\ y'\wedge n) \quad\text{and}\quad b_n(y,y') := b(y,y')\, \mathbf{1}_{y+y'\le n} \tag{7.1}
\]
and denote by $f_n \in C([0,\infty); L^1) \cap L^\infty(0,T; L^1_1)$ for any $T > 0$ the weak solution to the coagulation-fragmentation equation (1.27) associated with the kernels (7.1) and the initial datum $f_{in}$. Such a solution exists thanks to a standard Banach fixed point theorem, see for instance [31]. Since $a_n(y,y'),\ b_n(y,y') \le C\,(1+y)\,(1+y')$ uniformly in $n$, it has been proved in [26] that the sequence $(f_n)$ satisfies the following estimates
\[
M_1(f_n(t,\cdot)) \le M_1(f_{in}), \qquad M_0(f_n(t,\cdot)) \le C_T \qquad \forall t \in [0,T], \tag{7.2}
\]
and for any $R, T > 0$ there exists a function $\Phi_R$ such that $\Phi_R(s)/s \to \infty$ when $s \to \infty$ and
\[
\sup_n \sup_{t\in[0,T]} \int_0^R \Phi_R(f_n(t,y))\, dy \le C_{T,R} < \infty. \tag{7.3}
\]
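The truncation (7.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the kernels `a_example` and `b_example` are hypothetical choices, not kernels from the paper, and the helper names are ours. The point it shows is that $a_n$ is bounded (it freezes $a$ beyond size $n$) and that $b_n$ vanishes whenever the fragmenting cluster $y + y'$ is larger than $n$.

```python
from typing import Callable

Kernel = Callable[[float, float], float]


def truncate_coagulation(a: Kernel, n: float) -> Kernel:
    """Return a_n(y, y') = a(min(y, n), min(y', n)), as in (7.1)."""
    return lambda y, yp: a(min(y, n), min(yp, n))


def truncate_fragmentation(b: Kernel, n: float) -> Kernel:
    """Return b_n(y, y') = b(y, y') if y + y' <= n and 0 otherwise, as in (7.1)."""
    return lambda y, yp: b(y, yp) if y + yp <= n else 0.0


# Hypothetical example kernels, chosen only for illustration.
def a_example(y: float, yp: float) -> float:
    return (y * yp) ** 0.75


def b_example(y: float, yp: float) -> float:
    return 1.0 / (1.0 + y + yp)


a5 = truncate_coagulation(a_example, 5.0)
b5 = truncate_fragmentation(b_example, 5.0)
```

With these definitions `a5(1e9, 1e9)` coincides with `a_example(5.0, 5.0)`, which is the uniform bound that makes a fixed point argument for the truncated problem plausible.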
We need one additional moment estimate that we derive now. Let us fix T ≥ 0. Proceeding as at the beginning of Step 1 in the proof of Theorem 2.7 we get
2 ∞ 1 T λ/2 fn (t, y) (y ∧ n) dy dt 2 0 A T ∞ 1 (y ∧ n)λ/2 ≤ M1 (0) + fn (t, z) dzdt, 2 0 A A1/2 for any A > 0. Then (using the Young inequality) we may follow the proof of (2.20) in Theorem 2.7 to obtain
\[
\sup_{n\ge 0} \int_0^T \left( \int_e^\infty f_n(t,y)\, (y\wedge n)^{\lambda/2}\, \frac{y^{1/2}}{(\ln y)^2}\, dy \right)^2 dt \le C_T. \tag{7.4}
\]
By (7.2) and (7.3) it is straightforward that $(f_n)$ lies in a weakly compact set of $L^1((0,T)\times\mathbb{R}_+)$ for any $T > 0$. Therefore, there exists a function $f \in C([0,\infty); L^1) \cap L^\infty(0,T; L^1_1)$ for all $T > 0$ such that, for a subsequence of $(f_n)$ (not relabeled), $f_n \rightharpoonup f$ weakly in $L^1((0,T)\times\mathbb{R}_+)$ for any $T > 0$. Moreover, it is possible to show that the fragmentation and coagulation terms $Q_{f,n}(f_n)$ and $Q_{c,n}(f_n)$ lie in a weakly compact set of $L^1((0,T)\times(0,R))$ for any $T, R > 0$, and that for any $T, R > 0$:
\[
Q_{f,n}(f_n) \rightharpoonup Q_f(f), \quad Q_{c,n}(f_n) \rightharpoonup Q_c(f) \quad \text{weakly in } L^1((0,T)\times(0,R)). \tag{7.5}
\]
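Since the fragmentation term is treated for instance in [23], we only briefly explain how to deal with the coagulation term. We refer to [26] for more details. Let us fix $\psi \in \mathcal{D}([0,\infty)\times\mathbb{R}_+)$, $M > 0$ such that $\operatorname{supp}\psi \subset [0,\infty)\times[0,M]$, and $R \ge M$. Using (1.5) we have
\[
\begin{aligned}
\int_0^\infty Q_{c,n}(f_n(t,\cdot))\, \psi\, dy
&= \frac12 \int_0^\infty\!\!\int_0^\infty a_n(y,y')\, f_n(t,y)\, f_n(t,y')\, \tilde\psi(t,y,y')\, dy\, dy' \\
&= \frac12 \int_0^R\!\!\int_0^R a_n(y,y')\, f_n(t,y)\, f_n(t,y')\, \tilde\psi(t,y,y')\, dy\, dy' \\
&\quad - \frac12 \iint_{\mathbb{R}_+^2\setminus[0,R]^2} a_n\, f_n(t,y)\, f_n(t,y')\, \bigl(\psi(t,y) + \psi(t,y')\bigr)\, dy\, dy'.
\end{aligned}
\]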
On the one hand, using Lemma 3.5 and Lemma 4.4 in [26], we can pass to the limit in the first term, so that
\[
\int_0^T\!\!\int_0^R\!\!\int_0^R a_n(y,y')\, f_n(t,y)\, f_n(t,y')\, \tilde\psi(t,y,y')\, dy\, dy'\, dt
\xrightarrow[n\to\infty]{}
\int_0^T\!\!\int_0^R\!\!\int_0^R a(y,y')\, f(t,y)\, f(t,y')\, \tilde\psi(t,y,y')\, dy\, dy'\, dt. \tag{7.6}
\]
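On the other hand, using the Cauchy–Schwarz inequality, we have
\[
\begin{aligned}
&\int_0^T \iint_{\mathbb{R}_+^2\setminus[0,R]^2} a_n\, f_n(t,y)\, f_n(t,y')\, \bigl(\psi(y) + \psi(y')\bigr)\, dy\, dy'\, dt \\
&\qquad \le C_a\, \|\psi\|_\infty \int_0^T \left( \int_0^M (1+y)\, f_n\, dy \right) \left( \int_R^\infty (y\wedge n)^\beta\, f_n\, dy \right) dt \\
&\qquad \le C_{a,\psi}\, \sup_{[0,T]} \|f_n(t,\cdot)\|_{L^1_1}\, \sqrt{T} \left( \int_0^T \left( \int_R^\infty (y\wedge n)^\beta\, f_n\, dy \right)^2 dt \right)^{1/2} \\
&\qquad \le \frac{C_{a,\psi,f_{in},T}}{R^{\frac{\lambda}{2}+\frac12-\beta-\varepsilon}} \left( \int_0^T \left( \int_R^\infty (y\wedge n)^{\frac{\lambda}{2}+\frac12-\varepsilon}\, f_n\, dy \right)^2 dt \right)^{1/2} \longrightarrow 0,
\end{aligned} \tag{7.7}
\]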
as $R \to \infty$, uniformly in $n$ thanks to (7.4). By (7.6) and (7.7) the coagulation term satisfies (7.5). Then, with (7.5) at hand, we easily pass to the limit in the weak formulation of Eq. (1.27) satisfied by $(f_n)$ and we obtain that $f$ is a weak solution in the sense of Definition 2.1.

Remark 7.1. Theorem 2.7 readily extends to a coagulation kernel of the form $a = a_1 + a_2$, where $a_1$ satisfies (1.2) and $a_2$ is symmetric and such that
\[
0 \le a_2 \le C\,(1+y)^{-1/2-\lambda/2-\varepsilon}\,(1+y')^{-1/2-\lambda/2-\varepsilon} \quad \text{for some } \varepsilon > 0. \tag{7.8}
\]
8. Extensions to Non-Homogeneous Models

Almost all the results obtained in the previous sections extend to a non-spatially homogeneous setting under suitable conditions. We briefly explain in this section how this can be done. Let us emphasize that the questions of gelation and gelling profile in a non-spatially homogeneous setting have been addressed in [19] in the case $\alpha = \beta = 1$.

Now, the clusters are assumed to move in an open bounded subset $\Omega$ of $\mathbb{R}^N$, $N \ge 1$, with smooth boundary $\partial\Omega$, according to Brownian motion with a diffusion coefficient $d$ that depends only on the size. We assume $d \in C(\mathbb{R}_+)$ and $d(y) > 0$ for all $y > 0$. We denote by $f(t,x,y) \ge 0$ the distribution of clusters of size $y \in \mathbb{R}_+$ at time $t \ge 0$ and position $x \in \Omega$. The continuous coagulation-fragmentation equation with diffusion reads
\[
\begin{cases}
\dfrac{\partial f}{\partial t} - d(y)\, \Delta_x f = Q_c(f) + Q_f(f), & \text{in } (0,+\infty)\times\Omega\times\mathbb{R}_+, \\[1ex]
\dfrac{\partial f}{\partial n} = 0, & \text{on } (0,+\infty)\times\partial\Omega\times\mathbb{R}_+, \\[1ex]
f(0,x,y) = f_{in}(x,y), & \text{in } \Omega\times\mathbb{R}_+.
\end{cases} \tag{8.1}
\]
Here, $\partial_n f$ denotes the outward normal derivative of $f$ on the boundary, and the terms $Q_c(f)$ and $Q_f(f)$ are given in (1.1) and (1.28). Under the assumptions (1.2)–(1.3) on $a$, (1.29)–(1.30) with $\gamma > 1$ on $b$, and $f_{in} \in L^1(\Omega\times\mathbb{R}_+, (1+y)\, dy\, dx)$, it has been proved in [25] that there exists a weak global solution in the following sense:
\[
\begin{aligned}
& f \in C([0,T); L^1(\Omega\times\mathbb{R}_+)) \cap L^\infty(0,T; L^1(\Omega\times\mathbb{R}_+;\ y\, dy\, dx)), \\
& f \in \bigcap_{R>0} L^1((0,T)\times(0,R); W^{1,1}(\Omega)), \qquad Q(f) \in L^1((0,T)\times\Omega\times\mathbb{R}_+),
\end{aligned}
\]
for all $T > 0$, and $f$ satisfies the following weak formulation of (8.1):
\[
\int_\Omega\!\int_0^\infty \bigl(\psi(t) f(t) - \psi(0) f_{in}\bigr)\, dy\, dx
+ \int_0^t\!\!\int_\Omega\!\int_0^\infty \bigl(d(y)\, \nabla_x f\cdot\nabla_x\psi - f\, \partial_t\psi\bigr)\, dy\, dx\, ds
= \frac12 \int_0^t\!\!\int_\Omega\!\int_0^\infty Q(f)\, \psi\, dy\, dx\, ds,
\]
for every $t \in (0,T)$ and all $\psi \in W^{1,\infty}([0,T]\times\Omega\times\mathbb{R}_+)$ with compact support in $[0,T]\times\Omega\times\mathbb{R}_+$. The case $\alpha \in (0,1)$, $\beta = 1$ is not actually considered in [25], but the existence result in [25] can be extended to this case by adapting Corollary 2.8. We refer to [25] for details about the definition of solutions as well as for the precise statement of the existence result. We also refer to [24] and references therein for the existence of a solution to the discrete coagulation-fragmentation equations with diffusion. We finally introduce the following notation:
\[
M_k(t) := \int_\Omega M_k(t,x)\, dx, \qquad M_k(t,x) := M_k(f(t,x,\cdot)). \tag{8.2}
\]
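As a concrete reading of the notation (8.2), the following Python sketch approximates the space-integrated moment by a midpoint rule in $y$ and a sum over cells in $x$. Everything specific here is an assumption made for illustration: the domain $\Omega = (0,1)$, the density $f(x,y) = (1+x)e^{-y}$, the cutoff `y_max`, and the function names are hypothetical and do not come from the paper.

```python
import math


def moment(f, k, y_max=40.0, n=4000):
    """Midpoint-rule approximation of M_k(f) = int_0^{y_max} y^k f(y) dy."""
    h = y_max / n
    return sum(((i + 0.5) * h) ** k * f((i + 0.5) * h) for i in range(n)) * h


def total_moment(f, k, x_cells, cell_volume, **kw):
    """Approximate the integral over Omega of M_k(f(x, .)) by a sum over cells."""
    return sum(moment(lambda y: f(x, y), k, **kw) for x in x_cells) * cell_volume


# Hypothetical data: Omega = (0, 1) split into 10 cells, f(x, y) = (1 + x) e^{-y}.
cells = [(i + 0.5) / 10 for i in range(10)]
M1 = total_moment(lambda x, y: (1 + x) * math.exp(-y), 1, cells, 0.1)
```

For this particular density the exact value is $\int_0^1 (1+x)\,dx \cdot \int_0^\infty y e^{-y}\,dy = 3/2$, which gives a simple sanity check on the quadrature.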
In that context, the gelation time is now defined as the smallest time $T_g$ such that for every $t_0 \ge 0$, $t_1 \ge 0$ with $t_0 < T_g < t_1$ there holds $M_1(t_1) < M_1(t_0)$. We now state the extension of some of the results obtained in the previous sections (Theorem 2.2, Corollary 2.3, Theorem 2.4 in the pure coagulation case, Theorem 2.7 and part of Theorem 1.4 in the coagulation-fragmentation case) to this non-homogeneous setting. Since their proofs are rather straightforward extensions of those for the homogeneous equations, we skip them for the sake of brevity.

Theorem 8.1. Let $\Phi : \mathbb{R}_+ \to \mathbb{R}_+$ be an increasing function satisfying $\Phi(0) = 0$ and (2.3). For any weak solution $f$ to (8.1) and any $t_1 \ge t_0 \ge 0$ the following non-homogeneous version of (2.18) holds:
\[
\int_{t_0}^{t_1}\!\!\int_\Omega \left( \int_0^\infty f(t,x,y)\, y^{\lambda/2}\, \Phi(y)\, dy \right)^2 dx\, dt \le C_\Phi^2 \left( 4\, M_1(t_0) + B^2\, |\Omega|\, (t_1 - t_0) \right). \tag{8.3}
\]
Consequently the non-homogeneous version of (2.19)–(2.21) also holds and, when $b \equiv 0$, so do the non-homogeneous versions of (2.5) and (4.4). Moreover, $M_1(t)$ is decreasing and right continuous, and if for some $T_0 < T_g \le T_1$, $M_1(T_1) < M_1(T_0)$,
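then, for any $R > 0$ and $\tau > 0$,
\[
\int_{t_0}^{t_1}\!\!\int_\Omega \left( \sup_{S>R} \frac{1}{S^\tau} \int_0^S y^{\lambda/2+1/2+\tau}\, f(t,x,y)\, dy \right)^2 dx\, dt \ge C_\tau > 0 \tag{8.4}
\]
and
\[
\int_{T_0}^{T_1}\!\!\int_\Omega \left( \int_e^\infty \frac{y^{\lambda/2+1/2}\, f(t,x,y)}{(\ln y)^{1/2}}\, dy \right)^2 dx\, dt = +\infty. \tag{8.5}
\]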
We deduce from Theorem 8.1 the following gelation criteria.

Corollary 8.2. Consider Problem (8.1) under the conditions imposed above. Then gelation occurs for any weak solution with initial data satisfying
\[
M_1(0) \ge C_\lambda\, B\, |\Omega| \tag{8.6}
\]
for some positive constant $C_\lambda$. If $b \equiv 0$, then for all weak solutions with non-identically zero initial data, gelation occurs and $M_1(t)$ satisfies (1.18).

Sketch of the Proof. From the non-homogeneous version of (2.21) with $k = 1$ and the Cauchy–Schwarz inequality we have
\[
\frac{1}{|\Omega|} \int_{t_0}^{t_1} M_1^2(t)\, dt \le \int_{t_0}^{t_1}\!\!\int_\Omega (M_1(t,x))^2\, dx\, dt \le C_{1,\lambda} \left( M_1(t_0) + M_0(t_0) + B^2\, |\Omega|\, (t_1 - t_0) \right). \tag{8.7}
\]
The result for the general case $b \ge 0$ immediately follows from (8.7) and the condition (8.6) as in the proof of Theorem 1.4 in Sect. 6. On the other hand, for the pure coagulation equation ($b \equiv 0$) we first prove that $M_k(t)$ is a decreasing function for $k = 0$ and $k = 1$, as in the proof of (2.9) in Sect. 4. Then we deduce from that fact, from the non-homogeneous version of (1.16),
\[
\int_T^\infty\!\!\int_\Omega (M_{\lambda/2}(t,x))^2\, dx\, dt \le 2\, M_0(T) \quad \forall T \ge 0,
\]
and from the non-homogeneous version of (2.5),
\[
\int_T^\infty\!\!\int_\Omega \left( \int_R^\infty f(t,x,y)\, y\, dy \right)^2 dx\, dt \le C_\lambda\, R^{1-\lambda}\, M_1(T) \quad \forall T \ge 0,
\]
the following non-homogeneous version of (4.4):
\[
\int_T^\infty\!\!\int_\Omega (M_1(t,x))^2\, dx\, dt \le C_\lambda\, M_0(0)^{\lambda-1}\, M_1(T)^{2-\lambda} \quad \forall T \ge 0.
\]
By the Cauchy–Schwarz inequality again we obtain
\[
\frac{1}{|\Omega|} \int_T^\infty M_1^2(t)\, dt \le C_\lambda\, M_0(0)^{\lambda-1}\, M_1(T)^{2-\lambda},
\]
and we conclude that $M_1(t)$ satisfies (1.18) thanks to Lemma 4.1.
Remark 8.3. Notice that also in this non-homogeneous setting the only power-like self-similar behaviour in the $y$ variable of the solution $f$ compatible with the estimates (8.4), (8.5) and with the non-homogeneous version of (2.20) and (2.21) is again $y^{-3/2-\lambda/2}$.

Acknowledgement. We thank P. Laurençot for many useful discussions during the preparation of this work and for supplying additional references. We were partially supported by CNRS and UPV through a PIC between the Universidad del Pais Vasco and the Ecole Normale Superieure.
References

1. Aizenman, M., Bak, T.: Convergence to equilibrium in a system of reacting polymers. Commun. Math. Phys. 65, 203–230 (1979)
2. Aldous, D.J.: Deterministic and stochastic models for coalescence (aggregation, coagulation): A review of the mean-field theory for probabilists. Bernoulli 5, 3–48 (1999)
3. Ball, J.M., Carr, J.: The discrete coagulation-fragmentation equations: Existence, uniqueness, and density conservation. J. Statist. Phys. 61, 203–234 (1990)
4. Buffet, E., Pulé, J.V.: Gelation: The diagonal case revisited. Nonlinearity 2, 373–381 (1989)
5. Burobin, A.V., Galkin, V.A.: Solutions of an equation of coagulation. Differential Equations 17, 456–462 (1981)
6. Carr, J., da Costa, F.P.: Instantaneous gelation in coagulation dynamics. Z. Angew. Math. Phys. 43, 974–983 (1992)
7. da Costa, F.P.: Existence and uniqueness of density conserving solutions to the coagulation-fragmentation equations with strong fragmentation. J. Math. Anal. Appl. 192, 892–914 (1995)
8. van Dongen, P.G.J.: On the possible occurrence of instantaneous gelation in Smoluchowski’s coagulation equation. J. Phys. A 20, 1889–1904 (1987)
9. van Dongen, P.G.J., Ernst, M.H.: On the occurrence of a gelation transition in Smoluchowski’s coagulation equation. J. Stat. Phys. 44, 785–792 (1986)
10. van Dongen, P.G.J., Ernst, M.H.: Scaling solutions of Smoluchowski’s coagulation equation. J. Stat. Phys. 50(1/2), 295–329 (1988)
11. Drake, R.L.: A general mathematical survey of the coagulation. In: G. Hidy, J.R. Brocks (eds.): Topics in Current Aerosol Research 3 (Part 2). Oxford: Pergamon Press, 1972
12. Dubovskii, P.B.: Mathematical theory of coagulation. Lecture Notes Ser. 23. Seoul: Seoul Nat. Univ., 1994
13. Dubovskii, P.B., Stewart, I.W.: Existence, uniqueness and mass conservation for the coagulation-fragmentation equation. Math. Methods Appl. Sci. 19, 571–591 (1996)
14. Ernst, M.H., Ziff, R.M., Hendriks, E.M.: Coagulation processes with a phase transition. J. Colloid Interface Sci. 97, 266–277 (1984)
15. Escobedo, M., Laurençot, P., Mischler, S., Perthame, B.: Gelation and mass conservation in coagulation-fragmentation models. Preprint of the E.N.S. Paris (2002)
16. Galkin, V.A.: The existence and uniqueness of the solution of the coagulation equation. Differ. Eqs. 13, 1014–1021 (1977)
17. Galkin, V.A., Dubovskii, P.B.: Solution of the coagulation-fragmentation equations with unbounded kernels. Differ. Eqs. 22, 373–378 (1986)
18. Hendriks, E.M., Ernst, M.H., Ziff, R.M.: Coagulation equations with gelation. J. Stat. Phys. 33, 519–563 (1983)
19. Herrero, M.A., Velazquez, J.J.L., Wrzosek, D.: Sol-gel transition in a coagulation-diffusion model. Phys. D 141, 221–247 (2000)
20. Jeon, I.: Existence of gelling solutions for coagulation-fragmentation equations. Commun. Math. Phys. 194, 541–567 (1998)
21. Kokholm, N.J.: On Smoluchowski’s coagulation equation. J. Phys. A: Math. Gen. 21, 839–842 (1988)
22. Laurençot, Ph.: The discrete coagulation equations: existence of solutions and gelation. In: Journées Elie Cartan 1998 et 1999. Publications de l’Institut Elie Cartan 16, Nancy: Institut Elie Cartan, 2000, pp. 74–104
23. Laurençot, Ph.: On a class of continuous coagulation-fragmentation models. J. Differ. Eqs. 167, 145–174 (2000)
24. Laurençot, Ph., Mischler, S.: Global existence for the discrete diffusive coagulation-fragmentation equation in L1. Rev. Mat. Iberoamericana 18, 221–235 (2002)
25. Laurençot, Ph., Mischler, S.: The continuous coagulation-fragmentation equation with diffusion. Arch. Rational Mech. Anal. 162, 45–99 (2002)
26. Laurençot, Ph., Mischler, S.: From discrete to continuous coagulation-fragmentation models. Proc. Roy. Soc. Edinburgh Sect. A, to appear
27. Leyvraz, F.: Existence and properties of post-gel solutions for the kinetic equations of coagulation. J. Phys. A: Math. Gen. 16, 2861–2873 (1983)
28. Leyvraz, F., Tschudi, H.R.: Singularities in the kinetics of coagulation processes. J. Phys. A: Math. Gen. 14, 3389–3405 (1983)
29. McLeod, J.B.: On an infinite set of non-linear differential equations I & II. Q. J. Math. Oxf. Ser. 13, 119–128 and 193–205 (1962)
30. McLeod, J.B.: On the scalar transport equation. Proc. London Math. Soc. 3, 14, 445–458 (1964)
31. Melzak, Z.A.: A scalar transport equation. Trans. Am. Math. Soc. 85, 547–560 (1957)
32. Muller, H.: Zur allgemeinen Theorie der raschen Koagulation. Kolloid-chemische 27, 223–250 (1928)
33. Norris, J.R.: Cluster coagulation. Commun. Math. Phys. 209, 407–435 (2000)
34. von Smoluchowski, M.: Z. Phys. 96, 557–585 (1916)
35. Shirvani, M., van Roessel, H.J.: Some results in the coagulation equation. Nonlinear Anal. 43, 563–573 (2001)
36. Spouge, J.: An existence theorem for the discrete coagulation-fragmentation equations. Math. Proc. Cambridge Philos. Soc. 96, 351–357 (1984)
37. Stell, G., Ziff, R.M.: Kinetics of polymer gelation. J. Chem. Phys. 73, 3492–3499 (1980)
38. Stewart, I.W.: A global existence theorem for the general coagulation-fragmentation equation with unbounded kernels. Math. Methods Appl. Sci. 11, 627–648 (1989)
39. Stewart, I.W.: An uniqueness theorem for the coagulation-fragmentation equation. Math. Proc. Cambridge Philos. Soc. 107, 573–578 (1990)
40. Stewart, I.W.: On the coagulation-fragmentation equation. J. Appl. Math. Phys. (ZAMP) 41, 917–924 (1990)
41. Stewart, I.W.: Density conservation for a coagulation equation. Z. Angew. Math. Phys. 42, 746–756 (1991)
42. Stockmayer, W.H.: J. Chem. Phys. 11, 45 (1943)
43. White, W.H.: A global existence theorem for Smoluchovski’s coagulation equations. Proc. Am. Math. Soc. 80 (1980)

Communicated by A. Kupiainen
Commun. Math. Phys. 231, 189–221 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0691-6
Communications in
Mathematical Physics
Non-Topological Multi-Vortex Solutions to the Self-Dual Chern-Simons-Higgs Equation
Hsungrow Chan1, Chun-Chieh Fu2, Chang-Shou Lin3,∗
1 Department of Mathematics Education, National Ping-Tung Teachers College. E-mail: [email protected]
2 Department of Finance, Fortune Institute of Technology. E-mail: [email protected]
3 Department of Mathematics, National Chung-Cheng University, Minghsiung, Chia-Yi, Taiwan, R.O.C. E-mail: [email protected]
Received: 20 August 2001 / Accepted: 31 December 2001
Published online: 29 October 2002 – © Springer-Verlag 2002
Abstract: In this article, we construct self-dual $N$-vortex solutions with a large magnetic flux of the $(2+1)$-dimensional relativistic Chern-Simons model, provided that the coupling constant $\kappa$ is small and the sites of vorticity $\{p_1, \dots, p_N\}$ satisfy
\[
\sum_{k\ne j} \log(|p_j - p_k|) \ \text{is independent of } j. \tag{0.1}
\]
Our solutions exhibit the bubbling phenomenon at each $p_j$. Near each vortex $p_j$, solutions are locally asymptotically symmetric with respect to $p_j$, and the curvature $F_{12}$ tends to a sum of Dirac measures as $\kappa$ tends to zero. By a heuristic argument, it is shown that (0.1) is also a necessary condition for the existence of multi-vortex solutions which have a locally asymptotically symmetric vortex at $p_j$, $j = 1, 2, \dots, N$.

1. Introduction

In this article, we want to prove the existence of multi-vortex solutions to the $(2+1)$-dimensional relativistic Chern-Simons gauge field theory. This theory was suggested by Hong-Kim-Pac [8] and Jackiw-Weinberg [9] to study vortex solutions of the Abelian Higgs model which carry both electric and magnetic charges. In this theory, the Lagrangian density is given by
\[
\mathcal{L} = \frac{\kappa}{4}\, \varepsilon^{\mu\nu\rho} F_{\mu\nu} A_\rho + D_\mu\phi\, \overline{D^\mu\phi} - \frac{1}{\kappa^2}\, |\phi|^2 (1 - |\phi|^2)^2, \tag{1.1}
\]
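where $A_\mu$ ($\mu = 0, 1, 2$) is the gauge field on $\mathbb{R}^3$, $F_{\mu\nu} = \frac{\partial}{\partial x^\mu} A_\nu - \frac{\partial}{\partial x^\nu} A_\mu$ is the curvature tensor, $\phi$ is the Higgs field on $\mathbb{R}^3$, $D_\mu = \frac{\partial}{\partial x^\mu} - iA_\mu$ ($i = \sqrt{-1}$) is the gauge covariant derivative associated with $A_\mu$, $\varepsilon^{\mu\nu\rho}$ is the skew symmetric tensor with $\varepsilon^{012} = 1$, and the constant $\kappa$ is the coupling constant.

∗ Partially supported by National Center for Theoretical Sciences of NSC, Taiwan.

The static energy density corresponding to (1.1) can be written as
\[
\begin{aligned}
\mathcal{E} &:= \frac{\kappa^2}{4} \frac{F_{12}^2}{|\phi|^2} + \sum_{j=1}^2 |D_j\phi|^2 + \frac{1}{\kappa^2}\, |\phi|^2 (1 - |\phi|^2)^2 \\
&= \left( \frac{\kappa}{2} \frac{F_{12}}{|\phi|} + \frac{1}{\kappa}\, |\phi| (|\phi|^2 - 1) \right)^2 + |D_1\phi + iD_2\phi|^2 + F_{12} + (\text{divergence term}),
\end{aligned} \tag{1.2}
\]
where the last term is in divergence form and consists of first derivatives of $\phi$. We then have
\[
E = \int_{\mathbb{R}^2} \mathcal{E}\, dx \ge \int_{\mathbb{R}^2} F_{12}\, dx,
\]
and the minimum of the energy is saturated iff $(\phi, A)$ satisfies the self-dual equations, or Bogomol'nyi equations,
\[
(D_1 + iD_2)\phi = 0, \tag{1.3}
\]
and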
\[
F_{12} + \frac{2}{\kappa^2}\, |\phi|^2 (|\phi|^2 - 1) = 0, \tag{1.4}
\]
with the boundary condition either $|\phi(x)| \to 1$ as $|x| \to +\infty$ or $|\phi(x)| \to 0$ as $|x| \to +\infty$. Following Jaffe-Taubes [10], system (1.3)–(1.4) can be reduced to a single nonlinear elliptic equation of second order. Let $p_1, \dots, p_N$ be any set of points in $\mathbb{R}^2$. Introduce real-valued functions $u$ and $\theta$ by
\[
\phi = e^{\frac12 (u + i\theta)} \quad\text{and}\quad \theta = 2 \sum_{j=1}^N \arg(z - p_j), \qquad z = x_1 + i x_2 \in \mathbb{C}.
\]
Then $u$ satisfies
\[
\Delta u + \frac{4}{\kappa^2}\, e^u (1 - e^u) = 4\pi \sum_{j=1}^N \delta(z - p_j), \tag{1.5}
\]
where $\delta(z - p_j)$ is the Dirac measure with the total mass at $p_j$. For the details of the derivation of Eqs. (1.2)–(1.5) and recent developments of related subjects, we refer the readers to Hong-Kim-Pac [8], Jackiw-Weinberg [9], Wang [17], Spruck-Yang [14], Caffarelli-Yang [2], Tarantello [16], Chae-Imanuvilov [3], Nolasco-Tarantello [12, 13], Yang [18] and the references therein.

A solution $u$ of (1.5) is called topological if $|u(x)| \to 0$ as $|x| \to +\infty$, and is called non-topological if $u(x) \to -\infty$ as $|x| \to +\infty$. For any solution $u = \log(|\phi|^2)$ of (1.5), we define the flux by
\[
\Phi = \int_{\mathbb{R}^2} F_{12}\, dx = \frac{2}{\kappa^2} \int_{\mathbb{R}^2} |\phi|^2 (1 - |\phi|^2)\, dx. \tag{1.6}
\]
It is easy to see that if $u$ is a topological solution, then $\Phi = 2\pi N$. For a non-topological solution $u$ of (1.5) with $p_1 = \dots = p_N$, Spruck and Yang [15] proved that its flux satisfies $\Phi > 4\pi(N+1)$. It is an interesting question whether, for any given number $\Phi > 4\pi(N+1)$, (1.5) possesses a non-topological solution $u$ with the given $\Phi$ as its flux. If $\Phi$ is close to $4\pi(N+1)$, then Chae-Imanuvilov [3] recently proved the existence of non-topological solutions which are obtained from a small perturbation of the corresponding Liouville equation with the same vorticities. In this paper, we want to find solutions of (1.5) with small $\kappa$ and with large $\Phi$, that is, $\Phi > 8\pi N$. Consequently,
our solutions are not obtained from the perturbation of the Liouville equation. As is common to many model equations, Eq. (1.5) does possess so-called bubbling solutions when the coupling constant $\kappa$ is small. To see it, we consider the radial solution $u_\varepsilon(x) = u_\varepsilon(r)$ of Eq. (1.5) with $N = 1$ and $p_1 = 0$, where $\varepsilon = \frac{\kappa}{2}$, following conventional notations. By scaling, we let $u(r) = u_\varepsilon(\varepsilon r)$. Then $u(r)$ satisfies
\[
u''(r) + \frac{u'(r)}{r} + e^u (1 - e^u) = 4\pi\, \delta(0) \quad \text{for } r > 0. \tag{1.7}
\]
As before, $u(r)$ is called a non-topological solution if $u(r) \to -\infty$ as $r \to +\infty$. By an elementary argument, we can show that $u(r)$ satisfies
\[
u(r) = (2 - \beta_0) \log r + O(1) \quad \text{for } r \text{ large}, \tag{1.8}
\]
where the flux of (1.6) is equal to $\pi\beta_0$, that is,
\[
\beta_0 = \int_0^\infty e^{u(r)} (1 - e^{u(r)})\, r\, dr. \tag{1.9}
\]
In Sect. 2, we prove the uniqueness theorem for Eq. (1.7).

Theorem 1.1. For any $\beta_0 > 8$, there exists a unique radial solution of (1.7) with
\[
\int_0^\infty e^{u(r)} (1 - e^{u(r)})\, r\, dr = \beta_0.
\]
By (1.8), we see that for any $\delta_0 > 0$,
\[
u_\varepsilon(x) = (2 - \beta_0) \log |x| + (2 - \beta_0) \log \frac{1}{\varepsilon} + I + O(\varepsilon^\tau) \tag{1.10}
\]
holds for $|x| \ge \delta_0$, where $I = I(\beta_0)$ is a constant and $\tau = \beta_0 - 4$. On the other hand, write $u_\varepsilon(x) = 2 \log |x| + v_\varepsilon(|x|)$ and $u(r) = 2 \log r + v(r)$. Then for any large $R > 0$ we have, for $|x| \le R\varepsilon$,
\[
v_\varepsilon(r) = 2 \log \frac{1}{\varepsilon} + v\!\left(\frac{r}{\varepsilon}\right) = 2 \log \frac{1}{\varepsilon} + D(\beta_0) + O\!\left(\frac{r^2}{\varepsilon^2}\right), \tag{1.11}
\]
where $D(\beta_0) = v(0)$. According to Theorem 1.1, $D(\beta_0)$ is a constant depending only on the "local flux" of $u_\varepsilon(r)$ at the vortex $0$. From (1.11), we see that the height of the "bubble" $v_\varepsilon(r)$ is equal to $2 \log \frac{1}{\varepsilon} + D(\beta_0)$, which tends to $+\infty$ as $\varepsilon \downarrow 0$. This analysis clearly shows the bubbling phenomenon for a single vortex solution when $\varepsilon$ tends to zero, and it also shows the special nature of the bubbling, because (1.10) and (1.11) show that the local flux $\beta_0$ and the coupling constant $\kappa = 2\varepsilon$ completely determine the bubbling behavior: $v_\varepsilon(r)$ has no freedom to adjust its height in response to the interaction of different sites of vorticities. This will give a restriction on the sites $p_1, \dots, p_N$. In the following, we give a heuristic reasoning to explain it.
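Given $\beta = 2\Phi > 16\pi N$, let $u_\varepsilon(x)$ be a solution of (1.5) with
\[
\frac{1}{\varepsilon^2} \int_{\mathbb{R}^2} e^{u_\varepsilon} (1 - e^{u_\varepsilon})\, dx = \beta.
\]
We are interested in those solutions which have the local flux
\[
\frac{1}{\varepsilon^2} \int_{B_{\delta_0}(p_j)} e^{u_\varepsilon} (1 - e^{u_\varepsilon})\, dx = \beta_j (1 + o(1)) \tag{1.12}
\]
with $\beta_j > 16\pi$, for some $\beta_j$ independent of $\varepsilon$ and $j = 1, 2, \dots, N$. We assume that $u_\varepsilon(x)$ possesses the property of locally asymptotic symmetry near $p_j$ as $\varepsilon \to 0$, that is, for a fixed small $\delta_0 > 0$,
\[
u_\varepsilon(x) = (1 + o(1))\, \bar u_\varepsilon(r), \tag{1.13}
\]
where $o(1)$ tends to $0$ uniformly for $|x - p_j| \le \delta_0$ as $\varepsilon \to 0$ and $\bar u_\varepsilon(r) = \frac{1}{2\pi r} \int_{|x-p_j|=r} u_\varepsilon(x)\, d\sigma$ stands for the average of $u_\varepsilon$ over the circle $\{x \mid |x - p_j| = r\}$. Heuristically, the rescaled solution $u_\varepsilon(\varepsilon x + p_j)$ will converge uniformly to $u_j(x)$. Due to the locally asymptotic symmetry with respect to $p_j$, $u_j(x)$ is a radial solution of (1.7) with
\[
\beta_j = \int_{\mathbb{R}^2} e^{u_j} (1 - e^{u_j})\, dx.
\]
Then (1.8) gives
\[
u_\varepsilon(x) = \left(2 - \frac{\beta_j}{2\pi}\right) \log |x - p_j| + \left(2 - \frac{\beta_j}{2\pi}\right) \log \frac{1}{\varepsilon} + I_j + o(1), \tag{1.14}
\]
for $|x - p_j| = \delta_0$, $j = 1, 2, \dots, N$. Since $u_\varepsilon(x)$ is close to a harmonic function for $x \notin \bigcup_{j=1}^N B_{\delta_0}(p_j)$, we have $\left(2 - \frac{\beta_j}{2\pi}\right) \log \frac{1}{\varepsilon} = \left(2 - \frac{\beta_k}{2\pi}\right) \log \frac{1}{\varepsilon} + O(1)$ for $j \ne k$. From here, we deduce
\[
\beta_j = \beta_k = \frac{\beta}{N} \tag{1.15}
\]
for $j, k \in \{1, 2, \dots, N\}$. By Theorem 1.1, $u_j(|x|) = u_k(|x|)$ for $j \ne k$. Also, (1.14) implies that $u_\varepsilon(x) + \left(\frac{\beta}{2\pi N} - 2\right) \log \frac{1}{\varepsilon}$ converges to a sum of Green's functions $h(x)$. By the Liouville Theorem,
\[
h(x) = \left(2 - \frac{\beta}{2\pi N}\right) \sum_{j=1}^N \log |x - p_j| + C \tag{1.16}
\]
for some constant $C \in \mathbb{R}$. Thus, comparing with (1.14), we arrive at the conclusion that $\left(2 - \frac{\beta}{2\pi N}\right) \sum_{k\ne j} \log |p_j - p_k| + C = I_j$ for each $j = 1, 2, \dots, N$. Since $u_j = u_k$, we have $I_j = I_k$, and this implies that for each $j = 1, 2, \dots, N$,
\[
\sum_{k\ne j} \log(|p_j - p_k|) \ \text{is independent of } j. \tag{1.17}
\]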
Hence, we have seen that by virtue of the uniqueness of Theorem 2.1, the interaction of different vorticities gives the conditions (1.15) and (1.17) on the local flux and the sites of vorticities. This is quite a different phenomenon from many model equations which possess the bubbling phenomenon, for example, the scalar curvature equation studied in [1] or the mean field equations studied in [5, 6]. For the mean field equation, the bubbling also arises from the symmetric entire solutions of
\[
\Delta u + e^u = 0 \quad \text{in } \mathbb{R}^2. \tag{1.18}
\]
However, (1.18) possesses a family of radial solutions which have the same growth rate at infinity. This family of radial solutions allows solutions to adjust the heights of their bubbles at different sites to satisfy the compatibility conditions due to the strong interaction of different bubbles. See [6] for more precise statements. Here, we see that the constraint (1.17) is an outcome of the uniqueness result. Here we only give a heuristic argument to justify the conditions (1.15) and (1.17). We are going to give a rigorous proof of those statements in a forthcoming paper, where an estimate analogous to the one given in Chen and Lin [5] will be derived. The main result of this paper is the existence of bubbling solutions of (1.5) under the condition (1.17) on the sites of the vorticities.

Theorem 1.2. Let $\Phi$ be a positive number with $\Phi > 8\pi N$, and let $p_j \in \mathbb{R}^2$, $j = 1, 2, \dots, N$, with $p_i \ne p_j$ for $i \ne j$, satisfy (1.17). Then there exists a small $\kappa_* > 0$ such that for any $0 < \kappa \le \kappa_*$, there exists a solution $(\phi, A)$ to (1.3) and (1.4) with $|\phi(x)| + |x|\,|\nabla\phi(x)| = O(|x|^{2\pi N - \Phi})$ as $|x| \to \infty$. Moreover, the following hold:
(i) $\phi$ has its zeros exactly at $p_1, \dots, p_N$, and the flux integral (1.6) satisfies
\[
\int_{\mathbb{R}^2} F_{12}\, dx = \Phi.
\]
(ii) The local flux satisfies
\[
\int_{B_{\delta_0}(p_j)} F_{12}\, dx = \frac{2}{\kappa^2} \int_{B_{\delta_0}(p_j)} |\phi|^2 (1 - |\phi|^2)\, dx = \frac{\Phi}{N} (1 + o(1)),
\]
where $o(1) \to 0$ as $\kappa \downarrow 0$.
(iii) After a normalization, the function $u = \log |\phi|^2$ converges to $\left(2 - \frac{\beta}{2\pi N}\right) \sum_{j=1}^N \log |z - p_j| + C$, for some constant $C \in \mathbb{R}$, and the curvature $F_{12}$ converges to $\frac{\Phi}{N} \sum_{j=1}^N \delta(p_j)$, a sum of Dirac measures.
(iv) The solution $|\phi(x)|$ is locally asymptotically symmetric with respect to $p_j$, $j = 1, 2, \dots, N$, as $\kappa \downarrow 0$.

Remark 1.2. The condition (1.17) is equivalent to the one that $\{p_1, \dots, p_N\}$ are located on a circle $\{x \mid |x - q_0| = r_0\}$ such that the angle between $\overrightarrow{q_0 p_j}$ and $\overrightarrow{q_0 p_{j+1}}$ is equal to $\frac{2\pi}{N}$ for $j = 1, 2, \dots, N$, where $p_{N+1} \equiv p_1$. By the proof of Theorem 1.1, it is easy to see that our solutions are invariant under the rotation of angle $\frac{2\pi}{N}$ with center $q_0$. See Sect. 5 for more details.

The paper is organized as follows. In Sect. 2, we prove the uniqueness theorem for radial solutions. For the last two decades, there has been a vast literature dedicated to the uniqueness problem for semilinear elliptic equations, either for entire solutions or for the Dirichlet problem on finite balls. Of course, different types of equations always present
different levels of difficulty. In our case, not only the nonlinear term $e^u (1 - e^u)$, but also the solution $u(r)$ itself contributes new elements of difficulty, because $u(r)$ is not monotone for all $r > 0$. The crucial step for our proof is Lemma 2.2 in Sect. 2. We refine the argument in [4] for the proof of Lemma 2.2. The complete proof of Lemma 2.2 is given in Sect. 3. In Sect. 4, we give a complete estimate of the linearized equation of (1.7) at the solution $u(r)$, and we will see that it plays an important role when we glue bubbles together as an approximate solution. The linearized operator at an approximate solution is an elliptic operator with a singularly perturbed coefficient of zero order. Although the coefficient is almost zero away from $p_1, \dots, p_N$, it does have a concentration near $p_1, \dots, p_N$. The main estimate for this type of singularly perturbed operator is Theorem 5.1. Once the estimate of Theorem 5.1 is done, Eq. (1.5) can be solved by an iterative process. See Sect. 5 for the proof. In the final section, we give a proof of Theorem 5.1. The proof itself is interesting and should be useful in other problems too.

2. Uniqueness of Radial Solutions

The main purpose of this section is to prove the uniqueness of radial solutions satisfying (2.1) and (2.2),
\[
u''(r) + \frac{u'(r)}{r} + e^u (1 - e^u) = 4\pi N\, \delta(0), \tag{2.1}
\]
and
\[
\beta = \int_0^\infty e^u (1 - e^u)\, r\, dr, \tag{2.2}
\]
where $\beta > 4(N+1)$ is given arbitrarily. Throughout the section, unless stated explicitly, $N$ is a nonnegative number, which is not necessarily a positive integer. Following the conventional notations, we let $v(r; s)$ be the unique solution of the initial value problem
\[
\begin{cases}
v''(r) + \frac{1}{r} v'(r) + r^{2N} e^v (1 - r^{2N} e^v) = 0, \\
v(0; s) = s, \ \text{and } v'(0; s) = 0,
\end{cases} \tag{2.3}
\]
where $v'$ and $v''$ stand for the first and second derivatives of $v$ with respect to $r$. Clearly, $u(r; s) = v(r; s) + 2N \log r$ is a radial solution of (2.1). Set $\varphi(r; s) = \frac{\partial v(r; s)}{\partial s}$. Then $\varphi$ satisfies the linearized equation
\[
\begin{cases}
\varphi'' + \frac{1}{r} \varphi' + f'(u(r))\, \varphi = 0, \\
\varphi(0; s) = 1 \ \text{and } \varphi'(0; s) = 0,
\end{cases} \tag{2.4}
\]
where $f(t) = e^t (1 - e^t)$, so that $f'(t) = e^t (1 - 2e^t)$. Clearly, $u(r)$ does not have a local maximum at some $r_0$ with $u(r_0) \ge 0$, unless $u(r) \equiv 0$ for all $r \in [0, \infty)$. This is a simple consequence of the maximum principle. Thus, if $u(r) \not\equiv 0$ and $u(r_0) \ge 0$ for some $r_0 > 0$, then $u(r)$ is strictly increasing for $r \ge r_0$, and eventually, $u(r)$ must blow up at a finite $r$. Therefore, we conclude that if $u(r)$ exists for all $r > 0$, then either $u(r) \equiv 0$ or $u(r) < 0$ for all $r > 0$. A solution $u(r)$ is called topological if $\lim_{r\to+\infty} u(r) = 0$. Otherwise, $u(r)$ is called non-topological. From Eq. (2.3), it is easy to prove that the positive function $e^u (1 - e^u)$ is in $L^1(\mathbb{R}^2)$ for both topological and non-topological solutions. In fact, we have
\[
\int_0^\infty e^u (1 - e^u)\, r\, dr = 2N \tag{2.5}
\]
if $u$ is a topological solution and
\[
\int_0^\infty e^u (1 - e^u)\, r\, dr = \beta > 4N + 4 \tag{2.6}
\]
if $u$ is a non-topological solution. The inequality (2.6) was proved in [14], but it can be derived easily from the Pohozaev identity. The uniqueness of topological radial solutions has been proved in [7]. The main purpose of this section is to prove the uniqueness for non-topological solutions.

Theorem 2.1. For any $\beta > 4N + 4$, there exists a unique radial solution $u$ of (2.1) satisfying
\[
\int_0^\infty e^u (1 - e^u)\, r\, dr = \beta. \tag{2.7}
\]
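The shooting map $s \mapsto \beta(s)$ behind Theorem 2.1 can be explored numerically. The Python sketch below is a rough illustration under our own assumptions, not the method of the paper: it integrates (2.3) with a classical fourth-order Runge–Kutta scheme in the variables $v$ and $p = r v'$, starts from a leading-order series at a small radius `r0`, and uses the relation $r v'(r) = -\int_0^r e^u(1-e^u)\,t\,dt$ to read off the flux. The step size, the cutoff `r_max`, and the function name `shoot` are hypothetical choices. As a consistency check it also tracks the Pohozaev-type identity $\tfrac12 (r u')^2 + r^2 G(u) - 2\int_0^r G(u)\,t\,dt = 2N^2$ with $G(u) = e^u - \tfrac12 e^{2u}$. The sketch does not guard against blow-up, which can occur for $N > 0$ and unsuitable $s$.

```python
import math


def shoot(s, N=0.0, r_max=200.0, h=0.01, r0=1e-3):
    """RK4 integration of the initial value problem (2.3).

    State: v(r), p(r) = r v'(r), Q(r) = int_0^r G(u) t dt, where
    u = v + 2N log r and G(u) = e^u - e^{2u}/2 is a primitive of e^u (1 - e^u).
    Returns (flux, residual): the flux int_0^r e^u (1 - e^u) t dt = -p at the
    final radius, and the residual of the Pohozaev-type identity there.
    """
    def rhs(r, y):
        v, p, _ = y
        eu = r ** (2 * N) * math.exp(v)          # e^{u(r)}
        return (p / r, -r * eu * (1.0 - eu), r * (eu - 0.5 * eu * eu))

    # Leading-order series at small r0; v'(0) = 0 makes the start regular.
    eu0 = r0 ** (2 * N) * math.exp(s)
    g0 = eu0 * (1.0 - eu0)
    y = (s - g0 * r0 ** 2 / (2 * N + 2) ** 2,
         -g0 * r0 ** 2 / (2 * N + 2),
         (eu0 - 0.5 * eu0 ** 2) * r0 ** 2 / (2 * N + 2))
    r = r0
    for _ in range(int(round((r_max - r0) / h))):
        k1 = rhs(r, y)
        k2 = rhs(r + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(r + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(r + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        r += h
    v, p, Q = y
    eu = r ** (2 * N) * math.exp(v)
    ru_prime = 2 * N + p                          # r u'(r)
    residual = 0.5 * ru_prime ** 2 + r * r * (eu - 0.5 * eu * eu) - 2 * Q - 2 * N * N
    return -p, residual


flux, residual = shoot(-1.0)   # N = 0, s = -1: a non-topological solution
```

For $N = 0$ and $s = -1$ the returned flux approximates $\beta(s)$ and should exceed $4N + 4 = 4$, in line with (2.6). Scanning over $s$ then gives a numerical picture of the map whose monotonicity properties are studied in the rest of this section.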
The following lemma is the crucial step for our proof of Theorem 2.1.

Lemma 2.2. Suppose u(r; s) = 2N log r + v(r; s) is a non-topological solution of (2.1), where v is the solution of (2.3). Then ϕ(r; s) satisfies either
(a) ϕ(r; s) changes sign only once and $\lim_{r\to+\infty}\varphi(r;s) = -\infty$, or
(b) ϕ(r; s) changes sign exactly twice and $\lim_{r\to+\infty}\varphi(r;s) = +\infty$.

The proof of Lemma 2.2 will be given in Sect. 3, right after the proof of Theorem 2.1. Before we go into the details of the proof of Theorem 2.1, we state some results concerning the asymptotic behavior of ϕ(r; s).

Lemma 2.3. Suppose u(r; s) is a non-topological solution. Then |ϕ(r; s)| ≤ c log r for large r, where c is a constant depending on s only. Furthermore, $r\varphi'(r;s) \to 0$ as r → +∞ if and only if ϕ(r; s) is a bounded function for r ∈ [0, ∞).

Proof. Since u(r; s) is non-topological, there exists $r_s$ such that for $r > r_s$,

$$e^{u(r;s)} \le \tfrac12 \quad\text{and}\quad e^{u(r;s)} \le O(r^{-4}), \tag{2.8}$$

where (2.8) follows from β > 4N + 4. By (2.8) and Sturm's comparison theorem, ϕ(r; s) cannot oscillate infinitely often. Without loss of generality, we may assume ϕ(r; s) > 0 for $r \ge r_s$. Hence

$$r\varphi'(r) = r_s\varphi'(r_s) - \int_{r_s}^{r} e^{u}(1-2e^{u})\,\varphi(t)\,t\,dt \le r_s\varphi'(r_s)$$
for all $r > r_s$, that is, ϕ(r) ≤ c log r for large r. Thus, the first part of Lemma 2.3 is proved. If ϕ(r) is bounded for r ∈ [0, ∞), then $e^{u}(1 - 2e^{u})\varphi = O(r^{-4})$ for large r. Thus

$$\lim_{r\to+\infty} r\varphi'(r) = -\int_0^\infty e^{u}(1-2e^{u})\,\varphi(t)\,t\,dt$$
H. Chan, C.-C. Fu, C.-S. Lin
exists and the limit must be zero because ϕ is uniformly bounded. Conversely, if $\lim_{r\to+\infty} r\varphi'(r) = 0$, then

$$r\varphi'(r) = -\int_0^{r} e^{u}(1-2e^{u})\,\varphi(t)\,t\,dt = \int_r^{\infty} e^{u}(1-2e^{u})\,\varphi(t)\,t\,dt \le c\,r^{-3}$$

for large r by (2.8), where ϕ(r) is assumed to be positive for large r. Therefore, boundedness of ϕ follows readily.

Set

$$\beta(s) = \int_0^\infty e^{u(r;s)}\big(1-e^{u(r;s)}\big)\,r\,dr$$
for a non-topological solution u(r; s).

Lemma 2.4. β(s) is $C^1$ in s and its derivative is

$$\beta'(s) = \int_0^\infty e^{u}(1-2e^{u})\,\varphi(r;s)\,r\,dr = -\lim_{r\to+\infty} r\varphi'(r;s). \tag{2.9}$$
Proof. Since u(r; s) is a solution of (2.1) in $\mathbb{R}^2$, u(r; s) ≤ 0 by the maximum principle. Hence f(u(r; s)) > 0 for all r > 0. For any s, we let $r_s$ be large such that

$$r_s u'(r_s, s) = 2N - \int_0^{r_s} e^{u(t;s)}\big(1-e^{u(t;s)}\big)\,t\,dt < -4.$$

Choose |s' − s| ≤ δ for some small δ > 0 such that $r_s u'(r_s, s') \le -4$.

Thus, $r u'(r; s') \le -4$ for all $r \ge r_s$ and |s − s'| ≤ δ. Thus $e^{u(r;s')} \le r^{-4}$ for $r \ge r_s$. Hence, the differentiability of β at s and (2.9) follow from the decay estimate and Lemma 2.3.

Now we are in the position to prove Theorem 2.1.

Proof of Theorem 2.1. Let $I_N$ = {s | u(r; s) is a non-topological solution of (2.1)}. First, we claim

Step 1. $I_N$ is an open set. If not, then there exists a sequence $s_j \to s \in I_N$ with $s_j \notin I_N$. Two cases are discussed separately.

Case 1. If $u_j(r) \equiv u(r; s_j)$ is a topological solution, then by (2.5), we have for all r > 0,

$$r u_j'(r) = 2N - \int_0^r e^{u_j}(1-e^{u_j})\,t\,dt = \int_r^\infty e^{u_j}(1-e^{u_j})\,t\,dt > 0.$$
Hence, $u_j(r)$ is strictly increasing.

Case 2. Assume that $u_j(r) \equiv u(r; s_j)$ blows up at some $r_j$. Clearly, we have $r_j \to +\infty$ as j → +∞. We want to prove that $u_j(r)$ is strictly increasing for all $r \in [0, r_j)$. If not, then $u_j(r)$ has a local maximum and a local minimum at $\tau_j < \tau_j^*$ respectively. By the maximum principle, we have $u_j(\tau_j) < 0 < u_j(\tau_j^*)$, which, of course, yields a contradiction.

In any case, $u'(r; s_j) \ge 0$ for $r \in [0, r_j)$ with $r_j \to +\infty$, and it implies that u(r; s) is increasing in r. But this contradicts $\lim_{r\to+\infty} u(r;s) = -\infty$. Hence the openness of $I_N$ is proved.

Step 2. There exists $s_0 \in \mathbb{R}$ such that $s \notin I_N$ for $s > s_0$. For s large, set

$$\hat v_s(r) = v\big(e^{-\frac{s}{2N+1}}\,r;\,s\big) - s.$$

Then $\hat v_s(r)$ satisfies

$$\hat v_s''(r) + \frac1r \hat v_s'(r) + e^{-\frac{s}{2N+1}}\,r^{2N}e^{\hat v_s(r)} - r^{4N}e^{2\hat v_s(r)} = 0, \qquad \hat v_s(0) = 0,\ \ \hat v_s'(0) = 0.$$
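The rescaled equation can be checked by direct substitution. The following is a sketch of that computation, assuming (as the limiting equation (2.10) suggests) that the second nonlinear term carries $e^{2\hat v_s}$ and the first carries the factor $r^{2N}$; the scaling factor $\lambda$ is a name introduced here only for illustration.

```latex
% Sketch: substitute \hat v_s(r) = v(\lambda r; s) - s with \lambda = e^{-s/(2N+1)}.
% (\lambda and \rho are notation introduced for this check, not the paper's.)
\begin{align*}
\hat v_s'' + \tfrac1r \hat v_s'
  &= \lambda^2\Big(v'' + \tfrac{1}{\rho}v'\Big)(\rho)\Big|_{\rho=\lambda r} \\
  &= -\lambda^{2}(\lambda r)^{2N}e^{v} + \lambda^{2}(\lambda r)^{4N}e^{2v} \\
  &= -\lambda^{2N+2}e^{s}\,r^{2N}e^{\hat v_s} + \lambda^{4N+2}e^{2s}\,r^{4N}e^{2\hat v_s}.
\end{align*}
% With \lambda = e^{-s/(2N+1)}:
%   \lambda^{4N+2} e^{2s} = 1,   \lambda^{2N+2} e^{s} = e^{-s/(2N+1)},
% so the first nonlinear term disappears as s \to +\infty and only
% the term r^{4N} e^{2\hat v} survives in the limit.
```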
Now suppose there exists a sequence $s_j \in I_N$ with $s_j \to +\infty$. Set $\hat v_j(r) = \hat v_{s_j}(r)$. Then by passing to a subsequence, we have $\hat v_j(r) \to \hat v(r)$, which satisfies

$$\Delta\hat v - r^{4N}e^{2\hat v(r)} = 0 \ \text{ for } r > 0, \qquad \hat v(0) = 0 \ \text{ and } \ \hat v'(0) = 0. \tag{2.10}$$

Since v(r; s) is decreasing in r, $\hat v(r)$ is nonincreasing in r. But any solution of (2.10) must be increasing in r. (Actually, $\hat v(r)$ must blow up at finite r.) Hence, it yields a contradiction and then Step 2 is proved.

Step 3. $I_N = (-\infty, s_0)$. Suppose not. Then there is a bounded connected component $J_N$ of $I_N$. Clearly, for any $s \in \partial J_N$, u(r; s) is a topological solution. If we apply the uniqueness of topological solutions in [7], then we obtain a contradiction. However, in the following, we will give a proof of Step 3, because the proof is simple and yields a result which is required in our proof. By Lemma 2.2, Lemma 2.3 and Lemma 2.4, we have $\beta'(s) \ne 0$ for $s \in J_N$. Thus, β(s) is monotone in $J_N$. Since $J_N$ is bounded, there exists a sequence of solutions $u_j$ of (1.1) such that $u_j(r)$ converges to a topological solution u(r) ≡ u(r; s) with $s \in \partial J_N$ and

$$\beta_j = \int_0^\infty e^{u_j}(1-e^{u_j})\,r\,dr \le c < +\infty \tag{2.11}$$
for some constant c > 0. To obtain a contradiction, we use the Pohozaev identity. Multiplying Eq. (2.3) by $r v_j'(r)$, we have

$$\frac12 r^2 (v_j')^2(r) + r^{2N+2}e^{v_j(r)} - \frac12 r^{4N+2}e^{2v_j(r)} - (2N+2)\int_0^r t^{2N}e^{v_j}\big(1-t^{2N}e^{v_j}\big)\,t\,dt = \int_0^r e^{2v_j}\,t^{4N+1}\,dt.$$
For each j, $e^{v_j(r)} = O(r^{-\beta_j})$ for some $\beta_j > 4N + 4$. Hence, both $r^{2N+2}e^{v_j(r)}$ and $r^{4N+2}e^{2v_j(r)}$ tend to zero as r → +∞. Therefore, by passing to the limit r → +∞, the Pohozaev identity implies that for each j,

$$\frac12\beta_j^2 - (2N+2)\beta_j = \int_0^\infty e^{2v_j}\,t^{4N+1}\,dt = \int_0^\infty e^{2u_j}\,t\,dt, \tag{2.12}$$

where

$$\lim_{r\to+\infty} r v_j'(r) = \lim_{r\to+\infty} r u_j'(r) - 2N = -\int_0^\infty e^{u_j}(1-e^{u_j})\,r\,dr = -\beta_j.$$
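As a sanity check, the Pohozaev identity can be verified by differentiating both sides. The sketch below uses the shorthand $g$, which is introduced here and is not notation from the paper, and drops the index j. It also records why the limiting identity forces β > 4N + 4, in line with the remark after (2.6).

```latex
% Shorthand introduced for this check: g(r) = r^{2N} e^{v} (1 - r^{2N} e^{v}),
% so that (2.3) reads (r v')' = -r g.
\begin{align*}
\frac{d}{dr}\Big[\tfrac12 r^2 (v')^2\Big] &= r v'\,(r v')' = -r^2 v' g, \\
\frac{d}{dr}\Big[r^{2N+2}e^{v} - \tfrac12 r^{4N+2}e^{2v}\Big]
  &= (2N+2)\,r^{2N+1}e^{v} - (2N+1)\,r^{4N+1}e^{2v} + r^2 v' g, \\
\frac{d}{dr}\Big[(2N+2)\int_0^r g\,t\,dt\Big]
  &= (2N+2)\,r^{2N+1}e^{v} - (2N+2)\,r^{4N+1}e^{2v}.
\end{align*}
% Adding the first two lines and subtracting the third leaves r^{4N+1} e^{2v},
% which is the derivative of the right-hand side of the identity.
% Letting r -> +infinity with r v' -> -\beta and e^{2u} t = t^{4N+1} e^{2v}:
\[
\tfrac12\beta^2 - (2N+2)\beta = \int_0^\infty e^{2u}\,t\,dt > 0
\quad\Longrightarrow\quad \beta > 4N + 4 .
\]
```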
Since $\beta_j$ is bounded by (2.11), (2.12) yields $\int_0^\infty e^{2u_j(t)}\,t\,dt \le c$. But, since $u_j(t) \to u(t)$ uniformly in any finite interval of t, we have for any R,

$$\int_0^R e^{2u(t)}\,t\,dt = \lim_{j\to+\infty}\int_0^R e^{2u_j(t)}\,t\,dt \le c.$$

It yields $e^{2u(t)} \in L^1(\mathbb{R}^2)$, a contradiction to the fact that $e^{2u(t)} \to 1$ as t → +∞. Therefore, Step 3 is proved. By Step 3, $I_N = (-\infty, s_0)$. Hence $u(r; s_0)$ must be a topological solution. Since $u(r; s_0)$ is a topological solution, by (2.12) where $u_j \in I_N$ and $u_j \to u$, we have $\lim_{s\uparrow s_0}\beta(s) = +\infty$. By Lemma 2.2, β(s) is monotone for $s \in I_N$. Since β(s) → +∞ as $s \uparrow s_0$, we conclude that β(s) is an increasing function, and by the Pohozaev identity (2.12), $\lim_{s\to-\infty}\beta(s) = 4N + 4$. Therefore Theorem 2.1 is completely proved.

Remark 2.5. We have proved that β'(s) > 0 for $s \in I_N$. By (2.9), we have $\lim_{r\to+\infty} r\varphi'(r;s) < 0$ for any $s \in I_N$. From here, we conclude that case (b) of Lemma 2.2 never occurs.

Lemma 2.6. Suppose u(r) is a non-topological solution of (1.1) with N = 1 and

$$\beta = \int_0^\infty e^{u(r)}\big(1-e^{u(r)}\big)\,r\,dr > 8.$$
Then

$$u(r) = -(\beta-2)\log r + I + O(r^{4-\beta}) \tag{2.13}$$

for large r and for some constant I.

Proof. Since u(r) satisfies

$$r u'(r) = 2 - \int_0^r e^{u(s)}\big(1-e^{u(s)}\big)\,s\,ds,$$

we have $\lim_{r\to+\infty} r u'(r) = 2 - \beta$. Let

$$w(r) = u\Big(\frac1r\Big) + (\beta-2)\log\frac1r.$$
Then w(r) satisfies

$$\Delta w(r) + r^{\beta-6}e^{w(r)}\big(1 - r^{\beta-2}e^{w(r)}\big) = 0 \quad\text{in } B_1\setminus\{0\}. \tag{2.14}$$

Since w(r) = o(log(1/r)) for r near 0, it is easy to see that w(r) satisfies Eq. (2.14) in $B_1$ in the distribution sense. Since the nonlinear term $r^{\beta-6}e^{w(r)}(1 - r^{\beta-2}e^{w(r)})$ is Hölder continuous at r = 0, by the regularity theorem, w(r) is at least $C^2$ for r near 0. Thus, it is easy to see that w(r) can be written as $w(r) = w(0) + c\,r^{\beta-4} + O(r^{\beta-2})$ for r near 0. Readily, (2.13) follows from the expression above.
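The passage from u to w is a Kelvin-type inversion. A short sketch of the computation behind (2.14) for N = 1 follows; it assumes only the radial equation $u'' + u'/\rho + e^{u}(1-e^{u}) = 0$ for $\rho > 0$ and the standard planar identity for inversions. The variable $\rho$ is introduced here for convenience.

```latex
% \rho = 1/r is notation introduced for this sketch.
% For radial functions in the plane, with h(r) = u(1/r):
%   h'' + h'/r = r^{-4} ( u'' + u'/\rho )(1/r),
% and \log(1/r) is harmonic for r > 0.
\begin{align*}
w(r) &= u(1/r) + (\beta-2)\log\tfrac1r
  \quad\Longrightarrow\quad e^{u(1/r)} = r^{\beta-2}e^{w(r)}, \\
\Delta w(r) &= r^{-4}\Big(u'' + \tfrac{1}{\rho}u'\Big)(1/r)
  = -r^{-4}\,e^{u(1/r)}\big(1 - e^{u(1/r)}\big) \\
  &= -r^{\beta-6}e^{w(r)}\big(1 - r^{\beta-2}e^{w(r)}\big).
\end{align*}
% Since \beta > 8, the exponent \beta - 6 is positive, so the nonlinear
% term vanishes at r = 0 and a leading correction of order r^{\beta-4}
% in w is consistent with the expansion quoted in the text.
```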
For the rest of this section, we want to discuss the linearized equation of (2.1) at the non-topological radial solution u(r). Later, we will see that it plays an important role in gluing solutions together. Let $\varphi_k(r)$ be the unique solution of the linearized equation of (2.1),

$$\Delta\varphi_k(r) + f(u(r))\,\varphi_k - \frac{k^2}{r^2}\varphi_k(r) = 0 \ \text{ for } r > 0, \qquad \varphi_k(r) = r^{k} + O(r^{k+1}) \ \text{ for } r \text{ near } 0. \tag{2.15}$$

For any constant $\theta_0$, obviously $v(x) = \varphi_k(r)\cos k(\theta - \theta_0)$ is a solution of

$$\Delta v + f(u)\,v = 0 \quad\text{in } \mathbb{R}^2, \tag{2.16}$$
if $\varphi_k(r)$ satisfies (2.15). By Remark 2.5, we already know that $\varphi_0(r)$, which is the unique solution of (2.4), changes sign once and $\varphi_0(r) = -c\log r + O(1)$ at ∞ for some positive constant c. In general, we have

Theorem 2.7. Suppose that u(r) is a non-topological radial solution of (2.1) with 0 < N ≤ 1, and $\varphi_k(r)$ is the unique solution of (2.15). Then
(i) For k = 0, 1, $\varphi_k(r)$ must change its sign once for r ∈ [0, ∞). For r large, $\varphi_0(r) = -c_0\log r + O(1)$ and $\varphi_1(r) = -c_1(1 + o(1))\,r$ for some positive constants $c_0$ and $c_1$.
(ii) For k ≥ 2, $\varphi_k(r)$ is positive for all r > 0 and $\varphi_k(r) = c_k r^{k}(1 + o(1))$ for r large and for some positive constants $c_k$.

Proof. We have already proved the part of $\varphi_0$ in (i). To prove the linear growth of $\varphi_1$, we note that u'(r) satisfies the same equation as $\varphi_1$ and $u'(r) = \frac{2N}{r} + O(r^{3})$ for r near 0. Thus, u'(r) and $\varphi_1(r)$ form a set of fundamental solutions of (2.15) for k = 1. Obviously, u'(r) has one single zero at η, u'(r) > 0 if r < η and u'(r) < 0 if r > η. Meanwhile, $u'(r) = -\frac{\beta - 2N}{r}\,(1 + O(r^{3-\beta}))$. By the comparison with u'(r), $\varphi_1(r)$ must have one unique zero at $r_1$ such that $r_1 > \eta$, and also $\varphi_1(r)$ must grow linearly, which is due to the term $-\frac{1}{r^2}$ in Eq. (2.15). Otherwise, if $\varphi_1(r)$ is bounded at ∞, then $\varphi_1(r) \equiv c\,u'(r)$ for some constant c, which is impossible because u'(r) has a singularity at 0. Hence, part (i) of Theorem 2.7 is proved. To prove part (ii), we first claim that if $\varphi_2(r) > 0$ for r > 0, then

$$\lim_{r\to+\infty}\varphi_2(r) = +\infty. \tag{2.17}$$
To see it, we will use a useful trick from [11]. Set s = r N+1 , u(s) ¯ = u(r) + 2N log r, and ϕ¯2 (s) = ϕ2 (r).
(2.18)
By a straightforward computation, we have "s ϕ¯2 (s) =
1 r −2N "r ϕ2 (r), (N + 1)2
(2.19)
where "s and"r stand for the Laplacian with respect to s and r respectively. Thus u(s) ¯ satisfies 2N
¯ ¯ "s u(s) ¯ + (N + 1)−2 eu(s) (1 − s N +1 eu(s) ) = 0, and
(2.20)
ϕ¯2 (s) satisfies 2N
¯ ¯ "s ϕ¯2 (s) + (N + 1)−2 eu(s) (1 − 2s N +1 eu(s) )ϕ¯2 (s) −
4 s −2 ϕ¯2 (s) = 0. (N + 1)2 (2.21)
By (2.18), u(s) ¯ is smooth at s = 0. Thus, u¯ (s) < 0 for all s > 0, and u¯ (s) satisfies "s u¯ (s) + −
2N 1 ¯ ¯ eu(s) (1 − 2s n+1 eu(s) )u¯ (s) − s −2 u¯ (s) (N + 1)2
N −1 2N ¯ s N +1 e2u(s) = 0. 3 (N + 1)
(2.22)
By the assumption, ϕ¯2 (s) > 0 for s > 0 and ϕ2 (s) is uniformly bounded in [0, ∞). From here, we can deduce ϕ2 (s) = o( 1s ). By the comparison with u¯ (s), it yields, ∞ ∞ N −1 2N 4 ¯ N +1 e 2u(s) 0< s ϕ ¯ (s)sds + − 1 ϕ¯2 (s)(−u¯ (s))ds 2 (N + 1)3 0 (N + 1)2 0 = lim (u¯ (s)ϕ¯2 (s)s − u(s)ϕ ¯ (s)s = 0, 2 s→+∞
a contradiction provided that N ≤ 1. Thus, the claim (2.17) is proved. To prove that ϕ2 (r) > 0 for all r > 0 and 0 < N ≤ 1, we consider a family of solution u(r; N ) of u (r) + 1r u (r) + eu(r) (1 − eu(r) ) = 0 in R2 , u(r) = 2N log r + s0 + O(r 2 ) near 0. The initial value s0 can be chosen so that u(r; N ) is a non-topological solution for all 0 ≤ N ≤ 1. For N = 0, u (r; 0) < 0 for all r > 0 and satisfies 1 u = 0. r2 By the comparison with u (r; 0), the function ϕ2 (r; 0), which is associated with u(r; 0), must be positive for all r > 0. Then by the claim (2.17) and the continuity of ϕ2 (r; N ) on N, we have ϕ2 (r; N ) is positive for all r and limr→+∞ ϕ2 (r; N ) = +∞. For each 0 < N ≤ 1, once ϕ2 (r) is positive for some non-topological solution, we can use Theorem 2.1 to deform continuously this particular solution to any non-topological solution. In this way, by the claim (2.17), we prove that ϕ2 associated with any non-topological solution of (1.1) with 0 < N ≤ 1, is positive for all r > 0. Hence Theorem 2.7 is proved. "u + eu (1 − 2eu )u −
Let L = Δ + f(u(x)), where u(x) is a non-topological radial solution of (2.1) with N = 1. Theorem 2.7 implies the following consequence.

Corollary 2.8. Let w(x) satisfy Lw = 0 in $\mathbb{R}^2$. Suppose that either w(x) is uniformly bounded in $\mathbb{R}^2$ or w(0) = 0 and w(x) > 0 for |x| ≥ 1. Then w(x) ≡ 0.

Proof. We will give a proof for the second statement. The proof of the first part is easy, and is skipped. Note that

$$|f(u(x))| \le c\,|x|^{2-\beta}$$

for |x| ≥ 1, where $\beta = \int_0^\infty e^{u}(1-e^{u})\,r\,dr > 8$. By the Harnack inequality, there exists a constant c such that

$$\sup_{\frac r2 \le |x| \le 2r} |w(x)| \le c \inf_{\frac r2 \le |x| \le 2r} |w(x)|$$

for r ≥ 4. Hence, $|w(x)| \le c|x|^{\tau}$ for some constant τ > 0. Let $w_k(r) = \int_0^{2\pi} w(re^{i\theta})\cos k\theta\,d\theta$ and $\bar w_k(r) = \int_0^{2\pi} w(re^{i\theta})\sin k\theta\,d\theta$. Obviously, $w_k$ and $\bar w_k$ satisfy (2.15), and

$$|w_k(r)| + |\bar w_k(r)| \le \int_0^{2\pi} |w|(re^{i\theta})\,d\theta \le c\,r^{\tau}.$$

By (ii) of Theorem 2.7, we have $|w_k| + |\bar w_k| = 0$ if k > τ. Therefore $w(x) = \sum_{k=0}^{[\tau]+1} c_k w_k(r)\cos k(\theta - \theta_k)$, for some constants $\theta_k \in \mathbb{R}$ and some $w_k$ satisfying (2.15). But w(x) > 0 for |x| ≥ 1, which implies $c_k = 0$ for k ≥ 1. The constant $c_0$ vanishes also because w(0) = 0. This proves the corollary.

3. Proof of Lemma 2.2

Now we return to the proof of Lemma 2.2.

Proof of Lemma 2.2. We first consider N = 0. This is an easy case because u(r) is decreasing in r ∈ [0, ∞). Let $f(u) = e^{u}(1 - e^{u})$ and $F(u) = e^{u}(1 - \frac12 e^{u})$. Set w = r u'(r) to be a comparison function with ϕ(r). By direct computation, w satisfies

$$w'' + \frac{w'}{r} + f'(u)\,w = -2f(u(r)). \tag{3.1}$$
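Equation (3.1) follows from a short computation, sketched here under the assumption that u solves the radial equation $u'' + u'/r + f(u) = 0$ for r > 0, with $f(u) = e^{u}(1 - e^{u})$ as defined at the start of this section.

```latex
\begin{align*}
w &= r u', \\
w' &= u' + r u'' = -r f(u), \\
w'' &= -f(u) - r f'(u)\,u' = -f(u) - f'(u)\,w, \\
w'' + \tfrac1r w' + f'(u)\,w &= -f(u) - f'(u)\,w - f(u) + f'(u)\,w = -2f(u).
\end{align*}
% The second line uses r u'' = -u' - r f(u), i.e. the radial equation multiplied by r.
```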
Since w(r) < 0 and f (u(r)) > 0 for all r ∈ [0, ∞), by the Sturm comparison theorem, ϕ(r) must change its sign. Let r1 be the first zero of ϕ. We claim f (u(r1 )) > 0. Because if f (u(r1 )) ≤ 0, then f (u(r)) = eu(r) (1 − u(r) 2e ) < 0 for r ∈ [0, r1 ), and by integrating (2.4), we obtain r1 rf (u(r))ϕ(r)dr > 0, 0 ≥ r1 ϕ (r1 ) = − 0
a contradiction. For any c > 0, we set wc (r) = ru (r) + c.
(3.2)
Then wc (r) satisfies 1 wc + wc (r) + f (u(r))wc (r) = c (r), r
(3.3)
where c (r) = c f (u(r)) − 2f (u(r))
= eu(r) [c − 2 + 2(1 − c)eu ].
(3.4)
By letting c = 2, 2 (r) = −2e2u(r) < 0 for all r ∈ [0, ∞). Hence, w2 (r) is a supersolution of (2.4) with w2 (0) = 2 > 0. By the Sturm comparison theorem again (compared with ϕ(r)), w2 (r) must have a zero before r1 . Since ru (r) is decreasing, we have r1 u (r1 ) < −2.
(3.5)
Choose c1 to satisfy wc1 (r1 ) = r1 u (r1 ) + c1 = 0. Thus, c1 > 2. Also, c1 (r1 ) > 0. Otherwise, since (c1 − 2) + 2(1 − c1 )eu is increasing in r, we have for r ∈ [0, r1 ], (c1 − 2) + 2(1 − c1 )eu(r) < (c1 − 2) + 2(1 − c1 )eu(r1 ) ≤ 0, which implies c1 (r) ≤ 0 for all r ∈ [0, r1 ). Thus, by Eqs. (2.4) and (3.3), r1 r1 0 = r(wc1 (r)ϕ(r) − ϕ (r)wc1 (r)) |0 = c1 (t)ϕ(t)tdt < 0 0
yields a contradiction. Hence we proved c1 (r1 ) > 0. And consequently, c1 (r) > 0 for r ∈ [r1 , ∞). Therefore, −wc1 (r) is a positive supersolution to (2.4) for r ∈ (r1 , ∞). By Sturm’s comparison theorem, ϕ(r) can not have a zero in (r1 , ∞). If ϕ(r) remains bounded for r ∈ [r1 , ∞), then by Lemma 2.3, we obtain 0 = lim r(wc1 (r)ϕc (r) − ϕ(r)wc 1 (r)) r→+∞ ∞ = c1 (t)ϕ(t)tdt < 0, r1
a contradiction. Therefore, we have proved that ϕ changes its sign once and limr→+∞ ϕ(r) = −∞ for the case N = 0. For N > 0, u(r) is no longer decreasing in r. But, there exists η ∈ (0, ∞) such that u (r) > 0 for r ∈ (0, η) and u (r) < 0 for r > η. Clearly, eu(r) = O(r 2N−β ) = o(r −2N −4 ) for r large by (2.6). We prove Lemma 2.2 by several steps. Step 1. ϕ(r) must change its sign. Suppose ϕ(r) > 0 for all r ≥ 0. Then by Lemma 2.3, limr→+∞ rϕ (r) ≥ 0. Set w(r) = ru (r). Then by (3.1), ∞ 0 > −2 f (u(t))ϕ(t)tdt 0
= lim r(w (r)ϕ(r) − ϕ (r)w(r)) r→+∞
= lim (rϕ (r))(β − 2N ) ≥ 0, r→+∞
where |rw (r)ϕ(r)| = o(r −2N−3 log r) → 0 as r → +∞. This contradiction proves Step 1. Let r1 be the first zero of ϕ. Step 2. η < r1 . We use ru (r) as the comparison function for the linearized equation. Since f (u(r)) > 0 for all r > 0, Step 2 follows from Eq. (3.1) and the Sturm Comparison Theorem. Step 3. ϕ has two zeros at most for r ∈ (0, ∞). As before, set wc (r) = rw (r) + c. By (3.2) and (3.4), we have r1 u (r1 ) + 2 < 0. The argument is the same as the case N = 0. Let r2 be the second zero of ϕ. Choose c2 such that wc2 (r2 ) = 0. Clearly, c2 > 2. We claim c2 (r2 ) > 0. If not, then c2 (r) = eu(r) [(c − 2) + 2(1 − c)eu(r) ] < 0 for r1 < r < r2 , because eu(r) is decreasing for r ≥ r1 by Step 2. Note that both wc2 (r) and ϕ(r) has the zero at r2 . Hence, by (3.3) 0=
r2
0
tc2 (t)ϕ(t)dt >
r1 0
tc2 (t)ϕ(t)dt
= r(wc 2 (r)ϕ(r) − wc2 (r)ϕ (r)) |r01
= −r1 wc2 (r1 )ϕ (r1 ) > 0,
which yields a contradiction. Hence the claim is established. Now suppose that ϕ has more than two zeros. Set r3 to be the third zero of ϕ. By the claim, c2 (r) > 0 for r ≥ r2 because eu(r) is decreasing for r ≥ r1 . By integrating (2.4) and (3.3), 0<
r3
r2
tc2 (t)ϕ(t)dt
= r(wc 2 (r)ϕ(r) − wc2 (r)ϕ (r)) |rr32 = −wc2 (r3 )ϕ (r2 )r3 < 0,
because ϕ (r3 ) < 0 and wc2 (r3 ) < 0. This proves Step 3. Step 4. If ϕ has two zeros, then limr→+∞ rϕ (r) = +∞. Now suppose ϕ has two zeros at r1 < r2 . By Step 3, ϕ(r) > 0 for r > r2 . If limr→+∞ rϕ (r) = 0, then by integrating Eq. (2.4) and (2.14), we have
∞
r2
tc2 (t)ϕ(t)dt = lim r(wc 2 ϕ(r) − wc2 (r)ϕ (r)) = 0. r→+∞
Since c2 (t) > 0 for t > r2 by Step 2, the identity above yields a contradiction. Thus, Step 4 is proved. Now the only case remaining to discuss is that ϕ has exactly one zero in (0, ∞). This is the most difficult part in the proof of Lemma 2.2. Step 5. If ϕ(r) has a unique zero r1 for r ∈ (0, ∞), then limr→+∞ ϕ(r) = −∞. Step 5 is proved by contradiction. Assume limr→+∞ ϕ (r)r = 0, two cases are discussed separately.
Case 1. c1 (r1 ) ≥ 0, where c1 = −r1 u (r1 ). Since c1 (r) = eu [(c−2)+2(1−c)eu(r) ] and eu(r) is decreasing for r ≥ η, by Step 2, we have c1 (r) ≥ c1 (r1 ) ≥ 0 for r ≥ r1 . By integrating Eq. (2.4) and (3.3), it yields ∞ 0> c1 (t)ϕ(t)dt = lim r{wc 1 (r)ϕ(r) − wc1 (r)ϕ (r)} = 0, r→+∞
r1
a contradiction. Case 2. c1 (r1 ) < 0. Since [c1 − 2 + 2(1 − c1 )eu ] = 2(1 − c1 )eu u < 0 in (0, η) and > 0 for r ∈ (η, ∞), there exists 0 < r0 < η such that > 0 if r < r0 , (3.6) c1 (r) < 0 if r1 > r > r0 . Recall that w(r) = ru and w(r) satisfies d [r(w ϕ − wϕ )] = −2f (u(r))ϕ(r) < 0 for 0 ≤ r ≤ r1 dr by (2.4) and (3.1). Hence [w (r)ϕ(r) − wϕ (r)] < 0 for 0 < r ≤ r1 and then decreasing in r for r ∈ [0, r1 ], ru (r0 ) > 0 ϕ
M= because r0 < η. Then by (3.7), we have Mϕ(r) − ru (r)
< 0 for r ∈ (0, r0 ), > 0 for r ∈ (r0 , r1 ).
Hence by (3.6), for any r > 0, r c1 (t)ϕ(t)dt ≤ M 0
In particular, we have for r = η, η c1 (t)ϕ(t)tdt ≤ M 0
≤ =
η 0
r02 r02
r 0
(3.7) ru ϕ
is
(3.8)
(3.9)
t 2 u (t)c1 (t)dt.
t 2 c1 (t)u (t)dt η 0
η
c1 (t)u (t)dt [(c1 − 2) + 2(1 − c1 )eu ]eu(t) u (t)dt
0
= r02 [c1 f (u(η)) − 2F (u(η))], where F (t) = et (1 − 21 et ). Here, we use the inequality (t 2 − r02 )c1 (t) < 0 for 0 < t < η.
(3.10)
We claim c1 f (u(η)) − 2F (u(η)) > 0.
(3.11)
Suppose (3.11) does not hold. Then by (3.10), η c1 (t)ϕ(t)tdt < 0. 0
But, by integrating (3.3), we have η c1 (t)ϕ(t)tdt = η[wc 1 (η)ϕ(η) − wc1 (η)ϕ (η)], 0
and d {r[wc 1 (r)ϕ(r) − wc1 (r)ϕ(r)]} = c1 (r)ϕ(r) < 0 dr by (3.6) for η < r ≤ r1 . Hence these two identities yield 0 = r1 [wc 1 (r1 )ϕ(r1 ) − wc1 (r1 )ϕ (r1 )] < η[wc 1 (η)ϕ(η) − wc1 (η)ϕ (η)] η c1 (t)ϕ(t)tdt < 0, = 0
a contradiction. d Note that c1 f (u)−2F = eu(r) (c1 −2+(1−c1 )eu(r) ) and dr [c1 −2+(1−c1 )eu(r) ] = (1 − c1 )eu(r) u (r) > 0 for r ≥ η. Thus, (3.11) implies c1 f (u(r1 )) − 2F (u(r1 )) > 0.
(3.12)
On the other hand, we can proceed the similar argument for r ∈ [r1 , ∞) to obtain the reverse inequality of (3.12), i.e., c1 f (u(r1 )) − 2F (u(r1 )) < 0. Thus, these two yield a contradiction. To see the reverse inequality, we know that there exists r0∗ > r1 such that < 0 if r1 < t < r0∗ c1 (t) (3.13) > 0 if t > r0∗ because (c1 − 2) + (1 − c1 )eu(t) is increasing for t > η and limt→+∞ [(c1 − 2) + (1 − c1 )eu(t) ] = c1 − 2 > 0. By (3.7), we also have r(w ϕ − wϕ ) is increasing for r ≥ r1 and then r(w ϕ(r) − wϕ (r)) < lim [r(w ϕ(r) − wϕ (r))] = 0. r→+∞
Therefore,
ru (r) ϕ(r)
is decreasing for r > r1 . Set M∗ =
r0∗ u (r0∗ ) > 0. ϕ(r0∗ )
Then, together with (3.13), we have c1 (t)(M∗ ϕ(t) − tu (t)) < 0 for t > r1 .
(3.14)
Thus, 0 = M∗ <
∞
r1 ∞
tc1 (t)ϕ(t)dt
t 2 u (t)c1 (t)dt ∞ u (t)c1 (t)dt < r0∗ 2 r1 ∞ u (t)c1 (t)dt = r0∗ 2 r1
r1
$= r_0^{*2}\,\big(c_1 f(u(r)) - 2F(u(r))\big)\big|_{r_1}^{\infty} = -r_0^{*2}\,[c_1 f(u(r_1)) - 2F(u(r_1))]$, which is the reverse inequality of (3.12). Therefore, the proof of Lemma 2.2 is completely finished.

4. Estimate of the Linearized Equations

In this section, we want to study the operator norm of the linearized equation of (2.1) at any non-topological radial solution u(r). First, we introduce function spaces $X_\alpha$ and $Y_\alpha$ for any $\alpha \in (0, \frac12)$. A function w(x) is said to be in $X_\alpha$ if the norm of w(x)

$$\|w\|^2_{X_\alpha} = \int_{\mathbb{R}^2} |\Delta w|^2\rho^2(x)\,dx + \int_{\mathbb{R}^2} |w|^2\,|\hat\rho(x)|^2\,dx < +\infty, \tag{4.1}$$

where

$$\rho(x) = (1+|x|)^{1+\frac{\alpha}{2}} \quad\text{and}\quad \hat\rho(x) = (1+|x|)^{-1}\big(\log(2+|x|)\big)^{-(1+\frac{\alpha}{2})}, \tag{4.2}$$

and a function $h(x) \in Y_\alpha$ if

$$\|h\|^2_{Y_\alpha} = \int_{\mathbb{R}^2} |h(x)|^2\,|\rho(x)|^2\,dx \tag{4.3}$$
is finite. Obviously, both $X_\alpha$ and $Y_\alpha$ are Hilbert spaces. Set L to be the linearized operator of (2.1) at u(r), that is

$$Lw(x) = \Delta w(x) + f(u(x))\,w(x). \tag{4.4}$$

Throughout the section, u(x) is always assumed to be a non-topological solution of (2.1) with N = 1. Note that f(u(x)) satisfies

$$|f(u(x))| \le c\,|x|^{2}(1+|x|)^{-\beta} \quad\text{for } x \in \mathbb{R}^2, \tag{4.5}$$

and

$$\beta = \int_0^\infty e^{u(r)}\big(1-e^{u(r)}\big)\,r\,dr > 8.$$

By (4.5), L is a bounded operator of $X_\alpha$ to $Y_\alpha$. Our main result of this section is the following.
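The boundedness claim can be made explicit. The estimate below is a sketch that uses only the weights in (4.2), the norms (4.1) and (4.3), and the decay (4.5); the constant $C_f$ is a name introduced here.

```latex
% C_f is notation introduced for this sketch.
\begin{align*}
\|Lw\|_{Y_\alpha}
  &\le \|(\Delta w)\,\rho\|_{L^2} + \|f(u)\,w\,\rho\|_{L^2}
   \le \|(\Delta w)\,\rho\|_{L^2} + C_f\,\|w\,\hat\rho\|_{L^2}, \\
C_f &= \sup_{x\in\mathbb{R}^2} |f(u(x))|\,\frac{\rho(x)}{\hat\rho(x)}
   = \sup_{x\in\mathbb{R}^2} |f(u(x))|\,(1+|x|)^{2+\frac{\alpha}{2}}
     \big(\log(2+|x|)\big)^{1+\frac{\alpha}{2}}.
\end{align*}
% By (4.5), |f(u(x))| <= c (1+|x|)^{2-\beta}, so C_f is finite as soon as
% \beta > 4 + \alpha/2, which holds for \beta > 8 and 0 < \alpha < 1/2.
```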
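Theorem 4.1. The linearized operator L is an isomorphism from $X_\alpha$ onto $Y_\alpha$. Let w(x) and h(x) be in $X_\alpha$ and $Y_\alpha$ respectively, and satisfy Lw(x) = h(x). Then
(i) w(x) is uniformly bounded in $\mathbb{R}^2$ and $\|w\|_{L^\infty(\mathbb{R}^2)} \le c\,\|h\|_{Y_\alpha}$,
(ii) $\|w\|_{X_\alpha} \le c\,\|h\|_{Y_\alpha}$.

Proof. By the Schwarz inequality, $h \in L^1(\mathbb{R}^2)$. Then we set

$$\hat w(x) = -\frac{1}{2\pi}\int_{\mathbb{R}^2} \log|x-y|\,\hat h(y)\,dy,$$

where $\hat h(x) = f(u(x))\,w(x) - h(x) \in L^1(\mathbb{R}^2)$ by (4.5).

Step 1.

$$\hat w(x) = -\frac{1}{2\pi}\Big(\int_{\mathbb{R}^2}\hat h(y)\,dy\Big)\log|x| + O(1) \tag{4.6}$$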
for |x| large. The identity (4.6) holds due to $\hat h \in Y_\alpha$ and its proof is elementary and straightforward. So, we skip the proof.

Step 2. $w(x) = \hat w(x)$ + a constant for $x \in \mathbb{R}^2$. Clearly, $w(x) - \hat w(x)$ is a harmonic function, and $|w(x) - \hat w(x)|(1+|x|)^{-(1+\tau)} \in L^2(\mathbb{R}^2)$ for any τ > 0. This is due to the fact that $|\hat w(x)|$ is bounded by log |x| at ∞ and $w(x) \in X_\alpha$. It then implies $w(x) - \hat w(x)$ must be a constant, because a non-constant harmonic function in $\mathbb{R}^2$ must grow at least linearly, which is excluded from $X_\alpha$. This proves Step 2.

Step 3. $\int_{\mathbb{R}^2}\hat h(y)\,dy = 0$. This is because if |w(x)| = c log |x| + O(1) at ∞ for some c ≠ 0, then $w(x) \notin X_\alpha$. Hence, Step 3 follows from (4.6).

Step 4. By previous steps, we have

$$w(x) = c + \frac{1}{2\pi}\int_{\mathbb{R}^2} \log\frac{|x|}{|x-y|}\,\hat h(y)\,dy \tag{4.7}$$
for x ∈ R2 and c is a constant. Since hˆ ∈ Yα , we see that 1 ˆ h(y)dy = c. lim w(x) = c + |x|→+∞ 2π R2 Hence 1 |x| ˆ |w(x) − w(∞)| ≤ |h(y)|dy log 2π R2 |x − y| |x| ˆ |x| ˆ 1 |h(y)|dy + log = log |x − y| |h(y)|dy 2π B(x, |x|2 ) |x − y| B(0,2|x|)\B(x, |x| 2 ) |x| ˆ + (4.8) log |x − y| |h(y)|dy . R2 \B(0,2|x|)
The computation of each integral of the right-hand side of (4.8) is elementary. For example, log |x| |h(y)|dy ˆ |x − y| B(x, |x| 2 ) 21 −(2+α) (1 + |y|) y ≤ | log |x|| B(x, |x| 2 )
+
B(x, |x| 2 )
2 21 log 1 (1 + |y|)−(2+α) ˆ Yα h |x − y| α
ˆ Yα ≤ c (log |x|)|x|− 2 h
(4.9)
for |x| large. We can see the other two terms can be bounded by the right-hand side of (4.9). Hence, α
ˆ Yα . |w(x) − w(∞)| ≤ c (1 + |x|)− 2 (log(2 + |x|))h
(4.10)
ˆ Yα . By (4.10), Step 5. |w(∞)| ≤ ch ˆ 2Y |w(x) − w(∞)|2 ρ(x) ˆ 2 dx ≤ c h (1 + |x|)−(2+α) dx α R2
R2
≤c
ˆ 2Y . h α
(4.11)
Hence |w(∞)| = c1 w(∞)ρ ˆ L2 ˆ L2 + c1 w(x)ρ ˆ L2 ≤ c1 (w(x) − w(∞))ρ 2 ˆ Y ≤ c h α
(4.12)
by (4.11).

Step 6. The kernel of L consists of the trivial solution only. By Step 5, any solution of Lw = 0 is uniformly bounded in $\mathbb{R}^2$. By Corollary 2.8, we see that w ≡ 0 in $\mathbb{R}^2$.

Step 7. There exists a constant c such that

$$\|w\|_{X_\alpha} \le c\,\|h\|_{Y_\alpha}. \tag{4.13}$$
We prove it by contradiction. Suppose that there exists wj and hj such that Lwj = hj , wj Xα = 1 and hj Yα = j1 . Then ˆ L2 ≤ c wj Xα = c, f (u)wj Yα = f (u)wj ρL2 ≤ c wj ρ where the inequality, |f (u(x))|2 (1 + |x|)4+α (log(2 + |x|))2+α ≤ c for x ∈ R2 , is used. Set hˆ j = hj − f (u(x))wj . By (4.11), we have (wj (x) − wj (∞))ρ ˆ L2 ≤ chˆ j Yα ≤ c1 .
Hence, ˆ L2 |wj (∞)| = c1 wj (∞)ρ ≤ c1 (wj ρ ˆ L2 + (wj (∞) − wj (x))ρ ˆ L2 ) ≤ c2 , that is, wj (x) is uniformly bounded for x ∈ R2 . By the regularity of linear elliptic 0,τ (R2 ) for any τ > 0. Since wj (∞) is bounded, by equations, wj (x) is bounded in Cloc passing to a subsequence, wj (x) converges to w(x) uniformly for all x ∈ R2 . Note that "wj = hj (x) − f (u(x))wj converges to −f (u(x))w in Yα . Thus, w(x) satisfies Lw = 0. Clearly, ˆ L2 (R2 ) 1 = lim wj Xα = lim "wj Yα + lim wj ρ j →+∞
j →+∞
j →+∞
$= \|\Delta w\|_{Y_\alpha} + \|w\hat\rho\|_{L^2(\mathbb{R}^2)}$, that is, $\|w\|_{X_\alpha} = 1$. But $w \in X_\alpha$; by (4.7), w(x) is uniformly bounded in $\mathbb{R}^2$. By Corollary 2.8, we have w(x) ≡ 0 in $\mathbb{R}^2$, which contradicts $\|w\|_{X_\alpha} = 1$. This finishes the proof of Step 7. By (4.13), we know that the image R = R(L) of $X_\alpha$ under L is a closed subspace of $Y_\alpha$. Since L is self-adjoint, we know that $R^{\perp} = \ker(L) = 0$, which implies $R = Y_\alpha$. Hence Theorem 4.1 is proved.

5. Construction of Solutions

Given β > 16πN and $p_1, \dots, p_N \in \mathbb{R}^2$, we want to construct a non-topological solution of (1.5) with vorticities at $p_1, \dots, p_N$ and the flux

$$\int_{\mathbb{R}^2} F_{12}\,dx = \frac{1}{2\varepsilon^2}\int_{\mathbb{R}^2} e^{u}(1-e^{u})\,dx = \frac{\beta}{2}.$$

Here, throughout the section, we assume that for each j, the summation $\sum_{k\ne j}\log|p_j - p_k|$ is independent of j. Without loss of generality, we assume $|p_j - p_k| \ge 4$ for j ≠ k. For such β > 16πN, we introduce $u_0(r)$ to be the unique solution of (2.1) with N = 1 and with

$$\int_0^\infty e^{u_0(r)}\big(1-e^{u_0(r)}\big)\,r\,dr = \frac{\beta}{2\pi N}. \tag{5.1}$$

We construct an approximate solution $u_{0,\varepsilon}(x)$ as follows. Let σ = σ(|x|) be a cut-off function which is 1 for 0 ≤ r ≤ 1 and is zero for r ≥ 2. Set $u_{0,\varepsilon}(x) =$
N
x − pj β +(1 − σj (x)) 2 − log |x − pj | ε 2π N j =1 β 1 +D +(1 − η(x)) (2 − log + I − D , (5.2) 2πN ε σj (x)u0
where σj (x) = σ (x − pj ), η(x) = (2.13) and by D=
N
j =1 σj (x),
I and D are constants, defined by
1 β 1 log 2− . N −1 2πN |pj − pk |
(5.3)
k=j
By our assumption, D is independent of j . For x ∈ B1 (pj ), u0,ε (x) can be written as u0,ε (x) = u0
x − pj |x − pk | β log + 2− . ε 2π N |pj − pk |
(5.4)
k=j
Hereafter, Br (p) denotes the open ball of center p and radius r > 0. Thus, u0,ε (x) satisfies "u0,ε (x) + Gj (x)eu0,ε (x) (1 − Gj (x)eu0,ε ) = 0 for x ∈ B1 (pj ), where |x − pk | τ1 +2 Gj (x) = , and |pj − pk |
(5.5)
k=j
τ1 =
β − 4 > 4. 2πN
(5.6)
For x ∈ B2 (pj )\B1 (pj ),
x − pj |x − pj | β u0,ε (x) = σj (x)u0 +(1 − σj (x)) 2 − +I log ε ε 2π N |x − pk | β log . + 2− 2πN |pj − pk | k=j
The sum of the first two terms gives
x − pj |x − pj | β +(1 − σj (x)) 2 − log +I σj (x)u0 ε 2π N ε |x − pj | β = 2− log +I 2πN ε x − pj |x − pj | β − 2− −I . +σj (x) u0 log ε ε 2π N By Lemma 2.6, we have "u0,ε (x)ρ(x)L2 ≤ c ε τ1 .
(5.7)
For x ∈
N
j =1 B2 (pj ),
u0,ε is a harmonic function. Hence
"u0,ε (x) = 0 for x ∈
N
B2 (pj ).
(5.8)
j =1
Set uε (x) = u0,ε (x) + vε (x) to be a solution of (1.1). Then vε (x) satisfies "vε + fε (x)vε + g(ε, x, vε ) = 0 in R2 , where
x−pj 1 u for x ∈ B1 (pj ) f 0 ε ε2 N fε (x) = 0 for x ∈ B1 (pj ),
(5.9)
(5.10)
j =1
and g0 (ε, x, t) is defined by x − pj x − pj 1 − f u0 t g(ε, x, t) = 2 f (u0,ε (x) + t) − f u0 ε ε ε (5.11) for x ∈ B1 (pj ), g(ε, x, t) =
1 f (u0,ε (x) + t)) − "u0,ε (x) ε2
(5.12)
for x ∈ B2 (pj )\B1 (pj ), and g(ε, x, t) =
N 1 f (u (x) + t) for x ∈ B2 (pj ). 0,ε ε2
(5.13)
j =1
Clearly, for x ∈
N
B2 (pj ) and |t| ≤ 1, g(ε, x, t) satisfies
j =1
|g(ε, x, t)| ≤ c1 ε τ1 (1 + |x|)−τ , τ=
β − 2N. 2π
(5.14) (5.15)
We solve Eq. (5.9) by a standard iterative method. However, since (5.9) is meaningless for ε = 0, we cannot apply the implicit function theorem directly. For completeness of presentation, we give a brief account of the iterative process. To do so, we introduce two function spaces, defined as follows. A function v is in $X_{\alpha,\varepsilon}$ if $\|v\|^2_{X_{\alpha,\varepsilon}} =$
N j =1
ε 4 ("v)(pj + εy)ρ(y)2L2 (B
1 ε
2 + v(pj + εy)ρ(y) ˆ L2 (B )
+"v(y)ρ(y)2L2 (Bc ) + v(y)ρ(y) ˆ L2 (Bc )
1) ε
(5.16)
is finite, where B = Yα,ε if hYα,ε =
N j =1
N
j =1 B1 (pj ),
Bc = R2 \B and B 1 = B 1 (0). A function h is in ε
ε 4 h(pj + εy)ρ(y)2L2 (B
1) ε
ε
2 + h(y)ρ(y) ˆ , L2 (R2 \B)
(5.17)
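where ρ(y) and $\hat\rho(y)$ are given in (4.2). The crucial estimate for the linearized equation of (5.9) is the following theorem:

Theorem 5.1. Suppose $v_\varepsilon \in X_{\alpha,\varepsilon}$ and $h_\varepsilon \in Y_{\alpha,\varepsilon}$ satisfy the equation

$$\Delta v_\varepsilon + f_\varepsilon(x)\,v_\varepsilon = h_\varepsilon(x), \tag{5.18}$$

where $f_\varepsilon(x)$ is given by (5.10). Assume $h_\varepsilon(x) \equiv 0$ for x ∈ B. Then there exists a constant C independent of ε such that

$$\|v_\varepsilon\|_{L^\infty(\mathbb{R}^2)} + \|v_\varepsilon\|_{X_{\alpha,\varepsilon}} \le C\,\log\frac{1}{\varepsilon}\,\|h_\varepsilon\|_{Y_{\alpha,\varepsilon}}. \tag{5.19}$$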
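The proof of Theorem 5.1 is long and very technical, so we postpone it to the next section. By Theorem 5.1, we have

Lemma 5.2. Let $L_\varepsilon = \Delta + f_\varepsilon(x)$. Then $L_\varepsilon$ is an isomorphism of $X_{\alpha,\varepsilon}$ onto $Y_{\alpha,\varepsilon}$. Moreover, suppose $v_\varepsilon \in X_{\alpha,\varepsilon}$ and $h_\varepsilon \in Y_{\alpha,\varepsilon}$ satisfy Eq. (5.18). Then

$$\|v_\varepsilon\|_{X_{\alpha,\varepsilon}} + \|v_\varepsilon\|_{L^\infty(\mathbb{R}^2)} \le c\,\log\frac{1}{\varepsilon}\,\|h_\varepsilon\|_{Y_{\alpha,\varepsilon}}. \tag{5.20}$$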
The proof of Lemma 5.2 will be given at the end of this section. Now we are going to prove our main result in this paper.

Proof of Theorem 1.2. The proof given here is a standard iterative scheme. Since (5.9) is meaningless for ε = 0, we cannot apply the implicit function theorem directly. Nevertheless, the proof is exactly the one used in the implicit function theorem. In order to solve (5.9), we construct $v_{n,\varepsilon}$ by induction on n. Set $v_{1,\varepsilon} \equiv 0$ and let $v_{n+1,\varepsilon}$ be the solution of

$$\Delta v_{n+1,\varepsilon} + f_\varepsilon(x)\,v_{n+1,\varepsilon}(x) + g(\varepsilon, x, v_{n,\varepsilon}(x)) = 0 \quad\text{in } \mathbb{R}^2. \tag{5.21}$$
To estimate vn+1,ε −vn,ε , we note that by the expression (5.11)–(5.13) of g(ε, x, t), we have for |t|, |s| ≤ 1, ε2 |g(ε, x, t) − g(ε, x, s)| ≤ c eu0 (
x−pj ε
)
(| log Gj (x)| + |t − s|)|t − s|
(5.22)
for x ∈ B1 (pj ) and |g(ε, x, t) − g(ε, x, s)| ≤ c ετ1 (1 + |x|)−τ |t − s|
(5.23)
for x ∈ B. Thus, by (5.22), g(ε, x, vn,ε ) − g(ε, x, vn−1,ε )Yα,ε,j ≤ c eu0 ( +c
x−pk ε
|y|≤ 1ε
)
(vn,ε − vn−1,ε )2 Yα,ε,j
e
2u0 (|y|)
| log Gj (εy + pj )| |vn,ε − vn−1,ε | |ρ(y)| dy 2
2
2
≤ c vn,ε − vn−1,ε L∞ + ε vn,ε − vn−1,ε Xα,e,j ,
21
(5.24)
where | log Gj (εy + pj )| ≤ c ε|y| and α
α
eu0 (y) |y|(1 + |y|)2+ 2 (log(2 + |y|))1+ 2 ≤ c for y ∈ R2 are employed in the last inequality. For the region R2 \B, we have by (5.23) g (ε, x, vn,ε ) − g(ε, x, vn−1,ε )Yα,ε,Bc
≤ c ε τ1 (1 + |x|)−τ (vn,ε − vn−1,ε )Yα,ε,Bc ≤ c ε τ1 vn,ε − vn−1,ε Xα,ε .
(5.25)
Thus, by Lemma 5.2, vn+1,ε − vn,ε Xα,ε + vn+1,ε − vn,ε L∞ 1 ≤ c log vn,ε − vn−1,ε L∞ + ε vn,ε − vn−1,ε Xα,ε . ε
(5.26)
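For $v_{2,\varepsilon}$, let $h_\varepsilon(x) = g(\varepsilon, x, 0)$. By a straightforward computation,

$$\|h_\varepsilon\|_{Y_{\alpha,\varepsilon}} \le c\,\varepsilon, \tag{5.27}$$

and then by Lemma 5.2,

$$\|v_{2,\varepsilon}\|_{X_{\alpha,\varepsilon}} + \|v_{2,\varepsilon}\|_{L^\infty} \le c\,\varepsilon\,\log\frac{1}{\varepsilon}. \tag{5.28}$$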
Due to the estimates (5.26) and (5.28) of the iterative step, we can prove by induction on n,
vn+1,ε − vn,ε Xα,ε + vn+1,ε − vn,ε L∞ ≤ εnσ kσ vn,ε Xα,ε + vn,ε L∞ ≤ n−1 k=1 ε
(5.29)
if ε is sufficiently small and 0 < σ < 1. Readily from (5.29), vn,ε (x) converges to some vε (x) both in Xα,ε and in the supernorm of R2 . Thus vε (x) satisfies Eq. (5.9). Clearly, β 1 − 2N ) log( |x| ) + O(1) uε (x) = u0,ε (x) + vε (x) is a solution of (1.5) with uε (x) = ( 2π at ∞. This finishes the proof of Theorem 1.2
Proof of Lemma 5.2. We first prove (5.20). For each j ∈ {1, 2, . . . , N}, we set h˜ ε,j (x) ≡ hε (x) for x ∈ B1 (pj ) and h˜ ε,j (x) ≡ 0 for x ∈ B1 (pj ). By Theorem 4.1, there exists a solution vε,j (x) of x − pj 1 "vε,j (x) + 2 f u0 (5.30) vε,j (x) = h˜ ε,j (x) in R2 . ε ε By Theorem 4.1, ε2 ("vε,j )(pj + εy)ρ(y)L2 (R2 ) + vε,j (pj + εy)ρ(y)| ˆ L2 (R2 ) + vε,j L∞ ≤ c ε2 h˜ ε,j (pj + εy)ρ(y)L2 (B 1 ) . ε
Let σ (x) and σj be the cut-off function as before. Set wε (x) = vε (x) −
N
σj vε,j .
(5.31)
j =1
Then wε satisfies "wε + fε (x)wε
= 1−
N
σj hε (x) − 2
j =1
N
σj · vε,j −
j =1
N
("σj )vε,j
j =1
:= h∗ε (x). Clearly, h∗ε (x) ≡ 0 for x ∈ B. Thus, w(x) and h∗ε satisfies the hypothesis of Theorem 5.1. By Theorem 5.1 and (5.31), 1 1 ∗ wε Xα,ε + wε L∞ ≤ c log hε Yα,ε ≤ c log hε Yα,ε . ε ε Therefore, vε Xα,ε + vε L∞ ≤ wε Xα,ε +
σj vε,j Xα,ε
j =1
≤c
N
log
1 hε Yα,ε . ε
This proves (5.20). The isomorphism of Lε follows from (5.20) and the fact that Lε is a self-adjoint operator. Hence, Lemma 5.2 is proved. 6. Proof of Theorem 5.1 In this section, we want to give a complete proof of Theorem 5.1. The linearized operator Lε is a typical example of the Laplace operator plus the coefficient of zero order which is singular in a small neighborhood of pj . The following lemma is the main step toward the estimate of the linearized equation.
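In the proof of Theorem 5.1, we let $v_n \equiv v_{\varepsilon_n}$ and $h_n \equiv h_{\varepsilon_n}$ satisfy (5.19) with $\varepsilon_n \to 0$ and $h_n(x) \equiv 0$ for $x \in B$.

Lemma 6.1. For any $j \in \{1, 2, \dots, N\}$, $v_n(x)$ can be decomposed into
\[ v_n(x) = d_{n,j}\, \varphi_0\left(\frac{|x - p_j|}{\varepsilon_n}\right) + \omega_{n,j}(x), \]
where $\varphi_0$ is the solution of (2.4) with $u(r) = u_0(r)$, and $\omega_{n,j}(x)$ satisfies
\[ \Delta\omega_{n,j}(x) + \frac{f(u_{0,\varepsilon}(x))}{\varepsilon_n^2}\, \omega_{n,j} = 0 \quad \text{for } |x - p_j| \le 1, \]
and the maximum point $x_{n,j}$ of $\omega_{n,j}$, that is,
\[ \mu_n = |\omega_{n,j}(x_{n,j})| = \sup_{|x - p_j| \le 1} |\omega_{n,j}(x)|, \]
satisfies
\[ \lim_{n \to +\infty} |x_{n,j}| = 1. \]
Furthermore, $\frac{\omega_{n,j}(x)}{\mu_n}$ always tends to a function $\omega(x)$ in $C^2_{loc}(B_1 \setminus \{p_j\})$, where $\omega(x)$ is a harmonic function in $B_1$ and $\omega(0) = 0$.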
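Proof. We let $d_{n,j} = v_n(p_j)$, and
\[ \omega_{n,j}(x) = v_n - d_{n,j}\, \varphi_0\left(\frac{|x - p_j|}{\varepsilon}\right). \]
In $\bar B_2(p_j)$, $\hat v_n$ satisfies
\[ \Delta\hat v_n + \frac{f(u_{0,\varepsilon})}{\varepsilon^2}\, \hat v_n = 0, \qquad \hat v_n(p_j) = 0. \tag{6.1} \]
For simplicity of notation, we let $p_j = 0$ in the proof of Lemma 6.1. First, we claim that if $x_n$ is a maximum point of $\hat v_n$ in $\bar B_1$, that is,
\[ |\hat v_n(x_n)| = \sup_{x \in \bar B_1} |\hat v_n(x)|, \]
then
\[ |x_n| \ge r_0 > 0 \tag{6.2} \]
for large $n$. Suppose $|x_n| \to 0$ as $n \to +\infty$. We claim
\[ \frac{|x_n|}{\varepsilon_n} \to +\infty \quad \text{as } n \to +\infty. \tag{6.3} \]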
Otherwise, we scale $\tilde v_n(y) = \hat v_n(\varepsilon_n y)/\hat v_n(x_n)$. Then $|\tilde v_n(y)| \le 1$ for $|y| \le \frac{1}{\varepsilon_n}$, and $\tilde v_n(y_n) = 1$ with $y_n = \frac{x_n}{\varepsilon_n} \to y_0 \in \mathbb{R}^2$. By the elliptic estimates, after passing to a subsequence, $\tilde v_n(y)$ converges to $\psi(y)$, where $\psi$ is a bounded solution of $\Delta\psi + f(u_0(x))\psi = 0$ in $\mathbb{R}^2$. Since $|\psi(y)| \le 1$ in $\mathbb{R}^2$, by Corollary 2.8, we have $\psi(y) \equiv 0$ in $\mathbb{R}^2$, which contradicts $\tilde v_n(y_n) = 1$. Hence (6.3) is proved.

Let $r_n = |x_n|$. As before, we scale $\hat v_n$ by (still denoted by $\tilde v_n$)
\[ \tilde v_n(y) = \frac{\hat v_n(r_n y)}{\hat v_n(x_n)}. \]
Clearly, $\tilde v_n(y)$ satisfies
\[ \Delta\tilde v_n(y) + \frac{f(u_0(y/\eta_n))}{\eta_n^2}\, \tilde v_n(y) = 0 \ \text{ for } |y| \le \frac{1}{\eta_n}, \qquad |\tilde v_n(y)| \le 1 \ \text{ and } \ \tilde v_n(y_n) = 1, \]
where $y_n = \frac{x_n}{|x_n|}$ and $\eta_n = \frac{\varepsilon_n}{r_n}$. By (6.3), $\eta_n \to 0$. Hence $f(u_0(y/\eta_n))/\eta_n^2 \to 0$ for each $y \ne 0$ as $n \to +\infty$, and $\tilde v_n(y)$ converges uniformly to a bounded harmonic function $\psi$ in any compact set of $\mathbb{R}^2 \setminus \{0\}$. By the Liouville theorem, we have $\psi(y) \equiv 1$ in $\mathbb{R}^2$. In particular, for any positive constant $C > 1$ and $C^{-1} \le \frac{|x|}{|x_n|} \le C$, we have
\[ \frac{\hat v_n(x)}{\hat v_n(x_n)} > 0. \]
Without loss of generality, we may assume $\hat v_n(x_n) > 0$. Since $\hat v_n(0) = 0$, by the strong maximum principle, $\hat v_n(x)$ must be negative somewhere in $\{x \mid |x| < r_n\}$. Now suppose $x_n'$ to be a point such that $\hat v_n(x_n') = 0$ and $\hat v_n(x) > 0$ for $|x_n'| < |x| \le |x_n|$. By the Harnack inequality, we have
\[ C \inf_{|x| = r} \hat v_n(x) \ge \sup_{|x| = r} \hat v_n(x) \]
for all $2|x_n'| \le r \le |x_n|$ and for some constant $C$ which is independent of $r$ and $n$. There are two cases.

Case 1. There exists a constant $c_1 > 0$ such that
\[ \sup_{|x| \le \frac{2}{3}|x_n'|} |\hat v_n(x)| \le c_1 \sup_{|x| = 2|x_n'|} |\hat v_n(x)| := c_1 |\hat v_n|. \tag{6.4} \]
Then we set $\tilde v_n(y) = \frac{\hat v_n(r_n y)}{|\hat v_n|}$, where $r_n = |x_n'|$. Since $\hat v_n(x)$ is positive for $|x| \ge |x_n'|$, the Harnack inequality implies that $\tilde v_n(y)$ is uniformly bounded in any bounded set of $\mathbb{R}^2 \setminus \{y \mid |y| \ge \frac{4}{5}\}$. By (6.4), we then have that $\tilde v_n(y)$ is uniformly bounded in any compact set of $\mathbb{R}^2$. Therefore, by passing to a subsequence if necessary, $\tilde v_n(y)$ converges to $\omega(y)$ in $C^2_{loc}(\mathbb{R}^2 \setminus \{0\})$, where $\omega(y)$ is either a solution of
\[ \Delta\omega + \eta_0^{-1} f\left(u_0\left(\frac{x - p_j}{\eta_0}\right)\right)\omega(y) = 0 \quad \text{in } \mathbb{R}^2, \tag{6.5} \]
for some $\eta_0 > 0$ if $\frac{|x_n'|}{\varepsilon_n}$ is bounded, or $\omega(y)$ is a harmonic function in $\mathbb{R}^2$ if $\frac{|x_n'|}{\varepsilon_n} \to +\infty$. Suppose that $\omega(y)$ is a solution of (6.5). By the assumption, $\omega(0) = 0$ and $\omega(y)$ is a positive function for $|y| > 1$. By Corollary 2.8, $\omega(y) \equiv 0$ in $\mathbb{R}^2$, which clearly yields a contradiction. Now suppose that $\omega(y)$ is a harmonic function. Since $\omega(y)$ is positive for $|y| \ge 1$, by the Liouville theorem, $\omega(y) \equiv$ constant, which contradicts the facts that $\tilde v_n(y_n) = 1$ for some $|y_n| = 2$ and $\tilde v_n(y_n') = 0$ for some $|y_n'| = 1$. Hence Case 1 is proved.

Case 2.
\[ \sup_{|x| \le \frac{2}{3}|x_n'|} |\hat v_n(x)| \gg \sup_{|x| = 2|x_n'|} |\hat v_n(x)|. \tag{6.6} \]
Set $x_n''$ to be the maximum point,
\[ |\hat v_n(x_n'')| = \sup_{|x| \le \frac{2}{3}|x_n'|} |\hat v_n(x)|. \]
By the same (scaling) argument as before, we see that $\frac{|x_n''|}{\varepsilon_n} \to +\infty$ and for any $c > 1$,
\[ \frac{\hat v_n(x)}{\hat v_n(x_n'')} \to 1 \quad \text{for } \frac{1}{c}|x_n''| \le |x| \le c\,|x_n''| \]
uniformly in $x$. In particular, we have $|x_n''| \ll |x_n'|$. On the other hand, by the strong maximum principle, we know that $\hat v_n(x)$ must change its sign for $|x| < |x_n''|$. Therefore, there exists another point $|\tilde x_n| < |x_n''|$ such that $\hat v_n(\tilde x_n) = 0$ and $|\hat v_n(x)| \ne 0$ for $|\tilde x_n| < |x| \le |x_n''|$. By the same argument as in Case 1, we have
\[ |\hat v_n(x_n^*)| = \sup_{|x| \le \frac{2}{3}|\tilde x_n|} |\hat v_n(x)| \gg \sup_{|x| = 2|\tilde x_n|} |\hat v_n(x)|. \]
By the scaling argument, we have $\frac{|x_n^*|}{\varepsilon_n} \to +\infty$ and $|\hat v_n(x)| \ne 0$ for $|x| = |x_n^*|$. Hence, there exists a domain $\Omega_n \subset B(0, |x_n''|) \setminus B(0, |x_n^*|)$ such that $\hat v_n|_{\partial\Omega_n} = 0$. Set $\bar v_n(x) = \frac{\hat v_n(x)}{\varphi_0(|x|/\varepsilon_n)}$, where $\varphi_0$ is the solution of (2.13) for $k = 0$. Note that $\varphi_0\left(\frac{|x|}{\varepsilon_n}\right) \ne 0$ for $|x| \ge |x_n^*|$, because $\frac{|x_n^*|}{\varepsilon_n} \to +\infty$ as $n \to +\infty$. By a straightforward computation, we have
\[ \Delta\bar v_n(x) + 2\nabla\log\varphi_0\left(\frac{x}{\varepsilon_n}\right) \cdot \nabla\bar v_n = 0 \ \text{ in } \Omega_n, \qquad \bar v_n = 0 \ \text{ on } \partial\Omega_n. \]
By the maximum principle, $\bar v_n(x) \equiv 0$ in $\Omega_n$ and then $\hat v_n(x) \equiv 0$ in $\Omega_n$, which yields a contradiction. This contradiction completes the proof of (6.2).

Since $f(u_{0,\varepsilon}(x)) \to 0$ in $C^2_{loc}(\bar B_1 \setminus \{p_j\})$, by passing to a subsequence, $\frac{\omega_{n,j}(x)}{\mu_n}$ converges to a harmonic function $\omega(x)$ in $B_1$ and $|\omega(x)| \le 1$ for $x \in B_1$. Suppose $\omega(x) \not\equiv 0$ in $B_1$. (The possibility $\omega(x) \equiv 0$ might occur and if it happens, then $\sup_{|x| \le r} |\omega_{n,j}(x)| = o(1)\mu_n$ for any $r < 1$, which implies $\lim_{n \to +\infty} |x_{n,j}| = 1$.) We want to show $\omega(0) = 0$. If not, we may assume $\omega(0) > 0$. Then there exists $r_0$ such that $\omega_{n,j}(x) > 0$ for $|x| = r_0$ and for large $n$. By the strong maximum principle, $\omega_{n,j}(x)$ must be negative somewhere
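in $B_{r_0}$. Let $x_n \in B_{r_0}$ be such that $\omega_{n,j}(x_n) = 0$ and $\omega_{n,j}(x) > 0$ for $|x_n| < |x| < r_0$. Obviously, $x_n \to 0$. By the same argument as Case 1, we have
\[ \sup_{|x| \le \frac{2}{3}|x_n|} |\omega_n(x)| \gg \sup_{|x| = 2|x_n|} |\omega_n(x)|. \]
Thus, by the argument in Case 2, there is a domain $\Omega_n$ such that $\Omega_n \subset B_{r_n} \setminus \bar B_{\bar r_n}$ with $\bar r_n < r_n = |x_n|$ and $\omega_n|_{\partial\Omega_n} = 0$. Applying the maximum principle to $\bar\omega_{n,j}(x) = \frac{\omega_{n,j}(x)}{\varphi_0\left(\frac{x - p_j}{\varepsilon_n}\right)}$, we obtain a contradiction as before. Therefore $\omega(0) = 0$. If $\omega(x) \not\equiv 0$, then the maximum of $\omega(x)$ must be on the boundary. Thus, $\lim_{n \to +\infty} |x_{n,j}| = 1$ follows readily.

Now we can complete the proof of Theorem 5.1.

Proof of Theorem 5.1. Suppose (5.10) fails for a sequence of $\varepsilon_n \downarrow 0$. Then there exist $v_n \equiv v_{\varepsilon_n} \in X_{\alpha,\varepsilon_n}$ and $h_n \equiv h_{\varepsilon_n} \in Y_{\alpha,\varepsilon_n}$ satisfying $\Delta v_n + f_{\varepsilon_n}(x) v_n = h_n(x)$ in $\mathbb{R}^2$ with $h_n(x) \equiv 0$ on $\bigcup_{j=1}^{N} B_1(p_j)$ such that
\[ \|v_n\|_{X_{\alpha,\varepsilon_n}} = 1 \quad \text{and} \quad \|h_n\|_{Y_{\alpha,\varepsilon_n}} = o(1)\left(\log\frac{1}{\varepsilon_n}\right)^{-1}. \tag{6.7} \]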
Let τn = max |vn (x)| = max |vn (x)|, ¯ x∈B
N
1 2 j =1 B1 (pj ). Since h(x) ∈ Yα,εn , h ∈ L (R ). Recall fεn (x) ∈ B. Thus, hˆ n (x) := fεn (x)vn − hn (x) ∈ L1 (R2 ). By the argument of Step of Theorem 4.1, we have R2 hˆ n (x)dx = 0 and vn (x) is bounded in R2 . Let
where B = x 3
x∈B1 (p1 )
vn∗ (x) = vn
x − p1 |x − p1 |2
(6.8) ≡ 0 for 1 – Step
˜ is the image of R2 \B under the inversion be the Kelvin transformation of vn and B x−p1 ∗ . Then vn (x) satisfies |x−p |2 1
"vn∗
= |x − p1
|−4 h
n
x−p1 |x−p1 |2
˜ for x ∈ B,
˜ |vn∗ (x)| ≤ τn for x ∈ ∂ B.
Choose 1
2 2−
α 2
.
(6.9)
˜ Then by Lp estimates, we have for x ∈ B, |vn∗ (x)|
≤ τn + c
˜ B
p hn (
x − p1 )|x − p1 |−4p dx |x − p1 |2
p1
.
(6.10)
2,p
Here, the Sobolev embeddings, Wloc ⊆ C 0,γ for some γ > 0, is used. By using the Kelvin transformation,
p1 p1 p hn (x)|x − p1 |4p x − p1 −4p |hn | ( )|x − p1 | dx = dx |x − p1 |2 |x − p1 |4 ˜ B R2 \B 1− p2 −σ ≤ |x − p1 | dx hYα ,ε , (6.11) p
R2 \B
2 > 2 by (6.8). Thus where σ = [p(1 + α2 ) − 4(p − 1)] 2p
|vn (x)|L∞ (Bc ) ≤ τn + chn Yα ,εn .
(6.12)
hn Yα ,εn → 0. τn
(6.13)
We claim
Otherwise, τn ≤ c1 hn Yα,εn . By (6.12), we have vn L∞ (R2 ) ≤ c2 hn Yα,εn .
(6.14)
On the other hand, by (6.14), we obtain 1 = vn 2Xα,εn = ε4
N j =1
+
N j =1
|"vn |2 (pj + εn y)ρ(y)|2L2 (B
2 vn (pj + εn y)ρ(y) ˆ L2 (B
1 εn
)
1 εn
)
+ "vn (y)ρ(y)2L2 (Bc ) + vn (y)ρ n (y)2L2 (B2 )
≤ cvn L∞ (R2 ) + hn Yα ,εn ≤ c3 hn Yα,εn = o(1) log
1 , εn
a contradiction. Therefore, (6.13) is proved. By Lemma 6.1, we have τn = (1 + o(1))µn,1 if dn,1 log ε1n << µn,1 , where µn,1 and dn,1 are given in Lemma 6.1. In this case, we set v˜n = vτnn . Then |v˜n (x)| ≤ 1 and v˜n satisfies "v˜n + fε (x)v˜n =
hn (x) . τn
By (6.13), hn Yα,εn /τn → 0. Note that for x ∈ B1 (pj )\B 1 (pj ), 2
β 2π N −4 f (x)v˜n = 1 f u0 x − pj | v ˜ (x)| ≤ c ε . n n ε ε2 ε Set B1 =
N
j =1 B 21 (pj ),
x−p1 and perform the Kelvin transformation v˜n∗ (x) = v˜n ( |x−p ) |2
˜ 1 is the image of Bc under the inversion as before where B 1 Lp -estimate, we have instead of (6.10),
|vn∗ |C 0,γ (K)
≤1+c
˜1 B
p hn
x−p1 . |x−p1 |2
1
By applying the
p1 x − p1 −4p |x − p1 | dx τn−1 |x − p1 |2
(6.15)
˜ 1 . By the computation (6.11), we have for some 0 < γ < 1, and γ < 1 and K B |v˜n |C γ (Bc2 ) ≤ 1 + chn Yα ,εn τn−1 ≤ 2, where B2 =
(6.16)
N
j =1 B 43 (pj ). Therefore, by passing to a subsequence, v˜ n (x) converges to
v uniformly in all x ∈ R2 , where the uniform convergence of vn for x ∈ B1 (pj ) follows from Lemma 6.1. Obviously, v is a bounded harmonic function. By the Liouville Theorem, v(x) ≡ 1 in R2 . However, v(x) = w1 (x), where w1 (x) is the limiting of wn,1 (x) in B1 (pj ). By Lemma 6.1, w1 (0) = 0, which obviously yields a contradiction. Hence, we have proved. µn,1 ≤ c dn,1 log
1 for some constant c. εn
(6.17)
By (6.16), we have τn ≤ c dn,1 log ε1n . Set ηn = dn,1 log ε1n . Then by the same estimates as (6.16), we have vn ≤ 1 + chn Yα,εn ηn−1 ≤ 2 (6.18) η n C 0,γ (Bc ) 2
by (6.13). Thus, ηvnn converges to v(x) uniformly in all x ∈ R2 and v(x) ≡ c w1 (x) + 1, µn,1 where c = limn→+∞ 1 . By the Liouville Theorem, v(x) ≡ constant. Thus, dn,1 log
εn
cw1 (x) ≡ constant. Since w1 (0) = 0 and |w1 (p0 )| = 1 for some p0 ∈ ∂B1 (p1 ). Therefore c = 0. In the meantime, we also have lim
n→+∞
dn,j log ε1n dn,1 log
1 ε
= 1 and
lim
n→+∞
µn,j dn,1 log ε1n
=0
for j = 2, · · · , N . By a straightforward computation, we have 1 2 −2 −(2+α) vn Xα,εn = dn,1 log (1 + |y| ) (log(2 + |y|)) dy + o(1) . εn Bc Hence, limn→+∞ dn,1 log ε1n = d > 0.
Non-Topological Multi-Vortex Solutions to the Self-Dual Chern-Simons-Higgs Equation 1| Since vn (x) = dn,1 log |x−p + µn,1 εn function w1 (x), we obtain,
wn,1 (x) µ0,1
and
wn,1 (x) µn,1 (x)
221
converges to a harmonic
1 1 ≤ c1 dn,1 ≤ c2 [vn ]C 0,γ (Bc ) ≤ c2 hn Yα,εn = o(1) log , log 2 εn εn
a contradiction, where the upper bound of the Hölder estimate of $v_n$ can be obtained from the representation
\[ v_n(x) = v_n(\infty) + \frac{1}{2\pi} \int_{\mathbb{R}^2} \log\frac{|x|}{|x - y|}\, \hat h_n(y)\, dy. \]
Since the computation is elementary, we skip the details. Therefore, the proof of Theorem 5.1 is finished.

Acknowledgement. The present work was done while the authors visited the National Center for Theoretical Sciences of NSC, Taiwan, during the summers of 2000 and 2001. They want to thank CTS for the warm hospitality.
References
1. Bahri, A., Coron, J.M.: The scalar curvature problem on standard three-dimensional sphere. J. Funct. Anal. 95, 106–172 (1991)
2. Caffarelli, L., Yang, Y.: Vortex condensation in the Chern-Simons-Higgs model: An existence theory. Commun. Math. Phys. 168, 321–336 (1995)
3. Chae, D., Imanuvilov, O.Y.: The existence of non-topological multivortex solutions in the relativistic self-dual Chern-Simons Theory. Commun. Math. Phys. 215, 119–142 (2000)
4. Chen, C.C., Lin, C.S.: Uniqueness of the ground state solutions of $\Delta u + f(u) = 0$ in $\mathbb{R}^n$, $n \ge 3$. Comm. PDE 16, 1549–1572 (1991)
5. Chen, C.C., Lin, C.S.: Sharp estimates for multi-bubble solution in compact Riemann surfaces. Comm. Pure Appl. Math. 55, 728–771 (2002)
6. Chen, C.C., Lin, C.S.: Topological degree for the mean field equations in compact Riemann surface. Preprint 2002
7. Chen, X., Hastings, S., McLeod, J.B., Yang, Y.: A nonlinear elliptic equations arising from gauge field theory and cosmology. Proc. R. Soc. Lond. A 446, 453–478 (1994)
8. Hong, J., Kim, Y., Pac, P.Y.: Multivortex solutions of the Abelian Chern Simons theory. Phys. Rev. Lett. 64, 2230–2233 (1990)
9. Jackiw, R., Weinberg, E.J.: Self-dual Chern Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990)
10. Jaffe, A., Taubes, C.: Vortices and Monopoles. Boston: Birkhäuser, 1980
11. Lin, C.S.: Uniqueness of solutions to the mean field equation for the spherical Onsager vortex. Arch. Rat. Mech. 153(2), 153–176 (2000)
12. Nolasco, M., Tarantello, G.: Double vortex condensates in the Chern-Simons-Higgs theory. Calc. Var. Partial Differential Equations 9(1), 31–94 (1999)
13. Nolasco, M., Tarantello, G.: Vortex condensates for the SU(3) Chern-Simons theory. Commun. Math. Phys. 213(3), 599–639 (2000)
14. Spruck, J., Yang, Y.: Topological solutions in the self-dual Chern-Simons theory: Existence and approximation. Ann. Inst. Henri Poincaré Anal. Non Linéaire 12, 75–97 (1995)
15. Spruck, J., Yang, Y.: The existence of nontopological solitons in the self-dual Chern-Simons theory. Commun. Math. Phys. 149, 361–376 (1992)
16. Tarantello, G.: Multiple condensate solutions for the Chern-Simons-Higgs theory. J. Math. Phys. 37, 3769–3796 (1996)
17. Wang, R.: The existence of Chern-Simons Vortices. Commun. Math. Phys. 137, 587–597 (1991)
18. Yang, Y.: Nonlinear Problems in Field Theories. Lecture at Seoul National University, 1998

Communicated by A. Kupiainen
Commun. Math. Phys. 231, 223–255 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0711-6
Communications in
Mathematical Physics
On Asymptotic Expansions and Scales of Spectral Universality in Band Random Matrix Ensembles
A. Khorunzhy (1, 2), W. Kirsch (3)
(1) Institute for Low Temperature Physics, Kharkov, Ukraine
(2) University Paris-7, France
(3) Institute of Mathematics, Ruhr-University Bochum, Bochum, Germany
Received: 8 April 2000 / Accepted: 7 June 2002 Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: We consider real random symmetric $N \times N$ matrices $H$ of the band-type form with characteristic length $b$. The matrix entries $H(x, y)$, $x \le y$, are independent Gaussian random variables and have the variance proportional to $u\left(\frac{x - y}{b}\right)$, where $u(t)$ vanishes at infinity. We study the resolvent $G(z) = (H - z)^{-1}$, $\mathrm{Im}\, z \ne 0$, in the limit $1 \ll b \ll N$ and obtain the explicit expression $S(z_1, z_2)$ for the leading term of the first correlation function of the normalized trace $\langle G(z) \rangle = N^{-1}\, \mathrm{Tr}\, G(z)$. We examine $S(\lambda_1 + i0, \lambda_2 - i0)$ on the local scale $\lambda_1 - \lambda_2 = \frac{r}{N}$ and show that its asymptotic behavior is determined by the rate of decay of $u(t)$. In particular, if $u(t)$ decays exponentially, then $S(r) \sim -C\, b^2 N^{-1} r^{-3/2}$. This expression is universal in the sense that the particular form of $u$ determines the value of $C > 0$ only. Our results agree with those detected in both numerical and theoretical physics studies of spectra of band random matrices.

1. Problem, Motivation and Results

Random matrices play an important role in various fields of mathematics and physics. The eigenvalue distribution of large matrices was initially considered by E. Wigner to model the statistical properties of the energy spectrum of heavy nuclei (see e.g. the collection of early papers [29]). Further investigations have led to numerous applications of random matrices of infinite dimensions in such branches of theoretical physics as statistical mechanics of disordered spin systems, solid state physics, quantum chaos theory, quantum field theory and others (see monographs and reviews [2, 10, 16, 18]). In mathematics, the spectral theory of random matrices has revealed deep links with orthogonal polynomials, integrable systems, representation theory, combinatorics, non-commutative probability theory and other theories [3, 11, 32, 35].

Present address: Département de Mathématiques, Université de Versailles Saint-Quentin, 78035 Versailles, France.
In the present paper we deal with the family of real symmetric random matrices that can be referred to as the band-type one. In the simplest case the matrices have zeros outside of a band around the principal diagonal. Inside of this band the entries are assumed to be jointly independent random variables. The limiting transition considered is that the band width $b$ increases at the same time as the dimension of the matrix $n$ does.

There is a large number of papers devoted to the use of random matrices of this type in models of quantum chaotic systems (see, e.g. [30] and references therein). In these studies, one of the central topics is the transition between fully developed chaos and complete integrability. The crucial observation, made numerically [9] and then supported by a wealth of theoretical physics papers (see, for example [15, 33]), is that the ratio $b^2/n$ is the critical one for the corresponding transition in spectral properties of band random matrices.

On the rigorous level, the eigenvalue distribution of $H^{(n,b)}$ has been studied in [4, 8, 26]. It is shown that the limiting eigenvalue distribution exists, is non-random and depends on the parameter $\alpha = \lim_{n \to \infty} b/n$. However, the role of the ratio $b^2/n$ has not been revealed there. Recently, a series of papers appeared where the band random matrices are studied in the context of the non-commutative probability theory [17, 31]. These studies also deal with the limit $n, b \to \infty$ such that $\alpha > 0$.

In the present paper we concentrate on the case of $\alpha = 0$ represented by the limit $1 \ll b \ll n$ and study the first correlation function of the resolvent of band random matrices. We show that the ratio $\beta = \lim_{n \to \infty} b^2/n$ naturally arises when one considers the leading term of this correlation function on the local scale. This can be regarded as support for the conjecture that the local properties of spectra of band random matrices depend on the value of $\beta$.

Let us describe our results in more detail. We consider the ensemble $\{H^{(n,b)}\}$ of random $N \times N$ matrices, $N = 2n + 1$, whose entries $H^{(n,b)}(x, y)$ have the variance proportional to $u\left(\frac{x - y}{b}\right)$, where $u(t) \ge 0$ vanishes at infinity. We consider the resolvent $G^{(n,b)}(z) = \left(H^{(n,b)} - z\right)^{-1}$ and study the asymptotic expansion of the correlation function
\[ C_{n,b}(z_1, z_2) = \mathbf{E} f_{n,b}(z_1) f_{n,b}(z_2) - \mathbf{E} f_{n,b}(z_1)\, \mathbf{E} f_{n,b}(z_2), \]
where we denoted $f_{n,b}(z) = N^{-1}\, \mathrm{Tr}\, G^{(n,b)}(z)$. Keeping $z_i$ far from the real axis, we consider the leading term $S(z_1, z_2)$ of this expansion and find an explicit expression for it. This term $S(r_1 + i0, r_2 - i0)$ regarded on the local scale $r_1 - r_2 = r/N$ exhibits different behavior depending on the rate of decay of the profile function $u(t)$. Our main conclusion is that if $u(t) \sim |t|^{-1-\nu}$ as $t \to \infty$, then the value $\nu = 2$ separates two major cases. If $\nu \in (1, 2)$, then the limit of $S(r)$ depends on $\nu$. If $\nu \in (2, +\infty)$, then
\[ \frac{1}{Nb}\, S(r) = -\mathrm{const} \cdot \frac{\sqrt{N}}{b} \cdot \frac{1}{|r|^{3/2}}\, (1 + o(1)). \]
These results are in agreement with those predicted in theoretical physics studies. In particular, the last expression for $S$ coincides with the Altshuler-Shklovski asymptotics of the spectral correlation function (see e.g. [27]).

The paper is organized as follows. In Sect. 2 we determine the family of ensembles and present several already known results that will be needed. In Sect. 3 we formulate our main propositions and describe the scheme of their proofs. To illustrate this scheme,
we present a short proof of the Wigner semicircle law for the Gaussian Orthogonal Ensemble of random matrices. In Sect. 4 we study the correlation function $C_{n,b}(z_1, z_2)$ and obtain the explicit expression $S(z_1, z_2)$ for its leading term. In Sect. 5 we study the self-averaging property of $\langle G(z) \rangle$ and prove auxiliary facts used in Sect. 4. Expressions derived in Sect. 4 are analyzed in Sect. 6, where the asymptotic behavior of $S(z_1, z_2)$ is studied. In Sect. 7 we give a summary of our observations.

2. Band Random Matrices and Wigner Law

2.1. The ensemble. Let us consider the family $\mathcal{A} = \{a(x, y),\ x \le y,\ x, y \in \mathbb{Z}\}$ of jointly independent random variables determined on the same probability space. We assume that they have joint Gaussian (normal) distribution with properties
\[ \mathbf{E}\, a(x, y) = 0, \qquad \mathbf{E}\, [a(x, y)]^2 = v(1 + \delta_{xy}), \tag{2.1} \]
where we denote by $\delta$ the Kronecker symbol:
\[ \delta_{xy} = \begin{cases} 0, & \text{if } x \ne y, \\ 1, & \text{if } x = y. \end{cases} \]
Here and below $\mathbf{E}$ denotes the mathematical expectation with respect to the measure generated by the family $\mathcal{A}$. Let $u(t)$, $t \in \mathbb{R}$, be a piece-wise continuous function, $u(t) = u(-t) \ge 0$, satisfying the conditions
\[ \sup_{t \in \mathbb{R}} |u(t)| = \bar u < \infty \tag{2.2} \]
and
\[ \int_{\mathbb{R}} u(t)\, dt = 1. \tag{2.3} \]
For simplicity, we assume $u(t)$ to be monotone for $t \ge 0$. Given a real parameter $b > 0$, we introduce an infinite matrix $U^{(b)}$,
\[ U^{(b)}(x, y) = \frac{1}{b}\, u\left(\frac{x - y}{b}\right), \quad x, y \in \mathbb{Z}, \]
and determine the ensemble $H^{(n,b)}$ as the family of real symmetric matrices of the form
\[ H^{(n,b)}(x, y) = a(x, y) \sqrt{U^{(b)}(x, y)}, \quad x \le y, \ |x|, |y| \le n, \tag{2.4} \]
where $b \le N$, $N = 2n + 1$, and the square root is assumed to be positive. Let us note that the matrix (2.4) has a genuine band form when $U^{(b)}$ is constructed with the help of a function $u$ having a finite support, say
\[ u(t) = \begin{cases} 1, & \text{if } t \in \left(-\frac{1}{2}, \frac{1}{2}\right), \\ 0, & \text{otherwise.} \end{cases} \]
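As an illustration of definition (2.4) with the indicator profile, the following sketch samples one matrix $H^{(n,b)}$ and inspects its band structure. The names `indicator_profile` and `band_matrix`, the index shift from $x \in [-n, n]$ to $0, \dots, 2n$, and the values of $n$, $b$ and $v$ are assumptions made here for illustration only; they do not come from the original text. Only the Python standard library is used.

```python
import math
import random


def indicator_profile(t):
    """u(t) = 1 on (-1/2, 1/2) and 0 otherwise; it integrates to 1 as required by (2.3)."""
    return 1.0 if -0.5 < t < 0.5 else 0.0


def band_matrix(n, b, v=1.0, u=indicator_profile, seed=0):
    """Sample H(x, y) = a(x, y) * sqrt(U(x, y)) with U(x, y) = u((x - y) / b) / b, |x|, |y| <= n.

    Rows and columns are indexed by i = x + n, so x - y equals i - j.
    """
    rng = random.Random(seed)
    size = 2 * n + 1
    H = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i, size):
            weight = u((i - j) / b) / b
            variance = v * (2.0 if i == j else 1.0)  # E a^2 = v (1 + delta_xy), see (2.1)
            a = rng.gauss(0.0, math.sqrt(variance))
            H[i][j] = H[j][i] = a * math.sqrt(weight)
    return H


H = band_matrix(n=20, b=6)
size = len(H)
# Largest |x - y| at which a nonzero entry occurs in this sample.
widest = max(abs(i - j) for i in range(size) for j in range(size) if H[i][j] != 0.0)
print("matrix size:", size, "largest offset with a nonzero entry:", widest)
```

With this profile the entries vanish as soon as $|x - y| \ge b/2$, so the sampled matrix sits well inside the bound on the band width quoted in the text.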
For this indicator profile the band width is less than or equal to $2b + 1$. If $b = N$, then matrices (2.4) coincide with those belonging to the Gaussian Orthogonal Ensemble (GOE) [25]. This random matrix ensemble is determined as the family $\{A_N\}$ of real symmetric matrices
\[ A_N(x, y) = \frac{1}{\sqrt{N}}\, a(x, y), \quad x, y = 1, \dots, N, \tag{2.5} \]
with $\{a(x, y)\}$ belonging to $\mathcal{A}$ (2.1). GOE together with its Hermitian and quaternion versions plays a fundamental role in the spectral theory of random matrices.

Random symmetric matrices (2.5) with independent arbitrarily distributed random variables $a(x, y)$ satisfying (2.1) are referred to as the Wigner ensemble of random matrices. This random matrix ensemble, considered first by E. Wigner [36], has been extensively studied in a series of papers (see e.g. [32] and references therein). In particular, in paper [22] the resolvent technique is developed to study the spectral properties of the Wigner ensemble. We follow a version of this technique, but restrict ourselves to the simpler case of Gaussian random variables. A more general case of arbitrarily distributed random variables would make the computations much more cumbersome. Let us repeat that the main task of this paper is to study the role of the ratio between $b$ and $N$ with respect to the spectral properties of random matrices.

Finally, it should be noted that we restrict ourselves to the ensemble of real symmetric matrices also for the sake of simplicity. All results can be obtained by using essentially the same technique for the Hermitian analogue of $H^{(n,b)}$.

2.2. Limiting eigenvalue distribution. The eigenvalue distribution of matrices $H^{(n,b)}$ is described by the normalized eigenvalue counting function
\[ \sigma(\lambda; H^{(n,b)}) = \#\{\lambda_j^{(n,b)} \le \lambda\}\, N^{-1}, \tag{2.6} \]
where $\lambda_1^{(n,b)} \le \dots \le \lambda_N^{(n,b)}$ are eigenvalues of $H^{(n,b)}$. We denote by $f_{n,b}(z)$, $z \in \mathbb{C}$, the Stieltjes transform of the measure given by (2.6):
\[ f_{n,b}(z) = \int_{-\infty}^{\infty} \frac{d\sigma_{n,b}(\lambda)}{\lambda - z}, \quad \mathrm{Im}\, z \ne 0. \tag{2.7} \]
Given a Stieltjes transform $f(z)$, one can restore the corresponding measure $d\sigma(\lambda)$ with the help of the inversion formula (see e.g. [12]). The limiting behavior of (2.7) as $n, b \to \infty$ was studied in a series of papers [4, 8, 23, 26]. It was proved in [26] that $f_{n,b}(z)$ converges as $n, b \to \infty$ in probability to a nonrandom function that depends on the ratio $\alpha = \lim b/N$:
\[ \operatorname*{p-lim}_{n, b \to \infty} f_{n,b}(z) = w_\alpha(z). \tag{2.8} \]
In particular, if $\alpha = 0$, then the function $w_0(z) \equiv w(z)$ satisfies the equation
\[ w(z) = \frac{1}{-z - v\, w(z)}. \tag{2.9} \]
The solution of this equation is unique in the class of functions satisfying the condition
\[ \mathrm{Im}\, w(z)\ \mathrm{Im}\, z \ge 0 \]
and can be represented in the form $w(z) = \int (\lambda - z)^{-1}\, d\sigma_w(\lambda)$, where $\sigma_w(\lambda)$ is the famous semicircle (or Wigner) distribution [36] with the density
\[ \rho_w(\lambda) = \sigma_w'(\lambda) = \frac{1}{2\pi v} \begin{cases} \sqrt{4v - \lambda^2}, & \text{if } |\lambda|^2 \le 4v, \\ 0, & \text{if } |\lambda|^2 \ge 4v. \end{cases} \tag{2.10} \]
This density was obtained first by E. Wigner [36] for eigenvalues of random matrices of the "full" form (2.5) and can also be obtained as the limit (2.8) with $\alpha = 1$:
\[ \sigma_w(\lambda) = \lim_{N \to \infty} \sigma(\lambda; A_N). \tag{2.11} \]
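The relation between equation (2.9), the branch condition $\mathrm{Im}\, w\, \mathrm{Im}\, z \ge 0$ and the density (2.10) can be checked numerically. The sketch below solves the quadratic $v w^2 + z w + 1 = 0$ equivalent to (2.9), selects the admissible root, and compares it with a midpoint-rule quadrature of $\int \rho_w(\lambda)(\lambda - z)^{-1} d\lambda$. The function names, the quadrature rule, and the test point $z$ are assumptions made here for illustration.

```python
import cmath
import math


def w(z, v=1.0):
    """Root of v*w^2 + z*w + 1 = 0 (equivalent to (2.9)) with Im w * Im z > 0."""
    root = cmath.sqrt(z * z - 4.0 * v)
    for candidate in ((-z + root) / (2.0 * v), (-z - root) / (2.0 * v)):
        if candidate.imag * z.imag > 0:
            return candidate
    raise ValueError("z must have a nonzero imaginary part")


def semicircle_density(lam, v=1.0):
    """Density (2.10): sqrt(4v - lam^2) / (2 pi v) on the support, zero outside."""
    return math.sqrt(max(4.0 * v - lam * lam, 0.0)) / (2.0 * math.pi * v)


def stieltjes_by_quadrature(z, v=1.0, steps=20000):
    """Midpoint rule for the integral of rho_w(lam) / (lam - z) over [-2 sqrt(v), 2 sqrt(v)]."""
    edge = 2.0 * math.sqrt(v)
    h = 2.0 * edge / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        lam = -edge + (k + 0.5) * h
        total += semicircle_density(lam, v) / (lam - z)
    return total * h


z = 0.3 + 1.0j
residual = abs(w(z) - 1.0 / (-z - w(z)))            # how well (2.9) is satisfied
mismatch = abs(w(z) - stieltjes_by_quadrature(z))   # agreement with the density (2.10)
print("residual of (2.9):", residual)
print("difference from the quadrature:", mismatch)
```

Since the two roots of the quadratic have product $1/v > 0$, their imaginary parts have opposite signs for non-real $z$, which is why exactly one of them passes the branch test.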
Thus, one gets the same eigenvalue distribution in the opposite limiting transitions of narrow ($\alpha = 0$) and wide ($\alpha = 1$) band widths. It is known that in the intermediate regime $0 < \alpha < 1$ the limiting distribution differs from the semicircle (2.11) [26]. In the present paper we concentrate on the most interesting case $\alpha = 0$; that is, we always consider the regime $1 \ll b \ll n$. As we have noted, the paper is aimed to detect the role of the parameter $\beta = \lim_{N \to \infty} b^2/N$. To avoid technical problems, we restrict ourselves to the range
\[ b = n^\chi, \quad 1/3 < \chi < 1. \tag{2.12} \]
We are convinced that our results are valid on the whole range $0 < \chi < 1$.

3. Main Propositions and Scheme of the Proof

The resolvent
\[ G^{(n,b)}(z) = \left(H^{(n,b)} - zI\right)^{-1}, \quad \mathrm{Im}\, z \ne 0, \]
is widely exploited in the spectral theory of operators. Its normalized trace $\langle G^{(n,b)}(z) \rangle$ coincides with the Stieltjes transform $f_{n,b}(z)$ (2.7):
\[ \langle G^{(n,b)}(z) \rangle = \frac{1}{N}\, \mathrm{Tr} \left(H^{(n,b)} - zI\right)^{-1} = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{\lambda_j^{(n,b)} - z}. \]
The results of this section concern the asymptotic behavior of $G^{(n,b)}(z)$ in the limit (2.12), with $z \in \Lambda_\eta$,
\[ \Lambda_\eta = \{z \in \mathbb{C} : |\mathrm{Im}\, z| \ge \eta\} \quad \text{with } \eta = 2\sqrt{v} + 1. \tag{3.1} \]

3.1. Main technical results. Our first statement concerns the pointwise convergence of the diagonal entries $G^{(n,b)}(x, x; z)$, $|x| \le n$, of the resolvent. Let us determine the set
\[ B_L \equiv B_L(n, b) = \{x \in \mathbb{Z} : |x| \le n - bL\}. \tag{3.2} \]

Theorem 3.1. Given $\varepsilon > 0$, there exists a natural $L$ such that
\[ \sup_{x \in B_L} |G^{(n,b)}(x, x; z) - w(z)| \le \varepsilon, \quad \forall\, z \in \Lambda_\eta, \tag{3.3} \]
for sufficiently large $b, n$.
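The result of Theorem 3.1 is interesting by itself. We shall use it heavily in the proof of the following statement concerning the correlation function
\[ C_{n,b}(z_1, z_2) = \mathbf{E}\, \langle G(z_1) \rangle \langle G(z_2) \rangle - \mathbf{E}\, \langle G(z_1) \rangle\ \mathbf{E}\, \langle G(z_2) \rangle. \tag{3.4} \]

Theorem 3.2. If $z_i \in \Lambda_\eta$, $i = 1, 2$, then in the limit $n, b \to \infty$ (2.12),
\[ C_{n,b}(z_1, z_2) = \frac{1}{Nb}\, S(z_1, z_2) + o\left(\frac{1}{Nb}\right). \tag{3.5} \]
The explicit form of $S(z_1, z_2)$ is given by the relation
\[ S(z_1, z_2) = \frac{2v}{\left(1 - v w_1^2\right)\left(1 - v w_2^2\right)}\, Q(z_1, z_2), \tag{3.6} \]
where $w_j \equiv w(z_j)$, $j = 1, 2$, and $Q(z_1, z_2)$ is given by the formula
\[ Q(z_1, z_2) = \frac{1}{2\pi} \int_{\mathbb{R}} \frac{w_1^2 w_2^2\, \tilde u_F(p)}{\left[1 - v w_1 w_2\, \tilde u_F(p)\right]^2}\, dp, \]
where we denote by $\tilde u_F(p)$ the Fourier transform of $u$,
\[ \tilde u_F(p) = \int_{\mathbb{R}} u(t)\, e^{ipt}\, dt. \]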
It should be noted that in the case of GOE (2.5) relation (3.5) is valid with $b$ replaced by $N$ and expression (3.6) takes the following form (see e.g. [14, 20]):
\[ S_{GOE}(z_1, z_2) = \frac{2v^2}{\left(1 - v w_1^2\right)\left(1 - v w_2^2\right)} \cdot \frac{w_1^2 w_2^2}{\left[1 - v w_1 w_2\right]^2}. \tag{3.7} \]
Let us briefly explain why (3.6) and (3.7) lead to different asymptotic expressions on the local scale determined as
\[ z_1^{(N)} = \lambda + \frac{r_1}{N} + i0, \qquad z_2^{(N)} = \lambda + \frac{r_2}{N} - i0 \tag{3.8} \]
with $\lambda \in \mathrm{supp}\, d\sigma_w$ (2.10). It follows from equality (2.9) that
\[ \frac{w_1^2 w_2^2}{\left[1 - v w_1 w_2\right]^2} = \left(\frac{w_1 - w_2}{z_1 - z_2}\right)^2. \tag{3.9} \]
This expression tends to infinity in the limit (3.8), and $v\, w(z_1) w(z_2) \to 1$ as well. But after dividing by $N^2$, one obtains from (3.7) and (3.9) that
\[ \frac{1}{N^2}\, S_{GOE}(z_1, z_2) = -\frac{1}{(r_1 - r_2)^2}\, (1 + o(1)). \tag{3.10} \]
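Identity (3.9) follows from (2.9) by writing $z = -1/w - v w$, which gives $z_1 - z_2 = (w_1 - w_2)(1 - v w_1 w_2)/(w_1 w_2)$. A small numerical check is sketched below; it also shows that $v w_1 w_2$ approaches 1 when the two points approach the same $\lambda$ inside the support from opposite half-planes. The helper names and the sample values of $N$, $\lambda$, $r_1$, $r_2$ and of the small imaginary part standing in for $\pm i0$ are assumptions made for this illustration.

```python
import cmath


def w(z, v=1.0):
    """Root of v*w^2 + z*w + 1 = 0 (equivalent to (2.9)) with Im w * Im z > 0."""
    root = cmath.sqrt(z * z - 4.0 * v)
    for candidate in ((-z + root) / (2.0 * v), (-z - root) / (2.0 * v)):
        if candidate.imag * z.imag > 0:
            return candidate
    raise ValueError("z must have a nonzero imaginary part")


def both_sides_of_39(z1, z2, v=1.0):
    """Return the left-hand and right-hand sides of identity (3.9)."""
    w1, w2 = w(z1, v), w(z2, v)
    left = (w1 * w2 / (1.0 - v * w1 * w2)) ** 2
    right = ((w1 - w2) / (z1 - z2)) ** 2
    return left, right


left, right = both_sides_of_39(0.5 + 1.0j, -0.2 - 0.7j)
print("relative difference in (3.9):", abs(left - right) / abs(left))

# Local scale (3.8), with a small imaginary part eps in place of +i0 and -i0.
N, lam, r1, r2, eps = 1000, 0.3, 1.0, 0.0, 1e-4
w1 = w(lam + r1 / N + 1j * eps)
w2 = w(lam + r2 / N - 1j * eps)
gap = abs(1.0 - w1 * w2)  # v = 1 here, so this is |1 - v w1 w2|
print("|1 - v w1 w2| on the local scale:", gap)
```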
The left-hand side of (3.10) is usually called the wide (or smoothed) version of the eigenvalue density correlation function, and the expression on the right-hand side of (3.10) has been derived by various methods for different random matrix ensembles [13, 14, 6, 20]. In Sect. 6 we study $S(z_1, z_2)$ with the spectral parameters $z_1, z_2$ given by (3.8). Now the singularity of $Q(z_1, z_2)$ is determined by the convergence of $1 - v w_1 w_2\, \tilde u_F(p)$ to zero. This convergence depends on the behavior of $\tilde u_F(p)$ around the origin $p = 0$; that is why the rate of decay of $u(t)$ at infinity dictates the form of the limiting expression for $S$ on the local scale.
3.2. The method and short proof of semicircle law. We prove Theorem 3.1 in Sects. 4 and 5. Our argument is based on the moment relations approach for resolvents of random matrices proposed and developed in [21, 22, 28]. This technique has proved to be rather general, powerful and applicable to various random matrix ensembles. We use a modified version of this approach needed to study the rather complex case of band random matrices. To make the subsequent exposition more transparent, let us describe the principal points of this method in application to the simplest case represented by GOE (2.5).

3.2.1. Families of averaged moments. In the early 70s F. Berezin observed [1] that, regarding correlation functions of the formal density of states $\rho_N(\theta) = \sigma_N'(\theta)$,
\[ P_k^{(N)}(\Theta_k) = \mathbf{E}\, \rho_N(\theta_1) \cdots \rho_N(\theta_k), \quad \Theta_k = (\theta_1, \dots, \theta_k), \]
one can derive for them a system of relations that resembles equalities for correlation functions of statistical mechanics. In this system $P_k^{(N)}$ is expressed via the sum of $P_{k-1}^{(N)}$, $P_{k+1}^{(N)}$ and some terms that vanish in the limit $N \to \infty$. This can be rewritten in the vector form
\[ P^{(N)} = P_0 + B P^{(N)} + \varphi^{(N)}, \]
with a certain operator $B$ and vector $\varphi^{(N)}$ such that $\|B\| < 1$ and $\|\varphi^{(N)}\| = o(1)$ in the appropriate Banach space. These properties prove the existence of $\lim_{N \to \infty} P^{(N)} = P$; the special form of $B$ implies that the limiting $P$ is nonrandom with the components $\prod \rho(\theta_j)$.

This approach got its rigorous formulation on the base of the resolvent approach used first in the random matrix theory in the pioneering work [24]. Regarding the resolvent $G_N$, the main subject is given by the infinite system of moments
\[ L_k^{(N)}(X_k, Y_k; Z_k) = \mathbf{E} \prod_{j=1}^{k} G_N(x_j, y_j; z_j), \quad k \in \mathbb{N}. \tag{3.11} \]
The technique proposed in [21, 28] and developed in [22] has been employed in the study of eigenvalue distribution of various ensembles of random operators and random matrices [20, 26].

In the present paper we use the moment relations approach in its modified version. The main observation here is that often it is sufficient to study the asymptotic behavior of $L_1^{(N)}$ and $L_2^{(N)}$ instead of the whole infinite family of the moments (3.11). This considerably reduces the amount of computations and makes the proofs more transparent. To explain the principal steps of the proofs of Theorems 3.1 and 3.2, let us present here the short proof of the semicircle law for GOE.

3.2.2. Derivation of system of relations. The main ingredients in the derivation of moment relations are the resolvent identity
\[ G'(z) - G(z) = -G(z)\left(H' - H\right) G'(z), \tag{3.12} \]
where $G'(z) = (H' - z)^{-1}$, $G(z) = (H - z)^{-1}$, and the equality
\[ \mathbf{E}\, \gamma F(\gamma) = \mathbf{E}\, \gamma^2\ \mathbf{E}\, F'(\gamma), \tag{3.13} \]
where $\gamma$ is a Gaussian random variable with zero mathematical expectation and $F(t)$, $t \in \mathbb{R}$, is a nonrandom function such that all integrals in (3.13) exist. Equality (3.13) is a simple consequence of the integration by parts formula. Let us consider (3.12) with $H' = A_N$ (2.5) and $H = 0$. We obtain the relation
\[ G_N(x, y) = \zeta \delta_{xy} - \zeta \sum_{s=1}^{N} G_N(x, s) A_N(s, y), \qquad \zeta \equiv -z^{-1}. \tag{3.14} \]
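Both ingredients can be checked numerically on small examples. The sketch below verifies the Gaussian identity (3.13) by quadrature for the resolvent-like test function $F(t) = (t - z)^{-1}$ and verifies relation (3.14) entrywise for one fixed $2 \times 2$ real symmetric matrix. The helper names, the test function, the variance and the matrix entries are all assumptions chosen for illustration; they are not taken from the original text.

```python
import math


def gaussian_mean(func, variance, steps=20000, cutoff=10.0):
    """Riemann-sum approximation of E func(gamma) for a centered Gaussian gamma."""
    sigma = math.sqrt(variance)
    h = 2.0 * cutoff * sigma / steps
    total = 0.0
    for k in range(steps + 1):
        t = -cutoff * sigma + k * h
        total += func(t) * math.exp(-t * t / (2.0 * variance))
    return total * h / math.sqrt(2.0 * math.pi * variance)


def resolvent_2x2(A, z):
    """(A - z)^{-1} for a real symmetric 2 x 2 matrix A, from the explicit inverse."""
    a, b, c = A[0][0], A[0][1], A[1][1]
    det = (a - z) * (c - z) - b * b
    return [[(c - z) / det, -b / det], [-b / det, (a - z) / det]]


z = 0.5 + 1.0j
variance = 0.7

# Identity (3.13) with F(t) = 1 / (t - z), whose derivative is -1 / (t - z)^2.
lhs_313 = gaussian_mean(lambda t: t / (t - z), variance)
rhs_313 = variance * gaussian_mean(lambda t: -1.0 / (t - z) ** 2, variance)
print("difference in (3.13):", abs(lhs_313 - rhs_313))

# Relation (3.14): G(x, y) = zeta * delta_xy - zeta * sum_s G(x, s) A(s, y), zeta = -1/z.
A = [[0.3, -0.7], [-0.7, 1.1]]
G = resolvent_2x2(A, z)
zeta = -1.0 / z
residual_314 = max(
    abs(G[x][y] - (zeta * (1.0 if x == y else 0.0)
                   - zeta * sum(G[x][s] * A[s][y] for s in range(2))))
    for x in range(2) for y in range(2)
)
print("largest residual in (3.14):", residual_314)
```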
s=1
Regarding the normalized trace fN (z) = N −1
GN (x, x) ≡ GN
and using (3.13), we obtain relation EfN = ζ − ζ
N v ∂GN (x, s) (1 + δxs ) E . 2 N ∂AN (s, x)
(3.15)
x,s=1
One can easily find the partial derivatives with the help of (3.12). Remembering that H are real symmetric matrices, we have ∂G(x, y) = − [G(x, s)G(t, y) + G(x, t)G(s, y)] (1 + δst )−1 . ∂H (s, t)
(3.16)
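Formula (3.16) can be compared with a central finite difference on a $2 \times 2$ real symmetric matrix, where an off-diagonal entry $H(s, t)$ and its mirror $H(t, s)$ are varied together. This is a minimal sketch; the matrix entries, the spectral parameter, the step size and all function names are assumptions made for illustration.

```python
def resolvent(a, b, c, z):
    """(H - z)^{-1} for the real symmetric matrix H = [[a, b], [b, c]]."""
    det = (a - z) * (c - z) - b * b
    return [[(c - z) / det, -b / det], [-b / det, (a - z) / det]]


def derivative_formula(G, x, y, s, t):
    """Right-hand side of (3.16)."""
    delta = 1.0 if s == t else 0.0
    return -(G[x][s] * G[t][y] + G[x][t] * G[s][y]) / (1.0 + delta)


def derivative_numeric(a, b, c, z, x, y, s, t, h=1e-6):
    """Central difference of G(x, y) with respect to the symmetric entry H(s, t) = H(t, s)."""
    def shifted(sign):
        da = sign * h if (s, t) == (0, 0) else 0.0
        db = sign * h if s != t else 0.0
        dc = sign * h if (s, t) == (1, 1) else 0.0
        return resolvent(a + da, b + db, c + dc, z)
    return (shifted(1.0)[x][y] - shifted(-1.0)[x][y]) / (2.0 * h)


a, b, c, z = 0.3, -0.7, 1.1, 0.2 + 1.5j
G = resolvent(a, b, c, z)
max_error = max(
    abs(derivative_formula(G, x, y, s, t) - derivative_numeric(a, b, c, z, x, y, s, t))
    for x in range(2) for y in range(2) for (s, t) in [(0, 0), (0, 1), (1, 1)]
)
print("largest deviation from (3.16):", max_error)
```

The factor $(1 + \delta_{st})^{-1}$ accounts for the fact that a diagonal entry appears once in the matrix, while an off-diagonal entry appears twice because of the symmetry.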
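Substituting (3.16) into (3.15), we obtain the first main relation for $L_1^{(N)}$,
\[ \mathbf{E} f_N = \zeta + \zeta v\, [\mathbf{E} f_N]^2 + \varphi_1^{(N)} + \psi_1^{(N)}, \tag{3.17} \]
where
\[ \varphi_1^{(N)} = \zeta v N^{-1}\, \mathbf{E}\, \langle G_N^2 \rangle \quad \text{and} \quad \psi_1^{(N)} = \zeta v\, \mathbf{E}\, f_N^\circ f_N^\circ, \]
and we denoted by $\xi^\circ$ the centered random variable $\xi^\circ = \xi - \mathbf{E}\, \xi$. Clearly, $\langle G \rangle^\circ = \langle G^\circ \rangle$ (here and till the end of the subsection we omit the subscript $N$ in $G_N$). If one can show that the two last terms in (3.17) vanish as $N \to \infty$, then the convergence $\mathbf{E} f_N(z) \to w(z)$ will be proved. We estimate the term $\varphi_1$ with the help of two elementary inequalities that hold for the resolvent of a real symmetric matrix:
\[ |f_N(z)| \le \|G(z)\| \le |\mathrm{Im}\, z|^{-1} \quad \text{and} \quad \|G^2(z)\| \le |\mathrm{Im}\, z|^{-2}. \]
The last estimate implies that
\[ \sum_{s} |G(x, s)|^2 = \|G e_x\|^2 \le |\mathrm{Im}\, z|^{-2}. \tag{3.18} \]
Inequality (3.18) means that if $z \in \Lambda_\eta$, then $\left|\varphi_1^{(N)}\right| \le v \eta^{-3} N^{-1}$.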
Asymptotic Expansions and Scales of Spectral Universality
231 (N)
3.2.3. Selfaveraging property. To show that limN→∞ ψ1 ance of fN vanishes
= 0, we prove that the vari-
2 VarfN = E fN◦ = O(N −2 ).
(3.19)
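It is clear that $\mathrm{Var} f_N = \mathbf{E}\,\bar f_N^\circ f_N^\circ = \mathbf{E}\,\bar f_N^\circ f_N$, where we denoted $\bar f_N = f_N(\bar z)$. Applying (3.14) to the last factor $f_N$, we see that
$$\mathbf{E}\,\bar f_N^\circ f_N = -\frac{\zeta}{N} \sum_{x,s=1}^{N} \mathbf{E}\, \bar f_N^\circ\, G(x,s) A_N(s,x).$$
Then using (3.13) and (3.16), we derive the relation
$$\mathbf{E}\,\bar f_N^\circ f_N = \zeta v\, \mathbf{E}\,\bar f_N^\circ f_N f_N + \phi_2^{(N)} + \psi_2^{(N)}, \tag{3.20}$$
where $\phi_2^{(N)} = \zeta v N^{-1}\, \mathbf{E}\,\bar f_N^\circ \langle G^2 \rangle$ and
$$\psi_2^{(N)} = 2 \zeta v N^{-2}\, \mathbf{E}\,\langle \bar G^2 G \rangle.$$
The useful observation is that
$$\mathbf{E}\,\bar f_N^\circ f_N f_N = \mathbf{E}\,\bar f_N^\circ f_N\; \mathbf{E} f_N + \mathbf{E}\,\bar f_N^\circ f_N^\circ f_N. \tag{3.21}$$
Using this identity and taking into account estimates (3.19), we derive from (3.18) that
$$\left|\mathbf{E}\,\bar f_N^\circ f_N\right| \le v \eta^{-1} \left[\, \left|\mathbf{E}\,\bar f_N^\circ f_N\right| \cdot \mathbf{E}\,|f_N| + \mathbf{E}\,\left|\bar f_N^\circ f_N^\circ\right| \cdot |f_N| \,\right] + v \eta^{-3} N^{-1}\, \mathbf{E}\left|f_N^\circ\right| + 2 v \eta^{-4} N^{-2}.$$
Taking into account that $|f_N^\circ|^2 \equiv \bar f_N^\circ f_N^\circ$, we finally obtain
$$\mathbf{E}\left|f_N^\circ\right|^2 \equiv \mathbf{E}\,\bar f_N^\circ f_N \le 2 v \eta^{-2}\, \mathbf{E}\left|f_N^\circ\right|^2 + v \eta^{-3} N^{-1} \left(\mathbf{E}\left|f_N^\circ\right|^2\right)^{1/2} + 2 v \eta^{-4} N^{-2}. \tag{3.22}$$
This immediately implies (3.19) provided $z \in \Lambda_\eta$ (3.1). Obviously, $\psi_1^{(N)}$ admits the same estimate.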
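The passage from (3.22) to (3.19) is a quadratic inequality in $(\mathbf{E}|f_N^\circ|^2)^{1/2}$. A possible way to spell it out is sketched below. The shorthand $X$, $\tau$, $a$, $c$ is introduced here for illustration only, (3.22) is read as $X \le 2\tau X + a X^{1/2} + c$, and the smallness condition $\tau = v\eta^{-2} < 1/4$ is the one used later in the proof of Proposition 4.1.

```latex
% Sketch: (3.22) implies (3.19). The notation X, \tau, a, c is ours, not the paper's.
\begin{align*}
X &\le 2\tau X + a\,X^{1/2} + c,
  \qquad X = \mathbf{E}|f_N^\circ|^2,\quad \tau = v\eta^{-2},\quad
  a = v\eta^{-3}N^{-1},\quad c = 2v\eta^{-4}N^{-2},\\
X &\le 2a\,X^{1/2} + 2c
  \qquad \text{(since } 1 - 2\tau \ge 1/2 \text{ when } \tau < 1/4\text{)},\\
X^{1/2} &\le a + \sqrt{a^2 + 2c}
  \qquad \text{(solving the quadratic inequality in } X^{1/2}\text{)},\\
X &\le 2a^2 + 2\,(a^2 + 2c) = 4a^2 + 4c = O(N^{-2}).
\end{align*}
```

The last line uses $(p+q)^2 \le 2p^2 + 2q^2$; both $a^2$ and $c$ are of order $N^{-2}$ for fixed $v$ and $\eta$.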
3.2.4. The semicircle law and further corrections. Returning to (3.17) and gathering the estimates for $\phi_1^{(N)}$ and $\psi_1^{(N)}$, one can easily derive that if $z \in \Lambda_\eta$ (3.1), then $\lim_{N\to\infty} g_N(z) = w(z)$, with $w(z)$ given by (2.10). Convergence of the Stieltjes transforms implies convergence of the corresponding measures. Thus the semicircle law is proved. It should be noted that relation (3.21) can be transformed into
$$\mathbf{E}\,\bar f_N^\circ f_N f_N = 2\, \mathbf{E}\,\bar f_N^\circ f_N\; \mathbf{E} f_N + \mathbf{E}\,\bar f_N^\circ f_N^\circ f_N^\circ.$$
Substituting this into (3.18), we see that
$$\mathrm{Var} f_N = \frac{2 \zeta v}{N^2}\, \frac{1}{1 - 2\zeta v\, \mathbf{E} f_N}\; \mathbf{E}\,\langle \bar G^2 G \rangle + \frac{1}{1 - 2\zeta v\, \mathbf{E} f_N} \left[ \phi_2^{(N)} + \mathbf{E}\,\bar f_N^\circ f_N^\circ f_N^\circ \right]. \tag{3.23}$$
Using the resolvent identity
$$G(z_1) G(z_2) = \frac{G(z_1) - G(z_2)}{z_1 - z_2}, \qquad G(z_i) = (H - z_i)^{-1}, \tag{3.24}$$
and the convergence of $\mathbf{E} f_N(z)$, one can easily find the limiting expression for $\mathbf{E}\,\langle \bar G^2 G \rangle$. If one assumes that the last two terms in (3.23) are of the order $o(N^{-2})$, then one arrives at (3.7) (see e.g. [20] for more details).

4. Correlation Function of the Resolvent

Our approach is to apply systematically the scheme of Subsect. 3.2.2 to get the leading term of the correlation function $C^{(n,b)}(z_1, z_2)$ (3.4). This term is expressed via the limit $\lim \mathbf{E}\,\langle G^{(n,b)}(z) \rangle = w(z)$, but we have to prove the pointwise version of this convergence given by Theorem 3.1. This and other auxiliary propositions are addressed in Subsects. 4.1 and 4.2. In Subsect. 4.3 we give the proof of Theorem 3.2 on the basis of these statements. In what follows, we omit the super- and subscripts $n$ and $b$ and do not indicate the limits of summation when no confusion can arise.
4.1. Proof of Theorem 3.1. Using relations (3.12)–(3.14) with obvious changes and repeating the computations of Subsect. 3.2.2, we obtain the relation
$$\mathbf{E}\,G(x,x) = \zeta + \zeta v\, \mathbf{E}\,G(x,x)\, U_G(x) + \zeta v \sum_{|s| \le n} \mathbf{E}\,[G(x,s)]^2\, U(s,x), \tag{4.1}$$
where
$$U_G(x) = \sum_{|s| \le n} G(s,s)\, U(s,x) = \frac{1}{b} \sum_{|s| \le n} G(s,s)\, u\!\left(\frac{s-x}{b}\right).$$
Let us denote the average EG(x, x) by g(x) and rewrite (4.1) in the following form: g(x) = ζ + ζ v 2 g(x) Ug (x) + b1 8(x) + D(x),
(4.2)
where we denoted (cf. (3.17)) 8(x) = ζ v
|s|≤n
s−x E [G(x, s)] u b
2
(4.3)
and ◦ D(x) = ζ vE G◦ (x, x) UG (x).
(4.4)
Let us consider the solution $\{r(x), |x| \le n\}$ of the equation
$$r(x) = \zeta + \zeta v\, r(x)\, U_r(x), \qquad |x| \le n. \tag{4.5}$$
Given $z \in \Lambda_\eta$ (3.1), one can prove that the system of equations (4.5) is uniquely solvable in the set of $N$-dimensional vectors $\{\vec r\,\}$ such that
$$\|\vec r\,\|_1 = \sup_{|x| \le n} |r(x)| \le 2 \eta^{-1}, \qquad \eta = |\mathrm{Im}\, z| \tag{4.6}$$
(see Lemma 4.1 at the end of this section). Certainly, r(x) depends on particular values of z, n and b, so in fact we use the notation r(x) = rn,b (x; z). The following statements concern the differences: Dn,b (x; z) = gn,b (x; z) − rn,b (x; z),
dn,b (x; z) = rn,b (x; z) − w(z),
where w(z) is given as a solution of (2.9). Proposition 4.1. Given ε > 0, there exists a number L = L(ε) such that for all sufficiently large b and n satisfying (2.13) sup dn,b (x; z) ≤ ε, z ∈ +η , (4.7) x∈BL
with BL given by (3.2). Proposition 4.2. If z ∈ +η (3.1) and (2.13) holds, then sup |Dn,b (x; z)| = o(1),
|x|≤n
n, b → ∞.
(4.8)
Theorem 3.1 follows from (4.7) and (4.8). Under the same conditions one can find L ≥ L such that ζ sup − w(z) ≤ 2ε. (4.9) x∈BL 1 − ζ vUg (x) Relation (4.9) follows from (3.3) added by (4.6), a priori estimate sup |g(x)| ≤
|x|≤n
1 , |Im z|
and observation that L has to satisfy condition u(L − L ) ≤ ε.
(4.10)
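The statements around (4.5)–(4.7) can be illustrated numerically. The sketch below is an illustration under assumptions, not the authors' procedure: the profile $u(t) = \tfrac12 e^{-|t|}$, the parameters $z = 3i$, $v = 1$, $n = 60$, $b = 6$ and the helper names are all chosen here. It runs the successive approximations $r^{(k+1)}(x) = \zeta + \zeta v\, r^{(k)}(x)\, U_{r^{(k)}}(x)$ with $r^{(1)}(x) = \zeta$ and compares the result with the root $w(z)$ of $w = \zeta + \zeta v w^2$, which is the limiting form of (3.17).

```python
import cmath
import math

def solve_band_system(z, v, n, b, u, iterations=60):
    """Iterate r_{k+1}(x) = zeta + zeta*v*r_k(x)*U_{r_k}(x), r_1(x) = zeta, for |x| <= n."""
    zeta = -1.0 / z
    sites = range(-n, n + 1)
    # U(s, x) = u((s - x) / b) / b depends only on s - x; tabulate it once.
    kernel = {k: u(k / b) / b for k in range(-2 * n, 2 * n + 1)}
    r = {x: zeta for x in sites}
    for _ in range(iterations):
        r = {x: zeta + zeta * v * r[x] * sum(r[s] * kernel[s - x] for s in sites)
             for x in sites}
    return r

def semicircle_w(z, v):
    """Root of v*w^2 + z*w + 1 = 0 (i.e. w = zeta + zeta*v*w^2) with Im w * Im z > 0."""
    root = cmath.sqrt(z * z - 4.0 * v)
    candidates = [(-z + root) / (2.0 * v), (-z - root) / (2.0 * v)]
    return next(w for w in candidates if w.imag * z.imag > 0)

u = lambda t: 0.5 * math.exp(-abs(t))    # illustrative profile with unit integral (assumption)
z, v, n, b = 3.0j, 1.0, 60, 6            # illustrative parameters (assumption)
r = solve_band_system(z, v, n, b, u)
w = semicircle_w(z, v)

sup_norm = max(abs(val) for val in r.values())   # compare with the bound 2/eta in (4.6)
bulk_gap = abs(r[0] - w)                          # site far from the endpoints
edge_gap = abs(r[n] - w)                          # site at the endpoint
print(sup_norm, bulk_gap, edge_gap)
```

In this example the solution stays within the bound (4.6), it is close to $w(z)$ in the bulk, and the deviation is concentrated near the endpoints $\pm n$, which is the reason why (4.7) is stated on the set $B_L$.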
Proof of Proposition 4.1. Let us consider the constant function wx (z) ≡ w(z) satisfying (2.10) that we rewrite in the following form similar to (4.5): wx (z) = ζ + ζ vwx (z)
1 bδxt wt (z), b
|x| ≤ n.
|t|≤n
Subtracting this equality from (4.5), we derive that d(x) ≡ dn,b (x; z) verifies equality d(x) = ζ vd(x)Ur (x) + ζ vw(z)Ud (x) + ζ vw 2 (z) [Pb + T (x)] , where
∞ 1 t Pb = − u u(s)ds b b −∞
(4.11a)
1 1 t −x t Tn,b (x) = − . u u b b b b
(4.11b)
t∈Z
and
|t|≤n
t∈Z
It is clear that |Pb | = o(1) as b → ∞. Indeed, one can determine an even step-like function ud (t), t ∈ R such that k ud (t) = u I k−1 k (t), t ≥ 0. b b ,b k∈N
Then $u_d(t) \le u(t)$ and $u_d(t) \to u(t)$ as $b \to \infty$, and the Beppo Levi theorem implies convergence of the corresponding integrals of (4.10). Taking into account the equality
$$r(x) = \frac{\zeta}{1 - v \zeta\, U_r(x)},$$
we can write that
$$d(x) = v w\, r(x)\, U_d(x) + v w^2\, r(x) \left[ P_b + T_{n,b}(x) \right], \tag{4.12}$$
where we denoted w ≡ w(z). This relation, together with estimates (4.6) and |w(z)| ≤ |Im z|−1 , implies inequality |d(x)| ≤ 2|Im z|−1 < 1.
∞ Given ε > 0, let us find such a number Q that 2 Q u(t)dt < ε. Denoting τ = vη−2 < 1/4, we derive from (4.12) inequality 1 sup |d(x)| ≤ τ , sup |d(x)| + sup Tn,b (x) + Pb + ε + b x∈BL x∈BL−Q x∈BL where we have used condition (2.3) and the estimate ∞ 1 s−x 1 ≤2 |d(x)|u u(t)dt + b b b Q s:|s−x|>Qb
that follows from monotonicity of u(t). Now it is easy to conclude that 1 j sup |d(x)| ≤ sup Tn,b (x) + Pb + τ L/Q sup |d(x)| + ε + . τ b |x|≤n x∈BL x∈BL−j Q 0≤j ≤L/Q
(4.13) Let us choose such M that τ M < ε. Then τ j sup Tn,b (x) ≤ 0≤j ≤L/Q
x∈BL−j Q
sup
x∈BL−MQ
Tn,b (x) + 2ε.
Finally, we observe that 2 sup Tn,b (x) ≤ b x∈BL−MQ
∞ t=n−(L−MQ)b
∞ t 1 ≤2 u u(s)ds + . b b n/b−L+MQ
Now it is clear that (4.7) holds for sufficiently large L and 1 b n. Proposition 4.1 is proved. Proof of Proposition 4.2. Subtracting (4.5) from (4.2), we obtain a relation for D(x) = Dn,b (x), D(x) = ζ vD(x)Ur (x) + ζ vg(x)UD (x) + ζ v b1 8(x) + D(x) that can be rewritten in the form D(x) = vg(x)r(x)UD (x) + vr(x)
1
b 8(x) + D(x)
.
Regarding this relation as the coordinate form of a vector equality, one can write that −1 1 (r) = v I − W (g,r) (r) , D +D b8 where we denote by W (g,r) a linear operator acting on vectors e with components e(x) as e(s)U (s, x) W (g,r) e (x) = vg(x)r(x) |s|≤n
and vectors
(r) (x) = r(x)φn,b (x), 8 n,b
(r) (x) = r(x)D(x). D
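It is easy to see that if $z \in \Lambda_\eta$, then the estimates (4.6) and (4.10) imply the inequality
$$\left\| W^{(g,r)} \right\| \le \frac{v}{\eta^2} < 1/2. \tag{4.14}$$
Thus, to prove Proposition 4.2, it is sufficient to show that
$$\sup_x \left| \sum_s \mathbf{E}\,[G(x,s)]^2\, u\!\left(\frac{s-x}{b}\right) \right| = O(1), \qquad z \in \Lambda_\eta \tag{4.15}$$
and
$$\sup_x \left| \mathbf{E}\, G^\circ(x,x)\, U_G^\circ(x) \right| = o(1), \qquad z \in \Lambda_\eta. \tag{4.16}$$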
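The inversion of $I - W^{(g,r)}$ used in this reduction relies only on the norm bound (4.14), so the inverse can be written as a convergent Neumann series. The sketch below illustrates this mechanism under assumptions that are ours: the profile $u(t) = \tfrac12 e^{-|t|}$, the constant stand-in $v\zeta^2$ with $\zeta = i/\eta$ for the coefficient $v\,g(x)\,r(x)$, the right-hand side and all helper names are hypothetical.

```python
import math

def band_kernel(n, b, u):
    """Rows indexed by x, columns by s: entry U(s, x) = u((s - x) / b) / b."""
    sites = range(-n, n + 1)
    return [[u((s - x) / b) / b for s in sites] for x in sites]

def apply_W(coeff, kernel, e):
    """(W e)(x) = coeff(x) * sum_s e(s) * U(s, x)."""
    return [c * sum(k * es for k, es in zip(row, e)) for c, row in zip(coeff, kernel)]

def neumann_solve(coeff, kernel, rhs, terms):
    """Partial sum  sum_{m=0}^{terms} W^m rhs  of the series for (I - W)^{-1} rhs."""
    term, total = list(rhs), list(rhs)
    for _ in range(terms):
        term = apply_W(coeff, kernel, term)
        total = [t + dt for t, dt in zip(total, term)]
    return total

u = lambda t: 0.5 * math.exp(-abs(t))                  # illustrative profile (assumption)
n, b, v, eta = 30, 5, 1.0, 3.0                         # illustrative parameters (assumption)
kernel = band_kernel(n, b, u)
coeff = [v * (1j / eta) * (1j / eta)] * (2 * n + 1)    # stand-in for v*g(x)*r(x)
# Operator norm of W with respect to the sup-norm of vectors.
op_norm = max(abs(c) * sum(row) for c, row in zip(coeff, kernel))
rhs = [1.0 / (1 + abs(x)) for x in range(-n, n + 1)]   # arbitrary bounded right-hand side
sol = neumann_solve(coeff, kernel, rhs, terms=40)
# Residual of (I - W) sol = rhs in the sup-norm.
residual = max(abs(s - ws - r0)
               for s, ws, r0 in zip(sol, apply_W(coeff, kernel, sol), rhs))
print(op_norm, residual)
```

Since the norm is below $1/2$, the tail of the series after $M$ terms is bounded by $2^{-M}$ times the norm of the right-hand side, which is why only bounds on the right-hand side, such as (4.15) and (4.16), are needed.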
Relation (4.15) is a consequence of the bound (2.2) and inequality (3.18). Relation (4.16) reflects the selfaveraging property of $G^{(n,b)}$. This question is addressed in the next subsection. It should be noted that (4.16) will be proved independently of the computations of this subsection. Assuming that this is done, we can say that Theorem 3.1 is proved. We complete this subsection with the proof of the following auxiliary statement.

Lemma 4.1. Equation (4.5) has a unique solution in the class of vectors satisfying condition (4.6).

Proof. Let us consider the sequence of $N$-dimensional vectors $\vec r^{\,(k)}$, $k \in \mathbb{N}$, determined by the relations for their components
$$r^{(k+1)}(x) = \zeta + \zeta v\, r^{(k)}(x)\, U_{r^{(k)}}(x), \qquad r^{(1)}(x) = \zeta, \qquad |x| \le n.$$
Then it is easy to derive that if r(k) satisfies (4.6) and z ∈ +η (3.1), then r(k+1) also satisfies (4.6). The difference χk+1 (x) = r (k+1) (x) − r (k) (x) satisfies relations χk+1 (x) = ζ vχk (x)Ur (k) (x) + ζ vr (k−1) (x)Uχk (x). Obviously, χk+1 1 ≤ α χk 1 with α < 1 provided z ∈ +η . The lemma is proved. 4.2. The variance and selfaveraging property. The asymptotic relation (4.15) is a consequence of the fact that the variance of G(x, x) ◦ ◦ ◦ 2 Var G(n,b) = E G(n,b) = E G(n,b) (z) G(n,b) (¯z) vanishes as n, b → ∞. Instead of the direct proof of (4.15), we prefer to present the whole list of more general statements needed in studies of the correlation function of G. All of them can be proved independently of Theorem 3.1 without use of its statement. We start the list with the following three relations that concern the moments of diagonal elements of G. Proposition 4.3. If z ∈ +η (3.1), then the estimates 2 sup E G◦ (x, x; z) = O(b−1 ), |x|≤n
◦ 2 sup E UG (x) = O(b−2 ),
(4.18)
◦ 4 (x) = O(b−4 ), sup E UG
(4.19)
|x|≤n
and
|x|≤n
hold.
(4.17)
The following statement concerns the mixed moments of variables G◦ (x, x; z) and their sums. Proposition 4.4. If z ∈ +η , then relations sup
|x|,|y|≤n
◦ EG (x, x)U ◦ (y) = O b−2 , G
sup E G◦ G◦ (x, x) = O n−1 b−1 + b−1 [Var G]1/2 ,
|x|≤n
and sup
|x|,|y|≤n
◦ ◦ E G G (x, x)U ◦ (y) = O n−1 b−2 + b−2 [Var G]1/2 G
(4.20)
(4.21)
(4.22)
are true in the limit n, b → ∞. Finally, we formulate Proposition 4.5. If z ∈ +η , then relation
◦ 2 2 sup E G1 [G2 (x, s)] ub (s, x) = O n−1 b−2 + b−2 [Var G]1/2 (4.23) |x|≤n s
is true in the limit $n, b \to \infty$. Let us note that (4.21)–(4.23) can also be replaced by estimates in terms of $n$ and $b$ that do not involve the variance of $G$. However, the derivation of such estimates would take more space, and we restrict ourselves to the forms presented. It will be shown later that $\mathrm{Var}\, G = O(n^{-1} b^{-1})$. This fact together with the restriction (2.12) implies for (4.22) and (4.23) that
$$\frac{1}{b^2}\, \frac{1}{\sqrt{nb}} \ll \frac{1}{nb},$$
which is sufficient for us. We prove Propositions 4.3–4.5 in Sect. 5.

4.3. Toward the correlation function. Let us have a closer look at the correlation function
$$C_{n,b}(z_1, z_2) = \mathbf{E}\, \langle G^{(n,b)}(z_1) \rangle^\circ\, \langle G^{(n,b)}(z_2) \rangle^\circ.$$
We follow the scheme described at the end of Subsect. 3.2 and introduce the variables $G_j(x,y) = G^{(n,b)}(x,y; z_j)$, $j = 1, 2$. To study the average
E G◦1 G2 (x, x) = R12 (x), we apply to G2 (x, x) the resolvent identity (3.12) and obtain the relation E G◦1 G2 (x, s)a(s, x) U (s, x), R12 (x) = −ζ2 |s|≤n
where ζ2 = −z2−1 . We compute the last mathematical expectation with the help of formulas (3.13) and (3.16) and obtain equality (cf. (4.1)) R12 (x) = ζ2 vR12 (x)Ug2 (x) + ζ2 vUR12 (x)g2 (x) + 2ζ2 vN −1 EG21 (x, s)G2 (x, s)U (s, x) s
+ ζ2 v [612 (x) + ϒ12 (x)] , where we denoted g2 (x) = EG(x, x; z2 ), Ug2 (x) = g2 (s)U (s, x), |s|≤n
UR12 (x) =
R12 (s)U (s, x),
|s|≤n
612 (x) = E
G◦1
! [G2 (x, s)] U (s, x) , 2
|s|≤n
and
◦ (x)G◦2 (x) . ϒ12 (x) = E G◦1 UG 2
Using the notation q2 (x) =
ζ , 1 − ζ vUg2 (x)
(4.24)
we obtain the following relation for R12 : R12 (x) = vq2 (x)g2 (x)UR12 (x) +
2vq2 (x) F12 (x, s)U (s, x) N s
+ vq2 (x)[612 (x) + ϒ12 (x)],
(4.25)
where we denoted F12 (x, s) = EG21 (x, s)G2 (x, s). The terms 6 and ϒ can be estimated with the help of Propositions 4.3–4.5. As we shall see in the next subsection, they do not contribute to the leading term of R12 . To obtain the explicit expression for this leading term, it is necessary to study in detail the variable F12 . Now let us formulate the corresponding statement and the auxiliary relations needed. Proposition 4.6. If z ∈ +η , (3.1), then for arbitrary positive ε and large enough values of b and n (2.13) there exists the set BL (3.2) with L such that u˜ F (p) 1 w12 w2 sup b[F12 U ](x, x) − dp (4.26) ≤ ε. 2 2 2π 1 − vw1 R 1 − vw1 w2 u˜ F (p) x∈BL The proof of Proposition 4.6 is based on the similar statement formulated for the product G1 G2 .
Proposition 4.7. Given positive ε, there exists such L that relations k w1 w2 u˜ F (p) 1 sup b EG1 (x, s)G2 (x, s) U k (s, x) − dp ≤ ε (4.27) 2π R 1 − vw1 w2 u˜ F (p) x∈BL |s|≤n and
w w 1 2 ≤N sup EG1 (x, s)G2 (x, s) − 1 − vw1 w2 x∈BL |s|≤n
(4.28)
hold for all k ≥ 1, all zi ∈ +η and large enough values of b. Remark. In the case when z1 = z2 , relation (4.28) can be derived from the resolvent identity (3.24) with the help of the convergence (3.3) and the explicit form of w(z) (2.9). We prove Proposition 4.6 in the next subsection. Relations (4.27) and (4.28) will be proved in Sect. 5. 4.4. Proof of Proposition 4.6 and Theorem 3.2. Let us assume that relations (4.27) and (4.28) are true and show that under conditions of Theorem 3.2 the leading term of R12 is of the order O(n−1 b−1 ) and terms 612 and ϒ12 of (5.2) do not contribute to it. We rewrite (4.25) in the form R12 (x) = vg2 (x)q2 (x)UR12 (x) + 2vq2 (x)N −1 [F12 U ] (x, x) + vq2 (x) [612 (x) + ϒ12 (x)] .
(4.29)
Let us denote r12 = sup|x|≤n |R12 (x)|. Taking into account U (x, y) ≤ u/b ¯ (2.2) and using inequalities of the form (3.19), it is easy to see that if zi ∈ +η , then 1/2 1/2 1 1 1 2 2 2 E . |G1 (x, s)| |G2 (s, x)| =O [F12 U ] (x) ≤ N Nb nb s s Regarding this estimate and relations (4.22), (4.23) we easily derive from (4.29) inequality (cf. (3.22)) 1√ v C r12 ≤ 2 r12 + + r12 η bn b2 with some constant C. Since r12 is bounded for all z ∈ +η , then 1 1 r12 = O . + nb b4 Now condition (2.12) implies that r12 = O(1/nb) and therefore the general form of (3.5) is demonstrated. Substituting (3.5) into the estimates (4.22) and (4.23), we obtain that 1 1 612 1 = o and ϒ12 1 = o . nb nb Thus, these terms of (4.29) do not contribute to the leading term of R12 . To find this term in explicit form, we need the result of Proposition 4.6.
Proof of Proposition 4.6. Regarding F12 (x, y) = E G21 (x, y)G2 (x, y), we apply to G2 the resolvent identity (3.12). Computations similar to those of Subsect. 3.2.2 lead us to equality F12 (x, y) = ζ2 δxy EG21 (x, x) + ζ2 v [t12 U ] (x, y) EG21 (y, y) + ζ2 v [F12 U ] (x, y) g1 (y) + F12 (x, y)U[g2 ] (y) + O(x, y) , (4.30) where t12 (x, y) = ET12 (x, y) = EG1 (x, y)G2 (x, y), and the vanishing terms are denoted by O = O1 + O2 + O3 : O1 (x, y) = E G1 (x, y)G21 (s, y)G2 (x, s) + 2G21 (x, y)G2 (s, y)G2 (x, s) U (s, y), s
◦ O2 (x, y) = E [T12 U ] (x, y) G21 (y, y) + E [F12 U ] (x, y)G◦1 (y, y) , and
◦ O3 (x, y) = EF12 (x, y)U[G (y). 2]
Indeed, it is easy to show that sup |Oj (x, y)| = O(b−1 ), x,y
z1 , z2 ∈ +η .
(4.31)
This can be done with the help of inequality (3.19) and relations (4.17), (4.18), and (4.23). Using definition of q2 (x) (4.24), we rewrite (4.30) as F12 (x, y) = vg1 (x)q2 (y) [F12 U ] (x, y) ˜ +R (1) (x, y) + R (2) (x, y) + v O(x, y),
(4.32)
where we denoted R (1) (x, y) = q2 (x) EG21 (x, x)δxy , R
(2)
(x, y) = vq2 (y) [t12 U ] (x, y)
(4.33a) EG21 (y, y)
(4.33b)
˜ and O(x, y) = O(x, y)q2 (y). Let us note that |R(1)| ≤ η−3 and |R (2) | ≤ vη−5 for zi ∈ +η . Let us determine the linear operator W that acts on N × N matrices F according to the formula [W F ](x, y) = vg1 (x) F (x, s)U (s, y) q2 (y). |s|≤n
The a priori estimates $|g_1(x)| \le |\mathrm{Im}\, z_1|^{-1}$ and $|q_2(x)| \le |\mathrm{Im}\, z_2|^{-1}$ imply the inequality (cf. (4.14))
$$\|W\|_{(1,1)} \le \frac{v}{\eta^2} < \frac{1}{2}, \qquad z_i \in \Lambda_\eta, \tag{4.34}$$
where the norm of N × N matrix A is determined as A(1,1) = supx,y |A(x, y)|. This estimate is verified by direct computation of W A(1,1) with A(1,1) = 1. Then (4.32) can be rewritten as F12 (x, y) = v
W m R (1) + R (2) + v O˜ (x, y).
∞
(4.35)
m=0
The next steps of the proof of (4.26) are very elementary. We consider the first M terms of the infinite series and use the decay of the matrix elements U (x, y) = U (b) (x, y). Indeed, if one considers (4.33) with x and y taken far enough from the endpoints −n, n, then the variables g1 (s), q2 (t) enter into the finite series with s and t also far from the endpoints. Then one can use relations (3.3) and (4.9) and replace g1 and q2 by the constant values w1 and w2 , respectively. This substitution leads to simplified expressions with error terms that vanish as n, b → ∞. The second step is similar. It is to show that we can use Proposition 4.7 and replace the terms R (1) and R (2) of the finite series of (4.33) by corresponding expressions given by formulas (4.27) and (4.28). Let us start to perform this program. Taking into account the estimate of O and using boundedness of terms R (1) and R (2) , we can deduce from (4.35) that b
M
W m (R (1) + R (2) )U (x, x) + Q(1) (x, x),
F12 (x, s)U (s, x) = bv
s
m=0
(4.36) where M is such that given ε > 0, |Q(1) (x, x)| < ε for large enough b and n. Now let us find such h that the following holds u(t)dt ≤ ε. u(t) ≤ ε, ∀|t| ≥ h, and |t|≥h
We determine the matrix
$$\hat U(x,y) = \begin{cases} U(x,y), & \text{if } |x - y| \le bh; \\ 0, & \text{if } |x - y| > bh \end{cases}$$
and denote by $\hat W$ the corresponding linear operator
$$[\hat W F](x,y) = v\, g_1(x) \sum_{|s| \le n} F(x,s)\, \hat U(s,y)\; q_2(y).$$
Certainly, $\hat W$ admits the same estimate as (4.34). Given $\varepsilon > 0$, let $L$ be the largest number among those required by the conditions of Propositions 4.1 and 4.7. Let us denote by $Q$ the smallest natural number greater than $(M + k)h$. Then one can write that
$$bv \sum_{m=0}^{M} \left[ W^m \left( R^{(1)} + R^{(2)} \right) U \right](x,x) = bv \sum_{m=0}^{M} \left[ (v \hat W)^m \left( R^{(1)} + R^{(2)} \right) \hat U \right](x,x) + Q^{(2)}(x,x),$$
where
$$\sup_{x \in B_{L+Q}} \left| Q^{(2)}(x,x) \right| \le \varepsilon \qquad \text{as } n, b \to \infty. \tag{4.37}$$
The proof of (4.37) uses elementary computations. Indeed, Q(2) (x, x) is represented as the sum of M + 1 terms of the form bv m+1
∗
[g1 (x)]m F (x, x1 )U (x, x1 )q2 (x1 ) · · · U (xm−2 , xm−1 )q2 (xm−1 )
|xi |≤n
× R (1) + R (2) (xm−1 , xm )U (xm , x), where the sum is taken over the values of xi such that xj − xj +1 > bh at least for one of the numbers j ≤ m. Now remembering the a priori bounds for R (1) and R (2) , estimates like (4.13) and taking into account the diagonal form of R (1) , one obtains the following estimate of Q(2) by two terms: ∗ M bv m+1 sup Q(2) (x, x) ≤ U (x, x1 )U (x1 , x2 ) · · · U (xm , x) η2m+3 |x|≤n m=0
+
M m=0
|xi |≤n
v m+1 η2m+5
∗
U (x, x1 ) · · · U (xm−2 , xm−1 )U (xm , x). (4.38)
|xi |≤n
Regarding the first term in the right-hand side of (4.38) and assuming that |xj − xj +1 | > bh, one can observe that for large enough b, n, U j (x, xj )εU m−j (xj +1 , x) ≤ ε. |xj |≤n
Indeed, U (x, x1 )U (x1 , x2 ) · · · U (xj −1 , xj ) ≤ U (x, x1 )U (x1 , x2 ) · · · U (xj −1 , xj ) |xi |≤n
xi ∈Z
&
≤
∞
−∞
u(t)dt +
u(0) b
'j
≤ (1 + u/b) ¯ j.
Let us also mention here that given ε > 0, one has for large enough n, b that j sup U (x, s) − 1 ≤ ε, (4.39) x∈BL+Q |s|≤n where j ≤ M. This follows from elementary computations related with the differences (4.11) that vanish in the limit 1 b n. Similar but a little more modified reasoning can be used to estimate the second term in the right-hand side of (4.38). Now one can write that sup |Q(2) (x, x)| ≤ 2ε
|x|≤n
M mv m+1 ≤ ε. η2m+2
m=0
Regarding the right-hand side of (4.37) with x ∈ BL+Q , one observes that the summations run over such values of xi that |x − x1 | ≤ bh, |xi − xi+1 | ≤ bh, and thus xj ∈ BL for all j ≤ k + m − 1. This means that we can apply relations (3.3) and (4.9) to the right-hand side of (4.37) and replace g1 (x) by w(z1 ), q2 (x) by w(z2 ). We derive from (4.36) that F12 U k (x, x) = bvw2
M
(vw1 w2 )m (Uˆ m )(x, s) R (1) + R (2) (s, t)Uˆ (t, x) + Q(3) (x, x)
m=0
with
sup Q(3) (x, x) ≤ 4ε.
x∈BL+Q
Finally, applying Proposition (4.7) to the expressions involved in R and taking into account that 1 m+1 sup |bU (x, x) − (4.40) u˜ m+1 F (p)dp| ≤ ε, 2π x∈BL+Q we obtain equality M v w12 w2 m u˜ m+1 (vw1 w2 ) (F12 U )(x, x) = F (p)dp 2π 1 − vw12 m=0 M u˜ m+1 v w12 w2 m F (p) + (vw w ) dp + Q(5) (x, x) 1 2 2 2π 1 − vw1 1 − vw1 w2 u˜ F m=0
(4.41) with
sup Q(5) (x, x) ≤ ε
x∈BL+Q
b, n → ∞.
Passing back in (4.41) to the infinite series and simplifying them, we arrive at the expression standing in the right-hand side of (4.26). The proposition is proved. Let us complete the proof of Theorem 3.2. Remembering estimate (4.14), we can iterate relation (4.29) and obtain that R12 (x) =
∞ 2vq2 (x) (g2 ,q2 ) m W f12 (x) + o(1/nb), Nb m=0
where we denoted f12 (x) = bq2 (x)[F12 U ](x, x). Regarding the trace 1 1 R12 (x) = R12 (x)(1 + o(1)), N N |x|≤n
x∈BL
and repeating the arguments of the proof of Proposition 4.6 presented above, we can write that R12 (x) =
M 2vw2 (bF12 U )(t, t)(vw22 U )m (t, x) + Q(6) (x, x) Nb t m=0
with $\sup_{x \in B_L} |Q^{(6)}(x,x)| \le \varepsilon$ provided $n, b \to \infty$ (2.12). Finally, observing that $(b F_{12} U)(t,t)$ asymptotically does not depend on $t$ (4.26), we arrive, with the help of (4.39), at the expression (3.6). Theorem 3.2 is proved.

5. Proof of Auxiliary Statements

Proof of Proposition 4.3. Let us consider the average $\mathbf{E}\, G_1^\circ(x,x) G_2(y,y)$ and derive for it, with the help of formulas (3.12), (3.13) and (3.16), the equality
$$\mathbf{E}\, G_1^\circ(x,x) G_2(y,y) = \zeta_2 v\, \mathbf{E}\, G_1^\circ(x,x) G_2(y,y)\, U_{G_2}(y) + \zeta_2 v \sum_s \mathbf{E}\, G_1^\circ(x,x) [G_2(y,s)]^2\, U(s,y) + 2 \zeta_2 v \sum_s \mathbf{E}\, G_1(x,s) G_1(y,x) G_2(y,s)\, U(s,y).$$
Applying to the first term of this equality the analogue of identity (3.21) and using q2 (x) (4.24), we obtain that ◦ (y) E G◦1 (x, x)G2 (y, y) = vq2 (y)E G◦1 (x, x)G2 (y, y)UG 2 ◦ + vq2 (y) E G1 (x, x) [G2 (y, s)]2 U (s, y) s
+ 2vq2 (y)
E G1 (x, s)G1 (y, x)G2 (y, s)U (s, y). (5.1)
s
We multiply both sides of this relation by U (x, t) and sum it over x; then we get ◦ ◦ ◦ (t, t)G2 (y, y) = vq2 (y)E UG (t)G2 (y, y) UG (y) E UG 1 1 2 ◦ + vq2 (y) E UG (t) [G2 (y, s)]2 U (s, y) 1 s
+ 2vq2 (y)
E G1 (x, s)G1 (y, x)G2 (y, s)U (s, y)U (x, t).
s
(5.2) Regarding G1 (y, ·)U (·, t) and G2 (y, ·)U (·, y) in the last term as vectors in N -dimensional space, we derive from estimate (3.19) that E G1 (x, s)G1 (y, x)G2 (y, s)U (s, y)U (x, t) s,x 1/2 1/2 2 2 |G1 (y, x)U (x, t)| |G2 (y, s)U (s, y)| ≤ G1 . (5.3) x
s
Inequality (3.18) implies that the right-hand side of (5.3) is bounded by b−2 η−3 .
Let us multiply both sides of (5.2) by U (y, r) and sum them over y. Then one obtains a relation that together with (3.18) and (5.3) implies the following estimate for variable 2 1/2 ◦ : M12 = supx E UG1 (x) M12 ≤ vη−2 M12 + vη−3 b−1 M12 + 2vη−4 b−2 . This proves (4.18). Now (4.17) follows from (4.18) and relation (5.1). To derive estimate (4.19), let us consider the variable ◦ ◦ ◦ ◦ ◦ ◦ ◦ ◦ (x1 )UG (x2 )UG (x3 )UG (x4 ) = E UG (x)UG (x)UG (x3 ) UG4 (x4 ). E UG 1 2 3 4 1 2 3 ◦ U ◦ U ◦ and M(x , x , x , t) = ET ◦ G (t, t). We apply to Let us denote T = UG 1 2 3 4 1 G2 G3 G4 (t, t) the resolvent identity (3.14) and obtain the relation
ET ◦ G4 (t, t) = vζ4 ET ◦ G4 (t, t)UG4 (t) + vζ4 ET ◦ + vζ4
(i,j,k)
◦ ◦ EUG (xi )UG (xj ) i j
[G4 (s, t)]2 U (s, t)
s
Gk (y, s)Gk (t, y)U (y, xk )G4 (t, s)U (s, t).
x,s,t
(5.4) Repeating previous computations and applying similar estimates, we obtain inequality v v ◦ ◦ M(x1 , x2 , x3 , t)U (t, x4 ) ≤ E|T UG (x4 )| + E|T | E|UG (x4 )| 4 4 η η t v 3v ◦ ◦ + 3 E|T | + 2 E UG (x )U (x ) . i j G i j η b ηb (5.5) Here we have applied inequalities (3.18) and (5.3) to the last two terms of relation (5.4). Now it is clear that (5.5) implies (4.19). Proposition 4.3 is proved. Proof of Proposition 4.4. Estimate (4.20) follows from relation (5.2) and estimate (4.18). Regarding (5.1) and summing it over x, one can easily derive (4.21) with the help of the arguments used to prove (4.18). Let us turn to the proof of (4.22). To do this, let us consider the variable
◦ ◦ ◦ K(x, y) = E G◦ G◦ (x, x)UG (y) = E G◦ UG (y) G(x, x) and apply to the last expression resolvent identity (3.12) and formulas (3.13) and (3.16). We obtain an equality that can be written in the following form with the notation ◦ (y): R = G◦ UG E R ◦ G(x, x) = ζ vE R ◦ G(x, x)UG (x) +
i=1,2,3
κi (x, y),
(5.6)
where κ1 (x, y) = ζ v
s
κ2 (x, y) = 2ζ v
s,t
and
E R ◦ G(x, s)G(x, s)U (s, x),
E G◦ G(t, s)G(x, t)u2b (t, y)G(x, s)U (s, x),
κ3 (x, y) = 2ζ vN −1
s,t
◦ EG(t, s)G(x, t)UG (y)G(x, s)u2b (s, x).
Let us use the identity E R ◦ XY = E RX◦ E Y + E RY ◦ E X + E RX ◦ Y ◦ − E R E X◦ Y ◦ , and rewrite (5.6) in the form E R ◦ G(x, x) =
◦ vq(x) ◦ ◦ (x)G◦ (x, x) − E G◦ UG (y)EG◦ (x)UG (x) E RUG 1 − vq(x)g(x) vq(x) + κi (x, y). (5.7) 1 − vq(x)g(x) i=1,2,3
Taking into account relation (4.18), inequalities (3.18) and (5.3), we obtain that |κi (x, y)| ≤ 2η−2 b−2 (Var G)1/2 and
for i = 1, 2
|κ3 (x, y)| ≤ 2η−3 b−2 N −1 .
Using them, we derive from (5.7) inequality ! ◦ 4 1/2 2 1/2 −1 1/2 −2 ◦ |K(x, y)| ≤ 2η (Var G) E UG (x) +b E UG (x) +2η−1 b−2 (Var G)1/2 + 2η−2 b−2 N −1 . This leads to estimate (4.22). Proposition 4.4 is proved.
Proof of Proposition 4.5. This proof of the estimate (4.23) is the most cumbersome. Here we have to use the resolvent identity (3.12) twice. However, the computations are based on the same inequalities as those of the proofs of Propositions 4.3 and 4.4. Therefore we just indicate the principal lines of the proof and do not present the derivations of estimates. To compute the mathematical expectation
E M(x, s) = E G◦1 [G2 (x, s)]2 , let us apply to G2 (x, s) the resolvent identity (3.12). We obtain equality u(0) ◦ E G1 G2 (x, x) b
◦ −ζ2 E G1 G2 (x, s)G2 (x, t)a(t, s) U (t, s).
E M(x, s) = ζ2
t
(5.8)
Relation (4.21) implies that the first term of the right-hand side of (5.8) is the value of the order indicated in (4.23). Let us consider the second term of (5.8). We compute the mathematical expectation with the help of relations (3.13) and (3.16) and obtain expression 5
ζ2 E G◦1 G2 (x, s)G2 (x, t)a(t, s) U (t, s) = 6i (x, s), t
(5.9)
i=1
where
61 (x, s) = vζ2 E G◦1 G2 (x, s)G2 (x, s) EUG (s),
◦ 62 (x, s) = vζ2 E G◦1 G2 (x, s)G2 (x, s)UG (s), 2vζ2 63 (x, s) = G21 (s, t)U (t, s)G2 (x, s)G2 (x, t), E N t
◦ 64 (x, s) = vζ2 E G1 [G2 (x, t)]2 U (t, s)G2 (s, s), t
and
G2 (x, s)G2 (x, t)G2 (s, t)U (t, s). 65 (x, s) = 2vζ2 E G◦1 t
61 is of the form vζ2 EM(x, s)EUG (s) and can be put to the right-hand side of (5.9). The terms 62 and 63 are of the order indicated in the right-hand side of (4.23). This can be shown with the help of estimates of the form (5.3). Regarding 64 , we apply the resolvent identity (3.12) to factor G2 (s, s). Repeating the usual computations based on (3.13) and (3.16), we obtain that 64 (x, s) = vζ22
EM(x, t)U (t, s) + vζ2 64 (x, s)EUG2 (s) + U(x, s),
(5.10)
t
where U gathers the terms that are all of the order indicated in (4.23). This can be verified by direct computation with the use of estimates (4.18), (4.21), and (4.22). Not to overload this paper, we do not write down the terms constituting U and do not present their(estimates as well. Relation (5.10) is of the form that leads to the estimates needed for s EM(x, s)U (s, x). Regarding 65 (x, s), we apply (3.12) to G2 (s, t) and obtain, after the use of (3.13) and (3.16) that 65 (x, s) = 2vζ22
u(0) EM(x, s) + vζ 65 (x, s)EUG2 (s) + U (x, s), b
(5.11)
where U (x, s) consists of the terms that are of the order indicated in (4.23). The form of (5.11) is also such that, being substituted into (5.9) and then into (5.8), it leads to the estimates needed. This observation shows that (4.23) is true. Proof of Proposition 4.7. We prove relation (4.27) with k = 1 because the general case does not differ from this one. To derive relations for the average value of the variable
t12 (x, y) = EG1 (x, y)G2 (x, y), we use identities (3.12)–(3.14) and repeat the proof of Proposition 4.6. Simple computations lead us to the equality t12 (x, y) = g1 (x)ζ2 δxy + ζ2 v 2 t12 (x, y)Ug2 (y) + ζ2 v 2
t12 (x, s)U (s, y) g1 (y) + ζ2 v 2
s
where ϒ1 (x, y) = E
4
ϒj (x, y),
(5.12)
j =1
G1 (x, y)G2 (x, s)G2 (y, s)U (s, x),
s
ϒ2 (x, y) = E
G1 (x, y)G1 (s, y)G2 (x, s)U (s, x),
s
◦ ϒ3 (x, y) = EG1 (x, y)G2 (x, y)UG (y), 2
and
ϒ4 (x, y) = EG◦2 (y, y)
G1 (x, s)G2 (x, s)U (s, y).
s
It is easy to see that inequality (4.16) implies estimates −1 −3 ϒ1 (x, y) ≤ b−1 η−3 . sup |ϒ1 (x, y)| ≤ b η , sup x,y x y
The same is valid for ϒ2 (x, y). Similar estimates for ϒ3 (x, y) and ϒ4 (x, y) follow from relations (4.17) and (4.18). Thus, (5.12) implies that t12 (x, y) = g1 (x)q2 (x)δxy + vg1 (y)q2 (y) [t12 U ] (x, y) + Q(x, y), where sup |Q(x, y)| = o(1) x,y
and
sup Q(x, y) = o(1) x
(5.13)
(5.14)
y
in the limit n, b → ∞ (2.12). We rewrite relation (5.13) in the matrix form (cf. (4.35)) −1 t12 = I − W (g,q) Diag(g1 q2 ) + Q ∞
W (g,q)
=
m
(Diag(g1 q2 ) + Q) .
(5.15)
m=0
Now we can apply to (5.15) the same arguments as to (4.35). Replacing g1 (x) and q2 (x) by w1 and w2 , respectively, we derive from (5.14) that for x ∈ BL+Q , t12 (x, s) =
M m=0
(w1 w2 )m+1 U m (x, s) + o(1), n, b → ∞.
(5.16)
Multiplying both sides of (5.16) by U (s, x) and summing it over s, we obtain the relation
t12 (x, s)U (s, x) =
M
(w1 w2 )m+1 U m+1 (x, x) + o(1), n, b → ∞.
m=0
|s|≤n
(5.17) Now convergence (4.40) implies the relation that leads, with M replaced by ∞, to (4.27). To prove (4.28), let us sum (5.16) over s. The second part of (5.14) tells us that the terms Q remain small when summed over s. Thus we can write relations s
t12 (x, s) =
M
(w1 w2 )m+1
U m (x, s) + o(1), n, b → ∞.
(5.18)
s
m=0
Taking into account the estimates for the terms (4.11), it is easy to observe that the convergence (4.39) together with (5.18) implies (4.28).

6. Asymptotic Properties of $S(z_1, z_2)$

In the last decade, the main focus of the spectral theory of random matrices has been the universality conjecture for local spectral statistics, first put forward by F. Dyson [13]. This problem is addressed in a large number of papers where various random matrix ensembles are studied using different approaches (see e.g. the review [16]). The best understood are the Gaussian Unitary Ensemble (GUE) and its real symmetric analogue GOE (see (2.5)). The probability distribution of these ensembles is invariant with respect to unitary (orthogonal) transformations. This leads to the fact that the joint probability distribution of the eigenvalues of these ensembles does not depend on the distribution of the eigenvectors and is given in explicit form [25]. This allows one to use the powerful technique of orthogonal polynomials, which provides detailed information on the spectral properties of GUE, GOE and related ensembles on the local scale (see [7, 34] for the initial results for Gaussian ensembles and [3, 11] for their generalizations). The case of band random matrices is different because the probability distribution of the ensemble $H^{(n,b)}$ (2.4) is no longer invariant under transformations of the coordinates. One of the possible ways to study the spectral properties of $H^{(n,b)}$ is to follow the resolvent expansions approach well known in theoretical physics (see, for example, [14]). A rigorous version of it has been developed in a series of papers [21, 22, 20]. In the framework of the resolvent approach (see [20] for details), one considers the correlation function $C_{n,b}(z_1, z_2)$, $\mathrm{Im}\, z_j \neq 0$ (3.4), in the limit when the dimension $N$ of the matrix tends to infinity.
The asymptotic expression for $S(z_1, z_2)$ regarded in the limit $z_1 = \lambda_1 + i0$, $z_2 = \lambda_2 - i0$ supplies one with information about the local properties of the eigenvalue distribution provided $\lambda_1 - \lambda_2 = O(N^{-1})$. Indeed, according to (2.7), the formal definition of the eigenvalue density $\rho_{n,b}(\lambda) = \sigma'_{n,b}(\lambda)$ is
$$\rho_{n,b}(\lambda) = \frac{1}{2i} \left[ f_{n,b}(\lambda + i0) - f_{n,b}(\lambda - i0) \right].$$
Then one can consider the expression
$$R_{n,b}(\lambda_1, \lambda_2) = -\frac{1}{4} \sum_{\delta_1, \delta_2 = -1, +1} \delta_1 \delta_2\, C_{n,b}(\lambda_1 + i\delta_1 0,\ \lambda_2 + i\delta_2 0)$$
as the correlation function of $\rho_{n,b}$. In general, even if $R_{n,b}$ can be rigorously determined, it is difficult to study it directly. Taking into account relation (3.5), one can pass to the simpler expression
$$V_{n,b}(\lambda_1, \lambda_2) = -\frac{1}{4Nb} \sum_{\delta_1, \delta_2 = -1, +1} \delta_1 \delta_2\, S(\lambda_1 + i\delta_1 0,\ \lambda_2 + i\delta_2 0) \tag{6.1}$$
and assume that it corresponds to the leading term of $R_{n,b}(\lambda_1, \lambda_2)$ in the limit $n, b \to \infty$. In the present section we follow the same heuristic scheme. It should be noted that for Wigner random matrices this approach is justified by the study of the simultaneous limiting transition $N \to \infty$, $\mathrm{Im}\, z_j \to 0$ in the studies of $C_N(z_1, z_2)$ [5, 19].

Theorem 6.1. Let $S(z_1, z_2)$ be given by (3.6). Assume that the function $\tilde u_F(p)$ is such that there exist positive constants $c_1$, $\delta$ and $\nu > 1$ such that
$$\tilde u_F(p) = \tilde u_F(0) - c_1 |p|^\nu + o(|p|^\nu) \tag{6.2}$$
for all $p$ such that $|p| \le \delta$, $\delta \to 0$. Then
$$V_{n,b}(\lambda_1, \lambda_2) = \frac{1}{Nb}\, \frac{c_2}{|\lambda_1 - \lambda_2|^{2 - 1/\nu}}\, (1 + o(1)) \tag{6.3}$$
for $\lambda_j$, $j = 1, 2$, satisfying
$$\lambda_1, \lambda_2 \to \lambda \in (-2\sqrt{v},\ 2\sqrt{v}). \tag{6.4}$$
Proof of Theorem 6.1. Let us start with the terms of (6.1) that correspond to $\delta_1 \delta_2 = -1$. It follows from (2.9) that

\[ \frac{1 - v w_1 w_2}{w_1 w_2} = \frac{z_1 - z_2}{w_1 - w_2}. \tag{6.5} \]

Also, for the real and imaginary parts of $w(\lambda + i0) = \tau(\lambda) + i\rho(\lambda)$, we have

\[ \tau^2 = \frac{\lambda^2}{4v^2}, \qquad \rho^2 = \frac{4v - \lambda^2}{4v^2} \tag{6.6} \]

(here and below we omit the variable $\lambda$). This implies the existence of the limits $w(z_1) = \overline{w(z_2)}$ for (6.4). One can easily deduce from (6.5) that in the limit (6.4)

\[ \frac{1 - v\, w(z_1) w(z_2)}{w(z_1) w(z_2)} = \frac{\lambda_1 - \lambda_2}{2i\rho(\lambda)} = o(1). \tag{6.7} \]

Also we have that

\[ (1 - v w_1^2)(1 - v w_2^2) = 2 - 2v(\tau^2 - \rho^2) = 4v\rho^2. \tag{6.8} \]
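Relations (6.5), (6.6) and (6.8) can be checked numerically. The sketch below is only an illustration: relation (2.9) is not reproduced in this section, so the code assumes that $w(z)$ is the root of $v w^2 + z w + 1 = 0$ whose imaginary part has the same sign as $\operatorname{Im} z$, which is consistent with (6.5) and (6.6). The function names and the sample values are hypothetical.

```python
import cmath

def w(z, v):
    """Root of v*w**2 + z*w + 1 = 0 with Im w and Im z of the same sign.
    This quadratic is an ASSUMED form of relation (2.9)."""
    disc = cmath.sqrt(z * z - 4.0 * v)
    for root in ((-z + disc) / (2.0 * v), (-z - disc) / (2.0 * v)):
        if root.imag * z.imag > 0:
            return root
    raise ValueError("z must have a nonzero imaginary part")

def check_identities(lam1, lam2, v, eps=1e-9):
    """Return the absolute errors in (6.5), (6.6) and (6.8) for
    z1 = lam1 + i*eps and z2 = lam2 - i*eps."""
    z1, z2 = complex(lam1, eps), complex(lam2, -eps)
    w1, w2 = w(z1, v), w(z2, v)
    # (6.5): (1 - v w1 w2) / (w1 w2) = (z1 - z2) / (w1 - w2)
    err_65 = abs((1 - v * w1 * w2) / (w1 * w2) - (z1 - z2) / (w1 - w2))
    # (6.6): squared real and imaginary parts of w(lam + i0)
    tau, rho = w1.real, w1.imag
    err_66 = max(abs(tau ** 2 - lam1 ** 2 / (4 * v ** 2)),
                 abs(rho ** 2 - (4 * v - lam1 ** 2) / (4 * v ** 2)))
    # (6.8): evaluated at coinciding real parts, w2 = conjugate of w1
    wc = w(complex(lam1, -eps), v)
    err_68 = abs((1 - v * w1 ** 2) * (1 - v * wc ** 2) - 4 * v * rho ** 2)
    return err_65, err_66, err_68
```

For instance, `check_identities(0.7, 0.7003, 1.5)` returns three small residuals, limited by the regularization `eps` and by floating-point rounding.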
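Now let us consider $Q(z_1, z_2)$ (3.8) and write

\[ Q(z_1, z_2) = \frac{1}{2\pi} \left( \int_{-\delta}^{\delta} + \int_{\mathbb{R} \setminus (-\delta, \delta)} \right) \frac{w_1^2 w_2^2\, \tilde u_F(p)}{\left[ 1 - v w_1 w_2 \tilde u_F(p) \right]^2}\, dp = I_1 + I_2, \]

where $I_1$ denotes the integral over $(-\delta, \delta)$ and $I_2$ the integral over its complement. Relations (6.5) and (6.7) imply the equality (cf. (3.9))

\[ \left[ 1 - v w_1 w_2 \tilde u_F(p) \right]^2 = [\tilde u_F(p) - 1]^2 (1 + o(1)). \tag{6.9} \]

Since $u(t)$ is monotone, $\liminf_{p \in \mathbb{R} \setminus (-\delta, \delta)} [\tilde u_F(p) - 1]^2 > 0$. This means that $I_2 < \infty$ in the limit (6.4). Regarding (6.7), we can write that in the limit (6.4)

\[ I_1 = \int_{-\delta}^{\delta} \frac{(2\pi)^{-1} w_1^2 w_2^2\, \tilde u_F(p)}{\left[ 1 - v w_1 w_2 - v w_1 w_2 \left( \tilde u_F(p) - 1 \right) \right]^2}\, dp = \int_{-\delta}^{\delta} \frac{(2\pi)^{-1}\, \tilde u_F(p) (1 + o(1))}{\left[ \dfrac{z_1 - z_2}{w_1 - w_2} - v \left( \tilde u_F(p) - 1 \right) \right]^2}\, dp. \]

Then we derive the relation

\[ I_1(\lambda_1 + i0, \lambda_2 - i0) + I_1(\lambda_1 - i0, \lambda_2 + i0) = \frac{1}{\pi} \int_{-\delta}^{\delta} \frac{v^2 [\tilde u_F(p) - 1]^2 - \left( \dfrac{\lambda_1 - \lambda_2}{2\rho} \right)^2}{\left\{ v^2 [\tilde u_F(p) - 1]^2 + \left( \dfrac{\lambda_1 - \lambda_2}{2\rho} \right)^2 \right\}^2}\, \tilde u_F(p) (1 + o(1))\, dp, \tag{6.10} \]

where $o(1)$ corresponds to (6.9) regarded in the limit (6.4). Now let us use condition (6.2) and observe that

\[ \frac{1}{\pi} \int_{-\delta}^{\delta} \frac{c_1^2 p^{2\nu} + o(p^{2\nu}) - D^2}{\left[ c_1^2 p^{2\nu} + o(p^{2\nu}) + D^2 \right]^2}\, dp = \frac{2}{\pi D^{2 - 1/\nu}} \int_0^{\delta D^{-1/\nu}} \frac{c_1^2 s^{2\nu} + o(s^{2\nu}) - 1}{\left[ c_1^2 s^{2\nu} + o(s^{2\nu}) + 1 \right]^2}\, ds, \]

where we denoted $D = |\lambda_1 - \lambda_2| / (2v\rho)$, and $o(p^{2\nu})$ corresponds to the limit $\delta \to 0$ (6.2). Now it is clear that if we take $\delta$ such that $\delta |\lambda_1 - \lambda_2|^{-1/\nu} \to \infty$, we obtain asymptotically

\[ I_1 + \bar I_1 = 4 B_\nu(c_1)\, \frac{(2v\rho)^{2 - 1/\nu}}{|\lambda_1 - \lambda_2|^{2 - 1/\nu}}, \tag{6.11} \]

where

\[ B_\nu(c_1) = \frac{1}{2\pi c_1^{1/\nu}} \left[ \int_0^\infty \frac{ds}{1 + s^{2\nu}} - 2 \int_0^\infty \frac{ds}{(1 + s^{2\nu})^2} \right]. \tag{6.12} \]

To prove relation (6.3), it remains to consider the sum $I(\lambda_1 + i0, \lambda_2 + i0) + I(\lambda_1 - i0, \lambda_2 - i0)$. It is easy to observe that relations of the form (6.8) imply boundedness of this sum in the limit (6.4). Now gathering relations (6.8) and (6.11), we derive that

\[ V_{n,b}(\lambda_1, \lambda_2) = \frac{1}{Nb}\, \frac{B_\nu(c_1)}{(2v\rho)^{1/\nu}}\, \frac{1}{|\lambda_1 - \lambda_2|^{2 - 1/\nu}}\, (1 + o(1)). \tag{6.13} \]

This proves (6.3).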
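The passage from the rescaled integral to (6.11) and (6.12) rests on two changes of variable and an elementary partial-fraction identity. The following worked step is an informal sketch in which the $o(\cdot)$ corrections are dropped for readability and the integration limit is sent to infinity; it is meant as a reading aid, not as part of the proof.

```latex
\begin{align*}
\frac{1}{\pi}\int_{-\delta}^{\delta}
  \frac{c_1^2 |p|^{2\nu} - D^2}{\bigl(c_1^2 |p|^{2\nu} + D^2\bigr)^2}\,dp
 &= \frac{2}{\pi D^{2-1/\nu}}
    \int_0^{\delta D^{-1/\nu}}
    \frac{c_1^2 s^{2\nu} - 1}{\bigl(c_1^2 s^{2\nu} + 1\bigr)^2}\,ds
 && (p = D^{1/\nu} s), \\[4pt]
\int_0^{\infty}
  \frac{c_1^2 s^{2\nu} - 1}{\bigl(c_1^2 s^{2\nu} + 1\bigr)^2}\,ds
 &= \frac{1}{c_1^{1/\nu}}
    \left[\int_0^{\infty}\frac{dt}{1 + t^{2\nu}}
        - 2\int_0^{\infty}\frac{dt}{(1 + t^{2\nu})^2}\right]
 && (t = c_1^{1/\nu} s), \\[4pt]
\text{since}\quad \frac{x - 1}{(x + 1)^2}
 &= \frac{1}{x + 1} - \frac{2}{(x + 1)^2}.
\end{align*}
```

Multiplying the two lines gives $\frac{2}{\pi D^{2-1/\nu}} \cdot 2\pi B_\nu(c_1) = 4 B_\nu(c_1) D^{-(2-1/\nu)}$, which is the right-hand side of (6.11) once $D = |\lambda_1 - \lambda_2|/(2v\rho)$ is substituted.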
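Let us discuss two consequences of Theorem 6.1. Let us assume first that $u(t)$ is such that

\[ u_2 \equiv \int t^2 u(t)\, dt < \infty. \tag{6.14} \]

Then (6.2) holds with $\nu = 2$ and $c_1 = u_2$. Regarding the right-hand side of (3.5) in the limit (6.4) with $\lambda_j = \lambda + r_j / N$, $j = 1, 2$, we obtain the asymptotic relation

\[ V(\lambda_1, \lambda_2) = \frac{\sqrt{N}}{b}\, \frac{B_2(u_2)}{2\sqrt{2}\, (v\rho)^{1/2}}\, \frac{1}{|r_1 - r_2|^{3/2}}\, (1 + o(1)), \tag{6.15a} \]

where

\[ B_2(u_2) = -\frac{1}{4\pi \sqrt{u_2}} \int_0^\infty \frac{ds}{1 + s^4} = -\frac{1}{4\pi \sqrt{u_2}}\, \Gamma\!\left( \frac{3}{4} \right) \Gamma\!\left( \frac{5}{4} \right). \tag{6.15b} \]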
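The closed form in (6.15b) can be compared with a direct numerical evaluation of $B_\nu(c_1)$ from (6.12) at $\nu = 2$. The sketch below uses only the standard library; the quadrature routine, the function names and the choice $u_2 = 1$ in the usage note are illustrative assumptions.

```python
import math

def integral_0_inf(f, n=100_000):
    """Midpoint rule for the integral of f over (0, inf),
    using the substitution s = t / (1 - t), t in (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        s = t / (1.0 - t)
        total += f(s) / (1.0 - t) ** 2   # ds = dt / (1 - t)^2
    return total * h

def B(nu, c1):
    """B_nu(c1) as written in (6.12), evaluated by quadrature."""
    first = integral_0_inf(lambda s: 1.0 / (1.0 + s ** (2 * nu)))
    second = integral_0_inf(lambda s: 1.0 / (1.0 + s ** (2 * nu)) ** 2)
    return (first - 2.0 * second) / (2.0 * math.pi * c1 ** (1.0 / nu))

def B2_closed_form(u2):
    """Right-hand side of (6.15b)."""
    return -math.gamma(0.75) * math.gamma(1.25) / (4.0 * math.pi * math.sqrt(u2))
```

Calling `B(2.0, 1.0)` and `B2_closed_form(1.0)` gives two values that agree to quadrature accuracy. The agreement reflects the identity $\int_0^\infty (1+s^4)^{-2} ds = \tfrac{3}{4} \int_0^\infty (1+s^4)^{-1} ds$, which turns the bracket in (6.12) into $-\tfrac{1}{2} \int_0^\infty (1+s^4)^{-1} ds$.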
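Now let us assume that (6.14) is not true. Suppose that there exists $1 < \nu' < 2$ such that

\[ u(t) = O\!\left( |t|^{-1 - \nu'} \right) \quad \text{as } t \to \infty. \tag{6.16} \]

Then one can easily derive that (6.2) holds with $\nu = \nu'$. This follows from elementary computations based on the equalities

\[ \tilde u_F(p) = \tilde u_F(0) - \int_{-\infty}^{\infty} (1 - \cos pt)\, u(t)\, dt \]

and

\[ \frac{1}{p} \int_{-\infty}^{\infty} (1 - \cos y)\, u(y p^{-1})\, dy = |p|^{\nu'} \int_{-\infty}^{\infty} \frac{1 - \cos y}{|y|^{1 + \nu'}}\, dy + o(|p|^{\nu'}), \quad p \to 0. \]

Therefore, if (6.16) holds, then

\[ V(\lambda_1, \lambda_2) = \frac{N^{1 - 1/\nu}}{b}\, \frac{B_\nu(c_1)}{(2v\rho)^{1/\nu}}\, \frac{1}{|r_1 - r_2|^{2 - 1/\nu}}\, (1 + o(1)). \tag{6.17} \]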
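The power of $N$ in (6.17) is what one gets from the leading term of (6.13) after the substitution $\lambda_1 - \lambda_2 = (r_1 - r_2)/N$. The short sketch below checks this bookkeeping numerically; the function names and all sample values are hypothetical, and only the leading terms (without the $1 + o(1)$ factor) are compared.

```python
def v_613(N, b, B, v_rho, dlam, nu):
    """Leading term of (6.13) as a function of |lambda_1 - lambda_2|."""
    return B / (N * b * (2.0 * v_rho) ** (1.0 / nu) * abs(dlam) ** (2.0 - 1.0 / nu))

def v_617(N, b, B, v_rho, dr, nu):
    """Leading term of (6.17) as a function of |r_1 - r_2|."""
    return (N ** (1.0 - 1.0 / nu) * B
            / (b * (2.0 * v_rho) ** (1.0 / nu) * abs(dr) ** (2.0 - 1.0 / nu)))

def relative_gap(N, b, B, v_rho, dr, nu):
    """Relative difference between (6.13) at dlam = dr / N and (6.17)."""
    a = v_613(N, b, B, v_rho, dr / N, nu)
    c = v_617(N, b, B, v_rho, dr, nu)
    return abs(a - c) / abs(c)
```

For $\nu = 2$ the prefactor becomes $\sqrt{N}/b$ and the exponent of $|r_1 - r_2|$ becomes $3/2$.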
The form of the asymptotic expressions (6.15a) and (6.17) coincides with that determined by Altshuler and Shklovski for the spectral correlation function of band random matrices (see [27] for this and similar results). In these works, the factor $|r_1 - r_2|^{-3/2}$ appeared instead of the expression $|r_1 - r_2|^{-2}$ that is usual for random matrices (see (3.10)). This has been interpreted as evidence of (relatively) localized eigenvectors of $H^{(n,b)}$ in the limit $1 \ll b \ll n$, with localization length $b^2/n$. Let us note that asymptotic expressions similar to (6.15) have also appeared in the recent work [33], where the band random matrix ensemble $H^{(n,b)}$ was considered under condition (6.14). However, it should be stressed that no explicit expressions like (3.6) and (6.15) were obtained either in [27] or in [33].
7. Summary

We consider a family of random matrix ensembles $H^{(n,b)}$ of band-type form. More precisely, we deal with real symmetric $N \times N$ matrices, $N = 2n + 1$, whose entries are jointly independent Gaussian random variables with zero mean value. The band-type form means that the variance of the matrix entries $H^{(n,b)}(x, y)$ is proportional to $u\!\left( \frac{x - y}{b} \right) \ge 0$. We study the asymptotic behavior of the correlation function

\[ C_{n,b}(z_1, z_2) = \mathbf{E} f_{n,b}(z_1) f_{n,b}(z_2) - \mathbf{E} f_{n,b}(z_1)\, \mathbf{E} f_{n,b}(z_2), \]

where $f_{n,b}(z)$ is the normalized trace of the resolvent $G^{(n,b)}(z)$ of $H^{(n,b)}$. We have proved that if $\operatorname{Im} z_j$ is large enough, then in the limit $1 \ll b \ll N^{1/3}$,

\[ C_{n,b}(z_1, z_2) = \frac{1}{Nb}\, S(z_1, z_2) + o\!\left( \frac{1}{Nb} \right). \]

We have found an explicit form of the leading term $S(z_1, z_2)$ in this limit. Assuming that the expression $V_{n,b}(\lambda_1, \lambda_2)$ (6.1) is closely related to the correlation function of the eigenvalue density, we have studied it in the limit $N, b \to \infty$ and $\lambda_1 - \lambda_2 = (r_1 - r_2)/N$. Our main conclusion is that the limiting expression for $V_{n,b}$ exhibits different behavior depending on the rate of decay of $u(t)$ at infinity. If $\int t^2 u(t)\, dt < \infty$, then (6.1) is given by

\[ -C\, \frac{\sqrt{N}}{b}\, \frac{1}{|r_1 - r_2|^{3/2}}\, (1 + o(1)), \quad C > 0. \]

If $u(t) = O(|t|^{-1 - \nu})$ with $1 < \nu < 2$, then the asymptotic expression for (7.1) is proportional to

\[ \frac{n^{1 - 1/\nu}}{b}\, \frac{1}{|r_1 - r_2|^{2 - 1/\nu}}. \]

In both cases the exponents do not depend on the particular form of the function $u(t)$. Moreover, in the first case the exponents do not depend on $u$ at all. This can be regarded as a kind of spectral universality for band random matrix ensembles. One can conjecture that these characteristics also do not depend on the probability distribution of the random variables $a(x, y)$ (2.1).

Our results show that $S(z_1, z_2)$ determines at least two scales of universality in the local spectral properties of band-type random matrices. These scales coincide with those detected in theoretical physics for the (relative) localization length and the density-density correlation function for these ensembles [27]. In the papers cited, a third regime, in which $u(t) = O(|t|^{-\gamma})$ with $\gamma \in (1/2, 1)$, has also been observed. It was shown to produce the asymptotics $N^{-2} |r_1 - r_2|^{-2}$, which is typical for "full" random matrices like GOE [7, 13, 14, 20]. Unfortunately, this asymptotic regime for band random matrices is out of reach of our technique.

Acknowledgements. The first author is grateful to Ya. Fyodorov for fruitful discussions and explanations of the role of the Altshuler-Shklovski asymptotics. The financial support from SFB237 (Germany) during the autumns of 1997 and 1999 and the kind hospitality of Ruhr-University Bochum are gratefully acknowledged by the first author. We also thank the anonymous referee for useful remarks and conjectures concerning the technical questions and the general exposition.
References

1. Berezin, F.: Some remarks on Wigner distribution. Teor. Mat. Fizika 17, 305–318 (1973) (English translation in: Theor. Math. Phys. 17, 1163 (1973))
2. Bessis, D., Itzykson, C., Zuber, J.-B.: Quantum field theory techniques in graphical enumeration. Adv. Appl. Math. 1, 109–157 (1980)
3. Bleher, P., Its, A.: Semiclassical asymptotics of orthogonal polynomials, Riemann-Hilbert problem, and universality in the matrix model. Ann. Math. 150, 185–266 (1999)
4. Bogachev, L., Molchanov, S.A., Pastur, L.A.: On the level density of random band matrices. Matem. Zametki 50, 31–42 (1991) (English transl.: Mathematical Notes 50, 1232–1242 (1991))
5. Boutet de Monvel, A., Khorunzhy, A.: Asymptotic distribution of smoothed eigenvalue density: I. Gaussian random matrices. Random Oper. Stoch. Eqs. 7, 1–22 (1999); II. Wigner random matrices. Random Oper. Stoch. Eqs. 7, 149–167 (1999)
6. Brézin, E., Zee, A.: Universality of the correlations between eigenvalues of large random matrices. Nucl. Phys. B 402 no. 3, 613–627 (1993); Ambjørn, J., Jurkiewicz, J., Makeenko, Yu.M.: Multiloop correlators for two-dimensional quantum gravity. Phys. Lett. B 251 (4), 517–524 (1990)
7. Bronk, B.V.: Accuracy of the semicircle approximation for the density of eigenvalues of random matrices. J. Math. Phys. 5, 215–220 (1964)
8. Casati, G., Girko, V.L.: Wigner's semicircle law for band random matrices. Rand. Oper. Stoch. Equations 1, 15–21 (1993)
9. Casati, G., Molinari, L., Izrailev, F.: Scaling properties of band random matrices. Phys. Rev. Lett. 64, 1851 (1990)
10. Crisanti, A., Paladin, G., Vulpiani, A.: Products of Random Matrices in Statistical Physics. Berlin: Springer, 1993
11. Deift, P.A., Its, A., Zhou, X.: A Riemann-Hilbert approach to asymptotic problems arising in the theory of random matrix models, and also in the theory of integrable statistical mechanics. Ann. Math. 146, 149–235 (1997)
12. Donoghue, W.F.: Monotone Matrix Functions and Analytical Continuation. Berlin: Springer, 1974
13. Dyson, F.J.: Statistical theory of the energy levels of complex systems (III). J. Math. Phys. 3, 166–175 (1962)
14. French, J.B., Mello, P.A., Pandey, A.: Statistical properties of many-particle spectra. II. Two-point correlations. Ann. Phys. 113, 277 (1978)
15. Fyodorov, Y.V., Mirlin, A.D.: Scaling properties of localization in random band matrices: A σ-model approach. Phys. Rev. Lett. 67, 2405 (1991)
16. Guhr, T., Müller-Groeling, A., Weidenmüller, H.A.: Random-matrix theories in quantum physics: Common concepts. Phys. Rep. 299, 189–425 (1998)
17. Guionnet, A.: Large deviations upper bounds and central limit theorems for band matrices and non-commutative functionals of Gaussian large random matrices. To appear in Annales d'IHP
18. Haake, F.: Quantum Signatures of Chaos. Berlin: Springer, 1991
19. Khorunzhy, A.: On smoothed density of states for Wigner random matrices. Rand. Oper. Stoch. Eqs. 5, 147–162 (1997)
20. Khorunzhy, A., Khoruzhenko, B., Pastur, L.: Asymptotic properties of large random matrices with independent entries. J. Math. Phys. 37, 5033–5060 (1996)
21. Khorunzhy, A., Pastur, L.: Limits of infinite interaction radius, dimensionality and number of components for random operators with off-diagonal randomness. Commun. Math. Phys. 153, 605–646 (1993)
22. Khorunzhy, A., Pastur, L.: On the eigenvalue distribution of the deformed Wigner ensemble of random matrices. In: Adv. Soviet. Math. 19, 97–107 (1994)
23. Kuś, M., Lewenstein, M., Haake, F.: Density of eigenvalues of random band matrices. Phys. Rev. A 44, 2800 (1991)
24. Marchenko, V., Pastur, L.: Eigenvalue distribution of some class of random matrices. Matem. Sbornik 72, 507 (1972)
25. Mehta, M.L.: Random Matrices. Boston: Acad. Press, 1991
26. Molchanov, S.A., Pastur, L.A., Khorunzhy, A.M.: Eigenvalue distribution for band random matrices in the limit of their infinite rank. Teor. Matem. Fizika 90, (1992)
27. Mirlin, A.D., Fyodorov, Ya.V., Dittes, F.-M., Quezada, J., Seligman, T.H.: Transition from localized to extended eigenstates in the ensemble of power-law random banded matrices. Phys. Rev. E 54, 3221–3230 (1996)
28. Pastur, L., Figotin, A.: Spectra of Random and Metrically Transitive Operators. Berlin: Springer, 1992
29. Porter, C. (ed.): Statistical Theories of Spectra: Fluctuations. New York: Acad. Press, 1965
30. Seligman, T.H., Verbaaschot, J.J.M., Zirnbauer, M.R.: Spectral fluctuation properties of Hamiltonian systems: The transition region between order and chaos. J. Phys. A: Math. Gen. 18, 2751 (1985)
31. Shlyahtenko, D.: Random gaussian band matrices and freeness with amalgamation. Int. Math. Res. Notes 20, 1013–1025 (1996)
32. Soshnikov, A.B.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207, 697–733 (1999)
33. Sylvestrov, P.: Summing graphs for random band matrices. Phys. Rev. E 55, 6419–6432 (1997)
34. Tracy, C.A., Widom, H.: Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994), 161, 289–309 (1994), 163, 33–72 (1994)
35. Voiculescu, D., Dykema, K.J., Nica, A.: Free Random Variables. A noncommutative probability approach to free products with applications to random matrices, operator algebras and harmonic analysis on free groups. CRM Monograph Series, 1. Providence, RI: AMS, 1992
36. Wigner, E.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62, (1955)

Communicated by P. Sarnak
Commun. Math. Phys. 231, 257–286 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0724-1
Communications in
Mathematical Physics
A Class of Integrable Spin Calogero-Moser Systems

Luen-Chau Li, Ping Xu∗

Department of Mathematics, Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]; [email protected]

Received: 19 October 2001 / Accepted: 7 June 2002
Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: We introduce a class of spin Calogero-Moser systems associated with root systems of simple Lie algebras and give the associated Lax representations (with spectral parameter) and fundamental Poisson bracket relations. The associated integrable models (called integrable spin Calogero-Moser systems in the paper) and their Lax pairs are then obtained via Poisson reduction and gauge transformations. For Lie algebras of $A_n$-type, this new class of integrable systems includes the usual Calogero-Moser systems as subsystems. Our method is guided by a general framework which we develop here using dynamical Lie algebroids.

∗ Research partially supported by NSF grant DMS00-72171.

1. Introduction

Calogero-Moser type systems are Hamiltonian systems with very rich structures. After the pioneering work of Calogero and Moser [8, 26], many generalizations have been proposed. Olshanetsky and Perelomov [27], for example, introduced Calogero-Moser models associated with root systems of simple Lie algebras (for recent work, see for example, [7 and 11]). On the other hand, a rational $\mathfrak{sl}(N, \mathbb{C})$ spin Calogero-Moser system was introduced by Gibbons and Hermsen [16]. As in the spinless case, trigonometric and elliptic versions of this generalization also exist.

In recent years, the collection of models known under the common name "spin Calogero-Moser systems" has received considerable attention due to its relevance in a number of areas. In the original work of Gibbons and Hermsen, and in the paper [22], for example, the $\mathfrak{sl}(N, \mathbb{C})$ spin systems considered by the authors are related to certain special solutions of integrable partial differential equations. In a completely different area, an approach to study the joint distribution of energy eigenvalues of a Hamiltonian was initiated by Pechukas [28] and continued by Yukawa [33] and a number of other researchers (see, for example, [17] and the references therein). In this so-called level dynamics approach in random matrix theory, spin Calogero-Moser systems appear naturally. As a matter of fact, they provide the starting point of the ensuing analysis. For a recent connection between $SU(2)$ Yang-Mills mechanics and a version of the rational model embedded in an external field, we refer the reader to [21].

At this juncture, we should perhaps warn the reader about possible confusion with the term spin Calogero-Moser systems. Indeed, there are many different versions of this kind of generalization (of the usual spinless case) and yet the same term is used to describe these different systems. For example, in [16 and 22], the authors were actually restricting themselves to a special symplectic leaf of an underlying Poisson manifold. On the other hand, in [28 and 33], the spin variables are in the space of skew-Hermitian matrices. In this regard, the reader can consult [32, 29, 2 and 30] for further examples in addition to those mentioned above. See also Remark 4.11 (3) and Remark 5.7 in this connection. We hope this kind of confusion can be resolved in the future by refining the terminology.

In [4], the authors considered the rational $\mathfrak{sl}(N, \mathbb{C})$ spin Calogero-Moser system. Without restricting themselves to a special symplectic leaf as in [22], they obtained a St. Petersburg type formula for the $\mathfrak{sl}(N, \mathbb{C})$ model, i.e., the so-called fundamental Poisson bracket relation (FPR) between the elements of an associated Lax operator $L(z)$. However, what they found was rather unusual. First of all, there are the usual kind of terms in the FPR, but now an r-matrix depending on phase space variables is involved. Then there is an anomalous term whose presence is an obstruction to integrability. By this, we mean that the quantities $\operatorname{tr}(L(z)^n)$ do not Poisson commute unless we restrict to the submanifold where the anomalous term vanishes.
If this submanifold were a Poisson submanifold of the underlying Poisson manifold, the corresponding subsystem would have a natural collection of Poisson commuting integrals, but unfortunately this is not the case. For the trigonometric and elliptic $\mathfrak{sl}(N, \mathbb{C})$ systems, similar formulas were obtained in [5].

Our present work has its origin in an attempt to understand conceptually the group theoretic/geometric meaning of the wonderful but mysterious calculations in [4, 5]. As was pointed out in a later paper by the same authors [1], the r-matrices which appear in their earlier work do satisfy a closed-form equation, the so-called classical dynamical Yang-Baxter equation (CDYBE) [15]. CDYBE is an important differential-functional equation introduced by Felder in his work on conformal field theory [15]. For simple Lie algebras and Kac-Moody algebras, the classification of solutions of this equation (under certain conditions) was obtained by Etingof and Varchenko [13]. On the other hand, dynamical r-matrices are intimately related to coboundary dynamical Poisson groupoids [13] and coboundary Lie bialgebroids [3]. This relation is analogous to the more familiar one which exists between constant r-matrices, Poisson groups and Lie bialgebras [12]. Consequently, it is plausible that the calculations in [4 and 5] are connected with Lie algebroids, and as it turns out, this is indeed the case.

In this connection, let us recall that in integrable systems theory, one of the powerful means to show that a Hamiltonian system is integrable (in some sense) is to realize the system in the r-matrix scheme for constant r-matrices (see [14, 31] and the references therein). For the $\mathfrak{sl}(N, \mathbb{C})$ spin Calogero-Moser systems, we have found an analog of the realization picture, using Lie algebroids associated with dynamical r-matrices. Indeed, along the way, it became clear that one can introduce spin Calogero-Moser systems associated with root systems of simple Lie algebras. These spin systems are naturally associated with the dynamical r-matrices with spectral parameter in [13]. Furthermore, there is a unified way to construct the realization maps for such systems. However, as in the $\mathfrak{sl}(N, \mathbb{C})$ case, there is an obstruction for the natural functions to Poisson commute. Nevertheless, the underlying structures of the spin systems permit the construction of associated integrable models, via Poisson reduction [25] and the idea of gauge transformations [4]. More precisely, the Hamiltonians of the spin Calogero-Moser systems are invariant under a natural canonical action of a Cartan subgroup of the underlying simple Lie group. In addition, the obstruction to integrability vanishes on a fiber of the equivariant momentum map. Hence we can apply Poisson reduction to obtain the integrable models on reduced Poisson manifolds. We shall call the systems in this new class of integrable models integrable spin Calogero-Moser systems.

We now describe the contents of the paper. In Sect. 2, we assemble a number of basic facts which will be used in the paper. In Sect. 3, we consider realization of Hamiltonian systems in dynamical Lie algebroids. The reader should note the substantial difference between our present case and the more familiar case of realization in Lie algebras equipped with an R-bracket. The difference lies in the fact that in the present case, the natural functions to consider are functions which do not Poisson commute on the dual of the dynamical Lie algebroid. Consequently, we do not get integrable systems to start with. However, when the realization map is an equivariant map, then under suitable assumptions, we show that Poisson reduction can be used to produce integrable flows with a natural family of conserved quantities in involution on reduced phase spaces. An important idea which we employ in this connection is that of a gauge transformation of a Lax operator, which we learned from [4]. As the reader will see, this device not only allows us to write down the equation of motion in Lax pair form on the reduced Poisson manifold; it also enables us to establish involution of the induced functions. In Sect. 4, we introduce spin Calogero-Moser systems associated with root systems of simple Lie algebras.
Our first step in this section is the construction of dynamical Lie algebroids, starting from the classical dynamical r-matrices with spectral parameter. Then we show that the putative Poisson manifold underlying the spin Calogero-Moser systems admits realizations in the dynamical Lie algebroids constructed earlier. Here, the realization maps are natural in the sense that the corresponding dual maps are morphisms of Lie algebroids. In [13], Etingof and Varchenko obtained a classification of classical dynamical r-matrices with spectral parameter for simple Lie algebras. They obtained canonical forms of the three types of dynamical r-matrices (rational, trigonometric, and elliptic). For each of these canonical forms, we can use the corresponding realization map to construct the associated spin Calogero-Moser systems. In Sect. 5, we carry out the reduction procedure for the spin systems to obtain the associated integrable models. Here, the main task is to construct an equivariant map from an open dense subset of the Poisson manifold of the spin systems to the Cartan subgroup. Using this map, we can define gauge transformations of the Lax operators for the spin systems. These gauge transforms are invariant under the natural action of the Cartan subgroup and hence descend to the reduced Poisson manifold. In this way, we can obtain the Lax equations for the reduced Hamiltonian systems and establish an involution theorem for the induced functions. Furthermore, this gives rise to spectral curves which are preserved by the Hamiltonian flows. Some of the results in Sect. 4 of the present work have been announced in [23].

2. Dynamical Lie Algebroids

In this section, we recall some basic facts. Most of the material is standard and is presented here for the reader's convenience. A Lie algebroid over a manifold $M$ may be thought of as a "generalized tangent bundle" to $M$. Here is the definition (see [24, 9] for more details on the theory).
Definition 2.1. A Lie algebroid over a manifold $M$ is a vector bundle $A$ over $M$ equipped with a Lie algebra structure $[\cdot, \cdot]$ on its space of sections and a bundle map $a : A \to TM$ (called the anchor) such that

1. the bundle map $a : A \to TM$ induces a Lie algebra homomorphism (also denoted by $a$) from sections of $A$ to vector fields on $M$;
2. for any $X, Y \in \Gamma(A)$ and $f \in C^\infty(M)$, the identity $[X, fY] = f[X, Y] + (a(X)f)Y$ holds.

Examples of Lie algebroids include the usual Lie algebras, Lie algebra bundles, tangent bundles of smooth manifolds, and integrable distributions on smooth manifolds. If $A$ is finite-dimensional, the standard local coordinates on $A$ are of the form $(q, \lambda)$, where the $q_i$'s are coordinates on the base $M$ and the $\lambda_i$'s are linear coordinates on the fibers, associated with a basis $X_i$ of sections of the Lie algebroid. In terms of such coordinates, the bracket and anchor are given by:

\[ [X_i, X_j] = \sum_k c_{ij}^k X_k, \quad \text{and} \quad a(X_i) = \sum_j a_{ij} \frac{\partial}{\partial q_j}, \tag{2.1} \]

where $c_{ij}^k$ and $a_{ij}$ are "structure functions" lying in $C^\infty(M)$.

The dual bundle $A^*$ of $A$ carries a natural Poisson structure, called the Lie-Poisson structure [10]. To describe this structure, it suffices to give the Poisson brackets of a class of functions whose differentials span the cotangent space at each point of $A^*$. Such a class is given by functions which are affine on fibres. The functions which are constant on fibres are just the functions on $M$, lifted to $A^*$ via the bundle projection. On the other hand, functions which are linear on fibres may be identified with the sections of $A$. This is because for any $X \in \Gamma(A)$, we can define $l_X \in C^\infty(A^*)$ by $l_X(\xi) = \langle \xi, X \rangle$, $\forall \xi \in A^*$. If $f$ and $g$ are functions on $M$, and $X$ and $Y$ are sections of $A$, the Lie-Poisson structure is characterized by the following bracket relations:

\[ \{f, g\} = 0, \quad \{f, l_X\} = a(X)(f), \quad \text{and} \quad \{l_X, l_Y\} = l_{[X, Y]}. \]

For the finite-dimensional case, corresponding to standard coordinates $(q, \lambda)$ on $A$, we may introduce dual coordinates $(q, \mu)$ on $A^*$. In terms of such coordinates and the structure functions introduced in Eq. (2.1), the Poisson bracket relations on $A^*$ are

\[ \{q_i, q_j\} = 0, \quad \{\mu_i, \mu_j\} = \sum_k c_{ij}^k \mu_k, \quad \text{and} \quad \{q_i, \mu_j\} = a_{ji}. \]

The Poisson structure on $A^*$ generalizes the usual Lie-Poisson structure on the dual of a Lie algebra. Namely, if $A$ is a Lie algebra $\mathfrak{g}$, the Poisson structure on its dual is the standard Lie-Poisson structure on $\mathfrak{g}^*$. On the other hand, when $A = TM$ is equipped with the standard Lie algebroid structure, the Poisson structure on its dual is just the usual cotangent bundle symplectic structure. Another interesting example, which we need in this paper, is the following.

Example 2.1. Let $A = TM \times \mathfrak{g}$ be equipped with the standard product Lie algebroid structure; namely, the anchor is the projection map onto the first factor and the bracket on sections is given by

\[ [(X, \xi), (Y, \eta)] = ([X, Y], [\xi, \eta] + L_X \eta - L_Y \xi), \quad \forall X, Y \in \Gamma(TM),\ \xi, \eta \in C^\infty(M, \mathfrak{g}), \tag{2.2} \]

where the bracket of two vector fields is the usual bracket and the bracket $[\xi, \eta]$ is the pointwise bracket. Then clearly, $A^*$ is the Poisson manifold direct product $T^*M \times \mathfrak{g}^*$. In other words, the bracket between functions on $T^*M$ is the canonical one on $T^*M$, the bracket between functions on $\mathfrak{g}^*$ is the Lie-Poisson bracket, and the mixed term bracket between functions on $T^*M$ and $\mathfrak{g}^*$ is zero.

In the rest of the section, let $\mathfrak{g}$ be a Lie algebra and $\mathfrak{h}$ an Abelian Lie subalgebra of $\mathfrak{g}$. Consider $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ as a vector bundle over $\mathfrak{h}^*$, and define a bundle map $a_* : T^*\mathfrak{h}^* \times \mathfrak{g}^* \to T\mathfrak{h}^*$ by

\[ a_*(q, p, \xi) = (q, i^*\xi), \quad \forall q \in \mathfrak{h}^*,\ p \in \mathfrak{h},\ \text{and } \xi \in \mathfrak{g}^*, \tag{2.3} \]

where $i : \mathfrak{h} \to \mathfrak{g}$ is the inclusion map. If $R$ is a map from $\mathfrak{h}^*$ to $L(\mathfrak{g}^*, \mathfrak{g})$ (the space of linear maps from $\mathfrak{g}^*$ to $\mathfrak{g}$), we define a bracket on $\Gamma(T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ as follows. For $\xi, \eta \in \mathfrak{g}^*$ considered as constant sections, $h \in \mathfrak{h}$ considered as a constant one form on $\mathfrak{h}^*$, and $\omega, \theta \in \Omega^1(\mathfrak{h}^*)$, define

\[ [\omega, \theta] = 0, \quad [h, \xi] = \mathrm{ad}_h^* \xi, \quad [\xi, \eta] = d\langle R\xi, \eta \rangle - \mathrm{ad}_{R\xi}^* \eta + \mathrm{ad}_{R\eta}^* \xi, \]

where $\mathrm{ad}^*$ denotes the dual of $\mathrm{ad}$: $\langle \mathrm{ad}_X^* \xi, Y \rangle = \langle \xi, [X, Y] \rangle$, $\forall X, Y \in \mathfrak{g}$ and $\xi \in \mathfrak{g}^*$. Then extend this to a bracket $[\cdot, \cdot]$ for all sections in $\Gamma(T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ by the usual anchor condition. The following proposition can be verified by a direct calculation.

Proposition 2.2. $(T^*\mathfrak{h}^* \times \mathfrak{g}^*, [\cdot, \cdot])$ is a Lie algebroid with anchor map $a_*$ iff

1. The operator $R$ is a map from $\mathfrak{h}^*$ to $L(\mathfrak{g}^*, \mathfrak{g})^{\mathfrak{h}}$, the space of $\mathfrak{h}$-equivariant linear maps from $\mathfrak{g}^*$ to $\mathfrak{g}$ ($\mathfrak{h}$ acts on $\mathfrak{g}$ by adjoint action and on $\mathfrak{g}^*$ by coadjoint action);
2. $R$ satisfies $R_q^* = -R_q$ for each point $q \in \mathfrak{h}^*$ (here, as well as in the sequel, we denote by $R_q$ the linear map in $L(\mathfrak{g}^*, \mathfrak{g})$ obtained by evaluating $R$ at the point $q$);
3. For any $q \in \mathfrak{h}^*$, the linear map from $\mathfrak{g}^* \otimes \mathfrak{g}^* \to \mathfrak{g}$ defined by

\[ \xi \otimes \eta \mapsto [R_q \xi, R_q \eta] + R_q\left( \mathrm{ad}_{R_q \xi}^* \eta - \mathrm{ad}_{R_q \eta}^* \xi \right) + X_{i^*\xi}(q)(R\eta) - X_{i^*\eta}(q)(R\xi) + d\langle R\xi, \eta \rangle(q) \tag{2.4} \]

is independent of $q \in \mathfrak{h}^*$, and is $\mathfrak{g}$-equivariant, where $\mathfrak{g}$ acts on $\mathfrak{g}^* \otimes \mathfrak{g}^*$ by coadjoint action and on $\mathfrak{g}$ by adjoint action. Here, as well as in the sequel, $X_v$ for $v \in \mathfrak{h}^*$ denotes the operation of taking the derivative with respect to $q$ along the constant vector field defined by $v$.

Such a Lie algebroid $(T^*\mathfrak{h}^* \times \mathfrak{g}^*, [\cdot, \cdot])$ will be called a dynamical Lie algebroid, and we shall use this terminology throughout the paper.
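As a small sanity check of the Lie-Poisson bracket recalled earlier in this section, consider the special case in which $A$ is a Lie algebra, so that the anchor vanishes and the bracket of linear functions reduces to $\{\mu_i, \mu_j\} = \sum_k c_{ij}^k \mu_k$. The sketch below takes $\mathfrak{so}(3)$ with structure constants $\epsilon_{ijk}$ as an illustrative example (this choice, and all function names, are assumptions made for the illustration) and verifies the Jacobi identity on sample linear functions.

```python
def epsilon(i, j, k):
    """Structure constants of so(3) for indices 0, 1, 2: c_ij^k = epsilon_ijk."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(f, g, c=epsilon, n=3):
    """Lie-Poisson bracket of two linear functions on the dual of a Lie algebra.
    A linear function is stored as its coefficient list: f(mu) = sum_i f[i] * mu_i.
    The result is again linear, with coefficients sum_{i,j} f[i] g[j] c_ij^k."""
    return [sum(c(i, j, k) * f[i] * g[j] for i in range(n) for j in range(n))
            for k in range(n)]

def jacobiator(f, g, h):
    """Coefficients of {f,{g,h}} + {g,{h,f}} + {h,{f,g}}; all zero iff Jacobi holds."""
    terms = (bracket(f, bracket(g, h)),
             bracket(g, bracket(h, f)),
             bracket(h, bracket(f, g)))
    return [sum(t[k] for t in terms) for k in range(3)]
```

With integer coefficients the computation is exact, so `jacobiator([1, 2, 3], [-1, 0, 4], [2, 5, -3])` returns a list of zeros, and `bracket([1, 0, 0], [0, 1, 0])` reproduces $\{\mu_0, \mu_1\} = \mu_2$.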
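Remark 2.3. If $\mathfrak{g}$ is finite-dimensional, and $R_q = r(q)^\#$, i.e., $\langle R_q \xi, \eta \rangle = \langle r(q), \xi \otimes \eta \rangle$, $\xi, \eta \in \mathfrak{g}^*$, for a map $r : \mathfrak{h}^* \to \wedge^2 \mathfrak{g}$, it can be shown that $R$ satisfies the conditions in Proposition 2.2 iff $r$ satisfies:

1. $r$ is $\mathfrak{h}$-invariant, i.e., $[1 \otimes h + h \otimes 1, r(q)] = 0$, $\forall q \in \mathfrak{h}^*$, $h \in \mathfrak{h}$;
2. $\sum_i h_i \wedge \frac{\partial r}{\partial q_i} + \frac{1}{2}[r, r]$ is a constant $(\wedge^3 \mathfrak{g})^{\mathfrak{g}}$-valued function over $\mathfrak{h}^*$, where $[\cdot, \cdot]$ is the Schouten bracket on $\oplus \wedge^* \mathfrak{g}$, $\{h_1, \cdots, h_N\}$ is a basis in $\mathfrak{h}$, and $(q_1, \ldots, q_N)$ its induced coordinate system on $\mathfrak{h}^*$.

In other words, $r$ is a dynamical r-matrix in the sense of [15], [13]. Indeed, $(T\mathfrak{h}^* \times \mathfrak{g}, T^*\mathfrak{h}^* \times \mathfrak{g}^*)$ is a Lie bialgebroid [3].

Next, we assume that $\mathfrak{g}$ admits a non-degenerate ad-invariant pairing $(\cdot, \cdot)$. If $I : \mathfrak{g}^* \to \mathfrak{g}$ is the induced isomorphism, then a straightforward calculation yields that

\[ I\, \mathrm{ad}_X^* \xi = -[X, I\xi], \quad \forall X \in \mathfrak{g},\ \xi \in \mathfrak{g}^*. \tag{2.5} \]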
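Relation (2.5) can be illustrated in a concrete matrix setting. The sketch below assumes, purely for illustration, that the Lie algebra is $\mathfrak{gl}(n)$ with the trace pairing $(A, B) = \operatorname{tr}(AB)$, so that a covector $\xi$ is represented by the matrix $A = I\xi$ with $\xi(Y) = \operatorname{tr}(AY)$. Under this assumption (2.5) says that $\operatorname{tr}(A[X, Y]) = -\operatorname{tr}([X, A]Y)$ for all $Y$. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def pair(A, B):
    """Trace pairing (A, B) = tr(AB)."""
    P = matmul(A, B)
    return sum(P[i][i] for i in range(len(A)))

def coadjoint_side(A, X, Y):
    """<ad*_X xi, Y> = <xi, [X, Y]> with xi represented by A = I xi."""
    return pair(A, commutator(X, Y))

def bracket_side(A, X, Y):
    """(-[X, I xi], Y), the pairing of the right-hand side of (2.5) with Y."""
    return -pair(commutator(X, A), Y)
```

With integer matrices both sides are computed exactly and coincide for any choice of `A`, `X` and `Y`, which is the cyclicity of the trace in disguise.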
Thus we have the following

Corollary 2.4. The operator $R : \mathfrak{h}^* \to L(\mathfrak{g}^*, \mathfrak{g})$ defines a Lie algebroid structure on $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ if conditions (1)–(2) in Proposition 2.2 are satisfied, and if $R$ satisfies the modified dynamical Yang-Baxter equation (mDYBE):

\[ [R\xi, R\eta] - R\left( I^{-1}[R\xi, I\eta] + I^{-1}[I\xi, R\eta] \right) + X_{i^*\xi}(R\eta) - X_{i^*\eta}(R\xi) + d\langle R\xi, \eta \rangle = c[I\xi, I\eta], \quad \forall \xi, \eta \in \mathfrak{g}^*, \tag{2.6} \]

for some constant $c$.

3. Realization of Hamiltonian Systems in Dynamical Lie Algebroids

Throughout this section, let $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ be a fixed dynamical Lie algebroid corresponding to an $R : \mathfrak{h}^* \to L(\mathfrak{g}^*, \mathfrak{g})$ which satisfies the conditions of Proposition 2.2 of the last section. In what follows, we shall formulate our results for the differentiable category, but it will be clear that the results are also valid for the holomorphic category.

Definition 3.1. A Poisson manifold $(X, \pi_X)$ is said to admit a realization in the dynamical Lie algebroid $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ if there is a Poisson map $\rho : X \to T\mathfrak{h}^* \times \mathfrak{g}$, where $T\mathfrak{h}^* \times \mathfrak{g}$ is the dual vector bundle of $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ equipped with the Lie-Poisson structure.

Definition 3.2. Suppose a Poisson manifold $(X, \pi_X)$ admits a realization $\rho : X \to T\mathfrak{h}^* \times \mathfrak{g}$ and $H \in C^\infty(X)$. We say that the Hamiltonian system $\dot x = X_H(x)$ is realized in $T\mathfrak{h}^* \times \mathfrak{g}$ by means of $\rho$ if there exists $K \in C^\infty(T\mathfrak{h}^* \times \mathfrak{g})$ such that $H = \rho^* K$.

In the following discussion, we shall work with a Poisson manifold $(X, \pi_X)$ together with a realization $\rho : X \to T\mathfrak{h}^* \times \mathfrak{g}$. Let $Pr_1 : T\mathfrak{h}^* \times \mathfrak{g} \to T\mathfrak{h}^*$, $Pr_2 : T\mathfrak{h}^* \times \mathfrak{g} \to \mathfrak{g}$ be the projection maps onto the first and second factor of $T\mathfrak{h}^* \times \mathfrak{g}$ respectively and set

\[ L = Pr_2 \circ \rho : X \to \mathfrak{g}; \tag{3.1} \]

\[ \tau = Pr_1 \circ \rho : X \to T\mathfrak{h}^*. \tag{3.2} \]

We also put

\[ m = p \circ \tau : X \to \mathfrak{h}^*, \tag{3.3} \]

where $p : T\mathfrak{h}^* \to \mathfrak{h}^*$ is the bundle projection. The next proposition shows how to compute the Poisson brackets of pullbacks of functions in $Pr_2^* C^\infty(\mathfrak{g})$ under the map $\rho$. It is a direct consequence of the canonical character of $\rho$ and the definition of the Lie algebroid bracket on $T^*\mathfrak{h}^* \times \mathfrak{g}^*$.
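Proposition 3.3. For all $f, g \in C^\infty(\mathfrak{g})$, we have

\[ \{L^* f, L^* g\}_X(x) = \left\langle L(x),\ -\mathrm{ad}^*_{R_{m(x)}(df(L(x)))}\, dg(L(x)) + \mathrm{ad}^*_{R_{m(x)}(dg(L(x)))}\, df(L(x)) \right\rangle + \left\langle (X_{\tau(x)} R)(df(L(x))),\ dg(L(x)) \right\rangle, \quad \forall x \in X. \tag{3.4} \]

Here, and in the sequel, $df(L(x))$ and $dg(L(x))$ are considered as elements in $\mathfrak{g}^*$ for any fixed $x \in X$.

Proof. For any $\xi \in \mathfrak{g}^*$, we let $\ell_\xi$ denote the corresponding linear function on $\mathfrak{g}$. Then we have

\begin{align*}
\{L^* f, L^* g\}_X(x)
&= \{L^* \ell_{df(L(x))}, L^* \ell_{dg(L(x))}\}_X(x) \\
&= \{Pr_2^* \ell_{df(L(x))}, Pr_2^* \ell_{dg(L(x))}\}(\rho(x)) \quad \text{(since $\rho$ is a Poisson map)} \\
&= \left\langle [df(L(x)), dg(L(x))](m(x)),\ \rho(x) \right\rangle \quad \text{(by the definition of the Lie-Poisson structure)} \\
&= \left\langle \tau(x),\ d\langle R(df(L(x))), dg(L(x)) \rangle \right\rangle + \left\langle L(x),\ -\mathrm{ad}^*_{R_{m(x)}(df(L(x)))}\, dg(L(x)) + \mathrm{ad}^*_{R_{m(x)}(dg(L(x)))}\, df(L(x)) \right\rangle \\
&= \left\langle L(x),\ -\mathrm{ad}^*_{R_{m(x)}(df(L(x)))}\, dg(L(x)) + \mathrm{ad}^*_{R_{m(x)}(dg(L(x)))}\, df(L(x)) \right\rangle + \left\langle (X_{\tau(x)} R)(df(L(x))),\ dg(L(x)) \right\rangle, \quad \forall x \in X. \tag{3.5}
\end{align*}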
In the above computation, the quantities df (L(x)) and dg(L(x)) are considered as fixed elements in ᒄ∗ , the bracket [df (L(x)), dg(L(x))] is the Lie algebroid bracket when both df (L(x)) and dg(L(x)) are considered as constant sections of T ∗ ᒅ∗ × ᒄ∗ , and in the second from the last equality, $\langle R(df(L(x))), dg(L(x)) \rangle$ is considered as a function on ᒅ∗ with x being fixed.

Remark 3.4. If $R_q = r(q)^\# \in L(ᒄ^*, ᒄ)$ as in Remark 2.1, then Eq. (3.4) is equivalent to the following fundamental Poisson bracket relation:
$$\{L \stackrel{\otimes}{,} L\} = [r^{12},\ L^1 + L^2] - X_{\tau(x)} r,$$
where $L^1 = L \otimes 1$ and $L^2 = 1 \otimes L$.

Let I (ᒄ) be the collection of smooth ad-invariant functions on ᒄ, i.e. f ∈ I (ᒄ) iff $ad^*_p\, df(p) = 0$ for all p ∈ ᒄ. A natural collection of functions on T ᒅ∗ × ᒄ is $Pr_2^* I(ᒄ)$, the pullback of ad-invariant functions on ᒄ by the projection map $Pr_2$. As the reader will see, these functions do not Poisson commute with respect to the Lie-Poisson structure on T ᒅ∗ × ᒄ. Thus our situation here is quite different from that in standard classical r-matrix theory for constant r-matrices. We now examine the Hamiltonian systems $\dot x = X_{\mathcal{H}}(x)$ on X which can be realized in T ᒅ∗ × ᒄ by means of ρ with $\mathcal{H} \in \rho^* Pr_2^* I(ᒄ) = L^* I(ᒄ)$.

Proposition 3.5. 1. If $\mathcal{H} = L^* f$, where f ∈ I (ᒄ), then under the flow $\varphi_t$ generated by the Hamiltonian $\mathcal{H}$, we have the quasi-Lax type equation:
$$\frac{dL(\varphi_t)}{dt} = \big[ R_{m(\varphi_t)}(df(L(\varphi_t))),\ L(\varphi_t) \big] - (X_{\tau(\varphi_t)} R)(df(L(\varphi_t))). \tag{3.6}$$
2. For all f1 , f2 ∈ I (ᒄ), we have
$$\{L^* f_1, L^* f_2\}_X(x) = \big\langle (X_{\tau(x)} R)(df_1(L(x))),\ df_2(L(x)) \big\rangle, \quad \forall x \in X. \tag{3.7}$$
Proof. (1) Let $\pi_X^\# : T^*X \to TX$ be the induced bundle map of the Poisson tensor $\pi_X$ defined by $\langle \pi_X^\# \alpha, \beta \rangle = \pi_X(\alpha, \beta)$, ∀α, β ∈ T ∗ X. From the invariance property of f and Eq. (3.4), we have
$$T_xL \circ \pi_X^\#(x) \circ T_x^*L\, [df(L(x))] = -\big[ R_{m(x)}(df(L(x))),\ L(x) \big] + (X_{\tau(x)} R)(df(L(x))),$$
from which the assertion follows. (2) This is obvious from Eq. (3.4) and the invariance property of f1 , f2 .
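The invariance property used in this proof can be checked numerically in a simple model. The sketch below is an illustration only, under assumptions not made in the text: ᒄ is taken to be gl(4, R), ᒄ∗ is identified with ᒄ through the trace form (A, B) = tr(AB), and the invariant functions are f(X) = tr(X^k)/k, whose gradient is X^(k−1). Under this identification the condition ad∗_p df(p) = 0 reads [df(p), p] = 0.

```python
import numpy as np

def bracket(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def grad_trace_power(x, k):
    """Gradient of f(X) = tr(X^k)/k with respect to the trace form (A, B) = tr(AB)."""
    return np.linalg.matrix_power(x, k - 1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))

# Relative size of [df(X), X] for k = 1..5; each should vanish up to rounding.
relative_residuals = []
for k in range(1, 6):
    g = grad_trace_power(x, k)
    size = np.linalg.norm(g) * np.linalg.norm(x)
    relative_residuals.append(np.linalg.norm(bracket(g, x)) / size)
```

Because [df(L), L] = 0 for such f, the two ad∗ terms in Eq. (3.4) drop out and only the derivative term survives, which is the content of Eq. (3.7).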
Remark 3.6. It is clear that the functions in $Pr_2^* I(ᒄ)$ do not Poisson commute, for otherwise, it would contradict Proposition 3.5 (2).

Proposition 3.5 (2) shows that there is an obstruction for L∗ I (ᒄ) to give a Poisson commuting family of functions. A naive way to get rid of this obstruction is to restrict to the submanifold τ −1 (zero section of T ᒅ∗ ). It is easy to see that τ −1 (zero section of T ᒅ∗ ) is a coisotropic submanifold of X as the zero section of T ᒅ∗ is a coisotropic submanifold of T ᒅ∗ . Thus one can obtain a Poisson bracket on the quotient of τ −1 (zero section of T ᒅ∗ ) by the characteristic foliation. Unfortunately, it is not necessarily true that $\mathcal{H} \in L^* I(ᒄ)$ or L : X → ᒄ will descend to the quotient space.

In the following, we shall describe a situation where we can obtain integrable flows on a reduced phase space. Let H be a Lie subgroup of G corresponding to the Lie algebra ᒅ. We shall make the following assumptions:
A1 X is a Hamiltonian H -space with an equivariant momentum map J : X → ᒅ∗ ,
A2 the realization map ρ : X −→ T ᒅ∗ × ᒄ is equivariant, where H acts on T ᒅ∗ × ᒄ by adjoint action on the second factor,
A3 there exists an H -equivariant map g : X → H , where H acts on itself by left translation, i.e., g(d · x) = d · g(x), d ∈ H, x ∈ X.

Suppose µ ∈ ᒅ∗ is a regular value of J . Then, under the assumption that $X_\mu = J^{-1}(\mu)/H$ is a smooth manifold, it follows by Poisson reduction [25] that $X_\mu$ inherits a unique Poisson structure $\{\cdot, \cdot\}_{X_\mu}$ satisfying
$$\pi^* \{\varphi, \psi\}_{X_\mu} = i^* \{\tilde\varphi, \tilde\psi\}_X. \tag{3.8}$$
Here, $i : J^{-1}(\mu) \to X$ is the inclusion map, $\pi : J^{-1}(\mu) \to X_\mu$ is the canonical projection; $\varphi, \psi \in C^\infty(X_\mu)$, and $\tilde\varphi, \tilde\psi$ are (locally defined) smooth extensions of $\pi^*\varphi, \pi^*\psi$ with differentials vanishing on the tangent spaces of the H -orbits. It follows from Assumption A2 that L : X → ᒄ is H -equivariant, where the H -action on ᒄ is via the Ad-action. Thus, if $\mathcal{H} \in L^* I(ᒄ)$, it is clear that $\mathcal{H}$ is H -invariant, so that $\mathcal{H}$ descends to a function on $X_\mu$, i.e., there exists a uniquely determined $\mathcal{H}_\mu \in C^\infty(X_\mu)$ satisfying $\pi^* \mathcal{H}_\mu = \mathcal{H}|_{J^{-1}(\mu)}$. However, as L is only H -equivariant, L does not pass to
the quotient and this is where Assumption A3 comes into play. Using the H -equivariant map g, we can define the gauge transformation of L:
$$\tilde L : X \to ᒄ, \quad x \mapsto Ad_{g(x)^{-1}} L(x). \tag{3.9}$$
The following lemma is obvious.

Lemma 3.7. 1. $\tilde L$ is H -invariant. 2. If $\mathcal{H} \in L^* I(ᒄ)$, say, $\mathcal{H} = L^* f$, then also $\mathcal{H} = \tilde L^* f$.

It follows from this lemma that there exists a uniquely determined map $L_\mu : X_\mu \to ᒄ$ such that
$$L_\mu \circ \pi = \tilde L\,|_{J^{-1}(\mu)}. \tag{3.10}$$
In particular, if $\mathcal{H} = L^* f$, where f ∈ I (ᒄ), then $\mathcal{H}$ descends to a function $\mathcal{H}_\mu$ on $X_\mu$ such that
$$\mathcal{H}_\mu = L_\mu^* f. \tag{3.11}$$
In other words, the functions in $L^* I(ᒄ)|_{J^{-1}(\mu)}$ descend to functions in $L_\mu^* I(ᒄ) \subset C^\infty(X_\mu)$.

The following lemma is straightforward from the definition of $\tilde L$ in Eq. (3.9).

Lemma 3.8. $T_x \tilde L = Ad_{g(x)^{-1}} \circ T_x L + ad_{\tilde L(x)} \circ l_{g(x)^{-1}\,*} \circ T_x g$, ∀x ∈ X, where both sides are considered as linear maps from $T_x X$ to ᒄ, and $l_{g(x)^{-1}}$ is left translation by $g(x)^{-1} \in H$.

We now make an additional assumption:
A4 $X_v R = 0$, $\forall v \in \tau(J^{-1}(\mu))$.

Proposition 3.9.
$$T_x \tilde L \circ \pi_X^\#(x) \circ T_x^* \tilde L = ad_{\tilde L(x)} \circ \tilde R(x) + \tilde R(x) \circ ad^*_{\tilde L(x)}, \quad \forall x \in J^{-1}(\mu), \tag{3.12}$$
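where both sides of the equation are considered as linear maps from ᒄ∗ to ᒄ, where $\tilde R : J^{-1}(\mu) \to L(ᒄ^*, ᒄ)$ is given by
$$\tilde R(x) = Ad_{g(x)^{-1}} \circ (R \circ m)(x) \circ Ad^*_{g(x)^{-1}} + T_x L \circ \pi_X^\#(x) \circ T_x^* g \circ l^*_{g(x)^{-1}} + \frac{1}{2}\, ad_{\tilde L(x)} \circ l_{g(x)^{-1}\,*} \circ T_x g \circ \pi_X^\#(x) \circ T_x^* g \circ l^*_{g(x)^{-1}}, \quad \forall x \in J^{-1}(\mu). \tag{3.13}$$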
Moreover, $\tilde R$ is H -invariant. Here, as well as in the sequel, $Ad^*$ denotes the dual map of Ad defined by: $\langle Ad_d^* \xi, X \rangle = \langle \xi, Ad_d X \rangle$, ∀ξ ∈ ᒄ∗ and X ∈ ᒄ.

Proof. Applying Lemma 3.8 and Proposition 3.3, together with A4, the expression for $T_x \tilde L \circ \pi_X^\#(x) \circ T_x^* \tilde L$ follows. On the other hand, it follows from Assumption A2 (ρ is H -equivariant) that m(d · x) = m(x), ∀x ∈ X, d ∈ H . Hence,
$$(R \circ m)(d \cdot x) = (R \circ m)(x) = Ad_d \circ (R \circ m)(x) \circ Ad_d^*, \tag{3.14}$$
since $(R \circ m)(x) \in L(ᒄ^*, ᒄ)^H$ according to Proposition 2.2. Thus, the assertion that $\tilde R$ is H -invariant is a consequence of Eq. (3.14) and the equivariance property of the maps L and g. We shall omit the straightforward calculations.
From the H -invariance of $\tilde R$, it follows that there exists $R_\mu : X_\mu \to L(ᒄ^*, ᒄ)$ such that
$$R_\mu \circ \pi = \tilde R. \tag{3.15}$$
We now come to the main result of the section.

Theorem 3.10. Let (X, πX ) be a Poisson manifold with a realization ρ : X → T ᒅ∗ × ᒄ which satisfies A1–A4. Then, under the assumption that $X_\mu = J^{-1}(\mu)/H$ is a smooth manifold, there exists a unique Poisson structure $\{\cdot, \cdot\}_{X_\mu}$ on $X_\mu$ satisfying Eq. (3.8) and a map $L_\mu : X_\mu \to ᒄ$ satisfying Eq. (3.10) such that
1. $\forall f_1, f_2 \in C^\infty(ᒄ)$,
$$\{L_\mu^* f_1, L_\mu^* f_2\}_{X_\mu}(\tilde x) = -\Big\langle L_\mu(\tilde x),\ ad^*_{R_\mu^*(\tilde x)(df_2(L_\mu(\tilde x)))}\, df_1(L_\mu(\tilde x)) + ad^*_{R_\mu(\tilde x)(df_1(L_\mu(\tilde x)))}\, df_2(L_\mu(\tilde x)) \Big\rangle, \quad \forall \tilde x \in X_\mu; \tag{3.16}$$
2. Functions in $L_\mu^* I(ᒄ)$ Poisson commute in $(X_\mu, \{\cdot, \cdot\}_{X_\mu})$;
3. If $\mathcal{H}_\mu = L_\mu^* f$, f ∈ I (ᒄ), then under the flow generated by $\mathcal{H}_\mu$, we have
$$\frac{d}{dt} L_\mu = -\big[ (R_\mu)^* (df(L_\mu)),\ L_\mu \big]. \tag{3.17}$$
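Proof. (1) Let $\tilde x = \pi(x) \in X_\mu$ for some $x \in J^{-1}(\mu)$. From Eqs. (3.8), (3.10), (3.15) and Proposition 3.9, we have
$$\begin{aligned}
\{L_\mu^* f_1, L_\mu^* f_2\}_{X_\mu}(\tilde x) &= \{\tilde L^* f_1, \tilde L^* f_2\}_X(x) \\
&= \Big\langle \big( ad_{\tilde L(x)} \circ \tilde R(x) + \tilde R(x) \circ ad^*_{\tilde L(x)} \big)(df_1(\tilde L(x))),\ df_2(\tilde L(x)) \Big\rangle \\
&= -\Big\langle ad_{R_\mu(\tilde x)(df_1(L_\mu(\tilde x)))} L_\mu(\tilde x),\ df_2(L_\mu(\tilde x)) \Big\rangle - \Big\langle df_1(L_\mu(\tilde x)),\ ad_{R_\mu^*(\tilde x)(df_2(L_\mu(\tilde x)))} L_\mu(\tilde x) \Big\rangle,
\end{aligned}$$
from which the assertion follows. (2) This is obvious from (1). (3) If $\pi^\#_{X_\mu}$ denotes the induced bundle map $T^*X_\mu \to TX_\mu$ of the Poisson tensor on $X_\mu$, it follows from (1) and the invariance property of f that $TL_\mu \circ \pi^\#_{X_\mu} \circ T^*L_\mu\, (df(L_\mu)) = \big[ R_\mu^*(df(L_\mu)),\ L_\mu \big]$. Hence the assertion is immediate.

Remark 3.11. If $R = r^\# : ᒅ^* \to L(ᒄ^*, ᒄ)$ is a classical dynamical r-matrix as in Remark 2.3, then Eq. (3.16) is equivalent to the following relation:
$$\{L_\mu \stackrel{\otimes}{,} L_\mu\}(\tilde x) = \big[ \tilde r^{12}(\tilde x),\ L_\mu^1(\tilde x) \big] - \big[ \tilde r^{21}(\tilde x),\ L_\mu^2(\tilde x) \big], \quad \forall \tilde x \in X_\mu, \tag{3.18}$$
where
$$\tilde r(\tilde x) = Ad_{g(x)^{-1}} \Big( r(m(x)) - \{g^1, L^2\}(x)\,(g^1)^{-1} + \frac{1}{2} \big[ u^{12}(x),\ L^2(x) \big] \Big). \tag{3.19}$$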
Here, $x \in J^{-1}(\mu)$ is such that $\tilde x = \pi(x)$, $u^{12} = (g_* \pi_X) g^{-1} \in C^\infty(X, \wedge^2 ᒄ)$, and
$$\{g^1, L^2\}(g^1)^{-1} \stackrel{\text{def}}{=} \frac{1}{2} \sum_i \Big( (g_* X_i) g^{-1} \otimes L_* Y_i - (g_* Y_i) g^{-1} \otimes L_* X_i \Big)$$
as a map from X to ᒄ ⊗ ᒄ, where $X_i, Y_i \in ᑲ(X)$ are H -invariant vector fields such that $\pi_X = \sum_i X_i \wedge Y_i = \frac{1}{2} \sum_i (X_i \otimes Y_i - Y_i \otimes X_i)$. We remark that fundamental Poisson bracket relations of this nature, in which the r-matrix can depend on phase space variables, were first considered in [6].

4. Spin Calogero-Moser Systems

Let ᒄ be a Lie algebra over C with a non-degenerate ad-invariant bilinear form (·, ·) and ᒅ ⊂ ᒄ a non-degenerate (i.e., the restriction of (·, ·) to ᒅ is non-degenerate) Abelian Lie subalgebra. By definition (see Remark 4.1 below), a classical dynamical r-matrix with spectral parameter associated with the pair ᒅ ⊂ ᒄ is a meromorphic map r : ᒅ∗ × C → ᒄ ⊗ ᒄ having a simple pole at z = 0 and satisfying the following conditions:
1. the zero weight condition:
$$[h \otimes 1 + 1 \otimes h,\ r(q, z)] = 0, \tag{4.1}$$
for all h ∈ ᒅ and all (q, z) ∈ ᒅ∗ × C except for the poles of r;
2. the generalized unitarity condition:
$$r^{12}(q, z) + r^{21}(q, -z) = 0, \tag{4.2}$$
for all (q, z) ∈ ᒅ∗ × C except for the poles of r;
3. the residue condition:
$$\mathrm{Res}_{z=0}\, r(q, z) = \Omega, \tag{4.3}$$
where $\Omega \in (S^2 ᒄ)^{ᒄ}$ is the Casimir element corresponding to the bilinear form (·, ·);
4. the classical dynamical Yang-Baxter equation (CDYBE):
$$\mathrm{Alt}(d_ᒅ r) + [r^{12}(q, z_{1,2}),\ r^{13}(q, z_{1,3})] + [r^{12}(q, z_{1,2}),\ r^{23}(q, z_{2,3})] + [r^{13}(q, z_{1,3}),\ r^{23}(q, z_{2,3})] = 0, \tag{4.4}$$
where $z_{i,j} = z_i - z_j$.

In Eq. (4.4), the differential of the r-matrix is considered with respect to the ᒅ∗ -variables:
$$d_ᒅ r : ᒅ^* \times \mathbb{C} \to ᒄ \otimes ᒄ \otimes ᒄ, \quad (q, z) \mapsto \sum_i h_i^{(1)} \otimes \frac{\partial r^{23}}{\partial q_i}(q, z),$$
and the term $\mathrm{Alt}(d_ᒅ r)$ is a shorthand for the following symmetrization of $d_ᒅ r$:
$$\mathrm{Alt}(d_ᒅ r) = \sum_i h_i^{(1)} \otimes \frac{\partial r^{23}}{\partial q_i}(q, z_{2,3}) + \sum_i h_i^{(2)} \otimes \frac{\partial r^{31}}{\partial q_i}(q, z_{3,1}) + \sum_i h_i^{(3)} \otimes \frac{\partial r^{12}}{\partial q_i}(q, z_{1,2}), \tag{4.5}$$
where $(h_1, \ldots, h_N)$ is a basis of ᒅ, and $(q_1, \ldots, q_N)$ its corresponding coordinate system on ᒅ∗ . We call the variable z in r(q, z) the spectral parameter.
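Conditions (4.1) and (4.2) can be tested numerically on a concrete example. The sketch below is illustrative and rests on assumptions not made in the text: it works in gl(n) with the trace form, takes ᒅ to be the diagonal matrices, uses the Casimir element $\Omega = \sum_{i,j} E_{ij} \otimes E_{ji}$, and uses the rational r-matrix of case I below with all roots $e_i - e_j$, so that (α, q) = q_i − q_j. The helper names `unit`, `rational_r` and `swap` are hypothetical.

```python
import numpy as np

def unit(n, i, j):
    """Elementary matrix E_ij."""
    e = np.zeros((n, n))
    e[i, j] = 1.0
    return e

def rational_r(q, z):
    """r(q, z) = Omega/z + sum_{i != j} E_ij (x) E_ji / (q_i - q_j), acting on C^n (x) C^n."""
    n = len(q)
    r = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            term = np.kron(unit(n, i, j), unit(n, j, i))
            r += term / z                      # Casimir part Omega / z
            if i != j:
                r += term / (q[i] - q[j])      # dynamical part
    return r

def swap(t, n):
    """Exchange the two tensor factors: t^{12} -> t^{21}."""
    return t.reshape(n, n, n, n).transpose(1, 0, 3, 2).reshape(n * n, n * n)

q = np.array([0.3, -1.1, 2.0])
z = 0.7
n = len(q)
r = rational_r(q, z)

# Zero weight condition (4.1): [h (x) 1 + 1 (x) h, r(q, z)] = 0 for diagonal h.
h = np.diag([1.0, -2.5, 0.4])
weight = np.kron(h, np.eye(n)) + np.kron(np.eye(n), h)
zero_weight_defect = np.linalg.norm(weight @ r - r @ weight)

# Generalized unitarity (4.2): r^{12}(q, z) + r^{21}(q, -z) = 0.
unitarity_defect = np.linalg.norm(r + swap(rational_r(q, -z), n))
```

The residue condition (4.3) holds by construction here; the CDYBE (4.4) is not checked by this sketch.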
By Lᒄ, we denote the Lie algebra of Laurent series $X = \sum_{n=-T}^{\infty} X_n z^n$ with coefficients in ᒄ, which are convergent in some annulus $A_c = \{z \in \mathbb{C} \mid 0 < |z| < c\}$ (which may depend on the series). The Lie bracket in Lᒄ is the pointwise bracket. In a similar fashion, we can define the restricted dual Lᒄ∗ . Using the bilinear form on ᒄ, we can define a non-degenerate invariant bilinear form on Lᒄ by
$$(X, Y) = \mathrm{Res}_{z=0}\, (X(z), Y(z)), \quad \forall X, Y \in Lᒄ. \tag{4.6}$$
On the other hand, the pairing between Lᒄ∗ and Lᒄ is given by
$$\langle \xi, X \rangle = \mathrm{Res}_{z=0}\, \langle \xi(z), X(z) \rangle, \quad \forall \xi \in Lᒄ^*,\ X \in Lᒄ. \tag{4.7}$$
Associated with each dynamical r-matrix r with spectral parameter is an operator $R : ᒅ^* \to L(Lᒄ^*, Lᒄ)^H$, which we use to define a Lie algebroid structure on T ∗ ᒅ∗ × Lᒄ∗ according to the recipe in Sect. 2. We now proceed with the construction of R. Let
$$r(q, z) = \frac{\Omega}{z} + \sum_{k=0}^{\infty} t_k(q) z^k, \quad z \in A_{c(r)} \tag{4.8}$$
be the Laurent expansion of r(q, ·) about z = 0, where c(r) denotes the radius of convergence of the series. Assume furthermore that
5. c(r) is independent of q (which we will always assume in the sequel when talking about a dynamical r-matrix with spectral parameter).

Remark 4.1. The original definition of classical dynamical r-matrices with spectral parameter is for simple Lie algebras [13]. In the above, we have modified this definition by putting in the extra assumptions. Namely, the pole of r(q, ·) at z = 0 is simple and the number c(r) is independent of q. For simple Lie algebras, these additional assumptions are not necessary as they follow from the solution of the classification problem [13].

For any ξ ∈ Lᒄ∗ , denote by $A_{c(\xi)}$ the largest annulus on which the Laurent series converges and let $c_0(r, \xi) = \frac{1}{2} \min(c(r), c(\xi))$. If q ∈ ᒅ∗ is not a pole of r(·, z), we set
$$(R_q \xi)(z) = \mathrm{p.v.}\ \frac{1}{2\pi i} \oint_C \langle r(q, w - z),\ \xi(w) \otimes 1 \rangle\, dw, \quad \forall z \in A_{c_0(r,\xi)},\ \xi \in Lᒄ^*, \tag{4.9}$$
where C is the circle centered at 0 of radius |z| with positive orientation, and p.v. denotes the principal value of the improper integral.

Lemma 4.2. $R_q \xi$ is well-defined on $A_{c_0(r,\xi)}$, i.e., the principal value of the improper integral in Eq. (4.9) exists.

Proof. Consider a circle K centered at $z \in A_{c_0(r,\xi)}$ with a small radius ε such that K intersects C at exactly two points $z'$ and $z''$. We denote by $C_\varepsilon$ the circular arc $z'z''$ and by $K'$ the portion of K which lies to the left of $C_\varepsilon$ with orientation given by the clockwise direction. By definition,
$$\mathrm{p.v.}\ \frac{1}{2\pi i} \oint_C \langle r(q, w - z),\ \xi(w) \otimes 1 \rangle\, dw = \frac{1}{2\pi i} \lim_{\varepsilon \downarrow 0} \int_{C - C_\varepsilon} \langle r(q, w - z),\ \xi(w) \otimes 1 \rangle\, dw.$$
We have
$$\int_{C - C_\varepsilon} \langle r(q, w - z),\ \xi(w) \otimes 1 \rangle\, dw = \int_{C - C_\varepsilon} \langle r(q, w - z),\ (\xi(w) - \xi(z)) \otimes 1 \rangle\, dw + \int_{C - C_\varepsilon} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw.$$
Since ξ is analytic at z, it follows from the residue condition (4.3) that
$$\lim_{\varepsilon \downarrow 0} \int_{C - C_\varepsilon} \langle r(q, w - z),\ (\xi(w) - \xi(z)) \otimes 1 \rangle\, dw = \oint_C \langle r(q, w - z),\ (\xi(w) - \xi(z)) \otimes 1 \rangle\, dw.$$
On the other hand,
$$\int_{C - C_\varepsilon} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw = \int_{(C - C_\varepsilon) + K'} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw - \int_{K'} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw = -\int_{K'} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw,$$
because $\langle r(q, w - z), \xi(z) \otimes 1 \rangle$, as a function of w, is analytic in the interior of $(C - C_\varepsilon) + K'$. Now
$$\begin{aligned}
-\int_{K'} \langle r(q, w - z),\ \xi(z) \otimes 1 \rangle\, dw &= -\int_{K'} \Big\langle \frac{\Omega}{w - z},\ \xi(z) \otimes 1 \Big\rangle\, dw - \int_{K'} \Big\langle \sum_{k=0}^{\infty} t_k(q)(w - z)^k,\ \xi(z) \otimes 1 \Big\rangle\, dw \\
&= -(I\xi)(z) \Big( \log \Big| \frac{z'' - z}{z' - z} \Big| + i\, \mathrm{Var}_{K'} \mathrm{Arg}(w - z) \Big) - \int_{K'} \Big\langle \sum_{k=0}^{\infty} t_k(q)(w - z)^k,\ \xi(z) \otimes 1 \Big\rangle\, dw \\
&\longrightarrow \pi i\, (I\xi)(z) \quad \text{as } \varepsilon \downarrow 0,
\end{aligned}$$
where $I : Lᒄ^* \to Lᒄ$ is the linear isomorphism induced by the bilinear form (·, ·) as defined by Eq. (4.6). Consequently, the principal value of the improper integral in Eq. (4.9) exists.

Indeed, from the proof of the above lemma, we obtain the formula
$$(R_q \xi)(z) = \frac{1}{2} (I\xi)(z) + \frac{1}{2\pi i} \oint_C \langle r(q, w - z),\ (\xi(w) - \xi(z)) \otimes 1 \rangle\, dw, \quad \forall \xi \in Lᒄ^*,\ z \in A_{c_0(r,\xi)}, \tag{4.10}$$
which shows that $R_q \xi$ is analytic in the annulus $A_{c_0(r,\xi)}$. We can therefore extend $R_q \xi$ to other possible values of z by analytic continuation. In this case, we can do it explicitly using the following
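Proposition 4.3. For $z \in A_{c_0(r,\xi)}$, we have the formula
$$(R_q \xi)(z) = \frac{1}{2} (I\xi)(z) + \sum_{k \geq 0} \frac{1}{k!} \Big\langle \frac{\partial^k r}{\partial z^k}(q, -z),\ \xi_{-(k+1)} \otimes 1 \Big\rangle. \tag{4.11}$$
Hence we can analytically continue $R_q \xi$ to $A_{c(r,\xi)}$ by using this formula, where c(r, ξ ) = min(c(r), c(ξ )).

Proof. Let C be the circle centered at 0 of radius |z| with positive orientation, and introduce the map
$$B(\lambda) = \frac{1}{2\pi i} \oint_C \langle r(q, w - \lambda),\ \xi(w) \otimes 1 \rangle\, dw, \quad \lambda \in A_{c_0(r,\xi)} - C.$$
If λ is on the +-side of C (i.e. the interior of C), we have
$$B(\lambda) = \frac{1}{2\pi i} \oint_C \langle r(q, w - \lambda),\ (\xi(w) - \xi(\lambda)) \otimes 1 \rangle\, dw + (I\xi)(\lambda).$$
Therefore, the boundary value
$$B_+(z) = \lim_{\substack{\lambda \to z \\ \lambda \in +\text{side}}} B(\lambda) = \frac{1}{2\pi i} \oint_C \langle r(q, w - z),\ (\xi(w) - \xi(z)) \otimes 1 \rangle\, dw + (I\xi)(z), \quad z \in C.$$
On comparing this equation with Eq. (4.10), we obtain
$$B_+(z) = \frac{1}{2} (I\xi)(z) + (R_q \xi)(z). \tag{4.12}$$
But for λ on the +-side of C, the integrand of B(λ) has poles at w = λ and w = 0 in the interior of C. Consequently, by the residue theorem,
$$\begin{aligned}
B(\lambda) &= \mathrm{Res}_{w=\lambda} \langle r(q, w - \lambda),\ \xi(w) \otimes 1 \rangle + \mathrm{Res}_{w=0} \langle r(q, w - \lambda),\ \xi(w) \otimes 1 \rangle \\
&= (I\xi)(\lambda) + \sum_{k \geq 0} \frac{1}{k!} \Big\langle \frac{\partial^k r}{\partial \lambda^k}(q, -\lambda),\ \xi_{-(k+1)} \otimes 1 \Big\rangle.
\end{aligned}$$
Equation (4.11) now follows from letting λ → z and Eq. (4.12).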
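The factor 1/2 in Eqs. (4.10) and (4.12) comes from the simple pole Ω/(w − z): a Cauchy integral over C gives the full residue when the point lies inside C, and half of it as a principal value when the point lies on C. The sketch below illustrates only this scalar fact. The function name and the choice of a midpoint rule on a grid placed symmetrically about the singular point (so that the excised arc is symmetric, as in the definition of the principal value) are assumptions made for illustration.

```python
import numpy as np

def cauchy_integral_midpoint(point, radius, angle0, n):
    """Approximate (1/(2*pi*i)) * integral over |w| = radius of dw / (w - point)
    by the midpoint rule on n nodes placed symmetrically about the angle angle0."""
    step = 2.0 * np.pi / n
    theta = angle0 + (np.arange(n) + 0.5) * step
    w = radius * np.exp(1j * theta)
    dw = 1j * w * step
    return np.sum(dw / (w - point)) / (2j * np.pi)

radius, angle0 = 1.0, 0.4
z = radius * np.exp(1j * angle0)          # a point on the contour C

# Point on the contour: the symmetric sum approximates the principal value.
principal_value = cauchy_integral_midpoint(z, radius, angle0, 2000)
# Point inside the contour: the ordinary Cauchy integral.
inside_value = cauchy_integral_midpoint(0.5 * z, radius, angle0, 2000)
```

Here `inside_value` plays the role of the coefficient of (Iξ)(λ) in B(λ), and `principal_value` that of the coefficient of (Iξ)(z) in Eq. (4.10).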
We now define the operator $R : ᒅ^* \to L(Lᒄ^*, Lᒄ)$ by
$$R(q)\xi = R_q \xi \tag{4.13}$$
for all q ∈ ᒅ∗ which is not a pole of r and for all ξ ∈ Lᒄ∗ . We shall use Eq. (4.11) to compute R from r. If q ∈ ᒅ∗ is not a pole of r(·, z), we define $r_-^\#(q) : ᒄ^* \to Lᒄ$ by
$$\big\langle (r_-^\#(q)\xi)(z),\ \eta \big\rangle = \langle r(q, z),\ \eta \otimes \xi \rangle, \quad \forall \xi, \eta \in ᒄ^*. \tag{4.14}$$
From the generalized unitarity condition, it is easy to check that $R_q^* = -R_q$. We now examine the consequences of the zero weight condition and the classical dynamical Yang-Baxter equation which are basic in our theory.
Lemma 4.4. Let r : ᒅ∗ × C −→ ᒄ ⊗ ᒄ be a classical dynamical r-matrix with spectral parameter. Then we have
1. $\langle r(q, z),\ Ad^*_{x^{-1}} \xi \otimes 1 \rangle = Ad_x \langle r(q, z),\ \xi \otimes 1 \rangle$, x ∈ H, ξ ∈ ᒄ∗ ;
2. $(r_-^\#(q))^* (ad_h^* \xi) = ad_h \big( (r_-^\#(q))^* \xi \big)$, h ∈ ᒅ, ξ ∈ Lᒄ∗ ;
3. $j^* \big( I^{-1} ([R_q \xi, I\eta] + [I\xi, R_q \eta]) \big) = 0$, ∀ξ, η ∈ Lᒄ∗ , where j : ᒅ −→ Lᒄ is the natural inclusion, and $j^* : Lᒄ^* \to ᒅ^*$ is the dual map.

Proof. (1) The relation is a simple consequence of the global version of the zero weight condition. (2) From the zero weight condition, it follows that for all h ∈ ᒅ, ξ ∈ Lᒄ∗ , η ∈ ᒄ∗ , we have
$$0 = \langle r(q, z),\ ad_h^* \xi(z) \otimes \eta + \xi(z) \otimes ad_h^* \eta \rangle = \big\langle (r_-^\#(q)\eta)(z),\ ad_h^* \xi(z) \big\rangle + \big\langle (r_-^\#(q)\, ad_h^* \eta)(z),\ \xi(z) \big\rangle. \tag{4.15}$$
If $C \subset A_{c(r,\xi)}$ is a circle centered at 0 with positive orientation and we integrate the above relation with respect to z over C, the result is
$$\big\langle r_-^\#(q)\eta,\ ad_h^* \xi \big\rangle + \big\langle r_-^\#(q)\, ad_h^* \eta,\ \xi \big\rangle = 0,$$
by the definition of the pairing in Eq. (4.7). As the above equality holds for all η ∈ ᒄ∗ , the assertion follows. (3) In Eq. (4.15), replacing r(q, z) by r(q, z − w) and η by η(w), we have
$$0 = \langle r(q, z - w),\ ad_h^* \xi(z) \otimes \eta(w) + \xi(z) \otimes ad_h^* \eta(w) \rangle, \quad \forall h \in ᒅ,\ \xi, \eta \in Lᒄ^*.$$
Let $C \subset A_{c_0(r,\xi)} \cap A_{c(\eta)}$ be a circle centered at 0 with positive orientation. For w ∈ C, taking the principal value of the integral of the above expression with respect to z over C, we have
$$0 = \big\langle (R_q\, ad_h^* \xi)(w),\ \eta(w) \big\rangle + \big\langle (R_q \xi)(w),\ ad_h^* \eta(w) \big\rangle.$$
Then an integration with respect to w over C yields
$$0 = \langle R_q\, ad_h^* \xi,\ \eta \rangle + \langle R_q \xi,\ ad_h^* \eta \rangle = -\big( j(h),\ [R_q \xi, I\eta] + [I\xi, R_q \eta] \big) = -\big\langle j(h),\ I^{-1} ([R_q \xi, I\eta] + [I\xi, R_q \eta]) \big\rangle.$$
Therefore, $j^* \big( I^{-1} ([R_q \xi, I\eta] + [I\xi, R_q \eta]) \big) = 0$.

To prepare for the proof of the next proposition, we first note by a direct calculation that
$$\big\langle [r^{12}(q, z - w),\ r^{13}(q, z - v)],\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \big\rangle = \big\langle \xi(z),\ [\langle r(q, z - w), 1 \otimes \eta(w) \rangle,\ \langle r(q, z - v), 1 \otimes \zeta(v) \rangle] \big\rangle, \tag{4.16}$$
$$\big\langle [r^{12}(q, z - w),\ r^{23}(q, w - v)],\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \big\rangle = \big\langle \eta(w),\ [\langle r(q, z - w), \xi(z) \otimes 1 \rangle,\ \langle r(q, w - v), 1 \otimes \zeta(v) \rangle] \big\rangle, \tag{4.17}$$
and
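$$\big\langle [r^{13}(q, z - v),\ r^{23}(q, w - v)],\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \big\rangle = \big\langle \zeta(v),\ [\langle r(q, z - v), \xi(z) \otimes 1 \rangle,\ \langle r(q, w - v), \eta(w) \otimes 1 \rangle] \big\rangle, \tag{4.18}$$
where ξ, η, ζ ∈ Lᒄ∗ . Let $\pi_ᒅ$ be the projection operator onto ᒅ relative to the decomposition ᒄ = ᒅ ⊕ ᒊ, where ᒊ is the orthogonal complement of ᒅ. For any ξ, η ∈ Lᒄ∗ , let $\pi_ᒅ(I\xi)(z) = \sum_i \xi_i(z) h_i$ and $\pi_ᒅ(I\eta)(z) = \sum_i \eta_i(z) h_i$. We have the following relations corresponding to the terms in $\mathrm{Alt}(d_ᒅ r)$:
$$\Big\langle \sum_i h_i^{(1)} \otimes \frac{\partial r^{23}}{\partial q_i}(q, w - v),\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \Big\rangle = \Big\langle \zeta(v),\ \sum_i \xi_i(z) \frac{\partial}{\partial q_i} \langle r(q, w - v),\ \eta(w) \otimes 1 \rangle \Big\rangle; \tag{4.19}$$
$$\Big\langle \sum_i h_i^{(2)} \otimes \frac{\partial r^{31}}{\partial q_i}(q, v - z),\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \Big\rangle = -\Big\langle \zeta(v),\ \sum_i \eta_i(w) \frac{\partial}{\partial q_i} \langle r(q, z - v),\ \xi(z) \otimes 1 \rangle \Big\rangle, \tag{4.20}$$
and
$$\Big\langle \sum_i h_i^{(3)} \otimes \frac{\partial r^{12}}{\partial q_i}(q, z - w),\ \xi(z) \otimes \eta(w) \otimes \zeta(v) \Big\rangle = \Big\langle \zeta(v),\ \sum_i h_i \frac{\partial}{\partial q_i} \langle r(q, z - w),\ \xi(z) \otimes \eta(w) \rangle \Big\rangle. \tag{4.21}$$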
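Proposition 4.5. For each q ∈ ᒅ∗ which is not a pole of r (·, z), the operator $R_q$ is in $L(Lᒄ^*, Lᒄ)^H$ and satisfies the mDYBE (Eq. (2.6)) with $c = -\frac{1}{4}$.

Proof. Let 0 < c < 1 and let $C \subset A_{c_0(r,\xi)} \cap A_{c(\eta)} \cap A_{c(\zeta)}$ be a circle centered at zero with positive orientation. From Eq. (4.16), we have
$$\begin{aligned}
&\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \mathrm{p.v.} \oint_C \oint_C \big\langle [r^{12}(q, z - w),\ r^{13}(q, z - cv)],\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \big\rangle\, dz\, dw\, dv \\
&\quad = \Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \mathrm{p.v.} \oint_C \oint_C \big\langle \xi(z),\ [\langle r(q, z - w), 1 \otimes \eta(w) \rangle,\ \langle r(q, z - cv), 1 \otimes \zeta(cv) \rangle] \big\rangle\, dw\, dz\, dv \\
&\quad = \Big( \frac{1}{2\pi i} \Big)^2 \oint_C \lim_{c \to 1-} \oint_C \big\langle \xi(z),\ [(R_q \eta)(z),\ \langle r(q, cv - z), \zeta(cv) \otimes 1 \rangle] \big\rangle\, dz\, dv \quad \text{(by the generalized unitarity condition)} \\
&\quad = -\Big( \frac{1}{2\pi i} \Big)^2 \oint_C \lim_{c \to 1-} \oint_C \big\langle \zeta(cv),\ \langle r(q, z - cv),\ I^{-1} [I\xi, R_q \eta](z) \otimes 1 \rangle \big\rangle\, dz\, dv \\
&\qquad \text{(by the ad-invariance of } (\cdot, \cdot) \text{ and the generalized unitarity condition)} \\
&\quad = -\Big\langle \zeta,\ \Big( R_q + \frac{1}{2} I \Big) I^{-1} [I\xi, R_q \eta] \Big\rangle \quad \text{(by Eq. (4.12))}.
\end{aligned}$$
Note that we have interchanged the order of integration in going from the first line to the second line of the above calculation. This fact can be easily verified and we leave the details to the reader. In what follows, it is not necessary to interchange the order of integrations. Indeed, a similar manipulation using Eq. (4.17) shows that
$$\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \mathrm{p.v.} \oint_C \oint_C \big\langle [r^{12}(q, z - w),\ r^{23}(q, w - cv)],\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \big\rangle\, dz\, dw\, dv = \Big\langle \zeta,\ \Big( R_q + \frac{1}{2} I \Big) I^{-1} [I\eta, R_q \xi] \Big\rangle.$$
Meanwhile, by using Eqs. (4.18) and (4.12), we find
$$\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \oint_C \oint_C \big\langle [r^{13}(q, z - cv),\ r^{23}(q, w - cv)],\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \big\rangle\, dz\, dw\, dv = \Big\langle \zeta,\ \Big[ \Big( R_q + \frac{1}{2} I \Big) \xi,\ \Big( R_q + \frac{1}{2} I \Big) \eta \Big] \Big\rangle.$$
On the other hand, from Eq. (4.19), we have
$$\begin{aligned}
&\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \oint_C \oint_C \Big\langle \sum_i h_i^{(1)} \otimes \frac{\partial r^{23}}{\partial q_i}(q, w - cv),\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \Big\rangle\, dz\, dw\, dv \\
&\quad = \Big\langle \zeta,\ \frac{1}{2\pi i} \oint_C \sum_i \xi_i(z) \frac{\partial}{\partial q_i} \Big( R_q + \frac{1}{2} I \Big) \eta\, dz \Big\rangle = \big\langle \zeta,\ X_{j^* \xi} (R_q \eta) \big\rangle,
\end{aligned}$$
and similarly
$$\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \oint_C \oint_C \Big\langle \sum_i h_i^{(2)} \otimes \frac{\partial r^{31}}{\partial q_i}(q, cv - z),\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \Big\rangle\, dz\, dw\, dv = -\big\langle \zeta,\ X_{j^* \eta} (R_q \xi) \big\rangle.$$
Lastly, it follows from Eq. (4.21) that
$$\begin{aligned}
&\Big( \frac{1}{2\pi i} \Big)^3 \oint_C \lim_{c \to 1-} \mathrm{p.v.} \oint_C \oint_C \Big\langle \sum_i h_i^{(3)} \otimes \frac{\partial r^{12}}{\partial q_i}(q, z - w),\ \xi(z) \otimes \eta(w) \otimes \zeta(cv) \Big\rangle\, dz\, dw\, dv \\
&\quad = \frac{1}{2\pi i} \oint_C \Big\langle \zeta(v),\ \sum_i \frac{\partial}{\partial q_i} \langle R_q \xi, \eta \rangle\, h_i \Big\rangle\, dv = \big\langle \zeta,\ d\langle R_q \xi, \eta \rangle \big\rangle.
\end{aligned}$$
Assembling the calculation, using the fact that r satisfies (CDYBE), we conclude that $R_q$ satisfies (mCDYBE). The assertion that $R_q \in L(Lᒄ^*, Lᒄ)^H$ now follows from Eq. (4.11) and Lemma 4.4(1).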
According to Proposition 2.2 and Corollary 2.4, we can use R to equip T ∗ ᒅ∗ × Lᒄ∗ with a Lie algebroid structure, and therefore T ᒅ∗ × Lᒄ admits the Lie-Poisson structure. On the other hand, consider T ∗ ᒅ∗ with the canonical cotangent symplectic structure, ᒄ∗ with the plus Lie-Poisson structure, and equip T ∗ ᒅ∗ × ᒄ∗ with the product Poisson structure. According to Example 2.1, this product structure is just the Lie-Poisson structure on the dual vector bundle T ∗ ᒅ∗ × ᒄ∗ , when T ᒅ∗ × ᒄ is the product Lie algebroid. In the next proposition, we are going to establish a Poisson map from T ∗ ᒅ∗ × ᒄ∗ to T ᒅ∗ × Lᒄ. This essentially enables us to describe certain finite-dimensional symplectic leaves of T ᒅ∗ × Lᒄ, which are simply the image of T ∗ ᒅ∗ × O under this map for coadjoint orbits O ⊂ ᒄ∗ . In order to do so, we need an equation somewhat intermediate between (CDYBE) and (mCDYBE) which involves both $(r_-^\#(q))^*$ and $R_q$:
$$\big[ (r_-^\#(q))^* \xi,\ (r_-^\#(q))^* \eta \big] - (r_-^\#(q))^* \big( I^{-1} ([R_q \xi, I\eta] + [I\xi, R_q \eta]) \big) + X_{j^* \xi} (r_-^\#(q))^* \eta - X_{j^* \eta} (r_-^\#(q))^* \xi + d\langle R_q \xi, \eta \rangle = 0, \quad \xi, \eta \in Lᒄ^*. \tag{4.22}$$
The derivation of this equation makes use of Eqs. (4.16)–(4.18) and (4.19)–(4.21) with ζ (v) replaced by ζ ∈ ᒄ∗ and with v = 0. As the calculation is similar to the proof of Proposition 4.5, we shall omit the details.

Theorem 4.6. The map ρ : T ∗ ᒅ∗ × ᒄ∗ −→ T ᒅ∗ × Lᒄ given by
$$(q, p, \xi) \mapsto \big( q,\ -i^* \xi,\ p + r_-^\#(q)\xi \big), \quad q \in ᒅ^*,\ p \in ᒅ,\ \xi \in ᒄ^*, \tag{4.23}$$
is an H -equivariant Poisson map, where H acts on T ∗ ᒅ∗ × ᒄ∗ and T ᒅ∗ × Lᒄ by acting on the second factors by coadjoint and adjoint actions respectively, i : ᒅ −→ ᒄ is the natural inclusion, and i ∗ : ᒄ∗ −→ ᒅ∗ is the dual map. In other words, ρ is an H -equivariant realization in the dynamical Lie algebroid T ∗ ᒅ∗ × Lᒄ∗ in the sense of Definition 3.1.

Proof. In order to show that ρ is a Poisson map, it is enough to check that the dual map $\rho^* : T^*ᒅ^* \times Lᒄ^* \to Tᒅ^* \times ᒄ$ is a morphism of Lie algebroids. By direct calculation, we have $\rho^*(q, p, \xi) = \big( q,\ j^* \xi,\ -p + (r_-^\#(q))^* \xi \big)$, q ∈ ᒅ∗ , p ∈ ᒅ, ξ ∈ Lᒄ∗ . There are two conditions to check. First, we have to show that $a \circ \rho^* = a_*$, where a : T ᒅ∗ × ᒄ −→ T ᒅ∗ is the anchor map of the trivial Lie algebroid. From the definition of the various quantities, this is trivial. Secondly, we have to check that the induced map on sections preserves the Lie algebroid brackets. To do so, it is enough to verify that this is the case for brackets between constant sections. Thus we have to check that
1. $\rho^* [(h_1, 0), (h_2, 0)] = [\rho^*(h_1, 0),\ \rho^*(h_2, 0)]$, ∀h1 , h2 ∈ ᒅ;
2. $\rho^* [(h, 0), (0, \xi)] = [\rho^*(h, 0),\ \rho^*(0, \xi)]$, ∀h ∈ ᒅ, ξ ∈ Lᒄ∗ ;
3. $\rho^* [(0, \xi), (0, \eta)] = [\rho^*(0, \xi),\ \rho^*(0, \eta)]$, ∀ξ, η ∈ Lᒄ∗ .
For (1), the equality follows because ᒅ is Abelian. For (2), we have
$$\rho^* [(h, 0), (0, \xi)] = \big( -j^*\, ad_h^* \xi,\ -(r_-^\#(q))^*\, ad_h^* \xi \big) = \big( 0,\ -(r_-^\#(q))^*\, ad_h^* \xi \big),$$
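as $j^*\, ad_h^* \xi = 0$. On the other hand,
$$[\rho^*(h, 0),\ \rho^*(0, \xi)] = \big[ (0, -h),\ (j^* \xi,\ (r_-^\#(q))^* \xi) \big] = \big( 0,\ -ad_h (r_-^\#(q))^* \xi \big).$$
Hence the result follows from Lemma 4.4. For (3), we have
$$\begin{aligned}
\rho^* [(0, \xi), (0, \eta)] &= \Big( j^* I^{-1} ([R\xi, I\eta] + [I\xi, R\eta]),\ -d\langle R\xi, \eta \rangle + (r_-^\#(q))^* I^{-1} ([R\xi, I\eta] + [I\xi, R\eta]) \Big) \\
&= \Big( 0,\ -d\langle R\xi, \eta \rangle + (r_-^\#(q))^* I^{-1} ([R\xi, I\eta] + [I\xi, R\eta]) \Big) \quad \text{(by Lemma 4.4)}.
\end{aligned}$$
On the other hand,
$$\begin{aligned}
[\rho^*(0, \xi),\ \rho^*(0, \eta)] &= \big[ (j^* \xi,\ (r_-^\#(q))^* \xi),\ (j^* \eta,\ (r_-^\#(q))^* \eta) \big] \\
&= \Big( [j^* \xi, j^* \eta],\ X_{j^* \xi} (r_-^\#(q))^* \eta - X_{j^* \eta} (r_-^\#(q))^* \xi + \big[ (r_-^\#(q))^* \xi,\ (r_-^\#(q))^* \eta \big] \Big).
\end{aligned}$$
Therefore the equality $\rho^* [(0, \xi), (0, \eta)] = [\rho^*(0, \xi), \rho^*(0, \eta)]$ follows from the commutativity of ᒅ and Eq. (4.22).

Following the notations in Sect. 3 (Eqs. (3.1–3.3)), we have
$$L : T^*ᒅ^* \times ᒄ^* \to Lᒄ, \quad L(q, p, \xi) = Pr_2 \circ \rho(q, p, \xi) = p + r_-^\#(q)\xi; \tag{4.24}$$
$$\tau : T^*ᒅ^* \times ᒄ^* \to Tᒅ^*, \quad \tau(q, p, \xi) = Pr_1 \circ \rho(q, p, \xi) = (q, -i^* \xi); \tag{4.25}$$
and
$$m : T^*ᒅ^* \times ᒄ^* \to ᒅ^*, \quad m(q, p, \xi) = p \circ \tau(q, p, \xi) = q. \tag{4.26}$$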
Definition 4.7. A function f on Lᒄ is said to be smooth on Lᒄ if for each X ∈ Lᒄ, the derivative df (X) ∈ Lᒄ∗ (recall that df (X) is defined as a linear functional on Lᒄ through the relation $\frac{d}{dt}\big|_{t=0} f(X + tY) = df(X)(Y)$, ∀X, Y ∈ Lᒄ).

Combining Theorem 4.6 with Propositions 3.3, 3.5, we are led to the following

Theorem 4.8. Assume that r is a classical dynamical r-matrix with spectral parameter. Then
1. $L : T^*ᒅ^* \times ᒄ^* \to Lᒄ$, $(q, p, \xi) \mapsto p + r_-^\#(q)\xi$ satisfies
$$\{L^* f, L^* g\}(x) = \big\langle L(x),\ -ad^*_{R_{m(x)}(df(L(x)))}\, dg(L(x)) + ad^*_{R_{m(x)}(dg(L(x)))}\, df(L(x)) \big\rangle + \big\langle (X_{\tau(x)} R)(df(L(x))),\ dg(L(x)) \big\rangle, \tag{4.27}$$
for x = (q, p, ξ ) ∈ T ∗ ᒅ∗ × ᒄ∗ and all smooth functions f, g on Lᒄ.
2. If $\mathcal{H} = L^* f$, f ∈ I (Lᒄ), then under the flow $\varphi_t$ generated by the Hamiltonian $\mathcal{H}$, we have the following quasi-Lax type equation:
$$\frac{dL(\varphi_t)}{dt} = \big[ R_{m(\varphi_t)}(df(L(\varphi_t))),\ L(\varphi_t) \big] - (X_{\tau(\varphi_t)} R)(df(L(\varphi_t))). \tag{4.28}$$
Remark 4.9. In the first part of the above theorem, we have restricted ourselves to smooth functions on Lᒄ with derivatives in the restricted dual Lᒄ∗ . However, we can easily extend the calculation to include linear functions of the form $l_\xi(X) = \langle \xi, X(z) \rangle$, where ξ ∈ ᒄ∗ and X ∈ Lᒄ. For these functions, the derivative $dl_\xi(X) = \delta(z - \cdot)\xi$ is in the singular part of (Lᒄ)∗ , where δ is the delta function. In particular, we obtain the St. Petersburg type formula:
$$\begin{aligned}
\{L(z) \stackrel{\otimes}{,} L(w)\} &= -\big[ r^{12}(q, z - w),\ L^1(z) \big] + \big[ r^{21}(q, w - z),\ L^2(w) \big] - X_{i^* \xi}\, r(q, z - w) & (4.29) \\
&= -\big[ r^{12}(q, z - w),\ L^1(z) + L^2(w) \big] - X_{i^* \xi}\, r(q, z - w) & (4.30)
\end{aligned}$$
by calculating with such linear functions. Here, L(z) : T ∗ ᒅ∗ × ᒄ∗ −→ Lᒄ is defined by L(z)(q, p, ξ ) = L(q, p, ξ )(z) and it is understood that $L^1(z) = L(z) \otimes 1$ and $L^2(w) = 1 \otimes L(w)$ in the above formula are evaluated at (q, p, ξ ).

In the rest of the section, we shall consider the case where ᒄ is a simple Lie algebra over C with Killing form (·, ·) and we shall take ᒅ to be a fixed Cartan subalgebra. Let Q be the quadratic function
$$Q(X) = \frac{1}{2} \oint_C (X(z), X(z))\, \frac{dz}{2\pi i z}, \quad \forall X \in Lᒄ, \tag{4.31}$$
where C is a small circle around the origin. Clearly, Q is an ad-invariant function on Lᒄ.

Definition 4.10. Assume that r is a classical dynamical r-matrix with spectral parameter. The Hamiltonian system on T ∗ ᒅ∗ × ᒄ∗ generated by the Hamiltonian function
$$\mathcal{H}(q, p, \xi) = (L^* Q)(q, p, \xi) = \frac{1}{2} \oint_C \big( L(q, p, \xi),\ L(q, p, \xi) \big)\, \frac{dz}{2\pi i z} \tag{4.32}$$
is called the spin Calogero-Moser system associated to the dynamical r-matrix r.

In [13], Etingof and Varchenko obtained a complete classification of classical dynamical r-matrices (which satisfy Eqs. (4.1–4.4)) for simple Lie algebras. Up to gauge transformations, they obtained canonical forms of the three types (rational, trigonometric and elliptic) of dynamical r-matrices. For each of these dynamical r-matrices, one can associate a spin Calogero-Moser system on T ∗ ᒅ∗ × ᒄ∗ . We will list all of them below (see Remark 4.11(1)).
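First, let us fix some notations. Let $ᒄ = ᒅ \oplus \bigoplus_{\alpha \in F} ᒄ_\alpha$ be the root space decomposition. For any positive root $\alpha \in F_+$, fix basis $e_\alpha \in ᒄ_\alpha$ and $e_{-\alpha} \in ᒄ_{-\alpha}$ which are dual with respect to (·, ·). Fix also an orthonormal basis $\{h_1, \ldots, h_N\}$ of ᒅ, and write $p = \sum_{i=1}^N p_i h_i$, $\xi_i = \langle \xi, h_i \rangle$, and $\xi_\alpha = \langle \xi, e_{-\alpha} \rangle$, for p ∈ ᒅ and ξ ∈ ᒄ∗ . Then $I\xi = \sum_{i=1}^N \xi_i h_i + \sum_{\alpha \in F} \xi_\alpha e_\alpha \in ᒄ$.

I. Rational case.
$$r(q, z) = \frac{\Omega}{z} + \sum_{\alpha \in F'} \frac{1}{(\alpha, q)}\, e_\alpha \otimes e_{-\alpha},$$
$$\mathcal{H}(q, p, \xi) = \frac{1}{2} \sum_{i=1}^N p_i^2 - \frac{1}{2} \sum_{\alpha \in F'} \frac{\xi_\alpha \xi_{-\alpha}}{(\alpha, q)^2},$$
$$L(q, p, \xi)(z) = p + \frac{I\xi}{z} + \sum_{\alpha \in F'} \frac{\xi_\alpha}{(\alpha, q)}\, e_\alpha,$$
where $F' \subset F$ is a set of roots closed with respect to the addition and multiplication by −1.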
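The rational Hamiltonian can be recovered from the rational Lax operator through Eq. (4.32): writing L(z) = A + Iξ/z with A the z-independent part, the residue of (L(z), L(z))/z at z = 0 equals (A, A), so the Hamiltonian is (A, A)/2. The sketch below checks this numerically. It is an illustration under assumptions not made in the text: gl(n) with the trace form is used in place of a simple Lie algebra with its Killing form, the root set is the full set of roots $e_i - e_j$ (so (α, q) = q_i − q_j), and ξ is stored as a matrix with ξ_α = xi[i, j]. The function names are hypothetical.

```python
import numpy as np

def lax_regular_part(q, p, xi):
    """z-independent part A of L(z) = p + I(xi)/z + sum_alpha xi_alpha/(alpha, q) e_alpha."""
    n = len(q)
    a = np.diag(np.asarray(p, dtype=float))
    for i in range(n):
        for j in range(n):
            if i != j:
                a[i, j] = xi[i, j] / (q[i] - q[j])
    return a

def rational_hamiltonian(q, p, xi):
    """(1/2) sum p_i^2 - (1/2) sum_{i != j} xi_ij xi_ji / (q_i - q_j)^2."""
    n = len(q)
    potential = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                potential += xi[i, j] * xi[j, i] / (q[i] - q[j]) ** 2
    return 0.5 * np.sum(np.asarray(p) ** 2) - 0.5 * potential

rng = np.random.default_rng(1)
q = np.array([0.2, 1.3, -0.9, 2.4])
p = rng.standard_normal(4)
xi = rng.standard_normal((4, 4))

a = lax_regular_part(q, p, xi)
energy_from_lax = 0.5 * np.trace(a @ a)        # (1/2) Res_{z=0} (L, L)/z
energy_direct = rational_hamiltonian(q, p, xi)

# Action of a diagonal group element h: xi -> h xi h^{-1}; the Hamiltonian should not change.
h = np.diag(np.exp(np.array([0.3, -0.1, 0.5, -0.7])))
xi_moved = h @ xi @ np.linalg.inv(h)
energy_moved = rational_hamiltonian(q, p, xi_moved)
```

The last comparison reflects the fact that each product ξ_α ξ_{−α} is unchanged when ξ_α is rescaled by a character of the diagonal torus.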
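II. Trigonometric case.
$$\begin{aligned}
r(q, z) = {} & \Big( \cot z + \frac{1}{3} z \Big) \sum_{i=1}^N h_i \otimes h_i + \sum_{\alpha \in F(G')} \frac{\sin((\alpha, q) + z)}{\sin(\alpha, q)\, \sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, e_\alpha \otimes e_{-\alpha} \\
& + \sum_{\alpha \in F_+ - F(G')} \frac{e^{-iz}}{\sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, e_\alpha \otimes e_{-\alpha} + \sum_{\alpha \in F_- - F(G')} \frac{e^{iz}}{\sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, e_\alpha \otimes e_{-\alpha},
\end{aligned}$$
$$\mathcal{H}(q, p, \xi) = \frac{1}{2} \sum_{i=1}^N p_i^2 - \frac{1}{2} \sum_{\alpha \in F(G')} \Big( \frac{1}{\sin^2(\alpha, q)} - \frac{1}{3} \Big) \xi_\alpha \xi_{-\alpha} - \frac{5}{6} \sum_{\alpha \in F - F(G')} \xi_\alpha \xi_{-\alpha},$$
$$\begin{aligned}
L(q, p, \xi)(z) = {} & p + \Big( \cot z + \frac{1}{3} z \Big) \sum_{i=1}^N \xi_i h_i + \sum_{\alpha \in F(G')} \frac{\sin((\alpha, q) + z)}{\sin(\alpha, q)\, \sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, \xi_\alpha e_\alpha \\
& + \sum_{\alpha \in F_+ - F(G')} \frac{e^{-iz}}{\sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, \xi_\alpha e_\alpha + \sum_{\alpha \in F_- - F(G')} \frac{e^{iz}}{\sin z}\, e^{\frac{1}{3} z (\alpha, q)}\, \xi_\alpha e_\alpha.
\end{aligned}$$
Here $F = F_+ \cup F_-$ is a polarization of F, $G'$ is a subset of the set of simple roots, and $F(G')$ denotes the set of all roots which are linear combinations of roots from $G'$.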
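The potential $1/\sin^2(\alpha, q) - 1/3$ in the trigonometric Hamiltonian can be traced to the constant term, in the Laurent expansion at z = 0, of the product of the Lax coefficients attached to a root α and its negative, which is what the contour integral in Eq. (4.32) extracts. The sketch below checks this numerically for the roots in the first sum only, and also checks that the Cartan coefficient cot z + z/3 contributes no constant term. It is illustrative: `a` stands for (α, q), the function names are hypothetical, and the constant term is approximated by subtracting the double pole 1/z² at a small value of z.

```python
import math

def root_factor(a, z):
    """Coefficient of xi_alpha e_alpha in L(z) for a root in the first sum, with a = (alpha, q)."""
    return math.sin(a + z) / (math.sin(a) * math.sin(z)) * math.exp(a * z / 3.0)

def cartan_factor(z):
    """Coefficient of the Cartan part of L(z)."""
    return 1.0 / math.tan(z) + z / 3.0

a, z = 0.8, 1e-3

# Product of the factors for alpha and -alpha, minus its double pole at z = 0.
constant_term_roots = root_factor(a, z) * root_factor(-a, z) - 1.0 / z ** 2
expected_roots = 1.0 / 3.0 - 1.0 / math.sin(a) ** 2

# The square of the Cartan factor has no constant term beyond the double pole.
constant_term_cartan = cartan_factor(z) ** 2 - 1.0 / z ** 2
```

Multiplying `expected_roots` by 1/2 gives the coefficient of ξ_α ξ_{−α} for these roots, and the vanishing Cartan constant term is consistent with the absence of ξ_i terms in the Hamiltonian.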
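III. Elliptic case.
$$r(q, z) = \zeta(z) \sum_{i=1}^N h_i \otimes h_i - \sum_{\alpha \in F} l((\alpha, q), z)\, e_\alpha \otimes e_{-\alpha},$$
$$\mathcal{H}(q, p, \xi) = \frac{1}{2} \sum_{i=1}^N p_i^2 - \frac{1}{2} \sum_{\alpha \in F} \wp((\alpha, q))\, \xi_\alpha \xi_{-\alpha},$$
$$L(q, p, \xi)(z) = p + \zeta(z) \sum_{i=1}^N \xi_i h_i - \sum_{\alpha \in F} l((\alpha, q), z)\, \xi_\alpha e_\alpha,$$
where $\zeta(z) = \frac{\sigma'(z)}{\sigma(z)}$, $\wp(z) = -\zeta'(z)$, $l(w, z) = -\frac{\sigma(w + z)}{\sigma(w)\sigma(z)}$, and σ (z) is the Weierstrass σ function of periods 2ω1 , 2ω2 .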
Remark 4.11. 1. In the trigonometric case and the elliptic case, the classical dynamical r-matrices with spectral parameter which we used above are gauge equivalent to those in [13]. If we had used the canonical forms given in [13], the Hamiltonians of the associated spin systems will have additional terms which depend on i ∗ ξ . The same remark also applies to the most general dynamical r-matrix which one can obtain by using gauge transformations. However, as will be evident in the next result, these additional terms do not give rise to any new systems upon reduction. 2. In the rational and trigonometric case above, the spin systems that we have here are in one-to-one correspondence with some subsets of the root system. Thus we have as many spin systems as these special subsets. 3. The reader should note that the ᒐo(N) and ᒐu(N ) models in [5] and [33] are different from ours. A generalization of this class of models has come up naturally in the study of orbit spaces in the recent work of Alekseevsky et. al. [2]. 4. Starting from the trigonometric r-matrix r(q, z) above, we can obtain the hyperbolic r-matrix N 1 r(q, z) = −i coth z − iz hj ⊗ h j 3 j =1
−i
α∈F(G )
sinh ((α, q) + z) − 1 z(α,q) eα ⊗ e−α e 3 sinh (α, q) sinh z
−i
e−z − 1 z(α,q) eα ⊗ e−α e 3 sinh z
α∈F+ −F(G )
−i
ez − 1 z(α,q) eα ⊗ e−α , e 3 sinh z
α∈F− −F(G )
by using the gauge transformation r(q, z) → r(q, z) = −ir(−iq, −iz). The corresponding spin Calogero-Moser system in this case is the hyperbolic model with Hamiltonian given by
N
H(q, p, ξ ) =
1 2 1 pi + 2 2 i=1
+
5 6
α∈F(G )
1 1 − ξα ξ−α sinh2 (α, q) 3
ξα ξ−α .
α∈F−F(G )
We conclude this section with the following result which prepares the way for the construction of associated integrable models in the next section.

Theorem 4.12. The Hamiltonians of the spin Calogero-Moser systems are invariant under the canonical H -action on T ∗ ᒅ∗ × ᒄ∗ :
$$x \cdot (q, p, \xi) = \big( q,\ p,\ Ad^*_{x^{-1}} \xi \big), \quad \forall x \in H,\ (q, p, \xi) \in T^*ᒅ^* \times ᒄ^*, \tag{4.33}$$
with momentum map J : T ∗ ᒅ∗ × ᒄ∗ −→ ᒅ∗ given by
$$J(q, p, \xi) = i^* \xi. \tag{4.34}$$
The set defined by $X_{i^* \xi} R = 0$ coincides with $J^{-1}(0)$ in the trigonometric and elliptic cases, and with $J^{-1}((F')^\perp)$ in the rational case. Thus in each case, this set is invariant under the dynamics and on it we have
$$\frac{dL}{dt} = [R(M), L],$$
where M(q, p, ξ )(z) = L(q, p, ξ )(z)/z.

Remark 4.13. Note that $J^{-1}(0)$ is not a Poisson submanifold of T ∗ ᒅ∗ × ᒄ∗ , otherwise the corresponding subsystem on $J^{-1}(0)$ would have a natural collection of Poisson commuting integrals and there would be no need to use reduction to construct the associated integrable flows.

5. Integrable Spin Calogero-Moser Systems

In this section, we shall carry out the reduction procedure outlined in Sect. 3 to the spin Calogero-Moser systems. As a result, we obtain a new family of integrable systems, which we call integrable spin Calogero-Moser systems. For ᒄ = ᒐᒉ(n, C), the usual Calogero-Moser systems as well as their spin generalizations (in the sense of Gibbons and Hermsen [16]) appear as subsystems of what we have on special symplectic leaves of the reduced Poisson manifold. However, for other simple Lie algebras, the usual Calogero-Moser systems without spin cannot be realized in this fashion, as we shall explain below. Our first task below is to construct an H -equivariant map g which allows us to construct the equations of motion in Lax pair form for the reduced Hamiltonian $\mathcal{H}_0$.

For any root α ∈ F, recall that the coroot $h_\alpha$ is the element in ᒅ corresponding to $\frac{2\alpha}{(\alpha, \alpha)}$ under the isomorphism between ᒅ and ᒅ∗ induced by the Killing form (·, ·), i.e., for any β ∈ ᒅ∗ , $\beta(h_\alpha) = \frac{2(\beta, \alpha)}{(\alpha, \alpha)}$. Therefore, if we fix a simple system $G = \{\alpha_1, \ldots, \alpha_N\} \subset F$, we have a basis of ᒅ given by the fundamental coroots $h_{\alpha_1}, \ldots, h_{\alpha_N}$. In particular, the entries of the Cartan matrix $A = (A_{ij})$ are given by $A_{ij} = \alpha_j(h_{\alpha_i})$. Let $\omega_1, \ldots, \omega_N$ be the fundamental weights, i.e., the dual basis of $h_{\alpha_1}, \ldots, h_{\alpha_N}$ in ᒅ∗ . Then it is clear that
$$\alpha_i = \sum_{j=1}^N A_{ji}\, \omega_j. \tag{5.1}$$
We shall denote by $C = (C_{ij})$ the inverse of the Cartan matrix. Clearly, we have $C_{ij} \in \mathbb{Q}$, $\forall i, j$. Consider the open submanifold of $\mathfrak{g}^*$:
$U = \{\xi \in \mathfrak{g}^* \mid \xi_{\alpha_i} = \langle \xi, e_{-\alpha_i} \rangle \neq 0,\ i = 1, \dots, N\}$. (5.2)
It is clear that U is stable under the coadjoint action of H (considered as a subgroup of G). We shall assume that H is simply connected. As we mentioned above, our first task is to construct a map $g : U \to H$ with the property that
$g(\mathrm{Ad}^*_{h^{-1}}\xi) = h \cdot g(\xi)$, $\forall h \in H$. (5.3)
In other words, g is equivariant, where H acts on itself by left translation.
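For convenience, we shall identify $\mathfrak{g}^*$ with $\mathfrak{g}$ below by the Killing form and identify U with the open submanifold $\{\xi = \sum_{i=1}^{N} \xi_i h_i + \sum_{\alpha \in F} \xi_\alpha e_\alpha \in \mathfrak{g} \mid \xi_{\alpha_i} \neq 0,\ i = 1, \dots, N\}$ of $\mathfrak{g}$. Thus the coadjoint action becomes the adjoint action and Eq. (5.3) becomes
$g(\mathrm{Ad}_h \xi) = h \cdot g(\xi)$, $\forall h \in H$. (5.4)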
Since H is abelian and simply connected, the map $\log : H \to \mathfrak{h}$ inverse to the exponential map is well defined. Indeed, for all $h \in H$, we have
$\log h = \omega_1(\log h)\,h_{\alpha_1} + \cdots + \omega_N(\log h)\,h_{\alpha_N}$, (5.5)
and
$\mathrm{Ad}_h e_\alpha = \chi_\alpha(h)\,e_\alpha$, $\chi_\alpha(h) = e^{\alpha(\log h)}$. (5.6)
Note that if $g_i : U \to H$, $i = 1, \dots, N$, satisfies
$g_i(\mathrm{Ad}_h \xi) = \exp\big(\omega_i(\log h)\,h_{\alpha_i}\big)\, g_i(\xi)$, $\forall h \in H$, $\xi \in U$, $i = 1, \dots, N$, (5.7)
then $g = g_1 \cdots g_N$ will have the desired property in Eq. (5.4). We shall seek $g_i$ in the form
$g_i(\xi) = \exp\big(\phi_i(\xi)\,h_{\alpha_i}\big)$, (5.8)
where $\phi_i$ is a function on U. In order for $g_i$ to satisfy Eq. (5.7), it is enough that
$\phi_i(\mathrm{Ad}_h \xi) = \phi_i(\xi) + \omega_i(\log h)$. (5.9)
Let $\psi_i(\xi) = e^{\phi_i(\xi)}$. Then Eq. (5.9) translates into
$\psi_i(\mathrm{Ad}_h \xi) = \chi_i(h)\,\psi_i(\xi)$, $\chi_i(h) = e^{\omega_i(\log h)}$. (5.10)
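That is, $\psi_i$ is a semi-invariant with character $\chi_i$. In what follows, we shall fix a branch of the logarithmic function. We shall seek $\psi_i$ of the form
$\psi_i(\xi) = \prod_{j=1}^{N} \xi_{\alpha_j}^{\,n_{ij}}$, $\forall \xi \in U$. (5.11)
Then by Eqs. (5.5–5.6),
$\psi_i(\mathrm{Ad}_h \xi) = \prod_{j=1}^{N} \big(\chi_{\alpha_j}(h)\,\xi_{\alpha_j}\big)^{n_{ij}} = \prod_{j=1}^{N} \chi_{\alpha_j}(h)^{n_{ij}}\,\psi_i(\xi) = \prod_{j=1}^{N} e^{n_{ij}\alpha_j(\log h)}\,\psi_i(\xi)$, $h \in H$, $\xi \in U$.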
Therefore, in order to satisfy Eq. (5.10), it suffices to pick $n_{ij}$ so that $\omega_i = \sum_{j=1}^{N} n_{ij}\alpha_j$. But from the relation in Eq. (5.1), we must have $n_{ij} = C_{ji}$, i.e.,
$\psi_i(\xi) = \prod_{j=1}^{N} \xi_{\alpha_j}^{\,C_{ji}}$ (5.12)
and
$g_i(\xi) = \exp\Big(\sum_{j=1}^{N} C_{ji} \log \xi_{\alpha_j}\, h_{\alpha_i}\Big)$. (5.13)
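The choice $n_{ij} = C_{ji}$ can be sanity-checked numerically. The following minimal Python sketch does so for the Cartan matrix of type $A_2$, an example chosen here and not taken from the text. It checks that $\phi_i = \sum_j C_{ji}\log\xi_{\alpha_j}$ shifts by $\omega_i(\log h)$ under $\mathrm{Ad}_h$, as required by Eq. (5.9). The names `phi`, `act` and `equivariance_defect`, the coordinates $t_i = \omega_i(\log h)$, and the restriction to positive real $\xi_{\alpha_j}$ (which sidesteps the choice of branch of the logarithm) are assumptions made for illustration.

```python
from fractions import Fraction
import math

# Cartan matrix of type A2 (illustrative choice), A[i][j] = alpha_j(h_{alpha_i}).
A = [[2, -1], [-1, 2]]


def inverse_2x2(M):
    """Exact inverse of a 2x2 integer matrix, returned with Fraction entries."""
    det = Fraction(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


C = inverse_2x2(A)  # C = A^{-1}, rational entries


def phi(xi_simple):
    """phi_i(xi) = sum_j C[j][i] * log(xi_{alpha_j}), cf. Eqs. (5.12)-(5.13)."""
    n = len(xi_simple)
    return [sum(float(C[j][i]) * math.log(xi_simple[j]) for j in range(n))
            for i in range(n)]


def act(t, xi_simple):
    """Ad_h with log h = sum_i t[i] h_{alpha_i}.

    By Eq. (5.6) the component xi_{alpha_j} is scaled by exp(alpha_j(log h)),
    and alpha_j(log h) = sum_i t[i] * A[i][j].
    """
    n = len(t)
    return [math.exp(sum(t[i] * A[i][j] for i in range(n))) * xi_simple[j]
            for j in range(n)]


def equivariance_defect(t, xi_simple):
    """Largest violation of phi_i(Ad_h xi) = phi_i(xi) + t[i], cf. Eq. (5.9)."""
    before = phi(xi_simple)
    after = phi(act(t, xi_simple))
    return max(abs(a - b - ti) for a, b, ti in zip(after, before, t))
```

For instance, `equivariance_defect([0.3, -1.1], [2.0, 5.0])` should vanish up to rounding error. The underlying reason is the exact identity $\sum_j C_{ji} A_{kj} = \delta_{ki}$, which is what Eq. (5.1) encodes.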
Consequently, we have

Theorem 5.1. The formula
$g(\xi) = \exp\Big(\sum_{i=1}^{N} \sum_{j=1}^{N} C_{ji} \log \xi_{\alpha_j}\, h_{\alpha_i}\Big)$ (5.14)
defines an H-equivariant map $g : U \to H$.

Consider the Poisson submanifold $T^*\mathfrak{h}^* \times U$ of $T^*\mathfrak{h}^* \times \mathfrak{g}^*$. Clearly, the H-action defined by Eq. (4.33) induces a Hamiltonian action on $T^*\mathfrak{h}^* \times U$ and therefore the moment map $J : T^*\mathfrak{h}^* \times U \to \mathfrak{h}^*$ is given by restriction of the one in Eq. (4.34). Hence $J^{-1}(0) = T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U)$, and therefore we have
$X_v R = 0$, $\forall v \in \tau(J^{-1}(0))$. (5.15)
Thus according to Theorem 4.6, Theorem 4.12, Theorem 5.1 and Eq. (5.15), we conclude that Assumptions A1–A4 in Sect. 3 are all satisfied and therefore we can now apply the reduction procedure of Sect. 3 to our situation. We first characterize the reduced space using the following:

Theorem 5.2. The quotient space $J^{-1}(0)/H \cong T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U)/H$ is analytic and can be identified with $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$, where $\mathfrak{g}^*_{red}$ is the affine subspace $@ + \sum_{\alpha \in F - G} \mathbb{C}\,e^*_\alpha$ and $@ = \sum_{j=1}^{N} e^*_{\alpha_j}$, where $\{e^*_\alpha \mid \alpha \in F\}$ denotes the dual vectors in $\mathfrak{g}^*$ corresponding to $\{e_\alpha \mid \alpha \in F\}$ in $\mathfrak{g}$.

Proof. It is simple to see that the action of H on $\mathfrak{h}^\perp \cap U$ is locally free. Moreover, each H-orbit through $\mathfrak{h}^\perp \cap U$ has exactly one intersection with $\mathfrak{g}^*_{red}$. To see this, recall that the simple system has the following property, namely, if $\alpha \in F$, there exist integers $m^i_\alpha$ ($1 \le i \le N$), either all nonnegative or all nonpositive, such that $\alpha = \sum_{i=1}^{N} m^i_\alpha \alpha_i$. Hence for a given $\xi \in \mathfrak{h}^\perp \cap U$, if we let $h = g(\xi)^{-1}$, then $\mathrm{Ad}^*_{h^{-1}} \xi = \sum_{\alpha \in F} \xi_\alpha \chi_\alpha(h)\, e^*_\alpha = @ + \sum_{\alpha \in F - G} \xi_\alpha \prod_{i=1}^{N} \xi_{\alpha_i}^{-m^i_\alpha}\, e^*_\alpha \in \mathfrak{g}^*_{red}$. Hence we can identify $(\mathfrak{h}^\perp \cap U)/H$ with $\mathfrak{g}^*_{red}$.
Remark 5.3. Indeed $s_\alpha = \xi_\alpha \prod_{i=1}^{N} \xi_{\alpha_i}^{-m^i_\alpha}$, $\alpha \in F - G$, are a set of H-invariant functions on $\mathfrak{h}^\perp \cap U$, which can be used as a coordinate system for $\mathfrak{g}^*_{red}$. If $s \in \mathfrak{g}^*_{red}$, we then may write $s = \sum_{\alpha \in F} s_\alpha e^*_\alpha$ with $s_{\alpha_i} = 1$, $i = 1, \dots, N$.

By Poisson reduction [25], the reduced manifold $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ has a unique Poisson structure which is a product structure, where the second factor $\mathfrak{g}^*_{red}$ is equipped with the reduction (at 0) of the Lie-Poisson structure on U by the H-coadjoint action. The Poisson brackets between the coordinate functions $s_\alpha$ on $\mathfrak{g}^*_{red}$ can be obtained by a straightforward but tedious computation. We shall leave the details to the interested reader. Now, the symplectic leaves of $\mathfrak{g}^*_{red}$ are the symplectic reductions of $O \cap U$ at 0, where $O \subset \mathfrak{g}^*$ is a coadjoint orbit [25]. In other words, any symplectic leaf of $\mathfrak{g}^*_{red}$ is of the form $(O \cap U \cap \mathfrak{h}^\perp)/H$, and we shall denote this by $O_{red}$. Obviously, $O_{red}$ is a symplectic manifold of dimension $\dim O - 2N$, where N is the rank of the Lie algebra $\mathfrak{g}$. Consequently, the symplectic leaves of $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ are of the form $T^*\mathfrak{h}^* \times O_{red}$, which is of dimension equal to $\dim O$. Accordingly, if H is the Hamiltonian of one of the spin Calogero-Moser systems in Sect. 4, and L is the corresponding Lax operator, there exist a uniquely determined Hamiltonian function $H_0$ and Lax operator $L_0$ on the reduced Poisson manifold $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ such that $\pi^* H_0 = H|_{T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U)}$ and $L_0 \circ \pi = \tilde{L}|_{T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U)}$. Here, $\pi : T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U) \to T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ is the natural projection given by
$\pi(q, p, \xi) = \Big(q,\ p,\ @ + \sum_{\alpha \in F - G} \xi_\alpha \prod_{i=1}^{N} \xi_{\alpha_i}^{-m^i_\alpha}\, e^*_\alpha\Big)$, (5.16)
and $\tilde{L} : T^*\mathfrak{h}^* \times U \to L\mathfrak{g}$ is given by
$\tilde{L}(q, p, \xi) = \mathrm{Ad}_{g(\xi)^{-1}} L(q, p, \xi)$. (5.17)
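The H-invariance of the coordinates $s_\alpha$ in Remark 5.3 is easy to test in the matrix case. The sketch below is a minimal illustration, assuming $\mathfrak{g} = \mathfrak{sl}(n, \mathbb{C})$ with H the diagonal matrices, so that $\mathrm{Ad}_h$ scales the $(i, j)$ entry by $h_i/h_j$ and the simple root components are the superdiagonal entries $\xi_{k,k+1}$. The function names and the sample data are hypothetical, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def conjugate(h, xi):
    """Ad_h for diagonal h = diag(h[0], ..., h[n-1]): entry (i, j) picks up h[i] / h[j]."""
    n = len(h)
    return [[h[i] * xi[i][j] / h[j] for j in range(n)] for i in range(n)]


def invariants(xi):
    """s_ij = xi_ij * prod_k xi_{k,k+1}^(-m_k), where e_i - e_j = sum_k m_k alpha_k.

    For i < j all m_k on the chain i, ..., j-1 equal +1; for i > j they equal -1.
    """
    n = len(xi)
    s = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            lo, hi = min(i, j), max(i, j)
            chain = Fraction(1)
            for k in range(lo, hi):
                chain *= xi[k][k + 1]
            s[i][j] = xi[i][j] / chain if i < j else xi[i][j] * chain
    return s


# Hypothetical sample point of h-perp intersected with U (zero diagonal, nonzero superdiagonal).
xi = [[Fraction(v) for v in row] for row in ([0, 2, 5], [3, 0, 7], [11, 13, 0])]
h = [Fraction(2), Fraction(3), Fraction(1, 6)]  # determinant 1

s = invariants(xi)
s_moved = invariants(conjugate(h, xi))
```

Here `s` and `s_moved` agree entry by entry, and the superdiagonal entries of `s` equal 1, which matches the normalization $s_{\alpha_i} = 1$ in Remark 5.3.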
We can now state the main result of the paper.

Theorem 5.4. Let H be the Hamiltonian of a spin Calogero-Moser system with Lax operator L. And let $\tilde{R} : T^*\mathfrak{h}^* \times U \to L(L\mathfrak{g}^*, L\mathfrak{g})$ be the map as defined by Eq. (3.13), which is obtained from R by applying the gauge transform given by g in Theorem 5.1, and $R_0$ the induced map on $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ in the sense that $R_0 \circ \pi = \tilde{R}|_{T^*\mathfrak{h}^* \times (\mathfrak{h}^\perp \cap U)}$. Then the Hamiltonian system generated by the induced function $H_0$ on the reduced Poisson manifold $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ admits a Lax operator $L_0 : T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red} \to L\mathfrak{g}$ satisfying the following properties:
1. For any smooth functions $f_1$, $f_2$ on $L\mathfrak{g}$,
∗ ˜ adR∗ ∗ (x)(df ˜ ˜ = − L0 (x), L0 f1 , L∗0 f2 (x) ˜ df1 (L0 (x)) 2 (L0 (x))) 0 ˜ +adR∗ 0 (x)(df ˜ , ˜ ˜ df2 (L0 (x)) 1 (L0 (x))) ∀x˜ ∈ T ∗ ᒅ∗ × ᒄ∗red ;
(5.18)
2. Functions in $L_0^* I(L\mathfrak{g})$ provide a family of Poisson commuting conserved quantities for $H_0$.
3. Under the Hamiltonian flow generated by $H_0$, we have
$\frac{dL_0}{dt} = -\big[R_0^*(M_0), L_0\big]$, (5.19)
where $M_0(q, p, s)(z) = L_0(q, p, s)(z)/z$, $\forall (q, p, s) \in T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$.

Remark 5.5. 1. As in Remark 4.9, we can derive a St. Petersburg type formula:
$\{L_0(z) \overset{\otimes}{,} L_0(w)\} = -\big[\tilde{r}^{12}(q, z - w), L_1(z)\big] + \big[\tilde{r}^{21}(q, w - z), L_2(w)\big]$, (5.20)
where $\tilde{r}(q, z)$ can be described by an equation similar to Eq. (3.19).
2. Theorem 5.4 is presented in a general Poisson setting. But it is clear that all the above claims are still valid when we restrict the Hamiltonian $H_0$ as well as other operators $L_0$, $R_0$ to a particular symplectic leaf of $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$.

As in Remark 5.3, for any $s \in \mathfrak{g}^*_{red}$, we write $s = \sum_{\alpha \in F} s_\alpha e^*_\alpha$ with $s_{\alpha_i} = 1$, $i = 1, \dots, N$. Explicitly, the Hamiltonian $H_0$ and the Lax operators $L_0$ of the integrable spin Calogero-Moser systems on $T^*\mathfrak{h}^* \times \mathfrak{g}^*_{red}$ are given as follows:

I. Rational case.
$H_0(q, p, s) = \frac{1}{2} \sum_{i=1}^{N} p_i^2 - \frac{1}{2} \sum_{\alpha \in F'} \frac{s_\alpha s_{-\alpha}}{(\alpha, q)^2}$,
$L_0(q, p, s)(z) = p + \frac{1}{z} \sum_{\alpha \in F} s_\alpha e_\alpha + \sum_{\alpha \in F'} \frac{s_\alpha}{(\alpha, q)}\, e_\alpha$,
where $F' \subset F$ is a set of roots closed with respect to addition and multiplication by −1.

II. Trigonometric case.
N
H0 (q, p, s) =
1 2 1 pi − 2 2 i=1
L0 (q, p, s)(z) = p +
α∈F(G )
+
α∈F(G )
1 1 5 − sα s−α , − sα s−α , 2 6 sin (α, q) 3 α∈F−F(G )
sin ((α, q) + z) 1 z(α,q) s α eα e3 sin (α, q) sin z
α∈F+ −F(G )
e−iz 1 z(α,q) s α eα + e3 sin z
α∈F− −F(G )
eiz 1 z(α,q) sα eα . e3 sin z
Here $F = F_+ \cup F_-$ is a polarization of F, $G'$ is a subset of the set of simple roots, and $F(G')$ denotes the set of all roots which are linear combinations of roots from $G'$.
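III. Elliptic case.
$H_0(q, p, s) = \frac{1}{2} \sum_{i=1}^{N} p_i^2 - \frac{1}{2} \sum_{\alpha \in F} \wp((\alpha, q))\, s_\alpha s_{-\alpha}$,
$L_0(q, p, s)(z) = p - \sum_{\alpha \in F} l((\alpha, q), z)\, s_\alpha e_\alpha$,
where $\zeta(z) = \frac{\sigma'(z)}{\sigma(z)}$, $\wp(z) = -\zeta'(z)$, $l(w, z) = -\frac{\sigma(w + z)}{\sigma(w)\sigma(z)}$, and $\sigma(z)$ is the Weierstrass σ function of periods $2\omega_1$, $2\omega_2$.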
Remark 5.6. Let $\phi : \mathfrak{g} \to \mathfrak{gl}(n, \mathbb{C})$ be a representation of $\mathfrak{g}$. Then it induces a representation of $L\mathfrak{g}$, which we denote also by the same symbol. Let $A(q, p, \xi) = (\phi \circ L_0)(q, p, \xi)$, where $L_0$ is the Lax operator of one of the integrable spin systems listed above. Then we have the spectral curve $C : \det(A(q, p, \xi)(z) - w) = 0$, which is preserved by the flow generated by the Hamiltonian $H_0$. The integrability of $H_0$ in the Liouville sense on the symplectic leaves $T^*\mathfrak{h}^* \times O_{red}$ of $T^*\mathfrak{h}^* \times \mathfrak{g}^*$ (of various dimensions) will be investigated in subsequent work.

Remark 5.7. 1. For the hyperbolic spin Calogero-Moser system in Remark 4.11 (4), the corresponding integrable model has Hamiltonian and Lax operator given by
N
1 2 1 H0 (q, p, s) = pi + 2 2 i=1
5 + 6
α∈F(G )
1 1 − sα s−α sinh2 (α, q) 3
sα s−α ,
α∈F−F(G )
L0 (q, p, s)(z) = p − i
α∈F(G )
sinh ((α, q) + z) − 1 z(α,q) s α eα e 3 sinh (α, q) sinh z
−i
α∈F+ −F(G )
e−z − 1 z(α,q) sα e α e 3 sinh z
−i
α∈F− −F(G )
ez − 1 z(α,q) sα e α . e 3 sinh z
2. In [30], Reshetikhin considers an integrable spin Calogero-Moser system on $T^*H_{reg} \times O'$, where $H_{reg}$ is the regular part of the Cartan subgroup in G, and $O'$ is the reduction of a regular coadjoint orbit O in $\mathfrak{g}^*$ at $0 \in \mathfrak{h}^*$. Using his notation, the Hamiltonian of the system is of the form
$H_{CM} = \frac{1}{2}(p, p) + \sum_{\alpha \in F_+} \frac{(\alpha, \alpha)\,\mu_\alpha \mu_{-\alpha}}{\big(\gamma_{\alpha/2} - \gamma_{\alpha/2}^{-1}\big)^2}$,
where $\gamma_\alpha$ is the coordinate function on H corresponding to α, and $\mu_\alpha \mu_{-\alpha}$ lives in $O'$. If H is assumed to be simply connected, then $\gamma_\alpha(h) = e^{\alpha(\log h)}$ and so $\gamma_{\alpha/2}(h) - \gamma_{\alpha/2}(h)^{-1} = 2 \sinh\big(\tfrac{1}{2}\alpha(\log h)\big)$. Thus the system in [30] is also hyperbolic.

Example 5.1. For $\mathfrak{g} = \mathfrak{sl}(3, \mathbb{C})$, $\mathfrak{g}^*_{red}$ can be identified with the affine subspace consisting of matrices of the form
$s = \begin{pmatrix} 0 & 1 & s_{13} \\ s_{21} & 0 & 1 \\ s_{31} & s_{32} & 0 \end{pmatrix}$.
The Poisson structure on $\mathfrak{g}^*_{red}$ is given by
$\{s_{13}, s_{21}\} = 1 - s_{13}^2 s_{21}$,
$\{s_{13}, s_{31}\} = s_{13}(s_{21} - s_{32})$,
$\{s_{13}, s_{32}\} = -1 + s_{13}^2 s_{32}$,
$\{s_{21}, s_{31}\} = s_{21}(s_{32} - s_{13} s_{31})$,
$\{s_{21}, s_{32}\} = s_{31} - s_{13} s_{21} s_{32}$,
$\{s_{31}, s_{32}\} = s_{32}(s_{21} - s_{13} s_{31})$.
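Since the brackets of Example 5.1 are quoted without derivation, it may be useful to confirm that they satisfy the Jacobi identity. The sketch below does this with a small exact polynomial arithmetic written for the purpose. It is an illustration under one stated assumption: the exponents in $\{s_{13}, s_{21}\}$ and $\{s_{13}, s_{32}\}$ are read as $s_{13}^2$. All function names are hypothetical.

```python
from itertools import combinations

N_VARS = 4  # coordinates ordered as (s13, s21, s31, s32)
ONE = {(0,) * N_VARS: 1}


def var(i):
    """Coordinate function number i as a polynomial {exponent tuple: coefficient}."""
    e = [0] * N_VARS
    e[i] = 1
    return {tuple(e): 1}


def add(p, q, sign=1):
    """Return p + sign * q with zero coefficients removed."""
    r = dict(p)
    for m, c in q.items():
        r[m] = r.get(m, 0) + sign * c
        if r[m] == 0:
            del r[m]
    return r


def mul(p, q):
    """Product of two polynomials."""
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            r[m] = r.get(m, 0) + c1 * c2
    return {m: c for m, c in r.items() if c}


def diff(p, i):
    """Partial derivative with respect to coordinate i."""
    r = {}
    for m, c in p.items():
        if m[i]:
            e = list(m)
            e[i] -= 1
            r[tuple(e)] = c * m[i]
    return r


s13, s21, s31, s32 = (var(i) for i in range(N_VARS))

# Brackets of Example 5.1; the squares on s13 are an assumed reading of the source.
UPPER = {
    (0, 1): add(ONE, mul(mul(s13, s13), s21), -1),   # {s13, s21} = 1 - s13^2 s21
    (0, 2): mul(s13, add(s21, s32, -1)),             # {s13, s31} = s13 (s21 - s32)
    (0, 3): add(mul(mul(s13, s13), s32), ONE, -1),   # {s13, s32} = -1 + s13^2 s32
    (1, 2): mul(s21, add(s32, mul(s13, s31), -1)),   # {s21, s31} = s21 (s32 - s13 s31)
    (1, 3): add(s31, mul(mul(s13, s21), s32), -1),   # {s21, s32} = s31 - s13 s21 s32
    (2, 3): mul(s32, add(s21, mul(s13, s31), -1)),   # {s31, s32} = s32 (s21 - s13 s31)
}


def bracket(i, j):
    """Antisymmetric extension of the table above."""
    if i == j:
        return {}
    if i < j:
        return UPPER[(i, j)]
    return {m: -c for m, c in UPPER[(j, i)].items()}


def nested(i, j, k):
    """{s_i, {s_j, s_k}} via the Leibniz rule: sum_l {s_i, s_l} * d_l {s_j, s_k}."""
    total = {}
    inner = bracket(j, k)
    for l in range(N_VARS):
        total = add(total, mul(bracket(i, l), diff(inner, l)))
    return total


def jacobiator(i, j, k):
    """Cyclic sum of nested brackets; the empty dict means the zero polynomial."""
    return add(add(nested(i, j, k), nested(j, k, i)), nested(k, i, j))


def jacobi_holds():
    return all(not jacobiator(i, j, k)
               for i, j, k in combinations(range(N_VARS), 3))
```

Calling `jacobi_holds()` checks the four independent triples of coordinates. Because the bracket is a biderivation, vanishing of the Jacobiator on coordinate functions is enough for all polynomial functions.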
If we consider the rational case with $F' = F$, then $H_0$ and $L_0$ are given as follows:
$H_0(q, p, s) = \frac{1}{2} \sum_{i=1}^{2} p_i^2 - \left(\frac{s_{21}}{(q_1 - q_2)^2} + \frac{s_{13} s_{31}}{(q_1 - q_3)^2} + \frac{s_{32}}{(q_2 - q_3)^2}\right)$,
$L_0(q, p, s)(z) = p + \frac{1}{z}\, s + \sum_{i \neq j} \frac{s_{ij}}{q_i - q_j}\, e_{ij}$,
where $s_{12} = s_{23} = 1$ and $e_{ij}$ is the 3 × 3 matrix with a 1 in the (i, j)-entry and zeros elsewhere.

As a special case, consider $\mathfrak{g} = \mathfrak{sl}(N, \mathbb{C})$ and identify $\mathfrak{g}^*$ with $\mathfrak{g}$ using the standard Killing form. Let O be the adjoint orbit through the point $\xi_0 \in \mathfrak{sl}(N, \mathbb{C})$, where $\xi_0$ is the off-diagonal matrix with all off-diagonal entries equal to m (≠ 0). It is simple to see that O has dimension 2(N − 1), i.e., twice the rank of the Lie algebra. Hence, $O_{red}$ is just one point. Consequently, if we restrict the integrable spin systems to this particular symplectic leaf $T^*\mathfrak{h}^* \times \{pt\}$, we obtain the usual Calogero-Moser systems with coupling constants $m^2$. Thus we have recovered the following:

Corollary 5.8 ([4, 5]). The usual Calogero-Moser (rational, trigonometric and elliptic) systems associated to the Lie algebra $\mathfrak{sl}(N, \mathbb{C})$ admit a Lax operator $L_0 : T^*\mathbb{C}^{N-1} \to L\mathfrak{sl}(N, \mathbb{C})$ and an r-matrix formalism.

Remark 5.9. Note that the above adjoint orbit O is a semi-simple orbit of $\mathfrak{sl}(N, \mathbb{C})$. For other types of simple Lie algebras, unfortunately, there does not exist any semi-simple orbit of dimension equal to twice the rank of the Lie algebra [18, 19]. On the other hand, there do exist minimal nilpotent orbits of the correct dimension for $\mathfrak{sp}(2N, \mathbb{C})$ [18]. However, the corresponding Hamiltonian reduces to that of a free system (without potential) in this case. In other words, the integrable spin systems obtained above do not contain the usual Calogero-Moser systems as subsystems for other types of simple Lie algebras.

Acknowledgements. We would like to thank several institutions for their hospitality while work on this project was being done: MSRI (Li), and Max-Planck Institut (Li and Xu). The first author thanks the organizers, Pavel Bleher and Alexander Its, of the special semester in Random matrix models and their Applications held at MSRI in Spring 1999 for hospitality during his stay there. We also wish to thank Jean Avan, Pavel Etingof, Eyal Markman and Serge Parmentier for discussions.
References
1. Avan, J., Babelon, O., Billey, E.: The Gervais-Neveu-Felder equation and the quantum Calogero-Moser systems. Commun. Math. Phys. 178, 281–300 (1996)
2. Alekseevsky, D., Kriegl, A., Losik, M., Michor, P.: The Riemannian geometry of orbit spaces. The metric, geodesics and integrable systems. LANL e-print Archive math.DG/0102159; http://xxx.lanl.gov/
3. Bangoura, M., Kosmann-Schwarzbach, Y.: Equation de Yang-Baxter dynamique classique et algebroides de Lie. C. R. Acad. Sci. Paris, Serie I 327, 541–546 (1998)
4. Billey, E., Avan, J., Babelon, O.: The r-matrix structure of the Euler-Calogero-Moser model. Phys. Lett. A 186, 114–118 (1994)
5. Billey, E., Avan, J., Babelon, O.: Exact Yangian symmetry in the classical Euler-Calogero-Moser model. Phys. Lett. A 188, 263–271 (1994)
6. Babelon, O., Viallet, C.-M.: Hamiltonian structures and Lax equations. Phys. Lett. B 237, 411–416 (1990)
7. Bordner, A.J., Corrigan, E., Sasaki, R.: Calogero-Moser models: I. A new formulation. Progr. Theor. Phys. 100, 1107–1129 (1998)
8. Calogero, F.: Exactly solvable one-dimensional many body problems. Lett. Nuovo Cim. 13, 411–427 (1975)
9. Cannas da Silva, A., Weinstein, A.: Geometric models for noncommutative algebras. Berkeley Mathematics Lecture Notes 10, Providence, RI: AMS, 1999
10. Coste, A., Dazord, P., Weinstein, A.: Groupoïdes symplectiques. In: Publications du Département de Mathématiques de l'Université de Lyon, I, Number 2/A-1987, 1987, pp. 1–65
11. D'Hoker, E., Phong, D.H.: Calogero-Moser Lax pairs with spectral parameter for general Lie algebras. Nucl. Phys. B 530, 537–610 (1998)
12. Drinfeld, V.: Hamiltonian structures on Lie groups, Lie bialgebras, and the geometric meaning of the classical Yang-Baxter equations. Sov. Math. Dokl. 27, 667–671 (1983)
13. Etingof, P., Varchenko, A.: Geometry and classification of solutions of the classical dynamical Yang-Baxter equation. Commun. Math. Phys. 192, 77–120 (1998)
14. Faddeev, L., Takhtajan, L.: Hamiltonian Methods in the Theory of Solitons. Berlin: Springer-Verlag, 1987
15. Felder, G.: Conformal field theory and integrable systems associated to elliptic curves. Proc. ICM Zurich, Basel: Birkhäuser, 1994, pp. 1247–1255
16. Gibbons, J., Hermsen, T.: A generalization of the Calogero-Moser systems. Physica 11D, 337–348 (1984)
17. Hasegawa, H., Ma, J.Z.: Intermediate level statistics with one-parameter random matrix ensembles. J. Math. Phys. 39, 2564–2583 (1998)
18. Joseph, A.: Minimal realizations and spectrum generating algebras. Commun. Math. Phys. 36, 325–338 (1974)
19. Joseph, A.: The minimal orbit in a simple Lie algebra and its associated maximal ideal. Ann. Sci. Ecole Norm. Sup. 9, 1–29 (1976)
20. Kazhdan, D., Kostant, B., Sternberg, S.: Hamiltonian group actions and dynamical systems of Calogero type. Comm. Pure Appl. Math. 31, 481–507 (1978)
21. Khvedelidze, A.M., Mladenov, D.M.: Euler-Calogero-Moser system from SU(2) Yang-Mills theory. Phys. Rev. D 62, 125016 (2000)
22. Krichever, I.M., Babelon, O., Billey, E., Talon, M.: Spin generalization of the Calogero-Moser system and the matrix KP equation. Am. Math. Soc. Transl. 150, 83–119 (1995)
23. Li, L.C., Xu, P.: Spin Calogero-Moser systems associated with simple Lie algebras. C. R. Acad. Sci. Paris, Serie I, 331, 55–60 (2000)
24. Mackenzie, K.: Lie Groupoids and Lie Algebroids in Differential Geometry. LMS Lecture Notes Series, 124, Cambridge: Cambridge Univ. Press, 1987
25. Marsden, J., Ratiu, T.: Reduction of Poisson manifolds. Lett. Math. Phys. 11, 161–169 (1986)
26. Moser, J.: Three integrable Hamiltonian systems connected with isospectral deformations. Adv. Math. 16, 197–220 (1975)
27. Olshanetsky, M.A., Perelomov, A.M.: Completely integrable Hamiltonian systems connected with semisimple Lie algebras. Invent. Math. 37, 93–108 (1976)
28. Pechukas, P.: Distribution of energy eigenvalues in the irregular spectrum. Phys. Rev. Lett. 51, 943–946 (1983)
29. Polychronakos, A.: Generalized Calogero models through reductions by discrete symmetries. Nucl. Phys. B 543, 485–498 (1999)
30. Reshetikhin, N.: Degenerate integrability of spin Calogero-Moser systems and the duality with spin Ruijsenaars systems. LANL e-print Archive, math.QA/0202245; http://xxx.lanl.gov/
31. Reyman, A., Semenov-Tian-Shansky, M.: Group-theoretical methods in the theory of finite dimensional integrable systems, Dynamical Systems VII. In: Encyclopedia of Math. Sci. 16, Berlin: Springer-Verlag, 1994, pp. 116–225
32. Wojciechowski, S.: An integrable marriage of the Euler equations with the Calogero-Moser systems. Phys. Lett. A 111, 101–103 (1985)
33. Yukawa, T.: New approach to the statistical properties of energy levels. Phys. Rev. Lett. 54, 1883–1886 (1985)

Communicated by L. Takhtajan
Commun. Math. Phys. 231, 287–308 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0717-0
Communications in
Mathematical Physics
Bispectral Operators of Prime Order E. Horozov Institute of Mathematics and Informatics, Bulg. Acad. of Sci., Acad. G. Bonchev Str., Block 8, 1113 Sofia, Bulgaria. E-mail:
[email protected];
[email protected] Received: 14 February 2002 / Accepted: 10 June 2002 Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: The aim of this paper is to solve the bispectral problem for bispectral operators whose order is a prime number. More precisely, we give a complete list of such bispectral operators. We systematically use the operator approach and, in particular, Dixmier's ideas on the first Weyl algebra. When the order is 2, the main theorem is exactly the result of Duistermaat-Grünbaum. On the other hand, our proofs seem to be simpler.

0. Introduction

Bispectral operators have been introduced by F.A. Grünbaum (cf. [G1, G2]) in his studies on applications of spectral analysis to medical imaging. In the present paper we give a complete classification of bispectral operators of prime order. We start with some definitions and results that are needed to state our results, as well as to make clear the connection with other research. An ordinary differential operator $L(x, \partial_x)$ is called bispectral if it has an eigenfunction $\psi(x, z)$, depending also on the spectral parameter z, which is at the same time an eigenfunction of another differential operator $\Lambda(z, \partial_z)$, now in the spectral parameter z. In other words, we look for operators L, Λ and a function $\psi(x, z)$ satisfying equations of the form:
$L\psi = f(z)\psi$, (0.1)
$\Lambda\psi = \theta(x)\psi$. (0.2)
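A concrete instance may help fix ideas. The sketch below checks numerically a standard example, assumed here and not taken from the text: the function $\psi(x, z) = e^{xz}\,(1 - 1/(xz))$ satisfies $(\partial_x^2 - 2x^{-2})\psi = z^2\psi$ and, by symmetry in x and z, $(\partial_z^2 - 2z^{-2})\psi = x^2\psi$. In the notation of (0.1) and (0.2) this corresponds to $f(z) = z^2$ and $\theta(x) = x^2$, with the second operator acting in z. Derivatives are approximated by central differences, so the residuals are small but not exactly zero; the helper names and the step size are illustrative choices.

```python
import math


def psi(x, z):
    """psi(x, z) = exp(x z) * (1 - 1/(x z)); symmetric under swapping x and z."""
    return math.exp(x * z) * (1.0 - 1.0 / (x * z))


def second_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def residuals(x, z):
    """Return (L psi - z^2 psi, Lambda psi - x^2 psi) at the point (x, z).

    L acts in x as d^2/dx^2 - 2/x^2; Lambda is the same operator in the variable z.
    """
    value = psi(x, z)
    d2x = second_derivative(lambda t: psi(t, z), x)
    d2z = second_derivative(lambda t: psi(x, t), z)
    r_x = d2x - 2.0 / x ** 2 * value - z ** 2 * value
    r_z = d2z - 2.0 / z ** 2 * value - x ** 2 * value
    return r_x, r_z
```

Evaluating `residuals(1.3, 1.7)` gives two numbers at the level of the discretization error. The potential $-2x^{-2}$ used here is the simplest one obtained from $u(x) = 0$ by a rational Darboux transformation.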
Although, as mentioned above, the study of bispectral operators has been stimulated by certain problems of computer tomography, later it turned out that they are connected to several actively developing areas of mathematics and physics: the KP-hierarchy, infinite-dimensional Lie algebras and their representations, particle systems, automorphisms of algebras of differential operators, non-commutative geometry, etc. (see e.g. [BHY1, BHY3, BHY4, BW, BW1, BW2, DG, K, W1, W2, MZ], as well as the papers in the proceedings volume of the conference in Montréal [BP]).
In the fundamental paper [DG] Duistermaat and Grünbaum raised the problem of finding all bispectral operators and completely solved it for operators L of order two. The complete list is as follows. If we present L as a Schrödinger operator
$L = \left(\frac{d}{dx}\right)^2 + u(x)$,
the potentials u(x) of bispectral operators, apart from the obvious Airy ($u(x) = ax$) and Bessel ($u(x) = cx^{-2}$) ones, are organized into two families of potentials u(x), which can be obtained by finitely many "rational Darboux transformations" (1) from $u(x) = 0$, (2) from $u(x) = -\frac{1}{4}x^{-2}$. Thus the classification scheme prompted by the paper [DG] is by the order of the operators. G. Wilson [W1] introduced another classification scheme: by the rank of the bispectral operator L (see the next section for definitions). In the above cited paper [W1] (see also [W2]) Wilson gave a complete description of all bispectral operators of rank 1 (and any order). In the terminology of Darboux transformations (see [BHY1]) all bispectral operators of rank 1 are those obtained by rational Darboux transformations on the operators with constant coefficients, i.e. $L = p(\partial_x) \in \mathbb{C}[\partial_x]$. In the above mentioned papers [DG, W1] the classification is split into two more or less independent parts. First, there is an explicit construction of families of bispectral operators of a given class (order 2 in [DG]; rank 1 in [W1]). The construction can be given in terms of Darboux transformations of "canonical" operators (a notion that needs clarification, see the last section for some comments). A second part should be to give a proof that, if an operator (in the corresponding class) is a bispectral one, then it belongs to the constructed families. In the last few years there has been increased activity in the direction of constructing classes of bispectral operators ([BHY1, BHY3, KRo, Z]). For a survey on this subject, see [BP, H1] and the references therein. To the best of my knowledge, all families of bispectral operators known up to now can be constructed by the methods of [BHY1, BHY3]. For a simplified exposition of these results, see the first part of [H1]. A challenging problem is to prove that all the bispectral operators have already been found.
A natural approach would be to divide the differential operators into suitable classes, e.g. by order as in [DG] or by rank, and to try to isolate the bispectral ones amongst them. In [HM] we have proposed another classification scheme, namely to consider the operators with a fixed type of singularity at infinity. The main result of that paper is the classification of bispectral operators possessing the simplest type of singularity at infinity: the Fuchsian one. My opinion is that all classification schemes mentioned above may help each other, as seen from the main results here. In the present paper we return to the initial classification scheme, that of [DG]. We give a list of several families that contains all bispectral operators whose order is a prime number. Before stating the results we introduce some definitions and notations which will be used also throughout the paper. We are going to consider operators, normalized as follows:
$L = \sum_{k=0}^{N} V_k(x)\,\partial_x^k$, $V_N = 1$, $V_{N-1} = 0$. (0.3)
It is well known that with the above normalization all the coefficients of L are rational functions (see [DG, W1] or the next section). Now we can formulate the main result of the present paper.
Theorem 0.1. An operator L, whose order p is a prime number, is bispectral if and only if it belongs to one of the following sets:
1. Generalized Airy operators:
$A = \partial^p + \sum_{j=1}^{p-2} a_j \partial^j - x$, $a_j \in \mathbb{C}$; (0.4)
2. Generalized Bessel operators:
$B = x^{-p}(x\partial - \beta_1) \cdots (x\partial - \beta_p)$, $\beta_j \in \mathbb{C}$; (0.5)
3. Operators with constant coefficients:
$C = \partial^p + \sum_{j=1}^{p-2} a_j \partial^j$, $a_j \in \mathbb{C}$; (0.6)
4. Operators obtained by monomial Darboux transformations from the Bessel operators having the property that at least one difference $\beta_i - \beta_j \in p\mathbb{Z}$, $i \neq j$;
5. Polynomial Darboux transformations from operators with constant coefficients.

Remark 0.2. 1. For p = 2 this is just the content of the classical result of Duistermaat-Grünbaum [DG]. Indeed, in that case we have $\beta_1 - \beta_2 \in 2\mathbb{Z}$. In [DG] this part of the theorem is formulated in a form close to this one. To obtain the potential $u(x) = -\frac{1}{4}x^{-2}$ mentioned above in (2) and corresponding to $\beta_1 = \beta_2 = 1/2$ we have to perform a monomial Darboux transformation. Having in mind that a composition of monomial Darboux transformations is again a monomial Darboux transformation we get (2). The potential $u(x) = 0$ from (1) corresponds to the only operator in (3), Theorem 0.1 for p = 2.
2. The different notions of Darboux transformations used here are explained in the next section.

Theorem 0.1 is in fact a consequence of two slightly more general results. Their importance lies in the fact that they can be used for an induction process in further classification (see [H2]). In any case the proofs do not simplify when restricted to operators of prime order and seem to be more natural as performed here (see below). Finally, we briefly review the organization of the paper. In Sect. 1 we recall some definitions and results, by now standard for the problem, with the purpose of fixing the notation and the terminology. Section 2 treats the case of operators with bounded (at infinity) coefficients $V_j$. We begin the section with an auxiliary result. We show that from the normalized operator L as in (0.3) and satisfying only the so-called "ad-condition" with the polynomial θ, the function f(z) can be chosen naturally, and then one can build the operator Λ, already normalized, only in terms of L and θ. This will be needed after that to prove the following theorem for operators with bounded coefficients:

Theorem 0.3.
Let the rank of the bispectral operator L with bounded coefficients equal its order. Then it is a monomial Darboux transformation of a Bessel operator. The essential part of the proof is to establish the vanishing of the coefficients Vj at infinity. Then the result is contained in [HM]. In Sect. 3 we consider operators with increasing coefficients. We first obtain a normal form for their “leading terms” (Subsect. 3.1). Here we exploit once again (as in [HM])
crucial ideas of Dixmier's analysis of the first Weyl algebra [Dx]. When the order is a prime number we get the non-vanishing (at infinity) part of L to be the (generalized) Airy operator A. In the next subsection we develop some version of wave (pseudo-differential) operators, expanded in negative powers of Airy operators. This tool turns out to be enough to prove (for operators of any order, not only prime) in Subsect. 3.3 the theorem:

Theorem 0.4. Let the bispectral operator L = A + (vanishing at infinity perturbation). Then the perturbation is zero.

We hope the careful reader has noticed some similarity with the analysis in [DG] for the latter class. We believe that our proof is simpler and more transparent (and for this reason works for higher order operators). One thing that we certainly benefited from [DG] is to realize that behind their calculations of the normal form of second order operators there stands Dixmier's analysis on the first Weyl algebra $A_1$. On the other hand, in both steps we use different approaches. The results of the present paper have been announced in the second part of the survey paper [H1].

1. Preliminaries

In this section we have collected some terminology, notations and results relevant for the study of bispectral operators. Our main concern is to introduce uniform notation which will be used throughout the paper and to make the paper self-contained. There are also a few results which cannot be found formally elsewhere, but in fact are reformulations (in a form suitable for the present paper) of statements from other sources.

1.1. In this subsection we recall some definitions, facts and notation from Sato's theory of KP-hierarchy [S, DJKM, SW] needed in the paper. For a complete presentation of the theory we recommend also [Di, vM]. We start with the notion of the wave operator $K(x, \partial_x)$. This is a pseudo-differential operator
$K(x, \partial_x) = 1 + \sum_{j=1}^{\infty} a_j(x)\,\partial_x^{-j}$, (1.1)
with coefficients $a_j(x)$ which could be convergent or formal power (Laurent) series. In the present paper we will consider $a_j$ most often as formal Laurent series in $x^{-1}$. The wave operator defines the (stationary) Baker-Akhiezer function $\psi(x, z)$:
$\psi(x, z) = K(x, \partial_x)\,e^{xz}$. (1.2)
From (1.1) and (1.2) it follows that ψ has the following asymptotic expansion:
$\psi(x, z) = e^{xz}\Big(1 + \sum_{j=1}^{\infty} a_j(x)\,z^{-j}\Big)$, $z \to \infty$. (1.3)
Introduce also the pseudo-differential operator P:
$P(x, \partial_x) = K \partial_x K^{-1}$. (1.4)
The following spectral property of P , crucial in the theory of KP-hierarchy, is also very important for the bispectral problem: P ψ(x, z) = zψ(x, z).
(1.5)
When it happens that some polynomial of P , say f (P ), is a differential operator, we get that ψ(x, z) is an eigenfunction of an ordinary differential operator L = f (P ): Lψ = f (z)ψ.
(1.6)
It is possible to introduce the above objects in many different ways, starting with any of them (and with others, not introduced above). For us it would be important also to start with given differential operator L: L(x, ∂x ) = ∂xN + VN−2 (x)∂ N−2 + · · · + V0 (x).
(1.7)
One can define the wave operator K as: LK = Kf (∂).
(1.8)
An important notion, connected to an operator L is the algebra AL of operators commuting with L (see [Kr, BC]). This algebra is a commutative one. The wave function ψ(x, z) (defined in (1.2)) is a common wave function for all operators M from AL : Mψ(x, z) = gM (z)ψ(x, z).
(1.9)
We define also the algebra AL of all functions $g_M(z)$ for which (1.9) holds for some M ∈ AL. Obviously the algebras AL and AL are isomorphic. Following [Kr] we introduce the rank of the algebra AL as the greatest common divisor of the orders of the operators in AL.

1.2. Here we shall briefly recall the definition of the Bessel wave function. Let $\beta \in \mathbb{C}^N$ be such that
$\sum_{i=1}^{N} \beta_i = \frac{N(N-1)}{2}$. (1.10)
Definition 1.1 [F, Z, BHY1]. The Bessel wave function is the unique wave function $\Psi_\beta(x, z)$ depending only on xz and satisfying
$L_\beta(x, \partial_x)\,\Psi_\beta(x, z) = z^N \Psi_\beta(x, z)$, (1.11)
where the Bessel operator $L_\beta(x, \partial_x)$ is given by (0.5). Because the Bessel wave function depends only on xz, (1.11) implies
$D_x \Psi_\beta(x, z) = D_z \Psi_\beta(x, z)$, (1.12)
$L_\beta(z, \partial_z)\,\Psi_\beta(x, z) = x^N \Psi_\beta(x, z)$. (1.13)
Next we define monomial and polynomial Darboux transformations of Bessel operators. The definitions are slight modifications of the definitions given in [BHY1]. Let $h(L_\beta)$ be a polynomial in a Bessel operator.
Definition 1.2. We say that the operator $\tilde{L}$ is a polynomial Darboux transformation of $L_\beta$ if there exist differential operators $P(x, \partial_x)$, $Q(x, \partial_x)$ and a polynomial h such that
$h(L_\beta) = QP$, (1.14)
$\tilde{L} = PQ$, (1.15)
and the operator $P(x, \partial_x)$ has the form
$P(x, \partial_x) = x^{-n} \sum_{k=0}^{n} p_k(x^N)\,D_x^k$, (1.16)
where $p_k$ are rational functions, $p_n \equiv 1$.

We will use the following definition of monomial Darboux transformations.

Definition 1.3. We say that the operator $\tilde{L}$ is a monomial Darboux transformation of the Bessel operator $L_\beta$ iff it is a polynomial Darboux transformation with $h(L_\beta) = L_\beta^d$, $d \in \mathbb{N}$.

Remark 1.4. In [DG] the authors work with rational Darboux transformations. It is easy to show that a composition of rational Darboux transformations is a monomial Darboux transformation.

We end this subsection by reformulating (in a weaker form) the main results, which we need from [BHY1, HM].

Theorem 1.5. The polynomial Darboux transformations of the Bessel operators are bispectral operators.

Theorem 1.6. If a bispectral operator L has vanishing coefficients at infinity, it is a monomial Darboux transformation of a Bessel operator.

1.3. Here we recall several simple properties of bispectral operators following [DG, W1]. As we have already mentioned in the introduction we are going to study ordinary differential operators L of arbitrary order N which are normalized as in (0.3), i.e. with $V_N = 1$ and $V_{N-1} = 0$. Assuming that L is bispectral means that we have also another operator Λ, a wave function $\psi(x, z)$ and two other functions f(z) and θ(x), such that Eqs. (0.1) and (0.2) hold. The following lemma, due to [DG], has been fundamental for all studies of bispectral operators.

Lemma 1.7. There exists a number $m \in \mathbb{N}$, such that
$(\mathrm{ad}\, L)^{m+1}\theta = 0$. (1.17)
For its simple proof, see [DG, W1]. To the best of my knowledge this is the only property of bispectral operators that is used in their studies. It is widely believed that the condition (1.17), called the ad-condition, is equivalent to bispectrality (provided (0.3) holds). In what follows we assume only that the ad-condition holds, i.e. we are not going to use the existence of $f(z)$, $\Lambda$ and $\psi$. For us it will be important to construct them only from L and θ, at least formally. This will be done in the next section. We will assume that m is the minimal number with this property. An important corollary of the above lemma is the following result.
Lemma 1.8. Let the operator L be normalized as in (0.3). Then
(i) The function θ(x) is a polynomial.
(ii) The coefficients $\alpha_j$ in the expansion (1.1) of the wave operator K are rational functions.

Proof. We repeat the simple proof following [W1] as the proof introduces some notions needed later. From Eq. (1.17) it follows that $(\mathrm{ad}\, \partial_x^N)^{m+1}(K^{-1}\theta K) = 0$. On the other hand the kernel of the operator $(\mathrm{ad}\, \partial_x^N)^{m+1}$ consists of all pseudo-differential operators whose coefficients are polynomials in x of degree at most m. This gives that
$\theta(x)\, K = K\, \Theta$, (1.18)
with a pseudo-differential operator $\Theta$:
$\Theta = \Theta_0 + \sum_{j=1}^{\infty} \Theta_j\, \partial_x^{-j}$ (1.19)
whose coefficients $\Theta_j$ are polynomials of degree at most m. We have $\theta(x) = \Theta_0(x)$. This gives (i). Comparing the coefficients at $\partial_x^{-j}$ we find that all the coefficients $\alpha_j(x)$ of K are rational functions.
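The description of the kernel used in this proof can be checked by a short computation. The following is only a sketch in the notation above; the coefficient $c(x)$ is a symbol introduced here for illustration and does not appear in the original text.

```latex
% c(x) is an illustrative coefficient; \partial_x^{-j} commutes with \partial_x^{N}
[\partial_x^{N},\; c(x)\,\partial_x^{-j}]
  = \sum_{i=1}^{N} \binom{N}{i}\, c^{(i)}(x)\, \partial_x^{\,N-i-j}
  = N\, c'(x)\, \partial_x^{\,N-1-j} + \cdots ,
\qquad
(\operatorname{ad} \partial_x^{N})^{m+1}\bigl(c(x)\,\partial_x^{-j}\bigr)
  = N^{m+1}\, c^{(m+1)}(x)\, \partial_x^{\,(m+1)(N-1)-j} + \cdots .
```

Applied to a pseudo-differential operator $\sum_j c_j(x)\,\partial_x^{-j}$, the coefficient at $\partial_x^{(m+1)(N-1)-J}$ is $N^{m+1} c_J^{(m+1)}$ plus derivatives of order higher than $m+1$ of the $c_j$ with $j < J$. Induction on $J$ then suggests that every $c_J$ is a polynomial of degree at most m, which is the statement used above.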
Remark 1.9. We notice that at least one of the coefficients $\Theta_j$ has degree exactly m, where m from Lemma 1.7 is minimal. This fact will be used later.

The last lemma has as an obvious consequence one of the few general results, important in all studies of bispectral operators. Noticing that the coefficients of L are polynomials in the derivatives of $\alpha_j(x)$ we get

Lemma 1.10. The coefficients of L are rational functions.

Remark 1.11. Obviously the same results hold for the pair $\Lambda$, $f(z)$, when imposing the conditions (0.3) on $\Lambda$. But here we need to derive this statement from conditions only on $\Lambda$ and L and a suitable choice of $f(z)$. This will be done in the next section.

2. Operators with Bounded Coefficients

In this section we are going to prove our main result for operators with coefficients bounded near infinity. As mentioned earlier we will consider a slightly more general situation: operators for which the rank and the order coincide. The main part of the proof is to show that in this case the coefficients are in fact vanishing at infinity and hence the result follows from the main theorem in [HM]. First I would like to fix the notation. Put
$L = \partial^p + \sum_{j=0}^{p-2} W_j(x)\, \partial^j$, (2.1)
where $W_j = c_j + V_j(x)$, $c_j$ are constants and $V_j(x) = O(x^{-1})$. Next define the polynomial $f(z)$ to be:
$f(z) = z^p + \sum_{j=0}^{p-2} c_j z^j$. (2.2)
In this way we can rewrite our operator in the form
$L = f(\partial) + \sum_{j=0}^{p-2} V_j(x)\, \partial^j$, (2.3)
where $V_j(x) = O(x^{-1})$. Our first goal is to show that starting with the normalized bispectral operator L the above choice of $f(z)$ leads to a normalized operator $\Lambda$. The construction is formal, i.e. the wave function is a formal series, but this is enough for what follows. As all the auxiliary results (the three lemmas below) are slight modifications of corresponding results from [HM] we omit their proofs. It is well known that one can present the operator L in the form:
$L = K f(\partial) K^{-1}$, (2.4)
where the polynomial $f(z)$ is defined in (2.2). In the next lemma, following [DG], we find the simplest restrictions on the coefficients of the wave operator K and on L.

Lemma 2.1. (i) The coefficients $V_j(x)$, $j = N-2, \ldots, 0$ of L vanish at ∞ at least as $x^{-2}$. (ii) The coefficients $\alpha_j$, $j = 1, \ldots$ of the wave operator K vanish at least as $x^{-1}$.

This lemma allows us to introduce, following [BHY4], an anti-isomorphism b between the algebra B of pseudo-differential operators $P(x, \partial_x)$ in the variable x and the same algebra B but in the variable z. More precisely B consists of those pseudo-differential operators
$P = \sum_{j=k}^{\infty} p_j(x^{-1})\, \partial_x^{-j}$,
for which there is a number $n \in \mathbb{Z}$ (depending on P) such that all expressions $x^n p_j(x^{-1})$, $j = k, k+1, \ldots$ are formal power series in $x^{-1}$. The involution $b : B \to B$ is defined by
$b(P)\, e^{xz} = P\, e^{xz} = \sum_{j=k}^{\infty} z^{-j} p_j(\partial_z^{-1})\, e^{xz}$, for $P \in B$, (2.5)
i.e. b is just a continuation of the standard anti-isomorphism between two copies of the Weyl algebra. In what follows we will use also the anti-isomorphism
$b_1 : B \to B$, $b_1(P) = b(\mathrm{Ad}_K P)$. (2.6)
Obviously b and $b_1$ can be considered as involutions of B and without any ambiguity we can denote the inverse isomorphisms $b^{-1}, b_1^{-1} : B \to B$ by the same letters. Since the operators K and $\Theta = K^{-1}\theta K$ are from B we can define two operators S and $\Lambda$ as follows:
$S(z, \partial_z) = b(K(x, \partial_x))$, (2.7)
$\Lambda(z, \partial_z) = b(\Theta)$. (2.8)
Explicitly one has
$S = \sum_{j=0}^{\infty} z^{-j} \alpha_j(\partial_z) = \sum_{j=0}^{\infty} a_j(z)\, \partial_z^{-j}$, $a_0 = 1$ (2.9)
and also
$\Lambda(z, \partial_z) = \sum_{j=0}^{\infty} z^{-j}\, \Theta_j(\partial_z) = \sum_{i=0}^{m} \Lambda_i(z)\, \partial_z^i$, (2.10)
where $\Lambda_m \neq 0$ (see Remark 1.9) and the coefficients $\Lambda_i$ and $a_j$ should be viewed as formal power series. As in [HM] we can prove

Lemma 2.2. The coefficients $a_j$ of the operator S are rational functions.

From the last lemma it follows that $\Lambda$ is normalized as required in (0.3). Denote temporarily by r the degree of the polynomial θ, i.e. $\theta(x) = x^r + \cdots$.

Lemma 2.3. With the choice of $f(z)$ as in (2.2) the coefficients $\Lambda_i$ of the operator $\Lambda$ are rational functions and $\Lambda$ satisfies (0.2). The degree of θ is $r = m$ and
$\Lambda_m = 1$, $\Lambda_{m-1} = 0$. (2.11)
The point in the last lemma is that the normalization of $\Lambda$ is a consequence of the suitable choice of the polynomial $f(z)$. Of course the wave function is only formal but this suffices for the proof of the main theorem. Now we are ready to give the classification of operators with coefficients bounded near infinity whose rank and order coincide. Let us fix the notation. Put
$L = \partial^N + \sum_{j=0}^{N-2} V_j(x)\, \partial^j$, (2.12)
where $V_j = c_j + W_j(x)$, $c_j$ are constants and $W_j(x) = O(x^{-1})$. As before define the polynomial $f(z)$ to be:
$f(z) = z^N + \sum_{j=0}^{N-2} c_j z^j$. (2.13)
With this choice of $f(z)$, as we know, the operator $\Lambda$ has rational coefficients. Now we are ready to prove the main result of this section.

Theorem 2.4. If the rank of the operator L with coefficients bounded at infinity equals its order N then all constants $c_j = 0$, i.e. the coefficients $V_j = O(x^{-1})$.
Proof. We are going to use again the ad-condition (1.3). Choose the number m so that $\mathrm{ad}^m_{f(z)}(\Lambda) \neq 0$ and $\mathrm{ad}^{m+1}_{f(z)}(\Lambda) = 0$. Simple computation shows that
$\mathrm{ad}^m_{f(z)}(\Lambda) = (-1)^m\, m!\, (f'(z))^m$. (2.14)
On the other hand we have
$\mathrm{ad}^m_L(\theta) = Q \neq 0$. (2.15)
Obviously
$[L, Q] = 0$. (2.16)
This gives that $Q \in A_L$. From this and from the fact that the rank of L is equal to its order N we get that Q is a polynomial in L:
$Q = q_r L^r + q_{r-1} L^{r-1} + \cdots$, $q_j \in \mathbb{C}$, (2.17)
where the coefficient $q_r \neq 0$. Using the involution $b_1$ we get
$b_1(Q) = q_r f^r(z) + \sum_{j=0}^{r-1} q_j f^j(z)$. (2.18)
Using (2.14) we have the following string of identities:
$b_1(Q) = b_1(\mathrm{ad}^m_L(\theta)) = (-1)^m\, \mathrm{ad}^m_{f(z)}(\Lambda) = m!\, (f'(z))^m$. (2.19)
In this way (2.18) and (2.19) yield
$q_r f^r(z) + \sum_{j=0}^{r-1} q_j f^j(z) = m!\, (f'(z))^m$. (2.20)
Now we are going to compare both the degrees and the first two coefficients of both sides of (2.20). First notice that comparing the degrees of the leading terms gives $(N-1)m = rN$. This gives that m is divisible by N, i.e. $m = sN$ and $r = s(N-1)$. Next, comparing the coefficients at the highest degree we get $q_r = m!\, N^m$. Suppose that some of the coefficients $c_j$ of $f(z)$ are not zero. Denote the second non-zero term of $f(z)$, after $z^N$, by $c_k z^k$. Computing the coefficient at the second non-zero term on both sides of (2.20) we get $N r = m k$, i.e. $k = N-1$. This is a contradiction to the normalizing condition $c_{N-1} = 0$.
From the above theorem and from the main result in [HM] (see also Sect. 1.2, Theorem 1.6) we get the proof of Theorem 0.3.
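The identity (2.14) used in the proof of Theorem 2.4 can be checked on a small case. The sketch below assumes the right-hand side of (2.14) is read as $(-1)^m m! (f'(z))^m$, which is the reading required by the degree count $(N-1)m$, and takes $\Lambda = \partial_z^m$ as the simplest normalized operator. The dictionary representation and the helper names `mul`, `ad` and `ad_power` are illustrative choices, not part of the original paper.

```python
from math import comb, factorial

# An operator is a dict {(i, j): c} standing for the sum of c * z**i * d**j,
# where d = d/dz is written to the right of the polynomial coefficient.

def mul(P, Q):
    """Compose two operators, normal-ordering with the Leibniz rule."""
    out = {}
    for (a, b), c1 in P.items():
        for (i, j), c2 in Q.items():
            # d**b z**i = sum_k C(b, k) * i!/(i-k)! * z**(i-k) * d**(b-k)
            for k in range(min(b, i) + 1):
                c = c1 * c2 * comb(b, k) * (factorial(i) // factorial(i - k))
                key = (a + i - k, b - k + j)
                out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def ad(P, Q):
    """Commutator [P, Q] = PQ - QP."""
    out = dict(mul(P, Q))
    for key, c in mul(Q, P).items():
        out[key] = out.get(key, 0) - c
    return {key: c for key, c in out.items() if c != 0}

def ad_power(P, Q, m):
    """Apply ad_P to Q m times."""
    for _ in range(m):
        Q = ad(P, Q)
    return Q

# Illustrative data: f(z) = z**3 + 2 z + 5 (N = 3), Lambda = d**2 (m = 2).
f = {(3, 0): 1, (1, 0): 2, (0, 0): 5}
Lam = {(0, 2): 1}
result = ad_power(f, Lam, 2)

# (-1)**2 * 2! * (3 z**2 + 2)**2 = 18 z**4 + 24 z**2 + 8
expected = {(4, 0): 18, (2, 0): 24, (0, 0): 8}
```

In this instance the result is a polynomial of degree $(N-1)m = 4$ with leading coefficient $m!\,N^m = 18$, matching the two comparisons made in the proof.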
3. Operators with Increasing Coefficients

3.1. Normal forms. Let L be a bispectral operator, normalized as in (0.3), i.e.
$L = \partial_x^N + \sum_{j=0}^{N-2} V_j(x)\, \partial_x^j$. (3.1)
We will assume that for some j the corresponding coefficient $V_j$ is increasing at infinity, i.e. has Laurent expansion at infinity of the form:
$V_j(x) = \sum_{m=-\infty}^{r_j} a_{j,m}\, x^m$, (3.2)
where $r_j > 0$ and $a_{j,r_j} \neq 0$. We call $r_j$ the order of $V_j$. In what follows we are going to use several properties shared by the first Weyl algebra $A_1$ and the larger algebra $R[\partial]$ of differential operators with rational coefficients. Following [Dx] we define a filtration in $R[\partial]$. Let $\rho, \sigma \in \mathbb{R}$. We put $\mathrm{wt}(x) = \rho$, $\mathrm{wt}(\partial) = \sigma$. Next the weight of L is given by the following definition:

Definition 3.1. Assume that $L = V_n \partial^n + V_{n-1}\partial^{n-1} + \cdots + V_0$ is an arbitrary element of $R[\partial]$. For each term $V(x)\partial_x^i$ define its weight $v_{\rho,\sigma}(V(x)\partial_x^i) = \rho\,(\mathrm{ord}\, V) + \sigma i$. Then the number
$v_{\rho,\sigma}(L) := \max_{0 \le i \le n} v_{\rho,\sigma}(V_i(x)\partial^i)$
will be called the (ρ, σ)-order of L.

The second definition associates to each differential operator from $R[\partial]$ a (ρ, σ)-homogeneous polynomial.

Definition 3.2. Assume the notation of the previous definition and denote by $I(L)$ the set $\{ i \in \{0, 1, \ldots, n\} \mid v_{\rho,\sigma}(V_i \partial^i) = v_{\rho,\sigma}(L) \}$. The polynomial $f \in \mathbb{C}[x, x^{-1}, y]$ defined as:
$f(x, y) = \sum_{i \in I(L)} a_i\, x^{\mathrm{ord}\, V_i}\, y^i$, (3.3)
where $a_i \in \mathbb{C}$ are uniquely determined from the expansion $V_i = a_i x^{\mathrm{ord}\, V_i} + (\text{lower order terms})$, will be called a polynomial associated with L. The operator $L_0 = {:}f(x, \partial_x){:}$ will be called the homogeneous part of L. Here, as usual, the colons : : denote normal ordering, i.e. the differentiation is pushed to the right.

Consider again the bispectral operator (3.1). In what follows we assume that ρ and σ are positive integers. It is always possible to choose them in such a way that the polynomial f associated with L has at least two terms of the kind
$f = y^N + \alpha x^k y^p + \cdots$, $\alpha \neq 0$, (3.4)
where $k > 0$, $p \ge 0$. For this purpose one can use the Newton polygon. More precisely denote by $E(L)$ the set of points $(m, j)$ such that $a_{j,m} \neq 0$, where $a_{j,m}$ is from (3.2). Consider the plane with the points of $E(L)$ and take the convex hull of $E(L)$ (the Newton polygon). Then draw a line passing through the point $(0, N) \in E(L)$ and another point, say $(k, j) \in E(L)$ with $k > 0$ and such that the Newton polygon remains below the line. This line is unique: all other points $(k_1, j_1) \in E(L)$ with the same property lie on it. Then one can find a non-zero solution in integers of the equation $N\sigma = k\rho + j\sigma$. In our situation both ρ and σ are positive as $j < N$ and at least one $V_j$ is increasing. Notice that with this choice the polynomial $f(x, y) \in \mathbb{C}[x, y]$. Denote also by $g(x) = x^l$ the polynomial associated to θ(x). Our goal is to find severe restrictions on the polynomial f. We are going to use the following result which is a slight modification of a particular case of the fundamental Proposition 7.3 from [Dx].

Lemma 3.3. Suppose the element $L \in R[\partial]$ acts on an element G nilpotently, i.e.
$\mathrm{ad}_L^m(G) = 0$, $m \ge 1$.
Let f and g be the polynomials associated to L and G and let f contain at least two terms. If ρ and σ are positive integers and $v_{\rho,\sigma}(L) > \rho + \sigma$, then one of the following cases holds:
(a) $f^s = g^r$, (3.5)
where s and r are the weights of g and f,
(b) $\sigma > \rho$, ρ divides σ and
$f = X^n (X^m + \mu Y)^k$, (3.6)
(c) $\rho > \sigma$, σ divides ρ and
$f = Y^n (Y^m + \mu X)^k$, (3.7)
(d) $\rho = \sigma$ and
$f = (Y + \lambda X)^n (Y + \mu X)^k$, (3.8)
where $n \ge 0$, $k \ge 0$, $m > 0$ and $\mu, \lambda \in \mathbb{C}$.

Remarks on the proof of the lemma. The proof essentially repeats that of Lemma 7.3 from [Dx]. We mention the minor differences. While in [Dx] all the polynomials (ρ, σ)-associated with the elements of the Weyl algebra belong to $\mathbb{C}[x, y]$, here we work in $\mathbb{C}[x, x^{-1}, y]$. The fact that the polynomials belong to $\mathbb{C}[x, y]$ is needed in [Dx] mainly to speak about their roots in some algebraic closure of $\mathbb{C}(y)$ (respectively $\mathbb{C}(x)$) when considered as polynomials in x (respectively in y). But the ring considered here has the same property.

Lemma 3.4. Suppose that $k > 1$ (k is from (3.4)). Then either $v_{\rho,\sigma}(L) > \rho + \sigma$ or L cannot act nilpotently on θ.
Proof. Writing (3.4) in the form $f = y^p (y^{N-p} + \alpha x^k) + \cdots$, we notice that one can take $\rho = N - p$ and $\sigma = k \ge 2$. First suppose $N = k = 2$ and $p = 0$. Then L cannot be nilpotent (see [Dx]). We give a slightly different proof here. One can easily see that if $v_{\rho,\sigma}(\theta) = 2l > 0$, then $v_{\rho,\sigma}(\mathrm{ad}_L(\theta)) = 2l$ unless $f^s = g^2$. But this is not possible as the polynomial associated with $\mathrm{ad}_L(\theta)$ contains a term with y of degree one. By induction on m we see that the resulting applications of $\mathrm{ad}_L$ always contain terms where the power of y is less than 2. This shows that the equation $f^s = g^2$ is impossible. Now we can suppose that either $p > 0$ or $\max(N, k) \ge 3$ (we recall that both $N \ge 2$ and $k \ge 2$). In the first case we have $v_{\rho,\sigma}(L) = N\sigma = kN \ge N + k > N - p + k = \rho + \sigma$. If $p = 0$ then $v_{\rho,\sigma}(L) = N\sigma = Nk > N + k = \rho + \sigma$. The strict inequality is a consequence of the fact that $\max(N, k) \ge 3$.
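The weight bookkeeping of Definitions 3.1 and 3.2, which the proof above relies on, can be illustrated with a small computation. This is a minimal sketch under stated assumptions: an operator is encoded by finitely many monomials $a\,x^m \partial^j$, and the function name `rho_sigma_order` and the sample operator are hypothetical, chosen only to show the case $N = 3$, $k = 2$, $p = 0$.

```python
def rho_sigma_order(terms, rho, sigma):
    """Return the (rho, sigma)-order and the associated polynomial.

    terms: dict {(m, j): a} standing for the sum of a * x**m * d**j.
    The associated polynomial is returned as {(m, j): a}, meaning
    the sum of a * x**m * y**j over the terms of maximal weight.
    """
    # For each power of d keep only the leading power of x (ord V_j).
    top = {}
    for (m, j), a in terms.items():
        if a != 0 and (j not in top or m > top[j][0]):
            top[j] = (m, a)
    # Weight of V_j d**j is rho * ord(V_j) + sigma * j.
    weights = {j: rho * m + sigma * j for j, (m, a) in top.items()}
    v = max(weights.values())
    assoc = {(top[j][0], j): top[j][1] for j in top if weights[j] == v}
    return v, assoc

# Illustrative operator: L = d**3 + x*d + 5*x**2 + 7  (N = 3, k = 2, p = 0).
L = {(0, 3): 1, (1, 1): 1, (2, 0): 5, (0, 0): 7}

# Weights from the proof: rho = N - p = 3, sigma = k = 2.
v, f = rho_sigma_order(L, rho=3, sigma=2)
```

Here `v` equals $N\sigma = 6$, which exceeds $\rho + \sigma = 5$, and `f` encodes $y^3 + 5x^2$, a polynomial of the form (3.4) with two terms of equal weight.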
Lemma 3.5. Let L act on θ nilpotently. The polynomial f has the form
$f = (y^r - x)^k$, $r > 1$. (3.9)

Proof. If the term with highest power k in (3.4) is 1 then f has the form $f = y^n (y^r - \lambda x)$. In the case when $k \ge 2$ from Lemma 3.3 we know that $f = y^n (y^r - \lambda x)^k$ or $f = (y - \lambda x)^\alpha (y - \mu x)^\beta$, $\alpha + \beta = k$, i.e. we have one of the cases b), c) or d) as $f^s = g^r$ is impossible. By applying the automorphism $\Phi_{1,(\mu)^{-1}}$ of $A_1$ (here $\Phi_{r,\mu}(x) = x$, $\Phi_{r,\mu}(y) = y + \mu x^r$) to f and to $x^l$ we reduce the last case to the previous ones, i.e. we assume that $f = y^n (y^r - \lambda x)^k$, where $n \ge 1$ and $k \ge 1$. Without loss of generality we can assume that $\lambda = 1$. Our goal will be to show that if $n \ge 1$ and $k \ge 1$ then $\mathrm{ad}_L^s$ cannot be zero for any $s \in \mathbb{N}$. We will show that at least some of the terms with highest weight will be preserved. Suppose that $n \ge 1$, $k \ge 1$. First we extend the automorphisms $\Phi_{1,r}$ from $A_1$ to its skew-field. Notice that they preserve the filtration. For this reason we are going to work only with the homogeneous part as the rest of the terms have no impact on it. Apply the automorphism $\Phi_{r,1}$ of $A_1$ to f and to $x^l$. Using that $\Phi_{r,1}(\partial) = \partial$ and $\Phi_{r,1}(x) = x + \partial^r$ we obtain
$\Phi_{r,1}(\partial^n (\partial^r - x)^k) = (-1)^k \partial^n x^k$, $\Phi_{r,1}(x^l) = (x + \partial^r)^l$.
Consider now $\mathrm{ad}^s_{(\partial^n x^k)} (x + \partial^r)^l$. Write
$(x + \partial^r)^l = \sum_{j=0}^{l} c_j^l\, x^j \partial^{r(l-j)} + \cdots$,
where by $\cdots$ we denote the lower weight terms. Then by linearity
$\mathrm{ad}^s_{(\partial^n x^k)} (x + \partial^r)^l = \sum_{j=0}^{l} c_j^l\, \mathrm{ad}^s_{(\partial^n x^k)} (x^j \partial^{r(l-j)}) + \cdots$.
Let $k \ge n$. Simple computation gives that
$\mathrm{ad}^s_{\partial^n x^k}(x^l) = \prod_{j=0}^{s-1} [\,nl + (k-n)j\,]\; \partial^{s(n-1)} x^{l + s(k-1)} + \cdots$. (3.10)
As $n \ge 1$, $l \ge 1$, $k - n \ge 0$ the coefficient at the term of highest power in x is positive for any $s \ge 1$, which shows that (3.10) cannot be zero for any s. Now suppose that $n \ge k \ge 1$. Consider
$\mathrm{ad}^s_{\partial^n x^k}(\partial^{lr}) = (-1)^s \prod_{j=0}^{s-1} [\,lrk + (n-k)j\,]\; \partial^{s(n-1) + lr} x^{s(k-1)} + \cdots$
By the same argument the coefficient at the highest power in ∂ is not zero for any s. This shows that either n = 0 or k = 0. But from the assumption (2.7) it follows that k cannot be zero.
Remark 3.6. Note that the above result gives a normal form for the leading terms of all bispectral operators with increasing coefficients of any order.

Now assume that the order N of L is a prime number. This gives that $k = 1$, $N = r$. Using Lemma 3.5 we obtain that:

Lemma 3.7. The operator L has the form
$L = \partial^N + \sum_{j=1}^{N-2} a_j \partial^j - x + \sum_{j=0}^{N-2} W_j(x)\, \partial^j$, (3.11)
where $\lim_{x \to \infty} W_j(x) = 0$ and $a_j \in \mathbb{C}$.

We will call the operator
$A = \partial^N + \sum_{j=1}^{N-2} a_j \partial^j - x$
the principal part of L. Following the terminology of [BHY1], A is the (generalized) Airy operator.
3.2. Airy PDO's. Let
$A = \partial^N + \sum_{j=1}^{N-2} a_j \partial^j - x$ (3.12)
be the generalized Airy operator. Our aim here is to develop a calculus of pseudo-differential operators written in terms of inverse powers of Airy operators in complete analogy with the standard one, described in Sect. 1.1. All the results of the section are obtained for any order N, i.e. without assuming that the order is prime. Let $\Phi(x)$ be a nonzero function in $\mathrm{Ker}\, A$, i.e.
$A\, \Phi(x) = 0$. (3.13)
Then the function $\Psi(x, z) = \Phi(x + z)$ satisfies the equations:
$A(x, \partial_x)\, \Psi(x, z) = z\, \Psi(x, z)$, (3.14)
$A(z, \partial_z)\, \Psi(x, z) = x\, \Psi(x, z)$. (3.15)
Obviously the function $\Psi(x, z)$ satisfies also the equation
$\partial_x \Psi(x, z) = \partial_z \Psi(x, z)$. (3.16)
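Equations (3.14)–(3.16) follow from a one-line computation, spelled out here as a sketch for $\Psi(x, z) = \Phi(x + z)$ with $\Phi$ in the kernel of A. The shorthand $\Phi^{(j)}$ for the j-th derivative is notation introduced for this check only.

```latex
% A\Phi = 0 reads \Phi^{(N)}(t) + \sum_j a_j \Phi^{(j)}(t) = t\,\Phi(t); it is used at t = x + z
A(x,\partial_x)\,\Psi(x,z)
  = \Phi^{(N)}(x+z) + \sum_{j=1}^{N-2} a_j\,\Phi^{(j)}(x+z) - x\,\Phi(x+z)
  = (x+z)\,\Phi(x+z) - x\,\Phi(x+z)
  = z\,\Psi(x,z).
```

Exchanging the roles of x and z gives (3.15), and (3.16) holds because $\Psi$ depends on x and z only through the sum $x + z$.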
Equations (3.14)–(3.16) define an anti-involution b on the Weyl algebra $A_1$ (see [BHY2, W1]), acting on the generators A, $\partial_x$ of $A_1$ by
$b(A(x, \partial_x)) = z$, (3.17)
$b(\partial_x) = \partial_z$. (3.18)
From (3.18) one easily finds that
$b(x) = A(z, \partial_z)$, (3.19)
which will be used later. Now define the algebra $B_1$ of pseudo-differential operators of the type:
$P(x, \partial_x) = \sum_{j=-m}^{\infty} a_j(x, \partial_x)\, A^{-j}$, (3.20)
with operator coefficients of the form:
$a_j = \sum_{k=0}^{N-1} \alpha_{j,k}(x)\, \partial^k$, (3.21)
where the functions $\alpha_{j,k}(x)$ are formal Laurent series:
$\alpha_{j,k}(x) = \sum_{s=r}^{\infty} \beta_{j,k}^{(s)}\, x^{-s}$ (3.22)
and the index r depends only on P (but not on j!). Then the anti-automorphism b can be continued on $B_1$ as
$b(P) = \sum_{j=-m}^{\infty} z^{-j} \sum_{k=0}^{N-1} \sum_{s=r}^{\infty} \beta_{j,k}^{(s)}\, \partial_z^k A^{-s} = \sum_{s=r}^{\infty} b_s(z, \partial_z)\, A^{-s}(z)$. (3.23)
Introduce the "wave operator" K as follows:
$K = 1 + \sum_{j=1}^{\infty} m_j(x, \partial_x)\, A^{-j}$, (3.24)
where
$m_j(x, \partial_x) = \sum_{k=0}^{N-1} \alpha_{j,k}(x)\, \partial_x^k$. (3.25)
Let L be a differential operator of the form:
$L = A^l + V_{l-1} A^{l-1} + \cdots$, (3.26)
where
$V_j(x, \partial) = \sum_{k=0}^{N-1} V_{j,k}(x)\, \partial^k$. (3.27)
Then one can find a wave operator (not unique) K of the form (3.24) so that
$L = K A^l K^{-1}$. (3.28)
The coefficients $\alpha_{j,k}(x)$ from (3.25) of the expansion of K can be found by induction from the equation:
$LK = KA^l$. (3.29)
Multiplying the above equation from the right by $A, A^2, \ldots$ one computes the coefficients $\alpha_{j,k}(x)$ in the expansion (3.24)–(3.25) of K in terms of the functions $V_{j,k}$. We would particularly be interested in the case $l = 1$. In what follows we assume that the operator L is bispectral, i.e. it satisfies an equation of the form (0.1), together with an equation of the form (0.2). In that case, as in [DG, W1], one can prove the following lemma.

Lemma 3.8. The coefficients $\alpha_{j,k}$ of the operator K are rational functions.

Proof. We mimic the well known proof (see [DG, W1]). Write
$(\mathrm{ad}_L)^m(\theta) = 0$. (3.30)
This is equivalent to
$(\mathrm{ad}(A^l))^m (K^{-1}\theta K) = 0$. (3.31)
Put
$\Theta = K^{-1}\theta K = \sum_{j=0}^{\infty} \theta_j A^{-j}$, (3.32)
where
$\theta_j = \sum_{k=0}^{N-1} \theta_{j,k}\, \partial^k$.
This gives $(\mathrm{ad}\, A^l)^m(\theta_j) = 0$. The leading terms of the above equation give
$\theta^{(m)}_{j,N-1}\, \partial^{m(Nl-1)+N-1} + (\theta^{(m)}_{j,N-2} + \ldots)\, \partial^{m(Nl-1)+N-2} + \cdots + (\theta^{(m)}_{j,0} + \ldots)\, \partial^{m(Nl-1)} + \cdots = 0$.
Here in the brackets containing the coefficients at $\partial^{m(Nl-1)+N-s}$, $s = 2, \ldots, N$, the dots after $\theta^{(m)}_{j,N-s}$ denote expressions in derivatives of $\theta_{j,N-r}$, $r < s$, of order not lower than m. Then obviously by induction we get that all $\theta^{(m)}_{j,n-1} \equiv 0$, $n = 1, \ldots, N$, which shows that they are polynomials of degree $d \le m$. Then the computation of the coefficients $a_j(x)$ of K is performed as in [DG, W1] (see also Sect. 2). We see that they are rational functions.
3.3. Proof of the main theorem for operators with increasing coefficients. Let L be an operator of order N with Airy principal part $A = \partial^N + a_{N-2}\partial^{N-2} + \cdots + a_1 \partial - x$, i.e.
$L = A + \sum_{j=0}^{N-2} V_j(x)\, \partial^j = A + V(x, \partial_x)$
and
$\lim_{x \to \infty} V_j(x) = 0$.
Using the techniques of the previous subsection we present L in the form
$L = A + V = K A K^{-1}$. (3.33)
Our goal will be to show that bispectrality, and in particular the rationality of the coefficients $\alpha_{j,k}(x)$ in the expansion (3.24) and (3.25) of K, implies that the perturbation $V(x, \partial_x) \equiv 0$, which is equivalent to $\alpha_{j,k} \equiv 0$ for all j, k. But first we need some notation and auxiliary results. From the equation
$LK = KA$, (3.34)
we can compute recursively the coefficients $m_j$ of the operator K. For this we will need some formulas to compare the coefficients of the two sides of (3.34). Introduce the operators $b_j$, $c_j$, $U_j$, $W_j$, $j = 1, 2, \ldots$ by:
$[A, m_j] = b_j A + c_j$, (3.35)
$V(x, \partial)\, m_j = U_j A + W_j$. (3.36)
We will need to order the monomials in m and expressions related to it as follows:

Definition 3.9. We say that the monomial $x^{r_1}\partial^{k_1}$ is of higher order than the monomial $x^{r_2}\partial^{k_2}$ if $r_1 > r_2$ or $r_1 = r_2$ and $k_1 > k_2$. We will call the number r the height of m, if the highest order term of m is of the type $c\, x^r \partial^k$, $c \neq 0$. We denote this number by $\mathrm{ht}(m)$.

In other words we use lexicographic ordering in the set of the monomials $x^r \partial^k$ but the height is only the power of x. In what follows we are going to use the abbreviation l.o.t. (for lower order terms) compared to some operator m with the meaning that they are lower than at least one of the terms in m. We will need also the following lemma:

Lemma 3.10. In the above formulas we have
(i) $b_j = N \sum_{k=1}^{N-1} \alpha_{j,k}'\, \partial^{k-1} + \text{l.o.t.}$, (3.37)
$c_j = N \alpha_{j,0}'\, \partial^{N-1} + \sum_{k=1}^{N-1} (x N \alpha_{j,k}' + k \alpha_{j,k})\, \partial^{k-1} + \text{l.o.t.}$, (3.38)
(ii) $\mathrm{ht}(c_j) = \mathrm{ht}(b_j) + 1$.
Proof. The proof is a straightforward computation. To avoid two indices we will suppress the dependence on j (it is irrelevant at the moment). We have
$A \circ m = m \Bigl( \partial^N + \sum_{k=1}^{N-2} a_k \partial^k \Bigr) - x m + N \sum_{k=0}^{N-1} \alpha_k'\, \partial^{N+k-1} + \text{l.o.t.}$, (3.39)
$m \circ A = m \Bigl( \partial^N + \sum_{k=1}^{N-2} a_k \partial^k \Bigr) - x m - \sum_{k=1}^{N-1} k \alpha_k\, \partial^{k-1}$. (3.40)
Subtracting (3.40) from (3.39) we get
$[A, m] = \sum_{k=0}^{N-1} N \alpha_k'\, \partial^{N+k-1} + \sum_{k=1}^{N-1} k \alpha_k\, \partial^{k-1} + \text{l.o.t.}$ (3.41)
Split the first sum into two parts as follows. One of them contains derivatives from N to 2N − 2; the second will contain the rest of them. Then we have
$\sum_{k=0}^{N-1} N \alpha_k'\, \partial^{N+k-1} = N \alpha_0'\, \partial^{N-1} + \Bigl( \sum_{k=1}^{N-1} N \alpha_k'\, \partial^{k-1} \Bigr) \partial^N$.
Next use the identity $\partial^N = A - \sum_{k=1}^{N-2} a_k \partial^k + x$ to get
$\sum_{k=0}^{N-1} N \alpha_k'\, \partial^{N+k-1} = N \alpha_0'\, \partial^{N-1} + \Bigl( \sum_{k=1}^{N-1} N \alpha_k'\, \partial^{k-1} \Bigr) A + x \sum_{k=1}^{N-1} N \alpha_k'\, \partial^{k-1} + \cdots$,
where the dots represent terms with derivatives of $\alpha_k$. Then we repeat the same procedure on all terms (including the ones in l.o.t. from (3.41)) containing $\partial^k$ with $k \ge N$. After a finite number of steps we get (3.37) and (3.38). The second part of the lemma follows immediately from the first one.
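As a sanity check of (3.37) and (3.38), the case $N = 2$ can be worked out in full. This is a sketch under the assumption that the primes in those formulas denote derivatives in x; the index j is suppressed as in the proof, and $A = \partial^2 - x$, $m = \alpha_0 + \alpha_1 \partial$.

```latex
% N = 2: A = \partial^2 - x, m = \alpha_0(x) + \alpha_1(x)\,\partial
[A, m] = [\partial^2, \alpha_0] + [\partial^2, \alpha_1\partial] - [x, \alpha_1\partial]
       = (\alpha_0'' + 2\alpha_0'\,\partial) + (\alpha_1''\,\partial + 2\alpha_1'\,\partial^2) + \alpha_1 ,
\qquad
2\alpha_1'\,\partial^2 = 2\alpha_1'\,A + 2x\,\alpha_1' ,
\qquad
[A, m] = \underbrace{2\alpha_1'}_{b}\,A
       + \underbrace{2\alpha_0'\,\partial + (2x\,\alpha_1' + \alpha_1) + \alpha_1''\,\partial + \alpha_0''}_{c} .
```

Here $b = N\alpha_1'$ and $c = N\alpha_0'\,\partial^{N-1} + (xN\alpha_1' + \alpha_1) + \text{l.o.t.}$, in agreement with (3.37) and (3.38) for $N = 2$; the terms $\alpha_1''\,\partial + \alpha_0''$ play the role of the lower order terms.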
Lemma 3.11. (i) Equation (3.34) is equivalent to the equations:
$b_1 + V + U_1 = 0$, (3.42)
$b_{j+1} + c_j + W_j + U_{j+1} = 0$, $j = 1, \ldots$. (3.43)
(ii) The coefficients of V and $b_1$ behave at infinity as $x^{-2}$.

Proof. The first part is simply comparing the coefficients. Indeed, writing in detail (3.34) we get
$A + A m_1 A^{-1} + \cdots + V + V m_1 A^{-1} + \cdots = A + m_1 + \cdots$.
Using (3.35) and (3.36) we can simplify the last equation to
$V + b_1 + U_1 + \cdots = 0$,
where $\cdots$ denotes the purely pseudo-differential part. This gives (3.42). Multiplying (3.34) by $A, A^2$, etc. from the right and arguing in the same manner we get (3.43). To prove the second part we use (3.37) with $j = 1$ and (3.42). Notice that the leading terms of $b_1$ are derivatives of rational functions. Being equal to the leading terms of V they vanish at infinity. Hence they vanish at least of order $x^{-2}$.
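Lemma 3.12. The following inequalities hold:
$\mathrm{ht}(U_j) \le \mathrm{ht}(b_j) - 1$, (3.44)
$\mathrm{ht}(W_j) \le \mathrm{ht}(c_j) - 1$. (3.45)

Proof. The proof is similar to that of Lemma 3.10. Using (3.36) we obtain
$V m_j = \sum_{k=0}^{2N-1} \sum_{s} V_s\, \alpha_{j,k-s}\, \partial^k + \text{l.o.t.} = \sum_{k=0}^{2N-1} \tilde V_{j,k}\, \partial^k$.
For $k = N, \ldots, 2N-1$ put $\tilde V_{j,k} = \tilde U_{j,k-N}$. As above split the sum into two parts, the first one containing the terms with $\partial^k$, $k \ge N$:
$V m_j = \Bigl( \sum_{k=0}^{N-1} \tilde U_{j,k}\, \partial^k \Bigr) \partial^N + \sum_{k=0}^{N-1} \tilde V_{j,k}\, \partial^k$.
Again use the identity $\partial^N = A - \sum_{k=1}^{N-2} a_k \partial^k + x$ several times to get the first sum in the form:
$\Bigl( \sum_{k=0}^{N-1} \tilde U_{j,k}\, \partial^k + \text{l.o.t.} \Bigr) A + x \Bigl( \sum_{k=0}^{N-1} \tilde U_{j,k}\, \partial^k + \text{l.o.t.} \Bigr)$.
Then obviously we have:
$U_j = \sum_{k=0}^{N-1} \tilde U_{j,k}\, \partial^k + \text{l.o.t.}$, (3.46)
$W_j = \sum_{k=0}^{N-1} \tilde V_{j,k}\, \partial^k + x \sum_{k=0}^{N-1} \tilde U_{j,k}\, \partial^k + \text{l.o.t.}$ (3.47)
Now using that the order at infinity of V is $x^{-2}$ we get that $\mathrm{ht}(U_j) \le \mathrm{ht}(m_j) - 2$, $\mathrm{ht}(W_j) \le \mathrm{ht}(m_j) - 1$. From the last inequalities we get (3.44) and (3.45).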
Now we are ready to finish the proof of Theorem 0.4.

Proof of Theorem 0.4. We recall that we have to show that $V \equiv 0$. Assume that some of the coefficients $V_j$ are not zero. Then we shall compute the leading terms of the operators $b_j$ recursively using (3.43) and Lemma 3.11 and taking into account the estimates (3.44) and (3.45). First notice that the highest order term in $b_1$ is of the type $\alpha_1 x^{s_1} \partial^k$, $\alpha_1 \neq 0$, with $k < N$ and $s_1 < 0$. Suppose that the highest order term in $b_j$ is $\alpha_j x^{s_j} \partial^k$, $\alpha_j \neq 0$. Then the highest order term in $b_{j+1}$ is computed, using (3.43), (3.37) and (3.38), to be $\alpha_{j+1} x^{s_{j+1}} \partial^k$ with $\alpha_{j+1} = -\alpha_j\, (N(s_j + 1) + k) / (N(s_j + 1))$. Having in mind $k < N$ we get that $\alpha_{j+1} \neq 0$. After a finite number of steps we will get that for some j the corresponding $s_j = -1$. But this contradicts the fact that the highest order term of $b_j$ is a derivative of a rational function.
4. Final Remarks on the Proof and Comments 4.1. Proof of Theorem 0.1. Essentially we already have performed the proof of the main theorem. We just have to notice that when the order of L is prime and there are coefficients increasing at infinity Lemma 3.7 and Theorem 0.4 give that the operator is Airy and hence bispectral. If the coefficients of L are bounded (at infinity) and the order is prime, then using the fact that the rank divides the order we get that either the rank of L is 1 or it is equal to its order. The latter case is treated in Theorem 0.3. If the rank is one then this is the main result of [W1]. Finally the inverse part, i.e. that all the operators listed in Theorem 0.1 are bispectral, is the main result of [BHY1].
4.2. Comments. Here I would like to make some speculations on eventual continuation of the classification. It seems to me that the methods of [BHY3] (see also [H1] for more details) will be enough to construct all bispectral operators. Assuming that then the classification should be: 1) find all “basic” bispectral operators, 2) show that Darboux transforms reduce any bispectral operator to a “basic” one. Having in mind the constructions in [DG, BHY1, KRo, W1] it seems natural to consider basic those operators that have as few singularities as possible and generate their centralizers. Then in view of the main result of [H2] one class of operators that certainly should be considered as “basic” is the class of bispectral operators L in the Weyl algebra that together with some other operator Q satisfy the “canonical commutation relation” (CCR): [L, Q] = 1. (4.1) More precisely they are basic because according to [H2] all bispectral operators in the Weyl algebra are simply polynomials in operators L that satisfy (4.1). It is tempting to believe that the CCR is enough for bispectrality: Conjecture 4.1. If the operators L and Q satisfy the CCR (4.1) then they are bispectral. This conjecture seems to be a difficult one as it is easily shown (cf. [H2]) to be equivalent to the famous conjecture of Dixmier-Kirillov. Conjecture 4.2. If the operators L, Q satisfy the CCR (4.1) then they generate the Weyl algebra A1 . In other words, any endomorphism of the Weyl algebra is an automorphism. The Bessel operators as well as other examples from [BHY3] show that one needs also to add to the list of the basic operators L the ones that together with some other operator Q satisfy the following “string” equation: [L, Q] = L.
(4.2)
Our, maybe insufficient experience, suggests that the above two classes contain all basic bispectral operators. A more precise conjecture is the following one: Conjecture 4.3. Every bispectral operator is a polynomial Darboux transformation of a bispectral operator satisfying either (4.1) or (4.2). To make the classification of the basic bispectral operators more explicit introduce the following notation. Denote by Bα the algebra spanned by a Bessel operator Lα , x N and D. Then
Conjecture 4.4. (1) A bispectral operator satisfying (4.1) belongs to the Weyl algebra. (2) A bispectral operator satisfying (4.2) belongs to one of the algebras $B_\alpha$.

Some progress could be achieved if one finds analogs for $B_\alpha$ of the results from [H2]. One difficulty would be to extend Dixmier's results to these algebras. All the above conjectures aim at a complete classification of bispectral operators. A more modest goal is to find the bispectral operators with some properties. Here is a conjecture in this direction.

Conjecture 4.5. An operator L with coefficients bounded at infinity is bispectral if and only if it is a polynomial Darboux transformation of a Bessel operator.

This conjecture seems natural in view of the "if" part, obtained in [BHY1]. It will be very useful either to give a proof of Wilson's result about rank one operators without using algebraic-geometric arguments or to modify his proof in a higher rank situation. Despite the many interesting results for higher rank solutions of the KP-hierarchy I have not found a construction suitable for our purposes. In any case the above conjecture seems to me to be within the reach of the existing tools, unlike the previous ones.

Acknowledgements. I am grateful to T. Milanov for useful conversations and for the fruitful collaboration in [HM]. Important ideas in the present paper have their roots in [HM]. The present version of the paper owes a lot to the referee. His constructive criticism has helped me to improve considerably the exposition; his suggestions made it clearer and corrected a number of errors. For all this I feel much obliged to him. This paper has been partially supported by Grant No MM 1003-2000 of the National Fund "Scientific Researches" of the Ministry of Education of Bulgaria.
References
[BHY1] Bakalov, B., Horozov, E., Yakimov, M.: Bispectral algebras of commuting ordinary differential operators. Commun. Math. Phys. 190, 331–373 (1997), q-alg/9602011
[BHY2] Bakalov, B., Horozov, E., Yakimov, M.: Bäcklund–Darboux transformations in Sato's Grassmannian. Serdica Math. J. 22, 571–588 (1996), q-alg/9602010
[BHY3] Bakalov, B., Horozov, E., Yakimov, M.: General methods for constructing bispectral operators. Phys. Lett. A 222, 59–66 (1996)
[BHY4] Bakalov, B., Horozov, E., Yakimov, M.: Highest weight modules over W1+∞, and the bispectral problem. Duke Math. J. 93, 41–72 (1998)
[BP] Harnad, J., Kasman, A. eds.: The Bispectral problem (Montréal), CRM Proc. Lecture Notes, Vol. 14, Providence, RI: Am. Math. Soc., 1998
[BC] Burchnall, J.L., Chaundy, T.W.: Commutative ordinary differential operators. Proc. Lond. Math. Soc. 21, 420–440 (1923); Proc. Royal Soc. London (A) 118, 557–583 (1928); Proc. Royal Soc. London (A) 134, 471–485 (1932)
[BW] Berest, Yu., Wilson, G.: Classification of rings of differential operators on affine curves. IMRN, N 2, 105–109 (1999)
[BW1] Berest, Yu., Wilson, G.: Automorphisms and ideals of the Weyl algebra. Math. Ann. 318(1), 127–147 (2000)
[BW2] Berest, Yu., Wilson, G.: Ideal classes of the Weyl algebra and noncommutative projective geometry. arXiv.math.AG/0104240, (2001)
[DJKM] Date, E., Jimbo, M., Kashiwara, M., Miwa, T.: Transformation groups for soliton equations. In: Proc. RIMS Symp. Nonlinear Integrable Systems – Classical and Quantum theory (Kyoto 1981), M. Jimbo, T. Miwa eds., Singapore: World Scientific, 1983, pp. 39–111
[Di] Dickey, L.: Soliton and Hamiltonian systems. Adv. Ser. Math. Phys. 12, Singapore: World Scientific, 1991
[Dx] Dixmier, J.: Sur les algèbres de Weyl. Bull. Soc. Math. France 96, 209–242 (1968)
[DG] Duistermaat, J.J., Grünbaum, F.A.: Differential equations in the spectral parameter. Commun. Math. Phys. 103, 177–240 (1986)
[F] Fastré, J.: Bäcklund–Darboux transformations and W-algebras. Doctoral Dissertation, Univ. of Louvain, 1993
[G1] Grünbaum, F.A.: The limited angle reconstruction problem in computer tomography. In: Proc. Symp. Appl. Math. 27, L. Shepp ed., Providence, RI: AMS 1982, pp. 43–61
[G2] Grünbaum, F.A.: Time-band limiting and the bispectral problem. Comm. Pure Appl. Math. 47, 307–328 (1994)
[H1] Horozov, E.: Dual algebras of differential operators, In: Kowalevski Property (Montréal), CRM Proc. Lecture Notes, Surveys from Kowalevski. Workshop on Mathematical methods of Regular Dynamics, Leeds, April 2000, Providence, RI: Am. Math. Soc., 2002
[H2] Horozov, E.: Strictly nilpotent elements and bispectral operators in the Weyl algebra. To appear in Bull. Sc. Math., 2002
[HM] Horozov, E., Milanov, T.: Fuchsian bispectral operators. Prépublication no 187 de Laboratoire de Math. E. Picard, 2000, arXiv.math.DS/0102093, Bull. Sci. Math. 126, 161–192 (2002)
[K] Kasman, A.: Bispectral KP solutions and linearization of Calogero–Moser particle systems, Commun. Math. Phys. 172, 427–448 (1995)
[KRo] Kasman, A., Rothstein, M.: Bispectral Darboux transformations: The generalized Airy case. Phys. D 102, 159–176 (1998)
[Kr] Krichever, I.: Commutative rings of linear ordinary differential operators. Funct. Anal. Appl. 12(3), 20–31 (1978)
[MZ] Magri, F., Zubelli, J.: Differential equations in the spectral parameter, Darboux transformations and a hierarchy of master equations for KdV. Commun. Math. Phys. 141, 329–351 (1991)
[S] Sato, M.: Soliton equations as dynamical systems on infinite dimensional Grassmann manifolds. RIMS Kokyuroku 439, 30–40 (1981)
[SW] Segal, G., Wilson, G.: Loop Groups and equations of KdV type. Publ. Math. IHES 61, 5–65 (1985)
[vM] van Moerbeke, P.: Integrable foundations of string theory. CIMPA – Summer school at Sophia – Antipolis (1991). In: Lectures on Integrable Systems, O. Babelon et al. eds., Singapore: World Scientific, 1994, pp. 163–267
[W1] Wilson, G.: Bispectral commutative ordinary differential operators. J. Reine Angew. Math. 442, 177–204 (1993)
[W2] Wilson, G.: Collisions of Calogero-Moser particles and an adelic Grassmannian (with an appendix by I. G. Macdonald). Invent. Math. 133, 1–41 (1998)
[Z] Zubelli, J.: Differential equations in the spectral parameter for matrix differential operators. Physica D 43, 269–287 (1990)
Communicated by L. Takhtajan
Commun. Math. Phys. 231, 309–345 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0719-y
Existence of Local Covariant Time Ordered Products of Quantum Fields in Curved Spacetime
Stefan Hollands, Robert M. Wald
The Enrico Fermi Institute, Physics Department, University of Chicago, 5640 Ellis Avenue, Chicago, IL 60637, USA. E-mail: [email protected]; [email protected]
Received: 6 December 2001 / Accepted: 10 June 2002. Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: We establish the existence of local, covariant time ordered products of local Wick polynomials for a free scalar field in curved spacetime. Our time ordered products satisfy all of the hypotheses of our previous uniqueness theorem, so our construction essentially completes the analysis of the existence, uniqueness, and renormalizability of the perturbative expansion for nonlinear quantum field theories in curved spacetime. As a byproduct of our analysis, we derive a scaling expansion of the time ordered products about the total diagonal that expresses them as a sum of products of polynomials in the curvature times Lorentz invariant distributions, plus a remainder term of arbitrarily low scaling degree.
1. Introduction

In order to give a perturbative definition of a nonlinear quantum field theory in a globally hyperbolic, curved spacetime, it is necessary to define Wick polynomials and their time ordered products for the corresponding linear (i.e., non-self-interacting) field. In the case of a scalar field, a construction of these quantities was given recently by Brunetti, Fredenhagen and Köhler [2] and by Brunetti and Fredenhagen [3]. However, these authors did not impose a locality or covariance condition on the Wick polynomials or their time ordered products. In fact, the Wick polynomials were constructed in [2] by means of a normal ordering prescription with respect to an arbitrarily chosen Hadamard vacuum state. The Wick polynomials defined in this manner thereby possess an undesirable nonlocal dependence upon the choice of this vacuum state. Since no locality or covariance condition was imposed on the construction of time ordered products of Wick polynomials in [3] – and, indeed, such conditions could not have been imposed since the Wick polynomials used in [3] were not local, covariant fields – the renormalization ambiguities were found to involve coupling functions rather than coupling constants.
In a recent paper [12], we introduced the notion of a local, covariant quantum field¹, and we then imposed the requirement that the Wick polynomials and their time ordered products be local, covariant quantum fields. We also required that these quantities have a suitable continuous and analytic dependence upon the spacetime metric and have a suitable scaling behavior under scalings of the metric. (These latter notions are well defined only for local, covariant fields.) In addition, we required the Wick polynomials and their time ordered products to satisfy various additional properties, namely suitable commutation relations with the free field, a microlocal spectral condition, and (for the time ordered products) causal factorization and unitarity conditions. We refer the reader to [12] for the precise statements of all of our conditions as well as a complete explanation of the algebraic framework within which our conditions were formulated. In [12], uniqueness theorems were proven for both the Wick polynomials and their time ordered products. For the Wick polynomials, we showed that any two constructions that satisfy all of the above conditions can differ at most by a suitable sum of products of curvature terms of a definite scaling dimension multiplied by lower order Wick polynomials of the appropriate dimension. In particular, this implies that the ambiguity in defining Wick polynomials up to a given finite order is uniquely characterized by only a finite number of parameters. A similar uniqueness result was obtained for the time ordered products, thereby establishing that the ambiguity in defining these quantities up to any given finite order also is characterized by only a finite number of parameters. 
We then showed that $\lambda\varphi^4$-theory in curved spacetime is renormalizable in the sense that the ambiguities arising in the perturbative definition of this theory correspond precisely to the (finite number of) parameters appearing in the classical Lagrangian (provided that the possible curvature couplings of the appropriate dimension are included in this Lagrangian). Again, we refer the reader to [12] for the precise statements and proofs of these results.

The above uniqueness theorems, of course, do not address the issue of whether there actually exists a construction of Wick polynomials and their time ordered products that satisfies all of our requirements. As already noted above, in [2] Wick polynomials were constructed via a normal ordering prescription, but they fail to satisfy our requirement of being local and covariant. However, this deficiency can be repaired in a relatively straightforward manner by replacing the normal ordering prescription with respect to a (nonlocally defined) Hadamard vacuum state by a point-splitting prescription based upon a locally and covariantly defined Hadamard parametrix. It was proven in [12] that such a construction does indeed satisfy all of our requirements, thus establishing the existence of local, covariant Wick polynomials.

One might hope that the construction of time ordered products given in [3] could be similarly modified to yield local, covariant fields that satisfy all of our requirements. However, it is not at all obvious how to do this. As in [3] (and as will be explicitly seen in Subsect. 3.1 below), the essential difficulty in defining time ordered products arises from the extension of certain multivariable distributions to the total diagonal; it is here that regularization/renormalization is needed. The usual momentum space methods of regularization are inapplicable in a curved, Lorentzian spacetime, but the Epstein-Glaser prescription [10] is well defined [3]. 
However, this prescription involves the modification of test functions by the subtraction of their truncated Taylor series multiplied by a “cutoff function”. The introduction of such a cutoff function makes the prescription inherently nonlocal. Consequently, the time ordered products defined by this prescription will fail to be local, covariant fields.

¹ A general notion of local, covariant quantum fields has been given in [4].
A similar difficulty with the Epstein-Glaser prescription occurs in Minkowski spacetime, where the introduction of the cutoff function makes the prescription fail to be Lorentz invariant. However, in Minkowski spacetime, a cohomology argument can then be used to establish existence of a satisfactory Lorentz invariant prescription [17]. We have not been able to generalize this argument to curved spacetime. For this reason, the issue of existence of local covariant time ordered products was left open in [12]. The main purpose of this paper is to prove the existence of local covariant time ordered products, thereby essentially completing² the perturbative construction of nonlinear quantum field theory in curved spacetime.

The basic idea of our construction is as follows. As already indicated above, our task is to extend certain distributions on $M^{n+1} \setminus \Delta_{n+1}$ to all of $M^{n+1}$ in a local, covariant manner, where $M$ denotes the spacetime manifold, $M^{n+1} = \times^{n+1} M$, and $\Delta_{n+1}$ denotes the total diagonal of $M^{n+1}$,
$$\Delta_{n+1} = \{(x, x, \dots, x) \mid x \in M\}. \tag{1}$$
The key idea which enables us to accomplish this is to analyze the scaling behavior of the unextended distributions near the total diagonal. To do so, we first introduce $n$ “relative coordinates” $y$ and show that we can view each unextended distribution as a distribution in $y$ for each fixed $x \in M$ (i.e., our distribution in $(n + 1)$ variables can be viewed as a distribution in the $n$ relative coordinates that is parametrized by the point $x$ on the total diagonal). We then show that near the total diagonal, each unextended distribution in question can be written as a finite sum of terms together with a “remainder term” with the following properties: (i) The terms in the finite sum are products of curvature terms in $x$ times distributions, $u$, in $y$ that correspond to Lorentz invariant distributions in Minkowski spacetime³. (ii) The remainder term has a sufficiently low scaling degree under scaling of $y$. The distributions, $u$, may then be extended to the total diagonal by Minkowski spacetime methods, whereas the remainder term can be extended to the total diagonal by continuity. The resulting extended distributions can then be shown to provide a definition of local, covariant time ordered products that satisfy all of our requirements.

The paper is organized as follows. In Sect. 2, we review our requirements on the definition of time ordered products. These requirements are the ones previously given in [12] except that we have replaced the continuity requirement of [12] under smooth variations of the metric with a smoothness requirement. Further discussion of our new smoothness requirement is given in Appendix A. In Sect. 3, we reduce the problem of constructing time ordered products to that of extending certain scalar distributions to the total diagonal. In Subsect. 3.1, we proceed inductively in the number, $n$, of variables, and reduce the problem to the extension of the time ordered products in $n + 1$ variables to the total diagonal. In Subsect. 3.2, we use a local, covariant version of the Wick expansion to express these time ordered products in $n + 1$ variables as sums of local Wick products times “c-number” distributions, $t^0$. In Subsect. 3.3, we then translate our requirements on the definition of time ordered products into requirements on the extensions of the distributions, $t^0$, to the total diagonal. Section 4 is devoted to obtaining the desired extension of $t^0$. In Subsect. 4.1, we introduce “relative coordinates”, $y$, and then derive our scaling expansion of $t^0$ with the properties indicated above. (Some properties of the distributions occurring in the scaling expansion are obtained in Appendix B.) The scaling expansion is then used to extend $t^0$ in Subsect. 4.2. Finally, in Subsect. 4.3, we show that the extended distributions, $t$, satisfy the properties listed in Subsect. 3.3, so that they define a notion of time ordered products satisfying all of the requirements of Sect. 2. Some concluding remarks are given in Sect. 5.

We will restrict consideration here to the theory of a scalar field $\varphi$, but our basic methods and results should be applicable to other fields. As in [12], for notational simplicity we restrict attention to time ordered products of Wick powers that do not contain derivatives of $\varphi$. However, our results should extend straightforwardly to time ordered products involving derivatives of the field, subject to the caveat mentioned in footnote 2 above. In addition, for notational simplicity in treating the scaling behaviour, we restrict consideration to the massless case, so that the free theory contains no dimensional parameters. Again, our results can be straightforwardly generalized to the case where dimensional parameters are present.

² As pointed out in [12], when defining Wick polynomials involving derivatives of the field, it is natural to require the vanishing of any Wick product that contains a factor of the wave operator applied to the field. This requirement was not imposed in [12], so the issue of existence of Wick polynomials that satisfy this additional condition remains open. We are currently investigating this issue; see also [16].

³ For the Feynman propagator and its powers, these terms would correspond to the momentum space expressions given in [5], since each $u$ can be given a momentum space representation.

Notation and Conventions. Our notation and conventions are the same as in [12]. In particular, we define the Fourier transform on $\mathbb{R}^m$ by $\hat{u}(k) = (2\pi)^{-m/2} \int u(x) e^{+ikx} \, d^m x$. Multi-indices are denoted by $\alpha = (\alpha_1, \dots, \alpha_m) \in \mathbb{N}_0^m$. If $\alpha$ is an $m$-dimensional multi-index, then we also use standard notations such as $|\alpha| = \sum_i \alpha_i$, $x^\alpha = x_1^{\alpha_1} \cdots x_m^{\alpha_m}$ and $\partial^\alpha = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}}$. We also use the “constant convention”, meaning that we use the same symbol $C$ for possibly different numerical constants in a chain of inequalities. The space of compactly supported smooth functions on a space $X$ with values in the complex numbers is denoted by $\mathcal{D}(X)$ and the space of smooth functions on $X$ (not necessarily of compact support) by $\mathcal{E}(X)$. (For the definition of the topology on these spaces, see e.g. [19, Chap. V].) The corresponding topological dual spaces of distributions are denoted by $\mathcal{D}'(X)$ and $\mathcal{E}'(X)$, respectively. The elements in $\mathcal{E}'(X)$ are the distributions of compact support. The wave front set [14] of a distribution $u$ is denoted by $\mathrm{WF}(u)$ and its analytic wave front set [14] (see also Appendix A) is denoted by $\mathrm{WF}_A(u)$.

2. Required Properties of the Time Ordered Products

For the theory of a free scalar field, $\varphi(x)$, on an arbitrary globally hyperbolic spacetime, $(M, g)$, we previously defined [12] an “extended Wick-polynomial algebra”, $\mathcal{W}(M, g)$, which generalizes the construction of Dütsch and Fredenhagen [8] to curved spacetimes. This algebra is sufficiently large to contain elements corresponding to all Wick powers, $\varphi^k(x)$, (as distributions on compactly supported test functions on $M$) and their time ordered products
$$T = T\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n)\big), \tag{2}$$
(as distributions on compactly supported smooth test functions on $M^n$). In [12] we imposed a set of requirements on both $\varphi^k$ and $T$ that uniquely determined these quantities up to certain well specified renormalization ambiguities. In [12], we also constructed Wick products satisfying all of our conditions, so in this paper we will view these quantities as known. Our task here is to construct time ordered products of Wick powers that satisfy the following list of requirements, which – apart from the smoothness condition T4 (see remark (1) at the end of this section) – correspond to the requirements previously given in [12]:
T1 Locality/Covariance. The time ordered products are local, covariant fields, as defined in [12].

T2 Scaling. The time ordered products scale “almost homogeneously” under rescalings $g \to \lambda^{-2} g$ of the spacetime metric in the following sense. Let $\Phi$ be a local, covariant field in $n$ variables, and let $S_\lambda \Phi$ be the rescaled local, covariant field given by $S_\lambda \Phi[g] \equiv \lambda^{-4n} \sigma_\lambda\big(\Phi[\lambda^{-2} g]\big)$, where $\sigma_\lambda : \mathcal{W}(M, \lambda^{-2} g) \to \mathcal{W}(M, g)$ is the canonical isomorphism defined in [12]. The scaling dimension, $d_\Phi$, of a local covariant field $\Phi$ is defined as
$$d_\Phi = \sup\Big\{\delta \in \mathbb{R} \;\Big|\; \lim_{\lambda \to 0+} \lambda^{-\delta} S_\lambda \Phi = 0\Big\}. \tag{3}$$
The scaling requirement on the time ordered product is then that
$$\lambda^{-d_T} S_\lambda T = T + \sum_{h=1}^{N} \ln^h \lambda \; \Psi_h, \tag{4}$$
where $d_T = \sum_i k_i$, $N$ is some natural number and where $\Psi_h$ are local, covariant fields with scaling dimension $d_T$ which have fewer powers in the free field than $T$.

T3 Microlocal spectrum condition. Let $\omega$ be any continuous state on $\mathcal{W}(M, g)$, so that, as shown in [13], $\omega$ has smooth truncated $n$-point functions for $n \neq 2$ and a two-point function $\omega_2(x, y) = \omega(\varphi(x)\varphi(y))$ of Hadamard form, i.e., $\mathrm{WF}(\omega_2) \subset C_+(M, g)$, where
$$C_+(M, g) = \big\{(x_1, k_1; x_2, -k_2) \in T^*M^2 \setminus \{0\} \;\big|\; (x_1, k_1) \sim (x_2, k_2);\ k_1 \in V^+_{x_1}\big\}. \tag{5}$$
Here the notation $(x_1, k_1) \sim (x_2, k_2)$ means that $x_1$ and $x_2$ can be joined by a null-geodesic and that $k_1$ and $k_2$ are cotangent and coparallel to that null-geodesic. $V^+_x$ is the future lightcone at $x$. Furthermore, let
$$\omega_T(x_1, \dots, x_n) = \omega\Big(T\Big(\prod_{i=1}^{n} \varphi^{k_i}(x_i)\Big)\Big). \tag{6}$$
Then we require that
$$\mathrm{WF}(\omega_T) \subset C_T(M, g), \tag{7}$$
where the set $C_T(M, g) \subset T^*M^n \setminus \{0\}$ is described as follows (we use the graphological notation introduced in [2, 3]): Let $G(p)$ be a “decorated embedded graph” in $(M, g)$. By this we mean an embedded graph $\subset M$ whose vertices are points $x_1, \dots, x_n \in M$ and whose edges, $e$, are oriented null-geodesic curves. Each such null geodesic is equipped with a coparallel, cotangent covectorfield $p_e$. If $e$ is an edge in $G(p)$ connecting the points $x_i$ and $x_j$ with $i < j$, then $s(e) = i$ is its source and $t(e) = j$ its target. It is required that $p_e$ is future/past directed if $x_{t(e)} \in J^\pm(x_{s(e)})$. With this notation, we define
$$C_T(M, g) = \Big\{(x_1, k_1; \dots; x_n, k_n) \in T^*M^n \setminus \{0\} \;\Big|\; \exists \text{ decorated graph } G(p) \text{ with vertices } x_1, \dots, x_n \text{ such that } k_i = \sum_{e: s(e) = i} p_e - \sum_{e: t(e) = i} p_e \ \ \forall i\Big\}. \tag{8}$$
T4 Smoothness. The functional dependence of the time ordered products on the spacetime metric, $g$, is such that if the metric is varied smoothly, then the time ordered products vary smoothly, in the following sense. Consider a smooth one parameter family of metrics $g^{(s)}$, let $T^{(s)}$ be a corresponding family of time ordered products, and let $C_T^{(s)}$ be given by Eq. (8) for this family of metrics. Furthermore, let $\omega^{(s)}$ be a family of Hadamard states with smooth truncated $n$-point functions ($n \neq 2$) depending smoothly on $s$ and with two-point functions $\omega_2^{(s)}$ depending smoothly on $s$ in the sense that (see Appendix A)
$$\mathrm{WF}\big(\omega_2^{(s)}\big) \subset \big\{(s, \rho; x_1, k_1; x_2, k_2) \in T^*(\mathbb{R} \times M^2) \setminus \{0\} \;\big|\; (x_1, k_1; x_2, k_2) \in C_+^{(s)}\big\}, \tag{9}$$
where the family of cones $C_+^{(s)}$ is defined by Eq. (5) in terms of the family $g^{(s)}$. Then we require that the family of distributions given by
$$\omega_T(s, x_1, \dots, x_n) = \omega^{(s)}\Big(T^{(s)}\Big(\prod_{i=1}^{n} \varphi^{k_i}(x_i)\Big)\Big) \tag{10}$$
depends smoothly on $s$ with respect to $C_T^{(s)}$ in the sense that
$$\mathrm{WF}(\omega_T) \subset \big\{(s, \rho; x_1, k_1; \dots; x_n, k_n) \in T^*(\mathbb{R} \times M^n) \setminus \{0\} \;\big|\; (x_1, k_1; \dots; x_n, k_n) \in C_T^{(s)}\big\}. \tag{11}$$
T5 Analyticity. Similarly, we require that, for an analytic one-parameter family of analytic metrics, the expectation value of the time ordered products in an analytic family of states varies analytically in the same sense as in T4, but with the smooth wave front set replaced by the analytic wave front set.

T6 Symmetry. The time ordered products are symmetric under a permutation of the factors.

T7 Unitarity. We have $T^* = \bar{T}$, where $\bar{T}$ is the “anti-time-ordered” product, defined as
$$\bar{T}\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n)\big) = \sum_{I_1 \sqcup \cdots \sqcup I_j = \{1, \dots, n\}} (-1)^{n+j} \, T\Big(\prod_{i \in I_1} \varphi^{k_i}(x_i)\Big) \cdots T\Big(\prod_{i \in I_j} \varphi^{k_i}(x_i)\Big), \tag{12}$$
where the sum runs over all partitions of the set $\{1, \dots, n\}$ into disjoint subsets $I_1, \dots, I_j$.

T8 Causal Factorization. In the case of a single factor, we require that $T(\varphi^k(x)) = \varphi^k(x)$. For more than one factor, we require the time ordered product to satisfy the following causal factorization rule, which reflects the time-ordering of the factors. Consider a set of points $(x_1, \dots, x_n) \in M^n$ and a partition of $\{1, \dots, n\}$ into two non-empty disjoint subsets $I$ and $I^c$, with the property that no point $x_i$ with $i \in I$ is in the past of any of the points $x_j$ with $j \in I^c$, that is, $x_i \notin J^-(x_j)$ for all $i \in I$ and $j \in I^c$. Then the time ordered products factorize in the following way:
$$T\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n)\big) = T\Big(\prod_{i \in I} \varphi^{k_i}(x_i)\Big) \, T\Big(\prod_{j \in I^c} \varphi^{k_j}(x_j)\Big). \tag{13}$$
T9 Commutator. The commutator of a time ordered product with a free field is given by lower order time ordered products times suitable commutator functions, namely
$$\big[T\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n)\big), \varphi(y)\big] = i \sum_{i=1}^{n} k_i \, \Delta(x_i, y) \, T\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_i - 1}(x_i) \cdots \varphi^{k_n}(x_n)\big), \tag{14}$$
where $\Delta$ is the causal propagator (commutator function), defined as the difference between the advanced and retarded fundamental solutions of the Klein-Gordon equation.

Remarks. (1) In our paper [12], we defined a notion of the continuous variation of a local covariant field under smooth variations of the metric, and we imposed this as a requirement on Wick powers and their time ordered products. We have replaced this requirement here with condition T4, which requires smooth (rather than continuous) dependence of the fields. It is easy to verify that the uniqueness results of [12] as well as the existence result of [12] for Wick powers go through without any essential change if the continuity requirement imposed there is replaced by condition T4. We prefer to work with condition T4 here because it is a much simpler condition to state, it is more general, and it closely parallels the analyticity requirement T5 that was previously imposed in [12]. Further discussion and explanation of conditions T4 and T5 is given in Appendix A.

(2) The microlocal spectrum condition is the same condition as formulated in [3]. It may be motivated by the fact that for noncoinciding points, $\omega\big(T\big(\prod_i \varphi^{k_i}(x_i)\big)\big)$ can be expressed in terms of Feynman graphs. A line in such a graph represents a Feynman propagator, $\omega_F(x, y) \stackrel{\text{def}}{=} \omega(T(\varphi(x)\varphi(y))) = \omega_2(x, y) - i\Delta_{\mathrm{adv}}(x, y)$, whose wave front set off the diagonal is given by [18]
$$\mathrm{WF}(\omega_F) = \big\{(x_1, k_1; x_2, -k_2) \;\big|\; (x_1, k_1) \sim (x_2, k_2);\ k_1 \in V^\pm_{x_1} \Leftrightarrow x_2 \in J^\pm(x_1)\big\}. \tag{15}$$
For non-coinciding points, the form of $\mathrm{WF}(\omega_T)$ follows from (15) and the rules for calculating the wave front set of a product of several distributions, see e.g. [14, Thm. 8.2.10]. For coinciding points, the form of $\mathrm{WF}(\omega_T)$ reflects the usual energy momentum conservation rules. On the total diagonal, $\Delta_n$, the microlocal spectral condition reduces to
$$\mathrm{WF}(\omega_T)\big|_{\Delta_n} \perp T(\Delta_n), \tag{16}$$
where the notation “$\perp$” means the following: If $F \subset T^*X$ with $X$ a manifold and $Y \subset X$ a smooth submanifold, then $F|_Y \perp TY$ means that for any $(y, k) \in F|_Y$ and any $(y, v) \in TY$ we have that $k_a v^a = 0$.
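As a worked step, Eq. (16) can be unpacked as follows. This is our own short derivation rather than a formula stated in the text; it assumes only that a tangent vector to the total diagonal $\Delta_n$ at $(x, \dots, x)$ has the form $(v, \dots, v)$ with $v \in T_xM$. Under that reading, the condition says that the covectors at a point of the total diagonal sum to zero, which is one way to see the "energy momentum conservation" rule mentioned above.

```latex
% Element of WF(\omega_T) over the total diagonal:
%   (x, k_1; \dots; x, k_n), with k_i \in T^*_x M.
% Tangent vectors to \Delta_n at (x,\dots,x): (v,\dots,v), with v \in T_x M.
\[
  0 \;=\; \sum_{i=1}^{n} (k_i)_a v^a
    \;=\; \Big(\sum_{i=1}^{n} k_i\Big)_a v^a
  \quad \text{for all } v \in T_x M
  \qquad\Longleftrightarrow\qquad
  \sum_{i=1}^{n} k_i \;=\; 0 .
\]
```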
(3) The “connected time ordered product”, $T^c$, of $n$ Wick-monomials is defined in terms of the time ordered product by
$$T^c\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n)\big) = \frac{\delta^n}{i^n \, \delta f_1(x_1) \cdots \delta f_n(x_n)} \ln S(f) \Big|_{f_1 = \cdots = f_n = 0} = \sum_{I_1 \sqcup \cdots \sqcup I_j = \{1, \dots, n\}} \frac{(-1)^{j+1}}{j} \, T\Big(\prod_{i \in I_1} \varphi^{k_i}(x_i)\Big) \cdots T\Big(\prod_{i \in I_j} \varphi^{k_i}(x_i)\Big),$$
where the $f_i$ are test functions of compact support and $S(f)$ is the local S-matrix for the Lagrangian $L(x) = \sum_i f_i(x) \varphi^{k_i}(x)$, defined as the formal power series expression
$$S(f) = \sum_{n \geq 0} \frac{i^n}{n!} \int_{M^n} T\big(L(x_1) \cdots L(x_n)\big) \, \mu_g(x_1) \cdots \mu_g(x_n). \tag{17}$$
Our unitarity condition, T7, is equivalent to the condition $T^{c*} = (-1)^{n+1} T^c$ on the connected time ordered product.

(4) For Minkowski spacetime, condition T9 was given in [9, 1], where it was shown to be equivalent to the familiar Wick-expansion of the time ordered products (see Subsect. 3.2 below).

Our task is to construct time ordered products of Wick powers that satisfy conditions T1–T9. We shall proceed inductively in the number of factors, $n$, appearing in the time ordered product (2). By condition T8, for $n = 1$ the time ordered products are just the Wick powers, which were already constructed in [12]. Therefore, we may inductively assume that time ordered products with properties T1–T9 have been defined for any number of factors $\leq n$. The goal is to construct from these the time ordered products with $n + 1$ factors. In the next section, we reduce the problem (in close parallel with the analysis of [3]) to that of extending certain multivariable scalar distributions $t^0$ to the total diagonal.

3. Reduction to the Problem of Extending Certain Scalar Distributions to the Total Diagonal

3.1. Construction of time ordered products up to the total diagonal. The key idea of causal perturbation theory is that the time ordered products with $n + 1$ factors are already uniquely determined as algebra-valued distributions on the manifold $M^{n+1}$ minus its total diagonal $\Delta_{n+1}$ by the causal factorization requirement T8 (see Eq. (13)), once the time ordered products with less than or equal to $n$ factors are given. Following [3], this can be seen as follows: Let $I$ be a proper subset of $\{1, 2, \dots, n+1\}$, and let $C_I$ be the subset of $M^{n+1}$ defined by
$$C_I = \big\{(x_1, x_2, \dots, x_{n+1}) \;\big|\; x_i \notin J^-(x_j) \text{ for all } i \in I,\ j \in I^c\big\}, \tag{18}$$
where $I^c$ is the complement of $I$. It can be seen that the sets $C_I$ are open and that the collection $\{C_I\}$ of these sets covers the manifold $M^{n+1} \setminus \Delta_{n+1}$. Let $\{f_I\}$ be a partition
of unity subordinate to this covering. On the manifold $M^{n+1} \setminus \Delta_{n+1}$, we define the algebra-valued distributions $T^0$ by
$$T^0 = \sum_{I \subsetneq \{1, \dots, n+1\},\ I \neq \emptyset} f_I \, T_I, \tag{19}$$
where
$$T_I = T\Big(\prod_{i \in I} \varphi^{k_i}(x_i)\Big) \, T\Big(\prod_{j \in I^c} \varphi^{k_j}(x_j)\Big). \tag{20}$$
Using the causal factorization property T8 of the time ordered products with less than or equal to $n$ factors, it can be seen that the definition of $T^0$ does not depend on the choice of the partition $\{f_I\}$, so $T^0$ is well defined. Property T8 applied to the time ordered products with $n + 1$ factors then requires that the restriction of $T$ to $M^{n+1} \setminus \Delta_{n+1}$ must agree with $T^0$. Thus, property T8 alone determines $T$ up to the total diagonal, as we desired to show.

We now claim that – assuming that time ordered products with less than or equal to $n$ factors have been defined so as to satisfy properties T1–T9 on $M^n$ – the fields $T^0$ with $n + 1$ factors automatically satisfy⁴ the restrictions of properties T1–T9 to $M^{n+1} \setminus \Delta_{n+1}$. Condition T8 can be immediately seen to hold by virtue of the definition of $T^0$. The proof that properties T1, T2, T6, T7 and T9 hold is relatively straightforward. A proof of the microlocal spectral condition, T3, can be given in exact parallel with reference [3]. A generalization of this argument can be used to prove that the smoothness and analyticity conditions, T4 and T5, also hold.

⁴ Of course, if any $T^0$ failed to satisfy any of these properties on $M^{n+1} \setminus \Delta_{n+1}$, we would have a proof that no definition of time ordered products could exist that satisfies T1–T9.

Our remaining task is to find an extension of each of the algebra-valued distributions $T^0$ in $n + 1$ factors from $M^{n+1} \setminus \Delta_{n+1}$ to all of $M^{n+1}$ in such a way that properties T1–T9 continue to hold for the extension. This step, of course, corresponds to renormalization. Condition T8 does not impose any additional conditions on the extension, so we need only satisfy T1–T7 and T9. However, it is not difficult to see that if an extension $T$ is defined that satisfies T1–T5 and T9, then that extension can be modified, if necessary, so as to also satisfy the symmetry and unitarity conditions, T6 and T7. Namely, if the extension, $T$, of $T^0$ satisfied T1–T5 and T9 but failed to satisfy the symmetry condition, T6, we could define a new extension $T'$ by symmetrization,
$$T' = \frac{1}{(n+1)!} \sum_{\text{Perm } \pi} T\big(\varphi^{k_{\pi(1)}}(x_{\pi(1)}) \cdots \varphi^{k_{\pi(n+1)}}(x_{\pi(n+1)})\big). \tag{21}$$
The extension of $T^0$ so obtained then satisfies T1–T6 and T9. Similarly, suppose the extension, $T$, satisfied T1–T6 and T9 but failed to satisfy the unitarity condition, T7, so that the corresponding connected time ordered product, $T^c$, fails to satisfy $T^{c*} = (-1)^n T^c$ (see Remark (3) of Sect. 2). Then we define $T'^c = \frac{1}{2}\big(T^c + (-1)^n T^{c*}\big)$ and redefine our extension by
$$T' = T'^c\Big(\prod_{i=1}^{n+1} \varphi^{k_i}(x_i)\Big) - \sum_{\substack{I_1 \sqcup \cdots \sqcup I_j = \{1, \dots, n+1\} \\ j \geq 2}} \frac{(-1)^{j+1}}{j} \, T\Big(\prod_{i \in I_1} \varphi^{k_i}(x_i)\Big) \cdots T\Big(\prod_{i \in I_j} \varphi^{k_i}(x_i)\Big). \tag{22}$$
This provides us with an extension of $T^0$ that satisfies T1–T7 and T9. Thus, we have reduced the problem of defining time ordered products to the problem of extending the distributions $T^0$ defined by (19) from $M^{n+1} \setminus \Delta_{n+1}$ to all of $M^{n+1}$ so that properties T1–T5 and T9 continue to hold for the extension. In the next subsection, we will see that property T9 can be replaced by the requirement of a local Wick expansion for time ordered products.
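The two repair steps of Eqs. (21) and (22) can be checked directly. The following is our own short verification, not part of the original argument. The primed symbols $T'$ and $T'^c$ are labels assumed here for the modified extension and its connected part, $\rho$ is an arbitrary fixed permutation of $\{1, \dots, n+1\}$, and the adjoint is assumed to be an antilinear involution.

```latex
% (a) Symmetrization, Eq. (21): relabel \pi \mapsto \rho\circ\pi in the sum
%     over all permutations; the set of permutations is unchanged.
\[
  T'\big(\varphi^{k_{\rho(1)}}(x_{\rho(1)}) \cdots
         \varphi^{k_{\rho(n+1)}}(x_{\rho(n+1)})\big)
  = \frac{1}{(n+1)!} \sum_{\pi}
    T\big(\varphi^{k_{\rho\pi(1)}}(x_{\rho\pi(1)}) \cdots
          \varphi^{k_{\rho\pi(n+1)}}(x_{\rho\pi(n+1)})\big)
  = T'\big(\varphi^{k_1}(x_1) \cdots \varphi^{k_{n+1}}(x_{n+1})\big).
\]
% (b) Unitarity for n+1 factors: T'^c = (1/2)(T^c + (-1)^n T^{c*}).
%     Uses (T^{c*})^* = T^c and (-1)^{2n} = 1.
\[
  (T'^{c})^{*}
  = \tfrac{1}{2}\big(T^{c*} + (-1)^n T^{c}\big)
  = (-1)^n \, \tfrac{1}{2}\big((-1)^n T^{c*} + T^{c}\big)
  = (-1)^n \, T'^{c} .
\]
```

If $T$ already satisfies the symmetry or unitarity condition, the corresponding step returns $T$ unchanged: all terms in the sum of (21) coincide, and $T'^c = \frac{1}{2}(T^c + T^c) = T^c$.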
3.2. Reduction to a c-number problem via a local Wick expansion. The next key simplification is to reduce the problem of defining the algebra valued distributions $T$ to the problem of defining certain “c-number” distributions $t$. As in [3], this is accomplished by means of a “Wick expansion”. The usual Wick expansion in Minkowski spacetime expresses time ordered products as a sum of normal ordered products with distributional coefficients. In the generalization to curved spacetime given in [3], the time ordered products are Wick expanded in terms of normal ordered products defined relative to an arbitrarily chosen quasi-free Hadamard state. However, such an expansion would not be useful here because the quasi-free Hadamard state – however it is chosen – has a nonlocal character. Consequently, the distributional coefficients occurring in the Wick expansion with respect to normal ordered products will fail to inherit the locality and covariance properties of the time ordered products themselves. For this reason, we will employ here a Wick expansion of the time ordered products with respect to the local, covariant Wick products ${:}\varphi^{k_1} \cdots \varphi^{k_n}{:}_H$ that were previously defined in [12] as follows: Let $H(x, y)$ denote the local Hadamard parametrix
$$H(x, y) = U(x, y)\,\sigma^{-1} + V(x, y) \ln \sigma, \tag{23}$$
where $U, V$ are certain smooth functions defined in terms of the metric and the coupling parameters, $\sigma$ is the signed squared geodesic distance and where the “$i0$” prescription for the singular terms is as for the two-point function in Minkowski space. (The formal power series defining $V$ need not converge in smooth, non-analytic spacetimes, but a suitably modified convergent $V$ can be used, as explained in [12, Sect. 5.2].) Following [12, Sect. 5.2], we then define in some neighborhood, $U_n$, of the total diagonal, $\Delta_n$, the algebra valued distributions
$${:}\varphi^{k_1}(x_1) \cdots \varphi^{k_n}(x_n){:}_H = \frac{\delta^{|k|}}{i^{|k|} \, \delta f(x_1)^{k_1} \cdots \delta f(x_n)^{k_n}} \exp\Big(i\varphi(f) + \tfrac{1}{2} H(f, f) \cdot \mathbf{1}\Big) \Big|_{f = 0} \tag{24}$$
with |k| = ki . Our Wick-expansion is n "k # ki tj ...j (x1 , . . . , xn ) : ϕ k1 −j1 (x1 ) . . . ϕ kn −jn (xn ) :H , (25) T ϕ (xi ) = j 1 n i=1
j1 ...jn
ki where tj1 ...jn are c-number distributions on Un and where jk = ji . Note that the Wick-expansion formula (25) is a different identity for different sets of exponents (k1 , . . . , kn ), but that the same coefficients tj1 ...jn appear in each identity. We emphasize that since our local Wick-products (24) are defined only on a sufficiently small neighborhood, Un , of the total diagonal, our Wick-expansion will only make sense in this neighborhood. (This is in contrast with the Wick-expansion used in [3] which is based on a normal ordering prescription for Wick-products and therefore makes sense everywhere on M n .) This fact, however, will not cause any complications for our constructions, since we will need the Wick-expansion only for the purpose of extending T 0 to the total diagonal. We claim now that any definition of time ordered products that satisfies requirements T3 and T9 must admit a Wick expansion of the form (25), with distributional coefficients satisfying WF tj1 ...jn ⊂ CT (M, g), (26) where CT (M, g) is the set specified in (8). (Note that (26) implies in particular that the products of distributions implicit in our Wick-expansion formula actually exist and that the operator given by this formula defines – after smearing with a smooth test function – an element of our algebra W(M, g).) To prove this claim, we note that Eqs. (25) and (26) hold trivially for the time ordered product T (ϕ) of a single free field. Let us now inductively assume that Eqs. (25) and (26) have been demonstrated for all time ordered products of the form T (ϕ k1 . . . ϕ kn ), whenever |k| = ki < dfor some d ≥ 1. We claim that they also hold for all multi-orders (k1 , . . . , kn ) with ki = d. To see this, we consider the difference, n " # k ki tj ...j (x1 , . . . , xn ) D(x1 , . . . , xn ) = T ϕ (xi ) − j 1 n i=1
j1 ...jn ,|j |<|k|
: ϕ k1 −j1 (x1 ) . . . ϕ kn −jn (xn ) :H , (27) between the left side of Eq. (25) and the expression on the right side of that equation, but with the term tk1 ...kn 11 omitted in the sum. (Note that this is precisely the term in Eq. (25) which is not already known by the induction hypothesis.) We now commute D with a free field ϕ. We use T9 to evaluate the commutator with the time ordered product and we use the similar commutation relation that holds for the local Wick products occurring in the sum. If this is carried out, then one finds that [D(x1 , . . . , xn ), ϕ(y)] = 0. Since the only elements of our algebra W(M, g) that commute with all smeared field operators ϕ(f ) are multiples of the identity [12, Prop. 2.1], we thus find that D must in fact be given by a c-number distribution times the identity. We define tk1 ...kn to be this c-number distribution. Now t = tk1 ...kn can trivially be written as t = ω(D), for any Hadamard state, and each operator in the expression (27) for D satisfies5 T3. Hence, For the terms in the sum in Eq. (27), this follows from inductive hypothesis Eq. (26) on the tj1 ...jn with |j | < |k|, together with the fact that ω(: ϕ k1 (x1 ) . . . ϕ kn (xn ) :H ) is smooth for all k1 , . . . , kn . 5
condition T3 holds also for $D$, thus showing that $\mathrm{WF}(t_{k_1 \dots k_n}) \subset C_T(M, g)$. We have therefore completed the induction step, thereby establishing that the Wick-expansion holds for all multi-orders $(k_1, \dots, k_n)$, provided only that T3 and T9 hold for the time ordered products. Conversely, if a definition of time ordered products has been given that admits a Wick expansion of the form (25) with coefficients satisfying (26), then properties T3 and T9 will hold as well in the neighborhood of the total diagonal on which the Wick expansion is defined.

Since the distribution $T^0$ defined on $M^{n+1} \setminus \Delta_{n+1}$ by (19) above satisfies properties T3 and T9 on $M^{n+1} \setminus \Delta_{n+1}$, it also admits a local Wick expansion of the form (25), i.e., on $U_{n+1} \setminus \Delta_{n+1}$ we have
\[ T^0\Big(\prod_{i=1}^{n+1} \varphi^{k_i}(x_i)\Big) = \sum_{j_1 \dots j_{n+1}} \binom{k}{j}\, t^0_{j_1 \dots j_{n+1}}(x_1, \dots, x_{n+1})\, :\varphi^{k_1 - j_1}(x_1) \dots \varphi^{k_{n+1} - j_{n+1}}(x_{n+1}):_H. \tag{28} \]
In the next subsection, we will reformulate the problem of extending the algebra-valued distributions $T^0$ to the algebra-valued distributions $T$ in terms of the extension of the c-number distributions $t^0$ appearing in (28) to the c-number distributions $t$ appearing in (25).

3.3. Reformulation in terms of the extension of $t^0$.

We return now to our inductive construction of time ordered products. We assume that all time ordered products involving $\le n$ factors have been constructed so as to satisfy our assumptions T1–T9 and we consider an arbitrary time ordered product, $T$, in $(n + 1)$ factors. As noted in Subsect. 3.1, property T8 will hold if and only if $T$ is an extension to all of $M^{n+1}$ of the distribution $T^0$ on $M^{n+1} \setminus \Delta_{n+1}$ defined by (19). Since $T^0$ satisfies T1–T9 on $M^{n+1} \setminus \Delta_{n+1}$, we need only check that our extension preserves these properties. As noted at the end of Subsect. 3.1, we actually need only check that $T$ preserves properties T1–T5 and T9, since T8 does not provide any additional conditions on the extension and, by a suitable redefinition, it is straightforward to ensure that T6 and T7 are satisfied. Furthermore, as shown in the previous subsection, we may replace property T9 by the local Wick expansion (25). Thus, time ordered products satisfying all of our conditions will exist if and only if the c-number distributions $t^0$ on $U_{n+1} \setminus \Delta_{n+1}$ appearing in (28) can be extended to distributions $t$ on $U_{n+1}$ in such a way that the distribution $T$ defined by (25) continues to satisfy properties T1–T5. It is straightforward to check that this will be the case if and only if the extensions $t$ satisfy the following 5 corresponding conditions:

t1 Locality/Covariance. The distributions $t$ are locally constructed from the metric in a covariant manner in the following sense. Let $\psi : N \to M$ be a causality-preserving isometric embedding, and let $f$ be a test function supported in a sufficiently small neighborhood of the total diagonal of $N^{n+1}$. Then we require that
\[ \psi^* t[g_M](f) = t[g_N](f), \tag{29} \]
where $g_M$ and $g_N$ are the metrics on $M$ and $N$ respectively, so that $\psi^* g_M = g_N$.
t2 Scaling. The distributions $t = t_{j_1 \dots j_{n+1}}$ scale homogeneously up to logarithmic terms, in the sense that there is an $N \in \mathbb{N}$ such that
\[ \lambda^{-d}\, t[\lambda^{-2} g] = t[g] + \sum_{h=1}^{N-1} \frac{\ln^h \lambda}{h!}\, v_h[g], \tag{30} \]
where the $v_h$ are certain local and covariant distributions, and where $d = \sum_i j_i$.
t3 Microlocal Spectral Condition. $\mathrm{WF}(t)\big|_{\Delta_{n+1}} \perp T(\Delta_{n+1})$.

t4 Smoothness. Let $g^{(s)}$ be a smooth family of metrics on $M$, depending smoothly on a parameter, and view $t(s, x_1, \dots, x_{n+1}) = t[g^{(s)}](x_1, \dots, x_{n+1})$ as a distribution on $\mathbb{R} \times U_{n+1}$. Then we require that
\[ \mathrm{WF}(t)\big|_{\mathbb{R} \times \Delta_{n+1}} \subset \Big\{ (s, \rho; x, k_1; \dots; x, k_{n+1}) \ \Big|\ \sum_i k_i = 0, \ \text{not all } k_i = 0 \Big\}. \tag{31} \]

t5 Analyticity. If $g^{(s)}$ is an analytic family of real analytic metrics on $U_{n+1}$, then item t4 holds with the smooth wave front set replaced by the analytic wave front set.

Remarks. (1) If we apply the differential operator $\lambda \partial_\lambda = \frac{\partial}{\partial \ln \lambda}$ a total of $N$ times to both sides of Eq. (30), then we obtain
\[ (\lambda \partial_\lambda - d)^N\, t[\lambda^{-2} g] = 0. \tag{32} \]
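Moreover, if we apply $\lambda \partial_\lambda = \frac{\partial}{\partial \ln \lambda}$ only $h < N$ times to both sides of (30) and set $\lambda$ equal to one afterwards, we find that the local covariant distributions $v_h$ are given by
\[ v_h[g] = (\lambda \partial_\lambda - d)^h\, t[\lambda^{-2} g]\big|_{\lambda = 1}. \tag{33} \]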
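As an illustration of Eqs. (30), (32) and (33), the simplest nontrivial case $N = 2$ can be checked by hand. The sketch below assumes only Eq. (30) with a single logarithmic term; the shorthand $F(\lambda)$ for $t[\lambda^{-2} g]$ is introduced here for the example and is not notation from the text.

```latex
% Worked check of Eqs. (32) and (33) for N = 2 (F is illustrative shorthand).
% Eq. (30) with N = 2, solved for t[\lambda^{-2} g]:
\[
  F(\lambda) := t[\lambda^{-2} g]
  = \lambda^{d}\bigl( t[g] + \ln\lambda \; v_1[g] \bigr).
\]
% One application of \lambda\partial_\lambda - d removes the t[g] term:
\[
  (\lambda\partial_\lambda - d)\,F(\lambda) = \lambda^{d}\, v_1[g],
  \qquad\text{so}\qquad
  (\lambda\partial_\lambda - d)\,F(\lambda)\big|_{\lambda=1} = v_1[g],
\]
% which is Eq. (33) with h = 1. A second application gives Eq. (32) with N = 2:
\[
  (\lambda\partial_\lambda - d)^{2} F(\lambda)
  = (\lambda\partial_\lambda - d)\bigl(\lambda^{d}\, v_1[g]\bigr) = 0 .
\]
```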
In fact, Eq. (32) is actually equivalent to (30), as one can see by rewriting the differential operator $\lambda \partial_\lambda - d$ as $\lambda^{d}\, \frac{\partial}{\partial \ln \lambda}\, \lambda^{-d}$ and then integrating (32) $N$ times.

(2) In formulating conditions t3–t5, we have taken advantage of the fact that on $U_{n+1} \setminus \Delta_{n+1}$, we have $t = t^0$, so $t$ is already known to satisfy the wave front set conditions corresponding to T3–T5 on $U_{n+1} \setminus \Delta_{n+1}$. Consequently, we need only require $t$ to satisfy the desired wave front set conditions on $\Delta_{n+1}$. Similarly, conditions t1 and t2 also are already known to hold on $U_{n+1} \setminus \Delta_{n+1}$, so we need only check that $t$ satisfies these conditions in an arbitrarily small neighborhood of $\Delta_{n+1}$.

In summary, in this section we have reduced the problem of defining time ordered products to the following question: Assume that time ordered products involving $\le n$ factors have been constructed so as to satisfy our requirements T1–T9. Define $T^0$ by (19) and define the distributions $t^0$ on $U_{n+1} \setminus \Delta_{n+1}$ by (28). Can each $t^0$ be extended to a distribution $t$ on $U_{n+1}$ so as to satisfy requirements t1–t5?

4. Extension to the Total Diagonal

Thus far, our analysis of time ordered products corresponds closely to that given in [3]. The primary difference in our assumptions is that we have imposed the requirement that
time ordered products be local, covariant fields (see T1) and that they satisfy certain additional requirements concerning scaling behavior (see T2), and smooth and analytic dependence on the metric (see T4 and T5). This has resulted in some important differences in our analysis as compared with [3]. In particular, as a consequence of the locality/covariance requirement, the Wick expansion of [3] in terms of normal ordered products with respect to a quasifree Hadamard state is not useful, so instead we introduced a local, covariant Wick expansion in Subsect. 3.2. Nevertheless, all of the steps in the analysis given in Sect. 3 above are in close parallel with the analysis of [3].

As described at the end of Sect. 3, our analysis will be completed if we can extend the distributions $t^0$ to the total diagonal so that they satisfy properties t1–t5. As is well known from quantum field theory in Minkowski spacetime, straightforward attempts to extend $t^0$ to the total diagonal give rise to formal expressions that do not make sense as distributions. Therefore, one normally proceeds by introducing some means of “regularizing” these formal expressions and then extracting a well defined “finite part” (up to renormalization ambiguities). In Minkowski spacetime, most approaches to regularization/renormalization involve the use of Euclideanization and/or momentum space methods, neither of which has a natural generalization to (non-static) curved Lorentzian spacetimes. For this reason, the authors of [3] employed the regularization procedure of Epstein and Glaser, which is “local” in the sense that it uses coordinate space methods that can be defined in a local region. Nevertheless, the Epstein-Glaser method is not local in a strong enough sense for our purposes, since we need to ensure that the renormalized time ordered products will be local, covariant fields.
A key step in the Epstein-Glaser regularization procedure is the introduction of certain “cutoff functions” of compact support in the “relative coordinates” that equal 1 in a neighborhood of the total diagonal. Since the prescription for the extension of $t^0$ depends upon the spacetime geometry throughout the region where the cutoff functions are non-zero, the extension, $t$, at a point $p \in \Delta_{n+1}$ will not depend only on the metric in an arbitrarily small neighborhood of $p$ and, thus, will not depend locally and covariantly on the metric in the sense required by condition t1. There does not appear to be any straightforward way of modifying the Epstein-Glaser regularization procedure so that the resulting extension, $t$, will satisfy property t1. In particular, serious convergence difficulties arise if one attempts to shrink the support of the cutoff functions to the total diagonal. In addition, the cohomological argument of [17] also does not appear to admit a straightforward generalization to curved spacetime. Consequently, we shall proceed by a different route here.

Our approach to extending $t^0$ to the total diagonal is motivated by the idea (essentially the “equivalence principle”) that on sufficiently small scales a curved space “looks flat”, and that the divergences of $t^0$ in curved spacetime should be of the same nature as those of the corresponding $t^0$ in flat spacetime. However, this idea is not correct as just stated because a curved space is not actually flat (no matter on how small a scale one looks). Although it is true that the leading order divergences of $t^0$ will be essentially the same as in flat spacetime, in general there will be sub-leading-order divergences that are sensitive to the presence of curvature and are different from the divergences occurring for the corresponding $t^0$ in flat spacetime. Nevertheless, we will see in Subsect.
4.1 below that any local, covariant distribution that satisfies our scaling, smoothness, and analyticity conditions admits a “scaling expansion” about the total diagonal. This expansion expresses $t^0$ as a finite sum of terms plus a remainder term with the properties that (i) each term in the finite sum is a product of a curvature term times a distribution in the relative coordinates that corresponds to a Lorentz invariant distribution in Minkowski spacetime (which can be
extended to the total diagonal by Minkowski spacetime methods) and (ii) the remainder term admits a unique, natural extension to the diagonal by continuity. We shall thereby obtain an extension of $t^0$ in Subsect. 4.2. In Subsect. 4.3 we will then show that the resulting extension satisfies all of the required properties t1–t5.

4.1. The scaling expansion.

As indicated above, the key step that will enable us to extend $t^0$ is to perform a scaling expansion of it about the total diagonal. However, a priori it is not even clear what this means, since $t^0$ is a distribution in $n + 1$ variables, and it is not clear what it would mean to perform any kind of “expansion” of a distribution about the 4-dimensional submanifold $\Delta_{n+1}$. The first step in obtaining the scaling expansion for $t^0$ is to show that it is possible to fix one of its $n + 1$ variables at a value $x$, and view it as a distribution in the remaining $n$ variables, $y$, which play the same role as “relative coordinates” in Minkowski spacetime. In other words, writing
\[ x = x_1, \qquad y = (x_2, \dots, x_{n+1}), \tag{34} \]
we show that the (unextended) distribution $t^0$ possesses a well-defined restriction to the submanifold
\[ C_x = \{x\} \times \bigl(U^n \setminus (x, \dots, x)\bigr), \tag{35} \]
where $U$ is a convex normal neighborhood of the point $x \in M$. In Minkowski spacetime, this result would follow as an immediate consequence of translation invariance. In our context, this result follows from the microlocal spectral condition: Since property T3 is known to hold for $T^0$, it follows that the wave front set of $t^0$ is contained in the set $C_T$. As can be seen from the “energy-momentum conservation constraint” in Eq. (8), $C_T$ does not contain any elements of the form $(x, k; y, 0)$. Since the conormal bundle, $N^* C_x$, of the submanifold $C_x$ is spanned precisely by such covectors, $\mathrm{WF}(t^0)$ does not have any elements in common with $N^* C_x$. That the restriction exists is thus ensured by [14, Thm. 8.2.4]. This restriction may be identified with a distribution on $U^n \setminus (x, \dots, x)$.

We now shall obtain our scaling expansion of $t^0$. The basic idea is to expand $t^0[g](x, \cdot\,)$ at a fixed point $x$ in terms of the metric and its derivatives at $x$. The individual terms in the so-obtained series will be seen to be given by sums of products of local curvature terms at $x$ times Lorentz invariant distributions of the relative coordinates. The remainder for the suitably truncated series will not have this simple form, but will turn out to be regular enough to allow a unique extension.

To begin, we choose a convex normal neighborhood $U \subset M$ of $x$ and introduce Riemannian normal coordinates with respect to the metric $g$ around $x$. These coordinates are constructed by using the exponential map to identify $U$ with a subset of the tangent space $T_x M$ at $x$ and then identifying $T_x M$ with Minkowski spacetime, $\mathbb{R}^4$, by an isomorphism $e : T_x M \to \mathbb{R}^4$. Thus, the Riemannian normal coordinates of a point $\xi \in U$ are given by
\[ \alpha_x(\xi) = e \circ (\exp_x)^{-1}(\xi) \in \mathbb{R}^4. \tag{36} \]
However, when it is not likely to cause confusion, we will slightly abuse the notation by denoting all quantities – i.e., the point, its Riemannian normal coordinates, and the corresponding vector in Minkowski spacetime – simply by $\xi$. Similarly, the Riemannian normal coordinates of $y = (x_2, \dots, x_{n+1})$ (see Eq. (34) above) will be denoted $\alpha_x(y)$,
but when it is not likely to cause confusion, we also will use $y$ to denote the Riemannian normal coordinates of these points or the corresponding vector in $\mathbb{R}^{4n}$.

The choice of isomorphism $e : T_x M \to \mathbb{R}^4$ is equivalent to a choice of an orthonormal tetrad, $e^a_\mu$, at the point $x$. Since any other orthonormal tetrad is of the form $\Lambda^\nu_\mu e^a_\nu$ for some Lorentz transformation $\Lambda$, the Riemannian normal coordinates, $\xi$, of a given point corresponding to the transformed tetrad at $x$ are then given in terms of the original normal coordinates by $\Lambda \xi$. Similarly, the Riemannian normal coordinates, $y$, of a point in $U^n$ are obtained by Lorentz transforming the coordinates of each point individually by $\Lambda$, the result of which we shall denote as $\Lambda y$.

Now let $g^{(s)}$ be the smooth 1-parameter family of metrics on $U$ whose coordinate components in Riemannian normal coordinates around $x$ are given by
\[ g^{(s)}_{\mu\nu}(\xi) = g_{\mu\nu}(s\xi). \tag{37} \]
Note that if $\chi_s$ is the map from $U$ into itself given by $\xi \to s\xi$ in Riemannian normal coordinates about $x$, then this family of metrics can be alternatively written as
\[ g^{(s)} = s^{-2} \chi_s^* g. \tag{38} \]
Note also that the definition of the above family of metrics does not depend on any additional data besides the specification of the point $x$ and the metric itself. In particular, it does not depend on our choice of tetrad at $x$.

By a slight generalization of the microlocal argument given at the beginning of this subsection, it follows from the fact that $T^0$ satisfies properties T3 and T4 on $U^{n+1} \setminus \Delta_{n+1}$ that $t^0[g^{(s)}](x, \cdot\,)$ makes sense as a family of distributions on $U^n \setminus (x, \dots, x)$ that is parametrized by $(s, x)$. Furthermore, when smeared with a test function, $f$, in $y$, it follows that $t^0[g^{(s)}](x, f)$ is smooth in $(s, x)$. In addition, since differentiation does not increase the size of the wave front set, derivatives of $t^0$ with respect to $s$ also make sense as distributions on $U^n \setminus (x, \dots, x)$ that are parametrized by $(s, x)$. Hence, for any $k$ and any arbitrary, but fixed $x \in M$, we can define a distribution on $U^n \setminus (x, \dots, x)$ by
\[ \tau^0_k[g](x, \cdot\,) = \frac{d^k}{ds^k}\, t^0[g^{(s)}](x, \cdot\,)\Big|_{s=0}. \tag{39} \]
It follows that for any given natural number $m \ge 0$, we have the following Taylor expansion with remainder:
\[ t^0 = \sum_{k=0}^{m} \frac{1}{k!}\, \tau^0_k + r^0_m, \tag{40} \]
where
\[ r^0_m[g](x, \cdot\,) = \frac{1}{m!} \int_0^1 (1 - s)^m\, \frac{d^{m+1}}{ds^{m+1}}\, t^0[g^{(s)}](x, \cdot\,)\, ds. \tag{41} \]
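As a consistency check of (40) and (41), one can verify the case $m = 0$ and the step from $m - 1$ to $m$ directly. The sketch assumes only that $s \mapsto t^0[g^{(s)}](x, f)$ is smooth on $[0, 1]$ for each test function $f$ and that $g^{(1)} = g$, which follows from (37); the abbreviation $F(s)$ is introduced only for this example.

```latex
% Shorthand (not from the text): F(s) := t^0[g^{(s)}](x,f), so F(1) = t^0[g](x,f)
% and F^{(k)}(0) = \tau^0_k[g](x,f) by Eq. (39).
% Case m = 0 of Eq. (41): fundamental theorem of calculus.
\[
  r^0_0 = \int_0^1 F'(s)\,ds = F(1) - F(0) = t^0 - \tau^0_0 .
\]
% Step m-1 -> m (m >= 1): integrate Eq. (41) by parts once.
\[
  r^0_m = \frac{1}{m!}\int_0^1 (1-s)^m F^{(m+1)}(s)\,ds
        = -\frac{1}{m!}\,F^{(m)}(0)
          + \frac{1}{(m-1)!}\int_0^1 (1-s)^{m-1} F^{(m)}(s)\,ds
        = r^0_{m-1} - \frac{1}{m!}\,\tau^0_m .
\]
% Iterating from m = 0 reproduces Eq. (40):
% t^0 = \sum_{k=0}^{m} \tau^0_k / k! + r^0_m .
```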
Formula (40) is actually our desired scaling expansion of $t^0$. However, as it stands, (40) is merely an identity that would hold for any distribution in the variables $(s, x, y)$ that satisfies suitable wave front set conditions. The important properties of this formula for our distributions $t^0$ are stated in the following theorem.
Theorem 4.1. (i) $\tau^0_k(x, \cdot\,)$ and $r^0_m(x, \cdot\,)$ are distributions on $U^n \setminus (x, \dots, x)$ which are locally constructed from the metric in a covariant way in the sense that Eq. (29) holds for all diffeomorphisms that leave the point $x$ invariant.

(ii) We have the decomposition
\[ \tau^0_k(x, y) = \sum C(x) \cdot \alpha_x^* u^0(y), \tag{42} \]
where the sum is finite. Here, $C \equiv C^{\mu_1 \dots \mu_l}$ denote the coordinate components of certain curvature tensors in Riemannian normal coordinates about $x$ and the $u^0 \equiv u^0_{\mu_1 \dots \mu_l}$ are Lorentz-invariant tensor valued distributions defined on $\mathbb{R}^{4n}$ with the origin removed, that is,
\[ u^0_{\mu_1 \dots \mu_l}(\Lambda\, \cdot\,) = \Lambda^{\nu_1}_{\mu_1} \cdots \Lambda^{\nu_l}_{\mu_l}\, u^0_{\nu_1 \dots \nu_l}(\,\cdot\,) \tag{43} \]
for any Lorentz-transformation $\Lambda$. The local curvature tensors $C$ arise as a sum of monomials in $g_{ab}, R_{abcd}, \dots, \nabla_{(e_1} \cdots \nabla_{e_{k-2})} R_{abcd}$. In the case considered here with no dimensional parameters, each monomial contains precisely $k$ coordinate derivatives of the metric.

(iii) $\tau^0_k$ and $r^0_m$ scale almost homogeneously under rescalings of the metric with degree $d$.

(iv) The distributions $u^0$ scale almost homogeneously with degree $d - k$ under coordinate rescalings in the sense that there exists an $N \in \mathbb{N}$ such that
\[ S^N_{d-k}\, u^0 = 0, \tag{44} \]
where $S^N_\rho = \bigl( \sum_i \xi^\mu_i\, \partial/\partial \xi^\mu_i + \rho \bigr)^N$.

(v) The scaling degree of $r^0_m(x, \cdot\,)$ is less than or equal to $d - m - 1$, i.e., the distributions $\lambda^{d-m-1+\delta}\, r^0_m(x, \lambda\, \cdot\,)$, viewed as distributions on $\mathbb{R}^{4n} \setminus 0$ via the pull-back by $\alpha_x$, tend to the zero distribution as $\lambda \to 0$ for all $x$ and all $\delta > 0$.

Remark. We note that (iv) means that $u^0$ scales homogeneously with degree $d - k$ under a rescaling of the coordinates, up to logarithmic terms. Namely, simple integration of (44) gives that
\[ \lambda^{d-k}\, u^0(\lambda\, \cdot\,) = u^0(\,\cdot\,) + \sum_{h=1}^{N-1} \frac{\ln^h \lambda}{h!}\, S^h_{d-k}\, u^0(\,\cdot\,). \tag{45} \]
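To make the notion of almost homogeneous scaling concrete, here is a minimal example on $\mathbb{R}^4 \setminus 0$ (that is, $n = 1$). The distribution $\ln(\eta_{\mu\nu}\xi^\mu\xi^\nu + i0)$ is chosen here purely for illustration, and the abbreviation $\sigma_0$ is introduced only for this sketch.

```latex
% Illustrative shorthand: \sigma_0 := \eta_{\mu\nu}\xi^\mu\xi^\nu,
% so that \xi^\mu\partial_\mu\sigma_0 = 2\sigma_0.
% Take u^0 = \ln(\sigma_0 + i0) on R^4 \setminus 0 and \rho = 0,
% so S_0 = \xi^\mu\,\partial/\partial\xi^\mu.
\[
  S_0 u^0 = 2\sigma_0\,(\sigma_0 + i0)^{-1} = 2,
  \qquad
  S_0^2 u^0 = S_0\, 2 = 0 ,
\]
% i.e. Eq. (44) holds with N = 2 although u^0 is not homogeneous (S_0 u^0 \neq 0).
% Eq. (45) with d - k = 0 and N = 2 is then the law of the logarithm, for \lambda > 0:
\[
  u^0(\lambda\xi) = \ln\bigl(\lambda^2(\sigma_0 + i0)\bigr)
                  = u^0(\xi) + \ln\lambda \; S_0 u^0(\xi) .
\]
```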
This implies in particular that the scaling degree of $u^0$ at the origin is $d - k$.

Proof. Item (i) follows directly from the fact that $t^0$ is local and covariant on its domain of definition. To prove (ii), we consider, first, the case where all the components of $g$ in our Riemannian normal coordinates are polynomials in the Riemannian normal coordinates $\xi$ in a neighborhood of $x$. Then we may characterize $g$ by its components $g_{\mu\nu}$ at $x$ together with the components of the coordinate derivatives, $g_{\mu\nu,\sigma_1\sigma_2\dots}$, at $x$, only finitely many of which are nonzero. We may thus view $t^0$ as being a function of these quantities, and we express this by writing
\[ t^0[g](x, \cdot\,) = t^0\bigl[g_{\mu\nu}, \dots, g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_l}, \dots\bigr](x, \cdot\,). \tag{46} \]
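Now smear with a test function, $f$, on $U^n \setminus (x, \dots, x)$. Since $t^0[g](x, f)$ depends smoothly on the metric – and hence is a smooth function of the finite number of variables $g_{\mu\nu}(x), \dots, g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_l}(x), \dots$ on which it depends – we obtain
\[
\begin{aligned}
\tau^0_k[g](x, f) &= \partial_s^k\, t^0[g^{(s)}](x, f)\Big|_{s=0}
 = \partial_s^k\, t^0\bigl[g_{\mu\nu}, \dots, s^l g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_l}, \dots\bigr](x, f)\Big|_{s=0} \\
 &= k! \sum_{l_1 + 2l_2 + \dots + m l_m = k}
 \frac{\partial^{\,l_1 + \dots + l_m}\, t^0[\dots](x, f)}{\partial^{l_1} g_{\mu\nu,\sigma_1} \cdots \partial^{l_m} g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_m}}
 \prod_j \bigl( g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_j}(x) \bigr)^{l_j},
\end{aligned}
\tag{47}
\]
where $[\dots]$ stands for $[g_{\mu\nu}, 0, 0, \dots]$. We may rewrite this equation as
\[ \tau^0_k(x, y) = \sum C(x) \cdot \alpha_x^* u^0(y), \tag{48} \]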
where the $C \equiv C^{\mu_1 \dots \mu_l}$ are monomials in $g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_m}$, which have the property that the total number of derivatives of $g_{\mu\nu}$ appearing in each $C$ is equal to $k$, and where $u^0 \equiv u^0_{\mu_1 \dots \mu_l}$ are tensor-valued distributions on $\mathbb{R}^{4n}$ minus the origin, which are independent of $g$. [Footnote 6: Actually, in the above construction the distributions $u^0$ are automatically defined only in the neighborhood of the origin in $\mathbb{R}^{4n}$ (minus the origin itself) corresponding to the neighborhood, $U \subset M$, on which the Riemannian normal coordinates are defined. However, by modifying $g$ outside of a neighborhood of the origin if necessary, we may assume without loss of generality that the Riemannian normal coordinates are globally defined and that $u^0$ is defined everywhere on $\mathbb{R}^{4n} \setminus 0$.] Since we are working in Riemannian normal coordinates, the $m$th coordinate derivatives $g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_m}$ of the metric tensor at $x$ can be rewritten as the coordinate components at $x$ of a local curvature term that is polynomially constructed from the metric, the curvature tensor and its derivatives at $x$. Moreover, such a curvature term must involve precisely $m$ derivatives of the metric. Hence, by our formula (47), we conclude that $C^{\mu_1 \dots \mu_l}$ corresponds to a curvature term $C^{a_1 \dots a_l}$ which arises as a sum of monomials in $R_{abcd}, \dots, \nabla_{(e_1} \cdots \nabla_{e_{k-2})} R_{abcd}$, each of which contains precisely $k$ derivatives of the metric.

We would next like to show that the distributions $u^0$ are Lorentz invariant. For this, we consider the diffeomorphism $\psi_\Lambda$ on $U \subset M$ given by $\xi \to \Lambda \xi$, where $\Lambda$ is a Lorentz transformation, and where the point $\xi \in U$ has been identified with its Riemannian normal coordinates about $x$. By definition, this diffeomorphism will leave the point $x$ invariant, so we may apply item (i) to this diffeomorphism. From this, we get that
\[ C^{\mu_1 \dots \mu_l}(x)\, u^0_{\mu_1 \dots \mu_l}(\Lambda\, \cdot\,) = C^{\mu_1 \dots \mu_l}(x)\, \Lambda^{\nu_1}_{\mu_1} \cdots \Lambda^{\nu_l}_{\mu_l}\, u^0_{\nu_1 \dots \nu_l}(\,\cdot\,). \tag{49} \]
Since this holds for all metrics, this means that $u^0$ must be Lorentz invariant. This proves (ii) for all metrics whose components in Riemannian normal coordinates are polynomials in the Riemannian normal coordinates $\xi$ in a neighborhood of $x$.

Now consider an arbitrary smooth metric $g$. In a compact neighborhood, $K$, of $x$, we can, for each $n$, find a metric $q^{(n)}$ that is polynomial in $\xi$ and is such that everywhere in $K$ we have $\bigl| g_{\mu\nu,\sigma_1\sigma_2\dots\sigma_m} - q^{(n)}_{\mu\nu,\sigma_1\sigma_2\dots\sigma_m} \bigr| < 2^{-n}$ for all $m \le n$. Let $\psi : \mathbb{R} \to [0, 1]$ be
a smooth function with support in $[-1, 1]$ satisfying $\psi(-v) = \psi(v)$ and also satisfying $1 - \psi(v) = \psi(1 - v)$ for all $v \in [0, 1]$. Set $h^{(0)} = g$ and for $s \ne 0$ but in a sufficiently small neighborhood of 0 define
\[ h^{(s)} = \sum_n \psi(|1/s| - n)\, q^{(n)}. \tag{50} \]
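The role of the two conditions on $\psi$ in (50) can be made explicit by a short computation. The sketch below uses only the stated properties of $\psi$; the symbols $n_0$ and $v$ are introduced here for the example.

```latex
% Notation for this sketch: n_0 := \lfloor |1/s| \rfloor and v := |1/s| - n_0 \in [0,1).
% Since \psi is smooth with support in [-1,1], \psi(w) = 0 whenever |w| \ge 1,
% so only n = n_0 and n = n_0 + 1 can contribute to the sum in Eq. (50):
\[
  h^{(s)} = \psi(v)\, q^{(n_0)} + \psi(v - 1)\, q^{(n_0+1)} .
\]
% Using \psi(-w) = \psi(w) and 1 - \psi(v) = \psi(1 - v):
\[
  \psi(v) + \psi(v - 1) = \psi(v) + \psi(1 - v) = 1 ,
\]
% so the weights in Eq. (50) sum to one, and h^{(s)} interpolates between
% q^{(n_0)} and q^{(n_0+1)}, both of which approach g as s \to 0.
```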
(Note that at each $s$, there can be at most two terms in this sum which are nonvanishing.) Then it is straightforward to show that $h^{(s)}$ is a one parameter family of smooth metrics that depends smoothly on $s$. Consequently, each $\tau^0_k[h^{(s)}](x, \cdot\,)$ varies smoothly with $s$. However, we have already proven that Eq. (42) holds for all $s \ne 0$. By the smoothness property t4 applied to $t^0$, it follows that Eq. (42) continues to hold at $s = 0$, thus proving property (ii) for an arbitrary smooth metric $g$.

Property (iii) is a direct consequence of the fact that $t^0$ satisfies the scaling property t2. To see this, we note that
\[ (\lambda \partial_\lambda - d)^N\, \tau^0_k[\lambda^{-2} g] = \partial_s^k\, (\lambda \partial_\lambda - d)^N\, t^0[\lambda^{-2} g^{(s)}]\Big|_{s=0} = 0, \tag{51} \]
since $t^0$ satisfies t2. This establishes (iii) for $\tau^0_k$. That $r^0_m$ satisfies (iii) then follows immediately from Eq. (40).

To prove (iv), we note that (i) implies that
\[
\begin{aligned}
\tau^0_k[\lambda^{-2} g] &= \partial_s^k\, t^0[\lambda^{-2} g^{(s)}]\Big|_{s=0}
 = \partial_s^k\, t^0[(\lambda s)^{-2} \chi_s^* g]\Big|_{s=0} \\
 &= \chi^*_{\lambda^{-1}}\, \partial_s^k\, t^0[g^{(\lambda s)}]\Big|_{s=0}
 = \lambda^k\, \chi^*_{\lambda^{-1}}\, \tau^0_k[g].
\end{aligned}
\tag{52}
\]
By Eq. (51), the differential operator $(\lambda \partial_\lambda - d)^N$ annihilates the left side of Eq. (52). This implies that
\[ 0 = (\lambda \partial_\lambda - d)^N\, \lambda^k\, \chi^*_{\lambda^{-1}}\, \tau^0_k[g] = \lambda^k\, (\lambda \partial_\lambda - d + k)^N\, \chi^*_{\lambda^{-1}}\, \tau^0_k[g]. \tag{53} \]
Substituting the decomposition of $\tau^0_k$ into the expression on the right side, we obtain
\[ 0 = \sum C(x) \cdot (\lambda \partial_\lambda - d + k)^N\, u^0(\lambda\, \cdot\,) = \sum C(x) \cdot S^N_{d-k}\, u^0(\lambda\, \cdot\,). \tag{54} \]
Since this holds for arbitrary metrics $g$, it follows that $S^N_{d-k}\, u^0 = 0$, as we desired to show.

In order to establish the estimate on the scaling degree for $r^0_m$, item (v), we first use Eq. (41) along with the same arguments as in (iv) to write
\[ r^0_m(x, \lambda\, \cdot\,) = \frac{\lambda^{m+1}}{m!} \int_0^1 (1 - \mu)^m\, \partial_s^{m+1}\, t^0[\lambda^2 g^{(s)}](x, \cdot\,)\Big|_{s = \lambda\mu}\, d\mu. \tag{55} \]
Using the fact that $t^0$ satisfies property t2, we have (see Eqs. (30) and (33))
\[ r^0_m(x, \lambda\, \cdot\,) = \lambda^{m+1-d} \sum_{l=0}^{N-1} \ln^l \lambda\; \psi^0_l(\lambda, x, \cdot\,), \tag{56} \]
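where
\[ \psi^0_l(\lambda, x, \cdot\,) \stackrel{\mathrm{def}}{=} \frac{1}{l!\, m!} \int_0^1 (1 - \mu)^m\, \partial_s^{m+1}\, (v \partial_v - d)^l\, t^0[v^2 g^{(s)}](x, \cdot\,)\Big|_{s = \lambda\mu,\, v = 1}\, d\mu. \tag{57} \]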
If $f$ is a smooth test function on $U^n$ whose support does not contain the point $(x, \dots, x)$, then by wave front set arguments similar to those given above, it follows from the fact that $T^0$ satisfies conditions T3 and T4 that the quantities $\psi^0_l(\lambda, x, f)$ are smooth in $\lambda$ in a neighborhood of zero. This immediately implies (v). □

Remarks. (1) As stated in property (v) of the above theorem, if we carry the scaling expansion, Eq. (40), to higher order (i.e., larger $m$), the remainder term $r^0_m$ will have a lower scaling degree. However, it should be noted that the wave front set of $t^0$ is determined by the null geodesics of the curved spacetime metric $g$ whereas the wave front set of each $\tau^0_k$ is similarly determined by the null geodesics of the flat spacetime metric associated with the exponential map at $x$. Since the null geodesics of these two metrics do not, in general, coincide (with the exception of the null geodesics passing through $x$ itself), it is clear that $r^0_m$ remains fundamentally distributional in nature no matter how large $m$ is chosen. It also should be noted that it is not claimed in Thm. 4.1 that $r^0_m$ converges to zero in any sense (even for an analytic spacetime) as $m \to \infty$. Thus, Eq. (40) should be viewed only as a “scaling expansion” with the properties specified in Thm. 4.1, not as a convergent power series.

(2) If we combine Eqs. (40) and (42), we obtain an expansion of $t^0$ of the general form
\[ t^0(x, y) = \sum C(x) \cdot \alpha_x^* u^0(y) + r^0_m(x, y). \tag{58} \]
If the terms in the sum in (58) are ordered by the engineering dimension of the curvature terms, $C$, then the first term in the expansion has $C = 1$ and the corresponding distribution $u^0$ is the “scaling limit” at $x$ of the distributions $t^0$ in the sense of Fredenhagen and Haag [11]. The higher order terms in the expansion then give corrections to the scaling limit, organized in powers of the curvature tensor and its derivatives.
If dimensionful parameters are present in the theory, then the scaling expansion will be organized in terms of products of powers of the curvature and the dimensionful parameters. Our scaling expansion is also closely related to the “momentum space representation” of the Feynman propagator and its powers (see Remark (3) below) given in [5], since the Lorentz invariant distributions, $u^0$, on Minkowski spacetime occurring in our expansion can be given a momentum space representation.

(3) For the Feynman propagator and its powers, the scaling expansion can be explicitly calculated from known properties of the Hadamard expansion. We will illustrate this with two examples. The first example is the simplest nontrivial time ordered product, $T^0(\varphi(x)\varphi(y))$. Its Wick-expansion is given by
\[ T^0(\varphi(x)\varphi(y)) = \; :\varphi(x)\varphi(y):_H + H_F(x, y)\, \mathbb{1}, \tag{59} \]
where $H_F = H - i\Delta_{\mathrm{adv}}$ is the “local Feynman parametrix”, where $H$ is the Hadamard parametrix, Eq. (23), and $\Delta_{\mathrm{adv}}$ is the advanced Green’s function. Thus, the only nontrivial distribution $t^0$ occurring in this expansion is
\[ t^0(x, y) = H_F(x, y) = U(x, y)\, (\sigma + i0)^{-1} + V(x, y)\, \ln(\sigma + i0), \tag{60} \]
where $U$ and $V$ are as in the Hadamard parametrix (see Eq. (23)). The first few terms, $\tau^0_k$, in the scaling expansion for $t^0 = H_F$ are easily found from the expansions for $U$ and $V$ given in [7] and many other references. Modulo an overall constant, one finds
\[
\begin{aligned}
\tau^0_0(x, y) &= \bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr)^{-1}, \\
\tau^0_1(x, y) &= 0, \\
\tau^0_2(x, y) &= \tfrac{1}{12}\, R^{\sigma\rho}(x)\, \xi_\sigma \xi_\rho\, \bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr)^{-1} - \tfrac{1}{24}\, R(x)\, \ln\bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr),
\end{aligned}
\]
where, as above, $\xi^\mu$ denotes the Riemannian normal coordinates of $y$ relative to $x$. Thus, in this example, our scaling expansion corresponds to the usual short distance approximation to the singular part of the Feynman propagator (see, e.g., [6]).

Our second example is the time ordered product $T^0(\varphi^2(x)\varphi^2(y))$. Its Wick-expansion is given by
\[ T^0(\varphi^2(x)\varphi^2(y)) = \; :\varphi^2(x)\varphi^2(y):_H + 2 H_F(x, y)\, :\varphi(x)\varphi(y):_H + H_F(x, y)^2\, \mathbb{1}. \tag{61} \]
The only new $t^0$ arising in this expansion is the “fish graph”, $t^0 = H_F^2$, a solution to the renormalization of which was found by B. S. Kay [15] prior to the commencement of the present work and played a role in the development of the present work. As can be seen from the above expansion for $H_F$, the first few coefficients, $\tau^0_k$, for the fish graph are, modulo an overall constant,
\[
\begin{aligned}
\tau^0_0(x, y) &= \bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr)^{-2}, \\
\tau^0_1(x, y) &= 0, \\
\tau^0_2(x, y) &= \tfrac{1}{6}\, R^{\sigma\rho}(x)\, \xi_\sigma \xi_\rho\, \bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr)^{-2} - \tfrac{1}{12}\, R(x)\, \bigl( \eta_{\mu\nu} \xi^\mu \xi^\nu + i0 \bigr)^{-1} \ln\bigl( \eta_{\sigma\rho} \xi^\sigma \xi^\rho + i0 \bigr).
\end{aligned}
\]
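The almost homogeneous scaling of these coefficients can be checked term by term. The sketch below treats $\tau^0_2$ for the fish graph, taking $d = 4$ (an identification made here for the example, as the sum of the indices $j_1 = j_2 = 2$ of this coefficient) and $k = 2$; the abbreviations $\sigma_0$ and $w$ are introduced only for this sketch.

```latex
% Illustrative shorthand: \sigma_0 := \eta_{\mu\nu}\xi^\mu\xi^\nu,
% so that \xi^\mu\partial_\mu\sigma_0 = 2\sigma_0.
% Logarithmic term of \tau^0_2 for the fish graph:
%   w := (\sigma_0 + i0)^{-1}\ln(\sigma_0 + i0).
% Here d - k = 4 - 2 = 2, so the relevant operator is S_2 = \xi^\mu\partial_\mu + 2.
\[
  \xi^\mu\partial_\mu w
  = -2\,(\sigma_0 + i0)^{-1}\ln(\sigma_0 + i0) + 2\,(\sigma_0 + i0)^{-1}
  \quad\Longrightarrow\quad
  S_2 w = 2\,(\sigma_0 + i0)^{-1} \neq 0 ,
\]
\[
  S_2^2 w = 2\,(\xi^\mu\partial_\mu + 2)\,(\sigma_0 + i0)^{-1}
          = 2\,(-2 + 2)\,(\sigma_0 + i0)^{-1} = 0 .
\]
% The remaining term R^{\sigma\rho}\xi_\sigma\xi_\rho\,(\sigma_0 + i0)^{-2} is
% homogeneous of degree -2 in \xi, so S_2 annihilates it; hence Eq. (44)
% holds for the distributions u^0 appearing in this \tau^0_2 with N = 2.
```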
It is easily seen that in both examples, the distributions $\tau^0_k$ are local, covariant distributions of the form claimed in (ii) – i.e., they are sums of terms of the form $C(x) \cdot \alpha_x^* u^0(y)$ with $u^0$ a Lorentz-invariant Minkowski space distribution – and satisfy the scaling properties specified in Thm. 4.1.

(4) The above scaling expansion was carried out for the scalar distributions $t^0$. It is straightforward to check that it also holds for the extended distributions $t$ that will be defined in the next subsection. Much more generally, it should be possible to perform a similar scaling expansion for arbitrary local covariant fields that satisfy appropriate wave front set properties. This should yield a generalized operator product expansion in curved spacetime. We are currently investigating the properties of such an expansion.

4.2. Extension of $t^0[g]$.

Theorem 4.1 of the previous subsection provides the necessary machinery to achieve our goal of extending $t^0$ in such a way that properties t1–t5 are satisfied. The basic idea is simply to suitably extend each term in the scaling expansion, Eq. (40). Each $\tau^0_k$ in that equation is of the form (42) and hence can be extended to the total diagonal by extending the Minkowski spacetime distributions $u^0$ to the origin. This can be achieved by standard methods used in Minkowski spacetime. On the other hand, if $m$ is chosen sufficiently large, the remainder term $r^0_m$ will have sufficiently low scaling
degree that it can be extended to the total diagonal by continuity. The proof that the so-obtained extension $t$ satisfies properties t1–t5 will be given in the next subsection. The key result needed to extend each $\tau^0_k$ is the following:

Lemma 4.1. Let $u^0 \equiv u^0_{\mu_1 \dots \mu_l}(y)$ with $y = (\xi_1, \dots, \xi_n)$ be a Lorentz invariant tensor-valued distribution on $\mathbb{R}^{4n} \setminus 0$ which scales almost homogeneously with degree $\rho$ under coordinate rescalings, i.e.,
\[ S^N_\rho\, u^0 = 0 \quad \text{for some natural number } N, \tag{62} \]
where $S^N_\rho = \bigl( \sum_i \xi^\mu_i\, \partial/\partial \xi^\mu_i + \rho \bigr)^N$. Then $u^0$ has a Lorentz invariant extension, $u$, to a distribution on $\mathbb{R}^{4n}$ which also scales almost homogeneously with degree $\rho$ under rescalings of the coordinates.

Proof. We will first extend $u^0$ using the Epstein-Glaser prescription. This extension need not satisfy either the scaling or Lorentz invariance properties. However, we will show that the extension can be modified, if necessary, so as to scale almost homogeneously with degree $\rho$. [Footnote 7: For distributions with an exactly homogeneous scaling, this result has previously been obtained in [14, Thms. 3.2.3 and 3.2.4]. Thus, our theorem generalizes this result to the case of almost homogeneous scaling.] We will then show that the resulting extension can be further modified, if necessary, so as to be Lorentz invariant while retaining the almost homogeneous scaling with degree $\rho$.

Choose an arbitrary smooth function $w$ of compact support on $\mathbb{R}^{4n}$ which is equal to one in a neighborhood of the origin. For any test function $f \in \mathcal{D}(\mathbb{R}^{4n})$ we set
\[ (Wf)(y) = f(y) - w(y) \sum_{|\alpha| \le \rho - 4n} y^\alpha\, \partial_\alpha f(0)/\alpha!, \tag{63} \]
where we use the usual multi-index notation. It follows from $S^N_\rho u^0 = 0$ that $u^0$ has scaling degree $\rho$, so by [3, Thm. 5.3], we can define an extension, $u$, of $u^0$ to $\mathbb{R}^{4n}$ by setting
\[ u(f) = u^0(Wf). \tag{64} \]
It follows that the scaling degree of $u$ is $\rho$ [3, Thm. 5.3], but it need not hold that $u$ scales almost homogeneously with degree $\rho$, i.e., there is no guarantee that $S^M_\rho u = 0$ for some natural number $M$. However, one can calculate that
\[ W S^N_\rho f(y) - S^N_\rho W f(y) = \sum_{|\alpha| \le \rho - 4n} \psi^\alpha(y)\, \partial_\alpha f(0) \tag{65} \]
for some smooth functions $\psi^\alpha$ whose support does not contain the origin. From this it follows immediately that
\[ S^N_\rho\, u = \sum_{|\alpha| \le \rho - 4n} c_\alpha\, \partial_\alpha \delta, \tag{66} \]
where $c_\alpha = (-1)^{|\alpha|}\, u^0(\psi^\alpha)$.
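We now define a modified distribution $u'$ by
\[ u' = u - \sum_{|\alpha| \le \rho - 4n - 1} \frac{c_\alpha}{(\rho - 4n - |\alpha|)^N}\, \partial_\alpha \delta. \tag{67} \]
Using the fact that $S_\rho\, \partial_\alpha \delta = (\rho - 4n - |\alpha|)\, \partial_\alpha \delta$, we find
\[ S^N_\rho\, u' = \sum_{|\alpha| = \rho - 4n} c_\alpha\, \partial_\alpha \delta. \tag{68} \]
If we apply the operator $S_\rho$ to both sides of the above equation, then we get that $S^{N+1}_\rho u' = 0$, because
\[ S_\rho\, \partial_\alpha \delta = 0 \quad \text{for } |\alpha| = \rho - 4n. \tag{69} \]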
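The identity $S_\rho\, \partial_\alpha \delta = (\rho - 4n - |\alpha|)\, \partial_\alpha \delta$ used in (68) and (69) can be checked by pairing with a test function. The derivation below is a sketch in which $D = 4n$ denotes the dimension and $y^i$ the coordinates on $\mathbb{R}^{4n}$; these labels are introduced here for brevity.

```latex
% Notation for this sketch: D := 4n, y = (y^1,\dots,y^D),
% S_\rho = y^i\partial_i + \rho (sum over i understood).
% Pair with a test function f and move the derivatives onto f:
\[
  \bigl\langle y^i\partial_i\,\partial_\alpha\delta,\, f \bigr\rangle
  = -\bigl\langle \partial_\alpha\delta,\, \partial_i(y^i f) \bigr\rangle
  = -(-1)^{|\alpha|}\,\partial_\alpha\bigl[(D + y^i\partial_i) f\bigr](0) .
\]
% By the Leibniz rule, \partial_\alpha (y^i\partial_i f)(0) = |\alpha|\,\partial_\alpha f(0), hence
\[
  y^i\partial_i\,\partial_\alpha\delta = -(D + |\alpha|)\,\partial_\alpha\delta ,
  \qquad
  S_\rho\,\partial_\alpha\delta = (\rho - 4n - |\alpha|)\,\partial_\alpha\delta ,
\]
% which vanishes exactly when |\alpha| = \rho - 4n, as used in Eq. (69).
```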
This means that u is an extension of u0 with the desired almost homogeneous scaling. For notational simplicity, we will drop the “prime” in the following and denote this modified extension as u. We now investigate the Lorentz transformation properties of u. Restoring the tensor indiceson u, we find by a calculation similar to Eq. (66) above that for any test function f ∈ D R4n and any Lorentz transformation, 8, we have α uµ1 ...µl (f ) − 8νµ11 . . . 8νµll uν1 ...νl (R(8)f ) = bµ (8)∂α δ(f ), (70) 1 ...µl |α|≤ρ−4n
where $(R(\Lambda)f)(y) = f(\Lambda y)$ and the $b^\alpha_{\mu_1 \dots \mu_l}(\Lambda)$ are complex constants, which would vanish if and only if the distribution $u$ were Lorentz invariant. We now apply the differential operator $S_\rho^{N+1}$ to both sides of the above equation. Since $S_\rho$ is itself a Lorentz invariant operator, we have $R(\Lambda)S_\rho = S_\rho R(\Lambda)$. Therefore, since $S_\rho^{N+1} u = 0$, the operator $S_\rho^{N+1}$ annihilates the left side of Eq. (70), so we obtain
$$0 = S_\rho^{N+1} \sum_{|\alpha| \le \rho - 4n} b^\alpha_{\mu_1 \dots \mu_l}(\Lambda)\, \partial_\alpha \delta = \sum_{|\alpha| \le \rho - 4n} (\rho - 4n - |\alpha|)^{N+1}\, b^\alpha_{\mu_1 \dots \mu_l}(\Lambda)\, \partial_\alpha \delta. \tag{71}$$
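It follows immediately that $b^\alpha_{\mu_1 \dots \mu_l}(\Lambda) = 0$, except possibly when $|\alpha| = \rho - 4n$. Thus, we have
$$u_{\mu_1 \dots \mu_l}(f) - \Lambda^{\nu_1}_{\mu_1} \cdots \Lambda^{\nu_l}_{\mu_l}\, u_{\nu_1 \dots \nu_l}(R(\Lambda)f) = b^{\nu_1 \dots \nu_{\rho-4n}}_{\mu_1 \dots \mu_l}(\Lambda)\, \partial_{\nu_1} \cdots \partial_{\nu_{\rho-4n}} \delta(f) \tag{72}$$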
for all $f$ and all Lorentz-transformations $\Lambda$. Using this equation, one finds the following transformation property for $b(\Lambda)$,
$$b(\Lambda_1 \Lambda_2) = b(\Lambda_1) + D(\Lambda_1)\, b(\Lambda_2), \tag{73}$$
where we have now dropped the tensor-indices and where $D$ denotes the tensor representation of the Lorentz-group on $(\otimes^l \mathbb{R}^4)^* \otimes (\otimes^{\rho-4n} \mathbb{R}^4)$. It then follows by the cohomological argument given in [17] that this relation implies that $b$ can be written in the form
$$b(\Lambda) = a - D(\Lambda)a \quad \forall \Lambda, \tag{74}$$
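where $a$ is an element in $(\otimes^l \mathbb{R}^4)^* \otimes (\otimes^{\rho-4n} \mathbb{R}^4)$, not depending on $\Lambda$. This enables us to define the modified extension
$$u'_{\mu_1 \dots \mu_l} = u_{\mu_1 \dots \mu_l} - a^{\nu_1 \dots \nu_{\rho-4n}}_{\mu_1 \dots \mu_l}\, \partial_{\nu_1} \cdots \partial_{\nu_{\rho-4n}} \delta, \tag{75}$$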
where we have now restored the tensor indices. It is easily checked that $u'$ is Lorentz invariant and satisfies $S_\rho^{N+1} u' = 0$. We have therefore accomplished the goal of constructing the desired extension of $u^0$. □

Some analyticity properties of $u$ and its Fourier transform that follow from its scaling behaviour are established in Appendix B. These results, however, will not be needed in our present analysis. We now can give our prescription for extending $t^0$. Let $d$ denote the scaling degree of $t^0$, let $m = d - 4n$, and consider the expansion Eq. (40). By Theorem 4.1, each $\tau^0_k$ appearing in this expansion takes the form
$$\tau^0_k(x, y) = \sum C(x) \cdot \alpha_x^* u^0(y), \tag{76}$$
where the sum is finite. We extend $\tau^0_k$ to a distribution $\tau_k$ on $U^n$ by choosing an extension, $u$, of each $u^0$ that satisfies the properties of Lemma 4.1 and defining
$$\tau_k(x, y) = \sum C(x) \cdot \alpha_x^* u(y). \tag{77}$$
Although $\tau_k$ has been constructed as a distribution in $y$ that is parametrized by $x$, it is straightforward to check that $\tau_k$ may also be viewed as a distribution jointly in $x$ and $y$. On the other hand, we know by property (v) of Theorem 4.1 that the scaling degree of $r^0_m$ is less than or equal to $4n - 1$. Therefore we can apply [3, Thm. 5.2] to conclude that $r^0_m(x, \cdot)$ has a unique extension, $r_m(x, \cdot)$, to all of $U^n$ with the same scaling degree for any given point $x$. This extension is given by
$$r_m(f) = \lim_{j \to \infty} r^0_m\big(\vartheta^{(j)} f\big), \tag{78}$$
where $\vartheta^{(j)}$ is a sequence of smooth functions with support in $U^{n+1} \setminus \Delta_{n+1}$, which are identically one outside neighborhoods $U^{(j)}_{n+1}$ of $\Delta_{n+1}$, with $U^{(j)}_{n+1}$ shrinking to $\Delta_{n+1}$ as $j$ goes to infinity. By the scaling properties of $r^0_m$, this limit exists in the weak sense, and is independent of the particular choice of cutoff functions $\vartheta^{(j)}$ (see [3, Thm. 5.2]). Again, it can be shown that this extension defines a distribution jointly in $x$ and $y$. Our extension, $t$, is then defined by
$$t = \sum_{k=0}^{m} \frac{1}{k!}\, \tau_k + r_m. \tag{79}$$
Our remaining task is to show that $t$ satisfies properties t1–t5.
4.3. Proof that t satisfies properties t1–t5.

As we now shall show, it is relatively straightforward to prove that the extension, $t$, of $t^0$ defined by Eq. (79) above satisfies properties t1 and t2. To show that t1 holds, we note that the prescription for extending $\tau^0_k$ clearly is local in the appropriate sense. However, it is not immediately obvious that the prescription yields a covariant extension $\tau_k$ in the sense required by t1, since the prescription involves $\alpha_x$, whose definition requires, in addition to the metric, a choice of a tetrad $e^a_\mu$ at $x$. However, since any other tetrad at $x$ is related by a Lorentz-transformation, it follows immediately from the Lorentz invariance of the extensions $u$ in (77) that different choices of tetrad lead to the same distribution $\tau_k$. It follows that each $\tau_k$ is locally constructed from the metric in a covariant way in the sense required by t1. In order to see that $r_m$ is local and covariant in the sense of t1, it is sufficient to show that $r_m[\psi^* g]$ is equal to $\psi^* r_m[g]$ for any diffeomorphism $\psi$ on $U$. We already know that this is true off the total diagonal $\Delta_{n+1}$, as the unextended distribution $r^0_m$ has this property. Thus, the difference between the two expressions must be a distribution supported on the total diagonal. Moreover, the scaling degree of this distribution must be less than $4n - 1$, by our choice $m = d - 4n$. It is well known that there are no such distributions apart from the zero distribution (essentially because the delta function and its derivatives have scaling degree $\ge 4n$). Therefore the difference must in fact be zero, showing that $r_m$ satisfies t1. Since all terms on the right side of Eq. (79) satisfy t1, it follows that $t$ satisfies this property. To establish t2, we first show that the extensions $\tau_k[g]$ have an almost homogeneous scaling under rescalings of the metric in the sense of t2. To see this, we consider a term $C \cdot \alpha_x^* u$ in the expansion (77).
By Theorem 4.1, the curvature term $C$ will scale as $\lambda^{-k}$ under a rescaling of the metric by $\lambda^2$. On the other hand, for the term $\alpha_x^* u$, since $\alpha_x$ is just the inverse of the exponential map at $x$, a rescaling of the metric will correspond precisely to a coordinate rescaling by a factor of $\lambda$ in the distributions $u$. By Lemma 4.1, these distributions scale like $\lambda^{k-d}$ up to logarithmic corrections under such a coordinate rescaling. Therefore, each individual term in formula (77) for $\tau_k$ has an almost homogeneous scaling with degree $d$ under rescalings of the metric. On the other hand, the almost homogeneous scaling of $r_m$ under a rescaling of the metric can be proven by an argument similar to the proof that $r_m$ is local and covariant. Consequently, we see that $t$ satisfies property t2. It also is relatively straightforward to prove that each $\tau_k$ occurring in Eq. (79) satisfies properties t3–t5. We know that $\tau_k$ is a finite sum of terms of the form $C(x) \cdot \alpha_x^* u(y)$, with $C(x)$ a polynomial in the curvature and its derivatives. Since $C(x)$ is smooth in $x$, we have
$$\mathrm{WF}(C \cdot \alpha^* u) \subset \Big\{ \Big( x, \sum_i \Big[\frac{\partial \alpha_x}{\partial x}\Big]^t k_i;\ \xi_1, \Big[\frac{\partial \alpha_x}{\partial \xi_1}\Big]^t k_1;\ \dots;\ \xi_n, \Big[\frac{\partial \alpha_x}{\partial \xi_n}\Big]^t k_n \Big) \ \Big|\ (\alpha_x(\xi_1), k_1; \dots; \alpha_x(\xi_n), k_n) \in \mathrm{WF}(u) \Big\}. \tag{80}$$
Here, we have written $y = (\xi_1, \dots, \xi_n)$ and each $\xi_i$ denotes a point in a convex normal neighborhood of $x$, and not the Riemannian normal coordinates of that point. (This makes a difference here, since we are considering variations in $x$.) In Eq. (80), $\partial \alpha_x / \partial \xi_i$ denotes the matrix of partial derivatives of $\alpha_x$ with respect to $\xi_i$ at fixed $x$, and $\partial \alpha_x / \partial x$ denotes the matrix of partial derivatives of $\alpha_x(\xi_i)$ with respect to $x$ at fixed $\xi_i$. However, at $\xi_i = x$ we have $\partial \alpha_x(\xi_i)/\partial \xi_i = -\partial \alpha_x(\xi_i)/\partial x$, since moving $\xi_i$ infinitesimally away from $\xi_i = x$ has the same effect on $\alpha_x(\xi_i)$ as moving $x$ infinitesimally by the same
amount in the opposite direction. It follows that if $(x, k_1; \dots; x, k_{n+1}) \in \mathrm{WF}(C \cdot \alpha^* u)$, then $\sum_i k_i = 0$. This means precisely that $\mathrm{WF}(C \cdot \alpha^* u)|_{\Delta_{n+1}} \perp T(\Delta_{n+1})$, i.e., the microlocal spectral condition, t3, is satisfied. Similarly, by using the fact that $C(x)$ is a polynomial in the curvature and $\alpha_x$ is the inverse of the exponential map – so that both $C(x)$ and $\alpha_x$ have appropriate smooth and analytic dependence on the metric – together with the fact that $u$ is independent of the metric, we find that the smoothness (t4) and analyticity (t5) conditions are satisfied by $\tau_k$. Thus, we would be done if our expression (79) for $t$ corresponded to a suitably convergent power series. However, as already noted in Remark (1) at the end of Subsect. 4.1, this is not the case, i.e., the remainder term, $r_m$, in Eq. (79) is not expected to converge to zero in any sense useful for our purposes as $m \to \infty$. Therefore, in order to prove that $t$ satisfies properties t3–t5, it is necessary to explicitly analyze the remainder term $r_m$. This is technically quite cumbersome, since essentially the only useful thing that is known about $r_m$ is that it is the extension to $\Delta_{n+1}$ defined by Eq. (78) of the expression $r^0_m$ given by Eqs. (56) and (57). Equation (78) expresses $r_m$ as a weak limit of distributions whose wave front set properties are known, but wave front set properties are not preserved under weak convergence, so we must show that the sequence (78) converges in a suitably strong sense to enable us to prove that $r_m$ satisfies properties t3–t5. This will be accomplished in the proof of the following proposition, which – as will be explained in the remark following the statement of the proposition – will complete the proof that the extensions $t$ satisfy the properties t3–t5.
Proposition 4.1. Let $g^{(s)}$ be a smooth one-parameter family of smooth metrics, and let $r_m(s, x_1, \dots, x_n)$ denote the remainder term in Eq. (79), viewed as a distribution on $\mathbb{R} \times U^n$. (Here $m = d - 4n$, where $d$ is the scaling dimension of $t$.) Then the wave front set of $r_m$ satisfies
$$\mathrm{WF}(r_m)|_{\mathbb{R} \times \Delta_n} \perp T(\mathbb{R} \times \Delta_n), \tag{81}$$
where the notation “⊥” was introduced below Eq. (16). Similarly, if $g^{(s)}$ is an analytic one-parameter family of analytic metrics, then (81) holds for the analytic wave front set.

Remark. If we choose $g^{(s)} = g$ for all $s$, the above proposition implies that $r_m$ satisfies the microlocal spectral condition t3. The proposition also implies that $r_m$ satisfies the smoothness and analyticity conditions, t4 and t5. In fact, the proposition asserts a somewhat stronger version of these conditions, as it shows that the wave front set of $r_m(s, x_1, \dots, x_n)$ not only cannot contain any points of the form $(s, \rho; x, k_1; \dots; x, k_n)$ with $\sum_i k_i \neq 0$ but it also cannot contain any such points with $\rho \neq 0$. Since each $\tau_k$ has already been shown above to satisfy t3–t5, it follows that $t$ satisfies t3–t5 if $r_m$ does. Thus, our construction of time ordered products satisfying properties T1–T9 of Sect. 2 will be completed once we have completed the proof of this proposition.

Proof. We will give the proof only for the analytic case; the proof for the smooth case is similar, though somewhat simpler because the estimates needed to establish the wave front set properties are simpler in nature in the smooth case. As before, we proceed by induction in the number, $n$, of variables $(x_1, \dots, x_n)$ on which $r_m$ depends. We inductively assume that the analytic wave front set version of Eq. (81) holds for all $r_m$ that depend on $n$ or fewer variables. By a slight generalization of the proof given above that $\tau_k$ satisfies t3–t5, it can be shown that if $g^{(s)}$ is an analytic one-parameter family of analytic metrics then each $\tau_k$ also satisfies Eq. (81). Consequently our inductive hypothesis implies that
$t(s, x_1, \dots, x_n) \equiv t[g^{(s)}](x_1, \dots, x_n)$ satisfies
$$\mathrm{WF}_A(t)|_{\mathbb{R} \times \Delta_n} \perp T(\mathbb{R} \times \Delta_n). \tag{82}$$
From the distributional coefficients $t$ depending on $n$ or fewer spacetime arguments and the real parameter $s$ we obtain, by our inductive constructions, the distributional coefficients $t^0(s, x_1, \dots, x_{n+1}) \equiv t^0[g^{(s)}](x_1, \dots, x_{n+1})$ depending on $n + 1$ spacetime arguments and the real parameter $s$. These distributions are defined everywhere in $\mathbb{R} \times U^{n+1}$, except for ($\mathbb{R}$ times) the total diagonal, $\Delta_{n+1}$. Their analytic wave front set $\mathrm{WF}_A(t^0)$ is therefore a subset of $T^*\big(\mathbb{R} \times U^{n+1} \setminus \Delta_{n+1}\big)$. By essentially the same arguments as given in [3, Sect. 7] (modulo a straightforward modification of those arguments with regard to the additional parameter $s$ on which the $t^0$ depend), the distribution $t^0(s, x_1, \dots, x_{n+1})$ can be expressed as a finite sum of terms of the form
$$t\big(s, \{x_i\}_{i \in I}\big)\, t\big(s, \{x_j\}_{j \in I^c}\big) \prod_{i \in I,\, j \in I^c} H_F(s, x_i, x_j)^{a_{ij}} \tag{83}$$
on each of the open sets $C_I$ introduced in Eq. (18). Here, the $a_{ij}$ are certain natural numbers, $I$ is a nonempty proper subset of the set $\{1, \dots, n + 1\}$ and $I^c$ is its complement. (Note that since $I$ is a nonempty proper subset, the expression (83) only involves the distributional coefficients $t$ depending on $n$ or fewer spacetime arguments.) Finally, $H_F(s, x_1, x_2) \equiv H_F[g^{(s)}](x_1, x_2)$ is the local Feynman parametrix introduced below Eq. (59), for our analytic 1-parameter family of metrics. It can be seen by an explicit calculation that
$$\mathrm{WF}_A(H_F)|_{\mathbb{R} \times \Delta_2} \perp T(\mathbb{R} \times \Delta_2). \tag{84}$$
For each of the sets $I$ described above, let us define a projection map $\pi_I$ from $\mathbb{R} \times U^{n+1}$ to $\mathbb{R} \times U^{|I|}$ (with $|I|$ the number of elements in $I$) by
$$\pi_I : (s, x_1, \dots, x_{n+1}) \to (s, \{x_i\}_{i \in I}). \tag{85}$$
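Using the rules for calculating the analytic wave front set of products of distributions [14], we find from Eq. (83) that the analytic wave front set of $t^0$ restricted to the open sets $\mathbb{R} \times C_I$ is estimated by
$$\mathrm{WF}_A(t^0)|_{\mathbb{R} \times C_I} \subset \pi_I^*\big(\mathrm{WF}_A(t) \cup \{0\}\big) + \pi_{I^c}^*\big(\mathrm{WF}_A(t) \cup \{0\}\big) + \sum_{i \in I,\, j \in I^c} a_{ij}\, \pi_{\{i,j\}}^*\big(\mathrm{WF}_A(H_F) \cup \{0\}\big) \subset T^*\big(\mathbb{R} \times U^{n+1} \setminus \Delta_{n+1}\big). \tag{86}$$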
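If we now take the closure in $T^*(\mathbb{R} \times U^{n+1})$ of the sets on both sides of the above relation (we denote this closure by an overbar), take the union over all $I$, and use Eqs. (82) and (84), then we obtain
$$\overline{\mathrm{WF}_A(t^0)}\,\big|_{\mathbb{R} \times \Delta_{n+1}} \perp T(\mathbb{R} \times \Delta_{n+1}). \tag{87}$$
From the properties of $\tau_k$, it then follows that $r^0_m(s, x_1, \dots, x_{n+1})$ also satisfies the same condition, i.e.,
$$\overline{\mathrm{WF}_A(r^0_m)}\,\big|_{\mathbb{R} \times \Delta_{n+1}} \perp T(\mathbb{R} \times \Delta_{n+1}). \tag{88}$$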
Note that Eq. (88) imposes a nontrivial restriction (beyond what we already know) on the wave front set of $r^0_m$. Our aim is to show that (88) continues to hold for the extension, $r_m$. In order to simplify the discussion, we will show here only the weaker result that, for a fixed metric $g$, $r_m(x_1, \dots, x_{n+1})$ satisfies
$$\mathrm{WF}_A(r_m)|_{\Delta_{n+1}} \perp T(\Delta_{n+1}). \tag{89}$$
However, the arguments can be generalized straightforwardly to prove (81) in $n + 1$ variables for an analytic one-parameter family of analytic metrics $g^{(s)}$. As above, we choose relative coordinates $(x, y)$ around the total diagonal. We will identify the point $x \in M$ with its coordinates in some analytic chart, and we will identify $y = (\xi_1, \dots, \xi_n)$ with its Riemannian normal coordinates relative to $x$, so that the diagonal corresponds to $y = 0$. With this choice of coordinates, we identify $t^0$, and likewise $r^0_m$, with a distribution defined on $X \times (Y \setminus 0)$, where $X$ is an open set in $\mathbb{R}^4$ and $Y$ is an open neighborhood of the origin in $\mathbb{R}^{4n}$. Let $x_0$ be some fixed point in $X$. It is possible to construct a sequence of smooth functions of the form $\phi_N(x, y) = \phi'_N(x)\, \phi''_N(y)$, where $\phi'_N \in C_0^\infty(K')$ is 1 in a neighborhood of $x_0$, such that $\phi''_N$ vanishes in a neighborhood of 0 and is 1 outside some larger neighborhood $K''$, and where $\phi_N$ satisfies the estimate
$$\big|\partial^{\alpha+\beta} \phi_N\big| \le C_\alpha^{|\beta|+1} (N + 1)^{|\beta|} \quad \forall\, |\beta| \le N = 1, 2, \dots. \tag{90}$$
If $f$ is a test function with support sufficiently close to $(x_0, 0)$, then the extension $r_m$ is defined by Eq. (78). For our purposes, it is convenient to make the choice $\vartheta^{(j)} = (\phi_N)_{2^j}$, where the subscript $2^j$ means the pull-back by the function $(x, y) \to (x, 2^j y)$. In order to show $\mathrm{WF}_A(r_m) \perp T(\Delta_{n+1})$, we must demonstrate that $(x_0, k_0, 0, p_0)$ is in the complement of $\mathrm{WF}_A(r_m)$ whenever $k_0 \neq 0$. It is not difficult to see that this will follow if we can show that it is possible to choose $K = K' \times K''$ so small that
$$\Big|\widehat{(\theta_N)_{2^j}\, r^0_m}(k, p)\Big| \le 2^{-j/2}\, C^{N+1} \big((N + 1)/(|k| + |p|)\big)^N \tag{91}$$
for all $(k, p)$ in some conic neighborhood $F$ of $(k_0, p_0)$ and for all natural numbers $N$ and $j$. Here, $\theta_N \in C_0^\infty(K)$ is the cutoff function defined by $\phi_N(x, y) - \phi_N(x, 2y)$. Note that the support of $\theta_N$ does not intersect the submanifold $X \times \{0\}$, and that the sequence of cutoff functions $\theta_N$ is again bounded in $\mathcal{E}'(K)$ and satisfies the estimate (90).
In order to analyze the Fourier transform on the left side of (91), we observe that
$$\widehat{(\theta_N)_{2^j}\, r^0_m}(k, p) = 2^{-4nj}\, \widehat{\theta_N\, (r^0_m)_{2^{-j}}}\big(k, 2^{-j} p\big), \tag{92}$$
where $(r^0_m)_{2^{-j}}$ denotes the pull-back of the distribution $r^0_m$ by the map $(x, y) \to (x, 2^{-j} y)$, and where the factor $2^{-4nj}$ is due to the fact that $r^0_m$ transforms as a density. Recalling our choice $m = d - 4n$, we can write the quantity on the right side of this equation as
$$2^{-j} \sum_l (j \ln 2)^l\, \widehat{\theta_N \psi^0_l}\big(2^{-j}, k, 2^{-j} p\big), \tag{93}$$
where the $\psi^0_l \in \mathcal{D}'(\mathbb{R} \times X \times (Y \setminus 0))$ were defined in Eq. (56), and where the Fourier transform is with respect to the variables $x$ and $y$.
We now claim that for any closed conic set $K$ in $\mathbb{R} \times \mathbb{R}^4 \times \mathbb{R}^{4n}$ not containing elements of the form $(0, 0, p)$ there is a neighborhood $K_0 \subset \mathbb{R} \times X \times Y$ of $(0, x_0, 0)$ such that for all $l$
$$\mathrm{WF}_A(\psi^0_l) \cap (K_0 \times K) = \emptyset. \tag{94}$$
To prove this, we decompose $\psi^0_l$ into simpler pieces, whose analytic wave front set is either known by the induction process or can be determined by elementary means. For this, we shall define a family of analytic metrics depending analytically on parameters $s \equiv (v, \mu, x) \in P_1 \times P_2 \times P_3 \equiv P$, where $P_1$ is a small neighborhood of 1 in $\mathbb{R}$, $P_2$ is a small neighborhood of 0 in $\mathbb{R}$ and where $P_3$ is a convex normal neighborhood in $M$ with respect to $g$. In order to define this family, let $\chi_{x,\mu}$ be the diffeomorphism which shrinks the Riemannian normal coordinates $\xi$ of a spacetime point about the point $x \in P_3$ by a factor of $\mu$. In terms of this family of diffeomorphisms, our family of metrics is given by
$$g^{(s)} = (v\mu)^{-2}\, \chi_{x,\mu}^*\, g. \tag{95}$$
This is a real analytic family of analytic metrics (but with $s$ now ranging over the 6-dimensional parameter space, $P$, rather than over $\mathbb{R}$). We have already established that the analytic wave front set of the distribution $t^0$ on $P \times U^{n+1} \setminus \Delta_{n+1}$ satisfies (87) (with $\mathbb{R}$ replaced by $P$). In order to relate $\psi^0_l$ to $t^0$, we let $R^{(m)}$ be the map from test functions on $\mathbb{R}$ to smooth functions on $\mathbb{R}$ given by
$$\big(R^{(m)} f\big)(\lambda) = \frac{1}{m!} \int_0^1 (1 - \mu)^m f^{(m+1)}(\lambda\mu)\, d\mu. \tag{96}$$
Furthermore, we set
$$D^{(l)} = \frac{1}{l!}\, (v\, \partial/\partial v + d)^l. \tag{97}$$
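The operator $R^{(m)}$ of Eq. (96) is, up to a power of $\lambda$, the integral form of the Taylor remainder. The identity below is the standard one and is given only to make that connection explicit; how it enters the definition of $\psi^0_l$ in Eq. (56) is not reproduced here.

```latex
% Taylor's theorem with integral remainder, for smooth f on R:
\[
  f(\lambda) = \sum_{k=0}^{m} \frac{\lambda^k}{k!}\, f^{(k)}(0)
             + \frac{1}{m!} \int_0^{\lambda} (\lambda - t)^m f^{(m+1)}(t)\, dt .
\]
% Substituting t = \lambda\mu, dt = \lambda\, d\mu, (\lambda - t)^m = \lambda^m (1-\mu)^m:
\[
  f(\lambda) = \sum_{k=0}^{m} \frac{\lambda^k}{k!}\, f^{(k)}(0)
             + \lambda^{m+1}\, \big(R^{(m)} f\big)(\lambda),
  \qquad
  \big(R^{(m)} f\big)(\lambda)
    = \frac{1}{m!} \int_0^1 (1-\mu)^m f^{(m+1)}(\lambda\mu)\, d\mu .
\]
% Sanity check for m = 0:
%   f(0) + \lambda \int_0^1 f'(\lambda\mu)\, d\mu = f(0) + (f(\lambda) - f(0)) = f(\lambda).
```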
Note that if $f$ is a smooth function on $\mathbb{R}$ with compact support, then we have $\mathrm{supp}\big({}^tR^{(m)} f\big) \subset \mathrm{supp}(f)$, where ${}^tR^{(m)}$ denotes the transpose of $R^{(m)}$. Thus ${}^tR^{(m)}$ has proper support. It now follows straightforwardly from the definition of $\psi^0_l$ that we can rewrite the action of the distributions $\psi^0_l$ on test functions $f \in C_0^\infty\big(\mathbb{R} \times U^{n+1} \setminus \Delta_{n+1}\big)$ as
$$\psi^0_l(f) = \big(f^* t^0\big)\Big({}^tD_v^{(l)}\, \delta(\,\cdot - 1) \otimes {}^tR_\mu^{(m)} \otimes 1_{x_1 \dots x_{n+1}}\, f\Big), \tag{98}$$
where the subscripts on the operators indicate on which of the variables $(v, \mu, x_1, \dots, x_{n+1})$ they act, and where $f^* t^0$ denotes the pull back of $t^0$ by the analytic map
$$f : (v, \mu, x_1, \dots, x_{n+1}) \to (v, \mu, x = x_1, x_1, \dots, x_{n+1}) \in P \times U^{n+1}. \tag{99}$$
The analytic wave front set of ψl0 can now be estimated from Eq. (98) using our knowledge of the analytic wave front set of t 0 , Eq. (87), together with the rules for calculating the wave front set of a distribution under composition with distribution kernels [14, Thm. 8.5.5], and under pull-back by analytic maps [14, Thm. 8.5.1]. For this, we only need to know the following additional facts: (i) The action of an analytic partial differential operator, such as D (l) , does not enlarge the analytic wave front set and (ii) the analytic wave front set of the distribution kernel of R (m) (viewed as a bidistribution on R × R)
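does not contain any elements of the form $(\lambda, 0; \mu, \rho)$. The first statement is proven in [14, Thm. 8.4.7], and the second statement can be checked directly. This information suffices to conclude from Eq. (98) that
$$\overline{\mathrm{WF}_A(\psi^0_l)}\,\big|_{\mathbb{R} \times \Delta_{n+1}} \perp T(\mathbb{R} \times \Delta_{n+1}). \tag{100}$$
If $\mathbb{R} \times U^{n+1} \setminus \Delta_{n+1}$ is identified with a subset of $\mathbb{R} \times X \times (Y \setminus 0)$ via the above choice of coordinates, then this means that
$$\overline{\mathrm{WF}_A(\psi^0_l)}\,\big|_{\mathbb{R} \times X \times \{0\}} \perp T(\mathbb{R} \times X \times \{0\}). \tag{101}$$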
As a consequence, the open set $T^*(\mathbb{R} \times X \times Y) \setminus \overline{\mathrm{WF}_A(\psi^0_l)}$ contains a set of the form $K_0 \times K$ as claimed in Eq. (94), provided that $K_0$ is chosen to be sufficiently sharply concentrated about the point $(0, x_0, 0)$. We now blow up the sequence of cutoff functions $\theta_N \in C_0^\infty(K)$ to a bounded sequence of cutoff functions in $C_0^\infty(K_0)$ which still satisfy the inequality (90), and which we shall denote by the same symbol. It then follows that with these cutoff functions,
$$\Big|\widetilde{\theta_N \psi^0_l}(\rho, k, p)\Big| \le C^{N+1} \big((N + 1)/(|\rho| + |k| + |p|)\big)^N \tag{102}$$
for all $(\rho, k, p) \in K$, provided the support $K_0$ of $\theta_N$ is sufficiently sharply concentrated near $(0, x_0, 0)$, where the tilde denotes the Fourier transform in $\mathbb{R} \times \mathbb{R}^4 \times \mathbb{R}^{4n}$. With our new choice for $\theta_N$, we can write (93) as
$$\widehat{(\theta_N)_{2^j}\, r^0_m}(k, p) = (2\pi)^{-1/2}\, 2^{-j} \sum_l (j \ln 2)^l \int_{\mathbb{R}} \widetilde{\theta_N \psi^0_l}\big(\rho, k, 2^{-j} p\big)\, e^{-i 2^{-j} \rho}\, d\rho, \tag{103}$$
where the sum is finite. Now the cone $K$ can be chosen such that $(\rho, k, 2^{-j} p) \in K$ for all points $(k, p)$ in the cone $F$, all $\rho$ and all $j$. Thus we can use (102) to estimate
$$\Big|\widehat{(\theta_N)_{2^j}\, r^0_m}(k, p)\Big| \le 2^{-j/2}\, C^N (N/|k|)^{N-1}. \tag{104}$$
For $(k, p)$ in the cone $F$ it holds that $|k| > M|p|$ for some $M > 0$. This enables us to estimate the above expression further by
$$\le 2^{-j/2}\, C^N \big(N/(|k| + |p|)\big)^{N-1} \tag{105}$$
for all $(k, p)$ in $F$ and all natural numbers $N$ and $j$. This is what we wanted to show. □
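The passage from the bound (104) to the bound (105) uses only the cone condition $|k| > M|p|$. A short sketch of the arithmetic follows, in which $C'$ is a hypothetical name for the enlarged constant; the text keeps the symbol $C$ for both.

```latex
% On the cone F we have |p| < |k|/M, hence
\[
  |k| + |p| < \Big(1 + \frac{1}{M}\Big)|k|
  \quad\Longrightarrow\quad
  \frac{1}{|k|} < \frac{M+1}{M}\cdot\frac{1}{|k| + |p|} .
\]
% Raising to the power N-1 and inserting into (104):
\[
  2^{-j/2}\, C^N \Big(\frac{N}{|k|}\Big)^{N-1}
  \le 2^{-j/2}\, C^N \Big(\frac{M+1}{M}\Big)^{N-1}
      \Big(\frac{N}{|k| + |p|}\Big)^{N-1}
  \le 2^{-j/2}\, (C')^N \Big(\frac{N}{|k| + |p|}\Big)^{N-1},
  \qquad C' = C\,\frac{M+1}{M} .
\]
```

The last step only uses $(M+1)/M > 1$, so the extra factor can be absorbed into the $N$-th power of the constant without changing the form of the estimate.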
Remark. The distributions $t$ have now been shown to have an analytic dependence on the metric in the sense of condition t5. This makes it possible to establish certain analyticity properties of the distributions $u$ in the scaling expansion for the $t$, as we will now show. We know that if $(s, \rho; x_1, k_1; \dots; x_{n+1}, k_{n+1})$ is an element in $\mathrm{WF}_A(t)$, then the element $(x_1, k_1; \dots; x_{n+1}, k_{n+1})$ must necessarily be in the set $C_T^{(s)}$, given by Eq. (8). By our scaling expansion, we know that the distributions $u$ are given in terms of $s$ derivatives of $t$ (evaluated at $s = 0$), so we can calculate $\mathrm{WF}_A(u)$ from $\mathrm{WF}_A(t)$ by the rules for the
analytic wave front set under restriction and differentiation. It is straightforward to see that this gives
$$\mathrm{WF}_A(u) \subset \Big\{ (\xi_1, k_1; \dots; \xi_n, k_n) \in T^*\mathbb{R}^{4n} \setminus \{0\} \ \Big|\ \exists \text{ decorated graph } G(p) \text{ in } (\mathbb{R}^4, \eta) \text{ with vertices } 0, \xi_1, \dots, \xi_n \text{ such that } k_i = \sum_{e : s(e) = i} p_e - \sum_{e : t(e) = i} p_e \ \ \forall i \Big\}, \tag{106}$$
where we use the graph-theoretical notation introduced in T3, and where $\eta$ is the Minkowski metric.

5. Conclusions and Outlook

In this paper, we have given a construction of local, covariant time ordered products of an arbitrary number of local Wick powers. These local time ordered products were shown to satisfy properties T1–T9 of Sect. 2. They therefore fulfill the assumptions of the uniqueness theorem of our previous paper [12, Thm. 5.2]. Consequently, for any given polynomial order in the free field, any other prescription for defining local time ordered products with the same properties will differ from the prescription given in the present paper by products of local curvature terms and lower order time ordered products, as specified precisely in our uniqueness theorem. Although in this paper we considered only a massless Klein-Gordon scalar field, our results can be generalized straightforwardly to allow mass, and we do not anticipate any difficulties in generalizing our results to fields with higher spin. Largely for notational simplicity, we also restricted consideration to time ordered products of Wick powers that do not contain derivatives of the field, but it should be straightforward to generalize our construction to allow Wick powers of differentiated fields (subject only to the caveat of footnote 2). An important tool in our analysis was the scaling expansion introduced in Subsect. 4.1 for the distributions $t$ appearing in the local Wick expansion of time ordered products. In essence, this scaling expansion gives corrections to the “scaling limit” of [11], organized in powers of the curvature (and the dimensionful parameters, if any are present). The scaling expansion generalizes to arbitrary $t$ in the local Wick expansion the usual “short distance expansion” for the Feynman propagator (see Remark (3) at the end of Subsect. 4.1).
Although we restricted consideration here to the distributions $t$, a similar scaling expansion will exist for any local, covariant field which satisfies appropriate wave front set and scaling properties. The properties of the general scaling expansion for local covariant fields are currently under investigation. The results of this paper essentially complete the analysis of the existence, uniqueness, and renormalizability of the perturbative expansion of nonlinear quantum fields (with polynomial self-interaction) in curved spacetime. It is natural to ask whether an “exact” formulation of nonlinear quantum field theory in curved spacetime can be given. The Wightman axioms and other similar systems cannot be straightforwardly generalized to curved spacetime on account of their essential usage of Poincaré invariance and the existence of a preferred vacuum state. We are currently investigating the possibility that the notion of a local, covariant quantum field (together with suitable microlocal spectral
conditions, etc.) may enable one to give a useful formulation of axiomatic quantum field theory in curved spacetime. Acknowledgements. We would like to thank K. Fredenhagen and B. S. Kay for helpful discussions. In particular Kay’s observation that the leading order divergence of the “fish graph” (see Remark (3) at the end of Subsect. 4.1) could be renormalized by Minkowski spacetime methods was useful to us with regard to our development of the scaling expansion for general time ordered products. We thank Kay for making the results of his unpublished work [15] available to us at an early stage of this work. This work was supported in part by NSF grant PHY00-90138 to the University of Chicago.
A. Smooth and Analytic Variation of Distributions

A key requirement that we impose on our definition of Wick polynomials and their time ordered products is that they have appropriate smooth/analytic dependence on the spacetime metric. The purpose of this appendix is to elucidate the notion of smooth and analytic variation of distributions. To begin, let $X$ be a smooth manifold and for each $s \in \mathbb{R}$ let $u^{(s)} : X \to \mathbb{C}$ be a smooth (i.e., $C^\infty$) function. It is useful to view $u^{(s)}$ as a map $u : \mathbb{R} \times X \to \mathbb{C}$. We say that $u^{(s)}$ varies smoothly with $s$ if the map $u$ is smooth. Note that this requirement of (joint) smoothness of the map $u$ is stronger than the possible alternative requirement that $u^{(s)}(x)$ be a smooth function of $s$ for each fixed $x \in X$. This latter notion of (separate) smoothness in $s$ would not be a natural one in the context of this paper for the following reason: We consider one-parameter families of spacetimes $(M, g^{(s)})$ and there is no natural way of identifying spacetimes with different values of $s$. However, the notion of separate smoothness is not invariant under diffeomorphisms $\psi^{(s)} : X \to X$ that are (jointly) smooth in $(s, x)$.

Now, for each $s \in \mathbb{R}$ let $u^{(s)} \in \mathcal{D}'(X)$, i.e., $u^{(s)}$ is a distribution on $X$. We wish to define a notion of smooth variation of $u^{(s)}$ with $s$ that corresponds to the notion of (joint) smoothness of functions as defined in the previous paragraph. To do so, it is useful to view $u^{(s)}$ as a distribution, $u$, on $\mathbb{R} \times X$. The basic idea of our definition is to require $u$ to be “not any more singular than each $u^{(s)}$”. One possible way of implementing this notion would be to demand that the wave front set of $u$ be contained in the wave front set of $u^{(s)}$ in the sense that $\mathrm{WF}(u) \subset \big\{ (s, \rho; x, k) \mid \rho = 0,\ (x, k) \in \mathrm{WF}(u^{(s)}) \big\}$. However, this definition is unsatisfactory for the following two independent reasons.
First, the requirement that $\rho = 0$ is too strong in that it would, in particular, require the singularities of $u^{(s)}$ to “remain in a fixed location in $X$” as $s$ is varied. This would not be invariant under a one-parameter family of diffeomorphisms $\psi^{(s)} : X \to X$ that are (jointly) smooth in $(s, x)$. It should be noted that the distributions, $u^{(s)}$, of interest in this paper have singularities on the light cones of $g^{(s)}$ and, hence, their singularities cannot “remain in a fixed location” for non-conformal variations of $g$. Consequently, we shall not require $\rho = 0$ in our definition. Second, if $u^{(s)}$ happens to be “less singular than normal” for some value of $s$, then under the above proposed definition, $u$ would fail to vary smoothly with $s$ even if, in a naive sense, its variation with $s$ was perfectly smooth. For example, $u^{(s)}(x) = s\,\delta(x)$ would fail to be smooth at $s = 0$ because $\mathrm{WF}(u^{(0)}) = \emptyset$ but $\mathrm{WF}(u)$ includes points with $s = 0$. For this reason, we will define a more general notion of smoothness with respect to an arbitrary specified family of cones $C^{(s)}$. (Here, a cone $C$ is a subset of $T^*X \setminus \{0\}$ having the property that if $(x, k) \in C$, then $(x, \lambda k) \in C$ for all $\lambda > 0$.) For the definition to be nontrivial, we must choose $C^{(s)}$ so that $\mathrm{WF}(u^{(s)}) \subset C^{(s)}$, but we need not choose $C^{(s)} = \mathrm{WF}(u^{(s)})$.
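The example $u^{(s)}(x) = s\,\delta(x)$ can be made explicit. The following sketch takes $X = \mathbb{R}^m$ (an assumption made only for concreteness) and uses two standard facts: the wave front set of a tensor product $1 \otimes \delta$, and the fact that multiplication by a smooth function does not enlarge the wave front set.

```latex
% Joint distribution on R x R^m:  u = s \cdot (1 \otimes \delta).
\[
  \mathrm{WF}(1 \otimes \delta) = \{ (s, 0;\, 0, k) \mid s \in \mathbb{R},\ k \neq 0 \}
  \quad\Longrightarrow\quad
  \mathrm{WF}(u) \subset \{ (s, 0;\, 0, k) \mid s \in \mathbb{R},\ k \neq 0 \}.
\]
% u is not smooth near any point (s, 0) with s \neq 0, and the singular support
% is closed, so (0, 0) lies in the singular support as well:
\[
  \operatorname{sing\,supp} u = \mathbb{R} \times \{0\}
  \quad\Longrightarrow\quad
  \mathrm{WF}(u) \text{ contains points over } s = 0,
  \quad\text{although}\quad \mathrm{WF}\big(u^{(0)}\big) = \emptyset .
\]
% With the s-independent cones C^{(s)} = \{ (0, k) \mid k \neq 0 \} one has instead
\[
  \mathrm{WF}(u) \subset \{ (s, \rho;\, x, k) \mid (x, k) \in C^{(s)} \}.
\]
```

So the naive criterion based on $\mathrm{WF}(u^{(s)})$ rejects this family at $s = 0$, while a criterion phrased in terms of a prescribed family of cones, as in the definition that follows, accepts it.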
Definition A.1. Let $u^{(s)}$ be a one-parameter family of distributions on a manifold $X$ and let $C^{(s)}$ be a family of cones. We say that $u^{(s)}$ varies smoothly with $s$ with respect to $C^{(s)}$ if the wave front set of the corresponding distribution $u$ on $\mathbb{R} \times X$ satisfies
$$\mathrm{WF}(u) \subset \big\{ (s, \rho; x, k) \in T^*(\mathbb{R} \times X) \setminus \{0\} \ \big|\ (x, k) \in C^{(s)} \big\}. \tag{107}$$
Remarks. (1) To illustrate the meaning of the above definition, let us consider the two extreme cases, namely (a) when the cones are trivial, $C^{(s)} = \emptyset$, and (b) when the cones are maximal, $C^{(s)} = T^*X \setminus \{0\}$. In the first case (a) we immediately get that $\mathrm{WF}(u) = \emptyset$, so $u$ is smooth jointly in $(s, x)$. In the second case (b), it might appear that our smoothness condition is in fact empty. However, this is not the case, since Eq. (107) implies that no element of the form $(s, \rho; x, 0)$ can be in $\mathrm{WF}(u)$. Thus, for example, if $u = v \otimes \phi$ with $v$ a distribution on $\mathbb{R}$ and $\phi$ a distribution on $X$, not depending on $s$, Eq. (107) requires that $v$ is smooth.

(2) Let $u_1^{(s)}$ and $u_2^{(s)}$ be two families of distributions which are smooth with respect to cones $C_1^{(s)}$ respectively $C_2^{(s)}$. Then the rules for calculating the wave front set of a sum of two distributions give that the family $u_1^{(s)} + u_2^{(s)}$ is smooth with respect to the cones $C_1^{(s)} \cup C_2^{(s)}$. Likewise, if $\{0\} \notin C_1^{(s)} + C_2^{(s)}$, for each $s$, then the product $u_1^{(s)} u_2^{(s)}$ can be defined for each $s$ and defines a distribution jointly in $(s, x)$. Moreover, the rules for calculating the wave front set of the product of two distributions give that the product family is smooth with respect to the cones $C_1^{(s)} + C_2^{(s)}$.

The above definition allows us to define the notion of the smooth variation of a one-parameter family, $\omega^{(s)}$, of continuous states on the algebras $\mathcal{W}(M, g^{(s)})$ of the spacetimes $(M, g^{(s)})$: We say that $\omega^{(s)}$ varies smoothly with $s$ if each of the n-point functions $\omega_n^{(s)}$ of $\omega^{(s)}$ – viewed as a distribution on $M^n$ – varies smoothly with $s$ in the sense of Definition A.1 with $C^{(s)} = \mathrm{WF}(\omega_n^{(s)})$.
A one-parameter family of fields $\Phi^{(s)} : \mathcal{D}(M^n) \to \mathcal{W}(M, g^{(s)})$ in $n$ variables will then be said to vary smoothly with $s$ with respect to the cones $C^{(s)} \subset T^*M^n \setminus \{0\}$, if the corresponding distributions $\omega^{(s)}(\Phi^{(s)})$ vary smoothly with $s$ with respect to $C^{(s)}$ for all smooth one-parameter families of states $\omega^{(s)}$. Since continuous states on $\mathcal{W}(M, g)$ are precisely the Hadamard states whose truncated n-point functions are smooth for $n \neq 2$ [13], it follows that $\Phi^{(s)}$ will be smooth if and only if $\omega^{(s)}(\Phi^{(s)})$ is smooth for all smoothly varying families of Hadamard states $\omega^{(s)}$ with smooth truncated n-point functions ($n \neq 2$). This is the requirement that we have adopted in condition T4 of Sect. 2. In [12], a different notion of “continuous variation” of $\Phi^{(s)}$ was introduced in the case where $\Phi^{(s)}$ is local and covariant. Our uniqueness theorems for Wick polynomials and their time ordered products used the hypothesis that they vary continuously under smooth variations of the metric. It can be shown that our requirement of smooth variation introduced here implies continuous variation in the sense of [12], so the uniqueness results of [12] continue to hold under this replacement (as can also be verified more straightforwardly by simply repeating the proofs with the new hypothesis). It also can be shown that the construction of local, covariant Wick polynomials given in [12] satisfies our new smoothness requirement with $C^{(s)} = \emptyset$.

In order to define the notion of analytic variation of a one-parameter family of distributions, $u^{(s)}$, on a real analytic manifold, $M$, we first recall the definition of the analytic wave front set. To begin, let $u$ be a function on $\mathbb{R}^m$ which is real analytic in a neighborhood of a point $x_0$ in $\mathbb{R}^m$. Then it follows from Cauchy’s integral formula, or rather its generalization to $\mathbb{C}^m$, that
$$|\partial^\alpha u| \le C^{|\alpha|+1} (|\alpha| + 1)^{|\alpha|} \quad \forall \alpha \tag{108}$$
in a neighborhood of $x_0$, where C is some constant and multi-index notation has been used. Conversely, if the above estimate holds for a function u in a neighborhood of $x_0$, then u is real analytic in that neighborhood. Condition (108) can be formulated equivalently in terms of Fourier transforms. Namely, one can show that an estimate of the form (108) holds if and only if there is a sequence $u_N$ of compactly supported distributions equal to u in some open ball around $x_0$, which is bounded in the space $\mathcal{E}'(\mathbb{R}^m)$ of distributions of compact support, and which satisfies
\[ |\hat{u}_N(k)| \le C^{N+1} \bigl( (N+1)/|k| \bigr)^N \quad \forall N \in \mathbb{N}. \tag{109} \]
This motivates the following definition. Let u be a distribution on $X \subset \mathbb{R}^m$. The analytic wave front set $\mathrm{WF}_A(u)$ is defined to be the complement of the set of all points $(x_0, k_0)$ in $X \times (\mathbb{R}^m \setminus 0)$ such that there is an open neighborhood U of $x_0$, a conic neighborhood K of $k_0$ and a bounded sequence $u_N \in \mathcal{E}'(X)$ which is equal to u on U and satisfies (109) whenever k ∈ K. It is clear from the definition that u is given by a real analytic function in the neighborhood of points $x_0$ such that $\mathrm{WF}_A(u)$ contains no element of the form $(x_0, k_0)$. If f : X → Y is an analytic one-to-one map, then the analytic wave front set of the pullback $f^*u$ is given by $\{(x, df^t(x)k) \mid (f(x), k) \in \mathrm{WF}_A(u)\}$. This makes it possible, via localization in analytic charts, to define in an invariant way the analytic wave front set of a distribution on a real analytic manifold X.

In practice it is useful that $u_N$ can always be obtained as a product of u and suitable cutoff functions, see [14, Lem. 8.4.4]: Let $\mathrm{WF}_A(u) \cap (K \times F) = \emptyset$ for some compact subset K of X and some closed cone F, and let $\chi_N$ be a sequence of cutoff functions in $C_0^\infty(K)$ such that for all α
\[ |\partial^{\alpha+\beta} \chi_N| \le C_\alpha^{|\beta|+1} (N+1)^{|\beta|} \quad \forall |\beta| \le N = 1, 2, \dots. \tag{110} \]
Then $u_N = \chi_N u$ is bounded in $\mathcal{E}'(X)$ and satisfies (109) for all k ∈ F.

A one-parameter family of distributions, $u^{(s)}$, on a real analytic manifold, X, will be said to vary analytically with s with respect to the cones $C^{(s)}$ if Eq. (107) holds with WF replaced by $\mathrm{WF}_A$ everywhere in that equation. The notions of analytic variations of states and fields can then be defined in complete parallel with the definition of smooth variation given above. This agrees with the notions previously introduced in [12].

B. Properties of the Distributions u in the Scaling Expansion

In the following proposition, we list some general properties which hold for any almost homogeneous distribution on $\mathbb{R}^m$. The distributions u in our scaling expansion are particular examples of such distributions, and therefore the proposition applies to them. In particular, combining the upper bound (106) on the analytic wave front set with item (iii) in the proposition below, one can obtain detailed information about the analytic wave front set of the Fourier transforms $\hat{u}$ of the distributions u in the scaling expansion. This information suffices to establish that $\hat{u}$ is in fact an analytic function in a large portion of momentum-space, and that it is given by the boundary value of an analytic function for almost all momentum configurations (see [14, Thm. 8.4.15] for the appropriate criteria when a distribution can be written as the boundary value of an analytic function).
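Proposition B.1. Let $u \in \mathcal{D}'(\mathbb{R}^m)$ be an almost homogeneous distribution of degree ρ, i.e., $S_\rho^N u = 0$ for some $N \in \mathbb{N}$, where $S_\rho^N = (y^i \partial/\partial y^i + \rho)^N$. Then

(i) $\mathrm{WF}_A(u) \subset \{(y, k) \in T^*\mathbb{R}^m \setminus \{0\} \mid y^i k_i = 0\}$.

(ii) u can be extended to test functions in Schwartz space and thereby defines a tempered distribution.

(iii) $\hat{u}$ is again an almost homogeneous distribution, with degree m − ρ. Furthermore, we have
\[
\begin{aligned}
(x, k) \in \mathrm{WF}_A(u) &\iff (k, -x) \in \mathrm{WF}_A(\hat{u}) && \text{if } x \ne 0,\ k \ne 0, \\
x \in \operatorname{supp}(u) &\iff (0, -x) \in \mathrm{WF}_A(\hat{u}) && \text{if } x \ne 0, \\
k \in \operatorname{supp}(\hat{u}) &\iff (0, k) \in \mathrm{WF}_A(u) && \text{if } k \ne 0.
\end{aligned}
\tag{111}
\]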
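As an illustration of the definition used in Proposition B.1, the following minimal sketch checks numerically a one-variable example of an almost homogeneous function. It assumes the model function $u(y) = y^{-\rho} \ln y$ on y > 0 (chosen here for illustration, not taken from the text), for which $S_\rho u = y^{-\rho} \neq 0$ while $S_\rho^2 u = 0$. The helper names `euler_operator` and `almost_homogeneous` are hypothetical, and derivatives are approximated by central differences.

```python
import math

def euler_operator(f, rho, h=1e-4):
    """Return S_rho f, where (S_rho f)(y) = y f'(y) + rho f(y) in one variable.

    The derivative f' is approximated by a central difference of step h.
    """
    def g(y):
        derivative = (f(y + h) - f(y - h)) / (2.0 * h)
        return y * derivative + rho * f(y)
    return g

def almost_homogeneous(rho):
    """Illustrative model: u(y) = y^(-rho) * ln(y) on y > 0."""
    return lambda y: y ** (-rho) * math.log(y)

rho = 1.5
u = almost_homogeneous(rho)
S_u = euler_operator(u, rho)      # should agree with y^(-rho), which is not zero
S2_u = euler_operator(S_u, rho)   # should vanish up to discretization error

for y in (1.0, 2.0, 4.0):
    print(y, S_u(y), y ** (-rho), S2_u(y))
```

In this example N = 2 in the sense of the proposition: one application of $S_\rho$ removes the logarithm, and the second annihilates the remaining exactly homogeneous term.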
Proof. Since $S_\rho^N u = 0$, and since $S_\rho^N$ has analytic coefficients, we have by [14, Thm. 8.6.1] that
\[ \mathrm{WF}_A(u) \subset \operatorname{Char}(S_\rho^N) = \{(y, k) \in T^*\mathbb{R}^m \setminus \{0\} \mid y^i k_i = 0\}, \tag{112} \]
where Char(P) is the characteristic set of a differential operator P, defined as the set of all $(y, k) \in T^*\mathbb{R}^m \setminus \{0\}$ such that p(y, k) = 0, where p is the principal symbol of P. This proves (i).

Let $\chi_+$ and $\chi_-$ be smooth functions on R with the property that $\chi_+ + \chi_- = 1$, and such that $\chi_-(r) = 0$ for $r \le r_0$ and $\chi_-(r) = 1$ for $r \ge 2r_0$ for some $r_0 > 0$. We can therefore write
\[ u(y) = \chi_+(|y|)u(y) + \chi_-(|y|)u(y). \tag{113} \]
The first distribution on the right side of this equation is by definition of compact support, and therefore trivially a tempered distribution. Thus (ii) will follow if we can show that also the second distribution on the right side is tempered. In order to prove this, we first show that it is possible to write $\chi_- u$ in “polar coordinates”. For this, we note that $\mathrm{WF}(u) \cap N^*S^{m-1} = \emptyset$ by (i), which implies by [14, Thm. 8.2.4] that u has a well defined pull-back, v, to the unit sphere, $S^{m-1}$, in $\mathbb{R}^m$. It follows from this, and the almost homogeneous scaling of u, that it is possible to write
\[ u(\chi_- f) = \sum_{j=0}^{N-1} \int_0^\infty \int_{S^{m-1}} c_j\, \chi_-(r)\, r^{-\rho+m-1} \ln^j r \; v(\hat{y})\, f(r\hat{y}) \, dr\, d\mu(\hat{y}), \tag{114} \]
where $(r, \hat{y})$ denote polar coordinates in $\mathbb{R}^m$, dµ is the standard measure on $S^{m-1}$ and the $c_j$ are certain complex constants. Since v is a distribution on $S^{m-1}$, there must exist differential operators $P_1, \dots, P_k$ on $S^{m-1}$ such that
\[ |v(h)| \le C \sum_{l \le k} \sup_{\hat{y} \in S^{m-1}} |P_l h(\hat{y})| \tag{115} \]
for all test functions $h \in D(S^{m-1})$. Moreover $\chi_-(r) r^{-\rho+m-1} \ln^j r$ is a smooth function on R which grows polynomially together with all its derivatives at infinity, and therefore is a tempered distribution. Combining these facts with Eq. (114), we easily get the
estimate
\[ |u(\chi_- f)| \le C \sum_{|\alpha| \le a,\, |\beta| \le b} \sup_{y \in \mathbb{R}^m} |y^\alpha \partial^\beta f(y)| \tag{116} \]
for some a, b ∈ N and all f in Schwartz space, thus showing that $\chi_- u$ is tempered.

We come to the proof of (iii). That the Fourier transform of u scales almost homogeneously with degree m − ρ follows directly from our definition. For the case of a distribution u that scales exactly homogeneously with degree ρ, the remaining three relations in Eq. (111) correspond precisely to [14, Thm. 8.4.18]. The proof given in [14, Thm. 8.4.18] of the second and third relations in (111) can be applied without modification to distributions with almost homogeneous scaling. We therefore only have to prove the first relation in (111). Since the Fourier transform $\hat{u}$ is again a tempered distribution which scales homogeneously up to logarithmic terms, the problem is symmetric and it therefore suffices to show that
\[ (y_0, k_0) \notin \mathrm{WF}_A(u) \;\Rightarrow\; (k_0, -y_0) \notin \mathrm{WF}_A(\hat{u}) \tag{117} \]
if $y_0 \ne 0$, $k_0 \ne 0$. Choose compact neighborhoods K and $\hat{K}$ in $\mathbb{R}^m \setminus 0$ of $y_0$ and $k_0$ such that
\[ \mathrm{WF}_A(u) \cap (K \times \hat{K}) = \emptyset, \tag{118} \]
and a sequence of cutoff functions $\chi_N \in C_0^\infty(\hat{K})$ such that (110) is valid for every α. We now estimate the Fourier transform of $v_N = \chi_N \hat{u}$ in a conic neighborhood of $y_0$. By Fourier’s inversion formula and the convolution theorem, we have
\[ \hat{v}_N(-\lambda y) = \int_{\mathbb{R}^m} u(x)\, \hat{\chi}_N(x - \lambda y) \, d^m x. \tag{119} \]
We now estimate expression (119) for $|y - y_0| < r$ and arbitrary λ. For this, we consider two cases, first 0 < λ ≤ 1, and second λ > 1. We begin with the first case. Since u is a tempered distribution, we can estimate
\[ |\hat{v}_N(-\lambda y)| \le C \sum_{|\alpha| \le a,\, |\beta| \le b} \sup_{x \in \mathbb{R}^m} |x^\alpha \partial^\beta \hat{\chi}_N(x - \lambda y)|. \tag{120} \]
Using (110), it is not difficult to estimate
\[ |y^\alpha \partial^\beta \hat{\chi}_N(y)| \le C(N+1), \quad \forall N,\ |\alpha| \le a,\ |\beta| \le b. \tag{121} \]
From this one obtains the estimate
\[ |\hat{v}_N(-\lambda y)| \le C(N+1) \le C_M^{N-M+1} \bigl( (N-M+1)/\lambda \bigr)^{N-M} \tag{122} \]
for all M < N and all 0 < λ ≤ 1. In order to estimate (119) also for λ > 1, we use that u scales almost homogeneously up to logarithms. This enables us to write
\[ \hat{v}_N(-\lambda y) = \lambda^{\rho+m} \sum_{j \ge 0} \frac{\ln^j \lambda}{j!} \int_{\mathbb{R}^m} u_j(x)\, \hat{\chi}_N(\lambda(x - y)) \, d^m x, \tag{123} \]
where the sum is finite and where $u_j = S_\rho^j u$, with $S_\rho = y^i \partial/\partial y^i + \rho$. Since $S_\rho$ is a partial differential operator with analytic coefficients, we conclude that $\mathrm{WF}_A(u_j)$ is not bigger than $\mathrm{WF}_A(u)$ and hence also has no intersection with $K \times \hat{K}$. We are now in a position to use exactly the same arguments as in the proof of [14, Thm. 8.4.18] (modulo a trivial additional estimate due to the logarithms), to show that there holds the estimate
\[ |\hat{v}_N(-\lambda y)| \le C^{N-M+1} \bigl( (N-M+1)/\lambda \bigr)^{N-M}, \quad \forall N,\ \lambda > 1,\ |y - y_0| < r, \tag{124} \]
for some natural number M. Together with (123) this shows that we have
\[ |\hat{v}_N(y)| \le C^{N-M+1} \bigl( (N-M+1)/|y| \bigr)^{N-M} \tag{125} \]
for all y in the conic neighborhood $\{-\lambda y \in \mathbb{R}^m \mid |y - y_0| \le r,\ \lambda > 0\}$ of $-y_0$ and for some fixed M. This proves the proposition. □
References
1. Boas, F.M.: Gauge theories in local causal perturbation theory. [arXiv:hep-th/0001014]
2. Brunetti, R., Fredenhagen, K., Köhler, M.: The microlocal spectrum condition and Wick polynomials on curved spacetimes. Commun. Math. Phys. 180, 633–652 (1996)
3. Brunetti, R., Fredenhagen, K.: Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds. Commun. Math. Phys. 208, 623–661 (2000)
4. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle: A new paradigm for local quantum physics. [arXiv:math-ph/0112041]
5. Bunch, T.S., Parker, L.: Feynman propagator in curved space-time: A momentum space representation. Phys. Rev. D20, 2499 (1979)
6. Christensen, S.M.: Vacuum expectation value of the stress tensor in an arbitrary curved background: The covariant point-separation method. Phys. Rev. D14, 2490–2501 (1976)
7. DeWitt, B.S., Brehme, R.W.: Radiation damping in a gravitational field. Ann. Phys. 9, 220–259 (1960)
8. Dütsch, M., Fredenhagen, K.: Algebraic quantum field theory, perturbation theory, and the loop expansion. Commun. Math. Phys. 219, 5 (2001), [arXiv:hep-th/0001129]; Perturbative algebraic field theory, and deformation quantization. To appear in: Fields Inst. Commun. [arXiv:hep-th/0101079]
9. Dütsch, M., Fredenhagen, K.: A local (perturbative) construction of observables in gauge theories: The example of QED. Commun. Math. Phys. 203, 71 (1999) [arXiv:hep-th/9807078]
10. Epstein, H., Glaser, V.: The rôle of locality in perturbation theory. Ann. Inst. H. Poincaré Sec. A XIX, 211–295 (1973)
11. Fredenhagen, K., Haag, R.: Generally covariant quantum field theory and scaling limits. Commun. Math. Phys. 108, 91 (1987)
12. Hollands, S., Wald, R.M.: Local Wick polynomials and time ordered products of quantum fields in curved spacetime. Commun. Math. Phys. 223, 289 (2001) [arXiv:gr-qc/0103074]
13. Hollands, S., Ruan, W.: The state space of perturbative interacting quantum field theories in curved spacetimes. [arXiv:gr-qc/0108032]
14. Hörmander, L.: The Analysis of Linear Partial Differential Operators I. 2nd Edition, Berlin-Heidelberg-New York: Springer-Verlag, 1990
15. Kay, B.S.: Reducing the renormalization ambiguity in the Brunetti-Fredenhagen approach to interacting quantum field theories on curved spacetime: The example of λϕ⁴ Theory. In preparation
16. Moretti, V.: Comments on the stress-energy tensor operator in curved spacetime. [arXiv:gr-qc/0109048]
17. Prange, D.: Lorentz covariance in Epstein-Glaser renormalization. [arXiv:hep-th/9904136]
18. Radzikowski, M.: Microlocal approach to the Hadamard condition in quantum field theory on curved spacetime. Commun. Math. Phys. 179, 529 (1996)
19. Reed, M., Simon, B.: Methods of modern mathematical physics I. New York: Academic Press, 1973

Communicated by H. Nicolai
Commun. Math. Phys. 231, 347–373 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0723-2
Stability and Asymptotic Stability in the Energy Space of the Sum of N Solitons for Subcritical gKdV Equations

Yvan Martel (1), Frank Merle (1,2), Tai-Peng Tsai (3)

(1) Université de Cergy–Pontoise, Département de Mathématiques, 2, av. Adolphe Chauvin, 95302 Cergy–Pontoise, France
(2) Institut Universitaire de France, Paris, France
(3) Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA

Received: 8 October 2001 / Accepted: 2 July 2002
Published online: 14 October 2002 – © Springer-Verlag 2002
Abstract: We prove in this paper the stability and asymptotic stability in $H^1$ of a decoupled sum of N solitons for the subcritical generalized KdV equations $u_t + (u_{xx} + u^p)_x = 0$ (1 < p < 5). The proof of the stability result is based on energy arguments and monotonicity of the local $L^2$ norm. Note that the result is new even for p = 2 (the KdV equation). The asymptotic stability result then follows directly from a rigidity theorem in [16].

1. Introduction

In this paper, we consider the generalized Korteweg–de Vries equations
\[ u_t + (u_{xx} + u^p)_x = 0, \quad (t, x) \in \mathbb{R} \times \mathbb{R}, \qquad u(0, x) = u_0(x), \quad x \in \mathbb{R}, \tag{1} \]
for 1 < p < 5 and $u_0 \in H^1(\mathbb{R})$. This model for p = 2 was first introduced in the study of waves on shallow water, see Korteweg and de Vries [10]. It also appears for p = 2 and 3, in other areas of physics (see e.g. Lamb [11]). Recall that (1) is well-posed in the energy space $H^1$. For p = 2, 3, 4, it was proved by Kenig, Ponce and Vega [9] (see also Kato [8], Ginibre and Tsutsumi [6]), that for $u_0 \in H^1(\mathbb{R})$, there exists a unique solution $u \in C(\mathbb{R}, H^1(\mathbb{R}))$ of (1) satisfying the following two conservation laws, for all t ∈ R:
\[ \int u^2(t) = \int u_0^2, \tag{2} \]
\[ E(u(t)) = \frac12 \int u_x^2(t) - \frac{1}{p+1} \int u^{p+1}(t) = \frac12 \int u_{0x}^2 - \frac{1}{p+1} \int u_0^{p+1}. \tag{3} \]
For p = 2, 3, 4, global existence of all solutions in $H^1$, as well as a uniform bound in $H^1$, follow directly from the Gagliardo–Nirenberg inequality,
\[ \forall v \in H^1(\mathbb{R}), \quad \int |v|^{p+1} \le C(p) \left( \int v^2 \right)^{\frac{p+3}{4}} \left( \int v_x^2 \right)^{\frac{p-1}{4}}, \]
and relations (2), (3), giving a uniform bound in $H^1$ for any solution. This is in contrast with the case p = 5, for which there exist solutions u(t) of (1) such that $|u(t)|_{H^1} \to +\infty$ as t → T, for 0 < T < +∞, see [20] and [18]. For p > 5 such behavior is also conjectured. Thus, for the question of global existence and bound in $H^1$, the case 1 < p < 5 is called the subcritical case, p = 5 the critical case and p > 5 the supercritical case.

Equation (1) has explicit traveling wave solutions, called solitons, which play a fundamental role in the generic behavior of the solutions. Let
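\[ Q(x) = \left( \frac{p+1}{2 \operatorname{ch}^2\!\left( \frac{p-1}{2} x \right)} \right)^{\frac{1}{p-1}} \tag{4} \]
be the only positive solution in $H^1(\mathbb{R})$ (up to translation) of $Q_{xx} + Q^p = Q$, and for c > 0, let $Q_c(x) = c^{\frac{1}{p-1}} Q(\sqrt{c}\, x)$. The traveling wave solutions of (1) are
\[ u(t, x) = Q_c(x - ct) = c^{\frac{1}{p-1}} Q\bigl( \sqrt{c}\,(x - ct) \bigr), \]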
where c > 0 is the speed of the soliton. For the KdV equation (p = 2), there is a much wider class of special explicit solutions for (1), called N-solitons. They correspond to the superposition of N traveling waves with different speeds that interact and then remain unchanged after interaction. The N-solitons behave asymptotically in large time as the sum of N traveling waves, and as for the single solitons, there is no dispersion. We refer to [21] for explicit expressions and further properties of these solutions. For p ≠ 2, even the existence of solutions behaving asymptotically as the sum of N solitons was not known.

Important notions for these solutions are the stability and asymptotic stability with respect to initial data. For c > 0, the soliton $Q_c(x - ct)$ is stable in $H^1$ if:
\[ \forall \delta_0 > 0,\ \exists \alpha_0 > 0 \ / \ |u_0 - Q_c|_{H^1} \le \alpha_0 \;\Rightarrow\; \forall t \ge 0,\ \exists x(t) \ / \ |u(t) - Q_c(\cdot - x(t))|_{H^1} \le \delta_0. \]
The family of solitons $\{Q_c(x - x_0 - ct),\ c > 0,\ x_0 \in \mathbb{R}\}$ is asymptotically stable if:
\[ \exists \alpha_0 > 0 \ / \ |u_0 - Q_c|_{H^1} \le \alpha_0 \;\Rightarrow\; \exists c_{+\infty},\ x(t) \ / \ u(t, \cdot + x(t)) \rightharpoonup Q_{c_{+\infty}} \text{ in } H^1 \text{ as } t \to +\infty. \]
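The explicit profile (4) and its rescaling $Q_c$ can be checked numerically. The sketch below is an illustration only, not part of the original argument: it evaluates the residual of the ordinary differential equation $Q_c'' + Q_c^p = c\,Q_c$, which is what makes $Q_c(x - ct)$ a traveling wave of (1). The second derivative is approximated by a central difference, and the function names (`Q`, `Q_c`, `ode_residual`, `max_residual`) are hypothetical helpers introduced here.

```python
import math

def Q(x, p):
    """Profile (4): Q(x) = ((p+1) / (2 cosh^2((p-1) x / 2)))^(1/(p-1))."""
    return ((p + 1) / (2.0 * math.cosh((p - 1) * x / 2.0) ** 2)) ** (1.0 / (p - 1))

def Q_c(x, c, p):
    """Rescaled soliton profile Q_c(x) = c^(1/(p-1)) Q(sqrt(c) x)."""
    return c ** (1.0 / (p - 1)) * Q(math.sqrt(c) * x, p)

def ode_residual(x, c, p, h=1e-4):
    """Q_c'' + Q_c^p - c Q_c at x, with Q_c'' from a central difference of step h."""
    f = lambda y: Q_c(y, c, p)
    second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return second + f(x) ** p - c * f(x)

def max_residual(c, p):
    """Largest absolute residual on the sample grid x = -5, -4.9, ..., 5."""
    return max(abs(ode_residual(0.1 * k, c, p)) for k in range(-50, 51))

for p in (2, 3, 4):
    for c in (0.5, 1.0, 2.0):
        print(p, c, max_residual(c, p))
```

The printed residuals should be at the level of the finite-difference error only. Since $(u_{xx} + u^p)_x = c\,Q_c'$ and $u_t = -c\,Q_c'$ for $u(t, x) = Q_c(x - ct)$, a vanishing residual is equivalent to u solving (1).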
We recall previously known results concerning the notions of stability of solitons and N solitons: – In the subcritical case: p = 2, 3, 4, it follows from energetic arguments that the solitons are H 1 stable (see Benjamin [1] and Weinstein [25]). Moreover, Martel and Merle [16] prove the asymptotic stability of the family of solitons in the energy space. The proof relies on a rigidity theorem close to the family of solitons, which was first given for the critical case ([14]), and which is based on nonlinear argument. (Pego and
Weinstein [22] prove the asymptotic stability result for p = 2, 3 for initial data with exponential decay as x → +∞.) In the case of the KdV equation, Maddocks and Sachs [13] prove the stability in H N (R) of N -solitons (recall that there are explicit solutions of the KdV equation): for any initial data u0 close in H N (R) to an N -soliton, the solution u(t) of the KdV equation remains uniformly close in H N (R) for all time to an N soliton profile with the same speeds. Their proof involves N conserved quantities for the KdV equation, and this is the reason why they need to impose closeness in high regularity spaces. Note that this result is known only with p = 2 and with this regularity assumption on the initial data. Asymptotic stability is unknown in this context. As it is noted in [13], multi-solitons of the KdV equations can serve as examples of exact solutions of nonlinear wave interactions. The stability and asymptotic stability of such solutions are thus important properties from the physical point of view and produce more examples of well understood solutions (see references in [13]). We also refer to S.-I. Ei and T. Ohta [5] for a study of the motion of two interacting pulses in the case of the KdV equations (Part III of [5]) and of other dissipative and dispersive systems. – In the critical case p = 5, any solution with negative energy initially close to the soliton blows up in finite or infinite time in H 1 (Merle [20]), and actually blows up in finite time if the initial data satisfies in addition a polynomial decay condition on the right in space (Martel and Merle [18]). (Note that E(Q) = 0 for p = 5.) Of course this implies the instability of the soliton. These results rely on rigidity theorems around the soliton. – In the supercritical case p > 5, Bona, Souganidis, and Strauss [2] proved, using Grillakis, Shatah, and Strauss [7] type arguments, H 1 instability of solitons. Moreover, numerical experiments, see e.g. 
Dix and McKinney [4], suggest existence of blow up solutions arbitrarily close to the family of solitons.

In this paper, for p = 2, 3, 4, using techniques developed for the critical and subcritical cases in [14] and [16] as well as a direct variational argument in $H^1$, we prove the stability and asymptotic stability of the sum
\[ \sum_{j=1}^N Q_{c_j^0}(x - x_j), \quad \text{where } 0 < c_1^0 < \cdots < c_N^0, \quad x_1 < \cdots < x_N, \tag{5} \]
in $H^1(\mathbb{R})$, for t ≥ 0.

Theorem 1 (Asymptotic stability of the sum of N solitons). Let p = 2, 3 or 4. Let $0 < c_1^0 < \cdots < c_N^0$. There exist $\gamma_0, A_0, L_0, \alpha_0 > 0$ such that the following is true: Let $u_0 \in H^1(\mathbb{R})$ and assume that there exist $L > L_0$, $\alpha < \alpha_0$, and $x_1^0 < \cdots < x_N^0$, such that
\[ \Bigl| u_0 - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j^0) \Bigr|_{H^1} \le \alpha, \quad \text{and} \quad x_j^0 > x_{j-1}^0 + L, \ \text{for all } j = 2, \dots, N. \tag{6} \]
Let u(t) be the solution of (1). Then, there exist $x_1(t), \dots, x_N(t)$ such that

(i) Stability of the sum of N decoupled solitons,
\[ \forall t \ge 0, \quad \Bigl| u(t) - \sum_{j=1}^N Q_{c_j^0}(x - x_j(t)) \Bigr|_{H^1} \le A_0 \bigl( \alpha + e^{-\gamma_0 L} \bigr). \tag{7} \]
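(ii) Asymptotic stability of the sum of N solitons. Moreover, there exist $c_1^{+\infty}, \dots, c_N^{+\infty}$, with $|c_j^{+\infty} - c_j^0| \le A_0 \bigl( \alpha + e^{-\gamma_0 L} \bigr)$, such that
\[ \Bigl| u(t) - \sum_{j=1}^N Q_{c_j^{+\infty}}(x - x_j(t)) \Bigr|_{L^2(x > c_1^0 t/10)} \to 0, \quad \dot{x}_j(t) \to c_j^{+\infty} \quad \text{as } t \to +\infty. \tag{8} \]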
Remark 1. One of the interests of studying the stability and asymptotic stability of the sums (5) rather than the explicit N-soliton solutions is that we can consider any subcritical generalized KdV equation. Indeed, the result does not depend on the existence of a special family of solutions behaving as the N-solitons. For p ≠ 2, note that the existence of solutions behaving in $L^2$ as t → +∞ as the sum of N solitons is an open problem. The asymptotic stability result (ii) proves that the family of sums (5) attracts as t → +∞ the orbits that are sufficiently close to it. We believe that this is important qualitative information on the flow of the generalized KdV equations, from both the mathematical and the physical point of view. For p = 2, it implies in particular the stability and asymptotic stability of the explicit N-soliton solutions in the energy space (see Corollary 1 below).

Remark 2. It is well-known that for p = 2 and p = 3, (1) is completely integrable. Indeed, for suitable $u_0$ ($u_0$ and its derivatives with exponential decay at infinity) there exist an infinite number of conservation laws, see e.g. Lax [12] and Miura [21]. Moreover, many results on these equations rely on the inverse scattering method, which transforms the problem into a sequence of linear problems (but requires a strong decay assumption on the solution). The result in [13] does not use this transformation but the existence of many conservation laws for the KdV equation. In this paper, we do not use integrability and we work in the energy space $H^1$, with no decay assumption on $u_0$.

Remark 3. For Schrödinger type equations, Perelman [23] and Buslaev and Perelman [3], with strong conditions on initial data and nonlinearity, and using a linearization method around the soliton, prove asymptotic stability results by a fixed point argument. Unfortunately, this method breaks down without a decay assumption on the initial data.

Remark 4. In Theorem 1 (ii), we cannot have convergence to zero in $L^2(x > 0)$. Indeed, assumption (6) on the initial data allows the existence in u(t) of an additional soliton of size less than α (thus traveling at arbitrarily small speed).

For p = 2, an explicit example can be constructed using the N-soliton solutions. Recall that for p = 2 any N-soliton solution has the form $v(t, x) = U^{(N)}(x; c_j, x_j - c_j t)$, where $\{U^{(N)}(x; c_j, y_j);\ c_j > 0,\ y_j \in \mathbb{R}\}$ is the family of explicit N-soliton profiles (see e.g. [13], Sect. 3.1). As a direct corollary of Theorem 1, for p = 2, we prove stability and asymptotic stability of this family.

Corollary 1 (Asymptotic stability in $H^1$ of N-solitons for p = 2). Let p = 2. Let $0 < c_1^0 < \cdots < c_N^0$ and $x_1^0, \dots, x_N^0 \in \mathbb{R}$. For all $\delta_1 > 0$, there exists $\alpha_1 > 0$ such that the following is true: Let u(t) be a solution of (1). If $|u(0) - U^{(N)}(\cdot\,; c_j^0, -x_j^0)|_{H^1} \le \alpha_1$, then there exist $x_j(t)$ such that
\[ \forall t > 0, \quad |u(t) - U^{(N)}(\cdot\,; c_j^0, -x_j(t))|_{H^1} \le \delta_1. \tag{9} \]
Moreover, there exist $c_j^{+\infty} > 0$ such that
\[ \bigl| u(t) - U^{(N)}(\cdot\,; c_j^{+\infty}, -x_j(t)) \bigr|_{L^2(x > c_1^0 t/10)} \to 0, \quad \dot{x}_j(t) \to c_j^{+\infty} \quad \text{as } t \to +\infty. \tag{10} \]
Note that this improves the result in [13] in two ways. First, stability is proved in $H^1$ instead of $H^N$. Second, we also prove asymptotic stability as t → +∞. Corollary 1 is proved at the end of Sect. 4.

Let us sketch the proof of these results. Note first that the main result, i.e. the stability result Theorem 1 (i), is self-contained, whereas the asymptotic stability result Theorem 1 (ii) relies on the proof for the case N = 1 ([16]). For Theorem 1, using modulation theory, we write $u(t) = \sum_{j=1}^N Q_{c_j(t)}(x - x_j(t)) + \varepsilon(t, x)$, where ε(t) is small in $H^1$, and $x_j(t)$, $c_j(t)$ are geometrical parameters (see Sect. 2). The stability result (i) amounts to controlling both the variation of $c_j(t)$ and the size of ε(t) in $H^1$ (Sect. 3). Our main arguments are based on $L^2$ properties of the solution. From [14] and [16], the $L^2$ norm of the solution at the right of each soliton is almost decreasing in time. This property together with an energy argument allows us to prove that the variation of $c_j(t)$ is quadratic in $|\varepsilon(t)|_{H^1}$, which is a key of the problem. Let us explain the argument formally by taking ε = 0 and so $u(t) = \sum_{j=1}^N Q_{c_j(t)}(x - x_j(t))$. The energy conservation becomes
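\[ \sum_{j=1}^N c_j^{\beta+\frac12}(t) = \sum_{j=1}^N c_j^{\beta+\frac12}(0), \]
where $\beta = \frac{2}{p-1}$. The monotonicity of the $L^2$ norm at the right of each soliton gives us
\[ \Delta_j(t) = \sum_{k=j}^N \Bigl[ c_k^{\beta-\frac12}(t) - c_k^{\beta-\frac12}(0) \Bigr] \le 0. \]
We claim that $c_j(t) = c_j(0)$ by a convexity argument. Indeed,
\[
\begin{aligned}
0 &= \sum_j \Bigl[ c_j^{\beta+\frac12}(t) - c_j^{\beta+\frac12}(0) \Bigr] \sim \frac{2\beta+1}{2\beta-1} \sum_j c_j(0) \Bigl[ c_j^{\beta-\frac12}(t) - c_j^{\beta-\frac12}(0) \Bigr] \\
&= \frac{2\beta+1}{2\beta-1} \sum_j (c_j(0) - c_{j+1}(0))\, \Delta_j(t) \ge \sigma_0 \frac{2\beta+1}{2\beta-1} \sum_j |\Delta_j(t)| \ge \sigma_1 \sum_j |c_j(t) - c_j(0)|.
\end{aligned}
\]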
Thus cj (t) is a constant at the first order. In fact, we prove that the variation in time of cj (t) is of order 2 in ε(t). Then we control the variation of ε(t) in H 1 by a refined version of this argument, using suitable orthogonality conditions on ε. The asymptotic stability result (ii) follows directly from a rigidity property of the flow of Eq. (1) around the solitons (see Theorem following Proposition 2 in Sect. 4 of this paper and [16]) and monotonicity properties of the mass (see Sect. 2.2 and Sect. 4).
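The exponents β ± 1/2 in the formal argument come from the scaling of the mass and of the energy of a single soliton: with $Q_c(x) = c^{1/(p-1)} Q(\sqrt{c}\,x)$ one finds $\int Q_c^2 = c^{\beta - 1/2} \int Q^2$ and $E(Q_c) = c^{\beta + 1/2} E(Q)$. The following sketch checks these two scaling laws numerically. It is an illustration under stated assumptions: the integrals are truncated to a finite interval and computed with the trapezoidal rule, the closed form $Q' = -\tanh((p-1)x/2)\,Q$ is obtained by differentiating (4), and all function names are hypothetical helpers.

```python
import math

def Q(x, p):
    """Profile (4) of the soliton with speed c = 1."""
    return ((p + 1) / (2.0 * math.cosh((p - 1) * x / 2.0) ** 2)) ** (1.0 / (p - 1))

def Q_prime(x, p):
    """Derivative of (4): Q'(x) = -tanh((p-1) x / 2) Q(x)."""
    return -math.tanh((p - 1) * x / 2.0) * Q(x, p)

def integrate(f, half_width=40.0, n=8000):
    """Trapezoidal rule on [-half_width, half_width] for a fast-decaying f."""
    h = 2.0 * half_width / n
    total = 0.5 * (f(-half_width) + f(half_width))
    for i in range(1, n):
        total += f(-half_width + i * h)
    return total * h

def mass(c, p):
    """Integral of Q_c^2, with Q_c(x) = c^(1/(p-1)) Q(sqrt(c) x)."""
    a, s = c ** (1.0 / (p - 1)), math.sqrt(c)
    return integrate(lambda x: (a * Q(s * x, p)) ** 2)

def energy(c, p):
    """E(Q_c) = 1/2 * integral of (Q_c')^2 - 1/(p+1) * integral of Q_c^(p+1)."""
    a, s = c ** (1.0 / (p - 1)), math.sqrt(c)
    return integrate(lambda x: 0.5 * (a * s * Q_prime(s * x, p)) ** 2
                     - (a * Q(s * x, p)) ** (p + 1) / (p + 1))

p = 3
beta = 2.0 / (p - 1)
for c in (0.5, 2.0, 3.0):
    # Each pair of columns should agree: numerical ratio versus predicted power of c.
    print(c, mass(c, p) / mass(1.0, p), c ** (beta - 0.5),
          energy(c, p) / energy(1.0, p), c ** (beta + 0.5))
```

Viewed as a function of the mass variable $m = c^{\beta - 1/2}$, the quantity $c^{\beta + 1/2} = m^{(2\beta+1)/(2\beta-1)}$ has derivative $\frac{2\beta+1}{2\beta-1}\, c$, which is the first-order relation (the symbol ∼) used in the convexity argument above.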
2. Decomposition and Properties of a Solution Close to the Sum of N Solitons

2.1. Decomposition of the solution and conservation laws. Fix $0 < c_1^0 < \cdots < c_N^0$ and let
\[ \sigma_0 = \frac12 \min\bigl( c_1^0,\ c_2^0 - c_1^0,\ c_3^0 - c_2^0,\ \dots,\ c_N^0 - c_{N-1}^0 \bigr). \]
From modulation theory, we claim

Lemma 1 (Decomposition of the solution). There exist $L_1, \alpha_1, K_1 > 0$ such that the following is true: If for $L > L_1$, $0 < \alpha < \alpha_1$, $t_0 > 0$, we have
\[ \sup_{0 \le t \le t_0} \Bigl( \inf_{y_j > y_{j-1} + L} \Bigl| u(t, \cdot) - \sum_{j=1}^N Q_{c_j^0}(\cdot - y_j) \Bigr|_{H^1} \Bigr) < \alpha, \tag{11} \]
then there exist unique $C^1$ functions $c_j : [0, t_0] \to (0, +\infty)$, $x_j : [0, t_0] \to \mathbb{R}$, such that
\[ \varepsilon(t, x) = u(t, x) - \sum_{j=1}^N R_j(t, x), \quad \text{where } R_j(t, x) = Q_{c_j(t)}(x - x_j(t)), \tag{12} \]
satisfies the following orthogonality conditions:
\[ \forall j,\ \forall t \in [0, t_0], \quad \int R_j(t)\varepsilon(t) = \int (R_j(t))_x\, \varepsilon(t) = 0. \tag{13} \]
Moreover, there exists $K_1 > 0$ such that ∀t ∈ [0, t_0],
\[ |\varepsilon(t)|_{H^1} + \sum_{j=1}^N |c_j(t) - c_j^0| \le K_1 \alpha, \tag{14} \]
\[ \forall j, \quad |\dot{c}_j(t)| + |\dot{x}_j(t) - c_j(t)| \le K_1 \Bigl( \int e^{-\sqrt{\sigma_0}\,|x - x_j(t)|/2}\, \varepsilon^2(t) \Bigr)^{1/2} + K_1 e^{-\sqrt{\sigma_0}\,(L + \sigma_0 t)/4}. \tag{15} \]

Proof. Lemma 1 is a consequence of Lemma 8 (see Appendix) and standard arguments. We refer to [15] Sect. 2.3 for a complete proof in the case of a single soliton. In particular, ε(t) satisfies ∀t ∈ [0, t_0],
\[ \varepsilon_t + \varepsilon_{xxx} = -\sum_{j=1}^N \frac{\dot{c}_j}{2c_j} \Bigl( \frac{2R_j}{p-1} + (x - x_j)(R_j)_x \Bigr) + \sum_{j=1}^N (\dot{x}_j - c_j)\, R_{jx} - \Bigl( \Bigl( \varepsilon + \sum_{j=1}^N R_j \Bigr)^p - \sum_{j=1}^N R_j^p \Bigr)_x. \]
By taking (formally) the scalar product of this equation by $R_j$ and $(R_j)_x$, and using calculations in the proof of Lemma 8, we prove
\[ |\dot{c}_j(t)| + |\dot{x}_j(t) - c_j(t)| \le C \Bigl( \int e^{-\sqrt{\sigma_0}\,|x - x_j(t)|/2}\, \varepsilon^2(t) \Bigr)^{1/2} + C \sum_{k \ne j} e^{-\sqrt{\sigma_0}\,|x_k(t) - x_j(t)|/2}. \]
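For α > 0 small enough, and L large enough, we have $|x_k(t) - x_j(t)| \ge \frac{L}{2} + \sigma_0 t$, and this proves (15).

Next, by using the conservation of energy for u(t), i.e.
\[ E(u(t)) := \frac12 \int u_x^2(t, x)\, dx - \frac{1}{p+1} \int u^{p+1}(t, x)\, dx = E(u_0), \]
and linearizing the energy around $R = \sum_{j=1}^N R_j$, we prove the following result.

Lemma 2 (Energy bounds). There exist $K_2 > 0$ and $L_2 > 0$ such that the following is true: Assume that ∀j, $c_j(t) \ge \sigma_0$, and $x_j(t) - x_{j-1}(t) \ge L \ge L_2$. Then, ∀t ∈ [0, t_0],
\[ \Bigl| \sum_{j=1}^N \bigl[ E(R_j(t)) - E(R_j(0)) \bigr] + \frac12 \int (\varepsilon_x^2 - pR^{p-1}\varepsilon^2)(t) \Bigr| \le K_2 \Bigl( |\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\sqrt{\sigma_0} L/2} \Bigr), \tag{16} \]
where $K_2$ is a constant.

Proof. Insert (12) into E(u(t)) and integrate by parts. We have
\[
\begin{aligned}
E(u(t)) ={}& \int \Bigl( \frac12 R_x^2 - \frac{1}{p+1} R^{p+1} \Bigr) dx - \int (R_{xx} + R^p)\, \varepsilon \, dx && (17) \\
&+ \int \Bigl( \frac12 \varepsilon_x^2 - \frac{p}{2} R^{p-1} \varepsilon^2 \Bigr) dx \\
&+ \int \Bigl( \frac{1}{p+1} \bigl[ -(R + \varepsilon)^{p+1} + R^{p+1} \bigr] + R^p \varepsilon + \frac{p}{2} R^{p-1} \varepsilon^2 \Bigr) dx. && (18)
\end{aligned}
\]
We first observe that $|(18)| \le C |\varepsilon|_{H^1}^3$. Next, remark that $\sigma_0 \le c_j(t)$, $x_j(t) - x_{j-1}(t) \ge L$, implies $|R_j(x, t)| + |(R_j)_x(x, t)| \le C e^{-\sqrt{\sigma_0}\,|x - x_j(t)|}$, and so
\[ \Bigl| \int R_j(t) R_k(t)\, dx \Bigr| + \Bigl| \int (R_j)_x(t)\, (R_k)_x(t)\, dx \Bigr| \le C e^{-\sqrt{\sigma_0} L/2} \quad \text{if } j \ne k. \tag{19} \]
Thus, by $(R_j)_{xx} + R_j^p = c_j R_j$, we have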
√ 1 2 p−1 2 (17) − E(Rj (t)) + cj Rj ε(t) − ε )(t) ≤ Ce− σ0 L/2 . (εx − pR 2 j j =1 (20) From Rj (t)ε(t) = 0, we obtain N
√ 1 2 p−1 2 ≤ Ce− σ0 L/2 + C ε(t)3 1 . E(u(t)) − E(R (t)) − − pR ε )(t) (ε j x H 2 j =1
Since E(u(t)) = E(u(0)), applying the previous formula at t = 0 and at t, we prove the lemma.
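2.2. Almost monotonicity of the mass at the right. We follow the proof of Lemma 20 in [14]. Let
\[ \phi(x) = cQ(\sqrt{\sigma_0}\, x/2), \quad \psi(x) = \int_{-\infty}^x \phi(y)\, dy, \quad \text{where } c = \Bigl( \frac{2}{\sqrt{\sigma_0}} \int_{-\infty}^{\infty} Q \Bigr)^{-1}. \tag{21} \]
Note that ∀x ∈ R, $\psi' > 0$, $0 < \psi(x) < 1$, and $\lim_{x \to -\infty} \psi(x) = 0$, $\lim_{x \to +\infty} \psi(x) = 1$. Let
\[ j \ge 2, \quad I_j(t) = \int u^2(t, x)\, \psi(x - m_j(t))\, dx, \quad m_j(t) = \frac{x_{j-1}(t) + x_j(t)}{2}. \tag{22} \]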
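The properties of the weight ψ defined in (21) can be examined numerically. The sketch below is an illustration only: it builds φ and ψ for one choice of $\sigma_0$ and p, and evaluates the quantity $\psi''' - \frac{\sigma_0}{4}\psi'$, which the proof of Lemma 3 requires to be nonpositive. It assumes a truncation of the integrals to a finite interval, the trapezoidal rule, and a central difference for $\phi'' = \psi'''$; the function names are hypothetical helpers.

```python
import math

def Q(x, p):
    """Profile (4), the positive solution of Q'' + Q^p = Q."""
    return ((p + 1) / (2.0 * math.cosh((p - 1) * x / 2.0) ** 2)) ** (1.0 / (p - 1))

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for the integral of f over [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def make_phi(sigma0, p):
    """phi(x) = c Q(sqrt(sigma0) x / 2), with c chosen so that phi integrates to 1."""
    scale = math.sqrt(sigma0) / 2.0
    integral_Q = trapezoid(lambda y: Q(y, p), -60.0, 60.0, 12000)
    c = scale / integral_Q   # equals ((2 / sqrt(sigma0)) * integral of Q)^(-1)
    return lambda x: c * Q(scale * x, p)

def psi(x, phi, sigma0, n=4000):
    """psi(x) = integral of phi over (-infinity, x], truncated where phi is negligible."""
    lower = -120.0 / math.sqrt(sigma0)
    return 0.0 if x <= lower else trapezoid(phi, lower, x, n)

def third_derivative_gap(x, phi, sigma0, h=1e-3):
    """psi'''(x) - (sigma0 / 4) psi'(x), using psi' = phi and a central difference for phi''."""
    phi_second = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / (h * h)
    return phi_second - 0.25 * sigma0 * phi(x)

sigma0, p = 0.5, 2
phi = make_phi(sigma0, p)
# psi should increase from about 0 to about 1.
print([round(psi(x, phi, sigma0), 6) for x in (-40.0, -5.0, 0.0, 5.0, 40.0)])
# The largest sampled value of the gap should not be positive.
print(max(third_derivative_gap(0.5 * k, phi, sigma0) for k in range(-20, 21)))
```

The sign of the gap follows from the equation for Q: since $Q'' = Q - Q^p \le Q$, one has $\phi'' = \frac{\sigma_0}{4}\, c\,(Q - Q^p)(\sqrt{\sigma_0}\,x/2) \le \frac{\sigma_0}{4}\phi$.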
Lemma 3 (Almost monotonicity of the mass on the right of each soliton [14]). There exist $K_3 = K_3(\sigma_0) > 0$, $L_3 = L_3(\sigma_0) > 0$ such that the following is true: Let $t_1 \in [0, t_0]$. Assume that ∀t ∈ [0, t_1], ∀j,
\[ \dot{x}_1(t) \ge \sigma_0, \quad \dot{x}_j(t) - \dot{x}_{j-1}(t) \ge \sigma_0, \quad c_j(t) > \sigma_0, \quad \text{and} \quad |\varepsilon(t)|_{H^1}^{p-1} \le \frac{\sigma_0}{8 \cdot 2^{p-1}}. \tag{23} \]
If for $L > L_3$, ∀j ∈ {2, . . . , N}, $x_j(0) - x_{j-1}(0) \ge L$, then
\[ I_j(t_1) - I_j(0) \le K_3 e^{-\sqrt{\sigma_0} L/8}. \]
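Proof. Let j ∈ {1, . . . , N}. Using Eq. (1) and integrating by parts several times, we have (see [16] Eq. (20)),
\[ \frac{d}{dt} I_j(t) = \int \Bigl( -3u_x^2 - \dot{m} u^2 + \frac{2p}{p+1} u^{p+1} \Bigr) \psi' + \int u^2 \psi^{(3)}. \]
By definition of ψ, $\psi^{(3)} \le \frac{\sigma_0}{4} \psi'$, so that
\[ \int u^2 \psi^{(3)} \le \frac{\sigma_0}{4} \int u^2 \psi'. \tag{24} \]
To bound $\int u^{p+1} \psi'$, we divide the real line into two regions: I = [a, b] and its complement $I^C$, where $a = a(t) = x_{j-1}(t) + \frac{L}{4}$ and $b = b(t) = x_j(t) - \frac{L}{4}$. Inside the interval I we have
\[ \Bigl| \int_I u^{p+1} \psi' \Bigr| \le \int_I u^2 \psi' \cdot \sup_I |u|^{p-1}. \]
Since for x ∈ I, for all k = 1, 2, . . . , N, $|x - x_k(t)| \ge \frac{L}{4}$, we have
\[ |u(t, x)|^{p-1} = \Bigl| \sum_{k=1}^N R_k(t, x) + \varepsilon(t, x) \Bigr|^{p-1} \le C e^{-\sqrt{\sigma_0} L/4} + 2^{p-1} |\varepsilon(t)|_{L^\infty}^{p-1} \le \frac{\sigma_0}{4}, \]
for $L > L_3(\sigma_0)$. Thus,
\[ \Bigl| \int_I u^{p+1} \psi' \Bigr| \le \frac{\sigma_0}{4} \int u^2 \psi'. \tag{25} \]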
Next, in $I^C$, by the Gagliardo–Nirenberg inequality,
\[ \Bigl| \int_{I^C} u^{p+1} \psi' \, dx \Bigr| \le \int u^{p+1}\, dx \cdot \sup_{I^C} \psi' \le C |u|_{H^1}^{p+1} \cdot \exp\Bigl( -\frac{\sqrt{\sigma_0}}{4} \Bigl( x_j(t) - x_{j-1}(t) - \frac{L}{2} \Bigr) \Bigr) \le C e^{-\frac{\sqrt{\sigma_0}}{8}(2\sigma_0 t + L)}, \tag{26} \]
by $x_j(t) - x_{j-1}(t) \ge x_j(0) - x_{j-1}(0) + \sigma_0 t \ge L + \sigma_0 t$. From $\dot{m} \ge \sigma_0$, (24), (25) and (26), we obtain
\[ \frac{d}{dt} I_j(t) \le \int \Bigl( -3u_x^2 - \frac{\sigma_0}{2} u^2 \Bigr) \psi' \, dx + C e^{-\frac{\sqrt{\sigma_0}}{8}(2\sigma_0 t + L)} \le C e^{-\frac{\sqrt{\sigma_0}}{8}(2\sigma_0 t + L)}. \]
Thus, by integrating between 0 and $t_1$, we obtain the conclusion. Note that $K_3$ and $L_3$ are chosen independently of $t_1$.

2.3. Positivity of the quadratic form. By the choice of orthogonality conditions on ε(t) and standard arguments, we claim the following lemma.

Lemma 4 (Positivity of the quadratic form). There exist $L_4 > 0$ and $\lambda_0 > 0$ such that if ∀j, $c_j(t) \ge \sigma_0$, $x_j(t) \ge x_{j-1}(t) + L_4$ then, ∀t ∈ [0, t_0],
\[ \int \varepsilon_x^2(t) - pR^{p-1}(t)\varepsilon^2(t) + c(t, x)\varepsilon^2(t) \ge \lambda_0 |\varepsilon(t)|_{H^1}^2, \tag{27} \]
where $c(t, x) = c_1(t) + \sum_{j=2}^N (c_j(t) - c_{j-1}(t))\, \psi(x - m_j(t))$.

Proof of Lemma 4. It is well known that there exists $\lambda_1 > 0$ such that if $v \in H^1(\mathbb{R})$ satisfies $\int Qv = \int Q_x v = 0$, then
\[ \int v_x^2 - pQ^{p-1}v^2 + v^2 \ge \lambda_1 |v|_{H^1}^2. \tag{28} \]
(See the proof of Proposition 2.9 in Weinstein [24].) Now we give a local version of (28). Let $\Phi \in C^2(\mathbb{R})$, $\Phi(x) = \Phi(-x)$, $\Phi' \le 0$ on $\mathbb{R}^+$, with $\Phi(x) = 1$ on [0, 1]; $\Phi(x) = e^{-x}$ on [2, +∞), $e^{-x} \le \Phi(x) \le 3e^{-x}$ on $\mathbb{R}^+$. Let $\Phi_B(x) = \Phi\bigl( \frac{x}{B} \bigr)$. The following claim is similar to a part of the proof of some local Virial relation in Sect. 2.2 of [17]; see Appendix A, Steps 1 and 2, in [17] for its proof.

Claim. There exists $B_0 > 0$ such that, for all $B > B_0$, if $v \in H^1(\mathbb{R})$ satisfies $\int Qv = \int Q_x v = 0$, then
\[ \int \Phi_B \bigl( v_x^2 - pQ^{p-1}v^2 + v^2 \bigr) \ge \frac{\lambda_1}{4} \int \Phi_B (v_x^2 + v^2). \tag{29} \]
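We finish the proof of Lemma 4. Let $B > B_0$ to be chosen later and $L_4 = 4kB$, where k > 1 integer is to be chosen later. We have
\[
\begin{aligned}
\int \varepsilon_x^2 - pR^{p-1}\varepsilon^2 + c(t, x)\varepsilon^2 ={}& \sum_{j=1}^N \int \Phi_B(x - x_j(t)) \bigl( \varepsilon_x^2 - pR_j^{p-1}\varepsilon^2 + c_j(t)\varepsilon^2 \bigr) \\
&- p \int \Bigl( R^{p-1} - \sum_{j=1}^N \Phi_B(x - x_j(t))\, R_j^{p-1} \Bigr) \varepsilon^2 \\
&+ \sum_{j=1}^N \int \Phi_B(x - x_j(t)) (c(t, x) - c_j(t))\, \varepsilon^2 \\
&+ \int \Bigl( 1 - \sum_{j=1}^N \Phi_B(x - x_j(t)) \Bigr) (\varepsilon_x^2 + c(t, x)\varepsilon^2).
\end{aligned}
\]
Next, we make the following observations:

(i) By (29), we have ∀j,
\[ \int \Phi_B(x - x_j(t)) \bigl( \varepsilon_x^2 - pR_j^{p-1}\varepsilon^2 + c_j(t)\varepsilon^2 \bigr) \ge \frac{\lambda_1}{4} \int \Phi_B(x - x_j(t)) (\varepsilon_x^2 + c_j(t)\varepsilon^2). \]

(ii) Since $\Phi_B(x) = 1$ for |x| < B, by the decay properties of Q, we have
\[ 0 \le R^{p-1} - \sum_{j=1}^N \Phi_B(x - x_j(t))\, R_j^{p-1} \le |R|_{L^\infty(|x - x_j(t)| > B)}^{p-1} + C \sum_{j \ne k} R_j R_k \le C e^{-\sqrt{\sigma_0} B}. \]

(iii) Note that $c(t, x) = \sum_{j=1}^N c_j(t)\varphi_j(t, x)$, where $\varphi_1(t, x) = 1 - \psi(x - m_2(t))$, for j ∈ {2, . . . , N − 1}, $\varphi_j(t, x) = \psi(x - m_j(t)) - \psi(x - m_{j+1}(t))$ and $\varphi_N(t, x) = \psi(x - m_N(t))$. Since $\Phi_B(x) \le 3e^{-|x|/B}$, by the properties of ψ, and $|m_j(t) - x_j(t)| \ge L_4/2 \ge 2kB$, we obtain
\[ |\Phi_B(x - x_j(t)) (c(t, x) - c_j(t))| \le |c(t, x) - c_j(t)|_{L^\infty(|x - x_j(t)| \le kB)} + C e^{-k} \le C e^{-\sqrt{\sigma_0}\, kB/2} + C e^{-k}. \]

(iv) $1 - \sum_{j=1}^N \Phi_B(x - x_j(t)) \ge 0$.

Therefore, with $\lambda_0 = \frac12 \min\bigl( \frac{\lambda_1}{4}, \frac{\lambda_1}{4}\sigma_0, 1, \sigma_0 \bigr)$, for B and k large enough,
\[ \int \varepsilon_x^2 - pR^{p-1}\varepsilon^2 + c(t, x)\varepsilon^2 \ge 2\lambda_0 \int (\varepsilon_x^2 + \varepsilon^2) - C \bigl( e^{-\sqrt{\sigma_0} B/2} + e^{-k} \bigr) \int \varepsilon^2 \ge \lambda_0 \int (\varepsilon_x^2 + \varepsilon^2). \]
Thus the proof of Lemma 4 is complete.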
3. Proof of the Stability in the Energy Space
This section is devoted to the proof of the stability result. The proof is by a priori estimate. Let $0 < c_1^0 < \cdots < c_N^0$, $\sigma_0 = \frac12\min(c_1^0, c_2^0 - c_1^0, c_3^0 - c_2^0, \ldots, c_N^0 - c_{N-1}^0)$ and $\gamma_0 = \sqrt{\sigma_0}/16$. For $A_0, L, \alpha > 0$, we define
$$V_{A_0}(L,\alpha) = \Big\{u \in H^1(\mathbb{R});\ \inf_{x_j - x_{j-1} \ge L}\Big|u - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j)\Big|_{H^1} \le A_0\big(\alpha + e^{-\gamma_0 L/2}\big)\Big\}. \tag{30}$$
We want to prove that there exist $A_0 > 0$, $L_0 > 0$, and $\alpha_0 > 0$ such that, $\forall u_0 \in H^1(\mathbb{R})$, if for some $L > L_0$, $\alpha < \alpha_0$, $\big|u_0 - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j^0)\big|_{H^1} \le \alpha$, where $x_j^0 > x_{j-1}^0 + L$, then $\forall t \ge 0$, $u(t) \in V_{A_0}(L,\alpha)$ (this proves the stability result in $H^1$). By a standard continuity argument (described just below Proposition 1), it is a direct consequence of the following proposition.
Proposition 1 (A priori estimate). There exist $A_0 > 0$, $L_0 > 0$, and $\alpha_0 > 0$ such that, for all $u_0 \in H^1(\mathbb{R})$, if
$$\Big|u_0 - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j^0)\Big|_{H^1} \le \alpha, \tag{31}$$
where $L > L_0$, $0 < \alpha < \alpha_0$, $x_j^0 > x_{j-1}^0 + L$, and if for $t^* > 0$,
$$\forall t \in [0,t^*],\quad u(t) \in V_{A_0}(L,\alpha), \tag{32}$$
then
$$\forall t \in [0,t^*],\quad u(t) \in V_{A_0/2}(L,\alpha). \tag{33}$$
Note that $A_0$, $L_0$ and $\alpha_0 > 0$ are independent of $t^*$. Proposition 1 implies the stability result (i) of Theorem 1. Indeed, let $A_0$, $L_0$, $\alpha_0$ be chosen as in Proposition 1. Suppose that $u_0$ satisfies the assumptions of Theorem 1. Then, by continuity of $u(t)$ in $H^1$, $u(t) \in V_{A_0}(L,\alpha)$ for $0 < t < \tau_0$ for some $\tau_0 > 0$. Let $t^* = \sup\{t \ge 0,\ u(t') \in V_{A_0}(L,\alpha),\ \forall t' \in [0,t]\}$. Assume for the sake of contradiction that $t^*$ is finite. Then, by Proposition 1, we have $\forall t \in [0,t^*]$, $u(t) \in V_{A_0/2}(L,\alpha)$. Therefore, by continuity of $u(t)$ in $H^1$, there exists $\tau > 0$ such that $\forall t \in [0,t^*+\tau]$, $u(t) \in V_{2A_0/3}(L,\alpha)$, which contradicts the definition of $t^*$. The stability result follows.
Proof of Proposition 1. Let $A_0 > 0$ to be fixed later. First, for $0 < \alpha_0 < \alpha_I(A_0)$ and $L_0 > L_I(A_0) > L_1$, we have
$$A_0\big(\alpha_0 + e^{-\gamma_0 L_0/2}\big) \le \alpha_1, \tag{34}$$
where $\alpha_1$ and $L_1$ are defined in Lemma 1. Therefore, by (32) and Lemma 1, there exist $c_j : [0,t^*] \to (0,+\infty)$, $x_j : [0,t^*] \to \mathbb{R}$, such that
$$\varepsilon(t,x) = u(t,x) - \sum_{j=1}^N R_j(t,x), \quad\text{where}\quad R_j(t,x) = Q_{c_j(t)}(x - x_j(t)), \tag{35}$$
satisfies $\forall j$, $\forall t \in [0,t^*]$,
$$\int R_j(t)\varepsilon(t) = \int (R_j(t))_x\,\varepsilon(t) = 0, \tag{36}$$
$$|c_j(t) - c_j^0| + |\dot c_j| + |\dot x_j - c_j^0| + |\varepsilon(t)|_{H^1} \le K_1(A_0 + 1)\big(\alpha_0 + e^{-\gamma_0 L_0}\big). \tag{37}$$
Note that by (31), Lemma 8 (see Appendix) and assumptions of the proposition,
$$|\varepsilon(0)|_{H^1} + \sum_{j=1}^N |c_j(0) - c_j^0| \le K_1\alpha, \qquad x_j(0) - x_{j-1}(0) \ge \frac{L}{2}. \tag{38}$$
From (37) and (38), for $\alpha_0 < \alpha_{II}(A_0)$ and $L_0 > L_{II}(A_0) > 2\max(L_2, L_3, L_4)$ ($L_2$, $L_3$ and $L_4$ are defined in Lemmas 3 and 4), we have $\forall t \in [0,t^*]$,
$$c_1(t) \ge \sigma_0, \quad \dot x_1(t) \ge \sigma_0, \quad c_j(t) - c_{j-1}(t) \ge \sigma_0, \quad \dot x_j(t) - \dot x_{j-1}(t) \ge \sigma_0, \tag{39}$$
$$x_j(t) - x_{j-1}(t) \ge L/2 \ge \max(L_3, L_4), \qquad |\varepsilon(t)|_{H^1} \le \frac12\Big(\frac{\sigma_0}{8}\Big)^{\frac{1}{p-1}}. \tag{40}$$
Therefore, we can apply Lemmas 2, 3 and 4 for all $t \in [0,t^*]$. Let $\alpha_0 = \min(\alpha_I(A_0), \alpha_{II}(A_0))$ and $L_0 = \max(L_I(A_0), L_{II}(A_0))$. Now, our objective is to give a uniform upper bound on $|\varepsilon(t)|_{H^1}$ and $|c_j(t) - c_j(0)|$ on $[0,t^*]$ improving (37) for $A_0$ large enough. In the next lemma, we first obtain a control of the variation of $c_j(t)$ which is quadratic in $|\varepsilon(t)|_{H^1}$. This is the key step of the stability result, based on the monotonicity property of the local $L^2$ norm and energy constraints. It is essential at this point to have chosen by the modulation $\int R_j\varepsilon = 0$.
Lemma 5 (Quadratic control of the variation of $c_j(t)$). There exists $K_4 > 0$ independent of $A_0$, such that, $\forall t \in [0,t^*]$,
$$\sum_{j=1}^N |c_j(t) - c_j(0)| \le K_4\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big). \tag{41}$$
Proof.
Step 1. Energetic control. Let $\beta = \frac{2}{p-1}$. There exists $C > 0$ such that
$$\Big|\sum_{j=1}^N c_j(0)\big[c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big]\Big| \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big) + C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2. \tag{42}$$
Let us prove (42). By (16), we have
$$\Big|\sum_{j=1}^N \big[E(R_j(t)) - E(R_j(0))\big]\Big| \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big). \tag{43}$$
Since $E(Q_c) = -\frac{\kappa}{2}c^{\beta+1/2}\int Q^2$, where $\kappa = \frac{5-p}{p+3}$, we have
$$\sum_{j=1}^N \big[E(R_j(t)) - E(R_j(0))\big] = -\frac{\kappa}{2}\int Q^2 \sum_{j=1}^N \big[c_j^{\beta+1/2}(t) - c_j^{\beta+1/2}(0)\big].$$
By linearization, we have
$$c_j^{\beta+1/2}(t) - c_j^{\beta+1/2}(0) = \frac{2\beta+1}{2\beta-1}\,c_j(0)\big[c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big] + O\big([c_j(t) - c_j(0)]^2\big).$$
Note that $\frac{2\beta+1}{2\beta-1} = \frac{1}{\kappa}$. Therefore,
$$\Big|\sum_{j=1}^N \big[E(R_j(t)) - E(R_j(0))\big] + \frac12\int Q^2 \sum_{j=1}^N c_j(0)\big[c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big]\Big| \le C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2, \tag{44}$$
and from (43), we obtain (42).
Step 2. $L^2$ mass monotonicity at the right of every soliton. Let
$$d_j(t) = \sum_{k=j}^N c_k^{\beta-1/2}(t).$$
We claim
$$\int Q^2\,\big|d_j(t) - d_j(0)\big| \le -\int Q^2\,(d_j(t) - d_j(0)) + C\Big(\int\varepsilon^2(0) + e^{-\gamma_0 L}\Big). \tag{45}$$
Let us prove (45). Recall that using the notation of Sect. 2.3, we have $I_j(t) \le I_j(0) + K_3e^{-\gamma_0 L}$, where $I_j(t) = \int\psi(x - m_j(t))u^2(t,x)\,dx$.
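Since $\int R_j^2(t) = c_j^{\beta-1/2}(t)\int Q^2$, $\int R_j(t)\varepsilon(t) = 0$, by similar calculations as in Lemma 2, we have
$$\Big|I_j(t) - \int Q^2\,d_j(t) - \int\psi(\cdot - m_j(t))\varepsilon^2(t)\Big| \le Ce^{-\gamma_0 L}. \tag{46}$$
Therefore,
$$\int Q^2\,(d_j(t) - d_j(0)) \le \int\psi(\cdot - m_j(0))\varepsilon^2(0) - \int\psi(\cdot - m_j(t))\varepsilon^2(t) + Ce^{-\gamma_0 L}. \tag{47}$$
Since the second term on the right-hand side is negative, (45) follows easily. Note that by conservation of the $L^2$ norm $\int u^2(t) = \int u^2(0)$ and
$$\int u^2(t) = \int R^2(t) + \int\varepsilon^2(t) + 2\int R(t)\varepsilon(t) = \int R^2(t) + \int\varepsilon^2(t) = \int Q^2\,d_1(t) + \int\varepsilon^2(t) + O(e^{-\gamma_0 L}),$$
we obtain
$$\int Q^2\,(d_1(t) - d_1(0)) \le \int\varepsilon^2(0) - \int\varepsilon^2(t) + Ce^{-\gamma_0 L}. \tag{48}$$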
Step 3. Resummation argument. By the Abel transform, we have
$$\sum_{j=1}^N c_j(0)\big[c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big]$$
$$= \sum_{j=1}^{N-1} c_j(0)\big[d_j(t) - d_{j+1}(t) - (d_j(0) - d_{j+1}(0))\big] + c_N(0)\big[d_N(t) - d_N(0)\big]$$
$$= c_1(0)\big[d_1(t) - d_1(0)\big] + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))(d_j(t) - d_j(0)). \tag{49}$$
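The resummation identity (49) is a finite summation by parts, so it can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `abel_lhs` and `abel_rhs` are hypothetical, the list `a` stands for the increments $c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)$, and `c` stands for the speeds $c_j(0)$.

```python
import random

def abel_lhs(c, a):
    """Left-hand side of (49): sum_j c_j * a_j."""
    return sum(cj * aj for cj, aj in zip(c, a))

def abel_rhs(c, a):
    """Right-hand side of (49): c_1 d_1 + sum_{j>=2} (c_j - c_{j-1}) d_j,
    where d_j = sum_{k>=j} a_k are the tail sums."""
    n = len(a)
    d = [sum(a[j:]) for j in range(n)]  # d[j] is the tail sum starting at index j
    return c[0] * d[0] + sum((c[j] - c[j - 1]) * d[j] for j in range(1, n))

# Random ordered speeds (illustrative values only) and random increments.
random.seed(0)
N = 5
c0 = sorted(random.uniform(0.5, 3.0) for _ in range(N))
a = [random.uniform(-1.0, 1.0) for _ in range(N)]
gap = abs(abel_lhs(c0, a) - abel_rhs(c0, a))
print(gap)
```

The point of rewriting the sum this way is that every weight $c_1(0)$, $c_j(0) - c_{j-1}(0)$ on the right-hand side is at least $\sigma_0$, which is what allows the one-sided bound (45) on each $d_j$ to be used.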
Therefore, by Step 1,
$$-\Big(c_1(0)\big[d_1(t) - d_1(0)\big] + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))(d_j(t) - d_j(0))\Big) \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big) + C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2. \tag{50}$$
Since $c_1(0) \ge \sigma_0$, $c_j(0) - c_{j-1}(0) \ge \sigma_0$, by (45), we have
$$\sigma_0\sum_{j=1}^N |d_j(t) - d_j(0)| \le c_1(0)|d_1(t) - d_1(0)| + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))|d_j(t) - d_j(0)|$$
$$\le -\Big(c_1(0)\big[d_1(t) - d_1(0)\big] + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))(d_j(t) - d_j(0))\Big) + C\int\varepsilon^2(0) + Ce^{-\gamma_0 L}.$$
Thus, by (50), we have
$$\sum_{j=1}^N |d_j(t) - d_j(0)| \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big) + C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2.$$
Since
$$|c_j(t) - c_j(0)| \le C\big|c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big| \le C\big(|d_j(t) - d_j(0)| + |d_{j+1}(t) - d_{j+1}(0)|\big),$$
we obtain
$$\sum_{j=1}^N |c_j(t) - c_j(0)| \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big) + C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2.$$
Choosing a smaller $\alpha_0(A_0)$ and a larger $L_0(A_0)$, by (37), we assume $C|c_j(t) - c_j(0)| \le 1/2$ and so
$$\sum_{j=1}^N |c_j(t) - c_j(0)| \le C\big(|\varepsilon(t)|_{H^1}^2 + |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big). \tag{51}$$
Thus, Lemma 5 is proved.
Now, we prove the following lemma, giving uniform control on $|\varepsilon(t)|_{H^1}$ on $[0,t^*]$.
Lemma 6 (Control of $|\varepsilon(t)|_{H^1}$). There exists $K_5 > 0$ independent of $A_0$, such that, $\forall t \in [0,t^*]$,
$$|\varepsilon(t)|_{H^1}^2 \le K_5\big(|\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big).$$
Proof. It follows from direct calculation on the energy, and the previous estimates obtained by the Abel transform, freezing the $c_j(t)$ at the first order.
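By (16), (44), (49) and (51), we have
$$\frac12\int \varepsilon_x^2(t) - pR^{p-1}(t)\varepsilon^2(t) \le -\sum_{j=1}^N \big[E(R_j(t)) - E(R_j(0))\big] + K_2\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big)$$
$$\le \frac12\int Q^2\sum_{j=1}^N c_j(0)\big[c_j^{\beta-1/2}(t) - c_j^{\beta-1/2}(0)\big] + C\sum_{j=1}^N \big[c_j(t) - c_j(0)\big]^2 + K_2\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big)$$
$$\le \frac12\int Q^2\Big(c_1(0)\big[d_1(t) - d_1(0)\big] + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))(d_j(t) - d_j(0))\Big) + C\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big).$$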
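Therefore, using (47) and (48), and again Lemma 5, we have
$$\int \varepsilon_x^2(t) - pR^{p-1}(t)\varepsilon^2(t) \le -\Big(c_1(0)\int\varepsilon^2(t) + \sum_{j=2}^N (c_j(0) - c_{j-1}(0))\int\psi(x - m_j(t))\varepsilon^2(t)\Big) + C\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big)$$
$$\le -\int c(t,x)\varepsilon^2(t) + C\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big), \tag{52}$$
where $c(t,x) = c_1(t) + \sum_{j=2}^N (c_j(t) - c_{j-1}(t))\psi(x - m_j(t))$. By Lemma 4,
$$\int \varepsilon_x^2(t) - pR^{p-1}(t)\varepsilon^2(t) + c(t,x)\varepsilon^2(t) \ge \lambda_0|\varepsilon(t)|_{H^1}^2.$$
Therefore, from (52), we obtain
$$|\varepsilon(t)|_{H^1}^2 \le C\big(|\varepsilon(0)|_{H^1}^2 + |\varepsilon(t)|_{H^1}^3 + e^{-\gamma_0 L}\big),$$
and so
$$|\varepsilon(t)|_{H^1}^2 \le K_5\big(|\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big),$$
for some constant $K_5 > 0$, independent of $A_0$. Thus Lemma 6 is proved.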
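The last step of the proof of Lemma 6 absorbs the cubic term into the left-hand side. One way to read it, under the assumption (suggested by (37), but not spelled out in the text) that $C|\varepsilon(t)|_{H^1} \le 1/2$: writing $y = |\varepsilon(t)|_{H^1}$ and $a = |\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}$, the inequality $y^2 \le C(a + y^3)$ gives $y^2 \le 2Ca$, so $K_5 = 2C$ would do. The sketch below tests this implication on random samples; the function name `absorbed_bound` and the sampling ranges are illustrative choices only.

```python
import random

def absorbed_bound(C, a):
    """Bound on y**2 that follows from y**2 <= C*(a + y**3) when C*y <= 1/2."""
    return 2 * C * a

random.seed(1)
violations = 0
for _ in range(10000):
    C = random.uniform(1.0, 50.0)
    y = random.uniform(0.0, 1.0 / (2.0 * C))  # smallness assumption: C*y <= 1/2
    a = random.uniform(0.0, 1.0)
    hypothesis = y**2 <= C * (a + y**3)
    # Small relative slack guards against floating-point rounding only.
    conclusion = y**2 <= absorbed_bound(C, a) * (1 + 1e-12)
    if hypothesis and not conclusion:
        violations += 1
print(violations)
```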
We conclude the proof of Proposition 1 and of the stability result. By (38) and Lemmas 5 and 6, we have
$$\Big|u(t) - \sum_{j=1}^N Q_{c_j^0}(x - x_j(t))\Big|_{H^1} \le \Big|u(t) - \sum_{j=1}^N R_j(t)\Big|_{H^1} + \Big|\sum_{j=1}^N R_j(t) - \sum_{j=1}^N Q_{c_j^0}(x - x_j(t))\Big|_{H^1}$$
$$\le |\varepsilon(t)|_{H^1} + C\sum_{j=1}^N |c_j(t) - c_j^0|$$
$$\le |\varepsilon(t)|_{H^1} + C\sum_{j=1}^N |c_j(t) - c_j(0)| + C\sum_{j=1}^N |c_j(0) - c_j^0|$$
$$\le |\varepsilon(t)|_{H^1} + CK_4\big(|\varepsilon(0)|_{H^1}^2 + e^{-\gamma_0 L}\big) + CK_1\alpha$$
$$\le K_6\big(\alpha + e^{-\gamma_0 L/2}\big),$$
where $K_6 > 0$ is a constant independent of $A_0$. Choosing $A_0 = 4K_6$, we complete the proof of Proposition 1 and thus the proof of Theorem 1 (i).
4. Proof of the Asymptotic Stability Result
This section is devoted to the proof of the asymptotic stability result (Theorem 1 (ii)).
4.1. Asymptotic stability around the solitons. In this subsection, we prove the following asymptotic result on $\varepsilon(t)$ as $t \to +\infty$.
Proposition 2 (Convergence around solitons, p = 2, 3, 4). Under the assumptions of Theorem 1, the following is true:
(i) Convergence of $\varepsilon(t)$: $\forall j \in \{1,\ldots,N\}$,
$$\varepsilon(t, \cdot + x_j(t)) \rightharpoonup 0 \ \text{ in } H^1(\mathbb{R}) \text{ as } t \to +\infty. \tag{53}$$
(ii) Convergence of geometric parameters: there exist $0 < c_1^{+\infty} < \cdots < c_N^{+\infty}$, such that $c_j(t) \to c_j^{+\infty}$, $\dot x_j(t) \to c_j^{+\infty}$ as $t \to +\infty$.
The proof of this result is very similar to the proof of the asymptotic stability of a single soliton in Martel and Merle [16] for the subcritical case (see also the previous paper [14] concerning the critical case p = 5). The proof is based on the following rigidity result of solutions of (1) around solitons.
Theorem (Liouville property close to $R_{c_0}$ for p = 2, 3, 4 [16]). Let p = 2, 3 or 4, and let $c_0 > 0$. Let $u_0 \in H^1(\mathbb{R})$, and let $u(t)$ be the solution of (1) for all time $t \in \mathbb{R}$. There exists $\alpha_0 > 0$ such that if $|u_0 - R_{c_0}|_{H^1} < \alpha_0$, and if there exists $y(t)$ such that
$$\forall\delta_0 > 0,\ \exists A_0 > 0 \text{ such that } \forall t \in \mathbb{R},\quad \int_{|x| > A_0} u^2(t, x + y(t))\,dx \le \delta_0 \quad (L^2 \text{ compactness}), \tag{54}$$
then there exist $c^* > 0$, $x^* \in \mathbb{R}$ such that
$$\forall t \in \mathbb{R},\ \forall x \in \mathbb{R},\quad u(t,x) = Q_{c^*}(x - x^* - c^*t).$$
This result gives a classification of the solutions around the solitons that have a certain property of uniform localization of the $L^2$ mass around a center $y(t)$ (54). Let us give a few words on the proof of such a result (see [16]). First, (54) implies a much stronger property on $u(t)$:
$$\forall t, x \in \mathbb{R},\quad |u(t, x + y(t))| \le Ce^{-\theta|x|}, \tag{55}$$
for some $C, \theta > 0$, which is proved by using a functional of the type $I_j(t)$ in Sect. 2.2. Note that (55) is a purely nonlinear estimate. It implies strong localization properties in $H^1$, which reduces the nonlinear problem for $\alpha_0$ small enough to a similar Liouville problem on a linear equation: $w_t + (Lw)_x = 0$, where $L$ is the linearized operator $Lw = -w_{xx} + w - pQ^{p-1}w$. Finally, the linear Liouville property is proved by a Virial type quantity ($\int yw^2$) whose derivative in time involves an explicit quadratic form on $w$. In [16], we prove that this theorem implies the asymptotic stability of a 1-soliton in the following way. Suppose $t_n \to +\infty$ and $\tilde u_0$ satisfy that $u(t_n, x(t_n) + \cdot) \rightharpoonup \tilde u_0$ in $H^1$ as $n \to +\infty$. Then we can prove that the solution associated to initial data $\tilde u_0$ is $L^2$ compact in the sense of (54) and hence $\tilde u_0$ is a soliton. This concludes the proof.
Proof of Proposition 2 (i). Consider a solution $u(t)$ satisfying the assumptions of Theorem 1. Then, by Sect. 3, we know that $u(t)$ is uniformly close in $H^1(\mathbb{R})$ to the superposition of $N$ solitons for all time $t \ge 0$. With the decomposition introduced in Sect. 2, this is equivalent to saying that $\varepsilon(t)$ is uniformly small in $H^1(\mathbb{R})$ and $\sum_{j=1}^N |c_j(t) - c_j(0)|$ is uniformly small. Therefore, we can assume that, $\forall t \ge 0$, $c_1(t) \ge \sigma_0$, $c_j(t) - c_{j-1}(t) \ge \sigma_0$.
The proof of Proposition 2 is by contradiction. Let $j \in \{1,\ldots,N\}$. Assume that for some sequence $t_n \to +\infty$, we have
$$\varepsilon(t_n, \cdot + x_j(t_n)) \not\rightharpoonup 0 \ \text{ in } H^1(\mathbb{R}) \text{ as } n \to +\infty.$$
Since $0 < \sigma_0 < c_j(t) < \bar c$ and $|\varepsilon(t)|_{H^1} \le C$ for all $t \ge 0$, there exist $\tilde\varepsilon_0 \in H^1(\mathbb{R})$, $\tilde\varepsilon_0 \not\equiv 0$, and $\tilde c_0 > 0$ such that for a subsequence of $(t_n)$, still denoted $(t_n)$, we have
$$\varepsilon(t_n, \cdot + x_j(t_n)) \rightharpoonup \tilde\varepsilon_0 \ \text{ in } H^1(\mathbb{R}), \qquad c_j(t_n) \to \tilde c_0 \quad \text{as } n \to +\infty. \tag{56}$$
Moreover, by weak convergence and the stability result, $|\tilde\varepsilon_0|_{H^1} \le \sup_{t \ge 0}|\varepsilon(t)|_{H^1} \le C(\alpha_0 + e^{-\gamma_0 L_0})$, and therefore $|\tilde\varepsilon_0|_{H^1}$ is as small as we want by taking $\alpha_0$ small and $L_0$ large. Let now $\tilde u(0) = Q_{\tilde c_0} + \tilde\varepsilon_0$, and let $\tilde u(t)$ be the global solution of (1) for $t \in \mathbb{R}$, with $\tilde u(0)$ as initial data. Let $\tilde x(t)$ and $\tilde c(t)$ be the geometrical parameters associated to the solution $\tilde u(t)$ (apply the modulation theory for a solution close to a single soliton). We claim that the solution $\tilde u(t)$ is $L^2$ compact in the sense of (54).
Lemma 7 ($L^2$ compactness of the asymptotic solution).
$$\forall\delta_0 > 0,\ \exists A_0 > 0 \text{ such that } \forall t \in \mathbb{R},\quad \int_{|x| > A_0} \tilde u^2(t, x + \tilde x(t))\,dx \le \delta_0. \tag{57}$$
Assuming this lemma, we finish the proof of Proposition 2 (i). Indeed, by choosing $\alpha_0$ small enough and $L_0$ large enough, we can apply the Liouville theorem to $\tilde u(t)$. Therefore, there exist $c^* > 0$ and $x^* \in \mathbb{R}$, such that $\tilde u(t) = Q_{c^*}(x - x^* - c^*t)$. In particular, $\tilde u(0) = Q_{\tilde c_0} + \tilde\varepsilon_0 = Q_{c^*}(x - x^*)$. Since by weak convergence $\int\tilde\varepsilon_0 (Q_{\tilde c_0})_x = 0$, we have easily $x^* = 0$. Next, since $\int\tilde\varepsilon_0 Q_{\tilde c_0} = 0$, we have $c^* = \tilde c_0$ and so $\tilde\varepsilon_0 \equiv 0$. This is a contradiction. Thus Proposition 2 (i) is proved assuming Lemma 7.
The proof of Lemma 7 is based only on arguments of monotonicity of the $L^2$ mass in the spirit of [16, 17].
Proof of Lemma 7. We use the function $\psi$ introduced in Sect. 2.2. For $y_0 > 0$, we introduce two quantities:
$$J_L(t) = \int (1 - \psi(x - (x_j(t) - y_0)))u^2(t,x)\,dx, \qquad J_R(t) = \int\psi(x - (x_j(t) + y_0))u^2(t,x)\,dx. \tag{58}$$
The strategy of the proof is the following. We prove first that $J_L(t)$ is almost increasing and $J_R(t)$ is almost decreasing in time. Then, assuming by contradiction that $\tilde u(t)$ is not $L^2$ compact, using the convergence of $u(t)$ to $\tilde u(t)$ for all time, we prove that the $L^2$ norm of $u(t)$ in the compact set $[-y_0, y_0]$, for $y_0$ large enough, oscillates between two different values. This proves that there are infinitely many transfers of mass from the right-hand side of the soliton $j$ to the left-hand side of the soliton $j$. This is of course impossible since the $L^2$ norm of $u(t)$ is finite.
Step 1. Monotonicity on the right and on the left of a soliton. We claim
Claim. There exist $C_1, y_1 > 0$ such that $\forall y_0 > y_1$, $\forall t' \in [0,t]$,
$$J_L(t) \ge J_L(t') - C_1e^{-\gamma_0 y_0}, \qquad J_R(t) \le J_R(t') + C_1e^{-\gamma_0 y_0}. \tag{59}$$
We prove this claim. First note that it is sufficient to prove (59) for $J_L(t)$. Indeed, since $u(-t,-x)$ is also a solution of (1), and since $1 - \psi(-x) = \psi(x)$, we can argue backwards in time (from $t$ to $t'$) to obtain the result for $J_R(t)$. By using the same argument as in Lemma 3, we prove easily, for $y_0$ large enough, for all $0 < t' < t$,
$$\int\psi\big(\cdot - (x_j(t) - y_0 - \tfrac{\sigma_0}{2}(t - t'))\big)u^2(t) \le \int\psi(\cdot - (x_j(t') - y_0))u^2(t') + C_1e^{-\gamma_0 y_0} \le \int u^2(t') - J_L(t') + C_1e^{-\gamma_0 y_0}.$$
Since $\int u^2(t) = \int u^2(t')$ and $\int u^2(t) - J_L(t) = \int\psi(\cdot - (x_j(t) - y_0))u^2(t) \le \int\psi\big(\cdot - (x_j(t) - y_0 - \tfrac{\sigma_0}{2}(t - t'))\big)u^2(t)$, we obtain the result.
Step 2. Conclusion of the proof. Recall from [16] that we have stability of (1) by weak convergence in $H^1(\mathbb{R})$ in the following sense:
$$\forall t \in \mathbb{R},\quad u(t + t_n, \cdot + x_j(t + t_n)) \longrightarrow \tilde u(t, \cdot + \tilde x(t)) \ \text{ in } L^2_{loc}(\mathbb{R}) \text{ as } n \to +\infty. \tag{60}$$
This was proved in [16] by using the fact that the Cauchy problem for (1) is well posed both in $H^1(\mathbb{R})$ and in $H^{s^*}(\mathbb{R})$, for some $0 < s^* < 1$, for any p = 2, 3, 4 (see [9]). We prove Lemma 7 by contradiction. Let $m_0 = \int\tilde u^2(0) = \int\tilde u^2(t)$. Assume that there exists $\delta_0 > 0$ such that for any $y_0 > 0$, there exists $t_0(y_0) \in \mathbb{R}$, such that
$$\int_{|x| < 2y_0} \tilde u^2(t_0(y_0), x + \tilde x(t_0(y_0)))\,dx \le m_0 - \delta_0. \tag{61}$$
Fix $y_0 > 0$ large enough so that
$$\int(\psi(x + y_0) - \psi(x - y_0))\tilde u^2(0,x)\,dx \ge m_0 - \frac{1}{10}\delta_0, \qquad C_1e^{-\gamma_0 y_0} + m_0\sup_{|x| > 2y_0}\{\psi(x + y_0) - \psi(x - y_0)\} \le \frac{1}{10}\delta_0. \tag{62}$$
Assume that $t_0 = t_0(y_0) > 0$ and, by possibly considering a subsequence of $(t_n)$, that $\forall n$, $t_{n+1} \ge t_n + t_0$. Observe that, since $0 < \psi < 1$ and $\psi' > 0$, by the choice of $y_0$ and (61), we have
$$\int(\psi(x - (\tilde x(t_0) - y_0)) - \psi(x - (\tilde x(t_0) + y_0)))\tilde u^2(t_0,x)\,dx$$
$$\le \int_{|x| < 2y_0}\tilde u^2(t_0, x + \tilde x(t_0))\,dx + m_0\sup_{|x| > 2y_0}\{\psi(x + y_0) - \psi(x - y_0)\}$$
$$\le \int_{|x| < 2y_0}\tilde u^2(t_0, x + \tilde x(t_0))\,dx + \frac{1}{10}\delta_0 \le m_0 - \frac{9}{10}\delta_0. \tag{63}$$
Then, by (62), (63) and (60), there exists $N_0 > 0$ large enough so that $\forall n \ge N_0$,
$$\int(\psi(x - (x_j(t_n) - y_0)) - \psi(x - (x_j(t_n) + y_0)))u^2(t_n,x)\,dx \ge m_0 - \frac15\delta_0, \tag{64}$$
$$\int(\psi(x - (x_j(t_n + t_0) - y_0)) - \psi(x - (x_j(t_n + t_0) + y_0)))u^2(t_n + t_0,x)\,dx \le m_0 - \frac45\delta_0. \tag{65}$$
Recall that from Step 1, and the choice of $y_0$, we have $J_R(t_n + t_0) \le J_R(t_n) + \frac{1}{10}\delta_0$. Therefore, by conservation of the $L^2$ norm and (65), (64), we have
$$J_L(t_n + t_0) \ge J_L(t_n) + \frac12\delta_0.$$
Since $J_L(t_{n+1}) \ge J_L(t_n + t_0) - \frac{1}{10}\delta_0$ by Step 1, we finally obtain
$$\forall n \ge N_0,\quad J_L(t_{n+1}) \ge J_L(t_n) + \frac25\delta_0.$$
Of course, this is a contradiction. Thus the proof of Lemma 7 is complete.
Proof of Proposition 2 (ii). The proof is similar to the proof of Proposition 3 in [16]. It follows again from monotonicity arguments and the fact that we consider the subcritical case $1 < p < 5$. Let $\delta > 0$ be arbitrary. Since $\int R_j^2(t) = c_j^{\frac{5-p}{2(p-1)}}(t)\int Q^2$ and $\varepsilon(t, \cdot + x_j(t)) \to 0$ in $L^2_{loc}$ as $t \to +\infty$, there exist $T_1(\delta) > 0$ and $y_1(\delta)$ such that $\forall t > T_1(\delta)$, $\forall y_0 > y_1(\delta)$,
$$\Big|\int(\psi(x - (x_j(t) - y_0)) - \psi(x - (x_j(t) + y_0)))u^2(t,x)\,dx - c_j^{\frac{5-p}{2(p-1)}}(t)\int Q^2\Big| \le \delta.$$
By Step 1 of the proof of Lemma 7, there exists $y_2(\delta)$, such that we have, for all $0 < t' < t$, $\forall y_0 > y_2(\delta)$,
$$J_L(t) \ge J_L(t') - \delta, \qquad J_R(t) \le J_R(t') + \delta.$$
Fix $y_0 = \max(y_1(\delta), y_2(\delta))$, it follows that there exist $T_2(\delta)$, $J_L^{+\infty} \ge 0$ and $J_R^{+\infty} \ge 0$ such that $\forall t \ge T_2(\delta)$,
$$|J_L(t) - J_L^{+\infty}| \le 2\delta, \qquad |J_R(t) - J_R^{+\infty}| \le 2\delta.$$
Therefore, by conservation of $L^2$ mass, we have, for all $0 < \max(T_1, T_2) < t' < t$,
$$\Big|c_j^{\frac{5-p}{2(p-1)}}(t) - c_j^{\frac{5-p}{2(p-1)}}(t')\Big| \le C\delta.$$
Since $\delta$ is arbitrary, it follows that $c_j^{\frac{5-p}{2(p-1)}}(t)$ has a limit as $t \to +\infty$. Thus there exists $c_j^{+\infty} > 0$ such that $c_j(t) \to c_j^{+\infty}$ as $t \to +\infty$. The fact that $\dot x_j(t) \to c_j^{+\infty}$ is a direct consequence of (15).
4.2. Asymptotic behavior on $x > ct$. In this subsection, using the same argument of monotonicity of $L^2$ mass, we prove the following proposition.
Proposition 3 (Convergence for $x > c_1^0t/10$). Under the assumptions of Theorem 1, the following is true:
$$|\varepsilon(t)|_{L^2(x > c_1^0t/10)} \to 0 \quad \text{as } t \to +\infty. \tag{66}$$
Proof. By arguing backwards in time (from $t$ to $0$) and using the conservation of the $L^2$ norm, we have
$$\int\psi(\cdot - (x_N(t) + y_0))u^2(t) \le \int\psi\big(\cdot - (x_N(0) + \tfrac{\sigma_0}{2}t + y_0)\big)u^2(0) + C_1e^{-\gamma_0 y_0}.$$
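Therefore,
$$\int_{x > x_N(t) + y_0}\varepsilon^2(t) \le 2\int\psi\big(\cdot - (x_N(0) + \tfrac{\sigma_0}{2}t + y_0)\big)u^2(0) + Ce^{-\gamma_0 y_0}.$$
Since for fixed $y_0$, $\int_{x_N(t) < x < x_N(t) + y_0}\varepsilon^2(t) \to 0$ as $t \to +\infty$, we conclude $\int_{x > x_N(t)}\varepsilon^2(t) \to 0$ as $t \to +\infty$. Now, let us prove $\int_{x > x_j(t)}\varepsilon^2(t) \to 0$ as $t \to +\infty$ by backwards induction on $j$. Assume that for $j_0 \in \{2,\ldots,N\}$, we have $\int_{x > x_{j_0}(t)}\varepsilon^2(t) \to 0$ as $t \to +\infty$. For $t \ge 0$ large enough, there exists $0 < t' = t'(t) < t$, satisfying
$$x_{j_0}(t') - x_{j_0-1}(t') - \frac{\sigma_0}{2}(t - t') = 2y_0.$$
Indeed, for $t$ large enough, $x_{j_0}(t) - x_{j_0-1}(t) \ge \frac{\sigma_0}{2}t \ge 2y_0$, and $x_{j_0}(0) - x_{j_0-1}(0) - \frac{\sigma_0}{2}t < 0 < 2y_0$. Then,
$$\int\psi(\cdot - (x_{j_0-1}(t) + y_0))u^2(t) \le \int\psi\big(\cdot - (x_{j_0-1}(t') + \tfrac{\sigma_0}{2}(t - t') + y_0)\big)u^2(t') + Ce^{-\gamma_0 y_0} \le \int\psi(\cdot - (x_{j_0}(t') - y_0))u^2(t') + Ce^{-\gamma_0 y_0}. \tag{67}$$
Let $\delta > 0$ be arbitrary. By $L^2_{loc}$ convergence of $\varepsilon(t, \cdot + x_{j_0}(t))$ and the induction assumption, we have, for fixed $y_0$,
$$\int_{x > x_{j_0}(t) + 2y_0}\varepsilon^2(t) \to 0 \quad \text{as } t \to +\infty.$$
Therefore, by Proposition 2, there exists $T = T(\delta) > 0$, such that $\forall t > T$, $\forall y_0 > y_0(\delta)$,
$$\Big|\int\psi(\cdot - (x_{j_0}(t) - y_0))u^2(t) - \int Q^2\sum_{k=j_0}^N (c_k^{+\infty})^{\frac{5-p}{2(p-1)}}\Big| \le \delta. \tag{68}$$
Moreover, since $t'(t) \to +\infty$ as $t \to +\infty$, by possibly taking a larger $T(\delta)$, we also have
$$\Big|\int\psi(\cdot - (x_{j_0}(t') - y_0))u^2(t') - \int Q^2\sum_{k=j_0}^N (c_k^{+\infty})^{\frac{5-p}{2(p-1)}}\Big| \le \delta, \tag{69}$$
and so
$$\Big|\int\psi(\cdot - (x_{j_0}(t) - y_0))u^2(t) - \int\psi(\cdot - (x_{j_0}(t') - y_0))u^2(t')\Big| \le 2\delta. \tag{70}$$
Thus, by (67), we have
$$\int\psi(\cdot - (x_{j_0-1}(t) + y_0))u^2(t) \le \int\psi(\cdot - (x_{j_0}(t) - y_0))u^2(t) + 2\delta + Ce^{-\gamma_0 y_0}. \tag{71}$$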
Since $\psi(x) \ge 1/2$ for $x > 0$, by the decay properties of $Q$ and (71), we obtain
$$\int_{x > x_{j_0-1}(t) + y_0}\varepsilon^2(t) \le 2\Big(\int\psi(\cdot - (x_{j_0-1}(t) + y_0))u^2(t) - \int\psi(\cdot - (x_{j_0}(t) - y_0))u^2(t)\Big) + Ce^{-\gamma_0 y_0} \le 4\delta + Ce^{-\gamma_0 y_0}.$$
Thus, $\int_{x > x_{j_0-1}(t)}\varepsilon^2(t) \to 0$ as $t \to +\infty$.
Finally, we prove $\int_{x > c_1^0t/10}\varepsilon^2(t) \to 0$ as $t \to +\infty$. Indeed, let $0 < t' = t'(t) < t$ be such that $x_1(t') - \frac{c_1^0}{20}(t + t') = y_0$. Then, for $\sup_{t \ge 0}|\varepsilon(t)|_{H^1}$ small enough,
$$\int\psi\Big(x - \frac{c_1^0}{10}t\Big)u^2(t) \le \int\psi\Big(x - \Big(\frac{c_1^0}{10}t' + \frac{c_1^0}{20}(t - t')\Big)\Big)u^2(t') + Ce^{-\gamma_0 y_0} \le \int\psi(x - (x_1(t') - y_0))u^2(t') + Ce^{-\gamma_0 y_0}.$$
Arguing as before, this is enough to conclude the proof.
Proof of Corollary 1. Note first that
$$\Big|U^{(N)}(\cdot\,; c_j^0, -y_j) - \sum_{j=1}^N Q_{c_j^0}(\cdot - y_j)\Big|_{H^1} \to 0 \quad \text{as } \inf(y_{j+1} - y_j) \to +\infty. \tag{72}$$
For $\gamma_0$, $A_0$, $L_0$ and $\alpha_0$ as in the statement of Theorem 1, let $\alpha < \alpha_0$, $L > L_0$ be such that $A_0(\alpha + e^{-\gamma_0 L}) < \delta_1/2$ and
$$\Big|U^{(N)}(\cdot\,; c_j^0, -y_j) - \sum_{j=1}^N Q_{c_j^0}(\cdot - y_j)\Big|_{H^1} \le \delta_1/2, \quad \text{for } y_{j+1} - y_j > L. \tag{73}$$
Let $v(t,x) = U^{(N)}(x; c_j^0, -(x_j^0 + c_j^0t))$ be an $N$-soliton solution. Let $T > 0$ be such that
$$\forall t \ge T,\quad \Big|v(t) - \sum_{j=1}^N Q_{c_j^0}(\cdot - (x_j^0 + c_j^0t))\Big|_{H^1} \le \alpha/2, \tag{74}$$
and
$$\forall j,\quad x_{j+1}^0 + c_{j+1}^0T \ge x_j^0 + c_j^0T + 2L.$$
By continuous dependence of the solution of (1) with respect to the initial data (see [9]), there exists $\alpha_1 > 0$ such that if $|u(0) - v(0)|_{H^1} \le \alpha_1$, then $|u(T) - v(T)|_{H^1} \le \alpha/2$. Therefore, by (74),
$$\Big|u(T) - \sum_{j=1}^N Q_{c_j^0}(\cdot - (x_j^0 + c_j^0T))\Big|_{H^1} \le \alpha.$$
Thus, by Theorem 1 (i), there exists $x_j(t)$, for all $t \ge T$ such that
$$\forall t \ge T,\quad \Big|u(t) - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j(t))\Big|_{H^1} \le A_0\big(\alpha + e^{-\gamma_0 L}\big) < \delta_1/2.$$
Moreover, $x_{j+1}(t) > x_j(t) + L$. Together with (73), this gives the stability result. Finally, Theorem 1 (ii) and (72) prove the asymptotic stability of the family of $N$-solitons.
Appendix: Modulation of a Solution Close to the Sum of N Solitons
In this appendix, we prove the following lemma. Let $0 < c_1^0 < \cdots < c_N^0$, $\sigma_0 = \frac12\min(c_1^0, c_2^0 - c_1^0, c_3^0 - c_2^0, \ldots, c_N^0 - c_{N-1}^0)$. For $\alpha, L > 0$, we consider the neighborhood of size $\alpha$ of the superposition of $N$ solitons of speed $c_j^0$, located at a distance larger than $L$,
$$U(\alpha, L) = \Big\{u \in H^1(\mathbb{R});\ \inf_{x_j > x_{j-1} + L}\Big|u - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j)\Big|_{H^1} \le \alpha\Big\}. \tag{75}$$
(Note that functions in $U(\alpha, L)$ have no time dependency.)
Lemma 8 (Choice of the modulation parameters). There exist $\alpha_1 > 0$, $L_1 > 0$ and unique $C^1$ functions $(c_j, x_j) : U(\alpha_1, L_1) \to (0,+\infty) \times \mathbb{R}$, such that if $u \in U(\alpha_1, L_1)$, and
$$\varepsilon(x) = u(x) - \sum_{j=1}^N Q_{c_j}(x - x_j), \tag{76}$$
then
$$\int Q_{c_i}(x - x_i)\varepsilon(x)\,dx = \int(Q_{c_i})_x(x - x_i)\varepsilon(x)\,dx = 0. \tag{77}$$
Moreover, there exists $K_1 > 0$ such that if $u \in U(\alpha, L)$, with $0 < \alpha < \alpha_1$, $L > L_1$, then
$$|\varepsilon|_{H^1} + \sum_{j=1}^N |c_j - c_j^0| \le K_1\alpha, \qquad x_j > x_{j-1} + L - K_1\alpha. \tag{78}$$
Proof. Let $u \in U(\alpha, L)$. It is clear that for $\alpha$ small enough and $L$ large enough, the infimum
$$\inf_{x_j \in \mathbb{R}}\Big|u - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j)\Big|_{H^1}$$
is attained for $(x_j)$ satisfying $x_j > x_{j-1} + L - C\alpha$, for some constant $C > 0$ independent of $L$ and $\alpha$. By using standard arguments involving the implicit function theorem, there exist $\alpha_1, L_1 > 0$ such that there exist unique $C^1$ functions $(r_j) : U(\alpha_1, L_1) \to \mathbb{R}$, such that for all $u \in U(\alpha, L)$, for $0 < \alpha < \alpha_1$, $L > L_1$, we have
$$\Big|u - \sum_{j=1}^N Q_{c_j^0}(\cdot - r_j(u))\Big|_{H^1} = \inf_{x_j \in \mathbb{R}}\Big|u - \sum_{j=1}^N Q_{c_j^0}(\cdot - x_j)\Big|_{H^1} \le \alpha.$$
Moreover, $r_j(u) - r_{j-1}(u) > L - C\alpha$. For some $c_j$, $y_j$, $u \in H^1(\mathbb{R})$, let
$$Q_{c_j,y_j}(x) = Q_{c_j}(x - r_j(u) - y_j), \qquad \varepsilon(x) = u(x) - \sum_{j=1}^N Q_{c_j,y_j}(x).$$
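Define the following functionals:
$$\rho^{1,j}(c_1,\ldots,c_N,y_1,\ldots,y_N,u) = \int Q_{c_j,y_j}(x)\varepsilon(x)\,dx, \qquad \rho^{2,j}(c_1,\ldots,c_N,y_1,\ldots,y_N,u) = \int(Q_{c_j,y_j})_x(x)\varepsilon(x)\,dx,$$
and $\rho = (\rho^{1,1}, \rho^{2,1}, \ldots, \rho^{1,N}, \rho^{2,N})$. Let $M_0 = \big(c_1^0,\ldots,c_N^0, 0,\ldots,0, \sum_{j=1}^N Q_{c_j^0,0}\big)$. We claim the following.
Claim. (i) $\forall j$,
$$\frac{\partial\rho^{1,j}}{\partial c_j}(M_0) = -\frac{5-p}{4(p-1)}(c_j^0)^{\frac{7-3p}{2(p-1)}}\int Q^2, \qquad \frac{\partial\rho^{1,j}}{\partial y_j}(M_0) = 0,$$
$$\frac{\partial\rho^{2,j}}{\partial c_j}(M_0) = 0, \qquad \frac{\partial\rho^{2,j}}{\partial y_j}(M_0) = (c_j^0)^{\frac{p+3}{2(p-1)}}\int Q_x^2.$$
(ii) $\forall j \ne k$,
$$\Big|\frac{\partial\rho^{1,j}}{\partial y_k}(M_0)\Big| + \Big|\frac{\partial\rho^{1,j}}{\partial c_k}(M_0)\Big| + \Big|\frac{\partial\rho^{2,j}}{\partial y_k}(M_0)\Big| + \Big|\frac{\partial\rho^{2,j}}{\partial c_k}(M_0)\Big| \le Ce^{-\sqrt{\sigma_0}L/2}.$$
Proof of the claim. Since
$$\frac{\partial Q_{c_j,y_j}}{\partial c_j}\Big|_{(c_j^0,0)} = \frac{1}{2c_j^0}\Big(\frac{2}{p-1}Q_{c_j^0,0} + (x - r_j)(Q_{c_j^0,0})_x\Big), \qquad \frac{\partial Q_{c_j,y_j}}{\partial y_j}\Big|_{(c_j^0,0)} = -(Q_{c_j^0,0})_x,$$
we have by direct calculations:
$$\frac{\partial\rho^{1,j}}{\partial c_j}(M_0) = -\int Q_{c_j^0,0}\,\frac{\partial Q_{c_j,y_j}}{\partial c_j}\Big|_{(c_j^0,0)} = -\frac{1}{2c_j^0}\int Q_{c_j^0,0}\Big(\frac{2}{p-1}Q_{c_j^0,0} + (x - r_j)(Q_{c_j^0,0})_x\Big)$$
$$= -\frac12(c_j^0)^{\frac{7-3p}{2(p-1)}}\int Q\Big(\frac{2}{p-1}Q + xQ_x\Big) = -\frac12(c_j^0)^{\frac{7-3p}{2(p-1)}}\,\frac{5-p}{2(p-1)}\int Q^2,$$
by change of variable and integration by parts. For $j \ne k$,
$$\Big|\frac{\partial\rho^{1,j}}{\partial c_k}(M_0)\Big| = \Big|\frac{1}{2c_k^0}\int Q_{c_j^0,0}\Big(\frac{2}{p-1}Q_{c_k^0,0} + (x - r_k)(Q_{c_k^0,0})_x\Big)\Big| \le C\int e^{-\sqrt{\sigma_0}(|x - r_j| + |x - r_k|)}\,dx \le e^{-\sqrt{\sigma_0}|r_j - r_k|/2} \le Ce^{-\sqrt{\sigma_0}L/2}.$$
The rest is done in a similar way, using $\int QQ_x = 0$, and $\int(Q_c)_x^2 = c^{\frac{p+3}{2(p-1)}}\int Q_x^2$.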
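The scaling identities used in the claim above and in Sect. 3 can be checked numerically. The sketch below rests on assumptions that are not stated in this part of the text: the closed form $Q(x) = \big(\frac{p+1}{2\cosh^2((p-1)x/2)}\big)^{1/(p-1)}$ for the solution of $Q'' + Q^p = Q$, the scaling $Q_c(x) = c^{1/(p-1)}Q(\sqrt{c}\,x)$, and the energy $E(u) = \frac12\int u_x^2 - \frac{1}{p+1}\int u^{p+1}$. The helper names (`Q`, `Qx`, `integrate`, `check`) are illustrative only. Each returned residual should be close to zero.

```python
import math

def Q(x, p):
    """Assumed closed form of the ground state of Q'' + Q**p = Q."""
    return ((p + 1) / (2 * math.cosh((p - 1) * x / 2) ** 2)) ** (1 / (p - 1))

def Qx(x, p):
    """Derivative of Q, using Q' = -Q * tanh((p-1)x/2)."""
    return -Q(x, p) * math.tanh((p - 1) * x / 2)

def integrate(f, half_width=30.0, n=6000):
    """Composite trapezoid rule on [-half_width, half_width]."""
    h = 2 * half_width / n
    total = 0.5 * (f(-half_width) + f(half_width))
    for i in range(1, n):
        total += f(-half_width + i * h)
    return total * h

def check(p, c):
    """Residuals of the scaling identities for power p and speed c."""
    beta = 2 / (p - 1)
    kappa = (5 - p) / (p + 3)
    amp, root = c ** (1 / (p - 1)), math.sqrt(c)
    Qc = lambda x: amp * Q(root * x, p)            # soliton profile Q_c
    Qcx = lambda x: amp * root * Qx(root * x, p)   # its derivative
    mass = integrate(lambda x: Q(x, p) ** 2)
    grad = integrate(lambda x: Qx(x, p) ** 2)
    energy_c = integrate(lambda x: 0.5 * Qcx(x) ** 2 - Qc(x) ** (p + 1) / (p + 1))
    by_parts = integrate(lambda x: Q(x, p) * (beta * Q(x, p) + x * Qx(x, p)))
    return {
        # int Q_c^2 = c^(beta - 1/2) int Q^2, with beta - 1/2 = (5-p)/(2(p-1))
        "mass": integrate(lambda x: Qc(x) ** 2) - c ** (beta - 0.5) * mass,
        # int (Q_c)_x^2 = c^((p+3)/(2(p-1))) int Q_x^2
        "grad": integrate(lambda x: Qcx(x) ** 2) - c ** ((p + 3) / (2 * (p - 1))) * grad,
        # E(Q_c) = -(kappa/2) c^(beta + 1/2) int Q^2
        "energy": energy_c + 0.5 * kappa * c ** (beta + 0.5) * mass,
        # int Q (2/(p-1) Q + x Q_x) = (5-p)/(2(p-1)) int Q^2
        "by_parts": by_parts - (5 - p) / (2 * (p - 1)) * mass,
    }

for p in (2, 3, 4):
    residuals = check(p, c=1.7)
    print(p, max(abs(r) for r in residuals.values()))
```

The exponent $\frac{5-p}{2(p-1)}$ is positive exactly in the subcritical range $p < 5$, which is why the diagonal entry $\partial\rho^{1,j}/\partial c_j(M_0)$ is nonzero there.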
It follows that $\nabla\rho(M_0) = D + P$, where $D$ is a diagonal matrix with nonzero coefficients of order one on the diagonal, and $\|P\| \le Ce^{-\sqrt{\sigma_0}L/2}$. Therefore, for $L$ large enough, the absolute value of the Jacobian of $\rho$ at $M_0$ is larger than a positive constant depending only on the $c_j^0$. Thus, by the implicit function theorem, by possibly taking a smaller $\alpha_1$, there exist $C^1$ functions $(c_j, y_j)$ of $u \in U(\alpha_1, L_1)$ in a neighborhood of $(c_1^0,\ldots,c_N^0,0,\ldots,0)$ such that $\rho(c_1,\ldots,c_N,y_1,\ldots,y_N,u) = 0$. Moreover, for some constant $K_1 > 0$, if $u \in U(\alpha, L_1)$, where $0 < \alpha < \alpha_1$, then
$$\sum_{j=1}^N |c_j - c_j^0| + \sum_{j=1}^N |y_j| \le K_1\alpha.$$
The fact that $|\varepsilon|_{H^1} \le K_1\alpha$ then follows from its definition. Finally, we choose $x_j(u) = r_j(u) + y_j(u)$.
Acknowledgement. Part of this work was done when Tai-Peng Tsai was visiting the University of Cergy–Pontoise, whose hospitality is gratefully acknowledged.
References
1. Benjamin, T.B.: The stability of solitary waves. Proc. Roy. Soc. London A 328, 153–183 (1972)
2. Bona, J.L., Souganidis, P.E., Strauss, W.A.: Stability and instability of solitary waves of Korteweg–de Vries type. Proc. R. Soc. Lond. 411, 395–412 (1987)
3. Buslaev, V.S., Perelman, G.S.: On the stability of solitary waves for nonlinear Schrödinger equations. (English) In: Nonlinear Evolution Equations, Uraltseva, N.N. ed., Transl., Ser. 2, Am. Math. Soc. 164, Providence, RI: Am. Math. Soc., 1995, pp. 75–98
4. Dix, D.B., McKinney, W.R.: Numerical computations of self-similar blow up solutions of the generalized Korteweg–de Vries equation. Diff. Int. Eq. 11, 679–723 (1998)
5. Ei, S.-I., Ohta, T.: Equation of motion for interacting pulses. Phys. Rev. E 50, 4672–4678 (1994)
6. Ginibre, J., Tsutsumi, Y.: Uniqueness of solutions for the generalized Korteweg–de Vries equation. SIAM J. Math. Anal. 20, 1388–1425 (1989)
7. Grillakis, M., Shatah, J., Strauss, W.: Stability theory of solitary waves in the presence of symmetry. J. Funct. Anal. 74, 160–197 (1987)
8. Kato, T.: On the Cauchy problem for the (generalized) Korteweg–de Vries equation. Adv. in Math. Supplementary Studies, Studies in Applied Math. 8, 93–128 (1983)
9. Kenig, C.E., Ponce, G., Vega, L.: Well-posedness and scattering results for the generalized Korteweg–de Vries equation via the contraction principle. Comm. Pure Appl. Math. 46, 527–620 (1993)
10. Korteweg, D.J., de Vries, G.: On the change of form of long waves advancing in a rectangular canal, and on a new type of long stationary waves. Philos. Mag. (5) 39, 422–443 (1895)
11. Lamb Jr., G.L.: Elements of Soliton Theory. New York: John Wiley & Sons, 1980
12. Lax, P.D.: Integrals of nonlinear equations of evolution and solitary waves. Comm. Pure Appl. Math. 21, 467–490 (1968)
13. Maddocks, J.H., Sachs, R.L.: On the stability of KdV multi-solitons. Comm. Pure Appl. Math. 46, 867–901 (1993)
14. Martel, Y., Merle, F.: A Liouville Theorem for the critical generalized Korteweg–de Vries equation. J. Math. Pures Appl. 79, 339–425 (2000)
15. Martel, Y., Merle, F.: Instability of solitons for the critical generalized Korteweg–de Vries equation. Geom. Funct. Anal. 11, 74–123 (2001)
16. Martel, Y., Merle, F.: Asymptotic stability of solitons for subcritical generalized KdV equations. Arch. Rational Mech. Anal. 157, 219–254 (2001)
17. Martel, Y., Merle, F.: Stability of the blow up profile and lower bounds on the blow up rate for the critical generalized KdV equation. Ann. of Math. 155, 235–280 (2002)
18. Martel, Y., Merle, F.: Blow up in finite time and dynamics of blow up solutions for the L2-critical generalized KdV equation. J. Am. Math. Soc. 15, 617–664 (2002)
19. Merle, F.: Construction of solutions with exactly k blow up points for the Schrödinger equation with critical nonlinearity. Commun. Math. Phys. 129, 223–240 (1990)
20. Merle, F.: Existence of blow-up solutions in the energy space for the critical generalized KdV equation. J. Am. Math. Soc. 14, 555–578 (2001)
21. Miura, R.M.: The Korteweg–de Vries equation: a survey of results. SIAM Rev. 18, 412–459 (1976)
22. Pego, R.L., Weinstein, M.I.: Asymptotic stability of solitary waves. Commun. Math. Phys. 164, 305–349 (1994)
23. Perelman, G.S.: Some results on the scattering of weakly interacting solitons for nonlinear Schrödinger equations. (English) In: Spectral Theory, Microlocal Analysis, Singular Manifolds. Demuth, Michael et al., eds., Math. Top. 14, Berlin: Akademie Verlag, 1997, pp. 78–137
24. Weinstein, M.I.: Modulational stability of ground states of nonlinear Schrödinger equations. SIAM J. Math. Anal. 16, 472–491 (1985)
25. Weinstein, M.I.: Lyapunov stability of ground states of nonlinear dispersive evolution equations. Comm. Pure Appl. Math. 39, 51–68 (1986)
Communicated by P. Constantin
Commun. Math. Phys. 231, 375–390 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0725-0
Communications in Mathematical Physics
Eigenvalue Boundary Problems for the Dirac Operator
Oussama Hijazi (1), Sebastián Montiel (2,∗), Antonio Roldán (2)
(1) Institut Élie Cartan, Université Henri Poincaré, Nancy I, B.P. 239, 54506 Vandœuvre-lès-Nancy Cedex, France. E-mail: [email protected]
(2) Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain. E-mail: [email protected]; [email protected]
Received: 12 November 2001 / Accepted: 25 June 2002 Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: On a compact Riemannian spin manifold with mean-convex boundary, we analyse the ellipticity and the symmetry of four boundary conditions for the fundamental Dirac operator including the (global) APS condition and a Riemannian version of the (local) MIT bag condition. We show that Friedrich's inequality for the eigenvalues of the Dirac operator on closed spin manifolds holds for the corresponding four eigenvalue boundary problems. More precisely, we prove that, for both the APS and the MIT conditions, the equality cannot be achieved, and for the other two conditions, the equality characterizes respectively half-spheres and domains bounded by minimal hypersurfaces in manifolds carrying non-trivial real Killing spinors.
1. Introduction
The fundamental result of Lichnerowicz [Li] in the sixties regarding the spectrum of the Dirac operator $D$ on closed spin manifolds, revealed subtle information on both the geometry and the topology of such manifolds (see for instance [BFGK, BHMM, Fr2, LM] and references therein). A basic lower bound for the eigenvalues $\lambda$ of the Dirac operator is the Friedrich inequality [Fr1], which says that
$$\lambda^2 \ge \frac{n}{4(n-1)}\inf_M R, \tag{F}$$
where $R$ is the scalar curvature of the manifold and $n$ its dimension. This inequality is sharp and the equality characterizes those geometries carrying non-trivial real Killing spinor fields (see also [Bä1]).
∗ Research of S. Montiel is partially supported by a Spanish MCyT grant No. BFM2001-2967 and by European Union FEDER funds
The Dirac operator has also been considered when the compact manifold has non-empty boundary in order to look for corresponding ellipticity and index theorems [APS, BW, GLP], to study its determinant [Bun, FGMSS] and to model physical situations in which the particle fields are confined in a bounded region of space [CJJTW, CJJT, J]. Each one of these situations requires a particular boundary condition to be imposed. However, from an analytical point of view, the ellipticity and other related properties of these boundary conditions are usually studied in an abstract and unified setting [BW, Hö, Se] based on the Calderón and Seeley theory of pseudo-differential operators. In this paper, we study the spectrum of the fundamental Dirac operator on compact Riemannian spin manifolds with non-empty boundary, under four different boundary conditions: the global Atiyah-Patodi-Singer (APS) condition associated with the spectral resolution of the intrinsic Dirac operator on the boundary hypersurface; the local condition associated with a chirality (CHI) operator on the manifold (for example, if its dimension is even or if it is a space-like hypersurface in a Lorentzian manifold); the Riemannian version of the so-called (local) MIT bag condition; and finally a new global boundary condition obtained by a suitable modification of the APS condition (mAPS). We show that these four conditions satisfy ellipticity criteria and the corresponding boundary problems are well-posed in the sense of Seeley [Se]. We prove (see Theorems 2, 3, 4 and 5) that the three APS, CHI and mAPS conditions make $D$ a symmetric operator and so the corresponding spectra are real sequences tending to $+\infty$ and $-\infty$. Instead, under the MIT condition, the spectrum of the Dirac operator is an unbounded discrete set of complex numbers with positive imaginary part. Finally we prove that, under the four boundary conditions, one has the same lower bound (F) in terms of the minimum of the scalar curvature as in the closed case, provided that the mean curvature of the boundary hypersurface is non-negative.
(In fact, in the case of the APS and CHI conditions this is proved in [HMZ1, HMZ2], by other means.) The four conditions have different behavior with respect to the equality in (F). In fact, we show that such an equality is never achieved for the APS and MIT conditions, that it occurs for the CHI boundary condition if and only if the manifold is a half-sphere and that it is achieved for the mAPS condition if and only if the manifold admits a non-trivial real Killing spinor field and the boundary is a minimal hypersurface (this was a principal motivation to look for and introduce this new boundary condition). For example, all the domains enclosed in a sphere by embedded minimal hypersurfaces have the same first eigenvalue for the Dirac operator under the mAPS condition.
2. Riemannian Spin Manifolds and Their Boundaries
Consider an $n$-dimensional Riemannian spin manifold $M$ with non-empty boundary $\partial M$ and denote by $\langle\ ,\ \rangle$ its scalar product and by $\nabla$ its corresponding Levi-Civita connection on the tangent bundle $TM$. We fix a spin structure (and so a corresponding orientation) on the manifold $M$ and denote by $SM$ the associated spinor bundle, which is a complex vector bundle of rank $2^{[\frac{n}{2}]}$. Then let $\gamma : \mathbb{C}\ell(M) \longrightarrow \operatorname{End}_{\mathbb{C}}(SM)$ be the Clifford multiplication, which provides a fibre preserving irreducible representation of the Clifford algebras constructed over the tangent spaces of $M$. When the dimension $n$ is even, we have the standard chirality decomposition
\[ SM = SM^+ \oplus SM^-, \tag{1} \]
where the two direct summands are respectively the $\pm 1$-eigenspaces of the endomorphism $\gamma(\omega_n)$, with $\omega_n = i^{[\frac{n+1}{2}]} e_1 \cdots e_n$, the complex volume form. It is well-known
(see [LM]) that there are, on the complex spinor bundle $SM$, a natural Hermitian metric $(\ ,\ )$ and a spinorial Levi-Civita connection, denoted also by $\nabla$, which is compatible with both $(\ ,\ )$ and $\gamma$ in the following sense:
\[ X(\psi,\varphi) = (\nabla_X\psi,\varphi) + (\psi,\nabla_X\varphi), \tag{2} \]
\[ \nabla_X(\gamma(Y)\psi) = \gamma(\nabla_X Y)\psi + \gamma(Y)\nabla_X\psi \tag{3} \]
for any tangent vector fields $X, Y \in \Gamma(TM)$ and any spinor fields $\psi, \varphi \in \Gamma(SM)$ on $M$. Moreover, with respect to this Hermitian product on $SM$, Clifford multiplication by vector fields is skew-Hermitian or equivalently
\[ (\gamma(X)\psi,\gamma(X)\varphi) = |X|^2(\psi,\varphi). \tag{4} \]
Since the complex volume form $\omega_n$ is parallel with respect to the spinorial Levi-Civita connection, when $n = \dim M$ is even, the chirality decomposition (1) is preserved by $\nabla$. From (4) one sees that it is an orthogonal decomposition.
In this setting, the (fundamental) Dirac operator $D$ on the manifold $M$ is the first order elliptic differential operator acting on spinor fields given locally by
\[ D = \sum_{i=1}^{n} \gamma(e_i)\nabla_{e_i}, \]
where $\{e_1, \dots, e_n\}$ is a local orthonormal frame in $TM$. When $n = \dim M$ is even, $D$ interchanges the chirality subbundles $SM^{\pm}$.
The boundary hypersurface $\partial M$ is also an oriented Riemannian manifold with the induced orientation and metric. If $\nabla^{\partial M}$ stands for the Levi-Civita connection of the induced metric we have the Gauss and Weingarten equations
\[ \nabla_X Y = \nabla^{\partial M}_X Y + \langle AX, Y\rangle N, \qquad \nabla_X N = -AX, \]
for any vector fields $X, Y$ tangent to $\partial M$, where $A$ is the shape operator or Weingarten endomorphism of the hypersurface $\partial M$ corresponding to the unit normal field $N$ compatible with the given orientation. As the normal bundle of the boundary hypersurface is trivial, the Riemannian manifold $\partial M$ is also a spin manifold and so we will have the corresponding spinor bundle $S\partial M$, the Clifford multiplication $\gamma^{\partial M}$, the spinorial Levi-Civita connection $\nabla^{\partial M}$ and the intrinsic Dirac operator $D^{\partial M}$. It is not difficult to show (see [Bä2, BFGK, HMZ1, HMZ3, Bur, Tr, Mo]) that the restricted Hermitian bundle $S := SM|_{\partial M}$ can be identified with the intrinsic Hermitian spinor bundle $S\partial M$, provided that $n = \dim M$ is odd. Instead, if $n = \dim M$ is even, the restricted bundle $S$ could be identified with the sum $S\partial M \oplus S\partial M$. With such identifications, for any spinor field $\psi \in \Gamma(S)$ on the boundary hypersurface $\partial M$ and any vector field $X \in \Gamma(T\partial M)$, define on the restricted bundle $S$ the Clifford multiplication $\gamma^S$ and the connection $\nabla^S$ by
\[ \gamma^S(X)\psi = \gamma(X)\gamma(N)\psi, \tag{5} \]
\[ \nabla^S_X\psi = \nabla_X\psi - \tfrac12\gamma^S(AX)\psi = \nabla_X\psi - \tfrac12\gamma(AX)\gamma(N)\psi. \tag{6} \]
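The boundary Clifford multiplication (5) can be sanity-checked numerically in the simplest case. The sketch below is an illustration only, not part of the paper: it assumes $n = 3$, the flat model in which $\gamma(e_j) = i\sigma_j$ with $\sigma_j$ the Pauli matrices, and $N = e_3$; all helper names are hypothetical. It checks that $\gamma^S(X) = \gamma(X)\gamma(N)$ again satisfies the Clifford relation for tangent vectors, is skew-Hermitian, and anticommutes with $\gamma(N)$.

```python
# Pauli matrices as nested lists (illustrative flat model, n = 3).
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)
I2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def gamma(v):
    """Assumed Clifford multiplication on R^3: gamma(e_j) = i * sigma_j."""
    out = [[0, 0], [0, 0]]
    for c, s in zip(v, SIGMA):
        out = add(out, scale(1j * c, s))
    return out

def gamma_S(x, normal):
    """Formula (5): gamma^S(X) = gamma(X) gamma(N)."""
    return matmul(gamma(x), gamma(normal))

N = (0.0, 0.0, 1.0)    # unit normal, assumed to be e_3
X = (0.3, -1.2, 0.0)   # tangent vectors (orthogonal to N)
Y = (2.0, 1.0, 0.0)
dot_XY = sum(a * b for a, b in zip(X, Y))

gX, gY = gamma_S(X, N), gamma_S(Y, N)
zero = scale(0, I2)

# gamma^S(X) gamma^S(Y) + gamma^S(Y) gamma^S(X) = -2 <X, Y> I
clifford_ok = close(add(matmul(gX, gY), matmul(gY, gX)), scale(-2 * dot_XY, I2))
# gamma^S(X) is skew-Hermitian
skew_ok = close(dagger(gX), scale(-1, gX))
# gamma^S(X) anticommutes with gamma(N)
anticommutes_with_normal = close(
    add(matmul(gX, gamma(N)), matmul(gamma(N), gX)), zero)

print(clifford_ok, skew_ok, anticommutes_with_normal)
```

The anticommutation with $\gamma(N)$ checked in the last line is the pointwise fact behind the sign change of the hypersurface Dirac operator under $\gamma(N)$.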
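Then it is easy to see that $\gamma^S$ and $\nabla^S$ correspond respectively to $\gamma^{\partial M}$ and $\nabla^{\partial M}$, for $n$ odd, and to $\gamma^{\partial M} \oplus -\gamma^{\partial M}$ and $\nabla^{\partial M} \oplus \nabla^{\partial M}$, for $n$ even. Then, $\gamma^S$ and $\nabla^S$ satisfy the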
same compatibility relations (2), (3) and (4), together with the following additional identity:
\[ \nabla^S_X(\gamma(N)\psi) = \gamma(N)\nabla^S_X\psi. \]
As a consequence, the hypersurface Dirac operator $\mathbf{D}$ (written in bold to distinguish it from the Dirac operator $D$ of $M$), acting on smooth sections $\psi \in \Gamma(S)$ as
\[ \mathbf{D}\psi := \sum_{j=1}^{n-1}\gamma^S(u_j)\nabla^S_{u_j}\psi = \frac{n-1}{2}H\psi - \gamma(N)\sum_{j=1}^{n-1}\gamma(u_j)\nabla_{u_j}\psi, \]
where $\{u_1, \dots, u_{n-1}\}$ is a local orthonormal frame tangent to the boundary $\partial M$ and $H = (1/(n-1))\operatorname{trace} A$ is its mean curvature function, coincides with the intrinsic Dirac operator $D^{\partial M}$ on the boundary, for $n$ odd, and with the pair $D^{\partial M} \oplus -D^{\partial M}$, for $n$ even. In the particular case where the field $\psi \in \Gamma(S)$ is the restriction of a spinor field $\psi \in \Gamma(SM)$ on $M$, this means that
\[ \mathbf{D}\psi = \frac{n-1}{2}H\psi - \gamma(N)D\psi - \nabla_N\psi. \tag{7} \]
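Formula (7) follows from (5) and (6) by a short computation. The following is a sketch of that computation, writing $\mathbf{D}$ for the hypersurface Dirac operator and $D$ for the Dirac operator of $M$, and using only $\gamma(N)^2 = -1$, the anticommutation of $\gamma(N)$ with $\gamma(u_j)$ and $\gamma(Au_j)$, the symmetry of $A$ (so that $\sum_j \gamma(u_j)\gamma(Au_j) = -\operatorname{trace}A$), and the expression of $D$ in the adapted frame $\{u_1,\dots,u_{n-1},N\}$:

```latex
\begin{aligned}
\mathbf{D}\psi
&= \sum_{j=1}^{n-1}\gamma(u_j)\gamma(N)\Big(\nabla_{u_j}\psi - \tfrac12\gamma(Au_j)\gamma(N)\psi\Big) \\
&= -\gamma(N)\sum_{j=1}^{n-1}\gamma(u_j)\nabla_{u_j}\psi
   - \tfrac12\sum_{j=1}^{n-1}\gamma(u_j)\gamma(Au_j)\psi \\
&= -\gamma(N)\big(D\psi - \gamma(N)\nabla_N\psi\big) + \tfrac12(\operatorname{trace}A)\,\psi \\
&= \frac{n-1}{2}H\psi - \gamma(N)D\psi - \nabla_N\psi .
\end{aligned}
```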
Note that the hypersurface Dirac operator $\mathbf{D}$ always has the anticommutativity property
\[ \mathbf{D}\gamma(N) = -\gamma(N)\mathbf{D} \tag{8} \]
and so, when $\partial M$ is compact, the spectrum of $\mathbf{D}$ is symmetric with respect to zero and coincides with the spectrum of $D^{\partial M}$, for $n$ odd, and with $\operatorname{Spec}(D^{\partial M}) \cup -\operatorname{Spec}(D^{\partial M})$, for $n$ even.
3. A Spinorial Reilly Inequality
Our main goal in this paper is to estimate the eigenvalues of the Dirac operator $D$ on the compact Riemannian spin manifold $M$ under suitable boundary conditions. By examining their limiting cases, one can study the geometry of certain hypersurfaces. When the manifold $M$ is closed (compact without boundary), $D$ is a self-adjoint elliptic operator of order one and so its spectrum is a discrete unbounded sequence of real numbers. When the boundary $\partial M$ is non-empty, we shall see in the next section that there are boundary conditions for which one may have a discrete and not necessarily real spectrum for the Dirac operator with finite dimensional eigenspaces and smooth eigenspinors. The defect of symmetry of $D$ on the manifold with boundary $M$ appears by integrating by parts to obtain
\[ \int_M (D\psi,\varphi) - \int_M (\psi,D\varphi) = -\int_{\partial M} (\gamma(N)\psi,\varphi), \tag{9} \]
where $\psi, \varphi \in \Gamma(SM)$ and $N$ is the inner unit normal field along the boundary. When the considered boundary condition forces the boundary integral on the r.h.s. of (9) to vanish, the spectrum is necessarily real.
A basic tool to relate the eigenvalues of the Dirac operator and the geometry of the manifold $M$ and that of its boundary $\partial M$ will be, as in the closed case (see [Fr1]), the integral version of the Schrödinger-Lichnerowicz formula
\[ D^2 = \nabla^*\nabla + \frac14 R, \]
where $R$ is the scalar curvature of $M$. In fact, given a spinor field $\psi$ on $M$, taking into account the formula above, if we compute the divergence of the one-form $\alpha$ defined by
\[ \alpha(X) = (\gamma(X)D\psi + \nabla_X\psi, \psi), \qquad \forall X \in TM, \]
and integrate, we get
\[ \int_{\partial M} (\gamma(N)D\psi + \nabla_N\psi, \psi) = -\int_M \Big( |\nabla\psi|^2 - |D\psi|^2 + \frac14 R|\psi|^2 \Big), \]
which, by (7), can be written as
\[ \int_{\partial M} \Big( (\mathbf{D}\psi,\psi) - \frac{n-1}{2}H|\psi|^2 \Big) = \int_M \Big( |\nabla\psi|^2 - |D\psi|^2 + \frac14 R|\psi|^2 \Big), \]
where $\mathbf{D}$ is the hypersurface Dirac operator of $\partial M$. Finally, we will use the spinorial Schwarz inequality
\[ |D\psi|^2 \le n|\nabla\psi|^2, \qquad \forall \psi \in \Gamma(SM), \]
where the equality is achieved only by the so-called twistor spinors, that is, those satisfying the following over-determined first order equation
\[ \nabla_X\psi = -\frac1n\gamma(X)D\psi, \qquad \forall X \in TM. \]
Then we get the following integral inequality, called Reilly inequality [HMZ1, HMZ3] because of its similarity with the corresponding one obtained in [Re] for the Laplace operator,
\[ \int_{\partial M} \Big( (\mathbf{D}\psi,\psi) - \frac{n-1}{2}H|\psi|^2 \Big) \ge \int_M \Big( \frac14 R|\psi|^2 - \frac{n-1}{n}|D\psi|^2 \Big), \tag{10} \]
with equality only for twistor spinors on $M$.
4. Ellipticity of the Boundary Conditions
Now we introduce suitable boundary conditions for the fundamental Dirac operator $D$. On a compact Riemannian spin manifold $M$ with boundary, the Dirac operator $D : \Gamma(SM) \to \Gamma(SM)$ has an infinite dimensional kernel and a closed image with finite codimension. We look for conditions $B$ to be imposed on the restrictions to the boundary $\partial M$ of the spinor fields on $M$ so that this kernel becomes finite dimensional and then the boundary problem
\[ \begin{cases} D\psi = \Phi & \text{on } M, \\ B\psi|_{\partial M} = \chi & \text{along } \partial M, \end{cases} \tag{BP} \]
for $\Phi \in \Gamma(SM)$ and $\chi \in \Gamma(S)$, is of Fredholm type. In this case, we will have smooth solutions for any data $\Phi$ and $\chi$ belonging to a certain subspace with finite codimension and these solutions will be unique up to a finite dimensional kernel.
To our knowledge, the study of boundary conditions suitable for an elliptic operator $\mathcal{D}$ (of any order, although for simplicity, we only consider first order operators) acting on smooth sections of a Hermitian vector bundle $F \to M$ was first carried out in the fifties
by Lopatinsky and Shapiro ([Hö, Lo]), but the main tool was discovered by Calderón in the sixties: the so-called Calderón projector
\[ P_+(\mathcal{D}) : H^{\frac12}(F|_{\partial M}) \longrightarrow \{\psi|_{\partial M} \mid \psi \in H^1(F),\ \mathcal{D}\psi = 0\}. \]
This is a pseudo-differential operator of order zero (see [BW, Se]) with principal symbol $p_+(\mathcal{D}) : T\partial M \to \operatorname{End}_{\mathbb{C}}(F)$ depending only on the principal symbol $\sigma_{\mathcal{D}}$ of the (general elliptic) operator $\mathcal{D}$, which can be calculated as follows:
\[ p_+(\mathcal{D})(u) = -\frac{1}{2\pi i}\int_{\Gamma} \big[ (\sigma_{\mathcal{D}}(N))^{-1}\sigma_{\mathcal{D}}(u) - \zeta I \big]^{-1}\, d\zeta, \tag{11} \]
for any $p \in \partial M$ and $u \in T_p\partial M$, where $N$ is the inner unit normal along the boundary $\partial M$ and $\Gamma$ is a positively oriented cycle in the complex plane enclosing the poles of the integrand with negative imaginary part. Although the Calderón projector is not unique for a given elliptic operator $\mathcal{D}$, its principal symbol is uniquely determined by $\sigma_{\mathcal{D}}$. One of the important features of the Calderón projector is that its principal symbol detects the ellipticity of a boundary condition, or in other words, whether the corresponding boundary problem (BP) is well-posed (according to Seeley in [Se]). In fact (cf. [Se] or [BW, Chap. 18]):
A pseudo-differential operator $B : L^2(F|_{\partial M}) \longrightarrow L^2(V)$, where $V \to \partial M$ is a complex vector bundle over the boundary, is called a (global) elliptic boundary condition when its principal symbol $b : T\partial M \to \operatorname{Hom}_{\mathbb{C}}(F|_{\partial M}, V)$ satisfies that, for any non-trivial $u \in T_p\partial M$, $p \in \partial M$, the restriction
\[ b(u)|_{\operatorname{image} p_+(\mathcal{D})(u)} : \operatorname{image} p_+(\mathcal{D})(u) \subset F_p \longrightarrow V_p \]
is an isomorphism onto $\operatorname{image} b(u) \subset V_p$. Moreover, if $\operatorname{rank} V = \dim \operatorname{image} p_+(\mathcal{D})(u)$, we say that $B$ is a local elliptic boundary condition.
When $B$ is a local operator this definition yields the so-called Lopatinsky-Shapiro conditions for ellipticity (see for example [Hö]). When these definitions and the subsequent theorems are applied to the case where the vector bundle $F$ is the spinor bundle $SM$ and the elliptic operator $\mathcal{D}$ is the Dirac operator $D$ on the spin Riemannian manifold $M$, we obtain the following well-known facts in the setting of the general theory of boundary problems for elliptic operators (see for example [BrL, BW, GLP, Hö, Se]):
Proposition 1.
Let $M$ be an $n$-dimensional compact Riemannian spin manifold with non-empty boundary $\partial M$. Consider the restriction $S$ to the boundary $\partial M$ of the spinor bundle $SM$ of $M$. A pseudo-differential operator $B : L^2(S) \longrightarrow L^2(V)$, where $V \to \partial M$ is a Hermitian vector bundle, is an elliptic boundary condition for the fundamental Dirac operator $D$ of $M$ if and only if its principal symbol $b : T\partial M \to \operatorname{Hom}_{\mathbb{C}}(S, V)$ satisfies the following two conditions:
\[ \ker b(u) \cap \{\eta \in SM_p \mid i\gamma(N)\gamma(u)\eta = -|u|\eta\} = \{0\}, \]
\[ \dim \operatorname{image} b(u) = \tfrac12 \dim SM_p = 2^{[\frac{n}{2}]-1}. \]
Moreover, if $V$ is a bundle with rank $\frac12 \dim SM_p = 2^{[\frac{n}{2}]-1}$, we have a local elliptic boundary condition. When these ellipticity conditions are satisfied, the problem (BP) is of Fredholm type and the corresponding eigenvalue boundary problem
\[ \begin{cases} D\psi = \lambda\psi & \text{on } M, \\ B\psi|_{\partial M} = 0 & \text{along } \partial M, \end{cases} \tag{EBP} \]
has a discrete spectrum with finite dimensional eigenspaces consisting of smooth spinor fields, unless it is the whole complex plane.
Proof. Since the principal symbol $\sigma_D$ of the Dirac operator $D$ on $M$ is given by
\[ \sigma_D(v) = i\gamma(v), \qquad \forall v \in TM, \]
then by (11), the principal symbol of the Calderón projector of the Dirac operator is given by
\[ p_+(D)(u) = -\frac{1}{2|u|}\big(i\gamma(N)\gamma(u) - |u|I\big) = \frac{1}{2|u|}\big(i\gamma^S(u) + |u|I\big), \]
for each non-trivial $u \in T\partial M$, where $\gamma^S$ is identified in (5) as the intrinsic Clifford product on the boundary. As the endomorphism $i\gamma(N)\gamma(u) = -i\gamma^S(u)$ is self-adjoint and its square is $|u|^2$ times the identity map, it has exactly two eigenvalues, say $|u|$ and $-|u|$, whose eigenspaces are of the same dimension $\frac12 \dim SM_p = 2^{[\frac{n}{2}]-1}$, since they are interchanged by $\gamma(N)$. Hence the symbol $p_+(D)(u)$ is, up to a constant, the orthogonal projection onto the eigenspace corresponding to the eigenvalue $-|u|$ and so
\[ \operatorname{image} p_+(D)(u) = \{\eta \in SM_p \mid i\gamma(N)\gamma(u)\eta = -|u|\eta\}, \qquad \dim \operatorname{image} p_+(D)(u) = \tfrac12 \dim SM_p = 2^{[\frac{n}{2}]-1}. \]
From these equalities and from the definition of ellipticity for the boundary condition represented by the pseudo-differential operator $B$, we have that the first equation in the statement of this proposition is equivalent to the injectivity of the map $b(u)|_{\operatorname{image} p_+(D)(u)}$. The second one implies that $\dim \operatorname{image} b(u) = \dim \operatorname{image} p_+(D)(u)$ and so, together with the injectivity above, this means that $b(u)|_{\operatorname{image} p_+(D)(u)}$ is surjective. So we have proved that the two claimed conditions are equivalent to the ellipticity of the boundary condition $B$ for the Dirac operator $D$ on $M$. Now, from this ellipticity, one may deduce that the problems (BP) and (EBP) are of Fredholm type and the remaining assertions on eigenvalues and eigenspaces follow in a standard way (see [BW, Hö]).
5. Four Boundary Conditions
In this last section, on a compact Riemannian spin manifold $M$ with boundary, we will study the ellipticity of four boundary conditions for the Dirac operator $D$, where two of them are of global nature and the others are of local type. We will prove that, under each of these conditions, the square of any eigenvalue of $D$ is bounded from below in terms of the minimum of the scalar curvature $R$ of $M$. In fact, we show that, under the four boundary conditions, Friedrich’s inequality (F) is still true. In the case of closed manifolds, the equality is achieved only when the manifold carries some non-trivial (real) Killing spinor fields. The important point is that these four conditions behave differently with respect to the equality case: for two of them the equality is never achieved and the others characterize half-spheres and domains enclosed by embedded minimal hypersurfaces in manifolds with non-trivial Killing spinors.
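The algebraic facts used in the proof of Proposition 1 can be checked numerically in a low-dimensional model. The sketch below is an illustration under stated assumptions, not the authors' construction: it takes $n = 3$, the flat model $\gamma(e_j) = i\sigma_j$ (Pauli matrices), $N = e_3$ and one sample tangent vector $u$; all helper names are hypothetical. It verifies that $A = i\gamma(N)\gamma(u)$ is self-adjoint with $A^2 = |u|^2 I$, and that $p_+(D)(u) = -\frac{1}{2|u|}(A - |u|I)$ is an orthogonal projection of rank $2^{[3/2]-1} = 1$ onto the $(-|u|)$-eigenspace, which $\gamma(N)$ exchanges with the complementary eigenspace.

```python
import math

SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)
I2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def gamma(v):
    """Assumed Clifford multiplication on R^3: gamma(e_j) = i * sigma_j."""
    out = [[0, 0], [0, 0]]
    for c, s in zip(v, SIGMA):
        out = add(out, scale(1j * c, s))
    return out

N = (0.0, 0.0, 1.0)      # unit normal, assumed to be e_3
u = (1.5, -2.0, 0.0)     # sample non-zero tangent vector, |u| = 2.5
norm_u = math.sqrt(sum(c * c for c in u))

A = scale(1j, matmul(gamma(N), gamma(u)))                      # i gamma(N) gamma(u)
p_plus = scale(-1 / (2 * norm_u), add(A, scale(-norm_u, I2)))  # symbol from the proof

checks = {
    "A is self-adjoint": close(dagger(A), A),
    "A^2 = |u|^2 I": close(matmul(A, A), scale(norm_u ** 2, I2)),
    "trace A = 0": abs(A[0][0] + A[1][1]) < 1e-12,
    "p_+ is idempotent": close(matmul(p_plus, p_plus), p_plus),
    "p_+ is self-adjoint": close(dagger(p_plus), p_plus),
    "A p_+ = -|u| p_+": close(matmul(A, p_plus), scale(-norm_u, p_plus)),
    "trace p_+ = 1": abs(p_plus[0][0] + p_plus[1][1] - 1) < 1e-12,
    # gamma(N)^{-1} = -gamma(N); conjugation sends p_+ to I - p_+
    "gamma(N) swaps eigenspaces": close(
        matmul(matmul(gamma(N), p_plus), scale(-1, gamma(N))),
        add(I2, scale(-1, p_plus))),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Since $A$ is traceless with $A^2 = |u|^2 I$, its two eigenvalues $\pm|u|$ have equal multiplicity, and the trace of the projection gives its rank.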
5.1. The Atiyah-Patodi-Singer (APS) condition. Atiyah, Patodi and Singer introduced in [APS] this well-known boundary condition in order to establish index theorems for compact manifolds with boundary. Later, this condition has been used to study the positive mass and the Penrose inequalities (see [He2, Wi]). Such a condition does not allow one to model confined particle fields since, from the physical point of view, its global nature is interpreted as a causality violation. Although it is a well-known fact that the APS condition is an elliptic boundary condition, we are going to sketch the proof in the setting of Proposition 1, for two reasons: first for completeness and second for pointing out that the APS condition for a chiral Dirac operator covers both cases of odd and even dimension, although the latter case is not referred to the spectral resolution of the intrinsic Dirac operator $D^{\partial M}$ but to the system $D^{\partial M} \oplus -D^{\partial M}$.
Precisely, this condition can be described as follows. Choose the Hermitian bundle $V$ (of Proposition 1) over the boundary hypersurface $\partial M$ as the restricted spinor bundle $S$ defined in Sect. 2, and define $B_{\mathrm{APS}} : L^2(S) \to L^2(S)$ as the orthogonal projection onto the subspace spanned by the eigenspinors corresponding to the non-negative eigenvalues of the self-adjoint intrinsic operator $\mathbf{D}$ (the hypersurface Dirac operator of Sect. 2). Atiyah, Patodi and Singer showed in [APS] (see also [BW, Prop. 14.2]) that $B_{\mathrm{APS}}$ is a zero order pseudo-differential operator whose principal symbol $b_{\mathrm{APS}}$ satisfies the following fact: for each $p \in \partial M$ and $u \in T_p\partial M - \{0\}$, the map $b_{\mathrm{APS}}(u)$ is the orthogonal projection onto the eigenspace of $\sigma_{\mathbf{D}}(u) = i\gamma^S(u)$ corresponding to the positive eigenvalue $|u|$. That is
\[ b_{\mathrm{APS}}(u) = \tfrac12\big(i\gamma^S(u) + |u|I\big) = \tfrac12\big(-i\gamma(N)\gamma(u) + |u|I\big), \tag{12} \]
and so the principal symbol $b_{\mathrm{APS}}$ of the APS operator coincides, up to a constant, with the principal symbol $p_+(D)$ of the Calderón projector of $D$. From this, it is immediate to see that the two ellipticity conditions in Proposition 1 are satisfied.
The following result (see [HMZ1, HMZ2] and also [FS, Theorem 10] for a weaker version) provides a lower bound for the eigenvalues of the Dirac operator with the APS boundary condition. We give a short proof of this estimate to illustrate the difference with the other boundary conditions.
Theorem 2. Let $M$ be a compact Riemannian spin manifold whose non-empty boundary $\partial M$ has non-negative mean curvature (w.r.t. the inner normal). Under the APS boundary condition, the spectrum of the Dirac operator $D$ of $M$ is an unbounded sequence of real numbers $\{\lambda^{\mathrm{APS}}_k \mid k \in \mathbb{Z}\}$ which satisfy the following strict inequality:
\[ \big(\lambda^{\mathrm{APS}}_k\big)^2 > \frac{n}{4(n-1)}\inf_M R, \qquad k \in \mathbb{Z}. \]
Proof. We know that the APS condition referred to the spectral resolution of $\mathbf{D}$ is an elliptic boundary condition. Moreover, from the supercommutativity relation (8), we have
\[ B_{\mathrm{APS}}\mathbf{D} = \mathbf{D}B_{\mathrm{APS}}, \tag{13} \]
\[ B_{\mathrm{APS}}\gamma(N) + \gamma(N)B_{\mathrm{APS}} = \gamma(N)(I + \pi_0), \tag{14} \]
where $\pi_0$ is the $L^2$-orthogonal projection on the space of harmonic spinors of $\mathbf{D}$. This implies that, if $\psi, \varphi \in \Gamma(S)$ satisfy $B_{\mathrm{APS}}\psi = B_{\mathrm{APS}}\varphi = 0$ (and subsequently $\pi_0\psi = \pi_0\varphi = 0$), then
\[ B_{\mathrm{APS}}\gamma(N)\psi = \gamma(N)\psi \quad \text{and} \quad B_{\mathrm{APS}}\gamma(N)\varphi = \gamma(N)\varphi. \]
As a consequence we deduce that, for $\psi, \varphi \in \Gamma(SM)$ whose restrictions to $\partial M$ satisfy $B_{\mathrm{APS}}\psi = B_{\mathrm{APS}}\varphi = 0$,
\[ \int_{\partial M}(\gamma(N)\psi,\varphi) = \int_{\partial M}(B_{\mathrm{APS}}\gamma(N)\psi,\varphi) = \int_{\partial M}(\gamma(N)\psi,B_{\mathrm{APS}}\varphi) = 0 \]
and under the APS boundary condition, by (9) the Dirac operator $D$ on the bulk manifold $M$ is a symmetric operator. Then the corresponding spectrum is real and so an unbounded discrete sequence (see [Hö, GLP] for instance).
Now consider an eigenspinor $\psi \in \Gamma(SM)$ associated with an arbitrary eigenvalue $\lambda^{\mathrm{APS}}_k$, $k \in \mathbb{Z}$, in the Reilly inequality (10). Then
\[ \int_M \Big( \frac14 R - \frac{n-1}{n}\big(\lambda^{\mathrm{APS}}_k\big)^2 \Big)|\psi|^2 \le \int_{\partial M}(\mathbf{D}\psi,\psi), \]
since $H \ge 0$ (here $\mathbf{D}$ is the hypersurface Dirac operator of Sect. 2). But, using $B_{\mathrm{APS}}\psi = 0$ and the commutativity (13), one gets
\[ \int_{\partial M}(\mathbf{D}\psi,\psi) \le 0 \]
and the equality holds only when the restriction $\psi|_{\partial M}$ vanishes. Hence
\[ \big(\lambda^{\mathrm{APS}}_k\big)^2 \ge \frac{n}{4(n-1)}\inf_M R \]
and the equality is achieved if and only if the eigenspinor $\psi$ is simultaneously a twistor spinor (and so a real Killing spinor) and its restriction $\psi|_{\partial M}$ is zero. But, since a real Killing spinor is of constant length, then $\psi = 0$, which is impossible. Hence, the inequality above is strict.
5.2. The condition associated with a chirality (CHI) operator. This type of (local) boundary condition has already been considered in the context of comparison results [Bun], to estimate the mass of asymptotically flat manifolds including black holes [GHHP, He1] and also in order to study eigenvalue estimates [FS, HMZ2]. By contrast to the APS condition, which exists on any spin manifold with boundary, the second boundary condition which we shall consider is subject to the existence on the manifold $M$ of a linear map $G : \Gamma(SM) \to \Gamma(SM)$ satisfying
\[ G^2 = I, \qquad (G\psi,G\varphi) = (\psi,\varphi), \tag{15} \]
\[ \nabla_X(G\psi) = G\nabla_X\psi, \qquad \gamma(X)G\psi = -G\gamma(X)\psi \tag{16} \]
for each vector field $X$ and spinor fields $\psi, \varphi$ on $M$. This map $G$ is often called a chirality operator because, when the dimension $n$ of $M$ is even, the standard candidate is $G = \gamma(\omega_n)$, the Clifford multiplication by the complex volume form $\omega_n$ which yields the chirality decomposition (1) of the spinor bundle. In this case, $G$ would be nothing but the usual conjugation changing chirality of spinors. But there is another important situation where such an operator appears, namely when the manifold $M$ is a space-like hypersurface of a Lorentzian manifold $\widetilde{M}$ of dimension $n+1$ and both the Riemannian and spinorial structures on $M$ are the induced ones from $\widetilde{M}$. In this case one can choose $G = \gamma(T)$, the Clifford multiplication by a unit time-like normal field $T$ on $M$ (see for example [He1]).
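The algebraic parts of (15) and (16) can be checked by hand or numerically for the standard candidate $G = \gamma(\omega_n)$ in the smallest even dimension. The following sketch is an illustration only: it assumes $n = 2$, the flat model $\gamma(e_1) = i\sigma_1$, $\gamma(e_2) = i\sigma_2$, and the complex volume form $\omega_2 = i\,e_1e_2$; the helper names are hypothetical. The parallelism condition $\nabla_X(G\psi) = G\nabla_X\psi$ is not algebraic and is not tested here.

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
I2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def gamma(v):
    """Assumed Clifford multiplication on R^2: gamma(e_j) = i * sigma_j."""
    return add(scale(1j * v[0], S1), scale(1j * v[1], S2))

# G = gamma(omega_2) with omega_2 = i e_1 e_2 (assumption stated above)
G = scale(1j, matmul(gamma((1, 0)), gamma((0, 1))))

X = (0.7, -1.3)   # arbitrary sample vector
zero = scale(0, I2)

checks = {
    "G^2 = I": close(matmul(G, G), I2),
    "G is self-adjoint": close(dagger(G), G),
    "G preserves the Hermitian product": close(matmul(dagger(G), G), I2),
    "gamma(X) G = -G gamma(X)": close(
        add(matmul(gamma(X), G), matmul(G, gamma(X))), zero),
    "trace G = 0 (equal-rank +1 and -1 eigenspaces)": abs(G[0][0] + G[1][1]) < 1e-12,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

In this model $G$ is simply $\sigma_3$, so its $\pm1$-eigenspaces are the two coordinate lines, in agreement with the decomposition (1).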
Anyway, given such a chirality operator $G$ on $M$, the fibre preserving endomorphism $\gamma(N)G : \Gamma(S) \to \Gamma(S)$, acting on sections of the restricted spinor bundle, is self-adjoint with respect to the pointwise Hermitian product and its square is the identity. Hence it has two eigenvalues $+1$ and $-1$ whose corresponding eigenspaces are interchanged by, for example, the isomorphism $\gamma(N)$. Hence the eigensubbundle $V$ over $\partial M$ corresponding to the eigenvalue $-1$ satisfies
\[ \operatorname{rank} V = \tfrac12 \operatorname{rank} S = 2^{[\frac{n}{2}]-1}. \]
Define now the boundary condition $B_{\mathrm{CHI}} : L^2(S) \to L^2(V)$ as the linear operator
\[ B_{\mathrm{CHI}} = \tfrac12(I - \gamma(N)G), \]
that is, the orthogonal projection onto the eigensubbundle $V$. This is a differential operator of order zero and so its principal symbol $b_{\mathrm{CHI}}(u)$, on each vector $u \in T\partial M$, coincides with the operator itself, that is,
\[ b_{\mathrm{CHI}}(u) = \tfrac12(I - \gamma(N)G), \qquad \forall u \in T\partial M, \]
and, in particular,
\[ \dim \operatorname{image} b_{\mathrm{CHI}}(u) = \operatorname{rank} V = 2^{[\frac{n}{2}]-1}. \]
Now it is easy to check that the two conditions in Proposition 1 are satisfied and so $B_{\mathrm{CHI}}$ is a local elliptic boundary condition. We have
Theorem 3. Let $M$ be a compact Riemannian spin manifold whose non-empty boundary $\partial M$ has non-negative mean curvature (w.r.t. the inner normal). Under the CHI boundary condition, the spectrum of the Dirac operator $D$ of $M$ associated with a chirality operator on $M$ is a non-decreasing sequence of real numbers $\{\lambda^{\mathrm{CHI}}_k \mid k \in \mathbb{Z}\}$ with $\lim_{k\to\pm\infty}\lambda^{\mathrm{CHI}}_k = \pm\infty$ and satisfying the following inequality:
\[ \big(\lambda^{\mathrm{CHI}}_k\big)^2 \ge \frac{n}{4(n-1)}\inf_M R, \qquad k \in \mathbb{Z}. \]
Moreover the equality holds (for $\lambda^{\mathrm{CHI}} = \lambda^{\mathrm{CHI}}_{\pm1}$) if and only if $M$ is isometric to the half-sphere with radius $n/(2|\lambda^{\mathrm{CHI}}|)$.
Proof. The spectrum being real is a consequence of the fact that the Dirac operator $D$ is symmetric when it acts on spinor fields $\psi \in \Gamma(SM)$ such that $B_{\mathrm{CHI}}\psi|_{\partial M} = 0$, that is, $\gamma(N)G\psi = \psi$. In fact, if $\varphi \in \Gamma(SM)$ is another field satisfying the same boundary condition, we have from (15) and (16),
\[ (\gamma(N)\psi,\varphi) = (G\gamma(N)\psi,G\varphi) = (\psi,\gamma(N)\varphi) = -(\gamma(N)\psi,\varphi). \]
Integrating over $\partial M$ this pointwise equality and using (9) one has the symmetry property.
Consider a smooth spinor field $\psi$ on $M$ such that
\[ D\psi = \lambda^{\mathrm{CHI}}\psi \quad \text{and} \quad B_{\mathrm{CHI}}\psi|_{\partial M} = 0 \]
and plug it into the Reilly inequality (10). As in the proof of Theorem 2, we will have the desired inequality if we are able to show that the mass term on the boundary
\[ \int_{\partial M}(\mathbf{D}\psi,\psi), \]
where $\mathbf{D}$ denotes the hypersurface Dirac operator, is non-positive. But, in this case, this term is exactly zero. In fact we only have to realize that (16), (5) and (6) imply $\mathbf{D}G = G\mathbf{D}$. Then, we have the following pointwise equation:
\[ (\mathbf{D}\psi,\psi) = (\gamma(N)G\mathbf{D}\psi,\gamma(N)G\psi) = -(\mathbf{D}\psi,\psi) \]
because of (8) and $\gamma(N)G\psi = \psi$. As a consequence we have the claimed inequality.
Suppose now that equality is achieved. As in the proof of Theorem 2, we have equality in (10) and so the eigenspinor $\psi$ is a twistor spinor and hence a non-trivial real Killing spinor. In fact
\[ \nabla_v\psi = -\frac{\lambda^{\mathrm{CHI}}}{n}\gamma(v)\psi, \qquad \forall v \in TM. \tag{17} \]
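Two consequences of (17) can be verified directly. The following is a short check, assuming only (2), the skew-Hermitian property of Clifford multiplication, the relation $\gamma(e_i)^2 = -1$ for a local orthonormal frame $\{e_1,\dots,e_n\}$, and the fact that $\lambda^{\mathrm{CHI}}$ is real:

```latex
\begin{aligned}
D\psi &= \sum_{i=1}^{n}\gamma(e_i)\nabla_{e_i}\psi
       = -\frac{\lambda^{\mathrm{CHI}}}{n}\sum_{i=1}^{n}\gamma(e_i)\gamma(e_i)\psi
       = \lambda^{\mathrm{CHI}}\psi, \\
X|\psi|^2 &= (\nabla_X\psi,\psi) + (\psi,\nabla_X\psi)
       = -\frac{\lambda^{\mathrm{CHI}}}{n}\Big[(\gamma(X)\psi,\psi) + (\psi,\gamma(X)\psi)\Big]
       = 0 .
\end{aligned}
```

The bracket in the second line vanishes because $(\psi,\gamma(X)\psi) = -(\gamma(X)\psi,\psi)$, so (17) is compatible with the eigenvalue equation and forces $|\psi|^2$ to be constant.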
This implies that the length $|\psi|^2$ is a non-zero constant and that $M$ is an Einstein manifold with scalar curvature $R = \frac{4(n-1)}{n}\big(\lambda^{\mathrm{CHI}}\big)^2$ (see for example [BFGK]). Since the assumption
\[ \int_{\partial M} H|\psi|^2 \ge 0 \]
has been used to get the inequality, we deduce that $H = 0$, that is, the boundary is a minimal hypersurface. Consider now the smooth function $f = (G\psi,\psi)$ which takes real values, since from (15), $G$ is pointwise self-adjoint. Moreover, if we take $\varphi = G\psi$ in equality (9), considering that $D$ and $G$ anticommute (because of (16)) and using the boundary condition $\gamma(N)G\psi = \psi$, we have
\[ 2\lambda^{\mathrm{CHI}}\int_M f = \int_{\partial M}|\psi|^2. \]
This yields the following two important facts:
\[ \lambda^{\mathrm{CHI}} \ne 0 \quad \text{and} \quad f \not\equiv 0. \]
On the other hand, since $\psi$ is a Killing spinor and from (15), one can easily compute that the Hessian of the function $f$ is given by
\[ \nabla^2 f = -\Big(\frac{2\lambda^{\mathrm{CHI}}}{n}\Big)^2 f\,\langle\ ,\ \rangle. \]
In other words, the function $f$ is a non-trivial solution on $M$ of the Obata equation. But using again the boundary condition satisfied by the eigenspinor $\psi$ we see that
\[ f|_{\partial M} = (G\psi|_{\partial M},\psi|_{\partial M}) = (\gamma(N)G\psi|_{\partial M},\gamma(N)\psi|_{\partial M}) = (\psi|_{\partial M},\gamma(N)\psi|_{\partial M}), \]
and hence $f|_{\partial M}$ is identically zero since $(\psi|_{\partial M},\gamma(N)\psi|_{\partial M})$ is a purely imaginary function. Now we apply the boundary version of the Obata theorem found by Reilly in [Re] in order to conclude that $M$ is isometric to the required half-sphere.
5.3. The Riemannian version of the MIT bag condition. In the seventies, some physicists at the Massachusetts Institute of Technology proposed a model for elementary particles (see [CJJTW, CJJT, J]) which was later called the MIT bag model. It works with fields confined in a finite region of space which, in the massless spin-$\frac12$ case, are modeled by spinor fields satisfying the (Lorentzian) Dirac equation defined in the region of the space-time swept out by a bounded bag of a given rest-space. These solutions of the Dirac equation should obey a local boundary condition, which we shall examine now in the Riemannian setup. It is interesting to point out that such a Riemannian version of the MIT boundary condition has been used in another context (see [FGMSS, HMZ3]), because of its invariance under conformal changes of the metric of the manifold.
Consider the pointwise endomorphism $i\gamma(N) : \Gamma(S) \to \Gamma(S)$ acting on sections of the spinor bundle of the compact Riemannian spin manifold $M$ restricted to the boundary hypersurface $\partial M$, where $N$ is the inner unit normal field along $\partial M$. The square of this endomorphism is the identity and its eigenvalues are $\pm1$ with the same multiplicity. In a similar way to the CHI condition, we denote by $V \to \partial M$ the eigensubbundle of $S$ corresponding to the eigenvalue $-1$ and so we have again
\[ \operatorname{rank} V = \tfrac12 \operatorname{rank} S = 2^{[\frac{n}{2}]-1}. \]
The new boundary condition $B_{\mathrm{MIT}} : L^2(S) \to L^2(V)$ is also the corresponding orthogonal projection
\[ B_{\mathrm{MIT}} = \tfrac12(I - i\gamma(N)) \]
onto $V$. Hence, for each vector $u$ tangent to the boundary, the principal symbol $b_{\mathrm{MIT}}$ has the following properties:
\[ b_{\mathrm{MIT}}(u) = \tfrac12(I - i\gamma(N)), \qquad \dim \operatorname{image} b_{\mathrm{MIT}} = \operatorname{rank} V = 2^{[\frac{n}{2}]-1}. \]
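The two ellipticity conditions of Proposition 1 can be tested numerically for this symbol in a low-dimensional model. The sketch below is an illustration under stated assumptions, not the authors' argument: it takes $n = 3$, the flat model $\gamma(e_j) = i\sigma_j$ (Pauli matrices), $N = e_3$, and samples eight tangent directions $u$; all helper names are hypothetical. Injectivity of $b_{\mathrm{MIT}}(u)$ on $\operatorname{image} p_+(D)(u)$ is tested through the equivalent rank condition $\operatorname{rank}\big(b_{\mathrm{MIT}}(u)\,p_+(D)(u)\big) = \operatorname{rank} p_+(D)(u)$.

```python
import math

SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)
I2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def rank(a, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [[complex(x) for x in row] for row in a]
    r = 0
    for c in range(len(m[0])):
        pivots = [i for i in range(r, len(m)) if abs(m[i][c]) > tol]
        if not pivots:
            continue
        piv = max(pivots, key=lambda i: abs(m[i][c]))
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def gamma(v):
    """Assumed Clifford multiplication on R^3: gamma(e_j) = i * sigma_j."""
    out = [[0, 0], [0, 0]]
    for c, s in zip(v, SIGMA):
        out = add(out, scale(1j * c, s))
    return out

def calderon_symbol(u, normal):
    """p_+(D)(u) = -(1/(2|u|)) (i gamma(N) gamma(u) - |u| I)."""
    norm_u = math.sqrt(sum(c * c for c in u))
    a = scale(1j, matmul(gamma(normal), gamma(u)))
    return scale(-1 / (2 * norm_u), add(a, scale(-norm_u, I2)))

N = (0.0, 0.0, 1.0)                                 # unit normal, assumed e_3
b_mit = scale(0.5, add(I2, scale(-1j, gamma(N))))   # (I - i gamma(N)) / 2
half_rank = 2 ** (3 // 2 - 1)                       # 2^([n/2]-1) with n = 3

results = []
for k in range(8):
    theta = 2 * math.pi * k / 8
    u = (1.7 * math.cos(theta), 1.7 * math.sin(theta), 0.0)
    p_plus = calderon_symbol(u, N)
    injective = rank(matmul(b_mit, p_plus)) == rank(p_plus)
    right_rank = rank(b_mit) == half_rank
    results.append(injective and right_rank)

elliptic_on_samples = all(results)
print(elliptic_on_samples)
```

A finite sample of directions is of course no proof; in this model the check succeeds for every $u \neq 0$ because $b_{\mathrm{MIT}}$ is the projection onto the first coordinate and the first diagonal entry of $p_+(D)(u)$ always equals $\frac12$.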
As above, from this it is immediate to check the ellipticity conditions given by Proposition 1. Now we state the corresponding result estimating the eigenvalues of the corresponding eigenvalue boundary problem.
Theorem 4. Let $M$ be a compact Riemannian spin manifold whose non-empty boundary $\partial M$ has non-negative mean curvature (w.r.t. the inner normal). Under the MIT bag boundary condition, the spectrum of the Dirac operator $D$ of $M$ is an unbounded discrete set of complex numbers $\lambda^{\mathrm{MIT}}$ with positive imaginary part which satisfy the following inequality
\[ |\lambda^{\mathrm{MIT}}|^2 > \frac{n}{4(n-1)}\inf_M R. \]
Proof. Let $\lambda^{\mathrm{MIT}}$ be an eigenvalue of the considered problem. That is
\[ D\psi = \lambda^{\mathrm{MIT}}\psi, \qquad B_{\mathrm{MIT}}\psi = 0, \quad \text{i.e.,} \quad i\gamma(N)\psi = \psi \]
for a non-trivial spinor field $\psi$ on $M$. Then, taking such a spinor in (9) and choosing $\varphi = i\psi$ we obtain
\[ 2\operatorname{Im}(\lambda^{\mathrm{MIT}})\int_M |\psi|^2 = \int_{\partial M}|\psi|^2 \]
and so the eigenvalue has non-negative imaginary part. If $\operatorname{Im}(\lambda^{\mathrm{MIT}}) = 0$, then the restriction $\psi|_{\partial M}$ would vanish and the unique continuation principle (see for instance [BW, Chap. 8]) would imply that $\psi$ is identically zero on the whole of $M$. Hence all the eigenvalues for the MIT boundary condition belong to the upper complex half-plane and so (see [Hö]) the spectrum has to be discrete.
In order to obtain the estimate of the modulus of $\lambda^{\mathrm{MIT}}$ we will proceed as in the two previous cases by taking the associated eigenspinor $\psi$ in the Reilly inequality (10) and recalling that we are assuming that $\partial M$ is mean-convex, i.e., $H \ge 0$. Then
\[ \int_M \Big( \frac14 R - \frac{n-1}{n}|\lambda^{\mathrm{MIT}}|^2 \Big)|\psi|^2 \le \int_{\partial M}(\mathbf{D}\psi,\psi), \]
where $\mathbf{D}$ is the hypersurface Dirac operator. Here the mass integrand is identically zero, since by (8), one has
\[ (\mathbf{D}\psi,\psi) = (i\gamma(N)\mathbf{D}\psi, i\gamma(N)\psi) = -(\mathbf{D}\psi,\psi). \]
This proves the inequality. We still need to prove that equality cannot be achieved. Assume the contrary, that is
\[ |\lambda^{\mathrm{MIT}}|^2 = \frac{n}{4(n-1)}\inf_M R, \]
then we also have equality in (10) and $\psi$ is a twistor eigenspinor. In other words, $\psi$ is a Killing spinor field with associated constant $-\lambda^{\mathrm{MIT}}/n$. But it is well-known [BFGK] that non-trivial imaginary Killing spinors only live on Einstein manifolds with negative scalar curvature. This contradicts the equality above.
5.4. A new boundary condition for the Dirac operator. We have just studied, in a unified frame, the spectra of the fundamental Dirac operator on a compact Riemannian manifold with boundary, under three more or less known (global and local) boundary conditions. Finally, we introduce a new global condition which, in our opinion, is of special interest, since the corresponding eigenvalues satisfy again (F) and the limiting case includes relevant geometries. We again choose the bundle $V \to \partial M$ to be the restricted spinor bundle $S$ and introduce the following operator $B_{\mathrm{mAPS}} : L^2(S) \to L^2(S)$ given by
\[ B_{\mathrm{mAPS}} = B_{\mathrm{APS}}(I + \gamma(N)), \]
which is the composition of the zero order differential operator $I + \gamma(N)$ and the APS pseudo-differential operator. This composition is also a pseudo-differential operator of zero order (see for example [LM]) and, from (12), its principal symbol $b_{\mathrm{mAPS}}$ satisfies, for all $u \in T\partial M$, the relation
\[ b_{\mathrm{mAPS}}(u) = b_{\mathrm{APS}}(u)(I + \gamma(N)) = \tfrac12\big(-i\gamma(N)\gamma(u) + |u|I\big)(I + \gamma(N)). \]
The first ellipticity condition in Proposition 1 now follows immediately. For the second one, take into account that $I + \gamma(N)$ is an isomorphism and so
\[ \dim \operatorname{image} b_{\mathrm{mAPS}}(u) = \dim \operatorname{image} b_{\mathrm{APS}}(u) = 2^{[\frac{n}{2}]-1}. \]
Once we have checked the ellipticity of the proposed boundary condition, we have now:
Theorem 5. Let M be a compact Riemannian spin manifold whose non-empty boundary ∂M has non-negative mean curvature (w.r.t. the inner normal). Under the modified APS boundary condition, BmAPS = BAPS (I + γ (N )) = 0, the spectrum of the Dirac operator D of M is a non-decreasing sequence of real numbers {λk | k ∈ Z} tending to ±∞ which satisfy the following inequality: λ2k ≥
n inf R, 4(n − 1) M
k ∈ Z.
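Moreover, the equality holds if and only if M carries a non-trivial real Killing spinor field with negative Killing constant and the boundary ∂M is minimal.

Proof. We first prove that, under the boundary condition $B_{\mathrm{mAPS}}$, D is symmetric. In fact, let $\psi, \varphi \in \Gamma(SM)$ be such that
\[ B_{\mathrm{mAPS}}\psi = B_{\mathrm{APS}}(\psi + \gamma(N)\psi) = 0, \qquad B_{\mathrm{mAPS}}\varphi = B_{\mathrm{APS}}(\varphi + \gamma(N)\varphi) = 0. \]
Using (14), we deduce that $B_{\mathrm{APS}}(\gamma(N)\psi - \psi) = \gamma(N)\psi - \psi$. Now as $B_{\mathrm{APS}}$ is an orthogonal $L^2$-projection, we have
\[ \int_{\partial M} (\gamma(N)\psi, \varphi) = \int_{\partial M} (\psi + B_{\mathrm{APS}}\chi, \varphi) = \int_{\partial M} (\psi, \varphi) + \int_{\partial M} (\chi, B_{\mathrm{APS}}\varphi), \]
where we have put $\chi = \gamma(N)\psi - \psi$. But $B_{\mathrm{APS}}\varphi = -B_{\mathrm{APS}}\gamma(N)\varphi$ and hence
\[ \int_{\partial M} (\gamma(N)\psi, \varphi) = \int_{\partial M} (\psi, \varphi) - \int_{\partial M} (B_{\mathrm{APS}}\chi, \gamma(N)\varphi). \]
Finally, we use that $B_{\mathrm{APS}}\chi = \chi$ to get
\[ \int_{\partial M} (\gamma(N)\psi, \varphi) = \int_{\partial M} (\psi, \gamma(N)\varphi). \]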
As a consequence we have from (9) the required symmetry. So the considered spectrum is real. We take again an eigenspinor ψ corresponding to an eigenvalue $\lambda_k$, $k \in \mathbb{Z}$, satisfying the boundary condition $B_{\mathrm{mAPS}}\psi = B_{\mathrm{APS}}(\psi + \gamma(N)\psi) = 0$, which we plug in inequality (10). Under the assumption H ≥ 0, the claimed inequality follows, if we show that the boundary mass term
\[ \int_{\partial M} (D\psi, \psi) \]
vanishes. In fact, the supercommutativity relation (8) implies that
\[ (D\psi, \psi) = \frac{1}{2}\big(D(\psi + \gamma(N)\psi),\, \psi - \gamma(N)\psi\big). \]
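But, since $B_{\mathrm{APS}}(\psi + \gamma(N)\psi) = 0$ and $B_{\mathrm{APS}}(\psi - \gamma(N)\psi) = \psi - \gamma(N)\psi$, a suitable use of (13) gives
\[ \int_{\partial M} \big(D(\psi + \gamma(N)\psi),\, \psi - \gamma(N)\psi\big) = 0. \]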
If the equality occurs we deduce, as in the three preceding cases, that ψ is a non-trivial real Killing spinor on M. Then its length is a non-trivial constant and so H must be zero as claimed. Moreover, from (7) and (8) we get
\[ D(\psi + \gamma(N)\psi) = -\frac{n-1}{n}\,\lambda\,(\psi + \gamma(N)\psi). \]
Since $B_{\mathrm{mAPS}}\psi = 0$ (and so $\pi_0\psi = 0$) we deduce that λ > 0. Conversely, assume that M is a compact Riemannian spin manifold with minimal boundary ∂M carrying a non-trivial Killing spinor ψ with a real Killing constant −λ/n < 0. It is clear that Dψ = λψ. Moreover, from (7) and the fact that $\nabla_N\psi = -(\lambda/n)\gamma(N)\psi$ we have that the restriction of ψ to the boundary satisfies
\[ D\psi = -\frac{n-1}{n}\,\lambda\,\gamma(N)\psi. \]
From this and (8) we have that
\[ D(\psi + \gamma(N)\psi) = -\frac{n-1}{n}\,\lambda\,(\psi + \gamma(N)\psi). \]
Since we assumed λ > 0, the spinor field ψ + γ(N)ψ is an eigenspinor of D associated with a negative eigenvalue. Then its APS projection has to vanish and so $B_{\mathrm{mAPS}}\psi = 0$.

References

[APS] Atiyah, M.F., Patodi, V.K., Singer, I.M.: Spectral asymmetry and Riemannian geometry, I, II and III. Math. Proc. Cambridge Phil. Soc. 77, 43–69 (1975); 78, 405–432 (1975) and 79, 71–99 (1975)
[Bä1] Bär, C.: Real Killing spinors and holonomy. Commun. Math. Phys. 154, 509–521 (1993)
[Bä2] Bär, C.: Extrinsic bounds of the Dirac operator. Ann. Glob. Anal. Geom. 16, 573–596 (1998)
[BFGK] Baum, H., Friedrich, T., Grünewald, R., Kath, I.: Twistor and Killing Spinors on Riemannian Manifolds. Seminarbericht 108, Humboldt-Universität zu Berlin, 1990
[BW] Booß-Bavnbek, B., Wojciechowski, K.P.: Elliptic Boundary Problems for the Dirac Operator. Basel: Birkhäuser, 1993
[BHMM] Bourguignon, J.P., Hijazi, O., Milhorat, J.-L., Moroianu, A.: A Spinorial Approach to Riemannian and Conformal Geometry. Monograph (In Preparation)
[BrL] Brüning, J., Lesch, M.: Spectral theory of boundary value problems for Dirac type operators. Contemp. Math. 242, 203–215 (1999)
[Bun] Bunke, U.: Comparison of Dirac operators on manifolds with boundary. Supplemento di Rend. Circ. Mat. Palermo, Serie II, 30, 133–141 (1993)
[Bur] Bureš, J.: Dirac operator on hypersurfaces. Comment. Math. Univ. Carolin. 34, 313–322 (1993)
[CJJTW] Chodos, A., Jaffe, R.L., Johnson, K., Thorn, C.B., Weisskopf, V.F.: New extended model of hadrons. Phys. Rev. D 9, 3471–3495 (1974)
[CJJT] Chodos, A., Jaffe, R.L., Johnson, K., Thorn, C.B.: Baryon structure in the bag theory. Phys. Rev. D 10, 2599–2604 (1974)
[FGMSS] Falomir, H., Gamboa, R.E., Muschietti, M.A., Santangelo, E.M., Solomin, J.E.: Determinants of Dirac operators with local boundary conditions. J. Math. Phys. 37, 5805–5819 (1996)
[FS] Farinelli, S., Schwarz, G.: On the spectrum of the Dirac operator under boundary conditions. J. Geom. Phys. 28, 67–84 (1998)
[Fr1] Friedrich, T.: Der erste Eigenwert des Dirac-Operators einer kompakten Riemannschen Mannigfaltigkeit nicht negativer Skalarkrümmung. Math. Nach. 97, 117–146 (1980)
[Fr2] Friedrich, T.: Dirac Operators in Riemannian Geometry. A.M.S. Graduate Studies in Math., Vol. 25, Providence, RI: AMS, 2000
[GHHP] Gibbons, G., Hawking, S., Horowitz, G., Perry, M.: Positive mass theorems for black holes. Commun. Math. Phys. 88, 295–308 (1983)
[GLP] Gilkey, P.B., Leahy, J.V., Park, J.: Spectral Geometry, Riemannian Submersions and the Gromov–Lawson Conjecture. Studies in Advanced Mathematics, Boca Raton: Chapman & Hall/CRC, 1999
[He1] Herzlich, M.: The positive mass theorem for black holes revisited. J. Geom. Phys. 26, 97–111 (1998)
[He2] Herzlich, M.: A Penrose-like inequality for the mass of Riemannian asymptotically flat manifolds. Commun. Math. Phys. 188, 121–133 (1998)
[HMZ1] Hijazi, O., Montiel, S., Zhang, X.: Dirac operator on embedded hypersurfaces. Math. Res. Lett. 8, 195–208 (2001)
[HMZ2] Hijazi, O., Montiel, S., Zhang, X.: Eigenvalues of the Dirac operator on manifolds with boundary. Commun. Math. Phys. 221, 255–265 (2001)
[HMZ3] Hijazi, O., Montiel, S., Zhang, X.: Conformal lower bounds for the Dirac operator of embedded hypersurfaces. Asian J. Math. 6(1), 23–36 (2002)
[Hö] Hörmander, L.: The Analysis of Linear Partial Differential Operators III. Berlin: Springer, 1985
[J] Johnson, K.: The M.I.T. bag model. Acta Phys. Pol. B6, 865–892 (1975)
[LM] Lawson, H.B., Michelsohn, M.L.: Spin Geometry. Princeton Math. Series, Vol. 38, Princeton, NJ: Princeton University Press, 1989
[Li] Lichnerowicz, A.: Spineurs harmoniques. C.R. Acad. Sci. Paris 257, Série I, 7–9 (1963)
[Lo] Lopatinsky, J.: On a method for reducing boundary problems for systems of differential equations of elliptic type to regular integral equations. Ukrain. Math. Z. 5, 125–151 (1953)
[Mo] Morel, B.: Eigenvalue estimates for the Dirac-Schrödinger operators. J. Geom. Phys. 38, 1–18 (2001)
[Re] Reilly, R.C.: Applications of the Hessian operator in a Riemannian manifold. Indiana Univ. Math. J. 26, 459–472 (1977)
[Se] Seeley, R.: Singular integrals and boundary problems. Am. J. Math. 88, 781–809 (1966)
[Tr] Trautman, A.: The Dirac operator on hypersurfaces. Acta Phys. Pol. 26, 1283–1310 (1995)
[Wi] Witten, E.: A new proof of the positive energy theorem. Commun. Math. Phys. 80, 381–402 (1981)
Communicated by M. Aizenman
Commun. Math. Phys. 231, 391–434 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0729-9
Communications in
Mathematical Physics
The Landau Equation in a Periodic Box
Yan Guo
Lefschetz Center for Dynamical Systems, Division of Applied Mathematics, Brown University, Providence, RI 02912, USA. E-mail:
[email protected] Received: 4 February 2002 / Accepted: 10 July 2002 Published online: 29 October 2002 – © Springer-Verlag 2002
Abstract: The Landau equation, which was proposed by Landau in 1936, is a fundamental equation to describe collisions among charged particles interacting with their Coulombic force. In this article, global in time classical solutions near Maxwellians are constructed for the Landau equation in a periodic box. Our result also covers a class of generalized Landau equations, which describes grazing collisions in a dilute gas.

1. Introduction and Notations

We consider the following generalized Landau equation:
\[ \partial_t F + v\cdot\nabla_x F = \nabla_v\cdot\Big\{ \int_{\mathbb{R}^3} \phi(v-v')\big[ F(v')\nabla_v F(v) - F(v)\nabla_{v'}F(v') \big]\,dv' \Big\}, \qquad F(0,x,v) = F_0(x,v), \tag{1} \]
where $F(t,x,v) \ge 0$ is the spatially periodic distribution function for the particles at time t ≥ 0, with spatial coordinates $x = (x_1,x_2,x_3) \in [-\pi,\pi]^3 = \mathbb{T}^3$ and velocity $v = (v_1,v_2,v_3) \in \mathbb{R}^3$. The non-negative matrix φ is
\[ \phi^{ij}(v) = \Big\{ \delta_{ij} - \frac{v_i v_j}{|v|^2} \Big\}|v|^{\gamma+2}, \qquad \gamma \ge -3. \tag{2} \]
The original Landau collision operator for the Coulombic interaction corresponds to the case γ = −3. The conservation of mass, momentum as well as energy, can be formulated as (i = 1, 2, 3)
\[ \frac{d}{dt}\int_{\mathbb{T}^3\times\mathbb{R}^3} F(t) = \frac{d}{dt}\int_{\mathbb{T}^3\times\mathbb{R}^3} v_i F(t) = \frac{d}{dt}\int_{\mathbb{T}^3\times\mathbb{R}^3} |v|^2 F(t) \equiv 0. \]
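As a quick sanity check on the kernel (2), the following minimal Python sketch evaluates $\phi^{ij}(v)$ at a sample point and measures how far $\phi(v)v$ is from zero. It is an illustration only and not part of the original analysis; the helper names `phi`, `matvec` and `quad_form` and the sample vector are assumptions made here.

```python
def phi(v, gamma):
    """Kernel (2): phi^{ij}(v) = (delta_ij - v_i v_j / |v|^2) |v|^(gamma+2), for v != 0."""
    r2 = sum(c * c for c in v)
    if r2 == 0.0:
        raise ValueError("this sketch evaluates phi only for v != 0")
    scale = r2 ** ((gamma + 2.0) / 2.0)  # |v|^(gamma+2)
    return [[((1.0 if i == j else 0.0) - v[i] * v[j] / r2) * scale
             for j in range(3)] for i in range(3)]


def matvec(m, z):
    """Matrix-vector product for a 3x3 matrix stored as nested lists."""
    return [sum(m[i][j] * z[j] for j in range(3)) for i in range(3)]


def quad_form(m, z):
    """Quadratic form z^T m z."""
    return sum(z[i] * m[i][j] * z[j] for i in range(3) for j in range(3))


if __name__ == "__main__":
    v = [0.3, -1.2, 2.0]  # arbitrary non-zero sample velocity
    for gamma in (-3.0, 0.0, 1.0):
        residual = max(abs(c) for c in matvec(phi(v, gamma), v))
        print(gamma, residual)  # phi(v) v vanishes up to rounding
```

The matrix is a non-negative multiple of the orthogonal projection onto the plane perpendicular to v, which is why the quadratic form is non-negative and v lies in its kernel.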
As in the Boltzmann equation, it is well-known that Maxwellians are steady states to the Landau equation (1). We linearize the Landau equation around a normalized Maxwellian
\[ \mu = e^{-|v|^2}, \]
with the standard perturbation f(t, x, v) to µ as
\[ F = \mu + \sqrt{\mu}\, f. \tag{3} \]
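We define
\[ Q[F,G] = \nabla_v\cdot\int_{\mathbb{R}^3} \phi(v-v')\big[ F(v')\nabla_v G(v) - G(v)\nabla_{v'}F(v') \big]\,dv' = \partial_i \int_{\mathbb{R}^3} \phi^{ij}(v-v')\big[ F(v')\partial_j G(v) - G(v)\partial_j F(v') \big]\,dv'. \tag{4} \]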
It is well-known that $Q(\mu,\mu) = 0$. By expanding $Q[\mu + \sqrt{\mu}g_1, \mu + \sqrt{\mu}g_2]$, we define
\[ Q[\mu + \sqrt{\mu}g_1, \mu + \sqrt{\mu}g_2] \equiv Q[\mu,\mu] + \sqrt{\mu}\,\{Kg_1 + Ag_2 + \Gamma[g_1,g_2]\}, \]
see A, K and Γ in (14), (15) and (16). Let
\[ \sigma^{ij}(v) = \phi^{ij}*\mu = \int \phi^{ij}(v-v')\mu(v')\,dv', \qquad \sigma^{i}(v) = \phi^{ij}*[v_j\mu] = [\phi^{ij}*\mu]\,v_j = \int \phi^{ij}(v-v')\,v'_j\,\mu(v')\,dv'. \tag{5} \]
Define the linear operator
\[ L = -A - K. \tag{6} \]
The Landau equation (1) for f(t, x, v) now takes the form
\[ \partial_t f + v\cdot\nabla_x f + Lf = \Gamma[f,f], \tag{7} \]
with $f(0,x,v) = f_0(x,v)$. By assuming that initially $F_0(x,v)$ has the same mass, momentum and total energy as the Maxwellian µ, we can then rewrite the conservation laws as (i = 1, 2, 3)
\[ \int_{\mathbb{T}^3\times\mathbb{R}^3} f(t,x,v)\sqrt{\mu} = \int_{\mathbb{T}^3\times\mathbb{R}^3} v_i f(t,x,v)\sqrt{\mu} = \int_{\mathbb{T}^3\times\mathbb{R}^3} |v|^2 f(t,x,v)\sqrt{\mu} \equiv 0. \tag{8} \]

Notations. For notational simplicity, we shall use $\langle\cdot,\cdot\rangle$ to denote the standard $L^2$ inner products in either $\mathbb{R}^3_v$ or $\mathbb{T}^3_x$, with its corresponding $L^2$ norm $|\cdot|_2$. We denote by $(\cdot,\cdot)$ the $L^2$ inner product in $\mathbb{T}^3\times\mathbb{R}^3$ with its corresponding $L^2$ norm $\|\cdot\|$. Let α, β denote multi-indices with length |α| and |β| respectively, and we define
\[ \partial^{\alpha}_{\beta} \equiv \partial_x^{\alpha}\partial_v^{\beta}. \]
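If each component of β is not greater than that of α's, we denote it by β ≤ α. We also define β < α if β ≤ α and |β| < |α|. We also denote $\binom{\alpha}{\beta}$ by $C^{\beta}_{\alpha}$. We introduce a weight function of v as
\[ w = w(v) = [1 + |v|]^{\gamma+2}. \]
We denote the weighted $L^2$ norms as
\[ |g|^2_{2,\theta} \equiv \int_{\mathbb{R}^3} w^{2\theta} g^2\,dv, \qquad \|g\|^2_{\theta} \equiv \int_{\mathbb{T}^3\times\mathbb{R}^3} w^{2\theta} g^2\,dx\,dv. \]
Recalling (5), we define the weighted norm:
\[ |g|^2_{\sigma,\theta} \equiv \int_{\mathbb{R}^3} w^{2\theta}\big[ \sigma^{ij}\partial_i g\,\partial_j g + \sigma^{ij}v_i v_j g^2 \big]\,dv, \qquad \|g\|^2_{\sigma,\theta} \equiv \int_{\mathbb{T}^3\times\mathbb{R}^3} w^{2\theta}\big[ \sigma^{ij}\partial_i g\,\partial_j g + \sigma^{ij}v_i v_j g^2 \big]\,dv\,dx, \tag{9} \]
where $|\cdot|_{\sigma} \equiv |\cdot|_{\sigma,0}$, and $\|\cdot\|_{\sigma} \equiv \|\cdot\|_{\sigma,0}$. Notice that such a norm, which is closely related to the structure of the matrix σ and the linear operator L (Lemma 5), is anisotropic with respect to directions of the velocity variable v (Corollary 1). We define the high order energy norm as
\[ \mathcal{E}(f(t,x,v)) \equiv \sum_{|\alpha|+|\beta|\le N} \Big\{ \frac{1}{2}\|\partial^{\alpha}_{\beta} f(t)\|^2_{|\beta|} + \int_0^t \|\partial^{\alpha}_{\beta} f(s)\|^2_{\sigma,|\beta|}\,ds \Big\}, \quad \text{if } \gamma + 2 \le 0, \]
\[ \mathcal{E}(f(t,x,v)) \equiv \sum_{|\alpha|+|\beta|\le N} \Big\{ \frac{1}{2}\|\partial^{\alpha}_{\beta} f(t)\|^2 + \int_0^t \|\partial^{\alpha}_{\beta} f(s)\|^2_{\sigma}\,ds \Big\}, \quad \text{if } \gamma + 2 \ge 0. \tag{10} \]
We also denote
\[ \mathcal{E}(f_0) \equiv \mathcal{E}(f(0)) \equiv \sum_{|\beta|+|\alpha|\le N} \|\partial^{\alpha}_{\beta} f_0\|^2_{|\beta|}, \quad \text{if } \gamma + 2 < 0; \qquad \mathcal{E}(f_0) \equiv \mathcal{E}(f(0)) \equiv \sum_{|\beta|+|\alpha|\le N} \|\partial^{\alpha}_{\beta} f_0\|^2, \quad \text{if } \gamma + 2 \ge 0. \]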
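To make the weighted norms concrete, here is a small Python sketch that approximates $|g|^2_{2,\theta} = \int_{\mathbb{R}^3} w^{2\theta} g^2\,dv$ for a radial function g by a midpoint rule in the radial variable. It is an illustration, not taken from the paper: the function names, the truncation radius `r_max` and the number of nodes `n` are assumptions chosen for the example.

```python
import math


def weight(r, gamma):
    """Weight w(v) = [1 + |v|]^(gamma+2), written as a function of r = |v|."""
    return (1.0 + r) ** (gamma + 2.0)


def sqrt_mu(r):
    """Radial profile of mu^{1/2}, with mu = exp(-|v|^2)."""
    return math.exp(-0.5 * r * r)


def weighted_l2_norm_sq_radial(g, gamma, theta, r_max=12.0, n=4000):
    """Midpoint-rule approximation of int_{R^3} w^{2 theta} g(|v|)^2 dv for radial g."""
    h = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        # 4 pi r^2 dr is the volume element for radial integrands in R^3
        total += weight(r, gamma) ** (2.0 * theta) * g(r) ** 2 * r * r
    return 4.0 * math.pi * total * h


if __name__ == "__main__":
    # Coulomb case gamma = -3: w = 1/(1+|v|), so positive theta damps large velocities
    for theta in (0.0, 1.0, 2.0):
        print(theta, weighted_l2_norm_sq_radial(sqrt_mu, gamma=-3.0, theta=theta))
```

For θ = 0 the weight drops out and the value reduces to $\int e^{-|v|^2}dv = \pi^{3/2}$, which gives a simple check on the quadrature.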
Throughout this article, we shall assume N ≥ 8 and we shall use the summation convention from time to time. Such a high Sobolev norm is a standard choice in constructions of small data solutions in many PDE: since the $L^\infty$ norm in x is easily controlled by the Sobolev imbedding as in (55), such a high Sobolev norm of a product is bounded by a product of the same norms. The same is true for any general nonlinearity. Our main result is as follows:

Theorem 1. Let γ ≥ −3. Assume that $f_0(x,v)$ satisfies (8) and $F_0(x,v) = \mu + \sqrt{\mu}f_0(x,v) \ge 0$. There are constants $C_0 > 0$ and $\varepsilon_0 > 0$, such that if $\mathcal{E}(f_0) < \varepsilon_0$, there exists a unique global solution f(t, x, v) to the Landau equation (7) with $F(t,x,v) = \mu + \sqrt{\mu}f(t,x,v) \ge 0$, and
\[ \sup_{0\le s\le\infty} \mathcal{E}(f(s)) \le C_0\,\mathcal{E}(f_0). \]
Although our theorem is still valid for certain γ even below −3, we shall focus only on the case of γ ≥ −3 due to its physical significance. Despite its importance, few global solutions have been constructed, either for the original Landau equation with γ = −3 or for the spatially inhomogeneous case. In the spatially homogeneous case, global solutions are constructed in the case where $\phi^{ij}(v)$ is smooth and bounded [A]. Classical solutions have been constructed for the hard potential γ > 0 [DV]. For the Coulombic interaction with γ = −3 with no x-dependence, global weak solutions have been studied in [V], while no global classical solutions had been known. On the other hand, in the presence of x-dependence, weak solutions have been constructed [L, V2] for γ = −3, up to some defect measures.

Our construction of the global solutions is based on an energy method, which was developed recently by the author in the study of the Vlasov-Poisson-Boltzmann equation [G1]. Instead of studying the linearized problem, we investigate the energy estimates for the full nonlinear problem. This nonlinear energy method also leads to construction of classical solutions near Maxwellians for the Boltzmann equation with an inverse power law of γ > −3, under the assumption of an angular cutoff [G2]. For simplicity, we illustrate our method for the most interesting and difficult case with γ = −3, the original Landau equation for Coulombic interaction. Notice that in this case, the dissipation from the collisions is very weak and the linear decay estimate is difficult to obtain. In the energy estimate (79) for mixed x and v derivatives, it is important to control pure x derivatives $\int_0^t \|\partial^{\alpha} f(s)\|^2_{\sigma}\,ds$. It turns out that if there is a positive lower bound for the quadratic term
\[ \int_0^t \big( L\partial^{\alpha} f(s), \partial^{\alpha} f(s) \big)\,ds \ge \delta \int_0^t \|\partial^{\alpha} f(s)\|^2_{\sigma}\,ds, \tag{11} \]
then solutions with small amplitude can be extended to t = ∞. Unfortunately, although $\langle Lf, f\rangle \ge \delta|f|^2_{\sigma}$ for a function f(v) which is orthogonal to the five dimensional null space of L, (11) is not true when f(x, v) is a function of x satisfying the conservation laws (8). A simple example of this fact is given by $f(x,v) = \sin x\,\sqrt{\mu}$. Based on the idea in [G1], we demonstrate that although (11) is not true for general f(x, v), a similar type of estimate is indeed valid for any solution f(t, x, v) with small amplitude to (7):

Theorem 2. Assume that $f_0(x,v)$ satisfies (8). Let f(t, x, v) be the classical solution to the Landau equation (7) in $[0, T^*)$. Assume for $0 \le t < T^*$, we have $\mathcal{E}(f(t)) \le M$. Then there exists a constant $0 < \delta_M < 1$, such that for any $t_1 \ge 0$, and any nonnegative integer n with $0 \le t_1 + n < T^*$,
\[ \sum_{|\alpha|\le N} \int_{t_1}^{t_1+n} \big( L[\partial^{\alpha} f(s)], \partial^{\alpha} f(s) \big)\,ds \ge \delta_M \sum_{|\alpha|\le N} \int_{t_1}^{t_1+n} \|\partial^{\alpha} f(s)\|^2_{\sigma}\,ds. \tag{12} \]

Notice that the dissipation coefficient $\delta_M$ depends only on M, and is uniform in n. Equation (12) can be viewed as a weaker version of the time decay estimates with no specific decay rate. Arguing by contradiction, we need to show that if (12) does not hold, then there is a limiting normalized function
\[ Z(t,x,v) = a(t,x)\sqrt{\mu} + b(t,x)\cdot v\sqrt{\mu} + c(t,x)|v|^2\sqrt{\mu}, \tag{13} \]
where $\int_0^1 \|Z(s)\|^2_{\sigma}\,ds = 1$, and Z(t, x, v) is a solution to some related normalized nonlinear transport equation (108). In the next step, as in [G1], with the smallness assumption
of f, as well as the conservation laws (8), we deduce that $\int_0^1 \|Z(s)\|^2_{\sigma}\,ds \le CM$. This is a contradiction to the normalization of Z(t, x, v), when M is small.

In order to carry out such a process for the Landau equation, however, we have to overcome three major difficulties. The first one is to control the product of $\partial_{\beta} f$ and $\partial_{\beta}[v_j\partial^j f]$, the v-derivatives $\partial_{\beta}$ of the free streaming term $v_j\partial^j f$. Having constant coefficients, $\|\partial_{\beta}[v_j\partial^j f]\|$ can not be directly bounded by $\|\partial_{\beta} f\|^2_{\sigma}$ from the dissipation (Lemma 5), which contains only a weighted norm $\|[1+|v|]^{-1/2}\partial_{\beta} f\|^2$ rather than the desired $\|\partial_{\beta} f\|^2$. We introduce some w-weighted norms as in (10) which depend on the number of v-derivatives, and we estimate the weighted product of $\partial_{\beta} f$ and $\partial_{\beta}[v_j\partial^j f]$ instead. Since $\partial_{\beta}[v_j\partial^j f]$ has fewer v-derivatives and thus requires less w-weight, we can put more w-weight on the first factor $\partial_{\beta} f$, which can be bounded by the appropriate weighted norm of $[1+|v|]^{-1/2}\partial_{\beta} f$ from the dissipation.

The estimate of the nonlinear term $\Gamma[g_1,g_2]$ in terms of our weighted norm (10) is particularly delicate due to the anisotropy in the norm $\|\cdot\|_{\sigma}$ [DeL]: the derivative along the v-direction has the worst weight factor of $[1+|v|]^{-3/2}$. It is not expected that the nonlinear operator $\Gamma[g_1,g_2]$ should also preserve the anisotropic structure along the same directions of the linear operator σ. By expanding $\phi^{ij}(v-v')$ around $\phi^{ij}(v)$, and noticing that $\phi^{ij}(v)$, $\partial_j\phi^{ij}(v)$ have exactly the same eigenvectors as $\sigma^{ij}(v)$, we are able to employ some crucial cancellations to overcome this second difficulty.

The third difficulty is to verify that, in the same limit as n → ∞ in [G1], (13) gives the same limiting function for the Landau equation as in [G1]. Unlike in [G1], the term $\partial_i\sigma^i f$ in the operator L lacks the x-compactness. We overcome this difficulty by using a more general and simpler argument, which is based on a projection in $L^2(\mathbb{R}^3_v)$ for $Z_n(t,x,v)$ into the null space of L.

2. The Linear Operator L

We first establish some estimates about the linearized operator L = −A − K. Our main results in this section include a lower bound for the norm $\|\cdot\|_{\sigma}$ (Corollary 1), coercivity for L (Lemma 5) and compact imbedding of K (Lemma 6).

Lemma 1. We have the following representations for A, K and Γ:
\[ Ag_2 = \mu^{-1/2}\partial_i\big\{ \mu^{1/2}\sigma^{ij}[\partial_j g_2 + v_j g_2] \big\} = \partial_i[\sigma^{ij}\partial_j g_2] - \sigma^{ij}v_i v_j g_2 + \partial_i\sigma^{i}\,g_2, \tag{14} \]
\[ Kg_1 = -\mu^{-1/2}\partial_i\big\{ \mu\big[ \phi^{ij}*\{\mu^{1/2}[\partial_j g_1 + v_j g_1]\} \big] \big\} = -\mu^{-1/2}(v)\,\partial_i\Big\{ \mu(v)\int_{\mathbb{R}^3} \phi^{ij}(v-v')\,\mu^{1/2}(v')\big[ \partial_j g_1(v') + v'_j g_1(v') \big]\,dv' \Big\}, \tag{15} \]
\[ \Gamma[g_1,g_2] = \partial_i\big[ \{\phi^{ij}*[\mu^{1/2}g_1]\}\partial_j g_2 \big] - \{\phi^{ij}*[v_i\mu^{1/2}g_1]\}\partial_j g_2 - \partial_i\big[ \{\phi^{ij}*[\mu^{1/2}\partial_j g_1]\}g_2 \big] + \{\phi^{ij}*[v_i\mu^{1/2}\partial_j g_1]\}g_2. \tag{16} \]
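Proof. By expanding $Q[\mu + \mu^{1/2}g_1, \mu + \mu^{1/2}g_2]$ around µ, we have
\[ Q[\mu + \mu^{1/2}g_1, \mu + \mu^{1/2}g_2] - Q[\mu,\mu] = Q[\mu^{1/2}g_1, \mu] + Q[\mu, \mu^{1/2}g_2] + Q[\mu^{1/2}g_1, \mu^{1/2}g_2] \equiv \mu^{1/2}\{Kg_1 + Ag_2 + \Gamma[g_1,g_2]\}. \]
Notice that $\partial_i[\mu^{1/2}g] = \mu^{1/2}[\partial_i g - v_i g]$, $\partial_i\mu = -2v_i\mu$, and for any fixed i or j,
\[ \sum_i \phi^{ij}(v-v')(v_i - v'_i) = \sum_j \phi^{ij}(v-v')(v_j - v'_j) = 0, \tag{17} \]
from (2). We have from (4)
\begin{align*}
Kg_1 = \mu^{-1/2}Q[\mu^{1/2}g_1, \mu]
&= -\mu^{-1/2}\partial_i\big\{ 2v_j\mu\,[\phi^{ij}*(\mu^{1/2}g_1)] \big\} - \mu^{-1/2}\partial_i\big\{ \mu\,[\phi^{ij}*(\mu^{1/2}\{\partial_j - v_j\}g_1)] \big\} \\
&= -\mu^{-1/2}\partial_i\big\{ \mu\,[\phi^{ij}*(2v_j\mu^{1/2}g_1)] \big\} - \mu^{-1/2}\partial_i\big\{ \mu\,[\phi^{ij}*(\mu^{1/2}\{\partial_j - v_j\}g_1)] \big\} \quad \text{by (17)} \\
&= -\mu^{-1/2}\partial_i\big\{ \mu\,[\phi^{ij}*(\mu^{1/2}[\partial_j g_1 + v_j g_1])] \big\}.
\end{align*}
We take the derivatives inside $Ag_2$,
\begin{align*}
Ag_2 = \mu^{-1/2}Q[\mu, \mu^{1/2}g_2]
&= \mu^{-1/2}\partial_i\big\{ \sigma^{ij}\mu^{1/2}[\partial_j g_2 - v_j g_2] \big\} + 2\mu^{-1/2}\partial_i\big\{ [\phi^{ij}*(v_j\mu)]\,\mu^{1/2}g_2 \big\} \\
&= \mu^{-1/2}\partial_i\big\{ \sigma^{ij}\mu^{1/2}[\partial_j g_2 - v_j g_2] \big\} + 2\mu^{-1/2}\partial_i\big\{ [\phi^{ij}*\mu]\,v_j\mu^{1/2}g_2 \big\} \quad \text{by (17)} \\
&= \mu^{-1/2}\partial_i\big\{ \sigma^{ij}\mu^{1/2}[\partial_j g_2 + v_j g_2] \big\} \\
&= \mu^{-1/2}\partial_i\big\{ \sigma^{ij}\mu^{1/2}\partial_j g_2 \big\} + \mu^{-1/2}\partial_i\big\{ \sigma^{ij}\mu^{1/2}v_j g_2 \big\} \\
&= \partial_i[\sigma^{ij}\partial_j g_2] - [\partial_i\mu^{-1/2}]\,\sigma^{ij}\mu^{1/2}\partial_j g_2 + \partial_i[\sigma^{ij}v_j g_2] - [\partial_i\mu^{-1/2}]\,\mu^{1/2}v_j\sigma^{ij}g_2 \\
&= \partial_i[\sigma^{ij}\partial_j g_2] + \partial_i[v_j\sigma^{ij}]\,g_2 - v_i v_j\sigma^{ij}g_2 \\
&= \partial_i[\sigma^{ij}\partial_j g_2] + \partial_i\sigma^{i}\,g_2 - v_i v_j\sigma^{ij}g_2,
\end{align*}
from (5), where $\partial_i\mu^{-1/2} = v_i\mu^{-1/2}$. Finally, we have
\begin{align*}
\Gamma[g_1,g_2] = \mu^{-1/2}Q[\mu^{1/2}g_1, \mu^{1/2}g_2]
&= \mu^{-1/2}\partial_i\big\{ [\phi^{ij}*(\mu^{1/2}g_1)]\,\mu^{1/2}[\partial_j g_2 - v_j g_2] \big\} - \mu^{-1/2}\partial_i\big\{ [\phi^{ij}*(\mu^{1/2}\{\partial_j g_1 - v_j g_1\})]\,\mu^{1/2}g_2 \big\} \\
&= \partial_i\big\{ [\phi^{ij}*(\mu^{1/2}g_1)][\partial_j g_2 - v_j g_2] \big\} - [\partial_i\mu^{-1/2}]\,\mu^{1/2}[\phi^{ij}*(\mu^{1/2}g_1)][\partial_j g_2 - v_j g_2] \\
&\quad - \partial_i\big\{ [\phi^{ij}*(\mu^{1/2}\{\partial_j g_1 - v_j g_1\})]\,g_2 \big\} + [\partial_i\mu^{-1/2}]\,\mu^{1/2}[\phi^{ij}*(\mu^{1/2}\{\partial_j g_1 - v_j g_1\})]\,g_2 \\
&= \partial_i\big\{ [\phi^{ij}*(\mu^{1/2}g_1)][\partial_j g_2 - v_j g_2] \big\} - v_i[\phi^{ij}*(\mu^{1/2}g_1)][\partial_j g_2 - v_j g_2] \\
&\quad - \partial_i\big\{ [\phi^{ij}*(\mu^{1/2}\{\partial_j g_1 - v_j g_1\})]\,g_2 \big\} + v_i[\phi^{ij}*(\mu^{1/2}\{\partial_j g_1 - v_j g_1\})]\,g_2 \\
&= \partial_i\big[ \{\phi^{ij}*[\mu^{1/2}g_1]\}\partial_j g_2 \big] - \{\phi^{ij}*[v_i\mu^{1/2}g_1]\}\partial_j g_2 - \partial_i\big[ \{\phi^{ij}*[\mu^{1/2}\partial_j g_1]\}g_2 \big] + \{\phi^{ij}*[v_i\mu^{1/2}\partial_j g_1]\}g_2,
\end{align*}
where we have used (17) repeatedly to move $v_i$ in and out of the $v'$-integrations.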
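The elementary identities for the Maxwellian used in this proof can be spot-checked numerically. The following Python sketch compares central finite differences with the closed forms $\partial_i\mu = -2v_i\mu$, $\partial_i[\mu^{1/2}g] = \mu^{1/2}[\partial_i g - v_i g]$ and $\partial_i\mu^{-1/2} = v_i\mu^{-1/2}$. It is illustrative only: the test function `g`, the step size `h` and the sample point are arbitrary assumptions, not objects from the paper.

```python
import math


def mu(v):
    """Normalized Maxwellian mu(v) = exp(-|v|^2)."""
    return math.exp(-sum(c * c for c in v))


def partial(f, v, i, h=1e-5):
    """Central finite-difference approximation of the i-th partial derivative of f at v."""
    vp, vm = list(v), list(v)
    vp[i] += h
    vm[i] -= h
    return (f(vp) - f(vm)) / (2.0 * h)


def g(v):
    """Arbitrary smooth test function (an assumption made for this example)."""
    return math.sin(v[0]) + v[1] * v[2] + v[0] * v[0]


def identity_residuals(v):
    """Largest deviation of each identity from its finite-difference counterpart."""
    r1 = r2 = r3 = 0.0
    for i in range(3):
        # d_i mu = -2 v_i mu
        r1 = max(r1, abs(partial(mu, v, i) + 2.0 * v[i] * mu(v)))
        # d_i [mu^{1/2} g] = mu^{1/2} [d_i g - v_i g]
        lhs = partial(lambda u: math.sqrt(mu(u)) * g(u), v, i)
        rhs = math.sqrt(mu(v)) * (partial(g, v, i) - v[i] * g(v))
        r2 = max(r2, abs(lhs - rhs))
        # d_i mu^{-1/2} = v_i mu^{-1/2}
        r3 = max(r3, abs(partial(lambda u: mu(u) ** -0.5, v, i) - v[i] * mu(v) ** -0.5))
    return r1, r2, r3


if __name__ == "__main__":
    print(identity_residuals([0.4, -0.3, 0.8]))  # all three residuals are tiny
```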
Lemma 2. Let θ > −3, $l(v) \in C^{\infty}(\mathbb{R}^3)$ and $k(v) \in C^{\infty}(\mathbb{R}^3\setminus\{0\})$. Assume for any β, there is $C_{\beta} > 0$ such that
\[ |\partial_{\beta}k(v)| \le C_{\beta}|v|^{\theta-|\beta|}, \qquad |\partial_{\beta}l(v)| \le C_{\beta}e^{-\tau_{\beta}|v|^2}, \]
with some $\tau_{\beta} > 0$. Then there is $C^*_{\beta} > 0$ such that
\[ |\partial_{\beta}[k*l](v)| \le C^*_{\beta}[1+|v|]^{\theta-|\beta|}. \]

Proof. It suffices to consider |v| ≥ 1. For any given β, if θ − |β| ≥ 0, since $|v-v'| \le |v| + |v'|$, we have
\[ |\partial_{\beta}[k*l](v)| = |[\partial_{\beta}k*l](v)| \le C\int |v-v'|^{\theta-|\beta|}\,|l(v')|\,dv' \le C\int \big[ |v|^{\theta-|\beta|} + |v'|^{\theta-|\beta|} \big]|l(v')|\,dv' \le C[1+|v|]^{\theta-|\beta|}. \]
Hence we only need to consider the case θ − |β| < 0. Notice that
\[ \partial_{\beta}[k*l](v) = \int k(v-v')\,\partial_{\beta}l(v')\,dv' = \int_{|v-v'|\le\frac{1}{2}[|v|+1]} + \int_{|v-v'|>\frac{1}{2}[|v|+1]}. \]
Since $|v'| \ge |v| - |v-v'| \ge |v|/2 - 1/2$ in the first part,
\[ |\partial_{\beta}l(v')| \le C_{\beta}e^{-\tau_{\beta}|v'|^2} \le Ce^{-\tau_{\beta}|v|^2/4}. \tag{18} \]
The first integral is bounded by
\[ Ce^{-\tau_{\beta}|v|^2/4}\int_{|v-v'|\le\frac{1}{2}[|v|+1]} |v-v'|^{\theta}\,dv' \le C[1+|v|]^{\theta+3}e^{-\tau_{\beta}|v|^2/4} \le C[1+|v|]^{\theta-|\beta|}. \]
On the other hand, in the second part $|v-v'| > \frac{1}{2}[|v|+1]$, so that $|\partial_{\beta}k(v-v')| \le C[1+|v|]^{\theta-|\beta|}$. Integrating by parts repeatedly, we majorize the second part by
\[ \int_{|v-v'|>\frac{1}{2}[|v|+1]} |\partial_{\beta}k(v-v')\,l(v')|\,dv' + |B[k,l]| \le C[1+|v|]^{\theta-|\beta|}\int |l(v')|\,dv' + |B[k,l]|. \]
Here the collection of various boundary terms is denoted by |B[k, l]|, which is evaluated at the surface $\{v' : |v-v'| = \frac{1}{2}[|v|+1]\}$. Since $|v'| \ge |v|/2 - 1/2$ on this surface, (18) is again valid. Hence, |B[k, l]| is bounded by
\[ C\int_{|v-v'|=\frac{1}{2}[|v|+1]} |\partial_{\beta_1}k(v-v')\,\partial_{\beta-\beta_1}l(v')|\,dS \le C_{\beta-\beta_1}e^{-\tau_{\beta-\beta_1}|v|^2/4}\int_{|v-v'|=\frac{1}{2}[|v|+1]} |\partial_{\beta_1}k(v-v')|\,dS \le C[1+|v|]^{\theta-|\beta|}. \]
Notice that $|\partial_{\beta}\phi^{ij}(v)| \le C|v|^{\gamma+2-|\beta|}$. From Lemma 2, as well as from [DeL], we deduce

Lemma 3. $\sigma^{ij}(v)$, $\sigma^{i}(v)$ are smooth functions such that
\[ |\partial_{\beta}\sigma^{ij}(v)| + |\partial_{\beta}\sigma^{i}(v)| \le C_{\beta}[1+|v|]^{\gamma+2-|\beta|}, \]
and
\[ \sigma^{ij}(v)g_i g_j = \lambda_1(v)\{P_v g_i\}^2 + \lambda_2(v)\{[I-P_v]g_i\}^2, \qquad \sigma^{ij}(v)v_i v_j g^2 = \{\phi^{ij}*[v_i v_j\mu]\}g^2 = \lambda_1(v)|v|^2 g^2. \tag{19} \]
The spectrum of $\sigma^{ij}(v)$ consists of a simple eigenvalue $\lambda_1(v) > 0$ associated with the vector v, and a double eigenvalue $\lambda_2(v) > 0$ associated with $v^{\perp}$. Moreover, there are constants $c_1$ and $c_2 > 0$ such that asymptotically, as |v| → ∞, we have
\[ \lambda_1(v) = c_1[1+|v|]^{\gamma}, \qquad \lambda_2(v) = c_2[1+|v|]^{\gamma+2} \equiv c_2 w. \]
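For γ = 0 the convolution defining $\sigma^{ij}$ can be computed in closed form, which gives a concrete instance of Lemma 3. With $\mu = e^{-|v|^2}$ and $\phi^{ij}(u) = |u|^2\delta_{ij} - u_i u_j$, the Gaussian moments $\int\mu = \pi^{3/2}$ and $\int v'_i v'_j\mu = \frac{1}{2}\pi^{3/2}\delta_{ij}$ give $\sigma^{ij}(v) = \pi^{3/2}[(1+|v|^2)\delta_{ij} - v_i v_j]$, so that $\lambda_1 = \pi^{3/2}$ and $\lambda_2 = \pi^{3/2}(1+|v|^2)$, in agreement with the stated asymptotics. The Python sketch below compares this formula with a direct midpoint-rule quadrature. It is an illustration restricted to the assumption γ = 0; the function names and the grid parameters are choices made here, not part of the paper.

```python
import math


def sigma_gamma0_quadrature(v, half_width=6.0, n=24):
    """Midpoint-rule approximation of sigma^{ij}(v) = (phi^{ij} * mu)(v) for gamma = 0."""
    h = 2.0 * half_width / n
    nodes = [-half_width + (k + 0.5) * h for k in range(n)]
    s = [[0.0] * 3 for _ in range(3)]
    for a in nodes:
        for b in nodes:
            for c in nodes:
                vp = (a, b, c)                         # integration variable v'
                u = [v[i] - vp[i] for i in range(3)]   # u = v - v'
                u2 = sum(x * x for x in u)
                m = math.exp(-(a * a + b * b + c * c))  # mu(v')
                for i in range(3):
                    for j in range(3):
                        # phi^{ij}(u) = |u|^2 delta_ij - u_i u_j when gamma = 0
                        s[i][j] += ((u2 if i == j else 0.0) - u[i] * u[j]) * m
    w = h ** 3
    return [[s[i][j] * w for j in range(3)] for i in range(3)]


def sigma_gamma0_closed_form(v):
    """Closed form pi^{3/2} [(1 + |v|^2) delta_ij - v_i v_j], valid for gamma = 0."""
    c = math.pi ** 1.5
    r2 = sum(x * x for x in v)
    return [[c * (((1.0 + r2) if i == j else 0.0) - v[i] * v[j])
             for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    v = [1.0, -0.5, 2.0]
    num = sigma_gamma0_quadrature(v)
    exact = sigma_gamma0_closed_form(v)
    print(max(abs(num[i][j] - exact[i][j]) for i in range(3) for j in range(3)))
```

In this special case v is an eigenvector with the bounded eigenvalue $\lambda_1$, while every vector orthogonal to v is an eigenvector with the larger eigenvalue $\lambda_2$, which is the anisotropy that the norm $|\cdot|_{\sigma}$ encodes.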
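For any vector-valued function $g(v) = [g_i]$, we define the projection to the vector $[v_i]$ as
\[ P_v g_i \equiv \frac{v_i}{|v|^2}\,g_j v_j, \qquad 1 \le i \le 3. \tag{20} \]

Corollary 1. There exists $c = c_{\beta} > 0$, such that
\[ |g|^2_{\sigma,\theta} \ge c\Big\{ \big| w^{\theta}[1+|v|]^{\frac{\gamma}{2}}\{P_v\partial_i g\} \big|_2^2 + \big| w^{\theta}[1+|v|]^{\frac{\gamma+2}{2}}\{[I-P_v]\partial_i g\} \big|_2^2 + \big| w^{\theta}[1+|v|]^{\frac{\gamma+2}{2}}g \big|_2^2 \Big\}. \tag{21} \]

Proof. We recall (9). By Lemma 3, we only need to consider the last term in (21), for v near the origin. Notice that for a smooth cut-off function χ(v) = 1 for |v| ≤ 1, χ(v) = 0 for |v| ≥ 2, Poincaré's inequality implies
\[ \Big\{ \int_{|v|\le 1} |g|^2\,dv \Big\}^{1/2} \le |\chi g|_2 \le C|\nabla[\chi g]|_2 \le C|\nabla\chi\, g|_2 + C|\chi\nabla g|_2 \le C\Big\{ \int_{1\le|v|\le 2} |g|^2 + \int_{|v|\le 2} |\nabla g|^2 \Big\}^{1/2} \le C|g|_{\sigma}. \]
Therefore, by separating the cases of |v| ≥ 1 and |v| ≤ 1, we have
\[ \big| w^{\theta}[1+|v|]^{\frac{\gamma+2}{2}}g \big|_2^2 \le C|g|^2_{\sigma,\theta}. \]

As in the Boltzmann theory, the following fact is well-known [H]:

Lemma 4. L ≥ 0, and $\langle Lg, g\rangle = 0$ if and only if $g(v) = \{a + b\cdot v + c|v|^2\}\mu^{1/2}$, where a, c ∈ R and b ∈ R³.

Proof. By (6), (14) and (15), we integrate by parts over v variables to compute $\langle Lg, g\rangle = -\langle [A+K]g, g\rangle$ as
\begin{align*}
&\iint \phi^{ij}(v-v')\mu(v)\mu(v')\big\{ \partial_i[g\mu^{-1/2}](v) - \partial_i[g\mu^{-1/2}](v') \big\}\,\partial_j[g\mu^{-1/2}](v)\,dv\,dv' \\
&= \iint \phi^{ij}(v-v')\mu(v)\mu(v')\big\{ \partial_i[g\mu^{-1/2}](v') - \partial_i[g\mu^{-1/2}](v) \big\}\,\partial_j[g\mu^{-1/2}](v')\,dv\,dv' \\
&= \frac{1}{2}\iint \phi^{ij}(v-v')\mu(v)\mu(v')\big\{ \partial_i[g\mu^{-1/2}](v) - \partial_i[g\mu^{-1/2}](v') \big\}\big\{ \partial_j[g\mu^{-1/2}](v) - \partial_j[g\mu^{-1/2}](v') \big\}\,dv\,dv' \ge 0.
\end{align*}
Hence $\langle Lg, g\rangle \ge 0$. If $\langle Lg, g\rangle = 0$, there is a scalar function $q(v,v')$, such that for i = 1, 2, 3,
\[ \partial_i[g\mu^{-1/2}](v) - \partial_i[g\mu^{-1/2}](v') \equiv q(v,v')(v_i - v'_i). \]
Without loss of generality, we may assume $v' = 0$, so that $\partial_i[g\mu^{-1/2}](v) \equiv q(v,0)v_i + c_i$. Therefore, replacing v by $v'$, we have
\[ \partial_i[g\mu^{-1/2}](v) - \partial_i[g\mu^{-1/2}](v') = q(v,0)v_i - q(v',0)v'_i = q(v,0)(v_i - v'_i) + \{q(v,0) - q(v',0)\}v'_i. \]
Since $\phi^{ij}(v-v')$ projects onto $(v-v')^{\perp}$, we deduce that $q(v,0) - q(v',0) \equiv 0$ and $q(v,0) \equiv$ constant. Therefore for some a, c ∈ R and b ∈ R³,
\[ g\mu^{-1/2}(v) \equiv a + b\cdot v + c|v|^2. \]

We denote the orthonormal basis for $\{1, v, |v|^2\}\mu^{1/2}$ as in [GL] by $\{e_1, e_2, e_3, e_4, e_5\}$ and we define a projection $P_0$ in $L^2(\mathbb{R}^3)$ for any fixed x as
\[ P_0 g(x,v) \equiv \sum_{j} \langle g(x,\cdot), e_j\rangle\, e_j. \tag{22} \]
We remark that $P_0$ is not the same as the $L^2$ projection to $\{e_1, e_2, e_3, e_4, e_5\}$ in $L^2(\mathbb{T}^3\times\mathbb{R}^3)$.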
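One explicit choice of such a basis is $e_1 = \pi^{-3/4}\mu^{1/2}$, $e_{1+i} = \sqrt{2}\,\pi^{-3/4}v_i\mu^{1/2}$ for i = 1, 2, 3, and $e_5 = \sqrt{2/3}\,\pi^{-3/4}(|v|^2 - 3/2)\mu^{1/2}$. This particular normalization is an assumption made here for illustration; the basis in [GL] may differ by ordering or sign. The following Python sketch checks numerically that these five functions are orthonormal in $L^2(\mathbb{R}^3_v)$, using a midpoint rule whose grid parameters are also illustrative choices.

```python
import math

C = math.pi ** -0.75  # common normalization factor pi^{-3/4}


def basis(v):
    """Values (e_1(v), ..., e_5(v)) of the candidate orthonormal basis at the point v."""
    r2 = sum(c * c for c in v)
    root_mu = math.exp(-0.5 * r2)  # mu^{1/2} with mu = exp(-|v|^2)
    return [C * root_mu,
            math.sqrt(2.0) * C * v[0] * root_mu,
            math.sqrt(2.0) * C * v[1] * root_mu,
            math.sqrt(2.0) * C * v[2] * root_mu,
            math.sqrt(2.0 / 3.0) * C * (r2 - 1.5) * root_mu]


def gram_matrix(half_width=6.0, n=24):
    """Midpoint-rule approximation of the 5x5 Gram matrix <e_i, e_j> in L^2(R^3)."""
    h = 2.0 * half_width / n
    nodes = [-half_width + (k + 0.5) * h for k in range(n)]
    gram = [[0.0] * 5 for _ in range(5)]
    for a in nodes:
        for b in nodes:
            for c in nodes:
                e = basis((a, b, c))
                for i in range(5):
                    for j in range(5):
                        gram[i][j] += e[i] * e[j]
    w = h ** 3
    return [[gram[i][j] * w for j in range(5)] for i in range(5)]


if __name__ == "__main__":
    G = gram_matrix()
    # deviation from the identity matrix; small if the basis is orthonormal
    print(max(abs(G[i][j] - (1.0 if i == j else 0.0)) for i in range(5) for j in range(5)))
```

With such a basis, the coefficients of $P_0 g$ in (22) at a fixed x are the five inner products $\langle g(x,\cdot), e_j\rangle$, computed in the velocity variable only.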
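Lemma 5. Let θ ∈ R. For any m > 1, there is 0 < C(m) < ∞, such that
\[ |\langle w^{2\theta}\partial_i\sigma^{i}g_1, g_2\rangle| + |\langle w^{2\theta}Kg_1, g_2\rangle| \le \frac{C}{m}|g_1|_{\sigma,\theta}|g_2|_{\sigma,\theta} + C(m)\Big\{ \int_{|v|\le C(m)} |w^{\theta}g_1|^2\,dv \Big\}^{1/2}\Big\{ \int_{|v|\le C(m)} |w^{\theta}g_2|^2\,dv \Big\}^{1/2}. \tag{23} \]
Moreover, there is δ > 0, such that
\[ \langle Lg, g\rangle \ge \delta|\{I - P_0\}g|^2_{\sigma}. \tag{24} \]

Proof. We first prove (23). We split
\[ -\int w^{2\theta}\partial_i\sigma^{i}g_1 g_2 = \int_{\{|v|\le m\}} + \int_{\{|v|\ge m\}}. \tag{25} \]
It suffices to consider the second integral over {|v| ≥ m} since $|\partial_i\sigma^{i}| \le C[1+|v|]^{\gamma+1}$ by Lemma 2. Hence from (21) and the Cauchy-Schwarz inequality,
\[ \int_{\{|v|\ge m\}} |w^{2\theta}\partial_i\sigma^{i}g_1 g_2|\,dv \le \frac{C}{m}\int w^{2\theta}[1+|v|]^{\gamma+2}|g_1 g_2| \le \frac{C}{m}|g_1|_{\sigma,\theta}|g_2|_{\sigma,\theta}. \tag{26} \]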
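Recalling the linear operator K in (15), we have
\begin{align*}
w^{2\theta}Kg_1 = {}& -\partial_i\big\{ w^{2\theta}\mu^{1/2}\big[ \phi^{ij}*\{\mu^{1/2}\partial_j g_1 + v_j\mu^{1/2}g_1\} \big] \big\} \\
& + 2\theta\,w^{2\theta-1}\partial_i w\,\mu^{1/2}\big[ \phi^{ij}*\{\mu^{1/2}\partial_j g_1 + v_j\mu^{1/2}g_1\} \big] \\
& + w^{2\theta}v_i\mu^{1/2}\big[ \phi^{ij}*\{\mu^{1/2}\partial_j g_1 + v_j\mu^{1/2}g_1\} \big]. \tag{27}
\end{align*}
Upon integrating by parts for the first term, and collecting terms, we can denote
\[ \langle w^{2\theta}Kg_1, g_2\rangle = \sum_{|\beta_1|,|\beta_2|\le 1} \iint w^{2\theta}(v)\,\phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v')\,\mu_{\beta_1\beta_2}(v,v')\,\partial_{\beta_1}g_1(v')\,\partial_{\beta_2}g_2(v), \]
where $\mu_{\beta_1\beta_2}(v,v')$ is a collection of smooth functions satisfying
\[ |\nabla\mu_{\beta_1\beta_2}(v,v')| + |\mu_{\beta_1\beta_2}(v,v')| \le C\mu\Big(\frac{v}{4}\Big)\mu\Big(\frac{v'}{4}\Big). \]
Notice that $\phi^{ij}(v) = O(|v|^{\gamma+2}) \in L^2_{loc}(\mathbb{R}^3)$, for γ ≥ −3. Fubini's Theorem implies
\[ \phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v') \in L^2(\mathbb{R}^3\times\mathbb{R}^3). \]
Therefore, for any given m > 0, we can choose a $C_c^{\infty}$ function $\psi^{ij}(v,v')$ such that
\[ \|\phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v') - \psi^{ij}(v,v')\|_{L^2(\mathbb{R}^3\times\mathbb{R}^3)} \le \frac{1}{m}, \quad\text{and}\quad \operatorname{supp}\{\psi^{ij}\} \subset \{|v'| + |v| \le C(m)\},\ C(m) < \infty. \]
We split
\[ \phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v') = \psi^{ij} + \big[ \phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v') - \psi^{ij} \big] \]
so that accordingly, $\langle w^{2\theta}Kg_1, g_2\rangle = J_1[g_1,g_2] + J_2[g_1,g_2]$. Here
\begin{align*}
J_1 &= \iint w^{2\theta}(v)\,\psi^{ij}(v,v')\,\mu_{\beta_1\beta_2}(v,v')\,\partial_{\beta_1}g_1(v')\,\partial_{\beta_2}g_2(v), \\
J_2 &= \iint w^{2\theta}(v)\big[ \phi^{ij}(v-v')\,\mu^{1/4}(v)\mu^{1/4}(v') - \psi^{ij} \big]\mu_{\beta_1\beta_2}(v,v')\,\partial_{\beta_1}g_1(v')\,\partial_{\beta_2}g_2(v). \tag{28}
\end{align*}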
The second term J2 is bounded by ||φ ij v − v µ1/4 (v)µ1/4 v − ψ ij v, v ||L2 (R3 ×R3 ) × ||w2θ µβ1 β2 v, v ∂β1 g1 v ∂β2 g2 (v)||L2 (R3 ×R3 )
v C v ≤ µ ∂β1 g1 w θ µ ∂β2 g2 m 4 4 2 C ≤ |g1 |σ,θ |g2 |σ,θ . m
2
Now for the first term J1 , integrations by parts over v and v variables yields β1 +β2 ∂β2 w 2θ (v)∂β1 ψ ij v, v µβ1 β2 v, v g1 v g2 (v) J1 = (−1) ij
≤ C||ψ ||C 2
1/2 |w g1 | dv . θ
|v|≤C(m)
θ
2
|v|≤C(m)
1/2
|w g2 | dv 2
.
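We thus conclude (23). To prove (24), we use the method of contradiction. Assuming the contrary, we have a sequence of normalized functions $g_n(v)$ such that $|g_n|_{\sigma} \equiv 1$, which also satisfy
\[ \int_{\mathbb{R}^3} g_n\mu^{1/2}\,dv = \int_{\mathbb{R}^3} v_j g_n\mu^{1/2}\,dv = \int_{\mathbb{R}^3} |v|^2 g_n\mu^{1/2}\,dv = 0, \tag{29} \]
\[ \langle Lg_n, g_n\rangle = -\langle Ag_n, g_n\rangle - \langle Kg_n, g_n\rangle \le 1/n. \tag{30} \]
We denote the weak limit, with respect to the inner product $\langle\cdot,\cdot\rangle$, of $g_n$ (up to a subsequence) by $g_0$. Hence $|g_0|_{\sigma} \le 1$. Notice that from (14), (15), (9) and Lemma 1, we have
\[ \langle Lg_n, g_n\rangle = |g_n|^2_{\sigma} - \langle\partial_i\sigma^{i}g_n, g_n\rangle - \langle Kg_n, g_n\rangle. \]
We claim that
\[ \lim_{n\to\infty}\langle\partial_i\sigma^{i}g_n, g_n\rangle = \langle\partial_i\sigma^{i}g_0, g_0\rangle, \qquad \lim_{n\to\infty}\langle Kg_n, g_n\rangle = \langle Kg_0, g_0\rangle. \]
In fact, for any given m > 0,
\[ -\int_{\{|v|\le m\}} \partial_i\sigma^{i}g_n^2 \to -\int_{\{|v|\le m\}} \partial_i\sigma^{i}g_0^2, \]
since $\partial_i g_n$ are bounded in $L^2\{|v| \le m\}$ from $|g_n|_{\sigma} = 1$ and (21). On the other hand, by (26) with θ = 0, $g_1 = g_2 = g_n$, the integral over {|v| ≥ m} is bounded by O(1/m). We thus conclude that $\langle\partial_i\sigma^{i}g_n, g_n\rangle \to \langle\partial_i\sigma^{i}g_0, g_0\rangle$ by first choosing m sufficiently large, and then letting n → ∞.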
We split Kgn , gn into J1 and J2 in (28) with g1 = g2 = gn , and θ = 0. It follows that J2 (gn , gn ) ≤ 1/m, and up to a subsequence, J1 (gn , gn ) → J1 (g0 , g0 ) , since ∂i gn are bounded in L2 {|v| ≤ C(m)}. By first choosing m sufficiently large and then letting n → ∞, we conclude that Kgn , gn → Kg0 , g0 . Letting n → ∞ in (30), we have Lg0, g0 = 0 or equivalently 0 = 1 − |g|2σ + Lg0 , g0 . 2 Since both terms are non-negative, 1/2 we have 1 = |g0 |σ , and Lg0 , g0 = 0. By Lemma 2 4, g0 = a + b · v + c|v| µ . On the other hand, letting n → ∞ in (29), we deduce that g0 also satisfies (29). We thus have a = b = c = 0, this contradicts |g0 |2σ = 1.
Lemma 6. Let |β| > 0. For small η > 0, there exists Cη = Cη (θ ) > 0 such that 2 2 −w2θ ∂β [Ag], ∂β g ≥ ∂β g σ,θ − η ∂β¯ g − Cη |µg|2 , |w2θ ∂β [Kg1 ] , ∂β g2 | ≤
¯ |β|≤|β|
η ∂β¯ g1
¯ |β|≤|β|
σ,θ
σ,θ
+ Cη |µg1 |2 ∂β g2 σ,θ .
We remark that the Maxellian µ in the above estimates is a convenient choice of an upper bound for the characteristic functions of a ball. It can be replaced by any function of v with sufficiently fast decay at v = ∞. Proof. Recall Lemma 1, we have β w2θ ∂β [Ag], ∂β g = −|∂β g|2σ,θ − Cβ 1 w 2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β ∂i g +∂i w 2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β g +w2θ ∂β1 σ ij vi vj ∂β−β1 g, ∂β g β
+Cβ 2 w 2θ ∂β2 ∂i σ i ∂β−β2 g, ∂β g, where summations are over β ≥ β1 > 0 and β ≥ β2 ≥ 0. Notice that σ ij vi vj =
φ ij v − v vi vj µ v dv .
By Lemma 3, we know |w2θ ∂β2 ∂i σ i | + |w2θ ∂β1 {σ ij vi vj }| + |∂j w 2θ ∂β1 σ ij | ≤ C[1 + |v|]γ +1 w 2θ . Hence the last three terms in (31) are bounded by
(31)
C
w2θ [1 + |v|]γ +1 |∂β−β1 g| + |∂j ∂β−β1 g| + |∂β−β2 g| |∂β g| =C +C |v|≤m |v|≥m γ +2 C 2θ ≤C + w [1 + |v|] 2 |∂β−β1 g| + |∂j ∂β−β1 g| + |∂β−β2 g| 2 m |v|≤m γ +2 × w2θ [1 + |v|] 2 ∂β g 2 C ≤C + ∂β¯ g ∂β g σ,θ , σ,θ m |v|≤m ¯ |β|≤|β|
where we have used (21) and the fact β1 > 0. For the part |v| ≤ m, for any η > 0, we use the compact interpolation in the Sobolev space to get 2 ≤η |∂β¯ g| + Cη |g|2 |v|≤m
¯ |β|=|β|+1
|v|≤m
|v|≤m
2 ≤η ∂β¯ g + Cη |µg|22 ,
(32)
σ,θ
¯ |β|=|β|
by (21). We now consider the second term in (31). If |β1 | ≥ 2, by Lemma 3 |w2θ ∂β1 σ ij | ≤ C[1 + |v|]γ w 2θ . Therefore, |w2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β ∂i g| ≤ +C [1 + |v|]γ w 2θ |∂β−β1 ∂j g∂β g| |v|≤m
|v|≥m
≤
|v|≤m
+C
× ≤
|v|≥m
|v|≤m
+
|v|≥m
γ
1/2
[1 + |v|] w |∂β−β1 ∂j g|
[1 + |v|]γ w 2θ |∂β g|2
2θ
2
1/2
C ∂β−β1 ∂j g σ,θ ∂β g σ,θ , m
(33)
by (21). But for any given m > 0, since |β − β1 | + 1 < |β|, we can use the compact interpolation to obtain 2 ≤η ∂β¯ g + Cη |µg|22 . |v|≤m
¯ |β|≤|β|
σ,θ
If |β1 | = 1, an integration by part of the vβ1 variable yields w2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β ∂i g = w2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β1 [∂β−β1 ∂i g] 1 = − ∂β1 w 2θ ∂β1 σ ij ∂β−β1 ∂j g, ∂β−β1 ∂i g . 2
Since |β1 | = 1, ∂β1 w 2θ ∂β1 σ ij ≤ C[1 + |v|]γ w 2θ , we use the same splitting (33) to get 2 C +η w2θ ∂β1 σ ij ∂β−β1 ∂i g, ∂β−β1 ∂j g ≤ ∂β¯ g + Cη |µg|22 , σ,θ m ¯ |β|≤|β|
where we have used (32) for {|v| ≤ m} since |β − β1 | + 1 = |β|. We now consider w2θ ∂β [Kg1 ], ∂β g2 . Recalling (27), we have w2θ ∂β Kg1 = −∂i w 2θ ∂β µ1/2 φ ij ∗ µ1/2 ∂j g1 + vj µ1/2 g1 +2θ w 2θ−1 ∂i w∂β µ1/2 φ ij ∗ µ1/2 ∂j g1 + vj µ1/2 g1 +w2θ ∂β vi µ1/2 φ ij ∗ µ1/2 ∂j g1 + vj µ1/2 g1 . 1/2 We take derivatives only on the factor µ ∂j g1 + vj µ1/2 g1 in the convolutions above. Upon integrating by parts for the first term, and collecting terms, we can express w2θ ∂β [Kg1 ] , ∂β g2 as w 2θ (v)φ ij v − v µ1/4 (v)µ1/4 v µ¯ β1 β2 v, v ∂β1 g1 v ∂β2 g2 (v), |β1 |≤|β|+1 |β|≤|β2 |≤|β|+1
where µ¯ β1 β2 v, v is a collection of smooth functions satisfying v v k v µ( ), µ ¯ , v ≤ C µ ∇v,v β1 β2 k 4 4 for any k th order derivatives. We split w 2θ ∂β [Kg1 ], ∂β g2 as in (28) to get w2θ ψ ij µ¯ β1 β2 v, v ∂β1 g1 v ∂β2 g2 (v) + w2θ φ ij v − v µ1/4 (v)µ1/4 v − ψ ij × µ¯ β1 β2 v, v ∂β1 g1 v ∂β2 g2 (v). As the same estimates of J2 in (28), by (21), last term is bounded by C ∂β¯ g1 ∂β g2 σ,θ . σ,θ m ¯ |β|≤|β|
Since ψ ij has compact support, integrating by parts over v repeatedly, we estimate the first term as |β1 | 2θ ij w ψ v, v µ ¯ v, v g v ∂ (−1) (v)∂ g (v) β1 β1 β2 1 β2 2 |β1 |≤|β|+1 |β|≤|β2 |≤|β|+1 C ≤ ∂β¯ g1 + C(m) |µg1 |2 ∂β g2 σ,θ . m σ,θ ¯ |β|≤|β|
In summary, we conclude our lemma by first choosing m large.
3. The Nonlinear Operator Γ We now establish estimates for the nonlinear term g1, g2 . Theorem 3. Let |α| + |β| ≤ N, θ ≥ 0, then w2θ ∂βα [g1 , g2 ] , ∂βα g3 α α−α ≤C ∂β¯ 1 g1 ∂β−β11 g2 2,θ
σ,θ
+ ∂βα¯ 1 g1
σ,θ
α−α1 ∂β−β1 g2
2,θ
α ∂β g3
σ,θ
,
(34)
where summation is over |α1 | + |β1 | ≤ N, β¯ ≤ β1 ≤ β. Moreover,
∂βα11 g1 θ ||∂βα11 g2 ||σ,θ w2θ ∂βα [g1 , g2 ] , ∂βα g3 ≤ C + ||∂βα11 g1 ||σ,θ ||∂βα11 g2 ||θ ||∂βα g3 ||σ,θ , (35)
where summations are over |α1 | + |β1 | ≤ N, β1 ≤ β. Proof. Recall [g1 , g2 ] in (16). By the product rule, we expand β Cαα1 Cβ 1 × Gα1 β1 , w2θ ∂βα g1, g2 , ∂βα g2 = where Gα1 β1 takes the form: α−α1 − w 2θ φ ij ∗ ∂β1 µ1/2 ∂ α1 g1 ∂j ∂β−β g , ∂i ∂βα g3 1 2 α−α1 α − w2θ φ ij ∗ ∂β1 vi µ1/2 ∂ α1 g1 ∂j ∂β−β g , ∂ g 2 3 β 1 α−α1 2θ ij 1/2 α1 + w φ ∗ ∂β1 µ ∂j ∂ g1 ∂β−β1 g2 , ∂i ∂βα g3 α−α1 α + w2θ φ ij ∗ ∂β1 vi µ1/2 ∂j ∂ α1 g1 ∂β−β g , ∂ g 2 3 β 1 α−α1 2θ ij 1/2 α1 − ∂i w φ ∗ ∂β1 µ ∂ g1 ∂j ∂β−β1 g2 , ∂βα g3 α−α1 α + ∂i w 2θ φ ij ∗ ∂β1 µ1/2 ∂j ∂ α1 g1 ∂β−β g , ∂ g . 2 3 β 1 The last two terms appear when we integrate by parts over the vi variable. We first establish (34). For the last two terms (40) and (41), since γ ≥ −3,
φ ij (v) = O |v|γ +2 ∈ L2loc R3 . From ∂β1 −β¯ µ1/2 ≤ Cµ1/4 ,
(36) (37) (38) (39) (40) (41)
the Cauchy-Schwartz inequality implies α φ ij ∗ ∂β1 µ1/2 ∂ α1 g1 ≤ C φ ij ∗ ∂β1 −β¯ µ1/2 ∂β¯ 1 g1 ¯ 1 β≤β
≤ |φ ij |2 ∗ µ1/4
1/2
(v)
¯ 1 β≤β
≤ C[1 + |v|]
1/2 2 µ1/4 ∂βα¯ 1 g1
α w θ ∂β¯ 1 g1 .
γ +2
(42)
2
¯ 1 β≤β
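Since $|\partial_i w^{2\theta}| \le C[1+|v|]^{-1}w^{2\theta}$, we estimate (40) from (42) by
$$\begin{aligned}
&C\sum_{\bar\beta\le\beta_1}|w^{\theta}\partial^{\alpha_1}_{\bar\beta}g_1|_2\int w^{2\theta}[1+|v|]^{\gamma+1}\,|\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2\,\partial^{\alpha}_{\beta}g_3|\,dv\\
&\le C\sum_{\bar\beta\le\beta_1}|w^{\theta}\partial^{\alpha_1}_{\bar\beta}g_1|_2\,\big|w^{\theta}[1+|v|]^{\gamma/2}\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2\big|_2\,\big|w^{\theta}[1+|v|]^{(\gamma+2)/2}\partial^{\alpha}_{\beta}g_3\big|_2\\
&\le C\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{2,\theta}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|_{\sigma,\theta}\,|\partial^{\alpha}_{\beta}g_3|_{\sigma,\theta},
\end{aligned}$$
by (21). To estimate (41), we separate two cases. If $-1 \le \gamma+2 \le 0$, since $\phi^{ij}(v) \in L^2_{loc}(\mathbf{R}^3)$, we deduce as in (42)
$$\begin{aligned}
|\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial_j\partial^{\alpha_1}g_1]| &\le C\sum_{\bar\beta\le\beta_1}\big[|\phi^{ij}|^2*\mu^{1/4}\big]^{1/2}(v)\Big[\int\mu^{1/4}(v')|\partial_j\partial^{\alpha_1}_{\bar\beta}g_1(v')|^2dv'\Big]^{1/2}\\
&\le C[1+|v|]^{\gamma+2}\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta},
\end{aligned}$$
by (21). Hence (41) is bounded by ($\gamma+1 \le (\gamma+2)/2$ in this case):
$$\begin{aligned}
&C\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}\int w^{2\theta}[1+|v|]^{\gamma+1}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2\,\partial^{\alpha}_{\beta}g_3|\,dv\\
&\le C\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}\,|w^{\theta}\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|_2\,\big|w^{\theta}[1+|v|]^{(\gamma+2)/2}\partial^{\alpha}_{\beta}g_3\big|_2\\
&\le C\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|_{2,\theta}\,|\partial^{\alpha}_{\beta}g_3|_{\sigma,\theta}.
\end{aligned}$$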
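For the case $\gamma+2 \ge 0$, $\phi^{ij}(v) \in L^2_{loc}$, and $\partial_j\phi^{ij}(v) = O(|v|^{\gamma+1}) \in L^2_{loc}$. We use an integration by parts inside the convolution to split
$$\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial_j\partial^{\alpha_1}g_1] = \partial_j\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial^{\alpha_1}g_1] - \phi^{ij}*\partial_{\beta_1}[\partial_j\mu^{1/2}\,\partial^{\alpha_1}g_1]. \tag{43}$$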
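The integration by parts inside the convolution that leads to (43) has an exact discrete analogue, which can be checked numerically. The following is a minimal sketch in a one-dimensional periodic toy model, not the setting of the paper: the helpers `circ_conv` and `diff` and the sample arrays `phi` and `h` are hypothetical stand-ins for the convolution, the derivative $\partial_j$, the kernel $\phi^{ij}$ and the product $\mu^{1/2}\partial^{\alpha_1}g_1$. It verifies that moving the difference operator from one factor of a convolution to the other changes nothing; the second term on the right of (43) then comes from the product rule.

```python
import math

def circ_conv(a, b):
    """Circular convolution (a * b)[k] = sum_m a[k - m] b[m] on a periodic grid."""
    n = len(a)
    return [sum(a[(k - m) % n] * b[m] for m in range(n)) for k in range(n)]

def diff(a):
    """Periodic central difference with unit grid spacing (toy version of d/dv_j)."""
    n = len(a)
    return [(a[(k + 1) % n] - a[(k - 1) % n]) / 2 for k in range(n)]

n = 32
# Hypothetical sample data: a bump-shaped kernel and a smooth periodic function.
phi = [math.exp(-0.1 * min(k, n - k) ** 2) for k in range(n)]
h = [math.sin(2 * math.pi * k / n) + 0.3 * math.cos(6 * math.pi * k / n) for k in range(n)]

lhs = circ_conv(phi, diff(h))   # phi * (D h)
rhs = circ_conv(diff(phi), h)   # (D phi) * h
max_residual = max(abs(x - y) for x, y in zip(lhs, rhs))
print(max_residual)
```

The identity is exact on the periodic grid because the difference operator is itself a convolution, so the residual is at the level of floating-point rounding.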
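Applying (42) to the decomposition (43) yields
$$|\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial_j\partial^{\alpha_1}g_1]| \le C[1+|v|]^{\gamma+2}\sum_{\bar\beta\le\beta_1}|w^{\theta}\partial^{\alpha_1}_{\bar\beta}g_1|_2.$$
Therefore, from (21), we can majorize (41) by
$$C\sum_{\bar\beta\le\beta_1}|w^{\theta}\partial^{\alpha_1}_{\bar\beta}g_1|_2\int w^{2\theta}[1+|v|]^{\gamma+2}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2\,\partial^{\alpha}_{\beta}g_3|\,dv \le C\sum_{\bar\beta\le\beta_1}|\partial^{\alpha_1}_{\bar\beta}g_1|_{2,\theta}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|_{\sigma,\theta}\,|\partial^{\alpha}_{\beta}g_3|_{\sigma,\theta}.$$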
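We now estimate (36) to (39). We decompose their double integration region $(v, v') \in \mathbf{R}^3\times\mathbf{R}^3$ into three parts:
$$\{|v| \le 1\},\qquad \{2|v'| \ge |v|,\ |v| \ge 1\}\qquad\text{and}\qquad \{2|v'| \le |v|,\ |v| \ge 1\}.$$
For the first part $\{|v| \le 1\}$, recall $\phi^{ij}(v) = O(|v|^{\gamma+2}) \in L^2_{loc}$. By (42), we have
$$\begin{aligned}
|\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial^{\alpha_1}g_1]| + |\phi^{ij}*\partial_{\beta_1}[v_i\mu^{1/2}\partial^{\alpha_1}g_1]| &\le C[1+|v|]^{\gamma+2}\sum_{\bar\beta\le\beta}|\partial^{\alpha_1}_{\bar\beta}g_1|_{2,\theta};\\
|\phi^{ij}*\partial_{\beta_1}[\mu^{1/2}\partial_j\partial^{\alpha_1}g_1]| + |\phi^{ij}*\partial_{\beta_1}[v_i\mu^{1/2}\partial_j\partial^{\alpha_1}g_1]| &\le C[1+|v|]^{\gamma+2}\sum_{\bar\beta\le\beta}|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}.
\end{aligned}\tag{44}$$
Hence their corresponding integrands over the region $\{|v| \le 1\}$ are bounded by
$$\begin{aligned}
&C\,|\partial^{\alpha_1}_{\bar\beta}g_1|_{2,\theta}\,w^{2\theta}[1+|v|]^{\gamma+2}\,|\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\}\\
&+C\,|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}\,w^{2\theta}[1+|v|]^{\gamma+2}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\},
\end{aligned}$$
whose $v$-integral over $\{|v| \le 1\}$ is clearly bounded by the right-hand side of (34). We thus conclude the first part, $\{|v| \le 1\}$, for (36) to (39).

For the second part $\{2|v'| \ge |v|,\ |v| \ge 1\}$, we have
$$|\partial_{\beta_1}\{\mu^{1/2}(v')\}| + |\partial_{\beta_1}\{v'_j\mu^{1/2}(v')\}| \le C\mu\Big(\frac{v'}{4}\Big)\mu\Big(\frac{v}{4}\Big).$$
By the same type of estimates as in (42), the $v$-integrands in (36) to (39) are bounded by:
$$\begin{aligned}
&\int|\phi^{ij}(v-v')|\,\mu\Big(\frac{v'}{4}\Big)\mu\Big(\frac{v}{4}\Big)|\partial^{\alpha_1}_{\bar\beta}g_1(v')|\,dv'\;w^{2\theta}\,|\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\}\\
&+\int|\phi^{ij}(v-v')|\,\mu\Big(\frac{v'}{4}\Big)\mu\Big(\frac{v}{4}\Big)|\partial_j\partial^{\alpha_1}_{\bar\beta}g_1(v')|\,dv'\;w^{2\theta}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\}\\
&\le C\,|\partial^{\alpha_1}_{\bar\beta}g_1|_{2,\theta}\,[1+|v|]^{\gamma+2}\mu\Big(\frac{v}{4}\Big)w^{2\theta}\,|\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\}\\
&+C\,|\partial^{\alpha_1}_{\bar\beta}g_1|_{\sigma,\theta}\,[1+|v|]^{\gamma+2}\mu\Big(\frac{v}{4}\Big)w^{2\theta}\,|\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2|\,\big\{|\partial_i\partial^{\alpha}_{\beta}g_3| + |\partial^{\alpha}_{\beta}g_3|\big\}.
\end{aligned}$$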
By (21), its $v$-integral is bounded by the right-hand side of (34) because of the fast decaying factor $\mu(v/4)$. We thus conclude the second part, $\{2|v'| \ge |v|,\ |v| \ge 1\}$, for (36) to (39).

We finally consider the third part, $\{2|v'| \le |v|,\ |v| \ge 1\}$, for which we shall estimate each term from (36) to (39). The key is to expand $\phi^{ij}(v-v')$. To estimate (36) over the region $|v| \ge 1$ and $2|v'| \le |v|$, we expand $\phi^{ij}(v-v')$ to get
$$\phi^{ij}(v-v') = \phi^{ij}(v) - \sum_k\partial_k\phi^{ij}(v)\,v'_k + \frac12\sum_{k,l}\partial_{kl}\phi^{ij}(\bar v)\,v'_kv'_l, \tag{45}$$
where $\bar v$ is between $v$ and $v-v'$. We plug (45) into the integrand of (36). Notice that for either fixed $i$ or $j$,
$$\sum_i\phi^{ij}(v)v_i = \sum_j\phi^{ij}(v)v_j = 0. \tag{46}$$
α−α1 From (20), (19) and (46), we can decompose ∂j ∂β−β g and ∂i ∂βα g3 into their Pv parts 1 2 as well as I − Pv parts. For the first term in the expansion (45) α−α1 ij α φ (v)∂j ∂β−β1 g2 (v)∂i ∂β g3 (v) ij
α−α1 α = φ ij (v) [I − Pv ] ∂j ∂β−β g (v) − P ∂ g (v) ∂ [I ] v i β 3 1 2 ij
α−α1 α ≤ C [1 + |v|]γ +2 [I − Pv ] ∂j ∂β−β ∂ g (v) × − P ∂ g (v) ] [I . 2 v i 3 β 1
(47)
Here we have used (46) so that the sum of terms with either $P_v\partial_j\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2$ or $P_v\partial_i\partial^{\alpha}_{\beta}g_3$ vanishes. For the second term in the expansion (45), by taking the $\partial_k$ derivative of
$$\sum_{i,j}\phi^{ij}(v)v_iv_j = 0,$$
we have
$$\sum_{i,j}\partial_k\phi^{ij}(v)v_iv_j = -2\sum_j\phi^{kj}(v)v_j = 0.$$
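The two algebraic facts used here, the null direction (46) and the vanishing of $\sum_{i,j}\partial_k\phi^{ij}(v)v_iv_j$, can be checked numerically. The sketch below assumes the standard Landau kernel $\phi^{ij}(v) = \{\delta_{ij} - v_iv_j/|v|^2\}|v|^{\gamma+2}$ (this formula is taken as an assumption here, since it is not restated in this section); the function names, the sample vector `v` and the value of `gamma` are hypothetical choices for illustration.

```python
def phi(v, gamma):
    """Assumed Landau kernel: (delta_ij - v_i v_j / |v|^2) |v|^(gamma + 2)."""
    r2 = sum(x * x for x in v)
    scale = r2 ** ((gamma + 2) / 2)
    return [[((1.0 if i == j else 0.0) - v[i] * v[j] / r2) * scale
             for j in range(3)] for i in range(3)]

def contract_once(v, gamma):
    """Components of sum_j phi^{ij}(v) v_j, which (46) says are zero."""
    p = phi(v, gamma)
    return [sum(p[i][j] * v[j] for j in range(3)) for i in range(3)]

def quad_form_of_derivative(v, gamma, k, h=1e-5):
    """sum_{i,j} d_k phi^{ij}(v) v_i v_j, with d_k taken by central differences
    of phi only (the factors v_i v_j are held fixed)."""
    vp, vm = list(v), list(v)
    vp[k] += h
    vm[k] -= h
    pp, pm = phi(vp, gamma), phi(vm, gamma)
    return sum((pp[i][j] - pm[i][j]) / (2 * h) * v[i] * v[j]
               for i in range(3) for j in range(3))

v = [0.7, -1.2, 0.4]   # arbitrary nonzero sample velocity
gamma = -1.5           # any exponent with gamma >= -3
res46 = max(abs(c) for c in contract_once(v, gamma))
res_deriv = max(abs(quad_form_of_derivative(v, gamma, k)) for k in range(3))
print(res46, res_deriv)
```

The first residual is zero up to rounding; the second is zero up to the finite-difference error, which is of order $h^2$.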
α−α1 ∂j ∂β−β g 1 2
Therefore, expanding and ∂i ∂βα g3 into their Pv and I − Pv parts yields α−α1 ∂k φ ij (v)∂j ∂β−β g (v)∂i ∂βα g3 (v) 1 2 ij
=
ij
α−α1 α−α1 ∂k φ ij (v) [I −Pv ]∂j ∂β−β g [I −Pv ]∂i ∂βα g3 + [I −Pv ]∂j ∂β−β g [Pv ∂i ∂βα g3 ] 1 2 1 2
α−α1 α + Pv ∂j ∂β−β − P ∂ g ∂ g [I ] v i β 3 , 1 2 where
ij
α−α1 ∂k φ ij (v)Pv ∂j ∂β−β g × Pv ∂i ∂βα g3 = 0. 1 2
Notice that ∂k φ ij (v) ≤ C [1 + |v|]γ +1 , for |v| ≥ 1, we majorize the above by α−α1 α C [1 + |v|]γ /2 Pv ∂j ∂β−β + g ∂ ∂ g P v i β 3 1 2 α−α1 α × [1 + |v|]γ +2/2 [I − Pv ] ∂j ∂β−β + − P g ∂ g ∂ ] [I 2 v i 3 β 1 α−α1 +C [1 + |v|]γ +1 [I − Pv ] ∂j ∂β−β g [I − Pv ] ∂i ∂βα g3 . 1 2
(48)
The third term in (45) now can be estimated as follows. Since 1 3 ¯ ≤ v + |v| ≤ |v|, |v| ≤ |v| − v ≤ |v| 2 2
(49)
thus |∂kl φ ij (v)| ¯ ≤ C [1 + |v|]γ , and we have α−α1 ij α ≤ C [1 + |v|]γ ∂j ∂ α−α1 g2 (v)∂i ∂ α g3 (v) . ∂ φ ( v)∂ ¯ ∂ g (v)∂ ∂ g (v) kl j 2 i 3 β β β−β β−β 1 1 k,l
(50) Combining (45), (47), (48) and (50), we have α−α1 ij α φ (v − v )∂ ∂ g ∂ ∂ g i β−β1 2 i β 3 ij
α−α1 ij α ≤ C 1 + |v | φ (v)∂j ∂β−β1 g2 (v)∂i ∂β g3 (v)
2
ij
α−α1 α + ∂k φ ij (v)∂j ∂β−β g (v)∂ ∂ g (v) i β 3 1 2 ij
α−α1 ij α + ∂kl φ (v)∂ ¯ j ∂β−β1 g2 (v)∂i ∂β g3 (v) ij
2
≤ C 1 + |v |
α−α1 g ∂ ∂ α−α1 g σ ij ∂i ∂β−β 1 2 j β−β1 2
1/2
σ ij ∂i ∂βα g3 ∂j ∂βα g3
1/2
,
where we have used (21). The v− integrand over 2|v | ≤ |v|, |v| ≥ 1 in (36) is thus bounded by w2θ 1 + |v |2 µ1/4 v ∂βα¯ 1 g1 v dv 1/2 1/2 α−α1 α−α1 ij α α × σ ij ∂i ∂β−β g ∂ ∂ g ∂ ∂ g ∂ ∂ g σ 2 j 2 i 3 j 3 β β β−β1 1 1/2 1/2 α−α1 α−α1 2θ ij α α g ∂ ∂ g σ ∂ ∂ g ∂ ∂ g . w 2θ σ ij ∂i ∂β−β w ≤ C ∂βα¯ g1 2 j 2 i 3 j 3 β β β−β1 1 2,θ
Its further integration over v is bounded by the right-hand side of (34).
We now consider the second term (37). We again expand φ ij v − v as φ ij v − v = φ ij (v) − ∂k φ ij (v)v ¯ k , k with v¯ between v and v − v . Since j φ ij (v)vj = 0, we obtain as before α−α1 φ ij (v)∂j ∂β−β g (v)∂βα g3 (v) 1 2 ij
=
ij
(51)
α−α1 φ ij (v) {I − Pv } ∂j ∂β−β g (v) × ∂βα g3 (v) 1 2
α−α1 γ +2/2 α ≤ C [1 + |v|]γ +2/2 {I − Pv } ∂j ∂β−β g (v) + |v|] ∂ g (v) [1 . 2 3 β 1 φ ij (v)| ¯
(52)
C [1 + |v|]γ +1 .
Notice that from (49), |∂k ≤ Hence α−α1 ¯ j ∂β−β g (v)∂βα g3 (v) ∂k φ ij (v)∂ 1 2 α−α1 γ +2/2 α ≤ [1 + |v|]γ /2 ∂j ∂β−β g (v) + |v|] g (v) [1 ∂ . 2 3 β 1
(53)
By (21), we conclude that the integrand in (37) can be majorized as 2θ ij α−α w φ v − v ∂β1 vi µ1/2 v ∂ α1 g1 v ∂j ∂β−β11 g2 (v)∂βα g3 (v) dv = w2θ φ ij (v) − ∂k φ ij (v) ¯ vk ∂β1 vi µ1/2 (v )∂ α1 g1 v
α−α1 ×∂j ∂β−β g (v)∂βα g3 (v)dv 1 2
≤C
1 + |v | µ1/4 v ∂βα¯ 1 g1 v dv
1/2 1/2 α−α1 α−α1 2θ ij α α × w2θ σ ij ∂i ∂β−β w g ∂ ∂ g σ ∂ ∂ g ∂ ∂ g , 2 j 2 i 3 j 3 β β β−β 1 1 where the summation is over 1 ≤ i, j ≤ 3 and its further integration over v is bounded by the right-hand side of (34). We now consider the third term (38) over {2|v | ≤ |v|, |v| ≥ 1}. We again use (43) to split (38) into two parts. Recall expansion (51), and decompose ∂i ∂βα g3 = Pv ∂i ∂βα g3 + [I − Pv ] ∂i ∂βα g3 . By similar estimates in (52) and (53), the second part of (38) can be estimated as α−α 2θ ij w φ v − v ∂β1 ∂j µ1/2 (v )∂ α1 g1 v ∂β−β11 g2 (v)∂i ∂βα g3 (v) |v|≥1,2|v |≤|v| 2θ ij φ (v) − ∂k φ ij (v)v = ¯ k w |v|≥1,2|v |≤|v| α−α1 α ×∂β1 ∂j µ1/2 v ∂ α1 g1 v ∂β−β g ∂ ∂ g . 2 i 3 β 1 α−α1 θ γ +2/2 α {I } ≤ C ∂βα¯ 1 g1 w θ [1 + |v|]γ +2/2 ∂β−β g ∂ g + |v|] − P ∂ [1 w v i β 3 1 2 2 2,θ 2 α1 θ γ +2/2 α−α1 θ γ /2 α + C ∂β¯ g1 w [1 + |v|] ∂β−β1 g2 w [1 + |v|] ∂i ∂β g3 . 2,θ
2
By (21), it is bounded by the right hand side of (34).
2
For the first part of (38) by (43), notice that |∂j φ ij v − v | ≤ C [1 + |v|]γ +1 , we thus have
α−α1 w 2θ ∂j φ ij v − v ∂β1 µ1/2 v ∂ α1 g1 v ∂β−β g (v)∂i ∂βα g3 (v) 1 2 |v|≥1,2|v |≤|v| α α−α ≤C ∂β¯ 1 g1 w θ [1 + |v|]γ +2/2 ∂β−β11 g2 w θ [1 + |v|]γ /2 ∂j ∂βα g3 , 2,θ
¯ 1 β≤β
2
2
which is bounded by the right-hand side of (34) by (21). We consider (39) over {2|v | ≤ |v|, |v| ≥ 1}. As in (43), we split (39) by φ ij ∗ ∂β1 vi µ1/2 ∂j ∂ α1 g1 = ∂j φ ij ∗ ∂β1 vi µ1/2 ∂ α1 g1 −φ ij ∗ ∂β1 ∂j vi µ1/2 ∂ α1 g1 .
(54)
Since φ ij v − v ≤ C [1 + |v|]γ +2 , and ∂j φ ij v − v ≤ C [1 + |v|]γ +1 , (39) is bounded by α−α1 α g ∂ g w2θ [1 + |v|]γ +1 ∂β1 vi µ1/2 v ∂ α1 g1 v ∂β−β dv dv 1 2 β 3 α−α1 α + w2θ [1 + |v|]γ +2 ∂β1 ∂j vi µ1/2 v ∂ α1 g1 (v ) ∂β−β g ∂ g dv dv 2 3 β 1 α α−α ≤C ∂β¯ 1 g1 ∂β−β11 g2 ∂βα g3 . σ,θ
2,θ
¯ 1 β≤β
σ,θ
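We thus conclude the proof of (34). The proof of (35) follows from the Sobolev imbedding $W^{4,1}(\mathbf{T}^3) \subset L^{\infty}(\mathbf{T}^3)$. Notice that
$$\begin{aligned}
&\sup_x\Big\{\int_{\mathbf{R}^3}\sigma^{ij}\partial_ig\,\partial_jg\,dv + \int_{\mathbf{R}^3}|g(x,v)|^2dv\Big\}\\
&\le C\sum_{|\alpha|\le4}\int_{\mathbf{T}^3\times\mathbf{R}^3}\Big\{\big|\partial^{\alpha}[\sigma^{ij}\partial_ig\,\partial_jg]\big| + \big|\partial^{\alpha}|g(x,v)|^2\big|\Big\}dxdv\\
&\le C\sum_{\alpha_1\le\alpha,\,|\alpha|\le4}\Big\{\int_{\mathbf{T}^3\times\mathbf{R}^3}\big|\sigma^{ij}\partial_i\partial^{\alpha_1}g\,\partial_j\partial^{\alpha-\alpha_1}g\big|\,dxdv + \int_{\mathbf{T}^3\times\mathbf{R}^3}\big|\partial^{\alpha_1}g\,\partial^{\alpha-\alpha_1}g\big|\,dxdv\Big\}\\
&\le C\sum_{\alpha_1\le\alpha,\,|\alpha|\le4}\Big\{\|\partial^{\alpha_1}g\|_{\sigma}\|\partial^{\alpha-\alpha_1}g\|_{\sigma} + \|\partial^{\alpha_1}g\|\,\|\partial^{\alpha-\alpha_1}g\|\Big\}\\
&\le C\sum_{|\alpha|\le4}\Big\{\|\partial^{\alpha}g\|^2_{\sigma} + \|\partial^{\alpha}g\|^2\Big\}.
\end{aligned}\tag{55}$$
We further integrate (34) over $\mathbf{T}^3$. Since $\bar\beta \le \beta$, we take the $L^{\infty}$ norm in $x$ of whichever of the terms involving $\partial^{\alpha_1}_{\bar\beta}g_1$ or $\partial^{\alpha-\alpha_1}_{\beta-\beta_1}g_2$ has the fewest total derivatives ($\le N/2$). Since $|\alpha| + |\beta| \le N = 8$ and $\bar\beta \le \beta_1$, we conclude our theorem.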
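Lemma 7. (1) Let $\chi(x,v)$ be a smooth function with compact support. Let $|\alpha| \le N$. Then there exist $g^{ij}(x,v)$, $g^{i}(x,v)$ and $g(x,v)$ such that
$$\partial^{\alpha}\Gamma[g_1,g_2]\,\chi = \partial_{ij}g^{ij} + \partial_ig^{i} + g,$$
where
$$\|g^{ij}\| + \|g^{i}\| + \|g\| \le C\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}g_1\|\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}g_2\|_{\sigma}. \tag{56}$$
Moreover, there exist $g_{\alpha_1}(x,v')$, $\alpha_1 \le \alpha$, such that
$$\int_{\mathbf{T}^3\times\mathbf{R}^3}\partial^{\alpha}\Gamma[g_1,g_2]\,\chi\,dxdv = \sum C^{\alpha_1}_{\alpha}\int_{\mathbf{T}^3\times\mathbf{R}^3}\partial^{\alpha_1}g_1(x,v')\,g_{\alpha_1}(x,v')\,dxdv',$$
where
$$\|g_{\alpha_1}\| \le C\sum_{|\alpha_2|\le N}\|\partial^{\alpha_2}g_2\|_{\sigma}. \tag{57}$$
(2) Let $\chi(v)$ be a smooth function so that
$$|\chi| + |\nabla\chi| + |\nabla^2\chi| \le C\mu\Big(\frac{v}{4}\Big),$$
then
$$\Big\|\int\partial^{\alpha}\Gamma[g_1,g_2]\,\chi\,dv\Big\|_2 \le C\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}g_1\|\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}g_2\|_{\sigma}. \tag{58}$$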
Proof. Now recall from (16), ∂ α g1, g2 χ takes the form Cαα1 ∂i φ ij ∗ µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 χ − Cαα1 φ ij ∗ vi µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 χ − Cαα1 ∂i φ ij ∗ µ1/2 ∂j ∂ α1 g1 ∂ α−α1 g2 χ +Cαα1 φ ij ∗ vi µ1/2 ∂j ∂ α1 g1 ∂ α−α1 g2 χ ≡ Cαα1 [I1 + I2 + I3 + I4 ] .
(59)
We first prove (56). Rewrite I1 as I1 = ∂i φ ij ∗ µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 χ − φ ij ∗ µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 ∂i χ . (60) Since χ has compact support, |χ | ≤ Cµ( v8 ), hence by (42), φ ij ∗ µ1/2 ∂ α g1 ∂j ∂ α−α1 g2 χ , the second term in (60), as well as I2 are all bounded by v C ∂ α1 g1 2 µ( )|∂j ∂ α−α1 g2 |. 4
(61)
Similarly, we split I3 by (43) as − ∂i φ ij ∗ µ1/2 ∂j ∂ α1 g1 ∂ α−α1 g2 χ + φ ij ∗ µ1/2 ∂j ∂ α1 g1 ∂ α−α1 g2 ∂i χ = − ∂ij φ ij ∗ µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ + ∂i φ ij ∗ ∂j µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ + ∂i φ ij ∗ µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 χ + ∂j φ ij ∗ µ1/2 ∂ α1 g1 ∂ α−α1 g2 ∂i χ − φ ij ∗ ∂j µ1/2 ∂ α1 g1 ∂ α−α1 g2 ∂i χ − φ ij ∗ µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 ∂i χ . (62) ij 1/2 α α−α 1g χ, We notice that from (42), φ ∗ µ ∂ 1 g1 ∂ 2 φ ij ∗ ∂j µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ , φ ij ∗ µ1/2 ∂ α1 g1 ∂ α−α1 g2 ∂i χ , and the remaining terms above are bounded by
v ∂ α−α1 g2 (v) + ∂j ∂ α−α1 g2 (v) . (63) C|∂ α1 g1 |2 µ 4 Finally, using (54), we split I4 as ∂j φ ij ∗ vi µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ − φ ij ∗ ∂j vi µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ = ∂j φ ij ∗ vi µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ − φ ij ∗ vi µ1/2 ∂ α1 g1 ∂j ∂ α−α1 g2 χ − φ ij ∗ vi µ1/2 ∂ α1 g1 ∂ α−α1 g2 ∂j χ − φ ij ∗ ∂j vi µ1/2 ∂ α1 g1 ∂ α−α1 g2 χ .
(64)
Notice that φ ij ∗ vi µ1/2 ∂ α g1 ∂ α−α1 g2 χ as well as the last three terms above are also bounded by (63). By (21), further v−integration of the square of (63) is bounded by C|∂ α1 g1 |22 |∂ α−α1 g2 |2σ . We integrate further over T3 . Applying (55) to either |∂ α1 g1 |22 or |∂ α−α1 g2 |2σ whichever has the least number of total derivatives, we deduce (56). To show (57), we expand ∂ α g1, g2 χ and define gα1 such that ∂ α g1, g2 χ dvdv dx = ∂ α1 g1 (x, v )gα1 (x, v )dv dx. T3 ×R3
α1 ≤α
T3 ×R3
Collecting terms and separating the v−integration, applying decompositions (60), (62) and (64), we have gα1 (x, v ) = −µ1/2 v φ ij v − v ∂j ∂ α−α1 g2 (v)∂i χ (v)dv R3 1/2 − 2vi µ v φ ij v − v ∂j ∂ α−α1 g2 (v)χ (v)dv 3 R 1/2 ij −µ v φ v − v ∂j ∂ α−α1 g2 (v)∂i χ (v) dv R3 − ∂j vi µ1/2 v φ ij v − v ∂ α−α1 g2 (v)χ (v)dv. R3
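Since $\phi^{ij} \in L^2_{loc}$, it follows from the Cauchy-Schwarz inequality that
$$|g_{\alpha_1}(x,v')|^2 \le C\,|\partial^{\alpha-\alpha_1}g_2|^2_{\sigma}\,\mu^{1/4}(v'),$$
so that $\|g_{\alpha_1}\|^2 \le C\|\partial^{\alpha-\alpha_1}g_2\|^2_{\sigma}$.

We now prove (58). We use (60) to (64) to estimate $\int\partial^{\alpha}\Gamma[g_1,g_2]\chi\,dv$ term by term. Since $\chi$ decays exponentially,
$$\Big|\int I_1dv\Big| = \Big|-\int\{\phi^{ij}*[\mu^{1/2}\partial^{\alpha_1}g_1]\}\,\partial_j\partial^{\alpha-\alpha_1}g_2\,\partial_i\chi\,dv\Big| \le C|\partial^{\alpha_1}g_1|_2\int[1+|v|]^{\gamma+2}|\partial_j\partial^{\alpha-\alpha_1}g_2\,\partial_i\chi|\,dv \le C|\partial^{\alpha_1}g_1|_2\,|\partial^{\alpha-\alpha_1}g_2|_{\sigma}, \tag{65}$$
where we have used (42) to estimate $\phi^{ij}*[\mu^{1/2}\partial^{\alpha_1}g_1]$. By (42) again, $|\int I_2dv|$ has the same upper bound. $\int I_3dv$ takes the form
$$\int\{\phi^{ij}*[\partial_j\mu^{1/2}\,\partial^{\alpha_1}g_1]\}\,\partial^{\alpha-\alpha_1}g_2\,\partial_i\chi\,dv + \int\{\phi^{ij}*[\mu^{1/2}\partial^{\alpha_1}g_1]\}\,\partial_j\partial^{\alpha-\alpha_1}g_2\,\partial_i\chi\,dv \le C|\partial^{\alpha_1}g_1|_2\,|\partial^{\alpha-\alpha_1}g_2|_{\sigma}. \tag{66}$$
Similarly, from (64), by the same integration by parts as in both (65) and (66), $\int I_4dv$ has the same upper bound. We take the $L^2$ norm in $x$ of $|\partial^{\alpha_1}g_1|_2\,|\partial^{\alpha-\alpha_1}g_2|_{\sigma}$, and take the $L^{\infty}$ norm in $x$ of whichever of $\partial^{\alpha_1}g_1$ or $\partial^{\alpha-\alpha_1}g_2$ has order of derivatives less than $N/2$. By applying (55), we deduce (58).

4. Energy Estimates

We now construct local-in-time solutions to the Landau equation (1). Recall (4). The construction is based on a uniform energy estimate for the following sequence of iterating approximate solutions:
$$\{\partial_t + v\cdot\nabla_x\}F^{n+1} = Q(F^{n}, F^{n+1}),\qquad F^{n+1}(0,x,v) = F_0(x,v), \tag{67}$$
starting with $F^{0}(t,x,v) \equiv F_0(x,v) \ge 0$. Since $F^{n+1} = \mu + \mu^{1/2}f^{n+1}$, equivalently, we need to solve $f^{n+1}$ such that
$$[\partial_t + v\cdot\nabla_x - A]f^{n+1} - Kf^{n} = \Gamma[f^{n}, f^{n+1}],\qquad f^{n+1}(0,x,v) = f_0(x,v). \tag{68}$$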
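Our goal is to get a uniform-in-$n$ estimate for $E(f^{n+1}(t))$ on a small time interval. The crucial energy estimate is as follows:

Lemma 8. $\{F^{n}(t,x,v) \ge 0\}$ is well-defined. There exist $T^{*}(E(f_0)) > 0$ and $C > 0$ such that for $0 \le t \le T^{*}$ and $E(f_0)$ sufficiently small,
$$\sup_kE(f^{k}(t)) \le CE(f_0). \tag{69}$$

Proof. We use an induction over $k$. Clearly $k = 0$ is true, and we assume that (69) is valid for $k = n$ so that $F^{n}(t,x,v) \ge 0$. We notice that the Landau collision operator in (67) has the non-divergence form
$$Q(F^{n}, F^{n+1}) = \{\phi^{ij}*F^{n}\}\,\partial_{ij}F^{n+1} - \{\partial_{ij}\phi^{ij}*F^{n}\}\,F^{n+1},$$
where
$$-\partial_{ij}\phi^{ij}(v) = 2(\gamma+3)|v|^{\gamma},\quad \gamma > -3;\qquad -\{\partial_{ij}\phi^{ij}*F^{n}\}(v) = 8\pi F^{n}(v),\quad \gamma = -3.$$
Since $\phi^{ij}*F^{n} \ge 0$, it is rather standard (for instance, by adding a regularization $\varepsilon\Delta^{2m}_{x,v}F^{n+1}$ for some large integer $m$, then letting $\varepsilon \to 0$, if necessary) that there exists a solution $F^{n+1}$ to the linear equation (67). Moreover, by the maximum principle, $F^{n+1}(t,x,v) \ge 0$. We now prove (69).

Step 1. Mixed derivatives: Let $F^{n+1} = \mu + \mu^{1/2}f^{n+1}$. Taking $\partial^{\alpha}_{\beta}$ ($\beta \ne 0$) of (68), we obtain:
$$[\partial_t + v\cdot\nabla_x]\partial^{\alpha}_{\beta}f^{n+1} - \partial_{\beta}\{A\,\partial^{\alpha}f^{n+1}\} - \partial_{\beta}\{K\,\partial^{\alpha}f^{n}\} + \sum C^{\beta_1}_{\beta}\,\partial_{\beta_1}v_j\,\partial^{j}\partial^{\alpha}_{\beta-\beta_1}f^{n+1} = \partial^{\alpha}_{\beta}\Gamma[f^{n}, f^{n+1}]. \tag{70}$$
We first consider the case of $\gamma+2 \le 0$. By multiplying (70) by $w^{2|\beta|}\partial^{\alpha}_{\beta}f^{n+1}$ and then integrating over $\mathbf{T}^3\times\mathbf{R}^3$, we get
$$\begin{aligned}
&\frac12\frac{d}{dt}\|w^{|\beta|}\partial^{\alpha}_{\beta}f^{n+1}\|^2 - \big(w^{2|\beta|}\partial_{\beta}\{A\,\partial^{\alpha}f^{n+1}\},\ \partial^{\alpha}_{\beta}f^{n+1}\big) - \big(w^{2|\beta|}\partial_{\beta}\{K\,\partial^{\alpha}f^{n}\},\ \partial^{\alpha}_{\beta}f^{n+1}\big)\\
&= \big(w^{2|\beta|}\partial^{\alpha}_{\beta}\Gamma[f^{n}, f^{n+1}],\ \partial^{\alpha}_{\beta}f^{n+1}\big) - \Big(w^{2|\beta|}\sum C^{\beta_1}_{\beta}\,\partial_{\beta_1}v_j\,\partial^{j}\partial^{\alpha}_{\beta-\beta_1}f^{n+1},\ \partial^{\alpha}_{\beta}f^{n+1}\Big).
\end{aligned}\tag{71}$$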
We now estimate (71) term by term. Applying Lemma 6 with θ = 2|β|, and then integrating over T3 , we deduce, for any η > 0, (w ≤ 1) −(w2|β| ∂β A∂ α f n+1 , ∂βα f n+1 ) ≥ ∂βα f n+1 2σ,|β| − η ∂βα1 f n+1 2σ,|β1 | − Cη ∂ α f n+1 2σ , |β1 |≤|β|
−(w ∂β K∂ f , ∂βα f n+1 ) ≥− η ∂βα1 f n σ,|β1 | + Cη µ∂ α f n ∂βα f n+1 σ,|β| |β1 |≤|β| ≥ −η ∂βα1 f n 2σ,|β1 | − η∂βα f n+1 2σ,|β| − Cη µ∂ α f n 2 . 2|β|
α
n
|β1 |≤|β|
Here we have used w ≤ 1 and w|β| ≤ w|β1 | for |β1 | ≤ |β|, as well as µ∂ α f n+1 2 ≤ C∂ α f n+1 σ . Our weighted norm (9) is exactly designed to estimate the free streaming term as follows: α n+1 α n+1 f , ∂ f w 2|β| ∂β1 vj ∂ j ∂β−β β 1 α n+1 α n+1 ≤ w 2|β| ∂ j ∂β−β f ∂ f β 1
≤ w1/2+|β| ∂βα f n+1 ≤ η ∂βα f n+1
2 σ,|β|
α w 1/2+{|β|−1} ∂ j ∂β−β f n+1 1
α + Cη ∂ j ∂β−β f n+1 1
2 σ,|β−β1 |
,
(|β − β1 | = |β| − 1),
where |β1 | = 1, and we have used (21). For the nonlinear term, applying (35) in Theorem 3 with θ = 2|β|, we have w2|β| ∂βα f,n f n+1 , ∂βα f n+1
≤C ∂βα11 f n |β1 | ∂βα11 f n+1 σ,|β1 |
+
∂βα11 f n σ,|β1 |
∂βα11 f n+1 |β1 |
∂βα f n+1 σ,|β| ,
where β1 ≤ β and w |β| ≤ w |β1 | . Integrating (71) over [0, t] and collecting the above estimates, we thereby deduce for β = 0,
t 1 α n+1 (t)2|β| + ∂βα f n+1 2σ,|β| ∂β f 2 0 t 1 ≤ ∂βα f n+1 (0)2|β| + η ∂βα1 f n+1 2σ,|β1 | 2 0 |β |≤|β| 1 t t α n 2 +η ∂β1 f σ,|β1 | + Cη µ∂ α f n 2 + Cη
0 |β |≤|β| 1 t
+C 0
+C
0
∂βα22 f n+1 2σ,|β2 | 0 |β |<|β|, |α |+|β |≤N 2 2 2 t α1 n ∂β1 f |β1 | ∂βα11 f n+1 σ,|β1 | ∂βα f n+1 σ,|β|
t 0
∂βα11 f n σ,|β1 |
∂βα11 f n+1 |β1 | ∂βα f n+1 σ,|β| .
Notice that since |β2 | < |β|, we can estimate the mixed x and v derivatives of f n+1 in terms of its pure x derivatives by an induction starting with |β| = 1, 2, 3, ..... This induction gives t 1 α n+1 2 ∂ f (t)|β| + ∂βα f n+1 2σ,|β| 2 β 0 α1 n+1 2 ≤C ∂β1 f (0)|β1 | + Cη |α1 |+|β1 |≤N t
+η
+C
+C
0 |α |+|β |≤N 1 1 t 0
0 |β |≤|β|, |α |+|β |≤N 1 1 1 t
∂βα1 f n 2σ,|β1 | + Cη
0 |β |≤|β|,|α |+|β |≤N 1 1 1 t + Cη ∂ α1 f n+1 2σ 0 |α |≤N 1
t
t
|α1 |+|β1 |≤N
∂βα11 f n+1 2σ,|β1 |
µ∂ α1 f n 2
0 |α |≤N 1
∂βα11 f n |β1 | ∂βα11 f n+1 σ,|β1 | ∂βα f n+1 σ,|β| |α1 |+|β1 |≤N ∂βα11 f n σ,|β1 | ∂βα11 f n+1 |β1 | ∂βα f n+1 σ,|β| . |α1 |+|β1 |≤N
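(72)

Notice that from (10), for $0 \le s \le t$,
$$\|\partial^{\alpha_1}_{\beta_1}g(s)\|^2_{|\beta_1|} \le C\sup_{0\le s\le t}E(g(s)),\qquad \int_0^t\|\partial^{\alpha_1}_{\beta_1}g(s)\|^2_{\sigma,|\beta_1|}\,ds \le C\sup_{0\le s\le t}E(g(s)). \tag{73}$$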
By summing over |α| + |β| ≤ N and choosing η sufficiently small, we get 1 α n+1 1 t α n+1 2 ∂β f ∂β f (t)2|β| + σ,|β| 2 2 0 t ∂ α f n+1 2σ ≤C ∂βα f0 2|β| + C +η
t 0
0
∂βα f n 2σ,|β| + Cη
+C sup E(f n+1 (s)) sup E 0≤s≤t
0 1/2
t
µ∂ α f n 2
(f n (s)).
(74)
0≤s≤t
For the case of γ + 2 ≥ 0, we repeat the above estimates without the weight w2|β| (multiplying (70) with ∂βα f n+1 ). Notice that by (21)
α n+1 α n+1 f , ∂ f ∂β1 vj ∂ j ∂β−β β 1 α α ≤ ∂βα f n+1 · ∂ j ∂β−β f n+1 ≤ η∂βα f n+1 2σ + Cη ∂ j ∂β−β f n+1 2σ , 1 1
for β > 0. By using the same induction from |β| = 1, 2, 3 . . . , we obtain directly: 1 α n+1 1 t α n+1 2 ∂β f (t)2 + σ ∂β f 2 2 0 t t ∂ α f n+1 2σ + η ∂βα f n 2σ ≤C ∂βα f n+1 (0)2 + C +Cη
t
0
0
µ∂ α f n 2 + C sup E(f n+1 (s)) sup E 1/2 (f n (s)), 0≤s≤t
0
(75)
0≤s≤t
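where the summation is over $|\alpha| + |\beta| \le N$.

Step 2. $x$-derivatives: We now consider pure $x$-derivatives. Taking $\partial^{\alpha}$ of (68), we have
$$[\partial_t + v\cdot\nabla_x]\partial^{\alpha}f^{n+1} - A\{\partial^{\alpha}f^{n+1}\} - K\{\partial^{\alpha}f^{n}\} = \partial^{\alpha}\Gamma[f^{n}, f^{n+1}]. \tag{76}$$
Therefore, applying Theorem 3 with $\theta = 0$ yields
$$\begin{aligned}
&\frac12\frac{d}{dt}\|\partial^{\alpha}f^{n+1}\|^2 - \big(A\{\partial^{\alpha}f^{n+1}\},\ \partial^{\alpha}f^{n+1}\big) - \big(K\{\partial^{\alpha}f^{n}\},\ \partial^{\alpha}f^{n+1}\big)\\
&= \big(\partial^{\alpha}\Gamma[f^{n}, f^{n+1}],\ \partial^{\alpha}f^{n+1}\big)\\
&\le C\Big\{\sum\|\partial^{\alpha_1}f^{n}\|\sum\|\partial^{\alpha_1}f^{n+1}\|_{\sigma} + \sum\|\partial^{\alpha_1}f^{n}\|_{\sigma}\sum\|\partial^{\alpha_1}f^{n+1}\|\Big\}\|\partial^{\alpha}f^{n+1}\|_{\sigma},
\end{aligned}$$
where the summation is over $|\alpha_1| \le |\alpha|$. Hence integrating the above over $[0,t]$ yields
$$\begin{aligned}
&\frac12\|\partial^{\alpha}f^{n+1}(t)\|^2 - \int_0^t\big(A\{\partial^{\alpha}f^{n+1}\},\ \partial^{\alpha}f^{n+1}\big)ds - \int_0^t\big(K\{\partial^{\alpha}f^{n}\},\ \partial^{\alpha}f^{n+1}\big)ds\\
&\le \frac12\|\partial^{\alpha}f^{n+1}(0)\|^2 + C\int_0^t\Big\{\sum\|\partial^{\alpha_1}f^{n}\|\sum\|\partial^{\alpha_1}f^{n+1}\|_{\sigma} + \sum\|\partial^{\alpha_1}f^{n}\|_{\sigma}\sum\|\partial^{\alpha_1}f^{n+1}\|\Big\}\|\partial^{\alpha}f^{n+1}\|_{\sigma}\,ds.
\end{aligned}\tag{77}$$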
We notice that from (23) and Lemma 6, for any η > 0 small, t t A ∂ α f n+1 , ∂ α f n+1 − K ∂ α f n , ∂ α f n+1 − 0 0 t 3 t α n+1 ≥ ∂ f (s)2σ ds − C ∂ α f n+1 (s)2 ds, 4 0 0 t t α n 2 −η ∂ f (s)σ ds − Cη µ∂ α f n (s)2 ds. 0
0
Multiplying large constants κ > 0 to (77), so that 43 κ 1, then adding it to (75) or (74) t to absorb C 0 ∂ α f n+1 2σ on their right-hand sides, we obtain t t ∂ α f n+1 (s)2 ds + Cη ∂ α f n (s)2σ ds E(f n+1 (t)) ≤ CE(f0 ) + C 0 0 t t α n 2 + Cη ∂β f (s)σ ds + Cη µ∂ α f n (s)2 ds 0
0
+ C sup E(f n+1 (s)) sup E 1/2 (f n (s)) 0≤s≤t
0≤s≤t
!
≤ C0 E(f0 ) + Cη t
sup E(f
n+1
" n
(s)) + sup E(f (s))
0≤s≤t
0≤s≤t
n
+ Cη sup E(f (s)) + C sup E(f 0≤s≤t
n+1
0≤s≤t
(s)) sup E 1/2 (f n (s)).
(78)
0≤s≤t
Notice that $\|\mu\partial^{\alpha}f^{n}\| \le C\|\partial^{\alpha}f^{n}\|$ above. From (73), we may assume $\sup_{0\le s\le t}E(f^{n}(s)) \le 2C_0E(f_0)$, so that
$$\sup_{0\le t\le T^{*}(M)}E(f^{n+1}(t)) \le \frac{C_0 + 2C\eta + 2C_{\eta}T^{*}C_0}{1 - C_{\eta}T^{*} - CE^{1/2}(f_0)}\,E(f_0).$$
By choosing $\eta$ small, then choosing $T^{*}$ small, we have
$$\sup_{0\le t\le T^{*}(M)}E(f^{n+1}(t)) \le 2C_0E(f_0).$$
We therefore have concluded Lemma 8 if both $T^{*}$ and $E(f_0)$ are sufficiently small.

Theorem 4. For any sufficiently small $M > 0$, there exist $T^{*}(M) > 0$ and $M_1 > 0$, such that if
$$E(f_0) = \sum_{|\beta|+|\alpha|\le N}\|\partial^{\alpha}_{\beta}f_0\|^2_2 \le M_1,$$
then there is a unique classical solution $f(t,x,v)$ to (7) in $[0,T^{*}(M))\times\mathbf{T}^3\times\mathbf{R}^3$ such that
$$\sup_{0\le t\le T^{*}}E(f(t)) \le M.$$
$E(f(t))$ is continuous over $[0,T^{*}(M))$. If $F_0(x,v) = \mu + \mu^{1/2}f_0 \ge 0$, then $F(t,x,v) = \mu + \mu^{1/2}f(t,x,v) \ge 0$. Moreover, the following energy estimates hold for $\beta \ne 0$:
$$\begin{aligned}
\|\partial^{\alpha}_{\beta}f(t)\|^2 + \int_0^t\|\partial^{\alpha}_{\beta}f(s)\|^2_{\sigma}\,ds &\le C\Big\{\|\partial^{\alpha}_{\beta}f_0\|^2 + \int_0^t\|\partial^{\alpha}f(s)\|^2_{\sigma}\,ds + \sup_{0\le s\le t}E^{3/2}(f(s))\Big\},\quad \gamma+2 \ge 0;\\
\|\partial^{\alpha}_{\beta}f(t)\|^2_{|\beta|} + \int_0^t\|\partial^{\alpha}_{\beta}f(s)\|^2_{\sigma,|\beta|}\,ds &\le C\Big\{\|\partial^{\alpha}_{\beta}f_0\|^2_{|\beta|} + \int_0^t\|\partial^{\alpha}f(s)\|^2_{\sigma}\,ds + \sup_{0\le s\le t}E^{3/2}(f(s))\Big\},\quad \gamma+2 < 0,
\end{aligned}\tag{79}$$
where we have used the summation convention over $|\alpha| + |\beta| \le N$.

Proof. By taking $n \to \infty$, we obtain a classical solution $f$ from Lemma 8, and $F(t,x,v) = \mu + \mu^{1/2}f \ge 0$ if $F_0(x,v) \ge 0$. To prove the uniqueness, we assume that there is another solution $g$ such that $\sup_{0\le s\le T^{*}}E(g(s)) \le M$. The difference $f - g$ satisfies
$$\{\partial_t + v\cdot\nabla_x\}[f-g] + L[f-g] = \Gamma[f-g, f] + \Gamma[g, f-g], \tag{80}$$
with f (0, x, v) = g(0, x, v), and L = −A − K. We apply (34) with θ = 0 in Theorem 3 to get ( [f − g, f ] + [g, f − g] , f − g) {|f − g|2 |f |σ + |f − g|σ |f |2 + |f − g|2 |g|σ + |f − g|σ |g|2 } |f − g|σ dx ≤ T3 ∂ α g + ∂ α f f − g2σ + ∂ α gσ + ∂ α f σ f − gσ f − g ≤ |α|≤4
√ ≤ C Mf − g2σ + CM f − g2 ,
|α|≤4
where we have used (55) to estimate supx |f |σ , supx |f |2 , supx |g|σ , and supx |g|2 . Since from (23) and Lemma 6, (L [f − g] , f − g) ≥
1 f − g2σ − Cf − g, 2
multiplying (80) with (f − g) and integrating over [0, t] × T3 × R3 yields t 2 f (s) − g(s)2σ ds f (t) − g(t) + 0 t √ t 2 ≤C M f (s) − g(s)σ ds + C f (s) − g(s)2 ds. 0
From (10),
t
0
0 |α|≤4
∂ α g(s)2σ + ∂ α f (s)2σ ds ≤ 2M,
√ if C M < 1 we deduce f ≡ g from the Gronwall inequality. The uniqueness thus follows. To show the continuity of E(f (t)) with respect to t, we integrate (70) from s to t: (with f n = f n+1 = f, γ + 2 ≤ 0): |E(f (t)) − E(f (s))| t 1 1 α α 2 α 2 2 ∂β f (t)|β| + ∂β f (s)|β| ∂β f (τ )σ,|β| dτ − = 2 2 s
t # ≤ C 1 + sup E(f (τ )) ∂βα f (τ )2σ,|β| + ∂βα f (τ )2|β| dτ → 0, s≤τ ≤t
s
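as $t \to s$ since $\|\partial^{\alpha}_{\beta}f(\tau)\|^2_{\sigma,|\beta|}$ and $\|\partial^{\alpha}_{\beta}f(\tau)\|^2_{|\beta|}$ are integrable over $\tau$. We derive (79) from (78) with $f^{n+1} = f^{n} = f$. Notice that $\|\mu\partial^{\alpha}f\|^2 \le C\|\partial^{\alpha}f\|^2_{\sigma}$. The case for $\gamma+2 \ge 0$ is similar (more direct): no weight function $w$ is needed. We thus conclude Theorem 4.

The crucial energy estimate (79) indicates the importance of controlling $\int_0^t\|\partial^{\alpha}f(s)\|^2_{\sigma}\,ds$ for the pure $x$-derivatives of $f$, which is the main focus of the rest of the article. We next consider more refined estimates for $x$-derivatives within the existence interval $[0,T^{*})$.

Lemma 9. Assume $f(t,x,v)$ satisfies (7) for $0 \le t \le T$ with
$$\sup_{0\le s\le T}\sum_{|\alpha|\le N}\|\partial^{\alpha}f(s)\|^2 \le M, \tag{81}$$
where the constant $M \le 1$ is sufficiently small. Then there exists a constant $C > 0$ so that for $0 \le s \le t \le T$,
$$\sum_{|\alpha|\le N}\|\partial^{\alpha}f(t)\|^2 + \sum_{|\alpha|\le N}\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}\,d\tau \le e^{C[t-s]}\sum_{|\alpha|\le N}\|\partial^{\alpha}f(s)\|^2. \tag{82}$$
Moreover, we have
$$\sum_{|\alpha|\le N}\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}\,d\tau \ge \frac1C\big\{1 - e^{-C(t-s)}\big\}\sum_{|\alpha|\le N}\|\partial^{\alpha}f(s)\|^2,\quad \gamma+2 \ge 0; \tag{83}$$
$$\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f(t)\|^2 \le e^{C[t-s]}\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f(s)\|^2,\quad \gamma+2 \le 0; \tag{84}$$
$$\sum_{|\alpha|\le N}\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}\,d\tau \ge \frac1C\big\{1 - e^{-C(t-s)}\big\}\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f(s)\|^2,\quad \gamma+2 \le 0. \tag{85}$$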
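Proof. Notice that from Lemma 1:
$$L\{\partial^{\alpha}f\} = -\partial_i\{\sigma^{ij}\partial_j\partial^{\alpha}f\} + \sigma^{ij}v_iv_j\,\partial^{\alpha}f - \partial_i\sigma^{i}\,\partial^{\alpha}f - K\partial^{\alpha}f.$$
To prove (82), we choose $\eta$ small, and $g_1 = g_2 = g = \partial^{\alpha}f$, as well as $\theta = 0$ in (23) and Lemma 6:
$$\frac34\|\partial^{\alpha}f\|^2_{\sigma} - C\|\partial^{\alpha}f\|^2 \le \big(L\{\partial^{\alpha}f\},\ \partial^{\alpha}f\big) \le \frac54\|\partial^{\alpha}f\|^2_{\sigma} + C\|\partial^{\alpha}f\|^2. \tag{86}$$
Together with
$$\frac12\frac{d}{dt}\|\partial^{\alpha}f\|^2 + \big(L\{\partial^{\alpha}f\},\ \partial^{\alpha}f\big) = \big(\partial^{\alpha}\Gamma[f,f],\ \partial^{\alpha}f\big), \tag{87}$$
we deduce from (35) with $\theta = 0$, and (86),
$$\begin{aligned}
\frac12\frac{d}{dt}\|\partial^{\alpha}f\|^2 + \frac34\|\partial^{\alpha}f\|^2_{\sigma} &\le C\|\partial^{\alpha}f\|^2 + C\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}f\|\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}f\|_{\sigma}\,\|\partial^{\alpha}f\|_{\sigma}\\
&\le C\|\partial^{\alpha}f\|^2 + C\sqrt{M}\sum_{|\alpha_1|\le N}\|\partial^{\alpha_1}f\|_{\sigma}\,\|\partial^{\alpha}f\|_{\sigma}.
\end{aligned}\tag{88}$$
The summation is over $|\alpha_1| \le N$. For $M$ sufficiently small, we sum over $|\alpha| \le N$ to get
$$\frac12\frac{d}{dt}\sum_{|\alpha|\le N}\|\partial^{\alpha}f\|^2 \le C\sum_{|\alpha|\le N}\|\partial^{\alpha}f\|^2. \tag{89}$$
From the Gronwall lemma, we have
$$\sum_{|\alpha|\le N}\|\partial^{\alpha}f(t)\|^2 \le e^{C[t-s]}\sum_{|\alpha|\le N}\|\partial^{\alpha}f(s)\|^2. \tag{90}$$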
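Plugging (90) into the right-hand side of (88), we deduce (82) by integrating (89) over $[s,t]$. If $\gamma+2 \ge 0$, by (86), we integrate (87) over $[s,t]$ to get
$$\frac12\sum\|\partial^{\alpha}f(t)\|^2 \ge \frac12\sum\|\partial^{\alpha}f(s)\|^2 - C\sum\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}\,d\tau,$$
where $|\alpha| \le N$, and $\|\partial^{\alpha}f(t)\|^2 \le C\|\partial^{\alpha}f(t)\|^2_{\sigma}$ for $\gamma+2 \ge 0$ by (21). This further implies
$$\sum\|\partial^{\alpha}f(t)\|^2_{\sigma} \ge C\sum\|\partial^{\alpha}f(s)\|^2 - C\sum\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}\,d\tau,$$
where $|\alpha| \le N$. We thus deduce (83) from the above inequality by the Gronwall lemma.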
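The Gronwall step behind (83) can be illustrated numerically. Writing $Y(t) = \sum\int_s^t\|\partial^{\alpha}f(\tau)\|^2_{\sigma}d\tau$ and $y_s = \sum\|\partial^{\alpha}f(s)\|^2$, the last inequality has the form $Y' \ge y_s - CY$ after normalizing the generic constants (an assumption made here for readability), and the lower bound is $Y(t) \ge \frac1C\{1 - e^{-C(t-s)}\}y_s$. The sketch below uses hypothetical values of `y_s`, `C` and the time span, and integrates the scalar differential inequality by forward Euler, once in the extremal case and once with a positive slack term.

```python
import math

def integrate(y_s, C, T, slack=0.0, steps=100_000):
    """Forward-Euler solve of Y' = y_s - C*Y + slack with Y(0) = 0.
    slack = 0 is the extremal (equality) case of Y' >= y_s - C*Y."""
    dt = T / steps
    Y = 0.0
    for _ in range(steps):
        Y += dt * (y_s - C * Y + slack)
    return Y

y_s, C, T = 2.0, 3.0, 1.5              # hypothetical sample values
lower_bound = y_s * (1.0 - math.exp(-C * T)) / C

extremal = integrate(y_s, C, T)             # should match the lower bound
with_slack = integrate(y_s, C, T, slack=0.5)  # strict inequality case
print(lower_bound, extremal, with_slack)
```

The extremal solution reproduces the closed-form bound up to the Euler discretization error, and any solution with nonnegative slack stays above it.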
For γ + 2 ≤ 0, w ≤ 1. Let f n+1 = f n = f in (76). Multiply it with w∂ α f , and integrate over T3 × R3 to get: 1 d w 1/2 ∂ α f 2 + (wL ∂ α f , ∂ α f ) 2 dt = (w∂ α f, f , ∂ α f ) ≤ ∂ α1 f ∂ α1 f σ,1/2 ∂ α f σ,1/2 |α1 |≤N |α1 |≤N √ ≤C M ∂ α1 f σ,1/2 ∂ α f σ,1/2 .
(91)
|α1 |≤N
Here we have used (35) with θ = 1/2. Notice that from (23) and Lemma 6 with g = g1 = g2 = ∂ α f, and θ = 1/2, w ≤ 1, C ∂αf
2 σ,1/2
1 α ∂ f ≥ wL ∂ α f , ∂ α f ≥ 2
2 σ,1/2
2
− C w 1/2 ∂ α f
.
(92)
Hence, summing over |α| ≤ N, and choosing M sufficiently small, we have d w1/2 ∂ α dt
2
+ ∂αf
2 σ,1/2
≤ C w1/2 ∂ α f
Equation (84) thus follows from Gronwall’s Lemma. On the other hand, notice that ∂ α f (t)2σ ≥ C w 1/2 ∂ α f (t) Lemma 6 and (91), we have d dt
s
t
∂ α f (τ )
2 σ
2
2
.
from (21). By (23),
dτ = ∂ α f (t)2σ ≥ C w1/2 ∂ α f (t)
2
t
wL ∂ α f , ∂ α f s t − ∂ α1 f ∂ α1 f σ,1/2 ∂ α f σ,1/2 s |α |≤N |α1 |≤N 1 t 2 2 ≥ C w1/2 ∂ α f (s) − C ∂ α f (t) σ ds, by (92). ≥C w
1/2 α
∂ f (s)
2
−
s |α|≤N
We thus deduce again from the Gronwall inequality |α|≤N
t s
∂ α f (τ )
2 σ
dτ ≥
1 1 − e−C(t−s) C
|α|≤N
w 1/2 ∂ α f (s)
2
.
5. Positivity of $\int_0^t(Lf(s), f(s))\,ds$

In this section, we shall establish the positivity of the operator $\int(Lf, f)\,ds$ for every solution $f(t,x,v)$ to the Landau equation (7) with small amplitude. The conservation laws (8) play an important role in the proof.

Lemma 10. Assume $f(t,x,v)$ satisfies (7) for $0 \le t \le T$ with $T \ge 1$. Assume $f_0(x,v)$ satisfies (8). Assume (81) is valid for $M$ sufficiently small. Then there exists a constant $0 < \delta_M < 1$, such that
$$\sum_{|\alpha|\le N}\int_0^1\big(L\{\partial^{\alpha}f(s)\},\ \partial^{\alpha}f(s)\big)ds \ge \delta_M\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f(s)\|^2_{\sigma}\,ds.$$
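Proof. We first consider the case of $\gamma+2 \le 0$. We shall prove this lemma by contradiction. If not, there exists a sequence of solutions $f_n(t,x,v)$ (not identically zero) to (7), so that for $0 \le t \le T$, $\sup_{0\le s\le T}\sum_{|\alpha|\le N}\|\partial^{\alpha}f_n(s)\|^2 \le M$, but
$$0 \le \sum_{|\alpha|\le N}\int_0^1\big(L\{\partial^{\alpha}f_n(s)\},\ \partial^{\alpha}f_n(s)\big)ds \le \frac1n\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f_n(s)\|^2_{\sigma}\,ds.$$
We normalize
$$Z_n(t,x,v) \equiv \frac{f_n(t,x,v)}{\sqrt{\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f_n(s)\|^2_{\sigma}\,ds}},\qquad\text{so that}\qquad \sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}Z_n(s)\|^2_{\sigma}\,ds \equiv 1. \tag{93}$$
By dividing by $\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f_n(s)\|^2_{\sigma}\,ds$, we have
$$0 \le \sum_{|\alpha|\le N}\int_0^1\big(L\{\partial^{\alpha}Z_n(s)\},\ \partial^{\alpha}Z_n(s)\big)ds \le \frac1n. \tag{94}$$
By Lemma 9, we have
$$\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f_n(t)\|^2 \le C\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f_n(0)\|^2,\qquad \sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f_n(s)\|^2_{\sigma}\,ds \ge C\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}f_n(0)\|^2.$$
This implies that
$$\sup_{0\le t\le1}\sum_{|\alpha|\le N}\|w^{1/2}\partial^{\alpha}Z_n(t)\|^2 \le C, \tag{95}$$
uniformly in $n$. Notice that $\partial^{\alpha}f_n$ satisfies
$$\{\partial_t + v\cdot\nabla_x\}\partial^{\alpha}f_n + L\{\partial^{\alpha}f_n\} \equiv \partial^{\alpha}\Gamma[f_n, f_n].$$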
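By dividing by $\sqrt{\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}f_n(s)\|^2_{\sigma}\,ds}$ throughout the equation, we deduce that $Z_n(t)$ satisfies
$$[\partial_t + v\cdot\nabla_x]\partial^{\alpha}Z_n + L\{\partial^{\alpha}Z_n\} = \partial^{\alpha}\Gamma[f_n, Z_n]. \tag{96}$$
From (8), we also have the integrated conservation laws ($i = 1, 2, 3$)
$$\int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}Z_n\sqrt{\mu} = \int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}v_iZ_n\sqrt{\mu} = \int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}|v|^2Z_n\sqrt{\mu} = 0. \tag{97}$$
From (95) and (81), we denote $Z(t,x,v)$ and $f(t,x,v)$ as limits for $Z_n(t,x,v)$ and $f_n(t,x,v)$ respectively. We have, up to a subsequence, for $|\alpha| \le N$,
$$\partial^{\alpha}Z_n(t,x,v) \rightharpoonup \partial^{\alpha}Z(t,x,v)\ \text{ weakly with respect to the inner product } \int_0^1(g_1(s), g_2(s))_{\sigma}\,ds,\ \text{ and }\ \partial^{\alpha}f_n(t,x,v) \rightharpoonup \partial^{\alpha}f(t,x,v)\ \text{ in } L^2. \tag{98}$$

Step 1. $Z(t,x,v) = a(t,x)\sqrt{\mu} + b(t,x)\cdot v\sqrt{\mu} + c(t,x)|v|^2\sqrt{\mu}$.

Recalling (22), we split
$$Z_n(t,x,v) = P_0Z_n + \{I - P_0\}Z_n = \sum_{j=1}^{5}\langle Z_n(t,x,\cdot), e_j\rangle e_j(v) + \{I - P_0\}Z_n,$$
so that
$$\partial^{\alpha}Z_n(t,x,v) = P_0\partial^{\alpha}Z_n + \{I - P_0\}\partial^{\alpha}Z_n = \sum_{j=1}^{5}\langle\partial^{\alpha}Z_n(t,x,\cdot), e_j\rangle e_j(v) + \{I - P_0\}\partial^{\alpha}Z_n. \tag{99}$$
From (94) and (24) in Lemma 5, we deduce that
$$\sum_{|\alpha|\le N}\int_0^1\|\{I - P_0\}\partial^{\alpha}Z_n\|^2_{\sigma}\,ds \le \frac1\delta\sum_{|\alpha|\le N}\int_0^1\big(L\partial^{\alpha}Z_n(s),\ \partial^{\alpha}Z_n(s)\big)ds \le \frac{1}{\delta n} \to 0. \tag{100}$$
We now claim that
$$\int_0^1\big\|\langle\partial^{\alpha}Z_n, e_j\rangle e_j - \langle\partial^{\alpha}Z, e_j\rangle e_j\big\|^2_{\sigma}\,ds \to 0. \tag{101}$$
Proof of the claim. Since $e_j$ is smooth with exponential decay when $v \to \infty$, it suffices to show that
$$\int_0^1\!\!\int_{\mathbf{T}^3}\big|\langle\partial^{\alpha}Z_n, e_j\rangle - \langle\partial^{\alpha}Z, e_j\rangle\big|^2dxds \to 0. \tag{102}$$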
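To prove (102), for every $\varepsilon > 0$ we choose a smooth cutoff function $\chi(t,x,v)$ in $(0,1)\times\mathbf{T}^3\times\mathbf{R}^3$ such that $\chi(t,x,v) \equiv 1$ in $[\varepsilon, 1-\varepsilon]\times\mathbf{T}^3\times\{|v| \le \frac1\varepsilon\}$. We split
$$\langle\partial^{\alpha}Z_n - \partial^{\alpha}Z,\ e_j\rangle = \langle[1-\chi]\{\partial^{\alpha}Z_n - \partial^{\alpha}Z\},\ e_j\rangle + \langle\chi\{\partial^{\alpha}Z_n - \partial^{\alpha}Z\},\ e_j\rangle. \tag{103}$$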
Its first term is bounded 1 % % & & [1 − χ ] ∂ α Zn , ej 2 + (1 − χ )∂ α Z, ej 2 dxds 0
T3
≤C 0 ≤C
1
2 2 [1 − χ ]2 ∂ α Zn ej + C T3 ×R3 +C +C .
0≤s≤%
1−%≤s≤1
1
0
T3 ×R3
2 [1 − χ ]2 |∂ α Z|2 ej
|v|≥1/%
Since sups w 1/2 ∂ α Zn (s) ≤ C from (95) and ej (v) ≤ o(%) [1 + |v|]γ +2 ,
for |v| ≥ 1/%,
we can bound the three integrals by C% sup [1 + |v|]γ +2 |∂ α Zn (s)|2 + |∂ α Z(s)|2 ≤ C%. s
T3 ×R3
(104)
& % We now prove that in (103), χ ∂ α Zn (t, x, ·), ej ∈ H 1/6 [0, 1] × T3 . Notice that [∂t + v · ∇x ] χ ∂ α Zn = −L ∂ α Zn χ + ∂ α fn, Zn χ + ∂ α Zn [∂t + v · ∇x ] χ . (105) Clearly ∂ α Zn [∂t + v · ∇x ] χ ∈ L2 [0, 1] × T3 × R3 . Now from Lemma 1, L [∂ α Zn ] χ is −∂i σ ij ∂j ∂ α Zn χ − ∂i σ i ∂ α Zn χ + σ ij vi vj ∂ α Zn χ χ +µ−1/2 ∂i µ φ ij ∗ µ1/2 ∂j ∂ α Zn + vj ∂ α Zn = −∂i σ ij ∂j ∂ α Zn χ + σ ij ∂j ∂ α Zn ∂i χ − ∂i σ i ∂ α Zn χ + σ ij vi vj ∂ α Zn χ +∂i µ1/2 φ ij ∗ µ1/2 ∂j ∂ α Zn + vj ∂ α Zn χ −µ1/2 φ ij ∗ µ1/2 ∂j ∂ α Zn + vj ∂ α Zn ∂i χ −vi µ1/2 φ ij ∗ µ1/2 ∂j ∂ α Zn + vj ∂ α Zn χ ≡ ∂i g1 + g2 .
Clearly, since χ is compactly supported, g1 , g2 ∈ L2 [0, 1] × T3 × R3 since g1 + g2 ≤ C ∂ α Zn σ . We now apply (56) to estimate ∂ α fn, Zn χ with g1 = fn and g2 = Zn . By the averaging lemma of Diperna and Lions [DiL], we deduce
& % α χ (t, x, v)∂ α Zn (t, x, v)ej (v)dv ∈ H 1/6 [0, 1] × T3 , χ ∂ Zn , ej = R3
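from (81) and $\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}Z_n\|^2_{\sigma}\,ds = 1$. This implies, up to a subsequence,
$$\langle\chi\partial^{\alpha}Z_n, e_j\rangle \to \langle\chi\partial^{\alpha}Z, e_j\rangle\quad\text{in } L^2([0,1]\times\mathbf{T}^3).$$
Combining with (104), we deduce (102) and our claim by first choosing $\varepsilon$ sufficiently small, then letting $n \to \infty$. Hence
$$\int_0^1\|\partial^{\alpha}Z_n - \partial^{\alpha}Z\|^2_{\sigma}\,ds \to 0,\qquad\text{and}\qquad \sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}Z\|^2_{\sigma}\,ds = 1. \tag{106}$$
Letting $n \to \infty$ in (99), we deduce that
$$Z = \sum_j\langle Z, e_j\rangle e_j = P_0Z,\qquad\text{or}\qquad Z(t,x,v) = a(t,x)\sqrt{\mu} + b(t,x)\cdot v\sqrt{\mu} + c(t,x)|v|^2\sqrt{\mu}.$$
Since $\sqrt{\mu}$, $v\sqrt{\mu}$, $|v|^2\sqrt{\mu}$ are linearly independent, $a$, $b$, $c$ can be solved as linear combinations of $\langle Z(t,x,\cdot), \sqrt{\mu}\rangle$, $\langle Z(t,x,\cdot), v\sqrt{\mu}\rangle$ and $\langle Z(t,x,\cdot), |v|^2\sqrt{\mu}\rangle$. And from (95), we have
$$\sup_{0\le t\le1}\sum_{|\alpha|\le N}\big\{\|\partial^{\alpha}a(t)\|_2 + \|\partial^{\alpha}b(t)\|_2 + \|\partial^{\alpha}c(t)\|_2\big\} \le C. \tag{107}$$
Notice that
$$\partial^{\alpha}\Gamma[f_n, Z_n] - \partial^{\alpha}\Gamma[f, Z] = \partial^{\alpha}\Gamma[f_n, Z_n - Z] + \partial^{\alpha}\Gamma[f_n - f, Z].$$
Applying (57) with either $g_1 = f_n$ and $g_2 = Z_n - Z$, or $g_1 = f_n - f$ and $g_2 = Z$, we deduce from the strong convergence in (106) and the weak convergence in (98) that
$$\partial^{\alpha}\Gamma[f_n, Z_n] - \partial^{\alpha}\Gamma[f, Z] \to 0\quad\text{in } \mathcal{D}'.$$
By taking $n \to \infty$ in (96), we separate the linear parts and the nonlinear parts to get, in the sense of distributions,
$$[\partial_t + v\cdot\nabla_x]\partial^{\alpha}Z = \partial^{\alpha}\Gamma[f, Z] \equiv h^{\alpha}, \tag{108}$$
where we have used the facts $Z = P_0Z$ and $L[\partial^{\alpha}Z] = 0$. The integrated conservation laws hold as $n \to \infty$:
$$\int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}Z\sqrt{\mu} = \int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}v_iZ\sqrt{\mu} = \int_0^1\!\!\int_{\mathbf{T}^3\times\mathbf{R}^3}|v|^2Z\sqrt{\mu} = 0,\quad i = 1, 2, 3.$$

Step 2.
$$\sum_{|\alpha|\le N}\int_0^1\|\partial^{\alpha}Z(s)\|^2_{\sigma}\,ds \le CM. \tag{109}$$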
This leads √ as M is sufficiently small. We plug Z = √ to a contradiction √ to (106) as long a(t, x) µ + b(t, x) √ · v µ + c(t, x)|v|2 µ into (108), and expand it as products of a polynomial in v and µ: α √ √ ∇x ∂ c · v|v|2 µ + ∂ α c|v| ˙ 2 + v · ∇x ∂ α b·v µ α √ √ + ∂ b˙ + ∇x ∂ α a · v µ + ∂ α a˙ µ = hα (f, Z),
(110)
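where $|\alpha|\le N$ and '$\cdot$' denotes the time derivative $\frac{d}{dt}$. We compare the coefficients of $\sqrt{\mu}$, $v_i\sqrt{\mu}$, $v_i^2\sqrt{\mu}$, $v_iv_j\sqrt{\mu}$ and $|v|^2v_i\sqrt{\mu}$, $1\le i\ne j\le 3$. We denote an orthonormal basis for this 13-dimensional space by $e_j$, $1\le j\le 13$, as in [GL]. Let
\[
\big[\sqrt{\mu},\,v_i\sqrt{\mu},\,v_i^2\sqrt{\mu},\,v_iv_j\sqrt{\mu},\,|v|^2v_i\sqrt{\mu}\big]\,A_{13\times 13}=[e_j],
\]
with $\det A\ne 0$. Let $\psi(t,x)\in C_c^\infty\big((0,1)\times\mathbb{T}^3\big)$. By (58), we can multiply both sides of (110) by test functions of the form
\[
\psi(t,x)\sqrt{\mu},\quad\psi(t,x)v_i\sqrt{\mu},\quad\psi(t,x)v_i^2\sqrt{\mu},\quad\psi(t,x)v_iv_j\sqrt{\mu},\quad\psi(t,x)v_i|v|^2\sqrt{\mu}
\]
and integrate over $[0,1]\times\mathbb{T}^3\times\mathbb{R}^3$. Since $\sqrt{\mu}$, $v_i\sqrt{\mu}$, $v_i^2\sqrt{\mu}$, $v_iv_j\sqrt{\mu}$ and $v_i|v|^2\sqrt{\mu}$ are linearly independent, we deduce that, in the sense of distributions, their coefficients on both sides should be equal respectively for $1\le i\ne j\le 3$:
\[
\partial^i\partial^\alpha c=h^\alpha_{ci}, \tag{111}
\]
\[
\partial^\alpha\dot c+\partial^i\partial^\alpha b_i=h^\alpha_i, \tag{112}
\]
\[
\partial^i\partial^\alpha b_j+\partial^j\partial^\alpha b_i=h^\alpha_{ij},\qquad i\ne j, \tag{113}
\]
\[
\partial^\alpha\dot b_i+\partial^i\partial^\alpha a=h^\alpha_{bi}, \tag{114}
\]
\[
\partial^\alpha\dot a=h^\alpha_a, \tag{115}
\]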
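where
\[
\big[h^\alpha_a,\,h^\alpha_{bi},\,h^\alpha_i,\,h^\alpha_{ij},\,h^\alpha_{ci}\big]=\int_{\mathbb{R}^3}h^\alpha(f,Z)\,e_j(v)\,dv\;A.
\]
Since $h^\alpha=\partial^\alpha\Gamma[f,Z]$, by (107), (95), we can apply (58) with $g_1=f$, $g_2=Z$ and $\chi=e_j$ to get
\[
\sup_{0\le s\le 1}\sum_{|\alpha|\le N}\big\{\|h^\alpha_a(s)\|_2+\|h^\alpha_{bi}(s)\|_2+\|h^\alpha_{ci}(s)\|_2+\|h^\alpha_i(s)\|_2+\|h^\alpha_{ij}(s)\|_2\big\}\le C\sqrt{M}. \tag{116}
\]
In order to complete Step 2, we use Eqs. (111) through (115) to show that $a$, $b$ and $c$ are of the order $O(\sqrt{M})$. We first estimate $c(t,x)$. From (111),
\[
\sum_{|\alpha|\le N}\|\nabla_x\partial^\alpha c\|_2\le\sum_{|\alpha|\le N}\sum_i\|h^\alpha_{ci}\|_2\le C\sqrt{M}. \tag{117}
\]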
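We next estimate $b_i(t,x)$, for $1\le i\le 3$. Taking $\partial^j$ of (112) and (113), we get
\[
\begin{aligned}
\Delta\partial^\alpha b_i&=\sum_j\partial^{jj}\partial^\alpha b_i\\
&=\sum_{j\ne i}\partial^{jj}\partial^\alpha b_i+\partial^{ii}\partial^\alpha b_i\\
&=\sum_{j\ne i}\big\{-\partial^{ji}\partial^\alpha b_j+\partial^jh^\alpha_{ij}\big\}+\partial^ih^\alpha_i-\partial^i\partial^\alpha\dot c\\
&=\sum_{j\ne i}\big\{\partial^i\partial^\alpha\dot c-\partial^ih^\alpha_j+\partial^jh^\alpha_{ij}\big\}+\partial^ih^\alpha_i-\partial^i\partial^\alpha\dot c\\
&=2\partial^i\partial^\alpha\dot c-\partial^i\partial^\alpha\dot c-\sum_{j\ne i}\big\{\partial^ih^\alpha_j-\partial^jh^\alpha_{ij}\big\}+\partial^ih^\alpha_i\\
&=\partial^i\partial^\alpha\dot c-\sum_{j\ne i}\big\{\partial^ih^\alpha_j-\partial^jh^\alpha_{ij}\big\}+\partial^ih^\alpha_i\\
&=-\partial^{ii}\partial^\alpha b_i+\partial^ih^\alpha_i-\sum_{j\ne i}\big\{\partial^ih^\alpha_j-\partial^jh^\alpha_{ij}\big\}+\partial^ih^\alpha_i\\
&=-\partial^{ii}\partial^\alpha b_i+2\partial^ih^\alpha_i-\sum_{j\ne i}\big\{\partial^ih^\alpha_j-\partial^jh^\alpha_{ij}\big\}.
\end{aligned}
\]
Therefore, multiplying with $\partial^\alpha b_i$ yields
\[
\sum_{|\alpha|\le N}\sum_{i,j}\|\partial^j\partial^\alpha b_i\|_2\le C\sum_{|\alpha|\le N}\sum_{i,j}\big\{\|h^\alpha_i\|_2+\|h^\alpha_{ij}\|_2\big\}\le C\sqrt{M}. \tag{118}
\]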
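To estimate $\nabla a(t,x)$, however, we have to integrate over a time interval. We assume $t\ge 1/2$ (otherwise, we integrate over the time interval $[t,1]$). We integrate (114) over $[0,t]$ to get, for $0\le|\alpha|\le N-1$,
\[
\partial^\alpha b_i(t)-\partial^\alpha b_i(0)+\int_0^t\partial^i\partial^\alpha a(s)\,ds=\int_0^th^\alpha_{bi}(s)\,ds. \tag{119}
\]
Since $\partial^i\partial^\alpha\dot a=\partial^ih^\alpha_a\in L^2\big([0,1]\times\mathbb{T}^3\times\mathbb{R}^3\big)$, for $0\le|\alpha|\le N-1$,
\[
\partial^i\partial^\alpha a(s)=\partial^i\partial^\alpha a(t)+\int_t^s\partial^ih^\alpha_a(\tau)\,d\tau.
\]
Plugging this into (119), and integrating over $[0,t]$ yields:
\[
\partial^i\partial^\alpha a(t)=-\frac1t\big\{\partial^\alpha b_i(t)-\partial^\alpha b_i(0)\big\}-\frac1t\int_0^t\!\!\int_t^s\partial^ih^\alpha_a(\tau)\,d\tau\,ds+\frac1t\int_0^th^\alpha_{bi}\,ds.
\]
Summing over its $x_i$-derivatives, we obtain
\[
\Delta\partial^\alpha a(t)=-\frac1t\sum_i\big\{\partial^i\partial^\alpha b_i(t)-\partial^i\partial^\alpha b_i(0)\big\}-\frac1t\int_0^t\!\!\int_t^s\Delta h^\alpha_a(\tau)\,d\tau\,ds+\frac1t\sum_i\int_0^t\partial^ih^\alpha_{bi}\,ds. \tag{120}
\]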
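Multiplying the above with $\partial^\alpha a-\frac{1}{|\mathbb{T}^3|}\int_{\mathbb{T}^3}\partial^\alpha a\,dx$, we deduce for $|\alpha|\le N-1$,
\[
\sup_t\|\nabla_x\partial^\alpha a\|_2\le C\sup_t\big\{\|\nabla_x\partial^\alpha b\|_2+\|\nabla_xh^\alpha_a\|_2+\|h^\alpha_{bi}\|_2\big\}.
\]
We have used the Poincaré inequality,
\[
\Big\|\partial^\alpha a-\frac{1}{|\mathbb{T}^3|}\int_{\mathbb{T}^3}\partial^\alpha a\,dx\Big\|_2\le C|\nabla_x\partial^\alpha a|_2.
\]
Notice that for $|\alpha|\le N-1$, $\|\nabla_xh^\alpha_i\|_2\le C\sqrt{M}$; we therefore conclude from (118) that
\[
\int_0^1\sum_{\alpha\ne 0,\,|\alpha|\le N}\big\{\|\partial^\alpha a\|_2^2+\|\partial^\alpha b\|_2^2+\|\partial^\alpha c\|_2^2\big\}(s)\,ds\le CM.
\]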
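Letting $\alpha=0$ in Eqs. (111) through (115), we have
\[
\int_0^1\big\{|\dot a(s)|_2^2+|\dot b(s)|_2^2+|\dot c(s)|_2^2\big\}\,ds\le CM.
\]
Therefore, by Poincaré's inequality in $[0,1]\times\mathbb{T}^3$,
\[
\begin{aligned}
\int_0^1\big\{|a|_2^2+|b|_2^2+|c|_2^2\big\}(s)\,ds&\le C\int_0^1\big\{|\nabla_{t,x}a|_2^2+|\nabla_{t,x}b|_2^2+|\nabla_{t,x}c|_2^2\big\}(s)\,ds\\
&\quad+C\Big\{\Big|\int_0^1\!\!\int_{\mathbb{T}^3}a\,dt\,dx\Big|^2+\Big|\int_0^1\!\!\int_{\mathbb{T}^3}b\,dt\,dx\Big|^2+\Big|\int_0^1\!\!\int_{\mathbb{T}^3}c\,dt\,dx\Big|^2\Big\}\\
&\le CM+C\Big\{\Big|\int_0^1\!\!\int_{\mathbb{T}^3}a\,dt\,dx\Big|^2+\Big|\int_0^1\!\!\int_{\mathbb{T}^3}b\,dt\,dx\Big|^2+\Big|\int_0^1\!\!\int_{\mathbb{T}^3}c\,dt\,dx\Big|^2\Big\}.
\end{aligned}
\]
From the conservation laws (109), $\int_0^1\!\int_{\mathbb{T}^3}b\equiv 0$, and
\[
\int_0^1\!\!\int_{\mathbb{T}^3}a+\int_0^1\!\!\int_{\mathbb{T}^3}c=0.
\]
This implies that
\[
\int_0^1\big\{|a(s)|_2^2+|b(s)|_2^2+|c(s)|_2^2\big\}\,ds\le CM,
\]
and therefore
\[
\int_0^1\sum_{|\alpha|\le N}\big\{|\partial^\alpha a|_2^2+|\partial^\alpha b|_2^2+|\partial^\alpha c|_2^2\big\}\,ds\le CM.
\]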
This contradicts $\int_0^1\|Z\|_\sigma^2\,ds=1$ if $M$ is chosen to be small. Our lemma thus follows for the case $\gamma+2\le 0$. The case $\gamma+2\ge 0$ is proven by the same steps as the case $\gamma+2\le 0$ (and is simpler). Since $\|\cdot\|_\sigma\ge\|\cdot\|$, we have from (82) and (83) in Lemma 9
\[
\sup_{0\le s\le 1}\|Z_n(s)\|<\infty.
\]
Hence in the proof of the claim (101), we do not need the weighted norm anymore.
Proof of Theorem 2. Note that (81) is valid for $0\le t\le T$ due to the fact
\[
\sup_{0\le s\le T}\sum_{|\alpha|\le N}\|\partial^\alpha f(s)\|^2\le\sup_{0\le s\le T}E(f(s))\le M.
\]
Let $n\ge 1$. We simply apply Lemma 10 to each of the intervals $[t_1,t_1+1]$, $[t_1+1,t_1+2]$, ..., $[t_1+n-1,t_1+n]$.

6. Global Existence

Proof of Theorem 1. Let $\gamma+2\le 0$. Now we are ready to use the positivity of $\int_0^t\langle Lf(s),f(s)\rangle\,ds$ to estimate $E(f)$. We choose a number $M>0$ such that Lemma 10 is valid. We first choose
\[
E(f_0)=\sum_{|\beta|+|\alpha|\le N}\|\partial^\alpha_\beta f_0\|^2\le M_1<M/2,
\]
sufficiently small so that Theorem 4 holds. We can also define
\[
T=\sup_t\Big\{t:\sup_{0\le s\le t}E(f(s))\le M\Big\}>0.
\]
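For $0\le t\le T$, from (77), we obtain
\[
\frac12\|\partial^\alpha f(t)\|^2+\int_0^t\langle L\partial^\alpha f,\partial^\alpha f\rangle\,ds\le CE(f_0)+\int_0^t\langle\partial^\alpha\Gamma[f,f],\partial^\alpha f\rangle\,ds.
\]
For any $0\le t\le T$, we split $t=t_1+n$, where $0\le t_1\le 1$ and $n$ is an integer. By Theorem 2, and the fact $L\ge 0$, we deduce that ($0<\delta_M<1$)
\[
\begin{aligned}
&\frac12\|\partial^\alpha f(t)\|^2+\delta_M\int_{t_1}^t\|\partial^\alpha f(s)\|_\sigma^2\,ds+\delta_M\int_0^{t_1}\|\partial^\alpha f(s)\|_\sigma^2\,ds\\
&\qquad\le CE(f_0)+\int_0^t\langle\partial^\alpha\Gamma[f,f],\partial^\alpha f\rangle\,ds+\delta_M\int_0^{t_1}\|\partial^\alpha f(s)\|_\sigma^2\,ds\\
&\qquad\le CE(f_0)+C\sqrt{\sup_{0\le s\le t}E(f(s))}\sum_{|\alpha_1|\le N}\int_0^t\|\partial^{\alpha_1}f(s)\|_\sigma^2\,ds+\delta_M\int_0^{t_1}\|\partial^\alpha f(s)\|_\sigma^2\,ds.
\end{aligned}
\]
From (82) in Lemma 9 with $t_1\le 1$ and $M$ sufficiently small,
\[
\int_0^{t_1}\|\partial^\alpha f(s)\|_\sigma^2\,ds\le C\sum_{|\alpha|\le N}\|\partial^\alpha f_0\|^2\le CE(f_0).
\]
Therefore, we obtain
\[
\sup_{0\le s\le t}\sum_{|\alpha|\le N}\Big\{\|\partial^\alpha f(s)\|^2+\int_0^s\|\partial^\alpha f(\tau)\|_\sigma^2\,d\tau\Big\}\le C_ME(f_0)+C_M\Big[\sup_{0\le s\le t}E(f(s))\Big]^{3/2}. \tag{121}
\]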
Now we estimate the mixed derivatives of $f$. Let $f^{n+1}=f^n=f$ in (79). We obtain, by (121), for some other constant $C_M\ge 1$, that
\[
\sup_{0\le s\le t}\Big\{\|\partial^\alpha_\beta f(s)\|_{|\beta|}^2+\int_0^s\|\partial^\alpha_\beta f(\tau)\|_{\sigma,|\beta|}^2\,d\tau\Big\}\le C_ME(f_0)+C_M\Big[\sup_{0\le s\le t}E(f(s))\Big]^{3/2}, \tag{122}
\]
where summation is over $|\alpha|+|\beta|\le N$. Now we further choose $M_2\le M_1<M$ such that
\[
C_MM_2^{1/2}\le 1/2.
\]
We finally choose initially
\[
E(f_0)\le\varepsilon_0\equiv\frac{M_2}{4C_M}\le M_2/2\le M/2,
\]
and choose $C_0=2C_M$. We define
\[
T_2=\sup_t\Big\{t:\sup_{0\le s\le t}E(f(s))\le M_2\Big\}>0.
\]
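Since $0\le t\le T_2\le T$, from (122)
\[
\begin{aligned}
\sup_{0\le s\le T_2}E(f(s))&\le C_ME(f_0)+C_M\Big[\sup_{0\le s\le T_2}E(f(s))\Big]^{3/2}\\
&\le C_ME(f_0)+C_M\Big[\sup_{0\le s\le T_2}E(f(s))\Big]^{1/2}\Big[\sup_{0\le s\le T_2}E(f(s))\Big]\\
&\le C_ME(f_0)+\frac12\sup_{0\le s\le T_2}[E(f)].
\end{aligned}
\]
Hence
\[
\sup_{0\le s\le T_2}E(f(s))\le 2C_ME(f_0)\le 2C_M\times\frac{M_2}{4C_M}=M_2/2<M_2.
\]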
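The arithmetic behind this bootstrap step can be checked numerically. The following is a minimal sketch, not part of the paper: the function name `largest_admissible_energy` and the sample value `c_m = 2.0` are hypothetical, chosen only so that $C_M M_2^{1/2} \le 1/2$ and $\varepsilon_0 = M_2/(4C_M)$ hold as in the text. It scans a grid of values $y \in [0, M_2]$ and records the largest one compatible with $y \le C_M \varepsilon_0 + C_M y^{3/2}$.

```python
def largest_admissible_energy(c_m, m2, e0, grid=10_000):
    """Largest grid value y in [0, m2] with y <= c_m * e0 + c_m * y**1.5.

    Here y stands for sup E(f(s)); c_m, m2 and e0 are illustrative stand-ins
    for the constants C_M, M_2 and epsilon_0 of the text.
    """
    best = 0.0
    for k in range(grid + 1):
        y = m2 * k / grid
        if y <= c_m * e0 + c_m * y ** 1.5:
            best = y
    return best


# Hypothetical sample constants satisfying C_M * sqrt(M_2) <= 1/2.
c_m = 2.0
m2 = 1.0 / (4.0 * c_m ** 2)
e0 = m2 / (4.0 * c_m)
bound = largest_admissible_energy(c_m, m2, e0)
```

With these sample constants every admissible value stays below $2C_M\varepsilon_0 = M_2/2$, so no value in $(M_2/2, M_2]$ satisfies the inequality. This is the gap that the continuity argument exploits.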
We thus deduce $T_2=\infty$ from the continuity of $E(f(s))$, and the theorem follows. The case of $\gamma+2\ge 0$ follows from the same argument without using the weight function w|β|.

Acknowledgements. The research is supported in part by an NSF grant as well as a Sloan Fellowship. The author would like to thank L. Desvillettes and C. Villani as well as a referee for their very extensive and constructive suggestions to improve the presentation of the paper.
References

[A] Arsen’ev, A.A., Buryak, O.E.: On the connection between a solution of the Boltzmann equation and a solution of the Landau-Fokker-Planck equation. Math. USSR. Sbornik 69 (2), 465–478 (1991)
[C] Cercignani, C.: The Boltzmann Equation and Its Application. Berlin–Heidelberg–New York: Springer-Verlag, 1988
[CIP] Cercignani, C., Illner, R., Pulvirenti, M.: The Mathematical Theory of Dilute Gases. Berlin–Heidelberg–New York: Springer-Verlag, 1994
[DeL] Degond, P., Lemou, M.: Dispersion relations for the linearized Fokker-Planck equation. Arch. Rat. Mech. Anal. 138 (2), 137–167 (1997)
[DiL] Diperna, R., Lions, P-L.: Global weak solution of Vlasov-Maxwell systems. Comm. Pure Appl. Math. 42, 729–757 (1989)
[D] Desvillettes, L.: About regularizing properties of the non-cut-off Kac equation. Commun. Math. Phys. 168 (2), 417–440 (1995)
[DV] Desvillettes, L., Villani, C.: On the spatially homogeneous Landau equation for hard potentials. I. Existence, uniqueness and smoothness. Comm. PDE. 25 (1–2), 179–259 (2000)
[G1] Guo, Y.: The Vlasov-Poisson-Boltzmann system near Maxwellians. Comm. Pure Appl. Math., Vol. LV, 1104–1135 (2002)
[G2] Guo, Y.: The inverse power law with an angular cutoff. Preprint 2001
[GL] Glassey, R.: The Cauchy Problems in Kinetic Theory. Philadelphia, PA: SIAM, 1996
[H] Hilton, F.: Collisional transport in plasma. Handbook of Plasma Physics. (1) Amsterdam: North-Holland, 1983
[L] Lions, P-L.: On Boltzmann and Landau equations. Phil. Trans. R. Soc. Lond. A 346, 191–204 (1994)
[V1] Villani, C.: On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations. Arch. Rat. Mech. Anal. 143 (3), 273–307 (1998)
[V2] Villani, C.: On the Landau equation: Weak stability, global existence. Adv. Diff. Eq. 1 (5), 793–816 (1996)
Communicated by H. Spohn
Commun. Math. Phys. 231, 435–461 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0728-x
Communications in
Mathematical Physics
Construction of the Incipient Infinite Cluster for Spread-out Oriented Percolation Above 4 + 1 Dimensions

Remco van der Hofstad1,∗, Frank den Hollander2, Gordon Slade3

1 Department of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
2 EURANDOM, P.O. Box 513, 5600 MB Eindhoven, The Netherlands. E-mail: [email protected]
3 Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z2, Canada. E-mail: [email protected]

Received: 13 December 2001 / Accepted: 11 July 2002
Published online: 29 October 2002 – © Springer-Verlag 2002
Abstract: We construct the incipient infinite cluster measure (IIC) for sufficiently spread-out oriented percolation on $\mathbb{Z}^d\times\mathbb{Z}_+$, for $d+1>4+1$. We consider two different constructions. For the first construction, we define $\mathbb{P}_n(E)$ by taking the probability of the intersection of an event $E$ with the event that the origin is connected to $(x,n)\in\mathbb{Z}^d\times\mathbb{Z}_+$, summing this probability over $x\in\mathbb{Z}^d$, and normalising the sum to get a probability measure. We let $n\to\infty$ and prove existence of a limiting measure $\mathbb{P}_\infty$, the IIC. For the second construction, we condition the connected cluster of the origin in critical oriented percolation to survive to time $n$, and let $n\to\infty$. Under the assumption that the critical survival probability is asymptotic to a multiple of $n^{-1}$, we prove existence of a limiting measure $\mathbb{Q}_\infty$, with $\mathbb{Q}_\infty=\mathbb{P}_\infty$. In addition, we study the asymptotic behaviour of the size of the level set of the cluster of the origin, and the dimension of the cluster of the origin, under $\mathbb{P}_\infty$. Our methods involve minor extensions of the lace expansion methods used in a previous paper to relate critical oriented percolation to super-Brownian motion, for $d+1>4+1$.

1. Introduction and Results

1.1. The incipient infinite cluster. For oriented percolation on $\mathbb{Z}^d\times\mathbb{Z}_+$, it was shown in [3, 10] that there is no infinite cluster at the critical point. For non-oriented percolation on $\mathbb{Z}^d$, proofs that there is no percolation at the critical point are restricted to 2-dimensional and high-dimensional models, and a general proof has remained an elusive goal. The notion of the incipient infinite percolation cluster (IIC) is an attempt to describe the infinite structure that is emerging but not quite present at the critical point. Various aspects of the IIC are discussed in [1]. There is currently no existence theory for the IIC that is applicable in general dimensions, neither in the oriented nor in the non-oriented setting.
∗ Present address: Department of Mathematics and Computer Science, Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands. E-mail:
[email protected]
For bond percolation on $\mathbb{Z}^2$, Kesten [17] constructed the IIC as a measure on bond configurations in which the origin is almost surely connected to infinity. He gave two different constructions, both leading to the same measure. One construction involved conditioning on the event that the origin is connected to infinity, with bond density $p$ greater than the critical value $p_c$, and taking the limit $p\downarrow p_c$. Another construction involved conditioning on the event that the origin is connected to the boundary of a box of radius $n$, with $p=p_c$, and letting $n\to\infty$. More recently, Járai [15, 16] has shown that several other definitions of the IIC on $\mathbb{Z}^2$ yield the same measure as Kesten's. These include the inhomogeneous model of [8], and definitions in terms of invasion percolation [7], the largest cluster in a large box [5], and spanning clusters [1]. The incipient infinite cluster is thus a natural and robust object that can be constructed in many different ways. No construction of the IIC, as a measure on bond configurations, has been given for any finite-dimensional lattice in dimensions greater than 2.

In the present paper, we consider sufficiently spread-out oriented percolation on $\mathbb{Z}^d\times\mathbb{Z}_+$, with $d+1>4+1$, and propose two definitions of the IIC. Perhaps the most natural definition of the IIC for oriented percolation is the measure $\mathbb{Q}_\infty$ obtained by conditioning the cluster of the origin to survive to time $n$, with $p=p_c$, and then letting $n\to\infty$. Of course, it is not obvious that the limit exists. For another possible definition, we set $p=p_c$ and define $\mathbb{P}_n(E)$ by taking the probability of the intersection of an event $E$ with the event that the origin is connected to $(x,n)\in\mathbb{Z}^d\times\mathbb{Z}_+$, summing this probability over $x\in\mathbb{Z}^d$, and normalising the sum to get a probability measure. We will let $n\to\infty$ and prove existence of a limiting measure $\mathbb{P}_\infty$. It is clear from the definition that $\mathbb{P}_\infty$ will be supported on configurations in which the origin is connected to infinity.
In view of the apparent robustness of the IIC, it is natural to expect that $\mathbb{P}_\infty=\mathbb{Q}_\infty$. In fact, we will prove that $\mathbb{Q}_\infty$ exists and equals $\mathbb{P}_\infty$, subject to the assumption that the critical survival probability behaves asymptotically as a multiple of $n^{-1}$. We believe that the methods of [14] can be adapted to prove this assumption, and we plan to return to this problem in a future publication [12]. Our constructions are restricted to $d+1>4+1$ due to the appearance in proofs of Feynman diagrams that require $d>4$ for convergence, as in [14, 20, 21].

Finally, we will derive various properties of the IIC measure $\mathbb{P}_\infty$. These include statements that under $\mathbb{P}_\infty$ the cluster of the origin is infinite, the number of particles in the cluster of the origin at time $m$ grows like $m$ times a size-biased exponential random variable, and the cluster has a 4-dimensional character.

An alternate approach to the IIC is via a scaling limit. For oriented percolation, the goal is to understand the distribution of critical clusters that survive to time $n$, with the lattice spacing shrinking as an appropriate power of $n$, in the limit $n\to\infty$. Such a program was carried out in [14], where it was shown that the scaling limit for sufficiently spread-out oriented percolation above the upper critical dimension $4+1$ is intimately related to super-Brownian motion. (Related results for non-oriented percolation were obtained in [11].) This suggests that large critical clusters are closely related to large critical branching random walk clusters, for $d+1>4+1$.

The results and methods in the present paper are based on minor extensions of the results and lace expansion techniques used in [14]. The lace expansion was first applied to oriented percolation by Nguyen and Yang [20, 21].

1.2. Existence of the IIC measure. The spread-out oriented percolation models are defined as follows. Consider the graph with vertices $\mathbb{Z}^d\times\mathbb{Z}_+$ and directed bonds
$((x,n),(y,n+1))$, for $n\ge 0$ and $x,y\in\mathbb{Z}^d$. Let $D:\mathbb{Z}^d\to[0,1]$ be a fixed function. Let $p\in[0,\|D\|_\infty^{-1}]$, where $\|\cdot\|_\infty$ denotes the supremum norm, so that $pD(x)\le 1$ for all $x$. We associate to each directed bond $((x,n),(y,n+1))$ an independent random variable taking the value 1 with probability $pD(y-x)$ and 0 with probability $1-pD(y-x)$. We say a bond is occupied when the corresponding random variable is 1, and vacant when the random variable is 0. Given a configuration of occupied bonds, we say that $(x,n)$ is connected to $(y,m)$, and write $(x,n)\longrightarrow(y,m)$, if there is an oriented path from $(x,n)$ to $(y,m)$ consisting of occupied bonds, or if $(x,n)=(y,m)$. The joint probability distribution of the bond variables will be denoted $\mathbb{P}$, with corresponding expectation denoted $\mathbb{E}$. Note that $p$ is not a probability. We will always work at the critical percolation threshold, i.e., at $p=p_c$, and omit subscripts $p_c$ from the notation. A simple example is
\[
D(x)=\begin{cases}(2L+1)^{-d}&\|x\|_\infty\le L\\0&\text{otherwise,}\end{cases} \tag{1.1}
\]
for which bonds are of the form $((x,n),(y,n+1))$ with $\|x-y\|_\infty\le L$, and a bond is occupied with probability $p(2L+1)^{-d}$. In this parametrisation, $p_c$ tends to 1 as $L\to\infty$.

Our results hold for any function $D$ that obeys the assumptions listed in [14, Sect. 1.2]. These assumptions involve a positive parameter $L$ which serves to spread out the connections, and which we will take to be large. In particular, they require that $\sum_{x\in\mathbb{Z}^d}D(x)=1$, that $D(x)\le CL^{-d}$ for all $x$, and, with $\sigma$ defined by
\[
\sigma^2=\sum_{x\in\mathbb{Z}^d}|x|^2D(x), \tag{1.2}
\]
where $|\cdot|$ denotes the Euclidean norm on $\mathbb{R}^d$, that $C_1L\le\sigma\le C_2L$. Full details regarding the assumptions can be found in [14]. The function defined by (1.1) does obey the assumptions.

Let $\mathcal{F}$ denote the $\sigma$-algebra of events. A cylinder event is an event that is determined by the occupation status of a finite set of bonds. We denote the algebra of cylinder events by $\mathcal{F}_0$. Then $\mathcal{F}$ is the $\sigma$-algebra generated by $\mathcal{F}_0$. For our first definition of the IIC, we begin by defining $\mathbb{P}_n$ by
\[
\mathbb{P}_n(E)=\frac{1}{\tau_n}\sum_{x\in\mathbb{Z}^d}\mathbb{P}(E\cap\{(0,0)\longrightarrow(x,n)\})\qquad(E\in\mathcal{F}_0), \tag{1.3}
\]
where $\tau_n=\sum_{x\in\mathbb{Z}^d}\tau_n(x)$ with $\tau_n(x)=\mathbb{P}((0,0)\longrightarrow(x,n))$. We then define $\mathbb{P}_\infty$ by setting
\[
\mathbb{P}_\infty(E)=\lim_{n\to\infty}\mathbb{P}_n(E)\qquad(E\in\mathcal{F}_0), \tag{1.4}
\]
assuming the limit exists. The following theorem shows that this definition produces a probability measure on F under which the origin is almost surely connected to infinity. Theorem 1.1. Let d + 1 > 4 + 1 and p = pc . There is an L0 = L0 (d) such that for L ≥ L0 , the limit in (1.4) exists for every cylinder event E ∈ F0 . Moreover, P∞ extends to a probability measure on the σ -algebra F, and the origin is almost surely connected to infinity under P∞ .
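As an illustration of the model defined above, the following minimal Python sketch grows the cluster of the origin level by level for the example (1.1). It is a sketch under stated assumptions, not the authors' code: it takes $d = 1$ purely for readability (the theorems require $d > 4$), it works at a user-supplied $p$ rather than at $p_c$, and the function name `level_set_sizes` and its parameters are hypothetical.

```python
import random


def level_set_sizes(L, p, n_steps, seed=0):
    """Return [N_0, ..., N_{n_steps}] for one sample of the cluster of the origin.

    Illustrative assumptions (not from the paper): d = 1, and D is the uniform
    example (1.1), so each bond ((x, n), (y, n + 1)) with |y - x| <= L is
    occupied independently with probability p / (2L + 1).
    """
    q = p / (2 * L + 1)
    if not 0.0 <= q <= 1.0:
        raise ValueError("need 0 <= p * D(x) <= 1")
    rng = random.Random(seed)
    current = {0}   # sites at the current time that are connected to the origin
    sizes = [1]     # N_0 = 1: the origin is connected to itself
    for _ in range(n_steps):
        nxt = set()
        for x in current:
            for y in range(x - L, x + L + 1):
                # Each bond (x -> y) is visited exactly once per time step.
                if rng.random() < q:
                    nxt.add(y)
        current = nxt
        sizes.append(len(current))
    return sizes
```

Only bonds leaving sites already connected to the origin are sampled, which is enough to determine the cluster of the origin. The two extreme cases $p = 0$ and $pD \equiv 1$ on the range give deterministic output and serve as a sanity check.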
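Let
\[
S_n=\{(0,0)\longrightarrow n\}=\{(0,0)\longrightarrow(x,n)\text{ for some }x\in\mathbb{Z}^d\} \tag{1.5}
\]
denote the event that the cluster of the origin survives to time $n$. For our second definition of the IIC, we begin by defining $\mathbb{Q}_n$ by
\[
\mathbb{Q}_n(E)=\mathbb{P}(E\mid S_n)\qquad(E\in\mathcal{F}_0). \tag{1.6}
\]
We then define $\mathbb{Q}_\infty$ by setting
\[
\mathbb{Q}_\infty(E)=\lim_{n\to\infty}\mathbb{Q}_n(E)\qquad(E\in\mathcal{F}_0), \tag{1.7}
\]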
assuming the limit exists. Not surprisingly, the existence of $\mathbb{Q}_\infty$ turns out to be related to the asymptotic behaviour of the critical survival probability
\[
\theta_n=\mathbb{P}(S_n). \tag{1.8}
\]
We will assume that for critical spread-out oriented percolation with $d+1>4+1$ and $L$ sufficiently large, there is a finite positive constant $B$ such that
\[
\lim_{n\to\infty}n\theta_n=1/B. \tag{1.9}
\]
Although there is currently no proof of (1.9), we intend to return to this question in a future publication [12]. Assuming (1.9), the following theorem gives existence of the IIC measure $\mathbb{Q}_\infty$, with $\mathbb{Q}_\infty=\mathbb{P}_\infty$.

Theorem 1.2. Let $d+1>4+1$ and $p=p_c$, and assume (1.9). There is an $L_0=L_0(d)$ such that for $L\ge L_0$, the limit in (1.7) exists for every cylinder event $E\in\mathcal{F}_0$. Moreover, $\mathbb{Q}_\infty$ extends to a probability measure on the $\sigma$-algebra $\mathcal{F}$, and $\mathbb{Q}_\infty=\mathbb{P}_\infty$.
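The $n^{-1}$ decay assumed in (1.9) can be seen in the simplest mean-field analogue. The sketch below is not oriented percolation: as an assumption made purely for illustration, it uses a critical Galton-Watson process in which each individual has 0 or 2 children with probability 1/2 each, so that the survival probability obeys $\theta_{n+1} = \theta_n - \theta_n^2/2$. Kolmogorov's classical estimate then gives $n\theta_n \to 2/\sigma^2 = 2$ (offspring variance $\sigma^2 = 1$), which plays the role of $1/B$ here. The function name `survival_probabilities` is hypothetical.

```python
def survival_probabilities(n_max):
    """Return theta[0..n_max], theta[n] = P(process is alive at generation n).

    Assumed offspring law (illustration only): 0 or 2 children, each with
    probability 1/2, with generating function f(s) = (1 + s**2) / 2.
    """
    theta = [1.0]
    for _ in range(n_max):
        t = theta[-1]
        # theta_{n+1} = 1 - f(1 - theta_n) = theta_n - theta_n**2 / 2
        theta.append(t - 0.5 * t * t)
    return theta


theta = survival_probabilities(100_000)
scaled = 100_000 * theta[100_000]   # approaches 2 slowly as n grows
```

The convergence of $n\theta_n$ is slow (the correction is of order $(\log n)/n$), so a fairly large $n$ is needed before the limit is visible to several digits.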
We conjecture that the measure $\mathbb{P}_n^{(x)}$ defined by
\[
\mathbb{P}_n^{(x)}(E)=\frac{1}{\tau_n(x)}\mathbb{P}(E\cap\{(0,0)\longrightarrow(x,n)\}) \tag{1.10}
\]
converges to the IIC measure P∞ of Theorem 1.1, for each fixed x ∈ Zd . We are not able to prove this without some strengthening of the local central limit theorem of [13, 14]. Some intuition that supports both this conjecture and the conjecture that P∞ = Q∞ , in general dimensions, is given near the beginning of Sect. 3.1. Of the possible definitions of the incipient infinite cluster for oriented percolation, we find P∞ the easiest to work with and the most closely related to the work of [14] connecting critical oriented percolation and super-Brownian motion. For example, if we let E be the event that (0, 0) −→ (yi , mi ) (i = 1, . . . , s), then the right side of (1.3) involves the probability that the origin is connected to (x, n), as well as to (yi , mi ) (i = 1, . . . , s). The scaling of such (s + 2)-point functions was shown in [14] to be described by related quantities for the canonical measure of super-Brownian motion, for d > 4, p = pc , and L sufficiently large. We will use this scaling in establishing the properties of the IIC measure stated in the following section. The asymptotic formula (1.9) is believed to fail in low dimensions, and our methods do not apply at all for d ≤ 4. Nevertheless, we expect that P∞ and Q∞ exist and are equal in all dimensions.
1.3. Properties of the IIC measure. The Hausdorff dimension of the connected cluster of the origin under the IIC is believed to equal 4 almost surely, for $d+1>4+1$. The following theorem provides a weaker statement, indicating a 4-dimensional aspect to the IIC. In order to be able to state the result, we let
\[
C(0,0)=\{(y,m)\in\mathbb{Z}^d\times\mathbb{Z}_+:(0,0)\longrightarrow(y,m)\} \tag{1.11}
\]
denote the connected cluster of the origin, and let
\[
D_R=\mathbb{E}_\infty\,\#\{(y,m)\in C(0,0):|y|\le R\} \tag{1.12}
\]
denote the expected number of sites in the cluster of the origin that are at most a distance $R$ away from the origin, under $\mathbb{P}_\infty$.

Theorem 1.3. Let $d+1>4+1$ and $p=p_c$. There are $L_0=L_0(d)$ and $C_i=C_i(L,d)>0$ such that for $L\ge L_0$,
\[
C_1R^4\le D_R\le C_2R^4. \tag{1.13}
\]
In Sect. 5.1, where Theorem 1.3 is proved, we will also define the $r$-point functions of $\mathbb{P}_\infty$ and obtain results concerning their asymptotic behaviour. For our next property of $\mathbb{P}_\infty$, we let
\[
N_m=\#\{y\in\mathbb{Z}^d:(0,0)\longrightarrow(y,m)\} \tag{1.14}
\]
denote the number of sites at time $m$ to which the origin is connected. We recall that the size-biased exponential random variable with parameter $\lambda$ has density
\[
f(x)=\lambda^2xe^{-\lambda x}\qquad(x\ge 0). \tag{1.15}
\]
The following theorems describe the distribution of $N_m$ under $\mathbb{P}_\infty$ and $\mathbb{Q}_m$. The constants $A$ and $V$ appearing in their statements are finite positive constants arising in the scaling of the 2- and 3-point functions [14] (see Theorem 4.1 below), while $B$ is the constant in (1.9). The three constants $A$, $V$, $B$ depend on $d$ and $L$.

Theorem 1.4. Let $d+1>4+1$ and $p=p_c$. There is an $L_0=L_0(d)$ such that for $L\ge L_0$,
\[
\lim_{m\to\infty}\mathbb{E}_\infty\Big[\Big(\frac{N_m}{m}\Big)^l\Big]=\Big(\frac{A^2V}{2}\Big)^l(l+1)!\qquad(l=1,2,\ldots). \tag{1.16}
\]
Consequently, under $\mathbb{P}_\infty$, $m^{-1}N_m$ converges weakly to a size-biased exponential random variable with parameter $\lambda=2/(A^2V)$.

Theorem 1.5. Let $d+1>4+1$ and $p=p_c$. Assume that (1.9) holds. Then
\[
B=\frac{AV}{2}. \tag{1.17}
\]
In addition, there is an $L_0=L_0(d)$ such that for $L\ge L_0$,
\[
\lim_{m\to\infty}\mathbb{E}_{\mathbb{Q}_m}\Big[\Big(\frac{N_m}{m}\Big)^l\Big]=\Big(\frac{A^2V}{2}\Big)^l\,l!\qquad(l=1,2,\ldots). \tag{1.18}
\]
Consequently, under $\mathbb{Q}_m$, $m^{-1}N_m$ converges weakly to an exponential random variable with parameter $\lambda=2/(A^2V)$.
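The moment formulas (1.16) and (1.18) are exactly the moments of the size-biased exponential density (1.15) and of the ordinary exponential density with $\lambda = 2/(A^2V)$. The short sketch below checks this numerically. It is an illustration only: the sample values $A = V = 1$ and the function names are hypothetical and are not taken from the paper.

```python
from math import exp, factorial


def size_biased_exp_moment(l, lam):
    """l-th moment of the density lam**2 * x * exp(-lam * x), x >= 0."""
    return factorial(l + 1) / lam ** l


def exp_moment(l, lam):
    """l-th moment of the exponential density lam * exp(-lam * x), x >= 0."""
    return factorial(l) / lam ** l


def numeric_moment(l, density, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of x**l * density(x) on [0, upper]."""
    h = upper / steps
    return h * sum(((k + 0.5) * h) ** l * density((k + 0.5) * h) for k in range(steps))


A, V = 1.0, 1.0                  # hypothetical sample values of the constants
lam = 2.0 / (A ** 2 * V)


def size_biased_density(x):
    return lam ** 2 * x * exp(-lam * x)


def exponential_density(x):
    return lam * exp(-lam * x)
```

For each $l$ the closed forms agree with $(A^2V/2)^l (l+1)!$ and $(A^2V/2)^l\, l!$ respectively, and the midpoint-rule integrals reproduce them to high accuracy.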
The identity (1.17), which holds under the assumption (1.9), expresses a relation between the three constants $B$, $A$ and $V$. It is shown in [14] that $A$ and $V$ both equal $1+O(L^{-d})$, and hence (1.17) implies that $B=\frac12+O(L^{-d})$. Under the assumption that (1.9) holds, it follows from Theorems 1.4–1.5 that $m^{-1}N_m$ converges to a size-biased exponential random variable under $\mathbb{Q}_\infty=\mathbb{P}_\infty$, and to an exponential random variable under $\mathbb{Q}_m$. A similar contrast can be proved for the behaviour of $m^{-1}N_m$ for critical branching random walk (in general dimensions, with an offspring distribution with finite variance), where again the size-biased exponential distribution occurs when the branching random walk is conditioned to survive to infinite time, and the exponential distribution occurs when the branching random walk is conditioned to survive until time $m$. This is consistent with the general philosophy that oriented percolation behaves like the branching random walk above the upper critical dimension $4+1$, as already noted at the end of Sect. 1.1.

Finally, we remark that we will give a formula for $\mathbb{P}_\infty(E)$ in terms of the lace expansion in (2.29) below, when $E\in\mathcal{F}_0$ is a cylinder event.

1.4. Organisation. The remainder of this paper is organised as follows. In Sect. 2 we prove Theorem 1.1, and in Sect. 3 we prove Theorem 1.2. In Sect. 4, we recall the main result of [14] linking critical oriented percolation and super-Brownian motion, and derive some elementary properties of the moment measures of the canonical measure of super-Brownian motion. Using the results of Sect. 4, we then prove Theorems 1.3–1.5 in Sect. 5.

2. Proof of Theorem 1.1

The proof of Theorem 1.1 uses a modification of the Nguyen–Yang lace expansion for oriented percolation [20, 21] (see also [14, Sect. 3]), to derive an expansion for $\mathbb{P}_n(E)$ of (1.3). We derive the modified expansion in Sect. 2.1, and use it to prove Theorem 1.1 in Sect. 2.2.

2.1. The lace expansion for $\mathbb{P}_n$.
Throughout this section, we fix $p\in[0,\|D\|_\infty^{-1}]$ and $m\ge 1$. A cylinder event $E$ is an event that depends on the occupation status of a finite set of bonds $B(E)$. Let $\mathcal{E}_m$ denote the set of cylinder events $E$ for which the maximum time appearing in $B(E)$ is $m$, and fix $E\in\mathcal{E}_m$. Given a bond configuration, we say that a bond $b$ is pivotal for an increasing event $F$ if $F$ occurs when $b$ is made to be occupied and $F$ does not occur when $b$ is made to be vacant. For $E\in\mathcal{E}_m$, $n\ge m$ and $0\le t\le n$, we define
\[
\tau_{n,t}(x;E)=\mathbb{P}(E\cap\{(0,0)\longrightarrow(x,n)\text{ with exactly }t\text{ occupied pivotal bonds}\}), \tag{2.1}
\]
\[
\tau_n(x;E)=\mathbb{P}(E\cap\{(0,0)\longrightarrow(x,n)\})=\sum_{t=0}^n\tau_{n,t}(x;E), \tag{2.2}
\]
where the pivotal bonds are pivotal for the event $F=\{(0,0)\longrightarrow(x,n)\}$. Then (1.3) reads
\[
\mathbb{P}_n(E)=\frac{1}{\tau_n}\sum_{x\in\mathbb{Z}^d}\tau_n(x;E). \tag{2.3}
\]
We write $(x,n)\Rightarrow(y,m)$ to denote the event that $(x,n)$ is doubly-connected to $(y,m)$, i.e., the event that there exist at least two bond-disjoint occupied paths from $(x,n)$ to $(y,m)$, or $(x,n)=(y,m)$. Given a bond $b=((x,n),(y,n+1))$, let $\bar b=(y,n+1)$ be the "top" of $b$, and $\underline{b}=(x,n)$ be the "bottom" of $b$. We will write $\bar b<\bar b'$ to mean that the temporal component of $\bar b$ is less than that of $\bar b'$, and, in an abuse of notation, we write $\bar b\le n$ when the temporal component of $\bar b$ is less than or equal to $n$. For $t\ge 1$, let
\[
\mathcal{B}_t(n)=\{\vec b=(b_1,\ldots,b_t):0<\bar b_1<\cdots<\bar b_t\le n\} \tag{2.4}
\]
denote the ordered vectors of $t$ bonds, between times 0 and $n$. Given $x\in\mathbb{Z}^d$ and $\vec b\in\mathcal{B}_t(n)$, we define $\bar b_0=0$, $\underline{b}_{t+1}=(x,n)$, and
\[
T_t(\vec b,(x,n))=\begin{cases}\{(0,0)\Rightarrow(x,n)\}&(t=0)\\[2pt]\bigcap_{i=1}^t\{b_i\text{ occupied}\}\cap\bigcap_{j=0}^t\{\bar b_j\Rightarrow\underline{b}_{j+1}\}&(1\le t\le n).\end{cases} \tag{2.5}
\]
Note that if $T_t(\vec b,(x,n))$ occurs, then the only possible candidates for occupied pivotal bonds for the event $(0,0)\to(x,n)$ are the elements of $\vec b$. For $0\le s<t$, we define the random variables
\[
K[s,t]=\prod_{s\le i<j\le t}(1+U_{ij}),\qquad U_{ij}=-I[\bar b_i\Rightarrow\underline{b}_{j+1}], \tag{2.6}
\]
and we set $K[s,s]=K[s+1,s]=1$. The product in (2.6) is 0 or 1. If $K[0,t]=1$ and $T_t(\vec b,(x,n))$ occurs, then the occupied pivotal bonds for the event $(0,0)\to(x,n)$ are precisely the elements of $\vec b$. Therefore (2.1) becomes
\[
\tau_{n,t}(x;E)=\begin{cases}\mathbb{P}(E\cap\{(0,0)\Rightarrow(x,n)\})&(t=0)\\[2pt]\sum_{\vec b\in\mathcal{B}_t(n)}\mathbb{E}\big[I[E]\,I[T_t(\vec b,(x,n))]\,K[0,t]\big]&(1\le t\le n).\end{cases} \tag{2.7}
\]
The identity (2.7) can be understood by regarding the cluster of the origin as a "string of sausages" as depicted in Fig. 1, where the "string" is specified by the bonds $\vec b$. The event $E$ occurs before time $m$.

The lace expansion involves a decomposition of $K[0,t]$. To describe this, we need some standard terminology [6, 19]. A graph on an interval $[s,t]$ is a set $\Gamma=\{i_1j_1,\ldots,i_Mj_M\}$ of edges, with $s\le i_l<j_l\le t$ for each $l$. We say that a graph $\Gamma$ is connected on $[s,t]$ if $\bigcup_{ij\in\Gamma}[i,j]=[s,t]$. We denote the set of connected graphs on $[s,t]$ by $\mathcal{G}[s,t]$, and let
\[
J[s,t]=\sum_{\Gamma\in\mathcal{G}[s,t]}\prod_{ij\in\Gamma}U_{ij}. \tag{2.8}
\]
We set $J[0,0]=1$. Expansion of the product in (2.6) gives a sum over all graphs, and a partition of this sum according to the support of the connected component of $m$ leads to the decomposition
\[
K[0,t]=\sum_{s=0}^tM[0,s;m]\,K[s+1,t]\qquad(m\in[0,n]\text{ fixed},\ t\ge 0), \tag{2.9}
\]
where
\[
M[0,s;m]=\sum_{i=0}^sK[0,i-1]\,J[i,s]\,I[\bar b_i\le m\le\underline{b}_{s+1}]. \tag{2.10}
\]
See [24, (2.10)] or [19, Lemma 5.2.5] for more details on (2.9)–(2.10) in the case of a slightly different definition of graph connectivity.

Fig. 1. Schematic depiction of a configuration contributing to $\tau_n(x;E)$ as a "string of sausages." The event $E\in\mathcal{E}_m$ is required to occur

For $l\ge m$, we define
\[
\varphi_{l,s}(v;E)=\begin{cases}\mathbb{P}(E\cap\{(0,0)\Rightarrow(v,l)\})&(s=0)\\[2pt]\sum_{\vec b\in\mathcal{B}_s(l)}\mathbb{E}\big[I[E]\,I[T_s(\vec b,(v,l))]\,M[0,s;m]\big]&(1\le s\le l)\end{cases} \tag{2.11}
\]
with $\underline{b}_{s+1}=(v,l)$, and
\[
\varphi_l(E)=\sum_{v\in\mathbb{Z}^d}\sum_{s=0}^l\varphi_{l,s}(v;E). \tag{2.12}
\]
Although it is not explicit in the notation, $\varphi_{l,s}(v;E)$ and $\varphi_l(E)$ depend on $m$ by definition. In particular, we are restricting to $E\in\mathcal{E}_m$. The following lemma relates $\tau_{n,t}(x;E)$ and $\varphi_{l,s}(u;E)$.

Lemma 2.1. For $E\in\mathcal{E}_m$, $n\ge m$, and $0\le t\le n$,
\[
\tau_{n,t}(x;E)=\sum_{(u,v)}\sum_{l=m}^{n-1}\sum_{s=0}^{t-1}\varphi_{l,s}(u;E)\,pD(v-u)\,\tau_{n-l-1,t-s-1}(x-v)+\varphi_{n,t}(x;E), \tag{2.13}
\]
where the first term on the right side is interpreted as zero when $t=0$.
Proof. The proof is a standard lace expansion argument. For $t=0$, (2.13) follows immediately from (2.7) and (2.11). For $t\ge 1$, we substitute (2.9) into (2.7). The $s=t$ term of (2.9) gives rise to the second term on the right side of (2.13). It therefore remains to show that
\[
\sum_{\vec b\in\mathcal{B}_t(n)}\sum_{s=0}^{t-1}\mathbb{E}\big[I[E]\,I[T_t(\vec b,(x,n))]\,M[0,s;m]\,K[s+1,t]\big] \tag{2.14}
\]
is equal to the first term on the right side of (2.13). For this, given $\vec b$ and $s$, we decompose the random variables appearing in (2.14) into the three factors:
\[
I[E]\,I\Big[\bigcap_{r=1}^s\{b_r\text{ occupied}\}\cap\bigcap_{r=0}^s\{\bar b_r\Rightarrow\underline{b}_{r+1}\}\Big]M[0,s;m], \tag{2.15}
\]
\[
I[b_{s+1}\text{ occupied}], \tag{2.16}
\]
\[
I\Big[\bigcap_{r=s+2}^t\{b_r\text{ occupied}\}\cap\bigcap_{r=s+1}^t\{\bar b_r\Rightarrow\underline{b}_{r+1}\}\Big]K[s+1,t]. \tag{2.17}
\]
These random variables depend on bonds below $\underline{b}_{s+1}$, between $\underline{b}_{s+1}$ and $\bar b_{s+1}$, and above $\bar b_{s+1}$, respectively. Recalling (2.7) and (2.11), we see that the expectation factors to give the first term on the right side of (2.13). In (2.13), $l$ corresponds to the temporal component of $\underline{b}_{s+1}$, while $u$ and $v$ are the lower and upper spatial components of $b_{s+1}$.

Summation over $t=0,\ldots,n$ and $x\in\mathbb{Z}^d$ in (2.13) gives
\[
\sum_{x\in\mathbb{Z}^d}\tau_n(x;E)=\sum_{l=m}^{n-1}\varphi_l(E)\,p\,\tau_{n-l-1}+\varphi_n(E). \tag{2.18}
\]
With (2.3), this gives the expansion
\[
\mathbb{P}_n(E)=\frac{1}{\tau_n}\Big[\sum_{l=m}^{n-1}\varphi_l(E)\,p\,\tau_{n-l-1}+\varphi_n(E)\Big]. \tag{2.19}
\]
Next, we rewrite $\varphi_l(E)$ in terms of laces. A lace on $[k,l]$ is an element of $\mathcal{G}[k,l]$ such that the removal of any edge will result in a disconnected graph. Given a connected graph $\Gamma\in\mathcal{G}[k,l]$, we define the lace $L_\Gamma\subset\Gamma$ to be the graph consisting of edges $s_1t_1,s_2t_2,\ldots$ given by
\[
\begin{aligned}
s_1&=k,&t_1&=\max\{t:kt\in\Gamma\},\\
t_{i+1}&=\max\{t:\exists s\le t_i\text{ such that }st\in\Gamma\},&s_{i+1}&=\min\{s:st_{i+1}\in\Gamma\}.
\end{aligned} \tag{2.20}
\]
It is not difficult to check that $L_\Gamma$ is indeed a lace. Given a lace $L$, let $\mathcal{C}(L)$ denote the set of compatible edges, i.e., the set of edges $ij$ such that $L_{L\cup\{ij\}}=L$. Define $\mathcal{L}^{(N)}[k,l]$ to be the set of laces on the interval $[k,l]$ consisting of exactly $N$ edges. It is then a standard fact [6, 19] that
\[
J[i,j]=\sum_{N=1}^\infty(-1)^NJ^{(N)}[i,j]\qquad(j>i\ge 0), \tag{2.21}
\]
with
\[
J^{(N)}[i,j]=\sum_{L\in\mathcal{L}^{(N)}[i,j]}\ \prod_{st\in L}(-U_{st})\prod_{s't'\in\mathcal{C}(L)}(1+U_{s't'}). \tag{2.22}
\]
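The prescription (2.20) and the resummation (2.21)–(2.22) can be checked by brute force on short intervals. The following Python sketch is an illustration under stated assumptions rather than part of the paper: all function names are hypothetical, edges are represented as integer pairs $(i, j)$ with $i < j$, connectivity is the closed-interval notion used for (2.8), and the $U_{ij}$ are replaced by arbitrary integers so that both sides of (2.21) can be compared exactly.

```python
from itertools import combinations


def is_connected(graph, k, l):
    """True if the union of the closed intervals [i, j], ij in graph, equals [k, l]."""
    reach = k
    for i, j in sorted(graph):
        if i > reach:              # the points strictly between reach and i are uncovered
            return False
        reach = max(reach, j)
    return bool(graph) and reach == l


def connected_graphs(k, l):
    """Yield every connected graph on [k, l] as a tuple of edges."""
    all_edges = list(combinations(range(k, l + 1), 2))
    for r in range(1, len(all_edges) + 1):
        for graph in combinations(all_edges, r):
            if is_connected(graph, k, l):
                yield graph


def lace(graph, k, l):
    """Lace L_Gamma of a connected graph on [k, l], following (2.20)."""
    t = max(j for i, j in graph if i == k)
    edges = [(k, t)]
    while t < l:
        t_next = max(j for i, j in graph if i <= t)
        s_next = min(i for i, j in graph if j == t_next)
        edges.append((s_next, t_next))
        t = t_next
    return edges


def compatible_edges(L, k, l):
    """Edges ij not in the lace L whose addition leaves the lace unchanged."""
    candidates = combinations(range(k, l + 1), 2)
    return [e for e in candidates
            if e not in L and set(lace(list(L) + [e], k, l)) == set(L)]


def J_direct(U, k, l):
    """J[k, l] as in (2.8): sum over connected graphs of the product of U_ij."""
    total = 0
    for graph in connected_graphs(k, l):
        prod = 1
        for e in graph:
            prod *= U[e]
        total += prod
    return total


def J_via_laces(U, k, l):
    """J[k, l] as in (2.21)-(2.22): sum over laces and their compatible edges."""
    laces = {tuple(lace(g, k, l)) for g in connected_graphs(k, l)}
    total = 0
    for L in laces:
        term = 1
        for e in L:
            term *= -U[e]
        for e in compatible_edges(L, k, l):
            term *= 1 + U[e]
        total += (-1) ** len(L) * term
    return total
```

For example, the graph $\{01, 02, 13\}$ on $[0,3]$ has lace $\{02, 13\}$, and the two ways of computing $J[0,3]$ agree for any choice of the $U_{ij}$. The enumeration is exponential in the number of edges, so the sketch is only meant for very short intervals.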
For $l\ge m$, we define
\[
\varphi_l^{(0)}(E)=\sum_{v\in\mathbb{Z}^d}\mathbb{P}(E\cap\{(0,0)\Rightarrow(v,l)\})+\sum_{s=1}^m\sum_{v\in\mathbb{Z}^d}\sum_{\vec b\in\mathcal{B}_s(l)}\mathbb{E}\big[I[E]\,I[T_s(\vec b,(v,l))]\,K[0,s-1]\,I[\bar b_s\le m\le\underline{b}_{s+1}]\big], \tag{2.23}
\]
which combines the first line of (2.11) for $s=0$ with the contribution to the second line of (2.11) due to $i=s$ in the definition of $M[0,s;m]$ in (2.10). (The upper limit of the sum over $s$ in (2.23) can be taken to be $m$ rather than $l$ because the restriction $\bar b_s\le m$ can occur only when $s\le m$.) For $N\ge 1$ and $l\ge m$, we also define
\[
\varphi_l^{(N)}(E)=\sum_{s=1}^l\sum_{v\in\mathbb{Z}^d}\sum_{\vec b\in\mathcal{B}_s(l)}\mathbb{E}\Big[I[E]\,I[T_s(\vec b,(v,l))]\sum_{i=0}^{s-1}K[0,i-1]\,J^{(N)}[i,s]\,I[\bar b_i\le m\le\underline{b}_{s+1}]\Big]. \tag{2.24}
\]
It follows from (2.10)–(2.12) and (2.21) that
\[
\varphi_l(E)=\sum_{N=0}^\infty(-1)^N\varphi_l^{(N)}(E). \tag{2.25}
\]
Equations (2.19) and (2.23)–(2.25) constitute the lace expansion for $\mathbb{P}_n$.

2.2. Estimates on the lace expansion for $\mathbb{P}_n$. Throughout this section, we fix $p=p_c$. It follows from [14, Theorem 1.1(a)] that, under the hypotheses of Theorem 1.1, there is an $A\in(0,\infty)$ such that
\[
\lim_{n\to\infty}\tau_n=A. \tag{2.26}
\]
To prove Theorem 1.1, we will use (2.19), (2.26) and the following lemma. We write $\beta=L^{-d}$, and recall from [14] that $p_c=1+O(\beta)$ for $d+1>4+1$.

Lemma 2.2. Let $d+1>4+1$, $p=p_c$ and $E\in\mathcal{E}_m$. There are $K=K(d)$ and $L_0=L_0(d)$ such that for $L\ge L_0$,
\[
|\varphi_l(E)|\le Km\beta(l-m+1)^{-d/2}\qquad(l\ge m+1). \tag{2.27}
\]
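Proof of Theorem 1.1 subject to Lemma 2.2. Let $E\in\mathcal{E}_m$. By (2.19),
\[
\mathbb{P}_\infty(E)=\lim_{n\to\infty}\mathbb{P}_n(E)=\lim_{n\to\infty}\frac{1}{\tau_n}\Big[\sum_{l=m}^{n-1}\varphi_l(E)\,p_c\,\tau_{n-l-1}+\varphi_n(E)\Big]. \tag{2.28}
\]
It therefore follows from (2.26), Lemma 2.2 and the dominated convergence theorem that
\[
\mathbb{P}_\infty(E)=p_c\sum_{l=m}^\infty\varphi_l(E)\qquad(E\in\mathcal{E}_m). \tag{2.29}
\]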
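The passage from (2.28) to (2.29) only uses that $\tau_n$ converges and that $\varphi_l(E)$ is summable in $l$, which the bound (2.27) guarantees since $d/2 > 1$. The sketch below illustrates this mechanism with synthetic sequences. Everything in it is an assumption made for illustration: `phi` saturates the bound (2.27) with $K m \beta$ set to 1, `tau` is an arbitrary sequence converging to a hypothetical $A = 1.3$, and the values of $p$, $m$, $d$ are sample values, not quantities from the paper.

```python
def expansion_value(n, m, p, phi, tau):
    """Right side of (2.19) for given sequences phi(l) and tau(n)."""
    total = sum(phi(l) * p * tau(n - l - 1) for l in range(m, n))
    return (total + phi(n)) / tau(n)


# Hypothetical sample data (not from the paper).
m, d, p, A = 1, 5, 1.0, 1.3


def phi(l):
    return (l - m + 1) ** (-d / 2)          # decays like the bound (2.27)


def tau(n):
    return A * (1.0 + 1.0 / (n + 1))        # converges to A, as in (2.26)


limit = p * sum(phi(l) for l in range(m, 200_000))   # right side of (2.29), truncated
approx = expansion_value(5_000, m, p, phi, tau)
```

With these synthetic inputs the finite-$n$ expression is already very close to $p\sum_{l\ge m}\varphi_l$, and the constant $A$ drops out of the limit, as it does in (2.29).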
This proves existence of and gives a formula for the limit (1.4), for every cylinder event E ∈ F0 . To complete the proof of Theorem 1.1, it remains to show that P∞ can be extended to a probability measure on the σ -algebra F, and that the origin is almost surely connected to infinity under this extension. The extension of P∞ to F follows from Kolmogorov’s extension theorem (see e.g. [23]), since the consistency hypothesis of Kolmogorov’s extension theorem is satisfied by definition of Pn (E) and P∞ (E) in (1.3)–(1.4). In addition, Pn ((0, 0) −→ N ) = 1 for every n ≥ N , so P∞ ((0, 0) −→ N ) = 1 for every N ≥ 1, and hence P∞ ((0, 0) −→ ∞) = limN →∞ P∞ ((0, 0) −→ N ) = 1. Proof of Lemma 2.2. Fix m, E ∈ Em , and l ≥ m + 1. The proof involves a comparison of ϕl(N) (E) with quantities arising in the Nguyen–Yang lace expansion for the two-point function [20]. We use the notation and results of [14, Sects. 3.2 and 4.4]; this notation is not identical to that of [20]. Quantities 3(N) n (x) are defined in [14, Sect. 3.2] by P((0, 0) ⇒ (x, n)) − δx,0 δn,0 (N = 0) (N) (2.30) 3n (x) = n (N) E I [T ( b, (x, n))]J [0, s] (N ≥ 1). s s (n) s=1 b∈B Disjoint connections implied by the right side of (2.30) are depicted in Fig. 2. We will use the fact, proved in [14, (4.57)], that N N∨1 3(N) (n + 1)−d/2 (N ≥ 0) (2.31) n (x) ≤ C β x∈Zd
assuming the hypotheses of Theorem 1.1. Our assumption that $d > 4$ is used only in invoking (2.31).

We consider first the case $N = 0$. Recall the definition of $\phi_l^{(0)}(E)$ in (2.23). Because $I[E] \le 1$, the first term on the right side of (2.23) is bounded above by $\sum_v \pi_l^{(0)}(v)$, which is at most $C\beta(l+1)^{-d/2}$ by (2.31). The second term on the right side of (2.23) is bounded above by
\[ \sum_{s=1}^{m} \sum_{v \in \mathbb{Z}^d} \sum_{\vec{b} \in \mathcal{B}_s(l)} \mathbb{E}\big[ I[T_s(\vec{b},(v,l))]\, K[0,s-1]\, I[\bar{b}_s \le m \le \underline{b}_{s+1}] \big] = \sum_{a=0}^{m-1} \sum_{w,y,v \in \mathbb{Z}^d} \tau_a(y)\, p_c D(w-y)\, \pi^{(0)}_{l-a-1}(v-w) = \sum_{a=0}^{m-1} \tau_a\, p_c \sum_{v \in \mathbb{Z}^d} \pi^{(0)}_{l-a-1}(v), \tag{2.32} \]
Fig. 2. Schematic depiction of disjoint connections required by $\pi_n^{(N)}(x)$ ($N = 0, 1, 2$)
where we have factored the expectation as in the proof of Lemma 2.1, and where $b_s$ in the first line corresponds to $(y, a)$ in the second line. Therefore, letting $C$ denote a generic constant and using (2.26) and (2.31), we get
\[ \phi_l^{(0)}(E) \le C\beta(l+1)^{-d/2} + C\beta \sum_{a=0}^{m-1} (l-a)^{-d/2} \le C\beta\, m\, (l-m+1)^{-d/2}. \tag{2.33} \]
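The last inequality in (2.33) rests on the elementary fact that each of the $m+1$ terms is at most $(l-m+1)^{-d/2}$, with the factor $m+1 \le 2m$ absorbed into the generic constant. The following is a minimal numerical sanity check of that step, not part of the proof; the function names and the test ranges for $d$, $m$, $l$ are illustrative assumptions.

```python
# Sanity check of the elementary step behind (2.33).
# All names and parameter ranges here are hypothetical, chosen for illustration.

def lhs(l: int, m: int, d: int) -> float:
    """(l+1)^(-d/2) + sum_{a=0}^{m-1} (l-a)^(-d/2), with the common factor C*beta dropped."""
    return (l + 1) ** (-d / 2) + sum((l - a) ** (-d / 2) for a in range(m))


def rhs(l: int, m: int, d: int) -> float:
    """(m+1) * (l-m+1)^(-d/2): each of the m+1 terms in lhs is at most (l-m+1)^(-d/2)."""
    return (m + 1) * (l - m + 1) ** (-d / 2)


def bound_holds(d: int, m_max: int = 12, l_max: int = 200) -> bool:
    """Check lhs <= rhs for 1 <= m <= m_max and m+1 <= l <= l_max (small float tolerance)."""
    return all(
        lhs(l, m, d) <= rhs(l, m, d) * (1 + 1e-12)
        for m in range(1, m_max + 1)
        for l in range(m + 1, l_max + 1)
    )


if __name__ == "__main__":
    for d in (5, 6, 7):
        print(d, bound_holds(d))
```

Since $m + 1 \le 2m$ for $m \ge 1$, the check is consistent with the form $C\beta m (l-m+1)^{-d/2}$ after doubling the constant.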
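We next consider the case $N \ge 1$. Applying the inequality $I[E] \le 1$ in (2.24), we get
\[ \phi_l^{(N)}(E) \le \sum_{s=1}^{l} \sum_{v \in \mathbb{Z}^d} \sum_{\vec{b} \in \mathcal{B}_s(l)} \mathbb{E}\Big[ I[T_s(\vec{b},(v,l))] \sum_{i=0}^{s-1} K[0,i-1]\, J^{(N)}[i,s]\, I[\bar{b}_i \le m \le \underline{b}_{s+1}] \Big]. \tag{2.34} \]
We may then factor the random variables on the right side into factors depending on bonds below $\underline{b}_i$, between $\underline{b}_i$ and $\bar{b}_i$, and above $\bar{b}_i$, respectively, as in the proof of Lemma 2.1. This leads to
\[ \phi_l^{(N)}(E) \le \sum_{v \in \mathbb{Z}^d} \pi_l^{(N)}(v) + \sum_{a=0}^{m-1} \tau_a\, p_c \sum_{v \in \mathbb{Z}^d} \pi^{(N)}_{l-a-1}(v) \qquad (N \ge 1), \tag{2.35} \]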
where the terms on the right side correspond to the contributions to (2.34) due to $i = 0$ and $i > 0$, respectively. Applying (2.31) and (2.26), we get
\[ \phi_l^{(N)}(E) \le C^N \beta^N (l+1)^{-d/2} + C_1 C^N \beta^N \sum_{a=0}^{m-1} (l-a)^{-d/2} \le C_1^N \beta^N m\, (l-m+1)^{-d/2} \qquad (N \ge 1). \tag{2.36} \]
Combination of (2.25), (2.33) and (2.36) completes the proof. The factor $\beta^N$ permits the sum over $N$ to be performed, for $\beta$ sufficiently small.

3. Proof of Theorem 1.2

The proof of Theorem 1.2 uses a lace expansion for $\mathbb{Q}_n(E)$ defined in (1.6). This expansion is again a modification of the Nguyen–Yang lace expansion for oriented percolation, but is different from the expansion of Sect. 2.1. We derive the modified expansion in Sect. 3.1, and use it to prove existence of the measure $\mathbb{Q}_\infty$ in Sect. 3.2. We will derive the same formula for $\mathbb{Q}_\infty(E)$ as was obtained in (2.29) for $\mathbb{P}_\infty(E)$, thereby proving $\mathbb{Q}_\infty = \mathbb{P}_\infty$.

3.1. The lace expansion for $\mathbb{Q}_n$. Throughout this section, we fix $p \in [0, \|D\|_\infty^{-1}]$ and $m \ge 0$. Recall from (1.5), (1.6) and (1.8) that $S_n = \{(0,0) \longrightarrow n\}$, $\theta_n = \mathbb{P}(S_n)$, and $\mathbb{Q}_n(E) = \theta_n^{-1}\, \mathbb{P}(E \cap S_n)$. For $E \in \mathcal{E}_m$, $n \ge m$, and $0 \le t \le n$, we define
\[ \theta_{n,t}(E) = \mathbb{P}(E \cap \{(0,0) \longrightarrow n \text{ with exactly } t \text{ occupied pivotal bonds}\}), \tag{3.1} \]
\[ \theta_n(E) = \mathbb{P}(E \cap \{(0,0) \longrightarrow n\}) = \sum_{t=0}^{n} \theta_{n,t}(E), \tag{3.2} \]
where the pivotal bonds are pivotal for the event $S_n$. Then (1.6) reads
\[ \mathbb{Q}_n(E) = \frac{\theta_n(E)}{\theta_n}. \tag{3.3} \]
We will obtain formulas for $\theta_{n,t}(E)$ and $\mathbb{Q}_n(E)$ analogous to (2.7) and (2.19). We again regard the cluster of the origin in a configuration contributing to $\theta_n(E)$ as a string of sausages, but now the top sausage may be open at the top, as depicted in Fig. 3.

Before beginning the expansion, with the help of Figs. 1 and 3 we provide some intuition supporting the conjecture that $\mathbb{Q}_n$, $\mathbb{P}_n$ and $\mathbb{P}_n^{(x)}$ of (1.10) all converge to the same limiting measure, in arbitrary dimensions. The basic idea is that the number of pivotal bonds for the event $\{(0,0) \longrightarrow (x,n)\}$ should diverge with $n$, so that the top sausage in Fig. 1 begins near $n$, far beyond $m$. In the limit $n \to \infty$, the $x$-dependence inherent in locating the top of the top sausage in Fig. 1 at $(x,n)$ should be of no importance for an event $E \in \mathcal{E}_m$ with $m$ fixed. Thus we expect the same limit whether $x$ is fixed as in $\mathbb{P}_n^{(x)}$ or summed over as in $\mathbb{P}_n$. Similarly, the number of pivotal bonds for the event $S_n$ should diverge with $n$, so that the top sausage in Fig. 3 begins far above $m$. In the limit $n \to \infty$, the fact that the top sausage is open, rather than closed at some $(x,n)$, should be irrelevant for an event $E \in \mathcal{E}_m$. This supports the statement that $\mathbb{Q}_\infty = \mathbb{P}_\infty$.

To begin to set up the expansion, we let $(w,k) \Rightarrow n$ denote the event that there exist $x, y \in \mathbb{Z}^d$ with bond-disjoint paths from $(w,k)$ to $(x,n)$ and from $(w,k)$ to $(y,n)$. Given $t > 0$ and $\vec{b} \in \mathcal{B}_t(n)$, we again set $\bar{b}_0 = (0,0)$ and $\underline{b}_{t+1} = n$. We define
\[ U_{ij}(t) = U_{ij} \quad (0 \le i < j \le t-1), \qquad U_{it}(t) = -I[\bar{b}_i \Rightarrow n] \quad (0 \le i \le t-1). \tag{3.4} \]
As in (2.6), (2.8) and (2.10), for $0 \le i \le j \le t$ we define
\[ K_t[i,j] = \prod_{i \le i' < j' \le j} (1 + U_{i'j'}(t)), \qquad J_t[i,j] = \sum_{\Gamma \in \mathcal{G}[i,j]} \prod_{i'j' \in \Gamma} U_{i'j'}(t), \tag{3.5} \]
Fig. 3. Schematic depiction of a configuration contributing to $\theta_n(E)$ as a string of sausages, with the top sausage open at the top. The event $E \in \mathcal{E}_m$ is required to occur
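and
\[ M_t[0,s;m] = \sum_{i=0}^{s} K_t[0,i-1]\, J_t[i,s]\, I[\bar{b}_i \le m \le \underline{b}_{s+1}]. \tag{3.6} \]
For $\vec{b} \in \mathcal{B}_t(n)$, we define
\[ T_t(\vec{b}, n) = \begin{cases} \{(0,0) \Rightarrow n\} & (t = 0), \\ \bigcap_{i=1}^{t} \{b_i \text{ occupied}\} \cap \bigcap_{j=0}^{t-1} \{\bar{b}_j \Rightarrow \underline{b}_{j+1}\} \cap \{\bar{b}_t \Rightarrow n\} & (1 \le t \le n). \end{cases} \tag{3.7} \]
As in (2.7), (3.1) then becomes
\[ \theta_{n,t}(E) = \begin{cases} \mathbb{P}(E \cap \{(0,0) \Rightarrow n\}) & (t = 0), \\ \sum_{\vec{b} \in \mathcal{B}_t(n)} \mathbb{E}\big[ I[E]\, I[T_t(\vec{b},n)]\, K_t[0,t] \big] & (1 \le t \le n). \end{cases} \tag{3.8} \]
For $0 \le s \le t$, we define
\[ \varphi_{l,s}(E) = \begin{cases} \mathbb{P}(E \cap \{(0,0) \Rightarrow n\}) & (s = 0), \\ \sum_{\vec{b} \in \mathcal{B}_s(l)} \mathbb{E}\big[ I[E]\, I[T_s(\vec{b},l)]\, M_t[0,s;m] \big] & (1 \le s \le l). \end{cases} \tag{3.9} \]
It then follows exactly as in the proof of Lemma 2.1 that
\[ \theta_{n,t}(E) = \sum_{l=m}^{n-1} \sum_{s=0}^{t-1} \varphi_{l,s}(E)\, p\, \theta_{n-l-1,t-s-1} + \varphi_{n,t}(E) \qquad (0 \le t \le n), \tag{3.10} \]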
where the first term on the right side is interpreted as zero when $t = 0$. Since $U_{ij}(t) = U_{ij}$ when $0 \le i < j < t$ by (3.4), it follows from (2.6), (2.8), (2.10)–(2.11), (3.5)–(3.6) and (3.9) that
\[ \varphi_{l,s}(E) = \phi_{l,s}(E) \qquad (0 \le s \le t-1). \tag{3.11} \]
Therefore, $\varphi_{l,s}$ in (3.10) can be replaced with $\phi_{l,s}$, except $\varphi_{n,t}(E)$. Summation of (3.10) over $t = 0, \dots, n$, after this replacement, then gives
\[ \theta_n(E) = \sum_{l=m}^{n-1} \phi_l(E)\, p\, \theta_{n-l-1} + \varphi_n(E), \tag{3.12} \]
with
\[ \varphi_n(E) = \sum_{t=0}^{n} \varphi_{n,t}(E). \tag{3.13} \]
Combining (3.3) with (3.12), we get
\[ \mathbb{Q}_n(E) = \frac{1}{\theta_n} \Big[ \sum_{l=m}^{n-1} \phi_l(E)\, p\, \theta_{n-l-1} + \varphi_n(E) \Big], \tag{3.14} \]
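which is analogous to (2.19). Finally, we rewrite the expansion for $\varphi_n(E)$ in terms of laces, as in (2.23)–(2.25). This yields
\[ \varphi_n(E) = \sum_{N=0}^{\infty} (-1)^N \varphi_n^{(N)}(E) \tag{3.15} \]
with
\[ \varphi_n^{(0)}(E) = \mathbb{P}_p(E \cap \{(0,0) \Rightarrow n\}) + \sum_{t=1}^{n} \sum_{\vec{b} \in \mathcal{B}_t(n)} \mathbb{E}\big[ I[E]\, I[T_t(\vec{b},n)]\, K_t[0,t-1]\, I[\bar{b}_t \le m] \big], \tag{3.16} \]
\[ \varphi_n^{(N)}(E) = \sum_{t=1}^{n} \sum_{\vec{b} \in \mathcal{B}_t(n)} \mathbb{E}\Big[ I[E]\, I[T_t(\vec{b},n)] \sum_{i=0}^{t-1} K_t[0,i-1]\, J_t^{(N)}[i,t]\, I[\bar{b}_i \le m] \Big] \qquad (N \ge 1). \tag{3.17} \]
Here, $J_t^{(N)}[i,t]$ is obtained after replacing $U_{ij}$ by $U_{ij}(t)$ in (2.22). Equations (3.14)–(3.17), in combination with (2.23)–(2.25), constitute the lace expansion for $\mathbb{Q}_n$.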
3.2. Estimates on the lace expansion for $\mathbb{Q}_n$. Throughout this section, we fix $p = p_c$. To prove Theorem 1.2, we will use (1.9), (3.14) and the following lemma.

Lemma 3.1. Let $d + 1 > 4 + 1$, $p = p_c$ and $E \in \mathcal{E}_m$. Assume (1.9). There is an $L_0 = L_0(d)$ such that for $L \ge L_0$,
\[ \lim_{n\to\infty} \frac{\varphi_n(E)}{\theta_n} = 0. \tag{3.18} \]
Proof of Theorem 1.2 subject to Lemma 3.1. Let $E \in \mathcal{E}_m$. By (3.14),
\[ \mathbb{Q}_\infty(E) = \lim_{n\to\infty} \mathbb{Q}_n(E) = \lim_{n\to\infty} \frac{1}{\theta_n} \Big[ \sum_{l=m}^{n-1} \phi_l(E)\, p_c\, \theta_{n-l-1} + \varphi_n(E) \Big]. \tag{3.19} \]
The second term vanishes in the limit, by Lemma 3.1. Given a small $a > 0$, we decompose the first term as
\[ \sum_{l=m}^{n^{1-a}} \phi_l(E)\, p_c\, \frac{\theta_{n-l-1}}{\theta_n} + \sum_{l=n^{1-a}+1}^{n-1} \phi_l(E)\, p_c\, \frac{\theta_{n-l-1}}{\theta_n}. \tag{3.20} \]
Using Lemma 2.2 to bound $\phi_l(E)$ and (1.9) to bound the ratio of survival probabilities, we find that the second term in (3.20) is bounded above by
\[ K m \beta \sum_{l=n^{1-a}+1}^{n-1} (l-m+1)^{-d/2}\, O(n), \tag{3.21} \]
which vanishes in the limit $n \to \infty$ when $d > 4$ and $a$ is sufficiently small. Similarly, the first term in (3.20) can be analysed using Lemma 2.2, (1.9) and the dominated convergence theorem. This leads to the conclusion that
\[ \mathbb{Q}_\infty(E) = p_c \sum_{l=m}^{\infty} \phi_l(E). \tag{3.22} \]
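The requirement that $a$ be sufficiently small in (3.21) can be made concrete by power counting: the tail sum behaves like $n^{(1-a)(1-d/2)}$, so the bound scales like $n^{1-(1-a)(d-2)/2}$, which decays exactly when $a < (d-4)/(d-2)$. The sketch below illustrates this under the simplifying assumption that the $O(n)$ factor is replaced by $n$ itself and the prefactor $Km\beta$ is dropped; the function names are hypothetical.

```python
# Power counting for the bound (3.21), with O(n) replaced by n and K*m*beta dropped.
# Function names and test parameters are illustrative assumptions.

def decay_exponent(d: int, a: float) -> float:
    """Exponent e such that n * sum_{l > n^(1-a)} l^(-d/2) scales like n^e."""
    return 1.0 - (1.0 - a) * (d - 2) / 2.0


def max_admissible_a(d: int) -> float:
    """The exponent is negative exactly when a < (d-4)/(d-2)."""
    return (d - 4) / (d - 2)


def tail_term(n: int, d: int, a: float, m: int = 1) -> float:
    """n * sum_{l = n^(1-a)+1}^{n-1} (l-m+1)^(-d/2), evaluated directly."""
    start = int(n ** (1 - a)) + 1
    return n * sum((l - m + 1) ** (-d / 2) for l in range(start, n))


if __name__ == "__main__":
    d, a = 6, 0.1
    print(decay_exponent(d, a), max_admissible_a(d))
    for n in (10**3, 10**4):
        print(n, tail_term(n, d, a))
```

For $d = 5$ the threshold is $a < 1/3$, and it approaches $1$ as $d$ grows.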
Comparing with the formula (2.29) for $\mathbb{P}_\infty(E)$, we see that the limit defining $\mathbb{Q}_\infty(E)$ exists for every cylinder event $E$, and that it is equal to $\mathbb{P}_\infty(E)$. In view of Theorem 1.1, this proves Theorem 1.2.

Proof of Lemma 3.1. The proof is somewhat technical. We start by bounding $\varphi_n^{(0)}(E)$, defined in (3.16). In all our estimates, we will use $I[E] \le 1$. The first term on the right side of (3.16) is bounded above by $\theta_n^2$, via the BK inequality. The second term on the right side of (3.16) is bounded above by
\[ \sum_{a=0}^{m-1} \sum_{u,v \in \mathbb{Z}^d} \mathbb{P}\big( (0,0) \longrightarrow (u,a) \longrightarrow (v,a+1) \Rightarrow n \big), \tag{3.23} \]
where $((u,a),(v,a+1))$ represents the bond $b_t$. By the BK inequality, this is bounded above by $\sum_{a=0}^{m-1} \tau_a\, p_c\, \theta_{n-a-1}^2$, and therefore, using (2.26) and the monotonicity of $\theta_n$, we get
\[ \frac{\varphi_n^{(0)}(E)}{\theta_n} \le \theta_n + C m\, \frac{\theta_{n-m}^2}{\theta_n}. \tag{3.24} \]
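By (1.9), this goes to zero as $n \to \infty$. Next we bound $\varphi_n^{(N)}(E)$ for $N \ge 1$, defined in (3.17). For $N \ge 1$, let
\[ \Phi_n^{(N)} = \sum_{t=1}^{n} \sum_{\vec{b} \in \mathcal{B}_t(n)} \mathbb{E}\big[ I[T_t(\vec{b},n)]\, J_t^{(N)}[0,t] \big], \tag{3.25} \]
where, as explained under (3.17),
\[ J_t^{(N)}[0,t] = \sum_{L \in \mathcal{L}^{(N)}[0,t]} \prod_{ij \in L} (-U_{ij}(t)) \prod_{i'j' \in \mathcal{C}(L)} (1 + U_{i'j'}(t)). \tag{3.26} \]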
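Using $I[E] \le 1$ in (3.17), and then factoring as in the proof of Lemma 2.1, we obtain the estimate
\[ \varphi_n^{(N)}(E) \le \Phi_n^{(N)} + \sum_{a=0}^{m-1} \tau_a\, p_c\, \Phi_{n-a-1}^{(N)} \le \Phi_n^{(N)} + C \sum_{a=0}^{m-1} \Phi_{n-a-1}^{(N)} \le C \sum_{a=0}^{m} \Phi_{n-a}^{(N)}. \tag{3.27} \]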
Consider first the case $N = 1$. The unique lace in $\mathcal{L}^{(1)}[0,t]$ is $0t$, and hence $J_t^{(1)}[0,t]$ contains a factor $-U_{0t}(t)$, which implies that $(0,0) \Rightarrow n$. The event $T_t(\vec{b},n)$ implies connections $(0,0) \Rightarrow \underline{b}_1 \longrightarrow \bar{b}_1 \longrightarrow n$. Moreover, the factor $1 + U_{1t}(t)$ in the product over $\mathcal{C}(L)$ in $J_t^{(1)}[0,t]$ implies that $\bar{b}_1$ is not doubly-connected to $n$. Thus $\Phi_n^{(1)}$ is bounded above by the probability of the disjoint connections depicted in Fig. 4. Using the BK inequality, we therefore get
\[ \Phi_n^{(1)} \le \sum_{j=0}^{n} \sum_{i=0}^{j} \sum_{x,y \in \mathbb{Z}^d} \tau_j(x)\, \tau_i(y)\, \tau_{j-i}(x-y)\, \theta_{n-j}\, \theta_{n-i} \le \sum_{j=0}^{n} \theta_{n-j}^2 \sum_{i=0}^{j} \sum_{x,y \in \mathbb{Z}^d} \tau_j(x)\, \tau_i(y)\, \tau_{j-i}(x-y), \tag{3.28} \]
where we used the monotonicity of $\theta_n$ in the second inequality. The right side of (3.28) can be easily bounded from above by using the methods and results of [14]. In fact, since $\|\tau_n\|_\infty \le K(n+1)^{-d/2}$ by [14, Theorem 1.1(c)], it follows from (2.26) that
\[ \Phi_n^{(1)} \le \sum_{j=0}^{n} \theta_{n-j}^2 \sum_{i=0}^{j} \|\tau_j\|_\infty\, \tau_i\, \tau_{j-i} \le C \sum_{j=0}^{n} \theta_{n-j}^2\, (j+1)^{-(d-2)/2}. \tag{3.29} \]
Using (1.9), we find that
\[ \Phi_n^{(1)} \le C \sum_{j=0}^{n} (n-j+1)^{-2} (j+1)^{-(d-2)/2} \le C (n+1)^{-(2 \wedge (d-2)/2)}. \tag{3.30} \]
Therefore, (1.9) and (3.27) yield
\[ \frac{\varphi_n^{(1)}(E)}{\theta_n} \le C n \sum_{a=0}^{m} (n-a+1)^{-(2 \wedge (d-2)/2)} \le C m^2 (n-m+1)^{-(1 \wedge (d-4)/2)}. \tag{3.31} \]
The right side goes to zero as $n \to \infty$, when $d > 4$.
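The second inequality in (3.30) is a standard convolution estimate: a sum of products of two power laws decays at the slower of the two rates when both exponents exceed 1. A minimal numerical illustration follows, assuming nothing beyond the displayed sum; the function names and the range of $n$ are arbitrary choices made for this sketch.

```python
# Numerical illustration of the convolution bound in (3.30).
# Names and ranges are hypothetical; this is a sanity check, not a proof.

def convolution_sum(n: int, d: int) -> float:
    """sum_{j=0}^{n} (n-j+1)^(-2) * (j+1)^(-(d-2)/2)."""
    p = (d - 2) / 2
    return sum((n - j + 1) ** -2.0 * (j + 1) ** (-p) for j in range(n + 1))


def normalised(n: int, d: int) -> float:
    """The sum multiplied by (n+1)^(min(2,(d-2)/2)); (3.30) says this stays bounded."""
    rate = min(2.0, (d - 2) / 2)
    return convolution_sum(n, d) * (n + 1) ** rate


def max_normalised(d: int, n_max: int = 500) -> float:
    """Largest normalised value over 0 <= n <= n_max."""
    return max(normalised(n, d) for n in range(n_max + 1))


if __name__ == "__main__":
    for d in (5, 6, 8):
        print(d, max_normalised(d))
```

Splitting the sum at $j = n/2$ and bounding each half by the tail-free sum of the other factor gives an explicit (non-optimal) constant, which is why the normalised quantity remains bounded for every $d > 4$.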
Fig. 4. Schematic depiction of disjoint connections required by $\Phi_n^{(1)}$
Before proceeding with $N \ge 2$, it is worth noting that the sum $\sum_{i=0}^{j} \sum_{x,y \in \mathbb{Z}^d} \tau_j(x)\, \tau_i(y)\, \tau_{j-i}(x-y)$ in (3.28) can be bounded using another method from [14]. The above sum can be obtained from the simpler sum $\sum_{x \in \mathbb{Z}^d} \tau_j(x)^2$ by replacing one factor $\tau_j(x)$ by $\tau_i(y)\tau_{j-i}(x-y)$ and then summing over $y$ and $i$. The first part of this procedure is referred to in [14, Definition 4.1] as Construction $1^\lambda(y,i)$, where $\lambda$ labels the diagram line that is modified. According to [14, Lemma 4.6(a)], the diagram obtained after Construction $1^\lambda(y,i)$ followed by summation over $y$ obeys the same bound as the original diagram, up to a multiplicative constant. Thus, Construction $1^\lambda(y,i)$ followed by summation over $y$ produces a diagram that is bounded by a constant multiple of the bound on $\sum_{x \in \mathbb{Z}^d} \pi_j^{(0)}(x)$ of (2.31), namely the bound $C(j+1)^{-d/2}$ (we have omitted the factor $\beta$ from (2.31) to allow for the possibility that $j = 0$). The bound (3.29) could thus be replaced by
\[ \Phi_n^{(1)} \le \sum_{j=0}^{n} \theta_{n-j}^2 \sum_{i=0}^{j} C(j+1)^{-d/2} \le C \sum_{j=0}^{n} \theta_{n-j}^2\, (j+1)^{-(d-2)/2}, \tag{3.32} \]
which yields the same conclusion as (3.29). In dealing with $N \ge 2$, we will prefer the above method using Construction $1^\lambda(y,i)$, rather than the method of the previous paragraph. In (3.32), we have bounded $\Phi_n^{(1)}$ using $\pi_j^{(0)}$. Similarly, for $N \ge 2$, we will bound $\Phi_n^{(N)}$ using $\pi_j^{(N-1)}$ with $0 \le j \le n$.

Fix $N \ge 2$. We begin with a decomposition of $J_t^{(N)}[0,t]$. We write a lace $L \in \mathcal{L}^{(N)}[0,t]$ in the form $L = \{i_1 j_1, \dots, i_N j_N\}$, with $0 = i_1 < i_2 < \cdots < i_N < t$. We write $L$ as $L^- \cup \{i_N t\}$, where $L^- \in \mathcal{L}^{(N-1)}[0, j_{N-1}]$. Let $\Psi(L^-) = 1$ if $N = 2$, and $\Psi(L^-) = j_{N-2}$ if $N \ge 3$. Then we may write
\[ \sum_{L \in \mathcal{L}^{(N)}[0,t]} = \sum_{j_{N-1}=1}^{t-1}\; \sum_{L^- \in \mathcal{L}^{(N-1)}[0,j_{N-1}]}\; \sum_{i_N = \Psi(L^-)}^{j_{N-1}-1}. \tag{3.33} \]
We make the decomposition
\[ \prod_{ij \in L} (-U_{ij}(t)) = (-U_{i_N t}(t)) \prod_{ij \in L^-} (-U_{ij}), \tag{3.34} \]
and note that
\[ \mathcal{C}(L^-) \cup \{i_N r : i_N < r < t\} \subset \mathcal{C}(L). \tag{3.35} \]
Therefore
\[ J_t^{(N)}[0,t] \le \sum_{j_{N-1}=1}^{t-1}\; \sum_{L^- \in \mathcal{L}^{(N-1)}[0,j_{N-1}]}\; \prod_{ij \in L^-} (-U_{ij}) \prod_{i'j' \in \mathcal{C}(L^-)} (1 + U_{i'j'}) \times \sum_{i_N = \Psi(L^-)}^{j_{N-1}-1} (-U_{i_N t}(t)) \prod_{i_N < r < t} (1 + U_{i_N r}). \tag{3.36} \]
Were it not for the dependence of the second line of (3.36) on $L^-$ through the lower limit of summation over $i_N$, we would be able to rewrite the sum over $L^-$ in the first line simply as $J^{(N-1)}[0, j_{N-1}]$. The effect of the second line is twofold. First, the factor $(-U_{i_N t}(t))$ ensures that $\bar{b}_{i_N} \Rightarrow n$. Second, together with the indicator $I[T_t(\vec{b},n)]$, the factor $\prod_{i_N < r < t}(1 + U_{i_N r})$ […]
\[ \Phi_n^{(N)} \le \sum_{j=0}^{n} \sum_{i=0}^{j} \sum_{x,y \in \mathbb{Z}^d} \bar{\pi}_j^{(N-1)}(x;(y,i))\, \theta_{n-i}\, \theta_{n-j}, \tag{3.37} \]
where $\bar{\pi}_j^{(N-1)}(x;(y,i))$ denotes the result of applying Construction $1^\lambda(y,i)$ to a diagram bounding $\pi_j^{(N-1)}(x)$, followed by an appropriate sum over $\lambda$. By [14, Lemma 4.6(a)], $\sum_{x,y \in \mathbb{Z}^d} \bar{\pi}_j^{(N-1)}(x;(y,i))$ obeys the bound on $\sum_{x \in \mathbb{Z}^d} \pi_j^{(N-1)}(x)$ of (2.31), with a different constant. Since $\theta_{n-i} \le \theta_{n-j}$, it follows that
\[ \Phi_n^{(N)} \le \sum_{j=0}^{n} \sum_{i=0}^{j} (C\beta)^{N-1} (j+1)^{-d/2}\, \theta_{n-j}^2. \tag{3.38} \]
Via (1.9), this gives
\[ \Phi_n^{(N)} \le (C\beta)^{N-1} \sum_{j=0}^{n} (j+1)^{-(d-2)/2} (n-j+1)^{-2} \le (C\beta)^{N-1} (n+1)^{-(2 \wedge (d-2)/2)}, \tag{3.39} \]
and the desired result follows from (3.27) as in (3.31), again using (1.9). The factor $\beta^{N-1}$ permits the summation over $N$ to be performed, for $\beta$ sufficiently small.
Fig. 5. Example of disjoint connections required by a configuration contributing to $\Phi_n^{(3)}$
4. Oriented Percolation and Super-Brownian Motion

4.1. Convergence of moment measures. The oriented percolation $r$-point functions are defined, for $n_i \ge 0$ and $x_i \in \mathbb{Z}^d$, by
\[ \tau^{(r)}_{n_1,\dots,n_{r-1}}(x_1,\dots,x_{r-1}) = \mathbb{P}_p\big( (0,0) \longrightarrow (x_i,n_i) \text{ for each } i = 1,\dots,r-1 \big). \tag{4.1} \]
In particular, $\tau_n^{(2)}(x)$ is the two-point function $\tau_n(x)$. Given $m \in \mathbb{N}$, an absolutely summable function $f : \mathbb{Z}^{md} \to \mathbb{C}$, and $\vec{k} = (k_1,\dots,k_m)$ with each $k_j \in (-\pi,\pi]^d$, we define the Fourier transform
\[ \hat{f}(\vec{k}) = \sum_{y_1,\dots,y_m \in \mathbb{Z}^d} f(\vec{y})\, e^{i\vec{k}\cdot\vec{y}}, \tag{4.2} \]
where $\vec{k}\cdot\vec{y} = \sum_{j=1}^{m} k_j \cdot y_j$. When $m = 1$, we write simply $k$ in place of $\vec{k}$.

In [14], the Fourier transforms of (4.1) are related, in an appropriate scaling limit, to the Fourier transforms of the moment measures of the canonical measure of super-Brownian motion [18, 22]. The canonical measure of super-Brownian motion is a certain scaling limit of critical branching random walk, started from a single particle located at the origin. It is a Markov process whose state $X_t$ at time $t > 0$ is a finite non-negative measure on $\mathbb{R}^d$. By definition, its $l$th moment measure has Fourier transform
\[ \hat{M}^{(l)}_{\vec{t}}(\vec{k}) = \mathbb{E} \int_{\mathbb{R}^{dl}} X_{t_1}(dx_1) \cdots X_{t_l}(dx_l) \prod_{j=1}^{l} e^{i k_j \cdot x_j}, \tag{4.3} \]
where $\vec{t} = (t_1,\dots,t_l)$ with each $t_i \in (0,\infty)$, and $\vec{k} = (k_1,\dots,k_l)$ with each $k_i \in \mathbb{R}^d$.

The following result is a combination of [14, Theorems 1.1(a) and 1.2] with [14, (1.25)]. In its statement, the parameter $\epsilon$ is fixed such that $\sum_{x \in \mathbb{Z}^d} |x|^{2+2\epsilon} D(x) \le C L^{2+2\epsilon}$. The existence of such an $\epsilon > 0$ is part of the assumptions on $D$ from [14] discussed in Sect. 1.2 and assumed in this paper.
Theorem 4.1. Let $d > 4$, $p = p_c$, $\delta \in (0, 1 \wedge \epsilon \wedge \frac{d-4}{2})$, $r \ge 2$, $\vec{t} = (t_1,\dots,t_{r-1}) \in (0,\infty)^{r-1}$, and $\vec{k} = (k_1,\dots,k_{r-1}) \in \mathbb{R}^{(r-1)d}$. There exist $L_0 = L_0(d)$ and finite positive constants $A = A(d,L)$, $v = v(d,L)$, $V = V(d,L)$ (with $L_0$, $A$, $v$, $V$ independent of $r$) such that for $L \ge L_0$,
\[ \hat{\tau}^{(r)}_{\lfloor n\vec{t} \rfloor}\big( \vec{k}/\sqrt{v\sigma^2 n} \big) = A^{2r-3} V^{r-2} n^{r-2} \big[ \hat{M}^{(r-1)}_{\vec{t}}(\vec{k}) + O(n^{-\delta}) \big]. \tag{4.4} \]
Rather than applying Theorem 4.1 directly, we use an auxiliary result that was derived in [14] in the course of proving Theorem 4.1. Let $\bar{n}$ denote the second largest component of $\vec{n} = (n_1,\dots,n_{r-1})$. In Sect. 5, we will use [14, (2.52)], which states that
\[ \hat{\tau}^{(r)}_{\vec{n}}\big( \vec{k}/\sqrt{v\sigma^2 n} \big) = A(A^2 V)^{r-2} n^{r-2} \big[ \hat{M}^{(r-1)}_{\vec{n}/n}(\vec{k}) + O((\bar{n}+1)^{-\delta}) \big] \qquad (r \ge 3) \tag{4.5} \]
holds uniformly in $n \ge \bar{n}$.

4.2. The moment measures of super-Brownian motion. In Sect. 5, we will make use of elementary properties of the $\hat{M}^{(l)}_{\vec{t}}(\vec{k})$, which we now summarise. For $l = 1$,
\[ \hat{M}^{(1)}_t(k) = e^{-|k|^2 t/2d}. \tag{4.6} \]
For $l > 1$, the $\hat{M}^{(l)}_{\vec{t}}(\vec{k})$ are given recursively by
\[ \hat{M}^{(l)}_{\vec{t}}(\vec{k}) = \int_0^{\underline{t}} dt\; \hat{M}^{(1)}_t(k_1 + \cdots + k_l) \sum_{I \subset J_1 : |I| \ge 1} \hat{M}^{(i)}_{\vec{t}_I - t}(\vec{k}_I)\, \hat{M}^{(l-i)}_{\vec{t}_{J \setminus I} - t}(\vec{k}_{J \setminus I}), \tag{4.7} \]
where $i = |I|$, $J = \{1,\dots,l\}$, $J_1 = J \setminus \{1\}$, $\underline{t} = \min_i t_i$, $\vec{t}_I$ denotes the vector consisting of the components $t_i$ of $\vec{t}$ with $i \in I$, and $\vec{t}_I - t$ denotes subtraction of $t$ from each component of $\vec{t}_I$ [9]. The explicit solution to the recursive formula (4.7) can be found in [14, (1.25)]. For example,
\[ \hat{M}^{(2)}_{t_1,t_2}(k_1,k_2) = \int_0^{t_1 \wedge t_2} dt\; e^{-|k_1+k_2|^2 t/2d}\, e^{-|k_1|^2 (t_1-t)/2d}\, e^{-|k_2|^2 (t_2-t)/2d}. \tag{4.8} \]
Equation (4.8) is a statement, in Fourier language, that mass arrives at given points $(x_1,t_1)$, $(x_2,t_2)$ via a Brownian path from the origin that splits into two Brownian paths at a time chosen uniformly from the interval $[0, t_1 \wedge t_2]$. The recursive formula (4.7) has a related interpretation for all $l \ge 2$, in which $t$ is the time of the first branching. The sets $I$ and $J \setminus I$ label the offspring of each of the two particles after the first branching.

Lemma 4.2. (a) For $k \in \mathbb{R}^d$,
\[ \hat{M}^{(2)}_{1,1}(0,k) = e^{-\frac{|k|^2}{2d}}. \tag{4.9} \]
(b) For $l \ge 0$, $t \ge s$ and $k_j \in \mathbb{R}^d$,
\[ \hat{M}^{(l+1)}_{t,s,\dots,s}(0,k_2,\dots,k_{l+1}) = \hat{M}^{(l+1)}_{s,s,\dots,s}(0,k_2,\dots,k_{l+1}). \tag{4.10} \]
(c) For $l \ge 0$,
\[ \hat{M}^{(l+1)}_{t,\dots,t}(\vec{0}) = t^l\, 2^{-l}\, (l+1)!. \tag{4.11} \]
Proof. (a) This follows immediately from (4.8).

(b) The proof is by induction on $l$. For $l = 0$, both sides of (4.10) equal 1, by (4.6). For $l \ge 1$, we use (4.7) with $\vec{t} = (t,s,\dots,s)$ to obtain
\[ \hat{M}^{(l+1)}_{t,s,\dots,s}(0,k_2,\dots,k_{l+1}) = \int_0^s du\; \hat{M}^{(1)}_u(k_2 + \cdots + k_{l+1}) \sum_{I \subset J_1 : |I| \ge 1} \hat{M}^{(i)}_{\vec{t}_I - u}(\vec{k}_I)\, \hat{M}^{(l-i+1)}_{\vec{t}_{J \setminus I} - u}(\vec{k}_{J \setminus I}). \tag{4.12} \]
On the right side, all the arguments in $\vec{t}_I - u$ and $\vec{t}_{J \setminus I} - u$ are equal to $s - u$, except for one, which is $t - u$. The distinguished time variable also has $k_1 = 0$. Applying the induction hypothesis, we get
\[ \hat{M}^{(l+1)}_{t,s,\dots,s}(0,k_2,\dots,k_{l+1}) = \int_0^s du\; \hat{M}^{(1)}_u(k_2 + \cdots + k_{l+1}) \sum_{I \subset J_1 : |I| \ge 1} \hat{M}^{(i)}_{s-u,\dots,s-u}(\vec{k}_I)\, \hat{M}^{(l-i+1)}_{s-u,\dots,s-u}(\vec{k}_{J \setminus I}) \tag{4.13} \]
\[ = \hat{M}^{(l+1)}_{s,s,\dots,s}(0,k_2,\dots,k_{l+1}), \tag{4.14} \]
which advances the induction and proves (4.10).

(c) The proof is again by induction on $l$. For $l = 0$, (4.11) follows from (4.6). For $l \ge 1$ we use (4.7) and the induction hypothesis to obtain
\[ \begin{aligned} \hat{M}^{(l+1)}_{t,\dots,t}(\vec{0}) &= \int_0^t ds \sum_{i=1}^{l} \binom{l}{i} (t-s)^{i-1} 2^{-(i-1)}\, i!\; (t-s)^{l-i} 2^{-(l-i)}\, (l-i+1)! \\ &= 2^{-(l-1)} \sum_{i=1}^{l} \binom{l}{i}\, i!\, (l-i+1)! \int_0^t (t-s)^{l-1}\, ds \\ &= t^l\, 2^{-(l-1)}\, (l-1)! \sum_{i=1}^{l} (l-i+1) = t^l\, 2^{-l}\, (l+1)!, \end{aligned} \tag{4.15} \]
which advances the induction and proves (4.11).
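The induction in part (c) can be checked mechanically. Writing $\hat{M}^{(j)}_{t,\dots,t}(\vec{0}) = c_j t^{j-1}$, the recursion (4.7) at zero momentum reduces to $c_{l+1} = \frac{1}{l}\sum_{i=1}^{l}\binom{l}{i} c_i c_{l+1-i}$ with $c_1 = 1$, and (4.11) asserts $c_{l+1} = 2^{-l}(l+1)!$. The sketch below verifies this in exact rational arithmetic; the coefficient notation $c_j$ and the function name are introduced here for illustration only.

```python
# Exact check of (4.11) via the zero-momentum form of the recursion (4.7).
# The coefficients c[j] and the function name are illustrative, not from the paper.
from fractions import Fraction
from math import comb, factorial


def moment_coefficients(l_max: int) -> dict:
    """Return c with M^{(j)}_{t,...,t}(0) = c[j] * t**(j-1) for 1 <= j <= l_max + 1."""
    c = {1: Fraction(1)}  # (4.6) at k = 0 gives M^{(1)}_t(0) = 1
    for l in range(1, l_max + 1):
        # There are comb(l, i) subsets I of J_1 with |I| = i; the two factors carry
        # (t-s)^(i-1) and (t-s)^(l-i), so the integrand is proportional to (t-s)^(l-1).
        total = sum(comb(l, i) * c[i] * c[l + 1 - i] for i in range(1, l + 1))
        c[l + 1] = total / l  # integral of (t-s)^(l-1) over [0, t] equals t^l / l
    return c


if __name__ == "__main__":
    coeffs = moment_coefficients(8)
    for l in range(0, 9):
        print(l, coeffs[l + 1], Fraction(factorial(l + 1), 2 ** l))
```

Using `Fraction` avoids any rounding, so agreement for each $l$ is an exact identity rather than a numerical coincidence.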
5. Proof of Theorems 1.3–1.5

5.1. Proof of Theorem 1.3. Before proving Theorem 1.3, we first derive upper and lower bounds on the IIC two-point function, defined by
\[ \rho_m(y) = \mathbb{P}_\infty((0,0) \longrightarrow (y,m)) = \lim_{n\to\infty} \frac{1}{\tau_n} \sum_{x \in \mathbb{Z}^d} \tau^{(3)}_{n,m}(x,y). \tag{5.1} \]
In addition to the fact that $\tau_n \to A$ by (2.26), we will use the fact that
\[ \sup_{x \in \mathbb{Z}^d} \tau_n(x) \le C(n+1)^{-d/2} \tag{5.2} \]
by [14, Theorem 1.1(c)].
Beginning with the upper bounds, we show that
\[ \sum_{y \in \mathbb{Z}^d} \rho_m(y) \le Cm, \qquad \sup_{y \in \mathbb{Z}^d} \rho_m(y) \le C(m+1)^{-(d-2)/2}. \tag{5.3} \]
For the first bound in (5.3), we use the tree-graph bound [2] to obtain the estimate
\[ \tau^{(3)}_{n,m}(x,y) \le \sum_{l=0}^{m} \sum_{z \in \mathbb{Z}^d} \tau_l(z)\, \tau_{m-l}(y-z)\, \tau_{n-l}(x-z). \tag{5.4} \]
Therefore, by (5.1) and (2.26),
\[ \rho_m(y) \le C \sum_{l=0}^{m} \sum_{z \in \mathbb{Z}^d} \tau_l(z)\, \tau_{m-l}(y-z). \tag{5.5} \]
Summing over $y$ and again using (2.26), we get the first bound of (5.3). For the second bound in (5.3), we apply (5.2) to either the first or the second factor on the right side of (5.5), according to whether $l \ge m/2$ or $l \le m/2$. This gives, as required,
\[ \sup_{y \in \mathbb{Z}^d} \rho_m(y) \le C \sum_{l=0}^{m} (l \vee (m-l))^{-d/2} \le C(m+1)^{-(d-2)/2}. \tag{5.6} \]
Continuing with the lower bound, we show that there is a constant $c > 0$ such that
\[ \sum_{|y| \le \sqrt{m}} \rho_m(y) \ge cm. \tag{5.7} \]
To prove this, we note that, by (5.1),
\[ \hat{\rho}_m(k) = \lim_{n\to\infty} \frac{1}{\tau_n}\, \hat{\tau}^{(3)}_{n,m}(0,k). \tag{5.8} \]
We use (4.5) with $(n_1,n_2) = (n,m)$, $\bar{n} = m$, and with $n$ of (4.5) equal to $m$. Combining this with Lemma 4.2(a,b), we get
\[ \lim_{n\to\infty} \hat{\tau}^{(3)}_{n,m}\Big( 0, \frac{k}{\sqrt{v\sigma^2 m}} \Big) = A(A^2 V)\, m\, \hat{M}^{(2)}_{n/m,1}(0,k)\, [1 + O(m^{-\delta})] = A(A^2 V)\, m\, e^{-\frac{|k|^2}{2d}}\, [1 + O(m^{-\delta})]. \tag{5.9} \]
Therefore, using (5.8) and (2.26), we obtain
\[ \lim_{m\to\infty} \frac{1}{m A^2 V}\, \hat{\rho}_m\big( k/\sqrt{v\sigma^2 m} \big) = e^{-\frac{|k|^2}{2d}}, \tag{5.10} \]
and hence the discrete measure on $\mathbb{R}^d$ that assigns mass $(m A^2 V)^{-1} \rho_m(x)$ to $x/\sqrt{v\sigma^2 m}$ ($x \in \mathbb{Z}^d$) converges weakly to a Gaussian. This implies (5.7).

Proof of Theorem 1.3. For the upper bound on $D_R$, we use the decomposition
\[ D_R = \sum_{m} \sum_{|y| \le R} \rho_m(y) = \sum_{m \le R^2} \sum_{|y| \le R} \rho_m(y) + \sum_{m > R^2} \sum_{|y| \le R} \rho_m(y). \tag{5.11} \]
By the first bound of (5.3), the first term is bounded above by $C \sum_{m \le R^2} m = O(R^4)$. By the second bound of (5.3), the second term is bounded above by
\[ \sum_{m > R^2} C R^d \sup_{y \in \mathbb{Z}^d} \rho_m(y) \le C R^d \sum_{m > R^2} (m+1)^{-(d-2)/2} = O(R^4). \tag{5.12} \]
This proves the upper bound on $D_R$. For the lower bound on $D_R$, we use that (5.7) implies
\[ D_R \ge \sum_{m \le R^2} \sum_{|y| \le R} \rho_m(y) \ge \sum_{m \le R^2} \sum_{|y| \le \sqrt{m}} \rho_m(y) \ge \sum_{m \le R^2} cm \ge \frac{1}{2}\, c R^4. \tag{5.13} \]
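Both $R^4$ estimates are elementary power counts: $\sum_{m \le R^2} m = R^2(R^2+1)/2 \ge R^4/2$, and, by integral comparison with $p = (d-2)/2 > 1$, $R^d \sum_{m > R^2} (m+1)^{-p} \le R^d (R^2+1)^{1-p}/(p-1) \le R^4/(p-1)$. A short numerical sketch of these two facts follows, with integer $R$ and hypothetical function names chosen for illustration; the constants $C$ and $c$ of the text are set to 1.

```python
# Power counting behind (5.12) and (5.13), with the constants C and c set to 1.
# Integer R, the truncation length and all names are assumptions of this sketch.

def near_sum(R: int) -> int:
    """sum_{1 <= m <= R^2} m, the quantity bounded below by R^4/2 in (5.13)."""
    return sum(range(1, R * R + 1))


def far_sum_partial(R: int, d: int, terms: int = 20_000) -> float:
    """R^d * sum over the first `terms` values m > R^2 of (m+1)^(-(d-2)/2)."""
    p = (d - 2) / 2
    first = R * R + 1
    return R ** d * sum((m + 1) ** (-p) for m in range(first, first + terms))


def far_sum_envelope(R: int, d: int) -> float:
    """Integral comparison bound R^d * (R^2+1)^(1-p) / (p-1), valid when d > 4."""
    p = (d - 2) / 2
    return R ** d * (R * R + 1) ** (1 - p) / (p - 1)


if __name__ == "__main__":
    for R in (2, 5, 10):
        print(R, near_sum(R), R ** 4 / 2, far_sum_partial(R, 5), far_sum_envelope(R, 5))
```

The exponent $p > 1$ is exactly the condition $d > 4$, which is where the dimension assumption enters the upper bound.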
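Finally, we make an observation about the scaling of the IIC $r$-point functions for general $r$, although we will not need this. Let $\vec{y} = (y_1,\dots,y_{r-1})$ and $\vec{m} = (m_1,\dots,m_{r-1})$ with $y_i \in \mathbb{Z}^d$, $m_i \in \mathbb{Z}_+$, and define the IIC $r$-point function by
\[ \rho^{(r)}_{\vec{m}}(\vec{y}) = \mathbb{P}_\infty\big( (0,0) \longrightarrow (y_i,m_i) \text{ for all } i = 1,\dots,r-1 \big). \tag{5.14} \]
In particular, $\rho^{(2)}_m(y)$ is the same as $\rho_m(y)$ of (5.1). The methods employed to prove (5.10) can also be used to show that
\[ \lim_{m\to\infty} \frac{1}{(m A^2 V)^{r-1}}\, \hat{\rho}^{(r)}_{m\vec{t}}\big( \vec{k}/\sqrt{v\sigma^2 m} \big) = \hat{M}^{(r)}_{1,\vec{t}}(0,\vec{k}), \tag{5.15} \]
for all $r \ge 2$, $\vec{t} = (t_1,\dots,t_{r-1}) \in (0,1]^{r-1}$ and $\vec{k} \in \mathbb{R}^{d(r-1)}$.

5.2. Proof of Theorem 1.4. We first prove (1.16). Let $l \ge 1$. By (1.3), (1.14) and (4.1), we have
\[ \mathbb{E}_{\mathbb{P}_n}[N_m^l] = \frac{1}{\tau_n} \sum_{x \in \mathbb{Z}^d} \sum_{y_1,\dots,y_l \in \mathbb{Z}^d} \mathbb{P}\big( (0,0) \longrightarrow (x,n),\ (0,0) \longrightarrow (y_i,m) \text{ for each } i = 1,\dots,l \big) = \frac{1}{\tau_n}\, \hat{\tau}^{(l+2)}_{n,m,\dots,m}(\vec{0}). \tag{5.16} \]
We take $n \ge m$, and use (4.5) with $r = l + 2$, $\vec{k} = \vec{0}$, $\vec{n} = (n,m,\dots,m)$, and with $n$ of (4.5) equal to $\bar{n} = m$. This gives
\[ \hat{\tau}^{(l+2)}_{n,m,\dots,m}(\vec{0}) = A(A^2 V)^l m^l \big[ \hat{M}^{(l+1)}_{n/m,1,\dots,1}(\vec{0}) + O(m^{-\delta}) \big]. \tag{5.17} \]
Applying Lemma 4.2(b,c), we get
\[ \hat{\tau}^{(l+2)}_{n,m,\dots,m}(\vec{0}) = A(A^2 V)^l m^l \big[ 2^{-l}(l+1)! + O(m^{-\delta}) \big]. \tag{5.18} \]
Combining (5.16) and (5.18), we find
\[ \mathbb{E}_{\mathbb{P}_n}\Big[ \Big( \frac{N_m}{m} \Big)^l \Big] = \frac{A}{\tau_n}\, (A^2 V)^l \big[ 2^{-l}(l+1)! + O(m^{-\delta}) \big]. \tag{5.19} \]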
Taking the limit $n \to \infty$, and using (2.26), we arrive at
\[ \mathbb{E}_\infty\Big[ \Big( \frac{N_m}{m} \Big)^l \Big] = (A^2 V)^l\, 2^{-l}\, (l+1)! + O(m^{-\delta}), \tag{5.20} \]
and hence at (1.16) after letting $m \to \infty$. The distribution of the size-biased exponential random variable is determined by its moments, since its moment generating function has a positive radius of convergence. It therefore follows from the convergence of moments expressed by (1.16) that $m^{-1} N_m$ converges weakly to a size-biased exponential random variable with parameter $\lambda = 2/(A^2 V)$ (see [4, Theorem 30.2]). This completes the proof of Theorem 1.4.

5.3. Proof of Theorem 1.5. It follows from (1.6), (1.8), (1.14) and (4.1) that
\[ \mathbb{E}_{\mathbb{Q}_m}[N_m^l] = \frac{1}{\theta_m}\, \hat{\tau}^{(l+1)}_{\vec{m}}(\vec{0}), \tag{5.21} \]
with $\vec{m} = (m,\dots,m)$. As in the proof of Theorem 1.4 we find, now also with the help of (1.9), that
\[ \lim_{m\to\infty} \mathbb{E}_{\mathbb{Q}_m}\Big[ \Big( \frac{N_m}{m} \Big)^l \Big] = \frac{2B}{AV}\, (A^2 V)^l\, 2^{-l}\, l! \qquad (l = 1, 2, \dots). \tag{5.22} \]
Let $\alpha = 2B/(AV)$ and suppose for the moment that $\alpha = 1$. Then (1.18) holds, and the right side of (5.22) gives the moments of an exponential random variable with parameter $\lambda = 2/(A^2 V)$. It then follows as in the proof of Theorem 1.4 that $m^{-1} N_m$ converges to this exponential random variable in distribution. To complete the proof, it suffices to show that $\alpha = 1$. We first prove that $\alpha \le 1$ and then prove that $\alpha \ge 1$.

Proof that $\alpha \le 1$. Since $\tau_m^{-1} N_m$ has expectation 1 under $\mathbb{P}$, we can define a new expectation by
\[ \mathbb{E}_m[X] = \mathbb{E}[\tau_m^{-1} N_m X]. \tag{5.23} \]
By definition of $\mathbb{E}_m$, (1.3) and (5.16),
\[ \mathbb{E}_m\Big[ \Big( \frac{N_m}{m} \Big)^l \Big] = \mathbb{E}_{\mathbb{P}_m}\Big[ \Big( \frac{N_m}{m} \Big)^l \Big] = \frac{1}{m^l \tau_m}\, \hat{\tau}^{(l+2)}_{m,m,\dots,m}(\vec{0}). \tag{5.24} \]
As in (5.20), it follows that the moments of $m^{-1} N_m$ under $\mathbb{E}_m$ converge to those of a size-biased exponential random variable with parameter $\lambda = 2/(A^2 V)$. Therefore, under this measure, $m^{-1} N_m$ converges weakly to a size-biased exponential random variable. In particular, for real $t$, the moment generating function $\mathbb{E}_m[e^{-t N_m/m}]$ converges to that of a size-biased exponential distribution with parameter $\lambda$, which is $\lambda^2/(\lambda+t)^2$.
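For reference, a size-biased exponential random variable with parameter $\lambda$ has density $\lambda^2 x e^{-\lambda x}$ on $(0,\infty)$, so its $l$th moment is $(l+1)!/\lambda^l$ and its Laplace transform is $\lambda^2/(\lambda+t)^2$; with $\lambda = 2/(A^2V)$ the moments are $(A^2V)^l 2^{-l}(l+1)!$, matching (5.20). The following sketch confirms both formulas by numerical quadrature. The Simpson-rule helper, the truncation point and the test value $\lambda = 2$ are assumptions made for illustration.

```python
# Numerical check of the moments and Laplace transform of the size-biased exponential law.
# The quadrature helper, truncation point and parameter values are illustrative assumptions.
from math import exp, factorial


def simpson(f, a: float, b: float, n: int = 20_000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def size_biased_density(x: float, lam: float) -> float:
    """Density lam^2 * x * exp(-lam * x) of the size-biased exponential distribution."""
    return lam * lam * x * exp(-lam * x)


def moment(l: int, lam: float, upper: float = 80.0) -> float:
    """l-th moment, integrating up to upper/lam (the neglected tail is of order exp(-upper))."""
    return simpson(lambda x: x ** l * size_biased_density(x, lam), 0.0, upper / lam)


def laplace(t: float, lam: float, upper: float = 80.0) -> float:
    """Laplace transform E[exp(-t X)] for t >= 0, by the same truncated quadrature."""
    return simpson(lambda x: exp(-t * x) * size_biased_density(x, lam), 0.0, upper / lam)


if __name__ == "__main__":
    lam = 2.0
    for l in range(0, 5):
        print(l, moment(l, lam), factorial(l + 1) / lam ** l)
    print(laplace(1.5, lam), lam ** 2 / (lam + 1.5) ** 2)
```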
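Let $t \ge 0$. In terms of $\mathbb{E}_m$, we can rewrite the moment generating function of $m^{-1} N_m$, under $\mathbb{Q}_m$, as (recall (1.5)–(1.6) and (1.8))
\[ \mathbb{E}_{\mathbb{Q}_m}\big[ e^{-t N_m/m} \big] = 1 - \int_0^t \mathbb{E}_{\mathbb{Q}_m}\Big[ \frac{1}{m} N_m\, e^{-s N_m/m} \Big] ds = 1 - \frac{\tau_m}{m \theta_m} \int_0^t \mathbb{E}_m\big[ e^{-s N_m/m} \big]\, ds. \tag{5.25} \]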
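By the dominated convergence theorem, together with (1.9) and (2.26), it follows from the identity $\alpha = AB\lambda$ that
\[ 0 \le \lim_{m\to\infty} \mathbb{E}_{\mathbb{Q}_m}\big[ e^{-t N_m/m} \big] = 1 - AB \int_0^t \frac{\lambda^2}{(\lambda+s)^2}\, ds = 1 - \alpha + \alpha\, \frac{\lambda}{\lambda+t}. \tag{5.26} \]
By letting $t \to \infty$, we conclude from (5.26) that $\alpha \le 1$.

Proof that $\alpha \ge 1$. Fix $s > 0$. By definition (recall (1.5)–(1.6)),
\[ \theta_{m(1+s)} = \theta_m\, \mathbb{P}(S_{m(1+s)} \mid S_m). \tag{5.27} \]
Let $n$ be any positive integer and let $A \subset \mathbb{Z}^d$ be any finite set, and define
\[ \theta_n(A) = \mathbb{P}(\exists a \in A : (a,0) \longrightarrow n) = 1 - \mathbb{P}(\forall a \in A : (a,0) \not\longrightarrow n). \tag{5.28} \]
Since, for any $a \in \mathbb{Z}^d$, $\{(a,0) \not\longrightarrow n\}$ is a decreasing event, it follows from the FKG inequality that
\[ \theta_n(A) \le 1 - (1 - \theta_n)^{|A|}. \tag{5.29} \]
Therefore, using $n = ms$ and $A = \{a \in \mathbb{Z}^d : (0,0) \longrightarrow (a,m)\}$, we have
\[ \theta_{m(1+s)} \le \theta_m\, \mathbb{E}\big[ 1 - (1 - \theta_{ms})^{N_m} \,\big|\, S_m \big] = \theta_m \big( 1 - \mathbb{E}_{\mathbb{Q}_m}\big[ (1 - \theta_{ms})^{N_m} \big] \big), \tag{5.30} \]
and hence, by (1.9), for any $\eta > 0$ we have
\[ \frac{1}{B(1+s)} = \lim_{m\to\infty} m\, \theta_{m(1+s)} \le \frac{1}{B} \Big( 1 - \lim_{m\to\infty} \mathbb{E}_{\mathbb{Q}_m}\big[ (1 - \theta_{ms})^{m \frac{N_m}{m}} \big] \Big) \le \frac{1}{B} \Big( 1 - \lim_{m\to\infty} \mathbb{E}_{\mathbb{Q}_m}\big[ e^{-(\frac{1}{Bs} + \eta) \frac{N_m}{m}} \big] \Big). \tag{5.31} \]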
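With minor changes, the calculations leading to (5.26) can also be carried out for $t = -iu$ with $u \in \mathbb{R}$. This yields
\[ \lim_{m\to\infty} \mathbb{E}_{\mathbb{Q}_m}\big[ e^{iu N_m/m} \big] = 1 - \alpha + \alpha\, \frac{\lambda}{\lambda - iu}. \tag{5.32} \]
It follows from (5.32) that $m^{-1} N_m$ under $\mathbb{Q}_m$ converges in distribution to a random variable $Y$ having the property that $\mathbb{P}(Y = 0) = 1 - \alpha$ and that the distribution of $Y$ conditional on $Y > 0$ is that of an exponential random variable with parameter $\lambda$. By (5.31), it follows that
\[ \frac{1}{B(1+s)} \le \frac{1}{B} \Big( 1 - \mathbb{E}\big[ e^{-(\frac{1}{Bs}+\eta) Y} \big] \Big) = \frac{\alpha}{B} \Big( 1 - \mathbb{E}\big[ e^{-(\frac{1}{Bs}+\eta) Y} \,\big|\, Y > 0 \big] \Big) = \frac{\alpha}{B} \Big( 1 - \frac{\lambda}{\lambda + \frac{1}{Bs} + \eta} \Big). \tag{5.33} \]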
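Writing $u = \frac{1}{Bs} + \eta$, the bound (5.33) rearranges to $\alpha \ge \frac{\lambda + u}{u(1+s)}$, which for $\eta = 0$ equals $\frac{1 + \lambda B s}{1+s}$ and tends to 1 as $s \downarrow 0$. The sketch below evaluates this lower bound numerically; the values of $\lambda$ and $B$ are arbitrary placeholders, since the true constants are not computed in the text.

```python
# Rearrangement of (5.33) into an explicit lower bound on alpha.
# The numerical values of lam and B below are placeholders, not the constants of the paper.

def alpha_lower_bound(s: float, eta: float, lam: float, B: float) -> float:
    """Solve (5.33) for alpha: alpha >= (lam + u) / (u * (1 + s)), with u = 1/(B*s) + eta."""
    u = 1.0 / (B * s) + eta
    return (lam + u) / (u * (1.0 + s))


if __name__ == "__main__":
    lam, B = 2.0, 0.7
    for s in (1e-1, 1e-3, 1e-6):
        print(s, alpha_lower_bound(s, eta=s, lam=lam, B=B))
```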
Now we let $\eta \downarrow 0$ and $s \downarrow 0$ to conclude that $\alpha \ge 1$. This completes the proof.

Acknowledgements. This work was supported in part by NSERC of Canada. The work of RvdH and GS was carried out in part at Microsoft Research, and the work of RvdH and FdH was carried out in part at the University of British Columbia. We thank Antal Járai and Akira Sakai for stimulating conversations during the initial stages of this work.
References

1. Aizenman, M.: On the number of incipient spanning clusters. Nucl. Phys. B [FS] 485, 551–582 (1997)
2. Aizenman, M., Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 36, 107–143 (1984)
3. Bezuidenhout, C., Grimmett, G.: The critical contact process dies out. Ann. Probab. 18, 1462–1482 (1990)
4. Billingsley, P.: Probability and Measure. 3rd Edition, New York: John Wiley and Sons, 1995
5. Borgs, C., Chayes, J.T., Kesten, H., Spencer, J.: The birth of the infinite cluster: Finite-size scaling in percolation. Commun. Math. Phys. 224, 153–204 (2001)
6. Brydges, D.C., Spencer, T.: Self-avoiding walk in 5 or more dimensions. Commun. Math. Phys. 97, 125–148 (1985)
7. Chayes, J.T., Chayes, L.: Percolation and random media. In: Critical Phenomena, Random Systems, Gauge Theories, K. Osterwalder, R. Stora eds. (Les Houches 1984), Amsterdam: North-Holland, 1986
8. Chayes, J.T., Chayes, L., Durrett, R.: Inhomogeneous percolation problems and incipient infinite clusters. J. Phys. A: Math. Gen. 20, 1521–1530 (1987)
9. Dynkin, E.B.: Representation for functionals of superprocesses by multiple stochastic integrals, with applications to self-intersection local times. Astérisque 157–158, 147–171 (1988)
10. Grimmett, G., Hiemer, P.: Directed percolation and random walk. In: In and Out of Equilibrium, V. Sidoravicius, ed., Boston: Birkhäuser, 2002, pp. 273–297
11. Hara, T., Slade, G.: The scaling limit of the incipient infinite cluster in high-dimensional percolation. II. Integrated super-Brownian excursion. J. Math. Phys. 41, 1244–1293 (2000)
12. van der Hofstad, R., den Hollander, F., Slade, G.: The survival probability of critical spread-out oriented percolation above 4 + 1 dimensions. In preparation
13. van der Hofstad, R., Slade, G.: A generalised inductive approach to the lace expansion. Probab. Th. Rel. Fields 122, 389–430 (2002)
14. van der Hofstad, R., Slade, G.: Convergence of critical oriented percolation to super-Brownian motion above 4 + 1 dimensions. Preprint (2001)
15. Járai Jr., A.: Incipient infinite percolation clusters in 2d. To appear in Ann. Probab.
16. Járai Jr., A.: Invasion percolation and the incipient infinite cluster in 2d. Preprint (2001)
17. Kesten, H.: The incipient infinite cluster in two-dimensional percolation. Probab. Th. Rel. Fields 73, 369–394 (1986)
18. Le Gall, J.-F.: Spatial Branching Processes, Random Snakes, and Partial Differential Equations. Basel: Birkhäuser, 1999
19. Madras, N., Slade, G.: The Self-Avoiding Walk. Boston: Birkhäuser, 1993
20. Nguyen, B.G., Yang, W.-S.: Triangle condition for oriented percolation in high dimensions. Ann. Probab. 21, 1809–1844 (1993)
21. Nguyen, B.G., Yang, W.-S.: Gaussian limit for critical oriented percolation in high dimensions. J. Stat. Phys. 78, 841–876 (1995)
22. Perkins, E.: Dawson–Watanabe superprocesses and measure-valued diffusions. In: Lectures on Probability Theory and Statistics. Ecole d'Eté de Probabilités de Saint-Flour XXIX-1999, P.L. Bernard, ed., Berlin-Heidelberg-New York: Springer. To appear
23. Simon, B.: Functional Integration and Quantum Physics. New York: Academic Press, 1979
24. Slade, G.: The scaling limit of self-avoiding random walk in high dimensions. Ann. Probab. 17, 91–107 (1989)

Communicated by M. Aizenman
Commun. Math. Phys. 231, 463–480 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0720-5
Communications in
Mathematical Physics
Brownian Motion with Restoring Drift: Micro-canonical Ensemble and the Thermodynamic Limit B. Rider Mathematics Department, Duke University Durham, NC 27708, USA. E-mail:
[email protected] Received: 9 January 2001 / Accepted: 17 July 2002 Published online: 14 October 2002 – © Springer-Verlag 2002
Abstract: We take up the old problem of micro-canonical conditioning in the context of diffusion. Starting with a potential $F : \mathbb{R}^d \to \mathbb{R}$, the Schrödinger operator $-G_0 = (1/2)\Delta - F$ with ground state $\psi$ is carried by a conjugation into the diffusion generator $G = (1/2)\Delta + (\nabla\psi/\psi) \cdot \nabla$ with invariant density $\psi^2$. The latter motion $t \to X(t)$ is made micro-canonical by first conditioning the path to be periodic, $X(0) = X(L)$, and then further conditioning on the empirical mean-square or "particle number" $(1/L)\int_0^L |X(t)|^2\, dt \simeq D$. The thermodynamics are then studied by taking $L \uparrow \infty$ while $D$ remains fixed. The problem in this form owes its inception to McKean–Vaninsky [8] who obtained the following result. For $F(x)/|x|^2 \uparrow \infty$ with $|x| \uparrow \infty$, they showed the same type of diffusion appears in the thermodynamic limit, but with drift arising from the shifted potential $F + c|x|^2$, $c$ being such that the limiting mean-square equals $D$. Their method of proof predicts the same outcome for $F(x)/|x|^2 \downarrow 0$, so long as $D$ is smaller than the canonical mean-square $D_0 = \int_{\mathbb{R}^d} |x|^2 \psi^2(x)\, dx$, while if $D > D_0$, the matter was unresolved. The purpose of this note is to show a type of phase transition takes place in this case: the conditioning is overcome in the limit and one sees the original (stationary) diffusion on the line. The proof employs an entropy inequality due to Csiszár [1].

1. Introduction

Consider the diffusion $t \to X(t) \in \mathbb{R}^d$ with infinitesimal operator
\[ G = \frac{1}{2}\Big( \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_d^2} \Big) + \frac{\nabla\psi(x)}{\psi(x)} \cdot \Big( \frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_d} \Big) = \frac{1}{2}\Delta + \frac{\nabla\psi}{\psi} \cdot \nabla, \tag{1} \]
where $\psi$ is smooth and satisfies $\int_{\mathbb{R}^d} \psi^2(x)\, dx = 1$; it is the Brownian Motion plus restoring drift of the title. The corresponding Markovian measure on paths starting from $\bullet \in \mathbb{R}^d$ is denoted by $P_\bullet$. We are concerned with the micro-canonical ensemble obtained
by first conditioning this motion to be periodic with periodicity $L$ and then further conditioning it to remain near the sphere $\int_0^L |X(t)|^2\,dt = LD$ with fixed positive $D$. The objective is to understand the thermodynamic limit, that is, what are the limiting processes on the line as $L \uparrow \infty$. The problem as stated was introduced by McKean-Vaninsky [8] in connection with the study of statistical mechanics for non-linear wave equations (see [7, 9] and also [6]).

To define the various ensembles, we first bring in the Schrödinger operator $G_0 = -(1/2)\Delta + F$ with $F = (1/2)\psi^{-1}\Delta\psi$ and ground state $\psi$. We assume throughout that $F(x) \uparrow \infty$ with $|x|$ so that $G_0$ has pure point spectrum: $\Lambda_0(G_0) < \Lambda_1(G_0) < \text{etc.} \uparrow \infty$. It is connected to $G$ through the conjugation $\psi G \psi^{-1} = -G_0 + \Lambda_0(G_0)$; this is one way to see that $P_\bullet$ is reversible with respect to $\psi^2(x)\,dx$, its stationary measure. Note that the growth rate of $F$ at $\infty$ reflects the strength of the restoring drift.

Next, with $p(t, x, x')$ being the transition density for $P_\bullet$, the periodic ensemble $P_L$ is defined by first conditioning on $X(0) = X(L)$ and then distributing this common starting/ending point according to the (finite) measure $p(L, x, x)\,dx$. In symbols this is$^1$
$$E^{P_L}[\phi(X)] = Z_L^{-1} \int_{\mathbb{R}^d} E_x[\phi(X),\, X(L) = x]\,dx$$
with the normalizer $Z_L$. By reversibility and our assumption on $F$ the latter can be expressed as
$$Z_L = \int_{\mathbb{R}^d} p(L, x, x)\,dx = \sum_{n=0}^{\infty} e^{L\Lambda_n(G)} = \sum_{n=0}^{\infty} e^{-L(\Lambda_n(G_0) - \Lambda_0(G_0))},$$
so that, not only is $Z_L$ finite, but in fact $Z_L \simeq 1$ for $L \uparrow \infty$. The advantage of enforcing the periodicity in this way is that $P_L$ is invariant under rotations of the circle, as may be easily checked.

As to the micro-canonical ensemble, denoted by $M_L$, we pick a fixed $\delta > 0$ and take $M_L$ to be the measure on paths with partition function
$$Z_L = \int_{\mathbb{R}^d} E_x\left[\int_0^L |X(t)|^2\,dt \in L[D, D+\delta],\ X(L) = x\right] dx, \tag{2}$$
from which you see what was meant by near the sphere $\int_0^L |X|^2 = LD$. We note that while [8] took the conditioning point-wise ($\delta = 0$), our method requires opening the ensemble up a bit, letting $\delta \downarrow 0$ after $L \uparrow \infty$.$^2$ Now, $M_L$ clearly inherits the rotation invariance of $P_L$, and so to understand the thermodynamic limit it is enough to examine a rich enough class of short test functions of the path. That is, for some $\phi$ depending on $X(t')$ for $0 \le t' \le t < L$ only, the problem is to compute the micro-canonical mean value

$^1$ Throughout we use the useful notation $P[f(X) = x]$ and the like to indicate densities: $P[f(X) = x] = \partial/\partial z\, P[f(X) \le z]\,|_{z=x}$.
$^2$ One may even take $\delta \downarrow 0$ with $L \uparrow \infty$, but that changes nothing in the nature of the result.
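$$\begin{aligned}
M_L[\phi] &= Z_L^{-1} \int_{\mathbb{R}^d} E_x\left[\phi(X),\ \int_0^L |X(s)|^2\,ds \in L[D, D+\delta],\ X(L) = x\right] dx \\
&= \int_{\mathbb{R}^d} dx \int_{\mathbb{R}^d} dx' \int_0^\infty dI\ E_x\left[\phi(X),\ \int_0^t |X(s)|^2\,ds = I,\ X(t) = x'\right] \\
&\qquad \times\ Z_L^{-1}\, P_{x'}\left[\int_0^{L-t} |X(s)|^2\,ds \in L[D, D+\delta] - I,\ X(L-t) = x\right]
\end{aligned} \tag{3}$$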
for $L \uparrow \infty$.

In their investigation, McKean-Vaninsky treated the case $d = 1$ and developed a method best suited to "strong" potentials, $F(x)/|x|^2 \uparrow \infty$ with $|x|$. Under that condition they showed that the limiting mean-value $M_\infty[\phi]$ equals that for a new (stationary) diffusion with generator $G^* = (1/2)\partial^2 + (\log\psi^*)'\,\partial$. Here $\psi^*$ is the ground state for the Schrödinger operator with potential $F^*(x) = F(x) + c|x|^2$; the number $c$ is adjusted so that $M_\infty[|X|^2(0)] = \int_{-\infty}^{\infty} |x|^2 [\psi^*(x)]^2\,dx = D$. This shift from $F$ to $F^*$ is as predicted by Gibbs' Principle of Equivalence of Ensembles.

The key to their approach was that the growth condition on $F$ permits the assumption that the conditioned value of $D$ and the canonical mean-square $\int |x|^2\psi^2(x)\,dx$ are actually the same. Indeed, in that case the factor $e^{c\int_0^L X^2(t)\,dt}$ is integrable with respect to $P_\bullet$ for any $c$, and, if incorporated into the $P_\bullet$ mean as a density, allows one to arbitrarily raise/lower the mean-square by raising/lowering the constant $c$. Further, if $\delta = 0$, that factor may be added above and below in (3) without any change to the overall $M_L$ mean. Led by this observation McKean-Vaninsky proceeded in the spirit of Doeblin [3] to establish the local limit theorem
$$P_\bullet\left[\int_0^L X^2(t)\,dt = LD,\ X(L) = x\right] \sim L^{-1/2}\,\psi^2(x) \times [1 + o(1)] \quad \text{for } L \uparrow \infty.$$
Formal substitution of this estimate in (3) explains their result.

But what happens when the shift to the "correct" mean-square cannot be made ahead of time? This is the case when, for example, the potential is "weak", $F(x)/|x|^2 \to 0$, and $D > \int |x|^2\psi^2(x)\,dx$. Whatever the outcome, it cannot be described by a Gibbsian shift. Increasing the mean-square requires that $c > 0$, but, as one may check, $E_\bullet\left[\exp\left(c\int_0^L |X(t)|^2\,dt\right)\right] = +\infty$ for such $F$'s. The answer is that there is a type of phase transition: the conditioning is overcome in the limit and you see the original (stationary) diffusion on the line. We will prove the following.

Theorem 1. Let $G$ and $G_0$ be as above and set $D_0 = \int_{\mathbb{R}^d} |x|^2\psi^2(x)\,dx$. Our method is partial to potentials satisfying $\limsup_{|x|\uparrow\infty} F(x)/|x|^2 < \infty$; otherwise [8] applies. There are two cases.
(1) If either $\liminf_{|x|\uparrow\infty} F(x)/|x|^2 > 0$ or $D < D_0$ then $\lim_{\delta\downarrow 0}\lim_{L\uparrow\infty} M_L = P^*_{[\psi^*]^2}$, the stationary diffusion with generator $G^* = (1/2)\Delta + (\nabla\psi^*/\psi^*)\cdot\nabla$. Here $\psi^*$ is the ground state for $G_0^* = G_0 + c|x|^2$ and is such that $E^*_{[\psi^*]^2}[|X|^2(0)] = D$.
(2) If on the other hand $F(x) = o(|x|^2)$ in any cone tending to infinity and $D > D_0$ then $\lim_{\delta\downarrow 0}\lim_{L\uparrow\infty} M_L = P_{\psi^2}$, the original motion taken stationary on the line.
In both cases the convergence takes place in entropy, and so in total variation, over short fields.
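Case (1) of Theorem 1 can be made concrete with a one-dimensional Gaussian ground state. The choice of $\psi$ below, and the auxiliary frequency $\omega$, are assumptions introduced here purely for illustration and do not appear in the original; the computation is a sketch of what the theorem predicts, not part of its proof.

```latex
% Illustrative example (assumed): d = 1 with a Gaussian ground state.
\begin{align*}
\psi(x) &= \pi^{-1/4} e^{-x^2/2}, \qquad \int \psi^2\,dx = 1, \\
F(x) &= \tfrac12\,\psi''(x)/\psi(x) = \tfrac12\,(x^2 - 1),
  \qquad F(x)/|x|^2 \to \tfrac12 \in (0,\infty), \\
G &= \tfrac12\,\partial^2 - x\,\partial, \qquad
  D_0 = \int x^2 \psi^2(x)\,dx = \tfrac12 . \\[4pt]
% Shifted potential F^* = F + c x^2, written with \omega^2 = 1 + 2c > 0:
F^*(x) &= \tfrac12\,(\omega^2 x^2 - 1), \qquad
  \psi^*(x) = (\omega/\pi)^{1/4} e^{-\omega x^2/2}, \\
\int x^2 [\psi^*(x)]^2\,dx &= \frac{1}{2\omega} = D
  \;\Longrightarrow\; \omega = \frac{1}{2D}, \qquad
  c = \frac12\Big(\frac{1}{4D^2} - 1\Big), \\
G^* &= \tfrac12\,\partial^2 + (\log\psi^*)'\,\partial
     = \tfrac12\,\partial^2 - \frac{x}{2D}\,\partial .
\end{align*}
```

In this example both the $\limsup$ and the $\liminf$ of $F(x)/|x|^2$ equal $1/2$, so case (1) applies for every $D > 0$: the predicted limit is again an Ornstein-Uhlenbeck motion, with $c > 0$ when $D < D_0$ and $-1/2 < c < 0$ when $D > D_0$.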
We will see that this transition is a consequence of heavy-tailed behavior: it occurs when the micro-canonical fiat is not a rare enough event to produce a shift, Gibbsian or otherwise, in the limit. Note the subtle dependence on the shape of $F$; the potential need only be "weak" along a given direction to result in a breaking between micro-canonically raising/lowering the mean-square.

The proof of Theorem 1 is anchored by an entropy inequality due to Csiszár [1], and we begin in Sect. 2 by describing his work on conditioned (micro-canonical) sequences of independent identically distributed variables. Here we also prepare the notion of relative entropy between a stationary process and a diffusion and prove various properties thereof. The bulk is contained in Sect. 3. There, the important Lemma 4 identifies the thermodynamic limit and is a version of Csiszár's inequality for diffusions. Along with this the proof of Theorem 1 is completed, the phase transition demonstrated through Lemma 5 which analyzes a variational problem stemming from the free energy of the $M_L$ ensemble. Section 4 explains the connection to non-linear waves. As a final remark, Sect. 5 recasts the current work in terms of a parabolic Martin Boundary problem.

Remark 1. The choice of the mean-square is natural and keeps things concrete. The result of Theorem 1 is easily extended to $M_L$ obtained by conditioning on other "energies."

2. Preliminaries

2.1. Csiszár's original inequality. Conditional limit theorems of the type we are interested in have been widely studied for independent identically distributed variables. Probably the most far reaching results are those of Csiszár [1] whose ideas we borrow freely. For any two probability measures $\mu$ and $\lambda$ on a nice space $X$, let $H(\mu|\lambda)$ be the relative entropy of $\mu$ given $\lambda$: with $C(X)$ denoting the space of bounded continuous functions,
$$H(\mu|\lambda) = \inf\left\{ c : \int \phi(x)\,\mu(dx) \le c + \log \int e^{\phi(x)}\,\lambda(dx) \ \text{ for all } \phi \in C(X) \right\}.$$
$H(\mu|\lambda)$ is nonnegative and convex as a function of $\mu$. It is finite if and only if $\mu$ is absolutely continuous with respect to $\lambda$ and $d\mu/d\lambda = f(x)$ satisfies $\int f(x)\log f(x)\,\lambda(dx) < \infty$, in which case $H(\mu|\lambda) = \int \log f(x)\,\mu(dx)$. Important here is the fact that relative entropy bounds total variation distance.

Next let $X_1, X_2, \ldots$ be a sequence of, say, real valued independent random variables with common distribution $P_X$, and introduce the empirical distribution $L_n = \frac1n\sum_{i=1}^n \delta_{X_i}$. For a set $\Gamma \subset M_1(\mathbb{R})$, the space of probability measures on the real line, let $P_{X^n|\Gamma}$ be the distribution of the sequence $X_k : k \le n$ conditional on $L_n \in \Gamma$. The remarkable observation of Csiszár is the following.$^3$ Let $\Gamma$ be a convex subset of $M_1(\mathbb{R})$ with $P(L_n \in \Gamma) > 0$. Also let $P^*$ be the minimizer: $\inf_{P\in\Gamma} H(P|P_X) = H(P^*|P_X)$, which may be shown to exist. Then
$$\frac1n H\left(P_{X^n|\Gamma}\,\middle|\,(P^*)^n\right) \le -\frac1n \log P(L_n \in \Gamma) - H(P^*|P_X).$$

$^3$ This is but an instance of Csiszár's Theorem 1 [1]; his technical setup is more elaborate.
Now for any $\mu \in M_1(\mathbb{R}^n)$ and $\lambda \in M_1(\mathbb{R})$, one has $\sum_{i=1}^n H(\mu_i|\lambda) \le H(\mu|\lambda^n)$, in which $\mu_i$ denotes the marginal of $\mu$ on the $i$-th co-ordinate. Using the exchangeable nature of the conditioned variables, Csiszár concludes that
$$H\left(P_{X_1|\Gamma}\,\middle|\,P^*\right) \le -\frac1n \log P(L_n \in \Gamma) - H(P^*|P_X). \tag{4}$$
The point being that the distance between the conditioned distribution and the entropy minimizer is controlled by an object to which Large Deviation theory, in particular Sanov's Theorem, applies. Indeed, convergence of the free energy to its anticipated value implies convergence of the conditional distribution of $X_1$ to $P^*$ in entropy and so also in total variation.

The proof of Csiszár's inequality (4) is based on a "triangle inequality" for relative entropies, also proved in [1]: if $P^*$ minimizes $H(\cdot|P)$ over the convex set $\Gamma$, then
$$H(Q|P) \ge H(Q|P^*) + H(P^*|P) \quad \text{for all } Q \in \Gamma; \tag{5}$$
the geometric picture being that if $H(\cdot|\cdot)$ tries to be a squared distance, then the angle between the lines connecting $Q$ to $P$ and $Q$ to $P^*$ is acute. Csiszár's setup has been extended to discrete parameter Markov Chains by Schroeder [14] and Dembo-Zeitouni [2].
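The passage from the $n$-fold product bound to the single-coordinate bound (4) is only indicated in the text. A minimal sketch of that step follows, writing $\Gamma$ for the convex conditioning set and $\mu = P_{X^n|\Gamma}$; it uses nothing beyond the marginal inequality and exchangeability quoted above.

```latex
% Sketch: from the product-space bound to inequality (4).
% Exchangeability gives \mu_i = P_{X_1|\Gamma} for every i = 1, ..., n.
\begin{align*}
n\, H\big(P_{X_1|\Gamma}\,\big|\,P^*\big)
  &= \sum_{i=1}^{n} H\big(\mu_i\,\big|\,P^*\big)
   && \text{(identical marginals)} \\
  &\le H\big(P_{X^n|\Gamma}\,\big|\,(P^*)^n\big)
   && \text{(marginal inequality with } \lambda = P^*) \\
  &\le -\log P(L_n \in \Gamma) - n\,H(P^*|P_X)
   && \text{(product-space bound, times } n).
\end{align*}
% Dividing through by n yields (4).
```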
2.2. Technicalities. For us the chain of iid variables above is replaced by the diffusion $X$ with generator $G = (1/2)\Delta + (\nabla\psi/\psi)\cdot\nabla$ conditional on $X(0) = X(L)$ and $\int_0^L |X|^2(t)\,dt \in L[D, D+\delta]$. It is a point of good fortune that the joint motion $t \to \left(X(t),\ I(t) = \int_0^t |X(s)|^2\,ds\right)$ is also a diffusion with generator $G^+ = G + |x|^2\,\partial/\partial I$. While the latter is degenerate in that $\partial^2/\partial I^2$ is missing from the top, a theorem of Hörmander shows that $G^+$ is "hypo-elliptic," i.e., the joint density $P_x\left[X(t) = x',\ \int_0^t |X(s)|^2\,ds = I\right]$ is smooth in all its variables and also positive, provided only that $t, I > 0$.$^4$ Thus, we also have a smooth positive density function for the micro-canonical marginal $M_L[X(\cdot) \in dx] = m_L(x)\,dx$:
$$m_L(x) = Z_L^{-1} \int_{L[D, D+\delta]} P_x\left[X(L) = x,\ \int_0^L |X(t)|^2\,dt = N\right] dN,$$
a small technical point that will be useful in what follows.

Next, in the diffusion format, the Donsker-Varadhan $I$-function will play the role of relative entropy; we review a few definitions. For our reversible operator $G$, the $I$-function takes the particularly nice form: with $\mu(dx) = f^2(x)\,dx$ a probability measure on $\mathbb{R}^d$,
$$I(\mu : G) = \frac12 \int_{\mathbb{R}^d} |\nabla f|^2\,dx + \int_{\mathbb{R}^d} \frac{\Delta\psi}{2\psi}\, f^2\,dx.$$

$^4$ Krylov [5] explains such matters. Replacing $|x|^2$ with a more general energy $U(x)$ is amenable to the same theory provided $U$ has no zero of infinite order.
The importance of the $I$-function lies in that it controls the large deviations of $L(t, X)(dx) = (1/t)\int_0^t 1_{X(s)\in dx}\,ds$, the occupation measure of the process. To wit, Donsker and Varadhan [4] have proved that:
$$\liminf_{L\to\infty} \frac1L \log P_\bullet\big(L(L, X) \in \mathcal{G}\big) \ge -\inf_{\mu\in\mathcal{G}} I(\mu : G) \quad \text{for open sets } \mathcal{G} \subset M_1(\mathbb{R}^d)$$
and
$$\limsup_{L\to\infty} \frac1L \log P_\bullet\big(L(L, X) \in \mathcal{F}\big) \le -\inf_{\mu\in\mathcal{F}} I(\mu : G) \quad \text{for closed sets } \mathcal{F} \subset M_1(\mathbb{R}^d),$$
with $I(\mu : G) = 0$ if and only if $\mu$ is the invariant measure of the process. The relevance here, of course, is that our micro-canonical event $\int_0^L |X(t)|^2\,dt \simeq LD$ may be written $\int_{\mathbb{R}^d} |x|^2\, L(L, X)(dx) \simeq D$.

Finally, as the convergence of the micro-canonical ensemble will be shown to hold in entropy, properties of the relative entropy between a diffusion $P$ and a stationary process $M$ are now described. In these computations, the initial point of the diffusion is distributed according to the marginal of $M$. Lemma 1 shows that entropy defined in this manner is super-additive; Lemma 2 relates it to the $I$-function. In both, $t \to \omega(t)$ denotes the generic path.

Lemma 1. If $M$ is a stationary process with marginal $M[\omega(0) \in dx] = m(dx)$ and $P_\bullet$ is a diffusion, then the relative entropy $H_{\mathcal{F}_0^t}(M|P_m)$ is super-additive in $t$: for $0 \le t' \le t$,
$$H_{\mathcal{F}_0^{t'}}(M|P_m) + H_{\mathcal{F}_0^{t-t'}}(M|P_m) \le H_{\mathcal{F}_0^{t}}(M|P_m).$$

Proof. It is well known that entropy increases over fields, that is, we have the identity
$$H_{\mathcal{F}_0^{t}}(M|P_m) = H_{\mathcal{F}_0^{t'}}(M|P_m) + E^M\left[ H_{\mathcal{F}_{t'}^{t}}\big(M_{(\omega,t')}\,\big|\,P_{m,(\omega,t')}\big) \right].$$
Here $M_{(\omega,t')}$ is the regular conditional probability distribution of $M$ given $\mathcal{F}_0^{t'}$, and likewise for $P_m$. The paths of $M_{(\omega,t')}$, $P_{m,(\omega,t')}$ agree on $\mathcal{F}_0^{t'}$. This, together with the stationarity of $M$ and the Markov property of $P$, implies
$$H_{\mathcal{F}_{t'}^{t}}\big(M_{(\omega,t')}\,\big|\,P_{m,(\omega,t')}\big) = H_{\mathcal{F}_{t'}^{t}}\big(M_{(\omega,t')}\,\big|\,P_{\omega(t')}\big) = \sup_{\phi}\left\{ E^{M_{(\omega,t')}}[\phi] - \log E^{P_{\omega(t')}}\big[e^{\phi}\big] \right\},$$
the supremum being taken over all continuous functions $\phi : \omega \to \mathbb{R}$ which are measurable over the field $\mathcal{F}_{t'}^{t}$. However, for any such $\phi$,
$$E^M[\phi] = E^M\left[E^{M_{(\omega,t')}}[\phi]\right] \le E^M\left[ H_{\mathcal{F}_{t'}^{t}}\big(M_{(\omega,t')}\,\big|\,P_{m,(\omega,t')}\big) + \log E^{P_{\omega(t')}}\big[e^{\phi}\big] \right],$$
and so
$$\begin{aligned}
E^M\left[ H_{\mathcal{F}_{t'}^{t}}\big(M_{(\omega,t')}\,\big|\,P_{m,(\omega,t')}\big) \right]
&\ge \sup_{\phi}\left\{ E^M[\phi] - E^M\left[\log E^{P_{\omega(t')}}\big[e^{\phi}\big]\right] \right\} \\
&\ge \sup_{\phi}\left\{ E^M[\phi] - \log E^{P_m}\big[e^{\phi}\big] \right\} = H_{\mathcal{F}_0^{t-t'}}(M|P_m).
\end{aligned}$$
Jensen's inequality and the stationarity of $M$ are used in the second line. The proof is finished.
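Lemma 2. Now let $M$ be a stationary process whose marginal distribution has positive $C^1$ density function with respect to Lebesgue measure, $M[\omega(t) \in dx] = m(x)\,dx$. Further, let $\hat P_\bullet$ be the diffusion process generated by $\hat G = (1/2)\Delta + (\nabla\sqrt{m}/\sqrt{m})\cdot\nabla$. If, as before, $P_\bullet$ corresponds to the generator $G = (1/2)\Delta + (\nabla\psi/\psi)\cdot\nabla$, then
$$t\,I(m : G) = H_{\mathcal{F}_0^{t}}(M|P_m) - H_{\mathcal{F}_0^{t}}\big(M\,\big|\,\hat P_m\big).$$

Proof. $P_\bullet$ and $\hat P_\bullet$ are mutually absolutely continuous over short fields, and $P_m$ and $\hat P_m$ inherit this feature. It follows that
$$H_{\mathcal{F}_0^{t}}(M|P_m) = H_{\mathcal{F}_0^{t}}\big(M\,\big|\,\hat P_m\big) + E^M\left[\log \frac{d\hat P_m}{dP_m}\bigg|_{\mathcal{F}_0^{t}}\right].$$
Now the claim will follow from evaluating the second term on the right as $t\,I(m : G)$. To see this, take Brownian Motion ($BM_\bullet$) as reference measure and use the Cameron-Martin formula to express
$$\frac{d\hat P_m}{dP_m}\bigg|_{\mathcal{F}_0^{t}} = \frac{d\hat P_m/dBM_\bullet\,|_{\mathcal{F}_0^{t}}}{dP_m/dBM_\bullet\,|_{\mathcal{F}_0^{t}}} = \frac{\exp\left[\int_0^t \frac{\nabla\sqrt m}{\sqrt m}(\omega(t'))\cdot d\omega(t') - \frac12\int_0^t \left|\frac{\nabla\sqrt m}{\sqrt m}\right|^2(\omega(t'))\,dt'\right]}{\exp\left[\int_0^t \frac{\nabla\psi}{\psi}(\omega(t'))\cdot d\omega(t') - \frac12\int_0^t \left|\frac{\nabla\psi}{\psi}\right|^2(\omega(t'))\,dt'\right]}.$$
An application of Itô's Lemma yields
$$\frac{d\hat P_m}{dP_m}\bigg|_{\mathcal{F}_0^{t}} = \exp\left[\log\frac{\sqrt m(\omega(t))}{\sqrt m(\omega(0))} - \log\frac{\psi(\omega(t))}{\psi(\omega(0))} + \frac12\int_0^t \frac{\Delta\psi}{\psi}(\omega(t'))\,dt' - \frac12\int_0^t \frac{\Delta\sqrt m}{\sqrt m}(\omega(t'))\,dt'\right].$$
After taking logarithms and then expectation under the $M$-measure of the right-hand side, you will see that the first two terms in the exponent vanish by the stationarity of $M$. What remains is
$$\begin{aligned}
E^M\left[\log\frac{d\hat P_m}{dP_m}\bigg|_{\mathcal{F}_0^{t}}\right]
&= E^M\left[\frac12\int_0^t \left(\frac{\Delta\psi}{\psi}(\omega(t')) - \frac{\Delta\sqrt m}{\sqrt m}(\omega(t'))\right) dt'\right] \\
&= t\int_{\mathbb{R}^d}\left(\frac{\Delta\psi}{2\psi}(x) - \frac{\Delta\sqrt m}{2\sqrt m}(x)\right) m(x)\,dx \\
&= t\left[\frac12\int_{\mathbb{R}^d} |\nabla\sqrt m|^2\,dx + \int_{\mathbb{R}^d}\frac{\Delta\psi}{2\psi}\, m(x)\,dx\right] = t\,I(m : G),
\end{aligned}$$
where the second line is justified by Fubini and another application of the stationarity of $M$. The proof is finished.

Remark 2. In the applications to follow, the stationary process $M$ will be our periodic or micro-canonical diffusion. The statements of Lemmas 1 and 2 remain valid in this case so long as the periodicity $L$ exceeds $t$.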
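The $I$-function of Sect. 2.2, which appears on the left of Lemma 2, can be checked against the familiar Dirichlet form of $G$ with respect to $\psi^2\,dx$. The sketch below assumes $f$ and $\psi$ are smooth with enough decay for the boundary terms to vanish, and introduces the ratio $g = f/\psi$ purely as notation for this check.

```latex
% Sketch: I(\mu : G) as a Dirichlet form, with \mu(dx) = f^2 dx and g = f/\psi.
\begin{align*}
\int |\nabla g|^2 \psi^2\,dx
  &= \int |\nabla f - g\,\nabla\psi|^2\,dx
   = \int \big( |\nabla f|^2 - 2g\,\nabla f\cdot\nabla\psi + g^2 |\nabla\psi|^2 \big)\,dx, \\
\int \frac{\Delta\psi}{\psi}\, f^2\,dx
  &= \int g^2 \psi\,\Delta\psi\,dx
   = -\int \nabla(g^2\psi)\cdot\nabla\psi\,dx
   = \int \big( -2g\,\nabla f\cdot\nabla\psi + g^2 |\nabla\psi|^2 \big)\,dx .
\end{align*}
% Adding \int |\nabla f|^2 dx to the second line reproduces the first, so
\[
I(\mu : G) = \frac12\int |\nabla f|^2\,dx + \int \frac{\Delta\psi}{2\psi}\, f^2\,dx
           = \frac12 \int \Big|\nabla \frac{f}{\psi}\Big|^2 \psi^2\,dx \;\ge\; 0 .
\]
```

In this form the $I$-function is visibly nonnegative and vanishes exactly when $f/\psi$ is constant, which matches the statement that $I(\mu : G) = 0$ only at the invariant measure $\psi^2\,dx$.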
3. The Thermodynamic Limit

The intuition is as follows. We know at least that $M_\infty$ is stationary. Supposing it is also Markovian there is an obvious program before you. Minimizing the $I$-function over the micro-canonical set produces some measure $\mu^*(dx) = [\psi^*(x)]^2\,dx$ which should serve as the marginal for the limiting $P^*_\bullet$. To pin down the full limit, one notes that as it must be absolutely continuous to $P_\bullet$ over short fields (their relative entropy being finite), its generator takes the form $G^* = (1/2)\Delta + b\cdot\nabla$. For $d = 1$, $P^*_\bullet$ would now be determined. The scale and speed measures, which uniquely characterize the process, satisfy:
$$\frac{\text{scale}(dx)}{dx} = \exp\left(-2\int_0^x b(x')\,dx'\right) \quad\text{and}\quad \frac{\text{speed}(dx)}{dx} = \exp\left(2\int_0^x b(x')\,dx'\right),$$
the latter being $[\psi^*]^2$ up to a normalizer, i.e., $b = (\log\psi^*)'$. For $d > 1$, one may only say $0 = (G^*)^\dagger [\psi^*]^2 = (1/2)\Delta[\psi^*]^2 - \nabla\cdot\big(b\,[\psi^*]^2\big)$, and, while $b = \nabla\log\psi^*$ is a solution, there is no uniqueness. However, as one would expect, this is the limit identified below.

3.1. The entropy bound. To carry out the above, we develop a version of Csiszár's Inequality (4) for (reversible) diffusions, Lemma 4 below. This in turn relies on Lemma 3, the "triangle inequality" for $I$-functions; compare (5). Now, as these are both rather generic tools, we state them in a slightly broader context than needed in the present. An entire class of micro-canonical ensembles may be derived by restricting $P_L$ to those paths satisfying $L(L, X) \in \Gamma \subset M_1(\mathbb{R}^d)$; our choice of $\int_0^L |X(t)|^2\,dt \in L[D, D+\delta]$ is but an example. Lemmas 3 and 4 require only that (1) $\Gamma$ is convex and (2) that $P_L(L(L, X) \in \Gamma) > 0$.

Lemma 3. Given the generator $G = (1/2)\Delta + (\nabla\psi/\psi)\cdot\nabla$ with smooth invariant density $\psi^2$, let $\mu^*$ minimize $I(\nu : G)$ over the convex set $\Gamma$:
$$I(\Gamma : G) = \inf_{\nu\in\Gamma} I(\nu : G) = I(\mu^* : G).$$
This is useless if $I(\mu^* : G) = +\infty$. If $I(\mu^* : G) < \infty$, then $\mu^*$ has a density $[\psi^*]^2$ such that $\psi^* \in W^{1,2}(\mathbb{R}^d)$ (see [4]), and one may define the generator $G^* = (1/2)\Delta + (\nabla\psi^*/\psi^*)\cdot\nabla$ with invariant measure $\mu^*$. The processes corresponding to $G$ and $G^*$ are mutually absolutely continuous over short fields, and for any $\nu \in \Gamma$,
$$I(\nu : G) \ge I(\nu : G^*) + I(\mu^* : G).$$

Proof. Let $\nu \in \Gamma$ satisfy $I(\nu : G) < \infty$. This implies that $\nu$ is absolutely continuous with respect to $\mu$ and has density $\varphi^2$ with $\varphi \in W^{1,2}(\mathbb{R}^d)$. Consider $H(\varepsilon) = I(\varepsilon\nu + (1-\varepsilon)\mu^* : G)$ for $0 \le \varepsilon \le 1$. By the convexity of the $I$-function, $H(\varepsilon)$ is a (bounded) convex function of $\varepsilon$. Since
$$I(\mu^* : G) \le I(\varepsilon\nu + (1-\varepsilon)\mu^* : G) = I\big(\varepsilon\varphi^2 + (1-\varepsilon)[\psi^*]^2 : G\big),$$
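$H(\varepsilon)$ is a non-decreasing function of $\varepsilon$. Thus
$$H^+(0) = \lim_{\varepsilon\to 0} \frac{d}{d\varepsilon} I(\varepsilon\nu + (1-\varepsilon)\mu^* : G) \ge 0,$$
the existence of the limit being automatic. The proof of the inequality follows Csiszár quite closely. One starts from the obvious
$$I(\nu : G) - I(\mu^* : G) \ge I(\nu : G) - I(\mu^* : G) - \frac1\varepsilon\,[H(\varepsilon) - H(0)], \tag{6}$$
holding for all $\varepsilon > 0$. The limit $\varepsilon \downarrow 0$ is then taken, and the lemma is proved by means of the evaluation $H^+(0) = I(\nu : G) - I(\nu : G^*) - I(\mu^* : G)$. It is convenient to introduce the notation $\nu_\varepsilon = \varepsilon\nu + (1-\varepsilon)\mu^*$ and
$$f_\varepsilon = \frac{d\nu_\varepsilon}{d\mu} = \varepsilon\frac{d\nu}{d\mu} + (1-\varepsilon)\frac{d\mu^*}{d\mu} = \varepsilon\frac{\varphi^2}{\psi^2} + (1-\varepsilon)\frac{[\psi^*]^2}{\psi^2} \equiv \varepsilon f_1 + (1-\varepsilon) f_0.$$
Note that $\sqrt{f_\varepsilon} \in W^{1,2}$. Then
$$H(\varepsilon) = \int_{\mathbb{R}^d} \sqrt{f_\varepsilon}\,(-G)\sqrt{f_\varepsilon}\,d\mu = \frac12\int_{\mathbb{R}^d} |\nabla\sqrt{f_\varepsilon}|^2\,d\mu$$
and so, if the differentiation could be passed inside the last integral, the conclusion would be
$$H^+(0) = \frac18\int_{\mathbb{R}^d} \frac{d}{d\varepsilon}\left(\frac{|\nabla f_\varepsilon|^2}{f_\varepsilon}\right)\bigg|_{\varepsilon=0} d\mu = \frac14\int_{\mathbb{R}^d} \frac{\nabla f_\varepsilon}{\sqrt{f_\varepsilon}}\cdot\frac{d}{d\varepsilon}\left(\frac{\nabla f_\varepsilon}{\sqrt{f_\varepsilon}}\right)\bigg|_{\varepsilon=0} d\mu. \tag{7}$$
To see that this is the case, consider the limit of $\varepsilon^{-1}[H(\varepsilon) - H(0)]$ in conjunction with the whole right-hand side of (6). Setting $H(\varepsilon) = \int h(\varepsilon, x)\,d\mu(x)$, in which $h(\varepsilon, x) = (1/2)|\nabla\sqrt{f_\varepsilon}(x)|^2$, we write
$$\begin{aligned}
I(\nu : G) - I(\mu^* : G) - \frac1\varepsilon\big(H(\varepsilon) - H(0)\big)
&= \frac12\int_{\mathbb{R}^d} |\nabla\sqrt{f_1}|^2\,d\mu(x) - \frac12\int_{\mathbb{R}^d} |\nabla\sqrt{f_0}|^2\,d\mu(x) - \frac1\varepsilon\,[H(\varepsilon) - H(0)] \\
&= \int_{\mathbb{R}^d} \left\{ h(1, x) - h(0, x) - \frac1\varepsilon\,[h(\varepsilon, x) - h(0, x)] \right\} d\mu(x).
\end{aligned}$$
The advantage of this being, of course, that, for each $x$, $\varepsilon \to h(\varepsilon, x) = (1/2)|\nabla\sqrt{f_\varepsilon}(x)|^2$ is convex on $(0, 1]$. It follows that
$$h(1, x) - h(0, x) - \frac1\varepsilon\,[h(\varepsilon, x) - h(0, x)]$$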
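is monotone non-decreasing, and the manipulation inherent in (7) holds by monotone convergence. Now, since
$$\begin{aligned}
\frac{d}{d\varepsilon}\left(\frac{\nabla f_\varepsilon}{\sqrt{f_\varepsilon}}\right)
&= \frac{\nabla f_1 - \nabla f_0}{\sqrt{\varepsilon f_1 + (1-\varepsilon) f_0}} - \frac12\,\frac{\varepsilon\nabla f_1 + (1-\varepsilon)\nabla f_0}{(\varepsilon f_1 + (1-\varepsilon) f_0)^{3/2}}\,(f_1 - f_0) \\
&= \frac{\nabla f_1 - \nabla f_0}{\sqrt{f_0}} - \frac12\,\frac{\nabla f_0}{\sqrt{f_0}}\,\frac{f_1 - f_0}{f_0} \quad \text{at } \varepsilon = 0,
\end{aligned}$$
we have that
$$\frac{d}{d\varepsilon}\left(\frac12|\nabla\sqrt{f_\varepsilon}|^2\right)\bigg|_{\varepsilon=0} = \frac12\,\nabla\left(\frac{f_1 - f_0}{\sqrt{f_0}}\right)\cdot\nabla\sqrt{f_0},$$
and so also,
$$H^+(0) = \int_{\mathbb{R}^d} \frac{f_1 - f_0}{\sqrt{f_0}}\,(-G)\sqrt{f_0}\,d\mu = \int_{\mathbb{R}^d} (f_1 - f_0)\left(-G^{\sqrt{f_0}}\right)(1)\,d\mu.$$
In the rightmost expression of the last display the $\sqrt{f_0}$ upstairs refers to the conjugate or "h-transform" of $G$ defined by
$$G^{\sqrt{f_0}} = \frac{1}{\sqrt{f_0}}\,G\,\sqrt{f_0} = G + \frac{\nabla\sqrt{f_0}}{\sqrt{f_0}}\cdot\nabla + \frac{G\sqrt{f_0}}{\sqrt{f_0}} = G^* + \frac12\left(\frac{\Delta\psi^*}{\psi^*} - \frac{\Delta\psi}{\psi}\right).$$
The proof is then finished by
$$\begin{aligned}
I(\nu : G) - I(\mu^* : G) - H^+(0)
&= \int_{\mathbb{R}^d} \sqrt{f_1}\,(-G)\sqrt{f_1}\,d\mu - \int_{\mathbb{R}^d} \sqrt{f_0}\,(-G)\sqrt{f_0}\,d\mu - \int_{\mathbb{R}^d} (f_1 - f_0)\left(-G^{\sqrt{f_0}}\right)(1)\,d\mu \\
&= \int_{\mathbb{R}^d} \sqrt{f_1}\,(-G)\sqrt{f_1}\,d\mu - \int_{\mathbb{R}^d} f_1\left(-G^{\sqrt{f_0}}\right)(1)\,d\mu \\
&= \int_{\mathbb{R}^d} \frac{\sqrt{f_1}}{\sqrt{f_0}}\left(-G^{\sqrt{f_0}}\right)\frac{\sqrt{f_1}}{\sqrt{f_0}}\,d\mu^* - \int_{\mathbb{R}^d} \frac{f_1}{f_0}\left(-G^{\sqrt{f_0}}\right)(1)\,d\mu^* \\
&= \int_{\mathbb{R}^d} \frac{\sqrt{f_1}}{\sqrt{f_0}}\left(-G^* - \frac{G\sqrt{f_0}}{\sqrt{f_0}}\right)\frac{\sqrt{f_1}}{\sqrt{f_0}}\,d\mu^* - \int_{\mathbb{R}^d} \frac{f_1}{f_0}\left(-G^*(1) - \frac{G\sqrt{f_0}}{\sqrt{f_0}}\right) d\mu^* \\
&= \int_{\mathbb{R}^d} \frac{\sqrt{f_1}}{\sqrt{f_0}}\,(-G^*)\,\frac{\sqrt{f_1}}{\sqrt{f_0}}\,d\mu^* \\
&= \int_{\mathbb{R}^d} \frac{\varphi}{\psi^*}\,(-G^*)\,\frac{\varphi}{\psi^*}\,d\mu^* = I(\nu : G^*),
\end{aligned}$$
as advertised.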
Lemma 4. Assume the convex set $\Gamma$ satisfies $I(\Gamma : G) = I(\mu^* : G) < \infty$ and that $\mu^*$ admits a nice ($C^1$, positive) density. Further assume that the micro-canonical marginal $M_L[X(\cdot) \in dx] = P_L[X(\cdot) \in dx \,|\, L(L, X) \in \Gamma]$ has a $C^1$ positive density. Then, for fixed $t < L$,
$$\frac{L-1}{Lt}\,H_{\mathcal{F}_0^{t}}\big(M_L\,\big|\,P^*_{m_L}\big) \le -\frac1L\log P_L\big(L(L, X) \in \Gamma\big) - \frac{L-1}{L}\,I(\mu^* : G) - \frac1L\,M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right]. \tag{8}$$

Proof. Since $M_L(A) = P_L\big(A \cap \{L(L, X) \in \Gamma\}\big)/P_L\big(L(L, X) \in \Gamma\big)$, it follows that $M_L$ is absolutely continuous with respect to $P_L$ and that
$$H_{\mathcal{F}_0^{L}}(M_L|P_L) = -\log P_L\big(L(L, X) \in \Gamma\big) < \infty.$$
As entropy increases with the field,
$$-\log P_L\big(L(L, X) \in \Gamma\big) \ge H_{\mathcal{F}_0^{L-1}}(M_L|P_L) = H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P_{m_L}\big) + M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right],$$
where $m_L(dx) = M_L[X(0) \in dx]$; the assumed positivity of the corresponding density makes sense of the last line. The rotation invariance of $M_L$ then allows an application of Lemma 2 for short fields $\mathcal{F}_0^{L'}$ with $L' < L$ and $\hat P_\bullet$ the diffusion generated by $\hat G = (1/2)\Delta + (\nabla\sqrt{m_L}/\sqrt{m_L})\cdot\nabla$. The outcome is
$$H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P_{m_L}\big) = H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,\hat P_{m_L}\big) + (L-1)\,I(m_L : G);$$
in particular, the right-hand side is finite as $P_L(L(L, X) \in \Gamma) > 0$. Thus, you may also write
$$-\log P_L\big(L(L, X) \in \Gamma\big) \ge H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,\hat P_{m_L}\big) + (L-1)\,I(m_L : G) + M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right]. \tag{9}$$
Next, apply Lemma 2 once more to give
$$H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P^*_{m_L}\big) = H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,\hat P_{m_L}\big) + (L-1)\,I(m_L : G^*), \tag{10}$$
where both the first and third term may, perhaps, be infinite. To see that is not so, first notice that by the rotation invariance of $M_L$ the marginal $m_L$ is contained in the (convex) set $\Gamma$ for each $L < \infty$. An application of the triangle inequality of Lemma 3,
$$I(m_L : G) \ge I(m_L : G^*) + I(\mu^* : G), \tag{11}$$
then shows that $I(m_L : G^*)$ is finite. Substitution of (10) into (9) produces
$$-\log P_L\big(L(L, X) \in \Gamma\big) \ge H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P^*_{m_L}\big) + (L-1)\big[I(m_L : G) - I(m_L : G^*)\big] + M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right],$$
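which, together with (11), gives you
$$\frac1L\,H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P^*_{m_L}\big) \le -\frac1L\log P_L\big(L(L, X) \in \Gamma\big) - \frac{L-1}{L}\,I(\mu^* : G) - \frac1L\,M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right].$$
Finally, by the super-additive property established in Lemma 1,
$$H_{\mathcal{F}_0^{L-1}}\big(M_L\,\big|\,P^*_{m_L}\big) \ge \frac{L-1}{t}\,H_{\mathcal{F}_0^{t}}\big(M_L\,\big|\,P^*_{m_L}\big).$$
The proof is finished.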
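The last step of the proof iterates the super-additivity of Lemma 1. A sketch of that iteration is given below; the shorthand $h(s)$ and the integer $k$ are notation introduced here, and the treatment of a non-integer ratio $(L-1)/t$ via monotonicity of entropy in the field is an assumption about how the step is meant to be read rather than something stated in the text.

```latex
% Sketch: iterating Lemma 1.  Shorthand (introduced here):
%   h(s) = H_{\mathcal{F}_0^{s}}(M_L | P^*_{m_L}),  0 < s <= L - 1.
\begin{align*}
h(kt) &\ge h(t) + h\big((k-1)t\big) \ge \cdots \ge k\,h(t)
  && \text{(Lemma 1, applied } k-1 \text{ times)}, \\
h(L-1) &\ge h(kt) \ge k\,h(t), \qquad k = \lfloor (L-1)/t \rfloor
  && \text{(entropy increases with the field)}.
\end{align*}
% If (L-1)/t is an integer this is exactly the bound used in the proof.
% In general k/L and (L-1)/(Lt) differ by at most 1/L, so both tend to 1/t
% as L grows and the conclusion drawn from (8) is unaffected.
```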
3.2. Identifying the limit. The proof of Theorem 1 may now be completed; it follows from Lemma 4 and a few quick checks. Note that with the micro-canonical fiat taken $\int_0^L |X(t)|^2\,dt \in L[D, D+\delta]$ the assumptions of the lemma (smoothness of the micro-canonical marginal densities etc.) are satisfied due to the discussion of Sect. 2.2. The point is that convergence of the right-hand side of (8) to zero as $L \uparrow \infty$ implies the desired result of the convergence of the micro-canonical ensemble $M_L$ in entropy over short fields to the diffusion $P^*$ with generator
$$G^* = \frac12\Delta + \frac{\nabla\psi^*}{\psi^*}\cdot\nabla \tag{12}$$
in which $\psi^*$ is the square-root of the minimizing density, $\mu^*(dx) = [\psi^*(x)]^2\,dx$, for the variational problem
$$I(\mu^* : G) = \lim_{\delta\downarrow 0}\ \inf_{\int |x|^2\nu(dx)\in[D, D+\delta]} I(\nu : G). \tag{13}$$
There are two limits to be established. The first, which encodes the identification of $M_\infty = P^*$,
$$\lim_{\delta\downarrow 0}\liminf_{L\to\infty} \frac1L\log P_L\left[\int_0^L |X|^2(t)\,dt \in L[D, D+\delta]\right] \ge -I(\mu^* : G),$$
is an application of the work of Donsker-Varadhan (see, for example, [11]), and we omit the proof. Then there is the "error term": we need
$$\liminf_{L\to\infty} \frac1L\,M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right] \ge 0. \tag{14}$$
This is treated in Lemma 6 below. First however, we turn to the particulars of the phase transition. This manifests itself in the limiting drift through the variational problem (13). We have the following.

Lemma 5. Let $F(x)$ be continuous and tending to $\infty$ with $|x|$. Define
$$I_F(f) = \frac18\int_{\mathbb{R}^d} \frac{|\nabla f(x)|^2}{f(x)}\,dx + \int_{\mathbb{R}^d} F(x) f(x)\,dx.$$
The infimum of $I_F(f)$ for $\int f(x)\,dx = 1$ is achieved at a unique strictly positive $f_0$, $\sqrt{f_0}$ being the ground state for $G_0 = -(1/2)\Delta + F$.
Next let $D_0 = \int_{\mathbb{R}^d} |x|^2 f_0(x)\,dx$. To understand (13) is to examine
$$I_F(D) = \inf\left\{ I_F(f) : \int_{\mathbb{R}^d} f(x)\,dx = 1,\ \int_{\mathbb{R}^d} |x|^2 f(x)\,dx = D \right\};$$
there are two cases.
(1) If either $D < D_0$ or if the growth of $F$ is fast enough ($\liminf_{|x|\uparrow\infty} F(x)/|x|^2 > 0$), then the infimum $I_F(D)$ is achieved at a unique nonnegative $f_*$ such that $\int_{\mathbb{R}^d} |x|^2 f_*(x)\,dx = D$ and there exists a constant $c = c(D)$ such that $\sqrt{f_*}$ is ground state for the adjusted operator $G_0^* = -(1/2)\Delta + F(x) + c|x|^2$.
(2) If, on the other hand, $D \ge D_0$ and $F(x)/|x|^2 \to 0$ for $|x| \uparrow \infty$ in some cone, then $I_F(D) = I_F(D_0)$ with minimizer $f_* = f_0$.

Remark 3. This describes the limiting drift in (12): $\psi^* = \sqrt{f_*}$.

Proof. (1) Take $F(x)/|x|^2 \uparrow \infty$, this being typical. All that needs to be checked is the existence of a constant $c$ such that the ground state for $G_0^*$ has second moment $D$. Now, for $c \uparrow \infty$, the ground state $\sqrt{f_*}$ for $G_0^*$ satisfies
$$c\int_{\mathbb{R}^d} |x|^2 f_*(x)\,dx \le \frac12\int_{\mathbb{R}^d} |\nabla\sqrt{f_*}(x)|^2\,dx + \int_{\mathbb{R}^d} F^*(x) f_*(x)\,dx \le \frac12\int_{\mathbb{R}^d} |\nabla\sqrt{f}(x)|^2\,dx + \int_{\mathbb{R}^d} F(x) f(x)\,dx + c\int_{\mathbb{R}^d} |x|^2 f(x)\,dx$$
for any smooth non-negative $f$ with $\int f = 1$. Thus, $\int |x|^2 f_*$ may be made as small as you want by concentrating $f$ near the origin. Likewise, for $c \downarrow -\infty$, $\int |x|^2 f_*$ can be made large by spreading $f$ out. The uniqueness is obvious.

(2) We show that there exists a sequence of strictly positive functions $f_n$ with $\int f_n = 1$, $\int |x|^2 f_n = D$ and
$$\liminf_{n\uparrow\infty} I_F(f_n) \le I_F(f_0) = \inf_{\int f = 1} I_F(f).$$
We can assume that we are working in the cone $\{x : x_1 \ge |x|\cos\theta\}$ with $0 < \theta < \pi/2$. Now, $f_0$ is strictly positive so $\inf\{f_0(x) : |x| \le 1\} \equiv \delta$ is positive. Next define
$$f_n(x) = f_0(x) + \frac{1}{n^{d+2}}\,\eta_0\left(\frac{x - \alpha n e_1}{n}\right) + \frac{1}{n^2}\,\rho_0(x) \equiv f_0 + \eta_n + \rho_n,$$
where $\eta_0 \in C_0^\infty$ is positive with support in the ball $B_r$ about the origin of radius $r = \alpha\sin\theta$; and $\rho_0$ is a smooth function supported in the unit ball, such that $\int |x|^2\rho_0(x)\,dx = 0$ and $\int\rho_0(x)\,dx < 0$, as may be achieved by concentrating the negative part of $\rho_0$ near the origin. To show that $f_n$ can be adjusted to satisfy $\int f_n = 1$ and $\int |x|^2 f_n = D$, first note that proper choice of the constant $\alpha$ permits you to make
$$\int_{\mathbb{R}^d} |x|^2\eta_n(x)\,dx = \frac{1}{n^{d+2}}\int_{\mathbb{R}^d} |x|^2\,\eta_0\left(\frac{x - \alpha n e_1}{n}\right) dx = \int_{B_r} |x + \alpha e_1|^2\,\eta_0(x)\,dx$$
equal to $D - D_0$ and thus to adjust the mean-square for a given $\eta_0$. Also, by the evaluation
$$\int_{\mathbb{R}^d} \eta_n(x)\,dx = \frac{1}{n^{d+2}}\int_{\mathbb{R}^d} \eta_0\left(\frac{x - \alpha n e_1}{n}\right) dx = \frac{1}{n^2}\int_{B_r} \eta_0(x)\,dx$$
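you see that taking $\int\eta_0 = -\int\rho_0$ keeps $\int f_n = 1$. Lastly, for all large $n$, $\|\rho_n\|_\infty$ will lie under $\delta$: in short, the $f_n$ are honest probability densities. To finish, observe that $f \to I_F(f)$ is convex and so sub-additive; this gives you $I_F(f_n) \le I_F(f_0) + a_n + b_n$ with
$$a_n = \frac{1}{n^2}\left[\int_{B_1} \frac{|\nabla\rho_0(x)|^2}{\rho_0(x)}\,dx + \int_{B_1} F(x)\rho_0(x)\,dx\right] \to 0$$
and
$$\begin{aligned}
b_n &= \frac{1}{n^{d+4}}\int \frac{\left|\nabla\eta_0\left(\frac{x - \alpha n e_1}{n}\right)\right|^2}{\eta_0\left(\frac{x - \alpha n e_1}{n}\right)}\,dx + \frac{1}{n^{d+2}}\int F(x)\,\eta_0\left(\frac{x - \alpha n e_1}{n}\right) dx \\
&= \frac{1}{n^4}\int_{B_r} \frac{|\nabla\eta_0(x)|^2}{\eta_0(x)}\,dx + \frac{1}{n^2}\int_{B_r} \big|F\big(n(x + \alpha e_1)\big)\big|\,\eta_0(x)\,dx \\
&\le \frac{1}{n^4}\,C_1 + \frac{1}{n^2}\,C_2 \sup_{x\in B_{rn} + \alpha n e_1} |F(x)| \to 0.
\end{aligned}$$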
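The powers of $n$ attached to the bump $\eta_n$ in the estimates above all come from a single change of variables. The sketch below spells this out; the variable $y$ is notation introduced here, and the geometric remark about the cone uses only $r = \alpha\sin\theta$ and $0 < \theta < \pi/2$ from the construction.

```latex
% Sketch: scaling of \eta_n(x) = n^{-(d+2)} \eta_0((x - \alpha n e_1)/n).
% Substitute y = (x - \alpha n e_1)/n, i.e. x = n(y + \alpha e_1), dx = n^d dy.
\begin{align*}
\int \eta_n\,dx
  &= n^{-(d+2)}\, n^{d} \int_{B_r} \eta_0(y)\,dy
   = n^{-2} \int_{B_r} \eta_0(y)\,dy, \\
\int |x|^2 \eta_n\,dx
  &= n^{-(d+2)}\, n^{d}\, n^{2} \int_{B_r} |y + \alpha e_1|^2\, \eta_0(y)\,dy
   = \int_{B_r} |y + \alpha e_1|^2\, \eta_0(y)\,dy, \\
\int \frac{|\nabla \eta_n|^2}{\eta_n}\,dx
  &= \frac{n^{-2(d+3)}}{n^{-(d+2)}}\, n^{d} \int_{B_r} \frac{|\nabla \eta_0(y)|^2}{\eta_0(y)}\,dy
   = n^{-4} \int_{B_r} \frac{|\nabla \eta_0(y)|^2}{\eta_0(y)}\,dy, \\
\int F\,\eta_n\,dx
  &= n^{-2} \int_{B_r} F\big(n(y + \alpha e_1)\big)\, \eta_0(y)\,dy .
\end{align*}
% The support of \eta_n is the ball of radius rn about \alpha n e_1.  Since the
% distance from \alpha n e_1 to the boundary of the cone {x_1 >= |x| cos\theta}
% is \alpha n \sin\theta = rn, this ball stays inside the cone, where
% F(x) = o(|x|^2); hence n^{-2} sup |F| over the ball tends to 0.
```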
The proof is complete.

Lemma 5 shows that the phase transition (the possibility that the micro-canonical conditioning is overcome and the thermodynamic limit is the original stationary diffusion) obtains if in the $I$-minimization problem (13) the optimizer falls out of the micro-canonical set and takes place at the global minimum. The latter is of course the original invariant density ($\psi^* = \psi$), which explains the change of phase as a consequence of heavy-tailed behavior: the event defined by the micro-canonical condition is not exponentially rare and so not felt at $L = \infty$.

Finally we provide the proof of (14) in the case that $F$ is at most quadratic at $\infty$, exactly when phase transition may occur. This finishes the proof of Theorem 1.

Lemma 6. Let again $G = (1/2)\Delta + (\nabla\psi/\psi)\cdot\nabla$ and $M_L$ our model micro-canonical measure obtained by restricting the mean-square. If $F(x)$ grows at most like $|x|^2$ at infinity, then
$$\liminf_{L\uparrow\infty} \frac1L\,M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right] \ge 0$$
independently of the value of $D$.

Proof. The Radon-Nikodym derivative over $\mathcal{F}_0^{L-1}$ of the periodic diffusion $P_L$ with respect to its stationary counterpart $P_{\psi^2}$ is $Z_L^{-1}\,p_0(1, X(L-1), X(0))$, where $p_0(t, x, x')$ is the (symmetric) transition density for $P_\bullet$ with respect to $\psi^2(x)\,dx$. Recall $Z_L \simeq 1$ for large $L$ is the partition function for $P_L$. You may then write
$$M_L\left[\log\frac{dP_{m_L}}{dP_L}\bigg|_{\mathcal{F}_0^{L-1}}\right] = -\log(Z_L) + M_L\big[\log p_0(1, X(0), X(1))\big] + H\big(m_L\,\big|\,\psi^2\big)$$
using the rotation invariance of $M_L$ to reduce the second summand to its present form. The first term causes no problem. Neither does $H(m_L|\psi^2)$ which is positive: we are concerned with the possibility that the above is large negative. As for the second term,
the potential difficulty lies in rapid decay of $p_0(1, x, x')$ for large $|x|$ or $|x'|$, but, with the present conditions, we just get by. For $\Delta\psi/\psi \le C(1 + |x|^2)$ we may use the Cameron-Martin formula to obtain an easy estimate of $p_0(1, x, x')$ from below: with $g(1, x, x') = (2\pi)^{-d/2} e^{-|x - x'|^2/2}$ and $E_{\bullet\bullet}$ = the Brownian Bridge,
$$\begin{aligned}
p_0(1, x, x') &= \frac{g(1, x, x')}{\psi^2(x')}\,E_{xx'}\left[\exp\left(\int_0^1 \frac{\nabla\psi}{\psi}(X)\cdot dX - \frac12\int_0^1 \left|\frac{\nabla\psi}{\psi}\right|^2(X(t))\,dt\right)\right] \\
&= \frac{g(1, x, x')}{\psi(x)\,\psi(x')}\,E_{00}\left[\exp\left(-\int_0^1 \frac{\Delta\psi}{2\psi}\big(x + t(x' - x) + X(t)\big)\,dt\right)\right] \\
&\ge c_1\exp\big[-c_2\big(|x|^2 + |x'|^2\big)\big].
\end{aligned} \tag{15}$$
Here we have used two facts: (1) $\psi$ is positive and decaying so its reciprocal is bounded below, and (2) the expectation $E_{00}\left[\exp\left(-C\int_0^1 |X|^2\right)\right]$ is bounded below as well. Finally, (15) implies that
$$M_L\big[\log p_0(1, X(0), X(1))\big] \ge \log c_1 - 2c_2\,M_L\big[|X|^2(0)\big] \ge \log c_1 - 4c_2 D$$
to finish the proof.

In conclusion, Theorem 1 answers the question left open in [8] and points out a different phenomenon than encountered there. On the other hand it only complements that paper in terms of the underlying technology. Where [8] treated the case $F(x)/|x|^2 \uparrow \infty$, we require that ratio to be bounded. Now the former condition is intuitively much nicer, corresponding roughly to better recurrence properties, and one expects the present method to work throughout. Closing the gap
requires extending the result of Lemma 6, $\lim_{L\uparrow\infty} L^{-1} M_L\big[\log p_0(1, X(0), X(1))\big] = 0$, to a wider class of diffusions. As seen in the proof, the needed limit may be rephrased in terms of a moment condition: it is enough to have $M_L[|X(0)|^\alpha] = o(L)$ for some $\alpha$ such that $|F(x)|/|x|^\alpha = o(1)$ at infinity. That the condition can be written in such a straightforward manner demonstrates the niceties of the diffusion format.

4. Statistical Mechanics for Wave Equations

Consider for a moment the (one-dimensional) non-linear wave equation $\partial^2 Q/\partial t^2 - \partial^2 Q/\partial x^2 + f(Q) = 0$ with periodic boundary conditions on $0 \le x \le L$. Defining $P = \partial Q/\partial t$ this equation can be written in Hamiltonian form $\partial Q/\partial t = \partial H/\partial P$, $\partial P/\partial t = -\partial H/\partial Q$ in which
$$H = \int_0^L F(Q(x))\,dx + \frac12\int_0^L |Q'(x)|^2\,dx + \frac12\int_0^L P^2(x)\,dx$$
and $F(Q) = \int^Q f$. The idea of Gibbs investigated by McKean-Vaninsky et al is that
$$e^{-H}\,d(\text{volume}) = e^{-\int_0^L F(Q(x))\,dx} \times \frac{e^{-\frac12\int_0^L |Q'(x)|^2\,dx}}{(2\pi 0+)^{\infty/2}}\,d^\infty Q \times \frac{e^{-\frac12\int_0^L P^2(x)\,dx}}{(2\pi/0+)^{\infty/2}} \tag{16}$$
ought to provide an invariant measure for the flow. This formal object has the following interpretation. The middle factor indicates that Q is a “circular” Brownian Motion (CBM), obtained by conditioning the standard Brownian Motion so that $Q(0) = Q(L) = c$ and then distributing this value over the line according to the infinite measure $(2\pi L)^{-1/2} \times dc$. In suit, the third factor defines the velocity P as a White Noise. The first factor is just a density with respect to the CBM. It is a comforting result [7] that if F tends to infinity at ±∞ and so acts as a restoring force for the wave equation, then this density provides the needed control for the measure (16) to be finite. To connect with the present work consider again the Schrödinger operator $G_0 = -(1/2)\Delta + F(Q)$ and its ground state $\psi(Q)$. Itô’s Lemma ($(dQ)^2 = dx$) will show
$$0 = \int_0^L d\log\psi[Q(x)] = \int_0^L \nabla\log\psi(Q)\cdot dQ + \frac{1}{2}\int_0^L \Delta\log\psi(Q)\,(dQ)^2$$
$$= \int_0^L \frac{\nabla\psi}{\psi}(Q)\cdot dQ - \frac{1}{2}\int_0^L \left|\frac{\nabla\psi}{\psi}\right|^2(Q)\,dx + \int_0^L F(Q)\,dx - \Lambda_0(G_0)\,L,$$
which, if substituted in (16), tells you that the density $\exp[-\int_0^L F(Q)\,dx]$ is, up to a constant multiple, the Cameron-Martin factor for the diffusion of type G used throughout this work. That is, our periodic diffusion $P_L$ and the Q part of the above Gibbs ensemble are one and the same. The micro-canonical ensemble/thermodynamic limit for these measures was considered by McKean-Vaninsky to be a model for the harder problem of the focussing cubic Schrödinger. There $\sqrt{-1}\,\partial Q/\partial t = -\partial^2 Q/\partial x^2 + |Q|^2 Q$ with Hamiltonian $H = (1/2)\int_0^L |Q'|^2 - (1/4)\int_0^L |Q|^4$, and the canonical Gibbs measure is
$$e^{-H}\,d(\mathrm{volume}) = e^{(1/4)\int_0^L |Q|^4(x)\,dx}\;\frac{e^{-(1/2)\int_0^L |Q'(x)|^2\,dx}}{(2\pi 0{+})^{\infty}}\;d^{\infty}(\mathrm{real}\ Q)\,d^{\infty}(\mathrm{imag}\ Q).$$
Now $\int_0^L |Q|^2$ is a constant of the motion; Lebowitz-Rose-Speer [6] pointed out that conditioning on it being fixed is necessary to make the total mass finite.⁵ As to the thermodynamic limit, neither the methods of [8] nor those of this paper apply, but see [10] and [12] for a different approach.
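The Itô computation earlier in this section rests on the pointwise identity $(1/2)\Delta\log\psi = F - \Lambda_0 - (1/2)|\nabla\psi/\psi|^2$ for the ground state of $-(1/2)\Delta + F$. The following is a minimal numerical sketch of that identity in one dimension. It assumes, purely for illustration, the harmonic potential $F(x) = m^2x^2/2$ with ground state $\psi(x) = \exp(-mx^2/2)$ and eigenvalue $\Lambda_0 = m/2$ (so that $\psi^2 = \exp(-m x^2)$ as in the Ornstein-Uhlenbeck example); the function names are hypothetical and derivatives are taken by central differences.

```python
import math


def log_psi(x, m):
    """log of the assumed ground state psi(x) = exp(-m x^2 / 2)."""
    return -0.5 * m * x * x


def potential(x, m):
    """Assumed harmonic potential F(x) = m^2 x^2 / 2."""
    return 0.5 * m * m * x * x


def d1(f, x, h=1e-3):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d2(f, x, h=1e-3):
    """Central-difference second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def eigen_residual(x, m):
    """-(1/2) psi'' + F psi - Lambda_0 psi with Lambda_0 = m/2; should be ~0."""
    psi = lambda y: math.exp(log_psi(y, m))
    return -0.5 * d2(psi, x) + potential(x, m) * psi(x) - 0.5 * m * psi(x)


def ito_residual(x, m):
    """(1/2)(log psi)'' - [F - Lambda_0 - (1/2)(psi'/psi)^2]; should be ~0."""
    f = lambda y: log_psi(y, m)
    drift = d1(f, x)  # psi'/psi, the drift of the diffusion
    return 0.5 * d2(f, x) - (potential(x, m) - 0.5 * m - 0.5 * drift * drift)


if __name__ == "__main__":
    for x in (-1.5, 0.0, 0.7, 2.0):
        print(x, eigen_residual(x, 2.0), ito_residual(x, 2.0))
```

Both residuals vanish up to discretization error, which is the cancellation that turns the Itô expansion of $\log\psi$ into the Cameron-Martin exponent.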
5. The Martin Boundary

As a final remark, we wish to point out the connection between the present work and that of computing the Martin Boundary for the space-time motion $t \to \big(t, X(t), I(t) = \int_0^t |X|^2(t')\,dt'\big)$. The latter is again a diffusion with generator $\mathcal{L} = \partial_t + G + |x|^2\partial_I$; its Martin Boundary is defined as the complete list of (minimal) positive solutions to $\mathcal{L}h = 0$.⁶

⁵ We mention that this ensemble and the one of (16) have solid mechanical meaning: their invariance under the flow is proved in [7] for classical waves and [9] for the cubic Schrödinger.
⁶ For a spirited introduction to Martin’s Boundary, [13] is recommended.
Returning to expression (3) for the micro-canonical mean of $\phi(X(t') : 0 \le t' \le t)$, one should notice that all that changes in $M_L[\phi]$ for $L \uparrow \infty$ is the ratio
$$Z_L^{-1}\,P_x\Big[\int_0^{L-t} |X|^2(t')\,dt' \in L[D, D+\delta] - I,\ X(L-t) = x'\Big]. \tag{17}$$
Thus, the thermodynamic limit problem is equivalent to understanding the large L behavior of this last display. The results of Krylov [5] show the family $Z_L^{-1}P_x[\mathrm{etc}]$ is tight and more. The $L \uparrow \infty$ limit $h(t, x, x', I)$ is unique and, just as one would hope from a glance at (17), satisfies $0 = \partial_t h + G_x h + |x|^2\partial_I h$. Now the link is plain: our h’s make up a “micro-canonical boundary” sitting inside the full Martin Boundary. It is a simple matter to identify the h’s. The preceding comments indicate that the thermodynamic limit may be expressed through
$$M_\infty[\phi] = \int_{R^d}\int_{R^d}\int_0^\infty E_x\Big[\phi(X),\ X(t) = x',\ \int_0^t |X(s)|^2\,ds = I\Big]\,h(t, x, x', I)\,dx\,dx'\,dI. \tag{18}$$
The results of [8] and the present work show that this is also
$$\int_{R^d} E^*_x[\phi(X)]\,\psi^{*2}(x)\,dx = \int_{R^d} E_x\Big[\phi(X(t') : 0 \le t' \le t),\ \frac{dP^*}{dP}\Big]\,\psi^{*2}(x)\,dx; \tag{19}$$
the mean value for either a new stationary diffusion with adjusted potential $F^* = F + c|x|^2$, or just the original stationary mean. In the former case, bring in once more $\psi^*$ as well as $\Lambda_0^*$, the corresponding eigenvalue. Next, one computes that over $0 \le t' \le t$ the density $(dP^*/dP)$ equals
$$\frac{\exp\Big[\int_0^t \frac{\nabla\psi^*}{\psi^*}\cdot dX - (1/2)\int_0^t \big|\frac{\nabla\psi^*}{\psi^*}\big|^2\Big]}{\exp\Big[\int_0^t \frac{\nabla\psi}{\psi}\cdot dX - (1/2)\int_0^t \big|\frac{\nabla\psi}{\psi}\big|^2\Big]} = \frac{\psi(X(0))}{\psi(X(t))}\,\frac{\psi^*(X(t))}{\psi^*(X(0))}\,\exp\Big[c\int_0^t |X|^2 + \Lambda_0^*\,t\Big],$$
which, by comparing (18) to (19), implies
$$h(t, x, x', I) = \frac{\psi^*(x)}{\psi(x)}\,\psi^*(x')\,\psi(x')\,\exp[cI + \Lambda_0^*\,t]. \tag{20}$$
In the latter case, the case of the phase transition, $dP^*/dP = 1$ and $h(t, x, x', I) = \psi^2(x')$. For a concrete example take X an Ornstein-Uhlenbeck process of mass m ($\psi^2(x) = \exp[-m|x|^2]$). In this case the complete list of minimal space-time functions (the full Martin Boundary) may be worked out: 2 e2m t − 1 α h (t, x, •, I ) = exp αtem t − − βt + βγ 2 − γ I /2 2 2m for γ ≥ −m2 , 2β = m ± γ + m2 either for α = 0 or m = ∓ (1/2) γ + m2 . That is, the Martin Boundary is a topological plane, and, note by comparison with (20), the micro-canonical boundary is just the line corresponding to α = 0. As was natural, it
was conjectured in [8] that the general case is similar. However, the phase transition described in the present paper shows that this is not so: in that case all points of the expected micro-canonical line are identified for $D \ge D_0 = \int |x|^2\psi^2(x)\,dx$. The effect of this truncation on the full boundary is an interesting problem for the future.

Acknowledgements. This research was supported in part by NSF grant DMS-9802310. It is a pleasure to thank my advisor, H.P. McKean, for suggesting the problem and his ongoing support. Thanks as well to Ofer Zeitouni for pointing me to the work of Csiszár. Finally, sincere appreciation goes out to the referee whose comments greatly improved the prose.
References
1. Csiszár, I.: Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12, 768–793 (1984)
2. Dembo, A., Zeitouni, O.: Refinements of the Gibbs conditioning principle. Prob. Theory and Related Fields 104(1), 1–14 (1996)
3. Doeblin, W.: Sur les propriétés asymptotiques de mouvement régis par certains types de chaines simples. Bull. Math. Roum. Sci 39(1), 57–115, (2), 3–61 (1937)
4. Donsker, M.D., Varadhan, S.R.S.: Asymptotic evaluation of certain Markov process expectations for large time – II. Comm. Pure Appl. Math. 29, 389–461 (1976)
5. Krylov, N.V.: Nonlinear Elliptic and Parabolic Equations of Second Order. Dordrecht-Boston-Lancaster-Tokyo: Reidel, 1987
6. Lebowitz, J.L., Rose, H., Speer, E.: Statistical mechanics of nonlinear Schrödinger Equation. J. Stat. Phys. 50, 657–687 (1988)
7. McKean, H.P., Vaninsky, K.: Statistical mechanics of nonlinear wave equations. In: Trends and Perspectives in Appl. Math, ed. L. Sirovich, Berlin-Heidelberg-New York: Springer, 1994
8. McKean, H.P., Vaninsky, K.: Brownian motion with restoring drift: The petit and micro-canonical ensembles. Commun. Math. Phys. 160, 615–630 (1994)
9. McKean, H.P.: Statistical mechanics of nonlinear wave equations (IV): Cubic Schrödinger. Commun. Math. Phys. 168, 479–491 (1995)
10. McKean, H.P.: A Martin boundary connected with the ∞-volume limit of focussing cubic Schrödinger. In: Itô’s Stochastic Calculus and Probability Theory, Berlin-Heidelberg-New York: Springer, 1996, pp. 251–259
11. Ney, P., Nummelin, E.: Markov additive processes I, II. Ann. Probab. 15(2), 561–592, 593–604 (1987)
12. Rider, B.: On the ∞-volume limit of focussing cubic Schrödinger. To appear in Comm. Pure Appl. Math.
13. Rogers, L.C.G., Williams, D.: Diffusions, Markov Processes, and Martingales Vol 1. Chichester: Wiley, 1994
14. Schroeder, C.: I-projection and conditional limit theorems for discrete parameter Markov processes. Ann. Probab. 21, 721–758 (1993)

Communicated by J.L. Lebowitz
Commun. Math. Phys. 231, 481–528 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0726-z
Communications in
Mathematical Physics
Towards a General Theory of Quantized Fields on the Anti-de Sitter Space-Time
Jacques Bros¹, Henri Epstein², Ugo Moschella³
¹ Service de Physique Théorique, C.E. Saclay, 91191 Gif-sur-Yvette, France
² Institut des Hautes Etudes Scientifiques, 91440 Bures-sur-Yvette, France
³ Dipartimento di Scienze Matematiche Fisiche e Chimiche, Via Valleggio 11, 22100 Como, and INFN sez. di Milano, Italy
Received: 3 December 2001 / Accepted: 26 July 2002 Published online: 21 October 2002 – © Springer-Verlag 2002
Abstract: We propose a general framework for studying quantum field theory on the anti-de-Sitter space-time, based on the assumption of positivity of the spectrum of the possible energy operators. In this framework we show that the n-point functions are analytic in suitable domains of the complex AdS manifold, that it is possible to Wick rotate to the Euclidean manifold and come back, and that it is meaningful to restrict AdS quantum fields to Poincaré branes. We also give a complete characterization of two-point functions, which are the simplest example of our theory. Finally we prove the existence of the AdS-Unruh effect for uniformly accelerated observers on trajectories crossing the boundary of AdS at infinity, while that effect does not exist for all the other uniformly accelerated trajectories.

1. Introduction

Quantum Field Theory (QFT) on the anti-de Sitter (AdS) space-time has today come to general attention because of the role that AdS geometry plays at several places in modern theoretical physics [M, RS]. AdS QFT is also believed to provide an infrared regularization [CW] that can be useful for instance to understand the long standing problems of gauge QFT’s. From a more conceptual viewpoint, this renewed interest has led some authors to realize the need for a deeper setting of AdS QFT and to investigate the consequences of general principles which, in spite of the difficulties generated by the peculiar geometry of AdS, might lead to a reasonable approach to the interacting fields on this spacetime; such an approach should of course include the case of AdS free fields, whose preliminary versions were given in early works [AIS, F]. In this spirit three recent works on general AdS QFT can be mentioned. [BFS] uses the framework of observable algebras, but without assuming local commutativity from the outset.
Instead the property of “passivity” (in the sense of [PW]) is postulated for the vacuum, and a remarkable proportion of the more standard properties is shown to follow. Indeed we argue in our final remarks that, heuristically speaking, their assumptions imply those of the present paper.
Two other works [Re] and [BBMS] (motivated by [M]) independently exhibit explicit relationships of a general type between AdS QFT and Conformal QFT in the Minkowskian boundary of AdS: while [Re] also pertains to the framework of local observable algebras, [BBMS] relies on a limited use of analytic n-point functions of local fields in a Wightman-type approach (see [SW]). It is the complete setting of such a Wightman-type framework for AdS QFT and the derivation of a number of general results for interacting fields belonging to that framework which are the purpose of the present paper. It may be useful to start by recalling how AdS QFT is rendered difficult by the lack of the global hyperbolicity property of the underlying manifold. This manifests itself in two ways: there exist closed timelike curves on the AdS manifold (in particular geodesics) and there is a “boundary” at spacelike infinity. The first problem is commonly avoided by considering the covering of the manifold; the second fact implies that the standard procedure of canonical quantization for free fields (see e.g. [BD, W]) cannot be used, since there does not exist a global Cauchy surface and information can flow in from spacelike infinity. To construct a viable QFT under these circumstances one may need to specify suitable boundary conditions at infinity. To this end, the nice idea in [AIS] was to use the conformal embedding of the AdS manifold in the Einstein Static Universe, which is a globally hyperbolic space-time. It was then possible (with some restrictions) to produce a class of boundary conditions that render the resulting AdS QFT well defined. Unfortunately the procedure is very special and tricky, and can work at best for free field theories. Generally speaking, a well known problem when studying QFT on gravitational backgrounds is the absence of a criterion to select the physically meaningful states.
There are indeed infinitely many inequivalent representations of the same field algebra and it is in general impossible to characterize the physically relevant vacuum states as, for instance, fundamental states for the energy operator, since the very concept of energy as a global quantity is in general not defined. However, an important aspect of the energy concept which keeps its full value is the notion of energy “relative to an observer whose world-line is an orbit of a one-parameter group of isometries of the spacetime”: for such observers, the usual quantum notion of energy, represented by the generator of timetranslations, is in fact applicable with respect to the proper-time parameter. Therefore the properties of energy-positivity and temperature (i.e. the notions of ground state and KMS-state) are meaningful relatively to such observers and technically characterized by relevant analyticity properties of the correlation functions of the fields in the corresponding complexified orbits. Of course, this approach of the energy concept remains particularly simple because we have to deal with curved spacetimes of holomorphic type, such as the de Sitter and AdS quadrics. In a previous work dealing with QFT on de Sitter spacetime ([BEM]), we had shown that the absence of global energy operators could in that case be successfully replaced by appropriate global analyticity properties of the n-point functions of the fields in “natural” tuboidal domains: the latter played the same role as the tube domains resulting from energy-positivity in the case of Minkowskian fields. Moreover a thermal interpretation of these QFT for all geodesic (as well as uniformly accelerated) observers could then be proved as a byproduct of these global analyticity properties, thus providing an extension of the Bisognano-Wichmann analyticity property obtained in the Minkowskian case [BW], or in physical terms of the “Unruh effect”. 
Surprisingly, in the AdS case, the situation concerning energy operators turns out to be more fortunate than in the de Sitter case, and in fact quite favourable! This is because there exist two classes of time-like orbits of “generic” one-parameter isometry groups, namely the class of elliptic orbits and the class of hyperbolic orbits, also supplemented
by a “boundary-type” class of parabolic orbits. The hyperbolic orbits of AdS represent uniformly accelerated motions similar to all the geodesic or uniformly accelerated motions of de Sitter spacetime: the corresponding complexified orbits, which are complex hyperbolae, are thus also “plagued” by a natural periodicity in the imaginary part of their proper-time parameter, which forbids the positivity of the corresponding energy operator (but can support at best a KMS-condition whose temperature is related to the radius of the hyperbola and thereby to the acceleration). On the contrary, the (complexified) elliptic orbits, whose class contains geodesic as well as uniformly accelerated motions, only present the peculiarity of periodicity in the real part of their proper-time parameter: this pathology already mentioned above (namely the “time loops”) is cured by considering QFT on the universal covering of AdS; in fact in all the following, this covering space will appear as much more natural than AdS itself for the setting of interacting fields. At any rate, since there is no geometrical periodicity in the imaginary part of the proper-time of the elliptic orbits and since the corresponding one-parameter groups have no orbits of other type, their generators can be considered as genuine global energy generators: nothing forbids one to postulate the positivity of the corresponding energy operators, expressed technically by relevant analyticity properties in half-planes of the proper-time parameter. So, as in the Wightman axiomatics of Minkowskian QFT, it is still natural here to postulate the spectral condition for all the generators of time-like elliptic orbits, the latter playing the same role as the time-like straight-lines (or uniform motions) in the Minkowskian spacetime. We note that such a condition has been proposed in Fronsdal’s group-theoretical study [F] of Klein-Gordon AdS QFT. 
In this connection one must also quote [DL] whose authors point out the discrepancy between the elliptic and hyperbolic AdS trajectories and give arguments for attributing to them respectively a zero-temperature and a finite temperature specified in terms of the acceleration. Another postulate which will play a crucial role in our approach is an adaptation of the property of local commutativity (or microcausality). It still appears as natural for the covering of AdS space-time, in spite of its lack of global-hyperbolicity, while it is harder to justify on the “pure” AdS space-time itself, because of the time-loop phenomenon. However we may regard a field theory on the pure AdS space-time as just a special case of one defined on its covering. Another justification is also provided by [BFS]. After having set the relevant geometrical notions in Sects. 2 and 3, we shall propose in Sect. 4 a plausible set of hypotheses for an interacting AdS QFT, among which the positivity of the spectrum of the above mentioned energy operators, AdS-covariance and an adaptation of microcausality. The spectral hypothesis readily implies that the n-point correlation functions admit analytic continuations in tuboidal domains in the Cartesian product of n copies of (the covering of) the complexified AdS manifold. These analyticity properties of the correlation functions parallel as closely as possible what happens in Wightman QFT [SW], where the analyticity of the correlation functions in similar tubular domains is analogously obtained from the positivity of the spectrum of the energy. These domains are described in Sect. 3. Unfortunately the n-point tuboids are rather complicated geometrical objects and, at present, can be given a simple description only in the case n = 1, namely for the two-point functions. 
Note however that for general n they contain “flat domains” corresponding to all points moving on orbits of the same one-parameter isometry group, whose description is identical with those of the corresponding sets in complex Minkowskian spacetime; in [BBMS], these flat domains have played a useful role in the construction of the “asymptotic forms” of AdS QFT, recognized as Lüscher-Mack-type theories [LM] on the asymptotic cone of AdS, in correspondence with conformal QFT in Minkowskian spacetime of one dimension less. At
the end of our Sect. 3 (Subsect. 3.5), we have tried to give a summary of some analogies and discrepancies between the n-point tuboids of complex AdS and the corresponding ones of complex Minkowski space. One of the interesting points of the AdS geometry is that there exist families of submanifolds that can be identified with Minkowski space-times in one dimension less (branes): as a matter of fact, these submanifolds contain all the two-plane sections of parabolic-type of the AdS-quadric mentioned above as the third class of timelike orbits. This very fact has recently attracted considerable interest [M, RS]. Our construction guarantees the possibility of considering restrictions of AdS quantum fields to these “Poincaré branes” and obtaining this way completely well-defined Minkowskian QFT’s. It is perhaps worthwhile to stress that this result, described in Sect. 5, is not as obvious as the well-known restrictibility of Minkowskian theories to lower dimensionality space-times because of the more complicated geometry. From a geometrical viewpoint, the conformal theories obtained in [BBMS] under asymptotic scaling assumptions then also appear as limits of the previous Minkowskian QFT’s when the corresponding parabolic sections tend to infinity. In any approach to QFT, the case of two-point functions deserves a special study. This is why we give in Sect. 6 and in Appendix A a complete characterization of the AdS two-point functions; the latter are actually maximally analytic, exactly as in the Minkowski [SW] and de Sitter [BM, BEM] cases. As usual this permits the constructions of generalized free fields and their Wick powers which fulfill all the hypotheses. This study also provides the opportunity of displaying some strange implications of the postulate of microcausality in the “pure AdS” case. 
In fact, the discrepancy between the pure AdS spacetime and the covering of AdS appears in a characteristic way in the classification of the two-point functions; it is revealed by the property of uniformity (or nonuniformity) of these functions in their analyticity domain C \ [−1, +1] in the complex plane of the cosine of the AdS invariant distance. These phenomena introduce the more general problem of characterizing the interacting QFT’s on the pure AdS spacetime with respect to those on the covering, which will be briefly discussed in our outlook (Sect. 9). In Sect. 7 we derive from our postulates the property of Bisognano-Wichmann analyticity in all the hyperbolic orbits corresponding to the class of uniformly accelerated motions mentioned earlier, or in other words the “AdS-Unruh effect”: an accelerated observer of the AdS world whose world-line belongs to that class will perceive a thermal bath of particles with inverse temperature equal to 2π times the radius of the corresponding hyperbolic orbit. We note that this result, first described in a special free-field theory in [DL], has been also justified in the general framework of algebraic QFT in [BFS], by taking the principle of passivity of [PW] as a starting point. We also note that, in spite of the peculiarities of the global geometry of AdS, this result is similar to the one proved in [BW] for Minkowskian QFT and in [BEM] for its extension to de Sitterian QFT. Section 8 is devoted to the AdS version of the “Euclidean” field theory and to the corresponding Osterwalder-Schrader reconstruction on the covering of AdS. Section 9 contains some final remarks among which is a brief discussion of the relationship of [BFS] and the present work. 2. Preliminaries
We start with some notations and some well-known facts. Let $E_{d+1}$ (resp. $E^{(c)}_{d+1}$) denote $\mathbf{R}^{d+1}$ (resp. $\mathbf{C}^{d+1}$) equipped with the scalar product
$$(x, y) = x^0y^0 + x^dy^d - x^1y^1 - \cdots - x^{d-1}y^{d-1} = x^0y^0 + x^dy^d - \vec{x}\cdot\vec{y} = x^{\mu}\eta_{\mu\nu}y^{\nu}, \tag{2.1}$$
where $\vec{x}$ denotes $(x^1, \ldots, x^{d-1})$. A vector x in $E_{d+1}$ is called timelike, spacelike or lightlike according to whether (x, x) is positive, negative or equal to zero. We also use the notation $\|x\|^2 = \sum_{\mu=0}^{d} |x^{\mu}|^2$ for $x \in E_{d+1}$ or $x \in E^{(c)}_{d+1}$ and we introduce the corresponding orthonormal basis of vectors $e_{\mu}$ in $E_{d+1}$ ($e_{\mu}^{\nu} = \delta_{\mu\nu}$). If A is a linear operator in $E_{d+1}$ or $E^{(c)}_{d+1}$ we put $\|A\| = \sup\{\|Ax\| : \|x\| = 1\}$.
We denote G (resp. $G^{(c)}$) the group of real (resp. complex) “AdS transformations”, i.e. the set of real (resp. complex) linear transformations of $E_{d+1}$ (resp. $E^{(c)}_{d+1}$) which preserve the scalar product (2.1), $G_0$ and $G_0^{(c)}$ the connected components of the identity in these groups, $\tilde{G}_0$ and $\tilde{G}_0^{(c)}$ the corresponding covering groups. An element of $G_0$ (resp. $G_0^{(c)}$) will be called a proper AdS transformation (resp. a proper complex AdS transformation).
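As a small illustration of the scalar product (2.1) and the timelike/spacelike/lightlike classification, here is a minimal sketch for real vectors. The function names `ads_product` and `classify` and the tolerance `tol` are hypothetical, introduced only for this example; vectors are stored as lists $[x^0, x^1, \ldots, x^d]$.

```python
def ads_product(x, y):
    """Scalar product (2.1): x^0 y^0 + x^d y^d - sum_{j=1}^{d-1} x^j y^j."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length d + 1 >= 2")
    d = len(x) - 1
    return x[0] * y[0] + x[d] * y[d] - sum(x[j] * y[j] for j in range(1, d))


def classify(x, tol=1e-12):
    """Classify a real vector by the sign of (x, x), up to a tolerance (an assumption)."""
    q = ads_product(x, x)
    if q > tol:
        return "timelike"
    if q < -tol:
        return "spacelike"
    return "lightlike"


if __name__ == "__main__":
    e0, e1, ed = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]  # d = 2
    print(classify(e0), classify(e1), classify(ed))
    print(classify([1.0, 1.0, 0.0]))  # e0 + e1
```

With two "time" directions $e_0$ and $e_d$, both basis vectors are timelike, while $e_0 + e_1$ is lightlike.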
2.1. “Pure AdS”, complexified and “Euclidean” AdS, coverings. The real (resp. complex) “pure” anti-de-Sitter space-time $X_d$ (resp. $X_d^{(c)}$) of radius R is defined as the submanifold of $E_{d+1}$ (resp. $E^{(c)}_{d+1}$) consisting of the points x such that $(x, x) = (x^0)^2 + (x^d)^2 - \vec{x}^2 = R^2$. Except for the thermal considerations in Sect. 7, we shall always take for simplicity R = 1 throughout this paper. The group $G_0$ (resp. $G_0^{(c)}$) acts transitively on $X_d$ (resp. $X_d^{(c)}$).
By changing $z^{\mu}$ to $iz^{\mu}$ for $0 < \mu < d$, the complex quadric $X_d^{(c)}$ becomes the complex unit sphere in $\mathbf{C}^{d+1}$, which has the same homotopy type as the real unit sphere $S^d$. In particular $\pi_1(X_1^{(c)}) = \mathbf{Z}$, $\pi_1(X_d^{(c)}) = 0$ (i.e. $X_d^{(c)}$ is simply connected) for $d \ge 2$. It follows that for $d \ge 2$ the covering space of $X_d^{(c)}$ is $X_d^{(c)}$ itself. However, as seen below, $X_d$ admits a nontrivial covering space $\tilde{X}_d$ whose “physical” role is that it suppresses the time-loops of pure AdS; its construction will also imply the existence of nontrivial coverings of important domains of $X_d^{(c)}$ (although the full space $X_d^{(c)}$ itself has a trivial covering).
It is possible to introduce in $X_d^{(c)}$ an analog of the so-called Euclidean spacetime in complex Minkowski space (where space is real and time purely imaginary): we choose it to be the connected real submanifold $X_d^{(E)}$ of $X_d^{(c)}$ defined by putting $z^0 = iy^0$, $x^1, \ldots, x^d$ real, $x^d > 0$. This sheet ($x^d > 0$) of the two-sheeted hyperboloid with equation $(x^d)^2 - (y^0)^2 - \vec{x}^2 = 1$, equipped with the Riemannian metric induced by the ambient quadratic form (2.1), will be called “Euclidean” AdS spacetime. This choice singles out the “base point” $e_d = (0, \ldots, 0, 1)$ as the analog of the origin in Minkowski spacetime.
A concrete way of representing the “Euclidean” spacetime $X_d^{(E)}$ together with $X_d$ and its covering $\tilde{X}_d$ is to introduce the diffeomorphism $\chi$ of $S^1 \times \mathbf{R}^{d-1}$ onto $X_d$ given by
$$(t, \vec{x}) \to \Big(\sqrt{1 + \vec{x}^2}\,\sin t,\ \vec{x},\ \sqrt{1 + \vec{x}^2}\,\cos t\Big) \tag{2.2}$$
(where $S^1$ is identified to $\mathbf{R}/2\pi\mathbf{Z}$). $X_d^{(E)}$ is obtained by changing t into is in this representation, namely in the extension $\chi^{(c)}$ of $\chi$ to $(S^1)^{(c)} \times \mathbf{R}^{d-1}$. This yields the following parametrization of $X_d^{(E)}$:
$$(s, \vec{x}) \to \Big(i\sqrt{1 + \vec{x}^2}\,\mathrm{sh}\,s,\ \vec{x},\ \sqrt{1 + \vec{x}^2}\,\mathrm{ch}\,s\Big). \tag{2.3}$$
The diffeomorphism $\tilde{\chi}$, defined by lifting $\chi$ on the covering $\mathbf{R}^d$ of $S^1 \times \mathbf{R}^{d-1}$ provides a global coordinate system on $\tilde{X}_d$. There also exists an extension $\tilde{\chi}^{(c)}$ of $\tilde{\chi}$ to $\mathbf{C} \times \mathbf{R}^{d-1}$, whose image is a partial complexification of the covering $\tilde{X}_d$ of AdS; this complexified covering contains the same “Euclidean” spacetime $X_d^{(E)}$ as $X_d^{(c)}$. It is clear that since $G_0$ acts transitively on $X_d$ the diffeomorphism $\chi$ can be transported by any transformation of the group $G_0$; it follows that $\tilde{G}_0$ also acts transitively on $\tilde{X}_d$.
The Schwartz space $\mathcal{S}(X_d^n)$ of test-functions on $X_d^n$ will be defined as the space of functions on $X_d^n$ which admit extensions in $\mathcal{S}(E_{d+1}^n)$. A $C^{\infty}$ function f on $\tilde{X}_d^n$ belongs to $\mathcal{S}(\tilde{X}_d^n)$ if every derivative of f with respect to the ambient coordinates decreases faster than any power of the geodesic distance in the Riemannian geometry induced by $\|x\|$. Equivalently $f \circ \tilde{\chi} \in \mathcal{S}(\mathbf{R}^d)$.
2.2. The Lorentzian structure of AdS. The restriction to $X_d$ of the pseudo-Riemannian metric $\eta_{\mu\nu}\,dx^{\mu}dx^{\nu}$ is locally Lorentzian with signature (+, −, . . . , −). An elementary description follows from the fact that $G_0$ acts transitively on $X_d$: it is sufficient to look at the situation in the tangent hyperplane to the base point $x = e_d$ (i.e. $\{x;\ x^d = 1\}$) whose intersection with $X_d$ is the light-cone $\{e_d + y : (y^0)^2 - \vec{y}^2 = 0,\ y^d = 0\}$. At the base point $e_d$, the future (resp. past) cone is then defined by $(y^0)^2 - \vec{y}^2 > 0$, $y^0 > 0$ (resp. $y^0 < 0$). At any point $x \in X_d$, the tangent hyperplane to $X_d$ is $\{x + y : (x, y) = 0\}$. Its intersection with $X_d$ is the light-cone with apex at x, $\{x + y \in X_d : (y, y) = 0\}$. 
We say that a tangent vector y at any point x is time-like, light-like or space-like according to whether (y, y) > 0, (y, y) = 0 or (y, y) < 0, which is consistent with the situation at the base point. Defining the local future (resp. past) cone at each point x can also be done by using the transitive action of $G_0$ (or simply by continuity) starting from $e_d$, but a more explicit characterization can be given by using the Lie algebra of G (see below). It can be easily seen that the circular sections of $X_d$ by the planes parallel to $(e_0, e_d)$, parametrized by t in the representation (2.2) have time-like tangents whose future is in the direction of increasing t, for all points $x = \chi(t, \vec{x})$; this clearly exhibits the phenomenon of closed time-loops announced earlier and its suppression by going to the covering of AdS. From a global viewpoint, two events $x_1, x_2$ of $X_d$ are space-like separated if $(x_1 - x_2)^2 < 0$, i.e. $(x_1, x_2) > 1$. The acausal set of the base point $e_d$, i.e. the set of points x which are space-like with respect to $e_d$, is then given by
$$\Gamma_a(e_d) = \{x \in X_d;\ x^d > 1\}. \tag{2.4}$$
Let us introduce the causal set of $e_d$ as the complement in $X_d$ of the closure of $\Gamma_a(e_d)$:
$$\Gamma_c(e_d) = \{x \in X_d;\ x^d \le 1\}. \tag{2.5}$$
$\Gamma_c(e_d)$ can be decomposed into three sets $\Gamma_c(e_d) = \Gamma_+(e_d) \cup \Gamma_-(e_d) \cup \Gamma_{ex}(e_d)$:
$$\Gamma_+(e_d) = \{x \in X_d;\ -1 < x^d < 1,\ x^0 > 0\},\quad \Gamma_-(e_d) = \{x \in X_d;\ -1 < x^d < 1,\ x^0 < 0\},\quad \Gamma_{ex}(e_d) = \{x \in X_d;\ x^d \le -1\}. \tag{2.6}$$
The regions $\Gamma_+(e_d)$ and $\Gamma_-(e_d)$ can be conventionally called future and past of $e_d$. Similar regions $\Gamma_a(x)$, $\Gamma_c(x)$, $\Gamma_{\pm}(x)$, $\Gamma_{ex}(x)$ can be associated with any point x of $X_d$ (again by the transitive action of $G_0$). The geodesics of $X_d$ are conic sections by 2-planes containing the origin of the ambient space; if $x_1$ and $x_2$ are two points on a connected branch of geodesic γ the distance $d(x_1, x_2)$ is defined by $d(x_1, x_2) = \theta(x_1, x_2)$, where θ is the angle under which the arc $(x_1, x_2)$ of γ is seen from the origin; otherwise stated, it is the parameter of the isometry which transforms $x_1$ into $x_2$ in the minimal subgroup of G admitting γ as an orbit. θ is an ordinary angle if γ is an ellipse, and a hyperbolic angle if γ is a branch of hyperbola. The former case is interpreted as “normal” time-like separation if |θ| ≤ π (future and past being distinguished by the sign of θ) and one has $(x_1, x_2) = \cos d(x_1, x_2)$; $d(x_1, x_2)$ is interpreted as the interval of proper time elapsed between $x_1$ and $x_2$ for the geodesic observer sitting on γ. This case is typically realized by the geodesic with equations $(x^0)^2 + (x^d)^2 = 1$, $\vec{x} = 0$ (i.e. the curve $x = \chi(t, 0)$ of (2.2)). The second case corresponds to space-like separation ($(x_1 - x_2)^2 < 0$) and one has $(x_1, x_2) = \mathrm{ch}\,d(x_1, x_2) > 1$. It is typically realized by taking the section of $X_d$ by the plane $(e_0, e_1)$ (or $(e_d, e_1)$). The AdS space-time is not geodesically convex: indeed for any point x, all the temporal geodesics which contain x also contain the antipodal point −x, and the proper time interval between x and −x is π on each of these geodesics. Furthermore, the temporal geodesics emerging from an event x do not cover the full causal region $\Gamma_c(x)$ but only $\Gamma_+(x) \cup \Gamma_-(x)$. In order to go from x to a point y in the region $\Gamma_{ex}(x)$, one needs to follow at least two arcs of temporal geodesics, which implies a “boost” (i.e. some “interaction”) at the junction of these two arcs. 
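The chart (2.2), the closed time-loops and the geodesic distance just described can be checked numerically. The sketch below is an illustration only, taking R = 1; the helper names `chi` and `geodesic_distance` and the tolerance `tol` are hypothetical. The distance is recovered from $(x_1, x_2) = \cos d$ in the time-like case and $(x_1, x_2) = \mathrm{ch}\,d$ in the space-like case, and an error is raised when $(x_1, x_2) < -1$, where no single geodesic arc joins the two points.

```python
import math


def ads_product(x, y):
    """Scalar product (2.1) for vectors stored as [x^0, x^1, ..., x^d]."""
    d = len(x) - 1
    return x[0] * y[0] + x[d] * y[d] - sum(x[j] * y[j] for j in range(1, d))


def chi(t, xvec):
    """Chart (2.2): (t, xvec) -> (r sin t, xvec, r cos t), r = sqrt(1 + |xvec|^2)."""
    r = math.sqrt(1.0 + sum(v * v for v in xvec))
    return [r * math.sin(t)] + list(xvec) + [r * math.cos(t)]


def geodesic_distance(x1, x2, tol=1e-9):
    """Distance along a single geodesic arc joining two points of X_d (R = 1)."""
    c = ads_product(x1, x2)
    if c > 1.0:
        return math.acosh(c)            # space-like separation: (x1, x2) = ch d
    if c >= -1.0 - tol:
        return math.acos(max(-1.0, c))  # time-like separation: (x1, x2) = cos d
    raise ValueError("(x1, x2) < -1: no single geodesic arc joins these points")


if __name__ == "__main__":
    p = chi(0.7, [0.3, -1.2])
    print(ads_product(p, p))                     # point lies on the quadric (x, x) = 1
    print(chi(0.7 + 2.0 * math.pi, [0.3, -1.2]))  # same point again: a closed time-loop
    a, b = chi(0.2, [0.0, 0.0]), chi(1.1, [0.0, 0.0])
    print(geodesic_distance(a, b))               # proper time on the geodesic chi(t, 0)
```

On the covering $\tilde{X}_d$ the parameter t is no longer taken modulo 2π, which is how the loop visible in the second print statement is removed.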
Similar remarks apply as regards causality and geodesics on the covering $\tilde{X}_d$.
2.3. Three types of planar trajectories in AdS and its covering. Two-planes of $E_{d+1}$ containing the origin can always be spanned by a pair of linearly independent orthogonal vectors (a, b) and classified by considering the possibility for each vector a and b to be timelike, spacelike or lightlike, which gives six different cases. To each pair (a, b) there corresponds an element M of the Lie algebra of G, an isometry group $e^{tM}$ which is a one-parameter subgroup of $G_0$ and a family of parallel two-planes whose sections by AdS are conics (possibly degenerated into straight lines) invariant under this subgroup; it follows that each connected component of these sections is either a timelike or a spacelike curve (or a lightlike straight line in the degenerate case). Being interested in the classification of the timelike curves (or trajectories), we retain three possible cases which reduce to simple models in terms of the basis $e_{\mu}$, by using again the fact that $G_0$ acts transitively on $X_d$.
i) The elliptic trajectories: this is the case when $a^2 > 0$ and $b^2 > 0$. Since this entails that (for all α, β) $(\alpha a + \beta b)^2 > 0$, all the corresponding parallel sections of AdS (in a family specified by (a, b)) are timelike. Their models are the circular sections by all two-planes parallel to $(e_0, e_d)$, described above in (2.2): the isometry subgroup
associated with them is the rotation group with parameter t. There is one and only one geodesic in each such family (in the unique plane of the family containing the origin). ii) The hyperbolic trajectories: this is the case when a 2 > 0 and b2 < 0. In each family of parallel sections of AdS specified by (a, b), there are two subfamilies. One of them is composed of spacelike branches of hyperbolae and contains the unique geodesic of the family. The other one is composed of timelike branches of hyperbolae, interpreted as uniformly accelerated motions: there is no timelike geodesic of that type. The isometry subgroup associated with such a family is a group of pure Lorentz transformation. A model for this family is given by the sections parallel to the (e0 , e1 )-plane. Since it will be used repeatedly (in particular in Sect. 7) we introduce the following notations. For each λ ∈ C \ {0}, we denote [λ] the special Lorentz transformation such that ([λ]x)0 =
λ + λ−1 0 λ − λ−1 1 λ − λ−1 0 λ + λ−1 1 x + x , ([λ]x)1 = x + x , (2.7) 2 2 2 2
the other components of x remaining unchanged. In other words ([es ]x)0 = x 0 ch s + x 1 sh s,
([es ]x)1 = x 0 sh s + x 1 ch s .
(2.8)
The corresponding subfamily of timelike orbits on AdS is characterized by the following condition to be satisfied by the other components: $\rho(x)^2 = (x^d)^2 - (x^2)^2 - \cdots - (x^{d-1})^2 - 1 > 0$. The corresponding orbits are then branches of “hyperbolae with radius $|\rho|$”, namely with equations
$$x^0 = \rho \operatorname{sh} t, \qquad x^1 = \rho \operatorname{ch} t. \tag{2.9}$$
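The transformation $[\lambda]$ lends itself to a quick numerical sanity check. The following Python sketch implements (2.7) on the pair $(x^0, x^1)$, so that one can confirm that $[e^s]$ reproduces (2.8), that $[\lambda][\mu] = [\lambda\mu]$, and that the curve (2.9) is the orbit of the point $(0, \rho)$ under $[e^t]$. The function names are illustrative choices and do not come from the text.

```python
import math

def bracket(lam, x0, x1):
    """Apply [lam] of Eq. (2.7) to the components (x^0, x^1); lam may be complex."""
    c = (lam + 1 / lam) / 2   # equals ch s when lam = e^s
    s = (lam - 1 / lam) / 2   # equals sh s when lam = e^s
    return c * x0 + s * x1, s * x0 + c * x1

def boost(s, x0, x1):
    """Right-hand sides of Eq. (2.8)."""
    return (x0 * math.cosh(s) + x1 * math.sinh(s),
            x0 * math.sinh(s) + x1 * math.cosh(s))

def hyperbolic_orbit(rho, t):
    """Point (x^0, x^1) of Eq. (2.9)."""
    return rho * math.sinh(t), rho * math.cosh(t)
```

Since the coefficients satisfy $c^2 - s^2 = 1$, the quantity $(x^0)^2 - (x^1)^2$ is unchanged by `bracket` for every nonzero `lam`; in particular `bracket(-1, x0, x1)` simply reverses the sign of both components.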
iii) The parabolic trajectories: this is the case when $a^2 > 0$ and $b^2 = 0$. Since this entails that (for all $\alpha, \beta$) $(\alpha a + \beta b)^2 \ge 0$, all the corresponding parallel sections of AdS (in a family specified by $(a, b)$) are timelike or exceptionally lightlike. Their models are the parabolic sections by all two-planes parallel to $(e_0, e_{d-1} - e_d)$, admitting the following representation in terms of a translation group parameter $t$:
$$x^0 = \sigma e^v t, \qquad x^{d-1} = \sigma \operatorname{sh} v + \tfrac{1}{2}\, \sigma e^v t^2, \qquad x^d = \sigma \operatorname{ch} v - \tfrac{1}{2}\, \sigma e^v t^2, \tag{2.10}$$
where $\sigma^2 = 1 + (x^1)^2 + \cdots + (x^{d-2})^2$. The complete description of the corresponding subgroup of $G_0$ admitting these parabolic orbits will be given and used in Sect. 5 (see Eq. (5.7)). These timelike sections admit as a limiting case (for $v$ tending to $-\infty$) the lightlike section by the plane with equations $x^{d-1} + x^d = 0$, $x^1 = \cdots = x^{d-2} = 0$. This lightlike section gives the only geodesics in the family.

We note that only the elliptic trajectories have nontrivial liftings into the covering of $X_d$. The hyperbolic and parabolic trajectories belong to “a single sheet of $\tilde X_d$”. This is connected with the following difference between these families of trajectories: while each family of elliptic trajectories (resp. their liftings) covers the whole space $X_d$ (resp. $\tilde X_d$), each family of the two other types is decomposed into two subfamilies which cover disjoint domains of $X_d$, namely wedge-shaped regions of the form $X_d \cap \{x;\ \pm x^1 > |x^0|\}$ for the hyperbolic case and halves of AdS of the form $X_d \cap \{x;\ \pm(x^{d-1} + x^d) > 0\}$ for the parabolic case. It is also interesting to note that by putting $t = i\tau$ in the representations of the three families of trajectories (2.2), (2.9), (2.10), one obtains curves inside the “Euclidean”
AdS spacetime $X_d^{(E)}$, namely respectively, branches of hyperbolae, circles and parabolae. The orbits of the second case, which are associated with purely imaginary Lorentz transformations and exhibit periodicity in the corresponding imaginary time parameter, are to be compared with those occurring in the Euclidean space of complex Minkowski space as well as in the (“Euclidean”) hypersphere of the complex de Sitter spacetime (see e.g. [BEM]): as shown below in Sect. 7, they correspond to the existence of an Unruh effect as in these other two spacetimes. On the other hand, the other two classes of orbits are rather to be compared with those of the time-translation groups of Minkowskian space (although not corresponding to geodesic motions on $X_d$ generically), as far as they allow a topologically equivalent “Wick-rotation” procedure to be performed (see Sects. 8 and 5).

Interpretation of the planar trajectories as uniformly accelerated motions. In this paragraph we reintroduce the radius $R$ of the AdS spacetime. If $x = x(t)$ denotes an arbitrary AdS trajectory parametrized by its proper time $t$, the corresponding velocity-vector $u(t) = \frac{d}{dt} x(t)$ at the point $x(t)$ satisfies the following relations (in terms of scalar products in the ambient space $E_{d+1}$): $(u(t), u(t)) = 1$, $(x(t), u(t)) = 0$. It follows that the ambient acceleration-vector $w(t) = \frac{d}{dt} u(t)$ satisfies the relation $(w(t), x(t)) = -1$. The latter together with the AdS equation $(x(t), x(t)) = R^2$ then imply that the AdS acceleration-vector $\gamma(t)$, defined as the $E_{d+1}$-orthogonal projection of $w(t)$ onto the tangent hyperplane to AdS at the point $x(t)$, is given by the following formula: $\gamma(t) = w(t) + \frac{1}{R^2}\, x(t)$. In view of the previous relations, this entails that $(\gamma(t), \gamma(t)) = (w(t), w(t)) - \frac{1}{R^2}$. Motions with constant acceleration are those for which $(\gamma(t), \gamma(t))$ or equivalently $(w(t), w(t))$ is independent of $t$.
Consider any planar trajectory of AdS; if it is either elliptic or hyperbolic, we can represent it by the equation $x(t) = \hat x(t) + Rc$, where $\hat x(t)$ varies in a two-plane $\Pi$ and the vector $c$ can always be chosen orthogonal to $\Pi$. If $c^2 \le 0$, the trajectory is elliptic (this is the case described by (2.2), with $c = (0, \vec x, 0)$). If $c^2 > 1$, the trajectory is hyperbolic (this is the case described by (2.9), with $c = (0, 0, x^2, \ldots, x^d)$, $c^2 = 1 + \rho^2$). Since $w(t)$ as well as $u(t)$ remain in $\Pi$, and since $(w(t), u(t)) = (\hat x(t), u(t)) = 0$, there holds the collinearity condition $w(t) = \lambda(t)\, \hat x(t)$, and the condition $(w(t), x(t)) = (w(t), \hat x(t)) = -1$ then yields $\lambda(t) = -\frac{1}{(\hat x(t), \hat x(t))} = \frac{1}{R^2 (c^2 - 1)}$. It then follows that $(w(t), w(t)) = -\frac{1}{R^2 (c^2 - 1)}$ is independent of $t$. This motion is therefore uniformly accelerated with the following value of the AdS acceleration:
$$a = \sqrt{-(\gamma(t), \gamma(t))} = \frac{1}{R} \sqrt{\frac{c^2}{c^2 - 1}}.$$
This leads us to state
Lemma 2.1. All the planar trajectories of the AdS spacetime correspond to uniformly accelerated motions, with the following specifications:

i) If the trajectory is elliptic, the corresponding acceleration $a = \frac{1}{R} \sqrt{\frac{|c^2|}{|c^2| + 1}}$ takes any value between $0$ (i.e. the geodesic case obtained for $c = 0$) and $\frac{1}{R}$ (i.e. the case $c \to \infty$ corresponding to ellipses in far-away two-planes).

ii) If the trajectory is hyperbolic, the corresponding acceleration $a = \frac{1}{R} \sqrt{\frac{c^2}{c^2 - 1}}$ takes any value between $\frac{1}{R}$ (i.e. the case $c \to \infty$ corresponding to hyperbolae in far-away two-planes) and $+\infty$ (i.e. the case $c^2 \to 1$ corresponding to degenerate hyperbolae (or “bifurcate horizons”)).

iii) If the trajectory is parabolic, the corresponding acceleration is $a = \frac{1}{R}$.
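The ranges stated in Lemma 2.1 can be explored numerically. The short Python sketch below evaluates the acceleration as a function of $c^2 = (c, c)$ and of the radius $R$; the function name and the decision to raise an error for $0 < c^2 \le 1$ (where the representation $x(t) = \hat x(t) + Rc$ yields no timelike planar trajectory) are choices made for this illustration only.

```python
import math

def ads_acceleration(c2, R=1.0):
    """Acceleration of the planar trajectory x(t) = xhat(t) + R c, with c2 = (c, c)."""
    if c2 <= 0:
        # elliptic case: |c2| / (|c2| + 1) = -c2 / (1 - c2)
        return math.sqrt(-c2 / (1 - c2)) / R
    if c2 > 1:
        # hyperbolic case (timelike branches)
        return math.sqrt(c2 / (c2 - 1)) / R
    raise ValueError("no timelike planar trajectory for 0 < c2 <= 1")
```

Both branches tend to the parabolic value $1/R$ as $|c^2| \to \infty$, from below in the elliptic case and from above in the hyperbolic case.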
To complete the proof, we just have to treat the case of parabolic trajectories, whose prototype is described by Eqs. (2.10). Introducing the proper time parameter and the radius $R$ of AdS, the latter can be rewritten: $x^0 = t$, $x^{d-1} = \sigma \operatorname{sh} v + \frac{t^2}{2\sigma}\, e^{-v}$, $x^d = \sigma \operatorname{ch} v - \frac{t^2}{2\sigma}\, e^{-v}$, with $\sigma^2 = R^2 + (x^1)^2 + \cdots + (x^{d-2})^2$. The ambient acceleration-vector is $w(t) = \left(0, \ldots, 0, \frac{e^{-v}}{\sigma}, -\frac{e^{-v}}{\sigma}\right)$, which is such that $(w(t), w(t)) = 0$ and $(w(t), x(t)) = -1$, and therefore the AdS acceleration-vector $\gamma(t) = w(t) + \frac{1}{R^2}\, x(t)$ is such that $(\gamma(t), \gamma(t)) = -\frac{1}{R^2}$.

2.4. The Lie algebra of G, time-like and isotropic two-planes. The Lie algebra $\mathcal{G}$ of $G$ can be identified with the real vector space of the linear operators $A$ on $E_{d+1}$ such that $A^\mu{}_\nu = A^{\mu\rho} \eta_{\rho\nu}$ with $A^{\mu\rho} = -A^{\rho\mu}$. Hence there is a canonical linear bijection $\ell$ of the space of antisymmetric 2-tensors over $E_{d+1}$ onto $\mathcal{G}$. A basis of $\mathcal{G}$ is provided by $\{M_{\mu\nu} : 0 \le \mu < \nu \le d\}$, with $(M_{\mu\nu})^\rho{}_\sigma = e_\mu^\rho e_{\nu\sigma} - e_\nu^\rho e_{\mu\sigma}$; using the standard notation $a \wedge b = a \otimes b - b \otimes a$, we write: $M_{\mu\nu} = \ell(e_\mu \wedge e_\nu)$. In particular
$$M_{0d} = \begin{pmatrix} 0 & \cdots & 1 \\ \vdots & 0 & \vdots \\ -1 & \cdots & 0 \end{pmatrix}, \qquad e^{tM_{0d}} = \begin{pmatrix} \cos t & \cdots & \sin t \\ \vdots & 1 & \vdots \\ -\sin t & \cdots & \cos t \end{pmatrix}, \tag{2.11}$$
and the diffeomorphism $\chi$ (see Eq. (2.2)) can be rewritten as follows:
$$(t, \vec x) \mapsto \exp(tM_{0d}) \left(0, \vec x, \sqrt{1 + \vec x^2}\right), \tag{2.12}$$
with $t \in S^1$. The same formula can also be used for describing the lifting $\tilde\chi$ of $\chi$ on the covering spaces, provided $t$ is allowed to vary in $\mathbf{R}$, $A \mapsto \exp(A)$ is understood as the exponential map of $\mathcal{G}$ into $\tilde G_0$, and $(0, \vec x, \sqrt{1 + \vec x^2})$ is identified with a point of one of the fundamental domains of $\tilde X_d$. For simplicity, we keep the same notation, and let the context decide on its interpretation. It is also easy to verify that, with the notations of (2.8), $[e^s] = \exp sM_{10}$.

The scalar product (2.1) naturally induces a $G$-invariant scalar product in the space of the contravariant tensors of order $p$, namely $(A, B) = A^{\mu_1 \ldots \mu_p}\, g_{\mu_1 \nu_1} \cdots g_{\mu_p \nu_p}\, B^{\nu_1 \ldots \nu_p}$. This satisfies
$$(a_1 \otimes \cdots \otimes a_p,\ b_1 \otimes \cdots \otimes b_p) = (a_1, b_1) \cdots (a_p, b_p). \tag{2.13}$$
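In particular, given $a, b \in E_{d+1}$,
$$\tfrac{1}{2}\, (a \wedge b, a \wedge b) = (a, a)(b, b) - (a, b)^2, \tag{2.14}$$
$$\tfrac{1}{2}\, (e_0 \wedge e_d, a \wedge b) = a_0 b_d - a_d b_0. \tag{2.15}$$
The square of this 2-dimensional determinant is a 2-dimensional Gramian:
$$\begin{aligned} (a^0 b^d - a^d b^0)^2 &= \left[(a^0)^2 + (a^d)^2\right] \left[(b^0)^2 + (b^d)^2\right] - (a^0 b^0 + a^d b^d)^2 \\ &= (a, a)(b, b) - (a, b)^2 + \vec a^{\,2}\, \vec b^{\,2} - (\vec a \cdot \vec b)^2 + (a, a)\, \vec b^{\,2} + (b, b)\, \vec a^{\,2} - 2 (a, b)\, \vec a \cdot \vec b. \end{aligned} \tag{2.16}$$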
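The identities (2.14)–(2.16) are purely algebraic and can be checked on explicit vectors. The Python sketch below does so with exact integer arithmetic; it assumes the ambient metric of signature $(+, -, \ldots, -, +)$ used throughout, and the helper names (`metric`, `dot`, `wedge`, `tensor_dot`) are introduced here only for illustration.

```python
def metric(d):
    """Diagonal of the ambient metric of E_{d+1}: signature (+, -, ..., -, +)."""
    return [1] + [-1] * (d - 1) + [1]

def dot(a, b):
    """Scalar product (a, b) of two vectors of E_{d+1}."""
    g = metric(len(a) - 1)
    return sum(gi * ai * bi for gi, ai, bi in zip(g, a, b))

def wedge(a, b):
    """Components of a ^ b = a (x) b - b (x) a as a nested list."""
    n = len(a)
    return [[a[i] * b[j] - b[i] * a[j] for j in range(n)] for i in range(n)]

def tensor_dot(A, B):
    """Induced scalar product of two contravariant 2-tensors (diagonal metric)."""
    n = len(A)
    g = metric(n - 1)
    return sum(A[i][j] * g[i] * g[j] * B[i][j]
               for i in range(n) for j in range(n))
```

With $a = (2, 1, 0, 3)$ and $b = (1, -1, 2, 1)$ in $E_4$, for instance, both sides of (2.14) equal $-72$ and both sides of (2.16) equal $1$.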
Supposing $(a, a) > 0$, $(b, b) > 0$, and $(a, a)(b, b) - (a, b)^2 > 0$, we find $(a^0 b^d - a^d b^0)^2 > 0$. If $a$ and $b$ are continuously varied while $(a, a)$, $(b, b)$, $(a, b)$ are kept constant with the above inequalities satisfied, the sign of $(e_0 \wedge e_d, a \wedge b)$ remains constant. Under the same conditions, $a$ can be brought to the form $a^0 e_0$, $a^0 > 0$, by a transformation in $G_0$. Then, denoting $b' = (0, b^1, \ldots, b^d)$, we have $(a^0)^2\, (b', b') > 0$, i.e. $b'$ is time-like in the Minkowski space $\{x : x^0 = 0\}$. It can be brought to the form $b^d e_d$ by a Lorentz transformation acting in the same space, after which $a = a^0 e_0$, $b = b^0 e_0 + b^d e_d$, $a^0 b^d - a^d b^0 = a^0 b^d$. In particular if $a$ and $b$ are orthonormal, the necessary and sufficient condition for $a = \Lambda e_0$, $b = \Lambda e_d$, $\Lambda \in G_0$, is that the above scalar product (2.15) be positive.

Suppose $a$ and $b$ are as above and $(e_0 \wedge e_d, a \wedge b) > 0$. Then the two dimensional real vector subspace spanned by $a$ and $b$, and any parallel 2-plane, equipped with the metric induced by (2.1), is a Euclidean space (with positive metric). Conversely given a 2-plane with strictly positive induced metric, the parallel two dimensional real vector subspace has an orthonormal base $(a, b)$ and the scalar product $(e_0 \wedge e_d, a \wedge b)$ can be made positive by changing $b$ to $-b$ if necessary. We call such a 2-plane time-like, and always regard it as oriented by the 2-form $d(a, x) \wedge d(b, x)$. The one-parameter subgroup of $G_0$ defined by $t \mapsto \exp t\ell(a \wedge b)$ leaves this 2-plane invariant, and every orbit of this subgroup is contained in a parallel 2-plane. This subgroup is conjugated in $G_0$ to $\exp tM_{0d}$. It follows that
$$\begin{aligned} \exp t\ell(a \wedge b)\, a &= \cos(t)\, a - \sin(t)\, b, \\ \exp t\ell(a \wedge b)\, b &= \sin(t)\, a + \cos(t)\, b, \\ \exp t\ell(a \wedge b)\, x &= x \quad \text{if } (a, x) = (b, x) = 0. \end{aligned} \tag{2.17}$$
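Formula (2.17) can be illustrated numerically. The sketch below assumes that the generator acts as $\ell(a \wedge b)\, x = a\, (b, x) - b\, (a, x)$, which is what the definition of $M_{\mu\nu}$ gives for $a = e_\mu$, $b = e_\nu$, and it sums the exponential series directly; the function names and the truncation order are choices made for this example.

```python
import math

def dot(a, b):
    """Ambient scalar product with signature (+, -, ..., -, +)."""
    g = [1] + [-1] * (len(a) - 2) + [1]
    return sum(gi * ai * bi for gi, ai, bi in zip(g, a, b))

def generator(a, b, x):
    """Action of l(a ^ b) on x, assumed to be x -> a (b, x) - b (a, x)."""
    bx, ax = dot(b, x), dot(a, x)
    return [ai * bx - bi * ax for ai, bi in zip(a, b)]

def exp_action(t, a, b, x, terms=40):
    """exp(t l(a ^ b)) x, summed as a truncated Taylor series."""
    result, term = list(x), list(x)
    for k in range(1, terms):
        term = [t * v / k for v in generator(a, b, term)]
        result = [r + v for r, v in zip(result, term)]
    return result
```

For an orthonormal pair such as $a = (\operatorname{ch}\alpha, \operatorname{sh}\alpha, 0)$, $b = (0, 0, 1)$ in $E_3$, the output agrees with the three lines of (2.17) up to rounding.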
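We denote by $\mathcal{C}_1$ the subset of $\mathcal{G}$ consisting of all elements of the form $\ell(a \wedge b)$ with $(a, a) = (b, b) = 1$, $(a, b) = 0$, and $(e_0 \wedge e_d, a \wedge b) > 0$. Equivalently,
$$\mathcal{C}_1 = \left\{ \Lambda M_{0d} \Lambda^{-1} : \Lambda \in G_0 \right\}. \tag{2.18}$$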
We denote by $\mathcal{C}_+$ the cone generated in $\mathcal{G}$ by $\mathcal{C}_1$, i.e. $\mathcal{C}_+ = \bigcup_{\rho > 0} \rho\, \mathcal{C}_1$. We note that for all the elements $M$ of $\mathcal{C}_1$ the corresponding group elements $\exp tM$ can be considered as belonging either to $G_0$ or to $\tilde G_0$ according to whether $\exp$ is regarded as the exponential map of one or the other group. As mentioned at the beginning of this section, we always identify $G_0$ (resp. $G_0^{(c)}$) with a subgroup of $SL(d+1, \mathbf{R})$ (resp. $SL(d+1, \mathbf{C})$), and its exponential map with the matrix exponential.

While in the real Minkowski space the isotropic subspaces are one-dimensional, the maximal isotropic subspaces in $E_{d+1}$ are two-dimensional when $d > 2$.

Lemma 2.2. Let $a, b \in E_{d+1}$ be linearly independent and satisfy $(a, a) = (b, b) = (a, b) = 0$. Then $d + 1 \ge 4$ and there is a $\Lambda \in G_0$ such that $\Lambda a = e_0 + e_1$ and $\Lambda b = \pm(\varepsilon e_{d-1} + e_d)$, where $\varepsilon = \pm 1$. Any $x \in E_{d+1}$ such that $(x, a) = (x, b) = (x, x) = 0$ is a linear combination of $a$ and $b$.

We omit the straightforward proof.

We finally give an explicit characterization of the local future cone $V_x^+$ at any point $x$ of $X_d$. We define $V_x^+$ as the connected component of $\{y : (y, x) = 0,\ (y, y) > 0\}$ which contains the timelike vector $M_{0d}\, x$. In the case when $x = e_d$, $M_{0d}\, e_d = e_0$ and any $y \in V_{e_d}^+$ can be written as $\Lambda \rho e_0$, where $\Lambda \in G_0$ belongs to the stabilizer of $e_d$:
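$\Lambda e_d = e_d$ and $\rho > 0$. Hence
$$\begin{aligned} V_{e_d}^+ &= \left\{ \Lambda\, \rho M_{0d}\, \Lambda^{-1} e_d : \Lambda \in G_0,\ \Lambda e_d = e_d,\ \rho > 0 \right\} \\ &= \left\{ \rho\, M e_d : M = \ell(a \wedge e_d),\ (a, a) = 1,\ (a, e_d) = 0,\ (e_0 \wedge e_d, a \wedge e_d) > 0,\ \rho > 0 \right\}, \end{aligned} \tag{2.19}$$
and therefore, for any $x \in X_d$,
$$V_x^+ = \left\{ \rho\, M x : M = \ell(a \wedge x),\ (a, a) = 1,\ (a, x) = 0,\ (e_0 \wedge e_d, a \wedge x) > 0,\ \rho > 0 \right\}. \tag{2.20}$$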
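3. Tuboids

3.1. Future and past tuboids of $X_d^{(c)}$. We denote
$$\mathbf{C}_+ = \{\zeta \in \mathbf{C} : \operatorname{Im} \zeta > 0\} = -\mathbf{C}_-. \tag{3.1}$$
For $z = x + iy \in E_{d+1}^{(c)}$, we define
$$\epsilon(z) = \tfrac{1}{2}\, (e_0 \wedge e_d, y \wedge x) = y^0 x^d - x^0 y^d. \tag{3.2}$$
Definition 3.1. The future tuboid $Z_{1+}$ and the past tuboid $Z_{1-}$ of $X_d^{(c)}$ are defined by
$$Z_{1+} = \{\exp(\tau M)\, c : M \in \mathcal{C}_1,\ c \in X_d,\ \tau \in \mathbf{C}_+\} = Z_{1-}^*. \tag{3.3}$$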
($\mathcal{C}_1$ being defined by Eq. (2.18)). The tuboids $\tilde Z_{1\pm}$ are defined as the universal coverings of $Z_{1\pm}$. They are given by the same formula where $\exp(\tau M)$ is now understood as an element of $\tilde G_0^{(c)}$, $\tau$ varies in $\mathbf{C}_+$ and $c$ is an arbitrary element of $\tilde X_d$.

Lemma 3.1. The “Euclidean” AdS spacetime $X_d^{(E)}$ is contained in $X_d \cup Z_{1+} \cup Z_{1-}$.

Proof. Since $X_d^{(E)}$ is represented by (2.12) where $t$ is changed into $it$, all the complex points of this manifold are seen to be obtained as points of $Z_{1+} \cup Z_{1-}$ by taking $M = M_{0d}$, $\tau = it$ and $c = (0, \vec x, \sqrt{1 + \vec x^2})$ in (3.3).

Lemma 3.2. $Z_{1+}$ has the following properties:

(i)
$$Z_{1+} = \left\{ z = x + iy \in E_{d+1}^{(c)} : (x, x) - (y, y) = 1,\ (x, y) = 0,\ (y, y) > 0,\ \epsilon(z) > 0 \right\}. \tag{3.4}$$
(ii)
$$Z_{1+} = \left\{ \Lambda \exp(itM_{0d})\, e_d : \Lambda \in G_0,\ t > 0 \right\}. \tag{3.5}$$
(iii) If $z = x + iy \in Z_{1+}$ then $z^0 \ne 0$, $z^d \ne 0$, and $\operatorname{Im}(z^0 / z^d) > 0$.
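The two descriptions (3.4) and (3.5) can be compared on explicit points. The following Python sketch is only an illustration: it tests the conditions of (3.4) on points generated as in (3.3) and (3.5) with $M = M_{0d}$, reading the sign function of (3.2) as $y^0 x^d - x^0 y^d$. The function names and the numerical tolerance are assumptions of this example.

```python
import cmath

def dot(a, b):
    """Bilinear ambient scalar product, signature (+, -, ..., -, +)."""
    g = [1] + [-1] * (len(a) - 2) + [1]
    return sum(gi * ai * bi for gi, ai, bi in zip(g, a, b))

def exp_M0d(tau, z):
    """exp(tau M_0d) z: complex rotation in the (0, d) coordinate plane."""
    c, s = cmath.cos(tau), cmath.sin(tau)
    return [c * z[0] + s * z[-1]] + list(z[1:-1]) + [-s * z[0] + c * z[-1]]

def epsilon(z):
    """Sign function of Eq. (3.2): y^0 x^d - x^0 y^d for z = x + i y."""
    x = [complex(v).real for v in z]
    y = [complex(v).imag for v in z]
    return y[0] * x[-1] - x[0] * y[-1]

def in_future_tuboid(z, tol=1e-12):
    """Test the four conditions on the right-hand side of Eq. (3.4)."""
    x = [complex(v).real for v in z]
    y = [complex(v).imag for v in z]
    return (abs(dot(x, x) - dot(y, y) - 1) < tol and abs(dot(x, y)) < tol
            and dot(y, y) > 0 and epsilon(z) > 0)
```

Applied to `exp_M0d(0.7j, [0, 0, 0, 1])`, the test succeeds, while the complex conjugate point fails it because the sign function changes sign, in line with $Z_{1-} = Z_{1+}^*$.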
Proof. Denote $A_1$ the rhs of (3.4), and $A_2$ the rhs of (3.5). It is clear that $Z_{1+}$, $A_1$, $A_2$ are invariant under $G_0$, and that $A_2 \subset Z_{1+}$. We first prove that $Z_{1+} \subset A_1$. Suppose that $z = \exp(\tau M)\, c$ with $c \in X_d$, $M \in \mathcal{C}_1$, and $t = \operatorname{Im} \tau > 0$. Let $\Lambda_1 \in G_0$ be such that $\Lambda_1 M \Lambda_1^{-1} = M_{0d}$. Then $\Lambda_1 z = \exp((s + it) M_{0d})\, c'$, where $s$ is real, $c' \in X_d$, $c'^0 = 0$, $c'^d > 0$. Thus $z = \Lambda z'$ with $\Lambda = \Lambda_1^{-1} \exp(s M_{0d})$ and $z' = x' + iy' = \exp(it M_{0d})\, c'$, i.e. $x'^0 = 0$, $x'^d = c'^d \operatorname{ch}(t)$, $y' = c'^d \operatorname{sh}(t)\, e_0$, hence $\epsilon(z') > 0$. It follows that $z'$, and therefore also $z$, belong to $A_1$. We now show that $A_1 \subset A_2$. Suppose that a point $z = x + iy$ belongs to $A_1$. There exists a $\Lambda \in G_0$ and a real $t > 0$ such that $y = \Lambda \operatorname{sh}(t)\, e_0$ and $x = \Lambda \operatorname{ch}(t)\, e_d$, i.e. $\Lambda^{-1}(x + iy) = \sin(it)\, e_0 + \cos(it)\, e_d = \exp(it M_{0d})\, e_d$. This can be rewritten as
$$z = \exp\big(is\, \ell(y \wedge x)\big)\, \frac{x}{\sqrt{(x, x)}}, \qquad s = \frac{\log\left(\sqrt{(x, x)} + \sqrt{(y, y)}\right)}{\sqrt{(x, x)(y, y)}}. \tag{3.6}$$
To prove (iii), we suppose that $x + iy \in A_1$. Then $x^d = y^d = 0$ implies that both $x$ and $y$ belong to a “usual” Minkowski space, hence they cannot be both time-like as well as orthogonal. The same argument excludes $z^0 = 0$. Finally $\epsilon(z) = |z^d|^2\, \operatorname{Im}(z^0 / z^d)$.

The form (3.4) makes it obvious that $Z_{1+}$ and therefore $Z_{1-}$ are open subsets of $X_d^{(c)}$ while the definition makes it obvious that they are connected. $Z_{1+}$ and $Z_{1-}$ are disjoint since every point $z$ of $Z_{1-}$ satisfies $\epsilon(z) < 0$.

Lemma 3.3. Let $a + ib \in Z_{1+}$. Then for every $M \in \mathcal{C}_1$ and $\tau \in \mathbf{C}_+$, the point $\exp(\tau M)(a + ib)$ is in $Z_{1+}$.

Proof. By the invariance of $Z_{1+}$ under $G_0$, it suffices to prove the statement in the case when $M = M_{0d}$ and $\tau = it$ with real $t > 0$. We suppose that $(b, b) \ge 0$, $(a, a) - (b, b) = 1$, $(a, b) = 0$, and $\epsilon(a + ib) \ge 0$. This implies $(a^0)^2 + (a^d)^2 \ge (a, a) \ge 1$. Let $x + iy = \exp(it M_{0d})(a + ib)$. A simple calculation shows that
$$(y, y) - (b, b) = \left[(a^0)^2 + (a^d)^2 + (b^0)^2 + (b^d)^2\right] \operatorname{sh}^2 t + 2 (b^0 a^d - b^d a^0) \operatorname{sh} t \operatorname{ch} t > 0, \tag{3.7}$$
$$y^0 x^d - y^d x^0 = \left[(a^0)^2 + (a^d)^2 + (b^0)^2 + (b^d)^2\right] \operatorname{sh} t \operatorname{ch} t + (b^0 a^d - b^d a^0)(\operatorname{ch}^2 t + \operatorname{sh}^2 t) > 0. \tag{3.8}$$
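Lemma 3.4. (i) The image of the domain $Z_{1+}$ (or $Z_{1-}$) under the coordinate map $z \mapsto z^d = (e_d, z)$ is the cut-plane $\Delta = \mathbf{C} \setminus [-1, 1]$.
(ii) The image of the domain $Z_{1-} \times Z_{1+}$ (or $Z_{1+} \times Z_{1-}$) by the (scalar product) mapping $z_1, z_2 \mapsto (z_1, z_2)$ is the cut-plane $\Delta$.
(iii) The map $z_1, z_2 \mapsto (z_1, z_2)$ of $Z_{1-} \times Z_{1+}$ onto $\Delta$ can be lifted to a map of $\tilde Z_{1-} \times \tilde Z_{1+}$ onto the covering $\tilde\Delta$ of the cut-plane $\Delta$.

Proof. (i) If $z \in Z_{1+}$, then, by Lemma 3.2, it can be written as $z = \Lambda \exp(it M_{0d})\, e_d$, with $\Lambda \in G_0$ and $t > 0$. Hence $(e_d, z) = (a, \exp(it M_{0d})\, e_d)$, where $a = \Lambda^{-1} e_d \in X_d$ can be written as
$$a = \left(\sqrt{1 + \vec a^{\,2}}\, \sin s,\ \vec a,\ \sqrt{1 + \vec a^{\,2}}\, \cos s\right) = \exp(s M_{0d}) \left(0, \vec a, \sqrt{1 + \vec a^{\,2}}\right), \tag{3.9}$$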
hence $(e_d, z) = \sqrt{1 + \vec a^{\,2}}\, \cos(it - s) \in \Delta$. Conversely any $\zeta \in \Delta$ can be written as $\zeta = \cos(u + iv)$, $v > 0$, i.e. $\zeta = (e_d, \exp((u + iv) M_{0d})\, e_d)$ is in the image of $Z_{1+}$.

(ii) Let $z_1 \in Z_{1-}$, $z_2 \in Z_{1+}$. Then $z_1 = \exp(-\tau_1 M_1)\, c_1$, $z_2 = \exp(\tau_2 M_2)\, c_2$, with $M_j \in \mathcal{C}_1$, $\tau_j \in \mathbf{C}_+$, $c_j \in X_d$, $j = 1, 2$. Hence
$$(z_1, z_2) = (c_1, \exp(\tau_1 M_1) \exp(\tau_2 M_2)\, c_2) = (e_d, z'), \tag{3.10}$$
where $z' = \Lambda \exp(\tau_1 M_1) \exp(\tau_2 M_2)\, c_2$ (for some $\Lambda \in G_0$) belongs to $Z_{1+}$ by Lemma 3.3. Therefore $(z_1, z_2) \in \Delta$ by (i). Conversely any $\zeta \in \Delta$ can be written as $\zeta = \cos(u + iv)$, $v > 0$, i.e. $\zeta = (\exp(-(u + iv) M_{0d}/2)\, e_d,\ \exp((u + iv) M_{0d}/2)\, e_d)$ is in the image of $Z_{1-} \times Z_{1+}$.

(iii) It is easy to see that elements of $Z_{1-} \times Z_{1+}$ such that $z_1 = \exp((t_1 + is_1) M_{0d})\, e_d$, $s_1 > 0$, and $z_2 = \exp((t_2 + is_2) M_{0d})\, e_d$, $s_2 < 0$, have a scalar product $(z_1, z_2) = \cos(t_1 - t_2 + i(s_1 - s_2))$ which runs on an elliptic path with foci $-1$, $+1$ in the infinite-sheeted domain $\tilde\Delta$ when $t_1$ and $t_2$ vary in $\mathbf{R}$ at fixed $s_1$, $s_2$. These elliptic paths remain in $\tilde\Delta$, which they fully cover, when $s_1$ and $s_2$ vary in their ranges. Similar elliptic paths homotopic to the latter would be generated by starting from more general elements of $Z_{1-} \times Z_{1+}$. See also Appendix A for an alternative argument based on a more global representation of the tuboids.

Lemma 3.5. The domain $Z_{1+}$ is a domain of holomorphy.

Proof. As proved in Appendix A, there exists a biholomorphic map of $Z_{1+}$ onto the domain $T_+ \setminus \{z : z^1 = 0\}$ in the usual $d$-dimensional complex Minkowski space. This also exhibits the topology of $Z_{1+}$.

3.2. The minimal n-point tuboids. We define the minimal $n$-point future tuboid $Z_{n+}$ as the subset of $X_d^{(c)n}$ consisting of all points $(z_1, \ldots, z_n)$ such that
$$z_1 = e^{\tau_1 M_1} c_1, \quad z_2 = e^{\tau_1 M_1} e^{\tau_2 M_2} c_2, \quad \ldots, \quad z_n = e^{\tau_1 M_1} \cdots e^{\tau_n M_n} c_n, \tag{3.11}$$
where, for $1 \le j \le n$, $\tau_j \in \mathbf{C}_+$, $M_j \in \mathcal{C}_1$, and $c_j \in X_d$. Equivalently
$$Z_{n+} = \left\{ (z_1, \ldots, z_n) \in X_d^{(c)n} : \forall j = 1, \ldots, n,\ z_j = e^{it_1 M_1} \cdots e^{it_j M_j} c_j,\ t_j > 0,\ M_j \in \mathcal{C}_1,\ c_j \in X_d \right\}. \tag{3.12}$$
Indeed if $z$ is of the form (3.11) with $\tau_j = s_j + it_j$ then
$$z_j = e^{it_1 M_1'} \cdots e^{it_j M_j'} c_j', \qquad M_j' = e^{s_1 M_1} \cdots e^{s_{j-1} M_{j-1}}\, M_j\, e^{-s_{j-1} M_{j-1}} \cdots e^{-s_1 M_1}, \qquad c_j' = e^{s_1 M_1} \cdots e^{s_j M_j} c_j. \tag{3.13}$$
We define the minimal $n$-point past tuboid as $Z_{n-} = Z_{n+}^*$. We denote $\tilde Z_{n\pm}$ the universal coverings of these sets. However, we shall write simply $Z_{n\pm}$ when no confusion arises.

Lemma 3.6. The set $Z_{n+}$ is open in $X_d^{(c)n}$.
Proof. We prove, by induction on $n$, the following more detailed statement ($St_n$): Let $z = (z_1, \ldots, z_n) \in X_d^{(c)n}$ be such that, for $1 \le j \le n$,
$$z_j = e^{it_1 M_1} \cdots e^{it_j M_j} c_j, \qquad t_j > 0,\ M_j \in \mathcal{C}_1,\ c_j \in X_d. \tag{3.14}$$
For every neighborhood $W$ of $(t_1, \ldots, t_n, M_1, \ldots, M_n, c_1, \ldots, c_n)$ in $(0, \infty)^n \times \mathcal{C}_1^n \times X_d^n$ there is a neighborhood $V$ of $z$ in $X_d^{(c)n}$ such that for every $z' \in V$ there exists a $(t', M', c') \in W$ such that $z_j' = e^{it_1' M_1'} \cdots e^{it_j' M_j'} c_j'$ for every $j = 1, \ldots, n$.

We start with the case $n = 1$, for which we give a detailed proof. The cases $n > 1$ will then be more sketchily treated. Since the statement $St_1$ is invariant under $G_0$, it suffices to consider the case when $z = z_1 = x + iy = \exp(it M_{0d})\, c$ with $t > 0$ and $c^0 = 0$, $c^d > 0$. Thus
$$x + iy = (i c^d \operatorname{sh}(t),\ \vec c,\ c^d \operatorname{ch}(t)), \qquad t > 0,\ c^d > 0. \tag{3.15}$$
Let $R = 1 + \|z\|$. If $z' \in X_d^{(c)}$ is of the form $(i y'^0, \vec x', x'^d)$, with $x'^d > 0$, then
$$z' = x' + iy' = (i c'^d \operatorname{sh}(t'),\ \vec c',\ c'^d \operatorname{ch}(t')) = \exp(it' M_{0d})\, c', \qquad \vec c' = \vec x', \quad c'^d = \left((x'^d)^2 - (y'^0)^2\right)^{1/2}, \quad \operatorname{th}(t') = y'^0 / x'^d, \tag{3.16}$$
where $c'$ and $t'$ are continuous functions of $z'$. Hence given $\varepsilon \in (0, 1)$, there is a $\delta_1 > 0$ such that if $z'$ is of the form $(i y'^0, \vec x', x'^d)$, and $\|z' - z\| < \delta_1$, then (3.16) holds with $|t' - t| < \varepsilon$ and $\|c' - c\| < \varepsilon/4$, and $\|z'\| < R(1 + \varepsilon/4)$. There is a $\delta > 0$ such that, for every $z'' \in X_d^{(c)}$ such that $\|z'' - z\| < \delta$, there exists a $\Lambda \in G_0$, satisfying $\|\Lambda - 1\| < \varepsilon/4R$, $\|\Lambda^{-1} - 1\| < \varepsilon/4R$, and such that $z' = \Lambda z''$ satisfies $\|z' - z\| < \delta_1$ and $z' = (i y'^0, \vec x', x'^d)$, $x'^d > 0$. Then (3.16) holds, and $z'' = \exp(it' M'')\, c''$, with $M'' = \Lambda^{-1} M_{0d} \Lambda$, $c'' = \Lambda^{-1} c'$. Since $\|M_{0d}\| = 1$, $\|c\| \le \|z\|$, this implies $\|M'' - M_{0d}\| < \varepsilon$ and $\|c'' - c\| < \varepsilon$.

We now assume that the statement $St_m$ has been proved for all $m \le n - 1$, where $n - 1 \ge 1$. Let $z = (z_1, \ldots, z_n)$ satisfy (3.14) and let $z' = (z_1', \ldots, z_n')$ be sufficiently close to $z$. By $St_1$, $z_1' = \exp(it_1' M_1')\, c_1'$ with $t_1'$, $M_1'$, $c_1'$ respectively close to $t_1$, $M_1$, $c_1$. The point
$$(z_2'', \ldots, z_n'') = \left(\exp(-it_1' M_1')\, z_2', \ldots, \exp(-it_1' M_1')\, z_n'\right) \tag{3.17}$$
is close, in $X_d^{(c)(n-1)}$, to the point
$$\left(e^{it_2 M_2} c_2, \ldots, e^{it_2 M_2} \cdots e^{it_n M_n} c_n\right). \tag{3.18}$$
By $St_{n-1}$ the point (3.17) can be rewritten as
$$\left(e^{it_2' M_2'} c_2', \ldots, e^{it_2' M_2'} \cdots e^{it_n' M_n'} c_n'\right), \tag{3.19}$$
where, for $2 \le j \le n$, $t_j'$, $M_j'$, $c_j'$ are respectively close to $t_j$, $M_j$, $c_j$. This proves $St_n$.
3.3. The n-point future and past tuboids.

Definition 3.2. We denote $G_0^+$ (resp. $\tilde G_0^+$) the subset of $G_0^{(c)}$ (resp. $\tilde G_0^{(c)}$) consisting of all elements of the form $\exp(\tau_1 M_1) \cdots \exp(\tau_N M_N)$, where $N \in \mathbf{N}$, $\tau_j \in \mathbf{C}_+$, $M_j \in \mathcal{C}_+$ for all $j = 1, \ldots, N$. We denote $G_0^- = \{\Lambda : \Lambda^{-1} \in G_0^+\}$. This is also the complex conjugate of $G_0^+$. We define the $n$-point future and past tuboids $T_{n+}$ and $T_{n-}$ by
$$T_{n+} = \left\{ (z_1, \ldots, z_n) \in X_d^{(c)n} : \forall j = 1, \ldots, n,\ z_j = \Lambda_1 \cdots \Lambda_j c_j,\ \Lambda_j \in G_0^+,\ c_j \in X_d \right\}, \qquad T_{n-} = T_{n+}^*. \tag{3.20}$$
Note that if $\Lambda_1, \Lambda_2 \in G_0^+$, then $\Lambda_1 \Lambda_2 \in G_0^+$. If $\Lambda \in G_0$ then $\Lambda G_0^+ \Lambda^{-1} = G_0^+$. Similar properties hold for $\tilde G_0^+$.

Lemma 3.7.
(i) $G_0^+$ is open in $G_0^{(c)}$.
(ii) $G_0^+ = G_0^+ G_0 = G_0 G_0^+$.

This lemma is proved in Appendix B. Similar properties hold for $\tilde G_0^+$.

The tuboids $T_{n\pm}$ are invariant under $G_0$, i.e. if $\Lambda \in G_0$ and $(z_1, \ldots, z_n) \in T_{n+}$, then $(\Lambda z_1, \ldots, \Lambda z_n) \in T_{n+}$. Obviously $Z_{n\pm} \subset T_{n\pm}$. As a consequence of Lemma 3.3, $Z_{1+} = T_{1+}$.

Lemma 3.8. For any $n \in \mathbf{N}$, $T_{n+}$ is open.

Proof. As in the proof of Lemma 3.6, we prove, by induction on $n$, ($P_n$): Let $z = (z_1, \ldots, z_n) \in X_d^{(c)n}$ be such that, for $1 \le j \le n$,
$$z_j = \Lambda_1 \cdots \Lambda_j c_j, \qquad \Lambda_j \in G_0^+,\ c_j \in X_d. \tag{3.21}$$
Then, for any $z' \in X_d^{(c)n}$ sufficiently close to $z$, there exist $\Lambda_1', \ldots, \Lambda_n' \in G_0^+$ and $c_1', \ldots, c_n' \in X_d$, respectively close to $\Lambda_1, \ldots, \Lambda_n$ and $c_1, \ldots, c_n$, such that $z_j' = \Lambda_1' \cdots \Lambda_j' c_j'$. We start with $n = 1$ and suppose, without loss of generality in view of $G_0$-invariance, that $z = \Lambda c$ with $c \in X_d$, and $\Lambda = \exp(it_1 M_1) \cdots \exp(it_L M_L)$, $t_k > 0$ for all $k = 1, \ldots, L$. Assume e.g. $L > 1$ and let $z' \in X_d^{(c)}$ be sufficiently close to $z$. Then $z'' = \exp(-it_{L-1} M_{L-1}) \cdots \exp(-it_1 M_1)\, z'$ is close to $\exp(it_L M_L)\, c$ and it can, by the proof of Lemma 3.6, be written as $z'' = \exp(it_L' M_L')\, c'$, where $t_L' > 0$, $M_L' \in \mathcal{C}_1$ and $c' \in X_d$ are respectively close to $t_L > 0$, $M_L$, and $c$. Thus $z' = \Lambda' c'$ with $\Lambda' = \exp(it_1 M_1) \cdots \exp(it_{L-1} M_{L-1}) \exp(it_L' M_L')$. The inductive proof of $P_n$ for all $n > 1$ follows the same line as in the proof of Lemma 3.6.

Note that $T_{n+} \subset (T_{1+})^n$. Another proof of Lemma 3.8 can be based on Lemma 3.7.

Definition 3.3. The tuboids $\tilde T_{n\pm}$ are defined as the universal covering spaces of $T_{n\pm}$.

Remark 3.1. The transformation $[-1] = e^{i\pi M_{10}}$ belongs to $G$ and $G_0^{(c)}$ but not to $G_0$. Indeed $[-1] M_{0d} [-1] = -M_{0d}$. If $L = \ell(u \wedge v) \in \mathcal{C}_1$, and $L' = [-1] L [-1] = \ell(u' \wedge v')$ we find $(M_{0d}, L') = -(M_{0d}, L)$, i.e. $u_0' v_d' - u_d' v_0' < 0$ hence $L' \in -\mathcal{C}_1$. As a consequence if $(z_1, \ldots, z_n) \in T_{n+}$ (resp. $Z_{n+}$) then $([-1] z_1, \ldots, [-1] z_n) \in T_{n+}^*$ (resp. $Z_{n+}^*$).
3.4. Permuted tuboids. For $n \ge 2$ and for any permutation $\pi$ of $\{1, \ldots, n\}$, the permuted tuboid $T_{n,\pi}$ is defined by
$$T_{n,\pi} = \left\{ z \in X_d^{(c)n} : (z_{\pi(1)}, \ldots, z_{\pi(n)}) \in T_{n+} \right\}. \tag{3.22}$$
An analogous definition is used for $\tilde T_{n,\pi}$. Two permuted tubes $T_{n,\pi}$ and $T_{n,\pi'}$ are called adjacent if $\pi' \pi^{-1}$ is the transposition of two consecutive indices. An interesting peculiarity of the AdS space-time is that (in contrast to the Minkowskian situation) adjacent permuted tuboids are not disjoint. The simplest example is

Lemma 3.9. Let $c_1$ and $c_2$ be real and $|(c_1, c_2)| > 1$. Then,
(i) for each $A \in G_0^+$, $(A c_1, A c_2) \in T_{2+}$.
(ii) for each $A \in G_0^+$, $(A c_1, A c_2) \in T_{2+} \cap T_{2,(2,1)}$. Here $T_{2,(2,1)} = \{z : (z_2, z_1) \in T_{2+}\}$.

Proof. It is obvious that (ii) follows from (i), since the hypotheses are invariant under the exchange of $c_1$ and $c_2$. To prove (i), we may assume that, in the coordinates $0, 1, d$ (all others kept equal to 0),
$$c_1 = e_d = (0, 0, 1), \qquad c_2 = (0, \operatorname{sh} u, \varepsilon \operatorname{ch} u), \qquad \varepsilon = \pm 1,\ u \ne 0. \tag{3.23}$$
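Let $s$ be real and satisfy $|s| \in (0, \pi)$ and $\varepsilon u s > 0$. Then
$$e^{is M_{10}} c_2 = (i \operatorname{sh} u \sin s,\ \operatorname{sh} u \cos s,\ \varepsilon \operatorname{ch} u) \tag{3.24}$$
$$= (i b^d \operatorname{sh} t,\ b^1,\ b^d \operatorname{ch} t) = e^{it M_{0d}}\, b, \tag{3.25}$$
$$\operatorname{th} t = \varepsilon \sin s \operatorname{th} u \in (0, 1), \qquad b^0 = 0, \quad b^1 = \cos s \operatorname{sh} u, \quad b^d = \varepsilon \operatorname{ch} u / \operatorname{ch} t. \tag{3.26}$$
Hence $e^{is M_{10}} c_2 \in T_{1+}$. Let $A \in G_0^+$ and $z_1 = A c_1$, $z_2 = A c_2$. For sufficiently small $|s| > 0$, since $G_0^+$ is open, we have $A e^{-is M_{10}} \in G_0^+$, and $z_1 = (A e^{-is M_{10}})\, c_1$, $z_2 = (A e^{-is M_{10}})\, (e^{is M_{10}} c_2) = (A e^{-is M_{10}})\, e^{it M_{0d}}\, b$, so that $(z_1, z_2) \in T_{2+}$. Similarly $(z_2, z_1) \in T_{2+}$.

Remark 3.2. Denote
$$\mathcal{R}_2 = \left\{ (c_1, c_2) \in X_d^2 : (c_1, c_2) > 1 \right\}, \tag{3.27}$$
$$\mathcal{R}_2' = \left\{ (c_1, c_2) \in X_d^2 : (c_1, c_2) < -1 \right\}. \tag{3.28}$$
Then the sets $G_0^+ \mathcal{R}_2 = \{(A c_1, A c_2) : (c_1, c_2) \in \mathcal{R}_2,\ A \in G_0^+\}$ and $G_0^+ \mathcal{R}_2'$ are disjoint. If $d > 2$, each of them is connected.

Lemma 3.9 can be generalized to $n$-point tubes as follows:

Lemma 3.10. Let $(c_1, \ldots, c_j, c_{j+1}, \ldots, c_n) \in X_d^n$ be such that $|(c_j, c_{j+1})| > 1$. Then for any choice of $\Lambda_k \in G_0^+$, $1 \le k < j$ or $j + 1 < k \le n$, and of $A \in G_0^+$, the point $z$ such that
$$\begin{aligned} z_k &= \Lambda_1 \cdots \Lambda_k c_k \quad \text{for } k < j, \\ z_j &= \Lambda_1 \cdots \Lambda_{j-1} A\, c_j, \qquad z_{j+1} = \Lambda_1 \cdots \Lambda_{j-1} A\, c_{j+1}, \\ z_k &= \Lambda_1 \cdots \Lambda_{j-1} A \Lambda_{j+2} \cdots \Lambda_k c_k \quad \text{for } k > j + 1, \end{aligned} \tag{3.29}$$
belongs to $T_{n+}$.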
Therefore points of the form (3.29) belong to the intersection of $T_{n+}$ with the permuted tuboid $T_{n,(j+1,j)}$ obtained by exchanging the indices $j$ and $j + 1$. Let $\mathcal{R}_{j,k} = \{x \in X_d^n : (x_j, x_k) > 1\}$. This is an open subset of $X_d^n$ which has two connected components if $d = 2$, only one otherwise. As a result of Lemma 3.10, the intersection $T_{n+} \cap T_{n,(j+1,j)}$ is an open tuboid which has connected components bordered by those of $\mathcal{R}_{j,j+1}$.

Proof of Lemma 3.10. We may again assume that
$$c_j = e_d = (0, 0, 1), \qquad c_{j+1} = (0, \operatorname{sh} u, \varepsilon \operatorname{ch} u), \qquad \varepsilon = \pm 1,\ u \ne 0. \tag{3.30}$$
For sufficiently small $|s| > 0$, and $\varepsilon u s > 0$, let $t$ and $b = (0, b^1, b^d)$ be given by (3.26), and let $\Lambda_j = A e^{-is M_{10}} \in G_0^+$, $\Lambda_{j+1} = e^{it M_{0d}} \in G_0^+$. Then
$$z_j = \Lambda_1 \cdots \Lambda_{j-1} \Lambda_j\, c_j, \qquad z_{j+1} = \Lambda_1 \cdots \Lambda_{j-1} \Lambda_j \Lambda_{j+1}\, b. \tag{3.31}$$
Furthermore
$$A \Lambda_{j+2} = \Lambda_j \Lambda_{j+1} \Lambda_{j+2}', \qquad \Lambda_{j+2}' = e^{-it M_{0d}}\, e^{is M_{10}}\, \Lambda_{j+2} \in G_0^+. \tag{3.32}$$
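Both proofs rest on the identity (3.24)–(3.26), which rewrites the complex boost $e^{isM_{10}} c_2$ as a complex rotation $e^{itM_{0d}} b$ of a real point $b$. A short Python sketch of a numerical check is given below, working in the three coordinates $(0, 1, d)$ only; the function names are illustrative, and the matrices used for $e^{\sigma M_{10}}$ and $e^{\tau M_{0d}}$ are those implied by (2.8) and (2.11).

```python
import cmath
import math

def exp_M10(sigma, z):
    """exp(sigma M_10) z in coordinates (0, 1, d): complex boost in the (0, 1) plane."""
    c, s = cmath.cosh(sigma), cmath.sinh(sigma)
    return [c * z[0] + s * z[1], s * z[0] + c * z[1], z[2]]

def exp_M0d(tau, z):
    """exp(tau M_0d) z in coordinates (0, 1, d): complex rotation in the (0, d) plane."""
    c, s = cmath.cos(tau), cmath.sin(tau)
    return [c * z[0] + s * z[2], z[1], -s * z[0] + c * z[2]]

def rotation_data(u, s, eps):
    """Return (t, b) of Eq. (3.26) for c_2 = (0, sh u, eps ch u)."""
    t = math.atanh(eps * math.sin(s) * math.tanh(u))
    b = [0.0, math.cos(s) * math.sinh(u), eps * math.cosh(u) / math.cosh(t)]
    return t, b
```

Whenever $\varepsilon u s > 0$ and $|s| < \pi$, the returned $t$ is positive and $b$ satisfies $-(b^1)^2 + (b^d)^2 = 1$, so that $e^{itM_{0d}} b$ is indeed of the form required in (3.3).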
On the other hand “opposite” tuboids such as $T_{n+}$ and $T_{n-}$ do not intersect since they are respectively contained in $T_{1+}^n$ and $T_{1-}^n$.

3.5. Comparing the n-point tuboids of $X_d^{(c)}$ and their coverings with the tubes of complex Minkowski space. The family of one-parameter subgroups $e^{\tau M}$ of $\tilde G_0$ whose generator $M$ belongs to the set $\mathcal{C}_1$ of $\mathcal{G}$ can be considered as the analog of the family of timelike translation groups $e^{\tau a}$ acting on Minkowskian spacetime, whose generator $a$ belongs to the unit hyperboloid shell $H_1 = \{a \in V^+;\ a^2 = 1\}$, but of course there is a major difference: while in the latter case, this family of one-parameter groups forms a commutative subgroup $T_+$ of the group of spacetime translations, the groups of the former family are all mutually noncommutative. Nevertheless, the analogy between the two families has provided us with the basic idea for defining the $n$-point tuboids in the AdS case. In fact, if in the definitions of $Z_{n+}$ (resp. $T_{n+}$) one replaces the Lie elements $M_j \in \mathcal{C}_1$ by $a_j \in H_1$ (resp. $\Lambda_j \in G_0^+$ by $g_j \in T_+$) and the points $c_j$ on AdS by points in Minkowskian spacetime, one exactly reobtains the usual $n$-point tubes of complex Minkowski space as they are defined in [SW]. The most obvious (and unpleasant) effect of noncommutativity in the AdS case is that for $n \ge 2$ the description of the tuboids remains very implicit, since the defining conditions involve group elements in a heavy way instead of being characterized directly by equations on the AdS manifold; in particular it is not clear whether the “minimal tuboids” $Z_{n+}$ are really smaller than the (“complete”) $n$-point tuboids $T_{n+}$, which will be used in a natural way for expressing the spectral condition of AdS quantum fields (see below in Sect. 4). We wish however to emphasize a simple result which displays a reassuring analogy between the AdS tuboids and the Minkowskian tubes, namely the inclusion of regions of the corresponding “Euclidean” spacetimes in these domains. In fact, one has the following property, in which $\tilde\chi^{(c)}$ denotes the extension to $\mathbf{C} \times \mathbf{R}^{d-1}$ of the diffeomorphism $\tilde\chi$ (see Subsect. 2.1 after formulae (2.2) and (2.3)).
Lemma 3.11. For each $n$, the tuboid $\tilde Z_{n+}$ contains the image by $\tilde\chi^{(c)n}$ of the following “flat tube”:
$\{((\tau_1, \vec x_1), \ldots, (\tau_n, \vec x_n)) \in \mathbf{C}^n \times \mathbf{R}^{(d-1)n};\ 0 < \operatorname{Im} \tau_1 < \cdots < \operatorname{Im} \tau_n\}$. In particular $\tilde Z_{n+}$ contains the following open subset of $X_d^{(E)n}$:
$$\left\{ (z_1, \ldots, z_n) \in X_d^{(c)n} : z_j = \left(i \sqrt{1 + \vec x_j^{\,2}}\, \operatorname{sh} s_j,\ \vec x_j,\ \sqrt{1 + \vec x_j^{\,2}}\, \operatorname{ch} s_j\right),\ 1 \le j \le n;\ 0 < s_1 < \cdots < s_n \right\}. \tag{3.33}$$
The proof is readily obtained by putting $M_1 = \cdots = M_n = M_{0d}$ and $c_j = (0, \vec x_j, \sqrt{1 + \vec x_j^{\,2}})$ ($1 \le j \le n$) in (3.11) and changing the sequence $(\tau_1, \tau_1 + \tau_2, \ldots, \tau_1 + \cdots + \tau_n)$, $\tau_j \in \mathbf{C}_+$, into $(\tau_1, \tau_2, \ldots, \tau_n)$. The subset (3.33) of $\tilde Z_{n+}$ which is then exhibited is clearly (in view of (2.3)) a subset of $X_d^{(E)n}$.

This result will entail the validity of a “Wick-rotation procedure” for the universal covering of AdS spacetime (see Sect. 8). In fact, we note that the flat tubes of Lemma 3.11 represent domains in the complexification (in the time variable) of Minkowski space as well as $\tilde X_d$: these domains are isomorphic. In the pure AdS spacetime, this isomorphism does not exist since one has to consider quotients of the previous flat tubes corresponding to the geometric periodicity conditions under the transformations $\tau_j \to \tau_j + 2\pi$ as described in (3.11).

Another close analogy between the tuboids $Z_{n\pm}$ and the corresponding Minkowskian $n$-point tubes will be displayed in Subsect. 7.2: it will be proved that there exists a special subset of real points in $X_d^n$ enjoying the same property as the Jost points of Minkowski space, namely all the complex points obtained by the action of appropriate one-parameter complex subgroups of $G_0^{(c)}$ on these “Jost points” are contained in $Z_{n+} \cup Z_{n-}$, and this is the starting point of analytic completions (of the type of Glaser-Streater's theorem) which are crucial for proving the Bisognano-Wichmann property (see Sect. 7).

We shall now mention two interesting discrepancies between the tuboids $Z_{n+}$ and the corresponding $n$-point tubes of complex Minkowski space. The first one, which is developed in Sect. 5, concerns the parabolic trajectories already introduced in Sect. 2; the corresponding complexified curves turn out to exhibit sections of the tuboids $Z_{n+}$ which lie in the complexified “Poincaré sections of AdS”. These peculiarities are at the origin of the fact that QFT's in these Poincaré sections can be generated as restrictions of QFT's on AdS. No analogs of such families of complexified trajectories in isotropic (or “lightlike”) hyperplanes exist in complex Minkowski space.

Finally, we have exhibited above in Subsect. 3.4 a property of pairs of permuted tuboids in $X_d^{(c)}$ which is definitely new with respect to the corresponding pairs in complex Minkowski space. In the latter case, such tubes always have an empty intersection and the property of a common analytic continuation for pairs of functions analytic in “adjacent permuted tubes” necessitates the application of the edge-of-the-wedge theorem to the boundary values from these two domains through an appropriate “coincidence region”. In the present case, the situation is somewhat simpler, since according to Lemma 3.9 it is a general fact that adjacent pairs of permuted tuboids in $X_d^{(c)}$ have nonempty intersections.

4. QFT on $X_d$ and $\tilde X_d$

As usual, it is possible to formulate the main assumptions in terms of distributions (the test-functions having then compact supports) or tempered distributions. We denote $\mathcal{B}_n$
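the space of test-functions on $X_d^n$ or $\tilde X_d^n$. This may be either $\mathcal{D}(X_d^n)$ (resp. $\mathcal{D}(\tilde X_d^n)$) or $\mathcal{S}(X_d^n)$ (resp. $\mathcal{S}(\tilde X_d^n)$). The Borchers algebra $\mathcal{B}$ on $X_d$ (resp. $\tilde X_d$) is the complex vector space of terminating sequences of test-functions $f = (f_0, f_1(x_1), \ldots, f_n(x_1, \ldots, x_n), \ldots)$, where $f_0 \in \mathbf{C}$ and $f_n \in \mathcal{B}_n$ for all $n \ge 1$, the product and $\star$ operations being given by
$$(fg)_n = \sum_{\substack{p, q \in \mathbf{N} \\ p + q = n}} f_p \otimes g_q, \qquad (f^\star)_n (x_1, \ldots, x_n) = \overline{f_n(x_n, \ldots, x_1)}. \tag{4.1}$$
The action of $\Lambda \in G_0$ (resp. $\tilde G_0$) on $\mathcal{B}$ is defined by $f \mapsto f_{\{\Lambda\}}$, where
$$f_{\{\Lambda\}} = \left(f_0, f_{1\{\Lambda\}}, \ldots, f_{n\{\Lambda\}}, \ldots\right), \qquad f_{n\{\Lambda\}}(x_1, \ldots, x_n) = f_n\left(\Lambda^{-1} x_1, \ldots, \Lambda^{-1} x_n\right). \tag{4.2}$$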
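The algebraic operations (4.1) and (4.2) can be mimicked on a toy model. In the Python sketch below, an element of the Borchers algebra is represented, purely for illustration, as a finite list of callables `f[0], f[1], ...`, where `f[n]` takes `n` arguments (and `f[0]` takes none); the star operation is taken to be the usual involution combining complex conjugation with reversal of the arguments. Points of $X_d$ are stood in for by arbitrary Python objects, and the group action is passed as a callable `lam_inv` implementing $x \mapsto \Lambda^{-1} x$. None of these names come from the text.

```python
def product(f, g):
    """Components of f g: (f g)_n = sum over p + q = n of f_p (x) g_q."""
    def component(n):
        def h(*xs):
            return sum(f[p](*xs[:p]) * g[n - p](*xs[p:])
                       for p in range(n + 1)
                       if p < len(f) and n - p < len(g))
        return h
    return [component(n) for n in range(len(f) + len(g) - 1)]

def star(f):
    """Involution: (f*)_n(x_1, ..., x_n) = conjugate of f_n(x_n, ..., x_1)."""
    def component(fn):
        return lambda *xs: complex(fn(*reversed(xs))).conjugate()
    return [component(fn) for fn in f]

def act(lam_inv, f):
    """Action (4.2): every argument is transformed by lam_inv before evaluation."""
    def component(fn):
        return lambda *xs: fn(*(lam_inv(x) for x in xs))
    return [component(fn) for fn in f]
```

On such sequences one can verify on examples that `star(star(f))` agrees with `f` and that `star(product(f, g))` agrees with `product(star(g), star(f))`, which is the property used in the positivity condition.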
It will also be useful to denote
$$f_{n\{\Lambda_1, \ldots, \Lambda_n\}}(x_1, \ldots, x_n) = f_n\left(\Lambda_1^{-1} x_1, \ldots, \Lambda_n^{-1} x_n\right), \tag{4.3}$$
where $f_n \in \mathcal{B}_n$ and $\Lambda_1, \ldots, \Lambda_n$ belong to $G_0$ or $\tilde G_0$. If $\pi$ is a permutation of $(1, \ldots, n)$, and $f_n \in \mathcal{B}_n$ we define the function $\pi f_n \in \mathcal{B}_n$ by
$$\pi f_n (x_{\pi(1)}, \ldots, x_{\pi(n)}) = f_n(x_1, \ldots, x_n). \tag{4.4}$$
The theory of a single scalar quantum field on $X_d$ (resp. $\tilde X_d$) is specified by a continuous linear functional $\mathcal{W}$ on $\mathcal{B}$, i.e. by a sequence $\{\mathcal{W}_n \in \mathcal{B}_n'\}_{n \in \mathbf{N}}$, called Wightman functions, with the following properties:

1. Covariance. Each $\mathcal{W}_n$ is invariant under the group $G_0$ (resp. $\tilde G_0$), i.e.
$$\langle \mathcal{W}_n, f_{n\{\Lambda\}} \rangle = \langle \mathcal{W}_n, f_n \rangle \tag{4.5}$$
for all $\Lambda \in G_0$ (resp. $\tilde G_0$).

2. Locality.
$$\mathcal{W}_n(x_1, \ldots, x_j, x_{j+1}, \ldots, x_n) = \mathcal{W}_n(x_1, \ldots, x_{j+1}, x_j, \ldots, x_n) \tag{4.6}$$
if $x_j$ and $x_{j+1}$ are space-like separated.

3. Positive Definiteness. For each $f \in \mathcal{B}$, $\mathcal{W}(f^\star f) \ge 0$. Explicitly, given $f_0 \in \mathbf{C}$, $f_1 \in \mathcal{B}_1$, $\ldots$, $f_k \in \mathcal{B}_k$, then
$$\sum_{n, m = 0}^{k} \langle \mathcal{W}_{n+m}, f_n^\star \otimes f_m \rangle \ge 0. \tag{4.7}$$
If these conditions are satisfied the GNS construction (see [Bo, J]) provides a Hilbert space $\mathcal{H}$, a continuous unitary representation $\Lambda \mapsto U(\Lambda)$ of $G_0$ (resp. $\tilde G_0$) and a representation $f \mapsto \Phi(f)$ (by unbounded operators) as well as a unit vector $\Omega \in \mathcal{H}$, invariant under $U$, such that $\mathcal{W}(f) = (\Omega, \Phi(f)\Omega)$ for all $f \in \mathcal{B}$. As a special case the field operator $\phi$ is the operator valued distribution over $X_d$ (resp. $\tilde X_d$) such that
$\phi(f_1) = \Phi(f)$, where $f = (0, f_1, 0, \ldots)$. In addition the construction provides vector valued distributions $G_n^{(b)}$ such that
\[
\langle G_n^{(b)}, f_n \rangle = \Phi(f)\,\Omega = \int f_n(x_1, \ldots, x_n)\, \phi(x_1) \cdots \phi(x_n)\, \Omega \; d\sigma(x_1) \cdots d\sigma(x_n), \tag{4.8}
\]
where $f = (0, \ldots, 0, f_n, 0, \ldots)$. As usual, for every $\Lambda \in G_0$ (resp. $\tilde G_0$),
\[
U(\Lambda)\, \Phi(f)\, U(\Lambda)^{-1} = \Phi(f_{\{\Lambda\}}), \qquad U(\Lambda)\, \Omega = \Omega. \tag{4.9}
\]
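The details of this construction are completely analogous to those of the Minkowskian case. To every element $M$ of the Lie algebra $\mathcal{G}$ we can associate the one-parameter subgroup $t \mapsto \exp tM$ of $G_0$ (resp. $\tilde G_0$) and a self-adjoint operator $\hat M$ acting in $\mathcal{H}$ such that $\exp it\hat M = U(\exp tM)$ for all $t \in \mathbb{R}$. With these notations, we postulate the following

4. Strong Spectral Condition. For every $M \in \mathcal{C}_+$, every $\Psi \in \mathcal{H}$, and every $C^\infty$ function $\tilde\varphi$ with compact support contained in $(-\infty, 0)$,
\[
\int_{\mathbb{R}} \left( \int \tilde\varphi(p)\, e^{-itp}\, dp \right) U(\exp tM)\, \Psi \; dt = 0. \tag{4.10}
\]
Equivalently $\hat M$ has its spectrum contained in $\mathbb{R}_+$. Using the spectral decomposition of $\hat M$, this implies that, for every $\Psi \in \mathcal{H}$, $t \mapsto \exp(it\hat M)\Psi = U(\exp tM)\Psi$ extends to a function, again denoted $z \mapsto \exp(iz\hat M)\Psi$, continuous on the closed upper half-plane $\overline{\mathbb{C}_+}$ and holomorphic in $\mathbb{C}_+$, and bounded in norm by $\|\Psi\|$. For any finite sequence $\{M_j\}_{1 \le j \le N}$ of elements of $\mathcal{C}_+$, the function
\[
(z_1, \ldots, z_N) \mapsto e^{iz_1 \hat M_1} \cdots e^{iz_N \hat M_N}\, \Psi \tag{4.11}
\]
is therefore continuous and bounded in norm by $\|\Psi\|$ on the “flattened tube”
\[
\bigcup_{j=1}^{N} \left\{ (z_1, \ldots, z_N) \in \mathbb{C}^N : z_j \in \overline{\mathbb{C}_+},\ z_k \in \mathbb{R}\ \ \forall k \ne j \right\}. \tag{4.12}
\]
It is holomorphic in $z_j$ in $\mathbb{C}_+$ when the other $z_k$ are kept real. By the flattened tube (Malgrange-Zerner) theorem, this function extends to a continuous function on $\overline{\mathbb{C}_+}^{\,N}$, holomorphic in $\mathbb{C}_+^N$, and bounded in norm by $\|\Psi\|$. Thus the function $\Lambda \mapsto U(\Lambda)\Psi$ extends to a bounded holomorphic function on $G_0^+$ (resp. $\tilde G_0^+$) with continuous boundary value on $G_0$ (resp. $\tilde G_0$). We now consider
\[
\begin{aligned}
(\tau_1, \ldots, \tau_n) \mapsto{} & \int \left( \Omega,\ e^{i\tau_1 \hat M_1} \phi(x_1) \cdots e^{i\tau_n \hat M_n} \phi(x_n)\, \Omega \right) f_n(x_1, \ldots, x_n)\, d\sigma(x_1) \cdots d\sigma(x_n) \\
={} & \int \left( \Omega,\ \phi\left(e^{\tau_1 M_1} x_1\right) \phi\left(e^{\tau_1 M_1} e^{\tau_2 M_2} x_2\right) \cdots \phi\left(e^{\tau_1 M_1} \cdots e^{\tau_n M_n} x_n\right) \Omega \right) f_n(x_1, \ldots, x_n)\, d\sigma(x_1) \cdots d\sigma(x_n) \\
\stackrel{\mathrm{def}}{=}{} & \langle \mathcal{W}_n, f_{n\{\Lambda_1, \ldots, \Lambda_n\}} \rangle,
\end{aligned} \tag{4.13}
\]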
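where, for $1 \le j \le n$, $\tau_j \in \mathbb{R}$, $M_j \in \mathcal{C}_+$, and $\Lambda_j = e^{\tau_1 M_1} \cdots e^{\tau_j M_j}$. Suppose that $f_n = g_1 \otimes \ldots \otimes g_n$, where $g_j \in \mathcal{B}_1$. Then (4.13) extends to a function of $\tau_1, \ldots, \tau_n$ which is $C^\infty$ on the flattened tube
\[
\bigcup_{j=1}^{n} \left\{ (\tau_1, \ldots, \tau_n) : \operatorname{Im} \tau_j \ge 0,\ \operatorname{Im} \tau_k = 0\ \ \forall k \ne j \right\}, \tag{4.14}
\]
and holomorphic in $\tau_j$ in $\mathbb{C}_+$ when the other $\tau_k$ are kept real. For every $K > 0$, the restriction of this function to
\[
\bigcup_{j=1}^{n} \left\{ (\tau_1, \ldots, \tau_n) : |\tau_j| < K,\ \operatorname{Im} \tau_j \ge 0,\ |\tau_k| < K,\ \operatorname{Im} \tau_k = 0\ \ \forall k \ne j \right\} \tag{4.15}
\]
is bounded in modulus by
\[
C(K) \prod_{j=1}^{n} \|g_j\|_{m(K)}, \tag{4.16}
\]
where $\|g_j\|_{m(K)}$ is one of the seminorms defining the topology of $\mathcal{B}_1$. The envelope of holomorphy of the set (4.15) contains the topological product
\[
H_n(K) = \prod_{j=1}^{n} \left\{ \tau_j \in \mathbb{C}_+ : |\tau_j| < K\, \mathrm{tg}(\pi/4n) \right\}. \tag{4.17}
\]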
Therefore the function (4.13) extends to a $C^\infty$ function on $(\overline{\mathbb{C}_+})^n$, holomorphic in $(\mathbb{C}_+)^n$ and bounded in modulus by (4.16) on the set $H_n(K)$ for every $K > 0$. For every $\tau \in H_n(K)$, the value of (4.13) defines a continuous n-linear form on $\mathcal{B}_1^n$, hence, by the nuclear theorem, a unique continuous linear functional on $\mathcal{B}_n$. We conclude that, for a general $f_n \in \mathcal{B}_n$, the function (4.13) extends to a $C^\infty$ function on $(\overline{\mathbb{C}_+})^n$, holomorphic in $(\mathbb{C}_+)^n$. By standard arguments it follows that there exists a function $W_n$, holomorphic in $Z_{n+}$, having the distribution $\mathcal{W}_n$ as its boundary value in the sense of distributions. The same proof (but with a more cumbersome notation) shows that $W_n$ is holomorphic in $T_{n+}$.

In the remainder of this paper we will require the Wightman functions $\mathcal{W}_n$ to be tempered distributions on $X_d^n$ (resp. $\tilde X_d^n$), i.e. we will take $\mathcal{B}_n = \mathcal{S}(X_d^n)$ (resp. $\mathcal{S}(\tilde X_d^n)$), and we will assume, instead of the “strong spectral condition”, that the following holds:

5. Tempered Spectral Condition. For each pair of integers $m \ge 0$ and $n \ge 0$, $\mathcal{W}_{m+n}(w_m, \ldots, w_1, z_1, \ldots, z_n)$ is the boundary value, in the sense of tempered distributions, of a function $W_{m,n}$ of $(w, z)$ holomorphic and of tempered growth in $T_{m+}^{*} \times T_{n+} = T_{m-} \times T_{n+}$. (In particular $\mathcal{W}_n(z_1, \ldots, z_n)$ is the boundary value of a function $W_n$ holomorphic and of tempered growth in $T_{n+}$.) Moreover for any $f_n \in \mathcal{B}_n$ and every choice of $M_1, \ldots, M_n \in \mathcal{C}_1$ the function defined by (4.13) is $C^\infty$ and at most of polynomial growth in $\overline{\mathbb{C}_+}^{\,n}$.

If the positive definiteness condition holds, the tempered spectral condition implies the strong spectral condition. However the tempered spectral condition makes sense even if the positive definiteness condition does not hold. The following lemma follows from arguments given in [BEM] (Sect. 5), using as the main tool a theorem of V. Glaser [G1].
Lemma 4.1. Let $\mathcal{W}$ be a Wightman functional satisfying the conditions of covariance and locality and the tempered spectral condition. Suppose in addition that there is a real open set $V \subset X_d$ such that $\mathcal{W}(f^\star f) \ge 0$ for all $f \in \mathcal{B}$ with support in $V$ (i.e. such that $f_n$ has support in $V^n$ for all $n \in \mathbb{N}$). Then there exists, for each $n \in \mathbb{N}$, a vector valued function $G_n$, holomorphic in $T_{n+}$, and with tempered growth at infinity and near the boundaries, having $G_n^{(b)}$ as a boundary value in the sense of tempered distributions. In particular $\mathcal{W}$ satisfies the unrestricted positive definiteness condition, and the Reeh-Schlieder Theorem holds.

The permuted Wightman functions $\mathcal{W}_{n,\pi}$ defined, as usual, by $\langle \mathcal{W}_{n,\pi}, f_n \rangle = \langle \mathcal{W}_n, \pi f_n \rangle$, are boundary values of functions $W_{n,\pi}$ holomorphic in the permuted tuboids $T_{n,\pi} = \{ z : (z_{\pi(1)}, \ldots, z_{\pi(n)}) \in T_{n+} \}$. Owing to local commutativity, they are branches of a single holomorphic function.

Remark 4.1. If a set of Wightman functions satisfies the tempered spectral condition but only a part of the locality condition, i.e. if it is assumed that $\mathcal{W}_n$ and $\mathcal{W}_{n,(j+1,j)}$ coincide in an open subset of $R_{j,j+1}$ (necessarily symmetric under the exchange of $j$ and $j+1$), then it follows from Lemma 3.10 that they coincide in the whole of $R_{j,j+1}$. Similar extension theorems are well-known in the Minkowskian case (see e.g. [SW]) due to phenomena of analytic completion. It is remarkable that no completion is needed in the AdS case.

If Conditions 1–5 hold, one may want to assume
6. Uniqueness of the vacuum. The invariant subspace of $\mathcal{H}$, $\{ \Psi \in \mathcal{H} : U(\Lambda)\Psi = \Psi\ \ \forall \Lambda \in G_0\ (\text{resp. } \tilde G_0) \}$, is one-dimensional, i.e. it is equal to $\mathbb{C}\Omega$.

In the Minkowskian case this condition is equivalent to a clustering property of the Wightman functions, namely the truncated Wightman functions tend to zero when a proper subset of their arguments tend to space-like infinity while the others remain bounded. (The truncated Wightman functions have the same inductive definition as in the Minkowskian case ([J] p. 66). They have the same linear properties as the Wightman functions.) A similar equivalence holds in the anti-de Sitter case. In Sect. 5, we discuss some properties of this type, which are equivalent to the uniqueness of the vacuum in the presence of positivity.

5. Parabolic (Poincaré) Sections
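A convenient chart of a part of $X_d$ (resp. $X_d^{(c)}$) is provided by the parabolic coordinates $(\mathbf{z}, v) \mapsto z(\mathbf{z}, v)$ given by
\[
z(\mathbf{z}, v) = \begin{cases} z^\mu = e^{v}\, \mathbf{z}^\mu \\[2pt] z^{d-1} = \mathrm{sh}\, v + \tfrac{1}{2} e^{v}\, \mathbf{z}^2 \\[2pt] z^{d} = \mathrm{ch}\, v - \tfrac{1}{2} e^{v}\, \mathbf{z}^2 \end{cases} \tag{5.1}
\]
In this equation $\mu = 0, 1, \ldots, d-2$, and $\mathbf{z}^0, \ldots, \mathbf{z}^{d-2}$ are the coordinates of an arbitrary event in a real (resp. complex) $(d-1)$-dimensional Minkowski space-time with metric¹ $ds_M^2 = (d\mathbf{z}^0)^2 - (d\mathbf{z}^1)^2 - \cdots - (d\mathbf{z}^{d-2})^2$, $\mathbf{z}^2 = \mathbf{z}^0 \mathbf{z}^0 - \sum_{j=1}^{d-2} \mathbf{z}^j \mathbf{z}^j$, and $v \in \mathbb{R}$ (resp. $\mathbb{C}$).

¹ Here and in the following, where it appears, an index M stands for Minkowski.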
This explains why the coordinates $\mathbf{z}, v$ of the parametrization (5.1) are also called Poincaré coordinates. As $(\mathbf{z}, v)$ vary in $\mathbb{R}^d$, the image of this map is $\{ x \in X_d : x^{d-1} + x^d > 0 \}$. The scalar product and the AdS metric can then be rewritten as follows:
\[
(z, z') = \mathrm{ch}\,(v - v') - \tfrac{1}{2}\, e^{v+v'} \left(\mathbf{z} - \mathbf{z}'\right)^2, \tag{5.2}
\]
\[
ds^2_{AdS} = e^{2v}\, ds_M^2 - dv^2. \tag{5.3}
\]
Equation (5.2) implies that
\[
\left( z(\mathbf{z}, v) - z'(\mathbf{z}', v') \right)^2 = e^{v+v'} (\mathbf{z} - \mathbf{z}')^2 - 2\, \mathrm{ch}\,(v - v') + 2. \tag{5.4}
\]
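The identities (5.2) and (5.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that the ambient scalar product has signature $(+, -, \ldots, -, +)$ (as suggested by (5.5)), takes $d = 4$, and uses illustrative helper names (`parabolic_point`, `ads_product`, `minkowski_sq`) and arbitrary sample points.

```python
import math

def minkowski_sq(z):
    """Minkowski square z^0 z^0 - sum_j z^j z^j of a (d-1)-vector."""
    return z[0] ** 2 - sum(c ** 2 for c in z[1:])

def parabolic_point(zm, v):
    """Ambient coordinates (z^0, ..., z^{d-2}, z^{d-1}, z^d) given by (5.1)."""
    q = 0.5 * math.exp(v) * minkowski_sq(zm)
    return [math.exp(v) * c for c in zm] + [math.sinh(v) + q, math.cosh(v) - q]

def ads_product(x, y):
    """Assumed ambient scalar product, signature (+, -, ..., -, +)."""
    middle = sum(a * b for a, b in zip(x[1:-1], y[1:-1]))
    return x[0] * y[0] - middle + x[-1] * y[-1]

# Two arbitrary sample points (d = 4, so the Minkowski vectors have 3 components).
zm1, v1 = [0.3, -1.2, 0.7], 0.4
zm2, v2 = [-0.8, 0.5, 1.1], -0.9
x1, x2 = parabolic_point(zm1, v1), parabolic_point(zm2, v2)

diff_m = [a - b for a, b in zip(zm1, zm2)]   # z - z' in Minkowski space
diff_x = [a - b for a, b in zip(x1, x2)]     # difference in the ambient space

lhs_52 = ads_product(x1, x2)
rhs_52 = math.cosh(v1 - v2) - 0.5 * math.exp(v1 + v2) * minkowski_sq(diff_m)
lhs_54 = ads_product(diff_x, diff_x)
rhs_54 = math.exp(v1 + v2) * minkowski_sq(diff_m) - 2 * math.cosh(v1 - v2) + 2
```

The same helpers also confirm that the image lies on the quadric $(x, x) = 1$ and that $x^{d-1} + x^d = e^v$, which is the defining relation of the parabolic section.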
For a given real $v$ we denote $M_v$ the parabolic section
\[
M_v = \left\{ z \in X_d\ (\text{resp. } X_d^{(c)}) : (z, e_d - e_{d-1}) = z^{d-1} + z^d = e^{v} \right\}. \tag{5.5}
\]
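The subgroup $G_{(d-1)d}$ (resp. $G_{(d-1)d}^{(c)}$) of $G_0$ (resp. $G_0^{(c)}$) which fixes $e_d - e_{d-1}$, and therefore leaves $M_v$ globally invariant, is isomorphic to the real (resp. complex) Poincaré group operating on the $(d-1)$-Minkowski space. In particular, for $b = (b^0, \ldots, b^{d-2})$, the transformation
\[
\exp(b^\mu L_\mu) = \begin{pmatrix}
1 & \cdots & 0 & b^0 & b^0 \\
\vdots & \ddots & \vdots & \vdots & \vdots \\
0 & \cdots & 1 & b^{d-2} & b^{d-2} \\
b_0 & \cdots & b_{d-2} & 1 + \frac{(b,b)}{2} & \frac{(b,b)}{2} \\
-b_0 & \cdots & -b_{d-2} & -\frac{(b,b)}{2} & 1 - \frac{(b,b)}{2}
\end{pmatrix} \tag{5.6}
\]
(with lowered indices $b_0 = b^0$, $b_j = -b^j$ for $j \ge 1$) operates in the Minkowski leaf as $\mathbf{z} \to \mathbf{z} + b$. If $b = (\tau, \vec 0)$, this transformation leaves the coordinates $z^1, \ldots, z^{d-2}$ unchanged and, in the 3-dimensional space of the coordinates $z^0, z^{d-1}, z^d$, is given by
\[
e^{\tau L_0} = \begin{pmatrix}
1 & \tau & \tau \\
\tau & 1 + \frac{\tau^2}{2} & \frac{\tau^2}{2} \\
-\tau & -\frac{\tau^2}{2} & 1 - \frac{\tau^2}{2}
\end{pmatrix}. \tag{5.7}
\]
Here
\[
L_0 = \ell\left(e_0 \wedge (e_d - e_{d-1})\right) = M_{0d} - M_{0(d-1)} = \begin{pmatrix}
0 & 1 & 1 \\
1 & 0 & 0 \\
-1 & 0 & 0
\end{pmatrix}. \tag{5.8}
\]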
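Since $L_0$ is nilpotent, the exponential series in (5.7) terminates after the quadratic term. The sketch below is an illustration only: it assumes the $3 \times 3$ form of $L_0$ in the coordinates $(z^0, z^{d-1}, z^d)$ read off from (5.8), and the helper names (`matmul`, `exp_tau_L0`, `closed_form`) are hypothetical.

```python
def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# L_0 of (5.8) in the coordinates (z^0, z^{d-1}, z^d).
L0 = [[0, 1, 1],
      [1, 0, 0],
      [-1, 0, 0]]
L0_sq = matmul(L0, L0)
L0_cu = matmul(L0_sq, L0)   # vanishes identically: L_0 is nilpotent of order 3

def exp_tau_L0(tau):
    """exp(tau L_0) = 1 + tau L_0 + (tau^2 / 2) L_0^2, exact because L_0^3 = 0."""
    ident = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    return [[ident[i][j] + tau * L0[i][j] + 0.5 * tau ** 2 * L0_sq[i][j]
             for j in range(3)] for i in range(3)]

def closed_form(tau):
    """Right-hand side of (5.7)."""
    h = 0.5 * tau ** 2
    return [[1, tau, tau],
            [tau, 1 + h, h],
            [-tau, -h, 1 - h]]
```

Applying `exp_tau_L0(tau)` to a vector leaves the sum of its last two components unchanged, which is the statement that the parabolic section $M_v$ is preserved.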
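Similarly $L_\mu = \ell\left(e_\mu \wedge (e_d - e_{d-1})\right)$ where $0 \le \mu \le d-2$. With these notations,
\[
z(\mathbf{z}, v) = \exp\left(\mathbf{z}^\mu L_\mu\right) \exp\left(v M_{(d-1)d}\right) e_d. \tag{5.9}
\]
We denote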
(d,d−1) = bµ Lµ ∈ G : b0 > |b|
= 3 tL0 3−1 : t > 0, 3 ∈ G0 , 3ed = ed , 3ed−1 = ed−1 .
(5.10)
If c is a real point of Mv , i.e. c = z(c, v) for some real c and v, and Q = bµ Lµ ∈ (d,d−1) then exp(iQc) = z(c + ib, v) is a point of the future tube in the complexified Mv considered as a Minkowski space. Conversely any point of the future tube is of this form. More generally, we define the n-point forward tuboid Fd,d−1,n as (c)n
Fd,d−1,n = { (z1 , . . . , zn ) ∈ Xd : ∀j = 1, . . . , n, zj = exp(iQ1 ) · · · exp(iQj ) cj , Qj ∈ (d,d−1) , cj ∈ Xd , cjd−1 + cjd > 0 } .
(5.11)
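The intersection of this set with $M_v^n$ is the set obtained by restricting the $c_j$ to lie in $M_v$ in the above definition. This intersection is just the n-point forward tube in the Minkowskian variables $\mathbf{z}_1, \ldots, \mathbf{z}_n$. The main point of this section is

Lemma 5.1. For all $n$, $F_{d,d-1,n} \subset Z_{n+}$.

The proof of this lemma consists of Lemmas 5.2 and 5.3 below. We begin with the following remark.

Remark 5.1. Let
\[
\Lambda_u = \exp(u\, M_{d(d-1)}) = \begin{pmatrix}
1 & 0 & 0 \\
0 & \mathrm{ch}\, u & -\mathrm{sh}\, u \\
0 & -\mathrm{sh}\, u & \mathrm{ch}\, u
\end{pmatrix}. \tag{5.12}
\]
Then $\Lambda_u e_\mu = e_\mu$ for $0 \le \mu \le d-2$, and
\[
\Lambda_u e_{d-1} = (\mathrm{ch}\, u)\, e_{d-1} - (\mathrm{sh}\, u)\, e_d, \qquad \Lambda_u e_d = (\mathrm{ch}\, u)\, e_d - (\mathrm{sh}\, u)\, e_{d-1}. \tag{5.13}
\]
Therefore
\[
\lim_{u \to \pm\infty} 2 e^{-|u|} \Lambda_u e_{d-1} = e_{d-1} \mp e_d, \qquad \lim_{u \to \pm\infty} 2 e^{-|u|} \Lambda_u e_d = e_d \mp e_{d-1}, \tag{5.14}
\]
and, for $0 \le \mu \le d-2$,
\[
\Lambda_u M_{\mu d} \Lambda_u^{-1} = \ell(e_\mu \wedge \Lambda_u e_d) = (\mathrm{ch}\, u)\, M_{\mu d} - (\mathrm{sh}\, u)\, M_{\mu(d-1)}, \tag{5.15}
\]
\[
\lim_{u \to \pm\infty} 2 e^{-|u|} \Lambda_u M_{\mu d} \Lambda_u^{-1} = M_{\mu d} \mp M_{\mu(d-1)}, \tag{5.16}
\]
\[
\lim_{u \to \pm\infty} 2 e^{-|u|} \Lambda_u M_{\mu(d-1)} \Lambda_u^{-1} = M_{\mu(d-1)} \mp M_{\mu d}. \tag{5.17}
\]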
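The limits in (5.14) can be illustrated numerically. This is a small sketch under the assumption that $\Lambda_u$ acts on the plane spanned by $(e_{d-1}, e_d)$ through the lower-right $2 \times 2$ block of (5.12); the function names are illustrative.

```python
import math

def lambda_u(u):
    """Lower-right 2x2 block of (5.12), acting on components along (e_{d-1}, e_d)."""
    return [[math.cosh(u), -math.sinh(u)],
            [-math.sinh(u), math.cosh(u)]]

def scaled_image(u, vec):
    """2 exp(-|u|) Lambda_u vec, the quantity whose limit appears in (5.14)."""
    m = lambda_u(u)
    s = 2.0 * math.exp(-abs(u))
    return [s * (m[0][0] * vec[0] + m[0][1] * vec[1]),
            s * (m[1][0] * vec[0] + m[1][1] * vec[1])]

e_d = [0.0, 1.0]                  # components of e_d along (e_{d-1}, e_d)
plus = scaled_image(20.0, e_d)    # large positive u: approaches e_d - e_{d-1}
minus = scaled_image(-20.0, e_d)  # large negative u: approaches e_d + e_{d-1}
```

With $|u| = 20$ the deviation from the limiting light-like vectors is of order $e^{-2|u|}$, which is below double precision.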
Lemma 5.2. Let c ∈ Mv and L ∈ (d,d−1) , and let z = exp(iL)c. Then (i) There is an M ∈ C+ , and a c ∈ Xd , arbitrarily close to L and c, respectively, such that z = exp(iL)c = exp(iM)c . (ii) For every neighborhood W of (L, c) in G × Xd , there is a neighborhood V of (c) z = exp(iL)c in Xd such that every z ∈ V can be written as z = exp(iM ) c , with M ∈ C+ and (M , c ) ∈ W .
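Proof. (i) It suffices to prove the statement in case $L = sL_0$ for some $s > 0$, and $c^0 = 0$. In the coordinates $(z^0, z^{d-1}, z^d)$, for any $\tau \in \mathbb{C}$, $c \in M_v$,
\[
\exp(\tau L_0)\, c = \begin{pmatrix}
c^0 + e^{v} \tau \\[2pt]
c^0 \tau + c^{d-1} + e^{v} \frac{\tau^2}{2} \\[2pt]
-c^0 \tau + c^{d} - e^{v} \frac{\tau^2}{2}
\end{pmatrix}. \tag{5.18}
\]
As expected the two last components add up to $e^{v}$. We now set $c^0 = 0$ and $\tau = is$ ($s > 0$) in (5.18). We look for $M$ in the form
\[
M = \Lambda_u\, t M_{0d}\, \Lambda_u^{-1}, \tag{5.19}
\]
with $\Lambda_u$ as in (5.12) and $u > 0$ large. Then
\[
\Lambda(u, it) = \Lambda_u \exp(it M_{0d})\, \Lambda_u^{-1} = \begin{pmatrix}
\mathrm{ch}\, t & i\, \mathrm{sh}\, u\, \mathrm{sh}\, t & i\, \mathrm{ch}\, u\, \mathrm{sh}\, t \\
i\, \mathrm{sh}\, u\, \mathrm{sh}\, t & \mathrm{ch}^2 u - \mathrm{sh}^2 u\, \mathrm{ch}\, t & \mathrm{ch}\, u\, \mathrm{sh}\, u\, (1 - \mathrm{ch}\, t) \\
-i\, \mathrm{ch}\, u\, \mathrm{sh}\, t & -\mathrm{ch}\, u\, \mathrm{sh}\, u\, (1 - \mathrm{ch}\, t) & -\mathrm{sh}^2 u + \mathrm{ch}^2 u\, \mathrm{ch}\, t
\end{pmatrix}. \tag{5.20}
\]
The set of all vectors $z = (iy^0, \vec x, x^{d-1}, x^d)$ with pure imaginary 0-component and all other components real is mapped into itself by $\Lambda(u, it)$ for all real $u$ and $t$. As $u$ tends to $+\infty$,
\[
\Lambda_u\, t M_{0d}\, \Lambda_u^{-1} = (\mathrm{ch}\, u)\, t M_{0d} - (\mathrm{sh}\, u)\, t M_{0(d-1)} \tag{5.21}
\]
tends to $L_0$ provided $t \approx 2e^{-u}$, and $\Lambda(u, it)$ tends to $\exp(isL_0)$ provided $t \approx 2se^{-u}$. For fixed real $c = (0, \vec c, c^{d-1}, c^d)$, satisfying $c^{d-1} + c^d = e^{v}$, and real $s > 0$, we wish to find a real $c' = (0, \vec c\,', c'^{\,d-1}, c'^{\,d})$ and $t > 0$ such that $\Lambda(u, it)\, c' = \exp(isL_0)\, c$, i.e. $c' = \Lambda(u, -it) \exp(isL_0)\, c$. The condition that $c'^{\,0} = 0$ gives
\[
\mathrm{th}\, t = \frac{s e^{v}}{c^{d-1}\, \mathrm{sh}\, u + c^{d}\, \mathrm{ch}\, u + (s^2/2)\, e^{v} e^{-u}} \approx 2 s e^{-u}. \tag{5.22}
\]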
The other components of $c'$ are then real and tend to those of $c$ as $u \to \infty$. This proves (i).

(ii) Using (i), $z = \exp(iM)\, c'$, where $M \in \mathcal{C}_+$ and $c' \in X_d$ can be chosen arbitrarily close to $L$ and $c$ respectively. If $z' \in X_d^{(c)}$ is sufficiently close to $z$, we may apply $St_1$ of the proof of Lemma 3.6, which yields the conclusion of (ii).

The condition that $c \in M_v$ in Lemma 5.2 can be replaced by the condition $c^d + c^{d-1} > 0$, and in fact by $c^d + c^{d-1} \ne 0$ since the case $c^d + c^{d-1} < 0$ can be dealt with by changing $c$ to $-c$. We note also that $(c^d + c^{d-1})^2 = (L_0 c, L_0 c)$. We define
+ = 3ρL0 3−1 : 3 ∈ G0 , ρ > 0
= a ∧ b : (a, a) = 1, (b, b) = (a, b) = 0, a 0 bd − a d b0 > 0 (5.23) (the same calculations as in (2.16) show that if (a, a) = 1, (b, b) = (a, b) = 0, then (a 0 bd − a d b0 )2 ≥ b2 = b02 + bd2 ). The following lemma extends Lemma 5.2 to the case of n points. It may be useful to spell it out in some detail.
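The computation in part (i) of the proof of Lemma 5.2 can be reproduced numerically. The sketch below is illustrative and rests on assumptions: it works in the three coordinates $(z^0, z^{d-1}, z^d)$, uses the matrix (5.20) and the closed form (5.7) with $\tau = is$, and picks arbitrary sample values for $u$, $s$ and $c$; the function names are hypothetical.

```python
import math

def lam(u, t):
    """Matrix Lambda(u, it) of (5.20) in the coordinates (z^0, z^{d-1}, z^d)."""
    chu, shu = math.cosh(u), math.sinh(u)
    cht, sht = math.cosh(t), math.sinh(t)
    return [[cht, 1j * shu * sht, 1j * chu * sht],
            [1j * shu * sht, chu ** 2 - shu ** 2 * cht, chu * shu * (1 - cht)],
            [-1j * chu * sht, -chu * shu * (1 - cht), -shu ** 2 + chu ** 2 * cht]]

def exp_is_L0(s):
    """exp(i s L_0), i.e. the matrix (5.7) evaluated at tau = i s."""
    tau = 1j * s
    h = tau * tau / 2
    return [[1, tau, tau], [tau, 1 + h, h], [-tau, -h, 1 - h]]

def apply(m, v):
    """Matrix-vector product for 3x3 matrices."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def solve_t(u, s, c):
    """Value of t given by (5.22) for a real c = (0, c^{d-1}, c^d)."""
    ev = c[1] + c[2]
    denom = c[1] * math.sinh(u) + c[2] * math.cosh(u) + 0.5 * s * s * ev * math.exp(-u)
    return math.atanh(s * ev / denom)

u, s = 6.0, 0.8            # sample values; u is taken large
c = [0.0, 0.3, 1.1]        # real point with c^0 = 0 and c^{d-1} + c^d > 0
t = solve_t(u, s, c)
# c' = Lambda(u, -it) exp(i s L_0) c, as in the proof.
c_prime = apply(lam(u, -t), apply(exp_is_L0(s), c))
```

With $t$ chosen by (5.22) the 0-component of `c_prime` vanishes, the other components are real, and they approach those of `c` as `u` grows, in agreement with the statement of the proof.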
Lemma 5.3. Let $z = (z_1, \ldots, z_n) \in X_d^{(c)n}$ be such that, for $1 \le j \le n$,
zj = eiQ1 , . . . eiQj cj , Qj ∈ (d,d−1) , cj ∈ Xd , (Qj cj , Qj cj ) > 0 .
(5.24)
For each $\varepsilon > 0$, there exists a $\delta(n, \varepsilon, Q, c) > 0$ such that, for any $z' = (z'_1, \ldots, z'_n) \in X_d^{(c)n}$ satisfying $\|z'_j - z_j\| < \delta(n, \varepsilon, Q, c)$ for all $j = 1, \ldots, n$, there exist $M_1, \ldots, M_n \in \mathcal{C}_+$ and $c'_1, \ldots, c'_n \in X_d$ such that $\|M_j - Q_j\| < \varepsilon$, $\|c'_j - c_j\| < \varepsilon$, and $z'_j = \exp(iM_1) \cdots \exp(iM_j)\, c'_j$ for all $j = 1, \ldots, n$.

Proof. Let $St_n$ denote the statement of the lemma. $St_1$ follows from Lemma 5.2 in view of the preceding remarks. Assume that $St_m$ has been proved for all $m \le n-1$ and let $z = (z_1, \ldots, z_n)$ be as in (5.24). Let $\varepsilon \in (0, 1)$, $R = 1 + \sup_{1 \le j \le n} \|z_j\|$. Denote $\tilde Q = (Q_2, \ldots, Q_n)$, $\tilde c = (c_2, \ldots, c_n)$, $\tilde z = (z_2, \ldots, z_n) \in X_d^{(c)(n-1)}$, and
\[
\delta_2 = \frac{1}{4R}\, \delta\left(n-1, \varepsilon, \tilde Q, \tilde c\right). \tag{5.25}
\]
By $St_1$, there exists a $\delta_1 > 0$ such that for any $z'_1$ with $\|z'_1 - z_1\| < \delta_1$, there exist $M_1 \in \mathcal{C}_+$ and $c'_1 \in X_d$ such that $z'_1 = \exp(iM_1)\, c'_1$, $\|c'_1 - c_1\| < \varepsilon$, $\|M_1 - Q_1\| < \varepsilon$, and $\|\exp(iM_1) - \exp(iQ_1)\| < \delta_2$. Let $(z'_1, \ldots, z'_n) \in X_d^{(c)n}$ satisfy $\|z'_j - z_j\| < \min\{R/4,\ \delta_1,\ \delta_2/(1 + \|\exp(-iQ_1)\|)\}$, and let $M_1$, $c'_1$ be as above. Then, for $2 \le j \le n$,
\[
\begin{aligned}
\|\exp(-iM_1)\, z'_j - \exp(-iQ_1)\, z_j\| &\le \|\exp(-iM_1) - \exp(-iQ_1)\|\, \|z'_j\| + \|\exp(-iQ_1)\|\, \|z'_j - z_j\| \\
&\le \delta\left(n-1, \varepsilon, \tilde Q, \tilde c\right).
\end{aligned} \tag{5.26}
\]
By $St_{n-1}$, there exist $M_2, \ldots, M_n \in \mathcal{C}_+$ and $c'_2, \ldots, c'_n \in X_d$ such that, for $2 \le j \le n$, $\|M_j - Q_j\| < \varepsilon$, $\|c'_j - c_j\| < \varepsilon$, and $\exp(-iM_1)\, z'_j = \exp(iM_2) \cdots \exp(iM_j)\, c'_j$. This proves $St_n$.

We recall that similar calculations are used in [BB] to prove

Lemma 5.4 (Borchers-Buchholz). Let $U$ be a continuous unitary representation of $G_0$ in a Hilbert space $\mathcal{H}$. Let $\Psi \in \mathcal{H}$ be such that $U(\exp(u M_{d(d-1)}))\Psi = \Psi$ for all real $u$. Then $U(\Lambda)\Psi = \Psi$ for all $\Lambda \in G_0$.

Obviously $M_{d(d-1)}$ can be replaced, in the statement of Lemma 5.4, by any of its conjugates under $G_0$, e.g. $M_{01}$, etc. Lemma 5.1 follows, as announced, from Lemmas 5.2 and 5.3.

If $\{\mathcal{W}_n\}_{n \in \mathbb{N}}$ is a sequence of Wightman functions satisfying the conditions of Sect. 4, but not necessarily the positive definiteness condition, the tempered distribution $\mathcal{W}_n$ can be restricted to $M_v^n$, and more generally to $M_{v_1} \times \cdots \times M_{v_n}$. The distributions $\mathcal{W}_n(z(\mathbf{z}_1, v_1), \ldots, z(\mathbf{z}_n, v_n))$ have the linear properties of the n-point Wightman functions for a set of Minkowskian fields on $\mathbb{R}^{d-1}$, $A(\mathbf{z}, v) = \phi(z(\mathbf{z}, v))$, labelled by a real parameter $v$ and depending in a $C^\infty$ manner on $v$. The usual Minkowskian covariance and analyticity properties are satisfied by virtue of Lemma 5.1. The local commutativity is inherited from that postulated for the $\mathcal{W}_n$ in view of the formula (5.4). If the $A(\,\cdot\,, v)$ satisfy the positivity condition for all $v$ in a non-empty open interval $(a, b)$, then by Lemma 4.1 (Sect. 4), the original $\mathcal{W}_n$ satisfy the positivity condition on the whole of $X_d$. In case the positive definiteness condition holds, we also have
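Lemma 5.5. Let $Q$ belong to the set defined in (5.23) and let $\hat Q$ denote the self-adjoint operator on $\mathcal{H}$ such that $\exp(it\hat Q) = U(\exp(tQ))$ for all $t \in \mathbb{R}$. Then the spectrum of $\hat Q$ is contained in $\mathbb{R}_+$. More precisely, for every $\Psi \in \mathcal{H}$, and every $C^\infty$ function $\tilde\varphi$ on $\mathbb{R}$ with compact support contained in $(-\infty, 0)$,
\[
\int_{\mathbb{R}} \left( \int \tilde\varphi(p)\, e^{-itp}\, dp \right) U(\exp tQ)\, \Psi \; dt = 0. \tag{5.27}
\]
Proof. Let $\varphi(t) = \int \tilde\varphi(p)\, e^{-itp}\, dp$. We may assume that $\int |\varphi(t)|\, dt \le 1$ and $\|\Psi\| = 1$. Given $\varepsilon > 0$, let $T > 0$ be such that $\int_{|t| > T} |\varphi(t)|\, dt < \varepsilon/3$. Let $V$ be a neighborhood of the identity in $G_0$ (or $\tilde G_0$) such that $\|(U(\Lambda) - 1)\Psi\| < \varepsilon/3$ for all $\Lambda \in V$. It is possible to choose $M \in \mathcal{C}_+$ such that $\exp(-tQ)\exp(tM) \in V$ for all $t \in [-T, T]$. Then
\[
\begin{aligned}
\left\| \int \varphi(t)\, U\left(e^{tQ}\right) \Psi\, dt \right\| \le{} & \left\| \int \varphi(t)\, U\left(e^{tM}\right) \Psi\, dt \right\| + \left\| \int_{|t| > T} \varphi(t)\, U\left(e^{tM}\right) \Psi\, dt \right\| \\
& + \left\| \int_{|t| > T} \varphi(t)\, U\left(e^{tQ}\right) \Psi\, dt \right\| + \left\| \int_{|t| \le T} \varphi(t)\, U\left(e^{tQ}\right) \left( 1 - U\left(e^{-tQ} e^{tM}\right) \right) \Psi\, dt \right\| < \varepsilon
\end{aligned} \tag{5.28}
\]
since the first term in the rhs is zero.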
Under the same hypotheses, we note that if $\Psi \in \mathcal{H}$ is invariant under $U(G_{(d-1)d})$ then it is in particular invariant under $U(\exp(\mathbb{R} M_{01}))$, so that, by Lemma 5.4, it is invariant under $U(G_0)$. Let again $A(\mathbf{z}, v) = \phi(z(\mathbf{z}, v))$, which we consider as an operator valued distribution on the Minkowski space $\mathbb{R}^{d-1}$ depending smoothly on the real parameter $v$. By the Reeh-Schlieder theorem for the field $\phi$, the vacuum is cyclic for the set of fields $\{A(\,\cdot\,, v) : v \in (a, b)\}$, where $(a, b)$ is any non-empty open interval in $\mathbb{R}$. As a consequence, the uniqueness of the vacuum for the original theory is equivalent to the uniqueness of the vacuum for the fields $A$, hence to the cluster property for their Wightman functions, which is also a certain cluster property for the Wightman functions of $\phi$. We note that the Borchers-Buchholz Lemma also provides the following characterization of the uniqueness of the vacuum in terms of the Wightman functions.

Lemma 5.6. Assume that Conditions 1–5 of Sect. 4 hold (this includes the positivity condition). Then the condition of uniqueness of the vacuum is equivalent to
\[
\forall f, g \in \mathcal{B}, \qquad \lim_{T \to +\infty} \frac{1}{T} \int_0^T \left[ \langle \mathcal{W}, f^\star g_{\{\exp(t M_{d(d-1)})\}} \rangle - \langle \mathcal{W}, f^\star \rangle \langle \mathcal{W}, g \rangle \right] dt = 0. \tag{5.29}
\]
Proof. If all the Conditions 1–5 of Sect. 4 hold, (5.29) is equivalent to
\[
\forall f, g \in \mathcal{B}, \qquad \lim_{T \to +\infty} \frac{1}{T} \int_0^T \left( \Phi(f)\Omega,\ \exp(it \hat M_{d(d-1)})\, \Phi(g)\Omega \right) dt = \left( \Phi(f)\Omega, \Omega \right) \left( \Omega, \Phi(g)\Omega \right). \tag{5.30}
\]
By the mean ergodic theorem, the limit in the lhs is equal to $(\Phi(f)\Omega, E\, \Phi(g)\Omega)$, where $E$ is the projector on the subspace of all vectors invariant under $\exp it\hat M_{d(d-1)}$. By Lemma 5.4 this is the subspace of all vectors invariant under $U(G_0)$. Therefore (by the density of $\{\Phi(f)\Omega : f \in \mathcal{B}\}$) the condition (5.29) is equivalent to the fact that the subspace of all vectors invariant under $U(G_0)$ is $\mathbb{C}\Omega$.
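The role of the mean ergodic theorem in this proof can be illustrated in a finite-dimensional toy model. The sketch below is only an analogy, not the operator-theoretic statement used above: it assumes a diagonal self-adjoint generator with a hypothetical finite spectrum and evaluates the time average of $e^{it\lambda}$ in closed form for each eigenvalue $\lambda$.

```python
import cmath

def time_average(eigenvalue, T):
    """(1/T) * integral over [0, T] of exp(i t eigenvalue) dt, in closed form."""
    if eigenvalue == 0:
        return 1.0 + 0.0j
    return (cmath.exp(1j * eigenvalue * T) - 1) / (1j * eigenvalue * T)

# Hypothetical eigenvalues of a diagonal generator; 0.0 labels the invariant vectors.
spectrum = [0.0, 0.5, -1.3, 2.0]
averages = {ev: time_average(ev, 1.0e6) for ev in spectrum}
```

For large $T$ the average tends to 1 on the eigenvalue 0 and to 0 on every other eigenvalue (its modulus is at most $2/(|\lambda| T)$), so the averaged evolution converges to the projector onto the invariant vectors, which is the mechanism invoked for $E$.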
6. Two-Point Functions

We now consider the two-point function of a scalar field theory on $\tilde X_d$ which satisfies the general requirements described in the previous sections:
\[
\mathcal{W}_2(x_1, x_2) = \mathcal{W}(x_1, x_2) = \left( \Omega, \phi(x_1) \phi(x_2)\, \Omega \right). \tag{6.1}
\]
By the tempered spectral condition, this is the boundary value on $\tilde X_d^2$, in the sense of tempered distributions, of a function $W_+(z_1, z_2)$, holomorphic and of tempered growth in $\tilde T_{1-} \times \tilde T_{1+}$. The permuted Wightman function $\mathcal{W}(x_2, x_1)$ is the boundary value of $W_-(z_1, z_2) = W_+(z_2, z_1)$, holomorphic in $\tilde T_{1+} \times \tilde T_{1-}$. The two permuted Wightman functions are invariant under $\tilde G_0$ and coincide in the real open subset of $\tilde X_d^2$ in which $x_1$ and $x_2$ are space-like separated. Therefore $W_\pm$ are branches of a single holomorphic function $W(z_1, z_2)$. Extensions of standard arguments (see Appendix A) show that there exists a function $w$, holomorphic on the universal covering $\tilde\Delta$ of $\Delta = \mathbb{C} \setminus [-1, 1]$, such that $W_+(z_1, z_2) = w((z_1, z_2))$ when $z_1$ and $z_2$ belong to $\tilde T_{1-}$ and $\tilde T_{1+}$, respectively. Theories on the covering of Anti-de Sitter spacetime are thus in this respect closely similar to Minkowski [SW] and de Sitter [BM] field theories.

In the case of a field theory on $\tilde X_d$, the commutator function can be written (non-uniquely) as the difference of a retarded and an advanced “function” with supports in $\{ (x_1, x_2) \in \tilde X_d^2 : x_j = \tilde\chi(s_j, \vec x_j),\ \pm(s_1 - s_2) \ge 0 \}$, respectively (see (2.2)).

In the case of a field theory on the pure AdS spacetime $X_d$, $w$ is actually a function holomorphic on $\Delta$, and, in particular, $W_+(x_1, x_2)$ and $W_-(x_1, x_2)$ coincide not only on $R_2 = \{ x \in X_d^2 : (x_1, x_2) > 1 \}$, but also on the “exotic region” $R'_2 = \{ x \in X_d^2 : (x_1, x_2) < -1 \}$. In this case the support of the commutator function $\mathcal{W}(x_1, x_2) - \mathcal{W}(x_2, x_1)$ is contained in $X_d^2 \setminus (R_2 \cup R'_2)$. This is the union of the two closed sets $\{ x \in X_d^2 : |(x_1, x_2)| \le 1,\ (x_2 \wedge x_1, e_0 \wedge e_d) \ge 0 \}$ and $\{ x \in X_d^2 : |(x_1, x_2)| \le 1,\ (x_2 \wedge x_1, e_0 \wedge e_d) \le 0 \}$, and the commutator function can be written as the difference of an advanced and retarded function with supports in these two sets (respectively). This splitting is, as usual, not unique.

Any two-point function with the above mentioned properties is the two-point function of a generalized free field on $\tilde X_d$ or $X_d$, which satisfies the positive definiteness condition if and only if, for any $C^\infty$ function $\varphi$ with compact support in $\tilde X_d$ (resp. $X_d$),
\[
\langle \mathcal{W}, \overline{\varphi} \otimes \varphi \rangle \ge 0. \tag{6.2}
\]
Wick powers of such a field are well-defined: their Wightman functions can be obtained by the standard formulae, using $W$, as boundary values of holomorphic functions in the relevant tuboids. They satisfy all the requirements of Sect. 4 with the possible exception of positive definiteness, which holds if and only if (6.2) holds. Note that two Wick powers of generalized free fields on $X_d$ share the property of commuting when their arguments are in $R_2 \cup R'_2$.

6.1. The simplest example: Klein-Gordon fields. Let us consider fields satisfying the Klein-Gordon equation on $\tilde X_d$:
\[
\Box_{AdS}\, \phi + m^2 \phi = 0. \tag{6.3}
\]
There is a preferred choice of solutions of that equation that display the simplest example of the previous analytic structure. The corresponding two-point functions are expressed in terms of generalized Legendre functions [BV] (d+1) Qλ (ζ )
=√
d
π
2d−1 2
−λ−d+1 d−3 (t 2 − 1) 2 dt ζ + it ζ 2 − 1
∞
(6.4)
1
by the following formula: Wλ+ d−1 (z1 , z2 ) = wλ+ d−1 (ζ ) 2 2 e−iπd d +1 (d+1) = d−1 hd+1 (λ)Qλ (ζ ), ζ = (z1 , z2 ), (6.5) 2 2 π where the parameter λ is related to the mass by the formula m2 = λ(λ + d − 1).
(6.6)
The normalization can be obtained by imposing the local Hadamard condition, that gives
\[
h_{d+1}(\lambda) = \frac{(2\lambda + d - 1)\, \Gamma(\lambda + d - 1)}{\Gamma(d)\, \Gamma(\lambda + 1)}. \tag{6.7}
\]
It can be checked directly that the functions $w_{\lambda + \frac{d-1}{2}}((z_1, z_2))$ defined by Eqs. (6.5), (6.4), (6.7) are holomorphic in $\Delta$ or $\tilde\Delta$ according to whether $\lambda$ is an integer or not. It is useful to display also an equivalent expression for the Wightman functions (6.5) in terms of the usual associate Legendre’s function of the second-kind² [B]:
\[
W_{\lambda + \frac{d-1}{2}}(z_1, z_2) = W_\nu(z_1, z_2) = w_\nu(\zeta) = \frac{e^{-i\pi \frac{d-2}{2}}}{(2\pi)^{\frac{d}{2}}} \left( \zeta^2 - 1 \right)^{-\frac{d-2}{4}} Q^{\frac{d-2}{2}}_{\nu - \frac{1}{2}}(\zeta), \tag{6.8}
\]
where we introduced the parameter $\nu = \lambda + \frac{d-1}{2}$ such that
\[
\nu^2 = \frac{(d-1)^2}{4} + m^2. \tag{6.9}
\]
Theories with $\nu > -1$ are acceptable in the sense that they satisfy all the axioms of Sect. 4 including the positive-definiteness property, which we shall prove below. There are in fact two regimes [BF]: i) for $\nu > 1$ the axioms uniquely select one field theory for each given mass; ii) for $|\nu| < 1$ there are two acceptable theories for each given mass.

² Note the slightly different notation with the generalized Legendre’s function defined in Eq. (6.4) with the upper index in parentheses. This is the way these Wightman functions were first written in [F] for the four dimensional case ($d = 4$). Their identification with second-kind Legendre functions is worth being emphasized again, in place of their less specific (although exact) introduction under the general label of hypergeometric functions, used in recent papers. In fact Legendre functions are basically linked to the geometry of the dS and AdS quadrics from both group-theoretical and complex analysis viewpoints [BV, BM, V].
The case ν = 1 is a limiting case. Equation (6.8) shows that the difference between the theories parametrized by opposite values of ν is in their large distance behavior. More precisely, in view of Eq. (3.3.1.4) of [B], we can write: − d−1 − d−1 d sin πν d 4 2 − ν + ν ζ − 1 P 1 2 (ζ ). w−ν (ζ ) = wν (ζ ) + d+1 − 2 −ν 2 2 2 (2π) (6.10) The last term in this relation is regular on the cut ζ ∈ [−1, 1] and therefore does not contribute to the commutator. By consequence the two theories represent the same algebra of local observables at short distances. But since the last term in the latter relation grows the faster the larger is |ν| (see [B] Eqs. (3.9.2)), the two theories drastically differ by their long range behaviors. Let us discuss for now the positive definiteness of the Wightman function (6.8). To this end, following the remarks in Sect. 5 we can consider the restriction 1 WM,v,v (z, z ) = wν ch (v − v ) − exp(v + v )(z − z )2 (6.11) 2 of the two-point function Wν to Mv × Mv defined for v, v real and z = x + iy, z = x + iy such that y2 > 0, y0 < 0, y 2 > 0 and y 0 > 0 (see Eq. (5.5)). The results of Sect. 5 imply that this restriction defines a local and (Poincar´e) covariant two-point function which satisfies the condition of positivity of the energy spectrum [SW]. Let us consider now the following Hankel-type transform of WM,v,v (z, z ) w.r.t. the brane coordinate v : 2 (6.12) H λ , v, (z − z ) = dv e(d−3)v θλ (v )WM,v,v (z, z ) with
√ d−1 1 λ e−v . θλ (v) = √ e− 2 v Jν 2
Inversion gives Wν z(v, z), z (v , z) =
∞
dλ d−3 − d−1 (v+v ) λ 4 e 2 2 0 2(2π) √ √ d−3 √ Jν λ e−v Jν λ e−v δ 2 K d−3 λδ , 1
(6.13)
d−2 2
2
(6.14)
where $\delta^2 = -(\mathbf{z} - \mathbf{z}')^2$ (see [B2], p. 64, formula (12), and [BBGMS] for more details). Equation (6.14) together with Lemma 4.1 show that the Wightman functions (6.8) are positive-definite.

6.2. Källén-Lehmann-type representations. We postpone to a further paper the proof of Källén-Lehmann-type representations which will be based on an appropriate Laplace-type transformation, in a spirit similar to that given in [BM] for the case of two-point functions on de Sitter spacetime. For two-point functions of general fields on $\tilde X_d$:
\[
W(z_1, z_2) = \int_{-\frac{d+1}{2}}^{\infty} \rho(\lambda)\, W_{\lambda + \frac{d-1}{2}}(z_1, z_2)\, d\lambda. \tag{6.15}
\]
For two-point functions of general fields on $X_d$:
\[
W(z_1, z_2) = \sum_{\ell > -\frac{d+1}{2}} \rho_\ell\, W_{\ell + \frac{d-1}{2}}(z_1, z_2). \tag{6.16}
\]
In these equations, $\rho(\lambda)$ (resp. $\{\rho_\ell\}$) represents a positive measure (resp. sequence), as a consequence of the positive-definiteness property.

7. Bisognano-Wichmann Analyticity and “AdS-Unruh Effect”

In this section we prove that our assumptions of locality and tempered spectral condition imply a KMS-type analyticity property of the Wightman functions in the complexified orbits of one-parameter Lorentz subgroups of $G_0$: the physical interpretation of this property can be called an “AdS-Unruh effect”, since the relevant hyperbolic-type orbits we are considering are trajectories of uniformly accelerated motions. In fact, this property is similar to the analyticity property in the complexified orbits of the Lorentz boosts proved by Bisognano and Wichmann [BW] in Minkowskian theory. However, in contrast with the method used in [BW], our method does not rely at all on the positivity properties of the theory but only makes use of an analytic completion procedure of geometrical type which has been presented in [BEM] for the case of de Sitter spacetime (noting there that it applied similarly to the Minkowskian case) and which can be adapted in a straightforward way to the AdS case as we show below. Throughout this section, the distinction between the cases of pure AdS and covering of AdS will be irrelevant because all the regions and corresponding analyticity properties considered will always take place in a single sheet of $\tilde X_d$; so we can speak here of AdS without caring about that distinction; in particular the notation $Z_{n\pm}$ will denote here as well the corresponding tuboids $\tilde Z_{n\pm}$ if the spacetime considered is $\tilde X_d$. To be specific, let us consider once and for all the Lorentz subgroup of $G_0$ whose orbits are parallel to the two-plane of coordinates $x^0, x^1$, with the notations of Eqs. (2.7), (2.8) (all the other Lorentz subgroups of $G_0$ acting on AdS being conjugate of the latter by the action of $G_0$ itself).
The following “wedge-shaped” region of AdS, which is invariant under that subgroup, is then distinguished:
\[
W_R(e_d) = \left\{ x \in X_d : x^1 > |x^0|,\ x^d > 0 \right\}. \tag{7.1}
\]
(Note that this region of AdS admits the base point $e_d$ as an exposed boundary point.) Considering n-point test-functions $f_n \in \mathcal{B}_n$ and Lorentz transformations $[\lambda] = [e^s]$, $\lambda \in \mathbb{R} \setminus \{0\}$, we abbreviate $f_{n\{[\lambda]\}}$ into $f_{n\lambda}$, i.e.
\[
f_{n\lambda}(x_1, \ldots, x_n) = f_n\left([\lambda^{-1}] x_1, \ldots, [\lambda^{-1}] x_n\right). \tag{7.2}
\]
Then the main result of this section is the following

Theorem 7.1. If a set of Wightman functions satisfies the locality and tempered spectral conditions, then for all $m, n \in \mathbb{N}$, $f_m \in \mathcal{D}(W_R(e_d)^m)$ and $g_n \in \mathcal{D}(W_R(e_d)^n)$, there is a function $G_{(f_m, g_n)}$ holomorphic on $\mathbb{C} \setminus \mathbb{R}_+$ with continuous boundary values $G^{\pm}_{(f_m, g_n)}$ on $\mathbb{R}_+ \setminus \{0\}$ from the upper and lower half-planes such that, for all $\lambda \in \mathbb{R}_+$,
\[
G^{+}_{(f_m, g_n)}(\lambda) = \langle \mathcal{W}_{m+n}, f_m \otimes g_{n\lambda} \rangle, \qquad G^{-}_{(f_m, g_n)}(\lambda) = \langle \mathcal{W}_{m+n}, g_{n\lambda} \otimes f_m \rangle. \tag{7.3}
\]
The thermal interpretation of this theorem could be presented with all the details of Theorem 3 of [BEM], but here we shall only dwell on the following remarks, specific to the geometry of AdS. Consider an AdS spacetime with radius R (i.e. with equation (x, x) = R 2 and base point R ed = (0, . . . , 0, R)) and assume that all the test-functions fm , gn considered in the statement of Theorem 7.1 have their supports contained respectively in sets of the form Vam , Van , where Va is a neighbourhood in Xd of a certain point x(a) in the wedge WR (ed ) which we take of the following form: x(a) = (0, Ra, 0 . . . , 0, R a 2 + 1),
(7.4)
a being a positive number. The Lorentzian orbits followed by the points in the supports of fm and gn are in a neighbourhood of the orbit of x(a) which is the branch of hyperbola with equations t
[e Ra ]x(a) = (Ra sh
t t , Ra ch , 0 . . . , 0, R a 2 + 1) . Ra Ra
(7.5)
In the latter, the normalization of the group parameter $t/(Ra)$ has been chosen such that $t$ is the proper time along the trajectory of $x(a)$ with respect to the metric of the AdS spacetime of radius $R$. Expressed in this parameter $t$ such that $\lambda = e^{t/(Ra)}$, the analyticity property stated in the theorem corresponds to the analyticity in a KMS-type domain, namely a $\beta$-periodic cut-plane generated by the strip $\{t;\ 0 < \operatorname{Im} t < \beta\}$, with the period $i\beta = 2i\pi aR$ (accompanied by the relevant conditions of KMS-type for the boundary values at $t$ real and $t - i\beta$ real in terms of the products of field-observables). Thus an observer living on the trajectory (7.5), and whose measurements are therefore supposed to be performed in terms of field observables whose support at $t = 0$ lies in the neighbourhood $V_a$ of $x(a)$, will perceive a thermal bath of particles at the temperature $T = \beta^{-1} = \frac{1}{2\pi Ra}$. In view of Lemma 2.1, the motion of such an observer is uniformly accelerated and its acceleration $\mathbf{a}$ is related to the “radius” $Ra$ of the hyperbolic trajectory by the following formula (noting that the parameter $c^2$ of Lemma 2.1 is $c^2 = a^2 + 1$):
$$\mathbf{a} = \frac{1}{R}\sqrt{\frac{c^2}{c^2-1}} = \frac{1}{R}\sqrt{1+\frac{1}{a^2}}.$$
Therefore one has:

Lemma 7.1. The temperature $T$ perceived by an AdS-Unruh observer living on the trajectory described by (7.5) is given in terms of his (or her) acceleration $\mathbf{a}$ by
$$T = \frac{1}{2\pi}\sqrt{\mathbf{a}^2 - \frac{1}{R^2}}. \tag{7.6}$$
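As a numerical sanity check of Lemma 7.1, the following minimal sketch compares the temperature read off from the period $\beta = 2\pi aR$ with the one computed from the acceleration via (7.6). The function names and the sample values of $R$ and $a$ are illustrative assumptions, not part of the original text.

```python
import math

def temperature_from_period(R, a):
    # T = 1/beta with beta = 2*pi*a*R, the period of the KMS-type strip
    return 1.0 / (2.0 * math.pi * a * R)

def acceleration(R, a):
    # Acceleration on the hyperbolic orbit (7.5): (1/R) * sqrt(1 + 1/a^2)
    return math.sqrt(1.0 + 1.0 / a**2) / R

def temperature_from_acceleration(R, acc):
    # Formula (7.6): T = (1/(2*pi)) * sqrt(acc^2 - 1/R^2)
    return math.sqrt(acc**2 - 1.0 / R**2) / (2.0 * math.pi)

# Hypothetical sample values of the radius R and of the parameter a
for R, a in [(1.0, 0.5), (2.0, 1.0), (10.0, 3.0)]:
    acc = acceleration(R, a)
    print(R, a, temperature_from_period(R, a), temperature_from_acceleration(R, acc))
```

Both columns agree up to rounding, and the acceleration always exceeds $1/R$, in line with the threshold appearing under the square root in (7.6).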
The proof of Theorem 7.1 contains two steps (as the similar Theorem 2 of [BEM]). In Subsect. 7.1 we introduce special sets of points in $X_d^n$ which play the role of the Jost points of the Minkowskian case, because all the complex Lorentz transformations $[\lambda]$ transport them into $Z_{n+} \cup Z_{n-}$. This is the starting point of a standard analytic completion procedure which generates the analyticity of the Wightman functions in domains obtained by the action of all complex Lorentz transformations on the tuboids $Z_{n+}$ (or $Z_{n-}$) as in Lemma 2 of [BEM]. Here also, there is (for each pair $(m, n)$ considered in the statement of Theorem 7.1) a quartet of Wightman functions, analytic respectively in the tuboids $Z_{n\pm}$ and in two other tuboids $Z'_{n\pm}$, which are involved in that procedure. This is fully explained in Subsect. 7.2.
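7.1. A special set of Jost points. Let $a \in X_d$ be given by
$$a = (0,\ \operatorname{sh} u,\ \vec 0,\ \operatorname{ch} u), \qquad u > 0. \tag{7.7}$$
With the notations of (2.7) and (2.8), and real $s \in (0, \pi)$,
$$[e^{is}]\,a = e^{isM_{10}}\,a = (i\operatorname{sh} u \sin s,\ \operatorname{sh} u \cos s,\ \vec 0,\ \operatorname{ch} u). \tag{7.8}$$
This can be reexpressed as $\exp(itM_{0d})\,b$ with $b = (0,\ \operatorname{sh} u \cos s,\ \vec 0,\ r)$, i.e.
$$[e^{is}]\,a = (ir\operatorname{sh} t,\ \operatorname{sh} u \cos s,\ \vec 0,\ r\operatorname{ch} t), \qquad \operatorname{th} t = \operatorname{th} u \sin s, \qquad r = \sqrt{(\operatorname{ch} u)^2 - (\operatorname{sh} u)^2 (\sin s)^2}. \tag{7.9}$$
Let $J_n^0$ be the set of points $(a_1, \ldots, a_n) \in X_d^n$ such that
$$a_j = (0,\ \operatorname{sh} u_j,\ \vec 0,\ \operatorname{ch} u_j), \quad j = 1, \ldots, n, \qquad 0 < u_1 < \cdots < u_n. \tag{7.10}$$
Let $(a_1, \ldots, a_n) \in X_d^n$ be of the form (7.10). Then for $0 < s < \pi$,
$$[e^{is}]\,a_j = \exp(it_j M_{0d})\,b_j, \qquad \operatorname{th} t_j = \operatorname{th} u_j \sin s, \quad j = 1, \ldots, n, \qquad 0 < t_1 < \cdots < t_n, \tag{7.11}$$
with $b_j \in X_d$. Since $t_j$ can be rewritten as $\theta_1 + \cdots + \theta_j$ with $\theta_k > 0$, the point $[e^{is}]a$ is in $Z_{n+}$. By the invariance of $Z_{n+}$ under $G_0$ (resp. $\tilde G_0$) it follows that $[\lambda]a \in Z_{n+}$ for all $\lambda \in \mathbf{C}_+$. Obviously $[\lambda]a \in Z_{n-}$ for all $\lambda \in \mathbf{C}_-$, since $Z_{n-} = Z_{n+}^*$.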
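The identity (7.9) can be checked numerically. The sketch below works in the case $d = 2$ (so that $\vec 0$ is absent) and assumes explicit matrix conventions for $[e^{is}] = \exp(isM_{10})$ and $\exp(itM_{0d})$, chosen so as to reproduce (7.8) and (7.9); these conventions and all function names are assumptions made for illustration.

```python
import math

def rot_01(s, x):
    # [e^{is}]: boost of imaginary rapidity is in the (0,1) plane (assumed convention)
    x0, x1, xd = x
    c, sh = math.cos(s), 1j * math.sin(s)      # ch(is), sh(is)
    return (c * x0 + sh * x1, sh * x0 + c * x1, xd)

def rot_0d(t, x):
    # exp(it M_0d): rotation of imaginary angle it in the (0,d) plane (assumed convention)
    x0, x1, xd = x
    c, s = math.cosh(t), 1j * math.sinh(t)     # cos(it), sin(it)
    return (c * x0 + s * xd, x1, -s * x0 + c * xd)

def jost_decomposition(u, s):
    # t, r and the real point b of (7.9)
    t = math.atanh(math.tanh(u) * math.sin(s))
    r = math.sqrt(math.cosh(u) ** 2 - (math.sinh(u) * math.sin(s)) ** 2)
    b = (0.0, math.sinh(u) * math.cos(s), r)
    return t, r, b

def max_gap(u, s):
    # Largest componentwise difference between the two sides of (7.9)
    a = (0.0, math.sinh(u), math.cosh(u))
    t, r, b = jost_decomposition(u, s)
    lhs, rhs = rot_01(s, a), rot_0d(t, b)
    return max(abs(p - q) for p, q in zip(lhs, rhs))

print(max_gap(0.7, 1.1))
# The t_j of (7.11) increase with u_j at fixed s
print([jost_decomposition(u, 1.0)[0] for u in (0.2, 0.5, 0.9)])
```

The ordering $0 < t_1 < \cdots < t_n$ in (7.11) follows from the monotonicity of $u \mapsto \operatorname{th} u$ at fixed $s \in (0, \pi)$, which the second printed list illustrates.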
7.2. Derivation of Bisognano-Wichmann analyticity. This subsection closely follows the treatment given in [BEM]. We refer the reader to that reference for more details. Let
$$W_R = \{x \in E_{d+1} : x^1 > |x^0|\}. \tag{7.12}$$
If $x \in W_R \cap X_d$ and $x^d > 0$, we denote (consistently with the notation $W_R(e_d)$ introduced in (7.1))
$$W_R(x) = \{y \in X_d : y - x \in W_R,\ y^d > 0\}. \tag{7.13}$$
Let
$$K_{n+} = \{(x_1, \ldots, x_n) \in X_d^n : x_1 \in W_R(e_d),\ x_j \in W_R(x_{j-1})\ \forall j = 2, \ldots, n\}. \tag{7.14}$$
$K_{n-}$ is defined by reflecting $K_{n+}$ across the hyperplane $\{x \in E_{d+1} : x^1 = 0\}$ or equivalently $K_{n-} = [-1]K_{n+}$ with $[-1] = [e^{i\pi}]$ as in (2.8). If $x \in K_{n+}$, then $x_j$ and $x_k$ are space-like separated whenever $1 \le j < k \le n$. Note that $J_n^0 \subset K_{n+}$. If $(z_1, \ldots, z_n)$ belongs to $X_d^n$ or $X_d^{(c)n}$, we denote $z_\leftarrow = (z_n, \ldots, z_1)$. Besides $Z_{n+}$ and $Z_{n-}$ we shall also use the other two tuboids
$$Z'_{n+} = \{z \in X_d^{(c)n} : z_\leftarrow \in Z_{n-}\}, \qquad Z'_{n-} = \{z \in X_d^{(c)n} : z_\leftarrow \in Z_{n+}\} = Z'^{\,*}_{n+}. \tag{7.15}$$
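The definitions (7.12)-(7.14) translate directly into membership tests. The following sketch, written for $d = 2$ with the quadratic form $(x, y) = x^0y^0 - x^1y^1 + x^dy^d$, checks on a sample configuration of the form (7.10) that the points lie in $K_{n+}$ and are pairwise space-like separated. The function names and the sample values of $u_j$ are illustrative assumptions.

```python
import math

def form(x, y):
    # (x, y) = x0*y0 - x1*y1 + xd*yd for d = 2
    return x[0] * y[0] - x[1] * y[1] + x[2] * y[2]

def in_W_R(v):
    # (7.12): v^1 > |v^0|
    return v[1] > abs(v[0])

def in_W_R_of(x, y):
    # (7.13): y - x in W_R and y^d > 0
    diff = tuple(p - q for p, q in zip(y, x))
    return in_W_R(diff) and y[2] > 0

def in_K_plus(points, e_d=(0.0, 0.0, 1.0)):
    # (7.14): x_1 in W_R(e_d) and x_j in W_R(x_{j-1}) for j >= 2
    prev = e_d
    for x in points:
        if not in_W_R_of(prev, x):
            return False
        prev = x
    return True

def jost_points(us):
    # Points a_j = (0, sh u_j, ch u_j) of (7.10)
    return [(0.0, math.sinh(u), math.cosh(u)) for u in us]

def spacelike(x, y):
    # Space-like separation: (x - y, x - y) < 0
    d = tuple(p - q for p, q in zip(x, y))
    return form(d, d) < 0

pts = jost_points([0.3, 0.8, 1.5])   # hypothetical sample with 0 < u_1 < u_2 < u_3
print(in_K_plus(pts))
print(all(spacelike(pts[j], pts[k]) for j in range(3) for k in range(j + 1, 3)))
```

Reversing the order of the points breaks the chain condition in (7.14), which is why the ordering $0 < u_1 < \cdots < u_n$ matters.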
We fix $m \ge 0$, $n \ge 1$ and a function $f_m \in D(X_d^m)$ with support in $W_R(e_d)^m$. There exist two functions $z \mapsto F_+(f_m; z)$ and $z \mapsto F_-(f_m; z)$, respectively holomorphic in $Z_{n+}$ and $Z_{n-} = Z_{n+}^*$, having boundary values $F_\pm^{(b)}(f_m; x)$ on $X_d^n$ (resp. $\tilde X_d^n$) in the sense of distributions, such that for every $g_n \in B_n$ with compact support,
$$\int_{X_d^n} F_+^{(b)}(f_m; x_1, \ldots, x_n)\, g_n(x_1, \ldots, x_n)\, d\sigma(x_1) \ldots d\sigma(x_n) = \int_{X_d^{m+n}} W_{m+n}(w_1, \ldots, w_m, x_1, \ldots, x_n)\, f_m(w_1, \ldots, w_m)\, g_n(x_1, \ldots, x_n)\, d\sigma(w_1) \ldots d\sigma(w_m)\, d\sigma(x_1) \ldots d\sigma(x_n), \tag{7.16}$$
$$\int_{X_d^n} F_-^{(b)}(f_m; x_1, \ldots, x_n)\, g_n(x_1, \ldots, x_n)\, d\sigma(x_1) \ldots d\sigma(x_n) = \int_{X_d^{m+n}} W_{m+n}(x_n, \ldots, x_1, w_1, \ldots, w_m)\, f_m(w_1, \ldots, w_m)\, g_n(x_1, \ldots, x_n)\, d\sigma(w_1) \ldots d\sigma(w_m)\, d\sigma(x_1) \ldots d\sigma(x_n), \tag{7.17}$$
(with $\tilde X_d$ instead of $X_d$ when needed). The functions $z \mapsto F'_+(f_m; z) = F_-(f_m; z_\leftarrow)$ and $z \mapsto F'_-(f_m; z) = F_+(f_m; z_\leftarrow)$ are respectively holomorphic in $Z'_{n+}$ and $Z'_{n-}$.
Their boundary values at real points, in the sense of distributions, are $F'^{(b)}_+(f_m; x) = F_-^{(b)}(f_m; x_\leftarrow)$ and $F'^{(b)}_-(f_m; x) = F_+^{(b)}(f_m; x_\leftarrow)$.

In the sense of distributions, $F_+^{(b)}(f_m; x)$ and $F_-^{(b)}(f_m; x)$ coincide for $x \in K_{n-}$ by virtue of local commutativity. Hence, by the edge-of-the-wedge theorem, $z \mapsto F_+(f_m; z)$ and $z \mapsto F_-(f_m; z)$ have a common holomorphic extension $z \mapsto F(f_m; z)$ in $\Delta_n = Z_{n+} \cup Z_{n-} \cup V_n$, where $V_n$ is a complex neighborhood of $K_{n-}$ such that $[\lambda]V_n = V_n$ for all $\lambda > 0$. Let $a \in J_n^0$. As we have noted, $[e^{is}]a \in Z_{n+}$ for all $s \in (0, \pi)$. Moreover $[e^{i\pi}]a$ is in $V_n$. Hence there is an $\varepsilon > 0$ (depending on $a$) such that $[e^{is}]a$ is in $\Delta_n$ for all $s \in (0, \pi + \varepsilon)$. This also means that, denoting $a' = [e^{i\varepsilon/2}]a \in Z_{n+}$, all points of the compact set $\{z = [e^{is}]a' : 0 \le s \le \pi\}$ belong to $\Delta_n$. Since the latter is open, there exists a $\rho_a > 0$ such that the open ball $B_{a'}(\rho_a) = \{z \in X_d^{(c)n} : \|z - a'\| < \rho_a\}$ is contained in $Z_{n+}$, and $[e^{is}]B_{a'}(\rho_a) \subset \Delta_n$ for all $s \in [0, \pi]$, hence $[\lambda]B_{a'}(\rho_a) \subset \Delta_n$ for all $\lambda \in \overline{\mathbf{C}_+} \setminus \{0\}$. The function $(z, \lambda) \mapsto G(f_m; z, \lambda) = F(f_m; [\lambda]z)$ is holomorphic in
$$\{(z, \rho e^{i\theta}) \in Z_{n+} \times \mathbf{C} : \rho > 0,\ |\sin\theta| < \alpha(z)\}, \tag{7.18}$$
where $\alpha(z) > 0$ for all $z \in Z_{n+}$. Moreover $G$ is also holomorphic in $B_{a'}(\rho_a) \times \mathbf{C}_+$. By Lemma 3 of [BEM] (Appendix A), $G$ extends to a function holomorphic in $Z_{n+} \times \mathbf{C}_+$. However we wish to prove that the boundary values of $G$ as $z$ tends to the reals (in the sense of distributions) are also analytic in $\lambda$ in $\mathbf{C}_+$. As a consequence of the temperedness assumption, for real $\lambda$, $G(f_m; z, \lambda)$ defines a holomorphic function of tempered behavior in $Z_{n+}$ with values in the functions of $\lambda$ bounded by some power of $(|\lambda| + |\lambda|^{-1})$. If $z \in B_{a'}(\rho_a)$, $\lambda \mapsto G(f_m; z, \lambda)$ extends to a function analytic in $\mathbf{C}_+$ and also bounded there by some power of $(|\lambda| + |\lambda|^{-1})$. Let $\varphi(t) = \int \tilde\varphi(p)\, e^{-itp}\, dp$, where $\tilde\varphi$ is a $C^\infty$ function with compact support contained in $(-\infty, 0)$. For $z \in B_{a'}(\rho_a)$, and sufficiently large $N \in \mathbf{N}$,
$$\int_{\mathbf{R}} \varphi(\lambda)\, \lambda^N\, G(f_m; z, \lambda)\, d\lambda = 0. \tag{7.19}$$
The l.h.s. of this equation is holomorphic and of tempered behavior in $Z_{n+}$, hence it vanishes together with its boundary values. Therefore the boundary value $G^{(b)}(f_m; x, \lambda)$ extends to a function of $\lambda$ holomorphic in $\mathbf{C}_+$. We note that $G^{(b)}(f_m; x, \lambda) = F_+^{(b)}(f_m; [\lambda]x)$ for $\lambda > 0$ and $G^{(b)}(f_m; x, \lambda) = F_-^{(b)}(f_m; [\lambda]x)$ for $\lambda < 0$.

In the same way $F'_+(f_m; z)$ and $F'_-(f_m; z)$ have a common extension $F'(f_m; z)$ holomorphic in $\Delta'_n = Z'_{n+} \cup Z'_{n-} \cup V'_n$ with $V'_n = \{x : x_\leftarrow \in V_n\}$. Note that the domain $\Delta'_n$ is equal to $\{z : z_\leftarrow^* \in \Delta_n\}$. Hence $G'(f_m; z, \lambda) = F'(f_m; [\lambda]z)$ extends to a function holomorphic in $Z'_{n+} \times \mathbf{C}_-$, and its boundary value $G'^{(b)}(f_m; x, \lambda)$ has properties that mirror those of $G^{(b)}(f_m; x, \lambda)$. Finally suppose that $x \in W_R(e_d)^n$. Then, by local commutativity, for $\lambda < 0$, $F_+^{(b)}(f_m; [\lambda]x)$ coincides with $F_-^{(b)}(f_m; [\lambda]x_\leftarrow) = F'^{(b)}_+(f_m; [\lambda]x)$. Hence if $x \in W_R(e_d)^n$, $\lambda \mapsto G^{(b)}(f_m; x, \lambda)$ and $\lambda \mapsto G'^{(b)}(f_m; x, \lambda)$ have a common analytic continuation in $\mathbf{C} \setminus \mathbf{R}_+$. This ends the proof of Theorem 7.1.

Remark 7.1. The tuboids $Z_{n\pm}$ and $Z'_{n\pm}$ can be replaced in the above discussion by $T_{n\pm}$ and $T'_{n\pm}$ (similarly defined).
7.3. “CTP”. In the proof of the preceding subsection, if we assume that $f_m$ has support in the left, instead of the right wedge, we find that $(z, \lambda) \mapsto G(f_m; z, \lambda)$ extends to a function holomorphic in $T_{n+} \times \mathbf{C}_-$ instead of $T_{n+} \times \mathbf{C}_+$. Thus in case $m = 0$ it is easy to obtain the following special case of the Glaser-Streater theorem:

Lemma 7.2. If a set of Wightman functions satisfies the locality and tempered spectral conditions, then for all integer $n \ge 1$, there exists a function $(z, \lambda) \mapsto G_n(z, \lambda)$, holomorphic and of tempered growth in $T_{n+} \times (\mathbf{C} \setminus \{0\})$, such that
$$G_n(z_1, \ldots, z_n, \lambda) = W_n([\lambda]z_1, \ldots, [\lambda]z_n) \ \text{for } \lambda > 0, \qquad G_n(z_1, \ldots, z_n, \lambda) = W_n([\lambda]z_n, \ldots, [\lambda]z_1) \ \text{for } \lambda < 0. \tag{7.20}$$
If we now assume the covariance condition (4.5) holds, then $G_n$ is actually independent of $\lambda$ and we obtain

Lemma 7.3. If a set of Wightman functions satisfies the locality, covariance, and tempered spectral conditions, then for all integer $n \ge 1$, and all $z \in T_{n+}$,
$$W_n(z_1, \ldots, z_n) = W_n([-1]z_n, \ldots, [-1]z_1). \tag{7.21}$$
If positivity holds this implies, as usual, the existence of an anti-unitary operator $\theta$ such that $\theta\,\phi(x)\,\theta^{-1} = \phi([-1]x)^*$. In [BFS], the existence of this operator is a nontrivial step in the derivation of commutativity for opposite wedges. Note that the above proof of Lemma 7.3, also valid in Minkowski space, does not require the Bargmann-Hall-Wightman Lemma.

8. Wick Rotations and Osterwalder-Schrader Reconstruction on the Covering of AdS

In this section, we consider QFT’s on the covering of AdS which satisfy the tempered spectral condition and we wish to show that such theories can be formulated equivalently in terms of theories on the “Euclidean” AdS spacetime $X_d^{(E)}$ in a way which is
reminiscent of the Wick rotation in complex Minkowski space. In fact, the simple geometrical fact which allows the latter to hold is the property of the tuboids $\tilde Z_{n+}$ described in Lemma 3.11 and obtained by making use of the complexification in $\tau$ of the map
$$(\tau, \vec x) \mapsto \tilde\chi(\tau, \vec x) = \exp(\tau M_{0d})\,\bigl(0,\ \vec x,\ \sqrt{\vec x^{\,2} + 1}\bigr) \tag{8.1}$$
of $\mathbf{R} \times \mathbf{R}^{d-1}$ into $\tilde X_d$. As a matter of fact, making $\tau$ and $\vec x$ complex yields charts for a certain complexification of $\tilde X_d$, denoted $[\tilde X_d]^{(c)}$: it is defined as the image of the extension of $\tilde\chi$ to the set $\{\tau \in \mathbf{C}\} \times \{\vec z \in \mathbf{C}^{d-1} : \vec z^{\,2} + 1 \notin \mathbf{R}_-\}$. (Of course this set $[\tilde X_d]^{(c)}$ is the covering of a complex region which is not the full space $X_d^{(c)}$, since $\tilde X_d^{(c)} \equiv X_d^{(c)}$.) In the sequel we omit the symbol $\tilde\chi$ and identify a point $(\tau, \vec x)$ with its image $\tilde\chi(\tau, \vec x)$. According to Lemma 3.11, all points $z = ((\tau_1, \vec x_1), \ldots, (\tau_n, \vec x_n)) \in [\tilde X_d]^{(c)n}$ such that $\vec x_j$ is real and $0 < \operatorname{Im}\tau_1 < \cdots < \operatorname{Im}\tau_n$ are in $\tilde Z_{n+}$. Therefore (using the global invariance under $\tilde G_0$)
$$W_n((\tau_1, \vec x_1), \ldots, (\tau_n, \vec x_n)) \tag{8.2}$$
extends to a function of $(\tau_1, \ldots, \tau_n)$, holomorphic in the tube $\{(\tau_1, \ldots, \tau_n) : \operatorname{Im}\tau_1 < \ldots < \operatorname{Im}\tau_n\}$, of tempered growth at infinity and at the boundaries, with values in the tempered distributions (in fact the polynomially bounded $C^\infty$ functions) in $\vec x_1, \ldots, \vec x_n$. (This function depends only on the differences of the $\tau_j$ variables, but this does not, of course, hold for the $\vec x_j$.) Together with the other permuted Wightman functions, this defines a function of $(\tau_1, \ldots, \tau_n)$, holomorphic in $\{(\tau_1, \ldots, \tau_n) : \operatorname{Im}(\tau_j - \tau_k) \neq 0\ \forall j \neq k\}$. We denote
$$S_n(s_1, \vec x_1, \ldots, s_n, \vec x_n) = W_n((is_{\pi(1)}, \vec x_{\pi(1)}), \ldots, (is_{\pi(n)}, \vec x_{\pi(n)})) \tag{8.3}$$
for real $s_1, \ldots, s_n$ such that $s_{\pi(1)} < \cdots < s_{\pi(n)}$, for all permutations $\pi$ of $(1, \ldots, n)$. Standard analytic completions, using the edge-of-the-wedge theorem and Bremermann’s Continuity Theorem, show that $S_n$ is actually analytic at all non-coinciding points of $\mathbf{R}^{nd}$. Geometrically, in view of the representation (2.3) of $X_d^{(E)}$, this means that the functions $S_n(z_1, \ldots, z_n)$ appear as $n$-point Schwinger functions defined at all noncoinciding points of the “Euclidean” AdS spacetime $X_d^{(E)}$, namely
$$\{(z_1, \ldots, z_n) \in X_d^{(E)n} ;\ z_i \neq z_j \text{ for all } i, j,\ i \neq j\}. \tag{8.4}$$
By construction, $S_n(s_1, \vec x_1, \ldots, s_n, \vec x_n) = S_n(s_{\pi(1)}, \vec x_{\pi(1)}, \ldots, s_{\pi(n)}, \vec x_{\pi(n)})$ for every permutation $\pi$. In case the positive definiteness condition holds, the sequence $\{S_n\}$ has the Osterwalder-Schrader positivity property: if $f_0 \in \mathbf{C}$ and, for $1 \le n \le N$, $f_n \in \mathcal{S}(\mathbf{R}^{nd})$ has its support in $\{(s_1, \vec x_1, \ldots, s_n, \vec x_n) : 0 < s_1 < \cdots < s_n\}$, then, with the convention $S_0 = 1$,
$$\sum_{m,n=0}^{N} \int \overline{f_m(s'_1, \vec x'_1, \ldots, s'_m, \vec x'_m)}\, f_n(s_1, \vec x_1, \ldots, s_n, \vec x_n)\, S_{m+n}(-s'_m, \vec x'_m, \ldots, -s'_1, \vec x'_1, s_1, \vec x_1, \ldots, s_n, \vec x_n)\, ds'_1\, d\vec x'_1 \ldots ds_n\, d\vec x_n \ \ge\ 0. \tag{8.5}$$
This follows from the existence of the vector-valued holomorphic functions Gn (see Sect. 4). Note that the analytic completion mentioned above applies to these vector-valued functions.
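The chart (8.1) and its Wick rotation $\tau = is$ can be illustrated numerically. The sketch below treats $d = 2$ and assumes that $\exp(\tau M_{0d})$ acts as a rotation by $\tau$ in the $(0, d)$ plane, a convention consistent with (7.9); the function names and sample values are illustrative assumptions.

```python
import cmath

def chi(tau, x):
    # Chart (8.1) for d = 2: exp(tau*M_0d) applied to (0, x, sqrt(x^2 + 1)),
    # with exp(tau*M_0d) taken as a rotation by tau in the (0,d) plane (assumed convention)
    r = cmath.sqrt(x * x + 1)
    return (r * cmath.sin(tau), x, r * cmath.cos(tau))

def quadric(z):
    # (z, z) = z0^2 - z1^2 + zd^2, without complex conjugation
    return z[0] ** 2 - z[1] ** 2 + z[2] ** 2

# A generic complex point of the chart stays on the complexified quadric (z, z) = 1
z = chi(0.3 + 0.7j, 1.2)
print(abs(quadric(z) - 1))

# A "Euclidean" point: tau = i*s with s real gives z^0 purely imaginary, z^d real
w = chi(0.5j, 0.4)
print(w)
```

For $\tau = is$ the point has the form $(iy^0, \vec x, x^d)$ with real $y^0, \vec x, x^d$ satisfying $(x^d)^2 - (y^0)^2 - \vec x^{\,2} = 1$, a sheet of a two-sheeted hyperboloid; this is the sense in which the Schwinger functions live on a “Euclidean” version of AdS.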
Conversely, let $\{S_n\}$ be a sequence of functions defined at all noncoinciding points of $X_d^{(E)}$, namely on the images of the sets $\{((s_1, \vec x_1), \ldots, (s_n, \vec x_n)) \in \mathbf{R}^{nd} : s_j \neq s_k \text{ or } \vec x_j \neq \vec x_k\ \forall j \neq k\}$, symmetric, invariant under a common translation of the variables $s_j$, and satisfying the Osterwalder-Schrader positivity property (8.5) when the supports of the $\{f_n\}$ are as above. Then it is possible, by the same method as in the “flat” case ([OS1, OS2, G2]), to construct a Hilbert space $\mathcal{H}$, $\Omega \in \mathcal{H}$, and a sequence of functions $\{G_n\}_{n \in \mathbf{N}}$ with values in $\mathcal{H}$ on $\{((\tau_1, \vec x_1), \ldots, (\tau_n, \vec x_n)) : 0 < \operatorname{Im}\tau_1 < \cdots < \operatorname{Im}\tau_n,\ \vec x_j \in \mathbf{R}^{d-1}\}$, holomorphic in the $\tau_j$ and $C^\infty$ in the $\vec x_j$, such that
$$S_{m+n}(-s'_m, \vec x'_m, \ldots, -s'_1, \vec x'_1, s_1, \vec x_1, \ldots, s_n, \vec x_n) = \bigl\langle G_m((is'_1, \vec x'_1), \ldots, (is'_m, \vec x'_m)),\ G_n((is_1, \vec x_1), \ldots, (is_n, \vec x_n)) \bigr\rangle \tag{8.6}$$
whenever $0 < s'_1 < \cdots < s'_m$ and $0 < s_1 < \cdots < s_n$. The temperedness of the $G_n$ at the real points, hence the existence of boundary values, can be obtained if, as we shall assume, the sequence $\{S_n\}$ satisfies suitable growth conditions, as in [OS1, OS2, G2]. Note that the boundary values of the scalar products of the $G_n$, namely the distributions $W_n$ thus obtained, are automatically defined on the covering space $\tilde X_d^n$ of $X_d^n$, since the analytic continuation in the complex variables $\tau_j \in \mathbf{C}$ corresponds to travelling in the covering space $[\tilde X_d]^{(c)n}$ (no periodicity condition in the variables $\tau_j$ being produced in general).

If the functions $S_n$ are invariant under the transformations of $\tilde G_0^{(c)}$ which preserve $\{z \in \tilde X_d^{(c)n} : z^0 \in i\mathbf{R}^n,\ \vec z \in \mathbf{R}^{n(d-1)}\}$, then differential operators representing $\mathcal{G}$ can be used as in [OS1] to show that the $W_n$ are invariant under $\tilde G_0$. Finally the analyticity of the $W_n$ shows that the operator $\hat M_{0d}$ representing $M_{0d}$ in $\mathcal{H}$ has positive spectrum and so does the representative $\hat M$ of any $M \in \mathcal{C}_1$ by invariance under $\tilde G_0$.

9. Final Remarks

First of all, we wish to emphasize the peculiar thermal aspects of “generic” field theories on AdS or its covering, which are strongly related to the geometry of the AdS quadric, namely to the existence of uniformly accelerated motions on three types of trajectories: while the elliptic and parabolic observers perceive a world at zero temperature (satisfying a condition of energy-positivity) in spite of the fact that the acceleration can take all values between zero and $1/R$, the hyperbolic observers perceive an Unruh-type temperature effect growing from zero to infinity when the acceleration grows from $1/R$ to infinity.

Our subsequent remarks concern some important peculiarities of QFT’s on the pure AdS spacetime with respect to those on its covering, which appeared in the study of general two-point functions (see Sect. 6), namely:
i) The vanishing of the commutator vacuum expectation value in the region of space-like separation $(x_1 - x_2)^2 < 0$ implies its vanishing in the region of time-like separation $(x_1 + x_2)^2 < 0$, which we called the “region of exotic causal separation”,
ii) the periodicity condition on time-like geodesics implies that the two-point function is a holomorphic function of $(z_1, z_2)$ in the domain $\mathbf{C} \setminus [-1, +1]$ itself instead of its covering. In the free-field case, this selects the Legendre functions $Q_\lambda^{(d+1)}((z_1, z_2))$ for which $\lambda = \ell$ integer, which results in a quantization of the mass parameter $m^2 = \ell(\ell + d - 1)$.
More generally all the fields on the pure AdS spacetime have two-point functions which are (positive) superpositions of the free-field functions $Q_\ell^{(d+1)}((z_1, z_2))$ with integer indices $\ell$. These peculiarities suggest respectively the following questions, which may be linked together and deserve to be studied in further works.
a) Does local commutativity, formulated as the vanishing of the commutator in the region of spacelike separation, imply its vanishing in the region of exotic causal separation (or “exotic causality”)?
b) Do the axioms of Sect. 4 applied to the pure AdS spacetime exclude the existence of non-trivial interacting QFT?
It has in fact been noted in [BFS] that the region of spacelike separation does not really deserve that name in the pure AdS case, since pairs of points in that region can also be connected by classes of non-geodesic timelike paths; as it is stated, the condition of local commutativity therefore appears as a strong constraint which would force the interactions to respect the time-periodicity of the geometry. One can then also say that the condition of exotic locality would represent a constraint of similar nature acting at an odd number of half-periods of time, and that is why one could possibly expect it to be a consequence of local commutativity. We know that, besides generalized free fields, their Wick polynomials are allowed to exist, since their n-point functions, which are combinations of products of two-point functions, do satisfy the required (periodic) analyticity properties in the relevant tubes $T_{n\pm}$ of the pure AdS spacetime.
However, if we think of nontrivial interacting fields in terms of perturbation theory and consider Feynman-type convolution products of free two-point functions involving not only products but integrals on the AdS spacetime, one can show that the constraints of periodic analyticity cannot be maintained in general: in particular the action of retarded propagators can be seen to propagate the interaction in the covering tubes T˜n± . So one could formulate as a conjecture that there might be no other QFT’s on the pure AdS spacetime than the ones previously mentioned, which would then also entail (as a byproduct) a positive answer to our question a). However, there is at the moment no genuine proof relying on the axioms of Sect. 4 that such a conjecture is valid. We end these remarks with a brief comparison of the formalism of this paper with that proposed by [BFS]. A difficulty is of course the usual gap between a formulation based on fields and one based on local algebras. This difficulty is compounded in the AdS case by the lack of a translation group. In [BFS] the pure AdS, and not its covering space, is considered. From the principle of passivity of the vacuum, the authors succeed in deriving the positivity of the energy, the existence of a CTP operator (both in the same sense as here), and the commutativity of operators localized in opposite wedges. If we suppose that, in terms of fields, these results correspond to the tempered spectral, covariance, and positivity conditions, and a portion of local commutativity, then, by Remark 4.1 (Sect. 4), this implies full local commutativity, and thus all our axioms.
A. Appendix. More on Two-Point Functions

A.1. The case of $X_d$. We begin by considering a complex function $w_+$ holomorphic and of polynomial growth on $T_{1-} \times T_{1+}$. The function $w_-$ defined by $w_-(z_1, z_2) = w_+(z_2, z_1)$ is holomorphic in $T_{1+} \times T_{1-}$. We denote $w_\pm^{(b)}$ the boundary value of $w_\pm$ on $X_d$ in the sense of tempered distributions. We suppose that $w_+^{(b)}$ and $w_-^{(b)}$ coincide on
the real open subset $\mathcal{R}_2$ of $X_d^2$ defined by
$$\mathcal{R}_2 = \{x \in X_d^2 : (x_1 - x_2, x_1 - x_2) < 0\} = \{x \in X_d^2 : (x_1, x_2) > 1\}. \tag{A.1}$$
Since we are ultimately interested in the case when $w_\pm(z_1, z_2) = w_\pm(\Lambda z_1, \Lambda z_2)$ for all $\Lambda \in G_0$, we make the simplifying assumption that $w_\pm(x_1 + iy_1, x_2 + iy_2)$ has a boundary value which is $C^\infty$ in $x_1$ and of tempered growth in $z_2 = x_2 + iy_2$ when $y_1$ tends to 0 while $z_2 \in T_{1\pm}$. Actually the general case can be reduced to the simplified one by considering $\int_{G_0} \varphi(\Lambda)\, w_\pm(\Lambda z_1, \Lambda z_2)\, d\Lambda$ for suitable test-functions $\varphi$. The functions $f_\pm$ defined by $f_\pm(z) = w_\pm(e_d, z)$ are respectively holomorphic and of tempered growth in $T_{1\pm}$, and have boundary values $f_\pm^{(b)}$ on $X_d$ in the sense of tempered distributions. These boundary values coincide in the real open subset $\mathcal{R}$ of $X_d$ defined by
$$\mathcal{R} = \{x \in X_d : (x - e_d)^2 < 0\} = \{x \in X_d : x^d > 1\}. \tag{A.2}$$
We also define
$$\mathcal{R}' = \{x \in X_d : (x + e_d)^2 < 0\} = \{x \in X_d : x^d < -1\}. \tag{A.3}$$
Let $H$ (resp. $H^{(c)}$) denote the subgroup of $G_0$ (resp. $G_0^{(c)}$) which leaves $e_d$ unchanged. This is just the connected real (resp. complex) Lorentz group for the $d$-dimensional Minkowski space $\Pi_0 = \{x \in E_{d+1} : x^d = 0\}$ (resp. $\Pi_0^{(c)} = \{z \in E_{d+1}^{(c)} : z^d = 0\}$). The properties mentioned above for $f_\pm$ are invariant under $H$. Let
$$T'_{1,d} = H^{(c)}\, T_{1+} = H^{(c)}\, T_{1-}. \tag{A.4}$$
The last equality is due to the fact that $I_{01} \in H^{(c)}$, where $I_{01} e_0 = -e_0$, $I_{01} e_1 = -e_1$, $I_{01} e_\mu = e_\mu$ for $1 < \mu \le d$, and that $I_{01}$ is a bijection of $T_{1+}$ onto $T_{1-}$. Obviously $T'_{1,d} = H^{(c)}\, T'_{1,d}$. We also denote:
$$\Pi_\pm = \{x \in E_{d+1} : x^d = \pm 1\}, \qquad \Pi_\pm^{(c)} = \{z \in E_{d+1}^{(c)} : z^d = \pm 1\}, \tag{A.5}$$
$$Q_\pm = \{z \in E_{d+1}^{(c)} : (z \mp e_d)^2 = 0\}, \tag{A.6}$$
$$Q_0 = Q_\pm \cap \Pi_0^{(c)} = \{Z \in \Pi_0^{(c)} : (Z, Z) = -1\}. \tag{A.7}$$
The intersection of $X_d$ with the light-cone with apex at $-e_d$ is contained in the hyperplane $\Pi_-$; in fact
$$X_d^{(c)} \cap Q_- = X_d^{(c)} \cap \Pi_-^{(c)}. \tag{A.8}$$
Note also that $T_{1\pm} \cap \Pi_\pm^{(c)} = \emptyset$. Indeed we know that the image of $T_{1\pm}$ under the map $z \mapsto z^d$ is $\mathbf{C} \setminus [-1, 1]$. As a consequence $T'_{1,d} \cap \Pi_\pm^{(c)} = \emptyset$. Indeed the image of $T'_{1,d}$ under $z \mapsto z^d = (e_d, z)$ is the same as that of $T_{1\pm}$ since $H^{(c)} e_d = \{e_d\}$ by definition. We will prove:
Lemma A.1. Let $f_\pm$ be functions respectively holomorphic and of tempered growth in $T_{1\pm}$, $f_\pm^{(b)}$ their boundary values, in the sense of tempered distributions, on $X_d$. We suppose that $f_+^{(b)}$ and $f_-^{(b)}$ coincide in $\mathcal{R}$. Then:
(i) There exists a function $f$ holomorphic on $T'_{1,d}$ whose restriction to $T_{1\pm}$ is $f_\pm$.
(ii) $T'_{1,d} = \{z \in X_d^{(c)} : z^d \in \Delta = \mathbf{C} \setminus [-1, 1]\}$. In particular $T'_{1,d}$ contains $\mathcal{R}$ and $\mathcal{R}'$.
(iii) If $f_\pm^{(b)}$ is invariant under $H$, i.e. if $f_\pm^{(b)}(x) = f_\pm^{(b)}(\Lambda x)$ in the sense of distributions for all $\Lambda \in H$, then there exists a function $h$ holomorphic on $\Delta$ such that $f(z) = h(z^d)$ for all $z \in T'_{1,d}$.

Proof. (i) We will reduce this statement to a simple case of the Glaser-Streater Theorem [Sr, J] by using the inversion (stereographic projection) $z \mapsto \varphi_+(z) = Z$,
$$Z + e_d = 2\,\frac{z + e_d}{(z + e_d)^2}, \qquad z + e_d = 2\,\frac{Z + e_d}{(Z + e_d)^2}, \tag{A.9}$$
$$(Z + e_d)^2 = \frac{4}{(z + e_d)^2}. \tag{A.10}$$
$\varphi_+$ is a holomorphic involution of $E_{d+1}^{(c)} \setminus Q_-$ onto itself and commutes with every element of $H^{(c)}$. If $z = x + iy \in X_d^{(c)} \setminus Q_- = X_d^{(c)} \setminus \Pi_-^{(c)}$, and $Z = X + iY = \varphi_+(z)$, we find
$$Z + e_d = 2\,\frac{z + e_d}{(z + e_d)^2} = \frac{z + e_d}{z^d + 1}, \tag{A.11}$$
which implies $Z^d = 0$. Conversely if $Z \in \Pi_0^{(c)} \setminus Q_0$, it follows from (A.9) that $(z, z) = 1$. Therefore $\varphi_+$ restricted to $X_d^{(c)} \setminus \Pi_-^{(c)}$ is a holomorphic bijection onto $\Pi_0^{(c)} \setminus Q_0$. Moreover (if $z \in X_d^{(c)} \setminus \Pi_-^{(c)}$)
$$(Z + e_d)^2 = (Z, Z) + 1 = \frac{2}{z^d + 1}, \qquad z + e_d = 2\,\frac{Z + e_d}{(Z, Z) + 1}, \tag{A.12}$$
$$Y = \frac{(x^d + 1)\,y - y^d\,(x + e_d)}{(x^d + 1)^2 + (y^d)^2}, \qquad (Y, Y) = \frac{(y, y)}{(x^d + 1)^2 + (y^d)^2}, \tag{A.13}$$
$$Y^0 = \frac{y^0\,(x^d + 1) - y^d\,x^0}{(x^d + 1)^2 + (y^d)^2}. \tag{A.14}$$
Assume that $z \in T_{1+}$. Then $(y, y) > 0$, hence $(Y, Y) > 0$. Moreover $(x, x) = (y, y) + 1 > 1$, $(x, y) = 0$, $y^0 x^d - y^d x^0 > 0$, and
$$(y^0 x^d - y^d x^0)^2 = \bigl((y, y) + \vec y^{\,2}\bigr)\bigl((x, x) + \vec x^{\,2}\bigr) - (\vec x \cdot \vec y)^2 \ \ge\ \bigl((y, y) + \vec y^{\,2}\bigr)(x, x) \ >\ (y^0)^2, \tag{A.15}$$
hence $Y^0 > 0$. Thus $Z$ belongs to $T_+$, the future tube in the complex Minkowski space $\Pi_0^{(c)}$, and in fact $Z \in T_+ \setminus Q_0$. Similarly $\varphi_+$ maps $T_{1-}$ into the past tube of $\Pi_0^{(c)}$. Conversely suppose that $Z = X + iY \in T_+ \setminus Q_0$. Then $Z = \varphi_+(z)$, where $z = x + iy$ is given by the last equality in (A.9). It satisfies $(y, y) > 0$ by (A.13), and $y^0 x^d - y^d x^0 > 0$ since otherwise we would have $Y^0 < 0$ by (A.14). For real $x \in X_d \setminus \Pi_-$ and $X = \varphi_+(x) \in \Pi_0 \setminus Q_0$, the preceding formulae specialize to
$$(X, X) = \frac{1 - x^d}{1 + x^d}. \tag{A.16}$$
As a consequence
$$\varphi_+(\mathcal{R}) = \{X \in \Pi_0 : -1 < (X, X) < 0\}, \tag{A.17}$$
$$\varphi_+(\mathcal{R}') = \{X \in \Pi_0 : (X, X) < -1\}. \tag{A.18}$$

Fig. A.1. Schematic representation of the map $\varphi_+$ in projection onto the $(1, d)$ plane (axes $x^1$, $x^d$).

Fig. A.2. In the subspace $\Pi_0$, the region $\varphi_+(\mathcal{R})$ (light gray) and the region $\varphi_+(\mathcal{R}')$ (dark gray) in the case $d = 2$ (axes $X^0$, $X^1$).
The extended tube $T'$ in $\Pi_0^{(c)}$ is defined as usual as $T' = H^{(c)}\, T_+ = H^{(c)}\, T_-$. Obviously $\varphi_+(T'_{1,d}) = T' \setminus Q_0$ (recall that $T'_{1,d} \cap \Pi_-^{(c)} = \emptyset$). As it is well-known $\varphi_+(\mathcal{R}) \subset T'$, hence $\varphi_+(\mathcal{R}) \subset T' \setminus Q_0$, and also $\varphi_+(\mathcal{R}') \subset T' \setminus Q_0$. Furthermore
$$T' = \{Z \in \Pi_0^{(c)} : (Z, Z) \notin \mathbf{R}_+\}. \tag{A.19}$$
Let $g_\pm = f_\pm \circ \varphi_+^{-1}$. Then $g_\pm$ is holomorphic and of tempered growth in $T_\pm \setminus Q_0$, where $g_\pm(Z) = f_\pm(z)$, $z \in T_{1\pm} \setminus \Pi_-^{(c)}$ being given by (A.12). The boundary values of $g_\pm$ coincide in $\varphi_+(\mathcal{R})$ so that, by the edge-of-the-wedge theorem, $g_\pm$ have a common holomorphic extension in $((T_+ \cup T_-) \setminus Q_0) \cup N$, where $N$ is a complex open neighborhood of $\varphi_+(\mathcal{R})$ invariant under $H$. By following the steps of the proof of the Glaser-Streater Theorem as presented e.g. in [BEG] it is immediate to see that all the arguments apply to the present case, owing to the fact that $Q_0$ is invariant under $H^{(c)}$. We therefore conclude that $g_\pm$ have a common extension $g$ holomorphic in $T' \setminus Q_0$. Note in particular that $g$ is holomorphic in a neighborhood of $\varphi_+(\mathcal{R}')$, hence that the boundary values of $g_\pm$ coincide there. Since $\varphi_+$ is a bijection of $T'_{1,d}$ onto $T' \setminus Q_0$, setting $f(z) = g(\varphi_+(z))$ proves part (i).

(ii) follows from (A.19) and from the fact that for $z \in X_d^{(c)} \setminus \Pi_-^{(c)}$, $Z = \varphi_+(z)$,
$$(Z, Z) = \frac{1 - z^d}{1 + z^d}, \qquad z^d = \frac{1 - (Z, Z)}{1 + (Z, Z)}. \tag{A.20}$$
This implies that $\mathcal{R} = \{x \in X_d : x^d > 1\} \subset T'_{1,d}$, and similarly $\mathcal{R}' \subset T'_{1,d}$.

(iii) In addition to the preceding hypotheses we now assume that $f_\pm$, and therefore $g_\pm$, are invariant under the real Lorentz group $H$. Applying the Bargmann-Hall-Wightman Theorem [SW, J], we find that there exists a function $\hat g$, holomorphic in $\mathbf{C} \setminus \mathbf{R}_+ \setminus \{-1\}$, such that $g(Z) = \hat g(Z^2)$ for all $Z \in T' \setminus Q_0$. Setting $h(\zeta) = \hat g((1 - \zeta)/(1 + \zeta))$ we obtain part (iii).

Remark A.1. Another proof of part (i) can be given by using the temperedness of $f_\pm$. Indeed there is an integer $M > 0$ such that the functions $Z \mapsto g'_\pm(Z) = ((Z, Z) + 1)^M g_\pm(Z)$ are holomorphic in $T_\pm$ (respectively) and coincide on $\varphi_+(\mathcal{R})$. By the “double-cone theorem” [Bo2, p. 68], their region of coincidence extends to $\{X \in \Pi_0 : (X, X) < 0\}$ and the standard Glaser-Streater Theorem can be applied.
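The algebraic properties of the inversion $\varphi_+$ used in the proof of Lemma A.1 lend themselves to a quick numerical check. The sketch below takes $d = 2$, uses the bilinear form $(z, w) = z^0w^0 - z^1w^1 + z^dw^d$ without complex conjugation, and tests (A.9), (A.11), (A.16)-(A.18) and (A.20) on sample points; the parametrization of the quadric and all names are assumptions made for illustration.

```python
import cmath
import math

def form(z, w):
    # (z, w) = z0*w0 - z1*w1 + zd*wd for d = 2, no complex conjugation
    return z[0] * w[0] - z[1] * w[1] + z[2] * w[2]

def phi_plus(z):
    # (A.9): Z + e_d = 2 (z + e_d) / (z + e_d)^2
    s = (z[0], z[1], z[2] + 1)
    q = form(s, s)
    return (2 * s[0] / q, 2 * s[1] / q, 2 * s[2] / q - 1)

def point_on_quadric(tau, x):
    # A point with (z, z) = 1 (hypothetical parametrization used only for sampling)
    r = cmath.sqrt(x * x + 1)
    return (r * cmath.sin(tau), x, r * cmath.cos(tau))

z = point_on_quadric(0.4 + 0.3j, 0.7 - 0.2j)
Z = phi_plus(z)
print(abs(Z[2]))                                      # Z^d = 0, as implied by (A.11)
print(abs(form(Z, Z) - (1 - z[2]) / (1 + z[2])))      # first relation in (A.20)
print(max(abs(p - q) for p, q in zip(phi_plus(Z), z)))  # phi_plus is an involution

# Real points: x^d > 1 lands in -1 < (X, X) < 0, x^d < -1 lands in (X, X) < -1
x_R = (0.0, math.sinh(1.0), math.cosh(1.0))
x_Rp = (0.0, math.sinh(1.0), -math.cosh(1.0))
print(form(phi_plus(x_R), phi_plus(x_R)), form(phi_plus(x_Rp), phi_plus(x_Rp)))
```

The last line illustrates (A.17) and (A.18): the two real regions $\mathcal{R}$ and $\mathcal{R}'$ are sent to the two parts of the space-like region of $\Pi_0$ separated by $Q_0$.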
We return to the functions $w_\pm$, which we now assume invariant under $G_0$, and apply Lemma A.1 to $f_\pm(z) = w_\pm(e_d, z)$: these functions have a common holomorphic extension to $T'_{1,d}$, and there exists a function $h$, holomorphic on $\Delta$, such that $w_\pm(e_d, z) = h(z^d)$. For $z_1 \in T_{1-}$ and $z_2 \in T_{1+}$, let $w'_+(z_1, z_2) = h((z_1, z_2))$. For real $x_1 \in X_d$ and $z_2 \in T_{1+}$, we have $w_+(x_1, z_2) = w'_+(x_1, z_2)$. Indeed we can write $x_1 = \Lambda e_d$, $z_2 = \Lambda z'$ for some $\Lambda \in G_0$ and $z' \in T_{1+}$, so that $w_+(x_1, z_2) = w_+(e_d, z') = h(z'^d) = w'_+(x_1, z_2)$. Therefore $w_+(z_1, z_2)$ coincides with $w'_+(z_1, z_2)$, i.e. with $h((z_1, z_2))$, for all $z_1 \in T_{1-}$ and $z_2 \in T_{1+}$. Similarly $w_-(z_1, z_2)$ coincides with $h((z_1, z_2))$ on $T_{1+} \times T_{1-}$. We have proved:

Lemma A.2. With the preceding assumptions on $w_\pm$, and assuming in addition that these functions are invariant under $G_0$, there exists a function $h$, holomorphic on $\Delta$, such that $w_\pm$ coincide with $(z_1, z_2) \mapsto h((z_1, z_2))$ in their domains of definition.

Remark A.2. This implies that $w_\pm^{(b)}$ coincide not only on $\mathcal{R}_2$, but also on the “exotic region” $\mathcal{R}'_2$:
$$\mathcal{R}'_2 = \{x \in X_d^2 : (x_1 + x_2, x_1 + x_2) < 0\} = \{x \in X_d^2 : (x_1, x_2) < -1\}. \tag{A.21}$$
Our proof shows that this also holds without assuming that $w_\pm$ are invariant under $G_0$.

A.2. The case of $\tilde X_d$. The topology of $T_{1\pm}$ (and hence of $\tilde T_{1\pm}$) is made clear by the holomorphic bijection $\varphi_+$, which maps $T_{1\pm}$ onto $T_\pm \setminus Q_0$ in the complex Minkowski space $\Pi_0^{(c)}$. To make things even clearer, one can use the map $\psi$ defined in $\Pi_0^{(c)}$ by
$$\psi(Z) + e_{d-1} = -2\,\frac{Z + e_{d-1}}{(Z + e_{d-1})^2}. \tag{A.22}$$
This is a holomorphic bijection of $\{Z \in \Pi_0^{(c)} : (Z + e_{d-1})^2 \neq 0\}$ onto itself, which maps $T_+$ onto itself (i.e. in particular $(Z + e_{d-1})^2 \neq 0$ for $Z \in T_\pm$). It maps $\{Z \in \Pi_0^{(c)} : (Z, Z) = -1,\ (Z + e_{d-1})^2 \neq 0\}$ onto $\{Z \in \Pi_0^{(c)} : Z^{d-1} = 0,\ (Z + e_{d-1})^2 \neq 0\}$ and therefore it is a holomorphic bijection of $T_+ \setminus Q_0$ onto $T_+ \cap \{Z \in \Pi_0^{(c)} : Z^{d-1} \neq 0\}$ (and similarly for $T_- \setminus Q_0$). We shall however continue to work with $T_\pm \setminus Q_0$ which has the advantage of being invariant under $H$. We denote
$$\mathcal{L} = \{z \in \Pi_0^{(c)} : (z, z) + 1 \in \mathbf{R}_-\}. \tag{A.23}$$
This is an analytic hypersurface containing $Q_0$.

Lemma A.3. The open set $T_+ \setminus \mathcal{L}$ is connected and simply connected.

Proof. The set $A = T_+ \setminus \mathcal{L}$ is star-shaped with respect to 0, i.e. $\rho A \subset A$ for every $\rho \in (0, 1)$ (but $0 \notin A$). For every $z \in \Pi_0^{(c)}$, $|(z, z)| \le \|z\|^2$. Hence if $B$ denotes the open ball $B = \{z \in \Pi_0^{(c)} : \|z\|^2 < 1/2\}$, we have $A \cap B = T_+ \cap B$ and this intersection is convex. We can define a map $\sigma(t, z) = ((1 - t) + t\,\|2z\|^{-1})\,z$ of $[0, 1] \times A$ into $A$
such that $\sigma(0, z) = z$ and $z \mapsto \sigma(1, z)$ sends $A$ into $A \cap B$. Hence any two points in $A$ can be connected by a continuous arc, and every closed curve in $A$ is homotopic to 0.

We now suppose given a pair of functions $g_\pm$ respectively holomorphic and of tempered growth in the covering of $T_\pm \setminus Q_0$. This is equivalent to giving a pair of sequences $\{g_{n\pm} : n \in \mathbf{Z}\}$, with the following properties:
(1) for each $n \in \mathbf{Z}$, $g_{n\pm}$ is holomorphic in $T_\pm \setminus \mathcal{L}$;
(2) every $z \in \mathcal{L} \cap (T_+ \setminus Q_0)$ has an open complex neighborhood $V_z$ such that $g_{n+}|_{V_z \cap \{Z : \operatorname{Im}(Z, Z) > 0\}}$ and $g_{(n+1)+}|_{V_z \cap \{Z : \operatorname{Im}(Z, Z) < 0\}}$ have a common holomorphic extension in $V_z$;
(3) similarly for $g_{n-}$ in $T_-$.
In addition we suppose that
(4) the boundary values of $g_{0\pm}$ coincide in $\varphi_+(\mathcal{R})$.

The proof of the Glaser-Streater Theorem again applies to show that $g_{0\pm}$ have a common holomorphic extension $g_0$ in $T' \setminus \mathcal{L}$, in particular that the map $(\Lambda, z) \mapsto g_{0\pm}(\Lambda z)$ of $H \times (T_\pm \setminus \mathcal{L})$ into $\mathbf{C}$ extends to a holomorphic function on $H^{(c)} \times (T_\pm \setminus \mathcal{L})$. From this it follows that $(\Lambda, z) \mapsto g_{n\pm}(\Lambda z)$ also extends to a holomorphic function on $H^{(c)} \times (T_\pm \setminus \mathcal{L})$. The Bargmann-Hall-Wightman Lemma again shows that each of the functions $g_{n\pm}$ extends to a function holomorphic in $T' \setminus \mathcal{L}$. Moreover these two functions coincide, and we denote $g_n = g_{n\pm}$. This can be seen, for $n \ge 0$, by induction on $n$. Supposing it to hold up to $n - 1$, we consider points of the form $iy$ for $y \in V_+$ and $(y, y) > 1$, which belong to $T_+ \cap \mathcal{L}$. At such a point
$$\lim_{\varepsilon \downarrow 0} g_{n+}((i - \varepsilon)y) = \lim_{\varepsilon \downarrow 0} g_{(n-1)+}((i + \varepsilon)y) = \lim_{\varepsilon \downarrow 0} g_{(n-1)-}((i + \varepsilon)y) = \lim_{\varepsilon \downarrow 0} g_{n-}((i - \varepsilon)y). \tag{A.24}$$
Note however that if in addition the “hermiticity condition” $g_{0-}(z) = g_{0+}(z^*)^*$ holds, this extends to $g_{-n-}(z) = g_{n+}(z^*)^*$ for all $n \in \mathbf{Z}$. There is no general reason for $g_{n-}(z)$ and $g_{n+}(z^*)^*$ to coincide for $n \neq 0$.

If we suppose that $g_{0\pm}$ are invariant under $H$, then for each $n \in \mathbf{Z}$ the functions $g_n$ are locally Lorentz invariant and, again by the Bargmann-Hall-Wightman Lemma, there is a function $h_n$, holomorphic in $\mathbf{C} \setminus \mathbf{R}_+ \setminus (-1 + \mathbf{R}_-)$, such that $g_n(z) = h_n((z, z))$. Moreover $h_{n+1}(t - i0) = h_n(t + i0)$ for all $t \in (-\infty, -1)$ and all $n \in \mathbf{Z}$. Therefore there exists a function $\tilde h$, holomorphic on $\mathbf{C} \setminus \bigcup_{n \in \mathbf{Z}} (2i\pi n + \mathbf{R}_+)$, such that, for each $n \in \mathbf{Z}$, $h_n(e^w - 1) = \tilde h(w + 2in\pi)$ in $\{w \in \mathbf{C} \setminus \mathbf{R}_+ : -\pi < \operatorname{Im} w < \pi\}$.

Let now $w_\pm$ be functions holomorphic and of tempered growth on $\tilde T_{1-} \times \tilde T_{1+}$ and $\tilde T_{1+} \times \tilde T_{1-}$, respectively. We suppose that $w_\pm(\Lambda z_1, \Lambda z_2) = w_\pm(z_1, z_2)$ for all $\Lambda \in \tilde G_0$ and all $z$ in the relevant domain, and that the boundary values of $w_\pm$ on $\tilde X_d$ coincide at space-like separated arguments. As in the case of $X_d$, we assume that the boundary values can be extended to $C^\infty$ functions of a real first argument, holomorphic and of tempered growth in the second argument in $\tilde T_{1\pm}$ (respectively), and define $f_\pm(z) = w_\pm(e_d, z)$, $g_\pm(Z) = f_\pm(\varphi_+^{-1}(Z))$. Then $g_\pm$ satisfy Conditions (1)–(4) mentioned above, $g_{n\pm}$ are $H$-invariant, and the preceding remarks provide the functions $g_n$, $h_n$ and $\tilde h$. We can transport back these properties to the functions $w_\pm$.
B. Appendix. Proof of Lemma 3.7

We start with

Lemma B.1. In any neighborhood of $M_{0d}$ in $\mathcal{G}$, one can find a basis $\{h_{\mu,\nu} \in \mathcal{C}_+ : 0 \le \mu < \nu \le d\}$ of $\mathcal{G}$ such that $M_{0d} = 2 \sum_{0 \le \mu < \nu \le d} h_{\mu,\nu} / d(d + 1)$.
Proof. Let $\kappa > 0$ be sufficiently small. We denote $u_j = \kappa e_j$ for all $j = 1, \ldots, d - 1$. For $0 \le \mu < \nu \le d$, we define $f_{\mu,\nu} \in \bigwedge^2(E_{d+1})$ as follows ($j$ and $k$ are integers in $[1, d - 1]$): d−1 u + k u ∧ e f0,d = e0 + d−1 j d k j =1 k=1 d−1 = e0 ∧ ed + d−1 k e ∧ u + 0 k 1≤j
j
We adjust aj and bj so that only the first term survives in the rhs of this identity, i.e. aj = 1, bj = 1 + (d − 1 − j )(d − j )/2 for 1 ≤ j ≤ d − 1 .
(B.3)
With these values, $e_0 \wedge u_j$ and $u_j \wedge e_d$ can be recovered from $f_{0,j}$ and $f_{d,j}$, respectively, then $u_j \wedge u_k$ can be recovered from $f_{j,k}$. Thus the $\{f_{\mu,\nu}\}_{0 \le \mu < \nu \le d}$ form a basis of $\Lambda^2(E_{d+1})$. We obtain a basis $\{h_{\mu,\nu}\}_{0 \le \mu < \nu \le d}$ of $G$ by setting $h_{\mu,\nu} = \ell(f_{\mu,\nu})$. Clearly, given a neighborhood $V$ of $M_{0d} = \ell(e_0 \wedge e_d)$, κ can be chosen so small that $h_{\mu,\nu} \in V$ for all µ and ν.
Corollary B.1. The convex cone $\hat C_+$ generated by $C_1$ is open. Also
\[ G_0 = \{ \exp(s_1 M_1) \cdots \exp(s_N M_N) : N \in \mathbb{N},\ s_j \in \mathbb{R},\ M_j \in C_1\ \forall j \}. \tag{B.4} \]
Corollary B.2. $G_0^+$ is open in $G_0^{(c)}$.
Proof. We first show that, if $\tau \in \mathbb{C}_+$ has sufficiently small modulus, $\exp(\tau M_{0d})$ is an interior point of $G_0^+$. Denoting $L = d(d + 1)/2$, let $(LA_1, \ldots, LA_L) = (h_{0,1}, \ldots, h_{d-1,d})$ be the basis of $G$ constructed in the proof of Lemma B.1. In particular $M_{0d} = (A_1 + \cdots + A_L)$. We consider the two holomorphic maps of $\mathbb{C}^L$ into $G_0^{(c)}$,
\[ h_1(z_1, \ldots, z_L) = \exp(z_1 A_1) \cdots \exp(z_L A_L), \qquad h_2(z_1, \ldots, z_L) = \exp(z_1 A_1 + \cdots + z_L A_L). \tag{B.5} \]
These maps are tangent at 0, and, for sufficiently small ε > 0, both are biholomorphic maps of the polycylinder $\{z \in \mathbb{C}^L : |z_j| < \varepsilon\ \forall j\}$ into $G_0^{(c)}$. In particular the subset $V_1 = h_1(\{z \in \mathbb{C}^L : |z_j| < \varepsilon,\ \mathrm{Im}\, z_j > 0\ \forall j\})$ of $G_0^{(c)}$ is open and contained in $G_0^+$. For sufficiently small |τ| the curve $\tau \mapsto h_1^{-1} \circ h_2(\tau, \ldots, \tau)$ exists and is tangent to $\tau \mapsto (\tau, \ldots, \tau)$. Therefore there exists an η > 0 such that |τ| < η and Im τ > 0 imply $h_2(\tau, \ldots, \tau) \in V_1$, i.e. $\exp(\tau M_{0d}) \in V_1$. We now consider a point $\Lambda \in G_0^+$ of the form $\exp(\tau_1 M_1) \cdots \exp(\tau_N M_N) \exp(\tau M_{0d})$, where $\tau_1, \ldots, \tau_N, \tau \in \mathbb{C}_+$ and $M_1, \ldots, M_N \in C_1$. This point can be rewritten as $\Lambda = \Lambda_\theta \exp(\theta \tau M_{0d})$, where θ is arbitrary in (0, 1) and $\Lambda_\theta \in G_0^+$. For sufficiently small θ, $\exp(\theta \tau M_{0d}) \in V_1$, hence $\Lambda \in \Lambda_\theta V_1$, an open set contained in $G_0^+$. Since $G_0^+$ is invariant under conjugations from $G_0$, this proves the corollary.
Corollary B.3.
\[ G_0^+ = G_0 G_0^+ = G_0^+ G_0. \tag{B.6} \]
Proof. Let $\Lambda \in G_0^+$ and $M \in C_+$. For any real s and t,
\[ e^{sM} \Lambda = e^{(s+it)M} (e^{-itM} \Lambda). \tag{B.7} \]
Since $G_0^+$ is open, for sufficiently small t > 0, $e^{-itM} \Lambda \in G_0^+$ and $e^{(s+it)M} \in G_0^+$, hence $e^{sM} \Lambda \in G_0^+$. Any $S \in G_0$ can be written as a finite product $\exp(s_1 M_1) \cdots \exp(s_N M_N)$, where $s_j \in \mathbb{R}$ and $M_j \in C_+$, hence $S\Lambda \in G_0^+$ for any $\Lambda \in G_0^+$. Thus $G_0^+ = G_0 G_0^+$, and since $S G_0^+ S^{-1} = G_0^+$ for all $S \in G_0$, this implies $G_0^+ = G_0^+ G_0$.
Communicated by H. Nicolai
Commun. Math. Phys. 231, 529–568 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0722-3
Gluing and Wormholes for the Einstein Constraint Equations
James Isenberg$^{1,*}$, Rafe Mazzeo$^{2,**}$, Daniel Pollack$^{3,***}$
1 University of Oregon, Eugene, OR 97403, USA. E-mail: [email protected]
2 Stanford University, Stanford, CA 92740, USA. E-mail: [email protected]
3 University of Washington, Seattle, WA 98195-4350, USA. E-mail: [email protected]
Received: 4 October 2001 / Accepted: 26 July 2002
Published online: 29 October 2002 – © Springer-Verlag 2002
Abstract: We establish a general gluing theorem for constant mean curvature solutions of the vacuum Einstein constraint equations. This allows one to take connected sums of solutions or to glue a handle (wormhole) onto any given solution. Away from this handle region, the initial data sets we produce can be made as close as desired to the original initial data sets. These constructions can be made either when the initial manifold is compact or asymptotically Euclidean or asymptotically hyperbolic, with suitable corresponding conditions on the extrinsic curvature. In the compact setting a mild nondegeneracy condition is required. In the final section of the paper, we list a number of ways this construction may be used to produce new types of vacuum spacetimes.
1. Introduction
1.1. The constraint equations and surgery. The Einstein equations for the gravitational field on a Lorentzian four-manifold Z form, modulo diffeomorphisms, a locally well-posed hyperbolic system [11], and vacuum solutions may be obtained from a set of Cauchy data on a three-dimensional spacelike slice Σ ⊂ Z. A vacuum initial data set for this problem consists of two symmetric tensors γ and Π on Σ, which correspond respectively to the induced metric and second fundamental form on Σ. Thus γ is positive definite while Π is at least apparently unrestricted. However, in order for the Einstein evolution to exist, at least for a short time, it is necessary and sufficient [11] that these tensors satisfy the constraint equations:
\[ \operatorname{div} \Pi - \nabla \operatorname{tr} \Pi = 0, \tag{1} \]
\[ R_\gamma - |\Pi|_\gamma^2 + (\operatorname{tr} \Pi)^2 = 0. \tag{2} \]
∗ Supported by the NSF under Grant PHY-0099373
∗∗ Supported by the NSF under Grant DMS-9971975 and at MSRI by NSF grant DMS-9701755
∗∗∗ Supported by the NSF under Grant DMS-9704515
Here ∇ is the Levi-Civita connection for γ, $R_\gamma$ its scalar curvature, and the divergence, $\operatorname{div}_\gamma \Pi$, trace, $\operatorname{tr}_\gamma \Pi$, and norm squared of Π, $|\Pi|_\gamma^2 = \Pi_{ab} \Pi_{cd} \gamma^{ac} \gamma^{bd}$, are all computed with respect to γ. Having fixed the three-manifold Σ, it is not apparent that solutions (γ, Π) of these constraint equations exist, or how to find them. However, there is a systematic procedure which generates solutions in a large number of cases, which we review now. Let us begin with a pair of tensors (γ, Π), which do not necessarily satisfy the constraint equations, but which have the property that the mean curvature $\tau = \operatorname{tr}_\gamma \Pi$ is constant. The method we are describing relies crucially on this assumption. Decompose Π into its trace-free and pure trace parts:
\[ \Pi = \mu + \frac{\tau}{3} \gamma. \tag{3} \]
The first of the constraint equations (1) requires that div µ = 0, since τ is constant, so that µ is what is usually called a transverse-traceless (TT) tensor field. The idea is to modify the pair (γ, Π) by changing γ and the trace-free part µ of Π by a conformal factor; by a judicious choice of this factor, the new tensors will satisfy the constraint equations. More specifically, for any φ > 0 we set
\[ \tilde\gamma = \phi^4 \gamma, \tag{4} \]
\[ \tilde\Pi = \tilde\mu + \frac{\tau}{3} \tilde\gamma = \phi^{-2} \mu + \frac{\tau}{3} \phi^4 \gamma. \tag{5} \]
The factor $\phi^{-2}$ in $\tilde\mu$ is the only one for which $\tilde\mu$ is divergence-free with respect to $\tilde\gamma$. (Note that $\tilde\Pi$ and $\tilde\mu$ as they appear in (5) are covariant tensors; the factor $\phi^{-10}$ replaces $\phi^{-2}$ if we work with $\tilde\Pi$ and $\tilde\mu$ in contravariant form.) This new pair $(\tilde\gamma, \tilde\Pi)$ satisfies the constraint equations provided the conformal factor φ satisfies the Lichnerowicz equation
\[ \Delta\phi - \frac{1}{8} R \phi + \frac{1}{8} |\mu|^2 \phi^{-7} - \frac{1}{12} \tau^2 \phi^5 = 0. \tag{6} \]
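As a consistency check on the exponents and coefficients in (6), the following short computation substitutes (4) and (5) into the second constraint equation (2). It is a sketch under two assumptions that are not spelled out at this point in the text: the standard three-dimensional conformal transformation law for scalar curvature, and the convention that Δ is the trace of the Hessian with respect to γ.

```latex
% Assumptions: dimension 3, \tilde\gamma = \phi^4\gamma,
% \Delta = \gamma^{ab}\nabla_a\nabla_b (trace of the Hessian).
\begin{align*}
R_{\tilde\gamma} &= \phi^{-5}\bigl(R_\gamma\,\phi - 8\,\Delta\phi\bigr), \\
\operatorname{tr}_{\tilde\gamma}\tilde\Pi &= \tau, \qquad
|\tilde\Pi|^2_{\tilde\gamma}
  = |\tilde\mu|^2_{\tilde\gamma} + \tfrac{1}{3}\tau^2
  = \phi^{-12}\,|\mu|^2_\gamma + \tfrac{1}{3}\tau^2, \\
0 &= R_{\tilde\gamma} - |\tilde\Pi|^2_{\tilde\gamma}
     + (\operatorname{tr}_{\tilde\gamma}\tilde\Pi)^2
  = \phi^{-5}\bigl(R_\gamma\,\phi - 8\,\Delta\phi\bigr)
     - \phi^{-12}\,|\mu|^2_\gamma + \tfrac{2}{3}\tau^2 .
\end{align*}
% Multiplying the last line by -\phi^{5}/8 recovers (6):
\[
\Delta\phi - \tfrac{1}{8}R_\gamma\,\phi
  + \tfrac{1}{8}|\mu|^2_\gamma\,\phi^{-7}
  - \tfrac{1}{12}\tau^2\phi^{5} = 0 .
\]
```

The power $\phi^{-12}$ arises because $\tilde\mu = \phi^{-2}\mu$ enters quadratically and each of the two inverse metrics contributes $\phi^{-4}$.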
Here the Laplacian, scalar curvature and norm-squared of µ are all computed relative to γ. By well-known analytic techniques one may find solutions of this semilinear equation, and so altogether this procedure produces many admissible pairs of tensors satisfying the constraint equations. In effect, the constancy of the mean curvature τ decouples Eqs. (1), (2), reducing them to the facts that µ is transverse-traceless and φ satisfies the Lichnerowicz equation. These formulations are equivalent and so we label initial data sets in one of three equivalent ways: as (γ, Π) if these tensors satisfy the constraint equations, or as (γ, Π, φ) if φ is the conformal factor which alters (γ, Π) to tensors satisfying the constraint equations, or finally (suppressing τ) as (γ, µ, φ). When φ is omitted in this notation, then φ = 1 is implied. When Σ is compact, complete existence results are available in this CMC case [18]; analogous results are known when Σ is either asymptotically flat [7, 15, 9, 8] or asymptotically hyperboloidal [3, 2]. In the non-CMC case, when τ is no longer constant, only partial results have been obtained [19, 12, 14, 20] but this direction warrants greater attention since the existence of constant mean curvature slices is apparently a restrictive assumption in relativity [5, 6].
In this paper we address the following question. Suppose we are given a three-manifold Σ along with an initial data set (γ, Π) on it which satisfy the constraint equations. Then is it possible to modify Σ by “surgery” to obtain a new, topologically distinct manifold Σ′, and to find tensors γ′ and Π′ on Σ′ which themselves satisfy the constraint equations? Since we can find solutions by applying the procedure above to Σ′ in the first place, the more precise question we propose to study is whether it is possible to perform the surgery on a geometrically small subset of Σ and to find solutions (γ′, Π′) which are very close to the initial solutions (γ, Π) away from this subset. Let us phrase this more concretely. There are many different types of surgeries possible on higher dimensional manifolds, and really only two types in three dimensions. One is akin to Dehn surgery, while the other, on which we focus, consists of “adding a handle” in the following sense. Choose any two points p1, p2 ∈ Σ and let B1 and B2 be small (metric) balls centered at these points. Excise these balls to obtain a manifold Σ″ = Σ \ (B1 ∪ B2). The boundary of Σ″ consists of two copies of $S^2$, the same as the product of an interval with a two-sphere, $I \times S^2$. Identifying these two manifolds along this common boundary yields a new manifold
\[ \hat\Sigma = \Sigma'' \sqcup_{\{\pm 1\} \times S^2} (I \times S^2). \]
Informally, we propose to restrict the original initial data set (γ, Π) to Σ″, modify this data slightly, and then find a one-parameter family of extensions $(\gamma_\varepsilon, \Pi_\varepsilon)$ of this data to $\hat\Sigma$, $\varepsilon \in (0, \varepsilon_0)$, each of which also satisfy the constraint equations. The parameter ε measures the size of the handle. We shall do this in such a way that on any compact set K away from the handle, $(\gamma_\varepsilon, \Pi_\varepsilon)$ converges to (γ, Π) as some power of ε. The data on the handle will converge to zero. In the compact setting, the only condition needed to make this procedure work is a very mild nondegeneracy condition as well as the assumption that Π ≢ 0; the precise statement is contained in the main theorem below. In particular, we cannot handle compact “time-symmetric slices”, i.e. those with Π ≡ 0, in which case the constraint equations reduce to the vanishing of scalar curvature, and known topological obstructions (see [32] for example) prevent this sort of construction from working in general then. There are two main cases of this construction. In the first, Σ consists of two components, $\Sigma_1$ and $\Sigma_2$, and the two points p1, p2 lie in these separate components. Then we are producing solutions of the constraint equations on the connected sum $\Sigma_1 \# \Sigma_2$ of these two components. In the other case, Σ is connected, and this construction may be regarded as demonstrating that a “wormhole” can be added to most solutions of the constraint equations. Both cases are of clear physical interest.
The analytic techniques used in the proof here are in the spirit of many other gluing theorems for a variety of other geometric structures. Gluing theorems are fundamental in gauge theory, and similar techniques have been developed and applied to metrics of constant scalar curvature [29, 26], minimal and constant mean curvature surfaces [21, 27, 28], and a variety of other geometric problems. Another recent result using the same circle of ideas, and relevant to this area of general relativity, is contained in [16]; this paper proves the existence of time-symmetric asymptotically flat solutions to the constraint equations (i.e. scalar flat asymptotically Euclidean metrics) which evolve to spacetimes which are identically Schwarzschild near spatial infinity. The technique common to many of these papers consists of a two-step procedure. First a one-parameter family of approximate solutions to any one of these problems is constructed. These are chosen so that the error measuring the discrepancy of these approximate solutions from exact solutions tends to zero, and so the strategy is to perturb these approximate solutions to exact solutions using a contraction mapping or implicit function theorem argument. The complication here is that as the parameter ε converges to zero, the geometry is degenerating and some care must be taken to control the linearization of the nonlinear equation associated to the geometric problem. A slightly different and ultimately simpler approach is contained in [27, 28]; however, we shall follow the earlier and more familiar route here because in the present context either method requires approximately the same amount of work. In the remainder of this section we set up some notation, give some definitions, state precise versions of our main results, and provide a more detailed guide to the rest of the paper.
1.2. Statement of main results. We fix a three-manifold Σ along with a conformal class [γ] on Σ. For any two points p1, p2 on this manifold, we refer to the triple ([γ], p1, p2) as a marked conformal structure on Σ. We have already alluded to a nondegeneracy condition required for the construction:
Definition 1. A marked conformal structure ([γ], p1, p2) on a compact manifold Σ is nondegenerate if either of the following situations is true:
• Σ is connected and there are no nontrivial conformal Killing vector fields on Σ which vanish at either p1 or p2, or
• The points p1 and p2 lie on different components of Σ and there are no conformal Killing fields which vanish at $p_j$ but which do not vanish identically on the component of Σ containing $p_j$, for j = 1, 2.
Remark 1. To avoid trivialities, we always assume that either Σ is connected or else has two components, each of which contains one of the points $p_j$. Note that this nondegeneracy condition is very mild and holds for any fixed conformal class [γ] when the points $p_j$ are chosen generically. In the asymptotically Euclidean and asymptotically hyperboloidal cases, which are the only noncompact cases of interest to us here, this nondegeneracy condition is not needed. We discuss these cases in more detail in §7.
Assume that Σ is compact, and fix two points p1, p2 on it.
Suppose that (γ, Π) is an initial data set with Π ≢ 0 and $\tau = \operatorname{tr}_\gamma \Pi$ constant, so that in particular the two constraint equations (1), (2) are satisfied. It is often more convenient to regard (γ, µ) as the initial data set instead, where µ is the transverse-traceless part of Π. For any (small) R > 0, fix balls $B_j = B_R(p_j)$ of radius R (with respect to γ) around the points $p_j$, and define
\[ \Sigma_R^* = \Sigma \setminus (B_1 \cup B_2). \tag{7} \]
We also let $\Sigma_0^* = \Sigma^* = \Sigma \setminus \{p_1, p_2\}$. We shall modify the metric γ conformally in each of the punctured balls $B_j \setminus p_j$ to obtain a metric $\gamma_c$ on $\Sigma^*$ with two asymptotically cylindrical ends. These ends have a natural parameter t = − log r, where r is the geodesic distance in $B_j$ to $p_j$. According to (5) the transverse-traceless tensor µ changes to a tensor $\mu_c$ which is transverse-traceless with respect to $\gamma_c$. This gives an asymptotically cylindrical pair $(\gamma_c, \mu_c)$ together with a function ψ on $\Sigma^*$ which solves the associated Lichnerowicz equation (6) (with coefficients determined by this pair of tensors). We also refer to the triple $(\gamma_c, \mu_c, \psi)$ as an initial data set.
The construction proceeds as follows. We first identify long pieces, of length T, of these two cylindrical ends with one another to obtain a family of metrics $\gamma_T$ on the new manifold $\Sigma_T$ obtained from Σ by adding a handle. For each T, $\Sigma_T$ is of course diffeomorphic to the manifold $\hat\Sigma$ described in the previous section. This process involves cutoff functions and is not canonical. We also use cutoffs to patch together the values of $\mu_c$ on these ends to obtain $\mu_T$, and similarly an approximate solution $\psi_T$ of the Lichnerowicz equation. We write this out more carefully in §2. Although it is not hard to construct $\mu_T$ so that it still has vanishing trace relative to $\gamma_T$, it is no longer necessarily true that $\mu_T$ will be divergence-free. The triple $(\gamma_T, \mu_T, \psi_T)$ does not solve the constraint equations, but instead gives an approximate solution, with an error term tending to zero as T → ∞. We show that it is possible, for T sufficiently large, to correct this data to obtain exact solutions of the constraint equations. We do this in the following order. First $\mu_T$ is altered by a small correction term $\sigma_T$, using a well known second order linear elliptic system (usually called the vector Laplacian) derived from the conformal Killing operator on $(\Sigma_T, \gamma_T)$, to obtain a tensor $\tilde\mu_T = \mu_T - \sigma_T$ which is transverse-traceless with respect to $\gamma_T$. This involves a careful analysis of the mapping properties of this operator with good control as T → ∞ and is the topic of §3. We then use the conformal method, and more specifically the Lichnerowicz equation (6), to modify the function $\psi_T$ to some $\tilde\psi_T$, so that $(\gamma_T, \tilde\mu_T, \tilde\psi_T)$ is an initial data set, i.e. gives an exact solution of the constraint equations. This involves first estimating the error terms for this approximate solution, which is done in §4. We next analyze the linearization of the Lichnerowicz equation uniformly as T → ∞. In particular, we introduce a class of weighted Hölder spaces $C_\delta^{k,\alpha}(\Sigma_T)$ which correspond to the standard Hölder spaces away from the handle and determine the range of weights δ for which the inverse of the linearized Lichnerowicz equation is uniformly bounded independent of T. This is carried out in §5.
Finally, in §6 we use a contraction mapping argument to produce the exact solutions by finding a function $\eta_T$ in the ball $B_\nu$ of radius $\nu e^{-T/4}$ in $C_\delta^{k+2,\alpha}(\Sigma_T)$ for δ ∈ (0, 1), so that $\tilde\psi_T = \psi_T + \eta_T$ satisfies the Lichnerowicz equation. This establishes our main theorem which we now state (in the form appropriate for initial data on closed manifolds).
Theorem 1. Let (Σ, γ, Π; p1, p2) be a compact, smooth, marked, constant mean curvature solution of the Einstein constraint equations; thus $\Pi = \mu + \frac{1}{3} \tau \gamma$ with τ constant and µ transverse-traceless with respect to γ. We assume that this solution is nondegenerate and also that Π ≢ 0. Then there is a geometrically natural choice of a parameter T and, for T sufficiently large, a one-parameter family of solutions $(\Sigma_T, \Gamma_T, \Pi_T)$ of the Einstein constraint equations with the following properties. First, the three-manifold $\Sigma_T$ is constructed from Σ by adding a handle, or neck, connecting the two points p1 and p2. Next,
\[ \Gamma_T = \tilde\psi_T^4 \gamma_T \quad \text{and} \quad \Pi_T = \tilde\psi_T^{-2} \tilde\mu_T + \frac{1}{3} \tau \tilde\psi_T^4 \gamma_T, \]
where $\tilde\psi_T = \psi_T + \eta_T$ and $\tilde\mu_T = \mu_T - \sigma_T$ are very small perturbations of the conformal factor $\psi_T$ and approximate transverse-traceless tensor $\mu_T$ defined informally above and more carefully in §2 below. The tensor $\tilde\mu_T$ is transverse-traceless with respect to $\Gamma_T$ and the function $\tilde\psi_T$ satisfies the Lichnerowicz equation with respect to the pair $(\gamma_T, \tilde\mu_T)$. Finally, the perturbation terms satisfy the following estimates as T → ∞:
• On $\Sigma_R^*$,
\[ |\Gamma_T - \gamma|_\gamma \le C e^{-T/4}, \qquad |\Pi_T - \Pi|_\gamma \le C T^3 e^{-3T/2}; \]
• On the neck region, there is a natural choice of coordinate system (s, θ), −T/2 ≤ s ≤ T/2 and θ ∈ $S^2$, such that if $h_0$ is the standard metric on $S^2$, then
\[ \gamma_T = ds^2 + h_0 + O(e^{-T/2} \cosh s), \]
\[ \psi_T = 2 e^{-T/4} \cosh(s/2) + O(e^{-T/4}), \qquad |\eta_T| \le C e^{-(1+\delta)T/4} (\cosh(s/2))^\delta \]
for some weight δ ∈ (0, 1), and finally
\[ |\tilde\mu_T|_{\gamma_T} \le C e^{-T/2} \cosh s, \qquad |\sigma_T|_{\gamma_T} \le C T^3 e^{-3T/2}. \]
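The leading term $2e^{-T/4}\cosh(s/2)$ of $\psi_T$ on the neck can be traced to the construction in §2, where $\psi_c = r_j^{1/2} = e^{-t_j/2}$ on each end and the neck coordinate satisfies $t_1 = A + T/2 + s$, $t_2 = A + T/2 - s$ with $A = -\log R$. The following is a minimal numerical sketch of that identity; the function names and the sample values of R and T are illustrative assumptions, not taken from the paper, and the constant factor $e^{-A/2} = R^{1/2}$ is kept explicit rather than normalized away.

```python
import math


def psi_end(t):
    """Conformal factor on one cylindrical end: psi_c = r**0.5 with r = exp(-t)."""
    return math.exp(-t / 2.0)


def psi_neck(s, T, R):
    """Sum psi_1 + psi_2 at neck coordinate s.

    Uses t1 = A + T/2 + s and t2 = A + T/2 - s with A = -log(R),
    which is the relation s = t1 - A - T/2 = -t2 + A + T/2.
    """
    A = -math.log(R)
    t1 = A + T / 2.0 + s
    t2 = A + T / 2.0 - s
    return psi_end(t1) + psi_end(t2)


def psi_leading(s, T, R):
    """Closed form 2 * sqrt(R) * exp(-T/4) * cosh(s/2)."""
    return 2.0 * math.sqrt(R) * math.exp(-T / 4.0) * math.cosh(s / 2.0)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    R, T = 0.5, 12.0
    for s in (-T / 2.0, -1.0, 0.0, 2.5, T / 2.0):
        print(s, psi_neck(s, T, R), psi_leading(s, T, R))
```

The two columns printed agree to rounding error, which shows that the sum of the two exponentially decaying tails is exactly a hyperbolic cosine centered at the middle of the neck.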
The modifications needed for the asymptotically Euclidean and asymptotically hyperboloidal cases are discussed in §7 and statements of this theorem in these settings are given there, in Theorems 4 and 5. Notice that we have not stated the precise regularity of the solutions, but as indicated earlier, we work in various classes of Hölder spaces; in particular, we also obtain existence of initial data sets (Σ, γ, Π) with only finite regularity. Some estimates of the geometry of the solutions are provided in §8; these should be of interest when considering the spacetime evolutions of these initial data sets. Finally, §9 contains an informal discussion of various concrete examples of initial data sets, many of which are new, which our construction gives.
2. The Approximate Solution
In this section we construct the family of approximate solutions $(\gamma_T, \mu_T, \psi_T)$. Let $r_j$ denote the geodesic distance to $p_j$ (relative to γ) in the ball $B_j$. By Gauss’ lemma we have
\[ \gamma|_{B_j} = dr_j^2 + r_j^2 h_j(r_j), \]
where $h_j$ is a family of metrics on $S^2$, smooth down to $r_j = 0$ and such that $h_j(0)$ is the standard round metric on the sphere. The standard way to obtain a metric with asymptotically cylindrical ends is to multiply γ by $r_j^{-2}$ in each of these balls. Recalling the convention of (4) that the conformal factor is written as $\psi^4$, let us fix a function $\psi_c$ which is strictly positive on $\Sigma^*$, which equals one on $\Sigma_{2R}^*$, and which equals $r_j^{1/2}$ on $B_j$. Now define
\[ \gamma_c = \psi_c^{-4} \gamma. \tag{8} \]
This metric agrees with γ on $\Sigma_{2R}^*$ and has asymptotically cylindrical ends in a neighborhood of each $p_j$. In fact, setting $t_j = -\log r_j$, we see that
\[ \gamma_c|_{B_j} = dt_j^2 + h_j(e^{-t_j}). \]
Recalling that $h_j$ is smooth in $r_j$, we can rewrite this as
\[ dt_j^2 + d\theta^2 + e^{-t_j} k_j, \tag{9} \]
where $d\theta^2$ is the standard round metric on $S^2$ and the term $k_j$ in the remainder is a symmetric two-tensor which is smooth in $r_j = e^{-t_j}$. Note that if the metric γ is conformally flat in $B_j$ then the definition of $\psi_c$ may be easily modified to make the metric $\gamma_c$ exactly cylindrical in $B_j$. Note also, for later reference, that
\[ -\log R \le t_j < \infty. \tag{10} \]
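The cylindrical form of $\gamma_c$ on $B_j$ follows from a one-line change of variables, written out here for convenience. The computation uses only (8), the choice $\psi_c = r_j^{1/2}$ on $B_j$, and $t_j = -\log r_j$; nothing beyond the text is assumed.

```latex
\begin{align*}
\gamma_c|_{B_j}
  &= \psi_c^{-4}\,\gamma
   = r_j^{-2}\bigl(dr_j^2 + r_j^2\,h_j(r_j)\bigr)
   = \Bigl(\frac{dr_j}{r_j}\Bigr)^{2} + h_j(r_j), \\
dt_j &= -\frac{dr_j}{r_j}
  \quad\Longrightarrow\quad
  \gamma_c|_{B_j} = dt_j^2 + h_j\bigl(e^{-t_j}\bigr).
\end{align*}
```

Since $0 < r_j \le R$ on the punctured ball, the new coordinate ranges over $-\log R \le t_j < \infty$, which is (10).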
Fig. 1. Conformally blowing up (Σ, γ) to form $(\Sigma^*, \gamma_c)$ with two asymptotically cylindrical ends
Next, following (5), define
\[ \mu_c = \psi_c^2 \mu; \tag{11} \]
this is defined on $\Sigma^*$ and is transverse-traceless with respect to $\gamma_c$. Relative to the initial data set $(\gamma_c, \mu_c)$ on $\Sigma^*$, $\psi_c$ satisfies the Lichnerowicz equation
\[ \Delta_c \psi_c - \frac{1}{8} R_c \psi_c + \frac{1}{8} |\mu_c|^2 \psi_c^{-7} - \frac{1}{12} \tau^2 \psi_c^5 = 0. \tag{12} \]
In this equation, the expressions $\Delta_c$, $R_c$ and $|\cdot|^2$ are all computed in terms of the metric $\gamma_c$. Thus $(\gamma_c, \mu_c, \psi_c)$ is an initial data set on $\Sigma^*$.
Let A = − log R and let T be a large parameter. Truncate the manifold $\Sigma^*$ by omitting the regions where $t_j > A + T$. We may form a new smooth manifold by identifying the two finite cylindrical tubes $\{(t_j, \theta) : A \le t_j \le A + T\}$ via the map $(t_1, \theta) \mapsto (T - t_1, -\theta)$. Let us call the resulting manifold $\Sigma_T$ and let the now-identified cylindrical tube be denoted $C_T$. (Of course, each $\Sigma_T$ is diffeomorphic to $\hat\Sigma$, the manifold obtained by adding a handle to Σ, but it is convenient to keep track of the dependence on T explicitly.) We introduce a new linear coordinate s on $C_T$ by setting $s = t_1 - A - T/2 = -t_2 + A + T/2$. Thus $C_T$ is parametrized by the coordinates $(s, \theta) \in [-T/2, T/2] \times S^2$.
Fig. 2. The manifold $(\Sigma_T, \gamma_T)$
To continue we must define the family of metrics $\gamma_T$, “almost-transverse-traceless” tensors $\mu_T$ on $\Sigma_T$, and conformal factors $\psi_T$, which we do in turn. The definition of $\gamma_T$ is most transparent when γ is conformally flat in the balls $B_j$. In this case, using the modifications of $\psi_c$ alluded to above, in the cylindrical coordinates $(t_j, \theta)$, we have $\gamma_c = dt_j^2 + d\theta^2$, and thus the map identifying the two cylindrical pieces to $C_T$ is an isometry and $\gamma_T$ is thereby well-defined. In the general case, we cover $\Sigma_T$ by two open sets $U_1$ and $U_2$, the intersection of which consists of two components, one disjoint from $B_1 \cup B_2$ and the other equal to $\{(s, \theta) : -1 < s < 1\}$. Choose a partition of unity $\{\chi_1, \chi_2\}$ subordinate to this cover and let $\gamma_j$ and $\mu_j$ denote the restrictions of $\gamma_c$ and $\mu_c$ to $U_j$. Then, with the obvious abuse of notation, define
\[ \gamma_T = \chi_1 \gamma_1 + \chi_2 \gamma_2, \qquad \mu_T = \chi_1 \mu_1 + \chi_2 \mu_2. \]
Notice that $\gamma_T = \gamma_c$ and $\mu_T = \mu_c$ away from the middle $Q = [-1, 1] \times S^2$ of the neck region, where $t_1, t_2 \approx T/2$. Although it is not obvious why at this stage, we need to define $\psi_T$ somewhat differently. This time choose a covering $\{\tilde U_1, \tilde U_2\}$ of Σ such that $\tilde U_1 \cap \tilde U_2 \subset \Sigma_{2R}^*$ and with $p_j \in \tilde U_j$. Also choose nonnegative functions $\tilde\chi_j \in C_0^\infty(\tilde U_j)$ with $\tilde\chi_1 + \tilde\chi_2 = 1$ on $\Sigma_R^*$, and such that when restricted to $B_j$, $\tilde\chi_j = 1$ for $t_j \le A + T - 1$ and $\tilde\chi_j = 0$ for $t_j \ge A + T$. Writing the restriction of $\psi_c$ to $\tilde U_j$ as $\psi_j$, we set $\psi_T = \tilde\chi_1 \psi_1 + \tilde\chi_2 \psi_2$. Then, based on the construction of $\Sigma_T$ from the conformal blow up of (Σ, γ), we define $\psi_T$ on $\Sigma_T$ in the obvious way. Note that $\psi_T$ is exactly equal to 1 away from the
cylinder, and is equal to $\psi_1 + \psi_2$ throughout most of the cylinder except at the two ends. At these ends, near the junctions of the cylinder with $\Sigma_R^*$, one of the two functions $\psi_1$ or $\psi_2$ is roughly of unit size and the other is exponentially small; in defining $\psi_T$, the exponentially small summand is cut off to be zero near these junctures. In summary, we have defined $\gamma_T$ and $\mu_T$ by using cutoffs supported on the middle of the cylinder $C_T$, but have cut off either of the $\psi_j$ at the ends of the cylinder. The reasons for doing this are that $\gamma_c$ decays to the product metric like $e^{-t_j}$ and $\mu_c$ is extremely small in the middle of $C_T$, so the error terms from these are of order $e^{-T/2}$. On the other hand, $\psi_c$ only decays like $e^{-t_j/2}$ along each end, and so if we were to cut this function off in the middle, it would produce an error term with size approximately $e^{-T/4}$, which is unacceptably large. We carry out these estimates in more detail in §4.
Fig. 3. Defining the metric $\gamma_T = \chi_1 \gamma_1 + \chi_2 \gamma_2$, and the approximate TT-tensor $\mu_T = \chi_1 \mu_1 + \chi_2 \mu_2$
Fig. 4. The approximate solution to the Lichnerowicz equation, $\psi_T = \tilde\chi_1 \psi_1 + \tilde\chi_2 \psi_2$
3. Transverse-Traceless Tensors on $\Sigma_T$
We now undertake the first of several steps intended to perturb the approximate solution $(\gamma_T, \mu_T, \psi_T)$ constructed in the last section to an initial data set, i.e. an exact solution, when T is sufficiently large. Our goal in this section is to modify the tensor $\mu_T$ so that it becomes transverse-traceless with respect to $\gamma_T$. Before starting, note that it is clear from the definitions that $\mu_T$ is transverse-traceless away from the center of the neck region Q in the middle of the cylinder $C_T$. Furthermore, by choosing the cutoff functions carefully, we may also assume that $\mu_T$ is trace-free on all of $\Sigma_T$; however, its divergence is almost surely nonzero, but at least is supported in the annulus Q. There is a well-known procedure for producing a correction term to kill this divergence. We recall this now, and refer to [15] or [33] for more details. Let γ be a metric and X a vector field on the three-manifold Σ, and write
\[ \mathcal{D}X = \frac{1}{2} \mathcal{L}_X \gamma - \frac{1}{3} (\operatorname{div} X) \gamma. \]
Equivalently, in local coordinates,
\[ (\mathcal{D}X)_{jk} = \frac{1}{2} \left( X_{j;k} + X_{k;j} \right) - \frac{1}{3} \operatorname{div}(X) \gamma_{jk}. \]
The first order operator $\mathcal{D} = \mathcal{D}_\gamma$ defined by this expression maps vector fields to trace-free symmetric (0, 2) tensors and has the property that $\mathcal{D}X = 0$ if and only if X is a conformal Killing field. The formal adjoint of $\mathcal{D}$ is $\mathcal{D}^* = -\operatorname{div}$, and the second order operator $L = \mathcal{D}^* \circ \mathcal{D}$ is self-adjoint, nonnegative and elliptic. This operator is often referred to as the “vector Laplacian”. Now let $\gamma = \gamma_T$, with $\mathcal{D}$ and L the associated operators. Let W be the vector field associated to $\operatorname{div} \mu_T$ using the natural duality between vector fields and 1-forms. Suppose that we can solve the equation LX = W. Then writing $\sigma_T = \mathcal{D}X$ we compute that
\[ \operatorname{div}(\mu_T - \sigma_T) = W - \operatorname{div} \mathcal{D}X = W - LX = 0, \]
and so $\mu_T - \sigma_T$ is divergence-free. Since $\mu_T$ and $\sigma_T$ are both trace-free, we have produced the desired correction term. To implement this strategy properly, we must show that when T is large enough, the operator L is invertible, and moreover that the solution X, and hence $\mathcal{D}X$, is much smaller than $\mu_T$, so that $\sigma_T$ can honestly be regarded as a small perturbation. Note that since W is a divergence and hence orthogonal to the cokernel of L, we can always solve LX = W (but without an estimate on the solution). The main issue is to prove that the inverse G to L on the manifold $\Sigma_T$, when acting between appropriate function spaces, has norm bounded at worst by some polynomial in T. (This is sufficient because the error term $\operatorname{div} \mu_T$ decays exponentially.) When γ is conformally flat in the balls $B_j$, it is possible to choose the conformal factor ψ used to define $\gamma_c$ slightly differently so that $\gamma_T$ is the product metric on the cylinder $C_T = [-T/2, T/2] \times S^2$. We proceed in this case by an analysis of the operators $\mathcal{D}$ and L on the complete cylinder $\mathbb{R} \times S^2$ as well as on the finite cylindrical piece $C_T$, using separation of variables and explicit calculations with the resulting ODEs, to reach the conclusion that the inverse G to L exists and has norm (as an operator on $L^2(\Sigma_T)$) blowing up no faster than $T^2$. When γ is not conformally flat in these balls, an additional perturbation argument is required to reach the same conclusion. The basic goal in these arguments is to show that the lowest Dirichlet eigenvalue of L on $C_T$ is of order $T^{-2}$. Afterwards, the same conclusion is reached for the operator L on $\Sigma_T$.
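3.1. The conformal Killing operator on the cylinder

3.1.1. The decomposition. We first derive the explicit form of $L$ on the cylinder $\mathbb{R} \times S^2$ with product metric $ds^2 + h$, where $s$ is the linear coordinate on $\mathbb{R}$ and $h$ is the standard round metric on $S^2$. Any vector field $X$ on the cylinder may be written $X = f\,\partial_s + Y(s)$, where $f$ is a function on the cylinder and $Y(s)$ is a vector field on the sphere $\{s\} \times S^2$. Write $DX = S(X) - \frac{1}{3}(\mathrm{div}\,X)\,\gamma$, and choose any local coordinates $\theta = (\theta_1, \theta_2)$ on $S^2$. Then for $i, j = 1, 2$,
\[
S(f\partial_s)_{00} = \partial_s f, \qquad S(f\partial_s)_{0i} = \tfrac{1}{2} f_i, \qquad S(f\partial_s)_{ij} = 0
\]
and $\mathrm{div}(f\partial_s) = \partial_s f$. Therefore, exhibited as a matrix,
\[
D(f\partial_s) = \begin{pmatrix} \frac{2}{3}\,\partial_s f & \frac{1}{2}\,\nabla_\theta f \\[2pt] \frac{1}{2}\,\nabla_\theta f & -\frac{1}{3}(\partial_s f)\,h \end{pmatrix},
\]
where $\nabla_\theta$ represents the gradient of $f$ on the sphere. Hence, with the convention $\Delta_\theta = -\mathrm{div}\,\nabla_\theta$,
\[
L(f\partial_s) = -\mathrm{div}\left(D(f\partial_s)\right)
= \begin{pmatrix} -\frac{2}{3}\partial_s^2 f + \frac{1}{2}\Delta_\theta f \\[2pt] -\frac{1}{2}\partial_s(\nabla_\theta f) + \frac{1}{3}\nabla_\theta(\partial_s f) \end{pmatrix}
= \begin{pmatrix} -\frac{2}{3}\partial_s^2 f + \frac{1}{2}\Delta_\theta f \\[2pt] -\frac{1}{6}\partial_s(\nabla_\theta f) \end{pmatrix}.
\]
Next,
\[
S(Y)_{00} = 0, \qquad S(Y)_{0i} = \tfrac{1}{2}\,\partial_s(Y_i), \qquad S(Y)_{ij} = S_\theta(Y)_{ij},
\]
where by definition the conformal Killing operator $D_h$ on the sphere (written $D_\theta$ below) decomposes as $D_h Y = S_\theta(Y) - \frac{1}{2}(\mathrm{div}_\theta Y)\,h$. Using also that $\mathrm{div}(Y) = \mathrm{div}_\theta(Y)$, we get
\[
D(Y) = \begin{pmatrix} -\frac{1}{3}\,\mathrm{div}_\theta(Y) & \frac{1}{2}\,\partial_s Y \\[2pt] \frac{1}{2}\,\partial_s Y & D_\theta(Y) + \frac{1}{6}\,\mathrm{div}_\theta(Y)\,h \end{pmatrix},
\]
and then that
\[
L(Y) = -\mathrm{div}(D(Y))
= \begin{pmatrix} \frac{1}{3}\partial_s(\mathrm{div}_\theta(Y)) - \frac{1}{2}\mathrm{div}_\theta(\partial_s Y) \\[2pt] -\frac{1}{2}\partial_s^2 Y - \mathrm{div}_\theta(D_\theta(Y)) - \frac{1}{6}\nabla_\theta\,\mathrm{div}_\theta(Y) \end{pmatrix}
= \begin{pmatrix} -\frac{1}{6}\partial_s(\mathrm{div}_\theta(Y)) \\[2pt] -\frac{1}{2}\partial_s^2 Y + L_\theta(Y) - \frac{1}{6}\nabla_\theta\,\mathrm{div}_\theta(Y) \end{pmatrix}.
\]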
It is possible to express this operator in a somewhat more familiar form. To do this, we identify vector fields and 1-forms on S 2 in the usual way using the metric h. Thus, ∇θ and divθ correspond to dθ and −δθ , respectively. We still let Y denote the 1-form
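associated to the vector field $Y$, hopefully without causing undue confusion. Using these identifications,
\[
L(f\,ds + Y(s)) = \begin{pmatrix} -\frac{2}{3}\partial_s^2 f + \frac{1}{2}\Delta_\theta f + \frac{1}{6}\partial_s(\delta_\theta Y) \\[2pt] -\frac{1}{2}\partial_s^2 Y + L_\theta(Y) + \frac{1}{6}\,d_\theta\delta_\theta(Y) - \frac{1}{6}\partial_s(d_\theta f) \end{pmatrix}.
\]
This expression may be simplified using two separate Weitzenböck formulæ,
\[
L_\theta = \tfrac{1}{2}\nabla^*\nabla - \tfrac{1}{2} \qquad\text{and}\qquad \Delta_\theta = \nabla^*\nabla + 1,
\]
both of which use the fact that the curvature tensor on $S^2$ is constant. The first of these relates $L_\theta$ to the “rough Laplacian” $\nabla_\theta^*\nabla_\theta$ (cf. [33]) and the second is the more familiar one relating this rough Laplacian to the Hodge Laplacian $\Delta_\theta = d_\theta\delta_\theta + \delta_\theta d_\theta$. These combine to give
\[
L_\theta = \tfrac{1}{2}\Delta_\theta - 1,
\]
and thus we finally arrive at the expression
\[
L\begin{pmatrix} f \\ Y \end{pmatrix} = \begin{pmatrix} -\frac{2}{3}\partial_s^2 + \frac{1}{2}\Delta_\theta & \frac{1}{6}\,\partial_s\delta_\theta \\[2pt] -\frac{1}{6}\,\partial_s d_\theta & -\frac{1}{2}\partial_s^2 + \frac{2}{3}\,d_\theta\delta_\theta + \frac{1}{2}\,\delta_\theta d_\theta - 1 \end{pmatrix}\begin{pmatrix} f \\ Y \end{pmatrix}. \tag{13}
\]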
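As a check on the last step, the two Weitzenböck formulæ quoted above can be combined explicitly. The short derivation below is only a sketch; it uses nothing beyond those two identities and the definition of the Hodge Laplacian on 1-forms.

```latex
\begin{align*}
L_\theta &= \tfrac12 \nabla^*\nabla - \tfrac12
          = \tfrac12\left(\Delta_\theta - 1\right) - \tfrac12
          = \tfrac12 \Delta_\theta - 1, \\
L_\theta + \tfrac16\, d_\theta\delta_\theta
         &= \tfrac12\left(d_\theta\delta_\theta + \delta_\theta d_\theta\right) - 1
            + \tfrac16\, d_\theta\delta_\theta
          = \tfrac23\, d_\theta\delta_\theta + \tfrac12\, \delta_\theta d_\theta - 1 .
\end{align*}
```

The second line is the lower right entry of the matrix in (13), up to the common term $-\frac12\partial_s^2$.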
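3.1.2. Separation of variables. To analyze this operator $L$ on the cylinder in more detail, we introduce the eigenfunction expansion on the $S^2$ factor. Let $\{\phi_k\}$ be an orthonormal set of eigenfunctions for the scalar Laplacian on $S^2$, so that $\Delta_\theta\phi_k = \lambda_k\phi_k$ and each $\lambda_k = j(j+1)$ for some $j \in \mathbb{N}$. If $*$ is the Hodge star on $S^2$, then $\{\lambda_k^{-1/2}\,d_\theta\phi_k,\ \lambda_k^{-1/2}\,\delta_\theta * \phi_k\}$ is an orthonormal basis of eigenfunctions for the Hodge Laplacian on 1-forms. Set $\phi = \phi_k$ and $\lambda = \lambda_k$ with $\lambda_k > 0$, for some $k$, and consider a 1-form of the form
\[
X = f(s,\theta)\,ds + Y(s) = u(s)\,\phi\,ds + v(s)\,\frac{d_\theta\phi}{\sqrt\lambda} + w(s)\,\frac{\delta_\theta * \phi}{\sqrt\lambda},
\]
or equivalently, the associated dual vector field. Then
\[
LX = \left(-\frac{2}{3}u'' + \frac{\lambda}{2}u + \frac{\sqrt\lambda}{6}v'\right)\phi\,ds
+ \left(-\frac{1}{2}v'' + \left(\frac{2}{3}\lambda - 1\right)v - \frac{\sqrt\lambda}{6}u'\right)\frac{d_\theta\phi}{\sqrt\lambda}
+ \left(-\frac{1}{2}w'' + \left(\frac{1}{2}\lambda - 1\right)w\right)\frac{\delta_\theta * \phi}{\sqrt\lambda},
\]
and so $L$ acts on the column vector $(u, v, w)^t$ by
\[
\begin{pmatrix} -\frac{2}{3} & 0 & 0 \\ 0 & -\frac{1}{2} & 0 \\ 0 & 0 & -\frac{1}{2} \end{pmatrix}\partial_s^2
+ \begin{pmatrix} 0 & \frac{\sqrt\lambda}{6} & 0 \\ -\frac{\sqrt\lambda}{6} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\partial_s
+ \begin{pmatrix} \frac{\lambda}{2} & 0 & 0 \\ 0 & \frac{2}{3}\lambda - 1 & 0 \\ 0 & 0 & \frac{1}{2}\lambda - 1 \end{pmatrix}.
\]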
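There is one extra case, when $\lambda = \lambda_0 = 0$; then $X = u(s)\,ds$ and $LX = -\frac{2}{3}u''(s)\,ds$, so we write
\[
L_0 = -\frac{2}{3}\,\partial_s^2 .
\]
When $j \ge 1$, $L_j$ uncouples as $L_j = L'_j \oplus L''_j$, where
\[
L'_j = \begin{pmatrix} -\frac{2}{3} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix}\partial_s^2
+ \begin{pmatrix} 0 & \frac{\sqrt\lambda}{6} \\ -\frac{\sqrt\lambda}{6} & 0 \end{pmatrix}\partial_s
+ \begin{pmatrix} \frac{\lambda}{2} & 0 \\ 0 & \frac{2}{3}\lambda - 1 \end{pmatrix}
\]
acts on $(u, v)^t$ and
\[
L''_j = -\frac{1}{2}\,\partial_s^2 + \left(\frac{1}{2}\lambda - 1\right)
\]
acts on $w$. Thus altogether,
\[
L = L_0 \oplus \bigoplus_{j \ge 1}\left(L'_j \oplus L''_j\right).
\]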
3.1.3. Temperate solutions. The first application of these formulæ concerns the solutions of $LX = 0$ on the cylinder which do not grow exponentially in either direction.

Proposition 1. The space of temperate solutions to $LX = 0$ on the cylinder $\mathbb{R} \times S^2$ is 8-dimensional and is spanned by the 2-dimensional family of $\theta$-independent radial vector fields $(a + bs)\,\partial_s$, $a, b \in \mathbb{R}$, and the 6-dimensional family of vector fields which are dual to the 1-forms $(c_i + d_i s)\,\delta_\theta * \phi_i$, $c_i, d_i \in \mathbb{R}$, where $\phi_i = x_i|_{S^2}$, $i = 1, 2, 3$.

Proof. It is straightforward to check that these bounded or linearly growing vector fields are all in the nullspace of $L$. In fact, using the eigendecomposition of $L$ above, $L_0 u = -\frac{2}{3}u'' = 0$ implies $u(s) = a + bs$, and this gives the first family. Next, for $j = 1$ (i.e. $\lambda = 2$), $L''_1 w = -\frac{1}{2}w'' = 0$ implies $w(s) = c + ds$; the eigenfunctions here are the restrictions of linear functions on $\mathbb{R}^3$ to $S^2$, and so this gives the other family.

It remains to show that these are the only temperate solutions. It is clear that when $j \ge 2$, hence $\lambda \ge 6$, all solutions of $L''_j w = 0$ grow exponentially in one direction or the other. On the other hand, consider the homogeneous system for $L'_j$ when $j \ge 1$, which is
\[
\frac{2}{3}u'' - \frac{\lambda}{2}u - \frac{\sqrt\lambda}{6}v' = 0, \qquad
\frac{1}{2}v'' - \left(\frac{2}{3}\lambda - 1\right)v + \frac{\sqrt\lambda}{6}u' = 0,
\]
where $\lambda = j(j+1)$. This is a constant coefficient system, and solutions have the form
\[
\begin{pmatrix} u \\ v \end{pmatrix} = e^{\rho s} z_1, \quad\text{or}\quad s\,e^{\rho s} z_1 + e^{\rho s} z_2, \qquad z_1, z_2 \in \mathbb{C}^2,
\]
where $\rho$ is a root of the indicial equation
\[
\det\begin{pmatrix} \frac{2}{3}\rho^2 - \frac{\lambda}{2} & -\frac{\sqrt\lambda}{6}\rho \\[2pt] \frac{\sqrt\lambda}{6}\rho & \frac{1}{2}\rho^2 - \left(\frac{2}{3}\lambda - 1\right) \end{pmatrix}
= \frac{1}{3}\left(\rho^4 + 2(1-\lambda)\rho^2 + \lambda^2 - \frac{3}{2}\lambda\right) = 0.
\]
Solutions are temperate only when $\mathrm{Re}\,\rho = 0$, but it is easy to check that this polynomial for $\rho$ has no purely imaginary root, since $\lambda = j(j+1)$, $j \ge 1$. The proof is complete. □

Remark 2. The space of conformal Killing fields on $S^3$ is 10-dimensional, and is generated by the 6-dimensional family of rotations and the 4-dimensional family of dilations. Amongst all of these, only the vector fields vanishing at the two antipodal points (corresponding to the infinite ends on the cylinder) give rise to temperate conformal Killing fields on $\mathbb{R} \times S^2$. Of course, these are just the generators for the 3-dimensional family of rotations of the orthogonal $S^2$ (i.e. corresponding to $\delta_\theta * \phi_i$, $i = 1, 2, 3$, as above) and the dilation fixing these antipodal points ($\partial_s$).

3.1.4. The lowest eigenvalue of $L$ on $[-T/2, T/2] \times S^2$. The next application concerns the lowest eigenvalue of the operator $L$ with Dirichlet boundary conditions $X = 0$ on the finite piece of the cylinder $C_T = [-T/2, T/2] \times S^2$.

Proposition 2. Let $\lambda_0 = \lambda_0(T)$ denote the first Dirichlet eigenvalue for $L$ on $C_T = [-T/2, T/2] \times S^2$. Then
\[
\lambda_0(T) \ge \frac{C}{T^2}
\]
for some constant $C$ independent of $T$.

Proof. This estimate is clearly sharp for the scalar ordinary differential operators $L_0 = -\frac{2}{3}\partial_s^2$ and $L''_1 = -\frac{1}{2}\partial_s^2$. On the other hand, when $j \ge 2$, the lowest eigenvalue of $L''_j$ converges to $\frac{1}{2}j(j+1) - 1$ as $T \to \infty$, hence is bounded away from zero. We finally show that the lowest eigenvalue for $L'_j$ is also bounded away from zero when $j \ge 1$. In fact,
\[
\left\langle L'_j\begin{pmatrix} u \\ v \end{pmatrix}, \begin{pmatrix} u \\ v \end{pmatrix}\right\rangle
= \int_{-T/2}^{T/2}\left[\frac{2}{3}(u')^2 + \frac{1}{2}(v')^2 + \frac{\sqrt\lambda}{6}(uv' - u'v) + \frac{\lambda}{2}u^2 + \left(\frac{2}{3}\lambda - 1\right)v^2\right]ds
\]
\[
= \int_{-T/2}^{T/2}\left[\frac{2}{3}\left(u' - \frac{\sqrt\lambda}{8}v\right)^2 + \frac{1}{2}\left(v' + \frac{\sqrt\lambda}{6}u\right)^2 + \frac{35}{72}\lambda u^2 + \left(\frac{63}{96}\lambda - 1\right)v^2\right]ds.
\]
Since $\lambda \ge 2$, the coefficients $\frac{35}{72}\lambda$ and $\frac{63}{96}\lambda - 1$ are at least $\frac{35}{36}$ and $\frac{5}{16}$ respectively, so this is bounded below by $\frac{5}{16}\int_{-T/2}^{T/2}\left(u^2 + v^2\right)ds$, and the proposition is proved. □
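The claim in the proof of Proposition 1 that the indicial polynomial has no purely imaginary root can be checked numerically. The sketch below is an illustration rather than part of the argument: it treats $\rho^4 + 2(1-\lambda)\rho^2 + \lambda^2 - \frac{3}{2}\lambda$ as a quadratic in $y = \rho^2$, whose roots are $y = (\lambda - 1) \pm \sqrt{1 - \lambda/2}$, and reports the smallest $|\mathrm{Re}\,\rho|$. The function names are hypothetical.

```python
import cmath


def indicial_poly(rho, lam):
    """rho^4 + 2(1 - lam) rho^2 + lam^2 - (3/2) lam."""
    return rho**4 + 2 * (1 - lam) * rho**2 + lam**2 - 1.5 * lam


def indicial_roots(lam):
    """All four roots rho, via the quadratic in y = rho^2."""
    disc = cmath.sqrt(1 - lam / 2)          # sqrt((lam-1)^2 - lam^2 + 3 lam / 2)
    roots = []
    for y in (lam - 1 + disc, lam - 1 - disc):
        r = cmath.sqrt(y)
        roots.extend([r, -r])
    return roots


def min_abs_real_part(lam):
    """Distance of the root set from the imaginary axis."""
    return min(abs(r.real) for r in indicial_roots(lam))


if __name__ == "__main__":
    for j in range(1, 8):
        lam = j * (j + 1)
        print(j, lam, round(min_abs_real_part(lam), 6))
```

For $\lambda = 2$ the discriminant vanishes and $\rho = \pm 1$ is a double root; for $\lambda \ge 6$ the values of $y$ are complex with real part $\lambda - 1 > 0$, so no root lies on the imaginary axis.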
3.1.5. Growth estimates on $C_T$. To conclude this analysis of $L$ on the cylinder, we show that if $X$ is an eigenfunction with “very small” eigenvalue, then its size in the interior of $C_T$ is controlled by its size on the ends of $C_T$. The next result does this directly for the product metric. After that we prove a more technical estimate for the inhomogeneous eigenvalue problem which will be used later to handle the general case by perturbation.

Proposition 3. Suppose that $LX = \mu X$, where $L$ is computed with respect to the product metric and $\mu < \frac{1}{8}\lambda_0(T)$. Suppose also that
\[
\int_{S^2}\left(|X(-T/2, \theta)|^2 + |X(T/2, \theta)|^2\right)d\theta \le C_1 .
\]
Then for any $a \in [-T/2 + 1, T/2 - 1]$,
\[
\int_{a-1}^{a+1}\int_{S^2}|X(s, \theta)|^2\,d\theta\,ds \le C_2 ,
\]
where $C_2$ depends only on $C_1$, but not on $T$ or $a$, and where $C_2 \to 0$ as $C_1 \to 0$.

Proof. It suffices to check this estimate on each of the components in the decomposition of $L$. For the scalar operators $L_0$ and $L''_j$ this estimate may be verified using the explicit solutions of the equation (though this can easily be done by other standard and less computational methods too). So as before, it suffices to focus on the system $L'_j$, $j \ge 1$. Write $L'_j = -A\partial_s^2 + B\partial_s + C$, where
\[
A = \begin{pmatrix} 2/3 & 0 \\ 0 & 1/2 \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & \sqrt\lambda/6 \\ -\sqrt\lambda/6 & 0 \end{pmatrix}, \qquad
C = \begin{pmatrix} \lambda/2 & 0 \\ 0 & (2/3)\lambda - 1 \end{pmatrix},
\]
and, as usual, $\lambda = j(j+1)$. Now write $X = (u, v)^t$ and define $f(s) = \langle AX, X\rangle$. Then
\[
\frac{1}{2}\partial_s^2 f(s) = \langle AX'', X\rangle + \langle AX', X'\rangle .
\]
Using the equation $L'_j X = \mu X$ and the skew-symmetry of $B$, this becomes
\[
\frac{1}{2}\partial_s^2 f(s) = -\langle X', BX\rangle + \langle CX, X\rangle + \langle AX', X'\rangle - \mu\langle X, X\rangle ,
\]
which can then be rewritten as
\[
\frac{1}{2}\partial_s^2 f(s) = \left|A^{1/2}X' - \frac{1}{2}A^{-1/2}BX\right|^2 + \langle DX, X\rangle ,
\]
where
\[
D = \frac{1}{4}BA^{-1}B + C - \mu I = \begin{pmatrix} \frac{35}{72}\lambda - \mu & 0 \\ 0 & \frac{63}{96}\lambda - 1 - \mu \end{pmatrix}.
\]
But $D$ is positive definite when $\lambda \ge 2$ and $T$ is large enough that subtracting $\mu$ does not destroy positivity, and so we conclude that $f''(s) \ge 0$, and hence that the $L^2$ norm of $X$ at $s = \pm T/2$ controls the $L^2$ norm of $X$ over any strip $a - 1 \le s \le a + 1$. □
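The zeroth order matrix $D = \frac{1}{4}BA^{-1}B + C - \mu I$ in the proof of Proposition 3, and with it the coefficients $\frac{35}{72}\lambda$ and $\frac{63}{96}\lambda - 1$ appearing in the proof of Proposition 2, can be confirmed by direct matrix multiplication. The following is a minimal sketch with hypothetical helper names, using plain Python lists for the $2 \times 2$ matrices $A$, $B$, $C$ defined above.

```python
import math


def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def zeroth_order_matrix(lam, mu=0.0):
    """D = (1/4) B A^{-1} B + C - mu I for L'_j = -A d_s^2 + B d_s + C."""
    b = math.sqrt(lam) / 6
    A_inv = [[1.5, 0.0], [0.0, 2.0]]          # inverse of diag(2/3, 1/2)
    B = [[0.0, b], [-b, 0.0]]                 # skew-symmetric first order part
    C = [[lam / 2, 0.0], [0.0, 2 * lam / 3 - 1]]
    BAB = matmul(B, matmul(A_inv, B))
    return [[0.25 * BAB[i][j] + C[i][j] - (mu if i == j else 0.0)
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    for j in range(1, 5):
        lam = j * (j + 1)
        D = zeroth_order_matrix(lam)
        print(lam, D[0][0], 35 * lam / 72, D[1][1], 63 * lam / 96 - 1)
```

At $\lambda = 2$ the diagonal entries are $35/36$ and $5/16$, so $D$ remains positive definite for any $\mu < 5/16$.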
Our second result here is more specialized, and is the precise additional ingredient required later to handle the nonproduct case. We let $\|\cdot\|_{0,\alpha,\sigma}$ denote the $C^{0,\alpha}$ norm over the set $[\sigma, \sigma + 1] \times S^2$.

Proposition 4. Let $L$ be computed relative to the product metric on $C_T$, and suppose $LX = \mu X + F$, where $X(\pm T/2) = 0$ and $\mu T^2 < \nu_0$ for some sufficiently small constant $\nu_0$ independent of $T$. Suppose also that for all $|s| \le T/2$,
\[
\|F(s, \theta)\|_{0,\alpha,\sigma} \le f(\sigma)\,e^{-T/2}\cosh\sigma ,
\]
where $|f(\sigma)| \le 1$ for all $\sigma$. Recalling that $F$ implicitly depends on $T$, we also assume that for any $\varepsilon > 0$ and $A > 0$, there exists a $T_0$ such that when $T \ge T_0$ we have $|f(\sigma)| \le \varepsilon$ when $T/2 - A \le |\sigma| \le T/2$. Then, for any $\eta > 0$ we have $|X| \le \eta$ on all of $C_T$, when $T$ is large enough.

In other words, if $X$ solves this inhomogeneous eigenvalue problem on $C_T$, with vanishing Dirichlet data, if $\mu$ is much smaller than $1/T^2$, and if the forcing term $F$ is uniformly small near the ends of $C_T$ and bounded by $e^{-T/2}\cosh s$ on the middle portion of this cylinder, uniformly as $T \to \infty$, then $X$ is uniformly small on all of $C_T$.

Proof. We separate variables and write the equation as $L_j X_j = F_j$ for $j \ge 0$. For $j > 0$ this equation splits further into $L'_j X'_j = F'_j$ and $L''_j X''_j = F''_j$, as in the previous subsection. None of the primes here correspond to derivatives. We have shown that the operators $L'_j$ with $j \ge 1$ or $L''_j$ with $j \ge 2$ have spectrum bounded below by a positive constant as $T \to \infty$. For any of the corresponding equations, if the conclusion were false then we would be able to produce, by the usual arguments, a nontrivial bounded function $X'$ or $X''$ such that $L'_j X' = 0$ or $L''_j X'' = 0$, respectively, defined on the complete cylinder $\mathbb{R} \times S^2$. This contradicts the strictly positive lower bound for the spectrum.

For the remaining components, it suffices to prove the assertion for the scalar problem $\partial_s^2 u + \alpha^2 u = F$, where $\alpha T$ is sufficiently small. For notational convenience we shall use the shifted variable $t = s + T/2$, so that $F(t) = f(t)\,e^{-T/2}\cosh(t - T/2)$, where this function $f$ is a translate of the one above. We may write the solution explicitly:
\[
u(t) = -\frac{\sin\alpha(T - t)}{\alpha\sin\alpha T}\int_0^t \sin(\alpha\tau)\,F(\tau)\,d\tau
- \frac{\sin\alpha t}{\alpha\sin\alpha T}\int_t^T \sin\alpha(T - \tau)\,F(\tau)\,d\tau .
\]
Because $\alpha T$ is so small, we may approximate each of the sines with their first order Taylor approximations; using the estimate for $F$ too we obtain
\[
|u(t)| \le C\left[\left(1 - \frac{t}{T}\right)\int_0^t \tau\cosh(\tau - T/2)\,e^{-T/2}f(\tau)\,d\tau
+ t\int_t^T\left(1 - \frac{\tau}{T}\right)\cosh(\tau - T/2)\,e^{-T/2}f(\tau)\,d\tau\right].
\]
Finally, we must use the separate upper bounds for $f(\tau)$ in the various regions $0 \le \tau \le A$, $A \le \tau \le T - A$ and $T - A \le \tau \le T$. The main fact used in this estimation is that the function $h(t) = t\,(1 - t/T)\,|\sinh(t - T/2)|\,e^{-T/2}$ is uniformly bounded independently of $T$, attains its maximum near $t = 1$ and $t = T - 1$, and for any $\delta > 0$ there exists an $A > 0$ such that $h(t) \le \delta$, uniformly in $T$, when $A \le t \le T - A$. Together with the bounds for $f$, we can derive that $u$ is uniformly small. Details are lengthy but straightforward, and are left to the reader. □
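The three properties of $h(t) = t\,(1 - t/T)\,|\sinh(t - T/2)|\,e^{-T/2}$ used at the end of the proof of Proposition 4 are easy to probe numerically. The sketch below is illustrative only: the grid step and the sample values of $T$ and $A$ are arbitrary choices, and the function names are hypothetical. The product $|\sinh x|\,e^{-T/2}$ is written as a difference of two decaying exponentials so that large $T$ does not overflow.

```python
import math


def h(t, T):
    """h(t) = t (1 - t/T) |sinh(t - T/2)| exp(-T/2), evaluated stably."""
    x = abs(t - T / 2)
    # |sinh(x)| * exp(-T/2) = (exp(x - T/2) - exp(-x - T/2)) / 2, with x <= T/2
    decay = 0.5 * (math.exp(x - T / 2) - math.exp(-x - T / 2))
    return t * (1 - t / T) * decay


def summary(T, A, step=0.01):
    """Return (peak on [0, T/2], its location, max of h on [A, T - A])."""
    n = int(round(T / step))
    samples = [(h(T * k / n, T), T * k / n) for k in range(n + 1)]
    peak, t_peak = max(s for s in samples if s[1] <= T / 2)
    middle = max(v for v, t in samples if A <= t <= T - A)
    return peak, t_peak, middle


if __name__ == "__main__":
    for T in (50.0, 100.0, 400.0):
        print(T, summary(T, A=20.0))
```

On $[0, T/2]$ one has $h(t) \le \frac{1}{2}t\,e^{-t} \le \frac{1}{2e}$, which is where the uniform bound and the location of the maximum near $t = 1$ come from; the same holds near $t = T - 1$ by the symmetry $t \mapsto T - t$.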
3.2. The lowest eigenvalue of $L$ on $\Sigma_T$

Theorem 2. The lowest eigenvalue $\lambda_0 = \lambda_0(T)$ for $L$ on $\Sigma_T$ satisfies $\lambda_0 \ge CT^{-2}$ for some constant $C > 0$ which is independent of $T$.

Proof of Theorem 2 when the restriction of $\gamma_T$ to $C_T$ is the product metric. If this result were false, it would be possible to find a sequence of values $T_j \to \infty$ such that for the operators $L_{T_j}$ associated to the metrics $\gamma_{T_j}$, we would have $\lambda_0(T_j)\,T_j^2 \to 0$. We now show this leads to a contradiction. Denote the sequence of metrics and operators by $\gamma_j$ and $L_j$, respectively. Let $X_j$ be any one of the eigenfunctions for $L_j$ corresponding to the eigenspace with eigenvalue $\lambda_0(T_j)$. By rescaling, assume that $\sup_{\Sigma_{T_j}}|X_j| = 1$.

The coefficients of $L_j$ converge uniformly on any compact subset of $\Sigma^*$ to the coefficients of the operator $L_c$ corresponding to the metric $\gamma_c$. Since $X_j$ is uniformly bounded, we may extract a subsequence $X_{j'}$ which converges uniformly on compact sets in $\Sigma^*$ to a limit $X$. This vector field is a solution of $L_c X = 0$ and satisfies $|X| \le 1$.

Our first claim is that $X$ is nontrivial. To see this, relabel the subsequence as $X_j$ again. Suppose that the supremum of $|X_j|$ is attained at some point $q_j \in \Sigma_{T_j}$. If $q_j$ lies in $\Sigma_{T_j} \setminus C_{T_j}$ for infinitely many $j$, then again possibly passing to a subsequence $q_j \to q \in \Sigma^*$. Since $|X_j(q_j)| \equiv 1$, we get $|X(q)| = 1$, and so $X \not\equiv 0$. On the other hand, suppose that $q_j \in C_{T_j}$ for all but finitely many $j$. Note that by Schauder theory, $|X_j|$ remains bounded away from zero on a strip $[a_j - \eta, a_j + \eta] \times S^2 \subset C_{T_j}$ of width not depending on $j$, about $q_j$. Furthermore, $\lambda_0(T_j) < \frac{1}{8}T_j^2$ for $j$ large enough, and hence Proposition 3 (and elliptic theory) implies that $|X_j(q'_j)| \ge \alpha > 0$ for some point $q'_j \in \partial C_{T_j}$ and some $\alpha$ not depending on $j$. Therefore in this case too we may conclude that the limit $X \not\equiv 0$ on $\Sigma^*$.

By the analysis of §3.1, if $X$ is any solution of $L_c X = 0$ on $\Sigma^*$ (with respect to the metric $\gamma_c$) which is bounded along the cylindrical end, then it can be written as a sum $X = X_\infty + Z$, where $X_\infty$ extends to a bounded solution of $LX_\infty = 0$ on the entire cylinder $\mathbb{R} \times S^2$ (with product metric) and $Z$ decays exponentially as $t \to \infty$.

Our second claim is that, assuming the nondegeneracy condition, so that there are no conformal Killing fields vanishing at both points $p_j$, any bounded vector field $X$ on $\Sigma^*$ which satisfies $L_c X = 0$ necessarily vanishes identically. To see this, observe from our earlier result about elements in the nullspace of $L_c$ which are bounded on the cylinder that all such solutions are also annihilated by $D$. (This is not true if we also admit the linearly growing solutions.) Thus the term $X_\infty$ in the decomposition of $X$ above is annihilated by $D$, and hence $DX = DZ$ decays exponentially along the cylindrical ends. Therefore, no boundary contribution appears in the integration by parts
\[
0 = \langle L_c X, X\rangle = \langle D^* DX, X\rangle = \|DX\|^2 ,
\]
so that $DX \equiv 0$. Thus $X$ is a conformal Killing field on $\Sigma^*$ which is bounded with respect to the metric $\gamma_c$; changing back to the old polar variables $r_j = e^{-t_j}$ and $\theta$ on each ball $B_j$ shows that $X$ is bounded with respect to the original metric $\gamma$ on $\Sigma$, and in fact vanishes at both points $p_j$. It is therefore in the (distributional) nullspace of $L$ on $\Sigma$, and hence by elliptic regularity extends smoothly to a solution on all of $\Sigma$. It would thus be a nontrivial conformal Killing field on the manifold $\Sigma$, vanishing at both $p_1$ and $p_2$, and we have assumed that such a field does not exist. This completes the proof. □
Proof of Theorem 2 in general. We may follow the outline of the preceding proof at almost every point. Thus we suppose the result false and choose a sequence of eigenfunctions $X_j$ with corresponding very small eigenvalues $\mu_j$, for some sequence $T_j \to \infty$. Supposing that $|X_j| \le 1$, we again attempt to pass to a limit and obtain a contradiction. If $|X_j|$ remains bounded away from zero on some fixed compact set near the ends, then we argue as before. The difficulty, however, is when $|X_j|$ remains bounded away from zero in some portion of the cylinder $C_{T_j}$ which recedes from either end. To handle this, we would like to use the same sort of argument as in the proof for the product case. Namely, we would like to assert that if $X_j$ tends to zero near the ends of $C_{T_j}$, then it must tend to zero uniformly along all of this cylindrical segment. We shall now prove this using Proposition 4.

The metric $\gamma_T$ differs from the product metric on $C_T$ by a term of size $e^{-T/2}\cosh s$, and therefore the operator $L$ computed relative to $\gamma_T$ differs from the corresponding operator $L_0$, computed relative to the product metric, by an error term $E_T$ with coefficients of this same size. We may rewrite the equation for $X_j$ as
\[
L_0 X_j = \mu_j X_j + F_j ,
\]
where $F_j = -E_{T_j}X_j$. Now suppose that we are in the difficult situation: that $X_j$ is uniformly small on any fixed neighborhood of the ends of the cylinder. Then it is clear that $F_j$ satisfies precisely the estimates in the statement of Proposition 4. The solution $X_j$ may be decomposed into a sum $U_j + V_j$, where $U_j$ satisfies the homogeneous problem $L_0 U_j = \mu_j U_j$ and $U_j(\pm T/2) = X_j(\pm T/2)$, while $V_j$ satisfies the inhomogeneous problem with vanishing boundary conditions. Clearly both of these summands are uniformly small near the ends of $C_{T_j}$. The uniform smallness of $U_j$ on the entire cylinder now follows from Proposition 3, while the uniform smallness of $V_j$ follows from Proposition 4. Therefore it is impossible that $X_j$ is large in the middle of $C_{T_j}$ but small at the ends. This completes the proof. □

3.3. Estimates on Hölder spaces. We have shown that the lowest eigenvalue of the operator $L$ on $\Sigma_T$ is bounded below by $CT^{-2}$, uniformly as $T \to \infty$. Hence the norm of $G = L^{-1}$, as a bounded operator on $L^2(\Sigma_T)$, blows up no faster than $CT^2$. However, it is more convenient for us to use Hölder spaces because of the nonlinear nature of the problem, and we show now that this $L^2$ estimate may be converted to a slightly weaker estimate on appropriately defined Hölder spaces.

Definition 2. Consider the cover of $\Sigma_T$ given by the sets $\Sigma^*_{R/2}$ and $C_T$, and for any vector field $X$ on $\Sigma_T$, let $X'$ and $X''$ denote the restrictions of $X$ to these two subsets, respectively. Also, for $-T/2 + 1 \le a \le T/2 - 1$, define $N_a = [a - 1, a + 1] \times S^2 \subset C_T$. Then for any $k \in \mathbb{N}$ and $\alpha \in (0, 1)$, define
\[
\|X\|_{k,\alpha} = \|X'\|_{k,\alpha} + \sup_{-T/2+1 \le a \le T/2-1}\|X''\|_{k,\alpha,N_a} .
\]
The local Hölder norms here are computed with respect to the metric $\gamma_T$.

Corollary 1. Let $W \in C^{k,\alpha}(\Sigma_T)$ and suppose that $X$ is the unique solution to $LX = W$, as provided by Theorem 2. Then for some constant $C$ independent of $W$ and $T$,
\[
\|X\|_{k+2,\alpha} \le CT^3\,\|W\|_{k,\alpha} .
\]
Proof. First, by Theorem 2, $\|X\|_{L^2} \le CT^2\,\|W\|_{L^2}$. Next, since the volume of $\Sigma_T$ is bounded by $CT$, we have $\|W\|_{L^2} \le T\,\|W\|_{0,\alpha}$. In addition, local Schauder estimates for $L_c$ give $\|X\|_{2,\alpha} \le C\left(\|W\|_{0,\alpha} + \|X\|_{L^2}\right)$. Putting these together yields $\|X\|_{2,\alpha} \le CT^3\,\|W\|_{0,\alpha}$, which is the desired estimate when $k = 0$. The estimate when $k > 0$ requires another application of the local Schauder estimates. □

3.4. Correcting $\mu_T$ to be transverse-traceless. Recall now that $\mu_T$ is constructed by patching together $\mu_c$ using cutoff functions in the center $Q$ of $C_T$, and the metric $\gamma_T$ is constructed from $\gamma_c$ the same way. The definition (11) implies that on $Q$
\[
|\mu_T|_{\gamma_T} \le C\psi_T^6\,|\mu|_\gamma \tag{14}
\]
with equality (for $C = 1$) on $\Sigma_T \setminus Q$. Since $\mu_c$ is divergence-free with respect to $\gamma_c$, the “error term” $\mathrm{div}_{\gamma_T}\mu_T$ arises from two sources: the cutoff functions used to define $\mu_T$ and those used to define $\gamma_T$. This error term is clearly supported in the center $Q$ of $C_T$, and it is not hard to see that
\[
\|\mathrm{div}_{\gamma_T}\mu_T\|_{k-1,\alpha} \le C\,\|\mu_T\|_{k,\alpha} \le C\,\|\psi_T^6\|_{k,\alpha} ,
\]
the last inequality following from (14) since $|\mu|_\gamma \le C$. Recalling that $\psi_T \sim e^{-t_j/2}$ and each $t_j \sim T/2$ on $Q$, and letting $W$ be the vector field dual to $\mathrm{div}_{\gamma_T}\mu_T$, we conclude that
\[
\|W\|_{k+1,\alpha} \le C\,e^{-3T/2} . \tag{15}
\]

Proposition 5. There exists a tensor $\sigma_T$ on $\Sigma_T$ with
\[
\|\sigma_T\|_{k,\alpha} \le C\,T^3 e^{-3T/2} \tag{16}
\]
and such that $\tilde\mu_T = \mu_T - \sigma_T$ is transverse-traceless with respect to $\gamma_T$.

Proof. As indicated in the beginning of this section, we first solve the equation $LX = W$ on $\Sigma_T$, with $W = \mathrm{div}_{\gamma_T}\mu_T$, and then set $\sigma_T = DX$. The operators $D$ and $L$ here are both computed with respect to $\gamma_T$. The estimates for $\sigma_T$ and $\tilde\mu_T$ follow from (15). □

4. Estimates on the Approximate Solution

Our overall goal in this paper is to modify the approximate initial data set $(\gamma_T, \mu_T, \psi_T)$, defined on the manifold $\Sigma_T$, to a genuine initial data set $(\gamma_T, \tilde\mu_T, \tilde\psi_T)$, so that $\tilde\mu_T$ is transverse-traceless with respect to $\gamma_T$ and $\tilde\psi_T$ satisfies the Lichnerowicz equation relative to $\gamma_T$ and $\tilde\mu_T$. Thus far we have accomplished the first of these tasks, and so what remains is to show how to modify the conformal factor $\psi_T$ to $\tilde\psi_T = \psi_T + \eta_T$ so that $\tilde\psi_T$ satisfies the Lichnerowicz equation. There are three separate steps involved in this. In the first we must calculate the “error term” $E_T$, which measures the deviation of $\psi_T$ from solving the Lichnerowicz equation. After that we must analyze the mapping properties of the linearization of this equation, calculated at $\psi_T$, uniformly as $T \to \infty$. Finally,
these are put together to carry out the contraction mapping argument to produce the exact solution $\tilde\psi_T$. These steps are taken up in this and the next two sections, successively.

To begin then, denote by $N_T$ the Lichnerowicz operator with respect to $(\gamma_T, \tilde\mu_T)$:
\[
N_T(\psi) = \Delta_T\psi - \frac{1}{8}R_T\,\psi + \frac{1}{8}|\tilde\mu_T|^2\psi^{-7} - \frac{1}{12}\tau^2\psi^5 . \tag{17}
\]
The Laplacian and scalar curvature and norm squared of $\tilde\mu_T$ are all computed with respect to $\gamma_T$ here. We define the error term
\[
E_T = N_T(\psi_T), \tag{18}
\]
which measures the deviation of $\psi_T$ from being an exact solution.

Proposition 6. For every $k$ and $\alpha$ there exists a constant $C$ which does not depend on $T$ such that
\[
\|E_T\|_{k,\alpha} \le C e^{-T/2} .
\]
The proof of this proposition involves a careful accounting of the various terms in $N_T(\psi_T)$. We begin by noting that away from the neck region $C_T$, the estimate is nearly trivial. In fact, on $\Sigma^*_R$, $\gamma_T = \gamma_c$, $\psi_T = \psi_c$ and $\tilde\mu_T = \mu_c + O\left(T^3 e^{-3T/2}\right)$. Inserting these into the Lichnerowicz equation, and using that $(\gamma_c, \mu_c, \psi_c)$ is an exact solution, we have that $|E_T| \le CT^3 e^{-3T/2} \le Ce^{-T/2}$, along with all its derivatives, as claimed.

We divide the tube $C_T$ up into three regions: the center $Q = [-1, 1] \times S^2$, and the regions to the left and right of this, $[-T/2, -1] \times S^2 \equiv C_T^{(1)}$ and $[1, T/2] \times S^2 \equiv C_T^{(2)}$. We also recall the notation used in the construction of the approximate solution that $\gamma_j$, $\mu_j$ and $\psi_j$ denote the restrictions of $\gamma_c$, $\mu_c$ and $\psi_c$ to the balls $B_j$, and hence, by a small abuse of notation, to the tube $C_T$ as well. Thus $(\gamma_1, \mu_1, \psi_1)$ represents data “coming in from the left”, while $(\gamma_2, \mu_2, \psi_2)$ represents data “coming in from the right”. Recall also that $\gamma_T$ and $\mu_T$ agree with $\gamma_j$ and $\mu_j$ on $C_T^{(j)}$, whereas $\psi_T$ is a sum of two terms, one much larger than the other, in each of these side regions.

We begin by listing some trivial estimates which hold in all of $C_T$, and which follow from the construction of the approximate solution and from (16):
a) $\gamma_T = ds^2 + h + O\left(e^{-T/2}\cosh s\right)$,
b) $\psi_T \sim 2e^{-T/4}\cosh(s/2)$,
c) $|\mu_T| \sim \psi_T^6 \le Ce^{-3T/2}(\cosh(s/2))^6$,
d) $|\tilde\mu_T - \mu_T| \le CT^3 e^{-3T/2}$.
We shall also repeatedly use the fact that
\[
\Delta_{\gamma_j}\psi_j - \frac{1}{8}R(\gamma_j)\,\psi_j + \frac{1}{8}|\mu_j|^2\psi_j^{-7} - \frac{1}{12}\tau^2\psi_j^5 \equiv 0 . \tag{19}
\]
We begin by estimating $E_T$ in $Q$. The main observation concerning this estimate is that we have a decomposition $\psi_T = \psi_1 + \psi_2$
valid in $Q$. In addition, it follows from property (a) above and from the definition of $\gamma_T$ that
\[
\Delta_T - \frac{1}{8}R_T = \Delta_{\gamma_j} - \frac{1}{8}R_{\gamma_j} + O\left(e^{-T/2}\right)
\]
for $j = 1$ or $2$. On $Q$, we may therefore expand $\left(\Delta_T - \frac{1}{8}R_T\right)\psi_T$ as a sum of two terms. Since on $Q$, $\psi_j = O\left(e^{-T/4}\right)$, we see from (19) that we have
\[
\left(\Delta_T - \frac{1}{8}R_T\right)\psi_T = \sum_{j=1}^{2}\left(-\frac{1}{8}|\mu_j|^2\psi_j^{-7} + \frac{1}{12}\tau^2\psi_j^5\right) + O\left(e^{-3T/4}\right).
\]
From this, using the estimate that $|\mu_j|^2 = O\left(e^{-3T}\right)$ in $Q$, we easily conclude that
\[
\left(\Delta_T - \frac{1}{8}R_T\right)\psi_T = O\left(e^{-3T/4}\right).
\]
As for the other terms in the Lichnerowicz equation, using properties (b), (c) and (d) above we easily get suitable estimates for them in this region as well. In particular, we have $|\tilde\mu_T|^2\psi_T^{-7} = O\left(T^6 e^{-5T/4}\right)$ and $\tau^2\psi_T^5 = O\left(e^{-5T/4}\right)$. Hence in $Q$, we obtain $|E_T| = O\left(e^{-3T/4}\right)$, which is even better than the estimate we have stated.

The estimates in $C_T^{(1)}$ and $C_T^{(2)}$ are nearly identical to one another, so it suffices to concentrate on just one of these, let us say in $C_T^{(2)}$ to be concrete. In this region, $\psi_T = \psi_2 + \chi\psi_1$, where $\chi$ is a function cutting $\psi_1$ off to zero at the far end of $C_T$, where $s \approx T/2$. In this region $\psi_2 \sim e^{-T/4}e^{s/2}$, $\psi_1 = e^{-T/4}e^{-s/2} + O\left(e^{-T/4}e^{-3s/2}\right)$, and in addition
\[
\Delta_T - \frac{1}{8}R_T = \Delta_0 - \frac{1}{8}R_0 + O\left(e^{-T/2}\cosh s\right).
\]
Therefore
\[
\left(\Delta_T - \frac{1}{8}R_T\right)(\chi\psi_1) = e^{-3T/4}\cosh s\,e^{-s/2} + O\left(e^{-T/2}\right) = O\left(e^{-T/2}\right).
\]
Next, $\psi_T = \psi_2\left(1 + \chi\psi_1/\psi_2\right)$ and hence
\[
\psi_T^{-7} = \psi_2^{-7} + O\left(e^{7T/4}e^{-9s/2}\right) \qquad\text{and}\qquad \psi_T^{5} = \psi_2^{5} + O\left(e^{-5T/4}e^{3s/2}\right).
\]
Furthermore, we have here
\[
|\tilde\mu_T|^2 = |\mu_2|^2 + O\left(T^3 e^{-3T/2}\right).
\]
Putting these estimates all together yields $|E_T| \le Ce^{-T/2}$ on $C_T^{(2)}$. With a similar estimate on $C_T^{(1)}$ we conclude that $|E_T| \le Ce^{-T/2}$ on all of $C_T$. The estimates on the derivatives of $E_T$ follow similarly.
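The absorption of the polynomial factor used on $\Sigma^*_R$ in the argument above, $CT^3 e^{-3T/2} \le Ce^{-T/2}$, rests on an elementary calculus bound. The lines below sketch it; the explicit constant $27e^{-3}$ is computed here for illustration and does not appear in the source.

```latex
\[
T^3 e^{-3T/2} = \left(T^3 e^{-T}\right) e^{-T/2}, \qquad
\sup_{T > 0} T^3 e^{-T} = 27\,e^{-3} \approx 1.34 \quad (\text{attained at } T = 3),
\]
\[
\text{so that}\qquad C\,T^3 e^{-3T/2} \;\le\; 27\,e^{-3}\,C\,e^{-T/2} \qquad \text{for all } T > 0 .
\]
```

The same reasoning shows that any fixed power of $T$ can be absorbed at the cost of an arbitrarily small loss in the exponential rate.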
5. The Linearization of the Lichnerowicz Operator

Our main goal now is to find a correction term $\eta_T$ for the approximate solution $\psi_T$ so that the sum $\psi_T + \eta_T$ solves the Lichnerowicz equation $N_T(\psi_T + \eta_T) = 0$ on $\Sigma_T$. This will be done in the next section using a contraction mapping argument. In preparation for this we establish in this section good estimates for the mapping properties of the linearization $L_T$ of this operator $N_T$ about the approximate solution $\psi_T$; as usual, keeping track of behavior as $T \to \infty$.

We first record that the linearization $L$ of the Lichnerowicz operator on $\Sigma$ with respect to the initial data set $(\gamma, \mu)$ is given by
\[
L = \Delta_\gamma - \frac{1}{8}\left(R(\gamma) + 7|\mu|_\gamma^2 + \frac{10}{3}\tau^2\right),
\]
but using the fact that the pair $(\gamma, \mu)$ satisfies the constraint equations, we can rewrite this as
\[
L = \Delta_\gamma - \left(|\mu|_\gamma^2 + \frac{1}{3}\tau^2\right). \tag{20}
\]
By standard elliptic theory (for $\Sigma$ compact), $L : C^{k+2,\alpha}(\Sigma) \to C^{k,\alpha}(\Sigma)$ is a Fredholm mapping of index zero for every $k \in \mathbb{N}$ and $\alpha \in (0, 1)$. Furthermore, since we are assuming that $\Pi \not\equiv 0$, which is equivalent to $\tau \ne 0$ or $\mu \not\equiv 0$, the term of order zero here is nonpositive and not identically zero and so the maximum principle implies that there are no nontrivial solutions of $L\phi = 0$. Consequently the cokernel is also trivial, and we see that (20) is always an isomorphism.

Our first goal in this section is to establish that the same property is true for the linearization $L_T$ on $\Sigma_T$, which is given by
\[
L_T = \Delta_{\gamma_T} - \frac{1}{8}\left(R(\gamma_T) + 7|\tilde\mu_T|^2\psi_T^{-8} + \frac{10}{3}\tau^2\psi_T^4\right). \tag{21}
\]
This operator is elliptic and Fredholm of index zero acting on the Hölder spaces introduced in §3.3; unfortunately, the term of order zero here is no longer necessarily nonpositive since it can no longer be rewritten using the constraint equations, and we do not have control on the sign of $R(\gamma_T)$ near the boundary of $\Sigma^*_R$. Therefore, the triviality of the nullspace of $L_T$ when $T$ is large enough requires further argument. We must also show that the norm of the inverse operator is uniformly bounded as $T \to \infty$.

This last issue is complicated by the following considerations. We shall be using this solution operator in an iterative scheme to produce the correction term $\eta_T$ for the approximate solution $\psi_T$. Because of the term $\psi^{-7}$ in the Lichnerowicz equation, it is quite important that $\eta_T$ be substantially smaller than $\psi_T$. Furthermore, we also desire that $\eta_T$ vanish exponentially as $T \to \infty$ on the “body” $\Sigma^*_R$, away from the scene of the surgery. To ensure that both of these requirements are met in the iteration scheme, we establish precise estimates for the solution operator for $L_T$. This necessitates using slight alterations of the Hölder spaces which incorporate a weight factor along the neck. Thus we first define these weighted Hölder spaces and then take up the matters of the existence and uniformity of the inverse of $L_T$ on these spaces.
Definition 3. Let $w_T$ be an everywhere positive smooth function on $\Sigma_T$ which equals $e^{-T/4}\cosh(s/2)$ on $C_T$ and which is uniformly bounded away from zero on $\Sigma^*_R$. (We may as well also assume that $w_T \equiv 1$ on $\Sigma^*_{2R}$.) Now, for any $\delta \in \mathbb{R}$, and any $\phi \in C^{k,\alpha}(\Sigma_T)$, set
\[
\|\phi\|_{k,\alpha,\delta} = \|w_T^{-\delta}\phi\|_{k,\alpha} ,
\]
and let $C_\delta^{k,\alpha}(\Sigma_T)$ denote the corresponding normed space.

Proposition 7. Fix any $\delta \in \mathbb{R}$. For $T$ sufficiently large, the mapping
\[
L_T : C_\delta^{k+2,\alpha}(\Sigma_T) \longrightarrow C_\delta^{k,\alpha}(\Sigma_T)
\]
is an isomorphism.

Proof. The action of $L_T$ on $C_\delta^{k+2,\alpha}(\Sigma_T)$ is equivalent to the action of the conjugated operator $w_T^{-\delta}L_T w_T^{\delta}$ on the unweighted space $C^{k+2,\alpha}$. As $\delta$ varies, these mappings are all Fredholm of index zero. Furthermore, any element of the nullspace of $L_T$ is in any one of the spaces $C_\delta^{k+2,\alpha}$. Therefore, to prove that any one of these maps is an isomorphism, it suffices to show that the nullspace of $L_T$ is trivial when $\delta = 0$. In addition, by elliptic regularity, it even suffices to show that there is no nontrivial element $\eta \in C^0(\Sigma_T)$ such that $L_T\eta = 0$.

Assume this is not the case, so that there exists a sequence $T_j \to \infty$ and functions $\eta_j \in C^0(\Sigma_{T_j})$ satisfying $L_{T_j}\eta_j = 0$. Write $L_{T_j} = L_j$ and normalize $\eta_j$ so that $\sup|\eta_j| = 1$. As in the proof of Theorem 2, we shall show first that we may extract a nontrivial limiting function $\eta$, and then show that the existence of this function leads to a contradiction.

Suppose first that the supremum of $|\eta_j|$ on $\Sigma^*_R$ remains bounded away from zero, at least for infinitely many $j$. Then using local elliptic theory to control higher order derivatives, we may extract a subsequence $\eta_{j'}$ which converges to some limiting function $\eta \not\equiv 0$ on $\Sigma^*$. Since the coefficients of $L_j$ converge on compact sets to those of $L_c$, the linearization of the Lichnerowicz operator with respect to the initial data set $(\gamma_c, \mu_c)$ at $\psi = \psi_c$, we see that
\[
L_c\eta = \Delta_{\gamma_c}\eta - \frac{1}{8}\left(R(\gamma_c) + 7|\mu_c|^2\psi_c^{-8} + \frac{10}{3}\tau^2\psi_c^4\right)\eta = 0 . \tag{22}
\]
The first two terms, $\Delta_{\gamma_c} - \frac{1}{8}R(\gamma_c)$, constitute the conformal Laplacian for $\gamma_c$. Since $\gamma_c = \psi_c^{-4}\gamma$, the familiar conformal covariance property of this operator leads to
\[
\Delta_{\gamma_c}\eta - \frac{1}{8}R(\gamma_c)\,\eta = \psi_c^{5}\left(\Delta_\gamma\left(\psi_c^{-1}\eta\right) - \frac{1}{8}R(\gamma)\,\psi_c^{-1}\eta\right). \tag{23}
\]
Using this in (22), as well as the fact that $|\mu_c|_{\gamma_c}^2 = \psi_c^{12}|\mu|_\gamma^2$, and dividing through by $\psi_c^5$, we conclude that $Lu = 0$ on $\Sigma^*$, where $u = \eta/\psi_c$. (Note that $L$ in this equation is the operator which extends smoothly across the $p_j$.) Next, $\eta$ is bounded on $\Sigma^*$ and $\psi_c \sim r_j^{1/2}$ near $p_j$, and so, in $B_j$, $|u| \le Cr_j^{-1/2}$. This singularity is weaker than that of the Green's function in three dimensions, and so $u$ extends to a weak solution of $Lu = 0$ on all of $\Sigma$. But we have already remarked earlier that the operator $L$ on $\Sigma$ has trivial nullspace, which means that $u \equiv 0$. This is a contradiction.
It remains to address the case in which the supremum of $|\eta_j|$ on $\Sigma_R^*$ converges to zero. In fact, if $\sup|\eta_j|$ is attained at some point $q_j\in\Sigma_{T_j}$, then we may apply the same argument as before if $q_j$ remains within a bounded distance of the end of the cylindrical piece $C_{T_j}$. If this distance tends to infinity, then choose a new linear coordinate $s'$ which is centered at the point $q_j$ (so that $s' = 0$ at that point) and is a translation of the coordinate $s$. Using local elliptic theory again we may extract a subsequence which converges on any compact set, along with all its derivatives, to some nontrivial function $\eta$ defined on the complete cylinder $\mathbb{R}\times S^2$. Furthermore, the metric $\gamma_{T_j}$ converges in this limit to the product metric $\gamma_0 = (ds')^2 + h$, $R(\gamma_{T_j})\to 2$, and the terms $|\mu_{T_j}|^2\psi_{T_j}^{-8}$ and $\psi_{T_j}^{4}$ both converge to zero. Hence $\mathcal{L}_j$ converges locally on compact sets to $\Delta_{\gamma_0} - 1/4$. Thus we have obtained in this limit a nontrivial bounded solution of the equation
$$\mathcal{L}_{\gamma_0}\eta = \Delta_{\gamma_0}\eta - \frac14\,\eta = 0 \tag{24}$$
on the entire cylinder $\mathbb{R}\times S^2$, and such a solution cannot exist: separating variables in spherical harmonics $\phi_\ell$, every solution is a combination of terms $e^{\pm(\ell+1/2)s'}\phi_\ell(\theta)$, none of which is bounded. Hence this case too leads to a contradiction. This completes the proof of Proposition 7. □

For $T$ sufficiently large, let $\mathcal{G}_T$ denote the inverse of $\mathcal{L}_T$ acting on $C^{k,\alpha}_\delta$. In other words, we have
$$\mathcal{G}_T : C^{k,\alpha}_\delta(\Sigma_T)\to C^{k+2,\alpha}_\delta(\Sigma_T), \tag{25}$$
with $\mathcal{L}_T\mathcal{G}_T = \mathcal{G}_T\mathcal{L}_T = I$. Although of course $\mathcal{G}_T$ depends on the chosen weight $\delta$, we suppress this dependence in the notation.

Proposition 8. If $0<\delta<1$, then the norm of the operator $\mathcal{G}_T$ in (25) is uniformly bounded as $T\to\infty$.

Proof. The strategy is the same as before. If the result were false, then there would exist a sequence $T_j\to\infty$ and functions $f_j\in C^{k,\alpha}_\delta(\Sigma_{T_j})$ such that, writing $\mathcal{G}_j = \mathcal{G}_{T_j}$, $\|f_j\|_{k,\alpha,\delta}\to 0$ and $\|\mathcal{G}_j f_j\|_{k+2,\alpha,\delta} = 1$ as $j\to\infty$. Setting $v_j = \mathcal{G}_j f_j$, we argue exactly as in the preceding proof. Either some subsequence of the $v_j$ converges to a bounded, nontrivial function $v$ on $\Sigma^*$ which satisfies $\mathcal{L}_c v = 0$, which leads to a contradiction, or else some subsequence converges on the cylinder. This second case is precisely why we have introduced the weight function $w_{T_j}^{\delta}$. Write $w_{T_j} = w_j$ and $C_{T_j} = C_j$. Since $\|v_j\|_{k+2,\alpha,\delta}\equiv 1$, we have
$$1 \ge \sup_j\,\sup_{C_j} w_j^{-\delta}|v_j| \ge c > 0.$$
Suppose that $\sup_{C_j} w_j^{-\delta}|v_j|$ is attained at some point $q_j\in C_j$. We may also assume that $|v_j|\to 0$ on any fixed compact set of $\Sigma^*$ away from $C_j$. Now, by assumption $|v_j|\le w_j^{\delta}$, and if we renormalize by setting $\tilde v_j = e^{\delta T_j/4}v_j$, then this becomes
$$|\tilde v_j(s,\theta)| \le (\cosh(s/2))^{\delta}$$
on $C_{T_j}$, with equality attained at the point $q_j$. Writing $q_j = (s_j,\theta_j)$, we renormalize yet again, setting $\hat v_j(s',\theta) = \tilde v_j(s'+s_j,\theta)/(\cosh(s_j/2))^{\delta}$. Hence
$$|\hat v_j(s',\theta)| \le \left(\frac{\cosh\bigl(\tfrac12(s'+s_j)\bigr)}{\cosh\bigl(\tfrac12 s_j\bigr)}\right)^{\delta},$$
with equality at the point $(s',\theta) = (0,\theta_j)$. In particular, $\hat v_j(0,\theta_j) = 1$. Now we can pass to a limit. If $s_j\to\pm\infty$, then the right-hand side here converges to $e^{\delta s'/2}$ or $e^{-\delta s'/2}$, while if $s_j$ remains bounded, then the right side converges to a function which is bounded by $C(\cosh(s'/2))^{\delta}$ for some $C>0$, while the left side converges to a nontrivial function $\hat v$ which satisfies $\mathcal{L}_{\gamma_0}\hat v = 0$. But any solution of this equation blows up at least like $e^{|s'|/2}$ either as $s'\to\infty$ or as $s'\to-\infty$. Since $0<\delta<1$, this rate of growth is incompatible with the previous inequalities, and so we reach a contradiction once again. □

6. Proof of the Main Theorem

We wish to solve
$$N_T(\psi_T+\eta_T) = 0. \tag{26}$$
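Expanding $N_T(\psi_T+\eta)$ in a Taylor series about $\eta = 0$ we have
$$N_T(\psi_T+\eta) = E_T + \mathcal{L}_T(\eta) + Q_T(\eta),$$
where $E_T$ and $\mathcal{L}_T$ are given by (18) and (21) respectively, and
$$\begin{aligned}
Q_T(\eta) &= N_T(\psi_T+\eta) - E_T - \mathcal{L}_T(\eta)\\
&= \frac18|\tilde\mu_T|^2_{\gamma_T}\Bigl((\psi_T+\eta)^{-7} - \psi_T^{-7} + 7\psi_T^{-8}\eta\Bigr) - \frac1{12}\tau^2\Bigl((\psi_T+\eta)^{5} - \psi_T^{5} - 5\psi_T^{4}\eta\Bigr)\\
&= \frac18|\tilde\mu_T|^2_{\gamma_T}\psi_T^{-7}\Bigl(\Bigl(1+\frac{\eta}{\psi_T}\Bigr)^{-7} - 1 + 7\frac{\eta}{\psi_T}\Bigr) - \frac1{12}\tau^2\Bigl(10\psi_T^{3}\eta^2 + 10\psi_T^{2}\eta^3 + 5\psi_T\eta^4 + \eta^5\Bigr)
\end{aligned}$$
is the "quadratically vanishing" nonlinearity. If we require that $\eta\in C^{k+2,\alpha}_\delta(\Sigma_T)$ satisfy $|\eta| < c\,\psi_T$ for some constant $c<1$, then from the expansion above we easily obtain an estimate of the form
$$\|Q_T(\eta)\|_{k+2,\alpha,\delta} \le C\|\eta\|^2_{k+2,\alpha,\delta} \tag{27}$$
for some constant $C$ independent of $T$. Equation (26) is equivalent to
$$\eta_T = -\mathcal{G}_T\bigl(E_T + Q_T(\eta_T)\bigr),$$
where $\mathcal{G}_T$ is the inverse of $\mathcal{L}_T$ acting on $C^{k+2,\alpha}_\delta(\Sigma_T)$. We will find a solution to this equation in a small ball about the origin in $C^{k+2,\alpha}_\delta(\Sigma_T)$ for $\delta\in(0,1)$. To determine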
the appropriate size of the ball in which to work, note that from Proposition 6, for any $\delta\in(0,1)$, we have
$$\|E_T\|_{k,\alpha,\delta} \le C e^{-T/4} \tag{28}$$
for a constant $C$ which is independent of $T$. This suggests the following

Definition 4. Fix $k\in\mathbb{N}$, $\alpha\in(0,1)$ and $\delta\in(0,1)$, and let $\nu\in\mathbb{R}^+$. Then we define
$$B_\nu = \bigl\{u\in C^{k+2,\alpha}_\delta(\Sigma_T) : \|u\|_{k+2,\alpha,\delta}\le \nu e^{-T/4}\bigr\}.$$
Thus, for any $\eta\in B_\nu$, we have
$$|\eta|\le \nu e^{-T/4}\ \text{ on } \Sigma_T\setminus C_T \qquad\text{and}\qquad |\eta|\le \nu e^{-\delta T/4}\psi_T\ \text{ on } C_T.$$
In particular, when $T$ is sufficiently large, the quadratic estimate (27) is valid in $B_\nu$. Using Proposition 8 together with (28) we have
$$\|\mathcal{G}_T(E_T)\|_{k+2,\alpha,\delta} \le CM e^{-T/4}, \tag{29}$$
where $M$ is the bound on $\mathcal{G}_T$ acting on $C^{k,\alpha}_\delta(\Sigma_T)$. In light of this we choose $\nu = 2CM$ so that we have $\mathcal{G}_T(E_T)\in B_{\nu/2}$. The following lemma follows easily from the fact that we have the quadratic estimate above on $Q_T(\eta)$ for $\eta\in B_\nu$.

Lemma 1. For $\nu = 2CM$ as above and $T$ sufficiently large we have, for any $\eta_1,\eta_2\in B_\nu$,
$$\|Q_T(\eta_1) - Q_T(\eta_2)\|_{k+2,\alpha,\delta} \le \frac{1}{2M}\|\eta_1-\eta_2\|_{k+2,\alpha,\delta}. \tag{30}$$
We can now establish the following

Proposition 9. For $\nu$ as above and $T$ sufficiently large the map
$$\eta\longmapsto \mathcal{T}(\eta)\equiv -\mathcal{G}_T\bigl(E_T + Q_T(\eta)\bigr)$$
is a contraction mapping on $B_\nu\subset C^{k+2,\alpha}_\delta(\Sigma_T)$.

Proof. Let $\eta\in B_\nu$ be given. Taking $\eta_1 = \eta$ and $\eta_2 = 0$ in (30) we see that $-\mathcal{G}_T(Q_T(\eta))\in B_{\nu/2}$. This, together with our choice of $\nu$, shows that $\mathcal{T} : B_\nu\to B_\nu$ as required. Moreover, for $\eta_1,\eta_2\in B_\nu$ we have
$$\begin{aligned}
\|\mathcal{T}(\eta_1) - \mathcal{T}(\eta_2)\|_{k+2,\alpha,\delta} &= \|\mathcal{G}_T(E_T+Q_T(\eta_1)) - \mathcal{G}_T(E_T+Q_T(\eta_2))\|_{k+2,\alpha,\delta}\\
&\le M\|Q_T(\eta_1) - Q_T(\eta_2)\|_{k+2,\alpha,\delta}\\
&\le \frac12\|\eta_1-\eta_2\|_{k+2,\alpha,\delta}.
\end{aligned}$$
This completes the proof of Proposition 9. □
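The fixed-point scheme of Proposition 9 can be illustrated in a scalar toy model. The sketch below is an illustration under stated assumptions only: the operators are replaced by real numbers (`L` stands in for the linearized operator, `E` for the error term, `Q` for the quadratic remainder), and the function name `solve_by_contraction` and all numerical values are hypothetical, not taken from the paper.

```python
def solve_by_contraction(L, E, Q, tol=1e-14, max_iter=200):
    """Iterate eta -> -G * (E + Q(eta)) with G = 1/L, starting from eta = 0.

    Scalar stand-in (an assumption made for illustration) for the map of
    Proposition 9. Returns the approximate fixed point and the number of
    iterations used.
    """
    G = 1.0 / L
    eta = 0.0
    for n in range(1, max_iter + 1):
        new_eta = -G * (E + Q(eta))
        if abs(new_eta - eta) <= tol:
            return new_eta, n
        eta = new_eta
    raise RuntimeError("iteration did not converge")


# Hypothetical toy data: a small error term and a purely quadratic remainder.
L_toy = -2.0
E_toy = 1.0e-3
Q_toy = lambda eta: 3.0 * eta ** 2

eta_star, steps = solve_by_contraction(L_toy, E_toy, Q_toy)
# Residual of the toy equation E + L*eta + Q(eta) = 0 at the fixed point.
residual = E_toy + L_toy * eta_star + Q_toy(eta_star)
print(eta_star, steps, residual)
```

In this toy model the residual vanishes up to rounding, and the fixed point stays within twice the size of the first iterate $-GE$, which mirrors the role of the choice $\nu = 2CM$ in the argument above.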
Theorem 1 from §1.2 is now a direct consequence of the following.

Theorem 3. The Lichnerowicz equation (26) has a unique solution $\tilde\psi_T = \psi_T + \eta_T$ with $\eta_T\in B_\nu\subset C^{k+2,\alpha}_\delta(\Sigma_T)$.

Proof. This follows immediately from the contraction mapping theorem, together with Proposition 9. □
7. Modifications When Σ is Complete but Noncompact

Thus far we have assumed that the three-manifold $\Sigma$ is closed. However, it is also interesting from a physical perspective to consider gluing solutions of the constraint equations on various classes of complete manifolds. Of particular interest are the two cases when $(\Sigma,\gamma)$ is either asymptotically Euclidean or asymptotically hyperbolic. For each of these cases, there are well-established results concerning the existence of CMC solutions of the Einstein constraint equations [7, 14, 2, 3]. In this section we show that with only minor modifications, the gluing constructions developed here may be extended to these other settings. Since most of the modifications are common to both cases, we begin with a more general discussion which describes the alterations and new ingredients needed to extend the arguments, modulo a few specific analytic facts. We follow this with precise statements of definitions and facts needed for these two particular cases, along with appropriate references.

7.1. General preamble. Looking back, there are only a few places in the constructions and arguments of the preceding sections where truly global issues arise; elsewhere the argument localizes in the neck region. Specifically, the global structure of $\Sigma$ enters the analysis only when establishing the existence of the inverses $G_T$ and $\mathcal{G}_T$ for the vector Laplacian $L_T$ and the linearized Lichnerowicz operator $\mathcal{L}_T$, and also, more mildly, when determining the dependence on $T$ of the norms of these inverses.

Thus suppose that $(\Sigma,\gamma)$ is a complete, noncompact three-manifold, and $(\gamma,\Pi)$ is an initial data set on $\Sigma$ with $\operatorname{tr}\Pi = \tau$ constant. In the noncompact setting we do not require that $\Pi\not\equiv 0$. Fix points $p_1,p_2\in\Sigma$ and balls $B_j\ni p_j$ and construct the approximate solutions $(\gamma_T,\mu_T,\psi_T)$ as in §2. Then proceed to the main part of the argument, where first $\mu_T$ is modified to be transverse-traceless with respect to $\gamma_T$ using the operator $L_T$, and then $\psi_T$ is modified so as to be a solution of the Lichnerowicz equation.

The function spaces we shall use in the two cases of interest are slightly different, and so we phrase things more abstractly in this preamble. Thus suppose that there are Banach spaces $E$ and $F$ of vector fields and $\mathcal{E}$ and $\mathcal{F}$ of functions (these will be familiar weighted Hölder spaces in practice) such that the mappings
$$L : E\to F, \qquad \mathcal{L} : \mathcal{E}\to\mathcal{F} \tag{31}$$
for the vector Laplacian $L$ and the linearized Lichnerowicz operator $\mathcal{L}$ on $(\Sigma,\gamma)$ are both isomorphisms.

Remark 3. In the compact case, we assumed that there were no conformal Killing fields vanishing at both $p_1$ and $p_2$, and this was enough to give the invertibility and bounds for the inverse as $T\to\infty$ for $L_T$; likewise, the assumption that $\Pi\not\equiv 0$ was used to prove invertibility of $\mathcal{L}_T$. In both the asymptotically Euclidean and asymptotically hyperbolic settings, neither assumption is required because we shall use spaces of functions and vector fields which decay at infinity, and for these it is not difficult to obtain invertibility for $L_T$ and $\mathcal{L}_T$ directly. Thus the restriction that the maps (31) are both isomorphisms is not too limiting.

We must also assume that the vector fields $X\in E$ decay sufficiently so that the integration by parts
$$\langle LX, X\rangle = \langle DX, DX\rangle \ \text{ is valid for all } X\in E. \tag{32}$$
Since we shall always use function spaces which are localizable, i.e. preserved by multiplication by functions in $C_0^\infty$, we may define new function spaces on $\Sigma_T$ by patching together vector fields or functions in any of these spaces with vector fields or functions in the appropriate Hölder spaces on $C_T$ (including functions measured with the weight factor $w^\delta$ as required in Propositions 7 and 8). We label these spaces $E(\Sigma_T)$, $F(\Sigma_T)$, $\mathcal{E}_\delta(\Sigma_T)$ and $\mathcal{F}_\delta(\Sigma_T)$. It is clear that, presuming as above that $L$ and $\mathcal{L}$ are isomorphisms, then
$$L_T : E(\Sigma_T)\to F(\Sigma_T), \qquad \mathcal{L}_T : \mathcal{E}_\delta(\Sigma_T)\to\mathcal{F}_\delta(\Sigma_T)$$
are Fredholm for any $T$. For simplicity, we shall usually drop the $\Sigma_T$ from this notation.

Proposition 10. The operator $L_T : E\to F$ is invertible when $T$ is sufficiently large. The norm of its inverse $G_T$, as a mapping $F\to E$, is bounded by $CT^3$ as $T\to\infty$.

Proposition 11. When $\delta\in(-1,1)$, the operator $\mathcal{L}_T : \mathcal{E}_\delta\to\mathcal{F}_\delta$ is invertible, with inverse $\mathcal{G}_T$ uniformly bounded as a mapping $\mathcal{F}_\delta\to\mathcal{E}_\delta$ as $T\to\infty$.

We shall prove only the first of these two results; the second is proved in exactly the same way, but in fact is even easier to verify.

Proof of Proposition 10. Let us suppose the result is false, so there is a sequence of values $T_j\to\infty$, and vectors $X_j$ and $Y_j$ such that $L_{T_j}X_j = Y_j$, but with $\|X_j\|_E\equiv 1$ and $\|Y_j\|_F = o(T_j^{-3})$. Also, write $E(\Sigma_{T_j})$ and $F(\Sigma_{T_j})$ simply as $E$ and $F$. Following the argument of Theorem 2, let $q_j$ denote a point where $\|X_j\|_E$ attains its maximum. There are three cases to consider: the first is when $q_j$ lies in $C_{T_j}$ and the norm of $X_j$ on any compact set of $\Sigma^*$ tends to zero, the second is when $q_j$ remains in some fixed compact set of $\Sigma^*$, and the third is when $q_j$ leaves every compact set of $\Sigma$.

In the first two cases, the argument proceeds exactly as in §3.2 and §3.3; either we produce in the limit an "illegal" solution $X$ on the complete cylinder, or else we produce a nontrivial vector field $X$ on $\Sigma^*$ such that $L_cX = 0$ there. Here $L_c$ is the vector Laplacian for the metric $\gamma_c$ on $\Sigma^*$ which agrees with $\gamma$ outside of the balls $B_\ell$, $\ell = 1,2$, and has asymptotically cylindrical ends near these points. This solution $X$ is bounded along these cylindrical ends and it lies in the space $E$ on $\Sigma^*\setminus(B_1\cup B_2)$. In order to carry out the usual integration by parts, to conclude that $D_{\gamma_c}X = 0$, we use the analysis from §3.2 and the assumption (32) for $X\in E$. Hence $X$ is conformal Killing, first with respect to $\gamma_c$, and hence with respect to $\gamma$, on $\Sigma^*$. As in §3.2, it extends to a smooth solution across the marked points $p_\ell$, and therefore gives a nontrivial element $X\in E$ of the nullspace of $L$ on all of $\Sigma$, which is a contradiction.

Thus it suffices to handle the third case. Suppose then that $X_j$ decays to zero on every compact set of $\Sigma$. Let $\chi$ be a smooth cutoff function which vanishes near the points $p_\ell$ and equals one outside the balls $B_\ell$. Then $\tilde X_j = \chi X_j$ satisfies
$$L_{T_j}\tilde X_j = \chi Y_j + [L_{T_j},\chi]X_j \equiv Z_j.$$
Both terms on the right decay to zero in the space $F$ on $\Sigma$ (not just $\Sigma^*$), whereas clearly $\|\tilde X_j\|_E\ge c>0$. However, these vector fields are all defined on $(\Sigma,\gamma)$, and $GZ_j = \tilde X_j$. Since $G$ is a bounded operator, this is again a contradiction. □

7.2. Asymptotically Euclidean initial data. We now specialize the discussion above to the case in which $(\Sigma,\gamma)$ is asymptotically Euclidean and $(\gamma,\Pi)$ is an appropriate data set on this manifold.
Recall that a metric $\gamma$ is said to be asymptotically Euclidean (AE for short) if each end of $\Sigma$ is diffeomorphic to the exterior of a ball in $\mathbb{R}^3$, and in the induced coordinates the metric $\gamma$ decays to the Euclidean metric at some rate. More precisely, we require first that $\Sigma\setminus K\approx\bigsqcup_j\,\mathbb{R}^3\setminus B_{R_j}$, where the disjoint union on the right is finite. For simplicity we assume that $\Sigma$ has only one end. Then, we assume that in the naturally induced Euclidean coordinates $z$, $|\gamma_{ij}-\delta_{ij}|\le C|z|^{-\nu}$, along with appropriate decay of the derivatives. To formalize this, suppose that $\Sigma\setminus K\approx\mathbb{R}^3\setminus B_{R_0}$, and then define

• $\Lambda^{0,\alpha}_{AE}(\Sigma) = \Bigl\{u : \sup|u|<\infty,\ \|u\|_{0,\alpha,K}<\infty,\ \sup_{R\ge R_0}\ \sup_{z\ne z',\ R\le|z|,|z'|\le 2R}\dfrac{|u(z)-u(z')|\,R^\alpha}{|z-z'|^\alpha}<\infty\Bigr\}$,

• $\Lambda^{k,\alpha}_{AE}(\Sigma) = \bigl\{u : \|u\|_{k,\alpha,K}<\infty,\ (1+|z|^2)^{|\beta|/2}\partial_z^\beta u\in\Lambda^{0,\alpha}_{AE}\ \text{for } |\beta|\le k\bigr\}$,

• $r^{-\nu}\Lambda^{k,\alpha}_{AE}(\Sigma) = \bigl\{u\in\Lambda^{k,\alpha}_{AE} : u = r^{-\nu}v \text{ on } \Sigma\setminus K \text{ and } v\in\Lambda^{k,\alpha}_{AE} \text{ there}\bigr\}$.

Thus, roughly speaking, a function $u$ is in $r^{-\nu}\Lambda^{k,\alpha}_{AE}$ provided it is in the ordinary Hölder space $\Lambda^{k,\alpha}$ on any fixed compact set in $\Sigma$, and if $|\partial_z^\beta u|\le C_\beta|z|^{-\nu-|\beta|}$ along the end(s), for $|\beta|\le k$, with an appropriate condition for the Hölder derivatives of order $k+\alpha$. We now define
$$\mathcal{M}^{k,\alpha}_{-\nu}(\Sigma) = \bigl\{\text{metrics } \gamma : \gamma_{ij}-\delta_{ij}\in r^{-\nu}\Lambda^{k,\alpha}_{AE}\bigr\}.$$
(We are abusing notation slightly here since the metric coefficients $\gamma_{ij}$ are only defined on the end $\Sigma\setminus K$.) Such metrics have been much studied in relativity, and [4] collects a number of analytic results which are appropriate for the study of the Laplacians and other natural geometric operators associated to these metrics.

In this AE setting it is natural to focus on initial data sets which would evolve into asymptotically Minkowski spacetimes, and so we assume that $\tau = 0$. In addition, since
$$\gamma\in\mathcal{M}^{k+2,\alpha}_{-\nu}(\Sigma)\ \Longrightarrow\ \mathrm{Riem}_\gamma\in r^{-\nu-2}\Lambda^{k,\alpha}_{AE}(\Sigma),$$
the Hamiltonian constraint equation (2) suggests that we assume that
$$\mu\in r^{-\frac{\nu}{2}-1}\Lambda^{k,\alpha}_{AE}(\Sigma).$$
There are many other possible choices of function spaces in which to work. The most customary ones in the literature surrounding this problem are weighted Sobolev spaces.
We could certainly have used these here too, but because weighted Hölder spaces have been used everywhere else in this paper it seems more natural to use spaces of this type here as well.

We now define the class of elliptic AE operators. Suppose that $P$ is an elliptic (second order) operator on $\Sigma$; we shall assume that there is a constant coefficient elliptic operator $P_0$ of order 2 on $\Sigma\setminus K\approx\mathbb{R}^3\setminus B_{R_0}$ such that
$$P = P_0 + \sum_{|\beta|\le 2}a_\beta(z)\,\partial_z^\beta, \qquad\text{where } a_\beta\in r^{-\nu}\Lambda^{k+2-|\beta|,\alpha}_{AE}.$$
Thus by construction, if $P$ is of this type, then
$$P : r^{-\nu}\Lambda^{k+2,\alpha}_{AE}(\Sigma)\to r^{-\nu-2}\Lambda^{k,\alpha}_{AE}(\Sigma) \tag{33}$$
is bounded. There are additional conditions required to ensure that this mapping is Fredholm. These are determined by the indicial roots: the number $\lambda$ is an indicial root of $P$ (on one end of $\Sigma$) if there exists a function $\phi(\theta)$, $\theta = z/|z|$, such that $P(|z|^\lambda\phi(\theta)) = O(|z|^{\lambda-2-\nu})$. Notice that no matter the value of $\lambda$ and choice of $\phi$, the left side is always of the form $P_0(r^{\lambda}\phi(\theta)) + O(|z|^{\lambda-2-\nu})$, where the first term here is homogeneous of degree $\lambda-2$. Hence indicial roots are determined solely by the "principal part" $P_0$ of $P$. Continuing, if we express $r^2P_0$ in polar coordinates as
$$r^2P_0 = \sum_{j+|\beta|\le 2}a_{j,\beta}\,(r\partial_r)^j(\partial_\theta)^\beta$$
(where the coefficients $a_{j,\beta}$ are constant), then $\lambda$ is an indicial root for $P_0$, and hence for $P$, if and only if
$$\sum_{j+|\beta|\le 2}a_{j,\beta}\,\lambda^j\,\partial_\theta^\beta\phi(\theta) = 0$$
for some function $\phi(\theta)$. In particular, there is a denumerable set of indicial roots $\lambda_j$, and these are related to eigenvalues of the "angular part" of the operator $P_0$. It is known [23, 4] that if $-\nu\notin\{\operatorname{Re}\lambda_j\}$, then the mapping (33) is Fredholm (whereas if $-\nu$ is the real part of an indicial root, then this mapping does not have closed range).

We are concerned with the question of whether this mapping is an isomorphism, or at least surjective, and if so, for which range of weights, for the particular operators $L$ and $\mathcal{L}$. It is clear from the asymptotic flatness of the data that the principal parts of $L$ and $\mathcal{L}$ are
$$L_0 = -\frac12\Delta + \frac16\nabla\operatorname{div} \qquad\text{and}\qquad \mathcal{L}_0 = \Delta,$$
respectively. (The Laplacian on the right in this expression for $L_0$ is the Laplacian on vector fields on $\mathbb{R}^3$, which is the same as the scalar Laplacian applied componentwise with respect to the standard Cartesian decomposition.) From here we compute that

• the set of indicial roots for both $L$ and $\mathcal{L}$ is $\mathbb{Z}$.

For the scalar Laplacian this is well known and derives ultimately from the fact that the global temperate solutions of $\mathcal{L}_0u = 0$ are polynomials, and these have integer rates of growth. The negative indicial roots arise from considerations of duality between
weighted $L^2$ spaces. Exactly the same reasoning may be applied to determine the indicial roots of $L_0$, for it is easy to see using the Fourier transform that the only global temperate vector fields which satisfy $L_0X = 0$ on $\mathbb{R}^3$ must have polynomial coefficients. Then we have the following

Proposition 12. When $P$ is equal to either $L$ or $\mathcal{L}$, the mapping (33) is Fredholm of index zero when $-1<-\nu<0$.

We remark that the fact that these mappings are Fredholm follows from the results quoted above and the fact that when $-\nu$ is in this range it is not an indicial root of either operator; the fact that they have index zero can be derived from the symmetry of these operators. When $-\nu$ is not in this range, and is not an indicial root, then the mapping (33) is still Fredholm, but no longer of index zero, hence cannot be an isomorphism.

Proposition 13. For $P = L$ or $\mathcal{L}$ and $-1<-\nu<0$, the mappings in (33) are isomorphisms.

Proof. It suffices to show that neither $L$ nor $\mathcal{L}$ has elements in its nullspace which decay like $r^{-\nu}$ as $r\to\infty$. For the linearized Lichnerowicz operator $\mathcal{L}$ this follows directly from the maximum principle. For the vector Laplacian $L$, we first note that if $X\in r^{-\nu}\Lambda^{2,\alpha}_{AE}(\Sigma)$ with $-1<-\nu<0$, then the boundary term in the integration by parts $\langle LX,X\rangle = \|DX\|^2$ vanishes, and so $DX = 0$. We then use a result of Christodoulou [9] which says that there are no conformal Killing vector fields on AE manifolds which decay at infinity. (In fact, Christodoulou's theorem is stated only for AE manifolds which are globally diffeomorphic to $\mathbb{R}^3$, but one readily verifies from his proof that the result extends to general AE manifolds.) □

In summary, we have proved

Theorem 4. Let $(\Sigma,\gamma,\Pi)$ be an asymptotically Euclidean initial data set, as described in this section; in particular, $\tau = \operatorname{tr}_\gamma(\Pi) = 0$ on $\Sigma$. Choose any two points $p_1,p_2\in\Sigma$, and suppose that if either of these points lies on a compact (closed) component of $\Sigma$, then there does not exist a conformal Killing field $X$ on that component which vanishes at that point, and furthermore, $\Pi\not\equiv 0$ on that component. Then there exists a one-parameter family of asymptotically Euclidean CMC initial data sets $(\Gamma_T,\Pi_T)$ on $\Sigma_T$ satisfying the same estimates as in Theorem 1.

7.3. Asymptotically hyperboloidal initial data. Let us now turn to the case where $(\Sigma,\gamma)$ is asymptotically hyperbolic (or hyperboloidal, which is the terminology favored in the physics literature). We work within the framework of conformally compact manifolds, which is quite natural from Penrose's formulation. This is explained carefully in the introduction to [2].

Recall that a metric $\gamma$ on the interior of a manifold with boundary $\bar\Sigma$ is said to be conformally compact if $\gamma = \rho^{-2}\bar\gamma$, where $\bar\gamma$ is a nondegenerate metric on the closure $\bar\Sigma$ and $\rho$ is a defining function for $\partial\Sigma$, i.e. $\rho$ is nonnegative and vanishes precisely on the boundary, with nonvanishing differential. We shall assume that $\bar\gamma$ and $\rho$ are both smooth, or more accurately, polyhomogeneous (i.e. having complete expansions involving powers of a smooth defining function as well as powers of its logarithm); this latter assumption is most reasonable because it occurs generically [2]. Though we shall not do so, it is also possible to assume that $\bar\gamma$ and $\rho$ are in $C^{k+2,\alpha}(\bar\Sigma)$, and only minor and obvious perturbation arguments are needed to handle this more general case.
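As a concrete instance of this definition (a standard example, added here for orientation and not taken from the text above), consider the ball model of hyperbolic space on the closed unit ball $\{z\in\mathbb{R}^3 : |z|\le 1\}$:

```latex
\gamma = \frac{4\,|dz|^{2}}{(1-|z|^{2})^{2}} = \rho^{-2}\,\bar\gamma,
\qquad
\rho = \tfrac12\bigl(1-|z|^{2}\bigr),
\qquad
\bar\gamma = |dz|^{2}.
% rho is nonnegative on the closed ball and vanishes exactly on |z| = 1;
% d\rho = -z \cdot dz, hence |d\rho|_{\bar\gamma} = |z| = 1 on the boundary.
```

Here $\rho$ is a defining function for the boundary sphere with $|d\rho|_{\bar\gamma} = 1$ there, $\bar\gamma$ is the Euclidean metric, and $\gamma$ is complete with constant sectional curvature $-1$.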
The metric $\gamma$ is complete on the interior of $\bar\Sigma$; moreover, assuming that $|d\rho|_{\bar\gamma} = 1$ at $\rho = 0$, then the curvature tensor $\mathrm{Riem}_\gamma$ is asymptotically isotropic as $\rho\to 0$, with sectional curvatures tending to $-1$. We call any such space AH (standing for either asymptotically hyperbolic or asymptotically hyperboloidal, depending on the reader's preference).

As for the tensor $\Pi = \mu + \frac13\tau\gamma$, we assume that $\tau$ is constant, of course, and also that the tensor coefficients $\mu_{ij} = O(\rho^{-1})$, or equivalently $\mu^{ij} = O(\rho^{3})$. (In either case, we are writing the components of the tensor $\mu$ relative to a smooth background coordinate system on $\bar\Sigma$.) Assuming that $\tau = 3$ and with the curvature normalization above, the constraint equations become
$$\operatorname{div}_\gamma(\mu) = 0, \qquad R + 6 = |\mu|^2.$$
Recall that $|\mu|^2 = O(\rho^2)$.

The existence of AH initial data sets $(\gamma,\Pi)$ satisfying these hypotheses, as well as the polyhomogeneous regularity of this data (assuming that $\bar\gamma$ is smooth) is the main topic of [2]. The special case of these results, in which one assumes that $\Pi$ is pure trace, is already handled in [3]. Some of the main tools required here are based on the linear analysis of geometric elliptic operators on conformally compact spaces as developed in [24], although the authors of [2] and [3] base their work on the somewhat later and less precise, although simpler, methods of [1]; cf. also [22] for a sharper version of these methods. The regularity theory of [2] is very closely related to the results of [25].

Quoting the main result of [2], we assume that $\Sigma$ is endowed with an AH initial data set $(\gamma,\Pi)$, where $\gamma$ and $\mu$ are polyhomogeneous. That is, with $\bar\gamma$ a smooth metric on $\bar\Sigma$ and $x$ a smooth defining function for $\partial\Sigma$, we have
$$\rho\sim x + \sum_{k\ge 0}\sum_{\ell=0}^{N_k}a_{k\ell}\,(\log x)^k x^\ell,$$
and
$$\mu_{ij}\sim\mu_{ij}^{(-1)}x^{-1} + \sum_{k\ge 0}\sum_{\ell=0}^{N_k}\mu_{ij}^{(k\ell)}\,(\log x)^k x^\ell,$$
where all coefficients $a_{k\ell}$, $\mu_{ij}^{(-1)}$, $\mu_{ij}^{(k\ell)}$ are smooth on $\bar\Sigma$ and, for each $j$, vanish when $k>N_j$. (In addition, $a_{2\ell} = 0$ for $\ell>0$.)

We require the mapping properties of the vector Laplacian $L = -\operatorname{div}\circ D$ and the linearized Lichnerowicz operator $\mathcal{L} = \Delta - |\mu|^2 - 3$ associated to this data. These follow directly from the general theory of [24], cf. also [22]. Choosing local coordinates $(x,y)$ near $\partial\Sigma$, then recall from these papers that any operator of the form
$$P = \sum_{j+|\beta|\le 2}a_{j\beta}(x,y)\,(x\partial_x)^j(x\partial_y)^\beta$$
with coefficients $a_{j\beta}$ smooth (or just polyhomogeneous) is said to be uniformly degenerate (of order 2). Its "uniformly degenerate symbol" is defined to be
$$\sigma_2(P)(x,y;\xi,\eta) = \sum_{j+|\beta| = 2}a_{j\beta}(x,y)\,\xi^j\eta^\beta,$$
and $P$ is elliptic in this calculus if this symbol is nonvanishing (or an invertible matrix, if $P$ is a system) when $(\xi,\eta)\ne 0$. It is not hard to check that both of the operators $L$ and $\mathcal{L}$ above are elliptic uniformly degenerate operators.

The function spaces on which we let these operators act are the weighted Hölder spaces $x^\nu\Lambda^{k,\alpha}_{AH}(\Sigma)$. These are taken to be the usual Hölder spaces on a compact subset of $\Sigma$ and, in a boundary coordinate chart, are defined as follows:

• $\Lambda^{0,\alpha}_{AH}(\Sigma) = \Bigl\{u : \sup|u|<\infty,\ \sup\dfrac{|u(x,y)-u(x',y')|\,(x+x')^\alpha}{(|x-x'|+|y-y'|)^\alpha}<\infty\Bigr\}$,

• $\Lambda^{k,\alpha}_{AH}(\Sigma) = \bigl\{u : (x\partial_x)^j(x\partial_y)^\beta u\in\Lambda^{0,\alpha}_{AH},\ j+|\beta|\le k\bigr\}$,

• $x^\nu\Lambda^{k,\alpha}_{AH}(\Sigma) = \bigl\{u = x^\nu v : v\in\Lambda^{k,\alpha}_{AH}\bigr\}$.

Clearly
$$P : x^\nu\Lambda^{k+2,\alpha}_{AH}(\Sigma)\to x^\nu\Lambda^{k,\alpha}_{AH}(\Sigma)$$
is bounded for any $\nu$, but just as in the AE case, restrictions on $\nu$ must be imposed to ensure that this map is Fredholm. To state these, we say that a number $\lambda$ is an indicial root for $P$ if $Px^\lambda = O(x^{\lambda+1})$ as $x\to 0$. It is always true that $Px^\lambda = O(x^\lambda)$, and the indicial root condition requires that $\lambda$ satisfy a polynomial equation involving the coefficients $a_{j0}(0,y)$. (It is possible that these roots could depend on $y$, but in our application this does not occur.) A brief calculation, using that $|\mu|^2\to 0$ as $x\to 0$, gives:

• The indicial roots for $L$ are 0 and 4,

• The indicial roots for $\mathcal{L}$ are $-1$ and 3.

We now state one of the main theorems from [24], specialized to this particular setting:

Proposition 14. Suppose that $0<\nu<4$ and $-1<\nu'<3$. Then the mappings
$$L : x^{\nu}\Lambda^{k+2,\alpha}_{AH}(\Sigma)\to x^{\nu}\Lambda^{k,\alpha}_{AH}(\Sigma)$$
and
$$\mathcal{L} : x^{\nu'}\Lambda^{k+2,\alpha}_{AH}(\Sigma)\to x^{\nu'}\Lambda^{k,\alpha}_{AH}(\Sigma)$$
are Fredholm of index 0. Furthermore, the nullspace of either of these operators is independent of $\nu\in(0,4)$ and $\nu'\in(-1,3)$.

To conclude that these maps are isomorphisms, it suffices to prove

Proposition 15. For $\nu$ and $\nu'$ in the ranges stated in the previous proposition, the maps $L$ and $\mathcal{L}$ are injective.
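Before turning to the proof, the indicial roots $-1$ and $3$ quoted above for the linearized Lichnerowicz operator $\Delta - |\mu|^2 - 3$ can be checked in a model case. The following sketch assumes, as an illustration only and not as the general argument of [24], that near the boundary the metric is the hyperbolic upper half-space metric $x^{-2}(dx^2+|dy|^2)$ in three dimensions, for which the Laplacian acting on functions of $x$ alone is $x^2\partial_x^2 - x\partial_x$; it also uses $|\mu|^2 = O(x^2)$.

```latex
\begin{align*}
\Delta\,x^{\lambda}
  &= \bigl(x^{2}\partial_x^{2} - x\,\partial_x\bigr)\,x^{\lambda}
   = \bigl(\lambda(\lambda-1) - \lambda\bigr)\,x^{\lambda}
   = (\lambda^{2} - 2\lambda)\,x^{\lambda},\\
\bigl(\Delta - |\mu|^{2} - 3\bigr)\,x^{\lambda}
  &= (\lambda^{2} - 2\lambda - 3)\,x^{\lambda} + O\bigl(x^{\lambda+2}\bigr)
   = (\lambda-3)(\lambda+1)\,x^{\lambda} + O\bigl(x^{\lambda+2}\bigr).
\end{align*}
```

In this model the leading term vanishes exactly when $\lambda = -1$ or $\lambda = 3$, which is consistent with the weight range $(-1,3)$ appearing in Proposition 14.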
Proof. For $\mathcal{L}$ this is quite easy. In fact, if $\mathcal{L}\phi = 0$, then $|\phi|\le Cx^{\nu'}$ for any $\nu'\in(-1,3)$, and therefore $\phi\to 0$ at $\partial\Sigma$. But the terms of order 0 in $\mathcal{L}$ are strictly negative, and so the maximum principle implies that $\phi\equiv 0$. As for $L$, we note that if $LX = 0$, then the coefficients of $X$ are $O(x^4)$, and so we may perform the usual integration by parts to obtain that $DX = 0$. By the conformal invariance of the nullspace of $D = D_\gamma$, we also have that $D_{\bar\gamma}X = 0$. Thus $X$ is a (smooth or polyhomogeneous) conformal Killing vector field for the metric $\bar\gamma$ on $\bar\Sigma$, with coefficients vanishing to order 4 at the boundary. Since we already know that $X$ is polyhomogeneous, the equation $D_{\bar\gamma}X = 0$ now implies that $X$ vanishes to infinite order as $x\to 0$, and it is known that no such conformal Killing field can exist [8]. □

Thus we have shown that

Proposition 16. The mappings in Proposition 14 are both isomorphisms.

Therefore, we may use the spaces occurring here as the spaces $E$, $F$, $\mathcal{E}$ and $\mathcal{F}$ needed in Propositions 10 and 11. This proves

Theorem 5. Suppose that $(\Sigma,\gamma,\Pi)$ is an asymptotically hyperbolic initial data set, so that $\gamma$ is a conformally compact metric and $\tau = \operatorname{tr}_\gamma(\Pi)\equiv 3$. Let $p_1,p_2\in\Sigma$, and suppose that if either of these points lies on a compact (closed) component of $\Sigma$, then there does not exist a conformal Killing field $X$ on that component which vanishes at that point. Then there exists a one-parameter family of asymptotically hyperbolic CMC initial data sets $(\Gamma_T,\Pi_T)$ on the manifold $\Sigma_T$ satisfying the same estimates as in Theorem 1.

8. The Geometry of the Neck

We have now shown in the three main settings ($\Sigma$ compact or AE or AH) that it is possible to produce one-parameter families of CMC initial data sets $(\Gamma_T,\Pi_T)$ which are arbitrarily good approximations to the original initial data set $(\gamma,\Pi)$ away from the points $p_1,p_2$. As $T\to\infty$, various geometric quantities associated to these tensors degenerate in the neck region. We describe this in more detail now.

We shall only consider quantities which depend on at most two derivatives of the metric, and so we fix any $k\ge 2$ and $\alpha\in(0,1)$ and construct the initial data sets as in §§2–7. Recall the notation $\Sigma_R^* = \Sigma\setminus(B_R(p_1)\cup B_R(p_2))$. Next, for $b\in(0,R)$, let $A_j(b) = B_R(p_j)\setminus B_b(p_j)$, and finally denote by $C_{T,b}$ the part of the neck region not covered by these annuli. Then for each $T>0$,
$$\Sigma_T = \Sigma_R^*\sqcup\bigl(A_1(b)\cup A_2(b)\bigr)\sqcup C_{T,b}.$$
In either of the annuli and in the truncated neck region we shall use the $(s,\theta)$ coordinates, so that, for example, $A_1(b) = \{T/2-\beta\le s\le T/2\}\times S^2$ and $C_{T,b} = \{-T/2+\beta\le s\le T/2-\beta\}\times S^2$, for $\beta = -\log b$.

Write $\psi_T^4\gamma_T = \tilde\gamma_T$; of course, this metric is the same as $\gamma$ in $\Sigma_R^*$, and the deviation between these two metrics in either of the half-necks $0<|s|<T/2$ is $O(e^{-T}\cosh^2 s)$. We shall estimate $|\Gamma_T-\tilde\gamma_T|_{\tilde\gamma_T}$ (i.e. the difference between the actual solution $\Gamma_T$ and the approximate solution $\tilde\gamma_T$) in each of these regions. By definition, $\Gamma_T = (\psi_T+\eta_T)^4\gamma_T$. Hence
$$|\Gamma_T-\tilde\gamma_T|_{\tilde\gamma_T}\approx\psi_T^{-4}\bigl|(\psi_T+\eta_T)^4-\psi_T^4\bigr|\le C|\eta_T/\psi_T|.$$
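Recall also from §6 that $|\eta_T|\le\nu e^{-T/4}w_T^{\delta}$. Furthermore, $1/2\le w_T/\psi_T\le 2$ everywhere. Therefore:

• In $\Sigma_R^*$, $\tilde\gamma_T = \gamma$ and $\psi_T\approx 1$. Thus
$$|\Gamma_T-\tilde\gamma_T|_{\tilde\gamma_T}\le Ce^{-T/4} \tag{34}$$
in this region.

• In all of $C_T$, we have
$$|\Gamma_T-\tilde\gamma_T|_{\tilde\gamma_T}\le Ce^{-T/4}\bigl(e^{-T/4}\cosh(s/2)\bigr)^{\delta-1} = Ce^{-\delta T/4}(\cosh(s/2))^{\delta-1}. \tag{35}$$

• In particular, in $A_1(b)$, $|\eta_T|\le\nu e^{-T/4}w_T^{\delta}$, and so
$$|\Gamma_T-\tilde\gamma_T|_{\tilde\gamma_T}\le Ce^{-T/4}\bigl(e^{-T/4}e^{s/2}\bigr)^{\delta-1} = Ce^{-\delta T/4}e^{(\delta-1)s/2}\le Ce^{(1-\delta)\beta/2-T/4}. \tag{36}$$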
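The final inequality in (36) uses only that $s\ge T/2-\beta$ on $A_1(b)$ and that $\delta<1$. Spelled out (a short verification added for convenience):

```latex
\begin{align*}
(\delta-1)\,\frac{s}{2}
  &\le (\delta-1)\,\frac{T/2-\beta}{2}
   = \frac{(\delta-1)\,T}{4} + \frac{(1-\delta)\,\beta}{2},\\
e^{-\delta T/4}\,e^{(\delta-1)s/2}
  &\le e^{-\delta T/4}\,e^{(\delta-1)T/4}\,e^{(1-\delta)\beta/2}
   = e^{(1-\delta)\beta/2 - T/4}.
\end{align*}
```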
The estimate in $A_2(b)$ is identical. Notice in each of these cases that since $\delta<1$, this difference of tensors is much smaller than $\tilde\gamma_T$, and therefore geometric quantities relative to $\Gamma_T$ are very well approximated by the same quantities relative to $\tilde\gamma_T$. The following two results are immediate corollaries.

Proposition 17. Let $c$ be any $C^1$ path in $\Sigma_T$ which runs through the neck region. For simplicity we shall assume that it has the simple form $s\mapsto(s,\theta_0)$, $-T/2\le s\le T/2$. Then
$$\operatorname{length}_{\Gamma_T}(c)\sim\operatorname{length}_{\tilde\gamma_T}(c) + O\bigl(e^{-\delta T/4}\bigr).$$
In particular, for the portion of $c$ which lies in either of the annuli $A_j(b)$, we may substitute $\operatorname{length}_\gamma(c)$ for $\operatorname{length}_{\tilde\gamma_T}(c)$ on the right side here.

Proposition 18. In $C_{T,b}$, the full curvature tensor $\mathrm{Riem}_{\Gamma_T}$ satisfies
$$|\mathrm{Riem}_{\Gamma_T}|_{\tilde\gamma_T}\le C\psi_T^{-4}\le Ce^{T}(\cosh s)^{-2}.$$
In particular, the scalar curvature $R_{\Gamma_T}$ satisfies this same estimate.

Of course, we could also compute finer asymptotics for the full curvature tensor and for the scalar curvature function.

Estimates for the extrinsic curvature tensor are also easy to obtain. We have already shown that the deviation of the trace-free part of $\Pi_T$ from that of $\Pi$ on $\Sigma_R^*$ is bounded by $T^3e^{-3T/2}$. On the other hand, relying on the fact that $|\mu|_{\gamma_c}\le Ce^{-T/2}\cosh s$ together with the preceding estimates, we see that
$$|\Pi_T|_{\Gamma_T}\le Ce^{T/2}(\cosh s)^{-1} \quad\text{on } C_T.$$
We may use this as a check on this last proposition: the constraint equation (2) gives the scalar curvature in terms of $|\Pi_T|^2_{\Gamma_T}$, and we see that the estimates are the same. However, we also see from this that if the original tensor $\mu = 0$ in both balls $B_1$ and $B_2$, then $|\Pi_T|_{\Gamma_T}$ is bounded in $C_T$, and hence $R_{\Gamma_T}$ is also.
9. Examples and Applications There are a number of constructions and applications of special interest in general relativity to which our results pertain. In this section we briefly describe several of these. This discussion is intended to be informal and descriptive, and so we do not present these results formally, but given the rest of the contents of this paper, it should be obvious how to do so. These examples are separated into different subsections, but these divisions are artificial and are intended for guidance only.
Adding wormholes. The first construction is that of adding a physical wormhole to (almost) any globally hyperbolic solution of the vacuum (Lorentzian) Einstein equations which admits a CMC Cauchy surface. One proceeds by: a) choosing such a Cauchy surface $\Sigma$; b) choosing any pair of points on $\Sigma$; c) verifying the conditions of Theorems 1, 4 or 5; d) carrying out the gluing construction as outlined in §§2–6; and finally, e) evolving the resulting initial data. Standard well-posedness theorems [11, 13] guarantee that there is a unique (up to spacetime diffeomorphism) maximal spacetime development of this initial data set. In other words, there is a spacetime which satisfies the vacuum Einstein equations, has the topology $\Sigma_T\times\mathbb{R}$, restricts to the wormhole initial data on $\Sigma_T\times\{0\}$, and also contains every other spacetime which satisfies these three properties. Of course, this spacetime will probably not last very long, and indeed it is likely that the evolving wormhole will pinch off quickly, thus preventing the spacetime from evolving further in time. Nonetheless, there is a spacetime solution which evolves from this initial data and which therefore contains a spacetime wormhole connecting the chosen points; in addition, the evolving spacetime remains essentially unchanged from its previous unglued state outside the domain of influence of the initial data on the wormhole region. There do exist spacetimes for which this wormhole addition cannot be carried out: these are the ones which do not carry any set of CMC data which matches the hypotheses of our theorems. A simple example of this phenomenon is the 'spatially compact Minkowski space' $T^3\times\mathbb{R}$, since all CMC slices of this spacetime are compact and have vanishing extrinsic curvature, and hence the hypotheses of Theorem 1 are not satisfied.
Multiple black hole spacetimes. Over 40 years ago Misner [30] constructed a number of explicit initial data sets involving two asymptotically Euclidean regions connected by a wormhole. Soon thereafter, he and others, cf. [17], recognized that since the wormhole inevitably contains a minimal two-sphere, and since the presence of such a two-sphere necessitates the existence of an apparent horizon, the spacetime developments of these initial data sets must contain black holes. In subsequent work, Misner [31] constructed explicit (series form) sets of initial data with more than one wormhole; the spacetime developments of these data must then contain more than one black hole. Misner’s data has been used extensively by numerical relativists [10] in studying multi-black hole spacetimes.
Our gluing constructions yield a much wider class of multi-black hole solutions. Most importantly, we can successively add black holes to any given asymptotically flat spacetime. (We say that a spacetime is asymptotically flat if it is foliated by asymptotically Euclidean initial data sets; see Chapter 11 in [34] for a more rigorous definition.) The idea is as follows: start with any asymptotically flat spacetime which admits a maximal (i.e. zero mean curvature), asymptotically Euclidean hypersurface. Choose such a maximal Cauchy surface with AE data (Σ, γ, Π) and a point on this hypersurface, and
then glue on a copy of Euclidean space R³ with zero extrinsic curvature (which is a maximal data set for Minkowski space). The result is a new initial data set with two AE regions connected by a wormhole, with the data largely unchanged on either side of the wormhole. As before, the wormhole contains a minimal two-sphere, and so black hole formation is inevitable in the spacetime development of this data. We may now continue this process, taking this new data and gluing on a further copy of maximal data for Minkowski space. This yields a spacetime with two black holes. Proceeding further, we obtain spacetimes with an arbitrary (finite, or even infinite) number of black holes, with each additional black hole leaving the data for the previous black holes largely unchanged.
If one glues two AE sets of data together, the resulting data set has two AE ends. From a mathematical point of view, there is nothing to distinguish one of these ends from the other. However, in using Einstein solutions to model spacetime physics, one often makes a choice of one of the ends, and its spacetime development, to be the physically accessible “exterior”, with the other end corresponding to a physically inaccessible “interior”. The exterior is the home of the “observers at infinity” while the interior is somewhere “inside the black hole”. Apart from considerations of physical modelling, the choice is arbitrary. Multi-black hole spacetimes have several ends, and any one of these may be regarded as the exterior. However, to model spacetimes with observable interacting black holes, one must construct multi-black hole initial data sets with the chosen exterior asymptotic region directly connected to each of the wormholes. Each of these wormholes leads through to its opposite interior end. These interiors may be independent of one another, or they may coincide as a single asymptotic region.
In other words, one may envision the manifold as having, for example, two asymptotic regions connected by several necks. This case corresponds to Misner’s “matched throat” data sets [31], and is no more difficult to construct from our point of view than a solution with independent interior ends. We emphasize that the multi-black hole data sets we are describing here are not presented as explicitly as Misner’s examples, because to obtain them one must solve the constraint equations in conformal form. In practice, there are efficient ways to do this numerically. As already noted, one virtue of our results is that we obtain solutions which leave the spacetime essentially unchanged away from the domain of influence of the added black hole. This feature is quite interesting and should be useful to numerical relativists.
Adding AE ends to closed initial data sets. Every closed three manifold Σ admits a CMC solution of the vacuum constraint equations: we choose γ to be a metric of constant negative scalar curvature R(γ) = −6 (these always exist), and set Π = γ; (Σ, γ, γ) is then a solution of the constraint equations, with constant mean curvature τ = 3. It is of interest to know whether every manifold of the form Σ\{p} with Σ closed admits AE initial data sets. For a manifold Σ which admits a metric of positive scalar curvature with no conformal Killing fields and with a nonvanishing symmetric transverse-traceless 2-tensor, our methods show that this is the case. To see this, we first use [18] to get that any such Σ admits a solution (Σ, γ, Π) of the constraint equations with τ = 0. Then by Theorem 4, we may glue the standard AE initial data set on R³ with vanishing extrinsic curvature to (Σ, γ, Π) to get an AE initial data set on Σ\{p}. Note the closely related result, that when Σ is compact, Σ\{p} admits a complete scalar flat metric, i.e. a time-symmetric AE initial data set, if and only if Σ admits a metric of positive scalar curvature, cf. [25].
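The statement that (Σ, γ, γ) solves the vacuum constraints with τ = 3 can be checked in a few lines. The following is only a sketch: it assumes the standard form of the vacuum constraint equations, written in terms of the scalar curvature R(γ) and the second fundamental form Π, and the sign and normalization conventions used here are an assumption that should be compared with the constraint equations stated earlier in the paper.

```latex
\begin{align*}
  % Vacuum constraints in their assumed standard form, with \tau = tr_\gamma \Pi:
  \operatorname{div}_\gamma \Pi - \nabla \operatorname{tr}_\gamma \Pi &= 0, &
  R(\gamma) - |\Pi|_\gamma^2 + (\operatorname{tr}_\gamma \Pi)^2 &= 0. \\
  % Choose \Pi = \gamma on a 3-manifold with R(\gamma) = -6:
  \operatorname{tr}_\gamma \gamma &= 3, &
  |\gamma|_\gamma^2 &= 3, \\
  % \nabla\gamma = 0 and tr_\gamma \gamma is constant, so the first equation holds:
  \operatorname{div}_\gamma \gamma - \nabla (3) &= 0, &
  R(\gamma) - |\gamma|_\gamma^2 + (\operatorname{tr}_\gamma \gamma)^2 &= -6 - 3 + 9 = 0.
\end{align*}
```

Under these conventions the mean curvature is τ = tr_γ γ = 3, and the normalization R(γ) = −6 is exactly the value that makes the second equation balance in dimension three.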
On the other hand, suppose that Σ has the property that any metric of nonnegative scalar curvature on it is flat (for example, Σ = T³ has this property). Then by (2), any initial data set (Σ, γ, Π) with τ = 0 has nonnegative scalar curvature. By the hypothesis on Σ, this means that γ is flat, and so, using (2) again, we must have Π ≡ 0. But (Σ, γ, 0) does not satisfy the hypotheses of Theorem 4, and so we cannot use our gluing theorem to produce AE initial data sets on Σ\{p}.
We now comment briefly on the physical interpretation of the solutions we obtain by gluing AE ends to closed initial data sets as above. Suppose we view the compact side as the exterior, accessible region. This data has a minimal two-sphere as before, and therefore most likely evolves into a spacetime with an apparent horizon. But the problem is that there is no clear and unambiguous definition of a black hole relative to an exterior spacetime region which is not asymptotically flat. This makes the interpretation of the spacetime development in this case less transparent, but it also makes such spacetimes rather interesting. Indeed, one might be able to learn a lot about how to work with black holes in spatially closed spacetimes by careful study of these examples.
As noted above, if a spacetime contains a wormhole connecting a pair of regions, one may choose either of the pair of regions as the exterior accessible region. The choice is mathematically arbitrary, but physically important for modelling. Thus the spacetimes one obtains by gluing together data from an asymptotically flat spacetime and data from a spatially compact spacetime, and then evolving, may be viewed either as a spatially compact cosmology with a black hole-like object in it (with the closed region as the exterior), or as a black hole with a nontrivial interior solution. The construction is the same; all we are doing is redefining which region we consider to be the exterior, physical region.
The wormhole may cut off very quickly and render the interior solution physically irrelevant; this is the nature of black hole physics, and we believe that it can be studied meaningfully with these gluing constructions.
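The obstruction described above for manifolds such as T³, on which every metric of nonnegative scalar curvature is flat, rests on a short computation. As a sketch, assume that the equation cited in the text as (2) is the Hamiltonian constraint in its standard vacuum form, and write Π for the second fundamental form and τ for its trace (both the identification of (2) and the conventions are assumptions here):

```latex
\begin{align*}
  % Hamiltonian constraint (assumed form) with \tau = tr_\gamma \Pi = 0:
  R(\gamma) &= |\Pi|_\gamma^2 - (\operatorname{tr}_\gamma \Pi)^2
             = |\Pi|_\gamma^2 \;\ge\; 0 . \\
  % If every metric of nonnegative scalar curvature on \Sigma is flat, then R(\gamma) = 0, hence
  |\Pi|_\gamma^2 &= R(\gamma) = 0
  \quad\Longrightarrow\quad \Pi \equiv 0 .
\end{align*}
```

The first line gives the nonnegativity of the scalar curvature for maximal data, and the second is the second use of the same equation: once γ is flat, the only maximal data on such a manifold is the trivial data (Σ, γ, 0).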
Adding AH ends to closed initial data sets. In contrast to the AE case, since AH initial data sets have nonvanishing mean curvature τ = 3, it is possible to glue an AH initial data set to an appropriately chosen initial data set on any compact manifold Σ. In fact, as above, let γ be a metric of constant negative scalar curvature R = −6 on Σ, so that (Σ, γ, γ) is an initial data set also with τ = 3. By Theorem 5, we may glue to it any AH initial data set (in particular, the hyperboloid (H³, γ₀, γ₀), where γ₀ is the standard hyperbolic metric) at any point p ∈ Σ. This proves in particular that for every closed manifold Σ, the punctured manifold Σ\{p} admits an AH initial data set. In fact, we may iterate this procedure and obtain AH initial data sets on Σ\{p_1, …, p_k} for any collection of distinct points on any closed 3-manifold Σ. We note again that the analogue of this question for AE vacuum initial data sets is still open.
As a simple and interesting example of this last construction, let (Σ, γ) be a closed hyperbolic 3-manifold. By adjoining to it the hyperboloid as above we obtain an AH initial data set on Σ\{p}. In fact, we may find such initial data sets which are arbitrarily small perturbations of the initial hyperbolic metric away from the neck. It is also straightforward to glue together two or more AH data sets. Not surprisingly, it is impossible to glue CMC AH data sets to CMC AE data sets because the constant mean curvatures in the two cases are different.
We are left with the question of whether the spacetimes which result from evolving these glued AH initial data sets develop black holes. Again, if we view the spatially compact part of the data as the exterior, physically accessible, part of the solution, then
this question does not have a definite meaning, since as before, black holes are only well-defined in spacetimes with asymptotically flat ends.
Note added in proof. The authors have recently extended this construction, cf. arXiv:gr-qc/0206034, so that it applies to certain non-CMC initial data sets. As an application, it is proved that an asymptotically Euclidean end may be glued onto an arbitrary compact manifold, and hence there are no restrictions on the topology of asymptotically flat vacuum spacetimes. This also establishes the existence of asymptotically flat vacuum spacetimes which admit no maximal Cauchy surfaces satisfying natural decay conditions.
References
1. Andersson, L.: Elliptic systems on manifolds with asymptotically negative curvature. Indiana Univ. Math. J. 42(4), 1359–1387 (1993)
2. Andersson, L., Chruściel, P. T.: Solutions of the constraint equations in general relativity satisfying “hyperboloidal boundary conditions”. Dissertationes Math. (Rozprawy Mat.) 355 (1996)
3. Andersson, L., Chruściel, P. T., Friedrich, H.: On the regularity of solutions to the Yamabe equation and the existence of smooth hyperboloidal initial data for Einstein’s field equations. Commun. Math. Phys. 149(3), 587–612 (1992)
4. Bartnik, R.: The mass of an asymptotically flat manifold. Comm. Pure Appl. Math. 39, 661–693 (1986)
5. Bartnik, R.: Remarks on cosmological spacetimes and constant mean curvature surfaces. Commun. Math. Phys. 117(4), 615–624 (1988)
6. Brill, D.: On spacetimes without maximal slices. In: Proceedings of the Third Marcel Grossmann Meeting, H Ning, ed., Science Press 1983, pp. 79–87
7. Cantor, M.: The existence of non-trivial asymptotically flat initial data for vacuum spacetimes. Commun. Math. Phys. 57(1), 83–96 (1977)
8. Christodoulou, D., O’Murchadha, N.: The boost problem in general relativity. Commun. Math. Phys. 80, 271–300 (1981)
9. Christodoulou, D., Choquet-Bruhat, Y.: Elliptic systems in H_{s,δ} spaces on manifolds which are Euclidean at infinity. Acta. Math. 146, 129–150 (1981)
10. Cook, G.: Initial data for numerical relativity. Living Reviews, 2000–2005 (2000)
11. Choquet-Bruhat, Y.: Théorème d’existence pour certains systèmes d’équations aux dérivées partielles non linéaires. Acta. Math. 88, 141–225 (1952)
12. Choquet-Bruhat, Y.: Solution of the coupled Einstein constraints on asymptotically Euclidean manifolds. In: Directions in General Relativity, Vol. 2, B.L. Hu and T.A. Jacobson eds. Cambridge: Cambridge University Press, 1993. pp. 83–96
13. Choquet-Bruhat, Y., Geroch, R.: Global aspects of the Cauchy problem in general relativity. Commun. Math. Phys. 14, 329–335 (1969)
14. Choquet-Bruhat, Y., Isenberg, J., York, J.: Einstein constraints on asymptotically Euclidean manifolds. Phys. Rev. D (3) 61(8), 084034, 20 pp. (2000)
15. Choquet-Bruhat, Y., York, J.: The Cauchy Problem. In: General Relativity, A. Held, ed., New York: Plenum, 1979
16. Corvino, J.: Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214(1), 137–189 (2000)
17. Hawking, S., Ellis, G.: The Large Scale Structure of Space–time. Cambridge: Cambridge Univ. Press, 1973
18. Isenberg, J.: Constant mean curvature solutions of the Einstein constraint equations on closed manifolds. Class. Quantum Grav. 12, 2249–2274 (1995)
19. Isenberg, J., Moncrief, V.: A set of nonconstant mean curvature solutions of the Einstein constraint equations on closed manifolds. Class. Quantum Grav. 13, 1819–1847 (1996)
20. Isenberg, J., Park, J.: Asymptotically hyperbolic non-constant mean curvature solutions of the Einstein constraint equations. Class. Quantum Grav. 14, A189–A201 (1997)
21. Kapouleas, N.: Complete constant mean curvature surfaces in Euclidean three space. Ann. of Math. (2) 131, 239–330 (1990)
22. Lee, J.: Fredholm operators and Einstein metrics on conformally compact manifolds. arXiv:math.DG/0105046
23. Lockhart, R., McOwen, R.: On elliptic systems in R^n. Acta Math. 150(1–2), 125–135 (1983)
24. Mazzeo, R.: Elliptic theory of differential edge operators, I. Comm. Partial Diff. Eqs. 16(10), 1615–1664 (1991)
25. Mazzeo, R.: Regularity for the singular Yamabe problem. Indiana Univ. Math. J. 40(4), 1277–1299 (1991)
26. Mazzeo, R., Pacard, F.: Constant scalar curvature metrics with isolated singularities. Duke Math. J. 99(3), 353–418 (1999)
27. Mazzeo, R., Pacard, F.: Constant mean curvature surfaces with Delaunay ends. Comm. Anal. Geom. 9(1), 169–237 (2001)
28. Mazzeo, R., Pacard, F., Pollack, D.: Connected sums of constant mean curvature surfaces in Euclidean 3 space. J. Reine Angew. Math. 536, 115–165 (2001)
29. Mazzeo, R., Pollack, D., Uhlenbeck, K.: Connected sum constructions for constant scalar curvature metrics. Topol. Methods Nonlinear Anal. 6(2), 207–233 (1995)
30. Misner, C.W.: Wormhole initial conditions. Phys. Rev. 118, 1110–1111 (1960)
31. Misner, C.W.: The method of images in geometrostatics. Ann. Phys. 24, 102–117 (1963)
32. Rosenberg, J., Stolz, S.: Manifolds of positive scalar curvature. In: Algebraic Topology and its Applications, G.E. Carlsson et al., eds., Math. Sci. Res. Inst. Publ. 27, New York: Springer-Verlag, 1994, pp. 241–267
33. Taylor, M.: Partial differential equations III: Nonlinear equations. Appl. Math. Sci. 117, New York: Springer-Verlag, 1996
34. Wald, R.: General Relativity. Chicago: Univ. Chicago Press, 1984
Communicated by H. Nicolai